
SageMaker Sub-Features

Beyond the core training and inference components, SageMaker offers a rich set of sub-features that cover every aspect of the ML lifecycle — from data preparation and labeling to model explainability, monitoring, and CI/CD for ML. Understanding which feature to use for each task is essential for building production ML systems.

Data Preparation and Labeling

| Feature | What It Does | When to Use |
| --- | --- | --- |
| Data Wrangler | Visual data preparation inside Studio. Transforms, analyzes, and visualizes data; connects to S3, Redshift, Athena, and Snowflake | Data prep, feature engineering, and EDA before modeling. Does NOT accept direct uploads — must import from a data source |
| Feature Store | Centralized store for ML features with two modes: Online (low-latency) and Offline (S3-backed) | Share and reuse features across teams. Online mode serves features at inference time; Offline mode provides historical data for training |
| Ground Truth | Data labeling service supporting image, text, video, and 3D labeling | Building labeled training datasets |
| Canvas | No-code ML for business analysts — point-and-click model building | When users with no coding knowledge need to build ML predictions |
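The online/offline split in Feature Store can be hard to picture from the table alone. The toy class below (plain Python, not the real Feature Store API — `ToyFeatureStore`, `ingest`, and the field names are all hypothetical) sketches the core idea: the online store keeps only the latest record per entity for fast inference-time lookups, while the offline store appends every record so training jobs can see history.

```python
from datetime import datetime, timezone

class ToyFeatureStore:
    """Toy illustration of Feature Store's two modes (not the real API)."""

    def __init__(self):
        self.online = {}    # entity_id -> latest feature record (low-latency reads)
        self.offline = []   # append-only history (training-time reads)

    def ingest(self, entity_id, features):
        record = {"entity_id": entity_id,
                  "event_time": datetime.now(timezone.utc), **features}
        self.online[entity_id] = record   # overwrite: latest value wins
        self.offline.append(record)       # append: full history preserved

store = ToyFeatureStore()
store.ingest("cust-42", {"avg_spend": 120.0})
store.ingest("cust-42", {"avg_spend": 135.5})

store.online["cust-42"]["avg_spend"]   # 135.5 — latest value, for inference
len(store.offline)                     # 2 — both records kept, for training
```

The real service adds point-in-time-correct joins on the offline store, but the overwrite-vs-append distinction above is the essential difference between the two modes.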

Data Wrangler Key Capabilities

  • Quick Model: Generates feature importance scores to identify which features matter most
  • Multicollinearity Detection: Uses PCA/VIF to find correlated features
  • Bias Report: Assesses fairness in your data before training

Ground Truth Workforce Options

| Option | When to Use |
| --- | --- |
| Private workforce | Sensitive or confidential data |
| Amazon Mechanical Turk | Non-sensitive data at scale |
| Active learning | Automatically labels high-confidence samples; humans label the rest — reduces labeling cost |
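The active-learning row above boils down to a confidence split. Here is a minimal sketch of that idea in plain Python (illustrative only — `split_by_confidence` and the sample data are hypothetical, not Ground Truth's API): predictions the model makes with high confidence are auto-accepted as labels, and the uncertain remainder is routed to human labelers.

```python
def split_by_confidence(predictions, threshold=0.95):
    """Auto-accept confident labels; send uncertain items to humans."""
    auto, human = [], []
    for item, label, confidence in predictions:
        (auto if confidence >= threshold else human).append((item, label))
    return auto, human

preds = [("img1", "cat", 0.99), ("img2", "dog", 0.62), ("img3", "cat", 0.97)]
auto, human = split_by_confidence(preds)
# auto-labeled: img1 and img3; only img2 goes to the human workforce
```

This is why active learning cuts cost: human effort is spent only where the model is unsure.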

Model Development and Tuning

| Feature | What It Does | When to Use |
| --- | --- | --- |
| Autopilot | AutoML — automatically explores data, selects algorithms, trains and tunes models | When you want a quick baseline model or a fully automated ML pipeline |
| Automatic Model Tuning (HPO) | Hyperparameter optimization with Bayesian, Random, Grid, and Hyperband strategies | Finding optimal hyperparameters for your model |
| Experiments | Tracks, compares, and evaluates ML experiments — logs parameters, metrics, and artifacts | Comparing different preprocessing or training approaches |
| Training Compiler | Optimizes deep learning training by compiling computation graphs | Speeding up PyTorch or TensorFlow training (up to 50% faster) without code changes |

HPO Strategy Guide

| Strategy | Characteristics | Best For |
| --- | --- | --- |
| Bayesian | Uses results of previous evaluations to choose the next set | Most efficient with fewer jobs |
| Random | Samples randomly from the search space | Good baseline, easily parallelized |
| Grid | Tests every combination | Small, discrete search spaces |
| Hyperband | Early-stops poorly performing configurations | Fastest overall, large search spaces |
| Warm start | Reuses results from previous tuning jobs | Iterating on prior tuning work |
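Hyperband's early-stopping behavior is the least intuitive entry in the table. The sketch below shows the successive-halving idea at its core in plain Python (a simplified toy, not SageMaker's implementation — the objective function and learning-rate values are hypothetical): each round, every surviving configuration gets more budget, then the worst half is stopped early, so only promising configurations ever run long.

```python
def successive_halving(configs, evaluate, rounds=3):
    """Keep doubling budget while dropping the worst half of configs."""
    budget = 1
    while rounds > 0 and len(configs) > 1:
        ranked = sorted(configs, key=lambda c: evaluate(c, budget), reverse=True)
        configs = ranked[: max(1, len(ranked) // 2)]  # early-stop the bottom half
        budget *= 2
        rounds -= 1
    return configs[0]

# Toy objective: score grows with budget, peaks near lr = 0.1 (hypothetical)
def evaluate(lr, budget):
    return budget * (1 - abs(lr - 0.1))

best = successive_halving([0.001, 0.01, 0.1, 0.5], evaluate)
# best == 0.1 — the weak configs were stopped before consuming full budget
```

Contrast with Grid, which would have run all four configurations to full budget.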

Model Explainability and Monitoring

| Feature | What It Does | When to Use |
| --- | --- | --- |
| Clarify | Bias detection and model explainability using SHAP values | Explaining individual predictions; detecting bias pre-training and post-training |
| Model Monitor | Monitors deployed models for data drift, model quality, bias drift, and feature attribution drift | Production monitoring — detect when model performance degrades |
| Debugger | Monitors training jobs in real time — captures tensors, gradients, and weights | Training-time debugging: detect vanishing gradients, overfitting, etc. Can trigger CloudWatch alarms and auto-stop training |

Clarify vs. Model Monitor vs. Debugger

| Feature | When It Runs | What It Detects |
| --- | --- | --- |
| Clarify | Pre-training or post-training (on demand) | Bias in data/model, feature contributions (SHAP) |
| Model Monitor | Continuously on deployed endpoints | Data drift, model quality degradation, bias drift over time |
| Debugger | During training jobs | Training issues — vanishing gradients, overfitting, poor convergence |
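To make "data drift" concrete, here is a minimal drift check in the spirit of what Model Monitor does under the hood (a toy sketch, not the service's actual statistics — `mean_drift`, the threshold, and the sample values are all hypothetical): compare a live feature's distribution against the training-time baseline and flag a violation when the shift is too large.

```python
def mean_drift(baseline, live, threshold=0.2):
    """Flag drift when the live mean shifts >threshold relative to baseline."""
    base_mean = sum(baseline) / len(baseline)
    live_mean = sum(live) / len(live)
    shift = abs(live_mean - base_mean) / abs(base_mean)
    return shift > threshold, shift

drifted, shift = mean_drift(baseline=[10, 12, 11, 9], live=[15, 16, 14, 17])
# drifted is True: live mean 15.5 vs baseline mean 10.5, roughly a 48% shift
```

The real service computes richer per-feature statistics and emits violations to CloudWatch, but the baseline-vs-live comparison is the same core pattern.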

MLOps and Deployment

| Feature | What It Does | When to Use |
| --- | --- | --- |
| Pipelines | ML workflow orchestration — CI/CD for ML | Automated workflows: data prep, train, evaluate, deploy |
| Model Registry | Central catalog for model versioning, approval status, and lineage | Tracking model versions across dev/staging/production |
| Neo | Compiles models for optimized inference on edge devices (ARM, Intel, NVIDIA) | Deploying models to edge hardware |
| Inference Recommender | Benchmarks a model across instance types for the best price-performance | Choosing the right endpoint instance type |
| Elastic Inference | Attaches fractional GPU acceleration to CPU instances | When a full GPU is underutilized — reduce inference cost. Note: now deprecated by AWS in favor of Inferentia-based instances |

The MLOps Pattern

A common production pattern chains these features together:

Pipelines (orchestrate workflow) → Model Registry (version and approve) → Model Monitor (detect drift) → EventBridge (trigger retraining)
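The pattern above can be sketched as a single retraining cycle in plain Python (a local toy mirroring the flow, not the SageMaker Pipelines SDK — `run_mlops_cycle`, `quality_bar`, and the lambda stand-ins are all hypothetical): train, evaluate, and register the model only if it clears a quality bar, mirroring the Model Registry's approval gate. In the real pattern, a Model Monitor violation routed through EventBridge would call this cycle again.

```python
def run_mlops_cycle(train, evaluate, registry, quality_bar=0.9):
    """One pass of the Pipelines -> Registry pattern: train, gate, register."""
    model = train()                       # Pipelines: training step
    score = evaluate(model)               # Pipelines: evaluation step
    status = "Approved" if score >= quality_bar else "Rejected"
    registry.append({"model": model, "score": score, "status": status})
    return registry[-1]                   # Model Registry: versioned entry

registry = []
entry = run_mlops_cycle(train=lambda: "model-v1",
                        evaluate=lambda m: 0.93,
                        registry=registry)
# entry["status"] == "Approved" — the model clears the bar and is registered
```

The approval gate is the key design point: deployment automation keys off the registry's approval status, never off the training job directly.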

When to Use

Use SageMaker sub-features when you need integrated, purpose-built tools for specific ML lifecycle tasks. The key is matching each task to the right feature rather than building custom solutions.

Flashcards

Question

What is the difference between SageMaker Ground Truth and Amazon A2I?

Answer

Ground Truth is for labeling training data before model building. A2I (Augmented AI) is for human review of model predictions in production. Don't confuse the two.

Key Insight

Data Wrangler does NOT accept direct file uploads — you must import data from S3, Redshift, Athena, or another supported data source. If you need feature importance scores quickly, use Data Wrangler's Quick Model rather than training a full Autopilot model.