Amazon SageMaker Core Platform
Amazon SageMaker is the central ML platform on AWS, providing a fully managed environment for the entire machine learning lifecycle. It handles everything from data preparation and model training to deployment and monitoring — eliminating the need to stitch together individual services for each step.
Overview
SageMaker provides several core components, each designed for a specific phase of the ML workflow:
| Component | What It Does | When to Use |
|---|---|---|
| SageMaker Studio | Integrated IDE for ML development. Hosts notebooks, Data Wrangler, Experiments, Debugger, and more | Default ML development environment — most SageMaker features live inside Studio |
| SageMaker Notebooks | Managed Jupyter notebooks with pre-installed ML frameworks | Exploration, prototyping, ad-hoc analysis. Single-instance only (not distributed) |
| Training Jobs | Managed training infrastructure. Provisions instances, runs training, stores model artifacts in S3 | Any model training beyond notebook prototyping |
| Real-time Endpoints | Persistent HTTPS endpoints for real-time inference with auto-scaling | Steady, low-latency inference traffic |
| Batch Transform | Run inference on entire datasets without a persistent endpoint | Periodic bulk predictions, large datasets, no real-time requirement |
| Serverless Inference | Auto-scaling endpoint that scales to zero when idle | Intermittent or unpredictable traffic, cost-sensitive workloads |
| Async Inference | Queue-based inference for large payloads (up to 1 GB) | Large payloads, long processing times, minutes of latency acceptable |
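To make the Training Jobs row concrete, here is a sketch of the request payload that SageMaker's `create_training_job` API expects (the real call would be `boto3.client("sagemaker").create_training_job(**training_job_request)`). The account ID, role ARN, image URI, and bucket names below are placeholders, not real resources:

```python
# Illustrative boto3-style request for a SageMaker Training Job.
# All ARNs, image URIs, and S3 paths are placeholders.
training_job_request = {
    "TrainingJobName": "demo-xgboost-job",
    "AlgorithmSpecification": {
        # ECR image of the training algorithm (placeholder URI)
        "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest",
        "TrainingInputMode": "File",  # or "Pipe" / "FastFile", see below
    },
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    "InputDataConfig": [{
        "ChannelName": "train",
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/train/",  # placeholder
        }},
    }],
    # Model artifacts are written here when the job completes
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/output/"},
    # SageMaker provisions this infrastructure and tears it down afterwards
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 50,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}
```

This mirrors the table above: SageMaker provisions the instances in `ResourceConfig`, reads training data from the S3 locations in `InputDataConfig`, and stores the resulting model artifact at `S3OutputPath`.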
Training Data Input Modes
When running Training Jobs, SageMaker reads data from S3 using one of three modes:
| Mode | How It Works | Best For |
|---|---|---|
| File mode | Downloads all data to the training instance before training begins | Small to medium datasets |
| Pipe mode | Streams data directly from S3 during training (built-in algorithms typically expect protobuf RecordIO) | Large datasets where download time is a bottleneck |
| FastFile mode | POSIX-compatible streaming from S3 (no format restriction) | Large datasets with any file format |
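The input mode can also be set per data channel, overriding the job-level `TrainingInputMode`. The helper below is a hypothetical convenience function (not part of any AWS SDK) that builds one boto3-style channel entry and validates the mode against the three options in the table:

```python
# Hypothetical helper that builds one entry of a training job's
# InputDataConfig; the per-channel InputMode field overrides the
# job-level TrainingInputMode for that channel only.
VALID_MODES = {"File", "Pipe", "FastFile"}

def make_channel(name: str, s3_uri: str, input_mode: str = "File") -> dict:
    """Return a boto3-style channel dict for a SageMaker Training Job."""
    if input_mode not in VALID_MODES:
        raise ValueError(f"input_mode must be one of {sorted(VALID_MODES)}")
    return {
        "ChannelName": name,
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": s3_uri,
        }},
        # File downloads everything up front; Pipe and FastFile stream from S3
        "InputMode": input_mode,
    }

# FastFile: POSIX-style streaming with no format restriction
train_channel = make_channel("train", "s3://my-bucket/train/", "FastFile")
```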
Choosing an Inference Option
Use this decision guide to pick the right inference approach:
| Scenario | Choose |
|---|---|
| Consistent, low-latency traffic | Real-time Endpoint |
| Sporadic traffic with cost sensitivity | Serverless Inference (accepts cold start latency) |
| One-time or periodic scoring of large datasets | Batch Transform |
| Payloads over 6 MB or processing takes minutes | Async Inference |
| A/B testing between model versions | Real-time Endpoint with production variants |
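The decision guide above can be sketched as a small function. The scenario flags are illustrative names for this example, not a SageMaker API; the 6 MB threshold comes from the real-time endpoint payload limit noted in the table:

```python
# Sketch of the inference decision guide as code.
# Flag names are illustrative; checks mirror the table's row order of priority.
def choose_inference_option(*, steady_traffic: bool = False,
                            sporadic_traffic: bool = False,
                            bulk_scoring: bool = False,
                            payload_mb: float = 1.0,
                            latency_minutes_ok: bool = False) -> str:
    # Large payloads or long processing times rule out the other options
    if payload_mb > 6 or latency_minutes_ok:
        return "Async Inference"
    # One-time or periodic scoring of whole datasets needs no endpoint
    if bulk_scoring:
        return "Batch Transform"
    # Intermittent traffic: scale to zero, accept cold starts
    if sporadic_traffic:
        return "Serverless Inference"
    # Steady, low-latency traffic (also the A/B-testing case, via variants)
    if steady_traffic:
        return "Real-time Endpoint"
    return "Real-time Endpoint"  # sensible default when nothing clearly applies
```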
When to Use
SageMaker is the right choice when you need custom model training, fine-grained control over the ML lifecycle, or access to SageMaker's built-in algorithms. For tasks where a pre-trained API can solve the problem (text analytics, image recognition, translation), consider AWS AI Application Services first — they require no ML expertise and are faster to deploy.
Flashcards
What are the four SageMaker inference options and when do you use each?
Real-time Endpoints (steady low-latency traffic), Serverless Inference (intermittent traffic, scales to zero), Batch Transform (periodic bulk predictions), Async Inference (large payloads up to 1 GB, minutes of latency OK).
When should you use SageMaker Notebooks versus Training Jobs?
SageMaker Notebooks are single-instance environments meant for prototyping. For production model training, use SageMaker Training Jobs, which provision managed infrastructure, support distributed training, and automatically store artifacts in S3.