Amazon SageMaker Core Platform
Amazon SageMaker is the central ML platform on AWS, providing a fully managed environment for the entire machine learning lifecycle. It handles everything from data preparation and model training to deployment and monitoring — eliminating the need to stitch together individual services for each step.
Overview
SageMaker provides several core components, each designed for a specific phase of the ML workflow:
| Component | What It Does | When to Use |
|---|---|---|
| SageMaker Studio | Integrated IDE for ML development. Hosts notebooks, Data Wrangler, Experiments, Debugger, and more | Default ML development environment — most SageMaker features live inside Studio |
| SageMaker Notebooks | Managed Jupyter notebooks with pre-installed ML frameworks | Exploration, prototyping, ad-hoc analysis. Single-instance only (not distributed) |
| Training Jobs | Managed training infrastructure. Provisions instances, runs training, stores model artifacts in S3 | Any model training beyond notebook prototyping |
| Real-time Endpoints | Persistent HTTPS endpoints for real-time inference with auto-scaling | Steady, low-latency inference traffic |
| Batch Transform | Run inference on entire datasets without a persistent endpoint | Periodic bulk predictions, large datasets, no real-time requirement |
| Serverless Inference | Auto-scaling endpoint that scales to zero when idle | Intermittent or unpredictable traffic, cost-sensitive workloads |
| Async Inference | Queue-based inference for large payloads (up to 1 GB) | Large payloads, long processing times, minutes of latency acceptable |
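To make the Training Jobs row concrete, here is a sketch of the request payload that SageMaker's `create_training_job` API expects (the real call would be `boto3.client("sagemaker").create_training_job(**training_job_request)`). The account ID, role ARN, image URI, and bucket names below are placeholders, not real resources:

```python
# Illustrative boto3-style request for a SageMaker Training Job.
# All ARNs, image URIs, and S3 paths are placeholders.
training_job_request = {
    "TrainingJobName": "demo-xgboost-job",
    "AlgorithmSpecification": {
        # ECR image of the training algorithm (placeholder URI)
        "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest",
        "TrainingInputMode": "File",  # or "Pipe" / "FastFile", see below
    },
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    "InputDataConfig": [{
        "ChannelName": "train",
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/train/",  # placeholder
        }},
    }],
    # Model artifacts are written here when the job completes
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/output/"},
    # SageMaker provisions this infrastructure and tears it down afterwards
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 50,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}
```

This mirrors the table above: SageMaker provisions the instances in `ResourceConfig`, reads training data from the S3 locations in `InputDataConfig`, and stores the resulting model artifact at `S3OutputPath`.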
Training Data Input Modes
When running Training Jobs, SageMaker reads data from S3 using one of three modes:
| Mode | How It Works | Best For |
|---|---|---|
| File mode | Downloads all data to the training instance before training begins | Small to medium datasets |
| Pipe mode | Streams data directly from S3 during training (built-in algorithms typically expect protobuf RecordIO) | Large datasets where download time is a bottleneck |
| FastFile mode | POSIX-compatible streaming from S3 (no format restriction) | Large datasets with any file format |
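The input mode can also be set per data channel, overriding the job-level `TrainingInputMode`. The helper below is a hypothetical convenience function (not part of any AWS SDK) that builds one boto3-style channel entry and validates the mode against the three options in the table:

```python
# Hypothetical helper that builds one entry of a training job's
# InputDataConfig; the per-channel InputMode field overrides the
# job-level TrainingInputMode for that channel only.
VALID_MODES = {"File", "Pipe", "FastFile"}

def make_channel(name: str, s3_uri: str, input_mode: str = "File") -> dict:
    """Return a boto3-style channel dict for a SageMaker Training Job."""
    if input_mode not in VALID_MODES:
        raise ValueError(f"input_mode must be one of {sorted(VALID_MODES)}")
    return {
        "ChannelName": name,
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": s3_uri,
        }},
        # File downloads everything up front; Pipe and FastFile stream from S3
        "InputMode": input_mode,
    }

# FastFile: POSIX-style streaming with no format restriction
train_channel = make_channel("train", "s3://my-bucket/train/", "FastFile")
```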
Choosing an Inference Option
Use this decision guide to pick the right inference approach:
| Scenario | Choose |
|---|---|
| Consistent, low-latency traffic | Real-time Endpoint |
| Sporadic traffic with cost sensitivity | Serverless Inference (accepts cold start latency) |
| One-time or periodic scoring of large datasets | Batch Transform |
| Payloads over 6 MB or processing takes minutes | Async Inference |
| A/B testing between model versions | Real-time Endpoint with production variants |
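The decision guide above can be sketched as a small function. The scenario flags are illustrative names for this example, not a SageMaker API; the 6 MB threshold comes from the real-time endpoint payload limit noted in the table:

```python
# Sketch of the inference decision guide as code.
# Flag names are illustrative; checks mirror the table's row order of priority.
def choose_inference_option(*, steady_traffic: bool = False,
                            sporadic_traffic: bool = False,
                            bulk_scoring: bool = False,
                            payload_mb: float = 1.0,
                            latency_minutes_ok: bool = False) -> str:
    # Large payloads or long processing times rule out the other options
    if payload_mb > 6 or latency_minutes_ok:
        return "Async Inference"
    # One-time or periodic scoring of whole datasets needs no endpoint
    if bulk_scoring:
        return "Batch Transform"
    # Intermittent traffic: scale to zero, accept cold starts
    if sporadic_traffic:
        return "Serverless Inference"
    # Steady, low-latency traffic (also the A/B-testing case, via variants)
    if steady_traffic:
        return "Real-time Endpoint"
    return "Real-time Endpoint"  # sensible default when nothing clearly applies
```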
When to Use
SageMaker is the right choice when you need custom model training, fine-grained control over the ML lifecycle, or access to SageMaker's built-in algorithms. For tasks where a pre-trained API can solve the problem (text analytics, image recognition, translation), consider AWS AI Application Services first — they require no ML expertise and are faster to deploy.
Flashcards
What are the four SageMaker inference options and when do you use each?
Real-time Endpoints (steady low-latency traffic), Serverless Inference (intermittent traffic, scales to zero), Batch Transform (periodic bulk predictions), Async Inference (large payloads up to 1 GB, minutes of latency OK).
When should you use SageMaker Notebooks versus Training Jobs?
SageMaker Notebooks are single-instance environments meant for prototyping. For production model training, use SageMaker Training Jobs, which provision managed infrastructure, support distributed training, and automatically store artifacts in S3.