Amazon SageMaker Core Platform

Amazon SageMaker is the central ML platform on AWS, providing a fully managed environment for the entire machine learning lifecycle. It handles everything from data preparation and model training to deployment and monitoring — eliminating the need to stitch together individual services for each step.

Overview

SageMaker provides several core components, each designed for a specific phase of the ML workflow:

| Component | What It Does | When to Use |
|---|---|---|
| SageMaker Studio | Integrated IDE for ML development; hosts notebooks, Data Wrangler, Experiments, Debugger, and more | Default ML development environment; most SageMaker features live inside Studio |
| SageMaker Notebooks | Managed Jupyter notebooks with pre-installed ML frameworks | Exploration, prototyping, ad-hoc analysis; single-instance only (not distributed) |
| Training Jobs | Managed training infrastructure: provisions instances, runs training, stores model artifacts in S3 | Any model training beyond notebook prototyping |
| Real-time Endpoints | Persistent HTTPS endpoints for real-time inference with auto-scaling | Steady, low-latency inference traffic |
| Batch Transform | Runs inference on an entire dataset without a persistent endpoint | Periodic bulk predictions, large datasets, no real-time requirement |
| Serverless Inference | Auto-scaling endpoint that scales to zero when idle | Intermittent or unpredictable traffic, cost-sensitive workloads |
| Async Inference | Queue-based inference for large payloads (up to 1 GB) | Large payloads, long processing times, minutes of latency acceptable |
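To make the real-time option concrete, here is a minimal sketch of calling a deployed endpoint through the SageMaker Runtime `InvokeEndpoint` API. The endpoint name and payload shape are assumptions for illustration; the request is built as a plain dict and the actual AWS call is left commented out, since it requires credentials and a live endpoint.

```python
import json

# Keyword arguments for the SageMaker Runtime InvokeEndpoint call.
# "my-endpoint" and the payload format are hypothetical; a real model
# defines its own expected content type and input schema.
request = {
    "EndpointName": "my-endpoint",              # assumed endpoint name
    "ContentType": "application/json",
    "Body": json.dumps({"instances": [[0.5, 1.2, 3.4]]}),
}

# With AWS credentials configured, the call would look like:
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(**request)
# prediction = json.loads(response["Body"].read())
```

Because the endpoint is persistent, each invocation is a low-latency HTTPS round trip; this is what distinguishes real-time endpoints from the queue-based Async Inference path.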

Training Data Input Modes

When running Training Jobs, SageMaker reads data from S3 using one of three modes:

| Mode | How It Works | Best For |
|---|---|---|
| File mode | Downloads all data to the training instance before training begins | Small to medium datasets |
| Pipe mode | Streams data directly from S3 during training (requires RecordIO format) | Large datasets where download time is a bottleneck |
| FastFile mode | POSIX-compatible streaming from S3 (no format restriction) | Large datasets with any file format |
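The input mode is set on the training job itself. As a sketch, the following builds a `CreateTrainingJob` request shape with a default `TrainingInputMode` and a per-channel `InputMode` override; all names, ARNs, image URIs, and bucket paths are placeholders, and the AWS call is commented out so the example runs without credentials.

```python
# Hypothetical names/ARNs throughout; the request shape follows the
# SageMaker CreateTrainingJob API. TrainingInputMode sets the default
# (File, Pipe, or FastFile); each channel may override it via InputMode.
training_job = {
    "TrainingJobName": "demo-training-job",                      # assumed name
    "AlgorithmSpecification": {
        "TrainingImage": "123456789012.dkr.ecr.us-east-1"
                         ".amazonaws.com/my-image:latest",       # assumed image
        "TrainingInputMode": "File",                             # default mode
    },
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",   # assumed role
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": "s3://my-bucket/train/",            # assumed bucket
                }
            },
            "InputMode": "FastFile",   # per-channel override: stream, any format
        }
    ],
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/artifacts/"},
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 50,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}

# With AWS credentials configured:
# import boto3
# boto3.client("sagemaker").create_training_job(**training_job)
```

Switching from File to FastFile mode here changes nothing else about the job; training starts without waiting for the full dataset to download.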

Choosing an Inference Option

Use this decision guide to pick the right inference approach:

| Scenario | Choose |
|---|---|
| Consistent, low-latency traffic | Real-time Endpoint |
| Sporadic traffic with cost sensitivity | Serverless Inference (accepts cold-start latency) |
| One-time or periodic scoring of large datasets | Batch Transform |
| Payloads over 6 MB or processing that takes minutes | Async Inference |
| A/B testing between model versions | Real-time Endpoint with production variants |
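For the A/B testing row, traffic splitting is configured through production variants on a single endpoint. The sketch below builds a `CreateEndpointConfig` request shape with two variants weighted 90/10; the model and config names are hypothetical, and the AWS call is commented out so the block runs offline.

```python
# Hypothetical model/config names; the shape follows the SageMaker
# CreateEndpointConfig API. Two production variants split live traffic
# between model versions behind one real-time endpoint.
endpoint_config = {
    "EndpointConfigName": "ab-test-config",   # assumed name
    "ProductionVariants": [
        {
            "VariantName": "model-a",
            "ModelName": "my-model-v1",       # assumed model name
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.9,      # ~90% of traffic
        },
        {
            "VariantName": "model-b",
            "ModelName": "my-model-v2",       # assumed model name
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.1,      # ~10% of traffic
        },
    ],
}

# With AWS credentials configured:
# import boto3
# boto3.client("sagemaker").create_endpoint_config(**endpoint_config)
```

Each variant's share of traffic is its weight divided by the sum of all weights, so the weights do not have to sum to 1; they are written that way here only for readability.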

When to Use

SageMaker is the right choice when you need custom model training, fine-grained control over the ML lifecycle, or access to SageMaker's built-in algorithms. For tasks where a pre-trained API can solve the problem (text analytics, image recognition, translation), consider AWS AI Application Services first — they require no ML expertise and are faster to deploy.

Flashcards

Question

What are the four SageMaker inference options and when do you use each?

Answer

Real-time Endpoints (steady low-latency traffic), Serverless Inference (intermittent traffic, scales to zero), Batch Transform (periodic bulk predictions), Async Inference (large payloads up to 1 GB, minutes of latency OK).

Key Insight

SageMaker Notebooks are single-instance environments meant for prototyping. For production model training, always use SageMaker Training Jobs, which provision managed infrastructure, support distributed training, and automatically store artifacts in S3.
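The distributed-training and artifact-storage points above map directly onto two fields of a `CreateTrainingJob` request. The fragments below use placeholder values to show where each lives; instance type and bucket path are assumptions.

```python
# Fragments of a SageMaker CreateTrainingJob request illustrating the
# Key Insight above; values are hypothetical.
resource_config = {
    "InstanceType": "ml.p3.2xlarge",   # assumed GPU instance type
    "InstanceCount": 2,                # >1 requests a distributed cluster
    "VolumeSizeInGB": 100,
}

output_data_config = {
    # Trained model artifacts (model.tar.gz) are written under this prefix
    "S3OutputPath": "s3://my-bucket/model-artifacts/",   # assumed bucket
}
```

Neither of these is available from a notebook instance alone, which is why production training belongs in Training Jobs.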