Security and Governance

ML workloads handle sensitive data — training datasets, model parameters, and inference inputs/outputs all need protection. AWS provides a layered security model covering identity, encryption, network isolation, and auditing that applies to every stage of the ML lifecycle.

Overview

| Service | What It Does | When to Use |
|---|---|---|
| AWS IAM | Identity and access management: users, groups, roles, policies | Control who can access what. SageMaker uses execution roles (not access keys) for all components |
| AWS KMS | Key Management Service: create and manage encryption keys | Encrypt data at rest in S3, EBS, and SageMaker storage volumes |
| Amazon VPC | Virtual Private Cloud: network isolation | Isolate SageMaker resources from the public internet |
| AWS CloudTrail | Logs all API calls across your AWS account | Auditing, compliance, security investigation |
| AWS Secrets Manager | Store and rotate secrets (database credentials, API keys) | Securely store credentials used by SageMaker notebooks to connect to Redshift/RDS |

SageMaker Security Architecture

Identity and Access

SageMaker uses IAM execution roles as the primary access mechanism — never access keys. The execution role is attached to notebooks, training jobs, and endpoints, granting them permissions to access S3, ECR, CloudWatch, and other services.
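
As a rough illustration of what an execution role consists of, the sketch below builds the two JSON documents involved: a trust policy that lets the SageMaker service assume the role, and a permissions policy scoped to one bucket. The bucket name and the helper `s3_permissions_policy` are hypothetical, and a real role would usually also need ECR and CloudWatch permissions.

```python
import json

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-ml-training-data"

# Trust policy: allows the SageMaker service to assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def s3_permissions_policy(bucket: str) -> dict:
    """Build a permissions policy limited to a single S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access applies to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(TRUST_POLICY, indent=2))
    print(json.dumps(s3_permissions_policy(BUCKET), indent=2))
```

Serialized with `json.dumps`, these documents are the kind of input that IAM's `create_role` (as `AssumeRolePolicyDocument`) and `put_role_policy` (as `PolicyDocument`) accept.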

Encryption

| Layer | How to Encrypt |
|---|---|
| Data at rest (S3) | KMS Customer Managed Key (CMK) on the S3 bucket |
| Data at rest (EBS) | KMS encryption on training instance volumes |
| Data in transit | TLS by default. Enable inter-container encryption for distributed training |
| Model artifacts | KMS CMK applied when writing to S3 |

KMS vs. CloudHSM: KMS is fully managed — AWS maintains the root of trust and logs all key usage. CloudHSM gives you dedicated hardware security modules where you maintain the root of trust.
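
The encryption layers in this section map onto a few fields of the SageMaker CreateTrainingJob request. The sketch below assembles only those fields; the helper name `encrypted_training_settings`, the key ARN, the S3 path, and the instance settings are illustrative placeholders, and a real request needs further fields (job name, role ARN, algorithm specification, stopping condition).

```python
def encrypted_training_settings(kms_key_arn: str, output_s3_uri: str) -> dict:
    """Return the encryption-related fields of a CreateTrainingJob request."""
    return {
        "OutputDataConfig": {
            "S3OutputPath": output_s3_uri,
            # Model artifacts written to S3 are encrypted with this key.
            "KmsKeyId": kms_key_arn,
        },
        "ResourceConfig": {
            # Illustrative instance settings, not a recommendation.
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 2,
            "VolumeSizeInGB": 50,
            # Encrypts the storage volume attached to each training instance.
            "VolumeKmsKeyId": kms_key_arn,
        },
        # Encrypts traffic between containers in a distributed training job.
        "EnableInterContainerTrafficEncryption": True,
    }


# Placeholder values for illustration only.
settings = encrypted_training_settings(
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    output_s3_uri="s3://example-ml-training-data/output/",
)
```

Inter-container encryption only matters when `InstanceCount` is greater than 1, and it can lengthen training time, so it is typically enabled only for distributed jobs that carry sensitive data.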

Network Isolation

The standard pattern for running SageMaker without internet access:

  1. Place SageMaker resources in a VPC with private subnets
  2. Add an S3 Gateway Endpoint for S3 access without internet
  3. Add VPC Interface Endpoints (PrivateLink) for SageMaker API access
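  4. Optionally enable network isolation on the training job (enable_network_isolation=True in the SageMaker Python SDK, EnableNetworkIsolation in the API) so the training container cannot make outbound network calls. SageMaker still copies data to and from S3 on the container's behalf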

This pattern ensures training data and model artifacts never traverse the public internet.
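
The steps above can be sketched as request parameters. The helpers below build keyword arguments in the shape that EC2's CreateVpcEndpoint and SageMaker's CreateTrainingJob expect; the function names and all resource IDs are hypothetical, and nothing here calls AWS.

```python
def s3_gateway_endpoint(vpc_id: str, region: str, route_table_ids: list) -> dict:
    """Parameters for an S3 Gateway Endpoint (step 2)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Gateway endpoints are attached to route tables, not subnets.
        "RouteTableIds": list(route_table_ids),
    }


def sagemaker_interface_endpoint(vpc_id: str, region: str, subnet_ids: list,
                                 security_group_ids: list,
                                 service: str = "sagemaker.api") -> dict:
    """Parameters for a PrivateLink interface endpoint (step 3)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.{service}",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Lets the default service hostname resolve to the private endpoint.
        "PrivateDnsEnabled": True,
    }


def private_training_settings(subnet_ids: list, security_group_ids: list,
                              isolate: bool = False) -> dict:
    """Networking fields of a CreateTrainingJob request (steps 1 and 4)."""
    return {
        "VpcConfig": {
            "Subnets": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
        # When True, the training container cannot make outbound calls.
        "EnableNetworkIsolation": isolate,
    }


# Placeholder IDs for illustration only.
gateway = s3_gateway_endpoint("vpc-0example", "us-east-1", ["rtb-0example"])
api = sagemaker_interface_endpoint(
    "vpc-0example", "us-east-1", ["subnet-0example"], ["sg-0example"]
)
job_network = private_training_settings(["subnet-0example"], ["sg-0example"], isolate=True)
```

Endpoints that serve real-time inference from inside the VPC would likely need a second interface endpoint, created with `service="sagemaker.runtime"`.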

Security Pattern Summary

| Requirement | Solution |
|---|---|
| Control SageMaker access | IAM execution roles |
| No internet access for SageMaker | VPC + S3 Gateway Endpoint + VPC Interface Endpoints |
| Encrypt training data at rest | KMS CMK on S3 + enable volume encryption |
| Encrypt data between training containers | Enable inter-container encryption |
| Full network isolation | enable_network_isolation=True in training config |
| Audit all API calls | CloudTrail |
| Secure database credentials | Secrets Manager |
| Column-level data access | Lake Formation (covered in Analytics section) |
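
For the auditing requirement, one possible way to review recent SageMaker activity is CloudTrail's LookupEvents API, filtered by event source. The sketch below takes the client as an argument so it can be exercised without AWS access; the function name `recent_sagemaker_events` and the shape of the returned summary are assumptions made for illustration.

```python
def recent_sagemaker_events(cloudtrail_client, max_results: int = 50) -> list:
    """Summarize recent CloudTrail events whose source is SageMaker.

    `cloudtrail_client` is expected to behave like boto3.client("cloudtrail").
    """
    response = cloudtrail_client.lookup_events(
        LookupAttributes=[
            {
                "AttributeKey": "EventSource",
                "AttributeValue": "sagemaker.amazonaws.com",
            }
        ],
        MaxResults=max_results,  # the API allows at most 50 per call
    )
    # Keep only the fields most useful for a quick audit review.
    return [
        {
            "time": event.get("EventTime"),
            "name": event.get("EventName"),
            "user": event.get("Username"),
        }
        for event in response.get("Events", [])
    ]
```

With boto3 installed and credentials configured, a call would look like `recent_sagemaker_events(boto3.client("cloudtrail"))`. LookupEvents only covers recent management events, so longer retention generally requires a trail that delivers logs to S3.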

When to Use

Security is not optional — every production ML workload should have IAM roles, KMS encryption, and VPC isolation configured. CloudTrail should always be enabled for auditing. Use Secrets Manager whenever notebooks or jobs need database credentials.
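
As a minimal sketch of the Secrets Manager pattern, the helper below reads database credentials at runtime instead of hard-coding them in a notebook. It assumes the secret is stored as a JSON string (for example with `username` and `password` keys); the function name `load_db_credentials` and the secret name in the usage note are hypothetical.

```python
import json


def load_db_credentials(secrets_client, secret_id: str) -> dict:
    """Fetch a secret and parse it as JSON.

    `secrets_client` is expected to behave like boto3.client("secretsmanager").
    Assumes the secret value is a JSON object stored in SecretString.
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])
```

In a notebook this might be called as `load_db_credentials(boto3.client("secretsmanager"), "example/redshift")`, provided the notebook's execution role is allowed `secretsmanager:GetSecretValue` on that secret.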

Flashcards

Question

How does SageMaker handle authentication — access keys or IAM roles?

Answer

IAM execution roles, never access keys. Every SageMaker component (notebook, training job, endpoint) has an execution role that grants permissions to AWS services.

Key Insight

The VPC + S3 Gateway Endpoint + VPC Interface Endpoints pattern is the standard approach for running SageMaker in a private network. The S3 Gateway Endpoint is free and provides access to S3 without internet. VPC Interface Endpoints (PrivateLink) provide access to the SageMaker API itself.