Amazon ECS Cheat Sheet
Overview
Amazon Elastic Container Service (ECS) is a fully managed container orchestration service that simplifies deploying, managing, and scaling containerized applications. Unlike Kubernetes-based solutions, ECS uses its own native orchestration engine deeply integrated with the AWS ecosystem, offering a streamlined experience for teams already invested in AWS. ECS handles container placement, scheduling, and lifecycle management while integrating with IAM, VPC, CloudWatch, ALB/NLB, and other AWS services out of the box.
ECS offers two main launch types: Fargate (serverless: you define CPU and memory, and AWS manages the infrastructure) and EC2 (you manage the underlying instances). A third, External, covers ECS Anywhere on self-managed hosts. The core abstractions are task definitions (blueprints for containers), tasks (running instances of task definitions), services (long-running tasks with a desired count and optional load balancing), and clusters (logical groupings of resources). ECS also supports ECS Exec for interactive debugging, capacity providers for scaling strategies, and Service Connect for simplified service-to-service networking.
Installation
AWS CLI Setup
# Install AWS CLI v2
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
# Configure credentials
aws configure
# Install ECS CLI (legacy but useful)
sudo curl -Lo /usr/local/bin/ecs-cli https://amazon-ecs-cli.s3.amazonaws.com/ecs-cli-linux-amd64-latest
sudo chmod +x /usr/local/bin/ecs-cli
# Install copilot CLI (recommended for new projects)
curl -Lo copilot https://github.com/aws/copilot-cli/releases/latest/download/copilot-linux
chmod +x copilot
sudo mv copilot /usr/local/bin/
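After installing, it can help to confirm which of the CLIs are actually on the PATH. The helper below is a minimal POSIX shell sketch; the function name `check_tools` is hypothetical and not part of any AWS tooling.

```shell
# check_tools: report which of the given commands are on PATH.
# Prints one line per tool and returns non-zero if any is missing.
# (Hypothetical helper name, for illustration only.)
check_tools() (
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
)

# Usage: check_tools aws copilot ecs-cli
```

Once the tools are present, `aws sts get-caller-identity` is a quick way to confirm that the credentials from `aws configure` work.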
Core Commands
| Command | Description |
|---|---|
| aws ecs create-cluster --cluster-name <name> | Create a new ECS cluster |
| aws ecs list-clusters | List all clusters |
| aws ecs describe-clusters --clusters <name> | Describe cluster details |
| aws ecs register-task-definition --cli-input-json file://task.json | Register a task definition |
| aws ecs create-service | Create a service |
| aws ecs update-service | Update service (deploy new version) |
| aws ecs list-tasks --cluster <name> | List running tasks |
| aws ecs stop-task --cluster <name> --task <arn> | Stop a task |
| aws ecs execute-command | Interactive shell into container |
| aws ecs delete-service --cluster <name> --service <name> --force | Delete a service |
Task Definition
{
"family": "my-web-app",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "256",
"memory": "512",
"executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
"taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
"containerDefinitions": [
{
"name": "web",
"image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
"portMappings": [
{
"name": "http",
"containerPort": 8080,
"protocol": "tcp"
}
],
"essential": true,
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/my-web-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
},
"environment": [
{"name": "NODE_ENV", "value": "production"}
],
"secrets": [
{
"name": "DB_PASSWORD",
"valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:db-password"
}
],
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 60
}
}
]
}
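Fargate only accepts certain `cpu`/`memory` pairings, so a task definition such as the one above (256 CPU units, 512 MiB) can be sanity-checked before registering it. The function below is a sketch: the name `fargate_size_ok` is hypothetical, and the table reflects the documented Linux combinations as best I know them; check the current Fargate documentation before relying on it.

```shell
# fargate_size_ok CPU MEMORY
# Returns 0 when MEMORY (MiB) is an accepted pairing for CPU (CPU units)
# on Fargate. Hypothetical helper; the size table is an assumption to
# verify against the current AWS documentation.
fargate_size_ok() (
  cpu=$1
  mem=$2
  case "$cpu" in
    256)
      # 0.25 vCPU only allows three fixed memory sizes.
      case "$mem" in
        512|1024|2048) return 0 ;;
        *) return 1 ;;
      esac
      ;;
    512)   min=1024  max=4096   step=1024 ;;
    1024)  min=2048  max=8192   step=1024 ;;
    2048)  min=4096  max=16384  step=1024 ;;
    4096)  min=8192  max=30720  step=1024 ;;
    8192)  min=16384 max=61440  step=4096 ;;
    16384) min=32768 max=122880 step=8192 ;;
    *)     return 1 ;;
  esac
  # Memory must be inside the range and land on an allowed increment.
  [ "$mem" -ge "$min" ] && [ "$mem" -le "$max" ] &&
    [ $(( (mem - min) % step )) -eq 0 ]
)

# Usage: fargate_size_ok 256 512 && echo "valid size"
```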
Create and Manage Services
# Register task definition
aws ecs register-task-definition --cli-input-json file://task-def.json
# Create a Fargate service with ALB
aws ecs create-service \
--cluster production \
--service-name web-service \
--task-definition my-web-app:1 \
--desired-count 3 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-abc123,subnet-def456],securityGroups=[sg-abc123],assignPublicIp=ENABLED}" \
--load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/abc123,containerName=web,containerPort=8080"
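# Update service (deploy new task definition revision)
# Changing --task-definition already starts a deployment; add --force-new-deployment
# only to redeploy the same revision (for example, to pick up a re-pushed :latest image)
aws ecs update-service \
--cluster production \
--service web-service \
--task-definition my-web-app:2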
# Scale service
aws ecs update-service \
--cluster production \
--service web-service \
--desired-count 5
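`update-service` returns as soon as the deployment is requested, not when it finishes. A script that needs to block until the rollout settles can pair it with the `services-stable` waiter. This is a sketch; the wrapper name `deploy_and_wait` is hypothetical.

```shell
# deploy_and_wait CLUSTER SERVICE TASK_DEFINITION
# Starts a deployment, then blocks until the service reaches a steady state.
# (Hypothetical wrapper around two real AWS CLI subcommands.)
deploy_and_wait() (
  cluster=$1
  service=$2
  task_def=$3
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --task-definition "$task_def" >/dev/null || return 1
  # The waiter polls describe-services and gives up after roughly 10 minutes.
  aws ecs wait services-stable --cluster "$cluster" --services "$service"
)

# Usage: deploy_and_wait production web-service my-web-app:2
```

A non-zero exit from the waiter means the service did not stabilize within the polling window; check the service events before assuming the deployment itself failed.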
ECS Exec (Container Debugging)
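# Enable ECS Exec on a service
# Only tasks started after the flag is set can be exec'd into, so force a redeploy
aws ecs update-service \
--cluster production \
--service web-service \
--enable-execute-command \
--force-new-deployment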
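# Start interactive session (needs the Session Manager plugin installed locally;
# use /bin/sh if the image does not ship bash)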
aws ecs execute-command \
--cluster production \
--task abc123def456 \
--container web \
--interactive \
--command "/bin/bash"
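The `--task` value has to be a real task ID or ARN, and ECS Exec only works on tasks launched with it enabled. The two helpers below sketch how to look both up; the function names are hypothetical, while the `list-tasks` and `describe-tasks` options and the `enableExecuteCommand` field are part of the AWS CLI output.

```shell
# first_task_arn CLUSTER SERVICE
# Prints the ARN of one running task in the service.
first_task_arn() (
  aws ecs list-tasks \
    --cluster "$1" \
    --service-name "$2" \
    --desired-status RUNNING \
    --query 'taskArns[0]' \
    --output text
)

# exec_enabled CLUSTER TASK
# Succeeds only if the task was launched with ECS Exec turned on.
exec_enabled() (
  enabled=$(aws ecs describe-tasks \
    --cluster "$1" \
    --tasks "$2" \
    --query 'tasks[0].enableExecuteCommand' \
    --output text)
  # Text output renders the boolean as True/False; accept either casing.
  case "$enabled" in
    [Tt]rue) return 0 ;;
    *)       return 1 ;;
  esac
)

# Usage:
#   task=$(first_task_arn production web-service)
#   exec_enabled production "$task" && echo "ECS Exec is available on $task"
```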
AWS Copilot (Simplified Workflow)
# Initialize application
copilot init --app my-app --name web --type "Load Balanced Web Service" --dockerfile ./Dockerfile
# Deploy to environment
copilot deploy --name web --env production
# View logs
copilot svc logs --name web --env production --follow
# Show service status
copilot svc status --name web
# Add a new service
copilot svc init --name api --svc-type "Backend Service"
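# Scale: edit the count field in copilot/web/manifest.yml, then redeploy
# (copilot svc override is for customizing the generated CloudFormation, not for scaling)
copilot svc deploy --name web --env production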
Configuration
Auto Scaling
# Register scalable target
aws application-autoscaling register-scalable-target \
--service-namespace ecs \
--resource-id service/production/web-service \
--scalable-dimension ecs:service:DesiredCount \
--min-capacity 2 \
--max-capacity 20
# Create target tracking scaling policy
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/production/web-service \
--scalable-dimension ecs:service:DesiredCount \
--policy-name cpu-scaling \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"TargetValue": 70.0,
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ECSServiceAverageCPUUtilization"
},
"ScaleInCooldown": 300,
"ScaleOutCooldown": 60
}'
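To get a feel for what a 70% CPU target implies, a rough proportional model is often used: the desired count grows or shrinks with the ratio of the observed metric to the target, clamped to the registered min and max. The function below is only that approximation, not the algorithm Application Auto Scaling actually runs (real scale-in is more conservative and subject to the cooldowns). The name `estimate_desired_count` is hypothetical.

```shell
# estimate_desired_count CURRENT METRIC TARGET MIN MAX
# Rough proportional estimate: ceil(CURRENT * METRIC / TARGET), clamped to
# [MIN, MAX]. An approximation for intuition only; METRIC and TARGET are
# whole-number percentages.
estimate_desired_count() (
  current=$1
  metric=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling of current * metric / target.
  want=$(( (current * metric + target - 1) / target ))
  if [ "$want" -lt "$min" ]; then
    want=$min
  fi
  if [ "$want" -gt "$max" ]; then
    want=$max
  fi
  echo "$want"
)

# Usage: estimate_desired_count 3 90 70 2 20
```

With 3 tasks averaging 90% CPU against a 70% target, the estimate is 4 tasks; at 35% CPU it falls back to the minimum of 2.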
Capacity Providers
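# Create a service that mixes Fargate and Fargate Spot via a capacity provider strategy
# (omit --launch-type when using a strategy; both providers must be associated with the
# cluster, and awsvpc task definitions still need --network-configuration as shown earlier)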
aws ecs create-service \
--cluster production \
--service-name web-service \
--task-definition my-web-app \
--desired-count 6 \
--capacity-provider-strategy \
capacityProvider=FARGATE,weight=1,base=2 \
capacityProvider=FARGATE_SPOT,weight=3
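The `base` and `weight` values above determine how the 6 tasks are split: the base is placed on its provider first, and the remainder is divided in proportion to the weights. The sketch below works through that arithmetic for two providers. The function name `split_tasks` is hypothetical, and the rounding rule for counts that do not divide evenly is an assumption, not documented ECS behavior.

```shell
# split_tasks DESIRED BASE WEIGHT_A WEIGHT_B
# Prints "<tasks on A> <tasks on B>", where provider A carries the base.
# Rounding (floor for A, remainder to B) is an assumption for illustration.
split_tasks() (
  desired=$1
  base=$2
  weight_a=$3
  weight_b=$4
  if [ "$desired" -le "$base" ]; then
    # Everything fits inside the base, so provider B gets nothing.
    echo "$desired 0"
    return 0
  fi
  rest=$(( desired - base ))
  share_a=$(( rest * weight_a / (weight_a + weight_b) ))
  share_b=$(( rest - share_a ))
  echo "$(( base + share_a )) $share_b"
)

# Usage: split_tasks 6 2 1 3
```

For the service above this gives 3 tasks on FARGATE (2 base plus 1 of the remaining 4) and 3 on FARGATE_SPOT.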
Advanced Usage
Blue/Green Deployment with CodeDeploy
{
"deploymentController": {
"type": "CODE_DEPLOY"
}
}
# appspec.yaml
version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        TaskDefinition: <TASK_DEFINITION>
        LoadBalancerInfo:
          ContainerName: "web"
          ContainerPort: 8080
Service Connect
{
"serviceConnectConfiguration": {
"enabled": true,
"namespace": "production",
"services": [
{
"portName": "http",
"discoveryName": "web-service",
"clientAliases": [
{
"port": 8080,
"dnsName": "web-service"
}
]
}
]
}
}
Troubleshooting
| Issue | Solution |
|---|---|
| Task fails to start | Check the task's stoppedReason with aws ecs describe-tasks, then the container logs in the configured CloudWatch log group (for example /ecs/<task-family>) |
| Service stuck deploying | Check aws ecs describe-services for event messages; verify health check |
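| Unable to pull image | Ensure the execution role allows ecr:GetAuthorizationToken, ecr:BatchCheckLayerAvailability, ecr:GetDownloadUrlForLayer, and ecr:BatchGetImage, and that tasks in private subnets can reach ECR through a NAT gateway or VPC endpoints |
| ECS Exec not working | Ensure the Session Manager plugin is installed locally, the task role allows the ssmmessages channel actions (CreateControlChannel, CreateDataChannel, OpenControlChannel, OpenDataChannel), and the task was started after --enable-execute-command was set |
| Tasks keep getting killed | Check memory limits; a container killed for exceeding its memory typically exits with code 137 (128 + SIGKILL) |
| ALB returns 502 | The target is probably refusing or dropping connections: verify the container listens on the port registered in the target group, that health checks pass, and that the service's health check grace period covers container startup |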
| Fargate platform version errors | Specify --platform-version LATEST or a specific version like 1.4.0 |
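Exit codes above 128 usually mean the process was terminated by a signal, which is how 137 maps to an out-of-memory kill (128 + 9, SIGKILL). The helper below is a small sketch for decoding a code reported by a stopped task; the name `explain_exit_code` is hypothetical.

```shell
# explain_exit_code CODE
# Translates a container exit code into a short explanation.
# Codes above 128 are treated as 128 + signal number when the shell
# recognizes that signal.
explain_exit_code() (
  code=$1
  if [ "$code" -gt 128 ] && name=$(kill -l "$(( code - 128 ))" 2>/dev/null); then
    echo "killed by signal $(( code - 128 )) (SIG$name)"
  else
    echo "exited with status $code"
  fi
)

# Usage: explain_exit_code 137
```

The exit code and reason for each container in a stopped task can be read with `aws ecs describe-tasks --cluster <name> --tasks <arn> --query 'tasks[0].containers[].[name,exitCode,reason]'`.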