Amazon ECS Cheat Sheet

Overview

Amazon Elastic Container Service (ECS) is a fully managed container orchestration service that simplifies deploying, managing, and scaling containerized applications. Unlike Kubernetes-based solutions, ECS uses its own native orchestration engine deeply integrated with the AWS ecosystem, offering a streamlined experience for teams already invested in AWS. ECS handles container placement, scheduling, and lifecycle management while integrating with IAM, VPC, CloudWatch, ALB/NLB, and other AWS services out of the box.

ECS offers two main launch types: Fargate (serverless: you define CPU and memory, and AWS manages the infrastructure) and EC2 (you manage the underlying instances). A third type, External, covers on-premises servers registered through ECS Anywhere. The core abstractions are task definitions (blueprints for one or more containers), tasks (running instances of a task definition), services (which keep a desired count of tasks running and can register them with a load balancer), and clusters (logical groupings of tasks, services, and capacity). ECS also supports ECS Exec for interactive debugging, capacity providers for scaling strategies, and Service Connect for simplified service-to-service networking.

Installation

AWS CLI Setup

# Install AWS CLI v2
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

# Configure credentials
aws configure

# Install ECS CLI (legacy but useful)
sudo curl -Lo /usr/local/bin/ecs-cli https://amazon-ecs-cli.s3.amazonaws.com/ecs-cli-linux-amd64-latest
sudo chmod +x /usr/local/bin/ecs-cli

# Install copilot CLI (recommended for new projects)
curl -Lo copilot https://github.com/aws/copilot-cli/releases/latest/download/copilot-linux
chmod +x copilot
sudo mv copilot /usr/local/bin/
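
After installing, it can help to confirm that each tool is actually on the PATH. The helper below is a minimal sketch; `missing_tools` is a hypothetical name, not part of any AWS tooling, and it only checks that a command can be found, not that it is configured correctly.

```shell
#!/bin/sh
# missing_tools TOOL...
# Prints the name of every tool that cannot be found on the PATH, one per line.
# (Hypothetical helper for illustration only.)
missing_tools() (
  for tool in "$@"; do
    # command -v is the POSIX way to test whether a command exists
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
)

# Example (not run here):
#   missing_tools aws ecs-cli copilot session-manager-plugin
```

An empty result means everything was found. `session-manager-plugin` is included in the example because ECS Exec needs it on the machine that runs `aws ecs execute-command`.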

Core Commands

Command | Description
aws ecs create-cluster --cluster-name <name> | Create a new ECS cluster
aws ecs list-clusters | List all clusters
aws ecs describe-clusters --clusters <name> | Describe cluster details
aws ecs register-task-definition --cli-input-json file://task.json | Register a task definition
aws ecs create-service | Create a service
aws ecs update-service | Update a service (deploy a new version, scale, change settings)
aws ecs list-tasks --cluster <name> | List running tasks
aws ecs stop-task --cluster <name> --task <arn> | Stop a task (accepts a task ID or ARN)
aws ecs execute-command | Run a command or open an interactive shell in a container (ECS Exec)
aws ecs delete-service --cluster <name> --service <name> --force | Delete a service (--force deletes it even if tasks are still running)

Task Definition

{
  "family": "my-web-app",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "essential": true,
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/my-web-app",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "environment": [
        {"name": "NODE_ENV", "value": "production"}
      ],
      "secrets": [
        {
          "name": "DB_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:db-password"
        }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
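
Fargate only accepts certain `cpu`/`memory` pairs at the task level, and registering a task definition with an unsupported pair fails. The sketch below checks a pair against the published Fargate size table as I understand it. `valid_fargate_size` is a hypothetical helper name; verify the ranges against the current AWS documentation, since the 8 vCPU and 16 vCPU sizes also depend on the platform version.

```shell
#!/bin/sh
# valid_fargate_size CPU_UNITS MEMORY_MIB
# Exit status 0 if the pair is a supported Fargate task size, non-zero otherwise.
# (Hypothetical helper; the table is an assumption to verify against AWS docs.)
valid_fargate_size() (
  cpu=$1
  mem=$2
  case "$cpu" in
    256)
      # 0.25 vCPU supports exactly three memory sizes
      case "$mem" in 512|1024|2048) exit 0 ;; *) exit 1 ;; esac ;;
    512)   min=1024;  max=4096;   step=1024 ;;
    1024)  min=2048;  max=8192;   step=1024 ;;
    2048)  min=4096;  max=16384;  step=1024 ;;
    4096)  min=8192;  max=30720;  step=1024 ;;
    8192)  min=16384; max=61440;  step=4096 ;;
    16384) min=32768; max=122880; step=8192 ;;
    *)     exit 1 ;;
  esac
  # Memory must lie in [min, max] and sit on a step boundary
  [ "$mem" -ge "$min" ] && [ "$mem" -le "$max" ] && [ $(( (mem - min) % step )) -eq 0 ]
)

# The task definition above uses cpu=256, memory=512
if valid_fargate_size 256 512; then
  echo "256/512 is a valid Fargate size"
fi
```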

Create and Manage Services

# Register task definition
aws ecs register-task-definition --cli-input-json file://task-def.json

# Create a Fargate service with ALB
aws ecs create-service \
  --cluster production \
  --service-name web-service \
  --task-definition my-web-app:1 \
  --desired-count 3 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc123,subnet-def456],securityGroups=[sg-abc123],assignPublicIp=ENABLED}" \
  --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/abc123,containerName=web,containerPort=8080"

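# Update service (deploy new task definition revision)
# A new revision triggers a deployment on its own; add --force-new-deployment
# only to redeploy an unchanged revision (for example, to re-pull a mutable tag).
aws ecs update-service \
  --cluster production \
  --service web-service \
  --task-definition my-web-app:2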

# Scale service
aws ecs update-service \
  --cluster production \
  --service web-service \
  --desired-count 5
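
`update-service` returns as soon as the deployment is accepted, not when it finishes. In scripts it is common to follow it with the `services-stable` waiter. The function below is a sketch of that pattern; `deploy_and_wait` is a hypothetical name, and the cluster, service, and task definition values are taken from the examples above.

```shell
#!/bin/sh
# deploy_and_wait CLUSTER SERVICE TASK_DEFINITION
# Starts a deployment, then blocks until the service reaches a steady state.
# (Hypothetical wrapper around two real AWS CLI commands.)
deploy_and_wait() (
  cluster=$1 service=$2 task_def=$3

  # Start the deployment and print the task definition ARN now in use
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --task-definition "$task_def" \
    --query 'service.taskDefinition' --output text || exit 1

  # The waiter polls every 15 seconds and gives up after 40 failed checks,
  # so its exit status tells the caller whether the rollout settled in time
  aws ecs wait services-stable --cluster "$cluster" --services "$service"
)

# Example (not run here):
#   deploy_and_wait production web-service my-web-app:2
```

A non-zero exit status means either the update was rejected or the service did not stabilize within the waiter's limit. In both cases the service events from `aws ecs describe-services` are the next place to look.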

ECS Exec (Container Debugging)

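# Enable ECS Exec on a service
# Only tasks started after this change get ECS Exec, so force a new deployment.
aws ecs update-service \
  --cluster production \
  --service web-service \
  --enable-execute-command \
  --force-new-deployment

# Start interactive session
# Requires the Session Manager plugin on your machine; use /bin/sh if the image has no bash.
aws ecs execute-command \
  --cluster production \
  --task abc123def456 \
  --container web \
  --interactive \
  --command "/bin/bash"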
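
Task IDs change with every deployment, so looking one up by hand gets tedious. The sketch below finds the first running task of a service and opens a session in it. `exec_into_service` is a hypothetical helper name, and it defaults to `/bin/sh` on the assumption that not every image ships bash.

```shell
#!/bin/sh
# exec_into_service CLUSTER SERVICE CONTAINER [COMMAND]
# Opens an ECS Exec session in the first running task of a service.
# (Hypothetical helper; COMMAND defaults to /bin/sh.)
exec_into_service() (
  cluster=$1 service=$2 container=$3 cmd=${4:-/bin/sh}

  # Ask for the first running task ARN; --output text prints "None" when the list is empty
  task_arn=$(aws ecs list-tasks \
    --cluster "$cluster" \
    --service-name "$service" \
    --desired-status RUNNING \
    --query 'taskArns[0]' --output text) || exit 1

  if [ -z "$task_arn" ] || [ "$task_arn" = "None" ]; then
    echo "no running tasks found for $service" >&2
    exit 1
  fi

  aws ecs execute-command \
    --cluster "$cluster" \
    --task "$task_arn" \
    --container "$container" \
    --interactive \
    --command "$cmd"
)

# Example (not run here):
#   exec_into_service production web-service web
```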

AWS Copilot (Simplified Workflow)

# Initialize application
copilot init --app my-app --name web --type "Load Balanced Web Service" --dockerfile ./Dockerfile

# Deploy to environment
copilot deploy --name web --env production

# View logs
copilot svc logs --name web --env production --follow

# Show service status
copilot svc status --name web

# Add a new service
copilot svc init --name api --svc-type "Backend Service"

# Scale: edit the count field in copilot/web/manifest.yml, then redeploy
copilot svc deploy --name web --env production

Configuration

Auto Scaling

# Register scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/production/web-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 2 \
  --max-capacity 20

# Create target tracking scaling policy
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/production/web-service \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-name cpu-scaling \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleInCooldown": 300,
    "ScaleOutCooldown": 60
  }'

Capacity Providers

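# Create a service that mixes Fargate and Fargate Spot via a capacity provider strategy.
# The cluster must already have the FARGATE and FARGATE_SPOT capacity providers
# associated (aws ecs put-cluster-capacity-providers); do not also pass --launch-type.
aws ecs create-service \
  --cluster production \
  --service-name web-service \
  --task-definition my-web-app \
  --desired-count 6 \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-abc123,subnet-def456],securityGroups=[sg-abc123]}" \
  --capacity-provider-strategy \
    capacityProvider=FARGATE,weight=1,base=2 \
    capacityProvider=FARGATE_SPOT,weight=3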
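
To see what a strategy like this means in task counts: the `base` tasks go to the provider that declares the base, and the remaining tasks are split in proportion to the weights. The sketch below does that arithmetic for two providers. `split_tasks` is a hypothetical name, and the rounding for remainders that do not divide evenly is my assumption, not documented ECS behavior.

```shell
#!/bin/sh
# split_tasks DESIRED BASE WEIGHT_A WEIGHT_B
# Prints "A B": the approximate task counts for provider A (which holds the
# base) and provider B. (Hypothetical helper; rounding is an assumption.)
split_tasks() (
  desired=$1 base=$2 wa=$3 wb=$4

  # The base can never exceed the desired count
  if [ "$base" -gt "$desired" ]; then
    base=$desired
  fi
  rest=$((desired - base))

  if [ $((wa + wb)) -eq 0 ]; then
    # No weights: everything beyond the base is left unassigned in this sketch
    echo "$base 0"
    exit 0
  fi

  # B gets its integer share of the remainder; A takes the rest,
  # so the two numbers always add up to the desired count
  b=$((rest * wb / (wa + wb)))
  a=$((base + rest - b))
  echo "$a $b"
)

# Strategy from the example: desired=6, FARGATE base=2 weight=1, FARGATE_SPOT weight=3
split_tasks 6 2 1 3   # prints "3 3"
```

With six tasks, two are pinned to FARGATE by the base and the other four split 1:3, which gives three tasks on each provider.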

Advanced Usage

Blue/Green Deployment with CodeDeploy

{
  "deploymentController": {
    "type": "CODE_DEPLOY"
  }
}

# appspec.yaml
version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        TaskDefinition: <TASK_DEFINITION>
        LoadBalancerInfo:
          ContainerName: "web"
          ContainerPort: 8080

Service Connect

The portName must match the name of a port mapping in the task definition, so the mapping for container port 8080 needs "name": "http" before this configuration is accepted. The namespace must be an existing AWS Cloud Map namespace.

{
  "serviceConnectConfiguration": {
    "enabled": true,
    "namespace": "production",
    "services": [
      {
        "portName": "http",
        "discoveryName": "web-service",
        "clientAliases": [
          {
            "port": 8080,
            "dnsName": "web-service"
          }
        ]
      }
    ]
  }
}

Troubleshooting

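Issue | Solution
Task fails to start | Check the stopped reason with aws ecs describe-tasks, then the container logs in the CloudWatch log group set by awslogs-group (for example /ecs/<task-family>)
Service stuck deploying | Check the events in the aws ecs describe-services output; verify the container and target group health checks
Unable to pull image | Ensure the execution role has ecr:GetAuthorizationToken, ecr:BatchCheckLayerAvailability, ecr:GetDownloadUrlForLayer, and ecr:BatchGetImage, and that tasks in private subnets can reach ECR through a NAT gateway or VPC endpoints
ECS Exec not working | Ensure the task was started after --enable-execute-command was set, the task role allows the ssmmessages actions, the ExecuteCommandAgent managed agent reports RUNNING in aws ecs describe-tasks, and the Session Manager plugin is installed locally
Tasks keep getting killed | Check memory limits; a container killed for exceeding its memory exits with code 137 (128 + SIGKILL) and the stopped reason mentions OutOfMemoryError
ALB returns 502 | The target is likely unhealthy or not yet listening; verify the container listens on the mapped port, the target group health check path and port are correct, and the health check grace period is long enough for startup
Fargate platform version errors | Specify --platform-version LATEST or a specific version like 1.4.0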
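
Exit codes above 128 follow the shell convention of 128 plus the signal number, which is where 137 (128 + 9, SIGKILL) comes from. The sketch below decodes a container exit code along those lines. `explain_exit_code` is a hypothetical helper name, and the descriptions are hints rather than a diagnosis: 137 can also appear when a container is force-killed after ignoring SIGTERM during a stop.

```shell
#!/bin/sh
# explain_exit_code CODE
# Prints a short interpretation of a container exit code.
# (Hypothetical helper; uses the 128 + signal convention.)
explain_exit_code() (
  code=$1
  if [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    case "$sig" in
      9)  echo "killed by SIGKILL (signal 9): often an out-of-memory kill or a forced stop" ;;
      15) echo "terminated by SIGTERM (signal 15): usually a normal stop request" ;;
      *)  echo "terminated by signal $sig" ;;
    esac
  elif [ "$code" -eq 0 ]; then
    echo "exited normally"
  else
    echo "application error (exit code $code)"
  fi
)

explain_exit_code 137
```

To confirm an out-of-memory kill, compare the result with the task's stopped reason from `aws ecs describe-tasks`.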