Overview
AWS Step Functions is a serverless orchestration service that lets you combine AWS Lambda functions and other AWS services to build business-critical applications. It provides a visual workflow editor and uses Amazon States Language (ASL) to define state machines that coordinate multiple services into resilient workflows.
Step Functions offers two workflow types: Standard workflows for long-running, durable processes (up to one year) and Express workflows for high-volume, short-duration event processing. It provides built-in error handling, retry logic, parallel execution, and direct integrations with over 200 AWS services without writing custom code.
Installation
AWS CLI Setup
# Install AWS CLI (pip installs v1; AWS CLI v2 is distributed as a platform installer)
pip install awscli
# Configure credentials
aws configure
# Create a state machine
aws stepfunctions create-state-machine \
--name MyWorkflow \
--definition file://definition.json \
--role-arn arn:aws:iam::123456789012:role/StepFunctionsRole
# Start an execution
aws stepfunctions start-execution \
--state-machine-arn arn:aws:states:us-east-1:123456789012:stateMachine:MyWorkflow \
--input '{"orderId": "12345"}'
CDK (TypeScript)
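import { Duration } from 'aws-cdk-lib';
import * as sfn from 'aws-cdk-lib/aws-stepfunctions';
import * as tasks from 'aws-cdk-lib/aws-stepfunctions-tasks';
// validateFn and paymentFn are assumed to be Lambda functions defined elsewhere in the stack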
const validateOrder = new tasks.LambdaInvoke(this, 'Validate', {
lambdaFunction: validateFn,
resultPath: '$.validation',
});
const processPayment = new tasks.LambdaInvoke(this, 'Payment', {
lambdaFunction: paymentFn,
resultPath: '$.payment',
});
const definition = validateOrder
.next(new sfn.Choice(this, 'IsValid?')
.when(sfn.Condition.booleanEquals('$.validation.Payload.valid', true),
processPayment.next(new sfn.Succeed(this, 'Done')))
.otherwise(new sfn.Fail(this, 'Invalid', {
cause: 'Order validation failed',
})));
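new sfn.StateMachine(this, 'OrderWorkflow', {
// The `definition` prop is deprecated in recent CDK v2 releases; definitionBody replaces it
definitionBody: sfn.DefinitionBody.fromChainable(definition),
timeout: Duration.minutes(30),
stateMachineType: sfn.StateMachineType.STANDARD,
});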
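The Choice condition in the CDK example reads `$.validation.Payload.valid`, while the raw ASL example later in this document reads `$.validation.valid`. The difference comes from how the Lambda result is wrapped: `LambdaInvoke` by default returns the invoke response, with the function's return value under `Payload`, whereas a Task that uses the Lambda ARN directly receives the return value itself. The following is a minimal sketch of that lookup, assuming a simplified path resolver (`resolvePath` is a hypothetical helper, not part of CDK or Step Functions) and made-up state data.

```typescript
// Minimal sketch: resolve a simple dot-separated path such as "$.a.b.c".
// Only plain field access is handled; array indexes and filters are not.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function resolvePath(data: Json, path: string): Json | undefined {
  if (path === "$") return data;
  if (!path.startsWith("$.")) throw new Error(`unsupported path: ${path}`);
  let current: Json | undefined = data;
  for (const key of path.slice(2).split(".")) {
    if (current === undefined || current === null ||
        typeof current !== "object" || Array.isArray(current)) {
      return undefined;
    }
    current = current[key];
  }
  return current;
}

// Illustrative state data after the 'Validate' task (values are made up).
const afterValidate: Json = {
  orderId: "12345",
  validation: { StatusCode: 200, Payload: { valid: true } },
};

const viaPayload = resolvePath(afterValidate, "$.validation.Payload.valid");
const withoutPayload = resolvePath(afterValidate, "$.validation.valid");
console.log(viaPayload, withoutPayload); // true undefined
```

If the Choice state pointed at `$.validation.valid` here, the path would not resolve and the rule could not match.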
State Types
| State | Description |
|---|---|
| Task | Execute work (Lambda, API call, service integration) |
| Pass | Pass input to output, optionally transforming data |
| Wait | Delay for a specified time |
| Choice | Branch based on conditions |
| Parallel | Execute branches simultaneously |
| Map | Iterate over an array |
| Succeed | End execution successfully |
| Fail | End execution with an error |
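One way to keep the transition rules for these state types straight is a small checker: Succeed and Fail are terminal, Choice transitions through its rules and `Default`, and every other state needs either `Next` or `End: true`. The sketch below assumes a heavily simplified state shape (`StateDef`, `MachineDef`, and `findTransitionProblems` are hypothetical names for illustration); the service performs its own, more complete validation when a definition is created.

```typescript
// Hypothetical, simplified shape of a state; real ASL states carry more fields.
interface StateDef {
  Type: "Task" | "Pass" | "Wait" | "Choice" | "Parallel" | "Map" | "Succeed" | "Fail";
  Next?: string;
  End?: boolean;
  Default?: string;
  Choices?: { Next: string }[];
}

interface MachineDef {
  StartAt: string;
  States: Record<string, StateDef>;
}

// Returns a list of human-readable problems; an empty list means none found.
function findTransitionProblems(machine: MachineDef): string[] {
  const problems: string[] = [];
  const known = new Set(Object.keys(machine.States));
  if (!known.has(machine.StartAt)) {
    problems.push(`StartAt references unknown state "${machine.StartAt}"`);
  }
  for (const [stateName, state] of Object.entries(machine.States)) {
    const targets: string[] = [];
    if (state.Type === "Choice") {
      // Choice states transition through their rules and the optional Default.
      for (const rule of state.Choices ?? []) targets.push(rule.Next);
      if (state.Default !== undefined) targets.push(state.Default);
    } else if (state.Type !== "Succeed" && state.Type !== "Fail") {
      // Every other non-terminal state needs either Next or End: true.
      if (state.Next !== undefined) targets.push(state.Next);
      else if (state.End !== true) problems.push(`${stateName}: missing Next or End`);
    }
    for (const target of targets) {
      if (!known.has(target)) problems.push(`${stateName}: unknown target "${target}"`);
    }
  }
  return problems;
}

// Deliberately broken sample: "Missing" does not exist and "Dangling" has no transition.
const sampleMachine: MachineDef = {
  StartAt: "Validate",
  States: {
    Validate: { Type: "Task", Next: "Check" },
    Check: { Type: "Choice", Choices: [{ Next: "Done" }], Default: "Missing" },
    Dangling: { Type: "Pass" },
    Done: { Type: "Succeed" },
  },
};

const transitionProblems = findTransitionProblems(sampleMachine);
console.log(transitionProblems);
```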
Amazon States Language (ASL)
Basic Workflow
{
"Comment": "Order processing workflow",
"StartAt": "ValidateOrder",
"States": {
"ValidateOrder": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:validate",
"ResultPath": "$.validation",
"Next": "CheckValid"
},
"CheckValid": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.validation.valid",
"BooleanEquals": true,
"Next": "ProcessPayment"
}
],
"Default": "OrderFailed"
},
"ProcessPayment": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:payment",
"Retry": [
{
"ErrorEquals": ["States.TaskFailed"],
"IntervalSeconds": 3,
"MaxAttempts": 3,
"BackoffRate": 2.0
}
],
"Catch": [
{
"ErrorEquals": ["States.ALL"],
"Next": "PaymentFailed",
"ResultPath": "$.error"
}
],
"Next": "OrderComplete"
},
"OrderComplete": {
"Type": "Succeed"
},
"OrderFailed": {
"Type": "Fail",
"Cause": "Order validation failed"
},
"PaymentFailed": {
"Type": "Fail",
"Cause": "Payment processing failed"
}
}
}
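The `Retry` block on `ProcessPayment` waits `IntervalSeconds` before the first retry and multiplies the wait by `BackoffRate` for each subsequent attempt. A minimal sketch of that arithmetic follows; `RetryPolicy` and `retryDelays` are hypothetical names, and optional fields such as `MaxDelaySeconds` and `JitterStrategy` are deliberately ignored.

```typescript
// Hypothetical mirror of the Retry fields used in the workflow above.
interface RetryPolicy {
  IntervalSeconds: number;
  MaxAttempts: number;
  BackoffRate: number;
}

// Delay before retry n (1-based) is IntervalSeconds * BackoffRate^(n - 1).
function retryDelays(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < policy.MaxAttempts; attempt++) {
    delays.push(policy.IntervalSeconds * Math.pow(policy.BackoffRate, attempt));
  }
  return delays;
}

const paymentRetryDelays = retryDelays({ IntervalSeconds: 3, MaxAttempts: 3, BackoffRate: 2.0 });
console.log(paymentRetryDelays); // [ 3, 6, 12 ]
```

With these values the task is retried after 3, 6, and 12 seconds. If the third retry also fails, the `Catch` block routes the execution to `PaymentFailed`.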
Parallel Execution
{
"Type": "Parallel",
"Branches": [
{
"StartAt": "SendEmail",
"States": {
"SendEmail": {
"Type": "Task",
"Resource": "arn:aws:lambda:...:sendEmail",
"End": true
}
}
},
{
"StartAt": "UpdateDB",
"States": {
"UpdateDB": {
"Type": "Task",
"Resource": "arn:aws:lambda:...:updateDB",
"End": true
}
}
}
],
"Next": "Done"
}
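A Parallel state hands the same input to every branch and produces an array with one element per branch, in the order the branches are declared. The sketch below models only that data flow, not the concurrency; `runParallel` and the two stand-in branch functions are hypothetical.

```typescript
type Branch<I, O> = (input: I) => O;

// Each branch receives the same input; results come back in branch order.
function runParallel<I, O>(input: I, branches: Branch<I, O>[]): O[] {
  return branches.map((branch) => branch(input));
}

interface OrderInput {
  orderId: string;
}

const parallelOutput = runParallel<OrderInput, Record<string, string>>(
  { orderId: "12345" },
  [
    (order) => ({ emailSentFor: order.orderId }), // stands in for SendEmail
    (order) => ({ dbUpdatedFor: order.orderId }), // stands in for UpdateDB
  ],
);
console.log(JSON.stringify(parallelOutput));
```

The state named in `Next` (here `Done`) therefore receives a two-element array rather than a single object, which matters when writing its `InputPath` or `Parameters`.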
Map State (Iterate)
{
"Type": "Map",
"ItemsPath": "$.items",
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "DISTRIBUTED",
"ExecutionType": "STANDARD"
},
"StartAt": "ProcessItem",
"States": {
"ProcessItem": {
"Type": "Task",
"Resource": "arn:aws:lambda:...:processItem",
"End": true
}
}
},
"MaxConcurrency": 10,
"Next": "Aggregate"
}
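`MaxConcurrency` caps how many iterations run at the same time. To get a rough feel for its effect on total run time, the sketch below simulates a pool of slots in which each item starts as soon as a slot is free. It is a simplified model under stated assumptions (`simulateMapDuration` is a hypothetical helper, per-item durations are known in advance, and a `MaxConcurrency` of 0 is not modeled); it does not reflect how the service schedules work internally.

```typescript
// Simulate a pool of `maxConcurrency` slots processing items in order.
// Returns the time (in seconds) at which the last item finishes.
function simulateMapDuration(itemSeconds: number[], maxConcurrency: number): number {
  if (maxConcurrency < 1) throw new Error("maxConcurrency must be at least 1");
  // Times at which each concurrency slot becomes free again.
  const slotFreeAt: number[] = [];
  for (let i = 0; i < maxConcurrency; i++) slotFreeAt.push(0);
  let finishedAt = 0;
  for (const seconds of itemSeconds) {
    slotFreeAt.sort((a, b) => a - b);
    const start = slotFreeAt.shift() ?? 0; // earliest free slot
    const end = start + seconds;
    slotFreeAt.push(end);
    finishedAt = Math.max(finishedAt, end);
  }
  return finishedAt;
}

// 25 items of 2 seconds each (made-up numbers).
const twentyFiveItems: number[] = Array.from({ length: 25 }, () => 2);
const cappedAtTen = simulateMapDuration(twentyFiveItems, 10);
const oneAtATime = simulateMapDuration(twentyFiveItems, 1);
console.log(cappedAtTen, oneAtATime); // 6 50
```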
Service Integrations
{
"SendSNS": {
"Type": "Task",
"Resource": "arn:aws:states:::sns:publish",
"Parameters": {
"TopicArn": "arn:aws:sns:us-east-1:123456789012:alerts",
"Message.$": "$.message"
},
"Next": "Done"
},
"WriteDynamoDB": {
"Type": "Task",
"Resource": "arn:aws:states:::dynamodb:putItem",
"Parameters": {
"TableName": "Orders",
"Item": {
"orderId": {"S.$": "$.orderId"},
"status": {"S": "completed"}
}
},
"Next": "Done"
},
"StartECSTask": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"LaunchType": "FARGATE",
"Cluster": "arn:aws:ecs:...:cluster/my-cluster",
"TaskDefinition": "arn:aws:ecs:...:task-definition/my-task:1"
},
"Next": "Done"
}
}
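The suffix on a service-integration `Resource` selects the integration pattern: no suffix means request/response (continue as soon as the API call returns), `.sync` means wait for the job to finish, and `.waitForTaskToken` means pause until a callback arrives. The sketch below parses the ARN forms used in this document; `parseIntegration` is a hypothetical helper and does not cover every ARN shape the service accepts.

```typescript
type IntegrationPattern = "requestResponse" | "sync" | "waitForTaskToken";

interface ParsedIntegration {
  service: string;
  action: string;
  pattern: IntegrationPattern;
}

// Parses ARNs of the form arn:aws:states:::<service>:<action>[.<suffix>].
function parseIntegration(resource: string): ParsedIntegration {
  const prefix = "arn:aws:states:::";
  if (!resource.startsWith(prefix)) {
    throw new Error(`not a service integration ARN: ${resource}`);
  }
  const rest = resource.slice(prefix.length);
  const colon = rest.indexOf(":");
  if (colon < 0) throw new Error(`missing action in: ${resource}`);
  const service = rest.slice(0, colon);
  const actionAndSuffix = rest.slice(colon + 1);
  const dot = actionAndSuffix.indexOf(".");
  const action = dot < 0 ? actionAndSuffix : actionAndSuffix.slice(0, dot);
  const suffix = dot < 0 ? "" : actionAndSuffix.slice(dot + 1);

  let pattern: IntegrationPattern = "requestResponse";
  if (suffix === "waitForTaskToken") pattern = "waitForTaskToken";
  else if (suffix === "sync" || suffix === "sync:2") pattern = "sync";
  else if (suffix !== "") throw new Error(`unknown suffix: ${suffix}`);
  return { service, action, pattern };
}

const snsIntegration = parseIntegration("arn:aws:states:::sns:publish");
const ecsIntegration = parseIntegration("arn:aws:states:::ecs:runTask.sync");
const sqsIntegration = parseIntegration("arn:aws:states:::sqs:sendMessage.waitForTaskToken");
console.log(snsIntegration.pattern, ecsIntegration.pattern, sqsIntegration.pattern);
```

In the examples above, `SendSNS` and `WriteDynamoDB` move on immediately after the call, while `StartECSTask` holds the execution until the ECS task stops.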
CLI Commands
| Command | Description |
|---|---|
| aws stepfunctions list-state-machines | List all state machines |
| aws stepfunctions describe-state-machine --state-machine-arn <arn> | Get details |
| aws stepfunctions start-execution --state-machine-arn <arn> --input '{}' | Start execution |
| aws stepfunctions describe-execution --execution-arn <arn> | Get execution status |
| aws stepfunctions list-executions --state-machine-arn <arn> | List executions |
| aws stepfunctions get-execution-history --execution-arn <arn> | Get event history |
| aws stepfunctions stop-execution --execution-arn <arn> | Stop execution |
| aws stepfunctions update-state-machine --state-machine-arn <arn> --definition file://def.json | Update definition |
Advanced Usage
Wait for Callback (Task Token)
{
"WaitForApproval": {
"Type": "Task",
"Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
"Parameters": {
"QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/approvals",
"MessageBody": {
"taskToken.$": "$$.Task.Token",
"orderId.$": "$.orderId"
}
},
"TimeoutSeconds": 86400,
"Next": "ProcessApproval"
}
}
# Send task token callback from external system
aws stepfunctions send-task-success \
--task-token "token-value" \
--output '{"approved": true}'
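On the consuming side, a worker reads the SQS message, extracts the task token, and reports the outcome. The sketch below covers only the pure parts of that flow: parsing the message body produced by `WaitForApproval` and building the two values that `send-task-success` needs. The names `parseApprovalMessage` and `buildTaskSuccess` are hypothetical, and the actual SQS receive and Step Functions call are left out.

```typescript
// Shape of the MessageBody defined in the WaitForApproval state.
interface ApprovalMessage {
  taskToken: string;
  orderId: string;
}

// The two values passed to send-task-success (token plus JSON-encoded output).
interface TaskSuccessRequest {
  taskToken: string;
  output: string;
}

function parseApprovalMessage(body: string): ApprovalMessage {
  const parsed: unknown = JSON.parse(body);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("message body is not a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  const taskToken = record["taskToken"];
  const orderId = record["orderId"];
  if (typeof taskToken !== "string" || typeof orderId !== "string") {
    throw new Error("message body is missing taskToken or orderId");
  }
  return { taskToken, orderId };
}

function buildTaskSuccess(message: ApprovalMessage, approved: boolean): TaskSuccessRequest {
  // The output must be a JSON string; it becomes the task's result.
  return { taskToken: message.taskToken, output: JSON.stringify({ approved }) };
}

// Made-up message body for illustration.
const sampleBody = JSON.stringify({ taskToken: "token-value", orderId: "12345" });
const successRequest = buildTaskSuccess(parseApprovalMessage(sampleBody), true);
console.log(successRequest);
```

A worker that rejects the request would report failure instead (the CLI equivalent is `aws stepfunctions send-task-failure`), so the execution does not wait until `TimeoutSeconds` expires.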
JSONPath Data Manipulation
{
"Type": "Pass",
"Parameters": {
"orderId.$": "$.order.id",
"itemPrices.$": "$.order.items[*].price",
"timestamp.$": "$$.State.EnteredTime",
"executionId.$": "$$.Execution.Id"
},
"ResultPath": "$.processed",
"Next": "NextState"
}
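`ResultPath` decides where a state's result lands relative to its input: a path such as `$.processed` adds the result under that key and keeps the rest of the input, `$` replaces the input with the result, and `null` discards the result. The sketch below models those three cases for simple dot-separated paths only; `applyResultPath` is a hypothetical helper and the data is made up.

```typescript
type JsonValue = null | boolean | number | string | JsonValue[] | { [key: string]: JsonValue };
type JsonObject = { [key: string]: JsonValue };

// Writes `result` into a copy of `input` at a simple path such as "$.processed".
function applyResultPath(input: JsonObject, resultPath: string | null, result: JsonValue): JsonValue {
  if (resultPath === null) return input; // result discarded, input passes through
  if (resultPath === "$") return result; // result replaces the whole input
  if (!resultPath.startsWith("$.")) throw new Error(`unsupported ResultPath: ${resultPath}`);
  const keys = resultPath.slice(2).split(".");
  const lastKey = keys.pop();
  if (lastKey === undefined || lastKey === "") throw new Error("empty ResultPath segment");
  const output = JSON.parse(JSON.stringify(input)) as JsonObject; // deep copy
  let cursor: JsonObject = output;
  for (const key of keys) {
    const child: JsonValue | undefined = cursor[key];
    if (child !== undefined && child !== null && typeof child === "object" && !Array.isArray(child)) {
      cursor = child;
    } else {
      // Create intermediate objects as needed (a simplification).
      const fresh: JsonObject = {};
      cursor[key] = fresh;
      cursor = fresh;
    }
  }
  cursor[lastKey] = result;
  return output;
}

const stateInput: JsonObject = { order: { id: "12345" } };
const passResult: JsonValue = { orderId: "12345" };

const merged = applyResultPath(stateInput, "$.processed", passResult);
const replaced = applyResultPath(stateInput, "$", passResult);
const discarded = applyResultPath(stateInput, null, passResult);
console.log(JSON.stringify(merged));
```

With `"ResultPath": "$.processed"` as in the Pass state above, the next state still sees the original `order` object alongside the new `processed` key.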
Troubleshooting
| Issue | Solution |
|---|---|
| Execution failed immediately | Check IAM role permissions; verify Lambda ARNs are correct |
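| State machine stuck | For callback tasks, confirm the worker calls send-task-success or send-task-failure with the task token, and set TimeoutSeconds so a lost token cannot wait indefinitely; missing End or Next fields are rejected when the definition is saved |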
| Input/output data lost | Review InputPath, OutputPath, ResultPath configuration |
| Choice state falls through | Define a Default state (without one, an unmatched input fails the execution with States.NoChoiceMatched); verify the Variable path in each rule |
| Express workflow limits | Express max 5 min duration; use Standard for longer workflows |
| Map state failures | Check ItemsPath points to an array; verify individual item processing |
| Timeout errors | Increase TimeoutSeconds; check downstream service latency |
| Retry not working | Verify ErrorEquals matches the actual error type thrown |
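For the "Retry not working" row, it helps to see how `ErrorEquals` is matched: names are compared as exact, case-sensitive strings, retriers are scanned in order, and the first match wins. The sketch below also encodes two wildcard rules as they are described in the Step Functions error-handling documentation, to the best of our understanding: `States.ALL` matches any known error name, and `States.TaskFailed` matches any error except `States.Timeout`. The names `Retrier`, `matchesError`, and `findRetrierIndex` are hypothetical, `PaymentDeclined` is a made-up custom error, and terminal errors that no retrier can handle are not modeled.

```typescript
interface Retrier {
  ErrorEquals: string[];
}

function matchesError(errorEquals: string[], errorName: string): boolean {
  return errorEquals.some((candidate) => {
    if (candidate === "States.ALL") return true;
    // Assumed wildcard rule: States.TaskFailed covers everything except States.Timeout.
    if (candidate === "States.TaskFailed") return errorName !== "States.Timeout";
    return candidate === errorName; // exact, case-sensitive comparison
  });
}

// Retriers are scanned in order; returns the index of the first match or -1.
function findRetrierIndex(retriers: Retrier[], errorName: string): number {
  return retriers.findIndex((retrier) => matchesError(retrier.ErrorEquals, errorName));
}

const taskFailedOnly: Retrier[] = [{ ErrorEquals: ["States.TaskFailed"] }];
const throttlingOnly: Retrier[] = [{ ErrorEquals: ["Lambda.TooManyRequestsException"] }];

const customErrorRetried = findRetrierIndex(taskFailedOnly, "PaymentDeclined");
const timeoutRetried = findRetrierIndex(taskFailedOnly, "States.Timeout");
const customErrorVsThrottling = findRetrierIndex(throttlingOnly, "PaymentDeclined");
console.log(customErrorRetried, timeoutRetried, customErrorVsThrottling); // 0 -1 -1
```

Under these assumptions, the `ProcessPayment` retrier shown earlier would not retry a timeout; adding `States.Timeout` to `ErrorEquals` would cover that case.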