Argo Rollouts Cheat Sheet

Overview

Argo Rollouts is a Kubernetes controller and set of CRDs that provide advanced deployment strategies beyond what Kubernetes Deployments offer. It supports canary and blue-green deployments, experimentation, and progressive delivery with automated analysis and rollback. It integrates with service meshes, ingress controllers, and metric providers for traffic management and automated promotion decisions.

Argo Rollouts replaces the standard Kubernetes Deployment resource with a Rollout resource that offers fine-grained control over the update process. It can integrate with Prometheus, Datadog, New Relic, and other monitoring tools to automatically promote or roll back based on real-time metrics.

Installation

# Install Argo Rollouts controller
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml

# Install kubectl plugin
brew install argoproj/tap/kubectl-argo-rollouts
# or download the binary directly (Linux amd64 shown)
curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
chmod +x kubectl-argo-rollouts-linux-amd64
sudo mv kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts

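# Verify the plugin is installed
kubectl argo rollouts version

# Start the local dashboard (serves http://localhost:3100 by default; blocks the terminal)
kubectl argo rollouts dashboard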

Core Commands

| Command | Description |
|---|---|
| kubectl argo rollouts list rollouts | List all rollouts in the current namespace |
| kubectl argo rollouts get rollout <name> | Get rollout details |
| kubectl argo rollouts status <name> | Show rollout status |
| kubectl argo rollouts set image <name> <container>=<image> | Update a container image (triggers a new rollout) |
| kubectl argo rollouts promote <name> | Promote to the next step |
| kubectl argo rollouts promote <name> --full | Promote fully, skipping remaining steps, pauses, and analysis |
| kubectl argo rollouts abort <name> | Abort the rollout and return to the stable version |
| kubectl argo rollouts retry rollout <name> | Retry an aborted rollout |
| kubectl argo rollouts undo <name> | Roll back to the previous revision |
| kubectl argo rollouts dashboard | Start the local web dashboard |

Canary Deployment

Basic Canary

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
spec:
  replicas: 5
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0.0
          ports:
            - containerPort: 8080
  strategy:
    canary:
      steps:
        - setWeight: 20
        - pause: { duration: 5m }
        - setWeight: 40
        - pause: { duration: 5m }
        - setWeight: 60
        - pause: { duration: 5m }
        - setWeight: 80
        - pause: { duration: 5m }
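
The timed pauses above promote automatically. A pause without a duration holds the rollout until someone runs the promote command, which is a common way to gate the first canary step. The fragment below is a minimal sketch of the steps section only; the weights and durations are illustrative, not recommendations. Note that without traffic routing the controller approximates each weight through replica counts on a best-effort basis, so with 5 replicas a weight of 20 corresponds to roughly one canary pod.

```yaml
strategy:
  canary:
    steps:
      - setWeight: 20
      # No duration: waits indefinitely until
      # `kubectl argo rollouts promote my-app` is run
      - pause: {}
      - setWeight: 60
      # Timed pause: continues automatically after 10 minutes
      - pause: { duration: 10m }
```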

Canary with Traffic Management (Istio)

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0.0
  strategy:
    canary:
      canaryService: my-app-canary
      stableService: my-app-stable
      trafficRouting:
        istio:
          virtualServices:
            - name: my-app-vsvc
              routes:
                - primary
      steps:
        - setWeight: 10
        - pause: { duration: 2m }
        - setWeight: 30
        - pause: { duration: 5m }
        - setWeight: 50
        - pause: { duration: 5m }
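
The Rollout above refers to two Services and a VirtualService that must already exist. The manifests below are a sketch of what those could look like. The host name, gateway name, and port numbers are assumptions for illustration; only the resource names and the route name primary come from the Rollout. The controller rewrites the two route weights at each setWeight step.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-stable
spec:
  selector:
    app: my-app            # the controller adds a pod-template-hash selector
  ports:
    - port: 80             # assumed service port
      targetPort: 8080     # assumed container port
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-canary
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-app-vsvc
spec:
  hosts:
    - my-app.example.com   # hypothetical host
  gateways:
    - my-app-gateway       # hypothetical gateway
  http:
    - name: primary        # must match routes[] in the Rollout
      route:
        - destination:
            host: my-app-stable
          weight: 100
        - destination:
            host: my-app-canary
          weight: 0
```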

Blue-Green Deployment

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0.0
          ports:
            - containerPort: 8080
  strategy:
    blueGreen:
      activeService: my-app-active
      previewService: my-app-preview
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 30
      prePromotionAnalysis:
        templates:
          - templateName: success-rate
        args:
          - name: service-name
            value: my-app-preview
      postPromotionAnalysis:
        templates:
          - templateName: success-rate
        args:
          - name: service-name
            value: my-app-active
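
The activeService and previewService named above are ordinary Services that you create yourself. A minimal sketch, assuming service port 80 (the target port matches the containerPort in the Rollout): both start with the same selector, and the controller points each one at the right ReplicaSet by adding a pod-template-hash selector.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-active
spec:
  selector:
    app: my-app
  ports:
    - port: 80             # assumed service port
      targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-preview
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```

With autoPromotionEnabled set to false, the new version receives only preview traffic until the rollout is promoted.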

Analysis Templates

Prometheus-Based Analysis

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
    - name: service-name
  metrics:
    - name: success-rate
      interval: 1m
      count: 5
      successCondition: result[0] >= 0.95
      failureLimit: 3
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{service="{{args.service-name}}",status=~"2.."}[5m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))

Canary with Inline Analysis

strategy:
  canary:
    steps:
      - setWeight: 20
      - pause: { duration: 2m }
      - analysis:
          templates:
            - templateName: success-rate
          args:
            - name: service-name
              value: my-app-canary
      - setWeight: 50
      - pause: { duration: 5m }
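      - analysis:
          templates:
            - templateName: success-rate
          # service-name has no default in the template, so it must be passed here too
          args:
            - name: service-name
              value: my-app-canary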
      - setWeight: 80
      - pause: { duration: 5m }
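
Inline analysis steps block the rollout until the run finishes. The alternative is background analysis, which runs alongside the steps and aborts the rollout if it fails. The fragment below is a sketch of that variant using the same success-rate template; the startingStep value and the weights are illustrative.

```yaml
strategy:
  canary:
    analysis:
      templates:
        - templateName: success-rate
      # Zero-based step index: analysis starts at the third step (setWeight: 50)
      startingStep: 2
      args:
        - name: service-name
          value: my-app-canary
    steps:
      - setWeight: 20
      - pause: { duration: 2m }
      - setWeight: 50
      - pause: { duration: 5m }
```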

Web Metric Analysis

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: webcheck
spec:
  metrics:
    - name: webcheck
      interval: 30s
      count: 3
      # jsonPath below already extracts the status field, so result is the bare value
      successCondition: result == "healthy"
      provider:
        web:
          url: "http://my-app-canary:8080/healthz"
          jsonPath: "{$.status}"

Configuration

NGINX Ingress Traffic Routing

strategy:
  canary:
    canaryService: my-app-canary
    stableService: my-app-stable
    trafficRouting:
      nginx:
        stableIngress: my-app-ingress
        annotationPrefix: nginx.ingress.kubernetes.io
    steps:
      - setWeight: 10
      - pause: { duration: 5m }
      - setWeight: 50
      - pause: { duration: 10m }
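
The stableIngress referenced above has to exist and route to the stable Service. A minimal sketch, assuming a hypothetical host name, an ingress class called nginx, and service port 80: the controller then creates and manages a second, canary Ingress carrying the NGINX canary annotations, so you do not write that one by hand.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  ingressClassName: nginx          # assumed ingress class
  rules:
    - host: my-app.example.com     # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-stable   # must match stableService
                port:
                  number: 80          # assumed service port
```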

Anti-Affinity and Scaling

spec:
  strategy:
    canary:
      maxSurge: "25%"
      maxUnavailable: 0
      canaryMetadata:
        labels:
          role: canary
      stableMetadata:
        labels:
          role: stable
      scaleDownDelaySeconds: 30
      abortScaleDownDelaySeconds: 30
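
The block above covers the scaling side. For anti-affinity, the canary strategy accepts an antiAffinity field that keeps new-version pods off nodes running the previous version. The fragment below is a sketch; the weight is an illustrative value.

```yaml
spec:
  strategy:
    canary:
      antiAffinity:
        # Soft rule: prefer nodes without previous-version pods (weight 1-100)
        preferredDuringSchedulingIgnoredDuringExecution:
          weight: 1
        # Hard rule alternative (use one or the other):
        # requiredDuringSchedulingIgnoredDuringExecution: {}
```

The hard rule can leave canary pods unschedulable on small clusters, so the soft rule is usually the safer starting point.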

Advanced Usage

Experiments

strategy:
  canary:
    steps:
      - experiment:
          duration: 20m
          templates:
            - name: baseline
              specRef: stable
              replicas: 2
            - name: canary
              specRef: canary
              replicas: 2
          analyses:
            - name: compare
              templateName: compare-metrics
              args:
                - name: baseline-hash
                  valueFrom:
                    podTemplateHashValue: Stable
                - name: canary-hash
                  valueFrom:
                    podTemplateHashValue: Latest
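
The experiment above references a compare-metrics template that is not defined in this sheet. The manifest below is one possible sketch of it, not a canonical definition. It assumes Prometheus exposes the pod label as rollouts_pod_template_hash, reuses the http_requests_total metric and Prometheus address from the earlier template, and uses an illustrative one-percentage-point tolerance.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: compare-metrics
spec:
  args:
    - name: baseline-hash
    - name: canary-hash
  metrics:
    - name: success-rate-delta
      interval: 1m
      count: 10
      failureLimit: 2
      # Pass while the canary success rate is no more than 0.01 below baseline
      successCondition: result[0] >= -0.01
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          # Label name rollouts_pod_template_hash is an assumption about
          # how pod labels are relabelled in your Prometheus setup
          query: |
            (
              sum(rate(http_requests_total{rollouts_pod_template_hash="{{args.canary-hash}}",status=~"2.."}[5m]))
              /
              sum(rate(http_requests_total{rollouts_pod_template_hash="{{args.canary-hash}}"}[5m]))
            )
            -
            (
              sum(rate(http_requests_total{rollouts_pod_template_hash="{{args.baseline-hash}}",status=~"2.."}[5m]))
              /
              sum(rate(http_requests_total{rollouts_pod_template_hash="{{args.baseline-hash}}"}[5m]))
            )
```

If either side has no traffic the query returns an empty result, so in practice the experiment pods need to receive requests for the comparison to be meaningful.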

Notifications

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
  annotations:
    notifications.argoproj.io/subscribe.on-rollout-completed.slack: my-channel
    notifications.argoproj.io/subscribe.on-rollout-aborted.slack: alerts-channel
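
The annotations above only subscribe the Rollout to triggers; the controller also needs a configured Slack service. The sketch below assumes the controller runs in the argo-rollouts namespace, that the default triggers and templates are already installed, and that the resource names follow the defaults described in the Argo Rollouts notification docs. The token value is a placeholder. Merge the service.slack key into the existing ConfigMap rather than replacing its other keys.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: argo-rollouts-notification-secret
  namespace: argo-rollouts
stringData:
  slack-token: "<slack-bot-token>"   # placeholder, supply your own token
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: argo-rollouts-notification-configmap
  namespace: argo-rollouts
data:
  # References the slack-token key in the Secret above
  service.slack: |
    token: $slack-token
```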

Rollout Operations

# Trigger a new rollout
kubectl argo rollouts set image my-app my-app=my-app:2.0.0

# Watch rollout progress
kubectl argo rollouts get rollout my-app --watch

# Manual promote
kubectl argo rollouts promote my-app

# Full promote (skip remaining steps)
kubectl argo rollouts promote my-app --full

# Abort (rollback)
kubectl argo rollouts abort my-app

# Retry after abort
kubectl argo rollouts retry rollout my-app

# Rollback to previous
kubectl argo rollouts undo my-app
kubectl argo rollouts undo my-app --to-revision=2

Troubleshooting

| Issue | Solution |
|---|---|
| Rollout stuck in Progressing | Check analysis results; promote or abort manually |
| Analysis always failing | Verify Prometheus query and successCondition |
| Traffic not shifting | Verify service mesh/ingress controller integration |
| Canary pods not created | Check resource quotas and node capacity |
| Rollback not working | Verify revisionHistoryLimit is set high enough |
| Dashboard not accessible | Run kubectl argo rollouts dashboard and check port |

# Debug rollout
kubectl argo rollouts get rollout my-app
kubectl argo rollouts status my-app

# Check analysis runs
kubectl get analysisrun -l rollouts-pod-template-hash

# View controller logs
kubectl logs -n argo-rollouts deployment/argo-rollouts

# List all rollouts across namespaces
kubectl argo rollouts list rollouts --all-namespaces

# Describe for events
kubectl describe rollout my-app