Nomad

Comprehensive HashiCorp Nomad commands and workflows for workload orchestration, job scheduling, and cluster management.

Installation & Setup

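| Command | Description |
|---------|-------------|
| `nomad version` | Show the Nomad version |
| `nomad agent -dev` | Start a single-node development agent |
| `nomad agent -config=nomad.hcl` | Start an agent with a configuration file |
| `nomad server members` | List server members |
| `nomad node status` | List client nodes |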

Job Management

Job Operations

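| Command | Description |
|---------|-------------|
| `nomad job run example.nomad` | Submit a job |
| `nomad job status` | List all jobs |
| `nomad job status example` | Show job details |
| `nomad job stop example` | Stop a job |
| `nomad job stop -purge example` | Stop a job and purge it from the system |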

Job Planning and Validation

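| Command | Description |
|---------|-------------|
| `nomad job plan example.nomad` | Dry-run a job and show the planned changes |
| `nomad job validate example.nomad` | Validate a job file |
| `nomad job inspect example` | Inspect the submitted job specification |
| `nomad job history example` | Show job version history |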

Job Scaling

| Command | Description |
|---------|-------------|
| `nomad job scale example 5` | Scale a single-group job to 5 instances |
| `nomad job scale example group 3` | Scale the task group named `group` to 3 instances |
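
The counts passed to `nomad job scale` can be bounded in the job file with a group-level `scaling` block. The following is a minimal sketch; the group name `cache` and the bounds are illustrative assumptions, not values from this page.

```hcl
job "example" {
  datacenters = ["dc1"]

  group "cache" {
    # Initial count; `nomad job scale example cache <n>` changes it later.
    count = 3

    # Scale requests outside min..max are rejected.
    scaling {
      enabled = true
      min     = 1
      max     = 10
    }

    # ... task definitions omitted
  }
}
```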

Allocation Management

Allocation Operations

| Command | Description |
|---------|-------------|
| `nomad job allocs example` | List allocations for a job |
| `nomad alloc status ALLOC_ID` | Show allocation details |
| `nomad alloc logs ALLOC_ID` | Show allocation logs |
| `nomad alloc logs -f ALLOC_ID` | Follow allocation logs |
| `nomad alloc exec ALLOC_ID /bin/bash` | Run a command inside the allocation |

Allocation Debugging

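| Command | Description |
|---------|-------------|
| `nomad alloc fs ALLOC_ID` | List files in the allocation directory |
| `nomad alloc fs ALLOC_ID /path/to/file` | Read a file from the allocation directory |
| `nomad alloc restart ALLOC_ID` | Restart an allocation |
| `nomad alloc stop ALLOC_ID` | Stop an allocation |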

Node Management

Node Operations

| Command | Description |
|---------|-------------|
| `nomad node status` | List all nodes |
| `nomad node status NODE_ID` | Show node details |
| `nomad node drain -enable NODE_ID` | Drain a node |
| `nomad node eligibility -disable NODE_ID` | Mark a node ineligible for new placements |
| `nomad node eligibility -enable NODE_ID` | Mark a node eligible for scheduling again |

Node Maintenance

| Command | Description |
|---------|-------------|
| `nomad node drain -enable -deadline 30m NODE_ID` | Drain with a 30-minute deadline |
| `nomad node drain -disable NODE_ID` | Cancel a drain |
| `nomad node meta apply -node-id NODE_ID key=value` | Set node metadata |
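
How quickly a drain moves service allocations off a node is governed by each group's `migrate` block. A sketch with commonly used values follows; the group name `web` is an assumption, and the numbers should be tuned to the workload.

```hcl
job "web" {
  datacenters = ["dc1"]

  group "web" {
    count = 3

    # Applied when a node running this group is drained.
    migrate {
      max_parallel     = 1        # move one allocation at a time
      health_check     = "checks" # wait for service checks to pass
      min_healthy_time = "10s"    # replacement must stay healthy this long
      healthy_deadline = "5m"     # give up on a replacement after this
    }

    # ... task definitions omitted
  }
}
```

With these settings, a drain proceeds only as fast as replacements become healthy, until the `-deadline` passed to `nomad node drain` expires.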

Namespace Management

| Command | Description |
|---------|-------------|
| `nomad namespace list` | List namespaces |
| `nomad namespace status default` | Show namespace details |
| `nomad namespace apply -description="Dev environment" dev` | Create or update a namespace |
| `nomad namespace delete dev` | Delete a namespace |
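
A job is placed in a namespace with the job-level `namespace` attribute. The sketch below assumes the `dev` namespace already exists; the job, group, and task names are placeholders.

```hcl
job "example" {
  # Must name an existing namespace; defaults to "default" when omitted.
  namespace   = "dev"
  datacenters = ["dc1"]

  group "app" {
    task "sleep" {
      driver = "docker"

      config {
        image   = "alpine:latest"
        command = "sleep"
        args    = ["3600"]
      }
    }
  }
}
```

CLI commands then need `-namespace=dev` (or the `NOMAD_NAMESPACE` environment variable) to see the job.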

ACL Management

ACL Operations

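| Command | Description |
|---------|-------------|
| `nomad acl bootstrap` | Bootstrap the ACL system and print the initial management token |
| `nomad acl token create -name="dev-token" -policy=dev-policy` | Create a token |
| `nomad acl token list` | List tokens |
| `nomad acl token info TOKEN_ID` | Show token details (takes the token's accessor ID) |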

ACL Policies

| Command | Description |
|---------|-------------|
| `nomad acl policy apply dev-policy dev-policy.hcl` | Create or update a policy |
| `nomad acl policy list` | List policies |
| `nomad acl policy info dev-policy` | Show policy details |
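
As an illustration of what a policy file such as `dev-policy.hcl` might contain, here is a hedged sketch. The `dev` namespace and the chosen capabilities are assumptions; adjust them to the access the token actually needs.

```hcl
# Read access to the "dev" namespace, plus the ability to submit jobs
# and read task logs there.
namespace "dev" {
  policy       = "read"
  capabilities = ["submit-job", "read-logs"]
}

# Allow listing and inspecting client nodes, but not draining them.
node {
  policy = "read"
}
```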

Monitoring and Debugging

System Information

| Command | Description |
|---------|-------------|
| `nomad operator raft list-peers` | List Raft peers |
| `nomad operator snapshot save backup.snap` | Save a snapshot of server state |
| `nomad operator snapshot restore backup.snap` | Restore server state from a snapshot |

Monitoring

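| Command | Description |
|---------|-------------|
| `nomad monitor` | Stream agent logs |
| `nomad monitor -log-level=DEBUG` | Stream agent logs at DEBUG level |
| `nomad status` | List jobs (same output as `nomad job status`) |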
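
For metrics rather than logs, the agent configuration accepts a `telemetry` block. A minimal sketch, assuming a Prometheus-style scraper is used:

```hcl
telemetry {
  collection_interval        = "1s"
  prometheus_metrics         = true # expose /v1/metrics?format=prometheus
  publish_allocation_metrics = true # per-allocation resource usage
  publish_node_metrics       = true # per-node resource usage
}
```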

Job Specification Examples

Basic Web Service

hcl
job "web" {
  datacenters = ["dc1"]
  type = "service"
  
  group "web" {
    count = 3
    
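    network {
      port "http" {
        static = 8080 # reserved on the host, so only one allocation fits per node
        to     = 80   # port nginx listens on inside the container
      }
    }
    
    service {
      name = "web"
      port = "http"
      
      check {
        type     = "http"
        path     = "/" # the stock nginx image serves "/" but has no /health route
        interval = "10s"
        timeout  = "2s"
      }
    }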
    
    task "server" {
      driver = "docker"
      
      config {
        image = "nginx:latest"
        ports = ["http"]
      }
      
      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}

Batch Job

hcl
job "batch-job" {
  datacenters = ["dc1"]
  type = "batch"
  
  group "processing" {
    count = 1
    
    task "process" {
      driver = "docker"
      
      config {
        image = "alpine:latest"
        command = "sh"
        args = ["-c", "echo 'Processing data...' && sleep 30"]
      }
      
      resources {
        cpu    = 200
        memory = 256
      }
    }
  }
}

Periodic Job

hcl
job "backup" {
  datacenters = ["dc1"]
  type = "batch"
  
  periodic {
    cron             = "0 2 * * *"
    prohibit_overlap = true
  }
  
  group "backup" {
    task "backup-task" {
      driver = "docker"
      
      config {
        image = "backup-tool:latest"
        command = "/backup.sh"
      }
      
      resources {
        cpu    = 100
        memory = 256
      }
    }
  }
}

System Job

hcl
job "monitoring" {
  datacenters = ["dc1"]
  type = "system"
  
  group "monitoring" {
    task "node-exporter" {
      driver = "docker"
      
      config {
        image = "prom/node-exporter:latest"
        network_mode = "host"
        pid_mode = "host"
      }
      
      resources {
        cpu    = 50
        memory = 64
      }
    }
  }
}

Configuration Examples

Server Configuration

hcl
datacenter = "dc1"
data_dir = "/opt/nomad/data"
log_level = "INFO"
bind_addr = "0.0.0.0"

server {
  enabled = true
  bootstrap_expect = 3
  
  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}

acl {
  enabled = true
}

ui {
  enabled = true
}

Client Configuration

hcl
datacenter = "dc1"
data_dir = "/opt/nomad/data"
log_level = "INFO"
bind_addr = "0.0.0.0"

client {
  enabled = true
  
  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }
  
  node_class = "compute"
  
  meta {
    "type" = "worker"
    "zone" = "us-east-1a"
  }
}

plugin "docker" {
  config {
    # Lets jobs request privileged containers; enable only if workloads need it.
    allow_privileged = true
    volumes {
      enabled = true
    }
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}

Advanced Features

Constraints and Affinities

hcl
job "web" {
  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }
  
  affinity {
    attribute = "${node.class}"
    value     = "compute"
    weight    = 100
  }
  
  group "web" {
    constraint {
      attribute = "${meta.zone}"
      value     = "us-east-1a"
    }
    
    # ... rest of group configuration
  }
}
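
Constraints also accept an `operator`, and a `spread` block distributes allocations across an attribute. The sketch below is illustrative only; the version bound and the choice of spread attribute are assumptions.

```hcl
job "web" {
  datacenters = ["dc1"]

  # Only place on clients running a recent enough Nomad version.
  constraint {
    attribute = "${attr.nomad.version}"
    operator  = "semver"
    value     = ">= 1.5.0"
  }

  group "web" {
    count = 3

    # Never put two allocations of this group on the same node.
    constraint {
      operator = "distinct_hosts"
      value    = "true"
    }

    # Prefer an even distribution across datacenters.
    spread {
      attribute = "${node.datacenter}"
      weight    = 100
    }

    # ... rest of group configuration
  }
}
```

A `constraint` is a hard requirement, while `affinity` and `spread` only influence scoring, so a job with unsatisfiable constraints stays pending.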

Volume Management

hcl
job "database" {
  group "db" {
    volume "data" {
      type      = "host"
      source    = "mysql_data"
      read_only = false
    }
    
    task "mysql" {
      driver = "docker"
      
      volume_mount {
        volume      = "data"
        destination = "/var/lib/mysql"
      }
      
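      config {
        image = "mysql:8.0"
      }

      # The official mysql image exits unless a root password option is set.
      env {
        MYSQL_RANDOM_ROOT_PASSWORD = "yes"
      }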
    }
  }
}
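
A `host` volume must be declared on the client before a job can claim it. A sketch of the matching client configuration follows; the path `/opt/mysql/data` is an assumed location.

```hcl
client {
  enabled = true

  # The name must match the volume's `source` in the job file.
  host_volume "mysql_data" {
    path      = "/opt/mysql/data"
    read_only = false
  }
}
```

The job is then only placed on clients that advertise a host volume with that name.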

Service Discovery Integration

hcl
job "api" {
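  group "api" {
    # Connect sidecars need bridge networking, and the "http" port label used
    # by the service must be declared here. Container port 8080 is an assumed value.
    network {
      mode = "bridge"

      port "http" {
        to = 8080
      }
    }

    service {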
      name = "api"
      port = "http"
      
      tags = [
        "api",
        "v1.0",
        "traefik.enable=true",
        "traefik.http.routers.api.rule=Host(`api.example.com`)"
      ]
      
      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }
      
      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "database"
              local_bind_port  = 5432
            }
          }
        }
      }
    }
  }
}

Best Practices

Job Design

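  1. Resource Allocation: Set cpu (MHz) and memory (MB) in every task's resources block
  2. Health Checks: Define service checks so unhealthy allocations are detected
  3. Graceful Shutdown: Handle the kill signal (SIGTERM for Docker tasks, SIGINT for most other drivers) and exit within kill_timeout
  4. Logging: Use structured logging with consistent levels
  5. Configuration: Use template blocks and environment variables instead of baking settings into images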
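
The points above might be combined in a single job file along these lines. This is a sketch under stated assumptions: the image `example/web:1.0`, its port 8080, its `/health` route, and the `meta.zone` node metadata key are all hypothetical.

```hcl
job "web" {
  datacenters = ["dc1"]

  group "web" {
    network {
      port "http" {
        to = 8080 # assumed container port
      }
    }

    service {
      name = "web"
      port = "http"

      check {
        type     = "http"
        path     = "/health" # assumed health route
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "server" {
      driver = "docker"

      # Graceful shutdown: signal to send and how long to wait before a kill.
      kill_signal  = "SIGTERM"
      kill_timeout = "30s"

      config {
        image = "example/web:1.0" # hypothetical image
        ports = ["http"]
      }

      # Static configuration through environment variables.
      env {
        LOG_FORMAT = "json"
      }

      # Rendered configuration, exported as environment variables.
      template {
        destination = "local/app.env"
        env         = true
        data        = <<EOT
NODE_ZONE={{ env "meta.zone" }}
EOT
      }

      resources {
        cpu    = 200 # MHz
        memory = 256 # MB
      }
    }
  }
}
```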

Cluster Management

  1. High Availability: Run 3 or 5 server nodes per region so Raft keeps quorum when one fails
  2. Backup Strategy: Take regular snapshots with nomad operator snapshot save and store them off-cluster
  3. Monitoring: Monitor cluster health and job status
  4. Capacity Planning: Plan for resource requirements
  5. Security: Enable ACLs and use TLS
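
For the TLS part of the last item, the agent configuration takes a `tls` block. A hedged sketch follows; the certificate paths are assumptions and depend on how the certificates were issued.

```hcl
tls {
  http = true # encrypt the HTTP API
  rpc  = true # encrypt server and client RPC

  ca_file   = "/etc/nomad.d/tls/nomad-agent-ca.pem"
  cert_file = "/etc/nomad.d/tls/global-server-nomad.pem"
  key_file  = "/etc/nomad.d/tls/global-server-nomad-key.pem"

  verify_server_hostname = true # check the role and region in peer certificates
  verify_https_client    = true # require client certificates on the HTTP API
}
```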

Operations

  1. Rolling Updates: Use update strategies for zero downtime
  2. Canary Deployments: Test changes with canary deployments
  3. Resource Monitoring: Monitor resource usage
  4. Log Aggregation: Centralize log collection
  5. Alerting: Set up alerts for critical issues
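
Rolling updates and canaries are both configured through the `update` block. A minimal sketch with illustrative values:

```hcl
job "web" {
  datacenters = ["dc1"]

  group "web" {
    count = 3

    update {
      max_parallel     = 1     # replace one allocation at a time
      canary           = 1     # start one canary before touching old allocations
      min_healthy_time = "10s" # how long a new allocation must stay healthy
      healthy_deadline = "5m"  # mark it unhealthy if not healthy by then
      auto_revert      = true  # roll back to the last stable version on failure
      auto_promote     = false # wait for a manual promotion of the canary
    }

    # ... task definitions omitted
  }
}
```

With `auto_promote = false`, the deployment pauses after the canary becomes healthy until `nomad deployment promote <deployment-id>` is run.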

Security

  1. ACL Policies: Implement least privilege access
  2. Network Security: Use service mesh for secure communication
  3. Secrets Management: Integrate with Vault for secrets
  4. Image Security: Scan container images for vulnerabilities
  5. Audit Logging: Enable audit logging for compliance (a Nomad Enterprise feature)
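
For the Vault item, a task can read a secret through a `template` block. The sketch below assumes a KV v2 secret at `secret/data/myapp` with a `password` field, a Vault policy named `myapp-read`, and a hypothetical application image. It uses the `policies` parameter of the token-based integration; clusters that use workload identity may need a `role` instead.

```hcl
job "app" {
  datacenters = ["dc1"]

  group "app" {
    task "app" {
      driver = "docker"

      config {
        image = "example/app:1.0" # hypothetical image
      }

      # Request a Vault token limited to the named policy (assumed name).
      vault {
        policies = ["myapp-read"]
      }

      # Render the secret into the task's secrets directory and export it
      # as an environment variable.
      template {
        destination = "secrets/app.env"
        env         = true
        data        = <<EOT
{{ with secret "secret/data/myapp" }}
DB_PASSWORD={{ .Data.data.password }}
{{ end }}
EOT
      }
    }
  }
}
```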