Nomad

Comprehensive HashiCorp Nomad commands and workflows for workload orchestration, job scheduling, and cluster management.

Installation & Setup

Command	Description
`nomad version`	Show Nomad version
`nomad agent -dev`	Start development agent
`nomad agent -config=nomad.hcl`	Start with configuration
`nomad server members`	List server members
`nomad node status`	List client nodes

Job Management

Job Operations

Command	Description
`nomad job run example.nomad`	Submit job
`nomad job status`	List all jobs
`nomad job status example`	Show job details
`nomad job stop example`	Stop job
`nomad job stop -purge example`	Stop and purge job

Job Planning and Validation

Command	Description
`nomad job plan example.nomad`	Plan job changes
`nomad job validate example.nomad`	Validate job file
`nomad job inspect example`	Inspect job configuration
`nomad job history example`	Show job history

Job Scaling

Command	Description
`nomad job scale example 5`	Scale job to 5 instances (the group may be omitted only when the job has a single task group)
`nomad job scale example GROUP 3`	Scale a specific task group to 3 instances

Allocation Management

Allocation Operations

Command	Description
`nomad alloc status -json`	List all allocations as JSON (without an output format flag, an allocation ID is required)
`nomad alloc status ALLOC_ID`	Show allocation details
`nomad alloc logs ALLOC_ID`	Show allocation logs
`nomad alloc logs -f ALLOC_ID`	Follow allocation logs
`nomad alloc exec ALLOC_ID /bin/bash`	Execute command in allocation

Allocation Debugging

Command	Description
`nomad alloc fs ALLOC_ID`	List allocation files
`nomad alloc fs ALLOC_ID /path/to/file`	Read allocation file
`nomad alloc restart ALLOC_ID`	Restart allocation
`nomad alloc stop ALLOC_ID`	Stop allocation

Node Management

Node Operations

Command	Description
`nomad node status`	List all nodes
`nomad node status NODE_ID`	Show node details
`nomad node drain -enable NODE_ID`	Drain node (either `-enable` or `-disable` is required)
`nomad node eligibility -disable NODE_ID`	Disable node scheduling
`nomad node eligibility -enable NODE_ID`	Enable node scheduling

Node Maintenance

Command	Description
`nomad node drain -enable -deadline 30m NODE_ID`	Drain with deadline
`nomad node drain -disable NODE_ID`	Cancel drain
`nomad node meta apply -node-id NODE_ID key=value`	Set node metadata

Namespace Management

Command	Description
`nomad namespace list`	List namespaces
`nomad namespace status default`	Show namespace details
`nomad namespace apply -description="Dev environment" dev`	Create namespace
`nomad namespace delete dev`	Delete namespace

ACL Management

ACL Operations

Command	Description
`nomad acl bootstrap`	Bootstrap ACL system
`nomad acl token create -name="dev-token" -policy=dev-policy`	Create token
`nomad acl token list`	List tokens
`nomad acl token info TOKEN_ID`	Show token details (TOKEN_ID is the token's accessor ID)

ACL Policies

Command	Description
`nomad acl policy apply dev-policy dev-policy.hcl`	Create/update policy
`nomad acl policy list`	List policies
`nomad acl policy info dev-policy`	Show policy details
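
The `dev-policy.hcl` file passed to `nomad acl policy apply` is not shown in this guide. A minimal sketch of what such a file might contain, assuming a `dev` namespace like the one created under Namespace Management; the specific rules are illustrative and not a recommended baseline:

```hcl
# dev-policy.hcl (hypothetical contents)

# Full job access inside the "dev" namespace.
namespace "dev" {
  policy = "write"
}

# Read-only access to the default namespace.
namespace "default" {
  policy = "read"
}

# Allow viewing node status, but not draining or modifying nodes.
node {
  policy = "read"
}
```

A token created with `-policy=dev-policy` would then be limited to these rules.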

Monitoring and Debugging

System Information

Command	Description
`nomad operator raft list-peers`	List Raft peers
`nomad operator snapshot save backup.snap`	Create snapshot
`nomad operator snapshot restore backup.snap`	Restore snapshot

Monitoring

Command	Description
`nomad monitor`	Stream logs
`nomad monitor -log-level=DEBUG`	Debug level logs
`nomad status`	List jobs; with an ID or prefix, show the matching job, allocation, node, or evaluation

Job Specification Examples

Basic Web Service

hcl

job "web" {
  datacenters = ["dc1"]
  type = "service"
  
  group "web" {
    count = 3
    
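    network {
      port "http" {
        static = 8080 # host port; a static port allows only one allocation per node
        to     = 80   # container port that nginx listens on
      }
    }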
    
    service {
      name = "web"
      port = "http"
      
      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }
    }
    
    task "server" {
      driver = "docker"
      
      config {
        image = "nginx:latest"
        ports = ["http"]
      }
      
      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}

Batch Job

hcl

job "batch-job" {
  datacenters = ["dc1"]
  type = "batch"
  
  group "processing" {
    count = 1
    
    task "process" {
      driver = "docker"
      
      config {
        image = "alpine:latest"
        command = "sh"
        args = ["-c", "echo 'Processing data...' && sleep 30"]
      }
      
      resources {
        cpu    = 200
        memory = 256
      }
    }
  }
}

Periodic Job

hcl

job "backup" {
  datacenters = ["dc1"]
  type = "batch"
  
  periodic {
    cron             = "0 2 * * *"
    prohibit_overlap = true
  }
  
  group "backup" {
    task "backup-task" {
      driver = "docker"
      
      config {
        image = "backup-tool:latest"
        command = "/backup.sh"
      }
      
      resources {
        cpu    = 100
        memory = 256
      }
    }
  }
}

System Job

hcl

job "monitoring" {
  datacenters = ["dc1"]
  type = "system"
  
  group "monitoring" {
    task "node-exporter" {
      driver = "docker"
      
      config {
        image = "prom/node-exporter:latest"
        network_mode = "host"
        pid_mode = "host"
      }
      
      resources {
        cpu    = 50
        memory = 64
      }
    }
  }
}

Configuration Examples

Server Configuration

hcl

datacenter = "dc1"
data_dir = "/opt/nomad/data"
log_level = "INFO"
bind_addr = "0.0.0.0"

server {
  enabled = true
  bootstrap_expect = 3
  
  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}

acl {
  enabled = true
}

ui {
  enabled = true
}

Client Configuration

hcl

datacenter = "dc1"
data_dir = "/opt/nomad/data"
log_level = "INFO"
bind_addr = "0.0.0.0"

client {
  enabled = true
  
  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }
  
  node_class = "compute"
  
  meta {
    "type" = "worker"
    "zone" = "us-east-1a"
  }
}

plugin "docker" {
  config {
    allow_privileged = true
    volumes {
      enabled = true
    }
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}

Advanced Features

Constraints and Affinities

hcl

job "web" {
  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }
  
  affinity {
    attribute = "${node.class}"
    value     = "compute"
    weight    = 100
  }
  
  group "web" {
    constraint {
      attribute = "${meta.zone}"
      value     = "us-east-1a"
    }
    
    # ... rest of group configuration
  }
}

Volume Management

hcl

job "database" {
  group "db" {
    volume "data" {
      type      = "host"
      source    = "mysql_data"
      read_only = false
    }
    
    task "mysql" {
      driver = "docker"
      
      volume_mount {
        volume      = "data"
        destination = "/var/lib/mysql"
      }
      
      config {
        image = "mysql:8.0"
      }
    }
  }
}

Service Discovery Integration

hcl

job "api" {
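  group "api" {
    # Connect sidecars require bridge networking, and the "http" port
    # label used by the service below must be declared here.
    network {
      mode = "bridge"
      port "http" {}
    }
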
    service {
      name = "api"
      port = "http"
      
      tags = [
        "api",
        "v1.0",
        "traefik.enable=true",
        "traefik.http.routers.api.rule=Host(`api.example.com`)"
      ]
      
      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }
      
      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "database"
              local_bind_port  = 5432
            }
          }
        }
      }
    }
  }
}

Best Practices

Job Design

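Resource Allocation: Set `cpu` (MHz) and `memory` (MB) in each task's `resources` block, sized from observed usage
Health Checks: Add a `check` block to every service so failing instances are detected
Graceful Shutdown: Handle SIGTERM in the application and allow enough `kill_timeout` for in-flight work to finish
Logging: Use structured logging with consistent log levels
Configuration: Supply settings through `template` blocks and environment variables rather than baking them into images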
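
The Job Design points above could look roughly like this in a task definition. This is a sketch, not a complete job: the task name, image, variable names, and values are placeholders, and the template assumes the group declares a port labelled `http`.

```hcl
# Hypothetical task illustrating the Job Design points.
task "app" {
  driver = "docker"

  config {
    image = "example/app:1.0" # placeholder image
  }

  # Graceful shutdown: Nomad sends kill_signal, then waits up to
  # kill_timeout before force-killing the task.
  kill_signal  = "SIGTERM"
  kill_timeout = "30s"

  # Configuration and logging: plain environment variables
  # (variable names are assumptions about the application).
  env {
    LOG_LEVEL  = "info"
    LOG_FORMAT = "json"
  }

  # Configuration: render a file and export its KEY=value lines
  # as environment variables.
  template {
    data        = <<-EOH
      APP_PORT={{ env "NOMAD_PORT_http" }}
    EOH
    destination = "local/app.env"
    env         = true
  }

  # Resource allocation: cpu is in MHz, memory in MB.
  resources {
    cpu    = 200
    memory = 256
  }
}
```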

Cluster Management

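High Availability: Run three or five server nodes per region so the cluster keeps quorum if one fails
Backup Strategy: Take regular snapshots with `nomad operator snapshot save` and store them off the cluster
Monitoring: Monitor cluster health and job status
Capacity Planning: Track allocated versus available CPU and memory on client nodes and add capacity before jobs fail to place
Security: Enable ACLs and encrypt agent traffic with TLS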
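
The configuration examples earlier enable ACLs but not TLS. A minimal sketch of a `tls` block for the agent configuration follows; the certificate paths are placeholders, and the certificates themselves are assumed to have been issued separately.

```hcl
# Agent configuration fragment; file paths are placeholders.
tls {
  http = true # encrypt the HTTP API
  rpc  = true # encrypt RPC traffic between agents

  ca_file   = "/etc/nomad.d/tls/nomad-agent-ca.pem"
  cert_file = "/etc/nomad.d/tls/server.pem"
  key_file  = "/etc/nomad.d/tls/server-key.pem"

  # Require peers and HTTPS clients to present valid certificates.
  verify_server_hostname = true
  verify_https_client    = true
}
```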

Operations

Rolling Updates: Configure the `update` block so new versions roll out a few allocations at a time
Canary Deployments: Run canaries alongside the current version and promote only after they pass health checks
Resource Monitoring: Monitor resource usage
Log Aggregation: Centralize log collection
Alerting: Set up alerts for critical issues
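
Rolling updates and canaries are both configured through the `update` block. A sketch of one possible setup, assuming a three-instance group named `web`; the timings are illustrative:

```hcl
group "web" {
  count = 3

  update {
    max_parallel     = 1     # replace one allocation at a time
    canary           = 1     # start one canary before touching existing allocations
    min_healthy_time = "30s" # how long an allocation must stay healthy
    healthy_deadline = "5m"  # give up on an allocation after this long
    auto_revert      = true  # roll back to the last stable version on failure
    auto_promote     = false # wait for manual promotion of the canary
  }
}
```

With `auto_promote = false`, the rollout pauses after the canary becomes healthy until it is promoted with `nomad deployment promote DEPLOYMENT_ID`.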

Security

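ACL Policies: Grant each token only the capabilities it needs (least privilege)
Network Security: Use a service mesh such as Consul Connect for mutually authenticated, encrypted traffic between services
Secrets Management: Integrate with Vault rather than storing secrets in job files
Image Security: Scan container images for vulnerabilities
Audit Logging: Enable audit logging for compliance (a Nomad Enterprise feature)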
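
For secrets management, a task can read secrets from Vault through a `template` block. The sketch below assumes a KV version 2 secret at `secret/data/app` with a `password` field and a Vault policy named `app-read`; all of these names are hypothetical. Clusters that authenticate to Vault with workload identity in newer Nomad versions typically set `role` in the `vault` block instead of `policies`.

```hcl
task "app" {
  driver = "docker"

  config {
    image = "example/app:1.0" # placeholder image
  }

  # Assumed Vault policy name; see the note above about `role`.
  vault {
    policies = ["app-read"]
  }

  # Render the secret into the task's secrets directory and expose it
  # as an environment variable, keeping it out of the job file.
  template {
    data        = <<-EOH
      {{ with secret "secret/data/app" }}DB_PASSWORD={{ .Data.data.password }}{{ end }}
    EOH
    destination = "secrets/app.env"
    env         = true
  }
}
```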
