Nomad

Comprehensive HashiCorp Nomad commands and workflows for workload orchestration, job scheduling, and cluster management.

Installation & Setup

Command                        Description
nomad version                  Show Nomad version
nomad agent -dev               Start development agent
nomad agent -config=nomad.hcl  Start with configuration
nomad server members           List server members
nomad node status              List client nodes

Job Management

Job Operations

Command                        Description
nomad job run example.nomad    Submit job
nomad job status               List all jobs
nomad job status example       Show job details
nomad job stop example         Stop job
nomad job stop -purge example  Stop and purge job

Job Planning and Validation

Command                           Description
nomad job plan example.nomad      Plan job changes
nomad job validate example.nomad  Validate job file
nomad job inspect example         Inspect job configuration
nomad job history example         Show job history

Job Scaling

Command                          Description
nomad job scale example 5        Scale job to 5 instances (job with a single task group)
nomad job scale example GROUP 3  Scale the task group GROUP to 3 instances
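
When a group declares a scaling block, scale requests are expected to stay within its bounds. A minimal sketch, assuming the example job has a group named web; the group name, image, and min/max values are illustrative and not taken from a real deployment:

```hcl
job "example" {
  datacenters = ["dc1"]

  group "web" {
    count = 3

    # Illustrative bounds: a request such as
    # `nomad job scale example web 20` would fall outside them.
    scaling {
      enabled = true
      min     = 1
      max     = 10
    }

    task "server" {
      driver = "docker"

      config {
        image = "nginx:latest"
      }
    }
  }
}
```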

Allocation Management

Allocation Operations

Command                              Description
nomad job allocs example             List allocations for a job
nomad alloc status ALLOC_ID          Show allocation details
nomad alloc logs ALLOC_ID            Show allocation logs
nomad alloc logs -f ALLOC_ID         Follow allocation logs
nomad alloc exec ALLOC_ID /bin/bash  Execute command in allocation

Allocation Debugging

Command                                Description
nomad alloc fs ALLOC_ID                List allocation files
nomad alloc fs ALLOC_ID /path/to/file  Read allocation file
nomad alloc restart ALLOC_ID           Restart allocation
nomad alloc stop ALLOC_ID              Stop allocation

Node Management

Node Operations

Command                                  Description
nomad node status                        List all nodes
nomad node status NODE_ID                Show node details
nomad node drain -enable NODE_ID         Drain node
nomad node eligibility -disable NODE_ID  Disable node scheduling
nomad node eligibility -enable NODE_ID   Enable node scheduling

Node Maintenance

Command                                           Description
nomad node drain -enable -deadline 30m NODE_ID    Drain with deadline
nomad node drain -disable NODE_ID                 Cancel drain
nomad node meta apply -node-id NODE_ID key=value  Set node metadata
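
How fast a drain moves service allocations off a node is governed by the group's migrate block. A minimal sketch of a group fragment to place inside a job; the group name and count are illustrative, and the migrate values shown are intended to mirror Nomad's documented defaults:

```hcl
group "web" {
  count = 3

  # Controls how allocations of this group are moved during a node drain.
  migrate {
    max_parallel     = 1        # migrate one allocation at a time
    health_check     = "checks" # wait for service checks to pass
    min_healthy_time = "10s"    # replacement must stay healthy this long
    healthy_deadline = "5m"     # give up on a replacement after this
  }
}
```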

Namespace Management

Command                                                   Description
nomad namespace list                                      List namespaces
nomad namespace status default                            Show namespace details
nomad namespace apply -description="Dev environment" dev  Create namespace
nomad namespace delete dev                                Delete namespace
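
A job is placed in a namespace through the namespace attribute of its job block. A minimal sketch, assuming the dev namespace created above already exists; the job, group, and task names are illustrative:

```hcl
job "web" {
  namespace   = "dev" # must exist before the job is submitted
  datacenters = ["dc1"]

  group "web" {
    task "server" {
      driver = "docker"

      config {
        image = "nginx:latest"
      }
    }
  }
}
```

Commands that act on this job then need -namespace=dev or the NOMAD_NAMESPACE environment variable.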

ACL Management

ACL Operations

Command                                                      Description
nomad acl bootstrap                                          Bootstrap ACL system
nomad acl token create -name="dev-token" -policy=dev-policy  Create token
nomad acl token list                                         List tokens
nomad acl token info ACCESSOR_ID                             Show token details (by accessor ID)

ACL Policies

Command                                           Description
nomad acl policy apply dev-policy dev-policy.hcl  Create/update policy
nomad acl policy list                             List policies
nomad acl policy info dev-policy                  Show policy details
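
The policy file passed to nomad acl policy apply is itself HCL. A minimal sketch of what dev-policy.hcl might contain; the contents are hypothetical, and the namespaces and access levels are assumptions chosen for illustration:

```hcl
# dev-policy.hcl (hypothetical contents)

# Full job access inside the dev namespace.
namespace "dev" {
  policy = "write"
}

# Read-only visibility into the default namespace.
namespace "default" {
  policy = "read"
}

# Allow inspecting client nodes, but not draining or modifying them.
node {
  policy = "read"
}

# Allow reading agent and server membership information.
agent {
  policy = "read"
}
```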

Monitoring and Debugging

System Information

Command                                      Description
nomad operator raft list-peers               List Raft peers
nomad operator snapshot save backup.snap     Create snapshot
nomad operator snapshot restore backup.snap  Restore snapshot

Monitoring

Command                         Description
nomad monitor                   Stream agent logs
nomad monitor -log-level=DEBUG  Stream agent logs at debug level
nomad status                    List jobs and their status

Job Specification Examples

Basic Web Service

job "web" {
  datacenters = ["dc1"]
  type = "service"

  group "web" {
    count = 3

    network {
      port "http" {
        static = 8080 # fixed host port, so at most one allocation per node
        to     = 80   # stock nginx listens on port 80 inside the container
      }
    }

    service {
      name = "web"
      port = "http"

      check {
        type     = "http"
        path     = "/health" # stock nginx has no /health route; use a path your image serves
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "server" {
      driver = "docker"

      config {
        image = "nginx:latest"
        ports = ["http"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 128 # MB
      }
    }
  }
}

Batch Job

job "batch-job" {
  datacenters = ["dc1"]
  type = "batch"

  group "processing" {
    count = 1

    task "process" {
      driver = "docker"

      config {
        image = "alpine:latest"
        command = "sh"
        args = ["-c", "echo 'Processing data...' && sleep 30"]
      }

      resources {
        cpu    = 200
        memory = 256
      }
    }
  }
}

Periodic Job

job "backup" {
  datacenters = ["dc1"]
  type = "batch"

  periodic {
    cron             = "0 2 * * *"
    prohibit_overlap = true
  }

  group "backup" {
    task "backup-task" {
      driver = "docker"

      config {
        image = "backup-tool:latest"
        command = "/backup.sh"
      }

      resources {
        cpu    = 100
        memory = 256
      }
    }
  }
}

System Job

job "monitoring" {
  datacenters = ["dc1"]
  type = "system"

  group "monitoring" {
    task "node-exporter" {
      driver = "docker"

      config {
        image = "prom/node-exporter:latest"
        network_mode = "host"
        pid_mode = "host"
      }

      resources {
        cpu    = 50
        memory = 64
      }
    }
  }
}

Configuration Examples

Server Configuration

datacenter = "dc1"
data_dir = "/opt/nomad/data"
log_level = "INFO"
bind_addr = "0.0.0.0"

server {
  enabled = true
  bootstrap_expect = 3

  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}

acl {
  enabled = true
}

ui {
  enabled = true
}

Client Configuration

datacenter = "dc1"
data_dir = "/opt/nomad/data"
log_level = "INFO"
bind_addr = "0.0.0.0"

client {
  enabled = true

  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }

  node_class = "compute"

  meta {
    "type" = "worker"
    "zone" = "us-east-1a"
  }
}

plugin "docker" {
  config {
    allow_privileged = true
    volumes {
      enabled = true
    }
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}

Advanced Features

Constraints and Affinities

job "web" {
  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  affinity {
    attribute = "${node.class}"
    value     = "compute"
    weight    = 100
  }

  group "web" {
    constraint {
      attribute = "${meta.zone}"
      value     = "us-east-1a"
    }

    # ... rest of group configuration
  }
}
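
Constraints also accept an operator, and spread distributes allocations across an attribute instead of filtering on it. A sketch of group-level fragments under assumed values; the group name, kernel version threshold, and weight are illustrative:

```hcl
group "web" {
  count = 3

  # Never place two allocations of this group on the same node.
  constraint {
    operator = "distinct_hosts"
    value    = "true"
  }

  # Only consider nodes whose kernel satisfies a version expression.
  constraint {
    attribute = "${attr.kernel.version}"
    operator  = "version"
    value     = ">= 4.19"
  }

  # Prefer an even distribution across datacenters (soft preference).
  spread {
    attribute = "${node.datacenter}"
    weight    = 100
  }
}
```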

Volume Management

job "database" {
  group "db" {
    volume "data" {
      type      = "host"
      source    = "mysql_data"
      read_only = false
    }

    task "mysql" {
      driver = "docker"

      volume_mount {
        volume      = "data"
        destination = "/var/lib/mysql"
      }

      config {
        image = "mysql:8.0"
      }
    }
  }
}
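
A volume of type host only schedules onto clients that advertise a host volume with the same name as source. A minimal sketch of the matching client configuration; the directory path is a hypothetical location and must exist on the node:

```hcl
client {
  enabled = true

  # The label must match `source = "mysql_data"` in the job's volume block.
  host_volume "mysql_data" {
    path      = "/opt/mysql/data" # hypothetical directory on the client node
    read_only = false
  }
}
```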

Service Discovery Integration

job "api" {
  group "api" {
    service {
      name = "api"
      port = "http"

      tags = [
        "api",
        "v1.0",
        "traefik.enable=true",
        "traefik.http.routers.api.rule=Host(`api.example.com`)"
      ]

      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "database"
              local_bind_port  = 5432
            }
          }
        }
      }
    }
  }
}

Best Practices

Job Design

  1. Resource Allocation: Set appropriate CPU and memory limits
  2. Health Checks: Implement comprehensive health checks
  3. Graceful Shutdown: Handle SIGTERM signals properly
  4. Logging: Use structured logging with proper levels
  5. Configuration: Use templates and environment variables
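
Several of the points above map directly onto task-level settings. A sketch of a task fragment, assuming the enclosing group defines a port labeled http; the image, variable names, and timeout are illustrative:

```hcl
task "server" {
  driver = "docker"

  config {
    image = "nginx:latest"
  }

  # Graceful shutdown: send SIGTERM, then wait before force-killing.
  kill_signal  = "SIGTERM"
  kill_timeout = "30s"

  # Static configuration passed as environment variables.
  env {
    LOG_LEVEL = "info"
  }

  # Rendered configuration, exported to the task as environment variables.
  template {
    data        = <<EOT
LISTEN_PORT={{ env "NOMAD_PORT_http" }}
EOT
    destination = "local/app.env"
    env         = true
  }

  # Explicit resource reservation (CPU in MHz, memory in MB).
  resources {
    cpu    = 100
    memory = 128
  }
}
```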

Cluster Management

  1. High Availability: Deploy multiple server nodes
  2. Backup Strategy: Regular snapshots and backups
  3. Monitoring: Monitor cluster health and job status
  4. Capacity Planning: Plan for resource requirements
  5. Security: Enable ACLs and use TLS
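
For the last point, TLS is enabled in the agent configuration alongside the acl block. A minimal sketch of a server agent fragment; the certificate paths are hypothetical and depend on how the certificates were issued:

```hcl
acl {
  enabled = true
}

tls {
  http = true # encrypt the HTTP API
  rpc  = true # encrypt server and client RPC

  # Hypothetical file locations.
  ca_file   = "/etc/nomad.d/tls/nomad-agent-ca.pem"
  cert_file = "/etc/nomad.d/tls/global-server-nomad.pem"
  key_file  = "/etc/nomad.d/tls/global-server-nomad-key.pem"

  verify_server_hostname = true
  verify_https_client    = true
}
```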

Operations

  1. Rolling Updates: Use update strategies for zero downtime
  2. Canary Deployments: Test changes with canary deployments
  3. Resource Monitoring: Monitor resource usage
  4. Log Aggregation: Centralize log collection
  5. Alerting: Set up alerts for critical issues
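
Rolling updates and canaries are configured with the update block. A sketch of a group fragment under assumed values; the group name, count, and timings are illustrative:

```hcl
group "web" {
  count = 3

  update {
    max_parallel     = 1     # replace one allocation at a time
    canary           = 1     # start one canary before touching the old version
    min_healthy_time = "10s" # allocation must stay healthy this long
    healthy_deadline = "5m"  # mark the allocation unhealthy after this
    auto_revert      = true  # roll back to the last stable version on failure
    auto_promote     = false # wait for a manual promotion of the canary
  }
}
```

With auto_promote disabled, the deployment waits until the canary is promoted with nomad deployment promote.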

Security

  1. ACL Policies: Implement least privilege access
  2. Network Security: Use service mesh for secure communication
  3. Secrets Management: Integrate with Vault for secrets
  4. Image Security: Scan container images for vulnerabilities
  5. Audit Logging: Enable audit logging for compliance
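
For secrets management, a task can request Vault access and render secrets through a template. A sketch under several assumptions: the policy name api-read and the KV v2 path secret/data/api are hypothetical, and the policies attribute belongs to the token-based Vault workflow (releases that use workload identity set role in the vault block instead):

```hcl
task "server" {
  driver = "docker"

  config {
    image = "nginx:latest"
  }

  # Hypothetical Vault policy granting read access to the secret below.
  vault {
    policies = ["api-read"]
  }

  # Render the secret into the task's secrets directory and export it
  # as an environment variable. The path assumes a KV v2 mount at "secret".
  template {
    data        = <<EOT
{{ with secret "secret/data/api" }}
DB_PASSWORD={{ .Data.data.password }}
{{ end }}
EOT
    destination = "secrets/api.env"
    env         = true
  }
}
```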