Nomad

Comprehensive HashiCorp Nomad commands and workflows for workload orchestration, job scheduling, and cluster management.

Installation and Setup

| Command | Description |
|---------|-------------|
| `nomad version` | Show Nomad version |
| `nomad agent -dev` | Start a development agent (server and client in one process) |
| `nomad agent -config=nomad.hcl` | Start an agent with a configuration file |
| `nomad server members` | List server members |
| `nomad node status` | List client nodes |

Job Management

Job Operations

| Command | Description |
|---------|-------------|
| `nomad job run example.nomad` | Submit job |
| `nomad job status` | List all jobs |
| `nomad job status example` | Show job details |
| `nomad job stop example` | Stop job |
| `nomad job stop -purge example` | Stop and purge job |

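Job Planning and Validation

| Command | Description |
|---------|-------------|
| `nomad job plan example.nomad` | Dry-run job changes and show the scheduler's plan |
| `nomad job validate example.nomad` | Validate job file |
| `nomad job inspect example` | Inspect the submitted job specification |
| `nomad job history example` | Show job version history |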

Job Scaling

| Command | Description |
|---------|-------------|
| `nomad job scale example 5` | Scale job to 5 instances (single-group job) |
| `nomad job scale example group 3` | Scale the task group named `group` to 3 instances |
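
`nomad job scale` changes a task group's `count`. When the group declares a `scaling` block, the requested count should fall within its `min` and `max` bounds. A minimal sketch, assuming a group named `web` and illustrative bounds (neither comes from the original job files):

```hcl
group "web" {
  count = 3

  # Hypothetical bounds: scale requests outside 1..10 are expected to be rejected.
  scaling {
    enabled = true
    min     = 1
    max     = 10
  }
}
```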

Allocation Management

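Allocation Operations

| Command | Description |
|---------|-------------|
| `nomad alloc status -json` | List all allocations as JSON (without `-json` or `-t`, an allocation ID is required) |
| `nomad alloc status ALLOC_ID` | Show allocation details |
| `nomad alloc logs ALLOC_ID` | Show allocation logs |
| `nomad alloc logs -f ALLOC_ID` | Follow allocation logs |
| `nomad alloc exec ALLOC_ID /bin/bash` | Execute command in allocation |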

Allocation Debugging

| Command | Description |
|---------|-------------|
| `nomad alloc fs ALLOC_ID` | List allocation files |
| `nomad alloc fs ALLOC_ID /path/to/file` | Read allocation file |
| `nomad alloc restart ALLOC_ID` | Restart allocation |
| `nomad alloc stop ALLOC_ID` | Stop and reschedule allocation |

Node Management

Node Operations

| Command | Description |
|---------|-------------|
| `nomad node status` | List all nodes |
| `nomad node status NODE_ID` | Show node details |
| `nomad node drain -enable NODE_ID` | Drain node (`-enable` or `-disable` is required) |
| `nomad node eligibility -disable NODE_ID` | Disable node scheduling |
| `nomad node eligibility -enable NODE_ID` | Enable node scheduling |

Node Maintenance

| Command | Description |
|---------|-------------|
| `nomad node drain -enable -deadline 30m NODE_ID` | Drain with deadline |
| `nomad node drain -disable NODE_ID` | Cancel drain |
| `nomad node meta apply -node-id NODE_ID key=value` | Set node metadata |
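
How quickly a drain moves allocations off a node is governed by each task group's `migrate` block in the job file. A minimal sketch, assuming a group named `web`; the values are illustrative rather than recommendations:

```hcl
group "web" {
  # Applies to service jobs when their node is drained.
  migrate {
    max_parallel     = 1        # migrate one allocation at a time
    health_check     = "checks" # wait for service checks to pass
    min_healthy_time = "10s"
    healthy_deadline = "5m"
  }
}
```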

Namespaces

| Command | Description |
|---------|-------------|
| `nomad namespace list` | List namespaces |
| `nomad namespace status default` | Show namespace details |
| `nomad namespace apply -description="Dev environment" dev` | Create or update namespace |
| `nomad namespace delete dev` | Delete namespace |

ACL Management

ACL Operations

| Command | Description |
|---------|-------------|
| `nomad acl bootstrap` | Bootstrap ACL system |
| `nomad acl token create -name="dev-token" -policy=dev-policy` | Create token |
| `nomad acl token list` | List tokens |
| `nomad acl token info TOKEN_ID` | Show token details (by accessor ID) |

ACL Policies

| Command | Description |
|---------|-------------|
| `nomad acl policy apply dev-policy dev-policy.hcl` | Create/update policy |
| `nomad acl policy list` | List policies |
| `nomad acl policy info dev-policy` | Show policy details |
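
The contents of `dev-policy.hcl` are not shown in this cheat sheet. A minimal sketch of what such a file could look like, assuming a `dev` namespace and read-only access elsewhere (the rules are hypothetical):

```hcl
# dev-policy.hcl (hypothetical contents)

# Full job access inside the dev namespace.
namespace "dev" {
  policy = "write"
}

# Read-only view of the default namespace.
namespace "default" {
  policy = "read"
}

# Allow listing and inspecting client nodes.
node {
  policy = "read"
}
```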

Monitoring and Debugging

System Information

| Command | Description |
|---------|-------------|
| `nomad operator raft list-peers` | List Raft peers |
| `nomad operator snapshot save backup.snap` | Create snapshot |
| `nomad operator snapshot restore backup.snap` | Restore snapshot |

Monitoring

| Command | Description |
|---------|-------------|
| `nomad monitor` | Stream agent logs |
| `nomad monitor -log-level=DEBUG` | Stream logs at debug level |
| `nomad status` | Show status overview (lists jobs by default) |

Job Specification Examples

Basic Web Service

```hcl
job "web" {
  datacenters = ["dc1"]
  type        = "service"

  group "web" {
    count = 3

    network {
      # A static port means each of the 3 allocations needs its own node.
      port "http" {
        static = 8080
        to     = 80 # nginx listens on port 80 inside the container
      }
    }

    service {
      name = "web"
      port = "http"

      # Assumes the image answers on /health.
      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "server" {
      driver = "docker"

      config {
        image = "nginx:latest"
        ports = ["http"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 128 # MB
      }
    }
  }
}
```

Batch Job

```hcl
job "batch-job" {
  datacenters = ["dc1"]
  type        = "batch"

  group "processing" {
    count = 1

    task "process" {
      driver = "docker"

      config {
        image   = "alpine:latest"
        command = "sh"
        args    = ["-c", "echo 'Processing data...' && sleep 30"]
      }

      resources {
        cpu    = 200 # MHz
        memory = 256 # MB
      }
    }
  }
}
```

Periodic Job

```hcl
job "backup" {
  datacenters = ["dc1"]
  type        = "batch"

  periodic {
    cron             = "0 2 * * *" # daily at 02:00
    prohibit_overlap = true        # skip a run while the previous one is active
  }

  group "backup" {
    task "backup-task" {
      driver = "docker"

      config {
        image   = "backup-tool:latest"
        command = "/backup.sh"
      }

      resources {
        cpu    = 100
        memory = 256
      }
    }
  }
}
```

System Job

```hcl
job "monitoring" {
  datacenters = ["dc1"]
  type        = "system" # runs one instance on every eligible client node

  group "monitoring" {
    task "node-exporter" {
      driver = "docker"

      config {
        image        = "prom/node-exporter:latest"
        network_mode = "host"
        pid_mode     = "host"
      }

      resources {
        cpu    = 50
        memory = 64
      }
    }
  }
}
```

Configuration Examples

Server Configuration

```hcl
datacenter = "dc1"
data_dir   = "/opt/nomad/data"
log_level  = "INFO"
bind_addr  = "0.0.0.0"

server {
  enabled          = true
  bootstrap_expect = 3

  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}

acl {
  enabled = true
}

ui {
  enabled = true
}
```

Client Configuration

```hcl
datacenter = "dc1"
data_dir   = "/opt/nomad/data"
log_level  = "INFO"
bind_addr  = "0.0.0.0"

client {
  enabled = true

  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }

  node_class = "compute"

  meta {
    "type" = "worker"
    "zone" = "us-east-1a"
  }
}

plugin "docker" {
  config {
    allow_privileged = true

    volumes {
      enabled = true
    }
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}
```

Advanced Features

Constraints and Affinities

```hcl
job "web" {
  # Hard requirement: only place on Linux nodes.
  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  # Soft preference: favor nodes in the "compute" class.
  affinity {
    attribute = "${node.class}"
    value     = "compute"
    weight    = 100
  }

  group "web" {
    constraint {
      attribute = "${meta.zone}"
      value     = "us-east-1a"
    }

    # ... rest of group configuration
  }
}
```

Volume Management

```hcl
job "database" {
  group "db" {
    volume "data" {
      type      = "host"
      source    = "mysql_data"
      read_only = false
    }

    task "mysql" {
      driver = "docker"

      volume_mount {
        volume      = "data"
        destination = "/var/lib/mysql"
      }

      config {
        image = "mysql:8.0"
      }
    }
  }
}
```
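
The `source = "mysql_data"` in the job above refers to a host volume that has to be declared on the client agents where the job may run. A minimal sketch of that client-side declaration, assuming the data lives under `/opt/mysql/data` (a hypothetical path):

```hcl
client {
  enabled = true

  # The label must match the volume "source" used in the job file.
  host_volume "mysql_data" {
    path      = "/opt/mysql/data" # hypothetical directory on the client node
    read_only = false
  }
}
```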

Service Discovery Integration

```hcl
job "api" {
  group "api" {
    # Assumes the group also defines a bridge-mode network with a port
    # labeled "http", plus the tasks themselves (omitted here).
    service {
      name = "api"
      port = "http"

      tags = [
        "api",
        "v1.0",
        "traefik.enable=true",
        "traefik.http.routers.api.rule=Host(`api.example.com`)"
      ]

      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }

      # Consul Connect sidecar; the "database" upstream is reachable on localhost:5432.
      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "database"
              local_bind_port  = 5432
            }
          }
        }
      }
    }
  }
}
```

Best Practices

Job Design

  1. **Resource Allocation**: Set appropriate CPU and memory limits
  2. **Health Checks**: Implement comprehensive health checks
  3. **Graceful Shutdown**: Handle SIGTERM signals properly
  4. **Logging**: Use structured logging with appropriate levels
  5. **Configuration**: Use templates and environment variables
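
Several of these points map directly onto task-level settings. The fragment below is a sketch under assumptions: the task name, image, variable names, and timeout are illustrative and not taken from a real deployment.

```hcl
task "server" {
  driver = "docker"

  # Graceful shutdown: signal sent on stop, and how long Nomad waits
  # before force-killing the task.
  kill_signal  = "SIGTERM"
  kill_timeout = "30s"

  config {
    image = "nginx:latest"
  }

  # Static environment variables (hypothetical names).
  env {
    LOG_FORMAT = "json"
  }

  # Rendered file; with env = true its KEY=value lines are also
  # exported to the task as environment variables.
  template {
    data        = <<EOF
ALLOC_ID={{ env "NOMAD_ALLOC_ID" }}
LOG_LEVEL=info
EOF
    destination = "local/app.env"
    env         = true
  }

  resources {
    cpu    = 100 # MHz
    memory = 128 # MB
  }
}
```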

Cluster Management

  1. **High Availability**: Deploy multiple server nodes
  2. **Backup Strategy**: Take regular snapshots and backups
  3. **Monitoring**: Monitor cluster health and job status
  4. **Capacity Planning**: Plan for resource requirements
  5. **Security**: Enable ACLs and use TLS
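
For the security point, ACLs and TLS are both switched on in the agent configuration. A minimal sketch, assuming certificates have already been issued; the file paths are hypothetical:

```hcl
acl {
  enabled = true
}

tls {
  http = true # encrypt the HTTP API
  rpc  = true # encrypt agent-to-agent RPC

  # Hypothetical certificate locations.
  ca_file   = "/etc/nomad.d/tls/nomad-agent-ca.pem"
  cert_file = "/etc/nomad.d/tls/server.pem"
  key_file  = "/etc/nomad.d/tls/server-key.pem"

  verify_server_hostname = true
  verify_https_client    = true
}
```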

Operations

  1. **Rolling Updates**: Use update strategies for zero downtime
  2. **Canary Deployments**: Test changes with canary deployments
  3. **Resource Monitoring**: Monitor resource usage
  4. **Log Aggregation**: Centralize log collection
  5. **Alerting**: Set up alerts for critical issues
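
Rolling updates and canaries are both configured through the `update` block. The sketch below assumes a group named `web` with three instances; the timings are illustrative, not recommendations.

```hcl
group "web" {
  count = 3

  update {
    max_parallel     = 1     # replace one allocation at a time
    canary           = 1     # start one canary before touching the old version
    min_healthy_time = "30s" # how long an allocation must stay healthy
    healthy_deadline = "5m"  # give up on an allocation after this long
    auto_revert      = true  # roll back to the last stable version on failure
    auto_promote     = false # wait for a manual promotion
  }
}
```

With `auto_promote = false`, the rollout pauses after the canary becomes healthy until the deployment is promoted, for example with `nomad deployment promote DEPLOYMENT_ID`.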

Security

  1. **ACL Policies**: Grant least-privilege access
  2. **Network Security**: Use a service mesh for secure communication
  3. **Secrets Management**: Integrate with Vault for secrets
  4. **Image Security**: Scan container images for vulnerabilities
  5. **Audit Logging**: Enable audit logging for compliance