Nomad

Comprehensive HashiCorp Nomad commands and workflows for workload orchestration, job scheduling, and cluster management.

Installation and Setup

| Command | Description |
| --- | --- |
| `nomad version` | Show the Nomad version |
| `nomad agent -dev` | Start a development agent |
| `nomad agent -config=nomad.hcl` | Start an agent with a configuration file |
| `nomad server members` | List server members |
| `nomad node status` | List client nodes |
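
The `nomad agent -config=nomad.hcl` command expects an agent configuration file. A minimal sketch of such a file for a single-machine test setup follows. It assumes `/opt/nomad/data` as the data directory (an illustrative path) and runs server and client in the same agent, which is convenient for testing but not a production layout.

```hcl
# nomad.hcl: single-node sketch for testing only
data_dir = "/opt/nomad/data" # assumed path; any writable directory works

server {
  enabled          = true
  bootstrap_expect = 1 # a single server elects itself leader
}

client {
  enabled = true # run workloads on the same machine
}
```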

Job Management

Job Operations

| Command | Description |
| --- | --- |
| `nomad job run example.nomad` | Submit a job |
| `nomad job status` | List all jobs |
| `nomad job status example` | Show job details |
| `nomad job stop example` | Stop a job |
| `nomad job stop -purge example` | Stop and purge a job |

Job Planning and Validation

| Command | Description |
| --- | --- |
| `nomad job plan example.nomad` | Dry-run the job and show planned changes |
| `nomad job validate example.nomad` | Validate a job file |
| `nomad job inspect example` | Inspect the submitted job configuration |
| `nomad job history example` | Show job version history |

Job Scaling

| Command | Description |
| --- | --- |
| `nomad job scale example 5` | Scale the job to 5 instances (jobs with a single task group) |
| `nomad job scale example group 3` | Scale the task group named `group` to 3 instances |
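
Scaling commands change the `count` of a task group. If the group declares a `scaling` block, the requested count is expected to stay within its bounds. A minimal sketch, assuming a group named `web` and illustrative limits of 1 and 10:

```hcl
group "web" {
  count = 3

  scaling {
    enabled = true
    min     = 1  # lower bound for the group count
    max     = 10 # upper bound for the group count
  }
}
```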

Allocation Management

Allocation Operations

| Command | Description |
| --- | --- |
| `nomad job allocs example` | List the allocations of a job |
| `nomad alloc status ALLOC_ID` | Show allocation details |
| `nomad alloc logs ALLOC_ID` | Show allocation logs |
| `nomad alloc logs -f ALLOC_ID` | Follow allocation logs |
| `nomad alloc exec ALLOC_ID /bin/bash` | Execute a command in an allocation |

Allocation Debugging

| Command | Description |
| --- | --- |
| `nomad alloc fs ALLOC_ID` | List allocation files |
| `nomad alloc fs ALLOC_ID /path/to/file` | Read an allocation file |
| `nomad alloc restart ALLOC_ID` | Restart an allocation |
| `nomad alloc stop ALLOC_ID` | Stop an allocation |

Node Management

Node Operations

| Command | Description |
| --- | --- |
| `nomad node status` | List all nodes |
| `nomad node status NODE_ID` | Show node details |
| `nomad node drain -enable NODE_ID` | Drain a node |
| `nomad node eligibility -disable NODE_ID` | Disable scheduling on a node |
| `nomad node eligibility -enable NODE_ID` | Enable scheduling on a node |

Node Maintenance

| Command | Description |
| --- | --- |
| `nomad node drain -enable -deadline 30m NODE_ID` | Drain with a deadline |
| `nomad node drain -disable NODE_ID` | Cancel a drain |
| `nomad node meta apply -node-id NODE_ID key=value` | Set node metadata |

Namespaces

| Command | Description |
| --- | --- |
| `nomad namespace list` | List namespaces |
| `nomad namespace status default` | Show namespace details |
| `nomad namespace apply -description="Dev environment" dev` | Create or update a namespace |
| `nomad namespace delete dev` | Delete a namespace |
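
A job is placed in a namespace through the `namespace` attribute of the `job` block. A short sketch, assuming the `dev` namespace from the table above already exists; the job name `web` is only illustrative:

```hcl
job "web" {
  namespace   = "dev" # must exist before the job is submitted
  datacenters = ["dc1"]

  # ... groups and tasks as usual
}
```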

ACL Management

ACL Operations

| Command | Description |
| --- | --- |
| `nomad acl bootstrap` | Bootstrap the ACL system |
| `nomad acl token create -name="dev-token" -policy=dev-policy` | Create a token |
| `nomad acl token list` | List tokens |
| `nomad acl token info TOKEN_ID` | Show token details |

ACL Policies

| Command | Description |
| --- | --- |
| `nomad acl policy apply dev-policy dev-policy.hcl` | Create or update a policy |
| `nomad acl policy list` | List policies |
| `nomad acl policy info dev-policy` | Show policy details |
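
The `dev-policy.hcl` file passed to `nomad acl policy apply` holds the policy rules. Its contents are not shown here, so the following is only one possible sketch. It assumes a `dev` namespace, grants write access there, and allows read-only access to node information.

```hcl
# dev-policy.hcl (hypothetical contents)
namespace "dev" {
  policy = "write" # submit, read, and stop jobs in the dev namespace
}

node {
  policy = "read" # view node status, no drain or eligibility changes
}
```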

Monitoring and Debugging

System Information

| Command | Description |
| --- | --- |
| `nomad operator raft list-peers` | List Raft peers |
| `nomad operator snapshot save backup.snap` | Create a snapshot |
| `nomad operator snapshot restore backup.snap` | Restore a snapshot |

Monitoring

| Command | Description |
| --- | --- |
| `nomad monitor` | Stream agent logs |
| `nomad monitor -log-level=DEBUG` | Stream logs at debug level |
| `nomad status` | Show a status overview (job list) |
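
Besides log streaming, an agent can expose metrics to an external monitoring system. A minimal sketch of a `telemetry` block for the agent configuration, assuming Prometheus scrapes the agent's `/v1/metrics` endpoint; the collection interval is an illustrative value:

```hcl
telemetry {
  collection_interval        = "10s" # assumed interval
  prometheus_metrics         = true  # serve metrics in Prometheus format
  publish_allocation_metrics = true
  publish_node_metrics       = true
}
```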

Job Specification Examples

Basic Web Service

```hcl
job "web" {
  datacenters = ["dc1"]
  type        = "service"

  group "web" {
    count = 3

    network {
      port "http" {
        static = 8080
        to     = 80 # nginx listens on port 80 inside the container
      }
    }

    service {
      name = "web"
      port = "http"

      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "server" {
      driver = "docker"

      config {
        image = "nginx:latest"
        ports = ["http"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 128 # MB
      }
    }
  }
}
```

Batch Job

```hcl
job "batch-job" {
  datacenters = ["dc1"]
  type        = "batch"

  group "processing" {
    count = 1

    task "process" {
      driver = "docker"

      config {
        image   = "alpine:latest"
        command = "sh"
        args    = ["-c", "echo 'Processing data...' && sleep 30"]
      }

      resources {
        cpu    = 200
        memory = 256
      }
    }
  }
}
```

Periodic Job

```hcl
job "backup" {
  datacenters = ["dc1"]
  type        = "batch"

  periodic {
    cron             = "0 2 * * *" # every day at 02:00
    prohibit_overlap = true
  }

  group "backup" {
    task "backup-task" {
      driver = "docker"

      config {
        image   = "backup-tool:latest"
        command = "/backup.sh"
      }

      resources {
        cpu    = 100
        memory = 256
      }
    }
  }
}
```

System Job

```hcl
job "monitoring" {
  datacenters = ["dc1"]
  type        = "system"

  group "monitoring" {
    task "node-exporter" {
      driver = "docker"

      config {
        image        = "prom/node-exporter:latest"
        network_mode = "host"
        pid_mode     = "host"
      }

      resources {
        cpu    = 50
        memory = 64
      }
    }
  }
}
```

Configuration Examples

Server Configuration

```hcl
datacenter = "dc1"
data_dir   = "/opt/nomad/data"
log_level  = "INFO"
bind_addr  = "0.0.0.0"

server {
  enabled          = true
  bootstrap_expect = 3

  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}

acl {
  enabled = true
}

ui {
  enabled = true
}
```

Client Configuration

```hcl
datacenter = "dc1"
data_dir   = "/opt/nomad/data"
log_level  = "INFO"
bind_addr  = "0.0.0.0"

client {
  enabled = true

  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }

  node_class = "compute"

  meta {
    "type" = "worker"
    "zone" = "us-east-1a"
  }
}

plugin "docker" {
  config {
    allow_privileged = true

    volumes {
      enabled = true
    }
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}
```

Advanced Features

Constraints and Affinities

```hcl
job "web" {
  # Hard requirement: only place on Linux nodes
  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  # Soft preference: favor nodes of class "compute"
  affinity {
    attribute = "${node.class}"
    value     = "compute"
    weight    = 100
  }

  group "web" {
    constraint {
      attribute = "${meta.zone}"
      value     = "us-east-1a"
    }

    # ... rest of group configuration
  }
}
```

Volume Management

```hcl
job "database" {
  group "db" {
    volume "data" {
      type      = "host"
      source    = "mysql_data"
      read_only = false
    }

    task "mysql" {
      driver = "docker"

      volume_mount {
        volume      = "data"
        destination = "/var/lib/mysql"
      }

      config {
        image = "mysql:8.0"
      }
    }
  }
}
```
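
A volume with `type = "host"` refers to a host volume that the client agent has to declare under the same name. A sketch of the matching client configuration, assuming the data lives under `/opt/mysql/data` on the node (an illustrative path):

```hcl
client {
  host_volume "mysql_data" {
    path      = "/opt/mysql/data" # assumed directory on the client node
    read_only = false
  }
}
```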

Service Discovery Integration

```hcl
job "api" {
  group "api" {
    service {
      name = "api"
      port = "http"

      tags = [
        "api",
        "v1.0",
        "traefik.enable=true",
        "traefik.http.routers.api.rule=Host(`api.example.com`)"
      ]

      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "database"
              local_bind_port  = 5432
            }
          }
        }
      }
    }
  }
}
```
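
The `connect` block relies on a Consul service mesh sidecar, which generally needs the group to use bridge networking, and the `port = "http"` reference needs a matching port label. A sketch of the group-level `network` block that the fragment above leaves out, assuming the application listens on port 8080 inside the allocation:

```hcl
group "api" {
  network {
    mode = "bridge" # needed for Connect sidecars

    port "http" {
      to = 8080 # assumed application port inside the allocation
    }
  }

  # ... service and task blocks
}
```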

Best Practices

Job Design

  1. **Resource Allocation**: Set appropriate CPU and memory limits
  2. **Health Checks**: Implement comprehensive health checks
  3. **Graceful Shutdown**: Handle SIGTERM signals properly
  4. **Logging**: Use structured logging with appropriate levels
  5. **Configuration**: Use templates and environment variables
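
Several of these points map directly to task-level settings. The following is a sketch rather than a recommended baseline: the variable `LOG_LEVEL`, the file `local/app.env`, and the 30-second timeout are illustrative assumptions, and `NOMAD_PORT_http` only exists if the group defines a port labeled `http`.

```hcl
task "server" {
  driver = "docker"

  config {
    image = "nginx:latest"
  }

  # Graceful shutdown: send SIGTERM, then wait before force-killing
  kill_signal  = "SIGTERM"
  kill_timeout = "30s"

  # Static configuration through environment variables
  env {
    LOG_LEVEL = "info" # hypothetical variable read by the application
  }

  # Rendered configuration, loaded into the task environment
  template {
    data        = <<EOF
APP_PORT={{ env "NOMAD_PORT_http" }}
EOF
    destination = "local/app.env"
    env         = true
  }

  resources {
    cpu    = 100 # MHz
    memory = 128 # MB
  }
}
```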

Cluster Management

  1. **High Availability**: Deploy multiple server nodes
  2. **Backup Strategy**: Take regular snapshots and backups
  3. **Monitoring**: Monitor cluster health and job status
  4. **Capacity Planning**: Plan for resource requirements
  5. **Security**: Enable ACLs and use TLS
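
TLS is enabled in the agent configuration through a `tls` block. A minimal sketch, assuming certificates have already been issued; the file paths are placeholders, not defaults:

```hcl
tls {
  http = true # encrypt the HTTP API
  rpc  = true # encrypt RPC between agents

  ca_file   = "/etc/nomad.d/tls/nomad-ca.pem"   # assumed path
  cert_file = "/etc/nomad.d/tls/server.pem"     # assumed path
  key_file  = "/etc/nomad.d/tls/server-key.pem" # assumed path

  verify_server_hostname = true
  verify_https_client    = true
}
```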

Operations

  1. **Rolling Updates**: Use update strategies for zero downtime
  2. **Canary Deployments**: Test changes with canary deployments
  3. **Resource Monitoring**: Monitor resource usage
  4. **Log Aggregation**: Centralize log collection
  5. **Alerting**: Set up alerts for critical issues
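
Rolling updates and canaries are both configured in the `update` block of a job or group. A sketch with illustrative values, assuming a group named `web`:

```hcl
group "web" {
  count = 3

  update {
    max_parallel     = 1     # replace one allocation at a time
    canary           = 1     # start one canary before touching old allocations
    min_healthy_time = "10s" # how long an allocation must stay healthy
    healthy_deadline = "5m"  # give up on an allocation after this time
    auto_revert      = true  # roll back to the last stable version on failure
    auto_promote     = false # wait for manual promotion of the canary
  }
}
```

With `auto_promote = false`, the deployment pauses after the canary is healthy until it is promoted, for example with `nomad deployment promote DEPLOYMENT_ID`.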

Security

  1. **ACL Policies**: Grant least-privilege access
  2. **Network Security**: Use a service mesh for secure communication
  3. **Secrets Management**: Integrate with Vault for secrets
  4. **Image Security**: Scan container images for vulnerabilities
  5. **Audit Logging**: Enable audit logging for compliance
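
For secrets management, a task can read values from Vault through a `template` block. The sketch below assumes the token-based Vault integration, a KV version 2 mount at `secret/`, a Vault policy named `app-read`, and a secret at `secret/data/app` with a `password` field. All of these names are hypothetical.

```hcl
task "app" {
  driver = "docker"

  config {
    image = "mysql:8.0"
  }

  vault {
    policies = ["app-read"] # hypothetical Vault policy
  }

  template {
    data        = <<EOF
{{ with secret "secret/data/app" }}DB_PASSWORD={{ .Data.data.password }}{{ end }}
EOF
    destination = "secrets/app.env" # the secrets/ directory is not exposed via alloc fs
    env         = true              # export the rendered file as environment variables
  }
}
```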