Nomad
Comprehensive HashiCorp Nomad commands and workflows for workload orchestration, job scheduling, and cluster management.
Installation and Setup
| Command | Description |
| --- | --- |
| `nomad version` | Show the Nomad version |
| `nomad agent -dev` | Start a development agent (server and client in one process) |
| `nomad agent -config=nomad.hcl` | Start an agent with a configuration file |
| `nomad server members` | List server members |
| `nomad node status` | List client nodes |
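CLI calls made right after `nomad agent -dev` starts can fail until the agent accepts connections. The following is a minimal sketch of a readiness wait, assuming the `nomad` binary is on `PATH`. The helper name `wait_for_nomad` and its retry defaults are illustrative and not part of Nomad.

```shell
# wait_for_nomad [attempts] [delay_seconds]
# Polls `nomad server members` until it succeeds or the attempts run out.
# Hypothetical helper; the defaults (30 tries, 1 second apart) are assumptions.
wait_for_nomad() {
  attempts="${1:-30}"
  delay="${2:-1}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if nomad server members >/dev/null 2>&1; then
      echo "nomad is ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "nomad did not become ready after $attempts attempts" >&2
  return 1
}
```

A typical use would be to start the agent in the background (`nomad agent -dev >nomad.log 2>&1 &`) and then run `wait_for_nomad && nomad node status`.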
Job Management
Job Operations
| Command | Description |
| --- | --- |
| `nomad job run example.nomad` | Submit a job |
| `nomad job status` | List all jobs |
| `nomad job status example` | Show job details |
| `nomad job stop example` | Stop a job |
| `nomad job stop -purge example` | Stop a job and purge it from the system |
Job Planning and Validation
| Command | Description |
| --- | --- |
| `nomad job plan example.nomad` | Dry-run the job and show the planned changes |
| `nomad job validate example.nomad` | Validate a job file |
| `nomad job inspect example` | Inspect the submitted job configuration |
| `nomad job history example` | Show job version history |
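The commands above are usually chained: validate, then plan, then submit. The sketch below shows one way to do that, relying on the documented exit codes of `nomad job plan` (0 = no allocations created or destroyed, 1 = allocations created or destroyed, 255 = error). The function name `deploy_job` is hypothetical.

```shell
# deploy_job <jobfile>
# Validates and plans a job file, then submits it unless planning failed.
# Hypothetical helper; assumes `nomad` is on PATH and the cluster is reachable.
deploy_job() {
  jobfile="$1"
  nomad job validate "$jobfile" || return 1

  # Capture the plan exit code without aborting on the non-zero "changes" code.
  plan_rc=0
  nomad job plan "$jobfile" || plan_rc=$?
  case "$plan_rc" in
    0) echo "plan: no allocations created or destroyed" ;;
    1) echo "plan: allocations will be created or destroyed" ;;
    *) echo "plan failed with exit code $plan_rc" >&2; return 1 ;;
  esac

  nomad job run "$jobfile"
}
```

Exit code 0 still leads to a submit here because in-place updates change the job without creating or destroying allocations.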
Job Scaling
| Command | Description |
| --- | --- |
| `nomad job scale example 5` | Scale the job's only task group to 5 instances |
| `nomad job scale example GROUP 3` | Scale the task group `GROUP` to 3 instances |
Allocation Management
Allocations
| Command | Description |
| --- | --- |
| `nomad alloc status -json` | List all allocations as JSON (an allocation ID is required without an output format flag) |
| `nomad alloc status ALLOC_ID` | Show allocation details |
| `nomad alloc logs ALLOC_ID` | Show allocation logs |
| `nomad alloc logs -f ALLOC_ID` | Follow allocation logs |
| `nomad alloc exec ALLOC_ID /bin/bash` | Execute a command in an allocation |
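When a job has several allocations, looking up each ID by hand gets tedious. The sketch below prints recent log lines for every allocation of a job. It assumes that `nomad job allocs` accepts a Go template via `-t` and that each list entry exposes an `ID` field; the function name `job_logs` is hypothetical.

```shell
# job_logs <job> <task> [lines]
# Prints the last [lines] stdout lines of <task> for every allocation of <job>.
# Hypothetical helper; the template passed to -t is an assumption about the output shape.
job_logs() {
  job="$1"
  task="$2"
  lines="${3:-20}"
  ids=$(nomad job allocs -t '{{range .}}{{println .ID}}{{end}}' "$job") || return 1
  for id in $ids; do
    echo "== allocation $id =="
    nomad alloc logs -tail -n "$lines" "$id" "$task"
  done
}
```

For example, `job_logs example server 50` would show the last 50 lines of the `server` task in each allocation of `example`.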
Allocation Debugging
| Command | Description |
| --- | --- |
| `nomad alloc fs ALLOC_ID` | List files in the allocation directory |
| `nomad alloc fs ALLOC_ID /path/to/file` | Read a file from the allocation |
| `nomad alloc restart ALLOC_ID` | Restart an allocation |
| `nomad alloc stop ALLOC_ID` | Stop an allocation |
Node Management
Node Operations
| Command | Description |
| --- | --- |
| `nomad node status` | List all nodes |
| `nomad node status NODE_ID` | Show node details |
| `nomad node drain -enable NODE_ID` | Drain a node |
| `nomad node eligibility -disable NODE_ID` | Disable scheduling on a node |
| `nomad node eligibility -enable NODE_ID` | Enable scheduling on a node |
Node Maintenance
| Command | Description |
| --- | --- |
| `nomad node drain -enable -deadline 30m NODE_ID` | Drain a node with a 30-minute deadline |
| `nomad node drain -disable NODE_ID` | Cancel a drain |
| `nomad node meta apply -node-id NODE_ID key=value` | Set node metadata |
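A node that has finished draining stays ineligible for scheduling until eligibility is re-enabled, so a maintenance window usually combines both commands. The following sketch outlines that sequence; the function name `maintain_node` and the placeholder maintenance step are hypothetical.

```shell
# maintain_node <node_id> [deadline]
# Drains a node, leaves room for a maintenance step, and returns it to service.
# Hypothetical helper; assumes `nomad` is on PATH.
maintain_node() {
  node="$1"
  deadline="${2:-30m}"
  # -yes skips the confirmation prompt; without -detach the command
  # monitors the drain and returns once it has finished.
  nomad node drain -enable -deadline "$deadline" -yes "$node" || return 1

  # Placeholder: replace this line with the real maintenance work.
  echo "node $node drained; perform maintenance now"

  # Allow the scheduler to place work on the node again.
  nomad node eligibility -enable "$node"
}
```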
Namespace Management
| Command | Description |
| --- | --- |
| `nomad namespace list` | List namespaces |
| `nomad namespace status default` | Show namespace details |
| `nomad namespace apply -description="Dev environment" dev` | Create or update a namespace |
| `nomad namespace delete dev` | Delete a namespace |
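Namespaces only take effect when commands target them, either through the `-namespace` flag or the `NOMAD_NAMESPACE` environment variable. A short sketch, with the hypothetical helper name `run_in_namespace`:

```shell
# run_in_namespace <namespace> <jobfile>
# Ensures the namespace exists, then submits the job into it.
# Hypothetical helper; the description text is an arbitrary example.
run_in_namespace() {
  ns="$1"
  jobfile="$2"
  # `namespace apply` creates the namespace or updates it if it already exists.
  nomad namespace apply -description "Managed by run_in_namespace" "$ns" || return 1
  nomad job run -namespace="$ns" "$jobfile"
}
```

Later commands need the same scope, for example `nomad job status -namespace=dev`.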
ACL Management
ACL Operations
| Command | Description |
| --- | --- |
| `nomad acl bootstrap` | Bootstrap the ACL system |
| `nomad acl token create -name="dev-token" -policy=dev-policy` | Create a token |
| `nomad acl token list` | List tokens |
| `nomad acl token info TOKEN_ID` | Show token details |
ACL Policies
| Command | Description |
| --- | --- |
| `nomad acl policy apply dev-policy dev-policy.hcl` | Create or update a policy |
| `nomad acl policy list` | List policies |
| `nomad acl policy info dev-policy` | Show policy details |
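The table does not show what a policy file such as `dev-policy.hcl` contains. The sketch below writes a small namespace-scoped policy, applies it, and creates a client token bound to it. The policy rules are an illustrative assumption, the function name `create_dev_access` is hypothetical, and the commands need a management token in `NOMAD_TOKEN`.

```shell
# create_dev_access <policy_name> <namespace>
# Writes <policy_name>.hcl, applies it, and creates a client token for it.
# Hypothetical helper; the rules below are an example, not a recommendation.
create_dev_access() {
  policy="$1"
  ns="$2"
  file="$policy.hcl"
  cat > "$file" <<EOF
namespace "$ns" {
  policy = "write"
}

node {
  policy = "read"
}
EOF
  nomad acl policy apply -description "Write access to $ns" "$policy" "$file" || return 1
  nomad acl token create -name="$policy-token" -policy="$policy" -type=client
}
```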
Monitoring and Debugging
System Information
| Command | Description |
| --- | --- |
| `nomad operator raft list-peers` | List Raft peers |
| `nomad operator snapshot save backup.snap` | Save a snapshot of server state |
| `nomad operator snapshot restore backup.snap` | Restore a snapshot |
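Snapshots are most useful when they are taken regularly and old ones are rotated out. The following sketch saves a timestamped snapshot and keeps only the newest few files. The function name `backup_snapshot`, the file naming scheme, and the default retention of 7 are assumptions.

```shell
# backup_snapshot <dir> [keep]
# Saves a timestamped snapshot into <dir> and keeps only the newest [keep] files.
# Hypothetical helper; assumes `nomad` is on PATH.
backup_snapshot() {
  dir="$1"
  keep="${2:-7}"
  mkdir -p "$dir" || return 1
  file="$dir/nomad-$(date -u +%Y%m%dT%H%M%SZ).snap"
  nomad operator snapshot save "$file" || return 1
  echo "saved $file"
  # Timestamped names sort chronologically, so reverse order puts the newest
  # first and everything after the first $keep entries can be removed.
  ls -1 "$dir"/nomad-*.snap | sort -r | tail -n +"$((keep + 1))" | while read -r old; do
    rm -f "$old"
  done
}
```

A cron entry or a Nomad periodic job could call `backup_snapshot /var/backups/nomad 14` once a day.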
Monitoring
| Command | Description |
| --- | --- |
| `nomad monitor` | Stream agent logs |
| `nomad monitor -log-level=DEBUG` | Stream agent logs at debug level |
| `nomad status` | Show the status of all jobs |
Job Specification Examples
Basic Web Service
```hcl
job "web" {
  datacenters = ["dc1"]
  type        = "service"

  group "web" {
    # A static port allows one allocation per node, so this needs three nodes.
    count = 3

    network {
      port "http" {
        static = 8080
        to     = 80 # nginx listens on port 80 inside the container
      }
    }

    service {
      name = "web"
      port = "http"

      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "server" {
      driver = "docker"

      config {
        image = "nginx:latest"
        ports = ["http"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 128 # MB
      }
    }
  }
}
```
Batch Job
```hcl
job "batch-job" {
  datacenters = ["dc1"]
  type        = "batch"

  group "processing" {
    count = 1

    task "process" {
      driver = "docker"

      config {
        image   = "alpine:latest"
        command = "sh"
        args    = ["-c", "echo 'Processing data...' && sleep 30"]
      }

      resources {
        cpu    = 200
        memory = 256
      }
    }
  }
}
```
Periodic Job
```hcl
job "backup" {
  datacenters = ["dc1"]
  type        = "batch"

  # Run every day at 02:00 and skip a launch if the previous one is still running.
  periodic {
    cron             = "0 2 * * *"
    prohibit_overlap = true
  }

  group "backup" {
    task "backup-task" {
      driver = "docker"

      config {
        image   = "backup-tool:latest"
        command = "/backup.sh"
      }

      resources {
        cpu    = 100
        memory = 256
      }
    }
  }
}
```
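A periodic job normally waits for its next cron slot, which is inconvenient when testing. The sketch below triggers a launch immediately with `nomad job periodic force`; the function name `run_periodic_now` is hypothetical.

```shell
# run_periodic_now <job>
# Launches a periodic job right away instead of waiting for its schedule.
# Hypothetical helper; assumes the job is already registered.
run_periodic_now() {
  job="$1"
  nomad job periodic force "$job" || return 1
  # Each launch shows up as a child job named "<job>/periodic-<timestamp>".
  nomad job status "$job"
}
```

For the job above, this would be called as `run_periodic_now backup`.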
System Job
```hcl
job "monitoring" {
  datacenters = ["dc1"]
  type        = "system" # one allocation on every eligible client node

  group "monitoring" {
    task "node-exporter" {
      driver = "docker"

      config {
        image        = "prom/node-exporter:latest"
        network_mode = "host"
        pid_mode     = "host"
      }

      resources {
        cpu    = 50
        memory = 64
      }
    }
  }
}
```
Configuration Examples
Server Configuration
```hcl
datacenter = "dc1"
data_dir   = "/opt/nomad/data"
log_level  = "INFO"
bind_addr  = "0.0.0.0"

server {
  enabled          = true
  bootstrap_expect = 3

  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}

acl {
  enabled = true
}

ui {
  enabled = true
}
```
Client Configuration
```hcl
datacenter = "dc1"
data_dir   = "/opt/nomad/data"
log_level  = "INFO"
bind_addr  = "0.0.0.0"

client {
  enabled = true

  server_join {
    retry_join = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
  }

  node_class = "compute"

  meta {
    "type" = "worker"
    "zone" = "us-east-1a"
  }
}

plugin "docker" {
  config {
    allow_privileged = true

    volumes {
      enabled = true
    }
  }
}

consul {
  address = "127.0.0.1:8500"
}

vault {
  enabled = true
  address = "https://vault.service.consul:8200"
}
```
Advanced Features
Constraints and Affinities
```hcl
job "web" {
  # Hard requirement: only place on Linux nodes.
  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  # Soft preference: favor nodes in the "compute" class.
  affinity {
    attribute = "${node.class}"
    value     = "compute"
    weight    = 100
  }

  group "web" {
    constraint {
      attribute = "${meta.zone}"
      value     = "us-east-1a"
    }

    # ... rest of group configuration
  }
}
```
Volume Management
```hcl
job "database" {
  group "db" {
    # "mysql_data" must be defined as a host_volume in the client configuration.
    volume "data" {
      type      = "host"
      source    = "mysql_data"
      read_only = false
    }

    task "mysql" {
      driver = "docker"

      volume_mount {
        volume      = "data"
        destination = "/var/lib/mysql"
      }

      config {
        image = "mysql:8.0"
      }
    }
  }
}
```
Service Discovery Integration
```hcl
job "api" {
  group "api" {
    service {
      name = "api"
      port = "http"

      tags = [
        "api",
        "v1.0",
        "traefik.enable=true",
        "traefik.http.routers.api.rule=Host(`api.example.com`)"
      ]

      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "database"
              local_bind_port  = 5432
            }
          }
        }
      }
    }
  }
}
```
Best Practices
Job Design
- **Resource Allocation**: Set appropriate CPU and memory limits
- **Health Checks**: Implement comprehensive health checks
- **Graceful Shutdown**: Handle SIGTERM signals properly
- **Logging**: Use structured logging with appropriate levels
- **Configuration**: Use templates and environment variables
Cluster Management
- **High Availability**: Deploy multiple server nodes
- **Backup Strategy**: Take regular snapshots and backups
- **Monitoring**: Monitor cluster health and job status
- **Capacity Planning**: Plan for resource requirements
- **Security**: Enable ACLs and use TLS
Operations
- **Rolling Updates**: Use update strategies for zero-downtime deployments
- **Canaries**: Test changes with canary deployments
- **Resource Monitoring**: Monitor resource usage
- **Log Aggregation**: Centralize log collection
- **Alerting**: Set up alerts for critical issues
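Canary deployments end with a decision: promote the canaries or fall back to a known-good version. The following sketch wraps that decision using `nomad deployment promote`, `nomad deployment fail`, and `nomad job revert`. The function name `finish_canary` and the `healthy`/`unhealthy` verdict argument are hypothetical; how the verdict is reached is left to the operator.

```shell
# finish_canary <deployment_id> <job> <good_version> <healthy|unhealthy>
# Promotes a canary deployment, or fails it and reverts the job.
# Hypothetical helper; assumes `nomad` is on PATH.
finish_canary() {
  deployment="$1"
  job="$2"
  good_version="$3"
  verdict="$4"
  nomad deployment status "$deployment" || return 1
  if [ "$verdict" = "healthy" ]; then
    nomad deployment promote "$deployment"
  else
    nomad deployment fail "$deployment" || return 1
    nomad job revert "$job" "$good_version"
  fi
}
```

The version number passed as `good_version` can be looked up with `nomad job history example`.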
Security
- **ACL Policies**: Grant least-privilege access
- **Network Security**: Use a service mesh for secure communication
- **Secrets Management**: Integrate with Vault for secrets
- **Image Security**: Scan container images for vulnerabilities
- **Audit Logging**: Enable audit logging for compliance
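Once ACLs and TLS are enabled, every CLI call needs an address, certificates, and a token. The sketch below sets the standard `NOMAD_ADDR`, `NOMAD_CACERT`, `NOMAD_CLIENT_CERT`, `NOMAD_CLIENT_KEY`, and `NOMAD_TOKEN` environment variables for the current shell. The function name `use_secure_cluster` and all paths passed to it are hypothetical.

```shell
# use_secure_cluster <addr> <ca_cert> <client_cert> <client_key> <token_file>
# Exports the environment variables the nomad CLI reads for TLS and ACLs.
# Hypothetical helper; call it in the current shell, not in a subshell.
use_secure_cluster() {
  export NOMAD_ADDR="$1"
  export NOMAD_CACERT="$2"
  export NOMAD_CLIENT_CERT="$3"
  export NOMAD_CLIENT_KEY="$4"
  # Read the token from a file so it does not end up in shell history.
  NOMAD_TOKEN=$(cat "$5") || return 1
  export NOMAD_TOKEN
}
```

Keeping the token in a file with restrictive permissions avoids pasting it on the command line.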