Komodor Cheat Sheet
Overview
Komodor is a Kubernetes troubleshooting and monitoring platform that gives visibility across the Kubernetes stack by tracking changes to deployments, configurations, infrastructure, and code. It correlates these changes with issues such as pod failures, performance degradation, and service disruptions, so teams can identify root causes without manually sifting through logs, events, and metrics across multiple tools.
The platform provides a unified timeline view showing what changed, when, by whom, and how it affected services. Komodor integrates with CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, ArgoCD), monitoring tools (Datadog, Prometheus, PagerDuty), and communication platforms (Slack, Teams) to build a complete picture of system changes. Its automated root cause analysis and recommendations are intended to reduce mean time to resolution (MTTR) for Kubernetes incidents.
Installation
Agent Installation via Helm
# Add Komodor Helm repo
helm repo add komodor https://helm-charts.komodor.io
helm repo update
# Install Komodor agent
helm install komodor-agent komodor/komodor-agent \
--namespace komodor \
--create-namespace \
--set apiKey="your-komodor-api-key" \
--set clusterName="production-us-east" \
--set watcher.enableAgentTaskExecution=true
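# Alternative: install with Prometheus metrics collection
# (run this instead of the command above, not in addition to it; both use the release name komodor-agent)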
helm install komodor-agent komodor/komodor-agent \
--namespace komodor \
--create-namespace \
--set apiKey="your-komodor-api-key" \
--set clusterName="production-us-east" \
--set metrics.enabled=true
# Verify installation
kubectl get pods -n komodor
# Check agent status
kubectl logs -n komodor -l app=komodor-agent --tail=20
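Listing pods gives only a snapshot. For scripted installs, a small wrapper around `kubectl rollout status` can block until the agent is ready or a timeout expires. This is a minimal sketch: the function name `wait_for_agent` is hypothetical, and the Deployment name `komodor-agent` is assumed from the troubleshooting commands later in this sheet.

```shell
# Hypothetical helper: wait until the agent Deployment finishes rolling out.
# Arguments (all optional): namespace, deployment name, timeout.
wait_for_agent() {
  local namespace="${1:-komodor}"
  local deployment="${2:-komodor-agent}"   # assumed Deployment name
  local timeout="${3:-120s}"
  # Exits non-zero if the rollout does not complete within the timeout.
  kubectl rollout status "deployment/${deployment}" -n "$namespace" --timeout="$timeout"
}
```

Calling `wait_for_agent komodor komodor-agent 180s` right after `helm install` lets a CI job fail early if the agent never becomes ready.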
Helm Values Configuration
# komodor-values.yaml
apiKey: "your-api-key"
clusterName: "production-us-east"
watcher:
  enableAgentTaskExecution: true
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
# Namespace filtering
namespacesDenylist:
  - kube-system
# Or use allowlist
# namespacesAllowlist:
#   - production
#   - staging
# Resource types to watch
watchedResources:
  deployment: true
  statefulset: true
  daemonset: true
  job: true
  cronjob: true
  pod: true
  service: true
  configmap: true
  secret: false  # Disable secret watching for security
  ingress: true
  hpa: true
  pdb: true
metrics:
  enabled: true
  serviceMonitor:
    enabled: false
# Communication settings
communications:
  slack:
    enabled: true
  teams:
    enabled: true
Upgrade Agent
# Upgrade to latest version
helm repo update
helm upgrade komodor-agent komodor/komodor-agent \
--namespace komodor \
--values komodor-values.yaml
# Verify upgrade
kubectl get pods -n komodor -w
Core Commands — API
Service and Resource Management
# Set API credentials
export KOMODOR_API_KEY="your-api-key"
export KOMODOR_API="https://app.komodor.com/api/v1"
# List monitored services
curl -s "$KOMODOR_API/services" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '.services[] | {name, namespace, cluster}'
# Get service details
curl -s "$KOMODOR_API/services/SERVICE_ID" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '.'
# Get service events timeline
curl -s "$KOMODOR_API/services/SERVICE_ID/events?from=2026-05-17T00:00:00Z&to=2026-05-18T23:59:59Z" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '.events[] | {type, message, timestamp}'
# Get deployment history
curl -s "$KOMODOR_API/services/SERVICE_ID/deploys" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '.deploys[] | {version, status, triggeredBy, timestamp}'
# Get pod status for a service
curl -s "$KOMODOR_API/services/SERVICE_ID/pods" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '.pods[] | {name, status, restarts, node}'
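The requests above repeat the same header and, because plain `curl -s` exits 0 on HTTP 4xx/5xx responses, can hide an invalid key or a wrong path. A thin wrapper is one way to centralize both concerns. This is a sketch only: `komodor_api` is a hypothetical name, and the base URL and paths are the ones this sheet already uses, not independently verified endpoints.

```shell
# Hypothetical wrapper around curl for the API calls in this sheet.
# Usage: komodor_api METHOD PATH [JSON_BODY]
komodor_api() {
  local method="$1" path="$2" body="${3:-}"
  # Stop with a clear message if the key was never exported.
  : "${KOMODOR_API_KEY:?export KOMODOR_API_KEY first}"
  local base="${KOMODOR_API:-https://app.komodor.com/api/v1}"
  # --fail makes curl return a non-zero exit code on HTTP errors.
  local args=(--silent --show-error --fail --request "$method"
              --header "Authorization: Bearer ${KOMODOR_API_KEY}")
  if [ -n "$body" ]; then
    args+=(--header "Content-Type: application/json" --data "$body")
  fi
  curl "${args[@]}" "${base}${path}"
}
```

With this in place, `komodor_api GET /services | jq '.'` behaves like the first example but returns a non-zero status when the request is rejected.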
Event and Change Tracking
# Get all events across cluster
curl -s "$KOMODOR_API/events?cluster=production-us-east&from=2026-05-18T00:00:00Z" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '.events[] | {type, resource, summary, time}'
# Filter events by type
curl -s "$KOMODOR_API/events?cluster=production-us-east&type=deploy" \
-H "Authorization: Bearer $KOMODOR_API_KEY"
# Get config changes
curl -s "$KOMODOR_API/events?cluster=production-us-east&type=config_change" \
-H "Authorization: Bearer $KOMODOR_API_KEY"
# Get infrastructure events (node issues, HPA, etc.)
curl -s "$KOMODOR_API/events?cluster=production-us-east&type=infrastructure" \
-H "Authorization: Bearer $KOMODOR_API_KEY"
# Search events
curl -s "$KOMODOR_API/events?search=OOMKilled&from=2026-05-11T00:00:00Z" \
-H "Authorization: Bearer $KOMODOR_API_KEY"
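The event queries above hard-code timestamps and pass the search term unescaped, which breaks as soon as the term contains a space or `&`. The helpers below sketch one way to build those query-string pieces. Both function names are hypothetical, `events_window` assumes a `date` that accepts `-d @EPOCH` (GNU coreutils or BusyBox), and the `from`, `to`, and `search` parameters are taken from the examples in this sheet.

```shell
# Percent-encode a string for use in a URL query parameter.
urlencode() {
  local input="$1" out="" ch i
  local LC_ALL=C   # byte-wise iteration and strict ASCII ranges
  for (( i = 0; i < ${#input}; i++ )); do
    ch="${input:i:1}"
    case "$ch" in
      [A-Za-z0-9.~_-]) out+="$ch" ;;                        # unreserved: keep as-is
      *) printf -v ch '%%%02X' "'$ch"; out+="$ch" ;;        # everything else: %XX
    esac
  done
  printf '%s\n' "$out"
}

# Print "from=...&to=..." covering the last N hours in UTC.
# Second argument (optional) overrides "now" as epoch seconds.
events_window() {
  local hours="$1" now="${2:-$(date +%s)}"
  local start=$(( now - hours * 3600 ))
  printf 'from=%s&to=%s\n' \
    "$(date -u -d "@${start}" +%Y-%m-%dT%H:%M:%SZ)" \
    "$(date -u -d "@${now}" +%Y-%m-%dT%H:%M:%SZ)"
}

# Example (assumes KOMODOR_API and KOMODOR_API_KEY are exported as above):
# curl -s "$KOMODOR_API/events?search=$(urlencode 'OOMKilled pod')&$(events_window 24)" \
#   -H "Authorization: Bearer $KOMODOR_API_KEY"
```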
Health and Availability
# Get cluster health overview
curl -s "$KOMODOR_API/clusters/production-us-east/health" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '{healthy, unhealthy, warning, total}'
# Get unhealthy workloads
curl -s "$KOMODOR_API/clusters/production-us-east/workloads?status=unhealthy" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '.workloads[] | {name, namespace, issue}'
# Get availability metrics for a service
curl -s "$KOMODOR_API/services/SERVICE_ID/availability?window=7d" \
-H "Authorization: Bearer $KOMODOR_API_KEY"
Configuration
CI/CD Integration (GitHub Actions)
# .github/workflows/deploy.yml — Notify Komodor of deployments
name: Deploy and Notify Komodor
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to Kubernetes
        run: kubectl apply -f k8s/
      - name: Notify Komodor
        uses: komodorio/komodor-github-action@v1
        with:
          apiKey: ${{ secrets.KOMODOR_API_KEY }}
          service: "payment-api"
          cluster: "production-us-east"
          namespace: "production"
          status: "success"
          deploymentVersion: ${{ github.sha }}
          description: "Deployed commit ${{ github.sha }}"
ArgoCD Integration
# Komodor automatically detects ArgoCD deployments
# when the agent is installed in the same cluster
# Additional ArgoCD configuration in Helm values:
argocd:
  enabled: true
# Komodor will track ArgoCD Application resources
# and correlate them with deployment changes
Alert and Notification Configuration
# Configure in Komodor dashboard or via API
# Slack notification rule
curl -X POST "$KOMODOR_API/notifications/rules" \
-H "Authorization: Bearer $KOMODOR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "name": "Critical Pod Failures",
  "enabled": true,
  "conditions": {
    "event_types": ["pod_failure", "oom_kill", "crash_loop"],
    "namespaces": ["production"],
    "severity": ["critical", "high"]
  },
  "channels": [
    {
      "type": "slack",
      "channel": "#k8s-critical",
      "webhook_url": "https://hooks.slack.com/services/T00/B00/xxx"
    }
  ]
}'
# PagerDuty integration
curl -X POST "$KOMODOR_API/integrations/pagerduty" \
-H "Authorization: Bearer $KOMODOR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "integration_key": "pagerduty-routing-key",
  "severity_mapping": {
    "critical": "critical",
    "high": "error",
    "medium": "warning"
  }
}'
Annotation-Based Configuration
# Add Komodor annotations to deployments
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-api
  namespace: production
  annotations:
    # Link to source code
    app.komodor.com/git.repository: "https://github.com/org/payment-api"
    app.komodor.com/git.ref: "main"
    # Team ownership
    app.komodor.com/team: "payments"
    # Service tier
    app.komodor.com/tier: "critical"
    # Related resources
    app.komodor.com/relates-to: "postgres-payment,redis-cache"
    # Deploy tracking
    app.komodor.com/deploy.link: "https://github.com/org/payment-api/actions/runs/12345"
    app.komodor.com/deploy.user: "deployer@company.com"
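For a Deployment that is already running, the same annotations can be applied in place with `kubectl annotate` instead of editing the manifest. The sketch below assumes the `app.komodor.com/` prefix shown above; the function name `set_komodor_annotation` is hypothetical. A later `kubectl apply` of a manifest that owns the same annotation key can overwrite the value, so treat this as a quick fix rather than the source of truth.

```shell
# Hypothetical helper: set one app.komodor.com/<key> annotation on a Deployment.
# Usage: set_komodor_annotation DEPLOYMENT NAMESPACE KEY VALUE
set_komodor_annotation() {
  local deployment="$1" namespace="$2" key="$3" value="$4"
  # --overwrite replaces the annotation if it already exists.
  kubectl annotate deployment "$deployment" -n "$namespace" \
    "app.komodor.com/${key}=${value}" --overwrite
}
```

For example, `set_komodor_annotation payment-api production team payments` sets the team ownership annotation from the manifest above.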
Advanced Usage
Automated Remediation
# Komodor supports automated actions via the agent
# Enable agent task execution in Helm values
# watcher.enableAgentTaskExecution: true
# Available automated actions:
# - Rollback deployment to previous version
# - Restart pods
# - Scale deployment
# - Cordon/uncordon nodes
# Configure auto-rollback rule
curl -X POST "$KOMODOR_API/automation/rules" \
-H "Authorization: Bearer $KOMODOR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "name": "Auto-rollback on crash loop",
  "enabled": true,
  "trigger": {
    "type": "crash_loop_backoff",
    "threshold": 5,
    "window_minutes": 10
  },
  "action": {
    "type": "rollback",
    "to": "previous_stable"
  },
  "scope": {
    "clusters": ["production-us-east"],
    "namespaces": ["production"],
    "labels": {"tier": "critical"}
  }
}'
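Before enabling automated actions, it can help to know the manual `kubectl` command each one roughly corresponds to. The mapping below is an illustration only and does not claim to show what the Komodor agent actually runs; `manual_action` is a hypothetical name, and it prints the command rather than executing it.

```shell
# Hypothetical dry-run helper: print the kubectl command that roughly matches
# one of the automated actions listed above.
# Usage: manual_action ACTION TARGET [NAMESPACE] [REPLICAS]
manual_action() {
  local action="$1" target="$2" namespace="${3:-default}" replicas="${4:-}"
  case "$action" in
    rollback)
      echo "kubectl rollout undo deployment/${target} -n ${namespace}" ;;
    restart)
      echo "kubectl rollout restart deployment/${target} -n ${namespace}" ;;
    scale)
      if [ -z "$replicas" ]; then
        echo "scale requires a replica count" >&2
        return 1
      fi
      echo "kubectl scale deployment/${target} -n ${namespace} --replicas=${replicas}" ;;
    cordon|uncordon)
      # Nodes are cluster-scoped, so no namespace is passed.
      echo "kubectl ${action} ${target}" ;;
    *)
      echo "unknown action: ${action}" >&2
      return 1 ;;
  esac
}
```

Running `manual_action rollback payment-api production` prints the rollback command, which can be reviewed and then executed by hand.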
Multi-Cluster Management
# Install agent on each cluster with unique cluster names
for cluster in production-us-east production-eu-west staging; do
  kubectl config use-context "$cluster"
  helm install komodor-agent komodor/komodor-agent \
    --namespace komodor \
    --create-namespace \
    --set apiKey="$KOMODOR_API_KEY" \
    --set clusterName="$cluster"
done
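The loop above switches the active kubeconfig context and fails on a second run because the release already exists. A variant using `helm upgrade --install` with `--kube-context` is re-runnable and leaves the current context untouched. This is a sketch under the same assumption as the loop, namely that each kubeconfig context is named after its cluster; `install_agent` is a hypothetical name.

```shell
# Hypothetical helper: install or upgrade the agent in one cluster.
# Usage: install_agent KUBE_CONTEXT   (context name doubles as clusterName)
install_agent() {
  local context="$1"
  # "upgrade --install" installs on the first run and upgrades afterwards.
  helm upgrade --install komodor-agent komodor/komodor-agent \
    --kube-context "$context" \
    --namespace komodor \
    --create-namespace \
    --set apiKey="$KOMODOR_API_KEY" \
    --set clusterName="$context"
}

# Example:
# for cluster in production-us-east production-eu-west staging; do
#   install_agent "$cluster"
# done
```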
# Query across clusters
curl -s "$KOMODOR_API/events?from=2026-05-18T00:00:00Z" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '.events[] | {cluster, type, message}'
# Get cross-cluster service topology
curl -s "$KOMODOR_API/topology?clusters=production-us-east,production-eu-west" \
-H "Authorization: Bearer $KOMODOR_API_KEY"
Cost Analysis
# Get cost insights (if cost module enabled)
curl -s "$KOMODOR_API/costs/overview?window=30d" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '{totalCost, byNamespace}'
# Get per-service cost
curl -s "$KOMODOR_API/costs/services?window=7d&sort=cost_desc" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '.services[] | {name, monthlyCost, efficiency}'
# Right-sizing recommendations
curl -s "$KOMODOR_API/costs/recommendations" \
-H "Authorization: Bearer $KOMODOR_API_KEY" | jq '.recommendations[] | {service, currentCost, recommendedCost, savings}'
Troubleshooting
| Issue | Cause | Solution |
|---|---|---|
| Agent not connecting | API key invalid or network blocked | Verify API key and allow outbound to *.komodor.com:443 |
| Events missing | Namespace filtered out | Check namespacesDenylist in Helm values |
| Deployments not tracked | Agent RBAC insufficient | Verify ClusterRole has watch/list on deployments |
| Slack notifications not sending | Webhook URL expired | Regenerate Slack webhook and update in Komodor |
| Metrics not showing | metrics.enabled set to false | Set metrics.enabled=true in Helm values |
| Config changes not detected | ConfigMap watching disabled | Enable watchedResources.configmap: true |
| ArgoCD apps not correlated | ArgoCD integration not enabled | Set argocd.enabled: true in Helm values |
| High agent resource usage | Too many resources being watched | Filter namespaces and resource types |
# Check agent logs
kubectl logs -n komodor -l app=komodor-agent --tail=100
# Verify agent connectivity (assumes wget is available in the agent image)
kubectl exec -n komodor deploy/komodor-agent -- wget -qO- https://app.komodor.com/health
# Check agent version
kubectl get deploy -n komodor komodor-agent -o jsonpath='{.spec.template.spec.containers[0].image}'
# Restart agent
kubectl rollout restart deployment -n komodor komodor-agent
# Verify RBAC permissions
kubectl auth can-i list deployments --all-namespaces --as=system:serviceaccount:komodor:komodor-agent
# Debug: check events are being captured
kubectl get events -A --sort-by='.lastTimestamp' | tail -20
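The single `can-i` check above covers one verb on one resource. To test the "Agent RBAC insufficient" row more thoroughly, the check can be looped over several verbs and resource types. This is a sketch: `check_agent_rbac` is a hypothetical name, the service account `komodor-agent` in namespace `komodor` is assumed from the command above, and the resource list is an illustrative subset of `watchedResources`.

```shell
# Hypothetical helper: report which list/watch permissions the agent's
# service account is missing. Returns non-zero if any check fails.
# Usage: check_agent_rbac [NAMESPACE] [SERVICE_ACCOUNT]
check_agent_rbac() {
  local sa="system:serviceaccount:${1:-komodor}:${2:-komodor-agent}"
  local failed=0 verb resource
  for resource in deployments statefulsets daemonsets pods configmaps; do
    for verb in list watch; do
      # kubectl auth can-i exits non-zero when the action is not allowed.
      if ! kubectl auth can-i "$verb" "$resource" --all-namespaces --as="$sa" >/dev/null 2>&1; then
        echo "MISSING: $verb $resource"
        failed=1
      fi
    done
  done
  return "$failed"
}
```

An empty output with exit status 0 means every checked permission is granted; each `MISSING:` line points at a rule to add to the agent's ClusterRole.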