コンテンツにスキップ

Kubecost Cheat Sheet

Overview

Kubecost is a Kubernetes-native cost monitoring platform that provides real-time cost visibility, allocation, and optimization recommendations for clusters running on any cloud provider or on-premises infrastructure. It breaks down costs by namespace, deployment, service, label, and team, enabling organizations to understand exactly where their Kubernetes spending goes and identify opportunities for savings through right-sizing, idle resource detection, and reserved instance recommendations.

Kubecost integrates directly with cloud billing APIs (AWS CUR, GCP BigQuery, Azure Cost Management) to reconcile actual cloud costs with in-cluster resource consumption. Beyond cost monitoring, it provides efficiency scores, savings recommendations, and governance alerts when spending exceeds defined budgets. The open-source core (OpenCost) is a CNCF sandbox project, while the commercial tier adds multi-cluster views, SSO, RBAC, and enterprise support.

Installation

Helm Installation

# Add Kubecost Helm repo
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update

# Install Kubecost
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace \
  --set kubecostToken="your-token"

# Install with Prometheus integration (use existing Prometheus)
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace \
  --set prometheus.fqdn="http://prometheus.monitoring:9090" \
  --set prometheus.enabled=false

# Install OpenCost (free open-source version)
helm install opencost opencost/opencost \
  --namespace opencost \
  --create-namespace

# Verify installation
kubectl get pods -n kubecost
kubectl port-forward -n kubecost svc/kubecost-cost-analyzer 9090:9090
# Dashboard: http://localhost:9090

Cloud Integration Setup

# AWS — Configure Cost and Usage Report
# 1. Create CUR in AWS Billing Console
# 2. Configure S3 bucket access

helm upgrade kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --set kubecostProductConfigs.awsSpotDataBucket="spot-data-bucket" \
  --set kubecostProductConfigs.awsSpotDataRegion="us-east-1" \
  --set kubecostProductConfigs.athenaProjectID="aws-account-id" \
  --set kubecostProductConfigs.athenaBucketName="s3://cur-athena-results" \
  --set kubecostProductConfigs.athenaRegion="us-east-1" \
  --set kubecostProductConfigs.athenaDatabase="athenacurcfn" \
  --set kubecostProductConfigs.athenaTable="cur_report"

# GCP — Configure BigQuery billing export
helm upgrade kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --set kubecostProductConfigs.gcpProjectID="project-id" \
  --set kubecostProductConfigs.bigQueryBillingDataDataset="billing.gcp_billing_export"

# Azure — Configure Cost Management export
helm upgrade kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --set kubecostProductConfigs.azureSubscriptionID="sub-id" \
  --set kubecostProductConfigs.azureClientID="client-id" \
  --set kubecostProductConfigs.azureClientPassword="client-secret" \
  --set kubecostProductConfigs.azureTenantID="tenant-id"

Core Commands — Cost Allocation API

Query Cost Data

KUBECOST="http://localhost:9090"

# Get cost allocation by namespace (last 7 days)
curl -s "$KUBECOST/model/allocation?window=7d&aggregate=namespace" | jq '.data[] | to_entries[] | {namespace: .key, totalCost: .value.totalCost}'

# Cost allocation by deployment
curl -s "$KUBECOST/model/allocation?window=24h&aggregate=deployment" | jq '.data[] | to_entries[] | {deployment: .key, cost: .value.totalCost}'

# Cost by label (e.g., team)
curl -s "$KUBECOST/model/allocation?window=7d&aggregate=label:team" | jq '.data[] | to_entries[] | {team: .key, cost: .value.totalCost}'

# Cost by namespace and controller
curl -s "$KUBECOST/model/allocation?window=7d&aggregate=namespace,controller" | jq '.data'

# Filter by specific namespace
curl -s "$KUBECOST/model/allocation?window=7d&aggregate=deployment&filterNamespaces=production" | jq '.data'

# Get cost for a specific time window
curl -s "$KUBECOST/model/allocation?window=2026-05-01T00:00:00Z,2026-05-18T00:00:00Z&aggregate=namespace"

# Idle cost allocation
curl -s "$KUBECOST/model/allocation?window=7d&aggregate=namespace&shareIdle=true&idleByNode=false"

# Cumulative cost (not averaged per day)
curl -s "$KUBECOST/model/allocation?window=30d&aggregate=namespace&accumulate=true"

Assets and Cloud Costs

# Get cloud asset costs
curl -s "$KUBECOST/model/assets?window=7d&aggregate=type" | jq '.data'

# Costs by provider and service
curl -s "$KUBECOST/model/assets?window=7d&aggregate=provider,service"

# Node costs
curl -s "$KUBECOST/model/assets?window=7d&aggregate=type&filterTypes=Node" | jq '.data'

# Disk costs
curl -s "$KUBECOST/model/assets?window=7d&aggregate=type&filterTypes=Disk"

# Cloud costs (from billing integration)
curl -s "$KUBECOST/model/cloudCost?window=7d&aggregate=service"

Core Commands — Savings and Recommendations

# Get all savings recommendations
curl -s "$KUBECOST/model/savings" | jq '{
  monthlySavings: .monthlySavings,
  recommendations: .recommendations | length
}'

# Right-sizing recommendations for containers
curl -s "$KUBECOST/model/savings/requestSizing?window=48h&targetCPUUtilization=0.7&targetRAMUtilization=0.8" \
  | jq '.[] | {container: .containerName, namespace: .namespace, currentCPU: .currentCPURequest, recommendedCPU: .recommendedCPURequest, currentRAM: .currentRAMRequest, recommendedRAM: .recommendedRAMRequest}'

# Cluster right-sizing
curl -s "$KUBECOST/model/savings/clusterSizing?window=7d" | jq '.'

# Abandoned workloads
curl -s "$KUBECOST/model/savings/abandonedWorkloads?window=7d" | jq '.[] | {name, namespace, monthlyCost}'

# Underutilized nodes
curl -s "$KUBECOST/model/savings/underutilizedNodes?window=48h&cpuThreshold=0.3&memThreshold=0.3" | jq '.'

# Reserved instance recommendations
curl -s "$KUBECOST/model/savings/reservedInstances" | jq '{totalSavings, recommendations}'

# Orphaned resources (unattached disks, IPs)
curl -s "$KUBECOST/model/savings/orphanedResources" | jq '.[] | {type, name, monthlyCost}'

Configuration

Helm Values

# kubecost-values.yaml
kubecostToken: "your-token-here"

# Prometheus configuration
prometheus:
  enabled: true  # set false if using existing Prometheus
  server:
    retention: 15d
    resources:
      requests:
        memory: 2Gi
        cpu: 500m

# Grafana
grafana:
  enabled: true
  sidecar:
    dashboards:
      enabled: true

# Cost analyzer settings
kubecostProductConfigs:
  clusterName: "production-us-east"
  currencyCode: "USD"
  
  # Custom pricing (on-prem or custom rates)
  customPricesEnabled: false
  defaultModelPricing:
    CPU: "0.031611"
    RAM: "0.004237"
    GPU: "0.95"
    storage: "0.00005479452"

# Network costs
networkCosts:
  enabled: true
  config:
    services:
      amazon-web-services: true
      google-cloud-services: true

# Budgets and alerts
kubecostDeployment:
  budgets:
    - name: "production-budget"
      namespace: "production"
      monthlyCost: 10000
      alertThresholds: [80, 100, 120]

# Resource limits
kubecostModel:
  resources:
    requests:
      cpu: 200m
      memory: 256Mi
    limits:
      cpu: 1000m
      memory: 1Gi

# Ingress
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: kubecost.internal.company.com
      paths:
        - /

Budget and Alert Configuration

# Create a budget via API
curl -X POST "$KUBECOST/model/budget" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "production-namespace-budget",
    "filter": {
      "namespace": "production"
    },
    "amount": 15000,
    "window": "month",
    "alerts": [
      {"threshold": 0.8, "channels": ["slack"]},
      {"threshold": 1.0, "channels": ["slack", "email"]},
      {"threshold": 1.2, "channels": ["slack", "email", "pagerduty"]}
    ]
  }'

# Configure Slack alerts
curl -X POST "$KUBECOST/model/alerts" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "budget",
    "threshold": 0.9,
    "window": "7d",
    "aggregate": "namespace",
    "filter": "production",
    "slackWebhookUrl": "https://hooks.slack.com/services/T00/B00/xxx"
  }'

Advanced Usage

Multi-Cluster Aggregation

# Configure multi-cluster (requires Enterprise)
# Each cluster reports to a central Kubecost

# Primary cluster Helm values
helm upgrade kubecost kubecost/cost-analyzer \
  --set kubecostProductConfigs.clusterName="us-east-prod" \
  --set federatedETL.federator.enabled=true \
  --set federatedETL.federator.primaryCluster=true

# Secondary cluster Helm values
helm upgrade kubecost kubecost/cost-analyzer \
  --set kubecostProductConfigs.clusterName="us-west-prod" \
  --set federatedETL.federator.enabled=true \
  --set federatedETL.federator.primaryCluster=false

# Query multi-cluster costs
curl -s "$KUBECOST/model/allocation?window=7d&aggregate=cluster,namespace"

Custom Cost Metrics (Prometheus)

# Export Kubecost metrics to Prometheus
# These metrics are automatically exposed:

# Container cost per hour
# kubecost_container_cpu_cost_per_hour
# kubecost_container_memory_cost_per_hour
# kubecost_container_gpu_cost_per_hour

# Node cost per hour
# kubecost_node_cost_per_hour

# Example PromQL queries:
# Total hourly namespace cost:
# sum(kubecost_container_cpu_cost_per_hour + kubecost_container_memory_cost_per_hour) by (namespace)

# Monthly cost projection:
# sum(kubecost_container_cpu_cost_per_hour + kubecost_container_memory_cost_per_hour) by (namespace) * 730

Automated Right-Sizing

#!/bin/bash
# auto-rightsize.sh — Generate right-sizing patches
KUBECOST="http://localhost:9090"
NAMESPACE="${1:-default}"

echo "=== Right-Sizing Recommendations for $NAMESPACE ==="

curl -s "$KUBECOST/model/savings/requestSizing?window=72h&filterNamespaces=$NAMESPACE&targetCPUUtilization=0.65&targetRAMUtilization=0.75" \
  | jq -r '.[] | select(.monthlySavings > 5) | "kubectl set resources deployment/\(.controllerName) -n \(.namespace) --requests=cpu=\(.recommendedCPURequest),memory=\(.recommendedRAMRequest) --limits=cpu=\(.recommendedCPULimit),memory=\(.recommendedRAMLimit)"'

Troubleshooting

IssueCauseSolution
Costs showing $0Cloud billing not integratedConfigure CUR (AWS), BigQuery (GCP), or Cost Management (Azure)
Missing namespaces in allocationShort query windowExtend window or check namespace has active workloads
Prometheus OOMToo many metrics retainedReduce retention or increase Prometheus memory
Stale cost dataETL pipeline delayedCheck ETL logs: kubectl logs -n kubecost -l app=cost-analyzer
Network costs missingNetwork costs not enabledSet networkCosts.enabled=true in Helm values
Multi-cluster not aggregatingFederator misconfiguredVerify primary/secondary cluster settings and connectivity
Dashboard loading slowlyLarge datasetReduce time window or filter by specific namespaces
Savings recommendations emptyInsufficient dataWait 48-72 hours for enough utilization data
# Check Kubecost pods
kubectl get pods -n kubecost

# Check cost-model logs
kubectl logs -n kubecost deploy/kubecost-cost-analyzer -c cost-model --tail=50

# Verify Prometheus connectivity
kubectl exec -n kubecost deploy/kubecost-cost-analyzer -c cost-model -- \
  wget -qO- "http://kubecost-prometheus-server/api/v1/query?query=up"

# Check ETL status
curl -s "$KUBECOST/model/etl/status" | jq '.'

# Force ETL rebuild
curl -X POST "$KUBECOST/model/etl/rebuild"

# Health check
curl -s "$KUBECOST/healthz"