
Grafana Mimir Cheat Sheet

Overview

Grafana Mimir is an open-source, horizontally scalable, highly available long-term storage backend for Prometheus metrics. Forked from Cortex, Mimir was redesigned by Grafana Labs for simplicity, performance, and scale; Grafana Labs reports testing it at up to 1 billion active time series. Mimir is compatible with Prometheus: it accepts metrics via remote write and serves PromQL queries, so it extends existing Prometheus deployments that need longer retention or higher scale.

Mimir’s architecture consists of several microservices: distributors (accept and validate incoming samples), ingesters (hold recent samples in memory and on local disk, then upload them as blocks to object storage), store-gateways (serve historical blocks from object storage), queriers (execute PromQL queries), query-frontends (split and cache queries), compactors (merge and deduplicate blocks), and rulers (evaluate recording and alerting rules). It supports native multi-tenancy, which keeps each tenant's metrics isolated, and can run in monolithic mode for simpler deployments or microservice mode for production scale.

Installation

Docker (Monolithic Mode)

docker run -d --name mimir \
  -p 9009:9009 \
  -v $(pwd)/mimir-config.yaml:/etc/mimir/config.yaml \
  grafana/mimir:2.12.0 \
  -config.file=/etc/mimir/config.yaml

Helm (Kubernetes)

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

# Install Mimir Distributed
helm install mimir grafana/mimir-distributed \
  --namespace mimir --create-namespace \
  -f mimir-values.yaml

Binary

wget https://github.com/grafana/mimir/releases/download/mimir-2.12.0/mimir-linux-amd64
chmod +x mimir-linux-amd64
sudo mv mimir-linux-amd64 /usr/local/bin/mimir

mimir -config.file=config.yaml

Configuration

Monolithic Config

# mimir-config.yaml
multitenancy_enabled: false

server:
  http_listen_port: 9009

distributor:
  ring:
    kvstore:
      store: memberlist

ingester:
  ring:
    kvstore:
      store: memberlist
    replication_factor: 1

blocks_storage:
  backend: filesystem
  filesystem:
    dir: /data/mimir/blocks
  tsdb:
    dir: /data/mimir/tsdb

compactor:
  data_dir: /data/mimir/compactor
  sharding_ring:
    kvstore:
      store: memberlist

store_gateway:
  sharding_ring:
    kvstore:
      store: memberlist

ruler:
  rule_path: /data/mimir/rules

ruler_storage:
  backend: filesystem
  filesystem:
    dir: /data/mimir/ruler-rules

limits:
  max_global_series_per_user: 1500000
  ingestion_rate: 100000
  ingestion_burst_size: 200000

memberlist:
  join_members: [mimir-1, mimir-2, mimir-3]
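
The `ingestion_rate` and `ingestion_burst_size` limits above behave like a token bucket: the rate is the steady refill in samples per second, and the burst size is the bucket capacity. The Python sketch below only illustrates that idea and is not Mimir's implementation. The `TokenBucket` class and its method names are hypothetical, and the sketch ignores how Mimir shares a tenant's limit across distributors.

```python
class TokenBucket:
    """Hypothetical token bucket: refills `rate` tokens/second up to `burst`."""

    def __init__(self, rate: float, burst: float, now: float = 0.0) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # the bucket starts full
        self.last = now

    def allow(self, samples: int, now: float) -> bool:
        """Return True and consume tokens if `samples` fit in the bucket."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if samples <= self.tokens:
            self.tokens -= samples
            return True
        return False


# Same numbers as the limits block above.
bucket = TokenBucket(rate=100_000, burst=200_000)
print(bucket.allow(200_000, now=0.0))  # a full burst fits
print(bucket.allow(1, now=0.0))        # the bucket is now empty
print(bucket.allow(100_000, now=1.0))  # one second refills 100,000 tokens
```

Timestamps are passed in explicitly so the behavior is deterministic; a real limiter would read a monotonic clock instead.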

S3 Storage Backend

blocks_storage:
  backend: s3
  s3:
    endpoint: s3.amazonaws.com
    bucket_name: mimir-blocks
    region: us-east-1
    # ${VAR} references are expanded only when Mimir runs with -config.expand-env=true
    access_key_id: ${AWS_ACCESS_KEY_ID}
    secret_access_key: ${AWS_SECRET_ACCESS_KEY}

Prometheus Remote Write to Mimir

# prometheus.yml
remote_write:
  - url: http://mimir:9009/api/v1/push
    headers:
      X-Scope-OrgID: my-tenant

Grafana Alloy / Agent Config

// alloy config
prometheus.remote_write "mimir" {
  endpoint {
    url = "http://mimir:9009/api/v1/push"
    headers = {
      "X-Scope-OrgID" = "my-tenant",
    }
  }
}

Multi-Tenancy

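# Write metrics with tenant header
# (metrics.pb must be a snappy-compressed Prometheus remote-write protobuf payload)
curl -X POST http://mimir:9009/api/v1/push \
  -H "X-Scope-OrgID: tenant-a" \
  -H "Content-Type: application/x-protobuf" \
  -H "Content-Encoding: snappy" \
  -H "X-Prometheus-Remote-Write-Version: 0.1.0" \
  --data-binary @metrics.pb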

# Query metrics for a tenant
curl http://mimir:9009/prometheus/api/v1/query \
  -H "X-Scope-OrgID: tenant-a" \
  --data-urlencode 'query=up'

# Read a tenant's effective limits (experimental endpoint)
curl http://mimir:9009/api/v1/user_limits \
  -H "X-Scope-OrgID: tenant-a"
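
The same tenant scoping can be done from a script. The sketch below builds, but does not send, an instant-query request carrying the `X-Scope-OrgID` header, using only the Python standard library. The `build_query_request` helper is hypothetical, and the base URL is assumed to match the curl examples above.

```python
import urllib.parse
import urllib.request

MIMIR_URL = "http://mimir:9009"  # assumption: same host and port as the curl examples


def build_query_request(
    tenant: str, promql: str, base_url: str = MIMIR_URL
) -> urllib.request.Request:
    """Build (without sending) an instant PromQL query scoped to one tenant."""
    body = urllib.parse.urlencode({"query": promql}).encode()
    return urllib.request.Request(
        f"{base_url}/prometheus/api/v1/query",
        data=body,  # form-encoded POST, like `curl --data-urlencode`
        headers={
            "X-Scope-OrgID": tenant,  # selects the tenant
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


req = build_query_request("tenant-a", "up")
print(req.get_method(), req.full_url)

# To send it against a reachable Mimir instance:
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       print(resp.read().decode())
```

Keeping request construction separate from sending makes the tenant header easy to check without a running cluster.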

Per-Tenant Limits

# runtime.yaml, loaded through runtime_config.file (-runtime-config.file)
overrides:
  tenant-a:
    max_global_series_per_user: 5000000
    ingestion_rate: 200000
    max_fetched_series_per_query: 100000
  tenant-b:
    max_global_series_per_user: 500000
    ingestion_rate: 50000
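
Overrides apply per field: any limit a tenant does not override falls back to the default from the `limits` block. The Python sketch below models that lookup with the values from this cheat sheet. The dictionaries and the `effective_limits` function are illustrative only and are not part of any Mimir API.

```python
# Defaults, copied from the `limits` block of the monolithic config.
DEFAULT_LIMITS = {
    "max_global_series_per_user": 1_500_000,
    "ingestion_rate": 100_000,
    "ingestion_burst_size": 200_000,
}

# Per-tenant overrides, copied from the overrides example above.
OVERRIDES = {
    "tenant-a": {
        "max_global_series_per_user": 5_000_000,
        "ingestion_rate": 200_000,
        "max_fetched_series_per_query": 100_000,
    },
    "tenant-b": {
        "max_global_series_per_user": 500_000,
        "ingestion_rate": 50_000,
    },
}


def effective_limits(tenant: str) -> dict:
    """Merge a tenant's overrides on top of the defaults, field by field."""
    return {**DEFAULT_LIMITS, **OVERRIDES.get(tenant, {})}


for tenant in ("tenant-a", "tenant-b", "tenant-c"):
    print(tenant, effective_limits(tenant))
```

Here `tenant-b` keeps the default `ingestion_burst_size`, and `tenant-c`, which has no overrides, gets every default unchanged.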

Core API Endpoints

| Endpoint | Description |
| --- | --- |
| POST /api/v1/push | Remote write ingestion |
| GET /prometheus/api/v1/query | Instant PromQL query |
| GET /prometheus/api/v1/query_range | Range PromQL query |
| GET /prometheus/api/v1/labels | List label names |
| GET /prometheus/api/v1/label/{name}/values | List label values |
| GET /prometheus/api/v1/series | List matching series |
| GET /prometheus/api/v1/rules | List alerting/recording rules |
| GET /ready | Readiness probe |
| GET /metrics | Prometheus metrics endpoint |
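
The `/ready` endpoint is useful in startup scripts that must wait for Mimir before sending traffic. The Python sketch below polls it with the standard library. The function names, the default URL, and the retry settings are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ready(url: str = "http://mimir:9009/ready") -> bool:
    """Return True when GET /ready answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status such as 503.
        return False


def wait_until_ready(
    probe: Callable[[], bool] = http_ready,
    attempts: int = 30,
    delay: float = 2.0,
) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Usage against a real instance:
#   if not wait_until_ready():
#       raise SystemExit("Mimir did not become ready in time")
```

The probe is injectable so the retry loop can be exercised without a network connection.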

Helm Values (Production)

# mimir-values.yaml
mimir:
  structuredConfig:
    multitenancy_enabled: true
    limits:
      max_global_series_per_user: 3000000
      ingestion_rate: 150000
    blocks_storage:
      backend: s3
      s3:
        bucket_name: mimir-blocks
        endpoint: s3.us-east-1.amazonaws.com

distributor:
  replicas: 3
  resources:
    requests:
      cpu: 1
      memory: 2Gi

ingester:
  replicas: 3
  persistentVolume:
    enabled: true
    size: 50Gi
  resources:
    requests:
      cpu: 2
      memory: 8Gi

store_gateway:
  replicas: 2
  persistentVolume:
    enabled: true
    size: 20Gi

compactor:
  replicas: 1
  persistentVolume:
    enabled: true
    size: 50Gi

querier:
  replicas: 2

query_frontend:
  replicas: 2

Advanced Usage

Recording Rules

# rules.yaml (load with `mimirtool rules load`, or place it in the ruler's storage backend)
groups:
  - name: aggregations
    interval: 1m
    rules:
      - record: job:http_requests:rate5m
        expr: sum(rate(http_requests_total[5m])) by (job)
      - record: instance:cpu_utilization:avg5m
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
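
Both rule names above follow the Prometheus `level:metric:operations` naming convention. The Python sketch below is a simple lint for that shape. The regular expression is a simplification written for this example, not an official validator.

```python
import re

# level:metric:operations, each part built from letters, digits, and underscores.
RULE_NAME = re.compile(
    r"[a-zA-Z_][a-zA-Z0-9_]*:[a-zA-Z_][a-zA-Z0-9_]*:[a-zA-Z0-9_]+"
)


def follows_convention(record: str) -> bool:
    """Return True if `record` has exactly three colon-separated parts."""
    return RULE_NAME.fullmatch(record) is not None


for name in (
    "job:http_requests:rate5m",
    "instance:cpu_utilization:avg5m",
    "http_requests_rate5m",  # no level or operations part
):
    print(name, follows_convention(name))
```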

Query Sharding

# The query-frontend is configured under the `frontend` block
frontend:
  parallelize_shardable_queries: true
  query_sharding_target_series_per_shard: 2500
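
The target-series setting is a planning hint: the fewer series a query is expected to touch, the fewer shards it needs. The Python sketch below is a rough approximation of that arithmetic, not Mimir's actual planner. The `estimate_shards` function is hypothetical, and the cap of 16 assumes the default value of the `query_sharding_total_shards` limit.

```python
import math


def estimate_shards(
    estimated_series: int,
    target_series_per_shard: int = 2500,
    total_shards: int = 16,  # assumption: default query_sharding_total_shards
) -> int:
    """Approximate the number of shards for a query touching `estimated_series`."""
    if target_series_per_shard <= 0:
        # A target of 0 disables the hint, so fall back to the cap.
        return total_shards
    wanted = math.ceil(estimated_series / target_series_per_shard)
    return max(1, min(total_shards, wanted))


for series in (100, 10_000, 1_000_000):
    print(series, estimate_shards(series))
```

With a target of 2500, a query over about 10,000 series would be split four ways under this approximation, and very large queries are held at the shard cap.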

Alertmanager in Mimir

alertmanager:
  data_dir: /data/mimir/alertmanager
  sharding_ring:
    kvstore:
      store: memberlist

alertmanager_storage:
  backend: s3
  s3:
    bucket_name: mimir-alertmanager

Troubleshooting

| Issue | Solution |
| --- | --- |
| per-user series limit exceeded | Increase max_global_series_per_user or reduce label cardinality |
| ingestion rate limit exceeded | Increase ingestion_rate and ingestion_burst_size for the tenant |
| Queries timing out | Enable query sharding; add query-frontend caching; reduce query range |
| High ingester memory | Reduce max_global_series_per_user; add more ingester replicas |
| Missing metrics after restart | Check ingester WAL replay; run with replication factor 3 so writes still reach quorum while one ingester is down |
| Compactor errors | Check object storage connectivity; review compactor logs for overlap issues |
| X-Scope-OrgID required | Set the header on every request; in Grafana, add it as a custom HTTP header on the data source |
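
The replication-factor advice comes down to quorum arithmetic: a write succeeds when a majority of the replicas for a series acknowledge it. The Python sketch below shows that calculation. The function names are hypothetical, and the sketch assumes plain quorum writes without zone-aware replication.

```python
def write_quorum(replication_factor: int) -> int:
    """Acknowledgements needed for a write to succeed: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be unavailable while writes still reach quorum."""
    return replication_factor - write_quorum(replication_factor)


for rf in (1, 2, 3):
    print(f"RF={rf}: quorum={write_quorum(rf)}, tolerated={tolerated_failures(rf)}")
```

Under this arithmetic, a replication factor of 1 or 2 tolerates no unavailable replica, while 3 tolerates one, which is why 3 is the usual choice for rolling restarts without gaps.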