Grafana Loki Cheat Sheet

Overview

Grafana Loki is a horizontally scalable, highly available log aggregation system designed to be cost-effective and easy to operate. Unlike traditional log systems (Elasticsearch/ELK) that index the full text of every log line, Loki only indexes metadata (labels) about your logs and stores compressed log content in cheap object storage. This approach dramatically reduces storage costs and operational complexity while still allowing fast querying through its LogQL query language.

Loki is modeled after Prometheus and uses the same label-based approach for organizing and querying data. Agents such as Promtail, Grafana Alloy, Fluentd, or Fluent Bit collect logs and push them to Loki over HTTP (`/loki/api/v1/push`). Loki can run as a single binary (monolithic mode), in Simple Scalable Deployment (SSD) mode, which balances simplicity with scalability, or as microservices for high-scale deployments. It integrates with Grafana for visualization and alerting, and pairs with Prometheus metrics and Tempo traces for full observability.
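
The HTTP push described above can be sketched in Python with only the standard library. This is a minimal illustration, not a replacement for a real agent: the address `http://localhost:3100` and the helper names `build_push_payload` and `push` are assumptions made for this example. The body shape (a `streams` list whose entries carry a `stream` label map and `values` pairs of nanosecond timestamp string plus log line) follows Loki's push API.

```python
import json
import time
import urllib.request

# Assumption: Loki listens locally on the port used throughout this sheet.
LOKI_PUSH_URL = "http://localhost:3100/loki/api/v1/push"


def build_push_payload(labels, lines, ts_ns=None):
    """Build the JSON body for the push endpoint: one stream, many lines."""
    if ts_ns is None:
        ts_ns = time.time_ns()
    # Each value is [nanosecond timestamp as a string, log line].
    # Timestamps are bumped by 1 ns per line so ordering stays stable.
    values = [[str(ts_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


def push(labels, lines, url=LOKI_PUSH_URL):
    """POST the payload and return the HTTP status (204 on success)."""
    body = json.dumps(build_push_payload(labels, lines)).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status
```

Calling `push({"job": "demo"}, ["hello from python"])` against a running instance should make the line visible under `{job="demo"}`. Keep the label set small and static, since every distinct label combination creates a new stream.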

Installation

Docker

# Run Loki
docker run -d --name loki \
  -p 3100:3100 \
  -v "$(pwd)/loki-config.yaml:/etc/loki/local-config.yaml" \
  grafana/loki:3.0.0

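# Run Promtail (log collector)
# --link lets the container resolve the hostname "loki" used in the client URL;
# -config.file points Promtail at the mounted file explicitly
docker run -d --name promtail \
  --link loki \
  -v /var/log:/var/log \
  -v "$(pwd)/promtail-config.yaml:/etc/promtail/config.yaml" \
  grafana/promtail:3.0.0 \
  -config.file=/etc/promtail/config.yaml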

Helm (Kubernetes)

# Add Grafana Helm repo
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

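# Simple Scalable Deployment
# Value names follow the grafana/loki chart (camelCase); verify against your chart version.
# insecure=true assumes MinIO serves plain HTTP.
helm install loki grafana/loki \
  --namespace loki --create-namespace \
  --set loki.storage.type=s3 \
  --set loki.storage.s3.endpoint=minio.minio:9000 \
  --set loki.storage.s3.s3ForcePathStyle=true \
  --set loki.storage.s3.insecure=true \
  --set loki.storage.bucketNames.chunks=loki-data \
  --set loki.storage.bucketNames.ruler=loki-data \
  --set loki.storage.s3.accessKeyId=minioadmin \
  --set loki.storage.s3.secretAccessKey=minioadmin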

# Install Promtail as DaemonSet
helm install promtail grafana/promtail --namespace loki

Binary

# Download Loki
wget https://github.com/grafana/loki/releases/download/v3.0.0/loki-linux-amd64.zip
unzip loki-linux-amd64.zip
sudo mv loki-linux-amd64 /usr/local/bin/loki

# Download LogCLI
wget https://github.com/grafana/loki/releases/download/v3.0.0/logcli-linux-amd64.zip
unzip logcli-linux-amd64.zip
sudo mv logcli-linux-amd64 /usr/local/bin/logcli

Configuration

Loki Config (Monolithic)

# loki-config.yaml
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2024-01-01
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  max_entries_limit_per_query: 5000
  ingestion_rate_mb: 10
  ingestion_burst_size_mb: 20

Promtail Config

# promtail-config.yaml
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets: [localhost]
        labels:
          job: varlogs
          host: myserver
          __path__: /var/log/*.log

  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      # Docker reports container names with a leading slash ("/api"); strip it
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'

  - job_name: journal
    journal:
      max_age: 12h
      labels:
        job: systemd-journal
    relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: 'unit'

LogQL Query Language

Log Stream Selection

# Select by label
{job="nginx"}
{namespace="production", container="api"}
{host=~"web-.*"}
{level!="debug"}

Log Pipeline (Filtering)

# Line filter
{job="nginx"} |= "error"
{job="nginx"} != "healthcheck"
{job="nginx"} |~ "status=[45]\\d\\d"
{job="nginx"} !~ "GET /favicon"

# JSON parsing
{job="api"} | json | status >= 400
{job="api"} | json | line_format "{{.method}} {{.path}} {{.status}}"

# Logfmt parsing
{job="app"} | logfmt | level="error" | duration > 5s

# Pattern parsing
{job="nginx"} | pattern `<ip> - - <_> "<method> <uri> <_>" <status> <size>`
  | status >= 400

Metric Queries

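# Error lines per second (averaged over the last minute)
rate({job="nginx"} |= "error" [1m])

# Error lines per minute
count_over_time({job="nginx"} |= "error" [1m])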

# Bytes rate
bytes_rate({job="nginx"} [5m])

# Top 10 paths by request count
topk(10, sum by (path) (rate({job="nginx"} | json [5m])))

# Error rate percentage
sum(rate({job="nginx"} |= "error" [5m])) / sum(rate({job="nginx"} [5m])) * 100

# P99 latency from log lines
quantile_over_time(0.99, {job="api"} | json | unwrap duration [5m]) by (endpoint)
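
`rate()` in LogQL reports entries per second over the range, so it differs from a per-window count by the window length in seconds. The sketch below restates that arithmetic, along with the error-percentage formula, in plain Python. The function names are illustrative and not part of any Loki client, and returning 0.0 for an empty denominator is a choice made here for convenience.

```python
def per_second_rate(entries_in_window, window_seconds):
    """Entries per second: the count in the window divided by its length."""
    return entries_in_window / window_seconds


def error_percentage(error_rate, total_rate):
    """Share of error lines as a percentage of all lines."""
    if total_rate == 0:
        # Assumption for this sketch: treat "no traffic" as 0% errors.
        return 0.0
    return error_rate / total_rate * 100


# 120 error lines in a 1m window correspond to a rate of 2 lines per second.
errors_per_second = per_second_rate(120, 60)

# 1 error line per second out of 4 lines per second overall is 25%.
percent = error_percentage(1, 4)
```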

LogCLI Usage

Command	Description
`logcli query '{job="nginx"}'`	Query logs
`logcli labels`	List available labels
`logcli labels job`	List values for a label
`logcli series '{job="nginx"}'`	List log streams
`logcli instant-query 'rate({job="nginx"}[5m])'`	Run metric query

# Configure LogCLI
export LOKI_ADDR=http://localhost:3100

# Query last hour
logcli query '{job="nginx"}' --since=1h --limit=100

# Tail logs
logcli query '{job="nginx"} |= "error"' --tail

# Output as JSON
logcli query '{job="api"}' --output=jsonl

# Query with time range
logcli query '{job="nginx"}' \
  --from="2024-01-01T00:00:00Z" \
  --to="2024-01-02T00:00:00Z"
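
The time-range query above maps onto Loki's HTTP endpoint `/loki/api/v1/query_range`, which takes `query`, `start`, `end`, and `limit` parameters. The following sketch only builds the request URL, so it can be inspected without a running server. The base address and the helper names `to_ns` and `query_range_url` are assumptions for this example.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ns(rfc3339):
    """Convert an RFC 3339 timestamp with an explicit offset or 'Z' to Unix nanoseconds."""
    dt = datetime.fromisoformat(rfc3339.replace("Z", "+00:00"))
    # Integer division by one microsecond avoids float rounding.
    micros = (dt - EPOCH) // timedelta(microseconds=1)
    return micros * 1000


def query_range_url(base, logql, start, end, limit=100):
    """Build a query_range URL; start and end are sent as nanosecond epochs."""
    params = {
        "query": logql,
        "start": to_ns(start),
        "end": to_ns(end),
        "limit": limit,
    }
    return f"{base}/loki/api/v1/query_range?{urlencode(params)}"


url = query_range_url(
    "http://localhost:3100",  # assumed address, same as LOKI_ADDR above
    '{job="nginx"}',
    "2024-01-01T00:00:00Z",
    "2024-01-02T00:00:00Z",
)
```

The resulting URL can be fetched with any HTTP client. `urlencode` handles the quoting of braces and quotes in the LogQL selector.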

Advanced Usage

Loki Alerting Rules

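# /loki/rules/fake/alerts.yaml
# Local ruler storage expects one directory per tenant; "fake" is the tenant ID when auth_enabled is false.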
groups:
  - name: high-error-rate
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate({job="api"} |= "error" [5m])) by (service)
            /
          sum(rate({job="api"} [5m])) by (service)
            > 0.05
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "High error rate on {{ $labels.service }}"

S3 Backend Storage

# Requires object_store: s3 in the matching schema_config entry
storage_config:
  tsdb_shipper:
    active_index_directory: /loki/tsdb-index
    cache_location: /loki/tsdb-cache
  aws:
    s3: s3://access_key:secret_key@us-east-1/loki-chunks
    s3forcepathstyle: true

Recording Rules

groups:
  - name: nginx_metrics
    interval: 1m
    rules:
      - record: nginx:requests:rate5m
        expr: sum(rate({job="nginx"} [5m]))
      - record: nginx:errors:rate5m
        expr: sum(rate({job="nginx"} |= "error" [5m]))

Troubleshooting

Issue	Solution
No logs appearing	Check Promtail targets at `:9080/targets`; verify Loki URL in client config
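`entry out of order`	Loki accepts out-of-order entries per stream only within a limited window (half of `max_chunk_age`, one hour by default); keep `unordered_writes: true` and check for agents replaying old logs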
Query timeout	Add more specific label matchers; reduce time range; increase `query_timeout`
High memory usage	Reduce `max_entries_limit_per_query`; add caching; move to Simple Scalable Deployment mode to spread the load
`too many outstanding requests`	Increase `max_outstanding_per_tenant` or add more read replicas
Label cardinality too high	Avoid dynamic labels (IPs, UUIDs); use structured metadata instead
Chunks not being flushed	Check `chunk_idle_period` and `chunk_retain_period` settings
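
The cardinality advice in the table can be made concrete with a little arithmetic: the number of possible streams is bounded by the product of distinct values per label, so a single unbounded label multiplies everything else. The label names and counts below are invented for illustration.

```python
from math import prod


def max_streams(distinct_values_per_label):
    """Upper bound on stream count: product of distinct values per label."""
    return prod(distinct_values_per_label.values())


# Hypothetical static labels: 3 jobs x 2 environments x 4 levels.
static_labels = {"job": 3, "env": 2, "level": 4}
bounded = max_streams(static_labels)

# Adding a dynamic label with 10,000 distinct client IPs.
with_ip = max_streams({**static_labels, "client_ip": 10_000})
```

Here `bounded` is 24 while `with_ip` is 240,000, which is why values such as IPs or request IDs belong in the log line or in structured metadata rather than in labels.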