Grafana Loki Cheat Sheet
Overview
Grafana Loki is a horizontally scalable, highly available log aggregation system designed to be cost-effective and easy to operate. Unlike traditional log systems (Elasticsearch/ELK) that index the full text of every log line, Loki only indexes metadata (labels) about your logs and stores compressed log content in cheap object storage. This approach dramatically reduces storage costs and operational complexity while still allowing fast querying through its LogQL query language.
Loki is modeled after Prometheus and uses the same label-based approach for organizing and querying data. Logs are collected by agents like Promtail, Grafana Alloy, or Fluentd/Fluent Bit and sent to Loki via HTTP push. Loki can run as a single binary (monolithic mode), as microservices for high-scale deployments, or in Simple Scalable Deployment (SSD) mode that balances simplicity with scalability. It integrates seamlessly with Grafana for visualization and alerting, and pairs naturally with Prometheus metrics and Tempo traces for full observability.
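To make the label-plus-HTTP-push model concrete, the following is a minimal Python sketch of what an agent sends to Loki. It builds the JSON body for the /loki/api/v1/push endpoint: one stream identified by its label set, plus a list of [timestamp, line] pairs. The label values, the helper names, and the base URL passed to push() are assumptions for illustration; real agents also batch, retry, and compress.

```python
import json
import time
import urllib.request


def build_push_payload(labels, lines, now_ns=None):
    """Build a JSON body for Loki's /loki/api/v1/push endpoint.

    labels: dict of label name -> value (these are what Loki indexes).
    lines:  log lines to attach to that single stream.
    """
    ts = now_ns if now_ns is not None else time.time_ns()
    # Each value is [<Unix epoch in nanoseconds, as a string>, <log line>].
    # Timestamps are bumped by 1 ns per line so entries keep their order.
    values = [[str(ts + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


def push(base_url, payload):
    """POST the payload to Loki. base_url is an assumption, e.g. http://localhost:3100."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/loki/api/v1/push",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status  # Loki replies 204 No Content on success


# Hypothetical labels and lines; nothing is sent until push() is called.
payload = build_push_payload(
    {"job": "demo", "host": "myserver"},
    ["service started", "listening on :8080"],
    now_ns=1_700_000_000_000_000_000,
)
```

Only the labels in "stream" end up in the index; the lines themselves are stored as compressed chunks, which is why keeping the label set small matters.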
Installation
Docker
# Run Loki
docker run -d --name loki \
-p 3100:3100 \
-v $(pwd)/loki-config.yaml:/etc/loki/local-config.yaml \
grafana/loki:3.0.0
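# Run Promtail (log collector)
# The image's default command looks for /etc/promtail/config.yml,
# so pass the mounted .yaml path explicitly
docker run -d --name promtail \
-v /var/log:/var/log \
-v $(pwd)/promtail-config.yaml:/etc/promtail/config.yaml \
grafana/promtail:3.0.0 \
-config.file=/etc/promtail/config.yaml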
Helm (Kubernetes)
# Add Grafana Helm repo
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Simple Scalable Deployment
helm install loki grafana/loki \
--namespace loki --create-namespace \
--set loki.storage.type=s3 \
--set loki.storage.s3.endpoint=minio.minio:9000 \
--set loki.storage.bucketNames.chunks=loki-data \
--set loki.storage.bucketNames.ruler=loki-data \
--set loki.storage.s3.accessKeyId=minioadmin \
--set loki.storage.s3.secretAccessKey=minioadmin
# Install Promtail as DaemonSet
helm install promtail grafana/promtail --namespace loki
Binary
# Download Loki
wget https://github.com/grafana/loki/releases/download/v3.0.0/loki-linux-amd64.zip
unzip loki-linux-amd64.zip
sudo mv loki-linux-amd64 /usr/local/bin/loki
# Download LogCLI
wget https://github.com/grafana/loki/releases/download/v3.0.0/logcli-linux-amd64.zip
unzip logcli-linux-amd64.zip
sudo mv logcli-linux-amd64 /usr/local/bin/logcli
Configuration
Loki Config (Monolithic)
# loki-config.yaml
auth_enabled: false
server:
  http_listen_port: 3100
common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory
schema_config:
  configs:
    - from: 2024-01-01
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h
limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  max_entries_limit_per_query: 5000
  ingestion_rate_mb: 10
  ingestion_burst_size_mb: 20
Promtail Config
# promtail-config.yaml
server:
  http_listen_port: 9080
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets: [localhost]
        labels:
          job: varlogs
          host: myserver
          __path__: /var/log/*.log
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        target_label: 'container'
  - job_name: journal
    journal:
      max_age: 12h
      labels:
        job: systemd-journal
    relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: 'unit'
LogQL Query Language
Log Stream Selection
# Select by label
{job="nginx"}
{namespace="production", container="api"}
{host=~"web-.*"}
{level!="debug"}
Log Pipeline (Filtering)
# Line filter
{job="nginx"} |= "error"
{job="nginx"} != "healthcheck"
{job="nginx"} |~ "status=[45]\\d\\d"
{job="nginx"} !~ "GET /favicon"
# JSON parsing
{job="api"} | json | status >= 400
{job="api"} | json | line_format "{{.method}} {{.path}} {{.status}}"
# Logfmt parsing
{job="app"} | logfmt | level="error" | duration > 5s
# Pattern parsing
{job="nginx"} | pattern `<ip> - - <_> "<method> <uri> <_>" <status> <size>`
| status >= 400
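The pattern parser above can be hard to picture, so here is a rough Python approximation of what it extracts. The sketch turns a pattern expression into a regular expression in which each named capture such as <ip> becomes a lazy named group and <_> becomes an unnamed skip. This is an illustration of the idea, not Loki's actual parser, and the sample log line is made up.

```python
import re


def pattern_to_regex(pattern):
    """Approximate a LogQL pattern expression with a Python regex.

    <name> becomes a lazy named group, <_> matches and discards text,
    and everything else is treated as a literal.
    """
    parts = []
    pos = 0
    for m in re.finditer(r"<(\w+)>", pattern):
        parts.append(re.escape(pattern[pos:m.start()]))  # literal text
        name = m.group(1)
        parts.append(".*?" if name == "_" else f"(?P<{name}>.*?)")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))  # trailing literal, if any
    return re.compile("^" + "".join(parts) + "$")


def extract(pattern, line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    m = pattern_to_regex(pattern).match(line)
    return m.groupdict() if m else None


NGINX_PATTERN = '<ip> - - <_> "<method> <uri> <_>" <status> <size>'
# Hypothetical access-log line in the shape the pattern expects.
SAMPLE = '192.168.1.10 - - [10/Oct/2024:13:55:36 +0000] "GET /api/users HTTP/1.1" 500 1234'

fields = extract(NGINX_PATTERN, SAMPLE)
# Equivalent of the trailing "| status >= 400" label filter.
is_error = fields is not None and int(fields["status"]) >= 400
```

Extracted values are strings, as in this sketch; LogQL converts them to numbers only when a filter such as status >= 400 compares them numerically.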
Metric Queries
# Per-second error rate, averaged over the last minute (use count_over_time for a raw count)
rate({job="nginx"} |= "error" [1m])
# Bytes rate
bytes_rate({job="nginx"} [5m])
# Top 10 paths by request count
topk(10, sum by (path) (rate({job="nginx"} | json [5m])))
# Error rate percentage
sum(rate({job="nginx"} |= "error" [5m])) / sum(rate({job="nginx"} [5m])) * 100
# P99 latency from log lines
quantile_over_time(0.99, {job="api"} | json | unwrap duration [5m]) by (endpoint)
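The arithmetic behind rate() and the error-percentage query is easy to misread, so here is a small worked sketch in Python. It assumes rate() means "entries in the window divided by the window length in seconds", which is how LogQL defines it for log ranges; the counts are invented numbers, and the function names are hypothetical.

```python
def rate(entry_count, window_seconds):
    """Per-second rate: entries seen in the window divided by its length in seconds."""
    return entry_count / window_seconds


def error_percentage(error_count, total_count, window_seconds):
    """Mirror of: sum(rate(errors[w])) / sum(rate(all[w])) * 100."""
    total_rate = rate(total_count, window_seconds)
    if total_rate == 0:
        return 0.0  # no traffic in the window, so report 0% rather than divide by zero
    return rate(error_count, window_seconds) / total_rate * 100


# Invented example: 30 error lines out of 600 lines in a 5-minute window.
WINDOW = 5 * 60
errors_per_second = rate(30, WINDOW)           # 30 / 300
error_pct = error_percentage(30, 600, WINDOW)  # (30 / 300) / (600 / 300) * 100
```

Because both rates share the same window, the window length cancels out of the percentage; it only matters when reading rate() on its own, where the result is per second regardless of the range given in brackets.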
LogCLI Usage
| Command | Description |
|---|---|
| logcli query '{job="nginx"}' | Query logs |
| logcli labels | List available labels |
| logcli labels job | List values for a label |
| logcli series '{job="nginx"}' | List log streams |
| logcli instant-query 'rate({job="nginx"}[5m])' | Run metric query |
# Configure LogCLI
export LOKI_ADDR=http://localhost:3100
# Query last hour
logcli query '{job="nginx"}' --since=1h --limit=100
# Tail logs
logcli query '{job="nginx"} |= "error"' --tail
# Output as JSON
logcli query '{job="api"}' --output=jsonl
# Query with time range
logcli query '{job="nginx"}' \
--from="2024-01-01T00:00:00Z" \
--to="2024-01-02T00:00:00Z"
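LogCLI is a client for Loki's HTTP API, so the same range query can be issued from a script. The sketch below builds a request URL for the /loki/api/v1/query_range endpoint and flattens a streams-type response into individual entries. The base URL, the helper names, and the sample response are assumptions for illustration; error handling and pagination are left out.

```python
import json
import urllib.request
from urllib.parse import urlencode


def query_range_url(base_url, logql, start_ns, end_ns, limit=100, direction="backward"):
    """Build a URL for Loki's /loki/api/v1/query_range endpoint.

    start_ns and end_ns are Unix timestamps in nanoseconds.
    """
    params = {
        "query": logql,
        "start": str(start_ns),
        "end": str(end_ns),
        "limit": str(limit),
        "direction": direction,
    }
    return base_url.rstrip("/") + "/loki/api/v1/query_range?" + urlencode(params)


def fetch(url):
    """Run the query and return the decoded JSON response."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def iter_entries(response):
    """Yield (timestamp_ns, labels, line) from a streams-type query response."""
    for stream in response["data"]["result"]:
        for ts, line in stream["values"]:
            yield int(ts), stream["stream"], line


# Assumed local Loki address, matching LOKI_ADDR above; nothing is fetched here.
url = query_range_url(
    "http://localhost:3100",
    '{job="nginx"} |= "error"',
    start_ns=1_704_067_200_000_000_000,  # 2024-01-01T00:00:00Z
    end_ns=1_704_153_600_000_000_000,    # 2024-01-02T00:00:00Z
)
```

Calling fetch(url) against a running Loki and passing the result to iter_entries() gives roughly what logcli query prints, one entry at a time.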
Advanced Usage
Loki Alerting Rules
# /loki/rules/fake/alerts.yaml (rule files live in a per-tenant directory; the tenant is "fake" when auth_enabled: false)
groups:
  - name: high-error-rate
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate({job="api"} |= "error" [5m])) by (service)
          /
          sum(rate({job="api"} [5m])) by (service)
          > 0.05
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "High error rate on {{ $labels.service }}"
S3 Backend Storage
storage_config:
  tsdb_shipper:
    active_index_directory: /loki/tsdb-index
    cache_location: /loki/tsdb-cache
  aws:
    s3: s3://access_key:secret_key@us-east-1/loki-chunks
    s3forcepathstyle: true
Recording Rules
groups:
  - name: nginx_metrics
    interval: 1m
    rules:
      - record: nginx:requests:rate5m
        expr: sum(rate({job="nginx"} [5m]))
      - record: nginx:errors:rate5m
        expr: sum(rate({job="nginx"} |= "error" [5m]))
Troubleshooting
| Issue | Solution |
|---|---|
| No logs appearing | Check Promtail targets at :9080/targets; verify Loki URL in client config |
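| entry out of order | Loki accepts out-of-order writes within a limited window when unordered_writes: true (the default since Loki 2.4); entries older than that window for a stream are still rejected |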
| Query timeout | Add more specific label matchers; reduce time range; increase query_timeout |
| High memory usage | Reduce max_entries_limit_per_query; add caching; use SSD mode |
| too many outstanding requests | Increase max_outstanding_per_tenant or add more read replicas |
| Label cardinality too high | Avoid dynamic labels (IPs, UUIDs); use structured metadata instead |
| Chunks not being flushed | Check chunk_idle_period and chunk_retain_period settings |
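The cardinality row above is easier to act on with a quick estimate. Every distinct combination of label values is its own stream, so the worst-case stream count is the product of the number of values per label. The Python sketch below shows that arithmetic along with a count of the streams actually present in a sample; the label names and counts are invented for illustration.

```python
from math import prod


def max_streams(values_per_label):
    """Worst-case number of streams: the product of distinct values per label."""
    return prod(values_per_label.values())


def count_streams(label_sets):
    """Number of distinct label combinations actually observed."""
    return len({frozenset(labels.items()) for labels in label_sets})


# Invented example: a modest, static label set.
static_labels = {"job": 5, "namespace": 3, "level": 4}
baseline = max_streams(static_labels)

# Adding one dynamic label (e.g. 10,000 client IPs) multiplies the bound.
with_ip = max_streams({**static_labels, "client_ip": 10_000})

observed = count_streams([
    {"job": "nginx", "level": "info"},
    {"job": "nginx", "level": "error"},
    {"job": "nginx", "level": "info"},  # same labels as the first entry: same stream
])
```

In this example the bound goes from 60 to 600,000 streams by adding a single label, which is the reason to keep values like IPs and UUIDs in structured metadata or in the log line itself.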