Coroot Observability

An open-source, eBPF-based observability and APM tool that provides zero-instrumentation metrics, logs, traces, and continuous profiling for Kubernetes and Docker environments.

Installation

Docker Compose (Quickest)

# One-command deployment with ClickHouse and Prometheus
curl -fsS https://raw.githubusercontent.com/coroot/coroot/main/deploy/docker-compose.yaml | \
  docker compose -f - up -d

# Access UI at http://localhost:8080
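
The containers can take a little while to start, so it may help to poll the UI port before opening it in a browser. The following is a minimal sketch, assuming `curl` is installed; `wait_for_http` is a hypothetical helper written for this sheet, not something shipped with Coroot.

```shell
# wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until curl gets a response without an HTTP error (status < 400),
# or gives up after ATTEMPTS tries. Hypothetical helper, not part of Coroot.
wait_for_http() {
  wfh_url=$1
  wfh_attempts=${2:-30}
  wfh_delay=${3:-2}
  wfh_try=1
  while [ "$wfh_try" -le "$wfh_attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, -o: discard the body
    if curl -fs -o /dev/null --max-time 5 "$wfh_url"; then
      return 0
    fi
    wfh_try=$((wfh_try + 1))
    sleep "$wfh_delay"
  done
  return 1
}
```

Usage: `wait_for_http http://localhost:8080 && echo "Coroot UI is up"`.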

Kubernetes (Helm)

# Add Coroot Helm repository
helm repo add coroot https://coroot.github.io/helm-charts
helm repo update coroot

# Install the Coroot operator
helm install -n coroot --create-namespace coroot-operator coroot/coroot-operator

# Deploy Community Edition
helm install -n coroot coroot coroot/coroot-ce

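# Alternative: deploy Community Edition with ClickHouse replication
# (use this instead of the plain install above; the release name is the same)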
helm install -n coroot coroot coroot/coroot-ce \
  --set "clickhouse.shards=2,clickhouse.replicas=2"

# Port forward to access UI
kubectl port-forward -n coroot service/coroot-coroot 8080:8080

# Access UI at http://localhost:8080
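
Before port-forwarding, it can be useful to confirm that the pods in the `coroot` namespace have started. The sketch below assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, RESTARTS, AGE) and only looks at the STATUS column; `count_not_ready` is a hypothetical helper name.

```shell
# count_not_ready: reads `kubectl get pods --no-headers` output on stdin and
# prints how many pods have a STATUS other than Running or Completed.
# Note: this checks STATUS only, not the READY column.
count_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }'
}

# Usage (requires kubectl access to the cluster):
#   kubectl get pods -n coroot --no-headers | count_not_ready
```

A result of `0` suggests every pod has at least reached the Running state; anything else is a cue to inspect the pods with `kubectl describe`.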

Docker Swarm

# Deploy Coroot stack
curl -fsS https://raw.githubusercontent.com/coroot/coroot/main/deploy/docker-swarm-stack.yaml | \
  docker stack deploy -c - coroot

Ubuntu/Debian

# Install Coroot server
curl -sfL https://raw.githubusercontent.com/coroot/coroot/main/deploy/install.sh | \
  BOOTSTRAP_PROMETHEUS_URL="http://PROMETHEUS_IP:9090" \
  BOOTSTRAP_REFRESH_INTERVAL=15s \
  BOOTSTRAP_CLICKHOUSE_ADDRESS=CLICKHOUSE_IP:9000 \
  sh -

RHEL/CentOS

# Same installer works for RHEL-based distributions
curl -sfL https://raw.githubusercontent.com/coroot/coroot/main/deploy/install.sh | \
  BOOTSTRAP_PROMETHEUS_URL="http://PROMETHEUS_IP:9090" \
  BOOTSTRAP_REFRESH_INTERVAL=15s \
  BOOTSTRAP_CLICKHOUSE_ADDRESS=CLICKHOUSE_IP:9000 \
  sh -
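
Both installer commands contain the placeholders `PROMETHEUS_IP` and `CLICKHOUSE_IP`, which are easy to leave in by mistake. The following pre-flight check is a minimal sketch of how the values could be validated before piping the installer into `sh`; `check_bootstrap` is a hypothetical helper, and the shape rules (an `http(s)://` URL, a `host:port` address) are assumptions based on the examples in this sheet.

```shell
# check_bootstrap PROMETHEUS_URL CLICKHOUSE_ADDRESS
# Returns non-zero, with a reason on stderr, if a value still contains a
# placeholder from this sheet or is not shaped like the examples.
check_bootstrap() {
  cb_prom=$1
  cb_ch=$2
  case $cb_prom in
    *PROMETHEUS_IP*) echo "replace PROMETHEUS_IP in the Prometheus URL" >&2; return 1 ;;
    http://?*|https://?*) ;;
    *) echo "Prometheus URL must start with http:// or https://" >&2; return 1 ;;
  esac
  case $cb_ch in
    *CLICKHOUSE_IP*) echo "replace CLICKHOUSE_IP in the ClickHouse address" >&2; return 1 ;;
    ?*:?*) ;;
    *) echo "ClickHouse address must be host:port" >&2; return 1 ;;
  esac
  # Everything after the last colon must be digits.
  cb_port=${cb_ch##*:}
  case $cb_port in
    *[!0-9]*) echo "ClickHouse port must be numeric" >&2; return 1 ;;
  esac
  return 0
}
```

Usage: `check_bootstrap "$BOOTSTRAP_PROMETHEUS_URL" "$BOOTSTRAP_CLICKHOUSE_ADDRESS" || exit 1` before running the installer.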

Node Agent Installation

Docker

# Run node agent as privileged container
docker run --detach --name coroot-node-agent \
  --pull=always --privileged --pid host \
  -v /sys/kernel/tracing:/sys/kernel/tracing:rw \
  -v /sys/kernel/debug:/sys/kernel/debug:rw \
  -v /sys/fs/cgroup:/host/sys/fs/cgroup:ro \
  ghcr.io/coroot/coroot-node-agent \
  --cgroupfs-root=/host/sys/fs/cgroup \
  --collector-endpoint=http://COROOT_IP:8080
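
The `docker run` command bind-mounts three host directories into the agent container. If one of them is missing on the host (for example because tracefs or debugfs is not mounted), the agent is unlikely to work as intended. The sketch below lists any that are absent; `missing_agent_paths` is a hypothetical helper, and the optional prefix argument is an assumption for cases where the host filesystem is mounted somewhere other than `/`.

```shell
# missing_agent_paths [PREFIX]
# Prints each host directory used by the docker run command that does not
# exist under PREFIX (default: the real root). Prints nothing if all exist.
missing_agent_paths() {
  map_prefix=${1:-}
  for map_dir in /sys/kernel/tracing /sys/kernel/debug /sys/fs/cgroup; do
    [ -d "$map_prefix$map_dir" ] || echo "$map_dir"
  done
  return 0
}
```

Usage: run `missing_agent_paths` on the host; empty output means all three mount sources are present.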

Linux (systemd)

# Install node agent on bare-metal or VMs
curl -sfL https://raw.githubusercontent.com/coroot/coroot-node-agent/main/install.sh | \
  COLLECTOR_ENDPOINT=http://COROOT_IP:8080 \
  SCRAPE_INTERVAL=15s \
  sh -

Kubernetes (via Helm operator)

# Node agent is automatically deployed by the Coroot operator
# No separate installation needed when using Helm

Basic Commands

Command	Description
`docker compose up -d`	Start Coroot with Docker Compose
`docker compose down`	Stop all Coroot services
`docker compose logs -f`	Follow Coroot logs
`helm install -n coroot coroot coroot/coroot-ce`	Install Coroot on Kubernetes
`helm upgrade -n coroot coroot coroot/coroot-ce`	Upgrade Coroot
`helm uninstall coroot -n coroot`	Remove Coroot from the cluster
`kubectl port-forward svc/coroot-coroot 8080:8080 -n coroot`	Access the Coroot UI

Configuration Parameters

Server Configuration

Variable	Description	Default
`BOOTSTRAP_PROMETHEUS_URL`	Prometheus server endpoint	Required
`BOOTSTRAP_REFRESH_INTERVAL`	Metrics collection interval	`15s`
`BOOTSTRAP_CLICKHOUSE_ADDRESS`	ClickHouse server address	Required
`LISTEN_ADDRESS`	HTTP listen address	`:8080`
`DATA_DIR`	Data directory path	`/var/lib/coroot`

Node Agent Configuration

Flag	Description	Default
`--collector-endpoint`	Coroot server endpoint	Required
`--cgroupfs-root`	Cgroup filesystem root path	`/sys/fs/cgroup`
`--scrape-interval`	Metrics scrape interval	`15s`
`--log-level`	Logging verbosity	`info`

Architecture Components

Component	Role
Coroot Server	Central dashboard, analysis engine, alerting
Node Agent	eBPF-based metric/log collection on each node
Cluster Agent	Database monitoring (MySQL, PostgreSQL, Redis)
ClickHouse	Metrics, logs, traces, and profiles storage
Prometheus	Metrics scraping and remote write

Key Features

Zero-Instrumentation Observability

Feature	Description
Automatic Discovery	Services auto-discovered via eBPF — no code changes needed
Service Map	Live topology map showing all service dependencies
Distributed Tracing	Request tracing across microservices without SDK
Log Collection	Automatic log gathering and pattern clustering
Continuous Profiling	CPU/memory profiling with one-click activation

Monitoring Capabilities

Capability	Description
SLO Tracking	Define and monitor Service Level Objectives
Issue Detection	Automatic identification of common issues (the project claims over 80%)
Deployment Tracking	Track Kubernetes deployments and rollbacks
Cost Monitoring	AWS, GCP, Azure resource cost analysis
Network Analysis	TCP connection metrics, DNS latency, retransmits

Supported Protocols (eBPF)

Protocol	Metrics Collected
HTTP/HTTPS	Latency, error rate, throughput
gRPC	Method-level latency and errors
PostgreSQL	Query latency, connections, errors
MySQL	Query performance, slow queries
Redis	Command latency, hit/miss ratio
MongoDB	Operation latency, connections
Kafka	Producer/consumer lag, throughput
DNS	Resolution latency, failure rate

Helm Chart Values

Common Overrides

# Each block below is a separate example. To apply several overrides to one
# release, combine the --set flags in a single command.

# Custom ClickHouse sizing
helm install -n coroot coroot coroot/coroot-ce \
  --set clickhouse.shards=3 \
  --set clickhouse.replicas=2 \
  --set clickhouse.storage=100Gi

# Custom Prometheus settings
helm install -n coroot coroot coroot/coroot-ce \
  --set prometheus.storage=50Gi \
  --set prometheus.retention=30d

# Enable ingress
helm install -n coroot coroot coroot/coroot-ce \
  --set ingress.enabled=true \
  --set ingress.host=coroot.example.com
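
Once there are more than a few overrides, a values file is usually easier to review than a long chain of `--set` flags. The sketch below writes the ClickHouse and Prometheus overrides from the commands above as YAML. The key names are copied from those `--set` flags and should be checked against the chart's own values before use; `write_coroot_values` and the file name are hypothetical.

```shell
# write_coroot_values FILE
# Writes the ClickHouse and Prometheus overrides from this sheet as YAML.
# Key names mirror the --set flags above; verify them against the chart.
write_coroot_values() {
  cat > "$1" <<'EOF'
clickhouse:
  shards: 3
  replicas: 2
  storage: 100Gi
prometheus:
  storage: 50Gi
  retention: 30d
EOF
}

# Usage (requires helm and cluster access):
#   write_coroot_values coroot-values.yaml
#   helm install -n coroot coroot coroot/coroot-ce -f coroot-values.yaml
```

Keeping the file in version control also gives a record of which overrides a given release was installed with.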

Alerting Configuration

Alert Type	Description
SLO Breach	Triggered when SLO target is at risk
Latency Spike	p99 latency exceeds threshold
Error Rate	Error percentage exceeds threshold
Resource	CPU, memory, or disk usage anomaly
Deployment	Failed or degraded deployment detected

Notification Channels

Channel	Configuration
Slack	Webhook URL
PagerDuty	Integration key
Opsgenie	API key
Email	SMTP settings
Webhook	Custom HTTP endpoint

Troubleshooting

Issue	Solution
No data appearing	Check node agent `--collector-endpoint` points to Coroot server
Missing services	Verify node agent runs with `--privileged` and `--pid host`
eBPF not loading	Ensure kernel version 4.16+ with BTF support
High memory usage	Increase `--scrape-interval` (scrape less often) or limit monitored namespaces
ClickHouse connection	Verify ClickHouse is running and accessible on port 9000
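
The kernel requirement in the table can be checked directly on a node. The following is a minimal sketch: `kernel_at_least` and `has_btf` are hypothetical helpers, the 4.16 threshold is simply the value listed above, and `/sys/kernel/btf/vmlinux` is where kernels built with BTF expose their type information.

```shell
# kernel_at_least MAJOR MINOR [RELEASE]
# Succeeds if RELEASE (default: output of `uname -r`) is at least MAJOR.MINOR.
kernel_at_least() {
  kal_rel=${3:-$(uname -r)}
  kal_major=${kal_rel%%.*}          # text before the first dot
  kal_rest=${kal_rel#*.}            # text after the first dot
  kal_minor=${kal_rest%%[!0-9]*}    # leading digits of the remainder
  [ "$kal_major" -gt "$1" ] ||
    { [ "$kal_major" -eq "$1" ] && [ "${kal_minor:-0}" -ge "$2" ]; }
}

# has_btf: succeeds if the running kernel exposes BTF type information.
has_btf() {
  [ -r /sys/kernel/btf/vmlinux ]
}
```

Usage: `kernel_at_least 4 16 || echo "kernel too old"` and `has_btf || echo "no BTF available"`, run on the node where the agent fails to load its eBPF programs.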

Best Practices

Deploy node agents on every node in your cluster for complete visibility
Use ClickHouse replication for production deployments (minimum 2 replicas)
Set meaningful SLO targets before relying on automatic alerting
Start with Docker Compose for evaluation, migrate to Helm for production
Configure Prometheus remote write to persist metrics beyond pod restarts
Use the built-in profiler to identify CPU/memory hotspots before scaling
Enable deployment tracking to correlate performance changes with releases