Envoy Proxy
Envoy is an open-source, high-performance L7 proxy designed for cloud-native applications. It serves as the data plane for service meshes like Istio and provides advanced load balancing, observability, and resiliency features without requiring application changes.
Installation
Run with Docker
# Run Envoy with a config file
docker run --rm -it \
  -v "$(pwd)/envoy.yaml:/etc/envoy/envoy.yaml" \
  -p 10000:10000 \
  -p 9901:9901 \
  envoyproxy/envoy:v1.30.0 \
  envoy -c /etc/envoy/envoy.yaml
# Check admin interface
curl http://localhost:9901/ready
curl http://localhost:9901/stats
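Envoy may take a moment to bind the admin port after the container starts, so a script that curls `/ready` immediately can fail spuriously. The helper below is a minimal sketch of a retry loop. `wait_until` is a hypothetical name, not part of Envoy or Docker, and the attempt count and delay are arbitrary.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it exits 0 or ATTEMPTS is exhausted.
# Returns 0 on success, 1 if every attempt failed.
wait_until() {
  _wu_attempts=$1
  _wu_delay=$2
  shift 2
  _wu_i=1
  while [ "$_wu_i" -le "$_wu_attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$_wu_i" -lt "$_wu_attempts" ]; then
      sleep "$_wu_delay"
    fi
    _wu_i=$((_wu_i + 1))
  done
  return 1
}

# Example (assumes the admin port mapping from the docker command above):
#   wait_until 30 1 curl -sf http://localhost:9901/ready && echo "Envoy is ready"
```

`curl -f` turns a non-2xx response into a non-zero exit status, which is what lets the loop distinguish "not ready yet" from "ready".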
Install on Linux
# Debian/Ubuntu
sudo apt-get update
sudo apt-get install -y debian-keyring debian-archive-keyring apt-transport-https curl lsb-release
curl -sL 'https://deb.dl.getenvoy.io/public/gpg.8115BA8E629CC074.key' | sudo gpg --dearmor -o /usr/share/keyrings/getenvoy-keyring.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/getenvoy-keyring.gpg] https://deb.dl.getenvoy.io/public/deb/debian $(lsb_release -cs) main" | \
sudo tee /etc/apt/sources.list.d/getenvoy.list
sudo apt-get update && sudo apt-get install getenvoy-envoy
# macOS
brew tap tetratelabs/getenvoy
brew install envoy
# Verify
envoy --version
Install func-e (version manager)
curl https://func-e.io/install.bash | bash -s -- -b /usr/local/bin
func-e use 1.30.0
func-e run --version
Configuration
Minimal Static Configuration
# envoy.yaml: simple HTTP proxy
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          access_log:
          - name: envoy.access_loggers.stdout
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  host_rewrite_literal: www.envoyproxy.io
                  cluster: service_envoyproxy_io
  clusters:
  - name: service_envoyproxy_io
    type: LOGICAL_DNS
    dns_lookup_family: V4_ONLY
    load_assignment:
      cluster_name: service_envoyproxy_io
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: www.envoyproxy.io
                port_value: 443
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        sni: www.envoyproxy.io
admin:
  address:
    socket_address:
      # 0.0.0.0 lets the Docker port mapping reach the admin API;
      # bind 127.0.0.1 when running outside a container
      address: 0.0.0.0
      port_value: 9901
xDS Bootstrap for Dynamic Configuration
# Bootstrap pointing to a control plane (e.g., Istio Pilot, go-control-plane)
node:
  id: envoy-node-1
  cluster: my-cluster
dynamic_resources:
  ads_config:
    api_type: GRPC
    transport_api_version: V3
    grpc_services:
    - envoy_grpc:
        cluster_name: xds_cluster
  cds_config:
    resource_api_version: V3
    ads: {}
  lds_config:
    resource_api_version: V3
    ads: {}
static_resources:
  clusters:
  - name: xds_cluster
    type: STATIC
    connect_timeout: 1s
    # xDS runs over gRPC, so the control-plane cluster must speak HTTP/2
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}
    load_assignment:
      cluster_name: xds_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 15010
admin:
  address:
    socket_address:
      address: 127.0.0.1
      port_value: 9901
Core Commands
| Command | Description |
|---|---|
| `envoy -c config.yaml` | Start Envoy with a config file |
| `envoy -c config.yaml --log-level debug` | Start with debug logging |
| `envoy --mode validate -c config.yaml` | Validate config without starting |
| `envoy -c config.yaml --drain-time-s 60` | Set drain time on hot restart |
| `curl localhost:9901/ready` | Check admin readiness |
| `curl localhost:9901/stats` | Dump all statistics |
| `curl 'localhost:9901/stats?filter=http'` | Filter stats by regular expression |
| `curl localhost:9901/stats/prometheus` | Prometheus-formatted metrics |
| `curl localhost:9901/clusters` | List upstream clusters and health |
| `curl localhost:9901/listeners` | List active listeners |
| `curl localhost:9901/config_dump` | Dump the full running configuration |
| `curl 'localhost:9901/config_dump?resource=dynamic_route_configs'` | Dump a single repeated field of the config dump (here, dynamic routes) |
| `curl -X POST 'localhost:9901/logging?level=debug'` | Change log level at runtime |
| `curl -X POST localhost:9901/healthcheck/fail` | Force health check failure |
| `curl -X POST localhost:9901/healthcheck/ok` | Restore health check |
| `curl localhost:9901/runtime` | View runtime configuration |
| `curl -X POST localhost:9901/quitquitquit` | Gracefully shut down Envoy |
Advanced Usage
Rate Limiting Filter
# Local rate limit: 100 req/min for all requests through this HTTP connection manager
# (to limit a single route, override the filter with typed_per_filter_config on that route)
http_filters:
- name: envoy.filters.http.local_ratelimit
  typed_config:
    "@type": type.googleapis.com/udpa.type.v1.TypedStruct
    type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
    value:
      stat_prefix: http_local_rate_limiter
      token_bucket:
        max_tokens: 100
        tokens_per_fill: 100
        fill_interval: 60s
      filter_enabled:
        runtime_key: local_rate_limit_enabled
        default_value:
          numerator: 100
          denominator: HUNDRED
      filter_enforced:
        runtime_key: local_rate_limit_enforced
        default_value:
          numerator: 100
          denominator: HUNDRED
# envoy.filters.http.router must still be the last entry in http_filters
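To build intuition for how `max_tokens`, `tokens_per_fill`, and `fill_interval` interact, the sketch below simulates a token bucket over a list of request timestamps. It is a simplified model, not Envoy's implementation: it assumes the bucket starts full and that refills happen on whole intervals counted from time 0. `simulate_bucket` is a hypothetical helper name.

```shell
# simulate_bucket MAX_TOKENS TOKENS_PER_FILL FILL_INTERVAL_SECONDS
# Reads one request timestamp (integer seconds, ascending) per line on stdin
# and prints "<timestamp> allow" or "<timestamp> deny" for each request.
simulate_bucket() {
  awk -v max="$1" -v per_fill="$2" -v interval="$3" '
    BEGIN { tokens = max; last_fill = 0 }
    {
      t = $1
      # Add tokens for every whole fill interval that has elapsed.
      fills = int((t - last_fill) / interval)
      if (fills > 0) {
        tokens += fills * per_fill
        if (tokens > max) tokens = max
        last_fill += fills * interval
      }
      if (tokens > 0) {
        tokens--
        print t, "allow"
      } else {
        print t, "deny"
      }
    }'
}

# Example with a small bucket so the denial is easy to see:
#   printf '0\n1\n2\n61\n' | simulate_bucket 2 2 60
```

With the settings from the filter above (100, 100, 60), the 101st request inside the same minute would be the first one denied; with both `filter_enabled` and `filter_enforced` at 100%, Envoy answers such requests with HTTP 429.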
Circuit Breaker
# Cluster-level circuit breaker settings
clusters:
- name: my_backend
  type: STRICT_DNS
  circuit_breakers:
    thresholds:
    - priority: DEFAULT
      max_connections: 1000
      max_pending_requests: 1000
      max_requests: 1000
      max_retries: 3
      track_remaining: true
  outlier_detection:
    consecutive_5xx: 5
    interval: 10s
    base_ejection_time: 30s
    max_ejection_percent: 50
    success_rate_minimum_hosts: 5
    success_rate_request_volume: 100
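Each threshold above has a matching gauge under `cluster.<name>.circuit_breakers.<priority>.` that reads 1 while the breaker is open. As a rough sketch, the filter below picks the open breakers out of the text that `/stats` returns. `open_breakers` is a hypothetical helper name, and the input is assumed to be the default `name: value` text format.

```shell
# open_breakers
# Reads admin `/stats` text ("name: value" per line) on stdin and prints the
# name of every circuit-breaker gauge that is currently open (value 1).
open_breakers() {
  awk -F': ' '
    $1 ~ /^cluster\..*\.circuit_breakers\..*\.(cx|cx_pool|rq|rq_pending|rq_retry)_open$/ && $2 == 1 {
      print $1
    }'
}

# Example (assumes the admin address used elsewhere in this document):
#   curl -s localhost:9901/stats | open_breakers
```

Because `track_remaining: true` is set, the `remaining_cx`, `remaining_pending`, `remaining_rq`, and `remaining_retries` gauges are also populated and show how much headroom is left before a breaker opens.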
Retry Policy
routes:
- match:
    prefix: "/api"
  route:
    cluster: backend
    retry_policy:
      retry_on: "5xx,connect-failure,retriable-4xx"
      num_retries: 3
      per_try_timeout: 5s
      retry_back_off:
        base_interval: 250ms
        max_interval: 1s
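Envoy's documentation describes the retry delay as fully jittered exponential back-off: for base interval B and retry number N, the delay is drawn from the range [0, (2^N - 1) * B), capped at `max_interval`. The sketch below computes only the upper bound of that range for a given retry, so the effect of `base_interval` and `max_interval` is easy to check. `backoff_upper_ms` is a hypothetical helper, and values are in whole milliseconds.

```shell
# backoff_upper_ms BASE_MS MAX_MS RETRY_NUMBER
# Prints the upper bound (in ms) of the jittered back-off window for the
# given retry, i.e. min((2^N - 1) * BASE_MS, MAX_MS).
backoff_upper_ms() {
  _bo_upper=$(( $1 * ((1 << $3) - 1) ))
  if [ "$_bo_upper" -gt "$2" ]; then
    _bo_upper=$2
  fi
  echo "$_bo_upper"
}

# Example for the policy above (base_interval 250ms, max_interval 1s):
#   for n in 1 2 3; do backoff_upper_ms 250 1000 "$n"; done
```

For the policy above, the third retry would otherwise wait up to 1750 ms, so `max_interval: 1s` is what bounds it.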
Header-Based Routing
virtual_hosts:
- name: my_vhost
  domains: ["api.example.com"]
  routes:
  # Requests to /v2 carrying "x-canary: true" always go to backend_v2
  - match:
      prefix: "/v2"
      headers:
      - name: "x-canary"
        string_match:
          exact: "true"
    route:
      cluster: backend_v2
  # Everything else is split 90/10 between the two backends
  - match:
      prefix: "/"
    route:
      weighted_clusters:
        clusters:
        - name: backend_v1
          weight: 90
        - name: backend_v2
          weight: 10
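A weighted split can be pictured as mapping a random number onto consecutive weight ranges. The sketch below models that idea for the 90/10 split above. It is an illustration only, not Envoy's selection code, and `pick_cluster` and its `NAME:WEIGHT` argument format are hypothetical.

```shell
# pick_cluster RANDOM_VALUE NAME:WEIGHT [NAME:WEIGHT...]
# Maps RANDOM_VALUE (any non-negative integer) onto the weight ranges in
# argument order and prints the selected cluster name.
pick_cluster() {
  _pc_value=$1
  shift
  # First pass: total weight.
  _pc_total=0
  for _pc_entry in "$@"; do
    _pc_total=$((_pc_total + ${_pc_entry##*:}))
  done
  _pc_value=$((_pc_value % _pc_total))
  # Second pass: find the range that contains the value.
  _pc_upper=0
  for _pc_entry in "$@"; do
    _pc_upper=$((_pc_upper + ${_pc_entry##*:}))
    if [ "$_pc_value" -lt "$_pc_upper" ]; then
      echo "${_pc_entry%%:*}"
      return 0
    fi
  done
}

# Example: values 0..89 land on backend_v1, 90..99 on backend_v2
#   pick_cluster 42 backend_v1:90 backend_v2:10
```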
mTLS Upstream
transport_socket:
  name: envoy.transport_sockets.tls
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
    common_tls_context:
      tls_certificates:
      - certificate_chain:
          filename: /certs/client.crt
        private_key:
          filename: /certs/client.key
      validation_context:
        trusted_ca:
          filename: /certs/ca.crt
    sni: backend.internal
Common Workflows
Inspect Live Configuration
# Dump full running config
curl -s localhost:9901/config_dump | jq .
# Extract active routes
curl -s localhost:9901/config_dump | \
jq '.configs[] | select(.["@type"] | contains("RoutesConfigDump"))'
# Check cluster health and stats
curl -s localhost:9901/clusters | grep -E "health_flags|cx_active|rq_total"
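The text output of `/clusters` uses `::` as a field separator, with one `health_flags` line per host. As a small sketch, the filter below lists only the hosts whose flags are something other than `healthy`. `unhealthy_hosts` is a hypothetical helper, and it assumes IPv4 or hostname addresses (an IPv6 literal containing `::` would confuse the field split).

```shell
# unhealthy_hosts
# Reads admin `/clusters` text on stdin and prints
# "<cluster> <host:port> <flags>" for every host that is not healthy.
unhealthy_hosts() {
  awk -F'::' '$3 == "health_flags" && $4 != "healthy" { print $1, $2, $4 }'
}

# Example (assumes the admin address used elsewhere in this document):
#   curl -s localhost:9901/clusters | unhealthy_hosts
```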
Debugging Request Flow
# Enable debug logging for a specific component
curl -X POST "localhost:9901/logging?http=debug"
curl -X POST "localhost:9901/logging?router=debug"
curl -X POST "localhost:9901/logging?connection=debug"
# Tail access logs (if writing to file)
tail -f /var/log/envoy/access.log | jq .
# Check upstream response codes
curl -s localhost:9901/stats | grep upstream_rq_2xx
curl -s localhost:9901/stats | grep upstream_rq_5xx
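Raw counters are easier to act on as a ratio. The sketch below derives a 5xx percentage for one cluster from the `upstream_rq_5xx` and `upstream_rq_completed` counters. `error_rate` is a hypothetical helper; the counters are cumulative since startup, so this is a lifetime figure rather than a recent rate.

```shell
# error_rate CLUSTER_NAME
# Reads admin `/stats` text on stdin and prints the percentage of completed
# upstream requests for CLUSTER_NAME that returned 5xx, or "n/a" if the
# cluster has not completed any requests.
error_rate() {
  awk -F': ' -v c="$1" '
    $1 == ("cluster." c ".upstream_rq_5xx")       { bad = $2 }
    $1 == ("cluster." c ".upstream_rq_completed") { total = $2 }
    END {
      if (total > 0) printf "%.1f\n", 100 * bad / total
      else print "n/a"
    }'
}

# Example (the cluster name "backend" is illustrative):
#   curl -s localhost:9901/stats | error_rate backend
```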
Hot Restart (Zero Downtime Config Reload)
# Start initial process with drain time
envoy -c config.yaml --drain-time-s 30 --restart-epoch 0
# Send new config via hot restart
envoy -c new-config.yaml --drain-time-s 30 --restart-epoch 1
# Old process drains connections, new one takes over
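The restart epoch has to increase by one on every hot restart and go back to 0 on a fresh start, which is easy to get wrong by hand. One possible way to track it is a small counter file, sketched below. `next_epoch` and the file path in the example are hypothetical; Envoy does not read or write such a file itself.

```shell
# next_epoch FILE
# Prints the epoch to use for the next Envoy start (0 if FILE does not
# exist) and stores the following epoch in FILE.
next_epoch() {
  _ne_epoch=0
  if [ -f "$1" ]; then
    _ne_epoch=$(cat "$1")
  fi
  echo "$((_ne_epoch + 1))" > "$1"
  echo "$_ne_epoch"
}

# Example (illustrative path; delete the file whenever no Envoy process is
# left running, so the next start uses epoch 0 again):
#   envoy -c new-config.yaml --drain-time-s 30 \
#     --restart-epoch "$(next_epoch /var/run/envoy.epoch)"
```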
Prometheus Metrics Scrape
# Scrape Envoy metrics
curl -s localhost:9901/stats/prometheus | grep -E "^envoy_http_"
# Key metrics to monitor:
# envoy_http_downstream_rq_total — total requests
# envoy_cluster_upstream_rq_retry — retried requests
# envoy_cluster_circuit_breakers_* — circuit breaker status
# envoy_cluster_upstream_cx_connect_ms — connect latency
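Cluster metrics are emitted once per cluster, distinguished by labels, so a quick total needs the series summed. The sketch below adds up every series of one metric from the Prometheus text output. `sum_metric` is a hypothetical helper, and it assumes each sample line ends with the value (no trailing timestamp).

```shell
# sum_metric METRIC_NAME
# Reads Prometheus text exposition on stdin and prints the sum of all
# samples of METRIC_NAME across label sets.
sum_metric() {
  awk -v m="$1" '
    /^#/ { next }                      # skip HELP/TYPE comment lines
    {
      brace = index($1, "{")
      name = (brace > 0) ? substr($1, 1, brace - 1) : $1
      if (name == m) total += $NF
    }
    END { print total + 0 }'
}

# Example (assumes the admin address used elsewhere in this document):
#   curl -s localhost:9901/stats/prometheus | sum_metric envoy_cluster_upstream_rq_retry
```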
Tips and Best Practices
- Use the admin API cautiously in production: bind it to `localhost:9901` only and never expose it to external traffic, because it allows config dumps and shutdown.
- Validate configs before deploying: `envoy --mode validate -c config.yaml` catches syntax errors without starting the proxy.
- Monitor `upstream_cx_overflow` and `circuit_breakers_*` stats: these indicate upstream pressure before user-visible errors occur.
- Set `connect_timeout` on every cluster: when unset it defaults to 5s, so choose an explicit value (e.g., `1s`) that matches your backend and prevents connection stalls.
- Use typed extensions: always use the `typed_config` field with the full `@type` URL; the older `config` field is deprecated.
- Prefer ADS (Aggregated Discovery Service) for xDS: a single gRPC stream is more efficient and avoids race conditions between resource types.
- Use `health_checks` on clusters: active health checks let Envoy eject unhealthy upstreams before a request fails.
- Bound gRPC routes with a timeout: `max_grpc_timeout` is deprecated in favor of `max_stream_duration` with `grpc_timeout_header_max`; either prevents unbounded streaming calls from holding connections forever.
- Tune connection pool settings per cluster: HTTP/1.1 and HTTP/2 pools have different tuning knobs, so match them to your backend's protocol.
- Use JSON access log format: structured logs integrate with log aggregation pipelines far more reliably than text formats.