Envoy Proxy

Envoy is an open-source, high-performance L7 proxy designed for cloud-native applications. It serves as the data plane for service meshes like Istio and provides advanced load balancing, observability, and resiliency features without requiring application changes.

Installation

Run with Docker

# Run Envoy with a config file
docker run --rm -it \
  -v $(pwd)/envoy.yaml:/etc/envoy/envoy.yaml \
  -p 10000:10000 \
  -p 9901:9901 \
  envoyproxy/envoy:v1.30.0 \
  envoy -c /etc/envoy/envoy.yaml

# Check admin interface
curl http://localhost:9901/ready
curl http://localhost:9901/stats

Install on Linux

# Debian/Ubuntu
sudo apt-get update
sudo apt-get install -y debian-keyring debian-archive-keyring apt-transport-https curl lsb-release
curl -sL 'https://deb.dl.getenvoy.io/public/gpg.8115BA8E629CC074.key' | sudo gpg --dearmor -o /usr/share/keyrings/getenvoy-keyring.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/getenvoy-keyring.gpg] https://deb.dl.getenvoy.io/public/deb/debian $(lsb_release -cs) main" | \
  sudo tee /etc/apt/sources.list.d/getenvoy.list
sudo apt-get update && sudo apt-get install getenvoy-envoy

# macOS
brew tap tetratelabs/getenvoy
brew install envoy

# Verify
envoy --version

Install func-e (version manager)

curl https://func-e.io/install.bash | bash -s -- -b /usr/local/bin
func-e use 1.30.0
func-e run --version

Configuration

Minimal Static Configuration

# envoy.yaml — simple HTTP proxy
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            host_rewrite_literal: www.envoyproxy.io
                            cluster: service_envoyproxy_io

  clusters:
    - name: service_envoyproxy_io
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      load_assignment:
        cluster_name: service_envoyproxy_io
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: www.envoyproxy.io
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          sni: www.envoyproxy.io

admin:
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901

xDS Bootstrap for Dynamic Configuration

# Bootstrap pointing to a control plane (e.g., Istio Pilot, go-control-plane)
node:
  id: envoy-node-1
  cluster: my-cluster

dynamic_resources:
  ads_config:
    api_type: GRPC
    transport_api_version: V3
    grpc_services:
      - envoy_grpc:
          cluster_name: xds_cluster
  cds_config:
    resource_api_version: V3
    ads: {}
  lds_config:
    resource_api_version: V3
    ads: {}

static_resources:
  clusters:
    - name: xds_cluster
      type: STATIC
      connect_timeout: 1s
      load_assignment:
        cluster_name: xds_cluster
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: 127.0.0.1
                      port_value: 15010

admin:
  address:
    socket_address:
      address: 127.0.0.1
      port_value: 9901

Core Commands

Command	Description
`envoy -c config.yaml`	Start Envoy with a config file
`envoy -c config.yaml --log-level debug`	Start with debug logging
`envoy --mode validate -c config.yaml`	Validate config without starting
`envoy -c config.yaml --drain-time-s 60`	Set drain time on hot restart
`curl localhost:9901/ready`	Check admin readiness
`curl localhost:9901/stats`	Dump all statistics
`curl localhost:9901/stats?filter=http`	Filter stats by keyword
`curl localhost:9901/stats/prometheus`	Prometheus-formatted metrics
`curl localhost:9901/clusters`	List upstream clusters and health
`curl localhost:9901/listeners`	List active listeners
`curl localhost:9901/config_dump`	Dump full xDS config
`curl localhost:9901/config_dump?resource=routes`	Dump specific resource type
`curl -X POST localhost:9901/logging?level=debug`	Change log level at runtime
`curl -X POST localhost:9901/healthcheck/fail`	Force health check failure
`curl -X POST localhost:9901/healthcheck/ok`	Restore health check
`curl localhost:9901/runtime`	View runtime configuration
`curl -X POST localhost:9901/quitquitquit`	Gracefully shut down Envoy

Advanced Usage

Rate Limiting Filter

# Local rate limit — 100 req/min per route
http_filters:
  - name: envoy.filters.http.local_ratelimit
    typed_config:
      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
      type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
      value:
        stat_prefix: http_local_rate_limiter
        token_bucket:
          max_tokens: 100
          tokens_per_fill: 100
          fill_interval: 60s
        filter_enabled:
          runtime_key: local_rate_limit_enabled
          default_value:
            numerator: 100
            denominator: HUNDRED
        filter_enforced:
          runtime_key: local_rate_limit_enforced
          default_value:
            numerator: 100
            denominator: HUNDRED

Circuit Breaker

# Cluster-level circuit breaker settings
clusters:
  - name: my_backend
    type: STRICT_DNS
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_connections: 1000
          max_pending_requests: 1000
          max_requests: 1000
          max_retries: 3
          track_remaining: true
    outlier_detection:
      consecutive_5xx: 5
      interval: 10s
      base_ejection_time: 30s
      max_ejection_percent: 50
      success_rate_minimum_hosts: 5
      success_rate_request_volume: 100

Retry Policy

routes:
  - match:
      prefix: "/api"
    route:
      cluster: backend
      retry_policy:
        retry_on: "5xx,connect-failure,retriable-4xx"
        num_retries: 3
        per_try_timeout: 5s
        retry_back_off:
          base_interval: 250ms
          max_interval: 1s

Header-Based Routing

virtual_hosts:
  - name: my_vhost
    domains: ["api.example.com"]
    routes:
      - match:
          prefix: "/v2"
          headers:
            - name: "x-canary"
              string_match:
                exact: "true"
        route:
          cluster: backend_v2
          weight: 100
      - match:
          prefix: "/"
        route:
          weighted_clusters:
            clusters:
              - name: backend_v1
                weight: 90
              - name: backend_v2
                weight: 10

mTLS Upstream

transport_socket:
  name: envoy.transport_sockets.tls
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
    common_tls_context:
      tls_certificates:
        - certificate_chain:
            filename: /certs/client.crt
          private_key:
            filename: /certs/client.key
      validation_context:
        trusted_ca:
          filename: /certs/ca.crt
    sni: backend.internal

Common Workflows

Inspect Live Configuration

# Dump full running config
curl -s localhost:9901/config_dump | jq .

# Extract active routes
curl -s localhost:9901/config_dump | \
  jq '.configs[] | select(.["@type"] | contains("RoutesConfigDump"))'

# Check cluster health and stats
curl -s localhost:9901/clusters | grep -E "health_flags|cx_active|rq_total"

Debugging Request Flow

# Enable debug logging for a specific component
curl -X POST "localhost:9901/logging?http=debug"
curl -X POST "localhost:9901/logging?router=debug"
curl -X POST "localhost:9901/logging?connection=debug"

# Tail access logs (if writing to file)
tail -f /var/log/envoy/access.log | jq .

# Check upstream response codes
curl -s localhost:9901/stats | grep upstream_rq_2xx
curl -s localhost:9901/stats | grep upstream_rq_5xx

Hot Restart (Zero Downtime Config Reload)

# Start initial process with drain time
envoy -c config.yaml --drain-time-s 30 --restart-epoch 0

# Send new config via hot restart
envoy -c new-config.yaml --drain-time-s 30 --restart-epoch 1

# Old process drains connections, new one takes over

Prometheus Metrics Scrape

# Scrape Envoy metrics
curl -s localhost:9901/stats/prometheus | grep -E "^envoy_http_"

# Key metrics to monitor:
# envoy_http_downstream_rq_total          — total requests
# envoy_cluster_upstream_rq_retry         — retried requests
# envoy_cluster_circuit_breakers_*        — circuit breaker status
# envoy_cluster_upstream_cx_connect_ms    — connect latency

Tips and Best Practices

Use admin API cautiously in production — bind localhost:9901 only, never expose to external traffic; it allows config dumps and shutdown.
Validate configs before deploying — envoy --mode validate -c config.yaml catches syntax errors without starting the proxy.
Monitor upstream_cx_overflow and circuit_breakers_* stats — these indicate upstream pressure before user-visible errors occur.
Set connect_timeout on every cluster — the default is unlimited; always set an explicit timeout (e.g., 1s) to prevent connection stalls.
Use typed extensions — always use the typed_config field with the full @type URL; the older config field is deprecated.
Prefer ADS (Aggregated Discovery Service) for xDS — a single gRPC stream is more efficient and avoids race conditions between resource types.
Use health_checks on clusters — active health checks let Envoy eject unhealthy upstreams before a request fails.
Set max_grpc_timeout on gRPC routes — prevents unbounded streaming calls from holding connections forever.
Tune connection_pool per cluster — HTTP/1.1 and HTTP/2 pools have different tuning knobs; match to your backend’s protocol.
Use access log format JSON — structured logs integrate with log aggregation pipelines far more reliably than text formats.