Overview
Vitess is a database clustering system designed for horizontal scaling of MySQL. Originally developed at YouTube to solve their massive scaling challenges, it was later open-sourced and donated to the CNCF, reaching graduated status. Vitess adds a middleware layer between your application and MySQL, providing connection pooling, query routing, schema management, and transparent sharding without requiring application-level changes.
Vitess combines the reliability and feature set of MySQL with the scalability of a NoSQL database. It deploys well on Kubernetes and bare metal, supports both sharded and unsharded databases, and provides online schema migrations, backups, and topology management. Major users include Slack, Square, GitHub, HubSpot, and PlanetScale (which is built on Vitess).
Installation
Docker Compose (Development)
# Clone Vitess repository
git clone https://github.com/vitessio/vitess.git
cd vitess/examples/compose
# Start local cluster
docker-compose up -d
# Connect via MySQL protocol
mysql -h 127.0.0.1 -P 15306 -u user
Kubernetes (Operator)
# Install Vitess Operator
kubectl apply -f https://github.com/planetscale/vitess-operator/releases/latest/download/operator.yaml
# Apply cluster configuration
kubectl apply -f cluster.yaml
Local Development
# macOS
brew install vitess
# Build from source
git clone https://github.com/vitessio/vitess.git
cd vitess && make build
# Verify
vtctldclient --version
vtgate --version
Architecture Components
| Component | Description |
|---|
vtgate | Query router / proxy (application connects here) |
vttablet | MySQL sidecar managing each MySQL instance |
vtctld | Cluster management server |
vtctldclient | CLI for cluster administration |
vtadmin | Web-based admin dashboard |
topology server | Metadata store (etcd, ZooKeeper, Consul) |
Core Concepts
Keyspaces and Shards
# A keyspace is the logical database (equivalent to MySQL database)
# Shards divide the keyspace for horizontal scaling
# Create keyspace
vtctldclient CreateKeyspace --durability-policy=semi_sync commerce
# List keyspaces
vtctldclient GetKeyspaces
# Get keyspace info
vtctldclient GetKeyspace commerce
VSchema (Vitess Schema)
{
"sharded": true,
"vindexes": {
"hash": {
"type": "hash"
},
"lookup_unique": {
"type": "lookup_unique",
"params": {
"table": "user_lookup",
"from": "email",
"to": "user_id"
},
"owner": "users"
}
},
"tables": {
"users": {
"column_vindexes": [
{
"column": "id",
"name": "hash"
},
{
"column": "email",
"name": "lookup_unique"
}
]
},
"orders": {
"column_vindexes": [
{
"column": "user_id",
"name": "hash"
}
]
},
"products": {
"type": "reference"
}
}
}
# Apply VSchema
vtctldclient ApplyVSchema --vschema-file=vschema.json commerce
Vindexes (Sharding Functions)
| Vindex Type | Description | Use Case |
|---|
hash | Hashes column for distribution | Primary key sharding |
xxhash | xxHash-based distribution | String column sharding |
lookup | Cross-shard lookup table | Secondary index lookup |
lookup_unique | Unique cross-shard lookup | Unique constraints across shards |
consistent_lookup | Consistent hash lookup | Read-heavy lookups |
numeric | Direct numeric mapping | Pre-partitioned data |
region_json | Region-based routing | Geo-distributed data |
SQL Operations
Connecting
# Connect through vtgate (MySQL protocol)
mysql -h vtgate-host -P 15306 -u user -p
# Using standard MySQL drivers
# Connection string: user:pass@tcp(vtgate-host:15306)/keyspace
Supported SQL
-- DDL (routed through vtctld for online schema changes)
CREATE TABLE users (
id BIGINT NOT NULL AUTO_INCREMENT,
email VARCHAR(255) NOT NULL,
name VARCHAR(100),
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id),
UNIQUE KEY idx_email (email)
) ENGINE=InnoDB;
-- Standard DML
INSERT INTO users (email, name) VALUES ('alice@example.com', 'Alice');
SELECT * FROM users WHERE id = 1;
SELECT * FROM users WHERE email = 'alice@example.com';
UPDATE users SET name = 'Alice Smith' WHERE id = 1;
DELETE FROM users WHERE id = 1;
-- Cross-shard queries (scatter queries)
SELECT COUNT(*) FROM users;
SELECT * FROM users ORDER BY created_at DESC LIMIT 10;
-- Joins (co-located data preferred)
SELECT u.name, o.total
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.id = 123;
-- Transactions (single-shard preferred)
BEGIN;
INSERT INTO orders (user_id, total) VALUES (123, 99.99);
UPDATE users SET order_count = order_count + 1 WHERE id = 123;
COMMIT;
Administration
Tablet Management
# List tablets
vtctldclient GetTablets
# Get tablet info
vtctldclient GetTablet zone1-100
# Ping tablet
vtctldclient PingTablet zone1-100
# Reparent (failover)
vtctldclient PlannedReparentShard commerce/0 --new-primary=zone1-101
vtctldclient EmergencyReparentShard commerce/0
# Set tablet type
vtctldclient ChangeTabletType zone1-100 rdonly
Schema Management
# Apply schema change (online DDL)
vtctldclient ApplySchema --sql="ALTER TABLE users ADD COLUMN age INT" commerce
# Online DDL strategies
vtctldclient ApplySchema --ddl-strategy="vitess" \
--sql="ALTER TABLE users ADD INDEX idx_name (name)" commerce
vtctldclient ApplySchema --ddl-strategy="gh-ost" \
--sql="ALTER TABLE users ADD COLUMN status VARCHAR(20)" commerce
# Check migration status
vtctldclient GetSchemaMigrations commerce
# Show schema
vtctldclient GetSchema zone1-100
Resharding
# Move tables between keyspaces
vtctldclient MoveTables --workflow=commerce2customer \
--target-keyspace=customer create \
--source-keyspace=commerce --tables="users,orders"
# Reshard (split shards)
vtctldclient Reshard --workflow=reshard_2to4 \
--target-keyspace=customer create \
--source-shards="-80,80-" \
--target-shards="-40,40-80,80-c0,c0-"
# Monitor progress
vtctldclient Workflow --keyspace=customer show --workflow=reshard_2to4
# Switch traffic
vtctldclient MoveTables --workflow=commerce2customer \
--target-keyspace=customer switchtraffic
# Complete migration
vtctldclient MoveTables --workflow=commerce2customer \
--target-keyspace=customer complete
Backups
# Create backup
vtctldclient Backup zone1-100
# List backups
vtctldclient GetBackups commerce/-80
# Restore from backup
vtctldclient RestoreFromBackup zone1-100
Configuration
VTGate Configuration
vtgate \
--topo_implementation=etcd2 \
--topo_global_server_address=etcd:2379 \
--topo_global_root=/vitess/global \
--cell=zone1 \
--cells_to_watch=zone1 \
--port=15001 \
--grpc_port=15991 \
--mysql_server_port=15306 \
--mysql_auth_server_impl=static \
--mysql_auth_server_static_file=auth.json \
--enable_buffer \
--buffer_size=1000 \
--buffer_min_time_between_failovers=20s
VTTablet Configuration
vttablet \
--topo_implementation=etcd2 \
--topo_global_server_address=etcd:2379 \
--topo_global_root=/vitess/global \
--tablet-path=zone1-100 \
--init_keyspace=commerce \
--init_shard=0 \
--init_tablet_type=replica \
--port=15100 \
--grpc_port=16100 \
--db_host=mysql-host \
--db_port=3306 \
--queryserver-config-pool-size=128 \
--queryserver-config-stream-pool-size=128 \
--queryserver-config-transaction-cap=300
Advanced Usage
Monitoring
# VTGate status
curl http://vtgate-host:15001/debug/status
# Tablet health
curl http://tablet-host:15100/healthz
# Query stats
curl http://vtgate-host:15001/debug/query_plans
# Topology info
vtctldclient GetCellInfoNames
vtctldclient GetCellInfo zone1
Connection Pooling Tuning
# VTTablet pool settings
--queryserver-config-pool-size=128 # Transaction pool
--queryserver-config-stream-pool-size=128 # Streaming pool
--queryserver-config-transaction-cap=300 # Max concurrent transactions
--queryserver-config-query-timeout=30 # Query timeout seconds
--queryserver-config-idle-timeout=1800 # Idle connection timeout
Troubleshooting
| Problem | Solution |
|---|
| Scatter query too slow | Add proper vindexes to VSchema; use targeted queries with shard key |
| Cross-shard transaction error | Redesign schema for co-located data; use BEGIN with single shard |
| Tablet not healthy | Check vtctldclient GetTablet; verify MySQL replication |
| Reparent failure | Use EmergencyReparentShard if planned fails; check topology |
| VSchema not applied | Verify JSON syntax; check vtctldclient GetVSchema |
| Online DDL stuck | Check GetSchemaMigrations; retry or cancel migration |
| Connection pool exhaustion | Increase pool size; check for long-running transactions |
| Resharding data mismatch | Verify vindex configuration; check VDiff results |