Weaviate Cheat Sheet
Overview
Weaviate is an AI-native vector database that can run embedding models as part of the database itself. Instead of requiring a separate embedding step, its vectorizer modules (text2vec-openai, text2vec-cohere, text2vec-transformers, multi2vec-clip) generate embeddings automatically on insert and at query time. This integration simplifies ingestion pipelines and lets incoming data be vectorized as it arrives.
Weaviate organizes data into collections (called classes in older versions, and analogous to tables), each with a schema that defines its properties and their data types. Collections support hybrid search, which combines vector similarity with BM25 keyword search and weights the two with a configurable alpha parameter. Generative modules (generative-openai, generative-cohere, generative-anthropic) run retrieval-augmented generation inside a database query: Weaviate calls the LLM provider on the client's behalf, so the application does not need its own round-trip to the LLM API.
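To make the role of alpha concrete, here is a minimal sketch of score fusion in plain Python. It assumes each result set is min-max normalized to [0, 1] and then blended as alpha * vector + (1 - alpha) * keyword; the function names and sample scores are hypothetical, and Weaviate's own fusion algorithms may differ in detail.

```python
def min_max(scores):
    """Scale a {doc_id: score} map to [0, 1]; a constant map becomes all 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_scores(vector_scores, bm25_scores, alpha):
    """Blend normalized scores: alpha=1.0 is pure vector, alpha=0.0 is pure BM25."""
    v = min_max(vector_scores) if vector_scores else {}
    b = min_max(bm25_scores) if bm25_scores else {}
    # A document missing from one result set contributes 0 for that side.
    return {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * b.get(doc_id, 0.0)
        for doc_id in set(v) | set(b)
    }


# Hypothetical scores (higher is better on both sides)
vector_scores = {"a": 0.9, "b": 0.5, "c": 0.1}
bm25_scores = {"a": 1.0, "b": 4.0, "d": 2.0}

fused = hybrid_scores(vector_scores, bm25_scores, alpha=0.75)
ranked = sorted(fused, key=fused.get, reverse=True)
print(ranked)
```

Lowering alpha toward 0.0 moves keyword-strong documents such as "b" up the ranking; raising it favors the nearest vectors.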
Weaviate is deployed via Docker, Kubernetes, or the managed Weaviate Cloud Service (WCS). It exposes both GraphQL and REST APIs, with official Python, JavaScript, Java, and Go clients. Multi-tenancy allows thousands of isolated tenant shards within a single collection, making it practical for SaaS applications.
Installation
Docker Compose
# Minimal — no vectorizer (bring your own embeddings)
cat > docker-compose.yml << 'EOF'
version: "3.9"
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051" # gRPC
    volumes:
      - weaviate_data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      DEFAULT_VECTORIZER_MODULE: none
      ENABLE_MODULES: ""
      CLUSTER_HOSTNAME: node1
volumes:
  weaviate_data:
EOF
# With OpenAI vectorizer + Generative module
cat > docker-compose-openai.yml << 'EOF'
version: "3.9"
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"
    volumes:
      - weaviate_data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      DEFAULT_VECTORIZER_MODULE: text2vec-openai
      ENABLE_MODULES: "text2vec-openai,generative-openai"
      OPENAI_APIKEY: ${OPENAI_API_KEY}
volumes:
  weaviate_data:
EOF
docker compose -f docker-compose-openai.yml up -d
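After the container starts, it can take a few seconds before Weaviate accepts requests. The sketch below polls the REST readiness endpoint (/v1/.well-known/ready) using only the standard library; the helper names, retry count, and delay are illustrative assumptions.

```python
import time
import urllib.request


def is_ready(base_url="http://localhost:8080", timeout=2.0):
    """Return True if the readiness endpoint answers with HTTP 200."""
    try:
        url = f"{base_url}/v1/.well-known/ready"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses
        # (urllib's URLError and HTTPError are OSError subclasses).
        return False


def wait_until_ready(base_url="http://localhost:8080", attempts=30, delay=1.0):
    """Poll until the instance is ready or the attempts run out."""
    for _ in range(attempts):
        if is_ready(base_url):
            return True
        time.sleep(delay)
    return False


# Usage (not executed here):
#   if not wait_until_ready():
#       raise SystemExit("Weaviate did not become ready in time")
```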
Python Client
pip install weaviate-client # v4 client (recommended)
pip install weaviate-client==3.26.7 # v3 client (legacy)
Weaviate Cloud Service (WCS)
import weaviate
from weaviate.auth import AuthApiKey
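# Newer v4 releases name this helper connect_to_weaviate_cloud();
# connect_to_wcs() is the older spelling and is deprecated there
client = weaviate.connect_to_wcs(
    cluster_url="https://your-cluster.weaviate.network",
    auth_credentials=AuthApiKey(api_key="your-wcs-api-key"),
    headers={"X-OpenAI-Api-Key": "sk-..."}
)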
Configuration
Client Initialization (v4)
import weaviate
from weaviate.auth import AuthApiKey
from weaviate.classes.init import AdditionalConfig, Timeout
# Local Docker
client = weaviate.connect_to_local(
host="localhost",
port=8080,
grpc_port=50051,
headers={"X-OpenAI-Api-Key": "sk-..."}
)
# Custom endpoint
client = weaviate.connect_to_custom(
http_host="localhost",
http_port=8080,
http_secure=False,
grpc_host="localhost",
grpc_port=50051,
grpc_secure=False,
auth_credentials=AuthApiKey("your-api-key"),
additional_config=AdditionalConfig(
timeout=Timeout(init=2, query=45, insert=120)
)
)
# Always close the client when done
client.close()
# Or use the client as a context manager, which closes it automatically
with weaviate.connect_to_local() as client:
    print(client.is_ready())
Schema / Collection Definition
import weaviate.classes as wvc
from weaviate.classes.config import Configure, Property, DataType
# Create collection with OpenAI vectorizer
articles = client.collections.create(
name="Article",
description="News articles for RAG",
vectorizer_config=Configure.Vectorizer.text2vec_openai(
model="text-embedding-3-small"
),
generative_config=Configure.Generative.openai(
model="gpt-4o-mini"
),
properties=[
Property(name="title", data_type=DataType.TEXT),
Property(name="body", data_type=DataType.TEXT),
Property(name="source", data_type=DataType.TEXT,
skip_vectorization=True), # Don't embed this field
Property(name="published", data_type=DataType.DATE),
Property(name="views", data_type=DataType.INT),
],
vector_index_config=Configure.VectorIndex.hnsw(
distance_metric=wvc.config.VectorDistances.COSINE,
ef_construction=128,
max_connections=64
)
)
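The COSINE metric configured above ranks objects by cosine distance, which is 1 minus the cosine similarity, so it ranges from 0 (same direction) to 2 (opposite direction). A minimal plain-Python sketch of that calculation, with a hypothetical helper name:

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity; 0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [1.0, 0.0]))   # same direction
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # orthogonal
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

Because only direction matters, scaling a vector does not change its cosine distance to another vector.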
Core Commands/API
| Method | Description |
|---|---|
| client.collections.create(name, ...) | Create a new collection with schema |
| client.collections.delete(name) | Delete a collection and all data |
| client.collections.get(name) | Get a collection object |
| client.collections.exists(name) | Check if collection exists |
| client.collections.list_all() | List all collections |
| collection.data.insert(properties) | Insert a single object |
| collection.data.insert_many(objects) | Batch insert objects |
| collection.data.update(uuid, properties) | Update object properties |
| collection.data.replace(uuid, properties) | Replace all object properties |
| collection.data.delete_by_id(uuid) | Delete object by UUID |
| collection.data.delete_many(where) | Batch delete by filter |
| collection.query.fetch_object_by_id(uuid) | Fetch object by UUID |
| collection.query.near_text(query, limit) | Semantic search by text |
| collection.query.near_vector(near_vector, limit) | Search by raw vector |
| collection.query.bm25(query, limit) | BM25 keyword search |
| collection.query.hybrid(query, alpha, limit) | Hybrid vector + BM25 search |
| collection.query.fetch_objects(limit, filters) | Fetch with structured filters |
| collection.generate.near_text(query, ...) | RAG: search + LLM generation |
| collection.generate.hybrid(query, ...) | RAG: hybrid search + generation |
| collection.aggregate.over_all() | Aggregate statistics |
| client.batch.dynamic() | Context manager for batch inserts |
Advanced Usage
Batch Import
import weaviate
import weaviate.classes as wvc
from weaviate.classes.data import DataObject
client = weaviate.connect_to_local(
headers={"X-OpenAI-Api-Key": "sk-..."}
)
articles = client.collections.get("Article")
# Batch insert — Weaviate handles embedding automatically
data_objects = [
DataObject(
properties={
"title": "Understanding Vector Databases",
"body": "Vector databases store embeddings and enable...",
"source": "techblog",
"published": "2024-01-15T00:00:00Z",
"views": 1250
}
),
DataObject(
properties={
"title": "RAG Pipeline Best Practices",
"body": "Retrieval-Augmented Generation combines...",
"source": "arxiv"
}
)
]
with client.batch.dynamic() as batch:
    for obj in data_objects:
        batch.add_object(
            collection="Article",
            properties=obj.properties
        )
# Check for failed inserts (the batch ran on client.batch, so failures are recorded there)
if client.batch.failed_objects:
    for failed in client.batch.failed_objects:
        print(f"Failed: {failed.message}")
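Re-running an import creates duplicates unless each object gets a stable ID. One common approach is to derive a deterministic UUID from the object's identifying fields and pass it as the uuid argument of batch.add_object. The client ships a helper for this (weaviate.util.generate_uuid5); the sketch below shows the same idea with the standard library only. The namespace string and the choice of identifying fields are assumptions.

```python
import uuid

# Hypothetical namespace for this dataset; any fixed UUID works as long as
# it stays the same across import runs.
ARTICLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "articles.example.com")


def article_uuid(source, title):
    """Derive a stable UUIDv5 from the fields that identify an article."""
    return str(uuid.uuid5(ARTICLE_NAMESPACE, f"{source}|{title}"))


first_run = article_uuid("techblog", "Understanding Vector Databases")
second_run = article_uuid("techblog", "Understanding Vector Databases")
other = article_uuid("arxiv", "RAG Pipeline Best Practices")

print(first_run == second_run)  # same input, same ID
print(first_run == other)       # different input, different ID
```

With stable IDs, a repeated import addresses the same objects instead of adding new copies.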
Hybrid Search with Filters
from weaviate.classes.query import Filter, MetadataQuery
# Pure semantic search
results = articles.query.near_text(
query="machine learning inference optimization",
limit=5,
return_metadata=MetadataQuery(score=True, distance=True)
)
# BM25 keyword search
results = articles.query.bm25(
query="vector database performance benchmarks",
limit=5,
query_properties=["title", "body"],
return_metadata=MetadataQuery(score=True)
)
# Hybrid search (alpha=1.0 = pure vector, alpha=0.0 = pure BM25)
results = articles.query.hybrid(
query="RAG pipeline with LangChain",
alpha=0.75,
limit=10,
filters=Filter.by_property("source").equal("techblog"),
return_metadata=MetadataQuery(score=True)
)
for obj in results.objects:
    print(f"[{obj.metadata.score:.4f}] {obj.properties['title']}")
# Complex filter: combine conditions with & (AND) or | (OR)
f = (
Filter.by_property("views").greater_than(500) &
Filter.by_property("source").contains_any(["techblog", "arxiv"])
)
results = articles.query.near_text("deep learning", limit=5, filters=f)
Generative (RAG) Queries
from weaviate.classes.generate import GenerativeConfig
from weaviate.classes.query import MetadataQuery
# Single-result generation: apply the prompt to each result individually
results = articles.generate.near_text(
    query="vector database comparison",
    limit=3,
    single_prompt="Summarize this article in one sentence: {body}",
    return_metadata=MetadataQuery(distance=True)
)
for obj in results.objects:
    print(f"Original: {obj.properties['title']}")
    print(f"Summary: {obj.generated}")
# Grouped generation: synthesize across all retrieved results
results = articles.generate.hybrid(
    query="best practices for RAG pipelines",
    alpha=0.6,
    limit=5,
    grouped_task="Write a comprehensive guide based on these articles.",
    grouped_properties=["title", "body"]
)
print(results.generated) # Combined synthesis
# Generative search with per-query model settings
# (generative_provider needs a recent v4 client; check that your version supports it)
results = articles.generate.near_text(
    query="database scaling strategies",
    limit=4,
    single_prompt="Extract key technical claims from: {body}",
    generative_provider=GenerativeConfig.openai(
        temperature=0.2,
        max_tokens=200
    )
)
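The difference between single_prompt and grouped_task is easiest to see by building the prompts by hand. The sketch below is a plain-Python illustration, not Weaviate's implementation: a single prompt is filled once per object from its properties, while a grouped task is sent once with the selected properties of all results as context. The helper names and the JSON context layout are assumptions.

```python
import json
import re


def fill_prompt(template, properties):
    """Replace each {name} placeholder with the matching property value."""
    def substitute(match):
        name = match.group(1)
        if name not in properties:
            raise KeyError(f"prompt references missing property: {name}")
        return str(properties[name])
    return re.sub(r"\{(\w+)\}", substitute, template)


def single_prompts(template, objects):
    """One prompt per retrieved object."""
    return [fill_prompt(template, obj) for obj in objects]


def grouped_prompt(task, objects, properties):
    """One prompt for the whole result set, limited to the chosen properties."""
    context = [{p: obj[p] for p in properties if p in obj} for obj in objects]
    return f"{task}\n\n{json.dumps(context, indent=2)}"


retrieved = [
    {"title": "Understanding Vector Databases", "body": "Vector databases store..."},
    {"title": "RAG Pipeline Best Practices", "body": "Retrieval-Augmented..."},
]

for prompt in single_prompts("Summarize this article in one sentence: {body}", retrieved):
    print(prompt)
print(grouped_prompt("Write a short guide based on these articles.", retrieved, ["title"]))
```

Restricting the grouped properties keeps the combined prompt small, which matters once limit grows.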
Multi-Tenancy
from weaviate.classes.config import Configure
# Enable multi-tenancy at collection creation
mt_collection = client.collections.create(
name="TenantDocs",
multi_tenancy_config=Configure.multi_tenancy(
enabled=True,
auto_tenant_creation=True # Create tenants on first use
),
vectorizer_config=Configure.Vectorizer.text2vec_openai()
)
# Create tenants explicitly
from weaviate.classes.tenants import Tenant, TenantActivityStatus
mt_collection.tenants.create([
Tenant(name="tenant_acme"),
Tenant(name="tenant_globex")
])
# Insert into a tenant
tenant_collection = mt_collection.with_tenant("tenant_acme")
tenant_collection.data.insert({"title": "ACME internal doc", "body": "..."})
# Search within a tenant
results = tenant_collection.query.near_text(
query="internal procedures",
limit=5
)
# Deactivate tenant (cold storage — data preserved, not searchable)
mt_collection.tenants.update([
Tenant(name="tenant_acme", activity_status=TenantActivityStatus.COLD)
])
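As a mental model, each tenant behaves like a private shard: queries only ever see that tenant's objects, and a deactivated tenant keeps its data but rejects reads and writes until it is reactivated. The toy in-memory class below illustrates those semantics only; the class and method names are hypothetical and nothing here reflects Weaviate's storage internals.

```python
class TenantStore:
    """Toy model: one private object list (a "shard") per tenant."""

    def __init__(self):
        self._shards = {}  # tenant name -> list of objects
        self._active = {}  # tenant name -> bool

    def create(self, name):
        self._shards.setdefault(name, [])
        self._active.setdefault(name, True)

    def set_active(self, name, active):
        if name not in self._shards:
            raise KeyError(name)
        self._active[name] = active

    def _require_active(self, name):
        if name not in self._shards:
            raise KeyError(name)
        if not self._active[name]:
            raise RuntimeError(f"tenant {name} is not active")

    def insert(self, tenant, obj):
        self._require_active(tenant)
        self._shards[tenant].append(obj)

    def search(self, tenant, keyword):
        """Case-insensitive title match, scoped to one tenant."""
        self._require_active(tenant)
        return [o for o in self._shards[tenant] if keyword.lower() in o["title"].lower()]


store = TenantStore()
store.create("tenant_acme")
store.create("tenant_globex")
store.insert("tenant_acme", {"title": "ACME internal doc"})
print(store.search("tenant_acme", "internal"))    # finds the ACME document
print(store.search("tenant_globex", "internal"))  # empty: other tenants are invisible
```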
Custom Vectors (BYOV)
# Skip vectorizer — insert your own embeddings
no_vec_collection = client.collections.create(
name="CustomEmbeddings",
vectorizer_config=Configure.Vectorizer.none()
)
import numpy as np
my_vector = np.random.rand(1536).tolist()
no_vec_collection.data.insert(
properties={"text": "my document"},
vector=my_vector
)
# Search with a custom vector
results = no_vec_collection.query.near_vector(
near_vector=my_vector,
limit=5
)
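When you bring your own vectors, every vector in a collection has to have the same dimension, so it is worth validating embeddings before they are sent. A minimal sketch with a hypothetical helper name; the finite-value check is an extra precaution, not a documented Weaviate requirement.

```python
import math


def check_vectors(vectors, expected_dim=None):
    """Validate dimensions and values; return the common dimension."""
    if not vectors:
        raise ValueError("no vectors given")
    dim = expected_dim if expected_dim is not None else len(vectors[0])
    for i, vec in enumerate(vectors):
        if len(vec) != dim:
            raise ValueError(f"vector {i} has dimension {len(vec)}, expected {dim}")
        if not all(math.isfinite(x) for x in vec):
            raise ValueError(f"vector {i} contains NaN or infinity")
    return dim


embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
print(check_vectors(embeddings))
```

Passing expected_dim (for example 1536 for the vector used above) also catches the case where a whole batch was produced by the wrong embedding model.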
Common Workflows
Backup and Restore
# The filesystem backend requires backup-filesystem in ENABLE_MODULES
# and BACKUP_FILESYSTEM_PATH set on the server
# Trigger backup via REST API (to local filesystem)
curl -X POST http://localhost:8080/v1/backups/filesystem \
  -H "Content-Type: application/json" \
  -d '{"id": "backup-2024-01-15", "include": ["Article"]}'
# Check backup status
curl http://localhost:8080/v1/backups/filesystem/backup-2024-01-15
# Restore (fails if the Article collection already exists on the target instance)
curl -X POST http://localhost:8080/v1/backups/filesystem/backup-2024-01-15/restore \
  -H "Content-Type: application/json" \
  -d '{"include": ["Article"]}'
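Backups run asynchronously, so scripts usually poll the status endpoint until it reports a terminal state. The sketch below separates the polling loop from the HTTP call so the loop can be exercised without a server. The helper names, attempt count, and delay are assumptions, and it treats only SUCCESS and FAILED as terminal.

```python
import json
import time
import urllib.request

TERMINAL_STATES = {"SUCCESS", "FAILED"}


def wait_for_backup(get_status, attempts=60, delay=2.0, sleep=time.sleep):
    """Call get_status() until it returns a terminal state or attempts run out."""
    for _ in range(attempts):
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        sleep(delay)
    raise TimeoutError("backup did not finish in time")


def filesystem_status_getter(backup_id, base_url="http://localhost:8080"):
    """Build a get_status callable for the filesystem backup backend."""
    url = f"{base_url}/v1/backups/filesystem/{backup_id}"

    def get_status():
        with urllib.request.urlopen(url, timeout=5) as resp:
            return json.load(resp)["status"]

    return get_status


# Usage against a running instance (not executed here):
#   status = wait_for_backup(filesystem_status_getter("backup-2024-01-15"))
```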
Schema Migration
# Add a property to existing collection
articles.config.add_property(
Property(name="language", data_type=DataType.TEXT)
)
# Update collection settings (e.g., BM25 parameters)
from weaviate.classes.config import Reconfigure
articles.config.update(
inverted_index_config=Reconfigure.inverted_index(
bm25_b=0.75,
bm25_k1=1.2
)
)
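To see what bm25_k1 and bm25_b control, here is the textbook Okapi BM25 score for a single query term: k1 sets how quickly repeated occurrences of a term saturate, and b sets how strongly long documents are penalized. This is a sketch of the standard formula; Weaviate's own scoring may differ in details, and the sample numbers are made up.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, n_docs_with_term,
                    k1=1.2, b=0.75):
    """Okapi BM25 contribution of one term to one document's score."""
    idf = math.log(1 + (n_docs - n_docs_with_term + 0.5) / (n_docs_with_term + 0.5))
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)


# Same term frequency, different document lengths (hypothetical corpus stats)
short_doc = bm25_term_score(tf=2, doc_len=50, avg_doc_len=100, n_docs=1000, n_docs_with_term=10)
long_doc = bm25_term_score(tf=2, doc_len=200, avg_doc_len=100, n_docs=1000, n_docs_with_term=10)
print(short_doc > long_doc)  # with b=0.75 the shorter document scores higher
```

Setting b to 0 removes length normalization entirely, which can help when document length carries no signal (for example, fixed-size chunks).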
Aggregate and Metrics
# Count objects
response = articles.aggregate.over_all(total_count=True)
print(f"Total articles: {response.total_count}")
# Aggregate with filter
from weaviate.classes.query import Filter
response = articles.aggregate.over_all(
filters=Filter.by_property("source").equal("techblog"),
total_count=True
)
Tips and Best Practices
| Tip | Details |
|---|---|
| Use v4 client | The v4 Python client (weaviate-client >= 4.0) has a cleaner API and better performance than v3 |
| Pass API keys in headers | Use headers={"X-OpenAI-Api-Key": "sk-..."} — never hardcode in schema config |
| Use skip_vectorization=True | Apply to text metadata properties (IDs, URLs, source names) to keep them out of the embedding input and reduce embedding cost |
| Tune hybrid alpha | Start at 0.75 (vector-heavy); adjust toward 0.5 for balanced keyword/semantic recall |
| Index properties selectively | Set index_filterable=False and index_searchable=False on properties that are never filtered or keyword-searched to save disk space |
| Batch with client.batch.dynamic() | Dynamic batching automatically tunes batch size; avoid manual batch size tuning |
| Use multi-tenancy for SaaS | Tenant-per-customer isolation scales to millions of tenants without separate deployments |
| Cold tenants for archival | Set inactive tenants to COLD status — data is preserved but excluded from search |
| Enable gRPC port 50051 | The v4 client uses gRPC for queries and batch imports, which is faster than REST; the port must be published and reachable from the client |
| Monitor with Prometheus | With PROMETHEUS_MONITORING_ENABLED set to true, Weaviate exports metrics at :2112/metrics; integrate with Grafana for production monitoring |