Weaviate Cheat Sheet

Overview

Weaviate is an AI-native vector database that can call embedding models on your behalf. Rather than requiring a separate embedding step in application code, Weaviate's vectorizer modules (text2vec-openai, text2vec-cohere, text2vec-transformers, multi2vec-clip) generate embeddings automatically on insert and query, either by calling a hosted provider API or by using a co-located inference container. This integration simplifies pipelines and lets incoming data be vectorized as it is written.

Weaviate organizes data into collections (called classes in older versions, and analogous to tables), each with a defined schema specifying properties and their data types. Collections support hybrid search that combines vector similarity with BM25 keyword search, weighted by a configurable alpha parameter. Generative modules (generative-openai, generative-cohere, generative-anthropic) enable retrieval-augmented generation directly within database queries: Weaviate forwards the retrieved objects to the configured LLM provider, so the client issues one request instead of a search call followed by a separate LLM call.
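
To make the role of alpha concrete, here is a minimal sketch of score blending. It assumes each result list is min-max normalized before weighting, which is similar in spirit to a relative-score fusion. It is an illustration only, not Weaviate's internal implementation; the function names and sample scores are made up for the example.

```python
def normalize(scores):
    """Min-max scale a {doc_id: score} mapping into [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(vector_scores, bm25_scores, alpha=0.75):
    """Blend two score maps: alpha weights the vector side, (1 - alpha) the BM25 side."""
    vec = normalize(vector_scores)
    kw = normalize(bm25_scores)
    docs = set(vec) | set(kw)
    blended = {
        doc: alpha * vec.get(doc, 0.0) + (1 - alpha) * kw.get(doc, 0.0)
        for doc in docs
    }
    # Highest blended score first
    return sorted(blended.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for four documents (higher is better on both sides)
vector_scores = {"a": 0.92, "b": 0.80, "c": 0.55}
bm25_scores = {"b": 7.1, "c": 3.0, "d": 1.2}

for doc, score in hybrid_scores(vector_scores, bm25_scores, alpha=0.75):
    print(f"{doc}: {score:.3f}")
```

With alpha=1.0 only the vector scores matter and with alpha=0.0 only the BM25 scores do; values in between let a document that ranks well on both sides overtake one that is strong on a single side.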

Weaviate is deployed via Docker, Kubernetes, or the managed Weaviate Cloud (formerly Weaviate Cloud Services, WCS). It exposes GraphQL, REST, and gRPC APIs, with official Python, JavaScript/TypeScript, Java, and Go clients. Multi-tenancy stores each tenant in its own isolated shard and allows thousands of tenants within a single collection, making it practical for SaaS applications.

Installation

Docker Compose

# Minimal — no vectorizer (bring your own embeddings)
cat > docker-compose.yml << 'EOF'
version: "3.9"
services:
  weaviate:
    image: semitechnologies/weaviate:latest   # pin a specific version tag in production
    ports:
      - "8080:8080"
      - "50051:50051"   # gRPC
    volumes:
      - weaviate_data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      DEFAULT_VECTORIZER_MODULE: none
      ENABLE_MODULES: ""
      CLUSTER_HOSTNAME: node1
volumes:
  weaviate_data:
EOF

# With OpenAI vectorizer + Generative module
cat > docker-compose-openai.yml << 'EOF'
version: "3.9"
services:
  weaviate:
    image: semitechnologies/weaviate:latest   # pin a specific version tag in production
    ports:
      - "8080:8080"
      - "50051:50051"
    volumes:
      - weaviate_data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      DEFAULT_VECTORIZER_MODULE: text2vec-openai
      ENABLE_MODULES: "text2vec-openai,generative-openai"
      OPENAI_APIKEY: ${OPENAI_API_KEY}
volumes:
  weaviate_data:
EOF
docker compose -f docker-compose-openai.yml up -d
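
The container can take a few seconds to accept requests after it starts. The following is a minimal sketch of a readiness wait that polls Weaviate's `/v1/.well-known/ready` endpoint using only the standard library. The helper names, retry count, and delay are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def ready_url(host="localhost", port=8080):
    """Build the URL of Weaviate's readiness endpoint."""
    return f"http://{host}:{port}/v1/.well-known/ready"


def http_probe(url, timeout=2.0):
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet
        return False


def wait_until_ready(url, probe=http_probe, attempts=30, delay=1.0):
    """Poll the readiness endpoint until it succeeds or attempts run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_ready(ready_url())` right after `docker compose up -d` blocks until the instance answers or gives up after roughly 30 seconds. The `probe` parameter exists so the loop can be exercised without a running server.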

Python Client

pip install weaviate-client          # v4 client (recommended)
pip install weaviate-client==3.26.7  # v3 client (legacy)

Weaviate Cloud Service (WCS)

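import weaviate
from weaviate.auth import AuthApiKey

# Recent v4 clients name this helper connect_to_weaviate_cloud;
# connect_to_wcs is the older, deprecated name for the same call.
client = weaviate.connect_to_weaviate_cloud(
    cluster_url="https://your-cluster.weaviate.network",
    auth_credentials=AuthApiKey(api_key="your-wcs-api-key"),
    headers={"X-OpenAI-Api-Key": "sk-..."}
)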

Configuration

Client Initialization (v4)

import weaviate
from weaviate.auth import AuthApiKey
from weaviate.classes.init import AdditionalConfig, Timeout

# Local Docker
client = weaviate.connect_to_local(
    host="localhost",
    port=8080,
    grpc_port=50051,
    headers={"X-OpenAI-Api-Key": "sk-..."}
)

# Custom endpoint
client = weaviate.connect_to_custom(
    http_host="localhost",
    http_port=8080,
    http_secure=False,
    grpc_host="localhost",
    grpc_port=50051,
    grpc_secure=False,
    auth_credentials=AuthApiKey("your-api-key"),
    additional_config=AdditionalConfig(
        timeout=Timeout(init=2, query=45, insert=120)
    )
)

# Always close the client when done: call client.close() explicitly,
# or use the client as a context manager so it closes automatically
with weaviate.connect_to_local() as client:
    print(client.is_ready())

Schema / Collection Definition

import weaviate.classes as wvc
from weaviate.classes.config import Configure, Property, DataType

# Create collection with OpenAI vectorizer
articles = client.collections.create(
    name="Article",
    description="News articles for RAG",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-small"
    ),
    generative_config=Configure.Generative.openai(
        model="gpt-4o-mini"
    ),
    properties=[
        Property(name="title",     data_type=DataType.TEXT),
        Property(name="body",      data_type=DataType.TEXT),
        Property(name="source",    data_type=DataType.TEXT,
                 skip_vectorization=True),     # Don't embed this field
        Property(name="published", data_type=DataType.DATE),
        Property(name="views",     data_type=DataType.INT),
    ],
    vector_index_config=Configure.VectorIndex.hnsw(
        distance_metric=wvc.config.VectorDistances.COSINE,
        ef_construction=128,
        max_connections=64
    )
)

Core Commands/API

Method	Description
`client.collections.create(name, ...)`	Create a new collection with schema
`client.collections.delete(name)`	Delete a collection and all data
`client.collections.get(name)`	Get a collection object
`client.collections.exists(name)`	Check if collection exists
`client.collections.list_all()`	List all collections (name mapped to its configuration)
`collection.data.insert(properties)`	Insert a single object
`collection.data.insert_many(objects)`	Batch insert objects
`collection.data.update(uuid, properties)`	Update object properties
`collection.data.replace(uuid, properties)`	Replace all object properties
`collection.data.delete_by_id(uuid)`	Delete object by UUID
`collection.data.delete_many(where)`	Batch delete by filter
`collection.query.fetch_object_by_id(uuid)`	Fetch object by UUID
`collection.query.near_text(query, limit)`	Semantic search by text
`collection.query.near_vector(vector, limit)`	Search by raw vector
`collection.query.bm25(query, limit)`	BM25 keyword search
`collection.query.hybrid(query, alpha, limit)`	Hybrid vector + BM25 search
`collection.query.fetch_objects(limit, filters)`	Fetch with structured filters
`collection.generate.near_text(query, ...)`	RAG: search + LLM generation
`collection.generate.hybrid(query, ...)`	RAG: hybrid search + generation
`collection.aggregate.over_all()`	Aggregate statistics
`client.batch.dynamic()`	Context manager for batch inserts
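
As a quick illustration of how the single-object methods in the table fit together, here is a minimal sketch of an insert, update, read, and delete round trip. The helper name `crud_roundtrip` and the sample property values are hypothetical; the function expects a collection object such as the one returned by `client.collections.get("Article")`.

```python
def crud_roundtrip(collection):
    """Insert, update, read, and delete one object in the given collection."""
    # insert() returns the UUID assigned to the new object
    uuid = collection.data.insert(
        properties={"title": "Draft", "body": "First version", "source": "demo"}
    )

    # update() merges the given properties into the stored object
    collection.data.update(uuid=uuid, properties={"title": "Final"})

    # fetch_object_by_id() returns the object, or None if it does not exist
    obj = collection.query.fetch_object_by_id(uuid)
    title = obj.properties["title"] if obj is not None else None

    # delete_by_id() removes the object again
    collection.data.delete_by_id(uuid)
    return uuid, title
```

Against a live instance, `crud_roundtrip(client.collections.get("Article"))` should return the new UUID together with the updated title.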

Advanced Usage

Batch Import

import weaviate
import weaviate.classes as wvc
from weaviate.classes.data import DataObject

client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": "sk-..."}
)
articles = client.collections.get("Article")

# Batch insert — Weaviate handles embedding automatically
data_objects = [
    DataObject(
        properties={
            "title":     "Understanding Vector Databases",
            "body":      "Vector databases store embeddings and enable...",
            "source":    "techblog",
            "published": "2024-01-15T00:00:00Z",
            "views":     1250
        }
    ),
    DataObject(
        properties={
            "title": "RAG Pipeline Best Practices",
            "body":  "Retrieval-Augmented Generation combines...",
            "source": "arxiv"
        }
    )
]

with client.batch.dynamic() as batch:
    for obj in data_objects:
        batch.add_object(
            collection="Article",
            properties=obj.properties
        )

# Check for failed inserts (the batch ran on client.batch, so read failures from it)
if client.batch.failed_objects:
    for failed in client.batch.failed_objects:
        print(f"Failed: {failed.message}")

Hybrid Search with Filters

from weaviate.classes.query import Filter, MetadataQuery

# Pure semantic search
results = articles.query.near_text(
    query="machine learning inference optimization",
    limit=5,
    return_metadata=MetadataQuery(distance=True)  # score is only set for BM25/hybrid
)

# BM25 keyword search
results = articles.query.bm25(
    query="vector database performance benchmarks",
    limit=5,
    query_properties=["title", "body"],
    return_metadata=MetadataQuery(score=True)
)

# Hybrid search (alpha=1.0 = pure vector, alpha=0.0 = pure BM25)
results = articles.query.hybrid(
    query="RAG pipeline with LangChain",
    alpha=0.75,
    limit=10,
    filters=Filter.by_property("source").equal("techblog"),
    return_metadata=MetadataQuery(score=True)
)

for obj in results.objects:
    print(f"[{obj.metadata.score:.4f}] {obj.properties['title']}")

# Complex filter (combine conditions with & for AND, | for OR)
f = (
    Filter.by_property("views").greater_than(500) &
    Filter.by_property("source").contains_any(["techblog", "arxiv"])
)
results = articles.query.near_text("deep learning", limit=5, filters=f)

Generative (RAG) Queries

from weaviate.classes.query import MetadataQuery

# Single-result generation: apply the prompt to each result individually
results = articles.generate.near_text(
    query="vector database comparison",
    limit=3,
    single_prompt="Summarize this article in one sentence: {body}",
    return_metadata=MetadataQuery(distance=True)
)

for obj in results.objects:
    print(f"Original: {obj.properties['title']}")
    print(f"Summary:  {obj.generated}")

# Grouped generation — synthesize across all retrieved results
results = articles.generate.hybrid(
    query="best practices for RAG pipelines",
    alpha=0.6,
    limit=5,
    grouped_task="Write a comprehensive guide based on these articles.",
    grouped_properties=["title", "body"]
)
print(results.generated)   # Combined synthesis

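# Generative search with query-time model settings.
# Assumes a recent v4 client and server that support runtime generative
# configuration; otherwise set temperature and max_tokens in
# Configure.Generative.openai(...) when creating the collection.
from weaviate.classes.generate import GenerativeConfig

results = articles.generate.near_text(
    query="database scaling strategies",
    limit=4,
    single_prompt="Extract key technical claims from: {body}",
    generative_provider=GenerativeConfig.openai(
        temperature=0.2,
        max_tokens=200
    )
)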

Multi-Tenancy

from weaviate.classes.config import Configure

# Enable multi-tenancy at collection creation
mt_collection = client.collections.create(
    name="TenantDocs",
    multi_tenancy_config=Configure.multi_tenancy(
        enabled=True,
        auto_tenant_creation=True   # Create tenants on first use
    ),
    vectorizer_config=Configure.Vectorizer.text2vec_openai()
)

# Create tenants explicitly
from weaviate.classes.tenants import Tenant, TenantActivityStatus
mt_collection.tenants.create([
    Tenant(name="tenant_acme"),
    Tenant(name="tenant_globex")
])

# Insert into a tenant
tenant_collection = mt_collection.with_tenant("tenant_acme")
tenant_collection.data.insert({"title": "ACME internal doc", "body": "..."})

# Search within a tenant
results = tenant_collection.query.near_text(
    query="internal procedures",
    limit=5
)

# Deactivate tenant (data is preserved on disk but not searchable).
# Newer releases call this state INACTIVE; COLD is the older name.
mt_collection.tenants.update([
    Tenant(name="tenant_acme", activity_status=TenantActivityStatus.COLD)
])

Custom Vectors (BYOV)

# Skip vectorizer — insert your own embeddings
no_vec_collection = client.collections.create(
    name="CustomEmbeddings",
    vectorizer_config=Configure.Vectorizer.none()
)

import numpy as np
my_vector = np.random.rand(1536).tolist()

no_vec_collection.data.insert(
    properties={"text": "my document"},
    vector=my_vector
)

# Search with a custom vector
results = no_vec_collection.query.near_vector(
    near_vector=my_vector,
    limit=5
)

Common Workflows

Backup and Restore

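# Trigger backup via REST API (to local filesystem).
# Requires backup-filesystem in ENABLE_MODULES and BACKUP_FILESYSTEM_PATH set
# on the server; the compose files above do not enable it.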
curl -X POST http://localhost:8080/v1/backups/filesystem \
  -H "Content-Type: application/json" \
  -d '{"id": "backup-2024-01-15", "include": ["Article"]}'

# Check backup status
curl http://localhost:8080/v1/backups/filesystem/backup-2024-01-15

# Restore (the collection must not already exist on the target instance)
curl -X POST http://localhost:8080/v1/backups/filesystem/backup-2024-01-15/restore \
  -H "Content-Type: application/json" \
  -d '{"include": ["Article"]}'

Schema Migration

# Add a property to existing collection
articles.config.add_property(
    Property(name="language", data_type=DataType.TEXT)
)

# Update collection settings (e.g., BM25 parameters)
from weaviate.classes.config import Reconfigure
articles.config.update(
    inverted_index_config=Reconfigure.inverted_index(
        bm25_b=0.75,
        bm25_k1=1.2
    )
)

Aggregate and Metrics

# Count objects
response = articles.aggregate.over_all(total_count=True)
print(f"Total articles: {response.total_count}")

# Aggregate with filter
from weaviate.classes.query import Filter
response = articles.aggregate.over_all(
    filters=Filter.by_property("source").equal("techblog"),
    total_count=True
)

Tips and Best Practices

Tip	Details
Use v4 client	The v4 Python client (weaviate-client >= 4.0) has a cleaner API and better performance than v3
Pass API keys in headers	Use `headers={"X-OpenAI-Api-Key": "sk-..."}` — never hardcode in schema config
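Use `skip_vectorization=True`	Apply to text properties that carry no semantic meaning (IDs, URLs, source names) so they do not dilute the embedding; non-text types such as dates and numbers are not vectorized anyway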
Tune hybrid alpha	Start at 0.75 (vector-heavy); adjust toward 0.5 for balanced keyword/semantic recall
Index properties selectively	Set `index_filterable=False` / `index_searchable=False` on properties that are never filtered or keyword-searched to save disk space
Batch with `client.batch.dynamic()`	Dynamic batching automatically tunes batch size; avoid manual batch size tuning
Use multi-tenancy for SaaS	Tenant-per-customer isolation gives each tenant its own shard, so one collection can serve very large tenant counts without separate deployments
Cold tenants for archival	Set inactive tenants to COLD status — data is preserved but excluded from search
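Enable gRPC port 50051	The v4 client sends queries and batch imports over gRPC, which is generally faster than REST; the port must be reachable from the client
Monitor with Prometheus	Set `PROMETHEUS_MONITORING_ENABLED=true`; Weaviate then exports metrics at `:2112/metrics`, which can feed Grafana dashboards for production monitoring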
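
One way to follow the API-key tip is to build the headers dictionary from environment variables rather than string literals. The sketch below assumes the keys live in `OPENAI_API_KEY` and `COHERE_API_KEY`; those variable names and the helper `provider_headers` are illustrative choices, not part of the Weaviate client.

```python
import os


def provider_headers(env=None):
    """Collect model-provider API keys from environment variables.

    The mapping of environment variable names to header names is an
    assumption for this sketch; extend it for the providers you enable.
    """
    env = os.environ if env is None else env
    mapping = {
        "OPENAI_API_KEY": "X-OpenAI-Api-Key",
        "COHERE_API_KEY": "X-Cohere-Api-Key",
    }
    # Skip providers whose variable is unset or empty
    return {header: env[var] for var, header in mapping.items() if env.get(var)}
```

The result can be passed straight to a connection helper, for example `weaviate.connect_to_local(headers=provider_headers())`, which keeps secrets out of source files and schema definitions.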