Weaviate Cheat Sheet

Overview

Weaviate is an AI-native vector database that integrates embedding models through vectorizer modules. Rather than requiring a separate embedding step, these modules (text2vec-openai, text2vec-cohere, text2vec-transformers, multi2vec-clip) generate embeddings automatically on insert and at query time. This integration simplifies pipelines and enables real-time vectorization of incoming data.

Weaviate organizes data into collections (called classes in older releases, and analogous to tables), each with a defined schema specifying properties and their data types. Collections support hybrid search that combines vector similarity with BM25 keyword search, weighted by a configurable alpha parameter. Generative modules (generative-openai, generative-cohere, generative-anthropic) enable retrieval-augmented generation within a single database query: Weaviate forwards the retrieved objects to the LLM provider, so the client does not need a separate round-trip to an LLM API.
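
To make the role of alpha concrete, here is a minimal pure-Python sketch of score blending. It is a simplified illustration in the spirit of score-based fusion, not Weaviate's actual implementation; the function names and the sample scores are hypothetical.

```python
def min_max(scores):
    """Rescale a {doc_id: score} map to [0, 1]; a constant map becomes all 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_scores(vector_scores, bm25_scores, alpha=0.75):
    """Blend normalized scores: alpha=1.0 is pure vector, alpha=0.0 is pure BM25."""
    v = min_max(vector_scores) if vector_scores else {}
    k = min_max(bm25_scores) if bm25_scores else {}
    doc_ids = set(v) | set(k)
    # A document missing from one result list contributes 0 for that side.
    return {d: alpha * v.get(d, 0.0) + (1 - alpha) * k.get(d, 0.0) for d in doc_ids}


# Hypothetical inputs: vector similarity (higher is better) and raw BM25 scores.
vector_scores = {"doc_a": 0.92, "doc_b": 0.80, "doc_c": 0.60}
bm25_scores = {"doc_b": 7.1, "doc_c": 3.0, "doc_d": 1.0}

blended = hybrid_scores(vector_scores, bm25_scores, alpha=0.75)
ranked = sorted(blended.items(), key=lambda item: item[1], reverse=True)
for doc_id, score in ranked:
    print(f"{doc_id}: {score:.3f}")
```

Lowering alpha in this sketch moves doc_b, the strongest keyword match, ahead of doc_a, which is the same trade-off the alpha parameter controls in a hybrid query.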

Weaviate is deployed via Docker, Kubernetes, or the managed Weaviate Cloud Service (WCS). It exposes both GraphQL and REST APIs, with official Python, JavaScript, Java, and Go clients. Multi-tenancy allows thousands of isolated tenant shards within a single collection, making it practical for SaaS applications.

Installation

Docker Compose

# Minimal — no vectorizer (bring your own embeddings)
cat > docker-compose.yml << 'EOF'
version: "3.9"
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"   # gRPC
    volumes:
      - weaviate_data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      DEFAULT_VECTORIZER_MODULE: none
      ENABLE_MODULES: ""
      CLUSTER_HOSTNAME: node1
volumes:
  weaviate_data:
EOF

# With OpenAI vectorizer + Generative module
cat > docker-compose-openai.yml << 'EOF'
version: "3.9"
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"
    volumes:
      - weaviate_data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: /var/lib/weaviate
      DEFAULT_VECTORIZER_MODULE: text2vec-openai
      ENABLE_MODULES: "text2vec-openai,generative-openai"
      OPENAI_APIKEY: ${OPENAI_API_KEY}
volumes:
  weaviate_data:
EOF
docker compose -f docker-compose-openai.yml up -d
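
After the container starts, it can take a few seconds before Weaviate accepts requests. The sketch below polls the REST readiness endpoint (/v1/.well-known/ready) using only the standard library. The helper names, retry count, and delay are illustrative assumptions, not part of any Weaviate tooling.

```python
import time
import urllib.error
import urllib.request


def ready_url(host="localhost", port=8080):
    """Build the URL of Weaviate's REST readiness probe."""
    return f"http://{host}:{port}/v1/.well-known/ready"


def wait_until_ready(host="localhost", port=8080, attempts=30, delay=1.0):
    """Return True once the probe answers HTTP 200, False after all attempts fail."""
    url = ready_url(host, port)
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Not reachable yet (connection refused, timeout, non-2xx status)
        time.sleep(delay)
    return False
```

Calling wait_until_ready() before connecting a client avoids a connection error in scripts that start the container and import data in one go.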

Python Client

# Install one or the other; running both leaves only the last version installed
pip install weaviate-client            # v4 client (recommended)
pip install "weaviate-client==3.26.7"  # v3 client (legacy)

Weaviate Cloud Service (WCS)

import weaviate
from weaviate.auth import AuthApiKey

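# Newer v4 client releases expose this as weaviate.connect_to_weaviate_cloud();
# connect_to_wcs is the older name for the same helper
client = weaviate.connect_to_wcs(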
    cluster_url="https://your-cluster.weaviate.network",
    auth_credentials=AuthApiKey(api_key="your-wcs-api-key"),
    headers={"X-OpenAI-Api-Key": "sk-..."}
)

Configuration

Client Initialization (v4)

import weaviate
from weaviate.auth import AuthApiKey
from weaviate.classes.init import AdditionalConfig, Timeout

# Local Docker
client = weaviate.connect_to_local(
    host="localhost",
    port=8080,
    grpc_port=50051,
    headers={"X-OpenAI-Api-Key": "sk-..."}
)

# Custom endpoint
client = weaviate.connect_to_custom(
    http_host="localhost",
    http_port=8080,
    http_secure=False,
    grpc_host="localhost",
    grpc_port=50051,
    grpc_secure=False,
    auth_credentials=AuthApiKey("your-api-key"),
    additional_config=AdditionalConfig(
        timeout=Timeout(init=2, query=45, insert=120)
    )
)

# Always close the client when done (client.close()),
# or use it as a context manager so it closes automatically
with weaviate.connect_to_local() as client:
    print(client.is_ready())
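
The connection examples above pass provider keys as literal strings. A minimal sketch of building the headers dictionary from environment variables instead is shown below. The environment variable names and the helper are assumptions for illustration; X-OpenAI-Api-Key and X-Cohere-Api-Key are the header names Weaviate expects for those providers.

```python
import os

# Assumed environment variable names mapped to the request headers Weaviate reads.
ENV_TO_HEADER = {
    "OPENAI_API_KEY": "X-OpenAI-Api-Key",
    "COHERE_API_KEY": "X-Cohere-Api-Key",
}


def provider_headers(env=None):
    """Return headers for every provider key that is set and non-empty."""
    env = os.environ if env is None else env
    return {
        header: env[var]
        for var, header in ENV_TO_HEADER.items()
        if env.get(var)
    }
```

The result can be passed directly, for example weaviate.connect_to_local(headers=provider_headers()), which keeps keys out of source control.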

Schema / Collection Definition

import weaviate.classes as wvc
from weaviate.classes.config import Configure, Property, DataType

# Create collection with OpenAI vectorizer
articles = client.collections.create(
    name="Article",
    description="News articles for RAG",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-small"
    ),
    generative_config=Configure.Generative.openai(
        model="gpt-4o-mini"
    ),
    properties=[
        Property(name="title",     data_type=DataType.TEXT),
        Property(name="body",      data_type=DataType.TEXT),
        Property(name="source",    data_type=DataType.TEXT,
                 skip_vectorization=True),     # Don't embed this field
        Property(name="published", data_type=DataType.DATE),
        Property(name="views",     data_type=DataType.INT),
    ],
    vector_index_config=Configure.VectorIndex.hnsw(
        distance_metric=wvc.config.VectorDistances.COSINE,
        ef_construction=128,
        max_connections=64
    )
)

Core Commands/API

Method | Description
client.collections.create(name, ...) | Create a new collection with schema
client.collections.delete(name) | Delete a collection and all data
client.collections.get(name) | Get a collection object
client.collections.exists(name) | Check if collection exists
client.collections.list_all() | List all collections (names mapped to their configuration)
collection.data.insert(properties) | Insert a single object
collection.data.insert_many(objects) | Batch insert objects
collection.data.update(uuid, properties) | Update object properties
collection.data.replace(uuid, properties) | Replace all object properties
collection.data.delete_by_id(uuid) | Delete object by UUID
collection.data.delete_many(where) | Batch delete by filter
collection.query.fetch_object_by_id(uuid) | Fetch object by UUID
collection.query.near_text(query, limit) | Semantic search by text
collection.query.near_vector(near_vector, limit) | Search by raw vector
collection.query.bm25(query, limit) | BM25 keyword search
collection.query.hybrid(query, alpha, limit) | Hybrid vector + BM25 search
collection.query.fetch_objects(limit, filters) | Fetch with structured filters
collection.generate.near_text(query, ...) | RAG: search + LLM generation
collection.generate.hybrid(query, ...) | RAG: hybrid search + generation
collection.aggregate.over_all() | Aggregate statistics
client.batch.dynamic() | Context manager for batch inserts

Advanced Usage

Batch Import

import weaviate
import weaviate.classes as wvc
from weaviate.classes.data import DataObject

client = weaviate.connect_to_local(
    headers={"X-OpenAI-Api-Key": "sk-..."}
)
articles = client.collections.get("Article")

# Batch insert — Weaviate handles embedding automatically
data_objects = [
    DataObject(
        properties={
            "title":     "Understanding Vector Databases",
            "body":      "Vector databases store embeddings and enable...",
            "source":    "techblog",
            "published": "2024-01-15T00:00:00Z",
            "views":     1250
        }
    ),
    DataObject(
        properties={
            "title": "RAG Pipeline Best Practices",
            "body":  "Retrieval-Augmented Generation combines...",
            "source": "arxiv"
        }
    )
]

with client.batch.dynamic() as batch:
    for obj in data_objects:
        batch.add_object(
            collection="Article",
            properties=obj.properties
        )

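# Check for failed inserts (the batch was opened on the client,
# so the failures are recorded on client.batch, not on the collection)
if client.batch.failed_objects:
    for failed in client.batch.failed_objects:
        print(f"Failed: {failed.message}")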
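
The published value in the batch above is a DATE property, which Weaviate expects as an RFC 3339 timestamp. When source data holds Python datetime objects, a small conversion helper like the following sketch can produce that string. The helper name is hypothetical, and treating naive datetimes as UTC is an assumption you may need to change.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339(dt):
    """Format a datetime as an RFC 3339 UTC timestamp, e.g. 2024-01-15T00:00:00Z."""
    if dt.tzinfo is None:
        # Assumption: naive datetimes are already in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A naive datetime and an aware one in UTC+2 that denote the same instant.
naive = datetime(2024, 1, 15)
aware = datetime(2024, 1, 15, 2, 0, tzinfo=timezone(timedelta(hours=2)))

print(to_rfc3339(naive))  # 2024-01-15T00:00:00Z
print(to_rfc3339(aware))  # 2024-01-15T00:00:00Z
```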

Hybrid Search with Filters

from weaviate.classes.query import Filter, MetadataQuery

# Pure semantic search (distance is the meaningful metric for vector-only queries;
# score is populated by BM25 and hybrid queries)
results = articles.query.near_text(
    query="machine learning inference optimization",
    limit=5,
    return_metadata=MetadataQuery(distance=True)
)

# BM25 keyword search
results = articles.query.bm25(
    query="vector database performance benchmarks",
    limit=5,
    query_properties=["title", "body"],
    return_metadata=MetadataQuery(score=True)
)

# Hybrid search (alpha=1.0 = pure vector, alpha=0.0 = pure BM25)
results = articles.query.hybrid(
    query="RAG pipeline with LangChain",
    alpha=0.75,
    limit=10,
    filters=Filter.by_property("source").equal("techblog"),
    return_metadata=MetadataQuery(score=True)
)

for obj in results.objects:
    print(f"[{obj.metadata.score:.4f}] {obj.properties['title']}")

# Complex filter: combine conditions with & (AND) or | (OR)
f = (
    Filter.by_property("views").greater_than(500) &
    Filter.by_property("source").contains_any(["techblog", "arxiv"])
)
results = articles.query.near_text("deep learning", limit=5, filters=f)

Generative (RAG) Queries

from weaviate.classes.query import MetadataQuery
# GenerativeConfig is available in recent v4 client releases
from weaviate.classes.generate import GenerativeConfig

# Single-result generation: apply the prompt to each result individually
results = articles.generate.near_text(
    query="vector database comparison",
    limit=3,
    single_prompt="Summarize this article in one sentence: {body}",
    return_metadata=MetadataQuery(distance=True)
)

for obj in results.objects:
    print(f"Original: {obj.properties['title']}")
    print(f"Summary:  {obj.generated}")

# Grouped generation: synthesize across all retrieved results
results = articles.generate.hybrid(
    query="best practices for RAG pipelines",
    alpha=0.6,
    limit=5,
    grouped_task="Write a comprehensive guide based on these articles.",
    grouped_properties=["title", "body"]
)
print(results.generated)   # Combined synthesis

# Generative search with per-query model settings
# (assumes a client and server recent enough to accept generative_provider;
#  otherwise set temperature and max_tokens in Configure.Generative.openai(...)
#  when creating the collection)
results = articles.generate.near_text(
    query="database scaling strategies",
    limit=4,
    single_prompt="Extract key technical claims from: {body}",
    generative_provider=GenerativeConfig.openai(
        temperature=0.2,
        max_tokens=200
    )
)
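
The {body} placeholder in single_prompt is filled from each retrieved object's properties before the prompt is sent to the model. The sketch below imitates that substitution in plain Python to show what one per-object prompt looks like. It is an illustration only, not Weaviate's implementation, and fill_prompt is a hypothetical helper.

```python
import re


def fill_prompt(template, properties):
    """Replace each {name} placeholder with the matching property value."""
    def substitute(match):
        name = match.group(1)
        if name not in properties:
            raise KeyError(f"prompt references unknown property: {name}")
        return str(properties[name])

    return re.sub(r"\{(\w+)\}", substitute, template)


article = {"title": "RAG Pipeline Best Practices", "body": "Retrieval-Augmented Generation combines..."}
prompt = fill_prompt("Summarize this article in one sentence: {body}", article)
print(prompt)
```

A placeholder that names a property the objects do not have is the usual cause of a failed or empty generation, which is why the sketch raises on unknown names.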

Multi-Tenancy

from weaviate.classes.config import Configure

# Enable multi-tenancy at collection creation
mt_collection = client.collections.create(
    name="TenantDocs",
    multi_tenancy_config=Configure.multi_tenancy(
        enabled=True,
        auto_tenant_creation=True   # Create tenants on first use
    ),
    vectorizer_config=Configure.Vectorizer.text2vec_openai()
)

# Create tenants explicitly
from weaviate.classes.tenants import Tenant, TenantActivityStatus
mt_collection.tenants.create([
    Tenant(name="tenant_acme"),
    Tenant(name="tenant_globex")
])

# Insert into a tenant
tenant_collection = mt_collection.with_tenant("tenant_acme")
tenant_collection.data.insert({"title": "ACME internal doc", "body": "..."})

# Search within a tenant
results = tenant_collection.query.near_text(
    query="internal procedures",
    limit=5
)

# Deactivate tenant (cold storage — data preserved, not searchable)
mt_collection.tenants.update([
    Tenant(name="tenant_acme", activity_status=TenantActivityStatus.COLD)
])

Custom Vectors (BYOV)

# Skip vectorizer — insert your own embeddings
no_vec_collection = client.collections.create(
    name="CustomEmbeddings",
    vectorizer_config=Configure.Vectorizer.none()
)

import numpy as np
my_vector = np.random.rand(1536).tolist()

no_vec_collection.data.insert(
    properties={"text": "my document"},
    vector=my_vector
)

# Search with a custom vector
results = no_vec_collection.query.near_vector(
    near_vector=my_vector,
    limit=5
)
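
With the COSINE distance metric, near_vector ranks objects by cosine distance, defined as 1 minus cosine similarity, so values range from 0 (same direction) to 2 (opposite direction). The following brute-force sketch shows that computation on toy vectors. Weaviate itself uses an approximate HNSW index rather than an exhaustive scan, and the function names and sample data here are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 0.0 for identical direction, 1.0 orthogonal, 2.0 opposite."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, vectors, limit=5):
    """Exhaustively rank {object_id: vector} by distance to the query vector."""
    scored = [(obj_id, cosine_distance(query, vec)) for obj_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:limit]


stored = {"a": [0.0, 1.0], "b": [1.0, 0.1], "c": [-1.0, 0.0]}
for obj_id, distance in nearest([1.0, 0.0], stored, limit=2):
    print(f"{obj_id}: {distance:.4f}")
```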

Common Workflows

Backup and Restore

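# Requires the backup-filesystem module in ENABLE_MODULES and
# BACKUP_FILESYSTEM_PATH set on the server
# Trigger backup via REST API (to local filesystem)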
curl -X POST http://localhost:8080/v1/backups/filesystem \
  -H "Content-Type: application/json" \
  -d '{"id": "backup-2024-01-15", "include": ["Article"]}'

# Check backup status
curl http://localhost:8080/v1/backups/filesystem/backup-2024-01-15

# Restore
curl -X POST http://localhost:8080/v1/backups/filesystem/backup-2024-01-15/restore \
  -H "Content-Type: application/json" \
  -d '{"include": ["Article"]}'
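
The same backup requests can be built from Python with the standard library, which is convenient inside scheduled jobs. This sketch only constructs the request objects and mirrors the curl calls above. The helper names are hypothetical, and sending the request is left to the caller.

```python
import json
import urllib.request


def backup_request(base_url, backend, backup_id, include):
    """Build the POST request that starts a backup of the given collections."""
    payload = json.dumps({"id": backup_id, "include": include}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/backups/{backend}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def backup_status_url(base_url, backend, backup_id):
    """URL to poll for the status of a backup that was already started."""
    return f"{base_url}/v1/backups/{backend}/{backup_id}"


req = backup_request("http://localhost:8080", "filesystem", "backup-2024-01-15", ["Article"])
print(req.get_method(), req.full_url)
print(backup_status_url("http://localhost:8080", "filesystem", "backup-2024-01-15"))
```

Passing req to urllib.request.urlopen sends it; keeping construction separate makes the payload easy to inspect or log first.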

Schema Migration

# Add a property to existing collection
articles.config.add_property(
    Property(name="language", data_type=DataType.TEXT)
)

# Update collection settings (e.g., BM25 parameters)
from weaviate.classes.config import Reconfigure
articles.config.update(
    inverted_index_config=Reconfigure.inverted_index(
        bm25_b=0.75,
        bm25_k1=1.2
    )
)
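
To see what bm25_k1 and bm25_b influence, here is the textbook Okapi BM25 score for a single query term. It is a sketch of the standard formula, not necessarily the exact variant Weaviate implements, and the function name and sample statistics are hypothetical.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score one term in one document.

    k1 controls how quickly repeated occurrences of the term saturate.
    b controls how strongly long documents are penalized (0 disables it).
    """
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1)) / (tf + k1 * length_norm)


# Same term frequency in a short and a long document.
stats = dict(tf=3, avg_doc_len=100, n_docs=1000, doc_freq=10)
short_doc = bm25_term_score(doc_len=50, **stats)
long_doc = bm25_term_score(doc_len=400, **stats)
no_length_penalty = bm25_term_score(doc_len=400, b=0.0, **stats)

print(f"short: {short_doc:.3f}  long: {long_doc:.3f}  long with b=0: {no_length_penalty:.3f}")
```

In this sketch, raising b widens the gap between short and long documents, while raising k1 lets repeated occurrences of a term keep adding to the score for longer.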

Aggregate and Metrics

# Count objects
response = articles.aggregate.over_all(total_count=True)
print(f"Total articles: {response.total_count}")

# Aggregate with filter
from weaviate.classes.query import Filter
response = articles.aggregate.over_all(
    filters=Filter.by_property("source").equal("techblog"),
    total_count=True
)

Tips and Best Practices

Tip | Details
Use v4 client | The v4 Python client (weaviate-client >= 4.0) has a cleaner API and better performance than v3
Pass API keys in headers | Use headers={"X-OpenAI-Api-Key": ...} with the key read from an environment variable; never hardcode keys in schema config or source code
Use skip_vectorization=True | Apply to metadata properties (IDs, URLs, dates) to reduce embedding cost
Tune hybrid alpha | Start at 0.75 (vector-heavy); adjust toward 0.5 for balanced keyword/semantic recall
Index properties selectively | Set index_filterable=False or index_searchable=False on properties that are never filtered or keyword-searched to save disk space
Batch with client.batch.dynamic() | Dynamic batching automatically tunes batch size; avoid manual batch size tuning
Use multi-tenancy for SaaS | Tenant-per-customer isolation scales to millions of tenants without separate deployments
Cold tenants for archival | Set inactive tenants to COLD status; data is preserved but excluded from search
Enable gRPC port 50051 | The v4 client uses gRPC by default for ~2x throughput improvement over REST
Monitor with Prometheus | With PROMETHEUS_MONITORING_ENABLED=true, Weaviate exports metrics at :2112/metrics; integrate with Grafana for production monitoring