Haystack Cheat Sheet

Overview

Haystack is an end-to-end framework for building retrieval-augmented generation (RAG), question answering, and document search systems. Developed by deepset, it follows a pipeline-based architecture where modular components — document converters, preprocessors, embedders, retrievers, rankers, and generators — are connected by named input/output ports into dataflow graphs.

Haystack 2.x (the current major version) introduced a complete redesign with typed component interfaces that are validated when components are connected, declarative pipeline serialization to YAML, and a clean separation between components (processing steps with declared inputs and outputs) and document stores (storage backends for documents and their embeddings). Supported backends include InMemoryDocumentStore, Elasticsearch, OpenSearch, Chroma, Qdrant, Weaviate, and pgvector.

The framework is LLM-agnostic, with official integrations for OpenAI, Anthropic, Cohere, Mistral, the Hugging Face Inference API, and local models via Ollama. Haystack also provides tools for pipeline evaluation (faithfulness, context relevance, and RAGAS metrics through a separate integration), REST API deployment via Hayhooks, and a visual pipeline builder in deepset Studio.

Installation

Core and Integrations

# Core Haystack 2.x
pip install haystack-ai

# Document store backends
pip install chroma-haystack           # ChromaDB
pip install qdrant-haystack           # Qdrant
pip install elasticsearch-haystack    # Elasticsearch
pip install weaviate-haystack         # Weaviate

# LLM and embedding integrations
pip install openai                    # OpenAI client (already installed as a haystack-ai dependency)
pip install cohere-haystack           # Cohere
pip install amazon-bedrock-haystack   # AWS Bedrock
pip install ollama-haystack           # Ollama local models

# Document processing
pip install pypdf trafilatura         # PDF and web scraping
pip install sentence-transformers     # Local embedding models

# Evaluation
pip install ragas-haystack            # RAGAS metrics integration

# REST API serving
pip install hayhooks                  # FastAPI wrapper for pipelines

# Full stack (common setup)
pip install haystack-ai pypdf sentence-transformers

Environment Variables

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export COHERE_API_KEY="..."
export HF_API_TOKEN="hf_..."

Configuration

Document Stores

# In-Memory (development and testing)
from haystack.document_stores.in_memory import InMemoryDocumentStore

store = InMemoryDocumentStore(
    embedding_similarity_function="cosine"  # or "dot_product"
)

# ChromaDB
from haystack_integrations.document_stores.chroma import ChromaDocumentStore

store = ChromaDocumentStore(
    collection_name="my_docs",
    persist_path="./chroma_haystack"
)

# Qdrant
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore

store = QdrantDocumentStore(
    url="http://localhost:6333",
    index="haystack_docs",
    embedding_dim=1536,
    recreate_index=False
)

# Elasticsearch
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore

store = ElasticsearchDocumentStore(
    hosts="http://localhost:9200",
    index="haystack"
)
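
The two values accepted by `embedding_similarity_function` rank documents differently when embeddings are not unit-length. The following is a minimal stdlib sketch of the two scoring functions, not Haystack's implementation; the helper names and toy vectors are made up for illustration.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; grows with the vectors' norms."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product after scaling both vectors to unit length."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


query = [1.0, 0.0]
aligned_doc = [1.0, 0.0]  # same direction as the query, small norm
long_doc = [3.0, 4.0]     # different direction, larger norm

# Dot product prefers the longer vector, cosine prefers the aligned one.
print(dot_product(query, aligned_doc), dot_product(query, long_doc))  # 1.0 3.0
print(cosine(query, aligned_doc), cosine(query, long_doc))            # 1.0 0.6
```

If the embedding model already returns normalized vectors, both functions produce the same ranking.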

Core Commands/API

Component / Method	Description
`Pipeline()`	Create a new pipeline
`pipeline.add_component(name, component)`	Register a component
`pipeline.connect(from_output, to_input)`	Wire component outputs to inputs
`pipeline.run(inputs)`	Execute the pipeline
`pipeline.to_dict()`	Serialize pipeline to dict
`Pipeline.from_dict(data)`	Deserialize pipeline from dict
`pipeline.draw("pipeline.png")`	Visualize pipeline graph
`OpenAIDocumentEmbedder(model)`	Embed documents with OpenAI
`OpenAITextEmbedder(model)`	Embed a query string
`InMemoryEmbeddingRetriever(store)`	Retrieve by embedding similarity
`InMemoryBM25Retriever(store)`	BM25 keyword retrieval
`OpenAIGenerator(model)`	Generate response with OpenAI
`OpenAIChatGenerator(model)`	Chat-based generation
`PromptBuilder(template)`	Build prompts from Jinja2 templates
`DocumentJoiner(join_mode)`	Merge results from multiple retrievers
`HuggingFaceAPIDocumentEmbedder(api_type, api_params)`	Embeddings via Hugging Face inference APIs
`SentenceTransformersDocumentEmbedder(model)`	Sentence-Transformers embeddings
`PyPDFToDocument()`	Convert PDF to Document objects
`RecursiveDocumentSplitter(split_length)`	Chunk documents recursively
`DocumentWriter(store)`	Write documents to a store
`MetadataRouter(rules)`	Route documents by metadata
`AnswerBuilder()`	Format answers from generator output
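
The `add_component` / `connect` / `run` pattern in the table can be illustrated with a toy stand-in. `ToyPipeline` below is hypothetical and far simpler than Haystack's `Pipeline`: it does no type checking, runs components in insertion order, and returns every component's output.

```python
from typing import Any, Callable, Dict, List, Tuple

Component = Callable[..., Dict[str, Any]]


class ToyPipeline:
    """Hypothetical illustration of named-port wiring, not Haystack code."""

    def __init__(self) -> None:
        self.components: Dict[str, Component] = {}
        self.edges: List[Tuple[str, str, str, str]] = []

    def add_component(self, name: str, fn: Component) -> None:
        self.components[name] = fn

    def connect(self, sender: str, receiver: str) -> None:
        # "strip.text" is split into component name and port name.
        s_name, s_port = sender.split(".")
        r_name, r_port = receiver.split(".")
        self.edges.append((s_name, s_port, r_name, r_port))

    def run(self, data: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
        inputs = {name: dict(data.get(name, {})) for name in self.components}
        outputs: Dict[str, Dict[str, Any]] = {}
        # Assumes components were added in dependency order.
        for name, fn in self.components.items():
            outputs[name] = fn(**inputs[name])
            # Forward each connected output port to its receiver's input port.
            for s_name, s_port, r_name, r_port in self.edges:
                if s_name == name:
                    inputs[r_name][r_port] = outputs[name][s_port]
        return outputs


def strip(text: str) -> Dict[str, Any]:
    return {"text": text.strip()}


def shout(text: str) -> Dict[str, Any]:
    return {"text": text.upper() + "!"}


toy = ToyPipeline()
toy.add_component("strip", strip)
toy.add_component("shout", shout)
toy.connect("strip.text", "shout.text")
result = toy.run({"strip": {"text": "  hello "}})
print(result["shout"]["text"])  # HELLO!
```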

Advanced Usage

Indexing Pipeline

from haystack import Pipeline, Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders import OpenAIDocumentEmbedder
from haystack.components.preprocessors import DocumentSplitter, DocumentCleaner
from haystack.components.converters import PyPDFToDocument
from haystack.components.writers import DocumentWriter
from haystack.components.routers import FileTypeRouter

store = InMemoryDocumentStore()

# Build indexing pipeline
indexing = Pipeline()
indexing.add_component("pdf_converter",  PyPDFToDocument())
indexing.add_component("cleaner",        DocumentCleaner())
indexing.add_component("splitter",       DocumentSplitter(
    split_by="word",
    split_length=250,
    split_overlap=30
))
indexing.add_component("embedder",       OpenAIDocumentEmbedder(
    model="text-embedding-3-small"
))
indexing.add_component("writer",         DocumentWriter(document_store=store))

# Wire components
indexing.connect("pdf_converter.documents", "cleaner.documents")
indexing.connect("cleaner.documents",       "splitter.documents")
indexing.connect("splitter.documents",      "embedder.documents")
indexing.connect("embedder.documents",      "writer.documents")

# Run indexing
result = indexing.run({
    "pdf_converter": {"sources": ["document.pdf", "report.pdf"]}
})
print(f"Indexed {result['writer']['documents_written']} chunks")

# Add documents directly
docs = [
    Document(content="Haystack is a RAG framework.", meta={"source": "docs"}),
    Document(content="pgvector extends PostgreSQL.", meta={"source": "blog"})
]
embedder = OpenAIDocumentEmbedder(model="text-embedding-3-small")
store.write_documents(embedder.run(docs)["documents"])
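
To see how `split_length` and `split_overlap` interact, here is a stdlib sketch of word-based chunking with overlap. It only approximates what a word splitter does; `split_words` is a hypothetical helper and Haystack's own handling of edge cases may differ.

```python
from typing import List


def split_words(text: str, split_length: int, split_overlap: int) -> List[str]:
    """Cut text into chunks of split_length words, each sharing
    split_overlap words with the previous chunk."""
    if split_overlap >= split_length:
        raise ValueError("split_overlap must be smaller than split_length")
    words = text.split()
    step = split_length - split_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + split_length]))
        # Stop once the current chunk reaches the end of the text.
        if start + split_length >= len(words):
            break
    return chunks


text = " ".join(f"w{i}" for i in range(10))
for chunk in split_words(text, split_length=4, split_overlap=1):
    print(chunk)
# w0 w1 w2 w3
# w3 w4 w5 w6
# w6 w7 w8 w9
```

With `split_length=250` and `split_overlap=30` as in the pipeline above, each chunk starts 220 words after the previous one.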

RAG Query Pipeline

from haystack import Pipeline
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

template = """
Answer the question based on the given context.

Context:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}

Question: {{ question }}
"""

rag = Pipeline()
rag.add_component("query_embedder", OpenAITextEmbedder(model="text-embedding-3-small"))
rag.add_component("retriever",      InMemoryEmbeddingRetriever(
    document_store=store, top_k=5
))
rag.add_component("prompt",         PromptBuilder(template=template))
rag.add_component("generator",      OpenAIGenerator(
    model="gpt-4o-mini",
    generation_kwargs={"temperature": 0.2, "max_tokens": 512}
))

rag.connect("query_embedder.embedding", "retriever.query_embedding")
rag.connect("retriever.documents",      "prompt.documents")
rag.connect("prompt.prompt",            "generator.prompt")

# Run the RAG pipeline
result = rag.run({
    "query_embedder": {"text": "What is Haystack used for?"},
    "prompt":         {"question": "What is Haystack used for?"}
})

print(result["generator"]["replies"][0])
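
It can help to picture the string the generator actually receives. The sketch below mimics the template with plain string formatting instead of Jinja2, so whitespace will differ from the real `PromptBuilder` output; `render_prompt` is a hypothetical helper.

```python
from typing import List


def render_prompt(question: str, contents: List[str]) -> str:
    """Approximate the rendered template: one retrieved document per line."""
    context = "\n".join(contents)
    return (
        "Answer the question based on the given context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


prompt = render_prompt(
    "What is Haystack used for?",
    ["Haystack is a RAG framework.", "pgvector extends PostgreSQL."],
)
print(prompt)
```

Every retrieved document is pasted into the prompt, so `top_k` on the retriever directly controls prompt length.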

Hybrid Retrieval Pipeline

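# Reuses `store` and `template` from the sections above
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import (
    InMemoryEmbeddingRetriever,
    InMemoryBM25Retriever
)
from haystack.components.joiners import DocumentJoiner
from haystack.components.rankers import TransformersSimilarityRanker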

hybrid = Pipeline()
hybrid.add_component("query_embedder",  OpenAITextEmbedder(model="text-embedding-3-small"))
hybrid.add_component("vector_retriever", InMemoryEmbeddingRetriever(store, top_k=10))
hybrid.add_component("bm25_retriever",  InMemoryBM25Retriever(store, top_k=10))
hybrid.add_component("joiner",          DocumentJoiner(join_mode="reciprocal_rank_fusion"))
hybrid.add_component("ranker",          TransformersSimilarityRanker(
    model="cross-encoder/ms-marco-MiniLM-L-6-v2",
    top_k=5
))
hybrid.add_component("prompt",          PromptBuilder(template=template))
hybrid.add_component("generator",       OpenAIGenerator(model="gpt-4o-mini"))

hybrid.connect("query_embedder.embedding",     "vector_retriever.query_embedding")
hybrid.connect("vector_retriever.documents",   "joiner.documents")
hybrid.connect("bm25_retriever.documents",     "joiner.documents")
hybrid.connect("joiner.documents",             "ranker.documents")
hybrid.connect("ranker.documents",             "prompt.documents")
hybrid.connect("prompt.prompt",                "generator.prompt")

result = hybrid.run({
    "query_embedder":  {"text": "RAG pipeline performance"},
    "bm25_retriever":  {"query": "RAG pipeline performance"},
    "ranker":          {"query": "RAG pipeline performance"},
    "prompt":          {"question": "RAG pipeline performance"}
})
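
Reciprocal rank fusion merges ranked lists using only positions, never raw scores, which is why it works across BM25 and embedding retrievers. Below is a stdlib sketch of the standard formula with the commonly used constant `k=60`; the constants and score scaling inside `DocumentJoiner` may differ, and the document IDs are made up.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Score each document as the sum of 1 / (k + rank) over all lists
    it appears in, then sort by descending score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


vector_hits = ["a", "b", "c"]  # hypothetical embedding-retriever order
bm25_hits = ["b", "d", "a"]    # hypothetical BM25 order

print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
# ['b', 'a', 'd', 'c']
```

Documents found by both retrievers ("a" and "b") rise above documents found by only one.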

Custom Component

from haystack import component, default_from_dict, default_to_dict
from haystack.dataclasses import Document
from typing import Optional, List

@component
class KeywordFilter:
    """Filter documents by required keywords."""

    def __init__(self, keywords: List[str], require_all: bool = False):
        self.keywords  = [k.lower() for k in keywords]
        self.require_all = require_all

    @component.output_types(documents=List[Document])
    def run(self, documents: List[Document]) -> dict:
        results = []
        for doc in documents:
            text = doc.content.lower()
            matches = [k in text for k in self.keywords]
            if self.require_all:
                if all(matches):
                    results.append(doc)
            else:
                if any(matches):
                    results.append(doc)
        return {"documents": results}

    def to_dict(self) -> dict:
        return default_to_dict(self, keywords=self.keywords, require_all=self.require_all)

    @classmethod
    def from_dict(cls, data: dict):
        return default_from_dict(cls, data)

# Use in a pipeline (assumes `pipeline` is an existing Pipeline with a "retriever" component)
pipeline.add_component("keyword_filter", KeywordFilter(["vector", "embedding"]))
pipeline.connect("retriever.documents", "keyword_filter.documents")

Pipeline Serialization

from haystack import Pipeline

# Save pipeline to YAML (Pipeline.dump uses Haystack's YAML marshaller by default)
with open("rag_pipeline.yaml", "w") as f:
    rag.dump(f)

# Load pipeline from YAML
# Only the pipeline definition is stored; documents held in an
# InMemoryDocumentStore must be re-indexed after loading
with open("rag_pipeline.yaml") as f:
    loaded_pipeline = Pipeline.load(f)

# Run loaded pipeline
result = loaded_pipeline.run({
    "query_embedder": {"text": "my question"},
    "prompt":         {"question": "my question"}
})

Common Workflows

Pipeline Evaluation

from haystack.evaluation import EvaluationRunResult
from haystack.components.evaluators import (
    FaithfulnessEvaluator,
    ContextRelevanceEvaluator,
    SASEvaluator
)

# Faithfulness — does the answer stay true to the retrieved context?
faithfulness = FaithfulnessEvaluator()

eval_result = faithfulness.run(
    questions=["What is RAG?", "How does HNSW work?"],
    contexts=[["RAG combines retrieval with generation."], ["HNSW is a graph-based ANN index."]],
    predicted_answers=["RAG stands for Retrieval-Augmented Generation.", "HNSW uses hierarchical graph layers."]
)
print(f"Mean faithfulness: {eval_result['score']:.3f}")

# Context relevance
relevance = ContextRelevanceEvaluator()
eval_result = relevance.run(
    questions=["What is RAG?"],
    contexts=[["RAG combines retrieval with generation."]]
)
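
The evaluators above rely on an LLM. A retrieval metric such as recall@k needs only labeled document IDs and can be computed directly. This is a stdlib sketch with a hypothetical `recall_at_k` helper and made-up IDs, not a Haystack component.

```python
from typing import List, Set


def recall_at_k(retrieved_ids: List[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    hits = len(set(retrieved_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


retrieved = ["d3", "d1", "d7", "d2"]  # retriever output, best first
relevant = {"d1", "d2"}               # ground-truth labels for the query

print(recall_at_k(retrieved, relevant, k=2))  # 0.5
print(recall_at_k(retrieved, relevant, k=4))  # 1.0
```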

Hayhooks REST API

pip install hayhooks

# Serve a pipeline YAML as REST endpoint
hayhooks run --pipelines-dir ./pipelines

# Now call the pipeline via REST
curl -X POST http://localhost:1416/rag_pipeline/run \
  -H "Content-Type: application/json" \
  -d '{
    "query_embedder": {"text": "What is Haystack?"},
    "prompt": {"question": "What is Haystack?"}
  }'
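
The same call can be made from Python with the standard library only. The sketch assumes the endpoint path and port from the curl example above; `build_request` and `call_pipeline` are hypothetical helper names, and the response shape depends on the Hayhooks version.

```python
import json
import urllib.request


def build_request(question: str, base_url: str = "http://localhost:1416") -> urllib.request.Request:
    """Build the POST request, feeding the question to both inputs."""
    payload = {
        "query_embedder": {"text": question},
        "prompt": {"question": question},
    }
    return urllib.request.Request(
        url=f"{base_url}/rag_pipeline/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_pipeline(question: str) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(question), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request("What is Haystack?")
print(request.get_method(), request.full_url)
# POST http://localhost:1416/rag_pipeline/run
```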

Tips and Best Practices

Tip	Details
Use `pipeline.draw()` early	Visualize the wiring before running to catch connection errors
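Warm up components with `.warm_up()`	Components that load local models (e.g., SentenceTransformersDocumentEmbedder) load them in `warm_up()`; `Pipeline.run()` calls it automatically, but call it yourself before running such a component standalone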
Pass both `text` and `question` separately	PromptBuilder and embedder inputs are independent; pass the query string to both in `run()`
Use `DocumentJoiner` for hybrid	`join_mode="reciprocal_rank_fusion"` merges by rank rather than raw score, which usually works better than simple concatenation when BM25 and embedding scores are on different scales
Chunk at 250-500 words with 10-15% overlap	`DocumentSplitter(split_by="word", split_length=300, split_overlap=30)` is a safe default
Cache embeddings during indexing	Re-embedding is expensive; store indexed documents persistently and skip re-indexing if unchanged
Use `DocumentCleaner` before splitting	Removes whitespace noise and boilerplate that degrades chunk quality
Set `meta` fields consistently	Use uniform metadata keys across documents for reliable filtering downstream
Evaluate with ground truth	Build a small labeled test set; track faithfulness and recall@k as you change pipeline parameters
Pin Haystack version	Haystack 2.x has frequent API changes; pin `haystack-ai==2.x.y` in production `requirements.txt`
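
The "cache embeddings" tip can be sketched with a content hash: embed only texts whose hash has not been seen before. This is a stdlib illustration with hypothetical helper names; in a real setup the hash set would be persisted alongside the document store.

```python
import hashlib
from typing import List, Set


def content_hash(text: str) -> str:
    """Stable fingerprint of a chunk's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def select_unindexed(texts: List[str], indexed_hashes: Set[str]) -> List[str]:
    """Return texts not indexed yet and record their hashes."""
    new_texts = []
    for text in texts:
        key = content_hash(text)
        if key not in indexed_hashes:
            indexed_hashes.add(key)
            new_texts.append(text)
    return new_texts


indexed: Set[str] = set()
first_run = select_unindexed(["chunk one", "chunk two"], indexed)
second_run = select_unindexed(["chunk one", "chunk two (edited)"], indexed)

print(first_run)   # ['chunk one', 'chunk two']
print(second_run)  # ['chunk two (edited)']
```

Only the texts returned by `select_unindexed` need to be passed to the embedder and writer.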