Haystack Cheat Sheet
Overview
Haystack is an end-to-end framework for building retrieval-augmented generation (RAG), question answering, and document search systems. Developed by deepset, it follows a pipeline-based architecture where modular components — document converters, preprocessors, embedders, retrievers, rankers, and generators — are connected by named input/output ports into dataflow graphs.
Haystack 2.x (the current major version) introduced a complete redesign with type-safe component interfaces, declarative pipeline serialization to YAML, and a clean separation between components (stateless processors) and document stores (persistent vector backends). Supported backends include InMemoryDocumentStore, Elasticsearch, OpenSearch, Chroma, Qdrant, Weaviate, and pgvector.
The framework is LLM-agnostic, with official integrations for OpenAI, Anthropic, Cohere, Mistral, the Hugging Face Inference API, and local models via Ollama. Haystack also provides tools for pipeline evaluation (faithfulness, context recall, RAGAS metrics), REST API deployment via Hayhooks, and a visual pipeline builder in deepset Studio.
Installation
Core and Integrations
# Core Haystack 2.x
pip install haystack-ai
# Document store backends
pip install chroma-haystack # ChromaDB
pip install qdrant-haystack # Qdrant
pip install elasticsearch-haystack # Elasticsearch
pip install weaviate-haystack # Weaviate
# LLM and embedding integrations
pip install openai # OpenAI (included in core)
pip install cohere-haystack # Cohere
pip install amazon-bedrock-haystack # AWS Bedrock
pip install ollama-haystack # Ollama local models
# Document processing
pip install pypdf trafilatura # PDF and web scraping
pip install sentence-transformers # Local embedding models
# Evaluation
pip install ragas-haystack # RAGAS metrics integration
pip install haystack-experimental # Experimental components and features
# REST API serving
pip install hayhooks # FastAPI wrapper for pipelines
# Full stack (common setup)
pip install haystack-ai pypdf sentence-transformers
Environment Variables
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export COHERE_API_KEY="..."
export HF_API_TOKEN="hf_..."
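Integrations generally read these variables when a component is constructed or first used, so a missing key tends to surface late. A minimal sketch of a startup check follows; `missing_env_vars` is a hypothetical helper, not part of Haystack, and the list of required names is an assumption you would adapt to the integrations you actually use.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(required: Iterable[str],
                     env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


# Explicit mapping so the example does not depend on the current shell.
example_env = {"OPENAI_API_KEY": "sk-...", "COHERE_API_KEY": ""}
missing = missing_env_vars(
    ["OPENAI_API_KEY", "COHERE_API_KEY", "HF_API_TOKEN"], example_env
)
print(missing)
```

Calling `missing_env_vars([...])` without the second argument checks the real process environment.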
Configuration
Document Stores
# In-Memory (development and testing)
from haystack.document_stores.in_memory import InMemoryDocumentStore
store = InMemoryDocumentStore(
    embedding_similarity_function="cosine"  # or "dot_product"
)
# ChromaDB
from haystack_integrations.document_stores.chroma import ChromaDocumentStore
store = ChromaDocumentStore(
    collection_name="my_docs",
    persist_path="./chroma_haystack"
)
# Qdrant
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
store = QdrantDocumentStore(
    url="http://localhost:6333",
    index="haystack_docs",
    embedding_dim=1536,  # must match the embedding model (1536 for text-embedding-3-small)
    recreate_index=False
)
# Elasticsearch
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore
store = ElasticsearchDocumentStore(
    hosts="http://localhost:9200",
    index="haystack"
)
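The `embedding_similarity_function` option on the in-memory store chooses between cosine similarity and dot product. The sketch below is plain Python, not Haystack's implementation, and the vectors are made up; it only illustrates that the two measures can rank the same documents differently when embeddings are not normalized.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; sensitive to vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the normalized vectors; depends on direction only."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm if norm else 0.0


query = [1.0, 0.0]
short_doc = [1.0, 0.0]  # same direction as the query, small magnitude
long_doc = [3.0, 3.0]   # different direction, large magnitude

print(dot_product(query, short_doc), dot_product(query, long_doc))
print(cosine(query, short_doc), round(cosine(query, long_doc), 4))
```

Here the dot product prefers `long_doc` while cosine prefers `short_doc`. With unit-length embeddings the two measures produce the same ranking.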
Core Commands/API
| Component / Method | Description |
|---|---|
| Pipeline() | Create a new pipeline |
| pipeline.add_component(name, component) | Register a component under a unique name |
| pipeline.connect(sender, receiver) | Wire an output socket to an input socket, e.g. "retriever.documents" to "prompt.documents" |
| pipeline.run(data) | Execute the pipeline with a dict of per-component inputs |
| pipeline.to_dict() | Serialize pipeline to dict |
| Pipeline.from_dict(data) | Deserialize pipeline from dict |
| pipeline.draw("pipeline.png") | Visualize pipeline graph |
| OpenAIDocumentEmbedder(model) | Embed documents with OpenAI |
| OpenAITextEmbedder(model) | Embed a query string |
| InMemoryEmbeddingRetriever(store) | Retrieve by embedding similarity |
| InMemoryBM25Retriever(store) | BM25 keyword retrieval |
| OpenAIGenerator(model) | Generate response with OpenAI |
| OpenAIChatGenerator(model) | Chat-based generation |
| PromptBuilder(template) | Build prompts from Jinja2 templates |
| DocumentJoiner(join_mode) | Merge results from multiple retrievers |
| HuggingFaceAPIDocumentEmbedder(api_type, api_params) | Embeddings through Hugging Face APIs |
| SentenceTransformersDocumentEmbedder(model) | Local Sentence-Transformers embeddings |
| PyPDFToDocument() | Convert PDF to Document objects |
| RecursiveDocumentSplitter(split_length) | Chunk documents recursively |
| DocumentWriter(store) | Write documents to a store |
| MetadataRouter(rules) | Route documents by metadata |
| AnswerBuilder() | Format answers from generator output |
Advanced Usage
Indexing Pipeline
from haystack import Pipeline, Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders import OpenAIDocumentEmbedder
from haystack.components.preprocessors import DocumentSplitter, DocumentCleaner
from haystack.components.converters import PyPDFToDocument
from haystack.components.writers import DocumentWriter
from haystack.components.routers import FileTypeRouter
store = InMemoryDocumentStore()
# Build indexing pipeline
indexing = Pipeline()
indexing.add_component("pdf_converter", PyPDFToDocument())
indexing.add_component("cleaner", DocumentCleaner())
indexing.add_component("splitter", DocumentSplitter(
    split_by="word",
    split_length=250,
    split_overlap=30
))
indexing.add_component("embedder", OpenAIDocumentEmbedder(
    model="text-embedding-3-small"
))
indexing.add_component("writer", DocumentWriter(document_store=store))
# Wire components
indexing.connect("pdf_converter.documents", "cleaner.documents")
indexing.connect("cleaner.documents", "splitter.documents")
indexing.connect("splitter.documents", "embedder.documents")
indexing.connect("embedder.documents", "writer.documents")
# Run indexing
result = indexing.run({
    "pdf_converter": {"sources": ["document.pdf", "report.pdf"]}
})
print(f"Indexed {result['writer']['documents_written']} chunks")
# Add documents directly
docs = [
    Document(content="Haystack is a RAG framework.", meta={"source": "docs"}),
    Document(content="pgvector extends PostgreSQL.", meta={"source": "blog"})
]
embedder = OpenAIDocumentEmbedder(model="text-embedding-3-small")
store.write_documents(embedder.run(docs)["documents"])
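To make the `split_length` and `split_overlap` settings in the indexing pipeline concrete, here is a rough plain-Python sketch of word-based chunking with overlap. `split_words` is a hypothetical helper written for illustration; Haystack's `DocumentSplitter` may treat whitespace and the final chunk differently.

```python
from typing import List


def split_words(text: str, split_length: int = 250, split_overlap: int = 30) -> List[str]:
    """Split text into chunks of at most `split_length` words.

    Consecutive chunks share `split_overlap` words.
    """
    if split_length <= 0:
        raise ValueError("split_length must be positive")
    if not 0 <= split_overlap < split_length:
        raise ValueError("split_overlap must be in [0, split_length)")
    words = text.split()
    step = split_length - split_overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + split_length]))
        if start + split_length >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = split_words(sample, split_length=4, split_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each chunk starts `split_length - split_overlap` words after the previous one, so a larger overlap produces more chunks and more embedding calls for the same text.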
RAG Query Pipeline
from haystack import Pipeline
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
template = """
Answer the question based on the given context.
Context:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Question: {{ question }}
"""
rag = Pipeline()
rag.add_component("query_embedder", OpenAITextEmbedder(model="text-embedding-3-small"))
rag.add_component("retriever", InMemoryEmbeddingRetriever(
    document_store=store, top_k=5
))
rag.add_component("prompt", PromptBuilder(template=template))
rag.add_component("generator", OpenAIGenerator(
    model="gpt-4o-mini",
    generation_kwargs={"temperature": 0.2, "max_tokens": 512}
))
rag.connect("query_embedder.embedding", "retriever.query_embedding")
rag.connect("retriever.documents", "prompt.documents")
rag.connect("prompt.prompt", "generator.prompt")
# Run the RAG pipeline
result = rag.run({
    "query_embedder": {"text": "What is Haystack used for?"},
    "prompt": {"question": "What is Haystack used for?"}
})
print(result["generator"]["replies"][0])
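The query string has to reach two components: the text embedder and the prompt template. A small helper keeps the two copies in sync. `build_rag_inputs` is a hypothetical convenience function, and the component names `query_embedder` and `prompt` are assumed to match the pipeline defined in this section.

```python
from typing import Any, Dict


def build_rag_inputs(question: str) -> Dict[str, Dict[str, Any]]:
    """Build the run() input dict, sending one question to both components."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    return {
        "query_embedder": {"text": question},  # embedded for retrieval
        "prompt": {"question": question},      # rendered into the template
    }


inputs = build_rag_inputs("What is Haystack used for?")
print(inputs)
```

With the pipeline from this section, the call would then look like `rag.run(build_rag_inputs("What is Haystack used for?"))`.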
Hybrid Retrieval Pipeline
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import (
    InMemoryEmbeddingRetriever,
    InMemoryBM25Retriever
)
from haystack.components.joiners import DocumentJoiner
from haystack.components.rankers import TransformersSimilarityRanker
# Reuses `store` and `template` from the sections above
hybrid = Pipeline()
# Pass model by keyword: the first positional parameter of the OpenAI components is api_key
hybrid.add_component("query_embedder", OpenAITextEmbedder(model="text-embedding-3-small"))
hybrid.add_component("vector_retriever", InMemoryEmbeddingRetriever(document_store=store, top_k=10))
hybrid.add_component("bm25_retriever", InMemoryBM25Retriever(document_store=store, top_k=10))
hybrid.add_component("joiner", DocumentJoiner(join_mode="reciprocal_rank_fusion"))
hybrid.add_component("ranker", TransformersSimilarityRanker(
    model="cross-encoder/ms-marco-MiniLM-L-6-v2",
    top_k=5
))
hybrid.add_component("prompt", PromptBuilder(template=template))
hybrid.add_component("generator", OpenAIGenerator(model="gpt-4o-mini"))
hybrid.connect("query_embedder.embedding", "vector_retriever.query_embedding")
# Both retrievers feed the joiner's single documents input
hybrid.connect("vector_retriever.documents", "joiner.documents")
hybrid.connect("bm25_retriever.documents", "joiner.documents")
hybrid.connect("joiner.documents", "ranker.documents")
hybrid.connect("ranker.documents", "prompt.documents")
hybrid.connect("prompt.prompt", "generator.prompt")
# The same query goes to every component that needs it
query = "RAG pipeline performance"
result = hybrid.run({
    "query_embedder": {"text": query},
    "bm25_retriever": {"query": query},
    "ranker": {"query": query},
    "prompt": {"question": query}
})
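For intuition about the `reciprocal_rank_fusion` join mode, the sketch below implements the textbook formulation in plain Python: each document scores the sum of `1 / (k + rank)` over the lists it appears in. The constant `k = 60` and the tie-breaking rule are assumptions; Haystack's `DocumentJoiner` may use different constants or normalize the scores.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]  # hypothetical IDs from the embedding retriever
bm25_hits = ["c", "a", "d"]    # hypothetical IDs from the BM25 retriever
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents found by both retrievers (`a` and `c`) rise above those found by only one, which is the main reason to prefer fusion over plain concatenation.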
Custom Component
from typing import List
from haystack import component, default_from_dict, default_to_dict
from haystack.dataclasses import Document

@component
class KeywordFilter:
    """Filter documents by required keywords."""

    def __init__(self, keywords: List[str], require_all: bool = False):
        self.keywords = [k.lower() for k in keywords]
        self.require_all = require_all

    @component.output_types(documents=List[Document])
    def run(self, documents: List[Document]) -> dict:
        results = []
        for doc in documents:
            text = (doc.content or "").lower()  # content can be None
            matches = [k in text for k in self.keywords]
            # Keep the document if all keywords match (require_all) or at least one does
            if all(matches) if self.require_all else any(matches):
                results.append(doc)
        return {"documents": results}

    def to_dict(self) -> dict:
        return default_to_dict(self, keywords=self.keywords, require_all=self.require_all)

    @classmethod
    def from_dict(cls, data: dict):
        return default_from_dict(cls, data)

# Use in a pipeline (assumes an existing `pipeline` with a "retriever" component)
pipeline.add_component("keyword_filter", KeywordFilter(["vector", "embedding"]))
pipeline.connect("retriever.documents", "keyword_filter.documents")
Pipeline Serialization
import yaml
from haystack import Pipeline
# Save pipeline to YAML (rag.dump(f) is the built-in equivalent)
pipeline_dict = rag.to_dict()
with open("rag_pipeline.yaml", "w") as f:
    yaml.dump(pipeline_dict, f)
# Load pipeline from YAML (Pipeline.load(f) is the built-in equivalent)
with open("rag_pipeline.yaml") as f:
    pipeline_data = yaml.safe_load(f)
loaded_pipeline = Pipeline.from_dict(pipeline_data)
# Run loaded pipeline
result = loaded_pipeline.run({
    "query_embedder": {"text": "my question"},
    "prompt": {"question": "my question"}
})
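Hand-edited pipeline files can end up with connections that point at components that no longer exist. The sketch below checks a serialized dict before it is handed to `Pipeline.from_dict`. It assumes the dict has a `components` mapping and a `connections` list of `sender`/`receiver` strings in `component.socket` form; verify that layout against the output of `to_dict()` for your Haystack version. `dangling_connections` is a hypothetical helper.

```python
from typing import Any, Dict, List


def dangling_connections(pipeline_data: Dict[str, Any]) -> List[str]:
    """Return connection endpoints whose component name is not declared."""
    components = set(pipeline_data.get("components", {}))
    problems: List[str] = []
    for connection in pipeline_data.get("connections", []):
        for role in ("sender", "receiver"):
            endpoint = connection.get(role, "")
            component_name = endpoint.split(".", 1)[0]
            if component_name not in components:
                problems.append(endpoint)
    return problems


# Hand-written example in the assumed layout; "retreiver" is a deliberate typo.
example = {
    "components": {"query_embedder": {}, "retriever": {}, "prompt": {}},
    "connections": [
        {"sender": "query_embedder.embedding", "receiver": "retriever.query_embedding"},
        {"sender": "retreiver.documents", "receiver": "prompt.documents"},
    ],
}
print(dangling_connections(example))
```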
Common Workflows
Pipeline Evaluation
from haystack.evaluation import EvaluationRunResult
from haystack.components.evaluators import (
    FaithfulnessEvaluator,
    ContextRelevanceEvaluator,
    SASEvaluator
)
# Faithfulness: does the answer stay true to the retrieved context?
# LLM-based evaluator; uses OpenAI by default, so OPENAI_API_KEY must be set
faithfulness = FaithfulnessEvaluator()
eval_result = faithfulness.run(
    questions=["What is RAG?", "How does HNSW work?"],
    contexts=[["RAG combines retrieval with generation."], ["HNSW is a graph-based ANN index."]],
    predicted_answers=["RAG stands for Retrieval-Augmented Generation.", "HNSW uses hierarchical graph layers."]
)
print(f"Mean faithfulness: {eval_result['score']:.3f}")
# Context relevance
relevance = ContextRelevanceEvaluator()
eval_result = relevance.run(
    questions=["What is RAG?"],
    contexts=[["RAG combines retrieval with generation."]]
)
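Retrieval quality can be tracked alongside the LLM-based evaluators with a simple recall@k over a labeled test set. The sketch below is plain Python with made-up document IDs. `recall_at_k` is a hypothetical helper, not a Haystack API.

```python
from typing import Sequence, Set


def recall_at_k(retrieved_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top-k results."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


retrieved = ["d1", "d7", "d3", "d9"]  # ranked retriever output (hypothetical)
relevant = {"d3", "d4"}               # ground-truth labels (hypothetical)
print(recall_at_k(retrieved, relevant, k=3))
```

Averaging this value over all test questions gives a single number to compare when changing `top_k`, chunk size, or the embedding model.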
Hayhooks REST API
pip install hayhooks
# Serve a pipeline YAML as REST endpoint
hayhooks run --pipelines-dir ./pipelines
# Now call the pipeline via REST
curl -X POST http://localhost:1416/rag_pipeline/run \
  -H "Content-Type: application/json" \
  -d '{
    "query_embedder": {"text": "What is Haystack?"},
    "prompt": {"question": "What is Haystack?"}
  }'
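The same call can be made from Python using only the standard library. This sketch builds the request object without sending it, so it runs without a server. The URL, pipeline name, and payload shape are copied from the curl example above, and `build_run_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_run_request(question: str,
                      base_url: str = "http://localhost:1416",
                      pipeline_name: str = "rag_pipeline") -> urllib.request.Request:
    """Build a POST request equivalent to the curl call above."""
    payload = {
        "query_embedder": {"text": question},
        "prompt": {"question": question},
    }
    return urllib.request.Request(
        url=f"{base_url}/{pipeline_name}/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request("What is Haystack?")
print(request.get_method(), request.full_url)

# Sending it requires a running Hayhooks server:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.loads(response.read()))
```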
Tips and Best Practices
| Tip | Details |
|---|---|
| Use pipeline.draw() early | Visualize the wiring before running to catch connection errors |
| Warm up components with .warm_up() | Components that load local models (e.g., SentenceTransformersDocumentEmbedder) need warm_up() before standalone use; Pipeline.run() calls it for you |
| Pass the query to both the embedder and the prompt | The embedder's text input and PromptBuilder's question variable are independent; supply the same query string to each in pipeline.run() |
| Use DocumentJoiner for hybrid | join_mode="reciprocal_rank_fusion" usually ranks better than simple concatenation for hybrid search |
| Chunk at 250-500 words with 10-15% overlap | DocumentSplitter(split_by="word", split_length=300, split_overlap=30) is a safe default |
| Cache embeddings during indexing | Re-embedding is expensive; store indexed documents persistently and skip re-indexing if unchanged |
| Use DocumentCleaner before splitting | Removes whitespace noise and boilerplate that degrades chunk quality |
| Set meta fields consistently | Use uniform metadata keys across documents for reliable filtering downstream |
| Evaluate with ground truth | Build a small labeled test set; track faithfulness and recall@k as you change pipeline parameters |
| Pin Haystack version | Haystack 2.x minor releases can change APIs; pin haystack-ai==2.x.y in production requirements.txt |
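One way to act on the embedding-cache tip is to key embeddings by a hash of the chunk text, so unchanged chunks are never sent to the embedding model twice. The sketch below is an in-memory illustration in plain Python. `EmbeddingCache` and `fake_embed` are hypothetical names, and a real setup would persist the cache (or rely on the document store) rather than keep it in a dict.

```python
import hashlib
from typing import Callable, Dict, List, Sequence


class EmbeddingCache:
    """Wrap an embedding function and skip texts that were embedded before."""

    def __init__(self, embed_fn: Callable[[Sequence[str]], List[List[float]]]):
        self._embed_fn = embed_fn
        self._cache: Dict[str, List[float]] = {}

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed(self, texts: Sequence[str]) -> List[List[float]]:
        # Collect each uncached text once, preserving order.
        missing: List[str] = []
        seen = set()
        for text in texts:
            key = self._key(text)
            if key not in self._cache and key not in seen:
                seen.add(key)
                missing.append(text)
        if missing:
            for text, vector in zip(missing, self._embed_fn(missing)):
                self._cache[self._key(text)] = vector
        return [self._cache[self._key(text)] for text in texts]


calls: List[List[str]] = []


def fake_embed(texts: Sequence[str]) -> List[List[float]]:
    """Stand-in for a real embedder; records what it was asked to embed."""
    calls.append(list(texts))
    return [[float(len(text))] for text in texts]


cache = EmbeddingCache(fake_embed)
first = cache.embed(["a", "bb"])
second = cache.embed(["bb", "ccc"])  # only "ccc" reaches the embedder
print(calls)
```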