Haystack Cheat Sheet

Overview

Haystack is an end-to-end framework for building retrieval-augmented generation (RAG), question answering, and document search systems. Developed by deepset, it follows a pipeline-based architecture where modular components — document converters, preprocessors, embedders, retrievers, rankers, and generators — are connected by named input/output ports into dataflow graphs.
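
To make the idea of components wired by named ports concrete, here is a toy sketch in plain Python. It is not Haystack's implementation: `ToyPipeline`, `splitter`, and `counter` are hypothetical names invented for illustration, and the sketch assumes components are registered in a valid execution order.

```python
from typing import Callable, Dict, List, Tuple

# Toy illustration only, NOT Haystack's implementation.
# A component is a function that takes keyword inputs and returns a dict of named outputs.
Component = Callable[..., Dict[str, object]]


class ToyPipeline:
    def __init__(self) -> None:
        self.components: Dict[str, Component] = {}
        # Each edge is (source component, output port, target component, input port).
        self.edges: List[Tuple[str, str, str, str]] = []

    def add_component(self, name: str, fn: Component) -> None:
        self.components[name] = fn

    def connect(self, sender: str, receiver: str) -> None:
        src, out_port = sender.split(".")
        dst, in_port = receiver.split(".")
        self.edges.append((src, out_port, dst, in_port))

    def run(self, inputs: Dict[str, Dict[str, object]]) -> Dict[str, Dict[str, object]]:
        pending = {name: dict(kwargs) for name, kwargs in inputs.items()}
        results: Dict[str, Dict[str, object]] = {}
        # Assumption: registration order is already a valid topological order.
        for name, fn in self.components.items():
            outputs = fn(**pending.get(name, {}))
            results[name] = outputs
            # Forward each named output to the input port it is wired to.
            for src, out_port, dst, in_port in self.edges:
                if src == name:
                    pending.setdefault(dst, {})[in_port] = outputs[out_port]
        return results


def splitter(text: str) -> Dict[str, object]:
    return {"words": text.split()}


def counter(words: List[str]) -> Dict[str, object]:
    return {"count": len(words)}


toy = ToyPipeline()
toy.add_component("splitter", splitter)
toy.add_component("counter", counter)
toy.connect("splitter.words", "counter.words")
result = toy.run({"splitter": {"text": "retrieval augmented generation"}})
print(result["counter"]["count"])  # 3
```

The `"component.port"` strings mirror the addressing style used by `pipeline.connect(...)` in the examples further down.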

Haystack 2.x (the current major version) introduced a complete redesign with type-safe component interfaces, declarative pipeline serialization to YAML, and a clean separation between components (stateless processors) and document stores (persistent vector backends). Supported backends include InMemoryDocumentStore, Elasticsearch, OpenSearch, Chroma, Qdrant, Weaviate, and pgvector.

The framework is LLM-agnostic, with official integrations for OpenAI, Anthropic, Cohere, Mistral, the Hugging Face Inference API, and local models via Ollama. Haystack also provides tools for pipeline evaluation (faithfulness, context relevance, and RAGAS metrics through a separate integration), REST API deployment via Hayhooks, and a visual pipeline builder in deepset Studio.

Installation

Core and Integrations

# Core Haystack 2.x
pip install haystack-ai

# Document store backends
pip install chroma-haystack           # ChromaDB
pip install qdrant-haystack           # Qdrant
pip install elasticsearch-haystack    # Elasticsearch
pip install weaviate-haystack         # Weaviate

# LLM and embedding integrations
pip install openai                    # OpenAI (included in core)
pip install cohere-haystack           # Cohere
pip install amazon-bedrock-haystack   # AWS Bedrock
pip install ollama-haystack           # Ollama local models

# Document processing
pip install pypdf trafilatura         # PDF and web scraping
pip install sentence-transformers     # Local embedding models

# Evaluation
pip install ragas-haystack            # RAGAS metrics integration

# REST API serving
pip install hayhooks                  # FastAPI wrapper for pipelines

# Full stack (common setup)
pip install haystack-ai pypdf sentence-transformers

Environment Variables

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export COHERE_API_KEY="..."
export HF_API_TOKEN="hf_..."

Configuration

Document Stores

# In-Memory (development and testing)
from haystack.document_stores.in_memory import InMemoryDocumentStore

store = InMemoryDocumentStore(
    embedding_similarity_function="cosine"  # or "dot_product"
)

# ChromaDB
from haystack_integrations.document_stores.chroma import ChromaDocumentStore

store = ChromaDocumentStore(
    collection_name="my_docs",
    persist_path="./chroma_haystack"
)

# Qdrant
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore

store = QdrantDocumentStore(
    url="http://localhost:6333",
    index="haystack_docs",
    embedding_dim=1536,
    recreate_index=False
)

# Elasticsearch
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore

store = ElasticsearchDocumentStore(
    hosts="http://localhost:9200",
    index="haystack"
)
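
The `embedding_similarity_function` option on the in-memory store chooses between cosine and dot-product scoring. The following plain-Python sketch shows how the two measures can rank the same vectors differently. The vectors are made up for illustration, and the functions are textbook definitions rather than Haystack's internal code.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; sensitive to vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the two vectors after normalizing each to unit length."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


# Hypothetical 2-dimensional embeddings.
query = [1.0, 0.0]
aligned_doc = [1.0, 0.0]   # same direction as the query, small magnitude
large_doc = [3.0, 3.0]     # different direction, large magnitude

print(dot_product(query, aligned_doc), dot_product(query, large_doc))  # 1.0 3.0
print(cosine_similarity(query, aligned_doc))                           # 1.0
print(round(cosine_similarity(query, large_doc), 3))                   # 0.707
```

Dot product favors `large_doc` because of its magnitude, while cosine favors `aligned_doc`. For embeddings that are already normalized to unit length, the two measures give the same ranking.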

Core Commands/API

Component / Method | Description
Pipeline() | Create a new pipeline
pipeline.add_component(name, component) | Register a component
pipeline.connect(from_output, to_input) | Wire component outputs to inputs
pipeline.run(inputs) | Execute the pipeline
pipeline.to_dict() | Serialize pipeline to dict
Pipeline.from_dict(data) | Deserialize pipeline from dict
pipeline.draw("pipeline.png") | Visualize pipeline graph
OpenAIDocumentEmbedder(model) | Embed documents with OpenAI
OpenAITextEmbedder(model) | Embed a query string
InMemoryEmbeddingRetriever(store) | Retrieve by embedding similarity
InMemoryBM25Retriever(store) | BM25 keyword retrieval
OpenAIGenerator(model) | Generate response with OpenAI
OpenAIChatGenerator(model) | Chat-based generation
PromptBuilder(template) | Build prompts from Jinja2 templates
DocumentJoiner(join_mode) | Merge results from multiple retrievers
HuggingFaceAPIDocumentEmbedder(api_type, api_params) | Embed documents through Hugging Face APIs
SentenceTransformersDocumentEmbedder(model) | Local Sentence-Transformers embeddings
PyPDFToDocument() | Convert PDF to Document objects
RecursiveDocumentSplitter(split_length) | Chunk documents recursively
DocumentWriter(store) | Write documents to a store
MetadataRouter(rules) | Route documents by metadata
AnswerBuilder() | Format answers from generator output

Advanced Usage

Indexing Pipeline

from haystack import Pipeline, Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders import OpenAIDocumentEmbedder
from haystack.components.preprocessors import DocumentSplitter, DocumentCleaner
from haystack.components.converters import PyPDFToDocument
from haystack.components.writers import DocumentWriter
from haystack.components.routers import FileTypeRouter

store = InMemoryDocumentStore()

# Build indexing pipeline
indexing = Pipeline()
indexing.add_component("pdf_converter",  PyPDFToDocument())
indexing.add_component("cleaner",        DocumentCleaner())
indexing.add_component("splitter",       DocumentSplitter(
    split_by="word",
    split_length=250,
    split_overlap=30
))
indexing.add_component("embedder",       OpenAIDocumentEmbedder(
    model="text-embedding-3-small"
))
indexing.add_component("writer",         DocumentWriter(document_store=store))

# Wire components
indexing.connect("pdf_converter.documents", "cleaner.documents")
indexing.connect("cleaner.documents",       "splitter.documents")
indexing.connect("splitter.documents",      "embedder.documents")
indexing.connect("embedder.documents",      "writer.documents")

# Run indexing
result = indexing.run({
    "pdf_converter": {"sources": ["document.pdf", "report.pdf"]}
})
print(f"Indexed {result['writer']['documents_written']} chunks")

# Add documents directly
docs = [
    Document(content="Haystack is a RAG framework.", meta={"source": "docs"}),
    Document(content="pgvector extends PostgreSQL.", meta={"source": "blog"})
]
embedder = OpenAIDocumentEmbedder(model="text-embedding-3-small")
store.write_documents(embedder.run(docs)["documents"])
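
To see what `split_length=250` and `split_overlap=30` mean for chunk boundaries, here is a minimal plain-Python sketch of word-based splitting with overlap. `split_words` is a hypothetical helper, not the `DocumentSplitter` implementation, and the real component may treat whitespace and trailing chunks differently.

```python
from typing import List


def split_words(text: str, split_length: int = 250, split_overlap: int = 30) -> List[str]:
    """Split text into chunks of up to split_length words that share split_overlap words."""
    if split_overlap >= split_length:
        raise ValueError("split_overlap must be smaller than split_length")
    words = text.split()
    step = split_length - split_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + split_length]))
        if start + split_length >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Synthetic 600-word text: w0 w1 ... w599
text = " ".join(f"w{i}" for i in range(600))
chunks = split_words(text)
print(len(chunks))                         # 3
print([len(c.split()) for c in chunks])    # [250, 250, 160]
```

With these settings the window advances 220 words at a time, so the last 30 words of each chunk reappear at the start of the next one.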

RAG Query Pipeline

from haystack import Pipeline
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

template = """
Answer the question based on the given context.

Context:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}

Question: {{ question }}
"""

rag = Pipeline()
rag.add_component("query_embedder", OpenAITextEmbedder(model="text-embedding-3-small"))
rag.add_component("retriever",      InMemoryEmbeddingRetriever(
    document_store=store, top_k=5
))
rag.add_component("prompt",         PromptBuilder(template=template))
rag.add_component("generator",      OpenAIGenerator(
    model="gpt-4o-mini",
    generation_kwargs={"temperature": 0.2, "max_tokens": 512}
))

rag.connect("query_embedder.embedding", "retriever.query_embedding")
rag.connect("retriever.documents",      "prompt.documents")
rag.connect("prompt.prompt",            "generator.prompt")

# Run the RAG pipeline
result = rag.run({
    "query_embedder": {"text": "What is Haystack used for?"},
    "prompt":         {"question": "What is Haystack used for?"}
})

print(result["generator"]["replies"][0])
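
The query text has to reach both the embedder and the prompt builder, so it appears twice in the `run()` input. A small helper can keep the two in sync. `build_rag_inputs` is a hypothetical name, and the component names are assumed to match the pipeline above.

```python
from typing import Dict


def build_rag_inputs(question: str) -> Dict[str, Dict[str, str]]:
    """Build the run() input for a pipeline with 'query_embedder' and 'prompt' components."""
    return {
        "query_embedder": {"text": question},   # embedded for vector retrieval
        "prompt": {"question": question},       # rendered into the Jinja2 template
    }


inputs = build_rag_inputs("What is Haystack used for?")
print(inputs)
# result = rag.run(inputs)  # needs the `rag` pipeline defined above and an OpenAI API key
```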

Hybrid Retrieval Pipeline

from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.generators import OpenAIGenerator
from haystack.components.joiners import DocumentJoiner
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.components.retrievers.in_memory import (
    InMemoryBM25Retriever,
    InMemoryEmbeddingRetriever,
)

# Reuses `store` and `template` from the sections above.
# Pass `model` by keyword: the first positional parameter of the OpenAI components is the API key.
hybrid = Pipeline()
hybrid.add_component("query_embedder",   OpenAITextEmbedder(model="text-embedding-3-small"))
hybrid.add_component("vector_retriever", InMemoryEmbeddingRetriever(document_store=store, top_k=10))
hybrid.add_component("bm25_retriever",   InMemoryBM25Retriever(document_store=store, top_k=10))
hybrid.add_component("joiner",           DocumentJoiner(join_mode="reciprocal_rank_fusion"))
hybrid.add_component("ranker",           TransformersSimilarityRanker(
    model="cross-encoder/ms-marco-MiniLM-L-6-v2",
    top_k=5
))
hybrid.add_component("prompt",           PromptBuilder(template=template))
hybrid.add_component("generator",        OpenAIGenerator(model="gpt-4o-mini"))

hybrid.connect("query_embedder.embedding",     "vector_retriever.query_embedding")
hybrid.connect("vector_retriever.documents",   "joiner.documents")
hybrid.connect("bm25_retriever.documents",     "joiner.documents")
hybrid.connect("joiner.documents",             "ranker.documents")
hybrid.connect("ranker.documents",             "prompt.documents")
hybrid.connect("prompt.prompt",                "generator.prompt")

result = hybrid.run({
    "query_embedder":  {"text": "RAG pipeline performance"},
    "bm25_retriever":  {"query": "RAG pipeline performance"},
    "ranker":          {"query": "RAG pipeline performance"},
    "prompt":          {"question": "RAG pipeline performance"}
})
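
The `reciprocal_rank_fusion` join mode merges ranked lists by position rather than by raw score. Below is a minimal plain-Python sketch of the standard formula, where each document scores the sum of 1 / (k + rank) over the lists it appears in. The constant `k=60` and the document IDs are assumptions for illustration, and Haystack's `DocumentJoiner` may use a different constant or rescale the scores.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs; a higher fused score means a better final rank."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers, best match first.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents found by both retrievers (`doc_b`, `doc_a`) rise above documents found by only one, without needing the BM25 and embedding scores to be on a comparable scale.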

Custom Component

from haystack import component, default_from_dict, default_to_dict
from haystack.dataclasses import Document
from typing import Optional, List

@component
class KeywordFilter:
    """Filter documents by required keywords."""

    def __init__(self, keywords: List[str], require_all: bool = False):
        self.keywords  = [k.lower() for k in keywords]
        self.require_all = require_all

    @component.output_types(documents=List[Document])
    def run(self, documents: List[Document]) -> dict:
        results = []
        for doc in documents:
            text = (doc.content or "").lower()  # content can be None for non-text documents
            matches = [k in text for k in self.keywords]
            if self.require_all:
                if all(matches):
                    results.append(doc)
            else:
                if any(matches):
                    results.append(doc)
        return {"documents": results}

    def to_dict(self) -> dict:
        return default_to_dict(self, keywords=self.keywords, require_all=self.require_all)

    @classmethod
    def from_dict(cls, data: dict):
        return default_from_dict(cls, data)

# Use in a pipeline (assumes an existing `pipeline` that already has a "retriever" component)
pipeline.add_component("keyword_filter", KeywordFilter(["vector", "embedding"]))
pipeline.connect("retriever.documents", "keyword_filter.documents")

Pipeline Serialization

from haystack import Pipeline

# Save pipeline to YAML (Pipeline.dump uses Haystack's own YAML marshaller)
with open("rag_pipeline.yaml", "w") as f:
    rag.dump(f)

# Load pipeline from YAML
with open("rag_pipeline.yaml") as f:
    loaded_pipeline = Pipeline.load(f)

# Run loaded pipeline
result = loaded_pipeline.run({
    "query_embedder": {"text": "my question"},
    "prompt":         {"question": "my question"}
})

Common Workflows

Pipeline Evaluation

from haystack.evaluation import EvaluationRunResult
from haystack.components.evaluators import (
    FaithfulnessEvaluator,
    ContextRelevanceEvaluator,
    SASEvaluator
)

# Faithfulness — does the answer stay true to the retrieved context?
faithfulness = FaithfulnessEvaluator()

eval_result = faithfulness.run(
    questions=["What is RAG?", "How does HNSW work?"],
    contexts=[["RAG combines retrieval with generation."], ["HNSW is a graph-based ANN index."]],
    predicted_answers=["RAG stands for Retrieval-Augmented Generation.", "HNSW uses hierarchical graph layers."]
)
print(f"Mean faithfulness: {eval_result['score']:.3f}")

# Context relevance
relevance = ContextRelevanceEvaluator()
eval_result = relevance.run(
    questions=["What is RAG?"],
    contexts=[["RAG combines retrieval with generation."]]
)
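
Retrieval quality can be tracked alongside these LLM-based evaluators with recall@k, the fraction of relevant documents that appear in the top k results. The sketch below is a plain-Python definition with made-up document IDs. `recall_at_k` is a hypothetical helper, not a Haystack component.

```python
from typing import List, Set


def recall_at_k(retrieved_ids: List[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear among the first k retrieved IDs."""
    if not relevant_ids:
        return 0.0  # convention chosen here: no relevant documents means recall of 0
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


# Hypothetical labeled example: retriever output (best first) and ground-truth IDs.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}

print(recall_at_k(retrieved, relevant, k=2))  # 0.5
print(recall_at_k(retrieved, relevant, k=4))  # 1.0
```

Averaging this value over a small labeled test set gives one number to compare when changing `top_k`, chunk size, or the retriever mix.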

Hayhooks REST API

pip install hayhooks

# Serve a pipeline YAML as REST endpoint
hayhooks run --pipelines-dir ./pipelines

# Now call the pipeline via REST
curl -X POST http://localhost:1416/rag_pipeline/run \
  -H "Content-Type: application/json" \
  -d '{
    "query_embedder": {"text": "What is Haystack?"},
    "prompt": {"question": "What is Haystack?"}
  }'
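
The same call can be made from Python with only the standard library. This sketch assumes the endpoint path and payload shape from the curl command above. `build_run_request` and `call_pipeline` are hypothetical helper names, and the response format depends on the deployed pipeline.

```python
import json
import urllib.request


def build_run_request(question: str, base_url: str = "http://localhost:1416") -> urllib.request.Request:
    """Build (but do not send) the POST request equivalent to the curl call above."""
    payload = {
        "query_embedder": {"text": question},
        "prompt": {"question": question},
    }
    return urllib.request.Request(
        url=f"{base_url}/rag_pipeline/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_pipeline(question: str) -> dict:
    """Send the request and decode the JSON response; requires a running Hayhooks server."""
    with urllib.request.urlopen(build_run_request(question), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_run_request("What is Haystack?")
print(request.get_method(), request.full_url)  # POST http://localhost:1416/rag_pipeline/run
```

Only `build_run_request` runs here. Call `call_pipeline("What is Haystack?")` once the server from `hayhooks run` is up.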

Tips and Best Practices

Tip | Details
Use pipeline.draw() early | Visualize the wiring before running to catch connection errors
Warm up components with .warm_up() | Some components (e.g., SentenceTransformersDocumentEmbedder) load models lazily; call warm_up() at startup when running them outside a pipeline
Pass both text and question separately | PromptBuilder and embedder inputs are independent; wire query text to both
Use DocumentJoiner for hybrid | join_mode="reciprocal_rank_fusion" often outperforms simple concatenation for hybrid search
Chunk at 250-500 words with 10-15% overlap | DocumentSplitter(split_by="word", split_length=300, split_overlap=30) is a safe default
Cache embeddings during indexing | Re-embedding is expensive; store indexed documents persistently and skip re-indexing if unchanged
Use DocumentCleaner before splitting | Removes whitespace noise and boilerplate that degrades chunk quality
Set meta fields consistently | Use uniform metadata keys across documents for reliable filtering downstream
Evaluate with ground truth | Build a small labeled test set; track faithfulness and recall@k as you change pipeline parameters
Pin Haystack version | Haystack 2.x has frequent API changes; pin haystack-ai==2.x.y in production requirements.txt