RAGBuilder Cheat Sheet

Overview

RAGBuilder is an open-source toolkit that simplifies building and optimizing retrieval-augmented generation (RAG) pipelines through a no-code interface. It evaluates combinations of chunking strategies, embedding models, retrieval methods, and LLM generators against your dataset and reports the best-scoring configuration among those it tested. A web UI is provided for configuring experiments and viewing evaluation results.

The tool addresses the challenge of RAG pipeline optimization by systematically testing component combinations rather than requiring manual tuning. It supports various document types, multiple vector stores, and both local and API-based models, producing a ranked leaderboard of pipeline configurations with quality metrics.

Installation

pip install ragbuilder

# Start the web UI
ragbuilder
# Opens at http://localhost:8005

From Source

git clone https://github.com/KruxAI/ragbuilder.git
cd ragbuilder
pip install -e .
ragbuilder

Environment Setup

# .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
COHERE_API_KEY=...
HUGGINGFACE_API_KEY=...
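
Before launching a long run, it can help to confirm that the keys for the providers you selected are actually set. A minimal sketch, assuming the variables from the .env example above have been exported into the process environment; the helper name missing_keys is hypothetical and not part of RAGBuilder.

```python
import os

# Variable names taken from the .env example above.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "cohere": "COHERE_API_KEY",
    "huggingface": "HUGGINGFACE_API_KEY",
}

def missing_keys(providers, env=None):
    """Return the env var names that are unset or blank for the given providers."""
    env = os.environ if env is None else env
    return [
        PROVIDER_KEYS[p]
        for p in providers
        if not env.get(PROVIDER_KEYS[p], "").strip()
    ]

if __name__ == "__main__":
    missing = missing_keys(["openai", "cohere"])
    print("Missing keys:", ", ".join(missing) if missing else "none")
```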

Core Concepts

Pipeline Components

Component	Options	Role
Document Loader	PDF, DOCX, TXT, HTML, CSV	Ingest source documents
Chunker	Recursive, Token, Sentence, Semantic	Split documents into chunks
Embedding	OpenAI, HuggingFace, Cohere, local	Generate vector embeddings
Vector Store	Chroma, FAISS, Qdrant	Store and retrieve vectors
Retriever	Similarity, MMR, Hybrid, BM25	Find relevant chunks
Reranker	Cohere, Cross-encoder, None	Re-score retrieved chunks
Generator	OpenAI, Anthropic, Ollama	Generate final answers

Optimization Process

1. CONFIGURE - Select components and their parameter ranges
2. EVALUATE - RAGBuilder tests all combinations
3. COMPARE - View results ranked by quality metrics
4. DEPLOY - Export the best pipeline configuration
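
The four steps above amount to a grid search. The sketch below shows the idea in plain Python. It is not RAGBuilder's implementation: score_config is a hypothetical stand-in for a real evaluation run, and its formula is arbitrary so that the example is deterministic.

```python
from itertools import product

# 1. CONFIGURE: a toy search space (values borrowed from this cheat sheet).
search_space = {
    "chunk_size": [500, 1000, 1500],
    "top_k": [3, 5, 10],
    "temperature": [0.0, 0.3],
}

def score_config(config):
    """Hypothetical stand-in for an evaluation run; returns a made-up score."""
    penalty = (
        abs(config["chunk_size"] - 1000) / 2000
        + abs(config["top_k"] - 5) / 20
        + config["temperature"] / 10
    )
    return round(1.0 - penalty, 3)

def grid_search(space, scorer):
    """Score every combination and return (score, config) pairs, best first."""
    keys = list(space)
    results = []
    for values in product(*(space[k] for k in keys)):  # 2. EVALUATE
        config = dict(zip(keys, values))
        results.append((scorer(config), config))
    return sorted(results, key=lambda r: r[0], reverse=True)  # 3. COMPARE

leaderboard = grid_search(search_space, score_config)
best_score, best_config = leaderboard[0]  # 4. DEPLOY would export this entry
print(len(leaderboard), "configurations tested")
print("Best:", best_config, "score:", best_score)
```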

Web UI Usage

Step 1: Upload Documents

1. Open http://localhost:8005
2. Click "New Project"
3. Upload source documents (PDF, DOCX, TXT)
4. Provide evaluation Q&A pairs (or generate them)

Step 2: Configure Search Space

Chunking:
  ☑ Recursive Character (sizes: 500, 1000, 1500)
  ☑ Token-based (sizes: 256, 512)
  ☑ Semantic chunking
  Overlap: 50, 100, 200

Embeddings:
  ☑ text-embedding-3-small
  ☑ text-embedding-3-large
  ☑ BAAI/bge-small-en-v1.5

Retrieval:
  ☑ Similarity search (k: 3, 5, 10)
  ☑ MMR (k: 5, diversity: 0.3, 0.5, 0.7)
  ☑ Hybrid (BM25 + vector)

Reranking:
  ☑ None
  ☑ Cohere rerank-english-v3.0
  ☑ cross-encoder/ms-marco-MiniLM-L-12-v2

Generator:
  ☑ gpt-4o (temperature: 0, 0.3)
  ☑ gpt-4o-mini (temperature: 0)

Step 3: Run Evaluation

Click "Start Evaluation"
RAGBuilder will:
- Test all component combinations
- Score each with metrics such as faithfulness, answer relevancy, and context recall
- Display ranked leaderboard
- Show detailed metrics per configuration

CLI Usage

# Create project
ragbuilder create --name my-rag-project

# Add documents
ragbuilder add-docs --project my-rag-project --path ./documents/

# Generate eval dataset
ragbuilder generate-eval \
  --project my-rag-project \
  --num-questions 50

# Run optimization
ragbuilder optimize \
  --project my-rag-project \
  --config config.yaml

# View results
ragbuilder results --project my-rag-project

# Export best pipeline
ragbuilder export \
  --project my-rag-project \
  --rank 1 \
  --output best_pipeline.py

Configuration

config.yaml

project:
  name: my-rag-project
  documents_dir: ./documents/
  eval_dataset: ./eval_qa.json

search_space:
  chunking:
    - type: recursive
      chunk_size: [500, 1000, 1500]
      chunk_overlap: [50, 100, 200]
    - type: token
      chunk_size: [256, 512]
      chunk_overlap: [32, 64]
    - type: semantic
      breakpoint_threshold: [0.3, 0.5]

  embedding:
    - provider: openai
      model: text-embedding-3-small
    - provider: openai
      model: text-embedding-3-large
    - provider: huggingface
      model: BAAI/bge-small-en-v1.5

  vector_store:
    - type: chroma
    - type: faiss

  retrieval:
    - type: similarity
      top_k: [3, 5, 10]
    - type: mmr
      top_k: [5, 10]
      lambda_mult: [0.3, 0.5, 0.7]
    - type: hybrid
      alpha: [0.3, 0.5, 0.7]

  reranker:
    - type: none
    - type: cohere
      model: rerank-english-v3.0
      top_n: [3, 5]
    - type: cross_encoder
      model: cross-encoder/ms-marco-MiniLM-L-12-v2
      top_n: [3, 5]

  generator:
    - provider: openai
      model: gpt-4o
      temperature: [0.0, 0.3]
    - provider: openai
      model: gpt-4o-mini
      temperature: [0.0]

evaluation:
  metrics:
    - faithfulness
    - answer_relevancy
    - context_recall
    - context_precision
  evaluator: deepeval  # deepeval or ragas
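
Every listed value multiplies the number of runs, so it is worth estimating the size of a search space before starting. The sketch below counts the combinations in the config.yaml above, assuming a full grid in which the options within a component add up and the components multiply. Whether RAGBuilder prunes or samples this grid is not covered here, and the helper names are hypothetical.

```python
from math import prod

# Mirror of the search_space section above. Each option is described by the
# lengths of its list-valued parameters; an option with no lists counts as 1.
search_space = {
    "chunking": [[3, 3], [2, 2], [2]],   # recursive, token, semantic
    "embedding": [[], [], []],           # three fixed models
    "vector_store": [[], []],            # chroma, faiss
    "retrieval": [[3], [2, 3], [3]],     # similarity, mmr, hybrid
    "reranker": [[], [2], [2]],          # none, cohere, cross_encoder
    "generator": [[2], [1]],             # gpt-4o, gpt-4o-mini
}

def component_variants(options):
    """Variants of one component: sum over options of the product of list sizes."""
    return sum(prod(sizes) for sizes in options)

def total_combinations(space):
    """Full-grid size: product of the per-component variant counts."""
    return prod(component_variants(opts) for opts in space.values())

for name, opts in search_space.items():
    print(f"{name}: {component_variants(opts)}")
print("total:", total_combinations(search_space))
```

Under these assumptions the per-component counts are 15, 3, 2, 12, 5, and 3, which multiply to 16,200 pipeline configurations. Dropping a single component option, such as one vector store, halves that figure.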

Evaluation Dataset Format

[
  {
    "question": "What is retrieval augmented generation?",
    "ground_truth": "RAG combines document retrieval with LLM generation to ground responses in factual data.",
    "context": ["RAG is a technique that retrieves relevant documents..."]
  },
  {
    "question": "How does vector search work?",
    "ground_truth": "Vector search finds similar items by computing distances between embedding vectors.",
    "context": ["Vector databases store embeddings and use approximate nearest neighbor algorithms..."]
  }
]
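
A malformed evaluation file is easier to catch before a run than during metric computation. A minimal validation sketch, assuming the three fields shown above (question, ground_truth, context) are all required; the function name validate_eval_dataset is hypothetical and not part of RAGBuilder.

```python
import json

REQUIRED_STRING_FIELDS = ("question", "ground_truth")

def validate_eval_dataset(records):
    """Return a list of problems found; an empty list means the data looks valid."""
    if not isinstance(records, list):
        return ["top level must be a JSON array"]
    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"item {i}: expected an object")
            continue
        for field in REQUIRED_STRING_FIELDS:
            value = rec.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {i}: '{field}' must be a non-empty string")
        context = rec.get("context")
        if not isinstance(context, list) or not all(isinstance(c, str) for c in context):
            problems.append(f"item {i}: 'context' must be a list of strings")
    return problems

# One valid record and one deliberately broken record.
sample = json.loads("""
[
  {"question": "What is retrieval augmented generation?",
   "ground_truth": "RAG combines document retrieval with LLM generation.",
   "context": ["RAG is a technique that retrieves relevant documents..."]},
  {"question": "", "ground_truth": "Missing question text.", "context": "not a list"}
]
""")
for problem in validate_eval_dataset(sample):
    print(problem)
```

To check a real file, load it with json.load and pass the result to the same function.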

Python API

from ragbuilder import RAGBuilder, SearchSpace

# Define search space
search_space = SearchSpace(
    chunking=[
        {"type": "recursive", "chunk_size": [500, 1000], "chunk_overlap": [50, 100]},
    ],
    embedding=[
        {"provider": "openai", "model": "text-embedding-3-small"},
    ],
    retrieval=[
        {"type": "similarity", "top_k": [3, 5, 10]},
        {"type": "mmr", "top_k": [5], "lambda_mult": [0.5]},
    ],
    generator=[
        {"provider": "openai", "model": "gpt-4o", "temperature": [0.0]},
    ]
)

# Build and evaluate
builder = RAGBuilder(
    documents_dir="./documents/",
    eval_dataset="./eval_qa.json",
    search_space=search_space
)

results = builder.optimize()

# View leaderboard
for rank, config in enumerate(results.leaderboard[:5], 1):
    print(f"#{rank}: Score={config['score']:.3f}")
    print(f"  Chunking: {config['chunking']}")
    print(f"  Embedding: {config['embedding']}")
    print(f"  Retrieval: {config['retrieval']}")

# Export best config
results.export_pipeline(rank=1, output_path="best_pipeline.py")

Advanced Usage

Auto-Generate Eval Dataset

from ragbuilder import EvalGenerator

generator = EvalGenerator(
    documents_dir="./documents/",
    llm_model="gpt-4o",
    num_questions=100,
    question_types=["factual", "reasoning", "comparison"]
)

eval_dataset = generator.generate()
eval_dataset.save("eval_qa.json")

Custom Metrics

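import re

from ragbuilder import RAGBuilder

def custom_metric(question, answer, context, ground_truth):
    """Custom metric (0-1): share of ground-truth words found in the answer."""
    # Compare whole words; a substring test would match "a" inside "cat".
    keywords = re.findall(r"\w+", ground_truth.lower())
    answer_words = set(re.findall(r"\w+", answer.lower()))
    matched = sum(1 for k in keywords if k in answer_words)
    return matched / len(keywords) if keywords else 0.0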

builder = RAGBuilder(
    documents_dir="./documents/",
    eval_dataset="./eval_qa.json",
    custom_metrics={"keyword_coverage": custom_metric}
)
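
A custom metric is called for every question in every configuration, so a quick check on hand-written inputs can catch errors before a long run. The sketch below assumes the four-argument signature shown above; check_metric and exact_match are hypothetical helpers, not part of RAGBuilder.

```python
def exact_match(question, answer, context, ground_truth):
    """Toy metric: 1.0 if the answer equals the ground truth, ignoring case and outer whitespace."""
    return 1.0 if answer.strip().lower() == ground_truth.strip().lower() else 0.0

def check_metric(metric, cases):
    """Call metric on each case and verify that it returns a number in [0, 1].

    cases is a list of (question, answer, context, ground_truth) tuples.
    Returns the list of scores, or raises ValueError on the first bad result.
    """
    scores = []
    for case in cases:
        score = metric(*case)
        if isinstance(score, bool) or not isinstance(score, (int, float)):
            raise ValueError(f"non-numeric score {score!r} for {case[0]!r}")
        if not 0 <= score <= 1:
            raise ValueError(f"score {score} outside [0, 1] for {case[0]!r}")
        scores.append(score)
    return scores

cases = [
    ("What is RAG?", "Retrieval-augmented generation", [], "retrieval-augmented generation"),
    ("What is RAG?", "", [], "retrieval-augmented generation"),  # empty answer
    ("Empty truth?", "anything", [], ""),                        # empty ground truth
]
print(check_metric(exact_match, cases))
```

Including edge cases such as an empty answer or an empty ground truth is the main point, since those are the inputs most likely to cause a division by zero or an out-of-range score.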

Compare Specific Configurations

from ragbuilder import RAGPipeline

# Create two specific pipelines
pipeline_a = RAGPipeline(
    chunking={"type": "recursive", "chunk_size": 1000, "chunk_overlap": 100},
    embedding={"provider": "openai", "model": "text-embedding-3-small"},
    retrieval={"type": "similarity", "top_k": 5},
    generator={"provider": "openai", "model": "gpt-4o", "temperature": 0.0}
)

pipeline_b = RAGPipeline(
    chunking={"type": "semantic", "breakpoint_threshold": 0.5},
    embedding={"provider": "openai", "model": "text-embedding-3-large"},
    retrieval={"type": "hybrid", "alpha": 0.5},
    generator={"provider": "openai", "model": "gpt-4o", "temperature": 0.0}
)

# Compare on eval dataset
comparison = RAGBuilder.compare(
    pipelines=[pipeline_a, pipeline_b],
    eval_dataset="eval_qa.json"
)
print(comparison.summary())

Troubleshooting

Issue	Solution
Web UI not loading	Check port 8005 is free, try `ragbuilder --port 8006`
API key errors	Verify `.env` file, check key format
Evaluation too slow	Reduce search space, use fewer eval questions
Out of memory	Use smaller embedding models, reduce combinations
Document parsing fails	Check file format, try different loader
Metric computation error	Verify eval dataset format matches expected schema
Export fails	Check output directory permissions
Inconsistent results	Increase eval dataset size (50+ questions)
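
To confirm whether the default port is the cause of a web UI that will not load, you can test it directly. A small sketch using only the standard library; the function name port_is_free is hypothetical, and 8005 is the default port noted in the Installation section.

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0

if __name__ == "__main__":
    port = 8005
    state = "free" if port_is_free(port) else "in use"
    print(f"Port {port} is {state}")
```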

# Debug mode
ragbuilder --debug

# View project status
ragbuilder status --project my-rag-project

# Clean up project
ragbuilder clean --project my-rag-project