RAGBuilder Cheat Sheet

Overview

RAGBuilder is an open-source toolkit that simplifies building and optimizing retrieval-augmented generation (RAG) pipelines through a no-code interface. It automatically evaluates combinations of chunking strategies, embedding models, retrieval methods, and LLM generators to find the best-scoring RAG configuration for your dataset. A web UI lets you configure experiments and view evaluation results.

The tool addresses the challenge of RAG pipeline optimization by systematically testing component combinations rather than requiring manual tuning. It supports various document types, multiple vector stores, and both local and API-based models, producing a ranked leaderboard of pipeline configurations with quality metrics.

Installation

pip install ragbuilder

# Start the web UI
ragbuilder
# Opens at http://localhost:8005

From Source

git clone https://github.com/KruxAI/ragbuilder.git
cd ragbuilder
pip install -e .
ragbuilder

Environment Setup

# .env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
COHERE_API_KEY=...
HUGGINGFACE_API_KEY=...
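
A missing key tends to surface only when an experiment first calls that provider, so it can help to check up front. Below is a minimal sketch of such a pre-flight check, assuming the four variable names listed above. The provider labels and the `missing_keys` helper are hypothetical and not part of RAGBuilder, and the check reads the process environment rather than parsing the `.env` file itself.

```python
import os

# Variable names come from the .env example above; the provider labels are illustrative.
KEY_FOR_PROVIDER = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "cohere": "COHERE_API_KEY",
    "huggingface": "HUGGINGFACE_API_KEY",
}


def missing_keys(providers, env=None):
    """Return the variable names that are unset or empty for the given providers."""
    env = os.environ if env is None else env
    return [KEY_FOR_PROVIDER[p] for p in providers if not env.get(KEY_FOR_PROVIDER[p])]


if __name__ == "__main__":
    # Hypothetical usage: check only the providers your search space selects.
    missing = missing_keys(["openai", "cohere"])
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Passing only the providers you actually selected avoids false alarms for keys you do not need.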

Core Concepts

Pipeline Components

| Component | Options | Role |
| --- | --- | --- |
| Document Loader | PDF, DOCX, TXT, HTML, CSV | Ingest source documents |
| Chunker | Recursive, Token, Sentence, Semantic | Split documents into chunks |
| Embedding | OpenAI, HuggingFace, Cohere, local | Generate vector embeddings |
| Vector Store | Chroma, FAISS, Qdrant | Store and retrieve vectors |
| Retriever | Similarity, MMR, Hybrid, BM25 | Find relevant chunks |
| Reranker | Cohere, Cross-encoder, None | Re-score retrieved chunks |
| Generator | OpenAI, Anthropic, Ollama | Generate final answers |

Optimization Process

1. CONFIGURE - Select components and their parameter ranges
2. EVALUATE - RAGBuilder tests all combinations
3. COMPARE - View results ranked by quality metrics
4. DEPLOY - Export the best pipeline configuration
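
To make "tests all combinations" in step 2 concrete, here is a small sketch of how parameter ranges could be expanded into individual pipeline configurations. It assumes an exhaustive grid (a Cartesian product). The `expand` and `grid` helpers are hypothetical illustrations, not RAGBuilder's implementation.

```python
from itertools import product


def expand(entry):
    """Expand one entry with list-valued parameters into concrete settings."""
    fixed = {k: v for k, v in entry.items() if not isinstance(v, list)}
    ranged = {k: v for k, v in entry.items() if isinstance(v, list)}
    # product() over zero ranges yields one empty tuple, so fixed-only entries still count once.
    for values in product(*ranged.values()):
        yield {**fixed, **dict(zip(ranged.keys(), values))}


def grid(search_space):
    """Yield every pipeline configuration in the product of all components."""
    per_component = [
        [setting for entry in entries for setting in expand(entry)]
        for entries in search_space.values()
    ]
    for combo in product(*per_component):
        yield dict(zip(search_space.keys(), combo))


# Tiny illustrative search space: 2 chunk sizes x 2 top_k values = 4 configurations.
search_space = {
    "chunking": [{"type": "recursive", "chunk_size": [500, 1000]}],
    "retrieval": [{"type": "similarity", "top_k": [3, 5]}],
}
configs = list(grid(search_space))
for config in configs:
    print(config)
```

Each added range multiplies the number of configurations, so small additions to the search space can make a run much longer.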

Web UI Usage

Step 1: Upload Documents

1. Open http://localhost:8005
2. Click "New Project"
3. Upload source documents (PDF, DOCX, TXT)
4. Provide evaluation Q&A pairs (or generate them)

Step 2: Configure Search Space

Chunking:
  ☑ Recursive Character (sizes: 500, 1000, 1500)
  ☑ Token-based (sizes: 256, 512)
  ☑ Semantic chunking
  Overlap: 50, 100, 200

Embeddings:
  ☑ text-embedding-3-small
  ☑ text-embedding-3-large
  ☑ BAAI/bge-small-en-v1.5

Retrieval:
  ☑ Similarity search (k: 3, 5, 10)
  ☑ MMR (k: 5, diversity: 0.3, 0.5, 0.7)
  ☑ Hybrid (BM25 + vector)

Reranking:
  ☑ None
  ☑ Cohere rerank-english-v3.0
  ☑ Cross-encoder/ms-marco-MiniLM-L-12-v2

Generator:
  ☑ gpt-4o (temperature: 0, 0.3)
  ☑ gpt-4o-mini (temperature: 0)

Step 3: Run Evaluation

Click "Start Evaluation"
RAGBuilder will:
- Test all component combinations
- Score each with faithfulness, relevancy, recall
- Display ranked leaderboard
- Show detailed metrics per configuration
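
A ranked leaderboard needs some way to fold several metrics into one ordering. The sketch below ranks by the unweighted mean of the metric scores. That aggregation rule is an assumption, since this cheat sheet does not state how RAGBuilder combines metrics, and the numbers are placeholders rather than measurements.

```python
from statistics import mean


def leaderboard(results):
    """Rank configurations by the unweighted mean of their metric scores, best first."""
    scored = [
        {"name": name, "score": mean(metrics.values()), **metrics}
        for name, metrics in results.items()
    ]
    return sorted(scored, key=lambda row: row["score"], reverse=True)


# Placeholder scores for illustration only; real values come from an evaluation run.
results = {
    "semantic / hybrid": {"faithfulness": 0.5, "relevancy": 0.6, "recall": 0.7},
    "recursive-1000 / similarity-k5": {"faithfulness": 0.9, "relevancy": 0.8, "recall": 0.7},
    "token-512 / mmr-k5": {"faithfulness": 0.7, "relevancy": 0.7, "recall": 0.7},
}
ranked = leaderboard(results)
for rank, row in enumerate(ranked, 1):
    print(f"#{rank} {row['name']}: {row['score']:.3f}")
```

A plain mean treats every metric as equally important. If faithfulness matters more for your use case, a weighted mean is a natural variation.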

CLI Usage

# Create project
ragbuilder create --name my-rag-project

# Add documents
ragbuilder add-docs --project my-rag-project --path ./documents/

# Generate eval dataset
ragbuilder generate-eval \
  --project my-rag-project \
  --num-questions 50

# Run optimization
ragbuilder optimize \
  --project my-rag-project \
  --config config.yaml

# View results
ragbuilder results --project my-rag-project

# Export best pipeline
ragbuilder export \
  --project my-rag-project \
  --rank 1 \
  --output best_pipeline.py

Configuration

config.yaml

project:
  name: my-rag-project
  documents_dir: ./documents/
  eval_dataset: ./eval_qa.json

search_space:
  chunking:
    - type: recursive
      chunk_size: [500, 1000, 1500]
      chunk_overlap: [50, 100, 200]
    - type: token
      chunk_size: [256, 512]
      chunk_overlap: [32, 64]
    - type: semantic
      breakpoint_threshold: [0.3, 0.5]

  embedding:
    - provider: openai
      model: text-embedding-3-small
    - provider: openai
      model: text-embedding-3-large
    - provider: huggingface
      model: BAAI/bge-small-en-v1.5

  vector_store:
    - type: chroma
    - type: faiss

  retrieval:
    - type: similarity
      top_k: [3, 5, 10]
    - type: mmr
      top_k: [5, 10]
      lambda_mult: [0.3, 0.5, 0.7]
    - type: hybrid
      alpha: [0.3, 0.5, 0.7]

  reranker:
    - type: none
    - type: cohere
      model: rerank-english-v3.0
      top_n: [3, 5]
    - type: cross_encoder
      model: cross-encoder/ms-marco-MiniLM-L-12-v2
      top_n: [3, 5]

  generator:
    - provider: openai
      model: gpt-4o
      temperature: [0.0, 0.3]
    - provider: openai
      model: gpt-4o-mini
      temperature: [0.0]

evaluation:
  metrics:
    - faithfulness
    - answer_relevancy
    - context_recall
    - context_precision
  evaluator: deepeval  # deepeval or ragas
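
Before launching a run, it helps to know how many configurations a search space implies. The sketch below mirrors the `search_space` section above as a Python dict and counts them, assuming every combination is evaluated. The helper name is hypothetical.

```python
from math import prod

# Python mirror of the search_space section of config.yaml above.
SEARCH_SPACE = {
    "chunking": [
        {"type": "recursive", "chunk_size": [500, 1000, 1500], "chunk_overlap": [50, 100, 200]},
        {"type": "token", "chunk_size": [256, 512], "chunk_overlap": [32, 64]},
        {"type": "semantic", "breakpoint_threshold": [0.3, 0.5]},
    ],
    "embedding": [
        {"provider": "openai", "model": "text-embedding-3-small"},
        {"provider": "openai", "model": "text-embedding-3-large"},
        {"provider": "huggingface", "model": "BAAI/bge-small-en-v1.5"},
    ],
    "vector_store": [{"type": "chroma"}, {"type": "faiss"}],
    "retrieval": [
        {"type": "similarity", "top_k": [3, 5, 10]},
        {"type": "mmr", "top_k": [5, 10], "lambda_mult": [0.3, 0.5, 0.7]},
        {"type": "hybrid", "alpha": [0.3, 0.5, 0.7]},
    ],
    "reranker": [
        {"type": "none"},
        {"type": "cohere", "model": "rerank-english-v3.0", "top_n": [3, 5]},
        {"type": "cross_encoder", "model": "cross-encoder/ms-marco-MiniLM-L-12-v2", "top_n": [3, 5]},
    ],
    "generator": [
        {"provider": "openai", "model": "gpt-4o", "temperature": [0.0, 0.3]},
        {"provider": "openai", "model": "gpt-4o-mini", "temperature": [0.0]},
    ],
}


def settings_per_entry(entry):
    """Concrete settings one entry expands to: the product of its list lengths."""
    return prod(len(v) for v in entry.values() if isinstance(v, list))


per_component = {
    name: sum(settings_per_entry(entry) for entry in entries)
    for name, entries in SEARCH_SPACE.items()
}
total = prod(per_component.values())
print(per_component)
print(total)  # 16200
```

With this file the counts are 15 chunking, 3 embedding, 2 vector store, 12 retrieval, 5 reranker, and 3 generator settings, or 16,200 configurations in total. This is why trimming the search space is the first remedy when evaluation is too slow.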

Evaluation Dataset Format

[
  {
    "question": "What is retrieval augmented generation?",
    "ground_truth": "RAG combines document retrieval with LLM generation to ground responses in factual data.",
    "context": ["RAG is a technique that retrieves relevant documents..."]
  },
  {
    "question": "How does vector search work?",
    "ground_truth": "Vector search finds similar items by computing distances between embedding vectors.",
    "context": ["Vector databases store embeddings and use approximate nearest neighbor algorithms..."]
  }
]
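
A malformed dataset is easier to catch before an evaluation run than in the middle of one. The sketch below checks records against the shape shown above. Treating all three fields as required is an assumption drawn from the example, and `validate_eval_dataset` is a hypothetical helper, not a RAGBuilder function.

```python
import json

# Field names and types inferred from the example records above.
REQUIRED_FIELDS = {"question": str, "ground_truth": str, "context": list}


def validate_eval_dataset(records):
    """Return a list of problems; an empty list means the data matches the shape."""
    if not isinstance(records, list):
        return ["top level must be a JSON array"]
    problems = []
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            problems.append(f"record {i}: expected an object")
            continue
        for field, expected in REQUIRED_FIELDS.items():
            if field not in record:
                problems.append(f"record {i}: missing '{field}'")
            elif not isinstance(record[field], expected):
                problems.append(f"record {i}: '{field}' should be {expected.__name__}")
        context = record.get("context")
        if isinstance(context, list) and not all(isinstance(c, str) for c in context):
            problems.append(f"record {i}: every 'context' item should be a string")
    return problems


sample = json.loads("""
[
  {
    "question": "How does vector search work?",
    "ground_truth": "Vector search finds similar items by computing distances between embedding vectors.",
    "context": ["Vector databases store embeddings and use approximate nearest neighbor algorithms..."]
  }
]
""")
print(validate_eval_dataset(sample) or "OK")
```

To check a file on disk, load it with `json.load` and pass the result to the same function.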

Python API

from ragbuilder import RAGBuilder, SearchSpace

# Define search space
search_space = SearchSpace(
    chunking=[
        {"type": "recursive", "chunk_size": [500, 1000], "chunk_overlap": [50, 100]},
    ],
    embedding=[
        {"provider": "openai", "model": "text-embedding-3-small"},
    ],
    retrieval=[
        {"type": "similarity", "top_k": [3, 5, 10]},
        {"type": "mmr", "top_k": [5], "lambda_mult": [0.5]},
    ],
    generator=[
        {"provider": "openai", "model": "gpt-4o", "temperature": [0.0]},
    ]
)

# Build and evaluate
builder = RAGBuilder(
    documents_dir="./documents/",
    eval_dataset="./eval_qa.json",
    search_space=search_space
)

results = builder.optimize()

# View leaderboard
for rank, config in enumerate(results.leaderboard[:5], 1):
    print(f"#{rank}: Score={config['score']:.3f}")
    print(f"  Chunking: {config['chunking']}")
    print(f"  Embedding: {config['embedding']}")
    print(f"  Retrieval: {config['retrieval']}")

# Export best config
results.export_pipeline(rank=1, output_path="best_pipeline.py")

Advanced Usage

Auto-Generate Eval Dataset

from ragbuilder import EvalGenerator

generator = EvalGenerator(
    documents_dir="./documents/",
    llm_model="gpt-4o",
    num_questions=100,
    question_types=["factual", "reasoning", "comparison"]
)

eval_dataset = generator.generate()
eval_dataset.save("eval_qa.json")

Custom Metrics

from ragbuilder import RAGBuilder

def custom_metric(question, answer, context, ground_truth):
    """Custom evaluation metric returning a 0-1 score."""
    keywords = ground_truth.lower().split()
    answer_lower = answer.lower()  # lowercase once, not once per keyword
    matched = sum(1 for k in keywords if k in answer_lower)
    return matched / len(keywords) if keywords else 0.0

builder = RAGBuilder(
    documents_dir="./documents/",
    eval_dataset="./eval_qa.json",
    custom_metrics={"keyword_coverage": custom_metric}
)
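
The metric above tests each keyword with `in` against the whole answer string, so a keyword such as "rag" also counts as found inside "paragraph". Below is a sketch of a whole-word variant that keeps the same four-argument signature. The function name is hypothetical.

```python
import re


def keyword_coverage_tokens(question, answer, context, ground_truth):
    """Share of distinct ground-truth words that appear as whole words in the answer."""
    truth_tokens = set(re.findall(r"\w+", ground_truth.lower()))
    answer_tokens = set(re.findall(r"\w+", answer.lower()))
    if not truth_tokens:
        return 0.0
    return len(truth_tokens & answer_tokens) / len(truth_tokens)


# "rag" is a substring of "paragraph" but not a word in the answer, so this scores 0.0.
score = keyword_coverage_tokens(
    question="What is RAG?",
    answer="See the next paragraph.",
    context=[],
    ground_truth="RAG",
)
print(score)
```

Because it compares sets, this variant counts repeated words in the ground truth only once, which keeps common words from dominating the score.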

Compare Specific Configurations

from ragbuilder import RAGPipeline

# Create two specific pipelines
pipeline_a = RAGPipeline(
    chunking={"type": "recursive", "chunk_size": 1000, "chunk_overlap": 100},
    embedding={"provider": "openai", "model": "text-embedding-3-small"},
    retrieval={"type": "similarity", "top_k": 5},
    generator={"provider": "openai", "model": "gpt-4o", "temperature": 0.0}
)

pipeline_b = RAGPipeline(
    chunking={"type": "semantic", "breakpoint_threshold": 0.5},
    embedding={"provider": "openai", "model": "text-embedding-3-large"},
    retrieval={"type": "hybrid", "alpha": 0.5},
    generator={"provider": "openai", "model": "gpt-4o", "temperature": 0.0}
)

# Compare on eval dataset
comparison = RAGBuilder.compare(
    pipelines=[pipeline_a, pipeline_b],
    eval_dataset="eval_qa.json"
)
print(comparison.summary())

Troubleshooting

| Issue | Solution |
| --- | --- |
| Web UI not loading | Check port 8005 is free, try ragbuilder --port 8006 |
| API key errors | Verify .env file, check key format |
| Evaluation too slow | Reduce search space, use fewer eval questions |
| Out of memory | Use smaller embedding models, reduce combinations |
| Document parsing fails | Check file format, try different loader |
| Metric computation error | Verify eval dataset format matches expected schema |
| Export fails | Check output directory permissions |
| Inconsistent results | Increase eval dataset size (50+ questions) |

# Debug mode
ragbuilder --debug

# View project status
ragbuilder status --project my-rag-project

# Clean up project
ragbuilder clean --project my-rag-project
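
For the "Web UI not loading" case, a quick way to see whether another process already holds the port is to attempt a TCP connection to it. This is a minimal standard-library sketch. `port_in_use` is a hypothetical helper, and 8005 is the default port mentioned in the Installation section.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    port = 8005  # default Web UI port from the Installation section
    print(f"Port {port} is {'in use' if port_in_use(port) else 'free'}")
```

If the port is reported as in use while RAGBuilder is not running, another application is bound to it and starting the UI on a different port is the simpler fix.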