Letta Cheat Sheet

Overview

Letta (formerly MemGPT) is a framework for building stateful LLM agents with persistent memory. It implements a virtual context management system that allows LLMs to operate beyond their context window limits by managing their own memory hierarchy. Agents can read and write to archival storage, maintain conversation history, and edit their own system prompts at runtime.

The framework provides a server-based architecture where agents run as persistent services with REST API access. Letta agents maintain core memory (always in context), recall memory (searchable conversation history), and archival memory (long-term vector storage). This enables applications like long-running personal assistants, document Q&A systems, and autonomous agents that learn over time.

Installation

pip Install

pip install letta
letta quickstart --backend openai
# Or use local models
letta quickstart --backend ollama

Docker

docker run -d \
  --name letta-server \
  -p 8283:8283 \
  -v letta_data:/root/.letta \
  -e OPENAI_API_KEY=sk-... \
  letta/letta:latest

# Access server at http://localhost:8283
# Access ADE (Agent Development Environment) at http://localhost:8283/ade

Docker Compose

version: '3.8'
services:
  letta:
    image: letta/letta:latest
    ports:
      - "8283:8283"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      - letta_data:/root/.letta

volumes:
  letta_data:

From Source

git clone https://github.com/letta-ai/letta.git
cd letta
pip install -e ".[dev]"
letta server

Core Concepts

Memory Architecture

| Memory Type | Description | Persistence | Searchable |
|---|---|---|---|
| Core Memory | Always in LLM context (persona + human info) | Yes | No (always visible) |
| Recall Memory | Conversation history | Yes | Yes (search) |
| Archival Memory | Long-term vector storage | Yes | Yes (semantic) |
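
The three tiers in the table can be pictured with a toy model. This is a conceptual sketch only, not Letta's implementation: `ToyMemory` and its methods are hypothetical names, and archival search uses naive word overlap as a stand-in for embedding similarity.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Toy stand-in for the three memory tiers (hypothetical, not Letta's classes)."""
    core: dict = field(default_factory=dict)      # label -> text, always in context
    recall: list = field(default_factory=list)    # full conversation history
    archival: list = field(default_factory=list)  # long-term passages

    def build_context(self) -> str:
        # Core memory is rendered into every prompt; the other tiers are not.
        return "\n".join(f"<{label}>{text}</{label}>" for label, text in self.core.items())

    def search_recall(self, query: str) -> list:
        # Recall memory: case-insensitive substring search over past messages.
        return [m for m in self.recall if query.lower() in m.lower()]

    def search_archival(self, query: str, count: int = 3) -> list:
        # Archival memory: word overlap stands in for embedding similarity here.
        words = set(query.lower().split())
        scored = [(len(words & set(p.lower().split())), p) for p in self.archival]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [p for _, p in ranked[:count]]


memory = ToyMemory(core={"human": "Name: Alice.", "persona": "Helpful assistant."})
memory.recall.append("user: I prefer concise answers")
memory.archival.append("Alice works at Acme Corp as a data scientist.")
memory.archival.append("The office is closed on Fridays.")
```

Only `build_context()` output would reach the model on every turn; recall and archival content enters the context only when a search returns it.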

Python SDK

from letta import create_client

# Create a client
client = create_client()

# Create an agent
agent_state = client.create_agent(
    name="my-assistant",
    memory_blocks=[
        {"label": "human", "value": "Name: Alice. Preferences: concise answers."},
        {"label": "persona", "value": "You are a helpful research assistant."}
    ],
    llm_config={"model": "gpt-4o", "model_endpoint_type": "openai"},
    embedding_config={"embedding_endpoint_type": "openai", "embedding_model": "text-embedding-3-small"}
)

# Send a message
response = client.send_message(
    agent_id=agent_state.id,
    role="user",
    message="What do you know about me?"
)
print(response.messages)

# Add to archival memory
client.insert_archival_memory(
    agent_id=agent_state.id,
    memory="Alice works at Acme Corp as a data scientist."
)

# Search archival memory
results = client.get_archival_memory(
    agent_id=agent_state.id,
    query="workplace"
)

REST API

# Create an agent
curl -X POST http://localhost:8283/v1/agents \
  -H "Content-Type: application/json" \
  -d '{
    "name": "research-agent",
    "memory_blocks": [
      {"label": "human", "value": "User is a software engineer."},
      {"label": "persona", "value": "You are a coding assistant."}
    ],
    "llm": "gpt-4o",
    "embedding": "text-embedding-3-small"
  }'

# Send a message
curl -X POST http://localhost:8283/v1/agents/{agent_id}/messages \
  -H "Content-Type: application/json" \
  -d '{
    "role": "user",
    "message": "Help me optimize this SQL query"
  }'

# Get agent memory
curl -X GET http://localhost:8283/v1/agents/{agent_id}/memory

# Search archival memory
curl -X GET "http://localhost:8283/v1/agents/{agent_id}/archival?query=python&count=5"

# List all agents
curl -X GET http://localhost:8283/v1/agents

# Delete an agent
curl -X DELETE http://localhost:8283/v1/agents/{agent_id}
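
The same create-agent call can be prepared from Python with only the standard library. A minimal sketch, assuming the endpoint and payload shape shown in the curl example above; `build_create_agent_request` is a hypothetical helper name, and the request is built but not sent.

```python
import json
import urllib.request


def build_create_agent_request(base_url: str, name: str, human: str, persona: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/agents request from the curl example."""
    payload = {
        "name": name,
        "memory_blocks": [
            {"label": "human", "value": human},
            {"label": "persona", "value": persona},
        ],
        "llm": "gpt-4o",
        "embedding": "text-embedding-3-small",
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/agents",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_agent_request(
    "http://localhost:8283", "research-agent",
    "User is a software engineer.", "You are a coding assistant.",
)
# To actually send it (requires a running server):
# with urllib.request.urlopen(request) as resp:
#     agent = json.loads(resp.read())
```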

Agent Tools

Built-in Tools

| Tool | Function | Description |
|---|---|---|
| send_message | Sends visible message to user | Primary communication |
| core_memory_append | Adds to core memory block | Remember new facts |
| core_memory_replace | Edits core memory block | Update existing facts |
| archival_memory_insert | Stores in archival memory | Long-term storage |
| archival_memory_search | Searches archival memory | Recall stored info |
| conversation_search | Searches recall memory | Find past messages |
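
To make the two core memory tools concrete, here is a toy version of their append and replace semantics on a plain dictionary of blocks. It is an illustration under assumptions, not the built-in implementation: the function bodies, the error handling, and the 2000-character limit are all hypothetical.

```python
def core_memory_append(blocks: dict, label: str, content: str, limit: int = 2000) -> None:
    """Append a line to a block, refusing edits that would exceed the size limit."""
    updated = (blocks[label] + "\n" + content).strip()
    if len(updated) > limit:  # limit value is an assumption for illustration
        raise ValueError(f"block '{label}' would exceed {limit} characters")
    blocks[label] = updated


def core_memory_replace(blocks: dict, label: str, old: str, new: str) -> None:
    """Replace an exact substring; fail loudly if the old text is not present."""
    if old not in blocks[label]:
        raise ValueError(f"'{old}' not found in block '{label}'")
    blocks[label] = blocks[label].replace(old, new)


blocks = {"human": "Name: Alice.", "persona": "You are a helpful research assistant."}
core_memory_append(blocks, "human", "Works at Acme Corp.")
core_memory_replace(blocks, "human", "Acme Corp", "Globex")
```

Because core memory is always in context, a size cap of some kind keeps self-edits from crowding out the conversation.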

Custom Tools

from letta import create_client

client = create_client()

# Define a custom tool
def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: The city name to get weather for.

    Returns:
        A string describing the current weather.
    """
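    # Import inside the function body so the tool is self-contained
    import requests
    resp = requests.get(f"https://wttr.in/{city}?format=3", timeout=10)
    resp.raise_for_status()  # surface HTTP errors instead of returning an error page
    return resp.text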

# Register the tool
tool = client.create_tool(get_weather)

# Create agent with custom tool
agent = client.create_agent(
    name="weather-agent",
    tools=[tool.name, "send_message"]
)
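
Since tool execution can fail when the function signature and docstring disagree (see Troubleshooting), a quick local check before registering a tool can help. The sketch below is a hypothetical helper, not part of Letta: it assumes Google-style docstrings with an `Args:` section and uses a deliberately simple regex.

```python
import inspect
import re


def undocumented_params(func) -> list:
    """Return parameter names missing from the Google-style 'Args:' section."""
    doc = inspect.getdoc(func) or ""
    # Capture the text between "Args:" and the next blank line or end of docstring.
    match = re.search(r"Args:\n(.*?)(?:\n\s*\n|\Z)", doc, flags=re.DOTALL)
    args_text = match.group(1) if match else ""
    # A documented parameter looks like "name:" or "name (type):" at the start of a line.
    documented = set(re.findall(r"^\s*(\w+)\s*(?:\([^)]*\))?:", args_text, flags=re.MULTILINE))
    return [name for name in inspect.signature(func).parameters if name not in documented]


def get_weather(city: str, units: str = "metric") -> str:
    """Get the current weather for a city.

    Args:
        city: The city name to get weather for.

    Returns:
        A string describing the current weather.
    """
    return f"weather for {city} in {units}"


missing = undocumented_params(get_weather)
```

Here `units` appears in the signature but not in the docstring, so it is reported as missing.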

Configuration

Server Configuration

# Environment variables
export LETTA_PG_URI=postgresql://user:pass@localhost:5432/letta
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...

# Start server with custom config
letta server --port 8283 --host 0.0.0.0

# Configure default model
letta server --default-llm gpt-4o
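
Credentials containing characters such as `@` or `/` need percent-encoding before they go into `LETTA_PG_URI`. A minimal sketch using the standard library; `build_pg_uri` is a hypothetical helper name, and the example password is made up.

```python
from urllib.parse import quote


def build_pg_uri(user: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a PostgreSQL URI, percent-encoding the credentials."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


uri = build_pg_uri("user", "p@ss/word", "localhost", 5432, "letta")
```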

Model Configuration

from letta import LLMConfig, EmbeddingConfig

llm_config = LLMConfig(
    model="gpt-4o",
    model_endpoint_type="openai",
    model_endpoint="https://api.openai.com/v1",
    context_window=128000
)

embedding_config = EmbeddingConfig(
    embedding_endpoint_type="openai",
    embedding_model="text-embedding-3-small",
    embedding_dim=1536,
    embedding_chunk_size=300
)

# Use local models via Ollama
local_llm = LLMConfig(
    model="llama3.1:70b",
    model_endpoint_type="ollama",
    model_endpoint="http://localhost:11434",
    context_window=8192
)

Advanced Usage

Multi-Agent Systems

from letta import create_client

client = create_client()

# Create specialized agents
researcher = client.create_agent(
    name="researcher",
    memory_blocks=[
        {"label": "persona", "value": "You research topics thoroughly."},
        {"label": "human", "value": ""}
    ]
)

writer = client.create_agent(
    name="writer",
    memory_blocks=[
        {"label": "persona", "value": "You write clear summaries."},
        {"label": "human", "value": ""}
    ]
)

# Orchestrate agents
research_result = client.send_message(
    agent_id=researcher.id,
    role="user",
    message="Research the latest trends in RAG systems"
)

summary = client.send_message(
    agent_id=writer.id,
    role="user",
    message=f"Summarize this research: {research_result.messages}"
)
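
The hand-off above generalizes to a small pipeline in which each agent receives the previous agent's reply. This sketch keeps the client out of the picture: `run_pipeline` and `fake_send` are hypothetical names, and `send` is any callable that takes an agent ID and a message and returns reply text.

```python
from typing import Callable, List, Tuple


def run_pipeline(send: Callable[[str, str], str], steps: List[Tuple[str, str]], task: str) -> str:
    """Pass text through agents in order; each step is (agent_id, prompt_template)."""
    text = task
    for agent_id, template in steps:
        # Each agent sees only the previous agent's reply text.
        text = send(agent_id, template.format(input=text))
    return text


def fake_send(agent_id: str, message: str) -> str:
    # Stand-in for a real client call; echoes which agent handled which message.
    return f"[{agent_id}] {message}"


result = run_pipeline(
    fake_send,
    [("researcher", "Research: {input}"), ("writer", "Summarize this research: {input}")],
    "latest trends in RAG systems",
)
```

In real use, `send` would wrap the client call and extract the assistant's reply text from the response rather than passing the whole message list along.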

Persistent Data Sources

# Create a data source (assumes `client` and `agent` from the earlier examples)
source = client.create_source(name="company-docs")

# Load files into source
client.load_file_to_source(
    source_id=source.id,
    file_path="/path/to/documents/manual.pdf"
)

# Attach source to agent (populates archival memory)
client.attach_source_to_agent(
    source_id=source.id,
    agent_id=agent.id
)

Streaming Responses

# Stream agent responses (assumes `client` and `agent` from the earlier examples)
stream = client.send_message(
    agent_id=agent.id,
    role="user",
    message="Write a detailed analysis",
    stream=True
)

for chunk in stream:
    if hasattr(chunk, 'content'):
        print(chunk.content, end='', flush=True)
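
Streams usually mix text chunks with other events, so it can help to isolate the filtering step. The sketch below uses fake chunk objects and a hypothetical `iter_text` helper; the actual chunk types and attributes returned by the client may differ.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_text(stream: Iterable[object]) -> Iterator[str]:
    """Yield only the textual content of chunks, skipping ones without any."""
    for chunk in stream:
        content = getattr(chunk, "content", None)
        if isinstance(content, str) and content:
            yield content


fake_stream = [
    SimpleNamespace(content="Detailed "),
    SimpleNamespace(tool_call="archival_memory_search"),  # no content attribute
    SimpleNamespace(content=None),                        # content present but empty
    SimpleNamespace(content="analysis."),
]
full_text = "".join(iter_text(fake_stream))
```

Unlike a bare `hasattr` check, this also skips chunks whose `content` is `None`, which would otherwise print as the string "None".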

Troubleshooting

| Issue | Solution |
|---|---|
| Agent not responding | Check LLM API key and model availability |
| Memory not persisting | Verify database connection (PostgreSQL recommended) |
| Context window exceeded | Reduce core memory size or switch to larger context model |
| Archival search returns empty | Check embedding model config, ensure data was inserted |
| Slow response times | Use faster model, reduce archival search count |
| Tool execution fails | Check tool function signature matches docstring |
| Docker container exits | Check logs: docker logs letta-server |
| Connection refused on 8283 | Ensure server is running: letta server |

# Debug mode
letta server --debug

# Check server health
curl http://localhost:8283/v1/health

# View agent details
curl http://localhost:8283/v1/agents/{agent_id}

# Export agent state
curl http://localhost:8283/v1/agents/{agent_id}/export > agent_backup.json
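
When scripting against a freshly started server or container, "connection refused" often just means it is not ready yet. A minimal retry sketch; `wait_until_healthy` is a hypothetical helper, and the commented-out probe assumes the health endpoint shown above.

```python
import time
from typing import Callable


def wait_until_healthy(probe: Callable[[], bool], attempts: int = 5, delay: float = 1.0) -> bool:
    """Call probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        try:
            if probe():
                return True
        except OSError:
            pass  # connection refused, timeout, etc.: treat as "not ready yet"
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Example probe for the health endpoint (requires a running server):
# import urllib.request
# def http_probe() -> bool:
#     with urllib.request.urlopen("http://localhost:8283/v1/health", timeout=2) as resp:
#         return resp.status == 200
```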