MemGPT / Letta - OS-Style Agent Memory Cheatsheet

MemGPT is a technique for giving LLM agents operating-system-style memory management; Letta is the framework that grew out of it, and the project is now developed under that name. The core idea is to treat the context window like RAM (fast but small) and to add “disk” in the form of searchable archival memory. The agent itself decides, via tool calls, what to keep in main context and what to page out to storage, which lets it maintain coherent long-term memory far beyond the raw context limit.

Installation

| Method | Command |
| --- | --- |
| pip | pip install letta |
| Run the server | letta server |
| Docker | docker run -p 8283:8283 letta/letta:latest |
| ADE (web UI) | Connect the Agent Development Environment to the server |
| Verify | letta version |

Memory Architecture

| Tier | Analogy | Contents |
| --- | --- | --- |
| Main context (core memory) | RAM | Persona + key facts always in the prompt |
| Recall memory | Recent files | Conversation history, searchable |
| Archival memory | Disk | Arbitrary long-term facts, searchable |
| The agent | The OS | Decides what to page in/out via tools |
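
To make the analogy concrete, here is a minimal toy model of the three tiers in plain Python. It is not Letta's implementation: the class `ToyMemory`, its `window` size, and the substring search are hypothetical stand-ins, and in a real agent the paging is triggered by the model's own tool calls rather than by your code.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Toy model of the three tiers; NOT Letta's implementation."""
    core: dict = field(default_factory=dict)      # "RAM": always in the prompt
    recall: list = field(default_factory=list)    # full conversation history
    archival: list = field(default_factory=list)  # "disk": long-term facts
    window: int = 3                               # hypothetical: messages kept in context

    def add_message(self, role: str, text: str) -> None:
        # Every message is logged to recall memory.
        self.recall.append((role, text))

    def context(self) -> str:
        # What the model would see: core blocks plus the newest `window` messages.
        blocks = [f"[{label}] {value}" for label, value in self.core.items()]
        recent = [f"{role}: {text}" for role, text in self.recall[-self.window:]]
        return "\n".join(blocks + recent)

    def archive(self, fact: str) -> None:
        # "Page out": store a fact on disk instead of in the prompt.
        self.archival.append(fact)

    def search_archival(self, query: str) -> list:
        # "Page in": naive case-insensitive substring match (stand-in for real search).
        return [f for f in self.archival if query.lower() in f.lower()]


mem = ToyMemory(core={"persona": "Concise assistant.", "human": "Name: Nick."})
for i in range(5):
    mem.add_message("user", f"message {i}")
mem.archive("Nick prefers dark mode.")

print(mem.context())                     # core blocks plus the last 3 messages
print(mem.search_archival("dark mode"))  # the archived fact is still retrievable
```

Older messages drop out of `context()` but remain in `recall`, which is the property the table describes: nothing is lost, it is just no longer in the prompt.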

Core Memory (Always In-Context)

| Block | Purpose |
| --- | --- |
| persona | Who the agent is / how it behaves |
| human | What it knows about the user |
| Custom blocks | Domain-specific always-present facts |

The agent edits these blocks with tools (core_memory_append, core_memory_replace) as it learns.
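
A simplified sketch of what those two tools do to a block, assuming append adds a new line and replace swaps an exact substring. The real tools run inside the Letta server; the argument names (`label`, `content`, `old_content`, `new_content`) and the `BLOCK_LIMIT` budget below are assumptions for illustration.

```python
# Toy stand-ins for the two core-memory tools; the real tools run inside the
# Letta server and their exact arguments may differ from the names assumed here.
blocks = {
    "persona": "I am a concise, helpful assistant.",
    "human": "The user's name is Nick.",
}
BLOCK_LIMIT = 200  # hypothetical per-block character budget


def core_memory_append(label: str, content: str) -> None:
    """Append a new line of text to an existing block."""
    updated = blocks[label] + "\n" + content
    if len(updated) > BLOCK_LIMIT:
        raise ValueError(f"block '{label}' would exceed {BLOCK_LIMIT} characters")
    blocks[label] = updated


def core_memory_replace(label: str, old_content: str, new_content: str) -> None:
    """Replace an exact substring of a block, e.g. to correct a stale fact."""
    if old_content not in blocks[label]:
        raise ValueError(f"'{old_content}' not found in block '{label}'")
    blocks[label] = blocks[label].replace(old_content, new_content)


core_memory_append("human", "Prefers dark mode.")
core_memory_replace("human", "Nick", "Nicholas")
print(blocks["human"])
```

Because core blocks are always in the prompt, they need a size budget; anything that does not fit belongs in archival memory instead.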

Creating an Agent

from letta_client import Letta

client = Letta(base_url="http://localhost:8283")

agent = client.agents.create(
    name="assistant",
    memory_blocks=[
        {"label": "persona", "value": "I am a concise, helpful assistant."},
        {"label": "human", "value": "The user's name is Nick."},
    ],
    model="openai/gpt-4o",
    embedding="openai/text-embedding-3-small",
)

Messaging & Memory Tools

response = client.agents.messages.create(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "Remember I prefer dark mode."}],
)

| Tool (agent-invoked) | Action |
| --- | --- |
| core_memory_append | Add to an always-in-context block |
| core_memory_replace | Update a core memory block |
| archival_memory_insert | Store a fact to archival (disk) |
| archival_memory_search | Retrieve from archival memory |
| conversation_search | Search recall memory |
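
To check which of these tools the agent invoked for a given message, you can walk the messages in the response. The sketch below assumes each item in `response.messages` carries a `message_type`, that tool calls expose `tool_call.name`, and that assistant messages expose `content` as text; verify these field names against your SDK version. `summarize` is a hypothetical helper, and the stand-in response exists only so the example runs without a server.

```python
from types import SimpleNamespace


def summarize(response) -> dict:
    """Split a response into the tools the agent called and its reply text.

    Field names (`message_type`, `tool_call.name`, `content`) are assumptions;
    check them against the SDK version you have installed.
    """
    tools, replies = [], []
    for msg in response.messages:
        kind = getattr(msg, "message_type", None)
        if kind == "tool_call_message":
            tools.append(msg.tool_call.name)
        elif kind == "assistant_message":
            replies.append(msg.content)
    return {"tools": tools, "reply": "\n".join(replies)}


# Stand-in response so the sketch runs without a server.
fake = SimpleNamespace(messages=[
    SimpleNamespace(message_type="tool_call_message",
                    tool_call=SimpleNamespace(name="core_memory_append")),
    SimpleNamespace(message_type="assistant_message",
                    content="Noted: you prefer dark mode."),
])
result = summarize(fake)
print(result)
```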

Archival Memory

| Command | Description |
| --- | --- |
| client.agents.passages.create(agent_id, text=...) | Insert an archival memory |
| client.agents.passages.list(agent_id) | List stored passages |
| Agent search | Agent calls archival_memory_search automatically when relevant |
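
If you want to preload facts rather than wait for the agent to store them, a small wrapper around `client.agents.passages.create` is enough. `seed_archival` and the fake client below are hypothetical; only the `passages.create` call itself comes from the table above, and the fake exists so the sketch runs without a server.

```python
from types import SimpleNamespace


def seed_archival(client, agent_id: str, facts: list) -> int:
    """Insert each non-empty, de-duplicated fact as its own archival passage."""
    seen = set()
    inserted = 0
    for fact in facts:
        text = fact.strip()
        if not text or text in seen:
            continue  # skip blanks and exact repeats
        seen.add(text)
        client.agents.passages.create(agent_id, text=text)
        inserted += 1
    return inserted


class _FakePassages:
    """Hypothetical stand-in that records calls instead of hitting a server."""
    def __init__(self):
        self.stored = []

    def create(self, agent_id, text):
        self.stored.append((agent_id, text))


passages = _FakePassages()
fake_client = SimpleNamespace(agents=SimpleNamespace(passages=passages))
count = seed_archival(
    fake_client,
    "agent-123",  # placeholder ID
    ["Prefers dark mode.", " ", "Prefers dark mode.", "Timezone is UTC."],
)
print(count, passages.stored)
```

One short fact per passage tends to keep search results focused, since each passage is retrieved as a unit.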

Persistence & State

| Feature | Note |
| --- | --- |
| Stateful agents | Agent state persists on the server across sessions |
| Storage | SQLite by default; PostgreSQL for production |
| Export/import | Serialize agents to move them between deployments |
| Multi-agent | Run and coordinate several stateful agents |

Common Workflows

# A long-running assistant that remembers across sessions
# 1) create once with persona/human blocks
# 2) each session, just send messages; Letta manages memory paging
client.agents.messages.create(agent_id=agent.id,
    messages=[{"role": "user", "content": "What do you remember about me?"}])
# the agent searches recall/archival and answers with persistent context
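
One way to implement "create once" is to save the agent ID locally and reuse it on later runs. This is a sketch under assumptions: `get_or_create_agent_id` and the JSON state file are hypothetical, and the fake client stands in for a real `Letta` client so the example runs without a server. Only `client.agents.create` and `agent.id` come from the examples above.

```python
import json
import tempfile
from pathlib import Path
from types import SimpleNamespace


def get_or_create_agent_id(client, state_file: Path, **create_kwargs) -> str:
    """Return the saved agent ID, creating the agent only on the first run."""
    if state_file.exists():
        return json.loads(state_file.read_text())["agent_id"]
    agent = client.agents.create(**create_kwargs)
    state_file.write_text(json.dumps({"agent_id": agent.id}))
    return agent.id


class _FakeAgents:
    """Hypothetical stand-in that counts how often create() is called."""
    def __init__(self):
        self.calls = 0

    def create(self, **kwargs):
        self.calls += 1
        return SimpleNamespace(id=f"agent-{self.calls}")


fake_agents = _FakeAgents()
fake_client = SimpleNamespace(agents=fake_agents)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "agent.json"
    first = get_or_create_agent_id(fake_client, path, name="assistant")   # creates
    second = get_or_create_agent_id(fake_client, path, name="assistant")  # reuses

print(first, second, fake_agents.calls)
```

With a real client, the returned ID is what you pass as `agent_id` to `client.agents.messages.create` in every later session.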

MemGPT/Letta vs Other Memory

| Aspect | Letta (MemGPT) | Mem0 | Zep |
| --- | --- | --- | --- |
| Model | OS-style paging, agent-managed | Multi-tier store | Temporal graph |
| Statefulness | Server-side agents | Library | Service |
| Control | Agent decides paging | App decides | Service manages |
| Best for | Long-running autonomous agents | Personalization | Temporal facts |

Resources