Zep - Context Engineering & Memory for Agents Cheatsheet

Zep is a memory and context-engineering layer for AI agents. Built on the Graphiti temporal knowledge graph engine, it ingests conversation history and business data, fuses them into a queryable graph that tracks how facts change over time, and returns relevant, governed context with low latency to ground agent responses. It offers an open-source core and a managed cloud service (SOC 2 / HIPAA), with SDKs for Python, TypeScript, and Go.

Installation / Setup

Target | Command
Python SDK | pip install zep-cloud (cloud)
TypeScript SDK | npm install @getzep/zep-cloud
Self-hosted (Community Edition) | run via the project’s Docker Compose
API key | export ZEP_API_KEY=...
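
The exported key can be read at startup rather than hard-coded. A minimal sketch, assuming the variable name `ZEP_API_KEY` from the table above; `load_zep_api_key` is a hypothetical helper, not part of the SDK:

```python
import os


def load_zep_api_key(env_var: str = "ZEP_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before creating the client")
    return key
```

The returned string can then be passed as `api_key` when constructing the client.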

Core Concepts

Term | Meaning
User | An end user the agent serves
Thread | A conversation session for a user
Graph | The temporal knowledge graph of a user/group
Fact | A time-aware relationship in the graph
Context block | Assembled, ready-to-inject context string

Users & Threads

from zep_cloud.client import Zep
zep = Zep(api_key="...")

zep.user.add(user_id="nick", email="nick@example.com")
zep.thread.create(thread_id="t1", user_id="nick")

Call | Description
user.add(...) | Create a user
thread.create(...) | Start a conversation thread
thread.add_messages(...) | Append messages (auto-ingested to the graph)
user.delete(...) | Remove a user and their data

Adding Memory

zep.thread.add_messages(
    thread_id="t1",
    messages=[{"role": "user", "content": "I moved to Berlin.", "name": "Nick"}],
)

# Add non-chat business data directly to the graph
zep.graph.add(user_id="nick", type="text",
              data="Nick's subscription tier is Pro.")

Call | Description
thread.add_messages(...) | Ingest conversation turns
graph.add(...) | Add arbitrary text/JSON to the graph
Ingestion | Entities/facts extracted and time-stamped automatically
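
For structured business records, one option is to serialize them before calling graph.add. A minimal sketch, assuming that `type="json"` with a JSON string in `data` is the counterpart of the `type="text"` call shown above (check this against the SDK reference); `graph_add_kwargs` is a hypothetical helper:

```python
import json
from typing import Any


def graph_add_kwargs(user_id: str, record: Any) -> dict:
    """Build keyword arguments for graph.add from plain text or a JSON-serializable record."""
    if isinstance(record, str):
        # Plain strings go in unchanged as text
        return {"user_id": user_id, "type": "text", "data": record}
    # Assumption: structured data is sent as a JSON string with type="json"
    return {"user_id": user_id, "type": "json", "data": json.dumps(record, sort_keys=True)}
```

Usage would look like `zep.graph.add(**graph_add_kwargs("nick", {"tier": "Pro"}))`.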

Retrieving Context

# Get an assembled context block for the prompt
memory = zep.thread.get_user_context(thread_id="t1")
print(memory.context)   # ready-to-inject string of relevant facts

# Or query the graph directly; matching facts are on results.edges
results = zep.graph.search(user_id="nick", query="where does Nick live?", scope="edges")

Call | Returns
thread.get_user_context(...) | A synthesized context block
graph.search(...) | Facts/edges or nodes matching a query
Search scope | edges (facts), nodes (entities), or episodes
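
When querying the graph directly, the returned facts still need to be turned into prompt text. A minimal sketch, assuming each edge exposes a `fact` string and an optional `invalid_at` timestamp (attribute names are assumptions here; the sample data is invented for illustration):

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def format_facts(edges: Optional[Iterable[Any]], include_invalid: bool = False) -> str:
    """Render edges as a bullet list, skipping invalidated facts by default."""
    lines = []
    for edge in edges or []:
        fact = getattr(edge, "fact", None)
        if not fact:
            continue  # nothing to show for this edge
        invalid_at = getattr(edge, "invalid_at", None)
        if invalid_at and not include_invalid:
            continue  # superseded fact: keep it out of the current context
        suffix = f" (invalid since {invalid_at})" if invalid_at else ""
        lines.append(f"- {fact}{suffix}")
    return "\n".join(lines)


# Stand-in objects in place of real search results
sample = [
    SimpleNamespace(fact="Nick lives in Berlin", invalid_at=None),
    SimpleNamespace(fact="Nick lives in Paris", invalid_at="2024-05-01T00:00:00Z"),
]
print(format_facts(sample))
```

Passing `include_invalid=True` keeps superseded facts in the output, which can help when the agent needs to reason about history.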

Why Temporal

Because Zep is graph-based and time-aware, a contradictory update does not blindly overwrite the earlier fact. The old fact is invalidated with a timestamp and the new one is recorded alongside it, so the agent gets the current truth while history stays queryable.
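
The invalidate-then-record behaviour can be illustrated with a toy in-memory model. This is not Zep's or Graphiti's implementation, only a sketch of the idea; `Fact`, `ToyTemporalStore`, and the sample dates are hypothetical:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means "still current"


@dataclass
class ToyTemporalStore:
    facts: List[Fact] = field(default_factory=list)

    def record(self, subject: str, predicate: str, value: str, at: datetime) -> None:
        """Record a fact; a contradicting current fact is closed out, not deleted."""
        for fact in self.facts:
            if (fact.subject, fact.predicate) != (subject, predicate) or fact.invalid_at is not None:
                continue
            if fact.value == value:
                return  # already the current fact; nothing to change
            fact.invalid_at = at  # contradicted: invalidate but keep for history
        self.facts.append(Fact(subject, predicate, value, valid_at=at))

    def current(self, subject: str, predicate: str) -> Optional[str]:
        """Return the value of the fact that has not been invalidated."""
        for fact in self.facts:
            if (fact.subject, fact.predicate) == (subject, predicate) and fact.invalid_at is None:
                return fact.value
        return None

    def as_of(self, subject: str, predicate: str, at: datetime) -> Optional[str]:
        """Return the value that was valid at a given point in time."""
        for fact in self.facts:
            if (fact.subject, fact.predicate) != (subject, predicate):
                continue
            if fact.valid_at <= at and (fact.invalid_at is None or at < fact.invalid_at):
                return fact.value
        return None


t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2024, 6, 1, tzinfo=timezone.utc)
store = ToyTemporalStore()
store.record("Nick", "lives_in", "Paris", t1)
store.record("Nick", "lives_in", "Berlin", t2)
print(store.current("Nick", "lives_in"))
print(store.as_of("Nick", "lives_in", datetime(2024, 3, 1, tzinfo=timezone.utc)))
```

Both facts remain in the store; only the `invalid_at` timestamp distinguishes the current one from the superseded one.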

Capability | Benefit
Fact invalidation | Current context stays accurate
Provenance | Trace facts to their source
Governed retrieval | Low-latency, permissioned context
Cross-session | Memory persists across threads

Common Workflows

# The agent loop with Zep memory
user_turn = [{"role": "user", "content": "Where do I live?", "name": "Nick"}]
zep.thread.add_messages(thread_id="t1", messages=user_turn)
context = zep.thread.get_user_context(thread_id="t1").context
# Prepend `context` to your LLM system prompt, then generate
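
The loop above can be wrapped in a single function per turn. A minimal sketch, assuming a client with the `thread.add_messages` and `thread.get_user_context` calls used in this cheatsheet; `run_turn`, the `generate` callback, and `FakeThread` are hypothetical, and writing the assistant reply back to the thread is a design assumption rather than something the loop above requires:

```python
from types import SimpleNamespace
from typing import Any, Callable


def run_turn(zep: Any, thread_id: str, user_name: str, user_text: str,
             generate: Callable[[str, str], str]) -> str:
    """Store the user turn, fetch context, generate a reply, store the reply."""
    zep.thread.add_messages(
        thread_id=thread_id,
        messages=[{"role": "user", "name": user_name, "content": user_text}],
    )
    context = zep.thread.get_user_context(thread_id=thread_id).context or ""
    system_prompt = "You are a helpful assistant.\n\n" + context
    reply = generate(system_prompt, user_text)  # your LLM call goes here
    # Assumption: the assistant turn is also stored so later turns can use it
    zep.thread.add_messages(
        thread_id=thread_id,
        messages=[{"role": "assistant", "name": "Assistant", "content": reply}],
    )
    return reply


class FakeThread:
    """In-memory stand-in so the sketch runs without network access."""

    def __init__(self) -> None:
        self.messages = []

    def add_messages(self, thread_id: str, messages: list) -> None:
        self.messages.extend(messages)

    def get_user_context(self, thread_id: str) -> SimpleNamespace:
        return SimpleNamespace(context="Nick lives in Berlin.")


fake_zep = SimpleNamespace(thread=FakeThread())
reply = run_turn(
    fake_zep, "t1", "Nick", "Where do I live?",
    generate=lambda system, user: "You live in Berlin." if "Berlin" in system else "I don't know.",
)
print(reply)
```

Keeping the LLM call behind a callback means the same function works with any model provider and is easy to test with a fake client.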

Zep vs Other Memory Layers

Aspect | Zep | Mem0 | Raw vector store
Model | Temporal graph (Graphiti) | Multi-tier | Embeddings only
Temporal facts | Yes | Limited | No
Context assembly | Built-in block | Retrieval | Manual
Best for | Production agent memory | Personalization | Simple recall

Resources