Zep - Context Engineering & Memory for Agents Cheatsheet

Zep is a memory and context-engineering layer for AI agents. Built on the Graphiti temporal knowledge graph engine, it ingests conversation history and business data, fuses them into a queryable graph that tracks how facts change over time, and returns relevant, governed context with low latency to ground agent responses. It offers an open-source core and a managed cloud service (SOC 2 / HIPAA), with SDKs for Python, TypeScript, and Go.

Installation / Setup

Target	Command
Python SDK	`pip install zep-cloud` (cloud)
TypeScript SDK	`npm install @getzep/zep-cloud`
Self-hosted (Community Edition)	run via the project’s Docker Compose
API key	`export ZEP_API_KEY=...`
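
The client needs the exported key at startup. A minimal sketch of reading it and failing fast when it is missing; the helper name `load_zep_api_key` is hypothetical and not part of the SDK.

```python
import os


def load_zep_api_key(env=None):
    """Return ZEP_API_KEY from the environment, or raise if it is unset or blank."""
    # `env` is injectable so the helper can be exercised without touching os.environ.
    env = os.environ if env is None else env
    key = (env.get("ZEP_API_KEY") or "").strip()
    if not key:
        raise RuntimeError("ZEP_API_KEY is not set; export it before creating the client")
    return key
```

Typical use would be `Zep(api_key=load_zep_api_key())`, which keeps the key out of source control.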

Core Concepts

Term	Meaning
User	An end user the agent serves
Thread	A conversation session for a user
Graph	The temporal knowledge graph of a user/group
Fact	A time-aware relationship in the graph
Context block	Assembled, ready-to-inject context string

Users & Threads

from zep_cloud.client import Zep
zep = Zep(api_key="...")

zep.user.add(user_id="nick", email="nick@example.com")
zep.thread.create(thread_id="t1", user_id="nick")

Call	Description
`user.add(...)`	Create a user
`thread.create(...)`	Start a conversation thread
`thread.add_messages(...)`	Append messages (auto-ingested to the graph)
`user.delete(...)`	Remove a user and their data

Adding Memory

zep.thread.add_messages(
    thread_id="t1",
    messages=[{"role": "user", "content": "I moved to Berlin.", "name": "Nick"}],
)

# Add non-chat business data directly to the graph
zep.graph.add(user_id="nick", type="text",
              data="Nick's subscription tier is Pro.")

Call	Description
`thread.add_messages(...)`	Ingest conversation turns
`graph.add(...)`	Add arbitrary text/JSON to the graph
Ingestion	Entities/facts extracted and time-stamped automatically
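
Structured business data can be serialized before it is handed to `graph.add`. The sketch below builds the keyword arguments for either plain text or a dict. The helper `graph_add_kwargs` is hypothetical, and `type="json"` is assumed from the "text/JSON" note above, so check it against your SDK version.

```python
import json


def graph_add_kwargs(user_id, record):
    """Build keyword arguments for zep.graph.add from a text string or a dict."""
    if isinstance(record, str):
        return {"user_id": user_id, "type": "text", "data": record}
    # Assumption: JSON data is passed as a serialized string, not as a Python dict.
    return {"user_id": user_id, "type": "json", "data": json.dumps(record, sort_keys=True)}


kwargs = graph_add_kwargs("nick", {"plan": "Pro", "seats": 3})
# zep.graph.add(**kwargs)  # run this with a configured client
```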

Retrieving Context

# Get an assembled context block for the prompt
memory = zep.thread.get_user_context(thread_id="t1")
print(memory.context)   # ready-to-inject string of relevant facts

# Or query the graph directly
results = zep.graph.search(user_id="nick", query="where does Nick live?", scope="edges")
facts = [edge.fact for edge in results.edges]   # each edge carries a fact string

Call	Returns
`thread.get_user_context(...)`	A synthesized context block
`graph.search(...)`	Facts/edges or nodes matching a query
Search scope	edges (facts), nodes (entities), or episodes
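
When querying the graph directly, the matching facts usually need to be pulled out of the result object before they go into a prompt. A small sketch, assuming the search result exposes an `edges` list whose items carry a `fact` string; the helper `facts_from_results` and the stand-in result object are hypothetical.

```python
from types import SimpleNamespace


def facts_from_results(results, limit=5):
    """Collect up to `limit` non-empty fact strings from a search result's edges."""
    edges = getattr(results, "edges", None) or []
    facts = [e.fact for e in edges if getattr(e, "fact", None)]
    return facts[:limit]


# Stand-in for a graph.search response so the sketch runs offline.
fake_results = SimpleNamespace(
    edges=[SimpleNamespace(fact="Nick lives in Berlin."), SimpleNamespace(fact=None)]
)
top_facts = facts_from_results(fake_results)
```

With a live client, pass the return value of `zep.graph.search(...)` instead of `fake_results`.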

Why Temporal

Because Zep's graph is time-aware, a contradictory update does not blindly overwrite what came before. The old fact is marked invalid with a timestamp and the new one is recorded, so the agent sees the current truth while the history stays queryable.

Capability	Benefit
Fact invalidation	Current context stays accurate
Provenance	Trace facts to their source
Governed retrieval	Low-latency, permissioned context
Cross-session	Memory persists across threads
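
Fact invalidation can be illustrated with a toy in-memory log. This is not Zep's or Graphiti's implementation, only a sketch of the idea: every name here (`Fact`, `FactLog`, `valid_at`, `invalid_at`) is hypothetical, the "Munich" fact is invented for the example, and each subject/predicate pair is assumed to hold one value at a time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means the fact is still current


class FactLog:
    def __init__(self):
        self.facts: List[Fact] = []

    def record(self, subject, predicate, value, at):
        # Close the current fact for this subject/predicate instead of deleting it.
        for f in self.facts:
            if (f.subject, f.predicate) == (subject, predicate) and f.invalid_at is None:
                f.invalid_at = at
        self.facts.append(Fact(subject, predicate, value, at))

    def current(self, subject, predicate):
        return self.as_of(subject, predicate, datetime.max)

    def as_of(self, subject, predicate, at):
        # Return the value whose validity window [valid_at, invalid_at) contains `at`.
        for f in self.facts:
            if (f.subject, f.predicate) != (subject, predicate):
                continue
            if f.valid_at <= at and (f.invalid_at is None or at < f.invalid_at):
                return f.value
        return None


log = FactLog()
log.record("Nick", "lives_in", "Munich", datetime(2023, 1, 1))
log.record("Nick", "lives_in", "Berlin", datetime(2024, 6, 1))
```

Here `log.current("Nick", "lives_in")` gives the latest value, while `log.as_of(...)` with an earlier date still recovers the superseded one.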

Common Workflows

# The agent loop with Zep memory
# user_turn: list of message dicts for this turn, shaped like the one under Adding Memory
zep.thread.add_messages(thread_id="t1", messages=user_turn)
context = zep.thread.get_user_context(thread_id="t1").context
# Prepend `context` to your LLM system prompt, then generate
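
The loop can be wrapped in one function per turn. A sketch under stated assumptions: `build_system_prompt` and `run_turn` are hypothetical helpers, `llm` is any callable you supply that takes `(system_prompt, user_text)` and returns a string, and storing the assistant reply is an addition so that later turns can draw on it.

```python
def build_system_prompt(base_prompt, context):
    """Prepend the context block to the base system prompt; skip it when empty."""
    if not context:
        return base_prompt
    return f"{context}\n\n{base_prompt}"


def run_turn(zep, llm, thread_id, user_message):
    """One turn: store the user message, fetch context, generate, store the reply.

    `zep` is a configured Zep client; `user_message` is a dict with
    "role", "content", and optionally "name".
    """
    zep.thread.add_messages(thread_id=thread_id, messages=[user_message])
    context = zep.thread.get_user_context(thread_id=thread_id).context
    system_prompt = build_system_prompt("You are a helpful assistant.", context)
    reply = llm(system_prompt, user_message["content"])
    # Assumption: the assistant turn is stored as well so it is ingested too.
    zep.thread.add_messages(
        thread_id=thread_id,
        messages=[{"role": "assistant", "content": reply}],
    )
    return reply
```

Passing the client and the model in as arguments keeps the function easy to exercise with stand-ins before wiring up real services.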

Zep vs Other Memory Layers

Aspect	Zep	Mem0	Raw vector store
Data model	Temporal graph (Graphiti)	Multi-tier	Embeddings only
Temporal facts	Yes	Limited	No
Context assembly	Built-in block	Retrieval	Manual
Best for	Production agent memory	Personalization	Simple recall
