Mem0 provides a persistent, intelligent memory layer for AI applications. It automatically extracts, stores, and retrieves relevant memories from conversations, enabling personalized and context-aware AI across sessions. It supports user-, agent-, and session-level scoping, with vector and graph storage backends.
Website: https://mem0.ai
GitHub: https://github.com/mem0ai/mem0
Docs: https://docs.mem0.ai
Dashboard: https://app.mem0.ai
Installation
# Core library (self-hosted)
pip install mem0ai
# With graph memory support
pip install "mem0ai[graph]"
# With specific vector store backends (extra names can vary by release; pip warns if an extra is not provided)
pip install "mem0ai[qdrant]"
pip install "mem0ai[chroma]"
# With all optional dependencies
pip install "mem0ai[all]"
# Verify install
python -c "from mem0 import Memory; print('OK')"
Configuration
Managed API (Quickest Start)
from mem0 import MemoryClient
# Use Mem0's managed cloud service
client = MemoryClient(api_key="m0-xxxxxxxxxxxxxxxxxxxx") # From app.mem0.ai
# All data is stored in Mem0's cloud; no local setup is needed
client.add("I prefer dark mode in all applications.", user_id="alice")
Self-Hosted with OpenAI + Qdrant
from mem0 import Memory
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "api_key": "sk-...",
            "temperature": 0.1,
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
            "api_key": "sk-...",
        }
    },
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "host": "localhost",
            "port": 6333,
            "collection_name": "memories",
            "embedding_model_dims": 1536,
        }
    },
    "version": "v1.1",
}
m = Memory.from_config(config)
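Hard-coded keys like `"sk-..."` tend to leak into version control. The following is a minimal sketch of building the same configuration from environment variables. `OPENAI_API_KEY` is the variable the OpenAI SDK conventionally reads; `QDRANT_HOST`, `QDRANT_PORT`, and the helper name `build_openai_qdrant_config` are hypothetical names chosen for this illustration, not part of Mem0.

```python
import os


def build_openai_qdrant_config(env=os.environ):
    """Build the OpenAI + Qdrant config dict, reading secrets from the environment."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return {
        "llm": {
            "provider": "openai",
            "config": {"model": "gpt-4o-mini", "api_key": api_key, "temperature": 0.1},
        },
        "embedder": {
            "provider": "openai",
            "config": {"model": "text-embedding-3-small", "api_key": api_key},
        },
        "vector_store": {
            "provider": "qdrant",
            "config": {
                # QDRANT_HOST / QDRANT_PORT are hypothetical variable names.
                "host": env.get("QDRANT_HOST", "localhost"),
                "port": int(env.get("QDRANT_PORT", "6333")),
                "collection_name": "memories",
                "embedding_model_dims": 1536,
            },
        },
        "version": "v1.1",
    }
```

The result can then be passed to `Memory.from_config(build_openai_qdrant_config())` in place of the literal dictionary.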
Self-Hosted with Local Models
config = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.2",
            "ollama_base_url": "http://localhost:11434",
            "temperature": 0,
        }
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text",
            "ollama_base_url": "http://localhost:11434",
        }
    },
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "memories",
            "path": "/data/chroma_db",
        }
    },
}
m = Memory.from_config(config)
Graph Memory Configuration
config = {
    "llm": {"provider": "openai", "config": {"model": "gpt-4o"}},
    "embedder": {"provider": "openai", "config": {"model": "text-embedding-3-small"}},
    "vector_store": {
        "provider": "qdrant",
        "config": {"host": "localhost", "port": 6333},
    },
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "neo4j://localhost:7687",
            "username": "neo4j",
            "password": "password",
        }
    },
    "version": "v1.1",
}
m = Memory.from_config(config)
Core API
Memory Operations
| Method | Description |
|---|---|
| m.add(messages, user_id=...) | Extract and store memories from messages |
| m.get(memory_id) | Retrieve a specific memory by ID |
| m.get_all(user_id=...) | Get all memories for a user/agent/session |
| m.search(query, user_id=...) | Semantic search over memories |
| m.update(memory_id, data) | Update a memory’s content |
| m.delete(memory_id) | Delete a specific memory |
| m.delete_all(user_id=...) | Delete all memories for a user |
| m.history(memory_id) | Get version history of a memory |
| m.reset() | Clear all memories (use with caution) |
Scoping Parameters
| Parameter | Type | Description |
|---|---|---|
| user_id | str | Scope memories to a specific user |
| agent_id | str | Scope memories to a specific agent |
| run_id | str | Scope memories to a specific session/run |
| metadata | dict | Attach custom metadata to memories |
| filters | dict | Filter by metadata during search/retrieval |
| limit | int | Max number of results to return |
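Because the three identifiers are optional keyword arguments, it is easy to issue a call with none of them set. The sketch below shows one way to guard against that; `scope_kwargs` is a hypothetical helper written for this illustration, not a Mem0 function.

```python
def scope_kwargs(user_id=None, agent_id=None, run_id=None):
    """Return only the scoping identifiers that were set, as keyword arguments."""
    candidates = {"user_id": user_id, "agent_id": agent_id, "run_id": run_id}
    scope = {key: value for key, value in candidates.items() if value}
    if not scope:
        # Refuse unscoped calls so memories are never shared by accident.
        raise ValueError("Provide at least one of user_id, agent_id, or run_id")
    return scope
```

A call site would then read, for example, `m.search("Docker experience", limit=5, **scope_kwargs(user_id="alice"))`.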
Advanced Usage
Adding and Searching Memories
from mem0 import Memory
m = Memory()
# Add from a string
result = m.add(
    "I'm a vegetarian and allergic to nuts.",
    user_id="alice",
    metadata={"category": "dietary", "source": "profile"},
)
print(result) # {"results": [{"id": "...", "memory": "User is vegetarian and allergic to nuts", "event": "ADD"}]}
# Add from conversation history
messages = [
    {"role": "user", "content": "I love hiking in the mountains."},
    {"role": "assistant", "content": "That sounds wonderful! Any favorite trails?"},
    {"role": "user", "content": "Yes, I love Yosemite. I go every summer."},
]
m.add(messages, user_id="alice")
# Search memories
results = m.search("What are Alice's food preferences?", user_id="alice", limit=5)
for r in results["results"]:
    print(f"[{r['score']:.2f}] {r['memory']}")
# Get all memories
all_mems = m.get_all(user_id="alice")
print(f"Total memories: {len(all_mems['results'])}")
Multi-Scope Memory (User + Agent)
# Store agent-specific behavior preferences
m.add(
    "Always respond in bullet points for technical questions.",
    agent_id="tech-assistant",
)
# Store user-specific preferences
m.add(
    "Prefers concise answers, no more than 3 sentences.",
    user_id="bob",
)
# Retrieve both when generating response
agent_mems = m.search("response style", agent_id="tech-assistant")
user_mems = m.search("answer length", user_id="bob")
# Combine context
context = "\n".join([
    *[r["memory"] for r in agent_mems["results"]],
    *[r["memory"] for r in user_mems["results"]],
])
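Simple concatenation can repeat a memory that appears in both scopes and ignores relevance. The following sketch merges several search responses, assuming each has the `{"results": [{"memory": ..., "score": ...}]}` shape used above and that a higher `score` means a closer match (some vector stores report distances instead, in which case the sort order should be reversed). `merge_memories` is a hypothetical helper, not part of Mem0.

```python
def merge_memories(*responses, limit=5):
    """Merge search responses, dropping duplicate texts and keeping the best score."""
    best = {}
    for response in responses:
        for hit in response.get("results", []):
            text = hit["memory"]
            score = hit.get("score", 0.0)
            # Keep the highest score seen for each distinct memory text.
            if text not in best or score > best[text]:
                best[text] = score
    # Assumption: higher score means more relevant.
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [text for text, _ in ranked[:limit]]
```

With the variables from the example, `context = "\n".join(merge_memories(agent_mems, user_mems))` would produce a deduplicated context block.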
Session-Scoped Memory
import uuid
session_id = str(uuid.uuid4())
# Store within session
m.add(
    "User is debugging a Docker networking issue with bridge networks.",
    run_id=session_id,
    user_id="alice",
)
# Retrieve session context
session_ctx = m.get_all(run_id=session_id, user_id="alice")
# Cross-session user memory (no run_id filter)
long_term = m.search("Docker experience", user_id="alice")
Memory-Augmented Chat
from openai import OpenAI
from mem0 import Memory
openai_client = OpenAI()
m = Memory()
def chat_with_memory(user_message: str, user_id: str) -> str:
    # Retrieve the memories most relevant to this message
    memories = m.search(user_message, user_id=user_id, limit=5)
    memory_context = "\n".join(f"- {r['memory']}" for r in memories["results"])
    # Build the system prompt, falling back to a placeholder when nothing is stored yet
    system_prompt = (
        "You are a helpful personal assistant.\n"
        "What you know about this user:\n"
        f"{memory_context or 'No previous memories yet.'}\n"
        "Use this context to personalize your responses."
    )
    # Generate the response
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    )
    assistant_message = response.choices[0].message.content
    # Store the exchange so Mem0 can extract new memories from it
    m.add(
        [
            {"role": "user", "content": user_message},
            {"role": "assistant", "content": assistant_message},
        ],
        user_id=user_id,
    )
    return assistant_message
# Usage
print(chat_with_memory("I just moved to Seattle.", "alice"))
print(chat_with_memory("Recommend some outdoor activities.", "alice"))
# Second response will know Alice is in Seattle
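As a user accumulates memories, the context injected into the system prompt can grow. The sketch below shows one way to cap it by character count; `format_memory_context` and the `max_chars` budget are hypothetical, and a token-based budget may be more appropriate in practice.

```python
def format_memory_context(results, max_chars=1000):
    """Render search hits as bullet lines, stopping before max_chars is exceeded."""
    lines = []
    used = 0
    for hit in results:
        line = f"- {hit['memory']}"
        cost = len(line) + 1  # +1 accounts for the joining newline
        if used + cost > max_chars:
            break
        lines.append(line)
        used += cost
    return "\n".join(lines) or "No previous memories yet."
```

Inside `chat_with_memory`, this could replace the inline join: `memory_context = format_memory_context(memories["results"])`.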
Managed API Client
from mem0 import MemoryClient
client = MemoryClient(api_key="m0-xxxxxxxxxxxxxxxxxxxx")
# Add memory
client.add(
    [{"role": "user", "content": "My tech stack is Python + FastAPI + PostgreSQL."}],
    user_id="developer-42",
)
# Search
results = client.search("programming languages", user_id="developer-42", limit=3)
# Get all memories
all_memories = client.get_all(user_id="developer-42")
# Delete specific memory
client.delete(memory_id="mem_xxxxxxxxxx")
# Delete all for user
client.delete_all(user_id="developer-42")
LangChain Integration
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from mem0 import MemoryClient
client = MemoryClient(api_key="m0-xxxxxxxxxxxxxxxxxxxx")
class Mem0LangChainMemory(ConversationBufferMemory):
    user_id: str

    def save_context(self, inputs: dict, outputs: dict) -> None:
        super().save_context(inputs, outputs)
        # Persist to Mem0
        client.add(
            [
                {"role": "user", "content": inputs.get("input", "")},
                {"role": "assistant", "content": outputs.get("output", "")},
            ],
            user_id=self.user_id,
        )

    def load_memory_variables(self, inputs: dict) -> dict:
        # Augment with long-term memories
        query = inputs.get("input", "")
        long_term = client.search(query, user_id=self.user_id, limit=5)
        memories_str = "\n".join([r["memory"] for r in long_term["results"]])
        base = super().load_memory_variables(inputs)
        if memories_str:
            base["history"] = f"Long-term context:\n{memories_str}\n\n" + base.get("history", "")
        return base
Common Workflows
Personalized AI Assistant
def build_personalized_prompt(user_id: str, current_query: str) -> str:
    memories = m.search(current_query, user_id=user_id, limit=10)
    preferences = m.search("preferences style format", user_id=user_id, limit=5)
    # Deduplicate while keeping the original result order (a set would not preserve it)
    hits = memories["results"] + preferences["results"]
    all_context = list(dict.fromkeys(r["memory"] for r in hits))
    bullet_list = "\n".join(f"- {mem}" for mem in all_context)
    return (
        "User preferences and history:\n"
        f"{bullet_list}\n"
        f"Current request: {current_query}"
    )
Memory Maintenance
# View memory history (to understand how it evolved)
history = m.history("mem_xxxxxxxxxx")
for h in history:
    # Key names follow the open-source history records; adjust if your version differs
    print(f"{h['event']} at {h.get('created_at')}: {h.get('new_memory')}")
# Update incorrect memory
m.update("mem_xxxxxxxxxx", "User is based in New York (not Seattle).")
# Delete stale memories
all_mems = m.get_all(user_id="alice")
for mem in all_mems["results"]:
    # metadata may be missing or None, and year may be stored as an int
    year = (mem.get("metadata") or {}).get("year")
    if "2023" in str(year or ""):
        m.delete(mem["id"])
Tips and Best Practices
| Topic | Recommendation |
|---|---|
| Scoping | Always pass user_id to prevent memory leakage between users |
| Memory extraction | Pass full conversation turns (user + assistant), not just user messages |
| Search queries | Use natural language queries, not keywords — Mem0 uses semantic search |
| Limit | Default search returns 10 results; use limit=5 for prompt-size control |
| Managed vs self-hosted | Use managed API for prototyping; self-host for data privacy requirements |
| Graph memory | Enable graph memory for relationship tracking (e.g., family, colleagues) |
| Deduplication | Mem0 automatically deduplicates and updates conflicting memories |
| Metadata filtering | Use filters={"category": "dietary"} to scope search to specific domains |
| Session cleanup | Delete run_id-scoped memories after session ends to save storage |
| LLM choice | Use GPT-4o-mini or Claude Haiku for memory extraction — it’s called frequently |
| Privacy | Review what Mem0 extracts; for sensitive conversations, pass infer=False to add() to store the text as given, without LLM extraction |
| Cost | Memory extraction uses LLM tokens; balance frequency with cost requirements |
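The session-cleanup tip can be automated with a context manager. This is a minimal sketch, assuming `delete_all` accepts `run_id` in the same way as the other scoping parameters; `session_memory` is a hypothetical helper, and `memory` is any object exposing that method (for example, a `Memory` instance).

```python
import uuid
from contextlib import contextmanager


@contextmanager
def session_memory(memory):
    """Yield a fresh run_id and delete that run's memories when the block exits."""
    run_id = str(uuid.uuid4())
    try:
        yield run_id
    finally:
        # Assumption: delete_all(run_id=...) removes only this session's memories.
        memory.delete_all(run_id=run_id)
```

Usage would look like `with session_memory(m) as run_id: m.add("...", run_id=run_id, user_id="alice")`. Cleanup runs even if the body raises, so a failed session does not leave run-scoped memories behind.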