Mem0

Mem0 provides a persistent, intelligent memory layer for AI applications. It automatically extracts, stores, and retrieves relevant memories from conversations, enabling personalized and context-aware AI across sessions. It supports user, agent, and session scoping, with vector and graph storage backends.

Website: https://mem0.ai
GitHub: https://github.com/mem0ai/mem0
Docs: https://docs.mem0.ai
Dashboard: https://app.mem0.ai

Installation

# Core library (self-hosted)
pip install mem0ai

# With graph memory support
pip install "mem0ai[graph]"

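# With specific vector store backends
# (extra names can change between releases; if pip warns about an unknown extra, check the package metadata)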
pip install "mem0ai[qdrant]"
pip install "mem0ai[chroma]"

# With all optional dependencies
pip install "mem0ai[all]"

# Verify install
python -c "from mem0 import Memory; print('OK')"

Configuration

Managed API (Quickest Start)

from mem0 import MemoryClient

# Use Mem0's managed cloud service
client = MemoryClient(api_key="m0-xxxxxxxxxxxxxxxxxxxx")  # From app.mem0.ai

# All data stored in Mem0's cloud — no setup needed
client.add("I prefer dark mode in all applications.", user_id="alice")
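
Hard-coding the key is fine for a first experiment, but a shared script would normally read it from the environment. The following is a minimal sketch under that assumption; the variable name `MEM0_API_KEY` and the `load_api_key` helper are illustrative choices, not part of the snippet above.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None, var: str = "MEM0_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set {var} before creating the client")
    return key


# Usage (requires the mem0ai package and a real key):
# from mem0 import MemoryClient
# client = MemoryClient(api_key=load_api_key())
```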

Self-Hosted with OpenAI + Qdrant

from mem0 import Memory

config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "api_key": "sk-...",
            "temperature": 0.1,
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
            "api_key": "sk-...",
        }
    },
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "host": "localhost",
            "port": 6333,
            "collection_name": "memories",
            "embedding_model_dims": 1536,
        }
    },
    "version": "v1.1",
}

m = Memory.from_config(config)
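
The literal `"sk-..."` values above are placeholders. One way to keep secrets out of source is to build the same dictionary from environment variables. This is a sketch only: `openai_qdrant_config` is a hypothetical helper, and `QDRANT_HOST` / `QDRANT_PORT` are assumed variable names rather than anything Mem0 defines.

```python
import os
from typing import Mapping, Optional


def openai_qdrant_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the OpenAI + Qdrant config without hard-coding secrets."""
    env = os.environ if env is None else env
    api_key = env["OPENAI_API_KEY"]  # raises KeyError if the key is missing
    return {
        "llm": {
            "provider": "openai",
            "config": {"model": "gpt-4o-mini", "api_key": api_key, "temperature": 0.1},
        },
        "embedder": {
            "provider": "openai",
            "config": {"model": "text-embedding-3-small", "api_key": api_key},
        },
        "vector_store": {
            "provider": "qdrant",
            "config": {
                # Hypothetical variable names, with the defaults used above
                "host": env.get("QDRANT_HOST", "localhost"),
                "port": int(env.get("QDRANT_PORT", "6333")),
                "collection_name": "memories",
                "embedding_model_dims": 1536,
            },
        },
        "version": "v1.1",
    }


# Usage: m = Memory.from_config(openai_qdrant_config())
```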

Self-Hosted with Local Models

config = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.2",
            "ollama_base_url": "http://localhost:11434",
            "temperature": 0,
        }
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text",
            "ollama_base_url": "http://localhost:11434",
        }
    },
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "memories",
            "path": "/data/chroma_db",
        }
    },
}

m = Memory.from_config(config)
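
The Chroma setup above does not declare a vector size, but the Qdrant example earlier sets `embedding_model_dims` explicitly. If the local embedder is ever paired with a store that needs that field, the declared size has to match the model. A small pre-flight check might look like the sketch below; `KNOWN_DIMS` and `check_embedding_dims` are hypothetical names, and the table covers only the two embedding models used in this document.

```python
# Output sizes for the embedding models used in this document.
KNOWN_DIMS = {
    "text-embedding-3-small": 1536,
    "nomic-embed-text": 768,
}


def check_embedding_dims(config: dict) -> None:
    """Raise ValueError if a declared vector size contradicts the embedder model."""
    model = config.get("embedder", {}).get("config", {}).get("model")
    declared = config.get("vector_store", {}).get("config", {}).get("embedding_model_dims")
    expected = KNOWN_DIMS.get(model)
    # Only complain when both sides are known and they disagree.
    if declared is not None and expected is not None and declared != expected:
        raise ValueError(
            f"{model} produces {expected}-dim vectors, config declares {declared}"
        )
```

Calling `check_embedding_dims(config)` before `Memory.from_config(config)` would surface a mismatch before any collection is created.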

Graph Memory Configuration

config = {
    "llm": {"provider": "openai", "config": {"model": "gpt-4o"}},
    "embedder": {"provider": "openai", "config": {"model": "text-embedding-3-small"}},
    "vector_store": {
        "provider": "qdrant",
        "config": {"host": "localhost", "port": 6333},
    },
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "neo4j://localhost:7687",
            "username": "neo4j",
            "password": "password",
        }
    },
    "version": "v1.1",
}

m = Memory.from_config(config)

Core API

Memory Operations

Method	Description
`m.add(messages, user_id=...)`	Extract and store memories from messages
`m.get(memory_id)`	Retrieve a specific memory by ID
`m.get_all(user_id=...)`	Get all memories for a user/agent/session
`m.search(query, user_id=...)`	Semantic search over memories
`m.update(memory_id, data)`	Update a memory’s content
`m.delete(memory_id)`	Delete a specific memory
`m.delete_all(user_id=...)`	Delete all memories for a user
`m.history(memory_id)`	Get version history of a memory
`m.reset()`	Clear all memories (use with caution)

Scoping Parameters

Parameter	Type	Description
`user_id`	`str`	Scope memories to a specific user
`agent_id`	`str`	Scope memories to a specific agent
`run_id`	`str`	Scope memories to a specific session/run
`metadata`	`dict`	Attach custom metadata to memories
`filters`	`dict`	Filter by metadata during search/retrieval
`limit`	`int`	Max number of results to return
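
The parameters in this table can be combined in a single call. The sketch below wraps a scoped, filtered search in a hypothetical helper, `search_by_category`; it assumes the `memory` argument behaves like the `m` object used in this document and that responses carry a `"results"` list.

```python
def search_by_category(memory, query: str, user_id: str, category: str, limit: int = 5) -> list:
    """Search one user's memories, restricted to a metadata category."""
    response = memory.search(
        query,
        user_id=user_id,                 # scope to a single user
        filters={"category": category},  # metadata filter
        limit=limit,                     # cap the number of hits
    )
    return [hit["memory"] for hit in response["results"]]


# Usage: search_by_category(m, "What can Alice eat?", "alice", "dietary")
```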

Advanced Usage

Adding and Searching Memories

from mem0 import Memory

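# With no config, Memory() uses its default providers, which expect OPENAI_API_KEY in the environment
m = Memory()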

# Add from a string
result = m.add(
    "I'm a vegetarian and allergic to nuts.",
    user_id="alice",
    metadata={"category": "dietary", "source": "profile"},
)
print(result)  # {"results": [{"id": "...", "memory": "User is vegetarian and allergic to nuts", "event": "ADD"}]}

# Add from conversation history
messages = [
    {"role": "user", "content": "I love hiking in the mountains."},
    {"role": "assistant", "content": "That sounds wonderful! Any favorite trails?"},
    {"role": "user", "content": "Yes, I love Yosemite. I go every summer."},
]
m.add(messages, user_id="alice")

# Search memories
results = m.search("What are Alice's food preferences?", user_id="alice", limit=5)
for r in results["results"]:
    print(f"[{r['score']:.2f}] {r['memory']}")

# Get all memories
all_mems = m.get_all(user_id="alice")
print(f"Total memories: {len(all_mems['results'])}")
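
Search hits carry a `score`, so weak matches can be dropped before they reach a prompt. The helper below is a sketch that assumes a higher score means a closer match; that depends on the vector store in use, so the threshold should be checked against real results. `confident_hits` is a hypothetical name.

```python
def confident_hits(response: dict, min_score: float) -> list:
    """Keep hits whose score is at least min_score (assumes higher means more similar)."""
    return [
        hit
        for hit in response.get("results", [])
        if hit.get("score") is not None and hit["score"] >= min_score
    ]


# Usage: strong = confident_hits(m.search("food preferences", user_id="alice"), 0.5)
```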

Multi-Scope Memory (User + Agent)

# Store agent-specific behavior preferences
m.add(
    "Always respond in bullet points for technical questions.",
    agent_id="tech-assistant",
)

# Store user-specific preferences
m.add(
    "Prefers concise answers, no more than 3 sentences.",
    user_id="bob",
)

# Retrieve both when generating response
agent_mems = m.search("response style", agent_id="tech-assistant")
user_mems = m.search("answer length", user_id="bob")

# Combine context
context = "\n".join([
    *[r["memory"] for r in agent_mems["results"]],
    *[r["memory"] for r in user_mems["results"]],
])
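
When several scopes are queried, the same fact can come back more than once. A sketch of merging responses while dropping repeats and keeping the original order follows; `merge_memories` is a hypothetical helper that assumes each response is a dict with a `"results"` list, as in the snippet above.

```python
def merge_memories(*responses: dict) -> list:
    """Flatten several search responses into one de-duplicated list of memory texts."""
    seen = set()
    merged = []
    for response in responses:
        for hit in response.get("results", []):
            text = hit["memory"]
            if text not in seen:  # keep only the first occurrence
                seen.add(text)
                merged.append(text)
    return merged


# Usage: context = "\n".join(merge_memories(agent_mems, user_mems))
```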

Session-Scoped Memory

import uuid

session_id = str(uuid.uuid4())

# Store within session
m.add(
    "User is debugging a Docker networking issue with bridge networks.",
    run_id=session_id,
    user_id="alice",
)

# Retrieve session context
session_ctx = m.get_all(run_id=session_id, user_id="alice")

# Cross-session user memory (no run_id filter)
long_term = m.search("Docker experience", user_id="alice")
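
If session memories are meant to be temporary, their lifetime can be tied to a `with` block. The sketch below assumes `delete_all` accepts `run_id` in the same way the other scoped calls do; `memory_session` is a hypothetical helper, and anything stored under that `run_id` is discarded when the block exits.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def memory_session(memory):
    """Yield a fresh run_id and delete that run's memories when the block exits."""
    run_id = str(uuid.uuid4())
    try:
        yield run_id
    finally:
        # Assumption: delete_all can be scoped by run_id alone.
        memory.delete_all(run_id=run_id)


# Usage:
# with memory_session(m) as session_id:
#     m.add("User is debugging a Docker networking issue.", run_id=session_id, user_id="alice")
```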

Memory-Augmented Chat

from openai import OpenAI
from mem0 import Memory

openai_client = OpenAI()
m = Memory()

def chat_with_memory(user_message: str, user_id: str) -> str:
    # Retrieve relevant memories
    memories = m.search(user_message, user_id=user_id, limit=5)
    memory_context = "\n".join([f"- {r['memory']}" for r in memories["results"]])

    # Build system prompt with memory
    system_prompt = f"""You are a helpful personal assistant.

What you know about this user:
{memory_context or "No previous memories yet."}

Use this context to personalize your responses."""

    # Generate response
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    )
    assistant_message = response.choices[0].message.content

    # Store the conversation as new memory
    m.add(
        [
            {"role": "user", "content": user_message},
            {"role": "assistant", "content": assistant_message},
        ],
        user_id=user_id,
    )

    return assistant_message

# Usage
print(chat_with_memory("I just moved to Seattle.", "alice"))
print(chat_with_memory("Recommend some outdoor activities.", "alice"))
# The second response should reflect that Alice is in Seattle, provided the first turn was extracted as a memory
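
As memories accumulate, the bullet list injected into the system prompt can grow. One simple guard is a character budget applied to the retrieved texts before they are joined. This is a sketch with a hypothetical helper, `fit_to_budget`; a token-based budget would be more precise but needs a tokenizer.

```python
def fit_to_budget(memories: list, max_chars: int) -> list:
    """Keep memories in order until the bullet list would exceed max_chars."""
    kept, used = [], 0
    for text in memories:
        line_len = len(f"- {text}") + 1  # +1 for the joining newline
        if used + line_len > max_chars:
            break
        kept.append(text)
        used += line_len
    return kept


# Usage inside chat_with_memory:
# texts = fit_to_budget([r["memory"] for r in memories["results"]], max_chars=1500)
# memory_context = "\n".join(f"- {t}" for t in texts)
```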

Managed API Client

from mem0 import MemoryClient

client = MemoryClient(api_key="m0-xxxxxxxxxxxxxxxxxxxx")

# Add memory
client.add(
    [{"role": "user", "content": "My tech stack is Python + FastAPI + PostgreSQL."}],
    user_id="developer-42",
)

# Search
results = client.search("programming languages", user_id="developer-42", limit=3)

# Get all memories
all_memories = client.get_all(user_id="developer-42")

# Delete specific memory
client.delete(memory_id="mem_xxxxxxxxxx")

# Delete all for user
client.delete_all(user_id="developer-42")
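
Depending on the client version and output format, a response may arrive as a bare list of hits or as a dict with a `"results"` key; that variation is an assumption to verify against the installed release. A small adapter keeps calling code indifferent to the shape. `normalize_results` is a hypothetical helper.

```python
def normalize_results(response) -> list:
    """Return the list of hits whether the response is a bare list or a dict with "results"."""
    if isinstance(response, dict):
        return list(response.get("results", []))
    return list(response or [])


# Usage:
# for hit in normalize_results(client.search("programming languages", user_id="developer-42")):
#     print(hit["memory"])
```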

LangChain Integration

from langchain_openai import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from mem0 import MemoryClient

client = MemoryClient(api_key="m0-xxxxxxxxxxxxxxxxxxxx")

class Mem0LangChainMemory(ConversationBufferMemory):
    user_id: str

    def save_context(self, inputs: dict, outputs: dict) -> None:
        super().save_context(inputs, outputs)
        # Persist to Mem0
        client.add(
            [
                {"role": "user", "content": inputs.get("input", "")},
                {"role": "assistant", "content": outputs.get("output", "")},
            ],
            user_id=self.user_id,
        )

    def load_memory_variables(self, inputs: dict) -> dict:
        # Augment with long-term memories
        query = inputs.get("input", "")
        long_term = client.search(query, user_id=self.user_id, limit=5)
        memories_str = "\n".join([r["memory"] for r in long_term["results"]])
        base = super().load_memory_variables(inputs)
        if memories_str:
            base["history"] = f"Long-term context:\n{memories_str}\n\n" + base.get("history", "")
        return base

Common Workflows

Personalized AI Assistant

def build_personalized_prompt(user_id: str, current_query: str) -> str:
    memories = m.search(current_query, user_id=user_id, limit=10)
    preferences = m.search("preferences style format", user_id=user_id, limit=5)

    # De-duplicate while keeping relevance order (a set would scramble it)
    all_context = list(dict.fromkeys(
        r["memory"] for r in memories["results"] + preferences["results"]
    ))

    return f"""User preferences and history:
{chr(10).join(f"- {mem}" for mem in all_context)}

Current request: {current_query}"""

Memory Maintenance

# View memory history (to understand how it evolved)
history = m.history("mem_xxxxxxxxxx")
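for h in history:
    # Field names differ across Mem0 versions, so read them defensively
    when = h.get("created_at") or h.get("timestamp")
    text = h.get("new_memory") or h.get("memory")
    print(f"{h.get('event')} at {when}: {text}")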

# Update incorrect memory
m.update("mem_xxxxxxxxxx", "User is based in New York (not Seattle).")

# Delete stale memories
all_mems = m.get_all(user_id="alice")
for mem in all_mems["results"]:
    year = (mem.get("metadata") or {}).get("year")  # metadata may be missing or None
    if "2023" in str(year or ""):
        m.delete(mem["id"])
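
Staleness can also be judged by age instead of a metadata tag. The sketch below assumes each memory dict carries an ISO 8601 `created_at` or `updated_at` string; those field names are an assumption to check against real `get_all` output, and `stale_memory_ids` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def stale_memory_ids(memories: list, max_age_days: int, now: Optional[datetime] = None) -> list:
    """Return ids of memories last touched more than max_age_days ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for mem in memories:
        stamp = mem.get("updated_at") or mem.get("created_at")
        if not stamp:
            continue  # no timestamp, so leave the memory alone
        when = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)  # treat naive stamps as UTC
        if when < cutoff:
            stale.append(mem["id"])
    return stale


# Usage:
# for memory_id in stale_memory_ids(m.get_all(user_id="alice")["results"], max_age_days=365):
#     m.delete(memory_id)
```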

Tips and Best Practices

Topic	Recommendation
Scoping	Always pass `user_id` to prevent memory leakage between users
Memory extraction	Pass full conversation turns (user + assistant), not just user messages
Search queries	Use natural language queries, not keywords — Mem0 uses semantic search
Limit	Set `limit` explicitly (e.g., `limit=5`) to control prompt size; the default result count can differ between versions and between the managed and self-hosted APIs
Managed vs self-hosted	Use managed API for prototyping; self-host for data privacy requirements
Graph memory	Enable graph memory for relationship tracking (e.g., family, colleagues)
Deduplication	Mem0 automatically deduplicates and updates conflicting memories
Metadata filtering	Use `filters={"category": "dietary"}` to scope search to specific domains
Session cleanup	Delete `run_id`-scoped memories after session ends to save storage
LLM choice	Use a small, inexpensive model such as GPT-4o-mini or Claude Haiku for memory extraction, because the extraction model is called frequently
Privacy	Review what Mem0 extracts; pass `infer=False` to `add` for sensitive conversations so messages are stored as-is without LLM extraction
Cost	Memory extraction uses LLM tokens; balance frequency with cost requirements
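
One way to trade extraction frequency against cost is to buffer conversation turns and store them in batches. The class below is a sketch: `TurnBuffer` is a hypothetical helper, and the batch size of 6 is an arbitrary starting point, not a recommendation from Mem0.

```python
from typing import Callable, List


class TurnBuffer:
    """Collect chat messages and hand them to `flush` in batches."""

    def __init__(self, flush: Callable[[List[dict]], None], batch_size: int = 6):
        self.flush = flush
        self.batch_size = batch_size
        self.pending: List[dict] = []

    def append(self, role: str, content: str) -> None:
        """Queue one message and flush once the batch is full."""
        self.pending.append({"role": role, "content": content})
        if len(self.pending) >= self.batch_size:
            self.drain()

    def drain(self) -> None:
        """Send any queued messages now (call this when a session ends)."""
        if self.pending:
            self.flush(self.pending)
            self.pending = []


# Usage: buffer = TurnBuffer(lambda msgs: m.add(msgs, user_id="alice"))
```

Larger batches mean fewer extraction calls, at the price of memories becoming searchable later.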