MetaGPT

Installation

# Install via pip (Python 3.9–3.11 recommended)
pip install metagpt

# Install with all extras
pip install "metagpt[all]"

# Install from source (latest features)
git clone https://github.com/geekan/MetaGPT
cd MetaGPT
pip install -e .

# Verify installation
python -c "from importlib.metadata import version; print(version('metagpt'))"

# System dependencies for browser/scraping features
pip install playwright
playwright install chromium

Configuration

# Create config file (first-time setup)
metagpt --init-config

# Config file location:
# Linux/Mac: ~/.metagpt/config2.yaml
# Windows:   %USERPROFILE%\.metagpt\config2.yaml

# ~/.metagpt/config2.yaml
llm:
  api_type: "openai"           # openai | anthropic | azure | ollama | groq
  model: "gpt-4o"
  api_key: "sk-your-key-here"
  max_token: 4096
  temperature: 0.0

# For Anthropic Claude
# llm:
#   api_type: "anthropic"
#   model: "claude-3-5-sonnet-20241022"
#   api_key: "sk-ant-your-key"

# For local models (Ollama)
# llm:
#   api_type: "ollama"
#   model: "codellama"
#   base_url: "http://localhost:11434/api"

# Project output directory
workspace:
  path: "/home/user/metagpt-workspace"

# Enable human-in-the-loop review
human_review: false

# Cost controls
max_budget: 3.0          # USD limit per run
token_limit: 40000       # token limit per run

# Browser for web-enabled agents
browser:
  engine: "playwright"
  browser_type: "chromium"

# Environment variable alternative
export OPENAI_API_KEY=sk-your-key
export ANTHROPIC_API_KEY=sk-ant-your-key
export METAGPT_PROJECT_PATH=/home/user/metagpt-workspace
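
A quick pre-flight check can show which credential sources are present before a run starts. The sketch below is plain Python and not part of MetaGPT: `default_config_path` and `available_sources` are hypothetical helper names, and the code assumes only the config location and the environment variable names listed above. It reports what exists and makes no claim about which source MetaGPT prefers.

```python
from pathlib import Path
from typing import List, Mapping, Optional

# Environment variable names taken from the export lines above.
KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def default_config_path(home: Optional[Path] = None) -> Path:
    """Return <home>/.metagpt/config2.yaml (hypothetical helper)."""
    return (home or Path.home()) / ".metagpt" / "config2.yaml"


def available_sources(env: Mapping[str, str], config: Path) -> List[str]:
    """List every credential source that is present, without ranking them."""
    found = [f"env:{name}" for name in KEY_VARS if env.get(name)]
    if config.is_file():
        found.append(f"file:{config}")
    return found
```

Calling `available_sources(os.environ, default_config_path())` before a run returns an empty list when neither a key variable nor a config file is in place.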

Core Commands / API

CLI Command	Description
`metagpt "requirement"`	Run software team on a requirement
`metagpt "req" --project-name NAME`	Set output project name
`metagpt "req" --project-path PATH`	Set output directory
`metagpt "req" --n-round N`	Limit collaboration rounds
`metagpt "req" --code-review`	Enable peer code review step
`metagpt "req" --run-tests`	Run generated tests
`metagpt "req" --recover-path PATH`	Resume from saved state
`metagpt "req" --human-approval`	Enable human-in-the-loop
`metagpt --init-config`	Generate default config file
`metagpt --help`	Show all options

Python API	Description
`Team()`	Create a multi-agent team
`team.hire([roles])`	Add roles to the team
`team.invest(budget)`	Set investment budget (USD)
`team.run_project("req")`	Start a software project
`asyncio.run(team.run(n_round=N))`	Run N collaboration rounds
`DataInterpreter()`	Create data science agent
`await di.run("task")`	Run a data analysis task (inside an async function)

Advanced Usage

Software Team — Full Pipeline

import asyncio
from metagpt.roles import (
    ProductManager,
    Architect,
    ProjectManager,
    Engineer,
    QaEngineer,
)
from metagpt.team import Team

async def build_software(requirement: str):
    company = Team()
    
    # Hire the full software team
    company.hire([
        ProductManager(),    # Writes PRD from requirement
        Architect(),         # Designs system architecture
        ProjectManager(),    # Creates task breakdown
        Engineer(n_borg=3),  # 3 parallel engineer agents
        QaEngineer(),        # Writes and runs tests
    ])
    
    # Set cost limit (optional safety measure)
    company.invest(3.00)     # $3.00 USD max
    
    # Start the project
    company.run_project(requirement)
    
    # Run collaboration rounds
    await company.run(n_round=5)

requirement = """
Build a web scraper that:
1. Accepts a URL and CSS selector as input
2. Extracts matching elements and their text content
3. Saves results to a CSV file with timestamp
4. Handles errors gracefully (timeouts, invalid URLs)
5. Include a CLI interface with --url and --selector flags
"""

asyncio.run(build_software(requirement))

Understanding the Role Pipeline

User Requirement
      ↓
ProductManager   → writes PRD (Product Requirements Document)
      ↓
Architect        → designs system design doc + file structure
      ↓
ProjectManager   → creates task list, assigns to engineers
      ↓
Engineer(s)      → writes actual code files
      ↓
QaEngineer       → writes unit tests, runs them
      ↓
Output: complete codebase in workspace/

# CLI equivalent of full pipeline
metagpt "Build a web scraper CLI tool" \
  --code-review \
  --run-tests \
  --n-round 5 \
  --project-name web_scraper
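
The hand-off order in the diagram above can be pictured as a chain in which each role consumes the previous role's artifact. The following is a toy simulation in plain Python, not MetaGPT's implementation: the lambdas stand in for LLM-backed roles, and `run_pipeline` is a hypothetical name.

```python
from typing import Callable, List, Tuple

# Each stage pairs a role name with a function that turns the
# previous artifact into the next one.
Stage = Tuple[str, Callable[[str], str]]

STAGES: List[Stage] = [
    ("ProductManager", lambda req: f"PRD for: {req}"),
    ("Architect", lambda prd: f"Design based on [{prd}]"),
    ("ProjectManager", lambda design: f"Tasks from [{design}]"),
    ("Engineer", lambda tasks: f"Code for [{tasks}]"),
    ("QaEngineer", lambda code: f"Tests for [{code}]"),
]


def run_pipeline(requirement: str, stages: List[Stage] = STAGES) -> List[Tuple[str, str]]:
    """Pass the artifact through every stage in order and record each output."""
    history = []
    artifact = requirement
    for role, step in stages:
        artifact = step(artifact)
        history.append((role, artifact))
    return history


for role, output in run_pipeline("web scraper CLI"):
    print(f"{role:15} -> {output}")
```

Dropping a stage from `STAGES` mirrors hiring fewer roles: the next role then works directly from the earlier artifact.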

Data Interpreter (Data Science Agent)

import asyncio
from metagpt.roles.di.data_interpreter import DataInterpreter

async def analyze_data():
    di = DataInterpreter()
    
    # Natural language data analysis task
    result = await di.run("""
    I have a CSV file at /data/sales_2024.csv with columns:
    date, product, category, revenue, units_sold, region
    
    Please:
    1. Load and clean the data (handle missing values)
    2. Calculate monthly revenue trends by category
    3. Identify top 5 products by total revenue
    4. Find which region has the highest growth rate
    5. Create visualizations for each analysis
    6. Save a summary report to /data/analysis_report.md
    """)
    
    print(result)

asyncio.run(analyze_data())

# DataInterpreter with specific tools enabled
import asyncio
from metagpt.roles.di.data_interpreter import DataInterpreter

async def run_di_with_tools():
    # Enable specific tools: web search, code execution, file I/O
    di = DataInterpreter(
        use_reflection=True,          # Self-correct on errors
        tools=["web_search", "file_read", "file_write", "execute_nb_code"],
    )
    
    result = await di.run("""
    Research the top 5 Python web frameworks by GitHub stars as of today,
    then create a comparison table and save it as frameworks.md
    """)
    print(result)

asyncio.run(run_di_with_tools())

Custom Roles

import asyncio
from metagpt.roles import Role
from metagpt.actions import Action
from metagpt.schema import Message

# Define a custom Action
class SecurityAudit(Action):
    name: str = "SecurityAudit"
    
    async def run(self, code: str) -> str:
        prompt = f"""
        Perform a security audit of the following code.
        Check for:
        - SQL injection vulnerabilities
        - Command injection risks
        - Hardcoded credentials
        - Insecure dependencies
        - Missing input validation
        
        Code:
        ```
        {code}
        ```
        
        Return a structured report with: severity (HIGH/MEDIUM/LOW), 
        description, line number (if applicable), and fix recommendation.
        """
        return await self._aask(prompt)

# Define a custom Role using the Action
class SecurityEngineer(Role):
    name: str = "SecurityEngineer"
    profile: str = "Security Engineer"
    goal: str = "Audit code for security vulnerabilities"
    constraints: str = "Be thorough, specific, and actionable"
    
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.set_actions([SecurityAudit])
    
    async def _act(self) -> Message:
        # Get the latest message (code to audit)
        todo = self.rc.todo
        msg = self.get_memories(k=1)[0]
        
        # Perform the audit
        audit_result = await todo.run(msg.content)
        
        return Message(
            content=audit_result,
            role=self.profile,
            cause_by=type(todo),
        )

# Use the custom role in a team
async def run_security_audit():
    from metagpt.team import Team
    from metagpt.roles import Engineer
    
    team = Team()
    team.hire([
        Engineer(),           # Writes code
        SecurityEngineer(),   # Reviews it for security
    ])
    
    team.run_project("Build a login system with user authentication")
    await team.run(n_round=3)

asyncio.run(run_security_audit())

Actions — Core Building Blocks

from metagpt.actions import Action
from metagpt.schema import Message

# Simple single-LLM-call action
class WriteUnitTests(Action):
    name: str = "WriteUnitTests"
    
    PROMPT_TEMPLATE: str = """
    Write comprehensive pytest unit tests for this Python code.
    Include: happy path, edge cases, error conditions.
    Use pytest fixtures and parametrize where appropriate.
    
    Code:
    {code}
    
    Return only the test file content.
    """
    
    async def run(self, code: str) -> str:
        prompt = self.PROMPT_TEMPLATE.format(code=code)
        return await self._aask(prompt)

# Action with structured output (using Pydantic)
import json
from typing import List

from pydantic import BaseModel

class TaskList(BaseModel):
    tasks: List[str]
    estimated_hours: float
    priority: str

class PlanProject(Action):
    name: str = "PlanProject"
    
    async def run(self, requirements: str) -> TaskList:
        prompt = f"""
        Break down this project into specific tasks with time estimates.
        Requirements: {requirements}
        
        Return JSON matching: {{"tasks": [...], "estimated_hours": N, "priority": "high|medium|low"}}
        """
        response = await self._aask(prompt)
        # Parse the JSON reply and validate it against the TaskList schema
        data = json.loads(response)
        return TaskList(**data)
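
Model replies often wrap JSON in a code fence or add a sentence before it, which makes a bare `json.loads` call fail. The sketch below shows one tolerant parsing approach using only the standard library. It swaps the Pydantic model for a dataclass to stay dependency-free, and `parse_task_list` is a hypothetical helper, not a MetaGPT function.

```python
import json
import re
from dataclasses import dataclass
from typing import List

# Matches a fenced block (three backticks, optional "json" tag) and captures its body.
FENCED = re.compile(r"`{3}(?:json)?\s*(.*?)\s*`{3}", re.DOTALL | re.IGNORECASE)


@dataclass
class TaskList:
    tasks: List[str]
    estimated_hours: float
    priority: str


def parse_task_list(response: str) -> TaskList:
    """Parse a TaskList from raw model output, tolerating a surrounding code fence."""
    match = FENCED.search(response)
    payload = match.group(1) if match else response.strip()
    data = json.loads(payload)  # raises json.JSONDecodeError on malformed output
    return TaskList(
        tasks=[str(task) for task in data["tasks"]],
        estimated_hours=float(data["estimated_hours"]),
        priority=str(data["priority"]),
    )
```

Malformed replies still raise, so the caller can decide whether to retry the prompt or give up.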

Memory and Context Management

from metagpt.roles import Role
from metagpt.schema import Message

# Roles automatically maintain message memory
# Access memory within a role:
class ContextAwareRole(Role):
    async def _act(self) -> Message:
        # Get all messages in memory
        all_messages = self.get_memories()
        
        # Get last N messages
        recent = self.get_memories(k=5)
        
        # Get messages from specific role
        prd_docs = [m for m in all_messages if m.role == "ProductManager"]
        
        # Full context for LLM call
        context = "\n".join([f"{m.role}: {m.content}" for m in recent])
        
        # _aask is an Action helper; from a Role, call the role's LLM directly
        response = await self.llm.aask(
            f"Given this context:\n{context}\n\nPerform your task."
        )
        return Message(content=response, role=self.profile)
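
The context string built in the role above grows with message length, so it can help to cap it before sending it to the model. This is a minimal sketch using a stand-in `Msg` dataclass rather than MetaGPT's `Message`. The `build_context` helper and its `max_chars` limit are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Msg:
    """Stand-in for a message with the two fields the role code above reads."""
    role: str
    content: str


def build_context(messages: List[Msg], k: int = 5, max_chars: int = 2000) -> str:
    """Join the last k messages as 'role: content' lines, trimmed to max_chars."""
    recent = messages[-k:] if k > 0 else []
    text = "\n".join(f"{m.role}: {m.content}" for m in recent)
    # Keep the tail so the newest messages survive truncation.
    return text[-max_chars:]
```

Trimming by characters is crude. A token-based limit would track model limits more closely, but needs a tokenizer.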

Human-in-the-Loop

import asyncio
from metagpt.team import Team
from metagpt.roles import ProductManager, Architect, Engineer

async def run_with_human_review():
    team = Team()
    team.hire([
        ProductManager(),
        Architect(),
        Engineer(),
    ])
    
    # Cap spend for this run (USD)
    team.invest(5.0)
    team.run_project(
        "Build a task management REST API",
        send_to="ProductManager",    # Start with PM
    )
    
    # Run one round at a time, review output
    await team.run(n_round=1)   # ProductManager writes PRD
    
    # Review .metagpt/workspace/output/PRD.md
    user_input = input("Review the PRD. Approve? (y/n): ").strip().lower()
    if user_input != "y":
        print("Stopping for revision")
        return
    
    await team.run(n_round=1)   # Architect designs system
    # Review architecture...
    
    await team.run(n_round=2)   # Engineers code

asyncio.run(run_with_human_review())

# Enable in config for automatic review prompts
# ~/.metagpt/config2.yaml
human_review: true
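
The review prompt in the script above reads free-form input, so answers such as "Y", " no " or an empty line need a defined meaning. One way to handle this is to normalize the answer into a small set of decisions. The sketch is plain Python, and `Decision` and `parse_decision` are hypothetical names, not MetaGPT APIs.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    EDIT = "edit"


def parse_decision(answer: str) -> Decision:
    """Map free-form input such as 'Y', ' no ' or 'edit' to a Decision."""
    normalized = answer.strip().lower()
    if normalized in {"y", "yes"}:
        return Decision.APPROVE
    if normalized in {"e", "edit"}:
        return Decision.EDIT
    # Anything else, including empty input, counts as a rejection
    # so that an unclear answer never lets the run continue.
    return Decision.REJECT
```

Defaulting to `REJECT` is a deliberate choice here: a typo stops the run, which costs less than an unreviewed round of LLM calls.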

Resume from Checkpoint

# MetaGPT saves state automatically during runs
# If interrupted, resume from last checkpoint:
metagpt "Build a REST API" --recover-path /path/to/workspace/project_dir

# Find workspace path from previous run output or:
ls ~/.metagpt/workspace/

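import asyncio
from pathlib import Path
from metagpt.team import Team

async def resume_project():
    # Load the team state saved by the earlier run. The directory must hold
    # the serialized team (team.json); adjust the path to match your run.
    team = Team.deserialize(stg_path=Path("/home/user/.metagpt/workspace/my_project"))
    team.invest(5.0)
    # Continue from where it left off
    await team.run(n_round=3)

asyncio.run(resume_project())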
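
Locating the right directory to resume from is often the awkward part. As an alternative to `ls`, the sketch below picks the most recently modified project directory. It assumes only that each run leaves a subdirectory in the workspace, and `latest_project` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def latest_project(workspace: Path) -> Optional[Path]:
    """Return the most recently modified subdirectory of workspace, or None."""
    if not workspace.is_dir():
        return None
    projects = [p for p in workspace.iterdir() if p.is_dir()]
    # default=None covers a workspace that exists but holds no projects yet.
    return max(projects, key=lambda p: p.stat().st_mtime, default=None)
```

For example, `latest_project(Path.home() / ".metagpt" / "workspace")` returns `None` when no earlier run exists.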

Common Workflows

Full Software Project from Scratch

# Minimal CLI usage
metagpt "Create a Python CLI tool for managing a todo list with SQLite storage, \
supporting: add, list, complete, and delete operations" \
  --project-name todo_cli \
  --code-review \
  --run-tests \
  --n-round 6

# Output in: ./workspace/todo_cli/

Data Science Pipeline

import asyncio
from metagpt.roles.di.data_interpreter import DataInterpreter

async def ml_pipeline():
    di = DataInterpreter(use_reflection=True)
    
    await di.run("""
    Build a machine learning pipeline:
    1. Load the Iris dataset from sklearn
    2. Perform EDA: distributions, correlations, pairplot
    3. Train 3 models: LogisticRegression, RandomForest, SVM
    4. Cross-validate each with 5-fold CV
    5. Compare performance (accuracy, precision, recall, F1)
    6. Save the best model as model.pkl
    7. Create a predictions function and test it
    8. Generate a report.md summarizing results
    """)

asyncio.run(ml_pipeline())

Code Review and Improvement Team

import asyncio
from metagpt.roles import Engineer, QaEngineer
from metagpt.team import Team

async def review_existing_code():
    team = Team()
    team.hire([Engineer(), QaEngineer()])
    
    with open("existing_code.py") as f:
        code = f.read()
    
    team.run_project(f"""
    Review and improve this existing Python code:
    
    ```python
    {code}
    ```
    
    Tasks:
    1. Add type hints to all functions
    2. Add docstrings following Google style
    3. Refactor any code smells
    4. Write comprehensive pytest tests
    5. Ensure all tests pass
    """)
    
    await team.run(n_round=4)

asyncio.run(review_existing_code())

Tips and Best Practices

Cost Control

# Always set a budget limit
team.invest(2.00)    # Hard stop at $2.00

# Fewer rounds = fewer LLM calls
await team.run(n_round=3)    # 3 rounds instead of the CLI default of 5

# Use a cheaper model for lower-stakes tasks
# config2.yaml: model: "gpt-4o-mini"   (priced well below gpt-4o per token)

# Monitor spend in the output logs: MetaGPT prints token cost per step
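
The idea behind a budget limit can be shown with a small tracker that adds up per-call cost and stops once the limit is passed. This is an illustrative sketch, not MetaGPT's cost manager. `BudgetTracker`, `BudgetExceeded` and the per-1k-token prices passed in are all hypothetical.

```python
class BudgetExceeded(RuntimeError):
    """Raised when recorded spend goes past the configured limit."""


class BudgetTracker:
    def __init__(self, max_budget_usd: float) -> None:
        self.max_budget_usd = max_budget_usd
        self.spent_usd = 0.0

    def record(self, prompt_tokens: int, completion_tokens: int,
               usd_per_1k_prompt: float, usd_per_1k_completion: float) -> float:
        """Add the cost of one call and return the running total."""
        cost = ((prompt_tokens / 1000) * usd_per_1k_prompt
                + (completion_tokens / 1000) * usd_per_1k_completion)
        self.spent_usd += cost
        if self.spent_usd > self.max_budget_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.max_budget_usd:.2f}"
            )
        return self.spent_usd
```

The check runs after a call has been paid for, so actual spend can overshoot the limit by up to one call.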

Role	Purpose	Skip if…
`ProductManager`	Writes PRD	Requirement is already detailed
`Architect`	System design	Simple scripts or single-file apps
`ProjectManager`	Task assignment	Only 1 engineer
`Engineer`	Code writing	Never (always needed)
`QaEngineer`	Test writing	Generated tests are not needed (on the CLI, `--run-tests` adds this role)

Effective Requirements Writing

# Write requirements that specify:
# 1. Tech stack (language, frameworks, versions)
# 2. Input/output format
# 3. File structure expectations  
# 4. How to run the app
# 5. Test expectations

good_requirement = """
Tech stack: Python 3.11, FastAPI 0.100+, SQLAlchemy 2.0, SQLite
Build a URL shortener service with:
- POST /shorten {url: str} → {short_code: str, short_url: str}
- GET /{code} → redirects to original URL
- GET /stats/{code} → {clicks: int, created_at: str}
- SQLite database with URL and clicks tables
- Runs with: uvicorn main:app --reload --port 8000
- Tests: pytest tests/ with at least 80% coverage
"""

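Requirements that follow the checklist above can be assembled from structured fields, so that no section is forgotten. The helper below is a plain-Python sketch: `build_requirement` is a hypothetical name, and it covers tech stack, features, run command and test expectations. File structure would be added as a further feature line.

```python
from typing import Sequence


def build_requirement(tech_stack: str, summary: str, features: Sequence[str],
                      run_command: str, test_expectation: str) -> str:
    """Assemble a requirement string in the layout of the example above."""
    lines = [f"Tech stack: {tech_stack}", f"{summary}:"]
    lines += [f"- {feature}" for feature in features]
    lines.append(f"- Runs with: {run_command}")
    lines.append(f"- Tests: {test_expectation}")
    return "\n".join(lines)


print(build_requirement(
    tech_stack="Python 3.11, FastAPI 0.100+, SQLAlchemy 2.0, SQLite",
    summary="Build a URL shortener service with",
    features=["POST /shorten {url: str} -> {short_code: str, short_url: str}"],
    run_command="uvicorn main:app --reload --port 8000",
    test_expectation="pytest tests/ with at least 80% coverage",
))
```
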
Debugging Multi-Agent Runs

# Enable verbose logging
export METAGPT_LOG_LEVEL=DEBUG
metagpt "your requirement"

# Check individual agent outputs in workspace
ls workspace/your_project/docs/
# PRD.md           — ProductManager output
# system_design/   — Architect output
# task/            — ProjectManager output

# Each file contains full LLM output for that agent's step
# Inspect to understand where things went wrong

# Common issues:
# Agent loops — increase n_round or use --recover-path
# OOM / timeout — reduce engineer count (n_borg=1)
# Bad code — enable --code-review flag
# Tests failing — enable --run-tests and check workspace/tests/
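
To see at a glance which agents produced output, the `docs/` directory can be summarized by file size. An empty or missing file often points to the step where the run stalled. This sketch assumes the `docs/` layout listed above, and `summarize_docs` is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict


def summarize_docs(project_dir: Path) -> Dict[str, int]:
    """Map each file under <project_dir>/docs to its size in bytes."""
    docs = project_dir / "docs"
    if not docs.is_dir():
        return {}
    return {
        path.relative_to(docs).as_posix(): path.stat().st_size
        for path in sorted(docs.rglob("*"))
        if path.is_file()
    }
```

An empty result means the run never reached the ProductManager step or wrote its output to a different location.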

Scaling Considerations

Scenario	Configuration
Simple script	`Engineer()` only, n_round=2
Small app	PM + Architect + Engineer, n_round=4
Full product	All 5 roles, n_round=5–8
Data analysis	`DataInterpreter()` only
Code review	`Engineer + QaEngineer`, n_round=3
Parallel coding	`Engineer(n_borg=3)` — 3 parallel agents

# For large projects, hire multiple engineers
company.hire([
    ProductManager(),
    Architect(),
    ProjectManager(),
    Engineer(n_borg=5),   # 5 engineers code in parallel
    QaEngineer(),
])
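
The scenario table above can also be kept as data, so that a script picks roles and round counts by name. The sketch stores role names as strings to avoid importing MetaGPT. `PRESETS` and `preset_for` are hypothetical, and the full-product preset uses the lower bound of the suggested 5 to 8 rounds.

```python
from typing import Dict, List, TypedDict


class TeamPreset(TypedDict):
    roles: List[str]
    n_round: int


# Values copied from the scenario table; role names are kept as strings.
PRESETS: Dict[str, TeamPreset] = {
    "simple_script": {"roles": ["Engineer"], "n_round": 2},
    "small_app": {"roles": ["ProductManager", "Architect", "Engineer"], "n_round": 4},
    "full_product": {
        "roles": ["ProductManager", "Architect", "ProjectManager", "Engineer", "QaEngineer"],
        "n_round": 5,
    },
    "code_review": {"roles": ["Engineer", "QaEngineer"], "n_round": 3},
}


def preset_for(scenario: str) -> TeamPreset:
    """Look up a preset, failing with the list of valid scenario names."""
    try:
        return PRESETS[scenario]
    except KeyError:
        raise KeyError(
            f"unknown scenario {scenario!r}; choose from {sorted(PRESETS)}"
        ) from None
```

A caller would map each role name to the matching class before passing the list to `hire`.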