MongoDB Cheatsheet

MongoDB - The Document Database

MongoDB is a source-available, cross-platform, document-oriented database program. Classified as a NoSQL database, it stores data in JSON-like documents with optional schemas.

Installation

Ubuntu/Debian

# Import MongoDB public GPG key (apt-key is deprecated on modern Ubuntu/Debian)
curl -fsSL https://www.mongodb.org/static/pgp/server-7.0.asc | \
  sudo gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg --dearmor

# Create list file for MongoDB
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list

# Update package database
sudo apt-get update

# Install MongoDB
sudo apt-get install -y mongodb-org

# Start MongoDB
sudo systemctl start mongod
sudo systemctl enable mongod

# Check status
sudo systemctl status mongod

# Connect to MongoDB
mongosh

CentOS/RHEL/Fedora

# Create repository file
sudo tee /etc/yum.repos.d/mongodb-org-7.0.repo << EOF
[mongodb-org-7.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/\$releasever/mongodb-org/7.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-7.0.asc
EOF

# Install MongoDB
sudo yum install -y mongodb-org

# Start MongoDB
sudo systemctl start mongod
sudo systemctl enable mongod

# Connect to MongoDB
mongosh

macOS

# Using Homebrew
brew tap mongodb/brew
brew install mongodb-community

# Start MongoDB
brew services start mongodb/brew/mongodb-community

# Connect to MongoDB
mongosh

# Manual installation
curl -O https://fastdl.mongodb.org/osx/mongodb-macos-x86_64-7.0.4.tgz
tar -zxvf mongodb-macos-x86_64-7.0.4.tgz
sudo mv mongodb-macos-x86_64-7.0.4 /usr/local/mongodb
export PATH=/usr/local/mongodb/bin:$PATH

Windows

# Download installer from https://www.mongodb.com/try/download/community
# Run the .msi installer

# Start MongoDB as service
net start MongoDB

# Connect using MongoDB Shell
mongosh

# Or install via Chocolatey
choco install mongodb

Docker

# Pull MongoDB image
docker pull mongo:7.0

# Run MongoDB container
docker run --name mongodb-container \
  -p 27017:27017 \
  -e MONGO_INITDB_ROOT_USERNAME=admin \
  -e MONGO_INITDB_ROOT_PASSWORD=password \
  -d mongo:7.0

# Connect to MongoDB in container
docker exec -it mongodb-container mongosh -u admin -p password

# Run with persistent data
docker run --name mongodb-container \
  -p 27017:27017 \
  -e MONGO_INITDB_ROOT_USERNAME=admin \
  -e MONGO_INITDB_ROOT_PASSWORD=password \
  -v mongodb-data:/data/db \
  -d mongo:7.0

# Docker Compose
cat > docker-compose.yml << EOF
version: '3.8'
services:
  mongodb:
    image: mongo:7.0
    container_name: mongodb
    restart: always
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: password
    volumes:
      - mongodb_data:/data/db

volumes:
  mongodb_data:
EOF

docker-compose up -d

Basic Commands

Connecting to MongoDB

// Connect to local MongoDB
mongosh

// Connect to remote MongoDB
mongosh "mongodb://username:password@hostname:27017/database"

// Connect with options
mongosh "mongodb://username:password@hostname:27017/database?authSource=admin&ssl=true"

// Connect to replica set
mongosh "mongodb://host1:27017,host2:27017,host3:27017/database?replicaSet=myReplicaSet"

// Connect to MongoDB Atlas
mongosh "mongodb+srv://username:password@cluster.mongodb.net/database"

Basic Information

// Show current database
db

// Show all databases
show dbs

// Switch to database
use myDatabase

// Show collections in current database
show collections

// Show users
show users

// Show roles
show roles

// Get database stats
db.stats()

// Get collection stats
db.myCollection.stats()

// Get server status
db.serverStatus()

// Show current operations
db.currentOp()

// Get MongoDB version
db.version()

// Get help
help
db.help()
db.myCollection.help()

Shell Operations

// Execute JavaScript
var result = db.users.findOne()
print(result.name)

// Load external JavaScript file
load("script.js")

// Exit shell
exit
quit()

// Clear screen
cls

// Show command history
history

// Set display options (mongosh)
config.set("displayBatchSize", 10)  // Limit results per page

Database Operations

Creating and Managing Databases

// Switch to database (creates if doesn't exist)
use myDatabase

// Create database with first document
db.users.insertOne({name: "John", email: "john@example.com"})

// Drop current database
db.dropDatabase()

// List all databases
show dbs

// Get database information
db.stats()
db.stats(1024*1024)  // Stats in MB

// Get database name
db.getName()

// Copy a database (cloneDatabase and copyDatabase were removed in
// MongoDB 4.2; use mongodump/mongorestore instead)
// mongodump --db source_db --archive | mongorestore --archive --nsFrom "source_db.*" --nsTo "target_db.*"

Database Administration

// Get database profiling level
db.getProfilingLevel()

// Set profiling level
db.setProfilingLevel(2)  // 0=off, 1=slow ops, 2=all ops

// Set profiling with slow operation threshold
db.setProfilingLevel(1, {slowms: 100})

// Get profiling status
db.getProfilingStatus()

// View profiler data
db.system.profile.find().limit(5).sort({ts: -1}).pretty()

// Run database command
db.runCommand({serverStatus: 1})

// Legacy error helpers (removed in MongoDB 5.1; modern drivers report
// write errors via exceptions and write concern acknowledgement)
db.getLastError()      // Get last error
db.getLastErrorObj()   // Get last error object
db.forceError()        // Force an error
db.resetError()        // Reset error state

Collection Operations

Creating Collections

// Create collection implicitly
db.users.insertOne({name: "John"})

// Create collection explicitly
db.createCollection("users")

// Create collection with options
db.createCollection("users", {
  capped: true,
  size: 100000,
  max: 5000
})

// Create collection with validation
db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email"],
      properties: {
        name: {
          bsonType: "string",
          description: "must be a string and is required"
        },
        email: {
          bsonType: "string",
          pattern: "^.+@.+$",
          description: "must be a valid email address"
        },
        age: {
          bsonType: "int",
          minimum: 0,
          maximum: 150,
          description: "must be an integer between 0 and 150"
        }
      }
    }
  }
})
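
// Quick check of the validator above: this insert should be rejected
// ("Document failed validation"), since age must be an int from 0 to 150
db.users.insertOne({name: "Test", email: "test@example.com", age: NumberInt(200)})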

// Create time series collection
db.createCollection("temperatures", {
  timeseries: {
    timeField: "timestamp",
    metaField: "metadata",
    granularity: "hours"
  }
})

Managing Collections

// List collections
show collections
db.getCollectionNames()

// Get collection information
db.users.stats()

// Rename collection
db.users.renameCollection("customers")

// Drop collection
db.users.drop()

// Check if collection exists
db.getCollectionNames().includes("users")

// Get collection options
db.runCommand({listCollections: 1, filter: {name: "users"}})

// Modify collection
db.runCommand({
  collMod: "users",
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email"]
    }
  }
})

// Convert to capped collection
db.runCommand({convertToCapped: "users", size: 100000})

Collection Validation

// Add validation to existing collection
db.runCommand({
  collMod: "users",
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email"],
      properties: {
        name: {
          bsonType: "string",
          minLength: 1,
          maxLength: 100
        },
        email: {
          bsonType: "string",
          pattern: "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"
        },
        age: {
          bsonType: "int",
          minimum: 0,
          maximum: 150
        },
        status: {
          enum: ["active", "inactive", "pending"]
        }
      }
    }
  },
  validationLevel: "strict",
  validationAction: "error"
})

// Set validation level
// strict: apply to all inserts and updates
// moderate: apply to inserts and updates to valid documents
db.runCommand({
  collMod: "users",
  validationLevel: "moderate"
})

// Set validation action
// error: reject invalid documents
// warn: log warning but allow invalid documents
db.runCommand({
  collMod: "users",
  validationAction: "warn"
})

Document Operations

Insert Operations

// Insert single document
db.users.insertOne({
  name: "John Doe",
  email: "john@example.com",
  age: 30,
  status: "active"
})

// Insert multiple documents
db.users.insertMany([
  {name: "Alice", email: "alice@example.com", age: 25},
  {name: "Bob", email: "bob@example.com", age: 35},
  {name: "Charlie", email: "charlie@example.com", age: 28}
])

// Insert with custom _id
db.users.insertOne({
  _id: ObjectId("507f1f77bcf86cd799439011"),
  name: "Custom ID User",
  email: "custom@example.com"
})

// Insert with ordered/unordered operations
db.users.insertMany([
  {name: "User1", email: "user1@example.com"},
  {name: "User2", email: "user2@example.com"}
], {ordered: false})

// Insert with write concern
db.users.insertOne(
  {name: "Important User", email: "important@example.com"},
  {writeConcern: {w: "majority", j: true, wtimeout: 5000}}
)

// Bulk insert operations
var bulk = db.users.initializeUnorderedBulkOp()
bulk.insert({name: "Bulk User 1", email: "bulk1@example.com"})
bulk.insert({name: "Bulk User 2", email: "bulk2@example.com"})
bulk.execute()
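
// Modern equivalent with bulkWrite (a sketch; same effect as the
// legacy bulk API above)
db.users.bulkWrite([
  {insertOne: {document: {name: "Bulk User 1", email: "bulk1@example.com"}}},
  {insertOne: {document: {name: "Bulk User 2", email: "bulk2@example.com"}}}
], {ordered: false})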

Update Operations

// Update single document
db.users.updateOne(
  {name: "John Doe"},
  {$set: {age: 31, lastModified: new Date()}}
)

// Update multiple documents
db.users.updateMany(
  {status: "inactive"},
  {$set: {status: "archived", archivedDate: new Date()}}
)

// Replace entire document
db.users.replaceOne(
  {name: "John Doe"},
  {
    name: "John Doe",
    email: "john.doe@newdomain.com",
    age: 31,
    status: "active",
    profile: {
      bio: "Software developer",
      location: "New York"
    }
  }
)

// Upsert (update or insert)
db.users.updateOne(
  {email: "newuser@example.com"},
  {
    $set: {name: "New User", status: "active"},
    $setOnInsert: {createdDate: new Date()}
  },
  {upsert: true}
)

// Update with operators
db.users.updateOne(
  {name: "John Doe"},
  {
    $inc: {age: 1, loginCount: 1},
    $push: {tags: "premium"},
    $addToSet: {skills: {$each: ["JavaScript", "MongoDB"]}},
    $unset: {temporaryField: ""},
    $rename: {oldFieldName: "newFieldName"},
    $min: {lowestScore: 85},
    $max: {highestScore: 95},
    $mul: {points: 1.1}
  }
)

// Update array elements
db.users.updateOne(
  {name: "John Doe", "addresses.type": "home"},
  {$set: {"addresses.$.street": "123 New Street"}}
)

// Update all array elements
db.users.updateOne(
  {name: "John Doe"},
  {$set: {"addresses.$[].verified": true}}
)

// Update specific array elements with filters
db.users.updateOne(
  {name: "John Doe"},
  {$set: {"addresses.$[addr].verified": true}},
  {arrayFilters: [{"addr.type": "work"}]}
)
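
// For reference, the array updates above assume documents shaped
// roughly like this (hypothetical sample data):
db.users.insertOne({
  name: "John Doe",
  addresses: [
    {type: "home", street: "1 Old Street", verified: false},
    {type: "work", street: "9 Office Road", verified: false}
  ]
})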

Delete Operations

// Delete single document
db.users.deleteOne({name: "John Doe"})

// Delete multiple documents
db.users.deleteMany({status: "inactive"})

// Delete all documents in collection
db.users.deleteMany({})

// Delete with write concern
db.users.deleteOne(
  {name: "Important User"},
  {writeConcern: {w: "majority", j: true}}
)

// Find and delete
db.users.findOneAndDelete(
  {status: "pending"},
  {sort: {createdDate: 1}}
)

// Bulk delete operations
var bulk = db.users.initializeUnorderedBulkOp()
bulk.find({status: "inactive"}).remove()
bulk.find({lastLogin: {$lt: new Date("2023-01-01")}}).removeOne()
bulk.execute()

Find and Modify Operations

// Find and update
db.users.findOneAndUpdate(
  {name: "John Doe"},
  {$inc: {version: 1}, $set: {lastModified: new Date()}},
  {
    returnDocument: "after",  // "before" or "after"
    upsert: true,
    sort: {createdDate: -1}
  }
)

// Find and replace
db.users.findOneAndReplace(
  {name: "John Doe"},
  {
    name: "John Doe",
    email: "john@newdomain.com",
    status: "updated"
  },
  {returnDocument: "after"}
)

// Find and delete
db.users.findOneAndDelete(
  {status: "toDelete"},
  {sort: {priority: -1}}
)

Query Operations

Basic Queries

// Find all documents
db.users.find()

// Find with pretty formatting (mongosh pretty-prints by default)
db.users.find().pretty()

// Find one document
db.users.findOne()

// Find with condition
db.users.find({name: "John Doe"})

// Find with multiple conditions
db.users.find({name: "John Doe", status: "active"})

// Find with projection (select specific fields)
db.users.find({}, {name: 1, email: 1, _id: 0})

// Find with exclusion
db.users.find({}, {password: 0, internalNotes: 0})

// Count documents
db.users.countDocuments()
db.users.countDocuments({status: "active"})

// Estimate document count (faster but less accurate)
db.users.estimatedDocumentCount()

// Check if documents exist
db.users.findOne({email: "john@example.com"}) !== null

Query Operators

// Comparison operators
db.users.find({age: {$eq: 30}})        // Equal
db.users.find({age: {$ne: 30}})        // Not equal
db.users.find({age: {$gt: 30}})        // Greater than
db.users.find({age: {$gte: 30}})       // Greater than or equal
db.users.find({age: {$lt: 30}})        // Less than
db.users.find({age: {$lte: 30}})       // Less than or equal
db.users.find({age: {$in: [25, 30, 35]}})     // In array
db.users.find({age: {$nin: [25, 30, 35]}})    // Not in array

// Logical operators
db.users.find({
  $and: [
    {age: {$gte: 25}},
    {status: "active"}
  ]
})

db.users.find({
  $or: [
    {age: {$lt: 25}},
    {status: "premium"}
  ]
})

db.users.find({
  $nor: [
    {age: {$lt: 18}},
    {status: "banned"}
  ]
})

db.users.find({age: {$not: {$lt: 18}}})

// Element operators
db.users.find({email: {$exists: true}})
db.users.find({age: {$type: "int"}})
db.users.find({age: {$type: 16}})  // BSON type number

// Evaluation operators
db.users.find({
  $expr: {$gt: ["$age", "$retirementAge"]}
})

db.users.find({
  email: {$regex: /^john/, $options: "i"}
})

db.users.find({
  $text: {$search: "john developer"}
})

db.users.find({
  location: {
    $geoWithin: {
      $centerSphere: [[-74, 40.74], 10/3963.2]
    }
  }
})

// Array operators
db.users.find({tags: {$all: ["developer", "javascript"]}})
db.users.find({scores: {$elemMatch: {$gte: 80, $lt: 90}}})
db.users.find({tags: {$size: 3}})

Advanced Queries

// Nested field queries
db.users.find({"profile.age": 30})
db.users.find({"address.city": "New York"})

// Array queries
db.users.find({tags: "developer"})  // Array contains value
db.users.find({"scores.0": {$gt: 85}})  // First array element
db.users.find({scores: {$elemMatch: {$gte: 80, $lt: 90}}})

// Query with sort
db.users.find().sort({age: 1, name: -1})  // 1=ascending, -1=descending

// Query with limit and skip
db.users.find().limit(10)
db.users.find().skip(20).limit(10)

// Query with cursor methods
db.users.find()
  .sort({createdDate: -1})
  .limit(10)
  .skip(0)
  .hint({createdDate: -1})  // Force index usage

// Distinct values
db.users.distinct("status")
db.users.distinct("tags")
db.users.distinct("age", {status: "active"})

// Cursor iteration
var cursor = db.users.find()
while (cursor.hasNext()) {
  var doc = cursor.next()
  print(doc.name)
}

// forEach iteration
db.users.find().forEach(function(doc) {
  print(doc.name + " - " + doc.email)
})

// Map function
db.users.find().map(function(doc) {
  return doc.name + " (" + doc.age + ")"
})

Text Search

// Create text index
db.users.createIndex({
  name: "text",
  bio: "text",
  skills: "text"
})

// Text search
db.users.find({$text: {$search: "javascript developer"}})

// Text search with score
db.users.find(
  {$text: {$search: "javascript developer"}},
  {score: {$meta: "textScore"}}
).sort({score: {$meta: "textScore"}})

// Phrase search
db.users.find({$text: {$search: "\"full stack developer\""}})

// Exclude terms
db.users.find({$text: {$search: "javascript -python"}})

// Language-specific search
db.users.find({$text: {$search: "développeur", $language: "french"}})

// Case-sensitive search
db.users.find({$text: {$search: "JavaScript", $caseSensitive: true}})

// Diacritic-sensitive search
db.users.find({$text: {$search: "café", $diacriticSensitive: true}})

Indexing

Creating Indexes

// Single field index
db.users.createIndex({email: 1})  // 1=ascending, -1=descending

// Compound index
db.users.createIndex({status: 1, age: -1})

// Multikey index (for arrays)
db.users.createIndex({tags: 1})

// Text index
db.users.createIndex({name: "text", bio: "text"})

// Geospatial indexes
db.locations.createIndex({coordinates: "2dsphere"})  // GeoJSON
db.locations.createIndex({coordinates: "2d"})        // Legacy coordinates

// Hashed index
db.users.createIndex({userId: "hashed"})

// Partial index
db.users.createIndex(
  {email: 1},
  {partialFilterExpression: {status: "active"}}
)

// Sparse index
db.users.createIndex({phone: 1}, {sparse: true})

// Unique index
db.users.createIndex({email: 1}, {unique: true})

// TTL index (Time To Live)
db.sessions.createIndex({createdAt: 1}, {expireAfterSeconds: 3600})
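
// Example: this session document (hypothetical fields) is removed
// roughly an hour after its createdAt value
db.sessions.insertOne({sessionId: "abc123", createdAt: new Date()})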

// Background index creation (the background option is ignored since
// MongoDB 4.2; all index builds now use an optimized hybrid process)
db.users.createIndex({name: 1}, {background: true})

// Index with custom name
db.users.createIndex({email: 1}, {name: "email_unique_idx"})

// Wildcard index
db.products.createIndex({"attributes.$**": 1})

Managing Indexes

// List indexes
db.users.getIndexes()

// Get index information
db.users.getIndexKeys()

// Drop index
db.users.dropIndex({email: 1})
db.users.dropIndex("email_1")

// Drop all indexes (except _id)
db.users.dropIndexes()

// Rebuild indexes (deprecated; prefer dropping and recreating indexes)
db.users.reIndex()

// Hide/unhide index
db.users.hideIndex("email_1")
db.users.unhideIndex("email_1")

// Get index stats
db.users.aggregate([{$indexStats: {}}])

// Check index usage
db.users.find({email: "john@example.com"}).explain("executionStats")

// Force index usage
db.users.find({email: "john@example.com"}).hint({email: 1})

// Force a collection scan (no index) with a $natural hint
db.users.find({status: "active", age: {$gte: 25}}).hint({$natural: 1})

Index Performance

// Explain query execution
db.users.find({email: "john@example.com"}).explain()
db.users.find({email: "john@example.com"}).explain("executionStats")
db.users.find({email: "john@example.com"}).explain("allPlansExecution")

// Analyze index effectiveness
db.users.find({status: "active"}).explain("executionStats").executionStats

// Check if index is used
var explain = db.users.find({email: "john@example.com"}).explain("executionStats")
explain.executionStats.executionSuccess
explain.executionStats.totalDocsExamined
explain.executionStats.totalKeysExamined

// Index selectivity analysis
db.users.aggregate([
  {$group: {_id: "$status", count: {$sum: 1}}},
  {$sort: {count: -1}}
])

// Find unused indexes (zero recorded accesses)
db.users.aggregate([{$indexStats: {}}, {$match: {"accesses.ops": 0}}])

// Monitor index usage
db.users.aggregate([{$indexStats: {}}])

Aggregation Framework

Basic Aggregation

// Simple aggregation
db.users.aggregate([
  {$match: {status: "active"}},
  {$group: {_id: "$department", count: {$sum: 1}}},
  {$sort: {count: -1}}
])

// Count by field
db.users.aggregate([
  {$group: {_id: "$status", count: {$sum: 1}}}
])

// Sum and average
db.orders.aggregate([
  {$group: {
    _id: "$customerId",
    totalAmount: {$sum: "$amount"},
    avgAmount: {$avg: "$amount"},
    orderCount: {$sum: 1}
  }}
])

// Min and max
db.products.aggregate([
  {$group: {
    _id: "$category",
    minPrice: {$min: "$price"},
    maxPrice: {$max: "$price"},
    products: {$push: "$name"}
  }}
])

Aggregation Stages

// $match - Filter documents
db.users.aggregate([
  {$match: {age: {$gte: 25}, status: "active"}}
])

// $project - Select and transform fields
db.users.aggregate([
  {$project: {
    name: 1,
    email: 1,
    ageGroup: {
      $cond: {
        if: {$gte: ["$age", 30]},
        then: "senior",
        else: "junior"
      }
    }
  }}
])

// $group - Group documents
db.sales.aggregate([
  {$group: {
    _id: {year: {$year: "$date"}, month: {$month: "$date"}},
    totalSales: {$sum: "$amount"},
    avgSale: {$avg: "$amount"},
    salesCount: {$sum: 1}
  }}
])

// $sort - Sort documents
db.users.aggregate([
  {$sort: {age: -1, name: 1}}
])

// $limit and $skip - Pagination
db.users.aggregate([
  {$skip: 20},
  {$limit: 10}
])

// $unwind - Deconstruct arrays
db.users.aggregate([
  {$unwind: "$tags"},
  {$group: {_id: "$tags", count: {$sum: 1}}}
])

// $lookup - Join collections
db.orders.aggregate([
  {$lookup: {
    from: "users",
    localField: "userId",
    foreignField: "_id",
    as: "user"
  }}
])

// $addFields - Add new fields
db.users.aggregate([
  {$addFields: {
    fullName: {$concat: ["$firstName", " ", "$lastName"]},
    isAdult: {$gte: ["$age", 18]}
  }}
])

// $replaceRoot - Replace document root
db.users.aggregate([
  {$replaceRoot: {newRoot: "$profile"}}
])

// $facet - Multiple aggregation pipelines
db.products.aggregate([
  {$facet: {
    priceRanges: [
      {$bucket: {
        groupBy: "$price",
        boundaries: [0, 100, 500, 1000, Infinity],
        default: "Other"
      }}
    ],
    categories: [
      {$group: {_id: "$category", count: {$sum: 1}}}
    ]
  }}
])

Advanced Aggregation

// Complex pipeline
db.orders.aggregate([
  // Stage 1: Match recent orders
  {$match: {
    orderDate: {$gte: new Date("2023-01-01")}
  }},

  // Stage 2: Lookup user information
  {$lookup: {
    from: "users",
    localField: "userId",
    foreignField: "_id",
    as: "user"
  }},

  // Stage 3: Unwind user array
  {$unwind: "$user"},

  // Stage 4: Lookup product information
  {$lookup: {
    from: "products",
    localField: "items.productId",
    foreignField: "_id",
    as: "productDetails"
  }},

  // Stage 5: Add calculated fields
  {$addFields: {
    totalAmount: {$sum: "$items.price"},
    customerName: "$user.name",
    orderMonth: {$month: "$orderDate"}
  }},

  // Stage 6: Group by month and customer
  {$group: {
    _id: {
      month: "$orderMonth",
      customerId: "$userId"
    },
    customerName: {$first: "$customerName"},
    totalOrders: {$sum: 1},
    totalSpent: {$sum: "$totalAmount"},
    avgOrderValue: {$avg: "$totalAmount"}
  }},

  // Stage 7: Sort by total spent
  {$sort: {totalSpent: -1}},

  // Stage 8: Limit results
  {$limit: 100}
])

// Window functions (MongoDB 5.0+)
db.sales.aggregate([
  {$setWindowFields: {
    partitionBy: "$department",
    sortBy: {salary: -1},
    output: {
      rank: {$rank: {}},
      denseRank: {$denseRank: {}},
      runningTotal: {
        $sum: "$salary",
        window: {documents: ["unbounded preceding", "current"]}
      }
    }
  }}
])

// Time series aggregation
db.temperatures.aggregate([
  {$match: {
    timestamp: {
      $gte: new Date("2023-01-01"),
      $lt: new Date("2023-02-01")
    }
  }},
  {$group: {
    _id: {
      $dateTrunc: {
        date: "$timestamp",
        unit: "hour"
      }
    },
    avgTemp: {$avg: "$temperature"},
    minTemp: {$min: "$temperature"},
    maxTemp: {$max: "$temperature"},
    readings: {$sum: 1}
  }},
  {$sort: {_id: 1}}
])

Aggregation Operators

// Arithmetic operators
db.products.aggregate([
  {$project: {
    name: 1,
    price: 1,
    discountedPrice: {$multiply: ["$price", 0.9]},
    tax: {$multiply: ["$price", 0.1]},
    finalPrice: {
      $add: [
        {$multiply: ["$price", 0.9]},
        {$multiply: ["$price", 0.1]}
      ]
    }
  }}
])

// String operators
db.users.aggregate([
  {$project: {
    name: 1,
    email: 1,
    domain: {$arrayElemAt: [{$split: ["$email", "@"]}, 1]},
    initials: {
      $concat: [
        {$substr: ["$firstName", 0, 1]},
        {$substr: ["$lastName", 0, 1]}
      ]
    },
    nameLength: {$strLenCP: "$name"}
  }}
])

// Date operators
db.orders.aggregate([
  {$project: {
    orderId: 1,
    orderDate: 1,
    year: {$year: "$orderDate"},
    month: {$month: "$orderDate"},
    dayOfWeek: {$dayOfWeek: "$orderDate"},
    quarter: {
      $ceil: {$divide: [{$month: "$orderDate"}, 3]}
    },
    daysSinceOrder: {
      $divide: [
        {$subtract: [new Date(), "$orderDate"]},
        1000 * 60 * 60 * 24
      ]
    }
  }}
])

// Array operators
db.users.aggregate([
  {$project: {
    name: 1,
    tags: 1,
    tagCount: {$size: "$tags"},
    firstTag: {$arrayElemAt: ["$tags", 0]},
    hasJavaScript: {$in: ["javascript", "$tags"]},
    uniqueTags: {$setUnion: ["$tags", []]},
    sortedTags: {$sortArray: {input: "$tags", sortBy: 1}}  // MongoDB 5.2+
  }}
])

// Conditional operators
db.users.aggregate([
  {$project: {
    name: 1,
    age: 1,
    category: {
      $switch: {
        branches: [
          {case: {$lt: ["$age", 18]}, then: "Minor"},
          {case: {$lt: ["$age", 65]}, then: "Adult"},
          {case: {$gte: ["$age", 65]}, then: "Senior"}
        ],
        default: "Unknown"
      }
    },
    status: {
      $cond: {
        if: {$eq: ["$isActive", true]},
        then: "Active",
        else: "Inactive"
      }
    }
  }}
])

Data Modeling

Document Structure

// Embedded documents
{
  _id: ObjectId("..."),
  name: "John Doe",
  email: "john@example.com",
  address: {
    street: "123 Main St",
    city: "New York",
    state: "NY",
    zipCode: "10001",
    country: "USA"
  },
  phones: [
    {type: "home", number: "555-1234"},
    {type: "work", number: "555-5678"}
  ]
}

// Referenced documents
// User document
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  name: "John Doe",
  email: "john@example.com"
}

// Order document
{
  _id: ObjectId("507f1f77bcf86cd799439012"),
  userId: ObjectId("507f1f77bcf86cd799439011"),
  orderDate: new Date(),
  items: [
    {productId: ObjectId("..."), quantity: 2, price: 29.99},
    {productId: ObjectId("..."), quantity: 1, price: 49.99}
  ],
  totalAmount: 109.97
}
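
// Resolving the reference at query time, a sketch using $lookup
// (details in the Aggregation Framework section):
db.orders.aggregate([
  {$lookup: {from: "users", localField: "userId", foreignField: "_id", as: "user"}},
  {$unwind: "$user"}
])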

Schema Design Patterns

// One-to-One: Embedded
{
  _id: ObjectId("..."),
  name: "John Doe",
  profile: {
    bio: "Software developer",
    avatar: "avatar.jpg",
    preferences: {
      theme: "dark",
      language: "en"
    }
  }
}

// One-to-Many: Embedded (small arrays)
{
  _id: ObjectId("..."),
  title: "Blog Post",
  content: "...",
  comments: [
    {
      author: "Alice",
      text: "Great post!",
      date: new Date()
    },
    {
      author: "Bob",
      text: "Thanks for sharing",
      date: new Date()
    }
  ]
}

// One-to-Many: Referenced (large arrays)
// Blog post
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  title: "Blog Post",
  content: "...",
  author: "John Doe"
}

// Comments (separate collection)
{
  _id: ObjectId("..."),
  postId: ObjectId("507f1f77bcf86cd799439011"),
  author: "Alice",
  text: "Great post!",
  date: new Date()
}

// Many-to-Many: Array of references
// User document
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  name: "John Doe",
  skills: [
    ObjectId("507f1f77bcf86cd799439021"),  // JavaScript
    ObjectId("507f1f77bcf86cd799439022"),  // MongoDB
    ObjectId("507f1f77bcf86cd799439023")   // Node.js
  ]
}

// Skill document
{
  _id: ObjectId("507f1f77bcf86cd799439021"),
  name: "JavaScript",
  category: "Programming Language"
}

// Polymorphic pattern
{
  _id: ObjectId("..."),
  type: "vehicle",
  subtype: "car",
  make: "Toyota",
  model: "Camry",
  doors: 4,
  fuelType: "gasoline"
}

{
  _id: ObjectId("..."),
  type: "vehicle",
  subtype: "motorcycle",
  make: "Harley Davidson",
  model: "Street 750",
  engineSize: "750cc"
}

Advanced Patterns

// Bucket pattern (for time series data)
{
  _id: ObjectId("..."),
  sensor_id: "sensor_001",
  timestamp: new Date("2023-01-01T00:00:00Z"),
  measurements: [
    {time: new Date("2023-01-01T00:00:00Z"), temp: 20.5, humidity: 65},
    {time: new Date("2023-01-01T00:01:00Z"), temp: 20.7, humidity: 64},
    {time: new Date("2023-01-01T00:02:00Z"), temp: 20.6, humidity: 66}
  ],
  count: 3,
  min_temp: 20.5,
  max_temp: 20.7,
  avg_temp: 20.6
}
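
// Appending a reading while maintaining the bucket's summary fields,
// a sketch (the collection name and 60-reading capacity are assumptions;
// avg_temp is easier to recompute at read time):
db.sensorBuckets.updateOne(
  {sensor_id: "sensor_001", count: {$lt: 60}},  // reuse bucket until full
  {
    $push: {measurements: {time: new Date(), temp: 20.8, humidity: 63}},
    $inc: {count: 1},
    $min: {min_temp: 20.8},
    $max: {max_temp: 20.8}
  },
  {upsert: true}
)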

// Outlier pattern
{
  _id: ObjectId("..."),
  product_id: "product_001",
  year: 2023,
  month: 1,
  sales: [
    {day: 1, amount: 1000},
    {day: 2, amount: 1200},
    // ... normal days
    {day: 15, amount: 50000, note: "Black Friday sale"}  // Outlier
  ]
}

// Computed pattern
{
  _id: ObjectId("..."),
  product_id: "product_001",
  reviews: [
    {rating: 5, comment: "Excellent!"},
    {rating: 4, comment: "Good product"},
    {rating: 5, comment: "Love it!"}
  ],
  // Computed fields
  total_reviews: 3,
  average_rating: 4.67,
  rating_distribution: {
    5: 2,
    4: 1,
    3: 0,
    2: 0,
    1: 0
  }
}
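
// Keeping the computed fields in sync when a new review arrives,
// a sketch using an aggregation-pipeline update (MongoDB 4.2+):
db.products.updateOne(
  {product_id: "product_001"},
  [
    {$set: {reviews: {$concatArrays: ["$reviews", [{rating: 5, comment: "Great!"}]]}}},
    {$set: {
      total_reviews: {$size: "$reviews"},
      average_rating: {$round: [{$avg: "$reviews.rating"}, 2]}
    }}
  ]
)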

// Extended reference pattern
{
  _id: ObjectId("..."),
  order_id: "ORD-001",
  customer: {
    id: ObjectId("507f1f77bcf86cd799439011"),
    name: "John Doe",
    email: "john@example.com"  // Denormalized for quick access
  },
  items: [
    {
      product: {
        id: ObjectId("507f1f77bcf86cd799439021"),
        name: "Laptop",
        price: 999.99  // Denormalized
      },
      quantity: 1
    }
  ]
}

Replication

Replica Set Setup

// Initialize replica set
rs.initiate({
  _id: "myReplicaSet",
  members: [
    {_id: 0, host: "mongodb1.example.com:27017"},
    {_id: 1, host: "mongodb2.example.com:27017"},
    {_id: 2, host: "mongodb3.example.com:27017"}
  ]
})

// Add member to replica set
rs.add("mongodb4.example.com:27017")

// Add member with options
rs.add({
  host: "mongodb4.example.com:27017",
  priority: 0.5,
  votes: 1
})

// Remove member from replica set
rs.remove("mongodb4.example.com:27017")

// Check replica set status
rs.status()

// Check replica set configuration
rs.conf()

// Check if current node is primary (isMaster is deprecated; use hello)
db.hello()

// Step down primary (force election)
rs.stepDown(60)  // Step down for 60 seconds

// Force reconfiguration
rs.reconfig(config, {force: true})

Replica Set Configuration

// Configure replica set with different member types
var config = {
  _id: "myReplicaSet",
  members: [
    // Primary eligible members
    {_id: 0, host: "mongodb1.example.com:27017", priority: 2},
    {_id: 1, host: "mongodb2.example.com:27017", priority: 1},

    // Secondary only (priority 0)
    {_id: 2, host: "mongodb3.example.com:27017", priority: 0},

    // Hidden member (for backups)
    {_id: 3, host: "mongodb4.example.com:27017", priority: 0, hidden: true},

    // Arbiter (voting only, no data)
    {_id: 4, host: "mongodb5.example.com:27017", arbiterOnly: true},

    // Delayed member (for point-in-time recovery; slaveDelay was renamed
    // secondaryDelaySecs in MongoDB 5.0)
    {_id: 5, host: "mongodb6.example.com:27017", priority: 0, secondaryDelaySecs: 3600}
  ]
}

rs.initiate(config)

// Modify replica set configuration
var config = rs.conf()
config.members[0].priority = 3
rs.reconfig(config)

// Set read preference
db.getMongo().setReadPref("secondary")
db.getMongo().setReadPref("primaryPreferred")
db.getMongo().setReadPref("secondaryPreferred")

// Read from specific tag
db.getMongo().setReadPref("secondary", [{datacenter: "west"}])

Monitoring Replication

// Check replication lag
rs.printReplicationInfo()
rs.printSecondaryReplicationInfo()  // formerly rs.printSlaveReplicationInfo()

// Get oplog information (stored in the local database)
use local
db.oplog.rs.find().sort({$natural: -1}).limit(1)

// Check sync status
db.runCommand({replSetGetStatus: 1})

// Monitor oplog size
db.oplog.rs.stats()

// Check last applied operation
db.runCommand({replSetGetStatus: 1}).members.forEach(function(member) {
  print(member.name + ": " + member.optimeDate)
})

// Force sync from specific member
db.adminCommand({replSetSyncFrom: "mongodb2.example.com:27017"})

// Resync member (the resync command was removed in MongoDB 4.2; to force
// an initial sync, wipe the member's dbPath and restart it)

Sharding

Shard Cluster Setup

// Start config servers (replica set)
mongod --configsvr --replSet configReplSet --port 27019 --dbpath /data/configdb

// Initialize config server replica set
rs.initiate({
  _id: "configReplSet",
  configsvr: true,
  members: [
    {_id: 0, host: "config1.example.com:27019"},
    {_id: 1, host: "config2.example.com:27019"},
    {_id: 2, host: "config3.example.com:27019"}
  ]
})

// Start shard servers (replica sets)
mongod --shardsvr --replSet shard1ReplSet --port 27018 --dbpath /data/shard1

// Start mongos (query router)
mongos --configdb configReplSet/config1.example.com:27019,config2.example.com:27019,config3.example.com:27019 --port 27017

// Connect to mongos and add shards
sh.addShard("shard1ReplSet/shard1-1.example.com:27018,shard1-2.example.com:27018,shard1-3.example.com:27018")
sh.addShard("shard2ReplSet/shard2-1.example.com:27018,shard2-2.example.com:27018,shard2-3.example.com:27018")

// Enable sharding for database
sh.enableSharding("myDatabase")

// Shard collection
sh.shardCollection("myDatabase.users", {userId: 1})

// Shard with compound key
sh.shardCollection("myDatabase.orders", {customerId: 1, orderDate: 1})

// Shard with hashed key
sh.shardCollection("myDatabase.logs", {_id: "hashed"})

Managing Shards

// Check sharding status
sh.status()

// List shards
db.adminCommand({listShards: 1})

// Check if collection is sharded
db.users.getShardDistribution()

// Move chunk manually
sh.moveChunk("myDatabase.users", {userId: 1000}, "shard2ReplSet")

// Split chunk
sh.splitAt("myDatabase.users", {userId: 5000})

// Enable balancer
sh.enableBalancing("myDatabase.users")

// Disable balancer
sh.disableBalancing("myDatabase.users")

// Check balancer status
sh.getBalancerState()
sh.isBalancerRunning()

// Set balancer window (run against the config database)
use config
db.settings.updateOne(
  {_id: "balancer"},
  {$set: {activeWindow: {start: "23:00", stop: "06:00"}}},
  {upsert: true}
)

// Remove shard (drain and remove)
db.adminCommand({removeShard: "shard3ReplSet"})

// Check shard removal progress
db.adminCommand({removeShard: "shard3ReplSet"})

Shard Key Selection

// Good shard keys have:
// 1. High cardinality
// 2. Low frequency
// 3. Non-monotonic change

// Examples of good shard keys:
// User ID (if well distributed)
sh.shardCollection("app.users", {userId: 1})

// Compound key with high cardinality
sh.shardCollection("app.events", {userId: 1, timestamp: 1})

// Hashed key for even distribution
sh.shardCollection("app.logs", {_id: "hashed"})

// Examples of poor shard keys:
// Monotonically increasing (timestamp, ObjectId)
// Low cardinality (status, category)
// Hotspotting keys

// Zone sharding (tag-aware sharding)
sh.addShardTag("shard1ReplSet", "US")
sh.addShardTag("shard2ReplSet", "EU")

sh.addTagRange(
  "myDatabase.users",
  {country: "US", userId: MinKey},
  {country: "US", userId: MaxKey},
  "US"
)

sh.addTagRange(
  "myDatabase.users",
  {country: "EU", userId: MinKey},
  {country: "EU", userId: MaxKey},
  "EU"
)

Security

Authentication

// Create admin user
use admin
db.createUser({
  user: "admin",
  pwd: "securePassword",
  roles: ["userAdminAnyDatabase", "dbAdminAnyDatabase", "readWriteAnyDatabase"]
})

// Create database user
use myDatabase
db.createUser({
  user: "appUser",
  pwd: "appPassword",
  roles: ["readWrite"]
})

// Create user with specific privileges
db.createUser({
  user: "analyst",
  pwd: "analystPassword",
  roles: [
    {role: "read", db: "analytics"},
    {role: "readWrite", db: "reports"}
  ]
})

// Authenticate
db.auth("username", "password")

// Change user password
db.changeUserPassword("username", "newPassword")

// Update user roles
db.updateUser("username", {
  roles: [
    {role: "readWrite", db: "myDatabase"},
    {role: "read", db: "analytics"}
  ]
})

// Drop user
db.dropUser("username")

// List users
db.getUsers()
show users

// Get current user info
db.runCommand({connectionStatus: 1})

Authorization and Roles

// Built-in roles
// Database roles: read, readWrite
// Database admin roles: dbAdmin, dbOwner, userAdmin
// Cluster admin roles: clusterAdmin, clusterManager, clusterMonitor, hostManager
// Backup/restore roles: backup, restore
// All-database roles: readAnyDatabase, readWriteAnyDatabase, userAdminAnyDatabase, dbAdminAnyDatabase
// Superuser roles: root
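
// For example, a read-only monitoring user (as assumed by the monitoring
// scripts later in this sheet) could be created like so:
use admin
db.createUser({
  user: "monitor",
  pwd: "password",
  roles: ["clusterMonitor"]
})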

// Create custom role
db.createRole({
  role: "customRole",
  privileges: [
    {
      resource: {db: "myDatabase", collection: "users"},
      actions: ["find", "insert", "update"]
    },
    {
      resource: {db: "myDatabase", collection: "logs"},
      actions: ["find"]
    }
  ],
  roles: []
})

// Grant role to user
db.grantRolesToUser("username", ["customRole"])

// Revoke role from user
db.revokeRolesFromUser("username", ["customRole"])

// Update role privileges
db.updateRole("customRole", {
  privileges: [
    {
      resource: {db: "myDatabase", collection: ""},
      actions: ["find", "insert", "update", "remove"]
    }
  ]
})

// Drop role
db.dropRole("customRole")

// List roles
db.getRoles()
show roles

// Get role information
db.getRole("roleName", {showPrivileges: true})

SSL/TLS Configuration

# Generate SSL certificates
openssl req -newkey rsa:2048 -new -x509 -days 3653 -nodes -out mongodb-cert.crt -keyout mongodb-cert.key

# Combine certificate and key
cat mongodb-cert.key mongodb-cert.crt > mongodb.pem

# Start MongoDB with TLS (the ssl options were renamed tls in MongoDB 4.2)
mongod --tlsMode requireTLS --tlsCertificateKeyFile /path/to/mongodb.pem

# Connect with TLS
mongosh --tls --tlsCAFile /path/to/ca.pem --host hostname

# MongoDB configuration file (mongod.conf)
net:
  tls:
    mode: requireTLS
    certificateKeyFile: /path/to/mongodb.pem
    CAFile: /path/to/ca.pem

Field-Level Encryption

// Client-side field level encryption setup
const { MongoClient, ClientEncryption } = require('mongodb');

const client = new MongoClient(uri, {
  autoEncryption: {
    keyVaultNamespace: 'encryption.__keyVault',
    kmsProviders: {
      local: {
        key: localMasterKey
      }
    },
    schemaMap: {
      'myDatabase.users': {
        bsonType: 'object',
        properties: {
          ssn: {
            encrypt: {
              keyId: dataKeyId,
              bsonType: 'string',
              algorithm: 'AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic'
            }
          },
          creditCard: {
            encrypt: {
              keyId: dataKeyId,
              bsonType: 'string',
              algorithm: 'AEAD_AES_256_CBC_HMAC_SHA_512-Random'
            }
          }
        }
      }
    }
  }
});

// Create data encryption key (in practice, generate this first, then
// reference the returned keyId in the client's schemaMap above)
const encryption = new ClientEncryption(client, {
  keyVaultNamespace: 'encryption.__keyVault',
  kmsProviders: {
    local: {
      key: localMasterKey
    }
  }
});

const dataKeyId = await encryption.createDataKey('local');

// Insert encrypted document
await db.users.insertOne({
  name: 'John Doe',
  ssn: '123-45-6789',  // Will be encrypted
  creditCard: '4111-1111-1111-1111'  // Will be encrypted
});

Auditing

# Enable auditing (mongod.conf)
auditLog:
  destination: file
  format: JSON
  path: /var/log/mongodb/audit.json
  filter: '{ atype: { $in: ["authenticate", "authCheck"] } }'

# Audit specific operations
auditLog:
  destination: file
  format: JSON
  path: /var/log/mongodb/audit.json
  filter: '{
    $or: [
      { "atype": "authenticate" },
      { "atype": "authCheck" },
      { "atype": "createUser" },
      { "atype": "dropUser" },
      { "atype": "createRole" },
      { "atype": "dropRole" },
      { "atype": "createCollection" },
      { "atype": "dropCollection" }
    ]
  }'

# View audit logs
tail -f /var/log/mongodb/audit.json | jq '.'

# Audit log analysis
grep "authenticate" /var/log/mongodb/audit.json | jq '.users[0].user'
grep "authCheck" /var/log/mongodb/audit.json | jq '.param.command'

Backup and Restore

mongodump and mongorestore

# Backup entire MongoDB instance
mongodump --host localhost:27017 --out /backup/mongodb

# Backup specific database
mongodump --host localhost:27017 --db myDatabase --out /backup/mongodb

# Backup specific collection
mongodump --host localhost:27017 --db myDatabase --collection users --out /backup/mongodb

# Backup with authentication
mongodump --host localhost:27017 --username admin --password --authenticationDatabase admin --out /backup/mongodb

# Backup with query filter
mongodump --host localhost:27017 --db myDatabase --collection users --query '{"status": "active"}' --out /backup/mongodb

# Backup in archive format
mongodump --host localhost:27017 --db myDatabase --archive=/backup/myDatabase.archive

# Backup with compression
mongodump --host localhost:27017 --db myDatabase --gzip --out /backup/mongodb

# Restore entire backup
mongorestore --host localhost:27017 /backup/mongodb

# Restore specific database
mongorestore --host localhost:27017 --db myDatabase /backup/mongodb/myDatabase

# Restore to different database
mongorestore --host localhost:27017 --db newDatabase /backup/mongodb/myDatabase

# Restore specific collection
mongorestore --host localhost:27017 --db myDatabase --collection users /backup/mongodb/myDatabase/users.bson

# Restore with drop existing
mongorestore --host localhost:27017 --drop /backup/mongodb

# Restore from archive
mongorestore --host localhost:27017 --archive=/backup/myDatabase.archive

# Restore with authentication
mongorestore --host localhost:27017 --username admin --password --authenticationDatabase admin /backup/mongodb

Filesystem Snapshots

# Stop MongoDB (for consistent snapshot)
sudo systemctl stop mongod

# Create filesystem snapshot (LVM example)
sudo lvcreate --size 1G --snapshot --name mongodb-snapshot /dev/vg0/mongodb-lv

# Start MongoDB
sudo systemctl start mongod

# Mount snapshot
sudo mkdir /mnt/mongodb-snapshot
sudo mount /dev/vg0/mongodb-snapshot /mnt/mongodb-snapshot

# Copy data from snapshot
sudo cp -r /mnt/mongodb-snapshot/data /backup/mongodb-snapshot-$(date +%Y%m%d)

# Unmount and remove snapshot
sudo umount /mnt/mongodb-snapshot
sudo lvremove /dev/vg0/mongodb-snapshot

# For replica sets (no downtime required)
# Take snapshot from secondary member
# Ensure secondary is caught up before snapshot

Cloud Backup Solutions

# MongoDB Atlas automated backups
# - Continuous backups with point-in-time recovery
# - Scheduled snapshot backups
# - Cross-region backup copies

# AWS backup using EBS snapshots
aws ec2 create-snapshot --volume-id vol-1234567890abcdef0 --description "MongoDB backup $(date)"

# Google Cloud backup using persistent disk snapshots
gcloud compute disks snapshot mongodb-disk --snapshot-names=mongodb-backup-$(date +%Y%m%d)

# Azure backup using managed disk snapshots
az snapshot create --resource-group myResourceGroup --source mongodb-disk --name mongodb-backup-$(date +%Y%m%d)

Automated Backup Scripts

#!/bin/bash
# mongodb_backup.sh

# Configuration
MONGO_HOST="localhost:27017"
MONGO_USER="backup_user"
MONGO_PASS="backup_password"
BACKUP_DIR="/backup/mongodb"
RETENTION_DAYS=7
DATE=$(date +%Y%m%d_%H%M%S)

# Create backup directory
mkdir -p $BACKUP_DIR/$DATE

# Perform backup
mongodump --host $MONGO_HOST \
  --username $MONGO_USER \
  --password $MONGO_PASS \
  --authenticationDatabase admin \
  --gzip \
  --out $BACKUP_DIR/$DATE

# Check backup success
if [ $? -eq 0 ]; then
  echo "Backup completed successfully: $BACKUP_DIR/$DATE"

  # Compress backup
  tar -czf $BACKUP_DIR/mongodb_backup_$DATE.tar.gz -C $BACKUP_DIR $DATE
  rm -rf $BACKUP_DIR/$DATE

  # Upload to cloud storage (optional)
  # aws s3 cp $BACKUP_DIR/mongodb_backup_$DATE.tar.gz s3://my-backup-bucket/

  # Clean old backups
  find $BACKUP_DIR -name "mongodb_backup_*.tar.gz" -mtime +$RETENTION_DAYS -delete

else
  echo "Backup failed!"
  exit 1
fi

# Add to crontab for daily backups
# 0 2 * * * /path/to/mongodb_backup.sh >> /var/log/mongodb_backup.log 2>&1

Performance Optimization

Query Optimization

// Use explain to analyze queries
db.users.find({email: "john@example.com"}).explain("executionStats")

// Create appropriate indexes
db.users.createIndex({email: 1})
db.users.createIndex({status: 1, age: -1})

// Use projection to limit returned fields
db.users.find({status: "active"}, {name: 1, email: 1, _id: 0})

// Use limit for large result sets
db.users.find({status: "active"}).limit(100)

// Optimize aggregation pipelines
// Move $match stages early
db.users.aggregate([
  {$match: {status: "active"}},  // Filter early
  {$lookup: {from: "orders", localField: "_id", foreignField: "userId", as: "orders"}},
  {$match: {"orders.0": {$exists: true}}}  // Filter after lookup
])

// Use $project to reduce document size
db.users.aggregate([
  {$match: {status: "active"}},
  {$project: {name: 1, email: 1, lastLogin: 1}},  // Reduce document size
  {$sort: {lastLogin: -1}},
  {$limit: 100}
])

// Use covered queries (query covered entirely by index)
db.users.createIndex({status: 1, name: 1, email: 1})
db.users.find({status: "active"}, {name: 1, email: 1, _id: 0})

// Regex: an anchored prefix (/^John/) can use an index; unanchored or
// case-insensitive patterns force a full scan
// Indexable: db.users.find({name: /^John/})
// Equivalent range scan: db.users.find({name: {$gte: "John", $lt: "Joho"}})

// Use hint to force index usage
db.users.find({status: "active", age: {$gte: 25}}).hint({status: 1, age: 1})

Index Optimization

// Monitor index usage
db.users.aggregate([{$indexStats: {}}])

// Find unused indexes (zero recorded accesses)
db.users.aggregate([{$indexStats: {}}, {$match: {"accesses.ops": 0}}])

// Inspect per-index storage details
db.runCommand({collStats: "users", indexDetails: true})

// Analyze index effectiveness
var explain = db.users.find({email: "john@example.com"}).explain("executionStats")
print("Documents examined: " + explain.executionStats.totalDocsExamined)
print("Keys examined: " + explain.executionStats.totalKeysExamined)
print("Documents returned: " + explain.executionStats.executionSuccess)

// Index intersection
db.users.find({status: "active", age: {$gte: 25}})
// Can use separate indexes on status and age

// Compound index order matters: equality fields first, then sort
// fields, then range fields (the "ESR" rule)
// For query: {status: "active", name: "John", age: {$gte: 25}}
// Index: {status: 1, name: 1, age: 1}
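
// A runnable sketch of the ESR rule, including a sort:
db.users.createIndex({status: 1, name: 1, age: 1})
db.users.find({status: "active", age: {$gte: 25}}).sort({name: 1})
// equality on status, sort on name, range on age: all served by the index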

// Partial indexes for filtered queries
db.users.createIndex(
  {email: 1},
  {partialFilterExpression: {status: "active"}}
)

// TTL indexes for automatic deletion
db.sessions.createIndex({createdAt: 1}, {expireAfterSeconds: 3600})

// Text indexes for search
db.products.createIndex({name: "text", description: "text"})

// Geospatial indexes
db.locations.createIndex({coordinates: "2dsphere"})

Connection Optimization

// Connection pooling configuration
const client = new MongoClient(uri, {
  maxPoolSize: 10,        // Maximum connections in pool
  minPoolSize: 5,         // Minimum connections in pool
  maxIdleTimeMS: 30000,   // Close connections after 30 seconds of inactivity
  serverSelectionTimeoutMS: 5000,  // How long to try selecting a server
  socketTimeoutMS: 45000  // How long a send or receive on a socket can take
});

// Read preferences for replica sets
db.getMongo().setReadPref("secondaryPreferred")

// Write concerns for durability vs performance
db.users.insertOne(
  {name: "John", email: "john@example.com"},
  {writeConcern: {w: 1, j: false}}  // Fast but less durable
)

db.users.insertOne(
  {name: "Important", email: "important@example.com"},
  {writeConcern: {w: "majority", j: true}}  // Slower but more durable
)

// Batch operations
var bulk = db.users.initializeUnorderedBulkOp()
for (var i = 0; i < 1000; i++) {
  bulk.insert({name: "User" + i, email: "user" + i + "@example.com"})
}
bulk.execute()

Memory and Storage Optimization

// Monitor memory usage
db.serverStatus().mem
db.serverStatus().wiredTiger.cache

// Storage engine configuration (wiredTiger)
// In mongod.conf:
storage:
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      cacheSizeGB: 8
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: snappy
    indexConfig:
      prefixCompression: true

// Compact collections to reclaim space
db.runCommand({compact: "users"})

// Check collection storage stats
db.users.stats()

// Use capped collections for logs
db.createCollection("logs", {capped: true, size: 100000000, max: 1000000})

// GridFS for large files (driver API; see the GridFS section)
var bucket = new GridFSBucket(db, {bucketName: "files"})

// Sharding for horizontal scaling
sh.enableSharding("myDatabase")
sh.shardCollection("myDatabase.users", {userId: "hashed"})

Monitoring

Database Monitoring

// Server status
db.serverStatus()

// Database statistics
db.stats()
db.stats(1024*1024)  // In MB

// Collection statistics
db.users.stats()

// Current operations
db.currentOp()

// Kill operation
db.killOp(operationId)

// Profiler
db.setProfilingLevel(2)  // Profile all operations
db.setProfilingLevel(1, {slowms: 100})  // Profile slow operations

// View profiler data
db.system.profile.find().limit(5).sort({ts: -1}).pretty()

// Profiler statistics
db.system.profile.aggregate([
  {$group: {
    _id: "$command.find",
    count: {$sum: 1},
    avgDuration: {$avg: "$millis"},
    maxDuration: {$max: "$millis"}
  }},
  {$sort: {avgDuration: -1}}
])

// Index usage statistics
db.users.aggregate([{$indexStats: {}}])

// Connection statistics
db.serverStatus().connections

// Replication lag (on secondary)
rs.printSecondaryReplicationInfo()

// Oplog information (stored in the local database)
db.oplog.rs.find().sort({$natural: -1}).limit(1)

Performance Metrics

// Query performance monitoring
var slowQueries = db.system.profile.find({
  millis: {$gt: 1000}
}).sort({ts: -1}).limit(10)

slowQueries.forEach(function(query) {
  print("Duration: " + query.millis + "ms")
  print("Command: " + JSON.stringify(query.command))
  print("---")
})

// Index effectiveness
db.users.find({email: "john@example.com"}).explain("executionStats").executionStats

// Memory usage
var memStats = db.serverStatus().mem
print("Resident: " + memStats.resident + "MB")
print("Virtual: " + memStats.virtual + "MB")
print("Mapped: " + memStats.mapped + "MB")

// WiredTiger cache statistics
var cacheStats = db.serverStatus().wiredTiger.cache
print("Cache size: " + cacheStats["maximum bytes configured"] / 1024 / 1024 + "MB")
print("Cache used: " + cacheStats["bytes currently in the cache"] / 1024 / 1024 + "MB")

// Network statistics
var networkStats = db.serverStatus().network
print("Bytes in: " + networkStats.bytesIn)
print("Bytes out: " + networkStats.bytesOut)
print("Requests: " + networkStats.numRequests)

// Lock statistics
db.serverStatus().locks

// Background flushing (MMAPv1 only; not reported under WiredTiger)
db.serverStatus().backgroundFlushing

Monitoring Scripts

#!/bin/bash
# mongodb_monitor.sh

MONGO_HOST="localhost:27017"
MONGO_USER="monitor"
MONGO_PASS="password"

# Check if MongoDB is running
if ! mongosh --host $MONGO_HOST --username $MONGO_USER --password $MONGO_PASS --eval "db.runCommand('ping')" > /dev/null 2>&1; then
  echo "ERROR: MongoDB is not responding"
  exit 1
fi

# Check replication lag
LAG=$(mongosh --host $MONGO_HOST --username $MONGO_USER --password $MONGO_PASS --quiet --eval "
  try {
    var s = rs.status();
    var self = s.members.find(m => m.self);
    var primary = s.members.find(m => m.state === 1);
    print(Math.abs(self.optimeDate - primary.optimeDate));
  } catch (e) {
    print(0);  // standalone node: no replication to measure
  }
")

if [ $LAG -gt 10000 ]; then
  echo "WARNING: Replication lag is ${LAG}ms"
fi

# Check slow queries
SLOW_QUERIES=$(mongosh --host $MONGO_HOST --username $MONGO_USER --password $MONGO_PASS --quiet --eval "
  db.system.profile.countDocuments({millis: {\$gt: 1000}, ts: {\$gt: new Date(Date.now() - 300000)}})
")

if [ $SLOW_QUERIES -gt 10 ]; then
  echo "WARNING: $SLOW_QUERIES slow queries in last 5 minutes"
fi

# Check connections
CONNECTIONS=$(mongosh --host $MONGO_HOST --username $MONGO_USER --password $MONGO_PASS --quiet --eval "
  db.serverStatus().connections.current
")

if [ $CONNECTIONS -gt 800 ]; then
  echo "WARNING: High connection count: $CONNECTIONS"
fi

echo "MongoDB monitoring completed at $(date)"

Third-Party Monitoring Tools

# MongoDB Compass (GUI)
# Download from https://www.mongodb.com/products/compass

# MongoDB Atlas monitoring (cloud)
# Built-in monitoring for Atlas clusters

# Prometheus + Grafana
# Install mongodb_exporter
wget https://github.com/percona/mongodb_exporter/releases/download/v0.20.0/mongodb_exporter-0.20.0.linux-amd64.tar.gz
tar -xzf mongodb_exporter-0.20.0.linux-amd64.tar.gz
./mongodb_exporter --mongodb.uri="mongodb://monitor:password@localhost:27017"

# Prometheus configuration
scrape_configs:
  - job_name: 'mongodb'
    static_configs:
      - targets: ['localhost:9216']

# Grafana dashboard
# Import dashboard ID: 2583 (MongoDB dashboard)

# Datadog monitoring
# Install Datadog agent with MongoDB integration

# New Relic monitoring
# Install New Relic infrastructure agent with MongoDB plugin

GridFS

GridFS Basics

// GridFS is used for storing files larger than 16MB
// Files are split into chunks (255KB by default)

// mongosh has no built-in GridFS helper; use the mongofiles CLI tool
// (or a driver's GridFSBucket, as in the Node.js example below)

# Store file in GridFS
mongofiles --db myDatabase --prefix myFiles put largefile.pdf

# List files in GridFS
mongofiles --db myDatabase --prefix myFiles list

# Retrieve file
mongofiles --db myDatabase --prefix myFiles get largefile.pdf

# Delete file
mongofiles --db myDatabase --prefix myFiles delete largefile.pdf

// Inspect GridFS metadata from the shell
use myDatabase
db.myFiles.files.find()
db.myFiles.files.findOne({filename: "largefile.pdf"})

GridFS with Node.js

const { MongoClient, GridFSBucket } = require('mongodb');

async function gridfsExample() {
  const client = new MongoClient('mongodb://localhost:27017');
  await client.connect();

  const db = client.db('myDatabase');
  const bucket = new GridFSBucket(db, { bucketName: 'uploads' });

  // Upload file
  const fs = require('fs');
  const uploadStream = bucket.openUploadStream('example.pdf', {
    metadata: { 
      userId: 'user123',
      uploadDate: new Date(),
      contentType: 'application/pdf'
    }
  });

  fs.createReadStream('/path/to/file.pdf').pipe(uploadStream);

  uploadStream.on('finish', () => {
    console.log('File uploaded successfully');
  });

  // Download file
  const downloadStream = bucket.openDownloadStreamByName('example.pdf');
  downloadStream.pipe(fs.createWriteStream('/path/to/downloaded.pdf'));

  // Find files
  const files = await bucket.find({ 'metadata.userId': 'user123' }).toArray();
  console.log(files);

  // Delete file
  await bucket.delete(fileId);

  await client.close();
}

GridFS Management

// Check GridFS collections
db.fs.files.find()
db.fs.chunks.find()

// GridFS statistics
db.fs.files.stats()
db.fs.chunks.stats()

// Find orphaned chunks
db.fs.chunks.aggregate([
  {
    $lookup: {
      from: "fs.files",
      localField: "files_id",
      foreignField: "_id",
      as: "file"
    }
  },
  {
    $match: { file: { $size: 0 } }
  }
])

// Clean up orphaned chunks
var orphanedChunks = db.fs.chunks.aggregate([
  {
    $lookup: {
      from: "fs.files",
      localField: "files_id",
      foreignField: "_id",
      as: "file"
    }
  },
  {
    $match: { file: { $size: 0 } }
  }
]).toArray();

orphanedChunks.forEach(function(chunk) {
  db.fs.chunks.deleteOne({_id: chunk._id});
});

// Index GridFS collections for performance
db.fs.files.createIndex({filename: 1})
db.fs.files.createIndex({"metadata.userId": 1})
db.fs.chunks.createIndex({files_id: 1, n: 1})

// Custom GridFS bucket
var customBucket = new GridFSBucket(db, {
  bucketName: 'images',
  chunkSizeBytes: 1024 * 1024  // 1MB chunks
});

Change Streams

Basic Change Streams

// Watch all changes in database
const changeStream = db.watch();

changeStream.on('change', (change) => {
  console.log('Change detected:', change);
});

// Watch changes in specific collection
const userChangeStream = db.users.watch();

userChangeStream.on('change', (change) => {
  console.log('User change:', change);
});

// Watch specific operations
const insertStream = db.users.watch([
  { $match: { operationType: 'insert' } }
]);

// Watch changes with full document
const fullDocStream = db.users.watch([], {
  fullDocument: 'updateLookup'
});

fullDocStream.on('change', (change) => {
  console.log('Full document:', change.fullDocument);
});

// Resume change stream from specific point
const resumeToken = changeStream.resumeToken;
const resumedStream = db.users.watch([], {
  resumeAfter: resumeToken
});

Advanced Change Streams

// Filter changes by specific fields
const filteredStream = db.users.watch([
  {
    $match: {
      $and: [
        { operationType: 'update' },
        { 'updateDescription.updatedFields.status': { $exists: true } }
      ]
    }
  }
]);

// Watch changes for specific document
const docStream = db.users.watch([
  {
    $match: {
      'fullDocument._id': ObjectId('507f1f77bcf86cd799439011')
    }
  }
]);

// Transform change stream output
const transformedStream = db.users.watch([
  {
    $match: { operationType: { $in: ['insert', 'update'] } }
  },
  {
    $project: {
      _id: 1,
      operationType: 1,
      documentKey: 1,
      'fullDocument.name': 1,
      'fullDocument.email': 1,
      timestamp: '$clusterTime'
    }
  }
]);

// Change stream with start time (startAtOperationTime takes a BSON
// Timestamp, not a Date)
const { Timestamp } = require('mongodb');
const startTime = new Timestamp({ t: Math.floor(Date.now() / 1000), i: 0 });
const timeBasedStream = db.users.watch([], {
  startAtOperationTime: startTime
});

// Error handling for change streams
changeStream.on('error', (error) => {
  console.error('Change stream error:', error);
  // Implement retry logic
});

// Close change stream
changeStream.close();

Change Streams with Applications

// Node.js example with change streams
const { MongoClient } = require('mongodb');

async function watchChanges() {
  const client = new MongoClient('mongodb://localhost:27017');
  await client.connect();

  const db = client.db('myDatabase');
  const collection = db.collection('users');

  // Watch for user status changes
  const changeStream = collection.watch([
    {
      $match: {
        $and: [
          { operationType: 'update' },
          { 'updateDescription.updatedFields.status': { $exists: true } }
        ]
      }
    }
  ], { fullDocument: 'updateLookup' });

  changeStream.on('change', async (change) => {
    const { documentKey, fullDocument, updateDescription } = change;

    console.log(`User ${documentKey._id} status changed to ${fullDocument.status}`);

    // Trigger business logic based on status change
    // (sendWelcomeEmail and scheduleAccountCleanup are placeholders)
    if (fullDocument.status === 'premium') {
      await sendWelcomeEmail(fullDocument.email);
    } else if (fullDocument.status === 'inactive') {
      await scheduleAccountCleanup(documentKey._id);
    }
  });

  // Handle errors and reconnection
  changeStream.on('error', async (error) => {
    console.error('Change stream error:', error);
    await client.close();  // release the broken connection before restarting
    setTimeout(() => {
      watchChanges(); // Restart change stream
    }, 5000);
  });
}

// Real-time notifications
// (assumes db, currentUserId, and the websocket connection exist in scope)
async function setupNotifications() {
  const changeStream = db.notifications.watch([
    {
      $match: {
        operationType: 'insert',
        'fullDocument.userId': currentUserId
      }
    }
  ]);

  changeStream.on('change', (change) => {
    const notification = change.fullDocument;
    // Send to WebSocket client
    websocket.send(JSON.stringify({
      type: 'notification',
      data: notification
    }));
  });
}

Transactions

Single Document Transactions

// MongoDB provides atomicity for single document operations
// These are automatically atomic:

db.users.updateOne(
  { _id: ObjectId("507f1f77bcf86cd799439011") },
  {
    $inc: { balance: -100 },
    $push: { transactions: { type: "debit", amount: 100, date: new Date() } }
  }
)

// findAndModify operations are also atomic
db.users.findOneAndUpdate(
  { _id: ObjectId("507f1f77bcf86cd799439011") },
  { $inc: { balance: -100 } },
  { returnDocument: "after" }
)
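
// Because a single-document update is atomic, a guard in the filter can
// enforce an invariant without a transaction; this sketch rejects overdrafts
db.users.findOneAndUpdate(
  { _id: ObjectId("507f1f77bcf86cd799439011"), balance: { $gte: 100 } },
  { $inc: { balance: -100 } },
  { returnDocument: "after" }
)
// Returns null when the balance check fails, so no debit is applied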

Multi-Document Transactions

// Multi-document transactions (MongoDB 4.0+)
// Requires replica set or sharded cluster

const session = db.getMongo().startSession();

// Collections obtained through session.getDatabase() are automatically
// bound to the session, so no extra session option is needed
const bankDB = session.getDatabase("bank");

try {
  session.startTransaction();

  // Transfer money between accounts
  const fromAccount = bankDB.users.findOne({ accountId: "account1" });

  if (fromAccount.balance < 100) {
    throw new Error("Insufficient funds");
  }

  // Debit from account
  bankDB.users.updateOne(
    { accountId: "account1" },
    { $inc: { balance: -100 } }
  );

  // Credit to account
  bankDB.users.updateOne(
    { accountId: "account2" },
    { $inc: { balance: 100 } }
  );

  // Log transaction
  bankDB.transactions.insertOne({
    from: "account1",
    to: "account2",
    amount: 100,
    timestamp: new Date()
  });

  // Commit transaction
  session.commitTransaction();
  console.log("Transaction completed successfully");

} catch (error) {
  console.error("Transaction failed:", error);
  session.abortTransaction();
} finally {
  session.endSession();
}

Transactions with Node.js

const { MongoClient } = require('mongodb');

async function transferMoney(fromAccountId, toAccountId, amount) {
  const client = new MongoClient('mongodb://localhost:27017');
  await client.connect();

  const session = client.startSession();

  try {
    await session.withTransaction(async () => {
      const db = client.db('bank');
      const accounts = db.collection('accounts');

      // Check source account balance
      const fromAccount = await accounts.findOne(
        { accountId: fromAccountId },
        { session }
      );

      if (!fromAccount || fromAccount.balance < amount) {
        throw new Error('Insufficient funds');
      }

      // Perform transfer
      await accounts.updateOne(
        { accountId: fromAccountId },
        { $inc: { balance: -amount } },
        { session }
      );

      await accounts.updateOne(
        { accountId: toAccountId },
        { $inc: { balance: amount } },
        { session }
      );

      // Log transaction
      await db.collection('transactions').insertOne({
        from: fromAccountId,
        to: toAccountId,
        amount: amount,
        timestamp: new Date(),
        status: 'completed'
      }, { session });

    }, {
      readConcern: { level: 'majority' },
      writeConcern: { w: 'majority' },
      readPreference: 'primary'
    });

    console.log('Transfer completed successfully');

  } catch (error) {
    console.error('Transfer failed:', error);
    throw error;
  } finally {
    await session.endSession();
    await client.close();
  }
}

// Usage
transferMoney('account1', 'account2', 100)
  .then(() => console.log('Done'))
  .catch(console.error);

Transaction Best Practices

// Keep transactions short
// Bad: Long-running transaction
session.startTransaction();
// ... many operations
// ... external API calls
// ... complex calculations
session.commitTransaction();

// Good: Short transaction
session.startTransaction();
// Only essential database operations
session.commitTransaction();

// Use appropriate read/write concerns
session.startTransaction({
  readConcern: { level: 'majority' },
  writeConcern: { w: 'majority', j: true }
});

// Handle transaction conflicts by retrying transient errors
async function retryTransaction(operation, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      await operation();
      return;
    } catch (error) {
      // Non-driver errors may not implement hasErrorLabel, so guard the call
      const isTransient = typeof error.hasErrorLabel === 'function' &&
        error.hasErrorLabel('TransientTransactionError');
      if (isTransient && i < maxRetries - 1) {
        console.log('Retrying transaction...');
        continue;
      }
      throw error;
    }
  }
}
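
// Usage sketch (inside an async function), reusing transferMoney from above.
// Note that withTransaction() already retries TransientTransactionError
// internally, so this wrapper matters most for manually managed transactions
await retryTransaction(() => transferMoney('account1', 'account2', 100));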

// Avoid hotspots in sharded clusters
// Use well-distributed shard keys for transactional collections

// Monitor transaction performance
db.serverStatus().transactions

Best Practices

Schema Design Best Practices

// 1. Embed vs Reference decision tree
// Embed when:
// - Data is accessed together
// - Data doesn't change frequently
// - Document size stays reasonable (<16MB)
// - One-to-few relationships

// Reference when:
// - Data is accessed independently
// - Data changes frequently
// - Document would become too large
// - Many-to-many relationships
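
// A sketch of each pattern (field values are illustrative):

// Embedded: one-to-few, always read together with the user
{
  _id: ObjectId("..."),
  name: "Jane Doe",
  addresses: [
    { street: "1 Main St", city: "Springfield" },
    { street: "2 Oak Ave", city: "Shelbyville" }
  ]
}

// Referenced: orders grow without bound, so they point back to the user
{
  _id: ObjectId("..."),
  userId: ObjectId("..."),  // reference into the users collection
  total: 79.98
}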

// 2. Use appropriate data types
// Good
{
  _id: ObjectId("..."),
  age: 25,                    // Number, not string
  isActive: true,             // Boolean, not string
  createdAt: new Date(),      // Date, not string
  tags: ["javascript", "mongodb"]  // Array, not comma-separated string
}

// Bad
{
  _id: "507f1f77bcf86cd799439011",  // String instead of ObjectId
  age: "25",                       // String instead of number
  isActive: "true",                // String instead of boolean
  createdAt: "2023-01-01",         // String instead of date
  tags: "javascript,mongodb"       // String instead of array
}

// 3. Design for your queries
// If you frequently query by status and date:
db.orders.createIndex({status: 1, orderDate: -1})

// Structure documents to support common access patterns
{
  _id: ObjectId("..."),
  userId: ObjectId("..."),
  status: "active",
  orderDate: new Date(),
  items: [
    {productId: ObjectId("..."), quantity: 2, price: 29.99},
    {productId: ObjectId("..."), quantity: 1, price: 49.99}
  ],
  // Denormalize frequently accessed data
  customerInfo: {
    name: "John Doe",
    email: "john@example.com"
  }
}
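
// The shape above serves the common listing query from a single document,
// and the {status, orderDate} index covers the filter and sort:
db.orders.find({ status: "active" }).sort({ orderDate: -1 }).limit(20)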

Performance Best Practices

// 1. Create indexes for your queries
// Analyze your query patterns
db.users.find({status: "active", age: {$gte: 25}}).sort({lastLogin: -1})

// Create a compound index following the Equality, Sort, Range rule;
// keeping the range field (age) last lets the index satisfy the sort
db.users.createIndex({status: 1, lastLogin: -1, age: 1})

// 2. Use projection to limit returned data
// Bad: Return entire document
db.users.find({status: "active"})

// Good: Return only needed fields
db.users.find({status: "active"}, {name: 1, email: 1, _id: 0})

// 3. Use aggregation pipeline efficiently
// Move $match stages early
db.orders.aggregate([
  {$match: {status: "completed"}},        // Filter early
  {$lookup: {from: "products", ...}},     // Then join
  {$match: {totalAmount: {$gte: 100}}}    // Filter again if needed
])

// 4. Batch operations
// Bad: Individual inserts
for (let i = 0; i < 1000; i++) {
  db.users.insertOne({name: "User" + i});
}

// Good: Batch insert
var docs = [];
for (let i = 0; i < 1000; i++) {
  docs.push({name: "User" + i});
}
db.users.insertMany(docs);

// 5. Use appropriate read preferences
// For analytics queries on replica set
db.getMongo().setReadPref("secondary")

Security Best Practices

// 1. Enable authentication
// Start MongoDB with --auth
mongod --auth --dbpath /data/db

// 2. Create users with minimal privileges
// Don't use admin user for applications
db.createUser({
  user: "appUser",
  pwd: passwordPrompt(),  // prompt instead of hard-coding the password
  roles: [
    {role: "readWrite", db: "myApp"},
    {role: "read", db: "analytics"}
  ]
})

// 3. Use TLS for connections (the older ssl* flags are deprecated aliases)
mongod --tlsMode requireTLS --tlsCertificateKeyFile /path/to/cert.pem

// 4. Validate input data
db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "email"],
      properties: {
        email: {
          bsonType: "string",
          pattern: "^.+@.+$"
        }
      }
    }
  }
})

// 5. Use field-level encryption for sensitive data
// Encrypt SSN, credit card numbers, etc.

// 6. Enable auditing (MongoDB Enterprise and Atlas only)
// In mongod.conf:
auditLog:
  destination: file
  format: JSON
  path: /var/log/mongodb/audit.json

// 7. Regular security updates
// Keep MongoDB version updated
// Monitor security advisories

Operational Best Practices

// 1. Monitor your MongoDB deployment
// Set up monitoring for:
// - Query performance
// - Replication lag
// - Disk usage
// - Connection count
// - Index usage
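
// A few mongosh commands that cover these:
db.serverStatus().connections                  // current connection count
db.serverStatus().opcounters                   // operation throughput
rs.printSecondaryReplicationInfo()             // replication lag per secondary
db.stats()                                     // data and index size for the database
db.users.aggregate([{ $indexStats: {} }])      // per-index usage counters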

// 2. Regular backups
// Automated daily backups
// Test restore procedures
// Store backups in multiple locations
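
// Example nightly dump and restore drill (paths are illustrative)
mongodump --uri="mongodb://localhost:27017" --out=/backups/$(date +%F)
mongorestore --uri="mongodb://localhost:27017" --dryRun /backups/2024-01-01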

// 3. Capacity planning
// Monitor growth trends
// Plan for peak loads
// Consider sharding before you need it

// 4. Use replica sets for production
// Minimum 3 members
// Use appropriate write concerns
// Monitor replication lag
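
// Example: with 3 members, w: "majority" waits for 2 acknowledgements
db.orders.insertOne(
  { item: "widget", qty: 1 },
  { writeConcern: { w: "majority", wtimeout: 5000 } }
)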

// 5. Optimize for your workload
// Read-heavy: Use read replicas
// Write-heavy: Consider sharding
// Mixed: Optimize indexes and queries

// 6. Document your schema and operations
// Maintain schema documentation
// Document operational procedures
// Keep runbooks updated

// 7. Test disaster recovery procedures
// Regular failover tests
// Backup restoration tests
// Network partition scenarios

Development Best Practices

// 1. Use connection pooling
const client = new MongoClient(uri, {
  maxPoolSize: 10,
  minPoolSize: 5
});

// 2. Handle errors properly
try {
  await db.users.insertOne(user);
} catch (error) {
  if (error.code === 11000) {
    // Duplicate key error
    throw new Error('User already exists');
  }
  throw error;
}

// 3. Use transactions when needed
// For multi-document operations that must be atomic
await session.withTransaction(async () => {
  await accounts.updateOne({_id: fromId}, {$inc: {balance: -amount}}, {session});
  await accounts.updateOne({_id: toId}, {$inc: {balance: amount}}, {session});
});

// 4. Validate data at application level
function validateUser(user) {
  if (!user.email || !user.email.includes('@')) {
    throw new Error('Invalid email');
  }
  if (!user.name || user.name.length < 2) {
    throw new Error('Name must be at least 2 characters');
  }
}

// 5. Use environment-specific configurations
const config = {
  development: {
    uri: 'mongodb://localhost:27017/myapp_dev',
    options: {maxPoolSize: 5}
  },
  production: {
    uri: process.env.MONGODB_URI,
    options: {
      maxPoolSize: 20,
      ssl: true,
      replicaSet: 'production-rs'
    }
  }
};
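
// Pick the active configuration from NODE_ENV (defaults to development)
const env = process.env.NODE_ENV || 'development';
const appClient = new MongoClient(config[env].uri, config[env].options);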

// 6. Implement proper logging
const winston = require('winston');
const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.File({filename: 'mongodb.log'})
  ]
});

// Log slow queries
db.setProfilingLevel(1, {slowms: 100});
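
// Inspect the slowest recent operations captured by the profiler
db.system.profile.find().sort({ ts: -1 }).limit(5)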

Summary

MongoDB is a powerful, flexible NoSQL document database that excels at handling diverse data types and scaling horizontally. This comprehensive cheatsheet covers essential MongoDB operations from basic CRUD to advanced topics like sharding, transactions, and performance optimization.

Key Strengths:
- Flexible Schema: JSON-like documents with dynamic schemas
- Horizontal Scaling: Built-in sharding for distributed deployments
- Rich Query Language: Powerful aggregation framework and indexing
- High Availability: Replica sets with automatic failover
- Developer Friendly: Intuitive document model and extensive driver support

Best Use Cases:
- Content management systems and catalogs
- Real-time analytics and IoT applications
- Mobile and social applications
- Product catalogs and inventory management
- Applications requiring rapid development and iteration

Important Considerations:
- Proper schema design is crucial for performance
- Index strategy should align with query patterns
- Regular monitoring and maintenance are essential
- Backup and disaster recovery procedures must be tested
- Security configuration requires careful attention

By following the practices and techniques outlined in this cheatsheet, you can design, implement, and maintain MongoDB deployments that are secure, performant, and scalable.