Neo4j Cheatsheet

Neo4j - Graph Database

Neo4j is a highly scalable native graph database that leverages data relationships as first-class entities, helping enterprises build intelligent applications to meet today's evolving data challenges.

Table of Contents

Installation
Basic Operations
Cypher Query Language
Nodes and Relationships
Data Import/Export
Indexes and Constraints
Performance Optimization
Administration
Security
Clustering
Monitoring
Best Practices

Installation

Ubuntu/Debian Installation

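# Add the Neo4j signing key and repository (apt-key is deprecated on current Debian/Ubuntu)
sudo mkdir -p /etc/apt/keyrings
wget -O - https://debian.neo4j.com/neotechnology.gpg.key | sudo gpg --dearmor -o /etc/apt/keyrings/neotechnology.gpg
echo 'deb [signed-by=/etc/apt/keyrings/neotechnology.gpg] https://debian.neo4j.com stable latest' | sudo tee /etc/apt/sources.list.d/neo4j.list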

# Update package index
sudo apt update

# Install Neo4j Community Edition
sudo apt install neo4j

# Install Neo4j Enterprise Edition (requires license)
sudo apt install neo4j-enterprise

# Start Neo4j service
sudo systemctl start neo4j
sudo systemctl enable neo4j

# Check status
sudo systemctl status neo4j

# Access Neo4j Browser
# http://localhost:7474
# Default credentials: neo4j/neo4j (change on first login)

CentOS/RHEL Installation

# Add Neo4j repository
sudo rpm --import https://debian.neo4j.com/neotechnology.gpg.key

sudo tee /etc/yum.repos.d/neo4j.repo << EOF
[neo4j]
name=Neo4j RPM Repository
baseurl=https://yum.neo4j.com/stable
enabled=1
gpgcheck=1
EOF

# Install Neo4j
sudo yum install neo4j

# Start Neo4j service
sudo systemctl start neo4j
sudo systemctl enable neo4j

# Check status
sudo systemctl status neo4j

Docker Installation

# Pull Neo4j image
docker pull neo4j:latest

# Run Neo4j container
docker run \
    --name neo4j \
    -p 7474:7474 -p 7687:7687 \
    -d \
    -v neo4j-data:/data \
    -v neo4j-logs:/logs \
    -v neo4j-import:/var/lib/neo4j/import \
    -v neo4j-plugins:/plugins \
    --env NEO4J_AUTH=neo4j/password \
    neo4j:latest

# Run with custom configuration
# (Neo4j 5 setting names; 4.x images use NEO4J_dbms_memory_* instead)
docker run \
    --name neo4j \
    -p 7474:7474 -p 7687:7687 \
    -d \
    -v neo4j-data:/data \
    -v neo4j-logs:/logs \
    --env NEO4J_AUTH=neo4j/password \
    --env NEO4J_server_memory_heap_initial__size=1G \
    --env NEO4J_server_memory_heap_max__size=1G \
    --env NEO4J_server_memory_pagecache_size=1G \
    neo4j:latest

# Docker Compose setup
cat > docker-compose.yml << EOF
version: '3.8'
services:
  neo4j:
    image: neo4j:latest
    container_name: neo4j
    ports:
      - "7474:7474"
      - "7687:7687"
    environment:
      - NEO4J_AUTH=neo4j/password
      # Neo4j 5 setting names; 4.x images use NEO4J_dbms_memory_* instead
      - NEO4J_server_memory_heap_initial__size=1G
      - NEO4J_server_memory_heap_max__size=1G
      - NEO4J_server_memory_pagecache_size=1G
    volumes:
      - neo4j-data:/data
      - neo4j-logs:/logs
      - neo4j-import:/var/lib/neo4j/import
      - neo4j-plugins:/plugins
    restart: unless-stopped

volumes:
  neo4j-data:
  neo4j-logs:
  neo4j-import:
  neo4j-plugins:
EOF

# Compose v2 plugin (with the legacy v1 binary, run: docker-compose up -d)
docker compose up -d

Manual Installation

# Download Neo4j
wget 'https://neo4j.com/artifact.php?name=neo4j-community-5.13.0-unix.tar.gz' -O neo4j-community-5.13.0-unix.tar.gz

# Extract
tar -xzf neo4j-community-5.13.0-unix.tar.gz
sudo mv neo4j-community-5.13.0 /opt/neo4j

# Set environment variables
echo 'export NEO4J_HOME=/opt/neo4j' >> ~/.bashrc
echo 'export PATH=$PATH:$NEO4J_HOME/bin' >> ~/.bashrc
source ~/.bashrc

# Set initial password
neo4j-admin dbms set-initial-password password

# Start Neo4j
neo4j start

# Check status
neo4j status

# Stop Neo4j
neo4j stop

# Access Neo4j Browser
# http://localhost:7474

Basic Operations

Neo4j Browser

# Access Neo4j Browser
http://localhost:7474

# Connect with credentials
# Username: neo4j
# Password: (set during installation)

# Basic browser commands
:help          // Show help
:clear         // Clear the result frame
:history       // Show command history
:config        // Show configuration
:schema        // Show database schema
:sysinfo       // Show system information
:queries       // Show running queries
:server status // Show server status

Cypher Shell

# Connect to Neo4j using Cypher Shell
cypher-shell

# Connect with specific credentials
cypher-shell -u neo4j -p password

# Connect to specific database
cypher-shell -d mydb

# Execute Cypher from file
cypher-shell -f script.cypher

# Execute single command
cypher-shell "MATCH (n) RETURN count(n);"

# Connect to remote Neo4j
cypher-shell -a bolt://remote-host:7687 -u neo4j -p password

Configuration

# Main configuration file
/etc/neo4j/neo4j.conf

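# Common configuration settings (Neo4j 4.x names; Neo4j 5 renamed many of them,
# e.g. server.memory.heap.initial_size, server.default_listen_address, db.transaction.timeout)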
# Memory settings
dbms.memory.heap.initial_size=1G
dbms.memory.heap.max_size=1G
dbms.memory.pagecache.size=1G

# Network settings
dbms.default_listen_address=0.0.0.0
dbms.connector.bolt.listen_address=:7687
dbms.connector.http.listen_address=:7474
dbms.connector.https.listen_address=:7473

# Security settings
dbms.security.auth_enabled=true
dbms.security.allow_csv_import_from_file_urls=true

# Logging
dbms.logs.query.enabled=true
dbms.logs.query.threshold=0

# Transaction settings
dbms.transaction.timeout=60s
dbms.transaction.concurrent.maximum=1000

# Apply configuration changes
sudo systemctl restart neo4j

Cypher Query Language

Basic Syntax

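-- Comments in Cypher
// Single line comment
/* Multi-line
   comment */
// Note: "--" does not start a comment in Cypher. The "--" lines in this
// cheatsheet are labels only; drop them before running an example.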

-- Case sensitivity
// Keywords are case-insensitive: MATCH, match, Match
// Labels, property names, and variables are case-sensitive

-- Basic query structure
MATCH (pattern)
WHERE condition
RETURN result
ORDER BY property
LIMIT number;

-- Variables and patterns
MATCH (n)           // Variable 'n' represents any node
MATCH (p:Person)    // Variable 'p' represents nodes with label 'Person'
MATCH ()-[r]->()    // Variable 'r' represents any relationship
MATCH (a)-[:KNOWS]->(b)  // Specific relationship type

Data Types

-- Primitive types
RETURN 42 AS integer;
RETURN 3.14 AS float;
RETURN "Hello" AS string;
RETURN true AS boolean;
RETURN null AS nullValue;

-- Temporal types
RETURN date('2023-12-01') AS dateValue;
RETURN time('14:30:00') AS timeValue;
RETURN datetime('2023-12-01T14:30:00') AS datetimeValue;
RETURN duration('P1Y2M3DT4H5M6S') AS durationValue;

-- Composite types
RETURN [1, 2, 3] AS list;
RETURN {name: 'John', age: 30} AS map;

-- Spatial types
RETURN point({x: 3, y: 4}) AS cartesianPoint;
RETURN point({latitude: 40.7128, longitude: -74.0060}) AS geographicPoint;

Pattern Matching

-- Node patterns
MATCH (n)                    // Any node
MATCH (p:Person)             // Node with label Person
MATCH (p:Person:Employee)    // Node with multiple labels
MATCH (p {name: 'John'})     // Node with property
MATCH (p:Person {name: 'John', age: 30})  // Node with label and properties

-- Relationship patterns
MATCH (a)-[r]->(b)           // Any relationship from a to b
MATCH (a)-[:KNOWS]->(b)      // Specific relationship type
MATCH (a)-[r:KNOWS]->(b)     // Relationship with variable
MATCH (a)-[:KNOWS|LIKES]->(b) // Multiple relationship types
MATCH (a)-[*1..3]->(b)       // Variable length path (1 to 3 hops)
MATCH (a)-[*]->(b)           // Variable length path (any length)

-- Relationship direction
MATCH (a)-[:KNOWS]-(b)       // Undirected pattern: matches either direction
MATCH (a)<-[:KNOWS]-(b)      // Relationship from b to a

-- Complex patterns
MATCH (a:Person)-[:KNOWS]->(b:Person)-[:WORKS_FOR]->(c:Company)
WHERE a.age > 25 AND c.name = 'TechCorp'
RETURN a.name, b.name, c.name;

Nodes and Relationships

Creating Nodes

-- Create single node
CREATE (n);

-- Create node with label
CREATE (p:Person);

-- Create node with properties
CREATE (p:Person {name: 'John Doe', age: 30, email: 'john@example.com'});

-- Create multiple nodes
CREATE (p1:Person {name: 'Alice'}), (p2:Person {name: 'Bob'});

-- Create node with multiple labels
CREATE (p:Person:Employee {name: 'John', department: 'IT'});

-- Create and return node
CREATE (p:Person {name: 'Jane'})
RETURN p;

-- Create node with computed properties
CREATE (p:Person {
  name: 'John',
  created: datetime(),
  id: randomUUID()
});

Creating Relationships

-- Create relationship between existing nodes
MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})
CREATE (a)-[:KNOWS]->(b);

-- Create relationship with properties
MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})
CREATE (a)-[:KNOWS {since: date('2020-01-01'), strength: 0.8}]->(b);

-- Create nodes and relationship in one statement
CREATE (a:Person {name: 'Alice'})-[:KNOWS]->(b:Person {name: 'Bob'});

-- Create bidirectional relationship
MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})
CREATE (a)-[:KNOWS]->(b), (b)-[:KNOWS]->(a);

-- Create multiple relationships
MATCH (p:Person {name: 'John'}), (c1:Company {name: 'TechCorp'}), (c2:Company {name: 'DataInc'})
CREATE (p)-[:WORKS_FOR {start: date('2020-01-01')}]->(c1),
       (p)-[:PREVIOUSLY_WORKED_FOR {start: date('2018-01-01'), end: date('2019-12-31')}]->(c2);

Reading Data

-- Return all nodes
MATCH (n) RETURN n;

-- Return nodes with specific label
MATCH (p:Person) RETURN p;

-- Return specific properties
MATCH (p:Person) RETURN p.name, p.age;

-- Return with aliases
MATCH (p:Person) RETURN p.name AS name, p.age AS age;

-- Return relationships
MATCH (a:Person)-[r:KNOWS]->(b:Person)
RETURN a.name, type(r), b.name;

-- Return paths
MATCH path = (a:Person)-[:KNOWS]->(b:Person)
RETURN path;

-- Count nodes
MATCH (p:Person) RETURN count(p);

-- Distinct values
MATCH (p:Person) RETURN DISTINCT p.department;

-- Limit results
MATCH (p:Person) RETURN p LIMIT 10;

-- Order results
MATCH (p:Person) RETURN p ORDER BY p.age DESC;

-- Skip results (pagination)
MATCH (p:Person) RETURN p ORDER BY p.name SKIP 10 LIMIT 10;
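
The SKIP/LIMIT pagination query above is usually driven from application code. The following is a minimal Python sketch under stated assumptions: the helper name page_to_skip_limit is hypothetical, pages are 1-based, and the query would be sent with $skip and $limit parameters by whatever driver is in use. No database connection is made here.

```python
def page_to_skip_limit(page: int, page_size: int) -> dict:
    """Translate a 1-based page number into SKIP/LIMIT query parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {"skip": (page - 1) * page_size, "limit": page_size}


# The query text stays constant; only the parameters change per page.
PAGE_QUERY = "MATCH (p:Person) RETURN p ORDER BY p.name SKIP $skip LIMIT $limit"

# Page 2 with 10 rows per page corresponds to SKIP 10 LIMIT 10.
params = page_to_skip_limit(page=2, page_size=10)
print(PAGE_QUERY)
print(params)
```

Keeping ORDER BY in the query matters: without a stable ordering, consecutive pages are not guaranteed to be disjoint.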

Updating Data

-- Update node properties
MATCH (p:Person {name: 'John'})
SET p.age = 31, p.updated = datetime();

-- Add new property
MATCH (p:Person {name: 'John'})
SET p.email = 'john.doe@example.com';

-- Update using map
MATCH (p:Person {name: 'John'})
SET p += {age: 31, city: 'New York'};

-- Replace all properties
MATCH (p:Person {name: 'John'})
SET p = {name: 'John Doe', age: 31, email: 'john@example.com'};

-- Add label
MATCH (p:Person {name: 'John'})
SET p:Employee;

-- Remove property
MATCH (p:Person {name: 'John'})
REMOVE p.email;

-- Remove label
MATCH (p:Person {name: 'John'})
REMOVE p:Employee;

-- Update relationship properties
MATCH (a:Person)-[r:KNOWS]->(b:Person)
WHERE a.name = 'Alice' AND b.name = 'Bob'
SET r.strength = 0.9, r.updated = datetime();

Deleting Data

-- Delete node (must delete relationships first)
MATCH (p:Person {name: 'John'})
DELETE p;

-- Delete node and all its relationships
MATCH (p:Person {name: 'John'})
DETACH DELETE p;

-- Delete relationship
MATCH (a:Person)-[r:KNOWS]->(b:Person)
WHERE a.name = 'Alice' AND b.name = 'Bob'
DELETE r;

-- Delete multiple nodes
MATCH (p:Person)
WHERE p.age < 18
DETACH DELETE p;

-- Delete all data (use with caution!)
MATCH (n)
DETACH DELETE n;

-- Conditional delete
MATCH (p:Person)
WHERE p.lastLogin < date('2022-01-01')
DETACH DELETE p;

Data Import/Export

CSV Import

-- Load CSV with headers
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
CREATE (p:Person {
  name: row.name,
  age: toInteger(row.age),
  email: row.email
});

-- Load CSV without headers
LOAD CSV FROM 'file:///people.csv' AS row
CREATE (p:Person {
  name: row[0],
  age: toInteger(row[1]),
  email: row[2]
});

-- Load CSV from URL
LOAD CSV WITH HEADERS FROM 'https://example.com/data.csv' AS row
CREATE (p:Person {name: row.name, age: toInteger(row.age)});

-- Load CSV with field terminator
LOAD CSV WITH HEADERS FROM 'file:///data.tsv' AS row
FIELDTERMINATOR '\t'
CREATE (p:Person {name: row.name});

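-- Batch processing for large files (Neo4j 4.x only: USING PERIODIC COMMIT was
-- removed in 5.0; use CALL { ... } IN TRANSACTIONS there)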
:auto USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///large_file.csv' AS row
CREATE (p:Person {name: row.name, age: toInteger(row.age)});

-- Load relationships from CSV
LOAD CSV WITH HEADERS FROM 'file:///relationships.csv' AS row
MATCH (a:Person {id: row.from_id}), (b:Person {id: row.to_id})
CREATE (a)-[:KNOWS {since: date(row.since)}]->(b);

-- Error handling during import
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
WITH row WHERE row.name IS NOT NULL
CREATE (p:Person {
  name: row.name,
  age: CASE WHEN row.age IS NOT NULL THEN toInteger(row.age) ELSE null END
});
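
The same filtering can be done before the file ever reaches LOAD CSV. The following is a rough Python sketch of that idea, not part of Neo4j itself: the function name clean_rows and the sample data are made up for illustration, and the column names name and age are taken from the example above. It is slightly stricter than toInteger(), since any age that is not a plain integer becomes None.

```python
import csv
import io


def clean_rows(csv_text: str) -> list:
    """Keep rows that have a name; convert age to int, or None when absent or invalid."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = (row.get("name") or "").strip()
        if not name:
            continue  # mirrors: WITH row WHERE row.name IS NOT NULL
        age_text = (row.get("age") or "").strip()
        try:
            age = int(age_text) if age_text else None
        except ValueError:
            age = None  # unparsable ages are dropped rather than failing the import
        cleaned.append({"name": name, "age": age})
    return cleaned


# Hypothetical sample: one good row, one without a name, one without an age,
# and one with a non-numeric age.
sample = "name,age\nAlice,30\n,41\nBob,\nCara,n/a\n"
print(clean_rows(sample))
```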

APOC Import/Export

-- Install the APOC plugin first
-- Download from: https://github.com/neo4j-contrib/neo4j-apoc-procedures
-- Exporting to files also requires apoc.export.file.enabled=true in the APOC configuration

-- Export to CSV
CALL apoc.export.csv.all("all-data.csv", {});

-- Export specific nodes to CSV
CALL apoc.export.csv.query(
  "MATCH (p:Person) RETURN p.name, p.age, p.email",
  "people.csv",
  {}
);

-- Export to JSON
CALL apoc.export.json.all("all-data.json", {});

-- Import from JSON
CALL apoc.load.json("file:///data.json") YIELD value
CREATE (p:Person {
  name: value.name,
  age: value.age
});

-- Import from XML
CALL apoc.load.xml("file:///data.xml") YIELD value
UNWIND value._children AS person
CREATE (p:Person {
  name: person.name._text,
  age: toInteger(person.age._text)
});

-- Import from database
CALL apoc.load.jdbc(
  "jdbc:mysql://localhost:3306/mydb",
  "SELECT name, age FROM people"
) YIELD row
CREATE (p:Person {name: row.name, age: row.age});

Bulk Import Tool

# Neo4j Admin Import Tool (for initial data load)
# Prepare CSV files: nodes.csv, relationships.csv

# Import nodes and relationships
neo4j-admin database import full \
  --nodes=Person=people.csv \
  --nodes=Company=companies.csv \
  --relationships=WORKS_FOR=works_for.csv \
  --relationships=KNOWS=knows.csv \
  neo4j

# Import with custom delimiters
neo4j-admin database import full \
  --delimiter="|" \
  --array-delimiter=";" \
  --nodes=Person=people.csv \
  --relationships=KNOWS=relationships.csv \
  neo4j

# Example CSV format for nodes (people.csv)
# personId:ID,name,age:int,:LABEL
# 1,Alice,30,Person
# 2,Bob,25,Person

# Example CSV format for relationships (knows.csv)
# :START_ID,:END_ID,:TYPE,since
# 1,2,KNOWS,2020-01-01
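
Files in this header format are often generated by a script. The following is a small Python sketch that reproduces the two example files above in memory. The helper name write_csv and the in-memory people and knows lists are assumptions made for illustration; a real export would read from the source system and write to disk.

```python
import csv
import io


def write_csv(header, rows) -> str:
    """Render a header plus rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative source data (same values as the example rows above).
people = [(1, "Alice", 30), (2, "Bob", 25)]
knows = [(1, 2, "2020-01-01")]

# Node file: ID column, typed property, and a :LABEL column.
nodes_csv = write_csv(
    ["personId:ID", "name", "age:int", ":LABEL"],
    [(pid, name, age, "Person") for pid, name, age in people],
)

# Relationship file: start/end IDs refer to the personId:ID values.
rels_csv = write_csv(
    [":START_ID", ":END_ID", ":TYPE", "since"],
    [(start, end, "KNOWS", since) for start, end, since in knows],
)

print(nodes_csv)
print(rels_csv)
```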

Indexes and Constraints

Indexes

-- Create index on single property
CREATE INDEX person_name_index FOR (p:Person) ON (p.name);

-- Create index on multiple properties (composite index)
CREATE INDEX person_name_age_index FOR (p:Person) ON (p.name, p.age);

-- Create full-text index for full-text search
CREATE FULLTEXT INDEX person_fulltext_index FOR (p:Person) ON EACH [p.name, p.description];

-- Create index on relationship property
CREATE INDEX knows_since_index FOR ()-[r:KNOWS]-() ON (r.since);

-- Show all indexes
SHOW INDEXES;

-- Show specific index
SHOW INDEX YIELD name, labelsOrTypes, properties WHERE name = 'person_name_index';

-- Drop index
DROP INDEX person_name_index;

-- Create index if not exists
CREATE INDEX person_email_index IF NOT EXISTS FOR (p:Person) ON (p.email);

Constraints

-- Unique constraint
CREATE CONSTRAINT person_email_unique FOR (p:Person) REQUIRE p.email IS UNIQUE;

-- Node key constraint (Enterprise Edition; the properties must exist and be unique together)
CREATE CONSTRAINT person_name_age_key FOR (p:Person) REQUIRE (p.name, p.age) IS NODE KEY;

-- Property existence constraint (Enterprise Edition)
CREATE CONSTRAINT person_name_exists FOR (p:Person) REQUIRE p.name IS NOT NULL;

-- Relationship property existence constraint
CREATE CONSTRAINT knows_since_exists FOR ()-[r:KNOWS]-() REQUIRE r.since IS NOT NULL;

-- Show all constraints
SHOW CONSTRAINTS;

-- Drop constraint
DROP CONSTRAINT person_email_unique;

-- Create constraint if not exists
CREATE CONSTRAINT person_id_unique IF NOT EXISTS FOR (p:Person) REQUIRE p.id IS UNIQUE;

Full-text Search

-- Create full-text index
CREATE FULLTEXT INDEX person_search FOR (p:Person) ON EACH [p.name, p.description, p.bio];

-- Search using full-text index
CALL db.index.fulltext.queryNodes("person_search", "john AND developer") YIELD node, score
RETURN node.name, node.description, score;

-- Search with fuzzy matching
CALL db.index.fulltext.queryNodes("person_search", "john~") YIELD node, score
RETURN node.name, score;

-- Search with wildcards
CALL db.index.fulltext.queryNodes("person_search", "john*") YIELD node, score
RETURN node.name, score;

-- Search with phrase
CALL db.index.fulltext.queryNodes("person_search", '"software developer"') YIELD node, score
RETURN node.name, score;

-- Search relationships
CREATE FULLTEXT INDEX review_search FOR ()-[r:REVIEWED]-() ON EACH [r.title, r.content];

CALL db.index.fulltext.queryRelationships("review_search", "excellent") YIELD relationship, score
RETURN relationship, score;

Performance Optimization

Query Optimization

-- Use EXPLAIN to see query plan
EXPLAIN
MATCH (p:Person {name: 'John'})
RETURN p;

-- Use PROFILE to see actual execution statistics
PROFILE
MATCH (p:Person)-[:KNOWS]->(friend:Person)
WHERE p.age > 30
RETURN p.name, collect(friend.name);

-- Use indexes for WHERE clauses
// Good: Uses index
MATCH (p:Person)
WHERE p.email = 'john@example.com'
RETURN p;

// Slower: a range index does not serve CONTAINS; without a TEXT index
// on p.email this checks every :Person node
MATCH (p:Person)
WHERE p.email CONTAINS '@example.com'
RETURN p;

-- Use LIMIT to reduce result set
MATCH (p:Person)
RETURN p
ORDER BY p.created DESC
LIMIT 10;

-- Use WITH for intermediate processing
MATCH (p:Person)-[:KNOWS]->(friend:Person)
WITH p, count(friend) AS friendCount
WHERE friendCount > 5
RETURN p.name, friendCount;

-- Avoid Cartesian products
// Bad: Cartesian product
MATCH (p:Person), (c:Company)
WHERE p.company = c.name
RETURN p, c;

// Good: Use relationships
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
RETURN p, c;

Memory Management

-- Use PERIODIC COMMIT for large operations (Neo4j 4.x only; removed in 5.0)
:auto USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///large_file.csv' AS row
CREATE (p:Person {name: row.name});

-- Use CALL IN TRANSACTIONS for batching (Neo4j 4.4+)
:auto
LOAD CSV WITH HEADERS FROM 'file:///large_file.csv' AS row
CALL {
  WITH row
  CREATE (p:Person {name: row.name, age: toInteger(row.age)})
} IN TRANSACTIONS OF 1000 ROWS;

-- Limit memory usage with LIMIT
MATCH (p:Person)
WITH p
ORDER BY p.created
LIMIT 1000
MATCH (p)-[:KNOWS]->(friend)
RETURN p.name, collect(friend.name);

Query Hints

-- Force index usage
MATCH (p:Person)
USING INDEX p:Person(name)
WHERE p.name = 'John'
RETURN p;

-- Force scan
MATCH (p:Person)
USING SCAN p:Person
WHERE p.age > 30
RETURN p;

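-- Join hint (the hinted node must connect two parts of the pattern)
MATCH (p:Person {name: 'John'})-[:KNOWS]->(friend:Person)<-[:KNOWS]-(other:Person {name: 'Alice'})
USING JOIN ON friend
RETURN friend;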

Administration

Database Management

-- Show databases
SHOW DATABASES;

-- Create database
CREATE DATABASE mydb;

-- Use database
:use mydb;

-- Drop database
DROP DATABASE mydb;

-- Start database
START DATABASE mydb;

-- Stop database
STOP DATABASE mydb;

-- Show default database
SHOW DEFAULT DATABASE;

-- Set database access mode (READ ONLY or READ WRITE)
ALTER DATABASE mydb SET ACCESS READ WRITE;

User Management

-- Show users
SHOW USERS;

-- Create user
CREATE USER alice SET PASSWORD 'password123' CHANGE NOT REQUIRED;

-- Create user with password change required
CREATE USER bob SET PASSWORD 'temp123' CHANGE REQUIRED;

-- Change user password
ALTER USER alice SET PASSWORD 'newpassword123';

-- Drop user
DROP USER alice;

-- Show current user
SHOW CURRENT USER;

-- Show user privileges
SHOW USER alice PRIVILEGES;

Role Management

-- Show roles
SHOW ROLES;

-- Create role
CREATE ROLE developer;

-- Grant role to user
GRANT ROLE developer TO alice;

-- Revoke role from user
REVOKE ROLE developer FROM alice;

-- Drop role
DROP ROLE developer;

-- Show role privileges
SHOW ROLE developer PRIVILEGES;

Privilege Management

-- Privileges are granted to roles (here: developer, admin), not directly to users
-- Grant database privileges
GRANT ACCESS ON DATABASE mydb TO developer;
GRANT START ON DATABASE mydb TO developer;
GRANT STOP ON DATABASE mydb TO admin;

-- Grant graph privileges
GRANT MATCH {*} ON GRAPH mydb TO developer;
GRANT CREATE ON GRAPH mydb TO developer;
GRANT DELETE ON GRAPH mydb TO admin;

-- Grant specific node privileges ({...} lists properties; NODES lists labels)
GRANT MATCH {*} ON GRAPH mydb NODES Person TO developer;
GRANT CREATE ON GRAPH mydb NODES Person TO developer;

-- Grant relationship privileges
GRANT MATCH {*} ON GRAPH mydb RELATIONSHIPS KNOWS TO developer;
GRANT CREATE ON GRAPH mydb RELATIONSHIPS KNOWS TO developer;

-- Revoke privileges
REVOKE ACCESS ON DATABASE mydb FROM developer;
REVOKE MATCH {*} ON GRAPH mydb FROM developer;

-- Show privileges
SHOW PRIVILEGES;
SHOW USER alice PRIVILEGES;
SHOW ROLE developer PRIVILEGES;

Backup and Restore

# Online backup (Enterprise Edition)
neo4j-admin database backup --to-path=/backup/location mydb

# Full backup (including users and roles metadata)
neo4j-admin database backup --to-path=/backup/location --type=FULL --include-metadata=all mydb

# Differential (incremental) backup on top of an existing full backup
neo4j-admin database backup --to-path=/backup/location --type=DIFF mydb

# Restore from backup
neo4j-admin database restore --from-path=/backup/location mydb

# Dump database
neo4j-admin database dump --to-path=/dump/location mydb

# Load database from dump
neo4j-admin database load --from-path=/dump/location mydb

# Copy database to a new database name (Enterprise Edition; the source must be stopped)
neo4j-admin database copy mydb mydb-copy

Security

Authentication

# Configuration in neo4j.conf (Neo4j 4.x setting names)
dbms.security.auth_enabled=true
dbms.security.authentication_providers=native
dbms.security.authorization_providers=native

# LDAP authentication (Enterprise Edition); the port is part of the host URL
dbms.security.authentication_providers=ldap
dbms.security.authorization_providers=ldap
dbms.security.ldap.host=ldap://ldap.example.com:389
dbms.security.ldap.authentication.user_dn_template=cn={0},ou=users,dc=example,dc=com

# Active Directory authentication
dbms.security.authentication_providers=ldap
dbms.security.authorization_providers=ldap
dbms.security.ldap.host=ldap://ad.example.com:389
dbms.security.ldap.authentication.user_dn_template={0}@example.com

SSL/TLS Configuration

# Enable HTTPS
dbms.connector.https.enabled=true
dbms.connector.https.listen_address=:7473

# SSL certificates
dbms.ssl.policy.https.enabled=true
dbms.ssl.policy.https.base_directory=certificates/https
dbms.ssl.policy.https.private_key=private.key
dbms.ssl.policy.https.public_certificate=public.crt

# Bolt SSL
dbms.connector.bolt.tls_level=REQUIRED
dbms.ssl.policy.bolt.enabled=true
dbms.ssl.policy.bolt.base_directory=certificates/bolt

Network Security

# Bind to specific interfaces
dbms.default_listen_address=10.0.0.1
dbms.connector.bolt.listen_address=10.0.0.1:7687
dbms.connector.http.listen_address=10.0.0.1:7474

# Disable HTTP connector (use HTTPS only)
dbms.connector.http.enabled=false

# Configure allowed origins for browser
dbms.security.http_access_control_allow_origin=https://myapp.example.com

# Firewall rules (example for iptables)
# Allow Neo4j ports from specific networks
# iptables -A INPUT -p tcp --dport 7474 -s 10.0.0.0/24 -j ACCEPT
# iptables -A INPUT -p tcp --dport 7687 -s 10.0.0.0/24 -j ACCEPT

Clustering

Causal Cluster Setup (Enterprise Edition)

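# Core server configuration (neo4j.conf)
# These are Neo4j 4.x Causal Clustering settings; Neo4j 5 uses a different
# clustering model and setting names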
dbms.mode=CORE
causal_clustering.minimum_core_cluster_size_at_formation=3
causal_clustering.initial_discovery_members=core1:5000,core2:5000,core3:5000
causal_clustering.discovery_listen_address=0.0.0.0:5000
causal_clustering.transaction_listen_address=0.0.0.0:6000
causal_clustering.raft_listen_address=0.0.0.0:7000

# Read replica configuration
dbms.mode=READ_REPLICA
causal_clustering.initial_discovery_members=core1:5000,core2:5000,core3:5000

# Start cluster members
# Start core servers first, then read replicas
neo4j start

# Check cluster status
CALL dbms.cluster.overview();

# Show cluster topology
CALL dbms.cluster.role();

# Show routing table
CALL dbms.routing.getRoutingTable({}, "mydb");

Cluster Management

-- Show cluster overview
CALL dbms.cluster.overview();

-- Show cluster role
CALL dbms.cluster.role();

-- Show cluster routing table
CALL dbms.routing.getRoutingTable({}, "system");

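-- Unbind a member from the cluster: this is a shell command run on the
-- stopped server, not a Cypher procedure
-- neo4j-admin unbind          (Neo4j 4.x)
-- neo4j-admin server unbind   (Neo4j 5.x)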

-- Check cluster connectivity
CALL dbms.cluster.checkConnectivity();

Load Balancing

# Application connection with load balancing
# Use the neo4j:// scheme for routing. The URI names ONE cluster member;
# the driver fetches the routing table (all members) from it
neo4j://core1:7687

# Driver configuration example (Java)
Driver driver = GraphDatabase.driver(
    "neo4j://core1:7687",
    AuthTokens.basic("neo4j", "password"),
    Config.builder()
        .withMaxConnectionLifetime(30, TimeUnit.MINUTES)
        .withMaxConnectionPoolSize(50)
        .withConnectionAcquisitionTimeout(2, TimeUnit.MINUTES)
        .build()
);

# Read/write session routing
// Write session (routes to leader)
try (Session session = driver.session(SessionConfig.forDatabase("mydb"))) {
    session.writeTransaction(tx -> {
        // Consume the result inside the transaction function
        return tx.run("CREATE (p:Person {name: 'Alice'})").consume();
    });
}

// Read session (routes to followers/read replicas)
try (Session session = driver.session(SessionConfig.forDatabase("mydb"))) {
    // Materialize the records before the transaction closes
    List<String> names = session.readTransaction(tx ->
        tx.run("MATCH (p:Person) RETURN p.name AS name")
          .list(record -> record.get("name").asString()));
}

Monitoring

System Monitoring

-- Show system information
CALL dbms.components();

-- Show running queries (Neo4j 4.x; in Neo4j 5 use SHOW TRANSACTIONS)
CALL dbms.listQueries();

-- Show running transactions (Neo4j 4.x; in Neo4j 5 use SHOW TRANSACTIONS)
CALL dbms.listTransactions();

-- Show connection information
CALL dbms.listConnections();

-- Show memory usage
CALL dbms.queryJmx("org.neo4j:instance=kernel#0,name=Memory Pools") YIELD attributes
RETURN attributes;

-- Show page cache metrics
CALL dbms.queryJmx("org.neo4j:instance=kernel#0,name=Page cache") YIELD attributes
RETURN attributes;

-- Show transaction metrics
CALL dbms.queryJmx("org.neo4j:instance=kernel#0,name=Transactions") YIELD attributes
RETURN attributes;

Query Monitoring

-- Enable query logging (in neo4j.conf)
dbms.logs.query.enabled=true
dbms.logs.query.threshold=0
dbms.logs.query.parameter_logging_enabled=true

-- Show slow queries
CALL dbms.listQueries() YIELD queryId, query, elapsedTimeMillis
WHERE elapsedTimeMillis > 1000
RETURN queryId, query, elapsedTimeMillis;

-- Kill long-running query
CALL dbms.killQuery('query-123');

-- Kill transaction
CALL dbms.killTransaction('transaction-456');

-- Show query plan cache
CALL dbms.queryJmx("org.neo4j:instance=kernel#0,name=Query management") YIELD attributes
RETURN attributes;

Performance Metrics

# JMX monitoring endpoints
# Memory usage
org.neo4j:instance=kernel#0,name=Memory Pools

# Page cache
org.neo4j:instance=kernel#0,name=Page cache

# Transactions
org.neo4j:instance=kernel#0,name=Transactions

# Store sizes
org.neo4j:instance=kernel#0,name=Store file sizes

# Bolt connections
org.neo4j:instance=kernel#0,name=Bolt

# HTTP connections
org.neo4j:instance=kernel#0,name=HTTP

# Example monitoring with curl (Neo4j 3.x only; this JMX HTTP endpoint was removed in 4.0)
curl -u neo4j:password \
  -H "Accept: application/json" \
  http://localhost:7474/db/manage/server/jmx/domain/org.neo4j

Log Monitoring

# Log file locations
/var/log/neo4j/neo4j.log      # Main log
/var/log/neo4j/debug.log      # Debug log
/var/log/neo4j/query.log      # Query log
/var/log/neo4j/security.log   # Security log

# Monitor logs
tail -f /var/log/neo4j/neo4j.log

# Query log analysis
grep "ERROR" /var/log/neo4j/query.log
grep "WARN" /var/log/neo4j/neo4j.log

# Log rotation configuration
dbms.logs.query.rotation.keep_number=7
dbms.logs.query.rotation.size=20M
dbms.logs.debug.rotation.keep_number=7
dbms.logs.debug.rotation.size=20M

Best Practices

Data Modeling

-- Use meaningful labels
// Good
CREATE (p:Person {name: 'John'});
CREATE (c:Company {name: 'TechCorp'});

// Bad
CREATE (n {type: 'person', name: 'John'});

-- Use specific relationship types
// Good
CREATE (p:Person)-[:WORKS_FOR]->(c:Company);
CREATE (p:Person)-[:LIVES_IN]->(city:City);

// Bad
CREATE (p:Person)-[:RELATED_TO {type: 'works_for'}]->(c:Company);

-- Denormalize for performance
// Store frequently accessed data on nodes
CREATE (p:Person {
  name: 'John',
  friendCount: 150,  // Denormalized count
  lastLogin: datetime()
});

-- Use appropriate data types
CREATE (p:Person {
  name: 'John',           // String
  age: 30,                // Integer
  salary: 75000.50,       // Float
  active: true,           // Boolean
  created: datetime(),    // DateTime
  tags: ['developer', 'manager']  // List
});

Query Best Practices

-- Start with most selective patterns
// Good: Anchor on a uniquely constrained property, then expand
MATCH (p:Person {email: 'john@example.com'})
MATCH (p)-[:WORKS_FOR]->(c:Company)
RETURN p, c;

// Less explicit: same result, and the planner usually picks the same anchor,
// but the selective lookup is buried in the WHERE clause
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
WHERE p.email = 'john@example.com'
RETURN p, c;

-- Use parameters for dynamic queries
// Good: Parameterized query
MATCH (p:Person {name: $name})
RETURN p;

// Bad: String concatenation (security risk)
// MATCH (p:Person {name: '" + userInput + "'}) RETURN p;
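
To make the difference concrete from the application side, here is a minimal Python sketch. It assumes a hypothetical helper named person_by_name and deliberately stops short of any driver call. The point is only that the query text is a constant and the user input travels in a separate parameter map, which a driver would then receive together with the query.

```python
def person_by_name(user_input: str):
    """Return a constant query string plus a separate parameter map."""
    query = "MATCH (p:Person {name: $name}) RETURN p"
    return query, {"name": user_input}


# Input crafted to break out of a concatenated string literal.
hostile = "x'}) DETACH DELETE p //"
query, params = person_by_name(hostile)

# The hostile text never becomes part of the Cypher text; it is only data.
print(query)
print(params)
```

A side benefit of the constant query text is that the server can reuse the cached query plan across calls, whereas concatenated strings produce a new query each time.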

-- Limit result sets
MATCH (p:Person)
RETURN p
ORDER BY p.created DESC
LIMIT 20;

-- Use OPTIONAL MATCH for optional patterns
MATCH (p:Person)
OPTIONAL MATCH (p)-[:WORKS_FOR]->(c:Company)
RETURN p.name, c.name;

-- Collect related data efficiently
MATCH (p:Person)-[:KNOWS]->(friend:Person)
RETURN p.name, collect(friend.name) AS friends;

Performance Best Practices

-- Create appropriate indexes
CREATE INDEX person_email_index FOR (p:Person) ON (p.email);
CREATE INDEX company_name_index FOR (c:Company) ON (c.name);

-- Use constraints for data integrity
CREATE CONSTRAINT person_email_unique FOR (p:Person) REQUIRE p.email IS UNIQUE;

-- Avoid large transactions
// Good: Process in batches; each run deletes at most 1000 nodes,
// so re-run until "deleted" is 0
MATCH (p:Person)
WHERE p.lastLogin < date('2022-01-01')
WITH p LIMIT 1000
DETACH DELETE p
RETURN count(*) AS deleted;

// Bad: Delete all at once
MATCH (p:Person)
WHERE p.lastLogin < date('2022-01-01')
DETACH DELETE p;
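
A LIMIT-based batch delete only removes one batch per execution, so something has to repeat it until nothing is left. The following Python sketch shows that control loop only. It assumes a caller-supplied function (called run_batch here, a hypothetical name) that executes the batched delete once and returns how many nodes it removed; the fake_run_batch stand-in simulates a database so the loop can run without one.

```python
from typing import Callable


def delete_in_batches(run_batch: Callable[[int], int], batch_size: int = 1000) -> int:
    """Call run_batch(batch_size) until it reports that nothing was deleted."""
    total = 0
    while True:
        deleted = run_batch(batch_size)
        if deleted == 0:
            return total
        total += deleted


# Stand-in for the database: 2500 stale nodes, removed at most `limit` at a time.
remaining = [2500]


def fake_run_batch(limit: int) -> int:
    deleted = min(limit, remaining[0])
    remaining[0] -= deleted
    return deleted


print(delete_in_batches(fake_run_batch))
```

Each call to run_batch would be its own transaction, which keeps the memory held by any single transaction bounded by the batch size.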

-- Use EXPLAIN and PROFILE
PROFILE
MATCH (p:Person)-[:KNOWS*2..3]->(friend:Person)
WHERE p.name = 'John'
RETURN friend.name;

-- Optimize memory usage
// Use WITH to reduce intermediate results
MATCH (p:Person)-[:KNOWS]->(friend:Person)
WITH p, count(friend) AS friendCount
WHERE friendCount > 10
RETURN p.name, friendCount;

Security Best Practices

-- Use least privilege principle
// Create role with minimal permissions
CREATE ROLE reader;
GRANT MATCH {*} ON GRAPH mydb TO reader;
GRANT ROLE reader TO user1;

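-- Validate input data
// Neo4j constraints cannot check a regular expression. Require the property
// with a constraint (Enterprise Edition) and check the format in the write query
CREATE CONSTRAINT person_email_exists FOR (p:Person) REQUIRE p.email IS NOT NULL;

WITH $userEmail AS email
WHERE email =~ '.*@.*\\..*'
CREATE (p:Person {email: email});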

-- Use parameterized queries
// Always use parameters for user input
MATCH (p:Person {email: $userEmail})
RETURN p;

-- Regular security audits
SHOW USERS;
SHOW ROLES;
SHOW PRIVILEGES;

-- Monitor access logs
// Security events (logins, privilege changes) are written to security.log;
// set the level in neo4j.conf (Neo4j 4.x setting name)
dbms.logs.security.level=INFO

Operational Best Practices

# Regular backups
# Schedule daily backups
0 2 * * * neo4j-admin database backup --to-path=/backup/$(date +\%Y\%m\%d) mydb

# Monitor disk space
df -h /var/lib/neo4j

# Monitor memory usage
free -h

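# Regular maintenance
# Compact store files: there is no in-place compact command; copying a stopped
# database writes a compacted store (Enterprise Edition)
neo4j-admin database copy mydb mydb-compacted

# Check consistency
neo4j-admin database check mydb

# Refresh the index statistics used by the query planner
# In Cypher
CALL db.resampleOutdatedIndexes();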

# Monitor query performance
# Review slow query log regularly
grep "WARN" /var/log/neo4j/query.log

# Capacity planning
# Monitor growth trends
# Plan for 3x data growth
# Monitor query patterns

Summary

Neo4j is a powerful graph database that excels at managing and querying highly connected data. This cheatsheet provides comprehensive coverage of Neo4j operations from basic graph concepts to advanced administration.

Key Strengths:
- Native Graph Processing: Optimized for traversing relationships
- Cypher Query Language: Intuitive and powerful graph query language
- ACID Compliance: Full transaction support with consistency guarantees
- Scalability: Horizontal scaling with causal clustering
- Flexibility: Schema-optional with dynamic data modeling

Best Use Cases:
- Social networks and recommendation engines
- Fraud detection and risk management
- Knowledge graphs and semantic search
- Network and IT operations
- Supply chain and logistics optimization

Important Considerations:
- Graph data modeling requires different thinking than relational
- Performance depends heavily on proper indexing and query design
- Memory requirements can be significant for large graphs
- Clustering features require Enterprise Edition
- Query complexity can grow quickly with deep traversals

By following the practices and techniques outlined in this cheatsheet, you can effectively design, implement, and maintain Neo4j graph databases that provide powerful insights into connected data and support complex relationship-based queries with high performance.