Skip to content

InfluxDB Cheatsheet

InfluxDB - Time Series Database

InfluxDB is an open-source time series database designed to handle high write and query loads. It's purpose-built for time-stamped data including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.

Table of Contents

Installation

Ubuntu/Debian Installation

# Add InfluxDB repository (apt-key is deprecated; install the key into a keyring and reference it with signed-by)
curl -s https://repos.influxdata.com/influxdata-archive_compat.key | gpg --dearmor | sudo tee /usr/share/keyrings/influxdata-archive.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/influxdata-archive.gpg] https://repos.influxdata.com/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/influxdb.list

# Update package index
sudo apt update

# Install InfluxDB
sudo apt install influxdb

# Start InfluxDB service
sudo systemctl start influxdb
sudo systemctl enable influxdb

# Check status
sudo systemctl status influxdb

# Verify installation
influx -version

CentOS/RHEL Installation

# Add InfluxDB repository
sudo tee /etc/yum.repos.d/influxdb.repo << EOF
[influxdb]
name = InfluxDB Repository - RHEL
baseurl = https://repos.influxdata.com/rhel/\$releasever/\$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdata-archive_compat.key
EOF

# Install InfluxDB
sudo yum install influxdb

# Start InfluxDB service
sudo systemctl start influxdb
sudo systemctl enable influxdb

# Check status
sudo systemctl status influxdb

Docker Installation

# Pull InfluxDB image
docker pull influxdb:latest

# Run InfluxDB container
docker run -d \
  --name influxdb \
  -p 8086:8086 \
  -v influxdb-data:/var/lib/influxdb \
  -e INFLUXDB_DB=mydb \
  -e INFLUXDB_ADMIN_USER=admin \
  -e INFLUXDB_ADMIN_PASSWORD=password \
  influxdb:latest

# Run InfluxDB 2.x
docker run -d \
  --name influxdb2 \
  -p 8086:8086 \
  -v influxdb2-data:/var/lib/influxdb2 \
  -e DOCKER_INFLUXDB_INIT_MODE=setup \
  -e DOCKER_INFLUXDB_INIT_USERNAME=admin \
  -e DOCKER_INFLUXDB_INIT_PASSWORD=password123 \
  -e DOCKER_INFLUXDB_INIT_ORG=myorg \
  -e DOCKER_INFLUXDB_INIT_BUCKET=mybucket \
  influxdb:2.7

# Docker Compose setup
cat > docker-compose.yml << EOF
version: '3.8'
services:
  influxdb:
    image: influxdb:latest
    container_name: influxdb
    ports:
      - "8086:8086"
    environment:
      - INFLUXDB_DB=mydb
      - INFLUXDB_ADMIN_USER=admin
      - INFLUXDB_ADMIN_PASSWORD=password
    volumes:
      - influxdb-data:/var/lib/influxdb
    restart: unless-stopped

volumes:
  influxdb-data:
EOF

docker-compose up -d

Manual Installation

# Download InfluxDB
wget https://dl.influxdata.com/influxdb/releases/influxdb-1.8.10_linux_amd64.tar.gz

# Extract
tar -xzf influxdb-1.8.10_linux_amd64.tar.gz
sudo mv influxdb-1.8.10-1 /opt/influxdb

# Create user and directories
sudo useradd --system --home-dir /var/lib/influxdb --shell /bin/false influxdb
sudo mkdir -p /var/lib/influxdb
sudo chown influxdb:influxdb /var/lib/influxdb

# Create systemd service
sudo tee /etc/systemd/system/influxdb.service << EOF
[Unit]
Description=InfluxDB is an open-source, distributed, time series database
Documentation=https://docs.influxdata.com/influxdb/
After=network-online.target

[Service]
User=influxdb
Group=influxdb
ExecStart=/opt/influxdb/usr/bin/influxd -config /etc/influxdb/influxdb.conf
KillMode=control-group
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl start influxdb
sudo systemctl enable influxdb

Basic Concepts

Key Concepts

# Time Series Data Structure
# measurement,tag_key1=tag_value1,tag_key2=tag_value2 field_key1=field_value1,field_key2=field_value2 timestamp

# Example:
# temperature,location=office,sensor=A temperature=23.5,humidity=45.2 1609459200000000000

# Components:
# - Measurement: Similar to table name (e.g., "temperature")
# - Tags: Indexed metadata (e.g., location=office, sensor=A)
# - Fields: Actual data values (e.g., temperature=23.5, humidity=45.2)
# - Timestamp: Unix nanosecond timestamp

# Database: Container for users, retention policies, continuous queries, and time series data
# Retention Policy: Describes how long InfluxDB keeps data and how many copies exist
# Series: Collection of data that share a measurement, tag set, and field key
# Point: Single data record with measurement, tag set, field set, and timestamp

InfluxDB CLI

# Connect to InfluxDB
influx

# Connect with specific parameters
influx -host localhost -port 8086

# Connect with authentication
influx -username admin -password password

# Connect to specific database
influx -database mydb

# Execute single command
influx -execute "SHOW DATABASES"

# Import line protocol data from a file (not arbitrary commands)
influx -import -path=data.txt -precision=ns

# Set precision
influx -precision rfc3339

Basic Commands

-- Show databases
SHOW DATABASES;

-- Create database
CREATE DATABASE mydb;

-- Use database
USE mydb;

-- Show measurements (tables)
SHOW MEASUREMENTS;

-- Show tag keys
SHOW TAG KEYS;

-- Show tag values
SHOW TAG VALUES FROM "temperature" WITH KEY = "location";

-- Show field keys
SHOW FIELD KEYS;

-- Show series
SHOW SERIES;

-- Show retention policies
SHOW RETENTION POLICIES;

-- Show users
SHOW USERS;

InfluxQL

Writing Data

-- Insert single point
INSERT temperature,location=office,sensor=A temperature=23.5,humidity=45.2

-- Insert with timestamp
INSERT temperature,location=office,sensor=A temperature=23.5,humidity=45.2 1609459200000000000

-- Insert multiple points
INSERT temperature,location=office,sensor=A temperature=23.5,humidity=45.2
INSERT temperature,location=office,sensor=B temperature=24.1,humidity=43.8
INSERT temperature,location=warehouse,sensor=C temperature=18.9,humidity=52.1

-- Insert with different field types
INSERT metrics,host=server1 cpu_usage=85.2,memory_usage=1024i,disk_full=false,status="running"

-- Batch insert using line protocol
# temperature,location=office,sensor=A temperature=23.5,humidity=45.2 1609459200000000000
# temperature,location=office,sensor=B temperature=24.1,humidity=43.8 1609459260000000000
# temperature,location=warehouse,sensor=C temperature=18.9,humidity=52.1 1609459320000000000

Querying Data

-- Basic SELECT
SELECT * FROM temperature;

-- Select specific fields
SELECT temperature, humidity FROM temperature;

-- Select with WHERE clause
SELECT * FROM temperature WHERE location = 'office';

-- Select with time range
SELECT * FROM temperature WHERE time >= '2023-01-01T00:00:00Z' AND time <= '2023-01-02T00:00:00Z';

-- Select with relative time
SELECT * FROM temperature WHERE time >= now() - 1h;

-- Select with LIMIT
SELECT * FROM temperature LIMIT 10;

-- Select with ORDER BY
SELECT * FROM temperature ORDER BY time DESC LIMIT 10;

-- Select with GROUP BY
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY location;

-- Select with GROUP BY time
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- Select with multiple GROUP BY
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m), location;

-- Select with FILL
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m) FILL(null);
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m) FILL(previous);
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m) FILL(linear);
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m) FILL(0);

Aggregation Functions

-- COUNT
SELECT COUNT(temperature) FROM temperature WHERE time >= now() - 1h;

-- MEAN
SELECT MEAN(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- MEDIAN
SELECT MEDIAN(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- MODE
SELECT MODE(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- SPREAD
SELECT SPREAD(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- STDDEV
SELECT STDDEV(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- SUM
SELECT SUM(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- MIN and MAX
SELECT MIN(temperature), MAX(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- FIRST and LAST
SELECT FIRST(temperature), LAST(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- PERCENTILE
SELECT PERCENTILE(temperature, 95) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

Selector Functions

-- TOP
SELECT TOP(temperature, 5) FROM temperature WHERE time >= now() - 1h;

-- BOTTOM
SELECT BOTTOM(temperature, 5) FROM temperature WHERE time >= now() - 1h;

-- FIRST
SELECT FIRST(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY location;

-- LAST
SELECT LAST(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY location;

-- MAX
SELECT MAX(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- MIN
SELECT MIN(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- SAMPLE
SELECT SAMPLE(temperature, 3) FROM temperature WHERE time >= now() - 1h;

Transformation Functions

-- ABS
SELECT ABS(temperature) FROM temperature WHERE time >= now() - 1h;

-- ACOS, ASIN, ATAN
SELECT ACOS(temperature/100), ASIN(temperature/100), ATAN(temperature/100) FROM temperature WHERE time >= now() - 1h;

-- CEIL, FLOOR, ROUND
SELECT CEIL(temperature), FLOOR(temperature), ROUND(temperature) FROM temperature WHERE time >= now() - 1h;

-- COS, SIN, TAN
SELECT COS(temperature), SIN(temperature), TAN(temperature) FROM temperature WHERE time >= now() - 1h;

-- DERIVATIVE
SELECT DERIVATIVE(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- DIFFERENCE
SELECT DIFFERENCE(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- ELAPSED
SELECT ELAPSED(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- EXP, LN, LOG
SELECT EXP(temperature/10), LN(temperature), LOG(temperature, 10) FROM temperature WHERE time >= now() - 1h;

-- MOVING_AVERAGE
SELECT MOVING_AVERAGE(temperature, 3) FROM temperature WHERE time >= now() - 1h GROUP BY time(1m);

-- NON_NEGATIVE_DERIVATIVE
SELECT NON_NEGATIVE_DERIVATIVE(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- POW, SQRT
SELECT POW(temperature, 2), SQRT(temperature) FROM temperature WHERE time >= now() - 1h;

Subqueries

-- Basic subquery
SELECT mean FROM (SELECT MEAN(temperature) AS mean FROM temperature WHERE time >= now() - 1h GROUP BY time(10m));

-- Subquery with WHERE
SELECT * FROM (SELECT MEAN(temperature) AS mean FROM temperature WHERE time >= now() - 1h GROUP BY time(10m)) WHERE mean > 20;

-- Nested subqueries
SELECT max FROM (SELECT MAX(mean) AS max FROM (SELECT MEAN(temperature) AS mean FROM temperature WHERE time >= now() - 1h GROUP BY time(10m)));

-- Subquery with mathematical operations
SELECT mean * 1.8 + 32 AS fahrenheit FROM (SELECT MEAN(temperature) AS mean FROM temperature WHERE time >= now() - 1h GROUP BY time(10m));

Regular Expressions

-- Match tag values with regex
SELECT * FROM temperature WHERE location =~ /^office.*/;

-- Match field values with regex
SELECT * FROM logs WHERE message =~ /error|warning/;

-- Case-insensitive regex
SELECT * FROM temperature WHERE location =~ /(?i)office/;

-- Negated regex
SELECT * FROM temperature WHERE location !~ /warehouse/;

Flux Query Language

Basic Flux Syntax

// Import packages
import "csv"
import "json"
import "math"

// Basic query structure
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> filter(fn: (r) => r.location == "office")

// Variables
start_time = -1h
measurement_name = "temperature"

from(bucket: "mybucket")
  |> range(start: start_time)
  |> filter(fn: (r) => r._measurement == measurement_name)

// Comments
// Single line comment
/* Multi-line
   comment */

Data Selection

// Basic data selection
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")

// Multiple filters
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> filter(fn: (r) => r.location == "office")
  |> filter(fn: (r) => r._field == "temperature")

// Time range with absolute times
from(bucket: "mybucket")
  |> range(start: 2023-01-01T00:00:00Z, stop: 2023-01-02T00:00:00Z)

// Complex filter conditions
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature" and (r.location == "office" or r.location == "warehouse"))

// Filter by field value
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> filter(fn: (r) => r._value > 20.0)

Aggregations

// Mean aggregation
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> aggregateWindow(every: 10m, fn: mean)

// Multiple aggregations
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> aggregateWindow(every: 10m, fn: mean)
  |> yield(name: "mean")

from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> aggregateWindow(every: 10m, fn: max)
  |> yield(name: "max")

// Group by tags
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> group(columns: ["location"])
  |> aggregateWindow(every: 10m, fn: mean)

// Custom aggregation function
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> aggregateWindow(every: 10m, fn: (tables=<-, column) => 
      tables |> quantile(q: 0.95, column: column))

Transformations

// Map function
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> map(fn: (r) => ({ r with _value: r._value * 1.8 + 32.0 }))

// Rename columns
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> rename(columns: {_value: "temperature_celsius"})

// Add new columns
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> map(fn: (r) => ({ r with fahrenheit: r._value * 1.8 + 32.0 }))

// Drop columns
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> drop(columns: ["_start", "_stop"])

// Keep only specific columns
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> keep(columns: ["_time", "_value", "location"])

// Sort data
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> sort(columns: ["_time"], desc: false)

// Limit results
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> limit(n: 10)

Joins and Unions

// Join two streams
temp_data = from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")

humidity_data = from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "humidity")

join(tables: {temp: temp_data, humidity: humidity_data}, on: ["_time", "location"])

// Union multiple streams
stream1 = from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")

stream2 = from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "pressure")

union(tables: [stream1, stream2])

// Pivot data
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "sensors")
  |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")

Window Functions

// Time-based windows
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> window(every: 10m)
  |> mean()
  |> duplicate(column: "_stop", as: "_time")
  |> window(every: inf)

// Moving average
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> movingAverage(n: 5)

// Exponential moving average
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> exponentialMovingAverage(n: 5)

// Derivative
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "counter")
  |> derivative(unit: 1s, nonNegative: true)

// Difference
from(bucket: "mybucket")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
  |> difference(nonNegative: false)

Data Management

Database Operations

-- Create database
CREATE DATABASE mydb;

-- Create database with retention policy
CREATE DATABASE mydb WITH DURATION 30d REPLICATION 1 SHARD DURATION 1h NAME "default";

-- Drop database
DROP DATABASE mydb;

-- Show databases
SHOW DATABASES;

-- Use database
USE mydb;

Measurement Operations

-- Show measurements
SHOW MEASUREMENTS;

-- Show measurements with regex
SHOW MEASUREMENTS WITH MEASUREMENT =~ /temp.*/;

-- Show measurements on specific database
SHOW MEASUREMENTS ON mydb;

-- Drop measurement
DROP MEASUREMENT temperature;

-- Show series
SHOW SERIES;

-- Show series from specific measurement
SHOW SERIES FROM temperature;

-- Show series with WHERE clause
SHOW SERIES FROM temperature WHERE location = 'office';

-- Drop series
DROP SERIES FROM temperature WHERE location = 'office';

Field and Tag Operations

-- Show field keys
SHOW FIELD KEYS;

-- Show field keys from specific measurement
SHOW FIELD KEYS FROM temperature;

-- Show tag keys
SHOW TAG KEYS;

-- Show tag keys from specific measurement
SHOW TAG KEYS FROM temperature;

-- Show tag values
SHOW TAG VALUES FROM temperature WITH KEY = "location";

-- Show tag values with WHERE clause
SHOW TAG VALUES FROM temperature WITH KEY = "location" WHERE location =~ /office.*/;

-- Show tag values for multiple keys
SHOW TAG VALUES FROM temperature WITH KEY IN ("location", "sensor");

Data Deletion

-- Delete data with WHERE clause
DELETE FROM temperature WHERE location = 'office';

-- Delete data with time range
DELETE FROM temperature WHERE time < '2023-01-01T00:00:00Z';

-- Delete data with multiple conditions
DELETE FROM temperature WHERE location = 'office' AND time < '2023-01-01T00:00:00Z';

-- Delete all data from measurement
DELETE FROM temperature;

-- Drop measurement (deletes all data and metadata)
DROP MEASUREMENT temperature;

-- Drop series
DROP SERIES FROM temperature WHERE location = 'office';

Retention Policies

Creating Retention Policies

-- Create retention policy
CREATE RETENTION POLICY "one_week" ON "mydb" DURATION 7d REPLICATION 1;

-- Create retention policy with shard duration
CREATE RETENTION POLICY "one_month" ON "mydb" DURATION 30d REPLICATION 1 SHARD DURATION 1h;

-- Create default retention policy
CREATE RETENTION POLICY "default" ON "mydb" DURATION 365d REPLICATION 1 DEFAULT;

-- Show retention policies
SHOW RETENTION POLICIES;

-- Show retention policies for specific database
SHOW RETENTION POLICIES ON mydb;

Managing Retention Policies

-- Alter retention policy
ALTER RETENTION POLICY "one_week" ON "mydb" DURATION 14d;

-- Alter retention policy replication
ALTER RETENTION POLICY "one_week" ON "mydb" REPLICATION 2;

-- Set as default retention policy
ALTER RETENTION POLICY "one_week" ON "mydb" DEFAULT;

-- Drop retention policy
DROP RETENTION POLICY "one_week" ON "mydb";

Writing to Specific Retention Policy

-- Write to specific retention policy
-- (the measurement follows the retention policy after a space, not a dot,
--  and line protocol takes no trailing semicolon)
INSERT INTO mydb."one_week" temperature,location=office,sensor=A temperature=23.5

-- Using a fully qualified retention policy
INSERT INTO mydb."one_week" temperature,location=office temperature=23.5,humidity=45.2

Continuous Queries

Creating Continuous Queries

-- Basic continuous query
CREATE CONTINUOUS QUERY "cq_mean_temp" ON "mydb"
BEGIN
  SELECT mean(temperature) INTO "average_temperature" FROM "temperature" GROUP BY time(10m)
END;

-- Continuous query with WHERE clause
CREATE CONTINUOUS QUERY "cq_office_temp" ON "mydb"
BEGIN
  SELECT mean(temperature) INTO "office_avg_temp" FROM "temperature" WHERE location = 'office' GROUP BY time(10m)
END;

-- Continuous query with tags
CREATE CONTINUOUS QUERY "cq_temp_by_location" ON "mydb"
BEGIN
  SELECT mean(temperature) INTO "avg_temp_by_location" FROM "temperature" GROUP BY time(10m), location
END;

-- Continuous query with retention policy
CREATE CONTINUOUS QUERY "cq_downsampled" ON "mydb"
BEGIN
  SELECT mean(temperature) INTO "mydb"."one_month"."avg_temperature" FROM "mydb"."autogen"."temperature" GROUP BY time(1h)
END;

-- Continuous query with RESAMPLE
CREATE CONTINUOUS QUERY "cq_resample" ON "mydb"
RESAMPLE EVERY 30s FOR 2m
BEGIN
  SELECT mean(temperature) INTO "avg_temperature" FROM "temperature" GROUP BY time(1m)
END;

Managing Continuous Queries

-- Show continuous queries
SHOW CONTINUOUS QUERIES;

-- Show continuous queries for specific database
SHOW CONTINUOUS QUERIES ON mydb;

-- Drop continuous query
DROP CONTINUOUS QUERY "cq_mean_temp" ON "mydb";

Advanced Continuous Queries

-- Continuous query with multiple aggregations
CREATE CONTINUOUS QUERY "cq_temp_stats" ON "mydb"
BEGIN
  SELECT mean(temperature) AS mean_temp, max(temperature) AS max_temp, min(temperature) AS min_temp
  INTO "temperature_stats" FROM "temperature" GROUP BY time(10m), location
END;

-- Continuous query with mathematical operations
CREATE CONTINUOUS QUERY "cq_temp_fahrenheit" ON "mydb"
BEGIN
  SELECT mean(temperature) * 1.8 + 32 AS temp_fahrenheit INTO "temperature_fahrenheit" FROM "temperature" GROUP BY time(10m)
END;

-- Continuous query with conditional logic
CREATE CONTINUOUS QUERY "cq_high_temp_alert" ON "mydb"
BEGIN
  SELECT mean(temperature) INTO "high_temperature_alerts" FROM "temperature" WHERE temperature > 30 GROUP BY time(5m)
END;

Administration

User Management

-- Create admin user
CREATE USER admin WITH PASSWORD 'password' WITH ALL PRIVILEGES;

-- Create regular user
CREATE USER user1 WITH PASSWORD 'password';

-- Show users
SHOW USERS;

-- Grant privileges
GRANT ALL ON mydb TO user1;
GRANT READ ON mydb TO user1;
GRANT WRITE ON mydb TO user1;

-- Revoke privileges
REVOKE ALL PRIVILEGES FROM user1;
REVOKE ALL ON mydb FROM user1;

-- Change password
SET PASSWORD FOR user1 = 'newpassword';

-- Drop user
DROP USER user1;

Authentication and Authorization

# Enable authentication in influxdb.conf
[http]
  auth-enabled = true

# Create admin user (first user)
influx
CREATE USER admin WITH PASSWORD 'password' WITH ALL PRIVILEGES

# Connect with authentication
influx -username admin -password password

# HTTP API with authentication
curl -G http://localhost:8086/query \
  -u admin:password \
  --data-urlencode "q=SHOW DATABASES"

Configuration Management

# Main configuration file
/etc/influxdb/influxdb.conf

# Key configuration sections:

# [meta]
# dir = "/var/lib/influxdb/meta"

# [data]
# dir = "/var/lib/influxdb/data"
# wal-dir = "/var/lib/influxdb/wal"
# max-series-per-database = 1000000
# max-values-per-tag = 100000

# [coordinator]
# write-timeout = "10s"
# max-concurrent-queries = 0
# query-timeout = "0s"
# log-queries-after = "0s"
# max-select-point = 0
# max-select-series = 0
# max-select-buckets = 0

# [retention]
# enabled = true
# check-interval = "30m"

# [shard-precreation]
# enabled = true
# check-interval = "10m"
# advance-period = "30m"

# [monitor]
# store-enabled = true
# store-database = "_internal"
# store-interval = "10s"

# [http]
# enabled = true
# bind-address = ":8086"
# auth-enabled = false
# log-enabled = true
# write-tracing = false
# pprof-enabled = true
# https-enabled = false
# https-certificate = "/etc/ssl/influxdb.pem"

# [logging]
# format = "auto"
# level = "info"
# suppress-logo = false

# Restart after configuration changes
sudo systemctl restart influxdb

Backup and Restore

# Backup database
influxd backup -portable -database mydb /backup/location/

# Backup all databases
influxd backup -portable /backup/location/

# Backup with retention policy
influxd backup -portable -database mydb -retention autogen /backup/location/

# Backup with time range
influxd backup -portable -database mydb -start 2023-01-01T00:00:00Z -end 2023-01-02T00:00:00Z /backup/location/

# Restore database
influxd restore -portable -database mydb /backup/location/

# Restore to different database
influxd restore -portable -database mydb -newdb mynewdb /backup/location/

# Restore with retention policy mapping
influxd restore -portable -database mydb -retention autogen -newrp newrp /backup/location/

# Online backup (Enterprise)
influxd backup -online -database mydb /backup/location/

# Incremental backup (Enterprise)
influxd backup -online -database mydb -incremental /backup/location/

Monitoring

Internal Monitoring

-- Show internal statistics
SHOW STATS;

-- Show specific stats
SHOW STATS FOR 'httpd';
SHOW STATS FOR 'write';
SHOW STATS FOR 'queryExecutor';

-- Query internal database
USE _internal;
SHOW MEASUREMENTS;

-- Monitor write statistics
SELECT * FROM "write" WHERE time > now() - 1h;

-- Monitor query statistics
SELECT * FROM "queryExecutor" WHERE time > now() - 1h;

-- Monitor HTTP statistics
SELECT * FROM "httpd" WHERE time > now() - 1h;

-- Monitor runtime statistics
SELECT * FROM "runtime" WHERE time > now() - 1h;

Performance Monitoring

-- Monitor query performance
SELECT mean("queryDurationNs") FROM "queryExecutor" WHERE time > now() - 1h GROUP BY time(10m);

-- Monitor write throughput
SELECT sum("writeReq") FROM "write" WHERE time > now() - 1h GROUP BY time(10m);

-- Monitor memory usage
SELECT mean("Alloc"), mean("TotalAlloc") FROM "runtime" WHERE time > now() - 1h GROUP BY time(10m);

-- Monitor goroutines
SELECT mean("NumGoroutine") FROM "runtime" WHERE time > now() - 1h GROUP BY time(10m);

-- Monitor shard statistics
SELECT * FROM "shard" WHERE time > now() - 1h;

-- Monitor TSM file statistics
SELECT * FROM "tsm1_filestore" WHERE time > now() - 1h;

System Monitoring

# Monitor InfluxDB process
ps aux | grep influxd

# Monitor memory usage
free -h

# Monitor disk usage
df -h /var/lib/influxdb

# Monitor disk I/O
iostat -x 1

# Monitor network connections
netstat -tulpn | grep :8086

# Monitor log files
tail -f /var/log/influxdb/influxdb.log

# Monitor with systemd
journalctl -u influxdb -f

# Check service status
systemctl status influxdb

HTTP API Monitoring

# Health check
curl -i http://localhost:8086/health

# Ping endpoint
curl -i http://localhost:8086/ping

# Query endpoint
curl -G http://localhost:8086/query \
  --data-urlencode "q=SHOW DATABASES"

# Write endpoint
curl -i -XPOST http://localhost:8086/write?db=mydb \
  --data-binary 'temperature,location=office value=23.5'

# Metrics endpoint (Prometheus format)
curl http://localhost:8086/metrics

# Debug endpoints
curl http://localhost:8086/debug/pprof/
curl http://localhost:8086/debug/pprof/goroutine
curl http://localhost:8086/debug/pprof/heap

Performance Tuning

Write Performance

# Batch writes
# Write multiple points in single request
curl -i -XPOST http://localhost:8086/write?db=mydb \
  --data-binary '
temperature,location=office,sensor=A value=23.5
temperature,location=office,sensor=B value=24.1
temperature,location=warehouse,sensor=C value=18.9
'

# Use UDP for high-throughput writes
# Enable UDP in influxdb.conf
[udp]
  enabled = true
  bind-address = ":8089"
  database = "mydb"
  batch-size = 5000
  batch-timeout = "1s"

# Write via UDP
echo "temperature,location=office value=23.5" | nc -u localhost 8089

# Optimize write configuration
[data]
  cache-max-memory-size = "1g"
  cache-snapshot-memory-size = "25m"
  cache-snapshot-write-cold-duration = "10m"
  compact-full-write-cold-duration = "4h"
  max-series-per-database = 1000000
  max-values-per-tag = 100000

[coordinator]
  write-timeout = "10s"
  max-concurrent-queries = 0

Query Performance

-- Use time ranges in queries
SELECT * FROM temperature WHERE time >= now() - 1h;

-- Use specific field names instead of *
SELECT temperature, humidity FROM temperature WHERE time >= now() - 1h;

-- Use LIMIT to reduce result set
SELECT * FROM temperature WHERE time >= now() - 1h LIMIT 1000;

-- Use appropriate GROUP BY intervals
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- Avoid SELECT * with GROUP BY
-- Good
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- Bad
SELECT * FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);

-- Use tags for filtering (indexed)
SELECT * FROM temperature WHERE location = 'office' AND time >= now() - 1h;

-- Avoid regex on high cardinality tags
-- Good
SELECT * FROM temperature WHERE location = 'office';
-- Bad (if location has high cardinality)
SELECT * FROM temperature WHERE location =~ /office.*/;

Schema Design

-- Use appropriate tag cardinality
-- Good: Low cardinality tags
temperature,location=office,sensor=A temperature=23.5
temperature,location=warehouse,sensor=B temperature=24.1

-- Bad: High cardinality tags (avoid)
-- temperature,user_id=12345,session_id=abcdef temperature=23.5

-- Use fields for high cardinality data
-- Good
temperature,location=office user_id=12345i,session_id="abcdef",temperature=23.5

-- Normalize tag values
-- Good
temperature,location=office,status=active temperature=23.5
-- Bad
temperature,location="Office Building A",status="Currently Active" temperature=23.5

-- Use consistent naming conventions
-- Good
cpu_usage,host=server1,region=us-east value=85.2
memory_usage,host=server1,region=us-east value=1024i
-- Bad
cpuUsage,Host=Server1,REGION=US-EAST Value=85.2

Hardware Optimization

# Memory recommendations
# RAM: 8GB minimum, 32GB+ for production
# Set cache size to 25% of available RAM

# Disk recommendations
# Use SSDs for better I/O performance
# Separate data and WAL directories on different disks

# Example optimized configuration
[data]
  dir = "/var/lib/influxdb/data"
  wal-dir = "/var/lib/influxdb/wal"  # Put on separate SSD
  cache-max-memory-size = "8g"      # 25% of 32GB RAM
  cache-snapshot-memory-size = "256m"

# Network optimization
# Use dedicated network for cluster communication
# Ensure sufficient bandwidth for write load

# CPU optimization
# InfluxDB is CPU intensive for queries
# Use high-frequency CPUs
# Scale horizontally for better performance

Clustering

InfluxDB Enterprise Clustering

# Meta node configuration
[meta]
  dir = "/var/lib/influxdb/meta"
  bind-address = ":8088"
  http-bind-address = ":8091"

# Data node configuration
[meta]
  dir = "/var/lib/influxdb/meta"

[data]
  dir = "/var/lib/influxdb/data"
  wal-dir = "/var/lib/influxdb/wal"

[cluster]
  shard-writer-timeout = "5s"
  write-timeout = "10s"

# Start meta nodes first
sudo systemctl start influxdb-meta

# Join meta nodes to cluster
influxd-ctl add-meta meta1:8091
influxd-ctl add-meta meta2:8091
influxd-ctl add-meta meta3:8091

# Start data nodes
sudo systemctl start influxdb

# Add data nodes to cluster
influxd-ctl add-data data1:8088
influxd-ctl add-data data2:8088
influxd-ctl add-data data3:8088

# Show cluster status
influxd-ctl show

Cluster Management

# Show cluster members
influxd-ctl show

# Show shards
influxd-ctl show-shards

# Remove data node
influxd-ctl remove-data data3:8088

# Remove meta node
influxd-ctl remove-meta meta3:8091

# Update data node
influxd-ctl update-data data1:8088

# Copy shard
influxd-ctl copy-shard <source> <dest> <shard-id>

# Remove shard
influxd-ctl remove-shard <data-node> <shard-id>

# Truncate shards
influxd-ctl truncate-shards <delay>

Replication and Sharding

-- Create database with replication
CREATE DATABASE mydb WITH REPLICATION 2;

-- Show shard groups
SHOW SHARD GROUPS;

-- Create retention policy with replication
CREATE RETENTION POLICY "replicated" ON "mydb" DURATION 30d REPLICATION 2;

-- Show data distribution
SHOW SHARDS;

Best Practices

Schema Design Best Practices

-- Use tags for metadata (indexed, low cardinality)
-- Use fields for measured values (not indexed, high cardinality)

-- Good schema design
cpu_usage,host=server1,region=us-east,environment=prod usage_percent=85.2,load_avg=1.5 1609459200000000000

-- Tag guidelines:
-- - Use for filtering and grouping
-- - Keep cardinality low (< 100,000 unique values)
-- - Use consistent naming (lowercase, underscores)
-- - Avoid user IDs, session IDs, or other high cardinality data

-- Field guidelines:
-- - Use for actual measurements
-- - Can have high cardinality
-- - Use appropriate data types (float, integer, boolean, string)
-- - Avoid storing metadata in fields

Write Best Practices

# Batch writes for better performance
# Write 1000-10000 points per batch

# Use consistent timestamps
# Prefer nanosecond precision
# Use server timestamp when possible

# Example batch write
# NOTE: the URL must be quoted — an unquoted '&' would background the curl
# command and silently drop the precision=ns parameter.
curl -i -XPOST "http://localhost:8086/write?db=mydb&precision=ns" \
  --data-binary '
cpu_usage,host=server1 value=85.2 1609459200000000000
cpu_usage,host=server2 value=78.9 1609459200000000000
memory_usage,host=server1 value=1024 1609459200000000000
memory_usage,host=server2 value=2048 1609459200000000000
'

# Avoid writing historical data out of order
# Write data in chronological order when possible

# Use appropriate retention policies
# Don't keep data longer than necessary
# Use different retention policies for different data types

Query Best Practices

-- Always use time ranges
SELECT * FROM temperature WHERE time >= now() - 1h;

-- Use specific field names
SELECT temperature, humidity FROM sensors WHERE time >= now() - 1h;

-- Use tags for filtering
SELECT * FROM temperature WHERE location = 'office' AND time >= now() - 1h;

-- Use appropriate GROUP BY intervals
SELECT mean(temperature) FROM temperature WHERE time >= now() - 24h GROUP BY time(1h);

-- Use LIMIT for large result sets
SELECT * FROM temperature WHERE time >= now() - 1h ORDER BY time DESC LIMIT 100;

-- Use subqueries for complex operations
SELECT max(mean_temp) FROM (
  SELECT mean(temperature) AS mean_temp FROM temperature 
  WHERE time >= now() - 1h GROUP BY time(10m)
);

-- Avoid high cardinality GROUP BY
-- Good
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY location;
-- Bad (if user_id has high cardinality)
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY user_id;

Operational Best Practices

# Monitor disk space
# InfluxDB can grow quickly
# Set up alerts for disk usage > 80%

# Regular backups
# Automate backup process
# Test restore procedures

# Monitor performance
# Track write rates
# Monitor query performance
# Set up alerting for slow queries

# Capacity planning
# Monitor series cardinality
# Plan for data growth
# Consider retention policies

# Security
# Enable authentication
# Use HTTPS in production
# Limit network access
# Regular security updates

# Example monitoring script
#!/bin/bash
# Alert when InfluxDB disk usage or series cardinality crosses a threshold.

# Percentage used on the volume holding InfluxDB data (strip the trailing '%').
DISK_USAGE=$(df /var/lib/influxdb | tail -1 | awk '{print $5}' | sed 's/%//')
# Quote and default the value so the test does not break if df output is empty.
if [ "${DISK_USAGE:-0}" -gt 80 ]; then
  echo "Warning: InfluxDB disk usage is ${DISK_USAGE}%" >&2
fi

# Rough cardinality check. 'SHOW SERIES' output includes header lines, so the
# line count slightly overstates the true series count — fine for an alert.
SERIES_COUNT=$(influx -execute "SHOW SERIES" | wc -l)
if [ "${SERIES_COUNT:-0}" -gt 1000000 ]; then
  echo "Warning: High series cardinality: $SERIES_COUNT" >&2
fi

Performance Optimization

# Memory optimization
# Set appropriate cache sizes
# Monitor memory usage
# Use memory-mapped files for better performance

# Disk optimization
# Use SSDs for better I/O
# Separate WAL and data directories
# Monitor disk I/O

# Network optimization
# Use dedicated network for cluster traffic
# Optimize batch sizes
# Use compression for remote writes

# Query optimization
# Create appropriate continuous queries
# Use continuous queries as materialized views (InfluxDB has no native materialized views)
# Cache frequently accessed data

# Example optimized configuration
[data]
  cache-max-memory-size = "8g"
  cache-snapshot-memory-size = "256m"
  cache-snapshot-write-cold-duration = "10m"
  compact-full-write-cold-duration = "4h"
  max-concurrent-compactions = 3

[coordinator]
  write-timeout = "10s"
  max-concurrent-queries = 0
  query-timeout = "0s"

[http]
  max-body-size = 25000000
  max-concurrent-write-limit = 0
  max-enqueued-write-limit = 0

Summary

InfluxDB is a powerful time series database optimized for handling high-volume, time-stamped data. This cheatsheet provides comprehensive coverage of InfluxDB operations from basic concepts to advanced administration and optimization.

Key Strengths:
- Time Series Optimization: Purpose-built for time-stamped data
- High Write Performance: Handles millions of writes per second
- Flexible Query Language: Both InfluxQL and Flux for different use cases
- Built-in Downsampling: Continuous queries for data aggregation
- Horizontal Scaling: Enterprise clustering for large deployments

Best Use Cases:
- DevOps monitoring and metrics collection
- IoT sensor data and telemetry
- Real-time analytics and dashboards
- Application performance monitoring
- Financial and trading data

Important Considerations:
- Schema design significantly impacts performance
- Tag cardinality must be carefully managed
- Memory requirements can be substantial for high-cardinality data
- Clustering requires Enterprise Edition
- Query performance depends on proper indexing and time ranges

By following the practices and techniques outlined in this cheatsheet, you can effectively design, implement, and maintain InfluxDB deployments that provide high-performance time series data storage and analysis for modern monitoring and analytics applications.