InfluxDB Cheat Sheet¶
InfluxDB - Time Series Database
InfluxDB is an open-source time series database built to handle high write and query loads. It is designed for timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.
Table of Contents¶
- [Installation](#installation)
- [Basic Concepts](#basic-concepts)
- [InfluxQL](#influxql)
- [Flux Query Language](#flux-query-language)
- [Data Management](#data-management)
- [Retention Policies](#retention-policies)
- [Continuous Queries](#continuous-queries)
- [Administration](#administration)
- [Monitoring](#monitoring)
- [Performance Tuning](#performance-tuning)
- [Clustering](#clustering)
- [Best Practices](#best-practices)
Installation¶
Ubuntu/Debian Installation¶
# Add InfluxDB repository
curl -s https://repos.influxdata.com/influxdata-archive_compat.key | sudo apt-key add -
echo "deb https://repos.influxdata.com/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
# Update package index
sudo apt update
# Install InfluxDB
sudo apt install influxdb
# Start InfluxDB service
sudo systemctl start influxdb
sudo systemctl enable influxdb
# Check status
sudo systemctl status influxdb
# Verify installation
influx -version
CentOS/RHEL Installation¶
# Add InfluxDB repository
sudo tee /etc/yum.repos.d/influxdb.repo << EOF
[influxdb]
name = InfluxDB Repository - RHEL
baseurl = https://repos.influxdata.com/rhel/\$releasever/\$basearch/stable
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdata-archive_compat.key
EOF
# Install InfluxDB
sudo yum install influxdb
# Start InfluxDB service
sudo systemctl start influxdb
sudo systemctl enable influxdb
# Check status
sudo systemctl status influxdb
Docker Installation¶
# Pull InfluxDB image
docker pull influxdb:latest
# Run InfluxDB container
docker run -d \
--name influxdb \
-p 8086:8086 \
-v influxdb-data:/var/lib/influxdb \
-e INFLUXDB_DB=mydb \
-e INFLUXDB_ADMIN_USER=admin \
-e INFLUXDB_ADMIN_PASSWORD=password \
influxdb:latest
# Run InfluxDB 2.x
docker run -d \
--name influxdb2 \
-p 8086:8086 \
-v influxdb2-data:/var/lib/influxdb2 \
-e DOCKER_INFLUXDB_INIT_MODE=setup \
-e DOCKER_INFLUXDB_INIT_USERNAME=admin \
-e DOCKER_INFLUXDB_INIT_PASSWORD=password123 \
-e DOCKER_INFLUXDB_INIT_ORG=myorg \
-e DOCKER_INFLUXDB_INIT_BUCKET=mybucket \
influxdb:2.7
# Docker Compose setup
cat > docker-compose.yml << EOF
version: '3.8'
services:
influxdb:
image: influxdb:latest
container_name: influxdb
ports:
- "8086:8086"
environment:
- INFLUXDB_DB=mydb
- INFLUXDB_ADMIN_USER=admin
- INFLUXDB_ADMIN_PASSWORD=password
volumes:
- influxdb-data:/var/lib/influxdb
restart: unless-stopped
volumes:
influxdb-data:
EOF
docker-compose up -d
Manual Installation¶
# Download InfluxDB
wget https://dl.influxdata.com/influxdb/releases/influxdb-1.8.10_linux_amd64.tar.gz
# Extract
tar -xzf influxdb-1.8.10_linux_amd64.tar.gz
sudo mv influxdb-1.8.10-1 /opt/influxdb
# Create user and directories
sudo useradd --system --home-dir /var/lib/influxdb --shell /bin/false influxdb
sudo mkdir -p /var/lib/influxdb
sudo chown influxdb:influxdb /var/lib/influxdb
# Create systemd service
sudo tee /etc/systemd/system/influxdb.service << EOF
[Unit]
Description=InfluxDB is an open-source, distributed, time series database
Documentation=https://docs.influxdata.com/influxdb/
After=network-online.target
[Service]
User=influxdb
Group=influxdb
ExecStart=/opt/influxdb/usr/bin/influxd -config /etc/influxdb/influxdb.conf
KillMode=control-group
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl start influxdb
sudo systemctl enable influxdb
Basic Concepts¶
Key Concepts¶
# Time Series Data Structure
# measurement,tag_key1=tag_value1,tag_key2=tag_value2 field_key1=field_value1,field_key2=field_value2 timestamp
# Example:
# temperature,location=office,sensor=A temperature=23.5,humidity=45.2 1609459200000000000
# Components:
# - Measurement: Similar to table name (e.g., "temperature")
# - Tags: Indexed metadata (e.g., location=office, sensor=A)
# - Fields: Actual data values (e.g., temperature=23.5, humidity=45.2)
# - Timestamp: Unix nanosecond timestamp
# Database: Container for users, retention policies, continuous queries, and time series data
# Retention Policy: Describes how long InfluxDB keeps data and how many copies exist
# Series: Collection of data that share a measurement, tag set, and field key
# Point: Single data record with measurement, tag set, field set, and timestamp
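The line protocol rules above can be sketched in code. The helper below is hypothetical (not part of any InfluxDB client library); it assembles a point from a measurement, tags, and fields, applying the escaping and type rules: commas, equals signs, and spaces in tags are backslash-escaped, integer fields take an `i` suffix, string fields are double-quoted, and the timestamp is in nanoseconds.

```python
from datetime import datetime, timezone

def escape_tag(value: str) -> str:
    # Tag keys and values escape commas, equals signs, and spaces
    return value.replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")

def to_line(measurement, tags, fields, ts=None):
    # Measurement names escape commas and spaces
    m = measurement.replace(",", r"\,").replace(" ", r"\ ")
    tag_str = "".join(f",{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items()))
    parts = []
    for k, v in fields.items():
        if isinstance(v, bool):            # check bool before int (bool is an int subclass)
            parts.append(f"{k}={'true' if v else 'false'}")
        elif isinstance(v, int):
            parts.append(f"{k}={v}i")      # integer fields take an 'i' suffix
        elif isinstance(v, float):
            parts.append(f"{k}={v}")
        else:
            parts.append(f'{k}="{v}"')     # string fields are double-quoted
    line = f"{m}{tag_str} {','.join(parts)}"
    if ts is not None:
        line += f" {int(ts.timestamp()) * 1_000_000_000}"  # nanosecond timestamp
    return line

print(to_line("temperature",
              {"location": "office", "sensor": "A"},
              {"temperature": 23.5, "humidity": 45.2},
              datetime(2021, 1, 1, tzinfo=timezone.utc)))
# temperature,location=office,sensor=A temperature=23.5,humidity=45.2 1609459200000000000
```

Note that tags are sorted here; InfluxDB recommends sorted tag keys for best write performance.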
InfluxDB CLI¶
# Connect to InfluxDB
influx
# Connect with specific parameters
influx -host localhost -port 8086
# Connect with authentication
influx -username admin -password password
# Connect to specific database
influx -database mydb
# Execute single command
influx -execute "SHOW DATABASES"
# Execute commands from file
influx -import -path=data.txt -precision=ns
# Set precision
influx -precision rfc3339
Basic Commands¶
-- Show databases
SHOW DATABASES;
-- Create database
CREATE DATABASE mydb;
-- Use database
USE mydb;
-- Show measurements (tables)
SHOW MEASUREMENTS;
-- Show tag keys
SHOW TAG KEYS;
-- Show tag values
SHOW TAG VALUES FROM "temperature" WITH KEY = "location";
-- Show field keys
SHOW FIELD KEYS;
-- Show series
SHOW SERIES;
-- Show retention policies
SHOW RETENTION POLICIES;
-- Show users
SHOW USERS;
InfluxQL¶
Writing Data¶
-- Insert single point
INSERT temperature,location=office,sensor=A temperature=23.5,humidity=45.2
-- Insert with timestamp
INSERT temperature,location=office,sensor=A temperature=23.5,humidity=45.2 1609459200000000000
-- Insert multiple points
INSERT temperature,location=office,sensor=A temperature=23.5,humidity=45.2
INSERT temperature,location=office,sensor=B temperature=24.1,humidity=43.8
INSERT temperature,location=warehouse,sensor=C temperature=18.9,humidity=52.1
-- Insert with different field types
INSERT metrics,host=server1 cpu_usage=85.2,memory_usage=1024i,disk_full=false,status="running"
-- Batch insert using line protocol
# temperature,location=office,sensor=A temperature=23.5,humidity=45.2 1609459200000000000
# temperature,location=office,sensor=B temperature=24.1,humidity=43.8 1609459260000000000
# temperature,location=warehouse,sensor=C temperature=18.9,humidity=52.1 1609459320000000000
Querying Data¶
-- Basic SELECT
SELECT * FROM temperature;
-- Select specific fields
SELECT temperature, humidity FROM temperature;
-- Select with WHERE clause
SELECT * FROM temperature WHERE location = 'office';
-- Select with time range
SELECT * FROM temperature WHERE time >= '2023-01-01T00:00:00Z' AND time <= '2023-01-02T00:00:00Z';
-- Select with relative time
SELECT * FROM temperature WHERE time >= now() - 1h;
-- Select with LIMIT
SELECT * FROM temperature LIMIT 10;
-- Select with ORDER BY
SELECT * FROM temperature ORDER BY time DESC LIMIT 10;
-- Select with GROUP BY
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY location;
-- Select with GROUP BY time
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- Select with multiple GROUP BY
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m), location;
-- Select with FILL
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m) FILL(null);
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m) FILL(previous);
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m) FILL(linear);
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m) FILL(0);
Aggregation Functions¶
-- COUNT
SELECT COUNT(temperature) FROM temperature WHERE time >= now() - 1h;
-- MEAN
SELECT MEAN(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- MEDIAN
SELECT MEDIAN(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- MODE
SELECT MODE(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- SPREAD
SELECT SPREAD(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- STDDEV
SELECT STDDEV(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- SUM
SELECT SUM(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- MIN and MAX
SELECT MIN(temperature), MAX(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- FIRST and LAST
SELECT FIRST(temperature), LAST(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- PERCENTILE
SELECT PERCENTILE(temperature, 95) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
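Conceptually, `GROUP BY time(10m)` buckets points into fixed, epoch-aligned windows and applies the aggregate to each bucket. A minimal Python sketch of that behavior (an illustration, not InfluxDB's actual implementation):

```python
def downsample(points, window_s, agg):
    """Bucket (timestamp_s, value) pairs into fixed epoch-aligned windows
    and aggregate each bucket, mimicking GROUP BY time()."""
    buckets = {}
    for ts, value in points:
        start = ts - ts % window_s          # start of the window this point falls in
        buckets.setdefault(start, []).append(value)
    return {start: agg(vals) for start, vals in sorted(buckets.items())}

mean = lambda vs: sum(vs) / len(vs)
points = [(0, 20.0), (120, 22.0), (610, 25.0), (900, 23.0)]
print(downsample(points, 600, mean))        # 10-minute (600 s) windows
# {0: 21.0, 600: 24.0}
```

Swapping `mean` for `max`, `min`, or `len` mirrors the other aggregation functions above.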
Selector Functions¶
-- TOP
SELECT TOP(temperature, 5) FROM temperature WHERE time >= now() - 1h;
-- BOTTOM
SELECT BOTTOM(temperature, 5) FROM temperature WHERE time >= now() - 1h;
-- FIRST
SELECT FIRST(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY location;
-- LAST
SELECT LAST(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY location;
-- MAX
SELECT MAX(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- MIN
SELECT MIN(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- SAMPLE
SELECT SAMPLE(temperature, 3) FROM temperature WHERE time >= now() - 1h;
Transformation Functions¶
-- ABS
SELECT ABS(temperature) FROM temperature WHERE time >= now() - 1h;
-- ACOS, ASIN, ATAN
SELECT ACOS(temperature/100), ASIN(temperature/100), ATAN(temperature/100) FROM temperature WHERE time >= now() - 1h;
-- CEIL, FLOOR, ROUND
SELECT CEIL(temperature), FLOOR(temperature), ROUND(temperature) FROM temperature WHERE time >= now() - 1h;
-- COS, SIN, TAN
SELECT COS(temperature), SIN(temperature), TAN(temperature) FROM temperature WHERE time >= now() - 1h;
-- DERIVATIVE
SELECT DERIVATIVE(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- DIFFERENCE
SELECT DIFFERENCE(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- ELAPSED
SELECT ELAPSED(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- EXP, LN, LOG
SELECT EXP(temperature/10), LN(temperature), LOG(temperature, 10) FROM temperature WHERE time >= now() - 1h;
-- MOVING_AVERAGE
SELECT MOVING_AVERAGE(temperature, 3) FROM temperature WHERE time >= now() - 1h GROUP BY time(1m);
-- NON_NEGATIVE_DERIVATIVE
SELECT NON_NEGATIVE_DERIVATIVE(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- POW, SQRT
SELECT POW(temperature, 2), SQRT(temperature) FROM temperature WHERE time >= now() - 1h;
Subqueries¶
-- Basic subquery
SELECT mean FROM (SELECT MEAN(temperature) AS mean FROM temperature WHERE time >= now() - 1h GROUP BY time(10m));
-- Subquery with WHERE
SELECT * FROM (SELECT MEAN(temperature) AS mean FROM temperature WHERE time >= now() - 1h GROUP BY time(10m)) WHERE mean > 20;
-- Nested subqueries
SELECT max FROM (SELECT MAX(mean) AS max FROM (SELECT MEAN(temperature) AS mean FROM temperature WHERE time >= now() - 1h GROUP BY time(10m)));
-- Subquery with mathematical operations
SELECT mean * 1.8 + 32 AS fahrenheit FROM (SELECT MEAN(temperature) AS mean FROM temperature WHERE time >= now() - 1h GROUP BY time(10m));
Regular Expressions¶
-- Match tag values with regex
SELECT * FROM temperature WHERE location =~ /^office.*/;
-- Match field values with regex
SELECT * FROM logs WHERE message =~ /error|warning/;
-- Case-insensitive regex
SELECT * FROM temperature WHERE location =~ /(?i)office/;
-- Negated regex
SELECT * FROM temperature WHERE location !~ /warehouse/;
Flux Query Language¶
Basic Flux Syntax¶
// Import packages
import "csv"
import "json"
import "math"
// Basic query structure
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> filter(fn: (r) => r.location == "office")
// Variables
start_time = -1h
measurement_name = "temperature"
from(bucket: "mybucket")
|> range(start: start_time)
|> filter(fn: (r) => r._measurement == measurement_name)
// Comments
// Single line comment
/* Multi-line
comment */
Selecting Data¶
// Basic data selection
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
// Multiple filters
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> filter(fn: (r) => r.location == "office")
|> filter(fn: (r) => r._field == "temperature")
// Time range with absolute times
from(bucket: "mybucket")
|> range(start: 2023-01-01T00:00:00Z, stop: 2023-01-02T00:00:00Z)
// Complex filter conditions
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature" and (r.location == "office" or r.location == "warehouse"))
// Filter by field value
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> filter(fn: (r) => r._value > 20.0)
Aggregations¶
// Mean aggregation
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> aggregateWindow(every: 10m, fn: mean)
// Multiple aggregations
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> aggregateWindow(every: 10m, fn: mean)
|> yield(name: "mean")
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> aggregateWindow(every: 10m, fn: max)
|> yield(name: "max")
// Group by tags
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> group(columns: ["location"])
|> aggregateWindow(every: 10m, fn: mean)
// Custom aggregation function
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> aggregateWindow(every: 10m, fn: (tables=<-, column) =>
tables |> quantile(q: 0.95, column: column))
Transformations¶
// Map function
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> map(fn: (r) => ({ r with _value: r._value * 1.8 + 32.0 }))
// Rename columns
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> rename(columns: {_value: "temperature_celsius"})
// Add new columns
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> map(fn: (r) => ({ r with fahrenheit: r._value * 1.8 + 32.0 }))
// Drop columns
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> drop(columns: ["_start", "_stop"])
// Keep only specific columns
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> keep(columns: ["_time", "_value", "location"])
// Sort data
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> sort(columns: ["_time"], desc: false)
// Limit results
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> limit(n: 10)
Joins and Unions¶
// Join two streams
temp_data = from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
humidity_data = from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "humidity")
join(tables: {temp: temp_data, humidity: humidity_data}, on: ["_time", "location"])
// Union multiple streams
stream1 = from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
stream2 = from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "pressure")
union(tables: [stream1, stream2])
// Pivot data
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "sensors")
|> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
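`pivot()` collapses narrow per-field rows into one wide row per timestamp. A rough Python equivalent of `pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")`, shown only to illustrate the reshaping:

```python
def pivot(rows):
    """Collapse narrow (time, field, value) rows into one wide row per
    timestamp, like Flux's pivot() with _time as row key and _field as
    column key."""
    wide = {}
    for t, field, value in rows:
        wide.setdefault(t, {"_time": t})[field] = value
    return [wide[t] for t in sorted(wide)]

rows = [(1, "temperature", 23.5), (1, "humidity", 45.2), (2, "temperature", 24.1)]
print(pivot(rows))
# [{'_time': 1, 'temperature': 23.5, 'humidity': 45.2}, {'_time': 2, 'temperature': 24.1}]
```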
Window Functions¶
// Time-based windows
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> window(every: 10m)
|> mean()
|> duplicate(column: "_stop", as: "_time")
|> window(every: inf)
// Moving average
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> movingAverage(n: 5)
// Exponential moving average
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> exponentialMovingAverage(n: 5)
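`exponentialMovingAverage()` weights recent points more heavily than old ones. Below is a generic EMA sketch using the common smoothing factor k = 2/(n+1), seeded with the first raw value; Flux's exact seeding of the first n points may differ from this simplified version:

```python
def ema(values, n):
    """Exponential moving average with smoothing k = 2/(n+1).
    Seeding varies between implementations; this sketch seeds with the
    first raw value."""
    k = 2 / (n + 1)
    out = []
    prev = None
    for v in values:
        prev = v if prev is None else v * k + prev * (1 - k)
        out.append(prev)
    return out

print(ema([10.0, 20.0, 30.0], 2))
```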
// Derivative
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "counter")
|> derivative(unit: 1s, nonNegative: true)
// Difference
from(bucket: "mybucket")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "temperature")
|> difference(nonNegative: false)
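`derivative()` computes the per-unit rate of change between consecutive points; with `nonNegative: true`, negative rates (such as counter resets) are dropped. In Python terms:

```python
def derivative(points, unit_s=1.0, non_negative=False):
    """Per-unit rate of change between consecutive (timestamp_s, value)
    points, mirroring Flux's derivative()."""
    out = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        rate = (v1 - v0) / (t1 - t0) * unit_s
        if non_negative and rate < 0:
            continue  # nonNegative drops negative rates (e.g. counter resets)
        out.append((t1, rate))
    return out

samples = [(0, 100.0), (10, 150.0), (20, 30.0), (30, 90.0)]
print(derivative(samples, unit_s=1.0, non_negative=True))
# [(10, 5.0), (30, 6.0)]
```

`difference()` is the same idea without dividing by the elapsed time.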
Data Management¶
Database Operations¶
-- Create database
CREATE DATABASE mydb;
-- Create database with retention policy
CREATE DATABASE mydb WITH DURATION 30d REPLICATION 1 SHARD DURATION 1h NAME "default";
-- Drop database
DROP DATABASE mydb;
-- Show databases
SHOW DATABASES;
-- Use database
USE mydb;
Measurement Operations¶
-- Show measurements
SHOW MEASUREMENTS;
-- Show measurements with regex
SHOW MEASUREMENTS WITH MEASUREMENT =~ /temp.*/;
-- Show measurements on specific database
SHOW MEASUREMENTS ON mydb;
-- Drop measurement
DROP MEASUREMENT temperature;
-- Show series
SHOW SERIES;
-- Show series from specific measurement
SHOW SERIES FROM temperature;
-- Show series with WHERE clause
SHOW SERIES FROM temperature WHERE location = 'office';
-- Drop series
DROP SERIES FROM temperature WHERE location = 'office';
Field and Tag Operations¶
-- Show field keys
SHOW FIELD KEYS;
-- Show field keys from specific measurement
SHOW FIELD KEYS FROM temperature;
-- Show tag keys
SHOW TAG KEYS;
-- Show tag keys from specific measurement
SHOW TAG KEYS FROM temperature;
-- Show tag values
SHOW TAG VALUES FROM temperature WITH KEY = "location";
-- Show tag values with WHERE clause
SHOW TAG VALUES FROM temperature WITH KEY = "location" WHERE location =~ /office.*/;
-- Show tag values for multiple keys
SHOW TAG VALUES FROM temperature WITH KEY IN ("location", "sensor");
Deleting Data¶
-- Delete data with WHERE clause
DELETE FROM temperature WHERE location = 'office';
-- Delete data with time range
DELETE FROM temperature WHERE time < '2023-01-01T00:00:00Z';
-- Delete data with multiple conditions
DELETE FROM temperature WHERE location = 'office' AND time < '2023-01-01T00:00:00Z';
-- Delete all data from measurement
DELETE FROM temperature;
-- Drop measurement (deletes all data and metadata)
DROP MEASUREMENT temperature;
-- Drop series
DROP SERIES FROM temperature WHERE location = 'office';
Retention Policies¶
Creating Retention Policies¶
-- Create retention policy
CREATE RETENTION POLICY "one_week" ON "mydb" DURATION 7d REPLICATION 1;
-- Create retention policy with shard duration
CREATE RETENTION POLICY "one_month" ON "mydb" DURATION 30d REPLICATION 1 SHARD DURATION 1h;
-- Create default retention policy
CREATE RETENTION POLICY "default" ON "mydb" DURATION 365d REPLICATION 1 DEFAULT;
-- Show retention policies
SHOW RETENTION POLICIES;
-- Show retention policies for specific database
SHOW RETENTION POLICIES ON mydb;
Managing Retention Policies¶
-- Alter retention policy
ALTER RETENTION POLICY "one_week" ON "mydb" DURATION 14d;
-- Alter retention policy replication
ALTER RETENTION POLICY "one_week" ON "mydb" REPLICATION 2;
-- Set as default retention policy
ALTER RETENTION POLICY "one_week" ON "mydb" DEFAULT;
-- Drop retention policy
DROP RETENTION POLICY "one_week" ON "mydb";
Writing to a Specific Retention Policy¶
-- Write to a specific retention policy (run USE mydb first)
INSERT INTO "one_week" temperature,location=office,sensor=A temperature=23.5
-- Query with a fully qualified retention policy and measurement
SELECT * FROM mydb."one_week"."temperature";
Continuous Queries¶
Creating Continuous Queries¶
-- Basic continuous query
CREATE CONTINUOUS QUERY "cq_mean_temp" ON "mydb"
BEGIN
SELECT mean(temperature) INTO "average_temperature" FROM "temperature" GROUP BY time(10m)
END;
-- Continuous query with WHERE clause
CREATE CONTINUOUS QUERY "cq_office_temp" ON "mydb"
BEGIN
SELECT mean(temperature) INTO "office_avg_temp" FROM "temperature" WHERE location = 'office' GROUP BY time(10m)
END;
-- Continuous query with tags
CREATE CONTINUOUS QUERY "cq_temp_by_location" ON "mydb"
BEGIN
SELECT mean(temperature) INTO "avg_temp_by_location" FROM "temperature" GROUP BY time(10m), location
END;
-- Continuous query with retention policy
CREATE CONTINUOUS QUERY "cq_downsampled" ON "mydb"
BEGIN
SELECT mean(temperature) INTO "mydb"."one_month"."avg_temperature" FROM "mydb"."autogen"."temperature" GROUP BY time(1h)
END;
-- Continuous query with RESAMPLE
CREATE CONTINUOUS QUERY "cq_resample" ON "mydb"
RESAMPLE EVERY 30s FOR 2m
BEGIN
SELECT mean(temperature) INTO "avg_temperature" FROM "temperature" GROUP BY time(1m)
END;
Managing Continuous Queries¶
-- Show continuous queries
SHOW CONTINUOUS QUERIES;
-- Show continuous queries for specific database
SHOW CONTINUOUS QUERIES ON mydb;
-- Drop continuous query
DROP CONTINUOUS QUERY "cq_mean_temp" ON "mydb";
Advanced Continuous Queries¶
-- Continuous query with multiple aggregations
CREATE CONTINUOUS QUERY "cq_temp_stats" ON "mydb"
BEGIN
SELECT mean(temperature) AS mean_temp, max(temperature) AS max_temp, min(temperature) AS min_temp
INTO "temperature_stats" FROM "temperature" GROUP BY time(10m), location
END;
-- Continuous query with mathematical operations
CREATE CONTINUOUS QUERY "cq_temp_fahrenheit" ON "mydb"
BEGIN
SELECT mean(temperature) * 1.8 + 32 AS temp_fahrenheit INTO "temperature_fahrenheit" FROM "temperature" GROUP BY time(10m)
END;
-- Continuous query with conditional logic
CREATE CONTINUOUS QUERY "cq_high_temp_alert" ON "mydb"
BEGIN
SELECT mean(temperature) INTO "high_temperature_alerts" FROM "temperature" WHERE temperature > 30 GROUP BY time(5m)
END;
Administration¶
User Management¶
-- Create admin user
CREATE USER admin WITH PASSWORD 'password' WITH ALL PRIVILEGES;
-- Create regular user
CREATE USER user1 WITH PASSWORD 'password';
-- Show users
SHOW USERS;
-- Grant privileges
GRANT ALL ON mydb TO user1;
GRANT READ ON mydb TO user1;
GRANT WRITE ON mydb TO user1;
-- Revoke privileges
REVOKE ALL PRIVILEGES FROM user1;
REVOKE ALL ON mydb FROM user1;
-- Change password
SET PASSWORD FOR user1 = 'newpassword';
-- Drop user
DROP USER user1;
Authentication and Authorization¶
# Enable authentication in influxdb.conf
[http]
auth-enabled = true
# Create admin user (first user)
influx
CREATE USER admin WITH PASSWORD 'password' WITH ALL PRIVILEGES
# Connect with authentication
influx -username admin -password password
# HTTP API with authentication
curl -G http://localhost:8086/query \
-u admin:password \
--data-urlencode "q=SHOW DATABASES"
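The curl call above maps to a standard HTTP Basic Auth header. A Python sketch that builds (but does not send) the equivalent authenticated request, so the URL encoding and header can be inspected:

```python
import base64
import urllib.parse
import urllib.request

def query_request(host, query, user, password):
    """Build (without sending) an authenticated /query request, the
    equivalent of `curl -u user:password --data-urlencode "q=..."`."""
    url = f"{host}/query?q={urllib.parse.quote(query)}"
    req = urllib.request.Request(url)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req

req = query_request("http://localhost:8086", "SHOW DATABASES", "admin", "password")
print(req.full_url)                      # http://localhost:8086/query?q=SHOW%20DATABASES
print(req.get_header("Authorization"))
```

Sending it would be `urllib.request.urlopen(req)` against a running server.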
Configuration Management¶
# Main configuration file
/etc/influxdb/influxdb.conf
# Key configuration sections:
# [meta]
# dir = "/var/lib/influxdb/meta"
# [data]
# dir = "/var/lib/influxdb/data"
# wal-dir = "/var/lib/influxdb/wal"
# max-series-per-database = 1000000
# max-values-per-tag = 100000
# [coordinator]
# write-timeout = "10s"
# max-concurrent-queries = 0
# query-timeout = "0s"
# log-queries-after = "0s"
# max-select-point = 0
# max-select-series = 0
# max-select-buckets = 0
# [retention]
# enabled = true
# check-interval = "30m"
# [shard-precreation]
# enabled = true
# check-interval = "10m"
# advance-period = "30m"
# [monitor]
# store-enabled = true
# store-database = "_internal"
# store-interval = "10s"
# [http]
# enabled = true
# bind-address = ":8086"
# auth-enabled = false
# log-enabled = true
# write-tracing = false
# pprof-enabled = true
# https-enabled = false
# https-certificate = "/etc/ssl/influxdb.pem"
# [logging]
# format = "auto"
# level = "info"
# suppress-logo = false
# Restart after configuration changes
sudo systemctl restart influxdb
Backup and Restore¶
# Backup database
influxd backup -portable -database mydb /backup/location/
# Backup all databases
influxd backup -portable /backup/location/
# Backup with retention policy
influxd backup -portable -database mydb -retention autogen /backup/location/
# Backup with time range
influxd backup -portable -database mydb -start 2023-01-01T00:00:00Z -end 2023-01-02T00:00:00Z /backup/location/
# Restore database
influxd restore -portable -db mydb /backup/location/
# Restore to different database
influxd restore -portable -db mydb -newdb mynewdb /backup/location/
# Restore with retention policy mapping
influxd restore -portable -db mydb -rp autogen -newrp newrp /backup/location/
# Online backup (Enterprise)
influxd backup -online -database mydb /backup/location/
# Incremental backup (Enterprise)
influxd backup -online -database mydb -incremental /backup/location/
Monitoring¶
Internal Monitoring¶
-- Show internal statistics
SHOW STATS;
-- Show specific stats
SHOW STATS FOR 'httpd';
SHOW STATS FOR 'write';
SHOW STATS FOR 'queryExecutor';
-- Query internal database
USE _internal;
SHOW MEASUREMENTS;
-- Monitor write statistics
SELECT * FROM "write" WHERE time > now() - 1h;
-- Monitor query statistics
SELECT * FROM "queryExecutor" WHERE time > now() - 1h;
-- Monitor HTTP statistics
SELECT * FROM "httpd" WHERE time > now() - 1h;
-- Monitor runtime statistics
SELECT * FROM "runtime" WHERE time > now() - 1h;
Performance Monitoring¶
-- Monitor query performance
SELECT mean("queryDurationNs") FROM "queryExecutor" WHERE time > now() - 1h GROUP BY time(10m);
-- Monitor write throughput
SELECT sum("writeReq") FROM "write" WHERE time > now() - 1h GROUP BY time(10m);
-- Monitor memory usage
SELECT mean("Alloc"), mean("TotalAlloc") FROM "runtime" WHERE time > now() - 1h GROUP BY time(10m);
-- Monitor goroutines
SELECT mean("NumGoroutine") FROM "runtime" WHERE time > now() - 1h GROUP BY time(10m);
-- Monitor shard statistics
SELECT * FROM "shard" WHERE time > now() - 1h;
-- Monitor TSM file statistics
SELECT * FROM "tsm1_filestore" WHERE time > now() - 1h;
System Monitoring¶
# Monitor InfluxDB process
ps aux | grep influxd
# Monitor memory usage
free -h
# Monitor disk usage
df -h /var/lib/influxdb
# Monitor disk I/O
iostat -x 1
# Monitor network connections
netstat -tulpn | grep :8086
# Monitor log files
tail -f /var/log/influxdb/influxdb.log
# Monitor with systemd
journalctl -u influxdb -f
# Check service status
systemctl status influxdb
HTTP API Monitoring¶
# Health check
curl -i http://localhost:8086/health
# Ping endpoint
curl -i http://localhost:8086/ping
# Query endpoint
curl -G http://localhost:8086/query \
--data-urlencode "q=SHOW DATABASES"
# Write endpoint
curl -i -XPOST http://localhost:8086/write?db=mydb \
--data-binary 'temperature,location=office value=23.5'
# Metrics endpoint (Prometheus format)
curl http://localhost:8086/metrics
# Debug endpoints
curl http://localhost:8086/debug/pprof/
curl http://localhost:8086/debug/pprof/goroutine
curl http://localhost:8086/debug/pprof/heap
Performance Tuning¶
Write Performance¶
# Batch writes
# Write multiple points in single request
curl -i -XPOST http://localhost:8086/write?db=mydb \
--data-binary '
temperature,location=office,sensor=A value=23.5
temperature,location=office,sensor=B value=24.1
temperature,location=warehouse,sensor=C value=18.9
'
# Use UDP for high-throughput writes
# Enable UDP in influxdb.conf
[udp]
enabled = true
bind-address = ":8089"
database = "mydb"
batch-size = 5000
batch-timeout = "1s"
# Write via UDP
echo "temperature,location=office value=23.5" | nc -u localhost 8089
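UDP writes are fire-and-forget: there is no response and no error reporting. A self-contained Python sketch of the send path; a local listener on an ephemeral port stands in here for influxd's [udp] listener (port 8089 above), purely so the example runs without a server:

```python
import socket

# Stand-in for InfluxDB's UDP listener, so the send can be shown locally;
# real writes go to the port configured under [udp] (8089 above).
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))             # ephemeral port instead of 8089
port = server.getsockname()[1]

line = b"temperature,location=office value=23.5"
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(line, ("127.0.0.1", port))  # fire-and-forget: no response, no errors

data, _ = server.recvfrom(4096)
print(data.decode())
client.close()
server.close()
```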
# Optimize write configuration
[data]
cache-max-memory-size = "1g"
cache-snapshot-memory-size = "25m"
cache-snapshot-write-cold-duration = "10m"
compact-full-write-cold-duration = "4h"
max-series-per-database = 1000000
max-values-per-tag = 100000
[coordinator]
write-timeout = "10s"
max-concurrent-queries = 0
Query Performance¶
-- Use time ranges in queries
SELECT * FROM temperature WHERE time >= now() - 1h;
-- Use specific field names instead of *
SELECT temperature, humidity FROM temperature WHERE time >= now() - 1h;
-- Use LIMIT to reduce result set
SELECT * FROM temperature WHERE time >= now() - 1h LIMIT 1000;
-- Use appropriate GROUP BY intervals
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- Avoid SELECT * with GROUP BY
-- Good
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- Bad
SELECT * FROM temperature WHERE time >= now() - 1h GROUP BY time(10m);
-- Use tags for filtering (indexed)
SELECT * FROM temperature WHERE location = 'office' AND time >= now() - 1h;
-- Avoid regex on high cardinality tags
-- Good
SELECT * FROM temperature WHERE location = 'office';
-- Bad (if location has high cardinality)
SELECT * FROM temperature WHERE location =~ /office.*/;
Schema Design¶
-- Use appropriate tag cardinality
-- Good: Low cardinality tags
temperature,location=office,sensor=A temperature=23.5
temperature,location=warehouse,sensor=B temperature=24.1
-- Bad: High cardinality tags (avoid)
-- temperature,user_id=12345,session_id=abcdef temperature=23.5
-- Use fields for high cardinality data
-- Good
temperature,location=office user_id=12345i,session_id="abcdef",temperature=23.5
-- Normalize tag values
-- Good
temperature,location=office,status=active temperature=23.5
-- Bad
temperature,location="Office Building A",status="Currently Active" temperature=23.5
-- Use consistent naming conventions
-- Good
cpu_usage,host=server1,region=us-east value=85.2
memory_usage,host=server1,region=us-east value=1024i
-- Bad
cpuUsage,Host=Server1,REGION=US-EAST Value=85.2
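Enforcing the convention mechanically helps keep schemas consistent. A small, hypothetical helper that rewrites identifiers into the lowercase snake_case style of the good examples above:

```python
import re

def normalize_name(name: str) -> str:
    """Rewrite an identifier into the lowercase snake_case convention
    shown above (camelCase split, spaces replaced, case folded)."""
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)  # cpuUsage -> cpu_Usage
    s = re.sub(r"\s+", "_", s)                        # spaces -> underscores
    return s.lower()

for raw in ["cpuUsage", "Host", "US-EAST", "Office Building A"]:
    print(normalize_name(raw))
# cpu_usage / host / us-east / office_building_a
```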
Hardware Optimization¶
# Memory recommendations
# RAM: 8GB minimum, 32GB+ for production
# Set cache size to 25% of available RAM
# Disk recommendations
# Use SSDs for better I/O performance
# Separate data and WAL directories on different disks
# Example optimized configuration
[data]
dir = "/var/lib/influxdb/data"
wal-dir = "/var/lib/influxdb/wal" # Put on separate SSD
cache-max-memory-size = "8g" # 25% of 32GB RAM
cache-snapshot-memory-size = "256m"
# Network optimization
# Use dedicated network for cluster communication
# Ensure sufficient bandwidth for write load
# CPU optimization
# InfluxDB is CPU intensive for queries
# Use high-frequency CPUs
# Scale horizontally for better performance
Clustering¶
InfluxDB Enterprise Clustering¶
# Meta node configuration
[meta]
dir = "/var/lib/influxdb/meta"
bind-address = ":8088"
http-bind-address = ":8091"
# Data node configuration
[meta]
dir = "/var/lib/influxdb/meta"
[data]
dir = "/var/lib/influxdb/data"
wal-dir = "/var/lib/influxdb/wal"
[cluster]
shard-writer-timeout = "5s"
write-timeout = "10s"
# Start meta nodes first
sudo systemctl start influxdb-meta
# Join meta nodes to cluster
influxd-ctl add-meta meta1:8091
influxd-ctl add-meta meta2:8091
influxd-ctl add-meta meta3:8091
# Start data nodes
sudo systemctl start influxdb
# Add data nodes to cluster
influxd-ctl add-data data1:8088
influxd-ctl add-data data2:8088
influxd-ctl add-data data3:8088
# Show cluster status
influxd-ctl show
Cluster Management¶
# Show cluster members
influxd-ctl show
# Show shards
influxd-ctl show-shards
# Remove data node
influxd-ctl remove-data data3:8088
# Remove meta node
influxd-ctl remove-meta meta3:8091
# Update data node
influxd-ctl update-data data1:8088
# Copy shard
influxd-ctl copy-shard <source> <dest> <shard-id>
# Remove shard
influxd-ctl remove-shard <data-node> <shard-id>
# Truncate shards
influxd-ctl truncate-shards <delay>
Replication and Sharding¶
-- Create database with replication
CREATE DATABASE mydb WITH REPLICATION 2;
-- Show shard groups
SHOW SHARD GROUPS;
-- Create retention policy with replication
CREATE RETENTION POLICY "replicated" ON "mydb" DURATION 30d REPLICATION 2;
-- Show data distribution
SHOW SHARDS;
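Shard group duration is picked automatically from the retention policy duration. A minimal sketch of the default mapping InfluxDB 1.x applies (the helper name is ours; the thresholds follow the documented defaults):

```python
from datetime import timedelta

def default_shard_group_duration(rp_duration: timedelta) -> timedelta:
    """Default shard group duration InfluxDB 1.x derives
    from a retention policy's duration."""
    if rp_duration < timedelta(days=2):
        return timedelta(hours=1)      # short RPs get 1-hour shard groups
    if rp_duration <= timedelta(days=180):  # roughly 6 months
        return timedelta(days=1)
    return timedelta(days=7)

# A 30d retention policy, like "replicated" above, gets 1-day shard groups.
thirty_day = default_shard_group_duration(timedelta(days=30))
```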
Best Practices¶
Schema Design Best Practices¶
-- Use tags for metadata (indexed, low cardinality)
-- Use fields for measured values (not indexed, high cardinality)
-- Good schema design
cpu_usage,host=server1,region=us-east,environment=prod usage_percent=85.2,load_avg=1.5 1609459200000000000
-- Tag guidelines:
-- - Use for filtering and grouping
-- - Keep cardinality low (< 100,000 unique values)
-- - Use consistent naming (lowercase, underscores)
-- - Avoid user IDs, session IDs, or other high cardinality data
-- Field guidelines:
-- - Use for actual measurements
-- - Can have high cardinality
-- - Use appropriate data types (float, integer, boolean, string)
-- - Avoid storing metadata in fields
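The tag/field split above can be made concrete with a small helper that assembles one line-protocol point (a sketch, not an official client API; escaping of commas, spaces, and equals signs follows the line protocol rules):

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one InfluxDB line-protocol point.
    Tags are indexed metadata (keep cardinality low);
    fields hold the measured values."""
    def esc(s):
        # Escape characters that are special in measurement/tag names
        return s.replace(",", r"\,").replace(" ", r"\ ").replace("=", r"\=")

    def fmt(v):
        # Field values are typed: bool, integer (with 'i' suffix),
        # float, or quoted string
        if isinstance(v, bool):
            return "true" if v else "false"
        if isinstance(v, int):
            return f"{v}i"
        if isinstance(v, float):
            return repr(v)
        return '"' + str(v).replace('"', r'\"') + '"'

    tag_str = ",".join(f"{esc(k)}={esc(v)}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{esc(k)}={fmt(v)}" for k, v in fields.items())
    return f"{esc(measurement)},{tag_str} {field_str} {timestamp_ns}"

line = to_line_protocol(
    "cpu_usage",
    {"host": "server1", "region": "us-east"},
    {"usage_percent": 85.2, "load_avg": 1.5},
    1609459200000000000,
)
```

Sorting tag keys matches the server's internal ordering and makes output deterministic.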
Write Best Practices¶
# Batch writes for better performance
# Write 1000-10000 points per batch
# Use consistent timestamps
# Prefer nanosecond precision
# Use server timestamp when possible
# Example batch write
curl -i -XPOST "http://localhost:8086/write?db=mydb&precision=ns" \
--data-binary '
cpu_usage,host=server1 value=85.2 1609459200000000000
cpu_usage,host=server2 value=78.9 1609459200000000000
memory_usage,host=server1 value=1024 1609459200000000000
memory_usage,host=server2 value=2048 1609459200000000000
'
# Avoid writing historical data out of order
# Write data in chronological order when possible
# Use appropriate retention policies
# Don't keep data longer than necessary
# Use different retention policies for different data types
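A minimal batching sketch of the 1000-10000 points-per-batch advice (the HTTP send is stubbed as a callable here; in practice you would POST the joined lines to `/write` as in the curl example above):

```python
class BatchWriter:
    """Accumulate line-protocol points and flush them in batches."""
    def __init__(self, send, batch_size=5000):
        self.send = send            # callable receiving the joined payload
        self.batch_size = batch_size
        self.buffer = []

    def write(self, line):
        self.buffer.append(line)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Join buffered points into one newline-separated payload
        if self.buffer:
            self.send("\n".join(self.buffer))
            self.buffer = []

# Usage with a stubbed sender that just records payloads
payloads = []
w = BatchWriter(payloads.append, batch_size=2)
w.write("cpu_usage,host=server1 value=85.2 1609459200000000000")
w.write("cpu_usage,host=server2 value=78.9 1609459200000000000")
w.write("memory_usage,host=server1 value=1024 1609459200000000000")
w.flush()   # flush the remaining partial batch
```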
Query Best Practices¶
-- Always use time ranges
SELECT * FROM temperature WHERE time >= now() - 1h;
-- Use specific field names
SELECT temperature, humidity FROM sensors WHERE time >= now() - 1h;
-- Use tags for filtering
SELECT * FROM temperature WHERE location = 'office' AND time >= now() - 1h;
-- Use appropriate GROUP BY intervals
SELECT mean(temperature) FROM temperature WHERE time >= now() - 24h GROUP BY time(1h);
-- Use LIMIT for large result sets
SELECT * FROM temperature WHERE time >= now() - 1h ORDER BY time DESC LIMIT 100;
-- Use subqueries for complex operations
SELECT max(mean_temp) FROM (
SELECT mean(temperature) AS mean_temp FROM temperature
WHERE time >= now() - 1h GROUP BY time(10m)
);
-- Avoid high cardinality GROUP BY
-- Good
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY location;
-- Bad (if user_id has high cardinality)
SELECT mean(temperature) FROM temperature WHERE time >= now() - 1h GROUP BY user_id;
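The rules above can be encoded in a tiny query-builder sketch that always bounds time and row count (names and structure are ours, not an official client API):

```python
def build_query(measurement, fields=None, where=None, last="1h", limit=100):
    """Assemble an InfluxQL query that always includes a time range
    and a LIMIT, per the query best practices."""
    cols = ", ".join(fields) if fields else "*"
    q = f"SELECT {cols} FROM {measurement} WHERE time >= now() - {last}"
    if where:
        q += f" AND {where}"        # extra tag filters, e.g. location
    return q + f" LIMIT {limit}"

q = build_query("temperature", fields=["temperature", "humidity"],
                where="location = 'office'", last="24h", limit=50)
```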
Operational Best Practices¶
# Monitor disk space
# InfluxDB can grow quickly
# Set up alerts for disk usage > 80%
# Regular backups
# Automate backup process
# Test restore procedures
# Monitor performance
# Track write rates
# Monitor query performance
# Set up alerting for slow queries
# Capacity planning
# Monitor series cardinality
# Plan for data growth
# Consider retention policies
# Security
# Enable authentication
# Use HTTPS in production
# Limit network access
# Regular security updates
# Example monitoring script
#!/bin/bash
DISK_USAGE=$(df /var/lib/influxdb | tail -1 | awk '{print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -gt 80 ]; then
echo "Warning: InfluxDB disk usage is ${DISK_USAGE}%"
fi
SERIES_COUNT=$(influx -database "mydb" -execute "SHOW SERIES" | wc -l)
if [ "$SERIES_COUNT" -gt 1000000 ]; then
echo "Warning: High series cardinality: $SERIES_COUNT"
fi
Performance Tuning¶
# Memory optimization
# Set appropriate cache sizes
# Monitor memory usage
# Use memory-mapped files for better performance
# Disk optimization
# Use SSDs for better I/O
# Separate WAL and data directories
# Monitor disk I/O
# Network optimization
# Use dedicated network for cluster traffic
# Optimize batch sizes
# Use compression for remote writes
# Query optimization
# Create appropriate continuous queries
# Use materialized views for common queries
# Cache frequently accessed data
# Example optimized configuration
[data]
cache-max-memory-size = "8g"
cache-snapshot-memory-size = "256m"
cache-snapshot-write-cold-duration = "10m"
compact-full-write-cold-duration = "4h"
max-concurrent-compactions = 3
[coordinator]
write-timeout = "10s"
max-concurrent-queries = 0
query-timeout = "0s"
[http]
max-body-size = 25000000
max-concurrent-write-limit = 0
max-enqueued-write-limit = 0
Summary¶
InfluxDB is a powerful time-series database optimized for handling high-volume timestamped data. This cheat sheet provides comprehensive coverage of InfluxDB operations, from basic concepts through advanced administration and optimization.
Key strengths:
- **Time-series optimization**: purpose-built for timestamped data
- **High write performance**: handles millions of writes per second
- **Flexible query languages**: both InfluxQL and Flux for different use cases
- **Downsampling**: continuous queries for data aggregation
- **Horizontal scaling**: Enterprise clustering for large deployments
Best use cases:
- DevOps monitoring and metrics collection
- IoT sensor data and telemetry
- Real-time analytics and dashboards
- Application performance monitoring
- Financial and trading data
**Important considerations:**
- Schema design significantly impacts performance
- Tag cardinality must be managed carefully
- Memory requirements can be substantial for high-cardinality data
- Clustering requires the Enterprise edition
- Query performance depends on proper indexing and bounded time ranges
By following the practices and techniques outlined in this cheat sheet, you can design, implement, and maintain InfluxDB deployments that deliver high-performance time-series storage and analysis for modern monitoring and analytics applications.