# Apache Cassandra Cheatsheet

A distributed NoSQL database.

Apache Cassandra is a free and open-source, distributed, wide-column store NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.
## Table of Contents

- [Installation](#installation)
- [Basic Commands](#basic-commands)
- [Keyspace Operations](#keyspace-operations)
- [Table Operations](#table-operations)
- [Data Types](#data-types)
- [CRUD Operations](#crud-operations)
- [Query Operations](#query-operations)
- [Indexes](#indexes)
- [User-Defined Types](#user-defined-types)
- [Functions](#functions)
- [Materialized Views](#materialized-views)
- [Batch Operations](#batch-operations)
- [Consistency Levels](#consistency-levels)
- [Cluster Management](#cluster-management)
- [Performance Tuning](#performance-tuning)
- [Monitoring](#monitoring)
- [Security](#security)
- [Best Practices](#best-practices)
## Installation

### Ubuntu/Debian Installation

```bash
# Add the Apache Cassandra repository
echo "deb https://debian.cassandra.apache.org 40x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add -

# Update the package index
sudo apt update

# Install Cassandra
sudo apt install cassandra

# Start and enable the Cassandra service
sudo systemctl start cassandra
sudo systemctl enable cassandra

# Check service status
sudo systemctl status cassandra

# Verify the installation
nodetool status
```
### CentOS/RHEL Installation

```bash
# Create the repository file
sudo tee /etc/yum.repos.d/cassandra.repo << EOF
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF

# Install Cassandra
sudo yum install cassandra

# Start and enable the Cassandra service
sudo systemctl start cassandra
sudo systemctl enable cassandra

# Check cluster status
nodetool status
```
### Docker Installation

```bash
# Pull the Cassandra image
docker pull cassandra:latest

# Run a Cassandra container
docker run --name cassandra-node \
  -p 9042:9042 \
  -p 7000:7000 \
  -p 7001:7001 \
  -p 7199:7199 \
  -p 9160:9160 \
  -d cassandra:latest

# Connect to Cassandra
docker exec -it cassandra-node cqlsh

# Docker Compose for a three-node cluster
cat > docker-compose.yml << EOF
version: '3.8'
services:
  cassandra-1:
    image: cassandra:latest
    container_name: cassandra-1
    ports:
      - "9042:9042"
    environment:
      - CASSANDRA_CLUSTER_NAME=MyCluster
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_DC=datacenter1
    volumes:
      - cassandra-1-data:/var/lib/cassandra

  cassandra-2:
    image: cassandra:latest
    container_name: cassandra-2
    environment:
      - CASSANDRA_CLUSTER_NAME=MyCluster
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_SEEDS=cassandra-1
    depends_on:
      - cassandra-1
    volumes:
      - cassandra-2-data:/var/lib/cassandra

  cassandra-3:
    image: cassandra:latest
    container_name: cassandra-3
    environment:
      - CASSANDRA_CLUSTER_NAME=MyCluster
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_DC=datacenter1
      - CASSANDRA_SEEDS=cassandra-1
    depends_on:
      - cassandra-1
    volumes:
      - cassandra-3-data:/var/lib/cassandra

volumes:
  cassandra-1-data:
  cassandra-2-data:
  cassandra-3-data:
EOF

docker-compose up -d
```
### Manual Installation

```bash
# Download Cassandra
wget https://downloads.apache.org/cassandra/4.0.7/apache-cassandra-4.0.7-bin.tar.gz

# Extract and move into place
tar -xzf apache-cassandra-4.0.7-bin.tar.gz
sudo mv apache-cassandra-4.0.7 /opt/cassandra

# Set environment variables
echo 'export CASSANDRA_HOME=/opt/cassandra' >> ~/.bashrc
echo 'export PATH=$PATH:$CASSANDRA_HOME/bin' >> ~/.bashrc
source ~/.bashrc

# Start Cassandra
cassandra -f   # Foreground mode
# or
cassandra      # Background mode

# Connect to Cassandra
cqlsh
```
## Basic Commands

### Connecting to Cassandra

```bash
# Connect to local Cassandra
cqlsh

# Connect to remote Cassandra
cqlsh hostname
cqlsh hostname port

# Connect with credentials
cqlsh -u username -p password

# Connect with SSL
cqlsh --ssl

# Execute CQL from a file
cqlsh -f script.cql

# Execute a single command
cqlsh -e "DESCRIBE KEYSPACES;"
```
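Application code connects through a driver rather than cqlsh. A minimal sketch using the Python `cassandra-driver` package (an assumption; any driver speaking the native protocol on port 9042 works similarly):

```python
# pip install cassandra-driver
from cassandra.cluster import Cluster

# Contact point and port are assumptions for a local single-node setup
cluster = Cluster(['127.0.0.1'], port=9042)
session = cluster.connect()  # or cluster.connect('mykeyspace')

row = session.execute("SELECT release_version FROM system.local").one()
print(row.release_version)

cluster.shutdown()
```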
### CQL Shell Commands

```cql
-- Show help
HELP;

-- Show current keyspace
SELECT keyspace_name FROM system.local;

-- Show cluster information
DESCRIBE CLUSTER;

-- Show keyspaces
DESCRIBE KEYSPACES;

-- Show tables in the current keyspace
DESCRIBE TABLES;

-- Show table structure
DESCRIBE TABLE table_name;

-- Show version
SELECT release_version FROM system.local;

-- Exit the CQL shell
EXIT;
QUIT;

-- Enable/disable tracing
TRACING ON;
TRACING OFF;

-- Show the current consistency level
CONSISTENCY;
```
### System Information

```cql
-- Show cluster information
SELECT cluster_name, listen_address, data_center, rack, release_version
FROM system.local;

-- Show all nodes
SELECT peer, data_center, rack, release_version, tokens
FROM system.peers;

-- Show keyspace information
SELECT keyspace_name, durable_writes, replication
FROM system_schema.keyspaces;

-- Show table information
SELECT keyspace_name, table_name, bloom_filter_fp_chance, caching, comment, compaction, compression
FROM system_schema.tables
WHERE keyspace_name = 'mykeyspace';

-- Show column information
SELECT keyspace_name, table_name, column_name, clustering_order, column_name_bytes, kind, position, type
FROM system_schema.columns
WHERE keyspace_name = 'mykeyspace' AND table_name = 'mytable';
```
## Keyspace Operations

### Creating Keyspaces

```cql
-- Simple keyspace creation
CREATE KEYSPACE mykeyspace
WITH REPLICATION = {
    'class': 'SimpleStrategy',
    'replication_factor': 3
};

-- Keyspace with NetworkTopologyStrategy
CREATE KEYSPACE mykeyspace
WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'datacenter1': 3,
    'datacenter2': 2
}
AND DURABLE_WRITES = true;

-- Keyspace for development (single node)
CREATE KEYSPACE dev_keyspace
WITH REPLICATION = {
    'class': 'SimpleStrategy',
    'replication_factor': 1
};

-- Keyspace with custom options
CREATE KEYSPACE analytics
WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'datacenter1': 3
}
AND DURABLE_WRITES = false;
```

### Managing Keyspaces

```cql
-- Use a keyspace
USE mykeyspace;

-- Alter keyspace replication
ALTER KEYSPACE mykeyspace
WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'datacenter1': 5,
    'datacenter2': 3
};

-- Alter durable writes
ALTER KEYSPACE mykeyspace WITH DURABLE_WRITES = false;

-- Drop a keyspace
DROP KEYSPACE mykeyspace;

-- Show keyspace details
DESCRIBE KEYSPACE mykeyspace;

-- Show all keyspaces
SELECT keyspace_name, durable_writes, replication FROM system_schema.keyspaces;
```
## Table Operations

### Creating Tables

```cql
-- Basic table creation
CREATE TABLE users (
    user_id UUID PRIMARY KEY,
    username TEXT,
    email TEXT,
    first_name TEXT,
    last_name TEXT,
    created_at TIMESTAMP
);

-- Table with a composite primary key
CREATE TABLE user_posts (
    user_id UUID,
    post_id TIMEUUID,
    title TEXT,
    content TEXT,
    created_at TIMESTAMP,
    PRIMARY KEY (user_id, post_id)
);

-- Table with clustering order
CREATE TABLE time_series (
    sensor_id UUID,
    timestamp TIMESTAMP,
    temperature DOUBLE,
    humidity DOUBLE,
    PRIMARY KEY (sensor_id, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);

-- Table with multiple clustering columns
CREATE TABLE user_activities (
    user_id UUID,
    year INT,
    month INT,
    day INT,
    activity_id TIMEUUID,
    activity_type TEXT,
    description TEXT,
    PRIMARY KEY (user_id, year, month, day, activity_id)
) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, activity_id DESC);

-- Table with options
CREATE TABLE products (
    product_id UUID PRIMARY KEY,
    name TEXT,
    description TEXT,
    price DECIMAL,
    category TEXT,
    created_at TIMESTAMP
) WITH comment = 'Product catalog table'
  AND gc_grace_seconds = 864000
  AND bloom_filter_fp_chance = 0.01
  AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
  AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': 32, 'min_threshold': 4}
  AND compression = {'chunk_length_in_kb': 64, 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'};
```

### Altering Tables

```cql
-- Add a column
ALTER TABLE users ADD phone TEXT;

-- Add multiple columns
ALTER TABLE users ADD (
    address TEXT,
    city TEXT,
    country TEXT
);

-- Drop a column
ALTER TABLE users DROP phone;

-- Rename a column (Cassandra 3.0+; only primary key columns can be renamed)
ALTER TABLE user_posts RENAME post_id TO post_uuid;

-- Alter a column type (very limited support)
ALTER TABLE users ALTER created_at TYPE TIMESTAMP;

-- Add table options
ALTER TABLE users WITH comment = 'User information table';

-- Modify the compaction strategy
ALTER TABLE users WITH compaction = {
    'class': 'LeveledCompactionStrategy',
    'sstable_size_in_mb': 160
};

-- Modify compression
ALTER TABLE users WITH compression = {
    'class': 'DeflateCompressor',
    'chunk_length_in_kb': 64
};

-- Drop a table
DROP TABLE users;
```

### Table Information

```cql
-- Describe a table
DESCRIBE TABLE users;

-- Show the table schema
SELECT keyspace_name, table_name, column_name, clustering_order, column_name_bytes, kind, position, type
FROM system_schema.columns
WHERE keyspace_name = 'mykeyspace' AND table_name = 'users';

-- Show table options
SELECT keyspace_name, table_name, bloom_filter_fp_chance, caching, comment, compaction, compression,
       crc_check_chance, dclocal_read_repair_chance, default_time_to_live, gc_grace_seconds,
       max_index_interval, memtable_flush_period_in_ms, min_index_interval, read_repair_chance, speculative_retry
FROM system_schema.tables
WHERE keyspace_name = 'mykeyspace' AND table_name = 'users';
```
## Data Types

### Basic Data Types

```cql
-- Numeric types
CREATE TABLE numeric_examples (
    id UUID PRIMARY KEY,
    tiny_int_val TINYINT,    -- 8-bit signed integer
    small_int_val SMALLINT,  -- 16-bit signed integer
    int_val INT,             -- 32-bit signed integer
    big_int_val BIGINT,      -- 64-bit signed integer
    varint_val VARINT,       -- Arbitrary-precision integer
    decimal_val DECIMAL,     -- Variable-precision decimal
    float_val FLOAT,         -- 32-bit IEEE-754 floating point
    double_val DOUBLE        -- 64-bit IEEE-754 floating point
);

-- Text and binary types
CREATE TABLE text_examples (
    id UUID PRIMARY KEY,
    ascii_val ASCII,         -- ASCII string
    text_val TEXT,           -- UTF-8 string
    varchar_val VARCHAR,     -- UTF-8 string (alias for TEXT)
    blob_val BLOB            -- Binary data
);

-- Date and time types
CREATE TABLE datetime_examples (
    id UUID PRIMARY KEY,
    date_val DATE,           -- Date without time
    time_val TIME,           -- Time without date
    timestamp_val TIMESTAMP, -- Date and time
    duration_val DURATION    -- Duration (Cassandra 3.10+)
);

-- Boolean and UUID types
CREATE TABLE other_examples (
    id UUID PRIMARY KEY,
    bool_val BOOLEAN,        -- True or false
    uuid_val UUID,           -- Random UUID
    timeuuid_val TIMEUUID,   -- Time-based UUID
    inet_val INET            -- IP address (IPv4 or IPv6)
);
```
### Collection Data Types

```cql
-- Set type
CREATE TABLE user_interests (
    user_id UUID PRIMARY KEY,
    interests SET<TEXT>
);

-- List type
CREATE TABLE user_addresses (
    user_id UUID PRIMARY KEY,
    addresses LIST<TEXT>
);

-- Map type
CREATE TABLE user_attributes (
    user_id UUID PRIMARY KEY,
    attributes MAP<TEXT, TEXT>
);

-- Frozen collection
CREATE TABLE user_profile (
    user_id UUID PRIMARY KEY,
    tags FROZEN<SET<TEXT>>
);

-- Nested collections (inner collections must be FROZEN)
CREATE TABLE complex_data (
    id UUID PRIMARY KEY,
    nested_map MAP<TEXT, FROZEN<LIST<INT>>>
);
```
### User-Defined Types (UDTs)

```cql
-- Create a user-defined type
CREATE TYPE address (
    street TEXT,
    city TEXT,
    state TEXT,
    zip_code TEXT,
    country TEXT
);

-- Create another UDT
CREATE TYPE phone (
    country_code TEXT,
    number TEXT,
    type TEXT
);

-- Use UDTs in a table
CREATE TABLE contacts (
    contact_id UUID PRIMARY KEY,
    name TEXT,
    home_address address,
    work_address address,
    phones LIST<FROZEN<phone>>
);

-- Frozen UDT columns
CREATE TABLE user_profiles (
    user_id UUID PRIMARY KEY,
    personal_info FROZEN<address>,
    contact_numbers FROZEN<LIST<phone>>
);
```
## CRUD Operations

### Insert Operations

```cql
-- Basic insert
INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
VALUES (uuid(), 'john_doe', 'john@example.com', 'John', 'Doe', toTimestamp(now()));

-- Insert with TTL (Time To Live)
INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
VALUES (uuid(), 'jane_doe', 'jane@example.com', 'Jane', 'Doe', toTimestamp(now()))
USING TTL 86400;

-- Insert with an explicit write timestamp
INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
VALUES (uuid(), 'bob_smith', 'bob@example.com', 'Bob', 'Smith', toTimestamp(now()))
USING TIMESTAMP 1640995200000000;

-- Insert into collection columns
INSERT INTO user_interests (user_id, interests)
VALUES (uuid(), {'technology', 'music', 'sports'});

INSERT INTO user_addresses (user_id, addresses)
VALUES (uuid(), ['123 Main St', '456 Oak Ave', '789 Pine Rd']);

INSERT INTO user_attributes (user_id, attributes)
VALUES (uuid(), {'theme': 'dark', 'language': 'en', 'timezone': 'UTC'});

-- Insert with a UDT
INSERT INTO contacts (contact_id, name, home_address, phones)
VALUES (
    uuid(),
    'John Doe',
    {street: '123 Main St', city: 'New York', state: 'NY', zip_code: '10001', country: 'USA'},
    [{country_code: '+1', number: '555-1234', type: 'mobile'}]
);

-- Conditional insert (IF NOT EXISTS)
INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
VALUES (uuid(), 'unique_user', 'unique@example.com', 'Unique', 'User', toTimestamp(now()))
IF NOT EXISTS;
```
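A conditional insert does not fail when the condition does not hold; it returns an `[applied]` flag instead. A hedged sketch of checking it with the Python `cassandra-driver` (table layout follows the examples above; the keyspace name is an assumption):

```python
import uuid
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')  # keyspace name is an assumption

insert = session.prepare(
    "INSERT INTO users (user_id, username, email, first_name, last_name, created_at) "
    "VALUES (?, ?, ?, ?, ?, toTimestamp(now())) IF NOT EXISTS"
)
result = session.execute(insert, (uuid.uuid4(), 'unique_user', 'unique@example.com', 'Unique', 'User'))
# was_applied is False when a row with this primary key already exists
print("inserted:", result.was_applied)
```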
### Select Operations

```cql
-- Basic select
SELECT * FROM users;

-- Select specific columns
SELECT user_id, username, email FROM users;

-- Select with a WHERE clause (must include the partition key)
SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Select with clustering column conditions
SELECT * FROM user_posts WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

SELECT * FROM user_posts
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
  AND post_id > minTimeuuid('2023-01-01');

-- Select with LIMIT
SELECT * FROM users LIMIT 10;

-- Select with ORDER BY (only on clustering columns)
SELECT * FROM user_posts
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
ORDER BY post_id DESC;

-- Select with ALLOW FILTERING (use carefully)
SELECT * FROM users WHERE email = 'john@example.com' ALLOW FILTERING;

-- Select with the token function for pagination
SELECT * FROM users WHERE token(user_id) > token(123e4567-e89b-12d3-a456-426614174000);

-- Select collection columns (collections are returned whole)
SELECT user_id, interests FROM user_interests;

-- Select UDT fields
SELECT contact_id, name, home_address.city, home_address.state FROM contacts;

-- Select with functions
SELECT user_id, username, toTimestamp(now()) AS current_time FROM users;
SELECT user_id, post_id, dateOf(post_id) AS post_date
FROM user_posts
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
```
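Token-range queries are mostly needed for full scans; for ordinary result sets, drivers page transparently. A sketch with the Python `cassandra-driver`, where `fetch_size` (here 100) is an assumed page size:

```python
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')  # keyspace name is an assumption

stmt = SimpleStatement("SELECT user_id, username FROM users", fetch_size=100)
for row in session.execute(stmt):  # the driver fetches further pages on demand
    print(row.user_id, row.username)
```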
### Update Operations

```cql
-- Basic update
UPDATE users SET email = 'newemail@example.com'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Update multiple columns
UPDATE users SET email = 'john.doe@example.com', first_name = 'Jonathan'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Update with TTL
UPDATE users USING TTL 3600 SET email = 'temp@example.com'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Update with an explicit timestamp
UPDATE users USING TIMESTAMP 1640995200000000 SET email = 'timestamped@example.com'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Collections: add to a set
UPDATE user_interests SET interests = interests + {'reading'}
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Collections: remove from a set
UPDATE user_interests SET interests = interests - {'sports'}
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Collections: append to a list
UPDATE user_addresses SET addresses = addresses + ['999 New St']
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Collections: prepend to a list
UPDATE user_addresses SET addresses = ['000 First St'] + addresses
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Update a map entry
UPDATE user_attributes SET attributes['theme'] = 'light'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Update a UDT field
UPDATE contacts SET home_address.city = 'Los Angeles'
WHERE contact_id = 123e4567-e89b-12d3-a456-426614174000;

-- Conditional update
UPDATE users SET email = 'conditional@example.com'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
IF email = 'old@example.com';

-- Update with IF EXISTS
UPDATE users SET email = 'exists@example.com'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
IF EXISTS;
```
### Delete Operations

```cql
-- Delete an entire row
DELETE FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Delete specific columns
DELETE email, phone FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Delete with an explicit timestamp
DELETE FROM users USING TIMESTAMP 1640995200000000
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Remove an element from a set (sets use UPDATE with '-', not DELETE)
UPDATE user_interests SET interests = interests - {'technology'}
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Delete a list element by index
DELETE addresses[0] FROM user_addresses
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Delete a map key
DELETE attributes['theme'] FROM user_attributes
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Delete a UDT field (non-frozen UDTs only)
DELETE home_address.state FROM contacts
WHERE contact_id = 123e4567-e89b-12d3-a456-426614174000;

-- Conditional delete
DELETE FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
IF email = 'delete@example.com';

-- Delete with IF EXISTS
DELETE FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
IF EXISTS;

-- Delete a range (clustering columns)
DELETE FROM user_posts
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
  AND post_id > minTimeuuid('2023-01-01')
  AND post_id < maxTimeuuid('2023-12-31');
```
## Query Operations

### Advanced Queries

```cql
-- Range queries on clustering columns
SELECT * FROM time_series
WHERE sensor_id = 123e4567-e89b-12d3-a456-426614174000
  AND timestamp >= '2023-01-01' AND timestamp < '2023-02-01';

-- IN queries
SELECT * FROM users WHERE user_id IN (
    123e4567-e89b-12d3-a456-426614174000,
    456e7890-e89b-12d3-a456-426614174001,
    789e0123-e89b-12d3-a456-426614174002
);

-- Token-based queries for pagination
SELECT * FROM users
WHERE token(user_id) > token(123e4567-e89b-12d3-a456-426614174000)
LIMIT 1000;

-- CONTAINS queries (require a SASI or secondary index)
SELECT * FROM user_interests WHERE interests CONTAINS 'technology' ALLOW FILTERING;

-- CONTAINS KEY queries for maps
SELECT * FROM user_attributes WHERE attributes CONTAINS KEY 'theme' ALLOW FILTERING;

-- Slice queries
SELECT * FROM user_activities
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
  AND year = 2023 AND month >= 6;

-- Multi-column slice
SELECT * FROM user_activities
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
  AND (year, month) >= (2023, 6) AND (year, month) <= (2023, 12);
```
### Aggregation Functions

```cql
-- Count
SELECT COUNT(*) FROM users;

-- Count with a WHERE clause
SELECT COUNT(*) FROM user_posts WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Min and max (Cassandra 2.2+)
SELECT MIN(temperature), MAX(temperature) FROM time_series
WHERE sensor_id = 123e4567-e89b-12d3-a456-426614174000;

-- Sum and average (Cassandra 2.2+)
SELECT SUM(temperature), AVG(temperature) FROM time_series
WHERE sensor_id = 123e4567-e89b-12d3-a456-426614174000;

-- Group by partition key (limited support)
SELECT sensor_id, COUNT(*) FROM time_series GROUP BY sensor_id;
```
### Built-in Functions

```cql
-- UUID functions
SELECT uuid() AS new_uuid;
SELECT now() AS current_timeuuid;
SELECT minTimeuuid('2023-01-01') AS min_uuid;
SELECT maxTimeuuid('2023-12-31') AS max_uuid;

-- Date and time functions
SELECT toTimestamp(now()) AS current_timestamp;
SELECT dateOf(now()) AS current_date;
SELECT unixTimestampOf(now()) AS unix_timestamp;

-- Blob functions
SELECT textAsBlob('hello') AS text_blob;
SELECT blobAsText(textAsBlob('hello')) AS blob_text;

-- Token function
SELECT token(user_id) FROM users;

-- TTL and WRITETIME functions
SELECT TTL(email), WRITETIME(email) FROM users
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Type conversion functions
SELECT toJson(interests) FROM user_interests;
SELECT fromJson('["technology", "music"]') AS parsed_interests;
```
## Indexes

### Secondary Indexes

```cql
-- Create a secondary index
CREATE INDEX ON users (email);

-- Create a named index
CREATE INDEX user_email_idx ON users (email);

-- Create an index on a collection
CREATE INDEX ON user_interests (interests);

-- Create an index on map keys
CREATE INDEX ON user_attributes (KEYS(attributes));

-- Create an index on map values
CREATE INDEX ON user_attributes (VALUES(attributes));

-- Create an index on map entries
CREATE INDEX ON user_attributes (ENTRIES(attributes));

-- Create an index on a UDT column
CREATE INDEX ON contacts (home_address);

-- Show an index
DESCRIBE INDEX user_email_idx;

-- Drop an index
DROP INDEX user_email_idx;
```
### SASI Indexes (Cassandra 3.4+)

```cql
-- Create a SASI index for text search
CREATE CUSTOM INDEX user_name_sasi_idx ON users (first_name)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
    'mode': 'CONTAINS',
    'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
    'case_sensitive': 'false'
};

-- Create a SASI index for numeric range queries
CREATE CUSTOM INDEX product_price_sasi_idx ON products (price)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {'mode': 'SPARSE'};

-- Query with a SASI index
SELECT * FROM users WHERE first_name LIKE '%john%';
SELECT * FROM products WHERE price >= 100 AND price <= 500;
```
## User-Defined Types

### Creating and Managing UDTs

```cql
-- Create a UDT
CREATE TYPE address (
    street TEXT,
    city TEXT,
    state TEXT,
    zip_code TEXT,
    country TEXT
);

-- Create a nested UDT
CREATE TYPE contact_info (
    email TEXT,
    phone TEXT,
    address address
);

-- Alter a UDT: add a field
ALTER TYPE address ADD region TEXT;

-- Alter a UDT: rename a field
ALTER TYPE address RENAME zip_code TO postal_code;

-- Show a UDT
DESCRIBE TYPE address;

-- Drop a UDT
DROP TYPE address;

-- Use UDTs in a table
CREATE TABLE customers (
    customer_id UUID PRIMARY KEY,
    name TEXT,
    billing_address address,
    shipping_address address,
    contact contact_info
);

-- Insert with UDTs
INSERT INTO customers (customer_id, name, billing_address, contact)
VALUES (
    uuid(),
    'John Doe',
    {
        street: '123 Main St',
        city: 'New York',
        state: 'NY',
        postal_code: '10001',
        country: 'USA',
        region: 'Northeast'
    },
    {
        email: 'john@example.com',
        phone: '+1-555-1234',
        address: {
            street: '456 Work Ave',
            city: 'New York',
            state: 'NY',
            postal_code: '10002',
            country: 'USA',
            region: 'Northeast'
        }
    }
);

-- Query UDT fields
SELECT customer_id, name, billing_address.city, contact.email FROM customers;

-- Update a UDT field
UPDATE customers SET billing_address.city = 'Los Angeles'
WHERE customer_id = 123e4567-e89b-12d3-a456-426614174000;
```
## Functions

### User-Defined Functions (UDFs)

```cql
-- Enable UDFs first (in cassandra.yaml: enable_user_defined_functions: true)

-- Create a simple UDF
CREATE OR REPLACE FUNCTION fLog (input DOUBLE)
CALLED ON NULL INPUT
RETURNS DOUBLE
LANGUAGE java
AS 'return Double.valueOf(Math.log(input.doubleValue()));';

-- Create a UDF with multiple parameters
CREATE OR REPLACE FUNCTION fAdd (x INT, y INT)
CALLED ON NULL INPUT
RETURNS INT
LANGUAGE java
AS 'return Integer.valueOf(x.intValue() + y.intValue());';

-- Create a UDF for text processing
CREATE OR REPLACE FUNCTION toUpperCase (input TEXT)
CALLED ON NULL INPUT
RETURNS TEXT
LANGUAGE java
AS 'return input.toUpperCase();';

-- Create a UDF for collection processing
CREATE OR REPLACE FUNCTION listSize (input LIST<TEXT>)
CALLED ON NULL INPUT
RETURNS INT
LANGUAGE java
AS 'return Integer.valueOf(input.size());';

-- Use UDFs in queries
SELECT user_id, fAdd(5, 10) AS sum FROM users;
SELECT user_id, toUpperCase(first_name) AS upper_name FROM users;
SELECT user_id, listSize(addresses) AS address_count FROM user_addresses;

-- Show a function
DESCRIBE FUNCTION fLog;

-- Drop a function
DROP FUNCTION fLog;
```
### User-Defined Aggregates (UDAs)

```cql
-- Create the state function for the UDA
CREATE OR REPLACE FUNCTION avgState (state TUPLE<INT, BIGINT>, val INT)
CALLED ON NULL INPUT
RETURNS TUPLE<INT, BIGINT>
LANGUAGE java
AS '
    if (val != null) {
        state.setInt(0, state.getInt(0) + 1);
        state.setLong(1, state.getLong(1) + val.intValue());
    }
    return state;
';

-- Create the final function for the UDA
CREATE OR REPLACE FUNCTION avgFinal (state TUPLE<INT, BIGINT>)
CALLED ON NULL INPUT
RETURNS DOUBLE
LANGUAGE java
AS '
    if (state.getInt(0) == 0) return null;
    double r = state.getLong(1);
    r /= state.getInt(0);
    return Double.valueOf(r);
';

-- Create the UDA
CREATE OR REPLACE AGGREGATE average (INT)
SFUNC avgState
STYPE TUPLE<INT, BIGINT>
FINALFUNC avgFinal
INITCOND (0, 0);

-- Use the UDA (assumes an INT-typed column; the example table's temperature is DOUBLE)
SELECT sensor_id, average(temperature) FROM time_series GROUP BY sensor_id;

-- Drop the UDA
DROP AGGREGATE average;
```
## Materialized Views

### Creating Materialized Views

```cql
-- Create a materialized view
CREATE MATERIALIZED VIEW users_by_email AS
    SELECT user_id, username, email, first_name, last_name, created_at
    FROM users
    WHERE email IS NOT NULL AND user_id IS NOT NULL
    PRIMARY KEY (email, user_id);

-- Materialized view with different clustering
CREATE MATERIALIZED VIEW posts_by_title AS
    SELECT user_id, post_id, title, content, created_at
    FROM user_posts
    WHERE title IS NOT NULL AND user_id IS NOT NULL AND post_id IS NOT NULL
    PRIMARY KEY (title, user_id, post_id);

-- Materialized view with filtering
CREATE MATERIALIZED VIEW recent_posts AS
    SELECT user_id, post_id, title, content, created_at
    FROM user_posts
    WHERE created_at IS NOT NULL AND user_id IS NOT NULL AND post_id IS NOT NULL
      AND created_at >= '2023-01-01'
    PRIMARY KEY (created_at, user_id, post_id)
    WITH CLUSTERING ORDER BY (user_id ASC, post_id DESC);

-- Show a materialized view
DESCRIBE MATERIALIZED VIEW users_by_email;

-- Query a materialized view
SELECT * FROM users_by_email WHERE email = 'john@example.com';

-- Drop a materialized view
DROP MATERIALIZED VIEW users_by_email;
```

### Working with Materialized Views

```cql
-- Materialized views are maintained automatically:
-- base-table changes are propagated to the views

-- Insert into the base table
INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
VALUES (uuid(), 'new_user', 'new@example.com', 'New', 'User', toTimestamp(now()));

-- Query the view to see the automatic update
SELECT * FROM users_by_email WHERE email = 'new@example.com';

-- Update the base table
UPDATE users SET first_name = 'Updated'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- The view is updated automatically
SELECT * FROM users_by_email WHERE email = 'john@example.com';

-- Show all materialized views
SELECT keyspace_name, view_name, base_table_name FROM system_schema.views;
```
## Batch Operations

### Batch Statements

```cql
-- Logged batch (default)
BEGIN BATCH
    INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
    VALUES (uuid(), 'batch_user1', 'batch1@example.com', 'Batch', 'User1', toTimestamp(now()));

    INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
    VALUES (uuid(), 'batch_user2', 'batch2@example.com', 'Batch', 'User2', toTimestamp(now()));

    UPDATE users SET email = 'updated@example.com'
    WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
APPLY BATCH;

-- Unlogged batch (better performance, no atomicity guarantee)
BEGIN UNLOGGED BATCH
    INSERT INTO user_posts (user_id, post_id, title, content, created_at)
    VALUES (123e4567-e89b-12d3-a456-426614174000, now(), 'Post 1', 'Content 1', toTimestamp(now()));

    INSERT INTO user_posts (user_id, post_id, title, content, created_at)
    VALUES (123e4567-e89b-12d3-a456-426614174000, now(), 'Post 2', 'Content 2', toTimestamp(now()));
APPLY BATCH;

-- Batch with TTL and timestamp
BEGIN BATCH USING TTL 3600 AND TIMESTAMP 1640995200000000
    INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
    VALUES (uuid(), 'temp_user', 'temp@example.com', 'Temp', 'User', toTimestamp(now()));

    UPDATE users SET phone = '+1-555-9999'
    WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
APPLY BATCH;

-- Conditional batch
BEGIN BATCH
    INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
    VALUES (uuid(), 'conditional_user', 'conditional@example.com', 'Conditional', 'User', toTimestamp(now()))
    IF NOT EXISTS;

    UPDATE users SET email = 'conditional_update@example.com'
    WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
    IF email = 'old@example.com';
APPLY BATCH;

-- Counter batch
BEGIN COUNTER BATCH
    UPDATE page_views SET views = views + 1 WHERE page_id = 'home';
    UPDATE page_views SET views = views + 1 WHERE page_id = 'about';
APPLY BATCH;
```
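From application code, batches are built with the driver's batch API. A hedged sketch using the Python `cassandra-driver`, building an unlogged batch against the `user_posts` table above (the keyspace name is an assumption):

```python
import uuid
from cassandra.cluster import Cluster
from cassandra.query import BatchStatement, BatchType

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')  # keyspace name is an assumption

insert = session.prepare(
    "INSERT INTO user_posts (user_id, post_id, title, content, created_at) "
    "VALUES (?, now(), ?, ?, toTimestamp(now()))"
)

# Unlogged batch: cheap only because every statement hits the same partition
user_id = uuid.uuid4()
batch = BatchStatement(batch_type=BatchType.UNLOGGED)
for i in range(3):
    batch.add(insert, (user_id, f"Post {i}", f"Content {i}"))
session.execute(batch)
```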
## Consistency Levels

### Setting Consistency Levels

```cql
-- Show the current consistency level
CONSISTENCY;

-- Set the consistency level (applies to subsequent reads and writes in cqlsh)
CONSISTENCY ONE;
CONSISTENCY QUORUM;
CONSISTENCY ALL;
CONSISTENCY LOCAL_QUORUM;
CONSISTENCY EACH_QUORUM;
CONSISTENCY LOCAL_ONE;

-- Consistency levels explained:
-- ONE:          one replica responds
-- QUORUM:       a majority of replicas respond
-- ALL:          all replicas respond
-- LOCAL_QUORUM: a majority of replicas in the local datacenter
-- EACH_QUORUM:  a majority of replicas in each datacenter
-- LOCAL_ONE:    one replica in the local datacenter

-- Example queries with different consistency levels
CONSISTENCY QUORUM;
SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

CONSISTENCY LOCAL_QUORUM;
INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
VALUES (uuid(), 'consistent_user', 'consistent@example.com', 'Consistent', 'User', toTimestamp(now()));

-- Serial consistency for lightweight transactions
SERIAL CONSISTENCY SERIAL;
SERIAL CONSISTENCY LOCAL_SERIAL;

-- Lightweight transaction honoring the serial consistency level
INSERT INTO users (user_id, username, email, first_name, last_name, created_at)
VALUES (uuid(), 'lwt_user', 'lwt@example.com', 'LWT', 'User', toTimestamp(now()))
IF NOT EXISTS;
```
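Drivers set consistency per statement rather than per session. A hedged sketch with the Python `cassandra-driver` (for conditional statements, a `serial_consistency_level` can be set the same way):

```python
import uuid
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')  # keyspace name is an assumption

user_id = uuid.UUID('123e4567-e89b-12d3-a456-426614174000')

# Per-statement consistency level
stmt = SimpleStatement(
    "SELECT * FROM users WHERE user_id = %s",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)
rows = session.execute(stmt, (user_id,))
```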
## Cluster Management

### Node Operations

```bash
# Check cluster status
nodetool status

# Check node information
nodetool info

# Check ring information
nodetool ring

# Describe the cluster
nodetool describecluster

# Check gossip information
nodetool gossipinfo

# Flush memtables to disk
nodetool flush

# Compact SSTables
nodetool compact

# Clean up after replication changes
nodetool cleanup

# Repair data
nodetool repair

# Repair a specific keyspace
nodetool repair mykeyspace

# Repair a specific table
nodetool repair mykeyspace mytable

# Incremental repair
nodetool repair -inc

# Full repair
nodetool repair -full
```
### Cluster Monitoring

```bash
# Check network stats
nodetool netstats

# Check compaction stats
nodetool compactionstats

# Check thread pool stats
nodetool tpstats

# Check per-table stats (legacy alias for tablestats)
nodetool cfstats

# Check stats for a specific table
nodetool tablestats mykeyspace.mytable

# Check read/write latency histograms
nodetool proxyhistograms

# Check gossip state
nodetool gossipinfo

# Check schema version
nodetool describecluster

# Check pending compaction tasks
nodetool compactionstats

# Check streaming operations
nodetool netstats
```
### Backup and Snapshots

```bash
# Create a snapshot
nodetool snapshot

# Create a named snapshot
nodetool snapshot -t backup_20231201

# Create a snapshot for a specific keyspace
nodetool snapshot mykeyspace

# List snapshots
nodetool listsnapshots

# Clear all snapshots
nodetool clearsnapshot

# Clear a specific snapshot
nodetool clearsnapshot -t backup_20231201

# Incremental backups (enable in cassandra.yaml)
# incremental_backups: true

# Restore from a snapshot:
# 1. Stop Cassandra
# 2. Clear the data directory
# 3. Copy the snapshot files into the data directory
# 4. Start Cassandra
# 5. Run: nodetool refresh
```
### Adding/Removing Nodes

```bash
# Add a new node to the cluster:
# 1. Install Cassandra on the new node
# 2. Configure cassandra.yaml with the cluster settings
# 3. Set seeds to existing nodes
# 4. Start Cassandra
# 5. Check status: nodetool status

# Resume an interrupted bootstrap
nodetool bootstrap resume

# Decommission a node (graceful removal)
nodetool decommission

# Remove a dead node (forced removal; takes the node's host ID)
nodetool removenode <host-id>

# Move a node to a new token
nodetool move <new-token>

# Drain a node before shutdown
nodetool drain

# Stop gossip
nodetool disablegossip

# Start gossip
nodetool enablegossip

# Stop the Thrift server (Cassandra 3.x and earlier)
nodetool disablethrift

# Start the Thrift server
nodetool enablethrift
```
## Performance Tuning

### Query Optimization

```cql
-- Use appropriate partition keys
-- Good: query by partition key
SELECT * FROM user_posts WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Bad: query without the partition key (requires ALLOW FILTERING)
SELECT * FROM user_posts WHERE title = 'Some Title' ALLOW FILTERING;

-- Use clustering columns for range queries
SELECT * FROM time_series
WHERE sensor_id = 123e4567-e89b-12d3-a456-426614174000
  AND timestamp >= '2023-01-01' AND timestamp < '2023-02-01';

-- Limit result size
SELECT * FROM users LIMIT 1000;

-- Use token-based pagination
SELECT * FROM users
WHERE token(user_id) > token(123e4567-e89b-12d3-a456-426614174000)
LIMIT 1000;

-- Avoid ALLOW FILTERING when possible; create a secondary index instead
CREATE INDEX ON users (email);
SELECT * FROM users WHERE email = 'john@example.com';

-- Use prepared statements in applications
-- (PREPARE/EXECUTE shown as pseudo-CQL; drivers prepare statements through their API)
PREPARE get_user AS SELECT * FROM users WHERE user_id = ?;
EXECUTE get_user USING 123e4567-e89b-12d3-a456-426614174000;
```
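In practice, preparation happens in the driver. A minimal sketch with the Python `cassandra-driver` (the keyspace name is an assumption):

```python
import uuid
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')  # keyspace name is an assumption

# Prepared once, executed many times; also guards against CQL injection
get_user = session.prepare("SELECT * FROM users WHERE user_id = ?")
row = session.execute(get_user, (uuid.UUID('123e4567-e89b-12d3-a456-426614174000'),)).one()
print(row)
```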
### Table Design Optimization

```cql
-- Design tables for your queries
-- Query: get all posts by user, ordered by date
CREATE TABLE user_posts_by_date (
    user_id UUID,
    post_date DATE,
    post_id TIMEUUID,
    title TEXT,
    content TEXT,
    PRIMARY KEY (user_id, post_date, post_id)
) WITH CLUSTERING ORDER BY (post_date DESC, post_id DESC);

-- Denormalize for read performance
-- (counter tables may contain only counter columns besides the key)
CREATE TABLE user_post_summary (
    user_id UUID PRIMARY KEY,
    post_count COUNTER
);

-- Use an appropriate compaction strategy
ALTER TABLE large_table WITH compaction = {
    'class': 'LeveledCompactionStrategy',
    'sstable_size_in_mb': 160
};

-- Use appropriate compression
ALTER TABLE large_table WITH compression = {
    'class': 'LZ4Compressor',
    'chunk_length_in_kb': 64
};

-- Set an appropriate bloom filter false-positive chance
ALTER TABLE large_table WITH bloom_filter_fp_chance = 0.01;

-- Configure caching
ALTER TABLE frequently_read_table WITH caching = {
    'keys': 'ALL',
    'rows_per_partition': '100'
};
```
### Performance Monitoring

```bash
# Monitor read/write latency
nodetool proxyhistograms

# Monitor compaction
nodetool compactionstats

# Monitor thread pools
nodetool tpstats

# Monitor table statistics
nodetool tablestats

# Monitor JVM garbage collection
nodetool gcstats

# Monitor heap usage
nodetool info

# Enable tracing in cqlsh:
#   TRACING ON;
#   SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
#   TRACING OFF;

# Log slow queries (enable in cassandra.yaml)
# slow_query_log_timeout_in_ms: 500

# Monitor with external tools:
# - DataStax OpsCenter
# - Prometheus + Grafana
# - Custom JMX monitoring
```
## Monitoring

### System Monitoring

```bash
# Cluster overview
nodetool status
nodetool info
nodetool describecluster

# Node health
nodetool gossipinfo
nodetool version
nodetool info   # includes uptime

# Performance metrics
nodetool tpstats
nodetool cfstats
nodetool proxyhistograms
nodetool tablehistograms mykeyspace.mytable

# Compaction monitoring
nodetool compactionstats
nodetool compactionhistory

# Memory and GC
nodetool gcstats
nodetool info | grep -E "(Heap|Off-heap)"

# Network and streaming
nodetool netstats
nodetool getstreamthroughput

# Pending operations
nodetool compactionstats   # pending compactions
nodetool tpstats           # pending thread-pool tasks
```
### JMX Monitoring

```bash
# Enable JMX (in cassandra-env.sh)
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.port=7199"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.rmi.port=7199"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.ssl=false"
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"

# Connect with JConsole
jconsole localhost:7199

# Key JMX metrics to monitor:
# - org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency
# - org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency
# - org.apache.cassandra.metrics:type=Storage,name=Load
# - org.apache.cassandra.metrics:type=Compaction,name=PendingTasks
# - java.lang:type=Memory
# - java.lang:type=GarbageCollector
```
### Log Monitoring

```bash
# System log location
tail -f /var/log/cassandra/system.log

# Debug log
tail -f /var/log/cassandra/debug.log

# GC log
tail -f /var/log/cassandra/gc.log

# Key log patterns to watch for:
# - "OutOfMemoryError"
# - "Dropping mutation"
# - "Timeout"
# - "Unable to gossip"
# - "Compaction interrupted"

# Configure log levels in logback.xml: DEBUG, INFO, WARN, ERROR
# Per-package loggers, for example (representative entries):
#   <logger name="org.apache.cassandra.db" level="DEBUG"/>
#   <logger name="org.apache.cassandra.gms" level="WARN"/>
```
### Application Monitoring

```cql
-- Monitor query performance
TRACING ON;
-- Your queries here
TRACING OFF;

-- Check table size estimates
SELECT * FROM system.size_estimates WHERE keyspace_name = 'mykeyspace';

-- Monitor tombstones (enable in cassandra.yaml):
-- tombstone_warn_threshold: 1000
-- tombstone_failure_threshold: 100000

-- Monitor large partitions (enable in cassandra.yaml):
-- compaction_large_partition_warning_threshold_mb: 100

-- Monitor slow queries (enable in cassandra.yaml):
-- slow_query_log_timeout_in_ms: 500

-- Check system tables for monitoring
SELECT * FROM system.local;
SELECT * FROM system.peers;
SELECT * FROM system_schema.keyspaces;
SELECT * FROM system_schema.tables;
```
## Security

### Authentication

```cql
-- Enable authentication (in cassandra.yaml)
-- authenticator: PasswordAuthenticator

-- Change the default superuser password (default: cassandra/cassandra)
ALTER USER cassandra WITH PASSWORD 'new_secure_password';

-- Create a new user
CREATE USER app_user WITH PASSWORD 'secure_password' NOSUPERUSER;

-- Create a superuser
CREATE USER admin_user WITH PASSWORD 'admin_password' SUPERUSER;

-- Change a user's password
ALTER USER app_user WITH PASSWORD 'new_password';

-- List users
LIST USERS;

-- Drop a user
DROP USER app_user;
```
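Once PasswordAuthenticator is enabled, clients must authenticate. A hedged sketch with the Python `cassandra-driver` (credentials are placeholders matching the CREATE USER example above):

```python
from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

# Placeholder credentials
auth_provider = PlainTextAuthProvider(username='app_user', password='secure_password')
cluster = Cluster(['127.0.0.1'], auth_provider=auth_provider)
session = cluster.connect()
```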
### Authorization

```cql
-- Enable authorization (in cassandra.yaml)
-- authorizer: CassandraAuthorizer

-- Grant permissions
GRANT SELECT ON KEYSPACE mykeyspace TO app_user;
GRANT MODIFY ON KEYSPACE mykeyspace TO app_user;
GRANT CREATE ON KEYSPACE mykeyspace TO app_user;
GRANT DROP ON KEYSPACE mykeyspace TO app_user;
GRANT ALTER ON KEYSPACE mykeyspace TO app_user;

-- Grant table-level permissions
GRANT SELECT ON mykeyspace.users TO app_user;
GRANT MODIFY ON mykeyspace.users TO app_user;

-- Grant all permissions
GRANT ALL PERMISSIONS ON KEYSPACE mykeyspace TO app_user;

-- Revoke permissions
REVOKE SELECT ON KEYSPACE mykeyspace FROM app_user;

-- List permissions
LIST ALL PERMISSIONS;
LIST ALL PERMISSIONS OF app_user;

-- Create a role (Cassandra 2.2+)
CREATE ROLE app_role;

-- Grant a role to a user
GRANT app_role TO app_user;

-- Grant permissions to a role
GRANT SELECT ON KEYSPACE mykeyspace TO app_role;

-- List roles
LIST ROLES;
```
### SSL/TLS Encryption

```yaml
# Client-to-node encryption (in cassandra.yaml)
client_encryption_options:
  enabled: true
  optional: false
  keystore: /path/to/keystore.jks
  keystore_password: keystore_password
  truststore: /path/to/truststore.jks
  truststore_password: truststore_password
  protocol: TLS
  algorithm: SunX509
  store_type: JKS
  cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_256_CBC_SHA]

# Node-to-node encryption
server_encryption_options:
  internode_encryption: all
  keystore: /path/to/keystore.jks
  keystore_password: keystore_password
  truststore: /path/to/truststore.jks
  truststore_password: truststore_password
  protocol: TLS
  algorithm: SunX509
  store_type: JKS
  cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_256_CBC_SHA]
```
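Clients then connect over TLS. A hedged sketch with the Python `cassandra-driver` (the CA bundle path is a placeholder; recent driver versions accept a standard `ssl_context`):

```python
import ssl
from cassandra.cluster import Cluster

# Verify the server certificate against a CA bundle (placeholder path)
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ssl_context.load_verify_locations('/path/to/ca.pem')
ssl_context.check_hostname = False  # enable when certificates carry hostnames

cluster = Cluster(['10.0.0.1'], ssl_context=ssl_context)
session = cluster.connect()
```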
### Network Security

```yaml
# Configure listen addresses (in cassandra.yaml)
listen_address: 10.0.0.1
rpc_address: 10.0.0.1
broadcast_address: 10.0.0.1
broadcast_rpc_address: 10.0.0.1

# Configure ports
storage_port: 7000
ssl_storage_port: 7001
native_transport_port: 9042
rpc_port: 9160

# Enable native protocol SSL
native_transport_port_ssl: 9142

# Firewall rules (iptables example):
# allow Cassandra ports from cluster nodes only
# iptables -A INPUT -p tcp --dport 7000 -s 10.0.0.0/24 -j ACCEPT
# iptables -A INPUT -p tcp --dport 7001 -s 10.0.0.0/24 -j ACCEPT
# iptables -A INPUT -p tcp --dport 9042 -s 10.0.0.0/24 -j ACCEPT
```
## Best Practices

### Data Modeling Best Practices

```cql
-- Design tables around queries, not entities
-- Query: get a user's posts ordered by date
CREATE TABLE user_posts (
    user_id UUID,
    post_date DATE,
    post_id TIMEUUID,
    title TEXT,
    content TEXT,
    PRIMARY KEY (user_id, post_date, post_id)
) WITH CLUSTERING ORDER BY (post_date DESC, post_id DESC);

-- Denormalize for performance: instead of joins, duplicate data
CREATE TABLE post_comments (
    post_id UUID,
    comment_id TIMEUUID,
    user_id UUID,
    username TEXT,   -- denormalized from the users table
    comment TEXT,
    created_at TIMESTAMP,
    PRIMARY KEY (post_id, comment_id)
);

-- Use appropriate partition sizes:
-- aim for 10 MB-100 MB per partition, and avoid hotspots
-- with a well-distributed partition key

-- Use time-based partitioning for time-series data
CREATE TABLE sensor_data (
    sensor_id UUID,
    year INT,
    month INT,
    day INT,
    timestamp TIMESTAMP,
    temperature DOUBLE,
    humidity DOUBLE,
    PRIMARY KEY ((sensor_id, year, month, day), timestamp)
);

-- Use counters appropriately
CREATE TABLE page_views (
    page_id TEXT PRIMARY KEY,
    views COUNTER
);

-- Avoid large collections;
-- use separate tables instead of large lists/sets/maps
```
### Query Best Practices

```cql
-- Always include the partition key in the WHERE clause
-- Good
SELECT * FROM user_posts WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;

-- Bad (requires ALLOW FILTERING)
SELECT * FROM user_posts WHERE title = 'Some Title' ALLOW FILTERING;

-- Use clustering columns for range queries
SELECT * FROM user_posts
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000
  AND post_date >= '2023-01-01' AND post_date <= '2023-12-31';

-- Use LIMIT to control result size
SELECT * FROM users LIMIT 1000;

-- Use prepared statements in applications:
-- reduces parsing overhead and prevents injection attacks

-- Avoid SELECT * in production; select only the needed columns
SELECT user_id, username, email FROM users;

-- Use appropriate consistency levels:
-- QUORUM for critical reads/writes
-- ONE for non-critical operations
-- LOCAL_QUORUM for multi-datacenter setups
```
### Performance Best Practices

```cql
-- Choose an appropriate compaction strategy:
-- SizeTieredCompactionStrategy: write-heavy workloads
-- LeveledCompactionStrategy: read-heavy workloads
-- TimeWindowCompactionStrategy: time-series data

ALTER TABLE time_series WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': 1
};

-- Configure appropriate caching
ALTER TABLE frequently_read_table WITH caching = {
    'keys': 'ALL',
    'rows_per_partition': '100'
};

-- Use compression for large tables
ALTER TABLE large_table WITH compression = {
    'class': 'LZ4Compressor',
    'chunk_length_in_kb': 64
};

-- Monitor and tune bloom filters
ALTER TABLE large_table WITH bloom_filter_fp_chance = 0.01;

-- Use TTL for temporary data
INSERT INTO session_data (session_id, user_id, data, created_at)
VALUES (uuid(), 123e4567-e89b-12d3-a456-426614174000, 'session_data', toTimestamp(now()))
USING TTL 3600;  -- 1 hour TTL
```
### Operational Best Practices

```bash
# Regular maintenance:
# run repair regularly (weekly for critical data)
nodetool repair -inc

# Monitor compaction
nodetool compactionstats

# Clean up after topology changes
nodetool cleanup

# Monitor disk space
df -h

# Monitor logs
tail -f /var/log/cassandra/system.log

# Backup strategy:
# take snapshots before major changes
nodetool snapshot -t pre_upgrade_backup

# Monitor cluster health
nodetool status
nodetool describecluster

# Capacity planning:
# - monitor partition sizes
# - plan for 3x growth
# - monitor read/write patterns

# Security practices:
# - change default passwords
# - use SSL/TLS encryption
# - implement proper authentication/authorization
# - apply security updates regularly
# - segment the network
# - monitor access logs
```
### Development Best Practices

```cql
-- Use UUIDs for primary keys
CREATE TABLE orders (
    order_id UUID PRIMARY KEY,
    customer_id UUID,
    order_date TIMESTAMP,
    total_amount DECIMAL
);

-- Use TIMEUUID for time-ordered data
CREATE TABLE events (
    event_id TIMEUUID PRIMARY KEY,
    event_type TEXT,
    event_data TEXT,
    created_at TIMESTAMP
);

-- Handle null values appropriately:
-- Cassandra treats null as "no value";
-- use default values or application logic

-- Use batch operations carefully:
-- only for related data in the same partition,
-- and avoid large batches (>100 statements)

-- Test with realistic data volumes:
-- performance characteristics change with scale;
-- test partition sizes and query patterns

-- Monitor application metrics (see the sketch below):
-- track query latency, connection pool usage,
-- error rates, and timeout exceptions
```
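As a starting point for application-side metrics, a hedged sketch that times individual queries with the Python `cassandra-driver` (the helper name and keyspace are assumptions, not a standard API):

```python
import time
import uuid
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('mykeyspace')  # keyspace name is an assumption

get_user = session.prepare("SELECT user_id, username FROM users WHERE user_id = ?")

def timed_execute(statement, params):
    """Execute a statement and return (rows, elapsed_ms) for app-side latency tracking."""
    start = time.perf_counter()
    rows = session.execute(statement, params)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return rows, elapsed_ms

rows, latency = timed_execute(get_user, (uuid.uuid4(),))
print(f"query took {latency:.1f} ms")
```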
---
## Summary

Apache Cassandra is a highly scalable, distributed NoSQL database designed to handle large volumes of data across many commodity servers without a single point of failure. This cheatsheet covers Cassandra operations from basic administration to advanced features.

**Key strengths:**

- **Scalability**: Linear scalability with no single point of failure
- **High availability**: Multi-datacenter replication with tunable consistency
- **Performance**: Optimized for write-heavy workloads with fast reads
- **Fault tolerance**: Automatic data replication and failure detection
- **Flexible data model**: Wide-column store supporting diverse data types

**Best use cases:**

- Time-series data and IoT applications
- Real-time analytics and logging
- Content management and catalogs
- Messaging and social media platforms
- Applications with high write throughput

**Important considerations:**

- Data modeling requires a query-first approach
- Limited support for complex queries and joins
- The eventual consistency model requires careful consideration
- Operational complexity grows with cluster size
- Requires an understanding of distributed systems concepts

By following the practices and techniques outlined in this cheatsheet, you can design, implement, and maintain Cassandra databases that deliver high performance, scalability, and availability for modern distributed applications.