Apache Superset Cheat Sheet
Overview
Apache Superset is an open-source modern data exploration and visualization platform designed for enterprise analytics. Originally developed at Airbnb, it provides a rich set of data visualizations, an intuitive interface for creating dashboards, and a powerful SQL IDE called SQL Lab. Superset supports over 40 visualization types including time-series charts, geospatial maps, pivot tables, and more.
Superset connects to any SQL-speaking database through SQLAlchemy, including PostgreSQL, MySQL, BigQuery, Snowflake, Redshift, ClickHouse, Trino, Druid, and many others. It features a semantic layer for defining metrics and dimensions, role-based access control, row-level security, Jinja templating in SQL, and an extensible plugin architecture for custom visualizations. Superset scales well for large organizations and supports embedded analytics.
Installation
Docker Compose (Recommended)
# Clone Superset repository
git clone https://github.com/apache/superset.git
cd superset
# Start Superset with Docker Compose
docker compose -f docker-compose-non-dev.yml up -d
# Access at http://localhost:8088
# Default credentials: admin / admin
pip Installation
# Install Superset
pip install apache-superset
# Install database drivers
pip install psycopg2-binary # PostgreSQL
pip install mysqlclient # MySQL
pip install snowflake-sqlalchemy # Snowflake
pip install pybigquery # BigQuery
pip install clickhouse-connect # ClickHouse
pip install trino # Trino
# Initialize the database
superset db upgrade
# Create admin user
superset fab create-admin \
--username admin \
--firstname Admin \
--lastname User \
--email admin@example.com \
--password admin
# Load example data (optional)
superset load_examples
# Initialize roles and permissions
superset init
# Start development server
superset run -h 0.0.0.0 -p 8088 --with-threads --reload --debugger
Kubernetes with Helm
helm repo add superset https://apache.github.io/superset
helm install superset superset/superset \
--namespace analytics \
--create-namespace \
--set service.type=LoadBalancer
Configuration
superset_config.py
import os
# Basic config
SECRET_KEY = os.environ.get('SUPERSET_SECRET_KEY', 'your-secret-key-here')
SQLALCHEMY_DATABASE_URI = 'postgresql://superset:superset@db:5432/superset'
# Feature flags
FEATURE_FLAGS = {
'ENABLE_TEMPLATE_PROCESSING': True,
'DASHBOARD_NATIVE_FILTERS': True,
'DASHBOARD_CROSS_FILTERS': True,
'EMBEDDED_SUPERSET': True,
'ALERT_REPORTS': True,
'ESTIMATE_QUERY_COST': True,
'GLOBAL_ASYNC_QUERIES': True,
}
# Cache configuration (Redis)
CACHE_CONFIG = {
'CACHE_TYPE': 'RedisCache',
'CACHE_DEFAULT_TIMEOUT': 300,
'CACHE_KEY_PREFIX': 'superset_',
'CACHE_REDIS_URL': 'redis://redis:6379/0',
}
DATA_CACHE_CONFIG = {
'CACHE_TYPE': 'RedisCache',
'CACHE_DEFAULT_TIMEOUT': 86400,
'CACHE_KEY_PREFIX': 'superset_data_',
'CACHE_REDIS_URL': 'redis://redis:6379/1',
}
# Celery for async queries and alerts
class CeleryConfig:
broker_url = 'redis://redis:6379/2'
result_backend = 'redis://redis:6379/3'
imports = ('superset.sql_lab', 'superset.tasks')
task_annotations = {
'sql_lab.get_sql_results': {'rate_limit': '100/s'},
}
CELERY_CONFIG = CeleryConfig
# Security
WTF_CSRF_ENABLED = True
TALISMAN_ENABLED = True
SESSION_COOKIE_HTTPONLY = True
SESSION_COOKIE_SECURE = True
# CORS
ENABLE_CORS = True
CORS_OPTIONS = {
'supports_credentials': True,
'allow_headers': ['*'],
'resources': ['*'],
'origins': ['https://app.example.com'],
}
# Row limit
ROW_LIMIT = 50000
SQL_MAX_ROW = 100000
SUPERSET_WEBSERVER_TIMEOUT = 300
# SMTP for alerts/reports
SMTP_HOST = 'smtp.gmail.com'
SMTP_PORT = 587
SMTP_STARTTLS = True
SMTP_SSL = False
SMTP_USER = 'alerts@example.com'
SMTP_PASSWORD = os.environ.get('SMTP_PASSWORD')
SMTP_MAIL_FROM = 'alerts@example.com'
Database Connection Strings
# PostgreSQL
postgresql://user:pass@host:5432/database
# MySQL
mysql://user:pass@host:3306/database
# BigQuery
bigquery://project
# Snowflake
snowflake://user:pass@account.region/database/schema?warehouse=COMPUTE_WH&role=ANALYST
# Redshift
redshift+psycopg2://user:pass@cluster.region.redshift.amazonaws.com:5439/database
# ClickHouse
clickhousedb://user:pass@host:8123/database
# Trino
trino://user@host:8080/catalog/schema
# Apache Druid
druid://host:8888/druid/v2/sql
CLI Commands
| Command | Description |
|---|---|
superset db upgrade | Apply database migrations |
superset init | Initialize roles and permissions |
superset fab create-admin | Create admin user |
superset load_examples | Load example datasets and dashboards |
superset run -p 8088 | Start development server |
superset export-dashboards | Export dashboards to JSON |
superset import-dashboards -p file.json | Import dashboards |
superset export-datasources | Export data source configs |
superset import-datasources -p file.yaml | Import data sources |
superset compute-thumbnails | Generate dashboard thumbnails |
superset version | Show Superset version |
SQL Lab Features
Jinja Templates
-- Current user
SELECT * FROM orders
WHERE salesperson = '{{ current_username() }}'
-- Current user ID
SELECT * FROM dashboards
WHERE owner_id = {{ current_user_id() }}
-- URL parameters
SELECT * FROM sales
WHERE region = '{{ url_param("region", "all") }}'
-- Date macros
SELECT * FROM events
WHERE event_date >= '{{ from_dttm }}'
AND event_date < '{{ to_dttm }}'
-- Filter values (from dashboard filters)
SELECT * FROM orders
WHERE status IN ({{ "'" + "','".join(filter_values('status')) + "'" }})
-- Conditional query
SELECT *
FROM orders
{% if filter_values('region') %}
WHERE region IN ({{ "'" + "','".join(filter_values('region')) + "'" }})
{% endif %}
Virtual Datasets
-- Create a virtual dataset from SQL Lab
-- Save as dataset for use in charts
SELECT
DATE_TRUNC('month', o.created_at) AS month,
c.segment,
COUNT(DISTINCT o.customer_id) AS customers,
SUM(o.amount) AS revenue,
AVG(o.amount) AS avg_order_value
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.created_at >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY 1, 2
API Reference
# Get CSRF token
CSRF_TOKEN=$(curl -s -c cookies.txt http://localhost:8088/api/v1/security/csrf_token/ | jq -r '.result')
# Login
curl -X POST http://localhost:8088/api/v1/security/login \
-H "Content-Type: application/json" \
-d '{"username": "admin", "password": "admin", "provider": "db"}' \
-c cookies.txt
# List dashboards
curl http://localhost:8088/api/v1/dashboard/ \
-b cookies.txt \
-H "X-CSRFToken: $CSRF_TOKEN"
# Export dashboard
curl http://localhost:8088/api/v1/dashboard/export/?q=[1,2] \
-b cookies.txt -o dashboards.zip
# List datasets
curl http://localhost:8088/api/v1/dataset/ \
-b cookies.txt \
-H "X-CSRFToken: $CSRF_TOKEN"
# List charts
curl http://localhost:8088/api/v1/chart/ \
-b cookies.txt \
-H "X-CSRFToken: $CSRF_TOKEN"
Advanced Usage
Row-Level Security
# In superset_config.py or via UI
# Create RLS rule that filters data per user
# Admin > Row Level Security
# Example: Users can only see their region's data
# Clause: region = '{{ current_username() }}'
# Tables: orders, customers, revenue
# Roles: Sales_Team
Custom Visualization Plugin
# Create custom viz plugin
npx @superset-ui/generator-superset create my-chart-plugin
cd my-chart-plugin
npm install
npm run build
# Install in Superset
cd superset-frontend
npm install /path/to/my-chart-plugin
Embedded Dashboards
# superset_config.py
FEATURE_FLAGS = {
'EMBEDDED_SUPERSET': True,
}
# Generate guest token via API
# POST /api/v1/security/guest_token/
# Body: {
# "user": {"username": "guest", "first_name": "Guest", "last_name": "User"},
# "resources": [{"type": "dashboard", "id": "dashboard-uuid"}],
# "rls": [{"clause": "team_id = 42"}]
# }
Troubleshooting
| Issue | Solution |
|---|---|
| Slow dashboard loading | Enable caching with Redis. Optimize SQL queries underlying charts |
| Database connection fails | Check connection string format. Install required database driver |
| Charts not rendering | Clear browser cache. Check browser console for JavaScript errors |
| Permission denied errors | Check role-based access. Verify dataset and dashboard permissions |
| Celery workers not starting | Check Redis connectivity. Verify CELERY_CONFIG in superset_config.py |
| Alerts not sending | Configure SMTP settings. Ensure Celery beat is running |
| Thumbnail generation fails | Install Chrome/Chromium headless. Configure WEBDRIVER_BASEURL |
| Migration errors | Run superset db upgrade after version upgrades. Check database compatibility |
| Memory issues with large queries | Set ROW_LIMIT and SQL_MAX_ROW. Enable async queries with Celery |
| Custom viz not appearing | Rebuild frontend: npm run build. Verify plugin is registered |