Skip to content

Apache Superset Cheat Sheet

Overview

Apache Superset is an open-source modern data exploration and visualization platform designed for enterprise analytics. Originally developed at Airbnb, it provides a rich set of data visualizations, an intuitive interface for creating dashboards, and a powerful SQL IDE called SQL Lab. Superset supports over 40 visualization types including time-series charts, geospatial maps, pivot tables, and more.

Superset connects to any SQL-speaking database through SQLAlchemy, including PostgreSQL, MySQL, BigQuery, Snowflake, Redshift, ClickHouse, Trino, Druid, and many others. It features a semantic layer for defining metrics and dimensions, role-based access control, row-level security, Jinja templating in SQL, and an extensible plugin architecture for custom visualizations. Superset scales well for large organizations and supports embedded analytics.

Installation

# Clone Superset repository
git clone https://github.com/apache/superset.git
cd superset

# Start Superset with Docker Compose
docker compose -f docker-compose-non-dev.yml up -d

# Access at http://localhost:8088
# Default credentials: admin / admin

pip Installation

# Install Superset
pip install apache-superset

# Install database drivers
pip install psycopg2-binary    # PostgreSQL
pip install mysqlclient        # MySQL
pip install snowflake-sqlalchemy   # Snowflake
pip install pybigquery         # BigQuery
pip install clickhouse-connect # ClickHouse
pip install trino              # Trino

# Initialize the database
superset db upgrade

# Create admin user
superset fab create-admin \
    --username admin \
    --firstname Admin \
    --lastname User \
    --email admin@example.com \
    --password admin

# Load example data (optional)
superset load_examples

# Initialize roles and permissions
superset init

# Start development server
superset run -h 0.0.0.0 -p 8088 --with-threads --reload --debugger

Kubernetes with Helm

helm repo add superset https://apache.github.io/superset
helm install superset superset/superset \
    --namespace analytics \
    --create-namespace \
    --set service.type=LoadBalancer

Configuration

superset_config.py

import os

# Basic config
SECRET_KEY = os.environ.get('SUPERSET_SECRET_KEY', 'your-secret-key-here')
SQLALCHEMY_DATABASE_URI = 'postgresql://superset:superset@db:5432/superset'

# Feature flags
FEATURE_FLAGS = {
    'ENABLE_TEMPLATE_PROCESSING': True,
    'DASHBOARD_NATIVE_FILTERS': True,
    'DASHBOARD_CROSS_FILTERS': True,
    'EMBEDDED_SUPERSET': True,
    'ALERT_REPORTS': True,
    'ESTIMATE_QUERY_COST': True,
    'GLOBAL_ASYNC_QUERIES': True,
}

# Cache configuration (Redis)
CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': 300,
    'CACHE_KEY_PREFIX': 'superset_',
    'CACHE_REDIS_URL': 'redis://redis:6379/0',
}

DATA_CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': 86400,
    'CACHE_KEY_PREFIX': 'superset_data_',
    'CACHE_REDIS_URL': 'redis://redis:6379/1',
}

# Celery for async queries and alerts
class CeleryConfig:
    broker_url = 'redis://redis:6379/2'
    result_backend = 'redis://redis:6379/3'
    imports = ('superset.sql_lab', 'superset.tasks')
    task_annotations = {
        'sql_lab.get_sql_results': {'rate_limit': '100/s'},
    }

CELERY_CONFIG = CeleryConfig

# Security
WTF_CSRF_ENABLED = True
TALISMAN_ENABLED = True
SESSION_COOKIE_HTTPONLY = True
SESSION_COOKIE_SECURE = True

# CORS
ENABLE_CORS = True
CORS_OPTIONS = {
    'supports_credentials': True,
    'allow_headers': ['*'],
    'resources': ['*'],
    'origins': ['https://app.example.com'],
}

# Row limit
ROW_LIMIT = 50000
SQL_MAX_ROW = 100000
SUPERSET_WEBSERVER_TIMEOUT = 300

# SMTP for alerts/reports
SMTP_HOST = 'smtp.gmail.com'
SMTP_PORT = 587
SMTP_STARTTLS = True
SMTP_SSL = False
SMTP_USER = 'alerts@example.com'
SMTP_PASSWORD = os.environ.get('SMTP_PASSWORD')
SMTP_MAIL_FROM = 'alerts@example.com'

Database Connection Strings

# PostgreSQL
postgresql://user:pass@host:5432/database

# MySQL
mysql://user:pass@host:3306/database

# BigQuery
bigquery://project

# Snowflake
snowflake://user:pass@account.region/database/schema?warehouse=COMPUTE_WH&role=ANALYST

# Redshift
redshift+psycopg2://user:pass@cluster.region.redshift.amazonaws.com:5439/database

# ClickHouse
clickhousedb://user:pass@host:8123/database

# Trino
trino://user@host:8080/catalog/schema

# Apache Druid
druid://host:8888/druid/v2/sql

CLI Commands

CommandDescription
superset db upgradeApply database migrations
superset initInitialize roles and permissions
superset fab create-adminCreate admin user
superset load_examplesLoad example datasets and dashboards
superset run -p 8088Start development server
superset export-dashboardsExport dashboards to JSON
superset import-dashboards -p file.jsonImport dashboards
superset export-datasourcesExport data source configs
superset import-datasources -p file.yamlImport data sources
superset compute-thumbnailsGenerate dashboard thumbnails
superset versionShow Superset version

SQL Lab Features

Jinja Templates

-- Current user
SELECT * FROM orders
WHERE salesperson = '{{ current_username() }}'

-- Current user ID
SELECT * FROM dashboards
WHERE owner_id = {{ current_user_id() }}

-- URL parameters
SELECT * FROM sales
WHERE region = '{{ url_param("region", "all") }}'

-- Date macros
SELECT * FROM events
WHERE event_date >= '{{ from_dttm }}'
  AND event_date < '{{ to_dttm }}'

-- Filter values (from dashboard filters)
SELECT * FROM orders
WHERE status IN ({{ "'" + "','".join(filter_values('status')) + "'" }})

-- Conditional query
SELECT *
FROM orders
{% if filter_values('region') %}
WHERE region IN ({{ "'" + "','".join(filter_values('region')) + "'" }})
{% endif %}

Virtual Datasets

-- Create a virtual dataset from SQL Lab
-- Save as dataset for use in charts
SELECT
    DATE_TRUNC('month', o.created_at) AS month,
    c.segment,
    COUNT(DISTINCT o.customer_id) AS customers,
    SUM(o.amount) AS revenue,
    AVG(o.amount) AS avg_order_value
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.created_at >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY 1, 2

API Reference

# Get CSRF token
CSRF_TOKEN=$(curl -s -c cookies.txt http://localhost:8088/api/v1/security/csrf_token/ | jq -r '.result')

# Login
curl -X POST http://localhost:8088/api/v1/security/login \
    -H "Content-Type: application/json" \
    -d '{"username": "admin", "password": "admin", "provider": "db"}' \
    -c cookies.txt

# List dashboards
curl http://localhost:8088/api/v1/dashboard/ \
    -b cookies.txt \
    -H "X-CSRFToken: $CSRF_TOKEN"

# Export dashboard
curl http://localhost:8088/api/v1/dashboard/export/?q=[1,2] \
    -b cookies.txt -o dashboards.zip

# List datasets
curl http://localhost:8088/api/v1/dataset/ \
    -b cookies.txt \
    -H "X-CSRFToken: $CSRF_TOKEN"

# List charts
curl http://localhost:8088/api/v1/chart/ \
    -b cookies.txt \
    -H "X-CSRFToken: $CSRF_TOKEN"

Advanced Usage

Row-Level Security

# In superset_config.py or via UI
# Create RLS rule that filters data per user
# Admin > Row Level Security

# Example: Users can only see their region's data
# Clause: region = '{{ current_username() }}'
# Tables: orders, customers, revenue
# Roles: Sales_Team

Custom Visualization Plugin

# Create custom viz plugin
npx @superset-ui/generator-superset create my-chart-plugin
cd my-chart-plugin
npm install
npm run build

# Install in Superset
cd superset-frontend
npm install /path/to/my-chart-plugin

Embedded Dashboards

# superset_config.py
FEATURE_FLAGS = {
    'EMBEDDED_SUPERSET': True,
}

# Generate guest token via API
# POST /api/v1/security/guest_token/
# Body: {
#   "user": {"username": "guest", "first_name": "Guest", "last_name": "User"},
#   "resources": [{"type": "dashboard", "id": "dashboard-uuid"}],
#   "rls": [{"clause": "team_id = 42"}]
# }

Troubleshooting

IssueSolution
Slow dashboard loadingEnable caching with Redis. Optimize SQL queries underlying charts
Database connection failsCheck connection string format. Install required database driver
Charts not renderingClear browser cache. Check browser console for JavaScript errors
Permission denied errorsCheck role-based access. Verify dataset and dashboard permissions
Celery workers not startingCheck Redis connectivity. Verify CELERY_CONFIG in superset_config.py
Alerts not sendingConfigure SMTP settings. Ensure Celery beat is running
Thumbnail generation failsInstall Chrome/Chromium headless. Configure WEBDRIVER_BASEURL
Migration errorsRun superset db upgrade after version upgrades. Check database compatibility
Memory issues with large queriesSet ROW_LIMIT and SQL_MAX_ROW. Enable async queries with Celery
Custom viz not appearingRebuild frontend: npm run build. Verify plugin is registered