Skip to content

Apache Iceberg Cheat Sheet

Overview

Apache Iceberg is an open table format designed for huge analytic datasets. It brings reliability and simplicity to data lakes by providing ACID transactions, full schema evolution, hidden partitioning, and time travel capabilities. Unlike Hive-style tables that rely on directory structures for partitioning, Iceberg tracks individual data files in metadata, enabling features like snapshot isolation, atomic commits, and efficient query planning without listing directories.

Iceberg is engine-agnostic and works with Spark, Trino, Flink, Presto, Hive, Dremio, Snowflake, and other query engines simultaneously. It stores data in open formats (Parquet, ORC, or Avro) on any storage system (S3, GCS, HDFS, Azure Blob). Iceberg’s metadata layer handles partition pruning, column-level statistics, and merge-on-read for updates and deletes. This makes it the preferred table format for building modern lakehouse architectures.

Installation

Spark with Iceberg

# Download Spark with Iceberg runtime
spark-shell \
    --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.0 \
    --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
    --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
    --conf spark.sql.catalog.local.type=hadoop \
    --conf spark.sql.catalog.local.warehouse=/tmp/iceberg-warehouse

# PySpark with Iceberg
pyspark \
    --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.5.0 \
    --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
    --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
    --conf spark.sql.catalog.spark_catalog.type=hive

Trino with Iceberg

# etc/catalog/iceberg.properties
connector.name=iceberg
iceberg.catalog.type=hive_metastore
hive.metastore.uri=thrift://hive-metastore:9083
iceberg.file-format=PARQUET
iceberg.compression-codec=ZSTD
# Download Iceberg Flink runtime
wget https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-flink-runtime-1.18/1.5.0/iceberg-flink-runtime-1.18-1.5.0.jar

# Start Flink SQL client
sql-client.sh -j iceberg-flink-runtime-1.18-1.5.0.jar

Core SQL Operations

Table Management

-- Create a table
CREATE TABLE local.db.orders (
    order_id BIGINT,
    customer_id BIGINT,
    order_date DATE,
    amount DECIMAL(10,2),
    status STRING,
    region STRING
)
USING iceberg
PARTITIONED BY (region, days(order_date))
TBLPROPERTIES (
    'format-version' = '2',
    'write.parquet.compression-codec' = 'zstd'
);

-- Create table as select (CTAS)
CREATE TABLE local.db.orders_backup
USING iceberg
AS SELECT * FROM local.db.orders WHERE order_date >= '2024-01-01';

-- Insert data
INSERT INTO local.db.orders VALUES
    (1, 100, DATE '2024-06-15', 99.99, 'completed', 'us-east'),
    (2, 101, DATE '2024-06-15', 49.50, 'pending', 'us-west');

-- Merge (upsert)
MERGE INTO local.db.orders t
USING staging.updates s
ON t.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET status = s.status, amount = s.amount
WHEN NOT MATCHED THEN INSERT *;

-- Delete
DELETE FROM local.db.orders WHERE status = 'cancelled' AND order_date < '2023-01-01';

-- Update
UPDATE local.db.orders SET status = 'shipped' WHERE order_id = 1;

Schema Evolution

-- Add columns
ALTER TABLE local.db.orders ADD COLUMNS (
    shipping_method STRING,
    discount DECIMAL(5,2)
);

-- Rename a column
ALTER TABLE local.db.orders RENAME COLUMN amount TO order_amount;

-- Change column type (widening only)
ALTER TABLE local.db.orders ALTER COLUMN order_id TYPE BIGINT;

-- Drop a column
ALTER TABLE local.db.orders DROP COLUMN discount;

-- Reorder columns
ALTER TABLE local.db.orders ALTER COLUMN shipping_method AFTER status;

Partitioning

-- Hidden partition transforms
PARTITIONED BY (
    years(event_time),       -- Partition by year
    months(event_time),      -- Partition by month
    days(event_time),        -- Partition by day
    hours(event_time),       -- Partition by hour
    bucket(16, user_id),     -- Hash bucket
    truncate(10, city)       -- Truncate string
)

-- Evolve partitioning without rewriting data
ALTER TABLE local.db.orders ADD PARTITION FIELD days(order_date);
ALTER TABLE local.db.orders DROP PARTITION FIELD region;

Time Travel and Snapshots

-- Query a specific snapshot
SELECT * FROM local.db.orders VERSION AS OF 123456789;

-- Query at a specific timestamp
SELECT * FROM local.db.orders TIMESTAMP AS OF '2024-06-01 00:00:00';

-- View snapshot history
SELECT * FROM local.db.orders.snapshots;

-- View data files
SELECT * FROM local.db.orders.files;

-- View manifest files
SELECT * FROM local.db.orders.manifests;

-- View table history
SELECT * FROM local.db.orders.history;

-- View partition metadata
SELECT * FROM local.db.orders.partitions;

-- Rollback to a previous snapshot
CALL local.system.rollback_to_snapshot('db.orders', 123456789);

-- Rollback to a timestamp
CALL local.system.rollback_to_timestamp('db.orders', TIMESTAMP '2024-06-01 00:00:00');

-- Cherry-pick a snapshot
CALL local.system.cherrypick_snapshot('db.orders', 987654321);

Configuration

Table Properties

ALTER TABLE local.db.orders SET TBLPROPERTIES (
    'format-version' = '2',
    'write.format.default' = 'parquet',
    'write.parquet.compression-codec' = 'zstd',
    'write.target-file-size-bytes' = '134217728',
    'write.metadata.delete-after-commit.enabled' = 'true',
    'write.metadata.previous-versions-max' = '100',
    'read.split.target-size' = '134217728',
    'write.delete.mode' = 'merge-on-read',
    'write.update.mode' = 'merge-on-read',
    'write.merge.mode' = 'merge-on-read'
);

Catalog Configuration (REST Catalog)

# catalog configuration
catalog:
  name: production
  type: rest
  uri: http://iceberg-rest:8181
  warehouse: s3://my-warehouse/iceberg
  s3.endpoint: https://s3.amazonaws.com
  s3.region: us-east-1
  s3.access-key-id: ${AWS_ACCESS_KEY_ID}
  s3.secret-access-key: ${AWS_SECRET_ACCESS_KEY}

Advanced Usage

Table Maintenance

-- Compact small files (rewrite data files)
CALL local.system.rewrite_data_files(
    table => 'db.orders',
    strategy => 'binpack',
    options => map('target-file-size-bytes', '134217728', 'min-input-files', '5')
);

-- Sort-based compaction
CALL local.system.rewrite_data_files(
    table => 'db.orders',
    strategy => 'sort',
    sort_order => 'order_date ASC NULLS LAST, region ASC'
);

-- Expire old snapshots
CALL local.system.expire_snapshots('db.orders', TIMESTAMP '2024-01-01 00:00:00', 100);

-- Remove orphan files
CALL local.system.remove_orphan_files(
    table => 'db.orders',
    older_than => TIMESTAMP '2024-06-01 00:00:00'
);

-- Rewrite manifests for better query planning
CALL local.system.rewrite_manifests('db.orders');

Branching and Tagging

-- Create a branch
ALTER TABLE local.db.orders CREATE BRANCH audit_branch;

-- Create a tag (immutable reference)
ALTER TABLE local.db.orders CREATE TAG `end-of-q2`
    AS OF VERSION 123456789
    RETAIN 180 DAYS;

-- Write to a branch
INSERT INTO local.db.orders.branch_audit_branch VALUES (...);

-- Read from a branch
SELECT * FROM local.db.orders VERSION AS OF 'audit_branch';

-- Fast-forward merge
CALL local.system.fast_forward('db.orders', 'main', 'audit_branch');

PySpark API

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog") \
    .config("spark.sql.catalog.local.type", "hadoop") \
    .config("spark.sql.catalog.local.warehouse", "s3://bucket/warehouse") \
    .getOrCreate()

# Read Iceberg table
df = spark.table("local.db.orders")

# Write to Iceberg table
df.writeTo("local.db.orders").append()

# Overwrite partitions dynamically
df.writeTo("local.db.orders").overwritePartitions()

# Read specific snapshot
spark.read.option("snapshot-id", 123456789).table("local.db.orders")

Troubleshooting

IssueSolution
Small file problemRun rewrite_data_files with binpack strategy. Set target-file-size-bytes
Slow query performanceCheck partition pruning with EXPLAIN. Run rewrite_manifests for better planning
Snapshot accumulationSchedule expire_snapshots regularly. Set write.metadata.previous-versions-max
Schema mismatch errorsCheck column types and order. Use ALTER TABLE for schema evolution
Orphan files consuming storageRun remove_orphan_files periodically. Check for failed writes
Concurrent write conflictsUse format-version=2 for row-level conflict detection. Implement retry logic
OOM during compactionReduce target-file-size-bytes or limit file count per rewrite task
Catalog sync issuesVerify catalog URI and credentials. Check metadata file accessibility
Partition evolution confusionNew partition spec only applies to new writes. Old data keeps original partitioning
Merge-on-read performanceRun compaction to convert delete files to copy-on-write format