Overview
Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo and now an Apache top-level project. It provides a unified messaging model supporting both queueing and streaming use cases with multi-tenancy, geo-replication, and tiered storage built in.
Pulsar separates serving (brokers) from storage (Apache BookKeeper), enabling independent scaling. It supports multiple subscription types (exclusive, shared, failover, key-shared), schema enforcement, message deduplication, and delayed message delivery. Pulsar Functions provide lightweight serverless compute directly on the messaging layer.
Installation
Standalone Mode (Development)
# Download Pulsar
wget https://archive.apache.org/dist/pulsar/pulsar-3.3.0/apache-pulsar-3.3.0-bin.tar.gz
tar xvfz apache-pulsar-3.3.0-bin.tar.gz
cd apache-pulsar-3.3.0
# Start standalone (broker, bookie, and metadata store in a single process)
bin/pulsar standalone
# Verify
bin/pulsar-admin brokers list standalone
Docker
docker run -d --name pulsar \
-p 6650:6650 \
-p 8080:8080 \
apachepulsar/pulsar:3.3.0 \
bin/pulsar standalone
Docker Compose (Full Cluster)
version: '3'
services:
  zookeeper:
    image: apachepulsar/pulsar:3.3.0
    command: bin/pulsar zookeeper
    ports:
      - "2181:2181"
  bookie:
    image: apachepulsar/pulsar:3.3.0
    command: bin/pulsar bookie
    depends_on:
      - zookeeper
  broker:
    image: apachepulsar/pulsar:3.3.0
    command: bin/pulsar broker
    ports:
      - "6650:6650"
      - "8080:8080"
    depends_on:
      - bookie
Core CLI Commands
Tenant and Namespace Management
| Command | Description |
|---|---|
| pulsar-admin tenants list | List all tenants |
| pulsar-admin tenants create my-tenant | Create a tenant |
| pulsar-admin namespaces list my-tenant | List namespaces in a tenant |
| pulsar-admin namespaces create my-tenant/my-ns | Create a namespace |
| pulsar-admin namespaces delete my-tenant/my-ns | Delete a namespace |
| pulsar-admin namespaces policies my-tenant/my-ns | Show namespace policies |
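The same tenant and namespace operations are also available over the broker's admin REST API on the web service port. The sketch below assumes a broker at `http://localhost:8080`; the helper names (`tenants_url`, `namespaces_url`, `namespace_url`, `create_namespace_request`) are hypothetical. It only builds the requests, so nothing is sent until a request is passed to `urllib.request.urlopen`.

```python
from urllib.parse import quote
from urllib.request import Request

ADMIN_BASE = "http://localhost:8080/admin/v2"  # assumed local standalone broker


def tenants_url(base=ADMIN_BASE):
    """URL that lists all tenants (GET)."""
    return f"{base}/tenants"


def namespaces_url(tenant, base=ADMIN_BASE):
    """URL that lists the namespaces of one tenant (GET)."""
    return f"{base}/namespaces/{quote(tenant, safe='')}"


def namespace_url(tenant, namespace, base=ADMIN_BASE):
    """URL of a single namespace (PUT creates it, DELETE removes it)."""
    return f"{namespaces_url(tenant, base)}/{quote(namespace, safe='')}"


def create_namespace_request(tenant, namespace, base=ADMIN_BASE):
    """Build, but do not send, the PUT request that creates a namespace."""
    return Request(namespace_url(tenant, namespace, base), method="PUT")


request = create_namespace_request("my-tenant", "my-ns")
print(request.get_method(), request.full_url)
```

If authentication is enabled on the cluster, the request would also need the appropriate headers, which this sketch leaves out.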
Topic Management
# Create a partitioned topic
bin/pulsar-admin topics create-partitioned-topic \
persistent://my-tenant/my-ns/my-topic -p 4
# List topics
bin/pulsar-admin topics list my-tenant/my-ns
# Get topic stats
bin/pulsar-admin topics stats persistent://my-tenant/my-ns/my-topic
# Peek at messages
bin/pulsar-admin topics peek-messages \
persistent://my-tenant/my-ns/my-topic -s my-sub -n 10
# Skip messages on a subscription
bin/pulsar-admin topics skip \
persistent://my-tenant/my-ns/my-topic -s my-sub -n 100
# Delete a topic
bin/pulsar-admin topics delete persistent://my-tenant/my-ns/my-topic
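Every command above addresses a topic by its fully qualified name, `persistent://tenant/namespace/topic`, and a partitioned topic is backed by internal topics named `<topic>-partition-<n>`. The following sketch splits a name into its parts and lists those partition names; `TopicName`, `parse_topic`, and `partition_names` are hypothetical helpers, not part of any Pulsar client.

```python
from typing import List, NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str
    namespace: str
    topic: str


def parse_topic(name: str) -> TopicName:
    """Split a fully qualified topic name into its four components."""
    domain, sep, rest = name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"not a fully qualified topic name: {name!r}")
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {name!r}")
    return TopicName(domain, *parts)


def partition_names(name: str, partitions: int) -> List[str]:
    """Names of the internal topics behind a partitioned topic."""
    return [f"{name}-partition-{i}" for i in range(partitions)]


topic = parse_topic("persistent://my-tenant/my-ns/my-topic")
print(topic.tenant, topic.namespace, topic.topic)
print(partition_names("persistent://my-tenant/my-ns/my-topic", 4))
```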
Producing and Consuming
# Produce messages
bin/pulsar-client produce persistent://my-tenant/my-ns/my-topic \
-m "Hello Pulsar" -n 10
# Consume messages
bin/pulsar-client consume persistent://my-tenant/my-ns/my-topic \
-s my-subscription -n 10
# Consume with specific subscription type
bin/pulsar-client consume persistent://my-tenant/my-ns/my-topic \
-s my-shared-sub -t Shared -n 0
Configuration
Broker Configuration (conf/broker.conf)
# Cluster name
clusterName=my-cluster
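# Metadata store connection (ZooKeeper); these keys replace the deprecated
# zookeeperServers and configurationStoreServers settings
metadataStoreUrl=zk:zk1:2181,zk2:2181,zk3:2181
configurationMetadataStoreUrl=zk:zk1:2181,zk2:2181,zk3:2181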
# Broker settings
brokerServicePort=6650
webServicePort=8080
# Message retention
defaultRetentionTimeInMinutes=4320
defaultRetentionSizeInMB=10240
# Backlog quota
backlogQuotaDefaultLimitGB=10
backlogQuotaDefaultRetentionPolicy=producer_request_hold
# Deduplication
brokerDeduplicationEnabled=true
# Max message size (5MB default)
maxMessageSize=5242880
# Managed ledger settings
managedLedgerDefaultEnsembleSize=2
managedLedgerDefaultWriteQuorum=2
managedLedgerDefaultAckQuorum=2
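The numeric values above are easier to review in familiar units, and the three managed ledger settings must satisfy ensemble size >= write quorum >= ack quorum. The sketch below parses a fragment of the configuration and checks both; `parse_properties` and `check_quorums` are hypothetical helpers, and `SAMPLE` simply repeats values from the listing above.

```python
def parse_properties(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_quorums(props: dict) -> None:
    """Raise ValueError unless ensemble >= write quorum >= ack quorum >= 1."""
    ensemble = int(props["managedLedgerDefaultEnsembleSize"])
    write = int(props["managedLedgerDefaultWriteQuorum"])
    ack = int(props["managedLedgerDefaultAckQuorum"])
    if not ensemble >= write >= ack >= 1:
        raise ValueError(f"invalid quorums: E={ensemble} Qw={write} Qa={ack}")


SAMPLE = """
defaultRetentionTimeInMinutes=4320
defaultRetentionSizeInMB=10240
maxMessageSize=5242880
managedLedgerDefaultEnsembleSize=2
managedLedgerDefaultWriteQuorum=2
managedLedgerDefaultAckQuorum=2
"""

props = parse_properties(SAMPLE)
check_quorums(props)

retention_days = int(props["defaultRetentionTimeInMinutes"]) / (60 * 24)
retention_gib = int(props["defaultRetentionSizeInMB"]) / 1024
max_message_mib = int(props["maxMessageSize"]) / (1024 * 1024)
print(retention_days, retention_gib, max_message_mib)  # 3.0 10.0 5.0
```

With these values, retention is 3 days or 10 GiB, and the maximum message size is 5 MiB.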
Namespace Policies
# Set retention policy
bin/pulsar-admin namespaces set-retention my-tenant/my-ns \
--size 10G --time 7d
# Set backlog quota
bin/pulsar-admin namespaces set-backlog-quota my-tenant/my-ns \
--limit 10G --policy producer_request_hold
# Set message TTL
bin/pulsar-admin namespaces set-message-ttl my-tenant/my-ns --messageTTL 3600
# Set replication clusters
bin/pulsar-admin namespaces set-clusters my-tenant/my-ns \
--clusters us-east,us-west,eu-central
# Enable deduplication
bin/pulsar-admin namespaces set-deduplication my-tenant/my-ns --enable
# Set schema validation
bin/pulsar-admin namespaces set-schema-validation-enforce \
my-tenant/my-ns --enable
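Retention is stored per namespace as a size in megabytes and a time in minutes, so `--size 10G --time 7d` corresponds to 10240 MB and 10080 minutes. The sketch below does that conversion and builds the JSON body a retention policy would carry. The helper names are hypothetical, the unit tables are a simplified assumption (binary size units, a handful of suffixes), and the CLI's own parser may accept more forms.

```python
import json

# Assumed unit tables for illustration only.
SIZE_UNITS_MB = {"M": 1, "G": 1024, "T": 1024 * 1024}
TIME_UNITS_MIN = {"m": 1, "h": 60, "d": 60 * 24, "w": 60 * 24 * 7}


def size_to_mb(text: str) -> int:
    """Convert a size such as '10G' to megabytes."""
    return int(text[:-1]) * SIZE_UNITS_MB[text[-1].upper()]


def time_to_minutes(text: str) -> int:
    """Convert a duration such as '7d' to minutes."""
    return int(text[:-1]) * TIME_UNITS_MIN[text[-1]]


def retention_payload(size: str, time: str) -> str:
    """JSON body describing a retention policy."""
    return json.dumps({
        "retentionSizeInMB": size_to_mb(size),
        "retentionTimeInMinutes": time_to_minutes(time),
    })


print(retention_payload("10G", "7d"))
```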
Subscription Types
| Type | Description | Use Case |
|---|---|---|
| Exclusive | Single consumer per subscription | Ordered processing |
| Shared | Round-robin across consumers | Parallel processing |
| Failover | Active-standby consumers | High availability |
| Key_Shared | Messages with the same key go to the same consumer | Ordered per-key processing |
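The difference between Shared and Key_Shared is easiest to see on a small example. The simulation below is a toy model, not broker code: `dispatch_shared` and `dispatch_key_shared` are hypothetical functions, and CRC32 stands in for whatever hash the broker actually uses.

```python
import zlib
from collections import defaultdict
from itertools import cycle


def dispatch_shared(messages, consumers):
    """Shared: hand messages to consumers in round-robin order."""
    assignment = defaultdict(list)
    for consumer, message in zip(cycle(consumers), messages):
        assignment[consumer].append(message)
    return dict(assignment)


def dispatch_key_shared(messages, consumers):
    """Key_Shared: every message with the same key goes to the same consumer."""
    assignment = defaultdict(list)
    for key, value in messages:
        # CRC32 is a stand-in hash for illustration, not the broker's algorithm.
        slot = zlib.crc32(key.encode("utf-8")) % len(consumers)
        assignment[consumers[slot]].append((key, value))
    return dict(assignment)


messages = [("user-1", "a"), ("user-1", "b"), ("user-2", "c"), ("user-2", "d")]
consumers = ["c1", "c2"]

# Shared splits user-1's messages across c1 and c2, so per-key order is lost.
print(dispatch_shared(messages, consumers))
# Key_Shared keeps each key on a single consumer.
print(dispatch_key_shared(messages, consumers))
```

Under Shared, throughput scales with the number of consumers but ordering is not preserved; Key_Shared keeps per-key ordering while still spreading keys across consumers.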
Advanced Usage
Pulsar Functions
# Deploy a function
bin/pulsar-admin functions create \
--tenant my-tenant \
--namespace my-ns \
--name my-func \
--inputs persistent://my-tenant/my-ns/input-topic \
--output persistent://my-tenant/my-ns/output-topic \
--jar my-function.jar \
--classname com.example.MyFunction
# List functions
bin/pulsar-admin functions list --tenant my-tenant --namespace my-ns
# Get function status
bin/pulsar-admin functions status \
--tenant my-tenant --namespace my-ns --name my-func
# Delete a function
bin/pulsar-admin functions delete \
--tenant my-tenant --namespace my-ns --name my-func
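The deployment above uses a Java JAR, but the logic of a function can be very small. As an analogous sketch in Python, a function can be a plain `process` callable that receives each input message and returns the value to publish to the output topic. The example is illustrative and is not the `com.example.MyFunction` class referenced above.

```python
def process(input):
    """Append an exclamation mark to each incoming message."""
    return f"{input}!"


print(process("Hello Pulsar"))  # Hello Pulsar!
```

A Python file like this would be deployed with `--py` in place of `--jar`.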
Tiered Storage (Offloading)
# Configure S3 offloader in broker.conf
# managedLedgerOffloadDriver=aws-s3
# s3ManagedLedgerOffloadBucket=pulsar-offload
# s3ManagedLedgerOffloadRegion=us-east-1
# Set offload threshold on namespace
bin/pulsar-admin namespaces set-offload-threshold \
my-tenant/my-ns --size 10G
# Trigger manual offload
bin/pulsar-admin topics offload \
persistent://my-tenant/my-ns/my-topic -s 10G
Geo-Replication
# Enable replication on namespace
bin/pulsar-admin namespaces set-clusters my-tenant/my-ns \
--clusters us-east,eu-west
# Check replication status
bin/pulsar-admin topics stats persistent://my-tenant/my-ns/my-topic \
| jq '.replication'
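The `replication` object in the stats output is keyed by remote cluster, so it is straightforward to summarize in a script. The sketch below reads the `replicationBacklog` and `connected` fields from such an object; the helper names are hypothetical, and `sample` is a hand-written, trimmed example rather than real broker output.

```python
import json


def replication_backlog(stats: dict) -> dict:
    """Map each remote cluster to its replication backlog (in messages)."""
    return {
        cluster: info.get("replicationBacklog", 0)
        for cluster, info in stats.get("replication", {}).items()
    }


def disconnected_clusters(stats: dict) -> list:
    """Remote clusters whose replicator is currently not connected."""
    return sorted(
        cluster
        for cluster, info in stats.get("replication", {}).items()
        if not info.get("connected", False)
    )


# Hand-written sample, trimmed to the fields used here; not real broker output.
sample = json.loads("""
{"replication": {
  "eu-west": {"msgRateOut": 120.5, "replicationBacklog": 42, "connected": true},
  "us-west": {"msgRateOut": 0.0, "replicationBacklog": 9000, "connected": false}
}}
""")

print(replication_backlog(sample))
print(disconnected_clusters(sample))
```

In practice the input would come from `json.load` on the output of the `topics stats` command shown above.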
Monitoring
# Broker metrics endpoint
curl http://localhost:8080/metrics
# Key metrics to watch
# pulsar_broker_topics_count
# pulsar_subscription_back_log
# pulsar_throughput_in / pulsar_throughput_out
# pulsar_storage_size
# pulsar_msg_backlog
# bookkeeper_server_ADD_ENTRY_REQUEST (bookie write latency; exposed by the bookies, not the broker endpoint)
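The metrics endpoint returns Prometheus text format, which can be inspected without a full monitoring stack. As a minimal sketch, the parser below sums `pulsar_msg_backlog` per `topic` label. `backlog_by_topic` is a hypothetical helper, the regular expressions cover only simple sample lines, and `SAMPLE` is made-up data in the expected shape.

```python
import re
from collections import defaultdict

# One sample line: metric name, optional {labels}, then the value.
LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def backlog_by_topic(text: str, metric: str = "pulsar_msg_backlog") -> dict:
    """Sum the given metric per 'topic' label."""
    totals = defaultdict(float)
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # HELP / TYPE comments
        match = LINE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        topic = labels.get("topic", "<no topic label>")
        totals[topic] += float(match.group("value"))
    return dict(totals)


# Made-up sample in Prometheus text format; not real broker output.
SAMPLE = """\
# TYPE pulsar_msg_backlog gauge
pulsar_msg_backlog{cluster="standalone",topic="persistent://my-tenant/my-ns/my-topic"} 150.0
pulsar_msg_backlog{cluster="standalone",topic="persistent://my-tenant/my-ns/other"} 0.0
pulsar_storage_size{cluster="standalone",topic="persistent://my-tenant/my-ns/my-topic"} 1024.0
"""

print(backlog_by_topic(SAMPLE))
```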
Troubleshooting
| Issue | Solution |
|---|---|
| Broker fails to start | Check ZooKeeper connectivity; verify clusterName matches metadata |
| Messages not consumed | Verify subscription exists; check consumer subscription type |
| Backlog growing | Scale consumers; check consumer errors in logs; increase parallelism |
| Topic creation fails | Verify tenant/namespace exists; check authorization |
| High publish latency | Check BookKeeper health; ensure sufficient bookie nodes |
| Out of disk | Configure tiered storage offloading; adjust retention policies |
| Schema compatibility error | Check schema compatibility strategy; use BACKWARD for safe evolution |
| Geo-replication lag | Monitor replication throughput; check cross-region network latency |