Installation
| Platform | Method | Command |
|---|---|---|
| Ubuntu/Debian | Python CLI | `sudo apt-get install python3 python3-pip && pip3 install pdpyras pd-cli` |
| Ubuntu/Debian | Node.js CLI | `curl -fsSL https://deb.nodesource.com/setup_lts.x \| …` |
| macOS | Homebrew (Python) | `brew install python3 && pip3 install pdpyras pd-cli` |
| macOS | Homebrew (Node) | `brew install node && npm install -g pagerduty-cli` |
| Windows | Python | `pip install pdpyras pd-cli` |
| Windows | Chocolatey | `choco install python && pip install pdpyras pd-cli` |
| Any Platform | Docker | `docker pull pagerduty/pdagent` |
| Platform | Command |
|---|---|
| Ubuntu/Debian | `curl -s https://packages.pagerduty.com/GPG-KEY-pagerduty \| …` |
| RHEL/CentOS | `sudo rpm --import https://packages.pagerduty.com/GPG-KEY-pagerduty && sudo yum install pdagent` |
| Start Agent | `sudo systemctl start pdagent && sudo systemctl enable pdagent` |
Basic Commands
Authentication & Setup
| Command | Description |
|---|---|
| `pd login` | Authenticate and configure API token interactively |
| `export PDTOKEN=your_api_token` | Set API token via environment variable |
| `pd rest:get /users/me` | Test authentication and get current user info |
| `pd user:set user@example.com` | Set default user for operations |
Incident Management
| Command | Description |
|---|---|
| `pd incident:list` | List all incidents |
| `pd incident:list --status triggered` | List only triggered (active) incidents |
| `pd incident:list --status acknowledged` | List acknowledged incidents |
| `pd incident:get --id INCIDENT_ID` | Get detailed information about a specific incident |
| `pd incident:ack --id INCIDENT_ID` | Acknowledge an incident |
| `pd incident:resolve --id INCIDENT_ID` | Resolve an incident |
| `pd incident:notes --id INCIDENT_ID --note "Message"` | Add a note to an incident |
| `pd incident:reassign --id INCIDENT_ID --user user@example.com` | Reassign an incident to a different user |
| `pd incident:priority --id INCIDENT_ID --priority P1` | Set incident priority (P1-P5) |
| `pd incident:snooze --id INCIDENT_ID --duration 3600` | Snooze an incident for the given number of seconds |
Service Management
| Command | Description |
|---|---|
| `pd service:list` | List all services |
| `pd service:get --id SERVICE_ID` | Get service details |
| `pd service:disable --id SERVICE_ID` | Disable a service |
| `pd service:enable --id SERVICE_ID` | Enable a service |
| `pd service:integration:list --service-id SERVICE_ID` | List integrations for a service |
User & On-Call Management
| Command | Description |
|---|---|
| `pd user:list` | List all users in the account |
| `pd user:get --id USER_ID` | Get user details |
| `pd oncall:list` | List current on-call users |
| `pd user:contact:list --user-id USER_ID` | List a user's contact methods |
| `pd user:notification:list --user-id USER_ID` | List a user's notification rules |
PagerDuty Agent Commands
| Command | Description |
|---|---|
| `pd-send -k KEY -t trigger -d "Description"` | Trigger a new incident via the agent |
| `pd-send -k KEY -t acknowledge -i incident_key` | Acknowledge an incident via the agent |
| `pd-send -k KEY -t resolve -i incident_key` | Resolve an incident via the agent |
| `sudo systemctl status pdagent` | Check agent service status |
| `sudo journalctl -u pdagent -f` | View agent logs in real time |
Advanced Usage
Advanced Incident Operations
| Command | Description |
|---|---|
| `pd incident:create --title "Issue" --service-id SID --urgency high --priority P1` | Create an incident with full details |
| `pd incident:merge --source-ids ID1,ID2 --target-id MAIN_ID` | Merge multiple incidents into one |
| `pd incident:list --since 2024-01-01T00:00:00Z --until 2024-01-31T23:59:59Z` | List incidents within a date range |
| `pd incident:list --service-ids SID1,SID2 --urgencies high` | Filter incidents by service and urgency |
| `pd incident:list --json \| jq -r '.incidents[].id'` | Extract incident IDs from JSON output |
| `pd incident:list --status triggered --json \| jq -r '.incidents[].id' \| …` | |
REST API Operations (curl)
| Command | Description |
|---|---|
| `curl -X GET "https://api.pagerduty.com/incidents" -H "Authorization: Token token=$PDTOKEN" -H "Accept: application/vnd.pagerduty+json;version=2"` | List incidents via REST API |
| `curl -X POST "https://api.pagerduty.com/incidents" -H "Authorization: Token token=$PDTOKEN" -H "Content-Type: application/json" -H "From: user@example.com" -d '{"incident":{...}}'` | Create incident via REST API |
| `curl -X GET "https://api.pagerduty.com/oncalls" -H "Authorization: Token token=$PDTOKEN"` | Get current on-call entries via API |
| `curl -X PUT "https://api.pagerduty.com/incidents/$ID" -H "Authorization: Token token=$PDTOKEN" -H "Content-Type: application/json" -H "From: user@example.com" -d '{"incident":{"type":"incident_reference","status":"resolved"}}'` | Update incident status via API (the `From` header must be a valid user email) |
Advanced Agent Operations
| Command | Description |
|---|---|
| `pd-send -k KEY -t trigger -d "High CPU" -s error -i key123` | Send alert with severity and incident key |
| `pd-send -k KEY -t trigger -d "Alert" -f severity=critical -f host=web01` | Send alert with custom fields |
| `echo '{"routing_key":"KEY","event_action":"trigger","payload":{"summary":"Alert","severity":"error","source":"web01"}}' \| curl -X POST https://events.pagerduty.com/v2/enqueue -H "Content-Type: application/json" -d @-` | Send an event directly to the Events API v2 (`source` is a required payload field) |
Schedule Management
| Command | Description |
|---|---|
| `pd schedule:list` | List all schedules |
| `pd schedule:show --id SCHEDULE_ID` | Show schedule details with on-call users |
| `pd schedule:override --id SCHEDULE_ID --user USER_ID --start START --end END` | Create a schedule override |
Escalation Policy Management
| Command | Description |
|---|---|
| `pd escalation:list` | List all escalation policies |
| `pd escalation:get --id EP_ID` | Get escalation policy details |
Analytics & Reporting
| Command | Description |
|---|---|
| `pd analytics:incidents --since 2024-01-01 --until 2024-01-31` | Get incident analytics for a date range |
| `pd incident:list --json \| jq '[.incidents[] \| …` | |
Configuration
Environment Variables
# Set API token
export PDTOKEN="your_api_token_here"
# Set default region (for EU accounts)
export PD_API_BASE="https://api.eu.pagerduty.com"
# Set default user email
export PD_USER_EMAIL="user@example.com"
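The same variables can drive plain `curl` calls against the REST API. The following is a minimal sketch, assuming `PD_API_BASE` should fall back to the US endpoint when it is unset; `pd_api_url` and `pd_api_get` are hypothetical helper names and are not part of any PagerDuty tool.

```shell
# Hypothetical helpers (not part of the pd CLI): build REST API requests
# from the environment variables shown above.

# Print the full URL for an API path. Falls back to the US endpoint
# when PD_API_BASE is unset or empty (an assumption of this sketch).
pd_api_url() {
  local base="${PD_API_BASE:-https://api.pagerduty.com}"
  local path="${1#/}"            # drop one leading slash, if present
  printf '%s/%s\n' "${base%/}" "$path"
}

# GET a path using the same headers as the curl examples in this document.
# Returns non-zero without calling curl when PDTOKEN is missing.
pd_api_get() {
  if [ -z "${PDTOKEN:-}" ]; then
    echo "PDTOKEN is not set" >&2
    return 1
  fi
  curl -fsS "$(pd_api_url "$1")" \
    -H "Authorization: Token token=$PDTOKEN" \
    -H "Accept: application/vnd.pagerduty+json;version=2"
}
```

For example, `pd_api_get /users/me` would perform the same authentication check as `pd rest:get /users/me`, against whichever region `PD_API_BASE` selects.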
API Token Generation
- Log in to the PagerDuty web interface
- Navigate to Configuration → API Access
- Click Create New API Key
- Choose User Token or Account Token
- Copy the token and store it securely
Integration Keys
# Integration keys are service-specific
# Find them at: Service → Integrations → Integration Key
# Use in agent:
pd-send -k "your_integration_key" -t trigger -d "Alert message"
# Use in Events API v2:
curl -X POST https://events.pagerduty.com/v2/enqueue \
  -H "Content-Type: application/json" \
  -d '{
    "routing_key": "your_integration_key",
    "event_action": "trigger",
    "payload": {
      "summary": "Server down",
      "severity": "critical",
      "source": "prod-server-01"
    }
  }'
# Agent config location: /etc/pdagent.conf
# View current configuration
cat /etc/pdagent.conf
# Common settings:
# - pid_file: /var/run/pdagent/pdagent.pid
# - log_dir: /var/log/pdagent
# - outqueue_dir: /var/lib/pdagent/outqueue
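The Events API v2 request earlier in this section uses a hand-written JSON body. When the summary comes from a monitoring script, it may help to build the body in a function and reject unknown severities before sending. This is a sketch only: `json_escape` and `build_trigger_event` are hypothetical names, and the escaping covers only backslashes and double quotes, so input with newlines or other control characters is out of scope.

```shell
# Hypothetical helper: escape backslashes and double quotes for use inside
# a JSON string. Control characters are NOT handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Hypothetical helper: print an Events API v2 trigger payload.
# Arguments: routing_key summary severity source
build_trigger_event() {
  case "$3" in
    critical|error|warning|info) ;;   # severities listed in this document
    *) echo "invalid severity: $3" >&2; return 1 ;;
  esac
  printf '{"routing_key":"%s","event_action":"trigger","payload":{"summary":"%s","severity":"%s","source":"%s"}}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$3" "$(json_escape "$4")"
}
```

The output can be piped to the endpoint shown above, for example `build_trigger_event "$KEY" "Server down" critical prod-server-01 | curl -X POST https://events.pagerduty.com/v2/enqueue -H "Content-Type: application/json" -d @-`.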
Service Configuration Example
{
  "service": {
    "name": "Production API",
    "description": "Main production API service",
    "escalation_policy": {
      "id": "ESCALATION_POLICY_ID",
      "type": "escalation_policy_reference"
    },
    "alert_creation": "create_alerts_and_incidents",
    "incident_urgency_rule": {
      "type": "constant",
      "urgency": "high"
    },
    "auto_resolve_timeout": 14400,
    "acknowledgement_timeout": 1800
  }
}
Common Use Cases
Use Case 1: Trigger and Resolve Incident from Monitoring
# Trigger incident when issue detected
pd-send -k R0123456789ABCDEF0123456789ABCDEF \
  -t trigger \
  -d "Database connection pool exhausted" \
  -s critical \
  -i db_pool_incident_001
# Add context as incident develops (same incident key, so no new incident is created)
pd-send -k R0123456789ABCDEF0123456789ABCDEF \
  -t trigger \
  -d "Connection count: 500/500" \
  -i db_pool_incident_001
# Resolve when fixed
pd-send -k R0123456789ABCDEF0123456789ABCDEF \
  -t resolve \
  -i db_pool_incident_001
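The example above relies on every call passing the same `-i` value. One way to keep keys consistent across scripts is to derive them from stable inputs such as the host and the check name. A minimal sketch, where `make_incident_key` is a hypothetical helper and the naming scheme is only an illustration:

```shell
# Hypothetical helper: derive a stable incident key from a host and a
# check name, so repeated alerts for the same problem share one key.
make_incident_key() {
  local key
  # Lowercase the input, then replace every character outside [a-z0-9]
  # with an underscore.
  key=$(printf '%s_%s' "$1" "$2" \
    | tr '[:upper:]' '[:lower:]' \
    | LC_ALL=C tr -c 'a-z0-9' '_')
  printf '%s\n' "$key"
}
```

A monitoring script could then call `pd-send -k KEY -t trigger -d "Pool exhausted" -i "$(make_incident_key "$(hostname)" db_pool)"` and use the same expression when resolving.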
Use Case 2: Check Who’s On-Call Before Deployment
# Get current on-call engineers
pd oncall:list --json | jq -r '.oncalls[] | "\(.escalation_policy.summary): \(.user.summary)"'
# Get on-call for specific escalation policy
pd oncall:list --escalation-policy-ids EP123456 --json | jq -r '.oncalls[].user.summary'
# Check schedule for next 7 days (GNU date syntax; on macOS use: date -u -v+7d +%Y-%m-%dT%H:%M:%SZ)
pd schedule:show --id SCHEDULE_ID --since "$(date -u +%Y-%m-%dT%H:%M:%SZ)" --until "$(date -u -d '+7 days' +%Y-%m-%dT%H:%M:%SZ)"
Use Case 3: Bulk Incident Management During Outage
# Get all triggered incidents for a service
INCIDENTS=$(pd incident:list --service-ids SERVICE_ID --status triggered --json | jq -r '.incidents[].id')
# Acknowledge all incidents
echo "$INCIDENTS" | xargs -I {} pd incident:ack --id {}
# Add note to all incidents
echo "$INCIDENTS" | xargs -I {} pd incident:notes --id {} --note "Mass outage - investigating root cause"
# Resolve all incidents after fix
echo "$INCIDENTS" | xargs -I {} pd incident:resolve --id {}
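Bulk changes during an outage are easy to get wrong, so a dry-run mode can be worth having. The sketch below wraps the `pd incident:ack` and `pd incident:resolve` commands as they are written in this document; `bulk_incident_action` and the `DRY_RUN` variable are hypothetical names introduced for illustration.

```shell
# Hypothetical wrapper: apply one pd incident action to every incident ID
# read from stdin. With DRY_RUN=1 the commands are printed, not executed.
bulk_incident_action() {
  local action="$1" id
  case "$action" in
    ack|resolve) ;;
    *) echo "unsupported action: $action" >&2; return 1 ;;
  esac
  while IFS= read -r id; do
    [ -n "$id" ] || continue          # skip blank lines (e.g. empty result)
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "pd incident:$action --id $id"
    else
      # </dev/null keeps pd from consuming the remaining IDs on stdin
      pd "incident:$action" --id "$id" </dev/null || echo "failed: $id" >&2
    fi
  done
  return 0
}
```

Running `echo "$INCIDENTS" | DRY_RUN=1 bulk_incident_action ack` first shows which incidents would be touched; dropping `DRY_RUN=1` performs the action.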
Use Case 4: Create Incident with Conference Bridge
# Create high-priority incident with Zoom link
curl -X POST "https://api.pagerduty.com/incidents" \
  -H "Authorization: Token token=$PDTOKEN" \
  -H "Content-Type: application/json" \
  -H "Accept: application/vnd.pagerduty+json;version=2" \
  -H "From: oncall@example.com" \
  -d '{
    "incident": {
      "type": "incident",
      "title": "Production database outage",
      "service": {
        "id": "SERVICE_ID",
        "type": "service_reference"
      },
      "urgency": "high",
      "priority": {
        "id": "PRIORITY_P1_ID",
        "type": "priority_reference"
      },
      "body": {
        "type": "incident_body",
        "details": "Primary database cluster unresponsive"
      },
      "conference_bridge": {
        "conference_number": "https://zoom.us/j/1234567890",
        "conference_url": "https://zoom.us/j/1234567890"
      }
    }
  }'
Use Case 5: Generate Weekly Incident Report
# Get incidents from last week (GNU date syntax; on macOS use: date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)
LAST_WEEK=$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ)
NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
pd incident:list --since "$LAST_WEEK" --until "$NOW" --json | \
  jq -r '.incidents[] | [.created_at, .urgency, .status, .title] | @csv' > weekly_incidents.csv
# Count incidents by service
pd incident:list --since "$LAST_WEEK" --until "$NOW" --json | \
  jq -r '.incidents[] | .service.summary' | sort | uniq -c | sort -rn
# Calculate mean time to acknowledge
# Incidents without both timestamps are skipped so fromdateiso8601 never sees null
pd incident:list --since "$LAST_WEEK" --until "$NOW" --json | \
  jq '[.incidents[]
       | select(.status == "resolved")
       | select(.first_trigger_log_entry.created_at != null and .acknowledgements[0].at != null)
       | (.acknowledgements[0].at | fromdateiso8601) - (.first_trigger_log_entry.created_at | fromdateiso8601)]
      | if length > 0 then add / length / 60 else null end' # Result in minutes
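The `date -d '7 days ago'` calls above use GNU syntax and fail on macOS, which is one of the platforms this document covers. A portable alternative is to compute the timestamp from epoch seconds and try both flag styles. This is a sketch under the assumption that `date +%s` is available; `iso_days_ago` is a hypothetical helper name.

```shell
# Hypothetical helper: print the UTC timestamp N days before now in the
# ISO 8601 form used above. Tries GNU/BusyBox syntax (-d @epoch) first,
# then BSD/macOS syntax (-r epoch).
iso_days_ago() {
  local days="$1" now target fmt='+%Y-%m-%dT%H:%M:%SZ'
  now=$(date -u +%s)
  target=$((now - days * 86400))
  date -u -d "@$target" "$fmt" 2>/dev/null || date -u -r "$target" "$fmt"
}
```

With it, the report window becomes `pd incident:list --since "$(iso_days_ago 7)" --until "$(iso_days_ago 0)" --json` on either platform.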
Best Practices
- Use incident keys for deduplication: Always provide consistent incident keys (`-i` flag) to prevent duplicate alerts for the same issue
- Set appropriate urgencies: Use `high` urgency for critical production issues and `low` for non-urgent notifications to avoid alert fatigue
- Leverage auto-resolution: Configure services with `auto_resolve_timeout` to automatically resolve incidents that stay open longer than the configured number of seconds
- Implement escalation policies: Create multi-level escalation policies to ensure incidents reach someone who can respond
- Add context to incidents: Include relevant details in incident descriptions, notes, and custom fields to speed up resolution
- Use schedule overrides: Plan for vacations and schedule changes by creating overrides rather than modifying base schedules
- Tag and categorize incidents: Use consistent tagging for incidents to enable better reporting and trend analysis
- Test integrations regularly: Send test alerts to verify monitoring integrations are working correctly
- Review incident analytics: Regularly analyze MTTA (Mean Time to Acknowledge) and MTTR (Mean Time to Resolve) metrics
- Document runbooks: Link incidents to runbooks and documentation to help responders quickly resolve common issues
- Use status pages: Keep stakeholders informed by connecting incidents to status pages for transparent communication
Troubleshooting
| Issue | Solution |
|---|---|
| Authentication fails with “Invalid token” | Verify token with `pd rest:get /users/me`. Generate new token at Configuration → API Access. Ensure token has correct permissions. |
| Agent not sending events | Check agent status: `sudo systemctl status pdagent`. View logs: `sudo journalctl -u pdagent -f`. Verify integration key is correct. Test connectivity: `curl https://events.pagerduty.com/health` |
| Incidents not triggering | Verify service is enabled: `pd service:get --id SERVICE_ID`. Check integration key matches. Ensure service has valid escalation policy assigned. |
| No notifications received | Check user contact methods: `pd user:contact:list --user-id USER_ID`. Verify notification rules: `pd user:notification:list --user-id USER_ID`. Test contact method in PagerDuty UI. |
| CLI returns “Service Unavailable” | Check PagerDuty status at status.pagerduty.com. Verify API endpoint (use `https://api.eu.pagerduty.com` for EU accounts). Check network connectivity and firewall rules. |
| Duplicate incidents created | Use consistent incident keys with the `-i` flag. Configure alert grouping in service settings. Set appropriate deduplication time windows. |
| Schedule shows wrong on-call person | Verify timezone settings in schedule configuration. Check for active overrides: `pd schedule:show --id SCHEDULE_ID`. Ensure schedule layers are configured correctly. |
| API rate limit exceeded | Implement exponential backoff in scripts. Use bulk operations where possible. Cache frequently accessed data. Check rate limit headers in API responses. |
| Events API v2 returns 400 error | Validate JSON payload structure. Ensure `routing_key` (not `integration_key`) is used. Check required fields: `summary`, `severity`, `source`. Verify `event_action` is valid (`trigger`/`acknowledge`/`resolve`). |
| Cannot resolve incident | Check if incident is already resolved. Verify user has permissions to resolve. Ensure incident ID is correct. Try via web UI to rule out API issues. |
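The rate-limit row above recommends exponential backoff. A minimal sketch of that idea for shell scripts follows; `backoff_delay`, `retry_with_backoff`, and the `BACKOFF_BASE` variable are hypothetical names, and the wrapper retries on any non-zero exit status, not only on rate-limit responses.

```shell
# Hypothetical helper: seconds to wait before retry number N (0-based),
# doubling from BASE (default 1) and capped at CAP (default 60).
backoff_delay() {
  local attempt="$1" base="${2:-1}" cap="${3:-60}" delay
  delay="$base"
  while [ "$attempt" -gt 0 ] && [ "$delay" -lt "$cap" ]; do
    delay=$((delay * 2))
    attempt=$((attempt - 1))
  done
  if [ "$delay" -gt "$cap" ]; then
    delay="$cap"
  fi
  echo "$delay"
}

# Hypothetical wrapper: run a command up to MAX_TRIES times, sleeping for
# backoff_delay seconds between attempts.
retry_with_backoff() {
  local max_tries="$1" attempt=0
  shift
  until "$@"; do
    attempt=$((attempt + 1))
    if [ "$attempt" -ge "$max_tries" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    sleep "$(backoff_delay $((attempt - 1)) "${BACKOFF_BASE:-1}")"
  done
}
```

Combined with `curl -f`, which exits non-zero on HTTP error responses, a call might look like `retry_with_backoff 5 curl -fsS "https://api.pagerduty.com/incidents" -H "Authorization: Token token=$PDTOKEN"`. A production script would likely inspect the status code and retry only on rate-limit or server errors.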
Quick Reference: Event Severity Levels
| Severity | Use Case |
|---|---|
| `critical` | Service outage, data loss, security breach |
| `error` | Service degradation, failed jobs, errors affecting users |
| `warning` | Potential issues, threshold breaches, degraded performance |
| `info` | Informational events, successful deployments, routine notifications |
Quick Reference: Incident Priorities
| Priority | Response Time | Use Case |
|---|---|---|
| P1 | Immediate | Complete service outage, critical security incident |
| P2 | < 30 minutes | Major feature broken, significant performance degradation |
| P3 | < 2 hours | Minor feature issues, isolated customer impact |
| P4 | < 8 hours | Small bugs, cosmetic issues |