CloudQuery - Cloud Asset Inventory as SQL Cheatsheet

CloudQuery is an open-source, plugin-based data movement framework that extracts configuration from cloud providers and SaaS APIs (AWS, GCP, Azure, Kubernetes, GitHub, and more) and loads it into a destination — most commonly PostgreSQL — so you can query your entire infrastructure with SQL. Security and platform teams use it for asset inventory, posture management, compliance evidence, and answering “what do we actually have running?” with a query instead of a console click-through.

Architecture

Component	Role
Source plugin	Pulls data from an API (aws, gcp, azure, k8s, github, …)
Destination plugin	Writes data to a store (postgresql, bigquery, sqlite, file, …)
Sync	One run that extracts from sources and loads to destinations
Config	YAML files describing sources and destinations

Installation

Method	Command
Homebrew	`brew install cloudquery/tap/cloudquery`
Script	`curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_amd64 -o cloudquery && chmod +x cloudquery`
Docker	`docker run ghcr.io/cloudquery/cloudquery:latest`
Verify	`cloudquery --version`

Configuration

# aws-to-postgres.yaml
kind: source
spec:
  name: aws
  path: cloudquery/aws
  version: "VERSION"
  destinations: ["postgresql"]
  tables: ["aws_ec2_instances", "aws_s3_buckets", "aws_iam_*"]
---
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  version: "VERSION"
  spec:
    connection_string: "postgresql://user:pass@localhost:5432/cq"

Core Commands

Command	Description
`cloudquery sync config.yaml`	Run a sync (extract → load)
`cloudquery sync aws.yaml pg.yaml`	Combine multiple config files
`cloudquery init --source aws --destination postgresql`	Scaffold a config
`cloudquery tables config.yaml`	List tables a source provides
`cloudquery migrate config.yaml`	Apply schema migrations only
`cloudquery plugin install config.yaml`	Pre-install plugins
`cloudquery --log-level debug sync ...`	Verbose logging

Querying the Inventory

Once synced, query with plain SQL:

-- Public S3 buckets
SELECT name, region FROM aws_s3_buckets
WHERE block_public_acls = false;

-- EC2 instances missing a required tag
SELECT instance_id, region FROM aws_ec2_instances
WHERE tags->>'Owner' IS NULL;

-- IAM users without MFA
SELECT user_name FROM aws_iam_users
WHERE mfa_active = false;

-- Cross-cloud: count compute by provider
SELECT 'aws' AS cloud, count(*) FROM aws_ec2_instances
UNION ALL SELECT 'gcp', count(*) FROM gcp_compute_instances;

Common Source Plugins

Plugin	Covers
`cloudquery/aws`	EC2, S3, IAM, VPC, RDS, Lambda, …
`cloudquery/gcp`	Compute, Storage, IAM, GKE, …
`cloudquery/azure`	VMs, Storage, AAD, …
`cloudquery/k8s`	Pods, Deployments, RBAC, …
`cloudquery/github`	Repos, members, branch protection
`cloudquery/cloudflare`, `okta`, `gcp`…	SaaS posture

Scheduling & CI

Approach	How
Cron	Run `cloudquery sync` on a schedule
CI pipeline	Sync then run SQL policy checks, fail on violations
Incremental	Many tables support incremental syncs to reduce cost
Policies	Pair with SQL queries as compliance controls

Common Workflows

# Nightly inventory refresh into Postgres
cloudquery sync aws.yaml gcp.yaml azure.yaml postgres.yaml

# Quick local exploration into SQLite (no DB server)
cloudquery sync aws.yaml sqlite.yaml
sqlite3 cq.db "SELECT name FROM aws_s3_buckets"

# List what an AWS source exposes before syncing
cloudquery tables aws.yaml

CloudQuery vs Other Approaches

Aspect	CloudQuery	Steampipe	Native CLIs
Model	Sync to a DB, then SQL	Live SQL over APIs	Imperative per-call
Best for	Inventory, history, joins at scale	Ad-hoc live queries	One-off lookups
Persistence	Yes (your database)	Query-time	None
Cross-cloud joins	Yes	Yes	Manual

CloudQuery - Cloud Asset Inventory as SQL Cheatsheet

CloudQuery - Cloud Asset Inventory as SQL Cheatsheet

Architecture

Installation

Configuration

Core Commands

Querying the Inventory

Common Source Plugins

Scheduling & CI

Common Workflows

CloudQuery vs Other Approaches

Resources