CloudQuery - SQL-Based Cloud Asset Inventory Cheat Sheet

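CloudQuery is an open-source, plugin-based data movement framework that extracts configuration from cloud provider and SaaS APIs (AWS, GCP, Azure, Kubernetes, GitHub, and others) and loads it into a destination, typically PostgreSQL, so that you can query your entire infrastructure with SQL. Security and platform teams use it for asset inventory, posture management, compliance evidence, and answering "what are we actually running?" with a query.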

Architecture

Component	Role
Source plugin	Fetches data from an API (aws, gcp, azure, k8s, github, …)
Destination plugin	Writes data to a store (postgresql, bigquery, sqlite, file, …)
Sync	A single run that extracts from sources and loads into destinations
Config	YAML file describing the sources and destinations

Installation

Method	Command
Homebrew	`brew install cloudquery/tap/cloudquery`
Script (Linux amd64)	`curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_amd64 -o cloudquery && chmod +x cloudquery`
Docker	`docker run ghcr.io/cloudquery/cloudquery:latest`
Verify	`cloudquery --version`

Configuration

# aws-to-postgres.yaml
kind: source
spec:
  name: aws
  path: cloudquery/aws
  version: "VERSION"  # placeholder: replace with a pinned plugin version (same for the destination below)
  destinations: ["postgresql"]
  tables: ["aws_ec2_instances", "aws_s3_buckets", "aws_iam_*"]
---
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  version: "VERSION"
  spec:
    connection_string: "postgresql://user:pass@localhost:5432/cq"

Core Commands

Command	Description
`cloudquery sync config.yaml`	Run a sync (extract → load)
`cloudquery sync aws.yaml pg.yaml`	Combine multiple config files
`cloudquery init --source aws --destination postgresql`	Scaffold a config
`cloudquery tables config.yaml`	List the tables the source provides
`cloudquery migrate config.yaml`	Apply schema migrations only
`cloudquery plugin install config.yaml`	Pre-install plugins
`cloudquery --log-level debug sync ...`	Verbose logging

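Inventory Queries

After a sync, query the destination with plain SQL. Table and column names vary between plugin versions, so check them against the schema you actually synced:

-- S3 buckets that do not block public ACLs (potentially public)
SELECT name, region FROM aws_s3_buckets
WHERE block_public_acls = false;

-- EC2 instances missing a required tag
SELECT instance_id, region FROM aws_ec2_instances
WHERE tags->>'Owner' IS NULL;

-- IAM users without MFA
SELECT user_name FROM aws_iam_users
WHERE mfa_active = false;

-- Cross-cloud: count compute instances per provider
SELECT 'aws' AS cloud, count(*) FROM aws_ec2_instances
UNION ALL SELECT 'gcp', count(*) FROM gcp_compute_instances;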
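
The missing-tag query above can also be run from a script, which is handy once the result feeds a report or a ticket. The following is a minimal sketch using only Python's standard library against a SQLite file. The table built by `make_demo_db` is a hand-made stand-in with three columns, not the real plugin schema, and storing `tags` as JSON text is an assumption for illustration. Both function names are hypothetical.

```python
import json
import sqlite3


def instances_missing_tag(conn: sqlite3.Connection, tag_key: str) -> list[tuple[str, str]]:
    """Return (instance_id, region) for instances whose tags lack `tag_key`."""
    rows = conn.execute(
        "SELECT instance_id, region, tags FROM aws_ec2_instances ORDER BY instance_id"
    )
    missing = []
    for instance_id, region, tags in rows:
        # Assumption: tags are stored as a JSON object serialized to text.
        parsed = json.loads(tags) if tags else None
        # Mirrors `tags->>'Owner' IS NULL`: key absent, JSON null, or no tags at all.
        if not isinstance(parsed, dict) or parsed.get(tag_key) is None:
            missing.append((instance_id, region))
    return missing


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in table (NOT the real plugin schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aws_ec2_instances (instance_id TEXT, region TEXT, tags TEXT)"
    )
    conn.executemany(
        "INSERT INTO aws_ec2_instances VALUES (?, ?, ?)",
        [
            ("i-001", "us-east-1", json.dumps({"Owner": "platform"})),
            ("i-002", "eu-west-1", json.dumps({"Env": "prod"})),
            ("i-003", "us-east-1", None),
        ],
    )
    return conn


if __name__ == "__main__":
    for instance_id, region in instances_missing_tag(make_demo_db(), "Owner"):
        print(instance_id, region)
```

To try it against real data, replace `make_demo_db()` with `sqlite3.connect("cq.db")` and adjust the column handling to whatever the synced table actually contains.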

Common Source Plugins

Plugin	Covers
`cloudquery/aws`	EC2, S3, IAM, VPC, RDS, Lambda, …
`cloudquery/gcp`	Compute, Storage, IAM, GKE, …
`cloudquery/azure`	VMs, Storage, AAD, …
`cloudquery/k8s`	Pods, Deployments, RBAC, …
`cloudquery/github`	Repos, members, branch protection
`cloudquery/cloudflare`, `okta`, …	SaaS posture

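Scheduling & CI

Approach	How
Cron	Run `cloudquery sync` on a schedule
CI pipeline	Run SQL policy checks after the sync; fail the build on violations
Incremental	Tables that support incremental sync fetch only new data, reducing API calls and cost
Policies	Pair SQL queries with compliance controls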
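
The "CI pipeline" and "Policies" rows can be sketched as a small runner: each policy is a named SQL query that returns one row per violating resource, and the job exits non-zero if any query returns rows. This is an illustrative sketch, not a CloudQuery feature. The policy names, the `run_policies` and `check` helpers, and the demo table are hypothetical, and SQLite stands in for the real destination (hence `= 0` where PostgreSQL would use `= false`).

```python
import sqlite3
import sys

# Hypothetical policies: each query returns one row per violating resource.
POLICIES = {
    "s3-block-public-acls": "SELECT name FROM aws_s3_buckets WHERE block_public_acls = 0",
}


def run_policies(conn: sqlite3.Connection, policies: dict[str, str]) -> dict[str, list]:
    """Return {policy_name: violating_rows} for every policy with at least one hit."""
    violations = {}
    for name, query in policies.items():
        rows = conn.execute(query).fetchall()
        if rows:
            violations[name] = rows
    return violations


def check(conn: sqlite3.Connection, policies: dict[str, str] = POLICIES) -> int:
    """Print failures to stderr and return a process exit code (0 = clean)."""
    violations = run_policies(conn, policies)
    for name, rows in violations.items():
        print(f"FAIL {name}: {len(rows)} violation(s)", file=sys.stderr)
    return 1 if violations else 0


def make_demo_db(block_public_acls: bool) -> sqlite3.Connection:
    """In-memory stand-in with a single bucket (NOT the real plugin schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE aws_s3_buckets (name TEXT, block_public_acls INTEGER)")
    conn.execute(
        "INSERT INTO aws_s3_buckets VALUES (?, ?)", ("demo-bucket", int(block_public_acls))
    )
    return conn
```

In a pipeline step, something like `sys.exit(check(conn))` after the sync would turn any violation into a failed build. Against PostgreSQL, the same loop should work with a DB-API driver in place of `sqlite3`.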

Common Workflows

# Nightly inventory refresh into Postgres
cloudquery sync aws.yaml gcp.yaml azure.yaml postgres.yaml

# Quick local exploration with SQLite (no DB server; assumes sqlite.yaml writes to cq.db)
cloudquery sync aws.yaml sqlite.yaml
sqlite3 cq.db "SELECT name FROM aws_s3_buckets"

# List what the AWS source exposes before syncing
cloudquery tables aws.yaml
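
When a scheduler runs the nightly refresh, a thin wrapper makes the exit code and the exact command line easy to log. The sketch below only assembles and runs the `cloudquery sync` invocation shown above, using the `--log-level` flag from the commands table. The helper names `build_sync_command` and `run_sync` are hypothetical, and the wrapper assumes the `cloudquery` binary is on `PATH`.

```python
import shlex
import subprocess
from typing import Optional, Sequence


def build_sync_command(config_files: Sequence[str], log_level: Optional[str] = None) -> list[str]:
    """Assemble the argv for `cloudquery sync` over one or more config files."""
    if not config_files:
        raise ValueError("at least one config file is required")
    cmd = ["cloudquery"]
    if log_level:
        # The global flag goes before the subcommand, as in the commands table.
        cmd += ["--log-level", log_level]
    cmd.append("sync")
    cmd.extend(config_files)
    return cmd


def run_sync(config_files: Sequence[str], log_level: Optional[str] = None) -> int:
    """Run the sync and return its exit code (assumes `cloudquery` is on PATH)."""
    cmd = build_sync_command(config_files, log_level)
    print("running:", shlex.join(cmd))
    return subprocess.run(cmd, check=False).returncode
```

A cron entry could then call a script that ends with `sys.exit(run_sync(["aws.yaml", "gcp.yaml", "azure.yaml", "postgres.yaml"]))`, so a failed sync surfaces as a failed job.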

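CloudQuery vs. Other Approaches

Aspect	CloudQuery	Steampipe	Native CLIs
Model	Sync to a DB, then SQL	Live SQL over APIs	Imperative, per call
Best for	Inventory, history, joins at scale	Ad hoc live queries	One-off lookups
Persistence	Yes (your database)	Query time only	None
Cross-cloud joins	Yes	Yes	Manual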

Resources