10.5 pg_stat_statements

pg_stat_statements는 PostgreSQL의 가장 중요한 진단 확장입니다. 모든 쿼리를 정규화한 형태로 모아 호출 수·실행 시간·평균 등을 카운트합니다. 슬로우 쿼리 식별, 부하 패턴 분석, 회귀 탐지의 출발점.

활성

# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = top
pg_stat_statements.track_utility = on
pg_stat_statements.save = on

restart 후:

CREATE EXTENSION pg_stat_statements;

정규화

쿼리는 구조만 같으면 한 entry에 합산. 리터럴·파라미터는 $1, $2, ... 또는 ?로 대체.

-- 실제 호출
SELECT * FROM users WHERE id = 42;
SELECT * FROM users WHERE id = 100;
SELECT * FROM users WHERE id = 500;

-- pg_stat_statements 한 줄
SELECT * FROM users WHERE id = $1

리터럴이 다르더라도 같은 쿼리 패턴이 한 entry에 카운트되어 패턴별 비용을 볼 수 있습니다.

핵심 컬럼

SELECT calls,
       round(mean_exec_time::numeric, 2) AS mean_ms,
       round(total_exec_time::numeric / 1000, 1) AS total_sec,
       round(stddev_exec_time::numeric, 1) AS stddev_ms,
       rows,
       LEFT(query, 100) AS query
  FROM pg_stat_statements
 ORDER BY total_exec_time DESC
 LIMIT 10;

컬럼	의미
`calls`	누적 호출 수
`total_exec_time`	누적 실행 시간 (ms)
`mean_exec_time`	평균
`stddev_exec_time`	표준편차 — 큰 값이면 변동 큼
`min_exec_time`, `max_exec_time`	극값
`rows`	누적 반환 row
`shared_blks_hit`, `shared_blks_read`	shared_buffers·디스크
`wal_records`, `wal_bytes`	생성 WAL
`query`	정규화된 쿼리
`queryid`	쿼리 ID (PG 14+, 외부 도구와 매칭)
`userid`, `dbid`	사용자·DB

PG 13+에서 pg_stat_statements의 시간이 planning과 execution으로 분리:

SELECT total_plan_time, total_exec_time, calls, query
  FROM pg_stat_statements
 ORDER BY total_plan_time DESC LIMIT 5;

total_plan_time이 크면 옵티마이저 자체 비용이 큰 쿼리 — 파티션 너무 많거나 plan cache 안 됩니다.

운영 진단 패턴

1. 총 부하 TOP

SELECT round(total_exec_time::numeric / 1000, 1) AS total_sec,
       calls, round(mean_exec_time::numeric, 2) AS mean_ms,
       LEFT(query, 80)
  FROM pg_stat_statements
 ORDER BY total_exec_time DESC
 LIMIT 20;

total_sec이 큰 = 전체 부하의 큰 부분을 차지하는 쿼리. 운영 튜닝 우선순위.

2. 평균 느린 쿼리

SELECT mean_exec_time, calls, query
  FROM pg_stat_statements
 WHERE calls > 100
 ORDER BY mean_exec_time DESC
 LIMIT 20;

가끔 호출되지만 한 번에 매우 느린 쿼리. calls > 100로 외톨이 제외.

3. 호출 폭증

SELECT calls, query FROM pg_stat_statements ORDER BY calls DESC LIMIT 10;

칼럼 폭주를 발견할 때. 1초당 수만 번 호출되는 단순 쿼리는 결국 큰 비용입니다.

4. 일관성 없는 쿼리

SELECT calls, mean_exec_time, stddev_exec_time,
       round(stddev_exec_time / mean_exec_time, 2) AS cv,
       LEFT(query, 80)
  FROM pg_stat_statements
 WHERE calls > 1000
 ORDER BY stddev_exec_time / nullif(mean_exec_time, 0) DESC
 LIMIT 10;

CV(coefficient of variation)가 크면 데이터에 따라 plan이 갈리는 쿼리. 통계나 인덱스 점검 대상입니다.

5. cache miss 큰 쿼리

SELECT shared_blks_read, calls,
       round(100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0), 1) AS hit_pct,
       LEFT(query, 80)
  FROM pg_stat_statements
 WHERE shared_blks_read > 1000
 ORDER BY shared_blks_read DESC
 LIMIT 10;

cache miss 많은 쿼리 = 디스크 I/O 크게 만드는 주범. shared_buffers·인덱스 검토합니다.

6. WAL 생성 큰 쿼리 (PG 13+)

SELECT round(wal_bytes::numeric / 1024 / 1024, 1) AS wal_mb,
       calls, LEFT(query, 80)
  FROM pg_stat_statements
 WHERE query NOT ILIKE 'select%'
 ORDER BY wal_bytes DESC
 LIMIT 10;

복제·archive 부하의 원천.

reset과 추세 분석

SELECT pg_stat_statements_reset();

전체 reset 후 고정된 기간의 통계를 모음 = 운영 분석의 표준입니다.

PG 13+의 부분 reset:

SELECT pg_stat_statements_reset(userid, dbid, queryid);

특정 쿼리만 reset. 변경 효과 측정합니다.

외부 모니터링 통합

-- Prometheus exporter는 이런 식으로 폴링
SELECT queryid, calls, total_exec_time, mean_exec_time,
       shared_blks_read, shared_blks_hit
  FROM pg_stat_statements
 WHERE userid <> ANY(...);

postgres_exporter의 pg_stat_statements 컬렉터가 표준입니다. Grafana 대시보드로 가시화.

파라미터

파라미터	기본	의미
`pg_stat_statements.max`	5000	추적할 쿼리 entry 수
`pg_stat_statements.track`	`top`	최상위 SQL만 (`all`이면 함수 내부 SQL도)
`pg_stat_statements.track_utility`	on	DDL 등 utility도
`pg_stat_statements.save`	on	shutdown 시 통계 보존

운영 표준: max = 10000, track = top, save = on.

한계와 주의

한계	메모
누적값만 — 시간대별 분포는 외부 도구가 polling	exporter + Grafana
통계는 재시작 후에도 유지(save=on)	reset 필요 시 명시
`query` 컬럼이 잘림 — `pg_stat_statements.track_planning = on` (PG 13+)	planning 시간 별도
메모리 비용 — entry당 약 12KB shared memory	max를 너무 크게 X
보안: 일반 사용자도 다른 사용자 쿼리 일부 봄	`pg_read_all_stats` 권장

auto_explain과의 짝

pg_stat_statements는 요약 통계, auto_explain은 개별 plan. 둘을 같이 켜면:

pg_stat_statements로 어떤 쿼리가 무거운가 식별
auto_explain으로 그 쿼리가 어떤 plan을 받았는가 분석

다음 절(10.6)에서 auto_explain을 봅니다.

정리

pg_stat_statements = 쿼리 정규화 + 누적 카운터
shared_preload_libraries에 등록 후 CREATE EXTENSION
운영 진단의 시작점 — total_exec_time 기준 TOP 쿼리
PG 13+의 planning vs execution 분리, WAL 카운터
외부 모니터링(Prometheus exporter)과 결합이 표준
pg_stat_statements_reset()으로 변경 효과 측정

다음 절(10.6)에서는 슬로우 쿼리의 plan을 자동 캡처하는 **auto_explain**을 봅니다.

10.4 통계 뷰 10.6 auto_explain