PostgreSQL 테이블 용량 분석 쿼리

사내 빅데이터 모니터링 플랫폼은 클러스터에서 수집된 방대한 메트릭을 HBase 같은 NoSQL에 저장하는 한편, 솔루션 자체의 설정(Topology), 사용자 권한, 알람 히스토리, 대시보드 메타데이터 등은 PostgreSQL에 저장하여 운영했습니다.

수집 환경 규모가 커질수록 알람 히스토리 같은 로그성 테이블이 급격히 비대해졌고, 디스크 용량 알람이 울리면 가장 먼저 “DB 내에서 어느 테이블이 디스크를 가장 많이 쓰는가?” 를 빠르게 파악해서 Partitioning이나 Vacuum 전략을 세워야 했습니다.

아래 쿼리들을 시스템 헬스체크 대시보드와 배치 모니터링에 활용했던 경험을 정리합니다.

1. 테이블별 전체 용량 TOP 20

데이터 + 인덱스 + TOAST를 전부 합산한 실질적인 디스크 점유량 기준으로 상위 20개를 뽑습니다.

sql


SELECT
    relname                                             AS "Table",
    pg_size_pretty(pg_total_relation_size(C.oid))       AS "Total Size",
    pg_size_pretty(pg_relation_size(C.oid))             AS "Table Only",
    pg_size_pretty(pg_indexes_size(C.oid))              AS "Index Size",
    to_char(reltuples::numeric, 'FM999,999,999,999')    AS "Estimated Rows"
FROM
    pg_class C
    LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE
    nspname NOT IN ('pg_catalog', 'information_schema')
    AND C.relkind = 'r'           -- 실제 테이블만 (뷰·인덱스 제외)
    AND nspname !~ '^pg_toast'
ORDER BY
    pg_total_relation_size(C.oid) DESC
LIMIT 20;

출력 예시:

plaintext


     Table          | Total Size | Table Only | Index Size | Estimated Rows
--------------------+------------+------------+------------+----------------
 metric_history     | 48 GB      | 32 GB      | 16 GB      | 2,341,984,512
 alert_log          | 12 GB      | 9,800 MB   | 2,200 MB   | 487,231,008
 dashboard_snapshot | 4,200 MB   | 4,100 MB   | 100 MB     | 12,847,332
 user_session       | 320 MB     | 210 MB     | 110 MB     | 9,234,112

쿼리 핵심 포인트

함수	의미
`pg_total_relation_size`	테이블 + 인덱스 + TOAST 전부 합산
`pg_relation_size`	테이블 데이터만 (인덱스 제외)
`pg_indexes_size`	해당 테이블 인덱스 총합
`pg_size_pretty`	Byte → KB/MB/GB 자동 변환
`reltuples`	`ANALYZE` 기반 추정 Row 수 (COUNT(*) 없이 1초 이내 조회 가능)

2. Dead Tuple 비율 확인 (Bloat 탐지)

PostgreSQL은 MVCC 방식으로 DELETE/UPDATE 시 기존 행을 즉시 지우지 않고 Dead Tuple로 남겨둡니다. Dead Tuple이 쌓이면 테이블이 부풀어(Bloat) 불필요한 디스크를 차지하고 쿼리 성능도 떨어집니다.

sql


SELECT
    schemaname,
    relname                                         AS "Table",
    n_live_tup                                      AS "Live Rows",
    n_dead_tup                                      AS "Dead Rows",
    ROUND(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 2)
                                                    AS "Dead Ratio (%)",
    last_vacuum,
    last_autovacuum,
    last_analyze
FROM
    pg_stat_user_tables
ORDER BY
    n_dead_tup DESC
LIMIT 20;

Dead Ratio 판단 기준:

5% 이하: 양호
10~20%: VACUUM 필요 검토
20% 초과: 즉시 VACUUM ANALYZE 실행 권장

3. 인덱스 사용률 확인 (불필요한 인덱스 탐지)

존재하지만 실제로 쿼리에서 거의 사용되지 않는 인덱스는 INSERT/UPDATE 성능을 해치는 애물단지입니다.

sql


SELECT
    schemaname,
    relname                         AS "Table",
    indexrelname                    AS "Index",
    pg_size_pretty(pg_relation_size(indexrelid))
                                    AS "Index Size",
    idx_scan                        AS "Used Count",
    idx_tup_read,
    idx_tup_fetch
FROM
    pg_stat_user_indexes
    JOIN pg_index USING (indexrelid)
WHERE
    indisunique = false             -- Unique 인덱스는 제외
ORDER BY
    idx_scan ASC, pg_relation_size(indexrelid) DESC
LIMIT 20;

Used Count가 0이거나 매우 낮은데 Index Size가 크다면 DROP을 검토합니다.

4. 전체 데이터베이스 크기 요약

sql


SELECT
    datname                                 AS "Database",
    pg_size_pretty(pg_database_size(datname)) AS "Size"
FROM
    pg_database
ORDER BY
    pg_database_size(datname) DESC;

5. 스키마별 용량 요약

sql


SELECT
    table_schema                    AS "Schema",
    pg_size_pretty(
        SUM(pg_total_relation_size(quote_ident(table_schema) || '.' || quote_ident(table_name)))
    )                               AS "Total Size"
FROM
    information_schema.tables
GROUP BY
    table_schema
ORDER BY
    SUM(pg_total_relation_size(quote_ident(table_schema) || '.' || quote_ident(table_name))) DESC;

6. 빠른 대응: VACUUM & ANALYZE 명령어

Dead Tuple이 많은 테이블을 정리하는 명령어입니다.

sql


-- 특정 테이블 Dead Tuple 정리 (조용히, 잠금 최소)
VACUUM alert_log;
 
-- Dead Tuple 정리 + 통계 갱신 (reltuples 수치 갱신)
VACUUM ANALYZE alert_log;
 
-- 테이블을 완전히 재작성하여 공간 반환 (배타 잠금 발생! 서비스 중 금지)
VACUUM FULL alert_log;

VACUUM FULL은 배타 잠금(Exclusive Lock)으로 해당 테이블을 완전히 막습니다. 서비스 중에는 pg_repack 익스텐션을 사용하는 것이 안전합니다.

활용 팁

위 쿼리들을 cron 배치로 매일 실행해 결과를 관리자 채널에 전송하면 용량 이슈를 사전에 캐치할 수 있습니다.
pg_stat_user_tables의 통계는 PostgreSQL 재시작 시 초기화됩니다. 장기 추이 분석이 필요하다면 결과를 별도 테이블에 INSERT하는 이력 수집 배치를 구성하세요.