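Barman rsync mode: SSH setup and hard-link dedup

Streaming is covered; when do you use rsync?

Part 2 set up Barman's streaming-only model. This post is a workbook that swaps the same lab over to the rsync model (or sets it up with rsync from scratch) and then explains why you would pick rsync.

For readers arriving here first, here is a quick summary of the environment.

Host	Role	Barman label
`demo-pg01`	PostgreSQL 17 primary	(none)
`demo-barman01`	Barman 3.18 server	`pg01`

`pg01` is the server label used inside Barman; `demo-pg01` is the actual hostname. If this is your first time setting up the environment, reading Part 2 first will save time.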

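rsync model vs. streaming model at a glance

Item	rsync model (this post)	streaming model (Part 2)
Introduced	Barman 1.x (2012-)	Barman 2.0 (2016-)
Transport	SSH + rsync	PostgreSQL streaming replication
WAL collection	`archive_command` (pushed by PostgreSQL per completed segment)	`pg_receivewal` (real-time stream)
Dependencies	SSH keys in both directions	replication slot, replication user
Incremental dedup	hard links (`reuse_backup = link`)	(block-level incremental backup on PG 17+, still being phased in)
Parallel copy	`parallel_jobs = N`	(single stream)
Best fit	bare metal and VMs where SSH is under your control	containers and K8s where SSH is not allowed

This post focuses on the fifth and sixth rows, hard-link dedup and `parallel_jobs`, which are strengths specific to the rsync model.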

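SSH setup: two directions

The rsync model needs SSH in two directions.

Direction	Purpose
`barman@demo-barman01` → `postgres@demo-pg01`	pulls the data directory with rsync during a base backup
`postgres@demo-pg01` → `barman@demo-barman01`	pushes WAL to Barman through `archive_command`

Both directions must work without a password prompt so that cron jobs and the archiver can run unattended, which means key-based SSH.

Direction 1, as the barman user on demo-barman01: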

sudo -u barman ssh-keygen -t ed25519 -N "" -f ~barman/.ssh/id_ed25519 \
  -C "barman@demo-barman01"

# Register the public key in ~postgres/.ssh/authorized_keys on demo-pg01
sudo -u barman ssh-copy-id postgres@demo-pg01

# Verify: BatchMode makes ssh fail instead of prompting for a password
sudo -u barman ssh -o BatchMode=yes postgres@demo-pg01 'echo ok'

Direction 2, as the postgres user on demo-pg01:

sudo -u postgres ssh-keygen -t ed25519 -N "" -f ~postgres/.ssh/id_ed25519 \
  -C "postgres@demo-pg01"
sudo -u postgres ssh-copy-id barman@demo-barman01
sudo -u postgres ssh -o BatchMode=yes barman@demo-barman01 'echo ok'

In production it is safer to add a from="..." host restriction or a command="..." lockdown to the authorized_keys entries. This post keeps it simple because it is a lab.
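As a rough illustration of the from="..." restriction, the sketch below builds a restricted authorized_keys line from a public key. The helper name `restrict_key`, the address 192.0.2.10, and the key material are placeholders made up for this example. The `no-*` options are standard OpenSSH authorized_keys options, but which ones suit a given site is a judgment call.

```shell
# Hypothetical helper (not part of Barman or OpenSSH): prefix a public key
# with authorized_keys options that limit where the key may be used from.
restrict_key() {
  # $1 = address the key is allowed to connect from
  # $2 = full public key line, e.g. the contents of id_ed25519.pub
  printf 'from="%s",no-agent-forwarding,no-port-forwarding,no-X11-forwarding,no-pty %s\n' \
    "$1" "$2"
}

# Example with placeholder values; the output would replace the unrestricted
# line in ~postgres/.ssh/authorized_keys on demo-pg01.
restrict_key 192.0.2.10 'ssh-ed25519 AAAAC3...placeholder barman@demo-barman01'
```

An address is used here rather than a hostname because, with sshd's default `UseDNS no`, from= is matched against addresses only.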

PostgreSQL side: enabling archive_command

Key settings in /var/lib/pgsql/17/data/postgresql.conf:

listen_addresses = '*'
wal_level = replica
archive_mode = on
archive_command = 'barman-wal-archive -U barman demo-barman01 pg01 %p'

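barman-wal-archive ships in the barman-cli package, which is usually packaged separately from the Barman server and has to be installed on demo-pg01. Internally it connects over SSH as barman@demo-barman01 and hands the WAL file to Barman, which places it in /var/lib/barman/pg01/incoming/. This is safer than a plain cp or a hand-rolled scp.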
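To make the "safer than cp" point concrete, here is a minimal local sketch of two properties an archive command is generally expected to have: it should not overwrite a segment that is already archived, and it should not expose a half-written file. `archive_wal` is a hypothetical stand-in written for this post. It is not how barman-wal-archive is implemented, and it works on local directories only.

```shell
# Hypothetical stand-in for an archive command, local directories only.
archive_wal() {
  src=$1                      # path PostgreSQL would pass as %p
  dest_dir=$2                 # directory playing the role of incoming/
  name=$(basename "$src")

  # Refuse to overwrite: a non-zero exit makes PostgreSQL keep the segment
  # and retry later instead of silently replacing archived data.
  if [ -e "$dest_dir/$name" ]; then
    echo "refusing to overwrite $name" >&2
    return 1
  fi

  # Copy under a temporary name, then rename. A rename within one filesystem
  # is atomic, so a reader never sees a partial segment.
  cp "$src" "$dest_dir/.$name.tmp" && mv "$dest_dir/.$name.tmp" "$dest_dir/$name"
}

# Try it against a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/incoming"
printf 'fake wal segment' > "$demo/000000010000000000000001"
archive_wal "$demo/000000010000000000000001" "$demo/incoming" && echo "archived"
archive_wal "$demo/000000010000000000000001" "$demo/incoming" || echo "second attempt rejected"
```

A bare cp offers neither guarantee, which is the gap a purpose-built tool is meant to close.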

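Open pg_hba.conf for Barman's conninfo connection. A replication slot is not required in the rsync model, but Barman still needs a regular connection: barman check reads PostgreSQL metadata through it, and barman backup uses it to start and stop the backup.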

host    postgres        barman       demo-barman01        scram-sha-256

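Restart PostgreSQL, then create the user. Because barman is not a superuser here, it also needs the backup-control grants listed in the Barman documentation:

sudo systemctl restart postgresql-17
sudo -u postgres psql <<'SQL'
CREATE USER barman WITH ENCRYPTED PASSWORD 'changeme';
GRANT pg_read_all_settings, pg_read_all_stats, pg_checkpoint TO barman;
-- Lets a non-superuser start and stop the rsync backup and switch WAL
GRANT EXECUTE ON FUNCTION pg_backup_start(text, boolean), pg_backup_stop(boolean),
  pg_switch_wal(), pg_create_restore_point(text) TO barman;
SQL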

Unlike the streaming model, this setup creates neither the REPLICATION attribute nor a replication slot.

Barman side: `backup_method = rsync`

Configure /etc/barman.d/pg01.conf as follows.

[pg01]
description = "Production PostgreSQL primary (rsync mode)"
ssh_command = ssh postgres@demo-pg01
conninfo = host=demo-pg01 user=barman dbname=postgres
backup_method = rsync
parallel_jobs = 2
reuse_backup = link
archiver = on
retention_policy = RECOVERY WINDOW OF 4 WEEKS

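The five key lines, spelled out:

Key	Meaning
`backup_method = rsync`	fetches the data directory over SSH + rsync during a base backup
`ssh_command = ssh postgres@demo-pg01`	the SSH command rsync uses; connects from the barman user to postgres@demo-pg01
`parallel_jobs = 2`	number of parallel rsync workers; tune it to the available disk and network headroom
`reuse_backup = link`	reuses files unchanged since the previous backup through hard links; the key dedup option
`archiver = on`	automatically processes WAL arriving through `archive_command`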

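The streaming-model keys streaming_conninfo, streaming_archiver, and slot_name are all absent. Since conninfo carries no password while pg_hba.conf asks for scram-sha-256, the password is expected to come from somewhere else, for example ~barman/.pgpass on demo-barman01.

Validate the configuration: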

sudo -u barman barman check pg01

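receive-wal running applies only to the streaming model, so in rsync mode the item is either missing or shown as disabled. If every other item is OK, the setup is complete.

First backup and measuring hard-link dedup

# First base backup
sudo -u barman barman backup pg01

Expected output (abridged):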

Starting backup using rsync-over-ssh method for server pg01 ...
Copy done (time: 12 seconds)
Backup size: 41.5 MiB

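Now make a deliberately small change and take a second backup.

# Update a little data
sudo -u postgres psql -c "UPDATE notes SET body = body || '!' WHERE id < 100;"

# Second base backup; reuse_backup = link takes effect here
sudo -u barman barman backup pg01
sudo -u barman barman list-backup pg01

Now measure disk usage in two ways.

# Actual disk usage (each hard-linked file is counted once)
sudo du -sh /var/lib/barman/pg01/base/
# e.g. 42.0 MiB

# Per-backup total (--count-links counts a hard-linked file once per backup)
sudo du -sh --count-links /var/lib/barman/pg01/base/
# e.g. 83 MiB

The difference of about 41 MiB, roughly the size of the first backup, is the disk space saved by dedup. Hard-link reuse works per file, not per page: the second backup stores new copies only of the files that changed, and every unchanged file is a hard link to the same file in the first backup. The savings accumulate as the number of backups grows.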
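The same effect can be reproduced without Barman or PostgreSQL. The sketch below is an illustration under stated assumptions: it uses GNU `cp -al` instead of rsync to create the hard links, and the directory and file names (`backup1`, `backup2`, `rel_a`, `rel_b`) are made up. It only shows how hard links change what `du` reports.

```shell
# Scratch area standing in for /var/lib/barman/pg01/base/ (names are made up).
work=$(mktemp -d)
mkdir -p "$work/base/backup1"

# Two 1 MiB "relation files" play the role of a data directory.
head -c 1048576 /dev/urandom > "$work/base/backup1/rel_a"
head -c 1048576 /dev/urandom > "$work/base/backup1/rel_b"

# Second backup: hard-link every file, then replace the one file that changed.
cp -al "$work/base/backup1" "$work/base/backup2"
rm "$work/base/backup2/rel_b"
head -c 1048576 /dev/urandom > "$work/base/backup2/rel_b"

links_a=$(stat -c %h "$work/base/backup2/rel_a")   # unchanged file, shared by both backups
links_b=$(stat -c %h "$work/base/backup2/rel_b")   # rewritten file, owned by backup2 only
real_kib=$(du -sk "$work/base" | cut -f1)          # each inode counted once
linked_kib=$(du -skl "$work/base" | cut -f1)       # hard links counted once per path

echo "rel_a links=$links_a, rel_b links=$links_b"
echo "on disk: ${real_kib} KiB, summed per backup: ${linked_kib} KiB"
rm -rf "$work"
```

Only three of the four 1 MiB files occupy disk space, so the per-backup sum comes out about 1 MiB above the on-disk figure. That is the file-level reuse described above, on a tiny scale.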

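When to choose rsync: four criteria

Criterion	rsync wins	streaming wins
Environment	bare metal and VMs where SSH is under your control	containers and K8s where SSH is not allowed
Disk savings	hard-link dedup; large savings when many near-identical backups are kept	block-level incremental backup on PG 17+, still being phased in
Parallelism	N workers through `parallel_jobs`	single stream
Operational burden	managing SSH keys in both directions	managing the replication slot

In short:

Bare metal or VM + disk efficiency + parallel copy → rsync
Containers or K8s + avoiding SSH + simpler operations → streaming
Both are feasible → streaming is the safe default (managing SSH keys is the biggest source of friction for operators)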
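The three rules can be written down as a tiny decision helper. This is only a sketch of the summary above: the function name `pick_backup_model` and its two yes/no arguments are invented for illustration, and real decisions will involve more inputs than these.

```shell
# Hypothetical helper encoding the three summary rules.
pick_backup_model() {
  ssh_allowed=$1         # yes/no: can the two hosts reach each other over SSH?
  needs_rsync_perks=$2   # yes/no: is hard-link dedup or parallel copy required?

  if [ "$ssh_allowed" != yes ]; then
    echo streaming       # containers, K8s, or any site that forbids SSH
  elif [ "$needs_rsync_perks" = yes ]; then
    echo rsync           # bare metal or VM where disk efficiency or parallelism matters
  else
    echo streaming       # both feasible: streaming is the safe default
  fi
}

pick_backup_model yes yes
pick_backup_model no yes
pick_backup_model yes no
```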

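Switching from streaming to rsync (or back)

A single label cannot run both models at once, because backup_method takes only one value. The switch goes as follows.

# Step 1: deactivate the existing label
[pg01]
active = false
...

# Step 2: create a different label in a new conf file (e.g. pg01-rsync)
[pg01-rsync]
backup_method = rsync
...

Keep in mind that the server-name argument of barman-wal-archive in archive_command has to name the label that should receive the WAL. Once the new label has been taking backups reliably (typically after a week or more and at least one recovery rehearsal), retire the old label.

In production, do not discard the existing backups right away. Keep both labels on disk until the old retention policy expires naturally; they act as a safety net if PITR to an earlier point becomes necessary.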

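Summary

Item	Details
Backup model	`backup_method = rsync`, based on SSH + rsync
WAL collection	`archive_command = 'barman-wal-archive ...'`
SSH directions	both (one for rsync, one for archive_command)
Key dedup	`reuse_backup = link`; hard-link based, savings grow as backups accumulate
Parallel copy	`parallel_jobs = N`
Relation to streaming	cannot be combined (one model per label); when switching, run a new label in parallel, then retire the old one

“Streaming is strong on operational simplicity; rsync is strong on disk efficiency and parallelism. Unless the environment forces one of them, trying streaming first is the safe choice.”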
