
WAL-Based Replication


What is WAL-Based Replication?

WAL-based replication is a technique where the primary database streams its Write-Ahead Log records to one or more replica (standby) databases. The replicas apply these WAL records to their own data files, maintaining an identical copy of the primary's data. This provides high availability (failover to a replica if the primary crashes), read scaling (distribute read queries across replicas), and disaster recovery (replicas can be in a different data center).

Physical replication

Physical replication sends the raw WAL bytes — the actual page-level changes — from the primary to the replica. The replica applies these byte-for-byte changes to its own data files.

  • Advantages: simple, fast, exact byte-level copy. The replica is an identical physical copy.
  • Limitations: the replica must run the same database version, same architecture, same OS. It cannot have different indexes or schema. It replays everything, including index changes, vacuum operations, and so on.
  • PostgreSQL: this is the default replication mode (wal_level = replica). The replica receives WAL — streamed record-by-record over a replication connection, or shipped as archived segment files — and replays it.

Logical replication

Logical replication decodes the WAL into high-level row changes — logical insert, update, and delete records rather than raw page images — and streams those to the replica, which applies them through the normal query path.

  • Advantages: replicas can run different database versions. You can replicate a subset of tables. Replicas can have different indexes or even different schemas. Supports cross-platform replication.
  • Limitations: higher overhead due to decoding. Does not replicate DDL (schema changes) automatically. Does not replicate sequences, large objects, or some system catalog changes.
  • PostgreSQL: enabled with wal_level = logical. Uses publication/subscription model (CREATE PUBLICATION, CREATE SUBSCRIPTION).
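As a sketch of the publication/subscription flow (table names, database name, and connection details are placeholders; the password is elided as in the rest of this page):

```sql
-- On the publisher (primary); requires wal_level = logical.
CREATE PUBLICATION orders_pub FOR TABLE orders, customers;

-- On the subscriber. Creating the subscription copies an initial
-- snapshot of the published tables, then applies decoded row
-- changes continuously.
CREATE SUBSCRIPTION orders_sub
    CONNECTION 'host=primary_host dbname=shop user=replicator password=...'
    PUBLICATION orders_pub;

-- Monitor apply progress on the subscriber:
SELECT subname, received_lsn, latest_end_lsn
FROM pg_stat_subscription;
```

Because the subscriber applies ordinary row changes, it is free to have extra indexes or a different physical layout — exactly the flexibility described above.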

Replication slots

A replication slot is a mechanism that tracks how far a replica has consumed the WAL. It guarantees that the primary will not delete (truncate) WAL segments that the replica has not yet received.

Without replication slots, if a replica falls behind (e.g., due to network issues), the primary might recycle WAL segments that the replica still needs, causing the replica to become out of sync and requiring a full re-initialization.

The tradeoff: if a replica goes offline permanently and its slot is not dropped, the primary will keep WAL segments forever, eventually running out of disk space. Monitoring slot lag is critical.
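For example, a physical slot can be created on the primary and its WAL retention watched (the slot name is illustrative and should match the standby's primary_slot_name):

```sql
-- Create a physical replication slot; the primary will retain all
-- WAL this slot has not yet confirmed as received.
SELECT pg_create_physical_replication_slot('replica1');

-- Monitor retained WAL per slot. A large gap between the current
-- write position and restart_lsn means WAL is piling up on disk.
SELECT slot_name, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
           AS retained_wal
FROM pg_replication_slots;

-- If the replica is gone for good, drop the slot so WAL can be recycled:
SELECT pg_drop_replication_slot('replica1');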

Synchronous vs asynchronous replication

  • Asynchronous (default): the primary commits immediately after writing its own WAL. The replica receives WAL records with some delay (replication lag). If the primary crashes, committed transactions that have not yet been replayed on the replica are lost.

  • Synchronous: the primary waits for the replica to acknowledge receipt (or even application) of the WAL record before confirming the commit to the client. This guarantees zero data loss but adds latency to every commit (network round-trip to the replica). PostgreSQL supports synchronous_commit = on | remote_write | remote_apply with increasing durability guarantees.
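A sketch of enabling synchronous replication on the primary; the standby name 'replica1' is an assumption and must match the application_name the standby supplies in its primary_conninfo:

```sql
-- Require acknowledgement from one standby named 'replica1'
-- before COMMIT returns to the client.
ALTER SYSTEM SET synchronous_standby_names = 'FIRST 1 (replica1)';

-- remote_write: standby has written (not yet fsynced) the WAL.
-- on:           standby has flushed the WAL to durable storage.
-- remote_apply: standby has replayed it (reads there see the commit).
ALTER SYSTEM SET synchronous_commit = 'remote_apply';

SELECT pg_reload_conf();
```

Note that if the named standby goes away, commits on the primary will block until it returns or synchronous_standby_names is changed — part of the latency/durability tradeoff described above.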

Initial snapshot: pg_basebackup

Before a replica can start consuming WAL, it needs a baseline copy of the primary's data. pg_basebackup takes a consistent file-system-level snapshot of the primary while it continues to operate, creates a backup, and records which WAL position the backup is consistent with. The replica then starts replaying WAL from that position.
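pg_basebackup automates PostgreSQL's low-level backup API. To illustrate what "records which WAL position the backup is consistent with" means, the SQL-level equivalent in PostgreSQL 15+ (older versions use pg_start_backup/pg_stop_backup; the label is illustrative) is roughly:

```sql
-- Mark the start of a backup; returns the LSN the copied files will
-- be consistent with once WAL from that point is replayed.
SELECT pg_backup_start('nightly-base');

-- ... copy the data directory with OS tools while the server runs ...

-- End the backup; returns the stop LSN plus backup_label contents
-- telling recovery where to begin replaying WAL.
SELECT * FROM pg_backup_stop();
```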

Real-Life: Setting Up PostgreSQL Streaming Replication

Real-World Example

Here is how a typical PostgreSQL primary-replica setup works in production:

1. Initial setup (primary):

  • wal_level = replica in postgresql.conf
  • max_wal_senders = 3 (allow up to 3 replication connections)
  • Create a replication user: CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD '...';
  • Update pg_hba.conf to allow the replica's IP to connect for replication.
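The primary-side steps above, expressed as SQL (the role name and password placeholder follow the text; the subnet in the pg_hba.conf comment is a placeholder, and pg_hba.conf itself still has to be edited on disk):

```sql
-- Replication settings; changing wal_level takes effect on restart.
ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET max_wal_senders = 3;

-- Dedicated replication role, as in the text.
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD '...';

-- pg_hba.conf needs a matching entry, e.g.:
--   host  replication  replicator  10.0.0.0/24  scram-sha-256
-- then reload: SELECT pg_reload_conf();
```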

2. Base backup (on the replica machine):

  • Run: pg_basebackup -h primary_host -U replicator -D /var/lib/postgresql/data -Fp -Xs -P
  • This streams a full copy of the primary's data directory to the replica and records the WAL position.

3. Start the replica:

  • Create standby.signal in the data directory (PostgreSQL 12+) or set standby_mode = on in recovery.conf (older versions).
  • Set primary_conninfo = 'host=primary_host user=replicator password=...'
  • Start PostgreSQL. The replica connects to the primary, requests WAL from the position recorded in pg_basebackup, and begins replaying.
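On PostgreSQL 12+ these standby settings can also be written with ALTER SYSTEM once the server is running (or baked in up front by running pg_basebackup with -R, which writes primary_conninfo and standby.signal automatically). Connection values and the slot name here are placeholders:

```sql
-- On the standby. PostgreSQL 13+ picks this up with a reload;
-- PostgreSQL 12 needs a restart.
ALTER SYSTEM SET primary_conninfo =
    'host=primary_host user=replicator password=... application_name=replica1';
ALTER SYSTEM SET primary_slot_name = 'replica1';
SELECT pg_reload_conf();
```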

4. Monitoring:

  • On the primary: SELECT * FROM pg_stat_replication; shows each replica's sent, write, flush, and replay LSNs, plus write/flush/replay lag as time intervals (byte lag can be computed with pg_wal_lsn_diff).
  • On the replica: SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();
  • If replay_lag grows, the replica is falling behind — investigate I/O or network bottlenecks.
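The monitoring queries above can be combined; pg_wal_lsn_diff converts an LSN gap into bytes:

```sql
-- On the primary: per-replica positions and lag.
SELECT application_name, state, sync_state,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn))
           AS replay_lag_bytes,
       replay_lag            -- time interval, PostgreSQL 10+
FROM pg_stat_replication;

-- On the replica: how far receipt and replay have progressed.
SELECT pg_last_wal_receive_lsn() AS received,
       pg_last_wal_replay_lsn()  AS replayed,
       now() - pg_last_xact_replay_timestamp() AS approx_replay_delay;
```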

5. Failover: if the primary crashes, promote the replica with pg_ctl promote or SELECT pg_promote();. The replica stops replaying WAL and starts accepting writes. Applications must be redirected to the new primary (typically via a load balancer or DNS update).
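Promotion and a quick sanity check, per step 5 (PostgreSQL 12+):

```sql
-- On the replica being promoted; waits for promotion to complete.
SELECT pg_promote(wait => true);

-- Afterwards this returns false: the node has left recovery
-- and now accepts writes.
SELECT pg_is_in_recovery();
```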

WAL-Based Replication Architecture

WAL-based streaming replication (diagram summary):

  • Primary: accepts reads and writes. WAL: LSN 100 | 101 | 102 | 103 | ... Holds replication slot "replica1" with confirmed_flush at LSN 101.
  • Replica 1 (sync): serves read-only queries; has received WAL through LSN 101 and is replaying it; sends ACKs back to the primary over the WAL stream.
  • Replica 2 (async): may lag behind the primary; replayed up to LSN 99 (lag: 4 WAL records); fed by an async stream with no ACK required before commit.
  • Sync: primary waits for the replica's ACK — zero data loss on failover, but adds latency to every commit.
  • Async: primary commits immediately — no added commit latency, but recent commits may be lost on failover.