Model Replication: Partial Sync

Time: 30m
Level: advanced
Artifacts: not specified


Partial synchronization lets a reconnecting replica resume from where it stopped, using the primary's bounded memory of recent replication history.

If a replica disconnects briefly, it should not have to receive the entire dataset again. It should be able to say, "I processed the stream up to this point. Send me what I missed."

The model is:

replication id + processed offset -> missing bytes from backlog

If the primary still has the missing bytes, the replica catches up cheaply. If not, Redis falls back to full sync.

Offsets Name Positions In The Stream

Replication is a byte stream. The primary's offset advances by the number of bytes it writes to the stream as it propagates writes. A replica tracks the offset of the last byte it has processed.

primary current offset: 100000
replica processed:       99750
missing range:           99751..100000 (250 bytes)

When the replica reconnects, it can ask for the stream after its last known offset.

This is more precise than counting commands. Commands have different serialized sizes, and replication is ultimately delivered as bytes.
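To make the byte-versus-command distinction concrete, here is a minimal sketch. It assumes commands are serialized as RESP arrays of bulk strings; `encode_command`, `OffsetTracker`, and `missing_range` are hypothetical helpers for illustration, not Redis internals.

```python
def encode_command(*args: str) -> bytes:
    """Serialize a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


class OffsetTracker:
    """Hypothetical helper: the offset counts bytes, not commands."""

    def __init__(self, offset: int = 0) -> None:
        self.offset = offset

    def advance(self, payload: bytes) -> int:
        self.offset += len(payload)
        return self.offset


def missing_range(primary_offset: int, replica_offset: int):
    """Return (first missing byte, last missing byte, byte count)."""
    return replica_offset + 1, primary_offset, primary_offset - replica_offset


short_cmd = encode_command("SET", "k", "v")
long_cmd = encode_command("SET", "user:1001:profile", "x" * 100)

primary = OffsetTracker()
primary.advance(short_cmd)  # one command, 27 bytes
primary.advance(long_cmd)   # one command, many more bytes

print(len(short_cmd), len(long_cmd), primary.offset)
print(missing_range(100000, 99750))
```

Both calls to `advance` represent one command each, yet they move the offset by very different amounts. That is why a command count could not identify a resume point, while a byte offset can.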

Replication IDs Protect History

An offset only makes sense within a particular history. If two primaries have different histories, offset 5000 in one stream is not equivalent to offset 5000 in another.

Redis uses replication IDs to identify the history a replica followed.

same replication id + requested offset still in backlog -> partial sync
different replication id, or offset no longer in backlog -> full sync

This protects replicas from stitching together incompatible timelines.
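A small sketch of why an offset needs a replication ID beside it. The names `StreamPosition` and `same_history` are assumptions made for illustration, and the two byte strings stand in for the streams of two unrelated primaries. Real Redis handles additional cases (for example around failover) that this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamPosition:
    replid: str   # identifies which history the offset belongs to
    offset: int   # number of bytes of that history already processed


def same_history(primary_replid: str, replica: StreamPosition) -> bool:
    """An offset is only meaningful if both sides followed the same history."""
    return replica.replid == primary_replid


# Two hypothetical primaries with different write histories.
history_a = b"SET x 1\r\nSET y 2\r\n"
history_b = b"SET x 9\r\nDEL y\r\n"

replica = StreamPosition(replid="aaaa", offset=9)

# The same numeric offset points at different content in each stream,
# so resuming against the wrong history would corrupt the replica.
print(history_a[replica.offset:], history_b[replica.offset:])
print(same_history("aaaa", replica), same_history("bbbb", replica))
```

The comparison itself is trivial. The point is that it has to happen before the offset is interpreted at all.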

The Backlog Is A Circular Memory Of Writes

The primary keeps a replication backlog: a circular buffer containing recent bytes from the replication stream.

append propagated writes
old bytes fall out when the buffer wraps

If a replica reconnects quickly, the bytes it missed are probably still in the backlog. If it was gone too long, the backlog has overwritten them, and partial sync is impossible.
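The wrap-around behavior can be sketched with a fixed-size byte buffer. This is an illustrative model, not Redis's actual data structure: the class name `ReplicationBacklog`, its fields, and the 1-based offset convention are assumptions chosen to match the offsets used earlier in this tutorial.

```python
from typing import Optional


class ReplicationBacklog:
    """Fixed-size circular buffer holding the most recent stream bytes."""

    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)
        self.size = size
        self.end_offset = 0  # stream offset of the newest byte (0 = empty)
        self.histlen = 0     # how many valid bytes are currently retained

    def append(self, data: bytes) -> None:
        for byte in data:
            # Byte with 1-based stream offset o lives at index (o - 1) % size,
            # so new bytes overwrite the oldest ones once the buffer wraps.
            self.buf[self.end_offset % self.size] = byte
            self.end_offset += 1
        self.histlen = min(self.size, self.histlen + len(data))

    @property
    def first_offset(self) -> int:
        """Stream offset of the oldest byte still retained."""
        return self.end_offset - self.histlen + 1

    def read_from(self, offset: int) -> Optional[bytes]:
        """Bytes from `offset` to the end, or None if they are not available."""
        if offset < self.first_offset or offset > self.end_offset + 1:
            return None
        return bytes(
            self.buf[(o - 1) % self.size]
            for o in range(offset, self.end_offset + 1)
        )


backlog = ReplicationBacklog(size=8)
backlog.append(b"abcdefgh")       # fills the buffer exactly
print(backlog.read_from(6))       # short disconnect: bytes still present
backlog.append(b"ij")             # wraps and overwrites "ab"
print(backlog.read_from(1))       # long disconnect: history is gone
print(backlog.read_from(3))       # oldest offset that can still be served
```

A `None` result corresponds to the case where partial sync is impossible and the replica needs a full sync.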

The backlog size is therefore an operational choice. A larger backlog lets replicas survive longer disconnections without a full resync, at the cost of more memory on the primary.
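One rough way to reason about the trade-off is to multiply the write rate of the replication stream by the longest disconnection you want to tolerate. The sketch below is back-of-the-envelope arithmetic under that assumption. The function names and the `headroom` factor are hypothetical, and the numbers are illustrative rather than measured. In Redis the resulting value would go into the `repl-backlog-size` setting.

```python
import math


def min_backlog_bytes(write_rate_bytes_per_sec: float,
                      max_disconnect_sec: float,
                      headroom: float = 2.0) -> int:
    """Estimate a backlog size that covers a disconnection of the given length.

    `headroom` is an assumed safety factor for bursts above the average rate.
    """
    return math.ceil(write_rate_bytes_per_sec * max_disconnect_sec * headroom)


def tolerated_disconnect_sec(backlog_bytes: int,
                             write_rate_bytes_per_sec: float) -> float:
    """How long a replica can be away before the backlog wraps past it."""
    return backlog_bytes / write_rate_bytes_per_sec


# Illustrative numbers: 50 KB/s of propagated writes, 60 s outage budget.
print(min_backlog_bytes(50_000, 60))
# A 1 MiB backlog at the same write rate covers roughly 21 seconds.
print(tolerated_disconnect_sec(1024 * 1024, 50_000))
```

The estimate is only as good as the write-rate figure, so it is worth checking against peak traffic rather than the daily average.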

Continue Or Full Resync

The reconnection decision is straightforward in concept:

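replica sends its known replication id and the next offset it needs
primary checks whether the replication id matches its own history
primary checks whether the requested offset is still in the backlog
if both checks pass, reply CONTINUE and stream the missing bytes
otherwise, reply FULLRESYNC and send the whole dataset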

The elegance is that partial sync reuses the same stream model as full sync. It just starts from a later point.
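The decision above can be sketched as a single function. This is a simplified model, not the Redis implementation: `PrimaryState` and `handle_psync` are hypothetical names, the backlog is modeled as a plain `bytes` tail of the stream, and the snapshot transfer that follows a full resync is left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PrimaryState:
    replid: str       # identifies this primary's history
    backlog: bytes    # retained tail of the replication stream
    end_offset: int   # stream offset of the last byte in `backlog`

    @property
    def first_offset(self) -> int:
        """Stream offset of the oldest byte still in the backlog."""
        return self.end_offset - len(self.backlog) + 1


def handle_psync(p: PrimaryState, replid: str,
                 next_offset: int) -> Tuple[str, bytes]:
    """Return (reply line, bytes to stream right after the reply)."""
    history_matches = replid == p.replid
    in_backlog = p.first_offset <= next_offset <= p.end_offset + 1
    if history_matches and in_backlog:
        missing = p.backlog[next_offset - p.first_offset:]
        return "+CONTINUE", missing
    # Full sync: the replica adopts the primary's id and current offset.
    # The dataset snapshot would be sent next (omitted in this sketch).
    return f"+FULLRESYNC {p.replid} {p.end_offset}", b""


primary = PrimaryState(replid="abc", backlog=b"0123456789", end_offset=100)

print(handle_psync(primary, "abc", 96))   # brief disconnect
print(handle_psync(primary, "abc", 50))   # offset already overwritten
print(handle_psync(primary, "zzz", 96))   # different history
print(handle_psync(primary, "?", -1))     # replica with no history at all
```

Only the first call takes the cheap path. Every other case falls through to the same full-sync branch, which keeps the failure handling in one place.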

Partial Sync Is About Avoiding Waste

Full sync is correct but expensive. Partial sync is an optimization that preserves correctness when enough recent history is available.

This is one of Redis' recurring design patterns: keep a bounded memory of useful work, use it when possible, and fall back to a more expensive but reliable path when the bounded memory is not enough.

In replication, that bounded memory is the backlog. The reliable path is full sync.
