fivenines

Theory / 10 min

Hash Table And Rehashing

Redis is often described as fast because it keeps data in memory. That is true, but incomplete. Memory is only fast when the data structures cooperate.

The main Redis keyspace is built around a dictionary, and the dictionary is built around a hash table. Conceptually:

key -> hash function -> bucket -> entry -> Redis object

For ordinary lookups, this gives Redis the behavior people expect: GET name should not care whether the database has ten keys or ten million keys, at least not in the average case.

Hash Tables Are A Bet On Distribution

A hash table turns a key into a number and uses that number to choose a bucket. If keys are spread evenly across buckets, lookups are short. If too many keys land in the same bucket, the server has to scan a chain or secondary structure inside that bucket.

That is the quiet contract behind hash-table performance:

good hash distribution + healthy table size -> average O(1) access

Redis uses dictionaries in many places, not only the top-level keyspace. Hashes can use dictionaries internally after they outgrow compact encodings. Sets can be represented by dictionaries. Expiration metadata is also indexed by key. Once you understand Redis dictionaries, you understand one of the repeating patterns in the system.

Resizing Is Where The Trouble Begins

Hash tables need to grow when they become crowded and shrink when they waste too much memory. A naive resize is easy to imagine:

allocate a new table
walk every entry in the old table
move each entry into the new table
discard the old table

That works beautifully in a textbook and badly in an event loop. Moving millions of keys in one uninterrupted operation can create a latency spike large enough for clients to notice.

Redis' design pressure is not just "make operations fast." It is "avoid surprising long pauses."

Incremental Rehashing

Incremental rehashing spreads resize work across many small moments. During a rehash, the dictionary effectively has two tables:

old table: still contains keys that have not moved
new table: receives new inserts and migrated buckets

Lookups check both tables while rehashing is active. Inserts go to the new table. Deletes may need to search both. Each normal dictionary operation can also migrate a small amount of work from the old table to the new one.

The shape is:

start rehash
  -> keep two tables
  -> gradually move buckets
  -> finish when old table is empty

This makes every individual operation slightly more complex, but it protects the server's latency profile.

The Cost Of Smoothness

Incremental rehashing is a classic Redis tradeoff. The implementation becomes more subtle because a key may temporarily live in either table. Lookup, insert, and delete paths must all respect that transitional state.

But the payoff is enormous. Instead of making one unlucky client wait for a full dictionary resize, Redis makes many operations pay a small and bounded cost. It converts a cliff into a slope.

Time events can also help rehashing progress during quiet periods. The server does not have to wait for heavy traffic to finish migrating buckets.

Why This Belongs In The Latency Story

Hash tables are usually taught as data structures. In Redis, they are also part of the latency architecture.

The dictionary must preserve fast access, but it must do so without violating the event loop's need for short turns. Incremental rehashing is what happens when a data structure is designed not only for asymptotic performance, but for a real server with live clients.

Redis' speed comes from this kind of detail. The obvious idea is "use a hash table." The Redis-shaped idea is "use a hash table that resizes without making the event loop disappear."