Anatomy of a B-Tree Node

Prerequisites(2)

Builds on to(1)

A B-tree node is the fundamental building block of B-tree and B+ tree indexes — the most widely used index structure in databases. Understanding the internal layout of a single node is essential before studying full B-tree operations.

Node structure

A B-tree node contains:

An array of keys: stored in sorted order. A node with k keys divides the key space into k+1 regions.
An array of child pointers (for internal nodes): a node with k keys has k+1 children. Child pointer i points to the subtree containing all keys between key[i-1] and key[i].
An array of values/record pointers (for leaf nodes): each key is associated with a value or a pointer to the actual record.
Metadata: whether the node is a leaf, the current number of keys, a pointer to the parent (optional).

Order and degree

The minimum degree (often called t) defines the capacity constraints:

Each non-root node has at least t-1 keys and at most 2t-1 keys.
Each non-root internal node has at least t children and at most 2t children.
The root can have as few as 1 key (or 0 if the tree is empty).

For example, with t=3: each node has 2-5 keys and 3-6 children. A typical database page (8 KB) can fit hundreds of keys, giving t values in the hundreds.

Key search within a node

When the B-tree search arrives at a node, it must find where the target key falls among the node's sorted keys. Two strategies:

Linear scan: O(k) where k is the number of keys. Simple, but slow for large nodes.
Binary search: O(log k). For nodes with hundreds of keys (which is typical), this is significantly faster.

The search determines either:

The key is found at index i → return the associated value (or descend to the correct child in a B+ tree).
The key falls between key[i-1] and key[i] → descend into child[i].

Why high fanout matters

A B-tree with minimum degree t has a maximum fanout of 2t. A node that fits in a single disk page can hold hundreds of keys, giving a fanout of hundreds. This means:

A tree with fanout 500 can index 500^3 = 125 million keys in just 3 levels.
Each level requires one disk I/O, so any key can be found in 3 disk reads.
Compare this to a binary search tree: log2(125 million) = 27 levels = 27 disk reads.

This is why B-trees dominate database indexing: they minimize the number of disk I/Os by maximizing the number of keys per node (per disk page).

Node types

Internal node: has keys and child pointers, no record data (in B+ trees) or has record data (in B-trees).
Leaf node: has keys and record pointers/values, no children. In B+ trees, leaf nodes also have sibling pointers for range scans.
Root node: can be internal or leaf. In a small tree, the root is the only node and is a leaf.

Real-Life: PostgreSQL B-Tree Index Page

Real-World Example

In PostgreSQL, a B-tree index page (8 KB) is laid out using the slotted page format, with index tuples instead of heap tuples.

Internal node page:

Each index tuple contains: a key value + a pointer to a child page.
With 8-byte keys and 6-byte page pointers, each entry is ~14 bytes + overhead.
An 8 KB page can fit roughly 300-400 entries — that is the fanout.
A 3-level tree can index 400^3 = 64 million rows.

Leaf node page:

Each index tuple contains: a key value + a TID (tuple identifier = ctid pointing to the heap page and slot).
Leaf pages are linked in a doubly-linked list for range scans.

Binary search within a page: PostgreSQL performs binary search over the index tuples within a page to find the target key or the correct child pointer. This is the _bt_binsrch() function in the source code.

Example: searching for key 42 in a 3-level B-tree:

Read root page. Binary search finds 42 is between key 30 and key 50. Follow child pointer for that range.
Read internal page. Binary search finds 42 is between key 40 and key 45. Follow child pointer.
Read leaf page. Binary search finds key 42. Return the TID (page 17, slot 3). Fetch the heap tuple.

Total: 3 index page reads + 1 heap page read = 4 I/Os.

B-Tree Node: Keys, Children, and Search

Step 1 of 3