What is a Clustered Index?

Prerequisites(2)

Builds on to(1)

A clustered index determines the physical storage order of the rows in a table. The table data itself is stored in the leaf nodes of a B+ tree, sorted by the clustering key. Because data can only be physically sorted one way, a table can have at most one clustered index.

How it works

In a clustered index (also called an index-organized table), the B+ tree's leaf nodes do not just contain pointers — they contain the actual row data. The internal nodes contain separator keys that guide searches down to the correct leaf page. This means a lookup by the clustering key traverses the B+ tree and lands directly on the data — no extra lookup needed.

InnoDB and the Primary Key

In MySQL's InnoDB storage engine, the primary key IS the clustered index. When you define PRIMARY KEY (id), InnoDB physically organizes the table data in a B+ tree sorted by id. Every InnoDB table is an index-organized table. If you do not define a primary key, InnoDB first tries to use a unique NOT NULL index. If none exists, it creates a hidden 6-byte internal row ID and clusters on that.

Range Scans and Sequential I/O

The killer advantage of clustered indexes is range queries on the clustering key. A query like SELECT * FROM orders WHERE order_date BETWEEN '2024-01-01' AND '2024-01-31' (when order_date is the clustering key) reads a contiguous run of leaf pages — pure sequential I/O. This is dramatically faster than random I/O to scattered heap pages.

Heap Table vs Index-Organized Table

A heap table stores rows in no particular order (insert anywhere there is space). A lookup by any column requires either a full scan or a secondary index that points to the physical row location (RID). PostgreSQL uses heap tables by default — the primary key is just a regular B-tree index whose leaf nodes point to heap tuple locations (ctid).

An index-organized table (InnoDB, Oracle IOT, SQL Server clustered table) stores rows sorted in the B+ tree. The tradeoff: inserts into the middle of the sort order may cause page splits (expensive), and the table is optimized for one access pattern (the clustering key) at the expense of others.

Page Splits

When a new row must be inserted into a full leaf page (because it belongs in the middle of the sorted order), the database performs a page split: allocating a new page and distributing half the entries to each page. Page splits cause random I/O and fragmentation. This is why auto-increment integer primary keys are popular — they always append to the end, avoiding splits.

Real-Life: InnoDB Table Storage

Real-World Example

Consider an e-commerce orders table in MySQL InnoDB:

CREATE TABLE orders (
  order_id INT AUTO_INCREMENT PRIMARY KEY,
  customer_id INT,
  order_date DATE,
  total DECIMAL(10,2)
);

What happens internally:

InnoDB creates a B+ tree clustered on order_id.
Leaf nodes contain the full rows: (order_id, customer_id, order_date, total).
Because order_id auto-increments, new rows always append to the rightmost leaf — no page splits.

Fast range scan: SELECT * FROM orders WHERE order_id BETWEEN 1000 AND 2000 reads a contiguous chunk of leaf pages. Sequential I/O — very fast.

Slow range scan on non-clustering key: SELECT * FROM orders WHERE customer_id BETWEEN 100 AND 200 cannot use the clustered order. Without a secondary index on customer_id, this is a full table scan.

Why UUID primary keys hurt InnoDB: UUIDs are random, so inserts land in random positions within the B+ tree, causing constant page splits and fragmentation. Auto-increment integers always go to the end — sequential inserts, no splits.

Clustered Index: B+ Tree with Row Data in Leaves

Step 1 of 2