What is a Composite Index?
A composite index (also called a multi-column index or compound index) is a B+ tree index built on two or more columns. The keys in the B+ tree are sorted first by the first column, then by the second column within ties of the first, and so on — exactly like a phone book sorted by last name, then first name.
The Leftmost Prefix Rule
This is the single most important concept for composite indexes. An index on (a, b, c) can efficiently satisfy queries that filter on:
- (a) alone
- (a, b)
- (a, b, c)
But it cannot efficiently satisfy queries that filter only on:
- (b) alone — the index is not sorted by
bin isolation - (c) alone
- (b, c) — same problem
The index can only be used when the query starts from the leftmost column and uses contiguous columns from the left.
Equality vs Range Predicates
Column order in a composite index is critical. Consider an index on (a, b, c):
WHERE a = 1 AND b = 5 AND c = 10— all three columns used (equality on all).WHERE a = 1 AND b > 5— both columns used.anarrows to a = 1, thenb > 5uses the sorted order within a = 1.WHERE a > 1 AND b = 5— only a is used for the range scan. Onceais a range, the index cannot usebbecause within the range a > 1, thebvalues are interleaved across differentavalues.
Rule of thumb: put equality columns first, range columns last. Put the most selective equality column first to narrow down candidates quickly.
Column Order Design Strategy
When designing a composite index:
- Identify the columns in your WHERE clause.
- Separate them into equality predicates (=) and range predicates (>, <, BETWEEN, LIKE 'prefix%').
- Place equality columns first (in order of selectivity — most selective first).
- Place the range column last (only one range column can be used per index scan).
- If ORDER BY follows the same column order, the index also avoids a sort.
Index Skip Scan
Some databases (Oracle, MySQL 8.0.13+, PostgreSQL 13+) support index skip scan, which can use a composite index even when the leftmost column is not in the WHERE clause — but only if the leftmost column has very few distinct values. The optimizer effectively does a separate scan for each distinct value of the first column. This is an optimization, not a replacement for proper index design.
Real-Life: Designing the Right Composite Index
A SaaS application has a logs table with millions of rows:
CREATE TABLE logs (
id BIGINT PRIMARY KEY,
tenant_id INT,
severity VARCHAR(10), -- 'INFO', 'WARN', 'ERROR'
created_at TIMESTAMP,
message TEXT
);
The most common query:
SELECT * FROM logs
WHERE tenant_id = ? AND severity = 'ERROR'
AND created_at > NOW() - INTERVAL '1 hour'
ORDER BY created_at DESC
LIMIT 50;
Index design using the equality-first, range-last rule:
- Equality columns:
tenant_id,severity(both = comparisons) - Range column:
created_at(> comparison) - ORDER BY:
created_at DESC(matches index order if placed last)
Best index: CREATE INDEX idx_logs ON logs(tenant_id, severity, created_at DESC)
Why this works:
- B+ tree seeks to
tenant_id = ?— skips all other tenants. - Within that tenant, seeks to
severity = 'ERROR'— skips INFO and WARN. - Within ERROR for that tenant, scans
created_atbackward from now — pure sequential read. ORDER BY created_at DESCis already satisfied by the index order — no sort step.LIMIT 50stops after 50 rows — very fast.
Wrong index: (created_at, tenant_id, severity) — the range column first means the index cannot efficiently narrow by tenant_id and severity.