What is a Covering Index?
A covering index is an index that contains all the columns a query needs, allowing the database to answer the query entirely from the index without ever accessing the underlying table data (the heap or clustered index). When the query planner detects this, it performs an index-only scan — a major performance optimization.
Why it matters
Recall that a normal secondary index lookup in InnoDB requires a double lookup: search the secondary index to get the primary key, then search the clustered index to get the full row. This second lookup is called a bookmark lookup or table access by index rowid. If the query only needs columns that are already in the secondary index, the bookmark lookup is completely unnecessary.
For example, if you have an index on (email) and run SELECT email FROM users WHERE email = 'alice@example.com', the database can return email directly from the index leaf node without touching the clustered index at all.
The INCLUDE Clause (PostgreSQL)
PostgreSQL supports the INCLUDE clause, which adds non-key columns to the index leaf nodes without affecting the sort order:
CREATE INDEX idx_email_inc ON users(email) INCLUDE (name, created_at);
This index is sorted by email (the key column), and each leaf node also stores name and created_at as payload. A query like SELECT name, created_at FROM users WHERE email = 'alice@example.com' becomes an index-only scan. The included columns are NOT part of the sort key, so they do not affect the index's ordering or range-scan behavior.
How to identify covering index opportunities
In EXPLAIN output, look for:
- PostgreSQL: "Index Only Scan" (covering) vs "Index Scan" (non-covering, requires heap fetch)
- MySQL: "Using index" in the Extra column means covering
- SQL Server: "Index Seek" with no "Key Lookup" operator
Trade-offs: Wide vs Narrow Indexes
Making an index wider (more columns) means:
- Larger index size on disk and in the buffer pool
- More maintenance cost on writes (every INSERT/UPDATE/DELETE touches more data)
- Faster reads for queries that are covered
A narrow index (1-2 columns) is small and cheap to maintain but rarely covers queries. A wide index (many columns or with INCLUDE) is expensive to maintain but eliminates table lookups for specific queries. The decision depends on the workload: read-heavy systems benefit more from covering indexes.
Composite Index as Covering Index
A composite index on (a, b, c) naturally covers any query that only needs columns a, b, and/or c. This is a common pattern: designing a composite index that serves both as a search index AND a covering index for the most frequent queries.
Real-Life: Eliminating Bookmark Lookups
An e-commerce dashboard runs this query thousands of times per minute:
SELECT order_id, total, status
FROM orders
WHERE customer_id = ?
ORDER BY created_at DESC
LIMIT 20;
Without a covering index:
A secondary index on (customer_id, created_at) speeds up the lookup and ordering, but the database must do a bookmark lookup for each of the 20 rows to retrieve order_id, total, and status from the clustered index. That is 20 extra random I/O operations.
With a covering index:
CREATE INDEX idx_orders_covering
ON orders(customer_id, created_at DESC)
INCLUDE (order_id, total, status);
Now the index contains everything the query needs. The database:
- Seeks to
customer_id = ?in the index. - Reads the first 20 entries (already sorted by
created_at DESC). - Returns
order_id,total,statusdirectly from the index leaf. - Zero bookmark lookups. All data comes from the index.
Impact: This reduced p99 query latency from 12ms to 0.8ms in a real production system — a 15x improvement — because it eliminated 20 random disk reads per query.