Back to DAG

Partial / Filtered Index

databases

What is a Partial Index?

A partial index (called a filtered index in SQL Server) is an index that only includes rows satisfying a specified WHERE predicate. Instead of indexing every row in the table, it indexes only a subset — making the index dramatically smaller, faster to scan, and cheaper to maintain.

Syntax

-- PostgreSQL partial index
CREATE INDEX idx_pending_orders ON orders(created_at)
  WHERE status = 'pending';

-- SQL Server filtered index
CREATE INDEX idx_pending_orders ON orders(created_at)
  WHERE status = 'pending';

This index only contains rows where status = 'pending'. If only 2% of orders are pending, the index is ~50x smaller than a full index on created_at.

When the query planner uses a partial index

The database will only use a partial index when the query's WHERE clause implies the index predicate. For the index above:

  • WHERE status = 'pending' AND created_at > '2024-01-01' — YES, the query explicitly filters on status = 'pending', which matches the index predicate.
  • WHERE created_at > '2024-01-01' — NO, the query does not restrict to pending orders, so the partial index might miss non-pending rows.
  • WHERE status = 'pending' AND priority = 'high' — YES, the query's WHERE implies status = 'pending', so the partial index can be used (the priority filter is applied after the index scan).

Why partial indexes are powerful

  1. Smaller size: fewer rows means fewer B+ tree pages, using less disk and buffer pool memory.
  2. Faster scans: scanning a 1 MB index is faster than scanning a 50 MB index.
  3. Cheaper maintenance: INSERTs/UPDATEs/DELETEs only touch the index when the row matches the predicate. An INSERT with status = 'shipped' does not write to the partial index at all.
  4. Hot subset pattern: many applications have a "hot" subset of active rows (pending orders, unread messages, open tickets) that queries focus on, while the vast majority are "cold" (completed, read, closed). A partial index on the hot subset is ideal.

Unique partial index

A powerful use case: enforce uniqueness only on a subset of rows.

-- Each user can have at most one 'active' subscription
CREATE UNIQUE INDEX uniq_active_sub ON subscriptions(user_id)
  WHERE status = 'active';

This allows multiple rows per user (old subscriptions with status = 'cancelled'), but enforces that at most one row per user has status = 'active'. This is impossible to express with a standard unique constraint.

Limitations

  • Not all databases support partial indexes (MySQL does NOT; PostgreSQL and SQL Server do).
  • The predicate must be a simple expression (no subqueries, no volatile functions).
  • The query planner must be able to prove the query's WHERE clause implies the index predicate — complex conditions may not be recognized.

Real-Life: Indexing Only Active Records

Real-World Example

A task management system has 50 million tasks, but only 200,000 are currently status = 'open'. The dashboard query is:

SELECT * FROM tasks
WHERE status = 'open' AND assigned_to = ?
ORDER BY priority DESC, created_at
LIMIT 20;

Approach 1: Full index on (status, assigned_to, priority, created_at)

  • Indexes all 50 million rows.
  • Index size: ~1.2 GB.
  • Every INSERT (even for completed tasks) must update this index.

Approach 2: Partial index

CREATE INDEX idx_open_tasks ON tasks(assigned_to, priority DESC, created_at)
  WHERE status = 'open';
  • Indexes only 200,000 rows (the open ones).
  • Index size: ~5 MB (240x smaller!).
  • INSERTs with status = 'done' do not touch this index at all.
  • When a task is completed (UPDATE status = 'done'), the row is removed from the partial index.

Real-world examples:

  • E-commerce: index only pending/processing orders, not the millions of delivered ones
  • Email: index only unread messages for the inbox query
  • SaaS multi-tenant: index only active tenants for login lookups
  • Soft deletes: WHERE deleted_at IS NULL to index only non-deleted rows

Partial Index: Indexing Only Matching Rows

Table: orders (1,000,000 rows) id=1 status=shipped date=Jan 1 id=2 status=shipped date=Jan 2 id=3 status=pending date=Jan 3 id=4 status=delivered date=Jan 5 id=5 status=pending date=Jan 7 id=6 status=shipped date=Jan 8 id=7 status=delivered date=Jan 9 id=8 status=pending date=Jan 10 ... 999,992 more rows (98% shipped/delivered) Full Index on (date) All 1,000,000 rows indexed Size: ~50 MB Every INSERT updates index Partial Index on (date) WHERE status = 'pending' Only 20,000 rows (2%) Size: ~1 MB, 50x smaller! id=3 date=Jan 3 id=5 date=Jan 7 id=8 date=Jan 10 Unique Partial Index: CREATE UNIQUE INDEX uniq_active ON subs(user_id) WHERE status='active'; user_id=1, status=active (indexed, unique enforced) user_id=1, status=cancelled (NOT indexed, no constraint) One active per user!
Step 1 of 2