Back to DAG

Distributed Secondary Index

databases

Secondary Indexes in Partitioned Databases

In a partitioned database, the primary key determines which partition stores each row. But what about queries on non-primary-key columns? For example, "find all orders where product_id = 42" on a table partitioned by order_id. Without a secondary index, this query must scan every partition—a full table scan distributed across the cluster.

Secondary indexes allow efficient lookups by non-primary-key columns. In a distributed system, there are two fundamentally different approaches to organizing secondary indexes.

Local Index (Document-Partitioned)

Each partition maintains its own secondary index covering only the data stored on that partition. When a write occurs, the local index is updated on the same partition—no cross-partition communication needed.

Problem with reads: A query on the secondary index must be sent to every partition because each partition only knows about its own data. This is called scatter-gather (or fan-out). If there are 100 partitions, the query coordinator sends 100 sub-queries, waits for all responses, and merges the results. This is expensive, especially with high latency or many partitions.

Used by: MongoDB, Cassandra (secondary indexes), Elasticsearch (each shard has a local index), Riak.

Global Index (Term-Partitioned)

The secondary index itself is partitioned by the indexed term (the column value). For example, all entries with color = red go to partition 0's index, all entries with color = blue go to partition 1's index. This means a query for a specific term only needs to contact one index partition—no scatter-gather.

Problem with writes: A single write may need to update multiple global index partitions. Inserting a row with color = red and size = large requires updating the color index partition and the size index partition, which may be on different nodes. This cross-partition write is expensive and typically done asynchronously, meaning the global index is eventually consistent.

Used by: DynamoDB (Global Secondary Indexes), Oracle (global indexes on partitioned tables), Riak Search.

Trade-off Summary

AspectLocal IndexGlobal Index
Write complexitySimple (local only)Complex (cross-partition)
Read performanceScatter-gather (slow)Single partition (fast)
ConsistencyImmediately consistentEventually consistent
Best forWrite-heavy, rare secondary queriesRead-heavy on secondary columns

Real-Life: DynamoDB Global Secondary Indexes

Real-World Example

Amazon DynamoDB provides both approaches:

  • Local Secondary Index (LSI): Must share the same partition key as the base table. The index is stored on the same partition as the data it references. Reads are consistent, but you can only index within a partition key.
  • Global Secondary Index (GSI): Can have a completely different partition key and sort key. The index is partitioned independently from the base table. DynamoDB asynchronously propagates writes to GSIs, so they are eventually consistent (typical lag: milliseconds).

Example: A table Orders partitioned by customerId. You want to query by productId. Create a GSI with partition key = productId. Now SELECT * FROM Orders WHERE productId = 42 hits one GSI partition instead of scanning all customer partitions.

Elasticsearch:

  • Each index is divided into shards (partitions). Each shard has its own local inverted index.
  • A search query is sent to all shards (scatter-gather), results are merged and ranked by the coordinating node.
  • This is why adding more shards increases write throughput but can increase query latency (more fan-out).

MongoDB:

  • Secondary indexes are document-partitioned (local). A query on a non-shard-key field with a secondary index triggers scatter-gather across all shards.
  • This is why MongoDB documentation recommends including the shard key in queries whenever possible.

Local vs Global Secondary Index

Local Index (Document-Partitioned) Partition 0 data: orders 1-100 idx: red=>[3,7] Partition 1 data: orders 101-200 idx: red=>[150] Partition 2 data: orders 201-300 idx: red=>[255] Query: color=red scatter to ALL Global Index (Term-Partitioned) Data P0 orders 1-100 Data P1 orders 101-200 Index P0: a-m blue=>[5,130], gold=>[42] Index P1: n-z red=>[3,7,150,255] Query: color=red ONE partition Write to Data P0 with color=red: must also update Index P1 (async). Local index: fast writes, slow reads (scatter). Global index: fast reads, slow writes (cross-partition).
Step 1 of 2