How Databases Execute Joins
When you write SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id, the database must physically combine rows from two tables. The SQL only declares what you want — the query optimizer chooses how to execute the join by picking one of three fundamental algorithms. Each has different performance characteristics depending on table sizes, available indexes, memory, and sort order.
1. Nested Loop Join (NLJ)
The simplest algorithm: for each row in the outer (driving) table, scan the inner table to find matching rows.
    for each row r in outer:
        for each row s in inner:
            if r.key == s.key:
                emit (r, s)
- Time: O(n * m) where n = outer rows, m = inner rows.
- When used: when the inner table has an index on the join key (turning each inner "scan" into an O(log m) index lookup), or when both tables are very small. With an index, the effective cost is O(n log m).
- Memory: O(1) — no extra data structures needed.
- Advantage: can start returning results immediately (good for LIMIT queries). Supports all join predicates, not just equality.
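The indexed variant of the loop above can be sketched in Python. This is an illustrative sketch, not any particular database's implementation: a sorted list plus binary search stands in for a B-tree index, and the function and table names are made up for the example.

```python
import bisect

def index_nested_loop_join(outer, inner, outer_key, inner_key):
    # Build a sorted "index" on the inner table's join key; a real
    # database would reuse an existing B-tree index instead.
    index = sorted(((row[inner_key], row) for row in inner), key=lambda p: p[0])
    keys = [k for k, _ in index]
    result = []
    for r in outer:                                    # O(n) outer scan
        i = bisect.bisect_left(keys, r[outer_key])     # O(log m) probe
        while i < len(index) and keys[i] == r[outer_key]:
            result.append((r, index[i][1]))            # emit (r, s)
            i += 1
    return result

orders = [{"customer_id": 2, "amount": 10.0}, {"customer_id": 1, "amount": 5.0}]
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
print(index_nested_loop_join(orders, customers, "customer_id", "id"))
```

Note that results are emitted one at a time as the outer scan proceeds, which is why this shape pairs well with LIMIT.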
2. Hash Join
Build a hash table on the smaller (build) relation, then probe it with each row from the larger (probe) relation.
Build phase: scan the smaller table and insert each row into a hash table keyed by the join column. Probe phase: scan the larger table and, for each row, look up its join key in the hash table.
- Time: O(n + m) — one pass over each table.
- Memory: O(min(n, m)) — the hash table for the smaller relation must fit in memory. If it does not fit, the optimizer uses a partitioned (grace) hash join that spills partitions to disk.
- When used: large tables with no useful index on the join key. This is the workhorse join for analytics and data warehouses.
- Limitation: only works for equi-joins (equality predicates), not range or inequality joins.
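The two phases can be sketched in a few lines of Python. This is a purely in-memory sketch with illustrative names; as noted above, a real engine falls back to a partitioned (grace) hash join when the build side does not fit in memory.

```python
from collections import defaultdict

def hash_join(build, probe, build_key, probe_key):
    # Build phase: hash table keyed by the join column of the smaller relation.
    table = defaultdict(list)
    for row in build:
        table[row[build_key]].append(row)
    # Probe phase: one pass over the larger relation, O(1) lookup per row.
    for row in probe:
        for match in table.get(row[probe_key], ()):
            yield match, row

customers = [{"id": 1, "country": "DE"}, {"id": 2, "country": "FR"}]
orders = [{"customer_id": 1, "amount": 9.5}, {"customer_id": 1, "amount": 3.0}]
print(list(hash_join(customers, orders, "id", "customer_id")))
```

The equality-only limitation is visible in the code: a hash lookup can only answer "is this exact key present?", not a range predicate.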
3. Sort-Merge Join
Sort both tables on the join key, then merge them in a single pass (like the merge step of merge sort).
Sort phase: sort both relations by the join key. Merge phase: advance two pointers through the sorted relations, emitting matches.
- Time: O(n log n + m log m) for sorting, then O(n + m) for the merge.
- When used: both tables are already sorted by the join key (e.g., clustered index), or the result must be sorted anyway (ORDER BY on the join key). Also good for very large datasets that do not fit in memory — external merge sort is efficient.
- Memory: O(sort buffer) — lower than hash join when tables are pre-sorted.
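The merge phase can be sketched in Python on in-memory lists (illustrative names; a real engine would use external merge sort for relations larger than RAM). The subtle part is handling duplicate keys: when both sides have a run of equal keys, the join must emit the cross product of the two runs.

```python
def sort_merge_join(left, right, left_key, right_key):
    left = sorted(left, key=lambda r: r[left_key])     # O(n log n)
    right = sorted(right, key=lambda r: r[right_key])  # O(m log m)
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):            # O(n + m) merge
        lk, rk = left[i][left_key], right[j][right_key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Equal keys: emit the cross product of both equal-key runs.
            j_start = j
            while i < len(left) and left[i][left_key] == lk:
                j = j_start
                while j < len(right) and right[j][right_key] == lk:
                    result.append((left[i], right[j]))
                    j += 1
                i += 1
    return result

orders = [{"order_id": 2, "customer_id": 7}, {"order_id": 1, "customer_id": 5}]
customers = [{"id": 5, "name": "Ada"}, {"id": 7, "name": "Bob"}]
print(sort_merge_join(orders, customers, "customer_id", "id"))
```

If both inputs are already sorted on the join key, the two `sorted` calls are unnecessary and only the linear merge remains.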
How the Optimizer Decides
The query optimizer estimates the cost of each strategy using table statistics (row counts, distinct values, index availability, sort order). Rules of thumb:
| Scenario | Best algorithm |
|---|---|
| Small inner table with index | Nested Loop |
| Large tables, no index, equality join | Hash Join |
| Both tables pre-sorted on join key | Sort-Merge |
| Non-equality predicate (e.g., <, BETWEEN) | Nested Loop |
| LIMIT with indexed inner | Nested Loop |
You can inspect the chosen algorithm with EXPLAIN ANALYZE in PostgreSQL.
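For a self-contained illustration, Python's built-in sqlite3 module offers EXPLAIN QUERY PLAN, SQLite's analogue of PostgreSQL's EXPLAIN. SQLite executes every join as a nested loop, so the plan below shows a scan of the outer table plus an indexed search on the inner one (the schema here is a made-up example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
""")
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id
""").fetchall()
for row in plan:
    # The last column holds the plan text, e.g. a SCAN of orders
    # followed by a SEARCH of customers using its primary key.
    print(row[-1])
```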
Real-Life: Picking the Right Join Algorithm
Scenario 1 — OLTP with indexes (Nested Loop wins):
SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id WHERE orders.id = 12345;
Only one order row matches the WHERE clause. The optimizer picks Nested Loop: fetch the single order, then use the B-tree index on customers.id to find the matching customer in O(log n). Hash Join would wastefully build a hash table over all customers.
Scenario 2 — Analytics on large tables (Hash Join wins):
SELECT c.country, SUM(o.amount) FROM orders o JOIN customers c ON o.customer_id = c.id GROUP BY c.country;
With 50 million orders and 2 million customers, there is no selective WHERE clause. The optimizer builds a hash table on the smaller customers table (2M rows), then probes it with each of the 50M orders. Total work: ~52M operations. A Nested Loop without an index would be 100 trillion operations.
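A quick back-of-envelope check of those numbers, using the scenario's hypothetical row counts:

```python
n_orders, n_customers = 50_000_000, 2_000_000

# Hash join: one pass to build on customers, one pass to probe with orders.
hash_join_ops = n_orders + n_customers
# Unindexed nested loop: a full customers scan for every order row.
nested_loop_ops = n_orders * n_customers

print(hash_join_ops)     # 52 million
print(nested_loop_ops)   # 100 trillion
```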
Scenario 3 — Pre-sorted data (Sort-Merge wins): Both the orders and shipments tables have a clustered index on order_date. Joining on order_date requires no sorting — the merge phase simply walks both tables in lock-step. Cost: O(n + m) with no hash table memory overhead.