What is HNSW?
Hierarchical Navigable Small World (HNSW) is a graph-based algorithm for approximate nearest neighbor (ANN) search in high-dimensional vector spaces. It builds a multi-layer graph where each layer is a proximity graph connecting nearby vectors, enabling fast logarithmic-time search.
The small-world idea
A navigable small-world graph has two properties: (1) most nodes connect to nearby neighbors, and (2) a few long-range links allow jumping across the graph quickly. This mirrors the "six degrees of separation" phenomenon. By greedily following edges that bring us closer to the target, we can navigate from any node to any other in O(log n) hops.
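The greedy navigation described above can be sketched in a few lines. The graph, coordinates, and node ids below are hypothetical toy data (1-D points, one long-range link), not part of HNSW itself:

```python
# Toy navigable graph over 1-D points: node id -> list of neighbor ids.
# Node 0 has a long-range link to node 4; the rest are short-range.
graph = {
    0: [1, 4],
    1: [0, 2],
    2: [1, 3],
    3: [2, 4],
    4: [3, 0, 5],
    5: [4],
}
coords = {n: float(n) for n in graph}  # position of each node on a line

def greedy_route(start, target):
    """Follow the edge that most reduces distance to `target`;
    stop when no neighbor is closer (a local minimum)."""
    current = start
    while True:
        best = min(graph[current], key=lambda n: abs(coords[n] - target))
        if abs(coords[best] - target) >= abs(coords[current] - target):
            return current  # local minimum reached
        current = best

print(greedy_route(0, 5.0))  # routes 0 -> 4 -> 5, using the long-range link
```

Starting at node 0, the walk jumps straight to node 4 via the long-range edge and then steps to node 5, rather than crawling through every intermediate node.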
Hierarchical layers
HNSW extends this by stacking multiple layers of proximity graphs:
- Layer 0 (bottom): contains all elements. Dense, with many short-range connections.
- Layer 1: contains a random subset of elements. Fewer nodes, but connections span larger distances.
- Layer L (top): very sparse. Only a handful of elements with very long-range connections.
Each element is assigned a maximum layer drawn from an exponential distribution: most elements exist only in layer 0, a few reach layer 1, fewer reach layer 2, and so on. This creates a natural hierarchy similar to a skip list.
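The layer assignment above can be sketched directly from the formula given later in the construction section (`l = floor(-ln(rand()) * mL)` with `mL = 1/ln(M)`); the sample size and seed here are arbitrary:

```python
import math
import random

def random_level(M=16):
    """Draw a max layer: the floor of an exponential variate with
    scale mL = 1/ln(M). Most draws land on layer 0."""
    mL = 1.0 / math.log(M)
    return int(-math.log(random.random()) * mL)

random.seed(0)
levels = [random_level() for _ in range(100_000)]
frac_layer0 = levels.count(0) / len(levels)
# With M = 16, P(level = 0) = 1 - 1/M = 0.9375, and each higher
# layer holds roughly 1/M as many elements as the one below it.
print(f"fraction on layer 0 only: {frac_layer0:.3f}")
```

This exponentially thinning population of layers is what makes the structure behave like a skip list: each level up, the graph gets sparser and its edges effectively span longer distances.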
Search algorithm
- Start from the entry point at the topmost layer.
- At the current layer, greedily move to the neighbor closest to the query vector. Repeat until no neighbor is closer (local minimum).
- Drop down to the next layer and repeat the greedy search, starting from the node where the previous layer's search stopped.
- At layer 0, perform a more thorough beam search with width `ef`: maintain a priority queue of the `ef` closest candidates found so far, and explore the neighbors of each candidate.
- Return the top-k results from the priority queue.
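The layer-0 beam search in the last two steps can be sketched as follows. The graph, vectors, and `entry` node below are hypothetical toy data (1-D points in a chain), and the code is a minimal sketch, not a full HNSW implementation:

```python
import heapq
import math

def beam_search(graph, vecs, entry, query, ef):
    """Layer-0 beam search: keep the ef closest candidates found so far."""
    d0 = math.dist(vecs[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexplored node first
    best = [(-d0, entry)]        # max-heap (negated) of the ef best so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -best[0][0]:
            break                # everything left is farther than the worst kept
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(vecs[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # evict the current worst
    return sorted((-d, n) for d, n in best)  # (distance, node), closest first

# Toy data: six 1-D vectors connected in a chain
vecs = {i: (float(i),) for i in range(6)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(beam_search(graph, vecs, entry=0, query=(4.2,), ef=2))
```

With `ef=2` and a query at 4.2, the search walks down the chain and returns nodes 4 and 5, the two true nearest neighbors. A larger `ef` widens the beam, trading latency for recall.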
Construction
When inserting element q:
- Draw a random max layer l from an exponential distribution: `l = floor(-ln(rand()) * mL)`, where `mL = 1/ln(M)`.
- Start from the entry point at the top layer and greedily descend to layer l+1.
- From layer l down to layer 0, find the `efConstruction` nearest neighbors at each layer and create bidirectional edges. If a node exceeds M connections, prune the weakest.
Key parameters
| Parameter | Role | Typical value |
|---|---|---|
| M | Max connections per node per layer | 16-64 |
| efConstruction | Beam width during construction | 100-400 |
| efSearch | Beam width during query | 50-200 |
Higher M and ef values improve recall but increase memory and latency.
Complexity
| Operation | Time | Space |
|---|---|---|
| Search | O(log n) | O(n * M) |
| Insert | O(log n) | O(M) |
Where HNSW is used
- pgvector (PostgreSQL): `CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)`
- Faiss (Meta): HNSW index for billion-scale similarity search
- Qdrant, Weaviate, Pinecone: vector databases built on HNSW
- Elasticsearch: dense_vector field type with HNSW indexing
Real-Life: Semantic Search with Embeddings
Modern search engines convert documents and queries into embedding vectors (e.g., 768-dimensional vectors from a transformer model). Finding the most relevant documents means finding the nearest vectors — a nearest neighbor problem.
The scale problem: A knowledge base with 100 million documents, each represented as a 768-dim vector, cannot be searched by brute force. Computing 100 million cosine similarities per query would take seconds.
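A back-of-the-envelope calculation makes the brute-force cost concrete. The hardware throughput figure in the comment is an assumed rough order of magnitude, not a benchmark:

```python
# Cost of one brute-force query at the scale described above.
n_docs, dim = 100_000_000, 768
flops_per_query = n_docs * 2 * dim  # one dot product (multiply + add per dim) per doc
gflops = flops_per_query / 1e9
print(f"{gflops:.1f} GFLOPs per query")  # 153.6 GFLOPs per query
```

At an assumed few tens of GFLOP/s on a single CPU core, that is indeed seconds per query, before even touching the memory bandwidth needed to stream ~300 GB of float32 vectors.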
HNSW in action (pgvector):
```sql
-- Store embeddings
CREATE TABLE articles (id serial, content text, embedding vector(768));

-- Build HNSW index
CREATE INDEX ON articles USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 200);

-- Query: find the 10 most similar articles
SELECT content FROM articles
ORDER BY embedding <=> query_embedding
LIMIT 10;
```
The HNSW index cuts this from ~100 million distance computations per query to ~1,000, roughly a 100,000x speedup at 95%+ recall.
Tradeoff: HNSW uses significant memory (each vector stores M pointers per layer). For billion-scale datasets, hybrid approaches like IVF+HNSW are used, where IVF partitions the data and HNSW searches within each partition.