What is HNSW?
Hierarchical Navigable Small World (HNSW) is a graph-based algorithm for approximate nearest neighbor (ANN) search in high-dimensional vector spaces. It builds a multi-layer graph where each layer is a proximity graph connecting nearby vectors, enabling fast logarithmic-time search.
The small-world idea
A navigable small-world graph has two properties: (1) most nodes connect to nearby neighbors, and (2) a few long-range links allow jumping across the graph quickly. This mirrors the "six degrees of separation" phenomenon. By greedily following edges that bring us closer to the target, we can navigate from any node to any other in O(log n) hops.
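The greedy navigation described above can be sketched in a few lines. The graph, coordinates, and node ids below are hypothetical toy data (1-D points, one long-range link), not part of HNSW itself:

```python
# Toy navigable graph over 1-D points: node id -> list of neighbor ids.
# Node 0 has a long-range link to node 4; the rest are short-range.
graph = {
    0: [1, 4],
    1: [0, 2],
    2: [1, 3],
    3: [2, 4],
    4: [3, 0, 5],
    5: [4],
}
coords = {n: float(n) for n in graph}  # position of each node on a line

def greedy_route(start, target):
    """Follow the edge that most reduces distance to `target`;
    stop when no neighbor is closer (a local minimum)."""
    current = start
    while True:
        best = min(graph[current], key=lambda n: abs(coords[n] - target))
        if abs(coords[best] - target) >= abs(coords[current] - target):
            return current  # local minimum reached
        current = best

print(greedy_route(0, 5.0))  # routes 0 -> 4 -> 5, using the long-range link
```

Starting at node 0, the walk jumps straight to node 4 via the long-range edge and then steps to node 5, rather than crawling through every intermediate node.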
Hierarchical layers
HNSW extends this by stacking multiple layers of proximity graphs:
- Layer 0 (bottom): contains all elements. Dense, with many short-range connections.
- Layer 1: contains a random subset of elements. Fewer nodes, but connections span larger distances.
- Layer L (top): very sparse. Only a handful of elements with very long-range connections.
Each element is assigned a maximum layer drawn from an exponential distribution: most elements exist only in layer 0, a few reach layer 1, fewer reach layer 2, and so on. This creates a natural hierarchy similar to a skip list.
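The layer assignment above can be sketched directly from the formula given later in the construction section (`l = floor(-ln(rand()) * mL)` with `mL = 1/ln(M)`); the sample size and seed here are arbitrary:

```python
import math
import random

def random_level(M=16):
    """Draw a max layer: the floor of an exponential variate with
    scale mL = 1/ln(M). Most draws land on layer 0."""
    mL = 1.0 / math.log(M)
    return int(-math.log(random.random()) * mL)

random.seed(0)
levels = [random_level() for _ in range(100_000)]
frac_layer0 = levels.count(0) / len(levels)
# With M = 16, P(level = 0) = 1 - 1/M = 0.9375, and each higher
# layer holds roughly 1/M as many elements as the one below it.
print(f"fraction on layer 0 only: {frac_layer0:.3f}")
```

This exponentially thinning population of layers is what makes the structure behave like a skip list: each level up, the graph gets sparser and its edges effectively span longer distances.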
Search algorithm
- Start from the entry point at the topmost layer.
- At the current layer, greedily move to the neighbor closest to the query vector. Repeat until no neighbor is closer (local minimum).
- Drop down to the next layer and repeat the greedy search, starting from the node where the previous layer's search stopped.
- At layer 0, perform a more thorough beam search with width `ef`: maintain a priority queue of the `ef` closest candidates found so far, and explore the neighbors of each candidate.
- Return the top-k results from the priority queue.
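The layer-0 beam search in the last two steps can be sketched as follows. The graph, vectors, and `entry` node below are hypothetical toy data (1-D points in a chain), and the code is a minimal sketch, not a full HNSW implementation:

```python
import heapq
import math

def beam_search(graph, vecs, entry, query, ef):
    """Layer-0 beam search: keep the ef closest candidates found so far."""
    d0 = math.dist(vecs[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexplored node first
    best = [(-d0, entry)]        # max-heap (negated) of the ef best so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -best[0][0]:
            break                # everything left is farther than the worst kept
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(vecs[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # evict the current worst
    return sorted((-d, n) for d, n in best)  # (distance, node), closest first

# Toy data: six 1-D vectors connected in a chain
vecs = {i: (float(i),) for i in range(6)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(beam_search(graph, vecs, entry=0, query=(4.2,), ef=2))
```

With `ef=2` and a query at 4.2, the search walks down the chain and returns nodes 4 and 5, the two true nearest neighbors. A larger `ef` widens the beam, trading latency for recall.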
Construction
When inserting element q:
- Draw a random max layer l from an exponential distribution: `l = floor(-ln(rand()) * mL)`, where `mL = 1/ln(M)`.
- Start from the entry point at the top layer and greedily descend to layer l+1.
- From layer l down to layer 0, find the `efConstruction` nearest neighbors at each layer and create bidirectional edges. If a node exceeds M connections, prune the weakest.
Key parameters
| Parameter | Role | Typical value |
|---|---|---|
| M | Max connections per node per layer | 16-64 |
| efConstruction | Beam width during construction | 100-400 |
| efSearch | Beam width during query | 50-200 |
Higher M and ef values improve recall but increase memory and latency.
Complexity
| Operation | Time | Space |
|---|---|---|
| Search | O(log n) | O(n * M) |
| Insert | O(log n) | O(M) |
Where HNSW is used
- pgvector (PostgreSQL): `CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)`
- Faiss (Meta): HNSW index for billion-scale similarity search
- Qdrant, Weaviate, Pinecone: vector databases built on HNSW
- Elasticsearch: dense_vector field type with HNSW indexing
Real-Life: Semantic Search with Embeddings
Modern search engines convert documents and queries into embedding vectors (e.g., 768-dimensional vectors from a transformer model). Finding the most relevant documents means finding the nearest vectors — a nearest neighbor problem.
The scale problem: A knowledge base with 100 million documents, each represented as a 768-dim vector, cannot be searched by brute force. Computing 100 million cosine similarities per query would take seconds.
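A back-of-the-envelope calculation makes the brute-force cost concrete. The hardware throughput figure in the comment is an assumed rough order of magnitude, not a benchmark:

```python
# Cost of one brute-force query at the scale described above.
n_docs, dim = 100_000_000, 768
flops_per_query = n_docs * 2 * dim  # one dot product (multiply + add per dim) per doc
gflops = flops_per_query / 1e9
print(f"{gflops:.1f} GFLOPs per query")  # 153.6 GFLOPs per query
```

At an assumed few tens of GFLOP/s on a single CPU core, that is indeed seconds per query, before even touching the memory bandwidth needed to stream ~300 GB of float32 vectors.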
HNSW in action (pgvector):
```sql
-- Store embeddings
CREATE TABLE articles (id serial, content text, embedding vector(768));

-- Build HNSW index
CREATE INDEX ON articles USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 200);

-- Query: find the 10 most similar articles
SELECT content FROM articles
ORDER BY embedding <=> query_embedding
LIMIT 10;
```
The HNSW index cuts this from ~100 million distance computations per query to ~1,000, roughly a 100,000x speedup at 95%+ recall.
Tradeoff: HNSW uses significant memory (each vector stores M pointers per layer). For billion-scale datasets, hybrid approaches like IVF+HNSW are used, where IVF partitions the data and HNSW searches within each partition.