Back to DAG

Bitmap Index

databases

What is a Bitmap Index?

A bitmap index creates one bit vector (bitmap) for every distinct value in a column. Each bitmap has N bits — one per row in the table. Bit i is set to 1 if row i has that value, and 0 otherwise.

Example

Consider a table with 8 rows and a column color with values {red, blue, green}:

Rowcolor
0red
1blue
2red
3green
4blue
5red
6green
7blue

Three bitmaps are created:

  • red: 1 0 1 0 0 1 0 0
  • blue: 0 1 0 0 1 0 0 1
  • green: 0 0 0 1 0 0 1 0

Query evaluation with bitwise operations

Bitmap indexes shine for multi-predicate queries. To evaluate WHERE color = 'red' AND size = 'large':

  1. Fetch the red bitmap from the color index.
  2. Fetch the large bitmap from the size index.
  3. Compute bitwise AND: the result bitmap has 1s only where both conditions are true.
  4. Scan the result bitmap for set bits — those are the matching row IDs.

For OR predicates (WHERE color = 'red' OR color = 'blue'), compute bitwise OR of the two bitmaps.

The power of bitmaps is that these operations process 64 rows at a time on a 64-bit CPU (one machine word), making them extremely fast.

When bitmap indexes excel

  • Low cardinality columns: columns with few distinct values (gender, status, country, color). Each bitmap is compact.
  • Data warehouse queries: complex filters combining many predicates with AND/OR. Each predicate is a single bitmap lookup; combining them is trivial bitwise operations.
  • Read-heavy workloads: bitmap indexes are fast to scan but expensive to update (inserting a row requires updating every bitmap in the index).

When NOT to use bitmap indexes

  • High cardinality columns (e.g., email, UUID): one bitmap per value means millions of sparse bitmaps — wasteful.
  • OLTP workloads: frequent INSERT/UPDATE/DELETE causes constant bitmap rebuilding. Row-level locking on bitmaps is impractical.

Bitmap compression

Raw bitmaps for large tables waste space (millions of bits, mostly zeros for rare values). Compression techniques:

  • Run-Length Encoding (RLE): encode runs of consecutive 0s or 1s as (value, length) pairs. Highly effective for sorted data.
  • Roaring Bitmaps: hybrid structure — uses arrays for sparse chunks, bitmaps for dense chunks, and RLE for runs. Used in Apache Lucene, Apache Druid, and Apache Spark.
  • Word-Aligned Hybrid (WAH): compresses at word boundaries for fast bitwise operations without decompression.

Where bitmap indexes are used

  • Oracle Database: explicit CREATE BITMAP INDEX statement for star schema queries.
  • Apache Druid: bitmap indexes on every dimension column by default for sub-second OLAP queries.
  • Apache Lucene/Elasticsearch: posting lists stored as Roaring Bitmaps for inverted index intersection.
  • ClickHouse: bitmap functions for set operations in analytics.

Real-Life: E-Commerce Product Filtering

Real-World Example

When you filter products on an e-commerce site ("Brand: Nike AND Size: 10 AND Color: Black AND Price: Under $100"), the backend often uses bitmap indexes.

Without bitmaps: For each filter, scan the table or use a separate B-tree index, then intersect results in memory. With 4 filters, that could mean 4 index lookups and a complex merge.

With bitmaps:

  1. Fetch bitmap for brand = 'Nike'1 0 1 1 0 0 1 0 ...
  2. Fetch bitmap for size = 100 0 1 0 0 1 1 0 ...
  3. Fetch bitmap for color = 'Black'1 0 1 0 0 0 1 0 ...
  4. Fetch bitmap for price < 1001 1 1 1 0 0 0 0 ...
  5. AND all four: 0 0 1 0 0 0 0 0 ... — only row 2 matches.

This takes microseconds because each AND is a single CPU instruction per 64 rows.

Apache Druid uses this exact approach for real-time analytics dashboards. A query like "count orders by region where status='shipped' and category='electronics'" resolves to bitmap intersections followed by a count of set bits (popcount), all in sub-second time even on billions of rows.

Bitmap Index: Construction and Query

Table (8 rows) row color size 0redS 1blueL 2redL 3greenS 4blueM 5redL 6greenM 7blueS Color Bitmaps red 1 0 1 0 0 1 0 0 blue 0 1 0 0 1 0 0 1 green 0 0 0 1 0 0 1 0 Size Bitmaps L 0 1 1 0 0 1 0 0 Query: color='red' AND size='L' red bitmap 1 0 1 0 0 1 0 0 AND L bitmap 0 1 1 0 0 1 0 0 = Result 0 0 1 0 0 1 0 0 rows 2, 5 Bitmap Compression Raw: 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 ... RLE: (0,6)(1,1)(0,7)(1,1)(0,3)... Roaring: sparse chunks → arrays, dense → bitmaps Roaring Bitmaps: best general-purpose compression
Step 1 of 3