What is a Bitmap Index?

Prerequisites(1)

A bitmap index creates one bit vector (bitmap) for every distinct value in a column. Each bitmap has N bits — one per row in the table. Bit i is set to 1 if row i has that value, and 0 otherwise.

Example

Consider a table with 8 rows and a column color with values {red, blue, green}:

Row	color
0	red
1	blue
2	red
3	green
4	blue
5	red
6	green
7	blue

Three bitmaps are created:

red: 1 0 1 0 0 1 0 0
blue: 0 1 0 0 1 0 0 1
green: 0 0 0 1 0 0 1 0

Query evaluation with bitwise operations

Bitmap indexes shine for multi-predicate queries. To evaluate WHERE color = 'red' AND size = 'large':

Fetch the red bitmap from the color index.
Fetch the large bitmap from the size index.
Compute bitwise AND: the result bitmap has 1s only where both conditions are true.
Scan the result bitmap for set bits — those are the matching row IDs.

For OR predicates (WHERE color = 'red' OR color = 'blue'), compute bitwise OR of the two bitmaps.

The power of bitmaps is that these operations process 64 rows at a time on a 64-bit CPU (one machine word), making them extremely fast.

When bitmap indexes excel

Low cardinality columns: columns with few distinct values (gender, status, country, color). Each bitmap is compact.
Data warehouse queries: complex filters combining many predicates with AND/OR. Each predicate is a single bitmap lookup; combining them is trivial bitwise operations.
Read-heavy workloads: bitmap indexes are fast to scan but expensive to update (inserting a row requires updating every bitmap in the index).

When NOT to use bitmap indexes

High cardinality columns (e.g., email, UUID): one bitmap per value means millions of sparse bitmaps — wasteful.
OLTP workloads: frequent INSERT/UPDATE/DELETE causes constant bitmap rebuilding. Row-level locking on bitmaps is impractical.

Bitmap compression

Raw bitmaps for large tables waste space (millions of bits, mostly zeros for rare values). Compression techniques:

Run-Length Encoding (RLE): encode runs of consecutive 0s or 1s as (value, length) pairs. Highly effective for sorted data.
Roaring Bitmaps: hybrid structure — uses arrays for sparse chunks, bitmaps for dense chunks, and RLE for runs. Used in Apache Lucene, Apache Druid, and Apache Spark.
Word-Aligned Hybrid (WAH): compresses at word boundaries for fast bitwise operations without decompression.

Where bitmap indexes are used

Oracle Database: explicit CREATE BITMAP INDEX statement for star schema queries.
Apache Druid: bitmap indexes on every dimension column by default for sub-second OLAP queries.
Apache Lucene/Elasticsearch: posting lists stored as Roaring Bitmaps for inverted index intersection.
ClickHouse: bitmap functions for set operations in analytics.

Real-Life: E-Commerce Product Filtering

Real-World Example

When you filter products on an e-commerce site ("Brand: Nike AND Size: 10 AND Color: Black AND Price: Under $100"), the backend often uses bitmap indexes.

Without bitmaps: For each filter, scan the table or use a separate B-tree index, then intersect results in memory. With 4 filters, that could mean 4 index lookups and a complex merge.

With bitmaps:

Fetch bitmap for brand = 'Nike' → 1 0 1 1 0 0 1 0 ...
Fetch bitmap for size = 10 → 0 0 1 0 0 1 1 0 ...
Fetch bitmap for color = 'Black' → 1 0 1 0 0 0 1 0 ...
Fetch bitmap for price < 100 → 1 1 1 1 0 0 0 0 ...
AND all four: 0 0 1 0 0 0 0 0 ... — only row 2 matches.

This takes microseconds because each AND is a single CPU instruction per 64 rows.

Apache Druid uses this exact approach for real-time analytics dashboards. A query like "count orders by region where status='shipped' and category='electronics'" resolves to bitmap intersections followed by a count of set bits (popcount), all in sub-second time even on billions of rows.

Bitmap Index: Construction and Query

Step 1 of 3