Back to DAG

R-Tree (Geospatial)

databases

What is an R-Tree?

An R-tree is a balanced tree data structure used to index multi-dimensional data such as geographic coordinates, rectangles, and polygons. It is the spatial analog of a B-tree: while a B-tree indexes one-dimensional keys, an R-tree indexes objects in two or more dimensions using Minimum Bounding Rectangles (MBRs).

Structure

Each node in an R-tree contains between m and M entries (minimum and maximum fill). A leaf node entry contains a pointer to the actual data object and its MBR. An internal node entry contains a pointer to a child node and an MBR that tightly encloses all MBRs in that child subtree. The root has at least two children (unless it is a leaf).

Search

To answer a range query (find all objects that overlap a query rectangle Q):

  1. Start at the root. For each entry whose MBR overlaps Q, recurse into that child.
  2. At leaf nodes, check each entry's MBR against Q and return matching objects.
  3. Multiple subtrees may be visited because MBRs of sibling nodes can overlap — this is the key difference from B-trees, where ranges do not overlap.

Insert

To insert a new object:

  1. Start at the root. At each level, choose the child whose MBR requires the smallest area enlargement to include the new object. Break ties by choosing the MBR with the smallest area.
  2. At the leaf, add the entry. If the node overflows (more than M entries), split it into two nodes, minimizing overlap between the resulting MBRs.
  3. Propagate the split upward, adjusting MBRs along the path.

Node splitting strategies

  • Quadratic split: Try all pairs of entries as seeds, assign remaining entries to minimize area. O(M^2).
  • Linear split: Pick the two entries most separated along any axis as seeds. O(M). Faster but lower quality.

Variants

  • R-tree*: On overflow, first attempts forced reinsertion — removes a fraction of entries and reinserts them. This improves tree quality and reduces overlap. Used in most modern implementations.
  • R+-tree: Eliminates overlap between sibling MBRs by allowing objects to be stored in multiple leaf nodes. This guarantees single-path search for point queries but complicates insertion and deletion.

Complexity

OperationAverage caseWorst case
SearchO(log n)O(n)
InsertO(log n)O(n)
DeleteO(log n)O(n)

Worst case occurs when MBRs overlap heavily, forcing many paths to be explored. Well-constructed R*-trees keep overlap minimal.

Where R-trees are used

  • PostGIS: spatial indexing for PostgreSQL (uses GiST framework with R-tree logic).
  • SQLite R*Tree module: built-in virtual table for spatial queries.
  • Oracle Spatial: R-tree indexing for SDO_GEOMETRY columns.
  • Elasticsearch: geo_shape queries backed by BKD trees (a related spatial structure).

Real-Life: Finding Nearby Restaurants

Real-World Example

When a map application answers "find all restaurants within 500 meters of my location," it uses a spatial index.

Without an R-tree: scan every restaurant in the database and compute its distance. For 10 million restaurants, that is 10 million distance calculations per query — far too slow.

With an R-tree: the query rectangle (a bounding box around the 500m radius) is checked against the root MBRs. Only subtrees whose MBRs overlap the query box are explored. In practice, this prunes 99%+ of the data, touching perhaps a few hundred nodes instead of millions.

PostGIS example:

SELECT name FROM restaurants
WHERE ST_DWithin(location, ST_MakePoint(-73.98, 40.76)::geography, 500);

PostGIS automatically uses the R-tree index on the location column. The planner translates the distance query into a bounding-box filter (exploiting the R-tree), then applies exact distance filtering on the small set of candidates.

Other spatial queries powered by R-trees:

  • "Find all parcels that intersect this polygon" (land registry)
  • "Which delivery zones contain this address?" (logistics)
  • "Select all sensor readings within this geographic region" (IoT)

R-Tree: MBRs and Search

Spatial View MBR-A p1 p2 p3 p4 MBR-B p5 p6 p7 Query Q Tree View Root MBR-A MBR-B p1-3 p4 p5 p6-7 Q overlaps both MBRs Search enters A and B Returns p4, p5 (overlap Q) Insert Strategy Choose child with smallest area enlargement New point near MBR-B? Insert into B subtree. If node overflows (>M entries), split it. R*-tree: try forced reinsertion before split.
Step 1 of 2