Back to DAG

Geospatial Queries (PostGIS)

databases

Geospatial Data and PostGIS

Standard SQL can store latitude and longitude as plain numbers, but it cannot efficiently answer questions like "find all restaurants within 1 km of my hotel" or "which delivery zones overlap this neighborhood?" These require geospatial operations — computing distances on a sphere, testing containment, finding intersections — and specialized indexes to make them fast.

PostGIS: Spatial Extension for PostgreSQL

PostGIS adds geospatial types, functions, and indexes to PostgreSQL. It is the most widely used open-source spatial database extension, powering mapping applications from OpenStreetMap to Uber.

Geometry vs Geography

PostGIS offers two core types:

  • Geometry: computations on a flat Cartesian plane. Fast and exact for small areas (city-level), but distorts over large distances because the Earth is not flat. Uses SRID 0 (no reference system) or projected SRIDs like SRID 3857 (Web Mercator).
  • Geography: computations on a sphere/spheroid (WGS 84, SRID 4326). Accurate for any distance on Earth but slower because it uses trigonometric great-circle math. Always uses degrees (longitude, latitude).

Rule of thumb: use Geography for anything involving real-world distances (delivery radius, flight paths). Use Geometry for local spatial relationships on projected data (city planning, floor plans).

Well-Known Text (WKT) Format

Spatial data is represented in WKT:

  • POINT(-73.9857 40.7484) — a single point (longitude, latitude for Empire State Building)
  • LINESTRING(-73.99 40.74, -73.98 40.75) — a path
  • POLYGON((-73.99 40.74, -73.98 40.74, -73.98 40.75, -73.99 40.75, -73.99 40.74)) — a closed shape (first and last point must match)

Note: in WKT, the order is (longitude, latitude) — the opposite of Google Maps convention.

Key PostGIS Functions

FunctionDescription
ST_Distance(a, b)Distance between two geometries
ST_DWithin(a, b, d)True if a and b are within distance d
ST_Contains(a, b)True if geometry a fully contains b
ST_Intersects(a, b)True if a and b share any space
ST_Area(a)Area of a polygon
ST_Buffer(a, d)Expands geometry a by distance d
ST_Transform(a, srid)Reproject to a different coordinate system

GiST Index for Spatial Queries

A B-tree index cannot help with 2D spatial queries — it orders data in one dimension. PostGIS uses a GiST (Generalized Search Tree) index, which internally builds an R-tree: a tree of nested bounding boxes. To find all restaurants within 1 km, the R-tree quickly prunes branches whose bounding boxes do not overlap the search circle, then checks only the remaining candidates with exact geometry tests.

CREATE INDEX idx_restaurants_geom ON restaurants USING gist(geom);

Spatial Joins

The most powerful geospatial pattern is the spatial join — combining two tables based on spatial relationships instead of key equality:

SELECT h.name, r.name, ST_Distance(h.geog, r.geog) FROM hotels h JOIN restaurants r ON ST_DWithin(h.geog, r.geog, 1000);

This finds every hotel-restaurant pair within 1000 meters. The GiST index makes this efficient even with millions of rows.

Real-Life: Ride-Sharing Pickup Matching

Real-World Example

A ride-sharing app like Uber needs to answer: "find all available drivers within 2 km of the rider's location." This is a classic geospatial query.

Schema:

CREATE TABLE drivers (
  id SERIAL PRIMARY KEY,
  name TEXT,
  location GEOGRAPHY(POINT, 4326),
  is_available BOOLEAN
);
CREATE INDEX idx_drivers_loc ON drivers USING gist(location);

Query:

SELECT name, ST_Distance(location, ST_MakePoint(-73.9857, 40.7484)::geography) AS dist_m
FROM drivers
WHERE is_available
  AND ST_DWithin(location, ST_MakePoint(-73.9857, 40.7484)::geography, 2000)
ORDER BY dist_m;

The GiST index makes this query fast even with 100,000 active drivers. Without the spatial index, PostgreSQL would compute ST_DWithin for every single driver — a full table scan with expensive trigonometric calculations.

Other spatial use cases:

  • Delivery zones: ST_Contains to check if an address falls within a delivery polygon
  • Geofencing: trigger alerts when a GPS tracker leaves a predefined boundary (ST_Intersects with the boundary polygon)
  • Land use analysis: ST_Area and ST_Intersection to compute overlap between parcels and flood zones

R-tree Bounding Box Index for Spatial Search

R-tree index: bounding boxes prune spatial search Region A r1 r2 r3 r4 r5 Region B r6 r7 r8 r9 r10 r11 query R-tree search: 1. Check root bounding boxes Region A: no overlap → skip all 5 points Region B: overlaps! → check child boxes 2. Exact distance check on candidates only Result: r6, r7, r8, r10 within radius r9, r11 too far Checked 6 of 11 points instead of all 11
Step 1 of 2