Geospatial Data and PostGIS
Standard SQL can store latitude and longitude as plain numbers, but it cannot efficiently answer questions like "find all restaurants within 1 km of my hotel" or "which delivery zones overlap this neighborhood?" These require geospatial operations — computing distances on a sphere, testing containment, finding intersections — and specialized indexes to make them fast.
PostGIS: Spatial Extension for PostgreSQL
PostGIS adds geospatial types, functions, and indexes to PostgreSQL. It is the most widely used open-source spatial database extension, powering mapping applications from OpenStreetMap to Uber.
Geometry vs Geography
PostGIS offers two core types:
- Geometry: computations on a flat Cartesian plane. Fast and exact for small areas (city-level), but distorts over large distances because the Earth is not flat. Uses SRID 0 (no reference system) or projected SRIDs like SRID 3857 (Web Mercator).
- Geography: computations on a sphere/spheroid (WGS 84, SRID 4326). Accurate for any distance on Earth but slower because it uses trigonometric great-circle math. Always uses degrees (longitude, latitude).
Rule of thumb: use Geography for anything involving real-world distances (delivery radius, flight paths). Use Geometry for local spatial relationships on projected data (city planning, floor plans).
Well-Known Text (WKT) Format
Spatial data is represented in WKT:
POINT(-73.9857 40.7484)— a single point (longitude, latitude for Empire State Building)LINESTRING(-73.99 40.74, -73.98 40.75)— a pathPOLYGON((-73.99 40.74, -73.98 40.74, -73.98 40.75, -73.99 40.75, -73.99 40.74))— a closed shape (first and last point must match)
Note: in WKT, the order is (longitude, latitude) — the opposite of Google Maps convention.
Key PostGIS Functions
| Function | Description |
|---|---|
| ST_Distance(a, b) | Distance between two geometries |
| ST_DWithin(a, b, d) | True if a and b are within distance d |
| ST_Contains(a, b) | True if geometry a fully contains b |
| ST_Intersects(a, b) | True if a and b share any space |
| ST_Area(a) | Area of a polygon |
| ST_Buffer(a, d) | Expands geometry a by distance d |
| ST_Transform(a, srid) | Reproject to a different coordinate system |
GiST Index for Spatial Queries
A B-tree index cannot help with 2D spatial queries — it orders data in one dimension. PostGIS uses a GiST (Generalized Search Tree) index, which internally builds an R-tree: a tree of nested bounding boxes. To find all restaurants within 1 km, the R-tree quickly prunes branches whose bounding boxes do not overlap the search circle, then checks only the remaining candidates with exact geometry tests.
CREATE INDEX idx_restaurants_geom ON restaurants USING gist(geom);
Spatial Joins
The most powerful geospatial pattern is the spatial join — combining two tables based on spatial relationships instead of key equality:
SELECT h.name, r.name, ST_Distance(h.geog, r.geog) FROM hotels h JOIN restaurants r ON ST_DWithin(h.geog, r.geog, 1000);
This finds every hotel-restaurant pair within 1000 meters. The GiST index makes this efficient even with millions of rows.
Real-Life: Ride-Sharing Pickup Matching
A ride-sharing app like Uber needs to answer: "find all available drivers within 2 km of the rider's location." This is a classic geospatial query.
Schema:
CREATE TABLE drivers (
id SERIAL PRIMARY KEY,
name TEXT,
location GEOGRAPHY(POINT, 4326),
is_available BOOLEAN
);
CREATE INDEX idx_drivers_loc ON drivers USING gist(location);
Query:
SELECT name, ST_Distance(location, ST_MakePoint(-73.9857, 40.7484)::geography) AS dist_m
FROM drivers
WHERE is_available
AND ST_DWithin(location, ST_MakePoint(-73.9857, 40.7484)::geography, 2000)
ORDER BY dist_m;
The GiST index makes this query fast even with 100,000 active drivers. Without the spatial index, PostgreSQL would compute ST_DWithin for every single driver — a full table scan with expensive trigonometric calculations.
Other spatial use cases:
- Delivery zones: ST_Contains to check if an address falls within a delivery polygon
- Geofencing: trigger alerts when a GPS tracker leaves a predefined boundary (ST_Intersects with the boundary polygon)
- Land use analysis: ST_Area and ST_Intersection to compute overlap between parcels and flood zones