
Time-Series DB (InfluxDB)


Time-Series Database Fundamentals

A time-series database is optimized for storing and querying data points indexed by timestamp. Each data point typically consists of a timestamp, a set of tags (metadata), and one or more field values (measurements). Time-series data arises naturally in monitoring (CPU usage, request latency), IoT (sensor readings, vehicle telemetry), financial markets (stock prices, trade volumes), and event tracking (user actions, log events).

Write Pattern: Append-Only, High Cardinality

Time-series workloads are write-heavy and append-only. Data arrives continuously and is rarely updated. The write rate can be enormous: a fleet of 10,000 servers each reporting 100 metrics every 10 seconds generates 100,000 data points per second. The "cardinality" of a time-series database is the number of unique tag combinations, each of which defines a separate series. High cardinality (e.g., per-user tags) stresses indexing because every new tag combination creates a new series that must be tracked.
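These numbers can be sanity-checked with quick arithmetic. A sketch using the figures above; the 50,000-user count is a made-up illustration of a high-cardinality tag, not from the text:

```python
# Write rate for the fleet described above.
servers = 10_000
metrics_per_server = 100
interval_s = 10
writes_per_second = servers * metrics_per_server // interval_s
print(writes_per_second)  # 100000

# Cardinality: one series per unique tag combination.
base_series = servers * metrics_per_server  # host x metric: 1,000,000 series
# A hypothetical per-user tag (50,000 users) multiplies the series count:
per_user_series = base_series * 50_000      # 50 billion series -- an index killer
```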

Query Pattern: Range Scans and Aggregations

Queries almost always involve a time range: "give me CPU usage for server X over the last 24 hours." Aggregation is common: average, percentile, min/max over time windows (e.g., "average CPU every 5 minutes"). Point queries by primary key are rare. This is the opposite of OLTP workloads.
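The canonical query shape (range filter plus fixed-window aggregation) can be sketched in a few lines. This is a toy illustration of what "average every 5 minutes" computes, not any particular database's implementation:

```python
from collections import defaultdict

def window_avg(points, start, end, window_s=300):
    """Average value per fixed window over [start, end).

    points: iterable of (unix_ts, value) pairs, e.g. CPU samples.
    Returns {window_start_ts: average} -- the shape of a typical
    "GROUP BY time(5m)" result.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for ts, value in points:
        if start <= ts < end:                     # range scan
            bucket = ts - (ts - start) % window_s  # assign to 5-minute window
            sums[bucket] += value
            counts[bucket] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}

samples = [(0, 10.0), (100, 20.0), (300, 40.0)]
print(window_avg(samples, start=0, end=600))  # {0: 15.0, 300: 40.0}
```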

Storage Engine

Time-series databases use specialized storage:

  • Time-partitioned shards: Data is divided into chunks by time range (e.g., one shard per day or week). Old shards can be dropped without expensive deletion.
  • Columnar layout: Each field is stored in a separate column, enabling efficient compression (timestamps are monotonically increasing, values are often similar) and fast scans.
  • LSM-based write path: Incoming data is buffered in memory and flushed to immutable on-disk segments, similar to LSM trees but optimized for time-ordered data.

Retention Policies and Downsampling

Old data is often less valuable at full resolution. A retention policy automatically deletes data older than a threshold (e.g., 30 days of raw data). Downsampling aggregates old data to lower resolution: instead of per-second readings, keep per-hour averages. This dramatically reduces storage while preserving trends.

Notable Systems

  • InfluxDB: Purpose-built time-series DB with its own query language (InfluxQL / Flux). Uses a TSI (Time-Series Index) for tag indexing and TSM (Time-Structured Merge tree) for data storage.
  • TimescaleDB: PostgreSQL extension. Transparent hypertables partition data by time. Full SQL compatibility.
  • Prometheus: Pull-based monitoring system. Scrapes metrics from HTTP endpoints at configurable intervals. Local storage with remote-write for long-term retention (e.g., Thanos, Cortex).

Real-Life: Monitoring Infrastructure with Prometheus + InfluxDB

Real-World Example

Prometheus for infrastructure monitoring:

  • Each server runs a metrics exporter (e.g., node_exporter for Linux). Prometheus scrapes these endpoints every 15 seconds.
  • Data is stored locally on the Prometheus server. PromQL enables powerful queries: rate(http_requests_total{status="500"}[5m]) computes the per-second rate of 500 errors over 5-minute windows.
  • Alertmanager fires alerts when PromQL conditions are met (e.g., error rate > 1% for 5 minutes).
  • For long-term storage, Prometheus remote-writes to Thanos or Cortex, which store data in object storage (S3).
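The semantics of the rate() query above can be approximated by hand: take a monotonically increasing counter, keep the samples inside the window, and divide the increase by the elapsed time. A rough sketch only; real Prometheus also handles counter resets and extrapolates to the window boundaries:

```python
def simple_rate(samples, window_s):
    """Per-second increase of a counter over the trailing window.

    samples: list of (unix_ts, counter_value), sorted by time.
    Rough approximation of PromQL rate(); ignores counter resets.
    """
    newest_ts = samples[-1][0]
    in_window = [s for s in samples if s[0] >= newest_ts - window_s]
    if len(in_window) < 2:
        return 0.0
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)

# 600 new 500-errors over a 5-minute window -> 2 errors/second.
counter = [(0, 1000), (150, 1300), (300, 1600)]
print(simple_rate(counter, window_s=300))  # 2.0
```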

InfluxDB for IoT:

  • A factory has 1,000 sensors each reporting temperature, pressure, and vibration every second. That's 3,000 writes/second.
  • InfluxDB stores this in TSM format, partitioned by 1-hour shards. Queries like "average temperature per sensor per 5 minutes over the last 24 hours" scan only the relevant time shards.
  • Retention policy: keep raw data for 7 days, keep 1-hour downsampled aggregates for 1 year.
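The payoff of this retention policy is easy to quantify with the factory's numbers (1,000 sensors, 3 fields, one reading per second, hourly rollups):

```python
points_per_second = 1_000 * 3                    # 3,000 writes/second
raw_points_7d = points_per_second * 86_400 * 7   # raw data kept 7 days
rollups_1y = 1_000 * 3 * 24 * 365                # hourly aggregates kept 1 year

print(raw_points_7d)  # 1814400000  (~1.8 billion points)
print(rollups_1y)     # 26280000    (~26 million points)
```

A full year of hourly rollups costs less than 2% of what a single week of raw per-second data does, which is why downsampling is standard practice.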

TimescaleDB at real scale:

  • Timescale reports customers storing trillions of data points. As a PostgreSQL extension, it supports JOINs between time-series hypertables and regular relational tables—a major advantage over purpose-built TSDBs.

Time-Series Storage: Time-Partitioned Shards

[Diagram: incoming metrics in line protocol (e.g., cpu,host=srv1 usage=72.5 1706140800) are appended to time-partitioned shards, one per day. Older shards (Jan 24: 14 GB, Jan 25: 12 GB, Jan 26: 11 GB) are read-only, with the oldest expiring in 5 days; writes go to the active shard (Jan 27: 6 GB). A query such as avg(cpu) WHERE host='srv1' AND time > Jan 26 scans only the Jan 26 and Jan 27 shards. A retention/downsampling pipeline keeps raw per-second data for 7 days, rolls it up to per-hour avg/min/max aggregates retained for 1 year, and deletes the expired raw data.]