ML feature store design

A reference architecture for offline/online feature serving with point-in-time correctness.

Core components

  • Batch transformation pipelines (Spark / Beam)
  • Offline store — Parquet on S3, partitioned by date
  • Online store — Redis Cluster, feature vectors keyed by entity
  • Feature registry — metadata catalog with versioned schemas
  • Point-in-time join engine for training set generation

Serving patterns

Online inference reads pre-computed feature vectors from Redis by entity ID lookup, with a p99 latency target under 5 ms. Offline training jobs materialize time-travel snapshots by joining each label to the feature values as of that label's timestamp.
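The offline as-of join can be sketched in plain Python. This is a minimal illustration, not the join engine itself: the function name as_of_join and the tuple layouts are assumptions made for the example, and a production pipeline would run the same logic in Spark or Beam over the Parquet store.

```python
from bisect import bisect_right
from collections import defaultdict


def as_of_join(labels, feature_rows):
    """Attach to each label the latest feature value at or before its timestamp.

    labels:       iterable of (entity_id, label_ts)          -- hypothetical layout
    feature_rows: iterable of (entity_id, event_ts, value)   -- hypothetical layout
    Timestamps only need to be mutually comparable (ints, datetimes, ...).
    """
    # Group feature rows per entity and sort each group by event timestamp.
    by_entity = defaultdict(list)
    for entity_id, event_ts, value in feature_rows:
        by_entity[entity_id].append((event_ts, value))
    for rows in by_entity.values():
        rows.sort(key=lambda r: r[0])

    joined = []
    for entity_id, label_ts in labels:
        rows = by_entity.get(entity_id, [])
        timestamps = [ts for ts, _ in rows]
        # bisect_right returns the index of the first row strictly after
        # label_ts, so the row just before it is the as-of match.
        idx = bisect_right(timestamps, label_ts)
        value = rows[idx - 1][1] if idx > 0 else None
        joined.append((entity_id, label_ts, value))
    return joined
```

A label that precedes every feature row for its entity gets None rather than a later value, which is what keeps future data out of the training set.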

Consistency guarantees

Each feature value carries a valid_from/valid_until range. The join engine keeps only rows whose validity window contains the label timestamp, so a value that became valid after the label time never reaches the training set. This prevents future leakage.
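A minimal sketch of the validity-window filter follows. It assumes a half-open window (valid_from inclusive, valid_until exclusive) and treats a missing valid_until as "still current"; neither convention is stated above, so adjust them to match the actual store. FeatureRow, is_valid_at, and rows_valid_at are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class FeatureRow:
    entity_id: str
    value: float
    valid_from: datetime
    valid_until: Optional[datetime]  # None means still current (assumption)


def is_valid_at(row: FeatureRow, label_ts: datetime) -> bool:
    """True if label_ts falls in [valid_from, valid_until) (assumed half-open)."""
    starts_in_time = row.valid_from <= label_ts
    not_expired = row.valid_until is None or label_ts < row.valid_until
    return starts_in_time and not_expired


def rows_valid_at(rows: List[FeatureRow], entity_id: str,
                  label_ts: datetime) -> List[FeatureRow]:
    """Rows for one entity whose validity window contains label_ts."""
    return [r for r in rows
            if r.entity_id == entity_id and is_valid_at(r, label_ts)]
```

With a half-open window, consecutive versions of a feature can share a boundary timestamp without both matching the same label.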

Monitoring

Track feature drift with the population stability index (PSI), computed on daily batches. Alert when any feature's PSI exceeds 0.25. Log serving latency histograms and cache-hit ratios per feature group.
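For reference, PSI compares the binned distribution of a baseline sample against a current sample: PSI = sum over bins of (a_i - e_i) * ln(a_i / e_i), where e_i and a_i are the baseline and current bin proportions. The sketch below is one way to compute it, assuming equal-width bins over the baseline range and a small floor on empty bins to avoid log(0). The bin count, the floor value, and the function names are illustrative choices; only the 0.25 threshold comes from the text above.

```python
import math

PSI_ALERT_THRESHOLD = 0.25  # threshold stated in the Monitoring section


def psi(expected, actual, bins=10, eps=1e-6):
    """Population stability index of `actual` against the `expected` baseline.

    Both inputs must be non-empty sequences of numbers. Bin edges are
    equal-width over the baseline range (an assumption of this sketch).
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the log below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


def should_alert(expected, actual):
    """True when drift exceeds the alert threshold."""
    return psi(expected, actual) > PSI_ALERT_THRESHOLD
```

Each term in the sum is non-negative, so PSI is zero only when the two binned distributions match and grows as they diverge.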