Recipe
ML feature store design
A reference architecture for offline/online feature serving with point-in-time correctness.
Core components
- Batch transformation pipelines (Spark / Beam)
- Offline store — Parquet on S3, partitioned by date
- Online store — Redis Cluster, feature vectors keyed by entity
- Feature registry — metadata catalog with versioned schemas
- Point-in-time join engine for training set generation
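As a rough illustration of what "versioned schemas" in the feature registry could mean, here is a minimal in-memory sketch. The class names, fields, and the rule that a (name, version) pair is immutable once registered are all assumptions made for this example, not part of the reference architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSchema:
    name: str     # feature name (hypothetical field)
    version: int  # schema version, assumed to increase over time
    dtype: str    # logical type, e.g. "float64"
    entity: str   # entity the feature is keyed by, e.g. "user"


class FeatureRegistry:
    """In-memory stand-in for a metadata catalog with versioned schemas."""

    def __init__(self) -> None:
        self._schemas: dict[tuple[str, int], FeatureSchema] = {}

    def register(self, schema: FeatureSchema) -> None:
        key = (schema.name, schema.version)
        # Assumption: a published version is never overwritten.
        if key in self._schemas:
            raise ValueError(f"{schema.name} v{schema.version} is already registered")
        self._schemas[key] = schema

    def latest(self, name: str) -> FeatureSchema:
        versions = [s for (n, _), s in self._schemas.items() if n == name]
        if not versions:
            raise KeyError(name)
        return max(versions, key=lambda s: s.version)
```

Treating versions as immutable means a training set can record the exact schema version it was built against.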
Serving patterns
Online inference reads pre-computed feature vectors from Redis by entity ID lookup, with a p99 latency target under 5 ms. Offline training jobs materialize time-travel snapshots by joining each label with the feature values as of that label's timestamp.
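The online read path can be sketched as a single hash lookup per entity. The key layout `fv:{feature_group}:{entity_id}`, the function names, and the choice of one hash per entity are assumptions for illustration. `InMemoryStore` is a dict-backed stand-in so the example runs without a Redis server; a real client exposing an `hgetall` method that returns string keys and values could be passed in its place.

```python
from typing import Mapping, Optional, Protocol, Sequence


class HashStore(Protocol):
    """Minimal interface the lookup needs; a Redis client would sit behind it."""

    def hgetall(self, key: str) -> Mapping[str, str]: ...


class InMemoryStore:
    """Dict-backed stand-in for the online store (not a real Redis client)."""

    def __init__(self) -> None:
        self._data: dict[str, dict[str, str]] = {}

    def hset(self, key: str, mapping: Mapping[str, str]) -> None:
        self._data.setdefault(key, {}).update(mapping)

    def hgetall(self, key: str) -> Mapping[str, str]:
        # A missing key yields an empty mapping, mirroring Redis HGETALL.
        return dict(self._data.get(key, {}))


def feature_key(feature_group: str, entity_id: str) -> str:
    # Hypothetical key layout: one hash per (feature group, entity).
    return f"fv:{feature_group}:{entity_id}"


def get_feature_vector(
    store: HashStore,
    feature_group: str,
    entity_id: str,
    feature_names: Sequence[str],
) -> list[Optional[float]]:
    row = store.hgetall(feature_key(feature_group, entity_id))
    # Features absent from the hash become None so the caller can apply defaults.
    return [float(row[name]) if name in row else None for name in feature_names]
```

Fetching the whole vector in one round trip, rather than one call per feature, is one way to keep the lookup within a tight latency budget.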
Consistency guarantees
Feature values carry a valid_from/valid_until range. The join engine keeps only rows whose validity window contains the label timestamp, which prevents values from after the label timestamp from leaking into training data.
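A minimal sketch of the validity-window filter follows. It assumes integer epoch-second timestamps, a half-open window (valid_from inclusive, valid_until exclusive), and None as the marker for a still-current value; the text does not specify any of these, and the names `FeatureRow`, `value_as_of`, and `build_training_set` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeatureRow:
    entity_id: str
    value: float
    valid_from: int             # epoch seconds, inclusive (assumption)
    valid_until: Optional[int]  # epoch seconds, exclusive; None = still current


def value_as_of(rows, entity_id, label_ts):
    """Return the value whose validity window contains label_ts, or None."""
    for row in rows:
        if row.entity_id != entity_id:
            continue
        open_ended = row.valid_until is None
        if row.valid_from <= label_ts and (open_ended or label_ts < row.valid_until):
            return row.value
    return None


def build_training_set(labels, rows):
    """labels: iterable of (entity_id, label_ts, label) tuples."""
    return [
        (entity_id, label_ts, value_as_of(rows, entity_id, label_ts), label)
        for entity_id, label_ts, label in labels
    ]
```

The linear scan keeps the logic visible; a real join engine would sort or index rows by entity and valid_from. The half-open window means a label stamped exactly at a changeover picks up the newer value, and a label earlier than every window gets no value at all instead of a future one.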
Monitoring
Track feature drift with the population stability index (PSI), computed on daily batches. Alert when the PSI of any feature exceeds 0.25. Log serving latency histograms and cache-hit ratios per feature group.
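For reference, PSI over a set of bins is the sum of (actual - expected) * ln(actual / expected), where expected and actual are the baseline and current fraction of values in each bin. The sketch below assumes fixed bin edges supplied by the caller and floors empty bins at a small epsilon so the logarithm stays finite; both choices, and the function names, are assumptions for this example.

```python
import math
from bisect import bisect_right

PSI_ALERT_THRESHOLD = 0.25  # alert threshold stated in the text


def bin_fractions(values, edges, eps=1e-6):
    """Fraction of values per bin; edges are sorted interior cut points."""
    if not values:
        raise ValueError("cannot compute bin fractions for an empty batch")
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    total = len(values)
    # Floor each fraction at eps so empty bins do not produce log(0).
    return [max(c / total, eps) for c in counts]


def psi(baseline, current, edges):
    expected = bin_fractions(baseline, edges)
    actual = bin_fractions(current, edges)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


def drifted(baseline, current, edges):
    return psi(baseline, current, edges) > PSI_ALERT_THRESHOLD
```

The bin edges would typically be fixed from the baseline distribution (for example its deciles) and reused for every daily batch, so that scores stay comparable from day to day.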