Recipe: Model + feature drift alert design
A production-tested pattern for detecting distributional drift in ML pipelines and surfacing actionable alerts before silent degradation reaches customers.
Why drift matters
Models degrade in production not because code changes, but because the world changes. Feature distributions shift, label relationships erode, and accuracy decays silently. Without drift monitoring, you discover the problem from customer complaints.
Architecture
The recipe uses a three-tier alert pipeline: statistical detection, severity scoring, and notification routing. Each tier is decoupled so you can swap the detector (KS test, PSI, Jensen-Shannon) or the alert sink (Slack, PagerDuty, webhook) independently.
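One way to keep the tiers decoupled is to pass each one in as a plain callable. The sketch below is illustrative only: the names `run_pipeline`, `Detector`, `Scorer`, and `Sink` are hypothetical and do not come from the recipe, and the payload carries just two fields to keep the example short.

```python
from typing import Callable, Dict, Iterable, Sequence

# Hypothetical type aliases, not part of the recipe: each tier is just a callable.
Detector = Callable[[Sequence[float], Sequence[float]], float]  # Tier 1
Scorer = Callable[[float], float]                               # Tier 2
Sink = Callable[[Dict[str, float]], None]                       # Tier 3


def run_pipeline(
    reference: Sequence[float],
    production: Sequence[float],
    detector: Detector,
    scorer: Scorer,
    sinks: Iterable[Sink],
) -> Dict[str, float]:
    """Run detection, scoring, and routing without any tier knowing the others."""
    drift = detector(reference, production)
    severity = scorer(drift)
    payload = {"drift": drift, "severity": severity}
    for sink in sinks:
        sink(payload)  # e.g. a Slack, PagerDuty, or webhook adapter
    return payload
```

Swapping PSI for a KS test, or Slack for a webhook, then means passing a different function rather than changing the pipeline itself.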
Tier 1 — Detection
Compare reference and production windows using the Population Stability Index (PSI) for each feature and a rolling accuracy monitor for model output. Thresholds: a PSI above 0.25 triggers a warning; an accuracy drop of more than 5% triggers a critical alert.
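A minimal sketch of the PSI check for one numeric feature follows, using only the standard library. The equal-width binning, the bin count of 10, and the small floor `eps` for empty bins are assumptions made for illustration; the recipe does not specify them. Only the 0.25 warning threshold comes from the text above.

```python
import math
from bisect import bisect_right

PSI_WARNING = 0.25  # warning threshold stated in the recipe


def psi(reference, production, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature.

    Assumptions (not from the recipe): equal-width bins over the reference
    range, and empty bins floored at `eps` so the logarithm stays defined.
    """
    if not reference or not production:
        raise ValueError("both windows must contain at least one value")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference window has no spread to bin")
    width = (hi - lo) / bins
    # Interior edges only; values outside the reference range fall in the end bins.
    edges = [lo + i * width for i in range(1, bins)]

    def proportions(values):
        counts = [0] * bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        return [max(c / len(values), eps) for c in counts]

    ref = proportions(reference)
    prod = proportions(production)
    # PSI = sum over bins of (p - r) * ln(p / r)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref, prod))


def psi_warning(reference, production):
    """True when the feature's PSI exceeds the warning threshold."""
    return psi(reference, production) > PSI_WARNING
```

Identical windows give a PSI of zero, and the value grows as production mass moves into bins that were sparse in the reference window.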
Tier 2 — Severity
Combine feature drift magnitude, model impact, and traffic volume into a single severity score (0–100). Scores above 70 page on-call; scores from 40 to 70 open a Slack thread; scores below 40 are logged for weekly review.
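The scoring and banding can be sketched as below. The recipe does not say how the three inputs are weighted or normalized, so the weights (0.4, 0.4, 0.2) and the saturation points (PSI of 0.5, accuracy drop of 0.10) are placeholder assumptions to tune per model. Only the 40 and 70 cut-offs come from the text.

```python
def severity_score(psi_value, accuracy_drop, traffic_share,
                   weights=(0.4, 0.4, 0.2)):
    """Blend drift, model impact, and traffic into a 0-100 score.

    All normalization constants and weights here are illustrative
    assumptions, not values from the recipe.
    """
    # Clamp each input to [0, 1] before weighting.
    drift = min(max(psi_value, 0.0) / 0.5, 1.0)        # saturates at PSI 0.5
    impact = min(max(accuracy_drop, 0.0) / 0.10, 1.0)  # saturates at a 0.10 drop
    traffic = min(max(traffic_share, 0.0), 1.0)        # share of total traffic
    w_drift, w_impact, w_traffic = weights
    return 100.0 * (w_drift * drift + w_impact * impact + w_traffic * traffic)


def route(score):
    """Map a severity score to the action bands described in the recipe."""
    if score > 70:
        return "page"   # page on-call
    if score >= 40:
        return "slack"  # open a Slack thread
    return "log"        # log for weekly review
```

A score of exactly 70 lands in the Slack band here, since the text pages only for scores above 70.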
Tier 3 — Routing
Alerts carry structured payloads: affected features, drift magnitude, reference window range, and a link to the monitoring dashboard. Deduplication keys prevent alert storms during transient spikes.
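A possible shape for the payload and its deduplication key is sketched below. The class names, field names, and the 30-minute suppression window are hypothetical. The recipe lists what the payload carries but not how the key is built, so deriving it from the model name plus the sorted feature set is an assumption.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class DriftAlert:
    model: str                         # hypothetical field: which model drifted
    affected_features: Tuple[str, ...]
    drift_magnitude: float
    reference_window: Tuple[str, str]  # (start, end) as ISO 8601 strings
    dashboard_url: str

    def dedup_key(self) -> str:
        # Assumed scheme: same model + same feature set => same key,
        # regardless of feature order or drift magnitude.
        raw = self.model + "|" + ",".join(sorted(self.affected_features))
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


class Deduplicator:
    """Suppress repeats of the same key inside a fixed time window."""

    def __init__(self, suppress_seconds: float = 1800.0):
        self.suppress_seconds = suppress_seconds
        self._last_sent: Dict[str, float] = {}

    def should_send(self, alert: DriftAlert, now: float) -> bool:
        key = alert.dedup_key()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.suppress_seconds:
            return False  # still inside the suppression window
        self._last_sent[key] = now
        return True
```

Leaving the drift magnitude out of the key means a metric that bounces around a threshold keeps mapping to the same alert instead of opening a new one on each run.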
Implementation notes
Run detection as a scheduled job (every 15 minutes for high-traffic models, hourly for batch models). Store reference distributions in your feature store. Use exponential backoff for alert retries. Always include a runbook link in the alert payload so the responder knows what to check first.
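The retry behaviour can be sketched as a small wrapper around whatever function delivers the alert. The function name, the attempt limit, the delay bounds, and the jitter fraction are all illustrative assumptions. `send` stands in for your own sink client and is not a real library call.

```python
import random
import time


def send_with_backoff(send, payload, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, jitter=0.1, sleep=time.sleep):
    """Call send(payload), retrying failures with exponential backoff.

    Delays double each attempt (base_delay, 2x, 4x, ...) up to max_delay,
    plus up to `jitter` * delay of random slack so retries do not align.
    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return send(payload)
        except Exception:
            if attempt == max_attempts - 1:
                raise
            delay = min(base_delay * (2 ** attempt), max_delay)
            sleep(delay + random.uniform(0.0, delay * jitter))
```

The `sleep` parameter is injectable so the wrapper can be exercised in tests without real waiting. In the scheduled job, the payload passed in would carry the runbook link alongside the drift details.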