Recipe: Dataset drift detection

Monitor feature and label distributions in production to catch silent model degradation before it impacts users.

Overview

Dataset drift occurs when the statistical properties of incoming data diverge from the training distribution. Meridian compares live inference payloads against a stored baseline using the two-sample Kolmogorov-Smirnov test and Jensen-Shannon divergence, surfacing per-feature drift scores in real time.
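
To make the two statistics concrete, here is a minimal sketch in plain Python. It is not Meridian's implementation; the function names are illustrative, and the binning of each feature into probability lists is assumed to have been done already. The JS divergence below uses a base-2 logarithm, so it ranges from 0 to 1, and the JS distance is its square root.

```python
import bisect
import math


def ks_statistic(baseline, live):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(baseline), sorted(live)
    gap = 0.0
    # Both empirical CDFs are step functions, so checking every observed
    # value is enough to find the maximum difference.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two binned distributions.

    p and q are probability lists over the same bins, each summing to 1.
    """
    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    # m is the midpoint distribution; it is nonzero wherever p or q is.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return (kl(p, m) + kl(q, m)) / 2


def js_distance(p, q):
    """JS distance is the square root of the JS divergence."""
    return math.sqrt(js_divergence(p, q))
```

Identical samples give a KS statistic of 0 and fully separated samples give 1; the same holds for the JS divergence of identical and non-overlapping distributions.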

Prerequisites

  • Meridian SDK v2.1+ instrumented in your inference pipeline
  • A baseline dataset exported from your training or validation split
  • At least 200 inference requests logged to establish a comparison window

Steps

  1. Upload baseline

    Navigate to Datasets → Baselines and upload a CSV or Parquet file containing the features and labels your model expects.

  2. Enable drift monitoring

    Toggle drift detection on the model's configuration page. Select the baseline and set a drift threshold (default: 0.15 JS distance).

  3. Inspect drift reports

    Open Monitoring → Drift to view per-feature scores, historical trends, and automated alerts when thresholds are breached.
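
The threshold check behind steps 2 and 3 can be sketched locally as follows. This is an illustration only: the function name, feature names, and scores are made up, and it does not call any Meridian API. It assumes per-feature scores are available as a plain dictionary.

```python
DEFAULT_THRESHOLD = 0.15  # default drift threshold quoted in step 2


def breached_features(scores, threshold=DEFAULT_THRESHOLD):
    """Return (feature, score) pairs above the threshold, highest first."""
    breached = [(name, s) for name, s in scores.items() if s > threshold]
    return sorted(breached, key=lambda item: item[1], reverse=True)


# Hypothetical per-feature scores, for illustration only.
scores = {"age": 0.04, "country": 0.22, "session_length": 0.31}
for name, score in breached_features(scores):
    print(f"{name}: {score:.2f} exceeds {DEFAULT_THRESHOLD}")
```

Sorting by score puts the most shifted features first, which is usually where an investigation should start.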

Interpreting results

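A per-feature JS score above the configured threshold (0.15 by default) indicates meaningful distribution shift. Drift alone does not mean the model is failing, so pair the score with prediction accuracy metrics to distinguish benign drift (for example, seasonal patterns) from harmful drift (for example, data pipeline bugs or upstream schema changes).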
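
One way to combine the two signals is a simple triage rule, sketched below. The function name, the labels it returns, and the `accuracy_tolerance` value are assumptions chosen for illustration, not Meridian settings; tune them to your own model.

```python
def classify_drift(js_score, baseline_accuracy, live_accuracy,
                   drift_threshold=0.15, accuracy_tolerance=0.02):
    """Rough triage: combine a drift score with the change in accuracy."""
    if js_score <= drift_threshold:
        return "below_threshold"
    accuracy_drop = baseline_accuracy - live_accuracy
    if accuracy_drop > accuracy_tolerance:
        # The distribution shifted and the model got worse: investigate
        # pipeline bugs or schema changes first.
        return "likely_harmful"
    # The distribution shifted but accuracy held, as with seasonal patterns.
    return "likely_benign"
```

A "likely_benign" result is still worth tracking over time, since accuracy metrics often lag behind the input shift when labels arrive late.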