Recipe

Great Expectations data-quality rules

Validate every row before it poisons your pipeline. This recipe wires GX expectations into Meridian's ingestion guardrail.

Ingredients

  • Great Expectations ≥ 0.18
  • Meridian ingestion endpoint (v2)
  • Expectation Suite JSON
  • Checkpoint config YAML

Steps

  1. Define the suite. Scaffold it with the great_expectations suite new CLI command, then add three expectations: expect_column_values_to_not_be_null, expect_column_values_to_be_in_set, and expect_column_mean_to_be_between.
  2. Edit the checkpoint. Point the datasource at your staging table and set action_list to fire a webhook on failure.
  3. Wire Meridian. Copy the webhook URL from your Meridian workspace settings into the checkpoint's webhook action. The payload includes run_id, success, and per-expectation results.
  4. Schedule. Run the checkpoint via Airflow or cron every 15 minutes. Meridian will quarantine batches that fail critical expectations.
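
As a rough illustration of step 1, the three expectations can be written out in the Expectation Suite JSON layout that GX 0.18 stores on disk. This sketch uses only the Python standard library. The suite name, the status and order_total columns, the allowed values, and the mean bounds are placeholder assumptions, not values from this recipe; only the email column appears in the sample output.

```python
import json


def expectation(expectation_type, **kwargs):
    """Return one entry for the suite's "expectations" list."""
    return {"expectation_type": expectation_type, "kwargs": kwargs, "meta": {}}


def build_suite(name):
    """Assemble a suite dict holding the three expectations from step 1."""
    return {
        "expectation_suite_name": name,
        "expectations": [
            # Every row must carry an email address.
            expectation("expect_column_values_to_not_be_null", column="email"),
            # Hypothetical column and value set, for illustration only.
            expectation(
                "expect_column_values_to_be_in_set",
                column="status",
                value_set=["active", "pending", "closed"],
            ),
            # Hypothetical column and bounds, for illustration only.
            expectation(
                "expect_column_mean_to_be_between",
                column="order_total",
                min_value=10,
                max_value=500,
            ),
        ],
        "meta": {},
    }


# "staging_ingest_suite" is a made-up name; use your own.
suite = build_suite("staging_ingest_suite")
suite_json = json.dumps(suite, indent=2)
```

Writing suite_json to a file gives you something to diff against what the scaffolding command generates, which helps catch a mistyped kwarg before the first scheduled run.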

Expected output

{
  "run_id": "2026-05-26T14-22-01",
  "success": false,
  "results": [
    {
      "expectation": "expect_column_values_to_not_be_null",
      "column": "email",
      "unexpected_count": 3
    }
  ]
}
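
If you want to inspect this payload on your own side, a small script can reduce it to a quarantine decision. The sketch below is one possible approach, not Meridian's actual logic: the CRITICAL set and both helper functions are hypothetical, since the recipe does not say how an expectation is marked critical, and it assumes a result counts as failed when unexpected_count is above zero.

```python
import json

# Hypothetical: which expectations count as critical is an assumption.
CRITICAL = {"expect_column_values_to_not_be_null"}


def failed_critical(payload):
    """Return (expectation, column) pairs for critical results with unexpected rows."""
    return [
        (result["expectation"], result.get("column"))
        for result in payload.get("results", [])
        if result["expectation"] in CRITICAL
        and result.get("unexpected_count", 0) > 0
    ]


def should_quarantine(payload):
    """True when the run failed and at least one critical expectation is among the failures."""
    return not payload["success"] and bool(failed_critical(payload))


# The sample payload shown above, parsed from its JSON text.
sample = json.loads("""
{
  "run_id": "2026-05-26T14-22-01",
  "success": false,
  "results": [
    {
      "expectation": "expect_column_values_to_not_be_null",
      "column": "email",
      "unexpected_count": 3
    }
  ]
}
""")

failures = failed_critical(sample)
quarantine = should_quarantine(sample)
```

For the sample payload, failures holds the single null check on email and quarantine is True.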

Need the full checkpoint YAML? Grab it from the recipes index.