Recipe

Dagster Primer

A pragmatic walk-through of Dagster for engineers wiring data pipelines into Meridian. We cover assets, jobs, and schedules without the marketing fluff, so you can ship orchestrated workflows against the Meridian gateway in under an hour.

1. Software-defined assets

Dagster inverts the usual scheduler model. Instead of writing tasks and wiring DAG edges by hand, you declare the data artifacts you want to exist. Dependencies are inferred from your function signatures: a parameter named after another asset makes that asset an upstream dependency. This pairs cleanly with Meridian model calls because each asset can wrap a single inference step.

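import os

import requests
from dagster import asset

@asset
def raw_transcripts():
    # load_transcripts is assumed to be defined elsewhere in your project.
    return load_transcripts("./inbox")

@asset
def summaries(raw_transcripts):
    # The parameter name matches the asset above, so Dagster materializes it first.
    r = requests.post(
        "https://llm.getnimbus.net/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['MERIDIAN_KEY']}"},
        # [...] is a placeholder: build the messages from raw_transcripts.
        json={"model": "azure/model-router", "messages": [...]},
        timeout=60,
    )
    r.raise_for_status()  # surface HTTP errors instead of a KeyError below
    return r.json()["choices"][0]["message"]["content"]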

2. Jobs and schedules

A job is a materialization plan over a selection of assets. Pass the selection to define_asset_job and attach a cron schedule with ScheduleDefinition. Dagster persists run history, can retry failed steps or runs when a retry policy is configured, and lets you re-materialize a single asset in isolation when an upstream model call drifts.

  • Schedules live next to the assets they materialize.
  • Backfills are a first-class concept, not a hack on top of Airflow.
  • Run config is typed, so a malformed value is rejected before the run starts rather than midway through it.

3. Wiring Meridian as a resource

For anything beyond a single asset, lift the Meridian client into a Dagster resource. The resource handles auth headers, retries on 429s, and a shared HTTP session. Your assets receive a typed client and stay readable. When you later swap azure/model-router for a pinned deployment, you change one line, not twenty.