Recipe: ETL/ELT pipeline spec writer
Generate precise, reviewable data pipeline specifications from natural language descriptions. Covers source-to-target mapping, transformation logic, error handling, and scheduling.
Overview
This recipe transforms a plain-English description of a data movement task into a structured specification document. The output includes column-level lineage, transformation pseudocode, retry policies, and idempotency guarantees, and is intended to be ready for engineering review or direct implementation.
Input Schema
- source_type — Postgres, MySQL, S3, Kafka, REST API, etc.
- target_type — Snowflake, BigQuery, Redshift, S3, etc.
- description — free-text description of the pipeline intent
- frequency — batch window, streaming, or cron expression
- slas — freshness and completeness targets
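As a rough illustration, the five inputs above could be captured in a small typed container before being handed to the recipe. This is only a sketch: the class name PipelineRequest, the missing_fields helper, the cron string, and the choice of a plain dict for slas are all assumptions made for the example, not part of the recipe itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PipelineRequest:
    """Hypothetical container for the five recipe inputs."""

    source_type: str   # e.g. "Postgres", "Kafka"
    target_type: str   # e.g. "Snowflake", "BigQuery"
    description: str   # free-text pipeline intent
    frequency: str     # batch window, "streaming", or a cron expression
    slas: dict[str, str] = field(default_factory=dict)  # assumed shape

    def missing_fields(self) -> list[str]:
        """Return the names of inputs that were left blank."""
        required = ("source_type", "target_type", "description", "frequency")
        missing = [name for name in required if not getattr(self, name).strip()]
        if not self.slas:
            missing.append("slas")
        return missing


request = PipelineRequest(
    source_type="Postgres",
    target_type="Snowflake",
    description="Ingest daily order exports and deduplicate on order_id.",
    frequency="0 2 * * *",  # assumed cron expression, for illustration only
    slas={"freshness": "within 2 hours of source availability"},
)
print(request.missing_fields())  # prints []
```

Checking for blank inputs up front is one way to catch an underspecified request before any specification text is generated.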
Output Sections
Example Prompt
“Ingest daily order exports from a Postgres replica into Snowflake. Deduplicate on order_id, enrich with exchange rates from a REST API, and partition by order_date. Must complete within 2 hours of source availability. Alert if row count deviates more than 5% from 7-day rolling average.”
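The alert rule in this prompt can be made concrete with a short sketch. It assumes the 7-day rolling average is taken over the seven days before the current run and that deviation is measured relative to that average. The prompt does not spell out either detail, and the function name row_count_alert is hypothetical.

```python
from statistics import mean


def row_count_alert(
    today_count: int,
    previous_counts: list[int],
    threshold: float = 0.05,
) -> bool:
    """Return True when today's row count deviates from the rolling
    average of the previous seven days by more than the threshold."""
    if len(previous_counts) != 7:
        raise ValueError("expected exactly 7 previous daily counts")

    baseline = mean(previous_counts)
    if baseline == 0:
        # Assumption: with an empty baseline, any rows count as a deviation.
        return today_count != 0

    deviation = abs(today_count - baseline) / baseline
    return deviation > threshold


last_week = [1000, 1000, 1000, 1000, 1000, 1000, 1000]
print(row_count_alert(1040, last_week))  # 4% deviation, prints False
print(row_count_alert(1060, last_week))  # 6% deviation, prints True
```

A generated specification would need to state explicitly whether the current day is included in the average and how a zero baseline is handled, since both choices change when the alert fires.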