Recipe: ETL/ELT pipeline spec writer

Generate precise, reviewable data pipeline specifications from natural language descriptions. Covers source-to-target mapping, transformation logic, error handling, and scheduling.

Overview

This recipe transforms a plain-English description of a data movement task into a structured specification document. The output includes column-level lineage, transformation pseudocode, retry policies, and an idempotency strategy, in a form suitable for engineering review or direct implementation.

Input Schema

  • source_type — Postgres, MySQL, S3, Kafka, REST API, etc.
  • target_type — Snowflake, BigQuery, Redshift, S3, etc.
  • description — free-text description of the pipeline intent
  • frequency — batch window, streaming, or cron expression
  • slas — freshness and completeness targets
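
The five inputs above could be captured in a small typed container before a spec is generated. The following is a minimal sketch under assumptions, not part of the recipe itself: the class name PipelineRequest, the missing_fields helper, the cron value, and the freshness_hours key are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineRequest:
    """Hypothetical container for the five input fields listed above."""

    source_type: str   # e.g. "Postgres", "Kafka"
    target_type: str   # e.g. "Snowflake", "BigQuery"
    description: str   # free-text description of the pipeline intent
    frequency: str     # batch window, "streaming", or a cron expression
    slas: dict = field(default_factory=dict)  # freshness and completeness targets

    def missing_fields(self) -> list:
        """Return the names of required text fields that are empty or blank."""
        required = ("source_type", "target_type", "description", "frequency")
        return [name for name in required if not getattr(self, name).strip()]


# Example values are illustrative only.
request = PipelineRequest(
    source_type="Postgres",
    target_type="Snowflake",
    description="Ingest daily order exports and deduplicate on order_id.",
    frequency="0 2 * * *",
    slas={"freshness_hours": 2},
)
```

Checking for blank fields up front keeps an incomplete request from producing a spec with silent gaps.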

Output Sections

  • Source-to-target column map
  • Transformation rules (pseudocode)
  • Error handling & retry policy
  • Idempotency strategy
  • Scheduling & dependency graph
  • Monitoring & alerting hooks
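
One way to picture the output is as a fixed, ordered set of sections, with the column map held as structured rows. The sketch below assumes a plain-text rendering; ColumnMapping, render_column_map, render_spec, and the "TODO" placeholder are hypothetical names chosen for illustration, and only the six section titles come from the list above.

```python
from dataclasses import dataclass

# Section titles taken from the list above, in the same order.
SECTION_TITLES = [
    "Source-to-target column map",
    "Transformation rules (pseudocode)",
    "Error handling & retry policy",
    "Idempotency strategy",
    "Scheduling & dependency graph",
    "Monitoring & alerting hooks",
]


@dataclass
class ColumnMapping:
    """One row of the column map (hypothetical shape)."""

    source_column: str
    target_column: str
    transform: str = "copy"


def render_column_map(mappings):
    """Render mappings as 'source -> target [transform]' lines."""
    return "\n".join(
        f"{m.source_column} -> {m.target_column} [{m.transform}]" for m in mappings
    )


def render_spec(bodies):
    """Join section bodies under their titles; mark any section left empty."""
    parts = []
    for title in SECTION_TITLES:
        parts.append(title)
        parts.append(bodies.get(title, "TODO: not yet specified"))
        parts.append("")  # blank line between sections
    return "\n".join(parts).rstrip()


column_map = render_column_map([ColumnMapping("order_id", "ORDER_ID")])
spec = render_spec({"Source-to-target column map": column_map})
```

Rendering every title even when its body is missing makes unfinished sections visible to a reviewer instead of dropping them.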

Example Prompt

“Ingest daily order exports from a Postgres replica into Snowflake. Deduplicate on order_id, enrich with exchange rates from a REST API, and partition by order_date. Must complete within 2 hours of source availability. Alert if row count deviates more than 5% from the 7-day rolling average.”
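
Two requirements in this prompt, deduplication on order_id and the 5% row-count alert, translate directly into small functions. The sketch below is illustrative only and makes several assumptions the prompt does not state: the last row seen for an order_id wins, the rolling average covers the seven runs before today, and no alert fires until seven days of history exist. The function names are hypothetical.

```python
from statistics import mean


def deduplicate(rows, key="order_id"):
    """Keep the last row seen for each key value, in first-seen key order."""
    latest = {}
    for row in rows:
        latest[row[key]] = row  # later rows overwrite earlier ones
    return list(latest.values())


def row_count_alert(today_count, previous_counts, threshold=0.05, window=7):
    """Return True when today's count deviates from the rolling average
    of the last `window` counts by more than `threshold`."""
    recent = previous_counts[-window:]
    if len(recent) < window:
        return False  # assumption: stay silent until enough history exists
    baseline = mean(recent)
    if baseline == 0:
        return today_count != 0  # any rows at all is a deviation from zero
    return abs(today_count - baseline) / baseline > threshold


rows = [
    {"order_id": 1, "amount": 10},
    {"order_id": 2, "amount": 5},
    {"order_id": 1, "amount": 12},  # duplicate of order 1, arrives later
]
unique_rows = deduplicate(rows)

history = [1000] * 7  # illustrative daily row counts
```

With this history, a count of 1040 stays inside the 5% band and 1060 falls outside it. A generated spec would need to state which duplicate wins and how the alert behaves with short history, since the prompt leaves both open.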