RECIPE

Broadway (Elixir) Primer

Broadway is a concurrent and multi-stage data ingestion framework for Elixir, built atop GenStage. It excels at consuming high-throughput data from sources like Kafka, RabbitMQ, SQS, and Google PubSub with built-in batching, backpressure, fault tolerance, and graceful shutdown.

1. Why Broadway

Traditional consumer loops require manual orchestration of concurrency, error handling, and backpressure. Broadway abstracts these concerns into a declarative pipeline: producers emit messages, processors transform them, and batchers group results for downstream sinks. The runtime supervises every stage and automatically applies retries.

  • Backpressure-aware via GenStage demand
  • Batching with size and timeout thresholds
  • Rate limiting at the producer layer
  • Graceful shutdown drains in-flight messages

2. Pipeline Anatomy

A Broadway pipeline is a single module implementing the Broadway behaviour. It declares a producer, a set of processors, and optional batchers. Each stage runs in its own supervised process tree, so a crash in one processor never collapses the whole pipeline.

3. Minimal Example

The snippet below sketches a pipeline that pulls from SQS, processes each message concurrently across ten workers, and batches results before persisting them.

defmodule MyApp.Pipeline do
  use Broadway

  def start_link(_opts) do
    Broadway.start_link(__MODULE__,
      name: __MODULE__,
      producer: [
        module: {BroadwaySQS.Producer, queue_url: "..."},
        concurrency: 1
      ],
      processors: [default: [concurrency: 10]],
      batchers: [default: [batch_size: 50, batch_timeout: 2000]]
    )
  end

  def handle_message(_, message, _) do
    message
  end

  def handle_batch(_, messages, _, _) do
    Enum.each(messages, &persist/1)
    messages
  end

  defp persist(_msg), do: :ok
end