Recipe: Long-doc summary with map-reduce
Chunk a large document, summarize each chunk in parallel, then reduce the partial summaries into one coherent final summary.
Overview
When a document exceeds the model's context window, split it into overlapping chunks. Run a summarization prompt on each chunk independently, then feed the partial summaries into a final reduction step that produces a single polished output.
Step 1 — Chunking
Split the source text into segments of roughly 2000 tokens with a 200-token overlap. Use a recursive character splitter that respects paragraph and sentence boundaries. Store each chunk with its ordinal index.
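The chunking step can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `Chunk`, `count_tokens`, `split_units`, and `chunk_text` are hypothetical names, and `count_tokens` counts whitespace-separated words as a stand-in for a real tokenizer. Instead of a full recursive splitter, it splits into paragraphs and then sentences, and packs them greedily.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    index: int  # ordinal position of the chunk in the document
    text: str


def count_tokens(text):
    # Assumption: whitespace words approximate tokens.
    # Swap in the tokenizer that matches your target model.
    return len(text.split())


def split_units(text):
    # Split on blank lines (paragraphs), then on sentence-ending punctuation.
    units = []
    for para in re.split(r"\n\s*\n", text):
        for sent in re.split(r"(?<=[.!?])\s+", para.strip()):
            if sent:
                units.append(sent)
    return units


def chunk_text(text, max_tokens=2000, overlap_tokens=200):
    chunks, current, current_len = [], [], 0
    for unit in split_units(text):
        n = count_tokens(unit)
        if current and current_len + n > max_tokens:
            chunks.append(Chunk(len(chunks), " ".join(current)))
            # Carry trailing sentences (up to overlap_tokens) into the next chunk.
            tail, tail_len = [], 0
            for prev in reversed(current):
                m = count_tokens(prev)
                if tail_len + m > overlap_tokens:
                    break
                tail.insert(0, prev)
                tail_len += m
            # Drop the overlap if it would push the next chunk over the limit.
            if tail_len + n > max_tokens:
                tail, tail_len = [], 0
            current, current_len = tail, tail_len
        current.append(unit)
        current_len += n
    if current:
        chunks.append(Chunk(len(chunks), " ".join(current)))
    return chunks
```

Because the overlap is carried in whole sentences, the actual overlap can be smaller than `overlap_tokens`. A single sentence longer than `max_tokens` is kept as one oversized chunk in this sketch; a fuller splitter would fall back to word-level splitting.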
Step 2 — Map phase
Send every chunk to the model with a prompt like: “Summarize the following passage in 3–5 sentences. Preserve key facts, names, and figures.” Run the calls concurrently, capping concurrency if your provider enforces rate limits, and collect the per-chunk summaries in chunk order.
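One way the map phase might look with `asyncio` is sketched below. `call_model` is a hypothetical async callable (prompt in, completion text out) that you would supply for your own model client. `map_summaries` and `max_concurrency` are likewise illustrative names, and `chunks` is assumed to be a list of chunk strings.

```python
import asyncio

MAP_PROMPT = (
    "Summarize the following passage in 3–5 sentences. "
    "Preserve key facts, names, and figures.\n\n{passage}"
)


async def map_summaries(chunks, call_model, max_concurrency=8):
    # Bound the number of in-flight requests to stay within rate limits.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def summarize(chunk_text):
        async with semaphore:
            return await call_model(MAP_PROMPT.format(passage=chunk_text))

    # asyncio.gather returns results in the order the awaitables were passed,
    # so summaries line up with the original chunk order.
    return await asyncio.gather(*(summarize(c) for c in chunks))
```

Since `gather` preserves input order, no extra bookkeeping is needed to keep summaries aligned with chunk indices, even when calls finish out of order.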
Step 3 — Reduce phase
Concatenate the partial summaries in chunk order and pass them to the model with a final prompt: “Synthesize the following section summaries into one cohesive document summary. Eliminate redundancy and maintain logical flow.” This yields the finished output.
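A sketch of the reduce step follows, using the same assumption that `call_model` is a hypothetical async callable you provide. `build_reduce_prompt` and `reduce_summaries` are illustrative names. Labeling each summary as "Section N" is one possible way to make the ordering explicit to the model, not something the recipe requires.

```python
import asyncio

REDUCE_PROMPT = (
    "Synthesize the following section summaries into one cohesive document "
    "summary. Eliminate redundancy and maintain logical flow.\n\n{summaries}"
)


def build_reduce_prompt(summaries):
    # Number the summaries so their original order is explicit in the prompt.
    numbered = "\n\n".join(
        f"Section {i + 1}: {s.strip()}" for i, s in enumerate(summaries)
    )
    return REDUCE_PROMPT.format(summaries=numbered)


async def reduce_summaries(summaries, call_model):
    # Single final call over all partial summaries.
    return await call_model(build_reduce_prompt(summaries))
```

This assumes the concatenated summaries fit in the context window; if they do not, they need to be merged in stages first.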
Tips
- Use a tokenizer that matches your target model for accurate chunk sizing.
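- Overlap prevents losing context at chunk boundaries; about 10% of the chunk size (200 tokens for a 2000-token chunk) is a safe default.
- For very long documents, where the concatenated partial summaries would themselves exceed the context window, add an intermediate merge step that combines summaries in groups before the final reduce.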
- Track chunk order; the reduce prompt works best when summaries are presented sequentially.
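The intermediate merge tip could be sketched as a staged reduce. Everything named here is hypothetical: `merge` stands for a synchronous callable that takes a list of summaries and returns one merged summary (for example, a wrapper around your reduce prompt), `budget_tokens` is an assumed per-call budget, and `count_tokens` again uses whitespace words as a stand-in for a real tokenizer.

```python
def count_tokens(text):
    # Assumption: whitespace words approximate tokens.
    return len(text.split())


def batch_by_budget(summaries, budget_tokens):
    # Group consecutive summaries so each batch stays within the budget,
    # preserving the original order.
    batches, current, used = [], [], 0
    for s in summaries:
        n = count_tokens(s)
        if current and used + n > budget_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(s)
        used += n
    if current:
        batches.append(current)
    return batches


def hierarchical_reduce(summaries, merge, budget_tokens=3000):
    level = list(summaries)
    if not level:
        raise ValueError("no summaries to reduce")
    while len(level) > 1:
        batches = batch_by_budget(level, budget_tokens)
        if len(batches) == 1:
            break  # everything fits in one final call
        if len(batches) == len(level):
            # No two adjacent summaries fit together, so merging cannot progress.
            raise ValueError("budget_tokens is too small to merge summaries")
        # Intermediate merge: one call per batch, order preserved.
        level = [merge(batch) for batch in batches]
    return merge(level)
```

Each round strictly reduces the number of summaries, so the loop terminates. The final `merge(level)` call plays the role of the reduce step over whatever fits in a single prompt.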
Meridian handles chunking, parallel map calls, and the reduce step automatically when you pass a document larger than the context window. No manual orchestration required.