Architecture

RAG System Design

A Retrieval-Augmented Generation (RAG) pipeline for context-aware recipe search and generation.

Pipeline Overview

Documents flow through ingestion, chunking, embedding, and indexing. At query time, the retrieval layer selects relevant chunks, and the generator combines them with the user prompt to produce grounded responses.

Ingestion

  • Raw recipe markdown parsed into structured JSON
  • Metadata extracted: cuisine, prep time, allergens, equipment
  • Images described in text by a vision model for multimodal context
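
The parsing step might look like the following minimal sketch. It assumes a hypothetical recipe layout (a "# Title" line, "Key: value" metadata lines, then "## Section" headings followed by list items). The function name parse_recipe and the output keys are illustrative and are not taken from the actual pipeline.

```python
import json
import re

# Hypothetical input layout; real recipe files may be structured differently.
SAMPLE = """# Lemon Pasta
Cuisine: Italian
Prep time: 20 min
Allergens: gluten, dairy

## Ingredients
- 200 g spaghetti
- 1 lemon

## Steps
1. Boil the pasta.
2. Toss with lemon juice.
"""

LIST_MARKER = re.compile(r"^(?:[-*]|\d+\.)\s+")


def parse_recipe(markdown: str) -> dict:
    """Parse a recipe in the assumed layout into a JSON-serializable dict."""
    recipe = {"title": None, "metadata": {}, "sections": {}}
    current = None  # name of the section currently being filled
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip().lower()
            recipe["sections"][current] = []
        elif line.startswith("# "):
            recipe["title"] = line[2:].strip()
        elif current is None and ":" in line:
            # Metadata lines appear before the first section heading.
            key, value = line.split(":", 1)
            recipe["metadata"][key.strip().lower().replace(" ", "_")] = value.strip()
        elif current is not None:
            # Drop the leading bullet or step number from list items.
            recipe["sections"][current].append(LIST_MARKER.sub("", line))
    return recipe


print(json.dumps(parse_recipe(SAMPLE), indent=2))
```

Keeping metadata separate from section text lets later stages use fields such as cuisine and allergens as filters without re-parsing the recipe body.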

Chunking Strategy

Recipes are split semantically by section (ingredients, steps, notes), with a 256-token overlap between adjacent chunks. Each chunk carries its parent document ID and section label for provenance tracking.
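
As a rough illustration, section-level chunking with overlap could be sketched as follows. Several details here are assumptions: the document does not state a maximum chunk size (512 is a placeholder), whitespace-separated words stand in for model tokens, and the Chunk type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str   # parent document ID, kept for provenance
    section: str  # section label, e.g. "ingredients"
    text: str


def chunk_section(doc_id: str, section: str, text: str,
                  max_tokens: int = 512, overlap: int = 256) -> list[Chunk]:
    """Split one section into windows that share `overlap` tokens."""
    if overlap >= max_tokens:
        raise ValueError("overlap must be smaller than max_tokens")
    tokens = text.split()  # assumption: whitespace words approximate tokens
    if not tokens:
        return []
    step = max_tokens - overlap
    chunks = []
    start = 0
    while True:
        window = tokens[start:start + max_tokens]
        chunks.append(Chunk(doc_id, section, " ".join(window)))
        if start + max_tokens >= len(tokens):
            break  # this window reached the end of the section
        start += step
    return chunks


def chunk_recipe(doc_id: str, sections: dict[str, str], **kwargs) -> list[Chunk]:
    """Chunk every section separately so no chunk spans two sections."""
    chunks = []
    for section, text in sections.items():
        chunks.extend(chunk_section(doc_id, section, text, **kwargs))
    return chunks
```

A short section that fits within max_tokens produces a single chunk, so the overlap only matters for long step lists or notes.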

Embedding & Index

Chunks are embedded with text-embedding-3-small (1536 dimensions) and stored in a vector database. Retrieval is hybrid, combining BM25 keyword scoring with cosine similarity over the embeddings. Metadata filters scope queries by cuisine or dietary constraint.
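
The sketch below shows one way hybrid scoring with metadata filters could work, in plain Python with no vector database. The fusion method is an assumption (a weighted sum of min-max normalized scores, with a hypothetical alpha weight), since the document does not say how the BM25 and cosine scores are combined. Tiny hand-written vectors stand in for real 1536-dimensional embeddings.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document against the query."""
    n = len(docs_tokens)
    if n == 0:
        return []
    avgdl = sum(len(d) for d in docs_tokens) / n
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_search(query_tokens, query_vec, chunks, alpha=0.5, filters=None):
    """Return (score, chunk) pairs, best first.

    Each chunk is a dict with "tokens", "vector", and "metadata" keys
    (a hypothetical in-memory shape, not a vector database record).
    """
    filters = filters or {}
    candidates = [c for c in chunks
                  if all(c["metadata"].get(k) == v for k, v in filters.items())]
    if not candidates:
        return []
    lexical = min_max(bm25_scores(query_tokens, [c["tokens"] for c in candidates]))
    dense = min_max([cosine(query_vec, c["vector"]) for c in candidates])
    scored = [(alpha * lex + (1 - alpha) * den, c)
              for lex, den, c in zip(lexical, dense, candidates)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Applying the metadata filter before scoring keeps the normalization local to the scoped candidate set, which is one reasonable choice among several.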

Retrieval

  • Query rewritten and expanded (synonyms, ingredient aliases)
  • Top-k = 8 with MMR diversity reranking
  • Passages scoring below a relevance threshold are excluded
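
The reranking and gating steps above can be sketched with a small Maximal Marginal Relevance (MMR) routine. Only k = 8 comes from this document. The trade-off weight lam, the threshold value min_relevance, and the choice to apply the threshold before reranking are assumptions made for illustration.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(query_vec, candidates, k=8, lam=0.7, min_relevance=0.3):
    """Pick up to k candidate IDs, balancing relevance against redundancy.

    candidates: list of (chunk_id, vector) pairs.
    lam: 1.0 means pure relevance; lower values favor diversity.
    """
    vectors = dict(candidates)
    relevance = {cid: cosine(query_vec, vec) for cid, vec in candidates}
    # Relevance gate: drop weak passages before reranking.
    pool = [cid for cid in vectors if relevance[cid] >= min_relevance]
    selected = []

    def mmr_score(cid):
        # Redundancy is the highest similarity to anything already chosen.
        redundancy = max((cosine(vectors[cid], vectors[s]) for s in selected),
                         default=0.0)
        return lam * relevance[cid] - (1 - lam) * redundancy

    while pool and len(selected) < k:
        best = max(pool, key=mmr_score)
        selected.append(best)
        pool.remove(best)
    return selected
```

With a low lam, a near-duplicate of an already selected chunk is pushed down in favor of a less similar one, which helps when several recipes repeat the same ingredient list.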

Generation

Retrieved chunks are injected into a structured prompt template, and the model is instructed to cite sources inline. A response schema enforces structured output: title, ingredients array, step-by-step instructions, and source document IDs.
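
A minimal sketch of prompt assembly and response validation follows. The template wording, the JSON field names (title, ingredients, instructions, source_ids), and the function names are hypothetical stand-ins for the real template and schema, which this document does not show.

```python
import json

# Hypothetical template; the production prompt is not shown in this document.
PROMPT_TEMPLATE = """You are a recipe assistant. Answer using only the context below.
Cite the source of each claim inline as [doc_id].

Context:
{context}

Question: {question}

Reply with a JSON object containing: title, ingredients (array),
instructions (array), source_ids (array)."""

REQUIRED_FIELDS = {
    "title": str,
    "ingredients": list,
    "instructions": list,
    "source_ids": list,
}


def build_prompt(question: str, chunks: list[dict]) -> str:
    """Render retrieved chunks, tagged with their provenance, into the prompt."""
    context = "\n\n".join(
        f"[{c['doc_id']}] ({c['section']}) {c['text']}" for c in chunks
    )
    return PROMPT_TEMPLATE.format(context=context, question=question)


def validate_response(raw: str, allowed_ids: set[str]) -> dict:
    """Check the model output against the assumed schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"missing or invalid field: {field}")
    # Reject citations that do not point at a retrieved document.
    unknown = set(data["source_ids"]) - set(allowed_ids)
    if unknown:
        raise ValueError(f"unknown source IDs: {sorted(unknown)}")
    return data
```

Checking source_ids against the retrieved set is a cheap guard against citations to documents the model never saw.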

Evaluation

Ground-truth Q&A pairs are used to measure recall@k and answer faithfulness. The RAGAS framework scores context relevance and hallucination rate. A dashboard tracks drift in these scores as the recipe corpus is updated.
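
For reference, recall@k over ground-truth pairs can be computed as in the sketch below. It does not use the RAGAS API. The function names and the shape of the evaluation examples are assumptions for illustration.

```python
def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        raise ValueError("each example needs at least one relevant ID")
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(relevant_ids)) / len(set(relevant_ids))


def mean_recall_at_k(examples: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average recall@k over (retrieved_ids, relevant_ids) pairs."""
    if not examples:
        raise ValueError("no evaluation examples given")
    return sum(recall_at_k(r, rel, k) for r, rel in examples) / len(examples)
```

Tracking the mean across a fixed question set after each corpus update gives a single number to watch for drift.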

This design powers the Meridian recipe assistant. See the implementation guide for code-level details.