RAG System Design
Retrieval-Augmented Generation pipeline for context-aware recipe search and generation.
Pipeline Overview
Documents flow through ingestion, chunking, embedding, and indexing before queries hit the retrieval layer. The generator fuses retrieved context with the user prompt for grounded responses.
Ingestion
- Raw recipe markdown parsed into structured JSON
- Metadata extracted: cuisine, prep time, allergens, equipment
- Images transcribed via vision model for multimodal context
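As a rough illustration of the first two ingestion steps, the sketch below parses one recipe into a JSON-ready dict. The markdown layout it expects (a `# Title` line, `Key: value` metadata lines, then `## Section` headings with list items) is an assumption made for this example, and `parse_recipe` and `SAMPLE` are hypothetical names. The real corpus format may differ.

```python
import json
import re


def parse_recipe(markdown: str) -> dict:
    """Parse a recipe written in an assumed markdown layout into a dict."""
    recipe = {"title": None, "metadata": {}, "sections": {}}
    current = None  # name of the section being filled, if any
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# "):
            recipe["title"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip().lower()
            recipe["sections"][current] = []
        elif current is None and ":" in line:
            # "Key: value" lines before the first section are metadata.
            key, value = line.split(":", 1)
            recipe["metadata"][key.strip().lower().replace(" ", "_")] = value.strip()
        elif current is not None:
            # Drop a leading list marker such as "- ", "* " or "1. ".
            item = re.sub(r"^(?:[-*]|\d+\.)\s+", "", line)
            recipe["sections"][current].append(item)
    return recipe


SAMPLE = """# Lemon Pasta
Cuisine: Italian
Prep time: 20 min
Allergens: gluten, dairy

## Ingredients
- 200 g spaghetti
- 1 lemon

## Steps
1. Boil the pasta.
2. Toss with lemon juice.
"""

if __name__ == "__main__":
    print(json.dumps(parse_recipe(SAMPLE), indent=2))
```

Image transcription is left out here because it depends on whichever vision model is in use.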
Chunking Strategy
Recipes are split semantically by section (ingredients, steps, notes), with a 256-token overlap between adjacent chunks. Each chunk carries its parent document ID and a section label for provenance tracking.
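One possible reading of this strategy is sketched below: each section is chunked on its own, and a section longer than the window is cut into overlapping slices. Whitespace-separated words stand in for model tokens, the 512-token window is an assumed value (the design gives only the overlap), and `Chunk` and `chunk_recipe` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str   # parent document ID, kept for provenance
    section: str  # section label such as "ingredients" or "steps"
    text: str


def chunk_recipe(doc_id, sections, max_tokens=512, overlap=256):
    """Split each section into windows of max_tokens that share `overlap` tokens.

    `sections` maps a section label to its text. Words approximate tokens
    here; a real pipeline would use the embedding model's tokenizer.
    """
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be non-negative and smaller than max_tokens")
    step = max_tokens - overlap
    chunks = []
    for section, text in sections.items():
        tokens = text.split()
        start = 0
        while start < len(tokens):
            window = tokens[start:start + max_tokens]
            chunks.append(Chunk(doc_id, section, " ".join(window)))
            if start + max_tokens >= len(tokens):
                break  # this window already reaches the end of the section
            start += step
    return chunks


if __name__ == "__main__":
    demo = {"ingredients": "a b c", "steps": " ".join(f"t{i}" for i in range(10))}
    for chunk in chunk_recipe("recipe-001", demo, max_tokens=4, overlap=2):
        print(chunk)
```

Chunks never cross a section boundary in this sketch, so every chunk maps to exactly one section label.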
Embedding & Index
Chunks embedded via text-embedding-3-small (1536-d). Stored in a vector database with hybrid BM25 + cosine similarity retrieval. Metadata filters enable scoped queries by cuisine or dietary constraint.
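The design does not say how the BM25 and cosine scores are combined, so the following is only one common way to do it: min-max normalize the BM25 scores, then take a weighted sum with the cosine similarity after applying the metadata filter. The `alpha` weight, the candidate dict layout, and `hybrid_rank` are assumptions for illustration. Toy 2-d vectors stand in for the 1536-d embeddings, and BM25 scores are taken as already computed by the index.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def min_max(scores):
    """Rescale scores to [0, 1]; all-equal scores map to 0.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_rank(query_vec, candidates, alpha=0.5, filters=None):
    """Rank candidates by alpha * cosine + (1 - alpha) * normalized BM25.

    Each candidate is a dict with "id", "vector", "bm25" and "metadata" keys.
    `filters` keeps only candidates whose metadata matches every key/value pair.
    """
    filters = filters or {}
    pool = [
        c for c in candidates
        if all(c["metadata"].get(k) == v for k, v in filters.items())
    ]
    if not pool:
        return []
    lexical = min_max([c["bm25"] for c in pool])
    dense = [cosine(query_vec, c["vector"]) for c in pool]
    scored = [
        (alpha * d + (1 - alpha) * l, c["id"])
        for d, l, c in zip(dense, lexical, pool)
    ]
    return [(cid, score) for score, cid in sorted(scored, reverse=True)]


if __name__ == "__main__":
    docs = [
        {"id": "a", "vector": [1.0, 0.0], "bm25": 1.0, "metadata": {"cuisine": "italian"}},
        {"id": "b", "vector": [0.0, 1.0], "bm25": 5.0, "metadata": {"cuisine": "italian"}},
        {"id": "c", "vector": [1.0, 0.0], "bm25": 9.0, "metadata": {"cuisine": "thai"}},
    ]
    print(hybrid_rank([1.0, 0.0], docs, alpha=0.7, filters={"cuisine": "italian"}))
```

Filtering before scoring keeps the normalization scoped to the candidates that can actually be returned. A vector database with native hybrid search would normally handle both steps internally.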
Retrieval
- Query rewritten for expansion (synonyms, ingredient aliases)
- Top-k = 8 with MMR diversity reranking
- Relevance threshold gates passage inclusion
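The last two steps can be sketched with a plain-Python version of maximal marginal relevance (MMR): passages below a relevance threshold are dropped, then passages are picked one at a time, trading relevance to the query against similarity to passages already chosen. Only k = 8 comes from the design. The trade-off weight `lambda_`, the `min_relevance` value, and the function name `mmr_select` are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, candidates, k=8, lambda_=0.7, min_relevance=0.3):
    """Return up to k passage IDs chosen by MMR.

    `candidates` is a list of (passage_id, vector) pairs. Passages whose
    cosine similarity to the query is below `min_relevance` are never selected.
    """
    vectors = dict(candidates)
    relevance = {pid: cosine(query_vec, vec) for pid, vec in vectors.items()}
    remaining = [pid for pid in vectors if relevance[pid] >= min_relevance]
    selected = []

    def mmr_score(pid):
        # Redundancy is the highest similarity to any passage already selected.
        redundancy = max(
            (cosine(vectors[pid], vectors[s]) for s in selected), default=0.0
        )
        return lambda_ * relevance[pid] - (1 - lambda_) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    passages = [
        ("a", [1.0, 0.0]),
        ("b", [0.9, -0.1]),   # close to "a", so largely redundant
        ("c", [0.0, 1.0]),
        ("d", [-1.0, 0.0]),   # points away from the query, filtered out
    ]
    print(mmr_select([2.0, 1.0], passages, k=2, lambda_=0.5))
```

Setting `lambda_` to 1.0 reduces this to plain relevance ranking, while lower values favor diversity.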
Generation
Retrieved chunks injected into a structured prompt template. Model instructed to cite sources inline. Response schema enforces structured output: title, ingredients array, step-by-step instructions, and source document IDs.
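A minimal sketch of the prompt assembly and schema check follows. The prompt wording, the JSON field names (`title`, `ingredients`, `instructions`, `source_ids`), and the helpers `build_prompt` and `validate_response` are all assumptions. The design names the schema's contents but not its exact keys.

```python
import json

# Assumed field names and types for the structured response.
REQUIRED_FIELDS = {
    "title": str,
    "ingredients": list,
    "instructions": list,
    "source_ids": list,
}


def build_prompt(question, chunks):
    """Inject retrieved chunks, each tagged with its provenance, into a prompt."""
    context = "\n\n".join(
        f"[{c['doc_id']}#{c['section']}]\n{c['text']}" for c in chunks
    )
    return (
        "Answer using only the context below. Cite sources inline as [doc_id].\n"
        "Reply with a JSON object containing: "
        + ", ".join(REQUIRED_FIELDS)
        + ".\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def validate_response(raw, allowed_ids):
    """Parse the model's JSON reply and check it against the assumed schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"missing or malformed field: {field}")
    # Every cited source must be one of the documents that were retrieved.
    unknown = [s for s in data["source_ids"] if s not in allowed_ids]
    if unknown:
        raise ValueError(f"cited sources were not retrieved: {unknown}")
    return data


if __name__ == "__main__":
    chunks = [{"doc_id": "r1", "section": "steps", "text": "Boil the pasta."}]
    print(build_prompt("How do I cook the pasta?", chunks))
```

Rejecting source IDs that were never retrieved is a cheap guard against fabricated citations. It does not check that the cited passage actually supports the answer.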
Evaluation
Ground-truth Q&A pairs measure recall@k and answer faithfulness. The RAGAS framework scores context relevance and hallucination rate. A dashboard tracks how these metrics drift as the recipe corpus is updated.
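For reference, recall@k can be computed directly from retrieved and ground-truth chunk IDs, as in the sketch below. The function names and the example data layout are assumptions. The faithfulness and context-relevance scores are left to RAGAS and are not reproduced here.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant IDs that appear in the top-k retrieved IDs."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("each example needs at least one relevant ID")
    hits = set(retrieved[:k]) & relevant
    return len(hits) / len(relevant)


def mean_recall_at_k(examples, k=8):
    """Average recall@k over (retrieved_ids, relevant_ids) pairs."""
    if not examples:
        raise ValueError("no examples to evaluate")
    return sum(recall_at_k(r, g, k) for r, g in examples) / len(examples)


if __name__ == "__main__":
    examples = [
        (["a", "b", "c"], {"a", "c", "z"}),  # 2 of 3 relevant chunks retrieved
        (["x"], {"y"}),                      # relevant chunk missed entirely
    ]
    print(mean_recall_at_k(examples, k=8))
```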
This design powers the Meridian recipe assistant. See the implementation guide for code-level details.