Recipe

Corrective RAG Patterns

Corrective Retrieval-Augmented Generation (CRAG) layers a self-grading and fallback loop on top of vanilla RAG. When retrieved chunks score low on relevance, the pipeline rewrites the query, falls back to web search, or refuses gracefully. This recipe shows three production patterns we run on Meridian to keep answer quality consistent under messy retrieval.

1. Retrieval grading with a lightweight judge

Before passing chunks to the generator, score each one with a small judge model. A binary relevant/not-relevant signal beats a 1-10 scale in practice. Drop every chunk the judge marks not relevant, then route on how many survive: generate directly when enough remain, take the rewrite path when only a few do, and take the fallback path when none do.
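A minimal sketch of the grading step, assuming the judge is reachable through a plain prompt-in, text-out function. The names RetrievedChunk, AskJudge, buildJudgePrompt, and parseVerdict are illustrative and not part of the recipe, and this judge takes the model call as an extra argument so it can be stubbed. The prompt wording is also an assumption.

```typescript
// Hypothetical shapes; the recipe does not define them.
interface RetrievedChunk {
  id: string;
  text: string;
}

interface GradedChunk extends RetrievedChunk {
  relevant: boolean;
}

// Any function that sends a prompt to a small judge model and returns its raw reply.
type AskJudge = (prompt: string) => Promise<string>;

// Ask for a one-word answer so the reply is trivial to parse.
function buildJudgePrompt(query: string, chunk: RetrievedChunk): string {
  return [
    "Does the passage help answer the question? Reply with exactly one word: yes or no.",
    `Question: ${query}`,
    `Passage: ${chunk.text}`,
  ].join("\n");
}

// Fail closed: anything that does not start with a clear "yes" counts as not relevant.
function parseVerdict(reply: string): boolean {
  return /^\s*yes\b/i.test(reply);
}

// Grade every chunk concurrently and attach the binary verdict.
async function judge(
  query: string,
  chunks: RetrievedChunk[],
  askJudge: AskJudge,
): Promise<GradedChunk[]> {
  return Promise.all(
    chunks.map(async (chunk) => ({
      ...chunk,
      relevant: parseVerdict(await askJudge(buildJudgePrompt(query, chunk))),
    })),
  );
}
```

Failing closed on ambiguous replies keeps a chatty or malformed judge response from letting a weak chunk through.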

2. Query rewriting on partial matches

When some chunks pass but too few survive to answer confidently (fewer than three in the reference pipeline), rewrite the query using the surviving context as a hint. A second retrieval pass with the rewritten query usually closes the gap without ever hitting the web fallback path.
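One way the rewrite step could be sketched, assuming the rewriter is an LLM called through a prompt-in, text-out function. The helper names (buildRewritePrompt, cleanRewrite), the prompt wording, and the 200-character excerpt cap are all assumptions made for illustration.

```typescript
// Hypothetical chunk shape; the recipe does not define one.
interface RetrievedChunk {
  id: string;
  text: string;
}

// Any function that sends a prompt to an LLM and returns its raw reply.
type Complete = (prompt: string) => Promise<string>;

// Assumed cap on how much of each surviving chunk is shown to the rewriter.
const MAX_HINT_CHARS = 200;

// Show the rewriter the original query plus short excerpts of what already passed.
function buildRewritePrompt(query: string, kept: RetrievedChunk[]): string {
  const hints = kept
    .map((c, i) => `[${i + 1}] ${c.text.slice(0, MAX_HINT_CHARS)}`)
    .join("\n");
  return [
    "Rewrite the search query so it retrieves passages that fill the gaps left by the excerpts below.",
    "Return only the rewritten query on a single line.",
    `Original query: ${query}`,
    "Relevant excerpts found so far:",
    hints,
  ].join("\n");
}

// Keep the first line, strip wrapping quotes, and fall back to the original query if empty.
function cleanRewrite(raw: string, original: string): string {
  const firstLine = (raw.trim().split("\n")[0] ?? "").trim().replace(/^["']|["']$/g, "");
  return firstLine.length > 0 ? firstLine : original;
}

async function rewriteQuery(
  query: string,
  kept: RetrievedChunk[],
  complete: Complete,
): Promise<string> {
  return cleanRewrite(await complete(buildRewritePrompt(query, kept)), query);
}
```

Falling back to the original query when the rewriter returns nothing means the second retrieval pass still runs instead of searching for an empty string.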

3. Web fallback as a last resort

If zero chunks survive the judge, route to a web-search tool and rebuild a fresh context window. Mark the answer as web-sourced so the UI can surface provenance. Never silently mix corpus and web chunks in the same context.
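The no-mixing rule can be enforced in code rather than by convention. The sketch below assumes every chunk carries a source tag. ChunkSource, SourcedChunk, AnswerContext, buildContext, and provenanceLabel are hypothetical names, and the label strings are placeholders.

```typescript
// Hypothetical types; the recipe does not define a chunk or context shape.
type ChunkSource = "corpus" | "web";

interface SourcedChunk {
  text: string;
  source: ChunkSource;
  url?: string; // only meaningful for web results
}

interface AnswerContext {
  source: ChunkSource; // one provenance label for the whole context
  chunks: SourcedChunk[];
}

// Build a context, refusing to mix corpus and web chunks.
function buildContext(chunks: SourcedChunk[]): AnswerContext {
  const first = chunks[0];
  if (first === undefined) {
    throw new Error("cannot build a context from zero chunks");
  }
  if (chunks.some((c) => c.source !== first.source)) {
    throw new Error("refusing to mix corpus and web chunks in one context");
  }
  return { source: first.source, chunks };
}

// Text the UI can show next to the answer.
function provenanceLabel(context: AnswerContext): string {
  return context.source === "web"
    ? "Answer based on web search results"
    : "Answer based on the internal corpus";
}
```

Throwing on a mixed context turns a silent quality problem into a visible failure during development.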

Reference pipeline

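// Minimum number of relevant chunks needed to answer without a second pass.
const MIN_KEPT = 3;

async function correctiveRag(query: string) {
  const chunks = await retrieve(query, { k: 8 });
  const graded = await judge(query, chunks);
  const kept = graded.filter(c => c.relevant);

  // Enough relevant chunks: answer straight from the corpus.
  if (kept.length >= MIN_KEPT) {
    return generate(query, kept);
  }
  // Partial match: rewrite the query, retrieve again, and grade the second pass too.
  if (kept.length > 0) {
    const rewritten = await rewriteQuery(query, kept);
    const second = await judge(query, await retrieve(rewritten, { k: 8 }));
    // Answer the original question with the survivors of both passes
    // (deduplicate first if the retriever can return the same chunk twice).
    return generate(query, [...kept, ...second.filter(c => c.relevant)]);
  }
  // Nothing survived: build the context from web results only and tag the source.
  const web = await webSearch(query, { k: 5 });
  return generate(query, web, { source: 'web' });
}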