← Back to docs

Recipe

Recursive retrieval design

Recursive retrieval expands a single user query into a tree of sub-queries, feeding each level's hits back into the next round of search. This pattern unlocks deep document corpora where the answer lives across many small fragments, and it pairs well with Meridian's azure/model-router for adaptive cost control.

1. Seed the root query

Start with the raw user prompt. Embed it with your default embedding model and run a top-K search against the primary index. Record the hits with a depth tag of zero so downstream code can prune later. Keep K small (3-5) to leave budget for deeper levels.

2. Generate follow-up queries

Feed the level-zero hits to a small router model and ask it to emit two follow-up queries. The model sees only the retrieved text, never the original user prompt, which keeps the recursion grounded in the corpus rather than the user's phrasing. Cap follow-ups at two per node to keep the fan-out tractable.

3. Recurse with a depth bound

Call the same retrieval function on each follow-up query with depth plus one. A hard ceiling of three levels keeps wall-clock under two seconds for most corpora. Flatten the hit tree, dedupe by document id, and pass the unified set to your final synthesis call.

// Recursive retrieval loop
async function recursiveRetrieve(query: string, depth = 0, maxDepth = 3) {
  if (depth >= maxDepth) return [];

  const hits = await meridian.search({
    query,
    model: 'azure/model-router',
    topK: 5,
  });

  const followUps = await meridian.complete({
    model: 'azure/model-router',
    messages: [
      { role: 'system', content: 'Extract 2 follow-up queries.' },
      { role: 'user', content: hits.map(h => h.text).join('\n') },
    ],
  });

  const children = await Promise.all(
    followUps.queries.map(q => recursiveRetrieve(q, depth + 1, maxDepth))
  );

  return [...hits, ...children.flat()];
}