Recipe: HyDE

Hypothetical Document Embeddings

What It Is

HyDE generates a synthetic “ideal” document from a user query, then retrieves real documents whose embeddings are nearest to that hypothetical document. Because the hypothetical document resembles stored content in length and vocabulary, its embedding tends to land closer to relevant documents than the embedding of a short query does.

When to Use

Short, ambiguous user queries that lack context.
Corpora where documents are long-form and dense (papers, legal, medical).
Zero-shot retrieval accuracy that is too low when you cannot fine-tune an embedding model.

High-Level Flow

1. User query → “What are the side effects of drug X?”

2. LLM generates a hypothetical document answering that question.

3. Embed the hypothetical document.

4. Vector search using that embedding against your real document store.

5. Return top-k real documents to the user or downstream LLM.
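
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `fake_llm` is a hypothetical stand-in for a real LLM call, `embed` is a toy bag-of-words vectorizer standing in for a real embedding model, and the three-sentence corpus is invented for the example. The function names are assumptions introduced here.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy bag-of-words vector. Assumption: replace with a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_retrieve(
    query: str,
    generate: Callable[[str], str],
    corpus: List[str],
    k: int = 2,
) -> List[str]:
    # Step 2: the LLM writes a hypothetical document answering the query.
    hypothetical = generate(query)
    # Step 3: embed the hypothetical document, not the raw query.
    search_vec = embed(hypothetical)
    # Step 4: rank real documents by similarity to that embedding.
    ranked = sorted(corpus, key=lambda doc: cosine(search_vec, embed(doc)), reverse=True)
    # Step 5: return the top-k real documents.
    return ranked[:k]


def fake_llm(query: str) -> str:
    """Hypothetical stand-in for an LLM call; returns a canned passage."""
    return (
        "Common adverse reactions to drug X include nausea, headache, and "
        "dizziness. Rare reactions include liver toxicity."
    )


# Invented example corpus.
corpus = [
    "Clinical trials of drug X reported nausea, headache, and dizziness as adverse reactions.",
    "Drug X is manufactured in tablets of 10 mg and 20 mg.",
    "The history of the pharmacy dates back to ancient times.",
]

top = hyde_retrieve("What are the side effects of drug X?", fake_llm, corpus, k=1)
print(top[0])
```

Note that the raw query never mentions "adverse reactions", yet the clinical-trials sentence ranks first because the hypothetical document uses that vocabulary. The hypothetical document itself is discarded after embedding; only real documents are returned.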

Tradeoffs

HyDE adds latency and LLM cost because every query requires an extra generation step. The quality of retrieval depends heavily on the LLM’s ability to produce a plausible hypothetical document. For high-throughput systems, consider caching frequent query patterns or using a smaller model for the generation step.
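
One way to approach the caching suggestion is to memoize the generation step on a normalized form of the query. The sketch below assumes exact-match caching after lowercasing and whitespace collapsing, which only helps with repeated or near-identical queries; `generate_hypothetical` is a hypothetical placeholder for the real (ideally smaller, cheaper) model call, and the call counter exists only to make the cache behavior visible.

```python
from functools import lru_cache

# Counts how often the (simulated) LLM is actually invoked.
calls = {"count": 0}


def generate_hypothetical(query: str) -> str:
    """Hypothetical placeholder for an LLM generation call."""
    calls["count"] += 1
    return f"A passage that answers: {query}"


def normalize(query: str) -> str:
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(query.lower().split())


@lru_cache(maxsize=1024)
def _cached_generate(normalized_query: str) -> str:
    # Only reached on a cache miss.
    return generate_hypothetical(normalized_query)


def cached_hypothetical(query: str) -> str:
    return _cached_generate(normalize(query))


first = cached_hypothetical("What are the side effects of drug X?")
second = cached_hypothetical("  what are the side effects of   drug x? ")
print(calls["count"])
```

Both calls resolve to the same cache key, so the generation step runs once. An in-process `lru_cache` is lost on restart and is not shared across workers; a shared store would be needed for that, which is outside this sketch.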

Related Recipes

RAG Baseline
Multi-Query
Self-Query