
Recipe Reranker Design

Two-stage retrieval pipeline that pairs fast vector search with precision cross-encoder scoring for recipe recommendations.

Pipeline Overview

1. User query → embedding via text-embedding-3-small
2. ANN retrieval: top-100 candidates from Qdrant
3. Cross-encoder scores all 100 query-document pairs
4. Re-rank by cross-encoder score → top-10 returned
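
The four steps above can be sketched as a single orchestration function. This is a minimal illustration, not the production code: the `Candidate` type, the `recommend` function, and the three injected callables (`embed`, `ann_search`, `score_pairs`) are hypothetical names standing in for the embedding call, the Qdrant query, and the cross-encoder, whose actual interfaces are not shown in this document.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """A retrieved recipe (hypothetical shape; the real payload is not specified)."""
    recipe_id: str
    text: str


def recommend(
    query: str,
    embed: Callable[[str], Sequence[float]],
    ann_search: Callable[[Sequence[float], int], List[Candidate]],
    score_pairs: Callable[[List[Tuple[str, str]]], Sequence[float]],
    retrieve_k: int = 100,
    return_k: int = 10,
) -> List[Tuple[Candidate, float]]:
    """Run the two-stage pipeline and return the top results with their scores."""
    # Step 1: turn the user query into an embedding vector.
    query_vector = embed(query)

    # Step 2: fast, approximate retrieval of a wide candidate set.
    candidates = ann_search(query_vector, retrieve_k)

    # Step 3: score every (query, document) pair with the cross-encoder.
    pairs = [(query, candidate.text) for candidate in candidates]
    scores = score_pairs(pairs)

    # Step 4: sort by cross-encoder score, highest first, and keep the top few.
    # sorted() is stable, so tied scores keep their ANN retrieval order.
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return ranked[:return_k]
```

Passing the stages in as callables keeps the sketch independent of any particular client library and makes it easy to exercise with stubs.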

Cross-Encoder Model

We run a fine-tuned MiniLM-L6-v2 cross-encoder on an A10G GPU via Modal. The model takes concatenated [query, recipe_text] pairs and outputs a single relevance logit. Inference for 100 pairs completes in under 80ms at p99.
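
As a rough sketch of the pair-scoring step, the snippet below builds the (query, recipe_text) pairs and passes them to a cross-encoder in one call. It assumes the sentence-transformers `CrossEncoder` class and uses the public `cross-encoder/ms-marco-MiniLM-L-6-v2` checkpoint purely as a stand-in, since the fine-tuned model's name and serving code are not given here. The helper names (`load_cross_encoder`, `build_pairs`, `score_recipes`) and the batch size are illustrative, and the Modal deployment is not shown.

```python
from typing import List, Sequence, Tuple

# Assumption: a public MS MARCO checkpoint stands in for the fine-tuned model,
# whose name is not given in this document.
STAND_IN_MODEL = "cross-encoder/ms-marco-MiniLM-L-6-v2"


def load_cross_encoder(model_name: str = STAND_IN_MODEL):
    """Load a cross-encoder via sentence-transformers (imported lazily)."""
    from sentence_transformers import CrossEncoder

    return CrossEncoder(model_name)


def build_pairs(query: str, recipe_texts: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair the same query with every candidate recipe text."""
    return [(query, text) for text in recipe_texts]


def score_recipes(
    model, query: str, recipe_texts: Sequence[str], batch_size: int = 32
) -> List[float]:
    """Return one relevance score per recipe, in the same order as the input."""
    pairs = build_pairs(query, recipe_texts)
    if not pairs:
        return []
    # predict() scores each pair jointly; higher means more relevant.
    scores = model.predict(pairs, batch_size=batch_size)
    return [float(score) for score in scores]
```

A call such as `score_recipes(load_cross_encoder(), "quick vegan dinner", texts)` would return one score per entry in `texts`, which can then be sorted to produce the final ranking.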

Latency Budget

Stage          p50     p99
Embedding      45ms    120ms
Qdrant ANN     12ms    35ms
Cross-encoder  55ms    80ms
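
To get a rough sense of the end-to-end budget, the per-stage figures can be added up. The sketch below assumes the three stages run strictly one after another and ignores network hops and serialization, neither of which is stated in this document. Percentiles also do not add exactly, so the sums are planning estimates rather than measured end-to-end latencies. The names `LATENCY_MS`, `stage_sum`, and `share_of_total` are illustrative.

```python
# Per-stage latencies in milliseconds, copied from the table above.
LATENCY_MS = {
    "Embedding": {"p50": 45, "p99": 120},
    "Qdrant ANN": {"p50": 12, "p99": 35},
    "Cross-encoder": {"p50": 55, "p99": 80},
}


def stage_sum(percentile: str) -> int:
    """Add up one percentile column across all stages."""
    return sum(stage[percentile] for stage in LATENCY_MS.values())


def share_of_total(percentile: str) -> dict:
    """Fraction of the summed budget that each stage accounts for."""
    total = stage_sum(percentile)
    return {name: stage[percentile] / total for name, stage in LATENCY_MS.items()}


print(stage_sum("p50"))  # 112
print(stage_sum("p99"))  # 235
```

On these numbers the cross-encoder is the largest stage at p50 (55 of 112 ms), while the embedding call is the largest at p99 (120 of 235 ms).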