Recipe: Search relevance + ranking design

A practical blueprint for building search that actually finds what users mean — not just what they type.

1. Signal taxonomy

Rank on three tiers. Tier 1: exact title match, boosted 10x. Tier 2: ingredient overlap, weighted by rarity (saffron > salt). Tier 3: semantic similarity, measured as cosine similarity between the query embedding and the recipe embedding. Combine the tiers with a weighted linear sum rather than a black-box model, so that every score can be broken down and debugged.
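A minimal sketch of the weighted linear sum, returning a per-tier breakdown so a score can be inspected. The weight values, the IDF-style rarity formula, and the dictionary field names (text, title, ingredients, embedding) are assumptions made for illustration; only the three tiers and the 10x title boost come from the text above.

```python
import math
from dataclasses import dataclass

# Hypothetical weights: the title weight mirrors the 10x boost, the others are placeholders.
WEIGHTS = {"title": 10.0, "ingredients": 3.0, "semantic": 1.0}


@dataclass
class ScoreBreakdown:
    title: float
    ingredients: float
    semantic: float
    total: float


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def ingredient_overlap(query_ingredients, recipe_ingredients, doc_freq, n_docs):
    """Sum of smoothed IDF weights over shared ingredients (rare ones count more)."""
    shared = set(query_ingredients) & set(recipe_ingredients)
    return sum(math.log((1 + n_docs) / (1 + doc_freq.get(i, 0))) for i in shared)


def score(query, recipe, doc_freq, n_docs):
    """Combine the three tiers linearly and keep each component for debugging."""
    title = 1.0 if query["text"].strip().lower() == recipe["title"].strip().lower() else 0.0
    ingredients = ingredient_overlap(
        query["ingredients"], recipe["ingredients"], doc_freq, n_docs
    )
    semantic = cosine_similarity(query["embedding"], recipe["embedding"])
    total = (
        WEIGHTS["title"] * title
        + WEIGHTS["ingredients"] * ingredients
        + WEIGHTS["semantic"] * semantic
    )
    return ScoreBreakdown(title, ingredients, semantic, total)
```

Because the total is a plain sum, a surprising ranking can be traced to the component that caused it by reading the breakdown.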

2. Tokenization that respects food

Standard analyzers break on “chicken breast” and “gluten-free”: they split each phrase into unrelated tokens. Build a culinary-aware tokenizer: preserve bigrams for common pairings, strip measure words (cup, tbsp), and expand dietary aliases (vegan → no-meat, no-dairy, no-egg, no-honey). Store both raw and normalized tokens in the index.
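One way the tokenizer could look, as a rough sketch. The bigram list, measure-word list, and alias table are small illustrative samples rather than a real vocabulary, and the function name tokenize and its return shape are hypothetical.

```python
import re

# Illustrative samples only; a real deployment would load much larger lists.
BIGRAMS = {("chicken", "breast"), ("olive", "oil"), ("sour", "cream")}
MEASURE_WORDS = {"cup", "cups", "tbsp", "tsp", "oz", "ml"}
DIETARY_ALIASES = {
    # Eggs are included here because a vegan diet excludes them as well.
    "vegan": ["no-meat", "no-dairy", "no-egg", "no-honey"],
}

# Keeps hyphenated terms such as "gluten-free" as a single token.
TOKEN_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def tokenize(text):
    """Return both the raw tokens and the culinary-normalized tokens."""
    raw = TOKEN_RE.findall(text.lower())
    normalized = []
    i = 0
    while i < len(raw):
        tok = raw[i]
        # Merge known pairings into one token so "chicken breast" stays together.
        if i + 1 < len(raw) and (tok, raw[i + 1]) in BIGRAMS:
            normalized.append(tok + " " + raw[i + 1])
            i += 2
            continue
        # Measure words carry no search intent, so they are dropped.
        if tok in MEASURE_WORDS:
            i += 1
            continue
        normalized.append(tok)
        # Expand dietary labels into the exclusions they imply.
        normalized.extend(DIETARY_ALIASES.get(tok, []))
        i += 1
    return {"raw": raw, "normalized": normalized}
```

Indexing both lists keeps exact-phrase lookups possible on the raw tokens while matching happens on the normalized ones.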

3. Personalization without creep

Maintain a per-user preference vector — cuisine affinity, avoided ingredients, skill level — stored client-side or in an opaque server blob. Apply a gentle 1.3x boost to results matching two or more preference dimensions. Never re-rank the top slot solely on personalization; relevance must lead.
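The boost rule can be sketched as follows. This reads "never re-rank the top slot" as pinning the most relevant result in first place and applying the 1.3x boost only to the remainder, which is one possible interpretation. The Preferences and Result types, their fields, and the way each dimension is matched are hypothetical.

```python
from dataclasses import dataclass

PERSONAL_BOOST = 1.3          # from the text above
MIN_MATCHED_DIMENSIONS = 2    # "two or more preference dimensions"


@dataclass
class Preferences:
    cuisines: set
    avoided_ingredients: set
    skill_level: int


@dataclass
class Result:
    recipe_id: str
    relevance: float
    cuisine: str
    ingredients: set
    difficulty: int


def matched_dimensions(prefs, result):
    """Count how many of the three preference dimensions this result satisfies."""
    return sum([
        result.cuisine in prefs.cuisines,
        not (result.ingredients & prefs.avoided_ingredients),
        result.difficulty <= prefs.skill_level,
    ])


def personalize(results, prefs):
    """Re-rank everything below the top slot; the top slot stays relevance-only."""
    if not results:
        return []
    by_relevance = sorted(results, key=lambda r: r.relevance, reverse=True)
    top, rest = by_relevance[0], by_relevance[1:]

    def boosted(r):
        qualifies = matched_dimensions(prefs, r) >= MIN_MATCHED_DIMENSIONS
        return r.relevance * (PERSONAL_BOOST if qualifies else 1.0)

    rest.sort(key=boosted, reverse=True)
    return [top] + rest
```

Pinning the first result is the strictest reading of the rule; a softer variant could allow a swap only when the relevance gap is small.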

4. Freshness decay

Seasonal recipes (pumpkin, grilling) get a time-decay boost using a Gaussian centered on peak season. Trending recipes — measured by 24h view velocity — get a 1.5x lift that decays linearly over 72 hours. Cap trend boost so a single viral post cannot drown the index.
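The two decay curves might be written like this. The Gaussian width (30 days), the seasonal peak height, and the overall cap of 2.0 are placeholder values chosen for the example, and capping the combined multiplier is one reading of "cap trend boost"; only the 1.5x lift and the 72-hour linear decay come from the text.

```python
import math

TREND_PEAK = 1.5           # from the text above
TREND_WINDOW_HOURS = 72.0  # from the text above
MAX_FRESHNESS = 2.0        # hypothetical cap on the combined multiplier


def seasonal_boost(day_of_year, peak_day, sigma_days=30.0, max_boost=0.5):
    """Gaussian bump centered on peak_day, wrapping around the year boundary."""
    diff = abs(day_of_year - peak_day) % 365
    dist = min(diff, 365 - diff)  # circular distance in days
    return 1.0 + max_boost * math.exp(-(dist ** 2) / (2 * sigma_days ** 2))


def trending_boost(hours_since_trend_start):
    """1.5x lift at hour 0, falling linearly to 1.0x at hour 72."""
    if hours_since_trend_start < 0 or hours_since_trend_start >= TREND_WINDOW_HOURS:
        return 1.0
    remaining = 1.0 - hours_since_trend_start / TREND_WINDOW_HOURS
    return 1.0 + (TREND_PEAK - 1.0) * remaining


def freshness_multiplier(day_of_year, peak_day, hours_since_trend_start):
    """Combine both boosts, capped so no single recipe can dominate results."""
    combined = seasonal_boost(day_of_year, peak_day) * trending_boost(hours_since_trend_start)
    return min(combined, MAX_FRESHNESS)
```

A recipe at its seasonal peak that has just started trending would reach 1.5 x 1.5 = 2.25 under these placeholder values, so the cap trims it to 2.0.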

5. Evaluation loop

Log every query, the top-10 returned IDs, and the clicked result. Compute MRR and NDCG@10 weekly. Run A/B tests on weight coefficients with a 5% traffic slice. If a change does not beat baseline on NDCG within 14 days, revert automatically. Relevance is a regression target, not a vibe.
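For reference, both metrics can be computed from the logged data roughly as follows. The log entry shape (returned_ids, clicked_id) and the relevance mapping passed to ndcg_at_k are assumptions; with click logs alone, a common simplification is to treat the clicked result as the only relevant one.

```python
import math


def reciprocal_rank(returned_ids, clicked_id):
    """1/rank of the clicked result, or 0.0 if nothing returned was clicked."""
    for rank, rid in enumerate(returned_ids, start=1):
        if rid == clicked_id:
            return 1.0 / rank
    return 0.0


def mrr(log):
    """Mean reciprocal rank over all logged queries."""
    if not log:
        return 0.0
    total = sum(reciprocal_rank(e["returned_ids"], e["clicked_id"]) for e in log)
    return total / len(log)


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(position + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(returned_ids, relevance, k=10):
    """DCG of the returned order divided by the DCG of the ideal order."""
    gains = [relevance.get(rid, 0.0) for rid in returned_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```

With a single clicked result as the only relevant item, NDCG@10 reduces to 1 / log2(rank + 1), so a click at rank 3 scores 0.5.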
