Embeddings pipeline & vector store wiring
End-to-end pipeline that ingests raw content, chunks it, generates embeddings via an inference endpoint, and indexes vectors into a store for semantic retrieval.
Pipeline stages
- Ingest raw markdown or JSON from a content source.
- Split into overlapping chunks with a sliding-window chunker.
- POST each chunk to the embedding model endpoint and collect the returned float32 vectors.
- Upsert vectors + metadata into the vector store with a batch writer.
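
The four stages above can be sketched as follows. This is a minimal illustration, not the actual implementation: whitespace splitting stands in for a real tokenizer, `embed` and `upsert` are hypothetical callables injected by the caller, and the record layout (`id`, `vector`, `metadata`) is an assumed shape rather than any particular store's schema.

```python
import hashlib
from typing import Callable, Iterable, Iterator, Sequence

Vector = list[float]


def sliding_window_chunks(
    tokens: Sequence[str], size: int, overlap: int
) -> Iterator[list[str]]:
    """Yield windows of up to `size` tokens; neighbours share `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    start = 0
    while start < len(tokens):
        yield list(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end of the input
        start += step


def batched(items: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Group records into lists of at most `batch_size` for the batch writer."""
    batch: list[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_pipeline(
    documents: Iterable[tuple[str, str]],   # (doc_id, raw text) pairs
    embed: Callable[[str], Vector],         # hypothetical: one chunk in, one vector out
    upsert: Callable[[list[dict]], None],   # hypothetical: writes one batch to the store
    size: int = 512,
    overlap: int = 64,
    batch_size: int = 100,                  # assumed value, tune for the store in use
) -> int:
    """Chunk, embed, and upsert every document; return the number of records written."""

    def records() -> Iterator[dict]:
        for doc_id, text in documents:
            tokens = text.split()  # stand-in for a real tokenizer
            for i, chunk in enumerate(sliding_window_chunks(tokens, size, overlap)):
                chunk_text = " ".join(chunk)
                yield {
                    # content-hash ID so a replay overwrites instead of duplicating
                    "id": hashlib.sha256(chunk_text.encode("utf-8")).hexdigest(),
                    "vector": embed(chunk_text),
                    "metadata": {"doc_id": doc_id, "chunk_index": i, "text": chunk_text},
                }

    written = 0
    for batch in batched(records(), batch_size):
        upsert(batch)
        written += len(batch)
    return written
```

Passing `embed` and `upsert` in as callables keeps the chunking and batching logic testable without network access; the real endpoint and store client can be swapped in at the call site.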
Key decisions
- Chunk size 512 tokens with a 64-token overlap: large enough to keep local context, small enough for fine-grained retrieval.
- Embedding model: text-embedding-3-small (1536 dimensions). Swap via env var; the index dimension must match the new model's output size.
- Vector store: Upstash Vector with cosine similarity index.
- Idempotency keys derived from content hash prevent duplicate inserts on replay.
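
A minimal sketch of two of these decisions, under stated assumptions: the environment variable name `EMBEDDING_MODEL` is hypothetical (the notes do not name it), SHA-256 is one reasonable choice of content hash, and `InMemoryStore` is a toy stand-in for a store whose upsert overwrites by ID.

```python
import hashlib
import os
from typing import Mapping

DEFAULT_MODEL = "text-embedding-3-small"


def embedding_model(env: Mapping[str, str] = os.environ) -> str:
    """Return the model name, overridable via the (hypothetical) EMBEDDING_MODEL variable."""
    return env.get("EMBEDDING_MODEL", DEFAULT_MODEL)


def idempotency_key(chunk_text: str) -> str:
    """Derive a stable key from the chunk's content: same text, same key."""
    return hashlib.sha256(chunk_text.encode("utf-8")).hexdigest()


class InMemoryStore:
    """Toy stand-in for a vector store whose upsert overwrites by ID."""

    def __init__(self) -> None:
        self.rows: dict[str, tuple[list[float], dict]] = {}

    def upsert(self, key: str, vector: list[float], metadata: dict) -> None:
        self.rows[key] = (vector, metadata)


# Replaying the same chunk twice leaves a single row in the store.
store = InMemoryStore()
for _ in range(2):
    store.upsert(idempotency_key("hello world"), [0.1, 0.2], {"source": "a.md"})
```

Because the key depends only on the chunk text, re-running ingestion over unchanged content rewrites the same rows, while edited content produces new keys. Stale rows for the old content would still need separate cleanup.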
Wiring diagram
Content Source
│
Chunker (sliding window)
│
Embedding endpoint (POST /v1/embeddings)
│
Vector store upsert (batch)
▼
Semantic search index
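
The embedding step in the diagram could look like the sketch below. It assumes an OpenAI-compatible request and response shape (`model` and `input` in the JSON body, the vector under `data[0].embedding`) and bearer-token auth; `base_url` and `api_key` are hypothetical parameters supplied by the caller.

```python
import json
import urllib.request


def build_embedding_request(
    base_url: str, api_key: str, text: str, model: str = "text-embedding-3-small"
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/embeddings request for one chunk."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def parse_embedding(payload: dict) -> list[float]:
    """Extract the vector from an OpenAI-style response body (assumed shape)."""
    return [float(x) for x in payload["data"][0]["embedding"]]


def embed(base_url: str, api_key: str, text: str) -> list[float]:
    """Send one chunk to the endpoint and return its embedding vector."""
    request = build_embedding_request(base_url, api_key, text)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_embedding(json.load(response))
```

Splitting request construction from sending makes the payload easy to inspect in tests; retries and rate-limit handling are left out of this sketch.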
Next steps
- Wire the retrieval endpoint to the search bar, as described in Recipe: Search endpoint.
- Add a re-index webhook triggered on content updates.