RAG hub

Everything you need to build retrieval-augmented generation pipelines — from embedding fundamentals to production vector stores and end-to-end recipes.

RAG recipe

End-to-end retrieval-augmented generation pattern — chunk, embed, retrieve, augment, generate.

Embeddings guide

Choosing embedding models, dimension trade-offs, normalization, and distance metrics for vector search.

RAG with pgvector

Postgres as a vector store — IVFFlat and HNSW indexes, and hybrid search with pgvector.

RAG with Pinecone

Managed vector database — namespaces, metadata filtering, and serverless or pod-based indexes.

RAG with Qdrant

High-performance vector search engine — quantization, payload indexing, and sparse-dense hybrid retrieval.

RAG with Weaviate

AI-native vector database — GraphQL API, generative search, and multi-tenancy.

RAG with Chroma

Open-source embedding database — collections, metadata filtering, and local-first development.

Prompt caching

Reduce latency and cost by caching system prompts and long context prefixes across requests.

Recipe: PDF Q&A

Build a PDF question-answering pipeline — parsing, chunking, embedding, and retrieval over documents.

Recipe: Semantic search

Implement semantic search over a knowledge base — from text preprocessing to ranked retrieval.

Not sure where to start?

Begin with the RAG recipe for a high-level overview, then dive into the embeddings guide to understand vector representations. Pick a vector store that matches your infrastructure, and explore the recipes for concrete implementations.
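
For orientation, the chunk, embed, retrieve, and augment steps named in the RAG recipe can be sketched in a few lines of Python. This is a minimal illustration, not the recipe's actual implementation: the function names (`chunk`, `embed`, `retrieve`, `augment`) are hypothetical, and `embed` is a toy bag-of-words counter standing in for a real embedding model. The final generate step (sending the prompt to a model) is left out.

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 12) -> list[str]:
    """Split text into fixed-size word windows (no overlap, for brevity)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def augment(query: str, context: list[str]) -> str:
    """Build a prompt that places the retrieved chunks ahead of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


if __name__ == "__main__":
    docs = [
        "pgvector adds vector indexes to Postgres",
        "Chroma is a local-first embedding database",
        "Prompt caching reduces latency",
    ]
    question = "Which store runs on Postgres?"
    print(augment(question, retrieve(question, docs, k=1)))
```

In a real pipeline, `embed` would call an embedding model and `retrieve` would query one of the vector stores listed above instead of sorting an in-memory list.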