RAG hub
Everything you need to build retrieval-augmented generation pipelines — from embedding fundamentals to production vector stores and end-to-end recipes.
RAG recipe
End-to-end retrieval-augmented generation pattern — chunk, embed, retrieve, augment, generate.
Embeddings guide
Choosing embedding models, dimension trade-offs, normalization, and distance metrics for vector search.
RAG with pgvector
Postgres as a vector store — IVFFlat, HNSW indexes, and hybrid search with pgvector.
RAG with Pinecone
Managed vector database — namespaces, metadata filtering, and serverless or pod-based indexes.
RAG with Qdrant
High-performance vector search engine — quantization, payload indexing, and sparse-dense hybrid retrieval.
RAG with Weaviate
AI-native vector database — GraphQL API, generative search, and multi-tenancy.
RAG with Chroma
Open-source embedding database — collections, metadata filtering, and local-first development.
Prompt caching
Reduce latency and cost by caching system prompts and long context prefixes across requests.
Recipe: PDF Q&A
Build a PDF question-answering pipeline — parsing, chunking, embedding, and retrieval over documents.
Recipe: Semantic search
Implement semantic search over a knowledge base — from text preprocessing to ranked retrieval.
Not sure where to start?
Begin with the RAG recipe for a high-level overview, then dive into the embeddings guide to understand vector representations. Pick a vector store that matches your infrastructure, and explore the recipes for concrete implementations.
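As a rough orientation before opening the individual guides, the five stages named in the RAG recipe (chunk, embed, retrieve, augment, generate) can be sketched in a few lines of standard-library Python. This is a toy illustration under stated assumptions, not the implementation used in any of the linked pages: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector store, and `generate` is a hypothetical placeholder for a call to a language model.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding (assumption): an L2-normalized bag-of-words vector.

    A real pipeline would call an embedding model here instead.
    """
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two normalized sparse vectors (a plain dot product)."""
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def retrieve(query, index, k=2):
    """Return the k chunks whose vectors are most similar to the query vector."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def augment(query, passages):
    """Build a prompt that places the retrieved passages ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def generate(prompt):
    """Hypothetical placeholder: a real pipeline would send the prompt to an LLM."""
    return f"[model output for a {len(prompt)}-character prompt]"


if __name__ == "__main__":
    documents = [
        "pgvector adds vector indexes such as HNSW to Postgres.",
        "Prompt caching reuses long context prefixes across requests.",
    ]
    # Index step: chunk every document and store (chunk, vector) pairs in memory.
    index = [(c, embed(c)) for doc in documents for c in chunk(doc)]
    question = "How do I index vectors in Postgres?"
    prompt = augment(question, retrieve(question, index, k=1))
    print(generate(prompt))
```

Swapping `embed` for a real model and the in-memory `index` for one of the vector stores listed above changes the components but not the overall shape of the pipeline.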