Recipe

Vector Database Design

A practical guide to schema design, index selection, and query patterns for production vector stores.

Dimensionality & Precision

Match the embedding dimension to your model: 384 for all-MiniLM, 768 for BERT-base, 1536 for OpenAI ada-002. Prefer float32 for recall-critical workloads. Use int8 quantization, which stores each component in 1 byte instead of 4, when throughput matters more than a sub-percent loss in accuracy.
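
To make the float32 versus int8 tradeoff concrete, here is a minimal sketch of the storage arithmetic and of symmetric scalar quantization. The function names are hypothetical, and the per-vector max-abs scaling is an assumption for illustration; production stores often calibrate scales per dimension or per segment instead.

```python
BYTES_PER_VALUE = {"float32": 4, "int8": 1}


def raw_vector_bytes(num_vectors: int, dim: int, dtype: str) -> int:
    """Raw storage for the vectors alone, excluding any index overhead."""
    return num_vectors * dim * BYTES_PER_VALUE[dtype]


def quantize_int8(vec: list[float]) -> tuple[list[int], float]:
    """Symmetric scalar quantization: map [-max_abs, max_abs] onto [-127, 127]."""
    max_abs = max(abs(x) for x in vec)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(x / scale) for x in vec], scale


def dequantize_int8(codes: list[int], scale: float) -> list[float]:
    """Approximate reconstruction; each component is off by at most scale / 2."""
    return [c * scale for c in codes]


# 1M vectors at 768 dimensions: 3,072,000,000 bytes as float32, 768,000,000 as int8.
float32_bytes = raw_vector_bytes(1_000_000, 768, "float32")
int8_bytes = raw_vector_bytes(1_000_000, 768, "int8")
```

The rounding error per component is bounded by half the scale, which is where the small accuracy loss comes from.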

Index Strategy

HNSW delivers the best latency-recall tradeoff for datasets under 10M vectors. Set M=16 and efConstruction=200 at build time, and efSearch=64–128 at query time. IVF-PQ shines above 100M vectors; choose the partition count so that each cluster holds roughly 10K vectors (for example, 10,000 partitions for 100M vectors).
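
The thresholds above can be sketched as a small helper. The function name and the returned dictionary shape are hypothetical, not any library's API. The guide gives no recommendation for the 10M to 100M range, so the sketch returns a "benchmark both" marker there as a labeled assumption.

```python
def suggest_index(num_vectors: int) -> dict:
    """Map corpus size to the starting parameters given in this guide."""
    if num_vectors < 10_000_000:
        return {
            "index": "HNSW",
            "M": 16,
            "efConstruction": 200,
            "efSearch": (64, 128),  # lower and upper end of the suggested range
        }
    if num_vectors > 100_000_000:
        # Target roughly 10K vectors per cluster.
        nlist = max(1, round(num_vectors / 10_000))
        return {"index": "IVF-PQ", "nlist": nlist}
    # Assumption: the guide is silent on this range, so defer to measurement.
    return {"index": "benchmark both"}


small = suggest_index(5_000_000)
large = suggest_index(200_000_000)   # nlist works out to 20,000 partitions
```

Treat the output as a starting point for benchmarking on your own data rather than a final configuration.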

Metadata Filtering

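Pre-filter before vector search when the filter is highly selective, that is, when it matches only a small fraction of the corpus. Post-filter when the metadata clause matches >80% of the corpus, since most of the top-k candidates will survive the filter. Hybrid approaches that keep scalar indexes on tenant_id, timestamp, or tags prevent full-scan disasters at scale.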
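
A toy example shows why the order of filtering matters. Exact brute-force search stands in for the ANN index here, and the item layout and function names are assumptions for illustration. With a selective filter, post-filtering can return fewer than k results because the top-k candidates are chosen before the filter is applied.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, items, k):
    """Exact brute-force search, standing in for an ANN index."""
    return sorted(items, key=lambda it: dot(query, it["vec"]), reverse=True)[:k]


def pre_filter_search(query, items, k, predicate):
    """Apply the metadata filter first, then rank only the survivors."""
    return top_k(query, [it for it in items if predicate(it)], k)


def post_filter_search(query, items, k, predicate):
    """Rank the whole corpus first, then drop results that fail the filter."""
    return [it for it in top_k(query, items, k) if predicate(it)]


items = [
    {"id": 1, "tenant_id": "a", "vec": [1.0, 0.0]},
    {"id": 2, "tenant_id": "a", "vec": [0.9, 0.1]},
    {"id": 3, "tenant_id": "b", "vec": [0.2, 0.8]},
    {"id": 4, "tenant_id": "a", "vec": [0.8, 0.2]},
    {"id": 5, "tenant_id": "b", "vec": [0.1, 0.9]},
]
query = [1.0, 0.0]


def only_b(item):
    return item["tenant_id"] == "b"


pre = pre_filter_search(query, items, 2, only_b)    # ids 3 and 5
post = post_filter_search(query, items, 2, only_b)  # empty: both top-2 hits belong to tenant "a"
```

When the filter matches most of the corpus, the two strategies return nearly the same results and post-filtering avoids the up-front filtering pass.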

Sharding & Multi-Tenancy

Partition by tenant for strict isolation; use collection-per-tenant for small fleets. For SaaS with thousands of tenants, a single collection with tenant_id pre-filtering avoids index sprawl. Monitor shard size — repartition when any shard exceeds 50M vectors.
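
As a rough sketch, the layout choice and the shard-size check might look like the following. The names are hypothetical, and the cutoff for a "small fleet" (100 tenants) is an assumption, since the guide gives no number; only the 50M shard limit comes from the text.

```python
SHARD_LIMIT = 50_000_000  # repartition threshold from this guide


def tenancy_layout(num_tenants: int, small_fleet_max: int = 100) -> str:
    """Pick a layout; small_fleet_max is an assumed cutoff, not a rule."""
    if num_tenants <= small_fleet_max:
        return "collection-per-tenant"
    return "shared-collection-with-tenant_id-filter"


def shards_to_repartition(shard_sizes: dict[str, int], limit: int = SHARD_LIMIT) -> list[str]:
    """Return the names of shards whose vector count exceeds the limit."""
    return sorted(name for name, count in shard_sizes.items() if count > limit)


oversized = shards_to_repartition(
    {"shard-0": 12_000_000, "shard-1": 61_000_000, "shard-2": 50_000_000}
)
# Only shard-1 is flagged; shard-2 sits exactly at the limit and does not exceed it.
```

A check like this would typically run on a schedule against whatever shard statistics your store exposes.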

Consistency & Durability

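WAL-backed inserts protect acknowledged writes against loss on crash, provided the log is flushed to durable storage before the write is acknowledged. Set replication factor ≥3 for production. Tune commit intervals: around 1s when freshness matters, around 10s when throughput matters. Snapshot every 10K mutations to bound recovery time, since recovery then replays only the WAL entries written after the last snapshot.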
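
The relationship between the WAL, snapshots, and recovery time can be sketched with an in-memory stand-in. The class and method names are hypothetical. A real store would append to a file and fsync before acknowledging a write, which this sketch deliberately omits.

```python
class TinyStore:
    """In-memory illustration of log-then-apply with periodic snapshots."""

    def __init__(self, snapshot_every: int = 10_000):
        self.snapshot_every = snapshot_every
        self.state = {}      # live key -> vector map
        self.snapshot = {}   # last snapshot of state
        self.wal = []        # mutations recorded since the last snapshot

    def upsert(self, key, vector):
        self.wal.append((key, vector))  # log first
        self.state[key] = vector        # then apply
        if len(self.wal) >= self.snapshot_every:
            self.snapshot = dict(self.state)
            self.wal.clear()            # older log entries are no longer needed

    def recover(self) -> dict:
        """Rebuild state as after a crash: load the snapshot, replay the WAL tail."""
        state = dict(self.snapshot)
        for key, vector in self.wal:
            state[key] = vector
        return state


store = TinyStore(snapshot_every=3)
for i in range(4):
    store.upsert(f"doc-{i}", [float(i)])
# After four upserts, one snapshot has been taken and one entry remains to replay.
```

Because the WAL never holds more than snapshot_every entries, the replay work during recovery has a fixed upper bound.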