Recipe: Kubernetes VPA design
Vertical Pod Autoscaler architecture for right-sizing workloads without manual intervention.
Overview
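VPA observes historical resource usage and adjusts container requests, scaling limits proportionally so the original request-to-limit ratio is kept. Three components (recommender, updater, admission controller) form a closed loop: the recommender proposes values, the updater evicts pods that are out of date, and the admission controller applies the new values when the pods are recreated.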
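A minimal sketch of the object that drives this loop, built as a plain Python dict so it can be dumped to JSON and applied. The field names follow the autoscaling.k8s.io/v1 VerticalPodAutoscaler API; the helper name, the object names ("web-vpa", "web") and the min/max bounds are illustrative assumptions, not values from this recipe.

```python
import json


def build_vpa_manifest(name, target_deployment, update_mode="Auto"):
    """Return a VerticalPodAutoscaler manifest as a dict.

    Hypothetical helper: the bounds below are placeholders and should
    be tuned per workload.
    """
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # Workload whose pods the three components act on.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": target_deployment,
            },
            # "Off" only records recommendations; "Initial" applies them
            # at pod creation only; "Auto" also lets the updater evict.
            "updatePolicy": {"updateMode": update_mode},
            "resourcePolicy": {
                "containerPolicies": [
                    {
                        "containerName": "*",
                        "controlledResources": ["cpu", "memory"],
                        "minAllowed": {"cpu": "100m", "memory": "128Mi"},
                        "maxAllowed": {"cpu": "2", "memory": "4Gi"},
                    }
                ]
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_vpa_manifest("web-vpa", "web"), indent=2))
```

Starting with update_mode="Off" is a common way to inspect recommendations before allowing evictions.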
Recommender
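Reads CPU and memory usage samples from the Metrics API (metrics-server) and aggregates them into decaying histograms per container. Computes percentile-based recommendations (P90 target by default). Stores them in the VPA object's status.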
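The percentile step can be sketched roughly as below. This is a simplified illustration, not the recommender's actual code: the real component uses bucketed histograms and applies extra safety margins, while this version works on raw samples. The function name, the (age_hours, value) sample shape and the 24-hour half-life are assumptions made for the example.

```python
def decayed_percentile(samples, percentile=0.9, half_life_hours=24.0):
    """Weighted percentile over (age_hours, value) usage samples.

    Each sample's weight halves every `half_life_hours`, so recent
    usage dominates older usage.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if not 0.0 < percentile <= 1.0:
        raise ValueError("percentile must be in (0, 1]")

    # Sort by value and attach the decay weight for each sample.
    weighted = sorted(
        (value, 0.5 ** (age_hours / half_life_hours))
        for age_hours, value in samples
    )
    total_weight = sum(weight for _, weight in weighted)
    threshold = percentile * total_weight

    # Walk up the sorted values until the cumulative weight crosses
    # the requested share of the total weight.
    cumulative = 0.0
    for value, weight in weighted:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return weighted[-1][0]


if __name__ == "__main__":
    # Nine fresh samples at 100 millicores, one 10-day-old spike at 1000.
    cpu_samples = [(0.0, 100.0)] * 9 + [(240.0, 1000.0)]
    print(decayed_percentile(cpu_samples, percentile=0.95))
```

With decay, the old spike carries almost no weight and the result stays at the steady-state value; with all samples equally fresh, the same P95 query would land on the spike.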
Updater
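Compares the requests of running pods against the current recommendation. Evicts pods through the Eviction API when drift exceeds a threshold, so PodDisruptionBudgets (PDBs) are enforced by the API server. Also respects a minimum-replicas guardrail and skips workloads with too few live replicas.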
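A rough sketch of the eviction decision, assuming drift is measured as the relative difference between the current request and the recommendation. The function names, the 10% threshold and the default of two minimum replicas are illustrative assumptions; the PDB check is not modeled here because the Eviction API applies it server-side.

```python
def relative_drift(current_request, recommended):
    """Relative distance of the current request from the recommendation."""
    if recommended <= 0:
        raise ValueError("recommended must be positive")
    return abs(current_request - recommended) / recommended


def should_evict(current_request, recommended, live_replicas,
                 drift_threshold=0.10, min_replicas=2):
    """Decide whether a pod is a candidate for eviction.

    Hypothetical policy: evict only if enough replicas are alive and
    the request is more than `drift_threshold` away from the target.
    """
    # Guardrail: never evict when the workload is already too small.
    if live_replicas < min_replicas:
        return False
    return relative_drift(current_request, recommended) > drift_threshold


if __name__ == "__main__":
    # Request of 500m against a recommendation of 1000m, three replicas.
    print(should_evict(500, 1000, live_replicas=3))
    # Same drift, but only one replica alive: the guardrail blocks it.
    print(should_evict(500, 1000, live_replicas=1))
```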
Admission Controller
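Mutates pod creation requests to inject recommended resources, so new and recreated pods start with the recommended requests instead of the stale values in the pod template. Once a recommendation exists, this reduces the risk of OOM kills right after a pod starts. Requires a mutating webhook registration.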
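The mutation can be sketched as a JSON Patch wrapped in an AdmissionReview response. The response shape (uid, allowed, patchType "JSONPatch", base64-encoded patch) follows the admission.k8s.io/v1 API; the function names and the recommendation format (container name mapped to resource quantities) are assumptions for illustration, and limit scaling is left out.

```python
import base64
import json


def build_resource_patch(pod, recommendations):
    """Build JSON Patch operations that set recommended requests.

    `recommendations` is a hypothetical mapping such as
    {"app": {"cpu": "250m", "memory": "512Mi"}}.
    """
    operations = []
    for index, container in enumerate(pod["spec"]["containers"]):
        recommended = recommendations.get(container["name"])
        if recommended is None:
            continue  # No recommendation for this container yet.
        base = f"/spec/containers/{index}/resources"
        resources = container.get("resources")
        if not resources:
            # No resources block: create it with the requests inside.
            operations.append(
                {"op": "add", "path": base,
                 "value": {"requests": dict(recommended)}})
        elif "requests" not in resources:
            operations.append(
                {"op": "add", "path": base + "/requests",
                 "value": dict(recommended)})
        else:
            # "add" on an existing object member replaces its value.
            for resource, quantity in recommended.items():
                operations.append(
                    {"op": "add", "path": f"{base}/requests/{resource}",
                     "value": quantity})
    return operations


def admission_response(uid, operations):
    """Wrap the patch in an AdmissionReview v1 response body."""
    patch = base64.b64encode(json.dumps(operations).encode()).decode()
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": patch,
        },
    }
```

Always returning allowed: True keeps the webhook from blocking pod creation when no recommendation is available; the patch is then simply an empty list.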
Trade-offs
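- Eviction-based updates cause brief disruption, since a pod must be restarted to pick up new requests
- Recommender needs about 8 days of history (its default window) before recommendations are trustworthy
- Not compatible with HPA on the same metric (CPU or memory), because both controllers would react to the same signal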