Recipe: Kubernetes VPA design

Vertical Pod Autoscaler architecture for right-sizing workloads without manual intervention.

Overview

VPA observes historical resource usage and adjusts container CPU and memory requests; limits are scaled along with them to keep the original request-to-limit ratio. Three components (recommender, updater, admission controller) form a closed loop.
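To show how a workload is enrolled in this loop, the sketch below builds a VerticalPodAutoscaler manifest as a Python dictionary. The field names follow the autoscaling.k8s.io/v1 API; the helper build_vpa_manifest, the target name "web", and the min/max bounds are hypothetical values chosen for illustration.

```python
import json


def build_vpa_manifest(name, target_kind, target_name, update_mode="Auto"):
    """Return a VerticalPodAutoscaler manifest as a plain dict.

    All names and bounds are illustrative assumptions, not values
    taken from a real cluster.
    """
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            # The workload whose pods the three components act on.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": target_kind,
                "name": target_name,
            },
            # "Off" only records recommendations; "Auto" also applies them.
            "updatePolicy": {"updateMode": update_mode},
            # Clamp recommendations for every container ("*").
            "resourcePolicy": {
                "containerPolicies": [
                    {
                        "containerName": "*",
                        "minAllowed": {"cpu": "50m", "memory": "64Mi"},
                        "maxAllowed": {"cpu": "2", "memory": "2Gi"},
                    }
                ]
            },
        },
    }


manifest = build_vpa_manifest("web-vpa", "Deployment", "web")
print(json.dumps(manifest, indent=2))
```

Passing update_mode="Off" keeps the recommender running while leaving pods untouched, which is one way to inspect recommendations before allowing evictions.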

Recommender

Reads CPU and memory usage samples from metrics-server and aggregates them into per-container histograms. Computes percentile-based recommendations (P90 by default for the target). Stores them in the VPA object's status.
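The percentile step can be sketched as follows. This is a deliberately simplified model: it applies a nearest-rank percentile to a flat list of samples, whereas the real recommender works on weighted histograms. The functions percentile and recommend and the sample values are hypothetical.

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list, for an integer pct."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling division avoids floating-point rounding surprises.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def recommend(cpu_millicores, target_pct=90):
    """Return a status-like dict holding the target CPU recommendation."""
    return {"target": {"cpu": f"{percentile(cpu_millicores, target_pct)}m"}}


# Hypothetical CPU usage samples in millicores, including one spike.
usage = [100, 120, 110, 400, 130, 125, 115, 105, 140, 135]
print(recommend(usage))
```

With these ten samples the P90 target is 140m: the single 400m spike sits above the 90th percentile and does not inflate the recommendation.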

Updater

Compares the requests of running pods against the current recommendations. Evicts pods when drift exceeds a threshold, so their controller recreates them. Respects PodDisruptionBudget (PDB) and min-replicas guardrails.
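A minimal sketch of that decision, assuming drift is measured as the relative gap between a pod's request and its recommendation. The 10% threshold, the function names, and the pod records are assumptions for illustration; the actual updater's eviction criteria are more involved.

```python
def should_evict(current_request, recommended, threshold=0.10):
    """True when the relative gap to the recommendation exceeds threshold."""
    if recommended <= 0:
        raise ValueError("recommended must be positive")
    drift = abs(current_request - recommended) / recommended
    return drift > threshold


def pick_evictions(pods, live_replicas, min_replicas=2, disruptions_allowed=1):
    """Select pod names to evict without violating the guardrails.

    min_replicas models the min-replicas guardrail; disruptions_allowed
    models the headroom left by the PodDisruptionBudget.
    """
    if live_replicas < min_replicas:
        return []  # too few replicas to evict safely
    candidates = [
        p["name"] for p in pods if should_evict(p["request"], p["recommended"])
    ]
    return candidates[: max(0, disruptions_allowed)]


# Hypothetical pods with CPU requests and recommendations in millicores.
pods = [
    {"name": "web-a", "request": 100, "recommended": 200},
    {"name": "web-b", "request": 195, "recommended": 200},
    {"name": "web-c", "request": 500, "recommended": 200},
]
print(pick_evictions(pods, live_replicas=3))
```

Capping the result at disruptions_allowed is what turns a large drift into a gradual rollout instead of a mass eviction.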

Admission Controller

Mutates pod creation requests to inject recommended resources. Helps avoid OOM kills by starting new pods with the recommended requests instead of the originals. Must be registered as a mutating admission webhook.
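The mutation can be pictured as a JSON Patch returned inside an AdmissionReview response. The sketch below shows the shape of that exchange only; build_patch, admission_response, the pod, and the recommendation values are hypothetical, and a real webhook would also need TLS and request parsing.

```python
import base64
import json


def build_patch(pod, recommendations):
    """Build JSON Patch operations that set container requests.

    `recommendations` maps a container name to its requests dict.
    """
    ops = []
    for index, container in enumerate(pod["spec"]["containers"]):
        rec = recommendations.get(container["name"])
        if rec is None:
            continue  # no recommendation for this container
        base = f"/spec/containers/{index}/resources"
        if "resources" not in container:
            # The parent object must exist before a child can be added.
            ops.append({"op": "add", "path": base, "value": {}})
        # "add" on an existing member replaces its value (RFC 6902).
        ops.append({"op": "add", "path": f"{base}/requests", "value": rec})
    return ops


def admission_response(uid, ops):
    """Wrap the patch in an AdmissionReview response body."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(ops).encode()).decode(),
        },
    }


# Hypothetical pod with one container that declares no resources yet.
pod = {"spec": {"containers": [{"name": "app"}, {"name": "sidecar"}]}}
ops = build_patch(pod, {"app": {"cpu": "250m", "memory": "256Mi"}})
review = admission_response("example-uid", ops)
print(json.dumps(ops, indent=2))
```

Containers without a recommendation are left untouched, so a partially covered pod is still admitted unchanged for those containers.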

Trade-offs

  • Eviction-based updates cause brief disruption while pods are recreated
  • Recommender needs 8+ days of usage history before its recommendations are high-confidence
  • Not compatible with HPA when both scale on the same metric (CPU or memory)