Serverless patterns
Cold starts, warm pools, and keeping latency predictable when your functions scale to zero.
Why it matters
Serverless platforms bill per invocation and scale automatically, but a cold start can add 200–800 ms to the first request an instance serves. How you structure handler initialization, and which runtime and package size you choose, determines whether users see a spinner or a snappy response.
The warm-pool contract
Most providers keep a container alive for roughly 5–15 minutes after its last request, though this window is not guaranteed. Move expensive setup (SDK clients, config parsing, connection pools) outside the handler so it runs once per container and is reused across invocations. The handler itself should be a thin orchestrator.
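A minimal sketch of this layout, assuming a Python runtime that imports the module once per container and then calls handler for each request. FakeClient stands in for a real SDK client or connection pool, and the APP_CONFIG environment variable is a hypothetical name used only for illustration.

```python
import json
import os


def _load_config() -> dict:
    """Parse configuration once; APP_CONFIG is a hypothetical variable name."""
    return json.loads(os.environ.get("APP_CONFIG", '{"table": "demo"}'))


class FakeClient:
    """Stand-in for an expensive SDK client or connection pool."""

    instances = 0  # counts constructions, to show the client is reused

    def __init__(self, table: str) -> None:
        FakeClient.instances += 1
        self.table = table

    def get(self, key: str) -> dict:
        return {"table": self.table, "key": key}


# Module scope: runs once per container, during the cold start.
CONFIG = _load_config()
CLIENT = FakeClient(CONFIG["table"])


def handler(event: dict, context: object = None) -> dict:
    """Thin orchestrator: validate input, delegate, shape the response."""
    key = event.get("key", "")
    if not key:
        return {"statusCode": 400, "body": "missing key"}
    return {"statusCode": 200, "body": json.dumps(CLIENT.get(key))}
```

Calling handler twice in the same process constructs FakeClient only once, which is the behavior a warm container relies on.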
Provisioned concurrency
For latency-sensitive endpoints, pre-warm a fixed number of instances. This trades cost for predictability, because you pay for those instances even while they sit idle. Prefer native provisioned concurrency when your provider offers it, and fall back to scheduled pings only when it does not.
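If you do fall back to scheduled pings, the handler should recognize them and return early so they never reach business logic. The sketch below assumes the scheduler sends an event containing "warmup": true. That key is a hypothetical convention, not a provider API, and handle_request is a placeholder for the real work.

```python
import json


def is_warmup(event: dict) -> bool:
    """True for keep-warm pings; the "warmup" key is a hypothetical convention."""
    return isinstance(event, dict) and event.get("warmup") is True


def handle_request(event: dict) -> dict:
    """Placeholder for the real business logic."""
    return {"statusCode": 200, "body": json.dumps({"echo": event.get("name", "")})}


def handler(event: dict, context: object = None) -> dict:
    # Short-circuit pings: the container stays warm, no downstream calls are made.
    if is_warmup(event):
        return {"statusCode": 204, "body": ""}
    return handle_request(event)
```

A single ping keeps only one instance warm, so this approach does not help when traffic fans out to several concurrent instances.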
Runtime selection
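Interpreted runtimes (Python, Ruby) typically cold-start faster than JIT-compiled ones (Java, .NET), which must boot a virtual machine before running your code. Ahead-of-time compiled languages such as Go and Rust also start quickly. If you use them, build slim binaries and avoid large static assets in the deployment package. Every megabyte counts during the fetch-and-unpack phase.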
Observability
Tag every invocation with a correlation ID. Log init duration separately from handler duration. Set alarms on p95 cold-start latency so you catch regressions before users do.
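One way to sketch this in Python, assuming structured JSON logs written to stdout. The field names (correlation_id, init_ms, handler_ms, cold_start) and the optional log parameter are illustrative choices, not a provider convention.

```python
import json
import time
import uuid

_INIT_START = time.perf_counter()
# ... expensive setup (clients, config) would run here ...
INIT_MS = (time.perf_counter() - _INIT_START) * 1000.0

_cold = True  # flips to False after the first invocation in this container


def emit(record: dict) -> None:
    """Write one structured log line."""
    print(json.dumps(record))


def handler(event: dict, context: object = None, log=emit) -> dict:
    global _cold
    # Reuse the caller's ID when present so one request can be traced end to end.
    correlation_id = event.get("correlation_id") or str(uuid.uuid4())
    was_cold, _cold = _cold, False

    start = time.perf_counter()
    result = {"statusCode": 200, "body": "ok"}  # placeholder for real work
    handler_ms = (time.perf_counter() - start) * 1000.0

    record = {
        "correlation_id": correlation_id,
        "cold_start": was_cold,
        "handler_ms": round(handler_ms, 3),
    }
    if was_cold:
        # Init time is reported on its own, and only for the cold invocation.
        record["init_ms"] = round(INIT_MS, 3)
    log(record)
    return {**result, "correlation_id": correlation_id}
```

With records shaped like this, a p95 alarm can be computed over init_ms for entries where cold_start is true, separately from handler_ms.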
Next: Edge functions — run closer to users with zero cold starts.