Recipe

Serverless patterns

Cold starts, warm pools, and keeping latency predictable when your functions scale to zero.

Why it matters

Serverless platforms bill per invocation and scale automatically, but a cold start can add roughly 200–800 ms to the first request served by a new instance. How you structure handler initialization, and which runtime and package size you choose, determine whether users see a spinner or a snappy response.

The warm-pool contract

Most providers keep a container alive for 5–15 minutes after the last request, though the exact window is not guaranteed. Move expensive setup (SDK clients, config parsing, connection pools) out of the handler and into module scope, so it runs once per container and survives across invocations. The handler itself should be a thin orchestrator.
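A minimal sketch of that contract in Python, assuming a generic handler(event, context) entry point. FakeClient, the APP_CONFIG environment variable, and the "key" event field are hypothetical stand-ins for illustration, not any provider's API.

```python
import json
import os


def _load_config():
    # Hypothetical config source: a JSON string in an environment variable.
    return json.loads(os.environ.get("APP_CONFIG", '{"table": "demo"}'))


class FakeClient:
    """Stand-in for an SDK client or connection pool (hypothetical)."""

    instances = 0  # counts how many times setup actually ran

    def __init__(self, table):
        FakeClient.instances += 1
        self.table = table

    def get(self, key):
        return {"table": self.table, "key": key}


# Module scope: runs once per container, during the cold start.
CONFIG = _load_config()
CLIENT = FakeClient(CONFIG["table"])


def handler(event, context=None):
    # Thin orchestrator: no setup here, only per-request work.
    return CLIENT.get(event["key"])
```

Calling handler repeatedly in the same process reuses CLIENT, so only the first invocation on a fresh container pays the setup cost.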

Provisioned concurrency

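For latency-sensitive endpoints, pre-warm a fixed number of instances. This trades cost for predictability, since you pay for those instances even when they sit idle. If your provider lacks native provisioned concurrency, fall back to scheduled pings that invoke the function often enough to keep a container alive, but prefer the native feature when available.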
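If you do rely on scheduled pings, the handler should recognize them and return before doing real work. The sketch below assumes the scheduler sends an event containing "warmup": true. That field name, the status codes, and do_real_work are conventions invented for this example, not a provider standard.

```python
def do_real_work(event):
    # Placeholder for the endpoint's actual business logic.
    return {"status": 200, "body": f"hello {event.get('name', 'world')}"}


def handler(event, context=None):
    # Assumed convention: scheduled pings carry {"warmup": true}.
    # Returning early keeps the container warm without side effects.
    if event.get("warmup"):
        return {"status": 204, "warmed": True}
    return do_real_work(event)
```

A ping keeps only the container that receives it warm, so this approach does not guarantee capacity for concurrent traffic the way provisioned instances do.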

Runtime selection

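Interpreted runtimes (Python, Ruby) usually cold-start faster than JIT-compiled ones such as Java or .NET, which must boot a virtual machine first. Ahead-of-time compiled languages like Go and Rust tend to start fastest of all. If you use them, build slim, stripped binaries and avoid large static assets in the deployment package. Every megabyte counts during the fetch-and-unpack phase.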

Observability

Tag every invocation with a correlation ID. Log init duration separately from handler duration. Set alarms on p95 cold-start latency so you catch regressions before users do.
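One way to capture these signals, sketched in Python under a few assumptions: the log is one JSON object per invocation, the field names (correlation_id, cold_start, init_ms, handler_ms) are illustrative, and the log parameter exists only so the output can be captured in tests.

```python
import json
import time
import uuid

# Time the module-level setup, which runs once per container.
_INIT_START = time.perf_counter()
# ... expensive setup (clients, config, pools) would go here ...
INIT_MS = (time.perf_counter() - _INIT_START) * 1000.0

_cold = True  # True only until the first invocation in this container


def handler(event, context=None, log=print):
    global _cold
    start = time.perf_counter()
    # Reuse the caller's correlation ID if present, otherwise mint one.
    correlation_id = event.get("correlation_id") or str(uuid.uuid4())
    was_cold, _cold = _cold, False

    result = {"ok": True}  # placeholder for real work

    record = {
        "correlation_id": correlation_id,
        "cold_start": was_cold,
        "handler_ms": round((time.perf_counter() - start) * 1000.0, 3),
    }
    if was_cold:
        # Report init time separately, and only on the cold invocation.
        record["init_ms"] = round(INIT_MS, 3)
    log(json.dumps(record))
    return result
```

With init_ms and cold_start in every record, a p95 alarm on cold-start latency becomes a filter on cold_start plus a percentile over init_ms in whatever log or metrics backend you use.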
