
Concurrency best practices

Cap the number of concurrent operations with semaphores to avoid overwhelming APIs, databases, or local resources.

Python — asyncio.Semaphore

When you fire off hundreds of coroutines at once, an unbounded gather can exhaust file descriptors or trigger remote rate limits. Wrap each worker with a semaphore so only N tasks run concurrently.

import asyncio

import aiohttp  # third-party: pip install aiohttp

async def fetch(session, url, sem):
    async with sem:
        async with session.get(url) as resp:
            return await resp.json()

async def main(urls):
    sem = asyncio.Semaphore(8)          # max 8 concurrent
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, u, sem) for u in urls]
        results = await asyncio.gather(*tasks)
    return results

  • Semaphore(N) caps in-flight coroutines; excess callers wait at async with sem without blocking the event loop.
  • Prefer asyncio.BoundedSemaphore when you want release() to raise ValueError if it is called more times than acquire(), which catches mismatched releases.
  • Combine with asyncio.wait_for to add a per-task timeout so a stuck worker doesn't hold a slot forever.
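
The timeout and BoundedSemaphore points can be sketched together as follows. This is a minimal illustration, not a definitive implementation: run_limited and fake_worker are hypothetical names, and fake_worker stands in for a real HTTP call so the example runs without a network. The timeout here covers only the work itself, not the time spent waiting for a slot.

```python
import asyncio


async def run_limited(worker, arg, sem, timeout):
    """Run worker(arg) while holding a semaphore slot, giving up after `timeout` seconds."""
    async with sem:
        try:
            # On expiry, wait_for cancels the worker; leaving the
            # `async with` block then releases the slot.
            return await asyncio.wait_for(worker(arg), timeout)
        except asyncio.TimeoutError:
            return None  # assumed policy: report a timed-out task as None


async def fake_worker(delay):
    """Hypothetical stand-in for an I/O call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return delay


async def main():
    sem = asyncio.BoundedSemaphore(2)   # max 2 concurrent
    delays = [0.01, 5.0, 0.02, 0.01]    # the second worker is "stuck"
    jobs = [run_limited(fake_worker, d, sem, timeout=0.2) for d in delays]
    return await asyncio.gather(*jobs)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Returning None on timeout is only one option; re-raising the exception or passing return_exceptions=True to asyncio.gather may suit callers that need to distinguish failures.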

TypeScript — p-limit

In Node.js or the browser, p-limit is the idiomatic concurrency gate. pLimit(n) returns a limit function that queues the functions passed to it and starts them in FIFO order as slots free up.

import pLimit from 'p-limit';

const limit = pLimit(6);               // max 6 concurrent

const urls: string[] = [
  'https://api.example.com/a',
  'https://api.example.com/b',
  // ... hundreds more
];

const results = await Promise.all(
  urls.map((url) =>
    limit(async () => {
      const res = await fetch(url);
      return res.json();
    })
  )
);

  • limit(fn) returns a Promise that resolves with the return value of fn, so it is drop-in compatible with Promise.all.
  • The internal queue is unbounded; only execution is capped. If you need backpressure, pair with a producer-consumer pattern.
  • For Deno or environments without npm, a minimal 10-line semaphore using a promise chain achieves the same effect.

Choosing the right limit

CPU-bound work

Cap at the number of logical cores: os.cpu_count() in Python, navigator.hardwareConcurrency in the browser. Running more workers than cores adds context-switch overhead without improving throughput.
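
In Python, a semaphore on the event loop only limits concurrency; coroutines share one thread, so CPU-bound work needs separate processes to run in parallel. A minimal sketch, assuming the standard-library ProcessPoolExecutor is an acceptable way to do that; sum_of_squares and run_cpu_bound are hypothetical names used for illustration.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(n):
    """Hypothetical CPU-bound task: sum of i*i for i below n."""
    return sum(i * i for i in range(n))


def run_cpu_bound(inputs):
    """Map a CPU-bound function over inputs with one worker process per core."""
    workers = os.cpu_count() or 1  # os.cpu_count() can return None
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results
        return list(pool.map(sum_of_squares, inputs))


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module.
    print(run_cpu_bound([10, 100, 1000]))
```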

I/O-bound work

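Start with 2× the number of CPU cores, then tune: raise the limit while latency stays flat, and lower it when latency climbs or the remote starts returning 429 (Too Many Requests). Check the HTTP client's connection pool size as well; many clients default to 10–20 connections, and a semaphore limit above the pool size typically will not increase the number of requests actually in flight.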
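
The starting points above can be written down as a small helper. This is a sketch under stated assumptions: starting_limit and tune are hypothetical names, and the tuning policy (halve on a 429, otherwise add one slot up to a ceiling) is an assumed example rather than a rule from any library.

```python
import os


def starting_limit(kind: str) -> int:
    """Initial concurrency cap: cores for CPU-bound work, 2x cores for I/O-bound."""
    cores = os.cpu_count() or 1  # os.cpu_count() can return None
    if kind == "cpu":
        return cores
    if kind == "io":
        return 2 * cores
    raise ValueError(f"unknown workload kind: {kind!r}")


def tune(limit: int, saw_429: bool, ceiling: int) -> int:
    """One tuning step (assumed policy): back off hard on a 429, else probe upward."""
    if saw_429:
        return max(1, limit // 2)    # never drop below one slot
    return min(ceiling, limit + 1)   # never exceed the ceiling, e.g. the pool size
```

A caller might apply tune between batches and create a fresh semaphore with the new limit, since asyncio.Semaphore has no public method for resizing.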