Concurrency best practices
Rate-limit concurrent operations with semaphores to avoid overwhelming APIs, databases, or local resources.
Python — asyncio.Semaphore
When you fire off hundreds of coroutines at once, an unbounded gather can exhaust file descriptors or trigger remote rate limits. Wrap each worker with a semaphore so only N tasks run concurrently.
import asyncio

import aiohttp

async def fetch(session, url, sem):
    # Hold a semaphore slot for the whole request/response cycle.
    async with sem:
        async with session.get(url) as resp:
            return await resp.json()

async def main(urls):
    sem = asyncio.Semaphore(8)  # max 8 concurrent requests
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, u, sem) for u in urls]
        results = await asyncio.gather(*tasks)
    return results
- Semaphore(N) caps in-flight coroutines; excess callers suspend at the async with sem statement until a slot frees up.
- Prefer asyncio.BoundedSemaphore when you want a hard upper bound that catches mismatched releases.
- Combine with asyncio.wait_for to add a per-task timeout so a stuck worker doesn't hold a slot forever.
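The last two points can be sketched together. This is a minimal illustration, not a production wrapper: the helper names (limited, demo, fast, stuck) and the timeout values are made up for the example, and asyncio.sleep stands in for real network calls. The timeout is applied inside the slot, so a timed-out worker is cancelled and its slot is released.

```python
import asyncio


async def limited(coro_fn, sem, timeout):
    """Run coro_fn() inside a semaphore slot, giving up after `timeout` seconds."""
    async with sem:
        # wait_for cancels the inner coroutine on timeout; leaving the
        # `async with` block then releases the slot for the next caller.
        return await asyncio.wait_for(coro_fn(), timeout)


async def demo():
    sem = asyncio.BoundedSemaphore(2)  # hard cap of 2 in-flight workers

    async def fast():
        await asyncio.sleep(0.01)  # stand-in for a quick request
        return "ok"

    async def stuck():
        await asyncio.sleep(60)  # stand-in for a hung request
        return "never"

    results = await asyncio.gather(
        limited(fast, sem, timeout=0.5),
        limited(stuck, sem, timeout=0.05),
        limited(fast, sem, timeout=0.5),
        return_exceptions=True,  # collect the timeout instead of raising
    )

    # All slots are free again, so an unmatched release() is a bug.
    # BoundedSemaphore reports it; a plain Semaphore would silently grow.
    try:
        sem.release()
        extra_release_rejected = False
    except ValueError:
        extra_release_rejected = True

    return results, extra_release_rejected


results, extra_release_rejected = asyncio.run(demo())
```

Here the stuck worker surfaces as a timeout exception in results while the other two calls complete normally, and the stray release() raises ValueError.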
TypeScript — p-limit
In Node.js or the browser, p-limit is a widely used concurrency gate. It returns a function that queues callers and starts them in FIFO order as slots free up.
import pLimit from 'p-limit';

const limit = pLimit(6); // max 6 concurrent

const urls: string[] = [
  'https://api.example.com/a',
  'https://api.example.com/b',
  // ... hundreds more
];

const results = await Promise.all(
  urls.map((url) =>
    limit(async () => {
      const res = await fetch(url);
      return res.json();
    })
  )
);
- limit(fn) returns a Promise that resolves with the return value of fn, so it is drop-in compatible with Promise.all.
- The internal queue is unbounded; only execution is capped. If you need backpressure, pair with a producer-consumer pattern.
- For Deno or environments without npm, a minimal 10-line semaphore using a promise chain achieves the same effect.
Choosing the right limit
CPU-bound work
Cap at os.cpu_count() or navigator.hardwareConcurrency. Oversubscribing adds context-switch overhead with zero throughput gain.
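As a rough Python illustration of the CPU-bound cap, the sketch below sizes a process pool from os.cpu_count(). The function names (cpu_worker_count, checksum, run_all) are hypothetical, and checksum is only a stand-in for real CPU-heavy work. It uses processes rather than coroutines on the assumption that the work is pure computation, which an asyncio semaphore alone would not parallelize.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def cpu_worker_count():
    # os.cpu_count() can return None when the count is undetermined,
    # so fall back to a single worker.
    return os.cpu_count() or 1


def checksum(n):
    # Stand-in for CPU-bound work: sum of squares below n.
    return sum(i * i for i in range(n))


def run_all(inputs):
    # One worker per core; extra workers would only add context switches.
    with ProcessPoolExecutor(max_workers=cpu_worker_count()) as pool:
        return list(pool.map(checksum, inputs))
```

Call run_all from under an if __name__ == "__main__": guard, because platforms that spawn worker processes re-import the main module.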
I/O-bound work
Start with 2× the number of CPU cores, then tune based on observed latency and HTTP 429 responses from the remote service. Many HTTP clients ship with a default pool size of 10–20, so check your client's setting before raising the limit past it.
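One way to make "tune based on observed latency and 429s" concrete is a small adjustment rule applied between batches. The sketch below is a hypothetical heuristic, not a standard algorithm: the function names, the halving step, the one-slot probe, and the ceiling of 64 are all assumptions chosen for illustration.

```python
import os


def initial_io_limit():
    # Starting point from the guideline above: 2x the CPU core count.
    return 2 * (os.cpu_count() or 1)


def next_limit(current, rate_limited, p95_latency, target_latency, ceiling=64):
    """Suggest the limit for the next batch from the last batch's stats.

    rate_limited   -- number of 429 responses seen in the last batch
    p95_latency    -- observed 95th-percentile latency, in seconds
    target_latency -- latency you are willing to tolerate, in seconds
    """
    if rate_limited > 0:
        return max(1, current // 2)   # remote pushed back: halve
    if p95_latency > target_latency:
        return max(1, current - 1)    # getting slow: ease off by one
    return min(ceiling, current + 1)  # healthy: probe one more slot
```

Because an asyncio.Semaphore cannot be resized once created, a rule like this fits best between batches, creating a fresh semaphore with the new limit each time.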