1. The two rings
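io_uring exposes a Submission Queue (SQ) and a Completion Queue (CQ), two ring buffers in memory shared between the kernel and the application. The application writes Submission Queue Entries (SQEs) and the kernel writes Completion Queue Entries (CQEs). A whole batch of operations can be submitted with a single io_uring_enter(2) call, and completions can be read straight from the CQ without any syscall. Only in SQPOLL mode can a busy submission loop stay entirely in userspace.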
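The head/tail bookkeeping behind such a shared ring can be sketched in a few lines of C. This is a simplified model, not the kernel's actual layout: `toy_ring`, `toy_ring_push`, `toy_ring_pop` and the 8-entry size are hypothetical names and values chosen for illustration, and the sketch assumes exactly one producer and one consumer.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u                    /* assumed size; must be a power of two */
#define RING_MASK    (RING_ENTRIES - 1u)

/* Hypothetical, simplified stand-in for one shared ring. */
struct toy_ring {
    _Atomic uint32_t head;                 /* advanced only by the consumer */
    _Atomic uint32_t tail;                 /* advanced only by the producer */
    uint64_t slots[RING_ENTRIES];
};

/* Producer side: returns false when the ring is full. */
bool toy_ring_push(struct toy_ring *r, uint64_t value)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == RING_ENTRIES)       /* unsigned wrap-around is intended */
        return false;

    r->slots[tail & RING_MASK] = value;
    /* Release store: the slot contents become visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
bool toy_ring_pop(struct toy_ring *r, uint64_t *out)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail)
        return false;

    *out = r->slots[head & RING_MASK];
    /* Release store: the slot is handed back only after it has been read. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

In this model the application would be the producer on the SQ and the consumer on the CQ, with the kernel in the opposite roles. Neither side needs a lock, because each index has exactly one writer.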
Setup happens through io_uring_setup(2), which returns a file descriptor that you mmap(2) to obtain the rings and the SQE array. Most code uses liburing, whose io_uring_queue_init() performs both steps, so you rarely touch the raw syscall.
2. A minimal accept-read-write loop
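The following sketch shows the first step for a single connection: submit an accept and wait for its completion. Reads and writes follow the same prepare, submit, wait pattern. In production you would chain SQEs with IOSQE_IO_LINK and reuse fixed buffers, but the shape stays identical.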
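    struct io_uring ring;
    io_uring_queue_init(256, &ring, 0);    /* 256 entries, no flags; returns -errno on failure */

    /* Queue one accept and tag it so the completion can be identified. */
    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);    /* NULL when the SQ is full */
    io_uring_prep_accept(sqe, listen_fd, NULL, NULL, 0);
    io_uring_sqe_set_data(sqe, (void *)(uintptr_t)ACCEPT_OP);
    io_uring_submit(&ring);

    /* Block until the kernel posts the completion. */
    struct io_uring_cqe *cqe;
    io_uring_wait_cqe(&ring, &cqe);
    int client_fd = cqe->res;              /* new socket fd, or -errno on failure */
    io_uring_cqe_seen(&ring, cqe);         /* mark the CQE consumed so its slot can be reused */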
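Once reads and writes join the accept, each completion has to be matched to its operation and its socket. One possible approach, sketched here under assumptions, is to pack an operation tag and the file descriptor into the 64-bit user_data value. `READ_OP`, `WRITE_OP` and the three helper functions are hypothetical names, and the numeric value given to `ACCEPT_OP` is an assumption, since the original does not define it.

```c
#include <stdint.h>

/* Operation tags. READ_OP and WRITE_OP are hypothetical additions, and the
 * numeric values (including ACCEPT_OP's) are assumed for illustration. */
enum op_kind { ACCEPT_OP = 1, READ_OP = 2, WRITE_OP = 3 };

/* Pack the tag into the upper 32 bits and the fd into the lower 32 bits. */
uint64_t pack_user_data(enum op_kind kind, int fd)
{
    return ((uint64_t)kind << 32) | (uint32_t)fd;
}

/* Recover the operation tag from a completion's user_data. */
enum op_kind user_data_kind(uint64_t user_data)
{
    return (enum op_kind)(user_data >> 32);
}

/* Recover the file descriptor (assumed non-negative) from user_data. */
int user_data_fd(uint64_t user_data)
{
    return (int)(uint32_t)user_data;
}
```

With a recent liburing, the packed value could be attached with io_uring_sqe_set_data64() and read back with io_uring_cqe_get_data64(). A completion handler would then switch on user_data_kind() to decide whether to queue a read, a write, or another accept.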
3. Pitfalls and tuning
- Buffers must stay valid until the matching CQE arrives, not just until submission returns. The kernel reads or writes them asynchronously, after the syscall has returned.
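- SQPOLL removes the submit syscall while its poller thread is awake, but it dedicates a kernel thread to the ring, and that thread sleeps after sq_thread_idle milliseconds and must then be woken with io_uring_enter(2). Budget CPU time for it.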
- Use registered file descriptors to skip the per-operation file refcount, and fixed (registered) buffers to skip per-operation page pinning.
- Kernel versions matter: multishot accept landed in 5.19 and zero-copy send in 6.0.
- Always check cqe->res for negative errno values before trusting the result.
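
The last point can be sketched as a small helper. `check_cqe_res` is a hypothetical name, not part of liburing. It takes the raw `cqe->res` value and assumes the usual encoding, where a negative value is a negated errno and anything else is the operation's result.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: interpret a completion result.
 * Returns the result unchanged on success, or -1 with errno set on failure.
 * `what` is a short label used only in the error message.
 */
int check_cqe_res(int res, const char *what)
{
    if (res < 0) {
        fprintf(stderr, "%s failed: %s\n", what, strerror(-res));
        errno = -res;          /* set last, since fprintf may change errno */
        return -1;
    }
    return res;                /* e.g. a new fd for accept, a byte count for read */
}
```

For the accept shown earlier, this would be called as check_cqe_res(cqe->res, "accept") before io_uring_cqe_seen(). Note that a result of 0 is not an error: for a read on a socket it normally means the peer closed the connection.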