Direct permalink: https://github.com/axboe/liburing/wiki/io_uring-and-networking-in-2023/a6b20fcee88b253eb7dd8240e3c6535c4d32de72
Some Q&A with someone who appears to be the author can be seen in the LWN comments: https://lwn.net/Articles/923369/
i played with io_uring through tokio in rust (… a year ago or more?), shit was so fast. i kept thinking something was broken in my benchmarking code. i still don’t believe it tbh
Does anyone know the history of this API? Part of my background is in virtualisation, including paravirtualised device implementation, and the description here feels a lot like how the guest-host communication in PV devices is typically implemented. I’m wondering if io_uring was directly inspired by those designs, or if it’s essentially convergent evolution because the constraints are broadly similar. (VM exits are somewhat more expensive than user-kernel context switches, so it makes sense this design would have evolved for VMs first, but ultimately both kinds of context switches begin to dominate the performance characteristics once you’ve taken care of the lower hanging fruit.)
I think it’s convergent evolution - both modern network (NIC ring buffers) and storage (NVMe) hardware interfaces look a lot like this too.
There’s a lot of related work: POSIX’s lio_listio (which has a terrible implementation in Linux, but is fast elsewhere), QNX asynchronous system calls, and so on. Kernel-bypass interfaces like netmap, DPDK, SPDK, and GPU drivers are also similar.
The key novel(ish - there have been a few papers with similar ideas) thing in io_uring is that it removes the need for global serialisation on a file descriptor table, which POSIX mandates for everything else. This is a huge win for things doing large numbers of small operations (e.g. open socket, send one message, close socket).
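To make the serialisation point concrete, here is a minimal sketch (Python, assuming a POSIX system and a single-threaded process) of the rule that forces it: POSIX requires calls like open() and dup() to return the lowest-numbered free descriptor, so every allocation has to consult one shared per-process table. The helper name `lowest_fd_is_reused` is made up for illustration, and the sketch only demonstrates the POSIX rule, not io_uring itself.

```python
import os

def lowest_fd_is_reused():
    """Return (closed_fd, new_fd) to illustrate POSIX's lowest-free-fd rule.

    Assumes a POSIX system and that no other thread opens or closes
    descriptors while this runs.
    """
    r, w = os.pipe()        # r takes the lowest free slot, w the next free one
    os.close(r)             # r is now the lowest free slot again
    new_fd = os.dup(w)      # POSIX: dup() must hand back the lowest free fd
    try:
        return r, new_fd
    finally:
        # Clean up both remaining descriptors.
        os.close(w)
        os.close(new_fd)

if __name__ == "__main__":
    closed_fd, new_fd = lowest_fd_is_reused()
    print(closed_fd, new_fd)
```

Because the result of each allocation depends on every earlier open and close in the process, concurrent threads cannot hand out descriptors independently, which is the global serialisation described above.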
The no-fd and command queueing aspects are reminiscent of Mach messaging, which runs much of the show on Apple’s OSes, but of course has its origins quite a few decades back. (And Apple platforms don’t currently have anything equivalent to io_uring for file and socket I/O specifically; those still fall squarely into the “BSD subsystem”. FreeBSD’s kevent is as good as it gets on macOS/iOS/…, and Apple’s “user space networking” effort is currently only available via quite high level APIs for HTTP and such.)
There’s arguably prior art in the form of how VMS did system calls (AST).
Very very exciting stuff. I think this series of blog posts is a good intro to what kinds of benefits uring can bring: https://idea.popcount.org/2017-01-06-select-is-fundamentally-broken/