io_uring/net: allow coalescing of mapped segments io_uring-net-coalesce
author Jens Axboe <axboe@kernel.dk>
Thu, 8 Aug 2024 16:47:03 +0000 (10:47 -0600)
committer Jens Axboe <axboe@kernel.dk>
Wed, 18 Sep 2024 10:45:40 +0000 (04:45 -0600)
commit 15bf9fc8cfaf274f3355bbe89caf8449454339a6
tree abe295fba350d1f0c926ec8608f0c3f8df768875
parent 4894a9674c8be0fc25b2a44d9e990a0a7afd1e11
io_uring/net: allow coalescing of mapped segments

For bundles, when multiple buffers are selected, it's not unlikely
that some/all of them will be virtually contiguous. If these segments
aren't big, then nice wins can be reaped by coalescing them into
bigger segments. This makes networking copies more efficient, and
reduces the number of iterations that need to be done over an iovec.
Ideally, multiple segments that would've been mapped as an ITER_IOVEC
before can now be mapped into a single ITER_UBUF iterator.
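
The coalescing idea can be sketched in userspace C as below. This is a
minimal illustration only, not the code from this commit: the helper
name coalesce_iovecs() is hypothetical, and it assumes the selected
buffers are already described by an array of struct iovec in the order
they will be used. Two neighbouring segments are merged when the second
one starts at the exact address where the first one ends.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Merge virtually contiguous neighbours in place and return the new
 * segment count. Hypothetical helper for illustration; the kernel
 * implementation in io_uring/kbuf.c and io_uring/net.c may differ.
 */
static size_t coalesce_iovecs(struct iovec *iov, size_t nr)
{
	size_t out = 0;	/* index of the last segment kept so far */
	size_t i;

	if (nr == 0)
		return 0;

	for (i = 1; i < nr; i++) {
		uintptr_t end = (uintptr_t)iov[out].iov_base + iov[out].iov_len;

		if (end == (uintptr_t)iov[i].iov_base) {
			/* next buffer starts where the kept one ends: extend it */
			iov[out].iov_len += iov[i].iov_len;
		} else {
			/* gap or unrelated address: start a new segment */
			iov[++out] = iov[i];
		}
	}

	return out + 1;
}
```

If the returned count is 1, the whole selection is one contiguous
range, which corresponds to the case where a single-buffer iterator
could be used instead of walking an iovec.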

Example from an io_uring network backend receiving data, with various
transfer sizes, over a 100G network link.

recv size    coalesce    threads    bw          cpu usage    bw diff
=====================================================================
64             0           1       23GB/sec       100%
64             1           1       46GB/sec        79%        +100%
64             0           4       81GB/sec       370%
64             1           4       96GB/sec       160%        + 20%
256            0           1       44GB/sec        90%
256            1           1       47GB/sec        48%        +  7%
256            0           4       90GB/sec       190%
256            1           4       96GB/sec       120%        +  7%
1024           0           1       49GB/sec        60%
1024           1           1       50GB/sec        53%        +  2%
1024           0           4       94GB/sec       140%
1024           1           4       96GB/sec       120%        +  2%

where small buffer sizes obviously benefit the most, but an
efficiency gain is seen at larger buffer sizes as well.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_uring/kbuf.c
io_uring/kbuf.h
io_uring/net.c
io_uring/net.h