io_uring/uring_cmd: defer SQE copying until it's needed
author Jens Axboe <axboe@kernel.dk>
Wed, 20 Mar 2024 21:23:47 +0000 (15:23 -0600)
committer Jens Axboe <axboe@kernel.dk>
Tue, 26 Mar 2024 17:09:07 +0000 (11:09 -0600)
commit 31e2d7fe8a60bfb4fa04009f1b21f0ed67e2359b
tree be70172aa4269a3e5267708477845e20285dcb7f
parent 236b1d77ef94df72e605b01874b526ad83265a8b

The previous commit turned on async data for uring_cmd, and did the
basic conversion of setting everything up on the prep side. However, for
a lot of use cases, -EIOCBQUEUED will get returned on issue, as the
operation was successfully queued. In that case, a persistent SQE isn't
needed, as the SQE is only used for issue.

Unless execution goes async immediately, defer copying the double SQE
until it's necessary.
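
As a rough illustration of the idea (not the code from this patch), the
deferred copy can be modelled in plain userspace C. Every name below
(demo_sqe, demo_req, demo_prep, demo_issue, demo_stabilize) is hypothetical,
DEMO_EIOCBQUEUED mirrors the kernel-internal errno value, and the sketch
assumes that a retry is signalled by -EAGAIN:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Kernel-internal value; userspace <errno.h> does not define it. */
#define DEMO_EIOCBQUEUED 529

/* Hypothetical stand-in for a 128-byte "double" SQE. */
struct demo_sqe {
	unsigned char bytes[128];
};

/* Hypothetical request: sqe points either into the ring or at copy. */
struct demo_req {
	const struct demo_sqe *sqe;
	struct demo_sqe copy;	/* persistent storage (the async data) */
	unsigned int copies;	/* how many copies were made, for illustration */
};

/* Copy the SQE into persistent storage, unless that was already done. */
static void demo_stabilize(struct demo_req *req)
{
	if (req->sqe == &req->copy)
		return;
	memcpy(&req->copy, req->sqe, sizeof(req->copy));
	req->sqe = &req->copy;
	req->copies++;
}

/* Prep: only copy up front if the request goes async immediately. */
static int demo_prep(struct demo_req *req, const struct demo_sqe *ring_sqe,
		     bool force_async)
{
	assert(req && ring_sqe);
	req->sqe = ring_sqe;
	req->copies = 0;
	if (force_async)
		demo_stabilize(req);
	return 0;
}

/*
 * Issue: issue_ret stands in for whatever the command handler returned.
 * Only a retry (assumed here to be -EAGAIN) needs the SQE to outlive
 * the ring slot; a queued command (-DEMO_EIOCBQUEUED) does not.
 */
static int demo_issue(struct demo_req *req, int issue_ret)
{
	if (issue_ret == -EAGAIN)
		demo_stabilize(req);
	return issue_ret;
}
```

In this model, a command that is queued at issue never pays for the
memcpy(), while one that has to be retried gets its copy exactly once.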

This greatly reduces the overhead of such commands, as evidenced by
a perf diff from before and after this change:

    10.60%     -8.58%  [kernel.vmlinux]  [k] io_uring_cmd_prep

where the prep side drops from 10.60% to ~2%, which is closer to what is
expected. Performance also rises from ~113M IOPS to ~122M IOPS, bringing
it back to where it was before the async command prep.

Tested-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_uring/uring_cmd.c