io_uring: get rid of intermediate aux cqe caches

With DEFER_TASKRUN we store aux CQEs in a cache array and flush them
into the CQ later, maintaining the ordering so that aux CQEs are flushed
before request completions. Why do we need the cache instead of pushing
them directly? We actually don't, so let's kill it.

One nuance is synchronisation: the path we touch here is only used for
DEFER_TASKRUN and is guaranteed to be executed in the task context,
which serialises all CQE posting. For the same reason we don't need
locks, see __io_cq_lock().
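
To illustrate the ordering argument, here is a toy userspace model, not
the kernel code. All names in it (toy_cq, toy_post_cqe(),
toy_complete_multishot(), TOY_CQE_F_MORE) are hypothetical. It assumes a
single poster, as in the DEFER_TASKRUN case, so it takes no lock, and it
simply fails when the ring is full instead of modelling overflow
handling. Writing aux CQEs straight into the ring, in program order,
already puts them ahead of the final completion of the request, so an
intermediate cache buys nothing for ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CQ_ENTRIES	8u		/* must be a power of two */
#define TOY_CQE_F_MORE	(1u << 0)	/* hypothetical "more CQEs follow" flag */

struct toy_cqe {
	uint64_t user_data;
	int32_t res;
	uint32_t flags;
};

struct toy_cq {
	struct toy_cqe cqes[TOY_CQ_ENTRIES];
	uint32_t head;	/* consumer index, free running */
	uint32_t tail;	/* producer index, free running */
};

/*
 * Post one CQE directly into the ring. There is no lock: the model
 * assumes every caller runs in the same task, so posting is already
 * serialised. Returns 1 on success, 0 if the ring is full.
 */
static int toy_post_cqe(struct toy_cq *cq, uint64_t user_data,
			int32_t res, uint32_t flags)
{
	struct toy_cqe *cqe;

	if (cq->tail - cq->head == TOY_CQ_ENTRIES)
		return 0;

	cqe = &cq->cqes[cq->tail & (TOY_CQ_ENTRIES - 1)];
	cqe->user_data = user_data;
	cqe->res = res;
	cqe->flags = flags;
	cq->tail++;
	return 1;
}

/*
 * Model a request that emits n_aux aux CQEs followed by its final
 * completion. Each aux CQE goes straight into the ring, so it lands
 * before the final completion without any intermediate cache.
 */
static int toy_complete_multishot(struct toy_cq *cq, uint64_t user_data,
				  int n_aux)
{
	int i;

	assert(n_aux >= 0);
	for (i = 0; i < n_aux; i++) {
		if (!toy_post_cqe(cq, user_data, i, TOY_CQE_F_MORE))
			return 0;
	}
	return toy_post_cqe(cq, user_data, 0, 0);
}
```

Calling toy_complete_multishot() with n_aux == 2 on an empty ring leaves
two entries flagged TOY_CQE_F_MORE in slots 0 and 1 and the final
completion in slot 2.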
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/935d517f0e71218bfc1d40352a4754abb610176d.1709224453.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>