io_uring: lock-free task_work stack
Instead of keeping task_work items in a list, keep them in a lock-free
stack. We would still like to keep the ordering guarantees, so reverse
the list upon execution in io_uring_task_work_run().
First, each tw add replaces a spin_lock/unlock_irq() pair with a single
cmpxchg(). The same holds on the execution side, but per batch. It also
kills the final lock/unlock at the end of io_uring_task_work_run().
The main downside is that we need to reverse the tw list on execution,
which is not cache-friendly.
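
For illustration only, a minimal userspace sketch of the scheme described
above, written with C11 atomics rather than the kernel's cmpxchg()/xchg().
struct tw_node, struct tw_stack, tw_push(), tw_take_all() and tw_reverse()
are hypothetical names and are not the ones used in the patch; memory
ordering and IRQ context handling in the kernel may differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a task_work item; not the kernel's struct. */
struct tw_node {
	struct tw_node *next;
	int id;
};

/* Hypothetical stack head; a NULL pointer means the stack is empty. */
struct tw_stack {
	_Atomic(struct tw_node *) first;
};

/* Producer side: push one item with a compare-and-swap loop, no lock. */
static void tw_push(struct tw_stack *s, struct tw_node *node)
{
	struct tw_node *head;

	assert(node != NULL);
	head = atomic_load_explicit(&s->first, memory_order_relaxed);
	do {
		/* On CAS failure 'head' is refreshed, so relink and retry. */
		node->next = head;
	} while (!atomic_compare_exchange_weak_explicit(&s->first, &head, node,
							memory_order_release,
							memory_order_relaxed));
}

/* Consumer side: detach the whole batch with one atomic exchange. */
static struct tw_node *tw_take_all(struct tw_stack *s)
{
	return atomic_exchange_explicit(&s->first, (struct tw_node *)NULL,
					memory_order_acquire);
}

/* The detached batch is newest-first; reverse it to get queueing order. */
static struct tw_node *tw_reverse(struct tw_node *node)
{
	struct tw_node *prev = NULL;

	while (node) {
		struct tw_node *next = node->next;

		node->next = prev;
		prev = node;
		node = next;
	}
	return prev;
}
```

In this sketch, pushing items 1, 2, 3 and then calling tw_take_all()
yields them as 3, 2, 1; tw_reverse() restores 1, 2, 3 at the cost of one
extra walk over the batch, which is the cache downside mentioned above.
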
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c7c3d1a6d7a038f414658314eeeadbbd186c1435.1650548192.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>