io_uring: support bio caching for non-polled IO

Mark the kiocb with IOCB_ALLOC_CACHE even for non-polled IO, in case
the lower layer participates in per-cpu bio caching. If it does, then
IOCB_PUT_CACHE will be set upon kiocb->ki_complete() invocation,
passing ownership of the bio to io_uring.

io_uring never completes requests from IRQ context, not even IRQ driven
ones, so we can safely put the bio when we run the actual io_kiocb
completion.
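
As a rough illustration of the handoff described above, here is a small
userspace model, not the kernel code. The flag values, the bio field in
struct kiocb, and the helpers lower_layer_complete(), task_complete()
and cache_put() are all hypothetical names invented for this sketch; a
single global list stands in for the per-cpu cache.

```c
#include <stddef.h>

/* Hypothetical flag values for this model, not the kernel's definitions. */
#define IOCB_ALLOC_CACHE (1u << 0) /* submitter permits bio caching */
#define IOCB_PUT_CACHE   (1u << 1) /* lower layer handed the bio back */

struct bio {
	struct bio *next;
};

struct kiocb {
	unsigned int ki_flags;
	struct bio *bio; /* hypothetical: bio handed back on completion */
};

/* Stand-in for the per-cpu bio cache: one global free list. */
static struct bio *bio_cache;
static int cache_count;

static void cache_put(struct bio *bio)
{
	bio->next = bio_cache;
	bio_cache = bio;
	cache_count++;
}

/*
 * Models the lower layer's completion path, which may run in IRQ
 * context. It does not touch the cache; it only marks the kiocb and
 * hands the bio over. Without IOCB_ALLOC_CACHE the bio would be freed
 * the usual way, which this model leaves out.
 */
static void lower_layer_complete(struct kiocb *kiocb, struct bio *bio)
{
	if (kiocb->ki_flags & IOCB_ALLOC_CACHE) {
		kiocb->ki_flags |= IOCB_PUT_CACHE;
		kiocb->bio = bio;
	}
}

/*
 * Models the deferred io_uring completion, which runs outside IRQ
 * context, so putting the bio into the cache is safe here.
 */
static void task_complete(struct kiocb *kiocb)
{
	if (kiocb->ki_flags & IOCB_PUT_CACHE) {
		cache_put(kiocb->bio);
		kiocb->bio = NULL;
		kiocb->ki_flags &= ~IOCB_PUT_CACHE;
	}
}
```

In this model, calling lower_layer_complete() on a kiocb marked with
IOCB_ALLOC_CACHE leaves the cache untouched, and the bio only lands on
the free list once task_complete() runs.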

This provides a 5-10% boost in IOPS with IRQ driven IO.

Signed-off-by: Jens Axboe <axboe@kernel.dk>