io_uring/chan: cache consumer head loads
Posting a message on the channel currently requires reading the
consumer's head on every post to know how far along it is. But in
practice, this only needs to be done when the tail has caught up with
the last head value that was read.
Initialize cached_head to the ring size, and check the tail against the
cached head when posting an event. Once the cached entries are used up,
do a proper c->head read and update cached_head again.
This greatly reduces the cross-CPU cacheline traffic on the posting
side, by avoiding pulling in the consumer ring head entry until it's
required.
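As a rough illustration of the scheme described above, here is a minimal
userspace sketch in C11. It is not the kernel code from this patch: the
struct layout, RING_ENTRIES, the chan_init/chan_post/chan_recv helpers
and the head_loads counter are all hypothetical and exist only to show
when the shared head actually gets read. Only cached_head and the
c->head read are taken from the description.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 4u /* hypothetical ring size for the example */

static_assert((RING_ENTRIES & (RING_ENTRIES - 1)) == 0,
	      "ring size must be a power of two");

/* Hypothetical single-producer, single-consumer channel. */
struct chan {
	_Atomic uint32_t head;	/* written by the consumer only */
	_Atomic uint32_t tail;	/* written by the producer only */
	uint32_t cached_head;	/* producer-private: last head seen + ring size */
	uint32_t head_loads;	/* illustration only: counts shared head reads */
	uint64_t data[RING_ENTRIES];
};

static void chan_init(struct chan *c)
{
	atomic_init(&c->head, 0);
	atomic_init(&c->tail, 0);
	/* head starts at 0, so the producer may fill the whole ring */
	c->cached_head = RING_ENTRIES;
	c->head_loads = 0;
}

/* Producer side: returns false if the ring is full. */
static bool chan_post(struct chan *c, uint64_t val)
{
	uint32_t tail = atomic_load_explicit(&c->tail, memory_order_relaxed);

	if (tail == c->cached_head) {
		/* cached entries used up, do a proper c->head read */
		uint32_t head = atomic_load_explicit(&c->head,
						     memory_order_acquire);

		c->head_loads++;
		c->cached_head = head + RING_ENTRIES;
		if (tail == c->cached_head)
			return false;	/* consumer made no progress */
	}

	c->data[tail & (RING_ENTRIES - 1)] = val;
	/* publish the entry before the consumer can observe the new tail */
	atomic_store_explicit(&c->tail, tail + 1, memory_order_release);
	return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool chan_recv(struct chan *c, uint64_t *val)
{
	uint32_t head = atomic_load_explicit(&c->head, memory_order_relaxed);

	if (head == atomic_load_explicit(&c->tail, memory_order_acquire))
		return false;

	*val = c->data[head & (RING_ENTRIES - 1)];
	/* release the slot back to the producer */
	atomic_store_explicit(&c->head, head + 1, memory_order_release);
	return true;
}
```

In this sketch, the first RING_ENTRIES posts never touch the shared head.
It is only loaded once tail reaches cached_head, and each such load
either exposes the slots the consumer has freed since the last one or
reports the ring as full.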
Signed-off-by: Jens Axboe <axboe@kernel.dk>