blk-mq: allocate tags in batches
Instead of grabbing tags one by one, grab a batch and store the local
cache in the software queue. Then subsequent tag allocations can just
grab free tags from there, without having to hit the shared tag map.
The batches are flushed back out if we run out of tags on the hardware
queue. The intent is that this should rarely happen.
This works very well in practice, with batch counts of anywhere from 40
to 60 seen regularly in testing.
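
As a rough illustration of the scheme described above, here is a minimal
userspace sketch. It is not the code in this patch: the names
(struct hw_queue, hwq_alloc(), hwq_refill(), hwq_flush_all()), the fixed
64-tag bitmap and the two software queues are all assumptions made for
the example, and it is single-threaded with no locking or atomics.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CTX 2	/* hypothetical number of software queues */

/* Hypothetical stand-in for one hardware queue and its software queues. */
struct hw_queue {
	uint64_t free_mask;	/* shared tag map: bit set = tag is free */
	uint64_t cache[NR_CTX];	/* per-software-queue batch of reserved tags */
};

static void hwq_init(struct hw_queue *hq)
{
	hq->free_mask = ~0ULL;	/* 64 tags, all free */
	for (int i = 0; i < NR_CTX; i++)
		hq->cache[i] = 0;
}

/* Number of set bits in a mask. */
static int count_bits(uint64_t mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Clear and return the index of the lowest set bit; mask must be non-zero. */
static int pop_lowest(uint64_t *mask)
{
	int tag = 0;

	assert(*mask != 0);
	while (!(*mask & (1ULL << tag)))
		tag++;
	*mask &= ~(1ULL << tag);
	return tag;
}

/* Move up to 'batch' free tags from the shared map into one local cache. */
static int hwq_refill(struct hw_queue *hq, int ctx, int batch)
{
	int taken = 0;

	while (taken < batch && hq->free_mask) {
		hq->cache[ctx] |= 1ULL << pop_lowest(&hq->free_mask);
		taken++;
	}
	return taken;
}

/* Return every cached tag to the shared map. */
static void hwq_flush_all(struct hw_queue *hq)
{
	for (int i = 0; i < NR_CTX; i++) {
		hq->free_mask |= hq->cache[i];
		hq->cache[i] = 0;
	}
}

/* Allocate one tag for software queue 'ctx'; returns -1 if none are left. */
static int hwq_alloc(struct hw_queue *hq, int ctx, int batch)
{
	/* Fast path: the local cache has tags, the shared map is untouched. */
	if (!hq->cache[ctx] && !hwq_refill(hq, ctx, batch)) {
		/* Shared map is empty: flush all batches and retry once. */
		hwq_flush_all(hq);
		if (!hwq_refill(hq, ctx, batch))
			return -1;
	}
	return pop_lowest(&hq->cache[ctx]);
}

/* Completed tags go straight back to the shared map in this sketch. */
static void hwq_free(struct hw_queue *hq, int tag)
{
	hq->free_mask |= 1ULL << tag;
}
```

In this sketch only the first allocation on a software queue touches
free_mask; later ones are served from cache[ctx] until the batch is used
up. The flush only runs once the shared map is empty while another
software queue still holds unused tags in its batch.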
Signed-off-by: Jens Axboe <axboe@kernel.dk>