git.kernel.dk Git - linux-2.6-block.git/commit

author	Bart Van Assche <bvanassche@acm.org>
	Thu, 18 Jan 2024 17:53:34 +0000 (10:53 -0700)
committer	Jens Axboe <axboe@kernel.dk>
	Mon, 1 Apr 2024 22:03:34 +0000 (16:03 -0600)
commit	5509e9957621ed8148c8503096d94831298cc570
tree	e38c4f280f2efdb39700f25d7f4a517d91d93267
parent	eba8080cf98c21e5d8713f6782b0fd2d9c8fd998

block/mq-deadline: use separate insertion lists

Reduce lock contention on dd->lock by calling dd_insert_request() from
inside the dispatch callback instead of from the insert callback. This
patch is inspired by a patch from Jens.
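
As a rough userspace illustration of the idea (not the code from this
patch), the sketch below stages incoming requests on a separate
insertion list and only folds them into the sorted scheduler state from
the dispatch path. All names here (struct sched, sched_insert(),
sched_dispatch(), insert_lock) are hypothetical, and guarding the
staging list with its own small lock is an assumption of this sketch;
"lock" merely plays the role that dd->lock plays in mq-deadline.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a block request; not the kernel's struct request. */
struct request {
	unsigned long sector;
	struct request *next;
};

struct sched {
	pthread_mutex_t insert_lock;	/* protects only the staging list */
	struct request *insert_head;
	struct request *insert_tail;
	pthread_mutex_t lock;		/* plays the role of dd->lock */
	struct request *sorted;		/* requests ordered by sector */
};

void sched_init(struct sched *s)
{
	pthread_mutex_init(&s->insert_lock, NULL);
	pthread_mutex_init(&s->lock, NULL);
	s->insert_head = s->insert_tail = NULL;
	s->sorted = NULL;
}

/* Insert callback: append to the staging list, never touching s->lock. */
void sched_insert(struct sched *s, struct request *rq)
{
	assert(rq != NULL);
	rq->next = NULL;
	pthread_mutex_lock(&s->insert_lock);
	if (s->insert_tail)
		s->insert_tail->next = rq;
	else
		s->insert_head = rq;
	s->insert_tail = rq;
	pthread_mutex_unlock(&s->insert_lock);
}

/* The expensive sorted insertion; caller must hold s->lock. */
static void sorted_add(struct sched *s, struct request *rq)
{
	struct request **pp = &s->sorted;

	while (*pp && (*pp)->sector <= rq->sector)
		pp = &(*pp)->next;
	rq->next = *pp;
	*pp = rq;
}

/* Dispatch callback: grab the staged requests, then sort them in and
 * pick the next request while holding s->lock once. */
struct request *sched_dispatch(struct sched *s)
{
	struct request *staged, *rq;

	pthread_mutex_lock(&s->insert_lock);
	staged = s->insert_head;
	s->insert_head = s->insert_tail = NULL;
	pthread_mutex_unlock(&s->insert_lock);

	pthread_mutex_lock(&s->lock);
	while (staged) {
		rq = staged;
		staged = staged->next;
		sorted_add(s, rq);
	}
	rq = s->sorted;
	if (rq) {
		s->sorted = rq->next;
		rq->next = NULL;
	}
	pthread_mutex_unlock(&s->lock);
	return rq;
}
```

In this sketch, submitters only ever take the short-lived insert_lock,
so the heavier lock is acquired by dispatchers alone rather than by
every inserting thread.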

With the previous dispatch and merge optimizations, this drastically
reduces contention for a sample case of 32 threads doing IO to devices.
The test case looks as follows:

fio --bs=512 --group_reporting=1 --gtod_reduce=1 --invalidate=1 \
--ioengine=io_uring --norandommap --runtime=60 --rw=randread \
--thread --time_based=1 --buffered=0 --fixedbufs=1 --numjobs=32 \
--iodepth=4 --iodepth_batch_submit=4 --iodepth_batch_complete=4 \
--name=scaletest --filename=/dev/$DEV

Before:

Device     IOPS     sys     contention   diff
====================================================
null_blk   879K     89%     93.6%
nvme0n1    901K     86%     94.5%

and after this and the previous two patches:

Device     IOPS     sys     contention   diff
====================================================
null_blk   2867K    11.1%   ~6.0%        +226%
nvme0n1    3162K    9.9%    ~5.0%        +250%

which basically eliminates all of the lock contention; it is down to
more normal levels. The throughput increases show that nicely: IOPS more
than triple in both cases, a gain of well over 200%.
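
For reference, the values in the diff column are consistent with a
plain relative gain, (after - before) / before, applied to the IOPS
figures in the two tables. The helper below is only an illustrative
sketch; pct_gain() is a hypothetical name and not part of any tool used
for these measurements.

```c
/* Relative gain in percent when going from "before" to "after". */
double pct_gain(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Calling pct_gain(879, 2867) gives roughly 226 and pct_gain(901, 3162)
roughly 251, in line with the +226% and +250% reported above. In both
cases the after/before ratio exceeds 3.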

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
[axboe: expand commit message with more details and perf results]
Signed-off-by: Jens Axboe <axboe@kernel.dk>