writeback: throttle buffered writeback
Test patch that throttles buffered writeback so that it is a lot
smoother and has far less impact on other system activity.
Background writeback should be, by definition, background
activity. The fact that we flush huge bundles of it at a time
means that it potentially has heavy impacts on foreground workloads,
which isn't ideal. We can't easily limit the sizes of writes that
we do, since that would impact file system layout in the presence
of delayed allocation. So just throttle back buffered writeback,
unless someone is waiting for it.
This is just a test patch, and as such, it registers a queue sysfs
entry to monitor the current state:

$ cat /sys/block/nvme0n1/queue/wb_stats
idle=16, normal=32, max=64, inflight=0, wait=0, timer=0, bdp_wait=0

'idle' denotes how many requests we will allow inflight for idle
buffered writeback, 'normal' for higher priority writeback, and 'max'
for when it's urgent that we clean pages. The values are calculated based
on the queue depth of the device, and the 'wb_percent' setting. If
'wb_percent' is set to zero, the functionality is turned off.
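To illustrate how the three limits could follow from the queue depth and
'wb_percent', here is a minimal userspace sketch. The scaling is an
assumption, not taken from the patch: 'max' is wb_percent of the queue
depth, 'normal' is half of 'max', and 'idle' is a quarter. The names
wb_limits, at_least_one() and wb_calc_limits() are hypothetical.

```c
#include <assert.h>

/* Hypothetical container for the three limits reported in wb_stats. */
struct wb_limits {
	unsigned int idle;
	unsigned int normal;
	unsigned int max;
};

/* Keep a computed limit at 1 or more so writeback can always progress. */
static unsigned int at_least_one(unsigned int v)
{
	return v ? v : 1;
}

/*
 * Assumed scaling (not from the patch): max = depth * percent / 100,
 * normal = max / 2, idle = max / 4. A wb_percent of zero returns all
 * zeroes, meaning the throttling is turned off.
 */
static struct wb_limits wb_calc_limits(unsigned int queue_depth,
				       unsigned int wb_percent)
{
	struct wb_limits l = { 0, 0, 0 };

	assert(wb_percent <= 100);
	if (wb_percent == 0 || queue_depth == 0)
		return l;

	l.max = at_least_one(queue_depth * wb_percent / 100);
	l.normal = at_least_one(l.max / 2);
	l.idle = at_least_one(l.max / 4);
	return l;
}
```

Under these assumptions, a queue depth of 128 with wb_percent at 50 gives
idle=16, normal=32, max=64. The depth and percentage behind the sample
output above are not stated, so that match is only illustrative.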
'inflight' shows how many requests are currently inflight for buffered
writeback, 'wait' shows if anyone is currently waiting for access,
'timer' shows if we have processes being deferred by the write back
cache timeout, and 'bdp_wait' shows if someone is currently throttled on
this device in balance_dirty_pages().
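The way 'inflight' relates to the three limits can be sketched as a
simple admission check. This is an illustration only: the wb_urgency
enum, the wb_state struct and both helpers are hypothetical, and the
rule "issue only while inflight is below the limit for this urgency" is
an assumption about how the limits are applied.

```c
#include <stdbool.h>

/* Hypothetical urgency classes matching the three wb_stats limits. */
enum wb_urgency { WB_IDLE, WB_NORMAL, WB_URGENT };

/* Hypothetical snapshot of the counters shown in wb_stats. */
struct wb_state {
	unsigned int idle;
	unsigned int normal;
	unsigned int max;
	unsigned int inflight;
};

/* Pick the inflight limit that applies to this class of writeback. */
static unsigned int wb_limit_for(const struct wb_state *s, enum wb_urgency u)
{
	switch (u) {
	case WB_IDLE:
		return s->idle;
	case WB_NORMAL:
		return s->normal;
	default:
		return s->max;
	}
}

/*
 * Assumed rule: allow another buffered write only while 'inflight' is
 * below the limit. A limit of zero is treated as "throttling off".
 */
static bool wb_may_issue(const struct wb_state *s, enum wb_urgency u)
{
	unsigned int limit = wb_limit_for(s, u);

	if (limit == 0)
		return true;
	return s->inflight < limit;
}
```

With idle=16, normal=32, max=64 and 20 requests inflight, this check
would hold back idle writeback but still admit normal and urgent
writeback.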
Finally, if the device has write back caching, 'wb_cache_delay' sets how
many usecs we wait after a write completes before allowing more.
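The delay can be pictured as a time gate after each write completion. A
minimal sketch, assuming microsecond timestamps from any monotonic
clock; wb_cache_delay_elapsed() and its parameters are hypothetical and
not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true once at least wb_cache_delay_us microseconds have passed
 * since the last buffered write completed. A delay of zero means no
 * extra wait is applied.
 */
static bool wb_cache_delay_elapsed(uint64_t now_us,
				   uint64_t last_write_done_us,
				   uint64_t wb_cache_delay_us)
{
	if (wb_cache_delay_us == 0)
		return true;
	return now_us - last_write_done_us >= wb_cache_delay_us;
}
```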
It'd be nice to auto-tune 'wb_percent' based on device response. Flash
is less picky than rotating storage, but still needs throttling. For
flash storage, a wb_percent setting of 50% gives good read latencies
while still having good write bandwidth. For rotating storage, lower
settings (like 10-15%) are more reasonable.

Signed-off-by: Jens Axboe <axboe@fb.com>