git.kernel.dk Git - linux-2.6-block.git/commit

mm/filemap: make buffered writes work with RWF_UNCACHED

If RWF_UNCACHED is set for a write, mark new folios being written as
uncached. This is done by passing the fact that it's an uncached write
through the folio pointer. We can only get there when IOCB_UNCACHED was
allowed, which can only happen if the file system opts in. Opting in means
the file system needs to check the LSB of the folio pointer to know if
it's an uncached write or not. If it is, then FGP_UNCACHED should be used
if creating new folios is necessary.
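
As an illustration of the LSB signalling described above, here is a minimal
user-space sketch of pointer tagging. It is not the kernel code from this
patch: struct folio is a stand-in, and folio_ptr_tag(),
folio_ptr_is_tagged() and write_begin_wants_uncached() are hypothetical
names. It also assumes the hint is carried in the value stored in the folio
pointer slot, which the text above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct folio; only its alignment matters. */
struct folio {
	unsigned long flags;
};

/* Hypothetical tag bit: an aligned pointer always has a zero LSB. */
#define FOLIO_PTR_UNCACHED ((uintptr_t)1)

/* Caller side: fold the "uncached write" hint into the pointer value. */
static struct folio *folio_ptr_tag(struct folio *folio)
{
	/* The bit must be free, or the hint would be ambiguous. */
	assert(((uintptr_t)folio & FOLIO_PTR_UNCACHED) == 0);
	return (struct folio *)((uintptr_t)folio | FOLIO_PTR_UNCACHED);
}

/* Callee side: test the LSB without dereferencing the pointer. */
static bool folio_ptr_is_tagged(struct folio *folio)
{
	return ((uintptr_t)folio & FOLIO_PTR_UNCACHED) != 0;
}

/*
 * Roughly what an opted-in ->write_begin() might do first: read the hint
 * from the slot, then clear the slot so a real folio can be stored in it.
 */
static bool write_begin_wants_uncached(struct folio **foliop)
{
	bool uncached = folio_ptr_is_tagged(*foliop);

	*foliop = NULL;
	return uncached;
}
```

A caller would seed the slot with folio_ptr_tag(NULL) before invoking the
callback. A file system that has not opted in never sees a tagged value,
which is why the opt-in requirement above matters.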

Uncached writes will drop any folios they create upon writeback
completion, but leave folios that may exist in that range alone. Since
->write_begin() doesn't currently take any flags, and to avoid needing
to change the callback kernel wide, use the foliop being passed in to
->write_begin() to signal if this is an uncached write or not. File
systems can then use that to mark newly created folios as uncached.

Add a helper, generic_uncached_write(), that generic_file_write_iter()
calls upon successful completion of an uncached write.

This provides similar benefits to using RWF_UNCACHED with reads. Testing
buffered writes on 32 files:

writing bs 65536, uncached 0
  1s: 196035MB/sec, MB=196035
  2s: 132308MB/sec, MB=328147
  3s: 132438MB/sec, MB=460586
  4s: 116528MB/sec, MB=577115
  5s: 103898MB/sec, MB=681014
  6s: 108893MB/sec, MB=789907
  7s: 99678MB/sec, MB=889586
  8s: 106545MB/sec, MB=996132
  9s: 106826MB/sec, MB=1102958
10s: 101544MB/sec, MB=1204503
11s: 111044MB/sec, MB=1315548
12s: 124257MB/sec, MB=1441121
13s: 116031MB/sec, MB=1557153
14s: 114540MB/sec, MB=1671694
15s: 115011MB/sec, MB=1786705
16s: 115260MB/sec, MB=1901966
17s: 116068MB/sec, MB=2018034
18s: 116096MB/sec, MB=2134131

where it's quite obvious where the page cache filled, and performance
dropped to about half of where it started, settling in at around
115GB/sec. Meanwhile, 32 kswapds were running full steam trying to
reclaim pages.

Running the same test with uncached buffered writes:

writing bs 65536, uncached 1
  1s: 198974MB/sec
  2s: 189618MB/sec
  3s: 193601MB/sec
  4s: 188582MB/sec
  5s: 193487MB/sec
  6s: 188341MB/sec
  7s: 194325MB/sec
  8s: 188114MB/sec
  9s: 192740MB/sec
10s: 189206MB/sec
11s: 193442MB/sec
12s: 189659MB/sec
13s: 191732MB/sec
14s: 190701MB/sec
15s: 191789MB/sec
16s: 191259MB/sec
17s: 190613MB/sec
18s: 191951MB/sec

and the behavior is fully predictable, performing the same throughout
even after the page cache would otherwise have fully filled with dirty
data. It's also about 65% faster, and using half the CPU of the system
compared to the normal buffered write.
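
The "about 65% faster" figure can be sanity-checked from the two runs
above. The sketch below averages the last six samples (13s through 18s) of
each run, an arbitrary choice of "settled" window made here for
illustration, and returns the ratio. The helper names are hypothetical.

```c
#include <stddef.h>

/* MB/sec samples for 13s..18s, copied from the two runs above. */
static const long cached_mbs[] = {
	116031, 114540, 115011, 115260, 116068, 116096
};
static const long uncached_mbs[] = {
	191732, 190701, 191789, 191259, 190613, 191951
};

/* Arithmetic mean of n samples. */
static double mean_mbs(const long *v, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += (double)v[i];
	return sum / (double)n;
}

/* Ratio of settled uncached throughput to settled cached throughput. */
static double uncached_speedup(void)
{
	size_t n = sizeof(cached_mbs) / sizeof(cached_mbs[0]);

	return mean_mbs(uncached_mbs, n) / mean_mbs(cached_mbs, n);
}
```

With these samples uncached_speedup() comes out at roughly 1.66, in line
with the figure quoted above.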

Signed-off-by: Jens Axboe <axboe@kernel.dk>

author	Jens Axboe <axboe@kernel.dk>
	Tue, 5 Nov 2024 21:35:11 +0000 (14:35 -0700)
committer	Jens Axboe <axboe@kernel.dk>
	Sun, 10 Nov 2024 13:51:53 +0000 (06:51 -0700)
commit	e1f50cbb7aa4564b72fb77ab304e320af350f1bd
tree	7fb84e9d8da3569ee78f24dd1c7501d36a06e92e
parent	01b4f845f686a75f095e016771e6d34419637e7f

include/linux/pagemap.h
mm/filemap.c