splice: Add a func to do a splice from an O_DIRECT file without ITER_PIPE
authorDavid Howells <dhowells@redhat.com>
Tue, 7 Feb 2023 10:45:40 +0000 (10:45 +0000)
committerSteve French <stfrench@microsoft.com>
Mon, 20 Feb 2023 23:25:43 +0000 (17:25 -0600)
commit33b3b041543e8b3abf9a692d0f8c2ab0e07c50cd
tree4a38418853dba47c932ba986502a0d9de91e4d92
parent07073eb01c5f630344bc1c3e56b0e0d94aedf919
splice: Add a func to do a splice from an O_DIRECT file without ITER_PIPE

Implement a function, direct_file_splice(), that deals with this by using
an ITER_BVEC iterator instead of an ITER_PIPE iterator as the former won't
free its buffers when reverted.  The function bulk allocates all the
buffers it thinks it is going to use in advance, does the read
synchronously and only then trims the buffer down.  The pages we did use
get pushed into the pipe.

This fixes a problem with the upcoming iov_iter_extract_pages() function,
whereby pages extracted from a non-user-backed iterator such as ITER_PIPE
aren't pinned.  __iomap_dio_rw(), however, calls iov_iter_revert() to
shorten the iterator to just the bufferage it is going to use - which has
the side-effect of freeing the excess pipe buffers, even though they're
attached to a bio and may get written to by DMA (thanks to Hillf Danton for
spotting this[1]).

This then causes memory corruption that is particularly noticeable when the
syzbot test[2] is run.  The test boils down to:

out = creat(argv[1], 0666);
ftruncate(out, 0x800);
lseek(out, 0x200, SEEK_SET);
in = open(argv[1], O_RDONLY | O_DIRECT | O_NOFOLLOW);
sendfile(out, in, NULL, 0x1dd00);

run repeatedly in parallel.  What I think is happening is that ftruncate()
occasionally shortens the DIO read that's about to be made by sendfile's
splice core by reducing i_size.

This should be more efficient for DIO read by virtue of doing a bulk page
allocation, but slightly less efficient by ignoring any partial page in the
pipe.

Reported-by: syzbot+a440341a59e3b7142895@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20230207094731.1390-1-hdanton@sina.com/
Link: https://lore.kernel.org/r/000000000000b0b3c005f3a09383@google.com/
Signed-off-by: Steve French <stfrench@microsoft.com>
fs/splice.c
include/linux/fs.h