linux-2.6-block.git
3 years agoMerge branches 'work.misc', 'work.iov_iter', 'work.namei' and 'work.lseek-2' into...
Al Viro [Tue, 19 Jul 2022 17:09:25 +0000 (13:09 -0400)]
Merge branches 'work.misc', 'work.iov_iter', 'work.namei' and 'work.lseek-2' into for-next

3 years agoexpand those iov_iter_advance()...
Al Viro [Sat, 11 Jun 2022 08:04:33 +0000 (04:04 -0400)]
expand those iov_iter_advance()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agopipe_get_pages(): switch to append_pipe()
Al Viro [Tue, 14 Jun 2022 20:38:53 +0000 (16:38 -0400)]
pipe_get_pages(): switch to append_pipe()

now that we are advancing the iterator, there's no need to
treat the first page separately - just call append_pipe()
in a loop.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoget rid of non-advancing variants
Al Viro [Fri, 10 Jun 2022 17:05:12 +0000 (13:05 -0400)]
get rid of non-advancing variants

mechanical change; will be further massaged in subsequent commits

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoceph: switch the last caller of iov_iter_get_pages_alloc()
Al Viro [Fri, 10 Jun 2022 15:43:27 +0000 (11:43 -0400)]
ceph: switch the last caller of iov_iter_get_pages_alloc()

here nothing even looks at the iov_iter after the call, so we couldn't
care less whether it advances or not.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years ago9p: convert to advancing variant of iov_iter_get_pages_alloc()
Al Viro [Fri, 10 Jun 2022 15:42:02 +0000 (11:42 -0400)]
9p: convert to advancing variant of iov_iter_get_pages_alloc()

that one is somewhat clumsier than usual and needs serious testing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoaf_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()
Al Viro [Thu, 9 Jun 2022 15:14:04 +0000 (11:14 -0400)]
af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()

... and adjust the callers

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiter_to_pipe(): switch to advancing variant of iov_iter_get_pages()
Al Viro [Thu, 9 Jun 2022 15:07:52 +0000 (11:07 -0400)]
iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()

... and untangle the cleanup on failure to add into pipe.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoblock: convert to advancing variants of iov_iter_get_pages{,_alloc}()
Al Viro [Thu, 9 Jun 2022 14:37:57 +0000 (10:37 -0400)]
block: convert to advancing variants of iov_iter_get_pages{,_alloc}()

... doing revert if we end up not using some pages

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter: advancing variants of iov_iter_get_pages{,_alloc}()
Al Viro [Thu, 9 Jun 2022 14:28:36 +0000 (10:28 -0400)]
iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()

Most of the users immediately follow successful iov_iter_get_pages()
with advancing by the amount it had returned.

Provide inline wrappers doing that, convert trivial open-coded
uses of those.

BTW, iov_iter_get_pages() never returns more than it had been asked
to; such checks in cifs ought to be removed someday...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter: saner helper for page array allocation
Al Viro [Fri, 17 Jun 2022 18:45:41 +0000 (14:45 -0400)]
iov_iter: saner helper for page array allocation

All call sites of get_pages_array() are essenitally identical now.
Replace with common helper...

Returns number of slots available in resulting array or 0 on OOM;
it's up to the caller to make sure it doesn't ask to zero-entry
array (i.e. neither maxpages nor size are allowed to be zero).

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agofold __pipe_get_pages() into pipe_get_pages()
Al Viro [Fri, 17 Jun 2022 18:30:39 +0000 (14:30 -0400)]
fold __pipe_get_pages() into pipe_get_pages()

... and don't mangle maxsize there - turn the loop into counting
one instead.  Easier to see that we won't run out of array that
way.  Note that special treatment of the partial buffer in that
thing is an artifact of the non-advancing semantics of
iov_iter_get_pages() - if not for that, it would be append_pipe(),
same as the body of the loop that follows it.  IOW, once we make
iov_iter_get_pages() advancing, the whole thing will turn into
calculate how many pages do we want
allocate an array (if needed)
call append_pipe() that many times.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_XARRAY: don't open-code DIV_ROUND_UP()
Al Viro [Sat, 11 Jun 2022 00:30:35 +0000 (20:30 -0400)]
ITER_XARRAY: don't open-code DIV_ROUND_UP()

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agounify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts
Al Viro [Fri, 17 Jun 2022 17:54:15 +0000 (13:54 -0400)]
unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts

same as for pipes and xarrays; after that iov_iter_get_pages() becomes
a wrapper for __iov_iter_get_pages_alloc().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agounify xarray_get_pages() and xarray_get_pages_alloc()
Al Viro [Fri, 17 Jun 2022 17:48:03 +0000 (13:48 -0400)]
unify xarray_get_pages() and xarray_get_pages_alloc()

same as for pipes

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agounify pipe_get_pages() and pipe_get_pages_alloc()
Al Viro [Fri, 17 Jun 2022 17:35:35 +0000 (13:35 -0400)]
unify pipe_get_pages() and pipe_get_pages_alloc()

The differences between those two are
* pipe_get_pages() gets a non-NULL struct page ** value pointing to
preallocated array + array size.
* pipe_get_pages_alloc() gets an address of struct page ** variable that
contains NULL, allocates the array and (on success) stores its address in
that variable.

Not hard to combine - always pass struct page ***, have
the previous pipe_get_pages_alloc() caller pass ~0U as cap for
array size.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter_get_pages(): sanity-check arguments
Al Viro [Fri, 17 Jun 2022 19:15:14 +0000 (15:15 -0400)]
iov_iter_get_pages(): sanity-check arguments

zero maxpages is bogus, but best treated as "just return 0";
NULL pages, OTOH, should be treated as a hard bug.

get rid of now completely useless checks in xarray_get_pages{,_alloc}().

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper
Al Viro [Sat, 11 Jun 2022 00:38:20 +0000 (20:38 -0400)]
iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper

Incidentally, ITER_XARRAY did *not* free the sucker in case when
iter_xarray_populate_pages() returned 0...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: fold data_start() and pipe_space_for_user() together
Al Viro [Wed, 15 Jun 2022 13:44:38 +0000 (09:44 -0400)]
ITER_PIPE: fold data_start() and pipe_space_for_user() together

All their callers are next to each other; all of them
want the total amount of pages and, possibly, the
offset in the partial final buffer.

Combine into a new helper (pipe_npages()), fix the
bogosity in pipe_space_for_user(), while we are at it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: cache the type of last buffer
Al Viro [Wed, 15 Jun 2022 06:02:51 +0000 (02:02 -0400)]
ITER_PIPE: cache the type of last buffer

We often need to find whether the last buffer is anon or not, and
currently it's rather clumsy:
check if ->iov_offset is non-zero (i.e. that pipe is not empty)
if so, get the corresponding pipe_buffer and check its ->ops
if it's &default_pipe_buf_ops, we have an anon buffer.

Let's replace the use of ->iov_offset (which is nowhere near similar to
its role for other flavours) with signed field (->last_offset), with
the following rules:
empty, no buffers occupied: 0
anon, with bytes up to N-1 filled: N
zero-copy, with bytes up to N-1 filled: -N

That way abs(i->last_offset) is equal to what used to be in i->iov_offset
and empty vs. anon vs. zero-copy can be distinguished by the sign of
i->last_offset.

Checks for "should we extend the last buffer or should we start
a new one?" become easier to follow that way.

Note that most of the operations can only be done in a sane
state - i.e. when the pipe has nothing past the current position of
iterator.  About the only thing that could be done outside of that
state is iov_iter_advance(), which transitions to the sane state by
truncating the pipe.  There are only two cases where we leave the
sane state:
1) iov_iter_get_pages()/iov_iter_get_pages_alloc().  Will be
dealt with later, when we make get_pages advancing - the callers are
actually happier that way.
2) iov_iter copied, then something is put into the copy.  Since
they share the underlying pipe, the original gets behind.  When we
decide that we are done with the copy (original is not usable until then)
we advance the original.  direct_io used to be done that way; nowadays
it operates on the original and we do iov_iter_revert() to discard
the excessive data.  At the moment there's nothing in the kernel that
could do that to ITER_PIPE iterators, so this reason for insane state
is theoretical right now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: clean iov_iter_revert()
Al Viro [Sun, 12 Jun 2022 21:54:35 +0000 (17:54 -0400)]
ITER_PIPE: clean iov_iter_revert()

Fold pipe_truncate() into it, clean up.  We can release buffers
in the same loop where we walk backwards to the iterator beginning
looking for the place where the new position will be.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: clean pipe_advance() up
Al Viro [Wed, 15 Jun 2022 20:03:25 +0000 (16:03 -0400)]
ITER_PIPE: clean pipe_advance() up

instead of setting ->iov_offset for new position and calling
pipe_truncate() to adjust ->len of the last buffer and discard
everything after it, adjust ->len at the same time we set ->iov_offset
and use pipe_discard_from() to deal with buffers past that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: lose iter_head argument of __pipe_get_pages()
Al Viro [Thu, 16 Jun 2022 18:26:23 +0000 (14:26 -0400)]
ITER_PIPE: lose iter_head argument of __pipe_get_pages()

it's only used to get to the partial buffer we can add to,
and that's always the last one, i.e. pipe->head - 1.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: fold push_pipe() into __pipe_get_pages()
Al Viro [Sat, 11 Jun 2022 06:52:03 +0000 (02:52 -0400)]
ITER_PIPE: fold push_pipe() into __pipe_get_pages()

Expand the only remaining call of push_pipe() (in
__pipe_get_pages()), combine it with the page-collecting loop there.

Note that the only reason it's not a loop doing append_pipe() is
that append_pipe() is advancing, while iov_iter_get_pages() is not.
As soon as it switches to saner semantics, this thing will switch
to using append_pipe().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: allocate buffers as we go in copy-to-pipe primitives
Al Viro [Tue, 14 Jun 2022 17:53:53 +0000 (13:53 -0400)]
ITER_PIPE: allocate buffers as we go in copy-to-pipe primitives

New helper: append_pipe().  Extends the last buffer if possible,
allocates a new one otherwise.  Returns page and offset in it
on success, NULL on failure.  iov_iter is advanced past the
data we've got.

Use that instead of push_pipe() in copy-to-pipe primitives;
they get simpler that way.  Handling of short copy (in "mc" one)
is done simply by iov_iter_revert() - iov_iter is in consistent
state after that one, so we can use that.

[Fix for braino caught by Liu Xinpeng <liuxp11@chinatelecom.cn> folded in]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: helpers for adding pipe buffers
Al Viro [Mon, 13 Jun 2022 18:30:15 +0000 (14:30 -0400)]
ITER_PIPE: helpers for adding pipe buffers

There are only two kinds of pipe_buffer in the area used by ITER_PIPE.

1) anonymous - copy_to_iter() et.al. end up creating those and copying
data there.  They have zero ->offset, and their ->ops points to
default_pipe_page_ops.

2) zero-copy ones - those come from copy_page_to_iter(), and page
comes from caller.  ->offset is also caller-supplied - it might be
non-zero.  ->ops points to page_cache_pipe_buf_ops.

Move creation and insertion of those into helpers - push_anon(pipe, size)
and push_page(pipe, page, offset, size) resp., separating them from
the "could we avoid creating a new buffer by merging with the current
head?" logics.

Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoITER_PIPE: helper for getting pipe buffer by index
Al Viro [Tue, 14 Jun 2022 14:24:37 +0000 (10:24 -0400)]
ITER_PIPE: helper for getting pipe buffer by index

pipe_buffer instances of a pipe are organized as a ring buffer,
with power-of-2 size.  Indices are kept *not* reduced modulo ring
size, so the buffer refered to by index N is
pipe->bufs[N & (pipe->ring_size - 1)].

Ring size can change over the lifetime of a pipe, but not while
the pipe is locked.  So for any iov_iter primitives it's a constant.
Original conversion of pipes to this layout went overboard trying
to microoptimize that - calculating pipe->ring_size - 1, storing
it in a local variable and using through the function.  In some
cases it might be warranted, but most of the times it only
obfuscates what's going on in there.

Introduce a helper (pipe_buf(pipe, N)) that would encapsulate
that and use it in the obvious cases.  More will follow...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agosplice: stop abusing iov_iter_advance() to flush a pipe
Al Viro [Sun, 12 Jun 2022 20:07:49 +0000 (16:07 -0400)]
splice: stop abusing iov_iter_advance() to flush a pipe

Use pipe_discard_from() explicitly in generic_file_read_iter(); don't bother
with rather non-obvious use of iov_iter_advance() in there.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoswitch new_sync_{read,write}() to ITER_UBUF
Al Viro [Sun, 22 May 2022 20:55:40 +0000 (16:55 -0400)]
switch new_sync_{read,write}() to ITER_UBUF

Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agonew iov_iter flavour - ITER_UBUF
Al Viro [Sun, 22 May 2022 18:59:25 +0000 (14:59 -0400)]
new iov_iter flavour - ITER_UBUF

Equivalent of single-segment iovec.  Initialized by iov_iter_ubuf(),
checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC
ones.

We are going to expose the things like ->write_iter() et.al. to those
in subsequent commits.

New predicate (user_backed_iter()) that is true for ITER_IOVEC and
ITER_UBUF; places like direct-IO handling should use that for
checking that pages we modify after getting them from iov_iter_get_pages()
would need to be dirtied.

DO NOT assume that replacing iter_is_iovec() with user_backed_iter()
will solve all problems - there's code that uses iter_is_iovec() to
decide how to poke around in iov_iter guts and for that the predicate
replacement obviously won't suffice.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoMerge branches 'fixes', 'block-iter', 'work.9p', 'work.iov_iter-base' and 'fixes...
Al Viro [Tue, 19 Jul 2022 17:08:19 +0000 (13:08 -0400)]
Merge branches 'fixes', 'block-iter', 'work.9p', 'work.iov_iter-base' and 'fixes-s390' into work.iov_iter

3 years agofs: remove no_llseek
Jason A. Donenfeld [Wed, 29 Jun 2022 13:07:00 +0000 (15:07 +0200)]
fs: remove no_llseek

Now that all callers of ->llseek are going through vfs_llseek(), we
don't gain anything by keeping no_llseek around. Nothing actually calls
it and setting ->llseek to no_lseek is completely equivalent to
leaving it NULL.

Longer term (== by the end of merge window) we want to remove all such
intializations.  To simplify the merge window this commit does *not*
touch initializers - it only defines no_llseek as NULL (and simplifies
the tests on file opening).

At -rc1 we'll need do a mechanical removal of no_llseek -

git grep -l -w no_llseek | grep -v porting.rst | while read i; do
sed -i '/\<no_llseek\>/d' $i
done
would do it.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agofs: check FMODE_LSEEK to control internal pipe splicing
Jason A. Donenfeld [Wed, 29 Jun 2022 13:06:58 +0000 (15:06 +0200)]
fs: check FMODE_LSEEK to control internal pipe splicing

The original direct splicing mechanism from Jens required the input to
be a regular file because it was avoiding the special socket case. It
also recognized blkdevs as being close enough to a regular file. But it
forgot about chardevs, which behave the same way and work fine here.

This is an okayish heuristic, but it doesn't totally work. For example,
a few chardevs should be spliceable here. And a few regular files
shouldn't. This patch fixes this by instead checking whether FMODE_LSEEK
is set, which represents decently enough what we need rewinding for when
splicing to internal pipes.

Fixes: b92ce5589374 ("[PATCH] splice: add direct fd <-> fd splicing support")
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agovfio: do not set FMODE_LSEEK flag
Jason A. Donenfeld [Wed, 29 Jun 2022 13:07:02 +0000 (15:07 +0200)]
vfio: do not set FMODE_LSEEK flag

This file does not support llseek, so don't set the flag advertising it.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agodma-buf: remove useless FMODE_LSEEK flag
Jason A. Donenfeld [Wed, 29 Jun 2022 13:07:01 +0000 (15:07 +0200)]
dma-buf: remove useless FMODE_LSEEK flag

This is already set by anon_inode_getfile(), since dma_buf_fops has
non-NULL ->llseek, so we don't need to set it here too.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agofs: do not compare against ->llseek
Jason A. Donenfeld [Wed, 29 Jun 2022 13:06:59 +0000 (15:06 +0200)]
fs: do not compare against ->llseek

Now vfs_llseek() can simply check for FMODE_LSEEK; if it's set,
we know that ->llseek() won't be NULL and if it's not we should
just fail with -ESPIPE.

A couple of other places where we used to check for special
values of ->llseek() (somewhat inconsistently) switched to
checking FMODE_LSEEK.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agofs: clear or set FMODE_LSEEK based on llseek function
Jason A. Donenfeld [Wed, 29 Jun 2022 13:06:57 +0000 (15:06 +0200)]
fs: clear or set FMODE_LSEEK based on llseek function

Pipe-like behaviour on llseek(2) (i.e. unconditionally failing with
-ESPIPE) can be expresses in 3 ways:
1) ->llseek set to NULL in file_operations
2) ->llseek set to no_llseek in file_operations
3) FMODE_LSEEK *not* set in ->f_mode.

Enforce (3) in cases (1) and (2); that will allow to simplify the
checks and eventually get rid of no_llseek boilerplate.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoblock: fix leaking page ref on truncated direct io for-5.20/block-iter
Keith Busch [Tue, 12 Jul 2022 15:32:56 +0000 (08:32 -0700)]
block: fix leaking page ref on truncated direct io

The size being added to a bio from an iov is aligned to a block size
after the pages were gotten. If the new aligned size truncates the last
page, its reference was being leaked. Ensure all pages that were not
added to the bio have their reference released.

Since this essentially requires doing the same that bio_put_pages(), and
there was only one caller for that function, this patch makes the
put_page() loop common for everyone.

Fixes: b1a000d3b8ec5 ("block: relax direct io memory alignment")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20220712153256.2202024-3-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: ensure bio_iov_add_page can't fail
Keith Busch [Tue, 12 Jul 2022 15:32:55 +0000 (08:32 -0700)]
block: ensure bio_iov_add_page can't fail

Adding the page could fail on the bio_full() condition, which checks for
either exceeding the bio's max segments or total size exceeding
UINT_MAX. We already ensure the max segments can't be exceeded, so just
ensure the total size won't reach the limit. This simplifies error
handling and removes unnecessary repeated bio_full() checks.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20220712153256.2202024-2-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: ensure iov_iter advances for added pages
Keith Busch [Tue, 12 Jul 2022 15:32:54 +0000 (08:32 -0700)]
block: ensure iov_iter advances for added pages

There are cases where a bio may not accept additional pages, and the iov
needs to advance to the last data length that was accepted. The zone
append used to handle this correctly, but was inadvertently broken when
the setup was made common with the normal r/w case.

Fixes: 576ed9135489c ("block: use bio_add_page in bio_iov_iter_get_pages")
Fixes: c58c0074c54c2 ("block/bio: remove duplicate append pages code")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20220712153256.2202024-1-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agofirst_iovec_segment(): just return address
Al Viro [Fri, 17 Jun 2022 20:07:49 +0000 (16:07 -0400)]
first_iovec_segment(): just return address

... and calculate the offset in the caller

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter: massage calling conventions for first_{iovec,bvec}_segment()
Al Viro [Tue, 21 Jun 2022 20:10:37 +0000 (16:10 -0400)]
iov_iter: massage calling conventions for first_{iovec,bvec}_segment()

Pass maxsize by reference, return length via the same.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter: first_{iovec,bvec}_segment() - simplify a bit
Al Viro [Tue, 21 Jun 2022 19:55:19 +0000 (15:55 -0400)]
iov_iter: first_{iovec,bvec}_segment() - simplify a bit

We return length + offset in page via *size.  Don't bother - the caller
can do that arithmetics just as well; just report the length to it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter: lift dealing with maxpages out of first_{iovec,bvec}_segment()
Al Viro [Sat, 11 Jun 2022 00:53:17 +0000 (20:53 -0400)]
iov_iter: lift dealing with maxpages out of first_{iovec,bvec}_segment()

caller can do that just as easily

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter_get_pages{,_alloc}(): cap the maxsize with MAX_RW_COUNT
Al Viro [Sat, 11 Jun 2022 20:44:21 +0000 (16:44 -0400)]
iov_iter_get_pages{,_alloc}(): cap the maxsize with MAX_RW_COUNT

All callers can and should handle iov_iter_get_pages() returning
fewer pages than requested.  All in-kernel ones do.  And it makes
the arithmetical overflow analysis much simpler...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiov_iter_bvec_advance(): don't bother with bvec_iter
Al Viro [Tue, 7 Jun 2022 03:44:33 +0000 (23:44 -0400)]
iov_iter_bvec_advance(): don't bother with bvec_iter

do what we do for iovec/kvec; that ends up generating better code,
AFAICS.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agos390: copy_oldmem_page() - don't ignore ->iov_offset
Al Viro [Wed, 6 Jul 2022 19:17:03 +0000 (15:17 -0400)]
s390: copy_oldmem_page() - don't ignore ->iov_offset

if you feel like you have to poke in iov_iter guts in the
first place, that is...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agostep_into(): move fetching ->d_inode past handle_mounts()
Al Viro [Mon, 4 Jul 2022 02:35:56 +0000 (22:35 -0400)]
step_into(): move fetching ->d_inode past handle_mounts()

... and lose messing with it in __follow_mount_rcu()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agolookup_fast(): don't bother with inode
Al Viro [Mon, 4 Jul 2022 02:20:20 +0000 (22:20 -0400)]
lookup_fast(): don't bother with inode

Note that validation of ->d_seq after ->d_inode fetch is gone, along
with fetching of ->d_inode itself.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agofollow_dotdot{,_rcu}(): don't bother with inode
Al Viro [Mon, 4 Jul 2022 02:18:11 +0000 (22:18 -0400)]
follow_dotdot{,_rcu}(): don't bother with inode

step_into() will fetch it, TYVM.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agostep_into(): lose inode argument
Al Viro [Mon, 4 Jul 2022 02:07:32 +0000 (22:07 -0400)]
step_into(): lose inode argument

make handle_mounts() always fetch it.  This is just the first step -
the callers of step_into() will stop trying to calculate the sucker,
etc.

The passed value should be equal to dentry->d_inode in all cases;
in RCU mode - fetched after we'd sampled ->d_seq.  Might as well
fetch it here.  We do need to validate ->d_seq, which duplicates
the check currently done in lookup_fast(); that duplication will
go away shortly.

After that change handle_mounts() always ignores the initial value of
*inode and always sets it on success.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agonamei: stash the sampled ->d_seq into nameidata
Al Viro [Mon, 4 Jul 2022 22:12:39 +0000 (18:12 -0400)]
namei: stash the sampled ->d_seq into nameidata

New field: nd->next_seq.  Set to 0 outside of RCU mode, holds the sampled
value for the next dentry to be considered.  Used instead of an arseload
of local variables, arguments, etc.

step_into() has lost seq argument; nd->next_seq is used, so dentry passed
to it must be the one ->next_seq is about.

There are two requirements for RCU pathwalk:
1) it should not give a hard failure (other than -ECHILD) unless
non-RCU pathwalk might fail that way given suitable timings.
2) it should not succeed unless non-RCU pathwalk might succeed
with the same end location given suitable timings.

The use of seq numbers is the way we achieve that.  Invariant we want
to maintain is:
if RCU pathwalk can reach the state with given nd->path, nd->inode
and nd->seq after having traversed some part of pathname, it must be possible
for non-RCU pathwalk to reach the same nd->path and nd->inode after having
traversed the same part of pathname, and observe the nd->path.dentry->d_seq
equal to what RCU pathwalk has in nd->seq

For transition from parent to child, we sample child's ->d_seq
and verify that parent's ->d_seq remains unchanged.  Anything that
disrupts parent-child relationship would've bumped ->d_seq on both.
For transitions from child to parent we sample parent's ->d_seq
and verify that child's ->d_seq has not changed.  Same reasoning as
for the previous case applies.
For transition from mountpoint to root of mounted we sample
the ->d_seq of root and verify that nobody has touched mount_lock since
the beginning of pathwalk.  That guarantees that mount we'd found had
been there all along, with these mountpoint and root of the mounted.
It would be possible for a non-RCU pathwalk to reach the previous state,
find the same mount and observe its root at the moment we'd sampled
->d_seq of that
For transitions from root of mounted to mountpoint we sample
->d_seq of mountpoint and verify that mount_lock had not been touched
since the beginning of pathwalk.  The same reasoning as in the
previous case applies.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agonamei: move clearing LOOKUP_RCU towards rcu_read_unlock()
Al Viro [Wed, 6 Jul 2022 16:40:31 +0000 (12:40 -0400)]
namei: move clearing LOOKUP_RCU towards rcu_read_unlock()

try_to_unlazy()/try_to_unlazy_next() drop LOOKUP_RCU in the
very beginning and do rcu_read_unlock() only at the very end.
However, nothing done in between even looks at the flag in
question; might as well clear it at the same time we unlock.

Note that try_to_unlazy_next() used to call legitimize_mnt(),
which might drop/regain rcu_read_lock() in some cases.  This
is no longer true, so we really have rcu_read_lock() held
all along until the end.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoswitch try_to_unlazy_next() to __legitimize_mnt()
Al Viro [Tue, 5 Jul 2022 16:22:46 +0000 (12:22 -0400)]
switch try_to_unlazy_next() to __legitimize_mnt()

The tricky case (__legitimize_mnt() failing after having grabbed
a reference) can be trivially dealt with by leaving nd->path.mnt
non-NULL, for terminate_walk() to drop it.

legitimize_mnt() becomes static after that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agofollow_dotdot{,_rcu}(): change calling conventions
Al Viro [Mon, 4 Jul 2022 15:20:51 +0000 (11:20 -0400)]
follow_dotdot{,_rcu}(): change calling conventions

Instead of returning NULL when we are in root, just make it return
the current position (and set *seqp and *inodep accordingly).
That collapses the calls of step_into() in handle_dots()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agonamei: get rid of pointless unlikely(read_seqcount_retry(...))
Al Viro [Tue, 5 Jul 2022 15:23:58 +0000 (11:23 -0400)]
namei: get rid of pointless unlikely(read_seqcount_retry(...))

read_seqcount_retry() et.al. are inlined and there's enough annotations
for compiler to figure out that those are unlikely to return non-zero.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years ago__follow_mount_rcu(): verify that mount_lock remains unchanged
Al Viro [Mon, 4 Jul 2022 21:26:29 +0000 (17:26 -0400)]
__follow_mount_rcu(): verify that mount_lock remains unchanged

Validate mount_lock seqcount as soon as we cross into mount in RCU
mode.  Sure, ->mnt_root is pinned and will remain so until we
do rcu_read_unlock() anyway, and we will eventually fail to unlazy if
the mount_lock had been touched, but we might run into a hard error
(e.g. -ENOENT) before trying to unlazy.  And it's possible to end
up with RCU pathwalk racing with rename() and umount() in a way
that would fail with -ENOENT while non-RCU pathwalk would've
succeeded with any timings.

Once upon a time we hadn't needed that, but analysis had been subtle,
brittle and went out of window as soon as RENAME_EXCHANGE had been
added.

It's narrow, hard to hit and won't get you anything other than
stray -ENOENT that could be arranged in much easier way with the
same priveleges, but it's a bug all the same.

Cc: stable@kernel.org
X-sky-is-falling: unlikely
Fixes: da1ce0670c14 "vfs: add cross-rename"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agofix short copy handling in copy_mc_pipe_to_iter()
Al Viro [Sun, 12 Jun 2022 23:50:29 +0000 (19:50 -0400)]
fix short copy handling in copy_mc_pipe_to_iter()

Unlike other copying operations on ITER_PIPE, copy_mc_to_iter() can
result in a short copy.  In that case we need to trim the unused
buffers, as well as the length of partially filled one - it's not
enough to set ->head, ->iov_offset and ->count to reflect how
much had we copied.  Not hard to fix, fortunately...

I'd put a helper (pipe_discard_from(pipe, head)) into pipe_fs_i.h,
rather than iov_iter.c - it has nothing to do with iov_iter and
having it will allow us to avoid an ugly kludge in fs/splice.c.
We could put it into lib/iov_iter.c for now and move it later,
but I don't see the point going that way...

Cc: stable@kernel.org # 4.19+
Fixes: ca146f6f091e "lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe()"
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agocopy_page_{to,from}_iter(): switch iovec variants to generic
Al Viro [Thu, 26 May 2022 23:07:11 +0000 (19:07 -0400)]
copy_page_{to,from}_iter(): switch iovec variants to generic

we can do copyin/copyout under kmap_local_page(); it shouldn't overflow
the kmap stack - the maximal footprint increase only by one here.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agovfs: escape hash as well
Siddhesh Poyarekar [Tue, 28 Jun 2022 00:30:58 +0000 (08:30 +0800)]
vfs: escape hash as well

When a filesystem is mounted with a name that starts with a #:

 # mount '#name' /mnt/bad -t tmpfs

this will cause the entry to look like this (leading space added so
that git does not strip it out):

 #name /mnt/bad tmpfs rw,seclabel,relatime,inode64 0 0

This breaks getmntent and any code that aims to parse fstab as well as
/proc/mounts with the same logic since they need to strip leading spaces
or skip over comment lines, due to which they report incorrect output or
skip over the line respectively.

Solve this by translating the hash character into its octal encoding
equivalent so that applications can decode the name correctly.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoiomap: add support for dma aligned direct-io
Keith Busch [Fri, 10 Jun 2022 19:58:30 +0000 (12:58 -0700)]
iomap: add support for dma aligned direct-io

Use the address alignment requirements from the block_device for direct
io instead of requiring addresses be aligned to the block size.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610195830.3574005-12-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: relax direct io memory alignment block-5.20-al
Keith Busch [Fri, 10 Jun 2022 19:58:29 +0000 (12:58 -0700)]
block: relax direct io memory alignment

Use the address alignment requirements from the block_device for direct
io instead of requiring addresses be aligned to the block size. User
space can discover the alignment requirements from the dma_alignment
queue attribute.

User space can specify any hardware compatible DMA offset for each
segment, but every segment length is still required to be a multiple of
the block size.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610195830.3574005-11-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: introduce bdev_iter_is_aligned helper
Keith Busch [Fri, 10 Jun 2022 19:58:28 +0000 (12:58 -0700)]
block: introduce bdev_iter_is_aligned helper

Provide a convenient function for this repeatable coding pattern.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610195830.3574005-10-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoiov: introduce iov_iter_aligned
Keith Busch [Fri, 10 Jun 2022 19:58:27 +0000 (12:58 -0700)]
iov: introduce iov_iter_aligned

The existing iov_iter_alignment() function returns the logical OR of
address and length. For cases where address and length need to be
considered separately, introduce a helper function that a caller can
specificy length and address masks that indicate if the iov is
unaligned.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610195830.3574005-9-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock/bounce: count bytes instead of sectors
Keith Busch [Fri, 10 Jun 2022 19:58:26 +0000 (12:58 -0700)]
block/bounce: count bytes instead of sectors

Individual bv_len's may not be a sector size.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610195830.3574005-8-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock/merge: count bytes instead of sectors
Keith Busch [Fri, 10 Jun 2022 19:58:25 +0000 (12:58 -0700)]
block/merge: count bytes instead of sectors

Individual bv_len's may not be a sector size.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220610195830.3574005-7-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: add a helper function for dio alignment
Keith Busch [Fri, 10 Jun 2022 19:58:24 +0000 (12:58 -0700)]
block: add a helper function for dio alignment

This will make it easier to add more complex acceptable alignment
criteria in the future.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610195830.3574005-6-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: introduce bdev_dma_alignment helper
Keith Busch [Fri, 10 Jun 2022 19:58:23 +0000 (12:58 -0700)]
block: introduce bdev_dma_alignment helper

Preparing for upcoming dma_alignment users that have a block_device, but
don't need the request_queue.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610195830.3574005-5-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: export dma_alignment attribute
Keith Busch [Fri, 10 Jun 2022 19:58:22 +0000 (12:58 -0700)]
block: export dma_alignment attribute

User space may want to know how to align their buffers to avoid
bouncing. Export the queue attribute.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610195830.3574005-4-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock/bio: remove duplicate append pages code
Keith Busch [Fri, 10 Jun 2022 19:58:21 +0000 (12:58 -0700)]
block/bio: remove duplicate append pages code

The getting pages setup for zone append and normal IO are identical. Use
common code for each.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610195830.3574005-3-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: fix infinite loop for invalid zone append
Keith Busch [Fri, 10 Jun 2022 19:58:20 +0000 (12:58 -0700)]
block: fix infinite loop for invalid zone append

Returning 0 early from __bio_iov_append_get_pages() for the
max_append_sectors warning just creates an infinite loop since 0 means
success, and the bio will never fill from the unadvancing iov_iter. We
could turn the return into an error value, but it will already be turned
into an error value later on, so just remove the warning. Clearly no one
ever hit it anyway.

Fixes: 0512a75b98f84 ("block: Introduce REQ_OP_ZONE_APPEND")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220610195830.3574005-2-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoLinux 5.19-rc4 v5.19-rc4
Linus Torvalds [Sun, 26 Jun 2022 21:22:10 +0000 (14:22 -0700)]
Linux 5.19-rc4

3 years agoMerge tag 'soc-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Sun, 26 Jun 2022 21:12:56 +0000 (14:12 -0700)]
Merge tag 'soc-fixes-5.19' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "A number of fixes have accumulated, but they are largely for harmless
  issues:

   - Several OF node leak fixes

   - A fix to the Exynos7885 UART clock description

   - DTS fixes to prevent boot failures on TI AM64 and J721s2

   - Bus probe error handling fixes for Baikal-T1

   - A fixup to the way STM32 SoCs use separate dts files for different
     firmware stacks

   - Multiple code fixes for Arm SCMI firmware, all dealing with
     robustness of the implementation

   - Multiple NXP i.MX devicetree fixes, addressing incorrect data in DT
     nodes

   - Three updates to the MAINTAINERS file, including Florian Fainelli
     taking over BCM283x/BCM2711 (Raspberry Pi) from Nicolas Saenz
     Julienne"

* tag 'soc-fixes-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (29 commits)
  ARM: dts: aspeed: nuvia: rename vendor nuvia to qcom
  arm: mach-spear: Add missing of_node_put() in time.c
  ARM: cns3xxx: Fix refcount leak in cns3xxx_init
  MAINTAINERS: Update email address
  arm64: dts: ti: k3-am64-main: Remove support for HS400 speed mode
  arm64: dts: ti: k3-j721s2: Fix overlapping GICD memory region
  ARM: dts: bcm2711-rpi-400: Fix GPIO line names
  bus: bt1-axi: Don't print error on -EPROBE_DEFER
  bus: bt1-apb: Don't print error on -EPROBE_DEFER
  ARM: Fix refcount leak in axxia_boot_secondary
  ARM: dts: stm32: move SCMI related nodes in a dedicated file for stm32mp15
  soc: imx: imx8m-blk-ctrl: fix display clock for LCDIF2 power domain
  ARM: dts: imx6qdl-colibri: Fix capacitive touch reset polarity
  ARM: dts: imx6qdl: correct PU regulator ramp delay
  firmware: arm_scmi: Fix incorrect error propagation in scmi_voltage_descriptors_get
  firmware: arm_scmi: Avoid using extended string-buffers sizes if not necessary
  firmware: arm_scmi: Fix SENSOR_AXIS_NAME_GET behaviour when unsupported
  ARM: dts: imx7: Move hsic_phy power domain to HSIC PHY node
  soc: bcm: brcmstb: pm: pm-arm: Fix refcount leak in brcmstb_pm_probe
  MAINTAINERS: Update BCM2711/BCM2835 maintainer
  ...

3 years agoMerge tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Sun, 26 Jun 2022 21:00:55 +0000 (14:00 -0700)]
Merge tag 'mm-hotfixes-stable-2022-06-26' of git://git./linux/kernel/git/akpm/mm

Pull hotfixes from Andrew Morton:
 "Minor things, mainly - mailmap updates, MAINTAINERS updates, etc.

  Fixes for this merge window:

   - fix for a damon boot hang, from SeongJae

   - fix for a kfence warning splat, from Jason Donenfeld

   - fix for zero-pfn pinning, from Alex Williamson

   - fix for fallocate hole punch clearing, from Mike Kravetz

  Fixes for previous releases:

   - fix for a performance regression, from Marcelo

   - fix for a hwpoisining BUG from zhenwei pi"

* tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add entry for Christian Marangi
  mm/memory-failure: disable unpoison once hw error happens
  hugetlbfs: zero partial pages during fallocate hole punch
  mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py
  mm: re-allow pinning of zero pfns
  mm/kfence: select random number before taking raw lock
  MAINTAINERS: add maillist information for LoongArch
  MAINTAINERS: update MM tree references
  MAINTAINERS: update Abel Vesa's email
  MAINTAINERS: add MEMORY HOT(UN)PLUG section and add David as reviewer
  MAINTAINERS: add Miaohe Lin as a memory-failure reviewer
  mailmap: add alias for jarkko@profian.com
  mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized
  kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal
  mm: lru_cache_disable: use synchronize_rcu_expedited
  mm/page_isolation.c: fix one kernel-doc comment

3 years agoMerge tag 'perf-tools-fixes-for-v5.19-2022-06-26' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sun, 26 Jun 2022 19:12:25 +0000 (12:12 -0700)]
Merge tag 'perf-tools-fixes-for-v5.19-2022-06-26' of git://git./linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Enable ignore_missing_thread in 'perf stat', enabling counting with
   '--pid' when threads disappear during counting session setup

 - Adjust output data offset for backward compatibility in 'perf inject'

 - Fix missing free in copy_kcore_dir() in 'perf inject'

 - Fix caching files with a wrong build ID

 - Sync drm, cpufeatures, vhost and svn headers with the kernel

* tag 'perf-tools-fixes-for-v5.19-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  tools headers UAPI: Synch KVM's svm.h header with the kernel
  tools include UAPI: Sync linux/vhost.h with the kernel sources
  perf stat: Enable ignore_missing_thread
  perf inject: Adjust output data offset for backward compatibility
  perf trace beauty: Fix generation of errno id->str table on ALT Linux
  perf build-id: Fix caching files with a wrong build ID
  tools headers cpufeatures: Sync with the kernel sources
  tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
  perf inject: Fix missing free in copy_kcore_dir()

3 years agoMerge tag 'for-5.19-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Sun, 26 Jun 2022 17:11:36 +0000 (10:11 -0700)]
Merge tag 'for-5.19-rc3-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - zoned relocation fixes:
      - fix critical section end for extent writeback, this could lead
        to out of order write
      - prevent writing to previous data relocation block group if space
        gets low

 - reflink fixes:
      - fix race between reflinking and ordered extent completion
      - proper error handling when block reserve migration fails
      - add missing inode iversion/mtime/ctime updates on each iteration
        when replacing extents

 - fix deadlock when running fsync/fiemap/commit at the same time

 - fix false-positive KCSAN report regarding pid tracking for read locks
   and data race

 - minor documentation update and link to new site

* tag 'for-5.19-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Documentation: update btrfs list of features and link to readthedocs.io
  btrfs: fix deadlock with fsync+fiemap+transaction commit
  btrfs: don't set lock_owner when locking extent buffer for reading
  btrfs: zoned: fix critical section of relocation inode writeback
  btrfs: zoned: prevent allocation from previous data relocation BG
  btrfs: do not BUG_ON() on failure to migrate space when replacing extents
  btrfs: add missing inode updates on each iteration when replacing extents
  btrfs: fix race between reflinking and ordered extent completion

3 years agoMerge tag 'dma-mapping-5.19-2022-06-26' of git://git.infradead.org/users/hch/dma...
Linus Torvalds [Sun, 26 Jun 2022 17:01:40 +0000 (10:01 -0700)]
Merge tag 'dma-mapping-5.19-2022-06-26' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:

 - pass the correct size to dma_set_encrypted() when freeing memory
   (Dexuan Cui)

* tag 'dma-mapping-5.19-2022-06-26' of git://git.infradead.org/users/hch/dma-mapping:
  dma-direct: use the correct size for dma_set_encrypted()

3 years agoMerge tag 'for-5.19/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Sun, 26 Jun 2022 16:13:51 +0000 (09:13 -0700)]
Merge tag 'for-5.19/fbdev-2' of git://git./linux/kernel/git/deller/linux-fbdev

Pull fbdev fixes from Helge Deller:
 "Two bug fixes for the pxa3xx and intelfb drivers:

   - pxa3xx-gcu: Fix integer overflow in pxa3xx_gcu_write

   - intelfb: Initialize value of stolen size

  The other changes are small cleanups, simplifications and
  documentation updates to the cirrusfb, skeletonfb, omapfb,
  intelfb, au1100fb and simplefb drivers"

* tag 'for-5.19/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  video: fbdev: omap: Remove duplicate 'the' in comment
  video: fbdev: omapfb: Align '*' in comment
  video: fbdev: simplefb: Check before clk_put() not needed
  video: fbdev: au1100fb: Drop unnecessary NULL ptr check
  video: fbdev: pxa3xx-gcu: Fix integer overflow in pxa3xx_gcu_write
  video: fbdev: skeletonfb: Convert to generic power management
  video: fbdev: cirrusfb: Remove useless reference to PCI power management
  video: fbdev: intelfb: Initialize value of stolen size
  video: fbdev: intelfb: Use aperture size from pci_resource_len
  video: fbdev: skeletonfb: Fix syntax errors in comments

3 years agoMerge tag 'for-5.19/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Sun, 26 Jun 2022 16:08:30 +0000 (09:08 -0700)]
Merge tag 'for-5.19/parisc-3' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc architecture fixes from Helge Deller:

 - enable ARCH_HAS_STRICT_MODULE_RWX to prevent a boot crash on c8000
   machines

 - flush all mappings of a shared anonymous page on PA8800/8900 machines
   via flushing the whole data cache. This may slow down such machines
   but makes sure that the cache is consistent

 - Fix duplicate definition build error regarding fb_is_primary_device()

* tag 'for-5.19/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Enable ARCH_HAS_STRICT_MODULE_RWX
  parisc: Fix flush_anon_page on PA8800/PA8900
  parisc: align '*' in comment in math-emu code
  parisc/stifb: Fix fb_is_primary_device() only available with CONFIG_FB_STI

3 years agoMerge tag 'xtensa-20220626' of https://github.com/jcmvbkbc/linux-xtensa
Linus Torvalds [Sun, 26 Jun 2022 15:59:21 +0000 (08:59 -0700)]
Merge tag 'xtensa-20220626' of https://github.com/jcmvbkbc/linux-xtensa

Pull xtensa fixes from Max Filippov:

 - fix OF reference leaks in xtensa arch code

 - replace '.bss' with '.section .bss' to fix entry.S build with old
   assembler

* tag 'xtensa-20220626' of https://github.com/jcmvbkbc/linux-xtensa:
  xtensa: change '.bss' to '.section .bss'
  xtensa: xtfpga: Fix refcount leak bug in setup
  xtensa: Fix refcount leak bug in time.c

3 years agoMerge tag 'powerpc-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 26 Jun 2022 15:53:20 +0000 (08:53 -0700)]
Merge tag 'powerpc-5.19-3' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - A fix for a CMA change that broke booting guests with > 2G RAM on
   Power8 hosts.

 - Fix the RTAS call filter to allow a special case that applications
   rely on.

 - A change to our execve path, to make the execve syscall exit
   tracepoint work.

 - Three fixes to wire up our various RNGs earlier in boot so they're
   available for use in the initial seeding in random_init().

 - A build fix for when KASAN is enabled along with
   STRUCTLEAK_BYREF_ALL.

Thanks to Andrew Donnellan, Aneesh Kumar K.V, Christophe Leroy, Jason
Donenfeld, Nathan Lynch, Naveen N. Rao, Sathvika Vasireddy, Sumit
Dubey2, Tyrel Datwyler, and Zi Yan.

* tag 'powerpc-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powernv: wire up rng during setup_arch
  powerpc/prom_init: Fix build failure with GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and KASAN
  powerpc/rtas: Allow ibm,platform-dump RTAS call with null buffer address
  powerpc: Enable execve syscall exit tracepoint
  powerpc/pseries: wire up rng during setup_arch()
  powerpc/microwatt: wire up rng during setup_arch()
  powerpc/mm: Move CMA reservations after initmem_init()

3 years agoMerge tag 'kbuild-fixes-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 26 Jun 2022 15:47:28 +0000 (08:47 -0700)]
Merge tag 'kbuild-fixes-v5.19-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Fix modpost to detect EXPORT_SYMBOL marked as __init or__exit

 - Update the supported arch list in the LLVM document

 - Avoid the second link of vmlinux for CONFIG_TRIM_UNUSED_KSYMS

 - Avoid false __KSYM___this_module define in include/generated/autoksyms.h

* tag 'kbuild-fixes-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: Ignore __this_module in gen_autoksyms.sh
  kbuild: link vmlinux only once for CONFIG_TRIM_UNUSED_KSYMS (2nd attempt)
  Documentation/llvm: Update Supported Arch table
  modpost: fix section mismatch check for exported init/exit sections

3 years agoMerge tag 'exfat-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/linki...
Linus Torvalds [Sun, 26 Jun 2022 15:41:04 +0000 (08:41 -0700)]
Merge tag 'exfat-for-5.19-rc4' of git://git./linux/kernel/git/linkinjeon/exfat

Pull exfat fix from Namjae Jeon:

 - Use updated exfat_chain directly instead of snapshot values in
   rename.

* tag 'exfat-for-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: use updated exfat_chain directly during renaming

3 years agoMerge tag '5.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 26 Jun 2022 15:34:52 +0000 (08:34 -0700)]
Merge tag '5.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs client fixes from Steve French:
 "Fixes addressing important multichannel, and reconnect issues.

  Multichannel mounts when the server network interfaces changed, or ip
  addresses changed, uncovered problems, especially in reconnect, but
  the patches for this were held up until recently due to some lock
  conflicts that are now addressed.

  Included in this set of fixes:

   - three fixes relating to multichannel reconnect, dynamically
     adjusting the list of server interfaces to avoid problems during
     reconnect

   - a lock conflict fix related to the above

   - two important fixes for negotiate on secondary channels (null
     netname can unintentionally cause multichannel to be disabled to
     some servers)

   - a reconnect fix (reporting incorrect IP address in some cases)"

* tag '5.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update cifs_ses::ip_addr after failover
  cifs: avoid deadlocks while updating iface
  cifs: periodically query network interfaces from server
  cifs: during reconnect, update interface if necessary
  cifs: change iface_list from array to sorted linked list
  smb3: use netname when available on secondary channels
  smb3: fix empty netname context on secondary channels

3 years agotools headers UAPI: Synch KVM's svm.h header with the kernel
Arnaldo Carvalho de Melo [Mon, 21 Dec 2020 23:04:45 +0000 (20:04 -0300)]
tools headers UAPI: Synch KVM's svm.h header with the kernel

To pick up the changes from:

  d5af44dde5461d12 ("x86/sev: Provide support for SNP guest request NAEs")
  0afb6b660a6b58cb ("x86/sev: Use SEV-SNP AP creation to start secondary CPUs")
  dc3f3d2474b80eae ("x86/mm: Validate memory when changing the C-bit")
  cbd3d4f7c4e5a93e ("x86/sev: Check SEV-SNP features support")

That gets these new SVM exit reasons:

+       { SVM_VMGEXIT_PSC,              "vmgexit_page_state_change" }, \
+       { SVM_VMGEXIT_GUEST_REQUEST,    "vmgexit_guest_request" }, \
+       { SVM_VMGEXIT_EXT_GUEST_REQUEST, "vmgexit_ext_guest_request" }, \
+       { SVM_VMGEXIT_AP_CREATION,      "vmgexit_ap_creation" }, \
+       { SVM_VMGEXIT_HV_FEATURES,      "vmgexit_hypervisor_feature" }, \

Addressing this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/svm.h' differs from latest version at 'arch/x86/include/uapi/asm/svm.h'
  diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h

This causes these changes:

  CC      /tmp/build/perf-urgent/arch/x86/util/kvm-stat.o
  LD      /tmp/build/perf-urgent/arch/x86/util/perf-in.o
  LD      /tmp/build/perf-urgent/arch/x86/perf-in.o
  LD      /tmp/build/perf-urgent/arch/perf-in.o
  LD      /tmp/build/perf-urgent/perf-in.o
  LINK    /tmp/build/perf-urgent/perf

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agotools include UAPI: Sync linux/vhost.h with the kernel sources
Arnaldo Carvalho de Melo [Tue, 14 Apr 2020 12:12:55 +0000 (09:12 -0300)]
tools include UAPI: Sync linux/vhost.h with the kernel sources

To get the changes in:

  84d7c8fd3aade2fe ("vhost-vdpa: introduce uAPI to set group ASID")
  2d1fcb7758e49fd9 ("vhost-vdpa: uAPI to get virtqueue group id")
  a0c95f201170bd55 ("vhost-vdpa: introduce uAPI to get the number of address spaces")
  3ace88bd37436abc ("vhost-vdpa: introduce uAPI to get the number of virtqueue groups")
  175d493c3c3e09a3 ("vhost: move the backend feature bits to vhost_types.h")

Silencing this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
  diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h

To pick up these changes and support them:

  $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before
  $ cp include/uapi/linux/vhost.h tools/include/uapi/linux/vhost.h
  $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after
  $ diff -u before after
  --- before 2022-06-26 12:04:35.982003781 -0300
  +++ after 2022-06-26 12:04:43.819972476 -0300
  @@ -28,6 +28,7 @@
    [0x74] = "VDPA_SET_CONFIG",
    [0x75] = "VDPA_SET_VRING_ENABLE",
    [0x77] = "VDPA_SET_CONFIG_CALL",
  + [0x7C] = "VDPA_SET_GROUP_ASID",
   };
   static const char *vhost_virtio_ioctl_read_cmds[] = {
    [0x00] = "GET_FEATURES",
  @@ -39,5 +40,8 @@
    [0x76] = "VDPA_GET_VRING_NUM",
    [0x78] = "VDPA_GET_IOVA_RANGE",
    [0x79] = "VDPA_GET_CONFIG_SIZE",
  + [0x7A] = "VDPA_GET_AS_NUM",
  + [0x7B] = "VDPA_GET_VRING_GROUP",
    [0x80] = "VDPA_GET_VQS_COUNT",
  + [0x81] = "VDPA_GET_GROUP_NUM",
   };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Gautam Dawar <gautam.dawar@xilinx.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Yrh3xMYbfeAD0MFL@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Enable ignore_missing_thread
Gang Li [Wed, 22 Jun 2022 03:00:37 +0000 (11:00 +0800)]
perf stat: Enable ignore_missing_thread

perf already support ignore_missing_thread for -p, but not yet
applied to `perf stat -p <pid>`. This patch enables ignore_missing_thread
for `perf stat -p <pid>`.

Committer notes:

And here is a refresher about the 'ignore_missing_thread' knob, from a
previous patch using it:

  ca8000684ec4e66f ("perf evsel: Enable ignore_missing_thread for pid option")

  ---
    While monitoring a multithread process with pid option, perf sometimes
    may return sys_perf_event_open failure with 3(No such process) if any of
    the process's threads die before we open the event. However, we want
    perf continue monitoring the remaining threads and do not exit with
    error.
  ---

Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220622030037.15005-1-ligang.bdlg@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf inject: Adjust output data offset for backward compatibility
Raul Silvera [Tue, 21 Jun 2022 15:27:25 +0000 (15:27 +0000)]
perf inject: Adjust output data offset for backward compatibility

When 'perf inject' creates a new file, it reuses the data offset from
the input file. If there has been a change on the size of the header, as
happened in v5.12 -> v5.13, the new offsets will be wrong, resulting in
a corrupted output file.

This change adds the function perf_session__data_offset to compute the
data offset based on the current header size, and uses that instead of
the offset from the original input file.

Signed-off-by: Raul Silvera <rsilvera@google.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Colin Ian King <colin.king@intel.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220621152725.2668041-1-rsilvera@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf trace beauty: Fix generation of errno id->str table on ALT Linux
Arnaldo Carvalho de Melo [Tue, 21 Jun 2022 15:34:37 +0000 (12:34 -0300)]
perf trace beauty: Fix generation of errno id->str table on ALT Linux

For some reason using:

         cat <<EoFuncBegin
  static const char *errno_to_name__$arch(int err)
  {
         switch (err) {
  EoFuncBegin

In tools/perf/trace/beauty/arch_errno_names.sh isn't working on ALT
Linux sisyphus (development version), which could be some distro
specific glitch, so just get this done in an alternative way that works
everywhere while giving notice to the people working on that distro to
try and figure our what really took place.

Cc: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf build-id: Fix caching files with a wrong build ID
Adrian Hunter [Tue, 21 Jun 2022 12:51:44 +0000 (15:51 +0300)]
perf build-id: Fix caching files with a wrong build ID

Build ID events associate a file name with a build ID.  However, when
using perf inject, there is no guarantee that the file on the current
machine at the current time has that build ID. Fix by comparing the
build IDs and skip adding to the cache if they are different.

Example:

  $ echo "int main() {return 0;}" > prog.c
  $ gcc -o prog prog.c
  $ perf record --buildid-all ./prog
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.019 MB perf.data ]
  $ file-buildid() { file $1 | awk -F= '{print $2}' | awk -F, '{print $1}' ; }
  $ file-buildid prog
  444ad9be165d8058a48ce2ffb4e9f55854a3293e
  $ file-buildid ~/.debug/$(pwd)/prog/444ad9be165d8058a48ce2ffb4e9f55854a3293e/elf
  444ad9be165d8058a48ce2ffb4e9f55854a3293e
  $ echo "int main() {return 1;}" > prog.c
  $ gcc -o prog prog.c
  $ file-buildid prog
  885524d5aaa24008a3e2b06caa3ea95d013c0fc5

Before:

  $ perf buildid-cache --purge $(pwd)/prog
  $ perf inject -i perf.data -o junk
  $ file-buildid ~/.debug/$(pwd)/prog/444ad9be165d8058a48ce2ffb4e9f55854a3293e/elf
  885524d5aaa24008a3e2b06caa3ea95d013c0fc5
  $

After:

  $ perf buildid-cache --purge $(pwd)/prog
  $ perf inject -i perf.data -o junk
  $ file-buildid ~/.debug/$(pwd)/prog/444ad9be165d8058a48ce2ffb4e9f55854a3293e/elf

  $

Fixes: 454c407ec17a0c63 ("perf: add perf-inject builtin")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: https://lore.kernel.org/r/20220621125144.5623-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agotools headers cpufeatures: Sync with the kernel sources
Arnaldo Carvalho de Melo [Thu, 1 Jul 2021 16:39:15 +0000 (13:39 -0300)]
tools headers cpufeatures: Sync with the kernel sources

To pick the changes from:

  d6d0c7f681fda1d0 ("x86/cpufeatures: Add PerfMonV2 feature bit")
  296d5a17e793956f ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
  f30903394eb62316 ("x86/cpufeatures: Add virtual TSC_AUX feature bit")
  8ad7e8f696951f19 ("x86/fpu/xsave: Support XSAVEC in the kernel")
  59bd54a84d15e933 ("x86/tdx: Detect running as a TDX guest in early boot")
  a77d41ac3a0f41c8 ("x86/cpufeatures: Add AMD Fam19h Branch Sampling feature")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
  diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YrDkgmwhLv+nKeOo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agotools headers UAPI: Sync drm/i915_drm.h with the kernel sources
Arnaldo Carvalho de Melo [Sat, 13 Nov 2021 14:08:31 +0000 (11:08 -0300)]
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources

To pick up the changes in:

  ecf8eca51f33dbfd ("drm/i915/xehp: Add compute engine ABI")
  991b4de3275728fd ("drm/i915/uapi: Add kerneldoc for engine class enum")
  c94fde8f516610b0 ("drm/i915/uapi: Add DRM_I915_QUERY_GEOMETRY_SUBSLICES")
  1c671ad753dbbf5f ("drm/i915/doc: Link query items to their uapi structs")
  a2e5402691e23269 ("drm/i915/doc: Convert perf UAPI comments to kerneldoc")
  462ac1cdf4d7acf1 ("drm/i915/doc: Convert drm_i915_query_topology_info comment to kerneldoc")
  034d47b25b2ce627 ("drm/i915/uapi: Document DRM_I915_QUERY_HWCONFIG_BLOB")
  78e1fb3112c0ac44 ("drm/i915/uapi: Add query for hwconfig blob")

That don't add any new ioctl, so no changes in tooling.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
  diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h

Cc: John Harrison <John.C.Harrison@intel.com>
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://lore.kernel.org/lkml/YrDi4ALYjv9Mdocq@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf inject: Fix missing free in copy_kcore_dir()
Adrian Hunter [Mon, 20 Jun 2022 10:39:04 +0000 (13:39 +0300)]
perf inject: Fix missing free in copy_kcore_dir()

Free string allocated by asprintf().

Fixes: d8fc08550929bb84 ("perf inject: Keep a copy of kcore_dir")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220620103904.7960-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoparisc: Enable ARCH_HAS_STRICT_MODULE_RWX
Helge Deller [Sun, 26 Jun 2022 09:50:43 +0000 (11:50 +0200)]
parisc: Enable ARCH_HAS_STRICT_MODULE_RWX

Fix a boot crash on a c8000 machine as reported by Dave.  Basically it changes
patch_map() to return an alias mapping to the to-be-patched code in order to
prevent writing to write-protected memory.

Signed-off-by: Helge Deller <deller@gmx.de>
Suggested-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # v5.2+
Link: https://lore.kernel.org/all/e8ec39e8-25f8-e6b4-b7ed-4cb23efc756e@bell.net/
3 years agoparisc: Fix flush_anon_page on PA8800/PA8900
John David Anglin [Sat, 18 Jun 2022 15:14:34 +0000 (15:14 +0000)]
parisc: Fix flush_anon_page on PA8800/PA8900

Anonymous pages are allocated with the shared mappings colouring,
SHM_COLOUR. Since the alias boundary on machines with PA8800 and
PA8900 processors is unknown, flush_user_cache_page() might not
flush all mappings of a shared anonymous page. Flushing the whole
data cache flushes all mappings.

This won't fix all coherency issues with shared mappings but it
seems to work well in practice.  I haven't seen any random memory
faults in almost a month on a rp3440 running as a debian buildd
machine.

There is a small preformance hit.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.18+
3 years agoparisc: align '*' in comment in math-emu code
Jiang Jian [Tue, 21 Jun 2022 06:38:23 +0000 (14:38 +0800)]
parisc: align '*' in comment in math-emu code

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Helge Deller <deller@gmx.de>
3 years agokbuild: Ignore __this_module in gen_autoksyms.sh
Sami Tolvanen [Thu, 16 Jun 2022 19:57:59 +0000 (19:57 +0000)]
kbuild: Ignore __this_module in gen_autoksyms.sh

Module object files can contain an undefined reference to __this_module,
which isn't resolved until we link the final .ko. The kernel doesn't
export this symbol, so ignore it in gen_autoksyms.sh.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Steve Muckle <smuckle@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Ramji Jiyani <ramjiyani@google.com>
3 years agokbuild: link vmlinux only once for CONFIG_TRIM_UNUSED_KSYMS (2nd attempt)
Masahiro Yamada [Thu, 23 Jun 2022 19:11:47 +0000 (04:11 +0900)]
kbuild: link vmlinux only once for CONFIG_TRIM_UNUSED_KSYMS (2nd attempt)

If CONFIG_TRIM_UNUSED_KSYMS is enabled and the kernel is built from
a pristine state, the vmlinux is linked twice.

Commit 3fdc7d3fe4c0 ("kbuild: link vmlinux only once for
CONFIG_TRIM_UNUSED_KSYMS") explains why this happens, but it did not fix
the issue at all.

Now I realized I had applied a wrong patch.

In v1 patch [1], the autoksyms_recursive target correctly recurses to
"$(MAKE) -f $(srctree)/Makefile autoksyms_recursive".

In v2 patch [2], I accidentally dropped the diff line, and it recurses to
"$(MAKE) -f $(srctree)/Makefile vmlinux".

Restore the code I intended in v1.

[1]: https://lore.kernel.org/linux-kbuild/1521045861-22418-8-git-send-email-yamada.masahiro@socionext.com/
[2]: https://lore.kernel.org/linux-kbuild/1521166725-24157-8-git-send-email-yamada.masahiro@socionext.com/

Fixes: 3fdc7d3fe4c0 ("kbuild: link vmlinux only once for CONFIG_TRIM_UNUSED_KSYMS")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
3 years agoMerge tag 'char-misc-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Sat, 25 Jun 2022 17:07:36 +0000 (10:07 -0700)]
Merge tag 'char-misc-5.19-rc4' of git://git./linux/kernel/git/gregkh/char-misc

Pull IIO driver fixes from Greg KH:
 "Here are a set of IIO driver fixes for 5.19-rc4. Jonathan said it best
  in his pull request to me, so I'll just quote it here below, as that's
  the only changes we have right now for the char-misc driver tree:

  testing:
      - Fix a missing MODULE_LICENSE() warning by restricting possible
        build configs.

  Various drivers:
      - Fix ordering of iio_get_trigger() being called before
        iio_trigger_register()

  adi,admv1014:
      - Fix dubious x & !y warning.

  adi,axi-adc:
      - Fix missing of_node_put() in error and normal paths.

  aspeed,adc:
      - Add missing of_node_put()

  fsl,mma8452:
      - Fix broken probing from device tree.
      - Drop check on return value of i2c write to device to cause reset
        as ACK will be missing (device reset before sending it).

  fsl,vf610:
      - Fix documentation of in_conversion_mode ABI.

  iio-trig-sysfs:
      - Ensure irq work has finished before freeing the trigger.

  invensense,mpu3050:
      - Disable regulators in error path.

  invensense,icm42600:
      - Fix collision of enum value of 0 with error path where 0 is no
        match.

  renesas,rzg2l_Adc:
      - Add missing fwnode_handle_put() in error path.

  rescale:
      - Fix a boolean logic bug for detection of raw + scale affecting
        an obscure corner case.

  semtech,sx9324:
      - Check return value of read of pin_defs

  st,stm32-adc:
      - Fix interaction across ADC instances for some supported devices.
      - Drop false spurious IRQ messages.
      - Fix calibration value handling. If we can't calibrate don't
        expose the vref_int channel.
      - Fix maximum clock rate for stm32pm15x

  ti,ads131e08:
      - Add missing fwnode_handle_put() in error paths.

  xilinx,ams:
      - Fix variable checked for error from platform_get_irq()

  x-powers,axp288:
      - Overide TS_PIN bias current for boards where it is not correctly
        initialized.

  yamaha,yas530:
      - Fix inverted check on calibration data being all zeros.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'char-misc-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (26 commits)
  iio:proximity:sx9324: Check ret value of device_property_read_u32_array()
  iio: accel: mma8452: ignore the return value of reset operation
  iio: adc: stm32: fix maximum clock rate for stm32mp15x
  iio: adc: stm32: fix vrefint wrong calibration value handling
  iio: imu: inv_icm42600: Fix broken icm42600 (chip id 0 value)
  iio: adc: vf610: fix conversion mode sysfs node name
  iio: adc: adi-axi-adc: Fix refcount leak in adi_axi_adc_attach_client
  iio: test: fix missing MODULE_LICENSE for IIO_RESCALE=m
  iio:humidity:hts221: rearrange iio trigger get and register
  iio:chemical:ccs811: rearrange iio trigger get and register
  iio:accel:mxc4005: rearrange iio trigger get and register
  iio:accel:kxcjk-1013: rearrange iio trigger get and register
  iio:accel:bma180: rearrange iio trigger get and register
  iio: afe: rescale: Fix boolean logic bug
  iio: adc: aspeed: Fix refcount leak in aspeed_adc_set_trim_data
  iio: adc: stm32: Fix IRQs on STM32F4 by removing custom spurious IRQs message
  iio: adc: stm32: Fix ADCs iteration in irq handler
  iio: adc: ti-ads131e08: add missing fwnode_handle_put() in ads131e08_alloc_channels()
  iio: adc: rzg2l_adc: add missing fwnode_handle_put() in rzg2l_adc_parse_properties()
  iio: trigger: sysfs: fix use-after-free on remove
  ...

3 years agoMerge tag 'usb-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sat, 25 Jun 2022 17:02:05 +0000 (10:02 -0700)]
Merge tag 'usb-5.19-rc4' of git://git./linux/kernel/git/gregkh/usb

Pull USB driver fixes from Greg KH:
 "Here are some small USB driver fixes and new device ids for 5.19-rc4
  for a few small reported issues. They include:

   - new usb-serial driver ids

   - MAINTAINERS file update to properly catch the USB dts files

   - dt-bindings fixes for reported build warnings

   - xhci driver fixes for reported problems

   - typec Kconfig dependancy fix

   - raw_gadget fuzzing fixes found by syzbot

   - chipidea driver bugfix

   - usb gadget uvc bugfix

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: chipidea: udc: check request status before setting device address
  USB: gadget: Fix double-free bug in raw_gadget driver
  xhci-pci: Allow host runtime PM as default for Intel Meteor Lake xHCI
  xhci-pci: Allow host runtime PM as default for Intel Raptor Lake xHCI
  xhci: turn off port power in shutdown
  xhci: Keep interrupt disabled in initialization until host is running.
  USB: serial: option: add Quectel RM500K module support
  USB: serial: option: add Quectel EM05-G modem
  USB: serial: pl2303: add support for more HXN (G) types
  usb: typec: wcove: Drop wrong dependency to INTEL_SOC_PMIC
  usb: gadget: uvc: fix list double add in uvcg_video_pump
  dt-bindings: usb: ehci: Increase the number of PHYs
  dt-bindings: usb: ohci: Increase the number of PHYs
  usb: gadget: Fix non-unique driver names in raw-gadget driver
  MAINTAINERS: add include/dt-bindings/usb to USB SUBSYSTEM
  USB: serial: option: add Telit LE910Cx 0x1250 composition