linux-block.git
7 months ago powerpc64/ftrace: Move ftrace sequence out of line
Naveen N Rao [Wed, 30 Oct 2024 07:08:45 +0000 (12:38 +0530)]
powerpc64/ftrace: Move ftrace sequence out of line

Function profile sequence on powerpc includes two instructions at the
beginning of each function:
  mflr r0
  bl ftrace_caller

The call to ftrace_caller() gets nop'ed out during kernel boot and is
patched in when ftrace is enabled.

Given the sequence, we cannot return from ftrace_caller with 'blr' as we
need to keep LR and r0 intact. This results in link stack (return
address predictor) imbalance when ftrace is enabled. To address that, we
would like to use a three instruction sequence:
  mflr r0
  bl ftrace_caller
  mtlr r0

Furthermore, to support DYNAMIC_FTRACE_WITH_CALL_OPS, we need to
reserve two instruction slots before the function. This results in a
total of five instruction slots to be reserved for ftrace use on each
function that is traced.

Move the function profile sequence out-of-line to minimize its impact.
To do this, we reserve a single nop at function entry using
-fpatchable-function-entry=1 and add a pass on vmlinux.o to determine
the total number of functions that can be traced. This is then used to
generate a .S file reserving the appropriate amount of space for use as
ftrace stubs, which is built and linked into vmlinux.

On bootup, the stub space is split into separate stubs per function and
populated with the proper instruction sequence. A pointer to the
associated stub is maintained in dyn_arch_ftrace.

For modules, space for ftrace stubs is reserved from the generic module
stub space.

This is restricted to and enabled by default only on 64-bit powerpc,
though there are some changes to accommodate 32-bit powerpc. This is
done so that 32-bit powerpc could choose to opt into this based on
further tests and benchmarks.

As an example, after this patch, kernel functions will have a single nop
at function entry:
  <kernel_clone>:
    addis r2,r12,467
    addi r2,r2,-16028
    nop
    mfocrf r11,8
    ...

When ftrace is enabled, the nop is converted to an unconditional branch
to the stub associated with that function:
  <kernel_clone>:
    addis r2,r12,467
    addi r2,r2,-16028
    b ftrace_ool_stub_text_end+0x11b28
    mfocrf r11,8
    ...

The associated stub:
  <ftrace_ool_stub_text_end+0x11b28>:
    mflr r0
    bl ftrace_caller
    mtlr r0
    b kernel_clone+0xc
    ...

This change showed an improvement of ~10% in null_syscall benchmark on a
Power 10 system with ftrace enabled.
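
The branch that replaces the nop and the branch that returns from the
stub are plain relative branches. As a rough illustration only (Python
rather than kernel code, with hypothetical helper names), such an
instruction word can be computed from the standard I-form encoding
(primary opcode 18, word-aligned signed 26-bit byte offset):

```python
PPC_NOP = 0x60000000          # ori r0,r0,0
BRANCH_OPCODE = 18 << 26      # primary opcode of the I-form 'b' instruction
BRANCH_RANGE = 1 << 25        # signed 26-bit byte offset: -32MB .. +32MB-4


def encode_branch(src: int, dst: int) -> int:
    """Return the instruction word for 'b dst' placed at address src."""
    offset = dst - src
    if offset % 4:
        raise ValueError("branch target must be word aligned")
    if not -BRANCH_RANGE <= offset < BRANCH_RANGE:
        raise ValueError("target out of relative branch range")
    # The LI field occupies bits 2..25; AA and LK (low two bits) stay clear.
    return BRANCH_OPCODE | (offset & 0x03FFFFFC)


def decode_branch_target(src: int, insn: int) -> int:
    """Inverse of encode_branch(): recover the absolute target address."""
    offset = insn & 0x03FFFFFC
    if offset & 0x02000000:   # sign-extend the 26-bit offset
        offset -= 1 << 26
    return src + offset
```

Under these assumptions, the entry nop at kernel_clone+0x8 would become
encode_branch(func + 0x8, stub), and the last stub instruction would be
encode_branch(stub + 0xc, func + 0xc).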

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-13-hbathini@linux.ibm.com
7 months ago kbuild: Add generic hook for architectures to use before the final vmlinux link
Naveen N Rao [Wed, 30 Oct 2024 07:08:44 +0000 (12:38 +0530)]
kbuild: Add generic hook for architectures to use before the final vmlinux link

On powerpc, we would like to be able to make a pass on vmlinux.o and
generate a new object file to be linked into vmlinux. Add a generic pass
in Makefile.vmlinux that architectures can use for this purpose.

Architectures need to select CONFIG_ARCH_WANTS_PRE_LINK_VMLINUX and must
provide arch/<arch>/tools/Makefile with .arch.vmlinux.o target, which
will be invoked prior to the final vmlinux link step.

Acked-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-12-hbathini@linux.ibm.com
7 months ago powerpc/ftrace: Add a postlink script to validate function tracer
Naveen N Rao [Wed, 30 Oct 2024 07:08:43 +0000 (12:38 +0530)]
powerpc/ftrace: Add a postlink script to validate function tracer

Function tracer on powerpc can only work with vmlinux having a .text
size of up to ~64MB due to powerpc branch instruction having a limited
relative branch range of 32MB. Today, this is only detected on kernel
boot when ftrace is init'ed. Add a post-link script to check the size of
.text so that we can detect this at build time, and break the build if
necessary.
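
The idea behind such a build-time check can be sketched as below. This
is not the actual post-link script: the function name is hypothetical,
the addresses would come from the linked vmlinux, and the 64MB limit is
simply the approximate figure quoted above.

```python
MAX_FTRACE_TEXT_SIZE = 64 << 20   # assumed limit, taken from the "~64MB" above


def check_text_size(text_start: int, text_end: int) -> None:
    """Raise SystemExit (failing the build) if .text exceeds the assumed limit."""
    size = text_end - text_start
    if size > MAX_FTRACE_TEXT_SIZE:
        raise SystemExit(
            f".text is {size >> 20}MB, too large for the function tracer"
        )
```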

We add a dependency on !COMPILE_TEST for CONFIG_HAVE_FUNCTION_TRACER so
that allyesconfig and other test builds can continue to work without
enabling ftrace.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-11-hbathini@linux.ibm.com
7 months ago powerpc64/bpf: Fold bpf_jit_emit_func_call_hlp() into bpf_jit_emit_func_call_rel()
Naveen N Rao [Wed, 30 Oct 2024 07:08:42 +0000 (12:38 +0530)]
powerpc64/bpf: Fold bpf_jit_emit_func_call_hlp() into bpf_jit_emit_func_call_rel()

Commit 61688a82e047 ("powerpc/bpf: enable kfunc call") enhanced
bpf_jit_emit_func_call_hlp() to handle calls out to module region, where
bpf progs are generated. The only difference now between
bpf_jit_emit_func_call_hlp() and bpf_jit_emit_func_call_rel() is in
handling of the initial pass where target function address is not known.
Fold that logic into bpf_jit_emit_func_call_hlp() and rename it to
bpf_jit_emit_func_call_rel() to simplify bpf function call JIT code.

We don't actually need to load/restore TOC across a call out to a
different kernel helper or to a different bpf program since they all
work with the kernel TOC. We only need to do it if we have to call out
to a module function. So, guard TOC load/restore with appropriate
conditions.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-10-hbathini@linux.ibm.com
7 months ago powerpc/ftrace: Move ftrace stub used for init text before _einittext
Naveen N Rao [Wed, 30 Oct 2024 07:08:41 +0000 (12:38 +0530)]
powerpc/ftrace: Move ftrace stub used for init text before _einittext

Move the ftrace stub used to cover inittext before _einittext so that it
is within kernel text, as seen through core_kernel_text(). This is
required for a subsequent change to ftrace.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-9-hbathini@linux.ibm.com
7 months ago powerpc/ftrace: Skip instruction patching if the instructions are the same
Naveen N Rao [Wed, 30 Oct 2024 07:08:40 +0000 (12:38 +0530)]
powerpc/ftrace: Skip instruction patching if the instructions are the same

To simplify upcoming changes to ftrace, add a check to skip actual
instruction patching if the old and new instructions are the same. We
still validate that the instruction is what we expect, but don't
actually patch the same instruction again.
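
A minimal model of that behaviour, for illustration only: the dict
stands in for kernel text and the function name is hypothetical, not
the kernel's own helper.

```python
def modify_instruction(text: dict, addr: int, old: int, new: int) -> bool:
    """Replace `old` with `new` at `addr`; return True only if a write happened.

    `text` is a hypothetical stand-in for kernel text: address -> instruction.
    """
    current = text[addr]
    if current != old:
        # Validation still happens, even when old == new.
        raise ValueError(f"unexpected instruction {current:#010x} at {addr:#x}")
    if old == new:
        return False          # nothing to do, skip the patching step
    text[addr] = new
    return True
```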

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-8-hbathini@linux.ibm.com
7 months ago powerpc/ftrace: Remove pointer to struct module from dyn_arch_ftrace
Naveen N Rao [Wed, 30 Oct 2024 07:08:39 +0000 (12:38 +0530)]
powerpc/ftrace: Remove pointer to struct module from dyn_arch_ftrace

Pointer to struct module is only relevant for ftrace records belonging
to kernel modules. Having this field in dyn_arch_ftrace wastes memory
for all ftrace records belonging to the kernel. Remove the same in
favour of looking up the module from the ftrace record address, similar
to other architectures.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-7-hbathini@linux.ibm.com
7 months ago powerpc/module_64: Convert #ifdef to IS_ENABLED()
Naveen N Rao [Wed, 30 Oct 2024 07:08:38 +0000 (12:38 +0530)]
powerpc/module_64: Convert #ifdef to IS_ENABLED()

Minor refactor for converting #ifdef to IS_ENABLED().

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-6-hbathini@linux.ibm.com
7 months ago powerpc32/ftrace: Unify 32-bit and 64-bit ftrace entry code
Naveen N Rao [Wed, 30 Oct 2024 07:08:37 +0000 (12:38 +0530)]
powerpc32/ftrace: Unify 32-bit and 64-bit ftrace entry code

On 32-bit powerpc, gcc generates a three instruction sequence for
function profiling:
  mflr r0
  stw r0, 4(r1)
  bl _mcount

On kernel boot, the call to _mcount() is nop-ed out, to be patched back
in when ftrace is actually enabled. The 'stw' instruction therefore is
not necessary unless ftrace is enabled. Nop it out during ftrace init.

When ftrace is enabled, we want the 'stw' so that stack unwinding works
properly. Perform the same within the ftrace handler, similar to 64-bit
powerpc.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-5-hbathini@linux.ibm.com
7 months ago powerpc64/ftrace: Nop out additional 'std' instruction emitted by gcc v5.x
Naveen N Rao [Wed, 30 Oct 2024 07:08:36 +0000 (12:38 +0530)]
powerpc64/ftrace: Nop out additional 'std' instruction emitted by gcc v5.x

Gcc v5.x emits a 3-instruction sequence for -mprofile-kernel:
  mflr r0
  std r0, 16(r1)
  bl _mcount

Gcc v6.x moved to a simpler 2-instruction sequence by removing the 'std'
instruction. The store saved the return address in the LR save area in
the caller stack frame for stack unwinding. However, with dynamic
ftrace, we no longer have a call to _mcount on kernel boot when ftrace
is not enabled. When ftrace is enabled, that store is performed within
ftrace_caller(). As such, the additional 'std' instruction is redundant.
Nop it out on kernel boot.

With this change, we now use the same 2-instruction profiling sequence
with both -mprofile-kernel, as well as -fpatchable-function-entry on
64-bit powerpc.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-4-hbathini@linux.ibm.com
7 months ago powerpc/kprobes: Use ftrace to determine if a probe is at function entry
Naveen N Rao [Wed, 30 Oct 2024 07:08:35 +0000 (12:38 +0530)]
powerpc/kprobes: Use ftrace to determine if a probe is at function entry

Rather than hard-coding the offset into a function to be used to
determine if a kprobe is at function entry, use ftrace_location() to
determine the ftrace location within the function and categorize all
instructions till that offset to be function entry.

For functions that cannot be traced, we fall back to using a fixed
offset of 8 (two instructions) to categorize a probe as being at
function entry for 64-bit elfv2, unless we are using pcrel.
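
The decision can be sketched roughly as follows. The function and
parameter names are hypothetical; treating the ftrace offset itself as
part of function entry, and falling back to offset 0 in all remaining
cases, are assumptions made for this illustration.

```python
from typing import Optional


def is_function_entry(offset: int, ftrace_offset: Optional[int],
                      elfv2: bool, pcrel: bool) -> bool:
    """Classify a probe at `offset` bytes into a function.

    `ftrace_offset` is the offset of the ftrace location within the
    function, or None if the function cannot be traced.
    """
    if ftrace_offset is not None:
        # Everything up to the ftrace location counts as function entry.
        return offset <= ftrace_offset
    if elfv2 and not pcrel:
        # Fixed fallback: two instructions (8 bytes) of TOC setup.
        return offset <= 8
    # Assumed fallback for everything else: only the first instruction.
    return offset == 0
```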

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-3-hbathini@linux.ibm.com
7 months ago powerpc/trace: Account for -fpatchable-function-entry support by toolchain
Naveen N Rao [Wed, 30 Oct 2024 07:08:34 +0000 (12:38 +0530)]
powerpc/trace: Account for -fpatchable-function-entry support by toolchain

So far, we have relied on the fact that gcc supports both
-mprofile-kernel, as well as -fpatchable-function-entry, and clang
supports neither. Our Makefile only checks for CONFIG_MPROFILE_KERNEL to
decide which files to build. Clang has a feature request out [*] to
implement -fpatchable-function-entry, and is unlikely to support
-mprofile-kernel.

Update our Makefile checks so that we pick up the correct files to build
once clang picks up support for -fpatchable-function-entry.

[*] https://github.com/llvm/llvm-project/issues/57031

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-2-hbathini@linux.ibm.com
7 months ago powerpc/64: Remove maple platform
Michael Ellerman [Sun, 13 Oct 2024 10:29:57 +0000 (21:29 +1100)]
powerpc/64: Remove maple platform

The maple platform was added in 2004 [1], to support the "Maple" 970FX
evaluation board.

It was later used for IBM JS20/JS21 machines, as well as the Bimini
machine, aka "Yellow Dog Powerstation".

Sadly all those machines have passed into memory, and there's been no
evidence for years that anyone is still using any of them.

Remove the platform and related code. It can always be reinstated if
there's interest.

Note that this has no impact on support for 970FX based Power Macs.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux-fullhistory.git/commit/?id=f0d068d65c5e555ffcfbc189de32598f6f00770c

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241013102957.548291-1-mpe@ellerman.id.au
7 months ago powerpc/boot: Remove bogus reference to lilo
Michael Ellerman [Wed, 9 Oct 2024 05:38:06 +0000 (16:38 +1100)]
powerpc/boot: Remove bogus reference to lilo

The help text refers to lilo, but the install script does not run lilo
and never has. The reference to lilo seems to have come originally from
arch/ppc/Makefile, but it was not true there either.

Remove it.

Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://fosstodon.org/@kernellogger/113032940928131612
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241009053806.135807-1-mpe@ellerman.id.au
7 months ago powerpc/pseries: Fix dtl_access_lock to be a rw_semaphore
Michael Ellerman [Mon, 19 Aug 2024 12:24:01 +0000 (22:24 +1000)]
powerpc/pseries: Fix dtl_access_lock to be a rw_semaphore

The dtl_access_lock needs to be a rw_semaphore, a sleeping lock, because
the code calls kmalloc() while holding it, which can sleep:

  # echo 1 > /proc/powerpc/vcpudispatch_stats
  BUG: sleeping function called from invalid context at include/linux/sched/mm.h:337
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 199, name: sh
  preempt_count: 1, expected: 0
  3 locks held by sh/199:
   #0: c00000000a0743f8 (sb_writers#3){.+.+}-{0:0}, at: vfs_write+0x324/0x438
   #1: c0000000028c7058 (dtl_enable_mutex){+.+.}-{3:3}, at: vcpudispatch_stats_write+0xd4/0x5f4
   #2: c0000000028c70b8 (dtl_access_lock){+.+.}-{2:2}, at: vcpudispatch_stats_write+0x220/0x5f4
  CPU: 0 PID: 199 Comm: sh Not tainted 6.10.0-rc4 #152
  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries
  Call Trace:
    dump_stack_lvl+0x130/0x148 (unreliable)
    __might_resched+0x174/0x410
    kmem_cache_alloc_noprof+0x340/0x3d0
    alloc_dtl_buffers+0x124/0x1ac
    vcpudispatch_stats_write+0x2a8/0x5f4
    proc_reg_write+0xf4/0x150
    vfs_write+0xfc/0x438
    ksys_write+0x88/0x148
    system_call_exception+0x1c4/0x5a0
    system_call_common+0xf4/0x258

Fixes: 06220d78f24a ("powerpc/pseries: Introduce rwlock to gatekeep DTLB usage")
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Nysal Jan K.A <nysal@linux.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20240819122401.513203-1-mpe@ellerman.id.au
7 months ago powerpc/machdep: Drop include of dma-mapping.h
Michael Ellerman [Wed, 9 Oct 2024 05:18:26 +0000 (16:18 +1100)]
powerpc/machdep: Drop include of dma-mapping.h

Drop the include of dma-mapping.h in machdep.h, replace it with forward
declarations of struct device and struct pci_dev, and include time64.h
and page.h which are required for time64_t and pgprot_t respectively.

Add direct includes of some other headers to some files that were
getting them via machdep.h.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241009051826.132805-2-mpe@ellerman.id.au
7 months ago powerpc/machdep: Drop include of seq_file.h
Michael Ellerman [Wed, 9 Oct 2024 05:18:25 +0000 (16:18 +1100)]
powerpc/machdep: Drop include of seq_file.h

Drop the include of seq_file.h in machdep.h, replace it with a forward
declaration of struct seq_file, which is all that's required.

Add direct includes of seq_file.h to some files that were getting
seq_file.h via machdep.h.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241009051826.132805-1-mpe@ellerman.id.au
7 months ago powerpc/64: Drop IPI_PRIORITY from asm-offsets
Michael Ellerman [Wed, 9 Oct 2024 05:17:00 +0000 (16:17 +1100)]
powerpc/64: Drop IPI_PRIORITY from asm-offsets

The last use of IPI_PRIORITY in asm was removed in commit 37f55d30df2e
("KVM: PPC: Book3S HV: Convert kvmppc_read_intr to a C function").

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241009051701.132282-1-mpe@ellerman.id.au
7 months ago powerpc: Adjust adding stack protector flags to KBUILD_CLAGS for clang
Nathan Chancellor [Wed, 9 Oct 2024 19:26:09 +0000 (12:26 -0700)]
powerpc: Adjust adding stack protector flags to KBUILD_CLAGS for clang

After fixing the HAVE_STACKPROTECTOR checks for clang's in-progress
per-task stack protector support [1], the build fails during prepare0
because '-mstack-protector-guard-offset' has not been added to
KBUILD_CFLAGS yet but the other '-mstack-protector-guard' flags have.

  clang: error: '-mstack-protector-guard=tls' is used without '-mstack-protector-guard-offset', and there is no default
  clang: error: '-mstack-protector-guard=tls' is used without '-mstack-protector-guard-offset', and there is no default
  make[4]: *** [scripts/Makefile.build:229: scripts/mod/empty.o] Error 1
  make[4]: *** [scripts/Makefile.build:102: scripts/mod/devicetable-offsets.s] Error 1

Mirror other architectures and add all '-mstack-protector-guard' flags
to KBUILD_CFLAGS atomically during stack_protector_prepare, which
resolves the issue and allows clang's implementation to fully work with
the kernel.

Cc: stable@vger.kernel.org # 6.1+
Link: https://github.com/llvm/llvm-project/pull/110928
Reviewed-by: Keith Packard <keithp@keithp.com>
Tested-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241009-powerpc-fix-stackprotector-test-clang-v2-2-12fb86b31857@kernel.org
7 months ago powerpc: Fix stack protector Kconfig test for clang
Nathan Chancellor [Wed, 9 Oct 2024 19:26:08 +0000 (12:26 -0700)]
powerpc: Fix stack protector Kconfig test for clang

Clang's in-progress per-task stack protector support [1] does not work
with the current Kconfig checks because '-mstack-protector-guard-offset'
is not provided, unlike all other architecture Kconfig checks.

  $ fd Kconfig -x rg -l mstack-protector-guard-offset
  ./arch/arm/Kconfig
  ./arch/riscv/Kconfig
  ./arch/arm64/Kconfig

This produces an error from clang, which is interpreted as the flags not
being supported at all when they really are.

  $ clang --target=powerpc64-linux-gnu \
          -mstack-protector-guard=tls \
          -mstack-protector-guard-reg=r13 \
          -c -o /dev/null -x c /dev/null
  clang: error: '-mstack-protector-guard=tls' is used without '-mstack-protector-guard-offset', and there is no default

This argument will always be provided by the build system, so mirror
other architectures and use '-mstack-protector-guard-offset=0' for
testing support, which fixes the issue for clang and does not regress
support with GCC.

Even with the first problem addressed, the 32-bit test continues to fail
because Kbuild uses the powerpc64le-linux-gnu target for clang and
nothing flips the target to 32-bit, resulting in an error about an
invalid register value:

  $ clang --target=powerpc64le-linux-gnu \
          -mstack-protector-guard=tls \
          -mstack-protector-guard-reg=r2 \
          -mstack-protector-guard-offset=0 \
          -x c -c -o /dev/null /dev/null
  clang: error: invalid value 'r2' in 'mstack-protector-guard-reg=', expected one of: r13

While GCC allows arbitrary registers, the implementation of
'-mstack-protector-guard=tls' in LLVM shares the same code path as the
user space thread local storage implementation, which uses a fixed
register (2 for 32-bit and 13 for 64-bit), so the command line parsing
enforces this limitation.

Use the Kconfig macro '$(m32-flag)', which expands to '-m32' when
supported, in the stack protector support cc-option call to properly
switch the target to a 32-bit one, which matches what happens in Kbuild.
While the 64-bit macro does not strictly need it, add the equivalent
64-bit option for symmetry.

Cc: stable@vger.kernel.org # 6.1+
Link: https://github.com/llvm/llvm-project/pull/110928
Reviewed-by: Keith Packard <keithp@keithp.com>
Tested-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241009-powerpc-fix-stackprotector-test-clang-v2-1-12fb86b31857@kernel.org
7 months ago book3s64/hash: Early detect debug_pagealloc size requirement
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:53 +0000 (22:59 +0530)]
book3s64/hash: Early detect debug_pagealloc size requirement

Add a hash_supports_debug_pagealloc() helper to detect whether
debug_pagealloc can be supported on hash. It checks both that the
debug_pagealloc config is enabled and that the linear map fits
within the rma_size/4 region size.

This can then be used early during htab_init_page_sizes() to decide
the linear map page size if hash supports either debug_pagealloc or
kfence.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/c33c6691b2a2cf619cc74ac100118ca4dbf21a48.1729271995.git.ritesh.list@gmail.com
7 months ago book3s64/hash: Disable kfence if not early init
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:52 +0000 (22:59 +0530)]
book3s64/hash: Disable kfence if not early init

Enable kfence on book3s64 hash only when early init is enabled.
This is because kfence could cause the kernel linear map to be mapped
at PAGE_SIZE level instead of 16M (which we presumably don't want).

Also, there is currently no way to:
1. Make multiple page size entries for the SLB used for the kernel
   linear map.
2. Easily get the hash slot details after the page table mapping for
   the kernel linear map is set up. So even if kfence allocates the
   pool in late init, we won't be able to get the hash slot details in
   the kfence linear map.

Thus this patch disables kfence on hash if kfence early init is not
enabled.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/4a6eea8cfd1cd28fccfae067026bff30cbec1d4b.1729271995.git.ritesh.list@gmail.com
7 months ago book3s64/radix: Refactoring common kfence related functions
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:51 +0000 (22:59 +0530)]
book3s64/radix: Refactoring common kfence related functions

Both radix and hash on book3s need to detect whether kfence
early init is enabled. Hash needs to disable kfence
if early init is not enabled, because with kfence the linear map is
mapped using PAGE_SIZE rather than 16M mappings.
We don't support multiple page sizes for the SLB entry used for the
kernel linear map on book3s64.

This patch factors out the common functions required to detect whether
kfence early init is enabled.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/f4a787224fbe5bb787158ace579780c0257f6602.1729271995.git.ritesh.list@gmail.com
7 months ago book3s64/hash: Add kfence functionality
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:50 +0000 (22:59 +0530)]
book3s64/hash: Add kfence functionality

Now that linear map functionality of debug_pagealloc is made generic,
enable kfence to use this generic infrastructure.

1. Define kfence related linear map variables.
   - u8 *linear_map_kf_hash_slots;
   - unsigned long linear_map_kf_hash_count;
   - DEFINE_RAW_SPINLOCK(linear_map_kf_hash_lock);
2. The linear map size allocated in the RMA region is quite small:
   (KFENCE_POOL_SIZE >> PAGE_SHIFT), which is 512 bytes by default.
3. kfence pool memory is reserved using memblock_phys_alloc(), which
   can come from anywhere.
   (default 255 objects => ((1+255) * 2) << PAGE_SHIFT = 32MB)
4. The hash slot information for kfence memory gets added in linear map
   in hash_linear_map_add_slot() (which also adds for debug_pagealloc).

Reported-by: Pavithra Prakash <pavrampu@linux.vnet.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/5c2b61941b344077a2b8654dab46efa0322af3af.1729271995.git.ritesh.list@gmail.com
7 months ago book3s64/hash: Disable debug_pagealloc if it requires more memory
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:49 +0000 (22:59 +0530)]
book3s64/hash: Disable debug_pagealloc if it requires more memory

Limit the size of the linear map allocated in the RMA region to
ppc64_rma_size / 4. If debug_pagealloc requires more memory than that
then do not allocate any memory and disable debug_pagealloc.
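
A rough sketch of that decision, for illustration only. The function
name is hypothetical, and it assumes 64K pages and one byte of linear
map per page of DRAM.

```python
from typing import Optional

PAGE_SHIFT = 16   # assumes 64K pages


def debug_pagealloc_map_size(dram_size: int, rma_size: int) -> Optional[int]:
    """Return the linear map size in bytes, or None if debug_pagealloc
    has to be disabled because the map would exceed rma_size / 4."""
    needed = dram_size >> PAGE_SHIFT      # one byte per page (assumption)
    if needed > rma_size // 4:
        return None                       # do not allocate, disable instead
    return needed
```

With a 512MB RMA, a 1TB system would need 16MB and pass, while a 16TB
system would need 256MB and have debug_pagealloc disabled.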

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/e1ef66f32a1fe63bcbb89d5c11d86c65beef5ded.1729271995.git.ritesh.list@gmail.com
7 months ago book3s64/hash: Make kernel_map_linear_page() generic
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:48 +0000 (22:59 +0530)]
book3s64/hash: Make kernel_map_linear_page() generic

Currently, kernel_map_linear_page() assumes it is working on the
linear_map_hash_slots array. Since later patches need a
separate linear map array for kfence, make
kernel_map_linear_page() take a linear map array and lock as
function arguments.

This is needed to separate out kfence from debug_pagealloc
infrastructure.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/5b67df7b29e68d7c78d6fc1f42d41137299bac6b.1729271995.git.ritesh.list@gmail.com
7 months ago book3s64/hash: Refactor hash__kernel_map_pages() function
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:47 +0000 (22:59 +0530)]
book3s64/hash: Refactor hash__kernel_map_pages() function

This refactors the hash__kernel_map_pages() function to call
hash_debug_pagealloc_map_pages(). This will come in useful when we add
kfence support.

No functionality changes in this patch.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/0cb8ddcccdcf61ea06ab4d92aacd770c16cc0f2c.1729271995.git.ritesh.list@gmail.com
7 months ago book3s64/hash: Add hash_debug_pagealloc_alloc_slots() function
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:46 +0000 (22:59 +0530)]
book3s64/hash: Add hash_debug_pagealloc_alloc_slots() function

This adds hash_debug_pagealloc_alloc_slots() function instead of open
coding that in htab_initialize(). This is required since we will be
separating the kfence functionality to not depend upon debug_pagealloc.

Now that everything required for debug_pagealloc is under an #ifdef
config, bring the linear_map_hash_slots and linear_map_hash_count
variables under the same config too.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/d1d5aabe1e4c693a983e59ccf3de08e3c28c5161.1729271995.git.ritesh.list@gmail.com
7 months ago book3s64/hash: Add hash_debug_pagealloc_add_slot() function
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:45 +0000 (22:59 +0530)]
book3s64/hash: Add hash_debug_pagealloc_add_slot() function

This adds hash_debug_pagealloc_add_slot() function instead of open
coding that in htab_bolt_mapping(). This is required since we will be
separating kfence functionality to not depend upon debug_pagealloc.

No functionality change in this patch.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/026f0aaa1dddd89154dc8d20ceccfca4f63ccf79.1729271995.git.ritesh.list@gmail.com
7 months ago book3s64/hash: Refactor kernel linear map related calls
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:44 +0000 (22:59 +0530)]
book3s64/hash: Refactor kernel linear map related calls

This just brings all linear map related handling into one place instead
of having those functions scattered across the hash_utils file.
This makes it easier to review.

No functionality changes in this patch.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/56c610310aa50b5417976a39c5f15b78bc76c764.1729271995.git.ritesh.list@gmail.com
7 months ago book3s64/hash: Remove kfence support temporarily
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:43 +0000 (22:59 +0530)]
book3s64/hash: Remove kfence support temporarily

Kfence on book3s Hash on pseries is broken anyway. It fails to boot
due to the RMA size limitation. That is because kfence with Hash uses
the debug_pagealloc infrastructure, and debug_pagealloc allocates a
linear map for the entire DRAM size instead of just the kfence relevant
objects. This means that 16TB of DRAM requires (16TB >> PAGE_SHIFT),
which is 256MB, or half of the RMA region on P8.
The crash kernel reserves 256MB, and we also need 2048 * 16KB * 3 for
the emergency stacks and some more for paca allocations.
That means there is not enough memory to reserve the full linear map
in the RMA region if the DRAM size is too big (>=16TB).
(The issue is seen above 8TB with crash kernel 256 MB reservation).
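
The budget described above can be worked through as follows. This is
only a back-of-the-envelope sketch: it assumes 64K pages, a 512MB RMA
(twice the 256MB figure above), and a hypothetical helper name. Paca
allocations are not included.

```python
MB = 1 << 20
TB = 1 << 40
PAGE_SHIFT = 16                               # assumes 64K pages

CRASH_KERNEL = 256 * MB
EMERGENCY_STACKS = 2048 * 16 * 1024 * 3       # 2048 * 16KB * 3 = 96MB


def rma_headroom(dram_size: int, rma_size: int = 512 * MB) -> int:
    """Bytes left in the RMA after the linear map, crash kernel and
    emergency stacks; a negative value means they do not fit."""
    linear_map = dram_size >> PAGE_SHIFT      # one byte per page of DRAM
    return rma_size - linear_map - CRASH_KERNEL - EMERGENCY_STACKS
```

Under these assumptions 16TB of DRAM overshoots the RMA by 96MB, while
8TB leaves only 32MB for everything else.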

Now, kfence does not require a linear memory map for the entire DRAM.
It only needs one for kfence objects. So this patch temporarily removes
the kfence functionality, since the debug_pagealloc code needs some
refactoring. We will bring back kfence support on Hash in later patches.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/1761bc39674473c8878dedca15e0d9a0d3a1b528.1729271995.git.ritesh.list@gmail.com
7 months ago powerpc/mm/fault: Fix kfence page fault reporting
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 17:29:42 +0000 (22:59 +0530)]
powerpc/mm/fault: Fix kfence page fault reporting

copy_from_kernel_nofault() can be called when reading /proc/kcore.
/proc/kcore can have some unmapped kfence objects which when read via
copy_from_kernel_nofault() can cause page faults. Since *_nofault()
functions define their own fixup table for handling fault, use that
instead of asking kfence to handle such faults.

Hence we search the exception tables for the nip which generated the
fault. If there is an entry then we let the fixup table handler handle the
page fault by returning an error from within ___do_page_fault().
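
The ordering of the two checks can be modelled as below. This is a toy
model rather than the fault handler itself: the function name, the
return strings and the set-based exception table are all hypothetical.

```python
def handle_kernel_fault(nip: int, fault_addr: int, exception_table: set,
                        kfence_start: int, kfence_end: int) -> str:
    """Decide who handles a kernel-mode page fault in this toy model."""
    if nip in exception_table:
        # The faulting instruction has a fixup entry (for example inside
        # a *_nofault() helper): return an error and let the fixup run.
        return "fixup"
    if kfence_start <= fault_addr < kfence_end:
        # Only faults without a fixup entry are reported to kfence.
        return "kfence"
    return "bad_fault"
```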

This can be easily triggered by running dd on /proc/kcore,
e.g. dd if=/proc/kcore of=/dev/null bs=1M

Some example false negatives:

  ===============================
  BUG: KFENCE: invalid read in copy_from_kernel_nofault+0x9c/0x1a0
  Invalid read at 0xc0000000fdff0000:
   copy_from_kernel_nofault+0x9c/0x1a0
   0xc00000000665f950
   read_kcore_iter+0x57c/0xa04
   proc_reg_read_iter+0xe4/0x16c
   vfs_read+0x320/0x3ec
   ksys_read+0x90/0x154
   system_call_exception+0x120/0x310
   system_call_vectored_common+0x15c/0x2ec

  BUG: KFENCE: use-after-free read in copy_from_kernel_nofault+0x9c/0x1a0
  Use-after-free read at 0xc0000000fe050000 (in kfence-#2):
   copy_from_kernel_nofault+0x9c/0x1a0
   0xc00000000665f950
   read_kcore_iter+0x57c/0xa04
   proc_reg_read_iter+0xe4/0x16c
   vfs_read+0x320/0x3ec
   ksys_read+0x90/0x154
   system_call_exception+0x120/0x310
   system_call_vectored_common+0x15c/0x2ec

Fixes: 90cbac0e995d ("powerpc: Enable KFENCE for PPC32")
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reported-by: Disha Goel <disgoel@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/a411788081d50e3b136c6270471e35aba3dfafa3.1729271995.git.ritesh.list@gmail.com
7 months ago powerpc/fadump: Move fadump_cma_init to setup_arch() after initmem_init()
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 16:17:57 +0000 (21:47 +0530)]
powerpc/fadump: Move fadump_cma_init to setup_arch() after initmem_init()

During early init CMA_MIN_ALIGNMENT_BYTES can be PAGE_SIZE,
since pageblock_order is still zero and it gets initialized
later during initmem_init() e.g.
setup_arch() -> initmem_init() -> sparse_init() -> set_pageblock_order()

One call path where this causes an issue is:
early_setup() -> early_init_devtree() -> fadump_reserve_mem() -> fadump_cma_init()

This causes the CMA memory alignment check to be bypassed in
cma_init_reserved_mem(). Then later cma_activate_area() can hit
a VM_BUG_ON_PAGE(pfn & ((1 << order) - 1)) if the reserved memory
area was not pageblock_order aligned.

Fix it by moving fadump_cma_init() after initmem_init(),
where other such CMA reservations also get called.
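
The arithmetic behind the bypassed check can be sketched as follows. This
is an illustration under stated assumptions, not the kernel code: 64K
pages (PAGE_SHIFT = 16) and a late pageblock_order of 8 are assumed
values, and the helper names are hypothetical. Only pfn 0x10010 is taken
from the trace below.

```python
PAGE_SHIFT = 16  # assumption: 64K pages


def cma_min_alignment_bytes(pageblock_order):
    """Bytes covered by one pageblock: PAGE_SIZE << pageblock_order."""
    return (1 << PAGE_SHIFT) << pageblock_order


def is_aligned(addr, alignment):
    """True if addr is a multiple of alignment (alignment is a power of two)."""
    return addr & (alignment - 1) == 0


base = 0x10010 << PAGE_SHIFT  # physical address of pfn 0x10010
LATE_ORDER = 8                # hypothetical pageblock_order after initmem_init()

# Early init: pageblock_order is still 0, so any page-aligned base passes.
early_ok = is_aligned(base, cma_min_alignment_bytes(0))
# After set_pageblock_order(): the same base is no longer considered aligned.
late_ok = is_aligned(base, cma_min_alignment_bytes(LATE_ORDER))
```

Here early_ok is true while late_ok is false, which mirrors how a
reservation accepted during early init can later trip the VM_BUG_ON_PAGE.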

<stack trace>
==============
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10010
flags: 0x13ffff800000000(node=1|zone=0|lastcpupid=0x7ffff) CMA
raw: 013ffff800000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: VM_BUG_ON_PAGE(pfn & ((1 << order) - 1))
------------[ cut here ]------------
kernel BUG at mm/page_alloc.c:778!

Call Trace:
__free_one_page+0x57c/0x7b0 (unreliable)
free_pcppages_bulk+0x1a8/0x2c8
free_unref_page_commit+0x3d4/0x4e4
free_unref_page+0x458/0x6d0
init_cma_reserved_pageblock+0x114/0x198
cma_init_reserved_areas+0x270/0x3e0
do_one_initcall+0x80/0x2f8
kernel_init_freeable+0x33c/0x530
kernel_init+0x34/0x26c
ret_from_kernel_user_thread+0x14/0x1c

Fixes: 11ac3e87ce09 ("mm: cma: use pageblock_order as the single alignment")
Suggested-by: David Hildenbrand <david@redhat.com>
Reported-by: Sachin P Bappalige <sachinpb@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/3ae208e48c0d9cefe53d2dc4f593388067405b7d.1729146153.git.ritesh.list@gmail.com
7 months ago powerpc/fadump: Reserve page-aligned boot_memory_size during fadump_reserve_mem
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 16:17:56 +0000 (21:47 +0530)]
powerpc/fadump: Reserve page-aligned boot_memory_size during fadump_reserve_mem

This patch refactors all CMA related initialization and alignment code
into fadump_cma_init(), which gets called at the end. This also means
that we keep [reserve_dump_area_start, boot_memory_size] page aligned
during fadump_reserve_mem(). Later, in fadump_cma_init(), we extract the
aligned chunk and provide it to CMA. This inherently also fixes an issue in
the current code where reserve_dump_area_start is not aligned
when the physical memory has holes and the suitable chunk starts at
an unaligned boundary.

After this we should be able to call fadump_cma_init() independently
later in setup_arch() where pageblock_order is non-zero.
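
Extracting an aligned chunk from a page-aligned reservation amounts to
rounding the start up and the end down. A minimal sketch, assuming
power-of-two alignments; the helper names and the example values are
hypothetical and not taken from the patch:

```python
def align_up(x, a):
    """Round x up to the next multiple of a (a is a power of two)."""
    return (x + a - 1) & ~(a - 1)


def align_down(x, a):
    """Round x down to the previous multiple of a (a is a power of two)."""
    return x & ~(a - 1)


def aligned_chunk(start, size, alignment):
    """Return (base, size) of the largest aligned region inside
    [start, start + size), or None if no aligned region fits."""
    base = align_up(start, alignment)
    end = align_down(start + size, alignment)
    if end <= base:
        return None
    return base, end - base
```

For example, aligned_chunk(0x100100000, 0x4000000, 0x1000000) yields
(0x101000000, 0x3000000): the unaligned head and tail are left out of
the region handed to CMA.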

Suggested-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/805d6b900968fb9402ad8f4e4775597db42085c4.1729146153.git.ritesh.list@gmail.com
7 months ago powerpc/fadump: Refactor and prepare fadump_cma_init for late init
Ritesh Harjani (IBM) [Fri, 18 Oct 2024 16:17:55 +0000 (21:47 +0530)]
powerpc/fadump: Refactor and prepare fadump_cma_init for late init

We don't use the return value of fadump_cma_init() anyway, since
fadump_reserve_mem(), from where fadump_cma_init() gets called today,
already has the required checks.
This patch changes the function's return type to void. Let's also handle
extra cases, such as returning early if fadump_supported is false or
dump_active is set, so that in later patches we can call fadump_cma_init()
separately from setup_arch().

Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/a2afc3d6481a87a305e89cfc4a3f3d2a0b8ceab3.1729146153.git.ritesh.list@gmail.com
7 months ago Merge branch 'topic/vdso' into next
Michael Ellerman [Wed, 16 Oct 2024 11:10:52 +0000 (22:10 +1100)]
Merge branch 'topic/vdso' into next

Merge VDSO changes we're keeping in a topic branch, in case they conflict with
other VDSO changes in flight.

8 months ago powerpc/vdso: Flag VDSO64 entry points as functions
Christophe Leroy [Wed, 9 Oct 2024 22:17:57 +0000 (00:17 +0200)]
powerpc/vdso: Flag VDSO64 entry points as functions

On powerpc64, as shown below by readelf, vDSO function symbols have
type NOTYPE.

$ powerpc64-linux-gnu-readelf -a arch/powerpc/kernel/vdso/vdso64.so.dbg
ELF Header:
  Magic:   7f 45 4c 46 02 02 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           PowerPC64
  Version:                           0x1
...

Symbol table '.dynsym' contains 12 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
...
     1: 0000000000000524    84 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15
...
     4: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6.15
     5: 00000000000006c0    48 NOTYPE  GLOBAL DEFAULT    8 __[...]@@LINUX_2.6.15

Symbol table '.symtab' contains 56 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
...
    45: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  ABS LINUX_2.6.15
    46: 00000000000006c0    48 NOTYPE  GLOBAL DEFAULT    8 __kernel_getcpu
    47: 0000000000000524    84 NOTYPE  GLOBAL DEFAULT    8 __kernel_clock_getres

To overcome that, commit ba83b3239e65 ("selftests: vDSO: fix vDSO
symbols lookup for powerpc64") was applied to have selftests also
look for NOTYPE symbols, but the correct fix should be to flag VDSO
entry points as functions.
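
The symbol type that readelf prints lives in the low nibble of st_info in
each Elf64_Sym entry. A minimal sketch of reading it, assuming the
big-endian ELF64 layout shown above; make_sym() is a hypothetical helper
that builds a test entry rather than reading a real vDSO image:

```python
import struct

STT_NOTYPE, STT_FUNC = 0, 2

# Elf64_Sym, big endian: st_name, st_info, st_other, st_shndx, st_value, st_size
ELF64_SYM_BE = struct.Struct(">IBBHQQ")


def sym_type(raw):
    """Return the symbol type (low nibble of st_info) of a raw Elf64_Sym."""
    _name, st_info, _other, _shndx, _value, _size = ELF64_SYM_BE.unpack(raw)
    return st_info & 0xF


def make_sym(value, size, stt, bind=1, shndx=8):
    """Build a raw Elf64_Sym; st_info holds binding (high nibble) and type (low)."""
    return ELF64_SYM_BE.pack(0, (bind << 4) | stt, 0, shndx, value, size)


before = make_sym(0x6C0, 48, STT_NOTYPE)  # __kernel_getcpu as listed above
after = make_sym(0x6C0, 48, STT_FUNC)     # same symbol once flagged as a function
```

A lookup that only accepts STT_FUNC would skip the "before" entry, which
is why the selftest had been relaxed to accept NOTYPE as well.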

The original commit that brought VDSO support into powerpc/64 has the
following explanation:

    Note that the symbols exposed by the vDSO aren't "normal" function symbols, apps
    can't be expected to link against them directly, the vDSO's are both seen
    as if they were linked at 0 and the symbols just contain offsets to the
    various functions.  This is done on purpose to avoid a relocation step
    (ppc64 functions normally have descriptors with abs addresses in them).
    When glibc uses those functions, it's expected to use it's own trampolines
    that know how to reach them.

The descriptors it's talking about are the OPD function descriptors
used on ABI v1 (big endian). But it would be more correct for a text
symbol to have type function, even if there's no function descriptor
for it.

glibc has a special case already for handling the VDSO symbols which
creates a fake opd pointing at the kernel symbol. So changing the VDSO
symbol type to function shouldn't affect that.

For ABI v2, there are no function descriptors and VDSO functions can
safely have function type.

So let's flag VDSO entry points as functions and revert the
selftest change.

Link: https://github.com/mpe/linux-fullhistory/commit/5f2dd691b62da9d9cc54b938f8b29c22c93cb805
Fixes: ba83b3239e65 ("selftests: vDSO: fix vDSO symbols lookup for powerpc64")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-By: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/b6ad2f1ee9887af3ca5ecade2a56f4acda517a85.1728512263.git.christophe.leroy@csgroup.eu
8 months ago powerpc/vdso: Implement __arch_get_vdso_rng_data()
Christophe Leroy [Wed, 2 Oct 2024 08:39:29 +0000 (10:39 +0200)]
powerpc/vdso: Implement __arch_get_vdso_rng_data()

VDSO time functions do not call any other function, so they don't
need to save/restore LR. However, retrieving the address of the VDSO data
page requires using LR, and hence saving and then restoring it, which can be
heavy on some CPUs. On the other hand, VDSO functions on powerpc are
not standard functions and require a wrapper function to call C VDSO
functions. And that wrapper has to save and restore LR in order to
call the C VDSO function, so retrieving VDSO data page address in that
wrapper doesn't require additional save/restore of LR.

For random VDSO functions it is a bit different. Because the function
calls __arch_chacha20_blocks_nostack(), it saves and restores LR.
Retrieving VDSO data page address can then be done there without
additional save/restore of LR.

So let's implement __arch_get_vdso_rng_data() and simplify the wrapper.

It starts paving the way for the day powerpc will implement a more
standard ABI for VDSO functions.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/a1a9bd0df508f1b5c04684b7366940577dfc6262.1727858295.git.christophe.leroy@csgroup.eu
8 months ago powerpc/vdso: Add a page for non-time data
Christophe Leroy [Wed, 2 Oct 2024 08:39:28 +0000 (10:39 +0200)]
powerpc/vdso: Add a page for non-time data

The page containing VDSO time data is swapped with the one containing
TIME namespace data when a process uses a non-root time namespace.
For other data like powerpc specific data and RNG data, it means
tracking whether time namespace is the root one or not to know which
page to use.

Simplify the logic by moving time data out of the first data page,
so that the first data page, which contains everything else, always
remains the first page. Time data is in the second or third page
depending on selected time namespace.

While we are playing with get_datapage macro, directly take into
account the data offset inside the macro instead of adding that offset
afterwards.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/0557d3ec898c1d0ea2fc59fa8757618e524c5d94.1727858295.git.christophe.leroy@csgroup.eu
8 months ago Linux 6.12-rc2 v6.12-rc2
Linus Torvalds [Sun, 6 Oct 2024 22:32:27 +0000 (15:32 -0700)]
Linux 6.12-rc2

8 months ago Merge tag 'kbuild-fixes-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masah...
Linus Torvalds [Sun, 6 Oct 2024 18:34:55 +0000 (11:34 -0700)]
Merge tag 'kbuild-fixes-v6.12' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Move non-boot built-in DTBs to the .rodata section

 - Fix Kconfig bugs

 - Fix maint scripts in the linux-image Debian package

 - Import some list macros to scripts/include/

* tag 'kbuild-fixes-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: deb-pkg: Remove blank first line from maint scripts
  kbuild: fix a typo dt_binding_schema -> dt_binding_schemas
  scripts: import more list macros
  kconfig: qconf: fix buffer overflow in debug links
  kconfig: qconf: move conf_read() before drawing tree pain
  kconfig: clear expr::val_is_valid when allocated
  kconfig: fix infinite loop in sym_calc_choice()
  kbuild: move non-boot built-in DTBs to .rodata section

8 months ago Merge tag 'platform-drivers-x86-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 6 Oct 2024 18:11:01 +0000 (11:11 -0700)]
Merge tag 'platform-drivers-x86-v6.12-2' of git://git./linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:

 - Intel PMC fix for suspend/resume issues on some Sky and Kaby Lake
   laptops

 - Intel Diamond Rapids hw-id additions

 - Documentation and MAINTAINERS fixes

 - Some other small fixes

* tag 'platform-drivers-x86-v6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: x86-android-tablets: Fix use after free on platform_device_register() errors
  platform/x86: wmi: Update WMI driver API documentation
  platform/x86: dell-ddv: Fix typo in documentation
  platform/x86: dell-sysman: add support for alienware products
  platform/x86/intel: power-domains: Add Diamond Rapids support
  platform/x86: ISST: Add Diamond Rapids to support list
  platform/x86:intel/pmc: Disable ACPI PM Timer disabling on Sky and Kaby Lake
  platform/x86: dell-laptop: Do not fail when encountering unsupported batteries
  MAINTAINERS: Update Intel In Field Scan(IFS) entry
  platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug

8 months ago Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 6 Oct 2024 17:53:28 +0000 (10:53 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Fix pKVM error path on init, making sure we do not change critical
     system registers as we're about to fail

   - Make sure that the host's vector length is capped by a value
     common to all CPUs

   - Fix kvm_has_feat*() handling of "negative" features, as the current
     code is pretty broken

   - Promote Joey to the status of official reviewer, while James steps
     down -- hopefully only temporarily

  x86:

   - Fix compilation with KVM_INTEL=KVM_AMD=n

   - Fix disabling KVM_X86_QUIRK_SLOT_ZAP_ALL when shadow MMU is in use

  Selftests:

   - Fix compilation on non-x86 architectures"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  x86/reboot: emergency callbacks are now registered by common KVM code
  KVM: x86: leave kvm.ko out of the build if no vendor module is requested
  KVM: x86/mmu: fix KVM_X86_QUIRK_SLOT_ZAP_ALL for shadow MMU
  KVM: arm64: Fix kvm_has_feat*() handling of negative features
  KVM: selftests: Fix build on architectures other than x86_64
  KVM: arm64: Another reviewer reshuffle
  KVM: arm64: Constrain the host to the maximum shared SVE VL with pKVM
  KVM: arm64: Fix __pkvm_init_vcpu cptr_el2 error path

8 months ago Merge tag 'powerpc-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 6 Oct 2024 17:43:00 +0000 (10:43 -0700)]
Merge tag 'powerpc-6.12-3' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:

 - Allow r30 to be used in vDSO code generation of getrandom

Thanks to Jason A. Donenfeld

* tag 'powerpc-6.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/vdso: allow r30 in vDSO code generation of getrandom

8 months ago kbuild: deb-pkg: Remove blank first line from maint scripts
Aaron Thompson [Fri, 4 Oct 2024 07:52:45 +0000 (07:52 +0000)]
kbuild: deb-pkg: Remove blank first line from maint scripts

The blank line causes execve() to fail:

  # strace ./postinst
  execve("./postinst", ...) = -1 ENOEXEC (Exec format error)
  strace: exec: Exec format error
  +++ exited with 1 +++

However running the scripts via shell does work (at least with bash)
because the shell attempts to execute the file as a shell script when
execve() fails.
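
The kernel only treats a file as a script if "#!" is found in its very
first two bytes, so a leading blank line hides the interpreter line. A
minimal sketch of that check; has_shebang() and the sample script
contents are hypothetical, and this models the rule rather than calling
execve():

```python
def has_shebang(script: bytes) -> bool:
    """True if the file content starts with '#!' at byte offset 0."""
    return script[:2] == b"#!"


good = b"#!/bin/sh\nset -e\n"
bad = b"\n#!/bin/sh\nset -e\n"  # blank first line, as in the broken maint scripts

good_ok = has_shebang(good)
bad_ok = has_shebang(bad)
```

Only the first sample passes the check; the second is the layout that
makes execve() return ENOEXEC.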

Fixes: b611daae5efc ("kbuild: deb-pkg: split image and debug objects staging out into functions")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
8 months ago kbuild: fix a typo dt_binding_schema -> dt_binding_schemas
Xu Yang [Wed, 25 Sep 2024 05:32:30 +0000 (13:32 +0800)]
kbuild: fix a typo dt_binding_schema -> dt_binding_schemas

If we follow "make help" and run "make dt_binding_schema", we see the
error below:

$ make dt_binding_schema
make[1]: *** No rule to make target 'dt_binding_schema'.  Stop.
make: *** [Makefile:224: __sub-make] Error 2

It appears to be a typo, so fix it.

Fixes: 604a57ba9781 ("dt-bindings: kbuild: Add separate target/dependency for processed-schema.json")
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
8 months ago scripts: import more list macros
Sami Tolvanen [Mon, 23 Sep 2024 18:18:47 +0000 (18:18 +0000)]
scripts: import more list macros

Import list_is_first, list_is_last, list_replace, and list_replace_init.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
8 months ago platform/x86: x86-android-tablets: Fix use after free on platform_device_register...
Hans de Goede [Sat, 5 Oct 2024 13:05:45 +0000 (15:05 +0200)]
platform/x86: x86-android-tablets: Fix use after free on platform_device_register() errors

x86_android_tablet_remove() frees the pdevs[] array, so it should not
be used after calling x86_android_tablet_remove().

When platform_device_register() fails, store the pdevs[x] PTR_ERR() value
into the local ret variable before calling x86_android_tablet_remove()
to avoid using pdevs[] after it has been freed.
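
The ordering of "save the error, then tear down" can be shown with a toy
model. Python has no use after free, so this is only an analogy under
stated assumptions: FakeTablet and register_device() are hypothetical
names, and setting pdevs to None stands in for the array being freed.

```python
class FakeTablet:
    """Toy stand-in for the driver state; pdevs is released by remove()."""

    def __init__(self):
        self.pdevs = []

    def remove(self):
        # Models x86_android_tablet_remove() freeing the pdevs[] array.
        self.pdevs = None


def register_device(tablet, result):
    """Record a registration result (negative means error) and handle failure."""
    tablet.pdevs.append(result)
    if result < 0:
        ret = tablet.pdevs[-1]  # save the error value first ...
        tablet.remove()         # ... and only then release pdevs
        return ret
    return 0
```

Reading tablet.pdevs[-1] after remove() would fail in this model, just
as reading pdevs[x] after the real teardown touches freed memory.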

Fixes: 5eba0141206e ("platform/x86: x86-android-tablets: Add support for instantiating platform-devs")
Fixes: e2200d3f26da ("platform/x86: x86-android-tablets: Add gpio_keys support to x86_android_tablet_init()")
Cc: stable@vger.kernel.org
Reported-by: Aleksandr Burakov <a.burakov@rosalinux.ru>
Closes: https://lore.kernel.org/platform-driver-x86/20240917120458.7300-1-a.burakov@rosalinux.ru/
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20241005130545.64136-1-hdegoede@redhat.com
8 months ago platform/x86: wmi: Update WMI driver API documentation
Armin Wolf [Sat, 5 Oct 2024 21:38:24 +0000 (23:38 +0200)]
platform/x86: wmi: Update WMI driver API documentation

The WMI driver core now passes the WMI event data to legacy notify
handlers, so WMI devices sharing notification IDs are now being
handled properly.

Fixes: e04e2b760ddb ("platform/x86: wmi: Pass event data directly to legacy notify handlers")
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20241005213825.701887-1-W_Armin@gmx.de
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
8 months ago platform/x86: dell-ddv: Fix typo in documentation
Anaswara T Rajan [Sat, 5 Oct 2024 07:00:56 +0000 (12:30 +0530)]
platform/x86: dell-ddv: Fix typo in documentation

Fix typo in word 'diagnostics' in documentation.

Signed-off-by: Anaswara T Rajan <anaswaratrajan@gmail.com>
Reviewed-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20241005070056.16326-1-anaswaratrajan@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
8 months ago platform/x86: dell-sysman: add support for alienware products
Crag Wang [Fri, 4 Oct 2024 15:27:58 +0000 (23:27 +0800)]
platform/x86: dell-sysman: add support for alienware products

Alienware supports firmware-attributes and has its own OEM string.

Signed-off-by: Crag Wang <crag_wang@dell.com>
Link: https://lore.kernel.org/r/20241004152826.93992-1-crag_wang@dell.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
8 months ago platform/x86/intel: power-domains: Add Diamond Rapids support
Srinivas Pandruvada [Thu, 3 Oct 2024 21:55:54 +0000 (14:55 -0700)]
platform/x86/intel: power-domains: Add Diamond Rapids support

Add Diamond Rapids (INTEL_PANTHERCOVE_X) to tpmi_cpu_ids to support
domain ID mappings.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20241003215554.3013807-3-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
8 months ago platform/x86: ISST: Add Diamond Rapids to support list
Srinivas Pandruvada [Thu, 3 Oct 2024 21:55:53 +0000 (14:55 -0700)]
platform/x86: ISST: Add Diamond Rapids to support list

Add Diamond Rapids (INTEL_PANTHERCOVE_X) to SST support list by adding
to isst_cpu_ids.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20241003215554.3013807-2-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
8 months ago platform/x86:intel/pmc: Disable ACPI PM Timer disabling on Sky and Kaby Lake
Hans de Goede [Thu, 3 Oct 2024 20:26:13 +0000 (22:26 +0200)]
platform/x86:intel/pmc: Disable ACPI PM Timer disabling on Sky and Kaby Lake

There have been multiple reports that the ACPI PM Timer disabling is
causing Sky and Kaby Lake systems to hang on all suspend (s2idle, s3,
hibernate) methods.

Remove the acpi_pm_tmr_ctl_offset and acpi_pm_tmr_disable_bit settings from
spt_reg_map to disable the ACPI PM Timer disabling on Sky and Kaby Lake to
fix the hang on suspend.

Fixes: e86c8186d03a ("platform/x86:intel/pmc: Enable the ACPI PM Timer to be turned off when suspended")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Closes: https://lore.kernel.org/linux-pm/18784f62-91ff-4d88-9621-6c88eb0af2b5@molgen.mpg.de/
Reported-by: Todd Brandt <todd.e.brandt@intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219346
Cc: Marek Maslanka <mmaslanka@google.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Todd Brandt <todd.e.brandt@intel.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> # Dell XPS 13 9360/0596KF
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20241003202614.17181-2-hdegoede@redhat.com
8 months ago platform/x86: dell-laptop: Do not fail when encountering unsupported batteries
Armin Wolf [Tue, 1 Oct 2024 21:28:35 +0000 (23:28 +0200)]
platform/x86: dell-laptop: Do not fail when encountering unsupported batteries

If the battery hook encounters an unsupported battery, it will
return an error. This in turn will cause the battery driver to
automatically unregister the battery hook.

On machines with multiple batteries however, this will prevent
the battery hook from handling the primary battery, since it will
always get unregistered upon encountering one of the unsupported
batteries.

Fix this by simply ignoring unsupported batteries.
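
The effect of returning an error versus ignoring the battery can be
sketched with a small simulation. Everything here is a hypothetical
model, not the driver code: BatteryDriver, make_hook(), the battery
names and the assumption that only "BAT0" is supported are illustrative.

```python
ENODEV = 19


class BatteryDriver:
    """Toy battery driver that drops any hook whose callback returns an error."""

    def __init__(self):
        self.hooks = []

    def register_hook(self, hook):
        self.hooks.append(hook)

    def add_battery(self, name):
        for hook in list(self.hooks):
            if hook(name) != 0:
                self.hooks.remove(hook)  # the hook is gone for all later batteries


def make_hook(ignore_unsupported, handled):
    def hook(name):
        if name != "BAT0":  # hypothetical: only BAT0 is supported
            return 0 if ignore_unsupported else -ENODEV
        handled.append(name)
        return 0
    return hook


def simulate(ignore_unsupported, batteries=("BAT1", "BAT0")):
    """Return the batteries the hook ended up handling."""
    handled = []
    driver = BatteryDriver()
    driver.register_hook(make_hook(ignore_unsupported, handled))
    for name in batteries:
        driver.add_battery(name)
    return handled
```

In this model simulate(False) handles nothing, because the unsupported
battery gets the hook unregistered before the primary one shows up,
while simulate(True) still handles "BAT0".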

Reviewed-by: Pali Rohár <pali@kernel.org>
Fixes: ab58016c68cc ("platform/x86:dell-laptop: Add knobs to change battery charge settings")
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20241001212835.341788-4-W_Armin@gmx.de
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
8 months ago MAINTAINERS: Update Intel In Field Scan(IFS) entry
Jithu Joseph [Tue, 1 Oct 2024 17:08:08 +0000 (10:08 -0700)]
MAINTAINERS: Update Intel In Field Scan(IFS) entry

Ashok is no longer with Intel and his e-mail address will start bouncing
soon.  Update his email address to the new one he provided to ensure
correct contact details in the MAINTAINERS file.

Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
Link: https://lore.kernel.org/r/20241001170808.203970-1-jithu.joseph@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
8 months ago Merge tag 'kvmarm-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Paolo Bonzini [Sun, 6 Oct 2024 07:59:22 +0000 (03:59 -0400)]
Merge tag 'kvmarm-fixes-6.12-1' of git://git./linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.12, take #1

- Fix pKVM error path on init, making sure we do not change critical
  system registers as we're about to fail

- Make sure that the host's vector length is capped by a value
  common to all CPUs

- Fix kvm_has_feat*() handling of "negative" features, as the current
  code is pretty broken

- Promote Joey to the status of official reviewer, while James steps
  down -- hopefully only temporarily

8 months ago x86/reboot: emergency callbacks are now registered by common KVM code
Paolo Bonzini [Tue, 1 Oct 2024 14:34:58 +0000 (10:34 -0400)]
x86/reboot: emergency callbacks are now registered by common KVM code

Guard them with CONFIG_KVM_X86_COMMON rather than the two vendor modules.
In practice this has no functional change, because CONFIG_KVM_X86_COMMON
is set if and only if at least one vendor-specific module is being built.
However, it is cleaner to specify CONFIG_KVM_X86_COMMON for functions that
are used in kvm.ko.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 590b09b1d88e ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
Fixes: 6d55a94222db ("x86/reboot: Unconditionally define cpu_emergency_virt_cb typedef")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 months ago KVM: x86: leave kvm.ko out of the build if no vendor module is requested
Paolo Bonzini [Tue, 1 Oct 2024 14:15:01 +0000 (10:15 -0400)]
KVM: x86: leave kvm.ko out of the build if no vendor module is requested

kvm.ko is nothing but library code shared by kvm-intel.ko and kvm-amd.ko.
It provides no functionality on its own and it is unnecessary unless one
of the vendor-specific modules is compiled.  In particular, /dev/kvm is
not created until one of kvm-intel.ko or kvm-amd.ko is loaded.

Use CONFIG_KVM to decide if it is built-in or a module, but use the
vendor-specific modules for the actual decision on whether to build it.

This also fixes a build failure when CONFIG_KVM_INTEL and CONFIG_KVM_AMD
are both disabled.  The cpu_emergency_register_virt_callback() function
is called from kvm.ko, but it is only defined if at least one of
CONFIG_KVM_INTEL and CONFIG_KVM_AMD is provided.

Fixes: 590b09b1d88e ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 months ago Merge tag 'bcachefs-2024-10-05' of git://evilpiepirate.org/bcachefs
Linus Torvalds [Sat, 5 Oct 2024 22:18:04 +0000 (15:18 -0700)]
Merge tag 'bcachefs-2024-10-05' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "A lot of little fixes, bigger ones include:

   - bcachefs's __wait_on_freeing_inode() was broken in rc1 due to vfs
     changes, now fixed along with another lost wakeup

   - fragmentation LRU fixes; fsck now repairs successfully (this is the
     data structure copygc uses); along with some nice simplification.

   - Rework logged op error handling, so that if logged op replay errors
     (due to another filesystem error) we delete the logged op instead
     of going into an infinite loop

   - Various small filesystem connectivity repair fixes"

* tag 'bcachefs-2024-10-05' of git://evilpiepirate.org/bcachefs:
  bcachefs: Rework logged op error handling
  bcachefs: Add warn param to subvol_get_snapshot, peek_inode
  bcachefs: Kill snapshot arg to fsck_write_inode()
  bcachefs: Check for unlinked, non-empty dirs in check_inode()
  bcachefs: Check for unlinked inodes with dirents
  bcachefs: Check for directories with no backpointers
  bcachefs: Kill alloc_v4.fragmentation_lru
  bcachefs: minor lru fsck fixes
  bcachefs: Mark more errors AUTOFIX
  bcachefs: Make sure we print error that causes fsck to bail out
  bcachefs: bkey errors are only AUTOFIX during read
  bcachefs: Create lost+found in correct snapshot
  bcachefs: Fix reattach_inode()
  bcachefs: Add missing wakeup to bch2_inode_hash_remove()
  bcachefs: Fix trans_commit disk accounting revert
  bcachefs: Fix bch2_inode_is_open() check
  bcachefs: Fix return type of dirent_points_to_inode_nowarn()
  bcachefs: Fix bad shift in bch2_read_flag_list()

8 months ago Merge tag 'for-linus-6.12a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 5 Oct 2024 17:59:44 +0000 (10:59 -0700)]
Merge tag 'for-linus-6.12a-rc2-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "Fix Xen config issue introduced in the merge window"

* tag 'for-linus-6.12a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Fix config option reference in XEN_PRIVCMD definition

8 months ago Merge tag 'ext4_for_linus-5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 5 Oct 2024 17:47:00 +0000 (10:47 -0700)]
Merge tag 'ext4_for_linus-5.12-rc2' of git://git./linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Fix some ext4 bugs and regressions relating to online resize and fast
  commits"

* tag 'ext4_for_linus-5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix off by one issue in alloc_flex_gd()
  ext4: mark fc as ineligible using an handle in ext4_xattr_set()
  ext4: use handle to mark fc as ineligible in __track_dentry_update()

8 months ago Merge tag 'cxl-fixes-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Linus Torvalds [Sat, 5 Oct 2024 17:40:16 +0000 (10:40 -0700)]
Merge tag 'cxl-fixes-6.12-rc2' of git://git./linux/kernel/git/cxl/cxl

Pull cxl fix from Ira Weiny:

 - Fix calculation for SBDF in error injection

* tag 'cxl-fixes-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  EINJ, CXL: Fix CXL device SBDF calculation

8 months ago Merge tag 'i2c-for-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sat, 5 Oct 2024 17:31:04 +0000 (10:31 -0700)]
Merge tag 'i2c-for-6.12-rc2' of git://git./linux/kernel/git/wsa/linux

Pull i2c fix from Wolfram Sang:

 - Fix potential deadlock during runtime suspend and resume (stm32f7)

* tag 'i2c-for-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: stm32f7: Do not prepare/unprepare clock during runtime suspend/resume

8 months ago Merge tag 'spi-fix-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brooni...
Linus Torvalds [Sat, 5 Oct 2024 17:25:04 +0000 (10:25 -0700)]
Merge tag 'spi-fix-v6.12-rc1' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A small set of driver specific fixes that came in since the merge
  window, about half of which is fixes for correctness in the use of the
  runtime PM APIs done as part of a broader cleanup"

* tag 'spi-fix-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: s3c64xx: fix timeout counters in flush_fifo
  spi: atmel-quadspi: Fix wrong register value written to MR
  spi: spi-cadence: Fix missing spi_controller_is_target() check
  spi: spi-cadence: Fix pm_runtime_set_suspended() with runtime pm enabled
  spi: spi-imx: Fix pm_runtime_set_suspended() with runtime pm enabled

8 months ago Merge tag 'hardening-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 5 Oct 2024 17:19:14 +0000 (10:19 -0700)]
Merge tag 'hardening-v6.12-rc2' of git://git./linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - gcc plugins: Avoid Kconfig warnings with randstruct (Nathan
   Chancellor)

 - MAINTAINERS: Add security/Kconfig.hardening to hardening section
   (Nathan Chancellor)

 - MAINTAINERS: Add unsafe_memcpy() to the FORTIFY review list

* tag 'hardening-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  MAINTAINERS: Add security/Kconfig.hardening to hardening section
  hardening: Adjust dependencies in selection of MODVERSIONS
  MAINTAINERS: Add unsafe_memcpy() to the FORTIFY review list

8 months ago Merge tag 'lsm-pr-20241004' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Linus Torvalds [Sat, 5 Oct 2024 17:10:45 +0000 (10:10 -0700)]
Merge tag 'lsm-pr-20241004' of git://git./linux/kernel/git/pcmoore/lsm

Pull lsm revert from Paul Moore:
 "Here is the CONFIG_SECURITY_TOMOYO_LKM revert that we've been
  discussing this week. With near unanimous agreement that the original
  TOMOYO patches were not the right way to solve the distro problem
  Tetsuo is trying to solve, reverting is our best option at this time"

* tag 'lsm-pr-20241004' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  tomoyo: revert CONFIG_SECURITY_TOMOYO_LKM support

8 months agoplatform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug
Zach Wade [Mon, 23 Sep 2024 14:45:08 +0000 (22:45 +0800)]
platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug

Attaching SST PCI device to VM causes "BUG: KASAN: slab-out-of-bounds".
kasan report:
[   19.411889] ==================================================================
[   19.413702] BUG: KASAN: slab-out-of-bounds in _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
[   19.415634] Read of size 8 at addr ffff888829e65200 by task cpuhp/16/113
[   19.417368]
[   19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G            E      6.9.0 #10
[   19.420435] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 07/28/2022
[   19.422687] Call Trace:
[   19.424091]  <TASK>
[   19.425448]  dump_stack_lvl+0x5d/0x80
[   19.426963]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
[   19.428694]  print_report+0x19d/0x52e
[   19.430206]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[   19.431837]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
[   19.433539]  kasan_report+0xf0/0x170
[   19.435019]  ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
[   19.436709]  _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common]
[   19.438379]  ? __pfx_sched_clock_cpu+0x10/0x10
[   19.439910]  isst_if_cpu_online+0x406/0x58f [isst_if_common]
[   19.441573]  ? __pfx_isst_if_cpu_online+0x10/0x10 [isst_if_common]
[   19.443263]  ? ttwu_queue_wakelist+0x2c1/0x360
[   19.444797]  cpuhp_invoke_callback+0x221/0xec0
[   19.446337]  cpuhp_thread_fun+0x21b/0x610
[   19.447814]  ? __pfx_cpuhp_thread_fun+0x10/0x10
[   19.449354]  smpboot_thread_fn+0x2e7/0x6e0
[   19.450859]  ? __pfx_smpboot_thread_fn+0x10/0x10
[   19.452405]  kthread+0x29c/0x350
[   19.453817]  ? __pfx_kthread+0x10/0x10
[   19.455253]  ret_from_fork+0x31/0x70
[   19.456685]  ? __pfx_kthread+0x10/0x10
[   19.458114]  ret_from_fork_asm+0x1a/0x30
[   19.459573]  </TASK>
[   19.460853]
[   19.462055] Allocated by task 1198:
[   19.463410]  kasan_save_stack+0x30/0x50
[   19.464788]  kasan_save_track+0x14/0x30
[   19.466139]  __kasan_kmalloc+0xaa/0xb0
[   19.467465]  __kmalloc+0x1cd/0x470
[   19.468748]  isst_if_cdev_register+0x1da/0x350 [isst_if_common]
[   19.470233]  isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr]
[   19.471670]  do_one_initcall+0xa4/0x380
[   19.472903]  do_init_module+0x238/0x760
[   19.474105]  load_module+0x5239/0x6f00
[   19.475285]  init_module_from_file+0xd1/0x130
[   19.476506]  idempotent_init_module+0x23b/0x650
[   19.477725]  __x64_sys_finit_module+0xbe/0x130
[   19.478920]  do_syscall_64+0x82/0x160
[   19.480036]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   19.481292]
[   19.482205] The buggy address belongs to the object at ffff888829e65000
 which belongs to the cache kmalloc-512 of size 512
[   19.484818] The buggy address is located 0 bytes to the right of
 allocated 512-byte region [ffff888829e65000, ffff888829e65200)
[   19.487447]
[   19.488328] The buggy address belongs to the physical page:
[   19.489569] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888829e60c00 pfn:0x829e60
[   19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[   19.492466] anon flags: 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff)
[   19.493914] page_type: 0xffffffff()
[   19.494988] raw: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001
[   19.496451] raw: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000
[   19.497906] head: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001
[   19.499379] head: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000
[   19.500844] head: 0057ffffc0000003 ffffea0020a79801 ffffea0020a79848 00000000ffffffff
[   19.502316] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000
[   19.503784] page dumped because: kasan: bad access detected
[   19.505058]
[   19.505970] Memory state around the buggy address:
[   19.507172]  ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   19.508599]  ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   19.510014]                    ^
[   19.510016]  ffff888829e65280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   19.510018]  ffff888829e65300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   19.515367] ==================================================================

The error occurs because the physical_package_ids assigned by the VMware
VMM are not contiguous and have gaps. As a result, the value returned by
topology_physical_package_id() can exceed topology_max_packages().

The allocation here is sized by topology_max_packages(), which returns
the maximum logical package ID, not the physical ID. Hence use
topology_logical_package_id() instead of
topology_physical_package_id().
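
As a rough illustration of the mismatch described above (not the driver
code), the following Python sketch models a per-package table sized by
the logical package count and indexed two ways. The ID values and the
names physical_ids, logical_id, per_package and in_bounds are
hypothetical, chosen only to show why sparse physical IDs can run past
the end of the table while logical IDs cannot.

```python
# Simplified model, not kernel code. All values below are assumed examples.

# Physical package IDs as a hypervisor might assign them: sparse, with gaps.
physical_ids = [0, 2, 5, 9]

# Logical package IDs are dense: 0 .. number_of_packages - 1.
logical_id = {phys: idx for idx, phys in enumerate(physical_ids)}

# Stand-in for topology_max_packages(): the number of logical packages.
max_packages = len(physical_ids)

# Per-package table sized by the logical package count.
per_package = [None] * max_packages


def in_bounds(index):
    """Return True if index is a valid slot in per_package."""
    return 0 <= index < len(per_package)


# Indexing by physical ID overruns the table for IDs 5 and 9 ...
out_of_range_physical = [p for p in physical_ids if not in_bounds(p)]

# ... while indexing by logical ID always stays inside it.
out_of_range_logical = [p for p in physical_ids if not in_bounds(logical_id[p])]

print(out_of_range_physical)  # [5, 9]
print(out_of_range_logical)   # []
```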

Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device mapping with Sub-NUMA clustering")
Cc: stable@vger.kernel.org
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zach Wade <zachwade.k@gmail.com>
Link: https://lore.kernel.org/r/20240923144508.1764-1-zachwade.k@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
8 months agoMerge tag 'linux_kselftest-fixes-6.12-rc2' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Sat, 5 Oct 2024 00:30:59 +0000 (17:30 -0700)]
Merge tag 'linux_kselftest-fixes-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "Fixes to build warnings, install scripts, run-time error path, and git
  status cleanups to tests:

   - devices/probe: fix for Python3 regex string syntax warnings

   - clone3: removing unused macro from clone3_cap_checkpoint_restore()

   - vDSO: fix to align getrandom states to cache line

   - core and exec: add missing executables to .gitignore files

   - rtc: change to skip test if /dev/rtc0 can't be accessed

   - timers/posix: fix warn_unused_result result in __fatal_error()

   - breakpoints: fix to detect suspend successful condition correctly

   - hid: fix to install required dependencies to run the test"

* tag 'linux_kselftest-fixes-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: breakpoints: use remaining time to check if suspend succeed
  kselftest/devices/probe: Fix SyntaxWarning in regex strings for Python3
  selftest: hid: add missing run-hid-tools-tests.sh
  selftests: vDSO: align getrandom states to cache line
  selftests: exec: update gitignore for load_address
  selftests: core: add unshare_test to gitignore
  clone3: clone3_cap_checkpoint_restore: remove unused MAX_PID_NS_LEVEL macro
  selftests:timers: posix_timers: Fix warn_unused_result in __fatal_error()
  selftest: rtc: Check if could access /dev/rtc0 before testing

8 months agobcachefs: Rework logged op error handling
Kent Overstreet [Tue, 24 Sep 2024 02:06:58 +0000 (22:06 -0400)]
bcachefs: Rework logged op error handling

Initially it was thought that we just wanted to ignore errors from
logged op replay, but it turns out we do need to catch -EROFS, or we'll
go into an infinite loop.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Add warn param to subvol_get_snapshot, peek_inode
Kent Overstreet [Tue, 24 Sep 2024 09:33:07 +0000 (05:33 -0400)]
bcachefs: Add warn param to subvol_get_snapshot, peek_inode

These shouldn't always be fatal errors - logged op resume, in
particular, and we want it as a parameter there.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Kill snapshot arg to fsck_write_inode()
Kent Overstreet [Mon, 30 Sep 2024 04:00:33 +0000 (00:00 -0400)]
bcachefs: Kill snapshot arg to fsck_write_inode()

It was initially believed that it would be better to be explicit about
the snapshot we're updating when writing inodes in fsck; however, it
turns out that passing around the snapshot separately is more error
prone and we're usually updating the inode in the same snapshot we read
it from.

This is different from normal filesystem paths, where we do the update
in the snapshot of the subvolume we're in.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Check for unlinked, non-empty dirs in check_inode()
Kent Overstreet [Mon, 30 Sep 2024 03:38:37 +0000 (23:38 -0400)]
bcachefs: Check for unlinked, non-empty dirs in check_inode()

We want to check for this early so it can be reattached if necessary in
check_unreachable_inodes(); better than letting it be deleted and having
the children reattached, losing their filenames.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Check for unlinked inodes with dirents
Kent Overstreet [Mon, 30 Sep 2024 02:38:04 +0000 (22:38 -0400)]
bcachefs: Check for unlinked inodes with dirents

link count works differently in bcachefs - it's only nonzero for files
with multiple hardlinks, which means we can also avoid checking it
except for files that are known to have hardlinks.

That means we need a few different checks instead; in particular, we
don't want fsck to delete a file that has a dirent pointing to it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Check for directories with no backpointers
Kent Overstreet [Sat, 28 Sep 2024 19:27:37 +0000 (15:27 -0400)]
bcachefs: Check for directories with no backpointers

It's legal for regular files to have missing backpointers (due to
hardlinks), and fsck should automatically add them, but for directories
this is an error that should be flagged.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Kill alloc_v4.fragmentation_lru
Kent Overstreet [Tue, 1 Oct 2024 23:08:37 +0000 (19:08 -0400)]
bcachefs: Kill alloc_v4.fragmentation_lru

The fragmentation_lru field hasn't been needed since we reworked the LRU
btrees to use the btree write buffer; previously it was used to resolve
collisions, but the revised LRU btree uses the backpointer (the bucket)
as part of the key.

It should have been deleted at the time of the LRU rework; since it
wasn't, that left places for bugs to hide, in check/repair.

This fixes LRU fsck on a filesystem image helpfully provided by a user
who disappeared before I could get his name for the reported-by.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: minor lru fsck fixes
Kent Overstreet [Tue, 1 Oct 2024 20:40:33 +0000 (16:40 -0400)]
bcachefs: minor lru fsck fixes

check_lru_key() wasn't using write buffer updates for deleting bad lru
entries - dating from before the lru btree used the btree write buffer.

And when possibly flushing the btree write buffer (to make sure we're
seeing a real inconsistency), we need to be using the modern
bch2_btree_write_buffer_maybe_flush().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Mark more errors AUTOFIX
Kent Overstreet [Tue, 1 Oct 2024 20:26:21 +0000 (16:26 -0400)]
bcachefs: Mark more errors AUTOFIX

Errors are getting marked as AUTOFIX once they've been (re)-tested and
audited.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Make sure we print error that causes fsck to bail out
Kent Overstreet [Tue, 1 Oct 2024 20:26:02 +0000 (16:26 -0400)]
bcachefs: Make sure we print error that causes fsck to bail out

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: bkey errors are only AUTOFIX during read
Kent Overstreet [Fri, 4 Oct 2024 19:05:40 +0000 (15:05 -0400)]
bcachefs: bkey errors are only AUTOFIX during read

Newly generated keys, in the transaction commit path or write path,
should not be AUTOFIX; those indicate bugs that we need to fail fast
for.

Fixes: 5612daafb764 ("bcachefs: Fix fsck warnings from bkey validation")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Create lost+found in correct snapshot
Kent Overstreet [Sat, 28 Sep 2024 19:33:08 +0000 (15:33 -0400)]
bcachefs: Create lost+found in correct snapshot

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix reattach_inode()
Kent Overstreet [Sat, 28 Sep 2024 06:44:12 +0000 (02:44 -0400)]
bcachefs: Fix reattach_inode()

Ensure a copy of the lost+found inode exists in the snapshot that we're
reattaching, so that we don't trigger warnings in
lookup_inode_for_snapshot() later.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Add missing wakeup to bch2_inode_hash_remove()
Kent Overstreet [Fri, 4 Oct 2024 23:44:32 +0000 (19:44 -0400)]
bcachefs: Add missing wakeup to bch2_inode_hash_remove()

This fixes two different bugs:

- Looser locking with the rhashtable means we need to recheck if the
  inode is still hashed after prepare_to_wait(), and add a corresponding
  wakeup after removing from the hash table.

- da18ecbf0fb6 ("fs: add i_state helpers") changed the bit waitqueues
  used for inodes, and bcachefs wasn't updated and thus broke; this
  updates bcachefs to the new helper.

Fixes: 112d21fd1a12 ("bcachefs: switch to rhashtable for vfs inodes hash")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agoext4: fix off by one issue in alloc_flex_gd()
Baokun Li [Fri, 27 Sep 2024 13:33:29 +0000 (21:33 +0800)]
ext4: fix off by one issue in alloc_flex_gd()

Wesley reported an issue:

==================================================================
EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks
------------[ cut here ]------------
kernel BUG at fs/ext4/resize.c:324!
CPU: 9 UID: 0 PID: 3576 Comm: resize2fs Not tainted 6.11.0+ #27
RIP: 0010:ext4_resize_fs+0x1212/0x12d0
Call Trace:
 __ext4_ioctl+0x4e0/0x1800
 ext4_ioctl+0x12/0x20
 __x64_sys_ioctl+0x99/0xd0
 x64_sys_call+0x1206/0x20d0
 do_syscall_64+0x72/0x110
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
==================================================================

While reviewing the patch, Honza found that when adjusting resize_bg in
alloc_flex_gd(), it was possible for flex_gd->resize_bg to be bigger than
flexbg_size.

The reproduction of the problem requires the following:

 o_group = flexbg_size * 2 * n;
 o_size = (o_group + 1) * group_size;
 n_group: [o_group + flexbg_size, o_group + flexbg_size * 2)
 n_size = (n_group + 1) * group_size;

Take n=0,flexbg_size=16 as an example:

              last:15
|o---------------|--------------n-|
o_group:0    resize to      n_group:30

The corresponding reproducer is:

img=test.img
rm -f $img
truncate -s 600M $img
mkfs.ext4 -F $img -b 1024 -G 16 8M
dev=`losetup -f --show $img`
mkdir -p /tmp/test
mount $dev /tmp/test
resize2fs $dev 248M

Delete the problematic plus 1 to fix the issue, and add a WARN_ON_ONCE()
to prevent the issue from happening again.
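
The n=0, flexbg_size=16 example can be checked with a small Python
model. It assumes that, when the new last group lands in the flex group
following the old one, resize_bg is derived as
1 << max(fls(last_group - o_group + 1), fls(n_group - last_group)) with
last_group = o_group | (flexbg_size - 1). That formula is not quoted in
this commit message, so treat it and the helper names below as
assumptions; fls() is modelled with int.bit_length().

```python
# Simplified model of the resize_bg sizing, not the kernel implementation.
# The formula and the helper names are assumptions made for illustration.

def fls(x):
    """Position of the most significant set bit, 1-based; fls(0) == 0."""
    return x.bit_length()


def resize_bg(flexbg_size, o_group, n_group, plus_one):
    """Model the case where n_group falls in the flex group after o_group's."""
    last_group = o_group | (flexbg_size - 1)
    extra = 1 if plus_one else 0
    return 1 << max(fls(last_group - o_group + extra),
                    fls(n_group - last_group))


flexbg_size, o_group, n_group = 16, 0, 30   # the n=0 example from the message

with_plus_one = resize_bg(flexbg_size, o_group, n_group, plus_one=True)
without_plus_one = resize_bg(flexbg_size, o_group, n_group, plus_one=False)

print(with_plus_one)     # 32, larger than flexbg_size
print(without_plus_one)  # 16, within flexbg_size
```

Under these assumptions the extra "+ 1" pushes the result past
flexbg_size, and dropping it keeps the result within bounds.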

[ Note: another reproducer which this commit fixes is:

  img=test.img
  rm -f $img
  truncate -s 25MiB $img
  mkfs.ext4 -b 4096 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 $img
  truncate -s 3GiB $img
  dev=`losetup -f --show $img`
  mkdir -p /tmp/test
  mount $dev /tmp/test
  resize2fs $dev 3G
  umount $dev
  losetup -d $dev

  -- TYT ]

Reported-by: Wesley Hershberger <wesley.hershberger@canonical.com>
Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2081231
Reported-by: Stéphane Graber <stgraber@stgraber.org>
Closes: https://lore.kernel.org/all/20240925143325.518508-1-aleksandr.mikhalitsyn@canonical.com/
Tested-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Tested-by: Eric Sandeen <sandeen@redhat.com>
Fixes: 665d3e0af4d3 ("ext4: reduce unnecessary memory allocation in alloc_flex_gd()")
Cc: stable@vger.kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240927133329.1015041-1-libaokun@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
8 months agoext4: mark fc as ineligible using an handle in ext4_xattr_set()
Luis Henriques (SUSE) [Mon, 23 Sep 2024 10:49:09 +0000 (11:49 +0100)]
ext4: mark fc as ineligible using an handle in ext4_xattr_set()

Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
in a fast-commit being done before the filesystem is effectively marked as
ineligible.  This patch moves the call to this function so that a handle
can be used.  If a transaction fails to start, then there's no point in
trying to mark the filesystem as ineligible, and an error will eventually be
returned to user-space.

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240923104909.18342-3-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
8 months agoext4: use handle to mark fc as ineligible in __track_dentry_update()
Luis Henriques (SUSE) [Mon, 23 Sep 2024 10:49:08 +0000 (11:49 +0100)]
ext4: use handle to mark fc as ineligible in __track_dentry_update()

Calling ext4_fc_mark_ineligible() with a NULL handle is racy and may result
in a fast-commit being done before the filesystem is effectively marked as
ineligible.  This patch fixes the calls to this function in
__track_dentry_update() by adding an extra parameter to the callback used in
ext4_fc_track_template().

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20240923104909.18342-2-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
8 months agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 4 Oct 2024 19:20:09 +0000 (12:20 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
 "A couple of build/config issues and expanding the speculative SSBS
  workaround to more CPUs:

   - Expand the speculative SSBS workaround to cover Cortex-A715,
     Neoverse-N3 and Microsoft Azure Cobalt 100

   - Force position-independent veneers - in some kernel configurations,
     the LLD linker generates position-dependent veneers for otherwise
     position-independent code, resulting in early boot-time failures

   - Fix Kconfig selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS so that it
     is not enabled when not supported by the combination of clang and
     GNU ld"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386
  arm64: fix selection of HAVE_DYNAMIC_FTRACE_WITH_ARGS
  arm64: errata: Expand speculative SSBS workaround once more
  arm64: cputype: Add Neoverse-N3 definitions
  arm64: Force position-independent veneers

8 months agoMerge tag 'riscv-for-linus-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 4 Oct 2024 19:16:51 +0000 (12:16 -0700)]
Merge tag 'riscv-for-linus-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - PERF_TYPE_BREAKPOINT now returns -EOPNOTSUPP instead of -ENOENT,
   which aligns to other ports and is a saner value

 - The KASAN-related stack size increasing logic has been moved to a C
   header, to avoid dependency issues

* tag 'riscv-for-linus-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix kernel stack size when KASAN is enabled
  drivers/perf: riscv: Align errno for unsupported perf event

8 months agoMerge tag 'trace-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Fri, 4 Oct 2024 19:11:06 +0000 (12:11 -0700)]
Merge tag 'trace-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix tp_printk command line option crashing the kernel

   With the code that can handle a buffer from a previous boot, the
   trace_check_vprintf() needed access to the delta of the address space
   used by the old buffer and the current buffer. To do so, the
   trace_array (tr) parameter was used. But when tp_printk is enabled on
   the kernel command line, no trace buffer is used and the trace event
   is sent directly to printk(). That meant the tr field of the iterator
   descriptor was NULL, and since tp_printk still uses
   trace_check_vprintf() it caused a NULL dereference.

 - Add ptrace.h include to x86 ftrace file for completeness

 - Fix rtla installation when done with out-of-tree build

 - Fix the help messages in rtla that were incorrect

 - Several fixes to fix races with the timerlat and hwlat code

   Several locking issues were discovered with the coordination between
   timerlat kthread creation and hotplug, as timerlat has callbacks from
   hotplug code to start kthreads when CPUs come online. There are also
   locking issues with grabbing the cpu_read_lock() and the locks within
   timerlat.

* tag 'trace-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/hwlat: Fix a race during cpuhp processing
  tracing/timerlat: Fix a race during cpuhp processing
  tracing/timerlat: Drop interface_lock in stop_kthread()
  tracing/timerlat: Fix duplicated kthread creation due to CPU online/offline
  x86/ftrace: Include <asm/ptrace.h>
  rtla: Fix the help text in osnoise and timerlat top tools
  tools/rtla: Fix installation from out-of-tree build
  tracing: Fix trace_check_vprintf() when tp_printk is used

8 months agoMerge tag 'slab-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka...
Linus Torvalds [Fri, 4 Oct 2024 19:05:39 +0000 (12:05 -0700)]
Merge tag 'slab-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:
 "Fixes for issues introduced in this merge window: kobject memory leak,
  unsuppressed warning and possible lockup in new slub_kunit tests,
  misleading code in kvfree_rcu_queue_batch()"

* tag 'slab-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slub/kunit: skip test_kfree_rcu when the slub kunit test is built-in
  mm, slab: suppress warnings in test_leak_destroy kunit test
  rcu/kvfree: Refactor kvfree_rcu_queue_batch()
  mm, slab: fix use of SLAB_SUPPORTS_SYSFS in kmem_cache_release()

8 months agoMerge tag 'acpi-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 4 Oct 2024 18:59:36 +0000 (11:59 -0700)]
Merge tag 'acpi-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix up the ACPI IRQ override quirk list and add two new entries
  to it, add a new quirk to the ACPI backlight (video) driver, and fix
  the ACPI battery driver.

  Specifics:

   - Add a quirk for Dell OptiPlex 5480 AIO to the ACPI backlight
     (video) driver (Hans de Goede)

   - Prevent the ACPI battery driver from crashing when unregistering a
     battery hook and simplify battery hook locking in it (Armin Wolf)

   - Fix up the ACPI IRQ override quirk list and add quirks for Asus
     Vivobook X1704VAP and Asus ExpertBook B2502CVA to it (Hans de
     Goede)"

* tag 'acpi-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: battery: Fix possible crash when unregistering a battery hook
  ACPI: battery: Simplify battery hook locking
  ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO
  ACPI: resource: Add Asus ExpertBook B2502CVA to irq1_level_low_skip_override[]
  ACPI: resource: Add Asus Vivobook X1704VAP to irq1_level_low_skip_override[]
  ACPI: resource: Loosen the Asus E1404GAB DMI match to also cover the E1404GA
  ACPI: resource: Remove duplicate Asus E1504GAB IRQ override

8 months agoMerge tag 'pm-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 4 Oct 2024 18:57:15 +0000 (11:57 -0700)]
Merge tag 'pm-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix two cpufreq issues, one in the core and one in the
  intel_pstate driver:

   - Fix CPU device node reference counting in the cpufreq core (Miquel
     Sabaté Solà)

   - Turn the spinlock used by the intel_pstate driver in hard IRQ
     context into a raw one to prevent the driver from crashing when
     PREEMPT_RT is enabled (Uwe Kleine-König)"

* tag 'pm-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Avoid a bad reference count on CPU node
  cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock

8 months agoMerge tag 'gpio-fixes-for-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 4 Oct 2024 18:50:38 +0000 (11:50 -0700)]
Merge tag 'gpio-fixes-for-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix a potential NULL-pointer dereference in gpiolib core

 - fix a probe() regression from the v6.12 merge window and an older bug
   leading to missed interrupts in gpio-davinci

* tag 'gpio-fixes-for-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: Fix potential NULL pointer dereference in gpiod_get_label()
  gpio: davinci: Fix condition for irqchip registration
  gpio: davinci: fix lazy disable

8 months agoMerge tag 'sound-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 4 Oct 2024 18:29:46 +0000 (11:29 -0700)]
Merge tag 'sound-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Slightly high amount of changes in this round, partly because of my
  vacation in the last weeks. But all changes are small and nothing
  looks worrisome.

  The biggest LOCs is MAINTAINERS updates, and there is a core change
  for card-ID string creation for non-ASCII inputs. Others are rather
  device-specific, such as new quirks and device IDs for ASoC, usual
  HD-audio and USB-audio quirks and fixes, as well as regression fixes
  in HD-audio HDMI audio and Conexant codec"

* tag 'sound-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (39 commits)
  ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin
  ALSA: line6: add hw monitor volume control to POD HD500X
  ALSA: gus: Fix some error handling paths related to get_bpos() usage
  ALSA: hda: Add missing parameter description for snd_hdac_stream_timecounter_init()
  ALSA: usb-audio: Add native DSD support for Luxman D-08u
  ALSA: core: add isascii() check to card ID generator
  MAINTAINERS: ALSA: use linux-sound@vger.kernel.org list
  Revert "ALSA: hda: Conditionally use snooping for AMD HDMI"
  ASoC: intel: sof_sdw: Add check devm_kasprintf() returned value
  ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m
  ASoC: dt-bindings: davinci-mcasp: Fix interrupts property
  ASoC: qcom: sm8250: add qrb4210-rb2-sndcard compatible string
  ASoC: dt-bindings: qcom,sm8250: add qrb4210-rb2-sndcard
  ALSA: hda: fix trigger_tstamp_latched
  ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200
  ALSA: hda/generic: Drop obsoleted obey_preferred_dacs flag
  ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs
  ALSA: silence integer wrapping warning
  ASoC: Intel: soc-acpi: arl: Fix some missing empty terminators
  ASoC: Intel: soc-acpi-intel-rpl-match: add missing empty item
  ...

8 months agoMerge tag 'drm-fixes-2024-10-04' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Fri, 4 Oct 2024 18:25:14 +0000 (11:25 -0700)]
Merge tag 'drm-fixes-2024-10-04' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Weekly fixes, xe and amdgpu lead the way, with panthor and a few core
  components getting various fixes. Nothing seems too out of the
  ordinary.

  atomic:
   - Use correct type when reading damage rectangles

  display:
   - Fix kernel docs

  dp-mst:
   - Fix DSC decompression detection

  hdmi:
   - Fix infoframe size

  sched:
   - Update maintainers
   - Fix race condition when queueing up jobs
   - Fix locking in drm_sched_entity_modify_sched()
   - Fix pointer deref if entity queue changes

  sysfb:
   - Disable sysfb if framebuffer parent device is unknown

  amdgpu:
   - DML2 fix
   - DSC fix
   - Dispclk fix
   - eDP HDR fix
   - IPS fix
   - TBT fix

  i915:
   - One fix for bitwise and logical "and" mixup in PM code

  xe:
   - Restore pci state on resume
   - Fix locking on submission, queue and vm
   - Fix UAF on queue destruction
   - Fix resource release on freq init error path
   - Use rw_semaphore to reduce contention on ASID->VM lookup
   - Fix steering for media on Xe2_HPM
   - Tuning updates to Xe2
   - Resume TDR after GT reset to prevent jobs running forever
   - Move id allocation to avoid userspace using a guessed number to
     trigger UAF
   - Fix OA stream close preventing pbatch buffers to complete
   - Fix NPD when migrating memory on LNL
   - Fix memory leak when aborting binds

  panthor:
   - Fix locking
   - Set FOP_UNSIGNED_OFFSET in fops instance
   - Acquire lock in panthor_vm_prepare_map_op_ctx()
   - Avoid uninitialized variable in tick_ctx_cleanup()
   - Do not block scheduler queue if work is pending
   - Do not add write fences to the shared BOs

  vbox:
   - Fix VLA handling"

* tag 'drm-fixes-2024-10-04' of https://gitlab.freedesktop.org/drm/kernel: (41 commits)
  drm/xe: Fix memory leak when aborting binds
  drm/xe: Prevent null pointer access in xe_migrate_copy
  drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close
  drm/xe/queue: move xa_alloc to prevent UAF
  drm/xe/vm: move xa_alloc to prevent UAF
  drm/xe: Clean up VM / exec queue file lock usage.
  drm/xe: Resume TDR after GT reset
  drm/xe/xe2: Add performance tuning for L3 cache flushing
  drm/xe/xe2: Extend performance tuning to media GT
  drm/xe/mcr: Use Xe2_LPM steering tables for Xe2_HPM
  drm/xe: Use helper for ASID -> VM in GPU faults and access counters
  drm/xe: Convert to USM lock to rwsem
  drm/xe: use devm_add_action_or_reset() helper
  drm/xe: fix UAF around queue destruction
  drm/xe/guc_submit: add missing locking in wedged_fini
  drm/xe: Restore pci state upon resume
  drm/amd/display: Fix system hang while resume with TBT monitor
  drm/amd/display: Enable idle workqueue for more IPS modes
  drm/amd/display: Add HDR workaround for specific eDP
  drm/amd/display: avoid set dispclk to 0
  ...

8 months agoMerge tag 'block-6.12-20241004' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 4 Oct 2024 17:43:44 +0000 (10:43 -0700)]
Merge tag 'block-6.12-20241004' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - Fix another use-after-free in aoe

 - Fixup wrong nested non-saving irq disable/restore in blk-iocost

 - Fixup a kerneldoc complaint introduced by a merge window patch

* tag 'block-6.12-20241004' of git://git.kernel.dk/linux:
  aoe: fix the potential use-after-free problem in more places
  blk_iocost: remove some duplicate irq disable/enables
  block: fix blk_rq_map_integrity_sg kernel-doc

8 months agoMerge tag 'io_uring-6.12-20241004' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 4 Oct 2024 17:39:36 +0000 (10:39 -0700)]
Merge tag 'io_uring-6.12-20241004' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:

 - Fix an error path memory leak, if one part fails to allocate.
   Obviously not something that'll generally hit without error
   injection.

 - Fix an io_req_flags_t cast to make sparse happier.

 - Improve the recv multishot termination. Not a bug now, but could be
   one in the future. This makes it do the same thing that recvmsg does
   in terms of when to terminate a request or not.

* tag 'io_uring-6.12-20241004' of git://git.kernel.dk/linux:
  io_uring/net: harden multishot termination case for recv
  io_uring: fix casts to io_req_flags_t
  io_uring: fix memory leak when cache init fail

8 months ago Merge tag 'fsnotify_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 4 Oct 2024 17:31:59 +0000 (10:31 -0700)]
Merge tag 'fsnotify_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify fixes from Jan Kara:
 "Fixes for an inotify deadlock and a data race in fsnotify"

* tag 'fsnotify_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  inotify: Fix possible deadlock in fsnotify_destroy_mark
  fsnotify: Avoid data race between fsnotify_recalc_mask() and fsnotify_object_watched()

8 months ago Merge tag 'fs_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack...
Linus Torvalds [Fri, 4 Oct 2024 17:24:06 +0000 (10:24 -0700)]
Merge tag 'fs_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull UDF fixes from Jan Kara:
 "A couple of UDF error handling fixes for issues spotted by syzbot"

* tag 'fs_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: fix uninit-value use in udf_get_fileshortad
  udf: refactor inode_bmap() to handle error
  udf: refactor udf_next_aext() to handle error
  udf: refactor udf_current_aext() to handle error

8 months ago Merge tag 'ceph-for-6.12-rc2' of https://github.com/ceph/ceph-client
Linus Torvalds [Fri, 4 Oct 2024 17:10:23 +0000 (10:10 -0700)]
Merge tag 'ceph-for-6.12-rc2' of https://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A fix from Patrick for a variety of CephFS lockup scenarios caused by
  a regression in cap handling which sneaked in through the netfs helper
  library in 5.18 (marked for stable) and an unrelated one-line cleanup"

* tag 'ceph-for-6.12-rc2' of https://github.com/ceph/ceph-client:
  ceph: fix cap ref leak via netfs init_request
  ceph: use struct_size() helper in __ceph_pool_perm_get()