Naveen N Rao (AMD) [Sun, 20 Apr 2025 16:32:48 +0000 (22:02 +0530)]
MAINTAINERS: powerpc: Remove myself as a reviewer
I haven't been able to participate and help with this as much as I had
hoped, and it doesn't look like I will be able to spend time on powerpc
going forward.
Signed-off-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420163248.1746379-1-naveen@kernel.org
Thorsten Blum [Mon, 10 Feb 2025 22:42:44 +0000 (23:42 +0100)]
powerpc/iommu: Use str_disabled_enabled() helper
Remove hard-coded strings by using the str_disabled_enabled() helper.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250210224246.363318-1-thorsten.blum@linux.dev
Thorsten Blum [Fri, 17 Jan 2025 11:46:20 +0000 (12:46 +0100)]
powerpc/powermac: Use str_enabled_disabled() and str_on_off() helpers
Remove hard-coded strings by using the str_enabled_disabled() and
str_on_off() helper functions.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250117114625.64903-2-thorsten.blum@linux.dev
Thorsten Blum [Mon, 10 Feb 2025 10:06:46 +0000 (11:06 +0100)]
powerpc/mm/fault: Use str_write_read() helper function
Remove hard-coded strings by using the str_write_read() helper function.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250210100648.1440-2-thorsten.blum@linux.dev
Thorsten Blum [Mon, 21 Apr 2025 18:31:08 +0000 (20:31 +0200)]
powerpc: Replace strcpy() with strscpy() in proc_ppc64_init()
strcpy() is deprecated; use strscpy() instead.
Don't cast the destination buffer from 'u8[]' to 'char *' to satisfy the
__must_be_array() requirement of strscpy().
No functional changes intended.
Link: https://github.com/KSPP/linux/issues/88
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250421183110.436265-1-thorsten.blum@linux.dev
Gaurav Batra [Mon, 12 May 2025 22:46:53 +0000 (17:46 -0500)]
powerpc/pseries/iommu: Fix kmemleak in TCE table userspace view
When a device is opened by a userspace driver, via VFIO interface, DMA
window is created. This DMA window has TCE Table and a corresponding
data for userview of TCE table.
When the userspace driver closes the device, all the above infrastructure
is free'ed and the device control given back to kernel. Both DMA window
and TCE table is getting free'ed. But due to a code bug, userview of the
TCE table is not getting free'ed. This is resulting in a memory leak.
Befow is the information from KMEMLEAK
unreferenced object 0xc008000022af0000 (size
16777216):
comm "senlib_unit_tes", pid 9346, jiffies
4294983174
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 0):
kmemleak_vmalloc+0xc8/0x1a0
__vmalloc_node_range+0x284/0x340
vzalloc+0x58/0x70
spapr_tce_create_table+0x4b0/0x8d0
tce_iommu_create_table+0xcc/0x170 [vfio_iommu_spapr_tce]
tce_iommu_create_window+0x144/0x2f0 [vfio_iommu_spapr_tce]
tce_iommu_ioctl.part.0+0x59c/0xc90 [vfio_iommu_spapr_tce]
vfio_fops_unl_ioctl+0x88/0x280 [vfio]
sys_ioctl+0xf4/0x160
system_call_exception+0x164/0x310
system_call_vectored_common+0xe8/0x278
unreferenced object 0xc008000023b00000 (size
4194304):
comm "senlib_unit_tes", pid 9351, jiffies
4294984116
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 0):
kmemleak_vmalloc+0xc8/0x1a0
__vmalloc_node_range+0x284/0x340
vzalloc+0x58/0x70
spapr_tce_create_table+0x4b0/0x8d0
tce_iommu_create_table+0xcc/0x170 [vfio_iommu_spapr_tce]
tce_iommu_create_window+0x144/0x2f0 [vfio_iommu_spapr_tce]
tce_iommu_create_default_window+0x88/0x120 [vfio_iommu_spapr_tce]
tce_iommu_ioctl.part.0+0x57c/0xc90 [vfio_iommu_spapr_tce]
vfio_fops_unl_ioctl+0x88/0x280 [vfio]
sys_ioctl+0xf4/0x160
system_call_exception+0x164/0x310
system_call_vectored_common+0xe8/0x278
Fixes:
f431a8cde7f1 ("powerpc/iommu: Reimplement the iommu_table_group_ops for pSeries")
Signed-off-by: Gaurav Batra <gbatra@linux.ibm.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250512224653.35697-1-gbatra@linux.ibm.com
Madhavan Srinivasan [Sun, 11 May 2025 04:11:11 +0000 (09:41 +0530)]
powerpc/kernel: Fix ppc_save_regs inclusion in build
Recent patch fixed an old commit
'
fc2a5a6161a2 ("powerpc/64s: ppc_save_regs is now needed for all 64s builds")'
which is to include building of ppc_save_reg.c only when XMON
and KEXEC_CORE and PPC_BOOK3S are enabled. This was valid, since
ppc_save_regs was called only in replay_system_reset() of old
irq.c which was under BOOK3S.
But there has been multiple refactoring of irq.c and have
added call to ppc_save_regs() from __replay_soft_interrupts
-> replay_soft_interrupts which is part of irq_64.c included
under CONFIG_PPC64. And since ppc_save_regs is called in
CRASH_DUMP path as part of crash_setup_regs in kexec.h,
CONFIG_PPC32 also needs it.
So with this recent patch which enabled the building of
ppc_save_regs.c caused a build break when none of these
(XMON, KEXEC_CORE, BOOK3S) where enabled as part of config.
Patch to enable building of ppc_save_regs.c by defaults.
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250511041111.841158-1-maddy@linux.ibm.com
Thorsten Blum [Sun, 10 Nov 2024 16:21:37 +0000 (17:21 +0100)]
powerpc: Transliterate author name and remove FIXME
The name is Mimi Phuong-Thao Vo.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20241110162139.5179-2-thorsten.blum@linux.dev
Athira Rajeev [Tue, 6 May 2025 13:52:32 +0000 (19:22 +0530)]
powerpc/pseries/htmdump: Include header file to get is_kvm_guest() definition
htmdump_init calls is_kvm_guest() to check for guest environment.
is_kvm_guest() is defined in kvm_guest.h header file. Without including
the header file, build hits error failing to find the function.
arch/powerpc/platforms/pseries/htmdump.c: In function 'htmdump_init':
arch/powerpc/platforms/pseries/htmdump.c:469:6: error: implicit declaration of function 'is_kvm_guest';
did you mean '__key_get'? [-Werror=implicit-function-declaration]
if (is_kvm_guest()) {
^~~~~~~~~~~~
__key_get
This is observed in configs where CONFIG_KVM_GUEST is disabled.
Include header file explicitly to avoid the build error
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202505061324.elUl4njU-lkp@intel.com/
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250506135232.69014-1-atrajeev@linux.ibm.com
Amit Machhiwal [Fri, 25 Apr 2025 18:56:41 +0000 (00:26 +0530)]
KVM: PPC: Book3S HV: Fix IRQ map warnings with XICS on pSeries KVM Guest
The commit
9576730d0e6e ("KVM: PPC: select IRQ_BYPASS_MANAGER") enabled
IRQ_BYPASS_MANAGER when CONFIG_KVM was set. Subsequently, commit
c57875f5f9be ("KVM: PPC: Book3S HV: Enable IRQ bypass") enabled IRQ
bypass and added the necessary callbacks to create/remove the mappings
between host real IRQ and the guest GSI.
The availability of IRQ bypass is determined by the arch-specific
function kvm_arch_has_irq_bypass(), which invokes
kvmppc_irq_bypass_add_producer_hv(). This function, in turn, calls
kvmppc_set_passthru_irq_hv() to create a mapping in the passthrough IRQ
map, associating a host IRQ to a guest GSI.
However, when a pSeries KVM guest (L2) is booted within an LPAR (L1)
with the kernel boot parameter `xive=off`, it defaults to using emulated
XICS controller. As an attempt to establish host IRQ to guest GSI
mappings via kvmppc_set_passthru_irq() on a PCI device hotplug
(passhthrough) operation fail, returning -ENOENT. This failure occurs
because only interrupts with EOI operations handled through OPAL calls
(verified via is_pnv_opal_msi()) are currently supported.
These mapping failures lead to below repeated warnings in the L1 host:
[ 509.220349] kvmppc_set_passthru_irq_hv: Could not assign IRQ map for (58,4970)
[ 509.220368] kvmppc_set_passthru_irq (irq 58, gsi 4970) fails: -2
[ 509.220376] vfio-pci 0015:01:00.0: irq bypass producer (token
0000000090bc635b) registration fails: -2
...
[ 509.291781] vfio-pci 0015:01:00.0: irq bypass producer (token
000000003822eed8) registration fails: -2
Fix this by restricting IRQ bypass enablement on pSeries systems by
making the IRQ bypass callbacks unavailable when running on pSeries
platform.
Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250425185641.1611857-1-amachhiw@linux.ibm.com
Christophe Leroy [Fri, 2 May 2025 13:07:53 +0000 (15:07 +0200)]
powerpc/8xx: Reduce alignment constraint for kernel memory
8xx has three large page sizes: 8M, 512k and 16k.
A too big alignment can lead to wasting memory. On a board which has
only 32 MBytes of RAM, every single byte is worth it and a 512k
alignment is sometimes too much.
Allow mapping kernel memory with 16k pages and reduce the constraint
on kernel memory alignment. 512k and 16k pages are handled the same
way so reverse tests in order to make 8M pages the special case and
other ones (512k and 16k) the alternative.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/fa9927b70df13627cdf10b992ea71d6562c7760e.1746191262.git.christophe.leroy@csgroup.eu
Michal Suchanek [Mon, 31 Mar 2025 10:57:19 +0000 (12:57 +0200)]
powerpc/boot: Fix build with gcc 15
Similar to x86 the ppc boot code does not build with GCC 15.
Copy the fix from
commit
ee2ab467bddf ("x86/boot: Use '-std=gnu11' to fix build with GCC 15")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Tested-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250331105722.19709-1-msuchanek@suse.de
Athira Rajeev [Sun, 20 Apr 2025 18:08:44 +0000 (23:38 +0530)]
powerpc/pseries/htmdump: Add documentation for H_HTM debugfs interface
Documentation for HTM (Hardware Trace Macro) debugfs interface
and how it can be used to configure/control the HTM operations.
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420180844.53128-10-atrajeev@linux.ibm.com
Athira Rajeev [Sun, 20 Apr 2025 18:08:43 +0000 (23:38 +0530)]
powerpc/pseries/htmdump: Add htm capabilities support to htmdump module
Support dumping HTM capabilities information from Hardware
Trace Macro (HTM) function via debugfs interface. Under
debugfs folder "/sys/kernel/debug/powerpc/htmdump", add
file "htmcaps".
The interface allows only read of this file which will present the
content of HTM buffer from the hcall.
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420180844.53128-9-atrajeev@linux.ibm.com
Athira Rajeev [Sun, 20 Apr 2025 18:08:42 +0000 (23:38 +0530)]
powerpc/pseries/htmdump: Add htm flags support to htmdump module
Under debugfs folder, "/sys/kernel/debug/powerpc/htmdump", add file
"htmflags". Currently supported flag value is to enable/disable
HTM buffer wrap. wrap is used along with "configure" to prevent
HTM buffer from wrapping. Writing 1 will set noWrap while
configuring HTM
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420180844.53128-8-atrajeev@linux.ibm.com
Athira Rajeev [Sun, 20 Apr 2025 18:08:41 +0000 (23:38 +0530)]
powerpc/pseries/htmdump: Add htm setup support to htmdump module
Add htm setup support to htmdump module. To use the
HTM (Hardware Trace Macro), HTM buffer has to be allocated.
Support setup of HTM buffers via debugfs interface. Under
debugfs folder, "/sys/kernel/debug/powerpc/htmdump", add file
"htmsetup". The interface allows setup of HTM buffer by writing
size of HTM buffer in power of 2 to the "htmsetup" file
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420180844.53128-7-atrajeev@linux.ibm.com
Athira Rajeev [Sun, 20 Apr 2025 18:08:40 +0000 (23:38 +0530)]
powerpc/pseries/htmdump: Add htm info support to htmdump module
Support dumping system processor configuration from Hardware
Trace Macro (HTM) function via debugfs interface. Under
debugfs folder "/sys/kernel/debug/powerpc/htmdump", add
file "htminfo".
The interface allows only read of this file which will present the
content of HTM buffer from the hcall. The 16th offset of HTM
buffer has value for the number of entries for array of processors.
Use this information to copy data to the debugfs file
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420180844.53128-6-atrajeev@linux.ibm.com
Athira Rajeev [Sun, 20 Apr 2025 18:08:39 +0000 (23:38 +0530)]
powerpc/pseries/htmdump: Add htm status support to htmdump module
Support dumping status of Hardware Trace Macro (HTM) function
via debugfs interface. Under debugfs folder
"/sys/kernel/debug/powerpc/htmdump", add file "htmstatus".
The interface allows only read of this file which will present the
content of HTM status buffer from the hcall. The 16th offset of HTM
status buffer has value for the number of HTM entries in the status
buffer. Each nest htm status entry is 0x6 bytes, where as core HTM
status entry is 0x8 bytes. Calculate the number of bytes to read
based on this detail.
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420180844.53128-5-atrajeev@linux.ibm.com
Athira Rajeev [Sun, 20 Apr 2025 18:08:38 +0000 (23:38 +0530)]
powerpc/pseries/htmdump: Add htm start support to htmdump module
Support starting of Hardware Trace Macro (HTM) function
via debugfs interface. Under debugfs folder
"/sys/kernel/debug/powerpc/htmdump", add file "htmstart".
The interface allows starting of htm via this file by
writing value "1". Also allows stopping of htm tracing by
writing value "0" to this file. Any other value returns
-EINVAL.
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420180844.53128-4-atrajeev@linux.ibm.com
Athira Rajeev [Sun, 20 Apr 2025 18:08:37 +0000 (23:38 +0530)]
powerpc/pseries/htmdump: Add htm configure support to htmdump module
Support configuring of Hardware Trace Macro (HTM) function
via debugfs interface. Under debugfs folder
"/sys/kernel/debug/powerpc/htmdump", add file "htmconfigure".
The interface allows configuring of htm via this file
by writing value "1". Allow deconfiguring of htm via this file
by writing value "0". Any other value returns -EINVAL.
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420180844.53128-3-atrajeev@linux.ibm.com
Athira Rajeev [Sun, 20 Apr 2025 18:08:36 +0000 (23:38 +0530)]
powerpc/pseries/htmdump: Add htm_hcall_wrapper to integrate other htm operations
H_HTM (Hardware Trace Macro) hypervisor call is an HCALL to export data
from Hardware Trace Macro (HTM) function. The debugfs interface to
export the HTM function data in an lpar currently supports only dumping
of HTM data in an lpar. To add support for setup, configuration and
control of HTM function via debugfs interface, update the hcall wrapper
function. Rename and update htm_get_dump_hardware to htm_hcall_wrapper()
so that it can be used for other HTM operations as well. Additionally
include parameter "htm_op". Update htmdump module to check the return
code of hcall in a separate function so that it can be reused for other
option too. Add check to disable the interface in guest environment.
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250420180844.53128-2-atrajeev@linux.ibm.com
Bartosz Golaszewski [Fri, 2 May 2025 08:59:51 +0000 (10:59 +0200)]
powerpc: 8xx/gpio: use new line value setter callbacks
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu> # powerpc 8xx
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250502-gpiochip-set-rv-powerpc-v2-5-488e43e325bf@linaro.org
Bartosz Golaszewski [Fri, 2 May 2025 08:59:50 +0000 (10:59 +0200)]
powerpc: 52xx/gpio: use new line value setter callbacks
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250502-gpiochip-set-rv-powerpc-v2-4-488e43e325bf@linaro.org
Bartosz Golaszewski [Fri, 2 May 2025 08:59:49 +0000 (10:59 +0200)]
powerpc: 44x/gpio: use new line value setter callbacks
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250502-gpiochip-set-rv-powerpc-v2-3-488e43e325bf@linaro.org
Bartosz Golaszewski [Fri, 2 May 2025 08:59:48 +0000 (10:59 +0200)]
powerpc: 83xx/gpio: use new line value setter callbacks
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250502-gpiochip-set-rv-powerpc-v2-2-488e43e325bf@linaro.org
Bartosz Golaszewski [Fri, 2 May 2025 08:59:47 +0000 (10:59 +0200)]
powerpc: sysdev/gpio: use new line value setter callbacks
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250502-gpiochip-set-rv-powerpc-v2-1-488e43e325bf@linaro.org
Haren Myneni [Wed, 30 Apr 2025 02:28:47 +0000 (19:28 -0700)]
Documentation: Fix description format for powerpc RTAS ioctls
Fix the description format for the following build warnings:
"Documentation/userspace-api/ioctl/ioctl-number.rst:369:
ERROR: Malformed table. Text in column margin in table line 301.
0xB2 03-05 arch/powerpc/include/uapi/asm/papr-indices.h
powerpc/pseries indices API
<mailto:linuxppc-dev>
0xB2 06-07 arch/powerpc/include/uapi/asm/papr-platform-dump.h
powerpc/pseries Platform Dump API
<mailto:linuxppc-dev>
0xB2 08 arch/powerpc/include/uapi/asm/papr-physical-attestation.h
powerpc/pseries Physical Attestation API
<mailto:linuxppc-dev>"
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Fixes:
43d869ac25f1 ("powerpc/pseries: Define papr_indices_io_block for papr-indices ioctls")
Fixes:
8aa9efc0be66 ("powerpc/pseries: Add papr-platform-dump character driver for dump retrieval")
Fixes:
86900ab620a4 ("powerpc/pseries: Add a char driver for physical-attestation RTAS")
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Closes: https://lore.kernel.org/linux-next/
20250429181707.
7848912b@canb.auug.org.au/
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250430022847.1118093-1-haren@linux.ibm.com
Haren Myneni [Tue, 29 Apr 2025 21:14:18 +0000 (14:14 -0700)]
powerpc/pseries: Include linux/types.h in papr-platform-dump.h
Fix the following build warning:
usr/include/asm/papr-platform-dump.h:12: found __[us]{8,16,32,64} type without #include <linux/types.h>
Fixes:
8aa9efc0be66 ("powerpc/pseries: Add papr-platform-dump character driver for dump retrieval")
Closes: https://lore.kernel.org/linux-next/
20250429185735.
034ba678@canb.auug.org.au/
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
[Maddy: fixed the commit to combine tags together]
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250429211419.1081354-1-haren@linux.ibm.com
Christophe Leroy [Wed, 12 Feb 2025 07:40:48 +0000 (08:40 +0100)]
powerpc: Don't use --- in kernel logs
When a kernel log containing --- at the start of a line is copied into
a patch message, 'git am' drops everything located after that ---.
Replace --- by ---- to avoid that.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/54a1f8d2c3fb5b95434039724c8c141052ae5cc0.1739346038.git.christophe.leroy@csgroup.eu
Eddie James [Tue, 11 Feb 2025 16:20:54 +0000 (10:20 -0600)]
powerpc/crash: Fix non-smp kexec preparation
In non-smp configurations, crash_kexec_prepare is never called in
the crash shutdown path. One result of this is that the crashing_cpu
variable is never set, preventing crash_save_cpu from storing the
NT_PRSTATUS elf note in the core dump.
Fixes:
c7255058b543 ("powerpc/crash: save cpu register data in crash_smp_send_stop()")
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250211162054.857762-1-eajames@linux.ibm.com
Jiri Slaby (SUSE) [Thu, 17 Apr 2025 10:53:05 +0000 (12:53 +0200)]
powerpc: do not build ppc_save_regs.o always
The Fixes commit below tried to add CONFIG_PPC_BOOK3S to one of the
conditions to enable the build of ppc_save_regs.o. But it failed to do
so, in fact. The commit omitted to add a dollar sign.
Therefore, ppc_save_regs.o is built always these days (as
"(CONFIG_PPC_BOOK3S)" is never an empty string).
Fix this by adding the missing dollar sign.
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Fixes:
fc2a5a6161a2 ("powerpc/64s: ppc_save_regs is now needed for all 64s builds")
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250417105305.397128-1-jirislaby@kernel.org
Gautam Menghani [Wed, 5 Mar 2025 09:02:36 +0000 (14:32 +0530)]
powerpc/pseries/msi: Avoid reading PCI device registers in reduced power states
When a system is being suspended to RAM, the PCI devices are also
suspended and the PPC code ends up calling pseries_msi_compose_msg() and
this triggers the BUG_ON() in __pci_read_msi_msg() because the device at
this point is in reduced power state. In reduced power state, the memory
mapped registers of the PCI device are not accessible.
To replicate the bug:
1. Make sure deep sleep is selected
# cat /sys/power/mem_sleep
s2idle [deep]
2. Make sure console is not suspended (so that dmesg logs are visible)
echo N > /sys/module/printk/parameters/console_suspend
3. Suspend the system
echo mem > /sys/power/state
To fix this behaviour, read the cached msi message of the device when the
device is not in PCI_D0 power state instead of touching the hardware.
Fixes:
a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250305090237.294633-1-gautam@linux.ibm.com
Hari Bathini [Tue, 22 Apr 2025 08:26:09 +0000 (13:56 +0530)]
powerpc/bpf: fix JIT code size calculation of bpf trampoline
arch_bpf_trampoline_size() provides JIT size of the BPF trampoline
before the buffer for JIT'ing it is allocated. The total number of
instructions emitted for BPF trampoline JIT code depends on where
the final image is located. So, the size arrived at with the dummy
pass in arch_bpf_trampoline_size() can vary from the actual size
needed in arch_prepare_bpf_trampoline(). When the instructions
accounted in arch_bpf_trampoline_size() is less than the number of
instructions emitted during the actual JIT compile of the trampoline,
the below warning is produced:
WARNING: CPU: 8 PID: 204190 at arch/powerpc/net/bpf_jit_comp.c:981 __arch_prepare_bpf_trampoline.isra.0+0xd2c/0xdcc
which is:
/* Make sure the trampoline generation logic doesn't overflow */
if (image && WARN_ON_ONCE(&image[ctx->idx] >
(u32 *)rw_image_end - BPF_INSN_SAFETY)) {
So, during the dummy pass, instead of providing some arbitrary image
location, account for maximum possible instructions if and when there
is a dependency with image location for JIT'ing.
Fixes:
d243b62b7bd3 ("powerpc64/bpf: Add support for bpf trampolines")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/all/
6168bfc8-659f-4b5a-a6fb-
90a916dde3b3@linux.ibm.com/
Cc: stable@vger.kernel.org # v6.13+
Acked-by: Naveen N Rao (AMD) <naveen@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250422082609.949301-1-hbathini@linux.ibm.com
Hari Bathini [Wed, 16 Apr 2025 19:12:27 +0000 (00:42 +0530)]
powerpc64/ftrace: fix clobbered r15 during livepatching
While r15 is clobbered always with PPC_FTRACE_OUT_OF_LINE, it is
not restored in livepatch sequence leading to not so obvious fails
like below:
BUG: Unable to handle kernel data access on write at 0xc0000000000f9078
Faulting instruction address: 0xc0000000018ff958
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP:
c0000000018ff958 LR:
c0000000018ff930 CTR:
c0000000009c0790
REGS:
c00000005f2e7790 TRAP: 0300 Tainted: G K (6.14.0+)
MSR:
8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR:
2822880b XER:
20040000
CFAR:
c0000000008addc0 DAR:
c0000000000f9078 DSISR:
0a000000 IRQMASK: 1
GPR00:
c0000000018f2584 c00000005f2e7a30 c00000000280a900 c000000017ffa488
GPR04:
0000000000000008 0000000000000000 c0000000018f24fc 000000000000000d
GPR08:
fffffffffffe0000 000000000000000d 0000000000000000 0000000000008000
GPR12:
c0000000009c0790 c000000017ffa480 c00000005f2e7c78 c0000000000f9070
GPR16:
c00000005f2e7c90 0000000000000000 0000000000000000 0000000000000000
GPR20:
0000000000000000 c00000005f3efa80 c00000005f2e7c60 c00000005f2e7c88
GPR24:
c00000005f2e7c60 0000000000000001 c0000000000f9078 0000000000000000
GPR28:
00007fff97960000 c000000017ffa480 0000000000000000 c0000000000f9078
...
Call Trace:
check_heap_object+0x34/0x390 (unreliable)
__mutex_unlock_slowpath.isra.0+0xe4/0x230
seq_read_iter+0x430/0xa90
proc_reg_read_iter+0xa4/0x200
vfs_read+0x41c/0x510
ksys_read+0xa4/0x190
system_call_exception+0x1d0/0x440
system_call_vectored_common+0x15c/0x2ec
Fix it by restoring r15 always.
Fixes:
eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line")
Reported-by: Viktor Malik <vmalik@redhat.com>
Closes: https://lore.kernel.org/lkml/
1aec4a9a-a30b-43fd-b303-
7a351caeccb7@redhat.com
Cc: stable@vger.kernel.org # v6.13+
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Tested-by: Viktor Malik <vmalik@redhat.com>
Acked-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416191227.201146-1-hbathini@linux.ibm.com
Haren Myneni [Wed, 16 Apr 2025 22:57:42 +0000 (15:57 -0700)]
powerpc/pseries: Add a char driver for physical-attestation RTAS
The RTAS call ibm,physical-attestation is used to retrieve
information about the trusted boot state of the firmware and
hypervisor on the system, and also Trusted Platform Modules (TPM)
data if the system is TCG 2.0 compliant.
This RTAS interface expects the caller to define different command
structs such as RetrieveTPMLog, RetrievePlatformCertificat and etc,
in a work area with a maximum size of 4K bytes and the response
buffer will be returned in the same work area.
The current implementation of this RTAS function is in the user
space but allocation of the work area is restricted with the system
lockdown. So this patch implements this RTAS function in the kernel
and expose to the user space with open/ioctl/read interfaces.
PAPR (2.13+ 21.3 ibm,physical-attestation) defines RTAS function:
- Pass the command struct to obtain the response buffer for the
specific command.
- This RTAS function is sequence RTAS call and has to issue RTAS
call multiple times to get the complete response buffer (max 64K).
The hypervisor expects the first RTAS call with the sequence 1 and
the subsequent calls with the sequence number returned from the
previous calls.
Expose these interfaces to user space with a
/dev/papr-physical-attestation character device using the following
programming model:
int devfd = open("/dev/papr-physical-attestation");
int fd = ioctl(devfd, PAPR_PHY_ATTEST_IOC_HANDLE,
struct papr_phy_attest_io_block);
- The user space defines the command struct and requests the
response for any command.
- Obtain the complete response buffer and returned the buffer as
blob to the command specific FD.
size = read(fd, buf, len);
- Can retrieve the response buffer once or multiple times until the
end of BLOB buffer.
Implemented this new kernel ABI support in librtas library for
system lockdown
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416225743.596462-8-haren@linux.ibm.com
Haren Myneni [Wed, 16 Apr 2025 22:57:41 +0000 (15:57 -0700)]
powerpc/pseries: Add papr-platform-dump character driver for dump retrieval
ibm,platform-dump RTAS call in combination with writable mapping
/dev/mem is issued to collect platform dump from the hypervisor
and may need multiple calls to get the complete dump. The current
implementation uses rtas_platform_dump() API provided by librtas
library to issue these RTAS calls. But /dev/mem access by the
user space is prohibited under system lockdown.
The solution should be to restrict access to RTAS function in user
space and provide kernel interfaces to collect dump. This patch
adds papr-platform-dump character driver and expose standard
interfaces such as open / ioctl/ read to user space in ways that
are compatible with lockdown.
PAPR (7.3.3.4.1 ibm,platform-dump) provides a method to obtain
the complete dump:
- Each dump will be identified by ID called dump tag.
- A sequence of RTAS calls have to be issued until retrieve the
complete dump. The hypervisor expects the first RTAS call with
the sequence 0 and the subsequent calls with the sequence
number returned from the previous calls.
- The hypervisor returns "dump complete" status once the complete
dump is retrieved. But expects one more RTAS call from the
partition with the NULL buffer to invalidate dump which means
the dump will be removed in the hypervisor.
- Sequence of calls are allowed with different dump IDs at the
same time but not with the same dump ID.
Expose these interfaces to user space with a /dev/papr-platform-dump
character device using the following programming model:
int devfd = open("/dev/papr-platform-dump", O_RDONLY);
int fd = ioctl(devfd,PAPR_PLATFORM_DUMP_IOC_CREATE_HANDLE, &dump_id)
- Restrict user space to access with the same dump ID.
Typically we do not expect user space requests the dump
again for the same dump ID.
char *buf = malloc(size);
length = read(fd, buf, size);
- size should be minimum 1K based on PAPR and <= 4K based
on RTAS work area size. It will be restrict to RTAS work
area size. Using 4K work area based on the current
implementation in librtas library
- Each read call issue RTAS call to get the data based on
the size requirement and returns bytes returned from the
hypervisor
- If the previous call returns dump complete status, the
next read returns 0 like EOF.
ret = ioctl(PAPR_PLATFORM_DUMP_IOC_INVALIDATE, &dump_id)
- RTAS call with NULL buffer to invalidates the dump.
The read API should use the file descriptor obtained from ioctl
based on dump ID so that gets dump contents for the corresponding
dump ID. Implemented support in librtas (rtas_platform_dump()) for
this new ABI to support system lockdown.
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Tested-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416225743.596462-7-haren@linux.ibm.com
Haren Myneni [Wed, 16 Apr 2025 22:57:40 +0000 (15:57 -0700)]
powerpc/pseries: Add ibm,get-dynamic-sensor-state RTAS call support
The RTAS call ibm,get-dynamic-sensor-state is used to get the
sensor state identified by the location code and the sensor
token. The librtas library provides an API
rtas_get_dynamic_sensor() which uses /dev/mem access for work
area allocation but is restricted under system lockdown.
This patch provides an interface with new ioctl
PAPR_DYNAMIC_SENSOR_IOC_GET to the papr-indices character
driver which executes this HCALL and copies the sensor state
in the user specified ioctl buffer.
Refer PAPR 7.3.19 ibm,get-dynamic-sensor-state for more
information on this RTAS call.
- User input parameters to the RTAS call: location code string
and the sensor token
Expose these interfaces to user space with a /dev/papr-indices
character device using the following programming model:
int fd = open("/dev/papr-indices", O_RDWR);
int ret = ioctl(fd, PAPR_DYNAMIC_SENSOR_IOC_GET,
struct papr_indices_io_block)
- The user space specifies input parameters in
papr_indices_io_block struct
- Returned state for the specified sensor is copied to
papr_indices_io_block.dynamic_param.state
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Tested-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416225743.596462-6-haren@linux.ibm.com
Haren Myneni [Wed, 16 Apr 2025 22:57:39 +0000 (15:57 -0700)]
powerpc/pseries: Add ibm,set-dynamic-indicator RTAS call support
The RTAS call ibm,set-dynamic-indicator is used to set the new
indicator state identified by a location code. The current
implementation uses rtas_set_dynamic_indicator() API provided by
librtas library which allocates RMO buffer and issue this RTAS
call in the user space. But /dev/mem access by the user space
is prohibited under system lockdown.
This patch provides an interface with new ioctl
PAPR_DYNAMIC_INDICATOR_IOC_SET to the papr-indices character
driver and expose this interface to the user space that is
compatible with lockdown.
Refer PAPR 7.3.18 ibm,set-dynamic-indicator for more
information on this RTAS call.
- User input parameters to the RTAS call: location code
string, indicator token and new state
Expose these interfaces to user space with a /dev/papr-indices
character device using the following programming model:
int fd = open("/dev/papr-indices", O_RDWR);
int ret = ioctl(fd, PAPR_DYNAMIC_INDICATOR_IOC_SET,
struct papr_indices_io_block)
- The user space passes input parameters in papr_indices_io_block
struct
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Tested-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416225743.596462-5-haren@linux.ibm.com
Haren Myneni [Wed, 16 Apr 2025 22:57:38 +0000 (15:57 -0700)]
powerpc/pseries: Add papr-indices char driver for ibm,get-indices
The RTAS call ibm,get-indices is used to obtain indices and
location codes for a specified indicator or sensor token. The
current implementation uses rtas_get_indices() API provided by
librtas library which allocates RMO buffer and issue this RTAS
call in the user space. But writable mapping /dev/mem access by
the user space is prohibited under system lockdown.
To overcome the restricted access in the user space, the kernel
provide interfaces to collect indices data from the hypervisor.
This patch adds papr-indices character driver and expose standard
interfaces such as open / ioctl/ read to user space in ways that
are compatible with lockdown.
PAPR (2.13 7.3.17 ibm,get-indices RTAS Call) describes the
following steps to retrieve all indices data:
- User input parameters to the RTAS call: sensor or indicator,
and indice type
- ibm,get-indices is sequence RTAS call which means has to issue
multiple times to get the entire list of indicators or sensors
of a particular type. The hypervisor expects the first RTAS call
with the sequence 1 and the subsequent calls with the sequence
number returned from the previous calls.
- The OS may not interleave calls to ibm,get-indices for different
indicator or sensor types. Means other RTAS calls with different
type should not be issued while the previous type sequence is in
progress. So collect the entire list of indices and copied to
buffer BLOB during ioctl() and expose this buffer to the user
space with the file descriptor.
- The hypervisor fills the work area with a specific format but
does not return the number of bytes written to the buffer.
Instead of parsing the data for each call to determine the data
length, copy the work area size (RTAS_GET_INDICES_BUF_SIZE) to
the buffer. Return work-area size of data to the user space for
each read() call.
Expose these interfaces to user space with a /dev/papr-indices
character device using the following programming model:
int devfd = open("/dev/papr-indices", O_RDONLY);
int fd = ioctl(devfd, PAPR_INDICES_IOC_GET,
struct papr_indices_io_block)
- Collect all indices data for the specified token to the buffer
char *buf = malloc(RTAS_GET_INDICES_BUF_SIZE);
length = read(fd, buf, RTAS_GET_INDICES_BUF_SIZE)
- RTAS_GET_INDICES_BUF_SIZE of data is returned to the user
space.
- The user space retrieves the indices and their location codes
from the buffer
- Should issue multiple read() calls until reaches the end of
BLOB buffer.
The read() should use the file descriptor obtained from ioctl to
get the data that is exposed to file descriptor. Implemented
support in librtas (rtas_get_indices()) for this new ABI for
system lockdown.
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Tested-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416225743.596462-4-haren@linux.ibm.com
Haren Myneni [Wed, 16 Apr 2025 22:57:37 +0000 (15:57 -0700)]
powerpc/pseries: Define papr_indices_io_block for papr-indices ioctls
To issue ibm,get-indices, ibm,set-dynamic-indicator and
ibm,get-dynamic-sensor-state in the user space, the RMO buffer is
allocated for the work area which is restricted under system
lockdown. So instead of user space execution, the kernel will
provide /dev/papr-indices interface to execute these RTAS calls.
The user space assigns data in papr_indices_io_block struct
depends on the specific HCALL and passes to the following ioctls:
PAPR_INDICES_IOC_GET: Use for ibm,get-indices. Returns a
get-indices handle fd to read data.
PAPR_DYNAMIC_SENSOR_IOC_GET: Use for ibm,get-dynamic-sensor-state.
Updates the sensor state in
papr_indices_io_block.dynamic_param.state
PAPR_DYNAMIC_INDICATOR_IOC_SET: Use for ibm,set-dynamic-indicator.
Sets the new state for the input
indicator.
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Tested-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416225743.596462-3-haren@linux.ibm.com
Haren Myneni [Wed, 16 Apr 2025 22:57:36 +0000 (15:57 -0700)]
powerpc/pseries: Define common functions for RTAS sequence calls
The RTAS call can be normal where retrieves the data form the
hypervisor once or sequence based RTAS call which has to
issue multiple times until the complete data is obtained. For
some of these sequence RTAS calls, the OS should not interleave
calls with different input until the sequence is completed.
The data is collected for each call and copy to the buffer
for the entire sequence during ioctl() handle and then expose
this buffer to the user space with read() handle.
One such sequence RTAS call is ibm,get-vpd and its support is
already included in the current code. To add the similar support
for other sequence based calls, move the common functions in to
separate file and update papr_rtas_sequence struct with the
following callbacks so that RTAS call specific code will be
defined and executed to complete the sequence.
struct papr_rtas_sequence {
int error;
void params;
void (*begin) (struct papr_rtas_sequence *);
void (*end) (struct papr_rtas_sequence *);
const char * (*work) (struct papr_rtas_sequence *, size_t *);
};
params: Input parameters used to pass for RTAS call.
Begin: RTAS call specific function to initialize data
including work area allocation.
End: RTAS call specific function to free up resources
(free work area) after the sequence is completed.
Work: The actual RTAS call specific function which collects
the data from the hypervisor.
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Tested-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416225743.596462-2-haren@linux.ibm.com
Shrikanth Hegde [Mon, 10 Feb 2025 18:43:34 +0000 (00:13 +0530)]
powerpc: enable dynamic preemption
Once the lazy preemption is supported, it would be desirable to change
the preemption models at runtime. So add support for dynamic preemption
using DYNAMIC_KEY.
::Tested lightly on Power10 LPAR
Performance numbers indicate that, preempt=none(no dynamic) and
preempt=none(dynamic) are close.
cat /sys/kernel/debug/sched/preempt
(none) voluntary full lazy
perf stat -e probe:__cond_resched -a sleep 1
Performance counter stats for 'system wide':
1,253 probe:__cond_resched
echo full > /sys/kernel/debug/sched/preempt
cat /sys/kernel/debug/sched/preempt
none voluntary (full) lazy
perf stat -e probe:__cond_resched -a sleep 1
Performance counter stats for 'system wide':
0 probe:__cond_resched
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250210184334.567383-2-sshegde@linux.ibm.com
Thorsten Blum [Sun, 9 Feb 2025 08:17:03 +0000 (09:17 +0100)]
fadump: Use str_yes_no() helper in fadump_show_config()
Remove hard-coded strings by using the str_yes_no() helper function.
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250209081704.2758-2-thorsten.blum@linux.dev
Thorsten Blum [Fri, 11 Apr 2025 08:42:21 +0000 (10:42 +0200)]
KVM: powerpc: Enable commented out BUILD_BUG_ON() assertion
The BUILD_BUG_ON() assertion was commented out in commit
38634e676992
("powerpc/kvm: Remove problematic BUILD_BUG_ON statement") and fixed in
commit
c0a187e12d48 ("KVM: powerpc: Fix BUILD_BUG_ON condition"), but
not enabled. Enable it now that this no longer breaks and remove the
comment.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250411084222.6916-1-thorsten.blum@linux.dev
Thorsten Blum [Wed, 19 Feb 2025 11:20:52 +0000 (12:20 +0100)]
powerpc: mpic: Use str_enabled_disabled() helper function
Remove hard-coded strings by using the str_enabled_disabled() helper
function.
Use pr_debug() instead of printk(KERN_DEBUG) to silence a checkpatch
warning.
Reviewed-by: Ricardo B. Marlière <ricardo@marliere.net>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250219112053.3352-2-thorsten.blum@linux.dev
Thorsten Blum [Wed, 19 Feb 2025 11:14:45 +0000 (12:14 +0100)]
powerpc/ps3: Use str_write_read() in ps3_notification_read_write()
Remove hard-coded strings by using the str_write_read() helper function.
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250219111445.2875-2-thorsten.blum@linux.dev
Vaibhav Jain [Wed, 16 Apr 2025 16:27:36 +0000 (21:57 +0530)]
powerpc/kvm-hv-pmu: Add perf-events for Hostwide counters
Update 'kvm-hv-pmu.c' to add five new perf-events mapped to the five
Hostwide counters. Since these newly introduced perf events are at system
wide scope and can be read from any L1-Lpar CPU, 'kvmppc_pmu' scope and
capabilities are updated appropriately.
Also introduce two new helpers. First is kvmppc_update_l0_stats() that uses
the infrastructure introduced in previous patches to issues the
H_GUEST_GET_STATE hcall L0-PowerVM to fetch guest-state-buffer holding the
latest values of these counters which is then parsed and 'l0_stats'
variable updated.
Second helper is kvmppc_pmu_event_update() which is called from
'kvmppv_pmu' callbacks and uses kvmppc_update_l0_stats() to update
'l0_stats' and the update the 'struct perf_event's event-counter.
Some minor updates to kvmppc_pmu_{add, del, read}() to remove some debug
scaffolding code.
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Athira Rajeev <atrajeev@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416162740.93143-7-vaibhav@linux.ibm.com
Vaibhav Jain [Wed, 16 Apr 2025 16:27:35 +0000 (21:57 +0530)]
powerpc/kvm-hv-pmu: Implement GSB message-ops for hostwide counters
Implement and setup necessary structures to send a prepolulated
Guest-State-Buffer(GSB) requesting hostwide counters to L0-PowerVM and have
the returned GSB holding the values of these counters parsed. This is done
via existing GSB implementation and with the newly added support of
Hostwide elements in GSB.
The request to L0-PowerVM to return Hostwide counters is done using a
pre-allocated GSB named 'gsb_l0_stats'. To be able to populate this GSB
with the needed Guest-State-Elements (GSIDs) a instance of 'struct
kvmppc_gs_msg' named 'gsm_l0_stats' is introduced. The 'gsm_l0_stats' is
tied to an instance of 'struct kvmppc_gs_msg_ops' named 'gsb_ops_l0_stats'
which holds various callbacks to be compute the size ( hostwide_get_size()
), populate the GSB ( hostwide_fill_info() ) and
refresh ( hostwide_refresh_info() ) the contents of
'l0_stats' that holds the Hostwide counters returned from L0-PowerVM.
To protect these structures from simultaneous access a spinlock
'lock_l0_stats' has been introduced. The allocation and initialization of
the above structures is done in newly introduced kvmppc_init_hostwide() and
similarly the cleanup is performed in newly introduced
kvmppc_cleanup_hostwide().
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416162740.93143-6-vaibhav@linux.ibm.com
Vaibhav Jain [Wed, 16 Apr 2025 16:27:34 +0000 (21:57 +0530)]
kvm powerpc/book3s-apiv2: Introduce kvm-hv specific PMU
Introduce a new PMU named 'kvm-hv' inside a new module named 'kvm-hv-pmu'
to report Book3s kvm-hv specific performance counters. This will expose
KVM-HV specific performance attributes to user-space via kernel's PMU
infrastructure and would enableusers to monitor active kvm-hv based guests.
The patch creates necessary scaffolding to for the new PMU callbacks and
introduces the new kernel module name 'kvm-hv-pmu' which is built with
CONFIG_KVM_BOOK3S_HV_PMU. The patch doesn't introduce any perf-events yet,
which will be introduced in later patches
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416162740.93143-5-vaibhav@linux.ibm.com
Vaibhav Jain [Wed, 16 Apr 2025 16:27:33 +0000 (21:57 +0530)]
kvm powerpc/book3s-apiv2: Add kunit tests for Hostwide GSB elements
Update 'test-guest-state-buffer.c' to add two new KUNIT test cases for
validating correctness of changes to Guest-state-buffer management
infrastructure for adding support for Hostwide GSB elements.
The newly introduced test test_gs_hostwide_msg() checks if the Hostwide
elements can be set and parsed from a Guest-state-buffer. The second kunit
test test_gs_hostwide_counters() checks if the Hostwide GSB elements can be
send to the L0-PowerVM hypervisor via the H_GUEST_SET_STATE hcall and
ensures that the returned guest-state-buffer has all the 5 Hostwide stat
counters present.
Below is the KATP test report with the newly added KUNIT tests:
KTAP version 1
# Subtest: guest_state_buffer_test
# module: test_guest_state_buffer
1..7
ok 1 test_creating_buffer
ok 2 test_adding_element
ok 3 test_gs_bitmap
ok 4 test_gs_parsing
ok 5 test_gs_msg
ok 6 test_gs_hostwide_msg
# test_gs_hostwide_counters: Guest Heap Size=0 bytes
# test_gs_hostwide_counters: Guest Heap Size Max=
10995367936 bytes
# test_gs_hostwide_counters: Guest Page-table Size=
2178304 bytes
# test_gs_hostwide_counters: Guest Page-table Size Max=
2147483648 bytes
# test_gs_hostwide_counters: Guest Page-table Reclaim Size=0 bytes
ok 7 test_gs_hostwide_counters
# guest_state_buffer_test: pass:7 fail:0 skip:0 total:7
# Totals: pass:7 fail:0 skip:0 total:7
ok 1 guest_state_buffer_test
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416162740.93143-4-vaibhav@linux.ibm.com
Vaibhav Jain [Wed, 16 Apr 2025 16:27:32 +0000 (21:57 +0530)]
kvm powerpc/book3s-apiv2: Add support for Hostwide GSB elements
Add support for adding and parsing Hostwide elements to the
Guest-state-buffer data structure used in apiv2. These elements are used to
share meta-information pertaining to entire L1-Lpar and this
meta-information is maintained by L0-PowerVM hypervisor. Example of this
include the amount of the page-table memory currently used by L0-PowerVM
for hosting the Shadow-Pagetable of all active L2-Guests. More of the are
documented in kernel-documentation at [1]. The Hostwide GSB elements are
currently only support with H_GUEST_SET_STATE hcall with a special flag
namely 'KVMPPC_GS_FLAGS_HOST_WIDE'.
The patch introduces new defs for the 5 new Hostwide GSB elements including
their GSIDs as well as introduces a new class of GSB elements namely
'KVMPPC_GS_CLASS_HOSTWIDE' to indicate to GSB construction/parsing
infrastructure in 'kvm/guest-state-buffer.c'. Also
gs_msg_ops_vcpu_get_size(), kvmppc_gsid_type() and
kvmppc_gse_{flatten,unflatten}_iden() are updated to appropriately indicate
the needed size for these Hostwide GSB elements as well as how to
flatten/unflatten their GSIDs so that they can be marked as available in
GSB bitmap.
[1] Documention/arch/powerpc/kvm-nested.rst
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416162740.93143-3-vaibhav@linux.ibm.com
Vaibhav Jain [Wed, 16 Apr 2025 16:27:31 +0000 (21:57 +0530)]
powerpc: Document APIv2 KVM hcall spec for Hostwide counters
Update kvm-nested APIv2 documentation to include five new
Guest-State-Elements to fetch the hostwide counters. These counters are
per L1-Lpar and indicate the amount of Heap/Page-table memory allocated,
available and Page-table memory reclaimed for all L2-Guests active
instances
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250416162740.93143-2-vaibhav@linux.ibm.com
Linus Torvalds [Sun, 13 Apr 2025 18:54:49 +0000 (11:54 -0700)]
Linux 6.15-rc2
Linus Torvalds [Sun, 13 Apr 2025 17:52:04 +0000 (10:52 -0700)]
Merge tag 'erofs-for-6.15-rc2-fixes' of git://git./linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
- Properly handle errors when file-backed I/O fails
- Fix compilation issues on ARM platform (arm-linux-gnueabi)
- Fix parsing of encoded extents
- Minor cleanup
* tag 'erofs-for-6.15-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: remove duplicate code
erofs: fix encoded extents handling
erofs: add __packed annotation to union(__le16..)
erofs: set error to bio if file-backed IO fails
Linus Torvalds [Sun, 13 Apr 2025 14:15:50 +0000 (07:15 -0700)]
Merge tag 'ext4_for_linus-6.15-rc2' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"A few more miscellaneous ext4 bug fixes and cleanups including some
syzbot failures and fixing a stale file handing refeencing an inode
previously used as a regular file, but which has been deleted and
reused as an ea_inode would result in ext4 erroneously considering
this a case of fs corruption"
* tag 'ext4_for_linus-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix off-by-one error in do_split
ext4: make block validity check resistent to sb bh corruption
ext4: avoid -Wflex-array-member-not-at-end warning
Documentation: ext4: Add fields to ext4_super_block documentation
ext4: don't treat fhandle lookup of ea_inode as FS corruption
Linus Torvalds [Sun, 13 Apr 2025 14:11:33 +0000 (07:11 -0700)]
Merge tag 'fixes-2025-04-13' of git://git./linux/kernel/git/rppt/memblock
Pull memblock fix from Mike Rapoport:
"Fix build of memblock test.
Add missing stubs for mutex and free_reserved_area() to memblock
tests"
* tag 'fixes-2025-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock tests: Fix mutex related build error
Artem Sadovnikov [Fri, 4 Apr 2025 08:28:05 +0000 (08:28 +0000)]
ext4: fix off-by-one error in do_split
Syzkaller detected a use-after-free issue in ext4_insert_dentry that was
caused by out-of-bounds access due to incorrect splitting in do_split.
BUG: KASAN: use-after-free in ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109
Write of size 251 at addr
ffff888074572f14 by task syz-executor335/5847
CPU: 0 UID: 0 PID: 5847 Comm: syz-executor335 Not tainted
6.12.0-rc6-syzkaller-00318-ga9cda7c0ffed #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
__asan_memcpy+0x40/0x70 mm/kasan/shadow.c:106
ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109
add_dirent_to_buf+0x3d9/0x750 fs/ext4/namei.c:2154
make_indexed_dir+0xf98/0x1600 fs/ext4/namei.c:2351
ext4_add_entry+0x222a/0x25d0 fs/ext4/namei.c:2455
ext4_add_nondir+0x8d/0x290 fs/ext4/namei.c:2796
ext4_symlink+0x920/0xb50 fs/ext4/namei.c:3431
vfs_symlink+0x137/0x2e0 fs/namei.c:4615
do_symlinkat+0x222/0x3a0 fs/namei.c:4641
__do_sys_symlink fs/namei.c:4662 [inline]
__se_sys_symlink fs/namei.c:4660 [inline]
__x64_sys_symlink+0x7a/0x90 fs/namei.c:4660
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
The following loop is located right above 'if' statement.
for (i = count-1; i >= 0; i--) {
/* is more than half of this entry in 2nd half of the block? */
if (size + map[i].size/2 > blocksize/2)
break;
size += map[i].size;
move++;
}
'i' in this case could go down to -1, in which case sum of active entries
wouldn't exceed half the block size, but previous behaviour would also do
split in half if sum would exceed at the very last block, which in case of
having too many long name files in a single block could lead to
out-of-bounds access and following use-after-free.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Cc: stable@vger.kernel.org
Fixes:
5872331b3d91 ("ext4: fix potential negative array index in do_split()")
Signed-off-by: Artem Sadovnikov <a.sadovnikov@ispras.ru>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20250404082804.2567-3-a.sadovnikov@ispras.ru
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ojaswin Mujoo [Fri, 28 Mar 2025 06:24:52 +0000 (11:54 +0530)]
ext4: make block validity check resistent to sb bh corruption
Block validity checks need to be skipped in case they are called
for journal blocks since they are part of system's protected
zone.
Currently, this is done by checking inode->ino against
sbi->s_es->s_journal_inum, which is a direct read from the ext4 sb
buffer head. If someone modifies this underneath us then the
s_journal_inum field might get corrupted. To prevent against this,
change the check to directly compare the inode with journal->j_inode.
**Slight change in behavior**: During journal init path,
check_block_validity etc might be called for journal inode when
sbi->s_journal is not set yet. In this case we now proceed with
ext4_inode_block_valid() instead of returning early. Since systems zones
have not been set yet, it is okay to proceed so we can perform basic
checks on the blocks.
Suggested-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://patch.msgid.link/0c06bc9ebfcd6ccfed84a36e79147bf45ff5adc1.1743142920.git.ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Gustavo A. R. Silva [Wed, 26 Mar 2025 22:55:51 +0000 (16:55 -0600)]
ext4: avoid -Wflex-array-member-not-at-end warning
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.
Use the `DEFINE_RAW_FLEX()` helper for an on-stack definition of
a flexible structure where the size of the flexible-array member
is known at compile-time, and refactor the rest of the code,
accordingly.
So, with these changes, fix the following warning:
fs/ext4/mballoc.c:3041:40: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/Z-SF97N3AxcIMlSi@kspp
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Tom Vierjahn [Mon, 24 Mar 2025 22:09:30 +0000 (23:09 +0100)]
Documentation: ext4: Add fields to ext4_super_block documentation
Documentation and implementation of the ext4 super block have
slightly diverged: Padding has been removed in order to make room for
new fields that are still missing in the documentation.
Add the new fields s_encryption_level, s_first_error_errorcode,
s_last_error_errorcode to the documentation of the ext4 super block.
Fixes:
f542fbe8d5e8 ("ext4 crypto: reserve codepoints used by the ext4 encryption feature")
Fixes:
878520ac45f9 ("ext4: save the error code which triggered an ext4_error() in the superblock")
Signed-off-by: Tom Vierjahn <tom.vierjahn@acm.org>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://patch.msgid.link/20250324221004.5268-1-tom.vierjahn@acm.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Linus Torvalds [Sat, 12 Apr 2025 22:37:40 +0000 (15:37 -0700)]
Merge tag 'trace-v6.15-rc1' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Hide get_vm_area() from MMUless builds
The function get_vm_area() is not defined when CONFIG_MMU is not
defined. Hide that function within #ifdef CONFIG_MMU.
- Fix output of synthetic events when they have dynamic strings
The print fmt of the synthetic event's format file use to have "%.*s"
for dynamic size strings even though the user space exported
arguments had only __get_str() macro that provided just a nul
terminated string. This was fixed so that user space could parse this
properly.
But the reason that it had "%.*s" was because internally it provided
the maximum size of the string as one of the arguments. The fix that
replaced "%.*s" with "%s" caused the trace output (when the kernel
reads the event) to write "(efault)" as it would now read the length
of the string as "%s".
As the string provided is always nul terminated, there's no reason
for the internal code to use "%.*s" anyway. Just remove the length
argument to match the "%s" that is now in the format.
- Fix the ftrace subops hash logic of the manager ops hash
The function_graph uses the ftrace subops code. The subops code is a
way to have a single ftrace_ops registered with ftrace to determine
what functions will call the ftrace_ops callback. More than one user
of function graph can register a ftrace_ops with it. The function
graph infrastructure will then add this ftrace_ops as a subops with
the main ftrace_ops it registers with ftrace. This is because the
functions will always call the function graph callback which in turn
calls the subops ftrace_ops callbacks.
The main ftrace_ops must add a callback to all the functions that the
subops want a callback from. When a subops is registered, it will
update the main ftrace_ops hash to include the functions it wants.
This is the logic that was broken.
The ftrace_ops hash has a "filter_hash" and a "notrace_hash" where
all the functions in the filter_hash but not in the notrace_hash are
attached by ftrace. The original logic would have the main ftrace_ops
filter_hash be a union of all the subops filter_hashes and the main
notrace_hash would be a intersect of all the subops filter hashes.
But this was incorrect because the notrace hash depends on the
filter_hash it is associated to and not the union of all
filter_hashes.
Instead, when a subops is added, just include all the functions of
the subops hash that are in its filter_hash but not in its
notrace_hash. The main subops hash should not use its notrace hash,
unless all of its subops hashes have an empty filter_hash (which
means to attach to all functions), and then, and only then, the main
ftrace_ops notrace hash can be the intersect of all the subops
hashes.
This not only fixes the bug, but also simplifies the code.
- Add a selftest to better test the subops filtering
Add a selftest that would catch the bug fixed by the above change.
- Fix extra newline printed in function tracing with retval
The function parameter code changed the output logic slightly and
called print_graph_retval() and also printed a newline. The
print_graph_retval() also prints a newline which caused blank lines
to be printed in the function graph tracer when retval was added.
This caused one of the selftests to fail if retvals were enabled.
Instead remove the new line output from print_graph_retval() and have
the callers always print the new line so that it doesn't have to do
special logic if it calls print_graph_retval() or not.
- Fix out-of-bound memory access in the runtime verifier
When rv_is_container_monitor() is called on the last entry on the
link list it references the next entry, which is the list head and
causes an out-of-bound memory access.
* tag 'trace-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rv: Fix out-of-bound memory access in rv_is_container_monitor()
ftrace: Do not have print_graph_retval() add a newline
tracing/selftest: Add test to better test subops filtering of function graph
ftrace: Fix accounting of subop hashes
ftrace: Properly merge notrace hashes
tracing: Do not add length to print format in synthetic events
tracing: Hide get_vm_area() from MMUless builds
Linus Torvalds [Sat, 12 Apr 2025 19:48:10 +0000 (12:48 -0700)]
Merge tag 'bpf-fixes' of git://git./linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
- Followup fixes for resilient spinlock (Kumar Kartikeya Dwivedi):
- Make res_spin_lock test less verbose, since it was spamming BPF
CI on failure, and make the check for AA deadlock stronger
- Fix rebasing mistake and use architecture provided
res_smp_cond_load_acquire
- Convert BPF maps (queue_stack and ringbuf) to resilient spinlock
to address long standing syzbot reports
- Make sure that classic BPF load instruction from SKF_[NET|LL]_OFF
offsets works when skb is fragmeneted (Willem de Bruijn)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Convert ringbuf map to rqspinlock
bpf: Convert queue_stack map to rqspinlock
bpf: Use architecture provided res_smp_cond_load_acquire
selftests/bpf: Make res_spin_lock AA test condition stronger
selftests/net: test sk_filter support for SKF_NET_OFF on frags
bpf: support SKF_NET_OFF and SKF_LL_OFF on skb frags
selftests/bpf: Make res_spin_lock test less verbose
Nam Cao [Fri, 11 Apr 2025 07:37:17 +0000 (09:37 +0200)]
rv: Fix out-of-bound memory access in rv_is_container_monitor()
When rv_is_container_monitor() is called on the last monitor in
rv_monitors_list, KASAN yells:
BUG: KASAN: global-out-of-bounds in rv_is_container_monitor+0x101/0x110
Read of size 8 at addr
ffffffff97c7c798 by task setup/221
The buggy address belongs to the variable:
rv_monitors_list+0x18/0x40
This is due to list_next_entry() is called on the last entry in the list.
It wraps around to the first list_head, and the first list_head is not
embedded in struct rv_monitor_def.
Fix it by checking if the monitor is last in the list.
Cc: stable@vger.kernel.org
Cc: Gabriele Monaco <gmonaco@redhat.com>
Fixes:
cb85c660fcd4 ("rv: Add option for nested monitors and include sched")
Link: https://lore.kernel.org/e85b5eeb7228bfc23b8d7d4ab5411472c54ae91b.1744355018.git.namcao@linutronix.de
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Fri, 11 Apr 2025 17:30:15 +0000 (13:30 -0400)]
ftrace: Do not have print_graph_retval() add a newline
The retval and retaddr options for function_graph tracer will add a
comment at the end of a function for both leaf and non leaf functions that
looks like:
__wake_up_common(); /* ret=0x1 */
} /* pick_next_task_fair ret=0x0 */
The function print_graph_retval() adds a newline after the "*/". But if
that's not called, the caller function needs to make sure there's a
newline added.
This is confusing and when the function parameters code was added, it
added a newline even when calling print_graph_retval() as the fact that
the print_graph_retval() function prints a newline isn't obvious.
This caused an extra newline to be printed and that made it fail the
selftests when the retval option was set, as the selftests were not
expecting blank lines being injected into the trace.
Instead of having print_graph_retval() print a newline, just have the
caller always print the newline regardless if it calls print_graph_retval()
or not. This not only fixes this bug, but it also simplifies the code.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20250411133015.015ca393@gandalf.local.home
Reported-by: Mark Brown <broonie@kernel.org>
Tested-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/all/
ccc40f2b-4b9e-4abd-8daf-
d22fce2a86f0@sirena.org.uk/
Fixes:
ff5c9c576e754 ("ftrace: Add support for function argument to graph tracer")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Sat, 12 Apr 2025 15:11:19 +0000 (08:11 -0700)]
Merge tag 'pwm/for-6.15-rc2-fixes' of git://git./linux/kernel/git/ukleinek/linux
Pull pwm fixes from Uwe Kleine-König:
"A set of fixes for pwm core and various drivers
The first three patches handle clk_get_rate() returning 0 (which might
happen for example if the CCF is disabled). The first of these was
found because this triggered a warning with clang, the two others by
looking for similar issues in other drivers.
The remaining three fixes address issues in the new waveform pwm API.
Now that I worked on this a bit more, the finer details and corner
cases are better understood and the code is fixed accordingly"
* tag 'pwm/for-6.15-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
pwm: axi-pwmgen: Let .round_waveform_tohw() signal when request was rounded up
pwm: stm32: Search an appropriate duty_cycle if period cannot be modified
pwm: Let pwm_set_waveform() succeed even if lowlevel driver rounded up
pwm: fsl-ftm: Handle clk_get_rate() returning 0
pwm: rcar: Improve register calculation
pwm: mediatek: Prevent divide-by-zero in pwm_mediatek_config()
Linus Torvalds [Fri, 11 Apr 2025 23:41:14 +0000 (16:41 -0700)]
Merge tag 'v6.15-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Fix multichannel decryption UAF
- Fix regression mounting to onedrive shares
- Fix missing mount option check for posix vs. noposix
- Fix version field in WSL symlinks
- Three minor cleanup to reparse point handling
- SMB1 fix for WSL special files
- SMB1 Kerberos fix
- Add SMB3 defines for two new FS attributes
* tag 'v6.15-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb3: Add defines for two new FileSystemAttributes
cifs: Fix querying of WSL CHR and BLK reparse points over SMB1
cifs: Split parse_reparse_point callback to functions: get buffer and parse buffer
cifs: Improve handling of name surrogate reparse points in reparse.c
cifs: Remove explicit handling of IO_REPARSE_TAG_MOUNT_POINT in inode.c
cifs: Fix encoding of SMB1 Session Setup Kerberos Request in non-UNICODE mode
smb: client: fix UAF in decryption with multichannel
cifs: Fix support for WSL-style symlinks
smb311 client: fix missing tcon check when mounting with linux/posix extensions
cifs: Ensure that all non-client-specific reparse points are processed by the server
Linus Torvalds [Fri, 11 Apr 2025 23:29:52 +0000 (16:29 -0700)]
Merge tag 'pci-v6.15-fixes-1' of git://git./linux/kernel/git/pci/pci
Pull pci fix from Bjorn Helgaas:
- Run quirk_huawei_pcie_sva() before arm_smmu_probe_device(), which
depends on the quirk, to avoid IOMMU initialization failures
(Zhangfei Gao)
* tag 'pci-v6.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: Run quirk_huawei_pcie_sva() before arm_smmu_probe_device()
Steven Rostedt [Wed, 9 Apr 2025 15:15:51 +0000 (11:15 -0400)]
tracing/selftest: Add test to better test subops filtering of function graph
A bug was discovered that showed the accounting of the subops of the
ftrace_ops filtering was incorrect. Add a new test to better test the
filtering.
This test creates two instances, where it will add various filters to both
the set_ftrace_filter and the set_ftrace_notrace files and enable
function_graph. Then it looks into the enabled_functions file to make sure
that the filters are behaving correctly.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Andy Chiu <andybnac@gmail.com>
Link: https://lore.kernel.org/20250409152720.380778379@goodmis.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Wed, 9 Apr 2025 15:15:50 +0000 (11:15 -0400)]
ftrace: Fix accounting of subop hashes
The function graph infrastructure uses ftrace to hook to functions. It has
a single ftrace_ops to manage all the users of function graph. Each
individual user (tracing, bpf, fprobes, etc) has its own ftrace_ops to
track the functions it will have its callback called from. These
ftrace_ops are "subops" to the main ftrace_ops of the function graph
infrastructure.
Each ftrace_ops has a filter_hash and a notrace_hash that is defined as:
Only trace functions that are in the filter_hash but not in the
notrace_hash.
If the filter_hash is empty, it means to trace all functions.
If the notrace_hash is empty, it means do not disable any function.
The function graph main ftrace_ops needs to be a superset containing all
the functions to be traced by all the subops it has. The algorithm to
perform this merge was incorrect.
When the first subops was added to the main ops, it simply made the main
ops a copy of the subops (same filter_hash and notrace_hash).
When a second ops was added, it joined the new subops filter_hash with the
main ops filter_hash as a union of the two sets. The intersect between the
new subops notrace_hash and the main ops notrace_hash was created as the
new notrace_hash of the main ops.
The issue here is that it would then start tracing functions than no
subops were tracing. For example if you had two subops that had:
subops 1:
filter_hash = '*sched*' # trace all functions with "sched" in it
notrace_hash = '*time*' # except do not trace functions with "time"
subops 2:
filter_hash = '*lock*' # trace all functions with "lock" in it
notrace_hash = '*clock*' # except do not trace functions with "clock"
The intersect of '*time*' functions with '*clock*' functions could be the
empty set. That means the main ops will be tracing all functions with
'*time*' and all "*clock*" in it!
Instead, modify the algorithm to be a bit simpler and correct.
First, when adding a new subops, even if it's the first one, do not add
the notrace_hash if the filter_hash is not empty. Instead, just add the
functions that are in the filter_hash of the subops but not in the
notrace_hash of the subops into the main ops filter_hash. There's no
reason to add anything to the main ops notrace_hash.
The notrace_hash of the main ops should only be non empty iff all subops
filter_hashes are empty (meaning to trace all functions) and all subops
notrace_hashes include the same functions.
That is, the main ops notrace_hash is empty if any subops filter_hash is
non empty.
The main ops notrace_hash only has content in it if all subops
filter_hashes are empty, and the content are only functions that intersect
all the subops notrace_hashes. If any subops notrace_hash is empty, then
so is the main ops notrace_hash.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Andy Chiu <andybnac@gmail.com>
Link: https://lore.kernel.org/20250409152720.216356767@goodmis.org
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Andy Chiu [Tue, 8 Apr 2025 16:02:57 +0000 (00:02 +0800)]
ftrace: Properly merge notrace hashes
The global notrace hash should be jointly decided by the intersection of
each subops's notrace hash, but not the filter hash.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250408160258.48563-1-andybnac@gmail.com
Fixes:
5fccc7552ccb ("ftrace: Add subops logic to allow one ops to manage many")
Signed-off-by: Andy Chiu <andybnac@gmail.com>
[ fixed removing of freeing of filter_hash ]
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Zhangfei Gao [Mon, 17 Mar 2025 01:13:52 +0000 (01:13 +0000)]
PCI: Run quirk_huawei_pcie_sva() before arm_smmu_probe_device()
quirk_huawei_pcie_sva() sets properties needed by arm_smmu_probe_device(),
but
bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path")
changed the iommu_probe_device() flow so arm_smmu_probe_device() is now
invoked before the quirk, leading to failures like this:
reg-dummy reg-dummy: late IOMMU probe at driver bind, something fishy here!
WARNING: CPU: 0 PID: 1 at drivers/iommu/iommu.c:449 __iommu_probe_device+0x140/0x570
RIP: 0010:__iommu_probe_device+0x140/0x570
The SR-IOV enumeration ordering changes like this:
pci_iov_add_virtfn
pci_device_add
pci_fixup_device(pci_fixup_header) <--
device_add
bus_notify
iommu_bus_notifier
+ iommu_probe_device
+ arm_smmu_probe_device
pci_bus_add_device
pci_fixup_device(pci_fixup_final) <--
device_attach
driver_probe_device
really_probe
pci_dma_configure
acpi_dma_configure_id
- iommu_probe_device
- arm_smmu_probe_device
The non-SR-IOV case is similar in that pci_device_add() is called from
pci_scan_single_device() in the generic enumeration path and
pci_bus_add_device() is called later, after all host bridges have been
enumerated.
Declare quirk_huawei_pcie_sva() as a header fixup to ensure that it happens
before arm_smmu_probe_device().
Fixes:
bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path")
Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Closes: https://lore.kernel.org/all/SJ1PR11MB61295DE21A1184AEE0786E25B9D22@SJ1PR11MB6129.namprd11.prod.outlook.com/
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
[bhelgaas: commit log, add failure info and reporter]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250317011352.5806-1-zhangfei.gao@linaro.org
Kumar Kartikeya Dwivedi [Fri, 11 Apr 2025 10:17:59 +0000 (03:17 -0700)]
bpf: Convert ringbuf map to rqspinlock
Convert the raw spinlock used by BPF ringbuf to rqspinlock. Currently,
we have an open syzbot report of a potential deadlock. In addition, the
ringbuf can fail to reserve spuriously under contention from NMI
context.
It is potentially attractive to enable unconstrained usage (incl. NMIs)
while ensuring no deadlocks manifest at runtime, perform the conversion
to rqspinlock to achieve this.
This change was benchmarked for BPF ringbuf's multi-producer contention
case on an Intel Sapphire Rapids server, with hyperthreading disabled
and performance governor turned on. 5 warm up runs were done for each
case before obtaining the results.
Before (raw_spinlock_t):
Ringbuf, multi-producer contention
==================================
rb-libbpf nr_prod 1 11.440 ± 0.019M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 2 2.706 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 3 3.130 ± 0.004M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 4 2.472 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 8 2.352 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 12 2.813 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 16 1.988 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 20 2.245 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 24 2.148 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 28 2.190 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 32 2.490 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 36 2.180 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 40 2.201 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 44 2.226 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 48 2.164 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-libbpf nr_prod 52 1.874 ± 0.001M/s (drops 0.000 ± 0.000M/s)
After (rqspinlock_t):
Ringbuf, multi-producer contention
==================================
rb-libbpf nr_prod 1 11.078 ± 0.019M/s (drops 0.000 ± 0.000M/s) (-3.16%)
rb-libbpf nr_prod 2 2.801 ± 0.014M/s (drops 0.000 ± 0.000M/s) (3.51%)
rb-libbpf nr_prod 3 3.454 ± 0.005M/s (drops 0.000 ± 0.000M/s) (10.35%)
rb-libbpf nr_prod 4 2.567 ± 0.002M/s (drops 0.000 ± 0.000M/s) (3.84%)
rb-libbpf nr_prod 8 2.468 ± 0.001M/s (drops 0.000 ± 0.000M/s) (4.93%)
rb-libbpf nr_prod 12 2.510 ± 0.001M/s (drops 0.000 ± 0.000M/s) (-10.77%)
rb-libbpf nr_prod 16 2.075 ± 0.001M/s (drops 0.000 ± 0.000M/s) (4.38%)
rb-libbpf nr_prod 20 2.640 ± 0.001M/s (drops 0.000 ± 0.000M/s) (17.59%)
rb-libbpf nr_prod 24 2.092 ± 0.001M/s (drops 0.000 ± 0.000M/s) (-2.61%)
rb-libbpf nr_prod 28 2.426 ± 0.005M/s (drops 0.000 ± 0.000M/s) (10.78%)
rb-libbpf nr_prod 32 2.331 ± 0.004M/s (drops 0.000 ± 0.000M/s) (-6.39%)
rb-libbpf nr_prod 36 2.306 ± 0.003M/s (drops 0.000 ± 0.000M/s) (5.78%)
rb-libbpf nr_prod 40 2.178 ± 0.002M/s (drops 0.000 ± 0.000M/s) (-1.04%)
rb-libbpf nr_prod 44 2.293 ± 0.001M/s (drops 0.000 ± 0.000M/s) (3.01%)
rb-libbpf nr_prod 48 2.022 ± 0.001M/s (drops 0.000 ± 0.000M/s) (-6.56%)
rb-libbpf nr_prod 52 1.809 ± 0.001M/s (drops 0.000 ± 0.000M/s) (-3.47%)
There's a fair amount of noise in the benchmark, with numbers on reruns
going up and down by 10%, so all changes are in the range of this
disturbance, and we see no major regressions.
Reported-by: syzbot+850aaf14624dc0c6d366@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/
0000000000004aa700061379547e@google.com
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250411101759.4061366-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Linus Torvalds [Fri, 11 Apr 2025 15:36:18 +0000 (08:36 -0700)]
Merge tag 'spi-fix-v6.15-rc1' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of cleanups for the error handling in the Freescale drivers"
* tag 'spi-fix-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: fsl-spi: Remove redundant probe error message
spi: fsl-qspi: Fix double cleanup in probe error path
Linus Torvalds [Fri, 11 Apr 2025 15:33:33 +0000 (08:33 -0700)]
Merge tag 'ata-6.15-rc2' of git://git./linux/kernel/git/libata/linux
Pull ata fixes from Damien Le Moal:
- Fix missing error checks during controller probe in the sata_sx4
driver (Wentao)
- Fix missing error checks during controller probe in the pata_pxa
driver (Henry)
* tag 'ata-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: sata_sx4: Add error handling in pdc20621_i2c_read()
ata: pata_pxa: Fix potential NULL pointer dereference in pxa_ata_probe()
Linus Torvalds [Fri, 11 Apr 2025 15:29:35 +0000 (08:29 -0700)]
Merge tag 'block-6.15-
20250411' of git://git.kernel.dk/linux
Pull more block fixes from Jens Axboe:
"Apparently my internal clock was off, or perhaps it was just wishful
thinking, but I sent out block fixes yesterday as my brain assumed it
was Friday. Subsequently, that missed the NVMe fixes that should go
into this weeks release as well. Hence, here's a followup with those,
and another simple fix.
- NVMe pull request via Christoph:
- nvmet fc/fcloop refcounting fixes (Daniel Wagner)
- fix missed namespace/ANA scans (Hannes Reinecke)
- fix a use after free in the new TCP netns support (Kuniyuki
Iwashima)
- fix a NULL instead of false review in multipath (Uday Shankar)
- Use strscpy() for null_blk disk name copy"
* tag 'block-6.15-
20250411' of git://git.kernel.dk/linux:
null_blk: Use strscpy() instead of strscpy_pad() in null_add_dev()
nvmet-fc: put ref when assoc->del_work is already scheduled
nvmet-fc: take tgtport reference only once
nvmet-fc: update tgtport ref per assoc
nvmet-fc: inline nvmet_fc_free_hostport
nvmet-fc: inline nvmet_fc_delete_assoc
nvmet-fcloop: add ref counting to lport
nvmet-fcloop: replace kref with refcount
nvmet-fcloop: swap list_add_tail arguments
nvme-tcp: fix use-after-free of netns by kernel TCP socket.
nvme: multipath: fix return value of nvme_available_path
nvme: re-read ANA log page after ns scan completes
nvme: requeue namespace scan on missed AENs
Linus Torvalds [Fri, 11 Apr 2025 15:24:46 +0000 (08:24 -0700)]
Merge tag 'iommu-fixes-v6.15-rc1' of git://git./linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
- Fix two crashes, one in core code and a NULL-ptr dereference in the
Mediatek IOMMU driver
- Dma_ops cleanup fix for core code
- Two fixes for Intel VT-d driver:
- Fix posted MSI issue when users change cpu affinity
- Remove invalid set_dma_ops() call in the iommu driver
- Warning fix for Tegra IOMMU driver
- Suspend/Resume fix for Exynos IOMMU driver
- Probe failure fix for Renesas IOMMU driver
- Cosmetic fix
* tag 'iommu-fixes-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu/tegra241-cmdqv: Fix warnings due to dmam_free_coherent()
iommu: remove unneeded semicolon
iommu/mediatek: Fix NULL pointer deference in mtk_iommu_device_group
iommu/exynos: Fix suspend/resume with IDENTITY domain
iommu/ipmmu-vmsa: Register in a sensible order
iommu: Clear iommu-dma ops on cleanup
iommu/vt-d: Remove an unnecessary call set_dma_ops()
iommu/vt-d: Wire up irq_ack() to irq_move_irq() for posted MSIs
iommu: Fix crash in report_iommu_fault()
Linus Torvalds [Fri, 11 Apr 2025 15:21:19 +0000 (08:21 -0700)]
Merge tag 'acpi-6.15-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix a recent regression in the ACPI button driver, add quirks
related to EC wakeups from suspend-to-idle and fix coding mistakes
related to the usage of sizeof() in the PPTT parser code:
Summary:
- Add suspend-to-idle EC wakeup quirks for Lenovo Go S (Mario
Limonciello)
- Prevent ACPI button from sending spurions KEY_POWER events to user
space in some cases after a recent update (Mario Limonciello)
- Compute the size of a structure instead of the size of a pointer in
two places in the PPTT parser code (Jean-Marc Eurin)"
* tag 'acpi-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI PPTT: Fix coding mistakes in a couple of sizeof() calls
ACPI: EC: Set ec_no_wakeup for Lenovo Go S
ACPI: button: Only send `KEY_POWER` for `ACPI_BUTTON_NOTIFY_STATUS`
Linus Torvalds [Fri, 11 Apr 2025 15:17:40 +0000 (08:17 -0700)]
Merge tag 's390-6.15-3' of git://git./linux/kernel/git/s390/linux
Pull s390 updates from Heiko Carstens:
"Note that besides two bug fixes this includes three commits for IBM
z17, which was announced this week.
- Add IBM z17 bits:
- Setup elf_platform for new machine types
- Allow to compile the kernel with z17 optimizations
- Add new performance counters
- Fix mismatch between indicator bits and queue indexes in virtio CCW code
- Fix double free in pmu setup error path"
* tag 's390-6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cpumf: Fix double free on error in cpumf_pmu_event_init()
s390/cpumf: Update CPU Measurement facility extended counter set support
s390: Allow to compile with z17 optimizations
s390: Add z17 elf platform
s390/virtio_ccw: Don't allocate/assign airqs for non-existing queues
Rafael J. Wysocki [Fri, 11 Apr 2025 13:50:15 +0000 (15:50 +0200)]
Merge branches 'acpi-ec' and 'acpi-button'
Merge updates of the ACPI EC and button drivers for 6.15-rc2:
- Add suspend-to-idle EC wakeup quirks for Lenovo Go S (Mario
Limonciello).
- Prevent ACPI button from sending spurions KEY_POWER events to user
space in some cases after a recent update (Mario Limonciello).
* acpi-ec:
ACPI: EC: Set ec_no_wakeup for Lenovo Go S
* acpi-button:
ACPI: button: Only send `KEY_POWER` for `ACPI_BUTTON_NOTIFY_STATUS`
Thorsten Blum [Thu, 10 Apr 2025 15:47:23 +0000 (17:47 +0200)]
null_blk: Use strscpy() instead of strscpy_pad() in null_add_dev()
blk_mq_alloc_disk() already zero-initializes the destination buffer,
making strscpy() sufficient for safely copying the disk's name. The
additional NUL-padding performed by strscpy_pad() is unnecessary.
If the destination buffer has a fixed length, strscpy() automatically
determines its size using sizeof() when the argument is omitted. This
makes the explicit size argument unnecessary.
The source string is also NUL-terminated and meets the __must_be_cstr()
requirement of strscpy().
No functional changes intended.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250410154727.883207-1-thorsten.blum@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Nicolin Chen [Mon, 7 Apr 2025 20:19:08 +0000 (13:19 -0700)]
iommu/tegra241-cmdqv: Fix warnings due to dmam_free_coherent()
Two WARNINGs are observed when SMMU driver rolls back upon failure:
arm-smmu-v3.9.auto: Failed to register iommu
arm-smmu-v3.9.auto: probe with driver arm-smmu-v3 failed with error -22
------------[ cut here ]------------
WARNING: CPU: 5 PID: 1 at kernel/dma/mapping.c:74 dmam_free_coherent+0xc0/0xd8
Call trace:
dmam_free_coherent+0xc0/0xd8 (P)
tegra241_vintf_free_lvcmdq+0x74/0x188
tegra241_cmdqv_remove_vintf+0x60/0x148
tegra241_cmdqv_remove+0x48/0xc8
arm_smmu_impl_remove+0x28/0x60
devm_action_release+0x1c/0x40
------------[ cut here ]------------
128 pages are still in use!
WARNING: CPU: 16 PID: 1 at mm/page_alloc.c:6902 free_contig_range+0x18c/0x1c8
Call trace:
free_contig_range+0x18c/0x1c8 (P)
cma_release+0x154/0x2f0
dma_free_contiguous+0x38/0xa0
dma_direct_free+0x10c/0x248
dma_free_attrs+0x100/0x290
dmam_free_coherent+0x78/0xd8
tegra241_vintf_free_lvcmdq+0x74/0x160
tegra241_cmdqv_remove+0x98/0x198
arm_smmu_impl_remove+0x28/0x60
devm_action_release+0x1c/0x40
This is because the LVCMDQ queue memory are managed by devres, while that
dmam_free_coherent() is called in the context of devm_action_release().
Jason pointed out that "arm_smmu_impl_probe() has mis-ordered the devres
callbacks if ops->device_remove() is going to be manually freeing things
that probe allocated":
https://lore.kernel.org/linux-iommu/
20250407174408.GB1722458@nvidia.com/
In fact, tegra241_cmdqv_init_structures() only allocates memory resources
which means any failure that it generates would be similar to -ENOMEM, so
there is no point in having that "falling back to standard SMMU" routine,
as the standard SMMU would likely fail to allocate memory too.
Remove the unwind part in tegra241_cmdqv_init_structures(), and return a
proper error code to ask SMMU driver to call tegra241_cmdqv_remove() via
impl_ops->device_remove(). Then, drop tegra241_vintf_free_lvcmdq() since
devres will take care of that.
Fixes:
483e0bd8883a ("iommu/tegra241-cmdqv: Do not allocate vcmdq until dma_set_mask_and_coherent")
Cc: stable@vger.kernel.org
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20250407201908.172225-1-nicolinc@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pei Xiao [Mon, 7 Apr 2025 01:53:28 +0000 (09:53 +0800)]
iommu: remove unneeded semicolon
cocci warnings:
drivers/iommu/dma-iommu.c:1788:2-3: Unneeded semicolon
so remove unneeded semicolon to fix cocci warnings.
Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/tencent_73EEE47E6ECCF538229C9B9E6A0272DA2B05@qq.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Louis-Alexis Eyraud [Thu, 3 Apr 2025 10:22:12 +0000 (12:22 +0200)]
iommu/mediatek: Fix NULL pointer deference in mtk_iommu_device_group
Currently, mtk_iommu calls during probe iommu_device_register before
the hw_list from driver data is initialized. Since iommu probing issue
fix, it leads to NULL pointer dereference in mtk_iommu_device_group when
hw_list is accessed with list_first_entry (not null safe).
So, change the call order to ensure iommu_device_register is called
after the driver data are initialized.
Fixes:
9e3a2a643653 ("iommu/mediatek: Adapt sharing and non-sharing pgtable case")
Fixes:
bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path")
Reviewed-by: Yong Wu <yong.wu@mediatek.com>
Tested-by: Chen-Yu Tsai <wenst@chromium.org> # MT8183 Juniper, MT8186 Tentacruel
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com>
Link: https://lore.kernel.org/r/20250403-fix-mtk-iommu-error-v2-1-fe8b18f8b0a8@collabora.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Marek Szyprowski [Tue, 1 Apr 2025 20:27:31 +0000 (22:27 +0200)]
iommu/exynos: Fix suspend/resume with IDENTITY domain
Commit
bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe
path") changed the sequence of probing the SYSMMU controller devices and
calls to arm_iommu_attach_device(), what results in resuming SYSMMU
controller earlier, when it is still set to IDENTITY mapping. Such change
revealed the bug in IDENTITY handling in the exynos-iommu driver. When
SYSMMU controller is set to IDENTITY mapping, data->domain is NULL, so
adjust checks in suspend & resume callbacks to handle this case
correctly.
Fixes:
b3d14960e629 ("iommu/exynos: Implement an IDENTITY domain")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20250401202731.2810474-1-m.szyprowski@samsung.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Robin Murphy [Thu, 20 Mar 2025 14:41:27 +0000 (14:41 +0000)]
iommu/ipmmu-vmsa: Register in a sensible order
IPMMU registers almost-initialised instances, but misses assigning the
drvdata to make them fully functional, so initial calls back into
ipmmu_probe_device() are likely to fail unnecessarily. Reorder this to
work as it should, also pruning the long-out-of-date comment and adding
the missing sysfs cleanup on error for good measure.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes:
bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/53be6667544de65a15415b699e38a9a965692e45.1742481687.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Robin Murphy [Thu, 10 Apr 2025 11:23:48 +0000 (12:23 +0100)]
iommu: Clear iommu-dma ops on cleanup
If iommu_device_register() encounters an error, it can end up tearing
down already-configured groups and default domains, however this
currently still leaves devices hooked up to iommu-dma (and even
historically the behaviour in this area was at best inconsistent across
architectures/drivers...) Although in the case that an IOMMU is present
whose driver has failed to probe, users cannot necessarily expect DMA to
work anyway, it's still arguable that we should do our best to put
things back as if the IOMMU driver was never there at all, and certainly
the potential for crashing in iommu-dma itself is undesirable. Make sure
we clean up the dev->dma_iommu flag along with everything else.
Reported-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Closes: https://lore.kernel.org/all/CAGXv+5HJpTYmQ2h-GD7GjyeYT7bL9EBCvu0mz5LgpzJZtzfW0w@mail.gmail.com/
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/e788aa927f6d827dd4ea1ed608fada79f2bab030.1744284228.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Petr Tesarik [Thu, 10 Apr 2025 07:32:47 +0000 (15:32 +0800)]
iommu/vt-d: Remove an unnecessary call set_dma_ops()
Do not touch per-device DMA ops when the driver has been converted to use
the dma-iommu API.
Fixes:
c588072bba6b ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Link: https://lore.kernel.org/r/20250403165605.278541-1-ptesarik@suse.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Sean Christopherson [Thu, 10 Apr 2025 07:32:46 +0000 (15:32 +0800)]
iommu/vt-d: Wire up irq_ack() to irq_move_irq() for posted MSIs
Set the posted MSI irq_chip's irq_ack() hook to irq_move_irq() instead of
a dummy/empty callback so that posted MSIs process pending changes to the
IRQ's SMP affinity. Failure to honor a pending set-affinity results in
userspace being unable to change the effective affinity of the IRQ, as
IRQD_SETAFFINITY_PENDING is never cleared and so irq_set_affinity_locked()
always defers moving the IRQ.
The issue is most easily reproducible by setting /proc/irq/xx/smp_affinity
multiple times in quick succession, as only the first update is likely to
be handled in process context.
Fixes:
ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs")
Cc: Robert Lippert <rlippert@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Wentao Yang <wentaoyang@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20250321194249.1217961-1-seanjc@google.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fedor Pchelkin [Tue, 8 Apr 2025 21:33:41 +0000 (00:33 +0300)]
iommu: Fix crash in report_iommu_fault()
The following crash is observed while handling an IOMMU fault with a
recent kernel:
kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle page fault for address:
ffff8c708299f700
PGD
19ee01067 P4D
19ee01067 PUD
101c10063 PMD
80000001028001e3
Oops: Oops: 0011 [#1] SMP NOPTI
CPU: 4 UID: 0 PID: 139 Comm: irq/25-AMD-Vi Not tainted 6.15.0-rc1+ #20 PREEMPT(lazy)
Hardware name: LENOVO 21D0/LNVNB161216, BIOS J6CN50WW 09/27/2024
RIP: 0010:0xffff8c708299f700
Call Trace:
<TASK>
? report_iommu_fault+0x78/0xd3
? amd_iommu_report_page_fault+0x91/0x150
? amd_iommu_int_thread+0x77/0x180
? __pfx_irq_thread_fn+0x10/0x10
? irq_thread_fn+0x23/0x60
? irq_thread+0xf9/0x1e0
? __pfx_irq_thread_dtor+0x10/0x10
? __pfx_irq_thread+0x10/0x10
? kthread+0xfc/0x240
? __pfx_kthread+0x10/0x10
? ret_from_fork+0x34/0x50
? __pfx_kthread+0x10/0x10
? ret_from_fork_asm+0x1a/0x30
</TASK>
report_iommu_fault() checks for an installed handler comparing the
corresponding field to NULL. It can (and could before) be called for a
domain with a different cookie type - IOMMU_COOKIE_DMA_IOVA, specifically.
Cookie is represented as a union so we may end up with a garbage value
treated there if this happens for a domain with another cookie type.
Formerly there were two exclusive cookie types in the union.
IOMMU_DOMAIN_SVA has a dedicated iommu_report_device_fault().
Call the fault handler only if the passed domain has a required cookie
type.
Found by Linux Verification Center (linuxtesting.org).
Fixes:
6aa63a4ec947 ("iommu: Sort out domain user data")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20250408213342.285955-1-pchelkin@ispras.ru
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Linus Torvalds [Fri, 11 Apr 2025 03:30:06 +0000 (20:30 -0700)]
Merge tag 'drm-fixes-2025-04-11-1' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly fixes, as expected it has a bit more in it than probably usual
for rc2. amdgpu/xe/i915 lead the way with fixes all over for a bunch
of other drivers. Nothing major stands out from what I can see.
tests:
- Clean up struct drm_display_mode in various places
i915:
- Fix scanline offset for LNL+ and BMG+
- Fix GVT unterminated-string-initialization build warning
- Fix DP rate limit when sink doesn't support TPS4
- Handle GDDR + ECC memory type detection
- Fix VRR parameter change check
- Fix fence not released on early probe errors
- Disable render power gating during live selftests
xe:
- Add another BMG PCI ID
- Fix UAFs on migration paths
- Fix shift-out-of-bounds access on TLB invalidation
- Ensure ccs_mode is correctly set on gt reset
- Extend some HW workarounds to Xe3
- Fix PM runtime get/put on sysfs files
- Fix u64 division on 32b
- Fix flickering due to missing L3 invalidations
- Fix missing error code return
amdgpu:
- MES FW version caching fixes
- Only use GTT as a fallback if we already have a backing store
- dma_buf fix
- IP discovery fix
- Replay and PSR with VRR fix
- DC FP fixes
- eDP fixes
- KIQ TLB invalidate fix
- Enable dmem groups support
- Allow pinning VRAM dma bufs if imports can do P2P
- Workload profile fixes
- Prevent possible division by 0 in fan handling
amdkfd:
- Queue reset fixes
imagination:
- Fix overflow
- Fix use-after-free
ivpu:
- Fix suspend/resume
nouveau:
- Do not deref dangling pointer
rockchip:
- Set DP/HDMI registers correctly
udmabuf:
- Fix overflow
virtgpu:
- Set reservation lock on dma-buf import
- Fix error handling in prepare_fb"
* tag 'drm-fixes-2025-04-11-1' of https://gitlab.freedesktop.org/drm/kernel: (58 commits)
drm/rockchip: dw_hdmi_qp: Fix io init for dw_hdmi_qp_rockchip_resume
drm/rockchip: vop2: Fix interface enable/mux setting of DP1 on rk3588
drm/amdgpu/mes12: optimize MES pipe FW version fetching
drm/amd/pm/smu11: Prevent division by zero
drm/amdgpu: cancel gfx idle work in device suspend for s0ix
drm/amd/display: pause the workload setting in dm
drm/amdgpu/pm/swsmu: implement pause workload profile
drm/amdgpu/pm: add workload profile pause helper
drm/i915/huc: Fix fence not released on early probe errors
drm/i915/vrr: Add vrr.vsync_{start, end} in vrr_params_changed
drm/tests: probe-helper: Fix drm_display_mode memory leak
drm/tests: modes: Fix drm_display_mode memory leak
drm/tests: modes: Fix drm_display_mode memory leak
drm/tests: cmdline: Fix drm_display_mode memory leak
drm/tests: modeset: Fix drm_display_mode memory leak
drm/tests: modeset: Fix drm_display_mode memory leak
drm/tests: helpers: Create kunit helper to destroy a drm_display_mode
drm/xe: Restore EIO errno return when GuC PC start fails
drm/xe: Invalidate L3 read-only cachelines for geometry streams too
drm/xe: avoid plain 64-bit division
...
Linus Torvalds [Fri, 11 Apr 2025 02:38:22 +0000 (19:38 -0700)]
Merge tag 'bcachefs-2025-04-10' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"Mostly minor fixes.
Eric Biggers' crypto API conversion is included because of long
standing sporadic crashes - mostly, but not entirely syzbot - in the
crypto API code when calling poly1305, which have been nigh impossible
to reproduce and debug.
His rework deletes the code where we've seen the crashes, so either
it'll be a fix or we'll end up with backtraces we can debug. (Thanks
Eric!)"
* tag 'bcachefs-2025-04-10' of git://evilpiepirate.org/bcachefs:
bcachefs: Use sort_nonatomic() instead of sort()
bcachefs: Remove unnecessary softdep on xxhash
bcachefs: use library APIs for ChaCha20 and Poly1305
bcachefs: Fix duplicate "ro,read_only" in opts at startup
bcachefs: Fix UAF in bchfs_read()
bcachefs: Use cpu_to_le16 for dirent lengths
bcachefs: Fix type for parameter in journal_advance_devs_to_next_bucket
bcachefs: Fix escape sequence in prt_printf
Dave Airlie [Thu, 10 Apr 2025 23:11:04 +0000 (09:11 +1000)]
Merge tag 'drm-xe-fixes-2025-04-10' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Add another BMG PCI ID
- Fix UAFs on migration paths
- Fix shift-out-of-bounds access on TLB invalidation
- Ensure ccs_mode is correctly set on gt reset
- Extend some HW workarounds to Xe3
- Fix PM runtime get/put on sysfs files
- Fix u64 division on 32b
- Fix flickering due to missing L3 invalidations
- Fix missing error code return
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/unq5j26aejbrjz5nuvmdtcgupyix5bacpoahod4bdohlvwrney@kekimsi5ossx
Dave Airlie [Thu, 10 Apr 2025 23:07:19 +0000 (09:07 +1000)]
Merge tag 'drm-misc-fixes-2025-04-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
imagination:
- Fix overflow
- Fix use-after-free
ivpu:
- Fix suspend/resume
nouveau:
- Do not deref dangling pointer
rockchip:
- Set DP/HDMI registers correctly
tests:
- Clean up struct drm_display_mode in various places
udmabuf:
- Fix overflow
virtgpu:
- Set reservation lock on dma-buf import
- Fix error handling in prepare_fb
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250410122414.GA32202@2a02-2454-fd5e-fd00-d686-8907-6053-f8d8.dyn6.pyur.net
Linus Torvalds [Thu, 10 Apr 2025 22:47:46 +0000 (15:47 -0700)]
Merge tag 'irq-urgent-2025-04-10' of git://git./linux/kernel/git/tip/tip
Pull misc irqchip fixes from Ingo Molnar:
- Fix NULL pointer dereference crashes due to missing .chip_flags setup
in the sg2042-msi and irq-bcm2712-mip irqchip drivers
- Remove the davinci aintc irqchip driver's leftover header too
* tag 'irq-urgent-2025-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-bcm2712-mip: Set EOI/ACK flags in msi_parent_ops
irqchip/sg2042-msi: Add missing chip flags
irqchip/davinci: Remove leftover header
Linus Torvalds [Thu, 10 Apr 2025 22:39:39 +0000 (15:39 -0700)]
Merge tag 'timers-urgent-2025-04-10' of git://git./linux/kernel/git/tip/tip
Pull misc timer fixes from Ingo Molnar:
- Fix missing ACCESS_PRIVATE() that triggered a Sparse warning
- Fix lockdep false positive in tick_freeze() on CONFIG_PREEMPT_RT=y
- Avoid <vdso/unaligned.h> macro's variable shadowing to address build
warning that triggers under W=2 builds
* tag 'timers-urgent-2025-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
vdso: Address variable shadowing in macros
timekeeping: Add a lockdep override in tick_freeze()
hrtimer: Add missing ACCESS_PRIVATE() for hrtimer::function
Linus Torvalds [Thu, 10 Apr 2025 22:20:10 +0000 (15:20 -0700)]
Merge tag 'x86-urgent-2025-04-10' of git://git./linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Fix CPU topology related regression that limited Xen PV guests to a
single CPU
- Fix ancient e820__register_nosave_regions() bugs that were causing
problems with kexec's artificial memory maps
- Fix an S4 hibernation crash caused by two missing ENDBR's that were
mistakenly removed in a recent commit
- Fix a resctrl serialization bug
- Fix early_printk documentation and comments
- Fix RSB bugs, combined with preparatory updates to better match the
code to vendor recommendations.
- Add RSB mitigation document
- Fix/update documentation
- Fix the erratum_1386_microcode[] table to be NULL terminated
* tag 'x86-urgent-2025-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ibt: Fix hibernate
x86/cpu: Avoid running off the end of an AMD erratum table
Documentation/x86: Zap the subsection letters
Documentation/x86: Update the naming of CPU features for /proc/cpuinfo
x86/bugs: Add RSB mitigation document
x86/bugs: Don't fill RSB on context switch with eIBRS
x86/bugs: Don't fill RSB on VMEXIT with eIBRS+retpoline
x86/bugs: Fix RSB clearing in indirect_branch_prediction_barrier()
x86/bugs: Use SBPB in write_ibpb() if applicable
x86/bugs: Rename entry_ibpb() to write_ibpb()
x86/early_printk: Use 'mmio32' for consistency, fix comments
x86/resctrl: Fix rdtgroup_mkdir()'s unlocked use of kernfs_node::name
x86/e820: Fix handling of subpage regions when calculating nosave ranges in e820__register_nosave_regions()
x86/acpi: Don't limit CPUs to 1 for Xen PV guests due to disabled ACPI
Linus Torvalds [Thu, 10 Apr 2025 21:47:36 +0000 (14:47 -0700)]
Merge tag 'perf-urgent-2025-04-10' of git://git./linux/kernel/git/tip/tip
Pull misc perf events fixes from Ingo Molnar:
- Fix __free_event() corner case splat
- Fix false-positive uprobes related lockdep splat on
CONFIG_PREEMPT_RT=y kernels
- Fix a complicated perf sigtrap race that may result in hangs
* tag 'perf-urgent-2025-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix hang while freeing sigtrap event
uprobes: Avoid false-positive lockdep splat on CONFIG_PREEMPT_RT=y in the ri_timer() uprobe timer callback, use raw_write_seqcount_*()
perf/core: Fix WARN_ON(!ctx) in __free_event() for partial init
Linus Torvalds [Thu, 10 Apr 2025 21:27:32 +0000 (14:27 -0700)]
Merge tag 'objtool-urgent-2025-04-10' of git://git./linux/kernel/git/tip/tip
Pull misc objtool fixes from Ingo Molnar:
- Remove the recently introduced ANNOTATE_IGNORE_ALTERNATIVE noise from
clac()/stac() code to make .s files more readable
- Fix INSN_SYSCALL / INSN_SYSRET semantics
- Fix various false-positive warnings
* tag 'objtool-urgent-2025-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix false-positive "ignoring unreachables" warning
objtool: Remove ANNOTATE_IGNORE_ALTERNATIVE from CLAC/STAC
objtool, xen: Fix INSN_SYSCALL / INSN_SYSRET semantics
objtool: Stop UNRET validation on UD2
objtool: Split INSN_CONTEXT_SWITCH into INSN_SYSCALL and INSN_SYSRET
objtool: Fix INSN_CONTEXT_SWITCH handling in validate_unret()
Josh Poimboeuf [Wed, 9 Apr 2025 22:49:36 +0000 (15:49 -0700)]
objtool: Fix false-positive "ignoring unreachables" warning
There's no need to try to automatically disable unreachable warnings if
they've already been manually disabled due to CONFIG_KCOV quirks.
This avoids a spurious warning with a KCOV kernel:
fs/smb/client/cifs_unicode.o: warning: objtool: cifsConvertToUTF16.part.0+0xce5: ignoring unreachables due to jump table quirk
Fixes:
eeff7ac61526 ("objtool: Warn when disabling unreachable warnings")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/5eb28eeb6a724b7d945a961cfdcf8d41e6edf3dc.1744238814.git.jpoimboe@kernel.org
Closes: https://lore.kernel.org/r/
202504090910.QkvTAR36-lkp@intel.com/
Kumar Kartikeya Dwivedi [Thu, 10 Apr 2025 15:31:42 +0000 (08:31 -0700)]
bpf: Convert queue_stack map to rqspinlock
Replace all usage of raw_spinlock_t in queue_stack_maps.c with
rqspinlock. This is a map type with a set of open syzbot reports
reproducing possible deadlocks. Prior attempt to fix the issues
was at [0], but was dropped in favor of this approach.
Make sure we return the -EBUSY error in case of possible deadlocks or
timeouts, just to make sure user space or BPF programs relying on the
error code to detect problems do not break.
With these changes, the map should be safe to access in any context,
including NMIs.
[0]: https://lore.kernel.org/all/
20240429165658.
1305969-1-sidchintamaneni@gmail.com
Reported-by: syzbot+8bdfc2c53fb2b63e1871@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/
0000000000004c3fc90615f37756@google.com
Reported-by: syzbot+252bc5c744d0bba917e1@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/
000000000000c80abd0616517df9@google.com
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250410153142.2064340-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>