Nitin Rawat [Mon, 28 Jul 2025 22:57:11 +0000 (04:27 +0530)]
scsi: ufs: core: Fix interrupt handling for MCQ Mode
Commit 3c7ac40d7322 ("scsi: ufs: core: Delegate the interrupt service
routine to a threaded IRQ handler") introduced a regression where the UFS
interrupt status register (IS) was not cleared in ufshcd_intr() when
operating in MCQ mode.
This led to a persistent issue during UIC interrupts:
ufshcd_is_auto_hibern8_error() consistently returned true because the
UFSHCD_UIC_HIBERN8_MASK bit was set, while the active command was neither
UIC_CMD_DME_HIBER_ENTER nor UIC_CMD_DME_HIBER_EXIT. This caused
continuous auto hibern8 enter errors and the device failed to boot.
To fix this, ensure that the interrupt status register is properly
cleared in the ufshcd_intr() function for MCQ mode with ESI enabled.
[ 4.553226] ufshcd-qcom 1d84000.ufs: ufshcd_check_errors: Auto Hibern8 Enter failed - status: 0x00000040, upmcrs: 0x00000001
[ 4.553229] ufshcd-qcom 1d84000.ufs: ufshcd_check_errors: saved_err 0x40 saved_uic_err 0x0
[ 4.553311] host_regs: 00000000: d5c7033f 20e0071f 00000400 00000000
[ 4.553312] host_regs: 00000010: 01000000 00010217 00000c96 00000000
[ 4.553314] host_regs: 00000020: 00000440 00170ef5 00000000 00000000
[ 4.553316] host_regs: 00000030: 0000010f 00000001 00000000 00000000
[ 4.553317] host_regs: 00000040: 00000000 00000000 00000000 00000000
[ 4.553319] host_regs: 00000050: fffdf000 0000000f 00000000 00000000
[ 4.553320] host_regs: 00000060: 00000001 80000000 00000000 00000000
[ 4.553322] host_regs: 00000070: fffde000 0000000f 00000000 00000000
[ 4.553323] host_regs: 00000080: 00000001 00000000 00000000 00000000
[ 4.553325] host_regs: 00000090: 00000002 d0020000 00000000 01930200
Fixes: 3c7ac40d7322 ("scsi: ufs: core: Delegate the interrupt service routine to a threaded IRQ handler")
Co-developed-by: Palash Kambar <quic_pkambar@quicinc.com>
Signed-off-by: Palash Kambar <quic_pkambar@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20250728225711.29273-1-quic_nitirawa@quicinc.com
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Linus Torvalds [Wed, 6 Aug 2025 01:41:21 +0000 (04:41 +0300)]
Merge tag 'perf-fixes-27504' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
Pull perf fixes from Thomas Gleixner:
"Perf fixes for perf_mmap() reference counting to prevent potential
reference count leaks which are caused by:
- VMA splits, which change the offset or size of a mapping, which
causes perf_mmap_close() to ignore the unmap or unmap the wrong
buffer.
- Several internal issues of perf_mmap(), which can cause reference
count leaks in the perf mmap, corrupt accounting or cause leaks in
perf drivers.
The main fix is to prevent VMA splits by implementing the
[may_]split() callback for vm operations.
The other issues are addressed by rearranging code, early returns on
failure and invocation of cleanups.
Also provide a selftest to validate the fixes.
The reference counting should be converted to refcount_t, but that
requires larger refactoring of the code and will be done once these
fixes are upstream"
* tag 'perf-fixes-27504' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git:
selftests/perf_events: Add a mmap() correctness test
perf/core: Prevent VMA split of buffer mappings
perf/core: Handle buffer mapping fail correctly in perf_mmap()
perf/core: Exit early on perf_mmap() fail
perf/core: Don't leak AUX buffer refcount on allocation failure
perf/core: Preserve AUX buffer allocation failure result
Ammar Faizi [Wed, 6 Aug 2025 00:31:05 +0000 (07:31 +0700)]
net: usbnet: Fix the wrong netif_carrier_on() call
The commit referenced in the Fixes tag causes usbnet to malfunction
(identified via git bisect). Post-commit, my external RJ45 LAN cable
fails to connect. Linus also reported the same issue after pulling that
commit.
The code has a logic error: netif_carrier_on() is only called when the
link is already on. Fix this by moving the netif_carrier_on() call
outside the if-statement entirely. This ensures it is always called
when EVENT_LINK_CARRIER_ON is set and properly clears it regardless
of the link state.
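The logic change can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct fake_dev and both functions are hypothetical stand-ins and not the usbnet code. The carrier_on_pending field models the EVENT_LINK_CARRIER_ON flag, and carrier_ok models what netif_carrier_ok() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant usbnet device state. */
struct fake_dev {
	bool carrier_ok;         /* models netif_carrier_ok() */
	bool carrier_on_pending; /* models the EVENT_LINK_CARRIER_ON flag */
};

/* Old shape: the pending event is only consumed if the link is already on. */
static void link_change_old(struct fake_dev *d)
{
	assert(d != NULL);
	if (d->carrier_ok) {
		if (d->carrier_on_pending) {
			d->carrier_on_pending = false;
			d->carrier_ok = true; /* no effect, carrier was already on */
		}
	}
}

/* Fixed shape: consume the event first, regardless of the link state. */
static void link_change_fixed(struct fake_dev *d)
{
	assert(d != NULL);
	if (d->carrier_on_pending) {
		d->carrier_on_pending = false;
		d->carrier_ok = true;
	}
}
```

With the carrier down and the event pending, the old shape leaves both fields untouched, while the fixed shape turns the carrier on and clears the event.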
Cc: stable@vger.kernel.org
Cc: Armando Budianto <sprite@gnuweeb.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/all/CAHk-=wjqL4uF0MG_c8+xHX1Vv8==sPYQrtzbdA3kzi96284nuQ@mail.gmail.com
Closes: https://lore.kernel.org/netdev/CAHk-=wjKh8X4PT_mU1kD4GQrbjivMfPn-_hXa6han_BTDcXddw@mail.gmail.com
Closes: https://lore.kernel.org/netdev/0752dee6-43d6-4e1f-81d2-4248142cccd2@gnuweeb.org
Fixes: 0d9cfc9b8cb1 ("net: usbnet: Avoid potential RCU stall on LINK_CHANGE event")
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Masahiro Yamada [Mon, 4 Aug 2025 14:20:07 +0000 (23:20 +0900)]
MAINTAINERS: hand over Kbuild maintenance
I'm stepping down as the maintainer of Kbuild/Kconfig.
It was enjoyable to refactor and improve the kernel build system,
but due to personal reasons, I believe it's difficult for me to
continue in this role any further.
I discussed this off-list with Nathan and Nicolas, and they have
kindly agreed to take over the maintenance of Kbuild with Odd Fixes.
I'm grateful to them for stepping in.
As for Kconfig, there are currently no designated reviewers, so the
maintainer position will remain vacant for now. I hope someone will
step up to take on the role.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Nicolas Schier <nicolas@fjasle.eu>
Michał Górny [Tue, 29 Jul 2025 13:24:55 +0000 (15:24 +0200)]
kheaders: make it possible to override TAR
Commit 86cdd2fdc4e3 ("kheaders: make headers archive reproducible")
introduced a number of options specific to GNU tar to the `tar`
invocation in `gen_kheaders.sh` script. This causes the script to fail
to work on systems where `tar` is not GNU tar. This can occur e.g.
on recent Gentoo Linux installations that support using bsdtar from
libarchive instead.
Add a `TAR` make variable to make it possible to override the tar
executable used, e.g. by specifying:
make TAR=gtar
Link: https://bugs.gentoo.org/884061
Reported-by: Sam James <sam@gentoo.org>
Tested-by: Sam James <sam@gentoo.org>
Co-developed-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Michał Górny <mgorny@gentoo.org>
Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Thomas Weißschuh [Mon, 28 Jul 2025 13:47:37 +0000 (15:47 +0200)]
kbuild: userprogs: use correct linker when mixing clang and GNU ld
The userprogs infrastructure does not expect clang being used with GNU ld
and in that case uses /usr/bin/ld for linking, not the configured $(LD).
This fallback is problematic as it will break when cross-compiling.
Mixing clang and GNU ld is used for example when building for SPARC64,
as ld.lld is not sufficient; see Documentation/kbuild/llvm.rst.
Relax the check around --ld-path so it gets used for all linkers.
Fixes: dfc1b168a8c4 ("kbuild: userprogs: use correct lld when linking through clang")
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Suchit Karunakaran [Sun, 27 Jul 2025 16:44:33 +0000 (22:14 +0530)]
kconfig: lxdialog: replace strcpy() with strncpy() in inputbox.c
strcpy() performs no bounds checking and can lead to buffer overflows if
the input string exceeds the destination buffer size. This patch replaces
it with strncpy(), and null terminates the input string.
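The pattern described above can be sketched as follows. This is a minimal illustration, not the actual inputbox.c change: bounded_copy() and its parameters are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into a fixed-size buffer with strncpy()
 * and terminate the result explicitly.
 */
static void bounded_copy(char *dst, size_t dst_size, const char *src)
{
	assert(dst_size > 0);
	/* strncpy() writes at most dst_size - 1 bytes here ... */
	strncpy(dst, src, dst_size - 1);
	/* ... but does not terminate on truncation, so do it by hand. */
	dst[dst_size - 1] = '\0';
}
```

The explicit terminator matters because strncpy() leaves the destination unterminated whenever the source is at least as long as the given count.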
Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com>
Reviewed-by: Nicolas Schier <nicolas.schier@linux.dev>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Suchit Karunakaran [Sat, 26 Jul 2025 19:43:07 +0000 (01:13 +0530)]
kconfig: lxdialog: replace strcpy with snprintf in print_autowrap
strcpy() does not perform bounds checking and can lead to buffer overflows
if the source string exceeds the destination buffer size. In
print_autowrap(), replace strcpy() with snprintf() to safely copy the
prompt string into the fixed-size tempstr buffer.
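A minimal sketch of the snprintf() approach, assuming a fixed-size destination buffer. The copy_prompt() helper is hypothetical and only illustrates the technique; it is not the code in print_autowrap().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper; returns 1 if the prompt was truncated, 0 otherwise. */
static int copy_prompt(char *dst, size_t dst_size, const char *prompt)
{
	int needed;

	assert(dst_size > 0);
	/* snprintf() always terminates dst and returns the untruncated length. */
	needed = snprintf(dst, dst_size, "%s", prompt);
	return needed < 0 || (size_t)needed >= dst_size;
}
```

Unlike strncpy(), snprintf() terminates the destination by itself, and its return value allows the caller to detect truncation.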
Signed-off-by: Suchit Karunakaran <suchitkarunakaran@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Meghana Malladi [Sun, 3 Aug 2025 18:02:16 +0000 (23:32 +0530)]
net: ti: icssg-prueth: Fix skb handling for XDP_PASS
emac_rx_packet() is a common function for handling traffic
for both xdp and non-xdp use cases. Use common logic for
handling skb with or without xdp to prevent any incorrect
packet processing. This fixes ping not working with
XDP_PASS on the icssg driver.
Fixes: 62aa3246f4623 ("net: ti: icssg-prueth: Add XDP support")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250803180216.3569139-1-m-malladi@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Samiullah Khawaja [Mon, 4 Aug 2025 16:44:57 +0000 (16:44 +0000)]
net: Update threaded state in napi config in netif_set_threaded
Commit 2677010e7793 ("Add support to set NAPI threaded for individual
NAPI") added support to enable/disable threaded napi using netlink. This
also extended the napi config save/restore functionality to set the napi
threaded state. This breaks netdev reset for drivers that use napi
threaded at device level and also use napi config save/restore on
napi_disable/napi_enable. Basically on netdev with napi threaded enabled
at device level, a napi_enable call will get stuck trying to stop the
napi kthread. This is because the napi->config->threaded is set to
disabled when threaded is enabled at device level.
The issue can be reproduced on a virtio-net device using qemu. To
reproduce the issue, run the following:
echo 1 > /sys/class/net/eth0/threaded
ethtool -L eth0 combined 1
Update the threaded state in napi config in netif_set_threaded and add a
new test that verifies this scenario.
Tested on qemu with virtio-net:
NETIF=eth0 ./tools/testing/selftests/drivers/net/napi_threaded.py
TAP version 13
1..2
ok 1 napi_threaded.change_num_queues
ok 2 napi_threaded.enable_dev_threaded_disable_napi_threaded
# Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0
Fixes: 2677010e7793 ("Add support to set NAPI threaded for individual NAPI")
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Link: https://patch.msgid.link/20250804164457.2494390-1-skhawaja@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Trond Myklebust [Tue, 15 Jul 2025 18:29:51 +0000 (11:29 -0700)]
NFS/localio: nfs_uuid_put() fix the wake up after unlinking the file
Use store_release_wake_up() instead of wake_up_var_locked(), because the
waiter cannot retake the nfs_uuid->lock.
Acked-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Mike Snitzer <snitzer@kernel.org>
Suggested-by: NeilBrown <neil@brown.name>
Link: https://lore.kernel.org/all/175262948827.2234665.1891349021754495573@noble.neil.brown.name/
Fixes: 21fb44034695 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Tue, 15 Jul 2025 19:49:00 +0000 (12:49 -0700)]
NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh()
In order for the wait in nfs_uuid_put() to be safe, it is necessary to
ensure that nfs_uuid_add_file() doesn't add a new entry once the
nfs_uuid->net has been NULLed out.
Also fix up the wake_up_var_locked() / wait_var_event_spinlock() to both
use the nfs_uuid address, since nfl and &nfl->uuid could be used elsewhere.
Acked-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Mike Snitzer <snitzer@kernel.org>
Link: https://lore.kernel.org/all/175262893035.2234665.1735173020338594784@noble.neil.brown.name/
Fixes: 21fb44034695 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Tue, 15 Jul 2025 19:43:41 +0000 (12:43 -0700)]
NFS/localio: nfs_close_local_fh() fix check for file closed
If the struct nfs_file_localio is closed, its list entry will be empty,
but the nfs_uuid->files list might still contain other entries.
Acked-by: Mike Snitzer <snitzer@kernel.org>
Tested-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neil@brown.name>
Fixes: 21fb44034695 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Ido Schimmel [Mon, 4 Aug 2025 11:43:20 +0000 (14:43 +0300)]
selftests: netdevsim: Xfail nexthop test on slow machines
A lot of test cases in the file are related to the idle and unbalanced
timers of resilient nexthop groups and these tests are reported to be
flaky on slow machines running debug kernels.
Rather than marking a lot of individual tests with xfail_on_slow(),
simply mark all the tests. Note that the test is stable on non-debug
machines and that with debug kernels we are mainly interested in the
output of various sanitizers in order to determine pass / fail.
Before:
# make -C tools/testing/selftests KSFT_MACHINE_SLOW=yes \
TARGETS=drivers/net/netdevsim TEST_PROGS=nexthop.sh \
TEST_GEN_PROGS="" run_tests
[...]
# TEST: Bucket migration after idle timer (with delete) [FAIL]
# Group expected to still be unbalanced
[...]
not ok 1 selftests: drivers/net/netdevsim: nexthop.sh # exit=1
After:
# make -C tools/testing/selftests KSFT_MACHINE_SLOW=yes \
TARGETS=drivers/net/netdevsim TEST_PROGS=nexthop.sh \
TEST_GEN_PROGS="" run_tests
[...]
# TEST: Bucket migration after idle timer (with delete) [XFAIL]
# Group expected to still be unbalanced
[...]
ok 1 selftests: drivers/net/netdevsim: nexthop.sh
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20250729160609.02e0f157@kernel.org/
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250804114320.193203-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 5 Aug 2025 23:01:47 +0000 (16:01 -0700)]
Merge branch 'eth-fbnic-fix-drop-stats-support'
Mohsin Bashir says:
====================
eth: fbnic: Fix drop stats support
Fix hardware drop stats support on the TX path of fbnic by addressing two
issues: ensure that tx_dropped stats are correctly copied to the
rtnl_link_stats64 struct, and protect the copying of drop stats from
fbd->hw_stats to the local variable with the hw_stats_lock to
ensure consistency.
====================
Link: https://patch.msgid.link/20250802024636.679317-1-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mohsin Bashir [Sat, 2 Aug 2025 02:46:36 +0000 (19:46 -0700)]
eth: fbnic: Lock the tx_dropped update
Wrap copying of drop stats on TX path from fbd->hw_stats with the
hw_stats_lock. Currently, it is being performed outside the lock and
another thread accessing fbd->hw_stats can lead to inconsistencies.
Fixes: 5f8bd2ce8269 ("eth: fbnic: add support for TMI stats")
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250802024636.679317-3-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mohsin Bashir [Sat, 2 Aug 2025 02:46:35 +0000 (19:46 -0700)]
eth: fbnic: Fix tx_dropped reporting
Correctly copy the tx_dropped stats from the fbd->hw_stats to the
rtnl_link_stats64 struct.
Fixes: 5f8bd2ce8269 ("eth: fbnic: add support for TMI stats")
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250802024636.679317-2-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 1 Aug 2025 17:07:54 +0000 (10:07 -0700)]
eth: fbnic: remove the debugging trick of super high page bias
Alex added page bias of LONG_MAX, which is admittedly quite
a clever way of catching overflows of the pp ref count.
The page pool code was "optimized" to leave the ref at 1
for freed pages so it can't catch basic bugs by itself any more.
(Something we should probably address under DEBUG_NET...)
Unfortunately for fbnic, since commit f7dc3248dcfb ("skbuff: Optimization
of SKB coalescing for page pool") the core _may_ actually take two extra
pp refcounts. If one of them is returned before the driver gives up the
bias, the ret < 0 check in page_pool_unref_netmem() will trigger.
While at it, add an FBNIC_ prefix to the name of the driver constant.
Fixes: 0cb4c0a13723 ("eth: fbnic: Implement Rx queue alloc/start/stop/free")
Link: https://patch.msgid.link/20250801170754.2439577-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 30 Jul 2025 20:23:23 +0000 (22:23 +0200)]
net: ftgmac100: fix potential NULL pointer access in ftgmac100_phy_disconnect
After the call to phy_disconnect() netdev->phydev is reset to NULL.
So fixed_phy_unregister() would be called with a NULL pointer as argument.
Therefore cache the phy_device before this call.
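The ordering problem can be sketched with a small userspace model. Everything prefixed with fake_ is a hypothetical stand-in for the real phylib objects and calls; only the order of operations is meant to mirror the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct phy_device and struct net_device. */
struct fake_phy { int unregistered; };
struct fake_netdev { struct fake_phy *phydev; };

/* Models phy_disconnect(): the netdev's phydev pointer is reset to NULL. */
static void fake_phy_disconnect(struct fake_netdev *nd)
{
	nd->phydev = NULL;
}

/* Models fixed_phy_unregister(): it must not be handed a NULL pointer. */
static void fake_fixed_phy_unregister(struct fake_phy *phy)
{
	assert(phy != NULL);
	phy->unregistered = 1;
}

static void teardown(struct fake_netdev *nd)
{
	/* Cache the pointer before the disconnect clears nd->phydev. */
	struct fake_phy *phy = nd->phydev;

	if (!phy)
		return;
	fake_phy_disconnect(nd);
	fake_fixed_phy_unregister(phy);
}
```

Reading nd->phydev after fake_phy_disconnect() instead of using the cached pointer would pass NULL to the unregister step, which is the failure mode described above.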
Fixes: e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI")
Cc: stable@vger.kernel.org
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Link: https://patch.msgid.link/2b80a77a-06db-4dd7-85dc-3a8e0de55a1d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Krzysztof Kozlowski [Thu, 24 Jul 2025 11:37:59 +0000 (13:37 +0200)]
dt-bindings: net: Replace bouncing Alexandru Tachici emails
Emails to alexandru.tachici@analog.com bounce permanently:
Remote Server returned '550 5.1.10 RESOLVER.ADR.RecipientNotFound; Recipient not found by SMTP address lookup'
so replace him with Marcelo Schmitt from Analog.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Marcelo Schmitt <marcelo.schmitt@analog.com>
Link: https://patch.msgid.link/20250724113758.61874-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Keith Busch [Tue, 15 Jul 2025 18:46:22 +0000 (11:46 -0700)]
vfio/type1: conditional rescheduling while pinning
A large DMA mapping request can loop through dma address pinning for
many pages. In cases where THP cannot be used, the repeated vmf_insert_pfn can
be costly, so let the task reschedule as needed to prevent CPU stalls. Failure to
do so has potentially harmful side effects, like increased memory pressure
as unrelated rcu tasks are unable to run their reclaim callbacks, which can
result in OOM conditions.
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 36-....: (20999 ticks this GP) idle=b01c/1/0x4000000000000000 softirq=35839/35839 fqs=3538
rcu: hardirqs softirqs csw/system
rcu: number: 0 107 0
rcu: cputime: 50 0 10446 ==> 10556(ms)
rcu: (t=21075 jiffies g=377761 q=204059 ncpus=384)
...
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? walk_system_ram_range+0x63/0x120
? walk_system_ram_range+0x46/0x120
? pgprot_writethrough+0x20/0x20
lookup_memtype+0x67/0xf0
track_pfn_insert+0x20/0x40
vmf_insert_pfn_prot+0x88/0x140
vfio_pci_mmap_huge_fault+0xf9/0x1b0 [vfio_pci_core]
__do_fault+0x28/0x1b0
handle_mm_fault+0xef1/0x2560
fixup_user_fault+0xf5/0x270
vaddr_get_pfns+0x169/0x2f0 [vfio_iommu_type1]
vfio_pin_pages_remote+0x162/0x8e0 [vfio_iommu_type1]
vfio_iommu_type1_ioctl+0x1121/0x1810 [vfio_iommu_type1]
? futex_wake+0x1c1/0x260
x64_sys_call+0x234/0x17a0
do_syscall_64+0x63/0x130
? exc_page_fault+0x63/0x130
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20250715184622.3561598-1-kbusch@meta.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Małgorzata Mielnik [Tue, 15 Jul 2025 08:11:50 +0000 (09:11 +0100)]
vfio/qat: add support for intel QAT 6xxx virtual functions
Extend the qat_vfio_pci variant driver to support QAT 6xxx Virtual
Functions (VFs). Add the relevant QAT 6xxx VF device IDs to the driver's
probe table, enabling proper detection and initialization of these devices.
Update the module description to reflect that the driver now supports all
QAT generations.
Signed-off-by: Małgorzata Mielnik <malgorzata.mielnik@intel.com>
Signed-off-by: Suman Kumar Chakraborty <suman.kumar.chakraborty@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Link: https://lore.kernel.org/r/20250715081150.1244466-1-suman.kumar.chakraborty@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Xin Zeng [Tue, 15 Jul 2025 00:13:57 +0000 (20:13 -0400)]
vfio/qat: Remove myself from VFIO QAT PCI driver maintainers
Remove myself from VFIO QAT PCI driver maintainers as I'm leaving
Intel.
Signed-off-by: Xin Zeng <xin.zeng@intel.com>
Link: https://lore.kernel.org/r/20250715001357.33725-1-xin.zeng@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Jason Gunthorpe [Mon, 14 Jul 2025 16:08:25 +0000 (13:08 -0300)]
vfio/pci: Do vf_token checks for VFIO_DEVICE_BIND_IOMMUFD
This was missed during the initial implementation. The VFIO PCI encodes
the vf_token inside the device name when opening the device from the group
FD, something like:
"0000:04:10.0 vf_token=bd8d9d2b-5a5f-4f5a-a211-f591514ba1f3"
This is used to control access to a VF unless there is co-ordination with
the owner of the PF.
Since we no longer have a device name in the cdev path, pass the token
directly through VFIO_DEVICE_BIND_IOMMUFD using an optional field
indicated by VFIO_DEVICE_BIND_FLAG_TOKEN.
Fixes: 5fcc26969a16 ("vfio: Add VFIO_DEVICE_BIND_IOMMUFD")
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/0-v3-bdd8716e85fe+3978a-vfio_token_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Marcos Alano [Tue, 5 Aug 2025 20:44:29 +0000 (13:44 -0700)]
Input: add keycode for performance mode key
Alienware calls this key "Performance Boost". Dell calls it "G-Mode".
The goal is to have a specific keycode to detect when this key is
pressed, so userspace can act upon it and do what it has to do, usually
starting the power profile for performance.
Signed-off-by: Marcos Alano <marcoshalano@gmail.com>
Link: https://lore.kernel.org/r/20250509193708.2190586-1-marcoshalano@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Jinjiang Tu [Thu, 24 Jul 2025 09:09:57 +0000 (17:09 +0800)]
fs/proc/task_mmu: hold PTL in pagemap_hugetlb_range and gather_hugetlb_stats
Hold PTL in pagemap_hugetlb_range() and gather_hugetlb_stats() to avoid
operating on a stale page, as pagemap_pmd_range() and gather_pte_stats()
have done.
Link: https://lkml.kernel.org/r/20250724090958.455887-3-tujinjiang@huawei.com
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Brahmajit Das <brahmajit.xyz@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Rientjes <rientjes@google.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joern Engel <joern@logfs.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jinjiang Tu [Thu, 24 Jul 2025 09:09:56 +0000 (17:09 +0800)]
mm/smaps: fix race between smaps_hugetlb_range and migration
smaps_hugetlb_range() handles the pte without holding ptl, and may be
concurrent with migration, leading to BUG_ON in pfn_swap_entry_to_page().
The race is as follows.
smaps_hugetlb_range              migrate_pages
  huge_ptep_get
                                   remove_migration_ptes
                                   folio_unlock
  pfn_swap_entry_folio
    BUG_ON
To fix it, hold ptl lock in smaps_hugetlb_range().
Link: https://lkml.kernel.org/r/20250724090958.455887-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20250724090958.455887-2-tujinjiang@huawei.com
Fixes: 25ee01a2fca0 ("mm: hugetlb: proc: add hugetlb-related fields to /proc/PID/smaps")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Brahmajit Das <brahmajit.xyz@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Rientjes <rientjes@google.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joern Engel <joern@logfs.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Barry Song [Tue, 5 Aug 2025 03:54:47 +0000 (11:54 +0800)]
mm: fix the race between collapse and PT_RECLAIM under per-vma lock
The check_pmd_still_valid() call during collapse is currently only
protected by the mmap_lock in write mode, which was sufficient when
pt_reclaim always ran under mmap_lock in read mode. However, since
madvise_dontneed can now execute under a per-VMA lock, this assumption is
no longer valid. As a result, a race condition can occur between collapse
and PT_RECLAIM, potentially leading to a kernel panic.
[ 38.151897] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] SMP KASI
[ 38.153519] KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
[ 38.154605] CPU: 0 UID: 0 PID: 721 Comm: repro Not tainted 6.16.0-next-20250801-next-2025080 #1 PREEMPT(voluntary)
[ 38.155929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org4
[ 38.157418] RIP: 0010:kasan_byte_accessible+0x15/0x30
[ 38.158125] Code: 03 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 48 b8 00 00 00 00 00 fc0
[ 38.160461] RSP: 0018:ffff88800feef678 EFLAGS: 00010286
[ 38.161220] RAX: dffffc0000000000 RBX: 0000000000000001 RCX: 1ffffffff0dde60c
[ 38.162232] RDX: 0000000000000000 RSI: ffffffff85da1e18 RDI: dffffc0000000003
[ 38.163176] RBP: ffff88800feef698 R08: 0000000000000001 R09: 0000000000000000
[ 38.164195] R10: 0000000000000000 R11: ffff888016a8ba58 R12: 0000000000000018
[ 38.165189] R13: 0000000000000018 R14: ffffffff85da1e18 R15: 0000000000000000
[ 38.166100] FS: 0000000000000000(0000) GS:ffff8880e3b40000(0000) knlGS:0000000000000000
[ 38.167137] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 38.167891] CR2: 00007f97fadfe504 CR3: 0000000007088005 CR4: 0000000000770ef0
[ 38.168812] PKRU: 55555554
[ 38.169275] Call Trace:
[ 38.169647] <TASK>
[ 38.169975] ? __kasan_check_byte+0x19/0x50
[ 38.170581] lock_acquire+0xea/0x310
[ 38.171083] ? rcu_is_watching+0x19/0xc0
[ 38.171615] ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[ 38.172343] ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
[ 38.173130] _raw_spin_lock+0x38/0x50
[ 38.173707] ? __pte_offset_map_lock+0x1a2/0x3c0
[ 38.174390] __pte_offset_map_lock+0x1a2/0x3c0
[ 38.174987] ? __pfx___pte_offset_map_lock+0x10/0x10
[ 38.175724] ? __pfx_pud_val+0x10/0x10
[ 38.176308] ? __sanitizer_cov_trace_const_cmp1+0x1e/0x30
[ 38.177183] unmap_page_range+0xb60/0x43e0
[ 38.177824] ? __pfx_unmap_page_range+0x10/0x10
[ 38.178485] ? mas_next_slot+0x133a/0x1a50
[ 38.179079] unmap_single_vma.constprop.0+0x15b/0x250
[ 38.179830] unmap_vmas+0x1fa/0x460
[ 38.180373] ? __pfx_unmap_vmas+0x10/0x10
[ 38.180994] ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[ 38.181877] exit_mmap+0x1a2/0xb40
[ 38.182396] ? lock_release+0x14f/0x2c0
[ 38.182929] ? __pfx_exit_mmap+0x10/0x10
[ 38.183474] ? __pfx___mutex_unlock_slowpath+0x10/0x10
[ 38.184188] ? mutex_unlock+0x16/0x20
[ 38.184704] mmput+0x132/0x370
[ 38.185208] do_exit+0x7e7/0x28c0
[ 38.185682] ? __this_cpu_preempt_check+0x21/0x30
[ 38.186328] ? do_group_exit+0x1d8/0x2c0
[ 38.186873] ? __pfx_do_exit+0x10/0x10
[ 38.187401] ? __this_cpu_preempt_check+0x21/0x30
[ 38.188036] ? _raw_spin_unlock_irq+0x2c/0x60
[ 38.188634] ? lockdep_hardirqs_on+0x89/0x110
[ 38.189313] do_group_exit+0xe4/0x2c0
[ 38.189831] __x64_sys_exit_group+0x4d/0x60
[ 38.190413] x64_sys_call+0x2174/0x2180
[ 38.190935] do_syscall_64+0x6d/0x2e0
[ 38.191449] entry_SYSCALL_64_after_hwframe+0x76/0x7e
This patch moves the vma_start_write() call to precede
check_pmd_still_valid(), ensuring that the check is also properly
protected by the per-VMA lock.
Link: https://lkml.kernel.org/r/20250805035447.7958-1-21cnbao@gmail.com
Fixes: a6fde7add78d ("mm: use per_vma lock for MADV_DONTNEED")
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: "Lai, Yi" <yi1.lai@linux.intel.com>
Reported-by: "Lai, Yi" <yi1.lai@linux.intel.com>
Closes: https://lore.kernel.org/all/aJAFrYfyzGpbm+0m@ly-workstation/
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Tangquan Zheng <zhengtangquan@oppo.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Waiman Long [Mon, 28 Jul 2025 19:02:48 +0000 (15:02 -0400)]
mm/kmemleak: avoid soft lockup in __kmemleak_do_cleanup()
A soft lockup warning was observed on a relatively small x86-64
system with 16 GB of memory when running a debug kernel with kmemleak
enabled.
watchdog: BUG: soft lockup - CPU#8 stuck for 33s! [kworker/8:1:134]
The test system was running a workload with hot unplug happening in
parallel. Then kmemleak decided to disable itself due to its inability to
allocate more kmemleak objects. The debug kernel has its
CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE set to 40,000.
The soft lockup happened in kmemleak_do_cleanup() when the existing
kmemleak objects were being removed and deleted one-by-one in a loop via a
workqueue. In this particular case, there are at least 40,000 objects
that need to be processed and given the slowness of a debug kernel and the
fact that a raw_spinlock has to be acquired and released in
__delete_object(), it could take a while to properly handle all these
objects.
As kmemleak has been disabled in this case, the object removal and
deletion process can be further optimized as locking isn't really needed.
However, it is probably not worth the effort to optimize for such an edge
case that should rarely happen. So the simple solution is to call
cond_resched() at periodic intervals in the iteration loop to avoid soft
lockup.
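The idea of yielding periodically inside a long cleanup loop can be sketched in userspace. This is an analogy under stated assumptions, not the kernel change: sched_yield() stands in for cond_resched(), and cleanup_objects() with its interval parameter is a hypothetical helper whose interval value is not taken from the patch.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical cleanup loop: process nr_objects items and give up the CPU
 * once every "interval" iterations. Returns how many times it yielded.
 */
static size_t cleanup_objects(size_t nr_objects, size_t interval)
{
	size_t yields = 0;

	assert(interval > 0);
	for (size_t i = 0; i < nr_objects; i++) {
		/* ... remove and delete object i here ... */

		/* Userspace stand-in for cond_resched(). */
		if ((i + 1) % interval == 0) {
			sched_yield();
			yields++;
		}
	}
	return yields;
}
```

Yielding only every so many iterations keeps the overhead of the check small while still bounding how long the loop runs without a scheduling point.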
Link: https://lkml.kernel.org/r/20250728190248.605750-1-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Masami Hiramatsu (Google) [Wed, 30 Jul 2025 14:25:08 +0000 (23:25 +0900)]
MAINTAINERS: add Masami as a reviewer of hung task detector
Since I'm actively working on the hung task blocker detector, add myself as a
reviewer of the HUNG TASK DETECTOR feature.
Link: https://lkml.kernel.org/r/175388550841.627474.3260499035226455392.stgit@devnote2
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Breno Leitao [Thu, 31 Jul 2025 09:57:18 +0000 (02:57 -0700)]
mm/kmemleak: avoid deadlock by moving pr_warn() outside kmemleak_lock
When netpoll is enabled, calling pr_warn_once() while holding
kmemleak_lock in mem_pool_alloc() can cause a deadlock due to lock
inversion with the netconsole subsystem. This occurs because
pr_warn_once() may trigger netpoll, which eventually leads to
__alloc_skb() and back into kmemleak code, attempting to reacquire
kmemleak_lock.
This is the path for the deadlock.
mem_pool_alloc()
  -> raw_spin_lock_irqsave(&kmemleak_lock, flags);
    -> pr_warn_once()
      -> netconsole subsystem
        -> netpoll
          -> __alloc_skb
            -> __create_object
              -> raw_spin_lock_irqsave(&kmemleak_lock, flags);
Fix this by setting a flag and issuing the pr_warn_once() after
kmemleak_lock is released.
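The "set a flag, warn after unlock" approach can be sketched in user space as follows. This is an illustration under stated assumptions, not the kmemleak implementation: a toy spinlock built on C11 atomic_flag stands in for kmemleak_lock, fprintf() stands in for pr_warn_once() (without the "once" behavior), and pool_take(), pool_free_slots and warnings_printed are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical pool protected by a toy spinlock. */
static atomic_flag pool_lock = ATOMIC_FLAG_INIT;
static int pool_free_slots = 1;
static int warnings_printed;

static void pool_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&pool_lock,
                                             memory_order_acquire))
        ; /* spin until the lock is free */
}

static void pool_lock_release(void)
{
    atomic_flag_clear_explicit(&pool_lock, memory_order_release);
}

/*
 * Take one slot from the pool. If the pool is empty, only record that
 * fact while the lock is held and print the warning after unlocking,
 * so the logging path can never re-enter code that needs pool_lock.
 */
static bool pool_take(void)
{
    bool got_slot = false;
    bool warn = false;

    pool_lock_acquire();
    if (pool_free_slots > 0) {
        pool_free_slots--;
        got_slot = true;
    } else {
        warn = true; /* defer the message until the lock is dropped */
    }
    pool_lock_release();

    if (warn) {
        fprintf(stderr, "pool exhausted\n");
        warnings_printed++;
    }
    return got_slot;
}
```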
Link: https://lkml.kernel.org/r/20250731-kmemleak_lock-v1-1-728fd470198f@debian.org
Fixes: c5665868183f ("mm: kmemleak: use the memory pool for early allocations")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reported-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jann Horn [Mon, 28 Jul 2025 20:11:54 +0000 (22:11 +0200)]
kasan/test: fix protection against compiler elision
The kunit test is using assignments to
"static volatile void *kasan_ptr_result" to prevent elision of memory
loads, but that's not working:
In this variable definition, the "volatile" applies to the "void", not to
the pointer.
To make "volatile" apply to the pointer as intended, it must follow
after the "*".
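The difference between the two qualifier placements can be made visible at compile time. The sketch below uses C11 _Generic; the variable names and the POINTER_IS_VOLATILE() macro are hypothetical and are not taken from the kunit test.

```c
#include <assert.h>

/* "volatile" qualifies the pointed-to data; the pointer itself is plain. */
static volatile void *ptr_to_volatile;

/* "volatile" after the "*" qualifies the pointer object itself. */
static void *volatile volatile_ptr;

/*
 * The type of &obj carries the qualifiers of obj, so _Generic can tell
 * whether the pointer object (rather than its target) is volatile.
 */
#define POINTER_IS_VOLATILE(obj) \
    _Generic(&(obj), void *volatile *: 1, default: 0)
```

Only stores to an object whose own type is volatile-qualified are guaranteed to be treated as observable side effects, which is why the placement matters for defeating elision.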
This makes the kasan_memchr test pass again on my system. The
kasan_strings test is still failing because all the definitions of
load_unaligned_zeropad() are lacking explicit instrumentation hooks and
ASAN does not instrument asm() memory operands.
Link: https://lkml.kernel.org/r/20250728-kasan-kunit-fix-volatile-v1-1-e7157c9af82d@google.com
Fixes: 5f1c8108e7ad ("mm:kasan: fix sparse warnings: Should it be static?")
Signed-off-by: Jann Horn <jannh@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Nihar Chaithanya <niharchaithanya@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dave Airlie [Tue, 5 Aug 2025 20:11:28 +0000 (06:11 +1000)]
Merge tag 'drm-intel-next-fixes-2025-08-05' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
drm/i915 fixes for v6.17-rc1:
- Fixes around DP LFPS (Low-Frequency Periodic Signaling)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/e1147bede8f219682419d198022cfe8d9d4edc28@intel.com
Lorenzo Stoakes [Sat, 2 Aug 2025 20:55:35 +0000 (22:55 +0200)]
selftests/perf_events: Add a mmap() correctness test
Exercise various mmap(), munmap() and mremap() invocations, which might
cause a perf buffer mapping to be split or truncated.
To avoid hard coding the perf event and having dependencies on
architectures and configuration options, scan through event types in sysfs
and try to open them. On success, try to mmap() and if that succeeds try to
mmap() the AUX buffer.
In case that no AUX buffer supporting event is found, only test the base
buffer mapping. If no mappable event is found or permissions are not
sufficient, skip the tests.
Reserve a PROT_NONE region for both rb and aux tests to allow testing the
case where mremap unmaps beyond the end of a mapped VMA to prevent it from
unmapping unrelated mappings.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Co-developed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Thomas Gleixner [Wed, 30 Jul 2025 21:01:21 +0000 (23:01 +0200)]
perf/core: Prevent VMA split of buffer mappings
The perf mmap code is careful about mmap()'ing the user page with the
ringbuffer and additionally the auxiliary buffer, when the event supports
it. Once the first mapping is established, subsequent mappings have to use
the same offset and the same size in both cases. The reference counting for
the ringbuffer and the auxiliary buffer depends on this being correct.
However, perf does not prevent a related mapping from being split via mmap(2),
munmap(2) or mremap(2). A split of a VMA results in perf_mmap_open() calls,
which take reference counts, but then the subsequent perf_mmap_close()
calls no longer fulfill the offset and size checks. This leads to
reference count leaks.
As perf already has the requirement for subsequent mappings to match the
initial mapping, the obvious consequence is that VMA splits, caused by
resizing of a mapping or partial unmapping, have to be prevented.
Implement the vm_operations_struct::may_split() callback and return
unconditionally -EINVAL.
That ensures that the mapping offsets and sizes cannot be changed after the
fact. Remapping to a different fixed address with the same size is still
possible as it takes the references for the new mapping and drops those of
the old mapping.
Fixes: 45bfb2e50471 ("perf: Add AUX area to ring buffer for raw data streams")
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-27504
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: stable@vger.kernel.org
Thomas Gleixner [Sat, 2 Aug 2025 10:48:55 +0000 (12:48 +0200)]
perf/core: Handle buffer mapping fail correctly in perf_mmap()
After successful allocation of a buffer or a successful attachment to an
existing buffer perf_mmap() tries to map the buffer read only into the page
table. If that fails, the already set up page table entries are zapped, but
the other perf specific side effects of that failure are not handled. The
calling code just cleans up the VMA and does not invoke perf_mmap_close().
This leaks reference counts, corrupts user->vm accounting and also results
in an unbalanced invocation of event::event_mapped().
Cure this by moving the event::event_mapped() invocation before the
map_range() call so that on map_range() failure perf_mmap_close() can be
invoked without causing an unbalanced event::event_unmapped() call.
perf_mmap_close() undoes the reference counts and eventually frees buffers.
Fixes: b709eb872e19 ("perf: map pages in advance")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: stable@vger.kernel.org
Thomas Gleixner [Sat, 2 Aug 2025 10:49:48 +0000 (12:49 +0200)]
perf/core: Exit early on perf_mmap() fail
When perf_mmap() fails to allocate a buffer, it still invokes the
event_mapped() callback of the related event. On X86 this might increase
the perf_rdpmc_allowed reference counter. But nothing undoes this as
perf_mmap_close() is never called in this case, which causes another
reference count leak.
Return early on failure to prevent that.
Fixes: 1e0fb9ec679c ("perf: Add pmu callbacks to track event mapping and unmapping")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: stable@vger.kernel.org
Thomas Gleixner [Sat, 2 Aug 2025 10:39:39 +0000 (12:39 +0200)]
perf/core: Don't leak AUX buffer refcount on allocation failure
Failure of the AUX buffer allocation leaks the reference count.
Set the reference count to 1 only when the allocation succeeds.
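As a rough illustration of "take the initial reference only on success", the following user-space sketch uses a hypothetical struct aux_buf and aux_buf_alloc(); it does not reflect the layout or API of the perf ring buffer code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer descriptor; not the perf ring buffer layout. */
struct aux_buf {
    void *pages;
    int refcount;
};

/*
 * Allocate the backing storage and take the initial reference only
 * after the allocation succeeded. Returns 0 on success, -1 on failure.
 */
static int aux_buf_alloc(struct aux_buf *buf, size_t size)
{
    buf->pages = size ? malloc(size) : NULL;
    if (!buf->pages)
        return -1; /* refcount is left untouched on failure */

    buf->refcount = 1;
    return 0;
}
```

With this ordering, a failed allocation leaves no reference behind that a later cleanup path would have to know about.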
Fixes: 45bfb2e50471 ("perf: Add AUX area to ring buffer for raw data streams")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: stable@vger.kernel.org
Thomas Gleixner [Mon, 4 Aug 2025 20:22:09 +0000 (22:22 +0200)]
perf/core: Preserve AUX buffer allocation failure result
A recent overhaul sets the return value to 0 unconditionally after the
allocations, which causes reference count leaks and corrupts the user->vm
accounting.
Preserve the AUX buffer allocation failure return value, so that the
subsequent code works correctly.
Fixes: 0983593f32c4 ("perf/core: Lift event->mmap_mutex in perf_mmap()")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: stable@vger.kernel.org
Wang Zhaolong [Mon, 4 Aug 2025 13:40:05 +0000 (21:40 +0800)]
smb: client: smb: client: eliminate mid_flags field
This is step 3/4 of a patch series to fix mid_q_entry memory leaks
caused by race conditions in callback execution.
Replace the mid_flags bitmask with dedicated boolean fields to
simplify locking logic and improve code readability:
- Replace MID_DELETED with bool deleted_from_q
- Replace MID_WAIT_CANCELLED with bool wait_cancelled
- Remove mid_flags field entirely
The new boolean fields have clearer semantics:
- deleted_from_q: whether mid has been removed from pending_mid_q
- wait_cancelled: whether request was cancelled during wait
This change reduces memory usage (from 4-byte bitmask to 2 boolean
flags) and eliminates confusion about which lock protects which
flag bits, preparing for per-mid locking in the next patch.
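The shape of the change can be sketched as below. The struct names, the bit values of the two macros and mid_convert() are assumptions made for illustration; only the field names deleted_from_q and wait_cancelled come from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SKETCH_MID_WAIT_CANCELLED 1u
#define SKETCH_MID_DELETED        2u

/* Before: one bitmask whose bits may be guarded by different locks. */
struct mid_before {
    unsigned int mid_flags;
};

/* After: one dedicated field per state, each with a single meaning. */
struct mid_after {
    bool deleted_from_q;  /* mid has been removed from pending_mid_q */
    bool wait_cancelled;  /* request was cancelled during wait */
};

/* Translate the old representation into the new one. */
static struct mid_after mid_convert(struct mid_before old)
{
    struct mid_after mid = {
        .deleted_from_q = (old.mid_flags & SKETCH_MID_DELETED) != 0,
        .wait_cancelled = (old.mid_flags & SKETCH_MID_WAIT_CANCELLED) != 0,
    };

    return mid;
}
```

Separate fields also avoid read-modify-write updates of a shared word, which is what makes it possible to reason about each flag's lock independently.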
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Wang Zhaolong [Mon, 4 Aug 2025 13:40:04 +0000 (21:40 +0800)]
smb: client: add mid_counter_lock to protect the mid counter counter
This is step 2/4 of a patch series to fix mid_q_entry memory leaks
caused by race conditions in callback execution.
Add a dedicated mid_counter_lock to protect current_mid counter,
separating it from mid_queue_lock which protects pending_mid_q
operations. This reduces lock contention and prepares for finer-
grained locking in subsequent patches.
Changes:
- Add TCP_Server_Info->mid_counter_lock spinlock
- Rename CurrentMid to current_mid for consistency
- Use mid_counter_lock to protect current_mid access
- Update locking documentation in cifsglob.h
This separation allows mid allocation to proceed without blocking
queue operations, improving performance under heavy load.
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Wang Zhaolong [Mon, 4 Aug 2025 13:40:03 +0000 (21:40 +0800)]
smb: client: rename server mid_lock to mid_queue_lock
This is step 1/4 of a patch series to fix mid_q_entry memory leaks
caused by race conditions in callback execution.
The current mid_lock name is somewhat ambiguous about what it protects.
To prepare for splitting this lock into separate, more granular locks,
this patch renames mid_lock to mid_queue_lock to clearly indicate its
specific responsibility for protecting the pending_mid_q list and
related queue operations.
No functional changes are made in this patch - it only prepares the
codebase for the lock splitting that follows.
- mid_queue_lock for queue operations
- mid_counter_lock for mid counter operations
- per-mid locks for individual mid state management
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pedro Falcato [Tue, 29 Jul 2025 12:03:48 +0000 (13:03 +0100)]
RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
Ever since commit c2ff29e99a76 ("siw: Inline do_tcp_sendpages()"),
we have been doing this:
static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
                             size_t size)
[...]
        /* Calculate the number of bytes we need to push, for this page
         * specifically */
        size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
        /* If we can't splice it, then copy it in, as normal */
        if (!sendpage_ok(page[i]))
                msg.msg_flags &= ~MSG_SPLICE_PAGES;
        /* Set the bvec pointing to the page, with len $bytes */
        bvec_set_page(&bvec, page[i], bytes, offset);
        /* Set the iter to $size, aka the size of the whole sendpages (!!!) */
        iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
try_page_again:
        lock_sock(sk);
        /* Sendmsg with $size size (!!!) */
        rv = tcp_sendmsg_locked(sk, &msg, size);
This means we've been sending oversized iov_iters and tcp_sendmsg calls
for a while. This has been a benign bug because sendpage_ok() always
returned true. With the recent slab allocator changes being slowly
introduced into next (that disallow sendpage on large kmalloc
allocations), we have recently hit out-of-bounds crashes, due to slight
differences in iov_iter behavior between the MSG_SPLICE_PAGES and
"regular" copy paths:
(MSG_SPLICE_PAGES)
skb_splice_from_iter
  iov_iter_extract_pages
    iov_iter_extract_bvec_pages
      uses i->nr_segs to correctly stop in its tracks before OoB'ing everywhere
  skb_splice_from_iter gets a "short" read
(!MSG_SPLICE_PAGES)
skb_copy_to_page_nocache copy=iov_iter_count
  [...]
    copy_from_iter
      /* this doesn't help */
      if (unlikely(iter->count < len))
              len = iter->count;
      iterate_bvec
        ... and we run off the bvecs
Fix this by properly setting the iov_iter's byte count, plus sending the
correct byte count to tcp_sendmsg_locked.
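The per-page accounting can be sketched in user space as follows. This is an illustration only: SKETCH_PAGE_SIZE, send_chunk_fn, send_pages() and record_chunk() are hypothetical, and the callback stands in for setting up the iov_iter and calling tcp_sendmsg_locked(). The point is that each chunk is handed its own length rather than the size of the whole request.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for illustration */

/*
 * Hypothetical send callback: receives the in-page offset and byte count
 * of one chunk and returns how many bytes it accepted.
 */
typedef size_t (*send_chunk_fn)(size_t offset, size_t bytes);

/*
 * Walk a buffer that starts at `offset` within its first page and spans
 * `size` bytes, passing each chunk its own length, never the total.
 */
static size_t send_pages(size_t offset, size_t size, send_chunk_fn send)
{
    size_t sent = 0;

    assert(offset < SKETCH_PAGE_SIZE);

    while (size > 0) {
        size_t room = SKETCH_PAGE_SIZE - offset;
        size_t bytes = size < room ? size : room;

        sent += send(offset, bytes); /* pass bytes, not size */
        size -= bytes;
        offset = 0; /* later pages start at their beginning */
    }
    return sent;
}

/* Demonstration callback that remembers the largest chunk it was given. */
static size_t largest_chunk;

static size_t record_chunk(size_t offset, size_t bytes)
{
    (void)offset;
    if (bytes > largest_chunk)
        largest_chunk = bytes;
    return bytes;
}
```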
Link: https://patch.msgid.link/r/20250729120348.495568-1-pfalcato@suse.de
Cc: stable@vger.kernel.org
Fixes: c2ff29e99a76 ("siw: Inline do_tcp_sendpages()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Pedro Falcato <pfalcato@suse.de>
Acked-by: Bernard Metzler <bernard.metzler@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
NeilBrown [Fri, 18 Jul 2025 01:26:14 +0000 (11:26 +1000)]
nfsd: avoid ref leak in nfsd_open_local_fh()
If two calls to nfsd_open_local_fh() race and both successfully call
nfsd_file_acquire_local(), they will both get an extra reference to the
net to accompany the file reference stored in *pnf.
One of them will fail to store (using xchg()) the file reference in
*pnf and will drop that reference but WON'T drop the accompanying
reference to the net. This leak means that when the nfs server is shut
down it will hang in nfsd_shutdown_net() waiting for
&nn->nfsd_net_free_done.
This patch adds the missing nfsd_net_put().
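The race can be modelled in user space roughly as follows. This is a sketch under assumptions, not the nfsd code: the two counters stand in for the file and net references, install() is a hypothetical helper, and a C11 compare-and-exchange is used as a stand-in for the kernel primitive mentioned above. The loser of the race drops both references it took.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical counters standing in for the file and net references. */
static int file_refs;
static int net_refs;

/* Shared slot that caches the installed file, initially empty. */
static _Atomic(int *) slot;

/*
 * Acquire a file reference plus its accompanying net reference, then try
 * to publish `mine` in the shared slot. If another caller got there
 * first, drop BOTH references and return the already installed pointer.
 */
static int *install(int *mine)
{
    int *expected = NULL;

    file_refs++;
    net_refs++;

    if (atomic_compare_exchange_strong(&slot, &expected, mine))
        return mine;

    file_refs--; /* drop the file reference ... */
    net_refs--;  /* ... and the net reference that came with it */
    return expected;
}
```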
Reported-by: Mike Snitzer <snitzer@kernel.org>
Fixes: e6f7e1487ab5 ("nfs_localio: simplify interface to nfsd for getting nfsd_file")
Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neil@brown.name>
Tested-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jeff Layton [Wed, 16 Jul 2025 13:34:29 +0000 (09:34 -0400)]
nfsd: don't set the ctime on delegated atime updates
Clients will typically precede a DELEGRETURN for a delegation with
delegated timestamps with a SETATTR to set the timestamps on the server
to match what the client has.
knfsd implements this by using the nfsd_setattr() infrastructure, which
will set ATTR_CTIME on any update that goes to notify_change(). This is
problematic as it means that the client will get a spurious ctime
update when updating the atime.
POSIX unfortunately doesn't phrase it succinctly, but updating the atime
due to reads should not update the ctime. In this case, the client is
sending a SETATTR to update the atime on the server to match its latest
value. The ctime should not be advanced in this case as that would
incorrectly indicate a change to the inode.
Fix this by not implicitly setting ATTR_CTIME when ATTR_DELEG is set in
__nfsd_setattr(). The decoder for FATTR4_WORD2_TIME_DELEG_MODIFY already
sets ATTR_CTIME, so this is sufficient to make it skip setting the ctime
on atime-only updates.
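The flag logic described above can be sketched as a small pure function. The SK_ATTR_* bit values and prepare_attrs() are hypothetical and exist only for this illustration; they are not the kernel's ATTR_* definitions or the body of __nfsd_setattr().

```c
#include <assert.h>

/* Hypothetical bit values for illustration only. */
#define SK_ATTR_ATIME (1u << 0)
#define SK_ATTR_MTIME (1u << 1)
#define SK_ATTR_CTIME (1u << 2)
#define SK_ATTR_DELEG (1u << 3)

/*
 * Add the implicit ctime update only for non-delegated updates. For
 * delegated updates the ctime bit is kept only if the caller (here: the
 * decoder for a delegated mtime change) already set it.
 */
static unsigned int prepare_attrs(unsigned int ia_valid)
{
    if (!(ia_valid & SK_ATTR_DELEG))
        ia_valid |= SK_ATTR_CTIME;
    return ia_valid;
}
```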
Fixes: 7e13f4f8d27d ("nfsd: handle delegated timestamps in SETATTR")
Cc: stable@vger.kernel.org
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Linus Torvalds [Tue, 5 Aug 2025 13:55:03 +0000 (16:55 +0300)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rmk/linux
Pull ARM update from Russell King:
"Just one development update this time:
- Finish removing Coresight support"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9449/1: coresight: Finish removal of Coresight support in arch/arm/kernel
Linus Torvalds [Tue, 5 Aug 2025 13:37:05 +0000 (16:37 +0300)]
Merge tag 'exfat-for-6.17-rc1' of git://git./linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- Use generic_write_sync instead of vfs_fsync_range in exfat_file_write_iter.
It will fix an issue where fdatasync would be set incorrectly.
- Fix potential infinite loop by the self-linked chain.
* tag 'exfat-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: add cluster chain loop check for dir
exfat: fdatasync flag should be same like generic_write_sync()
Linus Torvalds [Tue, 5 Aug 2025 13:02:07 +0000 (16:02 +0300)]
Merge tag 'mm-stable-2025-08-03-12-35' of git://git./linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
"Significant patch series in this pull request:
- "mseal cleanups" (Lorenzo Stoakes)
Some mseal cleaning with no intended functional change.
- "Optimizations for khugepaged" (David Hildenbrand)
Improve khugepaged throughput by batching PTE operations for large
folios. This gain is mainly for arm64.
- "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes" (Mike Rapoport)
A bugfix, additional debug code and cleanups to the execmem code.
- "mm/shmem, swap: bugfix and improvement of mTHP swap in" (Kairui Song)
Bugfixes, cleanups and performance improvements to the mTHP swapin
code"
* tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (38 commits)
mm: mempool: fix crash in mempool_free() for zero-minimum pools
mm: correct type for vmalloc vm_flags fields
mm/shmem, swap: fix major fault counting
mm/shmem, swap: rework swap entry and index calculation for large swapin
mm/shmem, swap: simplify swapin path and result handling
mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO
mm/shmem, swap: tidy up swap entry splitting
mm/shmem, swap: tidy up THP swapin checks
mm/shmem, swap: avoid redundant Xarray lookup during swapin
x86/ftrace: enable EXECMEM_ROX_CACHE for ftrace allocations
x86/kprobes: enable EXECMEM_ROX_CACHE for kprobes allocations
execmem: drop writable parameter from execmem_fill_trapping_insns()
execmem: add fallback for failures in vmalloc(VM_ALLOW_HUGE_VMAP)
execmem: move execmem_force_rw() and execmem_restore_rox() before use
execmem: rework execmem_cache_free()
execmem: introduce execmem_alloc_rw()
execmem: drop unused execmem_update_copy()
mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
mm/rmap: add anon_vma lifetime debug check
mm: remove mm/io-mapping.c
...
Sumanth Korikkar [Mon, 4 Aug 2025 09:57:03 +0000 (11:57 +0200)]
s390/mm: Allocate page table with PAGE_SIZE granularity
Make vmem_pte_alloc() consistent by always allocating page tables with
PAGE_SIZE granularity, regardless of whether page_table_alloc() (with
slab) or memblock_alloc() is used. This ensures that a page table can be
fully freed when the corresponding page table entries are removed.
Fixes: d08d4e7cd6bf ("s390/mm: use full 4KB page for 2KB PTE")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Wentao Guan [Tue, 5 Aug 2025 11:00:22 +0000 (19:00 +0800)]
LoongArch: vDSO: Remove -nostdlib complier flag
Since $(LD) is used directly, -nostdlib is unneeded. MIPS has already
removed it, and we should remove it too:
bdbf2038fbf4 ("MIPS: VDSO: remove -nostdlib compiler flag").
In fact, other architectures also use $(LD) now.
fe00e50b2db8 ("ARM: 8858/1: vdso: use $(LD) instead of $(CC) to link VDSO")
691efbedc60d ("arm64: vdso: use $(LD) instead of $(CC) to link VDSO")
2ff906994b6c ("MIPS: VDSO: Use $(LD) instead of $(CC) to link VDSO")
2b2a25845d53 ("s390/vdso: Use $(LD) instead of $(CC) to link vDSO")
Cc: stable@vger.kernel.org
Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Binbin Zhou [Tue, 5 Aug 2025 11:00:22 +0000 (19:00 +0800)]
LoongArch: dts: Add eMMC/SDIO controller support to Loongson-2K2000
The Loongson-2K2000 integrates one eMMC controller and one SDIO controller.
The module is supported now, enable it.
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Binbin Zhou [Tue, 5 Aug 2025 11:00:22 +0000 (19:00 +0800)]
LoongArch: dts: Add SDIO controller support to Loongson-2K1000
The Loongson-2K1000 integrates one SDIO controller for SD storage cards
and SDIO cards.
The module is supported now, enable it.
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Binbin Zhou [Tue, 5 Aug 2025 11:00:22 +0000 (19:00 +0800)]
LoongArch: dts: Add SDIO controller support to Loongson-2K0500
The Loongson-2K0500 integrates two SDIO controllers for SD storage cards
and SDIO cards, supporting SD storage card boot.
The module is supported now, enable it.
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Tiezhu Yang [Tue, 5 Aug 2025 11:00:22 +0000 (19:00 +0800)]
LoongArch: BPF: Set bpf_jit_bypass_spec_v1/v4()
JITs can set bpf_jit_bypass_spec_v1/v4() if they want the verifier to
skip analysis/patching for the respective vulnerability. It is safe to
set both bpf_jit_bypass_spec_v1/v4(), because there is no speculation
barrier instruction for LoongArch.
Suggested-by: Luis Gerhorst <luis.gerhorst@fau.de>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Haoran Jiang [Tue, 5 Aug 2025 11:00:22 +0000 (19:00 +0800)]
LoongArch: BPF: Fix the tailcall hierarchy
In specific use cases combining tailcalls and BPF-to-BPF calls,
MAX_TAIL_CALL_CNT won't work because of missing tail_call_cnt
back-propagation from callee to caller. This patch fixes this
tailcall issue, caused by abusing tailcalls in the bpf2bpf feature
on LoongArch, in the same way as "bpf, x64: Fix tailcall hierarchy".
Push tail_call_cnt_ptr and tail_call_cnt onto the stack.
tail_call_cnt_ptr is passed between tailcall and bpf2bpf, and
tail_call_cnt is incremented through tail_call_cnt_ptr.
Fixes: bb035ef0cc91 ("LoongArch: BPF: Support mixing bpf2bpf and tailcalls")
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Haoran Jiang [Tue, 5 Aug 2025 11:00:22 +0000 (19:00 +0800)]
LoongArch: BPF: Fix jump offset calculation in tailcall
The extra pass of bpf_int_jit_compile() skips JIT context initialization,
which essentially skips offset calculation and leaves out_offset = -1. As a
result, the jmp_offset in emit_bpf_tail_call(), which is calculated by
"#define jmp_offset (out_offset - (cur_offset))",
is a negative number, which is wrong. The final generated assembly is as
follows.
54: bgeu $a2, $t1, -8 # 0x0000004c
58: addi.d $a6, $s5, -1
5c: bltz $a6, -16 # 0x0000004c
60: alsl.d $t2, $a2, $a1, 0x3
64: ld.d $t2, $t2, 264
68: beq $t2, $zero, -28 # 0x0000004c
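The arithmetic behind the backward branches in the listing above can be sketched as a plain function mirroring the quoted macro. The function name and the sample offsets are hypothetical and chosen only to show the sign of the result; they are not values taken from the JIT.

```c
#include <assert.h>

/*
 * Mirrors "#define jmp_offset (out_offset - (cur_offset))" as a function.
 * A positive result is a forward branch, a negative one a backward branch.
 */
static int jmp_offset(int out_offset, int cur_offset)
{
    return out_offset - cur_offset;
}
```

With out_offset left at -1, the result is negative for every non-negative cur_offset, so each branch that was meant to jump forward to the exit label instead jumps backward.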
Before applying this patch, the following test case reveals soft lockup issues.
cd tools/testing/selftests/bpf/
./test_progs --allow=tailcalls/tailcall_bpf2bpf_1
dmesg:
watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [test_progs:25056]
Cc: stable@vger.kernel.org
Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support")
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Haoran Jiang <jianghaoran@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Tiezhu Yang [Tue, 5 Aug 2025 11:00:22 +0000 (19:00 +0800)]
LoongArch: BPF: Add struct ops support for trampoline
Use BPF_TRAMP_F_INDIRECT flag to detect struct ops and emit proper
prologue and epilogue for this case.
With this patch, all of the struct_ops related testcases (except
struct_ops_multi_pages) passed on LoongArch.
The testcase struct_ops_multi_pages fails because the actual
image_pages_cnt is 40, which is bigger than MAX_TRAMP_IMAGE_PAGES.
Before:
$ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
...
WATCHDOG: test case struct_ops_module/struct_ops_load executes for 10 seconds...
After:
$ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
...
#15 bad_struct_ops:OK
...
#399 struct_ops_autocreate:OK
...
#400 struct_ops_kptr_return:OK
...
#401 struct_ops_maybe_null:OK
...
#402 struct_ops_module:OK
...
#404 struct_ops_no_cfi:OK
...
#405 struct_ops_private_stack:SKIP
...
#406 struct_ops_refcounted:OK
Summary: 8/25 PASSED, 3 SKIPPED, 0 FAILED
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Chenghao Duan [Tue, 5 Aug 2025 11:00:18 +0000 (19:00 +0800)]
LoongArch: BPF: Add basic bpf trampoline support
BPF trampoline is the critical infrastructure of the BPF subsystem,
acting as a mediator between kernel functions and BPF programs. Numerous
important features, such as using BPF program for zero overhead kernel
introspection, rely on this key component.
The related tests have passed, including the following technical points:
1. fentry
2. fmod_ret
3. fexit
The following related testcases passed on LoongArch:
sudo ./test_progs -a fentry_test/fentry
sudo ./test_progs -a fexit_test/fexit
sudo ./test_progs -a fentry_fexit
sudo ./test_progs -a modify_return
sudo ./test_progs -a fexit_sleep
sudo ./test_progs -a test_overhead
sudo ./test_progs -a trampoline_count
This issue was first reported by Geliang Tang in June 2024 while
debugging MPTCP BPF selftests on a LoongArch machine (see commit
eef0532e900c "selftests/bpf: Null checks for links in bpf_tcp_ca").
Geliang, Huacai, and Tiezhu then worked together to drive the
implementation of this feature, encouraging broader collaboration among
Chinese kernel engineers.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/
Reported-by: Geliang Tang <geliang@kernel.org>
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Chenghao Duan [Tue, 5 Aug 2025 11:00:18 +0000 (19:00 +0800)]
LoongArch: BPF: Add dynamic code modification support
This commit adds support for BPF dynamic code modification on the
LoongArch architecture:
1. Add bpf_arch_text_copy() for instruction block copying.
2. Add bpf_arch_text_poke() for runtime instruction patching.
3. Add bpf_arch_text_invalidate() for code invalidation.
On LoongArch, since symbol addresses in the direct mapping region can't
be reached via relative jump instructions from the paged mapping region,
we use the move_imm+jirl instruction pair as absolute jump instructions.
These require 2-5 instructions, so we reserve 5 NOP instructions in the
program as placeholders for function jumps.
The larch_insn_text_copy() function is used solely for BPF, and its use
requires PAGE_SIZE alignment. Currently, only the size of the BPF
trampoline is page-aligned.
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Chenghao Duan [Tue, 5 Aug 2025 11:00:18 +0000 (19:00 +0800)]
LoongArch: BPF: Rename and refactor validate_code()
1. Rename the existing validate_code() to validate_ctx()
2. Factor out the code validation handling into a new helper
validate_code()
Then:
* validate_code() is used to check the validity of code.
* validate_ctx() is used to check both code validity and table entry
correctness.
The new validate_code() will be used in subsequent changes.
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Dmitry Baryshkov [Fri, 1 Aug 2025 10:46:41 +0000 (13:46 +0300)]
drm/bridge: document HDMI CEC callbacks
Provide documentation for the drm_bridge callbacks related to the
DRM_BRIDGE_OP_HDMI_CEC_ADAPTER flag.
Fixes: a74288c8ded7 ("drm/display: bridge-connector: handle CEC adapters")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/r/20250611140933.1429a1b8@canb.auug.org.au
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250801-drm-hdmi-cec-docs-v1-1-be63e6008d0e@oss.qualcomm.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Jason Wang [Tue, 29 Jul 2025 07:39:16 +0000 (15:39 +0800)]
vhost: initialize vq->nheads properly
Commit 7918bb2d19c9 ("vhost: basic in order support") introduces
vq->nheads to store the number of batched used buffers per used elem,
but it forgets to initialize vq->nheads to NULL in vhost_dev_init().
As a result, kfree() will try to free vq->nheads without it having
been allocated if SET_OWNER is not called.
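A user-space sketch of why the NULL initialization matters, assuming a hypothetical struct vq_sketch with dev_init(), set_owner() and dev_cleanup() helpers (none of these are the vhost functions): free(), like kfree(), is a no-op on NULL, so cleanup is safe even when the allocation step never ran.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical virtqueue stand-in; the element type is an assumption. */
struct vq_sketch {
    unsigned short *nheads;
};

/* Initialization: make the pointer well-defined before any allocation. */
static void dev_init(struct vq_sketch *vq)
{
    vq->nheads = NULL;
}

/* Optional allocation step that may never be called. */
static int set_owner(struct vq_sketch *vq, size_t n)
{
    vq->nheads = calloc(n, sizeof(*vq->nheads));
    return vq->nheads ? 0 : -1;
}

/* Cleanup: safe whether or not set_owner() ran, since free(NULL) is a no-op. */
static void dev_cleanup(struct vq_sketch *vq)
{
    free(vq->nheads);
    vq->nheads = NULL;
}
```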
Reported-by: JAEHOON KIM <jhkim@linux.ibm.com>
Reported-by: Breno Leitao <leitao@debian.org>
Fixes: 45347e79b544 ("vhost: basic in order support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20250729073916.80647-1-jasowang@redhat.com>
Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Tested-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Jaehoon Kim <jhkim@linux.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Lorenzo Pieralisi [Fri, 1 Aug 2025 07:58:20 +0000 (09:58 +0200)]
irqchip/gic-v5: Remove IRQD_RESEND_WHEN_IN_PROGRESS for ITS IRQs
GICv5 LPI interrupts have an active state hence they cannot retrigger
while the interrupt is being handled.
Therefore, setting the IRQD_RESEND_WHEN_IN_PROGRESS flag on LPIs is
pointless, as the situation this flag caters for cannot happen.
Remove it.
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20250801-gic-v5-fixes-6-17-v1-3-4fcedaccf9e6@kernel.org
Lorenzo Pieralisi [Fri, 1 Aug 2025 07:58:18 +0000 (09:58 +0200)]
irqchip/gic-v5: iwb: Fix iounmap probe failure path
The 0-day bot reported that on the failure path the driver iounmap()s IWB
resources that are managed through devm_ioremap(), which is clearly wrong
because the driver would end up unmapping the MMIO resource twice on
probing failure.
Fix this by removing the error path altogether and by letting devres manage
the iounmapping on clean-up.
Fixes: 695949d8b16f ("irqchip/gic-v5: Add GICv5 IWB support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20250801-gic-v5-fixes-6-17-v1-1-4fcedaccf9e6@kernel.org
Closes: https://lore.kernel.org/oe-kbuild-all/202508010038.N3r4ZmII-lkp@intel.com
Elad Nachman [Sun, 3 Aug 2025 10:25:48 +0000 (13:25 +0300)]
irqchip/mvebu-gicp: Clear pending interrupts on init
When a kexec'ed kernel boots up, there might be stale unhandled interrupts
pending in the interrupt controller. These are delivered as spurious
interrupts once the boot CPU enables interrupts.
Clear all pending interrupts when the driver is initialized to prevent
these spurious interrupts from locking the CPU in an endless loop.
Signed-off-by: Elad Nachman <enachman@marvell.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250803102548.669682-2-enachman@marvell.com
Lorenzo Pieralisi [Mon, 4 Aug 2025 14:55:53 +0000 (16:55 +0200)]
irqchip/msi-lib: Fix fwnode refcount in msi_lib_irq_domain_select()
Commit 8b65db1e93a2 ("irqchip/msi-lib: Add IRQ_DOMAIN_FLAG_FWNODE_PARENT
handling") added logic in msi_lib_irq_domain_select() to match the domain
fwnode against the fwnode parent of the fwspec.fwnode.
The fwnode_get_parent() caller must call fwnode_handle_put() on the
returned pointer value, lest fwnode refcounting for the parent ends up
being out of kilter.
Fix this by relying on the fwnode_handle clean-up handlers and by
incrementing the fwnode refcount regardless of whether parent matching is
used or not (the domain selection code already holds a reference before
calling msi_lib_irq_domain_select() but to make the exit path more uniform
if IRQ_DOMAIN_FLAG_FWNODE_PARENT is not set fwnode_handle_get() is called
again on fwspec.fwnode so that the clean-up code is the same for the two
matching patterns).
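The balanced get/put pattern described above can be sketched in plain C as
follows. This is a toy model only: struct toy_node and the toy_*() helpers
are hypothetical stand-ins, not the kernel's fwnode_handle API, and an
explicit put replaces the scope-based clean-up handler the fix relies on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a refcounted firmware node (not struct fwnode_handle). */
struct toy_node {
	int refs;
	struct toy_node *parent;
};

static struct toy_node *toy_get(struct toy_node *n)
{
	if (n)
		n->refs++;
	return n;
}

static void toy_put(struct toy_node *n)
{
	if (n)
		n->refs--;
}

/* Like fwnode_get_parent(): the returned parent carries a new reference. */
static struct toy_node *toy_get_parent(struct toy_node *n)
{
	return n ? toy_get(n->parent) : NULL;
}

/* Match either the node itself or its parent against the domain node. */
static bool toy_domain_select(struct toy_node *domain_node,
			      struct toy_node *spec_node, bool match_parent)
{
	/* Take a reference in both cases so the exit path is uniform. */
	struct toy_node *n = match_parent ? toy_get_parent(spec_node)
					  : toy_get(spec_node);
	bool match = (n == domain_node);

	toy_put(n);	/* one put balances whichever get was used */
	return match;
}
```

Taking the extra reference in the non-parent case costs one increment and
decrement, but lets both matching patterns share the same clean-up.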
Fixes: 8b65db1e93a2 ("irqchip/msi-lib: Add IRQ_DOMAIN_FLAG_FWNODE_PARENT handling")
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250804145553.795065-1-lpieralisi@kernel.org
Thomas Gleixner [Sat, 2 Aug 2025 10:59:13 +0000 (12:59 +0200)]
irqchip/riscv-imsic: Don't dereference before NULL pointer check
smatch warns about a dereference before check:
drivers/irqchip/irq-riscv-imsic-platform.c:317 imsic_irqdomain_init() warn: variable dereferenced before check 'imsic' (see line 311)
Cure it by moving the firmware node assignment after the checks.
Fixes: 59422904dd98 ("irqchip/riscv-imsic: Convert to msi_create_parent_irq_domain() helper")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Closes: https://lore.kernel.org/r/202507311953.NFVZkr0a-lkp@intel.com/
---
drivers/irqchip/irq-riscv-imsic-platform.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Paulo Alcantara [Thu, 31 Jul 2025 23:46:43 +0000 (20:46 -0300)]
smb: client: fix creating symlinks under POSIX mounts
SMB3.1.1 POSIX mounts support native symlinks that are created with
IO_REPARSE_TAG_SYMLINK reparse points, so skip the checking of
FILE_SUPPORTS_REPARSE_POINTS as some servers might not have it set.
Cc: linux-cifs@vger.kernel.org
Cc: Ralph Boehme <slow@samba.org>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Reported-by: Matthew Richardson <m.richardson@ed.ac.uk>
Closes: https://marc.info/?i=1124e7cd-6a46-40a6-9f44-b7664a66654b@ed.ac.uk
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Thu, 31 Jul 2025 23:46:42 +0000 (20:46 -0300)]
smb: client: default to nonativesocket under POSIX mounts
SMB3.1.1 POSIX mounts require sockets to be created with NFS reparse
points.
Cc: linux-cifs@vger.kernel.org
Cc: Ralph Boehme <slow@samba.org>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Reported-by: Matthew Richardson <m.richardson@ed.ac.uk>
Closes: https://marc.info/?i=1124e7cd-6a46-40a6-9f44-b7664a66654b@ed.ac.uk
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Geert Uytterhoeven [Sat, 2 Aug 2025 15:53:02 +0000 (17:53 +0200)]
dpll: zl3073x: ZL3073X_I2C and ZL3073X_SPI should depend on NET
When making ZL3073X invisible, it was overlooked that ZL3073X depends on
NET, while ZL3073X_I2C and ZL3073X_SPI do not, causing:
WARNING: unmet direct dependencies detected for ZL3073X when selected by ZL3073X_I2C
WARNING: unmet direct dependencies detected for ZL3073X when selected by ZL3073X_SPI
WARNING: unmet direct dependencies detected for ZL3073X
Depends on [n]: NET [=n]
Selected by [y]:
- ZL3073X_I2C [=y] && I2C [=y]
Selected by [y]:
- ZL3073X_SPI [=y] && SPI [=y]
Fix this by adding the missing dependencies to ZL3073X_I2C and
ZL3073X_SPI.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202508022110.nTqZ5Ylu-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202508022351.NHIxPF8j-lkp@intel.com/
Fixes: a4f0866e3dbbf3fe ("dpll: Make ZL3073X invisible")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250802155302.3673457-1-geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maher Azzouzi [Sat, 2 Aug 2025 00:18:57 +0000 (17:18 -0700)]
net/sched: mqprio: fix stack out-of-bounds write in tc entry parsing
TCA_MQPRIO_TC_ENTRY_INDEX is validated using
NLA_POLICY_MAX(NLA_U32, TC_QOPT_MAX_QUEUE), which allows the value
TC_QOPT_MAX_QUEUE (16). This leads to a 4-byte out-of-bounds stack
write in the fp[] array, which only has room for 16 elements (0–15).
Fix this by changing the policy to allow only up to TC_QOPT_MAX_QUEUE - 1.
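The off-by-one can be illustrated with a small stand-alone sketch. The
struct and helper names below are hypothetical; only TC_QOPT_MAX_QUEUE (16)
and the inclusive nature of the maximum check come from the description
above.

```c
#include <stdbool.h>
#include <stdint.h>

#define TC_QOPT_MAX_QUEUE 16

/* Hypothetical stand-in for the parsed table; fp[] has slots 0..15. */
struct toy_tc_table {
	uint32_t fp[TC_QOPT_MAX_QUEUE];
};

/* Inclusive upper bound, as a "max"-style attribute policy applies it. */
static bool index_in_policy(uint32_t index, uint32_t max_inclusive)
{
	return index <= max_inclusive;
}

/* Store a value only if the index passes the corrected bound. */
static bool toy_set_fp(struct toy_tc_table *t, uint32_t index, uint32_t val)
{
	if (!index_in_policy(index, TC_QOPT_MAX_QUEUE - 1))
		return false;	/* index 16 is rejected here */
	t->fp[index] = val;
	return true;
}
```

With the old bound of TC_QOPT_MAX_QUEUE, index 16 would pass the check and
the store would land one element past the end of fp[].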
Fixes: f62af20bed2d ("net/sched: mqprio: allow per-TC user input of FP adminStatus")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Maher Azzouzi <maherazz04@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250802001857.2702497-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 1 Aug 2025 21:27:42 +0000 (14:27 -0700)]
Revert "net: mdio_bus: Use devm for getting reset GPIO"
This reverts commit 3b98c9352511db627b606477fc7944b2fa53a165.
Russell says:
Using devm_*() [here] is completely wrong, because this is called
from mdiobus_register_device(). This is not the probe function
for the device, and thus there is no code to trigger the release of
the resource on unregistration.
Moreover, when the mdiodev is eventually probed, if the driver fails
or the driver is unbound, the GPIO will be released, but a reference
will be left behind.
Using devm* with a struct device that is *not* currently being probed
is fundamentally wrong - an abuse of devm.
Reported-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/95449490-fa58-41d4-9493-c9213c1f2e7d@sirena.org.uk
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Fixes: 3b98c9352511 ("net: mdio_bus: Use devm for getting reset GPIO")
Link: https://patch.msgid.link/20250801212742.2607149-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 1 Aug 2025 18:16:38 +0000 (11:16 -0700)]
selftests: net: packetdrill: xfail all problems on slow machines
We keep seeing flakes on packetdrill on debug kernels, while
non-debug kernels are stable, not a single flake in 200 runs.
Time to give up, debug kernels appear to suffer from 10msec
latency spikes and any timing-sensitive test is bound to flake.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250801181638.2483531-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Quang Le [Fri, 1 Aug 2025 17:54:16 +0000 (13:54 -0400)]
net/packet: fix a race in packet_set_ring() and packet_notifier()
When packet_set_ring() releases po->bind_lock, another thread can
run packet_notifier() and process a NETDEV_UP event.
This race and the fix are both similar to that of commit 15fe076edea7
("net/packet: fix a race in packet_bind() and packet_notifier()").
There too the packet_notifier NETDEV_UP event managed to run while a
po->bind_lock critical section had to be temporarily released. And
the fix was similarly to temporarily set po->num to zero to keep
the socket unhooked until the lock is retaken.
The use of po->bind_lock in packet_set_ring and packet_notifier precedes
the introduction of git history.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Quang Le <quanglex97@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250801175423.2970334-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michal Schmidt [Fri, 1 Aug 2025 10:13:37 +0000 (12:13 +0200)]
benet: fix BUG when creating VFs
benet crashes as soon as SRIOV VFs are created:
kernel BUG at mm/vmalloc.c:3457!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
CPU: 4 UID: 0 PID: 7408 Comm: test.sh Kdump: loaded Not tainted 6.16.0+ #1 PREEMPT(voluntary)
[...]
RIP: 0010:vunmap+0x5f/0x70
[...]
Call Trace:
<TASK>
__iommu_dma_free+0xe8/0x1c0
be_cmd_set_mac_list+0x3fe/0x640 [be2net]
be_cmd_set_mac+0xaf/0x110 [be2net]
be_vf_eth_addr_config+0x19f/0x330 [be2net]
be_vf_setup+0x4f7/0x990 [be2net]
be_pci_sriov_configure+0x3a1/0x470 [be2net]
sriov_numvfs_store+0x20b/0x380
kernfs_fop_write_iter+0x354/0x530
vfs_write+0x9b9/0xf60
ksys_write+0xf3/0x1d0
do_syscall_64+0x8c/0x3d0
be_cmd_set_mac_list() calls dma_free_coherent() under a spin_lock_bh.
Fix it by freeing only after the lock has been released.
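The reordering can be sketched in user-space C as below. Everything here is
a hypothetical model: struct toy_lock stands in for the spinlock,
toy_sleeping_free() for dma_free_coherent(), and the counter merely records
whether a free ever happened while the lock was held.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy lock that only tracks whether it is currently held. */
struct toy_lock {
	bool held;
};

static int frees_under_lock;	/* counts frees done with the lock held */

/* Stand-in for a free routine that must not run in atomic context. */
static void toy_sleeping_free(struct toy_lock *l, void *buf)
{
	if (l->held)
		frees_under_lock++;
	free(buf);
}

/* Issue a command under the lock, then free the buffer after unlocking. */
static int toy_send_cmd(struct toy_lock *l)
{
	void *buf = malloc(64);
	int status;

	if (!buf)
		return -1;

	l->held = true;		/* lock */
	status = 0;		/* the command would be issued here */
	l->held = false;	/* unlock */

	toy_sleeping_free(l, buf);	/* safe: lock already released */
	return status;
}
```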
Fixes: 1a82d19ca2d6 ("be2net: fix sleeping while atomic bugs in be_ndo_bridge_getlink")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250801101338.72502-1-mschmidt@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lorenzo Bianconi [Fri, 1 Aug 2025 07:12:25 +0000 (09:12 +0200)]
net: airoha: npu: Add missing MODULE_FIRMWARE macros
Introduce missing MODULE_FIRMWARE definitions for firmware autoload.
Fixes: 23290c7bc190d ("net: airoha: Introduce Airoha NPU support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250801-airoha-npu-missing-module-firmware-v2-1-e860c824d515@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 1 Aug 2025 01:13:35 +0000 (18:13 -0700)]
net: devmem: fix DMA direction on unmapping
Looks like we always unmap the DMA_BUF with DMA_FROM_DEVICE direction.
While at it unexport __net_devmem_dmabuf_binding_free(), it's internal.
Found by code inspection.
Fixes: bd61848900bf ("net: devmem: Implement TX path")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250801011335.2267515-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arnd Bergmann [Thu, 31 Jul 2025 08:00:20 +0000 (10:00 +0200)]
ipa: fix compile-testing with qcom-mdt=m
There are multiple drivers that use the qualcomm mdt loader, but they
have conflicting ideas of how to deal with that dependency when compile-testing
for non-qualcomm targets:
IPA only enables the MDT loader when the kernel config includes ARCH_QCOM,
but the newly added ath12k support always enables it, which leads to a
link failure with the combination of IPA=y and ATH12K=m:
aarch64-linux-ld: drivers/net/ipa/ipa_main.o: in function `ipa_firmware_load':
ipa_main.c:(.text.unlikely+0x134): undefined reference to `qcom_mdt_load'
The ATH12K method seems more reliable here, so change IPA over to do the same
thing.
Fixes: 38a4066f593c ("net: ipa: support COMPILE_TEST")
Fixes: c0dd3f4f7091 ("wifi: ath12k: enable ath12k AHB support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20250731080024.2054904-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 28 Jul 2025 16:31:29 +0000 (09:31 -0700)]
eth: fbnic: unlink NAPIs from queues on error to open
CI hit a UaF in fbnic in the AF_XDP portion of the queues.py test.
The UaF is in the __sk_mark_napi_id_once() call in xsk_bind(),
NAPI has been freed. Looks like the device failed to open earlier,
and we lack clearing the NAPI pointer from the queue.
Fixes: 557d02238e05 ("eth: fbnic: centralize the queue count and NAPI<>queue setting")
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250728163129.117360-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Mon, 4 Aug 2025 23:37:29 +0000 (16:37 -0700)]
Merge tag 'i2c-for-6.17-rc1-part2' of git://git./linux/kernel/git/wsa/linux
Pull more i2c updates from Wolfram Sang:
"A few more patches from I2C. Some are fixes which would be nice to
have in rc1 already, some patches have nearly fallen through the
cracks, some just needed a bit more testing.
- acpi: enable 100kHz workaround for DLL0945
- apple: add support for Apple A7–A11, T2 chips; Kconfig update
- mux: mule: fix error handling path
- qcom-geni: fix controller frequency mapping
- stm32f7: add DMA-safe transfer support
- tegra: use controller reset if device reset is missing
- tegra: remove unnecessary dma_sync*() calls"
* tag 'i2c-for-6.17-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: muxes: mule: Fix an error handling path in mule_i2c_mux_probe()
i2c: Force DLL0945 touchpad i2c freq to 100khz
i2c: apple: Drop default ARCH_APPLE in Kconfig
i2c: qcom-geni: fix I2C frequency table to achieve accurate bus rates
dt-bindings: i2c: apple,i2c: Document Apple A7-A11, T2 compatibles
i2c: tegra: Remove dma_sync_*() calls
i2c: tegra: Use internal reset when reset property is not available
i2c: stm32f7: support i2c_*_dma_safe_msg_buf APIs
Linus Torvalds [Mon, 4 Aug 2025 23:27:21 +0000 (16:27 -0700)]
Merge tag 'f2fs-for-6.17-rc1' of git://git./linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"Three main updates: folio conversion by Matthew, switch to a new mount
API by Hongbo and Eric, and several sysfs entries to tune GCs for ZUFS
with finer granularity by Daeho.
There are also patches to address bugs and issues in the existing
features such as GCs, file pinning, write-while-dio-read, contiguous
block allocation, and memory access violations.
Enhancements:
- switch to new mount API and folio conversion
- add sysfs nodes to control F2FS GCs for ZUFS
- improve performance on the nat entry cache
- drop inode from the donation list when the last file is closed
- avoid splitting bio when reading multiple pages
Bug fixes:
- fix to trigger foreground gc during f2fs_map_blocks() in lfs mode
- make sure zoned device GC to use FG_GC in shortage of free section
- fix to calculate dirty data during has_not_enough_free_secs()
- fix to update upper_p in __get_secs_required() correctly
- wait for inflight dio completion, excluding pinned files read using dio
- don't break allocation when crossing contiguous sections
- vm_unmap_ram() may be called from an invalid context
- fix to avoid out-of-boundary access in dnode page
- fix to avoid panic in f2fs_evict_inode
- fix to avoid UAF in f2fs_sync_inode_meta()
- fix to use f2fs_is_valid_blkaddr_raw() in do_write_page()
- fix UAF of f2fs_inode_info in f2fs_free_dic
- fix to avoid invalid wait context issue
- fix bio memleak when committing super block
- handle nat.blkaddr corruption in f2fs_get_node_info()
In addition, there are also clean-ups and minor bug fixes"
* tag 'f2fs-for-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (109 commits)
f2fs: drop inode from the donation list when the last file is closed
f2fs: add gc_boost_gc_greedy sysfs node
f2fs: add gc_boost_gc_multiple sysfs node
f2fs: fix to trigger foreground gc during f2fs_map_blocks() in lfs mode
f2fs: fix to calculate dirty data during has_not_enough_free_secs()
f2fs: fix to update upper_p in __get_secs_required() correctly
f2fs: directly add newly allocated pre-dirty nat entry to dirty set list
f2fs: avoid redundant clean nat entry move in lru list
f2fs: zone: wait for inflight dio completion, excluding pinned files read using dio
f2fs: ignore valid ratio when free section count is low
f2fs: don't break allocation when crossing contiguous sections
f2fs: remove unnecessary tracepoint enabled check
f2fs: merge the two conditions to avoid code duplication
f2fs: vm_unmap_ram() may be called from an invalid context
f2fs: fix to avoid out-of-boundary access in dnode page
f2fs: switch to the new mount api
f2fs: introduce fs_context_operation structure
f2fs: separate the options parsing and options checking
f2fs: Add f2fs_fs_context to record the mount options
f2fs: Allow sbi to be NULL in f2fs_printk
...
Thomas Gleixner [Thu, 24 Jul 2025 10:49:30 +0000 (12:49 +0200)]
x86/irq: Plug vector setup race
Hogan reported a vector setup race, which overwrites the interrupt
descriptor in the per CPU vector array resulting in a dysfunctional device.
CPU0                                          CPU1
                                              interrupt is raised in APIC IRR
                                              but not handled
free_irq()
  per_cpu(vector_irq, CPU1)[vector] = VECTOR_SHUTDOWN;
request_irq()                                 common_interrupt()
                                                d = this_cpu_read(vector_irq[vector]);
  per_cpu(vector_irq, CPU1)[vector] = desc;
                                                if (d == VECTOR_SHUTDOWN)
                                                  this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
free_irq() cannot observe the pending vector in the CPU1 APIC as there is
no way to query the remote CPU's APIC IRR.
This requires that request_irq() uses the same vector/CPU as the one which
was freed, but this also can be triggered by a spurious interrupt.
Interestingly enough this problem managed to be hidden for more than a
decade.
Prevent this by reevaluating vector_irq under the vector lock, which is
held by the interrupt activation code when vector_irq is updated.
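The idea of reevaluating under the lock can be modelled single-threaded as
below. This is a sketch under stated assumptions, not the actual x86 code:
toy_resolve_vector(), the hook type and the marker object are hypothetical,
the lock is only indicated by comments, and the hook simulates request_irq()
on another CPU updating the slot between the lockless read and the
reevaluation.

```c
#include <stddef.h>

static int toy_shutdown_marker;
#define VECTOR_UNUSED   NULL
#define VECTOR_SHUTDOWN ((void *)&toy_shutdown_marker)

typedef void (*toy_race_hook)(void **slot);

static int toy_new_desc;	/* descriptor installed by the simulated racer */

/* Simulates request_irq() on another CPU reusing the vector. */
static void toy_install_new_desc(void **slot)
{
	*slot = &toy_new_desc;
}

/* Returns the descriptor to handle, or NULL if there is none. */
static void *toy_resolve_vector(void **slot, toy_race_hook concurrent_update)
{
	void *d = *slot;	/* lockless first read */

	if (d != VECTOR_SHUTDOWN && d != VECTOR_UNUSED)
		return d;

	if (concurrent_update)
		concurrent_update(slot);	/* the race window */

	/* the vector lock would be taken here */
	d = *slot;		/* reevaluate under the lock */
	if (d == VECTOR_SHUTDOWN) {
		*slot = VECTOR_UNUSED;
		d = NULL;
	}
	/* the vector lock would be released here */
	return d;
}
```

Writing VECTOR_UNUSED based on the stale first read would overwrite the
newly installed descriptor; the second read avoids that in this model.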
To avoid ifdeffery or IS_ENABLED() nonsense, move the
[un]lock_vector_lock() declarations out under the
CONFIG_IRQ_DOMAIN_HIERARCHY guard as it's only provided when
CONFIG_X86_LOCAL_APIC=y.
The current CONFIG_IRQ_DOMAIN_HIERARCHY guard is selected by
CONFIG_X86_LOCAL_APIC, but can also be selected by other parts of the
Kconfig system, which makes 32-bit UP builds with CONFIG_X86_LOCAL_APIC=n
fail.
Can we just get rid of this !APIC nonsense once and forever?
Fixes: 9345005f4eed ("x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs")
Reported-by: Hogan Wang <hogan.wang@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Hogan Wang <hogan.wang@huawei.com>
Link: https://lore.kernel.org/all/draft-87ikjhrhhh.ffs@tglx
Jesse.Zhang [Mon, 4 Aug 2025 00:43:15 +0000 (08:43 +0800)]
drm/amdgpu: Update SDMA firmware version check for user queue support
This commit fixes a firmware version check for enabling user queue
support in SDMA v7.0. The previous version check (7836028) was
incorrect and could lead to issues with PROTECTED_FENCE_SIGNAL
commands causing register conflicts between MCU_DBG0 and MCU_DBG1.
Fixes: 8c011408ed84 ("drm/amdgpu/sdma7: add ucode version checks for userq support")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 92e2449241516c95aab95eea91faecd0fa2b7ed5)
Cc: stable@vger.kernel.org
Lijo Lazar [Fri, 18 Jul 2025 03:55:21 +0000 (09:25 +0530)]
drm/amdgpu: Add NULL check for asic_funcs
If driver load fails too early, asic_funcs pointer remains unassigned.
Add NULL check to sanitize unwind path.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 582bf7c5158dce16f7dc5b8345b7876bd8031224)
Cc: stable@vger.kernel.org
Mario Limonciello [Mon, 21 Jul 2025 04:39:41 +0000 (23:39 -0500)]
drm/amd/display: Revert "drm/amd/display: Fix AMDGPU_MAX_BL_LEVEL value"
This reverts commit 66abb996999de0d440a02583a6e70c2c24deab45.
This broke custom brightness curves but it wasn't obvious because
of other related changes. Custom brightness curves are always
from a 0-255 input signal. The correct fix was to fix the default
value which was done by [1].
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4412
Link: https://lore.kernel.org/amd-gfx/0f094c4b-d2a3-42cd-824c-dc2858a5618d@kernel.org/T/#m69f875a7e69aa22df3370b3e3a9e69f4a61fdaf2
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6ec8a5cbec751625133461600d0d4950ffd3a214)
Cc: stable@vger.kernel.org
Siyang Liu [Fri, 4 Jul 2025 03:16:22 +0000 (11:16 +0800)]
drm/amd/display: fix a Null pointer dereference vulnerability
[Why]
A null pointer dereference vulnerability exists in the AMD display driver's
(DC module) cleanup function dc_destruct().
When display control context (dc->ctx) construction fails
(due to memory allocation failure), this pointer remains NULL.
During subsequent error handling when dc_destruct() is called,
there's no NULL check before dereferencing the perf_trace member
(dc->ctx->perf_trace), causing a kernel null pointer dereference crash.
[How]
Check if dc->ctx is non-NULL before dereferencing.
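A minimal sketch of the guarded teardown, assuming simplified stand-in
types: the toy_* structs below only mirror the dc->ctx->perf_trace chain
named above, and toy_dc_destruct() is hypothetical, not the driver's
dc_destruct().

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the structures named in the description. */
struct toy_perf_trace { int unused; };
struct toy_dc_context { struct toy_perf_trace *perf_trace; };
struct toy_dc { struct toy_dc_context *ctx; };

/* Tear down the context only if its construction succeeded earlier. */
static void toy_dc_destruct(struct toy_dc *dc)
{
	if (dc->ctx) {
		free(dc->ctx->perf_trace);	/* safe: ctx is known non-NULL */
		dc->ctx->perf_trace = NULL;
		free(dc->ctx);
		dc->ctx = NULL;
	}
}
```

Calling toy_dc_destruct() on an object whose ctx allocation failed is then
a no-op instead of a NULL dereference.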
Link: https://lore.kernel.org/r/tencent_54FF4252EDFB6533090A491A25EEF3EDBF06@qq.com
Co-developed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
(Updated commit text and removed unnecessary error message)
Signed-off-by: Siyang Liu <Security@tencent.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9dd8e2ba268c636c240a918e0a31e6feaee19404)
Cc: stable@vger.kernel.org
Michel Dänzer [Wed, 30 Jul 2025 08:09:02 +0000 (10:09 +0200)]
drm/amd/display: Add primary plane to commits for correct VRR handling
amdgpu_dm_commit_planes calls update_freesync_state_on_stream only for
the primary plane. If a commit affects a CRTC but not its primary plane,
it would previously not trigger a refresh cycle or affect LFC, violating
current UAPI semantics.
Fixes e.g. atomic commits affecting only the cursor plane being limited
to the minimum refresh rate.
Don't do this for the legacy cursor ioctls though, it would break the
UAPI semantics for those.
Suggested-by: Xaver Hugl <xaver.hugl@kde.org>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3034
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit cc7bfba95966251b254cb970c21627124da3b7f4)
Cc: stable@vger.kernel.org
Alex Deucher [Fri, 18 Jul 2025 19:53:21 +0000 (15:53 -0400)]
drm/amdgpu: update mmhub 3.3 client id mappings
Update the client id mapping so the correct clients
get printed when there is a mmhub page fault.
v2: fix typos spotted by David Wu.
v3: fix additional typo spotted by David.
Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e932f4779a2d329841bb9ca70bb80a4bb2d707b6)
Cc: stable@vger.kernel.org
Alex Deucher [Fri, 18 Jul 2025 19:52:04 +0000 (15:52 -0400)]
drm/amdgpu: update mmhub 3.0.1 client id mappings
Update the client id mapping so the correct clients
get printed when there is a mmhub page fault.
Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2a2681eda73b99a2c1ee8cdb006099ea5d0c2505)
Cc: stable@vger.kernel.org
YuanShang [Wed, 23 Jul 2025 08:44:49 +0000 (16:44 +0800)]
drm/amdgpu: Retain job->vm in amdgpu_job_prepare_job
The field job->vm is used in function amdgpu_job_run to get the page
table re-generation counter and decide whether the job should be skipped.
Specifically, function amdgpu_vm_generation checks if the VM is valid for this job to use.
For instance, if a gfx job depends on a cancelled sdma job from entity vm->delayed,
then the gfx job should be skipped.
Fixes: 26c95e838e63 ("drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepare")
Signed-off-by: YuanShang <YuanShang.Mao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ed76936c6b10b547c6df4ca75412331e9ef6d339)
Cc: stable@vger.kernel.org
Timur Kristóf [Tue, 22 Jul 2025 15:58:30 +0000 (17:58 +0200)]
drm/amd/display: Fix DCE 6.0 and 6.4 PLL programming.
Apparently, both DCE 6.0 and 6.4 have 3 PLLs, but PLL0 can only
be used for DP. Make sure to initialize the correct amount of PLLs
in DC for these DCE versions and use PLL0 only for DP.
Also, on DCE 6.0 and 6.4, the PLL0 needs to be powered on at
initialization as opposed to DCE 6.1 and 7.x which use a different
clock source for DFS.
The following functions were used as reference from the old
radeon driver implementation of DCE 6.x:
- radeon_atom_pick_pll
- atombios_crtc_set_disp_eng_pll
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 35222b5934ec8d762473592ece98659baf6bc48e)
Cc: stable@vger.kernel.org
Timur Kristóf [Tue, 22 Jul 2025 15:58:29 +0000 (17:58 +0200)]
drm/amd/display: Don't overwrite dce60_clk_mgr
dc_clk_mgr_create accidentally overwrites the dce60_clk_mgr
with the dce_clk_mgr, causing incorrect behaviour on DCE6.
Fix it by removing the extra dce_clk_mgr_construct.
Fixes: 62eab49faae7 ("drm/amd/display: hide VGH asic specific structs")
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bbddcbe36a686af03e91341b9bbfcca94bd45fb6)
Cc: stable@vger.kernel.org
David Yat Sin [Wed, 16 Jul 2025 22:04:28 +0000 (22:04 +0000)]
drm/amdkfd: Fix checkpoint-restore on multi-xcc
GPUs with multi-xcc have multiple MQDs per queue. This patch saves and
restores all the MQDs within the partition.
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a578f2a58c3ab38f0643b1b6e7534af860233cb1)
Cc: stable@vger.kernel.org
Mario Limonciello [Fri, 25 Jul 2025 03:12:22 +0000 (22:12 -0500)]
drm/amd: Restore cached manual clock settings during resume
If the SCLK limits have been set before S3 they will not
be restored. The limits are however cached in the driver and so
they can be restored by running a commit sequence during resume.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250725031222.3015095-3-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4e9526924d09057a9ba854305e17eded900ced82)
Cc: stable@vger.kernel.org
Mario Limonciello [Fri, 25 Jul 2025 03:12:21 +0000 (22:12 -0500)]
drm/amd: Restore cached power limit during resume
The power limit will be cached in smu->current_power_limit but
if the ASIC goes into S3 this value won't be restored.
Restore the value during SMU resume.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250725031222.3015095-2-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 26a609e053a6fc494403e95403bc6a2470383bec)
Cc: stable@vger.kernel.org
Lijo Lazar [Fri, 25 Jul 2025 04:51:10 +0000 (10:21 +0530)]
drm/amdgpu: Update external revid for GC v9.5.0
Use different external revid for GC v9.5.0 SOCs.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 21c6764ed4bfaecad034bc4fd15dd64c5a436325)
Cc: stable@vger.kernel.org
Lijo Lazar [Tue, 8 Jul 2025 07:47:18 +0000 (13:17 +0530)]
drm/amdgpu: Update supported modes for GC v9.5.0
For GC v9.5.0 SOCs, both CPX and QPX compute modes are also supported in
NPS2 mode.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9d1ac25c7f830e0132aa816393b1e9f140e71148)
Cc: stable@vger.kernel.org
Thomas Croft [Mon, 4 Aug 2025 15:12:07 +0000 (09:12 -0600)]
ALSA: hda/realtek: add LG gram 16Z90R-A to alc269 fixup table
Several months ago, Joshua Grisham submitted a patch [1]
for several ALC298 based sound cards.
The entry for the LG gram 16 in the alc269_fixup_tbl only matches the
Subsystem ID for the 16Z90R-Q and 16Z90R-K models [2]. My 16Z90R-A has a
different Subsystem ID [3]. I'm not sure why these IDs differ, but I
speculate it's due to the NVIDIA GPU included in the 16Z90R-A model that
isn't present in the other models.
I applied the patch to the latest Arch Linux kernel and the card was
initialized as expected.
[1]: https://lore.kernel.org/linux-sound/20240909193000.838815-1-josh@joshuagrisham.com/
[2]: https://linux-hardware.org/?id=pci:8086-51ca-1854-0488
[3]: https://linux-hardware.org/?id=pci:8086-51ca-1854-0489
Signed-off-by: Thomas Croft <thomasmcft@gmail.com>
Link: https://patch.msgid.link/20250804151457.134761-2-thomasmcft@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Linus Torvalds [Mon, 4 Aug 2025 17:54:36 +0000 (10:54 -0700)]
Merge tag 'printk-for-6.17' of git://git./linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Add new "hash_pointers=[auto|always|never]" boot parameter to force
the hashing even with "slab_debug" enabled
- Allow to stop CPU, after losing nbcon console ownership during
panic(), even without proper NMI
- Allow to use the printk kthread immediately even for the 1st
registered nbcon
- Compiler warning removal
* tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: nbcon: Allow reacquire during panic
printk: Allow to use the printk kthread immediately even for 1st nbcon
slab: Decouple slab_debug and no_hash_pointers
vsprintf: Use __diag macros to disable '-Wsuggest-attribute=format'
compiler-gcc.h: Introduce __diag_GCC_all
Peter Zijlstra [Tue, 15 Jul 2025 19:11:14 +0000 (15:11 -0400)]
sched/psi: Fix psi_seq initialization
With the seqcount moved out of the group into a global psi_seq,
re-initializing the seqcount on group creation is causing seqcount
corruption.
Fixes: 570c8efd5eb7 ("sched/psi: Optimize psi_group_change() cpu_clock() usage")
Reported-by: Chris Mason <clm@meta.com>
Suggested-by: Beata Michalska <beata.michalska@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>