linux-block.git
5 days agoMerge tag 'pm-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Thu, 17 Jul 2025 16:46:37 +0000 (09:46 -0700)]
Merge tag 'pm-6.16-rc7' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These address three issues introduced during the current development
  cycle and related to system suspend and hibernation, one triggering
  when asynchronous suspend of devices fails, one possibly affecting
  memory management in the core suspend code error path, and one due to
  duplicate filesystems freezing during system suspend:

   - Fix a deadlock that may occur on asynchronous device suspend
     failures due to missing completion updates in error paths (Rafael
     Wysocki)

   - Drop a misplaced pm_restore_gfp_mask() call, which may cause swap
     to be accessed too early if system suspend fails, from
     suspend_devices_and_enter() (Rafael Wysocki)

   - Remove duplicate filesystems_freeze/thaw() calls, which sometimes
     cause systems to be unable to resume, from enter_state() (Zihuan
     Zhang)"

* tag 'pm-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: Update power.completion for all devices on errors
  PM: suspend: clean up redundant filesystems_freeze/thaw() handling
  PM: suspend: Drop a misplaced pm_restore_gfp_mask() call

5 days agoMerge tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 16 Jul 2025 20:00:38 +0000 (13:00 -0700)]
Merge tag 'probes-fixes-v6.16-rc6' of git://git./linux/kernel/git/trace/linux-trace

Pull probes fix from Masami Hiramatsu:

 - fprobe-event: The @params variable was being used in an error path
   without being initialized. The fix to return an error code.

* tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/probes: Avoid using params uninitialized in parse_btf_arg()

6 days agotracing/probes: Avoid using params uninitialized in parse_btf_arg()
Nathan Chancellor [Wed, 16 Jul 2025 03:19:44 +0000 (20:19 -0700)]
tracing/probes: Avoid using params uninitialized in parse_btf_arg()

After a recent change in clang to strengthen uninitialized warnings [1],
it points out that in one of the error paths in parse_btf_arg(), params
is used uninitialized:

  kernel/trace/trace_probe.c:660:19: warning: variable 'params' is uninitialized when used here [-Wuninitialized]
    660 |                         return PTR_ERR(params);
        |                                        ^~~~~~

Match many other NO_BTF_ENTRY error cases and return -ENOENT, clearing
up the warning.

Link: https://lore.kernel.org/all/20250715-trace_probe-fix-const-uninit-warning-v1-1-98960f91dd04@kernel.org/
Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2110
Fixes: d157d7694460 ("tracing/probes: Support BTF field access from $retval")
Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cdee01472c7
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
7 days agoMerge tag 'soc-fixes-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Tue, 15 Jul 2025 16:26:33 +0000 (09:26 -0700)]
Merge tag 'soc-fixes-6.16-2' of git://git./linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
 "There are 18 devicetree fixes for three arm64 plaforms: Qualcomm
  Snapdragon, Rockchips and NXP i.MX. These get updated to more
  correctly describe the hardware, fixing issues with:

   - real-time clock on Snapdragon based laptops

   - SD card detection, PCI probing and HDMI/DDC communication on
     Rockchips

   - ethernet and SPI probing on certain i.MX based boards

   - a regression with the i.MX watchdog

  Aside from the devicetree fixes, there are two additional fixes for
  the merged ASPEED LPC snoop driver that saw some changes in 6.16, and
  one additional driver enabled in arm64 defconfig to fix CPU frequency
  scaling"

* tag 'soc-fixes-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (21 commits)
  arm64: dts: freescale: imx8mm-verdin: Keep LDO5 always on
  soc: aspeed: lpc-snoop: Don't disable channels that aren't enabled
  soc: aspeed: lpc-snoop: Cleanup resources in stack-order
  arm64: dts: imx95: Correct the DMA interrupter number of pcie0_ep
  arm64: dts: rockchip: Add missing fan-supply to rk3566-quartz64-a
  arm64: dts: rockchip: use cs-gpios for spi1 on ringneck
  arm64: dts: add big-endian property back into watchdog node
  arm64: dts: imx95-15x15-evk: fix the overshoot issue of NETC
  arm64: dts: imx95-19x19-evk: fix the overshoot issue of NETC
  arm64: dts: rockchip: list all CPU supplies on ArmSoM Sige5
  arm64: dts: imx8mp-venice-gw74xx: fix TPM SPI frequency
  arm64: dts: imx8mp-venice-gw73xx: fix TPM SPI frequency
  arm64: dts: imx8mp-venice-gw72xx: fix TPM SPI frequency
  arm64: dts: imx8mp-venice-gw71xx: fix TPM SPI frequency
  arm64: dts: qcom: x1e80100: describe uefi rtc offset
  arm64: dts: qcom: sc8280xp-x13s: describe uefi rtc offset
  arm64: defconfig: Enable Qualcomm CPUCP mailbox driver
  arm64: dts: rockchip: Add cd-gpios for sdcard detect on Cool Pi 4B
  arm64: dts: rockchip: Add cd-gpios for sdcard detect on Cool Pi CM5
  arm64: dts: rockchip: Adjust the HDMI DDC IO driver strength for rk3588
  ...

7 days agoMerge tag 'hid-for-linus-2025071501' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 15 Jul 2025 16:20:44 +0000 (09:20 -0700)]
Merge tag 'hid-for-linus-2025071501' of git://git./linux/kernel/git/hid/hid

Pull HID fixes from Benjamin Tissoires:

 - one warning cleanup introduced in the last PR (Andy Shevchenko)

 - a nasty syzbot buffer underflow fix co-debugged with Alan Stern
   (Benjamin Tissoires)

* tag 'hid-for-linus-2025071501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  selftests/hid: add a test case for the recent syzbot underflow
  HID: core: do not bypass hid_hw_raw_request
  HID: core: ensure __hid_request reserves the report ID as the first byte
  HID: core: ensure the allocated report buffer can contain the reserved report ID
  HID: debug: Remove duplicate entry (BTN_WHEEL)

7 days agoPM: sleep: Update power.completion for all devices on errors
Rafael J. Wysocki [Mon, 14 Jul 2025 17:45:31 +0000 (19:45 +0200)]
PM: sleep: Update power.completion for all devices on errors

After commit aa7a9275ab81 ("PM: sleep: Suspend async parents after
suspending children"), the following scenario is possible:

 1. Device A is async and it depends on device B that is sync.
 2. Async suspend is scheduled for A before the processing of B is
    started.
 3. A is waiting for B.
 4. In the meantime, an unrelated device fails to suspend and returns
    an error.
 5. The processing of B doesn't start at all and its power.completion is
    not updated.
 6. A is still waiting for B when async_synchronize_full() is called.
 7. Deadlock ensues.

To prevent this from happening, update power.completion for all devices
on errors in all suspend phases, but do not do it directly for devices
that are already being processed or are waiting for the processing to
start because in those cases it may be necessary to wait for the
processing to actually complete before updating power.completion for
the device.

Fixes: aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children")
Fixes: 443046d1ad66 ("PM: sleep: Make suspend of devices more asynchronous")
Closes: https://lore.kernel.org/linux-pm/e13740a0-88f3-4a6f-920f-15805071a7d6@linaro.org/
Reported-and-tested-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/6191258.lOV4Wx5bFT@rjwysocki.net
7 days agoPM: suspend: clean up redundant filesystems_freeze/thaw() handling
Zihuan Zhang [Sat, 12 Jul 2025 03:08:24 +0000 (11:08 +0800)]
PM: suspend: clean up redundant filesystems_freeze/thaw() handling

The recently introduced support for freezing filesystems during system
suspend included calls to filesystems_freeze() in both suspend_prepare()
and enter_state(), as well as calls to filesystems_thaw() in both
suspend_finish() and the Unlock path in enter_state(). These are
redundant.

Moreover, calling filesystems_freeze() twice, from both suspend_prepare()
and enter_state(), leads to a black screen and makes the system unable
to resume in some cases.

Address this as follows:

 - filesystems_freeze() is already called in suspend_prepare(), which
   is the proper and consistent place to handle pre-suspend operations.
   The second call in enter_state() is unnecessary and so remove it.

 - filesystems_thaw() is invoked in suspend_finish(), which covers
   successful suspend/resume paths. In the failure case, add a call
   to filesystems_thaw() only when needed, avoiding the duplicate call
   in the general Unlock path.

This change simplifies the suspend code and avoids repeated freeze/thaw
calls, while preserving correct ordering and behavior.

Fixes: eacfbf74196f ("power: freeze filesystems during suspend/resume")
Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Link: https://patch.msgid.link/20250712030824.81474-1-zhangzihuan@kylinos.cn
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 days agoPM: suspend: Drop a misplaced pm_restore_gfp_mask() call
Rafael J. Wysocki [Thu, 10 Jul 2025 08:43:26 +0000 (10:43 +0200)]
PM: suspend: Drop a misplaced pm_restore_gfp_mask() call

The pm_restore_gfp_mask() call added by commit 12ffc3b1513e ("PM:
Restrict swap use to later in the suspend sequence") to
suspend_devices_and_enter() is done too early because it takes
place before calling dpm_resume() in dpm_resume_end() and some
swap-backing devices may not be ready at that point.  Moreover,
dpm_resume_end() called subsequently in the same code path invokes
pm_restore_gfp_mask() again and calling it twice in a row is
pointless.

Drop the misplaced pm_restore_gfp_mask() call from
suspend_devices_and_enter() to address this issue.

Fixes: 12ffc3b1513e ("PM: Restrict swap use to later in the suspend sequence")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/2810409.mvXUDI8C0e@rjwysocki.net
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 days agoMerge tag 'for-6.16/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 15 Jul 2025 02:25:28 +0000 (19:25 -0700)]
Merge tag 'for-6.16/dm-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mikulas Patocka:

 - dm-bufio: fix scheduling in atomic

* tag 'for-6.16/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm-bufio: fix sched in atomic context

8 days agoLinux 6.16-rc6
Linus Torvalds [Sun, 13 Jul 2025 21:25:58 +0000 (14:25 -0700)]
Linux 6.16-rc6

8 days agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 13 Jul 2025 18:37:35 +0000 (11:37 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Fixes for a few clk drivers and bindings:

 - Add a missing property to the Mediatek MT8188 clk binding to
   keep binding checks happy

 - Avoid an OOB by setting the correct number of parents in
   dispmix_csr_clk_dev_data

 - Allocate clk_hw structs early in probe to avoid an ordering
   issue where clk_parent_data points to an unallocated clk_hw
   when the child clk is registered before the parent clk in the
   SCMI clk driver

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  dt-bindings: clock: mediatek: Add #reset-cells property for MT8188
  clk: imx: Fix an out-of-bounds access in dispmix_csr_clk_dev_data
  clk: scmi: Handle case where child clocks are initialized before their parents

8 days agoMerge tag 'x86_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 13 Jul 2025 17:41:19 +0000 (10:41 -0700)]
Merge tag 'x86_urgent_for_v6.16_rc6' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Update Kirill's email address

 - Allow hugetlb PMD sharing only on 64-bit as it doesn't make a whole
   lotta sense on 32-bit

 - Add fixes for a misconfigured AMD Zen2 client which wasn't even
   supposed to run Linux

* tag 'x86_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Update Kirill Shutemov's email address for TDX
  x86/mm: Disable hugetlb page table sharing on 32-bit
  x86/CPU/AMD: Disable INVLPGB on Zen2
  x86/rdrand: Disable RDSEED on AMD Cyan Skillfish

8 days agoMerge tag 'irq_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 13 Jul 2025 17:36:55 +0000 (10:36 -0700)]
Merge tag 'irq_urgent_for_v6.16_rc6' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Borislav Petkov:

 - Fix a case of recursive locking in the MSI code

 - Fix a randconfig build failure in armada-370-xp irqchip

* tag 'irq_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/irq-msi-lib: Fix build with PCI disabled
  PCI/MSI: Prevent recursive locking in pci_msix_write_tph_tag()

8 days agoMerge tag 'perf_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 13 Jul 2025 17:34:47 +0000 (10:34 -0700)]
Merge tag 'perf_urgent_for_v6.16_rc6' of git://git./linux/kernel/git/tip/tip

Pull perf fix from Borislav Petkov:

 - Prevent perf_sigtrap() from observing an exiting task and warning
   about it

* tag 'perf_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix WARN in perf_sigtrap()

9 days agoselftests/hid: add a test case for the recent syzbot underflow
Benjamin Tissoires [Thu, 10 Jul 2025 14:01:36 +0000 (16:01 +0200)]
selftests/hid: add a test case for the recent syzbot underflow

Syzbot found a buffer underflow in __hid_request(). Add a related test
case for it.

It's not perfect, but it allows to catch a corner case when a report
descriptor is crafted so that it has a size of 0.

Link: https://patch.msgid.link/20250710-report-size-null-v2-4-ccf922b7c4e5@kernel.org
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
9 days agoHID: core: do not bypass hid_hw_raw_request
Benjamin Tissoires [Thu, 10 Jul 2025 14:01:35 +0000 (16:01 +0200)]
HID: core: do not bypass hid_hw_raw_request

hid_hw_raw_request() is actually useful to ensure the provided buffer
and length are valid. Directly calling in the low level transport driver
function bypassed those checks and allowed invalid paramto be used.

Reported-by: Alan Stern <stern@rowland.harvard.edu>
Closes: https://lore.kernel.org/linux-input/c75433e0-9b47-4072-bbe8-b1d14ea97b13@rowland.harvard.edu/
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250710-report-size-null-v2-3-ccf922b7c4e5@kernel.org
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
9 days agoHID: core: ensure __hid_request reserves the report ID as the first byte
Benjamin Tissoires [Thu, 10 Jul 2025 14:01:34 +0000 (16:01 +0200)]
HID: core: ensure __hid_request reserves the report ID as the first byte

The low level transport driver expects the first byte to be the report
ID, even when the report ID is not use (in which case they just shift
the buffer).

However, __hid_request() whas not offsetting the buffer it used by one
in this case, meaning that the raw_request() callback emitted by the
transport driver would be stripped of the first byte.

Note: this changes the API for uhid devices when a request is made
through hid_hw_request. However, several considerations makes me think
this is fine:
- every request to a HID device made through hid_hw_request() would see
  that change, but every request made through hid_hw_raw_request()
  already has the new behaviour. So that means that the users are
  already facing situations where they might have or not the first byte
  being the null report ID when it is 0. We are making things more
  straightforward in the end.
- uhid is mainly used for BLE devices
- uhid is also used for testing, but I don't see that change a big issue
- for BLE devices, we can check which kernel module is calling
  hid_hw_request()
- and in those modules, we can check which are using a Bluetooth device
- and then we can check if the command is used with a report ID or not.
- surprise: none of the kernel module are using a report ID 0
- and finally, bluez, in its function set_report()[0], does the same
  shift if the report ID is 0 and the given buffer has a size > 0.

[0] https://git.kernel.org/pub/scm/bluetooth/bluez.git/tree/profiles/input/hog-lib.c#n879

Reported-by: Alan Stern <stern@rowland.harvard.edu>
Closes: https://lore.kernel.org/linux-input/c75433e0-9b47-4072-bbe8-b1d14ea97b13@rowland.harvard.edu/
Reported-by: syzbot+8258d5439c49d4c35f43@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8258d5439c49d4c35f43
Tested-by: syzbot+8258d5439c49d4c35f43@syzkaller.appspotmail.com
Fixes: 4fa5a7f76cc7 ("HID: core: implement generic .request()")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250710-report-size-null-v2-2-ccf922b7c4e5@kernel.org
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
9 days agoHID: core: ensure the allocated report buffer can contain the reserved report ID
Benjamin Tissoires [Thu, 10 Jul 2025 14:01:33 +0000 (16:01 +0200)]
HID: core: ensure the allocated report buffer can contain the reserved report ID

When the report ID is not used, the low level transport drivers expect
the first byte to be 0. However, currently the allocated buffer not
account for that extra byte, meaning that instead of having 8 guaranteed
bytes for implement to be working, we only have 7.

Reported-by: Alan Stern <stern@rowland.harvard.edu>
Closes: https://lore.kernel.org/linux-input/c75433e0-9b47-4072-bbe8-b1d14ea97b13@rowland.harvard.edu/
Cc: stable@vger.kernel.org
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://patch.msgid.link/20250710-report-size-null-v2-1-ccf922b7c4e5@kernel.org
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
9 days agoMerge tag 'mm-hotfixes-stable-2025-07-11-16-16' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Sat, 12 Jul 2025 17:30:47 +0000 (10:30 -0700)]
Merge tag 'mm-hotfixes-stable-2025-07-11-16-16' of git://git./linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "19 hotfixes. A whopping 16 are cc:stable and the remainder address
  post-6.15 issues or aren't considered necessary for -stable kernels.

  14 are for MM.  Three gdb-script fixes and a kallsyms build fix"

* tag 'mm-hotfixes-stable-2025-07-11-16-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  Revert "sched/numa: add statistics of numa balance task"
  mm: fix the inaccurate memory statistics issue for users
  mm/damon: fix divide by zero in damon_get_intervals_score()
  samples/damon: fix damon sample mtier for start failure
  samples/damon: fix damon sample wsse for start failure
  samples/damon: fix damon sample prcl for start failure
  kasan: remove kasan_find_vm_area() to prevent possible deadlock
  scripts: gdb: vfs: support external dentry names
  mm/migrate: fix do_pages_stat in compat mode
  mm/damon/core: handle damon_call_control as normal under kdmond deactivation
  mm/rmap: fix potential out-of-bounds page table access during batched unmap
  mm/hugetlb: don't crash when allocating a folio if there are no resv
  scripts/gdb: de-reference per-CPU MCE interrupts
  scripts/gdb: fix interrupts.py after maple tree conversion
  maple_tree: fix mt_destroy_walk() on root leaf node
  mm/vmalloc: leave lazy MMU mode on PTE mapping error
  scripts/gdb: fix interrupts display after MCP on x86
  lib/alloc_tag: do not acquire non-existent lock in alloc_tag_top_users()
  kallsyms: fix build without execinfo

9 days agoMerge tag 'erofs-for-6.16-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 12 Jul 2025 17:20:03 +0000 (10:20 -0700)]
Merge tag 'erofs-for-6.16-rc6-fixes' of git://git./linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:
 "Fix for a cache aliasing issue by adding missing flush_dcache_folio(),
  which causes execution failures on some arm32 setups.

  Fix for large compressed fragments, which could be generated by
  -Eall-fragments option (but should be rare) and was rejected by
  mistake due to an on-disk hardening commit.

  The remaining ones are small fixes. Summary:

   - Address cache aliasing for mappable page cache folios

   - Allow readdir() to be interrupted

   - Fix large fragment handling which was errored out by mistake

   - Add missing tracepoints

   - Use memcpy_to_folio() to replace copy_to_iter() for inline data"

* tag 'erofs-for-6.16-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix large fragment handling
  erofs: allow readdir() to be interrupted
  erofs: address D-cache aliasing
  erofs: use memcpy_to_folio() to replace copy_to_iter()
  erofs: fix to add missing tracepoint in erofs_read_folio()
  erofs: fix to add missing tracepoint in erofs_readahead()

9 days agoMerge tag 'bcachefs-2025-07-11' of git://evilpiepirate.org/bcachefs
Linus Torvalds [Sat, 12 Jul 2025 17:13:27 +0000 (10:13 -0700)]
Merge tag 'bcachefs-2025-07-11' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet.

* tag 'bcachefs-2025-07-11' of git://evilpiepirate.org/bcachefs:
  bcachefs: Don't set BCH_FS_error on transaction restart
  bcachefs: Fix additional misalignment in journal space calculations
  bcachefs: Don't schedule non persistent passes persistently
  bcachefs: Fix bch2_btree_transactions_read() synchronization
  bcachefs: btree read retry fixes
  bcachefs: btree node scan no longer uses btree cache
  bcachefs: Tweak btree cache helpers for use by btree node scan
  bcachefs: Fix btree for nonexistent tree depth
  bcachefs: Fix bch2_io_failures_to_text()
  bcachefs: bch2_fpunch_snapshot()

9 days agoMerge tag 'v6.16-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Linus Torvalds [Sat, 12 Jul 2025 17:06:06 +0000 (10:06 -0700)]
Merge tag 'v6.16-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

 - fix use after free in lease break

 - small fix for freeing rdma transport (fixes missing logging of
   cm_qp_destroy)

 - fix write count leak

* tag 'v6.16-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: fix potential use-after-free in oplock/lease break ack
  ksmbd: fix a mount write count leak in ksmbd_vfs_kern_path_locked()
  smb: server: make use of rdma_destroy_qp()

10 days agoMerge tag 'pci-v6.16-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Linus Torvalds [Sat, 12 Jul 2025 00:24:36 +0000 (17:24 -0700)]
Merge tag 'pci-v6.16-fixes-3' of git://git./linux/kernel/git/pci/pci

Pull PCI fixes from Bjorn Helgaas:

 - Track apple Root Ports explicitly and look up the driver data from
   the struct device instead of using dev->driver_data, which is used by
   pci_host_common_init() for the generic host bridge pointer (Marc
   Zyngier)

 - Set dev->driver_data before pci_host_common_init() calls
   gen_pci_init() because some drivers need it to set up ECAM mappings;
   this fixes a regression on MicroChip MPFS Icicle (Geert Uytterhoeven)

 - Revert the now-unnecessary use of ECAM pci_config_window.priv to
   store a copy of dev->driver_data (Marc Zyngier)

* tag 'pci-v6.16-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  Revert "PCI: ecam: Allow cfg->priv to be pre-populated from the root port device"
  PCI: host-generic: Set driver_data before calling gen_pci_init()
  PCI: apple: Add tracking of probed root ports

10 days agoMerge tag 'drm-fixes-2025-07-12' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Sat, 12 Jul 2025 00:18:40 +0000 (17:18 -0700)]
Merge tag 'drm-fixes-2025-07-12' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Simona Vetter:
 "Cross-subsystem Changes:
   - agp/amd64 binding dmesg noise regression fix

  Core Changes:
   - fix race in gem_handle_create_tail
   - fixup handle_count fb refcount regression from -rc5, popular with
     reports ...
   - call rust dtor for drm_device release

  Driver Changes:
   - nouveau: magic 50ms suspend fix, acpi leak fix
   - tegra: dma api error in nvdec
   - pvr: fix device reset
   - habanalbs maintainer update
   - intel display: fix some dsi mipi sequences
   - xe fixes: SRIOV fixes, small GuC fixes, disable indirect ring due
     to issues, compression fix for fragmented BO, doc update

* tag 'drm-fixes-2025-07-12' of https://gitlab.freedesktop.org/drm/kernel: (22 commits)
  drm/xe/guc: Default log level to non-verbose
  drm/xe/bmg: Don't use WA 16023588340 and 22019338487 on VF
  drm/xe/guc: Recommend GuC v70.46.2 for BMG, LNL, DG2
  drm/xe/pm: Correct comment of xe_pm_set_vram_threshold()
  drm/xe: Release runtime pm for error path of xe_devcoredump_read()
  drm/xe/pm: Restore display pm if there is error after display suspend
  drm/i915/bios: Apply vlv_fixup_mipi_sequences() to v2 mipi-sequences too
  drm/gem: Fix race in drm_gem_handle_create_tail()
  drm/framebuffer: Acquire internal references on GEM handles
  agp/amd64: Check AGP Capability before binding to unsupported devices
  drm/xe/bmg: fix compressed VRAM handling
  Revert "drm/xe/xe2: Enable Indirect Ring State support for Xe2"
  drm/xe: Allocate PF queue size on pow2 boundary
  drm/xe/pf: Clear all LMTT pages on alloc
  drm/nouveau/gsp: fix potential leak of memory used during acpi init
  rust: drm: remove unnecessary imports
  MAINTAINERS: Change habanalabs maintainer
  drm/imagination: Fix kernel crash when hard resetting the GPU
  drm/tegra: nvdec: Fix dma_alloc_coherent error check
  rust: drm: device: drop_in_place() the drm::Device in release()
  ...

10 days agoRevert "eventpoll: Fix priority inversion problem"
Linus Torvalds [Sat, 12 Jul 2025 00:10:32 +0000 (17:10 -0700)]
Revert "eventpoll: Fix priority inversion problem"

This reverts commit 8c44dac8add7503c345c0f6c7962e4863b88ba42.

I haven't figured out what the actual bug in this commit is, but I did
spend a lot of time chasing it down and eventually succeeded in
bisecting it down to this.

For some reason, this eventpoll commit ends up causing delays and stuck
user space processes, but it only happens on one of my machines, and
only during early boot or during the flurry of initial activity when
logging in.

I must be triggering some very subtle timing issue, but once I figured
out the behavior pattern that made it reasonably reliable to trigger, it
did bisect right to this, and reverting the commit fixes the problem.

Of course, that was only after I had failed at bisecting it several
times, and had flailed around blaming both the drm people and the
netlink people for the odd problems.  The most obvious of which happened
at the time of the first graphical login (the most common symptom being
that some gnome app aborted due to a 30s timeout, often leading to the
whole session then failing if it was some critical component like
gnome-shell or similar).

Acked-by: Nam Cao <namcao@linutronix.de>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 days agoerofs: fix large fragment handling
Gao Xiang [Fri, 11 Jul 2025 19:58:26 +0000 (03:58 +0800)]
erofs: fix large fragment handling

Fragments aren't limited by Z_EROFS_PCLUSTER_MAX_DSIZE. However, if
a fragment's logical length is larger than Z_EROFS_PCLUSTER_MAX_DSIZE
but the fragment is not the whole inode, it currently returns
-EOPNOTSUPP because m_flags has the wrong EROFS_MAP_ENCODED flag set.
It is not intended by design but should be rare, as it can only be
reproduced by mkfs with `-Eall-fragments` in a specific case.

Let's normalize fragment m_flags using the new EROFS_MAP_FRAGMENT.

Reported-by: Axel Fontaine <axel@axelfontaine.com>
Closes: https://github.com/erofs/erofs-utils/issues/23
Fixes: 7c3ca1838a78 ("erofs: restrict pcluster size limitations")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250711195826.3601157-1-hsiangkao@linux.alibaba.com
10 days agoMerge tag 'block-6.16-20250710' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 11 Jul 2025 17:35:54 +0000 (10:35 -0700)]
Merge tag 'block-6.16-20250710' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - MD changes via Yu:
     - fix UAF due to stack memory used for bio mempool (Jinchao)
     - fix raid10/raid1 nowait IO error path (Nigel and Qixing)
     - fix kernel crash from reading bitmap sysfs entry (HÃ¥kon)

 - Fix for a UAF in the nbd connect error path

 - Fix for blocksize being bigger than pagesize, if THP isn't enabled

* tag 'block-6.16-20250710' of git://git.kernel.dk/linux:
  block: reject bs > ps block devices when THP is disabled
  nbd: fix uaf in nbd_genl_connect() error path
  md/md-bitmap: fix GPF in bitmap_get_stats()
  md/raid1,raid10: strip REQ_NOWAIT from member bios
  raid10: cleanup memleak at raid10_make_request
  md/raid1: Fix stack memory use after return in raid1_reshape

10 days agoMerge tag 'io_uring-6.16-20250710' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 11 Jul 2025 17:29:30 +0000 (10:29 -0700)]
Merge tag 'io_uring-6.16-20250710' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:

 - Remove a pointless warning in the zcrx code

 - Fix for MSG_RING commands, where the allocated io_kiocb
   needs to be freed under RCU as well

 - Revert the work-around we had in place for the anon inodes
   pretending to be regular files. Since that got reworked
   upstream, the work-around is no longer needed

* tag 'io_uring-6.16-20250710' of git://git.kernel.dk/linux:
  Revert "io_uring: gate REQ_F_ISREG on !S_ANON_INODE as well"
  io_uring/msg_ring: ensure io_kiocb freeing is deferred for RCU
  io_uring/zcrx: fix pp destruction warnings

10 days agoMerge tag 'net-6.16-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Fri, 11 Jul 2025 17:18:51 +0000 (10:18 -0700)]
Merge tag 'net-6.16-rc6-2' of git://git./linux/kernel/git/netdev/net

Pull more networking fixes from Jakub Kicinski
 "Big chunk of fixes for WiFi, Johannes says probably the last for the
  release.

  The Netlink fixes (on top of the tree) restore operation of iw (WiFi
  CLI) which uses sillily small recv buffer, and is the reason for this
  'emergency PR'.

  The GRE multicast fix also stands out among the user-visible
  regressions.

  Current release - fix to a fix:

   - netlink: make sure we always allow at least one skb to be queued,
     even if the recvbuf is (mis)configured to be tiny

  Previous releases - regressions:

   - gre: fix IPv6 multicast route creation

  Previous releases - always broken:

   - wifi: prevent A-MSDU attacks in mesh networks

   - wifi: cfg80211: fix S1G beacon head validation and detection

   - wifi: mac80211:
       - always clear frame buffer to prevent stack leak in cases which
         hit a WARN()
       - fix monitor interface in device restart

   - wifi: mwifiex: discard erroneous disassoc frames on STA interface

   - wifi: mt76:
       - prevent null-deref in mt7925_sta_set_decap_offload()
       - add missing RCU annotations, and fix sleep in atomic
       - fix decapsulation offload
       - fixes for scanning

   - phy: microchip: improve link establishment and reset handling

   - eth: mlx5e: fix race between DIM disable and net_dim()

   - bnxt_en: correct DMA unmap len for XDP_REDIRECT"

* tag 'net-6.16-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits)
  netlink: make sure we allow at least one dump skb
  netlink: Fix rmem check in netlink_broadcast_deliver().
  bnxt_en: Set DMA unmap len correctly for XDP_REDIRECT
  bnxt_en: Flush FW trace before copying to the coredump
  bnxt_en: Fix DCB ETS validation
  net: ll_temac: Fix missing tx_pending check in ethtools_set_ringparam()
  net/mlx5e: Add new prio for promiscuous mode
  net/mlx5e: Fix race between DIM disable and net_dim()
  net/mlx5: Reset bw_share field when changing a node's parent
  can: m_can: m_can_handle_lost_msg(): downgrade msg lost in rx message to debug level
  selftests: net: lib: fix shift count out of range
  selftests: Add IPv6 multicast route generation tests for GRE devices.
  gre: Fix IPv6 multicast route creation.
  net: phy: microchip: limit 100M workaround to link-down events on LAN88xx
  net: phy: microchip: Use genphy_soft_reset() to purge stale LPA bits
  ibmvnic: Fix hardcoded NUM_RX_STATS/NUM_TX_STATS with dynamic sizeof
  net: appletalk: Fix device refcount leak in atrtr_create()
  netfilter: flowtable: account for Ethernet header in nf_flow_pppoe_proto()
  wifi: mac80211: add the virtual monitor after reconfig complete
  wifi: mac80211: always initialize sdata::key_list
  ...

10 days agoMerge tag 'gpio-fixes-for-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 11 Jul 2025 17:15:50 +0000 (10:15 -0700)]
Merge tag 'gpio-fixes-for-v6.16-rc6' of git://git./linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix performance regression when setting values of multiple GPIO lines
   at once

 - make sure the GPIO OF xlate code doesn't end up passing an
   uninitialized local variable to GPIO core

 - update MAINTAINERS

* tag 'gpio-fixes-for-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  MAINTAINERS: remove bouncing address for Nandor Han
  gpio: of: initialize local variable passed to the .of_xlate() callback
  gpiolib: fix performance regression when using gpio_chip_get_multiple()

11 days agoMerge tag 'pm-6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 11 Jul 2025 16:19:33 +0000 (09:19 -0700)]
Merge tag 'pm-6.16-rc6' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Fix a coding mistake in a previous fix related to system suspend and
  hibernation merged recently"

* tag 'pm-6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: Call pm_restore_gfp_mask() after dpm_resume()

11 days agoMerge tag 'dma-mapping-6.16-2025-07-11' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 11 Jul 2025 15:49:25 +0000 (08:49 -0700)]
Merge tag 'dma-mapping-6.16-2025-07-11' of git://git./linux/kernel/git/mszyprowski/linux

Pull dma-mapping fix from Marek Szyprowski:

 - small fix relevant to arm64 server and custom CMA configuration (Feng
   Tang)

* tag 'dma-mapping-6.16-2025-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
  dma-contiguous: hornor the cma address limit setup by user

11 days agonetlink: make sure we allow at least one dump skb
Jakub Kicinski [Fri, 11 Jul 2025 00:11:21 +0000 (17:11 -0700)]
netlink: make sure we allow at least one dump skb

Commit under Fixes tightened up the memory accounting for Netlink
sockets. Looks like the accounting is too strict for some existing
use cases, Marek reported issues with nl80211 / WiFi iw CLI.

To reduce number of iterations Netlink dumps try to allocate
messages based on the size of the buffer passed to previous
recvmsg() calls. If user space uses a larger buffer in recvmsg()
than sk_rcvbuf we will allocate an skb we won't be able to queue.

Make sure we always allow at least one skb to be queued.
Same workaround is already present in netlink_attachskb().
Alternative would be to cap the allocation size to
  rcvbuf - rmem_alloc
but as I said, the workaround is already present in other places.

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/9794af18-4905-46c6-b12c-365ea2f05858@samsung.com
Fixes: ae8f160e7eb2 ("netlink: Fix wraparounds of sk->sk_rmem_alloc.")
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250711001121.3649033-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agonetlink: Fix rmem check in netlink_broadcast_deliver().
Kuniyuki Iwashima [Fri, 11 Jul 2025 05:32:07 +0000 (05:32 +0000)]
netlink: Fix rmem check in netlink_broadcast_deliver().

We need to allow queuing at least one skb even when skb is
larger than sk->sk_rcvbuf.

The cited commit made a mistake while converting a condition
in netlink_broadcast_deliver().

Let's correct the rmem check for the allow-one-skb rule.

Fixes: ae8f160e7eb24 ("netlink: Fix wraparounds of sk->sk_rmem_alloc.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250711053208.2965945-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agoMerge branch 'bnxt_en-3-bug-fixes'
Jakub Kicinski [Fri, 11 Jul 2025 14:28:36 +0000 (07:28 -0700)]
Merge branch 'bnxt_en-3-bug-fixes'

Michael Chan says:

====================
bnxt_en: 3 bug fixes

The first one fixes a possible failure when setting DCB ETS.
The second one fixes the ethtool coredump (-W 2) not containing
all the FW traces.  The third one fixes the DMA unmap length when
transmitting XDP_REDIRECT packets.
====================

Link: https://patch.msgid.link/20250710213938.1959625-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agobnxt_en: Set DMA unmap len correctly for XDP_REDIRECT
Somnath Kotur [Thu, 10 Jul 2025 21:39:38 +0000 (14:39 -0700)]
bnxt_en: Set DMA unmap len correctly for XDP_REDIRECT

When transmitting an XDP_REDIRECT packet, call dma_unmap_len_set()
with the proper length instead of 0.  This bug triggers this warning
on a system with IOMMU enabled:

WARNING: CPU: 36 PID: 0 at drivers/iommu/dma-iommu.c:842 __iommu_dma_unmap+0x159/0x170
RIP: 0010:__iommu_dma_unmap+0x159/0x170
Code: a8 00 00 00 00 48 c7 45 b0 00 00 00 00 48 c7 45 c8 00 00 00 00 48 c7 45 a0 ff ff ff ff 4c 89 45
b8 4c 89 45 c0 e9 77 ff ff ff <0f> 0b e9 60 ff ff ff e8 8b bf 6a 00 66 66 2e 0f 1f 84 00 00 00 00
RSP: 0018:ff22d31181150c88 EFLAGS: 00010206
RAX: 0000000000002000 RBX: 00000000e13a0000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ff22d31181150cf0 R08: ff22d31181150ca8 R09: 0000000000000000
R10: 0000000000000000 R11: ff22d311d36c9d80 R12: 0000000000001000
R13: ff13544d10645010 R14: ff22d31181150c90 R15: ff13544d0b2bac00
FS: 0000000000000000(0000) GS:ff13550908a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005be909dacff8 CR3: 0008000173408003 CR4: 0000000000f71ef0
PKRU: 55555554
Call Trace:
<IRQ>
? show_regs+0x6d/0x80
? __warn+0x89/0x160
? __iommu_dma_unmap+0x159/0x170
? report_bug+0x17e/0x1b0
? handle_bug+0x46/0x90
? exc_invalid_op+0x18/0x80
? asm_exc_invalid_op+0x1b/0x20
? __iommu_dma_unmap+0x159/0x170
? __iommu_dma_unmap+0xb3/0x170
iommu_dma_unmap_page+0x4f/0x100
dma_unmap_page_attrs+0x52/0x220
? srso_alias_return_thunk+0x5/0xfbef5
? xdp_return_frame+0x2e/0xd0
bnxt_tx_int_xdp+0xdf/0x440 [bnxt_en]
__bnxt_poll_work_done+0x81/0x1e0 [bnxt_en]
bnxt_poll+0xd3/0x1e0 [bnxt_en]

Fixes: f18c2b77b2e4 ("bnxt_en: optimized XDP_REDIRECT support")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250710213938.1959625-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agobnxt_en: Flush FW trace before copying to the coredump
Shruti Parab [Thu, 10 Jul 2025 21:39:37 +0000 (14:39 -0700)]
bnxt_en: Flush FW trace before copying to the coredump

bnxt_fill_drv_seg_record() calls bnxt_dbg_hwrm_log_buffer_flush()
to flush the FW trace buffer.  This needs to be done before we
call bnxt_copy_ctx_mem() to copy the trace data.

Without this fix, the coredump may not contain all the FW
traces.

Fixes: 3c2179e66355 ("bnxt_en: Add FW trace coredump segments to the coredump")
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250710213938.1959625-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agobnxt_en: Fix DCB ETS validation
Shravya KN [Thu, 10 Jul 2025 21:39:36 +0000 (14:39 -0700)]
bnxt_en: Fix DCB ETS validation

In bnxt_ets_validate(), the code incorrectly loops over all possible
traffic classes to check and add the ETS settings.  Fix it to loop
over the configured traffic classes only.

The unconfigured traffic classes will default to TSA_ETS with 0
bandwidth.  Looping over these unconfigured traffic classes may
cause the validation to fail and trigger this error message:

"rejecting ETS config starving a TC\n"

The .ieee_setets() will then fail.

Fixes: 7df4ae9fe855 ("bnxt_en: Implement DCBNL to support host-based DCBX.")
Reviewed-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Shravya KN <shravya.k-n@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250710213938.1959625-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agonet: ll_temac: Fix missing tx_pending check in ethtools_set_ringparam()
Alok Tiwari [Thu, 10 Jul 2025 18:06:17 +0000 (11:06 -0700)]
net: ll_temac: Fix missing tx_pending check in ethtools_set_ringparam()

The function ll_temac_ethtools_set_ringparam() incorrectly checked
rx_pending twice, once correctly for RX and once mistakenly in place
of tx_pending. This caused tx_pending to be left unchecked against
TX_BD_NUM_MAX.
As a result, invalid TX ring sizes may have been accepted or valid
ones wrongly rejected based on the RX limit, leading to potential
misconfiguration or unexpected results.

This patch corrects the condition to properly validate tx_pending.

Fixes: f7b261bfc35e ("net: ll_temac: Make RX/TX ring sizes configurable")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Link: https://patch.msgid.link/20250710180621.2383000-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agoMerge branch 'mlx5-misc-fixes-2025-07-10'
Jakub Kicinski [Fri, 11 Jul 2025 14:26:49 +0000 (07:26 -0700)]
Merge branch 'mlx5-misc-fixes-2025-07-10'

Tariq Toukan says:

====================
mlx5 misc fixes 2025-07-10

This small patchset provides misc bug fixes from the team to the mlx5
core and EN drivers.
====================

Link: https://patch.msgid.link/1752155624-24095-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agonet/mlx5e: Add new prio for promiscuous mode
Jianbo Liu [Thu, 10 Jul 2025 13:53:44 +0000 (16:53 +0300)]
net/mlx5e: Add new prio for promiscuous mode

An optimization for promiscuous mode adds a high-priority steering
table with a single catch-all rule to steer all traffic directly to
the TTC table.

However, a gap exists between the creation of this table and the
insertion of the catch-all rule. Packets arriving in this brief window
would miss as no rule was inserted yet, unnecessarily incrementing the
'rx_steer_missed_packets' counter and dropped.

This patch resolves the issue by introducing a new prio for this
table, placing it between MLX5E_TC_PRIO and MLX5E_NIC_PRIO. By doing
so, packets arriving during the window now fall through to the next
prio (at MLX5E_NIC_PRIO) instead of being dropped.

Fixes: 1c46d7409f30 ("net/mlx5e: Optimize promiscuous mode")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/1752155624-24095-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agonet/mlx5e: Fix race between DIM disable and net_dim()
Carolina Jubran [Thu, 10 Jul 2025 13:53:43 +0000 (16:53 +0300)]
net/mlx5e: Fix race between DIM disable and net_dim()

There's a race between disabling DIM and NAPI callbacks using the dim
pointer on the RQ or SQ.

If NAPI checks the DIM state bit and sees it still set, it assumes
`rq->dim` or `sq->dim` is valid. But if DIM gets disabled right after
that check, the pointer might already be set to NULL, leading to a NULL
pointer dereference in net_dim().

Fix this by calling `synchronize_net()` before freeing the DIM context.
This ensures all in-progress NAPI callbacks are finished before the
pointer is cleared.

Kernel log:

BUG: kernel NULL pointer dereference, address: 0000000000000000
...
RIP: 0010:net_dim+0x23/0x190
...
Call Trace:
 <TASK>
 ? __die+0x20/0x60
 ? page_fault_oops+0x150/0x3e0
 ? common_interrupt+0xf/0xa0
 ? sysvec_call_function_single+0xb/0x90
 ? exc_page_fault+0x74/0x130
 ? asm_exc_page_fault+0x22/0x30
 ? net_dim+0x23/0x190
 ? mlx5e_poll_ico_cq+0x41/0x6f0 [mlx5_core]
 ? sysvec_apic_timer_interrupt+0xb/0x90
 mlx5e_handle_rx_dim+0x92/0xd0 [mlx5_core]
 mlx5e_napi_poll+0x2cd/0xac0 [mlx5_core]
 ? mlx5e_poll_ico_cq+0xe5/0x6f0 [mlx5_core]
 busy_poll_stop+0xa2/0x200
 ? mlx5e_napi_poll+0x1d9/0xac0 [mlx5_core]
 ? mlx5e_trigger_irq+0x130/0x130 [mlx5_core]
 __napi_busy_loop+0x345/0x3b0
 ? sysvec_call_function_single+0xb/0x90
 ? asm_sysvec_call_function_single+0x16/0x20
 ? sysvec_apic_timer_interrupt+0xb/0x90
 ? pcpu_free_area+0x1e4/0x2e0
 napi_busy_loop+0x11/0x20
 xsk_recvmsg+0x10c/0x130
 sock_recvmsg+0x44/0x70
 __sys_recvfrom+0xbc/0x130
 ? __schedule+0x398/0x890
 __x64_sys_recvfrom+0x20/0x30
 do_syscall_64+0x4c/0x100
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
...
---[ end trace 0000000000000000 ]---
...
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Fixes: 445a25f6e1a2 ("net/mlx5e: Support updating coalescing configuration without resetting channels")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/1752155624-24095-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agonet/mlx5: Reset bw_share field when changing a node's parent
Carolina Jubran [Thu, 10 Jul 2025 13:53:42 +0000 (16:53 +0300)]
net/mlx5: Reset bw_share field when changing a node's parent

When changing a node's parent, its scheduling element is destroyed and
re-created with bw_share 0. However, the node's bw_share field was not
updated accordingly.

Set the node's bw_share to 0 after re-creation to keep the software
state in sync with the firmware configuration.

Fixes: 9c7bbf4c3304 ("net/mlx5: Add support for setting parent of nodes")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/1752155624-24095-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agoMAINTAINERS: Update Kirill Shutemov's email address for TDX
Kirill A. Shutemov [Tue, 8 Jul 2025 10:19:22 +0000 (13:19 +0300)]
MAINTAINERS: Update Kirill Shutemov's email address for TDX

Update MAINTAINERS to use my @kernel.org email address.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20250708101922.50560-4-kirill.shutemov%40linux.intel.com
11 days agoMerge tag 'linux-can-fixes-for-6.16-20250711' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Fri, 11 Jul 2025 14:07:56 +0000 (07:07 -0700)]
Merge tag 'linux-can-fixes-for-6.16-20250711' of git://git./linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2025-07-11

Sean Nyekjaer's patch targets the m_can driver and demotes the "msg
lost in rx" message to debug level to prevent flooding the kernel log
with error messages.

* tag 'linux-can-fixes-for-6.16-20250711' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: m_can: m_can_handle_lost_msg(): downgrade msg lost in rx message to debug level
====================

Link: https://patch.msgid.link/20250711102451.2828802-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agoMerge tag 'drm-misc-fixes-2025-07-10' of https://gitlab.freedesktop.org/drm/misc...
Simona Vetter [Fri, 11 Jul 2025 12:11:18 +0000 (14:11 +0200)]
Merge tag 'drm-misc-fixes-2025-07-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

drm-misc-fixes for v6.16-rc6 or final:
- Fix nouveau fail on debugfs errors.
- Magic 50 ms to fix nouveau suspend.
- Call rust destructor on drm device release.
- Fix DMA api error handling in tegra/nvdec.
- Fix PVR device reset.
- Habanalabs maintainer update.
- Small memory leak fix when nouveau acpi init fails.
- Do not attempt to bind to any PCI device with AGP capability.
- Make FB's acquire handles on backing object, same as i915/xe already does.
- Fix race in drm_gem_handle_create_tail.

Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e522cdc7-1787-48f2-97e5-0f94783970ab@linux.intel.com
11 days agoMerge tag 'qcom-arm64-defconfig-fixes-for-6.16' of https://git.kernel.org/pub/scm...
Arnd Bergmann [Fri, 11 Jul 2025 11:41:08 +0000 (13:41 +0200)]
Merge tag 'qcom-arm64-defconfig-fixes-for-6.16' of https://git./linux/kernel/git/qcom/linux into arm/fixes

Qualcomm Arm64 defconfig fixes for v6.16

The v6.16 driver and DeviceTree updates described and implemented CPU
frequency scaling for the Qualcomm X Elite platform. But the necessary
CPUCP mailbox driver was not enabled, resulting in a series of error
messages being logged during boot (and no CPU frequency scaling).

Enable the missing drivers to silence the errors, and enable CPU
frequency scaling on this platform.

* tag 'qcom-arm64-defconfig-fixes-for-6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  arm64: defconfig: Enable Qualcomm CPUCP mailbox driver

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
11 days agoMerge tag 'qcom-arm64-fixes-for-6.16' of https://git.kernel.org/pub/scm/linux/kernel...
Arnd Bergmann [Fri, 11 Jul 2025 11:39:12 +0000 (13:39 +0200)]
Merge tag 'qcom-arm64-fixes-for-6.16' of https://git./linux/kernel/git/qcom/linux into arm/fixes

Qualcomm DeviceTree fixes for v6.16

The RTC DeviceTree binding was changed in v6.16, to require an explicit
flag indicating that we store RTC offset in in an UEFI variable.

The result sent X Elite and Lenovo Thinkpad X13s users back to 1970, add
the flag to explicitly select the correct configuration for these
devices.

* tag 'qcom-arm64-fixes-for-6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
  arm64: dts: qcom: x1e80100: describe uefi rtc offset
  arm64: dts: qcom: sc8280xp-x13s: describe uefi rtc offset

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
11 days agoMerge tag 'aspeed-6.16-fixes-0' of https://git.kernel.org/pub/scm/linux/kernel/git...
Arnd Bergmann [Fri, 11 Jul 2025 11:38:35 +0000 (13:38 +0200)]
Merge tag 'aspeed-6.16-fixes-0' of https://git./linux/kernel/git/bmc/linux into arm/fixes

ASPEED SoC driver fixes for 6.16

Address concerns in the ASPEED LPC snoop driver identified in the first two
patches of the cleanup series at [1].

[1]: https://lore.kernel.org/all/20250616-aspeed-lpc-snoop-fixes-v2-0-3cdd59c934d3@codeconstruct.com.au/

* tag 'aspeed-6.16-fixes-0' of https://git.kernel.org/pub/scm/linux/kernel/git/bmc/linux:
  soc: aspeed: lpc-snoop: Don't disable channels that aren't enabled
  soc: aspeed: lpc-snoop: Cleanup resources in stack-order

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
11 days agoMerge tag 'v6.16-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel...
Arnd Bergmann [Fri, 11 Jul 2025 11:17:23 +0000 (13:17 +0200)]
Merge tag 'v6.16-rockchip-dtsfixes1' of https://git./linux/kernel/git/mmind/linux-rockchip into arm/fixes

Switch to the gpio variant for spi-cs and mmc-detect for some boards
as the in-controller functionality does not work as intended for them.
HDMI drive strength adjustment for better ddc communication and some
missing supplies.

* tag 'v6.16-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  arm64: dts: rockchip: Add missing fan-supply to rk3566-quartz64-a
  arm64: dts: rockchip: use cs-gpios for spi1 on ringneck
  arm64: dts: rockchip: list all CPU supplies on ArmSoM Sige5
  arm64: dts: rockchip: Add cd-gpios for sdcard detect on Cool Pi 4B
  arm64: dts: rockchip: Add cd-gpios for sdcard detect on Cool Pi CM5
  arm64: dts: rockchip: Adjust the HDMI DDC IO driver strength for rk3588
  arm64: dts: rockchip: fix rk3576 pcie1 linux,pci-domain

Link: https://lore.kernel.org/r/5108768.AiC22s8V5E@diego
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
11 days agocan: m_can: m_can_handle_lost_msg(): downgrade msg lost in rx message to debug level
Sean Nyekjaer [Fri, 11 Jul 2025 10:12:02 +0000 (12:12 +0200)]
can: m_can: m_can_handle_lost_msg(): downgrade msg lost in rx message to debug level

Downgrade the "msg lost in rx" message to debug level, to prevent
flooding the kernel log with error messages.

Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support")
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Link: https://patch.msgid.link/20250711-mcan_ratelimit-v3-1-7413e8e21b84@geanix.com
[mkl: enhance commit message]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 days agoMerge tag 'drm-xe-fixes-2025-07-11' of https://gitlab.freedesktop.org/drm/xe/kernel...
Simona Vetter [Fri, 11 Jul 2025 09:35:39 +0000 (11:35 +0200)]
Merge tag 'drm-xe-fixes-2025-07-11' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Clear LMTT page to avoid leaking data from one VF to another
- Align PF queue size to power of 2
- Disable Indirect Ring State to avoid intermittent issues on context
  switch: feature is not currently needed, so can be disabled for now.
- Fix compression handling when the BO pages are very fragmented
- Restore display pm on error path
- Fix runtime pm handling in xe devcoredump
- Fix xe_pm_set_vram_threshold() doc
- Recommend new minor versions of GuC firmware
- Drop some workarounds on VF
- Do not use verbose GuC logging by default: it should be only for
  debugging

Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/s6jyd24mimbzb4vxtgc5vupvbyqplfep2c6eupue7znnlbhuxy@lmvzexfzhrnn
11 days agoMerge tag 'drm-intel-fixes-2025-07-10' of https://gitlab.freedesktop.org/drm/i915...
Simona Vetter [Fri, 11 Jul 2025 09:28:41 +0000 (11:28 +0200)]
Merge tag 'drm-intel-fixes-2025-07-10' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes

Short summary of fixes:
- DSI panel's version 2 mipi-sequences fix (Hans)

Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/aHA_eR0G7X2P6_ib@intel.com
11 days agoMerge tag 'imx-fixes-6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/shawngu...
Arnd Bergmann [Fri, 11 Jul 2025 09:26:57 +0000 (11:26 +0200)]
Merge tag 'imx-fixes-6.16' of https://git./linux/kernel/git/shawnguo/linux into arm/fixes

i.MX fixes for 6.16:

- Keep LDO5 always on for imx8mm-verdin to fix broken Ethernet support
- Add big-endian property back for LS1046A watchdog, as the removal was
  an accident
- Fix DMA interrupter number of i.MX95 pcie0_ep device
- A set of changes from Tim Harvey to fix TPM SPI frequency on
  imx8mp-venice devices
- A couple of changes from Wei Fang to fix NETC overshoot issue on
  i.MX95 EVK boards

* tag 'imx-fixes-6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  arm64: dts: freescale: imx8mm-verdin: Keep LDO5 always on
  arm64: dts: imx95: Correct the DMA interrupter number of pcie0_ep
  arm64: dts: add big-endian property back into watchdog node
  arm64: dts: imx95-15x15-evk: fix the overshoot issue of NETC
  arm64: dts: imx95-19x19-evk: fix the overshoot issue of NETC
  arm64: dts: imx8mp-venice-gw74xx: fix TPM SPI frequency
  arm64: dts: imx8mp-venice-gw73xx: fix TPM SPI frequency
  arm64: dts: imx8mp-venice-gw72xx: fix TPM SPI frequency
  arm64: dts: imx8mp-venice-gw71xx: fix TPM SPI frequency

Link: https://lore.kernel.org/r/aGzNeZ7KtsRsUkZT@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
11 days agoMAINTAINERS: remove bouncing address for Nandor Han
Bartosz Golaszewski [Wed, 9 Jul 2025 07:18:24 +0000 (09:18 +0200)]
MAINTAINERS: remove bouncing address for Nandor Han

Nandor's address has been bouncing for some time now. Remove it from
MAINTAINERS. The affected driver falls under the wider umbrella of GPIO
modules.

Link: https://lore.kernel.org/r/20250709071825.16212-1-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
11 days agodrm/xe/guc: Default log level to non-verbose
Lucas De Marchi [Fri, 13 Jun 2025 20:00:37 +0000 (13:00 -0700)]
drm/xe/guc: Default log level to non-verbose

Currently xe sets the guc log level to a verbose level since it's useful
to debug hangs and general development. However the verbose level may
already be too much and affect performance.

Michal Mrozek did some tests with the L0 compute stack for submission
latency with ULLS disabled. Below are the normalized numbers with log
level 3 (the current default) as baseline for each test:

                          Test \ Log Level                        3      0      1      2
 ----------------------------------------------------------- ------ ------ ------ ------
  BestWalkerNthCommandListSubmission(CmdListCount=2)           1.00   0.63   0.63   0.96
  BestWalkerNthSubmission(KernelCount=2)                       1.00   0.62   0.63   0.96
  BestWalkerNthSubmissionImmediate(KernelCount=2)              1.00   0.58   0.58   0.85
  BestWalkerSubmission                                         1.00   0.62   0.62   0.96
  BestWalkerSubmissionImmediate                                1.00   0.63   0.62   0.96
  BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=2)   1.00   0.58   0.58   0.86
  BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=4)   1.00   0.70   0.70   0.83
  BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=8)   1.00   0.53   0.52   0.78

Log level 2 is the first "verbose level" for GuC, where the biggest
difference happens. Keep log level 3 for CONFIG_DRM_XE_DEBUG, but switch
to 1, i.e.  GUC_LOG_LEVEL_NON_VERBOSE, for "normal" builds.

Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250613-guc-log-level-v2-1-cb84a63e49fe@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit a37128ba613ad6a5f81f382fa3cfe5c4a6527310)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
11 days agodrm/xe/bmg: Don't use WA 16023588340 and 22019338487 on VF
Michal Wajdeczko [Thu, 10 Jul 2025 10:30:39 +0000 (10:30 +0000)]
drm/xe/bmg: Don't use WA 16023588340 and 22019338487 on VF

These workarounds are not applicable for use by the VFs.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Tested-by: Jakub Kolakowski <jakub1.kolakowski@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Signed-off-by: Jakub Kolakowski <jakub1.kolakowski@intel.com>
Link: https://lore.kernel.org/r/20250710103040.375610-2-jakub1.kolakowski@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 1d2e2503e506ddc499cbb7afdc8b70bcf6fe241f)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
11 days agodrm/xe/guc: Recommend GuC v70.46.2 for BMG, LNL, DG2
Julia Filipchuk [Thu, 26 Jun 2025 18:28:10 +0000 (11:28 -0700)]
drm/xe/guc: Recommend GuC v70.46.2 for BMG, LNL, DG2

UAPI compatibility version 1.22.2

Resolves various bugs. Recommend newer version.

Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250626182805.1701096-13-daniele.ceraolospurio@intel.com
(cherry picked from commit 0b64addcae7f04745bc5f62d41e27268052f812e)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
11 days agodrm/xe/pm: Correct comment of xe_pm_set_vram_threshold()
Shuicheng Lin [Tue, 8 Jul 2025 02:14:51 +0000 (02:14 +0000)]
drm/xe/pm: Correct comment of xe_pm_set_vram_threshold()

The parameter threshold is with size in MiB, not in bits.
Correct it to avoid any confusion.

v2: s/mb/MiB, s/vram/VRAM, fix return section. (Michal)

Fixes: 30c399529f4c ("drm/xe: Document Xe PM component")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://lore.kernel.org/r/20250708021450.3602087-2-shuicheng.lin@intel.com
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 0efec0500117947f924e5ac83be40f96378af85a)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
11 days agodrm/xe: Release runtime pm for error path of xe_devcoredump_read()
Shuicheng Lin [Mon, 7 Jul 2025 00:49:14 +0000 (00:49 +0000)]
drm/xe: Release runtime pm for error path of xe_devcoredump_read()

xe_pm_runtime_put() is missed to be called for the error path in
xe_devcoredump_read().
Add function description comments for xe_devcoredump_read() to help
understand it.

v2: more detail function comments and refine goto logic (Matt)

Fixes: c4a2e5f865b7 ("drm/xe: Add devcoredump chunking")
Cc: stable@vger.kernel.org
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250707004911.3502904-6-shuicheng.lin@intel.com
(cherry picked from commit 017ef1228d735965419ff118fe1b89089e772c42)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
11 days agodrm/xe/pm: Restore display pm if there is error after display suspend
Shuicheng Lin [Tue, 8 Jul 2025 03:54:25 +0000 (03:54 +0000)]
drm/xe/pm: Restore display pm if there is error after display suspend

xe_bo_evict_all() is called after xe_display_pm_suspend(). So if there
is error with xe_bo_evict_all(), display pm should be restored.

Fixes: 51462211f4a9 ("drm/xe/pxp: add PXP PM support")
Fixes: cb8f81c17531 ("drm/xe/display: Make display suspend/resume work on discrete")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://lore.kernel.org/r/20250708035424.3608190-2-shuicheng.lin@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 83dcee17855c4e5af037ae3262809036de127903)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
11 days agoselftests: net: lib: fix shift count out of range
Hangbin Liu [Wed, 9 Jul 2025 09:12:44 +0000 (09:12 +0000)]
selftests: net: lib: fix shift count out of range

I got the following warning when writing other tests:

  + handle_test_result_pass 'bond 802.3ad' '(lacp_active off)'
  + local 'test_name=bond 802.3ad'
  + shift
  + local 'opt_str=(lacp_active off)'
  + shift
  + log_test_result 'bond 802.3ad' '(lacp_active off)' ' OK '
  + local 'test_name=bond 802.3ad'
  + shift
  + local 'opt_str=(lacp_active off)'
  + shift
  + local 'result= OK '
  + shift
  + local retmsg=
  + shift
  /net/tools/testing/selftests/net/forwarding/../lib.sh: line 315: shift: shift count out of range

This happens because an extra shift is executed even after all arguments
have been consumed. Remove the last shift in log_test_result() to avoid
this warning.

Fixes: a923af1ceee7 ("selftests: forwarding: Convert log_test() to recognize RET values")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250709091244.88395-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agoMerge branch 'gre-fix-default-ipv6-multicast-route-creation'
Jakub Kicinski [Fri, 11 Jul 2025 01:10:48 +0000 (18:10 -0700)]
Merge branch 'gre-fix-default-ipv6-multicast-route-creation'

Guillaume Nault says:

====================
gre: Fix default IPv6 multicast route creation.

When fixing IPv6 link-local address generation on GRE devices with
commit 3e6a0243ff00 ("gre: Fix again IPv6 link-local address
generation."), I accidentally broke the default IPv6 multicast route
creation on these GRE devices.

Fix that in patch 1, making the GRE specific code yet a bit closer to
the generic code used by most other network interface types.

Then extend the selftest in patch 2 to cover this case.
====================

Link: https://patch.msgid.link/cover.1752070620.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agoselftests: Add IPv6 multicast route generation tests for GRE devices.
Guillaume Nault [Wed, 9 Jul 2025 14:30:17 +0000 (16:30 +0200)]
selftests: Add IPv6 multicast route generation tests for GRE devices.

The previous patch fixes a bug that prevented the creation of the
default IPv6 multicast route (ff00::/8) for some GRE devices. Now let's
extend the GRE IPv6 selftests to cover this case.

Also, rename check_ipv6_ll_addr() to check_ipv6_device_config() and
adapt comments and script output to take into account the fact that
we're not limited to link-local address generation.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/65a89583bde3bf866a1922c2e5158e4d72c520e2.1752070620.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agogre: Fix IPv6 multicast route creation.
Guillaume Nault [Wed, 9 Jul 2025 14:30:10 +0000 (16:30 +0200)]
gre: Fix IPv6 multicast route creation.

Use addrconf_add_dev() instead of ipv6_find_idev() in
addrconf_gre_config() so that we don't just get the inet6_dev, but also
install the default ff00::/8 multicast route.

Before commit 3e6a0243ff00 ("gre: Fix again IPv6 link-local address
generation."), the multicast route was created at the end of the
function by addrconf_add_mroute(). But this code path is now only taken
in one particular case (gre devices not bound to a local IP address and
in EUI64 mode). For all other cases, the function exits early and
addrconf_add_mroute() is not called anymore.

Using addrconf_add_dev() instead of ipv6_find_idev() in
addrconf_gre_config(), fixes the problem as it will create the default
multicast route for all gre devices. This also brings
addrconf_gre_config() a bit closer to the normal netdevice IPv6
configuration code (addrconf_dev_config()).

Cc: stable@vger.kernel.org
Fixes: 3e6a0243ff00 ("gre: Fix again IPv6 link-local address generation.")
Reported-by: Aiden Yang <ling@moedove.com>
Closes: https://lore.kernel.org/netdev/CANR=AhRM7YHHXVxJ4DmrTNMeuEOY87K2mLmo9KMed1JMr20p6g@mail.gmail.com/
Reviewed-by: Gary Guo <gary@garyguo.net>
Tested-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/027a923dcb550ad115e6d93ee8bb7d310378bd01.1752070620.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agoMerge branch 'net-phy-microchip-lan88xx-reliability-fixes'
Jakub Kicinski [Fri, 11 Jul 2025 01:08:18 +0000 (18:08 -0700)]
Merge branch 'net-phy-microchip-lan88xx-reliability-fixes'

Oleksij Rempel says:

====================
net: phy: microchip: LAN88xx reliability fixes

This patch series improves the reliability of the Microchip LAN88xx
PHYs, particularly in edge cases involving fixed link configurations or
forced speed modes.

Patch 1 assigns genphy_soft_reset() to the .soft_reset hook to ensure
that stale link partner advertisement (LPA) bits are properly cleared
during reconfiguration. Without this, outdated autonegotiation bits may
remain visible in some parallel detection cases.

Patch 2 restricts the 100 Mbps workaround (originally intended to handle
cable length switching) to only run when the link transitions to the
PHY_NOLINK state. This prevents repeated toggling that can confuse
autonegotiating link partners such as the Intel i350, leading to
unstable link cycles.

Both patches were tested on a LAN7850 (with integrated LAN88xx PHY)
against an Intel I350 NIC. The full test suite - autonegotiation, fixed
link, and parallel detection - passed successfully.
====================

Link: https://patch.msgid.link/20250709130753.3994461-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agonet: phy: microchip: limit 100M workaround to link-down events on LAN88xx
Oleksij Rempel [Wed, 9 Jul 2025 13:07:53 +0000 (15:07 +0200)]
net: phy: microchip: limit 100M workaround to link-down events on LAN88xx

Restrict the 100Mbit forced-mode workaround to link-down transitions
only, to prevent repeated link reset cycles in certain configurations.

The workaround was originally introduced to improve signal reliability
when switching cables between long and short distances. It temporarily
forces the PHY into 10 Mbps before returning to 100 Mbps.

However, when used with autonegotiating link partners (e.g., Intel i350),
executing this workaround on every link change can confuse the partner
and cause constant renegotiation loops. This results in repeated link
down/up transitions and the PHY never reaching a stable state.

Limit the workaround to only run during the PHY_NOLINK state. This ensures
it is triggered only once per link drop, avoiding disruptive toggling
while still preserving its intended effect.

Note: I am not able to reproduce the original issue that this workaround
addresses. I can only confirm that 100 Mbit mode works correctly in my
test setup. Based on code inspection, I assume the workaround aims to
reset some internal state machine or signal block by toggling speeds.
However, a PHY reset is already performed earlier in the function via
phy_init_hw(), which may achieve a similar effect. Without a reproducer,
I conservatively keep the workaround but restrict its conditions.

Fixes: e57cf3639c32 ("net: lan78xx: fix accessing the LAN7800's internal phy specific registers from the MAC driver")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250709130753.3994461-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agonet: phy: microchip: Use genphy_soft_reset() to purge stale LPA bits
Oleksij Rempel [Wed, 9 Jul 2025 13:07:52 +0000 (15:07 +0200)]
net: phy: microchip: Use genphy_soft_reset() to purge stale LPA bits

Enable .soft_reset for the LAN88xx PHY driver by assigning
genphy_soft_reset() to ensure that the phylib core performs a proper
soft reset during reconfiguration.

Previously, the driver left .soft_reset unimplemented, so calls to
phy_init_hw() (e.g., from lan88xx_link_change_notify()) did not fully
reset the PHY. As a result, stale contents in the Link Partner Ability
(LPA) register could persist, causing the PHY to incorrectly report
that the link partner advertised autonegotiation even when it did not.

Using genphy_soft_reset() guarantees a clean reset of the PHY and
corrects the false autoneg reporting in these scenarios.

Fixes: ccb989e4d1ef ("net: phy: microchip: Reset LAN88xx PHY to ensure clean link state on LAN7800/7850")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250709130753.3994461-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agoibmvnic: Fix hardcoded NUM_RX_STATS/NUM_TX_STATS with dynamic sizeof
Mingming Cao [Wed, 9 Jul 2025 15:33:32 +0000 (08:33 -0700)]
ibmvnic: Fix hardcoded NUM_RX_STATS/NUM_TX_STATS with dynamic sizeof

The previous hardcoded definitions of NUM_RX_STATS and
NUM_TX_STATS were not updated when new fields were added
to the ibmvnic_{rx,tx}_queue_stats structures. Specifically,
commit 2ee73c54a615 ("ibmvnic: Add stat for tx direct vs tx
batched") added a fourth TX stat, but NUM_TX_STATS remained 3,
leading to a mismatch.

This patch replaces the static defines with dynamic sizeof-based
calculations to ensure the stat arrays are correctly sized.
This fixes incorrect indexing and prevents incomplete stat
reporting in tools like ethtool.

Fixes: 2ee73c54a615 ("ibmvnic: Add stat for tx direct vs tx batched")
Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
Reviewed-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250709153332.73892-1-mmc@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agonet: appletalk: Fix device refcount leak in atrtr_create()
Kito Xu [Wed, 9 Jul 2025 03:52:51 +0000 (03:52 +0000)]
net: appletalk: Fix device refcount leak in atrtr_create()

When updating an existing route entry in atrtr_create(), the old device
reference was not being released before assigning the new device,
leading to a device refcount leak. Fix this by calling dev_put() to
release the old device reference before holding the new one.

Fixes: c7f905f0f6d4 ("[ATALK]: Add missing dev_hold() to atrtr_create().")
Signed-off-by: Kito Xu <veritas501@foxmail.com>
Link: https://patch.msgid.link/tencent_E1A26771CDAB389A0396D1681A90A49E5D09@qq.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agoMerge tag 'wireless-2025-07-10' of https://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Fri, 11 Jul 2025 00:13:46 +0000 (17:13 -0700)]
Merge tag 'wireless-2025-07-10' of https://git./linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Quite a number of fixes still:

 - mt76 (hadn't sent any fixes so far)
   - RCU
   - scanning
   - decapsulation offload
   - interface combinations
 - rt2x00: build fix (bad function pointer prototype)
 - cfg80211: prevent A-MSDU flipping attacks in mesh
 - zd1211rw: prevent race ending with NULL ptr deref
 - cfg80211/mac80211: more S1G fixes
 - mwifiex: avoid WARN on certain RX frames
 - mac80211:
   - avoid stack data leak in WARN cases
   - fix non-transmitted BSSID search
     (on certain multi-BSSID APs)
   - always initialize key list so driver
     iteration won't crash
   - fix monitor interface in device restart
   - fix __free() annotation usage

* tag 'wireless-2025-07-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (26 commits)
  wifi: mac80211: add the virtual monitor after reconfig complete
  wifi: mac80211: always initialize sdata::key_list
  wifi: mac80211: Fix uninitialized variable with __free() in ieee80211_ml_epcs()
  wifi: mt76: mt792x: Limit the concurrent STA and SoftAP to operate on the same channel
  wifi: mt76: mt7925: Fix null-ptr-deref in mt7925_thermal_init()
  wifi: mt76: fix queue assignment for deauth packets
  wifi: mt76: add a wrapper for wcid access with validation
  wifi: mt76: mt7921: prevent decap offload config before STA initialization
  wifi: mt76: mt7925: prevent NULL pointer dereference in mt7925_sta_set_decap_offload()
  wifi: mt76: mt7925: fix incorrect scan probe IE handling for hw_scan
  wifi: mt76: mt7925: fix invalid array index in ssid assignment during hw scan
  wifi: mt76: mt7925: fix the wrong config for tx interrupt
  wifi: mt76: Remove RCU section in mt7996_mac_sta_rc_work()
  wifi: mt76: Move RCU section in mt7996_mcu_add_rate_ctrl()
  wifi: mt76: Move RCU section in mt7996_mcu_add_rate_ctrl_fixed()
  wifi: mt76: Move RCU section in mt7996_mcu_set_fixed_field()
  wifi: mt76: Assume __mt76_connac_mcu_alloc_sta_req runs in atomic context
  wifi: prevent A-MSDU attacks in mesh networks
  wifi: rt2x00: fix remove callback type mismatch
  wifi: mac80211: reject VHT opmode for unsupported channel widths
  ...
====================

Link: https://patch.msgid.link/20250710122212.24272-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agonetfilter: flowtable: account for Ethernet header in nf_flow_pppoe_proto()
Eric Dumazet [Mon, 7 Jul 2025 12:45:17 +0000 (12:45 +0000)]
netfilter: flowtable: account for Ethernet header in nf_flow_pppoe_proto()

syzbot found a potential access to uninit-value in nf_flow_pppoe_proto()

Blamed commit forgot the Ethernet header.

BUG: KMSAN: uninit-value in nf_flow_offload_inet_hook+0x7e4/0x940 net/netfilter/nf_flow_table_inet.c:27
  nf_flow_offload_inet_hook+0x7e4/0x940 net/netfilter/nf_flow_table_inet.c:27
  nf_hook_entry_hookfn include/linux/netfilter.h:157 [inline]
  nf_hook_slow+0xe1/0x3d0 net/netfilter/core.c:623
  nf_hook_ingress include/linux/netfilter_netdev.h:34 [inline]
  nf_ingress net/core/dev.c:5742 [inline]
  __netif_receive_skb_core+0x4aff/0x70c0 net/core/dev.c:5837
  __netif_receive_skb_one_core net/core/dev.c:5975 [inline]
  __netif_receive_skb+0xcc/0xac0 net/core/dev.c:6090
  netif_receive_skb_internal net/core/dev.c:6176 [inline]
  netif_receive_skb+0x57/0x630 net/core/dev.c:6235
  tun_rx_batched+0x1df/0x980 drivers/net/tun.c:1485
  tun_get_user+0x4ee0/0x6b40 drivers/net/tun.c:1938
  tun_chr_write_iter+0x3e9/0x5c0 drivers/net/tun.c:1984
  new_sync_write fs/read_write.c:593 [inline]
  vfs_write+0xb4b/0x1580 fs/read_write.c:686
  ksys_write fs/read_write.c:738 [inline]
  __do_sys_write fs/read_write.c:749 [inline]

Reported-by: syzbot+bf6ed459397e307c3ad2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/686bc073.a00a0220.c7b3.0086.GAE@google.com/T/#u
Fixes: 87b3593bed18 ("netfilter: flowtable: validate pppoe header")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/20250707124517.614489-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days agoirqchip/irq-msi-lib: Fix build with PCI disabled
Arnd Bergmann [Thu, 10 Jul 2025 08:00:12 +0000 (10:00 +0200)]
irqchip/irq-msi-lib: Fix build with PCI disabled

The armada-370-xp irqchip fails in some randconfig builds because
of a missing declaration:

In file included from drivers/irqchip/irq-armada-370-xp.c:23:
include/linux/irqchip/irq-msi-lib.h:25:39: error: 'struct msi_domain_info' declared inside parameter list will not be visible outside of this definition or declaration [-Werror]

Add a forward declaration for the msi_domain_info structure.

[ tglx: Fixed up the subsystem prefix. Is it really that hard to get right? ]

Fixes: e51b27438a10 ("irqchip: Make irq-msi-lib.h globally available")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20250710080021.2303640-1-arnd@kernel.org
11 days agoPCI/MSI: Prevent recursive locking in pci_msix_write_tph_tag()
Himanshu Madhani [Tue, 8 Jul 2025 22:25:30 +0000 (22:25 +0000)]
PCI/MSI: Prevent recursive locking in pci_msix_write_tph_tag()

pci_msix_write_tph_tag() takes the per device MSI descriptor mutex and then
invokes msi_domain_get_virq(), which takes the same mutex again. That
obviously results in a system hang which is exposed by a softlockup or
lockdep warning.

Move the lock guard after the invocation of msi_domain_get_virq() to fix
this.

[ tglx: Massage changelog by adding a proper explanation and removing the
   not really useful stacktrace ]

Fixes: d5124a9957b2 ("PCI/MSI: Provide a sane mechanism for TPH")
Reported-by: Jorge Lopez <jorge.jo.lopez@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jorge Lopez <jorge.jo.lopez@oracle.com>
Link: https://lore.kernel.org/all/20250708222530.1041477-1-himanshu.madhani@oracle.com
12 days agoMerge tag 'net-6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 10 Jul 2025 16:18:53 +0000 (09:18 -0700)]
Merge tag 'net-6.16-rc6' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from Bluetooth.

  Current release - regressions:

   - tcp: refine sk_rcvbuf increase for ooo packets

   - bluetooth: fix attempting to send HCI_Disconnect to BIS handle

   - rxrpc: fix over large frame size warning

   - eth: bcmgenet: initialize u64 stats seq counter

  Previous releases - regressions:

   - tcp: correct signedness in skb remaining space calculation

   - sched: abort __tc_modify_qdisc if parent class does not exist

   - vsock: fix transport_{g2h,h2g} TOCTOU

   - rxrpc: fix bug due to prealloc collision

   - tipc: fix use-after-free in tipc_conn_close().

   - bluetooth: fix not marking Broadcast Sink BIS as connected

   - phy: qca808x: fix WoL issue by utilizing at8031_set_wol()

   - eth: am65-cpsw-nuss: fix skb size by accounting for skb_shared_info

  Previous releases - always broken:

   - netlink: fix wraparounds of sk->sk_rmem_alloc.

   - atm: fix infinite recursive call of clip_push().

   - eth:
      - stmmac: fix interrupt handling for level-triggered mode in DWC_XGMAC2
      - rtsn: fix a null pointer dereference in rtsn_probe()"

* tag 'net-6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
  net/sched: sch_qfq: Fix null-deref in agg_dequeue
  rxrpc: Fix oops due to non-existence of prealloc backlog struct
  rxrpc: Fix bug due to prealloc collision
  MAINTAINERS: remove myself as netronome maintainer
  selftests/net: packetdrill: add tcp_ooo-before-and-after-accept.pkt
  tcp: refine sk_rcvbuf increase for ooo packets
  net/sched: Abort __tc_modify_qdisc if parent class does not exist
  net: ethernet: ti: am65-cpsw-nuss: Fix skb size by accounting for skb_shared_info
  net: thunderx: avoid direct MTU assignment after WRITE_ONCE()
  selftests/tc-testing: Create test case for UAF scenario with DRR/NETEM/BLACKHOLE chain
  atm: clip: Fix NULL pointer dereference in vcc_sendmsg()
  atm: clip: Fix infinite recursive call of clip_push().
  atm: clip: Fix memory leak of struct clip_vcc.
  atm: clip: Fix potential null-ptr-deref in to_atmarpd().
  net: phy: smsc: Fix link failure in forced mode with Auto-MDIX
  net: phy: smsc: Force predictable MDI-X state on LAN87xx
  net: phy: smsc: Fix Auto-MDIX configuration when disabled by strap
  net: stmmac: Fix interrupt handling for level-triggered mode in DWC_XGMAC2
  rxrpc: Fix over large frame size warning
  net: airoha: Fix an error handling path in airoha_probe()
  ...

12 days agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Thu, 10 Jul 2025 16:06:53 +0000 (09:06 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Many patches, pretty much all of them small, that accumulated while I
  was on vacation.

  ARM:

   - Remove the last leftovers of the ill-fated FPSIMD host state
     mapping at EL2 stage-1

   - Fix unexpected advertisement to the guest of unimplemented S2 base
     granule sizes

   - Gracefully fail initialising pKVM if the interrupt controller isn't
     GICv3

   - Also gracefully fail initialising pKVM if the carveout allocation
     fails

   - Fix the computing of the minimum MMIO range required for the host
     on stage-2 fault

   - Fix the generation of the GICv3 Maintenance Interrupt in nested
     mode

  x86:

   - Reject SEV{-ES} intra-host migration if one or more vCPUs are
     actively being created, so as not to create a non-SEV{-ES} vCPU in
     an SEV{-ES} VM

   - Use a pre-allocated, per-vCPU buffer for handling de-sparsification
     of vCPU masks in Hyper-V hypercalls; fixes a "stack frame too
     large" issue

   - Allow out-of-range/invalid Xen event channel ports when configuring
     IRQ routing, to avoid dictating a specific ioctl() ordering to
     userspace

   - Conditionally reschedule when setting memory attributes to avoid
     soft lockups when userspace converts huge swaths of memory to/from
     private

   - Add back MWAIT as a required feature for the MONITOR/MWAIT selftest

   - Add a missing field in struct sev_data_snp_launch_start that
     resulted in the guest-visible workarounds field being filled at the
     wrong offset

   - Skip non-canonical address when processing Hyper-V PV TLB flushes
     to avoid VM-Fail on INVVPID

   - Advertise supported TDX TDVMCALLs to userspace

   - Pass SetupEventNotifyInterrupt arguments to userspace

   - Fix TSC frequency underflow"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: avoid underflow when scaling TSC frequency
  KVM: arm64: Remove kvm_arch_vcpu_run_map_fp()
  KVM: arm64: Fix handling of FEAT_GTG for unimplemented granule sizes
  KVM: arm64: Don't free hyp pages with pKVM on GICv2
  KVM: arm64: Fix error path in init_hyp_mode()
  KVM: arm64: Adjust range correctly during host stage-2 faults
  KVM: arm64: nv: Fix MI line level calculation in vgic_v3_nested_update_mi()
  KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush
  KVM: SVM: Add missing member in SNP_LAUNCH_START command structure
  Documentation: KVM: Fix unexpected unindent warnings
  KVM: selftests: Add back the missing check of MONITOR/MWAIT availability
  KVM: Allow CPU to reschedule while setting per-page memory attributes
  KVM: x86/xen: Allow 'out of range' event channel ports in IRQ routing table.
  KVM: x86/hyper-v: Use preallocated per-vCPU buffer for de-sparsified vCPU masks
  KVM: SVM: Initialize vmsa_pa in VMCB to INVALID_PAGE if VMSA page is NULL
  KVM: SVM: Reject SEV{-ES} intra host migration if vCPU creation is in-flight
  KVM: TDX: Report supported optional TDVMCALLs in TDX capabilities
  KVM: TDX: Exit to userspace for SetupEventNotifyInterrupt

12 days agodrm/i915/bios: Apply vlv_fixup_mipi_sequences() to v2 mipi-sequences too
Hans de Goede [Mon, 7 Jul 2025 21:14:12 +0000 (23:14 +0200)]
drm/i915/bios: Apply vlv_fixup_mipi_sequences() to v2 mipi-sequences too

It turns out that the fixup from vlv_fixup_mipi_sequences() is necessary
for some DSI panel's with version 2 mipi-sequences too.

Specifically the Acer Iconia One 8 A1-840 (not to be confused with the
A1-840FHD which is different) has the following sequences:

BDB block 53 (1284 bytes) - MIPI sequence block:
Sequence block version v2
Panel 0 *

Sequence 2 - MIPI_SEQ_INIT_OTP
GPIO index 9, source 0, set 0 (0x00)
Delay: 50000 us
GPIO index 9, source 0, set 1 (0x01)
Delay: 6000 us
GPIO index 9, source 0, set 0 (0x00)
Delay: 6000 us
GPIO index 9, source 0, set 1 (0x01)
Delay: 25000 us
Send DCS: Port A, VC 0, LP, Type 39, Length 5, Data ff aa 55 a5 80
Send DCS: Port A, VC 0, LP, Type 39, Length 3, Data 6f 11 00
...
Send DCS: Port A, VC 0, LP, Type 05, Length 1, Data 29
Delay: 120000 us

Sequence 4 - MIPI_SEQ_DISPLAY_OFF
Send DCS: Port A, VC 0, LP, Type 05, Length 1, Data 28
Delay: 105000 us
Send DCS: Port A, VC 0, LP, Type 05, Length 2, Data 10 00
Delay: 10000 us

Sequence 5 - MIPI_SEQ_ASSERT_RESET
Delay: 10000 us
GPIO index 9, source 0, set 0 (0x00)

Notice how there is no MIPI_SEQ_DEASSERT_RESET, instead the deassert
is done at the beginning of MIPI_SEQ_INIT_OTP, which is exactly what
the fixup from vlv_fixup_mipi_sequences() fixes up.

Extend it to also apply to v2 sequences, this fixes the panel not working
on the Acer Iconia One 8 A1-840.

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14605
Signed-off-by: Hans de Goede <hansg@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20250703143824.7121-1-hansg@kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 11895f375939d60efe7ed5dddc1cffe2e79f976c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
12 days agodm-bufio: fix sched in atomic context
Sheng Yong [Thu, 10 Jul 2025 06:48:55 +0000 (14:48 +0800)]
dm-bufio: fix sched in atomic context

If "try_verify_in_tasklet" is set for dm-verity, DM_BUFIO_CLIENT_NO_SLEEP
is enabled for dm-bufio. However, when bufio tries to evict buffers, there
is a chance to trigger scheduling in spin_lock_bh, the following warning
is hit:

BUG: sleeping function called from invalid context at drivers/md/dm-bufio.c:2745
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 123, name: kworker/2:2
preempt_count: 201, expected: 0
RCU nest depth: 0, expected: 0
4 locks held by kworker/2:2/123:
 #0: ffff88800a2d1548 ((wq_completion)dm_bufio_cache){....}-{0:0}, at: process_one_work+0xe46/0x1970
 #1: ffffc90000d97d20 ((work_completion)(&dm_bufio_replacement_work)){....}-{0:0}, at: process_one_work+0x763/0x1970
 #2: ffffffff8555b528 (dm_bufio_clients_lock){....}-{3:3}, at: do_global_cleanup+0x1ce/0x710
 #3: ffff88801d5820b8 (&c->spinlock){....}-{2:2}, at: do_global_cleanup+0x2a5/0x710
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 2 UID: 0 PID: 123 Comm: kworker/2:2 Not tainted 6.16.0-rc3-g90548c634bd0 #305 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Workqueue: dm_bufio_cache do_global_cleanup
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 __might_resched+0x360/0x4e0
 do_global_cleanup+0x2f5/0x710
 process_one_work+0x7db/0x1970
 worker_thread+0x518/0xea0
 kthread+0x359/0x690
 ret_from_fork+0xf3/0x1b0
 ret_from_fork_asm+0x1a/0x30
 </TASK>

That can be reproduced by:

  veritysetup format --data-block-size=4096 --hash-block-size=4096 /dev/vda /dev/vdb
  SIZE=$(blockdev --getsz /dev/vda)
  dmsetup create myverity -r --table "0 $SIZE verity 1 /dev/vda /dev/vdb 4096 4096 <data_blocks> 1 sha256 <root_hash> <salt> 1 try_verify_in_tasklet"
  mount /dev/dm-0 /mnt -o ro
  echo 102400 > /sys/module/dm_bufio/parameters/max_cache_size_bytes
  [read files in /mnt]

Cc: stable@vger.kernel.org # v6.4+
Fixes: 450e8dee51aa ("dm bufio: improve concurrent IO performance")
Signed-off-by: Wang Shuai <wangshuai12@xiaomi.com>
Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
12 days agowifi: mac80211: add the virtual monitor after reconfig complete
Miri Korenblit [Wed, 9 Jul 2025 20:34:56 +0000 (23:34 +0300)]
wifi: mac80211: add the virtual monitor after reconfig complete

In reconfig we add the virtual monitor in 2 cases:
1. If we are resuming (it was deleted on suspend)
2. If it was added after an error but before the reconfig
   (due to the last non-monitor interface removal).

In the second case, the removal of the non-monitor interface will succeed
but the addition of the virtual monitor will fail, so we add it in the
reconfig.

The problem is that we mislead the driver to think that this is an existing
interface that is getting re-added - while it is actually a completely new
interface from the drivers' point of view.

Some drivers act differently when a interface is re-added. For example, it
might not initialize things because they were already initialized.
Such drivers will - in this case - be left with a partialy initialized vif.

To fix it, add the virtual monitor after reconfig_complete, so the
driver will know that this is a completely new interface.

Fixes: 3c3e21e7443b ("mac80211: destroy virtual monitor interface across suspend")
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233451.648d39b041e8.I2e37b68375278987e303d6c00cc5f3d8334d2f96@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
12 days agowifi: mac80211: always initialize sdata::key_list
Miri Korenblit [Wed, 9 Jul 2025 20:34:10 +0000 (23:34 +0300)]
wifi: mac80211: always initialize sdata::key_list

This is currently not initialized for a virtual monitor, leading to a
NULL pointer dereference when - for example - iterating over all the
keys of all the vifs.

Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233400.8dcefe578497.I4c90a00ae3256520e063199d7f6f2580d5451acf@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
12 days agoHID: debug: Remove duplicate entry (BTN_WHEEL)
Andy Shevchenko [Thu, 10 Jul 2025 09:41:20 +0000 (12:41 +0300)]
HID: debug: Remove duplicate entry (BTN_WHEEL)

BTN_WHEEL is duplicated (by value) and compiler is not happy about that:

drivers/hid/hid-debug.c:3302:16: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides]
 3302 |         [BTN_WHEEL] = "BtnWheel",               [KEY_OK] = "Ok",
      |                       ^~~~~~~~~~
drivers/hid/hid-debug.c:3301:20: note: previous initialization is here
 3301 |         [BTN_GEAR_DOWN] = "BtnGearDown",        [BTN_GEAR_UP] = "BtnGearUp",
      |                           ^~~~~~~~~~~~~

Remove it again, as the commit 7b2daa648eb7 ("HID: debug: Remove duplicates
from 'keys'") already did this once in the past.

Fixes: 194808a1ea39 ("HID: Fix debug name for BTN_GEAR_DOWN, BTN_GEAR_UP, BTN_WHEEL")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20250710094120.753358-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
12 days agonet/sched: sch_qfq: Fix null-deref in agg_dequeue
Xiang Mei [Sat, 5 Jul 2025 21:21:43 +0000 (14:21 -0700)]
net/sched: sch_qfq: Fix null-deref in agg_dequeue

To prevent a potential crash in agg_dequeue (net/sched/sch_qfq.c)
when cl->qdisc->ops->peek(cl->qdisc) returns NULL, we check the return
value before using it, similar to the existing approach in sch_hfsc.c.

To avoid code duplication, the following changes are made:

1. Changed qdisc_warn_nonwc(include/net/pkt_sched.h) into a static
inline function.

2. Moved qdisc_peek_len from net/sched/sch_hfsc.c to
include/net/pkt_sched.h so that sch_qfq can reuse it.

3. Applied qdisc_peek_len in agg_dequeue to avoid crashing.

Signed-off-by: Xiang Mei <xmei5@asu.edu>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Link: https://patch.msgid.link/20250705212143.3982664-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
12 days agoerofs: allow readdir() to be interrupted
Chao Yu [Thu, 10 Jul 2025 07:36:18 +0000 (15:36 +0800)]
erofs: allow readdir() to be interrupted

In a quick slow device, readdir() may loop for long time in large
directory, let's give a chance to allow it to be interrupted by
userspace.

Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250710073619.4083422-1-chao@kernel.org
[ Gao Xiang: move cond_resched() to the end of the while loop. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 days agoerofs: address D-cache aliasing
Gao Xiang [Wed, 9 Jul 2025 03:46:14 +0000 (11:46 +0800)]
erofs: address D-cache aliasing

Flush the D-cache before unlocking folios for compressed inodes, as
they are dirtied during decompression.

Avoid calling flush_dcache_folio() on every CPU write, since it's more
like playing whack-a-mole without real benefit.

It has no impact on x86 and arm64/risc-v: on x86, flush_dcache_folio()
is a no-op, and on arm64/risc-v, PG_dcache_clean (PG_arch_1) is clear
for new page cache folios.  However, certain ARM boards are affected,
as reported.

Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
Closes: https://lore.kernel.org/r/c1e51e16-6cc6-49d0-a63e-4e9ff6c4dd53@pengutronix.de
Closes: https://lore.kernel.org/r/38d43fae-1182-4155-9c5b-ffc7382d9917@siemens.com
Tested-by: Jan Kiszka <jan.kiszka@siemens.com>
Tested-by: Stefan Kerkmann <s.kerkmann@pengutronix.de>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250709034614.2780117-2-hsiangkao@linux.alibaba.com
12 days agoerofs: use memcpy_to_folio() to replace copy_to_iter()
Gao Xiang [Wed, 9 Jul 2025 03:46:13 +0000 (11:46 +0800)]
erofs: use memcpy_to_folio() to replace copy_to_iter()

Using copy_to_iter() here is overkill and even messy.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250709034614.2780117-1-hsiangkao@linux.alibaba.com
12 days agoerofs: fix to add missing tracepoint in erofs_read_folio()
Chao Yu [Tue, 8 Jul 2025 11:19:42 +0000 (19:19 +0800)]
erofs: fix to add missing tracepoint in erofs_read_folio()

Commit 771c994ea51f ("erofs: convert all uncompressed cases to iomap")
converts to use iomap interface, it removed trace_erofs_readpage()
tracepoint in the meantime, let's add it back.

Fixes: 771c994ea51f ("erofs: convert all uncompressed cases to iomap")
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250708111942.3120926-1-chao@kernel.org
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 days agoerofs: fix to add missing tracepoint in erofs_readahead()
Chao Yu [Mon, 7 Jul 2025 08:48:32 +0000 (16:48 +0800)]
erofs: fix to add missing tracepoint in erofs_readahead()

Commit 771c994ea51f ("erofs: convert all uncompressed cases to iomap")
converts to use iomap interface, it removed trace_erofs_readahead()
tracepoint in the meantime, let's add it back.

Fixes: 771c994ea51f ("erofs: convert all uncompressed cases to iomap")
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250707084832.2725677-1-chao@kernel.org
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
12 days agoRevert "sched/numa: add statistics of numa balance task"
Chen Yu [Fri, 4 Jul 2025 13:56:20 +0000 (21:56 +0800)]
Revert "sched/numa: add statistics of numa balance task"

This reverts commit ad6b26b6a0a79166b53209df2ca1cf8636296382.

This commit introduces per-memcg/task NUMA balance statistics, but
unfortunately it introduced a NULL pointer exception due to the following
race condition: After a swap task candidate was chosen, its mm_struct
pointer was set to NULL due to task exit.  Later, when performing the
actual task swapping, the p->mm caused the problem.

CPU0                                   CPU1
:
...
task_numa_migrate
     task_numa_find_cpu
      task_numa_compare
        # a normal task p is chosen
        env->best_task = p

                                          # p exit:
                                          exit_signals(p);
                                             p->flags |= PF_EXITING
                                          exit_mm
                                             p->mm = NULL;

      migrate_swap_stop
        __migrate_swap_task((arg->src_task, arg->dst_cpu)
         count_memcg_event_mm(p->mm, NUMA_TASK_SWAP)# p->mm is NULL

task_lock() should be held and the PF_EXITING flag needs to be checked to
prevent this from happening.  After discussion, the conclusion was that
adding a lock is not worthwhile for some statistics calculations.  Revert
the change and rely on the tracepoint for this purpose.

Link: https://lkml.kernel.org/r/20250704135620.685752-1-yu.c.chen@intel.com
Link: https://lkml.kernel.org/r/20250708064917.BBD13C4CEED@smtp.kernel.org
Fixes: ad6b26b6a0a7 ("sched/numa: add statistics of numa balance task")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reported-by: Jirka Hladky <jhladky@redhat.com>
Closes: https://lore.kernel.org/all/CAE4VaGBLJxpd=NeRJXpSCuw=REhC5LWJpC29kDy-Zh2ZDyzQZA@mail.gmail.com/
Reported-by: Srikanth Aithal <Srikanth.Aithal@amd.com>
Reported-by: Suneeth D <Suneeth.D@amd.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Hladky <jhladky@redhat.com>
Cc: Libo Chen <libo.chen@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agomm: fix the inaccurate memory statistics issue for users
Baolin Wang [Thu, 5 Jun 2025 12:58:29 +0000 (20:58 +0800)]
mm: fix the inaccurate memory statistics issue for users

On some large machines with a high number of CPUs running a 64K pagesize
kernel, we found that the 'RES' field is always 0 displayed by the top
command for some processes, which will cause a lot of confusion for users.

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 875525 root      20   0   12480      0      0 R   0.3   0.0   0:00.08 top
      1 root      20   0  172800      0      0 S   0.0   0.0   0:04.52 systemd

The main reason is that the batch size of the percpu counter is quite
large on these machines, caching a significant percpu value, since
converting mm's rss stats into percpu_counter by commit f1a7941243c1 ("mm:
convert mm's rss stats into percpu_counter").  Intuitively, the batch
number should be optimized, but on some paths, performance may take
precedence over statistical accuracy.  Therefore, introducing a new
interface to add the percpu statistical count and display it to users,
which can remove the confusion.  In addition, this change is not expected
to be on a performance-critical path, so the modification should be
acceptable.

In addition, the 'mm->rss_stat' is updated by using add_mm_counter() and
dec/inc_mm_counter(), which are all wrappers around
percpu_counter_add_batch().  In percpu_counter_add_batch(), there is
percpu batch caching to avoid 'fbc->lock' contention.  This patch changes
task_mem() and task_statm() to get the accurate mm counters under the
'fbc->lock', but this should not exacerbate kernel 'mm->rss_stat' lock
contention due to the percpu batch caching of the mm counters.  The
following test also confirm the theoretical analysis.

I run the stress-ng that stresses anon page faults in 32 threads on my 32
cores machine, while simultaneously running a script that starts 32
threads to busy-loop pread each stress-ng thread's /proc/pid/status
interface.  From the following data, I did not observe any obvious impact
of this patch on the stress-ng tests.

w/o patch:
stress-ng: info:  [6848]          4,399,219,085,152 CPU Cycles          67.327 B/sec
stress-ng: info:  [6848]          1,616,524,844,832 Instructions          24.740 B/sec (0.367 instr. per cycle)
stress-ng: info:  [6848]          39,529,792 Page Faults Total           0.605 M/sec
stress-ng: info:  [6848]          39,529,792 Page Faults Minor           0.605 M/sec

w/patch:
stress-ng: info:  [2485]          4,462,440,381,856 CPU Cycles          68.382 B/sec
stress-ng: info:  [2485]          1,615,101,503,296 Instructions          24.750 B/sec (0.362 instr. per cycle)
stress-ng: info:  [2485]          39,439,232 Page Faults Total           0.604 M/sec
stress-ng: info:  [2485]          39,439,232 Page Faults Minor           0.604 M/sec

On comparing a very simple app which just allocates & touches some
memory against v6.1 (which doesn't have f1a7941243c1) and latest Linus
tree (4c06e63b9203) I can see that on latest Linus tree the values for
VmRSS, RssAnon and RssFile from /proc/self/status are all zeroes while
they do report values on v6.1 and a Linus tree with this patch.

Link: https://lkml.kernel.org/r/f4586b17f66f97c174f7fd1f8647374fdb53de1c.1749119050.git.baolin.wang@linux.alibaba.com
Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Tested-by Donet Tom <donettom@linux.ibm.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: SeongJae Park <sj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agomm/damon: fix divide by zero in damon_get_intervals_score()
Honggyu Kim [Wed, 2 Jul 2025 00:02:04 +0000 (09:02 +0900)]
mm/damon: fix divide by zero in damon_get_intervals_score()

The current implementation allows having zero size regions with no special
reasons, but damon_get_intervals_score() gets crashed by divide by zero
when the region size is zero.

  [   29.403950] Oops: divide error: 0000 [#1] SMP NOPTI

This patch fixes the bug, but does not disallow zero size regions to keep
the backward compatibility since disallowing zero size regions might be a
breaking change for some users.

In addition, the same crash can happen when intervals_goal.access_bp is
zero so this should be fixed in stable trees as well.

Link: https://lkml.kernel.org/r/20250702000205.1921-5-honggyu.kim@sk.com
Fixes: f04b0fedbe71 ("mm/damon/core: implement intervals auto-tuning")
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agosamples/damon: fix damon sample mtier for start failure
Honggyu Kim [Wed, 2 Jul 2025 00:02:03 +0000 (09:02 +0900)]
samples/damon: fix damon sample mtier for start failure

The damon_sample_mtier_start() can fail so we must reset the "enable"
parameter to "false" again for proper rollback.

In such cases, setting Y to "enable" then N triggers the similar crash
with mtier because damon sample start failed but the "enable" stays as Y.

Link: https://lkml.kernel.org/r/20250702000205.1921-4-honggyu.kim@sk.com
Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering")
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agosamples/damon: fix damon sample wsse for start failure
Honggyu Kim [Wed, 2 Jul 2025 00:02:02 +0000 (09:02 +0900)]
samples/damon: fix damon sample wsse for start failure

The damon_sample_wsse_start() can fail so we must reset the "enable"
parameter to "false" again for proper rollback.

In such cases, setting Y to "enable" then N triggers the similar crash
with wsse because damon sample start failed but the "enable" stays as Y.

Link: https://lkml.kernel.org/r/20250702000205.1921-3-honggyu.kim@sk.com
Fixes: b757c6cfc696 ("samples/damon/wsse: start and stop DAMON as the user requests")
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agosamples/damon: fix damon sample prcl for start failure
Honggyu Kim [Wed, 2 Jul 2025 00:02:01 +0000 (09:02 +0900)]
samples/damon: fix damon sample prcl for start failure

Patch series "mm/damon: fix divide by zero and its samples", v3.

This series includes fixes against damon and its samples to make it safer
when damon sample starting fails.

It includes the following changes.
- fix unexpected divide by zero crash for zero size regions
- fix bugs for damon samples in case of start failures

This patch (of 4):

The damon_sample_prcl_start() can fail so we must reset the "enable"
parameter to "false" again for proper rollback.

In such cases, setting Y to "enable" then N triggers the following crash
because damon sample start failed but the "enable" stays as Y.

  [ 2441.419649] damon_sample_prcl: start
  [ 2454.146817] damon_sample_prcl: stop
  [ 2454.146862] ------------[ cut here ]------------
  [ 2454.146865] kernel BUG at mm/slub.c:546!
  [ 2454.148183] Oops: invalid opcode: 0000 [#1] SMP NOPTI
   ...
  [ 2454.167555] Call Trace:
  [ 2454.167822]  <TASK>
  [ 2454.168061]  damon_destroy_ctx+0x78/0x140
  [ 2454.168454]  damon_sample_prcl_enable_store+0x8d/0xd0
  [ 2454.168932]  param_attr_store+0xa1/0x120
  [ 2454.169315]  module_attr_store+0x20/0x50
  [ 2454.169695]  sysfs_kf_write+0x72/0x90
  [ 2454.170065]  kernfs_fop_write_iter+0x150/0x1e0
  [ 2454.170491]  vfs_write+0x315/0x440
  [ 2454.170833]  ksys_write+0x69/0xf0
  [ 2454.171162]  __x64_sys_write+0x19/0x30
  [ 2454.171525]  x64_sys_call+0x18b2/0x2700
  [ 2454.171900]  do_syscall_64+0x7f/0x680
  [ 2454.172258]  ? exit_to_user_mode_loop+0xf6/0x180
  [ 2454.172694]  ? clear_bhb_loop+0x30/0x80
  [ 2454.173067]  ? clear_bhb_loop+0x30/0x80
  [ 2454.173439]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

Link: https://lkml.kernel.org/r/20250702000205.1921-1-honggyu.kim@sk.com
Link: https://lkml.kernel.org/r/20250702000205.1921-2-honggyu.kim@sk.com
Fixes: 2aca254620a8 ("samples/damon: introduce a skeleton of a smaple DAMON module for proactive reclamation")
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agokasan: remove kasan_find_vm_area() to prevent possible deadlock
Yeoreum Yun [Thu, 3 Jul 2025 18:10:18 +0000 (19:10 +0100)]
kasan: remove kasan_find_vm_area() to prevent possible deadlock

find_vm_area() couldn't be called in atomic_context.  If find_vm_area() is
called to reports vm area information, kasan can trigger deadlock like:

CPU0                                CPU1
vmalloc();
 alloc_vmap_area();
  spin_lock(&vn->busy.lock)
                                    spin_lock_bh(&some_lock);
   <interrupt occurs>
   <in softirq>
   spin_lock(&some_lock);
                                    <access invalid address>
                                    kasan_report();
                                     print_report();
                                      print_address_description();
                                       kasan_find_vm_area();
                                        find_vm_area();
                                         spin_lock(&vn->busy.lock) // deadlock!

To prevent possible deadlock while kasan reports, remove kasan_find_vm_area().

Link: https://lkml.kernel.org/r/20250703181018.580833-1-yeoreum.yun@arm.com
Fixes: c056a364e954 ("kasan: print virtual mapping info in reports")
Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
Reported-by: Yunseong Kim <ysk@kzalloc.com>
Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agoscripts: gdb: vfs: support external dentry names
Illia Ostapyshyn [Sun, 29 Jun 2025 00:38:11 +0000 (02:38 +0200)]
scripts: gdb: vfs: support external dentry names

d_shortname of struct dentry only reserves D_NAME_INLINE_LEN characters
and contains garbage for longer names.  Use d_name instead, which always
references the valid name.

Link: https://lore.kernel.org/all/20250525213709.878287-2-illia@yshyn.com/
Link: https://lkml.kernel.org/r/20250629003811.2420418-1-illia@yshyn.com
Fixes: 79300ac805b6 ("scripts/gdb: fix dentry_name() lookup")
Signed-off-by: Illia Ostapyshyn <illia@yshyn.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agomm/migrate: fix do_pages_stat in compat mode
Christoph Berg [Tue, 24 Jun 2025 14:44:27 +0000 (16:44 +0200)]
mm/migrate: fix do_pages_stat in compat mode

For arrays with more than 16 entries, the old code would incorrectly
advance the pages pointer by 16 words instead of 16 compat_uptr_t.  Fix by
doing the pointer arithmetic inside get_compat_pages_array where pages32
is already a correctly-typed pointer.

Discovered while working on PostgreSQL 18's new NUMA introspection code.

Link: https://lkml.kernel.org/r/aGREU0XTB48w9CwN@msg.df7cb.de
Fixes: 5b1b561ba73c ("mm: simplify compat_sys_move_pages")
Signed-off-by: Christoph Berg <myon@debian.org>
Acked-by: David Hildenbrand <david@redhat.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Reported-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reported-by: Tomas Vondra <tomas@vondra.me>
Closes: https://www.postgresql.org/message-id/flat/6342f601-77de-4ee0-8c2a-3deb50ceac5b%40vondra.me#86402e3d80c031788f5f55b42c459471
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agomm/damon/core: handle damon_call_control as normal under kdmond deactivation
SeongJae Park [Sun, 29 Jun 2025 20:49:14 +0000 (13:49 -0700)]
mm/damon/core: handle damon_call_control as normal under kdmond deactivation

DAMON sysfs interface internally uses damon_call() to update DAMON
parameters as users requested, online.  However, DAMON core cancels any
damon_call() requests when it is deactivated by DAMOS watermarks.

As a result, users cannot change DAMON parameters online while DAMON is
deactivated.  Note that users can turn DAMON off and on with different
watermarks to work around.  Since deactivated DAMON is nearly same to
stopped DAMON, the work around should have no big problem.  Anyway, a bug
is a bug.

There is no real good reason to cancel the damon_call() request under
DAMOS deactivation.  Fix it by simply handling the request as normal,
rather than cancelling under the situation.

Link: https://lkml.kernel.org/r/20250629204914.54114-1-sj@kernel.org
Fixes: 42b7491af14c ("mm/damon/core: introduce damon_call()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agomm/rmap: fix potential out-of-bounds page table access during batched unmap
Lance Yang [Fri, 27 Jun 2025 06:23:19 +0000 (14:23 +0800)]
mm/rmap: fix potential out-of-bounds page table access during batched unmap

As pointed out by David[1], the batched unmap logic in
try_to_unmap_one() may read past the end of a PTE table when a large
folio's PTE mappings are not fully contained within a single page
table.

While this scenario might be rare, an issue triggerable from userspace
must be fixed regardless of its likelihood.  This patch fixes the
out-of-bounds access by refactoring the logic into a new helper,
folio_unmap_pte_batch().

The new helper correctly calculates the safe batch size by capping the
scan at both the VMA and PMD boundaries.  To simplify the code, it also
supports partial batching (i.e., any number of pages from 1 up to the
calculated safe maximum), as there is no strong reason to special-case
for fully mapped folios.

Link: https://lkml.kernel.org/r/20250701143100.6970-1-lance.yang@linux.dev
Link: https://lkml.kernel.org/r/20250630011305.23754-1-lance.yang@linux.dev
Link: https://lkml.kernel.org/r/20250627062319.84936-1-lance.yang@linux.dev
Link: https://lore.kernel.org/linux-mm/a694398c-9f03-4737-81b9-7e49c857fcbe@redhat.com
Fixes: 354dffd29575 ("mm: support batched unmap for lazyfree large folios during reclamation")
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Suggested-by: David Hildenbrand <david@redhat.com>
Reported-by: David Hildenbrand <david@redhat.com>
Closes: https://lore.kernel.org/linux-mm/a694398c-9f03-4737-81b9-7e49c857fcbe@redhat.com
Suggested-by: Barry Song <baohua@kernel.org>
Acked-by: Barry Song <baohua@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: "Huang, Ying" <huang.ying.caritas@gmail.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mingzhe Yang <mingzhe.yang@ly.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Tangquan Zheng <zhengtangquan@oppo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agomm/hugetlb: don't crash when allocating a folio if there are no resv
Vivek Kasireddy [Thu, 26 Jun 2025 19:11:16 +0000 (12:11 -0700)]
mm/hugetlb: don't crash when allocating a folio if there are no resv

There are cases when we try to pin a folio but discover that it has not
been faulted-in.  So, we try to allocate it in memfd_alloc_folio() but
there is a chance that we might encounter a fatal crash/failure
(VM_BUG_ON(!h->resv_huge_pages) in alloc_hugetlb_folio_reserve()) if there
are no active reservations at that instant.  This issue was reported by
syzbot:

kernel BUG at mm/hugetlb.c:2403!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 5315 Comm: syz.0.0 Not tainted
6.13.0-rc5-syzkaller-00161-g63676eefb7a0 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:alloc_hugetlb_folio_reserve+0xbc/0xc0 mm/hugetlb.c:2403
Code: 1f eb 05 e8 56 18 a0 ff 48 c7 c7 40 56 61 8e e8 ba 21 cc 09 4c 89
f0 5b 41 5c 41 5e 41 5f 5d c3 cc cc cc cc e8 35 18 a0 ff 90 <0f> 0b 66
90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f
RSP: 0018:ffffc9000d3d77f8 EFLAGS: 00010087
RAX: ffffffff81ff6beb RBX: 0000000000000000 RCX: 0000000000100000
RDX: ffffc9000e51a000 RSI: 00000000000003ec RDI: 00000000000003ed
RBP: 1ffffffff34810d9 R08: ffffffff81ff6ba3 R09: 1ffffd4000093005
R10: dffffc0000000000 R11: fffff94000093006 R12: dffffc0000000000
R13: dffffc0000000000 R14: ffffea0000498000 R15: ffffffff9a4086c8
FS:  00007f77ac12e6c0(0000) GS:ffff88801fc00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f77ab54b170 CR3: 0000000040b70000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 memfd_alloc_folio+0x1bd/0x370 mm/memfd.c:88
 memfd_pin_folios+0xf10/0x1570 mm/gup.c:3750
 udmabuf_pin_folios drivers/dma-buf/udmabuf.c:346 [inline]
 udmabuf_create+0x70e/0x10c0 drivers/dma-buf/udmabuf.c:443
 udmabuf_ioctl_create drivers/dma-buf/udmabuf.c:495 [inline]
 udmabuf_ioctl+0x301/0x4e0 drivers/dma-buf/udmabuf.c:526
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:906 [inline]
 __se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Therefore, prevent the above crash by removing the VM_BUG_ON() as there is
no need to crash the system in this situation and instead we could just
fail the allocation request.

Furthermore, as described above, the specific situation where this happens
is when we try to pin memfd folios before they are faulted-in.  Although,
this is a valid thing to do, it is not the regular or the common use-case.
Let us consider the following scenarios:

1) hugetlbfs_file_mmap()
    memfd_alloc_folio()
    hugetlb_fault()

2) memfd_alloc_folio()
    hugetlbfs_file_mmap()
    hugetlb_fault()

3) hugetlbfs_file_mmap()
    hugetlb_fault()
        alloc_hugetlb_folio()

3) is the most common use-case where first a memfd is allocated followed
by mmap(), user writes/updates and then the relevant folios are pinned
(memfd_pin_folios()).  The BUG this patch is fixing occurs in 2) because
we try to pin the folios before hugetlbfs_file_mmap() is called.  So, in
this situation we try to allocate the folios before pinning them but since
we did not make any reservations, resv_huge_pages would be 0, leading to
this issue.

Link: https://lkml.kernel.org/r/20250626191116.1377761-1-vivek.kasireddy@intel.com
Fixes: 26a8ea80929c ("mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak")
Reported-by: syzbot+a504cb5bae4fe117ba94@syzkaller.appspotmail.com
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Closes: https://syzkaller.appspot.com/bug?extid=a504cb5bae4fe117ba94
Closes: https://lore.kernel.org/all/677928b5.050a0220.3b53b0.004d.GAE@google.com/T/
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Steve Sistare <steven.sistare@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
12 days agoscripts/gdb: de-reference per-CPU MCE interrupts
Florian Fainelli [Tue, 24 Jun 2025 03:00:19 +0000 (20:00 -0700)]
scripts/gdb: de-reference per-CPU MCE interrupts

The per-CPU MCE interrupts are looked up by reference and need to be
de-referenced before printing, otherwise we print the addresses of the
variables instead of their contents:

MCE: 18379471554386948492   Machine check exceptions
MCP: 18379471554386948488   Machine check polls

The corrected output looks like this instead now:

MCE:          0   Machine check exceptions
MCP:          1   Machine check polls

Link: https://lkml.kernel.org/r/20250625021109.1057046-1-florian.fainelli@broadcom.com
Link: https://lkml.kernel.org/r/20250624030020.882472-1-florian.fainelli@broadcom.com
Fixes: b0969d7687a7 ("scripts/gdb: print interrupts")
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>