Breno Leitao [Wed, 16 Jul 2025 17:38:48 +0000 (10:38 -0700)]
sched/ext: Prevent update_locked_rq() calls with NULL rq
Avoid invoking update_locked_rq() when the runqueue (rq) pointer is NULL
in the SCX_CALL_OP and SCX_CALL_OP_RET macros.
Previously, calling update_locked_rq(NULL) with preemption enabled could
trigger the following warning:
BUG: using __this_cpu_write() in preemptible [00000000]
This happens because __this_cpu_write() is unsafe to use in preemptible
context.
rq is NULL when an op is invoked from an unlocked context. In such cases, we
don't need to store any rq, since the value should already be NULL
(unlocked). Ensure that update_locked_rq() is only called when rq is
non-NULL, preventing calls to __this_cpu_write() in preemptible context.
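The guard described above can be modelled in userspace as follows. This is only a sketch: CALL_OP, locked_rq_slot and slot_writes are hypothetical stand-ins, a plain global replaces the per-CPU variable, and the real SCX_CALL_OP macros take more arguments.

```c
#include <assert.h>
#include <stddef.h>

struct rq { int cpu; };

/* Userspace stand-in for the per-CPU "currently locked rq" slot. */
static struct rq *locked_rq_slot;
static int slot_writes;	/* counts writes, only to make the effect visible */

static void update_locked_rq(struct rq *rq)
{
	/* In the kernel, this is where __this_cpu_write() would happen. */
	locked_rq_slot = rq;
	slot_writes++;
}

/* Hypothetical simplified call macro: touch the slot only if rq is set. */
#define CALL_OP(rq, op)				\
do {						\
	if (rq)					\
		update_locked_rq(rq);		\
	(op)();					\
	if (rq)					\
		update_locked_rq(NULL);		\
} while (0)

static int op_calls;

static void sample_op(void)
{
	op_calls++;
}

/* Returns how many slot writes a single op invocation caused. */
static int writes_for_call(struct rq *rq)
{
	int before = slot_writes;

	CALL_OP(rq, sample_op);
	return slot_writes - before;
}
```

With a NULL rq the slot is never written, so no per-CPU write happens from the unlocked (possibly preemptible) path; with a real rq it is set before the op and cleared afterwards.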
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Fixes: 18853ba782bef ("sched_ext: Track currently locked rq")
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v6.15
Jakub Kicinski [Wed, 16 Jul 2025 23:17:34 +0000 (16:17 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-07-15 (ixgbe, fm10k, i40e, ice)
Arnd Bergmann resolves compile issues with large NR_CPUS for ixgbe, fm10k,
and i40e.
For ice:
Dave adds a NULL check for LAG netdev.
Michal corrects a pointer check in debugfs initialization.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: check correct pointer in fwlog debugfs
ice: add NULL check in eswitch lag check
ethernet: intel: fix building with large NR_CPUS
====================
Link: https://patch.msgid.link/20250715202948.3841437-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alok Tiwari [Tue, 15 Jul 2025 14:30:58 +0000 (07:30 -0700)]
net: airoha: fix potential use-after-free in airoha_npu_get()
np->name was being used after calling of_node_put(np), which
releases the node and can lead to a use-after-free bug.
Previously, of_node_put(np) was called unconditionally after
of_find_device_by_node(np), which could result in a use-after-free if
pdev is NULL.
This patch moves of_node_put(np) after the error check to ensure
the node is only released after both the error and success cases
are handled appropriately, preventing potential resource issues.
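A minimal userspace sketch of the corrected ordering, assuming a mock refcounted node. mock_node, mock_node_put() and lookup_fixed() are illustrative names and not the driver's code; `pdev` stands in for the result of the device lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock of a refcounted device-tree node; not the kernel's struct device_node. */
struct mock_node {
	int refcount;
	const char *name;
};

static void mock_node_put(struct mock_node *np)
{
	np->refcount--;
}

static char last_error[64];

/*
 * Fixed ordering: np->name is read while the reference is still held,
 * and the reference is dropped exactly once on both paths.
 */
static int lookup_fixed(struct mock_node *np, const void *pdev)
{
	if (!pdev) {
		assert(np->refcount > 0);	/* name is still safe to read */
		strncpy(last_error, np->name, sizeof(last_error) - 1);
		mock_node_put(np);
		return -1;
	}
	mock_node_put(np);
	return 0;
}
```

The point of the sketch is only the ordering: the put comes after the last use of np on the error path, rather than before the error check.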
Fixes: 23290c7bc190 ("net: airoha: Introduce Airoha NPU support")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250715143102.3458286-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Christoph Paasch [Tue, 15 Jul 2025 20:20:53 +0000 (13:20 -0700)]
net/mlx5: Correctly set gso_size when LRO is used
gso_size is expected by the networking stack to be the size of the
payload (thus, not including ethernet/IP/TCP-headers). However, cqe_bcnt
is the full sized frame (including the headers). Dividing cqe_bcnt by
lro_num_seg will then give incorrect results.
For example, running a bpftrace higher up in the TCP-stack
(tcp_event_data_recv), we commonly have gso_size set to 1450 or 1451 even
though in reality the payload was only 1448 bytes.
This can have unintended consequences:
- In tcp_measure_rcv_mss() len will be for example 1450, but rcv_mss
will be 1448 (because tp->advmss is 1448). Thus, we will always
recompute scaling_ratio each time an LRO-packet is received.
- In tcp_gro_receive(), it will interfere with the decision whether or
not to flush and thus potentially result in less gro'ed packets.
So, we need to discount the protocol headers from cqe_bcnt so we can
actually divide the payload by lro_num_seg to get the real gso_size.
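The arithmetic can be checked with a small standalone sketch. The 66-byte header length (14 Ethernet + 20 IPv4 + 32 TCP with timestamps) and the function names are assumptions for illustration, and rounding up to cover a short last segment is a choice made here rather than something the message states.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour as described: headers are counted as payload. */
static uint32_t gso_size_naive(uint32_t cqe_bcnt, uint32_t lro_num_seg)
{
	assert(lro_num_seg > 0);
	return cqe_bcnt / lro_num_seg;
}

/* Discount the headers first, then divide, rounding up. */
static uint32_t gso_size_payload(uint32_t cqe_bcnt, uint32_t hdr_len,
				 uint32_t lro_num_seg)
{
	assert(lro_num_seg > 0 && cqe_bcnt >= hdr_len);
	return (cqe_bcnt - hdr_len + lro_num_seg - 1) / lro_num_seg;
}
```

For 33 coalesced segments of 1448 bytes, cqe_bcnt is 66 + 33 * 1448 = 47850; the naive division yields 1450, while discounting the headers yields 1448.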
v2:
- Use "((unsigned char *)tcp + tcp->doff * 4 - skb->data)" to compute header-len
(Tariq Toukan <tariqt@nvidia.com>)
- Improve commit-message (Gal Pressman <gal@nvidia.com>)
Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250715-cpaasch-pf-925-investigate-incorrect-gso_size-on-cx-7-nic-v2-1-e06c3475f3ac@openai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kent Overstreet [Wed, 16 Jul 2025 21:31:31 +0000 (17:31 -0400)]
bcachefs: Fix bch2_maybe_casefold() when CONFIG_UTF8=n
maybe_casefold() shouldn't have been nooped, just bch2_casefold().
Fixes: 94426e4201fb ("bcachefs: opts.casefold_disabled")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Jul 2025 14:01:00 +0000 (10:01 -0400)]
bcachefs: Fix build when CONFIG_UNICODE=n
94426e4201fb, which added the killswitch for casefolding, accidentally
removed some of the ifdefs we need to avoid build errors.
It appears we need better build testing for different configurations; it
took two weeks for the robots to catch this one.
Fixes: 94426e4201fb ("bcachefs: opts.casefold_disabled")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 13 Jul 2025 21:19:34 +0000 (17:19 -0400)]
bcachefs: Fix reference to invalid bucket in copygc
Use bch2_dev_bucket_tryget() instead of bch2_dev_tryget() before
checking the bucket bitmap.
Reported-by: syzbot+3168625f36f4a539237e@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 13 Jul 2025 17:31:33 +0000 (13:31 -0400)]
bcachefs: Don't build aux search tree when still repairing node
bch2_btree_node_drop_keys_outside_node() will (re)build aux search
trees, because it's also called by topology repair.
bch2_btree_node_read_done() was calling it before validating individual
keys; invalid ones have to be dropped.
If we call drop_keys_outside_node() first, then
bch2_bset_build_aux_tree() doesn't run because the node already has an
aux search tree - which was invalidated by the repair.
Reported-by: syzbot+c5e7a66b3b23ae65d44f@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 12 Jul 2025 23:33:12 +0000 (19:33 -0400)]
bcachefs: Tweak threshold for allocator triggering discards
The allocator path has a "if we're really low on free buckets, check if
we should issue discards" - tweak this to also trigger discards if more
than 1/128th of the device is in need_discard state.
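As a rough sketch of the tweaked condition, assuming hypothetical names and a simple bucket count (the real allocator code is structured differently):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Check for discards either when free buckets are scarce, or when more
 * than 1/128th of the device is waiting in need_discard state.
 */
static bool should_check_discards(uint64_t free_buckets, uint64_t low_watermark,
				  uint64_t need_discard, uint64_t nr_buckets)
{
	bool low_on_free = free_buckets < low_watermark;
	bool many_pending = need_discard > nr_buckets / 128;

	return low_on_free || many_pending;
}
```

On a device with 128000 buckets, the second condition becomes true once more than 1000 buckets need a discard, regardless of how many free buckets remain.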
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 12 Jul 2025 23:31:49 +0000 (19:31 -0400)]
bcachefs: Fix triggering of discard by the journal path
It becomes possible to do discards after a journal flush, which
naturally the journal code is responsible for.
A prior refactoring seems to have broken this - which went unnoticed
because the foreground allocator path can also trigger discards.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Eeli Haapalainen [Mon, 14 Jul 2025 05:13:09 +0000 (08:13 +0300)]
drm/amdgpu/gfx8: reset compute ring wptr on the GPU on resume
Commit 42cdf6f687da ("drm/amdgpu/gfx8: always restore kcq MQDs") made the
ring pointer always be reset on resume from suspend. This caused compute
rings to fail since the reset was done without also resetting it for the
firmware. Reset wptr on the GPU to avoid a disconnect between the driver
and firmware wptr.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3911
Fixes: 42cdf6f687da ("drm/amdgpu/gfx8: always restore kcq MQDs")
Signed-off-by: Eeli Haapalainen <eeli.haapalainen@protonmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2becafc319db3d96205320f31cc0de4ee5a93747)
Cc: stable@vger.kernel.org
Lijo Lazar [Mon, 14 Jul 2025 05:07:00 +0000 (10:37 +0530)]
drm/amdgpu: Increase reset counter only on success
Increment the reset counter only if soft recovery succeeded. This is
consistent with a ring hard reset behaviour where counter gets
incremented only if hard reset succeeded.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 25c314aa3ec3d30e4ee282540e2096b5c66a2437)
Cc: stable@vger.kernel.org
Thomas Zimmermann [Tue, 15 Jul 2025 09:50:54 +0000 (11:50 +0200)]
drm/radeon: Do not hold console lock during resume
The function radeon_resume_kms() acquires the console lock. It is
inconsistent, as it depends on the notify_client argument. That
lock then covers a number of suspend operations that are unrelated
to the console.
Remove the calls to console_lock() and console_unlock() from the
radeon function. The console lock is only required by DRM's fbdev
emulation, which acquires it as necessary.
Also fixes a possible circular dependency between the console lock
and the client-list mutex, where the mutex is supposed to be taken
first.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fff8e0504499a929f26e2fb7cf7e2c9854e37b91)
Thomas Zimmermann [Tue, 15 Jul 2025 09:50:53 +0000 (11:50 +0200)]
drm/radeon: Do not hold console lock while suspending clients
The radeon driver holds the console lock while suspending in-kernel
DRM clients. This creates a circular dependency with the client-list
mutex, which is supposed to be acquired first. Reported when combining
radeon with another DRM driver.
Therefore, do not take the console lock in radeon, but let the fbdev
DRM client acquire the lock when needed. This is what all other DRM
drivers do.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reported-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Closes: https://lore.kernel.org/dri-devel/0a087cfd-bd4c-48f1-aa2f-4a3b12593935@oss.qualcomm.com/
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 612ec7c69d04cb58beb1332c2806da9f2f47a3ae)
Melissa Wen [Mon, 7 Jul 2025 20:52:05 +0000 (16:52 -0400)]
drm/amd/display: Disable CRTC degamma LUT for DCN401
In DCN401, the pre-blending degamma LUT does not affect the cursor as it
did in previous DCN versions. As this is not close to the behavior expected
of the CRTC degamma LUT, disable the CRTC degamma LUT property on this HW.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4176
---
When enabling HDR on KDE, it takes the first CRTC 1D LUT available and
applies a color transformation (Gamma 2.2 -> PQ). AMD driver usually
advertises a CRTC degamma LUT as the first CRTC 1D LUT, but it's
actually applied pre-blending. In previous HW version, it seems to work
fine because the 1D LUT was applied to cursor too, but DCN401 presents a
different behavior and the 1D LUT isn't affecting the hardware cursor.
To address the wrong gamma on cursor with HDR (see the link), I came up
with this patch that disables CRTC degamma LUT in this hw, since it
presents a different behavior than others. With this KDE sees CRTC
regamma LUT as the first post-blending 1D LUT available. This is
actually more consistent with AMD color pipeline. It was tested by the
reporter, since I don't have the HW available for local testing and
debugging.
Melissa
---
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 340231cdceec2c45995d773a358ca3c341f151aa)
Cc: stable@vger.kernel.org
Clayton King [Thu, 19 Jun 2025 17:54:26 +0000 (13:54 -0400)]
drm/amd/display: Free memory allocation
[WHY]
Free memory to avoid memory leak
Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Clayton King <clayton.king@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fa699acb8e9be2341ee318077fa119acc7d5f329)
Cc: stable@vger.kernel.org
Linus Torvalds [Wed, 16 Jul 2025 20:00:38 +0000 (13:00 -0700)]
Merge tag 'probes-fixes-v6.16-rc6' of git://git./linux/kernel/git/trace/linux-trace
Pull probes fix from Masami Hiramatsu:
- fprobe-event: The @params variable was being used in an error path
without being initialized. The fix is to return an error code.
* tag 'probes-fixes-v6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: Avoid using params uninitialized in parse_btf_arg()
Zijun Hu [Tue, 15 Jul 2025 12:40:13 +0000 (20:40 +0800)]
Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID
For the GF variant of WCN6855 without a board ID programmed,
btusb_generate_qca_nvm_name() will choose the wrong NVM
'qca/nvm_usb_00130201.bin' to download.
Fix by choosing the right NVM 'qca/nvm_usb_00130201_gf.bin'.
Also simplify NVM choice logic of btusb_generate_qca_nvm_name().
Fixes: d6cba4e6d0e2 ("Bluetooth: btusb: Add support using different nvm for variant WCN6855 controller")
Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Christian Eggers [Mon, 14 Jul 2025 20:27:45 +0000 (22:27 +0200)]
Bluetooth: hci_dev: replace 'quirks' integer by 'quirk_flags' bitmap
The 'quirks' member already ran out of bits on some platforms some time
ago. Replace the integer member by a bitmap in order to have enough bits
in future. Replace raw bit operations by accessor macros.
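The idea can be sketched in portable C as follows. dev_model, NUM_QUIRKS and the dev_*_quirk() macros are hypothetical names chosen for illustration; the kernel has its own bitmap helpers and the actual accessor names may differ.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)
#define NUM_QUIRKS	40	/* hypothetical count, more than 32 */
#define QUIRK_WORDS	((NUM_QUIRKS + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* A bitmap grows with NUM_QUIRKS instead of being capped by one integer. */
struct dev_model {
	unsigned long quirk_flags[QUIRK_WORDS];
};

/* Accessor macros hide the word/bit arithmetic from callers. */
#define dev_set_quirk(dev, nr) \
	((dev)->quirk_flags[(nr) / BITS_PER_ULONG] |= \
	 1UL << ((nr) % BITS_PER_ULONG))
#define dev_clear_quirk(dev, nr) \
	((dev)->quirk_flags[(nr) / BITS_PER_ULONG] &= \
	 ~(1UL << ((nr) % BITS_PER_ULONG)))
#define dev_test_quirk(dev, nr) \
	((((dev)->quirk_flags[(nr) / BITS_PER_ULONG]) >> \
	  ((nr) % BITS_PER_ULONG)) & 1UL)
```

Where unsigned long is 32 bits wide, quirk number 35 would not fit into a single integer member; with the bitmap it simply lands in the second word.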
Fixes: ff26b2dd6568 ("Bluetooth: Add quirk for broken READ_VOICE_SETTING")
Fixes: 127881334eaa ("Bluetooth: Add quirk for broken READ_PAGE_SCAN_TYPE")
Suggested-by: Pauli Virtanen <pav@iki.fi>
Tested-by: Ivan Pravdin <ipravdin.official@gmail.com>
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Christian Eggers [Mon, 14 Jul 2025 20:27:44 +0000 (22:27 +0200)]
Bluetooth: hci_core: add missing braces when using macro parameters
Macro parameters should always be put into braces when accessing them.
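A short illustration, assuming that "braces" here means parentheses around each use of the parameter. dev_model and dev_feature() are made-up names for the sketch, not macros from hci_core.h.

```c
#include <assert.h>

struct dev_model {
	unsigned int features;
};

/*
 * Without the inner parentheses, an argument such as `devs + 1` would
 * expand to `devs + 1->features`, which does not even compile.
 */
#define dev_feature(dev, bit)	((((dev)->features) >> (bit)) & 1u)

/* Passes non-trivial expressions for both macro parameters. */
static unsigned int second_dev_feature(struct dev_model *devs, unsigned int base)
{
	return dev_feature(devs + 1, base + 1);
}
```

Because every use of a parameter is parenthesized, callers can pass arbitrary expressions without having to know how the macro body is written.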
Fixes: 4fc9857ab8c6 ("Bluetooth: hci_sync: Add check simultaneous roles support")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Christian Eggers [Mon, 14 Jul 2025 20:27:43 +0000 (22:27 +0200)]
Bluetooth: hci_core: fix typos in macros
The provided macro parameter is named 'dev' (rather than 'hdev', which
may be a variable on the stack where the macro is used).
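The hazard can be reproduced with a small sketch. The macro and function names below are hypothetical; only the pattern (a macro body naming `hdev` instead of its parameter `dev`) follows the description above.

```c
#include <assert.h>

struct dev_model {
	unsigned int flags;
};

/* Typo: the body ignores its parameter and picks up a caller's `hdev`. */
#define flag_set_typo(dev)	(((hdev)->flags) & 1u)
/* Fixed: the body uses the parameter that was actually passed. */
#define flag_set_fixed(dev)	(((dev)->flags) & 1u)

static unsigned int query_typo(struct dev_model *hdev, struct dev_model *other)
{
	(void)other;
	return flag_set_typo(other);	/* actually reads hdev->flags */
}

static unsigned int query_fixed(struct dev_model *hdev, struct dev_model *other)
{
	(void)hdev;
	return flag_set_fixed(other);
}
```

The typo variant compiles without complaint whenever the caller happens to have a variable named hdev in scope, which is why such mistakes can go unnoticed.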
Fixes: a9a830a676a9 ("Bluetooth: hci_event: Fix sending HCI_OP_READ_ENC_KEY_SIZE")
Fixes: 6126ffabba6b ("Bluetooth: Introduce HCI_CONN_FLAG_DEVICE_PRIVACY device flag")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Luiz Augusto von Dentz [Wed, 2 Jul 2025 15:53:40 +0000 (11:53 -0400)]
Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
This replaces the usage of HCI_ERROR_REMOTE_USER_TERM, which as the name
suggests indicates a regular disconnection initiated by a user,
with HCI_ERROR_AUTH_FAILURE to indicate that the session has timed out and
thus any pairing shall be considered as failed.
Fixes: 1e91c29eb60c ("Bluetooth: Use hci_disconnect for immediate disconnection from SMP")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Luiz Augusto von Dentz [Mon, 30 Jun 2025 18:42:23 +0000 (14:42 -0400)]
Bluetooth: SMP: If an unallowed command is received consider it a failure
If an unallowed command is received while bonding is ongoing, consider it a
pairing failure so the session is cleaned up properly and the device is
disconnected immediately, instead of continuing with other commands that
may result in the session getting stuck without ever completing, such as
the case below:
> ACL Data RX: Handle 2048 flags 0x02 dlen 21
SMP: Identity Information (0x08) len 16
Identity resolving key[16]:
d7e08edef97d3e62cd2331f82d8073b0
> ACL Data RX: Handle 2048 flags 0x02 dlen 21
SMP: Signing Information (0x0a) len 16
Signature key[16]:
1716c536f94e843a9aea8b13ffde477d
Bluetooth: hci0: unexpected SMP command 0x0a from XX:XX:XX:XX:XX:XX
> ACL Data RX: Handle 2048 flags 0x02 dlen 12
SMP: Identity Address Information (0x09) len 7
Address: XX:XX:XX:XX:XX:XX (Intel Corporate)
According to Core spec 6.1, however, the expected order is always BD_ADDR
first, then CSRK:
When using LE legacy pairing, the keys shall be distributed in the
following order:
LTK by the Peripheral
EDIV and Rand by the Peripheral
IRK by the Peripheral
BD_ADDR by the Peripheral
CSRK by the Peripheral
LTK by the Central
EDIV and Rand by the Central
IRK by the Central
BD_ADDR by the Central
CSRK by the Central
When using LE Secure Connections, the keys shall be distributed in the
following order:
IRK by the Peripheral
BD_ADDR by the Peripheral
CSRK by the Peripheral
IRK by the Central
BD_ADDR by the Central
CSRK by the Central
According to the Core 6.1 for commands used for key distribution "Key
Rejected" can be used:
'3.6.1. Key distribution and generation
A device may reject a distributed key by sending the Pairing Failed command
with the reason set to "Key Rejected".
Fixes: b28b4943660f ("Bluetooth: Add strict checks for allowed SMP PDUs")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Luiz Augusto von Dentz [Wed, 9 Jul 2025 19:02:56 +0000 (15:02 -0400)]
Bluetooth: btintel: Check if controller is ISO capable on btintel_classify_pkt_type
Due to what seems to be a bug in the variant version returned by some
firmware, the code may set hdev->classify_pkt_type to
btintel_classify_pkt_type when the controller doesn't actually
support the ISO channels feature but uses the handle range expected from
a controller that does, causing packets to be reclassified as ISO
and triggering several bugs.
To fix this, btintel_classify_pkt_type now checks whether the
controller really supports ISO channels and, if it doesn't, does not
reclassify even if the handle range is considered to be ISO. This is
considered safer than trying to fix the specific controller/firmware
version, as that could change over time and cause similar problems in
the future.
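A simplified model of the check is sketched below. The 0x900 base of the ISO handle range and all names are assumptions made for illustration; 0x02 and 0x05 are the HCI packet indicators for ACL and ISO data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKT_ACLDATA	0x02	/* HCI ACL data packet indicator */
#define PKT_ISODATA	0x05	/* HCI ISO data packet indicator */
#define ISO_HANDLE_BASE	0x900	/* assumed start of the vendor ISO handle range */

/* Reclassify ACL packets as ISO only if the controller is ISO capable. */
static uint8_t classify_pkt(uint8_t pkt_type, uint16_t handle, bool iso_capable)
{
	if (pkt_type != PKT_ACLDATA)
		return pkt_type;
	if (!iso_capable)
		return pkt_type;	/* never reclassify on non-ISO controllers */
	if ((handle & 0x0fff) >= ISO_HANDLE_BASE)	/* handle is 12 bits wide */
		return PKT_ISODATA;
	return pkt_type;
}
```

The capability check comes before the handle-range check, so a controller that merely reuses the ISO handle range keeps its packets classified as ACL.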
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219553
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2100565
Link: https://github.com/StarLabsLtd/firmware/issues/180
Fixes: f25b7fd36cc3 ("Bluetooth: Add vendor-specific packet classification for ISO data")
Cc: stable@vger.kernel.org
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Sean Rhodes <sean@starlabs.systems>
Alessandro Gasbarroni [Wed, 9 Jul 2025 07:53:11 +0000 (09:53 +0200)]
Bluetooth: hci_sync: fix connectable extended advertising when using static random address
Currently, the connectable flag used by the setup of an extended
advertising instance drives whether we require privacy when trying to pass
a random address to the advertising parameters (Own Address).
If privacy is not required, then it automatically falls back to using the
controller's public address. This can cause problems when using controllers
that do not have a public address set, but instead use a static random
address.
e.g. Assume a BLE controller that does not have a public address set.
The controller upon powering is set with a random static address by default
by the kernel.
< HCI Command: LE Set Random Address (0x08|0x0005) plen 6
Address: E4:AF:26:D8:3E:3A (Static)
> HCI Event: Command Complete (0x0e) plen 4
LE Set Random Address (0x08|0x0005) ncmd 1
Status: Success (0x00)
Setting non-connectable extended advertisement parameters in bluetoothctl
mgmt
add-ext-adv-params -r 0x801 -x 0x802 -P 2M -g 1
correctly sets Own address type as Random
< HCI Command: LE Set Extended Advertising Parameters (0x08|0x0036)
plen 25
...
Own address type: Random (0x01)
Setting connectable extended advertisement parameters in bluetoothctl mgmt
add-ext-adv-params -r 0x801 -x 0x802 -P 2M -g -c 1
mistakenly sets Own address type to Public (which causes the Public
Address 00:00:00:00:00:00 to be used)
< HCI Command: LE Set Extended Advertising Parameters (0x08|0x0036)
plen 25
...
Own address type: Public (0x00)
This causes either the controller to emit an Invalid Parameters error or to
mishandle the advertising.
This patch makes sure that we use the already set static random address
when requesting a connectable extended advertising when we don't require
privacy and our public address is not set (00:00:00:00:00:00).
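The selection rule described above can be sketched as a small decision function. This is a simplified model with hypothetical names (adv_ctx, own_addr_type()); it ignores the details of how a private address is generated when privacy is required.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ADDR_PUBLIC	0x00	/* Own address type: Public */
#define ADDR_RANDOM	0x01	/* Own address type: Random */

struct adv_ctx {
	bool require_privacy;
	uint8_t public_addr[6];
	uint8_t static_addr[6];
};

static bool addr_is_zero(const uint8_t addr[6])
{
	int i;

	for (i = 0; i < 6; i++)
		if (addr[i])
			return false;
	return true;
}

static uint8_t own_addr_type(const struct adv_ctx *ctx)
{
	if (ctx->require_privacy)
		return ADDR_RANDOM;	/* a private address is used */
	/* No public address set: fall back to the static random address. */
	if (addr_is_zero(ctx->public_addr) && !addr_is_zero(ctx->static_addr))
		return ADDR_RANDOM;
	return ADDR_PUBLIC;
}
```

With an all-zero public address and the static address E4:AF:26:D8:3E:3A from the trace, the function selects Random instead of falling back to Public.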
Fixes: 3fe318ee72c5 ("Bluetooth: move hci_get_random_address() to hci_sync")
Signed-off-by: Alessandro Gasbarroni <alex.gasbarroni@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Kuniyuki Iwashima [Mon, 7 Jul 2025 19:28:29 +0000 (19:28 +0000)]
Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb()
syzbot reported null-ptr-deref in l2cap_sock_resume_cb(). [0]
l2cap_sock_resume_cb() has a similar problem that was fixed by commit
1bff51ea59a9 ("Bluetooth: fix use-after-free error in lock_sock_nested()").
Since both l2cap_sock_kill() and l2cap_sock_resume_cb() are executed
under l2cap_chan_lock(), we can avoid the issue simply by checking
if chan->data is NULL.
Let's not access the killed socket in l2cap_sock_resume_cb().
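In userspace terms the check amounts to the following sketch. chan_model, sock_model and FLAG_SUSPENDED are hypothetical stand-ins, and the locking that makes the check safe is not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define FLAG_SUSPENDED	0x1ul	/* hypothetical flag bit */

struct sock_model {
	unsigned long flags;
};

struct chan_model {
	void *data;	/* points to the socket, or NULL once it was killed */
};

/* Returns 1 if the socket was resumed, 0 if there was nothing to do. */
static int resume_cb(struct chan_model *chan)
{
	struct sock_model *sk = chan->data;

	if (!sk)
		return 0;	/* socket already killed: do not touch it */

	sk->flags &= ~FLAG_SUSPENDED;
	return 1;
}
```

Without the NULL check, clearing the flag would write through a NULL-based pointer, which is the kind of access KASAN reports in the trace below.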
[0]:
BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:82 [inline]
BUG: KASAN: null-ptr-deref in clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
BUG: KASAN: null-ptr-deref in l2cap_sock_resume_cb+0xb4/0x17c net/bluetooth/l2cap_sock.c:1711
Write of size 8 at addr 0000000000000570 by task kworker/u9:0/52
CPU: 1 UID: 0 PID: 52 Comm: kworker/u9:0 Not tainted 6.16.0-rc4-syzkaller-g7482bb149b9f #0 PREEMPT
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
Workqueue: hci0 hci_rx_work
Call trace:
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:501 (C)
__dump_stack+0x30/0x40 lib/dump_stack.c:94
dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120
print_report+0x58/0x84 mm/kasan/report.c:524
kasan_report+0xb0/0x110 mm/kasan/report.c:634
check_region_inline mm/kasan/generic.c:-1 [inline]
kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:189
__kasan_check_write+0x20/0x30 mm/kasan/shadow.c:37
instrument_atomic_write include/linux/instrumented.h:82 [inline]
clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
l2cap_sock_resume_cb+0xb4/0x17c net/bluetooth/l2cap_sock.c:1711
l2cap_security_cfm+0x524/0xea0 net/bluetooth/l2cap_core.c:7357
hci_auth_cfm include/net/bluetooth/hci_core.h:2092 [inline]
hci_auth_complete_evt+0x2e8/0xa4c net/bluetooth/hci_event.c:3514
hci_event_func net/bluetooth/hci_event.c:7511 [inline]
hci_event_packet+0x650/0xe9c net/bluetooth/hci_event.c:7565
hci_rx_work+0x320/0xb18 net/bluetooth/hci_core.c:4070
process_one_work+0x7e8/0x155c kernel/workqueue.c:3238
process_scheduled_works kernel/workqueue.c:3321 [inline]
worker_thread+0x958/0xed8 kernel/workqueue.c:3402
kthread+0x5fc/0x75c kernel/kthread.c:464
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:847
Fixes: d97c899bde33 ("Bluetooth: Introduce L2CAP channel callback for resuming")
Reported-by: syzbot+e4d73b165c3892852d22@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/686c12bd.a70a0220.29fe6c.0b13.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Nathan Chancellor [Wed, 16 Jul 2025 03:07:01 +0000 (20:07 -0700)]
riscv: uaccess: Fix -Wuninitialized and -Wshadow in __put_user_nocheck
After a recent change in clang to strengthen uninitialized warnings [1],
there is a warning from val being uninitialized in __put_user_nocheck
when called from futex_put_value():
kernel/futex/futex.h:326:18: warning: variable 'val' is uninitialized when used within its own initialization [-Wuninitialized]
326 | unsafe_put_user(val, to, Efault);
| ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:464:21: note: expanded from macro 'unsafe_put_user'
464 | __put_user_nocheck(x, (ptr), label)
| ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:314:36: note: expanded from macro '__put_user_nocheck'
314 | __inttype(x) val = (__inttype(x))x; \
| ~~~ ^
While not on by default, -Wshadow flags the same mistake:
kernel/futex/futex.h:326:2: warning: declaration shadows a local variable [-Wshadow]
326 | unsafe_put_user(val, to, Efault);
| ^
arch/riscv/include/asm/uaccess.h:464:2: note: expanded from macro 'unsafe_put_user'
464 | __put_user_nocheck(x, (ptr), label)
| ^
arch/riscv/include/asm/uaccess.h:314:16: note: expanded from macro '__put_user_nocheck'
314 | __inttype(x) val = (__inttype(x))x; \
| ^
kernel/futex/futex.h:320:48: note: previous declaration is here
320 | static __always_inline int futex_put_value(u32 val, u32 __user *to)
| ^
Use a three underscore prefix for the val variable in __put_user_nocheck
to avoid clashing with either val or __val, which are both used within
the put_user macros, clearing up all warnings.
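The name clash can be shown with a reduced macro. put_value() and store_value() are hypothetical names for this sketch; the real __put_user_nocheck is considerably more involved.

```c
#include <assert.h>

/*
 * Buggy shape (not compiled here): with a local named `val`, a caller that
 * passes its own `val` gets `long val = (long)(val);`, a self-initialization:
 *
 *   #define put_value(x, ptr) do { long val = (long)(x); *(ptr) = val; } while (0)
 */

/* Fixed shape: the three-underscore local mirrors the commit's naming. */
#define put_value(x, ptr)			\
do {						\
	long ___val = (long)(x);		\
	*(ptr) = ___val;			\
} while (0)

/* The parameter is deliberately called `val`, as in futex_put_value(). */
static void store_value(long val, long *to)
{
	put_value(val, to);
}
```

Since the local no longer shares a name with anything a caller is likely to pass, the expansion reads the caller's variable rather than the macro's own uninitialized one.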
Closes: https://github.com/ClangBuiltLinux/linux/issues/2109
Fixes: ca1a66cdd685 ("riscv: uaccess: do not do misaligned accesses in get/put_user()")
Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cdee01472c7
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250715-riscv-uaccess-fix-self-init-val-v1-1-82b8e911f120@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Pavel Begunkov [Wed, 16 Jul 2025 16:20:17 +0000 (17:20 +0100)]
io_uring/poll: fix POLLERR handling
8c8492ca64e7 ("io_uring/net: don't retry connect operation on EPOLLERR")
is a little dirty hack that
1) wrongfully assumes that POLLERR equals a failed request, which
breaks all POLLERR users, e.g. all error queue recv interfaces,
2) deviates the connection request behaviour from connect(2), and
3) is racy and solved at the wrong level.
Nothing can be done with 2) now, and 3) is beyond the scope of the
patch. At least solve 1) by moving the hack out of generic poll handling
into io_connect().
Cc: stable@vger.kernel.org
Fixes: 8c8492ca64e79 ("io_uring/net: don't retry connect operation on EPOLLERR")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3dc89036388d602ebd84c28e5042e457bdfc952b.1752682444.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Alexandre Ghiti [Wed, 16 Jul 2025 12:05:16 +0000 (12:05 +0000)]
riscv: Stop supporting static ftrace
Now that DYNAMIC_FTRACE was introduced, there is no need to support
static ftrace as it is way less performant. This simplifies the code and
prevents build failures as reported by kernel test robot when
!DYNAMIC_FTRACE.
Also make sure that FUNCTION_TRACER can only be selected if
DYNAMIC_FTRACE is supported (we have a dependency on the toolchain).
Co-developed-by: chenmiao <chenmiao.ku@gmail.com>
Signed-off-by: chenmiao <chenmiao.ku@gmail.com>
Fixes: b2137c3b6d7a ("riscv: ftrace: prepare ftrace for atomic code patching")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506191949.o3SMu8Zn-lkp@intel.com/
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250716-dev-alex-static_ftrace-v1-1-ba5d2b6fc9c0@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Andreas Schwab [Thu, 10 Jul 2025 13:32:18 +0000 (15:32 +0200)]
riscv: traps_misaligned: properly sign extend value in misaligned load handler
Add missing cast to signed long.
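Why the cast matters can be seen in a standalone sketch, assuming the handler sign-extends with a shift-left/shift-right pair. The function names are made up, and the code relies on arithmetic right shift of negative values, which GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Without the signed cast the right shift is logical: zero extension. */
static uint64_t load_zero_extended(uint64_t raw, unsigned int len_bytes)
{
	unsigned int shift = 8 * (8 - len_bytes);

	assert(len_bytes >= 1 && len_bytes <= 8);
	return (raw << shift) >> shift;
}

/* With the cast the right shift is arithmetic: sign extension. */
static int64_t load_sign_extended(uint64_t raw, unsigned int len_bytes)
{
	unsigned int shift = 8 * (8 - len_bytes);

	assert(len_bytes >= 1 && len_bytes <= 8);
	return (int64_t)(raw << shift) >> shift;
}
```

For a 2-byte load of 0xFFFF, the unsigned variant yields 65535 while the signed variant yields -1, which is what a signed halfword load is expected to produce.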
Signed-off-by: Andreas Schwab <schwab@suse.de>
Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE")
Tested-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/mvmikk0goil.fsf@suse.de
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Nam Cao [Wed, 25 Jun 2025 08:56:30 +0000 (10:56 +0200)]
riscv: Enable interrupt during exception handling
force_sig_fault() takes a spinlock, which is a sleeping lock with
CONFIG_PREEMPT_RT=y. However, exception handling calls force_sig_fault()
with interrupts disabled, causing a sleeping-in-atomic-context warning.
This can be reproduced using userspace programs such as:
int main() { asm ("ebreak"); }
or
int main() { asm ("unimp"); }
There is no reason that interrupts must be disabled while handling
exceptions from userspace.
Enable interrupts while handling user exceptions. This also has the added
benefit of avoiding unnecessary delays in interrupt handling.
Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250625085630.3649485-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Alexandre Ghiti [Fri, 11 Jul 2025 07:38:38 +0000 (07:38 +0000)]
riscv: ftrace: Properly acquire text_mutex to fix a race condition
As reported by lockdep, some patching was done without acquiring
text_mutex, so there could be a race when mapping the page to patch
since we use the same fixmap entry.
Reported-by: Han Gao <rabenda.cn@gmail.com>
Reported-by: Vivian Wang <wangruikang@iscas.ac.cn>
Reported-by: Yao Zi <ziyao@disroot.org>
Closes: https://lore.kernel.org/linux-riscv/aGODMpq7TGINddzM@pie.lan/
Tested-by: Yao Zi <ziyao@disroot.org>
Tested-by: Han Gao <rabenda.cn@gmail.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250711-alex-fixes-v2-1-d85a5438da6c@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Sunil V L [Fri, 11 Jul 2025 14:00:13 +0000 (19:30 +0530)]
ACPI: RISC-V: Remove unnecessary CPPC debug message
The presence or absence of the CPPC SBI extension is currently logged
on every boot. This message is not particularly useful and can clutter
the boot log. Remove this debug message to reduce noise during boot.
This change has no functional impact.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Drew Fustini <fustini@kernel.org>
Link: https://lore.kernel.org/r/20250711140013.3043463-1-sunilvl@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Alexandre Ghiti [Thu, 10 Jul 2025 08:34:31 +0000 (08:34 +0000)]
riscv: Stop considering R_RISCV_NONE as bad relocations
Even though those relocations should not be present in the final
vmlinux, there are a lot of them. And since those relocations are
considered "bad", they flood the compilation output which may hide some
legitimate bad relocations.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Ron Economos <re@w6rz.net>
Link: https://lore.kernel.org/r/20250710-dev-alex-riscv_none_bad_relocs_v1-v1-1-758f2fcc6e75@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Michael C. Pratt [Wed, 16 Jul 2025 14:42:10 +0000 (15:42 +0100)]
nvmem: layouts: u-boot-env: remove crc32 endianness conversion
On 11 Oct 2022, it was reported that the crc32 verification
of the u-boot environment failed only on big-endian systems
for the u-boot-env nvmem layout driver with the following error.
Invalid calculated CRC32: 0x88cd6f09 (expected: 0x096fcd88)
This problem has been present since the driver was introduced,
and before it was made into a layout driver.
The suggested fix at the time was to use further endianness
conversion macros in order to have both the stored and calculated
crc32 values to compare always represented in the system's endianness.
This was not accepted due to sparse warnings
and some disagreement on how to handle the situation.
Later on in a newer revision of the patch, it was proposed to use
cpu_to_le32() for both values to compare instead of le32_to_cpu()
and store the values as __le32 type to remove compilation errors.
The necessity of this is based on the assumption that the use of crc32()
requires endianness conversion because the algorithm uses little-endian,
however, this does not prove to be the case and the issue is unrelated.
Upon inspecting the current kernel code,
there already is an existing use of le32_to_cpu() in this driver,
which suggests there already is special handling for big-endian systems,
however, it is big-endian systems that have the problem.
This, being the only functional difference between architectures
in the driver combined with the fact that the suggested fix
was to use the exact same endianness conversion for the values
brings up the possibility that it was not necessary to begin with,
as the same endianness conversion for two values expected to be the same
is expected to be equivalent to no conversion at all.
After inspecting the u-boot environment of devices of both endianness
and trying to remove the existing endianness conversion,
the problem is resolved in an equivalent way as the other suggested fixes.
Ultimately, it seems that u-boot is agnostic to endianness
at least for the purpose of environment variables.
In other words, u-boot reads and writes the stored crc32 value
with the same endianness that the crc32 value is calculated with
in whichever endianness a certain architecture runs on.
Therefore, the u-boot-env driver does not need to convert endianness.
Remove the usage of endianness macros in the u-boot-env driver,
and change the type of local variables to maintain the same return type.
If there is a special situation in the case of endianness,
it would be a corner case and should be handled by a unique "compatible".
Even though it is not necessary to use endianness conversion macros here,
it may be useful to use them in the future for consistent error printing.
Fixes: d5542923f200 ("nvmem: add driver handling U-Boot environment variables")
Reported-by: INAGAKI Hiroshi <musashino.open@gmail.com>
Link: https://lore.kernel.org/all/20221011024928.1807-1-musashino.open@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: "Michael C. Pratt" <mcpratt@pm.me>
Signed-off-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://lore.kernel.org/r/20250716144210.4804-1-srini@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
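The u-boot-env argument above (the same conversion applied to both values is equivalent to no conversion) can be illustrated with a minimal user-space sketch. This is not the driver code: `swap32()`, `crc_match_converted()` and `crc_match_raw()` are hypothetical names, and an unconditional byte swap stands in for whatever conversion a big-endian build would perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: unconditionally reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/* Compare after applying the same conversion to both sides. */
static bool crc_match_converted(uint32_t stored, uint32_t calculated)
{
	return swap32(stored) == swap32(calculated);
}

/* Compare the raw values without any conversion. */
static bool crc_match_raw(uint32_t stored, uint32_t calculated)
{
	return stored == calculated;
}
```

Because the byte swap is a bijection, both comparisons agree for every input pair. Converting only one side is what produces a mismatch like the 0x88cd6f09 versus 0x096fcd88 pair quoted in the error message.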
Akshay Gupta [Wed, 16 Jul 2025 11:07:29 +0000 (11:07 +0000)]
misc: amd-sbi: Explicitly clear in/out arg "mb_in_out"
- New AMD processors will support different inputs/outputs for the same
command.
- In some scenarios the input value is not cleared, so it is added to
the output before the data is reported.
- Clearing the input explicitly is a cleaner and safer approach.
Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Signed-off-by: Akshay Gupta <akshay.gupta@amd.com>
Link: https://lore.kernel.org/r/20250716110729.2193725-3-akshay.gupta@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Akshay Gupta [Wed, 16 Jul 2025 11:07:28 +0000 (11:07 +0000)]
misc: amd-sbi: Address copy_to/from_user() warning reported in smatch
Smatch warnings are reported for the commit below.
Commit bb13a84ed6b7 ("misc: amd-sbi: Add support for CPUID protocol")
from Apr 28, 2025 (linux-next) leads to the following Smatch static
checker warnings:
drivers/misc/amd-sbi/rmi-core.c:376 apml_rmi_reg_xfer() warn: maybe return -EFAULT instead of the bytes remaining?
drivers/misc/amd-sbi/rmi-core.c:394 apml_mailbox_xfer() warn: maybe return -EFAULT instead of the bytes remaining?
drivers/misc/amd-sbi/rmi-core.c:411 apml_cpuid_xfer() warn: maybe return -EFAULT instead of the bytes remaining?
drivers/misc/amd-sbi/rmi-core.c:428 apml_mcamsr_xfer() warn: maybe return -EFAULT instead of the bytes remaining?
copy_to/from_user() returns the number of bytes not copied.
If the data is not copied, return "-EFAULT".
Additionally, fix the "-EPROTOTYPE" error return as intended.
Fixes: 35ac2034db72 ("misc: amd-sbi: Add support for AMD_SBI IOCTL")
Fixes: bb13a84ed6b7 ("misc: amd-sbi: Add support for CPUID protocol")
Fixes: 69b1ba83d21c ("misc: amd-sbi: Add support for read MCA register protocol")
Fixes: cf141287b774 ("misc: amd-sbi: Add support for register xfer")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/aDVyO8ByVsceybk9@stanley.mountain/
Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Signed-off-by: Akshay Gupta <akshay.gupta@amd.com>
Link: https://lore.kernel.org/r/20250716110729.2193725-2-akshay.gupta@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
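The copy_to/from_user() return-value pattern described in the amd-sbi commit above can be sketched in user space as follows. This is a sketch under stated assumptions, not the rmi-core.c code: `fake_copy_to_user()` is a hypothetical stand-in that mimics the "returns the number of bytes not copied" convention, and `xfer_buggy()` and `xfer_fixed()` are illustrative names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for copy_to_user(): copies at most `writable`
 * bytes and returns the number of bytes that were NOT copied.
 */
static size_t fake_copy_to_user(void *dst, const void *src, size_t len,
				size_t writable)
{
	size_t done = len < writable ? len : writable;

	memcpy(dst, src, done);
	return len - done;
}

/* Buggy pattern: the "bytes remaining" count leaks out as the result. */
static long xfer_buggy(void *dst, const void *src, size_t len, size_t writable)
{
	return (long)fake_copy_to_user(dst, src, len, writable);
}

/* Fixed pattern: any short copy is reported as -EFAULT. */
static long xfer_fixed(void *dst, const void *src, size_t len, size_t writable)
{
	if (fake_copy_to_user(dst, src, len, writable))
		return -EFAULT;
	return 0;
}
```

With the buggy variant a caller sees a positive number that is easy to mistake for success. The fixed variant returns either 0 or a negative errno value.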
Akshay Gupta [Wed, 16 Jul 2025 11:07:27 +0000 (11:07 +0000)]
misc: amd-sbi: Address potential integer overflow issue reported in smatch
Smatch warnings are reported for the commit below.
Commit bb13a84ed6b7 ("misc: amd-sbi: Add support for CPUID protocol")
from Apr 28, 2025 (linux-next) leads to the following Smatch static
checker warnings:
drivers/misc/amd-sbi/rmi-core.c:132 rmi_cpuid_read() warn: bitwise OR is zero '0xffffffff00000000 & 0xffff'
drivers/misc/amd-sbi/rmi-core.c:132 rmi_cpuid_read() warn: potential integer overflow from user 'msg->cpu_in_out << 32'
drivers/misc/amd-sbi/rmi-core.c:213 rmi_mca_msr_read() warn: bitwise OR is zero '0xffffffff00000000 & 0xffff'
drivers/misc/amd-sbi/rmi-core.c:213 rmi_mca_msr_read() warn: potential integer overflow from user 'msg->mcamsr_in_out << 32'
The CPUID and MCAMSR thread data from the input is available at bytes 4
and 5. This patch copies the user data into the argument correctly.
Previously, CPUID and MCAMSR data was returned only for thread 0.
Fixes: bb13a84ed6b7 ("misc: amd-sbi: Add support for CPUID protocol")
Fixes: 69b1ba83d21c ("misc: amd-sbi: Add support for read MCA register protocol")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/aDVyO8ByVsceybk9@stanley.mountain/
Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com>
Signed-off-by: Akshay Gupta <akshay.gupta@amd.com>
Link: https://lore.kernel.org/r/20250716110729.2193725-1-akshay.gupta@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ian Abbott [Tue, 8 Jul 2025 13:06:27 +0000 (14:06 +0100)]
comedi: comedi_test: Fix possible deletion of uninitialized timers
In `waveform_common_attach()`, the two timers `&devpriv->ai_timer` and
`&devpriv->ao_timer` are initialized after the allocation of the device
private data by `comedi_alloc_devpriv()` and the subdevices by
`comedi_alloc_subdevices()`. The function may return with an error
between those function calls. In that case, `waveform_detach()` will be
called by the Comedi core to clean up. The check that
`waveform_detach()` uses to decide whether to delete the timers is
incorrect. It only checks that the device private data was allocated,
but that does not guarantee that the timers were initialized. It also
needs to check that the subdevices were allocated. Fix it.
Fixes: 73e0e4dfed4c ("staging: comedi: comedi_test: fix timer lock-up")
Cc: stable@vger.kernel.org # 6.15+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20250708130627.21743-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ian Abbott [Mon, 7 Jul 2025 16:14:39 +0000 (17:14 +0100)]
comedi: Fix initialization of data for instructions that write to subdevice
Some Comedi subdevice instruction handlers are known to access
instruction data elements beyond the first `insn->n` elements in some
cases. The `do_insn_ioctl()` and `do_insnlist_ioctl()` functions
allocate at least `MIN_SAMPLES` (16) data elements to deal with this,
but they do not initialize all of that. For Comedi instruction codes
that write to the subdevice, the first `insn->n` data elements are
copied from user-space, but the remaining elements are left
uninitialized. That could be a problem if the subdevice instruction
handler reads the uninitialized data. Ensure that the first
`MIN_SAMPLES` elements are initialized before calling these instruction
handlers, filling the uncopied elements with 0. For
`do_insnlist_ioctl()`, the same data buffer elements are used for
handling a list of instructions, so ensure the first `MIN_SAMPLES`
elements are initialized for each instruction that writes to the
subdevice.
Fixes: ed9eccbe8970 ("Staging: add comedi core")
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20250707161439.88385-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
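The zero-filling described in the comedi commit above can be sketched as follows. This is a minimal user-space illustration, not the comedi_fops.c code: `fill_insn_data()` is a hypothetical helper, `memcpy()` stands in for the copy from user space, and the caller is assumed to provide a buffer of at least max(n, MIN_SAMPLES) elements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_SAMPLES 16	/* value quoted in the commit message */

/*
 * Copy the first n caller-supplied elements, then zero the rest of the
 * first MIN_SAMPLES elements so a handler never reads stale data.
 */
static void fill_insn_data(unsigned int *buf, const unsigned int *user_data,
			   size_t n)
{
	if (n > 0)
		memcpy(buf, user_data, n * sizeof(*buf));
	if (n < MIN_SAMPLES)
		memset(buf + n, 0, (MIN_SAMPLES - n) * sizeof(*buf));
}
```

When the same buffer is reused for a list of instructions, the helper would be called once per instruction that writes to the subdevice, so leftovers from an earlier instruction are cleared too.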
Ian Abbott [Mon, 7 Jul 2025 15:33:54 +0000 (16:33 +0100)]
comedi: Fix use of uninitialized data in insn_rw_emulate_bits()
For Comedi `INSN_READ` and `INSN_WRITE` instructions on "digital"
subdevices (subdevice types `COMEDI_SUBD_DI`, `COMEDI_SUBD_DO`, and
`COMEDI_SUBD_DIO`), it is common for the subdevice driver not to have
`insn_read` and `insn_write` handler functions, but to have an
`insn_bits` handler function for handling Comedi `INSN_BITS`
instructions. In that case, the subdevice's `insn_read` and/or
`insn_write` function handler pointers are set to point to the
`insn_rw_emulate_bits()` function by `__comedi_device_postconfig()`.
For `INSN_WRITE`, `insn_rw_emulate_bits()` currently assumes that the
supplied `data[0]` value is a valid copy from user memory. It will at
least exist because `do_insnlist_ioctl()` and `do_insn_ioctl()` in
"comedi_fops.c" ensure at least `MIN_SAMPLES` (16) elements are
allocated. However, if `insn->n` is 0 (which is allowable for
`INSN_READ` and `INSN_WRITE` instructions), then `data[0]` may contain
uninitialized data, and certainly contains invalid data, possibly from a
different instruction in the array of instructions handled by
`do_insnlist_ioctl()`. This will result in an incorrect value being
written to the digital output channel (or to the digital input/output
channel if configured as an output), and may be reflected in the
internal saved state of the channel.
Fix it by returning 0 early if `insn->n` is 0, before reaching the code
that accesses `data[0]`. Previously, the function always returned 1 on
success, but it is supposed to be the number of data samples actually
read or written up to `insn->n`, which is 0 in this case.
Reported-by: syzbot+cb96ec476fb4914445c9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=cb96ec476fb4914445c9
Fixes: ed9eccbe8970 ("Staging: add comedi core")
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20250707153355.82474-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
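The early return described in the insn_rw_emulate_bits() commit above can be sketched like this. It is a simplified illustration with hypothetical names (`struct fake_insn`, `emulate_write()`, `chan_state`), not the comedi implementation. It only shows that `data[0]` is not read when `n` is 0, and that the return value is the number of samples actually written.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for struct comedi_insn. */
struct fake_insn {
	unsigned int n;		/* number of data samples supplied */
};

/*
 * Emulated single-channel write: returns the number of samples
 * consumed (0 or 1) and never reads data[0] when none were supplied.
 */
static int emulate_write(const struct fake_insn *insn,
			 const unsigned int *data, unsigned int *chan_state)
{
	if (insn->n == 0)
		return 0;	/* nothing supplied: leave the channel alone */

	*chan_state = data[0] ? 1 : 0;
	return 1;
}
```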
Ian Abbott [Mon, 7 Jul 2025 13:57:37 +0000 (14:57 +0100)]
comedi: das6402: Fix bit shift out of bounds
When checking for a supported IRQ number, the following test is used:
/* IRQs 2,3,5,6,7, 10,11,15 are valid for "enhanced" mode */
if ((1 << it->options[1]) & 0x8cec) {
However, `it->options[1]` is an unchecked `int` value from userspace, so
the shift amount could be negative or out of bounds. Fix the test by
requiring `it->options[1]` to be within bounds before proceeding with
the original test. Valid `it->options[1]` values that select the IRQ
will be in the range [1,15]. The value 0 explicitly disables the use of
interrupts.
Fixes: 79e5e6addbb1 ("staging: comedi: das6402: rewrite broken driver")
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20250707135737.77448-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
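The bounds-then-mask test described in the das6402 commit above (and in the similar aio_iiro_16, pcl812 and das16m1 fixes) might look roughly like the sketch below. The helper name `irq_supported()` is hypothetical, and the exact form of the check in the drivers may differ. The mask 0x8cec and the [1,15] range are taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true if the user-supplied IRQ number is one of
 * 2, 3, 5, 6, 7, 10, 11 or 15 (mask 0x8cec).
 */
static bool irq_supported(int irq)
{
	/* Bounds check first, so the shift amount is always in 1..15. */
	if (irq < 1 || irq > 15)
		return false;

	return ((1U << irq) & 0x8cecU) != 0;
}
```

Checking the range before shifting matters because a shift by a negative amount, or by at least the width of the type, is undefined behaviour in C.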
Ian Abbott [Mon, 7 Jul 2025 13:46:22 +0000 (14:46 +0100)]
comedi: aio_iiro_16: Fix bit shift out of bounds
When checking for a supported IRQ number, the following test is used:
if ((1 << it->options[1]) & 0xdcfc) {
However, `it->options[1]` is an unchecked `int` value from userspace, so
the shift amount could be negative or out of bounds. Fix the test by
requiring `it->options[1]` to be within bounds before proceeding with
the original test. Valid `it->options[1]` values that select the IRQ
will be in the range [1,15]. The value 0 explicitly disables the use of
interrupts.
Fixes: ad7a370c8be4 ("staging: comedi: aio_iiro_16: add command support for change of state detection")
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20250707134622.75403-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ian Abbott [Mon, 7 Jul 2025 13:34:29 +0000 (14:34 +0100)]
comedi: pcl812: Fix bit shift out of bounds
When checking for a supported IRQ number, the following test is used:
if ((1 << it->options[1]) & board->irq_bits) {
However, `it->options[1]` is an unchecked `int` value from userspace, so
the shift amount could be negative or out of bounds. Fix the test by
requiring `it->options[1]` to be within bounds before proceeding with
the original test. Valid `it->options[1]` values that select the IRQ
will be in the range [1,15]. The value 0 explicitly disables the use of
interrupts.
Reported-by: syzbot+32de323b0addb9e114ff@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=32de323b0addb9e114ff
Fixes: fcdb427bc7cf ("Staging: comedi: add pcl821 driver")
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20250707133429.73202-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ian Abbott [Mon, 7 Jul 2025 13:09:08 +0000 (14:09 +0100)]
comedi: das16m1: Fix bit shift out of bounds
When checking for a supported IRQ number, the following test is used:
/* only irqs 2, 3, 4, 5, 6, 7, 10, 11, 12, 14, and 15 are valid */
if ((1 << it->options[1]) & 0xdcfc) {
However, `it->options[1]` is an unchecked `int` value from userspace, so
the shift amount could be negative or out of bounds. Fix the test by
requiring `it->options[1]` to be within bounds before proceeding with
the original test.
Reported-by: syzbot+c52293513298e0fd9a94@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c52293513298e0fd9a94
Fixes: 729988507680 ("staging: comedi: das16m1: tidy up the irq support in das16m1_attach()")
Tested-by: syzbot+c52293513298e0fd9a94@syzkaller.appspotmail.com
Suggested-by: "Enju, Kohei" <enjuk@amazon.co.jp>
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20250707130908.70758-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ian Abbott [Mon, 7 Jul 2025 12:15:55 +0000 (13:15 +0100)]
comedi: Fix some signed shift left operations
Correct some left shifts of the signed integer constant 1 by some
unsigned number less than 32. Change the constant to 1U to avoid
shifting a 1 into the sign bit.
The corrected functions are comedi_dio_insn_config(),
comedi_dio_update_state(), and __comedi_device_postconfig().
Fixes: e523c6c86232 ("staging: comedi: drivers: introduce comedi_dio_insn_config()")
Fixes: 05e60b13a36b ("staging: comedi: drivers: introduce comedi_dio_update_state()")
Fixes: 09567cb4373e ("staging: comedi: initialize subdevice s->io_bits in postconfig")
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20250707121555.65424-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
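The signed-shift change described above can be illustrated with a small sketch. `chan_mask()` is a hypothetical helper, not one of the corrected comedi functions. It assumes a 32-bit `unsigned int`, and the range guard is an addition for the sake of a self-contained example.

```c
#include <assert.h>

/*
 * Build a single-bit mask for channel `chan`. Using 1U keeps the shift
 * in unsigned arithmetic, so bit 31 is a plain value bit rather than
 * the sign bit of an int.
 */
static unsigned int chan_mask(unsigned int chan)
{
	if (chan >= 32)
		return 0;	/* out of range for a 32-bit mask */

	return 1U << chan;
}
```

With the signed constant `1`, the expression `1 << 31` shifts a 1 into the sign bit of an `int`, which the C standard leaves undefined. `1U << 31` is well defined and yields 0x80000000.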
Ian Abbott [Fri, 4 Jul 2025 12:04:05 +0000 (13:04 +0100)]
comedi: Fail COMEDI_INSNLIST ioctl if n_insns is too large
The handling of the `COMEDI_INSNLIST` ioctl allocates a kernel buffer to
hold the array of `struct comedi_insn`, getting the length from the
`n_insns` member of the `struct comedi_insnlist` supplied by the user.
The allocation will fail with a WARNING and a stack dump if it is too
large.
Avoid that by failing with an `-EINVAL` error if the supplied `n_insns`
value is unreasonable.
Define the limit on the `n_insns` value in the `MAX_INSNS` macro. Set
this to the same value as `MAX_SAMPLES` (65536), which is the maximum
allowed sum of the values of the member `n` in the array of `struct
comedi_insn`, and sensible comedi instructions will have an `n` of at
least 1.
Reported-by: syzbot+d6995b62e5ac7d79557a@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d6995b62e5ac7d79557a
Fixes: ed9eccbe8970 ("Staging: add comedi core")
Tested-by: Ian Abbott <abbotti@mev.co.uk>
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20250704120405.83028-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
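The `n_insns` limit described in the COMEDI_INSNLIST commit above can be sketched as a simple range check. `check_n_insns()` is a hypothetical helper. The macro values come from the commit message, and treating the limit as inclusive is an assumption.

```c
#include <assert.h>
#include <errno.h>

#define MAX_SAMPLES	65536		/* value quoted in the commit message */
#define MAX_INSNS	MAX_SAMPLES	/* same limit for the instruction count */

/*
 * Reject an unreasonable instruction count before any allocation is
 * attempted. Returns 0 if acceptable, -EINVAL otherwise.
 */
static int check_n_insns(unsigned int n_insns)
{
	if (n_insns > MAX_INSNS)
		return -EINVAL;

	return 0;
}
```

Failing early with an error code avoids asking the allocator for an oversized buffer, which is what produced the WARNING and stack dump.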
Christian Brauner [Wed, 16 Jul 2025 08:40:45 +0000 (10:40 +0200)]
MAINTAINERS: add block and fsdevel lists to iov_iter
We've had multiple instances where people didn't Cc fsdevel or block
which are easily the most affected subsystems by iov_iter changes.
Put a stop to that and make sure both lists are Cced so we can catch
stuff like [1] early.
Link: https://lore.kernel.org/linux-nvme/20250715132750.9619-4-aaptel@nvidia.com
Link: https://lore.kernel.org/20250716-eklig-rasten-ec8c4dc05a1e@brauner
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Ming Lei [Wed, 16 Jul 2025 11:48:08 +0000 (19:48 +0800)]
loop: use kiocb helpers to fix lockdep warning
The lockdep tool can report a circular lock dependency warning in the loop
driver's AIO read/write path:
```
[ 6540.587728] kworker/u96:5/72779 is trying to acquire lock:
[ 6540.593856] ff110001b5968440 (sb_writers#9){.+.+}-{0:0}, at: loop_process_work+0x11a/0xf70 [loop]
[ 6540.603786]
[ 6540.603786] but task is already holding lock:
[ 6540.610291] ff110001b5968440 (sb_writers#9){.+.+}-{0:0}, at: loop_process_work+0x11a/0xf70 [loop]
[ 6540.620210]
[ 6540.620210] other info that might help us debug this:
[ 6540.627499] Possible unsafe locking scenario:
[ 6540.627499]
[ 6540.634110] CPU0
[ 6540.636841] ----
[ 6540.639574] lock(sb_writers#9);
[ 6540.643281] lock(sb_writers#9);
[ 6540.646988]
[ 6540.646988] *** DEADLOCK ***
```
This patch fixes the issue by using the AIO-specific helpers
`kiocb_start_write()` and `kiocb_end_write()`. These functions are
designed to be used with a `kiocb` and manage write sequencing
correctly for asynchronous I/O without introducing the problematic
lock dependency.
The `kiocb` is already part of the `loop_cmd` struct, so this change
also simplifies the completion function `lo_rw_aio_do_completion()` by
using the `iocb` from the `cmd` struct directly, instead of retrieving
the loop device from the request queue.
Fixes: 39d86db34e41 ("loop: add file_start_write() and file_end_write()")
Cc: Changhui Zhong <czhong@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250716114808.3159657-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Lane Odenbach [Tue, 15 Jul 2025 18:20:38 +0000 (13:20 -0500)]
ASoC: amd: yc: Add DMI quirk for HP Laptop 17 cp-2033dx
This fixes the internal microphone in the stated device
Signed-off-by: Lane Odenbach <laodenbach@gmail.com>
Link: https://patch.msgid.link/20250715182038.10048-1-laodenbach@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Bard Liao [Wed, 16 Jul 2025 08:22:33 +0000 (16:22 +0800)]
ASoC: Intel: soc-acpi: add support for HP Omen14 ARL
This platform has an RT711-sdca on link0 and RT1316 on link3.
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20250716082233.1810334-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Nathan Chancellor [Tue, 15 Jul 2025 22:56:05 +0000 (15:56 -0700)]
memstick: core: Zero initialize id_reg in h_memstick_read_dev_id()
A new warning in clang [1] points out that id_reg is uninitialized then
passed to memstick_init_req() as a const pointer:
drivers/memstick/core/memstick.c:330:59: error: variable 'id_reg' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
330 | memstick_init_req(&card->current_mrq, MS_TPC_READ_REG, &id_reg,
| ^~~~~~
Commit de182cc8e882 ("drivers/memstick/core/memstick.c: avoid -Wnonnull
warning") intentionally passed this variable uninitialized to avoid an
-Wnonnull warning from a NULL value that was previously there because
id_reg is never read from the call to memstick_init_req() in
h_memstick_read_dev_id(). Just zero initialize id_reg to avoid the
warning, which is likely happening in the majority of builds using
modern compilers that support '-ftrivial-auto-var-init=zero'.
Cc: stable@vger.kernel.org
Fixes: de182cc8e882 ("drivers/memstick/core/memstick.c: avoid -Wnonnull warning")
Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d441f19b319e
Closes: https://github.com/ClangBuiltLinux/linux/issues/2105
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250715-memstick-fix-uninit-const-pointer-v1-1-f6753829c27a@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Ralf Lici [Tue, 1 Jul 2025 12:47:44 +0000 (14:47 +0200)]
ovpn: reset GSO metadata after decapsulation
The ovpn_netdev_write() function is responsible for injecting
decapsulated and decrypted packets back into the local network stack.
Prior to this patch, the skb could retain GSO metadata from the outer,
encrypted tunnel packet. This original GSO metadata, relevant to the
sender's transport context, becomes invalid and misleading for the
tunnel/data path once the inner packet is exposed.
Leaving this stale metadata intact causes internal GSO validation checks
further down the kernel's network stack (validate_xmit_skb()) to fail,
leading to packet drops. The reasons for these failures vary by
protocol, for example:
- for ICMP, no offload handler is registered;
- for TCP and UDP, the respective offload handlers return errors when
comparing skb->len to the outdated skb_shinfo(skb)->gso_size.
By calling skb_gso_reset(skb) we ensure the inner packet is presented to
gro_cells_receive() with a clean state, correctly indicating it is an
individual packet from the perspective of the local stack.
This change eliminates the "Driver has suspect GRO implementation, TCP
performance may be compromised" warning and improves overall TCP
performance by allowing GSO/GRO to function as intended on the
decapsulated traffic.
Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Reported-by: Gert Doering <gert@greenie.muc.de>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/4
Tested-by: Gert Doering <gert@greenie.muc.de>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Antonio Quartulli [Wed, 25 Jun 2025 14:08:11 +0000 (16:08 +0200)]
ovpn: reject unexpected netlink attributes
Netlink ops do not expect all attributes to be always set, however
this condition is not explicitly coded anywhere, leading the user
to believe that all sent attributes are somehow processed.
Fix this behaviour by introducing explicit checks.
For CMD_OVPN_PEER_GET and CMD_OVPN_KEY_GET directly open-code the
needed condition in the related ops handlers.
While for all other ops use attribute subsets in the ovpn.yaml spec file.
Fixes: b7a63391aa98 ("ovpn: add basic netlink support")
Reported-by: Ralf Lici <ralf@mandelbit.com>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/19
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Ralf Lici [Wed, 4 Jun 2025 13:11:58 +0000 (15:11 +0200)]
ovpn: propagate socket mark to skb in UDP
OpenVPN allows users to configure a FW mark on sockets used to
communicate with other peers. The mark is set by means of the
`SO_MARK` Linux socket option.
However, in the ovpn UDP code path, the socket's `sk_mark` value is
currently ignored and it is not propagated to outgoing `skbs`.
This commit ensures proper inheritance of the field by setting
`skb->mark` to `sk->sk_mark` before handing the `skb` to the network
stack for transmission.
Fixes: 08857b5ec5d9 ("ovpn: implement basic TX path (UDP)")
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Link: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg31877.html
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Mathias Nyman [Mon, 23 Jun 2025 13:39:47 +0000 (16:39 +0300)]
usb: hub: Don't try to recover devices lost during warm reset.
Hub driver warm-resets ports in SS.Inactive or Compliance mode to
recover a possible connected device. The port reset code correctly
detects if a connection is lost during reset, but hub driver
port_event() fails to take this into account in some cases.
port_event() ends up using stale values and assumes there is a
connected device, and will try all means to recover it, including
power-cycling the port.
Details:
This case was triggered when xHC host was suspended with DbC (Debug
Capability) enabled and connected. DbC turns one xHC port into a simple
usb debug device, allowing debugging a system with an A-to-A USB debug
cable.
xhci DbC code disables DbC when xHC is system suspended to D3, and
enables it back during resume.
We essentially end up with two hosts connected to each other during
suspend, and, for a short while during resume, until DbC is enabled back.
The suspended xHC host notices some activity on the roothub port, but
can't train the link due to being suspended, so xHC hardware sets a CAS
(Cold Attach Status) flag for this port to inform xhci host driver that
the port needs to be warm reset once xHC resumes.
CAS is xHCI specific, and not part of USB specification, so xhci driver
tells usb core that the port has a connection and link is in compliance
mode. Recovery from compliance mode is similar to CAS recovery.
xhci CAS driver support that fakes a compliance mode connection was added
in commit 8bea2bd37df0 ("usb: Add support for root hub port status CAS")
Once xHCI resumes and DbC is enabled back, all activity on the xHC
roothub host side port disappears. The hub driver will anyway think
port has a connection and link is in compliance mode, and hub driver
will try to recover it.
The port power-cycle during recovery seems to cause issues to the active
DbC connection.
Fix this by clearing connect_change flag if hub_port_reset() returns
-ENOTCONN, thus avoiding the whole unnecessary port recovery and
initialization attempt.
Cc: stable@vger.kernel.org
Fixes: 8bea2bd37df0 ("usb: Add support for root hub port status CAS")
Tested-by: Łukasz Bartosik <ukaszb@chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20250623133947.3144608-1-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Wahren [Tue, 15 Jul 2025 16:11:08 +0000 (18:11 +0200)]
staging: vchiq_arm: Make vchiq_shutdown never fail
Most of the users of vchiq_shutdown ignore the return value,
which is bad because this could lead to resource leaks.
So instead of changing all calls to vchiq_shutdown, it's easier
to make vchiq_shutdown never fail.
Fixes: 71bad7f08641 ("staging: add bcm2708 vchiq driver")
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/20250715161108.3411-4-wahrenst@gmx.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Wahren [Tue, 15 Jul 2025 16:11:07 +0000 (18:11 +0200)]
Revert "staging: vchiq_arm: Create keep-alive thread during probe"
Commit 86bc88217006 ("staging: vchiq_arm: Create keep-alive thread
during probe") introduced a regression for certain configurations
that don't have a VCHIQ user. This results in an unused and hanging
keep-alive thread:
INFO: task vchiq-keep/0:85 blocked for more than 120 seconds.
Not tainted 6.12.34-v8-+ #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:vchiq-keep/0 state:D stack:0 pid:85 tgid:85 ppid:2
Call trace:
__switch_to+0x188/0x230
__schedule+0xa54/0xb28
schedule+0x80/0x120
schedule_preempt_disabled+0x30/0x50
kthread+0xd4/0x1a0
ret_from_fork+0x10/0x20
Fixes: 86bc88217006 ("staging: vchiq_arm: Create keep-alive thread during probe")
Reported-by: Maíra Canal <mcanal@igalia.com>
Closes: https://lore.kernel.org/linux-staging/ba35b960-a981-4671-9f7f-060da10feaa1@usp.br/
Cc: stable@kernel.org
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Link: https://lore.kernel.org/r/20250715161108.3411-3-wahrenst@gmx.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Wahren [Tue, 15 Jul 2025 16:11:06 +0000 (18:11 +0200)]
Revert "staging: vchiq_arm: Improve initial VCHIQ connect"
Commit 3e5def4249b9 ("staging: vchiq_arm: Improve initial VCHIQ connect")
is based on the assumption that in the good case the VCHIQ connect always
happens and therefore the keep-alive thread is guaranteed to be woken up.
This is wrong, because in certain configurations there are no VCHIQ users
and so the VCHIQ connect never happens. So revert it.
Fixes: 3e5def4249b9 ("staging: vchiq_arm: Improve initial VCHIQ connect")
Reported-by: Maíra Canal <mcanal@igalia.com>
Closes: https://lore.kernel.org/linux-staging/ba35b960-a981-4671-9f7f-060da10feaa1@usp.br/
Cc: stable@kernel.org
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Link: https://lore.kernel.org/r/20250715161108.3411-2-wahrenst@gmx.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nathan Chancellor [Wed, 16 Jul 2025 03:19:44 +0000 (20:19 -0700)]
tracing/probes: Avoid using params uninitialized in parse_btf_arg()
After a recent change in clang to strengthen uninitialized warnings [1],
it points out that in one of the error paths in parse_btf_arg(), params
is used uninitialized:
kernel/trace/trace_probe.c:660:19: warning: variable 'params' is uninitialized when used here [-Wuninitialized]
660 | return PTR_ERR(params);
| ^~~~~~
Match many other NO_BTF_ENTRY error cases and return -ENOENT, clearing
up the warning.
Link: https://lore.kernel.org/all/20250715-trace_probe-fix-const-uninit-warning-v1-1-98960f91dd04@kernel.org/
Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2110
Fixes: d157d7694460 ("tracing/probes: Support BTF field access from $retval")
Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cdee01472c7
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Jakub Kicinski [Wed, 16 Jul 2025 00:31:30 +0000 (17:31 -0700)]
Merge branch 'mptcp-fix-fallback-related-races'
Matthieu Baerts says:
====================
mptcp: fix fallback-related races
This series contains 3 fixes somewhat related to various races we have
while handling fallback.
The root cause of the issues addressed here is that the check for
"we can fallback to tcp now" and the related action are not atomic. That
also applies to fallback due to MP_FAIL -- where the window race is even
wider.
Address the issue by introducing an additional spinlock to bundle together
all the relevant events, as per patch 1 and 2. These fixes can be
backported up to v5.19 and v5.15.
Note that mptcp_disconnect() unconditionally clears the fallback status
(zeroing msk->flags) but doesn't touch the `allow_infinite_fallback`
flag. This issue is addressed in patch 3, which can be backported up to
v5.17.
====================
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-0-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
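The root cause named in the mptcp cover letter above, a check and its related action not being atomic, can be sketched in user space as follows. This is only an illustration under stated assumptions: a pthread mutex is swapped in for the kernel spinlock, and `struct fake_msk`, `try_fallback()` and `forbid_fallback()` are hypothetical names rather than the MPTCP implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the socket state; not struct mptcp_sock. */
struct fake_msk {
	pthread_mutex_t fallback_lock;	/* guards both flags below */
	bool allow_fallback;		/* fallback is still permitted */
	bool fallen_back;		/* fallback already happened */
};

/* Decide and act under one lock: fall back only if still allowed. */
static bool try_fallback(struct fake_msk *msk)
{
	bool ok;

	pthread_mutex_lock(&msk->fallback_lock);
	ok = msk->allow_fallback;
	if (ok)
		msk->fallen_back = true;
	pthread_mutex_unlock(&msk->fallback_lock);
	return ok;
}

/* Opposite transition (e.g. before adding a subflow), same lock. */
static bool forbid_fallback(struct fake_msk *msk)
{
	bool ok;

	pthread_mutex_lock(&msk->fallback_lock);
	ok = !msk->fallen_back;
	if (ok)
		msk->allow_fallback = false;
	pthread_mutex_unlock(&msk->fallback_lock);
	return ok;
}
```

Because both transitions test and update the flags inside the same critical section, whichever caller takes the lock first wins and the other observes a consistent refusal. No window remains between the check and the action.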
Paolo Abeni [Mon, 14 Jul 2025 16:41:46 +0000 (18:41 +0200)]
mptcp: reset fallback status gracefully at disconnect() time
mptcp_disconnect() clears the fallback bit unconditionally, without
touching the associated flags.
The bit clear is safe, as no fallback operation can race with that --
all subflows are already in TCP_CLOSE status thanks to the previous
FASTCLOSE -- but we need to consistently reset all the fallback-related
status.
Also acquire the relevant lock, to avoid fouling static analyzers.
Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-3-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Mon, 14 Jul 2025 16:41:45 +0000 (18:41 +0200)]
mptcp: plug races between subflow fail and subflow creation
We have races similar to the one addressed by the previous patch between
subflow failing and additional subflow creation. They are just harder to
trigger.
The solution is similar. Use a separate flag to track the condition
'socket state prevents any additional subflow creation' protected by the
fallback lock.
Socket fallback sets this flag, as does receiving or sending
an MP_FAIL option.
The field 'allow_infinite_fallback' is now always touched under the
relevant lock, so we can drop the ONCE annotation on write.
Fixes: 478d770008b0 ("mptcp: send out MP_FAIL when data checksum fails")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-2-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Mon, 14 Jul 2025 16:41:44 +0000 (18:41 +0200)]
mptcp: make fallback action and fallback decision atomic
Syzkaller reported the following splat:
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 __mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 check_fully_established net/mptcp/options.c:982 [inline]
WARNING: CPU: 1 PID: 7704 at net/mptcp/protocol.h:1223 mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153
Modules linked in:
CPU: 1 UID: 0 PID: 7704 Comm: syz.3.1419 Not tainted 6.16.0-rc3-gbd5ce2324dba #20 PREEMPT(voluntary)
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:__mptcp_do_fallback net/mptcp/protocol.h:1223 [inline]
RIP: 0010:mptcp_do_fallback net/mptcp/protocol.h:1244 [inline]
RIP: 0010:check_fully_established net/mptcp/options.c:982 [inline]
RIP: 0010:mptcp_incoming_options+0x21a8/0x2510 net/mptcp/options.c:1153
Code: 24 18 e8 bb 2a 00 fd e9 1b df ff ff e8 b1 21 0f 00 e8 ec 5f c4 fc 44 0f b7 ac 24 b0 00 00 00 e9 54 f1 ff ff e8 d9 5f c4 fc 90 <0f> 0b 90 e9 b8 f4 ff ff e8 8b 2a 00 fd e9 8d e6 ff ff e8 81 2a 00
RSP: 0018:ffff8880a3f08448 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880180a8000 RCX: ffffffff84afcf45
RDX: ffff888090223700 RSI: ffffffff84afdaa7 RDI: 0000000000000001
RBP: ffff888017955780 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8880180a8910 R14: ffff8880a3e9d058 R15: 0000000000000000
FS: 00005555791b8500(0000) GS:ffff88811c495000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000110c2800b7 CR3: 0000000058e44000 CR4: 0000000000350ef0
Call Trace:
<IRQ>
tcp_reset+0x26f/0x2b0 net/ipv4/tcp_input.c:4432
tcp_validate_incoming+0x1057/0x1b60 net/ipv4/tcp_input.c:5975
tcp_rcv_established+0x5b5/0x21f0 net/ipv4/tcp_input.c:6166
tcp_v4_do_rcv+0x5dc/0xa70 net/ipv4/tcp_ipv4.c:1925
tcp_v4_rcv+0x3473/0x44a0 net/ipv4/tcp_ipv4.c:2363
ip_protocol_deliver_rcu+0xba/0x480 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x2f1/0x500 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:317 [inline]
NF_HOOK include/linux/netfilter.h:311 [inline]
ip_local_deliver+0x1be/0x560 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:469 [inline]
ip_rcv_finish net/ipv4/ip_input.c:447 [inline]
NF_HOOK include/linux/netfilter.h:317 [inline]
NF_HOOK include/linux/netfilter.h:311 [inline]
ip_rcv+0x514/0x810 net/ipv4/ip_input.c:567
__netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5975
__netif_receive_skb+0x1f/0x120 net/core/dev.c:6088
process_backlog+0x301/0x1360 net/core/dev.c:6440
__napi_poll.constprop.0+0xba/0x550 net/core/dev.c:7453
napi_poll net/core/dev.c:7517 [inline]
net_rx_action+0xb44/0x1010 net/core/dev.c:7644
handle_softirqs+0x1d0/0x770 kernel/softirq.c:579
do_softirq+0x3f/0x90 kernel/softirq.c:480
</IRQ>
<TASK>
__local_bh_enable_ip+0xed/0x110 kernel/softirq.c:407
local_bh_enable include/linux/bottom_half.h:33 [inline]
inet_csk_listen_stop+0x2c5/0x1070 net/ipv4/inet_connection_sock.c:1524
mptcp_check_listen_stop.part.0+0x1cc/0x220 net/mptcp/protocol.c:2985
mptcp_check_listen_stop net/mptcp/mib.h:118 [inline]
__mptcp_close+0x9b9/0xbd0 net/mptcp/protocol.c:3000
mptcp_close+0x2f/0x140 net/mptcp/protocol.c:3066
inet_release+0xed/0x200 net/ipv4/af_inet.c:435
inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:487
__sock_release+0xb3/0x270 net/socket.c:649
sock_close+0x1c/0x30 net/socket.c:1439
__fput+0x402/0xb70 fs/file_table.c:465
task_work_run+0x150/0x240 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop+0xd4/0xe0 kernel/entry/common.c:114
exit_to_user_mode_prepare include/linux/entry-common.h:330 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:414 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:449 [inline]
do_syscall_64+0x245/0x360 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fc92f8a36ad
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffcf52802d8 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
RAX: 0000000000000000 RBX: 00007ffcf52803a8 RCX: 00007fc92f8a36ad
RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
RBP: 00007fc92fae7ba0 R08: 0000000000000001 R09: 0000002800000000
R10: 00007fc92f700000 R11: 0000000000000246 R12: 00007fc92fae5fac
R13: 00007fc92fae5fa0 R14: 0000000000026d00 R15: 0000000000026c51
</TASK>
irq event stamp: 4068
hardirqs last enabled at (4076): [<ffffffff81544816>] __up_console_sem+0x76/0x80 kernel/printk/printk.c:344
hardirqs last disabled at (4085): [<ffffffff815447fb>] __up_console_sem+0x5b/0x80 kernel/printk/printk.c:342
softirqs last enabled at (3096): [<ffffffff840e1be0>] local_bh_enable include/linux/bottom_half.h:33 [inline]
softirqs last enabled at (3096): [<ffffffff840e1be0>] inet_csk_listen_stop+0x2c0/0x1070 net/ipv4/inet_connection_sock.c:1524
softirqs last disabled at (3097): [<ffffffff813b6b9f>] do_softirq+0x3f/0x90 kernel/softirq.c:480
Since we need to track the 'fallback is possible' condition and the
fallback status separately, there are a few possible races open between
the check and the actual fallback action.
Add a spinlock to protect the fallback related information and use it
to close all the possible related races. While at it also remove the
too-early clearing of allow_infinite_fallback in __mptcp_subflow_connect():
the field will be correctly cleared by subflow_finish_connect() if/when
the connection completes successfully.
If fallback is not possible, as per RFC, reset the current subflow.
Since the fallback operation can now fail and its return value should be
checked, rename the helper accordingly.
Fixes: 0530020a7c8f ("mptcp: track and update contiguous data status")
Cc: stable@vger.kernel.org
Reported-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/570
Reported-by: syzbot+5cf807c20386d699b524@syzkaller.appspotmail.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/555
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250714-net-mptcp-fallback-races-v1-1-391aff963322@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
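The check-then-act race described in "mptcp: make fallback action and fallback decision atomic" can be illustrated with a small user-space sketch. This is not the kernel code: `toy_msk`, `toy_try_fallback()`, `toy_subflow_joined()` and the `atomic_flag` spinlock are hypothetical stand-ins, and only the field name `allow_infinite_fallback` comes from the commit text. The point is that the "is fallback allowed" test and the state change happen under the same lock, and that the helper reports failure so the caller can reset the subflow.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the MPTCP socket state; not the kernel struct. */
struct toy_msk {
	atomic_flag fallback_lock;     /* toy spinlock guarding the fields below */
	bool allow_infinite_fallback;  /* fallback is still possible */
	bool fallen_back;              /* fallback already happened */
};

static void toy_msk_init(struct toy_msk *msk)
{
	atomic_flag_clear(&msk->fallback_lock);
	msk->allow_infinite_fallback = true;
	msk->fallen_back = false;
}

static void toy_lock(struct toy_msk *msk)
{
	while (atomic_flag_test_and_set_explicit(&msk->fallback_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void toy_unlock(struct toy_msk *msk)
{
	atomic_flag_clear_explicit(&msk->fallback_lock, memory_order_release);
}

/* Check and act under one lock; false means the caller must reset the subflow. */
static bool toy_try_fallback(struct toy_msk *msk)
{
	bool ok;

	toy_lock(msk);
	ok = msk->allow_infinite_fallback;
	if (ok)
		msk->fallen_back = true;
	toy_unlock(msk);
	return ok;
}

/* An additional subflow forbids later fallback, unless fallback already won. */
static bool toy_subflow_joined(struct toy_msk *msk)
{
	bool ok;

	toy_lock(msk);
	ok = !msk->fallen_back;
	if (ok)
		msk->allow_infinite_fallback = false;
	toy_unlock(msk);
	return ok;
}
```

Whichever of the two operations takes the lock first wins, and the other one observes a consistent state and fails cleanly instead of tripping a warning.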
Jiawen Wu [Mon, 14 Jul 2025 01:56:56 +0000 (09:56 +0800)]
net: libwx: fix multicast packets received count
Multicast good packets received by PF rings that pass Ethernet MAC
address filtering are counted for rtnl_link_stats64.multicast. The
counter is not cleared on read. Fix the duplicate counting on updating
statistics.
Fixes: 46b92e10d631 ("net: libwx: support hardware statistics")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/DA229A4F58B70E51+20250714015656.91772-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
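The duplicate counting described in "net: libwx: fix multicast packets received count" can be sketched as follows. The structures and function names are hypothetical, and resetting the software total before summing is one plausible way to handle registers that are not cleared on read; the commit text does not spell out the exact change.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_RINGS 4

/* Hypothetical per-ring hardware counters that keep their value after a read. */
struct toy_hw {
	uint32_t ring_mprc[TOY_NUM_RINGS];
};

struct toy_stats {
	uint64_t qmprc; /* software total of multicast packets received */
};

/* Buggy variant: adds the running hardware totals again on every update. */
static void toy_update_stats_buggy(const struct toy_hw *hw, struct toy_stats *st)
{
	for (size_t i = 0; i < TOY_NUM_RINGS; i++)
		st->qmprc += hw->ring_mprc[i];
}

/* Fixed variant: start from zero because the registers already hold totals. */
static void toy_update_stats_fixed(const struct toy_hw *hw, struct toy_stats *st)
{
	st->qmprc = 0;
	for (size_t i = 0; i < TOY_NUM_RINGS; i++)
		st->qmprc += hw->ring_mprc[i];
}
```

With unchanged hardware counters, two calls to the buggy variant double the reported total, while the fixed variant keeps it stable.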
Jakub Kicinski [Wed, 16 Jul 2025 00:28:28 +0000 (17:28 -0700)]
Merge branch 'fix-rx-fatal-errors'
Jiawen Wu says:
====================
Fix Rx fatal errors
There are some fatal errors on the Rx NAPI path, which can cause
the kernel to crash. Fix known issues and potential risks.
Some of these patches have been mentioned before[1].
[1]: https://lore.kernel.org/all/C8A23A11DB646E60+20250630094102.22265-1-jiawenwu@trustnetic.com/
v1: https://lore.kernel.org/20250709064025.19436-1-jiawenwu@trustnetic.com
====================
Link: https://patch.msgid.link/20250714024755.17512-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiawen Wu [Mon, 14 Jul 2025 02:47:55 +0000 (10:47 +0800)]
net: libwx: properly reset Rx ring descriptor
When device reset is triggered by feature changes such as toggling Rx
VLAN offload, wx->do_reset() is called to reinitialize Rx rings. The
hardware descriptor ring may retain stale values from previous sessions.
Setting only the length to 0 in rx_desc[0] would result in building
malformed SKBs. Fix it to ensure a clean slate after device reset.
[ 549.186435] [ C16] ------------[ cut here ]------------
[ 549.186457] [ C16] kernel BUG at net/core/skbuff.c:2814!
[ 549.186468] [ C16] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[ 549.186472] [ C16] CPU: 16 UID: 0 PID: 0 Comm: swapper/16 Kdump: loaded Not tainted 6.16.0-rc4+ #23 PREEMPT(voluntary)
[ 549.186476] [ C16] Hardware name: Micro-Star International Co., Ltd. MS-7E16/X670E GAMING PLUS WIFI (MS-7E16), BIOS 1.90 12/31/2024
[ 549.186478] [ C16] RIP: 0010:__pskb_pull_tail+0x3ff/0x510
[ 549.186484] [ C16] Code: 06 f0 ff 4f 34 74 7b 4d 8b 8c 24 c8 00 00 00 45 8b 84 24 c0 00 00 00 e9 c8 fd ff ff 48 c7 44 24 08 00 00 00 00 e9 5e fe ff ff <0f> 0b 31 c0 e9 23 90 5b ff 41 f7 c6 ff 0f 00 00 75 bf 49 8b 06 a8
[ 549.186487] [ C16] RSP: 0018:ffffb391c0640d70 EFLAGS: 00010282
[ 549.186490] [ C16] RAX: 00000000fffffff2 RBX: ffff8fe7e4d40200 RCX: 00000000fffffff2
[ 549.186492] [ C16] RDX: ffff8fe7c3a4bf8e RSI: 0000000000000180 RDI: ffff8fe7c3a4bf40
[ 549.186494] [ C16] RBP: ffffb391c0640da8 R08: ffff8fe7c3a4c0c0 R09: 000000000000000e
[ 549.186496] [ C16] R10: ffffb391c0640d88 R11: 000000000000000e R12: ffff8fe7e4d40200
[ 549.186497] [ C16] R13: 00000000fffffff2 R14: ffff8fe7fa01a000 R15: 00000000fffffff2
[ 549.186499] [ C16] FS: 0000000000000000(0000) GS:ffff8fef5ae40000(0000) knlGS:0000000000000000
[ 549.186502] [ C16] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 549.186503] [ C16] CR2: 00007f77d81d6000 CR3: 000000051a032000 CR4: 0000000000750ef0
[ 549.186505] [ C16] PKRU: 55555554
[ 549.186507] [ C16] Call Trace:
[ 549.186510] [ C16] <IRQ>
[ 549.186513] [ C16] ? srso_alias_return_thunk+0x5/0xfbef5
[ 549.186517] [ C16] __skb_pad+0xc7/0xf0
[ 549.186523] [ C16] wx_clean_rx_irq+0x355/0x3b0 [libwx]
[ 549.186533] [ C16] wx_poll+0x92/0x120 [libwx]
[ 549.186540] [ C16] __napi_poll+0x28/0x190
[ 549.186544] [ C16] net_rx_action+0x301/0x3f0
[ 549.186548] [ C16] ? srso_alias_return_thunk+0x5/0xfbef5
[ 549.186551] [ C16] ? __raw_spin_lock_irqsave+0x1e/0x50
[ 549.186554] [ C16] ? srso_alias_return_thunk+0x5/0xfbef5
[ 549.186557] [ C16] ? wake_up_nohz_cpu+0x35/0x160
[ 549.186559] [ C16] ? srso_alias_return_thunk+0x5/0xfbef5
[ 549.186563] [ C16] handle_softirqs+0xf9/0x2c0
[ 549.186568] [ C16] __irq_exit_rcu+0xc7/0x130
[ 549.186572] [ C16] common_interrupt+0xb8/0xd0
[ 549.186576] [ C16] </IRQ>
[ 549.186577] [ C16] <TASK>
[ 549.186579] [ C16] asm_common_interrupt+0x22/0x40
[ 549.186582] [ C16] RIP: 0010:cpuidle_enter_state+0xc2/0x420
[ 549.186585] [ C16] Code: 00 00 e8 11 0e 5e ff e8 ac f0 ff ff 49 89 c5 0f 1f 44 00 00 31 ff e8 0d ed 5c ff 45 84 ff 0f 85 40 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 84 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[ 549.186587] [ C16] RSP: 0018:ffffb391c0277e78 EFLAGS: 00000246
[ 549.186590] [ C16] RAX: ffff8fef5ae40000 RBX: 0000000000000003 RCX: 0000000000000000
[ 549.186591] [ C16] RDX: 0000007fde0faac5 RSI: ffffffff826e53f6 RDI: ffffffff826fa9b3
[ 549.186593] [ C16] RBP: ffff8fe7c3a20800 R08: 0000000000000002 R09: 0000000000000000
[ 549.186595] [ C16] R10: 0000000000000000 R11: 000000000000ffff R12: ffffffff82ed7a40
[ 549.186596] [ C16] R13: 0000007fde0faac5 R14: 0000000000000003 R15: 0000000000000000
[ 549.186601] [ C16] ? cpuidle_enter_state+0xb3/0x420
[ 549.186605] [ C16] cpuidle_enter+0x29/0x40
[ 549.186609] [ C16] cpuidle_idle_call+0xfd/0x170
[ 549.186613] [ C16] do_idle+0x7a/0xc0
[ 549.186616] [ C16] cpu_startup_entry+0x25/0x30
[ 549.186618] [ C16] start_secondary+0x117/0x140
[ 549.186623] [ C16] common_startup_64+0x13e/0x148
[ 549.186628] [ C16] </TASK>
Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250714024755.17512-4-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
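The stale-descriptor problem from "net: libwx: properly reset Rx ring descriptor" can be sketched in user-space C. The descriptor layout, ring size and function names below are hypothetical and chosen only for illustration; the sketch contrasts clearing a single field of the first descriptor with zeroing the whole ring.

```c
#include <stdint.h>
#include <string.h>

#define TOY_RING_SIZE 8

/* Hypothetical write-back descriptor layout; the real hardware layout differs. */
struct toy_rx_desc {
	uint32_t status;
	uint16_t length;
	uint16_t vlan;
};

/* Insufficient reset: only the first descriptor's length is cleared. */
static void toy_reset_first_only(struct toy_rx_desc *ring)
{
	ring[0].length = 0;
}

/* Full reset: every descriptor starts from zero after the device reset. */
static void toy_reset_ring(struct toy_rx_desc *ring)
{
	memset(ring, 0, sizeof(*ring) * TOY_RING_SIZE);
}

/* Count descriptors that still carry state from before the reset. */
static int toy_count_stale(const struct toy_rx_desc *ring)
{
	int stale = 0;

	for (int i = 0; i < TOY_RING_SIZE; i++)
		if (ring[i].status || ring[i].length || ring[i].vlan)
			stale++;
	return stale;
}
```

After the partial reset every descriptor in this toy ring still looks "done" to the receive path, which is the kind of state that can lead to malformed buffers being built.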
Jiawen Wu [Mon, 14 Jul 2025 02:47:54 +0000 (10:47 +0800)]
net: libwx: fix the using of Rx buffer DMA
The wx_rx_buffer structure contained two DMA address fields: 'dma' and
'page_dma'. However, only 'page_dma' was actually initialized and used
to program the Rx descriptor, while 'dma' was left uninitialized yet was
still used in some paths.
This could lead to undefined behavior, including DMA errors or
use-after-free, if the uninitialized 'dma' was used. Although such an
error has not yet occurred, it is worth fixing in the code.
Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250714024755.17512-3-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiawen Wu [Mon, 14 Jul 2025 02:47:53 +0000 (10:47 +0800)]
net: libwx: remove duplicate page_pool_put_full_page()
page_pool_put_full_page() should only be invoked when freeing Rx buffers
or building a skb if the size is too short. At other times, the pages
need to be reused. So remove the redundant page put. In the original
code, double-freed pages cause a kernel panic:
[ 876.949834] __irq_exit_rcu+0xc7/0x130
[ 876.949836] common_interrupt+0xb8/0xd0
[ 876.949838] </IRQ>
[ 876.949838] <TASK>
[ 876.949840] asm_common_interrupt+0x22/0x40
[ 876.949841] RIP: 0010:cpuidle_enter_state+0xc2/0x420
[ 876.949843] Code: 00 00 e8 d1 1d 5e ff e8 ac f0 ff ff 49 89 c5 0f 1f 44 00 00 31 ff e8 cd fc 5c ff 45 84 ff 0f 85 40 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 84 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[ 876.949844] RSP: 0018:ffffaa7340267e78 EFLAGS: 00000246
[ 876.949845] RAX: ffff9e3f135be000 RBX: 0000000000000002 RCX: 0000000000000000
[ 876.949846] RDX: 000000cc2dc4cb7c RSI: ffffffff89ee49ae RDI: ffffffff89ef9f9e
[ 876.949847] RBP: ffff9e378f940800 R08: 0000000000000002 R09: 00000000000000ed
[ 876.949848] R10: 000000000000afc8 R11: ffff9e3e9e5a9b6c R12: ffffffff8a6d8580
[ 876.949849] R13: 000000cc2dc4cb7c R14: 0000000000000002 R15: 0000000000000000
[ 876.949852] ? cpuidle_enter_state+0xb3/0x420
[ 876.949855] cpuidle_enter+0x29/0x40
[ 876.949857] cpuidle_idle_call+0xfd/0x170
[ 876.949859] do_idle+0x7a/0xc0
[ 876.949861] cpu_startup_entry+0x25/0x30
[ 876.949862] start_secondary+0x117/0x140
[ 876.949864] common_startup_64+0x13e/0x148
[ 876.949867] </TASK>
[ 876.949868] ---[ end trace 0000000000000000 ]---
[ 876.949869] ------------[ cut here ]------------
[ 876.949870] list_del corruption, ffffead40445a348->next is NULL
[ 876.949873] WARNING: CPU: 14 PID: 0 at lib/list_debug.c:52 __list_del_entry_valid_or_report+0x67/0x120
[ 876.949875] Modules linked in: snd_hrtimer(E) bnep(E) binfmt_misc(E) amdgpu(E) squashfs(E) vfat(E) loop(E) fat(E) amd_atl(E) snd_hda_codec_realtek(E) intel_rapl_msr(E) snd_hda_codec_generic(E) intel_rapl_common(E) snd_hda_scodec_component(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) edac_mce_amd(E) snd_intel_dspcfg(E) snd_hda_codec(E) snd_hda_core(E) amdxcp(E) kvm_amd(E) snd_hwdep(E) gpu_sched(E) drm_panel_backlight_quirks(E) cec(E) snd_pcm(E) drm_buddy(E) snd_seq_dummy(E) drm_ttm_helper(E) btusb(E) kvm(E) snd_seq_oss(E) btrtl(E) ttm(E) btintel(E) snd_seq_midi(E) btbcm(E) drm_exec(E) snd_seq_midi_event(E) i2c_algo_bit(E) snd_rawmidi(E) bluetooth(E) drm_suballoc_helper(E) irqbypass(E) snd_seq(E) ghash_clmulni_intel(E) sha512_ssse3(E) drm_display_helper(E) aesni_intel(E) snd_seq_device(E) rfkill(E) snd_timer(E) gf128mul(E) drm_client_lib(E) drm_kms_helper(E) snd(E) i2c_piix4(E) joydev(E) soundcore(E) wmi_bmof(E) ccp(E) k10temp(E) i2c_smbus(E) gpio_amdpt(E) i2c_designware_platform(E) gpio_generic(E) sg(E)
[ 876.949914] i2c_designware_core(E) sch_fq_codel(E) parport_pc(E) drm(E) ppdev(E) lp(E) parport(E) fuse(E) nfnetlink(E) ip_tables(E) ext4 crc16 mbcache jbd2 sd_mod sfp mdio_i2c i2c_core txgbe ahci ngbe pcs_xpcs libahci libwx r8169 phylink libata realtek ptp pps_core video wmi
[ 876.949933] CPU: 14 UID: 0 PID: 0 Comm: swapper/14 Kdump: loaded Tainted: G W E 6.16.0-rc2+ #20 PREEMPT(voluntary)
[ 876.949935] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
[ 876.949936] Hardware name: Micro-Star International Co., Ltd. MS-7E16/X670E GAMING PLUS WIFI (MS-7E16), BIOS 1.90 12/31/2024
[ 876.949936] RIP: 0010:__list_del_entry_valid_or_report+0x67/0x120
[ 876.949938] Code: 00 00 00 48 39 7d 08 0f 85 a6 00 00 00 5b b8 01 00 00 00 5d 41 5c e9 73 0d 93 ff 48 89 fe 48 c7 c7 a0 31 e8 89 e8 59 7c b3 ff <0f> 0b 31 c0 5b 5d 41 5c e9 57 0d 93 ff 48 89 fe 48 c7 c7 c8 31 e8
[ 876.949940] RSP: 0018:ffffaa73405d0c60 EFLAGS: 00010282
[ 876.949941] RAX: 0000000000000000 RBX: ffffead40445a348 RCX: 0000000000000000
[ 876.949942] RDX: 0000000000000105 RSI: 0000000000000001 RDI: 00000000ffffffff
[ 876.949943] RBP: 0000000000000000 R08: 000000010006dfde R09: ffffffff8a47d150
[ 876.949944] R10: ffffffff8a47d150 R11: 0000000000000003 R12: dead000000000122
[ 876.949945] R13: ffff9e3e9e5af700 R14: ffffead40445a348 R15: ffff9e3e9e5af720
[ 876.949946] FS: 0000000000000000(0000) GS:ffff9e3f135be000(0000) knlGS:0000000000000000
[ 876.949947] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 876.949948] CR2: 00007fa58b480048 CR3: 0000000156724000 CR4: 0000000000750ef0
[ 876.949949] PKRU: 55555554
[ 876.949950] Call Trace:
[ 876.949951] <IRQ>
[ 876.949952] __rmqueue_pcplist+0x53/0x2c0
[ 876.949955] alloc_pages_bulk_noprof+0x2e0/0x660
[ 876.949958] __page_pool_alloc_pages_slow+0xa9/0x400
[ 876.949961] page_pool_alloc_pages+0xa/0x20
[ 876.949963] wx_alloc_rx_buffers+0xd7/0x110 [libwx]
[ 876.949967] wx_clean_rx_irq+0x262/0x430 [libwx]
[ 876.949971] wx_poll+0x92/0x130 [libwx]
[ 876.949975] __napi_poll+0x28/0x190
[ 876.949977] net_rx_action+0x301/0x3f0
[ 876.949980] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949981] ? profile_tick+0x30/0x70
[ 876.949983] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949984] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949986] ? timerqueue_add+0xa3/0xc0
[ 876.949988] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949989] ? __raise_softirq_irqoff+0x16/0x70
[ 876.949991] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949993] ? srso_alias_return_thunk+0x5/0xfbef5
[ 876.949994] ? wx_msix_clean_rings+0x41/0x50 [libwx]
[ 876.949998] handle_softirqs+0xf9/0x2c0
Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250714024755.17512-2-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Markus Blöchl [Sun, 13 Jul 2025 20:21:41 +0000 (22:21 +0200)]
net: stmmac: intel: populate entire system_counterval_t in get_time_fn() callback
get_time_fn() callback implementations are expected to fill out the
entire system_counterval_t struct as it may be initially uninitialized.
This broke with the removal of convert_art_to_tsc() helper functions
which left use_nsecs uninitialized.
Initially assign the entire struct with default values.
Fixes: f5e1d0db3f02 ("stmmac: intel: Remove convert_art_to_tsc()")
Cc: stable@vger.kernel.org
Signed-off-by: Markus Blöchl <markus@blochl.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250713-stmmac_crossts-v1-1-31bfe051b5cb@blochl.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
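The idea behind "net: stmmac: intel: populate entire system_counterval_t in get_time_fn() callback" is to assign the whole output struct at once so that no member keeps whatever the caller's memory held. A minimal user-space sketch, assuming a reduced stand-in struct `toy_counterval` (its fields only loosely mirror the kernel type) and a hypothetical callback `toy_get_time()`:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in; not the kernel's system_counterval_t definition. */
struct toy_counterval {
	uint64_t cycles;
	int cs_id;
	bool use_nsecs;
};

/* Assign the entire struct first; members not named are zero-initialized. */
static int toy_get_time(uint64_t hw_cycles, int cs_id, struct toy_counterval *out)
{
	*out = (struct toy_counterval){
		.cycles = hw_cycles,
		.cs_id = cs_id,
		/* use_nsecs is implicitly false here */
	};
	return 0;
}
```

A compound-literal assignment like this stays correct when new members are added to the struct later, which member-by-member assignment does not guarantee.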
Jakub Kicinski [Tue, 15 Jul 2025 23:05:41 +0000 (16:05 -0700)]
Merge tag 'linux-can-fixes-for-6.16-20250715' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2025-07-15
Brett Werling's patch for the tcan4x5x glue code driver fixes the
detection of chips which are held in reset/sleep and must be woken up
by GPIO prior to communication.
* tag 'linux-can-fixes-for-6.16-20250715' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: tcan4x5x: fix reset gpio usage during probe
====================
Link: https://patch.msgid.link/20250715101625.3202690-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Oliver Neukum [Mon, 14 Jul 2025 11:12:56 +0000 (13:12 +0200)]
usb: net: sierra: check for no status endpoint
The driver checks for having three endpoints and
having bulk in and out endpoints, but not that
the third endpoint is interrupt input.
Rectify the omission.
Reported-by: syzbot+3f89ec3d1d0842e95d50@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/686d5a9f.050a0220.1ffab7.0017.GAE@google.com/
Tested-by: syzbot+3f89ec3d1d0842e95d50@syzkaller.appspotmail.com
Fixes: eb4fd8cd355c8 ("net/usb: add sierra_net.c driver")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://patch.msgid.link/20250714111326.258378-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
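The missing check from "usb: net: sierra: check for no status endpoint" can be sketched as a predicate over endpoint descriptors. The bit values follow the USB endpoint descriptor encoding (direction in bit 7 of bEndpointAddress, transfer type in the low two bits of bmAttributes); the struct and function names are hypothetical, and the actual driver relies on kernel helpers rather than this code.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_EP_DIR_IN        0x80 /* bEndpointAddress bit 7: device to host */
#define TOY_EP_XFERTYPE_MASK 0x03 /* bmAttributes bits 1:0 */
#define TOY_EP_XFER_INT      0x03 /* interrupt transfer type */

/* Hypothetical reduced endpoint descriptor. */
struct toy_endpoint {
	uint8_t bEndpointAddress;
	uint8_t bmAttributes;
};

static bool toy_ep_is_int_in(const struct toy_endpoint *ep)
{
	return (ep->bEndpointAddress & TOY_EP_DIR_IN) &&
	       (ep->bmAttributes & TOY_EP_XFERTYPE_MASK) == TOY_EP_XFER_INT;
}

/* Require exactly three endpoints, one of which is an interrupt IN endpoint. */
static bool toy_has_status_endpoint(const struct toy_endpoint *eps, int n)
{
	if (n != 3)
		return false;
	for (int i = 0; i < n; i++)
		if (toy_ep_is_int_in(&eps[i]))
			return true;
	return false;
}
```

A device that advertises three endpoints but whose third one is, for example, interrupt OUT would pass a count-only check and fail this one.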
Sean Christopherson [Mon, 14 Jul 2025 22:19:28 +0000 (15:19 -0700)]
KVM: VMX: Ensure unused kvm_tdx_capabilities fields are zeroed out
Zero-allocate the kernel's kvm_tdx_capabilities structure and copy only
the number of CPUID entries from the userspace structure. As is, KVM
doesn't ensure kernel_tdvmcallinfo_1_{r11,r12} and user_tdvmcallinfo_1_r12
are zeroed, i.e. KVM will reflect whatever happens to be in the userspace
structure back at userspace, and thus may report garbage to userspace.
Zeroing the entire kernel structure also provides better semantics for the
reserved field. E.g. if KVM extends kvm_tdx_capabilities to enumerate new
information by repurposing bytes from the reserved field, userspace would
be required to zero the new field in order to get useful information back
(because older KVMs without support for the repurposed field would report
garbage, a la the aforementioned tdvmcallinfo bugs).
Fixes: 61bb28279623 ("KVM: TDX: Get system-wide info about TDX module on initialization")
Suggested-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reported-by: Xiaoyao Li <xiaoyao.li@intel.com>
Closes: https://lore.kernel.org/all/3ef581f1-1ff1-4b99-b216-b316f6415318@intel.com
Tested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://lore.kernel.org/r/20250714221928.1788095-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
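The approach described in "KVM: VMX: Ensure unused kvm_tdx_capabilities fields are zeroed out" can be sketched in user-space C: zero-allocate the kernel-side copy, take only the one input field from the caller, and copy the result back. The struct `toy_caps`, its members and `toy_get_caps()` are hypothetical, with `calloc()` and `memcpy()` standing in for the kernel allocation and user-copy helpers.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for a capabilities struct. */
struct toy_caps {
	uint64_t supported_attrs;
	uint32_t nr_cpuid_configs; /* the only field read from the caller */
	uint32_t info_r11;         /* a field the producer may not fill in */
	uint64_t reserved[4];
};

/* Returns 0 on success, -1 on allocation failure. */
static int toy_get_caps(struct toy_caps *user)
{
	struct toy_caps *caps = calloc(1, sizeof(*caps));

	if (!caps)
		return -1;

	/* Take only the single input field from the caller-provided struct. */
	caps->nr_cpuid_configs = user->nr_cpuid_configs;
	caps->supported_attrs = 0x1; /* arbitrary illustrative value */

	/* Everything not set above, including reserved[], goes back as zero. */
	memcpy(user, caps, sizeof(*caps));
	free(caps);
	return 0;
}
```

Because unused and reserved members always come back as zero, a later extension that repurposes reserved bytes can be distinguished from an older producer that never wrote them.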
Michal Swiatkowski [Tue, 24 Jun 2025 09:26:36 +0000 (11:26 +0200)]
ice: check correct pointer in fwlog debugfs
pf->ice_debugfs_pf_fwlog should be checked for an error here.
Fixes: 96a9a9341cda ("ice: configure FW logging")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Dave Ertman [Thu, 22 May 2025 17:16:57 +0000 (13:16 -0400)]
ice: add NULL check in eswitch lag check
The function ice_lag_is_switchdev_running() is being called from outside of
the LAG event handler code. This results in the lag->upper_netdev being
NULL sometimes. To avoid a NULL-pointer dereference, there needs to be a
check before it is dereferenced.
Fixes: 776fe19953b0 ("ice: block default rule setting on LAG interface")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Arnd Bergmann [Fri, 20 Jun 2025 17:31:24 +0000 (19:31 +0200)]
ethernet: intel: fix building with large NR_CPUS
With large values of CONFIG_NR_CPUS, three Intel ethernet drivers fail to
compile like:
In function ‘i40e_free_q_vector’,
inlined from ‘i40e_vsi_alloc_q_vectors’ at drivers/net/ethernet/intel/i40e/i40e_main.c:12112:3:
571 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
include/linux/rcupdate.h:1084:17: note: in expansion of macro ‘BUILD_BUG_ON’
1084 | BUILD_BUG_ON(offsetof(typeof(*(ptr)), rhf) >= 4096); \
drivers/net/ethernet/intel/i40e/i40e_main.c:5113:9: note: in expansion of macro ‘kfree_rcu’
5113 | kfree_rcu(q_vector, rcu);
| ^~~~~~~~~
The problem is that the 'rcu' member in 'q_vector' is too far from the start
of the structure. Move this member before the CPU mask instead, in all three
drivers.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
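The layout constraint behind "ethernet: intel: fix building with large NR_CPUS" can be reproduced in a few lines of standalone C. The 4096-byte limit is taken from the BUILD_BUG_ON shown in the compiler output above; the struct names, the `TOY_NR_CPUS` value and the reduced `toy_rcu_head` are hypothetical and exist only to show how member order changes the offset.

```c
#include <stddef.h>

#define TOY_NR_CPUS 65536 /* hypothetical large CPU count for illustration */

/* One bit per CPU, like a CPU mask embedded by value in a struct. */
struct toy_cpumask {
	unsigned long bits[TOY_NR_CPUS / (8 * sizeof(unsigned long))];
};

/* Reduced stand-in for an RCU callback head. */
struct toy_rcu_head {
	void *next;
	void (*func)(struct toy_rcu_head *);
};

/* Problematic order: the large mask pushes 'rcu' past the offset limit. */
struct toy_q_vector_bad {
	int v_idx;
	struct toy_cpumask affinity_mask;
	struct toy_rcu_head rcu;
};

/* Reordered: 'rcu' sits before the mask, close to the start of the struct. */
struct toy_q_vector_good {
	int v_idx;
	struct toy_rcu_head rcu;
	struct toy_cpumask affinity_mask;
};

/* Same shape as the check in the error output: offset must stay below 4096. */
_Static_assert(offsetof(struct toy_q_vector_good, rcu) < 4096,
	       "rcu member must be within the first 4096 bytes");
```

Applying the same static assertion to `toy_q_vector_bad` fails to compile, which mirrors the build failure quoted in the commit message.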
Marius Zachmann [Thu, 19 Jun 2025 13:27:47 +0000 (15:27 +0200)]
hwmon: (corsair-cpro) Validate the size of the received input buffer
Add buffer_recv_size to store the size of the received bytes.
Validate buffer_recv_size in send_usb_cmd().
Reported-by: syzbot+3bbbade4e1a7ab45ca3b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-hwmon/61233ba1-e5ad-4d7a-ba31-3b5d0adcffcc@roeck-us.net
Fixes: 40c3a4454225 ("hwmon: add Corsair Commander Pro driver")
Signed-off-by: Marius Zachmann <mail@mariuszachmann.de>
Link: https://lore.kernel.org/r/20250619132817.39764-5-mail@mariuszachmann.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
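The pattern named in "hwmon: (corsair-cpro) Validate the size of the received input buffer" is to remember how many bytes actually arrived and to check that number before parsing the reply. The sketch below is a user-space illustration only: `buffer_recv_size` is the name from the commit text, while `toy_dev`, `TOY_MIN_REPLY`, `toy_on_receive()` and `toy_check_reply()` are hypothetical, and the exact condition used by the driver is not stated in the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MIN_REPLY 4 /* hypothetical minimum length of a usable reply */

/* Hypothetical device state with a receive buffer and its fill level. */
struct toy_dev {
	uint8_t buffer[16];
	size_t buffer_recv_size; /* bytes actually received */
};

/* Receive path: copy at most the buffer size and record the length. */
static void toy_on_receive(struct toy_dev *dev, const uint8_t *data, size_t len)
{
	if (len > sizeof(dev->buffer))
		len = sizeof(dev->buffer);
	memcpy(dev->buffer, data, len);
	dev->buffer_recv_size = len;
}

/* Returns 0 if the reply is long enough to parse, -1 otherwise. */
static int toy_check_reply(const struct toy_dev *dev)
{
	if (dev->buffer_recv_size < TOY_MIN_REPLY)
		return -1;
	return 0;
}
```

Without the recorded length, a short reply would leave the tail of the buffer holding bytes from an earlier transfer, and the parser could not tell the difference.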
Paolo Bonzini [Tue, 15 Jul 2025 17:32:31 +0000 (19:32 +0200)]
Merge tag 'kvm-riscv-fixes-6.16-2' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 6.16, take #2
- Disable vstimecmp before exiting to user-space
- Move HGEI[E|P] CSR access to IMSIC virtualization
Paolo Bonzini [Tue, 15 Jul 2025 17:32:23 +0000 (19:32 +0200)]
Merge tag 'kvmarm-fixes-6.16-6' of git://git./linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.16, take #6
- Fix use of u64_replace_bits() in adjusting the guest's view of
MDCR_EL2.HPMN.
Paolo Bonzini [Thu, 10 Jul 2025 10:17:11 +0000 (12:17 +0200)]
KVM: Documentation: document how KVM is tested
Proper testing greatly simplifies both patch development and review,
but it can be unclear what kind of userspace or guest support
should accompany new features. Clarify what maintainers expect
in terms of testing; additionally, list the cases in
which open-source userspace support is pretty much a necessity and
its absence can only be mitigated by selftests.
While these ideas have long been followed implicitly by KVM contributors
and maintainers, formalize them in writing to provide consistent (though
not universal) guidelines.
Suggested-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 10 Jul 2025 10:14:17 +0000 (12:14 +0200)]
KVM: Documentation: minimal updates to review-checklist.rst
While the file could stand a larger update, these are the bare minimum changes
needed to make it more widely applicable.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Tue, 15 Jul 2025 16:26:33 +0000 (09:26 -0700)]
Merge tag 'soc-fixes-6.16-2' of git://git./linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"There are 18 devicetree fixes for three arm64 platforms: Qualcomm
Snapdragon, Rockchips and NXP i.MX. These get updated to more
correctly describe the hardware, fixing issues with:
- real-time clock on Snapdragon based laptops
- SD card detection, PCI probing and HDMI/DDC communication on
Rockchips
- ethernet and SPI probing on certain i.MX based boards
- a regression with the i.MX watchdog
Aside from the devicetree fixes, there are two additional fixes for
the merged ASPEED LPC snoop driver that saw some changes in 6.16, and
one additional driver enabled in arm64 defconfig to fix CPU frequency
scaling"
* tag 'soc-fixes-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (21 commits)
arm64: dts: freescale: imx8mm-verdin: Keep LDO5 always on
soc: aspeed: lpc-snoop: Don't disable channels that aren't enabled
soc: aspeed: lpc-snoop: Cleanup resources in stack-order
arm64: dts: imx95: Correct the DMA interrupter number of pcie0_ep
arm64: dts: rockchip: Add missing fan-supply to rk3566-quartz64-a
arm64: dts: rockchip: use cs-gpios for spi1 on ringneck
arm64: dts: add big-endian property back into watchdog node
arm64: dts: imx95-15x15-evk: fix the overshoot issue of NETC
arm64: dts: imx95-19x19-evk: fix the overshoot issue of NETC
arm64: dts: rockchip: list all CPU supplies on ArmSoM Sige5
arm64: dts: imx8mp-venice-gw74xx: fix TPM SPI frequency
arm64: dts: imx8mp-venice-gw73xx: fix TPM SPI frequency
arm64: dts: imx8mp-venice-gw72xx: fix TPM SPI frequency
arm64: dts: imx8mp-venice-gw71xx: fix TPM SPI frequency
arm64: dts: qcom: x1e80100: describe uefi rtc offset
arm64: dts: qcom: sc8280xp-x13s: describe uefi rtc offset
arm64: defconfig: Enable Qualcomm CPUCP mailbox driver
arm64: dts: rockchip: Add cd-gpios for sdcard detect on Cool Pi 4B
arm64: dts: rockchip: Add cd-gpios for sdcard detect on Cool Pi CM5
arm64: dts: rockchip: Adjust the HDMI DDC IO driver strength for rk3588
...
Linus Torvalds [Tue, 15 Jul 2025 16:20:44 +0000 (09:20 -0700)]
Merge tag 'hid-for-linus-2025071501' of git://git./linux/kernel/git/hid/hid
Pull HID fixes from Benjamin Tissoires:
- one warning cleanup introduced in the last PR (Andy Shevchenko)
- a nasty syzbot buffer underflow fix co-debugged with Alan Stern
(Benjamin Tissoires)
* tag 'hid-for-linus-2025071501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
selftests/hid: add a test case for the recent syzbot underflow
HID: core: do not bypass hid_hw_raw_request
HID: core: ensure __hid_request reserves the report ID as the first byte
HID: core: ensure the allocated report buffer can contain the reserved report ID
HID: debug: Remove duplicate entry (BTN_WHEEL)
Abinash Singh [Sat, 5 Jul 2025 16:00:55 +0000 (21:30 +0530)]
dma: dw-edma: Fix build warning in dw_edma_pcie_probe()
The function dw_edma_pcie_probe() in dw-edma-pcie.c triggered a
frame size warning:
ld.lld:warning: drivers/dma/dw-edma/dw-edma-pcie.c:162:0: stack frame size (1040) exceeds limit (1024) in function 'dw_edma_pcie_probe'
This patch reduces the stack usage by dynamically allocating the
`vsec_data` structure using kmalloc(), rather than placing it on
the stack. This eliminates the overflow warning and improves kernel
robustness.
Signed-off-by: Abinash Singh <abinashsinghlalotra@gmail.com>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://lore.kernel.org/r/20250705160055.808165-1-abinashsinghlalotra@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Dan Carpenter [Tue, 1 Jul 2025 22:31:40 +0000 (17:31 -0500)]
dmaengine: nbpfaxi: Fix memory corruption in probe()
The nbpf->chan[] array is allocated earlier in the nbpf_probe() function
and it has "num_channels" elements. These three loops iterate one
element farther than they should and corrupt memory.
The changes to the second loop are more involved. In this case, we're
copying data from the irqbuf[] array into the nbpf->chan[] array. If
the data in irqbuf[i] is the error IRQ then we skip it, so the iterators
are not in sync. I added a check to ensure that we don't go beyond the
end of the irqbuf[] array. I'm pretty sure this can't happen, but it
seemed harmless to add a check.
On the other hand, after the loop has ended there is a check to ensure
that the "chan" iterator is where we expect it to be. In the original
code we went one element beyond the end of the array so the iterator
wasn't in the correct place and it would always return -EINVAL. However,
now it will always be in the correct place. I deleted the check since
we know the result.
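The second loop described above can be sketched in plain C as follows. This is only an illustration under assumed names (fill_chan_irqs(), chan_irq[], err_irq are hypothetical), not the driver's code: two iterators advance at different rates, so each one is bounded by the size of its own array.

```c
#include <stddef.h>

/*
 * Copy every IRQ except err_irq from irqbuf[] into chan_irq[].
 * "i" walks irqbuf[] and "chan" walks chan_irq[]; they drift apart
 * whenever the error IRQ is skipped, so both need their own bound.
 * Returns the number of channel IRQs written.
 */
size_t fill_chan_irqs(const int *irqbuf, size_t irqs, int err_irq,
		      int *chan_irq, size_t num_channels)
{
	size_t i = 0, chan = 0;

	/* Strict "<" on both sides: "<=" would touch one element too many. */
	while (chan < num_channels && i < irqs) {
		if (irqbuf[i] != err_irq)
			chan_irq[chan++] = irqbuf[i];
		i++;
	}
	return chan;
}
```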
Cc: stable@vger.kernel.org
Fixes: b45b262cefd5 ("dmaengine: add a driver for AMBA AXI NBPF DMAC IP cores")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/b13c5225-7eff-448c-badc-a2c98e9bcaca@sabinyo.mountain
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Harshit Mogalapalli [Thu, 10 Jul 2025 17:24:02 +0000 (10:24 -0700)]
phy: qcom: fix error code in snps_eusb2_hsphy_probe()
When phy->ref_clk is NULL, PTR_ERR(NULL) evaluates to 0, i.e. success. Fix this by
using -ENOENT when phy->ref_clk is NULL instead.
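A simplified userspace model of the problem, assuming the usual error-pointer convention (the top 4095 addresses encode negative errno values). The helpers is_err(), ptr_err() and check_ref_clk() are hypothetical stand-ins written for this sketch, not the kernel's macros or the driver's code.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Model of PTR_ERR(): reinterpret the pointer value as a long. */
long ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* Model of IS_ERR(): true only for the top MAX_ERRNO addresses. */
int is_err(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/*
 * Returns 0 for a usable pointer, the encoded errno for an error
 * pointer, and -ENOENT for NULL. Without the NULL branch, a NULL
 * pointer would yield ptr_err(NULL) == 0, which callers read as success.
 */
int check_ref_clk(const void *ref_clk)
{
	if (is_err(ref_clk))
		return (int)ptr_err(ref_clk);
	if (!ref_clk)
		return -ENOENT;
	return 0;
}
```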
Fixes: 80090810f5d3 ("phy: qcom: Add QCOM SNPS eUSB2 driver")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/aDCbeuCTy9zyWJAM@stanley.mountain/
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Link: https://lore.kernel.org/r/20250710172403.2593193-1-harshit.m.mogalapalli@oracle.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Kai Huang [Sun, 13 Jul 2025 22:20:20 +0000 (10:20 +1200)]
KVM: x86: Reject KVM_SET_TSC_KHZ vCPU ioctl for TSC protected guest
Reject KVM_SET_TSC_KHZ vCPU ioctl if guest's TSC is protected and not
changeable by KVM, and update the documentation to reflect it.
For such TSC protected guests, e.g. TDX guests, typically the TSC is
configured once at VM level before any vCPUs are created and remains
unchanged during VM's lifetime. KVM provides the KVM_SET_TSC_KHZ VM
scope ioctl to allow the userspace VMM to configure the TSC of such VM.
After that the userspace VMM is not supposed to call the KVM_SET_TSC_KHZ
vCPU scope ioctl anymore when creating the vCPU.
The de facto userspace VMM QEMU does this for TDX guests. The upcoming
SEV-SNP guests with Secure TSC should follow.
Note, TDX support hasn't been fully released as of the "buggy" commit,
i.e. there is no established ABI to break.
Fixes: adafea110600 ("KVM: x86: Add infrastructure for secure TSC")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>
Link: https://lore.kernel.org/r/71bbdf87fdd423e3ba3a45b57642c119ee2dd98c.1752444335.git.kai.huang@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Imre Deak [Tue, 8 Jul 2025 21:23:31 +0000 (00:23 +0300)]
drm/dp: Change AUX DPCD probe address from LANE0_1_STATUS to TRAINING_PATTERN_SET
Commit a3ef3c2da675 ("drm/dp: Change AUX DPCD probe address from
DPCD_REV to LANE0_1_STATUS") stopped using the DPCD_REV register for
DPCD probing, since this results in link training failures at least when
using an Intel Barlow Ridge TBT hub at UHBR link rates (the
DP_INTRA_HOP_AUX_REPLY_INDICATION never getting cleared after the failed
link training). Since accessing DPCD_REV during link training is
prohibited by the DP Standard, LANE0_1_STATUS (0x202) was used instead,
as it falls within the Standard's valid register address range
(0x102-0x106, 0x202-0x207, 0x200c-0x200f, 0x2216) and it fixed the link
training on the above TBT hub.
However, reading the LANE0_1_STATUS register also has a side-effect at
least on a Novatek eDP panel, as reported on the Closes: link below,
resulting in screen flickering on that panel. One clear side-effect when
doing the 1-byte probe reads from LANE0_1_STATUS during link training
before reading out the full 6 byte link status starting at the same
address is that the panel will report the link training as completed
with voltage swing 0. This is different from the normal, flicker-free
scenario when no DPCD probing is done, the panel reporting the link
training complete with voltage swing 2.
Using the TRAINING_PATTERN_SET register for DPCD probing doesn't have
the above side-effect, the panel will link train with voltage swing 2 as
expected and it will stay flicker-free. This register is also in the
above valid register range and is unlikely to have a side-effect as that
of LANE0_1_STATUS: Reading LANE0_1_STATUS is part of the link training
CR/EQ sequences and so it may cause a state change in the sink - even if
inadvertently as I suspect in the case of the above Novatek panel. As
opposed to this, reading TRAINING_PATTERN_SET is not part of the link
training sequence (it must be only written once at the beginning of the
CR/EQ sequences), so it's unlikely to cause any state change in the
sink.
As a side-note, this Novatek panel also lacks support for TPS3, while
claiming support for HBR2, which violates the DP Standard (the Standard
mandating TPS3 for HBR2).
Besides the Novatek panel (PSR 1), which this change fixes, I also
verified the change on a Samsung (PSR 1) and an Analogix (PSR 2) eDP
panel as well as on the Intel Barlow Ridge TBT hub.
Note that in the drm-tip tree (targeting the v6.17 kernel version) the
i915 and xe drivers keep DPCD probing enabled only for the panel known
to require this (HP ZR24w), hence those drivers in drm-tip are not
affected by the above problem.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Fixes: a3ef3c2da675 ("drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS")
Reported-and-tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14558
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250708212331.112898-1-imre.deak@intel.com
(cherry picked from commit bba9aa41654036534d86b198f5647a9ce15ebd7f)
[Imre: Rebased on drm-intel-fixes]
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo: Changed original commit hash to match with the one propagated in fixes]
Paolo Abeni [Thu, 10 Jul 2025 16:04:50 +0000 (18:04 +0200)]
selftests: net: increase inter-packet timeout in udpgro.sh
The mentioned test is not very stable when running on top of
a debug kernel build. Increase the inter-packet timeout to allow
more slack in such environments.
Fixes: 3327a9c46352 ("selftests: add functionals test for UDP GRO")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/b0370c06ddb3235debf642c17de0284b2cd3c652.1752163107.git.pabeni@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Rafael J. Wysocki [Mon, 14 Jul 2025 17:45:31 +0000 (19:45 +0200)]
PM: sleep: Update power.completion for all devices on errors
After commit aa7a9275ab81 ("PM: sleep: Suspend async parents after
suspending children"), the following scenario is possible:
1. Device A is async and it depends on device B that is sync.
2. Async suspend is scheduled for A before the processing of B is
started.
3. A is waiting for B.
4. In the meantime, an unrelated device fails to suspend and returns
an error.
5. The processing of B doesn't start at all and its power.completion is
not updated.
6. A is still waiting for B when async_synchronize_full() is called.
7. Deadlock ensues.
To prevent this from happening, update power.completion for all devices
on errors in all suspend phases, but do not do it directly for devices
that are already being processed or are waiting for the processing to
start because in those cases it may be necessary to wait for the
processing to actually complete before updating power.completion for
the device.
Fixes: aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children")
Fixes: 443046d1ad66 ("PM: sleep: Make suspend of devices more asynchronous")
Closes: https://lore.kernel.org/linux-pm/e13740a0-88f3-4a6f-920f-15805071a7d6@linaro.org/
Reported-and-tested-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/6191258.lOV4Wx5bFT@rjwysocki.net
Zihuan Zhang [Sat, 12 Jul 2025 03:08:24 +0000 (11:08 +0800)]
PM: suspend: clean up redundant filesystems_freeze/thaw() handling
The recently introduced support for freezing filesystems during system
suspend included calls to filesystems_freeze() in both suspend_prepare()
and enter_state(), as well as calls to filesystems_thaw() in both
suspend_finish() and the Unlock path in enter_state(). These are
redundant.
Moreover, calling filesystems_freeze() twice, from both suspend_prepare()
and enter_state(), leads to a black screen and makes the system unable
to resume in some cases.
Address this as follows:
- filesystems_freeze() is already called in suspend_prepare(), which
is the proper and consistent place to handle pre-suspend operations.
The second call in enter_state() is unnecessary and so remove it.
- filesystems_thaw() is invoked in suspend_finish(), which covers
successful suspend/resume paths. In the failure case, add a call
to filesystems_thaw() only when needed, avoiding the duplicate call
in the general Unlock path.
This change simplifies the suspend code and avoids repeated freeze/thaw
calls, while preserving correct ordering and behavior.
Fixes: eacfbf74196f ("power: freeze filesystems during suspend/resume")
Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Link: https://patch.msgid.link/20250712030824.81474-1-zhangzihuan@kylinos.cn
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Thu, 10 Jul 2025 08:43:26 +0000 (10:43 +0200)]
PM: suspend: Drop a misplaced pm_restore_gfp_mask() call
The pm_restore_gfp_mask() call added by commit 12ffc3b1513e ("PM:
Restrict swap use to later in the suspend sequence") to
suspend_devices_and_enter() is done too early because it takes
place before calling dpm_resume() in dpm_resume_end() and some
swap-backing devices may not be ready at that point. Moreover,
dpm_resume_end() called subsequently in the same code path invokes
pm_restore_gfp_mask() again and calling it twice in a row is
pointless.
Drop the misplaced pm_restore_gfp_mask() call from
suspend_devices_and_enter() to address this issue.
Fixes: 12ffc3b1513e ("PM: Restrict swap use to later in the suspend sequence")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/2810409.mvXUDI8C0e@rjwysocki.net
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Amit Pundir [Wed, 9 Jul 2025 17:49:49 +0000 (23:19 +0530)]
soundwire: Revert "soundwire: qcom: Add set_channel_map api support"
This reverts commit 7796c97df6b1b2206681a07f3c80f6023a6593d5.
This patch broke Dragonboard 845c (sdm845). I see:
Unexpected kernel BRK exception at EL1
Internal error: BRK handler: 00000000f20003e8 [#1] SMP
pc : qcom_swrm_set_channel_map+0x7c/0x80 [soundwire_qcom]
lr : snd_soc_dai_set_channel_map+0x34/0x78
Call trace:
qcom_swrm_set_channel_map+0x7c/0x80 [soundwire_qcom] (P)
sdm845_dai_init+0x18c/0x2e0 [snd_soc_sdm845]
snd_soc_link_init+0x28/0x6c
snd_soc_bind_card+0x5f4/0xb0c
snd_soc_register_card+0x148/0x1a4
devm_snd_soc_register_card+0x50/0xb0
sdm845_snd_platform_probe+0x124/0x148 [snd_soc_sdm845]
platform_probe+0x6c/0xd0
really_probe+0xc0/0x2a4
__driver_probe_device+0x7c/0x130
driver_probe_device+0x40/0x118
__device_attach_driver+0xc4/0x108
bus_for_each_drv+0x8c/0xf0
__device_attach+0xa4/0x198
device_initial_probe+0x18/0x28
bus_probe_device+0xb8/0xbc
deferred_probe_work_func+0xac/0xfc
process_one_work+0x244/0x658
worker_thread+0x1b4/0x360
kthread+0x148/0x228
ret_from_fork+0x10/0x20
Kernel panic - not syncing: BRK handler: Fatal exception
Dan has also reported the following issues with the original patch:
https://lore.kernel.org/all/33fe8fe7-719a-405a-9ed2-d9f816ce1d57@sabinyo.mountain/
Bug #1:
The zeroth element of ctrl->pconfig[] is supposed to be unused. We
start counting at 1. However this code sets ctrl->pconfig[0].ch_mask = 128.
Bug #2:
There are SLIM_MAX_TX_PORTS (16) elements in tx_ch[] array but only
QCOM_SDW_MAX_PORTS + 1 (15) in the ctrl->pconfig[] array so it corrupts
memory like Yongqin Liu pointed out.
Bug #3:
Like Jie Gan pointed out, it erases all the tx information with the rx
information.
Cc: stable@vger.kernel.org # v6.15+
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Acked-by: Srinivas Kandagatla <srini@kernel.org>
Link: https://lore.kernel.org/r/20250709174949.8541-1-amit.pundir@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Adam Queler [Tue, 15 Jul 2025 03:14:24 +0000 (23:14 -0400)]
ASoC: amd: yc: Add DMI entries to support HP 15-fb1xxx
This model requires an additional detection quirk to
enable the internal microphone.
Signed-off-by: Adam Queler <queler+k@gmail.com>
Link: https://patch.msgid.link/20250715031434.222062-1-queler+k@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Greg Kroah-Hartman [Tue, 15 Jul 2025 11:13:53 +0000 (13:13 +0200)]
Merge tag 'usb-serial-6.16-rc7' of ssh://gitolite./linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB serial device ids for 6.16-rc7
Here are some more device ids for 6.16-rc7.
All have been in linux-next with no reported issues.
* tag 'usb-serial-6.16-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: option: add Telit Cinterion FE910C04 (ECM) composition
USB: serial: ftdi_sio: add support for NDI EMGUIDE GEMINI
Johannes Berg [Tue, 15 Jul 2025 11:07:24 +0000 (13:07 +0200)]
Merge tag 'iwlwifi-fixes-2025-07-15' of https://git./linux/kernel/git/iwlwifi/iwlwifi-next
Miri Korenblit says:
====================
iwlwifi-fixes
- missing unlock in error path
- Avoid FW assert on bad command values
- fix kernel panic due to incorrect index calculation
====================
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Brett Werling [Fri, 11 Jul 2025 14:17:28 +0000 (09:17 -0500)]
can: tcan4x5x: fix reset gpio usage during probe
Fixes reset GPIO usage during probe by ensuring we retrieve the GPIO and
take the device out of reset (if it defaults to being in reset) before
we attempt to communicate with the device. This is achieved by moving
the call to tcan4x5x_get_gpios() before tcan4x5x_find_version() and
avoiding any device communication while getting the GPIOs. Once we
determine the version, we can then take the knowledge of which GPIOs we
obtained and use it to decide whether we need to disable the wake or
state pin functions within the device.
This change is necessary in a situation where the reset GPIO is pulled
high externally before the CPU takes control of it, meaning we need to
explicitly bring the device out of reset before we can start
communicating with it at all.
This also has the effect of fixing an issue where a reset of the device
would occur after having called tcan4x5x_disable_wake(), making the
original behavior not actually disable the wake. This patch should now
disable wake or state pin functions well after the reset occurs.
Signed-off-by: Brett Werling <brett.werling@garmin.com>
Link: https://patch.msgid.link/20250711141728.1826073-1-brett.werling@garmin.com
Cc: Markus Schneider-Pargmann <msp@baylibre.com>
Fixes:
142c6dc6d9d7 ("can: tcan4x5x: Add support for tcan4552/4553")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Bartosz Golaszewski [Tue, 15 Jul 2025 09:40:25 +0000 (11:40 +0200)]
Merge tag 'intel-gpio-v6.16-2' of git://git./linux/kernel/git/andy/linux-gpio-intel into gpio/for-current
intel-gpio for v6.16-2
* Add a quirk for Acer Nitro V15 against wakeup capability
The following is an automated git shortlog grouped by driver:
gpiolib:
- acpi: Add a quirk for Acer Nitro V15
Ville Syrjälä [Fri, 11 Jul 2025 20:57:44 +0000 (23:57 +0300)]
wifi: iwlwifi: Fix botched indexing conversion
The conversion from compiler assisted indexing to manual
indexing wasn't done correctly. The array is still made
up of __le16 elements so multiplying the outer index by
the element size is not what we want. Fix it up.
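The class of mistake described here can be sketched in a few lines of C. The names below (ENTRIES_PER_QUEUE, entry_index(), botched_entry_index(), set_entry()) are hypothetical and the table dimensions are made up; the point is only that indexing an array of 16-bit elements already scales by the element size, so multiplying the outer index by sizeof() again lands on the wrong element.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row width of a flat table made of 16-bit entries. */
#define ENTRIES_PER_QUEUE 8

/* Correct manual index: counts elements, not bytes. */
size_t entry_index(size_t queue, size_t idx)
{
	return queue * ENTRIES_PER_QUEUE + idx;
}

/*
 * Botched variant: the outer index is also multiplied by the element
 * size, so for any queue > 0 the result is too large and can run past
 * the end of the table.
 */
size_t botched_entry_index(size_t queue, size_t idx)
{
	return queue * ENTRIES_PER_QUEUE * sizeof(uint16_t) + idx;
}

/* table[] is indexed in elements; the compiler scales by sizeof(uint16_t). */
void set_entry(uint16_t *table, size_t queue, size_t idx, uint16_t val)
{
	table[entry_index(queue, idx)] = val;
}
```

For queue 2 and index 3 the correct element index is 19, while the botched variant yields 35, which is already outside a 4-queue table of 32 entries.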
This causes the kernel to oops when trying to transfer any
significant amount of data over wifi:
BUG: unable to handle page fault for address: ffffc900009f5282
PGD 100000067 P4D 100000067 PUD 1000fb067 PMD 102e82067 PTE 0
Oops: Oops: 0002 [#1] SMP
CPU: 1 UID: 0 PID: 99 Comm: kworker/u8:3 Not tainted 6.15.0-rc2-cl-bisect3-00604-g6204d5130a64-dirty #78 PREEMPT
Hardware name: Dell Inc. Latitude E5400 /0D695C, BIOS A19 06/13/2013
Workqueue: events_unbound cfg80211_wiphy_work [cfg80211]
RIP: 0010:iwl_trans_pcie_tx+0x4dd/0xe60 [iwlwifi]
Code: 00 00 66 81 fa ff 0f 0f 87 42 09 00 00 3d ff 00 00 00 0f 8f 37 09 00 00 41 c1 e0 0c 41 09 d0 48 8d 14 b6 48 c1 e2 07 48 01 ca <66> 44 89 04 57 48 8d 0c 12 83 f8 3f 0f 8e 84 01 00 00 41 8b 85 80
RSP: 0018:ffffc900001c3b50 EFLAGS: 00010206
RAX: 00000000000000c1 RBX: ffff88810b180028 RCX: 00000000000000c1
RDX: 0000000000002141 RSI: 000000000000000d RDI: ffffc900009f1000
RBP: 0000000000000002 R08: 0000000000000025 R09: ffffffffa050fa60
R10: 00000000fbdbf4bc R11: 0000000000000082 R12: ffff88810e5ade40
R13: ffff88810af81588 R14: 000000000000001a R15: ffff888100dfe0c8
FS: 0000000000000000(0000) GS:ffff8881998c3000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc900009f5282 CR3: 0000000001e39000 CR4: 00000000000426f0
Call Trace:
<TASK>
? rcu_is_watching+0xd/0x40
? __iwl_dbg+0xb1/0xe0 [iwlwifi]
iwlagn_tx_skb+0x8e2/0xcb0 [iwldvm]
iwlagn_mac_tx+0x18/0x30 [iwldvm]
ieee80211_handle_wake_tx_queue+0x6c/0xc0 [mac80211]
ieee80211_agg_start_txq+0x140/0x2e0 [mac80211]
ieee80211_agg_tx_operational+0x126/0x210 [mac80211]
ieee80211_process_addba_resp+0x27b/0x2a0 [mac80211]
ieee80211_iface_work+0x4bd/0x4d0 [mac80211]
? _raw_spin_unlock_irq+0x1f/0x40
cfg80211_wiphy_work+0x117/0x1f0 [cfg80211]
process_one_work+0x1ee/0x570
worker_thread+0x1c5/0x3b0
? bh_worker+0x240/0x240
kthread+0x110/0x220
? kthread_queue_delayed_work+0x90/0x90
ret_from_fork+0x28/0x40
? kthread_queue_delayed_work+0x90/0x90
ret_from_fork_asm+0x11/0x20
</TASK>
Modules linked in: ctr aes_generic ccm sch_fq_codel bnep xt_tcpudp xt_multiport xt_state iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 ip_tables x_tables btusb btrtl btintel btbcm bluetooth ecdh_generic ecc libaes hid_generic usbhid hid binfmt_misc joydev mousedev snd_hda_codec_hdmi iwldvm snd_hda_codec_idt snd_hda_codec_generic mac80211 coretemp iTCO_wdt watchdog kvm_intel i2c_dev snd_hda_intel libarc4 kvm snd_intel_dspcfg sdhci_pci sdhci_uhs2 snd_hda_codec iwlwifi sdhci irqbypass cqhci snd_hwdep snd_hda_core cfg80211 firewire_ohci mmc_core psmouse snd_pcm i2c_i801 firewire_core pcspkr led_class uhci_hcd i2c_smbus tg3 crc_itu_t iosf_mbi snd_timer rfkill libphy ehci_pci snd ehci_hcd lpc_ich mfd_core usbcore video intel_agp usb_common soundcore intel_gtt evdev agpgart parport_pc wmi parport backlight
CR2: ffffc900009f5282
---[ end trace 0000000000000000 ]---
RIP: 0010:iwl_trans_pcie_tx+0x4dd/0xe60 [iwlwifi]
Code: 00 00 66 81 fa ff 0f 0f 87 42 09 00 00 3d ff 00 00 00 0f 8f 37 09 00 00 41 c1 e0 0c 41 09 d0 48 8d 14 b6 48 c1 e2 07 48 01 ca <66> 44 89 04 57 48 8d 0c 12 83 f8 3f 0f 8e 84 01 00 00 41 8b 85 80
RSP: 0018:ffffc900001c3b50 EFLAGS: 00010206
RAX: 00000000000000c1 RBX: ffff88810b180028 RCX: 00000000000000c1
RDX: 0000000000002141 RSI: 000000000000000d RDI: ffffc900009f1000
RBP: 0000000000000002 R08: 0000000000000025 R09: ffffffffa050fa60
R10: 00000000fbdbf4bc R11: 0000000000000082 R12: ffff88810e5ade40
R13: ffff88810af81588 R14: 000000000000001a R15: ffff888100dfe0c8
FS: 0000000000000000(0000) GS:ffff8881998c3000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc900009f5282 CR3: 0000000001e39000 CR4: 00000000000426f0
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
Cc: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Fixes: 6204d5130a64 ("wifi: iwlwifi: use bc entries instead of bc table also for pre-ax210")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20250711205744.28723-1-ville.syrjala@linux.intel.com
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Johannes Berg [Mon, 14 Jul 2025 12:21:30 +0000 (14:21 +0200)]
wifi: cfg80211: remove scan request n_channels counted_by
This reverts commit e3eac9f32ec0 ("wifi: cfg80211: Annotate struct
cfg80211_scan_request with __counted_by").
This really has been a completely failed experiment. There were
no actual bugs found, and yet at this point we already have four
"fixes" to it, with nothing to show for but code churn, and it
never even made the code any safer.
In all of the cases that ended up getting "fixed", the structure
is also internally inconsistent after the n_channels setting as
the channel list isn't actually filled yet. You cannot scan with
such a structure, that's just wrong. In mac80211, the struct is
also reused multiple times, so initializing it once is no good.
Some previous "fixes" (e.g. one in brcm80211) are also just setting
n_channels before accessing the array, under the assumption that the
code is correct and the array can be accessed, further showing that
the whole thing is just pointless when the allocation count and use
count are not separate.
If we really wanted to fix it, we'd need to separately track the
number of channels allocated and the number of channels currently
used, but given that no bugs were found despite the numerous syzbot
reports, that'd just be a waste of time.
Remove the __counted_by() annotation. We really should also remove
a number of the n_channels settings that are setting up a structure
that's inconsistent, but that can wait.
Reported-by: syzbot+e834e757bd9b3d3e1251@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e834e757bd9b3d3e1251
Fixes: e3eac9f32ec0 ("wifi: cfg80211: Annotate struct cfg80211_scan_request with __counted_by")
Link: https://patch.msgid.link/20250714142130.9b0bbb7e1f07.I09112ccde72d445e11348fc2bef68942cb2ffc94@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>