Linus Torvalds [Fri, 11 Oct 2024 18:41:20 +0000 (11:41 -0700)]
Merge tag 'pm-6.12-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These address two issues in the TPMI module of the Intel RAPL power
capping driver and one issue in the processor part of the Intel
int340x thermal driver, update a CPU ID list and register definitions
needed for RAPL PL4 support and remove some unused code.
Specifics:
- Fix the TPMI_RAPL_REG_DOMAIN_INFO register offset in the TPMI part
of the Intel RAPL power capping driver, make it ignore minor
hardware version mismatches (which only indicate exposing
additional features) and update register definitions in it to
enable PL4 support (Zhang Rui)
- Add Arrow Lake-U to the list of processors supporting PL4 in the
MSR part of the Intel RAPL power capping driver (Sumeet Pawnikar)
- Remove excess pci_disable_device() calls from the processor part of
the int340x thermal driver to address a warning triggered during
module unload and remove unused CPU hotplug code related to RAPL
support from it (Zhang Rui)"
* tag 'pm-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: intel: int340x: processor: Add MMIO RAPL PL4 support
thermal: intel: int340x: processor: Remove MMIO RAPL CPU hotplug support
powercap: intel_rapl_msr: Add PL4 support for Arrowlake-U
powercap: intel_rapl_tpmi: Ignore minor version change
thermal: intel: int340x: processor: Fix warning during module unload
powercap: intel_rapl_tpmi: Fix bogus register reading
Linus Torvalds [Fri, 11 Oct 2024 18:35:30 +0000 (11:35 -0700)]
Merge tag 'thermal-6.12-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"Address possible use-after-free scenarios during the processing of
thermal netlink commands and during thermal zone removal (Rafael
Wysocki)"
* tag 'thermal-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: core: Free tzp copy along with the thermal zone
thermal: core: Reference count the zone in thermal_zone_get_by_id()
Linus Torvalds [Fri, 11 Oct 2024 18:32:10 +0000 (11:32 -0700)]
Merge tag 'acpi-6.12-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"Reduce the number of ACPI IRQ override DMI quirks by combining quirks
that cover similar systems while making them cover additional models
at the same time (Hans de Goede)"
* tag 'acpi-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: resource: Fold Asus Vivobook Pro N6506M* DMI quirks together
ACPI: resource: Fold Asus ExpertBook B1402C* and B1502C* DMI quirks together
ACPI: resource: Make Asus ExpertBook B2502 matches cover more models
ACPI: resource: Make Asus ExpertBook B2402 matches cover more models
Linus Torvalds [Fri, 11 Oct 2024 18:26:15 +0000 (11:26 -0700)]
Merge tag 'pmdomain-v6.12-rc1' of git://git./linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
"pmdomain core:
- Fix alloc/free in dev_pm_domain_attach|detach_list()
pmdomain providers:
- qcom: Fix the return of uninitialized variable
pmdomain consumers:
- drm/tegra/gr3d: Revert conversion to dev_pm_domain_attach|detach_list()
OPP core:
- Fix error code in dev_pm_opp_set_config()"
* tag 'pmdomain-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
PM: domains: Fix alloc/free in dev_pm_domain_attach|detach_list()
Revert "drm/tegra: gr3d: Convert into dev_pm_domain_attach|detach_list()"
pmdomain: qcom-cpr: Fix the return of uninitialized variable
OPP: fix error code in dev_pm_opp_set_config()
Linus Torvalds [Fri, 11 Oct 2024 18:23:21 +0000 (11:23 -0700)]
Merge tag 'mmc-v6.12-rc1' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Prevent splat from warning when setting maximum DMA segment
MMC host:
- mvsdio: Drop sg_miter support for PIO as it didn't work
- sdhci-of-dwcmshc: Prevent stale interrupt for the T-Head 1520
variant"
* tag 'mmc-v6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-of-dwcmshc: Prevent stale command interrupt handling
Revert "mmc: mvsdio: Use sg_miter for PIO"
mmc: core: Only set maximum DMA segment size if DMA is supported
Linus Torvalds [Fri, 11 Oct 2024 18:18:31 +0000 (11:18 -0700)]
Merge tag 'ata-6.12-rc3' of git://git./linux/kernel/git/libata/linux
Pull ata fixes from Niklas Cassel:
- Fix a hibernate regression where the disk was needlessly spun down
and then immediately spun up both when entering and when resuming
from hibernation (me)
- Update the MAINTAINERS file to remove remnants from Jens
maintainership of libata (Damien)
* tag 'ata-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata: Update MAINTAINERS file
ata: libata: avoid superfluous disk spin down + spin up during hibernation
Linus Torvalds [Fri, 11 Oct 2024 18:13:05 +0000 (11:13 -0700)]
Merge tag 'drm-fixes-2024-10-11' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly fixes haul for drm, lots of small fixes all over, amdgpu, xe
lead the way, some minor nouveau and radeon fixes, and then a bunch of
misc all over.
Nothing too scary or out of the unusual.
sched:
- Avoid leaking lockdep map
fbdev-dma:
- Only clean up deferred I/O if instanciated
amdgpu:
- Fix invalid UBSAN warnings
- Fix artifacts in MPO transitions
- Hibernation fix
amdkfd:
- Fix an eviction fence leak
radeon:
- Add late register for connectors
- Always set GEM function pointers
i915:
- HDCP refcount fix
nouveau:
- dmem: Fix privileged error in copy engine channel; Fix possible
data leak in migrate_to_ram()
- gsp: Fix coding style
v3d:
- Stop active perfmon before destroying it
vc4:
- Stop active perfmon before destroying it
xe:
- Drop GuC submit_wq pool
- Fix error checking with xa_store()
- Fix missing freq restore on GSC load error
- Fix wedged_mode file permission
- Fix use-after-free in ct communication"
* tag 'drm-fixes-2024-10-11' of https://gitlab.freedesktop.org/drm/kernel:
drm/fbdev-dma: Only cleanup deferred I/O if necessary
drm/xe: Make wedged_mode debugfs writable
drm/xe: Restore GT freq on GSC load error
drm/xe/guc_submit: fix xa_store() error checking
drm/xe/ct: fix xa_store() error checking
drm/xe/ct: prevent UAF in send_recv()
drm/radeon: always set GEM function pointer
nouveau/dmem: Fix vulnerability in migrate_to_ram upon copy error
nouveau/dmem: Fix privileged error in copy engine channel
drm/amd/display: fix hibernate entry for DCN35+
drm/amd/display: Clear update flags after update has been applied
drm/amdgpu: partially revert powerplay `__counted_by` changes
drm/radeon: add late_register for connector
drm/amdkfd: Fix an eviction fence leak
drm/vc4: Stop the active perfmon before being destroyed
drm/v3d: Stop the active perfmon before being destroyed
drm/i915/hdcp: fix connector refcounting
drm/nouveau/gsp: remove extraneous ; after mutex
drm/xe: Drop GuC submit_wq pool
drm/sched: Use drm sched lockdep map for submit_wq
Dave Airlie [Fri, 11 Oct 2024 03:54:05 +0000 (13:54 +1000)]
Merge tag 'drm-xe-fixes-2024-10-10' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix error checking with xa_store() (Matthe Auld)
- Fix missing freq restore on GSC load error (Vinay)
- Fix wedged_mode file permission (Matt Roper)
- Fix use-after-free in ct communication (Matthew Auld)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/jri65tmv3bjbhqhxs5smv45nazssxzhtwphojem4uufwtjuliy@gsdhlh6kzsdy
Dave Airlie [Thu, 10 Oct 2024 23:03:20 +0000 (09:03 +1000)]
Merge tag 'drm-misc-fixes-2024-10-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
fbdev-dma:
- Only clean up deferred I/O if instanciated
nouveau:
- dmem: Fix privileged error in copy engine channel; Fix possible
data leak in migrate_to_ram()
- gsp: Fix coding style
sched:
- Avoid leaking lockdep map
v3d:
- Stop active perfmon before destroying it
vc4:
- Stop active perfmon before destroying it
xe:
- Drop GuC submit_wq pool
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20241010133708.GA461532@localhost.localdomain
Dave Airlie [Thu, 10 Oct 2024 22:55:26 +0000 (08:55 +1000)]
Merge tag 'drm-intel-fixes-2024-10-10' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- HDCP refcount fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zwd78Tnw8t3w9F16@jlahtine-mobl.ger.corp.intel.com
Linus Torvalds [Thu, 10 Oct 2024 19:36:35 +0000 (12:36 -0700)]
Merge tag 'net-6.12-rc3' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth and netfilter.
Current release - regressions:
- dsa: sja1105: fix reception from VLAN-unaware bridges
- Revert "net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is
enabled"
- eth: fec: don't save PTP state if PTP is unsupported
Current release - new code bugs:
- smc: fix lack of icsk_syn_mss with IPPROTO_SMC, prevent null-deref
- eth: airoha: update Tx CPU DMA ring idx at the end of xmit loop
- phy: aquantia: AQR115c fix up PMA capabilities
Previous releases - regressions:
- tcp: 3 fixes for retrans_stamp and undo logic
Previous releases - always broken:
- net: do not delay dst_entries_add() in dst_release()
- netfilter: restrict xtables extensions to families that are safe,
syzbot found a way to combine ebtables with extensions that are
never used by userspace tools
- sctp: ensure sk_state is set to CLOSED if hashing fails in
sctp_listen_start
- mptcp: handle consistently DSS corruption, and prevent corruption
due to large pmtu xmit"
* tag 'net-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
MAINTAINERS: Add headers and mailing list to UDP section
MAINTAINERS: consistently exclude wireless files from NETWORKING [GENERAL]
slip: make slhc_remember() more robust against malicious packets
net/smc: fix lacks of icsk_syn_mss with IPPROTO_SMC
ppp: fix ppp_async_encode() illegal access
docs: netdev: document guidance on cleanup patches
phonet: Handle error of rtnl_register_module().
mpls: Handle error of rtnl_register_module().
mctp: Handle error of rtnl_register_module().
bridge: Handle error of rtnl_register_module().
vxlan: Handle error of rtnl_register_module().
rtnetlink: Add bulk registration helpers for rtnetlink message handlers.
net: do not delay dst_entries_add() in dst_release()
mptcp: pm: do not remove closing subflows
mptcp: fallback when MPTCP opts are dropped after 1st data
tcp: fix mptcp DSS corruption due to large pmtu xmit
mptcp: handle consistently DSS corruption
net: netconsole: fix wrong warning
net: dsa: refuse cross-chip mirroring operations
net: fec: don't save PTP state if PTP is unsupported
...
Linus Torvalds [Thu, 10 Oct 2024 19:25:32 +0000 (12:25 -0700)]
Merge tag 'trace-ringbuffer-v6.12-rc2' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
"Ring-buffer fix: do not have boot-mapped buffers use CPU hotplug
callbacks
When a ring buffer is mapped to memory assigned at boot, it also
splits it up evenly between the possible CPUs. But the allocation code
still attached a CPU notifier callback to this ring buffer. When a CPU
is added, the callback will happen and another per-cpu buffer is
created for the ring buffer.
But for boot mapped buffers, there is no room to add another one (as
they were all created already). The result of calling the CPU hotplug
notifier on a boot mapped ring buffer is unpredictable and could lead
to a system crash.
If the ring buffer is boot mapped simply do not attach the CPU
notifier to it"
* tag 'trace-ringbuffer-v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Do not have boot mapped buffers hook to CPU hotplug
Linus Torvalds [Thu, 10 Oct 2024 17:02:59 +0000 (10:02 -0700)]
Merge tag 'for-6.12-rc2-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- update fstrim loop and add more cancellation points, fix reported
delayed or blocked suspend if there's a huge chunk queued
- fix error handling in recent qgroup xarray conversion
- in zoned mode, fix warning printing device path without RCU
protection
- again fix invalid extent xarray state (
6252690f7e1b), lost due to
refactoring
* tag 'for-6.12-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix clear_dirty and writeback ordering in submit_one_sector()
btrfs: zoned: fix missing RCU locking in error message when loading zone info
btrfs: fix missing error handling when adding delayed ref with qgroups enabled
btrfs: add cancellation points to trim loops
btrfs: split remaining space to discard in chunks
Linus Torvalds [Thu, 10 Oct 2024 16:52:49 +0000 (09:52 -0700)]
Merge tag 'nfsd-6.12-1' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Fix NFSD bring-up / shutdown
- Fix a UAF when releasing a stateid
* tag 'nfsd-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: fix possible badness in FREE_STATEID
nfsd: nfsd_destroy_serv() must call svc_destroy() even if nfsd_startup_net() failed
NFSD: Mark filecache "down" if init fails
Linus Torvalds [Thu, 10 Oct 2024 16:45:45 +0000 (09:45 -0700)]
Merge tag 'xfs-6.12-fixes-3' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Carlos Maiolino:
- A few small typo fixes
- fstests xfs/538 DEBUG-only fix
- Performance fix on blockgc on COW'ed files, by skipping trims on
cowblock inodes currently opened for write
- Prevent cowblocks to be freed under dirty pagecache during unshare
- Update MAINTAINERS file to quote the new maintainer
* tag 'xfs-6.12-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix a typo
xfs: don't free cowblocks from under dirty pagecache on unshare
xfs: skip background cowblock trims on inodes open for write
xfs: support lowmode allocations in xfs_bmap_exact_minlen_extent_alloc
xfs: call xfs_bmap_exact_minlen_extent_alloc from xfs_bmap_btalloc
xfs: don't ifdef around the exact minlen allocations
xfs: fold xfs_bmap_alloc_userdata into xfs_bmapi_allocate
xfs: distinguish extra split from real ENOSPC from xfs_attr_node_try_addname
xfs: distinguish extra split from real ENOSPC from xfs_attr3_leaf_split
xfs: return bool from xfs_attr3_leaf_add
xfs: merge xfs_attr_leaf_try_add into xfs_attr_leaf_addname
xfs: Use try_cmpxchg() in xlog_cil_insert_pcp_aggregate()
xfs: scrub: convert comma to semicolon
xfs: Remove empty declartion in header file
MAINTAINERS: add Carlos Maiolino as XFS release manager
Jakub Kicinski [Thu, 10 Oct 2024 16:35:50 +0000 (09:35 -0700)]
Merge branch 'maintainers-networking-file-coverage-updates'
Simon Horman says:
====================
MAINTAINERS: Networking file coverage updates
The aim of this proposal is to make the handling of some files,
related to Networking and Wireless, more consistently. It does so by:
1. Adding some more headers to the UDP section, making it consistent
with the TCP section.
2. Excluding some files relating to Wireless from NETWORKING [GENERAL],
making their handling consistent with other files related to
Wireless.
The aim of this is to make things more consistent. And for MAINTAINERS
to better reflect the situation on the ground. I am more than happy to
be told that the current state of affairs is fine. Or for other ideas to
be discussed.
v1: https://lore.kernel.org/
20241004-maint-net-hdrs-v1-0-
41fd555aacc5@kernel.org
====================
Link: https://patch.msgid.link/20241009-maint-net-hdrs-v2-0-f2c86e7309c8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Horman [Wed, 9 Oct 2024 08:47:23 +0000 (09:47 +0100)]
MAINTAINERS: Add headers and mailing list to UDP section
Add netdev mailing list and some more udp.h headers to the UDP section.
This is now more consistent with the TCP section.
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241009-maint-net-hdrs-v2-2-f2c86e7309c8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Horman [Wed, 9 Oct 2024 08:47:22 +0000 (09:47 +0100)]
MAINTAINERS: consistently exclude wireless files from NETWORKING [GENERAL]
We already exclude wireless drivers from the netdev@ traffic, to
delegate it to linux-wireless@, and avoid overwhelming netdev@.
Many of the following wireless-related sections MAINTAINERS
are already not included in the NETWORKING [GENERAL] section.
For consistency, exclude those that are.
* 802.11 (including CFG80211/NL80211)
* MAC80211
* RFKILL
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241009-maint-net-hdrs-v2-1-f2c86e7309c8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 9 Oct 2024 09:11:32 +0000 (09:11 +0000)]
slip: make slhc_remember() more robust against malicious packets
syzbot found that slhc_remember() was missing checks against
malicious packets [1].
slhc_remember() only checked the size of the packet was at least 20,
which is not good enough.
We need to make sure the packet includes the IPv4 and TCP header
that are supposed to be carried.
Add iph and th pointers to make the code more readable.
[1]
BUG: KMSAN: uninit-value in slhc_remember+0x2e8/0x7b0 drivers/net/slip/slhc.c:666
slhc_remember+0x2e8/0x7b0 drivers/net/slip/slhc.c:666
ppp_receive_nonmp_frame+0xe45/0x35e0 drivers/net/ppp/ppp_generic.c:2455
ppp_receive_frame drivers/net/ppp/ppp_generic.c:2372 [inline]
ppp_do_recv+0x65f/0x40d0 drivers/net/ppp/ppp_generic.c:2212
ppp_input+0x7dc/0xe60 drivers/net/ppp/ppp_generic.c:2327
pppoe_rcv_core+0x1d3/0x720 drivers/net/ppp/pppoe.c:379
sk_backlog_rcv+0x13b/0x420 include/net/sock.h:1113
__release_sock+0x1da/0x330 net/core/sock.c:3072
release_sock+0x6b/0x250 net/core/sock.c:3626
pppoe_sendmsg+0x2b8/0xb90 drivers/net/ppp/pppoe.c:903
sock_sendmsg_nosec net/socket.c:729 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:744
____sys_sendmsg+0x903/0xb60 net/socket.c:2602
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656
__sys_sendmmsg+0x3c1/0x960 net/socket.c:2742
__do_sys_sendmmsg net/socket.c:2771 [inline]
__se_sys_sendmmsg net/socket.c:2768 [inline]
__x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768
x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Uninit was created at:
slab_post_alloc_hook mm/slub.c:4091 [inline]
slab_alloc_node mm/slub.c:4134 [inline]
kmem_cache_alloc_node_noprof+0x6bf/0xb80 mm/slub.c:4186
kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:587
__alloc_skb+0x363/0x7b0 net/core/skbuff.c:678
alloc_skb include/linux/skbuff.h:1322 [inline]
sock_wmalloc+0xfe/0x1a0 net/core/sock.c:2732
pppoe_sendmsg+0x3a7/0xb90 drivers/net/ppp/pppoe.c:867
sock_sendmsg_nosec net/socket.c:729 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:744
____sys_sendmsg+0x903/0xb60 net/socket.c:2602
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656
__sys_sendmmsg+0x3c1/0x960 net/socket.c:2742
__do_sys_sendmmsg net/socket.c:2771 [inline]
__se_sys_sendmmsg net/socket.c:2768 [inline]
__x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768
x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
CPU: 0 UID: 0 PID: 5460 Comm: syz.2.33 Not tainted
6.12.0-rc2-syzkaller-00006-g87d6aab2389e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Fixes:
b5451d783ade ("slip: Move the SLIP drivers")
Reported-by: syzbot+2ada1bc857496353be5a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/
670646db.
050a0220.3f80e.0027.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241009091132.2136321-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
D. Wythe [Wed, 9 Oct 2024 06:55:16 +0000 (14:55 +0800)]
net/smc: fix lacks of icsk_syn_mss with IPPROTO_SMC
Eric report a panic on IPPROTO_SMC, and give the facts
that when INET_PROTOSW_ICSK was set, icsk->icsk_sync_mss must be set too.
Bug: Unable to handle kernel NULL pointer dereference at virtual address
0000000000000000
Mem abort info:
ESR = 0x0000000086000005
EC = 0x21: IABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
user pgtable: 4k pages, 48-bit VAs, pgdp=
00000001195d1000
[
0000000000000000] pgd=
0800000109c46003, p4d=
0800000109c46003,
pud=
0000000000000000
Internal error: Oops:
0000000086000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 UID: 0 PID: 8037 Comm: syz.3.265 Not tainted
6.11.0-rc7-syzkaller-g5f5673607153 #0
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 08/06/2024
pstate:
80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : 0x0
lr : cipso_v4_sock_setattr+0x2a8/0x3c0 net/ipv4/cipso_ipv4.c:1910
sp :
ffff80009b887a90
x29:
ffff80009b887aa0 x28:
ffff80008db94050 x27:
0000000000000000
x26:
1fffe0001aa6f5b3 x25:
dfff800000000000 x24:
ffff0000db75da00
x23:
0000000000000000 x22:
ffff0000d8b78518 x21:
0000000000000000
x20:
ffff0000d537ad80 x19:
ffff0000d8b78000 x18:
1fffe000366d79ee
x17:
ffff8000800614a8 x16:
ffff800080569b84 x15:
0000000000000001
x14:
000000008b336894 x13:
00000000cd96feaa x12:
0000000000000003
x11:
0000000000040000 x10:
00000000000020a3 x9 :
1fffe0001b16f0f1
x8 :
0000000000000000 x7 :
0000000000000000 x6 :
000000000000003f
x5 :
0000000000000040 x4 :
0000000000000001 x3 :
0000000000000000
x2 :
0000000000000002 x1 :
0000000000000000 x0 :
ffff0000d8b78000
Call trace:
0x0
netlbl_sock_setattr+0x2e4/0x338 net/netlabel/netlabel_kapi.c:1000
smack_netlbl_add+0xa4/0x154 security/smack/smack_lsm.c:2593
smack_socket_post_create+0xa8/0x14c security/smack/smack_lsm.c:2973
security_socket_post_create+0x94/0xd4 security/security.c:4425
__sock_create+0x4c8/0x884 net/socket.c:1587
sock_create net/socket.c:1622 [inline]
__sys_socket_create net/socket.c:1659 [inline]
__sys_socket+0x134/0x340 net/socket.c:1706
__do_sys_socket net/socket.c:1720 [inline]
__se_sys_socket net/socket.c:1718 [inline]
__arm64_sys_socket+0x7c/0x94 net/socket.c:1718
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
Code: ???????? ???????? ???????? ???????? (????????)
---[ end trace
0000000000000000 ]---
This patch add a toy implementation that performs a simple return to
prevent such panic. This is because MSS can be set in sock_create_kern
or smc_setsockopt, similar to how it's done in AF_SMC. However, for
AF_SMC, there is currently no way to synchronize MSS within
__sys_connect_file. This toy implementation lays the groundwork for us
to support such feature for IPPROTO_SMC in the future.
Fixes:
d25a92ccae6b ("net/smc: Introduce IPPROTO_SMC")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Link: https://patch.msgid.link/1728456916-67035-1-git-send-email-alibuda@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 9 Oct 2024 18:58:02 +0000 (18:58 +0000)]
ppp: fix ppp_async_encode() illegal access
syzbot reported an issue in ppp_async_encode() [1]
In this case, pppoe_sendmsg() is called with a zero size.
Then ppp_async_encode() is called with an empty skb.
BUG: KMSAN: uninit-value in ppp_async_encode drivers/net/ppp/ppp_async.c:545 [inline]
BUG: KMSAN: uninit-value in ppp_async_push+0xb4f/0x2660 drivers/net/ppp/ppp_async.c:675
ppp_async_encode drivers/net/ppp/ppp_async.c:545 [inline]
ppp_async_push+0xb4f/0x2660 drivers/net/ppp/ppp_async.c:675
ppp_async_send+0x130/0x1b0 drivers/net/ppp/ppp_async.c:634
ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2280 [inline]
ppp_input+0x1f1/0xe60 drivers/net/ppp/ppp_generic.c:2304
pppoe_rcv_core+0x1d3/0x720 drivers/net/ppp/pppoe.c:379
sk_backlog_rcv+0x13b/0x420 include/net/sock.h:1113
__release_sock+0x1da/0x330 net/core/sock.c:3072
release_sock+0x6b/0x250 net/core/sock.c:3626
pppoe_sendmsg+0x2b8/0xb90 drivers/net/ppp/pppoe.c:903
sock_sendmsg_nosec net/socket.c:729 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:744
____sys_sendmsg+0x903/0xb60 net/socket.c:2602
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656
__sys_sendmmsg+0x3c1/0x960 net/socket.c:2742
__do_sys_sendmmsg net/socket.c:2771 [inline]
__se_sys_sendmmsg net/socket.c:2768 [inline]
__x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768
x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Uninit was created at:
slab_post_alloc_hook mm/slub.c:4092 [inline]
slab_alloc_node mm/slub.c:4135 [inline]
kmem_cache_alloc_node_noprof+0x6bf/0xb80 mm/slub.c:4187
kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:587
__alloc_skb+0x363/0x7b0 net/core/skbuff.c:678
alloc_skb include/linux/skbuff.h:1322 [inline]
sock_wmalloc+0xfe/0x1a0 net/core/sock.c:2732
pppoe_sendmsg+0x3a7/0xb90 drivers/net/ppp/pppoe.c:867
sock_sendmsg_nosec net/socket.c:729 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:744
____sys_sendmsg+0x903/0xb60 net/socket.c:2602
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656
__sys_sendmmsg+0x3c1/0x960 net/socket.c:2742
__do_sys_sendmmsg net/socket.c:2771 [inline]
__se_sys_sendmmsg net/socket.c:2768 [inline]
__x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768
x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
CPU: 1 UID: 0 PID: 5411 Comm: syz.1.14 Not tainted
6.12.0-rc1-syzkaller-00165-g360c1f1f24c6 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+1d121645899e7692f92a@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241009185802.3763282-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Horman [Wed, 9 Oct 2024 09:12:19 +0000 (10:12 +0100)]
docs: netdev: document guidance on cleanup patches
The purpose of this section is to document what is the current practice
regarding clean-up patches which address checkpatch warnings and similar
problems. I feel there is a value in having this documented so others
can easily refer to it.
Clearly this topic is subjective. And to some extent the current
practice discourages a wider range of patches than is described here.
But I feel it is best to start somewhere, with the most well established
part of the current practice.
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241009-doc-mc-clean-v2-1-e637b665fa81@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Thu, 10 Oct 2024 13:39:37 +0000 (15:39 +0200)]
Merge branch 'rtnetlink-handle-error-of-rtnl_register_module'
Kuniyuki Iwashima says:
====================
rtnetlink: Handle error of rtnl_register_module().
While converting phonet to per-netns RTNL, I found a weird comment
/* Further rtnl_register_module() cannot fail */
that was true but no longer true after commit
addf9b90de22 ("net:
rtnetlink: use rcu to free rtnl message handlers").
Many callers of rtnl_register_module() just ignore the returned
value but should handle them properly.
This series introduces two helpers, rtnl_register_many() and
rtnl_unregister_many(), to do that easily and fix such callers.
All rtnl_register() and rtnl_register_module() will be converted
to _many() variant and some rtnl_lock() will be saved in _many()
later in net-next.
Changes:
v4:
* Add more context in changelog of each patch
v3: https://lore.kernel.org/all/
20241007124459.5727-1-kuniyu@amazon.com/
* Move module *owner to struct rtnl_msg_handler
* Make struct rtnl_msg_handler args/vars const
* Update mctp goto labels
v2: https://lore.kernel.org/netdev/
20241004222358.79129-1-kuniyu@amazon.com/
* Remove __exit from mctp_neigh_exit().
v1: https://lore.kernel.org/netdev/
20241003205725.5612-1-kuniyu@amazon.com/
====================
Link: https://patch.msgid.link/20241008184737.9619-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kuniyuki Iwashima [Tue, 8 Oct 2024 18:47:37 +0000 (11:47 -0700)]
phonet: Handle error of rtnl_register_module().
Before commit
addf9b90de22 ("net: rtnetlink: use rcu to free rtnl
message handlers"), once the first rtnl_register_module() allocated
rtnl_msg_handlers[PF_PHONET], the following calls never failed.
However, after the commit, rtnl_register_module() could fail silently
to allocate rtnl_msg_handlers[PF_PHONET][msgtype] and requires error
handling for each call.
Handling the error allows users to view a module as an all-or-nothing
thing in terms of the rtnetlink functionality. This prevents syzkaller
from reporting spurious errors from its tests, where OOM often occurs
and module is automatically loaded.
Let's use rtnl_register_many() to handle the errors easily.
Fixes:
addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message handlers")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Rémi Denis-Courmont <courmisch@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kuniyuki Iwashima [Tue, 8 Oct 2024 18:47:36 +0000 (11:47 -0700)]
mpls: Handle error of rtnl_register_module().
Since introduced, mpls_init() has been ignoring the returned
value of rtnl_register_module(), which could fail silently.
Handling the error allows users to view a module as an all-or-nothing
thing in terms of the rtnetlink functionality. This prevents syzkaller
from reporting spurious errors from its tests, where OOM often occurs
and module is automatically loaded.
Let's handle the errors by rtnl_register_many().
Fixes:
03c0566542f4 ("mpls: Netlink commands to add, remove, and dump routes")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kuniyuki Iwashima [Tue, 8 Oct 2024 18:47:35 +0000 (11:47 -0700)]
mctp: Handle error of rtnl_register_module().
Since introduced, mctp has been ignoring the returned value of
rtnl_register_module(), which could fail silently.
Handling the error allows users to view a module as an all-or-nothing
thing in terms of the rtnetlink functionality. This prevents syzkaller
from reporting spurious errors from its tests, where OOM often occurs
and module is automatically loaded.
Let's handle the errors by rtnl_register_many().
Fixes:
583be982d934 ("mctp: Add device handling and netlink interface")
Fixes:
831119f88781 ("mctp: Add neighbour netlink interface")
Fixes:
06d2f4c583a7 ("mctp: Add netlink route management")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kuniyuki Iwashima [Tue, 8 Oct 2024 18:47:34 +0000 (11:47 -0700)]
bridge: Handle error of rtnl_register_module().
Since introduced, br_vlan_rtnl_init() has been ignoring the returned
value of rtnl_register_module(), which could fail silently.
Handling the error allows users to view a module as an all-or-nothing
thing in terms of the rtnetlink functionality. This prevents syzkaller
from reporting spurious errors from its tests, where OOM often occurs
and module is automatically loaded.
Let's handle the errors by rtnl_register_many().
Fixes:
8dcea187088b ("net: bridge: vlan: add rtm definitions and dump support")
Fixes:
f26b296585dc ("net: bridge: vlan: add new rtm message support")
Fixes:
adb3ce9bcb0f ("net: bridge: vlan: add del rtm message support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kuniyuki Iwashima [Tue, 8 Oct 2024 18:47:33 +0000 (11:47 -0700)]
vxlan: Handle error of rtnl_register_module().
Since introduced, vxlan_vnifilter_init() has been ignoring the
returned value of rtnl_register_module(), which could fail silently.
Handling the error allows users to view a module as an all-or-nothing
thing in terms of the rtnetlink functionality. This prevents syzkaller
from reporting spurious errors from its tests, where OOM often occurs
and module is automatically loaded.
Let's handle the errors by rtnl_register_many().
Fixes:
f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kuniyuki Iwashima [Tue, 8 Oct 2024 18:47:32 +0000 (11:47 -0700)]
rtnetlink: Add bulk registration helpers for rtnetlink message handlers.
Before commit
addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message
handlers"), once rtnl_msg_handlers[protocol] was allocated, the following
rtnl_register_module() for the same protocol never failed.
However, after the commit, rtnl_msg_handler[protocol][msgtype] needs to
be allocated in each rtnl_register_module(), so each call could fail.
Many callers of rtnl_register_module() do not handle the returned error,
and we need to add many error handlings.
To handle that easily, let's add wrapper functions for bulk registration
of rtnetlink message handlers.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Ulf Hansson [Wed, 2 Oct 2024 12:22:23 +0000 (14:22 +0200)]
PM: domains: Fix alloc/free in dev_pm_domain_attach|detach_list()
The dev_pm_domain_attach|detach_list() functions are not resource managed,
hence they should not use devm_* helpers to manage allocation/freeing of
data. Let's fix this by converting to the traditional alloc/free functions.
Fixes:
161e16a5e50a ("PM: domains: Add helper functions to attach/detach multiple PM domains")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20241002122232.194245-3-ulf.hansson@linaro.org
Ulf Hansson [Wed, 2 Oct 2024 12:22:22 +0000 (14:22 +0200)]
Revert "drm/tegra: gr3d: Convert into dev_pm_domain_attach|detach_list()"
This reverts commit
f790b5c09665cab0d51dfcc84832d79d2b1e6c0e.
The reverted commit was not ready to be applied due to dependency on other
OPP/pmdomain changes that didn't make it for the last release cycle. Let's
revert it to fix the behaviour.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20241002122232.194245-2-ulf.hansson@linaro.org
Paolo Abeni [Thu, 10 Oct 2024 11:50:55 +0000 (13:50 +0200)]
Merge tag 'nf-24-10-09' of git://git./linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Restrict xtables extensions to families that are safe, syzbot found
a way to combine ebtables with extensions that are never used by
userspace tools. From Florian Westphal.
2) Set l3mdev inconditionally whenever possible in nft_fib to fix lookup
mismatch, also from Florian.
netfilter pull request 24-10-09
* tag 'nf-24-10-09' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
selftests: netfilter: conntrack_vrf.sh: add fib test case
netfilter: fib: check correct rtable in vrf setups
netfilter: xtables: avoid NFPROTO_UNSPEC where needed
====================
Link: https://patch.msgid.link/20241009213858.3565808-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Michal Wilczynski [Tue, 8 Oct 2024 10:03:27 +0000 (12:03 +0200)]
mmc: sdhci-of-dwcmshc: Prevent stale command interrupt handling
While working with the T-Head 1520 LicheePi4A SoC, certain conditions
arose that allowed me to reproduce a race issue in the sdhci code.
To reproduce the bug, you need to enable the sdio1 controller in the
device tree file
`arch/riscv/boot/dts/thead/th1520-lichee-module-4a.dtsi` as follows:
&sdio1 {
bus-width = <4>;
max-frequency = <
100000000>;
no-sd;
no-mmc;
broken-cd;
cap-sd-highspeed;
post-power-on-delay-ms = <50>;
status = "okay";
wakeup-source;
keep-power-in-suspend;
};
When resetting the SoC using the reset button, the following messages
appear in the dmesg log:
[ 8.164898] mmc2: Got command interrupt 0x00000001 even though no
command operation was in progress.
[ 8.174054] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 8.180503] mmc2: sdhci: Sys addr: 0x00000000 | Version: 0x00000005
[ 8.186950] mmc2: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
[ 8.193395] mmc2: sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[ 8.199841] mmc2: sdhci: Present: 0x03da0000 | Host ctl: 0x00000000
[ 8.206287] mmc2: sdhci: Power: 0x0000000f | Blk gap: 0x00000000
[ 8.212733] mmc2: sdhci: Wake-up: 0x00000000 | Clock: 0x0000decf
[ 8.219178] mmc2: sdhci: Timeout: 0x00000000 | Int stat: 0x00000000
[ 8.225622] mmc2: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003
[ 8.232068] mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
[ 8.238513] mmc2: sdhci: Caps: 0x3f69c881 | Caps_1: 0x08008177
[ 8.244959] mmc2: sdhci: Cmd: 0x00000502 | Max curr: 0x00191919
[ 8.254115] mmc2: sdhci: Resp[0]: 0x00001009 | Resp[1]: 0x00000000
[ 8.260561] mmc2: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
[ 8.267005] mmc2: sdhci: Host ctl2: 0x00001000
[ 8.271453] mmc2: sdhci: ADMA Err: 0x00000000 | ADMA Ptr:
0x0000000000000000
[ 8.278594] mmc2: sdhci: ============================================
I also enabled some traces to better understand the problem:
kworker/3:1-62 [003] ..... 8.163538: mmc_request_start:
mmc2: start struct mmc_request[
000000000d30cc0c]: cmd_opcode=5
cmd_arg=0x0 cmd_flags=0x2e1 cmd_retries=0 stop_opcode=0 stop_arg=0x0
stop_flags=0x0 stop_retries=0 sbc_opcode=0 sbc_arg=0x0 sbc_flags=0x0
sbc_retires=0 blocks=0 block_size=0 blk_addr=0 data_flags=0x0 tag=0
can_retune=0 doing_retune=0 retune_now=0 need_retune=0 hold_retune=1
retune_period=0
<idle>-0 [000] d.h2. 8.164816: sdhci_cmd_irq:
hw_name=
ffe70a0000.mmc quirks=0x2008008 quirks2=0x8 intmask=0x10000
intmask_p=0x18000
irq/24-mmc2-96 [000] ..... 8.164840: sdhci_thread_irq:
msg=
irq/24-mmc2-96 [000] d.h2. 8.164896: sdhci_cmd_irq:
hw_name=
ffe70a0000.mmc quirks=0x2008008 quirks2=0x8 intmask=0x1
intmask_p=0x1
irq/24-mmc2-96 [000] ..... 8.285142: mmc_request_done:
mmc2: end struct mmc_request[
000000000d30cc0c]: cmd_opcode=5
cmd_err=-110 cmd_resp=0x0 0x0 0x0 0x0 cmd_retries=0 stop_opcode=0
stop_err=0 stop_resp=0x0 0x0 0x0 0x0 stop_retries=0 sbc_opcode=0
sbc_err=0 sbc_resp=0x0 0x0 0x0 0x0 sbc_retries=0 bytes_xfered=0
data_err=0 tag=0 can_retune=0 doing_retune=0 retune_now=0 need_retune=0
hold_retune=1 retune_period=0
Here's what happens: the __mmc_start_request function is called with
opcode 5. Since the power to the Wi-Fi card, which resides on this SDIO
bus, is initially off after the reset, an interrupt SDHCI_INT_TIMEOUT is
triggered. Immediately after that, a second interrupt SDHCI_INT_RESPONSE
is triggered. Depending on the exact timing, these conditions can
trigger the following race problem:
1) The sdhci_cmd_irq top half handles the command as an error. It sets
host->cmd to NULL and host->pending_reset to true.
2) The sdhci_thread_irq bottom half is scheduled next and executes faster
than the second interrupt handler for SDHCI_INT_RESPONSE. It clears
host->pending_reset before the SDHCI_INT_RESPONSE handler runs.
3) The pending interrupt SDHCI_INT_RESPONSE handler gets called, triggering
a code path that prints: "mmc2: Got command interrupt 0x00000001 even
though no command operation was in progress."
To solve this issue, we need to clear pending interrupts when resetting
host->pending_reset. This ensures that after sdhci_threaded_irq restores
interrupts, there are no pending stale interrupts.
The behavior observed here is non-compliant with the SDHCI standard.
Place the code in the sdhci-of-dwcmshc driver to account for a
hardware-specific quirk instead of the core SDHCI code.
Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes:
43658a542ebf ("mmc: sdhci-of-dwcmshc: Add support for T-Head TH1520")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241008100327.4108895-1-m.wilczynski@samsung.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Eric Dumazet [Tue, 8 Oct 2024 14:31:10 +0000 (14:31 +0000)]
net: do not delay dst_entries_add() in dst_release()
dst_entries_add() uses per-cpu data that might be freed at netns
dismantle from ip6_route_net_exit() calling dst_entries_destroy()
Before ip6_route_net_exit() can be called, we release all
the dsts associated with this netns, via calls to dst_release(),
which waits an rcu grace period before calling dst_destroy()
dst_entries_add() use in dst_destroy() is racy, because
dst_entries_destroy() could have been called already.
Decrementing the number of dsts must happen sooner.
Notes:
1) in CONFIG_XFRM case, dst_destroy() can call
dst_release_immediate(child), this might also cause UAF
if the child does not have DST_NOCOUNT set.
IPSEC maintainers might take a look and see how to address this.
2) There is also discussion about removing this count of dst,
which might happen in future kernels.
Fixes:
f88649721268 ("ipv4: fix dst race in sk_dst_get()")
Closes: https://lore.kernel.org/lkml/CANn89iLCCGsP7SFn9HKpvnKu96Td4KD08xf7aGtiYgZnkjaL=w@mail.gmail.com/T/
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20241008143110.1064899-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Damien Le Moal [Thu, 10 Oct 2024 02:01:17 +0000 (11:01 +0900)]
ata: libata: Update MAINTAINERS file
Modify the entry for the ahci_platform driver (LIBATA SATA
AHCI PLATFORM devices support) in the MAINTAINERS file to remove Jens
as maintainer. Also remove all references to Jens block tree from the
various LIBATA driver entries as the tree reference for these is defined
by the LIBATA SUBSYSTEM entry.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20241010020117.416333-1-dlemoal@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Janne Grunau [Sun, 6 Oct 2024 17:49:45 +0000 (19:49 +0200)]
drm/fbdev-dma: Only cleanup deferred I/O if necessary
Commit
5a498d4d06d6 ("drm/fbdev-dma: Only install deferred I/O if
necessary") initializes deferred I/O only if it is used.
drm_fbdev_dma_fb_destroy() however calls fb_deferred_io_cleanup()
unconditionally with struct fb_info.fbdefio == NULL. KASAN with the
out-of-tree Apple silicon display driver posts following warning from
__flush_work() of a random struct work_struct instead of the expected
NULL pointer derefs.
[ 22.053799] ------------[ cut here ]------------
[ 22.054832] WARNING: CPU: 2 PID: 1 at kernel/workqueue.c:4177 __flush_work+0x4d8/0x580
[ 22.056597] Modules linked in: uhid bnep uinput nls_ascii ip6_tables ip_tables i2c_dev loop fuse dm_multipath nfnetlink zram hid_magicmouse btrfs xor xor_neon brcmfmac_wcc raid6_pq hci_bcm4377 bluetooth brcmfmac hid_apple brcmutil nvmem_spmi_mfd simple_mfd_spmi dockchannel_hid cfg80211 joydev regmap_spmi nvme_apple ecdh_generic ecc macsmc_hid rfkill dwc3 appledrm snd_soc_macaudio macsmc_power nvme_core apple_isp phy_apple_atc apple_sart apple_rtkit_helper apple_dockchannel tps6598x macsmc_hwmon snd_soc_cs42l84 videobuf2_v4l2 spmi_apple_controller nvmem_apple_efuses videobuf2_dma_sg apple_z2 videobuf2_memops spi_nor panel_summit videobuf2_common asahi videodev pwm_apple apple_dcp snd_soc_apple_mca apple_admac spi_apple clk_apple_nco i2c_pasemi_platform snd_pcm_dmaengine mc i2c_pasemi_core mux_core ofpart adpdrm drm_dma_helper apple_dart apple_soc_cpufreq leds_pwm phram
[ 22.073768] CPU: 2 UID: 0 PID: 1 Comm: systemd-shutdow Not tainted 6.11.2-asahi+ #asahi-dev
[ 22.075612] Hardware name: Apple MacBook Pro (13-inch, M2, 2022) (DT)
[ 22.077032] pstate:
01400005 (nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 22.078567] pc : __flush_work+0x4d8/0x580
[ 22.079471] lr : __flush_work+0x54/0x580
[ 22.080345] sp :
ffffc000836ef820
[ 22.081089] x29:
ffffc000836ef880 x28:
0000000000000000 x27:
ffff80002ddb7128
[ 22.082678] x26:
dfffc00000000000 x25:
1ffff000096f0c57 x24:
ffffc00082d3e358
[ 22.084263] x23:
ffff80004b7862b8 x22:
dfffc00000000000 x21:
ffff80005aa1d470
[ 22.085855] x20:
ffff80004b786000 x19:
ffff80004b7862a0 x18:
0000000000000000
[ 22.087439] x17:
0000000000000000 x16:
0000000000000000 x15:
0000000000000005
[ 22.089030] x14:
1ffff800106ddf0a x13:
0000000000000000 x12:
0000000000000000
[ 22.090618] x11:
ffffb800106ddf0f x10:
dfffc00000000000 x9 :
1ffff800106ddf0e
[ 22.092206] x8 :
0000000000000000 x7 :
aaaaaaaaaaaaaaaa x6 :
0000000000000001
[ 22.093790] x5 :
ffffc000836ef728 x4 :
0000000000000000 x3 :
0000000000000020
[ 22.095368] x2 :
0000000000000008 x1 :
00000000000000aa x0 :
0000000000000000
[ 22.096955] Call trace:
[ 22.097505] __flush_work+0x4d8/0x580
[ 22.098330] flush_delayed_work+0x80/0xb8
[ 22.099231] fb_deferred_io_cleanup+0x3c/0x130
[ 22.100217] drm_fbdev_dma_fb_destroy+0x6c/0xe0 [drm_dma_helper]
[ 22.101559] unregister_framebuffer+0x210/0x2f0
[ 22.102575] drm_fb_helper_unregister_info+0x48/0x60
[ 22.103683] drm_fbdev_dma_client_unregister+0x4c/0x80 [drm_dma_helper]
[ 22.105147] drm_client_dev_unregister+0x1cc/0x230
[ 22.106217] drm_dev_unregister+0x58/0x570
[ 22.107125] apple_drm_unbind+0x50/0x98 [appledrm]
[ 22.108199] component_del+0x1f8/0x3a8
[ 22.109042] dcp_platform_shutdown+0x24/0x38 [apple_dcp]
[ 22.110357] platform_shutdown+0x70/0x90
[ 22.111219] device_shutdown+0x368/0x4d8
[ 22.112095] kernel_restart+0x6c/0x1d0
[ 22.112946] __arm64_sys_reboot+0x1c8/0x328
[ 22.113868] invoke_syscall+0x78/0x1a8
[ 22.114703] do_el0_svc+0x124/0x1a0
[ 22.115498] el0_svc+0x3c/0xe0
[ 22.116181] el0t_64_sync_handler+0x70/0xc0
[ 22.117110] el0t_64_sync+0x190/0x198
[ 22.117931] ---[ end trace
0000000000000000 ]---
Signed-off-by: Janne Grunau <j@jannau.net>
Fixes:
5a498d4d06d6 ("drm/fbdev-dma: Only install deferred I/O if necessary")
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/ZwLNuZL-8Gh5UUQb@robin
Jakub Kicinski [Thu, 10 Oct 2024 03:01:20 +0000 (20:01 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-10-08 (ice, i40e, igb, e1000e)
This series contains updates to ice, i40e, igb, and e1000e drivers.
For ice:
Marcin allows driver to load, into safe mode, when DDP package is
missing or corrupted and adjusts the netif_is_ice() check to
account for when the device is in safe mode. He also fixes an
out-of-bounds issue when MSI-X are increased for VFs.
Wojciech clears FDB entries on reset to match the hardware state.
For i40e:
Aleksandr adds locking around MACVLAN filters to prevent memory leaks
due to concurrency issues.
For igb:
Mohamed Khalfella adds a check to not attempt to bring up an already
running interface on non-fatal PCIe errors.
For e1000e:
Vitaly changes board type for I219 to more closely match the hardware
and stop PHY issues.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000e: change I219 (19) devices to ADP
igb: Do not bring the device up after non-fatal error
i40e: Fix macvlan leak by synchronizing access to mac_filter_hash
ice: Fix increasing MSI-X on VF
ice: Flush FDB entries before reset
ice: Fix netif_is_ice() in Safe Mode
ice: Fix entering Safe Mode
====================
Link: https://patch.msgid.link/20241008230050.928245-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 10 Oct 2024 02:43:46 +0000 (19:43 -0700)]
Merge branch 'mptcp-misc-fixes-involving-fallback-to-tcp'
Matthieu Baerts says:
====================
mptcp: misc. fixes involving fallback to TCP
- Patch 1: better handle DSS corruptions from a bugged peer: reducing
warnings, doing a fallback or a reset depending on the subflow state.
For >= v5.7.
- Patch 2: fix DSS corruption due to large pmtu xmit, where MPTCP was
not taken into account. For >= v5.6.
- Patch 3: fallback when MPTCP opts are dropped after the first data
packet, instead of resetting the connection. For >= v5.6.
- Patch 4: restrict the removal of a subflow to other closing states, a
better fix, for a recent one. For >= v5.10.
====================
Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-0-c6fb8e93e551@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Tue, 8 Oct 2024 11:04:55 +0000 (13:04 +0200)]
mptcp: pm: do not remove closing subflows
In a previous fix, the in-kernel path-manager has been modified not to
retrigger the removal of a subflow if it was already closed, e.g. when
the initial subflow is removed, but kept in the subflows list.
To be complete, this fix should also skip the subflows that are in any
closing state: mptcp_close_ssk() will initiate the closure, but the
switch to the TCP_CLOSE state depends on the other peer.
Fixes:
58e1b66b4e4b ("mptcp: pm: do not remove already closed subflows")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-4-c6fb8e93e551@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Tue, 8 Oct 2024 11:04:54 +0000 (13:04 +0200)]
mptcp: fallback when MPTCP opts are dropped after 1st data
As reported by Christoph [1], before this patch, an MPTCP connection was
wrongly reset when a host received a first data packet with MPTCP
options after the 3wHS, but got the next ones without.
According to the MPTCP v1 specs [2], a fallback should happen in this
case, because the host didn't receive a DATA_ACK from the other peer,
nor receive data for more than the initial window which implies a
DATA_ACK being received by the other peer.
The patch here re-uses the same logic as the one used in other places:
by looking at allow_infinite_fallback, which is disabled at the creation
of an additional subflow. It's not looking at the first DATA_ACK (or
implying one received from the other side) as suggested by the RFC, but
it is in continuation with what was already done, which is safer, and it
fixes the reported issue. The next step, looking at this first DATA_ACK,
is tracked in [4].
This patch has been validated using the following Packetdrill script:
0 socket(..., SOCK_STREAM, IPPROTO_MPTCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
// 3WHS is OK
+0.0 < S 0:0(0) win 65535 <mss 1460, sackOK, nop, nop, nop, wscale 6, mpcapable v1 flags[flag_h] nokey>
+0.0 > S. 0:0(0) ack 1 <mss 1460, nop, nop, sackOK, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey]>
+0.1 < . 1:1(0) ack 1 win 2048 <mpcapable v1 flags[flag_h] key[ckey=2, skey]>
+0 accept(3, ..., ...) = 4
// Data from the client with valid MPTCP options (no DATA_ACK: normal)
+0.1 < P. 1:501(500) ack 1 win 2048 <mpcapable v1 flags[flag_h] key[skey, ckey] mpcdatalen 500, nop, nop>
// From here, the MPTCP options will be dropped by a middlebox
+0.0 > . 1:1(0) ack 501 <dss dack8=501 dll=0 nocs>
+0.1 read(4, ..., 500) = 500
+0 write(4, ..., 100) = 100
// The server replies with data, still thinking MPTCP is being used
+0.0 > P. 1:101(100) ack 501 <dss dack8=501 dsn8=1 ssn=1 dll=100 nocs, nop, nop>
// But the client already did a fallback to TCP, because the two previous packets have been received without MPTCP options
+0.1 < . 501:501(0) ack 101 win 2048
+0.0 < P. 501:601(100) ack 101 win 2048
// The server should fallback to TCP, not reset: it didn't get a DATA_ACK, nor data for more than the initial window
+0.0 > . 101:101(0) ack 601
Note that this script requires Packetdrill with MPTCP support, see [3].
Fixes:
dea2b1ea9c70 ("mptcp: do not reset MP_CAPABLE subflow on mapping errors")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/518 [1]
Link: https://datatracker.ietf.org/doc/html/rfc8684#name-fallback
Link: https://github.com/multipath-tcp/packetdrill
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/519
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-3-c6fb8e93e551@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Tue, 8 Oct 2024 11:04:53 +0000 (13:04 +0200)]
tcp: fix mptcp DSS corruption due to large pmtu xmit
Syzkaller was able to trigger a DSS corruption:
TCP: request_sock_subflow_v4: Possible SYN flooding on port [::]:20002. Sending cookies.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5227 at net/mptcp/protocol.c:695 __mptcp_move_skbs_from_subflow+0x20a9/0x21f0 net/mptcp/protocol.c:695
Modules linked in:
CPU: 0 UID: 0 PID: 5227 Comm: syz-executor350 Not tainted
6.11.0-syzkaller-08829-gaf9c191ac2a0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
RIP: 0010:__mptcp_move_skbs_from_subflow+0x20a9/0x21f0 net/mptcp/protocol.c:695
Code: 0f b6 dc 31 ff 89 de e8 b5 dd ea f5 89 d8 48 81 c4 50 01 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 98 da ea f5 90 <0f> 0b 90 e9 47 ff ff ff e8 8a da ea f5 90 0f 0b 90 e9 99 e0 ff ff
RSP: 0018:
ffffc90000006db8 EFLAGS:
00010246
RAX:
ffffffff8ba9df18 RBX:
00000000000055f0 RCX:
ffff888030023c00
RDX:
0000000000000100 RSI:
00000000000081e5 RDI:
00000000000055f0
RBP:
1ffff110062bf1ae R08:
ffffffff8ba9cf12 R09:
1ffff110062bf1b8
R10:
dffffc0000000000 R11:
ffffed10062bf1b9 R12:
0000000000000000
R13:
dffffc0000000000 R14:
00000000700cec61 R15:
00000000000081e5
FS:
000055556679c380(0000) GS:
ffff8880b8600000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000020287000 CR3:
0000000077892000 CR4:
00000000003506f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<IRQ>
move_skbs_to_msk net/mptcp/protocol.c:811 [inline]
mptcp_data_ready+0x29c/0xa90 net/mptcp/protocol.c:854
subflow_data_ready+0x34a/0x920 net/mptcp/subflow.c:1490
tcp_data_queue+0x20fd/0x76c0 net/ipv4/tcp_input.c:5283
tcp_rcv_established+0xfba/0x2020 net/ipv4/tcp_input.c:6237
tcp_v4_do_rcv+0x96d/0xc70 net/ipv4/tcp_ipv4.c:1915
tcp_v4_rcv+0x2dc0/0x37f0 net/ipv4/tcp_ipv4.c:2350
ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233
NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
__netif_receive_skb_one_core net/core/dev.c:5662 [inline]
__netif_receive_skb+0x2bf/0x650 net/core/dev.c:5775
process_backlog+0x662/0x15b0 net/core/dev.c:6107
__napi_poll+0xcb/0x490 net/core/dev.c:6771
napi_poll net/core/dev.c:6840 [inline]
net_rx_action+0x89b/0x1240 net/core/dev.c:6962
handle_softirqs+0x2c5/0x980 kernel/softirq.c:554
do_softirq+0x11b/0x1e0 kernel/softirq.c:455
</IRQ>
<TASK>
__local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
local_bh_enable include/linux/bottom_half.h:33 [inline]
rcu_read_unlock_bh include/linux/rcupdate.h:919 [inline]
__dev_queue_xmit+0x1764/0x3e80 net/core/dev.c:4451
dev_queue_xmit include/linux/netdevice.h:3094 [inline]
neigh_hh_output include/net/neighbour.h:526 [inline]
neigh_output include/net/neighbour.h:540 [inline]
ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:236
ip_local_out net/ipv4/ip_output.c:130 [inline]
__ip_queue_xmit+0x118c/0x1b80 net/ipv4/ip_output.c:536
__tcp_transmit_skb+0x2544/0x3b30 net/ipv4/tcp_output.c:1466
tcp_transmit_skb net/ipv4/tcp_output.c:1484 [inline]
tcp_mtu_probe net/ipv4/tcp_output.c:2547 [inline]
tcp_write_xmit+0x641d/0x6bf0 net/ipv4/tcp_output.c:2752
__tcp_push_pending_frames+0x9b/0x360 net/ipv4/tcp_output.c:3015
tcp_push_pending_frames include/net/tcp.h:2107 [inline]
tcp_data_snd_check net/ipv4/tcp_input.c:5714 [inline]
tcp_rcv_established+0x1026/0x2020 net/ipv4/tcp_input.c:6239
tcp_v4_do_rcv+0x96d/0xc70 net/ipv4/tcp_ipv4.c:1915
sk_backlog_rcv include/net/sock.h:1113 [inline]
__release_sock+0x214/0x350 net/core/sock.c:3072
release_sock+0x61/0x1f0 net/core/sock.c:3626
mptcp_push_release net/mptcp/protocol.c:1486 [inline]
__mptcp_push_pending+0x6b5/0x9f0 net/mptcp/protocol.c:1625
mptcp_sendmsg+0x10bb/0x1b10 net/mptcp/protocol.c:1903
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x1a6/0x270 net/socket.c:745
____sys_sendmsg+0x52a/0x7e0 net/socket.c:2603
___sys_sendmsg net/socket.c:2657 [inline]
__sys_sendmsg+0x2aa/0x390 net/socket.c:2686
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb06e9317f9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:
00007ffe2cfd4f98 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
00007fb06e97f468 RCX:
00007fb06e9317f9
RDX:
0000000000000000 RSI:
0000000020000080 RDI:
0000000000000005
RBP:
00007fb06e97f446 R08:
0000555500000000 R09:
0000555500000000
R10:
0000555500000000 R11:
0000000000000246 R12:
00007fb06e97f406
R13:
0000000000000001 R14:
00007ffe2cfd4fe0 R15:
0000000000000003
</TASK>
Additionally syzkaller provided a nice reproducer. The repro enables
pmtu on the loopback device, leading to tcp_mtu_probe() generating
very large probe packets.
tcp_can_coalesce_send_queue_head() currently does not check for
mptcp-level invariants, and allowed the creation of cross-DSS probes,
leading to the mentioned corruption.
Address the issue teaching tcp_can_coalesce_send_queue_head() about
mptcp using the tcp_skb_can_collapse(), also reducing the code
duplication.
Fixes:
85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions")
Cc: stable@vger.kernel.org
Reported-by: syzbot+d1bff73460e33101f0e7@syzkaller.appspotmail.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/513
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-2-c6fb8e93e551@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Tue, 8 Oct 2024 11:04:52 +0000 (13:04 +0200)]
mptcp: handle consistently DSS corruption
Bugged peer implementation can send corrupted DSS options, consistently
hitting a few warning in the data path. Use DEBUG_NET assertions, to
avoid the splat on some builds and handle consistently the error, dumping
related MIBs and performing fallback and/or reset according to the
subflow type.
Fixes:
6771bfd9ee24 ("mptcp: update mptcp ack sequence from work queue")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-1-c6fb8e93e551@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Breno Leitao [Tue, 8 Oct 2024 09:43:24 +0000 (02:43 -0700)]
net: netconsole: fix wrong warning
A warning is triggered when there is insufficient space in the buffer
for userdata. However, this is not an issue since userdata will be sent
in the next iteration.
Current warning message:
------------[ cut here ]------------
WARNING: CPU: 13 PID:
3013042 at drivers/net/netconsole.c:1122 write_ext_msg+0x3b6/0x3d0
? write_ext_msg+0x3b6/0x3d0
console_flush_all+0x1e9/0x330
The code incorrectly issues a warning when this_chunk is zero, which is
a valid scenario. The warning should only be triggered when this_chunk
is negative.
Fixes:
1ec9daf95093 ("net: netconsole: append userdata to fragmented netconsole messages")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241008094325.896208-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Tue, 8 Oct 2024 09:43:20 +0000 (12:43 +0300)]
net: dsa: refuse cross-chip mirroring operations
In case of a tc mirred action from one switch to another, the behavior
is not correct. We simply tell the source switch driver to program a
mirroring entry towards mirror->to_local_port = to_dp->index, but it is
not even guaranteed that the to_dp belongs to the same switch as dp.
For proper cross-chip support, we would need to go through the
cross-chip notifier layer in switch.c, program the entry on cascade
ports, and introduce new, explicit API for cross-chip mirroring, given
that intermediary switches should have introspection into the DSA tags
passed through the cascade port (and not just program a port mirror on
the entire cascade port). None of that exists today.
Reject what is not implemented so that user space is not misled into
thinking it works.
Fixes:
f50f212749e8 ("net: dsa: Add plumbing for port mirroring")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241008094320.3340980-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Wei Fang [Tue, 8 Oct 2024 06:11:53 +0000 (14:11 +0800)]
net: fec: don't save PTP state if PTP is unsupported
Some platforms (such as i.MX25 and i.MX27) do not support PTP, so on
these platforms fec_ptp_init() is not called and the related members
in fep are not initialized. However, fec_ptp_save_state() is called
unconditionally, which causes the kernel to panic. Therefore, add a
condition so that fec_ptp_save_state() is not called if PTP is not
supported.
Fixes:
a1477dc87dc4 ("net: fec: Restart PPS after link state change")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/lkml/
353e41fe-6bb4-4ee9-9980-
2da2a9c1c508@roeck-us.net/
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Csókás, Bence <csokas.bence@prolan.hu>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://patch.msgid.link/20241008061153.1977930-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Tue, 8 Oct 2024 23:30:50 +0000 (16:30 -0700)]
net: ibm: emac: mal: add dcr_unmap to _remove
It's done in probe so it should be undone here.
Fixes:
1d3bb996481e ("Device tree aware EMAC driver")
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20241008233050.9422-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jacky Chou [Mon, 7 Oct 2024 03:24:35 +0000 (11:24 +0800)]
net: ftgmac100: fixed not check status from fixed phy
Add error handling from calling fixed_phy_register.
It may return some error, therefore, need to check the status.
And fixed_phy_register needs to bind a device node for mdio.
Add the mac device node for fixed_phy_register function.
This is a reference to this function, of_phy_register_fixed_link().
Fixes:
e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI")
Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com>
Link: https://patch.msgid.link/20241007032435.787892-1-jacky_chou@aspeedtech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 9 Oct 2024 23:01:40 +0000 (16:01 -0700)]
Merge tag 'mm-hotfixes-stable-2024-10-09-15-46' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"12 hotfixes, 5 of which are c:stable. All singletons, about half of
which are MM"
* tag 'mm-hotfixes-stable-2024-10-09-15-46' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: zswap: delete comments for "value" member of 'struct zswap_entry'.
CREDITS: sort alphabetically by name
secretmem: disable memfd_secret() if arch cannot set direct map
.mailmap: update Fangrui's email
mm/huge_memory: check pmd_special() only after pmd_present()
resource, kunit: fix user-after-free in resource_test_region_intersects()
fs/proc/kcore.c: allow translation of physical memory addresses
selftests/mm: fix incorrect buffer->mirror size in hmm2 double_map test
device-dax: correct pgoff align in dax_set_mapping()
kthread: unpark only parked kthread
Revert "mm: introduce PF_MEMALLOC_NORECLAIM, PF_MEMALLOC_NOWARN"
bcachefs: do not use PF_MEMALLOC_NORECLAIM
Florian Westphal [Wed, 9 Oct 2024 07:19:03 +0000 (09:19 +0200)]
selftests: netfilter: conntrack_vrf.sh: add fib test case
meta iifname veth0 ip daddr ... fib daddr oif
... is expected to return "dummy0" interface which is part of same vrf
as veth0.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Wed, 9 Oct 2024 07:19:02 +0000 (09:19 +0200)]
netfilter: fib: check correct rtable in vrf setups
We need to init l3mdev unconditionally, else main routing table is searched
and incorrect result is returned unless strict (iif keyword) matching is
requested.
Next patch adds a selftest for this.
Fixes:
2a8a7c0eaa87 ("netfilter: nft_fib: Fix for rpath check with VRF devices")
Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1761
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 7 Oct 2024 09:28:16 +0000 (11:28 +0200)]
netfilter: xtables: avoid NFPROTO_UNSPEC where needed
syzbot managed to call xt_cluster match via ebtables:
WARNING: CPU: 0 PID: 11 at net/netfilter/xt_cluster.c:72 xt_cluster_mt+0x196/0x780
[..]
ebt_do_table+0x174b/0x2a40
Module registers to NFPROTO_UNSPEC, but it assumes ipv4/ipv6 packet
processing. As this is only useful to restrict locally terminating
TCP/UDP traffic, register this for ipv4 and ipv6 family only.
Pablo points out that this is a general issue, direct users of the
set/getsockopt interface can call into targets/matches that were only
intended for use with ip(6)tables.
Check all UNSPEC matches and targets for similar issues:
- matches and targets are fine except if they assume skb_network_header()
is valid -- this is only true when called from inet layer: ip(6) stack
pulls the ip/ipv6 header into linear data area.
- targets that return XT_CONTINUE or other xtables verdicts must be
restricted too, they are incompatbile with the ebtables traverser, e.g.
EBT_CONTINUE is a completely different value than XT_CONTINUE.
Most matches/targets are changed to register for NFPROTO_IPV4/IPV6, as
they are provided for use by ip(6)tables.
The MARK target is also used by arptables, so register for NFPROTO_ARP too.
While at it, bail out if connbytes fails to enable the corresponding
conntrack family.
This change passes the selftests in iptables.git.
Reported-by: syzbot+256c348558aa5cf611a9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netfilter-devel/
66fec2e2.
050a0220.9ec68.0047.GAE@google.com/
Fixes:
0269ea493734 ("netfilter: xtables: add cluster match")
Signed-off-by: Florian Westphal <fw@strlen.de>
Co-developed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Kanchana P Sridhar [Wed, 2 Oct 2024 19:42:13 +0000 (12:42 -0700)]
mm: zswap: delete comments for "value" member of 'struct zswap_entry'.
Made a minor edit in the comments for 'struct zswap_entry' to delete the
description of the 'value' member that was deleted in commit
20a5532ffa53
("mm: remove code to handle same filled pages").
Link: https://lkml.kernel.org/r/20241002194213.30041-1-kanchana.p.sridhar@intel.com
Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Fixes:
20a5532ffa53 ("mm: remove code to handle same filled pages")
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Usama Arif <usamaarif642@gmail.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Wajdi Feghali <wajdi.k.feghali@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Krzysztof Kozlowski [Wed, 2 Oct 2024 11:19:32 +0000 (13:19 +0200)]
CREDITS: sort alphabetically by name
Re-sort few misplaced entries in the CREDITS file.
Link: https://lkml.kernel.org/r/20241002111932.46012-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patrick Roy [Tue, 1 Oct 2024 08:00:41 +0000 (09:00 +0100)]
secretmem: disable memfd_secret() if arch cannot set direct map
Return -ENOSYS from memfd_secret() syscall if !can_set_direct_map(). This
is the case for example on some arm64 configurations, where marking 4k
PTEs in the direct map not present can only be done if the direct map is
set up at 4k granularity in the first place (as ARM's break-before-make
semantics do not easily allow breaking apart large/gigantic pages).
More precisely, on arm64 systems with !can_set_direct_map(),
set_direct_map_invalid_noflush() is a no-op, however it returns success
(0) instead of an error. This means that memfd_secret will seemingly
"work" (e.g. syscall succeeds, you can mmap the fd and fault in pages),
but it does not actually achieve its goal of removing its memory from the
direct map.
Note that with this patch, memfd_secret() will start erroring on systems
where can_set_direct_map() returns false (arm64 with
CONFIG_RODATA_FULL_DEFAULT_ENABLED=n, CONFIG_DEBUG_PAGEALLOC=n and
CONFIG_KFENCE=n), but that still seems better than the current silent
failure. Since CONFIG_RODATA_FULL_DEFAULT_ENABLED defaults to 'y', most
arm64 systems actually have a working memfd_secret() and aren't be
affected.
From going through the iterations of the original memfd_secret patch
series, it seems that disabling the syscall in these scenarios was the
intended behavior [1] (preferred over having
set_direct_map_invalid_noflush return an error as that would result in
SIGBUSes at page-fault time), however the check for it got dropped between
v16 [2] and v17 [3], when secretmem moved away from CMA allocations.
[1]: https://lore.kernel.org/lkml/
20201124164930.GK8537@kernel.org/
[2]: https://lore.kernel.org/lkml/
20210121122723.3446-11-rppt@kernel.org/#t
[3]: https://lore.kernel.org/lkml/
20201125092208.12544-10-rppt@kernel.org/
Link: https://lkml.kernel.org/r/20241001080056.784735-1-roypat@amazon.co.uk
Fixes:
1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: James Gowans <jgowans@amazon.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Fangrui Song [Fri, 27 Sep 2024 19:29:12 +0000 (12:29 -0700)]
.mailmap: update Fangrui's email
I'm leaving Google.
Link: https://lkml.kernel.org/r/20240927192912.31532-1-i@maskray.me
Signed-off-by: Fangrui Song <i@maskray.me>
Acked-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Hildenbrand [Thu, 26 Sep 2024 15:42:34 +0000 (17:42 +0200)]
mm/huge_memory: check pmd_special() only after pmd_present()
We should only check for pmd_special() after we made sure that we have a
present PMD. For example, if we have a migration PMD, pmd_special() might
indicate that we have a special PMD although we really don't.
This fixes confusing migration entries as PFN mappings, and not doing what
we are supposed to do in the "is_swap_pmd()" case further down in the
function -- including messing up COW, page table handling and accounting.
Link: https://lkml.kernel.org/r/20240926154234.2247217-1-david@redhat.com
Fixes:
bc02afbd4d73 ("mm/fork: accept huge pfnmap entries")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: syzbot+bf2c35fa302ebe3c7471@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/
66f15c8d.
050a0220.c23dd.000f.GAE@google.com/
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Huang Ying [Mon, 30 Sep 2024 07:06:11 +0000 (15:06 +0800)]
resource, kunit: fix user-after-free in resource_test_region_intersects()
In resource_test_insert_resource(), the pointer is used in error message
after kfree(). This is user-after-free. To fix this, we need to call
kunit_add_action_or_reset() to schedule memory freeing after usage. But
kunit_add_action_or_reset() itself may fail and free the memory. So, its
return value should be checked and abort the test for failure. Then, we
found that other usage of kunit_add_action_or_reset() in
resource_test_region_intersects() needs to be fixed too. We fix all these
user-after-free bugs in this patch.
Link: https://lkml.kernel.org/r/20240930070611.353338-1-ying.huang@intel.com
Fixes:
99185c10d5d9 ("resource, kunit: add test case for region_intersects()")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reported-by: Kees Bakker <kees@ijzerbout.nl>
Closes: https://lore.kernel.org/lkml/87ldzaotcg.fsf@yhuang6-desk2.ccr.corp.intel.com/
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alexander Gordeev [Mon, 30 Sep 2024 12:21:19 +0000 (14:21 +0200)]
fs/proc/kcore.c: allow translation of physical memory addresses
When /proc/kcore is read an attempt to read the first two pages results in
HW-specific page swap on s390 and another (so called prefix) pages are
accessed instead. That leads to a wrong read.
Allow architecture-specific translation of memory addresses using
kc_xlate_dev_mem_ptr() and kc_unxlate_dev_mem_ptr() callbacks similarily
to /dev/mem xlate_dev_mem_ptr() and unxlate_dev_mem_ptr() callbacks. That
way an architecture can deal with specific physical memory ranges.
Re-use the existing /dev/mem callback implementation on s390, which
handles the described prefix pages swapping correctly.
For other architectures the default callback is basically NOP. It is
expected the condition (vaddr == __va(__pa(vaddr))) always holds true for
KCORE_RAM memory type.
Link: https://lkml.kernel.org/r/20240930122119.1651546-1-agordeev@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Donet Tom [Fri, 27 Sep 2024 05:07:52 +0000 (00:07 -0500)]
selftests/mm: fix incorrect buffer->mirror size in hmm2 double_map test
The hmm2 double_map test was failing due to an incorrect buffer->mirror
size. The buffer->mirror size was 6, while buffer->ptr size was 6 *
PAGE_SIZE. The test failed because the kernel's copy_to_user function was
attempting to copy a 6 * PAGE_SIZE buffer to buffer->mirror. Since the
size of buffer->mirror was incorrect, copy_to_user failed.
This patch corrects the buffer->mirror size to 6 * PAGE_SIZE.
Test Result without this patch
==============================
# RUN hmm2.hmm2_device_private.double_map ...
# hmm-tests.c:1680:double_map:Expected ret (-14) == 0 (0)
# double_map: Test terminated by assertion
# FAIL hmm2.hmm2_device_private.double_map
not ok 53 hmm2.hmm2_device_private.double_map
Test Result with this patch
===========================
# RUN hmm2.hmm2_device_private.double_map ...
# OK hmm2.hmm2_device_private.double_map
ok 53 hmm2.hmm2_device_private.double_map
Link: https://lkml.kernel.org/r/20240927050752.51066-1-donettom@linux.ibm.com
Fixes:
fee9f6d1b8df ("mm/hmm/test: add selftests for HMM")
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kun(llfl) [Fri, 27 Sep 2024 07:45:09 +0000 (15:45 +0800)]
device-dax: correct pgoff align in dax_set_mapping()
pgoff should be aligned using ALIGN_DOWN() instead of ALIGN(). Otherwise,
vmf->address not aligned to fault_size will be aligned to the next
alignment, that can result in memory failure getting the wrong address.
It's a subtle situation that only can be observed in
page_mapped_in_vma() after the page is page fault handled by
dev_dax_huge_fault. Generally, there is little chance to perform
page_mapped_in_vma in dev-dax's page unless in specific error injection
to the dax device to trigger an MCE - memory-failure. In that case,
page_mapped_in_vma() will be triggered to determine which task is
accessing the failure address and kill that task in the end.
We used self-developed dax device (which is 2M aligned mapping) , to
perform error injection to random address. It turned out that error
injected to non-2M-aligned address was causing endless MCE until panic.
Because page_mapped_in_vma() kept resulting wrong address and the task
accessing the failure address was never killed properly:
[ 3783.719419] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3784.049006] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3784.049190] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3784.448042] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3784.448186] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3784.792026] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3784.792179] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3785.162502] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3785.162633] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3785.461116] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3785.461247] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3785.764730] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3785.764859] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3786.042128] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3786.042259] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3786.464293] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3786.464423] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3786.818090] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3786.818217] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
[ 3787.085297] mce: Uncorrected hardware memory error in user-access at
200c9742380
[ 3787.085424] Memory failure: 0x200c9742: recovery action for dax page:
Recovered
It took us several weeks to pinpoint this problem, but we eventually
used bpftrace to trace the page fault and mce address and successfully
identified the issue.
Joao added:
; Likely we never reproduce in production because we always pin
: device-dax regions in the region align they provide (Qemu does
: similarly with prealloc in hugetlb/file backed memory). I think this
: bug requires that we touch *unpinned* device-dax regions unaligned to
: the device-dax selected alignment (page size i.e. 4K/2M/1G)
Link: https://lkml.kernel.org/r/23c02a03e8d666fef11bbe13e85c69c8b4ca0624.1727421694.git.llfl@linux.alibaba.com
Fixes:
b9b5777f09be ("device-dax: use ALIGN() for determining pgoff")
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Tested-by: JianXiong Zhao <zhaojianxiong.zjx@alibaba-inc.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Frederic Weisbecker [Fri, 13 Sep 2024 21:46:34 +0000 (23:46 +0200)]
kthread: unpark only parked kthread
Calling into kthread unparking unconditionally is mostly harmless when
the kthread is already unparked. The wake up is then simply ignored
because the target is not in TASK_PARKED state.
However if the kthread is per CPU, the wake up is preceded by a call
to kthread_bind() which expects the task to be inactive and in
TASK_PARKED state, which obviously isn't the case if it is unparked.
As a result, calling kthread_stop() on an unparked per-cpu kthread
triggers such a warning:
WARNING: CPU: 0 PID: 11 at kernel/kthread.c:525 __kthread_bind_mask kernel/kthread.c:525
<TASK>
kthread_stop+0x17a/0x630 kernel/kthread.c:707
destroy_workqueue+0x136/0xc40 kernel/workqueue.c:5810
wg_destruct+0x1e2/0x2e0 drivers/net/wireguard/device.c:257
netdev_run_todo+0xe1a/0x1000 net/core/dev.c:10693
default_device_exit_batch+0xa14/0xa90 net/core/dev.c:11769
ops_exit_list net/core/net_namespace.c:178 [inline]
cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
process_one_work kernel/workqueue.c:3231 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
worker_thread+0x86d/0xd70 kernel/workqueue.c:3393
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Fix this with skipping unecessary unparking while stopping a kthread.
Link: https://lkml.kernel.org/r/20240913214634.12557-1-frederic@kernel.org
Fixes:
5c25b5ff89f0 ("workqueue: Tag bound workers with KTHREAD_IS_PER_CPU")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Reported-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com
Tested-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Michal Hocko [Thu, 26 Sep 2024 17:11:51 +0000 (19:11 +0200)]
Revert "mm: introduce PF_MEMALLOC_NORECLAIM, PF_MEMALLOC_NOWARN"
This reverts commit
eab0af905bfc3e9c05da2ca163d76a1513159aa4.
There is no existing user of those flags. PF_MEMALLOC_NOWARN is dangerous
because a nested allocation context can use GFP_NOFAIL which could cause
unexpected failure. Such a code would be hard to maintain because it
could be deeper in the call chain.
PF_MEMALLOC_NORECLAIM has been added even when it was pointed out [1] that
such a allocation contex is inherently unsafe if the context doesn't fully
control all allocations called from this context.
While PF_MEMALLOC_NOWARN is not dangerous the way PF_MEMALLOC_NORECLAIM is
it doesn't have any user and as Matthew has pointed out we are running out
of those flags so better reclaim it without any real users.
[1] https://lore.kernel.org/all/ZcM0xtlKbAOFjv5n@tiehlicka/
Link: https://lkml.kernel.org/r/20240926172940.167084-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: James Morris <jmorris@namei.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Michal Hocko [Thu, 26 Sep 2024 17:11:50 +0000 (19:11 +0200)]
bcachefs: do not use PF_MEMALLOC_NORECLAIM
Patch series "remove PF_MEMALLOC_NORECLAIM" v3.
This patch (of 2):
bch2_new_inode relies on PF_MEMALLOC_NORECLAIM to try to allocate a new
inode to achieve GFP_NOWAIT semantic while holding locks. If this
allocation fails it will drop locks and use GFP_NOFS allocation context.
We would like to drop PF_MEMALLOC_NORECLAIM because it is really
dangerous to use if the caller doesn't control the full call chain with
this flag set. E.g. if any of the function down the chain needed
GFP_NOFAIL request the PF_MEMALLOC_NORECLAIM would override this and
cause unexpected failure.
While this is not the case in this particular case using the scoped gfp
semantic is not really needed bacause we can easily pus the allocation
context down the chain without too much clutter.
[akpm@linux-foundation.org: fix kerneldoc warnings]
Link: https://lkml.kernel.org/r/20240926172940.167084-1-mhocko@kernel.org
Link: https://lkml.kernel.org/r/20240926172940.167084-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz> # For vfs changes
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: James Morris <jmorris@namei.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dimitri Sivanich [Thu, 19 Sep 2024 12:34:50 +0000 (07:34 -0500)]
misc: sgi-gru: Don't disable preemption in GRU driver
Disabling preemption in the GRU driver is unnecessary, and clashes with
sleeping locks in several code paths. Remove preempt_disable and
preempt_enable from the GRU driver.
Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 9 Oct 2024 19:22:02 +0000 (12:22 -0700)]
Merge tag 'unicode-fixes-6.12-rc3' of git://git./linux/kernel/git/krisman/unicode
Pull unicode fix from Gabriel Krisman Bertazi:
- Handle code-points with the Ignorable property as regular character
instead of treating them as an empty string (me)
* tag 'unicode-fixes-6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode:
unicode: Don't special case ignorable code points
Gabriel Krisman Bertazi [Tue, 8 Oct 2024 22:43:16 +0000 (18:43 -0400)]
unicode: Don't special case ignorable code points
We don't need to handle them separately. Instead, just let them
decompose/casefold to themselves.
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Niklas Cassel [Tue, 8 Oct 2024 13:58:44 +0000 (15:58 +0200)]
ata: libata: avoid superfluous disk spin down + spin up during hibernation
A user reported that commit
aa3998dbeb3a ("ata: libata-scsi: Disable scsi
device manage_system_start_stop") introduced a spin down + immediate spin
up of the disk both when entering and when resuming from hibernation.
This behavior was not there before, and causes an increased latency both
when entering and when resuming from hibernation.
Hibernation is done by three consecutive PM events, in the following order:
1) PM_EVENT_FREEZE
2) PM_EVENT_THAW
3) PM_EVENT_HIBERNATE
Commit
aa3998dbeb3a ("ata: libata-scsi: Disable scsi device
manage_system_start_stop") modified ata_eh_handle_port_suspend() to call
ata_dev_power_set_standby() (which spins down the disk), for both event
PM_EVENT_FREEZE and event PM_EVENT_HIBERNATE.
Documentation/driver-api/pm/devices.rst, section "Entering Hibernation",
explicitly mentions that PM_EVENT_FREEZE does not have to be put the device
in a low-power state, and actually recommends not doing so. Thus, let's not
spin down the disk on PM_EVENT_FREEZE. (The disk will instead be spun down
during the subsequent PM_EVENT_HIBERNATE event.)
This way, PM_EVENT_FREEZE will behave as it did before commit
aa3998dbeb3a
("ata: libata-scsi: Disable scsi device manage_system_start_stop"), while
PM_EVENT_HIBERNATE will continue to spin down the disk.
This will avoid the superfluous spin down + spin up when entering and
resuming from hibernation, while still making sure that the disk is spun
down before actually entering hibernation.
Cc: stable@vger.kernel.org # v6.6+
Fixes:
aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop")
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20241008135843.1266244-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Steven Rostedt [Tue, 8 Oct 2024 18:32:42 +0000 (14:32 -0400)]
ring-buffer: Do not have boot mapped buffers hook to CPU hotplug
The boot mapped ring buffer has its buffer mapped at a fixed location
found at boot up. It is not dynamic. It cannot grow or be expanded when
new CPUs come online.
Do not hook fixed memory mapped ring buffers to the CPU hotplug callback,
otherwise it can cause a crash when it tries to add the buffer to the
memory that is already fully occupied.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20241008143242.25e20801@gandalf.local.home
Fixes:
be68d63a139bd ("ring-buffer: Add ring_buffer_alloc_range()")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Jijie Shao [Tue, 8 Oct 2024 02:48:36 +0000 (10:48 +0800)]
net: hns3/hns: Update the maintainer for the HNS3/HNS ethernet driver
Yisen Zhuang has left the company in September.
Jian Shen will be responsible for maintaining the
hns3/hns driver's code in the future,
so add Jian Shen to the hns3/hns driver's matainer list.
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 7 Oct 2024 16:25:11 +0000 (12:25 -0400)]
sctp: ensure sk_state is set to CLOSED if hashing fails in sctp_listen_start
If hashing fails in sctp_listen_start(), the socket remains in the
LISTENING state, even though it was not added to the hash table.
This can lead to a scenario where a socket appears to be listening
without actually being accessible.
This patch ensures that if the hashing operation fails, the sk_state
is set back to CLOSED before returning an error.
Note that there is no need to undo the autobind operation if hashing
fails, as the bind port can still be used for next listen() call on
the same socket.
Fixes:
76c6d988aeb3 ("sctp: add sock_reuseport for the sock in __sctp_hash_endpoint")
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Palmer [Mon, 7 Oct 2024 10:43:17 +0000 (19:43 +0900)]
net: amd: mvme147: Fix probe banner message
Currently this driver prints this line with what looks like
a rogue format specifier when the device is probed:
[ 2.840000] eth%d: MVME147 at 0xfffe1800, irq 12, Hardware Address xx:xx:xx:xx:xx:xx
Change the printk() for netdev_info() and move it after the
registration has completed so it prints out the name of the
interface properly.
Signed-off-by: Daniel Palmer <daniel@0x0f.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Mon, 7 Oct 2024 09:57:41 +0000 (11:57 +0200)]
net: phy: realtek: Fix MMD access on RTL8126A-integrated PHY
All MMD reads return 0 for the RTL8126A-integrated PHY. Therefore phylib
assumes it doesn't support EEE, what results in higher power consumption,
and a significantly higher chip temperature in my case.
To fix this split out the PHY driver for the RTL8126A-integrated PHY
and set the read_mmd/write_mmd callbacks to read from vendor-specific
registers.
Fixes:
5befa3728b85 ("net: phy: realtek: add support for RTL8126A-integrated 5Gbps PHY")
Cc: stable@vger.kernel.org
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Naohiro Aota [Fri, 4 Oct 2024 04:53:35 +0000 (13:53 +0900)]
btrfs: fix clear_dirty and writeback ordering in submit_one_sector()
This commit is a replay of commit
6252690f7e1b ("btrfs: fix invalid
mapping of extent xarray state"). We need to call
btrfs_folio_clear_dirty() before btrfs_set_range_writeback(), so that
xarray DIRTY tag is cleared.
With a refactoring commit
8189197425e7 ("btrfs: refactor
__extent_writepage_io() to do sector-by-sector submission"), it screwed
up and the order is reversed and causing the same hang. Fix the ordering
now in submit_one_sector().
Fixes:
8189197425e7 ("btrfs: refactor __extent_writepage_io() to do sector-by-sector submission")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Wed, 2 Oct 2024 14:02:56 +0000 (15:02 +0100)]
btrfs: zoned: fix missing RCU locking in error message when loading zone info
At btrfs_load_zone_info() we have an error path that is dereferencing
the name of a device which is a RCU string but we are not holding a RCU
read lock, which is incorrect.
Fix this by using btrfs_err_in_rcu() instead of btrfs_err().
The problem is there since commit
08e11a3db098 ("btrfs: zoned: load zone's
allocation offset"), back then at btrfs_load_block_group_zone_info() but
then later on that code was factored out into the helper
btrfs_load_zone_info() by commit
09a46725cc84 ("btrfs: zoned: factor out
per-zone logic from btrfs_load_block_group_zone_info").
Fixes:
08e11a3db098 ("btrfs: zoned: load zone's allocation offset")
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
MD Danish Anwar [Mon, 7 Oct 2024 05:41:24 +0000 (11:11 +0530)]
net: ti: icssg-prueth: Fix race condition for VLAN table access
The VLAN table is a shared memory between the two ports/slices
in a ICSSG cluster and this may lead to race condition when the
common code paths for both ports are executed in different CPUs.
Fix the race condition access by locking the shared memory access
Fixes:
487f7323f39a ("net: ti: icssg-prueth: Add helper functions to configure FDB")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Kreimer [Wed, 2 Oct 2024 21:19:48 +0000 (00:19 +0300)]
xfs: fix a typo
Fix a typo in comments.
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Brian Foster [Fri, 6 Sep 2024 11:40:51 +0000 (07:40 -0400)]
xfs: don't free cowblocks from under dirty pagecache on unshare
fallocate unshare mode explicitly breaks extent sharing. When a
command completes, it checks the data fork for any remaining shared
extents to determine whether the reflink inode flag and COW fork
preallocation can be removed. This logic doesn't consider in-core
pagecache and I/O state, however, which means we can unsafely remove
COW fork blocks that are still needed under certain conditions.
For example, consider the following command sequence:
xfs_io -fc "pwrite 0 1k" -c "reflink <file> 0 256k 1k" \
-c "pwrite 0 32k" -c "funshare 0 1k" <file>
This allocates a data block at offset 0, shares it, and then
overwrites it with a larger buffered write. The overwrite triggers
COW fork preallocation, 32 blocks by default, which maps the entire
32k write to delalloc in the COW fork. All but the shared block at
offset 0 remains hole mapped in the data fork. The unshare command
redirties and flushes the folio at offset 0, removing the only
shared extent from the inode. Since the inode no longer maps shared
extents, unshare purges the COW fork before the remaining 28k may
have written back.
This leaves dirty pagecache backed by holes, which writeback quietly
skips, thus leaving clean, non-zeroed pagecache over holes in the
file. To verify, fiemap shows holes in the first 32k of the file and
reads return different data across a remount:
$ xfs_io -c "fiemap -v" <file>
<file>:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
...
1: [8..511]: hole 504
...
$ xfs_io -c "pread -v 4k 8" <file>
00001000: cd cd cd cd cd cd cd cd ........
$ umount <mnt>; mount <dev> <mnt>
$ xfs_io -c "pread -v 4k 8" <file>
00001000: 00 00 00 00 00 00 00 00 ........
To avoid this problem, make unshare follow the same rules used for
background cowblock scanning and never purge the COW fork for inodes
with dirty pagecache or in-flight I/O.
Fixes:
46afb0628b86347 ("xfs: only flush the unshared range in xfs_reflink_unshare")
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Dave Airlie [Wed, 9 Oct 2024 06:30:21 +0000 (16:30 +1000)]
Merge tag 'amd-drm-fixes-6.12-2024-10-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.12-2024-10-08:
amdgpu:
- Fix invalid UBSAN warnings
- Fix artifacts in MPO transitions
- Hibernation fix
amdkfd:
- Fix an eviction fence leak
radeon:
- Add late register for connectors
- Always set GEM function pointers
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008142831.3739244-1-alexander.deucher@amd.com
Rosen Penev [Mon, 7 Oct 2024 23:57:11 +0000 (16:57 -0700)]
net: ibm: emac: mal: fix wrong goto
dcr_map is called in the previous if and therefore needs to be unmapped.
Fixes:
1ff0fcfcb1a6 ("ibm_newemac: Fix new MAL feature handling")
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Link: https://patch.msgid.link/20241007235711.5714-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matt Roper [Wed, 2 Oct 2024 23:06:21 +0000 (16:06 -0700)]
drm/xe: Make wedged_mode debugfs writable
The intent of this debugfs entry is to allow modification of wedging
behavior, either from IGT tests or during manual debug; it should be
marked as writable to properly reflect this. In practice this hasn't
caused a problem because we always access wedged_mode as root, which
ignores file permissions, but it's still misleading to have the entry
incorrectly marked as RO.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes:
6b8ef44cc0a9 ("drm/xe: Introduce the wedged_mode debugfs")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241002230620.1249258-2-matthew.d.roper@intel.com
(cherry picked from commit
93d93813422758f6c99289de446b19184019ef5a)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Vinay Belgaumkar [Wed, 25 Sep 2024 20:49:18 +0000 (13:49 -0700)]
drm/xe: Restore GT freq on GSC load error
As part of a Wa_22019338487, ensure that GT freq is restored
even when GSC reload is not successful.
Fixes:
3b1592fb7835 ("drm/xe/lnl: Apply Wa_22019338487")
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240925204918.1989574-1-vinay.belgaumkar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit
491418a258322bbd7f045e36884d2849b673f23d)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Matthew Auld [Tue, 1 Oct 2024 08:43:49 +0000 (09:43 +0100)]
drm/xe/guc_submit: fix xa_store() error checking
Looks like we are meant to use xa_err() to extract the error encoded in
the ptr.
Fixes:
dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241001084346.98516-7-matthew.auld@intel.com
(cherry picked from commit
f040327238b1a8311598c40ac94464e77fff368c)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Matthew Auld [Tue, 1 Oct 2024 08:43:48 +0000 (09:43 +0100)]
drm/xe/ct: fix xa_store() error checking
Looks like we are meant to use xa_err() to extract the error encoded in
the ptr.
Fixes:
dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241001084346.98516-6-matthew.auld@intel.com
(cherry picked from commit
1aa4b7864707886fa40d959483591f3d3937fa28)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Matthew Auld [Tue, 1 Oct 2024 08:43:47 +0000 (09:43 +0100)]
drm/xe/ct: prevent UAF in send_recv()
Ensure we serialize with completion side to prevent UAF with fence going
out of scope on the stack, since we have no clue if it will fire after
the timeout before we can erase from the xa. Also we have some dependent
loads and stores for which we need the correct ordering, and we lack the
needed barriers. Fix this by grabbing the ct->lock after the wait, which
is also held by the completion side.
v2 (Badal):
- Also print done after acquiring the lock and seeing timeout.
Fixes:
dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241001084346.98516-5-matthew.auld@intel.com
(cherry picked from commit
52789ce35c55ccd30c4b67b9cc5b2af55e0122ea)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Eric Dumazet [Mon, 7 Oct 2024 18:41:30 +0000 (18:41 +0000)]
net/sched: accept TCA_STAB only for root qdisc
Most qdiscs maintain their backlog using qdisc_pkt_len(skb)
on the assumption it is invariant between the enqueue()
and dequeue() handlers.
Unfortunately syzbot can crash a host rather easily using
a TBF + SFQ combination, with an STAB on SFQ [1]
We can't support TCA_STAB on arbitrary level, this would
require to maintain per-qdisc storage.
[1]
[ 88.796496] BUG: kernel NULL pointer dereference, address:
0000000000000000
[ 88.798611] #PF: supervisor read access in kernel mode
[ 88.799014] #PF: error_code(0x0000) - not-present page
[ 88.799506] PGD 0 P4D 0
[ 88.799829] Oops: Oops: 0000 [#1] SMP NOPTI
[ 88.800569] CPU: 14 UID: 0 PID: 2053 Comm:
b371744477 Not tainted 6.12.0-rc1-virtme #1117
[ 88.801107] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 88.801779] RIP: 0010:sfq_dequeue (net/sched/sch_sfq.c:272 net/sched/sch_sfq.c:499) sch_sfq
[ 88.802544] Code: 0f b7 50 12 48 8d 04 d5 00 00 00 00 48 89 d6 48 29 d0 48 8b 91 c0 01 00 00 48 c1 e0 03 48 01 c2 66 83 7a 1a 00 7e c0 48 8b 3a <4c> 8b 07 4c 89 02 49 89 50 08 48 c7 47 08 00 00 00 00 48 c7 07 00
All code
========
0: 0f b7 50 12 movzwl 0x12(%rax),%edx
4: 48 8d 04 d5 00 00 00 lea 0x0(,%rdx,8),%rax
b: 00
c: 48 89 d6 mov %rdx,%rsi
f: 48 29 d0 sub %rdx,%rax
12: 48 8b 91 c0 01 00 00 mov 0x1c0(%rcx),%rdx
19: 48 c1 e0 03 shl $0x3,%rax
1d: 48 01 c2 add %rax,%rdx
20: 66 83 7a 1a 00 cmpw $0x0,0x1a(%rdx)
25: 7e c0 jle 0xffffffffffffffe7
27: 48 8b 3a mov (%rdx),%rdi
2a:* 4c 8b 07 mov (%rdi),%r8 <-- trapping instruction
2d: 4c 89 02 mov %r8,(%rdx)
30: 49 89 50 08 mov %rdx,0x8(%r8)
34: 48 c7 47 08 00 00 00 movq $0x0,0x8(%rdi)
3b: 00
3c: 48 rex.W
3d: c7 .byte 0xc7
3e: 07 (bad)
...
Code starting with the faulting instruction
===========================================
0: 4c 8b 07 mov (%rdi),%r8
3: 4c 89 02 mov %r8,(%rdx)
6: 49 89 50 08 mov %rdx,0x8(%r8)
a: 48 c7 47 08 00 00 00 movq $0x0,0x8(%rdi)
11: 00
12: 48 rex.W
13: c7 .byte 0xc7
14: 07 (bad)
...
[ 88.803721] RSP: 0018:
ffff9a1f892b7d58 EFLAGS:
00000206
[ 88.804032] RAX:
0000000000000000 RBX:
ffff9a1f8420c800 RCX:
ffff9a1f8420c800
[ 88.804560] RDX:
ffff9a1f81bc1440 RSI:
0000000000000000 RDI:
0000000000000000
[ 88.805056] RBP:
ffffffffc04bb0e0 R08:
0000000000000001 R09:
00000000ff7f9a1f
[ 88.805473] R10:
000000000001001b R11:
0000000000009a1f R12:
0000000000000140
[ 88.806194] R13:
0000000000000001 R14:
ffff9a1f886df400 R15:
ffff9a1f886df4ac
[ 88.806734] FS:
00007f445601a740(0000) GS:
ffff9a2e7fd80000(0000) knlGS:
0000000000000000
[ 88.807225] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 88.807672] CR2:
0000000000000000 CR3:
000000050cc46000 CR4:
00000000000006f0
[ 88.808165] Call Trace:
[ 88.808459] <TASK>
[ 88.808710] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[ 88.809261] ? page_fault_oops (arch/x86/mm/fault.c:715)
[ 88.809561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
[ 88.809806] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
[ 88.810074] ? sfq_dequeue (net/sched/sch_sfq.c:272 net/sched/sch_sfq.c:499) sch_sfq
[ 88.810411] sfq_reset (net/sched/sch_sfq.c:525) sch_sfq
[ 88.810671] qdisc_reset (./include/linux/skbuff.h:2135 ./include/linux/skbuff.h:2441 ./include/linux/skbuff.h:3304 ./include/linux/skbuff.h:3310 net/sched/sch_generic.c:1036)
[ 88.810950] tbf_reset (./include/linux/timekeeping.h:169 net/sched/sch_tbf.c:334) sch_tbf
[ 88.811208] qdisc_reset (./include/linux/skbuff.h:2135 ./include/linux/skbuff.h:2441 ./include/linux/skbuff.h:3304 ./include/linux/skbuff.h:3310 net/sched/sch_generic.c:1036)
[ 88.811484] netif_set_real_num_tx_queues (./include/linux/spinlock.h:396 ./include/net/sch_generic.h:768 net/core/dev.c:2958)
[ 88.811870] __tun_detach (drivers/net/tun.c:590 drivers/net/tun.c:673)
[ 88.812271] tun_chr_close (drivers/net/tun.c:702 drivers/net/tun.c:3517)
[ 88.812505] __fput (fs/file_table.c:432 (discriminator 1))
[ 88.812735] task_work_run (kernel/task_work.c:230)
[ 88.813016] do_exit (kernel/exit.c:940)
[ 88.813372] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
[ 88.813639] ? handle_mm_fault (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:97 ./arch/x86/include/asm/irqflags.h:155 ./include/linux/memcontrol.h:1022 ./include/linux/memcontrol.h:1045 ./include/linux/memcontrol.h:1052 mm/memory.c:5928 mm/memory.c:6088)
[ 88.813867] do_group_exit (kernel/exit.c:1070)
[ 88.814138] __x64_sys_exit_group (kernel/exit.c:1099)
[ 88.814490] x64_sys_call (??:?)
[ 88.814791] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
[ 88.815012] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 88.815495] RIP: 0033:0x7f44560f1975
Fixes:
175f9c1bba9b ("net_sched: Add size table for qdiscs")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20241007184130.3960565-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vitaly Lifshits [Sun, 8 Sep 2024 06:49:17 +0000 (09:49 +0300)]
e1000e: change I219 (19) devices to ADP
Sporadic issues, such as PHY access loss, have been observed on I219 (19)
devices. It was found that these devices have hardware more closely
related to ADP than MTP and the issues were caused by taking MTP-specific
flows.
Change the MAC and board types of these devices from MTP to ADP to
correctly reflect the LAN hardware, and flows, of these devices.
Fixes:
db2d737d63c5 ("e1000e: Separate MTP board type from ADP")
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Mohamed Khalfella [Tue, 24 Sep 2024 21:06:01 +0000 (15:06 -0600)]
igb: Do not bring the device up after non-fatal error
Commit
004d25060c78 ("igb: Fix igb_down hung on surprise removal")
changed igb_io_error_detected() to ignore non-fatal pcie errors in order
to avoid hung task that can happen when igb_down() is called multiple
times. This caused an issue when processing transient non-fatal errors.
igb_io_resume(), which is called after igb_io_error_detected(), assumes
that device is brought down by igb_io_error_detected() if the interface
is up. This resulted in panic with stacktrace below.
[ T3256] igb 0000:09:00.0 haeth0: igb: haeth0 NIC Link is Down
[ T292] pcieport 0000:00:1c.5: AER: Uncorrected (Non-Fatal) error received: 0000:09:00.0
[ T292] igb 0000:09:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ T292] igb 0000:09:00.0: device [8086:1537] error status/mask=
00004000/
00000000
[ T292] igb 0000:09:00.0: [14] CmpltTO [ 200.105524,009][ T292] igb 0000:09:00.0: AER: TLP Header:
00000000 00000000 00000000 00000000
[ T292] pcieport 0000:00:1c.5: AER: broadcast error_detected message
[ T292] igb 0000:09:00.0: Non-correctable non-fatal error reported.
[ T292] pcieport 0000:00:1c.5: AER: broadcast mmio_enabled message
[ T292] pcieport 0000:00:1c.5: AER: broadcast resume message
[ T292] ------------[ cut here ]------------
[ T292] kernel BUG at net/core/dev.c:6539!
[ T292] invalid opcode: 0000 [#1] PREEMPT SMP
[ T292] RIP: 0010:napi_enable+0x37/0x40
[ T292] Call Trace:
[ T292] <TASK>
[ T292] ? die+0x33/0x90
[ T292] ? do_trap+0xdc/0x110
[ T292] ? napi_enable+0x37/0x40
[ T292] ? do_error_trap+0x70/0xb0
[ T292] ? napi_enable+0x37/0x40
[ T292] ? napi_enable+0x37/0x40
[ T292] ? exc_invalid_op+0x4e/0x70
[ T292] ? napi_enable+0x37/0x40
[ T292] ? asm_exc_invalid_op+0x16/0x20
[ T292] ? napi_enable+0x37/0x40
[ T292] igb_up+0x41/0x150
[ T292] igb_io_resume+0x25/0x70
[ T292] report_resume+0x54/0x70
[ T292] ? report_frozen_detected+0x20/0x20
[ T292] pci_walk_bus+0x6c/0x90
[ T292] ? aer_print_port_info+0xa0/0xa0
[ T292] pcie_do_recovery+0x22f/0x380
[ T292] aer_process_err_devices+0x110/0x160
[ T292] aer_isr+0x1c1/0x1e0
[ T292] ? disable_irq_nosync+0x10/0x10
[ T292] irq_thread_fn+0x1a/0x60
[ T292] irq_thread+0xe3/0x1a0
[ T292] ? irq_set_affinity_notifier+0x120/0x120
[ T292] ? irq_affinity_notify+0x100/0x100
[ T292] kthread+0xe2/0x110
[ T292] ? kthread_complete_and_exit+0x20/0x20
[ T292] ret_from_fork+0x2d/0x50
[ T292] ? kthread_complete_and_exit+0x20/0x20
[ T292] ret_from_fork_asm+0x11/0x20
[ T292] </TASK>
To fix this issue igb_io_resume() checks if the interface is running and
the device is not down this means igb_io_error_detected() did not bring
the device down and there is no need to bring it up.
Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Fixes:
004d25060c78 ("igb: Fix igb_down hung on surprise removal")
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Aleksandr Loktionov [Mon, 23 Sep 2024 09:12:19 +0000 (11:12 +0200)]
i40e: Fix macvlan leak by synchronizing access to mac_filter_hash
This patch addresses a macvlan leak issue in the i40e driver caused by
concurrent access to vsi->mac_filter_hash. The leak occurs when multiple
threads attempt to modify the mac_filter_hash simultaneously, leading to
inconsistent state and potential memory leaks.
To fix this, we now wrap the calls to i40e_del_mac_filter() and zeroing
vf->default_lan_addr.addr with spin_lock/unlock_bh(&vsi->mac_filter_hash_lock),
ensuring atomic operations and preventing concurrent access.
Additionally, we add lockdep_assert_held(&vsi->mac_filter_hash_lock) in
i40e_add_mac_filter() to help catch similar issues in the future.
Reproduction steps:
1. Spawn VFs and configure port vlan on them.
2. Trigger concurrent macvlan operations (e.g., adding and deleting
portvlan and/or mac filters).
3. Observe the potential memory leak and inconsistent state in the
mac_filter_hash.
This synchronization ensures the integrity of the mac_filter_hash and prevents
the described leak.
Fixes:
fed0d9f13266 ("i40e: Fix VF's MAC Address change on VM")
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Marcin Szycik [Fri, 27 Sep 2024 15:15:40 +0000 (17:15 +0200)]
ice: Fix increasing MSI-X on VF
Increasing MSI-X value on a VF leads to invalid memory operations. This
is caused by not reallocating some arrays.
Reproducer:
modprobe ice
echo 0 > /sys/bus/pci/devices/$PF_PCI/sriov_drivers_autoprobe
echo 1 > /sys/bus/pci/devices/$PF_PCI/sriov_numvfs
echo 17 > /sys/bus/pci/devices/$VF0_PCI/sriov_vf_msix_count
Default MSI-X is 16, so 17 and above triggers this issue.
KASAN reports:
BUG: KASAN: slab-out-of-bounds in ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
Read of size 8 at addr
ffff8888b937d180 by task bash/28433
(...)
Call Trace:
(...)
? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
kasan_report+0xed/0x120
? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice]
ice_vsi_cfg_def+0x3360/0x4770 [ice]
? mutex_unlock+0x83/0xd0
? __pfx_ice_vsi_cfg_def+0x10/0x10 [ice]
? __pfx_ice_remove_vsi_lkup_fltr+0x10/0x10 [ice]
ice_vsi_cfg+0x7f/0x3b0 [ice]
ice_vf_reconfig_vsi+0x114/0x210 [ice]
ice_sriov_set_msix_vec_count+0x3d0/0x960 [ice]
sriov_vf_msix_count_store+0x21c/0x300
(...)
Allocated by task 28201:
(...)
ice_vsi_cfg_def+0x1c8e/0x4770 [ice]
ice_vsi_cfg+0x7f/0x3b0 [ice]
ice_vsi_setup+0x179/0xa30 [ice]
ice_sriov_configure+0xcaa/0x1520 [ice]
sriov_numvfs_store+0x212/0x390
(...)
To fix it, use ice_vsi_rebuild() instead of ice_vf_reconfig_vsi(). This
causes the required arrays to be reallocated taking the new queue count
into account (ice_vsi_realloc_stat_arrays()). Set req_txq and req_rxq
before ice_vsi_rebuild(), so that realloc uses the newly set queue
count.
Additionally, ice_vsi_rebuild() does not remove VSI filters
(ice_fltr_remove_all()), so ice_vf_init_host_cfg() is no longer
necessary.
Reported-by: Jacob Keller <jacob.e.keller@intel.com>
Fixes:
2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Wojciech Drewek [Fri, 27 Sep 2024 12:38:01 +0000 (14:38 +0200)]
ice: Flush FDB entries before reset
Triggering the reset while in switchdev mode causes
errors[1]. Rules are already removed by this time
because switch content is flushed in case of the reset.
This means that rules were deleted from HW but SW
still thinks they exist so when we get
SWITCHDEV_FDB_DEL_TO_DEVICE notification we try to
delete not existing rule.
We can avoid these errors by clearing the rules
early in the reset flow before they are removed from HW.
Switchdev API will get notified that the rule was removed
so we won't get SWITCHDEV_FDB_DEL_TO_DEVICE notification.
Remove unnecessary ice_clear_sw_switch_recipes.
[1]
ice 0000:01:00.0: Failed to delete FDB forward rule, err: -2
ice 0000:01:00.0: Failed to delete FDB guard rule, err: -2
Fixes:
7c945a1a8e5f ("ice: Switchdev FDB events support")
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Marcin Szycik [Tue, 24 Sep 2024 10:04:24 +0000 (12:04 +0200)]
ice: Fix netif_is_ice() in Safe Mode
netif_is_ice() works by checking the pointer to netdev ops. However, it
only checks for the default ice_netdev_ops, not ice_netdev_safe_mode_ops,
so in Safe Mode it always returns false, which is unintuitive. While it
doesn't look like netif_is_ice() is currently being called anywhere in Safe
Mode, this could change and potentially lead to unexpected behaviour.
Fixes:
df006dd4b1dc ("ice: Add initial support framework for LAG")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Marcin Szycik [Tue, 24 Sep 2024 10:04:23 +0000 (12:04 +0200)]
ice: Fix entering Safe Mode
If DDP package is missing or corrupted, the driver should enter Safe Mode.
Instead, an error is returned and probe fails.
To fix this, don't exit init if ice_init_ddp_config() returns an error.
Repro:
* Remove or rename DDP package (/lib/firmware/intel/ice/ddp/ice.pkg)
* Load ice
Fixes:
cc5776fe1832 ("ice: Enable switching default Tx scheduler topology")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Linus Torvalds [Tue, 8 Oct 2024 19:54:04 +0000 (12:54 -0700)]
Merge tag 'sched_ext-for-6.12-rc2-fixes' of git://git./linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
- ops.enqueue() didn't have a way to tell whether select_task_rq_scx()
and thus ops.select() were skipped. Some schedulers were incorrectly
using SCX_ENQ_WAKEUP. Add SCX_ENQ_CPU_SELECTED and fix scx_qmap using
it.
- Remove a spurious WARN_ON_ONCE() in scx_cgroup_exit()
- Fix error information clobbering during load
- Add missing __weak markers to BPF helper declarations
- Doc update
* tag 'sched_ext-for-6.12-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: Documentation: Update instructions for running example schedulers
sched_ext, scx_qmap: Add and use SCX_ENQ_CPU_SELECTED
sched/core: Add ENQUEUE_RQ_SELECTED to indicate whether ->select_task_rq() was called
sched/core: Make select_task_rq() take the pointer to wake_flags instead of value
sched_ext: scx_cgroup_exit() may be called without successful scx_cgroup_init()
sched_ext: Improve error reporting during loading
sched_ext: Add __weak markers to BPF helper function decalarations
Zhang Rui [Mon, 30 Sep 2024 08:18:01 +0000 (16:18 +0800)]
thermal: intel: int340x: processor: Add MMIO RAPL PL4 support
Similar to the MSR RAPL interface, MMIO RAPL supports PL4 too, so add
MMIO RAPL PL4d support to the processor_thermal driver.
As a result, the powercap sysfs for MMIO RAPL will show a new "peak
power" constraint.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-7-rui.zhang@intel.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Zhang Rui [Mon, 30 Sep 2024 08:18:00 +0000 (16:18 +0800)]
thermal: intel: int340x: processor: Remove MMIO RAPL CPU hotplug support
CPU0/package0 is always online and the MMIO RAPL driver runs on single
package systems only, so there is no need to handle CPU hotplug in it.
Always register a RAPL package device for package 0 and remove the
unnecessary CPU hotplug support.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-6-rui.zhang@intel.com
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sumeet Pawnikar [Mon, 30 Sep 2024 08:17:59 +0000 (16:17 +0800)]
powercap: intel_rapl_msr: Add PL4 support for Arrowlake-U
Add PL4 support for ArrowLake-U platform.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-5-rui.zhang@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Zhang Rui [Mon, 30 Sep 2024 08:17:58 +0000 (16:17 +0800)]
powercap: intel_rapl_tpmi: Ignore minor version change
The hardware definition of every TPMI feature contains a major and minor
version. When there is a change in the MMIO offset or change in the
definition of a field, hardware will change major version. For addition
of new fields without modifying existing MMIO offsets or fields, only
the minor version is changed.
If the driver has not been updated to recognize a new hardware major
version, it cannot provide the RAPL interface to users due to possible
register layout incompatibilities. However, the driver does not need to
be updated every time the hardware minor version changes because in that
case it will just miss some new functionality exposed by the hardware.
The current implementation causes the driver to refuse to work for any
hardware version change which is unnecessarily restrictive.
If there is a minor version mismatch, log an information message and
continue, but if there is a major version mismatch, log a warning and
exit (as before).
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240930081801.28502-4-rui.zhang@intel.com
Fixes:
9eef7f9da928 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver")
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Devaansh-Kumar [Tue, 8 Oct 2024 14:26:20 +0000 (19:56 +0530)]
sched_ext: Documentation: Update instructions for running example schedulers
Since the artifact paths for tools changed, we need to update the documentation to reflect that path.
Signed-off-by: Devaansh-Kumar <devaanshk840@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Linus Torvalds [Tue, 8 Oct 2024 17:53:06 +0000 (10:53 -0700)]
Merge tag 'ntfs3_for_6.12' of https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 updates from Konstantin Komarov:
"New:
- implement fallocate for compressed files
- add support for the compression attribute
- optimize large writes to sparse files
Fixes:
- fix several potential deadlock scenarios
- fix various internal bugs detected by syzbot
- add checks before accessing NTFS structures during parsing
- correct the format of output messages
Refactoring:
- replace fsparam_flag_no with fsparam_flag in options parser
- remove unused functions and macros"
* tag 'ntfs3_for_6.12' of https://github.com/Paragon-Software-Group/linux-ntfs3: (25 commits)
fs/ntfs3: Format output messages like others fs in kernel
fs/ntfs3: Additional check in ntfs_file_release
fs/ntfs3: Fix general protection fault in run_is_mapped_full
fs/ntfs3: Sequential field availability check in mi_enum_attr()
fs/ntfs3: Additional check in ni_clear()
fs/ntfs3: Fix possible deadlock in mi_read
ntfs3: Change to non-blocking allocation in ntfs_d_hash
fs/ntfs3: Remove unused al_delete_le
fs/ntfs3: Rename ntfs3_setattr into ntfs_setattr
fs/ntfs3: Replace fsparam_flag_no -> fsparam_flag
fs/ntfs3: Add support for the compression attribute
fs/ntfs3: Implement fallocate for compressed files
fs/ntfs3: Make checks in run_unpack more clear
fs/ntfs3: Add rough attr alloc_size check
fs/ntfs3: Stale inode instead of bad
fs/ntfs3: Refactor enum_rstbl to suppress static checker
fs/ntfs3: Fix sparse warning in ni_fiemap
fs/ntfs3: Fix warning possible deadlock in ntfs_set_state
fs/ntfs3: Fix sparse warning for bigendian
fs/ntfs3: Separete common code for file_read/write iter/splice
...
Linus Torvalds [Tue, 8 Oct 2024 17:43:22 +0000 (10:43 -0700)]
Merge tag 'perf-tools-fixes-for-v6.12-1-2024-10-08' of git://git./linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix an assert() to handle captured and unprocessed ARM CoreSight CPU
traces
- Fix static build compilation error when libdw isn't installed or is
too old
- Add missing include when building with
!HAVE_DWARF_GETLOCATIONS_SUPPORT
- Add missing refcount put on 32-bit DSOs
- Fix disassembly of user space binaries by setting the binary_type of
DSO when loading
- Update headers with the kernel sources, including asound.h, sched.h,
fcntl, msr-index.h, irq_vectors.h, socket.h, list_sort.c and arm64's
cputype.h
* tag 'perf-tools-fixes-for-v6.12-1-2024-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf cs-etm: Fix the assert() to handle captured and unprocessed cpu trace
perf build: Fix build feature-dwarf_getlocations fail for old libdw
perf build: Fix static compilation error when libdw is not installed
perf dwarf-aux: Fix build with !HAVE_DWARF_GETLOCATIONS_SUPPORT
tools headers arm64: Sync arm64's cputype.h with the kernel sources
perf tools: Cope with differences for lib/list_sort.c copy from the kernel
tools check_headers.sh: Add check variant that excludes some hunks
perf beauty: Update copy of linux/socket.h with the kernel sources
tools headers UAPI: Sync the linux/in.h with the kernel sources
perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools include UAPI: Sync linux/fcntl.h copy with the kernel sources
tools include UAPI: Sync linux/sched.h copy with the kernel sources
tools include UAPI: Sync sound/asound.h copy with the kernel sources
perf vdso: Missed put on 32-bit dsos
perf symbol: Set binary_type of dso when loading