git.kernel.dk Git - linux-block.git/log

Merge tag 'drm-fixes-2025-03-21' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
"Just the usual spread of a bunch for amdgpu, and small changes to
  others.

  scheduler:
   - fix fence reference leak

  xe:
   - Fix for an error if exporting a dma-buf multiple time

  amdgpu:
   - Fix video caps limits on several asics
   - SMU 14.x fixes
   - GC 12 fixes
   - eDP fixes
   - DMUB fix

  amdkfd:
   - GC 12 trap handler fix
   - GC 7/8 queue validation fix

  radeon:
   - VCE IB parsing fix

  v3d:
   - fix job error handling bugs

  qaic:
   - fix two integer overflows

  host1x:
   - fix NULL domain handling"

* tag 'drm-fixes-2025-03-21' of https://gitlab.freedesktop.org/drm/kernel: (21 commits)
  drm/xe: Fix exporting xe buffers multiple times
  gpu: host1x: Do not assume that a NULL domain means no DMA IOMMU
  drm/amdgpu/pm: Handle SCLK offset correctly in overdrive for smu 14.0.2
  drm/amd/display: Fix incorrect fw_state address in dmub_srv
  drm/amd/display: Use HW lock mgr for PSR1 when only one eDP
  drm/amd/display: Fix message for support_edp0_on_dp1
  drm/amdkfd: Fix user queue validation on Gfx7/8
  drm/amdgpu: Restore uncached behaviour on GFX12
  drm/amdgpu/gfx12: correct cleanup of 'me' field with gfx_v12_0_me_fini()
  drm/amdkfd: Fix instruction hazard in gfx12 trap handler
  drm/amdgpu/pm: wire up hwmon fan speed for smu 14.0.2
  drm/amd/pm: add unique_id for gfx12
  drm/amdgpu: Remove JPEG from vega and carrizo video caps
  drm/amdgpu: Fix JPEG video caps max size for navi1x and raven
  drm/amdgpu: Fix MPEG2, MPEG4 and VC1 video caps max size
  drm/radeon: fix uninitialized size issue in radeon_vce_cs_parse()
  accel/qaic: Fix integer overflow in qaic_validate_req()
  accel/qaic: Fix possible data corruption in BOs > 2G
  drm/v3d: Set job pointer to NULL when the job's fence has an error
  drm/v3d: Don't run jobs that have errors flagged in its fence
  ...

Merge tag 'v6.14-rc7-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fix from Steve French:
"smb3 client reconnect fix"

* tag 'v6.14-rc7-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: don't retry IO on failed negprotos with soft mounts

Merge tag 'amd-drm-fixes-6.14-2025-03-20' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.14-2025-03-20:

amdgpu:
- Fix video caps limits on several asics
- SMU 14.x fixes
- GC 12 fixes
- eDP fixes
- DMUB fix

amdkfd:
- GC 12 trap handler fix
- GC 7/8 queue validation fix

radeon:
- VCE IB parsing fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250320210800.1358992-1-alexander.deucher@amd.com

Merge tag 'drm-xe-fixes-2025-03-20' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Fix for an error if exporting a dma-buf multiple time (Tomasz)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z9xalLaCWsNbh0P0@fedora

Merge tag 'drm-misc-fixes-2025-03-20' of ssh://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

A sched fence reference leak fix, two fence fixes for v3d, two overflow
fixes for quaic, and a iommu handling fix for host1x.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250320-valiant-outstanding-nightingale-e9acae@houat

Merge tag 'dma-mapping-6.14-2025-03-21' of git://git./linux/kernel/git/mszyprowski/linux

Pull dma-mapping fix from Marek Szyprowski:

- fix missing clear bdr in check_ram_in_range_map() (Baochen Qiang)

* tag 'dma-mapping-6.14-2025-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: fix missing clear bdr in check_ram_in_range_map()

Merge tag 'vfs-6.14-final.fixes' of git://git./linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
"A final set of fixes for this cycle:

  VFS:

   - Ensure that the stable offset api doesn't return duplicate
     directory entries when userspace has to perform the getdents call
     multiple times on large directories

  afs:

   - Prevent invalid pointer dereference during get_link RCU pathwalk

  fuse:

   - Fix deadlock caused by uninitialized rings when using io_uring with
     fuse

   - Handle race condition when using io_uring with fuse to prevent NULL
     dereference

  libnetfs:

   - Ensure that invalidate_cache is only called if implemented

   - Fix collection of results during pause when collection is
     offloaded

   - Ensure rolling_buffer_load_from_ra() doesn't clear mark bits

   - Make netfs_unbuffered_read() return ssize_t rather than int"

* tag 'vfs-6.14-final.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  libfs: Fix duplicate directory entry in offset_dir_lookup
  fuse: fix possible deadlock if rings are never initialized
  netfs: Fix netfs_unbuffered_read() to return ssize_t rather than int
  netfs: Fix rolling_buffer_load_from_ra() to not clear mark bits
  netfs: Call `invalidate_cache` only if implemented
  netfs: Fix collection of results during pause when collection offloaded
  fuse: fix uring race condition for null dereference of fc
  afs: Fix afs_atcell_get_link() to check if ws_cell is unset first

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fix from Paolo Bonzini:
"A lone fix for a s390 regression. An earlier 6.14 commit stopped
  taking the pte lock for pages that are being converted to secure, but
  it was needed to avoid races.

  The patch was in development for a while and is finally ready, but I
  wish it was split into 3-4 commits at least"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: pv: fix race when making a page secure

drm/xe: Fix exporting xe buffers multiple times

The `struct ttm_resource->placement` contains TTM_PL_FLAG_* flags, but
it was incorrectly tested for XE_PL_* flags.
This caused xe_dma_buf_pin() to always fail when invoked for
the second time. Fix this by checking the `mem_type` field instead.

Fixes: 7764222d54b7 ("drm/xe: Disallow pinning dma-bufs in VRAM")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: intel-xe@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250218100353.2137964-1-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
(cherry picked from commit b96dabdba9b95f71ded50a1c094ee244408b2a8e)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Merge tag 'net-6.14-rc8' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
"Including fixes from can, bluetooth and ipsec.

  This contains a last minute revert of a recent GRE patch, mostly to
  allow me stating there are no known regressions outstanding.

  Current release - regressions:

   - revert "gre: Fix IPv6 link-local address generation."

   - eth: ti: am65-cpsw: fix NAPI registration sequence

  Previous releases - regressions:

   - ipv6: fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().

   - mptcp: fix data stream corruption in the address announcement

   - bluetooth: fix connection regression between LE and non-LE adapters

   - can:
       - flexcan: only change CAN state when link up in system PM
       - ucan: fix out of bound read in strscpy() source

  Previous releases - always broken:

   - lwtunnel: fix reentry loops

   - ipv6: fix TCP GSO segmentation with NAT

   - xfrm: force software GSO only in tunnel mode

   - eth: ti: icssg-prueth: add lock to stats

  Misc:

   - add Andrea Mayer as a maintainer of SRv6"

* tag 'net-6.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
  MAINTAINERS: Add Andrea Mayer as a maintainer of SRv6
  Revert "gre: Fix IPv6 link-local address generation."
  Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."
  net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTES
  tools headers: Sync uapi/asm-generic/socket.h with the kernel sources
  mptcp: Fix data stream corruption in the address announcement
  selftests: net: test for lwtunnel dst ref loops
  net: ipv6: ioam6: fix lwtunnel_output() loop
  net: lwtunnel: fix recursion loops
  net: ti: icssg-prueth: Add lock to stats
  net: atm: fix use after free in lec_send()
  xsk: fix an integer overflow in xp_create_and_assign_umem()
  net: stmmac: dwc-qos-eth: use devm_kzalloc() for AXI data
  selftests: drv-net: use defer in the ping test
  phy: fix xa_alloc_cyclic() error handling
  dpll: fix xa_alloc_cyclic() error handling
  devlink: fix xa_alloc_cyclic() error handling
  ipv6: Set errno after ip_fib_metrics_init() in ip6_route_info_create().
  ipv6: Fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().
  net: ipv6: fix TCP GSO segmentation with NAT
  ...

Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
"Collected driver fixes from the last few weeks, I was surprised how
  significant many of them seemed to be.

   - Fix rdma-core test failures due to wrong startup ordering in rxe

   - Don't crash in bnxt_re if the FW supports more than 64k QPs

   - Fix wrong QP table indexing math in bnxt_re

   - Calculate the max SRQs for userspace properly in bnxt_re

   - Don't try to do math on errno for mlx5's rate calculation

   - Properly allow userspace to control the VLAN in the QP state during
     INIT->RTR for bnxt_re

   - 6 bug fixes for HNS:
      - Soft lockup when processing huge MRs, add a cond_resched()
      - Fix missed error unwind for doorbell allocation
      - Prevent bad send queue parameters from userspace
      - Wrong error unwind in qp creation
      - Missed xa_destroy during driver shutdown
      - Fix reporting to userspace of max_sge_rd, hns doesn't have a
        read/write difference"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/hns: Fix wrong value of max_sge_rd
  RDMA/hns: Fix missing xa_destroy()
  RDMA/hns: Fix a missing rollback in error path of hns_roce_create_qp_common()
  RDMA/hns: Fix invalid sq params not being blocked
  RDMA/hns: Fix unmatched condition in error path of alloc_user_qp_db()
  RDMA/hns: Fix soft lockup during bt pages loop
  RDMA/bnxt_re: Avoid clearing VLAN_ID mask in modify qp path
  RDMA/mlx5: Handle errors returned from mlx5r_ib_rate()
  RDMA/bnxt_re: Fix reporting maximum SRQs on P7 chips
  RDMA/bnxt_re: Add missing paranthesis in map_qp_id_to_tbl_indx
  RDMA/bnxt_re: Fix allocation of QP table
  RDMA/rxe: Fix the failure of ibv_query_device() and ibv_query_device_ex() tests

Merge tag 'mmc-v6.14-rc4' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC host fixes from Ulf Hansson:

- sdhci-brcmstb: Fix CQE suspend/resume support

- atmel-mci: Add a missing clk_disable_unprepare() in ->probe()

* tag 'mmc-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-brcmstb: add cqhci suspend/resume to PM ops
mmc: atmel-mci: Add missing clk_disable_unprepare()

Merge tag 'efi-fixes-for-v6.14-3' of git://git./linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:
"Here's a final batch of EFI fixes for v6.14.

  The efivarfs ones are fixes for changes that were made this cycle.
  James's fix is somewhat of a band-aid, but it was blessed by the VFS
  folks, who are working with James to come up with something better for
  the next cycle.

   - Avoid physical address 0x0 for random page allocations

   - Add correct lockdep annotation when traversing efivarfs on resume

   - Avoid NULL mount in kernel_file_open() when traversing efivarfs on
     resume"

* tag 'efi-fixes-for-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efivarfs: fix NULL dereference on resume
  efivarfs: use I_MUTEX_CHILD nested lock to traverse variables on resume
  efi/libstub: Avoid physical address 0x0 when doing random allocation

MAINTAINERS: Add Andrea Mayer as a maintainer of SRv6

Andrea has made significant contributions to SRv6 support in Linux.
Acknowledge the work and on-going interest in Srv6 support with a
maintainers entry for these files so hopefully he is included
on patches going forward.

Signed-off-by: David Ahern <dsahern@kernel.org>
Acked-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Link: https://patch.msgid.link/20250312092212.46299-1-dsahern@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'gre-revert-ipv6-link-local-address-fix'

Guillaume Nault says:

====================
gre: Revert IPv6 link-local address fix.

Following Paolo's suggestion, let's revert the IPv6 link-local address
generation fix for GRE devices. The patch introduced regressions in the
upstream CI, which are still under investigation.

Start by reverting the kselftest that depend on that fix (patch 1), then
revert the kernel code itself (patch 2).
====================

Link: https://patch.msgid.link/cover.1742418408.git.gnault@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Revert "gre: Fix IPv6 link-local address generation."

This reverts commit 183185a18ff96751db52a46ccf93fff3a1f42815.

This patch broke net/forwarding/ip6gre_custom_multipath_hash.sh in some
circumstances (https://lore.kernel.org/netdev/Z9RIyKZDNoka53EO@mini-arch/).
Let's revert it while the problem is being investigated.

Fixes: 183185a18ff9 ("gre: Fix IPv6 link-local address generation.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/8b1ce738eb15dd841aab9ef888640cab4f6ccfea.1742418408.git.gnault@redhat.com
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."

This reverts commit 6f50175ccad4278ed3a9394c00b797b75441bd6e.

Commit 183185a18ff9 ("gre: Fix IPv6 link-local address generation.") is
going to be reverted. So let's revert the corresponding kselftest
first.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/259a9e98f7f1be7ce02b53d0b4afb7c18a8ff747.1742418408.git.gnault@redhat.com
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'ipsec-2025-03-19' of git://git./linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2025-03-19

1) Fix tunnel mode TX datapath in packet offload mode
   by directly putting it to the xmit path.
   From Alexandre Cassen.

2) Force software GSO only in tunnel mode in favor
   of potential HW GSO. From Cosmin Ratiu.

ipsec-2025-03-19

* tag 'ipsec-2025-03-19' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm_output: Force software GSO only in tunnel mode
  xfrm: fix tunnel mode TX datapath in packet offload mode
====================

Link: https://patch.msgid.link/20250319065513.987135-1-steffen.klassert@secunet.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'batadv-net-pullrequest-20250318' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
Here is batman-adv bugfix:

- Ignore own maximum aggregation size during RX, Sven Eckelmann

* tag 'batadv-net-pullrequest-20250318' of git://git.open-mesh.org/linux-merge:
batman-adv: Ignore own maximum aggregation size during RX
====================

Link: https://patch.msgid.link/20250318150035.35356-1-sw@simonwunderlich.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTES

Previous commit 8b5c171bb3dc ("neigh: new unresolved queue limits")
introduces new netlink attribute NDTPA_QUEUE_LENBYTES to represent
approximative value for deprecated QUEUE_LEN. However, it forgot to add
the associated nla_policy in nl_ntbl_parm_policy array. Fix it with one
simple NLA_U32 type policy.

Fixes: 8b5c171bb3dc ("neigh: new unresolved queue limits")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Link: https://patch.msgid.link/20250315165113.37600-1-linma@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

tools headers: Sync uapi/asm-generic/socket.h with the kernel sources

This also fixes a wrong definitions for SCM_TS_OPT_ID & SO_RCVPRIORITY.

Accidentally found while working on another patchset.

Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jason Xing <kerneljasonxing@gmail.com>
Cc: Anna Emese Nyiri <annaemesenyiri@gmail.com>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Fixes: a89568e9be75 ("selftests: txtimestamp: add SCM_TS_OPT_ID test")
Fixes: e45469e594b2 ("sock: Introduce SO_RCVPRIORITY socket option")
Link: https://lore.kernel.org/netdev/20250314195257.34854-1-kuniyu@amazon.com/
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250314214155.16046-1-aleksandr.mikhalitsyn@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

mptcp: Fix data stream corruption in the address announcement

Because of the size restriction in the TCP options space, the MPTCP
ADD_ADDR option is exclusive and cannot be sent with other MPTCP ones.
For this reason, in the linked mptcp_out_options structure, group of
fields linked to different options are part of the same union.

There is a case where the mptcp_pm_add_addr_signal() function can modify
opts->addr, but not ended up sending an ADD_ADDR. Later on, back in
mptcp_established_options, other options will be sent, but with
unexpected data written in other fields due to the union, e.g. in
opts->ext_copy. This could lead to a data stream corruption in the next
packet.

Using an intermediate variable, prevents from corrupting previously
established DSS option. The assignment of the ADD_ADDR option
parameters is now done once we are sure this ADD_ADDR option can be set
in the packet, e.g. after having dropped other suboptions.

Fixes: 1bff1e43a30e ("mptcp: optimize out option generation")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Arthur Mongodin <amongodin@randorisec.fr>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
[ Matt: the commit message has been updated: long lines splits and some
clarifications. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250314-net-mptcp-fix-data-stream-corr-sockopt-v1-1-122dbb249db3@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

libfs: Fix duplicate directory entry in offset_dir_lookup

There is an issue in the kernel:

In tmpfs, when using the "ls" command to list the contents
of a directory with a large number of files, glibc performs
the getdents call in multiple rounds. If a concurrent unlink
occurs between these getdents calls, it may lead to duplicate
directory entries in the ls output. One possible reproduction
scenario is as follows:

Create 1026 files and execute ls and rm concurrently:

for i in {1..1026}; do
echo "This is file $i" > /tmp/dir/file$i
done

ls /tmp/dir rm /tmp/dir/file4
->getdents(file1026-file5)
->unlink(file4)

->getdents(file5,file3,file2,file1)

It is expected that the second getdents call to return file3
through file1, but instead it returns an extra file5.

The root cause of this problem is in the offset_dir_lookup
function. It uses mas_find to determine the starting position
for the current getdents call. Since mas_find locates the first
position that is greater than or equal to mas->index, when file4
is deleted, it ends up returning file5.

It can be fixed by replacing mas_find with mas_find_rev, which
finds the first position that is less than or equal to mas->index.

Fixes: b9b588f22a0c ("libfs: Use d_children list to iterate simple_offset directories")
Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com>
Link: https://lore.kernel.org/r/20250320034417.555810-1-sunyongjian@huaweicloud.com
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>

Merge branch 'net-fix-lwtunnel-reentry-loops'

Justin Iurman says:

====================
net: fix lwtunnel reentry loops

When the destination is the same after the transformation, we enter a
lwtunnel loop. This is true for most of lwt users: ioam6, rpl, seg6,
seg6_local, ila_lwt, and lwt_bpf. It can happen in their input() and
output() handlers respectively, where either dst_input() or dst_output()
is called at the end. It can also happen in xmit() handlers.

Here is an example for rpl_input():

dump_stack_lvl+0x60/0x80
rpl_input+0x9d/0x320
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
[...]
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
lwtunnel_input+0x64/0xa0
ip6_sublist_rcv_finish+0x85/0x90
ip6_sublist_rcv+0x236/0x2f0

... until rpl_do_srh() fails, which means skb_cow_head() failed.

This series provides a fix at the core level of lwtunnel to catch such
loops when they're not caught by the respective lwtunnel users, and
handle the loop case in ioam6 which is one of the users. This series
also comes with a new selftest to detect some dst cache reference loops
in lwtunnel users.
====================

Link: https://patch.msgid.link/20250314120048.12569-1-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

selftests: net: test for lwtunnel dst ref loops

As recently specified by commit 0ea09cbf8350 ("docs: netdev: add a note
on selftest posting") in net-next, the selftest is therefore shipped in
this series. However, this selftest does not really test this series. It
needs this series to avoid crashing the kernel. What it really tests,
thanks to kmemleak, is what was fixed by the following commits:
- commit c71a192976de ("net: ipv6: fix dst refleaks in rpl, seg6 and
ioam6 lwtunnels")
- commit 92191dd10730 ("net: ipv6: fix dst ref loops in rpl, seg6 and
ioam6 lwtunnels")
- commit c64a0727f9b1 ("net: ipv6: fix dst ref loop on input in seg6
lwt")
- commit 13e55fbaec17 ("net: ipv6: fix dst ref loop on input in rpl
lwt")
- commit 0e7633d7b95b ("net: ipv6: fix dst ref loop in ila lwtunnel")
- commit 5da15a9c11c1 ("net: ipv6: fix missing dst ref drop in ila
lwtunnel")

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20250314120048.12569-4-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: ipv6: ioam6: fix lwtunnel_output() loop

Fix the lwtunnel_output() reentry loop in ioam6_iptunnel when the
destination is the same after transformation. Note that a check on the
destination address was already performed, but it was not enough. This
is the example of a lwtunnel user taking care of loops without relying
only on the last resort detection offered by lwtunnel.

Fixes: 8cb3bf8bff3c ("ipv6: ioam: Add support for the ip6ip6 encapsulation")
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20250314120048.12569-3-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: lwtunnel: fix recursion loops

This patch acts as a parachute, catch all solution, by detecting
recursion loops in lwtunnel users and taking care of them (e.g., a loop
between routes, a loop within the same route, etc). In general, such
loops are the consequence of pathological configurations. Each lwtunnel
user is still free to catch such loops early and do whatever they want
with them. It will be the case in a separate patch for, e.g., seg6 and
seg6_local, in order to provide drop reasons and update statistics.
Another example of a lwtunnel user taking care of loops is ioam6, which
has valid use cases that include loops (e.g., inline mode), and which is
addressed by the next patch in this series. Overall, this patch acts as
a last resort to catch loops and drop packets, since we don't want to
leak something unintentionally because of a pathological configuration
in lwtunnels.

The solution in this patch reuses dev_xmit_recursion(),
dev_xmit_recursion_inc(), and dev_xmit_recursion_dec(), which seems fine
considering the context.

Closes: https://lore.kernel.org/netdev/2bc9e2079e864a9290561894d2a602d6@akamai.com/
Closes: https://lore.kernel.org/netdev/Z7NKYMY7fJT5cYWu@shredder/
Fixes: ffce41962ef6 ("lwtunnel: support dst output redirect function")
Fixes: 2536862311d2 ("lwt: Add support to redirect dst.input")
Fixes: 14972cbd34ff ("net: lwtunnel: Handle fragmentation")
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20250314120048.12569-2-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: ti: icssg-prueth: Add lock to stats

Currently the API emac_update_hardware_stats() reads different ICSSG
stats without any lock protection.

This API gets called by .ndo_get_stats64() which is only under RCU
protection and nothing else. Add lock to this API so that the reading of
statistics happens during lock.

Fixes: c1e10d5dc7a1 ("net: ti: icssg-prueth: Add ICSSG Stats")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250314102721.1394366-1-danishanwar@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: atm: fix use after free in lec_send()

The ->send() operation frees skb so save the length before calling
->send() to avoid a use after free.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/c751531d-4af4-42fe-affe-6104b34b791d@stanley.mountain
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

xsk: fix an integer overflow in xp_create_and_assign_umem()

Since the i and pool->chunk_size variables are of type 'u32',
their product can wrap around and then be cast to 'u64'.
This can lead to two different XDP buffers pointing to the same
memory area.

Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with SVACE.

Fixes: 94033cd8e73b ("xsk: Optimize for aligned case")
Cc: stable@vger.kernel.org
Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>
Link: https://patch.msgid.link/20250313085007.3116044-1-Ilia.Gavrilov@infotecs.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'for-net-2025-03-14' of git://git./linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

- hci_event: Fix connection regression between LE and non-LE adapters
- Fix error code in chan_alloc_skb_cb()

* tag 'for-net-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: hci_event: Fix connection regression between LE and non-LE adapters
Bluetooth: Fix error code in chan_alloc_skb_cb()
====================

Link: https://patch.msgid.link/20250314163847.110069-1-luiz.dentz@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'hwmon-fixes-for-v6.14-rc8/6.14' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

- Fix an entry in MAINTAINERS to avoid sending hwmon review requests to
   the i2c mailing list

- Fix an out-of-bounds access in nct6775 driver

* tag 'hwmon-fixes-for-v6.14-rc8/6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (nct6775-core) Fix out of bounds access for NCT679{8,9}
  MAINTAINERS: correct list and scope of LTC4286 HARDWARE MONITOR

net: stmmac: dwc-qos-eth: use devm_kzalloc() for AXI data

Everywhere else in the driver uses devm_kzalloc() when allocating the
AXI data, so there is no kfree() of this structure. However,
dwc-qos-eth uses kzalloc(), which leads to this memory being leaked.
Switch to use devm_kzalloc().

Fixes: d8256121a91a ("stmmac: adding new glue driver dwmac-dwc-qos-eth")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tsRyv-0064nU-O9@rmk-PC.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

gpu: host1x: Do not assume that a NULL domain means no DMA IOMMU

Previously with tegra-smmu, even with CONFIG_IOMMU_DMA, the default domain
could have been left as NULL. The NULL domain is specially recognized by
host1x_iommu_attach() as meaning it is not the DMA domain and
should be replaced with the special shared domain.

This happened prior to the below commit because tegra-smmu was using the
NULL domain to mean IDENTITY.

Now that the domain is properly labled the test in DRM doesn't see NULL.
Check for IDENTITY as well to enable the special domains.

This is the same issue and basic fix as seen in
commit fae6e669cdc5 ("drm/tegra: Do not assume that a NULL domain means no
DMA IOMMU").

Fixes: c8cc2655cc6c ("iommu/tegra-smmu: Implement an IDENTITY domain")
Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Closes: https://lore.kernel.org/all/c6a6f114-3acd-4d56-a13b-b88978e927dc@tecnico.ulisboa.pt/
Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/0-v1-10dcc8ce3869+3a7-host1x_identity_jgg@nvidia.com

selftests: drv-net: use defer in the ping test

Make sure the test cleans up after itself. The XDP off statements
at the end of the test may not be reached.

Fixes: 75cc19c8ff89 ("selftests: drv-net: add xdp cases for ping.py")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250312131040.660386-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'ata-6.14-final' of git://git./linux/kernel/git/libata/linux

Pull ata fix from Niklas Cassel:

- Fix a regression on ATI AHCI controllers, where certain Samsung
   drives fails to be detected on a warm boot when LPM is enabled.

   LPM on ATI AHCI works fine with other drives. Likewise, the
   Samsung drives works fine with LPM with other AHI controllers.

   Thus, just like the weirdo ATA_QUIRK_NO_NCQ_ON_ATI quirk, add a
   new ATA_QUIRK_NO_LPM_ON_ATI quirk to disable LPM only on ATI
   AHCI controllers.

* tag 'ata-6.14-final' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: libata-core: Add ATA_QUIRK_NO_LPM_ON_ATI for certain Samsung SSDs

Merge tag 'kvm-s390-master-6.14-1' of https://git./linux/kernel/git/kvms390/linux into HEAD

Holding the pte lock for the page that is being converted to secure is
needed to avoid races. A previous commit removed the locking, which
caused issues. Fix by locking the pte again.

fuse: fix possible deadlock if rings are never initialized

When mounting a user-space filesystem using io_uring, the initialization
of the rings is done separately in the server side. If for some reason
(e.g. a server bug) this step is not performed it will be impossible to
unmount the filesystem if there are already requests waiting.

This issue is easily reproduced with the libfuse passthrough_ll example,
if the queue depth is set to '0' and a request is queued before trying to
unmount the filesystem. When trying to force the unmount, fuse_abort_conn()
will try to wake up all tasks waiting in fc->blocked_waitq, but because the
rings were never initialized, fuse_uring_ready() will never return 'true'.

Fixes: 3393ff964e0f ("fuse: block request allocation until io-uring init is complete")
Signed-off-by: Luis Henriques <luis@igalia.com>
Link: https://lore.kernel.org/r/20250306111218.13734-1-luis@igalia.com
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>

Merge branch 'xa_alloc_cyclic-checks'

Michal Swiatkowski says:

====================
fix xa_alloc_cyclic() return checks

Pierre Riteau <pierre@stackhpc.com> found suspicious handling an error
from xa_alloc_cyclic() in scheduler code [1]. The same is done in few
other places.

v1 --> v2: [2]
* add fixes tags
* fix also the same usage in dpll and phy

[1] https://lore.kernel.org/netdev/20250213223610.320278-1-pierre@stackhpc.com/
[2] https://lore.kernel.org/netdev/20250214132453.4108-1-michal.swiatkowski@linux.intel.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

phy: fix xa_alloc_cyclic() error handling

xa_alloc_cyclic() can return 1, which isn't an error. To prevent
situation when the caller of this function will treat it as no error do
a check only for negative here.

Fixes: 384968786909 ("net: phy: Introduce ethernet link topology representation")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpll: fix xa_alloc_cyclic() error handling

In case of returning 1 from xa_alloc_cyclic() (wrapping) ERR_PTR(1) will
be returned, which will cause IS_ERR() to be false. Which can lead to
dereference not allocated pointer (pin).

Fix it by checking if err is lower than zero.

This wasn't found in real usecase, only noticed. Credit to Pierre.

Fixes: 97f265ef7f5b ("dpll: allocate pin ids in cycle")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

devlink: fix xa_alloc_cyclic() error handling

In case of returning 1 from xa_alloc_cyclic() (wrapping) ERR_PTR(1) will
be returned, which will cause IS_ERR() to be false. Which can lead to
dereference not allocated pointer (rel).

Fix it by checking if err is lower than zero.

This wasn't found in real usecase, only noticed. Credit to Pierre.

Fixes: c137743bce02 ("devlink: introduce object and nested devlink relationship infra")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge patch series "netfs: Miscellaneous fixes"

David Howells <dhowells@redhat.com> says:

Here are some miscellaneous fixes and changes for netfslib:

(1) Fix the collection of results during a pause in transmission.

(2) Call ->invalidate_cache() only if provided.

(3) Fix the rolling buffer to not hammer atomic bit clears when loading
     from readahead.

(4) Fix netfs_unbuffered_read() to return ssize_t.

* patches from https://lore.kernel.org/r/20250314164201.1993231-1-dhowells@redhat.com:
  netfs: Fix netfs_unbuffered_read() to return ssize_t rather than int
  netfs: Fix rolling_buffer_load_from_ra() to not clear mark bits
  netfs: Call `invalidate_cache` only if implemented
  netfs: Fix collection of results during pause when collection offloaded

Link: https://lore.kernel.org/r/20250314164201.1993231-1-dhowells@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>

netfs: Fix netfs_unbuffered_read() to return ssize_t rather than int

Fix netfs_unbuffered_read() to return an ssize_t rather than an int as
netfs_wait_for_read() returns ssize_t and this gets implicitly truncated.

Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20250314164201.1993231-5-dhowells@redhat.com
Acked-by: "Paulo Alcantara (Red Hat)" <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Viacheslav Dubeyko <slava@dubeyko.com>
cc: Alex Markuze <amarkuze@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: ceph-devel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

netfs: Fix rolling_buffer_load_from_ra() to not clear mark bits

rolling_buffer_load_from_ra() looms large in the perf report because it
loops around doing an atomic clear for each of the three mark bits per
folio. However, this is both inefficient (it would be better to build a
mask and atomically AND them out) and unnecessary as they shouldn't be set.

Fix this by removing the loop.

Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20250314164201.1993231-4-dhowells@redhat.com
Acked-by: "Paulo Alcantara (Red Hat)" <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: netfs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

netfs: Call `invalidate_cache` only if implemented

Many filesystems such as NFS and Ceph do not implement the
`invalidate_cache` method.  On those filesystems, if writing to the
cache (`NETFS_WRITE_TO_CACHE`) fails for some reason, the kernel
crashes like this:

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 0 P4D 0
Oops: Oops: 0010 [#1] SMP PTI
CPU: 9 UID: 0 PID: 3380 Comm: kworker/u193:11 Not tainted 6.13.3-cm4all1-hp #437
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/17/2018
Workqueue: events_unbound netfs_write_collection_worker
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffff9b86e2ca7dc0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 7fffffffffffffff
RDX: 0000000000000001 RSI: ffff89259d576a18 RDI: ffff89259d576900
RBP: ffff89259d5769b0 R08: ffff9b86e2ca7d28 R09: 0000000000000002
R10: ffff89258ceaca80 R11: 0000000000000001 R12: 0000000000000020
R13: ffff893d158b9338 R14: ffff89259d576900 R15: ffff89259d5769b0
FS:  0000000000000000(0000) GS:ffff893c9fa40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000054442e003 CR4: 00000000001706f0
Call Trace:
  <TASK>
  ? __die+0x1f/0x60
  ? page_fault_oops+0x15c/0x460
  ? try_to_wake_up+0x2d2/0x530
  ? exc_page_fault+0x5e/0x100
  ? asm_exc_page_fault+0x22/0x30
  netfs_write_collection_worker+0xe9f/0x12b0
  ? xs_poll_check_readable+0x3f/0x80
  ? xs_stream_data_receive_workfn+0x8d/0x110
  process_one_work+0x134/0x2d0
  worker_thread+0x299/0x3a0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xba/0xe0
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x30/0x50
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1a/0x30
  </TASK>
Modules linked in:
CR2: 0000000000000000

This patch adds the missing `NULL` check.

Fixes: 0e0f2dfe880f ("netfs: Dispatch write requests to process a writeback slice")
Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20250314164201.1993231-3-dhowells@redhat.com
Acked-by: "Paulo Alcantara (Red Hat)" <pc@manguebit.com>
cc: netfs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

netfs: Fix collection of results during pause when collection offloaded

A netfs read request can run in one of two modes: for synchronous reads
writes, the app thread does the collection of results and for asynchronous
reads, this is offloaded to a worker thread. This is controlled by the
NETFS_RREQ_OFFLOAD_COLLECTION flag.

Now, if a subrequest incurs an error, the NETFS_RREQ_PAUSE flag is set to
stop the issuing loop temporarily from issuing more subrequests until a
retry is successful or the request is abandoned.

When the issuing loop sees NETFS_RREQ_PAUSE, it jumps to
netfs_wait_for_pause() which will wait for the PAUSE flag to be cleared -
and whilst it is waiting, it will call out to the collector as more results
acrue... But this is the wrong thing to do if OFFLOAD_COLLECTION is set as
we can then end up with both the app thread and the work item collecting
results simultaneously.

This manifests itself occasionally when running the generic/323 xfstest
against multichannel cifs as an oops that's a bit random but frequently
involving io_submit() (the test does lots of simultaneous async DIO reads).

Fix this by only doing the collection in netfs_wait_for_pause() if the
NETFS_RREQ_OFFLOAD_COLLECTION is not set.

Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item")
Reported-by: Steve French <stfrench@microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20250314164201.1993231-2-dhowells@redhat.com
Acked-by: "Paulo Alcantara (Red Hat)" <pc@manguebit.com>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

fuse: fix uring race condition for null dereference of fc

There is a race condition leading to a kernel crash from a null
dereference when attemping to access fc->lock in
fuse_uring_create_queue(). fc may be NULL in the case where another
thread is creating the uring in fuse_uring_create() and has set
fc->ring but has not yet set ring->fc when fuse_uring_create_queue()
reads ring->fc. There is another race condition as well where in
fuse_uring_register(), ring->nr_queues may still be 0 and not yet set
to the new value when we compare qid against it.

This fix sets fc->ring only after ring->fc and ring->nr_queues have been
set, which guarantees now that ring->fc is a proper pointer when any
queues are created and ring->nr_queues reflects the right number of
queues if ring is not NULL. We must use smp_store_release() and
smp_load_acquire() semantics to ensure the ordering will remain correct
where fc->ring is assigned only after ring->fc and ring->nr_queues have
been assigned.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://lore.kernel.org/r/20250318003028.3330599-1-joannelkoong@gmail.com
Fixes: 24fe962c86f5 ("fuse: {io-uring} Handle SQEs - register commands")
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>

afs: Fix afs_atcell_get_link() to check if ws_cell is unset first

Fix afs_atcell_get_link() to check if the workstation cell is unset before
doing the RCU pathwalk bit where we dereference that.

Fixes: 823869e1e616 ("afs: Fix afs_atcell_get_link() to handle RCU pathwalk")
Reported-by: syzbot+76a6f18e3af82e84f264@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/2481796.1742296819@warthog.procyon.org.uk
Tested-by: syzbot+76a6f18e3af82e84f264@syzkaller.appspotmail.com
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

drm/amdgpu/pm: Handle SCLK offset correctly in overdrive for smu 14.0.2

Currently, it seems like the code was carried over from RDNA3 because
it assumes two possible values to set. RDNA4, instead of having:
0: min SCLK
1: max SCLK
only has
0: SCLK offset

This change makes it so it only reports current offset value instead of
showing possible min/max values and their indices. Moreover, it now only
accepts the offset as a value, without the indice index.

Additionally, the lower bound was printed as %u by mistake.

Old:
OD_SCLK_OFFSET:
0: -500Mhz
1: 1000Mhz
OD_MCLK:
0: 97Mhz
1: 1259MHz
OD_VDDGFX_OFFSET:
0mV
OD_RANGE:
SCLK_OFFSET:    -500Mhz       1000Mhz
MCLK:      97Mhz       1500Mhz
VDDGFX_OFFSET:    -200mv          0mv

New:
OD_SCLK_OFFSET:
0Mhz
OD_MCLK:
0: 97Mhz
1: 1259MHz
OD_VDDGFX_OFFSET:
0mV
OD_RANGE:
SCLK_OFFSET:    -500Mhz       1000Mhz
MCLK:      97Mhz       1500Mhz
VDDGFX_OFFSET:    -200mv          0mv

Setting this offset:
Old: "s 1 <offset>"
New: "s <offset>"

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4036
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Tomasz Pakuła <tomasz.pakula.oficjalny@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1cfeb60e6e8837b1de5eb4e17df7cf31f4442144)
Cc: stable@vger.kernel.org # 6.12.x

drm/amd/display: Fix incorrect fw_state address in dmub_srv

[WHY]
The fw_state in dmub_srv was assigned with wrong address.
The address was pointed to the firmware region.

[HOW]
Fix the firmware state by using DMUB_DEBUG_FW_STATE_OFFSET
in dmub_cmd.h.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Lo-an Chen <lo-an.chen@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f57b38ac85a01bf03020cc0a9761d63e5c0ce197)

drm/amd/display: Use HW lock mgr for PSR1 when only one eDP

[WHY]
DMUB locking is important to make sure that registers aren't accessed
while in PSR. Previously it was enabled but caused a deadlock in
situations with multiple eDP panels.

[HOW]
Detect if multiple eDP panels are in use to decide whether to use
lock. Refactor the function so that the first check is for PSR-SU
and then replay is in use to prevent having to look up number
of eDP panels for those configurations.

Fixes: f245b400a223 ("Revert "drm/amd/display: Use HW lock mgr for PSR1"")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3965
Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ed569e1279a3045d6b974226c814e071fa0193a6)
Cc: stable@vger.kernel.org

drm/amd/display: Fix message for support_edp0_on_dp1

[WHY]
The info message was wrong when support_edp0_on_dp1 is enabled

[HOW]
Use correct info message for support_edp0_on_dp1

Fixes: f6d17270d18a ("drm/amd/display: add a quirk to enable eDP0 on DP1")
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Yilin Chen <Yilin.Chen@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 79538e6365c99d7b1c3e560d1ea8d11ef8313465)
Cc: stable@vger.kernel.org

drm/amdkfd: Fix user queue validation on Gfx7/8

To workaround queue full h/w issue on Gfx7/8, when application create
AQL queue, the ring buffer bo allocate size is queue_size/2 and
map queue_size ring buffer to GPU in 2 pieces using 2 attachments, each
attachment map size is queue_size/2, with same ring_bo backing memory.

For Gfx7/8, user queue buffer validation should use queue_size/2 to
verify ring_bo allocation and mapping size.

Fixes: 68e599db7a54 ("drm/amdkfd: Validate user queue buffers")
Suggested-by: Tomáš Trnka <trnka@scm.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e7a477735f1771b9a9346a5fbd09d7ff0641723a)
Cc: stable@vger.kernel.org

drm/amdgpu: Restore uncached behaviour on GFX12

Always use MTYPE_UC if UNCACHED flag is specified.

This makes kernarg region uncached and it restores
usermode cache disable debug flag functionality.

Do not set MTYPE_UC for COHERENT flag, on GFX12 coherence is handled by
shader code.

Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit eb6cdfb807d038d9b9986b5c87188f28a4071eae)
Cc: stable@vger.kernel.org # 6.12.x

drm/amdgpu/gfx12: correct cleanup of 'me' field with gfx_v12_0_me_fini()

In gfx_v12_0_cp_gfx_load_me_microcode_rs64(), gfx_v12_0_pfp_fini() is
incorrectly used to free 'me' field of 'gfx', since gfx_v12_0_pfp_fini()
can only release 'pfp' field of 'gfx'. The release function of 'me' field
should be gfx_v12_0_me_fini().

Fixes: 52cb80c12e8a ("drm/amdgpu: Add gfx v12_0 ip block support (v6)")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ebdc52607a46cda08972888178c6aa9cd6965141)
Cc: stable@vger.kernel.org # 6.12.x

drm/amdkfd: Fix instruction hazard in gfx12 trap handler

VALU instructions with SGPR source need wait states to avoid hazard
with SALU using different SGPR.

v2: Eliminate some hazards to reduce code explosion

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7e0459d453b911435673edd7a86eadc600c63238)
Cc: stable@vger.kernel.org # 6.12.x

drm/amdgpu/pm: wire up hwmon fan speed for smu 14.0.2

Add callbacks for fan speed fetching.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4034
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 90df6db62fa78a8ab0b705ec38db99c7973b95d6)
Cc: stable@vger.kernel.org # 6.12.x

drm/amd/pm: add unique_id for gfx12

Expose unique_id for gfx12

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 16fbc18cb07470cd33fb5f37ad181b51583e6dc0)
Cc: stable@vger.kernel.org # 6.12.x

drm/amdgpu: Remove JPEG from vega and carrizo video caps

JPEG is only supported for VCN1+.

Signed-off-by: David Rosca <david.rosca@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0a6e7b06bdbead2e43d56a2274b7e0c9c86d536e)
Cc: stable@vger.kernel.org

drm/amdgpu: Fix JPEG video caps max size for navi1x and raven

8192x8192 is the maximum supported resolution.

Signed-off-by: David Rosca <david.rosca@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6e0d2fde3ae8fdb5b47e10389f23ed2cb4daec5d)
Cc: stable@vger.kernel.org

drm/amdgpu: Fix MPEG2, MPEG4 and VC1 video caps max size

1920x1088 is the maximum supported resolution.

Signed-off-by: David Rosca <david.rosca@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1a0807feb97082bff2b1342dbbe55a2a9a8bdb88)
Cc: stable@vger.kernel.org

drm/radeon: fix uninitialized size issue in radeon_vce_cs_parse()

On the off chance that command stream passed from userspace via
ioctl() call to radeon_vce_cs_parse() is weirdly crafted and
first command to execute is to encode (case 0x03000001), the function
in question will attempt to call radeon_vce_cs_reloc() with size
argument that has not been properly initialized. Specifically, 'size'
will point to 'tmp' variable before the latter had a chance to be
assigned any value.

Play it safe and init 'tmp' with 0, thus ensuring that
radeon_vce_cs_reloc() will catch an early error in cases like these.

Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.

Fixes: 2fc5703abda2 ("drm/radeon: check VCE relocation buffer range v3")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2d52de55f9ee7aaee0e09ac443f77855989c6b68)
Cc: stable@vger.kernel.org

Merge tag 'pmdomain-v6.14-rc4' of git://git./linux/kernel/git/ulfh/linux-pm

Pull pmdomain fix from Ulf Hansson:

- Fix amlogic T7 ISP secpower

* tag 'pmdomain-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: amlogic: fix T7 ISP secpower

ipv6: Set errno after ip_fib_metrics_init() in ip6_route_info_create().

While creating a new IPv6, we could get a weird -ENOMEM when
RTA_NH_ID is set and either of the conditions below is true:

  1) CONFIG_IPV6_SUBTREES is enabled and rtm_src_len is specified
  2) nexthop_get() fails

e.g.)

  # strace ip -6 route add fe80::dead:beef:dead:beef nhid 1 from ::
  recvmsg(3, {msg_iov=[{iov_base=[...[
    {error=-ENOMEM, msg=[... [...]]},
    [{nla_len=49, nla_type=NLMSGERR_ATTR_MSG}, "Nexthops can not be used with so"...]
  ]], iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 148

Let's set err explicitly after ip_fib_metrics_init() in
ip6_route_info_create().

Fixes: f88d8ea67fbd ("ipv6: Plumb support for nexthop object in a fib6_info")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250312013854.61125-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

ipv6: Fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().

fib_check_nh_v6_gw() expects that fib6_nh_init() cleans up everything
when it fails.

Commit 7dd73168e273 ("ipv6: Always allocate pcpu memory in a fib6_nh")
moved fib_nh_common_init() before alloc_percpu_gfp() within fib6_nh_init()
but forgot to add cleanup for fib6_nh->nh_common.nhc_pcpu_rth_output in
case it fails to allocate fib6_nh->rt6i_pcpu, resulting in memleak.

Let's call fib_nh_common_release() and clear nhc_pcpu_rth_output in the
error path.

Note that we can remove the fib6_nh_release() call in nh_create_ipv6()
later in net-next.git.

Fixes: 7dd73168e273 ("ipv6: Always allocate pcpu memory in a fib6_nh")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250312010333.56001-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'linux-can-fixes-for-6.14-20250314' of git://git./linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2025-03-14

this is a pull request of 6 patches for net/main.

The first patch is by Vincent Mailhol and fixes an out of bound read
in strscpy() in the ucan driver.

Oliver Hartkopp contributes a patch for the af_can statistics to use
atomic access in the hot path.

The next 2 patches are by Biju Das, target the rcar_canfd driver and
fix the page entries in the AFL list.

The 2 patches by Haibo Chen for the flexcan driver fix the suspend and
resume functions.

linux-can-fixes-for-6.14-20250314

* tag 'linux-can-fixes-for-6.14-20250314' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: flexcan: disable transceiver during system PM
  can: flexcan: only change CAN state when link up in system PM
  can: rcar_canfd: Fix page entries in the AFL list
  dt-bindings: can: renesas,rcar-canfd: Fix typo in pattern properties for R-Car V4M
  can: statistics: use atomic access in hot path
  can: ucan: fix out of bound read in strscpy() source
====================

Link: https://patch.msgid.link/20250314130909.2890541-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: ipv6: fix TCP GSO segmentation with NAT

When updating the source/destination address, the TCP/UDP checksum needs to
be updated as well.

Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://patch.msgid.link/20250311212530.91519-1-nbd@nbd.name
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: mana: Support holes in device list reply msg

According to GDMA protocol, holes (zeros) are allowed at the beginning
or middle of the gdma_list_devices_resp message. The existing code
cannot properly handle this, and may miss some devices in the list.

To fix, scan the entire list until the num_of_devs are found, or until
the end of the list.

Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Reviewed-by: Shradha Gupta <shradhagupta@microsoft.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/1741723974-1534-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: ethernet: ti: am65-cpsw: Fix NAPI registration sequence

Registering the interrupts for TX or RX DMA Channels prior to registering
their respective NAPI callbacks can result in a NULL pointer dereference.
This is seen in practice as a random occurrence since it depends on the
randomness associated with the generation of traffic by Linux and the
reception of traffic from the wire.

Fixes: 681eb2beb3ef ("net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path")
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Co-developed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Link: https://patch.msgid.link/20250311154259.102865-1-s-vadapalli@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

ata: libata-core: Add ATA_QUIRK_NO_LPM_ON_ATI for certain Samsung SSDs

Before commit 7627a0edef54 ("ata: ahci: Drop low power policy board type")
the ATI AHCI controllers specified board type 'board_ahci' rather than
board type 'board_ahci'. This means that LPM was historically not enabled
for the ATI AHCI controllers.

By looking at commit 7a8526a5cd51 ("libata: Add ATA_HORKAGE_NO_NCQ_ON_ATI
for Samsung 860 and 870 SSD."), it is clear that, for some unknown reason,
that Samsung SSDs do not play nice with ATI AHCI controllers. (When using
other AHCI controllers, NCQ can be enabled on these Samsung SSDs without
issues.)

In a similar way, from user reports, it is clear the ATI AHCI controllers
can enable LPM on e.g. Maxtor HDDs perfectly fine, but when enabling LPM
on certain Samsung SSDs, things break. (E.g. the SSDs will not get detected
by the ATI AHCI controller even after a COMRESET.)

Yet, when using LPM on these Samsung SSDs with other AHCI controllers, e.g.
Intel AHCI controllers, these Samsung drives appear to work perfectly fine.

Considering that the combination of ATI + Samsung, for some unknown reason,
does not seem to work well, disable LPM when detecting an ATI AHCI
controller with a problematic Samsung SSD.

Apply this new ATA_QUIRK_NO_LPM_ON_ATI quirk for all Samsung SSDs that have
already been reported to not play nice with ATI (ATA_QUIRK_NO_NCQ_ON_ATI).

Fixes: 7627a0edef54 ("ata: ahci: Drop low power policy board type")
Suggested-by: Hans de Goede <hdegoede@redhat.com>
Reported-by: Eric <eric.4.debian@grabatoulnz.fr>
Closes: https://lore.kernel.org/linux-ide/Z8SBZMBjvVXA7OAK@eldamar.lan/
Tested-by: Eric <eric.4.debian@grabatoulnz.fr>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250317170348.1748671-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>

efivarfs: fix NULL dereference on resume

LSMs often inspect the path.mnt of files in the security hooks, and this
causes a NULL deref in efivarfs_pm_notify() because the path is
constructed with a NULL path.mnt.

Fix by obtaining from vfs_kern_mount() instead, and being very careful
to ensure that deactivate_super() (potentially triggered by a racing
userspace umount) is not called directly from the notifier, because it
would deadlock when efivarfs_kill_sb() tried to unregister the notifier
chain.

[ Al notes:
Umm... That's probably safe, but not as a long-term solution -
it's too intimately dependent upon fs/super.c internals. The
reasons why you can't run into ->s_umount deadlock here are
non-trivial... ]

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Link: https://lore.kernel.org/r/e54e6a2f-1178-4980-b771-4d9bafc2aa47@tnxip.de
Link: https://lore.kernel.org/r/3e998bf87638a442cbc6864cdcd3d8d9e08ce3e3.camel@HansenPartnership.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Merge tag 'mm-hotfixes-stable-2025-03-17-20-09' of git://git./linux/kernel/git/akpm/mm

Pull misc hotfixes from Andrew Morton:
"15 hotfixes. 7 are cc:stable and the remainder address post-6.13
  issues or aren't considered necessary for -stable kernels.

  13 are for MM and the other two are for squashfs and procfs.

  All are singletons. Please see the individual changelogs for details"

* tag 'mm-hotfixes-stable-2025-03-17-20-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/page_alloc: fix memory accept before watermarks gets initialized
  mm: decline to manipulate the refcount on a slab page
  memcg: drain obj stock on cpu hotplug teardown
  mm/huge_memory: drop beyond-EOF folios with the right number of refs
  selftests/mm: run_vmtests.sh: fix half_ufd_size_MB calculation
  mm: fix error handling in __filemap_get_folio() with FGP_NOWAIT
  mm: memcontrol: fix swap counter leak from offline cgroup
  mm/vma: do not register private-anon mappings with khugepaged during mmap
  squashfs: fix invalid pointer dereference in squashfs_cache_delete
  mm/migrate: fix shmem xarray update during migration
  mm/hugetlb: fix surplus pages in dissolve_free_huge_page()
  mm/damon/core: initialize damos->walk_completed in damon_new_scheme()
  mm/damon: respect core layer filters' allowance decision on ops layer
  filemap: move prefaulting out of hot write path
  proc: fix UAF in proc_get_inode()

MAINTAINERS: Remove myself

Unfortunately I no longer have time to meaningfully take part in the
linux kernel development.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

smb: client: don't retry IO on failed negprotos with soft mounts

If @server->tcpStatus is set to CifsNeedReconnect after acquiring
@ses->session_mutex in smb2_reconnect() or cifs_reconnect_tcon(), it
means that a concurrent thread failed to negotiate, in which case the
server is no longer responding to any SMB requests, so there is no
point making the caller retry the IO by returning -EAGAIN.

Fix this by returning -EHOSTDOWN to the callers on soft mounts.

Cc: David Howells <dhowells@redhat.com>
Reported-by: Jay Shin <jaeshin@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

Merge tag 'soc-fixes-6.14-2' of git://git./linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
"The majority of these last fixes are for devicetree files.

  These address two important regressions for the Qualcomm SMMU and the
  Raspberry Pi 4 USB controller, as well as a larger number of patches
  fixing minor mistakes in board specific files for Rockchips, i.MX,
  starfive and broadcom.

  The non-DT changes are

   - A fix for an old boot regression on Renesas shmobile chips

   - Another boot time regression for for the Qualcomm PDR SoC driver,
     among a few other Qualcomm firmware driver fixes for efivars and
     tzmem

   - Minor Kconfig fixes for davinci and OMAP1

   - Minor code fixes for sparx5 reset controllers, OMAP memory
     controller, i.MX SCU, cpufreq and SoC drivers and a Hisilicon SoC
     driver

   - One more update to the Asahi maintainers, adding Neal Gompa as a
     reviewer"

* tag 'soc-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (35 commits)
  ARM: davinci: da850: fix selecting ARCH_DAVINCI_DA8XX
  soc: hisilicon: kunpeng_hccs: Fix incorrect string assembly
  memory: omap-gpmc: drop no compatible check
  reset: mchp: sparx5: Fix for lan966x
  ARM: shmobile: smp: Enforce shmobile_smp_* alignment
  MAINTAINERS: Add myself (Neal Gompa) as a reviewer for ARM Apple support
  MAINTAINERS: Add apple-spi driver & binding files
  arm64: dts: rockchip: slow down emmc freq for rock 5 itx
  ARM: dts: BCM5301X: Fix switch port labels of ASUS RT-AC3200
  ARM: dts: BCM5301X: Fix switch port labels of ASUS RT-AC5300
  ARM: dts: bcm2711: Don't mark timer regs unconfigured
  ARM: OMAP1: select CONFIG_GENERIC_IRQ_CHIP
  arm64: dts: rockchip: Add missing PCIe supplies to RockPro64 board dtsi
  arm64: dts: rockchip: Add avdd HDMI supplies to RockPro64 board dtsi
  arm64: dts: rockchip: Remove undocumented sdmmc property from lubancat-1
  arm64: dts: rockchip: fix pinmux of UART5 for PX30 Ringneck on Haikou
  arm64: dts: rockchip: fix pinmux of UART0 for PX30 Ringneck on Haikou
  arm64: dts: rockchip: fix u2phy1_host status for NanoPi R4S
  arm64: dts: bcm2712: PL011 UARTs are actually r1p5
  ARM: dts: bcm2711: PL011 UARTs are actually r1p5
  ...

Merge tag 'probes-fixes-v6.14-rc6' of git://git./linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

- Clean up tprobe correctly when module unload

   Tracepoint probes do not set TRACEPOINT_STUB on the 'tpoint' pointer
   when unloading a module, thus they show as a normal 'fprobe' instead
   of 'tprobe' and never come back

- Fix leakage of tprobe module refcount

   When a tprobe's target module is loaded, it gets the module's
   refcount in the module notifier but forgot to put it after
   registering the probe on it.

   Fix it by getting the refcount only when registering tprobe.

* tag 'probes-fixes-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: tprobe-events: Fix leakage of module refcount
  tracing: tprobe-events: Fix to clean up tprobe correctly when module unload

efivarfs: use I_MUTEX_CHILD nested lock to traverse variables on resume

syzbot warns about a potential deadlock, but this is a false positive
resulting from a missing lockdep annotation: iterate_dir() locks the
parent whereas the inode_lock() it warns about locks the child, which is
guaranteed to be a different lock.

So use inode_lock_nested() instead with the appropriate lock class.

Reported-by: syzbot+019072ad24ab1d948228@syzkaller.appspotmail.com
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

hwmon: (nct6775-core) Fix out of bounds access for NCT679{8,9}

pwm_num is set to 7 for these chips, but NCT6776_REG_PWM_MODE and
NCT6776_PWM_MODE_MASK only contain 6 values.

Fix this by adding another 0 to the end of each array.

Signed-off-by: Tasos Sahanidis <tasos@tasossah.com>
Link: https://lore.kernel.org/r/20250312030832.106475-1-tasos@tasossah.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

MAINTAINERS: correct list and scope of LTC4286 HARDWARE MONITOR

This entry has a wrong list, i2c instead of hwmon. Also, it states to
maintain Kconfig and Makefile which is not suitable for a single driver.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Fixes: 7707cf82e138 ("dt-bindings: hwmon: Add lltc ltc4286 driver bindings")
Link: https://lore.kernel.org/r/20250317091459.41462-2-wsa+renesas@sang-engineering.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

mmc: sdhci-brcmstb: add cqhci suspend/resume to PM ops

cqhci timeouts observed on brcmstb platforms during suspend:
  ...
  [  164.832853] mmc0: cqhci: timeout for tag 18
  ...

Adding cqhci_suspend()/resume() calls to disable cqe
in sdhci_brcmstb_suspend()/resume() respectively to fix
CQE timeouts seen on PM suspend.

Fixes: d46ba2d17f90 ("mmc: sdhci-brcmstb: Add support for Command Queuing (CQE)")
Cc: stable@vger.kernel.org
Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20250311165946.28190-1-kamal.dasu@broadcom.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

mm/page_alloc: fix memory accept before watermarks gets initialized

Watermarks are initialized during the postcore initcall. Until then, all
watermarks are set to zero. This causes cond_accept_memory() to
incorrectly skip memory acceptance because a watermark of 0 is always met.

This can lead to a premature OOM on boot.

To ensure progress, accept one MAX_ORDER page if the watermark is zero.

Link: https://lkml.kernel.org/r/20250310082855.2587122-1-kirill.shutemov@linux.intel.com
Fixes: dcdfdd40fa82 ("mm: Add support for unaccepted memory")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Reported-by: Farrah Chen <farrah.chen@intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Cc: Ashish Kalra <ashish.kalra@amd.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
Cc: Thomas Lendacky <thomas.lendacky@amd.com>
Cc: <stable@vger.kernel.org> [6.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: decline to manipulate the refcount on a slab page

Slab pages now have a refcount of 0, so nobody should be trying to
manipulate the refcount on them.  Doing so has little effect; the object
could be freed and reallocated to a different purpose, although the slab
itself would not be until the refcount was put making it behave rather
like TYPESAFE_BY_RCU.

Unfortunately, __iov_iter_get_pages_alloc() does take a refcount.  Fix
that to not change the refcount, and make put_page() silently not change
the refcount.  get_page() warns so that we can fix any other callers that
need to be changed.

Long-term, networking needs to stop taking a refcount on the pages that it
uses and rely on the caller to hold whatever references are necessary to
make the memory stable.  In the medium term, more page types are going to
hav a zero refcount, so we'll want to move get_page() and put_page() out
of line.

Link: https://lkml.kernel.org/r/20250310143544.1216127-1-willy@infradead.org
Fixes: 9aec2fb0fd5e (slab: allocate frozen pages)
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Hannes Reinecke <hare@suse.de>
Closes: https://lore.kernel.org/all/08c29e4b-2f71-4b6d-8046-27e407214d8c@suse.com/
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

memcg: drain obj stock on cpu hotplug teardown

Currently on cpu hotplug teardown, only memcg stock is drained but we
need to drain the obj stock as well otherwise we will miss the stats
accumulated on the target cpu as well as the nr_bytes cached. The stats
include MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B. In
addition we are leaking reference to struct obj_cgroup object.

Link: https://lkml.kernel.org/r/20250310230934.2913113-1-shakeel.butt@linux.dev
Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/huge_memory: drop beyond-EOF folios with the right number of refs

When an after-split folio is large and needs to be dropped due to EOF,
folio_put_refs(folio, folio_nr_pages(folio)) should be used to drop all
page cache refs. Otherwise, the folio will not be freed, causing memory
leak.

This leak would happen on a filesystem with blocksize > page_size and a
truncate is performed, where the blocksize makes folios split to >0 order
ones, causing truncated folios not being freed.

Link: https://lkml.kernel.org/r/20250310155727.472846-1-ziy@nvidia.com
Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reported-by: Hugh Dickins <hughd@google.com>
Closes: https://lore.kernel.org/all/fcbadb7f-dd3e-21df-f9a7-2853b53183c4@google.com/
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com>
Cc: Luis Chamberalin <mcgrof@kernel.org>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

selftests/mm: run_vmtests.sh: fix half_ufd_size_MB calculation

We noticed that uffd-stress test was always failing to run when invoked
for the hugetlb profiles on x86_64 systems with a processor count of 64 or
bigger:

  ...
  # ------------------------------------
  # running ./uffd-stress hugetlb 128 32
  # ------------------------------------
  # ERROR: invalid MiB (errno=9, @uffd-stress.c:459)
  ...
  # [FAIL]
  not ok 3 uffd-stress hugetlb 128 32 # exit=1
  ...

The problem boils down to how run_vmtests.sh (mis)calculates the size of
the region it feeds to uffd-stress.  The latter expects to see an amount
of MiB while the former is just giving out the number of free hugepages
halved down.  This measurement discrepancy ends up violating uffd-stress'
assertion on number of hugetlb pages allocated per CPU, causing it to bail
out with the error above.

This commit fixes that issue by adjusting run_vmtests.sh's
half_ufd_size_MB calculation so it properly renders the region size in
MiB, as expected, while maintaining all of its original constraints in
place.

Link: https://lkml.kernel.org/r/20250218192251.53243-1-aquini@redhat.com
Fixes: 2e47a445d7b3 ("selftests/mm: run_vmtests.sh: fix hugetlb mem size calculation")
Signed-off-by: Rafael Aquini <raquini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: fix error handling in __filemap_get_folio() with FGP_NOWAIT

original report:
https://lore.kernel.org/all/CAKhLTr1UL3ePTpYjXOx2AJfNk8Ku2EdcEfu+CH1sf3Asr=B-Dw@mail.gmail.com/T/

When doing buffered writes with FGP_NOWAIT, under memory pressure, the
system returned ENOMEM despite there being plenty of available memory, to
be reclaimed from page cache.  The user space used io_uring interface,
which in turn submits I/O with FGP_NOWAIT (the fast path).

retsnoop pointed to iomap_get_folio:

00:34:16.180612 -> 00:34:16.180651 TID/PID 253786/253721
(reactor-1/combined_tests):

                    entry_SYSCALL_64_after_hwframe+0x76
                    do_syscall_64+0x82
                    __do_sys_io_uring_enter+0x265
                    io_submit_sqes+0x209
                    io_issue_sqe+0x5b
                    io_write+0xdd
                    xfs_file_buffered_write+0x84
                    iomap_file_buffered_write+0x1a6
    32us [-ENOMEM]  iomap_write_begin+0x408
iter=&{.inode=0xffff8c67aa031138,.len=4096,.flags=33,.iomap={.addr=0xffffffffffffffff,.length=4096,.type=1,.flags=3,.bdev=0x…
pos=0 len=4096 foliop=0xffffb32c296b7b80
!    4us [-ENOMEM]  iomap_get_folio
iter=&{.inode=0xffff8c67aa031138,.len=4096,.flags=33,.iomap={.addr=0xffffffffffffffff,.length=4096,.type=1,.flags=3,.bdev=0x…
pos=0 len=4096

This is likely a regression caused by 66dabbb65d67 ("mm: return an ERR_PTR
from __filemap_get_folio"), which moved error handling from
io_map_get_folio() to __filemap_get_folio(), but broke FGP_NOWAIT
handling, so ENOMEM is being escaped to user space.  Had it correctly
returned -EAGAIN with NOWAIT, either io_uring or user space itself would
be able to retry the request.

It's not enough to patch io_uring since the iomap interface is the one
responsible for it, and pwritev2(RWF_NOWAIT) and AIO interfaces must
return the proper error too.

The patch was tested with scylladb test suite (its original reproducer),
and the tests all pass now when memory is pressured.

Link: https://lkml.kernel.org/r/20250224143700.23035-1-raphaelsc@scylladb.com
Fixes: 66dabbb65d67 ("mm: return an ERR_PTR from __filemap_get_folio")
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm: memcontrol: fix swap counter leak from offline cgroup

Commit 6769183166b3 removed the parameter of id from swap_cgroup_record()
and get the memcg id from mem_cgroup_id(folio_memcg(folio)).  However, the
caller of it may update a different memcg's counter instead of
folio_memcg(folio).

E.g.  in the caller of mem_cgroup_swapout(), @swap_memcg could be
different with @memcg and update the counter of @swap_memcg, but
swap_cgroup_record() records the wrong memcg's ID.  When it is uncharged
from __mem_cgroup_uncharge_swap(), the swap counter will leak since the
wrong recorded ID.

Fix it by bringing the parameter of id back.

Link: https://lkml.kernel.org/r/20250306023133.44838-1-songmuchun@bytedance.com
Fixes: 6769183166b3 ("mm/swap_cgroup: decouple swap cgroup recording and clearing")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Kairui Song <kasong@tencent.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/vma: do not register private-anon mappings with khugepaged during mmap

We already are registering private-anon VMAs with khugepaged during fault
time, in do_huge_pmd_anonymous_page(). Commit "register suitable readonly
file vmas for khugepaged" moved the khugepaged registration logic from
shmem_mmap to the generic mmap path.

The userspace-visible effect should be this: khugepaged will unnecessarily
scan mm's which haven't yet faulted in. Note that it won't actually
collapse because all PTEs are none.

Now that I think about it, the mm is going to have a file VMA anyways
during fork+exec, so the mm already gets registered during mmap due to the
non-anon case (I *think*), so at least one of either the mmap registration
or fault-time registration is redundant.

Make this logic specific for non-anon mappings.

Link: https://lkml.kernel.org/r/20250306063037.16299-1-dev.jain@arm.com
Fixes: 613bec092fe7 ("mm: mmap: register suitable readonly file vmas for khugepaged")
Signed-off-by: Dev Jain <dev.jain@arm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

squashfs: fix invalid pointer dereference in squashfs_cache_delete

When mounting a squashfs fails, squashfs_cache_init() may return an error
pointer (e.g., -ENOMEM) instead of NULL. However, squashfs_cache_delete()
only checks for a NULL cache, and attempts to dereference the invalid
pointer. This leads to a kernel crash (BUG: unable to handle kernel
paging request in squashfs_cache_delete).

This patch fixes the issue by checking IS_ERR(cache) before accessing it.

Link: https://lkml.kernel.org/r/20250306132855.2030-1-zhiyuzhang999@gmail.com
Fixes: 49ff29240ebb ("squashfs: make squashfs_cache_init() return ERR_PTR(-ENOMEM)")
Signed-off-by: Zhiyu Zhang <zhiyuzhang999@gmail.com>
Reported-by: Zhiyu Zhang <zhiyuzhang999@gmail.com>
Closes: https://lore.kernel.org/linux-fsdevel/CALf2hKvaq8B4u5yfrE+BYt7aNguao99mfWxHngA+=o5hwzjdOg@mail.gmail.com/
Tested-by: Zhiyu Zhang <zhiyuzhang999@gmail.com>
Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/migrate: fix shmem xarray update during migration

A shmem folio can be either in page cache or in swap cache, but not at the
same time.  Namely, once it is in swap cache, folio->mapping should be
NULL, and the folio is no longer in a shmem mapping.

In __folio_migrate_mapping(), to determine the number of xarray entries to
update, folio_test_swapbacked() is used, but that conflates shmem in page
cache case and shmem in swap cache case.  It leads to xarray multi-index
entry corruption, since it turns a sibling entry to a normal entry during
xas_store() (see [1] for a userspace reproduction).  Fix it by only using
folio_test_swapcache() to determine whether xarray is storing swap cache
entries or not to choose the right number of xarray entries to update.

[1] https://lore.kernel.org/linux-mm/Z8idPCkaJW1IChjT@casper.infradead.org/

Note:
In __split_huge_page(), folio_test_anon() && folio_test_swapcache() is
used to get swap_cache address space, but that ignores the shmem folio in
swap cache case.  It could lead to NULL pointer dereferencing when a
in-swap-cache shmem folio is split at __xa_store(), since
!folio_test_anon() is true and folio->mapping is NULL.  But fortunately,
its caller split_huge_page_to_list_to_order() bails out early with EBUSY
when folio->mapping is NULL.  So no need to take care of it here.

Link: https://lkml.kernel.org/r/20250305200403.2822855-1-ziy@nvidia.com
Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reported-by: Liu Shixin <liushixin2@huawei.com>
Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
Suggested-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Charan Teja Kalla <quic_charante@quicinc.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/hugetlb: fix surplus pages in dissolve_free_huge_page()

In dissolve_free_huge_page(), free huge pages are dissolved without
adjusting surplus count. However, free huge pages may be accounted as
surplus pages, and will lead to wrong surplus count.

I reproduce this issue on qemu. The steps are:
1) Node1 is memory-less at first. Hot-add memory to node1 by executing
the two commands in qemu monitor:
  object_add memory-backend-ram,id=mem1,size=1G
  device_add pc-dimm,id=dimm1,memdev=mem1,node=1
2) online one memory block of Node1 with：
  echo online_movable > /sys/devices/system/node/node1/memoryX/state
3) create 64 huge pages for node1
4) run a program to reserve (don't consume) all the huge pages
5) echo 0 > nr_huge_pages for node1. After this step, free huge pages in
Node1 are surplus.
6) create 80 huge pages for node0
7) offline memory of node1, The memory range to offline contains the free
surplus huge pages created in step3) ~ step5)
  echo offline > /sys/devices/system/node/node1/memoryX/state
8) kill the program in step 4)

The result:
           Node0     Node1
total       80        0
free        80        0
surplus     0         61

To fix it, adjust surplus when destroying huge pages if the node has
surplus pages in dissolve_free_hugetlb_folio().

The result with this patch:
           Node0     Node1
total       80        0
free        80        0
surplus     0         0

Link: https://lkml.kernel.org/r/20250304132106.2872754-1-tujinjiang@huawei.com
Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Jinjiang Tu <tujinjiang@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/damon/core: initialize damos->walk_completed in damon_new_scheme()

The function for allocating and initialize a 'struct damos' object,
damon_new_scheme(), is not initializing damos->walk_completed field.  Only
damos_walk_complete() is setting the field.  Hence the field will be
eventually set and used correctly from second damos_walk() call for the
scheme.  But the first damos_walk() could mistakenly not walk on the
regions.  Actually, a common usage of DAMOS for taking an access pattern
snapshot is installing a monitoring-purpose DAMOS scheme, doing
damos_walk() to retrieve the snapshot, and then removing the scheme.
DAMON user-space tool (damo) also gets runtime snapshot in the way.  Hence
the problem can continuously happen in such use cases.  Initialize it
properly in the allocation function.

Link: https://lkml.kernel.org/r/20250228174450.41472-1-sj@kernel.org
Fixes: bf0eaba0ff9c ("mm/damon/core: implement damos_walk()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

mm/damon: respect core layer filters' allowance decision on ops layer

Filtering decisions are made in filters evaluation order.  Once a decision
is made by a filter, filters that scheduled to be evaluated after the
decision-made filter should just respect it.  This is the intended and
documented behavior.  Since core layer-handled filters are evaluated
before operations layer-handled filters, decisions made on core layer
should respected by ops layer.

In case of reject filters, the decision is respected, since core
layer-rejected regions are not passed to ops layer.  But in case of allow
filters, ops layer filters don't know if the region has passed to them
because it was allowed by core filters or just because it didn't match to
any core layer.  The current wrong implementation assumes it was due to
not matched by any core filters.  As a reuslt, the decision is not
respected.  Pass the missing information to ops layer using a new filed in
'struct damos', and make the ops layer filters respect it.

Link: https://lkml.kernel.org/r/20250228175336.42781-1-sj@kernel.org
Fixes: 491fee286e56 ("mm/damon/core: support damos_filter->allow")
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

filemap: move prefaulting out of hot write path

There is a generic anti-pattern that shows up in the VFS and several
filesystems where the hot write paths touch userspace twice when they
could get away with doing it once.

Dave Chinner suggested that they should all be fixed up[1]. I agree[2].
But, the series to do that fixup spans a bunch of filesystems and a lot of
people. This patch fixes common code that absolutely everyone uses. It
has measurable performance benefits[3].

I think this patch can go in and not be held up by the others.

I will post them separately to their separate maintainers for
consideration. But, honestly, I'm not going to lose any sleep if
the maintainers don't pick those up.

1. https://lore.kernel.org/all/Z5f-x278Z3wTIugL@dread.disaster.area/
2. https://lore.kernel.org/all/20250129181749.C229F6F3@davehans-spike.ostc.intel.com/
3. https://lore.kernel.org/all/202502121529.d62a409e-lkp@intel.com/

This patch:

There is a bit of a sordid history here. I originally wrote
998ef75ddb57 ("fs: do not prefault sys_write() user buffer pages")
to fix a performance issue that showed up on early SMAP hardware.
But that was reverted with 00a3d660cbac because it exposed an
underlying filesystem bug.

This is a reimplementation of the original commit along with some
simplification and comment improvements.

The basic problem is that the generic write path has two userspace
accesses: one to prefault the write source buffer and then another to
perform the actual write. On x86, this means an extra STAC/CLAC pair.
These are relatively expensive instructions because they function as
barriers.

Keep the prefaulting behavior but move it into the slow path that gets
run when the write did not make any progress. This avoids livelocks
that can happen when the write's source and destination target the
same folio. Contrary to the existing comments, the fault-in does not
prevent deadlocks. That's accomplished by using an "atomic" usercopy
that disables page faults.

The end result is that the generic write fast path now touches
userspace once instead of twice.

0day has shown some improvements on a couple of microbenchmarks:

https://lore.kernel.org/all/202502121529.d62a409e-lkp@intel.com/

Link: https://lkml.kernel.org/r/20250228203722.CAEB63AC@davehans-spike.ostc.intel.com
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/yxyuijjfd6yknryji2q64j3keq2ygw6ca6fs5jwyolklzvo45s@4u63qqqyosy2/
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

proc: fix UAF in proc_get_inode()

Fix race between rmmod and /proc/XXX's inode instantiation.

The bug is that pde->proc_ops don't belong to /proc, it belongs to a
module, therefore dereferencing it after /proc entry has been registered
is a bug unless use_pde/unuse_pde() pair has been used.

use_pde/unuse_pde can be avoided (2 atomic ops!) because pde->proc_ops
never changes so information necessary for inode instantiation can be
saved _before_ proc_register() in PDE itself and used later, avoiding
pde->proc_ops->...  dereference.

      rmmod                         lookup
sys_delete_module
                         proc_lookup_de
   pde_get(de);
   proc_get_inode(dir->i_sb, de);
  mod->exit()
    proc_remove
      remove_proc_subtree
       proc_entry_rundown(de);
  free_module(mod);

                               if (S_ISREG(inode->i_mode))
                         if (de->proc_ops->proc_read_iter)
                           --> As module is already freed, will trigger UAF

BUG: unable to handle page fault for address: fffffbfff80a702b
PGD 817fc4067 P4D 817fc4067 PUD 817fc0067 PMD 102ef4067 PTE 0
Oops: Oops: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 26 UID: 0 PID: 2667 Comm: ls Tainted: G
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
RIP: 0010:proc_get_inode+0x302/0x6e0
RSP: 0018:ffff88811c837998 EFLAGS: 00010a06
RAX: dffffc0000000000 RBX: ffffffffc0538140 RCX: 0000000000000007
RDX: 1ffffffff80a702b RSI: 0000000000000001 RDI: ffffffffc0538158
RBP: ffff8881299a6000 R08: 0000000067bbe1e5 R09: 1ffff11023906f20
R10: ffffffffb560ca07 R11: ffffffffb2b43a58 R12: ffff888105bb78f0
R13: ffff888100518048 R14: ffff8881299a6004 R15: 0000000000000001
FS:  00007f95b9686840(0000) GS:ffff8883af100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff80a702b CR3: 0000000117dd2000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
proc_lookup_de+0x11f/0x2e0
__lookup_slow+0x188/0x350
walk_component+0x2ab/0x4f0
path_lookupat+0x120/0x660
filename_lookup+0x1ce/0x560
vfs_statx+0xac/0x150
__do_sys_newstat+0x96/0x110
do_syscall_64+0x5f/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e

[adobriyan@gmail.com: don't do 2 atomic ops on the common path]
Link: https://lkml.kernel.org/r/3d25ded0-1739-447e-812b-e34da7990dcf@p183
Fixes: 778f3dd5a13c ("Fix procfs compat_ioctl regression")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David S. Miller <davem@davemloft.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Linux 6.14-rc7

Merge tag 'media/v6.14-3' of git://git./linux/kernel/git/mchehab/linux-media

Pull media fix from Mauro Carvalho Chehab:
"rtl2832 driver regression fix"

* tag 'media/v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: rtl2832_sdr: assign vb2 lock before vb2_queue_init

Merge tag 'i2c-for-6.14-rc7' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

- omap: fix irq ACKS to avoid irq storming and system hang

- ali1535, ali15x3, sis630: fix error path at probe exit

* tag 'i2c-for-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: sis630: Fix an error handling path in sis630_probe()
  i2c: ali15x3: Fix an error handling path in ali15x3_probe()
  i2c: ali1535: Fix an error handling path in ali1535_probe()
  i2c: omap: fix IRQ storms

Merge tag 'trace-v6.14-rc5' of git://git./linux/kernel/git/trace/linux-trace

Pull tracing fix from Steven Rostedt:
"Fix ref count of trace_array in error path of histogram file open

  Tracing instances have a ref count to keep them around while files
  within their directories are open. This prevents them from being
  deleted while they are used.

  The histogram code had some files that needed to take the ref count
  and that was added, but the error paths did not decrement the ref
  counts. This caused the instances from ever being removed if a
  histogram file failed to open due to some error"

* tag 'trace-v6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Correct the refcount if the hist/hist_debug file fails to open