Jens Axboe [Thu, 5 Jun 2025 17:39:17 +0000 (11:39 -0600)]
io_uring/uring_cmd: implement ->sqe_copy() to avoid unnecessary copies
uring_cmd currently copies the full SQE at prep time, just in case it
needs it to be stable. However, for inline completions or requests that
get queued up on the device side, there's no need to ever copy the SQE.
This is particularly important, as various use cases of uring_cmd will
be using 128b sized SQEs.
Opt in to using ->sqe_copy() to let the core of io_uring decide when to
copy SQEs. This callback will only be called if it is safe to do so.
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 3 Jun 2025 20:00:27 +0000 (14:00 -0600)]
io_uring/uring_cmd: get rid of io_uring_cmd_prep_setup()
It's a pretty pointless helper, just allocates and copies data. Fold it
into io_uring_cmd_prep().
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 5 Jun 2025 17:33:52 +0000 (11:33 -0600)]
io_uring: add struct io_cold_def->sqe_copy() method
Will be called by the core of io_uring, if inline issue is not going
to be tried for a request. Opcodes can define this handler to defer
copying of SQE data that should remain stable.
Only called if IO_URING_F_INLINE is set. If it isn't set, then there's a
bug in the core handling of this, and -EFAULT will be returned instead
to terminate the request. This will trigger a WARN_ON_ONCE(). Don't
expect this to ever trigger, and down the line this can be removed.
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 5 Jun 2025 17:48:33 +0000 (11:48 -0600)]
io_uring: add IO_URING_F_INLINE issue flag
Set when the execution of the request is done inline from the system
call itself. Any deferred issue will never have this flag set.
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Sun, 8 Jun 2025 20:44:43 +0000 (13:44 -0700)]
Linux 6.16-rc1
Linus Torvalds [Sun, 8 Jun 2025 18:44:41 +0000 (11:44 -0700)]
Merge tag 'turbostat-2025.06.08' of git://git./linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:
- Add initial DMR support, which required smarter RAPL probe
- Fix AMD MSR RAPL energy reporting
- Add RAPL power limit configuration output
- Minor fixes
* tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: version 2025.06.08
tools/power turbostat: Add initial support for BartlettLake
tools/power turbostat: Add initial support for DMR
tools/power turbostat: Dump RAPL sysfs info
tools/power turbostat: Avoid probing the same perf counters
tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
tools/power turbostat: Clean up add perf/msr counter logic
tools/power turbostat: Introduce add_msr_counter()
tools/power turbostat: Remove add_msr_perf_counter_()
tools/power turbostat: Remove add_cstate_perf_counter_()
tools/power turbostat: Remove add_rapl_perf_counter_()
tools/power turbostat: Quit early for unsupported RAPL counters
tools/power turbostat: Always check rapl_joules flag
tools/power turbostat: Fix AMD package-energy reporting
tools/power turbostat: Fix RAPL_GFX_ALL typo
tools/power turbostat: Add Android support for MSR device handling
tools/power turbostat.8: pm_domain wording fix
tools/power turbostat.8: fix typo: idle_pct should be pct_idle
Linus Torvalds [Sun, 8 Jun 2025 18:33:00 +0000 (11:33 -0700)]
Merge tag 'timers-cleanups-2025-06-08' of git://git./linux/kernel/git/tip/tip
Pull timer cleanup from Thomas Gleixner:
"The delayed from_timer() API cleanup:
The renaming to the timer_*() namespace was delayed due massive
conflicts against Linux-next. Now that everything is upstream finish
the conversion"
* tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
treewide, timers: Rename from_timer() to timer_container_of()
Linus Torvalds [Sun, 8 Jun 2025 18:27:20 +0000 (11:27 -0700)]
Merge tag 'x86-urgent-2025-06-08' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A small set of x86 fixes:
- Cure IO bitmap inconsistencies
A failed fork cleans up all resources of the newly created thread
via exit_thread(). exit_thread() invokes io_bitmap_exit() which
does the IO bitmap cleanups, which unfortunately assume that the
cleanup is related to the current task, which is obviously bogus.
Make it work correctly
- A lockdep fix in the resctrl code removed the clearing of the
command buffer in two places, which keeps stale error messages
around. Bring them back.
- Remove unused trace events"
* tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
fs/resctrl: Restore the rdt_last_cmd_clear() calls after acquiring rdtgroup_mutex
x86/iopl: Cure TIF_IO_BITMAP inconsistencies
x86/fpu: Remove unused trace events
Linus Torvalds [Sun, 8 Jun 2025 18:25:13 +0000 (11:25 -0700)]
Merge tag 'timers-urgent-2025-06-08' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"Add the missing seq_file forward declaration in the timer namespace
header"
* tag 'timers-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timens: Add struct seq_file forward declaration
Len Brown [Sun, 8 Jun 2025 16:31:59 +0000 (12:31 -0400)]
tools/power turbostat: version 2025.06.08
Add initial DMR support, which required smarter RAPL probe
Fix AMD MSR RAPL energy reporting
Add RAPL power limit configuration output
Minor fixes
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Fri, 18 Apr 2025 06:04:26 +0000 (14:04 +0800)]
tools/power turbostat: Add initial support for BartlettLake
Add initial support for BartlettLake.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Mon, 4 Mar 2024 06:54:40 +0000 (14:54 +0800)]
tools/power turbostat: Add initial support for DMR
Add initial support for DMR.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Fri, 30 May 2025 06:01:31 +0000 (14:01 +0800)]
tools/power turbostat: Dump RAPL sysfs info
for example:
intel-rapl:1: psys 28.0s:100W 976.0us:100W
intel-rapl:0: package-0 28.0s:57W,max:15W 2.4ms:57W
intel-rapl:0/intel-rapl:0:0: core disabled
intel-rapl:0/intel-rapl:0:1: uncore disabled
intel-rapl-mmio:0: package-0 28.0s:28W,max:15W 2.4ms:57W
[lenb: simplified format]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
squish me
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Fri, 30 May 2025 00:09:28 +0000 (08:09 +0800)]
tools/power turbostat: Avoid probing the same perf counters
For the RAPL package energy status counter, Intel and AMD share the same
perf_subsys and perf_name, but with different MSR addresses.
Both rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] are
introduced to describe this counter for different Vendors.
As a result, the perf counter is probed twice, and causes a failure in
in get_rapl_counters() because expected_read_size and actual_read_size
don't match.
Fix the problem by skipping the already probed counter.
Note, this is not a perfect fix. For example, if different
vendors/platforms use the same MSR value for different purpose, the code
can be fooled when it probes a rapl_counter_arch_infos[] entry that does
not belong to the running Vendor/Platform.
In a long run, better to put rapl_counter_arch_infos[] into the
platform_features so that this becomes Vendor/Platform specific.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Sat, 17 May 2025 09:44:50 +0000 (17:44 +0800)]
tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
platform_features->rapl_msrs describes the RAPL MSRs supported. While
RAPL Perf counters can be exposed from different kernel backend drivers,
e.g. RAPL MSR I/F driver, or RAPL TPMI I/F driver.
Thus, turbostat should first blindly probe all the available RAPL Perf
counters, and falls back to the RAPL MSR counters if they are listed in
platform_features->rapl_msrs.
With this, platforms that don't have RAPL MSRs can clear the
platform_features->rapl_msrs bits and use RAPL Perf counters only.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Sat, 17 May 2025 09:35:17 +0000 (17:35 +0800)]
tools/power turbostat: Clean up add perf/msr counter logic
Increase the code readability by moving the no_perf/no_msr flag and the
cai->perf_name/cai->msr sanity checks into the counter probe functions.
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Sat, 17 May 2025 07:58:51 +0000 (15:58 +0800)]
tools/power turbostat: Introduce add_msr_counter()
probe_rapl_msr() is reused for probing RAPL MSR counters, cstate MSR
counters and MPERF/APERF/SMI MSR counters, thus its name is misleading.
Similar to add_perf_counter(), introduce add_msr_counter() to probe a
counter via MSR. Introduce wrapper function add_rapl_msr_counter() at
the same time to add extra check for Zero return value for specified
RAPL counters.
No functional change intended.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Sat, 17 May 2025 09:40:08 +0000 (17:40 +0800)]
tools/power turbostat: Remove add_msr_perf_counter_()
As the only caller of add_msr_perf_counter_(), add_msr_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.
Remove add_msr_perf_counter_() and move all the logic to
add_msr_perf_counter().
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Sat, 17 May 2025 07:43:59 +0000 (15:43 +0800)]
tools/power turbostat: Remove add_cstate_perf_counter_()
As the only caller of add_cstate_perf_counter_(),
add_cstate_perf_counter() just gives extra debug output on top. There is
no need to keep both functions.
Remove add_cstate_perf_counter_() and move all the logic to
add_cstate_perf_counter().
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Sat, 17 May 2025 04:06:22 +0000 (12:06 +0800)]
tools/power turbostat: Remove add_rapl_perf_counter_()
As the only caller of add_rapl_perf_counter_(), add_rapl_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.
Remove add_rapl_perf_counter_() and move all the logic to
add_rapl_perf_counter().
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Sat, 17 May 2025 02:26:14 +0000 (10:26 +0800)]
tools/power turbostat: Quit early for unsupported RAPL counters
Quit early for unsupported RAPL counters.
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Fri, 30 May 2025 06:00:33 +0000 (14:00 +0800)]
tools/power turbostat: Always check rapl_joules flag
rapl_joules bit should always be checked even if
platform_features->rapl_msrs is not set or no_msr flag is used.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Gautham R. Shenoy [Thu, 29 May 2025 11:48:25 +0000 (17:18 +0530)]
tools/power turbostat: Fix AMD package-energy reporting
commit
05a2f07db888 ("tools/power turbostat: read RAPL counters via
perf") that adds support to read RAPL counters via perf defines the
notion of a RAPL domain_id which is set to physical_core_id on
platforms which support per_core_rapl counters (Eg: AMD processors
Family 17h onwards) and is set to the physical_package_id on all the
other platforms.
However, the physical_core_id is only unique within a package and on
platforms with multiple packages more than one core can have the same
physical_core_id and thus the same domain_id. (For eg, the first cores
of each package have the physical_core_id = 0). This results in all
these cores with the same physical_core_id using the same entry in the
rapl_counter_info_perdomain[]. Since rapl_perf_init() skips the
perf-initialization for cores whose domain_ids have already been
visited, cores that have the same physical_core_id always read the
perf file corresponding to the physical_core_id of the first package
and thus the package-energy is incorrectly reported to be the same
value for different packages.
Note: This issue only arises when RAPL counters are read via perf and
not when they are read via MSRs since in the latter case the MSRs are
read separately on each core.
Fix this issue by associating each CPU with rapl_core_id which is
unique across all the packages in the system.
Fixes:
05a2f07db888 ("tools/power turbostat: read RAPL counters via perf")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Kaushlendra Kumar [Fri, 23 May 2025 08:06:59 +0000 (13:36 +0530)]
tools/power turbostat: Fix RAPL_GFX_ALL typo
Fix typo in the currently unused RAPL_GFX_ALL macro definition.
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Kaushlendra Kumar [Thu, 22 May 2025 08:49:46 +0000 (14:19 +0530)]
tools/power turbostat: Add Android support for MSR device handling
It uses /dev/msrN device paths on Android instead of /dev/cpu/N/msr,
updates error messages and permission checks to reflect the Android
device path, and wraps platform-specific code with #if defined(ANDROID)
to ensure correct behavior on both Android and non-Android systems.
These changes improve compatibility and usability of turbostat on
Android devices.
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Fri, 18 Apr 2025 21:54:39 +0000 (17:54 -0400)]
tools/power turbostat.8: pm_domain wording fix
turbostat.8: clarify that uncore "domains" are Power Management domains,
aka pm_domains.
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Wed, 9 Apr 2025 04:06:24 +0000 (00:06 -0400)]
tools/power turbostat.8: fix typo: idle_pct should be pct_idle
idle_pct should be pct_idle
Signed-off-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Sun, 8 Jun 2025 18:07:33 +0000 (11:07 -0700)]
Merge tag 'perf-urgent-2025-06-08' of git://git./linux/kernel/git/tip/tip
Pull x86 perf fix from Thomas Gleixner:
"A single fix for the x86 performance counters on Intel CPUs:
The MSR offset calculations for fixed performance counters are stored
at the wrong index in the configuration array causing the general
purpose counter MSR offset to be overwritten, so both the general
purpose and the fixed counters offsets are incorrect.
Correct the array index calculation to fix that"
* tag 'perf-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Fix incorrect MSR index calculations in intel_pmu_config_acr()
Linus Torvalds [Sun, 8 Jun 2025 18:02:53 +0000 (11:02 -0700)]
Merge tag 'irq-urgent-2025-06-08' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Thomas Gleixner:
"A single fix for the PCI/MSI code:
The conversion to per device MSI domains created a MSI domain with
size 1 instead of sizing it to the maximum possible number of MSI
interrupts for the device. This "worked" as the subsequent allocations
resized the domain, but the recent change to move the prepare() call
into the domain creation path broke this works by chance mechanism.
Size the domain properly at creation time"
* tag 'irq-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
PCI/MSI: Size device MSI domain with the maximum number of vectors
Linus Torvalds [Sun, 8 Jun 2025 17:35:12 +0000 (10:35 -0700)]
Merge tag 'pull-fixes' of git://git./linux/kernel/git/viro/vfs
Pull mount fixes from Al Viro:
"Various mount-related bugfixes:
- split the do_move_mount() checks in subtree-of-our-ns and
entire-anon cases and adapt detached mount propagation selftest for
mount_setattr
- allow clone_private_mount() for a path on real rootfs
- fix a race in call of has_locked_children()
- fix move_mount propagation graph breakage by MOVE_MOUNT_SET_GROUP
- make sure clone_private_mnt() caller has CAP_SYS_ADMIN in the right
userns
- avoid false negatives in path_overmount()
- don't leak MNT_LOCKED from parent to child in finish_automount()
- do_change_type(): refuse to operate on unmounted/not ours mounts"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
do_change_type(): refuse to operate on unmounted/not ours mounts
clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns
selftests/mount_setattr: adapt detached mount propagation test
do_move_mount(): split the checks in subtree-of-our-ns and entire-anon cases
fs: allow clone_private_mount() for a path on real rootfs
fix propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
finish_automount(): don't leak MNT_LOCKED from parent to child
path_overmount(): avoid false negatives
fs/fhandle.c: fix a race in call of has_locked_children()
Linus Torvalds [Sun, 8 Jun 2025 17:20:21 +0000 (10:20 -0700)]
Merge tag '6.16-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull more smb client updates from Steve French:
- multichannel/reconnect fixes
- move smbdirect (smb over RDMA) defines to fs/smb/common so they will
be able to be used in the future more broadly, and a documentation
update explaining setting up smbdirect mounts
- update email address for Paulo
* tag '6.16-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal version number
MAINTAINERS, mailmap: Update Paulo Alcantara's email address
cifs: add documentation for smbdirect setup
cifs: do not disable interface polling on failure
cifs: serialize other channels when query server interfaces is pending
cifs: deal with the channel loading lag while picking channels
smb: client: make use of common smbdirect_socket_parameters
smb: smbdirect: introduce smbdirect_socket_parameters
smb: client: make use of common smbdirect_socket
smb: smbdirect: add smbdirect_socket.h
smb: client: make use of common smbdirect.h
smb: smbdirect: add smbdirect.h with public structures
smb: client: make use of common smbdirect_pdu.h
smb: smbdirect: add smbdirect_pdu.h with protocol definitions
Linus Torvalds [Sun, 8 Jun 2025 15:19:01 +0000 (08:19 -0700)]
Merge tag 'trace-v6.16-3' of git://git./linux/kernel/git/trace/linux-trace
Pull more tracing fixes from Steven Rostedt:
- Fix regression of waiting a long time on updating trace event filters
When the faultable trace points were added, it needed task trace RCU
synchronization.
This was added to the tracepoint_synchronize_unregister() function.
The filter logic always called this function whenever it updated the
trace event filters before freeing the old filters. This increased
the time of "trace-cmd record" from taking 13 seconds to running over
2 minutes to complete.
Move the freeing of the filters to call_rcu*() logic, which brings
the time back down to 13 seconds.
- Fix ring_buffer_subbuf_order_set() error path lock protection
The error path of the ring_buffer_subbuf_order_set() released the
mutex too early and allowed subsequent accesses to setting the
subbuffer size to corrupt the data and cause a bug.
By moving the mutex locking to the end of the error path, it prevents
the reentrant access to the critical data and also allows the
function to convert the taking of the mutex over to the guard()
logic.
- Remove unused power management clock events
The clock events were added in 2010 for power management. In 2011 arm
used them. In 2013 the code they were used in was removed. These
events have been wasting memory since then.
- Fix sparse warnings
There was a few places that sparse warned about trace_events_filter.c
where file->filter was referenced directly, but it is annotated with
an __rcu tag. Use the helper functions and fix them up to use
rcu_dereference() properly.
* tag 'trace-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Add rcu annotation around file->filter accesses
tracing: PM: Remove unused clock events
ring-buffer: Fix buffer locking in ring_buffer_subbuf_order_set()
tracing: Fix regression of filter waiting a long time on RCU synchronization
Ingo Molnar [Fri, 9 May 2025 05:51:14 +0000 (07:51 +0200)]
treewide, timers: Rename from_timer() to timer_container_of()
Move this API to the canonical timer_*() namespace.
[ tglx: Redone against pre rc1 ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
Linus Torvalds [Sat, 7 Jun 2025 17:05:35 +0000 (10:05 -0700)]
Merge tag 'kbuild-v6.16' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Add support for the EXPORT_SYMBOL_GPL_FOR_MODULES() macro, which
exports a symbol only to specified modules
- Improve ABI handling in gendwarfksyms
- Forcibly link lib-y objects to vmlinux even if CONFIG_MODULES=n
- Add checkers for redundant or missing <linux/export.h> inclusion
- Deprecate the extra-y syntax
- Fix a genksyms bug when including enum constants from *.symref files
* tag 'kbuild-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (28 commits)
genksyms: Fix enum consts from a reference affecting new values
arch: use always-$(KBUILD_BUILTIN) for vmlinux.lds
kbuild: set y instead of 1 to KBUILD_{BUILTIN,MODULES}
efi/libstub: use 'targets' instead of extra-y in Makefile
module: make __mod_device_table__* symbols static
scripts/misc-check: check unnecessary #include <linux/export.h> when W=1
scripts/misc-check: check missing #include <linux/export.h> when W=1
scripts/misc-check: add double-quotes to satisfy shellcheck
kbuild: move W=1 check for scripts/misc-check to top-level Makefile
scripts/tags.sh: allow to use alternative ctags implementation
kconfig: introduce menu type enum
docs: symbol-namespaces: fix reST warning with literal block
kbuild: link lib-y objects to vmlinux forcibly even when CONFIG_MODULES=n
tinyconfig: enable CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
docs/core-api/symbol-namespaces: drop table of contents and section numbering
modpost: check forbidden MODULE_IMPORT_NS("module:") at compile time
kbuild: move kbuild syntax processing to scripts/Makefile.build
Makefile: remove dependency on archscripts for header installation
Documentation/kbuild: Add new gendwarfksyms kABI rules
Documentation/kbuild: Drop section numbers
...
Linus Torvalds [Sat, 7 Jun 2025 17:00:03 +0000 (10:00 -0700)]
Merge tag 'sh-for-v6.16-tag1' of git://git./linux/kernel/git/glaubitz/sh-linux
Pull sh updates from John Paul Adrian Glaubitz:
- replace the __ASSEMBLY__ with __ASSEMBLER__ macro in all headers
since the latter is now defined automatically by both GCC and Clang
when compiling assembly code (Thomas Huth)
- set the default SPI mode for the ecovec24 board which became
necessary after a new mode member as added to the sh_msiof_spi_info
struct in
cf9e4784f3bd ("spi: sh-msiof: Add slave mode support")
(Geert Uytterhoeven)
- remove unused variables in the kprobes code in
kprobe_exceptions_notify() (Mike Rapoport)
* tag 'sh-for-v6.16-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: kprobes: Remove unused variables in kprobe_exceptions_notify()
sh: ecovec24: Make SPI mode explicit
sh: Replace __ASSEMBLY__ with __ASSEMBLER__ in all headers
Linus Torvalds [Sat, 7 Jun 2025 16:56:18 +0000 (09:56 -0700)]
Merge tag 'loongarch-6.16' of git://git./linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Adjust the 'make install' operation
- Support SCHED_MC (Multi-core scheduler)
- Enable ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS
- Enable HAVE_ARCH_STACKLEAK
- Increase max supported CPUs up to 2048
- Introduce the numa_memblks conversion
- Add PWM controller nodes in dts
- Some bug fixes and other small changes
* tag 'loongarch-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
platform/loongarch: laptop: Unregister generic_sub_drivers on exit
platform/loongarch: laptop: Add backlight power control support
platform/loongarch: laptop: Get brightness setting from EC on probe
LoongArch: dts: Add PWM support to Loongson-2K2000
LoongArch: dts: Add PWM support to Loongson-2K1000
LoongArch: dts: Add PWM support to Loongson-2K0500
LoongArch: vDSO: Correctly use asm parameters in syscall wrappers
LoongArch: Fix panic caused by NULL-PMD in huge_pte_offset()
LoongArch: Preserve firmware configuration when desired
LoongArch: Avoid using $r0/$r1 as "mask" for csrxchg
LoongArch: Introduce the numa_memblks conversion
LoongArch: Increase max supported CPUs up to 2048
LoongArch: Enable HAVE_ARCH_STACKLEAK
LoongArch: Enable ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS
LoongArch: Add SCHED_MC (Multi-core scheduler) support
LoongArch: Add some annotations in archhelp
LoongArch: Using generic scripts/install.sh in `make install`
LoongArch: Add a default install.sh
Linus Torvalds [Sat, 7 Jun 2025 16:40:08 +0000 (09:40 -0700)]
Merge tag 'sound-fix-6.16-rc1' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of fix patches for the 6.16-rc1 merge window.
Most of changes are about ASoC, especially lots of AVS driver fixes.
Larger LOCs are seen in TAS571x codec drivers, but the changes are
trivial and safe. The rest are all device-specific small fixes"
* tag 'sound-fix-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (27 commits)
ASoC: Intel: avs: boards: Fix rt5663 front end name
ASoC: Intel: avs: Simplify verification of parse_int_array() result
ALSA: usb-audio: Add implicit feedback quirk for RODE AI-1
ALSA: hda: Ignore unsol events for cards being shut down
ALSA: hda: Add new pci id for AMD GPU display HD audio controller
ALSA: hda: cs35l41: Constify regmap_irq_chip
ALSA: usb-audio: Add a quirk for Lenovo Thinkpad Thunderbolt 3 dock
ASoC: ti: omap-hdmi: Re-add dai_link->platform to fix card init
ASoC: pcm: Do not open FEs with no BEs connected
ASoC: rt1320: fix speaker noise when volume bar is 100%
ASoC: Intel: avs: Include missing string.h
ASoC: Intel: avs: Verify content returned by parse_int_array()
ASoC: Intel: avs: Verify kcalloc() status when setting constraints
ASoC: Intel: avs: Fix paths in MODULE_FIRMWARE hints
ASoC: Intel: avs: Fix possible null-ptr-deref when initing hw
ASoC: Intel: avs: Fix PPLCxFMT calculation
ASoC: Intel: avs: Fix deadlock when the failing IPC is SET_D0IX
ASoC: codecs: hda: Fix RPM usage count underflow
ASoC: amd: yc: Add support for Lenovo Yoga 7 16ARP8
ASoC: tas571x: fix tas5733 num_controls
...
Steven Rostedt [Sat, 7 Jun 2025 14:28:21 +0000 (10:28 -0400)]
tracing: Add rcu annotation around file->filter accesses
Running sparse on trace_events_filter.c triggered several warnings about
file->filter being accessed directly even though it's annotated with __rcu.
Add rcu_dereference() around it and shuffle the logic slightly so that
it's always referenced via accessor functions.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20250607102821.6c7effbf@gandalf.local.home
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Sat, 7 Jun 2025 14:24:07 +0000 (07:24 -0700)]
Merge tag 'ubifs-for-linus-6.16-rc1' of git://git./linux/kernel/git/rw/ubifs
Pull JFFS2 and UBIFS fixes from Richard Weinberger:
"JFFS2:
- Correctly check return code of jffs2_prealloc_raw_node_refs()
UBIFS:
- Spelling fixes"
* tag 'ubifs-for-linus-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
jffs2: check jffs2_prealloc_raw_node_refs() result in few other places
jffs2: check that raw node were preallocated before writing summary
ubifs: Fix grammar in error message
Mike Rapoport [Sat, 17 May 2025 09:30:48 +0000 (12:30 +0300)]
sh: kprobes: Remove unused variables in kprobe_exceptions_notify()
kbuild reports the following warning:
arch/sh/kernel/kprobes.c: In function 'kprobe_exceptions_notify':
>> arch/sh/kernel/kprobes.c:412:24: warning: variable 'p' set but not used [-Wunused-but-set-variable]
412 | struct kprobe *p = NULL;
| ^
The variable 'p' is indeed unused since the commit
fa5a24b16f94
("sh/kprobes: Don't call the ->break_handler() in SH kprobes code")
Remove that variable along with 'kprobe_opcode_t *addr' which also
becomes unused after 'p' is removed.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202505151341.EuRFR22l-lkp@intel.com/
Fixes:
fa5a24b16f94 ("sh/kprobes: Don't call the ->break_handler() in SH kprobes code")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Geert Uytterhoeven [Fri, 2 May 2025 11:13:36 +0000 (13:13 +0200)]
sh: ecovec24: Make SPI mode explicit
Commit
cf9e4784f3bde3e4 ("spi: sh-msiof: Add slave mode support") added
a new mode member to the sh_msiof_spi_info structure, but did not update
any board files. Hence all users in board files rely on the default
being host mode.
Make this unambiguous by configuring host mode explicitly.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Thomas Huth [Fri, 14 Mar 2025 07:10:03 +0000 (08:10 +0100)]
sh: Replace __ASSEMBLY__ with __ASSEMBLER__ in all headers
While the GCC and Clang compilers already define __ASSEMBLER__
automatically when compiling assembly code, __ASSEMBLY__ is a
macro that only gets defined by the Makefiles in the kernel.
This can be very confusing when switching between userspace
and kernelspace coding, or when dealing with uapi headers that
rather should use __ASSEMBLER__ instead. So let's standardize on
the __ASSEMBLER__ macro that is provided by the compilers now.
This is a completely mechanical patch (done with a simple "sed -i"
statement).
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Petr Pavlu [Tue, 3 Jun 2025 13:02:09 +0000 (15:02 +0200)]
genksyms: Fix enum consts from a reference affecting new values
Enumeration constants read from a symbol reference file can incorrectly
affect new enumeration constants parsed from an actual input file.
Example:
$ cat test.c
enum { E_A, E_B, E_MAX };
struct bar { int mem[E_MAX]; };
int foo(struct bar *a) {}
__GENKSYMS_EXPORT_SYMBOL(foo);
$ cat test.c | ./scripts/genksyms/genksyms -T test.0.symtypes
#SYMVER foo 0x070d854d
$ cat test.0.symtypes
E#E_MAX 2
s#bar struct bar { int mem [ E#E_MAX ] ; }
foo int foo ( s#bar * )
$ cat test.c | ./scripts/genksyms/genksyms -T test.1.symtypes -r test.0.symtypes
<stdin>:4: warning: foo: modversion changed because of changes in enum constant E_MAX
#SYMVER foo 0x9c9dfd81
$ cat test.1.symtypes
E#E_MAX ( 2 ) + 3
s#bar struct bar { int mem [ E#E_MAX ] ; }
foo int foo ( s#bar * )
The __add_symbol() function includes logic to handle the incrementation of
enumeration values, but this code is also invoked when reading a reference
file. As a result, the variables last_enum_expr and enum_counter might be
incorrectly set after reading the reference file, which later affects
parsing of the actual input.
Fix the problem by splitting the logic for the incrementation of
enumeration values into a separate function process_enum() and call it from
__add_symbol() only when processing non-reference data.
Fixes:
e37ddb825003 ("genksyms: Track changes to enum constants")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Mon, 2 Jun 2025 18:12:54 +0000 (03:12 +0900)]
arch: use always-$(KBUILD_BUILTIN) for vmlinux.lds
The extra-y syntax is deprecated. Instead, use always-$(KBUILD_BUILTIN),
which behaves equivalently.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Al Viro [Wed, 4 Jun 2025 16:27:08 +0000 (12:27 -0400)]
do_change_type(): refuse to operate on unmounted/not ours mounts
Ensure that propagation settings can only be changed for mounts located
in the caller's mount namespace. This change aligns permission checking
with the rest of mount(2).
Reviewed-by: Christian Brauner <brauner@kernel.org>
Fixes:
07b20889e305 ("beginning of the shared-subtree proper")
Reported-by: "Orlando, Noah" <Noah.Orlando@deshaw.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Masahiro Yamada [Mon, 2 Jun 2025 18:12:53 +0000 (03:12 +0900)]
kbuild: set y instead of 1 to KBUILD_{BUILTIN,MODULES}
KBUILD_BUILTIN is set to 1 unless you are building only modules.
KBUILD_MODULES is set to 1 when you are building only modules
(a typical use case is "make modules").
It is more useful to set them to 'y' instead, so we can do
something like:
always-$(KBUILD_BUILTIN) += vmlinux.lds
This works equivalently to:
extra-y += vmlinux.lds
This allows us to deprecate extra-y. extra-y and always-y are quite
similar, and we do not need both.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Al Viro [Mon, 2 Jun 2025 00:11:06 +0000 (20:11 -0400)]
clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns
What we want is to verify there is that clone won't expose something
hidden by a mount we wouldn't be able to undo. "Wouldn't be able to undo"
may be a result of MNT_LOCKED on a child, but it may also come from
lacking admin rights in the userns of the namespace mount belongs to.
clone_private_mnt() checks the former, but not the latter.
There's a number of rather confusing CAP_SYS_ADMIN checks in various
userns during the mount, especially with the new mount API; they serve
different purposes and in case of clone_private_mnt() they usually,
but not always end up covering the missing check mentioned above.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reported-by: "Orlando, Noah" <Noah.Orlando@deshaw.com>
Fixes:
427215d85e8d ("ovl: prevent private clone if bind mount is not allowed")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Sat, 7 Jun 2025 05:06:57 +0000 (22:06 -0700)]
Merge tag 'mm-stable-2025-06-06-16-09' of git://git./linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
"The series 'Fix uprobe pte be overwritten when expanding vma' fixes a
longstanding and quite obscure bug related to the vma merging of the
uprobe mmap page"
* tag 'mm-stable-2025-06-06-16-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
selftests/mm: add test about uprobe pte be orphan during vma merge
selftests/mm: extract read_sysfs and write_sysfs into vm_util
mm: expose abnormal new_pte during move_ptes
mm: fix uprobe pte be overwritten when expanding vma
mm/damon: s/primitives/code/ on comments
Linus Torvalds [Sat, 7 Jun 2025 04:45:45 +0000 (21:45 -0700)]
Merge tag 'mm-hotfixes-stable-2025-06-06-16-02' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"13 hotfixes.
6 are cc:stable and the remainder address post-6.15 issues or aren't
considered necessary for -stable kernels. 11 are for MM"
* tag 'mm-hotfixes-stable-2025-06-06-16-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count
MAINTAINERS: add mm swap section
kmsan: test: add module description
MAINTAINERS: add tlb trace events to MMU GATHER AND TLB INVALIDATION
mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race
mm/hugetlb: unshare page tables during VMA split, not before
MAINTAINERS: add Alistair as reviewer of mm memory policy
iov_iter: use iov_offset for length calculation in iov_iter_aligned_bvec
mm/mempolicy: fix incorrect freeing of wi_kobj
alloc_tag: handle module codetag load errors as module load failures
mm/madvise: handle madvise_lock() failure during race unwinding
mm: fix vmstat after removing NR_BOUNCE
KVM: s390: rename PROT_NONE to PROT_TYPE_DUMMY
Christian Brauner [Thu, 5 Jun 2025 12:50:54 +0000 (14:50 +0200)]
selftests/mount_setattr: adapt detached mount propagation test
Make sure that detached trees don't receive mount propagation.
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 6 Jun 2025 22:31:03 +0000 (18:31 -0400)]
do_move_mount(): split the checks in subtree-of-our-ns and entire-anon cases
... and fix the breakage in anon-to-anon case. There are two cases
acceptable for do_move_mount() and mixing checks for those is making
things hard to follow.
One case is move of a subtree in caller's namespace.
* source and destination must be in caller's namespace
* source must be detachable from parent
Another is moving the entire anon namespace elsewhere
* source must be the root of anon namespace
* target must either in caller's namespace or in a suitable
anon namespace (see may_use_mount() for details).
* target must not be in the same namespace as source.
It's really easier to follow if tests are *not* mixed together...
Reviewed-by: Christian Brauner <brauner@kernel.org>
Fixes:
3b5260d12b1f ("Don't propagate mounts into detached trees")
Reported-by: Allison Karlitskaya <lis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
KONDO KAZUMA(近藤 和真) [Thu, 15 May 2025 12:18:30 +0000 (12:18 +0000)]
fs: allow clone_private_mount() for a path on real rootfs
Mounting overlayfs with a directory on real rootfs (initramfs)
as upperdir has failed with following message since commit
db04662e2f4f ("fs: allow detached mounts in clone_private_mount()").
[ 4.080134] overlayfs: failed to clone upperpath
Overlayfs mount uses clone_private_mount() to create internal mount
for the underlying layers.
The commit made clone_private_mount() reject real rootfs because
it does not have a parent mount and is in the initial mount namespace,
that is not an anonymous mount namespace.
This issue can be fixed by modifying the permission check
of clone_private_mount() following [1].
Reviewed-by: Christian Brauner <brauner@kernel.org>
Fixes:
db04662e2f4f ("fs: allow detached mounts in clone_private_mount()")
Link: https://lore.kernel.org/all/20250514190252.GQ2023217@ZenIV/
Link: https://lore.kernel.org/all/20250506194849.GT2023217@ZenIV/
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Kazuma Kondo <kazuma-kondo@nec.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 3 Jun 2025 21:57:27 +0000 (17:57 -0400)]
fix propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
9ffb14ef61ba "move_mount: allow to add a mount into an existing group"
breaks assertions on ->mnt_share/->mnt_slave. For once, the data structures
in question are actually documented.
Documentation/filesystem/sharedsubtree.rst:
All vfsmounts in a peer group have the same ->mnt_master. If it is
non-NULL, they form a contiguous (ordered) segment of slave list.
do_set_group() puts a mount into the same place in propagation graph
as the old one. As the result, if old mount gets events from somewhere
and is not a pure event sink, new one needs to be placed next to the
old one in the slave list the old one's on. If it is a pure event
sink, we only need to make sure the new one doesn't end up in the
middle of some peer group.
"move_mount: allow to add a mount into an existing group" ends up putting
the new one in the beginning of list; that's definitely not going to be
in the middle of anything, so that's fine for case when old is not marked
shared. In case when old one _is_ marked shared (i.e. is not a pure event
sink), that breaks the assumptions of propagation graph iterators.
Put the new mount next to the old one on the list - that does the right thing
in "old is marked shared" case and is just as correct as the current behaviour
if old is not marked shared (kudos to Pavel for pointing that out - my original
suggested fix changed behaviour in the "nor marked" case, which complicated
things for no good reason).
Reviewed-by: Christian Brauner <brauner@kernel.org>
Fixes:
9ffb14ef61ba ("move_mount: allow to add a mount into an existing group")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 4 May 2025 17:28:37 +0000 (13:28 -0400)]
finish_automount(): don't leak MNT_LOCKED from parent to child
Intention for MNT_LOCKED had always been to protect the internal
mountpoints within a subtree that got copied across the userns boundary,
not the mountpoint that tree got attached to - after all, it _was_
exposed before the copying.
For roots of secondary copies that is enforced in attach_recursive_mnt() -
MNT_LOCKED is explicitly stripped for those. For the root of primary
copy we are almost always guaranteed that MNT_LOCKED won't be there,
so attach_recursive_mnt() doesn't bother. Unfortunately, one call
chain got overlooked - triggering e.g. NFS referral will have the
submount inherit the public flags from parent; that's fine for such
things as read-only, nosuid, etc., but not for MNT_LOCKED.
This is particularly pointless since the mount attached by finish_automount()
is usually expirable, which makes any protection granted by MNT_LOCKED
null and void; just wait for a while and that mount will go away on its own.
Include MNT_LOCKED into the set of flags to be ignored by do_add_mount() - it
really is an internal flag.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Fixes:
5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 1 Jun 2025 18:02:26 +0000 (14:02 -0400)]
path_overmount(): avoid false negatives
Holding namespace_sem is enough to make sure that result remains valid.
It is *not* enough to avoid false negatives from __lookup_mnt(). Mounts
can be unhashed outside of namespace_sem (stuck children getting detached
on final mntput() of lazy-umounted mount) and having an unrelated mount
removed from the hash chain while we traverse it may end up with false
negative from __lookup_mnt(). We need to sample and recheck the seqlock
component of mount_lock...
Bug predates the introduction of path_overmount() - it had come from
the code in finish_automount() that got abstracted into that helper.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Fixes:
26df6034fdb2 ("fix automount/automount race properly")
Fixes:
6ac392815628 ("fs: allow to mount beneath top mount")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sun, 1 Jun 2025 18:23:52 +0000 (14:23 -0400)]
fs/fhandle.c: fix a race in call of has_locked_children()
may_decode_fh() is calling has_locked_children() while holding no locks.
That's an oopsable race...
The rest of the callers are safe since they are holding namespace_sem and
are guaranteed a positive refcount on the mount in question.
Rename the current has_locked_children() to __has_locked_children(), make
it static and switch the fs/namespace.c users to it.
Make has_locked_children() a wrapper for __has_locked_children(), calling
the latter under read_seqlock_excl(&mount_lock).
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Fixes:
620c266f3949 ("fhandle: relax open_by_handle_at() permission checks")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Yao Zi [Thu, 5 Jun 2025 12:34:46 +0000 (20:34 +0800)]
platform/loongarch: laptop: Unregister generic_sub_drivers on exit
Without correct unregisteration, ACPI notify handlers and the platform
drivers installed by generic_subdriver_init() will become dangling
references after removing the loongson_laptop module, triggering various
kernel faults when a hotkey is sent or at kernel shutdown.
Cc: stable@vger.kernel.org
Fixes:
6246ed09111f ("LoongArch: Add ACPI-based generic laptop driver")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Yao Zi [Thu, 5 Jun 2025 12:34:46 +0000 (20:34 +0800)]
platform/loongarch: laptop: Add backlight power control support
loongson_laptop_turn_{on,off}_backlight() are designed for controlling
the power of the backlight, but they aren't really used in the driver
previously.
Unify these two functions since they only differ in arguments passed to
ACPI method, and wire up loongson_laptop_backlight_update() to update
the power state of the backlight as well. Tested on the TongFang L860-T2
Loongson-3A5000 laptop.
Cc: stable@vger.kernel.org
Fixes:
6246ed09111f ("LoongArch: Add ACPI-based generic laptop driver")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Linus Torvalds [Sat, 7 Jun 2025 03:02:51 +0000 (20:02 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Mostly trivial updates and bug fixes (core update is a comment
spelling fix).
The bigger UFS update is the clock scaling and frequency fixes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: qcom: Prevent calling phy_exit() before phy_init()
scsi: ufs: qcom: Call ufs_qcom_cfg_timers() in clock scaling path
scsi: ufs: qcom: Map devfreq OPP freq to UniPro Core Clock freq
scsi: ufs: qcom: Check gear against max gear in vop freq_to_gear()
scsi: aacraid: Remove useless code
scsi: core: devinfo: Fix typo in comment
scsi: ufs: core: Don't perform UFS clkscaling during host async scan
Linus Torvalds [Sat, 7 Jun 2025 01:05:18 +0000 (18:05 -0700)]
Merge tag 'riscv-for-linus-6.16-mw1' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for the FWFT SBI extension, which is part of SBI 3.0 and a
dependency for many new SBI and ISA extensions
- Support for getrandom() in the VDSO
- Support for mseal
- Optimized routines for raid6 syndrome and recovery calculations
- kexec_file() supports loading Image-formatted kernel binaries
- Improvements to the instruction patching framework to allow for
atomic instruction patching, along with rules as to how systems need
to behave in order to function correctly
- Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
some SiFive vendor extensions
- Various fixes and cleanups, including: misaligned access handling,
perf symbol mangling, module loading, PUD THPs, and improved uaccess
routines
* tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits)
riscv: uaccess: Only restore the CSR_STATUS SUM bit
RISC-V: vDSO: Wire up getrandom() vDSO implementation
riscv: enable mseal sysmap for RV64
raid6: Add RISC-V SIMD syndrome and recovery calculations
riscv: mm: Add support for Svinval extension
RISC-V: Documentation: Add enough title underlines to CMODX
riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE
MAINTAINERS: Update Atish's email address
riscv: uaccess: do not do misaligned accesses in get/put_user()
riscv: process: use unsigned int instead of unsigned long for put_user()
riscv: make unsafe user copy routines use existing assembly routines
riscv: hwprobe: export Zabha extension
riscv: Make regs_irqs_disabled() more clear
perf symbols: Ignore mapping symbols on riscv
RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND
riscv: module: Optimize PLT/GOT entry counting
riscv: Add support for PUD THP
riscv: xchg: Prefetch the destination word for sc.w
riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop
riscv: Add support for Zicbop
...
Linus Torvalds [Sat, 7 Jun 2025 01:02:37 +0000 (18:02 -0700)]
Merge tag 's390-6.16-2' of git://git./linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
- Add missing select CRYPTO_ENGINE to CRYPTO_PAES_S390
- Fix secure storage access exception handling when fault handling is
disabled
* tag 's390-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: Fix in_atomic() handling in do_secure_storage_access()
s390/crypto: Select crypto engine in Kconfig when PAES is chosen
Linus Torvalds [Sat, 7 Jun 2025 01:00:36 +0000 (18:00 -0700)]
Merge tag 'tomoyo-pr-
20250606' of git://git.code.sf.net/p/tomoyo/tomoyo
Pull tomoyo update from Tetsuo Handa:
"Update mailing list address"
* tag 'tomoyo-pr-
20250606' of git://git.code.sf.net/p/tomoyo/tomoyo:
tomoyo: update mailing lists
Linus Torvalds [Sat, 7 Jun 2025 00:56:19 +0000 (17:56 -0700)]
Merge tag 'ceph-for-6.16-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
- a one-liner that leads to a startling (but also very much rational)
performance improvement in cases where an IMA policy with rules that
are based on fsmagic matching is enforced
- an encryption-related fixup that addresses generic/397 and other
fstest failures
- a couple of cleanups in CephFS
* tag 'ceph-for-6.16-rc1' of https://github.com/ceph/ceph-client:
ceph: fix variable dereferenced before check in ceph_umount_begin()
ceph: set superblock s_magic for IMA fsmagic matching
ceph: cleanup hardcoded constants of file handle size
ceph: fix possible integer overflow in ceph_zero_objects()
ceph: avoid kernel BUG for encrypted inode with unaligned file size
Linus Torvalds [Sat, 7 Jun 2025 00:54:09 +0000 (17:54 -0700)]
Merge tag 'ovl-update-v2-6.16' of git://git./linux/kernel/git/overlayfs/vfs
Pull overlayfs update from Miklos Szeredi:
- Fix a regression in getting the path of an open file (e.g. in
/proc/PID/maps) for a nested overlayfs setup (André Almeida)
- Support data-only layers and verity in a user namespace (unprivileged
composefs use case)
- Fix a gcc warning (Kees)
- Cleanups
* tag 'ovl-update-v2-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
ovl: Annotate struct ovl_entry with __counted_by()
ovl: Replace offsetof() with struct_size() in ovl_stack_free()
ovl: Replace offsetof() with struct_size() in ovl_cache_entry_new()
ovl: Check for NULL d_inode() in ovl_dentry_upper()
ovl: Use str_on_off() helper in ovl_show_options()
ovl: don't require "metacopy=on" for "verity"
ovl: relax redirect/metacopy requirements for lower -> data redirect
ovl: make redirect/metacopy rejection consistent
ovl: Fix nested backing file paths
Steven Rostedt [Thu, 5 Jun 2025 20:21:06 +0000 (16:21 -0400)]
tracing: PM: Remove unused clock events
The events clock_enable, clock_disable, and clock_set_rate were added back
in 2010. In 2011 they were used by the arm architecture but removed in
2013. These events add around 7K of memory which was wasted for the last 12
years.
Remove them.
Link: https://lore.kernel.org/all/20250529130138.544ffec4@gandalf.local.home/
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kajetan Puchalski <kajetan.puchalski@arm.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/20250605162106.1a459dad@gandalf.local.home
Fixes:
74704ac6ea402 ("tracing, perf: Add more power related events")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Dmitry Antipov [Fri, 6 Jun 2025 11:22:42 +0000 (14:22 +0300)]
ring-buffer: Fix buffer locking in ring_buffer_subbuf_order_set()
Enlarge the critical section in ring_buffer_subbuf_order_set() to
ensure that error handling takes place with per-buffer mutex held,
thus preventing list corruption and other concurrency-related issues.
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
Link: https://lore.kernel.org/20250606112242.1510605-1-dmantipov@yandex.ru
Reported-by: syzbot+05d673e83ec640f0ced9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=
05d673e83ec640f0ced9
Fixes:
f9b94daa542a8 ("ring-buffer: Set new size of the ring buffer sub page")
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Sat, 7 Jun 2025 00:20:20 +0000 (20:20 -0400)]
tracing: Fix regression of filter waiting a long time on RCU synchronization
When faultable trace events were added, a trace event may no longer use
normal RCU to synchronize but instead used synchronize_rcu_tasks_trace().
This synchronization takes a much longer time to synchronize.
The filter logic would free the filters by calling
tracepoint_synchronize_unregister() after it unhooked the filter strings
and before freeing them. With this function now calling
synchronize_rcu_tasks_trace() this increased the time to free a filter
tremendously. On a PREEMPT_RT system, it was even more noticeable.
# time trace-cmd record -p function sleep 1
[..]
real 2m29.052s
user 0m0.244s
sys 0m20.136s
As trace-cmd would clear out all the filters before recording, it could
take up to 2 minutes to do a recording of "sleep 1".
To find out where the issues was:
~# trace-cmd sqlhist -e -n sched_stack select start.prev_state as state, end.next_comm as comm, TIMESTAMP_DELTA_USECS as delta, start.STACKTRACE as stack from sched_switch as start join sched_switch as end on start.prev_pid = end.next_pid
Which will produce the following commands (and -e will also execute them):
echo 's:sched_stack s64 state; char comm[16]; u64 delta; unsigned long stack[];' >> /sys/kernel/tracing/dynamic_events
echo 'hist:keys=prev_pid:__arg_18057_2=prev_state,__arg_18057_4=common_timestamp.usecs,__arg_18057_7=common_stacktrace' >> /sys/kernel/tracing/events/sched/sched_switch/trigger
echo 'hist:keys=next_pid:__state_18057_1=$__arg_18057_2,__comm_18057_3=next_comm,__delta_18057_5=common_timestamp.usecs-$__arg_18057_4,__stack_18057_6=$__arg_18057_7:onmatch(sched.sched_switch).trace(sched_stack,$__state_18057_1,$__comm_18057_3,$__delta_18057_5,$__stack_18057_6)' >> /sys/kernel/tracing/events/sched/sched_switch/trigger
The above creates a synthetic event that creates a stack trace when a task
schedules out and records it with the time it scheduled back in. Basically
the time a task is off the CPU. It also records the state of the task when
it left the CPU (running, blocked, sleeping, etc). It also saves the comm
of the task as "comm" (needed for the next command).
~# echo 'hist:keys=state,stack.stacktrace:vals=delta:sort=state,delta if comm == "trace-cmd" && state & 3' > /sys/kernel/tracing/events/synthetic/sched_stack/trigger
The above creates a histogram with buckets per state, per stack, and the
value of the total time it was off the CPU for that stack trace. It filters
on tasks with "comm == trace-cmd" and only the sleeping and blocked states
(1 - sleeping, 2 - blocked).
~# trace-cmd record -p function sleep 1
~# cat /sys/kernel/tracing/events/synthetic/sched_stack/hist | tail -18
{ state: 2, stack.stacktrace __schedule+0x1545/0x3700
schedule+0xe2/0x390
schedule_timeout+0x175/0x200
wait_for_completion_state+0x294/0x440
__wait_rcu_gp+0x247/0x4f0
synchronize_rcu_tasks_generic+0x151/0x230
apply_subsystem_event_filter+0xa2b/0x1300
subsystem_filter_write+0x67/0xc0
vfs_write+0x1e2/0xeb0
ksys_write+0xff/0x1d0
do_syscall_64+0x7b/0x420
entry_SYSCALL_64_after_hwframe+0x76/0x7e
} hitcount: 237 delta:
99756288 <<--------------- Delta is 99 seconds!
Totals:
Hits: 525
Entries: 21
Dropped: 0
This shows that this particular trace waited for 99 seconds on
synchronize_rcu_tasks() in apply_subsystem_event_filter().
In fact, there's a lot of places in the filter code that spends a lot of
time waiting for synchronize_rcu_tasks_trace() in order to free the
filters.
Add helper functions that will use call_rcu*() variants to asynchronously
free the filters. This brings the timings back to normal:
# time trace-cmd record -p function sleep 1
[..]
real 0m14.681s
user 0m0.335s
sys 0m28.616s
And the histogram also shows this:
~# cat /sys/kernel/tracing/events/synthetic/sched_stack/hist | tail -21
{ state: 2, stack.stacktrace __schedule+0x1545/0x3700
schedule+0xe2/0x390
schedule_timeout+0x175/0x200
wait_for_completion_state+0x294/0x440
__wait_rcu_gp+0x247/0x4f0
synchronize_rcu_normal+0x3db/0x5c0
tracing_reset_online_cpus+0x8f/0x1e0
tracing_open+0x335/0x440
do_dentry_open+0x4c6/0x17a0
vfs_open+0x82/0x360
path_openat+0x1a36/0x2990
do_filp_open+0x1c5/0x420
do_sys_openat2+0xed/0x180
__x64_sys_openat+0x108/0x1d0
do_syscall_64+0x7b/0x420
} hitcount: 2 delta: 77044
Totals:
Hits: 55
Entries: 28
Dropped: 0
Where the total waiting time of synchronize_rcu_tasks_trace() is 77
milliseconds.
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Andreas Ziegler <ziegler.andreas@siemens.com>
Cc: Felix MOESSBAUER <felix.moessbauer@siemens.com>
Link: https://lore.kernel.org/20250606201936.1e3d09a9@batman.local.home
Reported-by: "Flot, Julien" <julien.flot@siemens.com>
Tested-by: Julien Flot <julien.flot@siemens.com>
Fixes:
a363d27cdbc2 ("tracing: Allow system call tracepoints to handle page faults")
Closes: https://lore.kernel.org/all/
240017f656631c7dd4017aa93d91f41f653788ea.camel@siemens.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Linus Torvalds [Fri, 6 Jun 2025 20:22:31 +0000 (13:22 -0700)]
Merge tag 'spi-v6.16-merge-window' of git://git./linux/kernel/git/broonie/spi
Pull more spi updates from Mark Brown:
"A small set of updates that came in during the merge window, we've
got:
- Some small fixes for the Broadcom and spi-pci1xxxx drivers
- A change to the QPIC SNAND driver to flag that the error correction
features are less useful than people might be expecting
- A new device ID for the SOPHGO SG2042
- The addition of Yang Shen as a Huawei maintainer"
* tag 'spi-v6.16-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-qpic-snand: document the limited bit error reporting capability
spi: bcm63xx-hsspi: fix shared reset
spi: bcm63xx-spi: fix shared reset
MAINTAINERS: Update HiSilicon SFC driver maintainer
MAINTAINERS: Update HiSilicon SPI Controller driver maintainer
spi: dt-bindings: spi-sg2044-nor: Add SOPHGO SG2042
spi: spi-pci1xxxx: Fix Probe failure with Dual SPI instance with INTx interrupts
Linus Torvalds [Fri, 6 Jun 2025 20:20:26 +0000 (13:20 -0700)]
Merge tag 'regulator-fix-v6.16-merge-window' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"A very minor fix that came in during the merge window, checking for
I/O errors in the MAX14577 driver"
* tag 'regulator-fix-v6.16-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: max14577: Add error check for max14577_read_reg()
Linus Torvalds [Fri, 6 Jun 2025 20:16:50 +0000 (13:16 -0700)]
Merge tag 'pwm/for-6.16-rc1-fixes' of git://git./linux/kernel/git/ukleinek/linux
Pull pwm fixes from Uwe Kleine-König:
"axi-pwmgen: Fix handling of external clock
The pwm-axi-pwmgen device is backed by an FPGA and can be synthesized
in different ways. Relevant here is that it can use one or two
external clock signals. These fix clock handling for the two clocks
case"
* tag 'pwm/for-6.16-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
pwm: axi-pwmgen: fix missing separate external clock
dt-bindings: pwm: adi,axi-pwmgen: Fix clocks
Linus Torvalds [Fri, 6 Jun 2025 20:12:50 +0000 (13:12 -0700)]
Merge tag 'block-6.16-
20250606' of git://git.kernel.dk/linux
Pull more block updates from Jens Axboe:
- NVMe pull request via Christoph:
- TCP error handling fix (Shin'ichiro Kawasaki)
- TCP I/O stall handling fixes (Hannes Reinecke)
- fix command limits status code (Keith Busch)
- support vectored buffers also for passthrough (Pavel Begunkov)
- spelling fixes (Yi Zhang)
- MD pull request via Yu:
- fix REQ_RAHEAD and REQ_NOWAIT IO err handling for raid1/10
- fix max_write_behind setting for dm-raid
- some minor cleanups
- Integrity data direction fix and cleanup
- bcache NULL pointer fix
- Fix for loop missing write start/end handling
- Decouple hardware queues and IO threads in ublk
- Slew of ublk selftests additions and updates
* tag 'block-6.16-
20250606' of git://git.kernel.dk/linux: (29 commits)
nvme: spelling fixes
nvme-tcp: fix I/O stalls on congested sockets
nvme-tcp: sanitize request list handling
nvme-tcp: remove tag set when second admin queue config fails
nvme: enable vectored registered bufs for passthrough cmds
nvme: fix implicit bool to flags conversion
nvme: fix command limits status code
selftests: ublk: kublk: improve behavior on init failure
block: flip iter directions in blk_rq_integrity_map_user()
block: drop direction param from bio_integrity_copy_user()
selftests: ublk: cover PER_IO_DAEMON in more stress tests
Documentation: ublk: document UBLK_F_PER_IO_DAEMON
selftests: ublk: add stress test for per io daemons
selftests: ublk: add functional test for per io daemons
selftests: ublk: kublk: decouple ublk_queues from ublk server threads
selftests: ublk: kublk: move per-thread data out of ublk_queue
selftests: ublk: kublk: lift queue initialization out of thread
selftests: ublk: kublk: tie sqe allocation to io instead of queue
selftests: ublk: kublk: plumb q_id in io_uring user_data
ublk: have a per-io daemon instead of a per-queue daemon
...
Linus Torvalds [Fri, 6 Jun 2025 20:09:03 +0000 (13:09 -0700)]
Merge tag 'io_uring-6.16-
20250606' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Fix for a regression introduced in this merge window, where the 'id'
passed to xa_find() for ifq lookup is uninitialized
- Fix for zcrx release on registration failure. From 6.15, going to
stable
- Tweak for recv bundles, where msg_inq should be > 1 before being used
to gate a retry event
- Pavel doesnt want to be a maintainer anymore, remove him from the
MAINTAINERS entry
- Limit legacy kbuf registrations to 64k, which is the size of the
buffer ID field anyway. Hence it's nonsensical to support more than
that, and the only purpose that serves is to have syzbot trigger long
exit delays for heavily configured debug kernels
- Fix for the io_uring futex handling, which got broken for
FUTEX2_PRIVATE by a generic futex commit adding private hashes
* tag 'io_uring-6.16-
20250606' of git://git.kernel.dk/linux:
io_uring/futex: mark wait requests as inflight
io_uring/futex: get rid of struct io_futex addr union
io_uring/kbuf: limit legacy provided buffer lists to USHRT_MAX
MAINTAINERS: remove myself from io_uring
io_uring/net: only consider msg_inq if larger than 1
io_uring/zcrx: fix area release on registration failure
io_uring/zcrx: init id for xa_find
Linus Torvalds [Fri, 6 Jun 2025 19:45:35 +0000 (12:45 -0700)]
Merge tag 'usb-6.16-rc1' of git://git./linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt changes for 6.16-rc1.
Included in here are the following:
- USB offload support for audio devices.
I think this takes the record for the most number of patch series
(30+) over the longest period of time (2+ years) to get merged
properly.
Many props go to Wesley Cheng for seeing this effort through, they
took a major out-of-tree hacked-up-monstrosity that was created by
multiple vendors for their specific devices, got it all merged into
a semi-coherent set of changes, and got all of the different major
subsystems to agree on how this should be implemented both with
changes to their code as well as userspace apis, AND wrangled the
hardware companies into agreeing to go forward with this, despite
making them all redo work they had already done in their private
device trees.
This feature offers major power savings on embedded devices where a
USB audio stream can continue to flow while the rest of the system
is sleeping, something that devices running on battery power really
care about. There are still some more small tweaks left to be done
here, and those patches are still out for review and arguing among
the different hardware companies, but this is a major step forward
and a great example of how to do upstream development well.
- small number of thunderbolt fixes and updates, things seem to be
slowing down here (famous last words...)
- xhci refactors and reworking to try to handle some rough corner
cases in some hardware implementations where things don't always
work properly
- typec driver updates
- USB3 power management reworking and updates
- Removal of some old and orphaned UDC gadget drivers that had not
been used in a very long time, dropping over 11 thousand lines from
the tree, always a nice thing, making up for the 12k lines added
for the USB offload feature.
- lots of little updates and fixes in different drivers
All of these have been in linux-next for over 2 weeks, the USB offload
logic has been in there for 8 weeks now, with no reported issues"
* tag 'usb-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (172 commits)
ALSA: usb-audio: qcom: fix USB_XHCI dependency
ASoC: qdsp6: fix compile-testing without CONFIG_OF
usb: misc: onboard_usb_dev: fix build warning for CONFIG_USB_ONBOARD_DEV_USB5744=n
usb: typec: tipd: fix typo in TPS_STATUS_HIGH_VOLAGE_WARNING macro
USB: typec: fix const issue in typec_match()
USB: gadget: udc: fix const issue in gadget_match_driver()
USB: gadget: fix up const issue with struct usb_function_instance
USB: serial: pl2303: add new chip PL2303GC-Q20 and PL2303GT-2AB
USB: serial: bus: fix const issue in usb_serial_device_match()
usb: usbtmc: Fix timeout value in get_stb
usb: usbtmc: Fix read_stb function and get_stb ioctl
ALSA: qc_audio_offload: try to reduce address space confusion
ALSA: qc_audio_offload: avoid leaking xfer_buf allocation
ALSA: qc_audio_offload: rename dma/iova/va/cpu/phys variables
ALSA: usb-audio: qcom: Fix an error handling path in qc_usb_audio_probe()
usb: misc: onboard_usb_dev: Fix usb5744 initialization sequence
dt-bindings: usb: ti,usb8041: Add binding for TI USB8044 hub controller
usb: misc: onboard_usb_dev: Add support for TI TUSB8044 hub
usb: gadget: lpc32xx_udc: Use USB API functions rather than constants
usb: gadget: epautoconf: Use USB API functions rather than constants
...
Linus Torvalds [Fri, 6 Jun 2025 19:32:02 +0000 (12:32 -0700)]
Merge tag 'tty-6.16-rc1' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here is the big set of tty and serial driver changes for 6.16-rc1.
A little more churn than normal in this portion of the kernel for this
development cycle, Jiri and Nicholas were busy with cleanups and
reviews and fixes for the vt unicode handling logic which composed
most of the overall work in here.
Major changes are:
- vt unicode changes/reverts/changes from Nicholas. This should help
out a lot with screen readers and others that rely on vt console
support
- lock guard additions to the core tty/serial code to clean up lots
of error handling logic
- 8250 driver updates and fixes
- device tree conversions to yaml
- sh-sci driver updates
- other small cleanups and updates for serial drivers and tty core
portions
All of these have been in linux-next for 2 weeks with no reported issues"
* tag 'tty-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (105 commits)
tty: serial: 8250_omap: fix TX with DMA for am33xx
vt: add VT_GETCONSIZECSRPOS to retrieve console size and cursor position
vt: bracketed paste support
vt: remove VT_RESIZE and VT_RESIZEX from vt_compat_ioctl()
vt: process the full-width ASCII fallback range programmatically
vt: make use of ucs_get_fallback() when glyph is unavailable
vt: add ucs_get_fallback()
vt: create ucs_fallback_table.h_shipped with gen_ucs_fallback_table.py
vt: introduce gen_ucs_fallback_table.py to create ucs_fallback_table.h
vt: move glyph determination to a separate function
vt: make sure displayed double-width characters are remembered as such
vt: ucs.c: fix misappropriate in_range() usage
serial: max3100: Replace open-coded parity calculation with parity8()
dt-bindings: serial: 8250_omap: Drop redundant properties
dt-bindings: serial: Convert socionext,milbeaut-usio-uart to DT schema
dt-bindings: serial: Convert microchip,pic32mzda-uart to DT schema
dt-bindings: serial: Convert arm,sbsa-uart to DT schema
dt-bindings: serial: Convert snps,arc-uart to DT schema
dt-bindings: serial: Convert marvell,armada-3700-uart to DT schema
dt-bindings: serial: Convert lantiq,asc to DT schema
...
Linus Torvalds [Fri, 6 Jun 2025 18:50:47 +0000 (11:50 -0700)]
Merge tag 'char-misc-6.16-rc1' of git://git./linux/kernel/git/gregkh/char-misc
Pull char / misc / iio driver updates from Greg KH:
"Here is the big char/misc/iio and other small driver subsystem pull
request for 6.16-rc1.
Overall, a lot of individual changes, but nothing major, just the
normal constant forward progress of new device support and cleanups to
existing subsystems. Highlights in here are:
- Large IIO driver updates and additions and device tree changes
- Android binder bugfixes and logfile fixes
- mhi driver updates
- comedi driver updates
- counter driver updates and additions
- coresight driver updates and additions
- echo driver removal as there are no in-kernel users of it
- nvmem driver updates
- spmi driver updates
- new amd-sbi driver "subsystem" and drivers added
- rust miscdriver binding documentation fix
- other small driver fixes and updates (uio, w1, acrn, hpet,
xillybus, cardreader drivers, fastrpc and others)
All of these have been in linux-next for quite a while with no
reported problems"
* tag 'char-misc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (390 commits)
binder: fix yet another UAF in binder_devices
counter: microchip-tcb-capture: Add watch validation support
dt-bindings: iio: adc: Add ROHM BD79100G
iio: adc: add support for Nuvoton NCT7201
dt-bindings: iio: adc: add NCT7201 ADCs
iio: chemical: Add driver for SEN0322
dt-bindings: trivial-devices: Document SEN0322
iio: adc: ad7768-1: reorganize driver headers
iio: bmp280: zero-init buffer
iio: ssp_sensors: optimalize -> optimize
HID: sensor-hub: Fix typo and improve documentation
iio: admv1013: replace redundant ternary operator with just len
iio: chemical: mhz19b: Fix error code in probe()
iio: adc: at91-sama5d2: use IIO_DECLARE_BUFFER_WITH_TS
iio: accel: sca3300: use IIO_DECLARE_BUFFER_WITH_TS
iio: adc: ad7380: use IIO_DECLARE_DMA_BUFFER_WITH_TS
iio: adc: ad4695: rename AD4695_MAX_VIN_CHANNELS
iio: adc: ad4695: use IIO_DECLARE_DMA_BUFFER_WITH_TS
iio: introduce IIO_DECLARE_BUFFER_WITH_TS macros
iio: make IIO_DMA_MINALIGN minimum of 8 bytes
...
Linus Torvalds [Fri, 6 Jun 2025 18:23:37 +0000 (11:23 -0700)]
Merge tag 'staging-6.16-rc1' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here is the "big" set of staging driver changes for 6.16-rc1. Included
in here are:
- gpib driver cleanups and updates.
This subsystem is _almost_ ready to be merged into the main portion
of the kernel tree. Hopefully should happen in the next kernel
merge cycle if all goes well.
- sm750fb driver cleanups
- rtl8723bs driver cleanups
- other small driver cleanups for coding style issues
All of these have been in for over 2 weeks with no reported issues"
* tag 'staging-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (203 commits)
staging: rtl8723bs: remove unnecessary braces for single statement blocks
staging: rtl8723bs: Removed multiple blank lines of rtw_pwrctrl.c
staging: sm750fb: rename `hw_sm750le_setBLANK`
staging: sm750fb: rename `hw_sm750_setBLANK`
staging: sm750fb: rename `hw_sm750_setColReg`
staging: sm750fb: rename `hw_sm750_crtc_setMode`
staging: sm750fb: rename `hw_sm750_crtc_checkMode`
staging: sm750fb: rename `hw_sm750_output_setMode`
staging: sm750fb: rename `hw_sm750le_deWait`
staging: sm750fb: rename `hw_sm750_deWait`
staging: sm750fb: rename `hw_sm750_initAccel`
staging: gpib: switch to kmalloc(sizeof(*status))
staging: gpib: Fix secondary address restriction
staging: gpib: Fix uapi include header guard name
staging: gpib: Avoid unused variable warning
staging: gpib: Declare driver entry points static
staging: gpib: Fix PCMCIA config identifier
staging: gpib: Avoid unused variable warnings
staging: gpib: Fix lpvo request_system_control
staging: sm750fb: rename sm750_hw_cursor_setData2
...
Linus Torvalds [Fri, 6 Jun 2025 16:26:47 +0000 (09:26 -0700)]
Merge tag 'drm-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel
Pull more drm fixes from Simona Vetter:
"Another small batch of drm fixes, this time with a different baseline
and hence separate.
Drivers:
- ivpu:
- dma_resv locking
- warning fixes
- reset failure handling
- improve logging
- update fw file names
- fix cmdqueue unregister
- panel-simple: add Evervision VGG644804
Core Changes:
- sysfb: screen_info type check
- video: screen_info for relocated pci fb
- drm/sched: signal fence of killed job
- dummycon: deferred takeover fix"
* tag 'drm-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel:
sysfb: Fix screen_info type check for VGA
video: screen_info: Relocate framebuffers behind PCI bridges
accel/ivpu: Fix warning in ivpu_gem_bo_free()
accel/ivpu: Trigger device recovery on engine reset/resume failure
accel/ivpu: Use dma_resv_lock() instead of a custom mutex
drm/panel-simple: fix the warnings for the Evervision VGG644804
accel/ivpu: Reorder Doorbell Unregister and Command Queue Destruction
accel/ivpu: Use firmware names from upstream repo
accel/ivpu: Improve buffer object logging
dummycon: Trigger redraw when switching consoles with deferred takeover
drm/scheduler: signal scheduled fence when kill job
Linus Torvalds [Fri, 6 Jun 2025 15:09:56 +0000 (08:09 -0700)]
Merge tag 'drm-next-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"This is pretty much two weeks worth of fixes, plus one thing that
might be considered next: amdkfd is now able to be enabled on risc-v
platforms.
Otherwise, amdgpu and xe with the majority of fixes, and then a
smattering all over.
panel:
- nt37801: fix IS_ERR
- nt37801: fix KConfig
connector:
- Fix null deref in HDMI audio helper.
bridge:
- analogix_dp: fixup clk-disable removal
nouveau:
- minor typo fix (',' vs ';')
msm:
- mailmap updates
i915:
- Fix the enabling/disabling of DP audio SDP splitting
- Fix PSR register definitions for ALPM
- Fix u32 overflow in SNPS PHY HDMI PLL setup
- Fix GuC pending message underflow when submit fails
- Fix GuC wakeref underflow race during reset
xe:
- Two documentation fixes
- A couple of vm init fixes
- Hwmon fixes
- Drop reduntant conversion to bool
- Fix CONFIG_INTEL_VSEC dependency
- Rework eviction rejection of bound external bos
- Stop re-submitting signalled jobs
- A couple of pxp fixes
- Add back a fix that got lost in a merge
- Create LRC bo without VM
- Fix for the above fix
amdgpu:
- UserQ fixes
- SMU 13.x fixes
- VCN fixes
- JPEG fixes
- Misc cleanups
- runtime pm fix
- DCN 4.0.1 fixes
- Misc display fixes
- ISP fix
- VRAM manager fix
- RAS fixes
- IP discovery fix
- Cleaner shader fix for GC 10.1.x
- OD fix
- Non-OLED panel fix
- Misc display fixes
- Brightness fixes
amdkfd:
- Enable CONFIG_HSA_AMD on RISCV
- SVM fix
- Misc cleanups
- Ref leak fix
- WPTR BO fix
radeon:
- Misc cleanups"
* tag 'drm-next-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel: (105 commits)
drm/nouveau/vfn/r535: Convert comma to semicolon
drm/xe: remove unmatched xe_vm_unlock() from __xe_exec_queue_init()
drm/xe: Create LRC BO without VM
drm/xe/guc_submit: add back fix
drm/xe/pxp: Clarify PXP queue creation behavior if PXP is not ready
drm/xe/pxp: Use the correct define in the set_property_funcs array
drm/xe/sched: stop re-submitting signalled jobs
drm/xe: Rework eviction rejection of bound external bos
drm/xe/vsec: fix CONFIG_INTEL_VSEC dependency
drm/xe: drop redundant conversion to bool
drm/xe/hwmon: Move card reactive critical power under channel card
drm/xe/hwmon: Add support to manage power limits though mailbox
drm/xe/vm: move xe_svm_init() earlier
drm/xe/vm: move rebind_work init earlier
MAINTAINERS: .mailmap: update Rob Clark's email address
mailmap: Update entry for Akhil P Oommen
MAINTAINERS: update my email address
MAINTAINERS: drop myself as maintainer
drm/i915/display: Fix u32 overflow in SNPS PHY HDMI PLL setup
drm/amd/display: Fix default DC and AC levels
...
Linus Torvalds [Fri, 6 Jun 2025 14:56:36 +0000 (07:56 -0700)]
Merge tag 'mips_6.16' of git://git./linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:
- Added support for EcoNet platform
- Added support for parallel CPU bring up on EyeQ
- Other cleanups and fixes
* tag 'mips_6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (23 commits)
MIPS: loongson2ef: lemote-2f: add missing function prototypes
MIPS: loongson2ef: cs5536: add missing function prototypes
MIPS: SMP: Move the AP sync point before the calibration delay
mips: econet: Fix incorrect Kconfig dependencies
MAINTAINERS: Add entry for newly added EcoNet platform.
mips: dts: Add EcoNet DTS with EN751221 and SmartFiber XP8421-B board
dt-bindings: vendor-prefixes: Add SmartFiber
mips: Add EcoNet MIPS platform support
dt-bindings: mips: Add EcoNet platform binding
MIPS: bcm63xx: nvram: avoid inefficient use of crc32_le_combine()
mips: dts: pic32: pic32mzda: Rename the sdhci nodename to match with common mmc-controller binding
MIPS: SMP: Move the AP sync point before the non-parallel aware functions
MIPS: Replace strcpy() with strscpy() in vpe_elfload()
MIPS: BCM63XX: Replace strcpy() with strscpy() in board_prom_init()
mips: ptrace: Improve code formatting and indentation
MIPS: SMP: Implement parallel CPU bring up for EyeQ
mips: Add -std= flag specified in KBUILD_CFLAGS to vdso CFLAGS
MIPS: Loongson64: Add missing '#interrupt-cells' for loongson64c_ls7a
mips: dts: realtek: Add MDIO controller
MIPS: txx9: gpio: use new line value setter callbacks
...
Simona Vetter [Fri, 6 Jun 2025 12:38:50 +0000 (14:38 +0200)]
Merge tag 'drm-misc-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
ivpu:
- gem: Use dma-resv lock
- gem. Fix a warning
- Trigger recovery on device engine reset/resume failure
panel:
- panel-simple: Fix settings for Evervision VGG644804
sysfb:
- Fix screen_info type check
video:
- Update screen_info for relocated PCI framebuffers
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20250606072853.GA13099@linux.fritz.box
Yao Zi [Thu, 5 Jun 2025 12:34:46 +0000 (20:34 +0800)]
platform/loongarch: laptop: Get brightness setting from EC on probe
Previously during driver probe, 1 is unconditionally taken as current
brightness value and set to props.brightness, which will be considered
as the brightness before suspend and restored to EC on resume. Since a
brightness value of 1 almost never matches EC's state on coldboot (my
laptop's EC defaults to 80), this causes surprising changes of screen
brightness on the first time of resume after coldboot.
Let's get brightness from EC and take it as the current brightness on
probe of the laptop driver to avoid the surprising behavior. Tested on
TongFang L860-T2 Loongson-3A5000 laptop.
Cc: stable@vger.kernel.org
Fixes:
6246ed09111f ("LoongArch: Add ACPI-based generic laptop driver")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Binbin Zhou [Thu, 5 Jun 2025 12:34:34 +0000 (20:34 +0800)]
LoongArch: dts: Add PWM support to Loongson-2K2000
The module is supported, enable it.
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Binbin Zhou [Thu, 5 Jun 2025 12:34:34 +0000 (20:34 +0800)]
LoongArch: dts: Add PWM support to Loongson-2K1000
The module is supported, enable it.
Also, add the pwm-fan and cooling-maps associated with it.
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Binbin Zhou [Thu, 5 Jun 2025 12:34:34 +0000 (20:34 +0800)]
LoongArch: dts: Add PWM support to Loongson-2K0500
The module is supported, enable it.
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Thomas Weißschuh [Thu, 5 Jun 2025 12:34:18 +0000 (20:34 +0800)]
LoongArch: vDSO: Correctly use asm parameters in syscall wrappers
The syscall wrappers use the "a0" register for two different register
variables, both the first argument and the return value. Here the "ret"
variable is used as both input and output while the argument register is
only used as input. Clang treats the conflicting input parameters as an
undefined behaviour and optimizes away the argument assignment.
The code seems to work by chance for the most part today but that may
change in the future. Specifically clock_gettime_fallback() fails with
clockids from 16 to 23, as implemented by the upcoming auxiliary clocks.
Switch the "ret" register variable to a pure output, similar to the
other architectures' vDSO code. This works in both clang and GCC.
Link: https://lore.kernel.org/lkml/20250602102825-42aa84f0-23f1-4d10-89fc-e8bbaffd291a@linutronix.de/
Link: https://lore.kernel.org/lkml/20250519082042.742926976@linutronix.de/
Fixes:
c6b99bed6b8f ("LoongArch: Add VDSO and VSYSCALL support")
Fixes:
18efd0b10e0f ("LoongArch: vDSO: Wire up getrandom() vDSO implementation")
Cc: stable@vger.kernel.org
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Viacheslav Dubeyko [Mon, 2 Jun 2025 18:49:56 +0000 (11:49 -0700)]
ceph: fix variable dereferenced before check in ceph_umount_begin()
smatch warnings:
fs/ceph/super.c:1042 ceph_umount_begin() warn: variable dereferenced before check 'fsc' (see line 1041)
vim +/fsc +1042 fs/ceph/super.c
void ceph_umount_begin(struct super_block *sb)
{
struct ceph_fs_client *fsc = ceph_sb_to_fs_client(sb);
doutc(fsc->client, "starting forced umount\n");
^^^^^^^^^^^
Dereferenced
if (!fsc)
^^^^
Checked too late.
return;
fsc->mount_state = CEPH_MOUNT_SHUTDOWN;
__ceph_umount_begin(fsc);
}
The VFS guarantees that the superblock is still
alive when it calls into ceph via ->umount_begin().
Finally, we don't need to check the fsc and
it should be valid. This patch simply removes
the fsc check.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/
202503280852.YDB3pxUY-lkp@intel.com/
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Simona Vetter [Fri, 6 Jun 2025 06:21:45 +0000 (08:21 +0200)]
Merge tag 'drm-misc-fixes-2025-05-28' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
drm-scheduler:
- signal scheduled fence when killing job
dummycon:
- trigger deferred takeover when switching consoles
ivpu:
- improve logging
- update firmware filenames
- reorder steps in command-queue unregistering
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20250528153550.GA21050@linux.fritz.box
Max Kellermann [Sun, 4 May 2025 18:08:31 +0000 (20:08 +0200)]
kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count
Expose a simple counter to userspace for monitoring tools.
(akpm:
2536c5c7d6ae added the documentation but the code changes were lost)
Link: https://lkml.kernel.org/r/20250504180831.4190860-3-max.kellermann@ionos.com
Fixes:
2536c5c7d6ae ("kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count")
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Cc: Core Minyard <cminyard@mvista.com>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: Max Kellermann <max.kellermann@ionos.com>
Cc: Song Liu <song@kernel.org>
Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Lorenzo Stoakes [Wed, 4 Jun 2025 16:31:39 +0000 (17:31 +0100)]
MAINTAINERS: add mm swap section
In furtherance of ongoing efforts to ensure people are aware of who
de-facto maintains/has an interest in specific parts of mm, as well trying
to avoid get_maintainers.pl listing only Andrew and the mailing list for
mm files - establish a swap memory management section and add relevant
maintainers/reviewers.
Link: https://lkml.kernel.org/r/20250604163139.126630-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Chris Li <chrisl@kernel.org>
Acked-by: Kairui Song <kasong@tencent.com>
Acked-by: Kemeng Shi <shikemeng@huaweicloud.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Barry Song <baohua@kernel.org>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Arnd Bergmann [Tue, 3 Jun 2025 07:53:07 +0000 (09:53 +0200)]
kmsan: test: add module description
Every module should have a description, and kbuild now warns for those
that don't.
WARNING: modpost: missing MODULE_DESCRIPTION() in mm/kmsan/kmsan_test.o
Link: https://lkml.kernel.org/r/20250603075323.1839608-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Macro Elver <elver@google.com>
Cc: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tal Zussman [Tue, 3 Jun 2025 22:50:17 +0000 (18:50 -0400)]
MAINTAINERS: add tlb trace events to MMU GATHER AND TLB INVALIDATION
The MMU GATHER AND TLB INVALIDATION entry lists other TLB-related files.
Add the tlb.h tracepoint file there as well.
Link: https://lore.kernel.org/linux-mm/ce048e11-f79d-44a6-bacc-46e1ebc34b24@redhat.com/
Link: https://lkml.kernel.org/r/20250603-tlb-maintainers-v1-1-726d193c6693@columbia.edu
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jann Horn [Tue, 27 May 2025 21:23:54 +0000 (23:23 +0200)]
mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race
huge_pmd_unshare() drops a reference on a page table that may have
previously been shared across processes, potentially turning it into a
normal page table used in another process in which unrelated VMAs can
afterwards be installed.
If this happens in the middle of a concurrent gup_fast(), gup_fast() could
end up walking the page tables of another process. While I don't see any
way in which that immediately leads to kernel memory corruption, it is
really weird and unexpected.
Fix it with an explicit broadcast IPI through tlb_remove_table_sync_one(),
just like we do in khugepaged when removing page tables for a THP
collapse.
Link: https://lkml.kernel.org/r/20250528-hugetlb-fixes-splitrace-v2-2-1329349bad1a@google.com
Link: https://lkml.kernel.org/r/20250527-hugetlb-fixes-splitrace-v1-2-f4136f5ec58a@google.com
Fixes:
39dde65c9940 ("[PATCH] shared page table for hugetlb page")
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jann Horn [Tue, 27 May 2025 21:23:53 +0000 (23:23 +0200)]
mm/hugetlb: unshare page tables during VMA split, not before
Currently, __split_vma() triggers hugetlb page table unsharing through
vm_ops->may_split(). This happens before the VMA lock and rmap locks are
taken - which is too early, it allows racing VMA-locked page faults in our
process and racing rmap walks from other processes to cause page tables to
be shared again before we actually perform the split.
Fix it by explicitly calling into the hugetlb unshare logic from
__split_vma() in the same place where THP splitting also happens. At that
point, both the VMA and the rmap(s) are write-locked.
An annoying detail is that we can now call into the helper
hugetlb_unshare_pmds() from two different locking contexts:
1. from hugetlb_split(), holding:
- mmap lock (exclusively)
- VMA lock
- file rmap lock (exclusively)
2. hugetlb_unshare_all_pmds(), which I think is designed to be able to
call us with only the mmap lock held (in shared mode), but currently
only runs while holding mmap lock (exclusively) and VMA lock
Backporting note:
This commit fixes a racy protection that was introduced in commit
b30c14cd6102 ("hugetlb: unshare some PMDs when splitting VMAs"); that
commit claimed to fix an issue introduced in 5.13, but it should actually
also go all the way back.
[jannh@google.com: v2]
Link: https://lkml.kernel.org/r/20250528-hugetlb-fixes-splitrace-v2-1-1329349bad1a@google.com
Link: https://lkml.kernel.org/r/20250528-hugetlb-fixes-splitrace-v2-0-1329349bad1a@google.com
Link: https://lkml.kernel.org/r/20250527-hugetlb-fixes-splitrace-v1-1-f4136f5ec58a@google.com
Fixes:
39dde65c9940 ("[PATCH] shared page table for hugetlb page")
Signed-off-by: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org> [b30c14cd6102: hugetlb: unshare some PMDs when splitting VMAs]
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alistair Popple [Fri, 30 May 2025 01:49:17 +0000 (11:49 +1000)]
MAINTAINERS: add Alistair as reviewer of mm memory policy
I'm particularly familiar with mm/migrate.c and especially
mm/migrate_device.c so add myself to MAINTAINERS.
Link: https://lkml.kernel.org/r/20250530014917.2946940-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Nitesh Shetty [Mon, 28 Apr 2025 09:58:48 +0000 (15:28 +0530)]
iov_iter: use iov_offset for length calculation in iov_iter_aligned_bvec
If iov_offset is non-zero, then we need to consider iov_offset in length
calculation, otherwise we might pass smaller IOs such as 512 bytes, in
below scenario [1].
This issue is reproducible using lib-uring test/fixed-seg.c application
with fixed buffer on a 512 LBA formatted device.
[1]
At present we pass the alignment check, for 512 LBA formatted devices,
len_mask = 511 when IO is smaller, i->count = 512 has an offset,
i->io_offset = 3584 with bvec values, bvec->bv_offset = 256,
bvec->bv_len = 3840. In short, the first 256 bytes are in the current
page, next 256 bytes are in the another page. Ideally we expect to
fail the IO.
I can think of 2 userspace scenarios where we experience this.
a: From userspace, we observe a different behaviour when device LBA
size is 512 vs 4096 bytes. For 4096 LBA formatted device, I see the
same liburing test [2] failing, whereas 512 the test passes without
this. This is reproducible everytime.
[2] https://github.com/axboe/liburing/
b: Although I was not able to reproduce the below condition, but I
suspect below case should be possible from user space for devices
with 512 LBA formatted device. Lets say from userspace while
allocating a virtually single chunk of memory, if we get 2 physical
chunk of memory, and IO happens to be at the boundary of first
physical chunk with length crossing first chunk, then we allow IOs
to proceed and hence we might map wrong physical address length and
proceed with IO rather than failing.
: --- a/test/fixed-seg.c
: +++ b/test/fixed-seg.c
: @@ -64,7 +64,7 @@ static int test(struct io_uring *ring, int fd, int
: vec_off)
: return T_EXIT_FAIL;
: }
:
: - ret = read_it(ring, fd, 4096, vec_off);
: + ret = read_it(ring, fd, 4096, 7*512 + 256);
: if (ret) {
: fprintf(stderr, "4096 0 failed\n");
: return T_EXIT_FAIL;
Effectively this is a write crossing the page boundary.
Link: https://lkml.kernel.org/r/20250428095849.11709-1-nj.shetty@samsung.com
Fixes:
2263639f96f2 ("iov_iter: streamline iovec/bvec alignment iteration")
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Keith Busch <kbusch@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joshua Hahn [Mon, 2 Jun 2025 16:23:39 +0000 (09:23 -0700)]
mm/mempolicy: fix incorrect freeing of wi_kobj
We should not free wi_group->wi_kobj here. In the error path of
add_weighted_interleave_group() where this snippet is called from,
kobj_{del, put} is immediately called right after this section. Thus, it
is not only unnecessary but also incorrect to free it here.
Link: https://lkml.kernel.org/r/20250602162345.2595696-1-joshua.hahnjy@gmail.com
Fixes:
e341f9c3c841 ("mm/mempolicy: Weighted Interleave Auto-tuning")
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202506011545.Fduxqxqj-lkp@intel.com/
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Suren Baghdasaryan [Wed, 21 May 2025 16:06:02 +0000 (09:06 -0700)]
alloc_tag: handle module codetag load errors as module load failures
Failures inside codetag_load_module() are currently ignored. As a result
an error there would not cause a module load failure and freeing of the
associated resources. Correct this behavior by propagating the error code
to the caller and handling possible errors. With this change, error to
allocate percpu counters, which happens at this stage, will not be ignored
and will cause a module load failure and freeing of resources. With this
change we also do not need to disable memory allocation profiling when
this error happens, instead we fail to load the module.
Link: https://lkml.kernel.org/r/20250521160602.1940771-1-surenb@google.com
Fixes:
10075262888b ("alloc_tag: allocate percpu counters for module tags dynamically")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: Casey Chen <cachen@purestorage.com>
Closes: https://lore.kernel.org/all/
20250520231620.15259-1-cachen@purestorage.com/
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: David Wang <00107082@163.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Luis Chamberalin <mcgrof@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SeongJae Park [Mon, 2 Jun 2025 17:49:26 +0000 (10:49 -0700)]
mm/madvise: handle madvise_lock() failure during race unwinding
When unwinding race on -ERESTARTNOINTR handling of process_madvise(),
madvise_lock() failure is ignored. Check the failure and abort remaining
works in the case.
Link: https://lkml.kernel.org/r/20250602174926.1074-1-sj@kernel.org
Fixes:
4000e3d0a367 ("mm/madvise: remove redundant mmap_lock operations from process_madvise()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: Barry Song <21cnbao@gmail.com>
Closes: https://lore.kernel.org/CAGsJ_4xJXXO0G+4BizhohSZ4yDteziPw43_uF8nPXPWxUVChzw@mail.gmail.com
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Barry Song <baohua@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kirill A. Shutemov [Thu, 29 May 2025 10:38:32 +0000 (13:38 +0300)]
mm: fix vmstat after removing NR_BOUNCE
Hongyu noticed that the nr_unaccepted counter kept growing even in the
absence of unaccepted memory on the machine.
This happens due to a commit that removed NR_BOUNCE: it removed the
counter from the enum zone_stat_item, but left it in the vmstat_text
array.
As a result, all counters below nr_bounce in /proc/vmstat are shifted by
one line, causing the numa_hit counter to be labeled as nr_unaccepted.
To fix this issue, remove nr_bounce from the vmstat_text array.
Link: https://lkml.kernel.org/r/20250529103832.2937460-1-kirill.shutemov@linux.intel.com
Fixes:
194df9f66db8 ("mm: remove NR_BOUNCE zone stat")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Hongyu Ning <hongyu.ning@linux.intel.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Lorenzo Stoakes [Mon, 19 May 2025 14:56:57 +0000 (15:56 +0100)]
KVM: s390: rename PROT_NONE to PROT_TYPE_DUMMY
The enum type prot_type declared in arch/s390/kvm/gaccess.c declares an
unfortunate identifier within it - PROT_NONE.
This clashes with the protection bit define from the uapi for mmap()
declared in include/uapi/asm-generic/mman-common.h, which is indeed what
those casually reading this code would assume this to refer to.
This means that any changes which subsequently alter headers in any way
which results in the uapi header being imported here will cause build
errors.
Resolve the issue by renaming PROT_NONE to PROT_TYPE_DUMMY.
Link: https://lkml.kernel.org/r/20250519145657.178365-1-lorenzo.stoakes@oracle.com
Fixes:
b3cefd6bf16e ("KVM: s390: Pass initialized arg even if unused")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202505140943.IgHDa9s7-lkp@intel.com/
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com>
Acked-by: Yang Shi <yang@os.amperecomputing.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>