Sakari Ailus [Fri, 4 Jul 2025 07:54:43 +0000 (10:54 +0300)]
pwm: img: Remove redundant pm_runtime_mark_last_busy() calls
pm_runtime_put_autosuspend(), pm_runtime_put_sync_autosuspend(),
pm_runtime_autosuspend() and pm_request_autosuspend() now include a call
to pm_runtime_mark_last_busy(). Remove the now-reduntant explicit call to
pm_runtime_mark_last_busy().
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20250704075443.3221370-1-sakari.ailus@linux.intel.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Wed, 9 Jul 2025 08:48:21 +0000 (10:48 +0200)]
Merge tag 'pm-runtime-6.17-rc1' of https://git./linux/kernel/git/rafael/linux-pm
Runtime PM updates related to autosuspend for 6.17
Make several autosuspend functions mark last busy stamp and update
the documentation accordingly (Sakari Ailus).
Michal Wilczynski [Wed, 2 Jul 2025 13:45:29 +0000 (15:45 +0200)]
pwm: Expose PWM_WFHWSIZE in public header
The WFHWSIZE constant defines the maximum size for the hardware-specific
waveform representation buffer. It is currently local to
drivers/pwm/core.c, which makes it inaccessible to external tools like
bindgen.
Move the constant to include/linux/pwm.h to make it part of the public
API. As part of this change, rename it to PWM_WFHWSIZE to follow
standard kernel conventions for namespacing macros in public headers.
This allows bindgen to automatically generate a corresponding constant
for the Rust PWM abstractions, ensuring the value remains synchronized
between the C core and Rust code and preventing future maintenance
issues.
Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
Link: https://lore.kernel.org/r/20250702-rust-next-pwm-working-fan-for-sending-v7-1-67ef39ff1d29@samsung.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Frank Li [Wed, 25 Jun 2025 16:19:08 +0000 (12:19 -0400)]
dt-bindings: pwm: Convert lpc32xx-pwm.txt to yaml format
Convert pc32xx-pwm.txt to yaml format.
Additional changes:
- add ref to pwm.yaml
- add clocks
- restrict #pwm-cells to 3
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20250625161909.2541315-1-Frank.Li@nxp.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Tue, 24 Jun 2025 10:05:00 +0000 (12:05 +0200)]
docs: pwm: Adapt Locking paragraph to reality
We have the distinction between pwm_apply_atomic() and
pwm_apply_might_sleep() since commit
c748a6d77c06 (pwm: Rename
pwm_apply_state() to pwm_apply_might_sleep()) contained in v6.8-rc1.
Locking in the core was introduced in commit
1cc2e1faafb3 ("pwm: Add
more locking", contained in v6.13-rc1) to serialize per-chip callbacks
and device removal.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20250624100500.1429163-2-u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Tue, 24 Jun 2025 18:15:44 +0000 (20:15 +0200)]
pwm: twl-led: Drop driver local locking
The pwm core already serializes .apply(). twl6030's .request() and .free()
are also already serialized against .apply() because there is only a single
PWM. So the mutex doesn't add any additional protection and can be dropped.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/c1c7f646190f7cb2fe43b10959aa8dade80cb79e.1750788649.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Tue, 24 Jun 2025 18:15:43 +0000 (20:15 +0200)]
pwm: sun4i: Drop driver local locking
The pwm core serializes calls to .apply(), so the driver lock doesn't
add any protection and can safely be dropped.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/87b71c46b82b787959f0cea314d3010f16a50a29.1750788649.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Tue, 24 Jun 2025 18:15:42 +0000 (20:15 +0200)]
pwm: sti: Drop driver local locking
The pwm core already serializes calls to .apply(), so the driver local
mutex adds no protection and can be dropped.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/7ad150e40b45d6cb16fee633dcd6390a49a327a1.1750788649.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Tue, 24 Jun 2025 18:15:41 +0000 (20:15 +0200)]
pwm: microchip-core: Drop driver local locking
The pwm core already serializes .apply() and .get_state(), so the driver
local lock is always free and adds no protection.
Drop it.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/6d6ef0376ea0058b040eec3b257e324493a083f1.1750788649.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Tue, 24 Jun 2025 18:15:40 +0000 (20:15 +0200)]
pwm: lpc18xx-sct: Drop driver local locking
Both mutexes are only used in one function each. These functions are only
called by the .apply() callback. As the .apply() calls are serialized by
the core since commit
1cc2e1faafb3 ("pwm: Add more locking") the mutexes
have no effect apart from runtime overhead. Drop them.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Link: https://lore.kernel.org/r/4f7a2da37adbfe4743564245119045826d86eca6.1750788649.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Tue, 24 Jun 2025 18:15:39 +0000 (20:15 +0200)]
pwm: fsl-ftm: Drop driver local locking
The pwm core serializes calls to .apply(), so the driver local lock isn't
needed for that. It only has the effect to serialize .apply() with
.request() and .free() for a different PWM, and .request() and .free
against each other. But given that .request and .free() only do a single
regmap operation under the lock and regmap itself serializes register
accesses, it might happen without the lock that the calls are interleaved
now, but affecting different PWMs, so nothing bad can happen.
So the mutex has no effect and can be dropped.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/6b72104e5e1823170c7c9664189cc0f2ca5c2347.1750788649.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Tue, 24 Jun 2025 18:15:38 +0000 (20:15 +0200)]
pwm: clps711x: Drop driver local locking
The pwm core serializes calls to .apply(), so the spinlock adds no
additional protection. Disabling the irq is also irrelevant as the driver
isn't an atomic one and so the callbacks cannot be called from atomic
context.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/d4931dc0c0d657d80722cfe7d97cb4fb4ccec90e.1750788649.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Tue, 24 Jun 2025 18:15:37 +0000 (20:15 +0200)]
pwm: atmel: Drop driver local locking
The two functions making use of the lock are only called transitively from
.apply(). Calls to .apply() are already serialized by the pwm core so the
lock in the driver has no effect and can safely be dropped.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/5ad3417aecd4dc6eca9699e21691e3725ea0bb87.1750788649.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Marek Vasut [Sun, 29 Jun 2025 22:07:20 +0000 (00:07 +0200)]
pwm: argon-fan-hat: Add Argon40 Fan HAT support
Add trivial PWM driver for Argon40 Fan HAT, which is a RaspberryPi
blower fan hat which can be controlled over I2C. Model this device
as a PWM, so the pwm-fan can be attached to it and handle thermal
zones and RPM management in a generic manner.
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Link: https://lore.kernel.org/r/20250629220757.936212-3-marek.vasut+renesas@mailbox.org
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Marek Vasut [Sun, 29 Jun 2025 22:07:19 +0000 (00:07 +0200)]
dt-bindings: pwm: argon40,fan-hat: Document Argon40 Fan HAT
Document trivial PWM on Argon40 Fan HAT, which is a RaspberryPi
blower fan hat which can be controlled over I2C.
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Link: https://lore.kernel.org/r/20250629220757.936212-2-marek.vasut+renesas@mailbox.org
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Marek Vasut [Sun, 29 Jun 2025 22:07:18 +0000 (00:07 +0200)]
dt-bindings: vendor-prefixes: Document Argon40
Argon 40 Technologies Limited is a SBC expansion board vendor.
Document the prefix. For details see https://argon40.com .
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Link: https://lore.kernel.org/r/20250629220757.936212-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
AngeloGioacchino Del Regno [Mon, 23 Jun 2025 12:01:18 +0000 (14:01 +0200)]
pwm: pwm-mediatek: Add support for PWM IP V3.0.2 in MT6991/MT8196
Add support for the PWM IP version 3.0.2, found in MediaTek's
Dimensity 9400 MT6991 and in the MT8196 Chromebook SoC: this
needs a new register offset array and also a different offset
for the PWM_CK_26M_SEL register.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20250623120118.109170-4-angelogioacchino.delregno@collabora.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
AngeloGioacchino Del Regno [Mon, 23 Jun 2025 12:01:17 +0000 (14:01 +0200)]
pwm: pwm-mediatek: Pass PWM_CK_26M_SEL from platform data
In preparation for adding support for new SoCs, remove variable
has_ck_26m_sel from pwm_mediatek_of_data and replace it with a
u16 pwm_ck_26m_sel_reg, meant to hold the register offset for
PWM_CK_26M_SEL.
Also, since the reg offset is guaranteed to never be zero, the
logic to check for "has_ck_26m_sel" is changed to check if the
register offset in pwm_ck_26m_sel_reg is more than zero.
Analogously, when writing, use the register offset from platform
data instead of using the PWM_CK_26M_SEL definition.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20250623120118.109170-3-angelogioacchino.delregno@collabora.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
AngeloGioacchino Del Regno [Mon, 23 Jun 2025 12:01:16 +0000 (14:01 +0200)]
dt-bindings: pwm: mediatek,mt2712-pwm: Add support for MT6991/MT8196
Add compatible strings for the MediaTek Dimensity 9400 MT6991 and
for the MT8196 Chromebook SoC, having the same PWM IP v3.0.2.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20250623120118.109170-2-angelogioacchino.delregno@collabora.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Frank Li [Mon, 16 Jun 2025 19:04:34 +0000 (15:04 -0400)]
dt-bindings: pwm: convert lpc1850-sct-pwm.txt to yaml format
Convert lpc1850-sct-pwm.txt to yaml format.
Additional changes:
- add ref pwm.yaml.
- add resets property to match existed dts.
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250616190435.1998078-1-Frank.Li@nxp.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Nicolas Frattaroli [Mon, 16 Jun 2025 15:14:17 +0000 (17:14 +0200)]
pwm: rockchip: Round period/duty down on apply, up on get
With CONFIG_PWM_DEBUG=y, the rockchip PWM driver produces warnings like
this:
rockchip-pwm
fd8b0010.pwm: .apply is supposed to round down
duty_cycle (requested: 23529/50000, applied: 23542/50000)
This is because the driver chooses ROUND_CLOSEST for purported
idempotency reasons. However, it's possible to keep idempotency while
always rounding down in .apply().
Do this by making .get_state() always round up, and making .apply()
always round down. This is done with u64 maths, and setting both period
and duty to U32_MAX (the biggest the hardware can support) if they would
exceed their 32 bits confines.
Fixes:
12f9ce4a5198 ("pwm: rockchip: Fix period and duty cycle approximation")
Fixes:
1ebb74cf3537 ("pwm: rockchip: Add support for hardware readout")
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Link: https://lore.kernel.org/r/20250616-rockchip-pwm-rounding-fix-v2-1-a9c65acad7b6@collabora.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Fabrice Gasnier [Fri, 10 Jan 2025 09:19:18 +0000 (10:19 +0100)]
pwm: stm32: add support for stm32mp25
Add support for STM32MP25 SoC. Use newly introduced compatible to handle
new features along with registers and bits diversity.
The MFD part of the driver fills in ipidr, so it is used to check the
hardware configuration register, when available to gather the number
of PWM channels and complementary outputs.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/20250110091922.980627-5-fabrice.gasnier@foss.st.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
David Lechner [Thu, 29 May 2025 16:53:18 +0000 (11:53 -0500)]
dt-bindings: pwm: adi,axi-pwmgen: Update documentation link
Change the documentation link to point to the location with the most
up-to-date information.
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://lore.kernel.org/r/20250529-pwm-axi-pwmgen-add-external-clock-v3-1-5d8809a7da91@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Longbin Li [Wed, 28 May 2025 10:11:38 +0000 (18:11 +0800)]
pwm: sophgo-sg2042: Add support for SG2044
Add PWM controller for SG2044 on base of SG2042.
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Tested-by: Chen Wang <unicorn_wang@outlook.com>
Signed-off-by: Longbin Li <looong.bin@gmail.com>
Link: https://lore.kernel.org/r/20250528101139.28702-4-looong.bin@gmail.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Longbin Li [Wed, 28 May 2025 10:11:37 +0000 (18:11 +0800)]
pwm: sophgo-sg2042: Reorganize the code structure
As the driver logic can be used in both SG2042 and SG2044, it
will be better to reorganize the code structure.
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Tested-by: Chen Wang <unicorn_wang@outlook.com>
Signed-off-by: Longbin Li <looong.bin@gmail.com>
Link: https://lore.kernel.org/r/20250528101139.28702-3-looong.bin@gmail.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Longbin Li [Wed, 28 May 2025 10:11:36 +0000 (18:11 +0800)]
dt-bindings: pwm: sophgo: Add pwm controller for SG2044
Add compatible string for PWM controller on SG2044.
Tested-by: Chen Wang <unicorn_wang@outlook.com>
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Longbin Li <looong.bin@gmail.com>
Link: https://lore.kernel.org/r/20250528101139.28702-2-looong.bin@gmail.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Nylon Chen [Thu, 29 May 2025 03:53:41 +0000 (11:53 +0800)]
pwm: sifive: Fix rounding and idempotency issues in apply and get_state
This fix ensures consistent rounding and avoids mismatches
between applied and reported PWM values that could trigger false
idempotency failures in debug checks
This change ensures:
- real_period is now calculated using DIV_ROUND_UP_ULL() to avoid underestimation.
- duty_cycle is rounded up to match the fractional computation in apply()
- apply() truncates the result to compensate for get_state's rounding up logic
These fixes resolve issues like:
.apply is supposed to round down duty_cycle (requested: 360/504000, applied: 361/504124)
.apply is not idempotent (ena=1 pol=0
1739692/
4032985) -> (ena=1 pol=0
1739630/
4032985)
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202505080303.dBfU5YMS-lkp@intel.com/
Co-developed-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Link: https://lore.kernel.org/r/20250529035341.51736-4-nylon.chen@sifive.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Nylon Chen [Thu, 29 May 2025 03:53:40 +0000 (11:53 +0800)]
pwm: sifive: Fix PWM algorithm and clarify inverted compare behavior
The `frac` variable represents the pulse inactive time, and the result
of this algorithm is the pulse active time. Therefore, we must reverse
the result.
Although the SiFive Reference Manual states "pwms >= pwmcmpX -> HIGH",
the hardware behavior is inverted due to a fixed XNOR with 0. As a result,
the pwmcmp register actually defines the low (inactive) portion of the pulse.
The reference is SiFive FU740-C000 Manual[0]
Link: https://sifive.cdn.prismic.io/sifive/1a82e600-1f93-4f41-b2d8-86ed8b16acba_fu740-c000-manual-v1p6.pdf
Co-developed-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Zong Li <zong.li@sifive.com>
Co-developed-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Link: https://lore.kernel.org/r/20250529035341.51736-3-nylon.chen@sifive.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Nylon Chen [Thu, 29 May 2025 03:53:39 +0000 (11:53 +0800)]
riscv: dts: sifive: unleashed/unmatched: Remove PWM controlled LED's active-low properties
This removes the active-low properties of the PWM-controlled LEDs in
the HiFive Unmatched device tree.
The reference is hifive-unleashed-a00.pdf[0] and hifive-unmatched-schematics-v3.pdf[1].
Link: https://sifive.cdn.prismic.io/sifive/c52a8e32-05ce-4aaf-95c8-7bf8453f8698_hifive-unleashed-a00-schematics-1.pdf
Link: https://sifive.cdn.prismic.io/sifive/6a06d6c0-6e66-49b5-8e9e-e68ce76f4192_hifive-unmatched-schematics-v3.pdf
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Link: https://lore.kernel.org/r/20250529035341.51736-2-nylon.chen@sifive.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Guodong Xu [Tue, 29 Apr 2025 08:50:47 +0000 (16:50 +0800)]
pwm: pxa: Allow to enable for SpacemiT K1 SoC
The SpacemiT K1 SoC uses devices similar to the ones on PXA SoCs. Add
ARCH_SPACEMIT as one of the possible architectures this driver can be
enabled for.
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Link: https://lore.kernel.org/r/20250429085048.1310409-6-guodong@riscstar.com
[ukleinek: reword commit log]
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Guodong Xu [Tue, 29 Apr 2025 08:50:44 +0000 (16:50 +0800)]
pwm: pxa: Add optional reset control
Support optional reset control for the PWM PXA driver.
During probe, it acquires the reset controller using
devm_reset_control_get_optional_exclusive_deasserted() to get and deassert
the reset controller to enable the PWM channel.
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Link: https://lore.kernel.org/r/20250429085048.1310409-3-guodong@riscstar.com
[ukleinek: Fix conflict with commit
df08fff8add2 ("pwm: pxa: Improve using dev_err_probe()")]
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Guodong Xu [Tue, 29 Apr 2025 08:50:43 +0000 (16:50 +0800)]
dt-bindings: pwm: marvell,pxa-pwm: Add SpacemiT K1 PWM support
The SpacemiT K1 SoC reuses the Marvell PXA910-compatible PWM controller
with one notable difference: the addition of a resets property. To make
the device tree pass schema validation (make dtbs_check W=3), this patch
updates the binding to accept spacemit,k1-pwm as a compatible string, when
used in conjunction with the fallback marvell,pxa910-pwm.
Support for the optional resets property is also added, as it is required
by the K1 integration but was not present in the original Marvell bindings.
Since the PWM reset line may be deasserted during the early bootloader
stage, making the resets property optional avoids potential
double-deassertion, which could otherwise cause flickering on displays
that use PWM for backlight control.
Additionally, this patch adjusts the required value of the #pwm-cells
property for the new compatible string:
- For "spacemit,k1-pwm", #pwm-cells must be set to 3.
- For existing Marvell compatibles, #pwm-cells remains 1.
Background of #pwm-cells change is by an ongoing community discussion
about increasing the #pwm-cells value from 1 to 3 for all Marvell PXA PWM
devices. These devices are currently the only ones whose bindings do not
pass the line index as the first argument. See [1] for further details.
[1] https://lore.kernel.org/all/cover.
1738842938.git.u.kleine-koenig@baylibre.com/
Reviewed-by: Rob Herring (Arm) <robh@kernel.org> # v2
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Link: https://lore.kernel.org/r/20250429085048.1310409-2-guodong@riscstar.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Wed, 30 Apr 2025 11:56:01 +0000 (13:56 +0200)]
pwm: Add support for pwmchip devices for faster and easier userspace access
With this change each pwmchip defining the new-style waveform callbacks
can be accessed from userspace via a character device. Compared to the
sysfs-API this is faster and allows to pass the whole configuration in a
single ioctl allowing atomic application and thus reducing glitches.
On an STM32MP13 I see:
root@DistroKit:~ time pwmtestperf
real 0m 1.27s
user 0m 0.02s
sys 0m 1.21s
root@DistroKit:~ rm /dev/pwmchip0
root@DistroKit:~ time pwmtestperf
real 0m 3.61s
user 0m 0.27s
sys 0m 3.26s
pwmtestperf does essentially:
for i in 0 .. 50000:
pwm_set_waveform(duty_length_ns=i, period_length_ns=50000, duty_offset_ns=0)
and in the presence of /dev/pwmchip0 is uses the ioctls introduced here,
without that device it uses /sys/class/pwm/pwmchip0.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/ad4a4e49ae3f8ea81e23cac1ac12b338c3bf5c5b.1746010245.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Fri, 4 Jul 2025 17:27:27 +0000 (19:27 +0200)]
pwm: mediatek: Ensure to disable clocks in error path
After enabling the clocks each error path must disable the clocks again.
One of them failed to do so. Unify the error paths to use goto to make it
harder for future changes to add a similar bug.
Fixes:
7ca59947b5fc ("pwm: mediatek: Prevent divide-by-zero in pwm_mediatek_config()")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20250704172728.626815-2-u.kleine-koenig@baylibre.com
Cc: stable@vger.kernel.org
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Uwe Kleine-König [Fri, 4 Jul 2025 17:24:17 +0000 (19:24 +0200)]
pwm: Fix invalid state detection
Commit
9dd42d019e63 ("pwm: Allow pwm state transitions from an invalid
state") intended to allow some state transitions that were not allowed
before. The idea is sane and back then I also got the code comment
right, but the check for enabled is bogus. This resulted in state
transitions for enabled states to be allowed to have invalid duty/period
settings and thus it can happen that low-level drivers get requests for
invalid states🙄.
Invert the check to allow state transitions for disabled states only.
Fixes:
9dd42d019e63 ("pwm: Allow pwm state transitions from an invalid state")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20250704172416.626433-2-u.kleine-koenig@baylibre.com
Cc: stable@vger.kernel.org
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Sakari Ailus [Mon, 16 Jun 2025 06:12:12 +0000 (09:12 +0300)]
Documentation: PM: *_autosuspend() functions update last busy time
Document that the *_autosuspend() variants of the Runtime PM functions
update the last busy timestamp.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patch.msgid.link/20250616061212.2286741-7-sakari.ailus@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sakari Ailus [Mon, 16 Jun 2025 06:12:11 +0000 (09:12 +0300)]
PM: runtime: Mark last busy stamp in pm_request_autosuspend()
Set device's last busy timestamp to current time in
pm_request_autosuspend().
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patch.msgid.link/20250616061212.2286741-6-sakari.ailus@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sakari Ailus [Mon, 16 Jun 2025 06:12:10 +0000 (09:12 +0300)]
PM: runtime: Mark last busy stamp in pm_runtime_autosuspend()
Set device's last busy timestamp to current time in
pm_runtime_autosuspend().
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patch.msgid.link/20250616061212.2286741-5-sakari.ailus@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sakari Ailus [Mon, 16 Jun 2025 06:12:09 +0000 (09:12 +0300)]
PM: runtime: Mark last busy stamp in pm_runtime_put_sync_autosuspend()
Set device's last busy timestamp to current time in
pm_runtime_put_sync_autosuspend().
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patch.msgid.link/20250616061212.2286741-4-sakari.ailus@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sakari Ailus [Mon, 16 Jun 2025 06:12:08 +0000 (09:12 +0300)]
PM: runtime: Mark last busy stamp in pm_runtime_put_autosuspend()
Set device's last busy timestamp to current time in
pm_runtime_put_autosuspend(). Callers wishing not to do that will need to
use __pm_runtime_put_autosuspend().
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patch.msgid.link/20250616061212.2286741-3-sakari.ailus@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sakari Ailus [Mon, 16 Jun 2025 06:12:07 +0000 (09:12 +0300)]
PM: runtime: Document return values of suspend-related API functions
Document return values for device suspend and idle related API
functions.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://patch.msgid.link/20250616061212.2286741-2-sakari.ailus@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Sun, 15 Jun 2025 20:49:41 +0000 (13:49 -0700)]
Linux 6.16-rc2
Linus Torvalds [Sun, 15 Jun 2025 16:14:27 +0000 (09:14 -0700)]
Merge tag 'kbuild-fixes-v6.16' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Move warnings about linux/export.h from W=1 to W=2
- Fix structure type overrides in gendwarfksyms
* tag 'kbuild-fixes-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
gendwarfksyms: Fix structure type overrides
kbuild: move warnings about linux/export.h from W=1 to W=2
Sami Tolvanen [Sat, 14 Jun 2025 00:55:33 +0000 (00:55 +0000)]
gendwarfksyms: Fix structure type overrides
As we always iterate through the entire die_map when expanding
type strings, recursively processing referenced types in
type_expand_child() is not actually necessary. Furthermore,
the type_string kABI rule added in commit
c9083467f7b9
("gendwarfksyms: Add a kABI rule to override type strings") can
fail to override type strings for structures due to a missing
kabi_get_type_string() check in this function.
Fix the issue by dropping the unnecessary recursion and moving
the override check to type_expand(). Note that symbol versions
are otherwise unchanged with this patch.
Fixes:
c9083467f7b9 ("gendwarfksyms: Add a kABI rule to override type strings")
Reported-by: Giuliano Procida <gprocida@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Masahiro Yamada [Thu, 12 Jun 2025 16:08:48 +0000 (01:08 +0900)]
kbuild: move warnings about linux/export.h from W=1 to W=2
This hides excessive warnings, as nobody builds with W=2.
Fixes:
a934a57a42f6 ("scripts/misc-check: check missing #include <linux/export.h> when W=1")
Fixes:
7d95680d64ac ("scripts/misc-check: check unnecessary #include <linux/export.h> when W=1")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Linus Torvalds [Sat, 14 Jun 2025 17:13:32 +0000 (10:13 -0700)]
Merge tag 'v6.16-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- SMB3.1.1 POSIX extensions fix for char remapping
- Fix for repeated directory listings when directory leases enabled
- deferred close handle reuse fix
* tag 'v6.16-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: improve directory cache reuse for readdir operations
smb: client: fix perf regression with deferred closes
smb: client: disable path remapping with POSIX extensions
Linus Torvalds [Sat, 14 Jun 2025 17:01:47 +0000 (10:01 -0700)]
Merge tag 'iommu-fixes-v6.16-rc1' of git://git./linux/kernel/git/iommu/linux
Pull iommu fix from Joerg Roedel:
- Fix PTE size calculation for NVidia Tegra
* tag 'iommu-fixes-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu/tegra: Fix incorrect size calculation
Linus Torvalds [Sat, 14 Jun 2025 16:25:22 +0000 (09:25 -0700)]
Merge tag 'block-6.16-
20250614' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- Fix for a deadlock on queue freeze with zoned writes
- Fix for zoned append emulation
- Two bio folio fixes, for sparsemem and for very large folios
- Fix for a performance regression introduced in 6.13 when plug
insertion was changed
- Fix for NVMe passthrough handling for polled IO
- Document the ublk auto registration feature
- loop lockdep warning fix
* tag 'block-6.16-
20250614' of git://git.kernel.dk/linux:
nvme: always punt polled uring_cmd end_io work to task_work
Documentation: ublk: Separate UBLK_F_AUTO_BUF_REG fallback behavior sublists
block: Fix bvec_set_folio() for very large folios
bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAP
block: use plug request list tail for one-shot backmerge attempt
block: don't use submit_bio_noacct_nocheck in blk_zone_wplug_bio_work
block: Clear BIO_EMULATES_ZONE_APPEND flag on BIO completion
ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG)
loop: move lo_set_size() out of queue freeze
Linus Torvalds [Sat, 14 Jun 2025 15:44:54 +0000 (08:44 -0700)]
Merge tag 'io_uring-6.16-
20250614' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Fix for a race between SQPOLL exit and fdinfo reading.
It's slim and I was only able to reproduce this with an artificial
delay in the kernel. Followup sparse fix as well to unify the access
to ->thread.
- Fix for multiple buffer peeking, avoiding truncation if possible.
- Run local task_work for IOPOLL reaping when the ring is exiting.
This currently isn't done due to an assumption that polled IO will
never need task_work, but a fix on the block side is going to change
that.
* tag 'io_uring-6.16-
20250614' of git://git.kernel.dk/linux:
io_uring: run local task_work from ring exit IOPOLL reaping
io_uring/kbuf: don't truncate end buffer for multiple buffer peeks
io_uring: consistently use rcu semantics with sqpoll thread
io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()
Linus Torvalds [Sat, 14 Jun 2025 15:38:34 +0000 (08:38 -0700)]
Merge tag 'rust-fixes-6.16' of git://git./linux/kernel/git/ojeda/linux
Pull Rust fix from Miguel Ojeda:
- 'hrtimer': fix future compile error when the 'impl_has_hr_timer!'
macro starts to get called
* tag 'rust-fixes-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
rust: time: Fix compile error in impl_has_hr_timer macro
Linus Torvalds [Sat, 14 Jun 2025 15:18:09 +0000 (08:18 -0700)]
Merge tag 'mm-hotfixes-stable-2025-06-13-21-56' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"9 hotfixes. 3 are cc:stable and the remainder address post-6.15 issues
or aren't considered necessary for -stable kernels. Only 4 are for MM"
* tag 'mm-hotfixes-stable-2025-06-13-21-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: add mmap_prepare() compatibility layer for nested file systems
init: fix build warnings about export.h
MAINTAINERS: add Barry as a THP reviewer
drivers/rapidio/rio_cm.c: prevent possible heap overwrite
mm: close theoretical race where stale TLB entries could linger
mm/vma: reset VMA iterator on commit_merge() OOM failure
docs: proc: update VmFlags documentation in smaps
scatterlist: fix extraneous '@'-sign kernel-doc notation
selftests/mm: skip failed memfd setups in gup_longterm
Linus Torvalds [Fri, 13 Jun 2025 23:49:39 +0000 (16:49 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"All fixes for drivers.
The core change in the error handler is simply to translate an ALUA
specific sense code into a retry the ALUA components can handle and
won't impact any other devices"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: error: alua: I/O errors for ALUA state transitions
scsi: storvsc: Increase the timeouts to storvsc_timeout
scsi: s390: zfcp: Ensure synchronous unit_add
scsi: iscsi: Fix incorrect error path labels for flashnode operations
scsi: mvsas: Fix typos in per-phy comments and SAS cmd port registers
scsi: core: ufs: Fix a hang in the error handler
Linus Torvalds [Fri, 13 Jun 2025 23:27:27 +0000 (16:27 -0700)]
Merge tag 'drm-fixes-2025-06-14' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Quiet week, only two pull requests came my way, xe has a couple of
fixes and then a bunch of fixes across the board, vc4 probably fixes
the biggest problem:
vc4:
- Fix infinite EPROBE_DEFER loop in vc4 probing
amdxdna:
- Fix amdxdna firmware size
meson:
- modesetting fixes
sitronix:
- Kconfig fix for st7171-i2c
dma-buf:
- Fix -EBUSY WARN_ON_ONCE in dma-buf
udmabuf:
- Use dma_sync_sgtable_for_cpu in udmabuf
xe:
- Fix regression disallowing 64K SVM migration
- Use a bounce buffer for WA BB"
* tag 'drm-fixes-2025-06-14' of https://gitlab.freedesktop.org/drm/kernel:
drm/xe/lrc: Use a temporary buffer for WA BB
udmabuf: use sgtable-based scatterlist wrappers
dma-buf: fix compare in WARN_ON_ONCE
drm/sitronix: st7571-i2c: Select VIDEOMODE_HELPERS
drm/meson: fix more rounding issues with 59.94Hz modes
drm/meson: use vclk_freq instead of pixel_freq in debug print
drm/meson: fix debug log statement when setting the HDMI clocks
drm/vc4: fix infinite EPROBE_DEFER loop
drm/xe/svm: Fix regression disallowing 64K SVM migration
accel/amdxdna: Fix incorrect PSP firmware size
Jens Axboe [Fri, 13 Jun 2025 21:24:53 +0000 (15:24 -0600)]
io_uring: run local task_work from ring exit IOPOLL reaping
In preparation for needing to shift NVMe passthrough to always use
task_work for polled IO completions, ensure that those are suitably
run at exit time. See commit:
9ce6c9875f3e ("nvme: always punt polled uring_cmd end_io work to task_work")
for details on why that is necessary.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 13 Jun 2025 19:37:41 +0000 (13:37 -0600)]
nvme: always punt polled uring_cmd end_io work to task_work
Currently NVMe uring_cmd completions will complete locally, if they are
polled. This is done because those completions are always invoked from
task context. And while that is true, there's no guarantee that it's
invoked under the right ring context, or even task. If someone does
NVMe passthrough via multiple threads and with a limited number of
poll queues, then ringA may find completions from ringB. For that case,
completing the request may not be sound.
Always just punt the passthrough completions via task_work, which will
redirect the completion, if needed.
Cc: stable@vger.kernel.org
Fixes:
585079b6e425 ("nvme: wire up async polling for io passthrough commands")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 13 Jun 2025 20:39:15 +0000 (13:39 -0700)]
Merge tag 'acpi-6.16-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix an ACPI APEI error injection driver failure that started to
occur after switching it over to using a faux device, address an EC
driver issue related to invalid ECDT tables, clean up the usage of
mwait_idle_with_hints() in the ACPI PAD driver, add a new IRQ override
quirk, and fix a NULL pointer dereference related to nosmp:
- Update the faux device handling code in the driver core and address
an ACPI APEI error injection driver failure that started to occur
after switching it over to using a faux device on top of that (Dan
Williams)
- Update data types of variables passed as arguments to
mwait_idle_with_hints() in the ACPI PAD (processor aggregator
device) driver to match the function definition after recent
changes (Uros Bizjak)
- Fix a NULL pointer dereference in the ACPI CPPC library that occurs
when nosmp is passed to the kernel in the command line (Yunhui Cui)
- Ignore ECDT tables with an invalid ID string to prevent using an
incorrect GPE for signaling events on some systems (Armin Wolf)
- Add a new IRQ override quirk for MACHENIKE 16P (Wentao Guan)"
* tag 'acpi-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: resource: Use IRQ override on MACHENIKE 16P
ACPI: EC: Ignore ECDT tables with an invalid ID string
ACPI: CPPC: Fix NULL pointer dereference when nosmp is used
ACPI: PAD: Update arguments of mwait_idle_with_hints()
ACPI: APEI: EINJ: Do not fail einj_init() on faux_device_create() failure
driver core: faux: Quiet probe failures
driver core: faux: Suppress bind attributes
Linus Torvalds [Fri, 13 Jun 2025 20:27:41 +0000 (13:27 -0700)]
Merge tag 'pm-6.16-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix the cpupower utility installation, fix up the recently added
Rust abstractions for cpufreq and OPP, restore the x86 update
eliminating mwait_play_dead_cpuid_hint() that has been reverted during
the 6.16 merge window along with preventing the failure caused by it
from happening, and clean up mwait_idle_with_hints() usage in
intel_idle:
- Implement CpuId Rust abstraction and use it to fix doctest failure
related to the recently introduced cpumask abstraction (Viresh
Kumar)
- Do minor cleanups in the `# Safety` sections for cpufreq
abstractions added recently (Viresh Kumar)
- Unbreak cpupower systemd service units installation on some systems
by adding a unitdir variable for specifying the location to install
them (Francesco Poli)
- Eliminate mwait_play_dead_cpuid_hint() again after reverting its
elimination during the 6.16 merge window due to a problem with
handling "dead" SMT siblings, but this time prevent leaving them in
C1 after initialization by taking them online and back offline when
a proper cpuidle driver for the platform has been registered
(Rafael Wysocki)
- Update data types of variables passed as arguments to
mwait_idle_with_hints() to match the function definition after
recent changes (Uros Bizjak)"
* tag 'pm-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
rust: cpu: Add CpuId::current() to retrieve current CPU ID
rust: Use CpuId in place of raw CPU numbers
rust: cpu: Introduce CpuId abstraction
intel_idle: Update arguments of mwait_idle_with_hints()
cpufreq: Convert `/// SAFETY` lines to `# Safety` sections
cpupower: split unitdir from libdir in Makefile
Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()"
ACPI: processor: Rescan "dead" SMT siblings during initialization
intel_idle: Rescan "dead" SMT siblings during initialization
x86/smp: PM/hibernate: Split arch_resume_nosmt()
intel_idle: Use subsys_initcall_sync() for initialization
Rafael J. Wysocki [Fri, 13 Jun 2025 19:55:30 +0000 (21:55 +0200)]
Merge branches 'acpi-pad', 'acpi-cppc', 'acpi-ec' and 'acpi-resource'
Merge assorted ACPI updates for 6.16-rc2:
- Update data types of variables passed as arguments to
mwait_idle_with_hints() in the ACPI PAD (processor aggregator device)
driver to match the function definition after recent changes (Uros
Bizjak).
- Fix a NULL pointer dereference in the ACPI CPPC library that occurs
when nosmp is passed to the kernel in the command line (Yunhui Cui).
- Ignore ECDT tables with an invalid ID string to prevent using an
incorrect GPE for signaling events on some systems (Armin Wolf).
- Add a new IRQ override quirk for MACHENIKE 16P (Wentao Guan).
* acpi-pad:
ACPI: PAD: Update arguments of mwait_idle_with_hints()
* acpi-cppc:
ACPI: CPPC: Fix NULL pointer dereference when nosmp is used
* acpi-ec:
ACPI: EC: Ignore ECDT tables with an invalid ID string
* acpi-resource:
ACPI: resource: Use IRQ override on MACHENIKE 16P
Rafael J. Wysocki [Fri, 13 Jun 2025 19:28:07 +0000 (21:28 +0200)]
Merge branch 'pm-cpuidle'
Merge cpuidle updates for 6.16-rc2:
- Update data types of variables passed as arguments to
mwait_idle_with_hints() to match the function definition
after recent changes (Uros Bizjak).
- Eliminate mwait_play_dead_cpuid_hint() again after reverting its
elimination during the merge window due to a problem with handling
"dead" SMT siblings, but this time prevent leaving them in C1 after
initialization by taking them online and back offline when a proper
cpuidle driver for the platform has been registered (Rafael Wysocki).
* pm-cpuidle:
intel_idle: Update arguments of mwait_idle_with_hints()
Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()"
ACPI: processor: Rescan "dead" SMT siblings during initialization
intel_idle: Rescan "dead" SMT siblings during initialization
x86/smp: PM/hibernate: Split arch_resume_nosmt()
intel_idle: Use subsys_initcall_sync() for initialization
Rafael J. Wysocki [Fri, 13 Jun 2025 19:25:38 +0000 (21:25 +0200)]
Merge branch 'pm-tools'
Merge a cpupower utility fix for 6.16-rc2 that unbreaks systemd service
units installation on some sysems (Francesco Poli).
* pm-tools:
cpupower: split unitdir from libdir in Makefile
Linus Torvalds [Fri, 13 Jun 2025 18:01:44 +0000 (11:01 -0700)]
Merge tag 'spi-fix-v6.16-rc1' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A collection of driver specific fixes, most minor apart from the OMAP
ones which disable some recent performance optimisations in some
non-standard cases where we could start driving the bus incorrectly.
The change to the stm32-ospi driver to use the newer reset APIs is a
fix for interactions with other IP sharing the same reset line in some
SoCs"
* tag 'spi-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-pci1xxxx: Drop MSI-X usage as unsupported by DMA engine
spi: stm32-ospi: clean up on error in probe()
spi: stm32-ospi: Make usage of reset_control_acquire/release() API
spi: offload: check offload ops existence before disabling the trigger
spi: spi-pci1xxxx: Fix error code in probe
spi: loongson: Fix build warnings about export.h
spi: omap2-mcspi: Disable multi-mode when the previous message kept CS asserted
spi: omap2-mcspi: Disable multi mode when CS should be kept asserted after message
Linus Torvalds [Fri, 13 Jun 2025 18:00:19 +0000 (11:00 -0700)]
Merge tag 'regulator-fix-v6.16-rc1' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"One minor fix for a leak in the DT parsing code in the max20086 driver"
* tag 'regulator-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: max20086: Fix refcount leak in max20086_parse_regulators_dt()
Oleg Nesterov [Fri, 13 Jun 2025 17:26:50 +0000 (19:26 +0200)]
posix-cpu-timers: fix race between handle_posix_cpu_timers() and posix_cpu_timer_del()
If an exiting non-autoreaping task has already passed exit_notify() and
calls handle_posix_cpu_timers() from IRQ, it can be reaped by its parent
or debugger right after unlock_task_sighand().
If a concurrent posix_cpu_timer_del() runs at that moment, it won't be
able to detect timer->it.cpu.firing != 0: cpu_timer_task_rcu() and/or
lock_task_sighand() will fail.
Add the tsk->exit_state check into run_posix_cpu_timers() to fix this.
This fix is not needed if CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y, because
exit_task_work() is called before exit_notify(). But the check still
makes sense, task_work_add(&tsk->posix_cputimers_work.work) will fail
anyway in this case.
Cc: stable@vger.kernel.org
Reported-by: Benoît Sevens <bsevens@google.com>
Fixes:
0bdd2ed4138e ("sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 13 Jun 2025 17:51:11 +0000 (10:51 -0700)]
Merge tag 'trace-v6.16-rc1' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
- Do not free "head" variable in filter_free_subsystem_filters()
The first error path jumps to "free_now" label but first frees the
newly allocated "head" variable. But the "free_now" code checks this
variable, and if it is not NULL, it will iterate the list. As this
list variable was already initialized, the "free_now" code will not
do anything as it is empty. But freeing it will cause a UAF bug.
The error path should simply jump to the "free_now" label and leave
the "head" variable alone.
* tag 'trace-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Do not free "head" on error path of filter_free_subsystem_filters()
Linus Torvalds [Fri, 13 Jun 2025 17:05:31 +0000 (10:05 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Rework of system register accessors for system registers that are
directly writen to memory, so that sanitisation of the in-memory
value happens at the correct time (after the read, or before the
write). For convenience, RMW-style accessors are also provided.
- Multiple fixes for the so-called "arch-timer-edge-cases' selftest,
which was always broken.
x86:
- Make KVM_PRE_FAULT_MEMORY stricter for TDX, allowing userspace to
pass only the "untouched" addresses and flipping the shared/private
bit in the implementation.
- Disable SEV-SNP support on initialization failure
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/mmu: Reject direct bits in gpa passed to KVM_PRE_FAULT_MEMORY
KVM: x86/mmu: Embed direct bits into gpa for KVM_PRE_FAULT_MEMORY
KVM: SEV: Disable SEV-SNP support on initialization failure
KVM: arm64: selftests: Determine effective counter width in arch_timer_edge_cases
KVM: arm64: selftests: Fix xVAL init in arch_timer_edge_cases
KVM: arm64: selftests: Fix thread migration in arch_timer_edge_cases
KVM: arm64: selftests: Fix help text for arch_timer_edge_cases
KVM: arm64: Make __vcpu_sys_reg() a pure rvalue operand
KVM: arm64: Don't use __vcpu_sys_reg() to get the address of a sysreg
KVM: arm64: Add RMW specific sysreg accessor
KVM: arm64: Add assignment-specific sysreg accessor
Jens Axboe [Fri, 13 Jun 2025 17:01:49 +0000 (11:01 -0600)]
io_uring/kbuf: don't truncate end buffer for multiple buffer peeks
If peeking a bunch of buffers, normally io_ring_buffers_peek() will
truncate the end buffer. This isn't optimal as presumably more data will
be arriving later, and hence it's better to stop with the last full
buffer rather than truncate the end buffer.
Cc: stable@vger.kernel.org
Fixes:
35c8711c8fc4 ("io_uring/kbuf: add helpers for getting/peeking multiple buffers")
Reported-by: Christian Mazakas <christian.mazakas@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 13 Jun 2025 16:59:29 +0000 (09:59 -0700)]
Merge tag 'v6.16-p4' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"Fix a broken self-test in hkdf (new regression)"
* tag 'v6.16-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: hkdf - move to late_initcall
Linus Torvalds [Fri, 13 Jun 2025 16:49:07 +0000 (09:49 -0700)]
Merge tag 'bcachefs-2025-06-12' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"As usual, highlighting the ones users have been noticing:
- Fix a small issue with has_case_insensitive not being propagated on
snapshot creation; this led to fsck errors, which we're harmless
because we're not using this flag yet (it's for overlayfs +
casefolding).
- Log the error being corrected in the journal when we're doing fsck
repair: this was one of the "lessons learned" from the i_nlink 0 ->
subvolume deletion bug, where reconstructing what had happened by
analyzing the journal was a bit more difficult than it needed to
be.
- Don't schedule btree node scan to run in the superblock: this fixes
a regression from the 6.16 recovery passes rework, and let to it
running unnecessarily.
The real issue here is that we don't have online, "self healing"
style topology repair yet: topology repair currently has to run
before we go RW, which means that we may schedule it unnecessarily
after a transient error. This will be fixed in the future.
- We now track, in btree node flags, the reason it was scheduled to
be rewritten. We discovered a deadlock in recovery when many btree
nodes need to be rewritten because they're degraded: fully fixing
this will take some work but it's now easier to see what's going
on.
For the bug report where this came up, a device had been kicked RO
due to transient errors: manually setting it back to RW was
sufficient to allow recovery to succeed.
- Mark a few more fsck errors as autofix: as a reminder to users,
please do keep reporting cases where something needs to be repaired
and is not repaired automatically (i.e. cases where -o fix_errors
or fsck -y is required).
- rcu_pending.c now works with PREEMPT_RT
- 'bcachefs device add', then umount, then remount wasn't working -
we now emit a uevent so that the new device's new superblock is
correctly picked up
- Assorted repair fixes: btree node scan will no longer incorrectly
update sb->version_min,
- Assorted syzbot fixes"
* tag 'bcachefs-2025-06-12' of git://evilpiepirate.org/bcachefs: (23 commits)
bcachefs: Don't trace should_be_locked unless changing
bcachefs: Ensure that snapshot creation propagates has_case_insensitive
bcachefs: Print devices we're mounting on multi device filesystems
bcachefs: Don't trust sb->nr_devices in members_to_text()
bcachefs: Fix version checks in validate_bset()
bcachefs: ioctl: avoid stack overflow warning
bcachefs: Don't pass trans to fsck_err() in gc_accounting_done
bcachefs: Fix leak in bch2_fs_recovery() error path
bcachefs: Fix rcu_pending for PREEMPT_RT
bcachefs: Fix downgrade_table_extra()
bcachefs: Don't put rhashtable on stack
bcachefs: Make sure opts.read_only gets propagated back to VFS
bcachefs: Fix possible console lock involved deadlock
bcachefs: mark more errors autofix
bcachefs: Don't persistently run scan_for_btree_nodes
bcachefs: Read error message now prints if self healing
bcachefs: Only run 'increase_depth' for keys from btree node csan
bcachefs: Mark need_discard_freespace_key_bad autofix
bcachefs: Update /dev/disk/by-uuid on device add
bcachefs: Add more flags to btree nodes for rewrite reason
...
Bagas Sanjaya [Fri, 13 Jun 2025 02:38:57 +0000 (09:38 +0700)]
Documentation: ublk: Separate UBLK_F_AUTO_BUF_REG fallback behavior sublists
Stephen Rothwell reports htmldocs warning on ublk docs:
Documentation/block/ublk.rst:414: ERROR: Unexpected indentation. [docutils]
Fix the warning by separating sublists of auto buffer registration
fallback behavior from their appropriate parent list item.
Fixes:
ff20c516485e ("ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG)")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/linux-next/
20250612132638.
193de386@canb.auug.org.au/
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20250613023857.15971-1-bagasdotme@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jason Gunthorpe [Tue, 3 Jun 2025 19:14:45 +0000 (16:14 -0300)]
iommu/tegra: Fix incorrect size calculation
This driver uses a mixture of ways to get the size of a PTE,
tegra_smmu_set_pde() did it as sizeof(*pd) which became wrong when pd
switched to a struct tegra_pd.
Switch pd back to a u32* in tegra_smmu_set_pde() so the sizeof(*pd)
returns 4.
Fixes:
50568f87d1e2 ("iommu/terga: Do not use struct page as the handle for as->pd memory")
Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Closes: https://lore.kernel.org/all/
62e7f7fe-6200-4e4f-ad42-
d58ad272baa6@tecnico.ulisboa.pt/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Link: https://lore.kernel.org/r/0-v1-da7b8b3d57eb+ce-iommu_terga_sizeof_jgg@nvidia.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Matthew Wilcox (Oracle) [Thu, 12 Jun 2025 14:42:53 +0000 (15:42 +0100)]
block: Fix bvec_set_folio() for very large folios
Similarly to
26064d3e2b4d ("block: fix adding folio to bio"), if
we attempt to add a folio that is larger than 4GB, we'll silently
truncate the offset and len. Widen the parameters to size_t, assert
that the length is less than 4GB and set the first page that contains
the interesting data rather than the first page of the folio.
Fixes:
26db5ee15851 (block: add a bvec_set_folio helper)
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/20250612144255.2850278-1-willy@infradead.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Matthew Wilcox (Oracle) [Thu, 12 Jun 2025 14:41:25 +0000 (15:41 +0100)]
bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAP
It is possible for physically contiguous folios to have discontiguous
struct pages if SPARSEMEM is enabled and SPARSEMEM_VMEMMAP is not.
This is correctly handled by folio_page_idx(), so remove this open-coded
implementation.
Fixes:
640d1930bef4 (block: Add bio_for_each_folio_all())
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/20250612144126.2849931-1-willy@infradead.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Thangaraj Samynathan [Thu, 12 Jun 2025 02:30:59 +0000 (08:00 +0530)]
spi: spi-pci1xxxx: Drop MSI-X usage as unsupported by DMA engine
Removes MSI-X from the interrupt request path, as the DMA engine used by
the SPI controller does not support MSI-X interrupts.
Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Link: https://patch.msgid.link/20250612023059.71726-1-thangaraj.s@microchip.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Dave Airlie [Fri, 13 Jun 2025 04:57:09 +0000 (14:57 +1000)]
Merge tag 'drm-misc-fixes-2025-06-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.16-rc2:
- Fix infinite EPROBE_DEFER loop in vc4 probing.
- Fix amdxdna firmware size.
- mode fixes for meson.
- Kconfig fix for st7171-i2c.
- Fix -EBUSY WARN_ON_ONCE in dma-buf
- Use dma_sync_sgtable_for_cpu in udmabuf.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://lore.kernel.org/r/62c06195-8bc1-4dae-8777-e86d94e4d9d9@linux.intel.com
Lorenzo Stoakes [Mon, 9 Jun 2025 16:57:49 +0000 (17:57 +0100)]
mm: add mmap_prepare() compatibility layer for nested file systems
Nested file systems, that is those which invoke call_mmap() within their
own f_op->mmap() handlers, may encounter underlying file systems which
provide the f_op->mmap_prepare() hook introduced by commit
c84bf6dd2b83
("mm: introduce new .mmap_prepare() file callback").
We have a chicken-and-egg scenario here - until all file systems are
converted to using .mmap_prepare(), we cannot convert these nested
handlers, as we can't call f_op->mmap from an .mmap_prepare() hook.
So we have to do it the other way round - invoke the .mmap_prepare() hook
from an .mmap() one.
in order to do so, we need to convert VMA state into a struct vm_area_desc
descriptor, invoking the underlying file system's f_op->mmap_prepare()
callback passing a pointer to this, and then setting VMA state accordingly
and safely.
This patch achieves this via the compat_vma_mmap_prepare() function, which
we invoke from call_mmap() if f_op->mmap_prepare() is specified in the
passed in file pointer.
We place the fundamental logic into mm/vma.h where VMA manipulation
belongs. We also update the VMA userland tests to accommodate the
changes.
The compat_vma_mmap_prepare() function and its associated machinery is
temporary, and will be removed once the conversion of file systems is
complete.
We carefully place this code so it can be used with CONFIG_MMU and also
with cutting edge nommu silicon.
[akpm@linux-foundation.org: export compat_vma_mmap_prepare tp fix build]
[lorenzo.stoakes@oracle.com: remove unused declarations]
Link: https://lkml.kernel.org/r/ac3ae324-4c65-432a-8c6d-2af988b18ac8@lucifer.local
Link: https://lkml.kernel.org/r/20250609165749.344976-1-lorenzo.stoakes@oracle.com
Fixes:
c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Jann Horn <jannh@google.com>
Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dave Airlie [Fri, 13 Jun 2025 01:05:23 +0000 (11:05 +1000)]
Merge tag 'drm-xe-fixes-2025-06-12' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix regression disallowing 64K SVM migration (Maarten)
- Use a bounce buffer for WA BB (Lucas)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/aEsBQoh5Si3ouPgE@fedora
Linus Torvalds [Thu, 12 Jun 2025 19:32:09 +0000 (12:32 -0700)]
Merge tag 'bitmap-for-6.16-rc2' of https://github.com/norov/linux
Pull bitmap fix from Yury Norov:
"Fix for __GENMASK() and __GENMASK_ULL() in UAPI"
* tag 'bitmap-for-6.16-rc2' of https://github.com/norov/linux:
uapi: bitops: use UAPI-safe variant of BITS_PER_LONG again
Bharath SM [Wed, 11 Jun 2025 11:29:02 +0000 (16:59 +0530)]
smb: improve directory cache reuse for readdir operations
Currently, cached directory contents were not reused across subsequent
'ls' operations because the cache validity check relied on comparing
the ctx pointer, which changes with each readdir invocation. As a
result, the cached dir entries was not marked as valid and the cache was
not utilized for subsequent 'ls' operations.
This change uses the file pointer, which remains consistent across all
readdir calls for a given directory instance, to associate and validate
the cache. As a result, cached directory contents can now be
correctly reused, improving performance for repeated directory listings.
Performance gains with local windows SMB server:
Without the patch and default actimeo=1:
1000 directory enumeration operations on dir with 10k files took 135.0s
With this patch and actimeo=0:
1000 directory enumeration operations on dir with 10k files took just 5.1s
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Thu, 12 Jun 2025 15:45:04 +0000 (12:45 -0300)]
smb: client: fix perf regression with deferred closes
Customer reported that one of their applications started failing to
open files with STATUS_INSUFFICIENT_RESOURCES due to NetApp server
hitting the maximum number of opens to same file that it would allow
for a single client connection.
It turned out the client was failing to reuse open handles with
deferred closes because matching ->f_flags directly without masking
off O_CREAT|O_EXCL|O_TRUNC bits first broke the comparision and then
client ended up with thousands of deferred closes to same file. Those
bits are already satisfied on the original open, so no need to check
them against existing open handles.
Reproducer:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
#define NR_THREADS 4
#define NR_ITERATIONS 2500
#define TEST_FILE "/mnt/1/test/dir/foo"
static char buf[64];
static void *worker(void *arg)
{
int i, j;
int fd;
for (i = 0; i < NR_ITERATIONS; i++) {
fd = open(TEST_FILE, O_WRONLY|O_CREAT|O_APPEND, 0666);
for (j = 0; j < 16; j++)
write(fd, buf, sizeof(buf));
close(fd);
}
}
int main(int argc, char *argv[])
{
pthread_t t[NR_THREADS];
int fd;
int i;
fd = open(TEST_FILE, O_WRONLY|O_CREAT|O_TRUNC, 0666);
close(fd);
memset(buf, 'a', sizeof(buf));
for (i = 0; i < NR_THREADS; i++)
pthread_create(&t[i], NULL, worker, NULL);
for (i = 0; i < NR_THREADS; i++)
pthread_join(t[i], NULL);
return 0;
}
Before patch:
$ mount.cifs //srv/share /mnt/1 -o ...
$ mkdir -p /mnt/1/test/dir
$ gcc repro.c && ./a.out
...
number of opens: 1391
After patch:
$ mount.cifs //srv/share /mnt/1 -o ...
$ mkdir -p /mnt/1/test/dir
$ gcc repro.c && ./a.out
...
number of opens: 1
Cc: linux-cifs@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>
Cc: Jay Shin <jaeshin@redhat.com>
Cc: Pierguido Lambri <plambri@redhat.com>
Fixes:
b8ea3b1ff544 ("smb: enable reuse of deferred file handles for write operations")
Acked-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Linus Torvalds [Thu, 12 Jun 2025 16:50:36 +0000 (09:50 -0700)]
Merge tag 'net-6.16-rc2' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth and wireless.
Current release - regressions:
- af_unix: allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD
Current release - new code bugs:
- eth: airoha: correct enable mask for RX queues 16-31
- veth: prevent NULL pointer dereference in veth_xdp_rcv when peer
disappears under traffic
- ipv6: move fib6_config_validate() to ip6_route_add(), prevent
invalid routes
Previous releases - regressions:
- phy: phy_caps: don't skip better duplex match on non-exact match
- dsa: b53: fix untagged traffic sent via cpu tagged with VID 0
- Revert "wifi: mwifiex: Fix HT40 bandwidth issue.", it caused
transient packet loss, exact reason not fully understood, yet
Previous releases - always broken:
- net: clear the dst when BPF is changing skb protocol (IPv4 <> IPv6)
- sched: sfq: fix a potential crash on gso_skb handling
- Bluetooth: intel: improve rx buffer posting to avoid causing issues
in the firmware
- eth: intel: i40e: make reset handling robust against multiple
requests
- eth: mlx5: ensure FW pages are always allocated on the local NUMA
node, even when device is configure to 'serve' another node
- wifi: ath12k: fix GCC_GCC_PCIE_HOT_RST definition for WCN7850,
prevent kernel crashes
- wifi: ath11k: avoid burning CPU in ath11k_debugfs_fw_stats_request()
for 3 sec if fw_stats_done is not set"
* tag 'net-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits)
selftests: drv-net: rss_ctx: Add test for ntuple rules targeting default RSS context
net: ethtool: Don't check if RSS context exists in case of context 0
af_unix: Allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD.
ipv6: Move fib6_config_validate() to ip6_route_add().
net: drv: netdevsim: don't napi_complete() from netpoll
net/mlx5: HWS, Add error checking to hws_bwc_rule_complex_hash_node_get()
veth: prevent NULL pointer dereference in veth_xdp_rcv
net_sched: remove qdisc_tree_flush_backlog()
net_sched: ets: fix a race in ets_qdisc_change()
net_sched: tbf: fix a race in tbf_change()
net_sched: red: fix a race in __red_change()
net_sched: prio: fix a race in prio_tune()
net_sched: sch_sfq: reject invalid perturb period
net: phy: phy_caps: Don't skip better duplex macth on non-exact match
MAINTAINERS: Update Kuniyuki Iwashima's email address.
selftests: net: add test case for NAT46 looping back dst
net: clear the dst when changing skb protocol
net/mlx5e: Fix number of lanes to UNKNOWN when using data_rate_oper
net/mlx5e: Fix leak of Geneve TLV option object
net/mlx5: HWS, make sure the uplink is the last destination
...
Lucas De Marchi [Wed, 4 Jun 2025 15:03:05 +0000 (08:03 -0700)]
drm/xe/lrc: Use a temporary buffer for WA BB
In case the BO is in iomem, we can't simply take the vaddr and write to
it. Instead, prepare a separate buffer that is later copied into io
memory. Right now it's just a few words that could be using
xe_map_write32(), but the intention is to grow the WA BB for other
uses.
Fixes:
617d824c5323 ("drm/xe: Add WA BB to capture active context utilization")
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://lore.kernel.org/r/20250604-wa-bb-fix-v1-1-0dfc5dafcef0@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit
ef48715b2d3df17c060e23b9aa636af3d95652f8)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Linus Torvalds [Thu, 12 Jun 2025 15:21:13 +0000 (08:21 -0700)]
Merge tag 'pinctrl-v6.16-2' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Add some missing pins on the Qualcomm QCM2290, along with a managed
resources patch that make it clean and nice
- Drop an unused function in the ST Micro driver
- Drop bouncing MAINTAINER entry
- Drop of_match_ptr() macro to rid compile warnings in the TB10x
driver
- Fix up calculation of pin numbers from base in the Sunxi driver
* tag 'pinctrl-v6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: sunxi: dt: Consider pin base when calculating bank number from pin
pinctrl: tb10x: Drop of_match_ptr for ID table
pinctrl: MAINTAINERS: Drop bouncing Jianlong Huang
pinctrl: st: Drop unused st_gpio_bank() function
pinctrl: qcom: pinctrl-qcm2290: Add missing pins
pinctrl: qcom: switch to devm_gpiochip_add_data()
Linus Torvalds [Thu, 12 Jun 2025 15:17:56 +0000 (08:17 -0700)]
Merge tag 'arc-6.16-rc1' of git://git./linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- arch_atomic64_cmpxchg relaxed variant [Jason]
- use of inbuilt swap in stack unwinder [Yu-Chun Lin]
- use of __ASSEMBLER__ in kernel headers [Thomas Huth]
* tag 'arc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: Replace __ASSEMBLY__ with __ASSEMBLER__ in the non-uapi headers
ARC: Replace __ASSEMBLY__ with __ASSEMBLER__ in uapi headers
ARC: unwind: Use built-in sort swap to reduce code size and improve performance
ARC: atomics: Implement arch_atomic64_cmpxchg using _relaxed
Jakub Kicinski [Thu, 12 Jun 2025 15:16:47 +0000 (08:16 -0700)]
Merge tag 'wireless-2025-06-12' of https://git./linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Another quick round of updates:
- revert mwifiex HT40 that was causing issues
- many ath10k/ath11k/ath12k fixes
- re-add some iwlwifi code I lost in a merge
- use kfree_sensitive() on an error path in cfg80211
* tag 'wireless-2025-06-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: cfg80211: use kfree_sensitive() for connkeys cleanup
wifi: iwlwifi: fix merge damage related to iwl_pci_resume
Revert "wifi: mwifiex: Fix HT40 bandwidth issue."
wifi: ath12k: fix uaf in ath12k_core_init()
wifi: ath12k: Fix hal_reo_cmd_status kernel-doc
wifi: ath12k: fix GCC_GCC_PCIE_HOT_RST definition for WCN7850
wifi: ath11k: validate ath11k_crypto_mode on top of ath11k_core_qmi_firmware_ready
wifi: ath11k: consistently use ath11k_mac_get_fw_stats()
wifi: ath11k: move locking outside of ath11k_mac_get_fw_stats()
wifi: ath11k: adjust unlock sequence in ath11k_update_stats_event()
wifi: ath11k: move some firmware stats related functions outside of debugfs
wifi: ath11k: don't wait when there is no vdev started
wifi: ath11k: don't use static variables in ath11k_debugfs_fw_stats_process()
wifi: ath11k: avoid burning CPU in ath11k_debugfs_fw_stats_request()
wil6210: fix support for sparrow chipsets
wifi: ath10k: Avoid vdev delete timeout when firmware is already down
ath10k: snoc: fix unbalanced IRQ enable in crash recovery
====================
Link: https://patch.msgid.link/20250612082519.11447-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 12 Jun 2025 15:15:37 +0000 (08:15 -0700)]
Merge branch 'fix-ntuple-rules-targeting-default-rss'
Gal Pressman says:
====================
Fix ntuple rules targeting default RSS
This series addresses a regression in ethtool flow steering where rules
targeting the default RSS context (context 0) were incorrectly rejected.
The default RSS context always exists but is not stored in the rss_ctx
xarray like additional contexts. The current validation logic was
checking for the existence of context 0 in this array, causing valid
flow steering rules to be rejected.
This prevented configurations such as:
- High priority rules directing specific traffic to the default context
- Low priority catch-all rules directing remaining traffic to additional
contexts
Patch 1 fixes the validation logic to skip the existence check for
context 0.
Patch 2 adds a selftest that verifies this behavior.
v3: https://lore.kernel.org/
20250609120250.
1630125-1-gal@nvidia.com
v2: https://lore.kernel.org/
20250225071348.509432-1-gal@nvidia.com
====================
Link: https://patch.msgid.link/20250612071958.1696361-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gal Pressman [Thu, 12 Jun 2025 07:19:58 +0000 (10:19 +0300)]
selftests: drv-net: rss_ctx: Add test for ntuple rules targeting default RSS context
Add test_rss_default_context_rule() to verify that ntuple rules can
correctly direct traffic to the default RSS context (context 0).
The test creates two ntuple rules with explicit location priorities:
- A high-priority rule (loc 0) directing specific port traffic to
context 0.
- A low-priority rule (loc 1) directing all other TCP traffic to context
1.
This validates that:
1. Rules targeting the default context function properly.
2. Traffic steering works as expected when mixing default and
additional RSS contexts.
The test was written by AI, and reviewed by humans.
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250612071958.1696361-3-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gal Pressman [Thu, 12 Jun 2025 07:19:57 +0000 (10:19 +0300)]
net: ethtool: Don't check if RSS context exists in case of context 0
Context 0 (default context) always exists, there is no need to check
whether it exists or not when adding a flow steering rule.
The existing check fails when creating a flow steering rule for context
0 as it is not stored in the rss_ctx xarray.
For example:
$ ethtool --config-ntuple eth2 flow-type tcp4 dst-ip 194.237.147.23 dst-port 19983 context 0 loc 618
rmgr: Cannot insert RX class rule: Invalid argument
Cannot insert classification rule
An example usecase for this could be:
- A high-priority rule (loc 0) directing specific port traffic to
context 0.
- A low-priority rule (loc 1) directing all other TCP traffic to context
1.
This is a user-visible regression that was caught in our testing
environment, it was not reported by a user yet.
Fixes:
de7f7582dff2 ("net: ethtool: prevent flow steering to RSS contexts which don't exist")
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20250612071958.1696361-2-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 12 Jun 2025 15:13:47 +0000 (08:13 -0700)]
Merge tag 'for-net-2025-06-11' of git://git./linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- eir: Fix NULL pointer deference on eir_get_service_data
- eir: Fix possible crashes on eir_create_adv_data
- hci_sync: Fix broadcast/PA when using an existing instance
- ISO: Fix using BT_SK_PA_SYNC to detect BIS sockets
- ISO: Fix not using bc_sid as advertisement SID
- MGMT: Fix sparse errors
* tag 'for-net-2025-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: MGMT: Fix sparse errors
Bluetooth: ISO: Fix not using bc_sid as advertisement SID
Bluetooth: ISO: Fix using BT_SK_PA_SYNC to detect BIS sockets
Bluetooth: eir: Fix possible crashes on eir_create_adv_data
Bluetooth: hci_sync: Fix broadcast/PA when using an existing instance
Bluetooth: Fix NULL pointer deference on eir_get_service_data
====================
Link: https://patch.msgid.link/20250611204944.1559356-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Wed, 11 Jun 2025 20:27:35 +0000 (13:27 -0700)]
af_unix: Allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD.
Before the cited commit, the kernel unconditionally embedded SCM
credentials to skb for embryo sockets even when both the sender
and listener disabled SO_PASSCRED and SO_PASSPIDFD.
Now, the credentials are added to skb only when configured by the
sender or the listener.
However, as reported in the link below, it caused a regression for
some programs that assume credentials are included in every skb,
but sometimes not now.
The only problematic scenario would be that a socket starts listening
before setting the option. Then, there will be 2 types of non-small
race window, where a client can send skb without credentials, which
the peer receives as an "invalid" message (and aborts the connection
it seems ?):
Client Server
------ ------
s1.listen() <-- No SO_PASS{CRED,PIDFD}
s2.connect()
s2.send() <-- w/o cred
s1.setsockopt(SO_PASS{CRED,PIDFD})
s2.send() <-- w/ cred
or
Client Server
------ ------
s1.listen() <-- No SO_PASS{CRED,PIDFD}
s2.connect()
s2.send() <-- w/o cred
s3, _ = s1.accept() <-- Inherit cred options
s2.send() <-- w/o cred but not set yet
s3.setsockopt(SO_PASS{CRED,PIDFD})
s2.send() <-- w/ cred
It's unfortunate that buggy programs depend on the behaviour,
but let's restore the previous behaviour.
Fixes:
3f84d577b79d ("af_unix: Inherit sk_flags at connect().")
Reported-by: Jacek Łuczak <difrost.kernel@gmail.com>
Closes: https://lore.kernel.org/all/
68d38b0b-1666-4974-85d4-
15575789c8d4@gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Tested-by: Christian Heusel <christian@heusel.eu>
Tested-by: André Almeida <andrealmeid@igalia.com>
Tested-by: Jacek Łuczak <difrost.kernel@gmail.com>
Link: https://patch.msgid.link/20250611202758.3075858-1-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Wed, 11 Jun 2025 19:35:02 +0000 (12:35 -0700)]
ipv6: Move fib6_config_validate() to ip6_route_add().
syzkaller created an IPv6 route from a malformed packet, which has
a prefix len > 128, triggering the splat below. [0]
This is a similar issue fixed by commit
586ceac9acb7 ("ipv6: Restore
fib6_config validation for SIOCADDRT.").
The cited commit removed fib6_config validation from some callers
of ip6_add_route().
Let's move the validation back to ip6_route_add() and
ip6_route_multipath_add().
[0]:
UBSAN: array-index-out-of-bounds in ./include/net/ipv6.h:616:34
index 20 is out of range for type '__u8 [16]'
CPU: 1 UID: 0 PID: 7444 Comm: syz.0.708 Not tainted
6.16.0-rc1-syzkaller-g19272b37aa4f #0 PREEMPT
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
[<
ffffffff80078a80>] dump_backtrace+0x2e/0x3c arch/riscv/kernel/stacktrace.c:132
[<
ffffffff8000327a>] show_stack+0x30/0x3c arch/riscv/kernel/stacktrace.c:138
[<
ffffffff80061012>] __dump_stack lib/dump_stack.c:94 [inline]
[<
ffffffff80061012>] dump_stack_lvl+0x12e/0x1a6 lib/dump_stack.c:120
[<
ffffffff800610a6>] dump_stack+0x1c/0x24 lib/dump_stack.c:129
[<
ffffffff8001c0ea>] ubsan_epilogue+0x14/0x46 lib/ubsan.c:233
[<
ffffffff819ba290>] __ubsan_handle_out_of_bounds+0xf6/0xf8 lib/ubsan.c:455
[<
ffffffff85b363a4>] ipv6_addr_prefix include/net/ipv6.h:616 [inline]
[<
ffffffff85b363a4>] ip6_route_info_create+0x8f8/0x96e net/ipv6/route.c:3793
[<
ffffffff85b635da>] ip6_route_add+0x2a/0x1aa net/ipv6/route.c:3889
[<
ffffffff85b02e08>] addrconf_prefix_route+0x2c4/0x4e8 net/ipv6/addrconf.c:2487
[<
ffffffff85b23bb2>] addrconf_prefix_rcv+0x1720/0x1e62 net/ipv6/addrconf.c:2878
[<
ffffffff85b92664>] ndisc_router_discovery+0x1a06/0x3504 net/ipv6/ndisc.c:1570
[<
ffffffff85b99038>] ndisc_rcv+0x500/0x600 net/ipv6/ndisc.c:1874
[<
ffffffff85bc2c18>] icmpv6_rcv+0x145e/0x1e0a net/ipv6/icmp.c:988
[<
ffffffff85af6798>] ip6_protocol_deliver_rcu+0x18a/0x1976 net/ipv6/ip6_input.c:436
[<
ffffffff85af8078>] ip6_input_finish+0xf4/0x174 net/ipv6/ip6_input.c:480
[<
ffffffff85af8262>] NF_HOOK include/linux/netfilter.h:317 [inline]
[<
ffffffff85af8262>] NF_HOOK include/linux/netfilter.h:311 [inline]
[<
ffffffff85af8262>] ip6_input+0x16a/0x70c net/ipv6/ip6_input.c:491
[<
ffffffff85af8dcc>] ip6_mc_input+0x5c8/0x1268 net/ipv6/ip6_input.c:588
[<
ffffffff85af6112>] dst_input include/net/dst.h:469 [inline]
[<
ffffffff85af6112>] ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline]
[<
ffffffff85af6112>] NF_HOOK include/linux/netfilter.h:317 [inline]
[<
ffffffff85af6112>] NF_HOOK include/linux/netfilter.h:311 [inline]
[<
ffffffff85af6112>] ipv6_rcv+0x5ae/0x6e0 net/ipv6/ip6_input.c:309
[<
ffffffff85087e84>] __netif_receive_skb_one_core+0x106/0x16e net/core/dev.c:5977
[<
ffffffff85088104>] __netif_receive_skb+0x2c/0x144 net/core/dev.c:6090
[<
ffffffff850883c6>] netif_receive_skb_internal net/core/dev.c:6176 [inline]
[<
ffffffff850883c6>] netif_receive_skb+0x1aa/0xbf2 net/core/dev.c:6235
[<
ffffffff8328656e>] tun_rx_batched.isra.0+0x430/0x686 drivers/net/tun.c:1485
[<
ffffffff8329ed3a>] tun_get_user+0x2952/0x3d6c drivers/net/tun.c:1938
[<
ffffffff832a21e0>] tun_chr_write_iter+0xc4/0x21c drivers/net/tun.c:1984
[<
ffffffff80b9b9ae>] new_sync_write fs/read_write.c:593 [inline]
[<
ffffffff80b9b9ae>] vfs_write+0x56c/0xa9a fs/read_write.c:686
[<
ffffffff80b9c2be>] ksys_write+0x126/0x228 fs/read_write.c:738
[<
ffffffff80b9c42e>] __do_sys_write fs/read_write.c:749 [inline]
[<
ffffffff80b9c42e>] __se_sys_write fs/read_write.c:746 [inline]
[<
ffffffff80b9c42e>] __riscv_sys_write+0x6e/0x94 fs/read_write.c:746
[<
ffffffff80076912>] syscall_handler+0x94/0x118 arch/riscv/include/asm/syscall.h:112
[<
ffffffff8637e31e>] do_trap_ecall_u+0x396/0x530 arch/riscv/kernel/traps.c:341
[<
ffffffff863a69e2>] handle_exception+0x146/0x152 arch/riscv/kernel/entry.S:197
Fixes:
fa76c1674f2e ("ipv6: Move some validation from ip6_route_info_create() to rtm_to_fib6_config().")
Reported-by: syzbot+4c2358694722d304c44e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/
6849b8c3.
a00a0220.1eb5f5.00f0.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250611193551.2999991-1-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 11 Jun 2025 17:46:43 +0000 (10:46 -0700)]
net: drv: netdevsim: don't napi_complete() from netpoll
netdevsim supports netpoll. Make sure we don't call napi_complete()
from it, since it may not be scheduled. Breno reports hitting a
warning in napi_complete_done():
WARNING: CPU: 14 PID: 104 at net/core/dev.c:6592 napi_complete_done+0x2cc/0x560
__napi_poll+0x2d8/0x3a0
handle_softirqs+0x1fe/0x710
This is presumably after netpoll stole the SCHED bit prematurely.
Reported-by: Breno Leitao <leitao@debian.org>
Fixes:
3762ec05a9fb ("netdevsim: add NAPI support")
Tested-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250611174643.2769263-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dan Carpenter [Wed, 11 Jun 2025 13:14:32 +0000 (16:14 +0300)]
net/mlx5: HWS, Add error checking to hws_bwc_rule_complex_hash_node_get()
Check for if ida_alloc() or rhashtable_lookup_get_insert_fast() fails.
Fixes:
17e0accac577 ("net/mlx5: HWS, support complex matchers")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Link: https://patch.msgid.link/aEmBONjyiF6z5yCV@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jesper Dangaard Brouer [Wed, 11 Jun 2025 12:40:04 +0000 (14:40 +0200)]
veth: prevent NULL pointer dereference in veth_xdp_rcv
The veth peer device is RCU protected, but when the peer device gets
deleted (veth_dellink) then the pointer is assigned NULL (via
RCU_INIT_POINTER).
This patch adds a necessary NULL check in veth_xdp_rcv when accessing
the veth peer net_device.
This fixes a bug introduced in commit
dc82a33297fc ("veth: apply qdisc
backpressure on full ptr_ring to reduce TX drops"). The bug is a race
and only triggers when having inflight packets on a veth that is being
deleted.
Reported-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Closes: https://lore.kernel.org/all/
fecfcad0-7a16-42b8-bff2-
66ee83a6e5c4@linux.dev/
Reported-by: syzbot+c4c7bf27f6b0c4bd97fe@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/
683da55e.
a00a0220.d8eae.0052.GAE@google.com/
Fixes:
dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops")
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://patch.msgid.link/174964557873.519608.10855046105237280978.stgit@firesoul
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 12 Jun 2025 15:05:53 +0000 (08:05 -0700)]
Merge branch 'net_sched-no-longer-use-qdisc_tree_flush_backlog'
Eric Dumazet says:
====================
net_sched: no longer use qdisc_tree_flush_backlog()
This series is based on a report from Gerrard Tai.
Essentially, all users of qdisc_tree_flush_backlog() are racy.
We must instead use qdisc_purge_queue().
====================
Link: https://patch.msgid.link/20250611111515.1983366-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 11 Jun 2025 11:15:15 +0000 (11:15 +0000)]
net_sched: remove qdisc_tree_flush_backlog()
This function is no longer used after the four prior fixes.
Given all prior uses were wrong, it seems better to remove it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250611111515.1983366-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 11 Jun 2025 11:15:14 +0000 (11:15 +0000)]
net_sched: ets: fix a race in ets_qdisc_change()
Gerrard Tai reported a race condition in ETS, whenever SFQ perturb timer
fires at the wrong time.
The race is as follows:
CPU 0 CPU 1
[1]: lock root
[2]: qdisc_tree_flush_backlog()
[3]: unlock root
|
| [5]: lock root
| [6]: rehash
| [7]: qdisc_tree_reduce_backlog()
|
[4]: qdisc_put()
This can be abused to underflow a parent's qlen.
Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog()
should fix the race, because all packets will be purged from the qdisc
before releasing the lock.
Fixes:
b05972f01e7d ("net: sched: tbf: don't call qdisc_put() while holding tree lock")
Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250611111515.1983366-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 11 Jun 2025 11:15:13 +0000 (11:15 +0000)]
net_sched: tbf: fix a race in tbf_change()
Gerrard Tai reported a race condition in TBF, whenever SFQ perturb timer
fires at the wrong time.
The race is as follows:
CPU 0 CPU 1
[1]: lock root
[2]: qdisc_tree_flush_backlog()
[3]: unlock root
|
| [5]: lock root
| [6]: rehash
| [7]: qdisc_tree_reduce_backlog()
|
[4]: qdisc_put()
This can be abused to underflow a parent's qlen.
Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog()
should fix the race, because all packets will be purged from the qdisc
before releasing the lock.
Fixes:
b05972f01e7d ("net: sched: tbf: don't call qdisc_put() while holding tree lock")
Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://patch.msgid.link/20250611111515.1983366-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 11 Jun 2025 11:15:12 +0000 (11:15 +0000)]
net_sched: red: fix a race in __red_change()
Gerrard Tai reported a race condition in RED, whenever SFQ perturb timer
fires at the wrong time.
The race is as follows:
CPU 0 CPU 1
[1]: lock root
[2]: qdisc_tree_flush_backlog()
[3]: unlock root
|
| [5]: lock root
| [6]: rehash
| [7]: qdisc_tree_reduce_backlog()
|
[4]: qdisc_put()
This can be abused to underflow a parent's qlen.
Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog()
should fix the race, because all packets will be purged from the qdisc
before releasing the lock.
Fixes:
0c8d13ac9607 ("net: sched: red: delay destroying child qdisc on replace")
Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250611111515.1983366-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 11 Jun 2025 11:15:11 +0000 (11:15 +0000)]
net_sched: prio: fix a race in prio_tune()
Gerrard Tai reported a race condition in PRIO, whenever SFQ perturb timer
fires at the wrong time.
The race is as follows:
CPU 0 CPU 1
[1]: lock root
[2]: qdisc_tree_flush_backlog()
[3]: unlock root
|
| [5]: lock root
| [6]: rehash
| [7]: qdisc_tree_reduce_backlog()
|
[4]: qdisc_put()
This can be abused to underflow a parent's qlen.
Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog()
should fix the race, because all packets will be purged from the qdisc
before releasing the lock.
Fixes:
7b8e0b6e6599 ("net: sched: prio: delay destroying child qdiscs on change")
Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250611111515.1983366-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 11 Jun 2025 08:35:01 +0000 (08:35 +0000)]
net_sched: sch_sfq: reject invalid perturb period
Gerrard Tai reported that SFQ perturb_period has no range check yet,
and this can be used to trigger a race condition fixed in a separate patch.
We want to make sure ctl->perturb_period * HZ will not overflow
and is positive.
Tested:
tc qd add dev lo root sfq perturb -10 # negative value : error
Error: sch_sfq: invalid perturb period.
tc qd add dev lo root sfq perturb
1000000000 # too big : error
Error: sch_sfq: invalid perturb period.
tc qd add dev lo root sfq perturb
2000000 # acceptable value
tc -s -d qd sh dev lo
qdisc sfq 8005: root refcnt 2 limit 127p quantum 64Kb depth 127 flows 128 divisor 1024 perturb 2000000sec
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250611083501.1810459-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>