WingMan Kwok [Wed, 6 Nov 2024 09:17:07 +0000 (14:47 +0530)]
net: hsr: Add VLAN support
Add support for creating VLAN interfaces over HSR/PRP interface.
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20241106091710.3308519-2-danishanwar@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 12 Nov 2024 00:04:34 +0000 (16:04 -0800)]
Merge branch 'side-mdio-support-for-lan937x-switches'
Oleksij Rempel says:
====================
Side MDIO Support for LAN937x Switches
This patch set introduces support for an internal MDIO bus in LAN937x
switches, enabling the use of a side MDIO channel for PHY management
while keeping SPI as the main interface for switch configuration.
other changelogs are added to separate patches.
====================
Link: https://patch.msgid.link/20241106075942.1636998-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Oleksij Rempel [Wed, 6 Nov 2024 07:59:41 +0000 (08:59 +0100)]
net: dsa: microchip: parse PHY config from device tree
Introduce ksz_parse_dt_phy_config() to validate and parse PHY
configuration from the device tree for KSZ switches. This function
ensures proper setup of internal PHYs by checking `phy-handle`
properties, verifying expected PHY IDs, and handling parent node
mismatches. Sets the PHY mask on the MII bus if validation is
successful. Returns -EINVAL on configuration errors.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20241106075942.1636998-7-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Oleksij Rempel [Wed, 6 Nov 2024 07:59:40 +0000 (08:59 +0100)]
net: dsa: microchip: add support for side MDIO interface in LAN937x
Implement side MDIO channel support for LAN937x switches, providing an
alternative to SPI for PHY management alongside existing SPI-based
switch configuration. This is needed to reduce SPI load, as SPI can be
relatively expensive for small packets compared to MDIO support.
Also, implemented static mappings for PHY addresses for various LAN937x
models to support different internal PHY configurations. Since the PHY
address mappings are not equal to the port indexes, this patch also
provides PHY address calculation based on hardware strapping
configuration.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241106075942.1636998-6-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Oleksij Rempel [Wed, 6 Nov 2024 07:59:39 +0000 (08:59 +0100)]
net: dsa: microchip: cleanup error handling in ksz_mdio_register
Replace repeated cleanup code with a single error path using a label.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241106075942.1636998-5-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Oleksij Rempel [Wed, 6 Nov 2024 07:59:38 +0000 (08:59 +0100)]
net: dsa: microchip: Refactor MDIO handling for side MDIO access
Add support for accessing PHYs via a side MDIO interface in LAN937x
switches. The existing code already supports accessing PHYs via main
management interfaces, which can be SPI, I2C, or MDIO, depending on the
chip variant. This patch enables using a side MDIO bus, where SPI is
used for the main switch configuration and MDIO for managing the
integrated PHYs. On LAN937x, this is optional, allowing them to operate
in both configurations: SPI only, or SPI + MDIO. Typically, the SPI
interface is used for switch configuration, while MDIO handles PHY
management.
Additionally, update interrupt controller code to support non-linear
port to PHY address mapping, enabling correct interrupt handling for
configurations where PHY addresses do not directly correspond to port
indexes. This change ensures that the interrupt mechanism properly
aligns with the new, flexible PHY address mappings introduced by side
MDIO support.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241106075942.1636998-4-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Oleksij Rempel [Wed, 6 Nov 2024 07:59:37 +0000 (08:59 +0100)]
dt-bindings: net: dsa: microchip: add mdio-parent-bus property for internal MDIO
Introduce `mdio-parent-bus` property in the ksz DSA bindings to
reference the parent MDIO bus when the internal MDIO bus is attached to
it, bypassing the main management interface.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241106075942.1636998-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Oleksij Rempel [Wed, 6 Nov 2024 07:59:36 +0000 (08:59 +0100)]
dt-bindings: net: dsa: microchip: add internal MDIO bus description
Add description for the internal MDIO bus, including integrated PHY
nodes, to ksz DSA bindings.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241106075942.1636998-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mohammad Heib [Thu, 7 Nov 2024 12:07:39 +0000 (14:07 +0200)]
net: atlantic: use irq_update_affinity_hint()
irq_set_affinity_hint() is deprecated, Use irq_update_affinity_hint()
instead. This removes the side-effect of actually applying the affinity.
The driver does not really need to worry about spreading its IRQs across
CPUs. The core code already takes care of that. when the driver applies the
affinities by itself, it breaks the users' expectations:
1. The user configures irqbalance with IRQBALANCE_BANNED_CPULIST in
order to prevent IRQs from being moved to certain CPUs that run a
real-time workload.
2. atlantic device reopening will resets the affinity
in aq_ndev_open().
3. atlantic has no idea about irqbalance's config, so it may move an IRQ to
a banned CPU. The real-time workload suffers unacceptable latency.
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241107120739.415743-1-mheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mohammad Heib [Thu, 7 Nov 2024 11:50:02 +0000 (13:50 +0200)]
nfp: use irq_update_affinity_hint()
irq_set_affinity_hint() is deprecated, Use irq_update_affinity_hint()
instead. This removes the side-effect of actually applying the affinity.
The driver does not really need to worry about spreading its IRQs across
CPUs. The core code already takes care of that. when the driver applies the
affinities by itself, it breaks the users' expectations:
1. The user configures irqbalance with IRQBALANCE_BANNED_CPULIST in
order to prevent IRQs from being moved to certain CPUs that run a
real-time workload.
2. nfp device reopening will resets the affinity
in nfp_net_netdev_open().
3. nfp has no idea about irqbalance's config, so it may move an IRQ to
a banned CPU. The real-time workload suffers unacceptable latency.
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Link: https://patch.msgid.link/20241107115002.413358-1-mheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mohammad Heib [Wed, 6 Nov 2024 18:08:11 +0000 (20:08 +0200)]
bnxt_en: use irq_update_affinity_hint()
irq_set_affinity_hint() is deprecated, Use irq_update_affinity_hint()
instead. This removes the side-effect of actually applying the affinity.
The driver does not really need to worry about spreading its IRQs across
CPUs. The core code already takes care of that. when the driver applies the
affinities by itself, it breaks the users' expectations:
1. The user configures irqbalance with IRQBALANCE_BANNED_CPULIST in
order to prevent IRQs from being moved to certain CPUs that run a
real-time workload.
2. bnxt_en device reopening will resets the affinity
in bnxt_open().
3. bnxt_en has no idea about irqbalance's config, so it may move an IRQ to
a banned CPU. The real-time workload suffers unacceptable latency.
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Link: https://patch.msgid.link/20241106180811.385175-1-mheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Howells [Wed, 6 Nov 2024 13:00:45 +0000 (13:00 +0000)]
rxrpc: Add a tracepoint for aborts being proposed
Add a tracepoint to rxrpc to trace the proposal of an abort. The abort is
performed asynchronously by the I/O thread.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
Link: https://patch.msgid.link/726356.1730898045@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Omid Ehtemam-Haghighi [Wed, 6 Nov 2024 01:02:36 +0000 (17:02 -0800)]
ipv6: Fix soft lockups in fib6_select_path under high next hop churn
Soft lockups have been observed on a cluster of Linux-based edge routers
located in a highly dynamic environment. Using the `bird` service, these
routers continuously update BGP-advertised routes due to frequently
changing nexthop destinations, while also managing significant IPv6
traffic. The lockups occur during the traversal of the multipath
circular linked-list in the `fib6_select_path` function, particularly
while iterating through the siblings in the list. The issue typically
arises when the nodes of the linked list are unexpectedly deleted
concurrently on a different core—indicated by their 'next' and
'previous' elements pointing back to the node itself and their reference
count dropping to zero. This results in an infinite loop, leading to a
soft lockup that triggers a system panic via the watchdog timer.
Apply RCU primitives in the problematic code sections to resolve the
issue. Where necessary, update the references to fib6_siblings to
annotate or use the RCU APIs.
Include a test script that reproduces the issue. The script
periodically updates the routing table while generating a heavy load
of outgoing IPv6 traffic through multiple iperf3 clients. It
consistently induces infinite soft lockups within a couple of minutes.
Kernel log:
0 [
ffffbd13003e8d30] machine_kexec at
ffffffff8ceaf3eb
1 [
ffffbd13003e8d90] __crash_kexec at
ffffffff8d0120e3
2 [
ffffbd13003e8e58] panic at
ffffffff8cef65d4
3 [
ffffbd13003e8ed8] watchdog_timer_fn at
ffffffff8d05cb03
4 [
ffffbd13003e8f08] __hrtimer_run_queues at
ffffffff8cfec62f
5 [
ffffbd13003e8f70] hrtimer_interrupt at
ffffffff8cfed756
6 [
ffffbd13003e8fd0] __sysvec_apic_timer_interrupt at
ffffffff8cea01af
7 [
ffffbd13003e8ff0] sysvec_apic_timer_interrupt at
ffffffff8df1b83d
-- <IRQ stack> --
8 [
ffffbd13003d3708] asm_sysvec_apic_timer_interrupt at
ffffffff8e000ecb
[exception RIP: fib6_select_path+299]
RIP:
ffffffff8ddafe7b RSP:
ffffbd13003d37b8 RFLAGS:
00000287
RAX:
ffff975850b43600 RBX:
ffff975850b40200 RCX:
0000000000000000
RDX:
000000003fffffff RSI:
0000000051d383e4 RDI:
ffff975850b43618
RBP:
ffffbd13003d3800 R8:
0000000000000000 R9:
ffff975850b40200
R10:
0000000000000000 R11:
0000000000000000 R12:
ffffbd13003d3830
R13:
ffff975850b436a8 R14:
ffff975850b43600 R15:
0000000000000007
ORIG_RAX:
ffffffffffffffff CS: 0010 SS: 0018
9 [
ffffbd13003d3808] ip6_pol_route at
ffffffff8ddb030c
10 [
ffffbd13003d3888] ip6_pol_route_input at
ffffffff8ddb068c
11 [
ffffbd13003d3898] fib6_rule_lookup at
ffffffff8ddf02b5
12 [
ffffbd13003d3928] ip6_route_input at
ffffffff8ddb0f47
13 [
ffffbd13003d3a18] ip6_rcv_finish_core.constprop.0 at
ffffffff8dd950d0
14 [
ffffbd13003d3a30] ip6_list_rcv_finish.constprop.0 at
ffffffff8dd96274
15 [
ffffbd13003d3a98] ip6_sublist_rcv at
ffffffff8dd96474
16 [
ffffbd13003d3af8] ipv6_list_rcv at
ffffffff8dd96615
17 [
ffffbd13003d3b60] __netif_receive_skb_list_core at
ffffffff8dc16fec
18 [
ffffbd13003d3be0] netif_receive_skb_list_internal at
ffffffff8dc176b3
19 [
ffffbd13003d3c50] napi_gro_receive at
ffffffff8dc565b9
20 [
ffffbd13003d3c80] ice_receive_skb at
ffffffffc087e4f5 [ice]
21 [
ffffbd13003d3c90] ice_clean_rx_irq at
ffffffffc0881b80 [ice]
22 [
ffffbd13003d3d20] ice_napi_poll at
ffffffffc088232f [ice]
23 [
ffffbd13003d3d80] __napi_poll at
ffffffff8dc18000
24 [
ffffbd13003d3db8] net_rx_action at
ffffffff8dc18581
25 [
ffffbd13003d3e40] __do_softirq at
ffffffff8df352e9
26 [
ffffbd13003d3eb0] run_ksoftirqd at
ffffffff8ceffe47
27 [
ffffbd13003d3ec0] smpboot_thread_fn at
ffffffff8cf36a30
28 [
ffffbd13003d3ee8] kthread at
ffffffff8cf2b39f
29 [
ffffbd13003d3f28] ret_from_fork at
ffffffff8ce5fa64
30 [
ffffbd13003d3f50] ret_from_fork_asm at
ffffffff8ce03cbb
Fixes:
66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
Reported-by: Adrian Oliver <kernel@aoliver.ca>
Signed-off-by: Omid Ehtemam-Haghighi <omid.ehtemamhaghighi@menlosecurity.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Ido Schimmel <idosch@idosch.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Simon Horman <horms@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241106010236.1239299-1-omid.ehtemamhaghighi@menlosecurity.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 11 Nov 2024 22:16:00 +0000 (14:16 -0800)]
Merge branch 'knobs-for-npc-default-rule-counters'
Linu Cherian says:
====================
Knobs for NPC default rule counters
Patch 1 introduce _rvu_mcam_remove/add_counter_from/to_rule
by refactoring existing code
Patch 2 adds a devlink param to enable/disable counters for default
rules. Once enabled, counters can
Patch 3 adds documentation for devlink params
v4: https://lore.kernel.org/
20241029035739.
1981839-1-lcherian@marvell.com
====================
Link: https://patch.msgid.link/20241105125620.2114301-1-lcherian@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linu Cherian [Tue, 5 Nov 2024 12:56:20 +0000 (18:26 +0530)]
devlink: Add documentation for OcteonTx2 AF
Add documentation for the following devlink params
- npc_mcam_high_zone_percent
- npc_def_rule_cntr
- nix_maxlf
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Link: https://patch.msgid.link/20241105125620.2114301-4-lcherian@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linu Cherian [Tue, 5 Nov 2024 12:56:19 +0000 (18:26 +0530)]
octeontx2-af: Knobs for NPC default rule counters
Add devlink knobs to enable/disable counters on NPC
default rule entries.
Sample command to enable default rule counters:
devlink dev param set <dev> name npc_def_rule_cntr value true cmode runtime
Sample command to read the counter:
cat /sys/kernel/debug/cn10k/npc/mcam_rules
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Link: https://patch.msgid.link/20241105125620.2114301-3-lcherian@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linu Cherian [Tue, 5 Nov 2024 12:56:18 +0000 (18:26 +0530)]
octeontx2-af: Refactor few NPC mcam APIs
Introduce lowlevel variant of rvu_mcam_remove/add_counter_from/to_rule
for better code reuse, which assumes necessary locks are taken at
higher level.
These low level functions would be used for implementing default rule
counter APIs in the subsequent patch.
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241105125620.2114301-2-lcherian@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Caleb Sander Mateos [Thu, 7 Nov 2024 18:30:52 +0000 (11:30 -0700)]
mlx5/core: deduplicate {mlx5_,}eq_update_ci()
The logic of eq_update_ci() is duplicated in mlx5_eq_update_ci(). The
only additional work done by mlx5_eq_update_ci() is to increment
eq->cons_index. Call eq_update_ci() from mlx5_eq_update_ci() to avoid
the duplication.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241107183054.2443218-2-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Caleb Sander Mateos [Thu, 7 Nov 2024 18:30:51 +0000 (11:30 -0700)]
mlx5/core: relax memory barrier in eq_update_ci()
The memory barrier in eq_update_ci() after the doorbell write is a
significant hot spot in mlx5_eq_comp_int(). Under heavy TCP load, we see
3% of CPU time spent on the mfence instruction.
98df6d5b877c ("net/mlx5: A write memory barrier is sufficient in EQ ci
update") already relaxed the full memory barrier to just a write barrier
in mlx5_eq_update_ci(), which duplicates eq_update_ci(). So replace mb()
with wmb() in eq_update_ci() too.
On strongly ordered architectures, no barrier is actually needed because
the MMIO writes to the doorbell register are guaranteed to appear to the
device in the order they were made. However, the kernel's ordered MMIO
primitive writel() lacks a convenient big-endian interface.
Therefore, we opt to stick with __raw_writel() + a barrier.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241107183054.2443218-1-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 11 Nov 2024 22:12:23 +0000 (14:12 -0800)]
Merge branch 'macsec-inherit-lower-device-s-features-and-tso-limits-when-offloading'
Sabrina Dubroca says:
====================
macsec: inherit lower device's features and TSO limits when offloading
When macsec is offloaded to a NIC, we can take advantage of some of
its features, mainly TSO and checksumming. This increases performance
significantly. Some features cannot be inherited, because they require
additional ops that aren't provided by the macsec netdevice.
We also need to inherit TSO limits from the lower device, like
VLAN/macvlan devices do.
This series also moves the existing macsec offload selftest to the
netdevsim selftests before adding tests for the new features. To allow
this new selftest to work, netdevsim's hw_features are expanded.
====================
Link: https://patch.msgid.link/cover.1730929545.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Wed, 6 Nov 2024 23:13:34 +0000 (00:13 +0100)]
selftests: netdevsim: add ethtool features to macsec offload tests
The test verifies that available features aren't changed by toggling
offload on the device. Creating a device with offload off and then
enabling it later should result in the same features as creating the
device with offload enabled directly.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/ba801bd0a75b02de2dddbfc77f9efceb8b3d8a2e.1730929545.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Wed, 6 Nov 2024 23:13:33 +0000 (00:13 +0100)]
selftests: netdevsim: add test toggling macsec offload
The test verifies that toggling offload works (both via rtnetlink and
macsec's genetlink APIs). This is only possible when no SA is
configured.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/bf8e27ee0d921caa4eb35f1e830eca6d4080ddb2.1730929545.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Wed, 6 Nov 2024 23:13:32 +0000 (00:13 +0100)]
selftests: move macsec offload tests from net/rtnetlink to drivers/net/netdvesim
We're going to expand this test, and macsec offload is only lightly
related to rtnetlink.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/a1f92c250cc129b4bb111a206c4b560bab4e24a5.1730929545.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Wed, 6 Nov 2024 23:13:31 +0000 (00:13 +0100)]
macsec: inherit lower device's TSO limits when offloading
If macsec is offloaded, we need to follow the lower device's
capabilities, like VLAN devices do.
Leave the limits unchanged when the offload is disabled.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/8240c0181e851f169d815f59658a01fb9dfc5073.1730929545.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Wed, 6 Nov 2024 23:13:30 +0000 (00:13 +0100)]
macsec: clean up local variables in macsec_notify
For all events, we need to loop over the list of secys, so let's move
the common variables out of the switch/case.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/9b8996af518fbeb3b7d527feb15d5788495e3108.1730929545.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Wed, 6 Nov 2024 23:13:29 +0000 (00:13 +0100)]
macsec: add some of the lower device's features when offloading
This commit extends the set of netdevice features supported by macsec
devices when offload is enabled, which increases performance
significantly (for a single TCP stream: 17.5Gbps to 38.5Gbps on my
test machines).
Commit
c850240b6c41 ("net: macsec: report real_dev features when HW
offloading is enabled") previously attempted something similar, but
had to be reverted (commit
8bcd560ae878 ("Revert "net: macsec: report
real_dev features when HW offloading is enabled"")) because the set of
features it exposed was too large.
During initialization, all features are set, and they're then removed
via ndo_fix_features (macsec_fix_features). This allows the
offloadable features to be automatically enabled if offloading is
turned on after device creation.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/8b32c3011d269d6f149724e80c1ffe67c9534067.1730929545.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Wed, 6 Nov 2024 23:13:28 +0000 (00:13 +0100)]
selftests: netdevsim: add a test checking ethtool features
Add a test checking that some features are active by default and
changeable.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/fff58fa70f8a300440958b5020f6a4eb2e9dad61.1730929545.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Wed, 6 Nov 2024 23:13:27 +0000 (00:13 +0100)]
netdevsim: add more hw_features
netdevsim currently only set HW_TC in its hw_features, but other
features should also be present to better reflect the behavior of real
HW.
In my macsec offload testing, this ends up as HW_CSUM being missing
from hw_features, so it doesn't stick in wanted_features when offload
is turned off. Then HW_CSUM (and thus TSO, thanks to
netdev_fix_features) is not automatically turned back on when offload
is re-enabled.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/b918dc4dd76410a57f7516a855f66b0a2bd58326.1730929545.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 11 Nov 2024 18:56:30 +0000 (10:56 -0800)]
Merge branch 'replace-page_frag-with-page_frag_cache-part-1'
Yunsheng Lin says:
====================
Replace page_frag with page_frag_cache (Part-1)
This is part 1 of "Replace page_frag with page_frag_cache",
which mainly contain refactoring and optimization for the
implementation of page_frag API before the replacing.
As the discussion in [1], it would be better to target net-next
tree to get more testing as all the callers page_frag API are
in networking, and the chance of conflicting with MM tree seems
low as implementation of page_frag API seems quite self-contained.
After [2], there are still two implementations for page frag:
1. mm/page_alloc.c: net stack seems to be using it in the
rx part with 'struct page_frag_cache' and the main API
being page_frag_alloc_align().
2. net/core/sock.c: net stack seems to be using it in the
tx part with 'struct page_frag' and the main API being
skb_page_frag_refill().
This patchset tries to unfiy the page frag implementation
by replacing page_frag with page_frag_cache for sk_page_frag()
first. net_high_order_alloc_disable_key for the implementation
in net/core/sock.c doesn't seems matter that much now as pcp
is also supported for high-order pages:
commit
44042b449872 ("mm/page_alloc: allow high-order pages to
be stored on the per-cpu lists")
As the related change is mostly related to networking, so
targeting the net-next. And will try to replace the rest
of page_frag in the follow patchset.
After this patchset:
1. Unify the page frag implementation by taking the best out of
two the existing implementations: we are able to save some space
for the 'page_frag_cache' API user, and avoid 'get_page()' for
the old 'page_frag' API user.
2. Future bugfix and performance can be done in one place, hence
improving maintainability of page_frag's implementation.
Kernel Image changing:
Linux Kernel total | text data bss
------------------------------------------------------
after
45250307 |
27274279 17209996 766032
before
45254134 |
27278118 17209984 766032
delta -3827 | -3839 +12 +0
1. https://lore.kernel.org/all/
add10dd4-7f5d-4aa1-aa04-
767590f944e0@redhat.com/
2. https://lore.kernel.org/all/
20240228093013.8263-1-linyunsheng@huawei.com/
====================
Link: https://patch.msgid.link/20241028115343.3405838-1-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yunsheng Lin [Mon, 28 Oct 2024 11:53:42 +0000 (19:53 +0800)]
mm: page_frag: use __alloc_pages() to replace alloc_pages_node()
It seems there is about 24Bytes binary size increase for
__page_frag_cache_refill() after refactoring in arm64 system
with 64K PAGE_SIZE. By doing the gdb disassembling, It seems
we can have more than 100Bytes decrease for the binary size
by using __alloc_pages() to replace alloc_pages_node(), as
there seems to be some unnecessary checking for nid being
NUMA_NO_NODE, especially when page_frag is part of the mm
system.
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-8-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yunsheng Lin [Mon, 28 Oct 2024 11:53:41 +0000 (19:53 +0800)]
mm: page_frag: reuse existing space for 'size' and 'pfmemalloc'
Currently there is one 'struct page_frag' for every 'struct
sock' and 'struct task_struct', we are about to replace the
'struct page_frag' with 'struct page_frag_cache' for them.
Before begin the replacing, we need to ensure the size of
'struct page_frag_cache' is not bigger than the size of
'struct page_frag', as there may be tens of thousands of
'struct sock' and 'struct task_struct' instances in the
system.
By or'ing the page order & pfmemalloc with lower bits of
'va' instead of using 'u16' or 'u32' for page size and 'u8'
for pfmemalloc, we are able to avoid 3 or 5 bytes space waste.
And page address & pfmemalloc & order is unchanged for the
same page in the same 'page_frag_cache' instance, it makes
sense to fit them together.
After this patch, the size of 'struct page_frag_cache' should be
the same as the size of 'struct page_frag'.
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-7-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yunsheng Lin [Mon, 28 Oct 2024 11:53:40 +0000 (19:53 +0800)]
xtensa: remove the get_order() implementation
As the get_order() implemented by xtensa supporting 'nsau'
instruction seems be the same as the generic implementation
in include/asm-generic/getorder.h when size is not a constant
value as the generic implementation calling the fls*() is also
utilizing the 'nsau' instruction for xtensa.
So remove the get_order() implemented by xtensa, as using the
generic implementation may enable the compiler to do the
computing when size is a constant value instead of runtime
computing and enable the using of get_order() in BUILD_BUG_ON()
macro in next patch.
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-6-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yunsheng Lin [Mon, 28 Oct 2024 11:53:39 +0000 (19:53 +0800)]
mm: page_frag: avoid caller accessing 'page_frag_cache' directly
Use appropriate frag_page API instead of caller accessing
'page_frag_cache' directly.
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20241028115343.3405838-5-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yunsheng Lin [Mon, 28 Oct 2024 11:53:38 +0000 (19:53 +0800)]
mm: page_frag: use initial zero offset for page_frag_alloc_align()
We are about to use page_frag_alloc_*() API to not just
allocate memory for skb->data, but also use them to do
the memory allocation for skb frag too. Currently the
implementation of page_frag in mm subsystem is running
the offset as a countdown rather than count-up value,
there may have several advantages to that as mentioned
in [1], but it may have some disadvantages, for example,
it may disable skb frag coalescing and more correct cache
prefetching
We have a trade-off to make in order to have a unified
implementation and API for page_frag, so use a initial zero
offset in this patch, and the following patch will try to
make some optimization to avoid the disadvantages as much
as possible.
1. https://lore.kernel.org/all/
f4abe71b3439b39d17a6fb2d410180f367cadf5c.camel@gmail.com/
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-4-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yunsheng Lin [Mon, 28 Oct 2024 11:53:37 +0000 (19:53 +0800)]
mm: move the page fragment allocator from page_alloc into its own file
Inspired by [1], move the page fragment allocator from page_alloc
into its own c file and header file, as we are about to make more
change for it to replace another page_frag implementation in
sock.c
As this patchset is going to replace 'struct page_frag' with
'struct page_frag_cache' in sched.h, including page_frag_cache.h
in sched.h has a compiler error caused by interdependence between
mm_types.h and mm.h for asm-offsets.c, see [2]. So avoid the compiler
error by moving 'struct page_frag_cache' to mm_types_task.h as
suggested by Alexander, see [3].
1. https://lore.kernel.org/all/
20230411160902.
4134381-3-dhowells@redhat.com/
2. https://lore.kernel.org/all/
15623dac-9358-4597-b3ee-
3694a5956920@gmail.com/
3. https://lore.kernel.org/all/CAKgT0UdH1yD=LSCXFJ=YM_aiA4OomD-2wXykO42bizaWMt_HOA@mail.gmail.com/
CC: David Howells <dhowells@redhat.com>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-3-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yunsheng Lin [Mon, 28 Oct 2024 11:53:36 +0000 (19:53 +0800)]
mm: page_frag: add a test module for page_frag
The testing is done by ensuring that the fragment allocated
from a frag_frag_cache instance is pushed into a ptr_ring
instance in a kthread binded to a specified cpu, and a kthread
binded to a specified cpu will pop the fragment from the
ptr_ring and free the fragment.
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linux-MM <linux-mm@kvack.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/20241028115343.3405838-2-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Johannes Berg [Fri, 8 Nov 2024 10:41:45 +0000 (11:41 +0100)]
net: convert to nla_get_*_default()
Most of the original conversion is from the spatch below,
but I edited some and left out other instances that were
either buggy after conversion (where default values don't
fit into the type) or just looked strange.
@@
expression attr, def;
expression val;
identifier fn =~ "^nla_get_.*";
fresh identifier dfn = fn ## "_default";
@@
(
-if (attr)
- val = fn(attr);
-else
- val = def;
+val = dfn(attr, def);
|
-if (!attr)
- val = def;
-else
- val = fn(attr);
+val = dfn(attr, def);
|
-if (!attr)
- return def;
-return fn(attr);
+return dfn(attr, def);
|
-attr ? fn(attr) : def
+dfn(attr, def)
|
-!attr ? def : fn(attr)
+dfn(attr, def)
)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org>
Link: https://patch.msgid.link/20241108114145.0580b8684e7f.I740beeaa2f70ebfc19bfca1045a24d6151992790@changeid
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Johannes Berg [Fri, 8 Nov 2024 10:41:44 +0000 (11:41 +0100)]
net: netlink: add nla_get_*_default() accessors
There are quite a number of places that use patterns
such as
if (attr)
val = nla_get_u16(attr);
else
val = DEFAULT;
Add nla_get_u16_default() and friends like that to
not have to type this out all the time.
Acked-by: Toke Høiland-Jørgensen <toke@kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://patch.msgid.link/20241108114145.acd2aadb03ac.I3df6aac71d38a5baa1c0a03d0c7e82d4395c030e@changeid
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Tue, 5 Nov 2024 13:39:54 +0000 (15:39 +0200)]
bridge: Allow deleting FDB entries with non-existent VLAN
It is currently impossible to delete individual FDB entries (as opposed
to flushing) that were added with a VLAN that no longer exists:
# ip link add name dummy1 up type dummy
# ip link add name br1 up type bridge vlan_filtering 1
# ip link set dev dummy1 master br1
# bridge fdb add 00:11:22:33:44:55 dev dummy1 master static vlan 1
# bridge vlan del vid 1 dev dummy1
# bridge fdb get 00:11:22:33:44:55 br br1 vlan 1
00:11:22:33:44:55 dev dummy1 vlan 1 master br1 static
# bridge fdb del 00:11:22:33:44:55 dev dummy1 master vlan 1
RTNETLINK answers: Invalid argument
# bridge fdb get 00:11:22:33:44:55 br br1 vlan 1
00:11:22:33:44:55 dev dummy1 vlan 1 master br1 static
This is in contrast to MDB entries that can be deleted after the VLAN
was deleted:
# bridge vlan add vid 10 dev dummy1
# bridge mdb add dev br1 port dummy1 grp 239.1.1.1 permanent vid 10
# bridge vlan del vid 10 dev dummy1
# bridge mdb get dev br1 grp 239.1.1.1 vid 10
dev br1 port dummy1 grp 239.1.1.1 permanent vid 10
# bridge mdb del dev br1 port dummy1 grp 239.1.1.1 permanent vid 10
# bridge mdb get dev br1 grp 239.1.1.1 vid 10
Error: bridge: MDB entry not found.
Align the two interfaces and allow user space to delete FDB entries that
were added with a VLAN that no longer exists:
# ip link add name dummy1 up type dummy
# ip link add name br1 up type bridge vlan_filtering 1
# ip link set dev dummy1 master br1
# bridge fdb add 00:11:22:33:44:55 dev dummy1 master static vlan 1
# bridge vlan del vid 1 dev dummy1
# bridge fdb get 00:11:22:33:44:55 br br1 vlan 1
00:11:22:33:44:55 dev dummy1 vlan 1 master br1 static
# bridge fdb del 00:11:22:33:44:55 dev dummy1 master vlan 1
# bridge fdb get 00:11:22:33:44:55 br br1 vlan 1
Error: Fdb entry not found.
Add a selftest to make sure this behavior does not regress:
# ./rtnetlink.sh -t kci_test_fdb_del
PASS: bridge fdb del
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Andy Roulin <aroulin@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20241105133954.350479-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Caleb Sander Mateos [Tue, 5 Nov 2024 20:39:59 +0000 (13:39 -0700)]
mlx5/core: Schedule EQ comp tasklet only if necessary
Currently, the mlx5_eq_comp_int() interrupt handler schedules a tasklet
to call mlx5_cq_tasklet_cb() if it processes any completions. For CQs
whose completions don't need to be processed in tasklet context, this
adds unnecessary overhead. In a heavy TCP workload, we see 4% of CPU
time spent on the tasklet_trylock() in tasklet_action_common(), with a
smaller amount spent on the atomic operations in tasklet_schedule(),
tasklet_clear_sched(), and locking the spinlock in mlx5_cq_tasklet_cb().
TCP completions are handled by mlx5e_completion_event(), which schedules
NAPI to poll the queue, so they don't need tasklet processing.
Schedule the tasklet in mlx5_add_cq_to_tasklet() instead to avoid this
overhead. mlx5_add_cq_to_tasklet() is responsible for enqueuing the CQs
to be processed in tasklet context, so it can schedule the tasklet. CQs
that need tasklet processing have their interrupt comp handler set to
mlx5_add_cq_to_tasklet(), so they will schedule the tasklet. CQs that
don't need tasklet processing won't schedule the tasklet. To avoid
scheduling the tasklet multiple times during the same interrupt, only
schedule the tasklet in mlx5_add_cq_to_tasklet() if the tasklet work
queue was empty before the new CQ was pushed to it.
The additional branch in mlx5_add_cq_to_tasklet(), called for each EQE,
may add a small cost for the userspace Infiniband CQs whose completions
are processed in tasklet context. But this seems worth it to avoid the
tasklet overhead for CQs that don't need it.
Note that the mlx4 driver works the same way: it schedules the tasklet
in mlx4_add_cq_to_tasklet() and only if the work queue was empty before.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://patch.msgid.link/20241105204000.1807095-1-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 9 Nov 2024 21:22:58 +0000 (13:22 -0800)]
Merge branch 'improve-neigh_flush_dev-performance'
Gilad Naaman says:
====================
Improve neigh_flush_dev performance
This patchsets improves the performance of neigh_flush_dev.
Currently, the only way to implement it requires traversing
all neighbours known to the kernel, across all network-namespaces.
This means that some flows are slowed down as a function of neigh-scale,
even if the specific link they're handling has little to no neighbours.
In order to solve this, this patchset adds a netdev->neighbours list,
as well as making the original linked-list doubly-, so that it is
possible to unlink neighbours without traversing the hash-bucket to
obtain the previous neighbour.
The original use-case we encountered was mass-deletion of links (12K
VLANs) while there are 50K ARPs and 50K NDPs in the system; though the
slowdowns would also appear when the links are set down.
====================
Link: https://patch.msgid.link/20241107160444.2913124-1-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gilad Naaman [Thu, 7 Nov 2024 16:04:43 +0000 (16:04 +0000)]
neighbour: Create netdev->neighbour association
Create a mapping between a netdev and its neighoburs,
allowing for much cheaper flushes.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20241107160444.2913124-7-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gilad Naaman [Thu, 7 Nov 2024 16:04:42 +0000 (16:04 +0000)]
neighbour: Remove bare neighbour::next pointer
Remove the now-unused neighbour::next pointer, leaving struct neighbour
solely with the hlist_node implementation.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241107160444.2913124-6-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gilad Naaman [Thu, 7 Nov 2024 16:04:41 +0000 (16:04 +0000)]
neighbour: Convert iteration to use hlist+macro
Remove all usage of the bare neighbour::next pointer,
replacing them with neighbour::hash and its for_each macro.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241107160444.2913124-5-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gilad Naaman [Thu, 7 Nov 2024 16:04:40 +0000 (16:04 +0000)]
neighbour: Convert seq_file functions to use hlist
Convert seq_file-related neighbour functionality to use neighbour::hash
and the related for_each macro.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241107160444.2913124-4-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gilad Naaman [Thu, 7 Nov 2024 16:04:39 +0000 (16:04 +0000)]
neighbour: Define neigh_for_each_in_bucket
Introduce neigh_for_each_in_bucket in neighbour.h, to help iterate over
the neighbour table more succinctly.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241107160444.2913124-3-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gilad Naaman [Thu, 7 Nov 2024 16:04:38 +0000 (16:04 +0000)]
neighbour: Add hlist_node to struct neighbour
Add a doubly-linked node to neighbours, so that they
can be deleted without iterating the entire bucket they're in.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241107160444.2913124-2-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 9 Nov 2024 21:21:16 +0000 (13:21 -0800)]
Merge branch 'r8169-improve-wol-suspend-related-code'
Heiner Kallweit says:
====================
r8169: improve wol/suspend-related code
This series improves wol/suspend-related code parts.
====================
Link: https://patch.msgid.link/be734d10-37f7-4830-b7c2-367c0a656c08@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 6 Nov 2024 16:57:08 +0000 (17:57 +0100)]
r8169: align WAKE_PHY handling with r8125/r8126 vendor drivers
Vendor drivers r8125/r8126 apply this additional magic setting when
enabling WAKE_PHY, so do the same here.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/51130715-45be-4db5-abb7-05d87e1f5df9@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 6 Nov 2024 16:56:28 +0000 (17:56 +0100)]
r8169: improve rtl_set_d3_pll_down
Make use of new helper r8169_mod_reg8_cond() and move from a switch()
to an if() clause. Benefit is that we don't have to touch this piece of
code each time support for a new chip version is added.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/e1ccdb85-a4ed-4800-89c2-89770ff06452@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 6 Nov 2024 16:55:45 +0000 (17:55 +0100)]
r8169: improve __rtl8169_set_wol
Add helper r8169_mod_reg8_cond() what allows to significantly simplify
__rtl8169_set_wol().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/697b197a-8eac-40c6-8847-27093cacec36@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Abhinav Saxena [Fri, 8 Nov 2024 19:56:42 +0000 (12:56 -0700)]
tc: fix typo probabilty in tc.yaml doc
Fix spelling of "probability" in tc.yaml documentation. This corrects
the max-P field description in struct tc_sfq_qopt_v1.
Signed-off-by: Abhinav Saxena <xandfury@gmail.com>
Link: https://patch.msgid.link/20241108195642.139315-1-xandfury@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Andrew Kreimer [Wed, 6 Nov 2024 11:24:20 +0000 (13:24 +0200)]
mISDN: Fix typos
Fix typos:
- syncronized -> synchronized
- interfacs -> interface
- otherwhise -> otherwise
- ony -> only
- busses -> buses
- maxinum -> maximum
Via codespell.
Reported-by: Simon Horman <horms@kernel.org>
Signed-off-by: Andrew Kreimer <algonell@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241106112513.9559-1-algonell@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hyunwoo Kim [Wed, 6 Nov 2024 09:36:04 +0000 (04:36 -0500)]
hv_sock: Initializing vsk->trans to NULL to prevent a dangling pointer
When hvs is released, there is a possibility that vsk->trans may not
be initialized to NULL, which could lead to a dangling pointer.
This issue is resolved by initializing vsk->trans to NULL.
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/Zys4hCj61V+mQfX2@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Tue, 5 Nov 2024 23:18:55 +0000 (15:18 -0800)]
net: sfc: use ethtool string helpers
The latter is the preferred way to copy ethtool strings.
Avoids manually incrementing the pointer. Cleans up the code quite well.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20241105231855.235894-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
MoYuanhao [Wed, 6 Nov 2024 07:10:35 +0000 (15:10 +0800)]
mptcp: remove the redundant assignment of 'new_ctx->tcp_sock' in subflow_ulp_clone()
The variable has already been assigned in the subflow_create_ctx(),
So we don't need to reassign this variable in the subflow_ulp_clone().
Signed-off-by: MoYuanhao <moyuanhao3676@163.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241106071035.2591-1-moyuanhao3676@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Khang Nguyen [Tue, 5 Nov 2024 07:19:15 +0000 (14:19 +0700)]
net: mctp: Expose transport binding identifier via IFLA attribute
MCTP control protocol implementations are transport binding dependent.
Endpoint discovery is mandatory based on transport binding.
Message timing requirements are specified in each respective transport
binding specification.
However, we currently have no means to get this information from MCTP
links.
Add a IFLA_MCTP_PHYS_BINDING netlink link attribute, which represents
the transport type using the DMTF DSP0239-defined type numbers, returned
as part of RTM_GETLINK data.
We get an IFLA_MCTP_PHYS_BINDING attribute for each MCTP link, for
example:
- 0x00 (unspec) for loopback interface;
- 0x01 (SMBus/I2C) for mctpi2c%d interfaces; and
- 0x05 (serial) for mctpserial%d interfaces.
Signed-off-by: Khang Nguyen <khangng@os.amperecomputing.com>
Reviewed-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://patch.msgid.link/20241105071915.821871-1-khangng@os.amperecomputing.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jianbo Liu [Tue, 5 Nov 2024 19:27:21 +0000 (21:27 +0200)]
bonding: add ESP offload features when slaves support
Add NETIF_F_GSO_ESP bit to bond's gso_partial_features if all slaves
support it, such that ESP segmentation is handled by hardware if possible.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241105192721.584822-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 8 Nov 2024 04:34:59 +0000 (20:34 -0800)]
Merge branch 'netlink-specs-add-neigh-and-rule-ynl-specs'
Donald Hunter says:
====================
netlink: specs: Add neigh and rule YNL specs
Add YNL specs for the FDB neighbour tables and FIB rules from the
rtnelink families.
Example usage:
./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/rt_neigh.yaml \
--dump getneigh
[{'cacheinfo': {'confirmed':
122664055,
'refcnt': 0,
'updated':
122658055,
'used':
122658055},
'dst': '0.0.0.0',
'family': 2,
'flags': set(),
'ifindex': 5,
'lladr': '',
'probes': 0,
'state': {'noarp'},
'type': 'broadcast'},
...]
./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/rt_rule.yaml \
--dump getrule --json '{"family": 2}'
[{'action': 'to-tbl',
'dst-len': 0,
'family': 2,
'flags': 0,
'protocol': 2,
'src-len': 0,
'suppress-prefixlen': '0xffffffff',
'table': 255,
'tos': 0},
... ]
====================
Link: https://patch.msgid.link/20241106090718.64713-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Wed, 6 Nov 2024 09:07:18 +0000 (09:07 +0000)]
netlink: specs: Add a spec for FIB rule management
Add a YNL spec for FIB rules:
./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/rt_rule.yaml \
--dump getrule --json '{"family": 2}'
[{'action': 'to-tbl',
'dst-len': 0,
'family': 2,
'flags': 0,
'protocol': 2,
'src-len': 0,
'suppress-prefixlen': '0xffffffff',
'table': 255,
'tos': 0},
... ]
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20241106090718.64713-3-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Wed, 6 Nov 2024 09:07:17 +0000 (09:07 +0000)]
netlink: specs: Add a spec for neighbor tables in rtnetlink
Add a YNL spec for neighbour tables and neighbour entries in rtnetlink.
./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/rt_neigh.yaml \
--dump getneigh
[{'cacheinfo': {'confirmed':
122664055,
'refcnt': 0,
'updated':
122658055,
'used':
122658055},
'dst': '0.0.0.0',
'family': 2,
'flags': set(),
'ifindex': 5,
'lladr': '',
'probes': 0,
'state': {'noarp'},
'type': 'broadcast'},
...]
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20241106090718.64713-2-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 6 Nov 2024 13:18:17 +0000 (13:18 +0000)]
phonet: do not call synchronize_rcu() from phonet_route_del()
Calling synchronize_rcu() while holding rcu_read_lock() is not
permitted [1]
Move the synchronize_rcu() + dev_put() to route_doit().
Alternative would be to not use rcu_read_lock() in route_doit().
[1]
WARNING: suspicious RCU usage
6.12.0-rc5-syzkaller-01056-gf07a6e6ceb05 #0 Not tainted
-----------------------------
kernel/rcu/tree.c:4092 Illegal synchronize_rcu() in RCU read-side critical section!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by syz-executor427/5840:
#0:
ffffffff8e937da0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
#0:
ffffffff8e937da0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
#0:
ffffffff8e937da0 (rcu_read_lock){....}-{1:2}, at: route_doit+0x3d6/0x640 net/phonet/pn_netlink.c:264
stack backtrace:
CPU: 1 UID: 0 PID: 5840 Comm: syz-executor427 Not tainted
6.12.0-rc5-syzkaller-01056-gf07a6e6ceb05 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
lockdep_rcu_suspicious+0x226/0x340 kernel/locking/lockdep.c:6821
synchronize_rcu+0xea/0x360 kernel/rcu/tree.c:4089
phonet_route_del+0xc6/0x140 net/phonet/pn_dev.c:409
route_doit+0x514/0x640 net/phonet/pn_netlink.c:275
rtnetlink_rcv_msg+0x791/0xcf0 net/core/rtnetlink.c:6790
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:729 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:744
sock_write_iter+0x2d7/0x3f0 net/socket.c:1165
new_sync_write fs/read_write.c:590 [inline]
vfs_write+0xaeb/0xd30 fs/read_write.c:683
ksys_write+0x183/0x2b0 fs/read_write.c:736
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes:
17a1ac0018ae ("phonet: Don't hold RTNL for route_doit().")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Remi Denis-Courmont <courmisch@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20241106131818.1240710-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guillaume Nault [Wed, 6 Nov 2024 21:37:32 +0000 (22:37 +0100)]
ipv4: Prepare ip_route_output() to future .flowi4_tos conversion.
Convert the "tos" parameter of ip_route_output() to dscp_t. This way
we'll have a dscp_t value directly available when .flowi4_tos will
eventually be converted to dscp_t.
All ip_route_output() callers but one set this "tos" parameter to 0 and
therefore don't need to be adapted to the new prototype.
Only br_nf_pre_routing_finish() needs conversion. It can just use
ip4h_dscp() to get the DSCP field from the IPv4 header.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/0f10d031dd44c70aae9bc6e19391cb30d5c2fe71.1730928699.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 8 Nov 2024 04:31:10 +0000 (20:31 -0800)]
Merge branch 'net-phy-remove-genphy_config_eee_advert'
Heiner Kallweit says:
====================
net: phy: remove genphy_config_eee_advert
This series removes genphy_config_eee_advert().
Note: The change to bcm_config_lre_aneg() is compile-tested only
as I don't have supported hardware.
====================
Link: https://patch.msgid.link/69d22b31-57d1-4b01-bfde-0c6a1df1e310@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 6 Nov 2024 20:25:12 +0000 (21:25 +0100)]
net: phy: remove genphy_config_eee_advert
bcm_config_lre_aneg() doesn't use genphy_config_eee_advert() any longer.
As this was the only user, we can remove genphy_config_eee_advert() now.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/37da7f3e-b883-4c07-9881-b8c0516822b7@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 6 Nov 2024 20:24:18 +0000 (21:24 +0100)]
net: phy: broadcom: use genphy_c45_an_config_eee_aneg in bcm_config_lre_aneg
bcm_config_lre_aneg() is the only user of genphy_config_eee_advert(),
therefore use genphy_c45_an_config_eee_aneg() instead. The resulting
functionality is equivalent, and bcm_config_lre_aneg() follows the
structure of __genphy_config_aneg().
In a follow-up step genphy_config_eee_advert() can be removed.
Note: We preserve the current behavior to ignore errors.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/6e5cd4ab-28bb-4d82-b449-fec85f3d1e8a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 6 Nov 2024 20:21:11 +0000 (21:21 +0100)]
net: phy: export genphy_c45_an_config_eee_aneg
We'll use this function in bcm_config_lre_aneg(), therefore export it.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/02bd7c39-7413-4433-bafc-a276089bd292@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 6 Nov 2024 20:19:59 +0000 (21:19 +0100)]
net: phy: make genphy_c45_write_eee_adv() static
genphy_c45_write_eee_adv() isn't used outside phy-c45.c,
so make it static.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/d23bd784-44e6-4a15-af3a-b37379156521@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 1 Nov 2024 00:30:16 +0000 (17:30 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc7).
Conflicts:
drivers/net/ethernet/freescale/enetc/enetc_pf.c
e15c5506dd39 ("net: enetc: allocate vf_state during PF probes")
3774409fd4c6 ("net: enetc: build enetc_pf_common.c as a separate module")
https://lore.kernel.org/
20241105114100.
118bd35e@canb.auug.org.au
Adjacent changes:
drivers/net/ethernet/ti/am65-cpsw-nuss.c
de794169cf17 ("net: ethernet: ti: am65-cpsw: Fix multi queue Rx on J7")
4a7b2ba94a59 ("net: ethernet: ti: am65-cpsw: Use tstats instead of open coded version")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 7 Nov 2024 21:07:57 +0000 (11:07 -1000)]
Merge tag 'net-6.12-rc7' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from can and netfilter.
Things are slowing down quite a bit, mostly driver fixes here. No
known ongoing investigations.
Current release - new code bugs:
- eth: ti: am65-cpsw:
- fix multi queue Rx on J7
- fix warning in am65_cpsw_nuss_remove_rx_chns()
Previous releases - regressions:
- mptcp: do not require admin perm to list endpoints, got missed in a
refactoring
- mptcp: use sock_kfree_s instead of kfree
Previous releases - always broken:
- sctp: properly validate chunk size in sctp_sf_ootb() fix OOB access
- virtio_net: make RSS interact properly with queue number
- can: mcp251xfd: mcp251xfd_get_tef_len(): fix length calculation
- can: mcp251xfd: mcp251xfd_ring_alloc(): fix coalescing
configuration when switching CAN modes
Misc:
- revert earlier hns3 fixes, they were ignoring IOMMU abstractions
and need to be reworked
- can: {cc770,sja1000}_isa: allow building on x86_64"
* tag 'net-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (42 commits)
drivers: net: ionic: add missed debugfs cleanup to ionic_probe() error path
net/smc: do not leave a dangling sk pointer in __smc_create()
rxrpc: Fix missing locking causing hanging calls
net/smc: Fix lookup of netdev by using ib_device_get_netdev()
net: arc: rockchip: fix emac mdio node support
net: arc: fix the device for dma_map_single/dma_unmap_single
virtio_net: Update rss when set queue
virtio_net: Sync rss config to device when virtnet_probe
virtio_net: Add hash_key_length check
virtio_net: Support dynamic rss indirection table size
netfilter: nf_tables: wait for rcu grace period on net_device removal
net: stmmac: Fix unbalanced IRQ wake disable warning on single irq case
net: vertexcom: mse102x: Fix possible double free of TX skb
mptcp: use sock_kfree_s instead of kfree
mptcp: no admin perm to list endpoints
net: phy: ti: add PHY_RST_AFTER_CLK_EN flag
net: ethernet: ti: am65-cpsw: fix warning in am65_cpsw_nuss_remove_rx_chns()
net: ethernet: ti: am65-cpsw: Fix multi queue Rx on J7
net: hns3: fix kernel crash when uninstalling driver
Revert "Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'"
...
Wentao Liang [Thu, 7 Nov 2024 02:17:56 +0000 (10:17 +0800)]
drivers: net: ionic: add missed debugfs cleanup to ionic_probe() error path
The ionic_setup_one() creates a debugfs entry for ionic upon
successful execution. However, the ionic_probe() does not
release the dentry before returning, resulting in a memory
leak.
To fix this bug, we add the ionic_debugfs_del_dev() to release
the resources in a timely manner before returning.
Fixes:
0de38d9f1dba ("ionic: extract common bits from ionic_probe")
Signed-off-by: Wentao Liang <Wentao_liang_g@163.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20241107021756.1677-1-liangwentao@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 6 Nov 2024 22:19:22 +0000 (22:19 +0000)]
net/smc: do not leave a dangling sk pointer in __smc_create()
Thanks to commit
4bbd360a5084 ("socket: Print pf->create() when
it does not clear sock->sk on failure."), syzbot found an issue with AF_SMC:
smc_create must clear sock->sk on failure, family: 43, type: 1, protocol: 0
WARNING: CPU: 0 PID: 5827 at net/socket.c:1565 __sock_create+0x96f/0xa30 net/socket.c:1563
Modules linked in:
CPU: 0 UID: 0 PID: 5827 Comm: syz-executor259 Not tainted 6.12.0-rc6-next-
20241106-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:__sock_create+0x96f/0xa30 net/socket.c:1563
Code: 03 00 74 08 4c 89 e7 e8 4f 3b 85 f8 49 8b 34 24 48 c7 c7 40 89 0c 8d 8b 54 24 04 8b 4c 24 0c 44 8b 44 24 08 e8 32 78 db f7 90 <0f> 0b 90 90 e9 d3 fd ff ff 89 e9 80 e1 07 fe c1 38 c1 0f 8c ee f7
RSP: 0018:
ffffc90003e4fda0 EFLAGS:
00010246
RAX:
099c6f938c7f4700 RBX:
1ffffffff1a595fd RCX:
ffff888034823c00
RDX:
0000000000000000 RSI:
0000000000000000 RDI:
0000000000000000
RBP:
00000000ffffffe9 R08:
ffffffff81567052 R09:
1ffff920007c9f50
R10:
dffffc0000000000 R11:
fffff520007c9f51 R12:
ffffffff8d2cafe8
R13:
1ffffffff1a595fe R14:
ffffffff9a789c40 R15:
ffff8880764298c0
FS:
000055557b518380(0000) GS:
ffff8880b8600000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007fa62ff43225 CR3:
0000000031628000 CR4:
00000000003526f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<TASK>
sock_create net/socket.c:1616 [inline]
__sys_socket_create net/socket.c:1653 [inline]
__sys_socket+0x150/0x3c0 net/socket.c:1700
__do_sys_socket net/socket.c:1714 [inline]
__se_sys_socket net/socket.c:1712 [inline]
For reference, see commit
2d859aff775d ("Merge branch
'do-not-leave-dangling-sk-pointers-in-pf-create-functions'")
Fixes:
d25a92ccae6b ("net/smc: Introduce IPPROTO_SMC")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ignat Korchagin <ignat@cloudflare.com>
Cc: D. Wythe <alibuda@linux.alibaba.com>
Cc: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Link: https://patch.msgid.link/20241106221922.1544045-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Howells [Wed, 6 Nov 2024 13:03:22 +0000 (13:03 +0000)]
rxrpc: Fix missing locking causing hanging calls
If a call gets aborted (e.g. because kafs saw a signal) between it being
queued for connection and the I/O thread picking up the call, the abort
will be prioritised over the connection and it will be removed from
local->new_client_calls by rxrpc_disconnect_client_call() without a lock
being held. This may cause other calls on the list to disappear if a race
occurs.
Fix this by taking the client_call_lock when removing a call from whatever
list its ->wait_link happens to be on.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Fixes:
9d35d880e0e4 ("rxrpc: Move client call connection to the I/O thread")
Link: https://patch.msgid.link/726660.1730898202@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Wenjia Zhang [Wed, 6 Nov 2024 08:26:12 +0000 (09:26 +0100)]
net/smc: Fix lookup of netdev by using ib_device_get_netdev()
The SMC-R variant of the SMC protocol used direct call to function
ib_device_ops.get_netdev() to lookup netdev. As we used mlx5 device
driver to run SMC-R, it failed to find a device, because in mlx5_ib the
internal net device management for retrieving net devices was replaced
by a common interface ib_device_get_netdev() in commit
8d159eb2117b
("RDMA/mlx5: Use IB set_netdev and get_netdev functions").
Since such direct accesses to the internal net device management is not
recommended at all, update the SMC-R code to use proper API
ib_device_get_netdev().
Fixes:
54903572c23c ("net/smc: allow pnetid-less configuration")
Reported-by: Aswin K <aswin@linux.ibm.com>
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://patch.msgid.link/20241106082612.57803-1-wenjia@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 7 Nov 2024 17:41:34 +0000 (07:41 -1000)]
Merge tag 'pwm/for-6.12-rc7-fixes' of git://git./linux/kernel/git/ukleinek/linux
Pull pwm fix from Uwe Kleine-König:
"Fix period setting in imx-tpm driver and a maintainer update
Erik Schumacher found and fixed a problem in the calculation of the
PWM period setting yielding too long periods. Trevor Gamblin - who
already cared about mainlining the pwm-axi-pwmgen driver - stepped
forward as an additional reviewer.
Thanks to Erik and Trevor"
* tag 'pwm/for-6.12-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
MAINTAINERS: add self as reviewer for AXI PWM GENERATOR
pwm: imx-tpm: Use correct MODULO value for EPWM mode
David Wang [Wed, 6 Nov 2024 02:12:28 +0000 (10:12 +0800)]
proc/softirqs: replace seq_printf with seq_put_decimal_ull_width
seq_printf is costy, on a system with n CPUs, reading /proc/softirqs
would yield 10*n decimal values, and the extra cost parsing format string
grows linearly with number of cpus. Replace seq_printf with
seq_put_decimal_ull_width have significant performance improvement.
On an 8CPUs system, reading /proc/softirqs show ~40% performance
gain with this patch.
Signed-off-by: David Wang <00107082@163.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jakub Kicinski [Thu, 7 Nov 2024 16:16:42 +0000 (08:16 -0800)]
Merge tag 'nf-24-11-07' of git://git./linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fix for net
The following series contains a Netfilter fix:
1) Wait for rcu grace period after netdevice removal is reported via event.
* tag 'nf-24-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: wait for rcu grace period on net_device removal
====================
Link: https://patch.msgid.link/20241107113212.116634-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gilad Naaman [Mon, 4 Nov 2024 08:35:44 +0000 (08:35 +0000)]
sctp: Avoid enqueuing addr events redundantly
Avoid modifying or enqueuing new events if it's possible to tell that no
one will consume them.
Since enqueueing requires searching the current queue for opposite
events for the same address, adding addresses en-masse turns this
inetaddr_event into a bottle-neck, as it will get slower and slower
with each address added.
Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20241104083545.114-1-gnaaman@drivenets.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 7 Nov 2024 12:39:43 +0000 (13:39 +0100)]
Merge branch 'fix-the-arc-emac-driver'
Andy Yan says:
====================
Fix the arc emac driver
The arc emac driver was broken for a long time,
The first broken happens when a dma releated fix introduced in Linux 5.10.
The second broken happens when a emac device tree node restyle introduced
in Linux 6.1.
These two patches are try to make the arc emac work again.
Changes in v2:
- Add cover letter.
- Add fix tag.
- Add more detail explaination.
====================
Link: https://patch.msgid.link/20241104130147.440125-1-andyshrk@163.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Johan Jonker [Mon, 4 Nov 2024 13:01:39 +0000 (21:01 +0800)]
net: arc: rockchip: fix emac mdio node support
The binding emac_rockchip.txt is converted to YAML.
Changed against the original binding is an added MDIO subnode.
This make the driver failed to find the PHY, and given the 'mdio
has invalid PHY address' it is probably looking in the wrong node.
Fix emac_mdio.c so that it can handle both old and new
device trees.
Fixes:
1dabb74971b3 ("ARM: dts: rockchip: restyle emac nodes")
Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Tested-by: Andy Yan <andyshrk@163.com>
Link: https://lore.kernel.org/r/20220603163539.537-3-jbx6244@gmail.com
Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Johan Jonker [Mon, 4 Nov 2024 13:01:38 +0000 (21:01 +0800)]
net: arc: fix the device for dma_map_single/dma_unmap_single
The ndev->dev and pdev->dev aren't the same device, use ndev->dev.parent
which has dma_mask, ndev->dev.parent is just pdev->dev.
Or it would cause the following issue:
[ 39.933526] ------------[ cut here ]------------
[ 39.938414] WARNING: CPU: 1 PID: 501 at kernel/dma/mapping.c:149 dma_map_page_attrs+0x90/0x1f8
Fixes:
f959dcd6ddfd ("dma-direct: Fix potential NULL pointer dereference")
Signed-off-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Signed-off-by: Andy Yan <andy.yan@rock-chips.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 7 Nov 2024 12:33:48 +0000 (13:33 +0100)]
Merge branch 'net-wwan-t7xx-add-t7xx-debug-ports'
Jinjian Song says:
====================
net: wwan: t7xx: Add t7xx debug ports
Add support for t7xx WWAN device to debug by ADB (Android Debug Bridge)
port and MTK MIPCi (Modem Information Process Center) port.
Application can use ADB (Android Debug Bridge) port to implement
functions (shell, pull, push ...) by ADB protocol commands.
Application can use MIPC (Modem Information Process Center) port
to debug antenna tuner or noise profiling through this MTK modem
diagnostic interface.
====================
Link: https://patch.msgid.link/20241104094436.466861-1-jinjian.song@fibocom.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jinjian Song [Mon, 4 Nov 2024 09:44:36 +0000 (17:44 +0800)]
net: wwan: t7xx: Unify documentation column width
Unify the column width of the document to comply with specifications.
Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jinjian Song [Mon, 4 Nov 2024 09:44:35 +0000 (17:44 +0800)]
net: wwan: t7xx: Add debug ports
Add support for userspace to enable/disable the debug ports(ADB,MIPC).
- ADB port: /dev/wwan0adb0
- MIPC port: /dev/wwan0mipc0
Application can use ADB (Android Debug Bridge) port to implement
functions (shell, pull, push ...) by ADB protocol commands.
E.g., ADB commands:
- A_OPEN: OPEN(local-id, 0, "destination")
- A_WRTE: WRITE(local-id, remote-id, "data")
- A_OKEY: READY(local-id, remote-id, "")
- A_CLSE: CLOSE(local-id, remote-id, "")
Link: https://android.googlesource.com/platform/packages/modules/adb/+/refs/heads/main/README.md
Application can use MIPC (Modem Information Process Center) port
to debug antenna tuner or noise profiling through this MTK modem
diagnostic interface.
By default, debug ports are not exposed, so using the command
to enable or disable debug ports.
Enable debug ports:
- enable: 'echo 1 > /sys/bus/pci/devices/${bdf}/t7xx_debug_ports
Disable debug ports:
- disable: 'echo 0 > /sys/bus/pci/devices/${bdf}/t7xx_debug_ports
Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jinjian Song [Mon, 4 Nov 2024 09:44:34 +0000 (17:44 +0800)]
wwan: core: Add WWAN ADB and MIPC port type
Add new WWAN ports that connect to the device's ADB protocol interface
and MTK MIPC diagnostic interface.
Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 7 Nov 2024 11:46:03 +0000 (12:46 +0100)]
Merge tag 'nf-next-24-11-07' of git://git./linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following series contains Netfilter updates for net-next:
1) Make legacy xtables configs user selectable, from Breno Leitao.
2) Fix a few sparse warnings related to percpu, from Uros Bizjak.
3) Use strscpy_pad, from Justin Stitt.
4) Use nft_trans_elem_alloc() in catchall flush, from Florian Westphal.
5) A series of 7 patches to fix false positive with CONFIG_RCU_LIST=y.
Florian also sees possible issue with 10 while module load/removal
when requesting an expression that is available via module. As for
patch 11, object is being updated so reference on the module already
exists so I don't see any real issue.
Florian says:
"Unfortunately there are many more errors, and not all are false positives.
First patches pass lockdep_commit_lock_is_held() to the rcu list traversal
macro so that those splats are avoided.
The last two patches are real code change as opposed to
'pass the transaction mutex to relax rcu check':
Those two lists are not protected by transaction mutex so could be altered
in parallel.
This targets nf-next because these are long-standing issues."
netfilter pull request 24-11-07
* tag 'nf-next-24-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_tables: must hold rcu read lock while iterating object type list
netfilter: nf_tables: must hold rcu read lock while iterating expression type list
netfilter: nf_tables: avoid false-positive lockdep splats with basechain hook
netfilter: nf_tables: avoid false-positive lockdep splats in set walker
netfilter: nf_tables: avoid false-positive lockdep splats with flowtables
netfilter: nf_tables: avoid false-positive lockdep splats with sets
netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion
netfilter: nf_tables: prefer nft_trans_elem_alloc helper
netfilter: nf_tables: replace deprecated strncpy with strscpy_pad
netfilter: nf_tables: Fix percpu address space issues in nf_tables_api.c
netfilter: Make legacy configs user selectable
====================
Link: https://patch.msgid.link/20241106234625.168468-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 7 Nov 2024 11:40:20 +0000 (12:40 +0100)]
Merge branch 'virtio_net-make-rss-interact-properly-with-queue-number'
Philo Lu says:
====================
virtio_net: Make RSS interact properly with queue number
With this patch set, RSS updates with queue_pairs changing:
- When virtnet_probe, init default rss and commit
- When queue_pairs changes _without_ user rss configuration, update rss
with the new queue number
- When queue_pairs changes _with_ user rss configuration, keep rss as user
configured
Patch 1 and 2 fix possible out of bound errors for indir_table and key.
Patch 3 and 4 add RSS update in probe() and set_queues().
====================
Link: https://patch.msgid.link/20241104085706.13872-1-lulie@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Philo Lu [Mon, 4 Nov 2024 08:57:06 +0000 (16:57 +0800)]
virtio_net: Update rss when set queue
RSS configuration should be updated with queue number. In particular, it
should be updated when (1) rss enabled and (2) default rss configuration
is used without user modification.
During rss command processing, device updates queue_pairs using
rss.max_tx_vq. That is, the device updates queue_pairs together with
rss, so we can skip the sperate queue_pairs update
(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET below) and return directly.
Also remove the `vi->has_rss ?` check when setting vi->rss.max_tx_vq,
because this is not used in the other hash_report case.
Fixes:
c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Philo Lu [Mon, 4 Nov 2024 08:57:05 +0000 (16:57 +0800)]
virtio_net: Sync rss config to device when virtnet_probe
During virtnet_probe, default rss configuration is initialized, but was
not committed to the device. This patch fix this by sending rss command
after device ready in virtnet_probe. Otherwise, the actual rss
configuration used by device can be different with that read by user
from driver, which may confuse the user.
If the command committing fails, driver rss will be disabled.
Fixes:
c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Joe Damato <jdamato@fastly.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Philo Lu [Mon, 4 Nov 2024 08:57:04 +0000 (16:57 +0800)]
virtio_net: Add hash_key_length check
Add hash_key_length check in virtnet_probe() to avoid possible out of
bound errors when setting/reading the hash key.
Fixes:
c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Joe Damato <jdamato@fastly.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Philo Lu [Mon, 4 Nov 2024 08:57:03 +0000 (16:57 +0800)]
virtio_net: Support dynamic rss indirection table size
When reading/writing virtio_net_ctrl_rss, we get the indirection table
size from vi->rss_indir_table_size, which is initialized in
virtnet_probe(). However, the actual size of indirection_table was set
as VIRTIO_NET_RSS_MAX_TABLE_LEN=128. This collision may cause issues if
the vi->rss_indir_table_size exceeds 128.
This patch instead uses dynamic indirection table, allocated with
vi->rss after vi->rss_indir_table_size initialized. And free it in
virtnet_remove().
In virtnet_commit_rss_command(), sgs for rss is initialized differently
with hash_report. So indirection_table is not used if !vi->has_rss, and
then we don't need to alloc indirection_table for hash_report only uses.
Fixes:
c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Joe Damato <jdamato@fastly.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pablo Neira Ayuso [Tue, 5 Nov 2024 11:07:22 +0000 (12:07 +0100)]
netfilter: nf_tables: wait for rcu grace period on net_device removal
8c873e219970 ("netfilter: core: free hooks with call_rcu") removed
synchronize_net() call when unregistering basechain hook, however,
net_device removal event handler for the NFPROTO_NETDEV was not updated
to wait for RCU grace period.
Note that
835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks
on net_device removal") does not remove basechain rules on device
removal, I was hinted to remove rules on net_device removal later, see
5ebe0b0eec9d ("netfilter: nf_tables: destroy basechain and rules on
netdevice removal").
Although NETDEV_UNREGISTER event is guaranteed to be handled after
synchronize_net() call, this path needs to wait for rcu grace period via
rcu callback to release basechain hooks if netns is alive because an
ongoing netlink dump could be in progress (sockets hold a reference on
the netns).
Note that nf_tables_pre_exit_net() unregisters and releases basechain
hooks but it is possible to see NETDEV_UNREGISTER at a later stage in
the netns exit path, eg. veth peer device in another netns:
cleanup_net()
default_device_exit_batch()
unregister_netdevice_many_notify()
notifier_call_chain()
nf_tables_netdev_event()
__nft_release_basechain()
In this particular case, same rule of thumb applies: if netns is alive,
then wait for rcu grace period because netlink dump in the other netns
could be in progress. Otherwise, if the other netns is going away then
no netlink dump can be in progress and basechain hooks can be released
inmediately.
While at it, turn WARN_ON() into WARN_ON_ONCE() for the basechain
validation, which should not ever happen.
Fixes:
835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks on net_device removal")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Mohsin Bashir [Mon, 4 Nov 2024 03:13:00 +0000 (19:13 -0800)]
eth: fbnic: Add support to write TCE TCAM entries
Add support to redirect host-to-BMC traffic by writing MACDA entries
from the RPC (RX Parser and Classifier) to TCE-TCAM. The TCE TCAM is a
small L2 destination TCAM which is placed at the end of the TX path (TCE).
Unlike other NICs, where BMC diversion is typically handled by firmware,
for fbnic, firmware does not touch anything related to the host; hence,
the host uses TCE TCAM to divert BMC traffic.
Currently, we lack metadata to track where addresses have been written
in the TCAM, except for the last entry written. To address this issue,
we start at the opposite end of the table in each pass, so that adding
or deleting entries does not affect the availability of all entries,
assuming there is no significant reordering of entries.
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Link: https://patch.msgid.link/20241104031300.1330657-1-mohsin.bashr@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Juraj Šarinay [Sun, 3 Nov 2024 12:45:25 +0000 (13:45 +0100)]
net: nfc: Propagate ISO14443 type A target ATS to userspace via netlink
Add a 20-byte field ats to struct nfc_target and expose it as
NFC_ATTR_TARGET_ATS via the netlink interface. The payload contains
'historical bytes' that help to distinguish cards from one another.
The information is commonly used to assemble an emulated ATR similar
to that reported by smart cards with contacts.
Add a 20-byte field target_ats to struct nci_dev to hold the payload
obtained in nci_rf_intf_activated_ntf_packet() and copy it to over to
nfc_target.ats in nci_activate_target(). The approach is similar
to the handling of 'general bytes' within ATR_RES.
Replace the hard-coded size of rats_res within struct
activation_params_nfca_poll_iso_dep by the equal constant NFC_ATS_MAXSIZE
now defined in nfc.h
Within NCI, the information corresponds to the 'RATS Response' activation
parameter that omits the initial length byte TL. This loses no
information and is consistent with our handling of SENSB_RES that
also drops the first (constant) byte.
Tested with nxp_nci_i2c on a few type A targets including an
ICAO 9303 compliant passport.
I refrain from the corresponding change to digital_in_recv_ats()
to have the few drivers based on digital.h fill nfc_target.ats,
as I have no way to test it. That class of drivers appear not to set
NFC_ATTR_TARGET_SENSB_RES either. Consider a separate patch to propagate
(all) the parameters.
Signed-off-by: Juraj Šarinay <juraj@sarinay.com>
Link: https://patch.msgid.link/20241103124525.8392-1-juraj@sarinay.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Nícolas F. R. A. Prado [Fri, 1 Nov 2024 21:17:29 +0000 (17:17 -0400)]
net: stmmac: Fix unbalanced IRQ wake disable warning on single irq case
Commit
a23aa0404218 ("net: stmmac: ethtool: Fixed calltrace caused by
unbalanced disable_irq_wake calls") introduced checks to prevent
unbalanced enable and disable IRQ wake calls. However it only
initialized the auxiliary variable on one of the paths,
stmmac_request_irq_multi_msi(), missing the other,
stmmac_request_irq_single().
Add the same initialization on stmmac_request_irq_single() to prevent
"Unbalanced IRQ <x> wake disable" warnings from being printed the first
time disable_irq_wake() is called on platforms that run on that code
path.
Fixes:
a23aa0404218 ("net: stmmac: ethtool: Fixed calltrace caused by unbalanced disable_irq_wake calls")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241101-stmmac-unbalanced-wake-single-fix-v1-1-5952524c97f0@collabora.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stefan Wahren [Tue, 5 Nov 2024 16:31:01 +0000 (17:31 +0100)]
net: vertexcom: mse102x: Fix possible double free of TX skb
The scope of the TX skb is wider than just mse102x_tx_frame_spi(),
so in case the TX skb room needs to be expanded, we should free the
the temporary skb instead of the original skb. Otherwise the original
TX skb pointer would be freed again in mse102x_tx_work(), which leads
to crashes:
Internal error: Oops:
0000000096000004 [#2] PREEMPT SMP
CPU: 0 PID: 712 Comm: kworker/0:1 Tainted: G D 6.6.23
Hardware name: chargebyte Charge SOM DC-ONE (DT)
Workqueue: events mse102x_tx_work [mse102x]
pstate:
20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : skb_release_data+0xb8/0x1d8
lr : skb_release_data+0x1ac/0x1d8
sp :
ffff8000819a3cc0
x29:
ffff8000819a3cc0 x28:
ffff0000046daa60 x27:
ffff0000057f2dc0
x26:
ffff000005386c00 x25:
0000000000000002 x24:
00000000ffffffff
x23:
0000000000000000 x22:
0000000000000001 x21:
ffff0000057f2e50
x20:
0000000000000006 x19:
0000000000000000 x18:
ffff00003fdacfcc
x17:
e69ad452d0c49def x16:
84a005feff870102 x15:
0000000000000000
x14:
000000000000024a x13:
0000000000000002 x12:
0000000000000000
x11:
0000000000000400 x10:
0000000000000930 x9 :
ffff00003fd913e8
x8 :
fffffc00001bc008
x7 :
0000000000000000 x6 :
0000000000000008
x5 :
ffff00003fd91340 x4 :
0000000000000000 x3 :
0000000000000009
x2 :
00000000fffffffe x1 :
0000000000000000 x0 :
0000000000000000
Call trace:
skb_release_data+0xb8/0x1d8
kfree_skb_reason+0x48/0xb0
mse102x_tx_work+0x164/0x35c [mse102x]
process_one_work+0x138/0x260
worker_thread+0x32c/0x438
kthread+0x118/0x11c
ret_from_fork+0x10/0x20
Code:
aa1303e0 97fffab6 72001c1f 54000141 (
f9400660)
Cc: stable@vger.kernel.org
Fixes:
2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support")
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://patch.msgid.link/20241105163101.33216-1-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 7 Nov 2024 01:54:50 +0000 (17:54 -0800)]
Merge branch 'net-ucc_geth-devm-cleanups'
Rosen Penev says:
====================
net: ucc_geth: devm cleanups
Also added a small fix for NVMEM mac addresses.
This was tested as working on a Watchguard T10 device.
====================
Link: https://patch.msgid.link/20241104210127.307420-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Mon, 4 Nov 2024 21:01:27 +0000 (13:01 -0800)]
net: ucc_geth: fix usage with NVMEM MAC address
When nvmem is not ready, of_get_ethdev_address returns -EPROBE_DEFER. In
such a case, return -EPROBE_DEFER to avoid not having a proper MAC
address.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Link: https://patch.msgid.link/20241104210127.307420-5-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Mon, 4 Nov 2024 21:01:26 +0000 (13:01 -0800)]
net: ucc_geth: use devm for register_netdev
Avoids having to unregister manually.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Link: https://patch.msgid.link/20241104210127.307420-4-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Mon, 4 Nov 2024 21:01:25 +0000 (13:01 -0800)]
net: ucc_geth: use devm for alloc_etherdev
Avoids manual frees. Removes one goto.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Link: https://patch.msgid.link/20241104210127.307420-3-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>