Ido Schimmel [Thu, 26 Jun 2025 07:31:10 +0000 (10:31 +0300)]
neighbor: Add NTF_EXT_VALIDATED flag for externally validated entries
tl;dr
=====
Add a new neighbor flag ("extern_valid") that can be used to indicate to
the kernel that a neighbor entry was learned and determined to be valid
externally. The kernel will not try to remove or invalidate such an
entry, leaving these decisions to the user space control plane. This is
needed for EVPN multi-homing where a neighbor entry for a multi-homed
host needs to be synced across all the VTEPs among which the host is
multi-homed.
Background
==========
In a typical EVPN multi-homing setup each host is multi-homed using a
set of links called ES (Ethernet Segment, i.e., LAG) to multiple leaf
switches (VTEPs). VTEPs that are connected to the same ES are called ES
peers.
When a neighbor entry is learned on a VTEP, it is distributed to both ES
peers and remote VTEPs using EVPN MAC/IP advertisement routes. ES peers
use the neighbor entry when routing traffic towards the multi-homed host
and remote VTEPs use it for ARP/NS suppression.
Motivation
==========
If the ES link between a host and the VTEP on which the neighbor entry
was locally learned goes down, the EVPN MAC/IP advertisement route will
be withdrawn and the neighbor entries will be removed from both ES peers
and remote VTEPs. Routing towards the multi-homed host and ARP/NS
suppression can fail until another ES peer locally learns the neighbor
entry and distributes it via an EVPN MAC/IP advertisement route.
"draft-rbickhart-evpn-ip-mac-proxy-adv-03" [1] suggests avoiding these
intermittent failures by having the ES peers install the neighbor
entries as before, but also injecting EVPN MAC/IP advertisement routes
with a proxy indication. When the previously mentioned ES link goes down
and the original EVPN MAC/IP advertisement route is withdrawn, the ES
peers will not withdraw their neighbor entries, but instead start aging
timers for the proxy indication.
If an ES peer locally learns the neighbor entry (i.e., it becomes
"reachable"), it will restart its aging timer for the entry and emit an
EVPN MAC/IP advertisement route without a proxy indication. An ES peer
will stop its aging timer for the proxy indication if it observes the
removal of the proxy indication from at least one of the ES peers
advertising the entry.
In the event that the aging timer for the proxy indication expired, an
ES peer will withdraw its EVPN MAC/IP advertisement route. If the timer
expired on all ES peers and they all withdrew their proxy
advertisements, the neighbor entry will be completely removed from the
EVPN fabric.
Implementation
==============
In the above scheme, when the control plane (e.g., FRR) advertises a
neighbor entry with a proxy indication, it expects the corresponding
entry in the data plane (i.e., the kernel) to remain valid and not be
removed due to garbage collection or loss of carrier. The control plane
also expects the kernel to notify it if the entry was learned locally
(i.e., became "reachable") so that it will remove the proxy indication
from the EVPN MAC/IP advertisement route. That is why these entries
cannot be programmed with dummy states such as "permanent" or "noarp".
Instead, add a new neighbor flag ("extern_valid") which indicates that
the entry was learned and determined to be valid externally and should
not be removed or invalidated by the kernel. The kernel can probe the
entry and notify user space when it becomes "reachable" (it is initially
installed as "stale"). However, if the kernel does not receive a
confirmation, have it return the entry to the "stale" state instead of
the "failed" state.
In other words, an entry marked with the "extern_valid" flag behaves
like any other dynamically learned entry other than the fact that the
kernel cannot remove or invalidate it.
One can argue that the "extern_valid" flag should not prevent garbage
collection and that instead a neighbor entry should be programmed with
both the "extern_valid" and "extern_learn" flags. There are two reasons
for not doing that:
1. Unclear why a control plane would like to program an entry that the
kernel cannot invalidate but can completely remove.
2. The "extern_learn" flag is used by FRR for neighbor entries learned
on remote VTEPs (for ARP/NS suppression) whereas here we are
concerned with local entries. This distinction is currently irrelevant
for the kernel, but might be relevant in the future.
Given that the flag only makes sense when the neighbor has a valid
state, reject attempts to add a neighbor with an invalid state and with
this flag set. For example:
# ip neigh add 192.0.2.1 nud none dev br0.10 extern_valid
Error: Cannot create externally validated neighbor with an invalid state.
# ip neigh add 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid
# ip neigh replace 192.0.2.1 nud failed dev br0.10 extern_valid
Error: Cannot mark neighbor as externally validated with an invalid state.
The above means that a neighbor cannot be created with the
"extern_valid" flag and flags such as "use" or "managed" as they result
in a neighbor being created with an invalid state ("none") and
immediately getting probed:
# ip neigh add 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
Error: Cannot create externally validated neighbor with an invalid state.
However, these flags can be used together with "extern_valid" after the
neighbor was created with a valid state:
# ip neigh add 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid
# ip neigh replace 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
One consequence of preventing the kernel from invalidating a neighbor
entry is that by default it will only try to determine reachability
using unicast probes. This can be changed using the "mcast_resolicit"
sysctl:
# sysctl net.ipv4.neigh.br0/10.mcast_resolicit
0
# tcpdump -nn -e -i br0.10 -Q out arp &
# ip neigh replace 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
# sysctl -wq net.ipv4.neigh.br0/10.mcast_resolicit=3
# ip neigh replace 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
iproute2 patches can be found here [2].
[1] https://datatracker.ietf.org/doc/html/draft-rbickhart-evpn-ip-mac-proxy-adv-03
[2] https://github.com/idosch/iproute2/tree/submit/extern_valid_v1
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20250626073111.244534-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Fri, 27 Jun 2025 11:58:21 +0000 (11:58 +0000)]
ipv6: guard ip6_mr_output() with rcu
syzbot found at least one path leads to an ip_mr_output()
without RCU being held.
Add guard(rcu)() to fix this in a concise way.
WARNING: net/ipv6/ip6mr.c:2376 at ip6_mr_output+0xe0b/0x1040 net/ipv6/ip6mr.c:2376, CPU#1: kworker/1:2/121
Call Trace:
<TASK>
ip6tunnel_xmit include/net/ip6_tunnel.h:162 [inline]
udp_tunnel6_xmit_skb+0x640/0xad0 net/ipv6/ip6_udp_tunnel.c:112
send6+0x5ac/0x8d0 drivers/net/wireguard/socket.c:152
wg_socket_send_skb_to_peer+0x111/0x1d0 drivers/net/wireguard/socket.c:178
wg_packet_create_data_done drivers/net/wireguard/send.c:251 [inline]
wg_packet_tx_worker+0x1c8/0x7c0 drivers/net/wireguard/send.c:276
process_one_work kernel/workqueue.c:3239 [inline]
process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3322
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3403
kthread+0x70e/0x8a0 kernel/kthread.c:464
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
Fixes:
96e8f5a9fe2d ("net: ipv6: Add ip6_mr_output()")
Reported-by: syzbot+0141c834e47059395621@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/
685e86b3.
a00a0220.129264.0003.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Benjamin Poirier <bpoirier@nvidia.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20250627115822.3741390-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 30 Jun 2025 15:41:32 +0000 (08:41 -0700)]
Merge branch 'net-ethtool-consistently-take-rss_lock-for-all-rxfh-ops'
Jakub Kicinski says:
====================
net: ethtool: consistently take rss_lock for all rxfh ops
I'd like to bring RXFH and RXFHINDIR ioctls under a single set of
Netlink ops. It appears that while core takes the ethtool->rss_lock
around some of the RXFHINDIR ops, drivers (sfc) take it internally
for the RXFH.
Consistently take the lock around all ops and accesses to the XArray
within the core. This should hopefully make the rss_lock a lot less
confusing.
====================
Link: https://patch.msgid.link/20250626202848.104457-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 26 Jun 2025 20:28:48 +0000 (13:28 -0700)]
net: ethtool: move get_rxfh callback under the rss_lock
We already call get_rxfh under the rss_lock when we read back
context state after changes. Let's be consistent and always
hold the lock. The existing callers are all under rtnl_lock
so this should make no difference in practice, but it makes
the locking rules far less confusing IMHO. Any RSS callback
and any access to the RSS XArray should hold the lock.
Link: https://patch.msgid.link/20250626202848.104457-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 26 Jun 2025 20:28:47 +0000 (13:28 -0700)]
net: ethtool: move rxfh_fields callbacks under the rss_lock
Netlink code will want to perform the RSS_SET operation atomically
under the rss_lock. sfc wants to hold the rss_lock in rxfh_fields_get,
which makes that difficult. Lets move the locking up to the core
so that for all driver-facing callbacks rss_lock is taken consistently
by the core.
Link: https://patch.msgid.link/20250626202848.104457-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 26 Jun 2025 20:28:46 +0000 (13:28 -0700)]
net: ethtool: take rss_lock for all rxfh changes
Always take the rss_lock in ethtool_set_rxfh(). We will want to
make a similar change in ethtool_set_rxfh_fields() and some
drivers lock that callback regardless of rss context ID being set.
Having some callbacks locked unconditionally and some only if
context ID is set would be very confusing.
ethtool handling is under rtnl_lock, so rss_lock is very unlikely
to ever be congested.
Link: https://patch.msgid.link/20250626202848.104457-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 26 Jun 2025 23:39:26 +0000 (16:39 -0700)]
net: ethtool: avoid OOB accesses in PAUSE_SET
We now reuse .parse_request() from GET on SET, so we need to make sure
that the policies for both cover the attributes used for .parse_request().
genetlink will only allocate space in info->attrs for ARRAY_SIZE(policy).
Reported-by: syzbot+430f9f76633641a62217@syzkaller.appspotmail.com
Fixes:
963781bdfe20 ("net: ethtool: call .parse_request for SET handlers")
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250626233926.199801-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fushuai Wang [Thu, 26 Jun 2025 05:30:03 +0000 (13:30 +0800)]
net/mlx5e: Fix error handling in RQ memory model registration
Currently when xdp_rxq_info_reg_mem_model() fails in the XSK path, the
error handling incorrectly jumps to err_destroy_page_pool. While this
may not cause errors, we should make it jump to the correct location.
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 25 Jun 2025 15:23:05 +0000 (10:23 -0500)]
octeontx2-af: Fix error code in rvu_mbox_init()
The error code was intended to be -EINVAL here, but it was accidentally
changed to returning success. Set the error code.
Fixes:
e53ee4acb220 ("octeontx2-af: CN20k basic mbox operations and structures")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 27 Jun 2025 11:46:41 +0000 (11:46 +0000)]
net: ipv4: guard ip_mr_output() with rcu
syzbot found at least one path leads to an ip_mr_output()
without RCU being held.
Add guard(rcu)() to fix this in a concise way.
WARNING: CPU: 0 PID: 0 at net/ipv4/ipmr.c:2302 ip_mr_output+0xbb1/0xe70 net/ipv4/ipmr.c:2302
Call Trace:
<IRQ>
igmp_send_report+0x89e/0xdb0 net/ipv4/igmp.c:799
igmp_timer_expire+0x204/0x510 net/ipv4/igmp.c:-1
call_timer_fn+0x17e/0x5f0 kernel/time/timer.c:1747
expire_timers kernel/time/timer.c:1798 [inline]
__run_timers kernel/time/timer.c:2372 [inline]
__run_timer_base+0x61a/0x860 kernel/time/timer.c:2384
run_timer_base kernel/time/timer.c:2393 [inline]
run_timer_softirq+0xb7/0x180 kernel/time/timer.c:2403
handle_softirqs+0x286/0x870 kernel/softirq.c:579
__do_softirq kernel/softirq.c:613 [inline]
invoke_softirq kernel/softirq.c:453 [inline]
__irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:680
irq_exit_rcu+0x9/0x30 kernel/softirq.c:696
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1050 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1050
Fixes:
35bec72a24ac ("net: ipv4: Add ip_mr_output()")
Reported-by: syzbot+f02fb9e43bd85c6c66ae@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/
685e841a.
a00a0220.129264.0002.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Petr Machata <petrm@nvidia.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Benjamin Poirier <bpoirier@nvidia.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 27 Jun 2025 23:56:02 +0000 (16:56 -0700)]
Merge branch 'octeontx2-pf-extend-link-modes-support'
Hariprasad Kelam says:
====================
Octeontx2-pf: extend link modes support
This series of patches adds multi advertise mode support along with
other improvements in link mode management code flow.
Patch1: Currently all SGMII modes 10/100/1000baseT are mapped with
single firmware mode. This patch updates these link modes
with corresponding firmware modes.
Patch2: Due to limitation in current kernel <-> firmware communication,
link modes are divided into multiple groups, and identified
with their group index.
Patch3: Adds support for multi advertise mode.
====================
Link: https://patch.msgid.link/20250625092107.9746-1-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hariprasad Kelam [Wed, 25 Jun 2025 09:21:07 +0000 (14:51 +0530)]
Octeontx2-pf: ethtool: support multi advertise mode
Current implementation considers only first advertise
mode and passes the same to firmware to process.
This patch extends code such that user can advertise
multiple modes on the given interface.
Below are high level changes:
1. Remove unnecessary speed/duplex/autoneg validation as its
already verified as part of "set_link_ksettings"
2. Since scratch csr framework designed to support single mode at a time,
use "shared firmware data" for multi mode support.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20250625092107.9746-4-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hariprasad Kelam [Wed, 25 Jun 2025 09:21:06 +0000 (14:51 +0530)]
Octeontx2-af: Introduce mode group index
Kernel and firmware communicates via scratch register which is
64 bit in size.
[MODE_ID PORT AUTONEG DUPLEX SPEED CMD_ID OWNERSHIP ]
63-22 21-14 13 12 11-8 7-2 1-0
The existing MODE_ID bitmap can only support up to 42 modes.
To resolve the issue, the unused port field is modified as below
uint64_t reserved2:6;
uint64_t mode_group_idx:2;
'mode_group_idx' categorize the mode ID range to accommodate more modes.
To specify mode ID range of 0 - 41, this field will be 0.
To specify mode ID range of 42 - 83, this field will be 1.
mode ID will be still mentioned as 1 << (0 - 41). But the mode_group_idx
decides the actual mode range
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20250625092107.9746-3-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hariprasad Kelam [Wed, 25 Jun 2025 09:21:05 +0000 (14:51 +0530)]
Octeontx-pf: Update SGMII mode mapping
Current implementation maps ethtool link modes 10baseT/100baseT/1000baseT
to single firmware mode SGMII. This create a problem for end users who want
to advertise only one speed among them.
This patch addresses the issue by mapping each ethtool link mode
to a corresponding firmware mode also updates new modes supported
by firmware.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20250625092107.9746-2-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 27 Jun 2025 23:38:05 +0000 (16:38 -0700)]
Merge branch 'dpll-add-reference-sync-feature'
Arkadiusz Kubalewski says:
====================
dpll: add Reference SYNC feature
The device may support the Reference SYNC feature, which allows the
combination of two inputs into a input pair. In this configuration,
clock signals from both inputs are used to synchronize the DPLL device.
The higher frequency signal is utilized for the loop bandwidth of the DPLL,
while the lower frequency signal is used to syntonize the output signal of
the DPLL device. This feature enables the provision of a high-quality loop
bandwidth signal from an external source.
A capable input provides a list of inputs that can be bound with to create
Reference SYNC. To control this feature, the user must request a
desired state for a target pin: use ``DPLL_PIN_STATE_CONNECTED`` to
enable or ``DPLL_PIN_STATE_DISCONNECTED`` to disable the feature. An input
pin can be bound to only one other pin at any given time.
Verify pins bind state/capabilities:
$ ./tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/dpll.yaml \
--do pin-get \
--json '{"id":0}'
{'board-label': 'CVL-SDP22',
'id': 0,
[...]
'reference-sync': [{'id': 1, 'state': 'disconnected'}],
[...]}
Bind the pins by setting connected state between them:
$ ./tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/dpll.yaml \
--do pin-set \
--json '{"id":0, "reference-sync":{"id":1, "state":"connected"}}'
Verify pins bind state:
$ ./tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/dpll.yaml \
--do pin-get \
--json '{"id":0}'
{'board-label': 'CVL-SDP22',
'id': 0,
[...]
'reference-sync': [{'id': 1, 'state': 'connected'}],
[...]}
Unbind the pins by setting disconnected state between them:
$ ./tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/dpll.yaml \
--do pin-set \
--json '{"id":0, "reference-sync":{"id":1, "state":"disconnected"}}'
====================
Link: https://patch.msgid.link/20250626135219.1769350-1-arkadiusz.kubalewski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arkadiusz Kubalewski [Thu, 26 Jun 2025 13:52:19 +0000 (15:52 +0200)]
ice: add ref-sync dpll pins
Implement reference sync input pin get/set callbacks, allow user space
control over dpll pin pairs capable of reference sync support.
Reviewed-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Link: https://patch.msgid.link/20250626135219.1769350-4-arkadiusz.kubalewski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arkadiusz Kubalewski [Thu, 26 Jun 2025 13:52:18 +0000 (15:52 +0200)]
dpll: add reference sync get/set
Define function for reference sync pin registration and callback ops to
set/get current feature state.
Implement netlink handler to fill netlink messages with reference sync
pin configuration of capable pins (pin-get).
Implement netlink handler to call proper ops and configure reference
sync pin state (pin-set).
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Milena Olech <milena.olech@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Link: https://patch.msgid.link/20250626135219.1769350-3-arkadiusz.kubalewski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arkadiusz Kubalewski [Thu, 26 Jun 2025 13:52:17 +0000 (15:52 +0200)]
dpll: add reference-sync netlink attribute
Add new netlink attribute to allow user space configuration of reference
sync pin pairs, where both pins are used to provide one clock signal
consisting of both: base frequency and sync signal.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Milena Olech <milena.olech@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Link: https://patch.msgid.link/20250626135219.1769350-2-arkadiusz.kubalewski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 27 Jun 2025 23:27:08 +0000 (16:27 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
ice: remaining TSPLL cleanups
These are the remaining patches from the "ice: Separate TSPLL from PTP
and cleanup" series [1] with control flow macros removed. What remains
are cleanups and some minor improvements.
[1] https://lore.kernel.org/netdev/
20250618174231.
3100231-1-anthony.l.nguyen@intel.com/
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: default to TIME_REF instead of TXCO on E825-C
ice: move TSPLL init calls to ice_ptp.c
ice: fall back to TCXO on TSPLL lock fail
ice: wait before enabling TSPLL
ice: add multiple TSPLL helpers
ice: use bitfields instead of unions for CGU regs
ice: read TSPLL registers again before reporting status
ice: clear time_sync_en field for E825-C during reprogramming
====================
Link: https://patch.msgid.link/20250626162921.1173068-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 26 Jun 2025 16:54:41 +0000 (09:54 -0700)]
eth: bnxt: take page size into account for page pool recycling rings
The Rx rings are filled with Rx buffers. Which are supposed to fit
packet headers (or MTU if HW-GRO is disabled). The aggregation buffers
are filled with "device pages". Adjust the sizes of the page pool
recycling ring appropriately, based on ratio of the size of the
buffer on given ring vs system page size. Otherwise on a system
with 64kB pages we end up with >700MB of memory sitting in every
single page pool cache.
Correct the size calculation for the head_pool. Since the buffers
there are always small I'm pretty sure I meant to cap the size
at 1k, rather than make it the lowest possible size. With 64k pages
1k cache with a 1k ring is 64x larger than we need.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250626165441.4125047-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 27 Jun 2025 22:35:08 +0000 (15:35 -0700)]
Merge branch 'tcp-fix-dsack-bug-with-non-contiguous-ranges'
Eric Dumazet says:
====================
tcp: fix DSACK bug with non contiguous ranges
This series combines a fix from xin.guo and a new packetdrill test.
====================
Link: https://patch.msgid.link/20250626123420.1933835-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Thu, 26 Jun 2025 12:34:20 +0000 (12:34 +0000)]
selftests/net: packetdrill: add tcp_dsack_mult.pkt
Test DSACK behavior with non contiguous ranges.
Without prior fix (tcp: fix tcp_ofo_queue() to avoid including
too much DUP SACK range) this would fail with:
tcp_dsack_mult.pkt:37: error handling packet: bad value outbound TCP option 5
script packet: 0.100682 . 1:1(0) ack 6001 <nop,nop,sack 1001:3001 7001:8001>
actual packet: 0.100679 . 1:1(0) ack 6001 win 1097 <nop,nop,sack 1001:6001 7001:8001>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: xin.guo <guoxin0309@gmail.com>
Link: https://patch.msgid.link/20250626123420.1933835-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
xin.guo [Thu, 26 Jun 2025 12:34:19 +0000 (12:34 +0000)]
tcp: fix tcp_ofo_queue() to avoid including too much DUP SACK range
If the new coming segment covers more than one skbs in the ofo queue,
and which seq is equal to rcv_nxt, then the sequence range
that is duplicated will be sent as DUP SACK, the detail as below,
in step6, the {501,2001} range is clearly including too much
DUP SACK range, in violation of RFC 2883 rules.
1. client > server: Flags [.], seq 501:1001, ack
1325288529, win 20000, length 500
2. server > client: Flags [.], ack 1, [nop,nop,sack 1 {501:1001}], length 0
3. client > server: Flags [.], seq 1501:2001, ack
1325288529, win 20000, length 500
4. server > client: Flags [.], ack 1, [nop,nop,sack 2 {1501:2001} {501:1001}], length 0
5. client > server: Flags [.], seq 1:2001, ack
1325288529, win 20000, length 2000
6. server > client: Flags [.], ack 2001, [nop,nop,sack 1 {501:2001}], length 0
After this fix, the final ACK is as below:
6. server > client: Flags [.], ack 2001, options [nop,nop,sack 1 {501:1001}], length 0
[edumazet] added a new packetdrill test in the following patch.
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: xin.guo <guoxin0309@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250626123420.1933835-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 27 Jun 2025 22:34:20 +0000 (15:34 -0700)]
Merge branch 'tcp-remove-rtx_syn_ack-and-inet_rtx_syn_ack'
Eric Dumazet says:
====================
tcp: remove rtx_syn_ack and inet_rtx_syn_ack()
After DCCP removal, we can cleanup SYNACK retransmits a bit.
====================
Link: https://patch.msgid.link/20250626153017.2156274-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Thu, 26 Jun 2025 15:30:17 +0000 (15:30 +0000)]
tcp: remove inet_rtx_syn_ack()
inet_rtx_syn_ack() is a simple wrapper around tcp_rtx_synack(),
if we move req->num_retrans update.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250626153017.2156274-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Thu, 26 Jun 2025 15:30:16 +0000 (15:30 +0000)]
tcp: remove rtx_syn_ack field
Now inet_rtx_syn_ack() is only used by TCP, it can directly
call tcp_rtx_synack() instead of using an indirect call
to req->rsk_ops->rtx_syn_ack().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250626153017.2156274-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 27 Jun 2025 22:14:58 +0000 (15:14 -0700)]
Merge branch 'net-dsa-ks8995-fix-up-bindings'
Linus Walleij says:
====================
net: dsa: ks8995: Fix up bindings
After looking at the datasheets for KS8995 I realized this is
a DSA switch and need to have DT bindings as such and be implemented
as such.
This series just fixes up the bindings and the offending device tree.
The existing kernel driver which is in drivers/net/phy/spi_ks8995.c
does not implement DSA. It can be forgiven for this because it was
merged in 2011 and the DSA framework was not widely established
back then. It continues to probe fine but needs to be rewritten
to use the special DSA tag and moved to drivers/net/dsa as time
permits. (I hope I can do this.)
It's fine for the networking tree to merge both patches, I maintain
ixp4xx as well. But I can also carry the second patch through the
SoC tree if so desired.
v1: https://lore.kernel.org/
20250624-ks8995-dsa-bindings-v1-0-
71a8b4f63315@linaro.org
====================
Link: https://patch.msgid.link/20250625-ks8995-dsa-bindings-v2-0-ce71dce9be0b@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Walleij [Wed, 25 Jun 2025 06:51:25 +0000 (08:51 +0200)]
ARM: dts: Fix up wrv54g device tree
Fix up the KS8995 switch and PHYs the way that is most likely:
- Phy 1-4 is certainly the PHYs of the KS8995 (mask 0x1e in
the outoftree code masks PHYs 1,2,3,4).
- Phy 5 is the MII-P5 separate WAN phy of the KS8995 directly
connected to EthC.
- The EthB MII is probably connected as CPU interface to the
KS8995.
Properly integrate the KS8995 switch using the new bindings.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250625-ks8995-dsa-bindings-v2-2-ce71dce9be0b@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Walleij [Wed, 25 Jun 2025 06:51:24 +0000 (08:51 +0200)]
dt-bindings: dsa: Rewrite Micrel KS8995 in schema
After studying the datasheets for some of the KS8995 variants
it becomes pretty obvious that this is a straight-forward
and simple MII DSA switch with one port in (CPU) and four outgoing
ports, and it even supports custom tags by setting a bit in
a special register, and elaborate VLAN handling as all DSA
switches do.
What is a bit odd with KS8995 is that it uses an extra MII-P5
port to access one of the PHYs separately, on the side of the
switch fabric, such as when using a WAN port separately from
a LAN switch in a home router.
Rewrite the terse bindings to YAML, and move to the proper
subdirectory. Include a verbose example to make things clear.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250625-ks8995-dsa-bindings-v2-1-ce71dce9be0b@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paul Kocialkowski [Thu, 26 Jun 2025 08:09:21 +0000 (10:09 +0200)]
dt-bindings: net: sun8i-emac: Add A100 EMAC compatible
The Allwinner A100/A133 has an Ethernet MAC (EMAC) controller that is
compatible with the A64 one. It features the same syscon registers for
control of the top-level integration of the unit.
Signed-off-by: Paul Kocialkowski <paulk@sys-base.io>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250626080923.632789-4-paulk@sys-base.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 27 Jun 2025 22:09:04 +0000 (15:09 -0700)]
Merge branch 'nfc-trf7970a-add-option-to-reduce-antenna-gain'
Paul Geurts says:
====================
NFC: trf7970a: Add option to reduce antenna gain
The TRF7970a device is sensitive to RF disturbances, which can make it
hard to pass some EMC immunity tests. By reducing the RX antenna gain,
the device becomes less sensitive to EMC disturbances, as a trade-off
against antenna performance.
====================
Link: https://patch.msgid.link/20250626141242.3749958-1-paul.geurts@prodrive-technologies.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paul Geurts [Thu, 26 Jun 2025 14:12:42 +0000 (16:12 +0200)]
NFC: trf7970a: Create device-tree parameter for RX gain reduction
The TRF7970a device is sensitive to RF disturbances, which can make it
hard to pass some EMC immunity tests. By reducing the RX antenna gain,
the device becomes less sensitive to EMC disturbances, as a trade-off
against antenna performance.
Add a device tree option to select RX gain reduction to improve EMC
performance.
Selecting a communication standard in the ISO control register resets
the RX antenna gain settings. Therefore set the RX gain reduction
everytime the ISO control register changes, when the option is used.
Signed-off-by: Paul Geurts <paul.geurts@prodrive-technologies.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250626141242.3749958-3-paul.geurts@prodrive-technologies.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paul Geurts [Thu, 26 Jun 2025 14:12:41 +0000 (16:12 +0200)]
dt-bindings: net/nfc: ti,trf7970a: Add ti,rx-gain-reduction-db option
Add option to reduce the RX antenna gain to be able to reduce the
sensitivity.
Signed-off-by: Paul Geurts <paul.geurts@prodrive-technologies.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250626141242.3749958-2-paul.geurts@prodrive-technologies.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Frank Li [Tue, 24 Jun 2025 20:20:27 +0000 (16:20 -0400)]
dt-bindings: net: convert lpc-eth.txt yaml format
Convert lpc-eth.txt yaml format.
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250624202028.2516257-1-Frank.Li@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 27 Jun 2025 20:22:59 +0000 (13:22 -0700)]
Merge branch 'ref_tracker-fix'
Merge a fix from Jeff from a stable commit ID:
* ref_tracker: do xarray and workqueue job initializations earlier
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jeff Layton [Thu, 26 Jun 2025 12:52:14 +0000 (08:52 -0400)]
ref_tracker: do xarray and workqueue job initializations earlier
The kernel test robot reported an oops that occurred when attempting to
deregister a dentry from the xarray during subsys_initcall().
The ref_tracker xarrays and workqueue job are being initialized in
late_initcall() which is too late. Move those to postcore_initcall()
instead.
Fixes:
65b584f53611 ("ref_tracker: automatically register a file in debugfs for a ref_tracker_dir")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/
202506251406.
c28f2adb-lkp@intel.com
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250626-reftrack-dbgfs-v1-1-812102e2a394@kernel.org
Simon Horman [Wed, 25 Jun 2025 12:52:10 +0000 (13:52 +0100)]
tg3: spelling corrections
Correct spelling as flagged by codespell.
Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Tue, 17 Jun 2025 09:16:53 +0000 (11:16 +0200)]
net: mdio: Add MDIO bus controller for Airoha AN7583
Airoha AN7583 SoC have 2 dedicated MDIO bus controller in the SCU
register map. To driver register an MDIO controller based on the DT
reg property and access the register by accessing the parent syscon.
The MDIO bus logic is similar to the MT7530 internal MDIO bus but
deviates of some setting and some HW bug.
On Airoha AN7583 the MDIO clock is set to 25MHz by default and needs to
be correctly setup to 2.5MHz to correctly work (by setting the divisor
to 10x).
There seems to be Hardware bug where AN7583_MII_RWDATA
is not wiped in the context of unconnected PHY and the
previous read value is returned.
Example: (only one PHY on the BUS at 0x1f)
- read at 0x1f report at 0x2 0x7500
- read at 0x0 report 0x7500 on every address
To workaround this, we reset the Mdio BUS at every read
to have consistent values on read operation.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Tue, 17 Jun 2025 09:16:52 +0000 (11:16 +0200)]
dt-bindings: net: Document support for Airoha AN7583 MDIO Controller
Airoha AN7583 SoC have 3 different MDIO Controller. One comes from
the intergated Switch based on MT7530. The other 2 live under the SCU
register and expose 2 dedicated MDIO controller.
Document the schema for the 2 dedicated MDIO controller.
Each MDIO controller can be independently reset with the SoC reset line.
Each MDIO controller have a dedicated clock configured to 2.5MHz by
default to follow MDIO bus IEEE 802.3 standard.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 27 Jun 2025 00:54:10 +0000 (17:54 -0700)]
Merge branch 'ptp-belated-spring-cleaning-of-the-chardev-driver'
Thomas Gleixner says:
====================
ptp: Belated spring cleaning of the chardev driver
When looking into supporting auxiliary clocks in the PTP ioctl, the
inpenetrable ptp_ioctl() letter soup bothered me enough to clean it up.
The code (~400 lines!) is really hard to follow due to a gazillion of
local variables, which are only used in certain case scopes, and a
mixture of gotos, breaks and direct error return paths.
Clean it up by splitting out the IOCTL functionality into seperate
functions, which contain only the required local variables and are trivial
to follow. Complete the cleanup by converting the code to lock guards and
get rid of all gotos.
That reduces the code size by 48 lines and also the binary text size is
80 bytes smaller than the current maze.
The series is split up into one patch per IOCTL command group for easy
review.
v1: https://lore.kernel.org/
20250620130144.
351492917@linutronix.de
====================
Link: https://patch.msgid.link/20250625114404.102196103@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:39 +0000 (13:52 +0200)]
ptp: Simplify ptp_read()
The mixture of gotos and direct return codes is inconsistent and just makes
the code harder to read. Let it consistently return error codes directly and
tidy the code flow up accordingly.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20250625115133.486953538@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:38 +0000 (13:52 +0200)]
ptp: Convert chardev code to lock guards
Convert the various spin_lock_irqsave() protected critical regions to
scoped guards. Use spinlock_irq instead of spinlock_irqsave as all the
functions are invoked in thread context with interrupts enabled.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20250625115133.425029269@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:36 +0000 (13:52 +0200)]
ptp: Split out PTP_MASK_EN_SINGLE ioctl code
Finish the ptp_ioctl() cleanup by splitting out the PTP_MASK_EN_SINGLE
ioctl code and removing the remaining local variables and return
statements.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250625115133.364422719@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:35 +0000 (13:52 +0200)]
ptp: Split out PTP_MASK_CLEAR_ALL ioctl code
Continue the ptp_ioctl() cleanup by splitting out the PTP_MASK_CLEAR_ALL ioctl
code into a helper function.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250625115133.302755618@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:34 +0000 (13:52 +0200)]
ptp: Split out PTP_PIN_SETFUNC ioctl code
Continue the ptp_ioctl() cleanup by splitting out the PTP_PIN_SETFUNC ioctl
code into a helper function. Convert to lock guard while at it and remove
the pointless memset of the pd::rsv because nothing uses it.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250625115133.241503804@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:33 +0000 (13:52 +0200)]
ptp: Split out PTP_PIN_GETFUNC ioctl code
Continue the ptp_ioctl() cleanup by splitting out the PTP_PIN_GETFUNC ioctl
code into a helper function. Convert to lock guard while at it and remove
the pointless memset of the pd::rsv because nothing uses it.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250625115133.177265865@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:31 +0000 (13:52 +0200)]
ptp: Split out PTP_SYS_OFFSET ioctl code
Continue the ptp_ioctl() cleanup by splitting out the PTP_SYS_OFFSET ioctl
code into a helper function.
Convert it to __free() to avoid gotos.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20250625115133.113841216@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:30 +0000 (13:52 +0200)]
ptp: Split out PTP_SYS_OFFSET_EXTENDED ioctl code
Continue the ptp_ioctl() cleanup by splitting out the
PTP_SYS_OFFSET_EXTENDED ioctl code into a helper function.
Convert it to __free() to avoid gotos.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250625115133.050445505@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:29 +0000 (13:52 +0200)]
ptp: Split out PTP_SYS_OFFSET_PRECISE ioctl code
Continue the ptp_ioctl() cleanup by splitting out the PTP_SYS_OFFSET_PRECISE
ioctl code into a helper function.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250625115132.986897454@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:28 +0000 (13:52 +0200)]
ptp: Split out PTP_ENABLE_PPS ioctl code
Continue the ptp_ioctl() cleanup by splitting out the PTP_ENABLE_PPS
ioctl code into a helper function. Convert to a lock guard while at it.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250625115132.923803136@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:27 +0000 (13:52 +0200)]
ptp: Split out PTP_PEROUT_REQUEST ioctl code
Continue the ptp_ioctl() cleanup by splitting out the PTP_PEROUT_REQUEST
ioctl code into a helper function. Convert to a lock guard while at it.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250625115132.860150473@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:25 +0000 (13:52 +0200)]
ptp: Split out PTP_EXTTS_REQUEST ioctl code
Continue the ptp_ioctl() cleanup by splitting out the PTP_EXTTS_REQUEST
ioctl code into a helper function. Convert to a lock guard while at it.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250625115132.797588258@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Thomas Gleixner [Wed, 25 Jun 2025 11:52:24 +0000 (13:52 +0200)]
ptp: Split out PTP_CLOCK_GETCAPS ioctl code
ptp_ioctl() is an inpenetrable letter soup with a gazillion of case (scope)
specific variables defined at the top of the function and pointless breaks
and gotos.
Start cleaning it up by splitting out the PTP_CLOCK_GETCAPS ioctl code into
a helper function. Use a argument pointer with a single sparse compliant
type cast instead of proliferating the type cast all over the place.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250625115132.733409073@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Petr Machata [Wed, 25 Jun 2025 10:41:23 +0000 (12:41 +0200)]
selftests: forwarding: lib: Split setup_wait()
setup_wait() takes an optional argument and then is called from the top
level of the test script. That confuses shellcheck, which thinks that maybe
the intention is to pass $1 of the script to the function, which is never
the case. To avoid having to annotate every single new test with a SC
disable, split the function in two: one that takes a mandatory argument,
and one that takes no argument at all.
Convert the two existing users of that optional argument, both in Spectrum
resource selftest, to use the new form. Clean up vxlan_bridge_1q_mc_ul.sh
to not pass a now-unused argument.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/8e13123236fe3912ae29bc04a1528bdd8551da1f.1750847794.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Wed, 25 Jun 2025 10:21:55 +0000 (18:21 +0800)]
net: Remove unused function first_net_device_rcu()
This is unused since commit
f04565ddf52e ("dev: use name hash for
dev_seq_ops")
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250625102155.483570-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Wed, 25 Jun 2025 02:20:59 +0000 (10:20 +0800)]
ipv4: fib: Remove unnecessary encap_type check
lwtunnel_build_state() has check validity of encap_type,
so no need to do this before call it.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250625022059.3958215-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 27 Jun 2025 00:13:16 +0000 (17:13 -0700)]
Merge tag 'for-netdev' of https://git./linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2025-06-27
We've added 6 non-merge commits during the last 8 day(s) which contain
a total of 6 files changed, 120 insertions(+), 20 deletions(-).
The main changes are:
1) Fix RCU usage in task_cls_state() for BPF programs using helpers like
bpf_get_cgroup_classid_curr() outside of networking, from Charalampos
Mitrodimas.
2) Fix a sockmap race between map_update and a pending workqueue from
an earlier map_delete freeing the old psock where both pointed to the
same psock->sk, from Jiayuan Chen.
3) Fix a data corruption issue when using bpf_msg_pop_data() in kTLS which
failed to recalculate the ciphertext length, also from Jiayuan Chen.
4) Remove xdp_redirect_map{,_err} trace events since they are unused and
also hide XDP trace events under CONFIG_BPF_SYSCALL, from Steven Rostedt.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
xdp: tracing: Hide some xdp events under CONFIG_BPF_SYSCALL
xdp: Remove unused events xdp_redirect_map and xdp_redirect_map_err
net, bpf: Fix RCU usage in task_cls_state() for BPF programs
selftests/bpf: Add test to cover ktls with bpf_msg_pop_data
bpf, ktls: Fix data corruption when using bpf_msg_pop_data() in ktls
bpf, sockmap: Fix psock incorrectly pointing to sk
====================
Link: https://patch.msgid.link/20250626230111.24772-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lorenzo Bianconi [Wed, 25 Jun 2025 14:43:15 +0000 (16:43 +0200)]
net: airoha: Get rid of dma_sync_single_for_device() in airoha_qdma_fill_rx_queue()
Since the page_pool for airoha_eth driver is created with
PP_FLAG_DMA_SYNC_DEV flag, we do not need to sync_for_device each page
received from the pool since it is already done by the page_pool codebase.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250625-airoha-sync-for-device-v1-1-923741deaabf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Erni Sri Satya Vennela [Wed, 25 Jun 2025 11:35:55 +0000 (04:35 -0700)]
net: mana: Fix build errors when CONFIG_NET_SHAPER is disabled
Fix build errors when CONFIG_NET_SHAPER is disabled, including:
drivers/net/ethernet/microsoft/mana/mana_en.c:804:10: error:
'const struct net_device_ops' has no member named 'net_shaper_ops'
804 | .net_shaper_ops = &mana_shaper_ops,
drivers/net/ethernet/microsoft/mana/mana_en.c:804:35: error:
initialization of 'int (*)(struct net_device *, struct neigh_parms *)'
from incompatible pointer type 'const struct net_shaper_ops *'
[-Werror=incompatible-pointer-types]
804 | .net_shaper_ops = &mana_shaper_ops,
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Fixes:
75cabb46935b ("net: mana: Add support for net_shaper_ops")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202506230625.bfUlqb8o-lkp@intel.com/
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1750851355-8067-1-git-send-email-ernis@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Geert Uytterhoeven [Wed, 25 Jun 2025 08:10:48 +0000 (10:10 +0200)]
dt-bindings: net: Rename renesas,r9a09g057-gbeth.yaml
The DT bindings file "renesas,r9a09g057-gbeth.yaml" applies to a whole
family of SoCs, and uses "renesas,rzv2h-gbeth" as a fallback compatible
value. Hence rename it to the more generic "renesas,rzv2h-gbeth.yaml".
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/721f6e0e09777e0842ecaca4578bc50c953d2428.1750838954.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 12 Jun 2025 17:08:24 +0000 (10:08 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.16-rc4).
Conflicts:
Documentation/netlink/specs/mptcp_pm.yaml
9e6dd4c256d0 ("netlink: specs: mptcp: replace underscores with dashes in names")
ec362192aa9e ("netlink: specs: fix up indentation errors")
https://lore.kernel.org/
20250626122205.
389c2cd4@canb.auug.org.au
Adjacent changes:
Documentation/netlink/specs/fou.yaml
791a9ed0a40d ("netlink: specs: fou: replace underscores with dashes in names")
880d43ca9aa4 ("netlink: specs: clean up spaces in brackets")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 26 Jun 2025 16:13:27 +0000 (09:13 -0700)]
Merge tag 'net-6.16-rc4' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth and wireless.
Current release - regressions:
- bridge: fix use-after-free during router port configuration
Current release - new code bugs:
- eth: wangxun: fix the creation of page_pool
Previous releases - regressions:
- netpoll: initialize UDP checksum field before checksumming
- wifi: mac80211: finish link init before RCU publish
- bluetooth: fix use-after-free in vhci_flush()
- eth:
- ionic: fix DMA mapping test
- bnxt: properly flush XDP redirect lists
Previous releases - always broken:
- netlink: specs: enforce strict naming of properties
- unix: don't leave consecutive consumed OOB skbs.
- vsock: fix linux/vm_sockets.h userspace compilation errors
- selftests: fix TCP packet checksum"
* tag 'net-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
net: libwx: fix the creation of page_pool
net: selftests: fix TCP packet checksum
atm: Release atm_dev_mutex after removing procfs in atm_dev_deregister().
netlink: specs: enforce strict naming of properties
netlink: specs: tc: replace underscores with dashes in names
netlink: specs: rt-link: replace underscores with dashes in names
netlink: specs: mptcp: replace underscores with dashes in names
netlink: specs: ovs_flow: replace underscores with dashes in names
netlink: specs: devlink: replace underscores with dashes in names
netlink: specs: dpll: replace underscores with dashes in names
netlink: specs: ethtool: replace underscores with dashes in names
netlink: specs: fou: replace underscores with dashes in names
netlink: specs: nfsd: replace underscores with dashes in names
net: enetc: Correct endianness handling in _enetc_rd_reg64
atm: idt77252: Add missing `dma_map_error()`
bnxt: properly flush XDP redirect lists
vsock/uapi: fix linux/vm_sockets.h userspace compilation errors
wifi: mac80211: finish link init before RCU publish
wifi: iwlwifi: mvm: assume '1' as the default mac_config_cmd version
selftest: af_unix: Add tests for -ECONNRESET.
...
Jacob Keller [Tue, 24 Jun 2025 00:30:04 +0000 (17:30 -0700)]
ice: default to TIME_REF instead of TXCO on E825-C
The driver currently defaults to the internal oscillator as the clock
source for E825-C hardware. While this clock source is labeled TCXO,
indicating a temperature compensated oscillator, this is only true for some
board designs. Many board designs have a less capable oscillator. The
E825-C hardware may also have its clock source set to the TIME_REF pin.
This pin is connected to the DPLL and is often a more stable clock source.
The choice of the internal oscillator is not suitable for all systems,
especially those which want to enable SyncE support.
There is currently no interface available for users to configure the clock
source. Other variants of the E82x board have the clock source configured
in the NVM, but E825-C lacks this capability, so different board designs
cannot select a different default clock via firmware.
In most setups, the TIME_REF is a suitable default clock source.
Additionally, we now fall back to the internal oscillator automatically if
the TIME_REF clock source cannot be locked.
Change the default clock source for E825-C to TIME_REF. Note that the
driver logs a dev_dbg message upon configuring the TSPLL which includes the
clock source and frequency. This can be enabled to confirm which clock
source is in use.
Longterm a proper interface to dynamically introspect and change the clock
source will be designed (perhaps some extension of the DPLL subsystem?)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Karol Kolacinski [Tue, 24 Jun 2025 00:30:03 +0000 (17:30 -0700)]
ice: move TSPLL init calls to ice_ptp.c
Initialize TSPLL after initializing PHC in ice_ptp.c instead of calling
for each product in PHC init in ice_ptp_hw.c.
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Karol Kolacinski [Tue, 24 Jun 2025 00:30:02 +0000 (17:30 -0700)]
ice: fall back to TCXO on TSPLL lock fail
TSPLL can fail when trying to lock to TIME_REF as a clock source, e.g.
when the external clock source is not stable or connected to the board.
To continue operation after failure, try to lock again to internal TCXO
and inform user about this.
Reviewed-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Karol Kolacinski [Tue, 24 Jun 2025 00:30:01 +0000 (17:30 -0700)]
ice: wait before enabling TSPLL
To ensure proper operation, wait for 10 to 20 microseconds before
enabling TSPLL.
Adjust wait time after enabling TSPLL from 1-5 ms to 1-2 ms.
Those values are empirical and tested on multiple HW configurations.
Reviewed-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Karol Kolacinski [Tue, 24 Jun 2025 00:30:00 +0000 (17:30 -0700)]
ice: add multiple TSPLL helpers
Add helpers for checking TSPLL params, disabling sticky bits,
configuring TSPLL and getting default clock frequency to simplify
the code flows.
Reviewed-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Karol Kolacinski [Tue, 24 Jun 2025 00:29:59 +0000 (17:29 -0700)]
ice: use bitfields instead of unions for CGU regs
Switch from unions with bitfield structs to definitions with bitfield
masks. This is necessary, because some registers have different
field definitions or even use a different register for the same fields
based on HW type.
Remove unused register fields.
Reviewed-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Tue, 24 Jun 2025 00:29:58 +0000 (17:29 -0700)]
ice: read TSPLL registers again before reporting status
After programming the TSPLL, re-read the registers before reporting status.
This ensures the debug log message will show what was actually programmed,
rather than relying on a cached value.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Tue, 24 Jun 2025 00:29:57 +0000 (17:29 -0700)]
ice: clear time_sync_en field for E825-C during reprogramming
When programming the Clock Generation Unit for E285-C hardware, we need
to clear the time_sync_en bit of the DWORD 9 before we set the
frequency.
Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Abdelrahman Fekry [Tue, 24 Jun 2025 15:09:23 +0000 (18:09 +0300)]
docs: net: sysctl documentation cleanup
Add missing default values for networking sysctl parameters and
standardize documentation:
- Use "0 (disabled)" / "1 (enabled)" format consistently
- Fix cipso_rbm_struct_valid -> cipso_rbm_strictvalid typo
- Convert fwmark_reflect description to enabled/disabled terminology
- Document possible values for tcp_autocorking
Also addresses formatting inconsistencies in touched parameters.
Signed-off-by: Abdelrahman Fekry <abdelrahmanfekry375@gmail.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://patch.msgid.link/20250624150923.40590-1-abdelrahmanfekry375@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 26 Jun 2025 12:56:16 +0000 (14:56 +0200)]
Merge branch 'eth-fbnic-trivial-code-tweaks'
Jakub Kicinski says:
====================
eth: fbnic: trivial code tweaks
A handful of code cleanups. No functional changes.
====================
Link: https://patch.msgid.link/20250624142834.3275164-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 24 Jun 2025 14:28:34 +0000 (07:28 -0700)]
eth: fbnic: rename fbnic_fw_clear_cmpl to fbnic_mbx_clear_cmpl
fbnic_fw_clear_cmpl() does the inverse of fbnic_mbx_set_cmpl().
It removes the completion from the mailbox table.
It also calls fbnic_mbx_set_cmpl_slot() internally.
It should have fbnic_mbx prefix, not fbnic_fw.
I'm not very clear on what the distinction is between the two
prefixes but the matching "set" and "clear" functions should
use the same prefix.
While at it move the "clear" function closer to the "set".
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250624142834.3275164-6-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 24 Jun 2025 14:28:33 +0000 (07:28 -0700)]
eth: fbnic: sort includes
Make sure includes are in alphabetical order.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250624142834.3275164-5-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 24 Jun 2025 14:28:32 +0000 (07:28 -0700)]
eth: fbnic: realign whitespace
Relign various whitespace things. Some of it is spaces which should
be tabs and some is making sure the values are actually correctly
aligned to "columns" with 8 space tabs. Whitespace changes only.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250624142834.3275164-4-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 24 Jun 2025 14:28:31 +0000 (07:28 -0700)]
eth: fbnic: fix stampinn typo in a comment
Fix a typo:
stampinn -> stamping
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250624142834.3275164-3-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 24 Jun 2025 14:28:30 +0000 (07:28 -0700)]
eth: fbnic: remove duplicate FBNIC_MAX_.XQS macros
Somehow we ended up with two copies of FBNIC_MAX_[TR]XQS in fbnic_txrx.h.
Remove the one mixed with the struct declarations.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250624142834.3275164-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 26 Jun 2025 12:49:12 +0000 (14:49 +0200)]
Merge branch 'follow-up-to-rgmii-mode-clarification-am65-cpsw-fix-checkpatch'
Matthias Schiffer says:
====================
Follow-up to RGMII mode clarification: am65-cpsw fix + checkpatch
Following previous discussion [1] and the documentation update by
Andrew [2]:
Fix up the mode to account for the fixed TX delay on the AM65 CPSW
Ethernet controllers, similar to the way the icssg-prueth does it. For
backwards compatibility, the "impossible" modes that claim to have a
delay on the PCB are still accepted, but trigger a warning message.
As Andrew suggested, I have also added a checkpatch check that requires
a comment for any RGMII mode that is not "rgmii-id".
No Device Trees are updated to avoid the warning for now, to give other
projects syncing the Linux Device Trees some time to fix their drivers
as well. I intend to submit an equivalent change for U-Boot's
am65-cpsw-nuss driver as soon as the changes are accepted for Linux.
[1] https://lore.kernel.org/lkml/
d25b1447-c28b-4998-b238-
92672434dc28@lunn.ch/
[2] https://lore.kernel.org/all/
20250430-v6-15-rc3-net-rgmii-delays-v2-1-
099ae651d5e5@lunn.ch/
commit
c360eb0c3ccb ("dt-bindings: net: ethernet-controller: Add informative text about RGMII delays")
v1: https://lore.kernel.org/all/cover.
1744710099.git.matthias.schiffer@ew.tq-group.com/
====================
Link: https://patch.msgid.link/cover.1750756583.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthias Schiffer [Tue, 24 Jun 2025 10:53:34 +0000 (12:53 +0200)]
checkpatch: check for comment explaining rgmii(|-rxid|-txid) PHY modes
Historically, the RGMII PHY modes specified in Device Trees have been
used inconsistently, often referring to the usage of delays on the PHY
side rather than describing the board; many drivers still implement this
incorrectly.
Require a comment in Devices Trees using these modes (usually mentioning
that the delay is realized on the PCB), so we can avoid adding more
incorrect uses (or will at least notice which drivers still need to be
fixed).
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/bc112b8aa510cf9df9ab33178d122f234d0aebf7.1750756583.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthias Schiffer [Tue, 24 Jun 2025 10:53:33 +0000 (12:53 +0200)]
net: ethernet: ti: am65-cpsw: fixup PHY mode for fixed RGMII TX delay
All am65-cpsw controllers have a fixed TX delay, so the PHY interface
mode must be fixed up to account for this.
Modes that claim to a delay on the PCB can't actually work. Warn people
to update their Device Trees if one of the unsupported modes is specified.
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://patch.msgid.link/9b3fb1fbf719bef30702192155c6413cd5de5dcf.1750756583.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthias Schiffer [Tue, 24 Jun 2025 10:53:32 +0000 (12:53 +0200)]
dt-bindings: net: ti: k3-am654-cpsw-nuss: update phy-mode in example
k3-am65-cpsw-nuss controllers have a fixed internal TX delay, so RXID
mode is not actually possible and will result in a warning from the
driver going forward.
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://patch.msgid.link/f9b5e84fcaf565506ed86cf1838444c2bc47334f.1750756583.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jiawen Wu [Wed, 25 Jun 2025 02:39:24 +0000 (10:39 +0800)]
net: libwx: fix the creation of page_pool
'rx_ring->size' means the count of ring descriptors multiplied by the
size of one descriptor. When increasing the count of ring descriptors,
it may exceed the limit of pool size.
[ 864.209610] page_pool_create_percpu() gave up with errno -7
[ 864.209613] txgbe 0000:11:00.0: Page pool creation failed: -7
Fix to set the pool_size to the count of ring descriptors.
Fixes:
850b971110b2 ("net: libwx: Allocate Rx and Tx resources")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/434C72BFB40E350A+20250625023924.21821-1-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 24 Jun 2025 18:32:58 +0000 (11:32 -0700)]
net: selftests: fix TCP packet checksum
The length in the pseudo header should be the length of the L3 payload
AKA the L4 header+payload. The selftest code builds the packet from
the lower layers up, so all the headers are pushed already when it
constructs L4. We need to subtract the lower layer headers from skb->len.
Fixes:
3e1e58d64c3d ("net: add generic selftest support")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reported-by: Oleksij Rempel <o.rempel@pengutronix.de>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20250624183258.3377740-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Linus Torvalds [Thu, 26 Jun 2025 04:09:02 +0000 (21:09 -0700)]
Merge tag 'bpf-fixes' of git://git./linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
- Fix use-after-free in libbpf when map is resized (Adin Scannell)
- Fix verifier assumptions about 2nd argument of bpf_sysctl_get_name
(Jerome Marchand)
- Fix verifier assumption of nullness of d_inode in dentry (Song Liu)
- Fix global starvation of LRU map (Willem de Bruijn)
- Fix potential NULL dereference in btf_dump__free (Yuan Chen)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: adapt one more case in test_lru_map to the new target_free
libbpf: Fix possible use-after-free for externs
selftests/bpf: Convert test_sysctl to prog_tests
bpf: Specify access type of bpf_sysctl_get_name args
libbpf: Fix null pointer dereference in btf_dump__free on allocation failure
bpf: Adjust free target to avoid global starvation of LRU map
bpf: Mark dentry->d_inode as trusted_or_null
Linus Torvalds [Thu, 26 Jun 2025 03:48:48 +0000 (20:48 -0700)]
Merge tag 'pull-fixes' of git://git./linux/kernel/git/viro/vfs
Pull mount fixes from Al Viro:
"Several mount-related fixes"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
userns and mnt_idmap leak in open_tree_attr(2)
attach_recursive_mnt(): do not lock the covering tree when sliding something under it
replace collect_mounts()/drop_collected_mounts() with a safer variant
Daniel Braunwarth [Tue, 24 Jun 2025 14:17:33 +0000 (16:17 +0200)]
net: phy: realtek: add error handling to rtl8211f_get_wol
We should check if the WOL settings was successfully read from the PHY.
In case this fails we cannot just use the error code and proceed.
Signed-off-by: Daniel Braunwarth <daniel.braunwarth@kuka.com>
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Closes: https://lore.kernel.org/
baaa083b-9a69-460f-ab35-
2a7cb3246ffd@nvidia.com
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250624-realtek_fixes-v1-1-02a0b7c369bc@kuka.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Tue, 24 Jun 2025 14:01:59 +0000 (22:01 +0800)]
net: Reoder rxq_idx check in __net_mp_open_rxq()
array_index_nospec() clamp the rxq_idx within the range of
[0, dev->real_num_rx_queues), move the check before it.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250624140159.3929503-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Tue, 24 Jun 2025 14:00:15 +0000 (22:00 +0800)]
net: Remove unnecessary NULL check for lwtunnel_fill_encap()
lwtunnel_fill_encap() has NULL check and return 0, so no need
to check before call it.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250624140015.3929241-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Tue, 24 Jun 2025 21:45:00 +0000 (14:45 -0700)]
atm: Release atm_dev_mutex after removing procfs in atm_dev_deregister().
syzbot reported a warning below during atm_dev_register(). [0]
Before creating a new device and procfs/sysfs for it, atm_dev_register()
looks up a duplicated device by __atm_dev_lookup(). These operations are
done under atm_dev_mutex.
However, when removing a device in atm_dev_deregister(), it releases the
mutex just after removing the device from the list that __atm_dev_lookup()
iterates over.
So, there will be a small race window where the device does not exist on
the device list but procfs/sysfs are still not removed, triggering the
splat.
Let's hold the mutex until procfs/sysfs are removed in
atm_dev_deregister().
[0]:
proc_dir_entry 'atm/atmtcp:0' already registered
WARNING: CPU: 0 PID: 5919 at fs/proc/generic.c:377 proc_register+0x455/0x5f0 fs/proc/generic.c:377
Modules linked in:
CPU: 0 UID: 0 PID: 5919 Comm: syz-executor284 Not tainted
6.16.0-rc2-syzkaller-00047-g52da431bf03b #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
RIP: 0010:proc_register+0x455/0x5f0 fs/proc/generic.c:377
Code: 48 89 f9 48 c1 e9 03 80 3c 01 00 0f 85 a2 01 00 00 48 8b 44 24 10 48 c7 c7 20 c0 c2 8b 48 8b b0 d8 00 00 00 e8 0c 02 1c ff 90 <0f> 0b 90 90 48 c7 c7 80 f2 82 8e e8 0b de 23 09 48 8b 4c 24 28 48
RSP: 0018:
ffffc9000466fa30 EFLAGS:
00010282
RAX:
0000000000000000 RBX:
0000000000000000 RCX:
ffffffff817ae248
RDX:
ffff888026280000 RSI:
ffffffff817ae255 RDI:
0000000000000001
RBP:
ffff8880232bed48 R08:
0000000000000001 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000001 R12:
ffff888076ed2140
R13:
dffffc0000000000 R14:
ffff888078a61340 R15:
ffffed100edda444
FS:
00007f38b3b0c6c0(0000) GS:
ffff888124753000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007f38b3bdf953 CR3:
0000000076d58000 CR4:
00000000003526f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<TASK>
proc_create_data+0xbe/0x110 fs/proc/generic.c:585
atm_proc_dev_register+0x112/0x1e0 net/atm/proc.c:361
atm_dev_register+0x46d/0x890 net/atm/resources.c:113
atmtcp_create+0x77/0x210 drivers/atm/atmtcp.c:369
atmtcp_attach drivers/atm/atmtcp.c:403 [inline]
atmtcp_ioctl+0x2f9/0xd60 drivers/atm/atmtcp.c:464
do_vcc_ioctl+0x12c/0x930 net/atm/ioctl.c:159
sock_do_ioctl+0x115/0x280 net/socket.c:1190
sock_ioctl+0x227/0x6b0 net/socket.c:1311
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl fs/ioctl.c:893 [inline]
__x64_sys_ioctl+0x18b/0x210 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f38b3b74459
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:
00007f38b3b0c198 EFLAGS:
00000246 ORIG_RAX:
0000000000000010
RAX:
ffffffffffffffda RBX:
00007f38b3bfe318 RCX:
00007f38b3b74459
RDX:
0000000000000000 RSI:
0000000000006180 RDI:
0000000000000005
RBP:
00007f38b3bfe310 R08:
65732f636f72702f R09:
65732f636f72702f
R10:
65732f636f72702f R11:
0000000000000246 R12:
00007f38b3bcb0ac
R13:
00007f38b3b0c1a0 R14:
0000200000000200 R15:
00007f38b3bcb03b
</TASK>
Fixes:
64bf69ddff76 ("[ATM]: deregistration removes device from atm_devs list immediately")
Reported-by: syzbot+8bd335d2ad3b93e80715@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/
685316de.
050a0220.216029.0087.GAE@google.com/
Tested-by: syzbot+8bd335d2ad3b93e80715@syzkaller.appspotmail.com
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250624214505.570679-1-kuni1840@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 25 Jun 2025 22:36:30 +0000 (15:36 -0700)]
Merge branch 'netlink-specs-enforce-strict-naming-of-properties'
Jakub Kicinski says:
====================
netlink: specs: enforce strict naming of properties
I got annoyed once again by the name properties in the ethtool spec
which use underscore instead of dash. I previously assumed that there
is a lot of such properties in the specs so fixing them now would
be near impossible. On a closer look, however, I only found 22
(rough grep suggests we have ~4.8k names in the specs, so bad ones
are just 0.46%).
Add a regex to the JSON schema to enforce the naming, fix the few
bad names. I was hoping we could start enforcing this from newer
families, but there's no correlation between the protocol and the
number of errors. If anything classic netlink has more recently
added specs so it has fewer errors.
The regex is just for name properties which will end up visible
to the user (in Python or YNL CLI). I left the c-name properties
alone, those don't matter as much. C codegen rewrites them, anyway.
I'm not updating the spec for genetlink-c. Looks like it has no
users, new families use genetlink, all old ones need genetlink-legacy.
If these patches are merged I will remove genetlink-c completely
in net-next.
====================
Link: https://patch.msgid.link/20250624211002.3475021-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:10:02 +0000 (14:10 -0700)]
netlink: specs: enforce strict naming of properties
Add a regexp to make sure all names which may end up being visible
to the user consist of lower case characters, numbers and dashes.
Underscores keep sneaking into the specs, which is not visible
in the C code but makes the Python and alike inconsistent.
Note that starting with a number is okay, as in C the full
name will include the family name.
For legacy families we can't enforce the naming in the family
name or the multicast group names, as these are part of the
binary uAPI of the kernel.
For classic netlink we need to allow capital letters in names
of struct members. TC has some structs with capitalized members.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250624211002.3475021-11-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:10:01 +0000 (14:10 -0700)]
netlink: specs: tc: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes:
a1bcfde83669 ("doc/netlink/specs: Add a spec for tc")
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250624211002.3475021-10-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:10:00 +0000 (14:10 -0700)]
netlink: specs: rt-link: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes:
b2f63d904e72 ("doc/netlink: Add spec for rt link messages")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250624211002.3475021-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:59 +0000 (14:09 -0700)]
netlink: specs: mptcp: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes:
bc8aeb2045e2 ("Documentation: netlink: add a YAML spec for mptcp")
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250624211002.3475021-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:58 +0000 (14:09 -0700)]
netlink: specs: ovs_flow: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes:
93b230b549bc ("netlink: specs: add ynl spec for ovs_flow")
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://patch.msgid.link/20250624211002.3475021-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:57 +0000 (14:09 -0700)]
netlink: specs: devlink: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes:
429ac6211494 ("devlink: define enum for attr types of dynamic attributes")
Fixes:
f2f9dd164db0 ("netlink: specs: devlink: add the remaining command to generate complete split_ops")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250624211002.3475021-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:56 +0000 (14:09 -0700)]
netlink: specs: dpll: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes:
3badff3a25d8 ("dpll: spec: Add Netlink spec in YAML")
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250624211002.3475021-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:55 +0000 (14:09 -0700)]
netlink: specs: ethtool: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen replaces special chars in names)
but gives more uniform naming in Python.
Fixes:
13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
Fixes:
46fb3ba95b93 ("ethtool: Add an interface for flashing transceiver modules' firmware")
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250624211002.3475021-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:54 +0000 (14:09 -0700)]
netlink: specs: fou: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes:
4eb77b4ecd3c ("netlink: add a proto specification for FOU")
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250624211002.3475021-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 24 Jun 2025 21:09:53 +0000 (14:09 -0700)]
netlink: specs: nfsd: replace underscores with dashes in names
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes:
13727f85b49b ("NFSD: introduce netlink stubs")
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20250624211002.3475021-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>