linux-block.git
3 months agodt-bindings: net: airoha: Add the NPU node for EN7581 SoC
Lorenzo Bianconi [Fri, 28 Feb 2025 10:54:18 +0000 (11:54 +0100)]
dt-bindings: net: airoha: Add the NPU node for EN7581 SoC

This patch adds the NPU document binding for EN7581 SoC.
The Airoha Network Processor Unit (NPU) provides a configuration interface
to implement wired and wireless hardware flow offloading programming Packet
Processor Engine (PPE) flow table.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: airoha: Rename airoha_set_gdm_port_fwd_cfg() in airoha_set_vip_for_gdm_port()
Lorenzo Bianconi [Fri, 28 Feb 2025 10:54:17 +0000 (11:54 +0100)]
net: airoha: Rename airoha_set_gdm_port_fwd_cfg() in airoha_set_vip_for_gdm_port()

Rename airoha_set_gdm_port() in airoha_set_vip_for_gdm_port().
Get rid of airoha_set_gdm_ports routine.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: airoha: Move REG_GDM_FWD_CFG() initialization in airoha_dev_init()
Lorenzo Bianconi [Fri, 28 Feb 2025 10:54:16 +0000 (11:54 +0100)]
net: airoha: Move REG_GDM_FWD_CFG() initialization in airoha_dev_init()

Move REG_GDM_FWD_CFG() register initialization in airoha_dev_init
routine. Moreover, always send traffic PPE module in order to be
processed by hw accelerator.
This is a preliminary patch to enable netfilter flowtable hw offloading
on EN7581 SoC.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: airoha: Enable support for multiple net_devices
Lorenzo Bianconi [Fri, 28 Feb 2025 10:54:15 +0000 (11:54 +0100)]
net: airoha: Enable support for multiple net_devices

In the current codebase airoha_eth driver supports just a single
net_device connected to the Packet Switch Engine (PSE) lan port (GDM1).
As shown in commit 23020f049327 ("net: airoha: Introduce ethernet
support for EN7581 SoC"), PSE can switch packets between four GDM ports.
Enable the capability to create a net_device for each GDM port of the
PSE module. Moreover, since the QDMA blocks can be shared between
net_devices, do not stop TX/RX DMA in airoha_dev_stop() if there are
active net_devices for this QDMA block.
This is a preliminary patch to enable flowtable hw offloading for EN7581
SoC.

Co-developed-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: dsa: mt7530: Enable Rx sptag for EN7581 SoC
Lorenzo Bianconi [Fri, 28 Feb 2025 10:54:14 +0000 (11:54 +0100)]
net: dsa: mt7530: Enable Rx sptag for EN7581 SoC

Packet Processor Engine (PPE) module used for hw acceleration on EN7581
mac block, in order to properly parse packets, requires DSA untagged
packets on TX side and read DSA tag from DMA descriptor on RX side.
For this reason, enable RX Special Tag (SPTAG) for EN7581 SoC.
This is a preliminary patch to enable netfilter flowtable hw offloading
on EN7581 SoC.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: airoha: Move DSA tag in DMA descriptor
Lorenzo Bianconi [Fri, 28 Feb 2025 10:54:13 +0000 (11:54 +0100)]
net: airoha: Move DSA tag in DMA descriptor

Packet Processor Engine (PPE) module reads DSA tags from the DMA descriptor
and requires untagged DSA packets to properly parse them. Move DSA tag
in the DMA descriptor on TX side and read DSA tag from DMA descriptor
on RX side. In order to avoid skb reallocation, store tag in skb_dst on
RX side.
This is a preliminary patch to enable netfilter flowtable hw offloading
on EN7581 SoC.

Tested-by: Sayantan Nandy <sayantan.nandy@airoha.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: airoha: Move register definitions in airoha_regs.h
Lorenzo Bianconi [Fri, 28 Feb 2025 10:54:12 +0000 (11:54 +0100)]
net: airoha: Move register definitions in airoha_regs.h

Move common airoha_eth register definitions in airoha_regs.h in order
to reuse them for Packet Processor Engine (PPE) codebase.
PPE module is used to enable support for flowtable hw offloading in
airoha_eth driver.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: airoha: Move reg/write utility routines in airoha_eth.h
Lorenzo Bianconi [Fri, 28 Feb 2025 10:54:11 +0000 (11:54 +0100)]
net: airoha: Move reg/write utility routines in airoha_eth.h

This is a preliminary patch to introduce flowtable hw offloading
support for airoha_eth driver.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: airoha: Move definitions in airoha_eth.h
Lorenzo Bianconi [Fri, 28 Feb 2025 10:54:10 +0000 (11:54 +0100)]
net: airoha: Move definitions in airoha_eth.h

Move common airoha_eth definitions in airoha_eth.h in order to reuse
them for Packet Processor Engine (PPE) codebase.
PPE module is used to enable support for flowtable hw offloading in
airoha_eth driver.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: airoha: Move airoha_eth driver in a dedicated folder
Lorenzo Bianconi [Fri, 28 Feb 2025 10:54:09 +0000 (11:54 +0100)]
net: airoha: Move airoha_eth driver in a dedicated folder

The airoha_eth driver has no codebase shared with mtk_eth_soc one.
Moreover, the upcoming features (flowtable hw offloading, PCS, ..) will
not reuse any code from MediaTek driver. Move the Airoha driver in a
dedicated folder.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agoMerge branch 'net-notify-users-when-an-iface-cannot-change-its-netns'
Paolo Abeni [Tue, 4 Mar 2025 11:44:51 +0000 (12:44 +0100)]
Merge branch 'net-notify-users-when-an-iface-cannot-change-its-netns'

Nicolas Dichtel says:

====================
net: notify users when an iface cannot change its netns

This series adds a way to see if an interface cannot be moved to another netns.

 Documentation/netlink/specs/rt_link.yaml           |  3 ++
 .../networking/net_cachelines/net_device.rst       |  2 +-
 Documentation/networking/switchdev.rst             |  2 +-
 drivers/net/amt.c                                  |  2 +-
 drivers/net/bonding/bond_main.c                    |  2 +-
 drivers/net/ethernet/adi/adin1110.c                |  2 +-
 .../net/ethernet/marvell/prestera/prestera_main.c  |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  4 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  2 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c     |  2 +-
 drivers/net/ethernet/rocker/rocker_main.c          |  2 +-
 drivers/net/ethernet/ti/cpsw_new.c                 |  2 +-
 drivers/net/loopback.c                             |  2 +-
 drivers/net/net_failover.c                         |  2 +-
 drivers/net/team/team_core.c                       |  2 +-
 drivers/net/vrf.c                                  |  2 +-
 include/linux/netdevice.h                          |  9 +++--
 include/uapi/linux/if_link.h                       |  1 +
 net/batman-adv/soft-interface.c                    |  2 +-
 net/bridge/br_device.c                             |  2 +-
 net/core/dev.c                                     | 45 +++++++++++++++++-----
 net/core/rtnetlink.c                               |  5 ++-
 net/hsr/hsr_device.c                               |  2 +-
 net/ieee802154/6lowpan/core.c                      |  2 +-
 net/ieee802154/core.c                              | 10 ++---
 net/ipv4/ip_tunnel.c                               |  2 +-
 net/ipv4/ipmr.c                                    |  2 +-
 net/ipv6/ip6_gre.c                                 |  2 +-
 net/ipv6/ip6_tunnel.c                              |  2 +-
 net/ipv6/ip6mr.c                                   |  2 +-
 net/ipv6/sit.c                                     |  2 +-
 net/openvswitch/vport-internal_dev.c               |  2 +-
 net/wireless/core.c                                | 10 ++---
 tools/testing/selftests/net/forwarding/README      |  2 +-
 34 files changed, 86 insertions(+), 53 deletions(-)

Comments are welcome.

Regards,
Nicolas
====================

Link: https://patch.msgid.link/20250228102144.154802-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: plumb extack in __dev_change_net_namespace()
Nicolas Dichtel [Fri, 28 Feb 2025 10:20:58 +0000 (11:20 +0100)]
net: plumb extack in __dev_change_net_namespace()

It could be hard to understand why the netlink command fails. For example,
if dev->netns_immutable is set, the error is "Invalid argument".

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: advertise netns_immutable property via netlink
Nicolas Dichtel [Fri, 28 Feb 2025 10:20:57 +0000 (11:20 +0100)]
net: advertise netns_immutable property via netlink

Since commit 05c1280a2bcf ("netdev_features: convert NETIF_F_NETNS_LOCAL to
dev->netns_local"), there is no way to see if the netns_immutable property
s set on a device. Let's add a netlink attribute to advertise it.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: rename netns_local to netns_immutable
Nicolas Dichtel [Fri, 28 Feb 2025 10:20:56 +0000 (11:20 +0100)]
net: rename netns_local to netns_immutable

The name 'netns_local' is confusing. A following commit will export it via
netlink, so let's use a more explicit name.

Reported-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agoMerge branch 'some-pktgen-fixes-improvments-part-ii'
Paolo Abeni [Tue, 4 Mar 2025 09:58:00 +0000 (10:58 +0100)]
Merge branch 'some-pktgen-fixes-improvments-part-ii'

Peter Seiderer says:

====================
Some pktgen fixes/improvments (part II)

While taking a look at '[PATCH net] pktgen: Avoid out-of-range in
get_imix_entries' ([1]) and '[PATCH net v2] pktgen: Avoid out-of-bounds
access in get_imix_entries' ([2], [3]) and doing some tests and code review
I detected that the /proc/net/pktgen/... parsing logic does not honour the
user given buffer bounds (resulting in out-of-bounds access).

This can be observed e.g. by the following simple test (sometimes the
old/'longer' previous value is re-read from the buffer):

        $ echo add_device lo@0 > /proc/net/pktgen/kpktgend_0

        $ echo "min_pkt_size 12345" > /proc/net/pktgen/lo\@0 && grep min_pkt_size /proc/net/pktgen/lo\@0
Params: count 1000  min_pkt_size: 12345  max_pkt_size: 0
Result: OK: min_pkt_size=12345

        $ echo -n "min_pkt_size 123" > /proc/net/pktgen/lo\@0 && grep min_pkt_size /proc/net/pktgen/lo\@0
Params: count 1000  min_pkt_size: 12345  max_pkt_size: 0
Result: OK: min_pkt_size=12345

        $ echo "min_pkt_size 123" > /proc/net/pktgen/lo\@0 && grep min_pkt_size /proc/net/pktgen/lo\@0
Params: count 1000  min_pkt_size: 123  max_pkt_size: 0
Result: OK: min_pkt_size=123

So fix the out-of-bounds access (and some minor findings) and add a simple
proc_net_pktgen selftest...

Patch set splited into part I (now already applied to net-next)

- net: pktgen: replace ENOTSUPP with EOPNOTSUPP
- net: pktgen: enable 'param=value' parsing
- net: pktgen: fix hex32_arg parsing for short reads
- net: pktgen: fix 'rate 0' error handling (return -EINVAL)
- net: pktgen: fix 'ratep 0' error handling (return -EINVAL)
- net: pktgen: fix ctrl interface command parsing
- net: pktgen: fix access outside of user given buffer in pktgen_thread_write()

nd part II (this one):

- net: pktgen: use defines for the various dec/hex number parsing digits lengths
- net: pktgen: fix mix of int/long
- net: pktgen: remove extra tmp variable (re-use len instead)
- net: pktgen: remove some superfluous variable initializing
- net: pktgen: fix mpls maximum labels list parsing
- net: pktgen: fix access outside of user given buffer in pktgen_if_write()
- net: pktgen: fix mpls reset parsing
- net: pktgen: remove all superfluous index assignements
- selftest: net: add proc_net_pktgen

[1] https://lore.kernel.org/netdev/20241006221221.3744995-1-artem.chernyshev@red-soft.ru/
[2] https://lore.kernel.org/netdev/20250109083039.14004-1-pchelkin@ispras.ru/
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76201b5979768500bca362871db66d77cb4c225e
====================

Link: https://patch.msgid.link/20250227135604.40024-1-ps.report@gmx.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agoselftest: net: add proc_net_pktgen
Peter Seiderer [Thu, 27 Feb 2025 13:56:04 +0000 (14:56 +0100)]
selftest: net: add proc_net_pktgen

Add some test for /proc/net/pktgen/... interface.

- enable 'CONFIG_NET_PKTGEN=m' in tools/testing/selftests/net/config

Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: pktgen: remove all superfluous index assignements
Peter Seiderer [Thu, 27 Feb 2025 13:56:03 +0000 (14:56 +0100)]
net: pktgen: remove all superfluous index assignements

Remove all superfluous index ('i += len') assignements (value not used
afterwards).

Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: pktgen: fix mpls reset parsing
Peter Seiderer [Thu, 27 Feb 2025 13:56:02 +0000 (14:56 +0100)]
net: pktgen: fix mpls reset parsing

Fix mpls list reset parsing to work as describe in
Documentation/networking/pktgen.rst:

  pgset "mpls 0"    turn off mpls (or any invalid argument works too!)

- before the patch

$ echo "mpls 00000001,00000002" > /proc/net/pktgen/lo\@0
$ grep mpls /proc/net/pktgen/lo\@0
     mpls: 0000000100000002
Result: OK: mpls=00000001,00000002

$ echo "mpls 00000001,00000002" > /proc/net/pktgen/lo\@0
$ echo "mpls 0" > /proc/net/pktgen/lo\@0
$ grep mpls /proc/net/pktgen/lo\@0
     mpls: 00000000
Result: OK: mpls=00000000

$ echo "mpls 00000001,00000002" > /proc/net/pktgen/lo\@0
$ echo "mpls invalid" > /proc/net/pktgen/lo\@0
$ grep mpls /proc/net/pktgen/lo\@0
Result: OK: mpls=

- after the patch

$ echo "mpls 00000001,00000002" > /proc/net/pktgen/lo\@0
$ grep mpls /proc/net/pktgen/lo\@0
     mpls: 0000000100000002
Result: OK: mpls=00000001,00000002

$ echo "mpls 00000001,00000002" > /proc/net/pktgen/lo\@0
$ echo "mpls 0" > /proc/net/pktgen/lo\@0
$ grep mpls /proc/net/pktgen/lo\@0
Result: OK: mpls=

$ echo "mpls 00000001,00000002" > /proc/net/pktgen/lo\@0
$ echo "mpls invalid" > /proc/net/pktgen/lo\@0
$ grep mpls /proc/net/pktgen/lo\@0
Result: OK: mpls=

Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: pktgen: fix access outside of user given buffer in pktgen_if_write()
Peter Seiderer [Thu, 27 Feb 2025 13:56:01 +0000 (14:56 +0100)]
net: pktgen: fix access outside of user given buffer in pktgen_if_write()

Honour the user given buffer size for the hex32_arg(), num_arg(),
strn_len(), get_imix_entries() and get_labels() calls (otherwise they will
access memory outside of the user given buffer).

Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: pktgen: fix mpls maximum labels list parsing
Peter Seiderer [Thu, 27 Feb 2025 13:56:00 +0000 (14:56 +0100)]
net: pktgen: fix mpls maximum labels list parsing

Fix mpls maximum labels list parsing up to MAX_MPLS_LABELS entries (instead
of up to MAX_MPLS_LABELS - 1).

Addresses the following:

$ echo "mpls 00000f00,00000f01,00000f02,00000f03,00000f04,00000f05,00000f06,00000f07,00000f08,00000f09,00000f0a,00000f0b,00000f0c,00000f0d,00000f0e,00000f0f" > /proc/net/pktgen/lo\@0
-bash: echo: write error: Argument list too long

Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: pktgen: remove some superfluous variable initializing
Peter Seiderer [Thu, 27 Feb 2025 13:55:59 +0000 (14:55 +0100)]
net: pktgen: remove some superfluous variable initializing

Remove some superfluous variable initializing before hex32_arg call (as the
same init is done here already).

Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: pktgen: remove extra tmp variable (re-use len instead)
Peter Seiderer [Thu, 27 Feb 2025 13:55:58 +0000 (14:55 +0100)]
net: pktgen: remove extra tmp variable (re-use len instead)

Remove extra tmp variable in pktgen_if_write (re-use len instead).

Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: pktgen: fix mix of int/long
Peter Seiderer [Thu, 27 Feb 2025 13:55:57 +0000 (14:55 +0100)]
net: pktgen: fix mix of int/long

Fix mix of int/long (and multiple conversion from/to) by using consequently
size_t for i and max and ssize_t for len and adjust function signatures
of hex32_arg(), count_trail_chars(), num_arg() and strn_len() accordingly.

Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: sfp: add quirk for FS SFP-10GM-T copper SFP+ module
Martin Schiller [Thu, 27 Feb 2025 07:10:58 +0000 (08:10 +0100)]
net: sfp: add quirk for FS SFP-10GM-T copper SFP+ module

Add quirk for a copper SFP that identifies itself as "FS" "SFP-10GM-T".
It uses RollBall protocol to talk to the PHY and needs 4 sec wait before
probing the PHY.

Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Link: https://patch.msgid.link/20250227071058.1520027-1-ms@dev.tdt.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agomptcp: Remove unused declaration mptcp_set_owner_r()
Yue Haibing [Fri, 28 Feb 2025 09:51:48 +0000 (17:51 +0800)]
mptcp: Remove unused declaration mptcp_set_owner_r()

Commit 6639498ed85f ("mptcp: cleanup mem accounting")
removed the implementation but leave declaration.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228095148.4003065-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'add-sock_kmemdup-helper'
Jakub Kicinski [Tue, 4 Mar 2025 01:16:36 +0000 (17:16 -0800)]
Merge branch 'add-sock_kmemdup-helper'

Geliang Tang says:

====================
add sock_kmemdup helper

While developing MPTCP BPF path manager [1], I found it's useful to
add a new sock_kmemdup() helper.

My use case is this:

In mptcp_userspace_pm_append_new_local_addr() function (see patch 3
in this patchset), it uses sock_kmalloc() to allocate an address
entry "e", then immediately duplicate the input "entry" to it:

'''
e = sock_kmalloc(sk, sizeof(*e), GFP_ATOMIC);
if (!e) {
ret = -ENOMEM;
goto append_err;
}

*e = *entry;
'''

When I implemented MPTCP BPF path manager, I needed to implement a
code similar to this in BPF.

The kfunc sock_kmalloc() can be easily invoked in BPF to allocate
an entry "e", but the code "*e = *entry;" that assigns "entry" to
"e" is not easy to implemented.

I had to implement such a "copy entry" helper in BPF:

'''
static void mptcp_pm_copy_addr(struct mptcp_addr_info *dst,
                               struct mptcp_addr_info *src)
{
       dst->id = src->id;
       dst->family = src->family;
       dst->port = src->port;

       if (src->family == AF_INET) {
               dst->addr.s_addr = src->addr.s_addr;
       } else if (src->family == AF_INET6) {
               dst->addr6.s6_addr32[0] = src->addr6.s6_addr32[0];
               dst->addr6.s6_addr32[1] = src->addr6.s6_addr32[1];
               dst->addr6.s6_addr32[2] = src->addr6.s6_addr32[2];
               dst->addr6.s6_addr32[3] = src->addr6.s6_addr32[3];
       }
}

static void mptcp_pm_copy_entry(struct mptcp_pm_addr_entry *dst,
                                struct mptcp_pm_addr_entry *src)
{
       mptcp_pm_copy_addr(&dst->addr, &src->addr);

       dst->flags = src->flags;
       dst->ifindex = src->ifindex;
}
'''

And add "write permission" for BPF to each field of mptcp_pm_addr_entry:

'''
@@ static int bpf_mptcp_pm_btf_struct_access(struct bpf_verifier_log *log,
  case offsetof(struct mptcp_pm_addr_entry, addr.port):
    end = offsetofend(struct mptcp_pm_addr_entry, addr.port);
    break;
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
  case offsetof(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[0]):
    end = offsetofend(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[0]);
    break;
  case offsetof(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[1]):
    end = offsetofend(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[1]);
    break;
  case offsetof(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[2]):
    end = offsetofend(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[2]);
    break;
  case offsetof(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[3]):
    end = offsetofend(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[3]);
    break;
 #else
  case offsetof(struct mptcp_pm_addr_entry, addr.addr.s_addr):
    end = offsetofend(struct mptcp_pm_addr_entry, addr.addr.s_addr);
    break;
 #endif
'''

But if there's a sock_kmemdup() helper, it will become much simpler,
only need to call kfunc sock_kmemdup() instead in BPF.

So this patchset adds this new helper and uses it in several places.

[1]
https://lore.kernel.org/mptcp/cover.1738924875.git.tanggeliang@kylinos.cn/
====================

Link: https://patch.msgid.link/cover.1740735165.git.tanggeliang@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agomptcp: use sock_kmemdup for address entry
Geliang Tang [Fri, 28 Feb 2025 10:01:33 +0000 (18:01 +0800)]
mptcp: use sock_kmemdup for address entry

Instead of using sock_kmalloc() to allocate an address
entry "e" and then immediately duplicate the input "entry"
to it, the newly added sock_kmemdup() helper can be used in
mptcp_userspace_pm_append_new_local_addr() to simplify the code.

More importantly, the code "*e = *entry;" that assigns "entry"
to "e" is not easy to implemented in BPF if we use the same code
to implement an append_new_local_addr() helper of a BFP path
manager. This patch avoids this type of memory assignment
operation.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/3e5a307aed213038a87e44ff93b5793229b16279.1740735165.git.tanggeliang@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: use sock_kmemdup for ip_options
Geliang Tang [Fri, 28 Feb 2025 10:01:32 +0000 (18:01 +0800)]
net: use sock_kmemdup for ip_options

Instead of using sock_kmalloc() to allocate an ip_options and then
immediately duplicate another ip_options to the newly allocated one in
ipv6_dup_options(), mptcp_copy_ip_options() and sctp_v4_copy_ip_options(),
the newly added sock_kmemdup() helper can be used to simplify the code.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/91ae749d66600ec6fb679e0e518fda6acb5c3e6f.1740735165.git.tanggeliang@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agosock: add sock_kmemdup helper
Geliang Tang [Fri, 28 Feb 2025 10:01:31 +0000 (18:01 +0800)]
sock: add sock_kmemdup helper

This patch adds the sock version of kmemdup() helper, named sock_kmemdup(),
to duplicate the input "src" memory block using the socket's option memory
buffer.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/f828077394c7d1f3560123497348b438c875b510.1740735165.git.tanggeliang@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'tcp-misc-changes'
Jakub Kicinski [Mon, 3 Mar 2025 23:44:23 +0000 (15:44 -0800)]
Merge branch 'tcp-misc-changes'

Eric Dumazet says:

====================
tcp: misc changes

Minor changes, following recent changes in TCP stack.
====================

Link: https://patch.msgid.link/20250301201424.2046477-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agotcp: tcp_set_window_clamp() cleanup
Eric Dumazet [Sat, 1 Mar 2025 20:14:24 +0000 (20:14 +0000)]
tcp: tcp_set_window_clamp() cleanup

Remove one indentation level.

Use max_t() and clamp() macros.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-7-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agotcp: remove READ_ONCE(req->ts_recent)
Eric Dumazet [Sat, 1 Mar 2025 20:14:23 +0000 (20:14 +0000)]
tcp: remove READ_ONCE(req->ts_recent)

After commit 8d52da23b6c6 ("tcp: Defer ts_recent changes
until req is owned"), req->ts_recent is not changed anymore.

It is set once in tcp_openreq_init(), bpf_sk_assign_tcp_reqsk()
or cookie_tcp_reqsk_alloc() before the req can be seen by other
cpus/threads.

This completes the revert of eba20811f326 ("tcp: annotate
data-races around tcp_rsk(req)->ts_recent").

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: gro: convert four dev_net() calls
Eric Dumazet [Sat, 1 Mar 2025 20:14:22 +0000 (20:14 +0000)]
net: gro: convert four dev_net() calls

tcp4_check_fraglist_gro(), tcp6_check_fraglist_gro(),
udp4_gro_lookup_skb() and udp6_gro_lookup_skb()
assume RCU is held so that the net structure does not disappear.

Use dev_net_rcu() instead of dev_net() to get LOCKDEP support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agotcp: convert to dev_net_rcu()
Eric Dumazet [Sat, 1 Mar 2025 20:14:21 +0000 (20:14 +0000)]
tcp: convert to dev_net_rcu()

TCP uses of dev_net() are under RCU protection, change them
to dev_net_rcu() to get LOCKDEP support.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agotcp: add four drop reasons to tcp_check_req()
Eric Dumazet [Sat, 1 Mar 2025 20:14:20 +0000 (20:14 +0000)]
tcp: add four drop reasons to tcp_check_req()

Use two existing drop reasons in tcp_check_req():

- TCP_RFC7323_PAWS

- TCP_OVERWINDOW

Add two new ones:

- TCP_RFC7323_TSECR (corresponds to LINUX_MIB_TSECRREJECTED)

- TCP_LISTEN_OVERFLOW (when a listener accept queue is full)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agotcp: add a drop_reason pointer to tcp_check_req()
Eric Dumazet [Sat, 1 Mar 2025 20:14:19 +0000 (20:14 +0000)]
tcp: add a drop_reason pointer to tcp_check_req()

We want to add new drop reasons for packets dropped in 3WHS in the
following patches.

tcp_rcv_state_process() has to set reason to TCP_FASTOPEN,
because tcp_check_req() will conditionally overwrite the drop_reason.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'ipv4-fib-convert-rtm_newroute-and-rtm_delroute-to-per-netns-rtnl'
Jakub Kicinski [Mon, 3 Mar 2025 23:04:14 +0000 (15:04 -0800)]
Merge branch 'ipv4-fib-convert-rtm_newroute-and-rtm_delroute-to-per-netns-rtnl'

Kuniyuki Iwashima says:

====================
ipv4: fib: Convert RTM_NEWROUTE and RTM_DELROUTE to per-netns RTNL.

Patch 1 is misc cleanup.
Patch 2 ~ 8 converts two fib_info hash tables to per-netns.
Patch 9 ~ 12 converts rtnl_lock() to rtnl_net_lcok().

v2: https://lore.kernel.org/20250226192556.21633-1-kuniyu@amazon.com
v1: https://lore.kernel.org/20250225182250.74650-1-kuniyu@amazon.com
====================

Link: https://patch.msgid.link/20250228042328.96624-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Convert RTM_NEWROUTE and RTM_DELROUTE to per-netns RTNL.
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:28 +0000 (20:23 -0800)]
ipv4: fib: Convert RTM_NEWROUTE and RTM_DELROUTE to per-netns RTNL.

We converted fib_info hash tables to per-netns one and now ready to
convert RTM_NEWROUTE and RTM_DELROUTE to per-netns RTNL.

Let's hold rtnl_net_lock() in inet_rtm_newroute() and inet_rtm_delroute().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-13-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Move fib_valid_key_len() to rtm_to_fib_config().
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:27 +0000 (20:23 -0800)]
ipv4: fib: Move fib_valid_key_len() to rtm_to_fib_config().

fib_valid_key_len() is called in the beginning of fib_table_insert()
or fib_table_delete() to check if the prefix length is valid.

fib_table_insert() and fib_table_delete() are called from 3 paths

  - ip_rt_ioctl()
  - inet_rtm_newroute() / inet_rtm_delroute()
  - fib_magic()

In the first ioctl() path, rtentry_to_fib_config() checks the prefix
length with bad_mask().  Also, fib_magic() always passes the correct
prefix: 32 or ifa->ifa_prefixlen, which is already validated.

Let's move fib_valid_key_len() to the rtnetlink path, rtm_to_fib_config().

While at it, 2 direct returns in rtm_to_fib_config() are changed to
goto to match other places in the same function

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-12-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Hold rtnl_net_lock() in ip_rt_ioctl().
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:26 +0000 (20:23 -0800)]
ipv4: fib: Hold rtnl_net_lock() in ip_rt_ioctl().

ioctl(SIOCADDRT/SIOCDELRT) calls ip_rt_ioctl() to add/remove a route in
the netns of the specified socket.

Let's hold rtnl_net_lock() there.

Note that rtentry_to_fib_config() can be called without rtnl_net_lock()
if we convert rtentry.dev handling to RCU later.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-11-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Hold rtnl_net_lock() for ip_fib_net_exit().
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:25 +0000 (20:23 -0800)]
ipv4: fib: Hold rtnl_net_lock() for ip_fib_net_exit().

ip_fib_net_exit() requires RTNL and is called from fib_net_init()
and fib_net_exit_batch().

Let's hold rtnl_net_lock() before ip_fib_net_exit().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-10-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Namespacify fib_info hash tables.
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:24 +0000 (20:23 -0800)]
ipv4: fib: Namespacify fib_info hash tables.

We will convert RTM_NEWROUTE and RTM_DELROUTE to per-netns RTNL.
Then, we need to have per-netns hash tables for struct fib_info.

Let's allocate the hash tables per netns.

fib_info_hash, fib_info_hash_bits, and fib_info_cnt are now moved
to struct netns_ipv4 and accessed with net->ipv4.fib_XXX.

Also, the netns checks are removed from fib_find_info_nh() and
fib_find_info().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-9-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Add fib_info_hash_grow().
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:23 +0000 (20:23 -0800)]
ipv4: fib: Add fib_info_hash_grow().

When the number of struct fib_info exceeds the hash table size in
fib_create_info(), we try to allocate a new hash table with the
doubled size.

The allocation is done in fib_create_info(), and if successful, each
struct fib_info is moved to the new hash table by fib_info_hash_move().

Let's integrate the allocation and fib_info_hash_move() as
fib_info_hash_grow() to make the following change cleaner.

While at it, fib_info_hash_grow() is placed near other hash-table-specific
functions.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-8-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Remove fib_info_hash_size.
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:22 +0000 (20:23 -0800)]
ipv4: fib: Remove fib_info_hash_size.

We will allocate the fib_info hash tables per netns.

There are 5 global variables for fib_info hash tables:
fib_info_hash, fib_info_laddrhash, fib_info_hash_size,
fib_info_hash_bits, fib_info_cnt.

However, fib_info_laddrhash and fib_info_hash_size can be
easily calculated from fib_info_hash and fib_info_hash_bits.

Let's remove fib_info_hash_size and use (1 << fib_info_hash_bits)
instead.

Now we need not pass the new hash table size to fib_info_hash_move().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-7-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Remove fib_info_laddrhash pointer.
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:21 +0000 (20:23 -0800)]
ipv4: fib: Remove fib_info_laddrhash pointer.

We will allocate the fib_info hash tables per netns.

There are 5 global variables for fib_info hash tables:
fib_info_hash, fib_info_laddrhash, fib_info_hash_size,
fib_info_hash_bits, fib_info_cnt.

However, fib_info_laddrhash and fib_info_hash_size can be
easily calculated from fib_info_hash and fib_info_hash_bits.

Let's remove the fib_info_laddrhash pointer and instead use
fib_info_hash + (1 << fib_info_hash_bits).

While at it, fib_info_laddrhash_bucket() is moved near other
hash-table-specific functions.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-6-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Make fib_info_hashfn() return struct hlist_head.
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:20 +0000 (20:23 -0800)]
ipv4: fib: Make fib_info_hashfn() return struct hlist_head.

Every time fib_info_hashfn() returns a hash value, we fetch
&fib_info_hash[hash].

Let's return the hlist_head pointer from fib_info_hashfn() and
rename it to fib_info_hash_bucket() to match a similar function,
fib_info_laddrhash_bucket().

Note that we need to move the fib_info_hash assignment earlier in
fib_info_hash_move() to use fib_info_hash_bucket() in the for loop.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-5-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Allocate fib_info_hash[] during netns initialisation.
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:19 +0000 (20:23 -0800)]
ipv4: fib: Allocate fib_info_hash[] during netns initialisation.

We will allocate fib_info_hash[] and fib_info_laddrhash[] for each netns.

Currently, fib_info_hash[] is allocated when the first route is added.

Let's move the first allocation to a new __net_init function.

Note that we must call fib4_semantics_exit() in fib_net_exit_batch()
because ->exit() is called earlier than ->exit_batch().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-4-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Allocate fib_info_hash[] and fib_info_laddrhash[] by kvcalloc().
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:18 +0000 (20:23 -0800)]
ipv4: fib: Allocate fib_info_hash[] and fib_info_laddrhash[] by kvcalloc().

Both fib_info_hash[] and fib_info_laddrhash[] are hash tables for
struct fib_info and are allocated by kvzmalloc() separately.

Let's replace the two kvzmalloc() calls with kvcalloc() to remove
the fib_info_laddrhash pointer later.

Note that fib_info_hash_alloc() allocates a new hash table based on
fib_info_hash_bits because we will remove fib_info_hash_size later.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: fib: Use cached net in fib_inetaddr_event().
Kuniyuki Iwashima [Fri, 28 Feb 2025 04:23:17 +0000 (20:23 -0800)]
ipv4: fib: Use cached net in fib_inetaddr_event().

net is available in fib_inetaddr_event(), let's use it.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoselftests: net: report output format as TAP 13 in Python tests
Jakub Kicinski [Fri, 28 Feb 2025 18:00:07 +0000 (10:00 -0800)]
selftests: net: report output format as TAP 13 in Python tests

The Python lib based tests report that they are producing
"KTAP version 1", but really we aren't making use of any
KTAP features, like subtests. Our output is plain TAP.

Report TAP 13 instead of KTAP 1, this is what mptcp tests do,
and what NIPA knows how to parse best. For HW testing we need
precise subtest result tracking.

Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228180007.83325-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'add-usb-net-support-for-telit-cinterion-fn990b'
Jakub Kicinski [Sat, 1 Mar 2025 01:54:05 +0000 (17:54 -0800)]
Merge branch 'add-usb-net-support-for-telit-cinterion-fn990b'

Fabio Porcedda says:

====================
Add usb net support for Telit Cinterion FN990B

Add usb net support for Telit Cinterion FE990B.
Also fix Telit Cinterion FE990A name.

Connection with ModemManager was tested also AT ports.

There is a different patch set for the usb option part.

0x10b0: rmnet + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (AT) +
            tty (diag) + DPL + QDSS (Qualcomm Debug SubSystem) + adb
    T:  Bus=01 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  7 Spd=480  MxCh= 0
    D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
    P:  Vendor=1bc7 ProdID=10b0 Rev=05.15
    S:  Manufacturer=Telit Cinterion
    S:  Product=FE990
    S:  SerialNumber=28c2595e
    C:  #Ifs= 9 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:  If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=82(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
    I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=88(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8a(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
    E:  Ad=8c(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none)
    E:  Ad=8d(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
    E:  Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

    0x10b1: MBIM + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (AT) +
            tty (diag) + DPL + QDSS (Qualcomm Debug SubSystem) + adb
    T:  Bus=01 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  8 Spd=480  MxCh= 0
    D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
    P:  Vendor=1bc7 ProdID=10b1 Rev=05.15
    S:  Manufacturer=Telit Cinterion
    S:  Product=FE990
    S:  SerialNumber=28c2595e
    C:  #Ifs=10 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
    E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
    I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=88(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8a(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
    E:  Ad=8c(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 8 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none)
    E:  Ad=8d(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
    E:  Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

    0x10b2: RNDIS + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (AT) +
            tty (diag) + DPL + QDSS (Qualcomm Debug SubSystem) + adb
    T:  Bus=01 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  9 Spd=480  MxCh= 0
    D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
    P:  Vendor=1bc7 ProdID=10b2 Rev=05.15
    S:  Manufacturer=Telit Cinterion
    S:  Product=FE990
    S:  SerialNumber=28c2595e
    C:  #Ifs=10 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:  If#= 0 Alt= 0 #EPs= 1 Cls=ef(misc ) Sub=04 Prot=01 Driver=rndis_host
    E:  Ad=82(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
    I:  If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=88(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8a(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
    E:  Ad=8c(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 8 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none)
    E:  Ad=8d(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
    E:  Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

    0x10d3: ECM + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (AT) +
            tty (diag) + DPL + QDSS (Qualcomm Debug SubSystem) + adb
    T:  Bus=01 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 11 Spd=480  MxCh= 0
    D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
    P:  Vendor=1bc7 ProdID=10b3 Rev=05.15
    S:  Manufacturer=Telit Cinterion
    S:  Product=FE990
    S:  SerialNumber=28c2595e
    C:  #Ifs=10 Cfg#= 1 Atr=e0 MxPwr=500mA
    I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether
    E:  Ad=82(I) Atr=03(Int.) MxPS=  16 Ivl=32ms
    I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether
    E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
    E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=88(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
    E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8a(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
    I:  If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
    E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
    E:  Ad=8c(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 8 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none)
    E:  Ad=8d(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    I:  If#= 9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
    E:  Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
    E:  Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
====================

Link: https://patch.msgid.link/20250227112441.3653819-1-fabio.porcedda@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: usb: cdc_mbim: fix Telit Cinterion FE990A name
Fabio Porcedda [Thu, 27 Feb 2025 11:24:41 +0000 (12:24 +0100)]
net: usb: cdc_mbim: fix Telit Cinterion FE990A name

The correct name for FE990 is FE990A so use it in order to avoid
confusion with FE990B.

Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Link: https://patch.msgid.link/20250227112441.3653819-4-fabio.porcedda@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: usb: qmi_wwan: fix Telit Cinterion FE990A name
Fabio Porcedda [Thu, 27 Feb 2025 11:24:40 +0000 (12:24 +0100)]
net: usb: qmi_wwan: fix Telit Cinterion FE990A name

The correct name for FE990 is FE990A so use it in order to avoid
confusion with FE990B.

Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Link: https://patch.msgid.link/20250227112441.3653819-3-fabio.porcedda@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: usb: qmi_wwan: add Telit Cinterion FE990B composition
Fabio Porcedda [Thu, 27 Feb 2025 11:24:39 +0000 (12:24 +0100)]
net: usb: qmi_wwan: add Telit Cinterion FE990B composition

Add the following Telit Cinterion FE990B composition:
0x10b0: rmnet + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (AT) +
        tty (diag) + DPL + QDSS (Qualcomm Debug SubSystem) + adb

usb-devices:
T:  Bus=01 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  7 Spd=480  MxCh= 0
D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1bc7 ProdID=10b0 Rev=05.15
S:  Manufacturer=Telit Cinterion
S:  Product=FE990
S:  SerialNumber=28c2595e
C:  #Ifs= 9 Cfg#= 1 Atr=e0 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=82(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=88(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=8a(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
E:  Ad=8c(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none)
E:  Ad=8d(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
E:  Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Cc: stable@vger.kernel.org
Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Link: https://patch.msgid.link/20250227112441.3653819-2-fabio.porcedda@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'inet-ping-remove-extra-skb_clone-consume_skb'
Jakub Kicinski [Fri, 28 Feb 2025 22:41:35 +0000 (14:41 -0800)]
Merge branch 'inet-ping-remove-extra-skb_clone-consume_skb'

Eric Dumazet says:

====================
inet: ping: remove extra skb_clone()/consume_skb()

First patch in the series moves ICMP_EXT_ECHOREPLY handling in icmp_rcv()
to prepare the second patch.

The second patch removes one skb_clone()/consume_skb() pair
when processing ICMP_EXT_REPLY packets. Some people
use hundreds of "ping -fq ..." to stress hosts :)
====================

Link: https://patch.msgid.link/20250226183437.1457318-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoinet: ping: avoid skb_clone() dance in ping_rcv()
Eric Dumazet [Wed, 26 Feb 2025 18:34:37 +0000 (18:34 +0000)]
inet: ping: avoid skb_clone() dance in ping_rcv()

ping_rcv() callers currently call skb_free() or consume_skb(),
forcing ping_rcv() to clone the skb.

After this patch ping_rcv() is now 'consuming' the original skb,
either moving to a socket receive queue, or dropping it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250226183437.1457318-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoipv4: icmp: do not process ICMP_EXT_ECHOREPLY for broadcast/multicast addresses
Eric Dumazet [Wed, 26 Feb 2025 18:34:36 +0000 (18:34 +0000)]
ipv4: icmp: do not process ICMP_EXT_ECHOREPLY for broadcast/multicast addresses

There is no point processing ICMP_EXT_ECHOREPLY for routes
which would drop ICMP_ECHOREPLY (RFC 1122 3.2.2.6, 3.2.2.8)

This seems an oversight of the initial implementation.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250226183437.1457318-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'net-stmmac-cleanup-transmit-clock-setting'
Jakub Kicinski [Fri, 28 Feb 2025 18:20:50 +0000 (10:20 -0800)]
Merge branch 'net-stmmac-cleanup-transmit-clock-setting'

Russell King says:

====================
net: stmmac: cleanup transmit clock setting

A lot of stmmac platform code which sets the transmit clock is very
similar - they decode the speed to the clock rate (125, 25 or 2.5 MHz)
and then set a clock to that rate.

The DWMAC core appears to have a clock input for the transmit section
called clk_tx_i which requires this rate.

This series moves the code which sets this clock into the core stmmac
code.

Patch 1 adds a hook that platforms can use to configure the clock rate.
Patch 2 adds a generic implementation.
The remainder of the patches convert the glue code for various platforms
to use this new infrastructure.
====================

Link: https://patch.msgid.link/Z8AtX-wyPal1auVO@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: thead: switch to use set_clk_tx_rate() hook
Russell King (Oracle) [Thu, 27 Feb 2025 09:17:14 +0000 (09:17 +0000)]
net: stmmac: thead: switch to use set_clk_tx_rate() hook

Switch from using the fix_mac_speed() hook to set_clk_tx_rate() to
manage the transmit clock.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna14-0052tT-S4@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: meson: switch to use set_clk_tx_rate() hook
Russell King (Oracle) [Thu, 27 Feb 2025 09:17:09 +0000 (09:17 +0000)]
net: stmmac: meson: switch to use set_clk_tx_rate() hook

Switch from using the fix_mac_speed() hook to set_clk_tx_rate() to
manage the transmit clock.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna0z-0052tN-O1@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: ipq806x: switch to use set_clk_tx_rate() hook
Russell King (Oracle) [Thu, 27 Feb 2025 09:17:04 +0000 (09:17 +0000)]
net: stmmac: ipq806x: switch to use set_clk_tx_rate() hook

Switch from using the fix_mac_speed() hook to set_clk_tx_rate() to
manage the transmit clock.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna0u-0052tH-KQ@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: rk: switch to use set_clk_tx_rate() hook
Russell King (Oracle) [Thu, 27 Feb 2025 09:16:59 +0000 (09:16 +0000)]
net: stmmac: rk: switch to use set_clk_tx_rate() hook

Switch from using the fix_mac_speed() hook to set_clk_tx_rate() to
manage the transmit clock.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna0p-0052t8-Gn@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: imx: use generic stmmac_set_clk_tx_rate()
Russell King (Oracle) [Thu, 27 Feb 2025 09:16:54 +0000 (09:16 +0000)]
net: stmmac: imx: use generic stmmac_set_clk_tx_rate()

Convert non-i.MX93 users to use the generic stmmac_set_clk_tx_rate() to
configure the MAC transmit clock rate.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna0k-0052t2-Cc@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: intel: use generic stmmac_set_clk_tx_rate()
Russell King (Oracle) [Thu, 27 Feb 2025 09:16:49 +0000 (09:16 +0000)]
net: stmmac: intel: use generic stmmac_set_clk_tx_rate()

Use the generic stmmac_set_clk_tx_rate() to configure the MAC transmit
clock.

Note that given the current unpatched driver structure,
plat_dat->fix_mac_speed will always be populated with
kmb_eth_fix_mac_speed(), even when no clock is present. We preserve
this behaviour in this patch by always initialising plat_dat->clk_tx_i
and plat_dat->set_clk_tx_rate.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna0f-0052sw-8r@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: s32: use generic stmmac_set_clk_tx_rate()
Russell King (Oracle) [Thu, 27 Feb 2025 09:16:44 +0000 (09:16 +0000)]
net: stmmac: s32: use generic stmmac_set_clk_tx_rate()

Use the generic stmmac_set_clk_tx_rate() to configure the MAC transmit
clock.

Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna0a-0052sq-59@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: starfive: use generic stmmac_set_clk_tx_rate()
Russell King (Oracle) [Thu, 27 Feb 2025 09:16:39 +0000 (09:16 +0000)]
net: stmmac: starfive: use generic stmmac_set_clk_tx_rate()

Use the generic stmmac_set_clk_tx_rate() to configure the MAC transmit
clock.

Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna0V-0052sk-1L@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: dwc-qos: use generic stmmac_set_clk_tx_rate()
Russell King (Oracle) [Thu, 27 Feb 2025 09:16:33 +0000 (09:16 +0000)]
net: stmmac: dwc-qos: use generic stmmac_set_clk_tx_rate()

Use the generic stmmac_set_clk_tx_rate() to configure the MAC transmit
clock.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna0P-0052se-Tv@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: provide generic implementation for set_clk_tx_rate method
Russell King (Oracle) [Thu, 27 Feb 2025 09:16:28 +0000 (09:16 +0000)]
net: stmmac: provide generic implementation for set_clk_tx_rate method

Provide a generic implementation for the set_clk_tx_rate method
introduced by the previous patch, which is capable of configuring the
MAC transmit clock for 10M, 100M and 1000M speeds for at least MII,
GMII, RGMII and RMII interface modes.

Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna0K-0052sY-QF@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: stmmac: provide set_clk_tx_rate() hook
Russell King (Oracle) [Thu, 27 Feb 2025 09:16:23 +0000 (09:16 +0000)]
net: stmmac: provide set_clk_tx_rate() hook

Several stmmac sub-drivers which support RGMII follow the same pattern.
They calculate the transmit clock rate, and then call clk_set_rate().

Analysis of several implementation documents suggests that the platform
is responsible for providing the transmit clock to the DWMAC core's
clk_tx_i. The expected rates are:

10Mbps 100Mbps 1Gbps
MII 2.5MHz 25MHz
RMII 2.5MHz 25MHz
GMII 125MHz
RGMI 2.5MHz 25MHz 125MHz

It seems some platforms require this clock to be manually configured,
but there are outputs from the MAC core that indicate the speed, so a
platform may use these to automatically configure the clock. Thus, we
can't just provide one solution to configure this clock rate.

Moreover, the clock may need to be derived from one of several sources
depending on the interface mode.

Provide a platform hook that is passed the transmit clock, interface
mode and speed.

Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tna0F-0052sS-Lr@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'mlx5-health-syndrome'
David S. Miller [Fri, 28 Feb 2025 08:56:49 +0000 (08:56 +0000)]
Merge branch 'mlx5-health-syndrome'

Tariq Toukan says:

====================
mlx5: Trust lockdown health syndrome

This series introduces a new error type in the health syndrome,
specifically for trust lock-down.  Additionally, it exposes the CRR bit
in the health buffer, which, when set, indicates that the error cannot
be recovered without a process involving a cold reset. We add The CRR
bit value to the health buffer info log and update it to be logged on
any syndrome.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 months agonet/mlx5: Add trust lockdown error to health syndrome print function
Shahar Shitrit [Wed, 26 Feb 2025 12:25:43 +0000 (14:25 +0200)]
net/mlx5: Add trust lockdown error to health syndrome print function

Add the new health syndrome value to hsynd_str() function
to indicate that the device got a trust lockdown fault.

Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 months agonet/mlx5: Expose crr in health buffer
Shahar Shitrit [Wed, 26 Feb 2025 12:25:42 +0000 (14:25 +0200)]
net/mlx5: Expose crr in health buffer

Expose crr bit in struct health buffer. When set, it indicates that
the error cannot be recovered without flow involving a cold reset.
Add its value to the health buffer info log.

Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 months agonet/mlx5: Log health buffer data on any syndrome
Moshe Shemesh [Wed, 26 Feb 2025 12:25:41 +0000 (14:25 +0200)]
net/mlx5: Log health buffer data on any syndrome

Currently health buffer data is logged either when FW fatal error
detected or miss counter reached max misses threshold.

Log health buffer whenever new health syndrome is detected.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 months agonet/mlx5: Avoid report two health errors on same syndrome
Moshe Shemesh [Wed, 26 Feb 2025 12:25:40 +0000 (14:25 +0200)]
net/mlx5: Avoid report two health errors on same syndrome

In case health counter has not increased for few polling intervals, miss
counter will reach max misses threshold and health report will be
triggered for FW health reporter. In case syndrome found on same health
poll another health report will be triggered.

Avoid two health reports on same syndrome by marking this syndrome as
already known.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 months agoMerge branch 'selftests-net-deflake-gro-tests-and-fix-return-value-and-output'
Jakub Kicinski [Fri, 28 Feb 2025 02:38:26 +0000 (18:38 -0800)]
Merge branch 'selftests-net-deflake-gro-tests-and-fix-return-value-and-output'

Kevin Krakauer says:

====================
selftests/net: deflake GRO tests and fix return value and output

The GRO selftests can flake and have some confusing behavior. These
changes make the output and return value of GRO behave as expected, then
deflake the tests.

v1: https://lore.kernel.org/20250218164555.1955400-1-krakauer@google.com
====================

Link: https://patch.msgid.link/20250226192725.621969-1-krakauer@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoselftests/net: deflake GRO tests
Kevin Krakauer [Wed, 26 Feb 2025 19:27:25 +0000 (11:27 -0800)]
selftests/net: deflake GRO tests

GRO tests are timing dependent and can easily flake. This is partially
mitigated in gro.sh by giving each subtest 3 chances to pass. However,
this still flakes on some machines. Reduce the flakiness by:

- Bumping retries to 6.
- Setting napi_defer_hard_irqs to 1 to reduce the chance that GRO is
  flushed prematurely. This also lets us reduce the gro_flush_timeout
  from 1ms to 100us.

Tested: Ran `gro.sh -t large` 1000 times. There were no failures with
this change. Ran inside strace to increase flakiness.

Signed-off-by: Kevin Krakauer <krakauer@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250226192725.621969-4-krakauer@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoselftests/net: only print passing message in GRO tests when tests pass
Kevin Krakauer [Wed, 26 Feb 2025 19:27:24 +0000 (11:27 -0800)]
selftests/net: only print passing message in GRO tests when tests pass

gro.c:main no longer erroneously claims a test passes when running as a
sender.

Tested: Ran `gro.sh -t large` to verify the sender no longer prints a
status.

Signed-off-by: Kevin Krakauer <krakauer@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250226192725.621969-3-krakauer@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoselftests/net: have `gro.sh -t` return a correct exit code
Kevin Krakauer [Wed, 26 Feb 2025 19:27:23 +0000 (11:27 -0800)]
selftests/net: have `gro.sh -t` return a correct exit code

Modify gro.sh to return a useful exit code when the -t flag is used. It
formerly returned 0 no matter what.

Tested: Ran `gro.sh -t large` and verified that test failures return 1.
Signed-off-by: Kevin Krakauer <krakauer@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250226192725.621969-2-krakauer@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge branch 'add-missing-netlink-error-message-macros-to-coccinelle-test'
Jakub Kicinski [Fri, 28 Feb 2025 02:11:39 +0000 (18:11 -0800)]
Merge branch 'add-missing-netlink-error-message-macros-to-coccinelle-test'

Gal Pressman says:

====================
Add missing netlink error message macros to coccinelle test

The newline_in_nl_msg.cocci test is missing some variants in the list of
checked macros, add them and fix all reported issues.
====================

Link: https://patch.msgid.link/20250226093904.6632-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoice: dpll: Remove newline at the end of a netlink error message
Gal Pressman [Wed, 26 Feb 2025 09:39:04 +0000 (11:39 +0200)]
ice: dpll: Remove newline at the end of a netlink error message

Netlink error messages should not have a newline at the end of the
string.

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Acked-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250226093904.6632-6-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: sched: Remove newline at the end of a netlink error message
Gal Pressman [Wed, 26 Feb 2025 09:39:03 +0000 (11:39 +0200)]
net: sched: Remove newline at the end of a netlink error message

Netlink error messages should not have a newline at the end of the
string.

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250226093904.6632-5-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agosfc: Remove newline at the end of a netlink error message
Gal Pressman [Wed, 26 Feb 2025 09:39:02 +0000 (11:39 +0200)]
sfc: Remove newline at the end of a netlink error message

Netlink error messages should not have a newline at the end of the
string.

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250226093904.6632-4-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet/mlx5: Remove newline at the end of a netlink error message
Gal Pressman [Wed, 26 Feb 2025 09:39:01 +0000 (11:39 +0200)]
net/mlx5: Remove newline at the end of a netlink error message

Netlink error messages should not have a newline at the end of the
string.

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250226093904.6632-3-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agococcinelle: Add missing (GE)NL_SET_ERR_MSG_* to strings ending with newline test
Gal Pressman [Wed, 26 Feb 2025 09:39:00 +0000 (11:39 +0200)]
coccinelle: Add missing (GE)NL_SET_ERR_MSG_* to strings ending with newline test

Add missing (GE)NL_SET_ERR_MSG_*() variants to the list of macros
checked for strings ending with a newline.

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250226093904.6632-2-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet-sysfs: remove unused initial ret values
Antoine Tenart [Wed, 26 Feb 2025 17:46:43 +0000 (18:46 +0100)]
net-sysfs: remove unused initial ret values

In some net-sysfs functions the ret value is initialized but never used
as it is always overridden. Remove those.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250226174644.311136-1-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agogeneve, specs: Add port range to rt_link specification
Daniel Borkmann [Wed, 26 Feb 2025 18:20:30 +0000 (19:20 +0100)]
geneve, specs: Add port range to rt_link specification

Add the port range to rt_link, example:

  # tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/rt_link.yaml \
    --do getlink --json '{"ifname": "geneve1"}' --output-json | jq
  {
    "ifname": "geneve1",
    [...]
    "linkinfo": {
      "kind": "geneve",
      "data": {
        "id": 1000,
        "remote": "147.28.227.100",
        "udp-csum": 0,
        "ttl": 0,
        "tos": 0,
        "label": 0,
        "df": 0,
        "port": 49431,
        "udp-zero-csum6-rx": 1,
        "ttl-inherit": 0,
        "port-range": {
          "low": 4000,
          "high": 5000
        }
      }
    },
    [...]
  }

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20250226182030.89440-2-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agogeneve: Allow users to specify source port range
Daniel Borkmann [Wed, 26 Feb 2025 18:20:29 +0000 (19:20 +0100)]
geneve: Allow users to specify source port range

Recently, in case of Cilium, we run into users on Azure who require to use
tunneling for east/west traffic due to hitting IPAM API limits for Kubernetes
Pods if they would have gone with publicly routable IPs for Pods. In case
of tunneling, Cilium supports the option of vxlan or geneve. In order to
RSS spread flows among remote CPUs both derive a source port hash via
udp_flow_src_port() which takes the inner packet's skb->hash into account.
For clusters with many nodes, this can then hit a new limitation [0]: Today,
the Azure networking stack supports 1M total flows (500k inbound and 500k
outbound) for a VM. [...] Once this limit is hit, other connections are
dropped. [...] Each flow is distinguished by a 5-tuple (protocol, local IP
address, remote IP address, local port, and remote port) information. [...]

For vxlan and geneve, this can create a massive amount of UDP flows which
then run into the limits if stale flows are not evicted fast enough. One
option to mitigate this for vxlan is to narrow the source port range via
IFLA_VXLAN_PORT_RANGE while still being able to benefit from RSS. However,
geneve currently does not have this option and it spreads traffic across
the full source port range of [1, USHRT_MAX]. To overcome this limitation
also for geneve, add an equivalent IFLA_GENEVE_PORT_RANGE setting for users.

Note that struct geneve_config before/after still remains at 2 cachelines
on x86-64. The low/high members of struct ifla_geneve_port_range (which is
uapi exposed) are of type __be16. While they would be perfectly fine to be
of __u16 type, the consensus was that it would be good to be consistent
with the existing struct ifla_vxlan_port_range from a uapi consumer PoV.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://learn.microsoft.com/en-us/azure/virtual-network/virtual-machine-network-throughput
Link: https://patch.msgid.link/20250226182030.89440-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet/mlx5e: Avoid a hundred -Wflex-array-member-not-at-end warnings
Gustavo A. R. Silva [Wed, 26 Feb 2025 03:17:32 +0000 (13:47 +1030)]
net/mlx5e: Avoid a hundred -Wflex-array-member-not-at-end warnings

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

So, in this particular case, we create a new `struct mlx5e_umr_wqe_hdr`
to enclose the header part of flexible structure `struct mlx5e_umr_wqe`.
This is, all the members except the flexible arrays `inline_mtts`,
`inline_klms` and `inline_ksms` in the anonymous union. We then replace
the header part with `struct mlx5e_umr_wqe_hdr hdr;` in `struct
mlx5e_umr_wqe`, and change the type of the object currently causing
trouble `umr_wqe` from `struct mlx5e_umr_wqe` to `struct
mlx5e_umr_wqe_hdr` --this last bit gets rid of the flex-array-in-the-middle
part and avoid the warnings.

Also, no new members should be added to `struct mlx5e_umr_wqe`, instead
any new members must be included in the header structure `struct
mlx5e_umr_wqe_hdr`. To enforce this, we use `static_assert()`, ensuring
that the memory layout of both the flexible structure and the newly
created header struct remain consistent.

The next step is to refactor the rest of the related code accordingly,
which means adding a bunch of `hdr.` wherever needed.

Lastly, we use `container_of()` whenever we need to retrieve a pointer
to the flexible structure `struct mlx5e_umr_wqe`.

So, with these changes, fix 125 of the following warnings:

drivers/net/ethernet/mellanox/mlx5/core/en.h:664:48: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://patch.msgid.link/Z76HzPW1dFTLOSSy@kspp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonetkit: Remove double invocation to clear ipvs property flag
Daniel Borkmann [Tue, 25 Feb 2025 21:29:27 +0000 (22:29 +0100)]
netkit: Remove double invocation to clear ipvs property flag

With ipvs_reset() now done unconditionally in skb_scrub_packet()
we would then call the former twice netkit_prep_forward(). Thus
remove the now unnecessary explicit call.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250225212927.69271-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: qed: make 'qed_ll2_ops_pass' as __maybe_unused
Arnd Bergmann [Tue, 25 Feb 2025 20:09:23 +0000 (21:09 +0100)]
net: qed: make 'qed_ll2_ops_pass' as __maybe_unused

gcc warns about unused const variables even in header files when
building with W=1:

    In file included from include/linux/qed/qed_rdma_if.h:14,
                     from drivers/net/ethernet/qlogic/qed/qed_rdma.h:16,
                     from drivers/net/ethernet/qlogic/qed/qed_cxt.c:23:
    include/linux/qed/qed_ll2_if.h:270:33: error: 'qed_ll2_ops_pass' defined but not used [-Werror=unused-const-variable=]
      270 | static const struct qed_ll2_ops qed_ll2_ops_pass = {

This one is intentional, so mark it as __maybe_unused to it can be
included from a file that doesn't use this variable.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://patch.msgid.link/20250225200926.4057723-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 27 Feb 2025 18:14:23 +0000 (10:14 -0800)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR (net-6.14-rc5).

Conflicts:

drivers/net/ethernet/cadence/macb_main.c
  fa52f15c745c ("net: cadence: macb: Synchronize stats calculations")
  75696dd0fd72 ("net: cadence: macb: Convert to get_stats64")
https://lore.kernel.org/20250224125848.68ee63e5@canb.auug.org.au

Adjacent changes:

drivers/net/ethernet/intel/ice/ice_sriov.c
  79990cf5e7ad ("ice: Fix deinitializing VF in error path")
  a203163274a4 ("ice: simplify VF MSI-X managing")

net/ipv4/tcp.c
  18912c520674 ("tcp: devmem: don't write truncated dmabuf CMSGs to userspace")
  297d389e9e5b ("net: prefix devmem specific helpers")

net/mptcp/subflow.c
  8668860b0ad3 ("mptcp: reset when MPTCP opts are dropped after join")
  c3349a22c200 ("mptcp: consolidate subflow cleanup")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoMerge tag 'net-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 27 Feb 2025 17:32:42 +0000 (09:32 -0800)]
Merge tag 'net-6.14-rc5' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth.

  We didn't get netfilter or wireless PRs this week, so next week's PR
  is probably going to be bigger. A healthy dose of fixes for bugs
  introduced in the current release nonetheless.

  Current release - regressions:

   - Bluetooth: always allow SCO packets for user channel

   - af_unix: fix memory leak in unix_dgram_sendmsg()

   - rxrpc:
       - remove redundant peer->mtu_lock causing lockdep splats
       - fix spinlock flavor issues with the peer record hash

   - eth: iavf: fix circular lock dependency with netdev_lock

   - net: use rtnl_net_dev_lock() in
     register_netdevice_notifier_dev_net() RDMA driver register notifier
     after the device

  Current release - new code bugs:

   - ethtool: fix ioctl confusing drivers about desired HDS user config

   - eth: ixgbe: fix media cage present detection for E610 device

  Previous releases - regressions:

   - loopback: avoid sending IP packets without an Ethernet header

   - mptcp: reset connection when MPTCP opts are dropped after join

  Previous releases - always broken:

   - net: better track kernel sockets lifetime

   - ipv6: fix dst ref loop on input in seg6 and rpl lw tunnels

   - phy: qca807x: use right value from DTS for DAC_DSP_BIAS_CURRENT

   - eth: enetc: number of error handling fixes

   - dsa: rtl8366rb: reshuffle the code to fix config / build issue with
     LED support"

* tag 'net-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits)
  net: ti: icss-iep: Reject perout generation request
  idpf: fix checksums set in idpf_rx_rsc()
  selftests: drv-net: Check if combined-count exists
  net: ipv6: fix dst ref loop on input in rpl lwt
  net: ipv6: fix dst ref loop on input in seg6 lwt
  usbnet: gl620a: fix endpoint checking in genelink_bind()
  net/mlx5: IRQ, Fix null string in debug print
  net/mlx5: Restore missing trace event when enabling vport QoS
  net/mlx5: Fix vport QoS cleanup on error
  net: mvpp2: cls: Fixed Non IP flow, with vlan tag flow defination.
  af_unix: Fix memory leak in unix_dgram_sendmsg()
  net: Handle napi_schedule() calls from non-interrupt
  net: Clear old fragment checksum value in napi_reuse_skb
  gve: unlink old napi when stopping a queue using queue API
  net: Use rtnl_net_dev_lock() in register_netdevice_notifier_dev_net().
  tcp: Defer ts_recent changes until req is owned
  net: enetc: fix the off-by-one issue in enetc_map_tx_tso_buffs()
  net: enetc: remove the mm_lock from the ENETC v4 driver
  net: enetc: add missing enetc4_link_deinit()
  net: enetc: update UDP checksum when updating originTimestamp field
  ...

3 months agoMerge tag 'sound-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 27 Feb 2025 16:41:19 +0000 (08:41 -0800)]
Merge tag 'sound-6.14-rc5' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of fixes. The only slightly large change is for ASoC
  Cirrus codec, but that's still in a normal range. All the rest are
  small device-specific fixes and should be fairly safe to take"

* tag 'sound-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: Fix microphone regression on ASUS N705UD
  ALSA: hda/realtek: Fix wrong mic setup for ASUS VivoBook 15
  ASoC: cs35l56: Prevent races when soft-resetting using SPI control
  firmware: cs_dsp: Remove async regmap writes
  ASoC: Intel: sof_sdw: warn both sdw and pch dmic are used
  ASoC: SOF: Intel: don't check number of sdw links when set dmic_fixup
  ASoC: dapm-graph: set fill colour of turned on nodes
  ASoC: fsl: Rename stream name of SAI DAI driver
  ASoC: es8328: fix route from DAC to output
  ALSA: usb-audio: Re-add sample rate quirk for Pioneer DJM-900NXS2
  ASoC: tas2764: Set the SDOUT polarity correctly
  ASoC: tas2764: Fix power control mask
  ALSA: usb-audio: Avoid dropping MIDI events at closing multiple ports
  ASoC: tas2770: Fix volume scale

3 months agonet: ti: icss-iep: Reject perout generation request
Meghana Malladi [Thu, 27 Feb 2025 09:24:41 +0000 (14:54 +0530)]
net: ti: icss-iep: Reject perout generation request

IEP driver supports both perout and pps signal generation
but perout feature is faulty with half-cooked support
due to some missing configuration. Remove perout
support from the driver and reject perout requests with
"not supported" error code.

Fixes: c1e0230eeaab2 ("net: ti: icss-iep: Add IEP driver")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250227092441.1848419-1-m-malladi@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoidpf: fix checksums set in idpf_rx_rsc()
Eric Dumazet [Wed, 26 Feb 2025 22:12:52 +0000 (22:12 +0000)]
idpf: fix checksums set in idpf_rx_rsc()

idpf_rx_rsc() uses skb_transport_offset(skb) while the transport header
is not set yet.

This triggers the following warning for CONFIG_DEBUG_NET=y builds.

DEBUG_NET_WARN_ON_ONCE(!skb_transport_header_was_set(skb))

[   69.261620] WARNING: CPU: 7 PID: 0 at ./include/linux/skbuff.h:3020 idpf_vport_splitq_napi_poll (include/linux/skbuff.h:3020) idpf
[   69.261629] Modules linked in: vfat fat dummy bridge intel_uncore_frequency_tpmi intel_uncore_frequency_common intel_vsec_tpmi idpf intel_vsec cdc_ncm cdc_eem cdc_ether usbnet mii xhci_pci xhci_hcd ehci_pci ehci_hcd libeth
[   69.261644] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Tainted: G S      W          6.14.0-smp-DEV #1697
[   69.261648] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
[   69.261650] RIP: 0010:idpf_vport_splitq_napi_poll (include/linux/skbuff.h:3020) idpf
[   69.261677] ? __warn (kernel/panic.c:242 kernel/panic.c:748)
[   69.261682] ? idpf_vport_splitq_napi_poll (include/linux/skbuff.h:3020) idpf
[   69.261687] ? report_bug (lib/bug.c:?)
[   69.261690] ? handle_bug (arch/x86/kernel/traps.c:285)
[   69.261694] ? exc_invalid_op (arch/x86/kernel/traps.c:309)
[   69.261697] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
[   69.261700] ? __pfx_idpf_vport_splitq_napi_poll (drivers/net/ethernet/intel/idpf/idpf_txrx.c:4011) idpf
[   69.261704] ? idpf_vport_splitq_napi_poll (include/linux/skbuff.h:3020) idpf
[   69.261708] ? idpf_vport_splitq_napi_poll (drivers/net/ethernet/intel/idpf/idpf_txrx.c:3072) idpf
[   69.261712] __napi_poll (net/core/dev.c:7194)
[   69.261716] net_rx_action (net/core/dev.c:7265)
[   69.261718] ? __qdisc_run (net/sched/sch_generic.c:293)
[   69.261721] ? sched_clock (arch/x86/include/asm/preempt.h:84 arch/x86/kernel/tsc.c:288)
[   69.261726] handle_softirqs (kernel/softirq.c:561)

Fixes: 3a8845af66edb ("idpf: add RX splitq napi poll support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alan Brady <alan.brady@intel.com>
Cc: Joshua Hay <joshua.a.hay@intel.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20250226221253.1927782-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agoselftests: drv-net: Check if combined-count exists
Joe Damato [Wed, 26 Feb 2025 18:19:57 +0000 (18:19 +0000)]
selftests: drv-net: Check if combined-count exists

Some drivers, like tg3, do not set combined-count:

$ ethtool -l enp4s0f1
Channel parameters for enp4s0f1:
Pre-set maximums:
RX: 4
TX: 4
Other: n/a
Combined: n/a
Current hardware settings:
RX: 4
TX: 1
Other: n/a
Combined: n/a

In the case where combined-count is not set, the ethtool netlink code
in the kernel elides the value and the code in the test:

  netnl.channels_get(...)

With a tg3 device, the returned dictionary looks like:

{'header': {'dev-index': 3, 'dev-name': 'enp4s0f1'},
 'rx-max': 4,
 'rx-count': 4,
 'tx-max': 4,
 'tx-count': 1}

Note that the key 'combined-count' is missing. As a result of this
missing key the test raises an exception:

 # Exception|     if channels['combined-count'] == 0:
 # Exception|        ~~~~~~~~^^^^^^^^^^^^^^^^^^
 # Exception| KeyError: 'combined-count'

Change the test to check if 'combined-count' is a key in the dictionary
first and if not assume that this means the driver has separate RX and
TX queues.

With this change, the test now passes successfully on tg3 and mlx5
(which does have a 'combined-count').

Fixes: 1cf270424218 ("net: selftest: add test for netdev netlink queue-get API")
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250226181957.212189-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 months agonet: ethernet: mtk_ppe_offload: Allow QinQ, double ETH_P_8021Q only
Eric Woudstra [Tue, 25 Feb 2025 20:15:09 +0000 (21:15 +0100)]
net: ethernet: mtk_ppe_offload: Allow QinQ, double ETH_P_8021Q only

mtk_foe_entry_set_vlan() in mtk_ppe.c already supports double vlan
tagging, but mtk_flow_offload_replace() in mtk_ppe_offload.c only allows
for 1 vlan tag, optionally in combination with pppoe and dsa tags.

However, mtk_foe_entry_set_vlan() only allows for setting the vlan id.
The protocol cannot be set, it is always ETH_P_8021Q, for inner and outer
tag. This patch adds QinQ support to mtk_flow_offload_replace(), only in
the case that both inner and outer tags are ETH_P_8021Q.

Only PPPoE-in-Q (as before) and Q-in-Q are allowed. A combination
of PPPoE and Q-in-Q is not allowed.

Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
Link: https://patch.msgid.link/20250225201509.20843-1-ericwouds@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agoMerge branch 'fixes-for-seg6-and-rpl-lwtunnels-on-input'
Paolo Abeni [Thu, 27 Feb 2025 13:18:24 +0000 (14:18 +0100)]
Merge branch 'fixes-for-seg6-and-rpl-lwtunnels-on-input'

Justin Iurman says:

====================
fixes for seg6 and rpl lwtunnels on input

As a follow up to commit 92191dd10730 ("net: ipv6: fix dst ref loops in
rpl, seg6 and ioam6 lwtunnels"), we also need a conditional dst cache on
input for seg6_iptunnel and rpl_iptunnel to prevent dst ref loops (i.e.,
if the packet destination did not change, we may end up recording a
reference to the lwtunnel in its own cache, and the lwtunnel state will
never be freed). This series provides a fix to respectively prevent a
dst ref loop on input in seg6_iptunnel and rpl_iptunnel.

v2:
- https://lore.kernel.org/netdev/20250211221624.18435-1-justin.iurman@uliege.be/
v1:
- https://lore.kernel.org/netdev/20250209193840.20509-1-justin.iurman@uliege.be/
====================

Link: https://patch.msgid.link/20250225175139.25239-1-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: ipv6: fix dst ref loop on input in rpl lwt
Justin Iurman [Tue, 25 Feb 2025 17:51:39 +0000 (18:51 +0100)]
net: ipv6: fix dst ref loop on input in rpl lwt

Prevent a dst ref loop on input in rpl_iptunnel.

Fixes: a7a29f9c361f ("net: ipv6: add rpl sr tunnel")
Cc: Alexander Aring <alex.aring@gmail.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 months agonet: ipv6: fix dst ref loop on input in seg6 lwt
Justin Iurman [Tue, 25 Feb 2025 17:51:38 +0000 (18:51 +0100)]
net: ipv6: fix dst ref loop on input in seg6 lwt

Prevent a dst ref loop on input in seg6_iptunnel.

Fixes: af4a2209b134 ("ipv6: sr: use dst_cache in seg6_input")
Cc: David Lebrun <dlebrun@google.com>
Cc: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>