linux-block.git
8 months ago net: ena: Refactor napi functions
David Arinzon [Mon, 1 Jan 2024 19:08:51 +0000 (19:08 +0000)]
net: ena: Refactor napi functions

This patch focuses on changes to the XDP part of the napi
polling routine.

1. Update the `napi_comp` stat only when napi is actually
   complete.
2. Simplify the code by using a function pointer to the right
   napi routine (XDP vs non-XDP path)
3. Remove unnecessary local variables.
4. Adjust a debug print to show the processed XDP frame index
   rather than the pointer.
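
Points 1 and 2 above can be sketched in a few lines. This is a minimal
illustration, not the driver's code: every demo_* name is hypothetical,
and the two poll routines share one helper purely to keep the example
short.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_napi;
typedef int (*demo_poll_fn)(struct demo_napi *napi, int budget);

/* Hypothetical, heavily simplified per-queue state. */
struct demo_napi {
	demo_poll_fn poll;       /* picked once at setup: XDP or regular path */
	int pending;             /* packets still waiting to be processed */
	unsigned long napi_comp; /* polls that finished all pending work */
};

static int demo_consume(struct demo_napi *napi, int budget)
{
	int done = napi->pending < budget ? napi->pending : budget;

	napi->pending -= done;
	return done;
}

/* The two paths would differ in a real driver; here they share a helper. */
static int demo_poll_regular(struct demo_napi *napi, int budget)
{
	return demo_consume(napi, budget);
}

static int demo_poll_xdp(struct demo_napi *napi, int budget)
{
	return demo_consume(napi, budget);
}

void demo_napi_init(struct demo_napi *napi, bool xdp_loaded, int pending)
{
	napi->poll = xdp_loaded ? demo_poll_xdp : demo_poll_regular;
	napi->pending = pending;
	napi->napi_comp = 0;
}

int demo_napi_poll(struct demo_napi *napi, int budget)
{
	int work_done;

	assert(budget > 0);
	work_done = napi->poll(napi, budget);

	/* Count a completion only when the budget was not used up. */
	if (work_done < budget)
		napi->napi_comp++;

	return work_done;
}
```

Choosing the routine once at setup keeps the per-poll path free of an
"is XDP loaded" branch.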

Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-8-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: ena: Don't check if XDP program is loaded in ena_xdp_execute()
David Arinzon [Mon, 1 Jan 2024 19:08:50 +0000 (19:08 +0000)]
net: ena: Don't check if XDP program is loaded in ena_xdp_execute()

This check is already done in ena_clean_rx_irq(), which indirectly
calls it.
This function is called in napi context, and the driver doesn't
allow changing the XDP program without destroying and reinitializing
the napi context (part of the ena_down/ena_up sequence).

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-7-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: ena: Use tx_ring instead of xdp_ring for XDP channel TX
David Arinzon [Mon, 1 Jan 2024 19:08:49 +0000 (19:08 +0000)]
net: ena: Use tx_ring instead of xdp_ring for XDP channel TX

When an XDP program is loaded, the existing channels in the driver are
split into two halves:
- The first half of the channels contains RX and TX rings. These queues
  are used for receiving traffic and sending packets originating from
  the kernel.
- The second half of the channels contains only a TX ring. These queues
  are used for sending packets that were redirected using XDP_TX
  or XDP_REDIRECT.

Referring to the queues in the second half of the channels as "xdp_ring"
can be confusing and may give the impression that ENA has the capability
to generate an additional special queue.

This patch ensures that the xdp_ring field is exclusively used to
describe the XDP TX queue that a specific RX queue needs to utilize when
forwarding packets with XDP_TX and XDP_REDIRECT, preserving the
integrity of the xdp_ring field in ena_ring.
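
A rough sketch of the resulting layout follows. The demo_* types are
hypothetical, and pairing RX queue i with TX queue i + num_io_queues is
an assumption made for illustration; the text above only says that each
RX queue points at the XDP TX queue it should use.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_QUEUES 8

struct demo_ring {
	int qid;
	/* Set on RX rings only: the TX ring used for XDP_TX/XDP_REDIRECT. */
	struct demo_ring *xdp_ring;
};

struct demo_adapter {
	int num_io_queues;                             /* size of the first half */
	struct demo_ring rx_ring[DEMO_MAX_QUEUES];
	struct demo_ring tx_ring[2 * DEMO_MAX_QUEUES]; /* second half is XDP-only */
};

void demo_setup_xdp(struct demo_adapter *a, int num_io_queues)
{
	int i;

	assert(num_io_queues > 0 && num_io_queues <= DEMO_MAX_QUEUES);
	a->num_io_queues = num_io_queues;

	for (i = 0; i < 2 * num_io_queues; i++) {
		a->tx_ring[i].qid = i;
		a->tx_ring[i].xdp_ring = NULL;
	}

	for (i = 0; i < num_io_queues; i++) {
		a->rx_ring[i].qid = i;
		/* Assumed pairing: RX queue i forwards through TX queue i + N. */
		a->rx_ring[i].xdp_ring = &a->tx_ring[i + num_io_queues];
	}
}
```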

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-6-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: ena: Introduce total_tx_size field in ena_tx_buffer struct
David Arinzon [Mon, 1 Jan 2024 19:08:48 +0000 (19:08 +0000)]
net: ena: Introduce total_tx_size field in ena_tx_buffer struct

To avoid de-referencing skb or xdp_frame when we poll for TX completion
(where they might not be in the cache), save the total TX packet size in
the ena_tx_buffer object representing the packet.

Also, the 'print_once' field's type is changed from u32 to u8 so that
'total_tx_size' can be added without changing the total size of the
struct.
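
The idea can be illustrated as follows. This is a sketch with
hypothetical demo_* names; struct demo_packet stands in for the skb or
xdp_frame, and the field layout is not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an skb or xdp_frame; likely cold by completion time. */
struct demo_packet {
	uint32_t len;
};

struct demo_tx_buffer {
	struct demo_packet *pkt;
	uint32_t total_tx_size; /* cached when the packet is queued */
	uint8_t print_once;
};

void demo_xmit(struct demo_tx_buffer *buf, struct demo_packet *pkt)
{
	assert(buf != NULL && pkt != NULL);
	buf->pkt = pkt;
	buf->total_tx_size = pkt->len; /* the packet is hot in cache here */
	buf->print_once = 1;
}

/* Completion path: byte accounting never dereferences buf->pkt. */
uint64_t demo_clean_tx(const struct demo_tx_buffer *bufs, size_t n)
{
	uint64_t bytes = 0;
	size_t i;

	for (i = 0; i < n; i++)
		bytes += bufs[i].total_tx_size;

	return bytes;
}
```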

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-5-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: ena: Put orthogonal fields in ena_tx_buffer in a union
David Arinzon [Mon, 1 Jan 2024 19:08:47 +0000 (19:08 +0000)]
net: ena: Put orthogonal fields in ena_tx_buffer in a union

The skb and xdpf pointers cannot be set together in the driver
(each TX descriptor can send either an SKB or an XDP frame), and so it
makes more sense to put them both in a union.

This decreases the overall size of the ena_tx_buffer struct which
improves cache locality.
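
The size effect is easy to check with two toy layouts. The structs
below are hypothetical and much smaller than ena_tx_buffer; they only
show that overlaying two mutually exclusive pointers removes one
pointer's worth of storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_skb;       /* opaque stand-ins; only pointers are stored */
struct demo_xdp_frame;

/* Before: both pointers take space although only one is ever set. */
struct demo_tx_buffer_separate {
	struct demo_skb *skb;
	struct demo_xdp_frame *xdpf;
	uint32_t tx_descs;
};

/* After: the two pointers share the same storage. */
struct demo_tx_buffer_union {
	union {
		struct demo_skb *skb;
		struct demo_xdp_frame *xdpf;
	};
	uint32_t tx_descs;
};

static_assert(sizeof(struct demo_tx_buffer_union) <
	      sizeof(struct demo_tx_buffer_separate),
	      "the union layout should be smaller");

size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_tx_buffer_separate) -
	       sizeof(struct demo_tx_buffer_union);
}
```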

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-4-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: ena: Pass ena_adapter instead of net_device to ena_xmit_common()
David Arinzon [Mon, 1 Jan 2024 19:08:46 +0000 (19:08 +0000)]
net: ena: Pass ena_adapter instead of net_device to ena_xmit_common()

This change makes it possible to use ena_xmit_common() in functions
that don't have a net_device pointer.
While the pointer can be retrieved by dereferencing ena_adapter
(adapter->netdev), there's no reason to do so in fast path code,
where it is only needed for debug prints.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-3-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: ena: Move XDP code to its new files
David Arinzon [Mon, 1 Jan 2024 19:08:45 +0000 (19:08 +0000)]
net: ena: Move XDP code to its new files

The XDP system has a very large footprint in the driver's overall code,
which makes the whole driver much harder to read.

Move the XDP code to dedicated files.

This patch doesn't make any changes to the code itself and only
cut-pastes the code into ena_xdp.c and ena_xdp.h files so the change
is purely cosmetic.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://lore.kernel.org/r/20240101190855.18739-2-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago octeontx2-af: Fix max NPC MCAM entry check while validating ref_entry
Suman Ghosh [Mon, 1 Jan 2024 14:50:42 +0000 (20:20 +0530)]
octeontx2-af: Fix max NPC MCAM entry check while validating ref_entry

Until now, the last MCAM entry could not be allocated because of a <=
check against the max_bmap count. This patch fixes the check and, if the
requested entry is greater than the available entries, sets it to the
max value.

Signed-off-by: Suman Ghosh <sumang@marvell.com>
Link: https://lore.kernel.org/r/20240101145042.419697-1-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago selftests/net: change shebang to bash to support "source"
Yujie Liu [Fri, 29 Dec 2023 13:19:31 +0000 (21:19 +0800)]
selftests/net: change shebang to bash to support "source"

The patch set [1] added a general lib.sh in net selftests, and converted
several test scripts to source the lib.sh.

unicast_extensions.sh (converted in [1]) and pmtu.sh (converted in [2])
have a /bin/sh shebang which may point to various shells in different
distributions, but "source" is only available in some of them. For
example, "source" is a built-in command in bash, but it cannot be
used in dash.

Following the other scripts that were converted at the same time, simply
change the shebang to bash to fix the following failure when the default
/bin/sh points to another shell.

not ok 51 selftests: net: unicast_extensions.sh # exit=1

v1 -> v2:
  - Fix pmtu.sh which has the same issue as unicast_extensions.sh,
    suggested by Hangbin
  - Change the style of the "source" line to be consistent with other
    tests, suggested by Hangbin

Link: https://lore.kernel.org/all/20231202020110.362433-1-liuhangbin@gmail.com/ [1]
Link: https://lore.kernel.org/all/20231219094856.1740079-1-liuhangbin@gmail.com/ [2]
Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 378f082eaf37 ("selftests/net: convert pmtu.sh to run it in unique namespace")
Fixes: 0f4765d0b48d ("selftests/net: convert unicast_extensions.sh to run it in unique namespace")
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20231229131931.3961150-1-yujie.liu@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago fib: remove unnecessary input parameters in fib_default_rule_add
Zhengchao Shao [Tue, 2 Jan 2024 07:15:19 +0000 (15:15 +0800)]
fib: remove unnecessary input parameters in fib_default_rule_add

Every caller of fib_default_rule_add passes 0 for the input parameter
'flags'. The rule is allocated with kzalloc, so its 'flags' field is
already initialized to 0. Therefore, remove the input parameter 'flags'
from fib_default_rule_add.
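
The reasoning can be shown with a userspace analogue. In this sketch
calloc() plays the role of kzalloc(), and struct demo_rule and
demo_default_rule_add() are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_rule {
	uint32_t pref;
	uint32_t table;
	uint32_t flags;
};

/* No 'flags' parameter: zeroed allocation already leaves it at 0. */
struct demo_rule *demo_default_rule_add(uint32_t pref, uint32_t table)
{
	struct demo_rule *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;

	r->pref = pref;
	r->table = table;
	/* r->flags is intentionally not assigned. */
	return r;
}
```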

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240102071519.3781384-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: mvpp2: initialize port fwnode pointer
Marcin Wojtas [Sun, 31 Dec 2023 12:20:19 +0000 (12:20 +0000)]
net: mvpp2: initialize port fwnode pointer

Also update the port's device structure with its fwnode pointer,
using the recommended device_set_node() helper routine.

Signed-off-by: Marcin Wojtas <marcin.s.wojtas@gmail.com>
Reviewed-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20231231122019.123344-1-marcin.s.wojtas@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: mdio: mux-bcm-iproc: Use alignment helpers and SZ_4K
Ilpo Järvinen [Fri, 29 Dec 2023 14:52:32 +0000 (16:52 +0200)]
net: mdio: mux-bcm-iproc: Use alignment helpers and SZ_4K

Instead of open coding, use IS_ALIGNED() and ALIGN_DOWN() when dealing
with alignment. Also replace literals with SZ_4K.
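
For readers unfamiliar with the helpers, here is a userspace
approximation. The demo_* functions are simplified stand-ins for the
kernel macros, assume a power-of-two alignment, and demo_split_addr()
is a hypothetical caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SZ_4K 0x1000u

/* Both helpers assume 'a' is a power of two. */
static inline bool demo_is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

static inline uint64_t demo_align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

/* Returns the 4K-aligned base and stores the offset within that page. */
uint64_t demo_split_addr(uint64_t addr, uint64_t *offset)
{
	uint64_t base = demo_align_down(addr, DEMO_SZ_4K);

	assert(offset != 0);
	*offset = addr - base;
	return base;
}
```

Named helpers state the intent directly, where an open-coded mask such
as `addr & ~0xfff` leaves the reader to work out the alignment.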

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Acked-by: Ray Jui <ray.jui@broadcom.com>
Link: https://lore.kernel.org/r/20231229145232.6163-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago net/sched: cls_api: complement tcf_tfilter_dump_policy
Lin Ma [Thu, 28 Dec 2023 06:43:58 +0000 (14:43 +0800)]
net/sched: cls_api: complement tcf_tfilter_dump_policy

In function `tc_dump_tfilter`, the attributes array is parsed via
tcf_tfilter_dump_policy, which only describes TCA_DUMP_FLAGS. However,
the NLA TCA_CHAIN is also accessed with `nla_get_u32`.

The access to TCA_CHAIN was introduced in commit 5bc1701881e3 ("net:
sched: introduce multichain support for filters"), and no nla_policy was
provided for parsing at that point. Later on, tcf_tfilter_dump_policy was
introduced in commit f8ab1807a9c9 ("net: sched: introduce terse dump
flag"), still ignoring the fact that TCA_CHAIN needs a check. This
patch adds TCA_CHAIN to the policy so that the access discussed here is
safe; the other callers already use rtm_tca_policy as the parsing
policy.
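
Why a missing policy entry matters can be seen in a toy parser. This
is a userspace sketch with hypothetical demo_* names and a much simpler
attribute format than netlink's. The point is only that an attribute
without a policy entry reaches the u32 read unchecked, while one with
an entry is rejected when its length is wrong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
	DEMO_ATTR_UNSPEC,
	DEMO_ATTR_CHAIN,
	DEMO_ATTR_DUMP_FLAGS,
	DEMO_ATTR_MAX = DEMO_ATTR_DUMP_FLAGS
};

struct demo_attr {
	uint16_t type;
	uint16_t len;     /* payload length in bytes */
	const void *data;
};

/* Required payload length per type; 0 means "not validated". */
static const uint16_t demo_dump_policy[DEMO_ATTR_MAX + 1] = {
	[DEMO_ATTR_CHAIN]      = sizeof(uint32_t), /* the kind of entry the patch adds */
	[DEMO_ATTR_DUMP_FLAGS] = 2 * sizeof(uint32_t),
};

/* Returns 0 and fills tb[] if all known attributes pass, -1 otherwise. */
int demo_parse(const struct demo_attr *attrs, size_t n,
	       const struct demo_attr *tb[])
{
	size_t i;

	for (i = 0; i <= DEMO_ATTR_MAX; i++)
		tb[i] = NULL;

	for (i = 0; i < n; i++) {
		uint16_t type = attrs[i].type;

		if (type > DEMO_ATTR_MAX)
			continue; /* unknown attribute: ignored */
		if (demo_dump_policy[type] && attrs[i].len != demo_dump_policy[type])
			return -1;
		tb[type] = &attrs[i];
	}
	return 0;
}

/* Only safe to call on attributes that passed the policy check. */
uint32_t demo_get_u32(const struct demo_attr *a)
{
	uint32_t v;

	assert(a->len >= sizeof(v));
	memcpy(&v, a->data, sizeof(v));
	return v;
}
```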

Signed-off-by: Lin Ma <linma@zju.edu.cn>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago ppp: Fix spelling typo in comment in ppp_async_encode()
liyouhong [Wed, 27 Dec 2023 01:58:31 +0000 (09:58 +0800)]
ppp: Fix spelling typo in comment in ppp_async_encode()

Fix spelling typo in comment

Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: liyouhong <liyouhong@kylinos.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231227015831.289077-1-liyouhong@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago net: ethtool: Fix symmetric-xor RSS RX flow hash check
Gerhard Engleder [Tue, 26 Dec 2023 20:55:36 +0000 (21:55 +0100)]
net: ethtool: Fix symmetric-xor RSS RX flow hash check

Commit 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
adds a check to the ethtool set_rxnfc operation, which checks the RX
flow hash if the flag RXH_XFRM_SYM_XOR is set. This flag is introduced
with the same commit. It calls the ethtool get_rxfh operation to get the
RX flow hash data. If get_rxfh is not supported, then EOPNOTSUPP is
returned.

There are drivers like tsnep, macb, asp2, genet, gianfar, mtk, ... which
support the ethtool operation set_rxnfc but not get_rxfh. This results
in EOPNOTSUPP being returned by ethtool_set_rxnfc() without actually
calling the ethtool operation set_rxnfc. Thus, set_rxnfc got broken for
all these drivers.

Check RX flow hash in ethtool_set_rxnfc() only if driver supports RX
flow hash.
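
The shape of the fix can be sketched as below. All demo_* names,
flag values and the two toy driver callbacks are hypothetical, and the
symmetry rule is a simplified assumption. What matters is that the
extra check runs only when the optional get_rxfh callback exists.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_XFRM_SYM_XOR 0x1u
#define DEMO_HASH_IP_SRC  0x1u
#define DEMO_HASH_IP_DST  0x2u

struct demo_ops {
	int (*get_rxfh)(unsigned int *input_xfrm); /* optional, may be NULL */
	int (*set_rxnfc)(unsigned int flow_hash);  /* optional, may be NULL */
};

/* Toy driver callbacks used to exercise the check. */
int demo_drv_get_rxfh(unsigned int *input_xfrm)
{
	*input_xfrm = DEMO_XFRM_SYM_XOR;
	return 0;
}

int demo_drv_set_rxnfc(unsigned int flow_hash)
{
	(void)flow_hash;
	return 0;
}

/* Assumed rule: source and destination must be hashed together. */
static bool demo_hash_is_symmetric(unsigned int flow_hash)
{
	return !(flow_hash & DEMO_HASH_IP_SRC) == !(flow_hash & DEMO_HASH_IP_DST);
}

int demo_set_rxnfc(const struct demo_ops *ops, unsigned int flow_hash)
{
	assert(ops != NULL);

	if (!ops->set_rxnfc)
		return -EOPNOTSUPP;

	/* Skip the check for drivers that cannot report RX flow hash state. */
	if (ops->get_rxfh) {
		unsigned int xfrm = 0;
		int rc = ops->get_rxfh(&xfrm);

		if (rc)
			return rc;
		if ((xfrm & DEMO_XFRM_SYM_XOR) && !demo_hash_is_symmetric(flow_hash))
			return -EINVAL;
	}

	return ops->set_rxnfc(flow_hash);
}
```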

Fixes: 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Ravi Gunasekaran <r-gunasekaran@ti.com>
Link: https://lore.kernel.org/r/20231226205536.32003-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago Merge branch 'bug-fixes-for-rss-symmetric-xor'
Jakub Kicinski [Wed, 3 Jan 2024 00:00:08 +0000 (16:00 -0800)]
Merge branch 'bug-fixes-for-rss-symmetric-xor'

Ahmed Zaki says:

====================
Bug fixes for RSS symmetric-xor

A couple of fixes for the symmetric-xor recently merged in net-next [1].

The first patch copies the xfrm value back to user-space when ethtool is
built with --disable-netlink. The second allows ethtool to change other
RSS attributes while not changing the xfrm values.

Link: https://lore.kernel.org/netdev/20231213003321.605376-1-ahmed.zaki@intel.com/
====================

Link: https://lore.kernel.org/r/20231221184235.9192-1-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago net: ethtool: add a NO_CHANGE uAPI for new RXFH's input_xfrm
Ahmed Zaki [Thu, 21 Dec 2023 18:42:35 +0000 (11:42 -0700)]
net: ethtool: add a NO_CHANGE uAPI for new RXFH's input_xfrm

Add a NO_CHANGE uAPI value for the new RXFH/RSS input_xfrm uAPI field.
This is needed so that user-space can set other RSS values (hkey or indir
table) without affecting input_xfrm.
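
A sentinel of this kind typically works as sketched below. The demo_*
names, the 0xff value and the 4-byte key are assumptions chosen for the
example, not the uAPI definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_XFRM_SYM_XOR   0x01u
#define DEMO_XFRM_NO_CHANGE 0xffu /* sentinel: keep the current value */

struct demo_rss {
	uint8_t input_xfrm;
	uint8_t hkey[4];
};

/* hkey == NULL means "leave the key alone", mirroring the sentinel idea. */
void demo_set_rxfh(struct demo_rss *rss, const uint8_t *hkey, uint8_t input_xfrm)
{
	assert(rss != NULL);

	if (hkey)
		memcpy(rss->hkey, hkey, sizeof(rss->hkey));

	if (input_xfrm != DEMO_XFRM_NO_CHANGE)
		rss->input_xfrm = input_xfrm;
}
```

Without a dedicated sentinel, a caller that only wants to update the
key would have to pass some input_xfrm value, and 0 already means
"no transformation".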

Should have been part of [1].

Link: https://lore.kernel.org/netdev/20231213003321.605376-1-ahmed.zaki@intel.com/
Fixes: 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://lore.kernel.org/r/20231221184235.9192-3-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago net: ethtool: copy input_xfrm to user-space in ethtool_get_rxfh
Ahmed Zaki [Thu, 21 Dec 2023 18:42:34 +0000 (11:42 -0700)]
net: ethtool: copy input_xfrm to user-space in ethtool_get_rxfh

The ioctl path of ethtool's get channels is missing the final step of
copying the new input_xfrm field to user-space. This should have been
part of [1].

Link: https://lore.kernel.org/netdev/20231213003321.605376-1-ahmed.zaki@intel.com/
Fixes: 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://lore.kernel.org/r/20231221184235.9192-2-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago xsk: make struct xsk_cb_desc available outside CONFIG_XDP_SOCKETS
Vladimir Oltean [Tue, 19 Dec 2023 11:02:05 +0000 (13:02 +0200)]
xsk: make struct xsk_cb_desc available outside CONFIG_XDP_SOCKETS

The ice driver fails to build when CONFIG_XDP_SOCKETS is disabled.

drivers/net/ethernet/intel/ice/ice_base.c:533:21: error:
variable has incomplete type 'struct xsk_cb_desc'
        struct xsk_cb_desc desc = {};
                           ^
include/net/xsk_buff_pool.h:15:8: note:
forward declaration of 'struct xsk_cb_desc'
struct xsk_cb_desc;
       ^
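
The error arises because a forward declaration allows pointers to a
struct but not variables of that type. A minimal sketch of the usual
remedy follows, using hypothetical demo_* names and a made-up
DEMO_CONFIG_XSK switch: the full definition stays visible whatever the
config is, and only the implementation is stubbed out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CONFIG_XSK 0 /* feature compiled out in this sketch */

/* Defined unconditionally so callers can declare variables of this type. */
struct demo_cb_desc {
	const void *src;
	unsigned char off;
	unsigned char bytes;
};

#if DEMO_CONFIG_XSK
void demo_fill_cb(void *pool, const struct demo_cb_desc *desc);
#else
/* Stub used when the feature is disabled. */
static inline void demo_fill_cb(void *pool, const struct demo_cb_desc *desc)
{
	(void)pool;
	(void)desc;
}
#endif

size_t demo_caller(void)
{
	struct demo_cb_desc desc = { 0 }; /* requires the complete type */

	demo_fill_cb(NULL, &desc);
	return sizeof(desc);
}
```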

Fixes: d68d707dcbbf ("ice: Support XDP hints in AF_XDP ZC mode")
Closes: https://lore.kernel.org/netdev/8b76dad3-8847-475b-aa17-613c9c978f7a@infradead.org/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/20231219110205.1289506-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago Revert "net: mdio: get/put device node during (un)registration"
Jakub Kicinski [Tue, 2 Jan 2024 22:23:34 +0000 (14:23 -0800)]
Revert "net: mdio: get/put device node during (un)registration"

This reverts commit cff9c565e65f3622e8dc1dcc21c1520a083dff35.

Revert based on feedback from Russell.

Link: https://lore.kernel.org/all/ZZPtUIRerqTI2%2Fyh@shell.armlinux.org.uk/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago Merge branch 'renesas-rzg3s-add-support-for-ethernet'
David S. Miller [Tue, 2 Jan 2024 14:25:51 +0000 (14:25 +0000)]
Merge branch 'renesas-rzg3s-add-support-for-ethernet'

Claudiu Beznea says:

====================
renesas: rzg3s: Add support for Ethernet

This series adds Ethernet support for Renesas RZ/G3S.
Preparatory cleanups and fixes are included along with it.
====================

Link: https://lore.kernel.org/r/20231207070700.4156557-1-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago dt-bindings: net: renesas,etheravb: Document RZ/G3S support
Claudiu Beznea [Thu, 7 Dec 2023 07:06:57 +0000 (09:06 +0200)]
dt-bindings: net: renesas,etheravb: Document RZ/G3S support

Document Ethernet RZ/G3S support. Ethernet IP is similar to the one
available on RZ/G2L devices.

Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
9 months ago Merge branch 'remove-retired-tc-uapi'
David S. Miller [Tue, 2 Jan 2024 14:25:51 +0000 (14:25 +0000)]
Merge branch 'remove-retired-tc-uapi'

Jamal Hadi Salim says:

====================
net/sched: Remove UAPI support for retired TC qdiscs and classifiers

Classifiers RSVP and tcindex as well as qdiscs dsmark, CBQ and ATM have already
been deleted. This patchset removes their UAPI support.

User space - with a focus on iproute2 - typically copies these UAPI headers for
different kernels.
These deletion patches are coordinated with the iproute2 maintainers to make
sure that they delete any user space code referencing removed objects at their
leisure.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net/sched: Remove uapi support for CBQ qdisc
Jamal Hadi Salim [Sat, 23 Dec 2023 14:01:54 +0000 (09:01 -0500)]
net/sched: Remove uapi support for CBQ qdisc

Commit 051d44209842 ("net/sched: Retire CBQ qdisc") retired the CBQ qdisc.
Remove UAPI for it. Iproute2 will sync by equally removing it from user space.

Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net/sched: Remove uapi support for ATM qdisc
Jamal Hadi Salim [Sat, 23 Dec 2023 14:01:53 +0000 (09:01 -0500)]
net/sched: Remove uapi support for ATM qdisc

Commit fb38306ceb9e ("net/sched: Retire ATM qdisc") retired the ATM qdisc.
Remove UAPI for it. Iproute2 will sync by equally removing it from user space.

Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net/sched: Remove uapi support for dsmark qdisc
Jamal Hadi Salim [Sat, 23 Dec 2023 14:01:52 +0000 (09:01 -0500)]
net/sched: Remove uapi support for dsmark qdisc

Commit bbe77c14ee61 ("net/sched: Retire dsmark qdisc") retired the dsmark
qdisc. Remove UAPI support for it.
Iproute2 will sync by equally removing it from user space.

Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net/sched: Remove uapi support for tcindex classifier
Jamal Hadi Salim [Sat, 23 Dec 2023 14:01:51 +0000 (09:01 -0500)]
net/sched: Remove uapi support for tcindex classifier

Commit 8c710f75256b ("net/sched: Retire tcindex classifier") retired the TC
tcindex classifier.
Remove UAPI for it. Iproute2 will sync by equally removing it from user space.

Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net/sched: Remove uapi support for rsvp classifier
Jamal Hadi Salim [Sat, 23 Dec 2023 14:01:50 +0000 (09:01 -0500)]
net/sched: Remove uapi support for rsvp classifier

Commit 265b4da82dbf ("net/sched: Retire rsvp classifier") retired the TC RSVP
classifier.
Remove UAPI for it. Iproute2 will sync by equally removing it from user space.

Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago Merge branch 'octeon_ep_vf-driver'
David S. Miller [Tue, 2 Jan 2024 14:19:54 +0000 (14:19 +0000)]
Merge branch 'octeon_ep_vf-driver'

Shinas Rasheed says:

====================
add octeon_ep_vf driver

This driver implements networking functionality of Marvell's Octeon
PCI Endpoint NIC VF.

This driver supports the following devices:
 * Network controller: Cavium, Inc. Device b203
 * Network controller: Cavium, Inc. Device b403
 * Network controller: Cavium, Inc. Device b103
 * Network controller: Cavium, Inc. Device b903
 * Network controller: Cavium, Inc. Device ba03
 * Network controller: Cavium, Inc. Device bc03
 * Network controller: Cavium, Inc. Device bd03

Changes:
V2:
  - Removed linux/version.h header file from inclusion in
    octep_vf_main.c
  - Corrected Makefile entry to include building octep_vf_mbox.c in
    [6/8] patch.
  - Removed redundant vzalloc pointer cast and vfree pointer check in
    [6/8] patch.

V1: https://lore.kernel.org/all/20231221092844.2885872-1-srasheed@marvell.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago octeon_ep_vf: update MAINTAINERS
Shinas Rasheed [Sat, 23 Dec 2023 13:40:00 +0000 (05:40 -0800)]
octeon_ep_vf: update MAINTAINERS

Add a MAINTAINERS entry for the octeon_ep_vf driver.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago octeon_ep_vf: add ethtool support
Shinas Rasheed [Sat, 23 Dec 2023 13:39:59 +0000 (05:39 -0800)]
octeon_ep_vf: add ethtool support

Add support for the following ethtool commands:

ethtool -i|--driver devname
ethtool devname
ethtool -S|--statistics devname

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago octeon_ep_vf: add Tx/Rx processing and interrupt support
Shinas Rasheed [Sat, 23 Dec 2023 13:39:58 +0000 (05:39 -0800)]
octeon_ep_vf: add Tx/Rx processing and interrupt support

Add support to enable MSI-X and register interrupts.
Add support to process Tx and Rx traffic. Includes processing
Tx completions and Rx refill.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago octeon_ep_vf: add support for ndo ops
Shinas Rasheed [Sat, 23 Dec 2023 13:39:57 +0000 (05:39 -0800)]
octeon_ep_vf: add support for ndo ops

Add support for ndo ops to set MAC address, change MTU, get stats.
Add control path support to set MAC address, change MTU, get stats,
set speed, get and set link mode.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago octeon_ep_vf: add Tx/Rx ring resource setup and cleanup
Shinas Rasheed [Sat, 23 Dec 2023 13:39:56 +0000 (05:39 -0800)]
octeon_ep_vf: add Tx/Rx ring resource setup and cleanup

Implement Tx/Rx ring resource allocation and cleanup.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago octeon_ep_vf: add VF-PF mailbox communication.
Shinas Rasheed [Sat, 23 Dec 2023 13:39:55 +0000 (05:39 -0800)]
octeon_ep_vf: add VF-PF mailbox communication.

Implement VF-PF mailbox to send all control commands from VF to PF
and receive responses and notifications from PF to VF.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago octeon_ep_vf: add hardware configuration APIs
Shinas Rasheed [Sat, 23 Dec 2023 13:39:54 +0000 (05:39 -0800)]
octeon_ep_vf: add hardware configuration APIs

Implement hardware resource init and shutdown helper APIs, like
hardware Tx/Rx queue init/enable/disable/reset.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago octeon_ep_vf: Add driver framework and device initialization
Shinas Rasheed [Sat, 23 Dec 2023 13:39:53 +0000 (05:39 -0800)]
octeon_ep_vf: Add driver framework and device initialization

Add driver framework and device setup and initialization for Octeon
PCI Endpoint NIC VF.

Add implementation to load module, initialize, register network device,
cleanup and unload module.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net/ps3_gelic_net: Add gelic_descr structures
Geoff Levand [Sat, 23 Dec 2023 07:28:20 +0000 (16:28 +0900)]
net/ps3_gelic_net: Add gelic_descr structures

In an effort to make the PS3 gelic driver easier to maintain, create two
new structures, struct gelic_hw_regs and struct gelic_chain_link, and
replace the corresponding members of struct gelic_descr with the new
structures.

The new struct gelic_hw_regs holds the register variables used by the
gelic hardware device.  The new struct gelic_chain_link holds variables
used to manage the driver's linked list of gelic descr structures.

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago Merge branch 'bnxt_en-ntuple-fuilter-support'
David S. Miller [Tue, 2 Jan 2024 13:52:28 +0000 (13:52 +0000)]
Merge branch 'bnxt_en-ntuple-fuilter-support'

Michael Chan says:

====================
bnxt_en: Add basic ntuple filter support

The current driver only supports ntuple filters added by aRFS.  This
patch series adds basic support for user defined TCP/UDP ntuple filters
added by the user using ethtool.  Many of the patches are refactoring
patches to make the existing code more general to support both aRFS
and user defined filters.  aRFS filters always have the Toeplitz hash
value from the NIC.  A Toeplitz hash function is added in patch 5 to
get the same hash value for user defined filters.  The hash is used
to store all ntuple filters in the table and all filters must be
hashed identically using the same function and key.

v2: Fix compile error in patch #4 when CONFIG_BNXT_SRIOV is disabled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago bnxt_en: Add support for ntuple filter deletion by ethtool.
Michael Chan [Sat, 23 Dec 2023 04:22:10 +0000 (20:22 -0800)]
bnxt_en: Add support for ntuple filter deletion by ethtool.

Add logic to delete a user specified ntuple filter from ethtool.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago bnxt_en: Add support for ntuple filters added from ethtool.
Michael Chan [Sat, 23 Dec 2023 04:22:09 +0000 (20:22 -0800)]
bnxt_en: Add support for ntuple filters added from ethtool.

Add support for adding user defined ntuple TCP/UDP filters.  These
filters are similar to aRFS filters except that they don't get aged.
Source IP, destination IP, source port, or destination port can be
left unspecified as a wildcard.  At least one of these tuples must be
specified.  If a tuple is specified, the full mask must be specified.
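
The mask rules in the paragraph above amount to a small validation
step. This is a sketch with a hypothetical struct demo_ntuple_spec; the
driver's real ethtool structures are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-field masks: 0 = wildcard, all ones = exact match. */
struct demo_ntuple_spec {
	uint32_t src_ip_mask;
	uint32_t dst_ip_mask;
	uint16_t src_port_mask;
	uint16_t dst_port_mask;
};

static bool demo_mask_ok32(uint32_t m)
{
	return m == 0 || m == UINT32_MAX;
}

static bool demo_mask_ok16(uint16_t m)
{
	return m == 0 || m == UINT16_MAX;
}

bool demo_ntuple_valid(const struct demo_ntuple_spec *s)
{
	assert(s != NULL);

	/* A specified field must carry the full mask. */
	if (!demo_mask_ok32(s->src_ip_mask) || !demo_mask_ok32(s->dst_ip_mask) ||
	    !demo_mask_ok16(s->src_port_mask) || !demo_mask_ok16(s->dst_port_mask))
		return false;

	/* At least one field must be specified. */
	return s->src_ip_mask || s->dst_ip_mask ||
	       s->src_port_mask || s->dst_port_mask;
}
```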

All ntuple related ethtool functions are now no longer compiled only
for CONFIG_RFS_ACCEL.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago bnxt_en: Add ntuple matching flags to the bnxt_ntuple_filter structure.
Michael Chan [Sat, 23 Dec 2023 04:22:08 +0000 (20:22 -0800)]
bnxt_en: Add ntuple matching flags to the bnxt_ntuple_filter structure.

aRFS filters match all 5 tuples.  User defined ntuple filters may
specify some of the tuples as wildcards.  To support that, we add the
ntuple_flags to the bnxt_ntuple_filter struct to specify which tuple
fields are to be matched.  The matching tuple fields will then be
passed to the firmware in bnxt_hwrm_cfa_ntuple_filter_alloc() to create
the proper filter.
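
A flags word of this kind can drive matching as sketched below. The
demo_* names and bit values are hypothetical, and the real filter also
covers the protocol and is evaluated by firmware rather than in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NTUPLE_MATCH_SRC_IP   (1u << 0)
#define DEMO_NTUPLE_MATCH_DST_IP   (1u << 1)
#define DEMO_NTUPLE_MATCH_SRC_PORT (1u << 2)
#define DEMO_NTUPLE_MATCH_DST_PORT (1u << 3)

struct demo_tuple {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

struct demo_ntuple_filter {
	struct demo_tuple key;
	unsigned int ntuple_flags; /* which key fields take part in matching */
};

bool demo_filter_matches(const struct demo_ntuple_filter *f,
			 const struct demo_tuple *pkt)
{
	unsigned int fl;

	assert(f != NULL && pkt != NULL);
	fl = f->ntuple_flags;

	/* Fields whose flag is clear are wildcards and never compared. */
	if ((fl & DEMO_NTUPLE_MATCH_SRC_IP) && f->key.src_ip != pkt->src_ip)
		return false;
	if ((fl & DEMO_NTUPLE_MATCH_DST_IP) && f->key.dst_ip != pkt->dst_ip)
		return false;
	if ((fl & DEMO_NTUPLE_MATCH_SRC_PORT) && f->key.src_port != pkt->src_port)
		return false;
	if ((fl & DEMO_NTUPLE_MATCH_DST_PORT) && f->key.dst_port != pkt->dst_port)
		return false;

	return true;
}
```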

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago bnxt_en: Refactor ntuple filter removal logic in bnxt_cfg_ntp_filters().
Michael Chan [Sat, 23 Dec 2023 04:22:07 +0000 (20:22 -0800)]
bnxt_en: Refactor ntuple filter removal logic in bnxt_cfg_ntp_filters().

Refactor the logic into a new function bnxt_del_ntp_filters().  The
same call will be used when the user deletes an ntuple filter.

The bnxt_hwrm_cfa_ntuple_filter_free() function to call fw to free
the ntuple filter is exported so that the ethtool logic can call it.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago bnxt_en: Refactor the hash table logic for ntuple filters.
Michael Chan [Sat, 23 Dec 2023 04:22:06 +0000 (20:22 -0800)]
bnxt_en: Refactor the hash table logic for ntuple filters.

Generalize the ethtool logic that walks the ntuple hash table now that
we have the common bnxt_filter_base structure.  This will allow the code
to easily extend to cover user defined ntuple or ether filters.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago bnxt_en: Refactor filter insertion logic in bnxt_rx_flow_steer().
Michael Chan [Sat, 23 Dec 2023 04:22:05 +0000 (20:22 -0800)]
bnxt_en: Refactor filter insertion logic in bnxt_rx_flow_steer().

Add a new function bnxt_insert_ntp_filter() to insert the ntuple filter
into the hash table and other basic setup.  We'll use this function
to insert a user defined filter from ethtool.

Also, export bnxt_lookup_ntp_filter_from_idx() and bnxt_get_ntp_filter_idx()
for similar purposes.  All ntuple related functions are now no longer
compiled only for CONFIG_RFS_ACCEL.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago bnxt_en: Add new BNXT_FLTR_INSERTED flag to bnxt_filter_base struct.
Michael Chan [Sat, 23 Dec 2023 04:22:04 +0000 (20:22 -0800)]
bnxt_en: Add new BNXT_FLTR_INSERTED flag to bnxt_filter_base struct.

Change the unused flag to BNXT_FLTR_INSERTED.  To prepare for multiple
pathways that an ntuple filter can be deleted, we add this flag.  These
filter structures can be retreived from the RCU hash table but only
the caller that sees that the BNXT_FLTR_INSERTED flag is set can delete
the filter structure and clear the flag under spinlock.
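
The claim-before-delete rule described above can be sketched in userspace
C as follows. This is only an illustration under stated assumptions: the
names FLTR_INSERTED_SKETCH, filter_base_sketch and
filter_claim_for_delete() are hypothetical, and a toy atomic_flag lock
stands in for the kernel spinlock the driver would use.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the BNXT_FLTR_INSERTED flag bit. */
#define FLTR_INSERTED_SKETCH 0x1UL

struct filter_base_sketch {
	unsigned long flags;
};

/* Toy spinlock built on atomic_flag; not the kernel spinlock API. */
static atomic_flag filter_lock = ATOMIC_FLAG_INIT;

static void filter_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&filter_lock,
						 memory_order_acquire))
		; /* spin until the current holder releases the lock */
}

static void filter_lock_release(void)
{
	atomic_flag_clear_explicit(&filter_lock, memory_order_release);
}

/*
 * Test and clear the flag under the lock. Only the one caller that
 * observes the flag set gets 'true' and may go on to delete the filter.
 */
static bool filter_claim_for_delete(struct filter_base_sketch *f)
{
	bool claimed = false;

	filter_lock_acquire();
	if (f->flags & FLTR_INSERTED_SKETCH) {
		f->flags &= ~FLTR_INSERTED_SKETCH;
		claimed = true;
	}
	filter_lock_release();
	return claimed;
}
```

With several deletion pathways racing on the same filter, each would call
filter_claim_for_delete() first and free the structure only when it
returns true, so a second caller sees the flag already cleared and backs
off.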

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agobnxt_en: Add bnxt_lookup_ntp_filter_from_idx() function
Michael Chan [Sat, 23 Dec 2023 04:22:03 +0000 (20:22 -0800)]
bnxt_en: Add bnxt_lookup_ntp_filter_from_idx() function

Add the helper function to look up the ntuple filter from the
hash index and use it in bnxt_rx_flow_steer().  The helper function
will also be used by user defined ntuple filters in the next
patches.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agobnxt_en: Add function to calculate Toeplitz hash
Pavan Chebbi [Sat, 23 Dec 2023 04:22:02 +0000 (20:22 -0800)]
bnxt_en: Add function to calculate Toeplitz hash

For ntuple filters added by aRFS, the Toeplitz hash calculated by our
NIC is available and is used to store the ntuple filter for quick
retrieval.  In the next patches, user defined ntuple filter support
will be added and we need to calculate the same hash for these
filters.  The same hash function needs to be used so we can detect
duplicates.

Add the function bnxt_toeplitz() to calculate the Toeplitz hash for
user defined ntuple filters.  bnxt_toeplitz() uses the same Toeplitz
key and the same key length as the NIC.

bnxt_get_ntp_filter_idx() is added to return the hash index.  For
aRFS, the hash comes from the NIC.  For user defined ntuple, we call
bnxt_toeplitz() to calculate the hash index.
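
For readers unfamiliar with the Toeplitz hash, a minimal userspace sketch
of the generic algorithm follows. It is not the driver's bnxt_toeplitz();
the function name toeplitz_hash_sketch() and the choice to return 0 for a
too-short key are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Generic Toeplitz hash over data[0..data_len). The key must provide at
 * least data_len + 4 bytes; this sketch returns 0 when it does not.
 */
static uint32_t toeplitz_hash_sketch(const uint8_t *key, size_t key_len,
				     const uint8_t *data, size_t data_len)
{
	uint32_t window, hash = 0;
	size_t i;
	int bit;

	if (key_len < data_len + 4)
		return 0;

	/* Start with the first 32 key bits as the sliding window. */
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < data_len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			/* XOR in the window for every set input bit. */
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the window left by one key bit. */
			window = (window << 1) |
				 ((key[i + 4] >> bit) & 1u);
		}
	}
	return hash;
}
```

Because the result is an XOR of key windows selected by the input bits,
two filters with identical tuples always hash to the same value when the
same key is used, which is what makes duplicate detection possible.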

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agobnxt_en: Refactor L2 filter alloc/free firmware commands.
Michael Chan [Sat, 23 Dec 2023 04:22:01 +0000 (20:22 -0800)]
bnxt_en: Refactor L2 filter alloc/free firmware commands.

Refactor the L2 filter alloc/free logic so that these filters can be
added/deleted by the user.

The bp->ntp_fltr_bmap allocated size is also increased to allow enough
IDs for L2 filters.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agobnxt_en: Re-structure the bnxt_ntuple_filter structure.
Michael Chan [Sat, 23 Dec 2023 04:22:00 +0000 (20:22 -0800)]
bnxt_en: Re-structure the bnxt_ntuple_filter structure.

With the new bnxt_l2_filter structure, we can now re-structure the
bnxt_ntuple_filter structure to point to the bnxt_l2_filter structure.
We eliminate the L2 ether address info from the ntuple filter structure
as we can get the information from the L2 filter structure.  Note that
the source L2 MAC address is no longer used.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agobnxt_en: Add bnxt_l2_filter hash table.
Michael Chan [Sat, 23 Dec 2023 04:21:59 +0000 (20:21 -0800)]
bnxt_en: Add bnxt_l2_filter hash table.

The current driver only has an array of 4 additional L2 unicast
addresses to support the netdev uc address list.  Generalize and
expand this infrastructure with an L2 address hash table so we can
support an expanded list of unicast addresses (for bridges,
macvlans, OVS, etc).  The L2 hash table infrastructure will also
allow more generalized n-tuple filter support.

This patch creates the bnxt_l2_filter structure and the hash table.
This L2 filter structure has the same bnxt_filter_base structure
as used in the bnxt_ntuple_filter structure.

All currently supported L2 filters will now have an entry in this
new table.

Note that L2 filters may be created for the VF.  VF filters should
not be freed when the PF goes down.  Add some logic in
bnxt_free_l2_filters() to allow keeping the VF filters or to free
everything during rmmod.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agobnxt_en: Refactor bnxt_ntuple_filter structure.
Michael Chan [Sat, 23 Dec 2023 04:21:58 +0000 (20:21 -0800)]
bnxt_en: Refactor bnxt_ntuple_filter structure.

This is in preparation to support user defined L2 (ether) filters,
which will have many similarities with ntuple filters.  Refactor
bnxt_ntuple_filter structure to have a bnxt_filter_base structure
that can be re-used by the L2 filters.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoMerge tag 'for-net-next-2023-12-22' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Tue, 2 Jan 2024 13:43:23 +0000 (13:43 +0000)]
Merge tag 'for-net-next-2023-12-22' of git://git./linux/kernel/git/bluetooth/bluetooth-next

Luiz Augusto von Dentz says:

====================
bluetooth-next pull request for net-next:

 - btnxpuart: Fix recv_buf return value
 - L2CAP: Fix responding with multiple rejects
 - Fix atomicity violation in {min,max}_key_size_set
 - ISO: Allow binding a PA sync socket
 - ISO: Reassociate a socket with an active BIS
 - ISO: Avoid creating child socket if PA sync is terminating
 - Add device 13d3:3572 IMC Networks Bluetooth Radio
 - Don't suspend when there are connections
 - Remove le_restart_scan work
 - Fix bogus check for re-auth not supported with non-ssp
 - lib: Add documentation to exported functions
 - Support HFP offload for QCA2066
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoDocumentation: add pyyaml to requirements.txt
Vegard Nossum [Fri, 22 Dec 2023 13:36:28 +0000 (14:36 +0100)]
Documentation: add pyyaml to requirements.txt

Commit f061c9f7d058 ("Documentation: Document each netlink family") added
a new Python script that is invoked during 'make htmldocs' and which reads
the netlink YAML spec files.

Using the virtualenv from scripts/sphinx-pre-install, we get this new
error when running 'make htmldocs':

  Traceback (most recent call last):
    File "./tools/net/ynl/ynl-gen-rst.py", line 26, in <module>
      import yaml
  ModuleNotFoundError: No module named 'yaml'
  make[2]: *** [Documentation/Makefile:112: Documentation/networking/netlink_spec/rt_link.rst] Error 1
  make[1]: *** [Makefile:1708: htmldocs] Error 2

Fix this by adding 'pyyaml' to requirements.txt.

Note: This was somehow present in the original patch submission:
<https://lore.kernel.org/all/20231103135622.250314-1-leitao@debian.org/>
I'm not sure why the pyyaml requirement disappeared in the meantime.

Fixes: f061c9f7d058 ("Documentation: Document each netlink family")
Cc: Breno Leitao <leitao@debian.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoMerge branch 'mptcp-mib-counters'
David S. Miller [Tue, 2 Jan 2024 13:33:58 +0000 (13:33 +0000)]
Merge branch 'mptcp-mib-counters'

Matthieu Baerts says:

====================
mptcp: add CurrEstab MIB counter

This MIB counter is similar to the one of TCP -- CurrEstab -- available
in /proc/net/snmp. This is useful to quickly list the number of MPTCP
connections without having to iterate over all of them.

Patch 1 prepares its support by adding new helper functions:

 - MPTCP_DEC_STATS(): similar to MPTCP_INC_STATS(), but this time to
   decrement a counter.

 - mptcp_set_state(): similar to tcp_set_state(), to change the state of
   an MPTCP socket, and to inc/decrement the new counter when needed.

Patch 2 uses mptcp_set_state() instead of directly calling
inet_sk_state_store() to change the state of MPTCP sockets.

Patch 3 and 4 validate the new feature in MPTCP "join" and "diag"
selftests.
====================

Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoselftests: mptcp: diag: check CURRESTAB counters
Geliang Tang [Fri, 22 Dec 2023 12:47:25 +0000 (13:47 +0100)]
selftests: mptcp: diag: check CURRESTAB counters

This patch adds a new helper chk_msk_cestab() to check the current
established connections counter MIB_CURRESTAB in diag.sh. Invoke it
to check the counter during the connection after every chk_msk_inuse().

Signed-off-by: Geliang Tang <geliang.tang@linux.dev>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoselftests: mptcp: join: check CURRESTAB counters
Geliang Tang [Fri, 22 Dec 2023 12:47:24 +0000 (13:47 +0100)]
selftests: mptcp: join: check CURRESTAB counters

This patch adds a new helper chk_cestab_nr() to check the current
established connections counter MIB_CURRESTAB. Set the newly added
variables cestab_ns1 and cestab_ns2 to indicate how many connections
are expected in ns1 or ns2.

Invoke check_cestab() to check the counter during the connection in
do_transfer() and invoke chk_cestab_nr() to re-check it when the
connection is closed. These checks are embedded in add_tests().

Signed-off-by: Geliang Tang <geliang.tang@linux.dev>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agomptcp: use mptcp_set_state
Geliang Tang [Fri, 22 Dec 2023 12:47:23 +0000 (13:47 +0100)]
mptcp: use mptcp_set_state

This patch replaces all the 'inet_sk_state_store()' calls under net/mptcp
with the new helper mptcp_set_state().

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/460
Signed-off-by: Geliang Tang <geliang.tang@linux.dev>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agomptcp: add CurrEstab MIB counter support
Geliang Tang [Fri, 22 Dec 2023 12:47:22 +0000 (13:47 +0100)]
mptcp: add CurrEstab MIB counter support

Add a new MIB counter named MPTCP_MIB_CURRESTAB to count current
established MPTCP connections, similar to TCP_MIB_CURRESTAB. This is
useful to quickly list the number of MPTCP connections without having to
iterate over all of them.

This patch adds a new helper function mptcp_set_state(): if the state
switches to ESTABLISHED, this newly added counter is incremented, and if
it switches away from ESTABLISHED, the counter is decremented. This
helper is going to be used in the following patch.

Similar to MPTCP_INC_STATS(), a new helper called MPTCP_DEC_STATS() is
also needed to decrement a MIB counter.
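
The state-change accounting can be sketched as below. This is a
simplified userspace illustration, not the kernel helper: the enum, the
struct and the plain global counter (in place of per-netns MIB
statistics) are all hypothetical.

```c
/* Hypothetical, reduced set of socket states. */
enum msk_state_sketch {
	MSK_CLOSE,
	MSK_SYN_SENT,
	MSK_ESTABLISHED,
	MSK_FIN_WAIT1,
};

struct msk_sketch {
	enum msk_state_sketch state;
};

/* Stand-in for the CurrEstab MIB counter. */
static long curr_estab_sketch;

/*
 * Change the socket state and keep the "currently established" counter
 * in sync: +1 when entering ESTABLISHED, -1 when leaving it.
 */
static void msk_set_state_sketch(struct msk_sketch *msk,
				 enum msk_state_sketch next)
{
	enum msk_state_sketch old = msk->state;

	if (next == MSK_ESTABLISHED) {
		if (old != MSK_ESTABLISHED)
			curr_estab_sketch++;
	} else if (old == MSK_ESTABLISHED) {
		curr_estab_sketch--;
	}
	msk->state = next;
}
```

Routing every state change through one helper is what keeps the counter
accurate; a direct store to the state field would bypass the accounting.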

Signed-off-by: Geliang Tang <geliang.tang@linux.dev>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoMerge branch 'selftests-tcp-ao'
David S. Miller [Tue, 2 Jan 2024 13:27:48 +0000 (13:27 +0000)]
Merge branch 'selftests-tcp-ao'

Dmitry Safonov says:

====================
selftest/net: Some more TCP-AO selftest post-merge fixups

Note that there's another post-merge fix for TCP-AO selftests, but that
doesn't conflict with these, so I don't resend that:

https://lore.kernel.org/all/20231219-b4-tcp-ao-selftests-out-of-tree-v1-1-0fff92d26eac@arista.com/T/#u
====================

Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
9 months agoselftest/tcp-ao: Work on namespace-ified sysctl_optmem_max
Dmitry Safonov [Fri, 22 Dec 2023 01:59:07 +0000 (01:59 +0000)]
selftest/tcp-ao: Work on namespace-ified sysctl_optmem_max

Since commit f5769faeec36 ("net: Namespace-ify sysctl_optmem_max")
optmem_max is per-netns, so there is no need to switch to the root
namespace. It seems trivial to keep the old logic working, so keep it for
a while (at least until a kernel with netns optmem_max is released).

Currently, there is a test that checks that the optmem_max limit applies
to TCP-AO keys and a small benchmark that measures how linked-list TCP-AO
keys scale; both are fixed by this change.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoselftest/tcp-ao: Set routes in a proper VRF table id
Dmitry Safonov [Fri, 22 Dec 2023 01:59:06 +0000 (01:59 +0000)]
selftest/tcp-ao: Set routes in a proper VRF table id

In unsigned-md5 selftests ip_route_add() is not needed in
client_add_ip(): the route was pre-setup in __test_init() => link_init()
for subnet, rather than a specific ip-address.

Currently, __ip_route_add() mistakenly always sets the VRF table
to RT_TABLE_MAIN; this seems to have sneaked in while debugging the
unsigned-md5 tests. That also explains why ip_route_add_vrf() ignored
the EEXIST returned by fib6.

Yet, keep EEXIST ignoring in bench-lookups selftests as it's expected
that those selftests may add the same (duplicate) routes.

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoMerge tag 'wireless-next-2023-12-22' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Tue, 2 Jan 2024 12:46:10 +0000 (12:46 +0000)]
Merge tag 'wireless-next-2023-12-22' of git://git./linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.8

The third "new features" pull request for v6.8. This is a smaller one
to clear up our tree before the break and nothing really noteworthy
this time.

Major changes:

stack

* cfg80211: introduce cfg80211_ssid_eq() for SSID matching

* cfg80211: support P2P operation on DFS channels

* mac80211: allow 64-bit radiotap timestamps

iwlwifi

* AX210: allow concurrent P2P operation on DFS channels
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoMerge branch 'net-tc-ipt-retire'
David S. Miller [Tue, 2 Jan 2024 12:41:16 +0000 (12:41 +0000)]
Merge branch 'net-tc-ipt-retire'

Jamal Hadi Salim says:

====================
net/sched: retire tc ipt action

In keeping up with my status as a hero who removes code: another one bites the
dust.
The tc ipt action was intended to run all netfilter/iptables targets.
Unfortunately it has not benefitted over the years from proper updates when
netfilter changes, and for that reason it has remained rudimentary.
Pinging a bunch of people that I was aware were using this indicates that
removing it won't affect them.
Retire it to reduce maintenance efforts.
So Long, ipt, and Thanks for all the Fish.
====================

Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet/sched: Remove CONFIG_NET_ACT_IPT from default configs
Jamal Hadi Salim [Thu, 21 Dec 2023 21:31:04 +0000 (16:31 -0500)]
net/sched: Remove CONFIG_NET_ACT_IPT from default configs

Now that we are retiring the IPT action.

Reviewed-by: Victor Noguiera <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet/sched: Retire ipt action
Jamal Hadi Salim [Thu, 21 Dec 2023 21:31:03 +0000 (16:31 -0500)]
net/sched: Retire ipt action

The tc ipt action was intended to run all netfilter/iptables targets.
Unfortunately it has not benefitted over the years from proper updates when
netfilter changes, and for that reason it has remained rudimentary.
Pinging a bunch of people that I was aware were using this indicates that
removing it won't affect them.
Retire it to reduce maintenance efforts. Buh-bye.

Reviewed-by: Victor Noguiera <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet-device: move gso_partial_features to net_device_read_tx
Eric Dumazet [Thu, 21 Dec 2023 14:07:47 +0000 (14:07 +0000)]
net-device: move gso_partial_features to net_device_read_tx

dev->gso_partial_features is read from tx fast path for GSO packets.

Move it to appropriate section to avoid a cache line miss.
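
The effect of grouping fast-path fields can be illustrated with a small
layout check. This is a sketch only: netdev_sketch is a made-up structure
(not struct net_device), the selection of neighbouring fields is
illustrative, and a 64-byte cache line plus a cache-line-aligned
structure are assumed.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE_SKETCH 64 /* assumed cache line size in bytes */

/* Hypothetical device structure, assumed to start on a cache line. */
struct netdev_sketch {
	/* TX fast path, read-mostly: kept together at the front. */
	unsigned long long tx_features;
	unsigned int gso_max_size;
	unsigned int gso_partial_features;
	unsigned short gso_max_segs;
	/* Cold, control-path-only data follows. */
	char name[256];
	unsigned int cold_counter;
};

/* True when two member offsets fall in the same cache line. */
static bool same_cache_line(size_t off_a, size_t off_b)
{
	return off_a / CACHE_LINE_SKETCH == off_b / CACHE_LINE_SKETCH;
}
```

A check such as same_cache_line(offsetof(struct netdev_sketch,
tx_features), offsetof(struct netdev_sketch, gso_partial_features))
holds for this layout, whereas a field placed after the cold data, like
cold_counter, lands several lines away and would cost an extra miss when
read from the fast path.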

Fixes: 43a71cd66b9c ("net-device: reorganize net_device fast path variables")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Coco Li <lixiaoyan@google.com>
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoptp: ocp: Use DEFINE_RES_*() in place
Andy Shevchenko [Thu, 21 Dec 2023 14:06:07 +0000 (16:06 +0200)]
ptp: ocp: Use DEFINE_RES_*() in place

There is no need to have intermediate functions as DEFINE_RES_*()
macros are represented by compound literals. Just use them in place.
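
To illustrate why a compound-literal macro can be used in place, here is
a userspace sketch. The struct, flag value and macro below
(res_sketch, RES_FLAG_MEM_SKETCH, DEFINE_RES_MEM_SKETCH) are hypothetical
stand-ins and not the kernel's definitions.

```c
/* Hypothetical resource descriptor. */
struct res_sketch {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

#define RES_FLAG_MEM_SKETCH 0x200UL /* made-up flag value */

/* Expands to a compound literal, so it is an expression of struct type. */
#define DEFINE_RES_MEM_SKETCH(_start, _size)			\
	(struct res_sketch) {					\
		.start = (_start),				\
		.end = (_start) + (_size) - 1,			\
		.flags = RES_FLAG_MEM_SKETCH,			\
	}

struct dev_info_sketch {
	const char *name;
	struct res_sketch mem;
};

static unsigned long res_len(const struct res_sketch *r)
{
	return r->end - r->start + 1;
}

/* The macro is used directly in the initializer, with no helper function. */
static struct dev_info_sketch make_dev_info(unsigned long base)
{
	return (struct dev_info_sketch) {
		.name = "sketch-dev",
		.mem = DEFINE_RES_MEM_SKETCH(base, 0x100),
	};
}
```

Since the macro already yields a value of the structure type, wrapping it
in a function that only returns that value adds nothing.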

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoMerge branch 'phy-listing-link_topology-tracking'
David S. Miller [Mon, 1 Jan 2024 18:38:57 +0000 (18:38 +0000)]
Merge branch 'phy-listing-link_topology-tracking'

Maxime Chevallier says:

====================
Introduce PHY listing and link_topology tracking

Here's a V5 of the multi-PHY support series.

At a glance, besides some minor fixes and R'd-by from Andrew, one of the
things this series does is remove the ASSERT_RTNL() from the
topo_add_phy/del_phy operations.

These operations will take a PHY device and put it into the list of
devices associated to a netdevice. The main thing to protect here is the
list itself, but since we use xarrays, my naive understanding of it is
that it contains its own protection scheme. There shouldn't be a need
for more locking, as the insertion/deletion paths are already hooked
into the PHY connection to a netdev, or disconnection from it.

Now for the rest of the cover :

As a reminder, this ongoing work aims ultimately at supporting complex
link topologies that involve multiplexing multiple PHYs/SFPs on a single
netdevice. As a first step, it's required that we are able to enumerate the
PHYs on a given ethernet interface.

By just doing so, we also improve already-existing use-cases, namely the
copper SFP modules support when a media-converter is used (as we have 2
PHYs on the link, but only one is referenced by net_device.phydev, which
is used on a variety of netlink commands).

The series is structured as follows :

- The first patch adds the notion of phy_link_topology, which tracks
all PHYs attached to a netdevice.

- Patches 2, 3 and 4 add some plumbing into SFP and phylib to be able
  to connect the dots when building the topology tree, to know which PHY
  is connected to which SFP bus, trying not to be too invasive on phylib.

- Patch 5 allows passing a PHY_INDEX to ethnl commands. I'm uncertain about
  this, as there are at least 4 netlink commands (5 with the one introduced
  in patch 7) that target PHYs directly or indirectly, which to me makes
  it worth having a generic way to pass a PHY index to commands; however,
  the approach taken may be too generic.

- Patch 6 is the netlink spec update + ethtool-user.c|h autogenerated code
update (the autogenerated code triggers checkpatch warning though)

- Patch 7 introduces a new netlink command set to list PHYs on a netdevice.
It implements a custom DUMP and GET operation to allow filtered dumps,
that lists all PHYs on a given netdevice. I couldn't use most of ethnl's
plumbing though.

- Patch 8 is the netlink spec update + ethtool-user.c|h update for that
new command

- Patches 9, 10, 11 and 12 update the PLCA, pse, cable-test and strset netlink
commands to use the user-provided PHY instead of net_device.phydev.

- Finally, patch 13 adds some documentation for this whole work.

Examples
========

Here's a short overview of the kind of operations you can have regarding
the PHY topology. These tests were performed on a MacchiatoBin, which
has 3 interfaces :

eth0 and eth1 have the following layout:

MAC - PHY - SFP

eth2 has this more classic topology :

MAC - PHY - RJ45

finally eth3 has the following topology :

MAC - SFP

When performing a dump with all interfaces down, we don't get any
result, as no PHY has been attached to their respective net_device :

None

The following output is with eth0, eth2 and eth3 up, but no SFP module
inserted in any of the interfaces :

[{'downstream-sfp-name': 'sfp-eth0',
  'drvname': 'mv88x3310',
  'header': {'dev-index': 2, 'dev-name': 'eth0'},
  'id': 0,
  'index': 1,
  'name': 'f212a600.mdio-mii:00',
  'upstream-type': 'mac'},
 {'drvname': 'Marvell 88E1510',
  'header': {'dev-index': 4, 'dev-name': 'eth2'},
  'id': 21040593,
  'index': 1,
  'name': 'f212a200.mdio-mii:00',
  'upstream-type': 'mac'}]

And now is a dump operation with a copper SFP in the eth0 port :

[{'downstream-sfp-name': 'sfp-eth0',
  'drvname': 'mv88x3310',
  'header': {'dev-index': 2, 'dev-name': 'eth0'},
  'id': 0,
  'index': 1,
  'name': 'f212a600.mdio-mii:00',
  'upstream-type': 'mac'},
 {'drvname': 'Marvell 88E1111',
  'header': {'dev-index': 2, 'dev-name': 'eth0'},
  'id': 21040322,
  'index': 2,
  'name': 'i2c:sfp-eth0:16',
  'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'},
  'upstream-type': 'phy'},
 {'drvname': 'Marvell 88E1510',
  'header': {'dev-index': 4, 'dev-name': 'eth2'},
  'id': 21040593,
  'index': 1,
  'name': 'f212a200.mdio-mii:00',
  'upstream-type': 'mac'}]

 -- Note that this shouldn't actually work as the 88x3310 PHY doesn't allow
a 1G SFP to be connected to its SFP interface, and I don't have a 10G copper SFP,
so for the sake of the demo I applied the following modification, which
of course gives a non-functional link, but the PHY attach still works,
which is what I want to demonstrate :

@@ -488,7 +488,7 @@ static int mv3310_sfp_insert(void *upstream, const struct sfp_eeprom_id *id)

        if (iface != PHY_INTERFACE_MODE_10GBASER) {
                dev_err(&phydev->mdio.dev, "incompatible SFP module inserted\n");
-               return -EINVAL;
+               //return -EINVAL;
        }
        return 0;
 }

Finally an example of the filtered DUMP operation that Jakub suggested
in V1 :

[{'downstream-sfp-name': 'sfp-eth0',
  'drvname': 'mv88x3310',
  'header': {'dev-index': 2, 'dev-name': 'eth0'},
  'id': 0,
  'index': 1,
  'name': 'f212a600.mdio-mii:00',
  'upstream-type': 'mac'},
 {'drvname': 'Marvell 88E1111',
  'header': {'dev-index': 2, 'dev-name': 'eth0'},
  'id': 21040322,
  'index': 2,
  'name': 'i2c:sfp-eth0:16',
  'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'},
  'upstream-type': 'phy'}]

And a classic GET operation allows querying a single PHY's info :

{'drvname': 'Marvell 88E1111',
 'header': {'dev-index': 2, 'dev-name': 'eth0'},
 'id': 21040322,
 'index': 2,
 'name': 'i2c:sfp-eth0:16',
 'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'},
 'upstream-type': 'phy'}

Changes in V5:
- Removed the RTNL assertion in the topology ops
- Made the phy_topo_get_phy inline
- Fixed the PSE-PD multi-PHY support by re-adding a wrongly dropped
  check
- Fixed some typos in the documentation
- Fixed reverse xmas trees

Changes in V4:
- Dropped the RFC flag
- Made the net_device integration independent to having phylib enabled
- Removed the autogenerated ethtool-user code for the YNL specs

Changes in V3:
- Added RTNL assertions where needed
- Fixed issues in the DUMP code for PHY_GET, which crashed when running it
  twice in a row
- Added the documentation, and moved in-source docs around
- renamed link_topology to phy_link_topology

Changes in V2:
- Added the DUMP operation
- Added much more information in the reported data, to be able to reconstruct
  precisely the topology tree
- renamed phy_list to link_topology
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agoDocumentation: networking: document phy_link_topology
Maxime Chevallier [Thu, 21 Dec 2023 18:00:46 +0000 (19:00 +0100)]
Documentation: networking: document phy_link_topology

The newly introduced phy_link_topology tracks all ethernet PHYs that are
attached to a netdevice. Document the base principle, internal and
external APIs. As the phy_link_topology is expected to be extended, this
documentation will hold any further improvements and additions made
relative to topology handling.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet: ethtool: strset: Allow querying phy stats by index
Maxime Chevallier [Thu, 21 Dec 2023 18:00:45 +0000 (19:00 +0100)]
net: ethtool: strset: Allow querying phy stats by index

The ETH_SS_PHY_STATS command gets PHY statistics. Use the phydev pointer
from the ethnl request to allow querying PHY stats from each PHY on the
link.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet: ethtool: cable-test: Target the command to the requested PHY
Maxime Chevallier [Thu, 21 Dec 2023 18:00:44 +0000 (19:00 +0100)]
net: ethtool: cable-test: Target the command to the requested PHY

Cable testing is a PHY-specific command. Instead of targeting the command
towards dev->phydev, use the request to pick the targeted PHY.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet: ethtool: pse-pd: Target the command to the requested PHY
Maxime Chevallier [Thu, 21 Dec 2023 18:00:43 +0000 (19:00 +0100)]
net: ethtool: pse-pd: Target the command to the requested PHY

PSE and PD configuration is a PHY-specific command. Instead of targeting
the command towards dev->phydev, use the request to pick the targeted
PHY device.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet: ethtool: plca: Target the command to the requested PHY
Maxime Chevallier [Thu, 21 Dec 2023 18:00:42 +0000 (19:00 +0100)]
net: ethtool: plca: Target the command to the requested PHY

PLCA is a PHY-specific command. Instead of targeting the command
towards dev->phydev, use the request to pick the targeted PHY.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonetlink: specs: add ethnl PHY_GET command set
Maxime Chevallier [Thu, 21 Dec 2023 18:00:41 +0000 (19:00 +0100)]
netlink: specs: add ethnl PHY_GET command set

The PHY_GET command, supporting both DUMP and GET operations, is used to
retrieve the list of PHYs connected to a netdevice, and get topology
information to know where exactly it sits on the physical link.

Add the netlink specs corresponding to that command.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet: ethtool: Introduce a command to list PHYs on an interface
Maxime Chevallier [Thu, 21 Dec 2023 18:00:40 +0000 (19:00 +0100)]
net: ethtool: Introduce a command to list PHYs on an interface

As we have the ability to track the PHYs connected to a net_device
through the link_topology, we can expose this list to userspace. This
allows userspace to use these identifiers for phy-specific commands and
take the decision of which PHY to target by knowing the link topology.

Add PHY_GET and PHY_DUMP, which can be a filtered DUMP operation to list
devices on only one interface.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonetlink: specs: add phy-index as a header parameter
Maxime Chevallier [Thu, 21 Dec 2023 18:00:39 +0000 (19:00 +0100)]
netlink: specs: add phy-index as a header parameter

Update the spec to take the newly introduced phy-index as a generic
request parameter.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet: ethtool: Allow passing a phy index for some commands
Maxime Chevallier [Thu, 21 Dec 2023 18:00:38 +0000 (19:00 +0100)]
net: ethtool: Allow passing a phy index for some commands

Some netlink commands are targeted towards ethernet PHYs, to control some
of their features. As there are several such commands, add the ability to
pass a PHY index in the ethnl request, which will populate the generic
ethnl_req_info with the relevant phydev when the command targets a PHY.
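
A rough sketch of how a request could resolve its target PHY is shown
below. All structures and the req_resolve_phy() helper are hypothetical,
and two behaviours are assumptions of this sketch rather than statements
about the patch: an index of 0 means "not provided", and a request
without an index falls back to the netdev's default PHY.

```c
#include <stddef.h>

#define MAX_PHYS_SKETCH 4

struct phy_sketch {
	int index;		/* per-netdev PHY index, starting at 1 */
	const char *name;
};

struct netdev_phys_sketch {
	struct phy_sketch *phydev;			/* default PHY */
	struct phy_sketch *phys[MAX_PHYS_SKETCH];	/* all PHYs on the link */
};

struct req_info_sketch {
	struct netdev_phys_sketch *dev;
	int phy_index;			/* 0: no index given in the request */
	struct phy_sketch *phydev;	/* resolved target PHY */
};

/* Returns 0 on success, -1 when the requested index is unknown. */
static int req_resolve_phy(struct req_info_sketch *req)
{
	size_t i;

	if (req->phy_index == 0) {
		/* Assumed fallback: use the netdev's default PHY. */
		req->phydev = req->dev->phydev;
		return 0;
	}

	for (i = 0; i < MAX_PHYS_SKETCH; i++) {
		struct phy_sketch *p = req->dev->phys[i];

		if (p && p->index == req->phy_index) {
			req->phydev = p;
			return 0;
		}
	}

	req->phydev = NULL;
	return -1;
}
```

Resolving the PHY once in the generic request structure means each
PHY-specific command handler can read req->phydev instead of repeating
the lookup.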

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet: sfp: Add helper to return the SFP bus name
Maxime Chevallier [Thu, 21 Dec 2023 18:00:37 +0000 (19:00 +0100)]
net: sfp: Add helper to return the SFP bus name

Knowing the bus name is helpful when we want to expose the link topology
to userspace, add a helper to return the SFP bus name.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet: phy: add helpers to handle sfp phy connect/disconnect
Maxime Chevallier [Thu, 21 Dec 2023 18:00:36 +0000 (19:00 +0100)]
net: phy: add helpers to handle sfp phy connect/disconnect

There are a few PHY drivers that can handle SFP modules through their
sfp_upstream_ops. Introduce Phylib helpers to keep track of connected
SFP PHYs in a netdevice's namespace, by adding the SFP PHY to the
upstream PHY's netdev's namespace.

By doing so, these SFP PHYs can be enumerated and exposed to users,
which will be able to use their capabilities.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet: sfp: pass the phy_device when disconnecting an sfp module's PHY
Maxime Chevallier [Thu, 21 Dec 2023 18:00:35 +0000 (19:00 +0100)]
net: sfp: pass the phy_device when disconnecting an sfp module's PHY

Pass the phy_device as a parameter to the sfp upstream .disconnect_phy
operation. This is preparatory work to help track phy devices across
a net_device's link.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months agonet: phy: Introduce ethernet link topology representation
Maxime Chevallier [Thu, 21 Dec 2023 18:00:34 +0000 (19:00 +0100)]
net: phy: Introduce ethernet link topology representation

Link topologies containing multiple network PHYs attached to the same
net_device can be found when using a PHY as a media converter for use
with an SFP connector, on which an SFP transceiver containing a PHY can
be used.

With the current model, the transceiver's PHY can't be used for
operations such as cable testing, timestamping, macsec offload, etc.

The reason is that most of the logic for these configurations, whether it
comes from ethtool netlink or ioctls, tends to use netdev->phydev, which
in multi-PHY systems references the PHY closest to the MAC.

Introduce a numbering scheme that makes it possible to enumerate the PHY
devices that belong to any netdev, which in turn lets userspace make more
precise decisions with regard to each PHY's configuration.

The numbering is maintained per-netdev, in a phy_device_list.
The numbering works similarly to a netdevice's ifindex, with
identifiers that are only recycled once INT_MAX has been reached.

This prevents races that could occur between PHY listing and SFP
transceiver removal/insertion.

The identifiers are assigned at phy_attach time, as the numbering
depends on the netdevice the phy is attached to.
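
The ifindex-like numbering described above can be pictured with a small
user-space sketch. It is not the kernel implementation: the names
phy_topo, phy_topo_add(), phy_topo_del() and TOPO_MAX_PHYS are
hypothetical, and a fixed-size array stands in for whatever container
the kernel uses. It only illustrates that an identifier is not handed
out again until the counter has reached INT_MAX and wrapped.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define TOPO_MAX_PHYS 8 /* hypothetical cap, for this sketch only */

struct phy_topo {
	int in_use[TOPO_MAX_PHYS]; /* identifiers currently attached, 0 = free slot */
	int next_index;            /* next candidate identifier */
};

static void phy_topo_init(struct phy_topo *t)
{
	for (size_t i = 0; i < TOPO_MAX_PHYS; i++)
		t->in_use[i] = 0;
	t->next_index = 1;
}

static bool phy_topo_has(const struct phy_topo *t, int index)
{
	for (size_t i = 0; i < TOPO_MAX_PHYS; i++)
		if (t->in_use[i] == index)
			return true;
	return false;
}

/* Attach a PHY: returns its new identifier (> 0), or -1 if the table is full. */
static int phy_topo_add(struct phy_topo *t)
{
	size_t slot;
	int index;

	for (slot = 0; slot < TOPO_MAX_PHYS; slot++)
		if (t->in_use[slot] == 0)
			break;
	if (slot == TOPO_MAX_PHYS)
		return -1;

	/* Skip identifiers that are still attached; only relevant after a wrap. */
	while (phy_topo_has(t, t->next_index))
		t->next_index = (t->next_index == INT_MAX) ? 1 : t->next_index + 1;

	index = t->next_index;
	t->in_use[slot] = index;
	t->next_index = (index == INT_MAX) ? 1 : index + 1;
	return index;
}

/* Detach a PHY: its identifier is not reused until the counter wraps. */
static void phy_topo_del(struct phy_topo *t, int index)
{
	assert(index > 0);
	for (size_t i = 0; i < TOPO_MAX_PHYS; i++)
		if (t->in_use[i] == index)
			t->in_use[i] = 0;
}
```

In this model, removing PHY 1 and attaching a new one yields identifier
3 rather than 1, so a listing taken before the removal cannot be
confused with the newly inserted device.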

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago Merge tag 'nf-next-23-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilt...
David S. Miller [Mon, 1 Jan 2024 16:15:40 +0000 (16:15 +0000)]
Merge tag 'nf-next-23-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
netfilter pull request 23-12-22

The following patchset contains Netfilter updates for net-next:

1) Add locking for NFT_MSG_GETSETELEM_RESET requests, to address a
   race scenario with two concurrent processes running a dump-and-reset
   which exposes negative counters to userspace, from Phil Sutter.

2) Use GFP_KERNEL in pipapo GC, from Florian Westphal.

3) Reorder nf_flowtable struct members, place the read-mostly parts
   accessed by the datapath first. From Florian Westphal.

4) Set on dead flag for NFT_MSG_NEWSET in abort path,
   from Florian Westphal.

5) Support filtering zone in ctnetlink, from Felix Huettner.

6) Bail out if user tries to redefine an existing chain with different
   type in nf_tables.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf...
David S. Miller [Mon, 1 Jan 2024 14:45:21 +0000 (14:45 +0000)]
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
bpf-next-for-netdev
The following pull-request contains BPF updates for your *net-next* tree.

We've added 22 non-merge commits during the last 3 day(s) which contain
a total of 23 files changed, 652 insertions(+), 431 deletions(-).

The main changes are:

1) Add verifier support for annotating user's global BPF subprogram arguments
   with a few commonly requested annotations for a better developer experience,
   from Andrii Nakryiko.

   These tags are:
     - Ability to annotate a special PTR_TO_CTX argument
     - Ability to annotate a generic PTR_TO_MEM as non-NULL

2) Support BPF verifier tracking of BPF_JNE which helps cases when the compiler
   transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the like, from
   Menglong Dong.

3) Fix a warning in bpf_mem_cache's check_obj_size() as reported by LKP, from Hou Tao.

4) Re-support uid/gid options when mounting bpffs which had to be reverted with
   the prior token series revert to avoid conflicts, from Daniel Borkmann.

5) Fix a libbpf NULL pointer dereference in bpf_object__collect_prog_relos() found
   from fuzzing the library with malformed ELF files, from Mingyi Zhang.

6) Skip DWARF sections in libbpf's linker sanity check given compiler options to
   generate compressed debug sections can trigger a rejection due to misalignment,
   from Alyssa Ross.

7) Fix an unnecessary use of the comma operator in BPF verifier, from Simon Horman.

8) Fix format specifier for unsigned long values in cpustat sample, from Colin Ian King.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net: mdio: get/put device node during (un)registration
Luiz Angelo Daros de Luca [Wed, 20 Dec 2023 04:52:29 +0000 (01:52 -0300)]
net: mdio: get/put device node during (un)registration

The __of_mdiobus_register() function was storing the device node in
dev.of_node without increasing its reference count. It implicitly relied
on the caller to maintain the allocated node until the mdiobus was
unregistered.

Now, __of_mdiobus_register() will acquire the node before assigning it,
and of_mdiobus_unregister_callback() will be called at the end of
mdio_unregister().

Drivers can now release the node immediately after MDIO registration.
Some of them are already doing that even before this patch.
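
The ownership change can be modelled in user space as follows. This is
only a sketch under stated assumptions: fake_node, fake_bus and the
fake_*() helpers are hypothetical stand-ins for a device node with a
reference count and for the get/put pairing described above, not the
kernel's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted device node. */
struct fake_node {
	int refcount;
};

/* Hypothetical stand-in for an MDIO bus that remembers its node. */
struct fake_bus {
	struct fake_node *of_node;
};

static struct fake_node *fake_node_get(struct fake_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void fake_node_put(struct fake_node *n)
{
	if (!n)
		return;
	assert(n->refcount > 0);
	n->refcount--;
}

/* Registration takes its own reference before storing the node. */
static void fake_bus_register(struct fake_bus *bus, struct fake_node *n)
{
	bus->of_node = fake_node_get(n);
}

/* Unregistration drops the reference taken at registration time. */
static void fake_bus_unregister(struct fake_bus *bus)
{
	fake_node_put(bus->of_node);
	bus->of_node = NULL;
}
```

With this pairing, a caller that holds one reference can drop it right
after fake_bus_register() returns, and the node stays alive until
fake_bus_unregister() runs.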

Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago Merge tag 'mlx5-updates-2023-12-20' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Fri, 29 Dec 2023 22:35:13 +0000 (22:35 +0000)]
Merge tag 'mlx5-updates-2023-12-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-12-20

mlx5 Socket direct support and management PF profile.

Tariq Says:
===========
Support Socket-Direct multi-dev netdev

This series adds support for combining multiple devices (PFs) of the
same port under one netdev instance. Passing traffic through different
devices belonging to different NUMA sockets saves cross-numa traffic and
allows apps running on the same netdev from different numas to still
feel a sense of proximity to the device and achieve improved
performance.

We achieve this by grouping PFs together, and creating the netdev only
once all group members are probed. Symmetrically, we destroy the netdev
once any of the PFs is removed.

The channels are distributed between all devices; a proper configuration
would use the closest NUMA node when working on a given app/CPU.

We pick one device to be the primary (leader), and it fills a special
role. The other devices (secondaries) are disconnected from the network
at the chip level (set to silent mode). All RX/TX traffic is steered
through the primary to/from the secondaries.

Currently, we limit the support to PFs only, and up to two devices
(sockets).

===========

Armen Says:
===========
Management PF support and module integration

This patch rolls out comprehensive support for the Management Physical
Function (MGMT PF) within the mlx5 driver. It involves updating the
mlx5 interface header to introduce necessary definitions for MGMT PF
and adding a new management PF netdev profile, which will allow the host
side to communicate with the embedded Linux on BlueField devices.

===========
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago genetlink: Use internal flags for multicast groups
Ido Schimmel [Wed, 20 Dec 2023 15:43:58 +0000 (17:43 +0200)]
genetlink: Use internal flags for multicast groups

As explained in commit e03781879a0d ("drop_monitor: Require
'CAP_SYS_ADMIN' when joining "events" group"), the "flags" field in the
multicast group structure reuses uAPI flags despite the field not being
exposed to user space. This makes it impossible to extend its use
without adding new uAPI flags, which is inappropriate for internal
kernel checks.

Solve this by adding internal flags (i.e., "GENL_MCAST_*") and convert
the existing users to use them instead of the uAPI flags.
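
A minimal user-space sketch of such internal flags is shown below. The
two flag names follow the "GENL_MCAST_*" pattern mentioned above, but
their exact spelling and values here are assumptions, and struct
mcast_group and may_join() are hypothetical helpers used only to show a
kernel-internal capability check that does not touch any uAPI flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative internal flags; exact names and values are assumptions. */
#define GENL_MCAST_CAP_NET_ADMIN (1u << 0)
#define GENL_MCAST_CAP_SYS_ADMIN (1u << 1)

/* Hypothetical, simplified multicast group descriptor. */
struct mcast_group {
	const char *name;
	unsigned int flags; /* internal flags only, never exposed to user space */
};

/* Returns true if a caller with the given capabilities may join the group. */
static bool may_join(const struct mcast_group *grp,
		     bool has_net_admin, bool has_sys_admin)
{
	assert(grp != NULL);
	if ((grp->flags & GENL_MCAST_CAP_NET_ADMIN) && !has_net_admin)
		return false;
	if ((grp->flags & GENL_MCAST_CAP_SYS_ADMIN) && !has_sys_admin)
		return false;
	return true;
}
```

Because the flags live in their own namespace, a new internal check can
be added as another bit without defining a new uAPI flag.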

Tested using the reproducers in commit 44ec98ea5ea9 ("psample: Require
'CAP_NET_ADMIN' when joining "packets" group") and commit e03781879a0d
("drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group").

No functional changes intended.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago iucv: make iucv_bus const
Greg Kroah-Hartman [Wed, 20 Dec 2023 07:41:18 +0000 (08:41 +0100)]
iucv: make iucv_bus const

Now that the driver core can properly handle constant struct bus_type,
move the iucv_bus variable to be a constant structure as well, placing
it into read-only memory which can not be modified at runtime.

Cc: Wenjia Zhang <wenjia@linux.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-s390@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago ethtool: reformat kerneldoc for struct ethtool_fec_stats
Jonathan Corbet [Tue, 19 Dec 2023 23:55:31 +0000 (16:55 -0700)]
ethtool: reformat kerneldoc for struct ethtool_fec_stats

The kerneldoc comment for struct ethtool_fec_stats attempts to describe the
"total" and "lanes" fields of the ethtool_fec_stat substructure in a way
leading to these warnings:

  ./include/linux/ethtool.h:424: warning: Excess struct member 'lane' description in 'ethtool_fec_stats'
  ./include/linux/ethtool.h:424: warning: Excess struct member 'total' description in 'ethtool_fec_stats'

Reformat the comment to retain the information while eliminating the
warnings.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago ethtool: reformat kerneldoc for struct ethtool_link_settings
Jonathan Corbet [Tue, 19 Dec 2023 23:53:46 +0000 (16:53 -0700)]
ethtool: reformat kerneldoc for struct ethtool_link_settings

The kerneldoc comment for struct ethtool_link_settings includes
documentation for three fields that were never present there, leading to
these docs-build warnings:

  ./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'supported' description in 'ethtool_link_settings'
  ./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'advertising' description in 'ethtool_link_settings'
  ./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'lp_advertising' description in 'ethtool_link_settings'

Remove the entries to make the warnings go away. There was some
information there on how data in link_mode_masks is formatted; move that
to the body of the comment to preserve it.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net: sock: remove excess structure-member documentation
Jonathan Corbet [Tue, 19 Dec 2023 23:51:12 +0000 (16:51 -0700)]
net: sock: remove excess structure-member documentation

Remove a couple of kerneldoc entries for struct members that do not exist,
addressing these warnings:

  ./include/net/sock.h:548: warning: Excess struct member '__sk_flags_offset' description in 'sock'
  ./include/net/sock.h:548: warning: Excess struct member 'sk_padding' description in 'sock'

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net: pktgen: Use wait_event_freezable_timeout() for freezable kthread
Kevin Hao [Tue, 19 Dec 2023 23:37:57 +0000 (07:37 +0800)]
net: pktgen: Use wait_event_freezable_timeout() for freezable kthread

A freezable kernel thread can enter frozen state during freezing by
either calling try_to_freeze() or using wait_event_freezable() and its
variants. So for the following snippet of code in a kernel thread loop:
  wait_event_interruptible_timeout();
  try_to_freeze();

We can change it to a simple wait_event_freezable_timeout() and then
eliminate a function call.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago Merge branch 'net-tja11xx-macsec-support'
David S. Miller [Wed, 27 Dec 2023 13:08:10 +0000 (13:08 +0000)]
Merge branch 'net-tja11xx-macsec-support'

Radu Pirea says:

====================
Add MACsec support for TJA11XX C45 PHYs

This is the MACsec support for TJA11XX PHYs. The MACsec block encrypts
the ethernet frames on the fly and has no buffering. This operation will
grow the frames by 32 bytes. If the frames are sent back to back, the
MACsec block will not have enough room to insert the SecTAG and the ICV
and the frames will be dropped.

To mitigate this, the PHY can parse a specific ethertype with some
padding bytes and replace them with the SecTAG and ICV. These padding
bytes might be dummy or might contain information about TX SC that must
be used to encrypt the frame.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net: phy: nxp-c45-tja11xx: implement mdo_insert_tx_tag
Radu Pirea (NXP OSS) [Tue, 19 Dec 2023 14:53:33 +0000 (16:53 +0200)]
net: phy: nxp-c45-tja11xx: implement mdo_insert_tx_tag

Implement mdo_insert_tx_tag to insert the TLV header in the ethernet
frame.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net: phy: nxp-c45-tja11xx: add MACsec statistics
Radu Pirea (NXP OSS) [Tue, 19 Dec 2023 14:53:32 +0000 (16:53 +0200)]
net: phy: nxp-c45-tja11xx: add MACsec statistics

Add MACsec statistics callbacks.
The statistics registers must be reset to 0 when an SC/SA is deleted so
that relevant values are read the next time the SC/SA is used.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net: phy: nxp-c45-tja11xx: add MACsec support
Radu Pirea (NXP OSS) [Tue, 19 Dec 2023 14:53:31 +0000 (16:53 +0200)]
net: phy: nxp-c45-tja11xx: add MACsec support

Add MACsec support.
The MACsec block has four TX SCs and four RX SCs. The driver supports up
to four SecYs, each with one TX SC and one RX SC.
The RX SCs can have two keys, key A and key B, written in hardware and
enabled at the same time.
The TX SCs can have two keys written in hardware, but only one can be
active at a given time.
On TX, the SC is selected using the MAC source address. Because of this
selection mechanism, each offloaded netdev must have a unique MAC
address.
On RX, the SC is selected by SCI (found in the SecTAG or calculated using
the MAC SA), or RX SC 0 is used implicitly.
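
The TX selection rule explains the uniqueness requirement, and a small
user-space sketch makes that visible. It is not the driver's code:
tx_sc_table, tx_sc_lookup() and tx_sc_add() are hypothetical names, and
a plain array stands in for the hardware's four TX SCs. A second SecY
with an already-used source MAC is rejected, because the lookup could
not tell the two apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_TX_SC 4 /* four TX SCs, as stated in the commit message */

/* Hypothetical table mapping a source MAC address to a TX SC slot. */
struct tx_sc_table {
	bool used[NUM_TX_SC];
	uint8_t mac[NUM_TX_SC][6];
};

/* Select the TX SC by source MAC; returns the slot or -1 if none matches. */
static int tx_sc_lookup(const struct tx_sc_table *t, const uint8_t mac[6])
{
	assert(t != NULL && mac != NULL);
	for (int i = 0; i < NUM_TX_SC; i++)
		if (t->used[i] && memcmp(t->mac[i], mac, 6) == 0)
			return i;
	return -1;
}

/* Bind a MAC to a free TX SC; fails on a duplicate MAC or a full table. */
static int tx_sc_add(struct tx_sc_table *t, const uint8_t mac[6])
{
	if (tx_sc_lookup(t, mac) >= 0)
		return -1; /* the MAC would no longer identify a single SC */

	for (int i = 0; i < NUM_TX_SC; i++) {
		if (!t->used[i]) {
			t->used[i] = true;
			memcpy(t->mac[i], mac, 6);
			return i;
		}
	}
	return -1;
}
```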

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net: macsec: introduce mdo_insert_tx_tag
Radu Pirea (NXP OSS) [Tue, 19 Dec 2023 14:53:30 +0000 (16:53 +0200)]
net: macsec: introduce mdo_insert_tx_tag

Offloading MACsec in PHYs requires inserting the SecTAG and the ICV in
the ethernet frame. This operation will increase the frame size by up
to 32 bytes. If the frames are sent at line rate, the PHY will not have
enough room to insert the SecTAG and the ICV.
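
The 32-byte figure and the line-rate problem can be checked with a bit
of arithmetic. The sketch below assumes a SecTAG of 8 bytes (16 with an
explicit SCI), a 16-byte ICV and the 12-byte minimum Ethernet
inter-packet gap; the macro names and the helpers macsec_growth() and
fits_in_gap() are hypothetical, and the gap model is deliberately
simplified.

```c
#include <assert.h>
#include <stdbool.h>

#define SECTAG_BASE_LEN 8u  /* EtherType(2) + TCI/AN(1) + SL(1) + PN(4) */
#define SECTAG_SCI_LEN  8u  /* optional explicit SCI */
#define ICV_LEN         16u /* assumption: full-length ICV */
#define ETH_MIN_IPG     12u /* minimum inter-packet gap, in byte times */

/* Bytes added to a frame by MACsec protection. */
static unsigned int macsec_growth(bool sci_present)
{
	return SECTAG_BASE_LEN + (sci_present ? SECTAG_SCI_LEN : 0u) + ICV_LEN;
}

/*
 * Simplified model: the PHY can grow a frame in place only if the gap
 * that follows it still holds the minimum IPG after the growth.
 */
static bool fits_in_gap(unsigned int gap_bytes, bool sci_present)
{
	assert(gap_bytes >= ETH_MIN_IPG);
	return gap_bytes - ETH_MIN_IPG >= macsec_growth(sci_present);
}
```

At line rate the gap is exactly the 12-byte minimum, so
fits_in_gap(12, true) is false in this model: there is no spare room
unless the MAC leaves space in advance.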

Some PHYs use a hardware buffer to store a number of ethernet frames and,
if it fills up, a pause frame is sent to the MAC to control the flow.
This HW implementation does not need any modification in the stack.

Other PHYs might offer to use a specific ethertype with some padding
bytes present in the ethernet frame. This ethertype and its associated
bytes will be replaced by the SecTAG and ICV.

mdo_insert_tx_tag allows the PHY drivers to add any specific tag in the
skb.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net: macsec: revert the MAC address if mdo_upd_secy fails
Radu Pirea (NXP OSS) [Tue, 19 Dec 2023 14:53:29 +0000 (16:53 +0200)]
net: macsec: revert the MAC address if mdo_upd_secy fails

Revert the MAC address if mdo_upd_secy fails. Otherwise, an offloaded
MACsec device might be left in an inconsistent state.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net: macsec: documentation for macsec_context and macsec_ops
Radu Pirea (NXP OSS) [Tue, 19 Dec 2023 14:53:28 +0000 (16:53 +0200)]
net: macsec: documentation for macsec_context and macsec_ops

Add description for fields of struct macsec_context and struct
macsec_ops.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 months ago net: macsec: move sci_to_cpu to macsec header
Radu Pirea (NXP OSS) [Tue, 19 Dec 2023 14:53:27 +0000 (16:53 +0200)]
net: macsec: move sci_to_cpu to macsec header

Move sci_to_cpu to the MACsec header to use it in drivers.
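
For readers unfamiliar with the helper, the conversion it performs can
be sketched in user space. The sketch assumes the SCI is an 8-byte value
in network byte order made of a 6-byte MAC address followed by a 2-byte
port identifier; sci_bytes_to_host() and sci_port() are hypothetical
names, not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert an 8-byte SCI in network byte order to a host-order integer. */
static uint64_t sci_bytes_to_host(const uint8_t sci[8])
{
	uint64_t v = 0;

	assert(sci != NULL);
	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | sci[i];
	return v;
}

/* Extract the port identifier, i.e. the low 16 bits of the host-order SCI. */
static uint16_t sci_port(uint64_t host_sci)
{
	return (uint16_t)(host_sci & 0xffff);
}
```

A driver that programs the SCI into hardware registers typically needs
the value in host order, which is why having the helper in a shared
header is convenient.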

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>