linux-2.6-block.git
4 months ago net/mlx5: Change missing SyncE capability print to debug
Gal Pressman [Tue, 26 Dec 2023 09:12:04 +0000 (11:12 +0200)]
net/mlx5: Change missing SyncE capability print to debug

Lack of SyncE capability should not emit a warning, change the print to
debug level.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago net/mlx5: Remove initial segmentation duplicate definitions
Gal Pressman [Tue, 26 Dec 2023 08:22:08 +0000 (10:22 +0200)]
net/mlx5: Remove initial segmentation duplicate definitions

Device definitions belong in mlx5_ifc, remove the duplicates in
mlx5_core.h.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago net/mlx5: Return specific error code for timeout on wait_fw_init
Moshe Shemesh [Fri, 26 Jan 2024 07:26:29 +0000 (09:26 +0200)]
net/mlx5: Return specific error code for timeout on wait_fw_init

The function wait_fw_init() returns the same error code whether it stops
waiting due to a timeout or for another reason. As a result, its callers
print a timeout error message without checking the error type.

Return a different error code for each failure reason and print the
matching error message in wait_fw_init().
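
The idea can be sketched in plain C as follows. This is only an
illustration under stated assumptions: struct fw_dev, its fields,
wait_fw_init_sketch() and the choice of -ETIMEDOUT and -ENODEV are
hypothetical stand-ins, not the mlx5 code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for device state; not the mlx5 structures. */
struct fw_dev {
	bool fw_ready;   /* firmware finished initializing */
	bool break_wait; /* teardown asked the waiter to stop */
};

/*
 * Poll until firmware is ready. Returns 0 on success, -ENODEV when the
 * wait is cut short by teardown, and -ETIMEDOUT when the poll budget
 * runs out. The error codes are assumptions chosen for illustration.
 */
int wait_fw_init_sketch(const struct fw_dev *dev, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (dev->fw_ready)
			return 0;
		if (dev->break_wait)
			return -ENODEV;
	}
	return -ETIMEDOUT;
}

/* A caller can now tell a real timeout apart from an intentional break. */
bool is_timeout(int err)
{
	return err == -ETIMEDOUT;
}
```

With distinct codes, a caller can log a timeout only when is_timeout()
is true and stay quiet when the wait was interrupted on purpose.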

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago net/mlx5: SF, Stop waiting for FW as teardown was called
Moshe Shemesh [Thu, 25 Jan 2024 12:24:09 +0000 (14:24 +0200)]
net/mlx5: SF, Stop waiting for FW as teardown was called

When PF/VF teardown is called, the driver sets the flag
MLX5_BREAK_FW_WAIT to stop waiting for FW loading and initialization. The
same should apply to SF driver teardown to cut the waiting time. In
mlx5_sf_dev_remove(), set the flag before draining the health WQ, as the
recovery flow may also wait for FW reloading even though it is no longer
relevant.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago net/mlx5: remove fw reporter dump option for non PF
Moshe Shemesh [Thu, 25 Jan 2024 11:18:55 +0000 (13:18 +0200)]
net/mlx5: remove fw reporter dump option for non PF

If a function is not a Physical Function, it is not allowed to get a FW
core dump, so any attempt fails the fw health reporter dump option.
Instead of failing, remove the fw_fatal health reporter dump option
for such a function.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago net/mlx5: remove fw_fatal reporter dump option for non PF
Moshe Shemesh [Thu, 25 Jan 2024 10:49:55 +0000 (12:49 +0200)]
net/mlx5: remove fw_fatal reporter dump option for non PF

If a function is not a Physical Function, it is not allowed to collect
crdump, so any attempt fails the fw_fatal health reporter dump
option. Instead of failing on permission, remove the fw_fatal
health reporter dump option for such a function.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago net/mlx5: Rename mlx5_sf_dev_remove
Moshe Shemesh [Sat, 2 Dec 2023 17:13:17 +0000 (19:13 +0200)]
net/mlx5: Rename mlx5_sf_dev_remove

mlx5 has two functions with the same name, mlx5_sf_dev_remove. Both are
static and live in different files, so there is no compilation or logical
issue, but it makes the code hard to follow, and some traces can even
contain both, as one leads to the other [1]. Rename one to
mlx5_sf_dev_remove_aux(), as it actually removes the auxiliary device of
the SF.

[1]
 mlx5_sf_dev_remove+0x2a/0x70 [mlx5_core]
 auxiliary_bus_remove+0x18/0x30
 device_release_driver_internal+0x199/0x200
 bus_remove_device+0xd7/0x140
 device_del+0x153/0x3d0
 ? process_one_work+0x16a/0x4b0
 mlx5_sf_dev_remove+0x2e/0x90 [mlx5_core]
 mlx5_sf_dev_table_destroy+0xa0/0x100 [mlx5_core]

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago Documentation: Fix counter name of mlx5 vnic reporter
Moshe Shemesh [Fri, 26 Jan 2024 13:05:58 +0000 (15:05 +0200)]
Documentation: Fix counter name of mlx5 vnic reporter

Fix counter name in documentation of mlx5 vnic health reporter diagnose
output: total_error_queues.

While here fix alignment in the documentation file of another counter,
comp_eq_overrun, as it should have its own line and not be part of
another counter's description.

Example:
$ devlink health diagnose  pci/0000:00:04.0 reporter vnic
 vNIC env counters:
    total_error_queues: 0 send_queue_priority_update_flow: 0
    comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
    invalid_command: 0 quota_exceeded_command: 0
    nic_receive_steering_discard: 0

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago net/mlx5e: Delete obsolete IPsec code
Leon Romanovsky [Wed, 4 Oct 2023 12:42:56 +0000 (15:42 +0300)]
net/mlx5e: Delete obsolete IPsec code

After the addition of HW-managed counters and the implementation of drop
in the flow steering logic, the driver code that checks the syndrome
is no longer reachable.

Let's delete it.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago net/mlx5e: Connect mlx5 IPsec statistics with XFRM core
Leon Romanovsky [Wed, 4 Oct 2023 11:58:37 +0000 (14:58 +0300)]
net/mlx5e: Connect mlx5 IPsec statistics with XFRM core

Fill integrity, replay and bad trailer counters.

As an example, after simulating replay window attack with 5 packets:
[leonro@c ~]$ grep XfrmInStateSeqError /proc/net/xfrm_stat
XfrmInStateSeqError      5
[leonro@c ~]$ sudo ip -s x s
<...>
stats:
  replay-window 0 replay 5 failed 0

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago xfrm: get global statistics from the offloaded device
Leon Romanovsky [Wed, 4 Oct 2023 11:11:48 +0000 (14:11 +0300)]
xfrm: get global statistics from the offloaded device

Iterate over all SAs in order to fill global IPsec statistics.

Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago xfrm: generalize xdo_dev_state_update_curlft to allow statistics update
Leon Romanovsky [Tue, 3 Oct 2023 17:57:20 +0000 (20:57 +0300)]
xfrm: generalize xdo_dev_state_update_curlft to allow statistics update

In order to allow drivers to fill all statistics, change the name
of xdo_dev_state_update_curlft to be xdo_dev_state_update_stats.

Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 months ago netdevsim: add Makefile for selftests
David Wei [Tue, 30 Jan 2024 21:46:20 +0000 (13:46 -0800)]
netdevsim: add Makefile for selftests

Add a Makefile for netdevsim selftests and add the selftests path to
MAINTAINERS.

Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://lore.kernel.org/r/20240130214620.3722189-5-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago Merge branch 'qca8k-cleanup-fixes'
David S. Miller [Mon, 5 Feb 2024 12:39:27 +0000 (12:39 +0000)]
Merge branch 'qca8k-cleanup-fixes'

Vladimir Oltean says:

====================
Fixups for qca8k ds->user_mii_bus cleanup

The series "ds->user_mii_bus cleanup (part 1)" from the last development
cycle:
https://patchwork.kernel.org/project/netdevbpf/cover/20240104140037.374166-1-vladimir.oltean@nxp.com/

had some review comments I didn't have the time to address at the time.
One from Alvin and one from Luiz. They can reasonably be treated as
improvements for v6.9.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago net: dsa: qca8k: consistently use "ret" rather than "err" for error codes
Vladimir Oltean [Fri, 2 Feb 2024 16:36:26 +0000 (18:36 +0200)]
net: dsa: qca8k: consistently use "ret" rather than "err" for error codes

It was pointed out during the review [1] of commit 68e1010cda79 ("net:
dsa: qca8k: put MDIO bus OF node on qca8k_mdio_register() failure") that
the rest of the qca8k driver uses "int ret" rather than "int err".

Make everything consistent in that regard, not only
qca8k_mdio_register(), but also qca8k_setup_mdio_bus().

[1] https://lore.kernel.org/netdev/qyl2w3ownx5q7363kqxib52j5htar4y6pkn7gen27rj45xr4on@pvy5agi6o2te/

Suggested-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago net: dsa: qca8k: put MDIO controller OF node if unavailable
Vladimir Oltean [Fri, 2 Feb 2024 16:36:25 +0000 (18:36 +0200)]
net: dsa: qca8k: put MDIO controller OF node if unavailable

It was pointed out during the review [1] of commit e66bf63a7f67 ("net:
dsa: qca8k: skip MDIO bus creation if its OF node has status =
"disabled"") that we now leak a reference to the "mdio" OF node if it is
disabled.

This is only a concern when using dynamic OF as far as I can tell (like
probing on an overlay), since OF nodes are never freed in the regular
case. Additionally, I'm unaware of any actual device trees (in
production or elsewhere) which have status = "disabled" for the MDIO OF
node. So handling this as a simple enhancement.

[1] https://lore.kernel.org/netdev/CAJq09z4--Ug+3FAmp=EimQ8HTQYOWOuVon-PUMGB5a1N=RPv4g@mail.gmail.com/

Suggested-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago net: dsa: reindent arguments of dsa_user_vlan_for_each()
Vladimir Oltean [Fri, 2 Feb 2024 16:20:41 +0000 (18:20 +0200)]
net: dsa: reindent arguments of dsa_user_vlan_for_each()

These got misaligned after commit 6ca80638b90c ("net: dsa: Use conduit
and user terms").

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago net: ocelot: update the MODULE_DESCRIPTION()
Breno Leitao [Fri, 2 Feb 2024 16:05:37 +0000 (08:05 -0800)]
net: ocelot: update the MODULE_DESCRIPTION()

commit 1c870c63d7d2 ("net: fill in MODULE_DESCRIPTION()s for ocelot")
got a suggestion from Vladimir Oltean after it had landed in net-next.

Rewrite the module description according to Vladimir's suggestion.

Fixes: 1c870c63d7d2 ("net: fill in MODULE_DESCRIPTION()s for ocelot")
Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago net: blackhole_dev: fix build warning for ethh set but not used
Breno Leitao [Fri, 2 Feb 2024 15:13:29 +0000 (07:13 -0800)]
net: blackhole_dev: fix build warning for ethh set but not used

lib/test_blackhole_dev.c sets a variable that is never read, causing
the following build warning:

lib/test_blackhole_dev.c:32:17: warning: variable 'ethh' set but not used [-Wunused-but-set-variable]

Remove the variable struct ethhdr *ethh, which is unused.

Fixes: 509e56b37cc3 ("blackhole_dev: add a selftest")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago Merge branch 'mptcp-annotate-lockless'
David S. Miller [Mon, 5 Feb 2024 11:18:10 +0000 (11:18 +0000)]
Merge branch 'mptcp-annotate-lockless'

Matthieu Baerts says:

====================
mptcp: annotate lockless access

This is a series of 5 patches from Paolo to annotate lockless access.

The MPTCP locking schema is already quite complex. We need to clarify it
and make the lockless access already there consistent, or later changes
will be even harder to follow and understand.

This series goes through all the msk fields accessed in the RX and TX
path and makes the lockless annotation consistent with the in-use
locking schema.

As a bonus, this should fix data races eventually found by fuzzers --
even if we haven't seen many such reports so far.

Patch 1/5 hints we could remove "local_key" and "remote_key" from the
subflow context, and always use the ones from the msk socket, possibly
reducing the context memory usage. That change is left over as a
possible follow-up.
====================

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago mptcp: annotate lockless accesses around read-mostly fields
Paolo Abeni [Fri, 2 Feb 2024 11:40:11 +0000 (12:40 +0100)]
mptcp: annotate lockless accesses around read-mostly fields

The following MPTCP socket fields:

 - can_ack
 - fully_established
 - rcv_data_fin
 - snd_data_fin_enable
 - rcv_fastclose
 - use_64bit_ack

are accessed without any lock, add the appropriate annotation.

The schema is safe as each field can change its value at most
once in the whole mptcp socket life cycle.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago mptcp: annotate lockless access for token
Paolo Abeni [Fri, 2 Feb 2024 11:40:10 +0000 (12:40 +0100)]
mptcp: annotate lockless access for token

The token field is manipulated under the msk socket lock
and accessed lockless in a few spots; add the proper ONCE annotation.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago mptcp: annotate lockless access for RX path fields
Paolo Abeni [Fri, 2 Feb 2024 11:40:09 +0000 (12:40 +0100)]
mptcp: annotate lockless access for RX path fields

The following fields:

 - ack_seq
 - snd_una
 - wnd_end
 - rmem_fwd_alloc

are protected by the data lock and accessed lockless in a few
spots. Ensure ONCE annotation for write (under such lock) and for
lockless read.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago mptcp: annotate lockless access for the tx path
Paolo Abeni [Fri, 2 Feb 2024 11:40:08 +0000 (12:40 +0100)]
mptcp: annotate lockless access for the tx path

The mptcp-level TX path info (write_seq, bytes_sent, snd_nxt) are under
the msk socket lock protection, and are accessed lockless in a few spots.

Always mark the write operations with WRITE_ONCE, read operations
outside the lock with READ_ONCE and drop the annotation for read
under such lock.
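
The annotation rule above can be sketched in userspace C. The two
macros below are simplified approximations of the kernel's READ_ONCE()
and WRITE_ONCE() that rely on the GCC/Clang __typeof__ extension, and
struct msk_sketch with its helpers is hypothetical; only the field name
write_seq comes from the text.

```c
#include <stdint.h>

/* Simplified userspace approximations of the kernel macros. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical socket-like structure, not the real mptcp_sock. */
struct msk_sketch {
	uint64_t write_seq;
};

/* Writer: assumed to run with the (not shown) msk socket lock held. */
void advance_write_seq(struct msk_sketch *msk, uint64_t len)
{
	/* The read is plain because the lock is held; the store is annotated. */
	WRITE_ONCE(msk->write_seq, msk->write_seq + len);
}

/* Lockless reader: the load is annotated. */
uint64_t peek_write_seq(const struct msk_sketch *msk)
{
	return READ_ONCE(msk->write_seq);
}
```

The volatile access only keeps the compiler from merging, repeating or
eliding the access. It is not a substitute for the lock that serializes
the writers.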

To simplify the annotations move mptcp_pending_data_fin_ack() from
__mptcp_data_acked() to __mptcp_clean_una(), under the msk socket
lock, where such call would belong.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago mptcp: annotate access for msk keys
Paolo Abeni [Fri, 2 Feb 2024 11:40:07 +0000 (12:40 +0100)]
mptcp: annotate access for msk keys

Both the local and the remote key follow the same locking
schema, put in place the proper ONCE accessors.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago tsnep: Add helper for RX XDP_RING_NEED_WAKEUP flag
Gerhard Engleder [Wed, 31 Jan 2024 20:54:34 +0000 (21:54 +0100)]
tsnep: Add helper for RX XDP_RING_NEED_WAKEUP flag

A similar chunk of code is used in tsnep_rx_poll_zc() and
tsnep_rx_reopen_xsk() to maintain the RX XDP_RING_NEED_WAKEUP flag.
Consolidate the code into a common helper function.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago sctp: preserve const qualifier in sctp_sk()
Eric Dumazet [Fri, 2 Feb 2024 10:14:03 +0000 (10:14 +0000)]
sctp: preserve const qualifier in sctp_sk()

We can change sctp_sk() to propagate its argument const qualifier,
thanks to container_of_const().
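
How a const-preserving container lookup can work is sketched below in
userspace C11, assuming the GCC/Clang __typeof__ extension. The macros
are simplified stand-ins for the kernel helpers, and struct inner,
struct outer, outer_of() and IS_CONST_OUTER() are hypothetical names
used only for illustration.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Return a const or non-const pointer depending on the pointer passed in. */
#define container_of_const(ptr, type, member)                           \
	_Generic((ptr),                                                  \
		const __typeof__(*(ptr)) *:                              \
			((const type *)container_of(ptr, type, member)), \
		default: ((type *)container_of(ptr, type, member)))

struct inner { int id; };
struct outer { int flags; struct inner in; };

/* Accessor in the style of sctp_sk(): constness follows the argument. */
#define outer_of(ptr) container_of_const(ptr, struct outer, in)

/* Evaluates to 1 when expr has type "const struct outer *", else 0. */
#define IS_CONST_OUTER(expr) \
	_Generic((expr), const struct outer *: 1, default: 0)
```

Given a const struct inner *, outer_of() yields a const struct outer *,
so an accidental write through the result is rejected at compile time.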

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago net: make dev_unreg_count global
Eric Dumazet [Fri, 2 Feb 2024 10:11:06 +0000 (10:11 +0000)]
net: make dev_unreg_count global

We can use a global dev_unreg_count counter instead
of a per netns one.

As a bonus we can factorize the changes done on it
for bulk device removals.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago tun: Implement ethtool's get_channels() callback
Yunjian Wang [Fri, 2 Feb 2024 07:53:20 +0000 (15:53 +0800)]
tun: Implement ethtool's get_channels() callback

Implement the tun .get_channels functionality. This feature is necessary
for some tools, such as libxdp, which need to retrieve the queue count.

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago tun: Fix code style issues in <linux/if_tun.h>
Yunjian Wang [Fri, 2 Feb 2024 07:25:55 +0000 (15:25 +0800)]
tun: Fix code style issues in <linux/if_tun.h>

This fixes the following code style problem:
- WARNING: please, no spaces at the start of a line
- CHECK: Please use a blank line after
         function/struct/union/enum declarations

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago r8169: add support for RTL8126A
Heiner Kallweit [Thu, 1 Feb 2024 21:38:01 +0000 (22:38 +0100)]
r8169: add support for RTL8126A

This adds support for the RTL8126A found on Asus z790 Maximus Formula.
It was successfully tested w/o the firmware at 1000Mbps. Firmware file
has been provided by Realtek and submitted to linux-firmware.
2.5G and 5G modes are untested.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago selftests: netdevsim: stop using ifconfig
Jakub Kicinski [Fri, 2 Feb 2024 00:11:54 +0000 (16:11 -0800)]
selftests: netdevsim: stop using ifconfig

Paolo points out that ifconfig is legacy and we should not use it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago net: micrel: Fix the frequency adjustments
Horatiu Vultur [Thu, 1 Feb 2024 20:42:03 +0000 (21:42 +0100)]
net: micrel: Fix the frequency adjustments

By default, the lan8841's 1588 clock frequency is 125MHz. But when
adjusting the frequency, the driver uses the 1PPM format of the lan8814,
which is the wrong format because the lan8814 has a 1588 clock frequency
of 250MHz. As a result, each 1PPM adjustment adjusts less than expected.
Fix this by using the correct 1PPM format for the lan8841.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago Merge branch 'qca-phy-led-fixes'
David S. Miller [Sat, 3 Feb 2024 12:50:45 +0000 (12:50 +0000)]
Merge branch 'qca-phy-led-fixes'

Christian Marangi says:

====================
net: phy: qcom: qca808x: fixup qca808x LED

This is a bit embarrassing and totally my fault, so sorry for that!

While reworking the patch to use the phy_modify API, a logic error
was introduced that broke the brightness_set function. It wasn't
noticed when testing the last revisions, as the testing method was
to verify that hw control was working correctly.

Noticing this problem also made me notice an additional problem
with the polarity.

The introduced patch made the polarity configurable, but I forgot
to add the required code to enable Active High by default
(the PHY sets active low by default).

This wasn't noticed with hw control testing, as the LED blinks on
traffic and the polarity problem is not noticeable.

It might be worth discussing whether a change in implementation is
needed where the polarity function is always called, but I think it's
better this way, where a specific PHY applies a fixup with the help
of the priv struct in the config_init phase.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago net: phy: qcom: qca808x: default to LED active High if not set
Christian Marangi [Thu, 1 Feb 2024 13:46:01 +0000 (14:46 +0100)]
net: phy: qcom: qca808x: default to LED active High if not set

The qca808x PHY supports the led_polarity_set OP to configure
and apply the active-low property, but on PHY reset the Active High bit
is not set, so the LED is driven as active-low.

To fix this, check whether active-low is not set in DT and enable Active
High polarity by default to restore correct functionality of the LED.

Fixes: 7196062b64ee ("net: phy: at803x: add LED support for qca808x")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago net: phy: qcom: qca808x: fix logic error in LED brightness set
Christian Marangi [Thu, 1 Feb 2024 13:46:00 +0000 (14:46 +0100)]
net: phy: qcom: qca808x: fix logic error in LED brightness set

In switching to phy_modify_mmd and a shorter version of the
LED ON/OFF condition in a later revision, a logic error was introduced where

value ? QCA808X_LED_FORCE_ON : QCA808X_LED_FORCE_OFF is always true, as
value is always ORed with QCA808X_LED_FORCE_EN due to the missing (),
resulting in the tested condition being QCA808X_LED_FORCE_EN | value.

Add the () to apply the correct condition and restore correct
functionality of the brightness ON/OFF.
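
The precedence issue described above can be reproduced in isolation.
In the sketch below the LED_FORCE_* values and both helper functions are
hypothetical and do not reflect the real register layout. The buggy
variant is written with explicit parentheses to show how the compiler
parses the unparenthesized expression.

```c
/* Hypothetical bit values for illustration only. */
#define LED_FORCE_EN  0x8000u
#define LED_FORCE_ON  0x0002u
#define LED_FORCE_OFF 0x0001u

/*
 * Equivalent to "LED_FORCE_EN | value ? LED_FORCE_ON : LED_FORCE_OFF":
 * '|' binds tighter than '?:', so the condition is (EN | value), which
 * is always nonzero, and the enable bit never reaches the result.
 */
unsigned int led_bits_buggy(unsigned int value)
{
	return (LED_FORCE_EN | value) ? LED_FORCE_ON : LED_FORCE_OFF;
}

/* Intended form: select ON or OFF first, then OR in the enable bit. */
unsigned int led_bits_fixed(unsigned int value)
{
	return LED_FORCE_EN | (value ? LED_FORCE_ON : LED_FORCE_OFF);
}
```

With these values, led_bits_buggy() returns LED_FORCE_ON for every
input, while led_bits_fixed() returns the enable bit combined with the
requested state.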

Fixes: 7196062b64ee ("net: phy: at803x: add LED support for qca808x")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months ago Merge branch 'tools-ynl-auto-gen-for-all-genetlink-families'
Jakub Kicinski [Sat, 3 Feb 2024 05:15:55 +0000 (21:15 -0800)]
Merge branch 'tools-ynl-auto-gen-for-all-genetlink-families'

Jakub Kicinski says:

====================
tools: ynl: auto-gen for all genetlink families

The code gen has caught up with all features required in genetlink
families in Linux 6.8 already. We have also stopped committing
auto-generated user space code to the tree. Instead of listing all the
families in the Makefile, search the spec directory and generate
code for everything that's not legacy netlink.
====================

Link: https://lore.kernel.org/r/20240202004926.447803-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago tools: ynl: auto-gen for all genetlink families
Jakub Kicinski [Fri, 2 Feb 2024 00:49:26 +0000 (16:49 -0800)]
tools: ynl: auto-gen for all genetlink families

Instead of listing the genetlink families that we want to codegen
for, always codegen for everyone. We can add an opt-out later but
it seems like most families are not causing any issues, and yet
folks forget to add them to the Makefile.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240202004926.447803-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago tools: ynl: generate code for ovs families
Jakub Kicinski [Fri, 2 Feb 2024 00:49:25 +0000 (16:49 -0800)]
tools: ynl: generate code for ovs families

Add ovs_flow, ovs_vport and ovs_datapath to the families supported
in C. ovs-flow has some circular nesting which is fun to deal with,
but the necessary support has been added already in the previous
release cycle.

Add a sample that proves that dealing with fixed headers does
actually work correctly.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240202004926.447803-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago tools: ynl: include dpll and mptcp_pm in C codegen
Jakub Kicinski [Fri, 2 Feb 2024 00:49:24 +0000 (16:49 -0800)]
tools: ynl: include dpll and mptcp_pm in C codegen

The DPLL and mptcp_pm families are pretty clean, and YNL C codegen
supports them fully with no changes. Add them to user space codegen
so that C samples can be written, and we know immediately if changes
to these families require YNL codegen work.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240202004926.447803-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago ipv6: make addrconf_wq single threaded
Eric Dumazet [Thu, 1 Feb 2024 17:30:31 +0000 (17:30 +0000)]
ipv6: make addrconf_wq single threaded

Both addrconf_verify_work() and addrconf_dad_work() acquire rtnl,
there is no point trying to have one thread per cpu.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240201173031.3654257-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago r8169: simplify EEE handling
Heiner Kallweit [Wed, 31 Jan 2024 20:31:01 +0000 (21:31 +0100)]
r8169: simplify EEE handling

We don't have to store the EEE modes to be advertised in the driver,
phylib does this for us and stores it in phydev->advertising_eee.
phylib also takes care of properly handling the EEE advertisement.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/27c336a8-ea47-483d-815b-02c45ae41da2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago net: phy: realtek: add support for RTL8126A-integrated 5Gbps PHY
Heiner Kallweit [Wed, 31 Jan 2024 20:24:29 +0000 (21:24 +0100)]
net: phy: realtek: add support for RTL8126A-integrated 5Gbps PHY

A user reported that the first consumer mainboards show up with a RTL8126A
5Gbps MAC/PHY. This adds support for the integrated PHY, which is also
available stand-alone. From a PHY driver perspective it's treated the
same as the 2.5Gbps PHYs; we just have to support the new PHY ID.

Reported-by: Joe Salmeri <jmscdba@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Joe Salmeri <jmscdba@gmail.com>
Link: https://lore.kernel.org/r/0c8e67ea-6505-43d1-bd51-94e7ecd6e222@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago Merge branch 'net-sched-load-modules-via-alias'
Jakub Kicinski [Fri, 2 Feb 2024 18:57:57 +0000 (10:57 -0800)]
Merge branch 'net-sched-load-modules-via-alias'

Michal Koutný says:

====================
net/sched: Load modules via alias

These modules may be loaded lazily without user's awareness and
control. Add respective aliases to modules and request them under these
aliases so that modprobe's blacklisting mechanism (through aliases)
works for them. (The same pattern exists e.g. for filesystem
modules.)

For example (before the change):
  $ tc filter add dev lo parent 10: protocol ip prio 10 handle 1: cgroup
  # cls_cgroup module is loaded despite a `blacklist cls_cgroup` entry
  # in /etc/modprobe.d/*.conf

After the change:
  $ tc filter add dev lo parent 10: protocol ip prio 10 handle 1: cgroup
  Error: TC classifier not found.
  We have an error talking to the kernel
  # explicit/acknowledged (privileged) action is needed
  $ modprobe cls_cgroup
  # blacklist entry won't apply to this direct modprobe, module is
  # loaded with awareness

A considered alternative was always invoking `modprobe -b` from
request_module(); however, it was dismissed as too intrusive and slightly
confusing in favor of the precedented aliases (see commit 7f78e0351394
("fs: Limit sys_mount to only request filesystem modules.")).

User experience suffers in both alternatives. Its improvement is
orthogonal to blacklist honoring.

v1: https://lore.kernel.org/r/20231121175640.9981-1-mkoutny@suse.com
v2: https://lore.kernel.org/r/20231206192752.18989-1-mkoutny@suse.com
v3: https://lore.kernel.org/r/20240112180646.13232-1-mkoutny@suse.com
v4: https://lore.kernel.org/r/20240123135242.11430-1-mkoutny@suse.com

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
====================

Link: https://lore.kernel.org/r/20240201130943.19536-1-mkoutny@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago net/sched: Remove alias of sch_clsact
Michal Koutný [Thu, 1 Feb 2024 13:09:43 +0000 (14:09 +0100)]
net/sched: Remove alias of sch_clsact

The module sch_ingress stands out among net/sched modules
because it provides multiple act/sch functionalities in a single .ko.
They have aliases to make autoloading work for any of the provided
functionalities.

Since the autoloading was changed to uniformly request any functionality
under its alias, the non-systemic aliases can be removed now (i.e.
assuming the aliases were only used to ensure autoloading).

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201130943.19536-5-mkoutny@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago net/sched: Load modules via their alias
Michal Koutný [Thu, 1 Feb 2024 13:09:42 +0000 (14:09 +0100)]
net/sched: Load modules via their alias

The cls_,sch_,act_ modules may be loaded lazily during network
configuration but without user's awareness and control.

Switch the lazy loading from canonical module names to a module alias.
This allows finer control over lazy loading, the precedent from
commit 7f78e0351394 ("fs: Limit sys_mount to only request filesystem
modules.") explains it already:

Using aliases means user space can control the policy of which
filesystem^W net/sched modules are auto-loaded by editing
/etc/modprobe.d/*.conf with blacklist and alias directives.
Allowing simple, safe, well understood work-arounds to known
problematic software.

By default, nothing changes. However, if a specific module is
blacklisted (its canonical name), it won't be modprobe'd when requested
under its alias (i.e. kernel auto-loading). It would appear as if the
given module was unknown.

The module can still be loaded under its canonical name, which is an
explicit (privileged) user action.
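
How a request name is formed from an alias prefix can be sketched in
userspace C. The prefix strings below are assumptions made for
illustration (check the tree for the actual macros), and
cls_request_name() is a hypothetical helper, not a kernel function.

```c
#include <stdio.h>

/* Assumed alias prefixes; treat the exact strings as illustrative. */
#define NET_CLS_ALIAS_PREFIX "net-cls-"
#define NET_SCH_ALIAS_PREFIX "net-sch-"
#define NET_ACT_ALIAS_PREFIX "net-act-"

/*
 * Build the name a classifier of the given kind would be requested
 * under. Returns the snprintf() result, i.e. the full name length.
 */
int cls_request_name(char *buf, size_t len, const char *kind)
{
	/* Adjacent string literals concatenate: "net-cls-" "%s" is "net-cls-%s". */
	return snprintf(buf, len, NET_CLS_ALIAS_PREFIX "%s", kind);
}
```

Under these assumptions, the kind "cgroup" would be requested as
"net-cls-cgroup" instead of "cls_cgroup", so a blacklist entry for the
canonical name takes effect for auto-loading while a direct
`modprobe cls_cgroup` still works.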

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201130943.19536-4-mkoutny@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago net/sched: Add module aliases for cls_,sch_,act_ modules
Michal Koutný [Thu, 1 Feb 2024 13:09:41 +0000 (14:09 +0100)]
net/sched: Add module aliases for cls_,sch_,act_ modules

No functional change intended, aliases will be used in followup commits.
Note for backporters: you may need to add aliases also for modules that
are already removed in mainline kernel but still in your version.

Patches were generated with the help of Coccinelle scripts like:

cat >scripts/coccinelle/misc/tcf_alias.cocci <<EOD
virtual patch
virtual report

@ haskernel @
@@

@ tcf_has_kind depends on report && haskernel @
identifier ops;
constant K;
@@

  static struct tcf_proto_ops ops = {
    .kind = K,
    ...
  };
+char module_alias = K;
EOD

/usr/bin/spatch -D report --cocci-file scripts/coccinelle/misc/tcf_alias.cocci \
        --dir . \
        -I ./arch/x86/include -I ./arch/x86/include/generated -I ./include \
        -I ./arch/x86/include/uapi -I ./arch/x86/include/generated/uapi \
        -I ./include/uapi -I ./include/generated/uapi \
        --include ./include/linux/compiler-version.h --include ./include/linux/kconfig.h \
        --jobs 8 --chunksize 1 2>/dev/null | \
        sed 's/char module_alias = "\([^"]*\)";/MODULE_ALIAS_NET_CLS("\1");/'

And analogously for:

  static struct tc_action_ops ops = {
    .kind = K,

  static struct Qdisc_ops ops = {
    .id = K,

(Someone familiar would be able to fit those into one .cocci file
without sed post processing.)

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201130943.19536-3-mkoutny@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet/sched: Add helper macros with module names
Michal Koutný [Thu, 1 Feb 2024 13:09:40 +0000 (14:09 +0100)]
net/sched: Add helper macros with module names

The macros are preparation for adding module aliases en masse in a
separate commit.
Although it would be tempting to create aliases like cls-foo for name
cls_foo, this could not be used because modprobe utilities treat '-' and
'_' interchangeably.
In the end, the naming follows the pattern of proto modules in linux/net.h.
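
The naming scheme can be illustrated with a minimal userspace sketch.
The "net-cls-" prefix string and the names NET_CLS_ALIAS_PREFIX,
NET_CLS_ALIAS() and is_cls_alias() are assumptions chosen for
illustration, modeled on the "net-pf-" style of linux/net.h; in the
kernel the resulting string would be handed to MODULE_ALIAS() instead
of being used as a plain literal:

```c
#include <string.h>

/* Hypothetical prefix, following the "net-pf-" style of linux/net.h. */
#define NET_CLS_ALIAS_PREFIX "net-cls-"

/* Adjacent string literals are concatenated at compile time, so the
 * alias for kind "flower" becomes "net-cls-flower".
 */
#define NET_CLS_ALIAS(kind) NET_CLS_ALIAS_PREFIX kind

/* Return 1 if 'alias' lives in the classifier alias namespace. */
static int is_cls_alias(const char *alias)
{
	return strncmp(alias, NET_CLS_ALIAS_PREFIX,
		       strlen(NET_CLS_ALIAS_PREFIX)) == 0;
}
```

Even if '-' and '_' are treated interchangeably, an alias of this shape
stays distinct from the canonical name cls_flower, which a cls-flower
alias would not.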

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201130943.19536-2-mkoutny@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge tag 'batadv-next-pullrequest-20240201' of git://git.open-mesh.org/linux-merge
David S. Miller [Fri, 2 Feb 2024 12:44:16 +0000 (12:44 +0000)]
Merge tag 'batadv-next-pullrequest-20240201' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - Improve error handling in DAT and uevent generator,
   by Markus Elfring (2 patches)

 - Drop usage of export.h, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: dccp: Simplify the allocation of slab caches in dccp_ackvec_init
Kunwu Chan [Wed, 31 Jan 2024 09:08:51 +0000 (17:08 +0800)]
net: dccp: Simplify the allocation of slab caches in dccp_ackvec_init

Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agodt-bindings: net: ti: Update maintainers list
Ravi Gunasekaran [Wed, 31 Jan 2024 08:53:51 +0000 (14:23 +0530)]
dt-bindings: net: ti: Update maintainers list

Update the list with the current maintainers of TI's CPSW ethernet
peripheral.

Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com>
Acked-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agosctp: Simplify the allocation of slab caches
Kunwu Chan [Wed, 31 Jan 2024 08:45:49 +0000 (16:45 +0800)]
sctp: Simplify the allocation of slab caches

commit 0a31bd5f2bbb ("KMEM_CACHE(): simplify slab cache creation")
introduces a new macro.
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
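
For readers unfamiliar with the macro, the userspace sketch below mimics
what KMEM_CACHE() is understood to do: derive the cache name, object size
and alignment from the struct type, so the caller no longer spells them
out. struct model_cache, model_cache_create(), MODEL_KMEM_CACHE() and
struct demo_chunk are hypothetical stand-ins for illustration, not kernel
APIs:

```c
#include <stddef.h>

/* Hypothetical stand-in for a slab cache descriptor. */
struct model_cache {
	const char *name;
	size_t size;
	size_t align;
};

/* Hypothetical stand-in for the explicit, long-form creation call. */
static struct model_cache model_cache_create(const char *name,
					      size_t size, size_t align)
{
	struct model_cache cache = { name, size, align };

	return cache;
}

/* The macro derives all three arguments from the struct type. */
#define MODEL_KMEM_CACHE(type) \
	model_cache_create(#type, sizeof(struct type), _Alignof(struct type))

/* Invented object type, used only to exercise the macro. */
struct demo_chunk {
	long id;
	char flag;
};
```

With this, MODEL_KMEM_CACHE(demo_chunk) yields a cache named "demo_chunk"
sized and aligned for struct demo_chunk.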

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoMerge branch 'octeontx2-af-dynamically-allocate-BPIDs'
David S. Miller [Fri, 2 Feb 2024 12:12:36 +0000 (12:12 +0000)]
Merge branch 'octeontx2-af-dynamically-allocate-BPIDs'

Geetha sowjanya says:

====================
Dynamically allocate BPIDs for LBK

In the current driver, 64 BPIDs are reserved for LBK interfaces.
These BPIDs are mapped 1-to-1 to LBK interface channel numbers.
In some use cases one LBK interface requires more than one BPID,
and in some cases it may not require any at all. These use cases
can't be addressed with the current implementation, as it always
reserves exactly one BPID per LBK channel.

This patch addresses this issue by creating a free BPID pool from
these 64 BPIDs instead of mapping them 1-to-1 to the LBK channels.
Now, depending on the use case, an LBK interface can request a BPID
using bp_enable().

v1 -> v2:
   - Modified commit message.
   - Dropped patch 2, as rvu netdev has no use case for now. It will
     be upstreamed along with the CPT driver.
   - Addressed review comments by Simon Horman.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoocteontx2-af: Cleanup loopback device checks
Geetha sowjanya [Wed, 31 Jan 2024 07:54:41 +0000 (13:24 +0530)]
octeontx2-af: Cleanup loopback device checks

PCI device IDs of RVU devices are configurable, and the VFs of
RVU PF0 (i.e. the AF) are currently assumed to provide loopback
functionality, i.e. to be LBK VFs. But in some cases these VFs
can be set up for different functionality.
Hence remove the assumption that the AF's VFs are always LBK VFs
by explicitly renaming 'is_afvf' to 'is_lbkvf' and identifying
LBK VFs by PCI device ID. A similar change is done for other
VF types.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoocteontx2-af: Create BPIDs free pool
Geetha sowjanya [Wed, 31 Jan 2024 07:54:40 +0000 (13:24 +0530)]
octeontx2-af: Create BPIDs free pool

In the current driver, 64 BPIDs are reserved for LBK interfaces.
These BPIDs are mapped 1-to-1 to LBK interface channel numbers.
In some use cases one LBK interface requires more than one
BPID, and in some cases it may not require any at all.
These use cases can't be addressed with the current implementation,
as it always reserves exactly one BPID per LBK channel.
This patch addresses this issue by creating a free BPID pool from these
64 BPIDs instead of mapping them 1-to-1 to the LBK channels.
Now, depending on the use case, an LBK interface can request a BPID
using bp_enable().

This patch also reduces the number of BPIDs for CGX interfaces to 8
and adds proper error codes.
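
The free-pool idea can be sketched in userspace as a 64-bit bitmap.
This is only a sketch under stated assumptions: struct model_bpid_pool,
model_bpid_alloc() and model_bpid_free() are hypothetical names and do
not reflect the driver's actual data structures or bp_enable() itself:

```c
#include <stdint.h>

#define MODEL_BPID_COUNT 64

/* One bit per BPID; a set bit means the BPID is in use. */
struct model_bpid_pool {
	uint64_t used;
};

/* Hand out the lowest free BPID, or -1 if the pool is exhausted. */
static int model_bpid_alloc(struct model_bpid_pool *pool)
{
	for (int id = 0; id < MODEL_BPID_COUNT; id++) {
		uint64_t bit = (uint64_t)1 << id;

		if (!(pool->used & bit)) {
			pool->used |= bit;
			return id;
		}
	}
	return -1;
}

/* Return a BPID to the pool so that another channel can request it. */
static void model_bpid_free(struct model_bpid_pool *pool, int id)
{
	if (id >= 0 && id < MODEL_BPID_COUNT)
		pool->used &= ~((uint64_t)1 << id);
}
```

Unlike a fixed 1-to-1 mapping, a channel in this model may allocate
several BPIDs or none, and exhaustion is reported as an error value.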

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoselftests: openvswitch: Test ICMP related matches work with SNAT
Brad Cowie [Wed, 31 Jan 2024 04:08:22 +0000 (17:08 +1300)]
selftests: openvswitch: Test ICMP related matches work with SNAT

Add a test case for regression in openvswitch nat that was fixed by
commit e6345d2824a3 ("netfilter: nf_nat: fix action not being set for
all ct states").

Link: https://lore.kernel.org/netdev/20231221224311.130319-1-brad@faucet.nz/
Link: https://mail.openvswitch.org/pipermail/ovs-dev/2024-January/410476.html
Suggested-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Brad Cowie <brad@faucet.nz>
Tested-by: Aaron Conole <aconole@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: phy: dp83867: Add support for active-low LEDs
Alexander Stein [Wed, 31 Jan 2024 07:50:48 +0000 (08:50 +0100)]
net: phy: dp83867: Add support for active-low LEDs

Add the led_polarity_set callback for setting LED polarity.

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoMerge branch 'net-ipq4019-rate'
David S. Miller [Fri, 2 Feb 2024 10:08:02 +0000 (10:08 +0000)]
Merge branch 'net-ipq4019-rate'

Christian Marangi says:

====================
net: mdio-ipq4019: fix wrong default MDC rate

It was a long journey to discover this problem.

To keep it short, there is a race between PHY and driver probe. This
was observed with Aquantia PHY firmware loading.

With some hacks the race was worked around, but an interesting thing
was noticed: it took more than a minute for the firmware to load
via MDIO.

This was strange, as U-Boot did the same operation and loaded the same
data in at most 5 seconds.

A similar problem was observed on a mtk board that also had an
Aquantia PHY where the load was very slow. It was noticed that the cause
was the MDIO bus running at a very low speed and the firmware
was missing a property (present in mtk sdk) that sets the right frequency
for the MDIO bus.

It was fun to find that THE VERY SAME PROBLEM is present on IPQ in a
different form. The MDIO internally applies a division to the feed clock,
resulting in the bus running at 390KHz instead of 6.25MHz.

Searching the web for documentation and include files, and analyzing
the U-Boot code flow, showed that the divider was wrongly left at /256
instead of /16, as the value was actually never set.
Applying the value restores the original load time for the Aquantia PHY.

This series mainly handles this by adding support for the "clock-frequency"
property.

Changes v3:
- Add Reviewed-by tag
- Fix english grammar error in comment
- Drop DTS patch
Changes v2:
- Use DIV_ROUND_UP
- Introduce logic to choose a default value for 802.3 spec 2.5MHz
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: mdio: ipq4019: add support for clock-frequency property
Christian Marangi [Wed, 31 Jan 2024 02:26:04 +0000 (03:26 +0100)]
net: mdio: ipq4019: add support for clock-frequency property

The IPQ4019 MDIO internally divides the clock fed by AHB based on the
MDIO_MODE reg. On reset or power up, the default value for the
divider is 0xff, which corresponds to a divider of /256.

This makes the MDC run at a very low rate: given that AHB is always
fixed at 100MHz, about 390KHz.

This hasn't been a problem as MDIO wasn't used for time-sensitive
operations. It is now, as IPQ807x boards are usually fitted with PHYs
that require MDIO to load their firmware (for example Aquantia PHYs).

To handle this problem and allow setting the MDC frequency the SoC was
designed for, add support for the standard "clock-frequency"
property for the MDIO node.

The divider supports values from /1 to /256. The common values are
/16, which yields 6.25MHz, or /8 on newer platforms, which yields
12.5MHz.

To check whether the requested rate is supported by the divider, loop
over each supported divider and stop when the requested rate matches the
rate produced by the current divider. An error is returned if the rate
doesn't match any value.

On MDIO reset, the divider is restored to the requested value to prevent
any kind of downclocking caused by the divider reverting to a default
value.

To follow the 802.3 spec default of 2.5MHz, if the divider is set to
/256 and "clock-frequency" is not set in DT, assume nobody set the
divider and try to find the closest MDC rate to 2.5MHz (with AHB set to
100MHz, that is 1.5625MHz).
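
The divider search and the default selection described above can be
sketched in userspace as follows. This is a sketch under assumptions,
not the driver code: the helper names are hypothetical, and restricting
the dividers to powers of two and never exceeding 2.5MHz for the default
are assumptions that happen to reproduce the figures quoted here
(390KHz, 6.25MHz, 12.5MHz and 1.5625MHz at a 100MHz AHB clock):

```c
#include <stdint.h>

#define MODEL_MDC_SPEC_HZ 2500000u	/* 802.3 default MDC rate */

/* Same rounding as the kernel's DIV_ROUND_UP(). */
static uint32_t model_div_round_up(uint32_t n, uint32_t d)
{
	return (n + d - 1) / d;
}

/* Return the divider that yields exactly 'want_hz', or 0 if none does
 * (the caller would turn 0 into an error).
 */
static uint32_t model_mdc_find_divider(uint32_t ahb_hz, uint32_t want_hz)
{
	for (uint32_t div = 1; div <= 256; div *= 2) {
		if (model_div_round_up(ahb_hz, div) == want_hz)
			return div;
	}
	return 0;
}

/* Walk from /256 towards /1 and keep the fastest rate that does not
 * exceed the 802.3 default.
 */
static uint32_t model_mdc_default_rate(uint32_t ahb_hz)
{
	uint32_t rate = model_div_round_up(ahb_hz, 256);

	for (uint32_t div = 256; div >= 1; div /= 2) {
		uint32_t candidate = model_div_round_up(ahb_hz, div);

		if (candidate > MODEL_MDC_SPEC_HZ)
			break;
		rate = candidate;
	}
	return rate;
}
```

In this model a request for 6.25MHz at a 100MHz AHB clock resolves to
/16, while a rate such as 5MHz matches no divider and is rejected.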

While at it, also document the other bits of the MDIO_MODE reg to have a
clear idea of what is actually applied there.

Documentation of some BITs is skipped as they are marked as reserved and
their usage is not clear (RES 11:9 GENPHY 16:13 RES1 19:17)

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agodt-bindings: net: ipq4019-mdio: document now supported clock-frequency
Christian Marangi [Wed, 31 Jan 2024 02:26:03 +0000 (03:26 +0100)]
dt-bindings: net: ipq4019-mdio: document now supported clock-frequency

Document support for clock-frequency and add details on why this
property is needed and what values are supported.

According to internal documentation, other values are supported but the
correct functioning of the MDIO bus is not assured with them, hence add
only the suggested supported values to the property enum.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoMerge branch 'net-ipa-simplify-tx-power-handling'
Jakub Kicinski [Fri, 2 Feb 2024 04:51:55 +0000 (20:51 -0800)]
Merge branch 'net-ipa-simplify-tx-power-handling'

Alex Elder says:

====================
net: ipa: simplify TX power handling

In order to deliver a packet to the IPA hardware, we must ensure
it is powered.  We request power by calling pm_runtime_get(), and
its return value tells us the power state.  We can't block in
ipa_start_xmit(), so if power isn't enabled we prevent further
transmit attempts by calling netif_stop_queue().  Power will
eventually become enabled, at which point we call netif_wake_queue()
to allow the transmit to be retried.  When it does, the power should
be enabled, so the packet delivery can proceed.

The logic that handles this is convoluted, and was put in place
to address a race condition pointed out by Jakub Kicinski during
review.  The fix addressed that race, as well as another one that
was found while investigating it:
  b8e36e13ea5e ("net: ipa: fix TX queue race")
I have wanted to simplify this code ever since, and I'm pleased to
report that this series implements a much better solution that
avoids both race conditions.

The first race occurs between the ->ndo_start_xmit thread and the
runtime resume thread.  If we find power is not enabled when
requested in ipa_start_xmit(), we stop queueing.  But there's a
chance the runtime resume will enable queuing just before that,
leaving queueing stopped forever.  A flag is used to ensure that
does not occur.

A second flag is used to prevent NETDEV_TX_BUSY from being returned
repeatedly during the small window between enabling queueing and
finishing power resume.  This can happen if resume was underway when
pm_runtime_get() was called and completes immediately afterward.
This condition only exists because of the use of the first flag.

The fix is to disable transmit for *every* call to ipa_start_xmit(),
disabling it *before* calling pm_runtime_get().  This leaves three
cases:
  - If the return value indicates power is not active (or is in
    transition), queueing remains disabled--thereby avoiding
    the race between disabling it and a concurrent power thread
    enabling it.
  - If pm_runtime_get() returns an error, we drop the packet and
    re-enable queueing.
  - Finally, if the hardware is powered, we re-enable queueing
    before delivering the packet to the hardware.

So the first race is avoided.  And as a result, the second condition
will not occur.
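
The three cases above can be modeled in a small userspace sketch. It is
only an illustration under stated assumptions: the enums, struct
model_txq and both functions are hypothetical, and the three power
outcomes stand in for the return value of pm_runtime_get() without
reproducing its actual values:

```c
#include <stdbool.h>

/* Hypothetical summary of what the power request reported. */
enum model_power { MODEL_POWER_ACTIVE, MODEL_POWER_PENDING, MODEL_POWER_ERROR };

/* Hypothetical outcome of one transmit attempt. */
enum model_xmit { MODEL_XMIT_SENT, MODEL_XMIT_BUSY, MODEL_XMIT_DROPPED };

struct model_txq {
	bool stopped;
};

/* Transmit path: always stop the queue first, then decide. */
static enum model_xmit model_start_xmit(struct model_txq *q,
					enum model_power power)
{
	q->stopped = true;		/* stands in for netif_stop_queue() */

	switch (power) {
	case MODEL_POWER_PENDING:
		/* Leave the queue stopped; resume will wake it. */
		return MODEL_XMIT_BUSY;
	case MODEL_POWER_ERROR:
		q->stopped = false;	/* stands in for netif_wake_queue() */
		return MODEL_XMIT_DROPPED;
	case MODEL_POWER_ACTIVE:
	default:
		q->stopped = false;	/* wake before delivering the packet */
		return MODEL_XMIT_SENT;
	}
}

/* Runtime resume path: wakes the queue unconditionally. */
static void model_runtime_resume(struct model_txq *q)
{
	q->stopped = false;
}
```

Because the stop happens before the power state is examined, a wake from
the resume path can never be overwritten by a later stop from the same
transmit attempt, so no flag is needed to order the two.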

The first patch adds pointers to the TX and RX IPA endpoints in the
netdev private data.  The second patch has netif_stop_queue() be
called for all packets; if pm_runtime_get() indicates power is
enabled (or an error), netif_wake_queue() is called to enable it
again.  The third and fourth patches get rid of the STARTED and
STOPPED IPA power flags, as well as the power spinlock, because they
are no longer needed.  The last three patches just eliminate some
trivial functions, open-coding them instead.
====================

Link: https://lore.kernel.org/r/20240130192305.250915-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: ipa: kill ipa_power_modem_queue_wake()
Alex Elder [Tue, 30 Jan 2024 19:23:04 +0000 (13:23 -0600)]
net: ipa: kill ipa_power_modem_queue_wake()

All ipa_power_modem_queue_wake() does is call netif_wake_queue()
on the modem netdev.  There is no need to wrap that call in a
trivial function (and certainly not one defined in "ipa_power.c").

So get rid of ipa_power_modem_queue_wake(), and replace its one
caller with a direct call to netif_wake_queue().  Determine the
netdev pointer to use from the private TX endpoint's netdev pointer.

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-8-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: ipa: kill ipa_power_modem_queue_active()
Alex Elder [Tue, 30 Jan 2024 19:23:03 +0000 (13:23 -0600)]
net: ipa: kill ipa_power_modem_queue_active()

All ipa_power_modem_queue_active() does now is call netif_wake_queue().
Just call netif_wake_queue() in the two places it's needed, and get
rid of ipa_power_modem_queue_active().

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-7-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: ipa: kill ipa_power_modem_queue_stop()
Alex Elder [Tue, 30 Jan 2024 19:23:02 +0000 (13:23 -0600)]
net: ipa: kill ipa_power_modem_queue_stop()

All ipa_power_modem_queue_stop() does now is call netif_stop_queue().
Just call netif_stop_queue() in the one place it's needed, and get
rid of ipa_power_modem_queue_stop().

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-6-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: ipa: kill the IPA power STOPPED flag
Alex Elder [Tue, 30 Jan 2024 19:23:01 +0000 (13:23 -0600)]
net: ipa: kill the IPA power STOPPED flag

Currently the STOPPED IPA power flag is used to indicate that the
transmit queue has been stopped.  Previously this was used to avoid
setting the STARTED flag unless the queue had already been stopped.
It meant transmit queuing would be enabled on resume if it was
stopped by the transmit path--and if so, it ensured it only got
enabled once.

We only stop the transmit queue in the transmit path.  The STARTED
flag has been removed, and it causes no real harm to enable
transmits when they're already enabled.  So we can get rid of
the STOPPED flag and call netif_wake_queue() unconditionally.

This makes the IPA power spinlock unnecessary, so it can be removed
as well.

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-5-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: ipa: kill the STARTED IPA power flag
Alex Elder [Tue, 30 Jan 2024 19:23:00 +0000 (13:23 -0600)]
net: ipa: kill the STARTED IPA power flag

A transmit on the modem netdev can only complete if the IPA hardware
is powered.  Currently, if a transmit request arrives when the
hardware is not powered, further transmits are stopped to allow
power-up to complete.  Once power-up completes, transmits are once
again enabled.

Runtime resume can complete at the same time a transmit request is
being handled, and previously there was a race between stopping and
restarting transmits.  The STARTED flag was used to ensure the
stop request in the transmit path was skipped if the start request
in the runtime resume path had already occurred.

Now, the queue is *always* stopped in the transmit path, *before*
determining whether power is ACTIVE.  If power is found to already
be active (or if the socket buffer gets dropped), transmit is
re-enabled.  Otherwise it will (always) be enabled after runtime
resume completes.

The race between transmit and runtime resume no longer exists, so
there is no longer any need to maintain the STARTED flag.

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-4-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: ipa: begin simplifying TX queue stop
Alex Elder [Tue, 30 Jan 2024 19:22:59 +0000 (13:22 -0600)]
net: ipa: begin simplifying TX queue stop

There are a number of flags used in the IPA driver to attempt to
manage race conditions that can occur between runtime resume and
netdev transmit.  If we disable TX before requesting power, we can
avoid these races entirely, simplifying things considerably.

This patch implements the main change, disabling transmit always in
the net_device->ndo_start_xmit() callback, then re-enabling it again
whenever we find power is active (or when we drop the skb).

The patches that follow will refactor the "old" code to the point
that most of it can be eliminated.

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-3-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: ipa: stash modem TX and RX endpoints
Alex Elder [Tue, 30 Jan 2024 19:22:58 +0000 (13:22 -0600)]
net: ipa: stash modem TX and RX endpoints

Rather than repeatedly looking up the endpoints in the name map,
save the modem TX and RX endpoint pointers in the netdev private
area.

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20240130192305.250915-2-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 1 Feb 2024 22:33:26 +0000 (14:33 -0800)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge tag 'net-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 1 Feb 2024 20:39:54 +0000 (12:39 -0800)]
Merge tag 'net-6.8-rc3' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter.

  As Paolo promised we continue to hammer out issues in our selftests.
  This is not the end but probably the peak.

  Current release - regressions:

   - smc: fix incorrect SMC-D link group matching logic

  Current release - new code bugs:

   - eth: bnxt: silence WARN() when device skips a timestamp, it happens

  Previous releases - regressions:

   - ipmr: fix null-deref when forwarding mcast packets

   - conntrack: evaluate window negotiation only for packets in the
     REPLY direction, otherwise SYN retransmissions trigger incorrect
     window scale negotiation

   - ipset: fix performance regression in swap operation

  Previous releases - always broken:

   - tcp: add sanity checks to types of pages getting into the rx
     zerocopy path, we only support basic NIC -> user, no page cache
     pages etc.

   - ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv()

   - nf_tables: more input sanitization changes

   - dsa: mt7530: fix 10M/100M speed on MediaTek MT7988 switch

   - bridge: mcast: fix loss of snooping after long uptime, jiffies do
     wrap on 32bit

   - xen-netback: properly sync TX responses, protect with locking

   - phy: mediatek-ge-soc: sync calibration values with MediaTek SDK,
     increase connection stability

   - eth: pds: fixes for various teardown, and reset races

  Misc:

   - hsr: silence WARN() if we can't alloc supervision frame, it
     happens"

* tag 'net-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
  doc/netlink/specs: Add missing attr in rt_link spec
  idpf: avoid compiler padding in virtchnl2_ptype struct
  selftests: mptcp: join: stop transfer when check is done (part 2)
  selftests: mptcp: join: stop transfer when check is done (part 1)
  selftests: mptcp: allow changing subtests prefix
  selftests: mptcp: decrease BW in simult flows
  selftests: mptcp: increase timeout to 30 min
  selftests: mptcp: add missing kconfig for NF Mangle
  selftests: mptcp: add missing kconfig for NF Filter in v6
  selftests: mptcp: add missing kconfig for NF Filter
  mptcp: fix data re-injection from stale subflow
  selftests: net: enable some more knobs
  selftests: net: add missing config for NF_TARGET_TTL
  selftests: forwarding: List helper scripts in TEST_FILES Makefile variable
  selftests: net: List helper scripts in TEST_FILES Makefile variable
  selftests: net: Remove executable bits from library scripts
  selftests: bonding: Check initial state
  selftests: team: Add missing config options
  hv_netvsc: Fix race condition between netvsc_probe and netvsc_remove
  xen-netback: properly sync TX responses
  ...

4 months agoMerge tag 'parisc-for-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/delle...
Linus Torvalds [Thu, 1 Feb 2024 20:32:43 +0000 (12:32 -0800)]
Merge tag 'parisc-for-6.8-rc3' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc architecture fixes from Helge Deller:
 "The current exception handler, which helps on kernel accesses to
  userspace, may exhibit data corruption. The problem is that it is not
  guaranteed that the compiler will use the processor register we
  specified in the source code, but may choose another register which
  then will lead to silent register- and data corruption. To fix this
  issue we now use another strategy to help the exception handler to
  always find and set the error code into the correct CPU register.

  The other fixes are small: fix CPU hotplug bringup, fix the page
  alignment of the RO_DATA section, add a check for the calculated
  cache stride and fix possible hangups when printing longer output at
  bootup when running on a serial console.

  Most of the patches are tagged for stable series.

   - Fix random data corruption triggered by exception handler

   - Fix crash when setting up BTLB at CPU bringup

   - Prevent hung tasks when printing inventory on serial console

   - Make RO_DATA page aligned in vmlinux.lds.S

   - Add check for valid cache stride size"

* tag 'parisc-for-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: BTLB: Fix crash when setting up BTLB at CPU bringup
  parisc: Fix random data corruption from exception handler
  parisc: Drop unneeded semicolon in parse_tree_node()
  parisc: Prevent hung tasks when printing inventory on serial console
  parisc: Check for valid stride size for cache flushes
  parisc: Make RO_DATA page aligned in vmlinux.lds.S

4 months agoMerge tag 'kbuild-fixes-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahi...
Linus Torvalds [Thu, 1 Feb 2024 19:57:42 +0000 (11:57 -0800)]
Merge tag 'kbuild-fixes-v6.8' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Fix UML build with clang-18 and newer

 - Avoid using the alias attribute in host programs

 - Replace tabs with spaces when followed by conditionals for future GNU
   Make versions

 - Fix rpm-pkg for the systemd-provided kernel-install tool

 - Fix the undefined behavior in Kconfig for a 'int' symbol used in a
   conditional

* tag 'kbuild-fixes-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kconfig: initialize sym->curr.tri to 'no' for all symbol types again
  kbuild: rpm-pkg: simplify installkernel %post
  kbuild: Replace tabs with spaces when followed by conditionals
  modpost: avoid using the alias attribute
  kbuild: fix W= flags in the help message
  modpost: Add '.ltext' and '.ltext.*' to TEXT_SECTIONS
  um: Fix adding '-no-pie' for clang
  kbuild: defconf: use SRCARCH to find merged configs

4 months agoMerge tag 'nfsd-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Linus Torvalds [Thu, 1 Feb 2024 19:48:13 +0000 (11:48 -0800)]
Merge tag 'nfsd-6.8-2' of git://git./linux/kernel/git/cel/linux

Pull nfsd fix from Chuck Lever:

 - Fix a recent backchannel timeout fix

* tag 'nfsd-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSv4.1: Assign the right value for initval and retries for rpc timeout

4 months agoMerge tag 'exfat-for-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/linkin...
Linus Torvalds [Thu, 1 Feb 2024 19:45:53 +0000 (11:45 -0800)]
Merge tag 'exfat-for-6.8-rc3' of git://git./linux/kernel/git/linkinjeon/exfat

Pull exfat fix from Namjae Jeon:

 - Fix BUG in iov_iter_revert reported from syzbot

* tag 'exfat-for-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: fix zero the unwritten part for dio read

4 months agoMerge tag 'hid-for-linus-2024020101' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 1 Feb 2024 18:19:34 +0000 (10:19 -0800)]
Merge tag 'hid-for-linus-2024020101' of git://git./linux/kernel/git/hid/hid

Pull HID fixes from Benjamin Tissoires:

 - cleanups in the error path in hid-steam (Dan Carpenter)

 - fixes for Wacom tablets selftests that sneaked in while the CI was
   taking a break during the year end holidays (Benjamin Tissoires)

 - null pointer check in nvidia-shield (Kunwu Chan)

 - memory leak fix in hidraw (Su Hui)

 - another null pointer fix in i2c-hid-of (Johan Hovold)

 - another memory leak fix in HID-BPF this time, as well as a double
   fdget() fix reported by Dan Carpenter (Benjamin Tissoires)

 - fix for Cirque touchpads when they go into suspend (Kai-Heng Feng)

 - new device ID in hid-logitech-hidpp: "Logitech G Pro X SuperLight 2"
   (Jiri Kosina)

* tag 'hid-for-linus-2024020101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: bpf: use __bpf_kfunc instead of noinline
  HID: bpf: actually free hdev memory after attaching a HID-BPF program
  HID: bpf: remove double fdget()
  HID: i2c-hid-of: fix NULL-deref on failed power up
  HID: hidraw: fix a problem of memory leak in hidraw_release()
  HID: i2c-hid: Skip SET_POWER SLEEP for Cirque touchpad on system suspend
  HID: nvidia-shield: Add missing null pointer checks to LED initialization
  HID: logitech-hidpp: add support for Logitech G Pro X Superlight 2
  selftests/hid: wacom: fix confidence tests
  HID: hid-steam: Fix cleanup in probe()
  HID: hid-steam: remove pointless error message

4 months agoMerge tag 'firewire-fixes-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 1 Feb 2024 18:12:53 +0000 (10:12 -0800)]
Merge tag 'firewire-fixes-6.8-rc3' of git://git./linux/kernel/git/ieee1394/linux1394

Pull firewire fixes from Takashi Sakamoto:
 "FireWire subsystem now supports the legacy layout of configuration
  ROM, while it appears that some DV devices from the early 2000s have
  the legacy layout with a quirk. This includes some changes to handle
  the quirk"

* tag 'firewire-fixes-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire: core: search descriptor leaf just after vendor directory entry in root directory
  firewire: core: correct documentation of fw_csr_string() kernel API

4 months agoMerge tag 'spi-fix-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Linus Torvalds [Thu, 1 Feb 2024 18:10:17 +0000 (10:10 -0800)]
Merge tag 'spi-fix-v6.8-rc2' of git://git./linux/kernel/git/broonie/spi

Pull spi fix from Mark Brown:
 "One simple fix for a minor but valid issue with constants overflowing
  identified via cppcheck"

* tag 'spi-fix-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: sh-msiof: avoid integer overflow in constants

4 months agoMerge tag 'regulator-fix-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 1 Feb 2024 18:06:55 +0000 (10:06 -0800)]
Merge tag 'regulator-fix-v6.8-rc2' of git://git./linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "The main set of fixes here are for the PWM regulator, fixing
  bootstrapping issues on some platforms where the hardware setup looked
  like it was out of spec for the constraints we have for the regulator
  causing us to make spurious and unhelpful changes to try to bring
  things in line with the constraints.

  There's also a couple of other driver specific fixes"

* tag 'regulator-fix-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator (max5970): Fix IRQ handler
  regulator: ti-abb: don't use devm_platform_ioremap_resource_byname for shared interrupt register
  regulator: pwm-regulator: Manage boot-on with disabled PWM channels
  regulator: pwm-regulator: Calculate the output voltage for disabled PWMs
  regulator: pwm-regulator: Add validity checks in continuous .get_voltage

4 months agoMerge tag 'v6.8-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Thu, 1 Feb 2024 18:02:40 +0000 (10:02 -0800)]
Merge tag 'v6.8-p2' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "Fix regressions in caam and qat"

* tag 'v6.8-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: caam - fix asynchronous hash
  crypto: qat - fix arbiter mapping generation algorithm for QAT 402xx

4 months ago Merge tag 'lsm-pr-20240131' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Linus Torvalds [Thu, 1 Feb 2024 18:00:28 +0000 (10:00 -0800)]
Merge tag 'lsm-pr-20240131' of git://git./linux/kernel/git/pcmoore/lsm

Pull lsm fixes from Paul Moore:
 "Two small patches to fix some problems relating to LSM hook return
  values and how the individual LSMs interact"

* tag 'lsm-pr-20240131' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  lsm: fix default return value of the socket_getpeersec_*() hooks
  lsm: fix the logic in security_inode_getsecctx()

4 months ago Merge tag 'batadv-net-pullrequest-20240201' of git://git.open-mesh.org/linux-merge
Jakub Kicinski [Thu, 1 Feb 2024 17:25:53 +0000 (09:25 -0800)]
Merge tag 'batadv-net-pullrequest-20240201' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
Here are some batman-adv bugfixes:

 - fix a timeout issue and a memory leak in batman-adv multicast,
   by Linus Lüssing (2 patches)

* tag 'batadv-net-pullrequest-20240201' of git://git.open-mesh.org/linux-merge:
  batman-adv: mcast: fix memory leak on deleting a batman-adv interface
  batman-adv: mcast: fix mcast packet type counter on timeouted nodes
====================

Link: https://lore.kernel.org/r/20240201110110.29129-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago doc/netlink/specs: Add missing attr in rt_link spec
Donald Hunter [Thu, 1 Feb 2024 11:38:53 +0000 (11:38 +0000)]
doc/netlink/specs: Add missing attr in rt_link spec

IFLA_DPLL_PIN was added to rt_link messages but not to the spec, which
breaks ynl. Add the missing definitions to the rt_link ynl spec.

Fixes: 5f1842692880 ("netdev: expose DPLL pin handle for netdevice")
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201113853.37432-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago Merge tag 'nf-24-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Thu, 1 Feb 2024 17:14:13 +0000 (09:14 -0800)]
Merge tag 'nf-24-01-31' of git://git./linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) TCP conntrack now only evaluates window negotiation for packets in
   the REPLY direction. Otherwise SYN retransmissions trigger incorrect
   window scale negotiation. From Ryan Schaefer.

2) Restrict tunnel objects to NFPROTO_NETDEV which is where it makes sense
   to use this object type.

3) Fix conntrack pick up from the middle of SCTP_CID_SHUTDOWN_ACK packets.
   From Xin Long.

4) Another attempt from Jozsef Kadlecsik to address the slowdown of the
   swap command in ipset.

5) Replace a BUG_ON by WARN_ON_ONCE in nf_log, and consolidate check for
   the case that the logger is NULL from the read side lock section.

6) Address lack of sanitization for custom expectations. Restrict layer 3
   and 4 families to what is supported by userspace.

* tag 'nf-24-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations
  netfilter: nf_log: replace BUG_ON by WARN_ON_ONCE when putting logger
  netfilter: ipset: fix performance regression in swap operation
  netfilter: conntrack: check SCTP_CID_SHUTDOWN_ACK for vtag setting in sctp_new
  netfilter: nf_tables: restrict tunnel object to NFPROTO_NETDEV
  netfilter: conntrack: correct window scaling with retransmitted SYN
====================

Link: https://lore.kernel.org/r/20240131225943.7536-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago idpf: avoid compiler padding in virtchnl2_ptype struct
Pavan Kumar Linga [Wed, 31 Jan 2024 22:22:40 +0000 (14:22 -0800)]
idpf: avoid compiler padding in virtchnl2_ptype struct

In the arm random config file, kconfig option 'CONFIG_AEABI' is
disabled which results in adding the compiler flag '-mabi=apcs-gnu'.
This causes the compiler to add padding in virtchnl2_ptype
structure to align it to 8 bytes, resulting in the following
size check failure:

include/linux/build_bug.h:78:41: error: static assertion failed: "(6) == sizeof(struct virtchnl2_ptype)"
      78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
         |                                         ^~~~~~~~~~~~~~
include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
      77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
         |                                  ^~~~~~~~~~~~~~~
drivers/net/ethernet/intel/idpf/virtchnl2.h:26:9: note: in expansion of macro 'static_assert'
      26 |         static_assert((n) == sizeof(struct X))
         |         ^~~~~~~~~~~~~
drivers/net/ethernet/intel/idpf/virtchnl2.h:982:1: note: in expansion of macro 'VIRTCHNL2_CHECK_STRUCT_LEN'
     982 | VIRTCHNL2_CHECK_STRUCT_LEN(6, virtchnl2_ptype);
         | ^~~~~~~~~~~~~~~~~~~~~~~~~~

Avoid the compiler padding by using "__packed" structure
attribute for the virtchnl2_ptype struct. Also align the
structure by using "__aligned(2)" for better code optimization.
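
The effect of padding on a size check can be sketched outside the kernel.
The snippet below is only an analogy using Python's struct module, not the
driver code: the field list (u16, u8, u8, u16) and the helper
check_struct_len() are hypothetical stand-ins chosen to give a 6-byte
layout. The "<" prefix asks for a layout without alignment padding, which
plays the role of "__packed", while "@" follows the native ABI rules.

```python
import struct

# Hypothetical 6-byte layout: u16, u8, u8, u16 (not the real struct fields).
FIELDS = "HBBH"

# "<" uses standard sizes with no alignment padding (the "__packed" analogue).
packed_size = struct.calcsize("<" + FIELDS)

# "@" uses the native ABI's alignment, so the result may be larger on some ABIs.
native_size = struct.calcsize("@" + FIELDS)


def check_struct_len(expected, fmt):
    """Loose analogue of a static size assertion: fail when sizes differ."""
    actual = struct.calcsize(fmt)
    if actual != expected:
        raise ValueError(f"({expected}) == sizeof({fmt!r}) failed, got {actual}")
    return actual


# Passes for the packed layout regardless of the host ABI.
check_struct_len(6, "<" + FIELDS)
```

The packed layout keeps the size check independent of the ABI selected by
the compiler flags, which is the property the fix relies on.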

Fixes: 0d7502a9b4a7 ("virtchnl: add virtchnl version 2 ops")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312220250.ufEm8doQ-lkp@intel.com
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20240131222241.2087516-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago Merge branch 'mptcp-fixes-for-recent-issues-reported-by-ci-s'
Jakub Kicinski [Thu, 1 Feb 2024 17:06:40 +0000 (09:06 -0800)]
Merge branch 'mptcp-fixes-for-recent-issues-reported-by-ci-s'

Matthieu Baerts says:

====================
mptcp: fixes for recent issues reported by CI's

This series of 9 patches fixes issues mostly identified by CI's not
managed by the MPTCP maintainers. Thank you Linaro (LKFT) and Netdev
maintainers (NIPA) for running our kunit and selftests tests!

For the first patch, it took a bit of time to identify the root cause.
Some MPTCP Join selftest subtests have been "flaky", mostly in slow
environments. It appears to be due to the use of a TCP-specific helper
on an MPTCP socket. A fix for kernels >= v5.15.

Patches 2 to 4 add missing kernel config to support NetFilter tables
needed for IPTables commands. These kconfigs are usually enabled in
default configurations, but apparently not for all architectures.
Patches 2 and 3 can be backported up to v5.11 and the 4th one up to
v5.19.

Patch 5 increases the time limit for MPTCP selftests. It appears that
many CI's execute tests in a VM without acceleration support, e.g. QEmu
without KVM. As a result, the tests take longer. Plus, there are more
and more tests. This patch modifies the timeout added in v5.18.

Patch 6 reduces the maximum rate and delay of the different links in
some Simult Flows selftest subtests. The goal is to let slow VMs reach
the maximum speed. The original rate was introduced in v5.11.

Patch 7 lets CIs change the prefix of the subtest titles, to be able
to run the same selftest multiple times with different parameters. With
different titles, tests will be considered as different and will not
override previous results, as is the case with some CI envs. Subtests
have been introduced in v6.6.

Patches 8 and 9 make some MPTCP Join selftest subtests quicker by stopping
the transfer when the expected events have been seen. Patch 8 can be
backported up to v6.5.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
====================

Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-0-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: mptcp: join: stop transfer when check is done (part 2)
Matthieu Baerts (NGI0) [Wed, 31 Jan 2024 21:49:54 +0000 (22:49 +0100)]
selftests: mptcp: join: stop transfer when check is done (part 2)

Since the "Fixes" commits mentioned below, the newly added "userspace
pm" subtests of mptcp_join selftests launch the whole transfer in
the background, do the required checks, then wait for the end of
the transfer.

There is no need to wait longer, especially because the checks at the
end of the transfer are ignored (which is fine). This saves quite a few
seconds on slow environments.

While at it, use 'mptcp_lib_kill_wait()' helper everywhere, instead of
on a specific one with 'kill_tests_wait()'.
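
The "stop the transfer once the checks are done" pattern can be sketched
as follows. This is a Python illustration, not the selftest's shell code:
kill_wait() is a hypothetical analogue of the 'mptcp_lib_kill_wait()'
helper, and the sleeping child process stands in for the real transfer.

```python
import subprocess
import sys
import time


def kill_wait(proc, timeout=5.0):
    """Terminate a background process and reap it (hypothetical helper)."""
    if proc.poll() is None:
        proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Escalate if the process ignores the termination request.
        proc.kill()
        proc.wait()
    return proc.returncode


def run_checks_then_stop(transfer_seconds=30):
    """Start a long 'transfer', run the checks, then stop it early."""
    cmd = [sys.executable, "-c", f"import time; time.sleep({transfer_seconds})"]
    proc = subprocess.Popen(cmd)
    start = time.monotonic()
    # Placeholder for the real checks: the transfer is still running.
    checks_ok = proc.poll() is None
    # Do not wait for the end of the transfer; stop it right away.
    kill_wait(proc)
    return checks_ok, time.monotonic() - start
```

Because the results at the end of the transfer are ignored anyway, nothing
is lost by stopping early; only the waiting time goes away.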

Fixes: b2e2248f365a ("selftests: mptcp: userspace pm create id 0 subflow")
Fixes: e3b47e460b4b ("selftests: mptcp: userspace pm remove initial subflow")
Fixes: b9fb176081fb ("selftests: mptcp: userspace pm send RM_ADDR for ID 0")
Cc: stable@vger.kernel.org
Reviewed-and-tested-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-9-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: mptcp: join: stop transfer when check is done (part 1)
Matthieu Baerts (NGI0) [Wed, 31 Jan 2024 21:49:53 +0000 (22:49 +0100)]
selftests: mptcp: join: stop transfer when check is done (part 1)

Since the "Fixes" commit mentioned below, "userspace pm" subtests of
mptcp_join selftests introduced in v6.5 launch the whole transfer
in the background, do the required checks, then wait for the end of
the transfer.

There is no need to wait longer, especially because the checks at the
end of the transfer are ignored (which is fine). This saves quite a few
seconds in slow environments.

Note that old versions will need commit bdbef0a6ff10 ("selftests: mptcp:
add mptcp_lib_kill_wait") as well to get 'mptcp_lib_kill_wait()' helper.

Fixes: 4369c198e599 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org # 6.5.x: bdbef0a6ff10: selftests: mptcp: add mptcp_lib_kill_wait
Cc: stable@vger.kernel.org # 6.5.x
Reviewed-and-tested-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-8-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: mptcp: allow changing subtests prefix
Matthieu Baerts (NGI0) [Wed, 31 Jan 2024 21:49:52 +0000 (22:49 +0100)]
selftests: mptcp: allow changing subtests prefix

If a CI executes the same selftest multiple times with different
options, all results from the same subtests will have the same title,
which confuses the CI. With the same title printed in TAP, the tests are
considered as the same ones.

Now, it is possible to override this prefix by using MPTCP_LIB_KSFT_TEST
env var, and have a different title.

While at it, use 'basename' to remove the suffix as well instead of
using an extra 'sed'.
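
The override logic can be sketched like this. It is a Python illustration
under assumptions, not the shell library itself: MPTCP_LIB_KSFT_TEST is
the variable named above, while subtest_prefix() and the fallback to the
script name without its ".sh" suffix are hypothetical.

```python
import os


def subtest_prefix(script_path, environ=None):
    """Return the prefix to use in subtest titles.

    The environment override wins; otherwise fall back to the script's
    base name without the ".sh" suffix (assumed default behaviour).
    """
    env = os.environ if environ is None else environ
    override = env.get("MPTCP_LIB_KSFT_TEST")
    if override:
        return override
    name = os.path.basename(script_path)
    # Similar in spirit to: basename "$0" .sh
    return name[:-3] if name.endswith(".sh") else name
```

A CI running the same script twice could then set a different value for
each run so the TAP titles no longer collide.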

Fixes: c4192967e62f ("selftests: mptcp: lib: format subtests results in TAP")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-7-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: mptcp: decrease BW in simult flows
Matthieu Baerts (NGI0) [Wed, 31 Jan 2024 21:49:51 +0000 (22:49 +0100)]
selftests: mptcp: decrease BW in simult flows

When running the simult_flow selftest in slow environments (e.g. QEmu
without KVM support), the results can be unstable. This selftest
checks if the aggregated bandwidth is (almost) fully used as expected.

To help improve the stability while still keeping the same validation
in place, the BW and the delay are reduced to lower the pressure on the
CPU.

Fixes: 1a418cb8e888 ("mptcp: simult flow self-tests")
Fixes: 219d04992b68 ("mptcp: push pending frames when subflow has free space")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-6-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: mptcp: increase timeout to 30 min
Matthieu Baerts (NGI0) [Wed, 31 Jan 2024 21:49:50 +0000 (22:49 +0100)]
selftests: mptcp: increase timeout to 30 min

In very slow environments (e.g. when QEmu is used without KVM), the
mptcp_join.sh selftest can take a bit more than 20 minutes. Bump the
default timeout by 50% as it seems normal to take that long in some
environments.

When a debug kernel config is used, this selftest will take even longer,
but that's certainly not a common test env to consider for the timeout.

The Fixes tag picked here is there simply to help get this patch
backported to older stable versions. It is difficult to point to the
exact commit that made some environments reach the timeout from time to
time.

Fixes: d17b968b9876 ("selftests: mptcp: increase timeout to 20 minutes")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-5-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: mptcp: add missing kconfig for NF Mangle
Matthieu Baerts (NGI0) [Wed, 31 Jan 2024 21:49:49 +0000 (22:49 +0100)]
selftests: mptcp: add missing kconfig for NF Mangle

Since the commit mentioned below, the 'mptcp_join' selftest uses
IPTables to add rules to the Mangle table, only in IPv4.

This KConfig is usually enabled by default in many defconfigs, but we
recently noticed that some CIs were running our selftests without it
enabled.

Fixes: b6e074e171bc ("selftests: mptcp: add infinite map testcase")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-4-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: mptcp: add missing kconfig for NF Filter in v6
Matthieu Baerts (NGI0) [Wed, 31 Jan 2024 21:49:48 +0000 (22:49 +0100)]
selftests: mptcp: add missing kconfig for NF Filter in v6

Since the commit mentioned below, the 'mptcp_join' selftest uses
IPTables to add rules to the Filter table for IPv6.

It is then required to have IP6_NF_FILTER KConfig.

This KConfig is usually enabled by default in many defconfigs, but we
recently noticed that some CIs were running our selftests without it
enabled.

Fixes: 523514ed0a99 ("selftests: mptcp: add ADD_ADDR IPv6 test cases")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-3-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: mptcp: add missing kconfig for NF Filter
Matthieu Baerts (NGI0) [Wed, 31 Jan 2024 21:49:47 +0000 (22:49 +0100)]
selftests: mptcp: add missing kconfig for NF Filter

Since the commit mentioned below, the 'mptcp_join' selftest uses
IPTables to add rules to the Filter table.

It is then required to have IP_NF_FILTER KConfig.

This KConfig is usually enabled by default in many defconfigs, but we
recently noticed that some CIs were running our selftests without it
enabled.
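
A CI could detect this situation before running the tests. The following
Python sketch is only an illustration: missing_options() is a hypothetical
helper, and the required list holds just the two option names mentioned in
these commit messages (IP_NF_FILTER and IP6_NF_FILTER). It scans the text
of a kernel .config and reports which options are neither built in nor
built as modules.

```python
# Option names taken from the commit messages; extend as needed.
REQUIRED = ("IP_NF_FILTER", "IP6_NF_FILTER")


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' or 'm'."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.startswith("CONFIG_") and value in ("y", "m"):
            enabled.add(name[len("CONFIG_"):])
    return [opt for opt in required if opt not in enabled]


sample = "CONFIG_IP_NF_FILTER=m\n# CONFIG_IP6_NF_FILTER is not set\n"
print(missing_options(sample))
```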

Fixes: 8d014eaa9254 ("selftests: mptcp: add ADD_ADDR timeout test case")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago mptcp: fix data re-injection from stale subflow
Paolo Abeni [Wed, 31 Jan 2024 21:49:46 +0000 (22:49 +0100)]
mptcp: fix data re-injection from stale subflow

When the MPTCP PM detects that a subflow is stale, the packet
scheduler must re-inject all the mptcp-level unacked data. To avoid
acquiring unneeded locks, it first tries to check if any unacked data
is present at all in the RTX queue, but such check is currently
broken, as it uses a TCP-specific helper on an MPTCP socket.

Funnily enough, fuzzers and static checkers are happy, as the accessed
memory still belongs to the mptcp_sock struct, and even from a
functional perspective the recovery completed successfully, as
the short-cut test always failed.

A recent unrelated TCP change - commit d5fed5addb2b ("tcp: reorganize
tcp_sock fast path variables") - exposed the issue, as the tcp field
reorganization makes the mptcp code always skip the re-injection.

Fix the issue by dropping the bogus call: we are on a slow path, the
early optimization proved once again to be evil.

Fixes: 1e1d9d6f119c ("mptcp: handle pending data on closed subflow")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/468
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-1-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: net: enable some more knobs
Paolo Abeni [Wed, 31 Jan 2024 17:52:29 +0000 (18:52 +0100)]
selftests: net: enable some more knobs

The rtnetlink tests require additional options currently
off by default.

Fixes: 2766a11161cc ("selftests: rtnetlink: add ipsec offload API test")
Fixes: 5e596ee171ba ("selftests: add xfrm state-policy-monitor to rtnetlink.sh")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/9048ca58e49b962f35dba1dfb2beaf3dab3e0411.1706723341.git.pabeni@redhat.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: net: add missing config for NF_TARGET_TTL
Jakub Kicinski [Wed, 31 Jan 2024 16:56:05 +0000 (08:56 -0800)]
selftests: net: add missing config for NF_TARGET_TTL

amt test uses the TTL iptables module:

  ip netns exec "${RELAY}" iptables -t mangle -I PREROUTING \
   -d 239.0.0.1 -j TTL --ttl-set 2

Fixes: c08e8baea78e ("selftests: add amt interface selftest script")
Link: https://lore.kernel.org/r/20240131165605.4051645-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago Merge branch 'selftests-net-more-small-fixes'
Jakub Kicinski [Thu, 1 Feb 2024 16:36:39 +0000 (08:36 -0800)]
Merge branch 'selftests-net-more-small-fixes'

Benjamin Poirier says:

====================
selftests: net: More small fixes

Some small fixes for net selftests which follow from these recent commits:
dd2d40acdbb2 ("selftests: bonding: Add more missing config options")
49078c1b80b6 ("selftests: forwarding: Remove executable bits from lib.sh")
====================

Link: https://lore.kernel.org/r/20240131140848.360618-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: forwarding: List helper scripts in TEST_FILES Makefile variable
Benjamin Poirier [Wed, 31 Jan 2024 14:08:48 +0000 (09:08 -0500)]
selftests: forwarding: List helper scripts in TEST_FILES Makefile variable

Some scripts are not tests themselves; they contain utility functions used
by other tests. According to Documentation/dev-tools/kselftest.rst, such
files should be listed in TEST_FILES. Currently they are incorrectly listed
in TEST_PROGS_EXTENDED so rename the variable.

Fixes: c085dbfb1cfc ("selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED")
Suggested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20240131140848.360618-6-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: net: List helper scripts in TEST_FILES Makefile variable
Benjamin Poirier [Wed, 31 Jan 2024 14:08:47 +0000 (09:08 -0500)]
selftests: net: List helper scripts in TEST_FILES Makefile variable

Some scripts are not tests themselves; they contain utility functions used
by other tests. According to Documentation/dev-tools/kselftest.rst, such
files should be listed in TEST_FILES. Move those utility scripts to
TEST_FILES.

Fixes: 1751eb42ddb5 ("selftests: net: use TEST_PROGS_EXTENDED")
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Fixes: b99ac1841147 ("kselftests/net: add missed setup_loopback.sh/setup_veth.sh to Makefile")
Fixes: f5173fe3e13b ("selftests: net: included needed helper in the install targets")
Suggested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20240131140848.360618-5-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months ago selftests: net: Remove executable bits from library scripts
Benjamin Poirier [Wed, 31 Jan 2024 14:08:46 +0000 (09:08 -0500)]
selftests: net: Remove executable bits from library scripts

setup_loopback.sh and net_helper.sh are meant to be sourced from other
scripts, not executed directly. Therefore, remove the executable bits from
those files' permissions.

This change is similar to commit 49078c1b80b6 ("selftests: forwarding:
Remove executable bits from lib.sh")
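
For illustration, clearing the executable bits while leaving the other
permission bits untouched can be sketched as below. This is a Python
sketch rather than the change itself (the commit only alters file modes in
the tree), and clear_exec_bits() is a hypothetical helper name.

```python
import os
import stat


def clear_exec_bits(path):
    """Drop user/group/other execute permission, keep everything else."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    new_mode = mode & ~(stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    os.chmod(path, new_mode)
    return new_mode
```

A file that was mode 0755 ends up as 0644, so it can still be sourced by
other scripts but is no longer picked up as something to run directly.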

Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
Fixes: 3bdd9fd29cb0 ("selftests/net: synchronize udpgro tests' tx and rx connection")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20240131140848.360618-4-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>