linux-2.6-block.git
Kory Maincent [Mon, 28 Oct 2024 13:23:51 +0000 (14:23 +0100)]
Documentation: networking: Add missing PHY_GET command in the message list

ETHTOOL_MSG_PHY_GET/GET_REPLY/NTF is missing in the ethtool message list.
Add it to the ethtool netlink documentation.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20241028132351.75922-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 30 Oct 2024 01:50:57 +0000 (18:50 -0700)]
Merge tag 'wireless-next-2024-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.13

The first -next "new features" pull request for v6.13. This is a big
one as we have not been able to send one earlier. We also have some
patches affecting other subsystems: in staging we deleted the rtl8192e
driver and in debugfs added a new interface to save struct
file_operations memory; both were acked by GregKH.

Because of the lib80211/libipw move there were quite a lot of
conflicts and to solve those we decided to merge net-next into
wireless-next.

Major changes:

cfg80211/mac80211
 * stop exporting wext symbols
 * new mac80211 op to indicate that a new interface is to be added
 * support radio separation of multi-band devices

Wireless Extensions
 * move wext spy implementation to libiw
 * remove iw_public_data from struct net_device

brcmfmac
 * optional LPO clock support

ipw2x00
 * move remaining lib80211 code into libiw

wilc1000
 * WILC3000 support

rtw89
 * RTL8852BE and RTL8852BE-VT BT-coexistence improvements

* tag 'wireless-next-2024-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (126 commits)
  mac80211: Remove NOP call to ieee80211_hw_config
  wifi: iwlwifi: work around -Wenum-compare-conditional warning
  wifi: mac80211: re-order assigning channel in activate links
  wifi: mac80211: convert debugfs files to short fops
  debugfs: add small file operations for most files
  wifi: mac80211: remove misleading j_0 construction parts
  wifi: mac80211_hwsim: use hrtimer_active()
  wifi: mac80211: refactor BW limitation check for CSA parsing
  wifi: mac80211: filter on monitor interfaces based on configured channel
  wifi: mac80211: refactor ieee80211_rx_monitor
  wifi: mac80211: add support for the monitor SKIP_TX flag
  wifi: cfg80211: add monitor SKIP_TX flag
  wifi: mac80211: add flag to opt out of virtual monitor support
  wifi: cfg80211: pass net_device to .set_monitor_channel
  wifi: mac80211: remove status->ampdu_delimiter_crc
  wifi: cfg80211: report per wiphy radio antenna mask
  wifi: mac80211: use vif radio mask to limit creating chanctx
  wifi: mac80211: use vif radio mask to limit ibss scan frequencies
  wifi: cfg80211: add option for vif allowed radios
  wifi: iwlwifi: allow IWL_FW_CHECK() with just a string
  ...

====================

Link: https://patch.msgid.link/20241025170705.5F6B2C4CEC3@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 29 Oct 2024 23:52:59 +0000 (16:52 -0700)]
Merge branch 'devlink-minor-cleanup'

Przemek Kitszel says:

====================
devlink: minor cleanup

(Patch 1, 2) Add one helper shortcut to put u64 values into skb.
(Patch 3, 4) Minor cleanup for error codes.
(Patch 5, 6, 7) Remove some devlink_resource_*() usage and the functions
themselves by replacing devlink_* variants with devl_* ones.

v2: fix metadata (cc list, target tree) - Jiri; rebase; tags collected

v1: https://lore.kernel.org/20241018102009.10124-1-przemyslaw.kitszel@intel.com
====================

Link: https://patch.msgid.link/20241023131248.27192-1-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Przemek Kitszel [Wed, 23 Oct 2024 13:09:07 +0000 (15:09 +0200)]
devlink: remove unused devlink_resource_register()

Remove the unused devlink_resource_register(); all drivers use the
devl_resource_register() variant instead.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241023131248.27192-8-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Przemek Kitszel [Wed, 23 Oct 2024 13:09:06 +0000 (15:09 +0200)]
devlink: remove unused devlink_resource_occ_get_register() and _unregister()

Remove the unused devlink_resource_occ_get_register() and
devlink_resource_occ_get_unregister() functions; current devlink resource
users are fine with the devl_ variants of the two.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241023131248.27192-7-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Przemek Kitszel [Wed, 23 Oct 2024 13:09:05 +0000 (15:09 +0200)]
net: dsa: replace devlink resource registration calls by devl_ variants

Replace devlink_resource_register(), devlink_resource_occ_get_register(),
and devlink_resource_occ_get_unregister() calls with their respective devl_*
variants. The mentioned functions have no direct users in any driver and are
going to be removed in subsequent patches.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241023131248.27192-6-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Przemek Kitszel [Wed, 23 Oct 2024 13:09:04 +0000 (15:09 +0200)]
devlink: region: snapshot IDs: consolidate error values

Consolidate the error codes returned when a message is too big.

Current code is written to return -EINVAL when tailroom in the skb msg
would be exhausted precisely when it's time to nest, and return -EMSGSIZE
in all other "not enough space" conditions.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241023131248.27192-5-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Przemek Kitszel [Wed, 23 Oct 2024 13:09:03 +0000 (15:09 +0200)]
devlink: devl_resource_register(): differentiate error codes

Differentiate error codes of devl_resource_register().

Replace one of the -EINVAL exit paths with -EEXIST. This should aid developers
introducing new resources and registering them in the wrong order.
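
For illustration only, a userspace sketch of why distinct error codes help.
It assumes the duplicate-ID path is the one that now returns -EEXIST (the
message above does not say which path changed). struct res_table,
res_find(), res_register() and RES_PARENT_TOP are hypothetical stand-ins,
not the devlink API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RES 8
#define RES_PARENT_TOP 0	/* hypothetical "no parent" marker; IDs start at 1 */

struct res {
	uint64_t id;
	uint64_t parent_id;
};

struct res_table {
	struct res items[MAX_RES];
	size_t count;
};

static const struct res *res_find(const struct res_table *t, uint64_t id)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->items[i].id == id)
			return &t->items[i];
	return NULL;
}

/* Register a resource, reporting each failure cause with its own code. */
static int res_register(struct res_table *t, uint64_t id, uint64_t parent_id)
{
	assert(t != NULL);

	if (id == RES_PARENT_TOP)
		return -EINVAL;		/* reserved ID */
	if (res_find(t, id))
		return -EEXIST;		/* same ID registered twice */
	if (parent_id != RES_PARENT_TOP && !res_find(t, parent_id))
		return -EINVAL;		/* parent missing: wrong order */
	if (t->count == MAX_RES)
		return -ENOSPC;		/* table full */

	t->items[t->count++] = (struct res){ .id = id, .parent_id = parent_id };
	return 0;
}
```

With this shape a caller can tell a child registered before its parent
(-EINVAL) apart from a second registration of the same ID (-EEXIST).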

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241023131248.27192-4-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Przemek Kitszel [Wed, 23 Oct 2024 13:09:02 +0000 (15:09 +0200)]
devlink: use devlink_nl_put_u64() helper

Use the devlink_nl_put_u64() shortcut added by the previous commit throughout devlink/.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241023131248.27192-3-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Przemek Kitszel [Wed, 23 Oct 2024 13:09:01 +0000 (15:09 +0200)]
devlink: introduce devlink_nl_put_u64()

Add devlink_nl_put_u64() that abstracts padding for u64 values.
All u64 values are passed with the very same padding option.
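
As a rough userspace sketch of the idea (not the kernel code): a generic
"put u64 with padding" routine takes the pad attribute type on every call,
and a thin wrapper fixes that argument once. struct msg_buf, ATTR_PAD,
ATTR_SIZE and the msg_*() helpers are hypothetical stand-ins for the skb
and the netlink attribute API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an skb-backed netlink message. */
struct msg_buf {
	unsigned char data[64];
	size_t len;
};

/* Hypothetical attribute types; not the real devlink attribute enum. */
enum { ATTR_PAD = 1, ATTR_SIZE = 2 };

#define ATTR_HDRLEN 4u

/* Append one attribute: 2-byte length, 2-byte type, then the payload. */
static void msg_append(struct msg_buf *m, uint16_t type,
		       const void *payload, uint16_t plen)
{
	uint16_t total = (uint16_t)(ATTR_HDRLEN + plen);

	memcpy(m->data + m->len, &total, sizeof(total));
	memcpy(m->data + m->len + 2, &type, sizeof(type));
	if (plen)
		memcpy(m->data + m->len + ATTR_HDRLEN, payload, plen);
	m->len += total;
}

/* Generic form: every caller has to name the pad attribute. */
static int msg_put_u64_padded(struct msg_buf *m, uint16_t type,
			      uint64_t val, uint16_t padtype)
{
	int need_pad;
	size_t needed;

	assert(m != NULL);

	/* The u64 payload should start at an 8-byte offset in the message. */
	need_pad = (m->len + ATTR_HDRLEN) % 8 != 0;
	needed = (need_pad ? ATTR_HDRLEN : 0) + ATTR_HDRLEN + sizeof(val);
	if (m->len + needed > sizeof(m->data))
		return -EMSGSIZE;

	if (need_pad)
		msg_append(m, padtype, NULL, 0);	/* empty pad attribute */
	msg_append(m, type, &val, sizeof(val));
	return 0;
}

/* Shortcut in the spirit of devlink_nl_put_u64(): the pad type is fixed. */
static int msg_put_u64(struct msg_buf *m, uint16_t type, uint64_t val)
{
	return msg_put_u64_padded(m, type, val, ATTR_PAD);
}
```

Callers then write msg_put_u64(&m, ATTR_SIZE, value) and never repeat the
pad argument.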

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241023131248.27192-2-przemyslaw.kitszel@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Thu, 24 Oct 2024 20:48:59 +0000 (22:48 +0200)]
r8169: fix inconsistent indenting in rtl8169_get_eth_mac_stats

This fixes inconsistent indentation introduced with e3fc5139bd8f
("r8169: implement additional ethtool stats ops").

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202410220413.1gAxIJ4t-lkp@intel.com/
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20fd6f39-3c1b-4af0-9adc-7d1f49728fad@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Thu, 24 Oct 2024 20:14:58 +0000 (13:14 -0700)]
socket: Print pf->create() when it does not clear sock->sk on failure.

I suggested putting DEBUG_NET_WARN_ON_ONCE() in __sock_create() to
catch a possible use-after-free.

But the warning itself was not useful because our interest is in
the callee rather than the caller.

Let's define DEBUG_NET_WARN_ONCE() and print the name of pf->create()
and the socket identifier.

While at it, we enclose DEBUG_NET_WARN_ON_ONCE() in parentheses too
to avoid a checkpatch error.

Note that %pf and %pF are obsolete and will be removed later, as per the
comment in lib/vsprintf.c.

Link: https://lore.kernel.org/netdev/202410231427.633734b3-lkp@intel.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241024201458.49412-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Thu, 24 Oct 2024 20:42:33 +0000 (22:42 +0200)]
r8169: add support for RTL8125D

This adds support for new chip version RTL8125D, which can be found on
boards like Gigabyte X870E AORUS ELITE WIFI7. Firmware rtl8125d-1.fw
for this chip version is available in linux-firmware already.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/d0306912-e88e-4c25-8b5d-545ae8834c0c@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Thu, 24 Oct 2024 19:55:34 +0000 (12:55 -0700)]
net: qlogic: use ethtool string helpers

The latter is the preferred way to copy ethtool strings.

Avoids manually incrementing the pointer. Cleans up the code quite well.
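
A minimal userspace sketch of the pattern, for readers unfamiliar with it:
each string occupies a fixed-size slot and the helper advances the cursor,
so callers no longer add the slot length by hand. gstring_puts() and
gstring_sprintf() are hypothetical analogues written for this note;
GSTRING_LEN mirrors the kernel's ETH_GSTRING_LEN of 32.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define GSTRING_LEN 32	/* mirrors ETH_GSTRING_LEN */

/* Copy a fixed string into the current slot and advance to the next one. */
static void gstring_puts(unsigned char **data, const char *str)
{
	assert(data != NULL && *data != NULL && str != NULL);

	memset(*data, 0, GSTRING_LEN);
	snprintf((char *)*data, GSTRING_LEN, "%s", str);
	*data += GSTRING_LEN;
}

/* Format into the current slot and advance to the next one. */
static void gstring_sprintf(unsigned char **data, const char *fmt, ...)
{
	va_list ap;

	assert(data != NULL && *data != NULL && fmt != NULL);

	memset(*data, 0, GSTRING_LEN);
	va_start(ap, fmt);
	vsnprintf((char *)*data, GSTRING_LEN, fmt, ap);
	va_end(ap);
	*data += GSTRING_LEN;
}
```

A caller keeps one cursor, for example gstring_puts(&p, "rx_packets")
followed by gstring_sprintf(&p, "tx_queue_%d_bytes", i), instead of
open-coding "p += ETH_GSTRING_LEN" after every copy.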

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241024195534.176410-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Thu, 24 Oct 2024 19:58:33 +0000 (12:58 -0700)]
net: marvell: use ethtool string helpers

The latter is the preferred way to copy ethtool strings.

Avoids manually incrementing the pointer. Cleans up the code quite well.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241024195833.176843-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Caleb Sander Mateos [Wed, 23 Oct 2024 20:51:12 +0000 (14:51 -0600)]
mlx5: simplify EQ interrupt polling logic

Use a while loop in mlx5_eq_comp_int() and mlx5_eq_async_int() to
clarify the EQE polling logic. This consolidates the next_eqe_sw() calls
for the first and subsequent iterations. It also avoids a goto. Turn the
num_eqes < MLX5_EQ_POLLING_BUDGET check into a break condition.
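
The resulting loop shape can be sketched in userspace roughly as follows.
This is an illustration, not the driver code: struct eq, next_eqe() and
eq_poll() are hypothetical, and POLLING_BUDGET stands in for
MLX5_EQ_POLLING_BUDGET with an arbitrary value.

```c
#include <assert.h>
#include <stddef.h>

#define POLLING_BUDGET 4	/* arbitrary value for the sketch */

struct eq {
	const int *entries;	/* pending event payloads */
	size_t count;		/* number of entries produced so far */
	size_t ci;		/* consumer index */
	int handled_sum;	/* stands in for "work done per event" */
};

/* Return the next pending entry, or NULL when the queue is drained. */
static const int *next_eqe(const struct eq *eq)
{
	return eq->ci < eq->count ? &eq->entries[eq->ci] : NULL;
}

/* One fetch site serves the first and all later iterations; no goto. */
static int eq_poll(struct eq *eq)
{
	const int *eqe;
	int num_eqes = 0;

	assert(eq != NULL);

	while ((eqe = next_eqe(eq))) {
		eq->handled_sum += *eqe;
		eq->ci++;
		if (++num_eqes >= POLLING_BUDGET)
			break;	/* budget exhausted, leave the rest queued */
	}
	return num_eqes;
}
```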

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241023205113.255866-1-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Caleb Sander Mateos [Wed, 23 Oct 2024 16:48:38 +0000 (10:48 -0600)]
mlx5: fix typo in "mlx5_cqwq_get_cqe_enahnced_comp"

"enahnced" looks to be a misspelling of "enhanced".
Rename "mlx5_cqwq_get_cqe_enahnced_comp" to
"mlx5_cqwq_get_cqe_enhanced_comp".

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20241023164840.140535-1-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Tue, 22 Oct 2024 23:32:03 +0000 (16:32 -0700)]
amd-xgbe: use ethtool string helpers

The latter is the preferred way to copy ethtool strings.

Avoids manually incrementing the pointer.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241022233203.9670-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Tue, 22 Oct 2024 20:49:08 +0000 (13:49 -0700)]
net: mana: use ethtool string helpers

The latter is the preferred way to copy ethtool strings.

Avoids manually incrementing the data pointer.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Link: https://patch.msgid.link/20241022204908.511021-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Tue, 22 Oct 2024 20:32:40 +0000 (13:32 -0700)]
ibmvnic: use ethtool string helpers

They are the preferred way to copy ethtool strings.

Avoids manually incrementing the data pointer.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Tested-by: Nick Child <nnac123@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241022203240.391648-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jacky Chou [Tue, 22 Oct 2024 08:42:14 +0000 (16:42 +0800)]
net: ftgmac100: refactor getting phy device handle

Consolidate the handling of dedicated PHY and fixed-link PHY by taking
advantage of logic in of_phy_get_and_connect() which handles both of
these cases, rather than open coding the same logic in ftgmac100_probe().

Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241022084214.1261174-1-jacky_chou@aspeedtech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 29 Oct 2024 18:57:35 +0000 (11:57 -0700)]
Merge branch 'net-phylink-simplify-sfp-phy-attachment'

Russell King says:

====================
net: phylink: simplify SFP PHY attachment

These three patches simplify how we attach SFP PHYs.

The first patch notices that at the two sites where we call
sfp_select_interface(), if that fails, we always print the same error.
Move this into its own function.

The second patch adds an additional level of validation, checking that
the returned interface is one that is supported by the MAC/PCS.

The last patch simplifies how SFP PHYs are attached, reducing the
number of times that we do validation in this path.
====================

Link: https://patch.msgid.link/Zxj8_clRmDA_G7uH@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Wed, 23 Oct 2024 13:41:57 +0000 (14:41 +0100)]
net: phylink: simplify how SFP PHYs are attached

There are a few issues with how SFP PHYs are attached:

a) The phylink_sfp_connect_phy() and phylink_sfp_config_phy() code
   validates the configuration three times:

1. To discover the support/advertising masks that the PHY/PCS/MAC
   can support in order to select an interface.
2. To validate the selected interface.
3. When the PHY is brought up after being attached, another validation
   is done.

   This is needlessly complex.

b) The configuration is set prior to the PHY being attached, which
   means we don't have the PHY available in phylink_major_config()
   for phylink_pcs_neg_mode() to make decisions upon.

We have already added an extra step to validate the selected interface,
so we can now move the attachment and bringup of the PHY earlier,
inside phylink_sfp_config_phy(). This results in the validation at
step 2 above becoming entirely unnecessary, so remove that too.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1t3bcb-000c8H-3e@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Wed, 23 Oct 2024 13:41:51 +0000 (14:41 +0100)]
net: phylink: validate sfp_select_interface() returned interface

Validate that the returned interface from sfp_select_interface() is
supportable by the MAC/PCS. If it isn't, print an error and return
the NA interface type. This is a preparatory step to reorganising
how a PHY on a SFP module is handled.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1t3bcV-000c8B-Vz@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Wed, 23 Oct 2024 13:41:46 +0000 (14:41 +0100)]
net: phylink: add common validation for sfp_select_interface()

Whenever we call sfp_select_interface(), we check the returned value
and print an error. There are two cases where this happens with the
same message. Provide a common function to do this.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1t3bcQ-000c85-S4@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Tue, 22 Oct 2024 14:17:07 +0000 (15:17 +0100)]
net: phylink: simplify phylink_parse_fixedlink()

phylink_parse_fixedlink() wants to preserve the pause, asym_pause and
autoneg bits in pl->supported. Rather than reading the bits into
separate bools, zeroing pl->supported, and then setting them if they
were previously set, use a mask and linkmode_and() to achieve the same
result.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/E1t3Fh5-000aQi-Nk@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 29 Oct 2024 18:48:29 +0000 (11:48 -0700)]
Merge branch 'mlx5e-update-features-on-config-changes'

Tariq Toukan says:

====================
mlx5e update features on config changes

This small patchset by Dragos adds a call to netdev_update_features()
in configuration changes that could impact the features status.
====================

Link: https://patch.msgid.link/20241024164134.299646-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dragos Tatulea [Thu, 24 Oct 2024 16:41:33 +0000 (19:41 +0300)]
net/mlx5e: Update features on ring size change

When the ring size changes successfully, trigger
netdev_update_features() to enable features in wanted state if
applicable.

An example of such a scenario:
$ ip link set dev eth1 up
$ ethtool --set-ring eth1 rx 8192
$ ip link set dev eth1 mtu 9000
$ ethtool --features eth1 rx-gro-hw on --> fails
$ ethtool --set-ring eth1 rx 1024

With this patch, HW GRO will be turned on automatically because
it is set in the device's wanted_features.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241024164134.299646-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dragos Tatulea [Thu, 24 Oct 2024 16:41:32 +0000 (19:41 +0300)]
net/mlx5e: Update features on MTU change

When the MTU changes successfully, trigger netdev_update_features() to
enable features in wanted state if applicable.

An example of such a scenario:
$ ip link set dev eth1 up
$ ethtool --set-ring eth1 rx 8192
$ ip link set dev eth1 mtu 9000
$ ethtool --features eth1 rx-gro-hw on --> fails
$ ip link set dev eth1 mtu 7000

With this patch, HW GRO will be turned on automatically because
it is set in the device's wanted_features.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241024164134.299646-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Horman [Wed, 23 Oct 2024 12:15:28 +0000 (13:15 +0100)]
wwan: core: Pass string literal as format argument of dev_set_name()

Both gcc-14 and clang-18 report that passing a non-string literal as the
format argument of dev_set_name() is potentially insecure.

E.g. clang-18 says:

drivers/net/wwan/wwan_core.c:442:34: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
  442 |         return dev_set_name(&port->dev, buf);
      |                                         ^~~
drivers/net/wwan/wwan_core.c:442:34: note: treat the string as an argument to avoid this
  442 |         return dev_set_name(&port->dev, buf);
      |                                         ^
      |                                         "%s",

In every case, the contents of buf are safe to pass as the format
argument. That is, in my understanding, buf never contains any
format escape sequences.

But, it seems better to be safe than sorry. And, as a bonus, compiler
output becomes less verbose by addressing this issue as suggested by
clang-18.
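
The pattern, sketched in userspace with a hypothetical printf-style
set_name() standing in for dev_set_name() (struct device here is also a
stand-in, not the kernel structure):

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

struct device {
	char name[32];
};

/* Hypothetical printf-style setter, analogous in shape to dev_set_name(). */
static int set_name(struct device *dev, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(dev != NULL && fmt != NULL);

	va_start(ap, fmt);
	n = vsnprintf(dev->name, sizeof(dev->name), fmt, ap);
	va_end(ap);

	return (n < 0 || (size_t)n >= sizeof(dev->name)) ? -1 : 0;
}

/*
 * Pass the variable string as an argument, never as the format itself.
 * set_name(dev, buf) would interpret any '%' inside buf as a conversion.
 */
static int name_port(struct device *dev, const char *buf)
{
	return set_name(dev, "%s", buf);
}
```

With the "%s" form, a buffer that happened to contain '%' would be copied
literally rather than parsed as a conversion specification.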

Compile tested only.
No functional change intended.

Signed-off-by: Simon Horman <horms@kernel.org>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Link: https://patch.msgid.link/20241023-wwan-fmt-v1-1-521b39968639@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Karan Sanghavi [Tue, 22 Oct 2024 18:30:52 +0000 (18:30 +0000)]
selftests: tc-testing: Fix typo error

Correct the typos in the json files:

- "diffferent" is corrected to "different".
- "muliple" and "miltiple" are corrected to "multiple".

Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Karan Sanghavi <karansanghvi98@gmail.com>
Link: https://patch.msgid.link/20241022-multiple_spell_error-v2-1-7e5036506fe5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Tue, 22 Oct 2024 21:03:20 +0000 (14:03 -0700)]
rtnetlink: Fix kdoc of rtnl_af_register().

Commit 26eebdc4b005 ("rtnetlink: Return int from rtnl_af_register().")
made rtnl_af_register() return int again, and kdoc needs to be fixed up.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241022210320.86111-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 29 Oct 2024 18:21:25 +0000 (11:21 -0700)]
Merge branch 'ipv4-prepare-core-ipv4-files-to-future-flowi4_tos-conversion'

Guillaume Nault says:

====================
ipv4: Prepare core ipv4 files to future .flowi4_tos conversion.

Continue preparing users of ->flowi4_tos (struct flowi4) for the future
conversion of this field (from __u8 to dscp_t). The objective is to
have type annotation to properly separate DSCP bits from ECN ones. This
way we'll ensure that ECN doesn't interfere with DSCP and avoid
regressions where it breaks routing decisions (fib rules in particular).

This series concentrates on some easy IPv4 conversions where
->flowi4_tos is set directly from an IPv4 header, so we can get the
DSCP value using the ip4h_dscp() helper function.
====================

Link: https://patch.msgid.link/cover.1729530028.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guillaume Nault [Tue, 22 Oct 2024 09:48:23 +0000 (11:48 +0200)]
ipv4: Prepare ip_rt_get_source() to future .flowi4_tos conversion.

Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the
dscp_t value to __u8 with inet_dscp_to_dsfield().

Then, when we convert .flowi4_tos to dscp_t, we'll just have to drop
the inet_dscp_to_dsfield() call.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/0a13a200f31809841975e38633914af1061e0c04.1729530028.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guillaume Nault [Tue, 22 Oct 2024 09:48:15 +0000 (11:48 +0200)]
ipv4: Prepare ipmr_rt_fib_lookup() to future .flowi4_tos conversion.

Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the
dscp_t value to __u8 with inet_dscp_to_dsfield().

Then, when we convert .flowi4_tos to dscp_t, we'll just have to drop
the inet_dscp_to_dsfield() call.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/462402a097260357a7aba80228612305f230b6a9.1729530028.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guillaume Nault [Tue, 22 Oct 2024 09:48:08 +0000 (11:48 +0200)]
ipv4: Prepare icmp_reply() to future .flowi4_tos conversion.

Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the
dscp_t value to __u8 with inet_dscp_to_dsfield().

Then, when we convert .flowi4_tos to dscp_t, we'll just have to drop
the inet_dscp_to_dsfield() call.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/61b7563563f8b0a562b5b62032fe5260034d0aac.1729530028.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guillaume Nault [Tue, 22 Oct 2024 09:48:00 +0000 (11:48 +0200)]
ipv4: Prepare fib_compute_spec_dst() to future .flowi4_tos conversion.

Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the
dscp_t value to __u8 with inet_dscp_to_dsfield().

Then, when we convert .flowi4_tos to dscp_t, we'll just have to drop
the inet_dscp_to_dsfield() call.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/a0eba69cce94f747e4c7516184a85ffd0abbe3f0.1729530028.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Tue, 29 Oct 2024 14:33:24 +0000 (15:33 +0100)]
Merge branch 'ibm-emac-more-cleanups'

Rosen Penev says:

====================
ibm: emac: more cleanups

Tested on Cisco MX60W.

v2: fixed build errors. Also added extra commits to clean the driver up
further.
v3: Added tested message. Removed bad alloc_netdev_dummy commit.
v4: removed modules changes from patchset. Added fix for if MAC not
found.
v5: added of_find_matching_node commit.
v6: resend after net-next merge.
v7: removed of_find_matching_node commit. Adjusted mutex_init patch.
v8: removed patch removing custom init/exit. Needs more work.
====================

Link: https://patch.msgid.link/20241022002245.843242-1-rosenp@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Rosen Penev [Tue, 22 Oct 2024 00:22:45 +0000 (17:22 -0700)]
net: ibm: emac: generate random MAC if not found

On this Cisco MX60W, u-boot sets the local-mac-address property.
Unfortunately, by default that MAC is wrong; the correct one is located on a
UBI partition, which means nvmem needs to be used to grab it.

In the case where that fails, EMAC fails to initialize instead of
generating a random MAC as many other drivers do.

Match behavior with other drivers to have a working ethernet interface.
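
The fallback can be sketched in userspace as below. The helper names are
hypothetical, and rand() is used only to keep the sketch self-contained (a
driver would use the kernel's random address helper). The bit handling
follows the usual convention for random MACs: clear the multicast bit and
set the locally administered bit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAC_LEN 6

/* Valid here means: not all-zero and not a multicast address. */
static int mac_is_valid(const uint8_t *a)
{
	static const uint8_t zero[MAC_LEN];

	return memcmp(a, zero, MAC_LEN) != 0 && !(a[0] & 0x01);
}

/* Fill in a random unicast, locally administered address. */
static void mac_random(uint8_t *a)
{
	for (int i = 0; i < MAC_LEN; i++)
		a[i] = (uint8_t)(rand() & 0xff);
	a[0] &= 0xfe;	/* clear multicast bit */
	a[0] |= 0x02;	/* set locally administered bit */
}

/* Use the firmware-provided MAC if usable, else fall back to random. */
static void mac_init(uint8_t *dev_addr, const uint8_t *fw_addr)
{
	assert(dev_addr != NULL);

	if (fw_addr && mac_is_valid(fw_addr))
		memcpy(dev_addr, fw_addr, MAC_LEN);
	else
		mac_random(dev_addr);
}
```

The interface thus always comes up with some usable address instead of
failing probe when no valid MAC can be read.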

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Rosen Penev [Tue, 22 Oct 2024 00:22:44 +0000 (17:22 -0700)]
net: ibm: emac: use devm for mutex_init

It seems that, since inception, mutex_destroy was never called for these
in _remove. Instead of handling this manually, just use devm for
simplicity.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Rosen Penev [Tue, 22 Oct 2024 00:22:43 +0000 (17:22 -0700)]
net: ibm: emac: use platform_get_irq

No need for irq_of_parse_and_map since we have platform_device.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Rosen Penev [Tue, 22 Oct 2024 00:22:42 +0000 (17:22 -0700)]
net: ibm: emac: use devm_platform_ioremap_resource

No need to have a struct resource. Gets rid of the TODO.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Rosen Penev [Tue, 22 Oct 2024 00:22:41 +0000 (17:22 -0700)]
net: ibm: emac: use netif_receive_skb_list

Small rx improvement. Would use napi_gro_receive instead but that's a
lot more involved than netif_receive_skb_list because of how the
function is implemented.

Before:

> iperf -c 192.168.1.1
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.101 port 51556 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.04 sec   559 MBytes   467 Mbits/sec
> iperf -c 192.168.1.1
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.101 port 48228 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.03 sec   558 MBytes   467 Mbits/sec
> iperf -c 192.168.1.1
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.101 port 47600 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.04 sec   557 MBytes   466 Mbits/sec
> iperf -c 192.168.1.1
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.101 port 37252 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.05 sec   559 MBytes   467 Mbits/sec

After:

> iperf -c 192.168.1.1
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.101 port 40786 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.05 sec   572 MBytes   478 Mbits/sec
> iperf -c 192.168.1.1
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.101 port 52482 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.04 sec   571 MBytes   477 Mbits/sec
> iperf -c 192.168.1.1
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.101 port 48370 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.04 sec   572 MBytes   478 Mbits/sec
> iperf -c 192.168.1.1
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.101 port 46086 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.05 sec   571 MBytes   476 Mbits/sec
> iperf -c 192.168.1.1
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.1.101 port 46062 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.00-10.04 sec   572 MBytes   478 Mbits/sec

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago Merge branch 'ipv4-convert-rtm_-new-del-addr-and-more-to-per-netns-rtnl'
Paolo Abeni [Tue, 29 Oct 2024 10:55:28 +0000 (11:55 +0100)]
Merge branch 'ipv4-convert-rtm_-new-del-addr-and-more-to-per-netns-rtnl'

Kuniyuki Iwashima says:

====================
ipv4: Convert RTM_{NEW,DEL}ADDR and more to per-netns RTNL.

The IPv4 address hash table and GC are already namespacified.

This series converts RTM_NEWADDR/RTM_DELADDR and some more
RTNL users to per-netns RTNL.

Changes:
  v2:
    * Add patch 1 to address sparse warning for CONFIG_DEBUG_NET_SMALL_RTNL=n
    * Add Eric's tags to patch 2-12

  v1: https://lore.kernel.org/netdev/20241018012225.90409-1-kuniyu@amazon.com/
====================

Link: https://patch.msgid.link/20241021183239.79741-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago ipv4: Convert devinet_ioctl to per-netns RTNL.
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:39 +0000 (11:32 -0700)]
ipv4: Convert devinet_ioctl to per-netns RTNL.

ioctl(SIOCGIFCONF) calls dev_ifconf() that operates on the current netns.

Let's use per-netns RTNL helpers in dev_ifconf() and inet_gifconf().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago ipv4: Convert devinet_ioctl() to per-netns RTNL except for SIOCSIFFLAGS.
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:38 +0000 (11:32 -0700)]
ipv4: Convert devinet_ioctl() to per-netns RTNL except for SIOCSIFFLAGS.

Basically, devinet_ioctl() operates on a single netns.

However, ioctl(SIOCSIFFLAGS) will trigger the netdev notifier
that could touch another netdev in different netns.

Let's use per-netns RTNL helper in devinet_ioctl() and place
ASSERT_RTNL() for SIOCSIFFLAGS.

We will remove ASSERT_RTNL() once RTM_SETLINK and RTM_DELLINK
are converted.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago ipv4: Convert devinet_sysctl_forward() to per-netns RTNL.
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:37 +0000 (11:32 -0700)]
ipv4: Convert devinet_sysctl_forward() to per-netns RTNL.

devinet_sysctl_forward() touches only a single netns.

Let's use rtnl_trylock() and __in_dev_get_rtnl_net().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago rtnetlink: Define rtnl_net_trylock().
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:36 +0000 (11:32 -0700)]
rtnetlink: Define rtnl_net_trylock().

We will need the per-netns version of rtnl_trylock().

rtnl_net_trylock() calls __rtnl_net_lock() only when rtnl_trylock()
successfully holds RTNL.

When RTNL is removed, we will use mutex_trylock() for per-netns RTNL.
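The ordering described above (take the per-netns lock only after the global trylock succeeds) can be modelled in userspace as a rough sketch. Everything here is an assumption for illustration: `atomic_flag` stands in for the kernel mutexes, and `struct netns_model`, `model_net_trylock()` and `model_net_unlock()` are hypothetical names, not the kernel helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the global lock (assumption, not the kernel RTNL). */
static atomic_flag global_rtnl = ATOMIC_FLAG_INIT;

struct netns_model {
	atomic_flag rtnl;	/* hypothetical per-netns lock */
};

static bool flag_trylock(atomic_flag *f)
{
	/* test_and_set returns the previous value: false means we acquired it. */
	return !atomic_flag_test_and_set(f);
}

static void flag_unlock(atomic_flag *f)
{
	atomic_flag_clear(f);
}

/* Take the per-netns lock only if the global trylock succeeded. */
static bool model_net_trylock(struct netns_model *net)
{
	bool got;

	if (!flag_trylock(&global_rtnl))
		return false;

	/* While the global lock serialises all users, this cannot be contended. */
	got = flag_trylock(&net->rtnl);
	assert(got);
	(void)got;
	return true;
}

/* Release in the reverse order of acquisition. */
static void model_net_unlock(struct netns_model *net)
{
	flag_unlock(&net->rtnl);
	flag_unlock(&global_rtnl);
}
```

In this model a failed global trylock never touches the per-netns lock, so the caller has nothing to undo when it returns false.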

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago ipv4: Convert check_lifetime() to per-netns RTNL.
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:35 +0000 (11:32 -0700)]
ipv4: Convert check_lifetime() to per-netns RTNL.

Since commit 1675f385213e ("ipv4: Namespacify IPv4 address GC."),
check_lifetime() works on a per-netns basis.

Let's use rtnl_net_lock() and rtnl_net_dereference().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago ipv4: Convert RTM_DELADDR to per-netns RTNL.
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:34 +0000 (11:32 -0700)]
ipv4: Convert RTM_DELADDR to per-netns RTNL.

Let's push down RTNL into inet_rtm_deladdr() as rtnl_net_lock().

Now, ip_mc_autojoin_config() is always called under per-netns RTNL,
so ASSERT_RTNL() can be replaced with ASSERT_RTNL_NET().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago ipv4: Use per-netns RTNL helpers in inet_rtm_newaddr().
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:33 +0000 (11:32 -0700)]
ipv4: Use per-netns RTNL helpers in inet_rtm_newaddr().

inet_rtm_to_ifa() and find_matching_ifa() are called
under rtnl_net_lock().

__in_dev_get_rtnl() and in_dev_for_each_ifa_rtnl() there
can use per-netns RTNL helpers.

Let's define and use __in_dev_get_rtnl_net() and
in_dev_for_each_ifa_rtnl_net().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago ipv4: Convert RTM_NEWADDR to per-netns RTNL.
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:32 +0000 (11:32 -0700)]
ipv4: Convert RTM_NEWADDR to per-netns RTNL.

The address hash table and GC are already namespacified.

Let's push down RTNL into inet_rtm_newaddr() as rtnl_net_lock().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago ipv4: Don't allocate ifa for 0.0.0.0 in inet_rtm_newaddr().
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:31 +0000 (11:32 -0700)]
ipv4: Don't allocate ifa for 0.0.0.0 in inet_rtm_newaddr().

When we pass 0.0.0.0 to __inet_insert_ifa(), it frees ifa and returns 0.

We can do this check much earlier for RTM_NEWADDR even before allocating
struct in_ifaddr.

Let's move the validation to

  1. inet_insert_ifa() for ioctl()
  2. inet_rtm_newaddr() for RTM_NEWADDR

Now, we can remove the same check in find_matching_ifa().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago ipv4: Factorise RTM_NEWADDR validation to inet_validate_rtm().
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:30 +0000 (11:32 -0700)]
ipv4: Factorise RTM_NEWADDR validation to inet_validate_rtm().

rtm_to_ifaddr() validates some attributes, looks up a netdev,
allocates struct in_ifaddr, and validates IFA_CACHEINFO.

There is no reason to delay IFA_CACHEINFO validation.

We will push RTNL down to inet_rtm_newaddr(), and then we want
to complete rtnetlink validation before rtnl_net_lock().

Let's factorise the validation parts.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago rtnetlink: Define RTNL_FLAG_DOIT_PERNET for per-netns RTNL doit().
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:29 +0000 (11:32 -0700)]
rtnetlink: Define RTNL_FLAG_DOIT_PERNET for per-netns RTNL doit().

We will push RTNL down to each doit() as rtnl_net_lock().

We can use RTNL_FLAG_DOIT_UNLOCKED to call doit() without RTNL, but doit()
will still hold RTNL.

Let's define RTNL_FLAG_DOIT_PERNET as an alias of RTNL_FLAG_DOIT_UNLOCKED.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago rtnetlink: Make per-netns RTNL dereference helpers to macro.
Kuniyuki Iwashima [Mon, 21 Oct 2024 18:32:28 +0000 (11:32 -0700)]
rtnetlink: Make per-netns RTNL dereference helpers to macro.

When CONFIG_DEBUG_NET_SMALL_RTNL is off, rtnl_net_dereference() is the
static inline wrapper of rtnl_dereference() returning a plain (void *)
pointer to make sure net is always evaluated as requested in [0].

But, it makes sparse complain [1] when the pointer has __rcu annotation:

  net/ipv4/devinet.c:674:47: sparse: warning: incorrect type in argument 2 (different address spaces)
  net/ipv4/devinet.c:674:47: sparse:    expected void *p
  net/ipv4/devinet.c:674:47: sparse:    got struct in_ifaddr [noderef] __rcu *

Also, if we evaluate net as (void *) in a macro, then the compiler
in turn fails to build due to -Werror=unused-value.

  #define rtnl_net_dereference(net, p)                  \
        ({                                              \
                (void *)net;                            \
                rtnl_dereference(p);                    \
        })

  net/ipv4/devinet.c: In function ‘inet_rtm_deladdr’:
  ./include/linux/rtnetlink.h:154:17: error: statement with no effect [-Werror=unused-value]
    154 |                 (void *)net;                            \
  net/ipv4/devinet.c:674:21: note: in expansion of macro ‘rtnl_net_dereference’
    674 |              (ifa = rtnl_net_dereference(net, *ifap)) != NULL;
        |                     ^~~~~~~~~~~~~~~~~~~~

Let's go back to the original simplest macro.

Note that checkpatch complains about this approach, but it's one-shot and
less noisy than the other two.

  WARNING: Argument 'net' is not used in function-like macro
  #76: FILE: include/linux/rtnetlink.h:142:
  +#define rtnl_net_dereference(net, p) \
  + rtnl_dereference(p)

Fixes: 844e5e7e656d ("rtnetlink: Add assertion helpers for per-netns RTNL.")
Link: https://lore.kernel.org/netdev/20241004132145.7fd208e9@kernel.org/ [0]
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202410200325.SaEJmyZS-lkp@intel.com/ [1]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago neighbour: use kvzalloc()/kvfree()
Eric Dumazet [Tue, 22 Oct 2024 15:00:59 +0000 (15:00 +0000)]
neighbour: use kvzalloc()/kvfree()

The mm layer provides convenient functions, so we no longer have
to work around old limitations.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20241022150059.1345406-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago netlink: specs: Add missing phy-ntf command to ethtool spec
Kory Maincent [Tue, 22 Oct 2024 15:14:18 +0000 (17:14 +0200)]
netlink: specs: Add missing phy-ntf command to ethtool spec

ETHTOOL_MSG_PHY_NTF description is missing in the ethtool netlink spec.
Add it to the spec.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20241022151418.875424-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago vsock: do not leave dangling sk pointer in vsock_create()
Eric Dumazet [Tue, 22 Oct 2024 13:48:19 +0000 (13:48 +0000)]
vsock: do not leave dangling sk pointer in vsock_create()

syzbot was able to trigger the following warning after recent
core network cleanup.

On error vsock_create() frees the allocated sk object, but sock_init_data()
has already attached it to the provided sock object.

We must clear sock->sk to avoid possible use-after-free later.

WARNING: CPU: 0 PID: 5282 at net/socket.c:1581 __sock_create+0x897/0x950 net/socket.c:1581
Modules linked in:
CPU: 0 UID: 0 PID: 5282 Comm: syz.2.43 Not tainted 6.12.0-rc2-syzkaller-00667-g53bac8330865 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
 RIP: 0010:__sock_create+0x897/0x950 net/socket.c:1581
Code: 7f 06 01 65 48 8b 34 25 00 d8 03 00 48 81 c6 b0 08 00 00 48 c7 c7 60 0b 0d 8d e8 d4 9a 3c 02 e9 11 f8 ff ff e8 0a ab 0d f8 90 <0f> 0b 90 e9 82 fd ff ff 89 e9 80 e1 07 fe c1 38 c1 0f 8c c7 f8 ff
RSP: 0018:ffffc9000394fda8 EFLAGS: 00010293
RAX: ffffffff89873c46 RBX: ffff888079f3c818 RCX: ffff8880314b9e00
RDX: 0000000000000000 RSI: 00000000ffffffed RDI: 0000000000000000
RBP: ffffffff8d3337f0 R08: ffffffff8987384e R09: ffffffff8989473a
R10: dffffc0000000000 R11: fffffbfff203a276 R12: 00000000ffffffed
R13: ffff888079f3c8c0 R14: ffffffff898736e7 R15: dffffc0000000000
FS:  00005555680ab500(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f22b11196d0 CR3: 00000000308c0000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
  sock_create net/socket.c:1632 [inline]
  __sys_socket_create net/socket.c:1669 [inline]
  __sys_socket+0x150/0x3c0 net/socket.c:1716
  __do_sys_socket net/socket.c:1730 [inline]
  __se_sys_socket net/socket.c:1728 [inline]
  __x64_sys_socket+0x7a/0x90 net/socket.c:1728
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f22b117dff9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff56aec0e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000029
RAX: ffffffffffffffda RBX: 00007f22b1335f80 RCX: 00007f22b117dff9
RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000028
RBP: 00007f22b11f0296 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f22b1335f80 R14: 00007f22b1335f80 R15: 00000000000012dd
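The error-path rule described in this commit (free the object, then clear the pointer that already refers to it) can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct sk_model`, `struct socket_model` and `model_create()` are hypothetical stand-ins, not the vsock code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct sk_model { int state; };			/* hypothetical stand-in for struct sock */
struct socket_model { struct sk_model *sk; };	/* hypothetical stand-in for struct socket */

/* Create a protocol object for sock; fail_late simulates an error after attach. */
static int model_create(struct socket_model *sock, int fail_late)
{
	struct sk_model *sk = calloc(1, sizeof(*sk));

	if (!sk)
		return -ENOMEM;

	sock->sk = sk;		/* attached early, as an init helper would do */

	if (fail_late) {
		free(sk);
		sock->sk = NULL;	/* clear it so no dangling pointer is left behind */
		return -EINVAL;
	}
	return 0;
}
```

Without the `sock->sk = NULL` line, a caller inspecting `sock->sk` after the failure would see a pointer to freed memory.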

Fixes: 48156296a08c ("net: warn, if pf->create does not clear sock->sk on error")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ignat Korchagin <ignat@cloudflare.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20241022134819.1085254-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net/mlx5: unique names for per device caches
Sebastian Ott [Wed, 23 Oct 2024 13:41:46 +0000 (15:41 +0200)]
net/mlx5: unique names for per device caches

Add the device name to the per device kmem_cache names to
ensure their uniqueness. This fixes warnings like this:
"kmem_cache of name 'mlx5_fs_fgs' already exists".
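As a rough userspace sketch of the naming idea, the helper below appends a device name to a base cache name so that two devices never produce the same string. The `-` separator, the helper name `model_cache_name()` and the PCI-style device names in the usage note are assumptions for illustration, not the format the driver uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<base>-<dev>" into buf; return 0 on success, -1 if it would be truncated. */
static int model_cache_name(char *buf, size_t len, const char *base, const char *dev)
{
	int n = snprintf(buf, len, "%s-%s", base, dev);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For example, calling it with the base `mlx5_fs_fgs` and the hypothetical device names `0000:01:00.0` and `0000:01:00.1` yields two distinct names, which is what avoids the "already exists" warning.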

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241023134146.28448-1-sebott@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago Merge branch 'bonding-returns-detailed-error-about-xdp-failures'
Jakub Kicinski [Mon, 28 Oct 2024 23:09:43 +0000 (16:09 -0700)]
Merge branch 'bonding-returns-detailed-error-about-xdp-failures'

Hangbin Liu says:

====================
Bonding: returns detailed error about XDP failures

Based on the discussion [1], this patch set returns detailed errors about
XDP failures and updates the bonding documentation on XDP support.

[1] https://lore.kernel.org/8088f2a7-3ab1-4a1e-996d-c15703da13cc@blackwall.org
====================

Link: https://patch.msgid.link/20241021031211.814-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago Documentation: bonding: add XDP support explanation
Hangbin Liu [Mon, 21 Oct 2024 03:12:11 +0000 (03:12 +0000)]
Documentation: bonding: add XDP support explanation

Add documentation about which modes have native XDP support.

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20241021031211.814-3-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago bonding: return detailed error when loading native XDP fails
Hangbin Liu [Mon, 21 Oct 2024 03:12:10 +0000 (03:12 +0000)]
bonding: return detailed error when loading native XDP fails

Bonding only supports native XDP for specific modes, which can lead to
confusion for users regarding why XDP loads successfully at times and
fails at others. This patch enhances error handling by returning detailed
error messages, providing users with clearer insights into the specific
reasons for the failure when loading native XDP.

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20241021031211.814-2-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago Merge branch 'mptcp-various-small-improvements'
Jakub Kicinski [Mon, 28 Oct 2024 22:55:48 +0000 (15:55 -0700)]
Merge branch 'mptcp-various-small-improvements'

Matthieu Baerts says:

====================
mptcp: various small improvements

The following patches are not related to each other.

- Patch 1: Avoid sending advertisements on stale subflows, reducing
  the risk of losing them.

- Patch 2: Annotate data-races around subflow->fully_established, using
  READ/WRITE_ONCE().

- Patch 3: A small clean-up on the PM side, avoiding a bit of duplicated
  code.

- Patch 4: Use "Middlebox interference" MP_TCPRST code in reaction to a
  packet received without MPTCP options in the middle of a connection.
====================

Link: https://patch.msgid.link/20241021-net-next-mptcp-misc-6-13-v1-0-1ef02746504a@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago mptcp: use "middlebox interference" RST when no DSS
Davide Caratti [Mon, 21 Oct 2024 15:14:06 +0000 (17:14 +0200)]
mptcp: use "middlebox interference" RST when no DSS

RFC 8684 suggests using "Middlebox interference (code 0x06)" for a
fully established subflow that carries data at the TCP level with no DSS
sub-option.

This is generally the case when mpext is NULL or mpext->use_map is 0:
use a dedicated value of 'mapping_status' and use it before closing the
socket in subflow_check_data_avail().

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/518
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241021-net-next-mptcp-misc-6-13-v1-4-1ef02746504a@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago mptcp: implement mptcp_pm_connection_closed
Geliang Tang [Mon, 21 Oct 2024 15:14:05 +0000 (17:14 +0200)]
mptcp: implement mptcp_pm_connection_closed

The MPTCP path manager event handler mptcp_pm_connection_closed() was
added in commit 1b1c7a0ef7f3 ("mptcp: Add path manager interface"),
but it has been an empty function ever since.

Given its name, it makes sense to invoke mptcp_event() with the
MPTCP_EVENT_CLOSED event type from it. This also removes a bit of
duplicated code.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241021-net-next-mptcp-misc-6-13-v1-3-1ef02746504a@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago mptcp: annotate data-races around subflow->fully_established
Gang Yan [Mon, 21 Oct 2024 15:14:04 +0000 (17:14 +0200)]
mptcp: annotate data-races around subflow->fully_established

We introduce the same handling for potential data races with the
'fully_established' flag in subflow as previously done for
msk->fully_established.

Additionally, we make a crucial change: convert the subflow's
'fully_established' from a bit-field to 'bool'. This is
necessary because the helpers used to avoid data races do not work
with bit-fields. Specifically, READ_ONCE() needs to know
the size of the variable being accessed, which is not available for
a bit-field, and test_bit() expects an address, which a bit-field
does not have.
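The constraint can be illustrated with a small userspace sketch. The `MODEL_READ_ONCE()` and `MODEL_WRITE_ONCE()` macros below are simplified stand-ins built on a volatile cast (an assumption, not the kernel definitions, and relying on the GNU `__typeof__` extension), and `struct subflow_model` is a hypothetical structure. Both macros take the address of their argument, which is exactly what a bit-field member cannot provide.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define MODEL_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define MODEL_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

struct subflow_model {
	unsigned int other_flag : 1;	/* bit-field: &s->other_flag does not compile */
	bool fully_established;		/* plain bool: addressable, with a known size */
};

/* Read the flag through a single volatile access. */
static bool model_is_established(const struct subflow_model *s)
{
	return MODEL_READ_ONCE(s->fully_established);
}

/* Write the flag through a single volatile access. */
static void model_set_established(struct subflow_model *s)
{
	MODEL_WRITE_ONCE(s->fully_established, true);
}
```

Passing `s->other_flag` to either macro is rejected by the compiler, because neither the address nor the `sizeof` of a bit-field can be taken.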

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/516
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241021-net-next-mptcp-misc-6-13-v1-2-1ef02746504a@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago mptcp: pm: send ACK on non-stale subflows
Matthieu Baerts (NGI0) [Mon, 21 Oct 2024 15:14:03 +0000 (17:14 +0200)]
mptcp: pm: send ACK on non-stale subflows

If a subflow is considered "stale", it is better to avoid using it to
send an ACK carrying an ADD_ADDR or RM_ADDR. Another subflow, if any,
will then be selected.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241021-net-next-mptcp-misc-6-13-v1-1-1ef02746504a@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago Merge branch 'net-systemport-minor-io-macros-changes'
Jakub Kicinski [Mon, 28 Oct 2024 22:54:42 +0000 (15:54 -0700)]
Merge branch 'net-systemport-minor-io-macros-changes'

Florian Fainelli says:

====================
net: systemport: Minor IO macros changes

This patch series addresses the warning initially reported by Vladimir
here:

https://lore.kernel.org/all/20241014150139.927423-1-vladimir.oltean@nxp.com/

and follows up on his suggestion to move the IO macros to the
header file.
====================

Link: https://patch.msgid.link/20241021174935.57658-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: systemport: Move IO macros to header file
Florian Fainelli [Mon, 21 Oct 2024 17:49:35 +0000 (10:49 -0700)]
net: systemport: Move IO macros to header file

Move the BCM_SYSPORT_IO_MACRO() definition and its use to bcmsysport.h
where it is more appropriate and where static inline helpers are
acceptable. While at it, make sure that the macro 'offset' argument does
not trigger a checkpatch warning due to possible argument re-use.

Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241021174935.57658-3-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: systemport: Remove unused txchk accessors
Florian Fainelli [Mon, 21 Oct 2024 17:49:34 +0000 (10:49 -0700)]
net: systemport: Remove unused txchk accessors

Vladimir reported the following warning with clang-16 and W=1:

warning: unused function 'txchk_readl' [-Wunused-function]
BCM_SYSPORT_IO_MACRO(txchk, SYS_PORT_TXCHK_OFFSET);
note: expanded from macro 'BCM_SYSPORT_IO_MACRO'

warning: unused function 'txchk_writel' [-Wunused-function]
note: expanded from macro 'BCM_SYSPORT_IO_MACRO'

warning: unused function 'tbuf_readl' [-Wunused-function]
BCM_SYSPORT_IO_MACRO(tbuf, SYS_PORT_TBUF_OFFSET);
note: expanded from macro 'BCM_SYSPORT_IO_MACRO'

warning: unused function 'tbuf_writel' [-Wunused-function]
note: expanded from macro 'BCM_SYSPORT_IO_MACRO'

The TXCHK and TBUF blocks are not being accessed, so remove the IO macros
used to access those blocks. No functional impact.

Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241021174935.57658-2-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago selftest/tcp-ao: Add filter tests
Leo Stone [Mon, 21 Oct 2024 17:46:44 +0000 (10:46 -0700)]
selftest/tcp-ao: Add filter tests

Add tests that check if getsockopt(TCP_AO_GET_KEYS) returns the right
keys when using different filters.

Sample output:

> # ok 114 filter keys: by sndid, rcvid, address
> # ok 115 filter keys: by is_current
> # ok 116 filter keys: by is_rnext
> # ok 117 filter keys: by sndid, rcvid
> # ok 118 filter keys: correct nkeys when in.nkeys < matches

Acked-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Leo Stone <leocstone@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241021174652.6949-1-leocstone@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: amd8111e: Remove duplicate definition of PCI_VENDOR_ID_AMD
Yazen Ghannam [Mon, 21 Oct 2024 15:38:25 +0000 (15:38 +0000)]
net: amd8111e: Remove duplicate definition of PCI_VENDOR_ID_AMD

The AMD PCI vendor ID is already defined in <linux/pci_ids.h>.

Remove this local definition as it is not needed.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20241021153825.2536819-1-yazen.ghannam@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago dt-bindings: nfc: nxp,nci: Document PN553 compatible
Danila Tikhonov [Sun, 20 Oct 2024 20:56:09 +0000 (23:56 +0300)]
dt-bindings: nfc: nxp,nci: Document PN553 compatible

The PN553 is another NFC chip from NXP. Document the compatible in the
bindings.

Signed-off-by: Danila Tikhonov <danila@jiaxyga.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20241020205615.211256-2-danila@jiaxyga.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago configs/debug: make sure PROVE_RCU_LIST=y takes effect
Jakub Kicinski [Wed, 16 Oct 2024 01:11:44 +0000 (18:11 -0700)]
configs/debug: make sure PROVE_RCU_LIST=y takes effect

Commit 0aaa8977acbf ("configs: introduce debug.config for CI-like setup")
added CONFIG_PROVE_RCU_LIST=y to the common CI config,
but RCU_EXPERT is not set, and it's a dependency for
CONFIG_PROVE_RCU_LIST=y. Make sure CIs take advantage
of CONFIG_PROVE_RCU_LIST=y; recent fixes in networking
indicate that it does catch bugs.

Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241016011144.3058445-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
8 months ago net: dsa: mv88e6xxx: fix unreleased fwnode_handle in setup_port()
Javier Carrasco [Sat, 19 Oct 2024 20:16:49 +0000 (22:16 +0200)]
net: dsa: mv88e6xxx: fix unreleased fwnode_handle in setup_port()

'ports_fwnode' is initialized via device_get_named_child_node(), which
requires a call to fwnode_handle_put() when the variable is no longer
required to avoid leaking memory.

Add the missing fwnode_handle_put() after 'ports_fwnode' has been used
and is no longer required.
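The get/put pairing behind this fix can be modelled in userspace as a minimal sketch. The types and helpers here (`struct node_model`, `model_get_child()`, `model_put()`, `model_setup_port()`) are hypothetical stand-ins for the reference-counted fwnode API, not the driver code.

```c
#include <assert.h>
#include <stddef.h>

struct node_model { int refcount; };	/* hypothetical stand-in for a fwnode */

/* A lookup hands back a counted reference that the caller must drop. */
static struct node_model *model_get_child(struct node_model *n)
{
	n->refcount++;
	return n;
}

/* Drop a reference; tolerate NULL so error paths can call it blindly. */
static void model_put(struct node_model *n)
{
	if (n)
		n->refcount--;
}

/* Look the node up, use it, and drop the reference on every exit path. */
static int model_setup_port(struct node_model *ports, int fail)
{
	struct node_model *child = model_get_child(ports);
	int err = fail ? -1 : 0;	/* placeholder for the real work */

	model_put(child);	/* the kind of call that was missing */
	return err;
}
```

In this model the reference count returns to its starting value whether the setup succeeds or fails, which is the property the missing put call restores.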

Fixes: 94a2a84f5e9e ("net: dsa: mv88e6xxx: Support LED control")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 months ago bareudp: Use pcpu stats to update rx_dropped counter.
Guillaume Nault [Fri, 18 Oct 2024 13:35:28 +0000 (15:35 +0200)]
bareudp: Use pcpu stats to update rx_dropped counter.

Use the core_stats rx_dropped counter to avoid the cost of atomic
increments.
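The trade-off can be sketched in userspace: instead of one shared counter updated atomically, each CPU bumps its own slot and a reader sums the slots. This is a simplified model under stated assumptions: `MODEL_NR_CPUS`, `struct pcpu_stats_model` and both helpers are hypothetical, and the caller passes the CPU index explicitly rather than using real per-CPU storage.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count */

struct pcpu_stats_model {
	uint64_t rx_dropped[MODEL_NR_CPUS];	/* one slot per CPU */
};

/* Hot path: each CPU bumps only its own slot, so no atomic RMW is needed. */
static void model_rx_dropped_inc(struct pcpu_stats_model *s, unsigned int cpu)
{
	s->rx_dropped[cpu % MODEL_NR_CPUS]++;
}

/* Slow path: a reader folds all slots into one total. */
static uint64_t model_rx_dropped_read(const struct pcpu_stats_model *s)
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < MODEL_NR_CPUS; i++)
		sum += s->rx_dropped[i];
	return sum;
}
```

The cost moves from the per-packet drop path to the comparatively rare statistics read.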

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 months ago Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Paolo Abeni [Fri, 25 Oct 2024 07:08:22 +0000 (09:08 +0200)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts and no adjacent changes.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago Merge tag 'net-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 24 Oct 2024 23:43:50 +0000 (16:43 -0700)]
Merge tag 'net-6.12-rc5' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter, xfrm and bluetooth.

  Oddly this includes a fix for a posix clock regression; in our
  previous PR we included a change there as a pre-requisite for a
  networking one. That fix proved to be buggy and requires the follow-up
  included here. Thomas suggested we should send it, given we sent the
  buggy patch.

  Current release - regressions:

   - posix-clock: Fix unbalanced locking in pc_clock_settime()

   - netfilter: fix typo causing some targets not to load on IPv6

  Current release - new code bugs:

   - xfrm: policy: remove last remnants of pernet inexact list

  Previous releases - regressions:

   - core: fix races in netdev_tx_sent_queue()/dev_watchdog()

   - bluetooth: fix UAF on sco_sock_timeout

   - eth: hv_netvsc: fix VF namespace also in synthetic NIC
     NETDEV_REGISTER event

   - eth: usbnet: fix name regression

   - eth: be2net: fix potential memory leak in be_xmit()

   - eth: plip: fix transmit path breakage

  Previous releases - always broken:

   - sched: deny mismatched skip_sw/skip_hw flags for actions created by
     classifiers

   - netfilter: bpf: must hold reference on net namespace

   - eth: virtio_net: fix integer overflow in stats

   - eth: bnxt_en: replace ptp_lock with irqsave variant

   - eth: octeon_ep: add SKB allocation failures handling in
     __octep_oq_process_rx()

  Misc:

   - MAINTAINERS: add Simon as an official reviewer"

* tag 'net-6.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits)
  net: dsa: mv88e6xxx: support 4000ps cycle counter period
  net: dsa: mv88e6xxx: read cycle counter period from hardware
  net: dsa: mv88e6xxx: group cycle counter coefficients
  net: usb: qmi_wwan: add Fibocom FG132 0x0112 composition
  hv_netvsc: Fix VF namespace also in synthetic NIC NETDEV_REGISTER event
  net: dsa: microchip: disable EEE for KSZ879x/KSZ877x/KSZ876x
  Bluetooth: ISO: Fix UAF on iso_sock_timeout
  Bluetooth: SCO: Fix UAF on sco_sock_timeout
  Bluetooth: hci_core: Disable works on hci_unregister_dev
  posix-clock: posix-clock: Fix unbalanced locking in pc_clock_settime()
  r8169: avoid unsolicited interrupts
  net: sched: use RCU read-side critical section in taprio_dump()
  net: sched: fix use-after-free in taprio_change()
  net/sched: act_api: deny mismatched skip_sw/skip_hw flags for actions created by classifiers
  net: usb: usbnet: fix name regression
  mlxsw: spectrum_router: fix xa_store() error checking
  virtio_net: fix integer overflow in stats
  net: fix races in netdev_tx_sent_queue()/dev_watchdog()
  net: wwan: fix global oob in wwan_rtnl_policy
  netfilter: xtables: fix typo causing some targets not to load on IPv6
  ...

8 months ago Merge tag 'hid-for-linus-20241024' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 24 Oct 2024 23:31:58 +0000 (16:31 -0700)]
Merge tag 'hid-for-linus-20241024' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:
 "Device-specific functionality quirks for Thinkpad X1 Gen3, Logitech
  Bolt and some Goodix touchpads (Bartłomiej Maryńczak, Hans de Goede
  and Kenneth Albanowski)"

* tag 'hid-for-linus-20241024' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: lenovo: Add support for Thinkpad X1 Tablet Gen 3 keyboard
  HID: multitouch: Add quirk for Logitech Bolt receiver w/ Casa touchpad
  HID: i2c-hid: Delayed i2c resume wakeup for 0x0d42 Goodix touchpad

8 months ago Merge tag 'loongarch-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 24 Oct 2024 21:17:34 +0000 (14:17 -0700)]
Merge tag 'loongarch-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Get correct cores_per_package for SMT systems, enable IRQ if do_ale()
  triggered in irq-enabled context, and fix some bugs about vDSO, memory
  management, hrtimer in KVM, etc"

* tag 'loongarch-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Mark hrtimer to expire in hard interrupt context
  LoongArch: Make KASAN usable for variable cpu_vabits
  LoongArch: Set initial pte entry with PAGE_GLOBAL for kernel space
  LoongArch: Don't crash in stack_top() for tasks without vDSO
  LoongArch: Set correct size for vDSO code mapping
  LoongArch: Enable IRQ if do_ale() triggered in irq-enabled context
  LoongArch: Get correct cores_per_package for SMT systems
  LoongArch: Use "Exception return address" to comment ERA

8 months ago Merge tag 'probes-fixes-v6.12-rc4.2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 24 Oct 2024 20:51:58 +0000 (13:51 -0700)]
Merge tag 'probes-fixes-v6.12-rc4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes fixes from Masami Hiramatsu:

 - objpool: Fix choosing allocation for percpu slots

   Allocate objpool's percpu slots correctly according to the GFP
   flags. The code checked whether "any bit" of GFP_ATOMIC is set to
   choose the vmalloc source, but it should check that "all bits" of
   GFP_ATOMIC are set, because GFP_ATOMIC is a combined flag.

 - tracing/probes: Fix MAX_TRACE_ARGS limit handling

   If more than MAX_TRACE_ARGS arguments are passed when creating a
   probe event, the entries beyond MAX_TRACE_ARGS in the trace_arg array
   are not initialized, so the kernel crashes if it accesses them.
   Reject creating the event if the number of arguments exceeds
   MAX_TRACE_ARGS.

 - tracing: Consider the NUL character when validating the event length

   strlen() is used when parsing the event name, and the original code
   does not account for the terminating NUL byte, so it can accept a
   name one byte longer than the buffer. Check the length correctly.
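
The "any bit" versus "all bits" distinction in the objpool item above can
be sketched as follows. This is a minimal illustration only: the flag
values and helper names are hypothetical stand-ins, not the kernel's GFP
definitions or the objpool code.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the real GFP values live in the kernel headers. */
#define FLAG_HIGH    0x1u
#define FLAG_KSWAPD  0x2u
#define FLAG_DIRECT  0x4u

/* A combined flag made of two bits, analogous in spirit to GFP_ATOMIC. */
#define MASK_ATOMIC  (FLAG_HIGH | FLAG_KSWAPD)
/* Another combined flag that happens to share one bit with MASK_ATOMIC. */
#define MASK_KERNEL  (FLAG_KSWAPD | FLAG_DIRECT)

/* True if at least one bit of mask is set in flags. */
static bool has_any_bit(unsigned int flags, unsigned int mask)
{
	return (flags & mask) != 0;
}

/* True only if every bit of mask is set in flags. */
static bool has_all_bits(unsigned int flags, unsigned int mask)
{
	return (flags & mask) == mask;
}
```

With these stand-in values, has_any_bit(MASK_KERNEL, MASK_ATOMIC) is true
because the two masks share one bit, while has_all_bits(MASK_KERNEL,
MASK_ATOMIC) is false. Testing a combined flag with "any bit" can
therefore misclassify a different flag set.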

* tag 'probes-fixes-v6.12-rc4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Consider the NULL character when validating the event length
  tracing/probes: Fix MAX_TRACE_ARGS limit handling
  objpool: fix choosing allocation for percpu slots
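
The event-name length check described in the pull message above is an
off-by-one against the terminating NUL. A minimal sketch, assuming a
hypothetical 8-byte buffer and made-up helper names (this is not the
tracing code itself):

```c
#include <stdbool.h>
#include <string.h>

#define NAME_BUF_LEN 8	/* hypothetical buffer size, including the NUL */

/* Off by one: accepts a name of NAME_BUF_LEN characters, leaving no
 * room for the terminating NUL byte. */
static bool name_fits_buggy(const char *name)
{
	return strlen(name) <= NAME_BUF_LEN;
}

/* strlen() excludes the NUL, so the name must be strictly shorter
 * than the buffer. */
static bool name_fits_fixed(const char *name)
{
	return strlen(name) < NAME_BUF_LEN;
}
```

An 8-character name such as "12345678" passes the first check but needs
9 bytes of storage; the second check rejects it.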

8 months ago Merge tag 'for-6.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Thu, 24 Oct 2024 20:04:15 +0000 (13:04 -0700)]
Merge tag 'for-6.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - mount option fixes:
     - fix handling of compression mount options on remount
     - reject rw remount in case there are options that don't work
       in read-write mode (like rescue options)

 - fix zone accounting of unusable space

 - fix in-memory corruption when merging extent maps

 - fix delalloc range locking for sector < page

 - use more convenient default value of drop subtree threshold, clean
   more subvolumes without the fallback to marking quotas inconsistent

 - fix smatch warning about incorrect value passed to ERR_PTR

* tag 'for-6.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix passing 0 to ERR_PTR in btrfs_search_dir_index_item()
  btrfs: reject ro->rw reconfiguration if there are hard ro requirements
  btrfs: fix read corruption due to race with extent map merging
  btrfs: fix the delalloc range locking if sector size < page size
  btrfs: qgroup: set a more sane default value for subtree drop threshold
  btrfs: clear force-compress on remount when compress mount option is given
  btrfs: zoned: fix zone unusable accounting for freed reserved extent

8 months ago Merge tag 'jfs-6.12-rc5' of github.com:kleikamp/linux-shaggy
Linus Torvalds [Thu, 24 Oct 2024 19:47:01 +0000 (12:47 -0700)]
Merge tag 'jfs-6.12-rc5' of github.com:kleikamp/linux-shaggy

Pull jfs fix from David Kleikamp:
 "Fix a regression introduced in 6.12-rc1"

* tag 'jfs-6.12-rc5' of github.com:kleikamp/linux-shaggy:
  jfs: Fix sanity check in dbMount

8 months ago Merge tag 'bcachefs-2024-10-22' of https://github.com/koverstreet/bcachefs
Linus Torvalds [Thu, 24 Oct 2024 19:38:59 +0000 (12:38 -0700)]
Merge tag 'bcachefs-2024-10-22' of https://github.com/koverstreet/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Lots of hotfixes:

   - transaction restart injection has been shaking out a few things

   - fix a data corruption in the buffered write path on -ENOSPC, found
     by xfstests generic/299

   - Some small show_options fixes

   - Repair mismatches in inode hash type, seed: different snapshot
     versions of an inode must have the same hash/type seed, used for
     directory entries and xattrs. We were checking the hash seed, but
     not the type, and a user contributed a filesystem where the hash
     type on one inode had somehow been flipped; these fixes allow his
     filesystem to repair.

     Additionally, the hash type flip made some directory entries
     invisible, which were then recreated by userspace; so the hash
     check code now checks for duplicate non-dangling dirents, and
     renames one of them if necessary.

   - Don't use wait_event_interruptible() in recovery: this fixes some
     filesystems failing to mount with -ERESTARTSYS

   - Workaround for kvmalloc not supporting > INT_MAX allocations,
     causing an -ENOMEM when allocating the sorted array of journal
     keys: this allows a 75 TB filesystem to mount

   - Make sure bch_inode_unpacked.bi_snapshot is set in the old inode
     compat path: this allows Marcin's filesystem (in use since before
     6.7) to repair and mount"

* tag 'bcachefs-2024-10-22' of https://github.com/koverstreet/bcachefs: (26 commits)
  bcachefs: Set bch_inode_unpacked.bi_snapshot in old inode path
  bcachefs: Mark more errors as AUTOFIX
  bcachefs: Workaround for kvmalloc() not supporting > INT_MAX allocations
  bcachefs: Don't use wait_event_interruptible() in recovery
  bcachefs: Fix __bch2_fsck_err() warning
  bcachefs: fsck: Improve hash_check_key()
  bcachefs: bch2_hash_set_or_get_in_snapshot()
  bcachefs: Repair mismatches in inode hash seed, type
  bcachefs: Add hash seed, type to inode_to_text()
  bcachefs: INODE_STR_HASH() for bch_inode_unpacked
  bcachefs: Run in-kernel offline fsck without ratelimit errors
  bcachefs: skip mount option handle for empty string.
  bcachefs: fix incorrect show_options results
  bcachefs: Fix data corruption on -ENOSPC in buffered write path
  bcachefs: bch2_folio_reservation_get_partial() is now better behaved
  bcachefs: fix disk reservation accounting in bch2_folio_reservation_get()
  bcachefS: ec: fix data type on stripe deletion
  bcachefs: Don't use commit_do() unnecessarily
  bcachefs: handle restarts in bch2_bucket_io_time_reset()
  bcachefs: fix restart handling in __bch2_resume_logged_op_finsert()
  ...

8 months ago Revert "9p: Enable multipage folios"
Dominique Martinet [Wed, 23 Oct 2024 23:29:19 +0000 (08:29 +0900)]
Revert "9p: Enable multipage folios"

This reverts commit 1325e4a91a405f88f1b18626904d37860a4f9069.

Using multipage folios apparently breaks some madvise operations like
MADV_PAGEOUT, which no longer reliably unloads the specified page.

Revert the patch until that is figured out.

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Fixes: 1325e4a91a40 ("9p: Enable multipage folios")
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 months ago selftests: tls: add a selftest for wrapping rec_seq
Sabrina Dubroca [Fri, 18 Oct 2024 10:55:58 +0000 (12:55 +0200)]
selftests: tls: add a selftest for wrapping rec_seq

Set the initial rec_seq to 0xffffffffffffffff so that it wraps
immediately. The send() call should fail with EBADMSG.
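
Why an all-ones rec_seq wraps on the very next record can be illustrated
with a small big-endian counter increment. This is only a sketch:
seq_increment() is a hypothetical helper written for this example, not
the kernel's TLS code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: increment a big-endian counter of len bytes in
 * place. Returns true if the counter wrapped around to zero. */
static bool seq_increment(uint8_t *seq, size_t len)
{
	size_t i = len;

	while (i > 0) {
		i--;
		if (++seq[i] != 0)
			return false;	/* no carry out of this byte */
	}
	return true;	/* carry out of the most significant byte */
}
```

Starting from eight 0xff bytes, a single call carries through every byte,
leaves the counter at zero and reports the wrap, which is the condition
the selftest expects send() to refuse.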

A bug in this code was fixed in commit cfaa80c91f6f ("net/tls: do not
free tls_rec on async operation in bpf_exec_tx_verdict()").

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20775fcfd0371422921ee60a42de170c0398ac10.1729244987.git.sd@queasysnail.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago Merge branch 'phonet-convert-all-doit-and-dumpit-to-rcu'
Paolo Abeni [Thu, 24 Oct 2024 14:04:27 +0000 (16:04 +0200)]
Merge branch 'phonet-convert-all-doit-and-dumpit-to-rcu'

Kuniyuki Iwashima says:

====================
phonet: Convert all doit() and dumpit() to RCU.

addr_doit() and route_doit() access only phonet_device_list(dev_net(dev))
and phonet_pernet(dev_net(dev))->routes, respectively.

Each per-netns struct has its dedicated mutex, and RTNL also protects
the structs.  __dev_change_net_namespace() has synchronize_net(), so
we have two options to convert addr_doit() and route_doit().

  1. Use per-netns RTNL
  2. Use RCU and convert each struct mutex to spinlock_t

As RCU is preferable, this series converts all PF_PHONET's doit()
and dumpit() to RCU.

4 doit()s and 1 dumpit() are now converted to RCU, 70 doit()s and
28 dumpit()s are still under RTNL.
====================

Link: https://patch.msgid.link/20241017183140.43028-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago phonet: Don't hold RTNL for route_doit().
Kuniyuki Iwashima [Thu, 17 Oct 2024 18:31:40 +0000 (11:31 -0700)]
phonet: Don't hold RTNL for route_doit().

Now only __dev_get_by_index() depends on RTNL in route_doit().

Let's use dev_get_by_index_rcu() and register route_doit() with
RTNL_FLAG_DOIT_UNLOCKED.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago phonet: Convert phonet_routes.lock to spinlock_t.
Kuniyuki Iwashima [Thu, 17 Oct 2024 18:31:39 +0000 (11:31 -0700)]
phonet: Convert phonet_routes.lock to spinlock_t.

route_doit() calls phonet_route_add() or phonet_route_del()
for RTM_NEWROUTE or RTM_DELROUTE, respectively.

Both functions only touch phonet_pernet(dev_net(dev))->routes,
which is currently protected by RTNL and its dedicated mutex,
phonet_routes.lock.

We will convert route_doit() to RCU and cannot take a mutex inside an
RCU read-side critical section.

Let's convert the mutex to spinlock_t.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago phonet: Pass net and ifindex to rtm_phonet_notify().
Kuniyuki Iwashima [Thu, 17 Oct 2024 18:31:38 +0000 (11:31 -0700)]
phonet: Pass net and ifindex to rtm_phonet_notify().

Currently, rtm_phonet_notify() fetches netns and ifindex from dev.

Once route_doit() is converted to RCU, rtm_phonet_notify() will be
called outside of RCU due to GFP_KERNEL, and dev will be unavailable
there.

Let's pass net and ifindex to rtm_phonet_notify().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago phonet: Pass ifindex to fill_route().
Kuniyuki Iwashima [Thu, 17 Oct 2024 18:31:37 +0000 (11:31 -0700)]
phonet: Pass ifindex to fill_route().

We will convert route_doit() to RCU.

route_doit() will call rtm_phonet_notify() outside of RCU due
to GFP_KERNEL, so dev will not be available in fill_route().

Let's pass ifindex directly to fill_route().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago phonet: Don't hold RTNL for getaddr_dumpit().
Kuniyuki Iwashima [Thu, 17 Oct 2024 18:31:36 +0000 (11:31 -0700)]
phonet: Don't hold RTNL for getaddr_dumpit().

getaddr_dumpit() already relies on RCU and does not need RTNL.

Let's use READ_ONCE() for ifindex and register getaddr_dumpit()
with RTNL_FLAG_DUMP_UNLOCKED.

While at it, the retval of getaddr_dumpit() is changed so that
NLMSG_DONE can be combined with the last message, saving a recvmsg()
call, as done in 58a4ff5d77b1 ("phonet: no longer hold RTNL in
route_dumpit()").

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago phonet: Don't hold RTNL for addr_doit().
Kuniyuki Iwashima [Thu, 17 Oct 2024 18:31:35 +0000 (11:31 -0700)]
phonet: Don't hold RTNL for addr_doit().

Now only __dev_get_by_index() depends on RTNL in addr_doit().

Let's use dev_get_by_index_rcu() and register addr_doit() with
RTNL_FLAG_DOIT_UNLOCKED.

While at it, I changed phonet_rtnl_msg_handlers[]'s init to C99
style like other core networking code.
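
For readers unfamiliar with the C99 style mentioned above, the difference
between positional and designated initializers looks roughly like this.
The struct, its fields and the values are hypothetical and only loosely
modeled on a handler table; they are not the actual
phonet_rtnl_msg_handlers[] definition.

```c
/* Hypothetical handler table entry, for illustration only. */
struct msg_handler {
	int msgtype;
	int (*doit)(int arg);
	unsigned int flags;
};

static int demo_doit(int arg)
{
	return arg + 1;
}

/* Positional style: the meaning of each value depends on field order. */
static const struct msg_handler handlers_positional[] = {
	{ 20, demo_doit, 0x1 },
};

/* C99 designated style: each field is named, and any field left out
 * is zero-initialized. */
static const struct msg_handler handlers_c99[] = {
	{ .msgtype = 20, .doit = demo_doit, .flags = 0x1 },
};
```

Both tables hold the same data; the designated form keeps working if
fields are later reordered or added.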

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago phonet: Convert phonet_device_list.lock to spinlock_t.
Kuniyuki Iwashima [Thu, 17 Oct 2024 18:31:34 +0000 (11:31 -0700)]
phonet: Convert phonet_device_list.lock to spinlock_t.

addr_doit() calls phonet_address_add() or phonet_address_del()
for RTM_NEWADDR or RTM_DELADDR, respectively.

Both functions only touch phonet_device_list(dev_net(dev)),
which is currently protected by RTNL and its dedicated mutex,
phonet_device_list.lock.

We will convert addr_doit() to RCU and cannot take a mutex inside an
RCU read-side critical section.

Let's convert the mutex to spinlock_t.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago phonet: Pass net and ifindex to phonet_address_notify().
Kuniyuki Iwashima [Thu, 17 Oct 2024 18:31:33 +0000 (11:31 -0700)]
phonet: Pass net and ifindex to phonet_address_notify().

Currently, phonet_address_notify() fetches netns and ifindex from dev.

Once addr_doit() is converted to RCU, phonet_address_notify() will be
called outside of RCU due to GFP_KERNEL, and dev will be unavailable
there.

Let's pass net and ifindex to phonet_address_notify().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago phonet: Pass ifindex to fill_addr().
Kuniyuki Iwashima [Thu, 17 Oct 2024 18:31:32 +0000 (11:31 -0700)]
phonet: Pass ifindex to fill_addr().

We will convert addr_doit() and getaddr_dumpit() to RCU, both
of which call fill_addr().

The former will call phonet_address_notify() outside of RCU
due to GFP_KERNEL, so dev will not be available in fill_addr().

Let's pass ifindex directly to fill_addr().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago Merge branch 'net-dsa-mv88e6xxx-fix-mv88e6393x-phc-frequency-on-internal-clock'
Paolo Abeni [Thu, 24 Oct 2024 10:57:48 +0000 (12:57 +0200)]
Merge branch 'net-dsa-mv88e6xxx-fix-mv88e6393x-phc-frequency-on-internal-clock'

Shenghao Yang says:

====================
net: dsa: mv88e6xxx: fix MV88E6393X PHC frequency on internal clock

The MV88E6393X family of switches can additionally run their cycle
counters using a 250MHz internal clock instead of the usual 125MHz
external clock [1].

The driver currently assumes all designs utilize that external clock,
but MikroTik's RB5009 uses the internal source - causing the PHC to be
seen running at 2x real time in userspace, making synchronization
with ptp4l impossible.

This series adds support for reading off the cycle counter frequency
known to the hardware in the TAI_CLOCK_PERIOD register and picking an
appropriate set of scaling coefficients instead of using a fixed set
for each switch family.

Patch 1 groups those cycle counter coefficients into a new structure to
make it easier to pass them around.

Patch 2 modifies PTP initialization to probe TAI_CLOCK_PERIOD and
use an appropriate set of coefficients.

Patch 3 adds support for 4000ps cycle counter periods.

Changes since v2 [2]:

- Patch 1: "net: dsa: mv88e6xxx: group cycle counter coefficients"
  - Moved declaration of mv88e6xxx_cc_coeffs to avoid moving that in
    Patch 2.

- Patch 2: "net: dsa: mv88e6xxx: read cycle counter period from hardware"
  - Removed move of mv88e6xxx_cc_coeffs declaration.

- Patch 3: "net: dsa: mv88e6xxx: support 4000ps cycle counter periods"
  - No change.

[1] https://lore.kernel.org/netdev/d6622575-bf1b-445a-b08f-2739e3642aae@lunn.ch/
[2] https://lore.kernel.org/netdev/20241006145951.719162-1-me@shenghaoyang.info/
====================

Link: https://patch.msgid.link/20241020063833.5425-1-me@shenghaoyang.info
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago net: dsa: mv88e6xxx: support 4000ps cycle counter period
Shenghao Yang [Sun, 20 Oct 2024 06:38:30 +0000 (14:38 +0800)]
net: dsa: mv88e6xxx: support 4000ps cycle counter period

The MV88E6393X family of devices can run its cycle counter off
an internal 250MHz clock instead of an external 125MHz one.

Add support for this cycle counter period by adding another set
of coefficients and lowering the periodic cycle counter read interval
to compensate for faster overflows at the increased frequency.

Otherwise, the PHC runs at 2x real time in userspace and cannot be
synchronized.
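
The 2x figure follows from the tick periods: 125MHz gives 8000ps per
tick and 250MHz gives 4000ps. A rough worked sketch of that arithmetic,
using hypothetical helper names (this is not the driver's coefficient
code):

```c
#include <stdint.h>

/* Period of one counter tick in picoseconds for a clock given in Hz. */
static uint64_t period_ps(uint64_t freq_hz)
{
	return 1000000000000ULL / freq_hz;
}

/* Nanoseconds reported after real_ns of real time when the hardware
 * ticks every hw_period_ps but each tick is scaled as if it lasted
 * assumed_period_ps. */
static uint64_t reported_ns(uint64_t real_ns, uint64_t hw_period_ps,
			    uint64_t assumed_period_ps)
{
	uint64_t ticks = real_ns * 1000 / hw_period_ps;

	return ticks * assumed_period_ps / 1000;
}
```

Scaling a 4000ps counter with the 8000ps assumption turns one second of
real time into two reported seconds; with matching periods the reported
time equals real time.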

Fixes: de776d0d316f ("net: dsa: mv88e6xxx: add support for mv88e6393x family")
Signed-off-by: Shenghao Yang <me@shenghaoyang.info>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
8 months ago net: dsa: mv88e6xxx: read cycle counter period from hardware
Shenghao Yang [Sun, 20 Oct 2024 06:38:29 +0000 (14:38 +0800)]
net: dsa: mv88e6xxx: read cycle counter period from hardware

Instead of relying on a fixed mapping of hardware family to cycle
counter frequency, pull this information from the
MV88E6XXX_TAI_CLOCK_PERIOD register.

This lets us support switches whose cycle counter frequencies depend on
board design.

Fixes: de776d0d316f ("net: dsa: mv88e6xxx: add support for mv88e6393x family")
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Shenghao Yang <me@shenghaoyang.info>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>