linux-block.git
4 months agoMerge tag 'wireless-next-2024-02-20' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Wed, 21 Feb 2024 11:48:20 +0000 (11:48 +0000)]
Merge tag 'wireless-next-2024-02-20' of git://git./linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.9

The second "new features" pull request for v6.9.  Lots of iwlwifi and
stack changes this time. And naturally smaller changes to other drivers.

We also twice merged wireless into wireless-next to avoid conflicts
between the trees.

Major changes:

stack

* mac80211: negotiated TTLM request support

* SPP A-MSDU support

* mac80211: wider bandwidth OFDMA config support

iwlwifi

* kunit tests

* bump FW API to 89 for AX/BZ/SC devices

* enable SPP A-MSDUs

* support for new devices

ath12k

* refactoring in preparation for Multi-Link Operation (MLO) support

* 1024 Block Ack window size support

* provide firmware wmi logs via a trace event

ath11k

* 36 bit DMA mask support

* support 6 GHz station power modes: Low Power Indoor (LPI), Standard
  Power) SP and Very Low Power (VLP)

rtl8xxxu

* TP-Link TL-WN823N V2 support
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoMerge branch 'net-kmem-cache-create'
David S. Miller [Wed, 21 Feb 2024 11:28:58 +0000 (11:28 +0000)]
Merge branch 'net-kmem-cache-create'

Kunwu Chan says:

====================
net: Use KMEM_CACHE instead of kmem_cache_create

As Jiri Pirko suggests,
I'm using a patchset to cleanup the same issues in the 'net' module.
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.

Some cache names are changed to be the same as struct names.
This change is recorded in the changelog for easy reference.
It's harmless cause it's used in /proc/slabinfo to identify this cache.
---
Changes in v2:
- Delete a patch as Eric said in https://lore.kernel.org/all/CANn89iLkWvum6wSqSya_K+1eqnFvp=L2WLW=kAYrZTF8Ei4b7g@mail.gmail.com/
- No code changes,only add Reviewed-by tag
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoipv6: Simplify the allocation of slab caches
Kunwu Chan [Tue, 20 Feb 2024 07:36:46 +0000 (15:36 +0800)]
ipv6: Simplify the allocation of slab caches

Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoipv4: Simplify the allocation of slab caches in ip_rt_init
Kunwu Chan [Tue, 20 Feb 2024 07:36:45 +0000 (15:36 +0800)]
ipv4: Simplify the allocation of slab caches in ip_rt_init

Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
And change cache name from 'ip_dst_cache' to 'rtable'.

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoipmr: Simplify the allocation of slab caches
Kunwu Chan [Tue, 20 Feb 2024 07:36:44 +0000 (15:36 +0800)]
ipmr: Simplify the allocation of slab caches

Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
And change cache name from 'ip_mrt_cache' to 'mfc_cache'.

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoip6mr: Simplify the allocation of slab caches in ip6_mr_init
Kunwu Chan [Tue, 20 Feb 2024 07:36:43 +0000 (15:36 +0800)]
ip6mr: Simplify the allocation of slab caches in ip6_mr_init

Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
And change cache name from 'ip6_mrt_cache' to 'mfc6_cache'.

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: kcm: Simplify the allocation of slab caches
Kunwu Chan [Tue, 20 Feb 2024 07:36:42 +0000 (15:36 +0800)]
net: kcm: Simplify the allocation of slab caches

Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
And change cache name from 'kcm_mux_cache' to 'kcm_mux',
'kcm_psock_cache' to 'kcm_psock'.

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet/dummy: Move stats allocation to core
Breno Leitao [Mon, 19 Feb 2024 13:43:28 +0000 (05:43 -0800)]
net/dummy: Move stats allocation to core

With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and
convert veth & vrf"), stats allocation could be done on net core instead
of this driver.

With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.

Move dummy driver to leverage the core allocation.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agotg3: simplify tg3_phy_autoneg_cfg
Heiner Kallweit [Sun, 18 Feb 2024 18:04:42 +0000 (19:04 +0100)]
tg3: simplify tg3_phy_autoneg_cfg

Make use of ethtool_adv_to_mmd_eee_adv_t() to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agotg3: copy only needed fields from userspace-provided EEE data
Heiner Kallweit [Sun, 18 Feb 2024 14:49:55 +0000 (15:49 +0100)]
tg3: copy only needed fields from userspace-provided EEE data

The current code overwrites fields in tp->eee with unchecked data from
edata, e.g. the bitmap with supported modes. ethtool properly returns
the received data from get_eee() call, but we have no guarantee that
other users of the ioctl set_eee() interface behave properly too.
Therefore copy only fields which are actually needed.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoMerge branch 'net-constify-device_type'
David S. Miller [Wed, 21 Feb 2024 09:45:24 +0000 (09:45 +0000)]
Merge branch 'net-constify-device_type'

Ricardo B. Marliere says:

====================
net: constify struct device_type usage

This is a simple and straight forward cleanup series that makes all device
types in the net subsystem constants. This has been possible since 2011 [1]
but not all occurrences were cleaned. I have been sweeping the tree to fix
them all.

I was not sure if I should send these squashed, but there are quite a few
changes so I decided to send them separately. Please let me know if that is
not desirable.

[1] https://lore.kernel.org/all/1305850262-9575-5-git-send-email-gregkh@suse.de/

====================

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
4 months agonet: hso: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:34 +0000 (17:13 -0300)]
net: hso: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the hso_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: wwan: core: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:33 +0000 (17:13 -0300)]
net: wwan: core: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the wwan_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: netdevsim: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:32 +0000 (17:13 -0300)]
net: netdevsim: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the
nsim_bus_dev_type variable to be a constant structure as well, placing it
into read-only memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: vlan: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:31 +0000 (17:13 -0300)]
net: vlan: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the vlan_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: l2tp: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:30 +0000 (17:13 -0300)]
net: l2tp: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the l2tpeth_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: hsr: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:29 +0000 (17:13 -0300)]
net: hsr: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the hsr_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: geneve: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:28 +0000 (17:13 -0300)]
net: geneve: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the geneve_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: ppp: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:27 +0000 (17:13 -0300)]
net: ppp: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the ppp_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: vxlan: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:26 +0000 (17:13 -0300)]
net: vxlan: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the vxlan_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: bridge: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:25 +0000 (17:13 -0300)]
net: bridge: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the br_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: dsa: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:24 +0000 (17:13 -0300)]
net: dsa: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the dsa_type
variable to be a constant structure as well, placing it into read-only
memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: usbnet: constify the struct device_type usage
Ricardo B. Marliere [Sat, 17 Feb 2024 20:13:23 +0000 (17:13 -0300)]
net: usbnet: constify the struct device_type usage

Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the wlan_type
and wwan_type variables to be constant structures as well, placing it into
read-only memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: wan: framer: constify of_phandle_args in xlate
Krzysztof Kozlowski [Sat, 17 Feb 2024 10:03:06 +0000 (11:03 +0100)]
net: wan: framer: constify of_phandle_args in xlate

The xlate callbacks are supposed to translate of_phandle_args to proper
provider without modifying the of_phandle_args.  Make the argument
pointer to const for code safety and readability.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240217100306.86740-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agor8169: add MODULE_FIRMWARE entry for RTL8126A
Heiner Kallweit [Sat, 17 Feb 2024 14:48:23 +0000 (15:48 +0100)]
r8169: add MODULE_FIRMWARE entry for RTL8126A

Add the missing MODULE_FIRMWARE entry for RTL8126A.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/47ef79d2-59c4-4d44-9595-366c70c4ad87@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: fix pointer check in skb_pp_cow_data routine
Lorenzo Bianconi [Sat, 17 Feb 2024 11:12:14 +0000 (12:12 +0100)]
net: fix pointer check in skb_pp_cow_data routine

Properly check page pointer returned by page_pool_dev_alloc routine in
skb_pp_cow_data() for non-linear part of the original skb.

Reported-by: Julian Wiedmann <jwiedmann.dev@gmail.com>
Closes: https://lore.kernel.org/netdev/cover.1707729884.git.lorenzo@kernel.org/T/#m7d189b0015a7281ed9221903902490c03ed19a7a
Fixes: e6d5dbdd20aa ("xdp: add multi-buff support for xdp running in generic mode")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/25512af3e09befa9dcb2cf3632bdc45b807cf330.1708167716.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge tag 'linux-can-next-for-6.9-20240220' of git://git.kernel.org/pub/scm/linux...
Paolo Abeni [Tue, 20 Feb 2024 14:32:44 +0000 (15:32 +0100)]
Merge tag 'linux-can-next-for-6.9-20240220' of git://git./linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2024-02-20

this is a pull request of 9 patches for net-next/master.

The first patch is by Francesco Dolcini and removes a redundant check
for pm_clock_support from the m_can driver.

Martin Hundebøll contributes 3 patches to the m_can/tcan4x5x driver to
allow resume upon RX of a CAN frame.

3 patches by Srinivas Goud add support for ECC statistics to the
xilinx_can driver.

The last 2 patches are by Oliver Hartkopp and me, target the CAN RAW
protocol and fix an error in the getsockopt() for CAN-XL introduced in
the previous pull request to net-next (linux-can-next-for-6.9-20240213).

linux-can-next-for-6.9-20240220

* tag 'linux-can-next-for-6.9-20240220' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
  can: raw: raw_getsockopt(): reduce scope of err
  can: raw: fix getsockopt() for new CAN_RAW_XL_VCID_OPTS
  can: xilinx_can: Add ethtool stats interface for ECC errors
  can: xilinx_can: Add ECC support
  dt-bindings: can: xilinx_can: Add 'xlnx,has-ecc' optional property
  can: tcan4x5x: support resuming from rx interrupt signal
  can: m_can: allow keeping the transceiver running in suspend
  dt-bindings: can: tcan4x5x: Document the wakeup-source flag
  can: m_can: remove redundant check for pm_clock_support
====================

Link: https://lore.kernel.org/r/20240220085130.2936533-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agonet: skbuff: add overflow debug check to pull/push helpers
Florian Westphal [Fri, 16 Feb 2024 11:36:57 +0000 (12:36 +0100)]
net: skbuff: add overflow debug check to pull/push helpers

syzbot managed to trigger following splat:
BUG: KASAN: use-after-free in __skb_flow_dissect+0x4a3b/0x5e50
Read of size 1 at addr ffff888208a4000e by task a.out/2313
[..]
  __skb_flow_dissect+0x4a3b/0x5e50
  __skb_get_hash+0xb4/0x400
  ip_tunnel_xmit+0x77e/0x26f0
  ipip_tunnel_xmit+0x298/0x410
  ..

Analysis shows that the skb has a valid ->head, but bogus ->data
pointer.

skb->data gets its bogus value via the neigh layer, which does:

1556    __skb_pull(skb, skb_network_offset(skb));

... and the skb was already dodgy at this point:

skb_network_offset(skb) returns a negative value due to an
earlier overflow of skb->network_header (u16).  __skb_pull thus
"adjusts" skb->data by a huge offset, pointing outside skb->head
area.

Allow debug builds to splat when we try to pull/push more than
INT_MAX bytes.

After this, the syzkaller reproducer yields a more precise splat
before the flow dissector attempts to read off skb->data memory:

WARNING: CPU: 5 PID: 2313 at include/linux/skbuff.h:2653 neigh_connected_output+0x28e/0x400
  ip_finish_output2+0xb25/0xed0
  iptunnel_xmit+0x4ff/0x870
  ipgre_xmit+0x78e/0xbb0

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240216113700.23013-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agonet: reorganize "struct sock" fields
Eric Dumazet [Fri, 16 Feb 2024 16:20:06 +0000 (16:20 +0000)]
net: reorganize "struct sock" fields

Last major reorg happened in commit 9115e8cd2a0c ("net: reorganize
struct sock for better data locality")

Since then, many changes have been done.

Before SO_PEEK_OFF support is added to TCP, we need
to move sk_peek_off to a better location.

It is time to make another pass, and add six groups,
without explicit alignment.

- sock_write_rx (following sk_refcnt) read-write fields in rx path.
- sock_read_rx read-mostly fields in rx path.
- sock_read_rxtx read-mostly fields in both rx and tx paths.
- sock_write_rxtx read-write fields in both rx and tx paths.
- sock_write_tx read-write fields in tx paths.
- sock_read_tx read-mostly fields in tx paths.

Results on TCP_RR benchmarks seem to show a gain (4 to 5 %).

It is possible UDP needs a change, because sk_peek_off
shares a cache line with sk_receive_queue.
If this the case, we can exchange roles of sk->sk_receive
and up->reader_queue queues.

After this change, we have the following layout:

struct sock {
struct sock_common         __sk_common;          /*     0  0x88 */
/* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
__u8                       __cacheline_group_begin__sock_write_rx[0]; /*  0x88     0 */
atomic_t                   sk_drops;             /*  0x88   0x4 */
__s32                      sk_peek_off;          /*  0x8c   0x4 */
struct sk_buff_head        sk_error_queue;       /*  0x90  0x18 */
struct sk_buff_head        sk_receive_queue;     /*  0xa8  0x18 */
/* --- cacheline 3 boundary (192 bytes) --- */
struct {
atomic_t           rmem_alloc;           /*  0xc0   0x4 */
int                len;                  /*  0xc4   0x4 */
struct sk_buff *   head;                 /*  0xc8   0x8 */
struct sk_buff *   tail;                 /*  0xd0   0x8 */
} sk_backlog;                                    /*  0xc0  0x18 */
struct {
atomic_t                   rmem_alloc;           /*     0   0x4 */
int                        len;                  /*   0x4   0x4 */
struct sk_buff *           head;                 /*   0x8   0x8 */
struct sk_buff *           tail;                 /*  0x10   0x8 */

/* size: 24, cachelines: 1, members: 4 */
/* last cacheline: 24 bytes */
};

__u8                       __cacheline_group_end__sock_write_rx[0]; /*  0xd8     0 */
__u8                       __cacheline_group_begin__sock_read_rx[0]; /*  0xd8     0 */
rcu *                      sk_rx_dst;            /*  0xd8   0x8 */
int                        sk_rx_dst_ifindex;    /*  0xe0   0x4 */
u32                        sk_rx_dst_cookie;     /*  0xe4   0x4 */
unsigned int               sk_ll_usec;           /*  0xe8   0x4 */
unsigned int               sk_napi_id;           /*  0xec   0x4 */
u16                        sk_busy_poll_budget;  /*  0xf0   0x2 */
u8                         sk_prefer_busy_poll;  /*  0xf2   0x1 */
u8                         sk_userlocks;         /*  0xf3   0x1 */
int                        sk_rcvbuf;            /*  0xf4   0x4 */
rcu *                      sk_filter;            /*  0xf8   0x8 */
/* --- cacheline 4 boundary (256 bytes) --- */
union {
rcu *              sk_wq;                /* 0x100   0x8 */
struct socket_wq * sk_wq_raw;            /* 0x100   0x8 */
};                                               /* 0x100   0x8 */
union {
rcu *                      sk_wq;                /*     0   0x8 */
struct socket_wq *         sk_wq_raw;            /*     0   0x8 */
};

void                       (*sk_data_ready)(struct sock *); /* 0x108   0x8 */
long                       sk_rcvtimeo;          /* 0x110   0x8 */
int                        sk_rcvlowat;          /* 0x118   0x4 */
__u8                       __cacheline_group_end__sock_read_rx[0]; /* 0x11c     0 */
__u8                       __cacheline_group_begin__sock_read_rxtx[0]; /* 0x11c     0 */
int                        sk_err;               /* 0x11c   0x4 */
struct socket *            sk_socket;            /* 0x120   0x8 */
struct mem_cgroup *        sk_memcg;             /* 0x128   0x8 */
rcu *                      sk_policy[2];         /* 0x130  0x10 */
/* --- cacheline 5 boundary (320 bytes) --- */
__u8                       __cacheline_group_end__sock_read_rxtx[0]; /* 0x140     0 */
__u8                       __cacheline_group_begin__sock_write_rxtx[0]; /* 0x140     0 */
socket_lock_t              sk_lock;              /* 0x140  0x20 */
u32                        sk_reserved_mem;      /* 0x160   0x4 */
int                        sk_forward_alloc;     /* 0x164   0x4 */
u32                        sk_tsflags;           /* 0x168   0x4 */
__u8                       __cacheline_group_end__sock_write_rxtx[0]; /* 0x16c     0 */
__u8                       __cacheline_group_begin__sock_write_tx[0]; /* 0x16c     0 */
int                        sk_write_pending;     /* 0x16c   0x4 */
atomic_t                   sk_omem_alloc;        /* 0x170   0x4 */
int                        sk_sndbuf;            /* 0x174   0x4 */
int                        sk_wmem_queued;       /* 0x178   0x4 */
refcount_t                 sk_wmem_alloc;        /* 0x17c   0x4 */
/* --- cacheline 6 boundary (384 bytes) --- */
unsigned long              sk_tsq_flags;         /* 0x180   0x8 */
union {
struct sk_buff *   sk_send_head;         /* 0x188   0x8 */
struct rb_root     tcp_rtx_queue;        /* 0x188   0x8 */
};                                               /* 0x188   0x8 */
union {
struct sk_buff *           sk_send_head;         /*     0   0x8 */
struct rb_root             tcp_rtx_queue;        /*     0   0x8 */
};

struct sk_buff_head        sk_write_queue;       /* 0x190  0x18 */
u32                        sk_dst_pending_confirm; /* 0x1a8   0x4 */
u32                        sk_pacing_status;     /* 0x1ac   0x4 */
struct page_frag           sk_frag;              /* 0x1b0  0x10 */
/* --- cacheline 7 boundary (448 bytes) --- */
struct timer_list          sk_timer;             /* 0x1c0  0x28 */

/* XXX last struct has 4 bytes of padding */

unsigned long              sk_pacing_rate;       /* 0x1e8   0x8 */
atomic_t                   sk_zckey;             /* 0x1f0   0x4 */
atomic_t                   sk_tskey;             /* 0x1f4   0x4 */
__u8                       __cacheline_group_end__sock_write_tx[0]; /* 0x1f8     0 */
__u8                       __cacheline_group_begin__sock_read_tx[0]; /* 0x1f8     0 */
unsigned long              sk_max_pacing_rate;   /* 0x1f8   0x8 */
/* --- cacheline 8 boundary (512 bytes) --- */
long                       sk_sndtimeo;          /* 0x200   0x8 */
u32                        sk_priority;          /* 0x208   0x4 */
u32                        sk_mark;              /* 0x20c   0x4 */
rcu *                      sk_dst_cache;         /* 0x210   0x8 */
netdev_features_t          sk_route_caps;        /* 0x218   0x8 */
u16                        sk_gso_type;          /* 0x220   0x2 */
u16                        sk_gso_max_segs;      /* 0x222   0x2 */
unsigned int               sk_gso_max_size;      /* 0x224   0x4 */
gfp_t                      sk_allocation;        /* 0x228   0x4 */
u32                        sk_txhash;            /* 0x22c   0x4 */
u8                         sk_pacing_shift;      /* 0x230   0x1 */
bool                       sk_use_task_frag;     /* 0x231   0x1 */
__u8                       __cacheline_group_end__sock_read_tx[0]; /* 0x232     0 */
u8                         sk_gso_disabled:1;    /* 0x232: 0 0x1 */
u8                         sk_kern_sock:1;       /* 0x232:0x1 0x1 */
u8                         sk_no_check_tx:1;     /* 0x232:0x2 0x1 */
u8                         sk_no_check_rx:1;     /* 0x232:0x3 0x1 */

/* XXX 4 bits hole, try to pack */

u8                         sk_shutdown;          /* 0x233   0x1 */
u16                        sk_type;              /* 0x234   0x2 */
u16                        sk_protocol;          /* 0x236   0x2 */
unsigned long              sk_lingertime;        /* 0x238   0x8 */
/* --- cacheline 9 boundary (576 bytes) --- */
struct proto *             sk_prot_creator;      /* 0x240   0x8 */
rwlock_t                   sk_callback_lock;     /* 0x248   0x8 */
int                        sk_err_soft;          /* 0x250   0x4 */
u32                        sk_ack_backlog;       /* 0x254   0x4 */
u32                        sk_max_ack_backlog;   /* 0x258   0x4 */
kuid_t                     sk_uid;               /* 0x25c   0x4 */
spinlock_t                 sk_peer_lock;         /* 0x260   0x4 */
int                        sk_bind_phc;          /* 0x264   0x4 */
struct pid *               sk_peer_pid;          /* 0x268   0x8 */
const struct cred  *       sk_peer_cred;         /* 0x270   0x8 */
ktime_t                    sk_stamp;             /* 0x278   0x8 */
/* --- cacheline 10 boundary (640 bytes) --- */
int                        sk_disconnects;       /* 0x280   0x4 */
u8                         sk_txrehash;          /* 0x284   0x1 */
u8                         sk_clockid;           /* 0x285   0x1 */
u8                         sk_txtime_deadline_mode:1; /* 0x286: 0 0x1 */
u8                         sk_txtime_report_errors:1; /* 0x286:0x1 0x1 */
u8                         sk_txtime_unused:6;   /* 0x286:0x2 0x1 */

/* XXX 1 byte hole, try to pack */

void *                     sk_user_data;         /* 0x288   0x8 */
void *                     sk_security;          /* 0x290   0x8 */
struct sock_cgroup_data    sk_cgrp_data;         /* 0x298   0x8 */
void                       (*sk_state_change)(struct sock *); /* 0x2a0   0x8 */
void                       (*sk_write_space)(struct sock *); /* 0x2a8   0x8 */
void                       (*sk_error_report)(struct sock *); /* 0x2b0   0x8 */
int                        (*sk_backlog_rcv)(struct sock *, struct sk_buff *); /* 0x2b8   0x8 */
/* --- cacheline 11 boundary (704 bytes) --- */
void                       (*sk_destruct)(struct sock *); /* 0x2c0   0x8 */
rcu *                      sk_reuseport_cb;      /* 0x2c8   0x8 */
rcu *                      sk_bpf_storage;       /* 0x2d0   0x8 */
struct callback_head       sk_rcu __attribute__((__aligned__(8))); /* 0x2d8  0x10 */
netns_tracker              ns_tracker;           /* 0x2e8   0x8 */

/* size: 752, cachelines: 12, members: 105 */
/* sum members: 749, holes: 1, sum holes: 1 */
/* sum bitfield members: 12 bits, bit holes: 1, sum bit holes: 4 bits */
/* paddings: 1, sum paddings: 4 */
/* forced alignments: 1 */
/* last cacheline: 48 bytes */
};

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20240216162006.2342759-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agonet: tcp: Remove redundant initialization of variable len
Colin Ian King [Fri, 16 Feb 2024 12:54:43 +0000 (12:54 +0000)]
net: tcp: Remove redundant initialization of variable len

The variable len being initialized with a value that is never read, an
if statement is initializing it in both paths of the if statement.
The initialization is redundant and can be removed.

Cleans up clang scan build warning:
net/ipv4/tcp_ao.c:512:11: warning: Value stored to 'len' during its
initialization is never read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Link: https://lore.kernel.org/r/20240216125443.2107244-1-colin.i.king@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agocan: raw: raw_getsockopt(): reduce scope of err
Marc Kleine-Budde [Tue, 20 Feb 2024 08:16:16 +0000 (09:16 +0100)]
can: raw: raw_getsockopt(): reduce scope of err

Reduce the scope of the variable "err" to the individual cases. This
is to avoid the mistake of setting "err" in the mistaken belief that
it will be evaluated later.

Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/all/20240220-raw-setsockopt-v1-1-7d34cb1377fc@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 months agoMerge branch 'abstract-page-from-net-stack'
Paolo Abeni [Tue, 20 Feb 2024 08:23:00 +0000 (09:23 +0100)]
Merge branch 'abstract-page-from-net-stack'

Mina Almasry says:

====================
Abstract page from net stack

This series is a prerequisite to the devmem TCP series. For a full
snapshot of the code which includes these changes, feel free to check:

https://github.com/mina/linux/commits/tcpdevmem-rfcv5/

Currently these components in the net stack use the struct page
directly:

1. Drivers.
2. Page pool.
3. skb_frag_t.

To add support for new (non struct page) memory types to the net stack, we
must first abstract the current memory type.

Originally the plan was to reuse struct page* for the new memory types,
and to set the LSB on the page* to indicate it's not really a page.
However, for safe compiler type checking we need to introduce a new type.

struct netmem is introduced to abstract the underlying memory type.
Currently it's a no-op abstraction that is always a struct page underneath.
In parallel there is an undergoing effort to add support for devmem to the
net stack:

https://lore.kernel.org/netdev/20231208005250.2910004-1-almasrymina@google.com/

Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Yunsheng Lin <linyunsheng@huawei.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
====================

Link: https://lore.kernel.org/r/20240214223405.1972973-1-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agonet: add netmem to skb_frag_t
Mina Almasry [Wed, 14 Feb 2024 22:34:03 +0000 (14:34 -0800)]
net: add netmem to skb_frag_t

Use struct netmem* instead of page in skb_frag_t. Currently struct
netmem* is always a struct page underneath, but the abstraction
allows efforts to add support for skb frags not backed by pages.

There is unfortunately 1 instance where the skb_frag_t is assumed to be
a exactly a bio_vec in kcm. For this case, WARN_ON_ONCE and return error
before doing a cast.

Add skb[_frag]_fill_netmem_*() and skb_add_rx_frag_netmem() helpers so
that the API can be used to create netmem skbs.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agonet: introduce abstraction for network memory
Mina Almasry [Wed, 14 Feb 2024 22:34:02 +0000 (14:34 -0800)]
net: introduce abstraction for network memory

Add the netmem_ref type, an abstraction for network memory.

To add support for new memory types to the net stack, we must first
abstract the current memory type. Currently parts of the net stack
use struct page directly:

- page_pool
- drivers
- skb_frag_t

Originally the plan was to reuse struct page* for the new memory types,
and to set the LSB on the page* to indicate it's not really a page.
However, for compiler type checking we need to introduce a new type.

netmem_ref is introduced to abstract the underlying memory type.
Currently it's a no-op abstraction that is always a struct page
underneath. In parallel there is an undergoing effort to add support
for devmem to the net stack:

https://lore.kernel.org/netdev/20231208005250.2910004-1-almasrymina@google.com/

netmem_ref can be pointers to different underlying memory types, and the
low bits are set to indicate the memory type. Helpers are provided
to convert netmem pointers to the underlying memory type (currently only
struct page). In the devmem series helpers are provided so that calling
code can use netmem without worrying about the underlying memory type
unless absolutely necessary.

Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
4 months agocan: raw: fix getsockopt() for new CAN_RAW_XL_VCID_OPTS
Oliver Hartkopp [Mon, 19 Feb 2024 20:00:21 +0000 (21:00 +0100)]
can: raw: fix getsockopt() for new CAN_RAW_XL_VCID_OPTS

The code for the CAN_RAW_XL_VCID_OPTS getsockopt() was incompletely adopted
from the CAN_RAW_FILTER getsockopt().

Add the missing put_user() and return statements.

Flagged by Smatch.

Fixes: c83c22ec1493 ("can: canxl: add virtual CAN network identifier support")
Reported-by: Simon Horman <horms@kernel.org>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/all/20240219200021.12113-1-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 months agonet: sysfs: Do not create sysfs for non BQL device
Breno Leitao [Fri, 16 Feb 2024 09:41:52 +0000 (01:41 -0800)]
net: sysfs: Do not create sysfs for non BQL device

Creation of sysfs entries is expensive, mainly for workloads that
constantly creates netdev and netns often.

Do not create BQL sysfs entries for devices that don't need,
basically those that do not have a real queue, i.e, devices that has
NETIF_F_LLTX and IFF_NO_QUEUE, such as `lo` interface.

This will remove the /sys/class/net/eth0/queues/tx-X/byte_queue_limits/
directory for these devices.

In the example below, eth0 has the `byte_queue_limits` directory but not
`lo`.

# ls /sys/class/net/lo/queues/tx-0/
traffic_class  tx_maxrate  tx_timeout  xps_cpus  xps_rxqs

# ls /sys/class/net/eth0/queues/tx-0/byte_queue_limits/
hold_time  inflight  limit  limit_max  limit_min

This also removes the #ifdefs, since we can also use netdev_uses_bql() to
check if the config is enabled. (as suggested by Jakub).

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240216094154.3263843-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agonet: page_pool: fix recycle stats for system page_pool allocator
Lorenzo Bianconi [Fri, 16 Feb 2024 09:25:43 +0000 (10:25 +0100)]
net: page_pool: fix recycle stats for system page_pool allocator

Use global percpu page_pool_recycle_stats counter for system page_pool
allocator instead of allocating a separate percpu variable for each
(also percpu) page pool instance.

Reviewed-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/87f572425e98faea3da45f76c3c68815c01a20ee.1708075412.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agopage_pool: disable direct recycling based on pool->cpuid on destroy
Alexander Lobakin [Thu, 15 Feb 2024 11:39:05 +0000 (12:39 +0100)]
page_pool: disable direct recycling based on pool->cpuid on destroy

Now that direct recycling is performed basing on pool->cpuid when set,
memory leaks are possible:

1. A pool is destroyed.
2. Alloc cache is emptied (it's done only once).
3. pool->cpuid is still set.
4. napi_pp_put_page() does direct recycling basing on pool->cpuid.
5. Now alloc cache is not empty, but it won't ever be freed.

In order to avoid that, rewrite pool->cpuid to -1 when unlinking NAPI to
make sure no direct recycling will be possible after emptying the cache.
This involves a bit of overhead as pool->cpuid now must be accessed
via READ_ONCE() to avoid partial reads.
Rename page_pool_unlink_napi() -> page_pool_disable_direct_recycling()
to reflect what it actually does and unexport it.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20240215113905.96817-1-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agodt-bindings: net: fec: add iommus property
Frank Li [Thu, 1 Feb 2024 20:22:42 +0000 (15:22 -0500)]
dt-bindings: net: fec: add iommus property

iMX8QM have iommu. Add proerty 'iommus'.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240201-8qm_smmu-v2-2-3d12a80201a3@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agowifi: wilc1000: add missing read critical sections around vif list traversal
Ajay Singh [Thu, 15 Feb 2024 15:36:21 +0000 (16:36 +0100)]
wifi: wilc1000: add missing read critical sections around vif list traversal

Some code manipulating the vif list is still missing some srcu_read_lock /
srcu_read_unlock, and so can trigger RCU warnings:

=============================
WARNING: suspicious RCU usage
6.8.0-rc1+ #37 Not tainted
-----------------------------
drivers/net/wireless/microchip/wilc1000/hif.c:110 RCU-list traversed without holding the required lock!!
[...]
stack backtrace:
CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 6.8.0-rc1+ #37
Hardware name: Atmel SAMA5
Workqueue: events sdio_irq_work
 unwind_backtrace from show_stack+0x18/0x1c
 show_stack from dump_stack_lvl+0x34/0x58
 dump_stack_lvl from wilc_get_vif_from_idx+0x158/0x180
 wilc_get_vif_from_idx from wilc_network_info_received+0x80/0x48c
 wilc_network_info_received from wilc_handle_isr+0xa10/0xd30
 wilc_handle_isr from wilc_sdio_interrupt+0x44/0x58
 wilc_sdio_interrupt from process_sdio_pending_irqs+0x1c8/0x60c
 process_sdio_pending_irqs from sdio_irq_work+0x6c/0x14c
 sdio_irq_work from process_one_work+0x8d4/0x169c
 process_one_work from worker_thread+0x8cc/0x1340
 worker_thread from kthread+0x448/0x510
 kthread from ret_from_fork+0x14/0x28

Fix those warnings by adding the needed lock around the corresponding
critical sections

Signed-off-by: Ajay Singh <ajay.kathat@microchip.com>
Co-developed-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240215-wilc_fix_rcu_usage-v1-4-f610e46c6f82@bootlin.com
4 months agowifi: wilc1000: fix declarations ordering
Alexis Lothoré [Thu, 15 Feb 2024 15:36:20 +0000 (16:36 +0100)]
wifi: wilc1000: fix declarations ordering

Fix reverse-christmas tree order in some functions before adding more
variables

Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240215-wilc_fix_rcu_usage-v1-3-f610e46c6f82@bootlin.com
4 months agowifi: wilc1000: use SRCU instead of RCU for vif list traversal
Alexis Lothoré [Thu, 15 Feb 2024 15:36:19 +0000 (16:36 +0100)]
wifi: wilc1000: use SRCU instead of RCU for vif list traversal

Enabling CONFIG_PROVE_RCU_LIST raises many warnings in wilc driver, even on
some places already protected by a read critical section. An example of
such case is in wilc_get_available_idx:

=============================
WARNING: suspicious RCU usage
6.8.0-rc1+ #32 Not tainted
-----------------------------
drivers/net/wireless/microchip/wilc1000/netdev.c:944 RCU-list traversed in non-reader section!!
[...]
stack backtrace:
CPU: 0 PID: 26 Comm: kworker/0:3 Not tainted 6.8.0-rc1+ #32
Hardware name: Atmel SAMA5
Workqueue: events_freezable mmc_rescan
 unwind_backtrace from show_stack+0x18/0x1c
 show_stack from dump_stack_lvl+0x34/0x58
 dump_stack_lvl from wilc_netdev_ifc_init+0x788/0x8ec
 wilc_netdev_ifc_init from wilc_cfg80211_init+0x690/0x910
 wilc_cfg80211_init from wilc_sdio_probe+0x168/0x490
 wilc_sdio_probe from sdio_bus_probe+0x230/0x3f4
 sdio_bus_probe from really_probe+0x270/0xdf4
 really_probe from __driver_probe_device+0x1dc/0x580
 __driver_probe_device from driver_probe_device+0x60/0x140
 driver_probe_device from __device_attach_driver+0x268/0x364
 __device_attach_driver from bus_for_each_drv+0x15c/0x1cc
 bus_for_each_drv from __device_attach+0x1ec/0x3e8
 __device_attach from bus_probe_device+0x190/0x1c0
 bus_probe_device from device_add+0x10dc/0x18e4
 device_add from sdio_add_func+0x1c0/0x2c0
 sdio_add_func from mmc_attach_sdio+0xa08/0xe1c
 mmc_attach_sdio from mmc_rescan+0xa00/0xfe0
 mmc_rescan from process_one_work+0x8d4/0x169c
 process_one_work from worker_thread+0x8cc/0x1340
 worker_thread from kthread+0x448/0x510
 kthread from ret_from_fork+0x14/0x28

This warning is due to the section being protected by a srcu critical read
section, but the list traversal being done with classic RCU API. Fix the
warning by using corresponding SRCU read lock/unlock APIs. While doing so,
since we always manipulate the same list (managed through a pointer
embedded in struct_wilc), add a macro to reduce the corresponding
boilerplate in each call site.

Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240215-wilc_fix_rcu_usage-v1-2-f610e46c6f82@bootlin.com
4 months agowifi: wilc1000: split deeply nested RCU list traversal in dedicated helper
Alexis Lothoré [Thu, 15 Feb 2024 15:36:18 +0000 (16:36 +0100)]
wifi: wilc1000: split deeply nested RCU list traversal in dedicated helper

Move netif_wake_queue and its surrounding RCU operations in a dedicated
function to clarify wilc_txq_task and ease refactoring

Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240215-wilc_fix_rcu_usage-v1-1-f610e46c6f82@bootlin.com
4 months agowifi: rtw89: 8922a: add helper of set_channel
Ping-Ke Shih [Thu, 15 Feb 2024 05:57:41 +0000 (13:57 +0800)]
wifi: rtw89: 8922a: add helper of set_channel

Reset hardware state to prevent hardware stays at abnormal state during
setting channel. Besides, add preparation for MLO/DBCC before setting
channel, and reconfigure registers after that.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240215055741.14148-5-pkshih@realtek.com
4 months agowifi: rtw89: 8922a: add set_channel RF part
Ping-Ke Shih [Thu, 15 Feb 2024 05:57:40 +0000 (13:57 +0800)]
wifi: rtw89: 8922a: add set_channel RF part

Configure RF registers according to band, channel, bandwidth. Since this
chip will support MLO, it needs check the operating mode to decide paths
we are going to configure.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240215055741.14148-4-pkshih@realtek.com
4 months agowifi: rtw89: 8922a: add set_channel BB part
Ping-Ke Shih [Thu, 15 Feb 2024 05:57:39 +0000 (13:57 +0800)]
wifi: rtw89: 8922a: add set_channel BB part

In additional to configure band, channel and bandwidth registers, it also
configure CCK support on 2GHZ band, spur elimination, and RX gain.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240215055741.14148-3-pkshih@realtek.com
4 months agowifi: rtw89: 8922a: add set_channel MAC part
Ping-Ke Shih [Thu, 15 Feb 2024 05:57:38 +0000 (13:57 +0800)]
wifi: rtw89: 8922a: add set_channel MAC part

To set channel, add a function to get TXSB (TX subband) that is hardware
index to indicate primary channel. Then, configure band, channel,
bandwidth and TXSB via registers.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240215055741.14148-2-pkshih@realtek.com
4 months agonet: sched: Annotate struct tc_pedit with __counted_by
Kees Cook [Fri, 16 Feb 2024 23:27:44 +0000 (15:27 -0800)]
net: sched: Annotate struct tc_pedit with __counted_by

Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct tc_pedit.
Additionally, since the element count member must be set before accessing
the annotated flexible array member, move its initialization earlier.

Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoMerge branch 'pds_core-AER-handling'
David S. Miller [Mon, 19 Feb 2024 10:29:08 +0000 (10:29 +0000)]
Merge branch 'pds_core-AER-handling'

Shannon Nelson says:

====================
pds_core: AER handling

Add simple handlers for the PCI AER callbacks, and improve
the reset handling.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agopds_core: use pci_reset_function for health reset
Shannon Nelson [Fri, 16 Feb 2024 22:29:52 +0000 (14:29 -0800)]
pds_core: use pci_reset_function for health reset

We get the benefit of all the PCI reset locking and recovery if
we use the existing pci_reset_function() that will call our
local reset handlers.

Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agopds_core: delete VF dev on reset
Shannon Nelson [Fri, 16 Feb 2024 22:29:51 +0000 (14:29 -0800)]
pds_core: delete VF dev on reset

When the VF is hit with a reset, remove the aux device in
the prepare for reset and try to restore it after the reset.
The userland mechanics will need to recover and rebuild whatever
uses the device afterwards.

Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agopds_core: add simple AER handler
Shannon Nelson [Fri, 16 Feb 2024 22:29:50 +0000 (14:29 -0800)]
pds_core: add simple AER handler

Set up the pci_error_handlers error_detected and resume to be
useful in handling AER events.

Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next
David S. Miller [Mon, 19 Feb 2024 10:20:39 +0000 (10:20 +0000)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/next
-queue

Tony Nguyen says:

====================
i40e: Simplify VSI and VEB handling

Ivan Vecera says:

The series simplifies handling of VSIs and VEBs by introducing for-each
iterating macros, 'find' helper functions. Also removes the VEB
recursion because the VEBs cannot have sub-VEBs according datasheet and
fixes the support for floating VEBs.

The series content:
Patch 1 - Uses existing helper function for find FDIR VSI instead of loop
Patch 2 - Adds and uses macros to iterate VSI and VEB arrays
Patch 3 - Adds 2 helper functions to find VSIs and VEBs by their SEID
Patch 4 - Fixes broken support for floating VEBs
Patch 5 - Removes VEB recursion and simplifies VEB handling
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agotools: ynl: don't access uninitialized attr_space variable
Jiri Pirko [Thu, 15 Feb 2024 12:27:26 +0000 (13:27 +0100)]
tools: ynl: don't access uninitialized attr_space variable

If message contains unknown attribute and user passes
"--process-unknown" command line option, _decode() gets called with space
arg set to None. In that case, attr_space variable is not initialized
used which leads to following trace:

Traceback (most recent call last):
  File "./tools/net/ynl/cli.py", line 77, in <module>
    main()
  File "./tools/net/ynl/cli.py", line 68, in main
    reply = ynl.dump(args.dump, attrs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tools/net/ynl/lib/ynl.py", line 909, in dump
    return self._op(method, vals, [], dump=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tools/net/ynl/lib/ynl.py", line 894, in _op
    rsp_msg = self._decode(decoded.raw_attrs, op.attr_set.name)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tools/net/ynl/lib/ynl.py", line 639, in _decode
    self._rsp_add(rsp, attr_name, None, self._decode_unknown(attr))
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tools/net/ynl/lib/ynl.py", line 569, in _decode_unknown
    return self._decode(NlAttrs(attr.raw), None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tools/net/ynl/lib/ynl.py", line 630, in _decode
    search_attrs = SpaceAttrs(attr_space, rsp, outer_attrs)
                              ^^^^^^^^^^
UnboundLocalError: cannot access local variable 'attr_space' where it is not associated with a value

Fix this by moving search_attrs assignment under the if statement
above it to make sure attr_space is initialized.

Fixes: bf8b832374fb ("tools/net/ynl: Support sub-messages in nested attribute spaces")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: ena: Remove ena_select_queue
Kamal Heib [Thu, 15 Feb 2024 22:31:04 +0000 (17:31 -0500)]
net: ena: Remove ena_select_queue

Avoid the following warnings by removing the ena_select_queue() function
and rely on the net core to do the queue selection, The issue happen
when an skb received from an interface with more queues than ena is
forwarded to the ena interface.

[ 1176.159959] eth0 selects TX queue 11, but real number of TX queues is 8
[ 1176.863976] eth0 selects TX queue 14, but real number of TX queues is 8
[ 1180.767877] eth0 selects TX queue 14, but real number of TX queues is 8
[ 1188.703742] eth0 selects TX queue 14, but real number of TX queues is 8

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: phy: aquantia: add AQR813 PHY ID
Christian Marangi [Thu, 15 Feb 2024 21:43:30 +0000 (22:43 +0100)]
net: phy: aquantia: add AQR813 PHY ID

Aquantia AQR813 is the Octal Port variant of the AQR113. Add PHY ID for
it to provide support for it.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: bql: allow the config to be disabled
Breno Leitao [Thu, 15 Feb 2024 17:05:07 +0000 (09:05 -0800)]
net: bql: allow the config to be disabled

It is impossible to disable BQL individually today, since there is no
prompt for the Kconfig entry, so, the BQL is always enabled if SYSFS is
enabled.

Create a prompt entry for BQL, so, it could be enabled or disabled at
build time independently of SYSFS.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoMerge branch 'net-phy-eee-2'
David S. Miller [Sat, 17 Feb 2024 18:45:06 +0000 (18:45 +0000)]
Merge branch 'net-phy-eee-2'

Heiner Kallweit says:

====================
net: phy: add support for the EEE 2 registers

This series adds support for the EEE 2 registers. Most relevant and
for now the only supported modes are 2500baseT and 5000baseT.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: phy: c45: add support for MDIO_AN_EEE_ADV2
Heiner Kallweit [Wed, 14 Feb 2024 20:19:47 +0000 (21:19 +0100)]
net: phy: c45: add support for MDIO_AN_EEE_ADV2

Add support for handling the EEE advertisement 2 register.
For now only 2500baseT and 5000baseT modes are supported.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: phy: c45: add support for EEE link partner ability 2 to genphy_c45_read_eee_lpa
Heiner Kallweit [Wed, 14 Feb 2024 20:18:50 +0000 (21:18 +0100)]
net: phy: c45: add support for EEE link partner ability 2 to genphy_c45_read_eee_lpa

Add support for reading EEE link partner ability 2 register.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: phy: c45: add and use genphy_c45_read_eee_cap2
Heiner Kallweit [Wed, 14 Feb 2024 20:18:02 +0000 (21:18 +0100)]
net: phy: c45: add and use genphy_c45_read_eee_cap2

Add and use genphy_c45_read_eee_cap2(), complementing
genphy_c45_read_eee_cap1().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: phy: add PHY_EEE_CAP2_FEATURES
Heiner Kallweit [Wed, 14 Feb 2024 20:17:11 +0000 (21:17 +0100)]
net: phy: add PHY_EEE_CAP2_FEATURES

As a prerequisite for adding EEE CAP2 register support, complement
PHY_EEE_CAP1_FEATURES with PHY_EEE_CAP2_FEATURES.
For now only 2500baseT and 5000baseT modes are supported.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: mdio: add helpers for accessing the EEE CAP2 registers
Heiner Kallweit [Wed, 14 Feb 2024 20:16:19 +0000 (21:16 +0100)]
net: mdio: add helpers for accessing the EEE CAP2 registers

This adds helpers for accessing the EEE CAP2 registers.
For now only 2500baseT and 5000baseT modes are supported.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoi40e: Remove VEB recursion
Ivan Vecera [Fri, 24 Nov 2023 15:03:43 +0000 (16:03 +0100)]
i40e: Remove VEB recursion

The VEB (virtual embedded switch) as a switch element can be
connected according datasheet though its uplink to:
- Physical port
- Port Virtualizer (not used directly by i40e driver but can
  be present in MFP mode where the physical port is shared
  between PFs)
- No uplink (aka floating VEB)

But VEB uplink cannot be connected to another VEB and any attempt
to do so results in:

"i40e 0000:02:00.0: couldn't add VEB, err -EIO aq_err I40E_AQ_RC_ENOENT"

that indicates "the uplink SEID does not point to valid element".

Remove this logic from the driver code this way:

1) For debugfs only allow to build floating VEB (uplink_seid == 0)
   or main VEB (uplink_seid == mac_seid)
2) Do not recurse in i40e_veb_link_event() as no VEB cannot have
   sub-VEBs
3) Ditto for i40e_veb_rebuild() + simplify the function as we know
   that the VEB for rebuild can be only the main LAN VEB or some
   of the floating VEBs
4) In i40e_rebuild() there is no need to check veb->uplink_seid
   as the possible ones are 0 and MAC SEID
5) In i40e_vsi_release() do not take into account VEBs whose
   uplink is another VEB as this is not possible
6) Remove veb_idx field from i40e_veb as a VEB cannot have
   sub-VEBs

Tested using i40e debugfs interface:
1) Initial state
[root@cnb-03 net-next]# CMD="/sys/kernel/debug/i40e/0000:02:00.0/command"
[root@cnb-03 net-next]# echo dump switch > $CMD
[root@cnb-03 net-next]# dmesg -c
[   98.440641] i40e 0000:02:00.0: header: 3 reported 3 total
[   98.446053] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16
[   98.452593] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0
[   98.458856] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16

2) Add floating VEB
[root@cnb-03 net-next]# echo add relay > $CMD
[root@cnb-03 net-next]# dmesg -c
[  122.745630] i40e 0000:02:00.0: added relay 162
[root@cnb-03 net-next]# echo dump switch > $CMD
[root@cnb-03 net-next]# dmesg -c
[  136.650049] i40e 0000:02:00.0: header: 4 reported 4 total
[  136.655466] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16
[  136.661994] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0
[  136.668264] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16
[  136.674787] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0

3) Add VMDQ2 VSI to this new VEB
[root@cnb-03 net-next]# dmesg -c
[  168.351763] i40e 0000:02:00.0: added VSI 394 to relay 162
[  168.374652] enp2s0f0np0v0: NIC Link is Up, 40 Gbps Full Duplex, Flow Control: None
[root@cnb-03 net-next]# echo dump switch > $CMD
[root@cnb-03 net-next]# dmesg -c
[  195.683204] i40e 0000:02:00.0: header: 5 reported 5 total
[  195.688611] i40e 0000:02:00.0: type=19 seid=394 uplink=162 downlink=16
[  195.695143] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0
[  195.701410] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16
[  195.707935] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0
[  195.714201] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16

4) Try to delete the VEB
[root@cnb-03 net-next]# echo del relay 162 > $CMD
[root@cnb-03 net-next]# dmesg -c
[  239.260901] i40e 0000:02:00.0: deleting relay 162
[  239.265621] i40e 0000:02:00.0: can't remove VEB 162 with 1 VSIs left

5) Do PF reset and check switch status after rebuild
[root@cnb-03 net-next]# echo pfr > $CMD
[root@cnb-03 net-next]# echo dump switch > $CMD
[root@cnb-03 net-next]# dmesg -c
...
[  272.333655] i40e 0000:02:00.0: header: 5 reported 5 total
[  272.339066] i40e 0000:02:00.0: type=19 seid=394 uplink=162 downlink=16
[  272.345599] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0
[  272.351862] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16
[  272.358387] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0
[  272.364654] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16

6) Delete VSI and delete VEB
[  297.199116] i40e 0000:02:00.0: deleting VSI 394
[  299.807580] i40e 0000:02:00.0: deleting relay 162
[  309.767905] i40e 0000:02:00.0: header: 3 reported 3 total
[  309.773318] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16
[  309.779845] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0
[  309.786111] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4 months agoi40e: Fix broken support for floating VEBs
Ivan Vecera [Fri, 24 Nov 2023 15:03:42 +0000 (16:03 +0100)]
i40e: Fix broken support for floating VEBs

Although the i40e supports so-called floating VEB (VEB without
an uplink connection to external network), this support is
broken. This functionality is currently unused (except debugfs)
but it will be used by subsequent series for switchdev mode
slow-path. Fix this by following:

1) Handle correctly floating VEB (VEB with uplink_seid == 0)
   in i40e_reconstitute_veb() and look for owner VSI and
   create it only for non-floating VEBs and also set bridge
   mode only for such VEBs as the floating ones are using
   always VEB mode.
2) Handle correctly floating VEB in i40e_veb_release() and
   disallow its release when there are some VSIs. This is
   different from regular VEB that have owner VSI that is
   connected to VEB's uplink after VEB deletion by FW.
3) Fix i40e_add_veb() to handle 'vsi' that is NULL for floating
   VEBs. For floating VEB use 0 for downlink SEID and 'true'
   for 'default_port' parameters as per datasheet.
4) Fix 'add relay' command in i40e_dbg_command_write() to allow
   to create floating VEB by 'add relay 0 0' or 'add relay'

Tested using debugfs:
1) Initial state
[root@host net-next]# echo dump switch > $CMD
[root@host net-next]# dmesg -c
[  173.701286] i40e 0000:02:00.0: header: 3 reported 3 total
[  173.706701] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16
[  173.713241] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0
[  173.719507] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16

2) Add floating VEB
[root@host net-next]# CMD="/sys/kernel/debug/i40e/0000:02:00.0/command"
[root@host net-next]# echo add relay > $CMD
[root@host net-next]# dmesg -c
[  245.551720] i40e 0000:02:00.0: added relay 162
[root@host net-next]# echo dump switch > $CMD
[root@host net-next]# dmesg -c
[  276.984371] i40e 0000:02:00.0: header: 4 reported 4 total
[  276.989779] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16
[  276.996302] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0
[  277.002569] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16
[  277.009091] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0

3) Add VMDQ2 VSI to this new VEB
[root@host net-next]# echo add vsi 162 > $CMD
[root@host net-next]# dmesg -c
[  332.314030] i40e 0000:02:00.0: added VSI 394 to relay 162
[  332.337486] enp2s0f0np0v0: NIC Link is Up, 40 Gbps Full Duplex, Flow Control: None
[root@host net-next]# echo dump switch > $CMD
[root@host net-next]# dmesg -c
[  387.284490] i40e 0000:02:00.0: header: 5 reported 5 total
[  387.289904] i40e 0000:02:00.0: type=19 seid=394 uplink=162 downlink=16
[  387.296446] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0
[  387.302708] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16
[  387.309234] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0
[  387.315500] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16

4) Try to delete the VEB
[root@host net-next]# echo del relay 162 > $CMD
[root@host net-next]# dmesg -c
[  428.749297] i40e 0000:02:00.0: deleting relay 162
[  428.754011] i40e 0000:02:00.0: can't remove VEB 162 with 1 VSIs left

5) Do PF reset and check switch status after rebuild
[root@host net-next]# echo pfr > $CMD
[root@host net-next]# echo dump switch > $CMD
[root@host net-next]# dmesg -c
[  738.056172] i40e 0000:02:00.0: header: 5 reported 5 total
[  738.061577] i40e 0000:02:00.0: type=19 seid=394 uplink=162 downlink=16
[  738.068104] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0
[  738.074367] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16
[  738.080892] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0
[  738.087160] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16

6) Delete VSI and delete VEB
[root@host net-next]# echo del vsi 394 > $CMD
[root@host net-next]# echo del relay 162 > $CMD
[root@host net-next]# echo dump switch > $CMD
[root@host net-next]# dmesg -c
[ 1233.081126] i40e 0000:02:00.0: deleting VSI 394
[ 1239.345139] i40e 0000:02:00.0: deleting relay 162
[ 1244.886920] i40e 0000:02:00.0: header: 3 reported 3 total
[ 1244.892328] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16
[ 1244.898853] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0
[ 1244.905119] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4 months agoi40e: Add helpers to find VSI and VEB by SEID and use them
Ivan Vecera [Fri, 24 Nov 2023 15:03:41 +0000 (16:03 +0100)]
i40e: Add helpers to find VSI and VEB by SEID and use them

Add two helpers i40e_(veb|vsi)_get_by_seid() to find corresponding
VEB or VSI by their SEID value and use these helpers to replace
existing open-coded loops.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4 months agoi40e: Introduce and use macros for iterating VSIs and VEBs
Ivan Vecera [Fri, 24 Nov 2023 15:03:40 +0000 (16:03 +0100)]
i40e: Introduce and use macros for iterating VSIs and VEBs

Introduce i40e_for_each_vsi() and i40e_for_each_veb() helper
macros and use them to iterate relevant arrays.

Replace pattern:
for (i = 0; i < pf->num_alloc_vsi; i++)
by:
i40e_for_each_vsi(pf, i, vsi)

and pattern:
for (i = 0; i < I40E_MAX_VEB; i++)
by
i40e_for_each_veb(pf, i, veb)

These macros also check if array item pf->vsi[i] or pf->veb[i]
are not NULL and skip such items so we can remove redundant
checks from loop bodies.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4 months agoi40e: Use existing helper to find flow director VSI
Ivan Vecera [Fri, 24 Nov 2023 15:03:39 +0000 (16:03 +0100)]
i40e: Use existing helper to find flow director VSI

Use existing i40e_find_vsi_by_type() to find a VSI
associated with flow director.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
4 months agowifi: brcmsmac: avoid function pointer casts
Arnd Bergmann [Tue, 13 Feb 2024 10:05:37 +0000 (11:05 +0100)]
wifi: brcmsmac: avoid function pointer casts

An old cleanup went a little too far and causes a warning with clang-16
and higher as it breaks control flow integrity (KCFI) rules:

drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy_shim.c:64:34: error: cast from 'void (*)(struct brcms_phy *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
   64 |                         brcms_init_timer(physhim->wl, (void (*)(void *))fn,
      |                                                       ^~~~~~~~~~~~~~~~~~~~

Change this one instance back to passing a void pointer so it can be
used with the timer callback interface.

Fixes: d89a4c80601d ("staging: brcm80211: removed void * from softmac phy")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://msgid.link/20240213100548.457854-1-arnd@kernel.org
4 months agoMerge patch series "Add ECC feature support to Tx and Rx FIFOs for Xilinx CAN Control...
Marc Kleine-Budde [Fri, 16 Feb 2024 13:18:35 +0000 (14:18 +0100)]
Merge patch series "Add ECC feature support to Tx and Rx FIFOs for Xilinx CAN Controller."

Marc Kleine-Budde <mkl@pengutronix.de> says:

ECC is an IP configuration option where counter registers are added in
IP for 1bit/2bit ECC errors count and reset.

Also driver reports 1bit/2bit ECC errors for FIFOs based on ECC error
interrupts.

Add xlnx,has-ecc optional property for Xilinx AXI CAN controller
to support ECC if the ECC block is enabled in the HW.

Add ethtool stats interface for getting all the ECC errors information.

There is no public documentation for it available.

Changes in v8:
- Use u64_stats_sync instead of spinlock
- Renamed stats strings: use "_" instead of "-"
- Renamed stats strings: add "_errors" trailer
- Renamed stats variables similar to stats strings

Changes in v7:
- Update with spinlock only for stats counters

Changes in v6:
- Update commit description

Changes in v5:
- Fix review comments
- Change the sequence of updates the stats
- Add get_strings and get_sset_count stats interface
- Use u64 stats helper function

Changes in v4:
- Fix DT binding check warning
- Update xlnx,has-ecc property description

Changes in v3:
- Update mailing list
- Update commit description

Changes in v2:
- Address review comments
- Add ethtool stats interface
- Update commit description

Link: https://lore.kernel.org/all/20240213-xilinx_ecc-v8-0-8d75f8b80771@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 months agocan: xilinx_can: Add ethtool stats interface for ECC errors
Srinivas Goud [Tue, 13 Feb 2024 10:36:45 +0000 (11:36 +0100)]
can: xilinx_can: Add ethtool stats interface for ECC errors

Add ethtool stats interface for reading FIFO 1bit/2bit ECC errors
information.

Signed-off-by: Srinivas Goud <srinivas.goud@amd.com>
Link: https://lore.kernel.org/all/20240213-xilinx_ecc-v8-3-8d75f8b80771@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 months agocan: xilinx_can: Add ECC support
Srinivas Goud [Tue, 13 Feb 2024 10:36:44 +0000 (11:36 +0100)]
can: xilinx_can: Add ECC support

Add ECC support for Xilinx CAN Controller, so this driver reports
1bit/2bit ECC errors for FIFO's based on ECC error interrupt. ECC
feature for Xilinx CAN Controller selected through 'xlnx,has-ecc' DT
property

Signed-off-by: Srinivas Goud <srinivas.goud@amd.com>
Link: https://lore.kernel.org/all/20240213-xilinx_ecc-v8-2-8d75f8b80771@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 months agodt-bindings: can: xilinx_can: Add 'xlnx,has-ecc' optional property
Srinivas Goud [Tue, 13 Feb 2024 10:36:43 +0000 (11:36 +0100)]
dt-bindings: can: xilinx_can: Add 'xlnx,has-ecc' optional property

ECC feature added to CAN TX_OL, TX_TL and RX FIFOs of Xilinx AXI CAN
Controller.

ECC is an IP configuration option where counter registers are added in
IP for 1bit/2bit ECC errors.

'xlnx,has-ecc' is an optional property and added to Xilinx AXI CAN
Controller node if ECC block enabled in the HW

Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Srinivas Goud <srinivas.goud@amd.com>
Link: https://lore.kernel.org/all/20240213-xilinx_ecc-v8-1-8d75f8b80771@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 months agonet: phy: aquantia: add AQR113 PHY ID
Christian Marangi [Thu, 15 Feb 2024 15:30:05 +0000 (16:30 +0100)]
net: phy: aquantia: add AQR113 PHY ID

Add Aquantia AQR113 PHY ID. Aquantia AQR113 is just a chip size variant of
the already supported AQR133C where the only difference is the PHY ID
and the hw chip size.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: ti: icssg-prueth: Remove duplicate cleanup calls in emac_ndo_stop()
Diogo Ivo [Thu, 15 Feb 2024 15:22:01 +0000 (15:22 +0000)]
net: ti: icssg-prueth: Remove duplicate cleanup calls in emac_ndo_stop()

Remove the duplicate calls to prueth_emac_stop() and
prueth_cleanup_tx_chns() in emac_ndo_stop().

Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agotcp: Spelling s/curcuit/circuit/
Geert Uytterhoeven [Thu, 15 Feb 2024 13:12:21 +0000 (14:12 +0100)]
tcp: Spelling s/curcuit/circuit/

Fix a misspelling of "circuit".

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoMerge branch 'ionic-xdp-support'
David S. Miller [Fri, 16 Feb 2024 08:48:09 +0000 (08:48 +0000)]
Merge branch 'ionic-xdp-support'

Shannon Nelson says:

====================
ionic: add XDP support

This patchset is new support in ionic for XDP processing,
including basic XDP on Rx packets, TX and REDIRECT, and frags
for jumbo frames.

Since ionic has not yet been converted to use the page_pool APIs,
this uses the simple MEM_TYPE_PAGE_ORDER0 buffering.  There are plans
to convert the driver in the near future.

v4:
 - removed "inline" from short utility functions
 - changed to use "goto err_out" in ionic_xdp_register_rxq_info()
 - added "continue" to reduce nesting in ionic_xdp_queues_config()
 - used xdp_prog in ionic_rx_clean() to flag whether or not to sync
   the rx buffer after calling ionix_xdp_run()
 - swapped order of XDP_TX and XDP_REDIRECT cases in ionic_xdp_run()
   to make patch 6 a little cleaner

v3:
https://lore.kernel.org/netdev/20240210004827.53814-1-shannon.nelson@amd.com/
 - removed budget==0 patch, sent it separately to net

v2:
https://lore.kernel.org/netdev/20240208005725.65134-1-shannon.nelson@amd.com/
 - added calls to txq_trans_cond_update()
 - added a new patch to catch NAPI budget==0

v1:
https://lore.kernel.org/netdev/20240130013042.11586-1-shannon.nelson@amd.com/

RFC:
https://lore.kernel.org/netdev/20240118192500.58665-1-shannon.nelson@amd.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoionic: implement xdp frags support
Shannon Nelson [Wed, 14 Feb 2024 17:59:09 +0000 (09:59 -0800)]
ionic: implement xdp frags support

Add support for using scatter-gather / frags in XDP in both
Rx and Tx paths.

Co-developed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoionic: add ndo_xdp_xmit
Shannon Nelson [Wed, 14 Feb 2024 17:59:08 +0000 (09:59 -0800)]
ionic: add ndo_xdp_xmit

When our ndo_xdp_xmit is called we mark the buffer with
XDP_REDIRECT so we know to return it to the XDP stack for
cleaning.

Co-developed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoionic: Add XDP_REDIRECT support
Shannon Nelson [Wed, 14 Feb 2024 17:59:07 +0000 (09:59 -0800)]
ionic: Add XDP_REDIRECT support

The XDP_REDIRECT packets are given to the XDP stack and
we drop the use of the related page: it will get freed
by the driver that ends up doing the Tx.  Because we have
some hardware configurations with limited queue resources,
we use the existing datapath Tx queues rather than creating
and managing a separate set of xdp_tx queues.

Co-developed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoionic: Add XDP_TX support
Shannon Nelson [Wed, 14 Feb 2024 17:59:06 +0000 (09:59 -0800)]
ionic: Add XDP_TX support

The XDP_TX packets get fed back into the Rx queue's partnered
Tx queue as an xdp_frame.

Co-developed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoionic: Add XDP packet headroom
Shannon Nelson [Wed, 14 Feb 2024 17:59:05 +0000 (09:59 -0800)]
ionic: Add XDP packet headroom

If an xdp program is loaded, add headroom at the beginning
of the frame to allow for editing and insertions that an XDP
program might need room for, and tailroom used later for XDP
frame tracking.  These are only needed in the first Rx buffer
in a packet, not for any trailing frags.

Co-developed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoionic: add initial framework for XDP support
Shannon Nelson [Wed, 14 Feb 2024 17:59:04 +0000 (09:59 -0800)]
ionic: add initial framework for XDP support

Set up the basics for running Rx packets through XDP programs.
Add new queue setup and teardown steps for adding/removing an
XDP program, and add the call to run the XDP on a packet.

The XDP frame size needs to be the MTU plus standard ethernet
header, plus head room for XDP scribblings and tail room for a
struct skb_shared_info.  Also, at this point, we don't support
XDP frags, only a single contiguous Rx buffer.  This means
that our page splitting is not very useful, so when XDP is in
use we need to use the full Rx buffer size and not do sharing.

Co-developed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoionic: use dma range APIs
Shannon Nelson [Wed, 14 Feb 2024 17:59:03 +0000 (09:59 -0800)]
ionic: use dma range APIs

Convert Rx datapath handling to use the DMA range APIs
in preparation for adding XDP handling.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoionic: add helpers for accessing buffer info
Shannon Nelson [Wed, 14 Feb 2024 17:59:02 +0000 (09:59 -0800)]
ionic: add helpers for accessing buffer info

These helpers clean up some of the code around DMA mapping
and other buffer references, and will be used in the next
few patches for the XDP support.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoionic: set adminq irq affinity
Shannon Nelson [Wed, 14 Feb 2024 17:59:01 +0000 (09:59 -0800)]
ionic: set adminq irq affinity

We claim to have the AdminQ on our irq0 and thus cpu id 0,
but we need to be sure we set the affinity hint to try to
keep it there.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet/iucv: fix virtual vs physical address confusion
Alexander Gordeev [Wed, 14 Feb 2024 08:47:07 +0000 (09:47 +0100)]
net/iucv: fix virtual vs physical address confusion

Fix virtual vs physical address confusion. This does not fix a bug
since virtual and physical address spaces are currently the same.

Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoMerge branch 'ravb-rutime-PM-support'
David S. Miller [Fri, 16 Feb 2024 08:32:05 +0000 (08:32 +0000)]
Merge branch 'ravb-rutime-PM-support'

Claudiu Beznea says:

====================
net: ravb: Add runtime PM support (part 2)

Series adds runtime PM support for the ravb driver. This is a continuation
of [1].

There are 5 more preparation patches (patches 1-5) and patch 6
adds runtime PM support.

Patches in this series were part of [2].

Changes in v4:
- remove unnecessary code from patch 4/6
- improve the code in patch 5/6

Changes in v3:
- fixed typos
- added patch "net: ravb: Move the update of ndev->features to
  ravb_set_features()"
- changes title of patch "net: ravb: Do not apply RX checksum
  settings to hardware if the interface is down" from v2 into
  "net: ravb: Do not apply features to hardware if the interface
  is down", changed patch description and updated the patch
- collected tags

Changes in v2:
- address review comments
- in patch 4/5 take into account the latest changes introduced
  in ravb_set_features_gbeth()

Changes since [2]:
- patch 1/5 is new
- use pm_runtime_get_noresume() and pm_runtime_active() in patches
  3/5, 4/5
- fixed higlighted typos in patch 4/5

[1] https://lore.kernel.org/all/20240202084136.3426492-1-claudiu.beznea.uj@bp.renesas.com/
[2] https://lore.kernel.org/all/20240105082339.1468817-1-claudiu.beznea.uj@bp.renesas.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: ravb: Add runtime PM support
Claudiu Beznea [Wed, 14 Feb 2024 13:58:00 +0000 (15:58 +0200)]
net: ravb: Add runtime PM support

Add runtime PM support for the ravb driver. As the driver is used by
different IP variants, with different behaviors, to be able to have the
runtime PM support available for all devices, the preparatory commits
moved all the resources parsing and allocations in the driver's probe
function and kept the settings for ravb_open(). This is due to the fact
that on some IP variants-platforms tuples disabling/enabling the clocks
will switch the IP to the reset operation mode where register contents is
lost and reconfiguration needs to be done. For this the rabv_open()
function enables the clocks, switches the IP to configuration mode, applies
all the register settings and switches the IP to the operational mode. At
the end of ravb_open() IP is ready to send/receive data.

In ravb_close() necessary reverts are done (compared with ravb_open()), the
IP is switched to reset mode and clocks are disabled.

The ethtool APIs or IOCTLs that might execute while the interface is down
are either cached (and applied in ravb_open()) or rejected (as at that time
the IP is in reset mode). Keeping the IP in the reset mode also increases
the power saved (according to the hardware manual).

Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: ravb: Do not apply features to hardware if the interface is down
Claudiu Beznea [Wed, 14 Feb 2024 13:57:59 +0000 (15:57 +0200)]
net: ravb: Do not apply features to hardware if the interface is down

Do not apply features to hardware if the interface is down. In case runtime
PM is enabled, and while the interface is down, the IP will be in reset
mode (as for some platforms disabling the clocks will switch the IP to
reset mode, which will lead to losing register contents) and applying
settings in reset mode is not an option. Instead, cache the features and
apply them in ravb_open() through ravb_emac_init().

To avoid accessing the hardware while the interface is down
pm_runtime_active() check was introduced. Along with it the device runtime
PM usage counter has been incremented to avoid disabling the device clocks
while the check is in progress (if any).

Commit prepares for the addition of runtime PM.

Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: ravb: Move the update of ndev->features to ravb_set_features()
Claudiu Beznea [Wed, 14 Feb 2024 13:57:58 +0000 (15:57 +0200)]
net: ravb: Move the update of ndev->features to ravb_set_features()

Commit c2da9408579d ("ravb: Add Rx checksum offload support for GbEth")
introduced support for setting GbEth features. With this the IP-specific
features update functions update the ndev->features individually.

Next commits add runtime PM support for the ravb driver. The runtime PM
implementation will enable/disable the IP clocks on
the ravb_open()/ravb_close() functions. Accessing the IP registers with
clocks disabled blocks the system.

The ravb_set_features() function could be executed when the Ethernet
interface is closed so we need to ensure we don't access IP registers while
the interface is down when runtime PM support will be in place.

For these, move the update of ndev->features to ravb_set_features(). In
this way we update the ndev->features only when the IP-specific features
set function returns success and we can avoid code duplication when
introducing runtime PM registers protection.

Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: ravb: Return cached statistics if the interface is down
Claudiu Beznea [Wed, 14 Feb 2024 13:57:57 +0000 (15:57 +0200)]
net: ravb: Return cached statistics if the interface is down

Return the cached statistics in case the interface is down. There should be
no drawback to this, as cached statistics are updated in ravb_close().

In order to avoid accessing the IP registers while the IP is runtime
suspended pm_runtime_active() check was introduced. The device runtime
PM usage counter has been incremented to avoid disabling the device clocks
while the check is in progress (if any).

The commit prepares the code for the addition of runtime PM support.

Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: ravb: Keep the reverse order of operations in ravb_close()
Claudiu Beznea [Wed, 14 Feb 2024 13:57:56 +0000 (15:57 +0200)]
net: ravb: Keep the reverse order of operations in ravb_close()

Keep the reverse order of operations in ravb_close() when compared with
ravb_open(). This is the recommended configuration sequence.

Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: ravb: Get rid of the temporary variable irq
Claudiu Beznea [Wed, 14 Feb 2024 13:57:55 +0000 (15:57 +0200)]
net: ravb: Get rid of the temporary variable irq

The 4th argument of ravb_setup_irq() is used to save the IRQ number that
will be further used by the driver code. Not all ravb_setup_irqs() calls
need to save the IRQ number. The previous code used to pass a dummy
variable as the 4th argument in case the IRQ is not needed for further
usage. That is not necessary as the code from ravb_setup_irq() can detect
by itself if the IRQ needs to be saved. Thus, get rid of the code that is
not needed.

Reported-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agoselftests: bonding: make sure new active is not null
Hangbin Liu [Wed, 14 Feb 2024 09:21:28 +0000 (17:21 +0800)]
selftests: bonding: make sure new active is not null

One of Jakub's tests[1] shows that there may be period all ports
are down and no active slave. This makes the new_active_slave null
and the test fails. Add a check to make sure the new active is not null.

 [  189.051966] br0: port 2(s1) entered disabled state
 [  189.317881] bond0: (slave eth1): link status definitely down, disabling slave
 [  189.318487] bond0: (slave eth2): making interface the new active one
 [  190.435430] br0: port 4(s2) entered disabled state
 [  190.773786] bond0: (slave eth0): link status definitely down, disabling slave
 [  190.774204] bond0: (slave eth2): link status definitely down, disabling slave
 [  190.774715] bond0: now running without any active interface!
 [  190.877760] bond0: (slave eth0): link status definitely up
 [  190.878098] bond0: (slave eth0): making interface the new active one
 [  190.878495] bond0: active interface up!
 [  191.802872] br0: port 4(s2) entered blocking state
 [  191.803157] br0: port 4(s2) entered forwarding state
 [  191.813756] bond0: (slave eth2): link status definitely up
 [  192.847095] br0: port 2(s1) entered blocking state
 [  192.847396] br0: port 2(s1) entered forwarding state
 [  192.853740] bond0: (slave eth1): link status definitely up
 # TEST: prio (active-backup ns_ip6_target primary_reselect 1)         [FAIL]
 # Current active slave is null but not eth0

[1] https://netdev-3.bots.linux.dev/vmksft-bonding/results/464481/1-bond-options-sh/stdout

Fixes: 45bf79bc56c4 ("selftests: bonding: reduce garp_test/arp_validate test time")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 months agonet: dsa: remove OF-based MDIO bus registration from DSA core
Arınç ÜNAL [Tue, 13 Feb 2024 07:29:05 +0000 (10:29 +0300)]
net: dsa: remove OF-based MDIO bus registration from DSA core

The code block under the "!ds->user_mii_bus && ds->ops->phy_read" check
under dsa_switch_setup() populates ds->user_mii_bus. The use of
ds->user_mii_bus is inappropriate when the MDIO bus of the switch is
described on the device tree [1].

For this reason, use this code block only for switches [with MDIO bus]
probed on platform_data, and OF which the switch MDIO bus isn't described
on the device tree. Therefore, remove OF-based MDIO bus registration as
it's useless for these cases.

These subdrivers which control switches [with MDIO bus] probed on OF, will
lose the ability to register the MDIO bus OF-based:

drivers/net/dsa/b53/b53_common.c
drivers/net/dsa/lan9303-core.c
drivers/net/dsa/vitesse-vsc73xx-core.c

These subdrivers let the DSA core driver register the bus:
- ds->ops->phy_read() and ds->ops->phy_write() are present.
- ds->user_mii_bus is not populated.

The commit fe7324b93222 ("net: dsa: OF-ware slave_mii_bus") which brought
OF-based MDIO bus registration on the DSA core driver is reasonably recent
and, in this time frame, there have been no device trees in the Linux
repository that started describing the MDIO bus, or dt-bindings defining
the MDIO bus for the switches these subdrivers control. So I don't expect
any devices to be affected.

The logic we encourage is that all subdrivers should register the switch
MDIO bus on their own [2]. And, for subdrivers which control switches [with
MDIO bus] probed on OF, this logic must be followed to support all cases
properly:

No switch MDIO bus defined: Populate ds->user_mii_bus, register the MDIO
bus, set the interrupts for PHYs if "interrupt-controller" is defined at
the switch node. This case should only be covered for the switches which
their dt-bindings documentation didn't document the MDIO bus from the
start. This is to keep supporting the device trees that do not describe the
MDIO bus on the device tree but the MDIO bus is being used nonetheless.

Switch MDIO bus defined: Don't populate ds->user_mii_bus, register the MDIO
bus, set the interrupts for PHYs if ["interrupt-controller" is defined at
the switch node and "interrupts" is defined at the PHY nodes under the
switch MDIO bus node].

Switch MDIO bus defined but explicitly disabled: If the device tree says
status = "disabled" for the MDIO bus, we shouldn't need an MDIO bus at all.
Instead, just exit as early as possible and do not call any MDIO API.

After all subdrivers that control switches with MDIO buses are made to
register the MDIO buses on their own, we will be able to get rid of
dsa_switch_ops :: phy_read() and :: phy_write(), and the code block for
registering the MDIO bus on the DSA core driver.

Link: https://lore.kernel.org/netdev/20231213120656.x46fyad6ls7sqyzv@skbuf/
Link: https://lore.kernel.org/netdev/20240103184459.dcbh57wdnlox6w7d@skbuf/
Suggested-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Acked-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20240213-for-netnext-dsa-mdio-bus-v2-1-0ff6f4823a9e@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge branch 'for-thermal-genetlink-family-bind-unbind-callbacks'
Jakub Kicinski [Fri, 16 Feb 2024 01:49:24 +0000 (17:49 -0800)]
Merge branch 'for-thermal-genetlink-family-bind-unbind-callbacks'

Stanislaw Gruszka says:

====================
thermal/netlink/intel_hfi: Enable HFI feature only when required

The patchset introduces a new genetlink family bind/unbind callbacks
and thermal/netlink notifications, which allow drivers to send netlink
multicast events based on the presence of actual user-space consumers.
This functionality optimizes resource usage by allowing disabling
of features when not needed.

v1: https://lore.kernel.org/linux-pm/20240131120535.933424-1-stanislaw.gruszka@linux.intel.com//
v2: https://lore.kernel.org/linux-pm/20240206133605.1518373-1-stanislaw.gruszka@linux.intel.com/
v3: https://lore.kernel.org/linux-pm/20240209120625.1775017-1-stanislaw.gruszka@linux.intel.com/
====================

Link: https://lore.kernel.org/r/20240212161615.161935-1-stanislaw.gruszka@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agogenetlink: Add per family bind/unbind callbacks
Stanislaw Gruszka [Mon, 12 Feb 2024 16:16:13 +0000 (17:16 +0100)]
genetlink: Add per family bind/unbind callbacks

Add genetlink family bind()/unbind() callbacks when adding/removing
multicast group to/from netlink client socket via setsockopt() or
bind() syscall.

They can be used to track if consumers of netlink multicast messages
emerge or disappear. Thus, a client implementing callbacks, can now
send events only when there are active consumers, preventing unnecessary
work when none exist.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240212161615.161935-2-stanislaw.gruszka@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoconfigs/debug: add NET debug config
Matthieu Baerts (NGI0) [Mon, 12 Feb 2024 10:47:14 +0000 (11:47 +0100)]
configs/debug: add NET debug config

The debug.config file is really great to easily enable a bunch of
general debugging features on a CI-like setup. But it would be great to
also include core networking debugging config.

A few CI's validating features from the Net tree also enable a few other
debugging options on top of debug.config. A small selection is quite
generic for the whole net tree. They validate some assumptions in
different parts of the core net tree. As suggested by Jakub Kicinski in
[1], having them added to this debug.config file would help other CIs
using network features to find bugs in this area.

Note that the two REFCNT configs also select REF_TRACKER, which doesn't
seem to be an issue.

Link: https://lore.kernel.org/netdev/20240202093148.33bd2b14@kernel.org/T/
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240212-kconfig-debug-enable-net-v1-1-fb026de8174c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 15 Feb 2024 22:01:43 +0000 (14:01 -0800)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

net/core/dev.c
  9f30831390ed ("net: add rcu safety to rtnl_prop_list_size()")
  723de3ebef03 ("net: free altname using an RCU callback")

net/unix/garbage.c
  11498715f266 ("af_unix: Remove io_uring code for GC.")
  25236c91b5ab ("af_unix: Fix task hung while purging oob_skb in GC.")

drivers/net/ethernet/renesas/ravb_main.c
  ed4adc07207d ("net: ravb: Count packets instead of descriptors in GbEth RX path"
)
  c2da9408579d ("ravb: Add Rx checksum offload support for GbEth")

net/mptcp/protocol.c
  bdd70eb68913 ("mptcp: drop the push_pending field")
  28e5c1380506 ("mptcp: annotate lockless accesses around read-mostly fields")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>