linux-block.git
2 months agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Fri, 19 Apr 2024 00:10:20 +0000 (17:10 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-04-17 (ice)

This series contains updates to ice driver only.

Marcin adds Tx malicious driver detection (MDD) events to be included as
part of mdd-auto-reset-vf.

Dariusz removes unnecessary implementation of ndo_get_phys_port_name.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: Remove ndo_get_phys_port_name
  ice: Add automatic VF reset on Tx MDD events
====================

Link: https://lore.kernel.org/r/20240417165634.2081793-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: dsa: sja1105: flower: validate control flags
Asbjørn Sloth Tønnesen [Wed, 17 Apr 2024 14:44:12 +0000 (14:44 +0000)]
net: dsa: sja1105: flower: validate control flags

This driver currently doesn't support any control flags.

Use flow_rule_match_has_control_flags() to check for control flags,
such as can be set through `tc flower ... ip_flags frag`.

In case any control flags are masked, flow_rule_match_has_control_flags()
sets a NL extended error message, and we return -EOPNOTSUPP.

Only compile-tested.

Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Link: https://lore.kernel.org/r/20240417144413.104257-1-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: dsa: felix: flower: validate control flags
Asbjørn Sloth Tønnesen [Wed, 17 Apr 2024 14:44:06 +0000 (14:44 +0000)]
net: dsa: felix: flower: validate control flags

This driver currently doesn't support any control flags.

Use flow_rule_match_has_control_flags() to check for control flags,
such as can be set through `tc flower ... ip_flags frag`.

In case any control flags are masked, flow_rule_match_has_control_flags()
sets a NL extended error message, and we return -EOPNOTSUPP.

Only compile-tested.

Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Link: https://lore.kernel.org/r/20240417144407.104241-1-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: mscc: ocelot: flower: validate control flags
Asbjørn Sloth Tønnesen [Wed, 17 Apr 2024 14:43:58 +0000 (14:43 +0000)]
net: mscc: ocelot: flower: validate control flags

This driver currently doesn't support any control flags.

Use flow_rule_match_has_control_flags() to check for control flags,
such as can be set through `tc flower ... ip_flags frag`.

In case any control flags are masked, flow_rule_match_has_control_flags()
sets a NL extended error message, and we return -EOPNOTSUPP.

Only compile-tested.

Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Link: https://lore.kernel.org/r/20240417144359.104225-1-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agosfc: use flow_rule_is_supp_control_flags()
Asbjørn Sloth Tønnesen [Wed, 17 Apr 2024 14:07:10 +0000 (14:07 +0000)]
sfc: use flow_rule_is_supp_control_flags()

Change the check for unsupported control flags, to use the new helper
flow_rule_is_supp_control_flags().

Since the helper was based on sfc, then nothing really changes.

Compile-tested, and compiled objects are identical.

Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20240417140712.100905-1-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agomlxsw: spectrum_flower: validate control flags
Asbjørn Sloth Tønnesen [Wed, 17 Apr 2024 13:51:20 +0000 (13:51 +0000)]
mlxsw: spectrum_flower: validate control flags

This driver currently doesn't support any control flags.

Use flow_rule_has_control_flags() to check for control flags,
such as can be set through `tc flower ... ip_flags frag`.

In case any control flags are masked, flow_rule_has_control_flags()
sets a NL extended error message, and we return -EOPNOTSUPP.

Only compile-tested.

Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/20240417135131.99921-1-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: net: set the exit code correctly in Python tests
Jakub Kicinski [Wed, 17 Apr 2024 23:11:40 +0000 (16:11 -0700)]
selftests: net: set the exit code correctly in Python tests

Test cases need to exit with non-zero status if they failed,
we currently don't do that:

  # KTAP version 1
  # 1..3
  # # At /root/ksft-net-drv/drivers/net/./ping.py line 18:
  # # Check failed 1 != 2
  # not ok 1 ping.test_v4
  # ok 2 ping.test_v6
  # ok 3 ping.test_tcp
  # # Totals: pass:2 fail:1 xfail:0 xpass:0 skip:0 error:0
  ok 1 selftests: drivers/net: ping.py
  ^^^^

It's a bit tempting to make the exit part of ksft_run(),
but that only works well for very trivial setups. We can
revisit this later, if people forget to call ksft_exit().

Link: https://lore.kernel.org/r/20240417231146.2435572-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: net: fix counting totals when some checks fail
Jakub Kicinski [Wed, 17 Apr 2024 23:11:39 +0000 (16:11 -0700)]
selftests: net: fix counting totals when some checks fail

Totals currently only pay attention to exceptions, if check fails
(say ksft_eq()) the test case will be counted as pass:

  # At /ksft/drivers/net/./ping.py line 18:
  # Check failed 1 != 2
  not ok 1 ping.test_v4
  ok 2 ping.test_v6
  ok 3 ping.test_tcp
  # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
            ^^^^^^^^^^^^^

Pay attention to the result.

Fixes: b86761ff6374 ("selftests: net: add scaffolding for Netlink tests in Python")
Link: https://lore.kernel.org/r/20240417231146.2435572-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 18 Apr 2024 20:10:20 +0000 (13:10 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

Conflicts:

include/trace/events/rpcgss.h
  386f4a737964 ("trace: events: cleanup deprecated strncpy uses")
  a4833e3abae1 ("SUNRPC: Fix rpcgss_context trace event acceptor field")

Adjacent changes:

drivers/net/ethernet/intel/ice/ice_tc_lib.c
  2cca35f5dd78 ("ice: Fix checking for unsupported keys on non-tunnel device")
  784feaa65dfd ("ice: Add support for PFCP hardware offload in switchdev")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'net-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 18 Apr 2024 18:40:54 +0000 (11:40 -0700)]
Merge tag 'net-6.9-rc5' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "A little calmer than usual, probably just the timing of sub-tree PRs.

  Including fixes from netfilter.

  Current release - regressions:

   - inet: bring NLM_DONE out to a separate recv() again, fix user space
     which assumes multiple recv()s will happen and gets blocked forever

   - drv: mlx5:
       - restore mistakenly dropped parts in register devlink flow
       - use channel mdev reference instead of global mdev instance for
         coalescing
       - acquire RTNL lock before RQs/SQs activation/deactivation

  Previous releases - regressions:

   - net: change maximum number of UDP segments to 128, fix virtio
     compatibility with Windows peers

   - usb: ax88179_178a: avoid writing the mac address before first
     reading

  Previous releases - always broken:

   - sched: fix mirred deadlock on device recursion

   - netfilter:
       - br_netfilter: skip conntrack input hook for promisc packets
       - fixes removal of duplicate elements in the pipapo set backend
       - various fixes for abort paths and error handling

   - af_unix: don't peek OOB data without MSG_OOB

   - drv: flower: fix fragment flags handling in multiple drivers

   - drv: ravb: fix jumbo frames and packet stats accounting

  Misc:

   - kselftest_harness: fix Clang warning about zero-length format

   - tun: limit printing rate when illegal packet received by tun dev"

* tag 'net-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (46 commits)
  net: ethernet: ti: am65-cpsw-nuss: cleanup DMA Channels before using them
  net: usb: ax88179_178a: avoid writing the mac address before first reading
  net: ravb: Fix RX byte accounting for jumbo packets
  net: ravb: Fix GbEth jumbo packet RX checksum handling
  net: ravb: Allow RX loop to move past DMA mapping errors
  net: ravb: Count packets instead of descriptors in R-Car RX path
  net: ethernet: mtk_eth_soc: fix WED + wifi reset
  net:usb:qmi_wwan: support Rolling modules
  selftests: kselftest_harness: fix Clang warning about zero-length format
  net/sched: Fix mirred deadlock on device recursion
  netfilter: nf_tables: fix memleak in map from abort path
  netfilter: nf_tables: restore set elements when delete set fails
  netfilter: nf_tables: missing iterator type in lookup walk
  s390/ism: Properly fix receive message buffer allocation
  net: dsa: mt7530: fix port mirroring for MT7988 SoC switch
  net: dsa: mt7530: fix mirroring frames received on local port
  tun: limit printing rate when illegal packet received by tun dev
  ice: Fix checking for unsupported keys on non-tunnel device
  ice: tc: allow zero flags in parsing tc flower
  ice: tc: check src_vsi in case of traffic from VF
  ...

2 months agoMerge tag 'gpio-fixes-for-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 18 Apr 2024 17:18:03 +0000 (10:18 -0700)]
Merge tag 'gpio-fixes-for-v6.9-rc5' of git://git./linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - use -ENOTSUPP consistently in Intel GPIO drivers

 - don't include dt-bindings headers in gpio-swnode code

 - add missing of device table to gpio-lpc32xx and fix autoloading

* tag 'gpio-fixes-for-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: swnode: Remove wrong header inclusion
  gpio: lpc32xx: fix module autoloading
  gpio: crystalcove: Use -ENOTSUPP consistently
  gpio: wcove: Use -ENOTSUPP consistently

2 months agonet: ethernet: ti: am65-cpsw-nuss: cleanup DMA Channels before using them
Siddharth Vadapalli [Wed, 17 Apr 2024 09:54:25 +0000 (15:24 +0530)]
net: ethernet: ti: am65-cpsw-nuss: cleanup DMA Channels before using them

The TX and RX DMA Channels used by the driver to exchange data with CPSW
are not guaranteed to be in a clean state during driver initialization.
The Bootloader could have used the same DMA Channels without cleaning them
up in the event of failure. Thus, reset and disable the DMA Channels to
ensure that they are in a clean state before using them.

Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Reported-by: Schuyler Patton <spatton@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Link: https://lore.kernel.org/r/20240417095425.2253876-1-s-vadapalli@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: usb: ax88179_178a: avoid writing the mac address before first reading
Jose Ignacio Tornos Martinez [Wed, 17 Apr 2024 08:55:13 +0000 (10:55 +0200)]
net: usb: ax88179_178a: avoid writing the mac address before first reading

After the commit d2689b6a86b9 ("net: usb: ax88179_178a: avoid two
consecutive device resets"), reset operation, in which the default mac
address from the device is read, is not executed from bind operation and
the random address, that is pregenerated just in case, is direclty written
the first time in the device, so the default one from the device is not
even read. This writing is not dangerous because is volatile and the
default mac address is not missed.

In order to avoid this and keep the simplification to have only one
reset and reduce the delays, restore the reset from bind operation and
remove the reset that is commanded from open operation. The behavior is
the same but everything is ready for usbnet_probe.

Tested with ASIX AX88179 USB Gigabit Ethernet devices.
Restore the old behavior for the rest of possible devices because I don't
have the hardware to test.

cc: stable@vger.kernel.org # 6.6+
Fixes: d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets")
Reported-by: Jarkko Palviainen <jarkko.palviainen@gmail.com>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Link: https://lore.kernel.org/r/20240417085524.219532-1-jtornosm@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'random-6.9-rc5-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 18 Apr 2024 16:49:08 +0000 (09:49 -0700)]
Merge tag 'random-6.9-rc5-for-linus' of git://git./linux/kernel/git/crng/random

Pull random number generator fixes from Jason Donenfeld:

 - The input subsystem contributes entropy in some places where a
   spinlock is held, but the entropy accounting code only handled
   callers being in an interrupt or non-atomic process context, but not
   atomic process context. We fix this by removing an optimization and
   just calling queue_work() unconditionally.

 - Greg accidently sent up a patch not intended for his tree and that
   had been nack'd, so that's now reverted.

* tag 'random-6.9-rc5-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  Revert "vmgenid: emit uevent when VMGENID updates"
  random: handle creditable entropy from atomic process context

2 months agoMerge tag 'platform-drivers-x86-v6.9-3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 18 Apr 2024 14:15:33 +0000 (07:15 -0700)]
Merge tag 'platform-drivers-x86-v6.9-3' of git://git./linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Ilpo Järvinen:

 - amd/pmf: Add SPS notifications quirk (+ quirk support)

 - amd/pmf: Lower Smart PC check message severity

 - x86/ISST: New HW support

 - x86/intel-uncore-freq: Bump minor version to avoid "unsupported" message

 - amd/pmc: New BIOS version still needs Spurious IRQ1 quirk

* tag 'platform-drivers-x86-v6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86/amd/pmc: Extend Framework 13 quirk to more BIOSes
  platform/x86/intel-uncore-freq: Increase minor number support
  platform/x86: ISST: Add Granite Rapids-D to HPM CPU list
  platform/x86/amd: pmf: Add quirk for ROG Zephyrus G14
  platform/x86/amd: pmf: Add infrastructure for quirking supported funcs
  platform/x86/amd: pmf: Decrease error message to debug

2 months agoRevert "vmgenid: emit uevent when VMGENID updates"
Jason A. Donenfeld [Thu, 18 Apr 2024 11:45:17 +0000 (13:45 +0200)]
Revert "vmgenid: emit uevent when VMGENID updates"

This reverts commit ad6bcdad2b6724e113f191a12f859a9e8456b26d. I had
nak'd it, and Greg said on the thread that it links that he wasn't going
to take it either, especially since it's not his code or his tree, but
then, seemingly accidentally, it got pushed up some months later, in
what looks like a mistake, with no further discussion in the linked
thread. So revert it, since it's clearly not intended.

Fixes: ad6bcdad2b67 ("vmgenid: emit uevent when VMGENID updates")
Cc: stable@vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230531095119.11202-2-bchalios@amazon.es
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months agovirtio_net: Support RX hash XDP hint
Liang Chen [Wed, 17 Apr 2024 07:18:22 +0000 (15:18 +0800)]
virtio_net: Support RX hash XDP hint

The RSS hash report is a feature that's part of the virtio specification.
Currently, virtio backends like qemu, vdpa (mlx5), and potentially vhost
(still a work in progress as per [1]) support this feature. While the
capability to obtain the RSS hash has been enabled in the normal path,
it's currently missing in the XDP path. Therefore, we are introducing
XDP hints through kfuncs to allow XDP programs to access the RSS hash.

1.
https://lore.kernel.org/all/20231015141644.260646-1-akihiko.odaki@daynix.com/#r

Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20240417071822.27831-1-liangchen.linux@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoMerge tag 'nf-24-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Paolo Abeni [Thu, 18 Apr 2024 11:12:36 +0000 (13:12 +0200)]
Merge tag 'nf-24-04-18' of git://git./linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

Patch #1 amends a missing spot where the set iterator type is unset.
 This is fixing a issue in the previous pull request.

Patch #2 fixes the delete set command abort path by restoring state
         of the elements. Reverse logic for the activate (abort) case
 otherwise element state is not restored, this requires to move
 the check for active/inactive elements to the set iterator
 callback. From the deactivate path, toggle the next generation
 bit and from the activate (abort) path, clear the next generation
 bitmask.

Patch #3 skips elements already restored by delete set command from the
 abort path in case there is a previous delete element command in
 the batch. Check for the next generation bit just like it is done
 via set iteration to restore maps.

netfilter pull request 24-04-18

* tag 'nf-24-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: fix memleak in map from abort path
  netfilter: nf_tables: restore set elements when delete set fails
  netfilter: nf_tables: missing iterator type in lookup walk
====================

Link: https://lore.kernel.org/r/20240418010948.3332346-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoMerge branch 'net-ipa-header-hygiene'
Paolo Abeni [Thu, 18 Apr 2024 11:01:07 +0000 (13:01 +0200)]
Merge branch 'net-ipa-header-hygiene'

Alex Elder says:

====================
net: ipa: header hygiene

The end result of this series is that the list of files included in
every IPA source file will be maintained in sorted order.  This
imposes some consistency that was previously not possible.

If an IPA header file requires a symbol or type declared in another
header, that other header must be included.  E.g., if bool or u32
type is used in a function declaration in an IPA header file, the
IPA header must include <linux/types.h>.

If a type used is just a struct or union *pointer* or enum type (and
no members within these types are needed), then these types only need
to be *declared* within the header that uses it.

This is sufficient, but in addition, this series removes includes of
files that aren't necessary, as well as unneeded type declarations.
====================

Link: https://lore.kernel.org/r/20240416231018.389520-1-elder@linaro.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ipa: sort all includes
Alex Elder [Tue, 16 Apr 2024 23:10:18 +0000 (18:10 -0500)]
net: ipa: sort all includes

Establish the rule that header files are always included in sorted
(POSIX local) order.  Standard and private headers are separated by
a blank line.

Similarly, sort all forward-declarations for structures.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ipa: more include file cleanup
Alex Elder [Tue, 16 Apr 2024 23:10:17 +0000 (18:10 -0500)]
net: ipa: more include file cleanup

All of the config data files and all of the register definition
files (plus a few others) use GSI_EE_AP, which is defined in
"ipa_version.h".  Include that header where it's needed.

All of the IPA register definition files include "../ipa.h", though
none of them need anything defined there.  Similarly, all of the GSI
register definition files include "../gsi.h", but don't need anything
defined there.  Remove these unnneded includes.

All of the configuration data files include "../gsi.h", though none
of them need anything defined there, so remove these includes.

Remove other includes of local header files that are not required.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ipa: eliminate unneeded struct declarations
Alex Elder [Tue, 16 Apr 2024 23:10:16 +0000 (18:10 -0500)]
net: ipa: eliminate unneeded struct declarations

As definitions in headers have been moved around, some of the
struct and enum declarations found in header files have become
no longer necessary and can be removed.  Remove these unneeded
declarations.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ipa: add some needed struct declarations
Alex Elder [Tue, 16 Apr 2024 23:10:15 +0000 (18:10 -0500)]
net: ipa: add some needed struct declarations

Declare some structure types in a few header files where functions
declared therein use them:
  - Functions are declared in "gsi_private.h" that use gsi, gsi_ring, and
    gsi_trans structure pointers.
  - A gsi_trans struct pointer is passed to two functions
    declared in "ipa_endpoint.h"
  - In "ipa_interrupt.h", a platform_device pointer is passed in the
    declaration for ipa_interrupt_init().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ipa: include "ipa_interrupt.h" where needed
Alex Elder [Tue, 16 Apr 2024 23:10:14 +0000 (18:10 -0500)]
net: ipa: include "ipa_interrupt.h" where needed

The IPA structure contains an ipa_interrupt structure pointer, and
that structure is declared in "ipa.h".  There is no need to include
"ipa_interrupt.h" in that header file.

Instead, include "ipa_interrupt.h" in the three source files (in
addition to "ipa_main.c") that actually use the functions that are
declared there.

Similarly, three files use symbols defined in "ipa_reg.h" but do not
include that file; include it.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ipa: remove unneeded standard includes
Alex Elder [Tue, 16 Apr 2024 23:10:13 +0000 (18:10 -0500)]
net: ipa: remove unneeded standard includes

Some IPA header files include one or more other standard header
files despite not directly needing anything defined in the included
files.  Remove these unnecessary includes.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ipa: include some standard header files
Alex Elder [Tue, 16 Apr 2024 23:10:12 +0000 (18:10 -0500)]
net: ipa: include some standard header files

Some IPA header files use types defined in <linux/types.h>, but do
not include that file:
  - In "ipa_mem.h", the ipa_mem structure has u16 and u32 fields
  - In "ipa_power.h", ipa_power_retention() takes a bool argument,
    and ipa_core_clock_rate() returns u32
  - In "ipa_version.h", ipa_version_supported() returns bool
Include it in these files to satisfy their dependencies.

The ipa_qmi structure (defined in "ipa_qmi.h") contains a work
structure, so include <linux/workqueue.h> in there.

All of the data and register definition files, as well as "reg.h",
use the ARRAY_SIZE() macro.  Include <linux/array_size.h> everywhere
it's used.

Similarly, all register definition files (and a few others) use the
GENMASK() macro, so include <linux/bits.h> to ensure it's defined
where used.  BIT() becomes available by including this file also.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoibmvnic: Return error code on TX scrq flush fail
Nick Child [Tue, 16 Apr 2024 16:41:28 +0000 (11:41 -0500)]
ibmvnic: Return error code on TX scrq flush fail

In ibmvnic_xmit() if ibmvnic_tx_scrq_flush() returns H_CLOSED then
it will inform upper level networking functions to disable tx
queues. H_CLOSED signals that the connection with the vnic server is
down and a transport event is expected to recover the device.

Previously, ibmvnic_tx_scrq_flush() was hard-coded to return success.
Therefore, the queues would remain active until ibmvnic_cleanup() is
called within do_reset().

The problem is that do_reset() depends on the RTNL lock. If several
ibmvnic devices are resetting then there can be a long wait time until
the last device can grab the lock. During this time the tx/rx queues
still appear active to upper level functions.

FYI, we do make a call to netif_carrier_off() outside the RTNL lock but
its calls to dev_deactivate() are also dependent on the RTNL lock.

As a result, large amounts of retransmissions were observed in a short
period of time, eventually leading to ETIMEOUT. This was specifically
seen with HNV devices, likely because of even more RTNL dependencies.

Therefore, ensure the return code of ibmvnic_tx_scrq_flush() is
propagated to the xmit function to allow for an earlier (and lock-less)
response to a transport event.

Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Link: https://lore.kernel.org/r/20240416164128.387920-1-nnac123@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoneighbour: guarantee the localhost connections be established successfully even the...
Zheng Li [Tue, 16 Apr 2024 09:53:43 +0000 (17:53 +0800)]
neighbour: guarantee the localhost connections be established successfully even the ARP table is full

Inter-process communication on localhost should be established successfully
even the ARP table is full, many processes on server machine use the
localhost to communicate such as command-line interface (CLI),
servers hope all CLI commands can be executed successfully even the arp
table is full. Right now CLI commands got timeout when the arp table is
full. Set the parameter of exempt_from_gc to be true for LOOPBACK net
device to keep localhost neigh in arp table, not removed by gc.

the steps of reproduced:
server with "gc_thresh3 = 1024" setting, ping server from more than 1024
same netmask Lan IPv4 addresses, run "ssh localhost" on console interface,
then the command will get timeout.

Signed-off-by: Zheng Li <James.Z.Li@Dell.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240416095343.540-1-lizheng043@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoMerge branch 'ravb-ethernet-driver-bugfixes'
Paolo Abeni [Thu, 18 Apr 2024 09:34:31 +0000 (11:34 +0200)]
Merge branch 'ravb-ethernet-driver-bugfixes'

Paul Barker says:

====================
ravb Ethernet driver bugfixes

These patches fix bugs found during recent work on the ravb driver.

Patches 1 & 2 affect the R-Car code paths so have been tested on an
R-Car M3N Salvator-XS board - this is the only R-Car board I currently
have access to.

Patches 2, 3 & 4 affect the GbEth code paths so have been tested on
RZ/G2L and RZ/G2UL SMARC EVK boards.

Changes v2->v3:
  * Incorporate feedback from Niklas and add Reviewed-by tag to patch
    "net: ravb: Count packets instead of descriptors in R-Car RX path".
Changes v1->v2:
  * Fixed typos in commit message of patch
    "net: ravb: Allow RX loop to move past DMA mapping errors".
  * Added Sergey's Reviewed-by tags.
  * Expanded Cc list as Patchwork complained that I had missed people.
  * Trimmed the call trace in accordance with the docs [1] in patch
    "net: ravb: Fix GbEth jumbo packet RX checksum handling".

[1]: https://docs.kernel.org/process/submitting-patches.html#backtraces-in-commit-messages
====================

Link: https://lore.kernel.org/r/20240416120254.2620-1-paul.barker.ct@bp.renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ravb: Fix RX byte accounting for jumbo packets
Paul Barker [Tue, 16 Apr 2024 12:02:54 +0000 (13:02 +0100)]
net: ravb: Fix RX byte accounting for jumbo packets

The RX byte accounting for jumbo packets was changed to fix a potential
use-after-free bug. However, that fix used the wrong variable and so
only accounted for the number of bytes in the final descriptor, not the
number of bytes in the whole packet.

To fix this, we can simply update our stats with the correct number of
bytes before calling napi_gro_receive().

Also rename pkt_len to desc_len in ravb_rx_gbeth() to avoid any future
confusion. The variable name pkt_len is correct in ravb_rx_rcar() as
that function does not handle packets spanning multiple descriptors.

Fixes: 5a5a3e564de6 ("ravb: Fix potential use-after-free in ravb_rx_gbeth()")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ravb: Fix GbEth jumbo packet RX checksum handling
Paul Barker [Tue, 16 Apr 2024 12:02:53 +0000 (13:02 +0100)]
net: ravb: Fix GbEth jumbo packet RX checksum handling

Sending a 7kB ping packet to the RZ/G2L in v6.9-rc2 causes the following
backtrace:

WARNING: CPU: 0 PID: 0 at include/linux/skbuff.h:3127 skb_trim+0x30/0x38
Hardware name: Renesas SMARC EVK based on r9a07g044l2 (DT)
pc : skb_trim+0x30/0x38
lr : ravb_rx_csum_gbeth+0x40/0x90
Call trace:
 skb_trim+0x30/0x38
 ravb_rx_gbeth+0x56c/0x5cc
 ravb_poll+0xa0/0x204
 __napi_poll+0x38/0x17c

This is caused by ravb_rx_gbeth() calling ravb_rx_csum_gbeth() with the
wrong skb for a packet which spans multiple descriptors. To fix this,
use the correct skb.

Fixes: c2da9408579d ("ravb: Add Rx checksum offload support for GbEth")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ravb: Allow RX loop to move past DMA mapping errors
Paul Barker [Tue, 16 Apr 2024 12:02:52 +0000 (13:02 +0100)]
net: ravb: Allow RX loop to move past DMA mapping errors

The RX loops in ravb_rx_gbeth() and ravb_rx_rcar() skip to the next loop
iteration if a zero-length descriptor is seen (indicating a DMA mapping
error). However, the current RX descriptor index `priv->cur_rx[q]` was
incremented at the end of the loop and so would not be incremented when
we skip to the next loop iteration. This would cause the loop to keep
seeing the same zero-length descriptor instead of moving on to the next
descriptor.

As the loop counter `i` still increments, the loop would eventually
terminate so there is no risk of being stuck here forever - but we
should still fix this to avoid wasting cycles.

To fix this, the RX descriptor index is incremented at the top of the
loop, in the for statement itself. The assignments of `entry` and `desc`
are brought into the loop to avoid the need for duplication.

Fixes: d8b48911fd24 ("ravb: fix ring memory allocation")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ravb: Count packets instead of descriptors in R-Car RX path
Paul Barker [Tue, 16 Apr 2024 12:02:51 +0000 (13:02 +0100)]
net: ravb: Count packets instead of descriptors in R-Car RX path

The units of "work done" in the RX path should be packets instead of
descriptors.

Descriptors which are used by the hardware to record error conditions or
are empty in the case of a DMA mapping error should not count towards
our RX work budget.

Also make the limit variable unsigned as it can never be negative.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: ethernet: mtk_eth_soc: fix WED + wifi reset
Felix Fietkau [Tue, 16 Apr 2024 08:23:29 +0000 (10:23 +0200)]
net: ethernet: mtk_eth_soc: fix WED + wifi reset

The WLAN + WED reset sequence relies on being able to receive interrupts from
the card, in order to synchronize individual steps with the firmware.
When WED is stopped, leave interrupts running and rely on the driver turning
off unwanted ones.
WED DMA also needs to be disabled before resetting.

Fixes: f78cd9c783e0 ("net: ethernet: mtk_wed: update mtk_wed_stop")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20240416082330.82564-1-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet:usb:qmi_wwan: support Rolling modules
Vanillan Wang [Tue, 16 Apr 2024 12:07:13 +0000 (20:07 +0800)]
net:usb:qmi_wwan: support Rolling modules

Update the qmi_wwan driver support for the Rolling
LTE modules.

- VID:PID 33f8:0104, RW101-GL for laptop debug M.2 cards(with RMNET
interface for /Linux/Chrome OS)
0x0104: RMNET, diag, at, pipe

Here are the outputs of usb-devices:
T:  Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=5000 MxCh= 0
D:  Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs=  1
P:  Vendor=33f8 ProdID=0104 Rev=05.04
S:  Manufacturer=Rolling Wireless S.a.r.l.
S:  Product=Rolling Module
S:  SerialNumber=ba2eb033
C:  #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=896mA
I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
E:  Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
E:  Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
E:  Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=88(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
E:  Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=usbfs
E:  Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=89(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms

Signed-off-by: Vanillan Wang <vanillanwang@163.com>
Link: https://lore.kernel.org/r/20240416120713.24777-1-vanillanwang@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net...
Jakub Kicinski [Thu, 18 Apr 2024 01:38:34 +0000 (18:38 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-04-16 (ice)

This series contains updates to ice driver only.

Michal fixes a couple of issues with TC filter parsing; always add match
for src_vsi and remove flag check that could prevent addition of valid
filters.

Marcin adds additional checks for unsupported flower filters.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: Fix checking for unsupported keys on non-tunnel device
  ice: tc: allow zero flags in parsing tc flower
  ice: tc: check src_vsi in case of traffic from VF
====================

Link: https://lore.kernel.org/r/20240416202409.2008383-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: kselftest_harness: fix Clang warning about zero-length format
Jakub Kicinski [Tue, 16 Apr 2024 15:10:48 +0000 (08:10 -0700)]
selftests: kselftest_harness: fix Clang warning about zero-length format

Apparently it's more legal to pass the format as NULL, than
it is to use an empty string. Clang complains about empty
formats:

./../kselftest_harness.h:1207:30: warning: format string is empty
[-Wformat-zero-length]
 1207 |            diagnostic ? "%s" : "", diagnostic);
      |                                 ^~
1 warning generated.

Reported-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240409224256.1581292-1-seanjc@google.com
Fixes: 378193eff339 ("selftests: kselftest_harness: let PASS / FAIL provide diagnostic")
Tested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20240416151048.1682352-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agotcp_metrics: use parallel_ops for tcp_metrics_nl_family
Eric Dumazet [Tue, 16 Apr 2024 16:20:25 +0000 (16:20 +0000)]
tcp_metrics: use parallel_ops for tcp_metrics_nl_family

TCP_METRICS_CMD_GET and TCP_METRICS_CMD_DEL use their
own locking (tcp_metrics_lock and RCU),
they do not need genl_mutex protection.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240416162025.1251547-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agotcp_metrics: fix tcp_metrics_nl_dump() return value
Eric Dumazet [Tue, 16 Apr 2024 16:11:12 +0000 (16:11 +0000)]
tcp_metrics: fix tcp_metrics_nl_dump() return value

Change tcp_metrics_nl_dump() to return 0 at the end
of a dump so that NLMSG_DONE can be appended
to the current skb, saving one recvmsg() system call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240416161112.1199265-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetns: no longer hold RTNL in rtnl_net_dumpid()
Eric Dumazet [Tue, 16 Apr 2024 14:07:39 +0000 (14:07 +0000)]
netns: no longer hold RTNL in rtnl_net_dumpid()

- rtnl_net_dumpid() is already fully RCU protected,
  RTNL is not needed there.

- Fix return value at the end of a dump,
  so that NLMSG_DONE can be appended to current skb,
  saving one recvmsg() system call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/20240416140739.967941-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: dsa: xrs700x: provide own phylink MAC operations
Russell King (Oracle) [Tue, 16 Apr 2024 10:19:08 +0000 (11:19 +0100)]
net: dsa: xrs700x: provide own phylink MAC operations

Convert xrs700x to provide its own phylink MAC operations, thus
avoiding the shim layer in DSA's port.c. We need to provide stubs for
the mac_link_down() and mac_config() methods which are mandatory.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1rwfu8-007531-TG@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: dsa: rzn1_a5psw: provide own phylink MAC operations
Russell King (Oracle) [Tue, 16 Apr 2024 10:19:19 +0000 (11:19 +0100)]
net: dsa: rzn1_a5psw: provide own phylink MAC operations

Convert rzn1_a5psw to provide its own phylink MAC operations, thus
avoiding the shim layer in DSA's port.c. We need to provide a stub for
the mac_config() method which is mandatory.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1rwfuJ-00753D-6d@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: dsa: lan9303: provide own phylink MAC operations
Russell King (Oracle) [Tue, 16 Apr 2024 10:19:14 +0000 (11:19 +0100)]
net: dsa: lan9303: provide own phylink MAC operations

Convert lan9303 to provide its own phylink MAC operations, thus
avoiding the shim layer in DSA's port.c. We need to provide stubs for
the mac_link_down() and mac_config() methods which are mandatory.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1rwfuE-007537-1u@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: dsa: bcm_sf2: provide own phylink MAC operations
Russell King (Oracle) [Tue, 16 Apr 2024 10:19:03 +0000 (11:19 +0100)]
net: dsa: bcm_sf2: provide own phylink MAC operations

Convert bcm_sf2 to provide its own phylink MAC operations, thus
avoiding the shim layer in DSA's port.c

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/E1rwfu3-00752s-On@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: openvswitch: Fix escape chars in regexp.
Adrian Moreno [Tue, 16 Apr 2024 09:09:13 +0000 (11:09 +0200)]
selftests: openvswitch: Fix escape chars in regexp.

Character sequences starting with `\` are interpreted by python as
escaped Unicode characters. However, they have other meaning in
regular expressions (e.g: "\d").

It seems Python >= 3.12 starts emitting a SyntaxWarning when these
escaped sequences are not recognized as valid Unicode characters.

An example of these warnings:

tools/testing/selftests/net/openvswitch/ovs-dpctl.py:505:
SyntaxWarning: invalid escape sequence '\d'

Fix all the warnings by flagging literals as raw strings.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://lore.kernel.org/r/20240416090913.2028475-1-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMerge tag 'for-6.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Thu, 18 Apr 2024 01:25:40 +0000 (18:25 -0700)]
Merge tag 'for-6.9-rc4-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fixup in zoned mode for out-of-order writes of metadata that are no
   longer necessary, this used to be tracked in a separate list but now
   the old locaion needs to be zeroed out, also add assertions

 - fix bulk page allocation retry, this may stall after first failure
   for compression read/write

* tag 'for-6.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: do not wait for short bulk allocation
  btrfs: zoned: add ASSERT and WARN for EXTENT_BUFFER_ZONED_ZEROOUT handling
  btrfs: zoned: do not flag ZEROOUT on non-dirty extent buffer

2 months agonet: netdevsim: select PAGE_POOL in Kconfig
Jakub Kicinski [Tue, 16 Apr 2024 23:21:37 +0000 (16:21 -0700)]
net: netdevsim: select PAGE_POOL in Kconfig

build bot points out that I forgot to add the PAGE_POOL
config dependency when adding the support in netdevsim.

Fixes: 1580cbcbfe77 ("net: netdevsim: add some fake page pool use")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404170348.thxrboF1-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202404170527.LIAPSyMB-lkp@intel.com/
Link: https://lore.kernel.org/r/20240416232137.2022058-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet/sched: Fix mirred deadlock on device recursion
Eric Dumazet [Mon, 15 Apr 2024 21:07:28 +0000 (18:07 -0300)]
net/sched: Fix mirred deadlock on device recursion

When the mirred action is used on a classful egress qdisc and a packet is
mirrored or redirected to self we hit a qdisc lock deadlock.
See trace below.

[..... other info removed for brevity....]
[   82.890906]
[   82.890906] ============================================
[   82.890906] WARNING: possible recursive locking detected
[   82.890906] 6.8.0-05205-g77fadd89fe2d-dirty #213 Tainted: G        W
[   82.890906] --------------------------------------------
[   82.890906] ping/418 is trying to acquire lock:
[   82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at:
__dev_queue_xmit+0x1778/0x3550
[   82.890906]
[   82.890906] but task is already holding lock:
[   82.890906] ffff888006994110 (&sch->q.lock){+.-.}-{3:3}, at:
__dev_queue_xmit+0x1778/0x3550
[   82.890906]
[   82.890906] other info that might help us debug this:
[   82.890906]  Possible unsafe locking scenario:
[   82.890906]
[   82.890906]        CPU0
[   82.890906]        ----
[   82.890906]   lock(&sch->q.lock);
[   82.890906]   lock(&sch->q.lock);
[   82.890906]
[   82.890906]  *** DEADLOCK ***
[   82.890906]
[..... other info removed for brevity....]

Example setup (eth0->eth0) to recreate
tc qdisc add dev eth0 root handle 1: htb default 30
tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \
     action mirred egress redirect dev eth0

Another example(eth0->eth1->eth0) to recreate
tc qdisc add dev eth0 root handle 1: htb default 30
tc filter add dev eth0 handle 1: protocol ip prio 2 matchall \
     action mirred egress redirect dev eth1

tc qdisc add dev eth1 root handle 1: htb default 30
tc filter add dev eth1 handle 1: protocol ip prio 2 matchall \
     action mirred egress redirect dev eth0

We fix this by adding an owner field (CPU id) to struct Qdisc set after
root qdisc is entered. When the softirq enters it a second time, if the
qdisc owner is the same CPU, the packet is dropped to break the loop.

Reported-by: Mingshuai Ren <renmingshuai@huawei.com>
Closes: https://lore.kernel.org/netdev/20240314111713.5979-1-renmingshuai@huawei.com/
Fixes: 3bcb846ca4cf ("net: get rid of spin_trylock() in net_tx_action()")
Fixes: e578d9c02587 ("net: sched: use counter to break reclassify loops")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20240415210728.36949-1-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: usb: qmi_wwan: add Lonsung U8300/U9300 product
Coia Prant [Mon, 15 Apr 2024 14:26:38 +0000 (07:26 -0700)]
net: usb: qmi_wwan: add Lonsung U8300/U9300 product

Update the net usb qmi_wwan driver to support Longsung U8300/U9300.
Enabling DTR on this modem was necessary to ensure stable operation.

For U8300

Interface 4 is used by for QMI interface in stock firmware of U8300, the
router which uses U8300 modem. Free the interface up, to rebind it to
qmi_wwan driver.
Interface 5 is used by for ADB interface in stock firmware of U8300, the
router which uses U8300 modem. Free the interface up.
The proper configuration is:

Interface mapping is:
0: unknown (Debug), 1: AT (Modem), 2: AT, 3: PPP (NDIS / Pipe), 4: QMI, 5: ADB

T:  Bus=05 Lev=01 Prnt=03 Port=02 Cnt=01 Dev#=  4 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1c9e ProdID=9b05 Rev=03.18
S:  Manufacturer=Android
S:  Product=Android
C:  #Ifs= 6 Cfg#= 1 Atr=80 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=89(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=8a(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms

For U9300

Interface 1 is used by for ADB interface in stock firmware of U9300, the
router which uses U9300 modem. Free the interface up.
Interface 4 is used by for QMI interface in stock firmware of U9300, the
router which uses U9300 modem. Free the interface up, to rebind it to
qmi_wwan driver.
The proper configuration is:

Interface mapping is:
0: ADB, 1: AT (Modem), 2: AT, 3: PPP (NDIS / Pipe), 4: QMI

Note: Interface 3 of some models of the U9300 series can send AT commands.

T:  Bus=05 Lev=01 Prnt=05 Port=04 Cnt=01 Dev#=  6 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1c9e ProdID=9b3c Rev=03.18
S:  Manufacturer=Android
S:  Product=Android
C:  #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
I:  If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=89(I) Atr=03(Int.) MxPS=   8 Ivl=32ms

Tested successfully using Modem Manager on U9300.
Tested successfully using qmicli on U9300.

Signed-off-by: Coia Prant <coiaprant@gmail.com>
Link: https://lore.kernel.org/r/20240415142638.1756966-1-coiaprant@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: adopt BPF's approach to quieter builds
Jakub Kicinski [Thu, 11 Apr 2024 19:05:34 +0000 (12:05 -0700)]
selftests: adopt BPF's approach to quieter builds

selftest build is fairly noisy, it's easy to miss warnings.
It's standard practice to add alternative messages in
the Makefile. I was grepping for existing solutions,
and found that bpf already has the right knobs.

Move them to lib.mk and adopt in net.
Convert the basic rules in lib.mk.

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20240411190534.444918-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonetfilter: nf_tables: fix memleak in map from abort path
Pablo Neira Ayuso [Wed, 17 Apr 2024 15:43:21 +0000 (17:43 +0200)]
netfilter: nf_tables: fix memleak in map from abort path

The delete set command does not rely on the transaction object for
element removal, therefore, a combination of delete element + delete set
from the abort path could result in restoring twice the refcount of the
mapping.

Check for inactive element in the next generation for the delete element
command in the abort path, skip restoring state if next generation bit
has been already cleared. This is similar to the activate logic using
the set walk iterator.

[ 6170.286929] ------------[ cut here ]------------
[ 6170.286939] WARNING: CPU: 6 PID: 790302 at net/netfilter/nf_tables_api.c:2086 nf_tables_chain_destroy+0x1f7/0x220 [nf_tables]
[ 6170.287071] Modules linked in: [...]
[ 6170.287633] CPU: 6 PID: 790302 Comm: kworker/6:2 Not tainted 6.9.0-rc3+ #365
[ 6170.287768] RIP: 0010:nf_tables_chain_destroy+0x1f7/0x220 [nf_tables]
[ 6170.287886] Code: df 48 8d 7d 58 e8 69 2e 3b df 48 8b 7d 58 e8 80 1b 37 df 48 8d 7d 68 e8 57 2e 3b df 48 8b 7d 68 e8 6e 1b 37 df 48 89 ef eb c4 <0f> 0b 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 0f
[ 6170.287895] RSP: 0018:ffff888134b8fd08 EFLAGS: 00010202
[ 6170.287904] RAX: 0000000000000001 RBX: ffff888125bffb28 RCX: dffffc0000000000
[ 6170.287912] RDX: 0000000000000003 RSI: ffffffffa20298ab RDI: ffff88811ebe4750
[ 6170.287919] RBP: ffff88811ebe4700 R08: ffff88838e812650 R09: fffffbfff0623a55
[ 6170.287926] R10: ffffffff8311d2af R11: 0000000000000001 R12: ffff888125bffb10
[ 6170.287933] R13: ffff888125bffb10 R14: dead000000000122 R15: dead000000000100
[ 6170.287940] FS:  0000000000000000(0000) GS:ffff888390b00000(0000) knlGS:0000000000000000
[ 6170.287948] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6170.287955] CR2: 00007fd31fc00710 CR3: 0000000133f60004 CR4: 00000000001706f0
[ 6170.287962] Call Trace:
[ 6170.287967]  <TASK>
[ 6170.287973]  ? __warn+0x9f/0x1a0
[ 6170.287986]  ? nf_tables_chain_destroy+0x1f7/0x220 [nf_tables]
[ 6170.288092]  ? report_bug+0x1b1/0x1e0
[ 6170.287986]  ? nf_tables_chain_destroy+0x1f7/0x220 [nf_tables]
[ 6170.288092]  ? report_bug+0x1b1/0x1e0
[ 6170.288104]  ? handle_bug+0x3c/0x70
[ 6170.288112]  ? exc_invalid_op+0x17/0x40
[ 6170.288120]  ? asm_exc_invalid_op+0x1a/0x20
[ 6170.288132]  ? nf_tables_chain_destroy+0x2b/0x220 [nf_tables]
[ 6170.288243]  ? nf_tables_chain_destroy+0x1f7/0x220 [nf_tables]
[ 6170.288366]  ? nf_tables_chain_destroy+0x2b/0x220 [nf_tables]
[ 6170.288483]  nf_tables_trans_destroy_work+0x588/0x590 [nf_tables]

Fixes: 591054469b3e ("netfilter: nf_tables: revisit chain/object refcounting from elements")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 months agogpiolib: swnode: Remove wrong header inclusion
Andy Shevchenko [Wed, 17 Apr 2024 14:19:13 +0000 (17:19 +0300)]
gpiolib: swnode: Remove wrong header inclusion

The flags in the software node properties are supposed to be
the GPIO lookup flags, which are provided by gpio/machine.h,
as the software nodes are the kernel internal thing and doesn't
need to rely to any of ABIs.

Fixes: e7f9ff5dc90c ("gpiolib: add support for software nodes")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2 months agoMerge tag 'pwm/for-6.9-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 17 Apr 2024 17:04:40 +0000 (10:04 -0700)]
Merge tag 'pwm/for-6.9-rc5-fixes' of git://git./linux/kernel/git/ukleinek/linux

Pull pwm fixes from Uwe Kleine-König:
 "The first patch fixes a regression in the suspend/resume path for the
  dwc pwm driver that was introduced in v6.9-rc1 when support for 16
  channel devices was added.

  The second patch fixes a bunch of device tree binding check warnings"

* tag 'pwm/for-6.9-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
  dt-bindings: pwm: mediatek,pwm-disp: Document power-domains property
  pwm: dwc: allow suspend/resume for 16 channels

2 months agoice: Remove ndo_get_phys_port_name
Dariusz Aftanski [Tue, 12 Mar 2024 20:26:35 +0000 (13:26 -0700)]
ice: Remove ndo_get_phys_port_name

ndo_get_phys_port_name is never actually used, as in switchdev
devlink is always being created.

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Dariusz Aftanski <dariusz.aftanski@linux.intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 months agoice: Add automatic VF reset on Tx MDD events
Marcin Szycik [Thu, 4 Apr 2024 14:04:51 +0000 (16:04 +0200)]
ice: Add automatic VF reset on Tx MDD events

In cases when VF sends malformed packets that are classified as malicious,
it can cause Tx queue to freeze as a result of Malicious Driver Detection
event. Such malformed packets can appear as a result of a faulty userspace
app running on VF. This frozen queue can be stuck for several minutes being
unusable.

User might prefer to immediately bring the VF back to operational state
after such event, which can be done by automatically resetting the VF which
caused MDD. This is already implemented for Rx events (mdd-auto-reset-vf
flag private flag needs to be set).

Extend the VF auto reset to also cover Tx MDD events. When any MDD event
occurs on VF (Tx or Rx) and the mdd-auto-reset-vf private flag is set,
perform a graceful VF reset to quickly bring it back to operational state.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Co-developed-by: Liang-Min Wang <liang-min.wang@intel.com>
Signed-off-by: Liang-Min Wang <liang-min.wang@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 months agonetfilter: nf_tables: restore set elements when delete set fails
Pablo Neira Ayuso [Wed, 17 Apr 2024 15:43:11 +0000 (17:43 +0200)]
netfilter: nf_tables: restore set elements when delete set fails

From abort path, nft_mapelem_activate() needs to restore refcounters to
the original state. Currently, it uses the set->ops->walk() to iterate
over these set elements. The existing set iterator skips inactive
elements in the next generation, this does not work from the abort path
to restore the original state since it has to skip active elements
instead (not inactive ones).

This patch moves the check for inactive elements to the set iterator
callback, then it reverses the logic for the .activate case which
needs to skip active elements.

Toggle next generation bit for elements when delete set command is
invoked and call nft_clear() from .activate (abort) path to restore the
next generation bit.

The splat below shows an object in mappings memleak:

[43929.457523] ------------[ cut here ]------------
[43929.457532] WARNING: CPU: 0 PID: 1139 at include/net/netfilter/nf_tables.h:1237 nft_setelem_data_deactivate+0xe4/0xf0 [nf_tables]
[...]
[43929.458014] RIP: 0010:nft_setelem_data_deactivate+0xe4/0xf0 [nf_tables]
[43929.458076] Code: 83 f8 01 77 ab 49 8d 7c 24 08 e8 37 5e d0 de 49 8b 6c 24 08 48 8d 7d 50 e8 e9 5c d0 de 8b 45 50 8d 50 ff 89 55 50 85 c0 75 86 <0f> 0b eb 82 0f 0b eb b3 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90
[43929.458081] RSP: 0018:ffff888140f9f4b0 EFLAGS: 00010246
[43929.458086] RAX: 0000000000000000 RBX: ffff8881434f5288 RCX: dffffc0000000000
[43929.458090] RDX: 00000000ffffffff RSI: ffffffffa26d28a7 RDI: ffff88810ecc9550
[43929.458093] RBP: ffff88810ecc9500 R08: 0000000000000001 R09: ffffed10281f3e8f
[43929.458096] R10: 0000000000000003 R11: ffff0000ffff0000 R12: ffff8881434f52a0
[43929.458100] R13: ffff888140f9f5f4 R14: ffff888151c7a800 R15: 0000000000000002
[43929.458103] FS:  00007f0c687c4740(0000) GS:ffff888390800000(0000) knlGS:0000000000000000
[43929.458107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[43929.458111] CR2: 00007f58dbe5b008 CR3: 0000000123602005 CR4: 00000000001706f0
[43929.458114] Call Trace:
[43929.458118]  <TASK>
[43929.458121]  ? __warn+0x9f/0x1a0
[43929.458127]  ? nft_setelem_data_deactivate+0xe4/0xf0 [nf_tables]
[43929.458188]  ? report_bug+0x1b1/0x1e0
[43929.458196]  ? handle_bug+0x3c/0x70
[43929.458200]  ? exc_invalid_op+0x17/0x40
[43929.458211]  ? nft_setelem_data_deactivate+0xd7/0xf0 [nf_tables]
[43929.458271]  ? nft_setelem_data_deactivate+0xe4/0xf0 [nf_tables]
[43929.458332]  nft_mapelem_deactivate+0x24/0x30 [nf_tables]
[43929.458392]  nft_rhash_walk+0xdd/0x180 [nf_tables]
[43929.458453]  ? __pfx_nft_rhash_walk+0x10/0x10 [nf_tables]
[43929.458512]  ? rb_insert_color+0x2e/0x280
[43929.458520]  nft_map_deactivate+0xdc/0x1e0 [nf_tables]
[43929.458582]  ? __pfx_nft_map_deactivate+0x10/0x10 [nf_tables]
[43929.458642]  ? __pfx_nft_mapelem_deactivate+0x10/0x10 [nf_tables]
[43929.458701]  ? __rcu_read_unlock+0x46/0x70
[43929.458709]  nft_delset+0xff/0x110 [nf_tables]
[43929.458769]  nft_flush_table+0x16f/0x460 [nf_tables]
[43929.458830]  nf_tables_deltable+0x501/0x580 [nf_tables]

Fixes: 628bd3e49cba ("netfilter: nf_tables: drop map element references from preparation phase")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 months agonetfilter: nf_tables: missing iterator type in lookup walk
Pablo Neira Ayuso [Wed, 17 Apr 2024 15:43:01 +0000 (17:43 +0200)]
netfilter: nf_tables: missing iterator type in lookup walk

Add missing decorator type to lookup expression and tighten WARN_ON_ONCE
check in pipapo to spot earlier that this is unset.

Fixes: 29b359cf6d95 ("netfilter: nft_set_pipapo: walk over current view on netlink dump")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2 months agoplatform/x86/amd/pmc: Extend Framework 13 quirk to more BIOSes
Mario Limonciello [Wed, 10 Apr 2024 14:10:46 +0000 (09:10 -0500)]
platform/x86/amd/pmc: Extend Framework 13 quirk to more BIOSes

BIOS 03.05 still hasn't fixed the spurious IRQ1 issue.  As it's still
being worked on there is still a possibility that it won't need to
apply to future BIOS releases.

Add a quirk for BIOS 03.05 as well.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240410141046.433-1-mario.limonciello@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2 months agotcp: accept bare FIN packets under memory pressure
Eric Dumazet [Tue, 16 Apr 2024 09:50:54 +0000 (09:50 +0000)]
tcp: accept bare FIN packets under memory pressure

Andrew Oates reported that some macOS hosts could repeatedly
send FIN packets even if the remote peer drops them and
send back DUP ACK RWIN 0 packets.

<quoting Andrew>

 20:27:16.968254 gif0  In  IP macos > victim: Flags [SEW], seq 1950399762, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 501897188 ecr 0,sackOK,eol], length 0
 20:27:16.968339 gif0  Out IP victim > macos: Flags [S.E], seq 2995489058, ack 1950399763, win 1448, options [mss 1460,sackOK,TS val 3829877593 ecr 501897188,nop,wscale 0], length 0
 20:27:16.968833 gif0  In  IP macos > victim: Flags [.], ack 1, win 2058, options [nop,nop,TS val 501897188 ecr 3829877593], length 0
 20:27:16.968885 gif0  In  IP macos > victim: Flags [P.], seq 1:1449, ack 1, win 2058, options [nop,nop,TS val 501897188 ecr 3829877593], length 1448
 20:27:16.968896 gif0  Out IP victim > macos: Flags [.], ack 1449, win 0, options [nop,nop,TS val 3829877593 ecr 501897188], length 0
 20:27:19.454593 gif0  In  IP macos > victim: Flags [F.], seq 1449, ack 1, win 2058, options [nop,nop,TS val 501899674 ecr 3829877593], length 0
 20:27:19.454675 gif0  Out IP victim > macos: Flags [.], ack 1449, win 0, options [nop,nop,TS val 3829880079 ecr 501899674], length 0
 20:27:19.455116 gif0  In  IP macos > victim: Flags [F.], seq 1449, ack 1, win 2058, options [nop,nop,TS val 501899674 ecr 3829880079], length 0

 The retransmits/dup-ACKs then repeat in a tight loop.

</quoting Andrew>

RFC 9293 3.4. Sequence Numbers states :

  Note that when the receive window is zero no segments should be
  acceptable except ACK segments.  Thus, it is be possible for a TCP to
  maintain a zero receive window while transmitting data and receiving
  ACKs.  However, even when the receive window is zero, a TCP must
  process the RST and URG fields of all incoming segments.

Even if we could consider a bare FIN.ACK packet to be an ACK in RFC terms,
the retransmits should use exponential backoff.

Accepting the FIN in linux does not add extra memory costs,
because the FIN flag will simply be merged to the tail skb in
the receive queue, and incoming packet is freed.

Reported-by: Andrew Oates <aoates@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Cc: Vidhi Goel <vidhi_goel@apple.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agos390/ism: Properly fix receive message buffer allocation
Gerd Bayer [Mon, 15 Apr 2024 13:15:07 +0000 (15:15 +0200)]
s390/ism: Properly fix receive message buffer allocation

Since [1], dma_alloc_coherent() does not accept requests for GFP_COMP
anymore, even on archs that may be able to fulfill this. Functionality that
relied on the receive buffer being a compound page broke at that point:
The SMC-D protocol, that utilizes the ism device driver, passes receive
buffers to the splice processor in a struct splice_pipe_desc with a
single entry list of struct pages. As the buffer is no longer a compound
page, the splice processor now rejects requests to handle more than a
page worth of data.

Replace dma_alloc_coherent() and allocate a buffer with folio_alloc and
create a DMA map for it with dma_map_page(). Since only receive buffers
on ISM devices use DMA, qualify the mapping as FROM_DEVICE.
Since ISM devices are available on arch s390, only, and on that arch all
DMA is coherent, there is no need to introduce and export some kind of
dma_sync_to_cpu() method to be called by the SMC-D protocol layer.

Analogously, replace dma_free_coherent by a two step dma_unmap_page,
then folio_put to free the receive buffer.

[1] https://lore.kernel.org/all/20221113163535.884299-1-hch@lst.de/

Fixes: c08004eede4b ("s390/ism: don't pass bogus GFP_ flags to dma_alloc_coherent")
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agorandom: handle creditable entropy from atomic process context
Jason A. Donenfeld [Wed, 17 Apr 2024 11:38:29 +0000 (13:38 +0200)]
random: handle creditable entropy from atomic process context

The entropy accounting changes a static key when the RNG has
initialized, since it only ever initializes once. Static key changes,
however, cannot be made from atomic context, so depending on where the
last creditable entropy comes from, the static key change might need to
be deferred to a worker.

Previously the code used the execute_in_process_context() helper
function, which accounts for whether or not the caller is
in_interrupt(). However, that doesn't account for the case where the
caller is actually in process context but is holding a spinlock.

This turned out to be the case with input_handle_event() in
drivers/input/input.c contributing entropy:

  [<ffffffd613025ba0>] die+0xa8/0x2fc
  [<ffffffd613027428>] bug_handler+0x44/0xec
  [<ffffffd613016964>] brk_handler+0x90/0x144
  [<ffffffd613041e58>] do_debug_exception+0xa0/0x148
  [<ffffffd61400c208>] el1_dbg+0x60/0x7c
  [<ffffffd61400c000>] el1h_64_sync_handler+0x38/0x90
  [<ffffffd613011294>] el1h_64_sync+0x64/0x6c
  [<ffffffd613102d88>] __might_resched+0x1fc/0x2e8
  [<ffffffd613102b54>] __might_sleep+0x44/0x7c
  [<ffffffd6130b6eac>] cpus_read_lock+0x1c/0xec
  [<ffffffd6132c2820>] static_key_enable+0x14/0x38
  [<ffffffd61400ac08>] crng_set_ready+0x14/0x28
  [<ffffffd6130df4dc>] execute_in_process_context+0xb8/0xf8
  [<ffffffd61400ab30>] _credit_init_bits+0x118/0x1dc
  [<ffffffd6138580c8>] add_timer_randomness+0x264/0x270
  [<ffffffd613857e54>] add_input_randomness+0x38/0x48
  [<ffffffd613a80f94>] input_handle_event+0x2b8/0x490
  [<ffffffd613a81310>] input_event+0x6c/0x98

According to Guoyong, it's not really possible to refactor the various
drivers to never hold a spinlock there. And in_atomic() isn't reliable.

So, rather than trying to be too fancy, just punt the change in the
static key to a workqueue always. There's basically no drawback of doing
this, as the code already needed to account for the static key not
changing immediately, and given that it's just an optimization, there's
not exactly a hurry to change the static key right away, so deferal is
fine.

Reported-by: Guoyong Wang <guoyong.wang@mediatek.com>
Cc: stable@vger.kernel.org
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2 months agoselftests: fix netfilter path in Makefile
Yujie Liu [Tue, 16 Apr 2024 05:32:10 +0000 (13:32 +0800)]
selftests: fix netfilter path in Makefile

Netfilter tests have been moved to a subdir under selftests/net by
patch series [1]. Fix the path in selftests/Makefile accordingly.

This helps fix the following error:

    tools/testing/selftests$ make
    ...
    make[1]: Entering directory 'tools/testing/selftests'
    make[1]: *** netfilter: No such file or directory.  Stop.
    make[1]: Leaving directory 'tools/testing/selftests'

[1] https://lore.kernel.org/all/20240411233624.8129-1-fw@strlen.de/

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: phy: mediatek-ge: do not disable EEE advertisement
Arınç ÜNAL [Sat, 13 Apr 2024 21:08:13 +0000 (00:08 +0300)]
net: phy: mediatek-ge: do not disable EEE advertisement

The mediatek-ge PHY driver already disables EEE advertisement on the switch
PHYs but my testing [1] shows that it is somehow enabled afterwards.
Disabling EEE advertisement before the PHY driver initialises keeps it off.
Therefore, remove disabling EEE advertisement here as it's useless.

Link: https://lore.kernel.org/netdev/d286ea27-e911-4dcb-9037-b75f22b437b8@arinc9.com/
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agoMerge branch 'mt7530-fixes'
David S. Miller [Wed, 17 Apr 2024 07:56:51 +0000 (08:56 +0100)]
Merge branch 'mt7530-fixes'

Merge branch 'mr7530-fixes'

Arınç ÜNAL says:

====================
Fix port mirroring on MT7530 DSA subdriver

This patch series fixes the frames received on the local port (monitor
port) not being mirrored, and port mirroring for the MT7988 SoC switch.
====================

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
2 months agonet: dsa: mt7530: fix port mirroring for MT7988 SoC switch
Arınç ÜNAL [Sat, 13 Apr 2024 13:01:40 +0000 (16:01 +0300)]
net: dsa: mt7530: fix port mirroring for MT7988 SoC switch

The "MT7988A Wi-Fi 7 Generation Router Platform: Datasheet (Open Version)
v0.1" document shows bits 16 to 18 as the MIRROR_PORT field of the CPU
forward control register. Currently, the MT7530 DSA subdriver configures
bits 0 to 2 of the CPU forward control register which breaks the port
mirroring feature for the MT7988 SoC switch.

Fix this by using the MT7531_MIRROR_PORT_GET() and MT7531_MIRROR_PORT_SET()
macros which utilise the correct bits.

Fixes: 110c18bfed41 ("net: dsa: mt7530: introduce driver for MT7988 built-in switch")
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: dsa: mt7530: fix mirroring frames received on local port
Arınç ÜNAL [Sat, 13 Apr 2024 13:01:39 +0000 (16:01 +0300)]
net: dsa: mt7530: fix mirroring frames received on local port

This switch intellectual property provides a bit on the ARL global control
register which controls allowing mirroring frames which are received on the
local port (monitor port). This bit is unset after reset.

This ability must be enabled to fully support the port mirroring feature on
this switch intellectual property.

Therefore, this patch fixes the traffic not being reflected on a port,
which would be configured like below:

  tc qdisc add dev swp0 clsact

  tc filter add dev swp0 ingress matchall skip_sw \
  action mirred egress mirror dev swp0

As a side note, this configuration provides the hairpinning feature for a
single port.

Fixes: 37feab6076aa ("net: dsa: mt7530: add support for port mirroring")
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 months agonet: pse-pd: Rectify and adapt the naming of admin_cotrol member of struct pse_contro...
Kory Maincent (Dent Project) [Sun, 14 Apr 2024 14:21:52 +0000 (16:21 +0200)]
net: pse-pd: Rectify and adapt the naming of admin_cotrol member of struct pse_control_config

In commit 18ff0bcda6d1 ("ethtool: add interface to interact with Ethernet
Power Equipment"), the 'pse_control_config' structure was introduced,
housing a single member labeled 'admin_cotrol' responsible for maintaining
the operational state of the PoDL PSE functions.

A noticeable typographical error exists in the naming of this field
('cotrol' should be corrected to 'control'), which this commit aims to
rectify.

Furthermore, with upcoming extensions of this structure to encompass PoE
functionalities, the field is being renamed to 'podl_admin_state' to
distinctly indicate that this state is tailored specifically for PoDL."

Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://lore.kernel.org/r/20240414-feature_poe-v8-3-e4bf1e860da5@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoof: property: Add fw_devlink support for pse parent
Kory Maincent (Dent Project) [Sun, 14 Apr 2024 14:21:51 +0000 (16:21 +0200)]
of: property: Add fw_devlink support for pse parent

This allows fw_devlink to create device links between consumers of
a PSE and the supplier of the PSE.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://lore.kernel.org/r/20240414-feature_poe-v8-2-e4bf1e860da5@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoMAINTAINERS: net: Add Oleksij to pse-pd maintainers
Kory Maincent (Dent Project) [Sun, 14 Apr 2024 14:21:50 +0000 (16:21 +0200)]
MAINTAINERS: net: Add Oleksij to pse-pd maintainers

Oleksij was the first to add support for pse-pd net subsystem.
Add himself to the maintainers seems logical.

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240414-feature_poe-v8-1-e4bf1e860da5@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: drv-net: add config for netdevsim
Jakub Kicinski [Tue, 16 Apr 2024 00:45:52 +0000 (17:45 -0700)]
selftests: drv-net: add config for netdevsim

Real driver testing will obviously require enabling more
options, but will require more manual setup in the first
place. For CIs running purely software tests we need
to enable netdevsim.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240416004556.1618804-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoselftests: drv-net: add stdout to the command failed exception
Jakub Kicinski [Tue, 16 Apr 2024 00:45:51 +0000 (17:45 -0700)]
selftests: drv-net: add stdout to the command failed exception

ping prints all the info to stdout. To make debug easier capture
stdout in the Exception raised when command unexpectedly fails.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240416004556.1618804-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoip6_vti: fix memleak on netns dismantle
Florian Westphal [Mon, 15 Apr 2024 12:23:44 +0000 (14:23 +0200)]
ip6_vti: fix memleak on netns dismantle

kmemleak reports net_device resources are no longer released, restore
needs_free_netdev toggle.  Sample backtrace:

unreferenced object 0xffff88810874f000 (size 4096): [..]
    [<00000000a2b8af8b>] __kmalloc_node+0x209/0x290
    [<0000000040b0a1a9>] alloc_netdev_mqs+0x58/0x470
    [<00000000b4be1e78>] vti6_init_net+0x94/0x230
    [<000000008830c1ea>] ops_init+0x32/0xc0
    [<000000006a26fa8f>] setup_net+0x134/0x2e0
[..]

Cc: Breno Leitao <leitao@debian.org>
Fixes: a9b2d55a8f1e ("ip6_vti: Do not use custom stat allocator")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240415122346.26503-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agodt-bindings: net: nxp,dwmac-imx: allow nvmem cells property
Peng Fan [Mon, 15 Apr 2024 10:36:21 +0000 (18:36 +0800)]
dt-bindings: net: nxp,dwmac-imx: allow nvmem cells property

Allow nvmem-cells and nvmem-cell-names to get mac_address from onchip
fuse.

Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240415103621.1644735-1-peng.fan@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: ipa: Remove unnecessary print function dev_err()
Jiapeng Chong [Mon, 15 Apr 2024 03:14:56 +0000 (11:14 +0800)]
net: ipa: Remove unnecessary print function dev_err()

The print function dev_err() is redundant because
platform_get_irq_byname() already prints an error.

./drivers/net/ipa/ipa_interrupt.c:300:2-9: line 300 is redundant because platform_get_irq() already prints an error.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8756
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240415031456.10805-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet/handshake: remove redundant assignment to variable ret
Colin Ian King [Mon, 15 Apr 2024 10:07:13 +0000 (11:07 +0100)]
net/handshake: remove redundant assignment to variable ret

The variable is being assigned an value and then is being re-assigned
a new value in the next statement. The assignment is redundant and can
be removed.

Cleans up clang scan build warning:
net/handshake/tlshd.c:216:2: warning: Value stored to 'ret' is never
read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/20240415100713.483399-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agotun: limit printing rate when illegal packet received by tun dev
Lei Chen [Mon, 15 Apr 2024 02:02:46 +0000 (22:02 -0400)]
tun: limit printing rate when illegal packet received by tun dev

vhost_worker will call tun call backs to receive packets. If too many
illegal packets arrives, tun_do_read will keep dumping packet contents.
When console is enabled, it will costs much more cpu time to dump
packet and soft lockup will be detected.

net_ratelimit mechanism can be used to limit the dumping rate.

PID: 33036    TASK: ffff949da6f20000  CPU: 23   COMMAND: "vhost-32980"
 #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253
 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3
 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e
 #3 [fffffe00003fced0] do_nmi at ffffffff8922660d
 #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663
    [exception RIP: io_serial_in+20]
    RIP: ffffffff89792594  RSP: ffffa655314979e8  RFLAGS: 00000002
    RAX: ffffffff89792500  RBX: ffffffff8af428a0  RCX: 0000000000000000
    RDX: 00000000000003fd  RSI: 0000000000000005  RDI: ffffffff8af428a0
    RBP: 0000000000002710   R8: 0000000000000004   R9: 000000000000000f
    R10: 0000000000000000  R11: ffffffff8acbf64f  R12: 0000000000000020
    R13: ffffffff8acbf698  R14: 0000000000000058  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594
 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470
 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6
 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605
 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558
 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124
 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07
 #12 [ffffa65531497b68] printk at ffffffff89318306
 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765
 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun]
 #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun]
 #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net]
 #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost]
 #18 [ffffa65531497f10] kthread at ffffffff892d2e72
 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f

Fixes: ef3db4a59542 ("tun: avoid BUG, dump packet on GSO errors")
Signed-off-by: Lei Chen <lei.chen@smartx.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agonet: dsa: microchip: drop unneeded MODULE_ALIAS
Krzysztof Kozlowski [Sun, 14 Apr 2024 15:49:29 +0000 (17:49 +0200)]
net: dsa: microchip: drop unneeded MODULE_ALIAS

The ID table already has respective entry and MODULE_DEVICE_TABLE and
creates proper alias for SPI driver.  Having another MODULE_ALIAS causes
the alias to be duplicated.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20240414154929.127045-1-krzk@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 months agoice: Fix checking for unsupported keys on non-tunnel device
Marcin Szycik [Tue, 9 Apr 2024 15:45:44 +0000 (17:45 +0200)]
ice: Fix checking for unsupported keys on non-tunnel device

Add missing FLOW_DISSECTOR_KEY_ENC_* checks to TC flower filter parsing.
Without these checks, it would be possible to add filters with tunnel
options on non-tunnel devices. enc_* options are only valid for tunnel
devices.

Example:
  devlink dev eswitch set $PF1_PCI mode switchdev
  echo 1 > /sys/class/net/$PF1/device/sriov_numvfs
  tc qdisc add dev $VF1_PR ingress
  ethtool -K $PF1 hw-tc-offload on
  tc filter add dev $VF1_PR ingress flower enc_ttl 12 skip_sw action drop

Fixes: 9e300987d4a8 ("ice: VXLAN and Geneve TC support")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 months agoice: tc: allow zero flags in parsing tc flower
Michal Swiatkowski [Fri, 15 Mar 2024 11:08:21 +0000 (12:08 +0100)]
ice: tc: allow zero flags in parsing tc flower

The check for flags is done to not pass empty lookups to adding switch
rule functions. Since metadata is always added to lookups there is no
need to check against the flag.

It is also fixing the problem with such rule:
$ tc filter add dev gtp_dev ingress protocol ip prio 0 flower \
enc_dst_port 2123 action drop
Switch block in case of GTP can't parse the destination port, because it
should always be set to GTP specific value. The same with ethertype. The
result is that there is no other matching criteria than GTP tunnel. In
this case flags is 0, rule can't be added only because of defensive
check against flags.

Fixes: 9a225f81f540 ("ice: Support GTP-U and GTP-C offload in switchdev")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 months agoice: tc: check src_vsi in case of traffic from VF
Michal Swiatkowski [Fri, 15 Mar 2024 11:08:20 +0000 (12:08 +0100)]
ice: tc: check src_vsi in case of traffic from VF

In case of traffic going from the VF (so ingress for port representor)
source VSI should be consider during packet classification. It is
needed for hardware to not match packets from different ports with
filters added on other port.

It is only for "from VF" traffic, because other traffic direction
doesn't have source VSI.

Set correct ::src_vsi in rule_info to pass it to the hardware filter.

For example this rule should drop only ipv4 packets from eth10, not from
the others VF PRs. It is needed to check source VSI in this case.
$tc filter add dev eth10 ingress protocol ip flower skip_sw action drop

Fixes: 0d08a441fb1a ("ice: ndo_setup_tc implementation for PF")
Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 months agoMerge branch 'net-stmmac-fix-mac-capabilities-procedure'
Paolo Abeni [Tue, 16 Apr 2024 13:24:57 +0000 (15:24 +0200)]
Merge branch 'net-stmmac-fix-mac-capabilities-procedure'

Serge Semin says:

====================
net: stmmac: Fix MAC-capabilities procedure

The series got born as a result of the discussions around the recent
Yanteng' series adding the Loongson LS7A1000, LS2K1000, LS7A2000, LS2K2000
MACs support:
Link: https://lore.kernel.org/netdev/fu3f6uoakylnb6eijllakeu5i4okcyqq7sfafhp5efaocbsrwe@w74xe7gb6x7p
In particular the Yanteng' patchset needed to implement the Loongson
MAC-specific constraints applied to the link speed and link duplex mode.
As a result of the discussion with Russel the next preliminary patch was
born:
Link: https://lore.kernel.org/netdev/df31e8bcf74b3b4ddb7ddf5a1c371390f16a2ad5.1712917541.git.siyanteng@loongson.cn
The patch above was a temporal solution utilized by Yanteng for further
developments and to move on with the on-going review. This patchset is a
refactored version of that single patch with formatting required for the
fixes patches.

In particular the series starts with fixing the half-duplex-less
constraint currently applied for all IP-cores. In fact it's specific for
the DW QoS Eth only (DW GMAC v4.x/v5.x).

The next patch fixes the MAC-capabilities setting up during the active
Tx/Rx queues re-initialization procedure. Particularly the procedure
missed the max-speed limit thus possibly activating speeds prohibited on
the respective platforms.

Third patch fixes the incorrect MAC-capabilities initialization for DW
MAC100, DW XGMAC and DW XLGMAC devices by moving the correct
initialization to the IP-core specific setup() methods.

That's it for now. Thanks for review and testing in advance.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
Cc: Samuel Holland <samuel@sholland.org>
Cc: netdev@vger.kernel.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-sunxi@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
====================

Link: https://lore.kernel.org/r/20240412180340.7965-1-fancer.lancer@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: stmmac: Fix IP-cores specific MAC capabilities
Serge Semin [Fri, 12 Apr 2024 18:03:16 +0000 (21:03 +0300)]
net: stmmac: Fix IP-cores specific MAC capabilities

Here is the list of the MAC capabilities specific to the particular DW MAC
IP-cores currently supported by the driver:

DW MAC100: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
   MAC_10 | MAC_100

DW GMAC:  MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
          MAC_10 | MAC_100 | MAC_1000

Allwinner sun8i MAC: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
                     MAC_10 | MAC_100 | MAC_1000

DW QoS Eth: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
            MAC_10 | MAC_100 | MAC_1000 | MAC_2500FD
if there is more than 1 active Tx/Rx queues:
   MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
           MAC_10FD | MAC_100FD | MAC_1000FD | MAC_2500FD

DW XGMAC: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
          MAC_1000FD | MAC_2500FD | MAC_5000FD | MAC_10000FD

DW XLGMAC: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
          MAC_1000FD | MAC_2500FD | MAC_5000FD | MAC_10000FD |
          MAC_25000FD | MAC_40000FD | MAC_50000FD | MAC_100000FD

As you can see there are only two common capabilities:
MAC_ASYM_PAUSE | MAC_SYM_PAUSE.
Meanwhile what is currently implemented defines 10/100/1000 link speeds
for all IP-cores, which is definitely incorrect for DW MAC100, DW XGMAC
and DW XLGMAC devices.

Seeing the flow-control is implemented as a callback for each MAC IP-core
(see dwmac100_flow_ctrl(), dwmac1000_flow_ctrl(), sun8i_dwmac_flow_ctrl(),
etc) and since the MAC-specific setup() method is supposed to be called
for each available DW MAC-based device, the capabilities initialization
can be freely moved to these setup() functions, thus correctly setting up
the MAC-capabilities for each IP-core (including the Allwinner Sun8i). A
new stmmac_link::caps field was specifically introduced for that so to
have all link-specific info preserved in a single structure.

Note the suggested change fixes three earlier commits at a time. The
commit 5b0d7d7da64b ("net: stmmac: Add the missing speeds that XGMAC
supports") permitted the 10-100 link speeds and 1G half-duplex mode for DW
XGMAC IP-core even though it doesn't support them. The commit df7699c70c1b
("net: stmmac: Do not cut down 1G modes") incorrectly added the MAC1000
capability to the DW MAC100 IP-core. Similarly to the DW XGMAC the commit
8a880936e902 ("net: stmmac: Add XLGMII support") incorrectly permitted the
10-100 link speeds and 1G half-duplex mode for DW XLGMAC IP-core.

Fixes: 5b0d7d7da64b ("net: stmmac: Add the missing speeds that XGMAC supports")
Fixes: df7699c70c1b ("net: stmmac: Do not cut down 1G modes")
Fixes: 8a880936e902 ("net: stmmac: Add XLGMII support")
Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk>
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Romain Gantois <romain.gantois@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: stmmac: Fix max-speed being ignored on queue re-init
Serge Semin [Fri, 12 Apr 2024 18:03:15 +0000 (21:03 +0300)]
net: stmmac: Fix max-speed being ignored on queue re-init

It's possible to have the maximum link speed being artificially limited on
the platform-specific basis. It's done either by setting up the
plat_stmmacenet_data::max_speed field or by specifying the "max-speed"
DT-property. In such cases it's required that any specific
MAC-capabilities re-initializations would take the limit into account. In
particular the link speed capabilities may change during the number of
active Tx/Rx queues re-initialization. But the currently implemented
procedure doesn't take the speed limit into account.

Fix that by calling phylink_limit_mac_speed() in the
stmmac_reinit_queues() method if the speed limitation was required in the
same way as it's done in the stmmac_phy_setup() function.

Fixes: 95201f36f395 ("net: stmmac: update MAC capabilities when tx queues are updated")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Romain Gantois <romain.gantois@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agonet: stmmac: Apply half-duplex-less constraint for DW QoS Eth only
Serge Semin [Fri, 12 Apr 2024 18:03:14 +0000 (21:03 +0300)]
net: stmmac: Apply half-duplex-less constraint for DW QoS Eth only

There are three DW MAC IP-cores which can have the multiple Tx/Rx queues
enabled:
DW GMAC v3.7+ with AV feature,
DW QoS Eth v4.x/v5.x,
DW XGMAC/XLGMAC
Based on the respective HW databooks, only the DW QoS Eth IP-core doesn't
support the half-duplex link mode in case if more than one queues enabled:

"In multiple queue/channel configurations, for half-duplex operation,
enable only the Q0/CH0 on Tx and Rx. For single queue/channel in
full-duplex operation, any queue/channel can be enabled."

The rest of the IP-cores don't have such constraint. Thus in order to have
the constraint applied for the DW QoS Eth MACs only, let's move the it'
implementation to the respective MAC-capabilities getter and make sure the
getter is called in the queues re-init procedure.

Fixes: b6cfffa7ad92 ("stmmac: fix DMA channel hang in half-duplex mode")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Romain Gantois <romain.gantois@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoplatform/x86/intel-uncore-freq: Increase minor number support
Srinivas Pandruvada [Mon, 15 Apr 2024 22:06:25 +0000 (15:06 -0700)]
platform/x86/intel-uncore-freq: Increase minor number support

No new changes will be added for minor version 2. Change the minor
version number to 2 and stop displaying log message for unsupported
minor version 2.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20240415220625.2828339-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2 months agoplatform/x86: ISST: Add Granite Rapids-D to HPM CPU list
Srinivas Pandruvada [Mon, 15 Apr 2024 21:28:53 +0000 (14:28 -0700)]
platform/x86: ISST: Add Granite Rapids-D to HPM CPU list

Add Granite Rapids-D to hpm_cpu_ids, so that MSR 0x54 can be used.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20240415212853.2820470-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2 months agoplatform/x86/amd: pmf: Add quirk for ROG Zephyrus G14
Mario Limonciello [Wed, 10 Apr 2024 14:09:56 +0000 (09:09 -0500)]
platform/x86/amd: pmf: Add quirk for ROG Zephyrus G14

ROG Zephyrus G14 advertises support for SPS notifications to the
BIOS but doesn't actually use them. Instead the asus-nb-wmi driver
utilizes such events.

Add a quirk to prevent the system from registering for ACPI platform
profile when this system is found to avoid conflicts.

Reported-by: al0uette@outlook.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218685
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240410140956.385-3-mario.limonciello@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2 months agoplatform/x86/amd: pmf: Add infrastructure for quirking supported funcs
Mario Limonciello [Wed, 10 Apr 2024 14:09:55 +0000 (09:09 -0500)]
platform/x86/amd: pmf: Add infrastructure for quirking supported funcs

In the event of a BIOS bug add infrastructure that will be utilized
to override the return value for supported_funcs to avoid enabling
features.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240410140956.385-2-mario.limonciello@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2 months agoplatform/x86/amd: pmf: Decrease error message to debug
Mario Limonciello [Wed, 10 Apr 2024 14:09:54 +0000 (09:09 -0500)]
platform/x86/amd: pmf: Decrease error message to debug

ASUS ROG Zephyrus G14 doesn't have _CRS in AMDI0102 device and so
there are no resources to walk.  This is expected behavior because
it doesn't support Smart PC.  Decrease error message to debug.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=218685
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240410140956.385-1-mario.limonciello@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2 months agoaf_unix: Try not to hold unix_gc_lock during accept().
Kuniyuki Iwashima [Sat, 13 Apr 2024 02:19:28 +0000 (19:19 -0700)]
af_unix: Try not to hold unix_gc_lock during accept().

Commit dcf70df2048d ("af_unix: Fix up unix_edge.successor for embryo
socket.") added spin_lock(&unix_gc_lock) in accept() path, and it
caused regression in a stress test as reported by kernel test robot.

If the embryo socket is not part of the inflight graph, we need not
hold the lock.

To decide that in O(1) time and avoid the regression in the normal
use case,

  1. add a new stat unix_sk(sk)->scm_stat.nr_unix_fds

  2. count the number of inflight AF_UNIX sockets in the receive
     queue under unix_state_lock()

  3. move unix_update_edges() call under unix_state_lock()

  4. avoid locking if nr_unix_fds is 0 in unix_update_edges()

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202404101427.92a08551-oliver.sang@intel.com
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240413021928.20946-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoMerge branch 'selftests-net-tcp_ao-a-bunch-of-fixes-for-tcp-ao-selftests'
Paolo Abeni [Tue, 16 Apr 2024 11:35:09 +0000 (13:35 +0200)]
Merge branch 'selftests-net-tcp_ao-a-bunch-of-fixes-for-tcp-ao-selftests'

Dmitry Safonov via says:

====================
selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests

Started as addressing the flakiness issues in rst_ipv*, that affect
netdev dashboard.

Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
====================

Link: https://lore.kernel.org/r/20240413-tcp-ao-selftests-fixes-v1-0-f9c41c96949d@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoselftests/tcp_ao: Printing fixes to confirm with format-security
Dmitry Safonov [Sat, 13 Apr 2024 01:42:55 +0000 (02:42 +0100)]
selftests/tcp_ao: Printing fixes to confirm with format-security

On my new laptop with packages from nixos-unstable, gcc 12.3.0 produces
> lib/setup.c: In function ‘__test_msg’:
> lib/setup.c:20:9: error: format not a string literal and no format arguments [-Werror=format-security]
>    20 |         ksft_print_msg(buf);
>       |         ^~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_ok’:
> lib/setup.c:26:9: error: format not a string literal and no format arguments [-Werror=format-security]
>    26 |         ksft_test_result_pass(buf);
>       |         ^~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_fail’:
> lib/setup.c:32:9: error: format not a string literal and no format arguments [-Werror=format-security]
>    32 |         ksft_test_result_fail(buf);
>       |         ^~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_xfail’:
> lib/setup.c:38:9: error: format not a string literal and no format arguments [-Werror=format-security]
>    38 |         ksft_test_result_xfail(buf);
>       |         ^~~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_error’:
> lib/setup.c:44:9: error: format not a string literal and no format arguments [-Werror=format-security]
>    44 |         ksft_test_result_error(buf);
>       |         ^~~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_skip’:
> lib/setup.c:50:9: error: format not a string literal and no format arguments [-Werror=format-security]
>    50 |         ksft_test_result_skip(buf);
>       |         ^~~~~~~~~~~~~~~~~~~~~
> cc1: some warnings being treated as errors

As the buffer was already pre-printed into, print it as a string
rather than a format-string.

Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoselftests/tcp_ao: Fix fscanf() call for format-security
Dmitry Safonov [Sat, 13 Apr 2024 01:42:54 +0000 (02:42 +0100)]
selftests/tcp_ao: Fix fscanf() call for format-security

On my new laptop with packages from nixos-unstable, gcc 12.3.0 produces:
> lib/proc.c: In function ‘netstat_read_type’:
> lib/proc.c:89:9: error: format not a string literal and no format arguments [-Werror=format-security]
>    89 |         if (fscanf(fnetstat, type->header_name) == EOF)
>       |         ^~
> cc1: some warnings being treated as errors

Here the selftests lib parses header name, while expectes non-space word
ending with a column.

Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoselftests/tcp_ao: Zero-init tcp_ao_info_opt
Dmitry Safonov [Sat, 13 Apr 2024 01:42:53 +0000 (02:42 +0100)]
selftests/tcp_ao: Zero-init tcp_ao_info_opt

The structure is on the stack and has to be zero-initialized as
the kernel checks for:
> if (in.reserved != 0 || in.reserved2 != 0)
> return -EINVAL;

Fixes: b26660531cf6 ("selftests/net: Add test for TCP-AO add setsockopt() command")
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoselftests/tcp_ao: Make RST tests less flaky
Dmitry Safonov [Sat, 13 Apr 2024 01:42:52 +0000 (02:42 +0100)]
selftests/tcp_ao: Make RST tests less flaky

Currently, "active reset" cases are flaky, because select() is called
for 3 sockets, while only 2 are expected to receive RST.
The idea of the third socket was to get into request_sock_queue,
but the test mistakenly attempted to connect() after the listener
socket was shut down.

Repair this test, it's important to check the different kernel
code-paths for signing RST TCP-AO segments.

Fixes: c6df7b2361d7 ("selftests/net: Add TCP-AO RST test")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Paolo Abeni [Tue, 16 Apr 2024 11:11:29 +0000 (13:11 +0200)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================

This series contains updates to ice driver only.

Lukasz removes unnecessary argument from ice_fdir_comp_rules().

Jakub adds support for ethtool 'ether' flow-type rules.

Jake moves setting of VF MSI-X value to initialization function and adds
tracking of VF relative MSI-X index.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: store VF relative MSI-X index in q_vector->vf_reg_idx
  ice: set vf->num_msix in ice_initialize_vf_entry()
  ice: Implement 'flow-type ether' rules
  ice: Remove unnecessary argument from ice_fdir_comp_rules()
====================

Link: https://lore.kernel.org/r/20240412210534.916756-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoMerge branch 'selftests-assortment-of-fixes'
Paolo Abeni [Tue, 16 Apr 2024 10:14:44 +0000 (12:14 +0200)]
Merge branch 'selftests-assortment-of-fixes'

Petr Machata says:

====================
selftests: Assortment of fixes

This is a loose follow-up to the Kernel CI patchset posted recently. It
contains various fixes that were supposed to be part of said patchset, but
didn't fit due to its size. The latter 4 patches were written independently
of the CI effort, but again didn't fit in their intended patchsets.

- Patch #1 unifies code of two very similar looking functions, busywait()
  and slowwait().

- Patch #2 adds sanity checks around the setting of NETIFS, which carries
  list of interfaces to run on.

- Patch #3 changes bail_on_lldpad() to SKIP instead of FAILing.

- Patches #4 to #7 fix issues in selftests.

- Patches #8 to #10 add topology diagrams to several selftests.
  This should have been part of the mlxsw leg of NH group stats patches,
  but again, it did not fit in due to size.
====================

Link: https://lore.kernel.org/r/cover.1712940759.git.petrm@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoselftests: forwarding: router_nh: Add a diagram
Petr Machata [Fri, 12 Apr 2024 17:03:13 +0000 (19:03 +0200)]
selftests: forwarding: router_nh: Add a diagram

This test lacks a topology diagram, making the setup not obvious.
Add one.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoselftests: forwarding: router_mpath_nh_res: Add a diagram
Petr Machata [Fri, 12 Apr 2024 17:03:12 +0000 (19:03 +0200)]
selftests: forwarding: router_mpath_nh_res: Add a diagram

This test lacks a topology diagram, making the setup not obvious.
Add one.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 months agoselftests: forwarding: router_mpath_nh: Add a diagram
Petr Machata [Fri, 12 Apr 2024 17:03:11 +0000 (19:03 +0200)]
selftests: forwarding: router_mpath_nh: Add a diagram

This test lacks a topology diagram, making the setup not obvious.
Add one.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>