git.kernel.dk Git - linux-block.git/log

wifi: ath12k: correct the capital word typo

Rename the "ATH12k" word to "ATH12K" for consistent capitalization in the
word.

Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240405144524.1157122-1-quic_periyasa@quicinc.com

wifi: ath11k: support hibernation

Now that all infrastructure is in place and ath11k is fixed to handle all the
corner cases, power down the ath11k firmware during suspend and power it back
up during resume. This fixes the problem when using hibernation with ath11k PCI
devices.

For suspend, two conditions needs to be satisfied:
        1. since MHI channel unprepare would be done in late suspend stage,
           ath11k needs to get all QMI-dependent things done before that stage.
        2. and because unprepare MHI channels requires a working MHI stack,
           ath11k is not allowed to call mhi_power_down() until that finishes.
So the original suspend callback is separated into two parts: the first part
handles all QMI-dependent things in suspend callback; while the second part
powers down MHI in suspend_late callback. This is valid because kernel calls
ath11k's suspend callback before all suspend_late callbacks, making the first
condition happy. And because MHI devices are children of ath11k device
(ab->dev), kernel guarantees that ath11k's suspend_late callback is called
after QRTR's suspend_late callback, this satisfies the second condition.

Above analysis also applies to resume process. so the original resume
callback is separated into two parts: the first part powers up MHI stack
in resume_early callback, this guarantees MHI stack is working when
QRTR tries to prepare MHI channels (kernel calls QRTR's resume_early callback
after ath11k's resume_early callback, due to the child-father relationship);
the second part waits for the completion of restart, which won't fail now
since MHI channels are ready for use by QMI.

Another notable change is in power down path, we tell mhi_power_down() to not
to destroy MHI devices, making it possible for QRTR to help unprepare/prepare
MHI channels, and finally get us rid of the probe-defer issue when resume.

Also change related code due to interface changes.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240305021320.3367-4-quic_bqiang@quicinc.com

net: qrtr: support suspend/hibernation

MHI devices may not be destroyed during suspend/hibernation, so need
to unprepare/prepare MHI channels throughout the transition, this is
done by adding suspend/resume callbacks.

The suspend callback is called in the late suspend stage, this means
MHI channels are still alive at suspend stage, and that makes it
possible for an MHI controller driver to communicate with others over
those channels at suspend stage. While the resume callback is called
in the early resume stage, for a similar reason.

Also note that we won't do unprepare/prepare when MHI device is in
suspend state because it's pointless if MHI is only meant to go through
a suspend/resume transition, instead of a complete power cycle.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240305021320.3367-3-quic_bqiang@quicinc.com

wifi: ath12k: fix link capable flags

By default driver supports intra-device SLO/MLO, the link capability flags
reflect this. When the QMI PHY capability learning fails driver not enable
the MLO parameter in the host capability QMI request message. In this case,
reset the device link capability flags to zero (SLO/MLO not support) to
accurately represent the capabilities.

Found in code review.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00188-QCAHKSWPL_SILICONZ-1

Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240403042056.1504209-3-quic_periyasa@quicinc.com

wifi: ath12k: extend the link capable flag

Link capability categorized as Single Link Operation (SLO) and Multi Link
Operation (MLO).

- Intra-device SLO/MLO refers to links present within a device
- Inter-device SLO/MLO refers to links present across multiple devices

Currently, driver uses a boolean variable to represent intra-device SLO/MLO
capability. To accommodate inter-device SLO/MLO capabilities within the
same variable, modify the existing variable name and type. Define a new
enumeration for the link capabilities to accommodate both intra-device
and inter-device scenarios. Populate the enum based on the supported
capabilities.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00188-QCAHKSWPL_SILICONZ-1

Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240403042056.1504209-2-quic_periyasa@quicinc.com

Merge branch 'mhi-immutable' of git://git./linux/kernel/git/mani/mhi into ath-next

This is needed for patch 'wifi: ath11k: support hibernation'.

wifi: ath10k: support board-specific firmware overrides

Different Qualcomm platforms using WCN3990 WiFI chip use SoC-specific
firmware versions with different features. For example firmware for
SDM845 doesn't use single-chan-info-per-channel feature, while firmware
for QRB2210 / QRB4210 requires that feature. Allow board DT files to
override the subdir of the fw dir used to lookup the firmware-N.bin file
decribing corresponding WiFi firmware.

For example:

- ath10k/WCN3990/hw1.0/wlanmdsp.mbn,
  ath10k/WCN3990/hw1.0/firmware-5.bin: main firmware files, used by default

- ath10k/WCN3990/hw1.0/qcm2290/wlanmdsp.mbn,
  ath10k/WCN3990/hw1.0/qcm2290/firmware-5.bin: SoC specific firmware
    with different signature and feature bits

Note, while board files lookup uses the same function and thus it is
possible to provide board-specific board-2.bin files, this is not
required in 99% of cases as board-2.bin already contains a way to
provide board-specific data with finer granularity than DT overrides.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240306-wcn3990-firmware-path-v2-2-f89e98e71a57@linaro.org

dt-bindings: net: wireless: ath10k: describe firmware-name property

For WCN3990 platforms we need to look for the platform / board specific
firmware-N.mbn file which corresponds to the wlanmdsp.mbn loaded to the
modem DSP via the TQFTPserv. Add firmware-name property describing this
classifier.

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240306-wcn3990-firmware-path-v2-1-f89e98e71a57@linaro.org

wifi: ath6kl: sdio: simplify module initialization

This driver's initialization functions do not perform any custom code,
except printing messages. Printing messages on modules
loading/unloading is discouraged because it pollutes the dmesg
regardless whether user actually has this device. Core kernel code
already gives tools to investigate whether module was loaded or not.

Drop the printing messages which allows to replace open-coded
module_sdio_driver().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240329171019.63836-2-krzysztof.kozlowski@linaro.org

wifi: wil6210: wmi: Use __counted_by() in struct wmi_set_link_monitor_cmd and avoid -Wfamnae warning

Prepare for the coming implementation by GCC and Clang of the
__counted_by attribute. Flexible array members annotated with
__counted_by can have their accesses bounds-checked at run-time
via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE
(for strcpy/memcpy-family functions).

Also, -Wflex-array-member-not-at-end is coming in GCC-14, and we are
getting ready to enable it globally.

So, use the `DEFINE_FLEX()` helper for an on-stack definition of
a flexible structure where the size of the flexible-array member
is known at compile-time, and refactor the rest of the code,
accordingly.

So, with these changes, fix the following warning:
drivers/net/wireless/ath/wil6210/wmi.c:4018:49: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Link: https://github.com/KSPP/linux/issues/202
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/ZgSTCmdP+omePvWg@neat

wifi: wil6210: cfg80211: Use __counted_by() in struct wmi_start_scan_cmd and avoid some -Wfamnae warnings

Prepare for the coming implementation by GCC and Clang of the
__counted_by attribute. Flexible array members annotated with
__counted_by can have their accesses bounds-checked at run-time
via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE
(for strcpy/memcpy-family functions).

Also, -Wflex-array-member-not-at-end is coming in GCC-14, and we are
getting ready to enable it globally.

So, use the `DEFINE_FLEX()` helper for an on-stack definition of
a flexible structure where the size of the flexible-array member
is known at compile-time, and refactor the rest of the code,
accordingly.

So, with these changes, fix the following warning:
drivers/net/wireless/ath/wil6210/cfg80211.c:896:43: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Link: https://github.com/KSPP/linux/issues/202
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/ZgSP/CMSVfr68R2u@neat

wifi: ath12k: fix hal_rx_buf_return_buf_manager documentation

kernel-doc flagged the following issue, so fix it:

drivers/net/wireless/ath/ath12k/hal.h:771: warning: missing initial short description on line:
* enum hal_rx_buf_return_buf_manager

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240329-hal_rx_buf_return_buf_manager-v1-2-e62ec9dc2af9@quicinc.com

wifi: ath11k: fix hal_rx_buf_return_buf_manager documentation

kernel-doc flagged the following issue, so fix it:

drivers/net/wireless/ath/ath11k/hal.h:668: warning: missing initial short description on line:
* enum hal_rx_buf_return_buf_manager

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240329-hal_rx_buf_return_buf_manager-v1-1-e62ec9dc2af9@quicinc.com

wifi: ath9k: work around memset overflow warning

gcc-9 and some other older versions produce a false-positive warning
for zeroing two fields

In file included from include/linux/string.h:369,
                 from drivers/net/wireless/ath/ath9k/main.c:18:
In function 'fortify_memset_chk',
    inlined from 'ath9k_ps_wakeup' at drivers/net/wireless/ath/ath9k/main.c:140:3:
include/linux/fortify-string.h:462:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
  462 |                         __write_overflow_field(p_size_field, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using a struct_group seems to reliably avoid the warning and
not make the code much uglier. The combined memset() should even
save a couple of cpu cycles.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240328135509.3755090-3-arnd@kernel.org

wifi: carl9170: re-fix fortified-memset warning

The carl9170_tx_release() function sometimes triggers a fortified-memset
warning in my randconfig builds:

In file included from include/linux/string.h:254,
                 from drivers/net/wireless/ath/carl9170/tx.c:40:
In function 'fortify_memset_chk',
    inlined from 'carl9170_tx_release' at drivers/net/wireless/ath/carl9170/tx.c:283:2,
    inlined from 'kref_put' at include/linux/kref.h:65:3,
    inlined from 'carl9170_tx_put_skb' at drivers/net/wireless/ath/carl9170/tx.c:342:9:
include/linux/fortify-string.h:493:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
  493 |                         __write_overflow_field(p_size_field, size);

Kees previously tried to avoid this by using memset_after(), but it seems
this does not fully address the problem. I noticed that the memset_after()
here is done on a different part of the union (status) than the original
cast was from (rate_driver_data), which may confuse the compiler.

Unfortunately, the memset_after() trick does not work on driver_rates[]
because that is part of an anonymous struct, and I could not get
struct_group() to do this either. Using two separate memset() calls
on the two members does address the warning though.

Fixes: fb5f6a0e8063b ("mac80211: Use memset_after() to clear tx status")
Link: https://lore.kernel.org/lkml/20230623152443.2296825-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240328135509.3755090-2-arnd@kernel.org

wifi: ath12k: fix missing endianness conversion in wmi_vdev_create_cmd()

The WMI commands are in little endian byte order, fix missing endianness
conversion in wmi_vdev_create_cmd.

Tested-on: WCN7850 hw2.0 WLAN.IOE_HMT.1.0.2-00240-QCAHMTSWPL_V1.0_V2.0_SILICON-1

Signed-off-by: Miaoqing Pan <quic_miaoqing@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240322003726.2016208-1-quic_miaoqing@quicinc.com

wifi: ath12k: debugfs: radar simulation support

Create dfs_simulate_radar debugfs in ath12k debugfs directory.

Usage:

echo 1 > /sys/kernel/debug/ath12k/pci-0000:06:00.0/mac0/dfs_simulate_radar

This debugfs helps user to simulate RADAR interference in run time.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Signed-off-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240320171305.655288-3-quic_rgnanase@quicinc.com

wifi: ath12k: initial debugfs support

The initial debugfs infra bringup in ath12k driver and create the ath12k debugfs
and soc-specific directories in /sys/kernel/debug/

For each ath12k device, directory will be created in <bus>-<devname>
schema under ath12k root directory.

Example with one ath12k device:
/sys/kernel/debug/ath12k/pci-0000:06:00.0

ath12k
`-- pci-0000:06:00.0
|-- mac0

To enable ath12k debugfs support (CONFIG_ATH12K_DEBUGFS=y)

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1

Signed-off-by: Ramasamy Kaliappan <quic_rkaliapp@quicinc.com>
Signed-off-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240320171305.655288-2-quic_rgnanase@quicinc.com

Merge branch 'mlxsw-preparations-for-improving-performance'

Petr Machata says:

====================
mlxsw: Preparations for improving performance

Amit Cohen writes:

mlxsw driver will use NAPI for event processing in a next patch set.
Some additional improvements will be added later. This patch set
prepares the code for NAPI usage and refactor some relevant areas. See
more details in commit messages.

Patch Set overview:
Patches #1-#2 are preparations for patch #3
Patch #3 setups tasklets as part of queue initializtion
Patch #4 removes handling of unlikely scenario
Patch #5 removes unused counters
Patch #6 makes style change in mlxsw_pci_eq_tasklet()
Patch #7-#10 poll command interface instead of EQ0 usage
Patches #11-#12 make style change and break the function
mlxsw_pci_cq_tasklet()
Patches #13-#14 remove functions which can be replaced by a stored value
Patch #15 improves accessing to descriptor queue instance
====================

Link: https://lore.kernel.org/r/cover.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Store DQ pointer as part of CQ structure

Currently, for each completion, we check the number of descriptor queue
and take it via mlxsw_pci_{sdq,rdq}_get(). This is inefficient, the
DQ should be the same for all the completions in CQ, as each CQ handles
only one DQ - SDQ or RDQ. This mapping is handled as part of DQ
initialization via mlxsw_cmd_mbox_sw2hw_dq_cq_set().

Instead, as part of DQ initialization, set DQ pointer in the appropriate
CQ structure. When we handle completions, warn in case that the DQ number
that we expect is different from the number we get in the CQE. Call
WARN_ON_ONCE() only after checking the value, to avoid calling this method
for each completion.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/a5b2559cd6d532c120f3194f89a1e257110318f1.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Remove mlxsw_pci_cq_count()

Currently, for each interrupt we call mlxsw_pci_cq_count() to determine the
number of CQs. This call makes additional two function's calls. This can
be removed by storing this value as part of structure 'mlxsw_pci', as we
already do for number of SDQs. Remove the function and
__mlxsw_pci_queue_count() which is now not used and store the value
instead.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/f08ad113e8160678f3c8d401382a696c6c7f44c7.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Remove mlxsw_pci_sdq_count()

The number of SDQs is stored as part of 'mlxsw_pci' structure. In some
cases, the driver uses this value and in some cases it calls
mlxsw_pci_sdq_count() to get the value. Align the code to use the
stored value. This simplifies the code and makes it clearer that the
value is always the same. Rename 'mlxsw_pci->num_sdq_cqs' to
'mlxsw_pci->num_sdqs' as now it is used not only in CQ context.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/0c8788506d9af35d589dbf64be35a508fd63d681.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Break mlxsw_pci_cq_tasklet() into tasklets per queue type

Completion queues are used for completions of RDQ or SDQ. Each
completion queue is used for one DQ. The first CQs are used for SDQs and
the rest are used for RDQs.

Currently, for each CQE (completion queue element), we check 'sr' value
(send/receive) to know if it is completion of RDQ or SDQ. Actually, we
do not really have to check it, as according to the queue number we know
if it handles completions of Rx or Tx.

Break the tasklet into two - one for Rx (RDQ) and one for Tx (SDQ). Then,
setup the appropriate tasklet for each queue as part of queue
initialization. Use 'sr' value for unlikely case that we get completion
with type that we do not expect. Call WARN_ON_ONCE() only after checking
the value, to avoid calling this method for each completion.

A next patch set will use NAPI to handle events, then we will have a
separate poll method for Rx and Tx. This change is a preparation for
NAPI usage.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/50fbc366f8de54cb5dc72a7c4f394333ef71f1d0.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Make style change in mlxsw_pci_cq_tasklet()

This function will be broken into several functions later. As preparation,
reorder variables to reverse xmas tree.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/7170a8f4429ecb5a539b0374c621697778ff8363.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Remove unused wait queue

The previous patch changed the code to do not handle command interface
from event queue. With this change the wait queue is not used anymore.
Remove it and 'wait_done' variable.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/f3af6a5a9dabd97d2920cefe475c6aa57767f504.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Use only one event queue

The device supports two event queues. EQ0 is used for command interface
completion events. EQ1 is used for completion events of RDQ or SDQ.

Currently, for each EQE (event queue element), we check the queue number
and handle accordingly. More than that, for each interrupt we schedule
tasklets for both EQs. This is really ineffective, especially because of
the fact that EQ0 is used only as part of driver init/fini, when EMADs are
not available. There is no point to schedule the tasklet for it and check
each EQE.

A previous patch changed the code to poll command interface for each use of
it. It means that now there is no real reason to use EQ0, as we poll the
command interface.

Initialize only one event queue and use it as EQ1 (this is determined by
queue number). Then, for each interrupt we can schedule the tasklet only
for one queue and we do not have to check the queue number. This
simplifies the code and should improve performance. Note that polling
command interface is ok as we use it only as part of driver init/fini.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/23d764f5c032e4c363b98590b746a4b32d2bf900.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Rename MLXSW_PCI_EQS_COUNT

Currently we use MLXSW_PCI_EQS_COUNT event queues. A next patch will
change the driver to initialize only EQ1, as EQ0 is not required anymore
when we poll command interface.

Rename the macro to MLXSW_PCI_EQS_MAX as later we will not initialize
the maximum supported EQs, this value represents the maximum and a new
macro will be added to represent the actual used queues.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/b08df430b62f23ca1aa3aaa257896d2d95aa7691.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Poll command interface for each cmd_exec()

Command interface is used for configuring and querying FW when EMADs are
not available. During the time that the driver sets up the asynchronous
queues, it polls the command interface for getting completions. Then,
there is a short period when asynchronous queues work, but EMADs are not
available (marked in the code as nopoll = true). During this time, we
send commands via command interface, but we do not poll it, as we can get
an interrupt for the completion. Completions of command interface are
received from HW in EQ0 (event queue 0).

The usage of EQ0 instead of polling is done only 4 times during
initialization and one time during tear down, but it makes an overhead
during lifetime of the driver. For each interrupt, we have to check if
we get events in EQ0 or EQ1 and handle them. This is really ineffective,
especially because of the fact that EQ0 is used only as part of driver
init/fini.

Instead, we can poll command interface for each call of cmd_exec(). It
means that when we send a command via command interface (as EMADs are
not available), we will poll it, regardless of availability of the
asynchronous queues. This will allow us to configure later only EQ1 and
simplify the flow.

Remove 'nopoll' indication and change mlxsw_pci_cmd_exec() to poll till
answer/timeout regardless of queues' state. For now, completions are
handled also by EQ0, but it will be removed in next patch. Additional
cleanups will be added in next patches.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/e674c70380ceda953e0e45a77334c5d22e69938f.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Make style changes in mlxsw_pci_eq_tasklet()

This function will be used later only for EQ1. As preparation, reorder
variables to reverse xmas tree and return earlier when it is possible, to
simplify the code.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/2412d6c135b2a6aedb4484f5d8baab3aecd7b9ae.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Remove unused counters

The structure 'mlxsw_pci_queue' stores several counters which were consumed
via debugfs. Since commit 9a32562becd9 ("mlxsw: Remove debugfs interface"),
these counters are not used. Remove them. This makes the 'union u' and
'struct eq' redundant. Maintain 'struct cq' as it will be extended later.

Replace increasing 'q->u.eq.ev_other_count' with WARN_ON_ONCE(), as it is
used in an unreasonable case of receiving event in EQ which is not EQ0 or
EQ1. When the queues are initialized, we check number of event queues and
fail with the print "Unsupported number of queues" in case that the driver
tries to initialize more than two queues.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/ee9e658800aa0390e08342100bc27daff4c176c0.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Arm CQ doorbell regardless of number of completions

Currently, as part of mlxsw_pci_cq_tasklet(), we check if any item
was handled, and only in such case we arm doorbell. This is unlikely case,
as we schedule tasklet only for CQs that we get an event for them, which
means that they contain completions to handle. Remove this check, which
is supposed to be true always, and even if it is false, it is not a mistake
to ring the doorbell. We can warn on such case, but it is not really worth
to add a check which will be run for each CQ handling when we do not expect
to reach it and it does not point to logic error that should be handled.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/f8efa481bfe7bebb9f93bb803f44ab7da77f53e6.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Do not setup tasklet from operation

Currently, the structure 'mlxsw_pci_queue_ops' holds a pointer to the
callback function of tasklet. This is used only for EQ and CQ. mlxsw
driver will use NAPI in a following patch set, so CQ will not use tasklet
anymore. As preparation, remove this pointer from the shared operation
structure and setup the tasklet as part of queue initialization.
For now, setup tasklet for EQ and CQ. Later, CQ code will be changed.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/a326cae5fc1ad085a1a063c004983de6fe389414.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Move mlxsw_pci_cq_{init, fini}()

Move mlxsw_pci_cq_{init, fini}() after mlxsw_pci_cq_tasklet() as a next
patch will setup the tasklet as part of initialization.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/25196cb5baf5acf6ec1e956203790e018ba8e306.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mlxsw: pci: Move mlxsw_pci_eq_{init, fini}()

Move mlxsw_pci_eq_{init, fini}() after mlxsw_pci_eq_tasklet() as a next
patch will setup the tasklet as part of initialization.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/7ae120a02e1c490084daae7e684a0d40b7cce4e7.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'mlx5-misc-patches'

Tariq Toukan says:

====================
mlx5 misc patches

This patchset includes small features and misc code enhancements for the
mlx5 core and EN drivers.

Patches 1-4 by Gal improves the mlx5e ethtool stats implementation, for
example by using standard helpers ethtool_sprintf/puts.

Patch 5 by me adds a reset option for the FW command interface debugfs
stats entries. This allows explicit FW command interface stats reset
between different runs of a test case.

Patches 6 and 7 are simple cleanups.

Patch 8 by Gal adds driver support for 800Gbps link modes.

Patch 9 by Jianbo enhances the L4 steering abilities.

Patches 10-11 by Jianbo save redundant operations.
====================

Link: https://lore.kernel.org/r/20240402133043.56322-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Don't call give_pages() if request 0 page

Firmware will return 0 on query BOOT/INIT PAGES for non-page supplier
functions (external host PF/VF/SF), so no page is needed to be
allocated for them.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-12-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Skip pages EQ creation for non-page supplier function

Page events are not issued by device on the function if
page_request_disable is set, so no need to create pages EQ.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Support matching on l4_type for ttc_table

Replace matching on TCP and UDP protocols with new l4_type field which
is parsed by steering for ttc_table. It is enabled by the
outer_l4_type or inner_l4_type bits in nic_rx or port_sel flow table
capabilities and used only if pcc_ifa2 bit in HCA capabilities is set.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-10-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Add support for 800Gbps link modes

Add support for 800Gbps speed, link modes of 100Gbps per lane.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Convert uintX_t to uX

In the kernel, the preferred types are uX.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: XDP, Fix an inconsistent comment

Starting from commit
eb9b9fdcafe2 ("net/mlx5e: Introduce extended version for mlx5e_xmit_data")
sinfo is no longer passed as an argument to
mlx5e_xmit_xdp_frame(), the comment is inconsistent.

check_result must be zero when the packet is fragmented.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: debugfs, Add reset option for command interface stats

Resetting stats just before some test/debug case allows us to eliminate
out the impact of previous commands. Useful in particular for the
average latency calculation.

The average_write() callback was unreachable, as "average" is a
read-only file. Extend, rename, and use it for a newly exposed
write-only "reset" file.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Make stats group fill_stats callbacks consistent with the API

The fill_strings() callbacks were changed to accept a **data pointer,
and not rely on propagating the index value.
Make a similar change to fill_stats() callbacks to keep the API
consistent.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Use ethtool_sprintf/puts() to fill stats strings

Use ethtool_sprintf/puts() helper functions which handle the common
pattern of printing a string into the ethtool strings interface and
incrementing the string pointer by ETH_GSTRING_LEN.

Change the fill_strings callback to accept a **data pointer, and remove
the index and return value.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Use ethtool_sprintf/puts() to fill selftests strings

Use ethtool_sprintf/puts() helper functions which handle the common
pattern of printing a string into the ethtool strings interface and
incrementing the string pointer by ETH_GSTRING_LEN.

The int return value in mlx5e_self_test_fill_strings() is not removed as
it is still used to return the number of selftests.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5e: Use ethtool_sprintf/puts() to fill priv flags strings

Use ethtool_sprintf/puts() helper functions which handle the common
pattern of printing a string into the ethtool strings interface and
incrementing the string pointer by ETH_GSTRING_LEN.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'wireless-next-2024-04-03' of git://git./linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.10

The first "new features" pull request for v6.10 with changes both in
stack and in drivers. The big thing in this pull request is that
wireless subsystem is now almost free of sparse warnings. There's only
one warning left in ath11k which was introduced in v6.9-rc1 and will
be fixed via the wireless tree.

Realtek drivers continue to improve, now we have support for RTL8922AE
and RTL8723CS devices. ath11k also has long waited support for P2P.

This time we have a small conflict in iwlwifi, Stephen has an example
merge resolution which should help with fixing the conflict:

https://lore.kernel.org/all/20240326100945.765b8caf@canb.auug.org.au/

Major changes:

rtw89
* RTL8922AE Wi-Fi 7 PCI device support

rtw88
* RTL8723CS SDIO device support

iwlwifi
* don't support puncturing in 5 GHz
* support monitor mode on passive channels
* BZ-W device support
* P2P with HE/EHT support

ath11k
* P2P support for QCA6390, WCN6855 and QCA2066

* tag 'wireless-next-2024-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (122 commits)
  wifi: mt76: mt7915: workaround dubious x | !y warning
  wifi: mwl8k: Avoid -Wflex-array-member-not-at-end warnings
  wifi: ti: Avoid a hundred -Wflex-array-member-not-at-end warnings
  wifi: iwlwifi: mvm: fix check in iwl_mvm_sta_fw_id_mask
  net: rfkill: gpio: Convert to platform remove callback returning void
  wifi: mac80211: use kvcalloc() for codel vars
  wifi: iwlwifi: reconfigure TLC during HW restart
  wifi: iwlwifi: mvm: don't change BA sessions during restart
  wifi: iwlwifi: mvm: select STA mask only for active links
  wifi: iwlwifi: mvm: set wider BW OFDMA ignore correctly
  wifi: iwlwifi: Add support for LARI_CONFIG_CHANGE_CMD cmd v9
  wifi: iwlwifi: mvm: Declare HE/EHT capabilities support for P2P interfaces
  wifi: iwlwifi: mvm: Remove outdated comment
  wifi: iwlwifi: add support for BZ_W
  wifi: iwlwifi: Print a specific device name.
  wifi: iwlwifi: remove wrong CRF_IDs
  wifi: iwlwifi: remove devices that never came out
  wifi: iwlwifi: mvm: mark EMLSR disabled in cleanup iterator
  wifi: iwlwifi: mvm: fix active link counting during recovery
  wifi: iwlwifi: mvm: assign link STA ID lookups during restart
  ...
====================

Link: https://lore.kernel.org/r/20240403093625.CF515C433C7@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: ethtool.py: Make tool invokable from any CWD

ethtool.py depends on yml files in a specific location of the linux kernel
tree. Using relative lookup for those files means that ethtool.py would
need to be run under tools/net/ynl/. Lookup needed yml files without
depending on the current working directory that ethtool.py is invoked from.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://lore.kernel.org/r/20240402204000.115081-1-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: marvell: implement cable-test for 88E308X/88E609X family

This commit implements VCT in 88E308X/88E609X Family.

It require two workarounds with some magic configuration.
Regular use require only one register configuration. But Open Circuit
require second workaround.
It cause implementation two phases for fault length measuring.

Fast Ethernet PHY have implemented very simple version of VCT. It's
complitley different than vct5 or vct7.

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240402201123.2961909-3-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ethtool: Add impedance mismatch result code to cable test

Some PHYs can recognize during a cable test if the impedance in the cable
is okay. They can detect reflections caused by impedance discontinuity
between a regular 100 Ohm cable and an abnormal part with a higher or
lower impedance.

This commit introduces a new result code:
ETHTOOL_A_CABLE_RESULT_CODE_IMPEDANCE_MISMATCH,
which represents the results of a cable test indicating issues with
impedance integrity.

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240402201123.2961909-2-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: marvell: add basic support of 88E308X/88E609X family

This patch implements only basic support.

It covers PHY used in multiple IC:
PHY: 88E3082, 88E3083
Switch: 88E6096, 88E6097

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20240402201123.2961909-1-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: fman: Remove some unused fields in some structure

In "struct muram_info", the 'size' field is unused.
In "struct memac_cfg", the 'fixed_link' field is unused.

Remove them.

Found with cppcheck, unusedStructMember.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/425222d4f6c584e8316ccb7b2ef415a85c96e455.1712084103.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'af_unix-remove-old-gc-leftovers'

Kuniyuki Iwashima says:

====================
af_unix: Remove old GC leftovers.

This is a follow-up series for commit 4090fa373f0e ("af_unix: Replace
garbage collection algorithm.") which introduced the new GC for AF_UNIX.

Now we no longer need two ugly tricks for the old GC, let's remove them.
====================

Link: https://lore.kernel.org/r/20240401173125.92184-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

af_unix: Remove lock dance in unix_peek_fds().

In the previous GC implementation, the shape of the inflight socket
graph was not expected to change while GC was in progress.

MSG_PEEK was tricky because it could install inflight fd silently
and transform the graph.

Let's say we peeked a fd, which was a listening socket, and accept()ed
some embryo sockets from it. The garbage collection algorithm would
have been confused because the set of sockets visited in scan_inflight()
would change within the same GC invocation.

That's why we placed spin_lock(&unix_gc_lock) and spin_unlock() in
unix_peek_fds() with a fat comment.

In the new GC implementation, we no longer garbage-collect the socket
if it exists in another queue, that is, if it has a bridge to another
SCC. Also, accept() will require the lock if it has edges.

Thus, we need not do the complicated lock dance.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240401173125.92184-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

af_unix: Remove scm_fp_dup() in unix_attach_fds().

When we passed fds, we used to bump each file's refcount twice
in scm_fp_copy() and scm_fp_dup() before linking the socket to
gc_inflight_list.

This is because we incremented the inflight count of the socket
and linked it to the list in advance before passing skb to the
destination socket.

Otherwise, the inflight socket could have been garbage-collected
in a small race window between linking the socket to the list and
queuing skb:

  CPU 1 : sendmsg(X) w/ A's fd     CPU 2 : close(A)
  -----                            -----
  /* Here A's refcount is 1, and inflight count is 0 */

  bump A's refcount to 2 in scm_fp_copy()
  bump A's inflight count to 1
  link A to gc_inflight_list
                                   decrement A's refcount to 1

  /* A's refcount == inflight count, thus A could be GC candidate */

                                   start GC
                                   mark A as candidate
                                   purge A's receive queue

  queue skb w/ A's fd to X

  /* A is queued, but all data has been lost */

After commit 4090fa373f0e ("af_unix: Replace garbage collection
algorithm."), we increment the inflight count and link the socket
to the global list only when queuing the skb.

The race no longer exists, so let's not clone the fd nor bump
the count in unix_attach_fds().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240401173125.92184-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'tcp-make-trace-of-reset-logic-complete'

Jason Xing says:

====================
tcp: make trace of reset logic complete

Before this, we miss some cases where the TCP layer could send RST but
we cannot trace it. So I decided to complete it :)

Link: https://lore.kernel.org/all/20240329034243.7929-1-kerneljasonxing@gmail.com/
====================

Link: https://lore.kernel.org/r/20240401073605.37335-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

trace: tcp: fully support trace_tcp_send_reset

Prior to this patch, what we can see by enabling trace_tcp_send is
only happening under two circumstances:
1) active rst mode
2) non-active rst mode and based on the full socket

That means the inconsistency occurs if we use tcpdump and trace
simultaneously to see how rst happens.

It's necessary that we should take into other cases into considerations,
say:
1) time-wait socket
2) no socket
...

By parsing the incoming skb and reversing its 4-tuple can
we know the exact 'flow' which might not exist.

Samples after applied this patch:
1. tcp_send_reset: skbaddr=XXX skaddr=XXX src=ip:port dest=ip:port
state=TCP_ESTABLISHED
2. tcp_send_reset: skbaddr=000...000 skaddr=XXX src=ip:port dest=ip:port
state=UNKNOWN
Note:
1) UNKNOWN means we cannot extract the right information from skb.
2) skbaddr/skaddr could be 0

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Link: https://lore.kernel.org/r/20240401073605.37335-3-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

trace: adjust TP_STORE_ADDR_PORTS_SKB() parameters

Introducing entry_saddr and entry_daddr parameters in this macro
for later use can help us record the reverse 4-tuple by analyzing
the 4-tuple of the incoming skb when receiving.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240401073605.37335-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: enable timestamp static key if CPU

For systems that use CPU isolation (via nohz_full), creating or destroying
a socket with SO_TIMESTAMP, SO_TIMESTAMPNS or SO_TIMESTAMPING with flag
SOF_TIMESTAMPING_RX_SOFTWARE will cause a static key to be enabled/disabled.
This in turn causes undesired IPIs to isolated CPUs.

So enable the static key unconditionally, if CPU isolation is enabled,
thus avoiding the IPIs.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/ZgrUiLLtbEUf9SFn@tpad
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'gve-ring-size-changes'

Harshitha Ramamurthy says:

====================
gve: enable ring size changes

This series enables support to change ring size via ethtool
in gve.

The first three patches deal with some clean up, setting
default values for the ring sizes and related fields. The
last two patches enable ring size changes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

gve: add support to change ring size via ethtool

Allow the user to change ring size via ethtool if
supported by the device. The driver relies on the
ring size ranges queried from device to validate
ring sizes requested by the user.

Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gve: add support to read ring size ranges from the device

Add support to read ring size change capability and the
min and max descriptor counts from the device and store it
in the driver. Also accommodate a special case where the
device does not provide minimum ring size depending on the
version of the device. In that case, rely on default values
for the minimums.

Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gve: set page count for RX QPL for GQI and DQO queue formats

Fulfill the requirement that for GQI, the number of pages per
RX QPL is equal to the ring size. Set this value to be equal to
ring size. Because of this change, the rx_data_slot_cnt and
rx_pages_per_qpl fields stored in the priv structure are not
needed, so remove their usage. And for DQO, the number of pages
per RX QPL is more than ring size to account for out-of-order
completions. So set it to two times of rx ring size.

Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gve: make the completion and buffer ring size equal for DQO

For the DQO queue format, the gve driver stores two ring sizes
for both TX and RX - one for completion queue ring and one for
data buffer ring. This is supposed to enable asymmetric sizes
for these two rings but that is not supported. Make both fields
reference the same single variable.

This change renders reading supported TX completion ring size
and RX buffer ring size for DQO from the device useless, so change
those fields to reserved and remove related code.

Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gve: simplify setting decriptor count defaults

Combine the gve_set_desc_cnt and gve_set_desc_cnt_dqo into
one function which sets the counts after checking the queue
format. Both the functions in the previous code and the new
combined function never return an error so make the new
function void and remove the goto on error.

Also rename the new function to gve_set_default_desc_cnt to
be clearer about its intention.

Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-pf: Reset MAC stats during probe

Reset CGX/RPM MAC HW statistics at the time of driver probe()

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sai Krishna <saikrishnag@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: Avoid -Wflex-array-member-not-at-end warnings

-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting
ready to enable it globally.

There are currently a couple of objects in `struct smc_clc_msg_proposal_area`
that contain a couple of flexible structures:

struct smc_clc_msg_proposal_area {
...
struct smc_clc_v2_extension             pclc_v2_ext;
...
struct smc_clc_smcd_v2_extension        pclc_smcd_v2_ext;
...
};

So, in order to avoid ending up with a couple of flexible-array members
in the middle of a struct, we use the `struct_group_tagged()` helper to
separate the flexible array from the rest of the members in the flexible
structure:

struct smc_clc_smcd_v2_extension {
        struct_group_tagged(smc_clc_smcd_v2_extension_fixed, fixed,
                            u8 system_eid[SMC_MAX_EID_LEN];
                            u8 reserved[16];
        );
        struct smc_clc_smcd_gid_chid gidchid[];
};

With the change described above, we now declare objects of the type of
the tagged struct without embedding flexible arrays in the middle of
another struct:

struct smc_clc_msg_proposal_area {
        ...
        struct smc_clc_v2_extension_fixed pclc_v2_ext;
        ...
        struct smc_clc_smcd_v2_extension_fixed pclc_smcd_v2_ext;
        ...
};

We also use `container_of()` when we need to retrieve a pointer to the
flexible structures.

So, with these changes, fix the following warnings:

In file included from net/smc/af_smc.c:42:
net/smc/smc_clc.h:186:49: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
  186 |         struct smc_clc_v2_extension             pclc_v2_ext;
      |                                                 ^~~~~~~~~~~
net/smc/smc_clc.h:188:49: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
  188 |         struct smc_clc_smcd_v2_extension        pclc_smcd_v2_ext;
      |                                                 ^~~~~~~~~~~~~~~~

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdevice: add DEFINE_FREE() for dev_put

For short netdev holds within a function there are still a lot of
users of dev_put() rather than netdev_put(). Add DEFINE_FREE() to
allow making those safer.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

rtnetlink: add guard for RTNL

The new guard/scoped_gard can be useful for the RTNL as well,
so add a guard definition for it. It gets used like

{
   guard(rtnl)();
   // RTNL held until end of block
}

or

  scoped_guard(rtnl) {
    // RTNL held in this block
  }

as with any other guard/scoped_guard.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-04-01 (ice)

This series contains updates to ice driver only.

Michal Schmidt changes flow for gettimex64 to use host-side spinlock
rather than hardware semaphore for lighter-weight locking.

Steven adds ability for switch recipes to be re-used when firmware
supports it.

Thorsten Blum removes unwanted newlines in netlink messaging.

Michal Swiatkowski and Piotr re-organize devlink related code; renaming,
moving, and consolidating it to a single location. Michal also
simplifies the devlink init and cleanup path to occur under a single
lock call.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: hold devlink lock for whole init/cleanup
  ice: move devlink port code to a separate file
  ice: move ice_devlink.[ch] to devlink folder
  ice: Remove newlines in NL_SET_ERR_MSG_MOD
  ice: Add switch recipe reusing feature
  ice: fold ice_ptp_read_time into ice_ptp_gettimex64
  ice: avoid the PTP hardware semaphore in gettimex64 path
  ice: add ice_adapter for shared data across PFs on the same NIC
====================

Link: https://lore.kernel.org/r/20240401172421.1401696-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dt-bindings: net: snps,dwmac: Align 'snps,priority' type definition

'snps,priority' is also defined in dma/snps,dw-axi-dmac.yaml as a
uint32-array. It's preferred to have a single type for a given property
name, so update the type in snps,dwmac schema to match.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240401204422.1692359-2-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'doc-netlink-add-a-yaml-spec-for-team'

Hangbin Liu says:

====================
doc/netlink: add a YAML spec for team

Add a YAML spec for team. As we need to link two objects together to form
the team module, rename team to team_core for linking.
====================

Link: https://lore.kernel.org/r/20240401031004.1159713-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

uapi: team: use header file generated from YAML spec

generated with:

$ ./tools/net/ynl/ynl-gen-c.py --mode uapi \
> --spec Documentation/netlink/specs/team.yaml \
> --header -o include/uapi/linux/if_team.h

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240401031004.1159713-5-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: team: use policy generated by YAML spec

generated with:

$ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
> --spec Documentation/netlink/specs/team.yaml --source \
> -o drivers/net/team/team_nl.c
$ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
> --spec Documentation/netlink/specs/team.yaml --header \
> -o drivers/net/team/team_nl.h

The TEAM_ATTR_LIST_PORT in team_nl_policy is removed as it is only in the
port list reply attributes.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240401031004.1159713-4-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: team: rename team to team_core for linking

Similar with commit 08d323234d10 ("net: fou: rename the source for linking"),
We'll need to link two objects together to form the team module.
This means the source can't be called team, the build system expects
team.o to be the combined object.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240401031004.1159713-3-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Documentation: netlink: add a YAML spec for team

Add a YAML specification for team.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240401031004.1159713-2-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tcp/dccp: complete lockless accesses to sk->sk_max_ack_backlog

Since commit 099ecf59f05b ("net: annotate lockless accesses to
sk->sk_max_ack_backlog") decided to handle the sk_max_ack_backlog
locklessly, there is one more function mostly called in TCP/DCCP
cases. So this patch completes it:)

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240331090521.71965-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

caif: Use UTILITY_NAME_LENGTH instead of hard-coding 16

UTILITY_NAME_LENGTH is 16. So better use the former when defining the
'utility_name' array. This makes the intent clearer when it is used around
line 260.

While at it, declare variable in reverse xmas tree style.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/8c1160501f69b64bb2d45ce9f26f746eec80ac77.1711787352.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'avoid-explicit-cpumask-var-allocation-on-stack'

Dawei Li says:

====================
Avoid explicit cpumask var allocation on stack

v1: https://lore.kernel.org/lkml/20240329105610.922675-1-dawei.li@shingroup.cn/
====================

Link: https://lore.kernel.org/r/20240331053441.1276826-1-dawei.li@shingroup.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/dpaa2: Avoid explicit cpumask var allocation on stack

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

Use *cpumask_var API(s) to address it.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Link: https://lore.kernel.org/r/20240331053441.1276826-3-dawei.li@shingroup.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/iucv: Avoid explicit cpumask var allocation on stack

For CONFIG_CPUMASK_OFFSTACK=y kernel, explicit allocation of cpumask
variable on stack is not recommended since it can cause potential stack
overflow.

Instead, kernel code should always use *cpumask_var API(s) to allocate
cpumask var in config-neutral way, leaving allocation strategy to
CONFIG_CPUMASK_OFFSTACK.

Use *cpumask_var API(s) to address it.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://lore.kernel.org/r/20240331053441.1276826-2-dawei.li@shingroup.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: sja1105: drop driver owner assignment

Core in spi_register_driver() already sets the .owner, so driver
does not need to.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240330211023.100924-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: microchip: drop driver owner assignment

Core in spi_register_driver() already sets the .owner, so driver
does not need to.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20240330211023.100924-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

dt-bindings: net: renesas,ethertsn: Create child-node for MDIO bus

The bindings for Renesas Ethernet TSN was just merged in v6.9 and the
design for the bindings followed that of other Renesas Ethernet drivers
and thus did not force a child-node for the MDIO bus. As there
are no upstream drivers or users of this binding yet take the
opportunity to correct this and force the usage of a child-node for the
MDIO bus.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20240330131228.1541227-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'page_pool-allow-direct-bulk-recycling'

Alexander Lobakin says:

====================
page_pool: allow direct bulk recycling

Previously, there was no reliable way to check whether it's safe to use
direct PP cache. The drivers were passing @allow_direct to the PP
recycling functions and that was it. Bulk recycling is used by
xdp_return_frame_bulk() on .ndo_xdp_xmit() frames completion where
the page origin is unknown, thus the direct recycling has never been
tried.
Now that we have at least 2 ways of checking if we're allowed to perform
direct recycling -- pool->p.napi (Jakub) and pool->cpuid (Lorenzo), we
can use them when doing bulk recycling as well. Just move that logic
from the skb core to the PP core and call it before
__page_pool_put_page() every time @allow_direct is false.
Under high .ndo_xdp_xmit() traffic load, the win is 2-3% Pps assuming
the sending driver uses xdp_return_frame_bulk() on Tx completion.
====================

Link: https://lore.kernel.org/r/20240329165507.3240110-1-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

page_pool: try direct bulk recycling

Now that the checks for direct recycling possibility live inside the
Page Pool core, reuse them when performing bulk recycling.
page_pool_put_page_bulk() can be called from process context as well,
page_pool_napi_local() takes care of this at the very beginning.
Under high .ndo_xdp_xmit() traffic load, the win is 2-3% Pps assuming
the sending driver uses xdp_return_frame_bulk() on Tx completion.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20240329165507.3240110-3-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

page_pool: check for PP direct cache locality later

Since we have pool->p.napi (Jakub) and pool->cpuid (Lorenzo) to check
whether it's safe to use direct recycling, we can use both globally for
each page instead of relying solely on @allow_direct argument.
Let's assume that @allow_direct means "I'm sure it's local, don't waste
time rechecking this" and when it's false, try the mentioned params to
still recycle the page directly. If neither is true, we'll lose some
CPU cycles, but then it surely won't be hotpath. On the other hand,
paths where it's possible to use direct cache, but not possible to
safely set @allow_direct, will benefit from this move.
The whole propagation of @napi_safe through a dozen of skb freeing
functions can now go away, which saves us some stack space.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20240329165507.3240110-2-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

rhashtable: Improve grammar

Change "a" to "an" according to the usual rules, fix an "if" that
was mistyped as "in", improve grammar in "considerable slow" ->
"considerably slower".

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20240329-misc-rhashtable-v1-1-5862383ff798@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: add ynl_dump_empty() helper

Checking if dump is empty requires a couple of casts.
Add a convenient wrapper.

Add an example use in the netdev sample, loopback is always
present so an empty dump is an error.

Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20240329181651.319326-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfp: Avoid -Wflex-array-member-not-at-end warnings

-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting
ready to enable it globally.

There is currently an object (`tl`), at the beginning of multiple
structures, that contains a flexible structure (`struct nfp_dump_tl`),
for example:

struct nfp_dumpspec_csr {
        struct nfp_dump_tl tl;

        ...

        __be32 register_width;  /* in bits */
};

So, in order to avoid ending up with flexible-array members in the
middle of multiple other structs, we use the `struct_group_tagged()`
helper to separate the flexible array from the rest of the members
in the flexible structure:

struct nfp_dump_tl {
struct_group_tagged(nfp_dump_tl_hdr, hdr,

... the rest of members

);
        char data[];
};

With the change described above, we now declare objects of the type of
the tagged struct, in this case `struct nfp_dump_tl_hdr`, without
embedding flexible arrays in the middle of another struct:

struct nfp_dumpspec_csr {
        struct nfp_dump_tl_hdr tl;

...

        __be32 register_width;  /* in bits */
};

Also, use `container_of()` whenever we need to retrieve a pointer to
the flexible structure, through which we can access the flexible
array if needed.

So, with these changes, fix 33 of the following warnings:
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:58:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:64:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:70:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:78:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:87:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c:92:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Link: https://github.com/KSPP/linux/issues/202
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/ZgYWlkxdrrieDYIu@neat
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: aquantia: add support for AQR114C PHY ID

Add support for AQR114C PHY ID. This PHY advertise 10G speed:
SPEED(0x04): 0x6031
  capabilities: -400g +5g +2.5g -200g -25g -10g-xr -100g -40g -10g/1g -10
                +100 +1000 -10-ts -2-tl +10g
EXTABLE(0x0B): 0x40fc
  capabilities: -10g-cx4 -10g-lrm +10g-t +10g-kx4 +10g-kr +1000-t +1000-kx
                +100-tx -10-t -p2mp -40g/100g -1000/100-t1 -25g -200g/400g
                +2.5g/5g -1000-h

but supports only up to 5G speed (as with AQR111/111B0).
AQR111 init config is used to set max speed 5G.

Signed-off-by: Paweł Owoc <frut3k7@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240401145114.1699451-1-frut3k7@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'ath-next-20240402' of git://git./linux/kernel/git/kvalo/ath

ath.git patches for v6.10

ath drivers now have no remaining sparse warnings, otherwise smaller
fixes and some refactoring.

ath11k

* P2P support for QCA6390, WCN6855 and QCA2066

ipv6: remove RTNL protection from inet6_dump_fib()

No longer hold RTNL while calling inet6_dump_fib().

Also change return value for a completed dump,
so that NLMSG_DONE can be appended to current skb,
saving one recvmsg() system call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240329183053.644630-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'genetlink-remove-linux-genetlink-h'

Jakub Kicinski says:

====================
genetlink: remove linux/genetlink.h

There are two genetlink headers net/genetlink.h and linux/genetlink.h
This is similar to netlink.h, but for netlink.h both contain good
amount of code. For genetlink.h the linux/ version is leftover
from before uAPI headers were split out, it has 10 lines of code.
Move those 10 lines into other appropriate headers and delete
linux/genetlink.h.

I occasionally open the wrong header in the editor when coding,
I guess I'm not the only one.

v2: https://lore.kernel.org/all/20240325173716.2390605-1-kuba@kernel.org/
v1: https://lore.kernel.org/all/20240309183458.3014713-1-kuba@kernel.org
====================

Link: https://lore.kernel.org/r/20240329175710.291749-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

genetlink: remove linux/genetlink.h

genetlink.h is a shell of what used to be a combined uAPI
and kernel header over a decade ago. It has fewer than
10 lines of code. Merge it into net/genetlink.h.
In some ways it'd be better to keep the combined header
under linux/ but it would make looking through git history
harder.

Acked-by: Sven Eckelmann <sven@narfation.org>
Link: https://lore.kernel.org/r/20240329175710.291749-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: openvswitch: remove unnecessary linux/genetlink.h include

The only legit reason I could think of for net/genetlink.h
and linux/genetlink.h to be separate would be if one was
included by other headers and we wanted to keep it lightweight.
That is not the case, net/openvswitch/meter.h includes
linux/genetlink.h but for no apparent reason (for struct genl_family
perhaps? it's not necessary, types of externs do not need
to be known).

Link: https://lore.kernel.org/r/20240329175710.291749-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

netlink: create a new header for internal genetlink symbols

There are things in linux/genetlink.h which are only used
under net/netlink/. Move them to a new local header.
A new header with just 2 externs isn't great, but alternative
would be to include af_netlink.h in genetlink.c which feels
even worse.

Link: https://lore.kernel.org/r/20240329175710.291749-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '1GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-03-29 (net: intel)

This series contains updates to most Intel drivers.

Jesse moves declaration of pci_driver struct to remove need for forward
declarations in igb and converts Intel drivers to user newer power
management ops.

Sasha reworks power management flow on igc to avoid using rtnl_lock()
during those flows.

Maciej reorganizes i40e_nvm file to avoid forward declarations.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  i40e: avoid forward declarations in i40e_nvm.c
  igc: Refactor runtime power management flow
  net: intel: implement modern PM ops declarations
  igb: simplify pci ops declaration
====================

Link: https://lore.kernel.org/r/20240329175632.211340-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tcp/dccp: do not care about families in inet_twsk_purge()

We lost ability to unload ipv6 module a long time ago.

Instead of calling expensive inet_twsk_purge() twice,
we can handle all families in one round.

Also remove an extra line added in my prior patch,
per Kuniyuki Iwashima feedback.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/netdev/20240327192934.6843-1-kuniyu@amazon.com/
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240329153203.345203-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

inet: preserve const qualifier in inet_csk()

We can change inet_csk() to propagate its argument const qualifier,
thanks to container_of_const().

We have to fix few places that had mistakes, like tcp_bound_rto().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240329144931.295800-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>