linux-2.6-block.git
10 months agodt-bindings: net: dsa: microchip: add wakeup-source property
Oleksij Rempel [Mon, 23 Oct 2023 09:33:36 +0000 (11:33 +0200)]
dt-bindings: net: dsa: microchip: add wakeup-source property

Add wakeup-source property to enable Wake on Lan functionality in the
switch.

Since PME wake pin is not always attached to the SoC, use wakeup-source
instead of wakeup-gpios

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: dsa: microchip: Add missing MAC address register offset for ksz8863
Oleksij Rempel [Mon, 23 Oct 2023 09:33:35 +0000 (11:33 +0200)]
net: dsa: microchip: Add missing MAC address register offset for ksz8863

Add the missing offset for the global MAC address register
(REG_SW_MAC_ADDR) for the ksz8863 family of switches.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agos390/qeth: replace deprecated strncpy with strscpy
Justin Stitt [Mon, 23 Oct 2023 19:39:39 +0000 (19:39 +0000)]
s390/qeth: replace deprecated strncpy with strscpy

strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

We expect new_entry->dbf_name to be NUL-terminated based on its use with
strcmp():
|       if (strcmp(entry->dbf_name, name) == 0) {

Moreover, NUL-padding is not required as new_entry is kzalloc'd just
before this assignment:
|       new_entry = kzalloc(sizeof(struct qeth_dbf_entry), GFP_KERNEL);

... rendering any future NUL-byte assignments (like the ones strncpy()
does) redundant.

Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com>
Tested-by: Thorsten Winkler <twinkler@linux.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20231023-strncpy-drivers-s390-net-qeth_core_main-c-v1-1-e7ce65454446@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agos390/ctcm: replace deprecated strncpy with strscpy
Justin Stitt [Mon, 23 Oct 2023 19:35:07 +0000 (19:35 +0000)]
s390/ctcm: replace deprecated strncpy with strscpy

strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

We expect chid to be NUL-terminated based on its use with format
strings:

CTCM_DBF_TEXT_(SETUP, CTC_DBF_INFO, "%s(%s) %s", CTCM_FUNTAIL,
chid, ok ? "OK" : "failed");

Moreover, NUL-padding is not required as it is _only_ used in this one
instance with a format string.

Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

We can also drop the +1 from chid's declaration as we no longer need to
be cautious about leaving a spot for a NUL-byte. Let's use the more
idiomatic strscpy usage of (dest, src, sizeof(dest)) as this more
closely ties the destination buffer to the length.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com>
Tested-by: Thorsten Winkler <twinkler@linux.ibm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20231023-strncpy-drivers-s390-net-ctcm_main-c-v1-1-265db6e78165@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: ethernet: mtk_wed: remove wo pointer in wo_r32/wo_w32 signature
Lorenzo Bianconi [Mon, 23 Oct 2023 22:01:30 +0000 (00:01 +0200)]
net: ethernet: mtk_wed: remove wo pointer in wo_r32/wo_w32 signature

wo pointer is no longer used in wo_r32 and wo_w32 routines so get rid of
it.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/530537db0872f7523deff21f0a5dfdd9b75fdc9d.1698098459.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: ethernet: mtk_wed: fix firmware loading for MT7986 SoC
Lorenzo Bianconi [Mon, 23 Oct 2023 22:00:19 +0000 (00:00 +0200)]
net: ethernet: mtk_wed: fix firmware loading for MT7986 SoC

The WED mcu firmware does not contain all the memory regions defined in
the dts reserved_memory node (e.g. MT7986 WED firmware does not contain
cpu-boot region).
Reverse the mtk_wed_mcu_run_firmware() logic to check all the fw
sections are defined in the dts reserved_memory node.

Fixes: c6d961aeaa77 ("net: ethernet: mtk_wed: move mem_region array out of mtk_wed_mcu_load_firmware")
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/d983cbfe8ea562fef9264de8f0c501f7d5705bd5.1698098381.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge branch 'net-ethernet-renesas-infrastructure-preparations-for-upcoming-driver'
Jakub Kicinski [Tue, 24 Oct 2023 23:16:20 +0000 (16:16 -0700)]
Merge branch 'net-ethernet-renesas-infrastructure-preparations-for-upcoming-driver'

Wolfram Sang says:

====================
net: ethernet: renesas: infrastructure preparations for upcoming driver

Before we upstream a new driver, Niklas and I thought that a few
cleanups for Kconfig/Makefile will help readability and maintainability.
Here they are, looking forward to comments.
====================

Link: https://lore.kernel.org/r/20231022205316.3209-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: ethernet: renesas: drop SoC names in Kconfig
Wolfram Sang [Sun, 22 Oct 2023 20:53:16 +0000 (22:53 +0200)]
net: ethernet: renesas: drop SoC names in Kconfig

Mentioning SoCs in Kconfig descriptions tends to get stale (e.g. RAVB is
missing RZV2M) or imprecise (e.g. SH_ETH is not available on all
R8A779x). Drop them instead of providing vague information. Improve the
file description a tad while here.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/20231022205316.3209-3-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: ethernet: renesas: group entries in Makefile
Wolfram Sang [Sun, 22 Oct 2023 20:53:15 +0000 (22:53 +0200)]
net: ethernet: renesas: group entries in Makefile

A new Renesas driver shall be added soon. Prepare the Makefile by
grouping the specific objects to the Kconfig symbol for better
readability. Improve the file description a tad while here.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/20231022205316.3209-2-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoselftests: net: change ifconfig with ip command
Swarup Laxman Kotiaklapudi [Mon, 23 Oct 2023 12:34:22 +0000 (18:04 +0530)]
selftests: net: change ifconfig with ip command

Change ifconfig with ip command, on a system where ifconfig is
not used this script will not work correcly.

Test result with this patchset:

sudo make TARGETS="net" kselftest
....
TAP version 13
1..1
 timeout set to 1500
 selftests: net: route_localnet.sh
 run arp_announce test
 net.ipv4.conf.veth0.route_localnet = 1
 net.ipv4.conf.veth1.route_localnet = 1
 net.ipv4.conf.veth0.arp_announce = 2
 net.ipv4.conf.veth1.arp_announce = 2
 PING 127.25.3.14 (127.25.3.14) from 127.25.3.4 veth0: 56(84)
  bytes of data.
 64 bytes from 127.25.3.14: icmp_seq=1 ttl=64 time=0.038 ms
 64 bytes from 127.25.3.14: icmp_seq=2 ttl=64 time=0.068 ms
 64 bytes from 127.25.3.14: icmp_seq=3 ttl=64 time=0.068 ms
 64 bytes from 127.25.3.14: icmp_seq=4 ttl=64 time=0.068 ms
 64 bytes from 127.25.3.14: icmp_seq=5 ttl=64 time=0.068 ms

 --- 127.25.3.14 ping statistics ---
 5 packets transmitted, 5 received, 0% packet loss, time 4073ms
 rtt min/avg/max/mdev = 0.038/0.062/0.068/0.012 ms
 ok
 run arp_ignore test
 net.ipv4.conf.veth0.route_localnet = 1
 net.ipv4.conf.veth1.route_localnet = 1
 net.ipv4.conf.veth0.arp_ignore = 3
 net.ipv4.conf.veth1.arp_ignore = 3
 PING 127.25.3.14 (127.25.3.14) from 127.25.3.4 veth0: 56(84)
  bytes of data.
 64 bytes from 127.25.3.14: icmp_seq=1 ttl=64 time=0.032 ms
 64 bytes from 127.25.3.14: icmp_seq=2 ttl=64 time=0.065 ms
 64 bytes from 127.25.3.14: icmp_seq=3 ttl=64 time=0.066 ms
 64 bytes from 127.25.3.14: icmp_seq=4 ttl=64 time=0.065 ms
 64 bytes from 127.25.3.14: icmp_seq=5 ttl=64 time=0.065 ms

 --- 127.25.3.14 ping statistics ---
 5 packets transmitted, 5 received, 0% packet loss, time 4092ms
 rtt min/avg/max/mdev = 0.032/0.058/0.066/0.013 ms
 ok
ok 1 selftests: net: route_localnet.sh
...

Signed-off-by: Swarup Laxman Kotiaklapudi <swarupkotikalapudi@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20231023123422.2895-1-swarupkotikalapudi@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge branch 'switch-dsa-to-inclusive-terminology'
Jakub Kicinski [Tue, 24 Oct 2023 20:08:15 +0000 (13:08 -0700)]
Merge branch 'switch-dsa-to-inclusive-terminology'

Florian Fainelli says:

====================
Switch DSA to inclusive terminology

One of the action items following Netconf'23 is to switch subsystems to
use inclusive terminology. DSA has been making extensive use of the
"master" and "slave" words which are now replaced by "conduit" and
"user" respectively.
====================

Link: https://lore.kernel.org/r/20231023181729.1191071-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: dsa: Rename IFLA_DSA_MASTER to IFLA_DSA_CONDUIT
Florian Fainelli [Mon, 23 Oct 2023 18:17:29 +0000 (11:17 -0700)]
net: dsa: Rename IFLA_DSA_MASTER to IFLA_DSA_CONDUIT

This preserves the existing IFLA_DSA_MASTER which is part of the uAPI
and creates an alias named IFLA_DSA_CONDUIT.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20231023181729.1191071-3-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: dsa: Use conduit and user terms
Florian Fainelli [Mon, 23 Oct 2023 18:17:28 +0000 (11:17 -0700)]
net: dsa: Use conduit and user terms

Use more inclusive terms throughout the DSA subsystem by moving away
from "master" which is replaced by "conduit" and "slave" which is
replaced by "user". No functional changes.

Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20231023181729.1191071-2-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agotsnep: Fix tsnep_request_irq() format-overflow warning
Gerhard Engleder [Mon, 23 Oct 2023 18:38:56 +0000 (20:38 +0200)]
tsnep: Fix tsnep_request_irq() format-overflow warning

Compiler warns about a possible format-overflow in tsnep_request_irq():
drivers/net/ethernet/engleder/tsnep_main.c:884:55: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=]
                         sprintf(queue->name, "%s-rx-%d", name,
                                                       ^
drivers/net/ethernet/engleder/tsnep_main.c:881:55: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=]
                         sprintf(queue->name, "%s-tx-%d", name,
                                                       ^
drivers/net/ethernet/engleder/tsnep_main.c:878:49: warning: '-txrx-' directive writing 6 bytes into a region of size between 5 and 25 [-Wformat-overflow=]
                         sprintf(queue->name, "%s-txrx-%d", name,
                                                 ^~~~~~

Actually overflow cannot happen. Name is limited to IFNAMSIZ, because
netdev_name() is called during ndo_open(). queue_index is single char,
because less than 10 queues are supported.

Fix warning with snprintf(). Additionally increase buffer to 32 bytes,
because those 7 additional bytes were unused anyway.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310182028.vmDthIUa-lkp@intel.com/
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231023183856.58373-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge branch 'net-deduplicate-netdev-name-allocation'
Jakub Kicinski [Tue, 24 Oct 2023 20:03:00 +0000 (13:03 -0700)]
Merge branch 'net-deduplicate-netdev-name-allocation'

Jakub Kicinski says:

====================
net: deduplicate netdev name allocation

After recent fixes we have even more duplicated code in netdev name
allocation helpers. There are two complications in this code.
First, __dev_alloc_name() clobbers its output arg even if allocation
fails, forcing callers to do extra copies. Second as our experience in
commit 55a5ec9b7710 ("Revert "net: core: dev_get_valid_name is now the same as dev_alloc_name_ns"") and
commit 029b6d140550 ("Revert "net: core: maybe return -EEXIST in __dev_alloc_name"")
taught us, user space is very sensitive to the exact error codes.

Align the callers of __dev_alloc_name(), and remove some of its
complexity.

v1: https://lore.kernel.org/all/20231020011856.3244410-1-kuba@kernel.org/
====================

Link: https://lore.kernel.org/r/20231023152346.3639749-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: remove else after return in dev_prep_valid_name()
Jakub Kicinski [Mon, 23 Oct 2023 15:23:46 +0000 (08:23 -0700)]
net: remove else after return in dev_prep_valid_name()

Remove unnecessary else clauses after return.
I copied this if / else construct from somewhere,
it makes the code harder to read.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231023152346.3639749-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: remove dev_valid_name() check from __dev_alloc_name()
Jakub Kicinski [Mon, 23 Oct 2023 15:23:45 +0000 (08:23 -0700)]
net: remove dev_valid_name() check from __dev_alloc_name()

__dev_alloc_name() is only called by dev_prep_valid_name(),
which already checks that name is valid.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231023152346.3639749-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: trust the bitmap in __dev_alloc_name()
Jakub Kicinski [Mon, 23 Oct 2023 15:23:44 +0000 (08:23 -0700)]
net: trust the bitmap in __dev_alloc_name()

Prior to restructuring __dev_alloc_name() handled both printf
and non-printf names. In a clever attempt at code reuse it
always prints the name into a buffer and checks if it's
a duplicate.

Trust the bitmap, and return an error if its full.

This shrinks the possible ID space by one from 32K to 32K - 1,
as previously the max value would have been tried as a valid ID.
It seems very unlikely that anyone would care as we heard
no requests to increase the max beyond 32k.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231023152346.3639749-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: reduce indentation of __dev_alloc_name()
Jakub Kicinski [Mon, 23 Oct 2023 15:23:43 +0000 (08:23 -0700)]
net: reduce indentation of __dev_alloc_name()

All callers of __dev_valid_name() go thru dev_prep_valid_name()
which handles the non-printf case. Focus __dev_alloc_name() on
the sprintf case, remove the indentation level.

Minor functional change of returning -EINVAL if % is not found,
which should now never happen.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231023152346.3639749-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: make dev_alloc_name() call dev_prep_valid_name()
Jakub Kicinski [Mon, 23 Oct 2023 15:23:42 +0000 (08:23 -0700)]
net: make dev_alloc_name() call dev_prep_valid_name()

__dev_alloc_name() handles both the sprintf and non-sprintf
target names. This complicates the code.

dev_prep_valid_name() already handles the non-sprintf case,
before calling __dev_alloc_name(), make the only other caller
also go thru dev_prep_valid_name(). This way we can drop
the non-sprintf handling in __dev_alloc_name() in one of
the next changes.

commit 55a5ec9b7710 ("Revert "net: core: dev_get_valid_name is now the same as dev_alloc_name_ns"") and
commit 029b6d140550 ("Revert "net: core: maybe return -EEXIST in __dev_alloc_name"")
tell us that we can't start returning -EEXIST from dev_alloc_name()
on name duplicates. Bite the bullet and pass the expected errno to
dev_prep_valid_name().

dev_prep_valid_name() must now propagate out the allocated id
for printf names.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231023152346.3639749-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: don't use input buffer of __dev_alloc_name() as a scratch space
Jakub Kicinski [Mon, 23 Oct 2023 15:23:41 +0000 (08:23 -0700)]
net: don't use input buffer of __dev_alloc_name() as a scratch space

Callers of __dev_alloc_name() want to pass dev->name as
the output buffer. Make __dev_alloc_name() not clobber
that buffer on failure, and remove the workarounds
in callers.

dev_alloc_name_ns() is now completely unnecessary.

The extra strscpy() added here will be gone by the end
of the patch series.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231023152346.3639749-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge branch 'mptcp-convert-netlink-code-to-use-yaml-spec'
Jakub Kicinski [Tue, 24 Oct 2023 20:00:33 +0000 (13:00 -0700)]
Merge branch 'mptcp-convert-netlink-code-to-use-yaml-spec'

Mat Martineau says:

====================
mptcp: convert Netlink code to use YAML spec

This series from Davide converts most of the MPTCP Netlink interface
(plus uAPI bits) to use sources generated by YNL using a YAML spec file.

This new YAML file is useful to validate the API and to generate a good
documentation page.

Patch 1 modifies YNL spec to support "uns-admin-perm" for genetlink
legacy.

Patch 2 adds support for validating exact length of netlink attrs.

Patch 3 converts Netlink structures from small_ops to ops to prepare the
switch to YAML.

Patch 4 adds the Netlink YAML spec for MPTCP.

Patch 5 adds and uses a new header file generated from the new YAML
spec.

Patch 6 renames some handlers to match the ones generated from the YAML
spec.

Patch 7 adds and uses Netlink policies automatically generated from the
YAML spec.
====================

Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-0-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: mptcp: use policy generated by YAML spec
Davide Caratti [Mon, 23 Oct 2023 18:17:11 +0000 (11:17 -0700)]
net: mptcp: use policy generated by YAML spec

generated with:

 $ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
 > --spec Documentation/netlink/specs/mptcp.yaml --source \
 > -o net/mptcp/mptcp_pm_gen.c
 $ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
 > --spec Documentation/netlink/specs/mptcp.yaml --header \
 > -o net/mptcp/mptcp_pm_gen.h

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/340
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-7-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: mptcp: rename netlink handlers to mptcp_pm_nl_<blah>_{doit,dumpit}
Davide Caratti [Mon, 23 Oct 2023 18:17:10 +0000 (11:17 -0700)]
net: mptcp: rename netlink handlers to mptcp_pm_nl_<blah>_{doit,dumpit}

so that they will match names generated from YAML spec.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/340
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-6-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agouapi: mptcp: use header file generated from YAML spec
Davide Caratti [Mon, 23 Oct 2023 18:17:09 +0000 (11:17 -0700)]
uapi: mptcp: use header file generated from YAML spec

generated with:

 $ ./tools/net/ynl/ynl-gen-c.py --mode uapi \
 > --spec Documentation/netlink/specs/mptcp.yaml \
 > --header -o include/uapi/linux/mptcp_pm.h

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/340
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-5-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoDocumentation: netlink: add a YAML spec for mptcp
Davide Caratti [Mon, 23 Oct 2023 18:17:08 +0000 (11:17 -0700)]
Documentation: netlink: add a YAML spec for mptcp

it describes most of the current netlink interface (uAPI definitions,
doit/dumpit operations and attributes)

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/340
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-4-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: mptcp: convert netlink from small_ops to ops
Davide Caratti [Mon, 23 Oct 2023 18:17:07 +0000 (11:17 -0700)]
net: mptcp: convert netlink from small_ops to ops

in the current MPTCP control plane, all operations use a netlink
attribute of the same type "MPTCP_PM_ATTR". However, add/del/get/flush
operations only parse the first element in the message _ the one that
describes MPTCP endpoints (that was named MPTCP_PM_ATTR_ADDR and
mostly used in ADD_ADDR operations _ probably the similarity of "attr",
"addr" and "add" might cause some confusion to human readers).
Convert MPTCP from 'small_ops' to 'ops', thus allowing different attributes
for each single operation, hopefully makes all this clearer to human
readers.

- use a separate attribute set for add/del/get/flush address operation,
  binary compatible with the existing one, to store the endpoint address.
  MPTCP_PM_ENDPOINT_ADDR is added to the uAPI (with the same value as
  MPTCP_PM_ATTR_ADDR) for these operations.
- convert mptcp_pm_ops[] and add policy files accordingly.

this prepares MPTCP control plane to be described as YAML spec.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/340
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-3-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agotools: ynl-gen: add support for exact-len validation
Davide Caratti [Mon, 23 Oct 2023 18:17:06 +0000 (11:17 -0700)]
tools: ynl-gen: add support for exact-len validation

add support for 'exact-len' validation on netlink attributes.

Link: https://github.com/multipath-tcp/mptcp_net-next/issues/340
Acked-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-2-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agotools: ynl: add uns-admin-perm to genetlink legacy
Davide Caratti [Mon, 23 Oct 2023 18:17:05 +0000 (11:17 -0700)]
tools: ynl: add uns-admin-perm to genetlink legacy

this flag maps to GENL_UNS_ADMIN_PERM and will be used by future specs.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-1-v2-1-16b1f701f900@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: sched: sch_qfq: Use non-work-conserving warning handler
Liu Jian [Mon, 23 Oct 2023 06:47:29 +0000 (14:47 +0800)]
net: sched: sch_qfq: Use non-work-conserving warning handler

A helper function for printing non-work-conserving alarms is added in
commit b00355db3f88 ("pkt_sched: sch_hfsc: sch_htb: Add non-work-conserving
 warning handler."). In this commit, use qdisc_warn_nonwc() instead of
WARN_ONCE() to handle the non-work-conserving warning in qfq Qdisc.

Signed-off-by: Liu Jian <liujian56@huawei.com>
Link: https://lore.kernel.org/r/20231023064729.370649-1-liujian56@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agonet: ethernet: davinci_emac: Use MAC Address from Device Tree
Adam Ford [Sun, 22 Oct 2023 15:19:11 +0000 (10:19 -0500)]
net: ethernet: davinci_emac: Use MAC Address from Device Tree

Currently there is a device tree entry called "local-mac-address"
which can be filled by the bootloader or manually set.This is
useful when the user does not want to use the MAC address
programmed into the SoC.

Currently, the davinci_emac reads the MAC from the DT, copies
it from pdata->mac_addr to priv->mac_addr, then blindly overwrites
it by reading from registers in the SoC, and falls back to a
random MAC if it's still not valid.  This completely ignores any
MAC address in the device tree.

In order to use the local-mac-address, check to see if the contents
of priv->mac_addr are valid before falling back to reading from the
SoC when the MAC address is not valid.

Signed-off-by: Adam Ford <aford173@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231022151911.4279-1-aford173@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agosock: Ignore memcg pressure heuristics when raising allocated
Abel Wu [Thu, 19 Oct 2023 12:00:26 +0000 (20:00 +0800)]
sock: Ignore memcg pressure heuristics when raising allocated

Before sockets became aware of net-memcg's memory pressure since
commit e1aab161e013 ("socket: initial cgroup code."), the memory
usage would be granted to raise if below average even when under
protocol's pressure. This provides fairness among the sockets of
same protocol.

That commit changes this because the heuristic will also be
effective when only memcg is under pressure which makes no sense.
So revert that behavior.

After reverting, __sk_mem_raise_allocated() no longer considers
memcg's pressure. As memcgs are isolated from each other w.r.t.
memory accounting, consuming one's budget won't affect others.
So except the places where buffer sizes are needed to be tuned,
allow workloads to use the memory they are provisioned.

Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231019120026.42215-3-wuyun.abel@bytedance.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agosock: Doc behaviors for pressure heurisitics
Abel Wu [Thu, 19 Oct 2023 12:00:25 +0000 (20:00 +0800)]
sock: Doc behaviors for pressure heurisitics

There are now two accounting infrastructures for skmem, while the
heuristics in __sk_mem_raise_allocated() were actually introduced
before memcg was born.

Add some comments to clarify whether they can be applied to both
infrastructures or not.

Suggested-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231019120026.42215-2-wuyun.abel@bytedance.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agosock: Code cleanup on __sk_mem_raise_allocated()
Abel Wu [Thu, 19 Oct 2023 12:00:24 +0000 (20:00 +0800)]
sock: Code cleanup on __sk_mem_raise_allocated()

Code cleanup for both better simplicity and readability.
No functional change intended.

Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231019120026.42215-1-wuyun.abel@bytedance.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agonet: ti: icssg-prueth: Add phys_port_name support
Jan Kiszka [Sun, 22 Oct 2023 08:56:22 +0000 (10:56 +0200)]
net: ti: icssg-prueth: Add phys_port_name support

Helps identifying the ports in udev rules e.g.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/895ae9c1-b6dd-4a97-be14-6f2b73c7b2b5@siemens.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agonet: microchip: lan743x: improve throughput with rx timestamp config
Vishvambar Panth S [Fri, 20 Oct 2023 18:58:01 +0000 (00:28 +0530)]
net: microchip: lan743x: improve throughput with rx timestamp config

Currently all RX frames are timestamped which results in a performance
penalty when timestamping is not needed.  The default is now being
changed to not timestamp any Rx frames (HWTSTAMP_FILTER_NONE), but
support has been added to allow changing the desired RX timestamping
mode (HWTSTAMP_FILTER_ALL -  which was the previous setting and
HWTSTAMP_FILTER_PTP_V2_EVENT are now supported) using
SIOCSHWTSTAMP. All settings were tested using the hwstamp_ctl application.
It is also noted that ptp4l, when started, preconfigures the device to
timestamp using HWTSTAMP_FILTER_PTP_V2_EVENT, so this driver continues
to work properly "out of the box".

Test setup:  x64 PC with LAN7430 ---> x64 PC as partner

iperf3 with - Timestamp all incoming packets:
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-5.05   sec   517 MBytes   859 Mbits/sec    0    sender
[  5]   0.00-5.00   sec   515 MBytes   864 Mbits/sec         receiver

iperf Done.

iperf3 with - Timestamp only PTP packets:
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-5.04   sec   563 MBytes   937 Mbits/sec    0    sender
[  5]   0.00-5.00   sec   561 MBytes   941 Mbits/sec         receiver

Signed-off-by: Vishvambar Panth S <vishvambarpanth.s@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231020185801.25649-1-vishvambarpanth.s@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agoMerge branch 'introduce-page_pool_alloc-related-api'
Jakub Kicinski [Tue, 24 Oct 2023 02:14:51 +0000 (19:14 -0700)]
Merge branch 'introduce-page_pool_alloc-related-api'

Yunsheng Lin says:

====================
introduce page_pool_alloc() related API

In [1] & [2] & [3], there are usecases for veth and virtio_net
to use frag support in page pool to reduce memory usage, and it
may request different frag size depending on the head/tail
room space for xdp_frame/shinfo and mtu/packet size. When the
requested frag size is large enough that a single page can not
be split into more than one frag, using frag support only have
performance penalty because of the extra frag count handling
for frag support.

So this patchset provides a page pool API for the driver to
allocate memory with least memory utilization and performance
penalty when it doesn't know the size of memory it need
beforehand.

1. https://patchwork.kernel.org/project/netdevbpf/patch/d3ae6bd3537fbce379382ac6a42f67e22f27ece2.1683896626.git.lorenzo@kernel.org/
2. https://patchwork.kernel.org/project/netdevbpf/patch/20230526054621.18371-3-liangchen.linux@gmail.com/
3. https://github.com/alobakin/linux/tree/iavf-pp-frag
====================

Link: https://lore.kernel.org/r/20231020095952.11055-1-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet: veth: use newly added page pool API for veth with xdp
Yunsheng Lin [Fri, 20 Oct 2023 09:59:52 +0000 (17:59 +0800)]
net: veth: use newly added page pool API for veth with xdp

Use page_pool_alloc() API to allocate memory with least
memory utilization and performance penalty.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Liang Chen <liangchen.linux@gmail.com>
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20231020095952.11055-6-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agopage_pool: update document about fragment API
Yunsheng Lin [Fri, 20 Oct 2023 09:59:51 +0000 (17:59 +0800)]
page_pool: update document about fragment API

As more drivers begin to use the fragment API, update the
document about how to decide which API to use for the
driver author.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Liang Chen <liangchen.linux@gmail.com>
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
CC: Dima Tisnek <dimaqq@gmail.com>
Link: https://lore.kernel.org/r/20231020095952.11055-5-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agopage_pool: introduce page_pool_alloc() API
Yunsheng Lin [Fri, 20 Oct 2023 09:59:50 +0000 (17:59 +0800)]
page_pool: introduce page_pool_alloc() API

Currently page pool supports the below use cases:
use case 1: allocate page without page splitting using
            page_pool_alloc_pages() API if the driver knows
            that the memory it need is always bigger than
            half of the page allocated from page pool.
use case 2: allocate page frag with page splitting using
            page_pool_alloc_frag() API if the driver knows
            that the memory it need is always smaller than
            or equal to the half of the page allocated from
            page pool.

There is emerging use case [1] & [2] that is a mix of the
above two case: the driver doesn't know the size of memory it
need beforehand, so the driver may use something like below to
allocate memory with least memory utilization and performance
penalty:

if (size << 1 > max_size)
page = page_pool_alloc_pages();
else
page = page_pool_alloc_frag();

To avoid the driver doing something like above, add the
page_pool_alloc() API to support the above use case, and update
the true size of memory that is acctually allocated by updating
'*size' back to the driver in order to avoid exacerbating
truesize underestimate problem.

Rename page_pool_free() which is used in the destroy process to
__page_pool_destroy() to avoid confusion with the newly added
API.

1. https://lore.kernel.org/all/d3ae6bd3537fbce379382ac6a42f67e22f27ece2.1683896626.git.lorenzo@kernel.org/
2. https://lore.kernel.org/all/20230526054621.18371-3-liangchen.linux@gmail.com/

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Liang Chen <liangchen.linux@gmail.com>
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20231020095952.11055-4-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agopage_pool: remove PP_FLAG_PAGE_FRAG
Yunsheng Lin [Fri, 20 Oct 2023 09:59:49 +0000 (17:59 +0800)]
page_pool: remove PP_FLAG_PAGE_FRAG

PP_FLAG_PAGE_FRAG is not really needed after pp_frag_count
handling is unified and page_pool_alloc_frag() is supported
in 32-bit arch with 64-bit DMA, so remove it.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Liang Chen <liangchen.linux@gmail.com>
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20231020095952.11055-3-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agopage_pool: unify frag_count handling in page_pool_is_last_frag()
Yunsheng Lin [Fri, 20 Oct 2023 09:59:48 +0000 (17:59 +0800)]
page_pool: unify frag_count handling in page_pool_is_last_frag()

Currently when page_pool_create() is called with
PP_FLAG_PAGE_FRAG flag, page_pool_alloc_pages() is only
allowed to be called under the below constraints:
1. page_pool_fragment_page() need to be called to setup
   page->pp_frag_count immediately.
2. page_pool_defrag_page() often need to be called to drain
   the page->pp_frag_count when there is no more user will
   be holding on to that page.

Those constraints exist in order to support a page to be
split into multi fragments.

And those constraints have some overhead because of the
cache line dirtying/bouncing and atomic update.

Those constraints are unavoidable for case when we need a
page to be split into more than one fragment, but there is
also case that we want to avoid the above constraints and
their overhead when a page can't be split as it can only
hold a fragment as requested by user, depending on different
use cases:
use case 1: allocate page without page splitting.
use case 2: allocate page with page splitting.
use case 3: allocate page with or without page splitting
            depending on the fragment size.

Currently page pool only provide page_pool_alloc_pages() and
page_pool_alloc_frag() API to enable the 1 & 2 separately,
so we can not use a combination of 1 & 2 to enable 3, it is
not possible yet because of the per page_pool flag
PP_FLAG_PAGE_FRAG.

So in order to allow allocating unsplit page without the
overhead of split page while still allow allocating split
page we need to remove the per page_pool flag in
page_pool_is_last_frag(), as best as I can think of, it seems
there are two methods as below:
1. Add per page flag/bit to indicate a page is split or
   not, which means we might need to update that flag/bit
   everytime the page is recycled, dirtying the cache line
   of 'struct page' for use case 1.
2. Unify the page->pp_frag_count handling for both split and
   unsplit page by assuming all pages in the page pool is split
   into a big fragment initially.

As page pool already supports use case 1 without dirtying the
cache line of 'struct page' whenever a page is recyclable, we
need to support the above use case 3 with minimal overhead,
especially not adding any noticeable overhead for use case 1,
and we are already doing an optimization by not updating
pp_frag_count in page_pool_defrag_page() for the last fragment
user, this patch chooses to unify the pp_frag_count handling
to support the above use case 3.

There is no noticeable performance degradation and some
justification for unifying the frag_count handling with this
patch applied using a micro-benchmark testing in [1].

1. https://lore.kernel.org/all/bf2591f8-7b3c-4480-bb2c-31dc9da1d6ac@huawei.com/

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Liang Chen <liangchen.linux@gmail.com>
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20231020095952.11055-2-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge tag 'for-net-next-2023-10-23' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Mon, 23 Oct 2023 23:44:18 +0000 (16:44 -0700)]
Merge tag 'for-net-next-2023-10-23' of git://git./linux/kernel/git/bluetooth/bluetooth-next

Luiz Augusto von Dentz says:

====================
bluetooth-next pull request for net-next:

 - Add 0bda:b85b for Fn-Link RTL8852BE
 - ISO: Many fixes for broadcast support
 - Mark bcm4378/bcm4387 as BROKEN_LE_CODED
 - Add support ITTIM PE50-M75C
 - Add RTW8852BE device 13d3:3570
 - Add support for QCA2066
 - Add support for Intel Misty Peak - 8087:0038

* tag 'for-net-next-2023-10-23' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next:
  Bluetooth: hci_sync: Fix Opcode prints in bt_dev_dbg/err
  Bluetooth: Fix double free in hci_conn_cleanup
  Bluetooth: btmtksdio: enable bluetooth wakeup in system suspend
  Bluetooth: btusb: Add 0bda:b85b for Fn-Link RTL8852BE
  Bluetooth: hci_bcm4377: Mark bcm4378/bcm4387 as BROKEN_LE_CODED
  Bluetooth: ISO: Copy BASE if service data matches EIR_BAA_SERVICE_UUID
  Bluetooth: Make handle of hci_conn be unique
  Bluetooth: btusb: Add date->evt_skb is NULL check
  Bluetooth: ISO: Fix bcast listener cleanup
  Bluetooth: msft: __hci_cmd_sync() doesn't return NULL
  Bluetooth: ISO: Match QoS adv handle with BIG handle
  Bluetooth: ISO: Allow binding a bcast listener to 0 bises
  Bluetooth: btusb: Add RTW8852BE device 13d3:3570 to device tables
  Bluetooth: qca: add support for QCA2066
  Bluetooth: ISO: Set CIS bit only for devices with CIS support
  Bluetooth: Add support for Intel Misty Peak - 8087:0038
  Bluetooth: Add support ITTIM PE50-M75C
  Bluetooth: ISO: Pass BIG encryption info through QoS
  Bluetooth: ISO: Fix BIS cleanup
====================

Link: https://lore.kernel.org/r/20231023182119.3629194-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge branch 'devlink-finish-conversion-to-generated-split_ops'
Jakub Kicinski [Mon, 23 Oct 2023 23:16:52 +0000 (16:16 -0700)]
Merge branch 'devlink-finish-conversion-to-generated-split_ops'

Jiri Pirko says:

====================
devlink: finish conversion to generated split_ops

This patchset converts the remaining genetlink commands to generated
split_ops and removes the existing small_ops arrays entirely
alongside with shared netlink attribute policy.

Patches #1-#6 are just small preparations and small fixes on multiple
              places. Note that couple of patches contain the "Fixes"
              tag but no need to put them into -net tree.
Patch #7 is a simple rename preparation
Patch #8 is the main one in this set and adds actual definitions of cmds
         in to yaml file.
Patches #9-#10 finalize the change removing bits that are no longer in
               use.
====================

Link: https://lore.kernel.org/r/20231021112711.660606-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agodevlink: remove netlink small_ops
Jiri Pirko [Sat, 21 Oct 2023 11:27:11 +0000 (13:27 +0200)]
devlink: remove netlink small_ops

All commands are now covered by generated split_ops. Remove the
small_ops entirely alongside with unified devlink netlink policy array.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231021112711.660606-11-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agodevlink: remove duplicated netlink callback prototypes
Jiri Pirko [Sat, 21 Oct 2023 11:27:10 +0000 (13:27 +0200)]
devlink: remove duplicated netlink callback prototypes

The prototypes are now generated, remove the old ones.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231021112711.660606-10-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonetlink: specs: devlink: add the remaining command to generate complete split_ops
Jiri Pirko [Sat, 21 Oct 2023 11:27:09 +0000 (13:27 +0200)]
netlink: specs: devlink: add the remaining command to generate complete split_ops

Currently, some of the commands are not described in devlink yaml file
and are manually filled in net/devlink/netlink.c in small_ops. To make
all part of split_ops, add definitions of the rest of the commands
alongside with needed attributes and enums.

Note that this focuses on the kernel side. The requests are fully
described in order to generate split_op alongside with policies.
Follow-up will describe the replies in order to make the userspace
helpers complete.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231021112711.660606-9-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agodevlink: rename netlink callback to be aligned with the generated ones
Jiri Pirko [Sat, 21 Oct 2023 11:27:08 +0000 (13:27 +0200)]
devlink: rename netlink callback to be aligned with the generated ones

All remaining doit and dumpit netlink callback functions are going to be
used by generated split ops. They expect certain name format. Rename the
callback to be aligned with generated names.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231021112711.660606-8-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agodevlink: make devlink_flash_overwrite enum named one
Jiri Pirko [Sat, 21 Oct 2023 11:27:07 +0000 (13:27 +0200)]
devlink: make devlink_flash_overwrite enum named one

Since this enum is going to be used in generated userspace file, name it
properly.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231021112711.660606-7-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonetlink: specs: devlink: make dont-validate single line
Jiri Pirko [Sat, 21 Oct 2023 11:27:06 +0000 (13:27 +0200)]
netlink: specs: devlink: make dont-validate single line

Make dont-validate field more compact and push it into a single line.

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231021112711.660606-6-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonetlink: specs: devlink: remove reload-action from devlink-get cmd reply
Jiri Pirko [Sat, 21 Oct 2023 11:27:05 +0000 (13:27 +0200)]
netlink: specs: devlink: remove reload-action from devlink-get cmd reply

devlink-get command does not contain reload-action attr in reply.
Remove it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231021112711.660606-5-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agotools: ynl-gen: render rsp_parse() helpers if cmd has only dump op
Jiri Pirko [Sat, 21 Oct 2023 11:27:04 +0000 (13:27 +0200)]
tools: ynl-gen: render rsp_parse() helpers if cmd has only dump op

Due to the check in RenderInfo class constructor, type_consistent
flag is set to False to avoid rendering the same response parsing
helper for do and dump ops. However, in case there is no do, the helper
needs to be rendered for dump op. So split check to achieve that.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231021112711.660606-4-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agotools: ynl-gen: introduce support for bitfield32 attribute type
Jiri Pirko [Sat, 21 Oct 2023 11:27:03 +0000 (13:27 +0200)]
tools: ynl-gen: introduce support for bitfield32 attribute type

Introduce support for attribute type bitfield32.
Note that since the generated code works with struct nla_bitfield32,
the generator adds netlink.h to the list of includes for userspace
headers in case any bitfield32 is present.

Note that this is added only to genetlink-legacy scheme as requested
by Jakub Kicinski.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231021112711.660606-3-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agogenetlink: don't merge dumpit split op for different cmds into single iter
Jiri Pirko [Sat, 21 Oct 2023 11:27:02 +0000 (13:27 +0200)]
genetlink: don't merge dumpit split op for different cmds into single iter

Currently, split ops of doit and dumpit are merged into a single iter
item when they are subsequent. However, there is no guarantee that the
dumpit op is for the same cmd as doit op.

Fix this by checking if cmd is the same for both.
This problem does not occur in existing families.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231021112711.660606-2-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoMerge branch 'intel-wired-lan-driver-updates-2023-10-19-idpf'
Jakub Kicinski [Mon, 23 Oct 2023 22:55:33 +0000 (15:55 -0700)]
Merge branch 'intel-wired-lan-driver-updates-2023-10-19-idpf'

Jacob Keller says:

====================
Intel Wired LAN Driver Updates 2023-10-19 (idpf)

This series contains two fixes for the recently merged idpf driver.

Michal adds missing logic for programming the scheduling mode of completion
queues.

Pavan fixes a call trace caused by the mailbox work item not being canceled
properly if an error occurred during initialization.
====================

Link: https://lore.kernel.org/r/20231023202655.173369-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoidpf: cancel mailbox work in error path
Pavan Kumar Linga [Mon, 23 Oct 2023 20:26:55 +0000 (13:26 -0700)]
idpf: cancel mailbox work in error path

In idpf_vc_core_init, the mailbox work is queued
on a mailbox workqueue but it is not cancelled on error.
This results in a call trace when idpf_mbx_task tries
to access the freed mailbox queue pointer. Fix it by
cancelling the mailbox work in the error path.

Fixes: 4930fbf419a7 ("idpf: add core init and interrupt request")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231023202655.173369-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agoidpf: set scheduling mode for completion queue
Michal Kubiak [Mon, 23 Oct 2023 20:26:54 +0000 (13:26 -0700)]
idpf: set scheduling mode for completion queue

The HW must be programmed differently for queue-based scheduling mode.
To program the completion queue context correctly, the control plane
must know the scheduling mode not only for the Tx queue, but also for
the completion queue.
Unfortunately, currently the driver sets the scheduling mode only for
the Tx queues.

Propagate the scheduling mode data for the completion queue as
well when sending the queue configuration messages.

Fixes: 1c325aac10a8 ("idpf: configure resources for TX queues")
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Alan Brady <alan.brady@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231023202655.173369-2-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet_sched: sch_fq: fastpath needs to take care of sk->sk_pacing_status
Eric Dumazet [Fri, 20 Oct 2023 20:12:54 +0000 (20:12 +0000)]
net_sched: sch_fq: fastpath needs to take care of sk->sk_pacing_status

If packets of a TCP flows take the fast path, we need to make sure
sk->sk_pacing_status is set to SK_PACING_FQ otherwise TCP might
fallback to internal pacing, which is not optimal.

Fixes: 076433bd78d7 ("net_sched: sch_fq: add fast path for mostly idle qdisc")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231020201254.732527-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 months agonet_sched: sch_fq: fix off-by-one error in fq_dequeue()
Eric Dumazet [Fri, 20 Oct 2023 20:00:53 +0000 (20:00 +0000)]
net_sched: sch_fq: fix off-by-one error in fq_dequeue()

A last minute change went wrong.

We need to look for a packet in all 3 bands, not only two.

Fixes: 29f834aa326e ("net_sched: sch_fq: add 3 bands and WRR scheduling")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202310201422.a22b0999-oliver.sang@intel.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Dave Taht <dave.taht@gmail.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231020200053.675951-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoBluetooth: hci_sync: Fix Opcode prints in bt_dev_dbg/err
Marcel Ziswiler [Wed, 18 Oct 2023 14:47:35 +0000 (16:47 +0200)]
Bluetooth: hci_sync: Fix Opcode prints in bt_dev_dbg/err

Printed Opcodes may be missing leading zeros:

Bluetooth: hci0: Opcode 0x c03 failed: -110

Fix this by always printing leading zeros:

Bluetooth: hci0: Opcode 0x0c03 failed: -110

Fixes: d0b137062b2d ("Bluetooth: hci_sync: Rework init stages")
Fixes: 6a98e3836fa2 ("Bluetooth: Add helper for serialized HCI command execution")
Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: Fix double free in hci_conn_cleanup
ZhengHan Wang [Wed, 18 Oct 2023 10:30:55 +0000 (12:30 +0200)]
Bluetooth: Fix double free in hci_conn_cleanup

syzbot reports a slab use-after-free in hci_conn_hash_flush [1].
After releasing an object using hci_conn_del_sysfs in the
hci_conn_cleanup function, releasing the same object again
using the hci_dev_put and hci_conn_put functions causes a double free.
Here's a simplified flow:

hci_conn_del_sysfs:
  hci_dev_put
    put_device
      kobject_put
        kref_put
          kobject_release
            kobject_cleanup
              kfree_const
                kfree(name)

hci_dev_put:
  ...
    kfree(name)

hci_conn_put:
  put_device
    ...
      kfree(name)

This patch drop the hci_dev_put and hci_conn_put function
call in hci_conn_cleanup function, because the object is
freed in hci_conn_del_sysfs function.

This patch also fixes the refcounting in hci_conn_add_sysfs() and
hci_conn_del_sysfs() to take into account device_add() failures.

This fixes CVE-2023-28464.

Link: https://syzkaller.appspot.com/bug?id=1bb51491ca5df96a5f724899d1dbb87afda61419
Signed-off-by: ZhengHan Wang <wzhmmmmm@gmail.com>
Co-developed-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: btmtksdio: enable bluetooth wakeup in system suspend
Zhengping Jiang [Fri, 13 Oct 2023 19:47:07 +0000 (12:47 -0700)]
Bluetooth: btmtksdio: enable bluetooth wakeup in system suspend

The BTMTKSDIO_BT_WAKE_ENABLED flag is set for bluetooth interrupt
during system suspend and increases wakeup count for bluetooth event.

Signed-off-by: Zhengping Jiang <jiangzp@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: btusb: Add 0bda:b85b for Fn-Link RTL8852BE
Guan Wentao [Thu, 12 Oct 2023 11:21:17 +0000 (19:21 +0800)]
Bluetooth: btusb: Add 0bda:b85b for Fn-Link RTL8852BE

Add PID/VID 0bda:b85b for Realtek RTL8852BE USB bluetooth part.
The PID/VID was reported by the patch last year. [1]
Some SBCs like rockpi 5B A8 module contains the device.
And it`s founded in website. [2] [3]

Here is the device tables in /sys/kernel/debug/usb/devices .

T:  Bus=07 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  2 Spd=12   MxCh= 0
D:  Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=0bda ProdID=b85b Rev= 0.00
S:  Manufacturer=Realtek
S:  Product=Bluetooth Radio
S:  SerialNumber=00e04c000001
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms

Link: https://lore.kernel.org/all/20220420052402.19049-1-tangmeng@uniontech.com/
Link: https://forum.radxa.com/t/bluetooth-on-ubuntu/13051/4
Link: https://ubuntuforums.org/showthread.php?t=2489527
Cc: stable@vger.kernel.org
Signed-off-by: Meng Tang <tangmeng@uniontech.com>
Signed-off-by: Guan Wentao <guanwentao@uniontech.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: hci_bcm4377: Mark bcm4378/bcm4387 as BROKEN_LE_CODED
Janne Grunau [Mon, 16 Oct 2023 07:13:08 +0000 (09:13 +0200)]
Bluetooth: hci_bcm4377: Mark bcm4378/bcm4387 as BROKEN_LE_CODED

bcm4378 and bcm4387 claim to support LE Coded PHY but fail to pair
(reliably) with BLE devices if it is enabled.
On bcm4378 pairing usually succeeds after 2-3 tries. On bcm4387
pairing appears to be completely broken.

Cc: stable@vger.kernel.org # 6.4.y+
Link: https://discussion.fedoraproject.org/t/mx-master-3-bluetooth-mouse-doesnt-connect/87072/33
Link: https://github.com/AsahiLinux/linux/issues/177
Fixes: 288c90224eec ("Bluetooth: Enable all supported LE PHY by default")
Signed-off-by: Janne Grunau <j@jannau.net>
Reviewed-by: Eric Curtin <ecurtin@redhat.com>
Reviewed-by: Neal Gompa <neal@gompa.dev>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: ISO: Copy BASE if service data matches EIR_BAA_SERVICE_UUID
Claudia Draghicescu [Thu, 28 Sep 2023 08:02:08 +0000 (11:02 +0300)]
Bluetooth: ISO: Copy BASE if service data matches EIR_BAA_SERVICE_UUID

Copy the content of a Periodic Advertisement Report to BASE only if
the service UUID is Basic Audio Announcement Service UUID.

Signed-off-by: Claudia Draghicescu <claudia.rosu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: Make handle of hci_conn be unique
Ziyang Xuan [Wed, 11 Oct 2023 09:57:31 +0000 (17:57 +0800)]
Bluetooth: Make handle of hci_conn be unique

The handle of new hci_conn is always HCI_CONN_HANDLE_MAX + 1 if
the handle of the first hci_conn entry in hci_dev->conn_hash->list
is not HCI_CONN_HANDLE_MAX + 1. Use ida to manage the allocation of
hci_conn->handle to make it be unique.

Fixes: 9f78191cc9f1 ("Bluetooth: hci_conn: Always allocate unique handles")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: btusb: Add date->evt_skb is NULL check
youwan Wang [Wed, 11 Oct 2023 05:14:47 +0000 (13:14 +0800)]
Bluetooth: btusb: Add date->evt_skb is NULL check

fix crash because of null pointers

[ 6104.969662] BUG: kernel NULL pointer dereference, address: 00000000000000c8
[ 6104.969667] #PF: supervisor read access in kernel mode
[ 6104.969668] #PF: error_code(0x0000) - not-present page
[ 6104.969670] PGD 0 P4D 0
[ 6104.969673] Oops: 0000 [#1] SMP NOPTI
[ 6104.969684] RIP: 0010:btusb_mtk_hci_wmt_sync+0x144/0x220 [btusb]
[ 6104.969688] RSP: 0018:ffffb8d681533d48 EFLAGS: 00010246
[ 6104.969689] RAX: 0000000000000000 RBX: ffff8ad560bb2000 RCX: 0000000000000006
[ 6104.969691] RDX: 0000000000000000 RSI: ffffb8d681533d08 RDI: 0000000000000000
[ 6104.969692] RBP: ffffb8d681533d70 R08: 0000000000000001 R09: 0000000000000001
[ 6104.969694] R10: 0000000000000001 R11: 00000000fa83b2da R12: ffff8ad461d1d7c0
[ 6104.969695] R13: 0000000000000000 R14: ffff8ad459618c18 R15: ffffb8d681533d90
[ 6104.969697] FS:  00007f5a1cab9d40(0000) GS:ffff8ad578200000(0000) knlGS:00000
[ 6104.969699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6104.969700] CR2: 00000000000000c8 CR3: 000000018620c001 CR4: 0000000000760ef0
[ 6104.969701] PKRU: 55555554
[ 6104.969702] Call Trace:
[ 6104.969708]  btusb_mtk_shutdown+0x44/0x80 [btusb]
[ 6104.969732]  hci_dev_do_close+0x470/0x5c0 [bluetooth]
[ 6104.969748]  hci_rfkill_set_block+0x56/0xa0 [bluetooth]
[ 6104.969753]  rfkill_set_block+0x92/0x160
[ 6104.969755]  rfkill_fop_write+0x136/0x1e0
[ 6104.969759]  __vfs_write+0x18/0x40
[ 6104.969761]  vfs_write+0xdf/0x1c0
[ 6104.969763]  ksys_write+0xb1/0xe0
[ 6104.969765]  __x64_sys_write+0x1a/0x20
[ 6104.969769]  do_syscall_64+0x51/0x180
[ 6104.969771]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 6104.969773] RIP: 0033:0x7f5a21f18fef
[ 6104.9] RSP: 002b:00007ffeefe39010 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 6104.969780] RAX: ffffffffffffffda RBX: 000055c10a7560a0 RCX: 00007f5a21f18fef
[ 6104.969781] RDX: 0000000000000008 RSI: 00007ffeefe39060 RDI: 0000000000000012
[ 6104.969782] RBP: 00007ffeefe39060 R08: 0000000000000000 R09: 0000000000000017
[ 6104.969784] R10: 00007ffeefe38d97 R11: 0000000000000293 R12: 0000000000000002
[ 6104.969785] R13: 00007ffeefe39220 R14: 00007ffeefe391a0 R15: 000055c10a72acf0

Signed-off-by: youwan Wang <wangyouwan@126.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: ISO: Fix bcast listener cleanup
Iulia Tanasescu [Wed, 11 Oct 2023 14:24:07 +0000 (17:24 +0300)]
Bluetooth: ISO: Fix bcast listener cleanup

This fixes the cleanup callback for slave bis and pa sync hcons.

Closing all bis hcons will trigger BIG Terminate Sync, while closing
all bises and the pa sync hcon will also trigger PA Terminate Sync.

Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: msft: __hci_cmd_sync() doesn't return NULL
Dan Carpenter [Thu, 5 Oct 2023 11:19:23 +0000 (14:19 +0300)]
Bluetooth: msft: __hci_cmd_sync() doesn't return NULL

The __hci_cmd_sync() function doesn't return NULL.  Checking for NULL
doesn't make the code safer, it just confuses people.

When a function returns both error pointers and NULL then generally the
NULL is a kind of success case.  For example, maybe we look up an item
then errors mean we ran out of memory but NULL means the item is not
found.  Or if we request a feature, then error pointers mean that there
was an error but NULL means that the feature has been deliberately
turned off.

In this code it's different.  The NULL is handled as if there is a bug
in __hci_cmd_sync() where it accidentally returns NULL instead of a
proper error code.  This was done consistently until commit 9e14606d8f38
("Bluetooth: msft: Extended monitor tracking by address filter") which
deleted the work around for the potential future bug and treated NULL as
success.

Predicting potential future bugs is complicated, but we should just fix
them instead of working around them.  Instead of debating whether NULL
is failure or success, let's just say it's currently impossible and
delete the dead code.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: ISO: Match QoS adv handle with BIG handle
Iulia Tanasescu [Tue, 3 Oct 2023 14:37:39 +0000 (17:37 +0300)]
Bluetooth: ISO: Match QoS adv handle with BIG handle

In case the user binds multiple sockets for the same BIG, the BIG
handle should be matched with the associated adv handle, if it has
already been allocated previously.

Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: ISO: Allow binding a bcast listener to 0 bises
Iulia Tanasescu [Tue, 3 Oct 2023 14:49:33 +0000 (17:49 +0300)]
Bluetooth: ISO: Allow binding a bcast listener to 0 bises

This makes it possible to bind a broadcast listener to a broadcaster
address without asking for any BIS indexes to sync with.

Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: btusb: Add RTW8852BE device 13d3:3570 to device tables
Masum Reza [Sun, 24 Sep 2023 11:16:55 +0000 (16:46 +0530)]
Bluetooth: btusb: Add RTW8852BE device 13d3:3570 to device tables

This device is used in TP-Link TX20E WiFi+Bluetooth adapter.

Relevant information in /sys/kernel/debug/usb/devices
about the Bluetooth device is listed as the below.

T:  Bus=01 Lev=01 Prnt=01 Port=08 Cnt=01 Dev#=  2 Spd=12   MxCh= 0
D:  Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=13d3 ProdID=3570 Rev= 0.00
S:  Manufacturer=Realtek
S:  Product=Bluetooth Radio
S:  SerialNumber=00e04c000001
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms

Signed-off-by: Masum Reza <masumrezarock100@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: qca: add support for QCA2066
Tim Jiang [Tue, 12 Sep 2023 09:39:57 +0000 (17:39 +0800)]
Bluetooth: qca: add support for QCA2066

This patch adds support for QCA2066 firmware patch and NVM downloading.
as the RF performance of QCA2066 SOC chip from different foundries may
vary. Therefore we use different NVM to configure them based on board ID.

Changes in v2
 - optimize the function qca_generate_hsp_nvm_name
 - remove redundant debug code for function qca_read_fw_board_id

Signed-off-by: Tim Jiang <quic_tjiang@quicinc.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: ISO: Set CIS bit only for devices with CIS support
Vlad Pruteanu [Tue, 12 Sep 2023 06:33:29 +0000 (09:33 +0300)]
Bluetooth: ISO: Set CIS bit only for devices with CIS support

Currently the CIS bit that can be set by the host is set for any device
that has CIS or BIS support. In reality, devices that support BIS may not
allow that bit to be set and so, the HCI bring up fails for them.

This commit fixes this by only setting the bit for CIS capable devices.

Signed-off-by: Vlad Pruteanu <vlad.pruteanu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: Add support for Intel Misty Peak - 8087:0038
Vijay Satija [Thu, 7 Sep 2023 09:15:50 +0000 (14:45 +0530)]
Bluetooth: Add support for Intel Misty Peak - 8087:0038

Devices from /sys/kernel/debug/usb/devices:

T:  Bus=01 Lev=01 Prnt=01 Port=13 Cnt=02 Dev#=  3 Spd=12   MxCh= 0
D:  Ver= 2.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=8087 ProdID=0038 Rev= 0.00
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  64 Ivl=1ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
I:  If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  63 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  63 Ivl=1ms

Signed-off-by: Vijay Satija <vijay.satija@intel.com>
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: Add support ITTIM PE50-M75C
Jingyang Wang [Wed, 6 Sep 2023 08:31:47 +0000 (16:31 +0800)]
Bluetooth: Add support ITTIM PE50-M75C

-Device(35f5:7922) from /sys/kernel/debug/usb/devices
P:  Vendor=35f5 ProdID=7922 Rev= 1.00
S:  Manufacturer=MediaTek Inc.
S:  Product=Wireless_Device
S:  SerialNumber=000000000
C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA
A:  FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=125us
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
I:  If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=83(I) Atr=01(Isoc) MxPS=  63 Ivl=1ms
E:  Ad=03(O) Atr=01(Isoc) MxPS=  63 Ivl=1ms
I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none)
E:  Ad=8a(I) Atr=03(Int.) MxPS=  64 Ivl=125us
E:  Ad=0a(O) Atr=03(Int.) MxPS=  64 Ivl=125us
I:  If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none)
E:  Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us
E:  Ad=0a(O) Atr=03(Int.) MxPS= 512 Ivl=125us

Signed-off-by: Jingyang Wang <wjy7717@126.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: ISO: Pass BIG encryption info through QoS
Iulia Tanasescu [Wed, 6 Sep 2023 14:01:03 +0000 (17:01 +0300)]
Bluetooth: ISO: Pass BIG encryption info through QoS

This enables a broadcast sink to be informed if the PA
it has synced with is associated with an encrypted BIG,
by retrieving the socket QoS and checking the encryption
field.

After PA sync has been successfully established and the
first BIGInfo advertising report is received, a new hcon
is added and notified to the ISO layer. The ISO layer
sets the encryption field of the socket and hcon QoS
according to the encryption parameter of the BIGInfo
advertising report event.

After that, the userspace is woken up, and the QoS of the
new PA sync socket can be read, to inspect the encryption
field and follow up accordingly.

Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agoBluetooth: ISO: Fix BIS cleanup
Iulia Tanasescu [Wed, 6 Sep 2023 13:59:54 +0000 (16:59 +0300)]
Bluetooth: ISO: Fix BIS cleanup

This fixes the master BIS cleanup procedure - as opposed to CIS cleanup,
no HCI disconnect command should be issued. A master BIS should only be
terminated by disabling periodic and extended advertising, and terminating
the BIG.

In case of a Broadcast Receiver, all BIS and PA connections can be
cleaned up by calling hci_conn_failed, since it contains all function
calls that are necessary for successful cleanup.

Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
11 months agonet: mdio: xgene: Fix unused xgene_mdio_of_match warning for !CONFIG_OF
Rob Herring [Thu, 19 Oct 2023 18:23:45 +0000 (13:23 -0500)]
net: mdio: xgene: Fix unused xgene_mdio_of_match warning for !CONFIG_OF

Commit a243ecc323b9 ("net: mdio: xgene: Use device_get_match_data()")
dropped the unconditional use of xgene_mdio_of_match resulting in this
warning:

drivers/net/mdio/mdio-xgene.c:303:34: warning: unused variable 'xgene_mdio_of_match' [-Wunused-const-variable]

The fix is to drop of_match_ptr() which is not necessary because DT is
always used for this driver (well, it could in theory support ACPI only,
but CONFIG_OF is always enabled for arm64).

Fixes: a243ecc323b9 ("net: mdio: xgene: Use device_get_match_data()")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310170832.xnVXw1bb-lkp@intel.com/
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20231019182345.833136-1-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agotools: ynl-gen: change spacing around __attribute__
Jakub Kicinski [Fri, 20 Oct 2023 22:18:27 +0000 (15:18 -0700)]
tools: ynl-gen: change spacing around __attribute__

checkpatch gets confused and treats __attribute__ as a function call.
It complains about white space before "(":

WARNING:SPACING: space prohibited between function name and open parenthesis '('
+ struct netdev_queue_get_rsp obj __attribute__ ((aligned (8)));

No spaces wins in the kernel:

  $ git grep 'attribute__((.*aligned(' | wc -l
  480
  $ git grep 'attribute__ ((.*aligned (' | wc -l
  110
  $ git grep 'attribute__ ((.*aligned(' | wc -l
  94
  $ git grep 'attribute__((.*aligned (' | wc -l
  63

So, whatever, change the codegen.

Note that checkpatch also thinks we should use __aligned(),
but this is user space code.

Link: https://lore.kernel.org/all/202310190900.9Dzgkbev-lkp@intel.com/
Acked-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231020221827.3436697-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agotls: don't reset prot->aad_size and prot->tail_size for TLS_HW
Sabrina Dubroca [Fri, 20 Oct 2023 14:00:55 +0000 (16:00 +0200)]
tls: don't reset prot->aad_size and prot->tail_size for TLS_HW

Prior to commit 1a074f7618e8 ("tls: also use init_prot_info in
tls_set_device_offload"), setting TLS_HW on TX didn't touch
prot->aad_size and prot->tail_size. They are set to 0 during context
allocation (tls_prot_info is embedded in tls_context, kzalloc'd by
tls_ctx_create).

When the RX key is configured, tls_set_sw_offload is called (for both
TLS_SW and TLS_HW). If the TX key is configured in TLS_HW mode after
the RX key has been installed, init_prot_info will now overwrite the
correct values of aad_size and tail_size, breaking SW decryption and
causing -EBADMSG errors to be returned to userspace.

Since TLS_HW doesn't use aad_size and tail_size at all (for TLS1.2,
tail_size is always 0, and aad_size is equal to TLS_HEADER_SIZE +
rec_seq_size), we can simply drop this hunk.

Fixes: 1a074f7618e8 ("tls: also use init_prot_info in tls_set_device_offload")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Ran Rozenstein <ranro@nvidia.com>
Link: https://lore.kernel.org/r/979d2f89a6a994d5bb49cae49a80be54150d094d.1697653889.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: lan966x: remove useless code in lan966x_xtr_irq_handler
Su Hui [Fri, 20 Oct 2023 09:16:26 +0000 (17:16 +0800)]
net: lan966x: remove useless code in lan966x_xtr_irq_handler

'err' is useless after break, remove this to save space and
be more clear.

Signed-off-by: Su Hui <suhui@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoMerge branch 'tcp-ts-usec-resolution'
David S. Miller [Mon, 23 Oct 2023 08:35:02 +0000 (09:35 +0100)]
Merge branch 'tcp-ts-usec-resolution'

Eric Dumazet says:

====================
tcp: add optional usec resolution to TCP TS

As discussed in various public places in 2016, Google adopted
usec resolution in RFC 7323 TS values, at Van Jacobson suggestion.

Goals were :

1) better observability of delays in networking stacks/fabrics.

2) better disambiguation of events based on TSval/ecr values.

3) building block for congestion control modules needing usec resolution.

Back then we implemented a schem based on private SYN options
to safely negotiate the feature.

For upstream submission, we chose to use a much simpler route
attribute because this feature is probably going to be used
in private networks.

ip route add 10/8 ... features tcp_usec_ts

References:

https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
https://datatracker.ietf.org/doc/draft-wang-tcpm-low-latency-opt/

First two patches are fixing old minor bugs and might be taken
by stable teams (thanks to appropriate Fixes: tags)
====================

Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: add TCPI_OPT_USEC_TS
Eric Dumazet [Fri, 20 Oct 2023 12:57:48 +0000 (12:57 +0000)]
tcp: add TCPI_OPT_USEC_TS

Add the ability to report in tcp_info.tcpi_options if
a flow is using usec resolution in TCP TS val.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: add support for usec resolution in TCP TS values
Eric Dumazet [Fri, 20 Oct 2023 12:57:47 +0000 (12:57 +0000)]
tcp: add support for usec resolution in TCP TS values

Back in 2015, Van Jacobson suggested to use usec resolution in TCP TS values.
This has been implemented in our private kernels.

Goals were :

1) better observability of delays in networking stacks.
2) better disambiguation of events based on TSval/ecr values.
3) building block for congestion control modules needing usec resolution.

Back then we implemented a schem based on private SYN options
to negotiate the feature.

For upstream submission, we chose to use a route attribute,
because this feature is probably going to be used in private
networks [1] [2].

ip route add 10/8 ... features tcp_usec_ts

Note that RFC 7323 recommends a
  "timestamp clock frequency in the range 1 ms to 1 sec per tick.",
but also mentions
  "the maximum acceptable clock frequency is one tick every 59 ns."

[1] Unfortunately RFC 7323 5.5 (Outdated Timestamps) suggests
to invalidate TS.Recent values after a flow was idle for more
than 24 days. This is the part making usec_ts a problem
for peers following this recommendation for long living
idle flows.

[2] Attempts to standardize usec ts went nowhere:

https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
https://datatracker.ietf.org/doc/draft-wang-tcpm-low-latency-opt/

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: introduce TCP_PAWS_WRAP
Eric Dumazet [Fri, 20 Oct 2023 12:57:46 +0000 (12:57 +0000)]
tcp: introduce TCP_PAWS_WRAP

tcp_paws_check() uses TCP_PAWS_24DAYS constant to detect if TCP TS
values might have wrapped after a long idle period.

This mechanism is described in RFC 7323 5.5 (Outdated Timestamps)

TCP_PAWS_24DAYS value was based on the assumption of a clock
of 1 Khz.

As we want to adopt a 1 Mhz clock in the future, we reduce
this constant.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: add RTAX_FEATURE_TCP_USEC_TS
Eric Dumazet [Fri, 20 Oct 2023 12:57:45 +0000 (12:57 +0000)]
tcp: add RTAX_FEATURE_TCP_USEC_TS

This new dst feature flag will be used to allow TCP to use usec
based timestamps instead of msec ones.

ip route .... feature tcp_usec_ts

Also document that RTAX_FEATURE_SACK and RTAX_FEATURE_TIMESTAMP
are unused.

RTAX_FEATURE_ALLFRAG is also going away soon.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: add tcp_rtt_tsopt_us()
Eric Dumazet [Fri, 20 Oct 2023 12:57:44 +0000 (12:57 +0000)]
tcp: add tcp_rtt_tsopt_us()

Before adding usec TS support, add tcp_rtt_tsopt_us() helper
to factorize code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: rename tcp_time_stamp() to tcp_time_stamp_ts()
Eric Dumazet [Fri, 20 Oct 2023 12:57:43 +0000 (12:57 +0000)]
tcp: rename tcp_time_stamp() to tcp_time_stamp_ts()

This helper returns a TSval from a TCP socket.

It currently calls tcp_time_stamp_ms() but will soon
be able to return a usec based TSval, depending
on an upcoming tp->tcp_usec_ts field.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: move tcp_ns_to_ts() to net/ipv4/syncookies.c
Eric Dumazet [Fri, 20 Oct 2023 12:57:42 +0000 (12:57 +0000)]
tcp: move tcp_ns_to_ts() to net/ipv4/syncookies.c

tcp_ns_to_ts() is only used once from cookie_init_timestamp().

Also add the 'bool usec_ts' parameter to enable usec TS later.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: rename tcp_skb_timestamp()
Eric Dumazet [Fri, 20 Oct 2023 12:57:41 +0000 (12:57 +0000)]
tcp: rename tcp_skb_timestamp()

This helper returns a 32bit TCP TSval from skb->tstamp.

As we are going to support usec or ms units soon, rename it
to tcp_skb_timestamp_ts() and add a boolean to select the unit.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: replace tcp_time_stamp_raw()
Eric Dumazet [Fri, 20 Oct 2023 12:57:40 +0000 (12:57 +0000)]
tcp: replace tcp_time_stamp_raw()

In preparation of usec TCP TS support, remove tcp_time_stamp_raw()
in favor of tcp_clock_ts() helper. This helper will return a suitable
32bit result to feed TS values, depending on a socket field.

Also add tcp_tw_tsval() and tcp_rsk_tsval() helpers to factorize
the details.

We do not yet support usec timestamps.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: introduce tcp_clock_ms()
Eric Dumazet [Fri, 20 Oct 2023 12:57:39 +0000 (12:57 +0000)]
tcp: introduce tcp_clock_ms()

It delivers current TCP time stamp in ms unit, and is used
in place of confusing tcp_time_stamp_raw()

It is the same family than tcp_clock_ns() and tcp_clock_ms().

tcp_time_stamp_raw() will be replaced later for TSval
contexts with a more descriptive name.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: add tcp_time_stamp_ms() helper
Eric Dumazet [Fri, 20 Oct 2023 12:57:38 +0000 (12:57 +0000)]
tcp: add tcp_time_stamp_ms() helper

In preparation of adding usec TCP TS values, add tcp_time_stamp_ms()
for contexts needing ms based values.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agotcp: fix cookie_init_timestamp() overflows
Eric Dumazet [Fri, 20 Oct 2023 12:57:37 +0000 (12:57 +0000)]
tcp: fix cookie_init_timestamp() overflows

cookie_init_timestamp() is supposed to return a 64bit timestamp
suitable for both TSval determination and setting of skb->tstamp.

Unfortunately it uses 32bit fields and overflows after
2^32 * 10^6 nsec (~49 days) of uptime.

Generated TSval are still correct, but skb->tstamp might be set
far away in the past, potentially confusing other layers.

tcp_ns_to_ts() is changed to return a full 64bit value,
ts and ts_now variables are changed to u64 type,
and TSMASK is removed in favor of shifts operations.

While we are at it, change this sequence:
ts >>= TSBITS;
ts--;
ts <<= TSBITS;
ts |= options;
to:
ts -= (1UL << TSBITS);

Fixes: 9a568de4818d ("tcp: switch TCP TS option (RFC 7323) to 1ms clock")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agochtls: fix tp->rcv_tstamp initialization
Eric Dumazet [Fri, 20 Oct 2023 12:57:36 +0000 (12:57 +0000)]
chtls: fix tp->rcv_tstamp initialization

tp->rcv_tstamp should be set to tcp_jiffies, not tcp_time_stamp().

Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoMerge branch 'consolidate-udp-ipv6-route-lookups'
David S. Miller [Mon, 23 Oct 2023 07:48:57 +0000 (08:48 +0100)]
Merge branch 'consolidate-udp-ipv6-route-lookups'

Beniamino Galvani says:

====================
net: consolidate IPv6 route lookup for UDP tunnels

At the moment different UDP tunnels rely on different functions for
IPv6 route lookup, and those functions all implement the same
logic.

Extend the generic lookup function so that it is suitable for all UDP
tunnel implementations, and then adapt bareudp, geneve and vxlan to
use it.

This is similar to what already done for IPv4.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agovxlan: use generic function for tunnel IPv6 route lookup
Beniamino Galvani [Fri, 20 Oct 2023 11:55:29 +0000 (13:55 +0200)]
vxlan: use generic function for tunnel IPv6 route lookup

The route lookup can be done now via generic function
udp_tunnel6_dst_lookup() to replace the custom implementation in
vxlan6_get_route().

This is similar to what already done for IPv4 in commit 6f19b2c136d9
("vxlan: use generic function for tunnel IPv4 route lookup").

Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agogeneve: use generic function for tunnel IPv6 route lookup
Beniamino Galvani [Fri, 20 Oct 2023 11:55:28 +0000 (13:55 +0200)]
geneve: use generic function for tunnel IPv6 route lookup

The route lookup can be done now via generic function
udp_tunnel6_dst_lookup() to replace the custom implementation in
geneve_get_v6_dst().

This is similar to what already done for IPv4 in commit daa2ba7ed1d1
("geneve: use generic function for tunnel IPv4 route lookup").

Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoipv6: add new arguments to udp_tunnel6_dst_lookup()
Beniamino Galvani [Fri, 20 Oct 2023 11:55:27 +0000 (13:55 +0200)]
ipv6: add new arguments to udp_tunnel6_dst_lookup()

We want to make the function more generic so that it can be used by
other UDP tunnel implementations such as geneve and vxlan. To do that,
add the following arguments:

 - source and destination UDP port;
 - ifindex of the output interface, needed by vxlan;
 - the tos, because in some cases it is not taken from struct
   ip_tunnel_info (for example, when it's inherited from the inner
   packet);
 - the dst cache, because not all tunnel types (e.g. vxlan) want to
   use the one from struct ip_tunnel_info.

With these parameters, the function no longer needs the full struct
ip_tunnel_info as argument and we can pass only the relevant part of
it (struct ip_tunnel_key).

This is similar to what already done for IPv4 in commit 72fc68c6356b
("ipv4: add new arguments to udp_tunnel_dst_lookup()").

Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>