linux-block.git
5 years agobpf: Sync bpf.h to tools
Willem de Bruijn [Fri, 22 Mar 2019 18:32:57 +0000 (14:32 -0400)]
bpf: Sync bpf.h to tools

Sync include/uapi/linux/bpf.h with tools/

Changes
  v1->v2:
  - BPF_F_ADJ_ROOM_MASK moved, no longer in this commit
  v2->v3:
  - BPF_F_ADJ_ROOM_ENCAP_L3_MASK moved, no longer in this commit

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: add bpf_skb_adjust_room encap flags
Willem de Bruijn [Fri, 22 Mar 2019 18:32:56 +0000 (14:32 -0400)]
bpf: add bpf_skb_adjust_room encap flags

When pushing tunnel headers, annotate skbs in the same way as tunnel
devices.

For GSO packets, the network stack requires certain fields set to
segment packets with tunnel headers. gro_gse_segment depends on
transport and inner mac header, for instance.

Add an option to pass this information.

Remove the restriction on len_diff to network header length, which
is too short, e.g., for GRE protocols.

Changes
  v1->v2:
  - document new flags
  - BPF_F_ADJ_ROOM_MASK moved
  v2->v3:
  - BPF_F_ADJ_ROOM_ENCAP_L3_MASK moved

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO
Willem de Bruijn [Fri, 22 Mar 2019 18:32:55 +0000 (14:32 -0400)]
bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO

bpf_skb_adjust_room adjusts gso_size of gso packets to account for the
pushed or popped header room.

This is not allowed with UDP, where gso_size delineates datagrams. Add
an option to avoid these updates and allow this call for datagrams.

It can also be used with TCP, when MSS is known to allow headroom,
e.g., through MSS clamping or route MTU.

Changes v1->v2:
  - document flag BPF_F_ADJ_ROOM_FIXED_GSO
  - do not expose BPF_F_ADJ_ROOM_MASK through uapi, as it may change.

Link: https://patchwork.ozlabs.org/patch/1052497/
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC
Willem de Bruijn [Fri, 22 Mar 2019 18:32:54 +0000 (14:32 -0400)]
bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC

bpf_skb_adjust_room net allows inserting room in an skb.

Existing mode BPF_ADJ_ROOM_NET inserts room after the network header
by pulling the skb, moving the network header forward and zeroing the
new space.

Add new mode BPF_ADJUST_ROOM_MAC that inserts room after the mac
header. This allows inserting tunnel headers in front of the network
header without having to recreate the network header in the original
space, avoiding two copies.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: extend bpf tunnel test with tso
Willem de Bruijn [Fri, 22 Mar 2019 18:32:53 +0000 (14:32 -0400)]
selftests/bpf: extend bpf tunnel test with tso

Segmentation offload takes a longer path. Verify that the feature
works with large packets.

The test succeeds if not setting dodgy in bpf_skb_adjust_room, as veth
TSO is permissive.

If not setting SKB_GSO_DODGY, this enables tunneled TSO offload on
supporting NICs.

The feature sets SKB_GSO_DODGY because the caller is untrusted. As a
result the packets traverse through the gso stack at least up to TCP.
And fail the gso_type validation, such as the skb->encapsulation check
in gre_gso_segment and the gso_type checks introduced in commit
418e897e0716 ("gso: validate gso_type on ipip style tunnel").

This will be addressed in a follow-on feature patch. In the meantime,
disable the new gso tests.

Changes v1->v2:
  - not all netcat versions support flag '-q', use timeout instead

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: extend bpf tunnel test with gre
Willem de Bruijn [Fri, 22 Mar 2019 18:32:52 +0000 (14:32 -0400)]
selftests/bpf: extend bpf tunnel test with gre

GRE is a commonly used protocol. Add GRE cases for both IPv4 and IPv6.

It also inserts different sized headers, which can expose some
unexpected edge cases.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: expand bpf tunnel test to ipv6
Willem de Bruijn [Fri, 22 Mar 2019 18:32:51 +0000 (14:32 -0400)]
selftests/bpf: expand bpf tunnel test to ipv6

The test only uses ipv4 so far, expand to ipv6.
This is mostly a boilerplate near copy of the ipv4 path.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: expand bpf tunnel test with decap
Willem de Bruijn [Fri, 22 Mar 2019 18:32:50 +0000 (14:32 -0400)]
selftests/bpf: expand bpf tunnel test with decap

The bpf tunnel test encapsulates using bpf, then decapsulates using
a standard tunnel device to verify correctness.

Once encap is verified, also test decap, by replacing the tunnel
device on decap with another bpf program.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: bpf tunnel encap test
Willem de Bruijn [Fri, 22 Mar 2019 18:32:49 +0000 (14:32 -0400)]
selftests/bpf: bpf tunnel encap test

Validate basic tunnel encapsulation using ipip.

Set up two namespaces connected by veth. Connect a client and server.
Do this with and without bpf encap.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: in bpf_skb_adjust_room avoid copy in tx fast path
Willem de Bruijn [Fri, 22 Mar 2019 18:32:48 +0000 (14:32 -0400)]
bpf: in bpf_skb_adjust_room avoid copy in tx fast path

bpf_skb_adjust_room calls skb_cow on grow.

This expensive operation can be avoided in the fast path when the only
other clone has released the header. This is the common case for TCP,
where one headerless clone is kept on the retransmit queue.

It is safe to do so even when touching the gso fields in skb_shinfo.
Regular tunnel encap with iptunnel_handle_offloads takes the same
optimization.

The tcp stack unclones in the unlikely case that it accesses these
fields through headerless clones packets on the retransmit queue (see
__tcp_retransmit_skb).

If any other clones are present, e.g., from packet sockets,
skb_cow_head returns the same value as skb_cow().

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests: bpf: modify urandom_read and link it non-statically
Ivan Vecera [Fri, 15 Mar 2019 20:04:14 +0000 (21:04 +0100)]
selftests: bpf: modify urandom_read and link it non-statically

After some experiences I found that urandom_read does not need to be
linked statically. When the 'read' syscall call is moved to separate
non-inlined function then bpf_get_stackid() is able to find
the executable in stack trace and extract its build_id from it.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agosamples: bpf: add xdp_sample_pkts to .gitignore
Daniel T. Lee [Wed, 20 Mar 2019 04:17:47 +0000 (13:17 +0900)]
samples: bpf: add xdp_sample_pkts to .gitignore

This commit adds xdp_sample_pkts to .gitignore which is
currently ommited from the ignore file.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoMerge branch 'bpf_tcp_check_syncookie'
Alexei Starovoitov [Fri, 22 Mar 2019 01:59:11 +0000 (18:59 -0700)]
Merge branch 'bpf_tcp_check_syncookie'

Lorenz Bauer says:

====================
This series adds the necessary helpers to determine wheter a given
(encapsulated) TCP packet belongs to a connection known to the network stack.

* bpf_skc_lookup_tcp gives access to request and timewait sockets
* bpf_tcp_check_syncookie identifies the final 3WHS ACK when syncookies
  are enabled

The goal is to be able to implement load-balancing approaches like
glb-director [1] or Beamer [2] in pure eBPF. Specifically, we'd like to replace
the functionality of the glb-redirect kernel module [3] by an XDP program or
tc classifier.

Changes in v3:
* Fix missing check for ip4->ihl
* Only cast to unsigned long in BPF_CALLs

Changes in v2:
* Rename bpf_sk_check_syncookie to bpf_tcp_check_syncookie.
* Add bpf_skc_lookup_tcp. Without it bpf_tcp_check_syncookie doesn't make sense.
* Check tcp_synq_no_recent_overflow() in bpf_tcp_check_syncookie.
* Check th->syn in bpf_tcp_check_syncookie.
* Require CONFIG_IPV6 to be a built in.

1: https://github.com/github/glb-director
2: https://www.usenix.org/conference/nsdi18/presentation/olteanu
3: https://github.com/github/glb-director/tree/master/src/glb-redirect
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: add tests for bpf_tcp_check_syncookie and bpf_skc_lookup_tcp
Lorenz Bauer [Fri, 22 Mar 2019 01:54:06 +0000 (09:54 +0800)]
selftests/bpf: add tests for bpf_tcp_check_syncookie and bpf_skc_lookup_tcp

Add tests which verify that the new helpers work for both IPv4 and
IPv6, by forcing SYN cookies to always on. Use a new network namespace
to avoid clobbering the global SYN cookie settings.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: test references to sock_common
Lorenz Bauer [Fri, 22 Mar 2019 01:54:05 +0000 (09:54 +0800)]
selftests/bpf: test references to sock_common

Make sure that returning a struct sock_common * reference invokes
the reference tracking machinery in the verifier.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: allow specifying helper for BPF_SK_LOOKUP
Lorenz Bauer [Fri, 22 Mar 2019 01:54:04 +0000 (09:54 +0800)]
selftests/bpf: allow specifying helper for BPF_SK_LOOKUP

Make the BPF_SK_LOOKUP macro take a helper function, to ease
writing tests for new helpers.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agotools: update include/uapi/linux/bpf.h
Lorenz Bauer [Fri, 22 Mar 2019 01:54:03 +0000 (09:54 +0800)]
tools: update include/uapi/linux/bpf.h

Pull definitions for bpf_skc_lookup_tcp and bpf_sk_check_syncookie.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: add helper to check for a valid SYN cookie
Lorenz Bauer [Fri, 22 Mar 2019 01:54:02 +0000 (09:54 +0800)]
bpf: add helper to check for a valid SYN cookie

Using bpf_skc_lookup_tcp it's possible to ascertain whether a packet
belongs to a known connection. However, there is one corner case: no
sockets are created if SYN cookies are active. This means that the final
ACK in the 3WHS is misclassified.

Using the helper, we can look up the listening socket via
bpf_skc_lookup_tcp and then check whether a packet is a valid SYN
cookie ACK.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: add skc_lookup_tcp helper
Lorenz Bauer [Fri, 22 Mar 2019 01:54:01 +0000 (09:54 +0800)]
bpf: add skc_lookup_tcp helper

Allow looking up a sock_common. This gives eBPF programs
access to timewait and request sockets.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: allow helpers to return PTR_TO_SOCK_COMMON
Lorenz Bauer [Fri, 22 Mar 2019 01:54:00 +0000 (09:54 +0800)]
bpf: allow helpers to return PTR_TO_SOCK_COMMON

It's currently not possible to access timewait or request sockets
from eBPF, since there is no way to return a PTR_TO_SOCK_COMMON
from a helper. Introduce RET_PTR_TO_SOCK_COMMON to enable this
behaviour.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: track references based on is_acquire_func
Lorenz Bauer [Fri, 22 Mar 2019 01:53:59 +0000 (09:53 +0800)]
bpf: track references based on is_acquire_func

So far, the verifier only acquires reference tracking state for
RET_PTR_TO_SOCKET_OR_NULL. Instead of extending this for every
new return type which desires these semantics, acquire reference
tracking state iff the called helper is an acquire function.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: Add arm target register definitions
Adrian Ratiu [Wed, 20 Mar 2019 10:10:54 +0000 (12:10 +0200)]
selftests/bpf: Add arm target register definitions

eBPF "restricted C" code can be compiled with LLVM/clang using target
triplets like armv7l-unknown-linux-gnueabihf and loaded/run with small
cross-compiled gobpf/elf [1] programs without requiring a full BCC
port which is also undesirable on small embedded systems due to its
size footprint. The only missing pieces are these helper macros which
otherwise have to be redefined by each eBPF arm program.

[1] https://github.com/iovisor/gobpf/tree/master/elf

Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agonet: isdn: Make isdn_ppp_mp_discard and isdn_ppp_mp_reassembly static
YueHaibing [Wed, 20 Mar 2019 13:48:06 +0000 (21:48 +0800)]
net: isdn: Make isdn_ppp_mp_discard and isdn_ppp_mp_reassembly static

Fix sparse warnings:

drivers/isdn/i4l/isdn_ppp.c:1891:16: warning:
 symbol 'isdn_ppp_mp_discard' was not declared. Should it be static?
drivers/isdn/i4l/isdn_ppp.c:1903:6: warning:
 symbol 'isdn_ppp_mp_reassembly' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: Make hclge_destroy_cmd_queue static
YueHaibing [Wed, 20 Mar 2019 13:37:13 +0000 (21:37 +0800)]
net: hns3: Make hclge_destroy_cmd_queue static

Fix sparse warning:

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c:414:6:
 warning: symbol 'hclge_destroy_cmd_queue' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-refactor-ndo_select_queue'
David S. Miller [Wed, 20 Mar 2019 18:18:55 +0000 (11:18 -0700)]
Merge branch 'net-refactor-ndo_select_queue'

Paolo Abeni says:

====================
net: refactor ndo_select_queue()

Currently, on most devices implementing ndo_select_queue(), we get 2
indirect calls per xmit packet, at least in some scenarios.

We can avoid one of such indirect calls refactoring the ndo_select_queue()
usage so that we don't need anymore the 'fallback' argument.

The first patch renames a helper used later as a public API, the second one
changes the af packet implementation so that it uses the common infrastructure
to select the xmit queue, and the second patch drops the now unneeded argument
from ndo_select_queue().

Alternatively we could use the INDIRECT_CALL_WRAPPER infrastructure to avoid
the fallback indirect call in the common case, but this solution allows also
for some code cleanup.

 v1 -> v2:
  - renamed select queue helpers, as per Eric's and David's suggestions
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: remove 'fallback' argument from dev->ndo_select_queue()
Paolo Abeni [Wed, 20 Mar 2019 10:02:06 +0000 (11:02 +0100)]
net: remove 'fallback' argument from dev->ndo_select_queue()

After the previous patch, all the callers of ndo_select_queue()
provide as a 'fallback' argument netdev_pick_tx.
The only exceptions are nested calls to ndo_select_queue(),
which pass down the 'fallback' available in the current scope
- still netdev_pick_tx.

We can drop such argument and replace fallback() invocation with
netdev_pick_tx(). This avoids an indirect call per xmit packet
in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen)
with device drivers implementing such ndo. It also clean the code
a bit.

Tested with ixgbe and CONFIG_FCOE=m

With pktgen using queue xmit:
threads vanilla  patched
(kpps) (kpps)
1 2334 2428
2 4166 4278
4 7895 8100

 v1 -> v2:
 - rebased after helper's name change

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agopacket: rework packet_pick_tx_queue() to use common code selection
Paolo Abeni [Wed, 20 Mar 2019 10:02:05 +0000 (11:02 +0100)]
packet: rework packet_pick_tx_queue() to use common code selection

Currently packet_pick_tx_queue() is the only caller of
ndo_select_queue() using a fallback argument other than
netdev_pick_tx.

Leveraging rx queue, we can obtain a similar queue selection
behavior using core helpers. After this change, ndo_select_queue()
is always invoked with netdev_pick_tx() as fallback.
We can change ndo_select_queue() signature in a followup patch,
dropping an indirect call per transmitted packet in some scenarios
(e.g. TCP syn and XDP generic xmit)

This changes slightly how af packet queue selection happens when
PACKET_QDISC_BYPASS is set. It's now more similar to plan dev_queue_xmit()
tacking in account both XPS and TC mapping.

 v1  -> v2:
  - rebased after helper name change
 RFC -> v1:
  - initialize sender_cpu to the expected value

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dev: rename queue selection helpers.
Paolo Abeni [Wed, 20 Mar 2019 10:02:04 +0000 (11:02 +0100)]
net: dev: rename queue selection helpers.

With the following patches, we are going to use __netdev_pick_tx() in
many modules. Rename it to netdev_pick_tx(), to make it clear is
a public API.

Also rename the existing netdev_pick_tx() to netdev_core_pick_tx(),
to avoid name clashes.

Suggested-by: Eric Dumazet <edumazet@google.com>
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'qed-next'
David S. Miller [Wed, 20 Mar 2019 18:12:50 +0000 (11:12 -0700)]
Merge branch 'qed-next'

Sudarsana Reddy Kalluru says:

====================
qed* enhancements.

The patch series adds couple of enhancements for qed/qede drivers.
Please consider applying it to 'net-next' tree.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: Define new MF bit for no_vlan config
Sudarsana Reddy Kalluru [Wed, 20 Mar 2019 07:26:26 +0000 (00:26 -0700)]
qed: Define new MF bit for no_vlan config

The patch introduces a new Multi-Function bit for cases where firmware
shouldn't perform the insertion of vlan-0 tag. The new bit is defined to
abstract the implementation from the actual MF mode.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqede: Populate mbi version in ethtool driver query data.
Sudarsana Reddy Kalluru [Wed, 20 Mar 2019 07:26:25 +0000 (00:26 -0700)]
qede: Populate mbi version in ethtool driver query data.

The patch adds support to display MBI image version in 'ethtool -i' output.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomacvlan: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to real device
Hangbin Liu [Wed, 20 Mar 2019 02:23:33 +0000 (10:23 +0800)]
macvlan: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to real device

Similiar to commit a6111d3c93d0 ("vlan: Pass SIOC[SG]HWTSTAMP ioctls to
real device") and commit 37dd9255b2f6 ("vlan: Pass ethtool get_ts_info
queries to real device."), add MACVlan HW ptp support.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: bridge: use eth_broadcast_addr() to assign broadcast address
Mao Wenan [Wed, 20 Mar 2019 02:06:57 +0000 (10:06 +0800)]
net: bridge: use eth_broadcast_addr() to assign broadcast address

This patch is to use eth_broadcast_addr() to assign broadcast address
insetad of memset().

Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: Add support of AES128-CCM based ciphers
Vakul Garg [Wed, 20 Mar 2019 02:03:36 +0000 (02:03 +0000)]
net/tls: Add support of AES128-CCM based ciphers

Added support for AES128-CCM based record encryption. AES128-CCM is
similar to AES128-GCM. Both of them have same salt/iv/mac size. The
notable difference between the two is that while invoking AES128-CCM
operation, the salt||nonce (which is passed as IV) has to be prefixed
with a hardcoded value '2'. Further, CCM implementation in kernel
requires IV passed in crypto_aead_request() to be full '16' bytes.
Therefore, the record structure 'struct tls_rec' has been modified to
reserve '16' bytes for IV. This works for both GCM and CCM based cipher.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-phy-aquantia-add-interface-mode-handling'
David S. Miller [Wed, 20 Mar 2019 17:58:17 +0000 (10:58 -0700)]
Merge branch 'net-phy-aquantia-add-interface-mode-handling'

Heiner Kallweit says:

====================
net: phy: aquantia: add interface mode handling

These two patches add interface mode handling for the AQR107/AQCS109.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: aquantia: check for changed interface mode in read_status
Nikita Yushchenko [Tue, 19 Mar 2019 22:05:50 +0000 (23:05 +0100)]
net: phy: aquantia: check for changed interface mode in read_status

Depending on the auto-negotiated speed the PHY may change the interface
mode. Check for new mode and set phydev->interface accordingly.

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[hkallweit1@gmail.com: picked from bigger patch and reworked]
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: aquantia: check for supported interface modes in config_init
Andrew Lunn [Tue, 19 Mar 2019 22:04:38 +0000 (23:04 +0100)]
net: phy: aquantia: check for supported interface modes in config_init

Let config_init check for unsupported interface modes on AQR107/AQCS109.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[hkallweit1@gmail.com: adjusted for AQR107/AQCS109 specifics]
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: improve handling link_change_notify callback
Heiner Kallweit [Tue, 19 Mar 2019 18:56:51 +0000 (19:56 +0100)]
net: phy: improve handling link_change_notify callback

Currently the Phy driver's link_change_notify callback is called
whenever the state machine is run (every second if polling), no matter
whether the state changed or not. This isn't needed and may confuse
users considering the name of the callback. Actually it contradicts
its kernel-doc description. Therefore let's change the behavior and
call this callback only in case of an actual state change.

This requires changes to the at803x and rockchip drivers.
at803x can be simplified so that it reacts on a state change to
PHY_NOLINK only.
The rockchip driver can also be much simplified. We simply re-init
the AFE/DSP registers whenever we change to PHY_RUNNING and speed
is 100Mbps. This causes very small overhead because we do this even
if the speed was 100Mbps already. But this is negligible and
I think justified by the much simpler code.

Changes are compile-tested only.

A little bit problematic seems to be to find somebody with the
hardware to test the changes to the two PHY drivers. See also [0].
David may be able to test the Rockchip driver.

[0] https://marc.info/?t=153782508800006&r=1&w=2

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Wed, 20 Mar 2019 17:14:10 +0000 (10:14 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2019-03-19

This series contains updates to ice driver only.

Michal adds support for the pruning enable flag to avoid seeing
broadcast packets on different VLANs.

Akeem fixes an issue with VF queues being disabled and the VF netdev
network carrier being lost after reset. Fixed an issue issue when doing
PFR and CORER resets, where all VF VSIs need to be reset and rebuilt
with the main VSIs before replaying all VSIs.  Resolved an issue to
properly initialize VFs in the guest OS via PCI passthrough.

Bruce adds a local variable to avoid unnecessary de-references
throughout ice_probe().

Brett cleans up the code a bit by removing the need for a local variable
and re-designs the loop to simply return when get a successful result.
Cleans up the code to replace loop calls with a predefined macro to make
the code more consistent.  Updated the driver to ensure ITR granularity
is always 2 usecs. Refactors the calculation of VSIs per PF into a
general function that can calculate per PF allocations for not just VSIs
but across multiple resource types.  Improve the driver performance of
the driver when using the default settings by determining the ring size
and the number of descriptors for transmit and receive based on a
calculation with the PAGE_SIZE, ICE_MAX_NUM_DESC, and
ICE_REQ_DESC_MULTIPLE.

Chinh fixes an issue, where a reserved bit was possibly being set when
it should never be set.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Wed, 20 Mar 2019 17:12:03 +0000 (10:12 -0700)]
Merge branch '1GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2019-03-19

This series contains updates to e100, e1000, e1000e, igb, igc and
ixgbe.

Serhey Popovych fixes the return value for several of our older
drivers for netdev_update_features() to notify of changes applied.

Kai-Heng Feng fixes the WoL setting for system suspend, which should
not set to runtime suspend settings for igb.  Then fixes a power
management issue with e1000e for CNP+ devices.

Colin Ian King fixes whitespace issue (indentation), which helps with
readability.

Sasha provides the remaining changes for igc, including the enabling of
multi-queues to receive.  Added support for displaying and configuring
network flow classification (NFC) via ethtool.  Added additional
statistics and basic counters for igc.  Fixed a typo, so it aligns with
our other drivers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoice: Determine descriptor count and ring size based on PAGE_SIZE
Brett Creeley [Fri, 8 Feb 2019 20:50:59 +0000 (12:50 -0800)]
ice: Determine descriptor count and ring size based on PAGE_SIZE

Currently we set the default number of Tx and Rx descriptors to 128 by
default. For Rx this amounts to a full page (assuming 4K pages) because
each Rx descriptor is 32 Bytes, but for Tx it only amounts to a half
page because each Tx descriptor is 16 Bytes (assuming 4K pages).
Instead of assuming 4K pages, determine the ring size and the number of
descriptors for Tx and Rx based on a calculation using the PAGE_SIZE,
ICE_MAX_NUM_DESC, and ICE_REQ_DESC_MULTIPLE. This change is being made
to improve the performance of the driver when using the default
settings.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Reset all VFs with VFLR during SR-IOV init flow
Akeem G Abodunrin [Fri, 8 Feb 2019 20:50:58 +0000 (12:50 -0800)]
ice: Reset all VFs with VFLR during SR-IOV init flow

During SR-IOV initialization, we allocate and setup VFs with reset, and
since we were going to inform Firmware about our intention to do VFLR by
disabling LAN TX Queue, then we really have to complete VF reset flow with
VFLR using appropriate registers - Otherwise, reset status bit for VF in
the Guest OS might returns DEADBEEF.
This resolves issue to properly initialize VFs in the Guest OS via PCI
passthrough.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Get resources per function
Brett Creeley [Fri, 8 Feb 2019 20:50:57 +0000 (12:50 -0800)]
ice: Get resources per function

ice_get_guar_num_vsi currently calculates the number of VSIs per PF.
Rework this into a general function ice_get_num_per_func, that can
calculate per PF allocations for not just VSIs but across multiple
resource types.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Implement flow to reset VFs with PFR and other resets
Akeem G Abodunrin [Fri, 8 Feb 2019 20:50:56 +0000 (12:50 -0800)]
ice: Implement flow to reset VFs with PFR and other resets

All VF VSIs need to be reset and rebuild with the main VSIs before
replaying all VSIs, so that all existing switch filters, scheduler tree
and other configuration could be replayed at once. This fixes issues when
doing PFR and CORER reset.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: configure GLINT_ITR to always have an ITR gran of 2
Brett Creeley [Fri, 8 Feb 2019 20:50:55 +0000 (12:50 -0800)]
ice: configure GLINT_ITR to always have an ITR gran of 2

Instead of hoping that our ITR granularity will be 2 usec program the
GLINT_CTL register to make sure the ITR granularity is always 2 usecs.

Now that we know what the ITR granularity will be get rid of the check
in ice_probe() to verify our previous assumption.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: use ice_for_each_vsi macro when possible
Brett Creeley [Fri, 8 Feb 2019 20:50:54 +0000 (12:50 -0800)]
ice: use ice_for_each_vsi macro when possible

Replace all instances of:
for (i = 0; i < pf->num_alloc_vsi; i++)

with the following macro:
ice_for_each_vsi(pf, i)

This will allow the code to be consistent since there are currently
cases of using both.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice : Ensure only valid bits are set in ice_aq_set_phy_cfg
Chinh T Cao [Fri, 8 Feb 2019 20:50:52 +0000 (12:50 -0800)]
ice : Ensure only valid bits are set in ice_aq_set_phy_cfg

In the ice_aq_set_phy_cfg AQ command, the 16.4 bit is reserved. This
patch will make sure that this bit will never be set to 1.

Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: remove redundant variable and if condition
Brett Creeley [Fri, 8 Feb 2019 20:50:51 +0000 (12:50 -0800)]
ice: remove redundant variable and if condition

In ice_pf_rxq_wait we are using an unnecessary local variable and also
we are checking if the timeout time was reached after the loop. Get rid
of the local variable and return 0 right when we get a successful
result. This makes it so we can return -ETIMEDOUT if we ever exit the
loop because we know the timeout time has been hit.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: avoid multiple unnecessary de-references in probe
Bruce Allan [Fri, 8 Feb 2019 20:50:50 +0000 (12:50 -0800)]
ice: avoid multiple unnecessary de-references in probe

Add a local variable struct device *dev to avoid unnecessary de-references
throughout ice_probe().

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Fix issue with VF reset and multiple VFs support on PFs
Akeem G Abodunrin [Fri, 8 Feb 2019 20:50:49 +0000 (12:50 -0800)]
ice: Fix issue with VF reset and multiple VFs support on PFs

This patch fixes issues with VF queues being disabled, and VF netdev
network carrier being lost after reset. Basically, we need to check if VF
is enabled, and queue configured in reset_all_vfs flow, and disable/enable
those queues appropriately whenever the function is called after
Global/CORER/PFR reset/rebuild/replay.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Fix broadcast traffic in port VLAN mode
Michal Swiatkowski [Fri, 8 Feb 2019 20:50:48 +0000 (12:50 -0800)]
ice: Fix broadcast traffic in port VLAN mode

Set egress (Rx) pruning enable flag for VF VSI in VSI ctxt to
enable prune action.

To avoid seeing broadcast packet in different VLAN, pruning enable
flag in VSI ctxt should be set.

Write new functions (fill VSI ctx) to not repeat send ctxt code.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoigc: Remove unneeded hw_dbg prints
Sasha Neftin [Fri, 15 Mar 2019 15:12:07 +0000 (17:12 +0200)]
igc: Remove unneeded hw_dbg prints

Remove unneeded hw_dbg prints from igc_ethtool.c file.
Clean up code.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoigc: Fix the typo in igc_base.h header definition
Sasha Neftin [Mon, 11 Mar 2019 22:34:35 +0000 (00:34 +0200)]
igc: Fix the typo in igc_base.h header definition

Add the underline for the _IGC_BASE_H_.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoigc: Add support for the ntuple feature
Sasha Neftin [Wed, 20 Feb 2019 12:39:31 +0000 (14:39 +0200)]
igc: Add support for the ntuple feature

Copy the ntuple feature into list of user selectable features.
Enable the ntuple feature.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoigc: Add support for statistics
Sasha Neftin [Mon, 18 Feb 2019 08:37:31 +0000 (10:37 +0200)]
igc: Add support for statistics

Add support for statistics and show basic counters.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoMerge branch 'enc28j60-messaging-clean-up-and-ACPI-improvements'
David S. Miller [Tue, 19 Mar 2019 21:59:32 +0000 (14:59 -0700)]
Merge branch 'enc28j60-messaging-clean-up-and-ACPI-improvements'

Andy Shevchenko says:

====================
enc28j60: messaging clean up and ACPI improvements

Most of the patches in the series dedicated to update messaging to use modern
APIs, such as netdev, with a benefit to distinguish devices, if more than one
installed on the system.

Besides that, patch 1 targeting ACPI enabled systems when MAC address provided
there via properties.

And few clean ups are included, such as:
- switching to module_spi_driver()
- converting to use ether_addr_copy() API
- converting to use SPDX

Since v2:
- cover letter is added
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Convert to use SPDX identifier
Andy Shevchenko [Tue, 19 Mar 2019 18:49:30 +0000 (20:49 +0200)]
enc28j60: Convert to use SPDX identifier

Reduce size of duplicated comments by switching to use SPDX identifier.

No functional change.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Fix indentation splats
Andy Shevchenko [Tue, 19 Mar 2019 18:49:29 +0000 (20:49 +0200)]
enc28j60: Fix indentation splats

Fix few indentation splats. No functional change intended.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Amend comments by fixing typos, adding periods, etc
Andy Shevchenko [Tue, 19 Mar 2019 18:49:28 +0000 (20:49 +0200)]
enc28j60: Amend comments by fixing typos, adding periods, etc

Amend comments in the code:
 - adding periods to the multi-line comments
 - fixing typos
 - capitalize first word in the sentences
 - etc

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Remove linux/init.h
Andy Shevchenko [Tue, 19 Mar 2019 18:49:27 +0000 (20:49 +0200)]
enc28j60: Remove linux/init.h

There is no need to include linux/init.h when at the same time
we include linux/module.h.

Remove redundant inclusion.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Convert printk() to netdev_printk()
Andy Shevchenko [Tue, 19 Mar 2019 18:49:26 +0000 (20:49 +0200)]
enc28j60: Convert printk() to netdev_printk()

The debug prints of network operations will look better if network
device name is printed. The benefit of that is a possibility to distinguish
the actual hardware when more than one is installed on the system.

Convert appropriate printk(KERN_DEBUG) to netdev_print(KERN_DEBUG, ndev).

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Convert HW related printk() to dev_printk()
Andy Shevchenko [Tue, 19 Mar 2019 18:49:25 +0000 (20:49 +0200)]
enc28j60: Convert HW related printk() to dev_printk()

The debug prints of hardware status and operations will look better
if SPI device name is printed. The benefit of that is a possibility
to distinguish the actual hardware when more than one is installed
on the system.

Convert appropriate printk(KERN_DEBUG) to dev_print(KERN_DEBUG, &spi->dev).

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Switch to dev_<level> from pr_<level>
Andy Shevchenko [Tue, 19 Mar 2019 18:49:24 +0000 (20:49 +0200)]
enc28j60: Switch to dev_<level> from pr_<level>

Instead of using open coded printk(KERN_<LEVEL>) switch the driver to use
dev_<level> macros.

Note, the device name will be printed in full, which is beneficial when
more than one card installed on the system.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Use ether_addr_copy() in enc28j60_set_mac_address()
Andy Shevchenko [Tue, 19 Mar 2019 18:49:23 +0000 (20:49 +0200)]
enc28j60: Use ether_addr_copy() in enc28j60_set_mac_address()

Use ether_addr_copy() instead of memcpy() to copy the mac address.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Switch to use module_spi_driver() macro
Andy Shevchenko [Tue, 19 Mar 2019 18:49:22 +0000 (20:49 +0200)]
enc28j60: Switch to use module_spi_driver() macro

Eliminate some boilerplate code by using module_spi_driver() instead of
->init() / ->exit(), moving the salient bits from ->init() into ->probe().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Drop driver name duplication from messages
Andy Shevchenko [Tue, 19 Mar 2019 18:49:21 +0000 (20:49 +0200)]
enc28j60: Drop driver name duplication from messages

When dev_<level>() macros are used against SPI device, the driver's name
is printed as well. No need to duplicate this explicitly.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Replace dev_*(&netdev->dev, ...) with netdev_*()
Andy Shevchenko [Tue, 19 Mar 2019 18:49:20 +0000 (20:49 +0200)]
enc28j60: Replace dev_*(&netdev->dev, ...) with netdev_*()

Replace open coded netdev_<level>() macros.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Remove duplicate messaging
Andy Shevchenko [Tue, 19 Mar 2019 18:49:19 +0000 (20:49 +0200)]
enc28j60: Remove duplicate messaging

The ->probe() and ->remove() stages can be easily debugged with
initcall_debug or function tracer. There is no need to repeat the same
explicitly in the driver.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenc28j60: Use device_get_mac_address()
Andy Shevchenko [Tue, 19 Mar 2019 18:49:18 +0000 (20:49 +0200)]
enc28j60: Use device_get_mac_address()

Replace the DT-specific of_get_mac_address() function with
device_get_mac_address, which works on both DT and ACPI platforms.
This change makes it easier to add ACPI support.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoigc: Extend the ethtool supporting
Sasha Neftin [Thu, 14 Feb 2019 11:31:37 +0000 (13:31 +0200)]
igc: Extend the ethtool supporting

Add show and configure network flow classification (NFC) methods
to the ethtool. Show the specifies Rx ntuple filters.
Configures receive network flow classification option or rules.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoigc: Add multiple receive queues control supporting
Sasha Neftin [Wed, 6 Feb 2019 07:48:37 +0000 (09:48 +0200)]
igc: Add multiple receive queues control supporting

Enable the multi queues to receive.
Program the direction of packets to specified queues according
to the mode selected in the MRQC register.
Multiple receive queues defined by filters and RSS for 4 queues.
Enable/disable RSS hashing and also to enable multiple receive queues.
This patch will allow further ethtool support development.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoe1000e: Disable runtime PM on CNP+
Kai-Heng Feng [Sat, 2 Feb 2019 17:40:16 +0000 (01:40 +0800)]
e1000e: Disable runtime PM on CNP+

There are some new e1000e devices can only be woken up from D3 one time,
by plugging Ethernet cable. Subsequent cable plugging does set PME bit
correctly, but it still doesn't get woken up.

Since e1000e connects to the root complex directly, we rely on ACPI to
wake it up. In this case, the GPE from _PRW only works once and stops
working after that. Though it appears to be a platform bug, e1000e
maintainers confirmed that I219 does not support D3.

So disable runtime PM on CNP+ chips. We may need to disable earlier
generations if this bug also hit older platforms.

Bugzilla: https://bugzilla.kernel.org/attachment.cgi?id=280819
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoigb: fix various indentation issues
Colin Ian King [Wed, 16 Jan 2019 18:53:16 +0000 (18:53 +0000)]
igb: fix various indentation issues

There are some lines that have indentation issues, fix these

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoipv6: Add icmp_echo_ignore_multicast support for ICMPv6
Stephen Suryaputra [Tue, 19 Mar 2019 16:37:12 +0000 (12:37 -0400)]
ipv6: Add icmp_echo_ignore_multicast support for ICMPv6

IPv4 has icmp_echo_ignore_broadcast to prevent responding to broadcast pings.
IPv6 needs a similar mechanism.

v1->v2:
- Remove NET_IPV6_ICMP_ECHO_IGNORE_MULTICAST.

Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: macb: simplify getting .driver_data
Wolfram Sang [Tue, 19 Mar 2019 16:36:34 +0000 (17:36 +0100)]
net: macb: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoigb: Exclude device from suspend direct complete optimization
Kai-Heng Feng [Tue, 11 Dec 2018 07:59:38 +0000 (15:59 +0800)]
igb: Exclude device from suspend direct complete optimization

igb sets different WoL settings in system suspend callback and runtime
suspend callback.

The suspend direct complete optimization leaves igb in runtime suspended
state with wrong WoL setting during system suspend.

To fix this, we need to disable suspend direct complete optimization to
let igb always use suspend callback to set correct WoL during system
suspend.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agointel: correct return from set features callback
Serhey Popovych [Thu, 29 Mar 2018 14:51:36 +0000 (17:51 +0300)]
intel: correct return from set features callback

According to comments in <linux/netdevice.h> we should return either >0
or -errno from ->ndo_set_features() if changing dev->features by itself.

Return 1 in such places to notify netdev_update_features() about applied
changes in dev->features.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agonet: pasemi: Make pasemi_mac_init_module static
YueHaibing [Tue, 19 Mar 2019 15:39:06 +0000 (23:39 +0800)]
net: pasemi: Make pasemi_mac_init_module static

Fix sparse warning:

drivers/net/ethernet/pasemi/pasemi_mac.c:1842:5: warning:
 symbol 'pasemi_mac_init_module' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: free request sock directly upon TFO or syncookies error
Guillaume Nault [Tue, 19 Mar 2019 15:05:44 +0000 (16:05 +0100)]
tcp: free request sock directly upon TFO or syncookies error

Since the request socket is created locally, it'd make more sense to
use reqsk_free() instead of reqsk_put() in TFO and syncookies' error
path.

However, tcp_get_cookie_sock() may set ->rsk_refcnt before freeing the
socket; tcp_conn_request() may also have non-null ->rsk_refcnt because
of tcp_try_fastopen(). In both cases 'req' hasn't been exposed
to the outside world and is safe to free immediately, but that'd
trigger the WARN_ON_ONCE in reqsk_free().

Define __reqsk_free() for these situations where we know nobody's
referencing the socket, even though ->rsk_refcnt might be non-null.
Now we can consolidate the error path of tcp_get_cookie_sock() and
tcp_conn_request().

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodatagram: Make __skb_datagram_iter static
YueHaibing [Tue, 19 Mar 2019 14:59:46 +0000 (22:59 +0800)]
datagram: Make __skb_datagram_iter static

Fix sparse warning:

net/core/datagram.c:411:5: warning:
 symbol '__skb_datagram_iter' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: Make hclgevf_update_link_mode static
YueHaibing [Tue, 19 Mar 2019 14:46:41 +0000 (22:46 +0800)]
net: hns3: Make hclgevf_update_link_mode static

Fix sparse warning:

drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c:407:6:
 warning: symbol 'hclgevf_update_link_mode' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoibmveth: Make array ibmveth_stats static
YueHaibing [Tue, 19 Mar 2019 14:42:37 +0000 (22:42 +0800)]
ibmveth: Make array ibmveth_stats static

Fix sparse warning:
drivers/net/ethernet/ibm/ibmveth.c:96:21:
 warning: symbol 'ibmveth_stats' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: add tcp_inet6_sk() helper
Eric Dumazet [Tue, 19 Mar 2019 14:01:08 +0000 (07:01 -0700)]
tcp: add tcp_inet6_sk() helper

TCP ipv6 fast path dereferences a pointer to get to the inet6
part of a tcp socket, but given the fixed memory placement,
we can do better and avoid a possible cache line miss.

This also reduces register pressure, since we let the compiler
know about this memory placement.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoibmvnic: Report actual backing device speed and duplex values
Murilo Fossa Vicentini [Tue, 19 Mar 2019 13:28:51 +0000 (10:28 -0300)]
ibmvnic: Report actual backing device speed and duplex values

The ibmvnic driver currently reports a fixed value for both speed and
duplex settings regardless of the actual backing device that is being
used. By adding support to the QUERY_PHYS_PARMS command defined by the
PAPR+ we can query the current physical port state and report the proper
values for these feilds.

Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Murilo Fossa Vicentini <muvic@linux.ibm.com>
Reviewed-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotipc: smooth change between replicast and broadcast
Hoang Le [Tue, 19 Mar 2019 11:49:50 +0000 (18:49 +0700)]
tipc: smooth change between replicast and broadcast

Currently, a multicast stream may start out using replicast, because
there are few destinations, and then it should ideally switch to
L2/broadcast IGMP/multicast when the number of destinations grows beyond
a certain limit. The opposite should happen when the number decreases
below the limit.

To eliminate the risk of message reordering caused by method change,
a sending socket must stick to a previously selected method until it
enters an idle period of 5 seconds. Means there is a 5 seconds pause
in the traffic from the sender socket.

If the sender never makes such a pause, the method will never change,
and transmission may become very inefficient as the cluster grows.

With this commit, we allow such a switch between replicast and
broadcast without any need for a traffic pause.

Solution is to send a dummy message with only the header, also with
the SYN bit set, via broadcast or replicast. For the data message,
the SYN bit is set and sending via replicast or broadcast (inverse
method with dummy).

Then, at receiving side any messages follow first SYN bit message
(data or dummy message), they will be held in deferred queue until
another pair (dummy or data message) arrived in other link.

v2: reverse christmas tree declaration

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotipc: introduce new capability flag for cluster
Hoang Le [Tue, 19 Mar 2019 11:49:49 +0000 (18:49 +0700)]
tipc: introduce new capability flag for cluster

As a preparation for introducing a smooth switching between replicast
and broadcast method for multicast message, We have to introduce a new
capability flag TIPC_MCAST_RBCTL to handle this new feature.

During a cluster upgrade a node can come back with this new capabilities
which also must be reflected in the cluster capabilities field.
The new feature is only applicable if all node in the cluster supports
this new capability.

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotipc: support broadcast/replicast configurable for bc-link
Hoang Le [Tue, 19 Mar 2019 11:49:48 +0000 (18:49 +0700)]
tipc: support broadcast/replicast configurable for bc-link

Currently, a multicast stream uses either broadcast or replicast as
transmission method, based on the ratio between number of actual
destinations nodes and cluster size.

However, when an L2 interface (e.g., VXLAN) provides pseudo
broadcast support, this becomes very inefficient, as it blindly
replicates multicast packets to all cluster/subnet nodes,
irrespective of whether they host actual target sockets or not.

The TIPC multicast algorithm is able to distinguish real destination
nodes from other nodes, and hence provides a smarter and more
efficient method for transferring multicast messages than
pseudo broadcast can do.

Because of this, we now make it possible for users to force
the broadcast link to permanently switch to using replicast,
irrespective of which capabilities the bearer provides,
or pretend to provide.
Conversely, we also make it possible to force the broadcast link
to always use true broadcast. While maybe less useful in
deployed systems, this may at least be useful for testing the
broadcast algorithm in small clusters.

We retain the current AUTOSELECT ability, i.e., to let the broadcast link
automatically select which algorithm to use, and to switch back and forth
between broadcast and replicast as the ratio between destination
node number and cluster size changes. This remains the default method.

Furthermore, we make it possible to configure the threshold ratio for
such switches. The default ratio is now set to 10%, down from 25% in the
earlier implementation.

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovirtio_net: remove hcpu from virtnet_clean_affinity
Peter Xu [Mon, 18 Mar 2019 06:56:06 +0000 (14:56 +0800)]
virtio_net: remove hcpu from virtnet_clean_affinity

The variable is never used.

CC: Michael S. Tsirkin <mst@redhat.com>
CC: Jason Wang <jasowang@redhat.com>
CC: virtualization@lists.linux-foundation.org
CC: netdev@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoDocumentation: networking: Update netdev-FAQ regarding patches
Florian Fainelli [Mon, 18 Mar 2019 18:07:33 +0000 (11:07 -0700)]
Documentation: networking: Update netdev-FAQ regarding patches

Provide an explanation of what is expected with respect to sending new
versions of specific patches within a patch series, as well as what
happens if an earlier patch series accidentally gets merged).

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 's390-qeth-fixes'
David S. Miller [Tue, 19 Mar 2019 01:34:45 +0000 (18:34 -0700)]
Merge branch 's390-qeth-fixes'

Julian Wiedmann says:

====================
s390/qeth: fixes 2019-03-18

please apply the following three patches to -net. The first two are fixes
for minor race conditions in the probe code, while the third one gets
dropwatch working (again).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: be drop monitor friendly
Julian Wiedmann [Mon, 18 Mar 2019 15:40:56 +0000 (16:40 +0100)]
s390/qeth: be drop monitor friendly

As part of the TX completion path, qeth_release_skbs() frees the completed
skbs with __skb_queue_purge(). This ends in kfree_skb(), reporting every
completed skb as dropped.
On the other hand when dropping an skb in .ndo_start_xmit, we end up
calling consume_skb()... where we should be using kfree_skb() so that
drop monitors get notified.

Switch the drop/consume logic around, and also don't accumulate dropped
packets in the tx_errors statistics.

Fixes: dc149e3764d8 ("s390/qeth: replace open-coded skb_queue_walk()")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: fix race when initializing the IP address table
Julian Wiedmann [Mon, 18 Mar 2019 15:40:55 +0000 (16:40 +0100)]
s390/qeth: fix race when initializing the IP address table

The ucast IP table is utilized by some of the L3-specific sysfs attributes
that qeth_l3_create_device_attributes() provides. So initialize the table
_before_ registering the attributes.

Fixes: ebccc7397e4a ("s390/qeth: add missing hash table initializations")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agos390/qeth: don't erase configuration while probing
Julian Wiedmann [Mon, 18 Mar 2019 15:40:54 +0000 (16:40 +0100)]
s390/qeth: don't erase configuration while probing

The HW trap and VNICC configuration is exposed via sysfs, and may have
already been modified when qeth_l?_probe_device() attempts to initialize
them. So (1) initialize the VNICC values a little earlier, and (2) don't
bother about the HW trap mode, it was already initialized before.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomISDN: hfcpci: Test both vendor & device ID for Digium HFC4S
Bjorn Helgaas [Mon, 18 Mar 2019 13:51:06 +0000 (08:51 -0500)]
mISDN: hfcpci: Test both vendor & device ID for Digium HFC4S

The device ID alone does not uniquely identify a device.  Test both the
vendor and device ID to make sure we don't mistakenly think some other
vendor's 0xB410 device is a Digium HFC4S.  Also, instead of the bare hex
ID, use the same constant (PCI_DEVICE_ID_DIGIUM_HFC4S) used in the device
ID table.

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'sctp-fix-ignoring-asoc_id-for-tcp-style-sockets-on-some-setsockopts'
David S. Miller [Tue, 19 Mar 2019 01:31:09 +0000 (18:31 -0700)]
Merge branch 'sctp-fix-ignoring-asoc_id-for-tcp-style-sockets-on-some-setsockopts'

Xin Long says:

====================
sctp: fix ignoring asoc_id for tcp-style sockets on some setsockopts

This is a patchset to fix ignoring asoc_id for tcp-style sockets on
some setsockopts, introduced by SCTP_CURRENT_ASSOC of the patchset:

  [net-next,00/24] sctp: support SCTP_FUTURE/CURRENT/ALL_ASSOC
  (https://patchwork.ozlabs.org/cover/1031706/)

As Marcelo suggested, we fix it on each setsockopt that is using
SCTP_CURRENT_ASSOC one by one by adding the check:

    if (sctp_style(sk, TCP))
         xxx.xxx_assoc_id = SCTP_FUTURE_ASSOC;

so that assoc_id will be completely ingored for tcp-style socket on
setsockopts, and works as SCTP_FUTURE_ASSOC.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosctp: fix ignoring asoc_id for tcp-style sockets on SCTP_STREAM_SCHEDULER sockopt
Xin Long [Mon, 18 Mar 2019 12:06:11 +0000 (20:06 +0800)]
sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_STREAM_SCHEDULER sockopt

A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on
SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_STREAM_SCHEDULER sockopt.

Fixes: 7efba10d6bd2 ("sctp: add SCTP_FUTURE_ASOC and SCTP_CURRENT_ASSOC for SCTP_STREAM_SCHEDULER sockopt")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosctp: fix ignoring asoc_id for tcp-style sockets on SCTP_EVENT sockopt
Xin Long [Mon, 18 Mar 2019 12:06:10 +0000 (20:06 +0800)]
sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_EVENT sockopt

A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on
SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_EVENT sockopt.

Fixes: d251f05e3ba2 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_EVENT sockopt")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosctp: fix ignoring asoc_id for tcp-style sockets on SCTP_ENABLE_STREAM_RESET sockopt
Xin Long [Mon, 18 Mar 2019 12:06:09 +0000 (20:06 +0800)]
sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_ENABLE_STREAM_RESET sockopt

A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on
SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_ENABLE_STREAM_RESET sockopt.

Fixes: 99a62135e127 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_ENABLE_STREAM_RESET sockopt")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_PRINFO sockopt
Xin Long [Mon, 18 Mar 2019 12:06:08 +0000 (20:06 +0800)]
sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_DEFAULT_PRINFO sockopt

A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on
SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_DEFAULT_PRINFO sockopt.

Fixes: 3a583059d187 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_DEFAULT_PRINFO sockopt")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosctp: fix ignoring asoc_id for tcp-style sockets on SCTP_AUTH_DEACTIVATE_KEY sockopt
Xin Long [Mon, 18 Mar 2019 12:06:07 +0000 (20:06 +0800)]
sctp: fix ignoring asoc_id for tcp-style sockets on SCTP_AUTH_DEACTIVATE_KEY sockopt

A similar fix as Patch "sctp: fix ignoring asoc_id for tcp-style sockets on
SCTP_DEFAULT_SEND_PARAM sockopt" on SCTP_AUTH_DEACTIVATE_KEY sockopt.

Fixes: 2af66ff3edc7 ("sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_AUTH_DEACTIVATE_KEY sockopt")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>