linux-block.git
3 years agovhost: introduce vhost_vring_call
Zhu Lingshan [Fri, 31 Jul 2020 06:55:28 +0000 (14:55 +0800)]
vhost: introduce vhost_vring_call

This commit introduces struct vhost_vring_call which replaced
raw struct eventfd_ctx *call_ctx in struct vhost_virtqueue.
Besides eventfd_ctx, it contains a spin lock and an
irq_bypass_producer in its structure.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200731065533.4144-2-lingshan.zhu@intel.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovhost: Use flex_array_size() helper in copy_from_user()
Gustavo A. R. Silva [Fri, 31 Jul 2020 13:09:56 +0000 (08:09 -0500)]
vhost: Use flex_array_size() helper in copy_from_user()

Make use of the flex_array_size() helper to calculate the size of a
flexible array member within an enclosing structure.

This helper offers defense-in-depth against potential integer
overflows, while at the same time makes it explicitly clear that
we are dealing with a flexible array member.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20200731130956.GA30525@embeddedor
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovdpasim: protect concurrent access to iommu iotlb
Max Gurtovoy [Fri, 31 Jul 2020 07:38:22 +0000 (15:38 +0800)]
vdpasim: protect concurrent access to iommu iotlb

Iommu iotlb can be accessed by different cores for performing IO using
multiple virt queues. Add a spinlock to synchronize iotlb accesses.

This could be easily reproduced when using more than 1 pktgen threads
to inject traffic to vdpa simulator.

Fixes: 2c53d0f64c06f("vdpasim: vDPA device simulator")
Cc: stable@vger.kernel.org
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200731073822.13326-1-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovhost: vdpa: remove per device feature whitelist
Jason Wang [Mon, 20 Jul 2020 08:50:43 +0000 (16:50 +0800)]
vhost: vdpa: remove per device feature whitelist

We used to have a per device feature whitelist to filter out the
unsupported virtio features. But this seems unnecessary since:

- the main idea behind feature whitelist is to block control vq
  feature until we finalize the control virtqueue API. But the current
  vhost-vDPA uAPI is sufficient to support control virtqueue. For
  device that has hardware control virtqueue, the vDPA device driver
  can just setup the hardware virtqueue and let userspace to use
  hardware virtqueue directly. For device that doesn't have a control
  virtqueue, the vDPA device driver need to use e.g vringh to emulate
  a software control virtqueue.
- we don't do it in virtio-vDPA driver

So remove this limitation.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200720085043.16485-1-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_ring: Avoid loop when vq is broken in virtqueue_poll
Mao Wenan [Sun, 2 Aug 2020 07:44:09 +0000 (15:44 +0800)]
virtio_ring: Avoid loop when vq is broken in virtqueue_poll

The loop may exist if vq->broken is true,
virtqueue_get_buf_ctx_packed or virtqueue_get_buf_ctx_split
will return NULL, so virtnet_poll will reschedule napi to
receive packet, it will lead cpu usage(si) to 100%.

call trace as below:
virtnet_poll
virtnet_receive
virtqueue_get_buf_ctx
virtqueue_get_buf_ctx_packed
virtqueue_get_buf_ctx_split
virtqueue_napi_complete
virtqueue_poll           //return true
virtqueue_napi_schedule //it will reschedule napi

to fix this, return false if vq is broken in virtqueue_poll.

Signed-off-by: Mao Wenan <wenan.mao@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/1596354249-96204-1-git-send-email-wenan.mao@linux.alibaba.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
3 years agovirtio_net: use LE accessors for speed/duplex
Michael S. Tsirkin [Wed, 5 Aug 2020 09:39:36 +0000 (05:39 -0400)]
virtio_net: use LE accessors for speed/duplex

Speed and duplex config fields depend on VIRTIO_NET_F_SPEED_DUPLEX
which being 63>31 depends on VIRTIO_F_VERSION_1.

Accordingly, use LE accessors for these fields.

Reported-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_config: drop LE option from config space
Michael S. Tsirkin [Wed, 5 Aug 2020 11:29:12 +0000 (07:29 -0400)]
virtio_config: drop LE option from config space

All drivers now use virtio_cread/write_le for LE config
space fields. Drop LE option from virtio_cread/write, only leaving
the option to access transitional fields.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio-iommu: convert to LE accessors
Michael S. Tsirkin [Wed, 5 Aug 2020 09:39:36 +0000 (05:39 -0400)]
virtio-iommu: convert to LE accessors

Virtio iommu is modern-only. Use LE accessors for config space.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_mem: convert to LE accessors
Michael S. Tsirkin [Wed, 5 Aug 2020 09:39:36 +0000 (05:39 -0400)]
virtio_mem: convert to LE accessors

Virtio mem is modern-only. Use LE accessors for config space.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agodrm/virtio: convert to LE accessors
Michael S. Tsirkin [Wed, 5 Aug 2020 09:39:36 +0000 (05:39 -0400)]
drm/virtio: convert to LE accessors

Virtgpu is modern-only. Use LE accessors for config space.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_pmem: convert to LE accessors
Michael S. Tsirkin [Wed, 5 Aug 2020 09:39:36 +0000 (05:39 -0400)]
virtio_pmem: convert to LE accessors

Virtio pmem is modern-only. Use LE accessors for config space.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_crypto: convert to LE accessors
Michael S. Tsirkin [Wed, 5 Aug 2020 09:39:36 +0000 (05:39 -0400)]
virtio_crypto: convert to LE accessors

Virtio crypto is modern-only. Use LE accessors for config space.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_fs: convert to LE accessors
Michael S. Tsirkin [Wed, 5 Aug 2020 09:39:36 +0000 (05:39 -0400)]
virtio_fs: convert to LE accessors

Virtio fs is modern-only. Use LE accessors for config space.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_input: convert to LE accessors
Michael S. Tsirkin [Wed, 5 Aug 2020 09:39:36 +0000 (05:39 -0400)]
virtio_input: convert to LE accessors

Virtio input is modern-only. Use LE accessors for config space.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_balloon: use LE config space accesses
Michael S. Tsirkin [Tue, 4 Aug 2020 21:51:35 +0000 (17:51 -0400)]
virtio_balloon: use LE config space accesses

Balloon is LE, it's cleaner to access it as such directly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_config: add virtio_cread_le_feature
Michael S. Tsirkin [Wed, 5 Aug 2020 13:17:38 +0000 (09:17 -0400)]
virtio_config: add virtio_cread_le_feature

Mirrors virtio_cread_feature but for LE fields.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_caif: correct tags for config space fields
Michael S. Tsirkin [Tue, 4 Aug 2020 22:02:39 +0000 (18:02 -0400)]
virtio_caif: correct tags for config space fields

Tag config space fields as having virtio endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_config: LE config space accessors
Michael S. Tsirkin [Tue, 4 Aug 2020 21:33:08 +0000 (17:33 -0400)]
virtio_config: LE config space accessors

To be used by modern code, as well as to handle LE only fields such as
balloon.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_config: disallow native type fields (again)
Michael S. Tsirkin [Fri, 10 Jul 2020 11:55:52 +0000 (07:55 -0400)]
virtio_config: disallow native type fields (again)

_Generic version allowed __uXX types but that is no longer necessary:

Transitional devices should all use __virtioXX types (and __leXX for
fields not present in the legacy devices).
Modern ones should use __leXX.
_uXX type would be a bug.
Let's prevent that.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_config: rewrite using _Generic
Michael S. Tsirkin [Mon, 3 Aug 2020 20:08:11 +0000 (16:08 -0400)]
virtio_config: rewrite using _Generic

Min compiler version has been raised, so that's ok now.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_config: cread/write cleanup
Michael S. Tsirkin [Thu, 30 Jul 2020 20:12:40 +0000 (16:12 -0400)]
virtio_config: cread/write cleanup

Use vars of the correct type instead of casting.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovdpa_sim: fix endian-ness of config space
Michael S. Tsirkin [Sun, 12 Jul 2020 14:57:02 +0000 (10:57 -0400)]
vdpa_sim: fix endian-ness of config space

VDPA sim accesses config space as native endian - this is
wrong since it's a modern device and actually uses LE.

It only supports modern guests so we could punt and
just force LE, but let's use the full virtio APIs since people
tend to copy/paste code, and this is not data path anyway.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_vdpa: legacy features handling
Michael S. Tsirkin [Mon, 27 Jul 2020 14:59:02 +0000 (10:59 -0400)]
virtio_vdpa: legacy features handling

We normally expect vdpa to use the modern interface.
However for consistency, let's use same APIs as vhost
for legacy guests.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovhost/vdpa: switch to new helpers
Michael S. Tsirkin [Mon, 27 Jul 2020 14:58:18 +0000 (10:58 -0400)]
vhost/vdpa: switch to new helpers

For new helpers handling legacy features to be effective,
vhost needs to invoke them. Tie them in.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovdpa: make sure set_features is invoked for legacy
Michael S. Tsirkin [Mon, 27 Jul 2020 14:51:55 +0000 (10:51 -0400)]
vdpa: make sure set_features is invoked for legacy

Some legacy guests just assume features are 0 after reset.
We detect that config space is accessed before features are
set and set features to 0 automatically.
Note: some legacy guests might not even access config space, if this is
reported in the field we might need to catch a kick to handle these.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agomlxbf-tmfifo: sparse tags for config access
Michael S. Tsirkin [Sun, 12 Jul 2020 14:56:34 +0000 (10:56 -0400)]
mlxbf-tmfifo: sparse tags for config access

mlxbf-tmfifo accesses config space using native types -
which works for it since the legacy virtio native types.

This will break if it ever needs to support modern virtio,
so with new tags previously introduced for virtio net config,
sparse now warns for this in drivers.

Since this is a legacy only device, fix it up using
virtio_legacy_is_little_endian for now.

No functional changes.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
3 years agovirtio_config: disallow native type fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:55:52 +0000 (07:55 -0400)]
virtio_config: disallow native type fields

Transitional devices should all use __virtioXX types (and __leXX for
fields not present in legacy devices).
Modern ones should use __leXX.
_uXX type would be a bug.
Let's prevent that.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_scsi: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_scsi: correct tags for config space fields

Tag config space fields as having virtio endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_pmem: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_pmem: correct tags for config space fields

Since this is a modern-only device,
tag config space fields as having little endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_net: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_net: correct tags for config space fields

Tag config space fields as having virtio endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio_mem: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_mem: correct tags for config space fields

Since this is a modern-only device,
tag config space fields as having little endian-ness.

TODO: check other uses of __virtioXX types in this header,
should probably be __leXX.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_iommu: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_iommu: correct tags for config space fields

Since this is a modern-only device,
tag config space fields as having little endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_input: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_input: correct tags for config space fields

Since this is a modern-only device,
tag config space fields as having little endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_gpu: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_gpu: correct tags for config space fields

Since gpu is a modern-only device,
tag config space fields as having little endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_fs: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_fs: correct tags for config space fields

Since fs is a modern-only device,
tag config space fields as having little endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_crypto: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_crypto: correct tags for config space fields

Since crypto is a modern-only device,
tag config space fields as having little endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_console: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_console: correct tags for config space fields

Tag config space fields as having virtio endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_blk: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_blk: correct tags for config space fields

Tag config space fields as having virtio endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
3 years agovirtio_balloon: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_balloon: correct tags for config space fields

Tag config space fields as having little endian-ness.
Note that balloon is special: LE even when using
the legacy interface.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_9p: correct tags for config space fields
Michael S. Tsirkin [Fri, 10 Jul 2020 11:17:13 +0000 (07:17 -0400)]
virtio_9p: correct tags for config space fields

Tag config space fields as having virtio endian-ness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio: allow __virtioXX, __leXX in config space
Michael S. Tsirkin [Fri, 10 Jul 2020 07:20:21 +0000 (03:20 -0400)]
virtio: allow __virtioXX, __leXX in config space

Currently all config space fields are of the type __uXX.
This confuses people and some drivers (notably vdpa)
access them using CPU endian-ness - which only
works well for legacy or LE platforms.

Update virtio_cread/virtio_cwrite macros to allow __virtioXX
and __leXX field types. Follow-up patches will convert
config space to use these types.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_ring: sparse warning fixup
Michael S. Tsirkin [Fri, 10 Jul 2020 10:46:04 +0000 (06:46 -0400)]
virtio_ring: sparse warning fixup

virtio_store_mb was built with split ring in mind so it accepts
__virtio16 arguments. Packed ring uses __le16 values, so sparse
complains.  It's just a store with some barriers so let's convert it to
a macro, we don't loose too much type safety by doing that.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio_balloon: fix sparse warning
Michael S. Tsirkin [Fri, 10 Jul 2020 10:34:49 +0000 (06:34 -0400)]
virtio_balloon: fix sparse warning

balloon uses virtio32_to_cpu instead of cpu_to_virtio32
to convert a native endian number to virtio.
No practical difference but makes sparse warn.
Fix it up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
3 years agovirtio: virtio_has_iommu_quirk -> virtio_has_dma_quirk
Michael S. Tsirkin [Wed, 24 Jun 2020 23:17:04 +0000 (19:17 -0400)]
virtio: virtio_has_iommu_quirk -> virtio_has_dma_quirk

Now that the corresponding feature bit has been renamed,
rename the quirk too - it's about special ways to
do DMA, not necessarily about the IOMMU.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agovirtio: VIRTIO_F_IOMMU_PLATFORM -> VIRTIO_F_ACCESS_PLATFORM
Michael S. Tsirkin [Wed, 24 Jun 2020 22:24:33 +0000 (18:24 -0400)]
virtio: VIRTIO_F_IOMMU_PLATFORM -> VIRTIO_F_ACCESS_PLATFORM

Rename the bit to match latest virtio spec.
Add a compat macro to avoid breaking existing userspace.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
3 years agoLinux 5.8 v5.8
Linus Torvalds [Sun, 2 Aug 2020 21:21:45 +0000 (14:21 -0700)]
Linux 5.8

3 years agoMerge tag 'x86-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 2 Aug 2020 20:10:42 +0000 (13:10 -0700)]
Merge tag 'x86-urgent-2020-08-02' of git://git./linux/kernel/git/tip/tip

Pull x86 fix from Ingo Molnar:
 "A single fix for a potential deadlock when printing a message about
  spurious interrupts"

* tag 'x86-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/i8259: Use printk_deferred() to prevent deadlock

3 years agoMerge tag 'kbuild-fixes-v5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 2 Aug 2020 18:23:51 +0000 (11:23 -0700)]
Merge tag 'kbuild-fixes-v5.8-4' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - clean the generated moc file for xconfig

 - fix xconfig bugs, and revert some bad commits

* tag 'kbuild-fixes-v5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: remove redundant FORCE definition in scripts/Makefile.modpost
  kconfig: qconf: remove wrong ConfigList::firstChild()
  Revert "kconfig: qconf: don't show goback button on splitMode"
  Revert "kconfig: qconf: Change title for the item window"
  kconfig: qconf: remove "goBack" debug message
  kconfig: qconf: use delete[] instead of delete to free array
  kconfig: qconf: compile moc object separately
  kconfig: qconf: use if_changed for qconf.moc rule
  modpost: explain why we can't use strsep

3 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 2 Aug 2020 17:41:00 +0000 (10:41 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Bugfixes and strengthening the validity checks on inputs from new
  userspace APIs.

  Now I know why I shouldn't prepare pull requests on the weekend, it's
  hard to concentrate if your son is shouting about his latest Minecraft
  builds in your ear. Fortunately all the patches were ready and I just
  had to check the test results..."

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SVM: Fix disable pause loop exit/pause filtering capability on SVM
  KVM: LAPIC: Prevent setting the tscdeadline timer if the lapic is hw disabled
  KVM: arm64: Don't inherit exec permission across page-table levels
  KVM: arm64: Prevent vcpu_has_ptrauth from generating OOL functions
  KVM: nVMX: check for invalid hdr.vmx.flags
  KVM: nVMX: check for required but missing VMCS12 in KVM_SET_NESTED_STATE
  selftests: kvm: do not set guest mode flag

3 years agokbuild: remove redundant FORCE definition in scripts/Makefile.modpost
Masahiro Yamada [Sun, 2 Aug 2020 13:54:40 +0000 (22:54 +0900)]
kbuild: remove redundant FORCE definition in scripts/Makefile.modpost

The same code exists a few lines above.

Fixes: 436b2ac603d5 ("modpost: invoke modpost only when input files are updated")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agokconfig: qconf: remove wrong ConfigList::firstChild()
Masahiro Yamada [Sat, 1 Aug 2020 07:08:50 +0000 (16:08 +0900)]
kconfig: qconf: remove wrong ConfigList::firstChild()

This function returns the first child object, but the returned pointer
is not compatible with (ConfigItem *).

Commit cc1c08edccaf ("kconfig: qconf: don't show goback button on
splitMode") uncovered this issue because using the pointer from this
function would make qconf crash. (https://lkml.org/lkml/2020/7/18/411)

This function does not work. Remove.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Sat, 1 Aug 2020 23:47:24 +0000 (16:47 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Pull networking fixes from David Miller:

 1) Encap offset calculation is incorrect in esp6, from Sabrina Dubroca.

 2) Better parameter validation in pfkey_dump(), from Mark Salyzyn.

 3) Fix several clang issues on powerpc in selftests, from Tanner Love.

 4) cmsghdr_from_user_compat_to_kern() uses the wrong length, from Al
    Viro.

 5) Out of bounds access in mlx5e driver, from Raed Salem.

 6) Fix transfer buffer memleak in lan78xx, from Johan Havold.

 7) RCU fixups in rhashtable, from Herbert Xu.

 8) Fix ipv6 nexthop refcnt leak, from Xiyu Yang.

 9) vxlan FDB dump must be done under RCU, from Ido Schimmel.

10) Fix use after free in mlxsw, from Ido Schimmel.

11) Fix map leak in HASH_OF_MAPS bpf code, from Andrii Nakryiko.

12) Fix bug in mac80211 Tx ack status reporting, from Vasanthakumar
    Thiagarajan.

13) Fix memory leaks in IPV6_ADDRFORM code, from Cong Wang.

14) Fix bpf program reference count leaks in mlx5 during
    mlx5e_alloc_rq(), from Xin Xiong.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits)
  vxlan: fix memleak of fdb
  rds: Prevent kernel-infoleak in rds_notify_queue_get()
  net/sched: The error lable position is corrected in ct_init_module
  net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq
  net/mlx5e: E-Switch, Specify flow_source for rule with no in_port
  net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring
  net/mlx5e: CT: Support restore ipv6 tunnel
  net: gemini: Fix missing clk_disable_unprepare() in error path of gemini_ethernet_port_probe()
  ionic: unlock queue mutex in error path
  atm: fix atm_dev refcnt leaks in atmtcp_remove_persistent
  net: ethernet: mtk_eth_soc: fix MTU warnings
  net: nixge: fix potential memory leak in nixge_probe()
  devlink: ignore -EOPNOTSUPP errors on dumpit
  rxrpc: Fix race between recvmsg and sendmsg on immediate call failure
  MAINTAINERS: Replace Thor Thayer as Altera Triple Speed Ethernet maintainer
  selftests/bpf: fix netdevsim trap_flow_action_cookie read
  ipv6: fix memory leaks on IPV6_ADDRFORM path
  net/bpfilter: Initialize pos in __bpfilter_process_sockopt
  igb: reinit_locked() should be called with rtnl_lock
  e1000e: continue to init PHY even when failed to disable ULP
  ...

3 years agoMerge tag 'for-linus-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 1 Aug 2020 23:40:59 +0000 (16:40 -0700)]
Merge tag 'for-linus-2020-08-01' of git://git./linux/kernel/git/brauner/linux

Pull thread fix from Christian Brauner:
 "A simple spelling fix for dequeue_synchronous_signal()"

* tag 'for-linus-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  signal: fix typo in dequeue_synchronous_signal()

3 years agoMerge tag 'perf-tools-fixes-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 1 Aug 2020 20:08:50 +0000 (13:08 -0700)]
Merge tag 'perf-tools-fixes-2020-08-01' of git://git./linux/kernel/git/acme/linux

Pull perf tooling fixes from Arnaldo Carvalho de Melo:

 - Fix libtraceevent build with binutils 2.35

 - Fix memory leak in process_dynamic_array_len in libtraceevent

 - Fix 'perf test 68' zstd compression for s390

 - Fix record failure when mixed with ARM SPE event

* tag 'perf-tools-fixes-2020-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  libtraceevent: Fix build with binutils 2.35
  perf tools: Fix record failure when mixed with ARM SPE event
  perf tests: Fix test 68 zstd compression for s390
  tools lib traceevent: Fix memory leak in process_dynamic_array_len

3 years agovxlan: fix memleak of fdb
Taehee Yoo [Sat, 1 Aug 2020 07:07:50 +0000 (07:07 +0000)]
vxlan: fix memleak of fdb

When vxlan interface is deleted, all fdbs are deleted by vxlan_flush().
vxlan_flush() flushes fdbs but it doesn't delete fdb, which contains
all-zeros-mac because it is deleted by vxlan_uninit().
But vxlan_uninit() deletes only the fdb, which contains both all-zeros-mac
and default vni.
So, the fdb, which contains both all-zeros-mac and non-default vni
will not be deleted.

Test commands:
    ip link add vxlan0 type vxlan dstport 4789 external
    ip link set vxlan0 up
    bridge fdb add to 00:00:00:00:00:00 dst 172.0.0.1 dev vxlan0 via lo \
    src_vni 10000 self permanent
    ip link del vxlan0

kmemleak reports as follows:
unreferenced object 0xffff9486b25ced88 (size 96):
  comm "bridge", pid 2151, jiffies 4294701712 (age 35506.901s)
  hex dump (first 32 bytes):
    02 00 00 00 ac 00 00 01 40 00 09 b1 86 94 ff ff  ........@.......
    46 02 00 00 00 00 00 00 a7 03 00 00 12 b5 6a 6b  F.............jk
  backtrace:
    [<00000000c10cf651>] vxlan_fdb_append.part.51+0x3c/0xf0 [vxlan]
    [<000000006b31a8d9>] vxlan_fdb_create+0x184/0x1a0 [vxlan]
    [<0000000049399045>] vxlan_fdb_update+0x12f/0x220 [vxlan]
    [<0000000090b1ef00>] vxlan_fdb_add+0x12a/0x1b0 [vxlan]
    [<0000000056633c2c>] rtnl_fdb_add+0x187/0x270
    [<00000000dd5dfb6b>] rtnetlink_rcv_msg+0x264/0x490
    [<00000000fc44dd54>] netlink_rcv_skb+0x4a/0x110
    [<00000000dff433e7>] netlink_unicast+0x18e/0x250
    [<00000000b87fb421>] netlink_sendmsg+0x2e9/0x400
    [<000000002ed55153>] ____sys_sendmsg+0x237/0x260
    [<00000000faa51c66>] ___sys_sendmsg+0x88/0xd0
    [<000000006c3982f1>] __sys_sendmsg+0x4e/0x80
    [<00000000a8f875d2>] do_syscall_64+0x56/0xe0
    [<000000003610eefa>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
unreferenced object 0xffff9486b1c40080 (size 128):
  comm "bridge", pid 2157, jiffies 4294701754 (age 35506.866s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 f8 dc 42 b2 86 94 ff ff  ..........B.....
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  backtrace:
    [<00000000a2981b60>] vxlan_fdb_create+0x67/0x1a0 [vxlan]
    [<0000000049399045>] vxlan_fdb_update+0x12f/0x220 [vxlan]
    [<0000000090b1ef00>] vxlan_fdb_add+0x12a/0x1b0 [vxlan]
    [<0000000056633c2c>] rtnl_fdb_add+0x187/0x270
    [<00000000dd5dfb6b>] rtnetlink_rcv_msg+0x264/0x490
    [<00000000fc44dd54>] netlink_rcv_skb+0x4a/0x110
    [<00000000dff433e7>] netlink_unicast+0x18e/0x250
    [<00000000b87fb421>] netlink_sendmsg+0x2e9/0x400
    [<000000002ed55153>] ____sys_sendmsg+0x237/0x260
    [<00000000faa51c66>] ___sys_sendmsg+0x88/0xd0
    [<000000006c3982f1>] __sys_sendmsg+0x4e/0x80
    [<00000000a8f875d2>] do_syscall_64+0x56/0xe0
    [<000000003610eefa>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 3ad7a4b141eb ("vxlan: support fdb and learning in COLLECT_METADATA mode")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'pinctrl-v5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Sat, 1 Aug 2020 17:11:42 +0000 (10:11 -0700)]
Merge tag 'pinctrl-v5.8-4' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pin control fix from Linus Walleij:
 "A single last minute pin control fix to the Qualcomm driver fixing
  missing dual edge PCH interrupts"

* tag 'pinctrl-v5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: qcom: Handle broken/missing PDC dual edge IRQs on sc7180

3 years agoRevert "kconfig: qconf: don't show goback button on splitMode"
Masahiro Yamada [Sat, 1 Aug 2020 07:08:49 +0000 (16:08 +0900)]
Revert "kconfig: qconf: don't show goback button on splitMode"

This reverts commit cc1c08edccaf5317d99a17a3231fe06381044e83.

Maxim Levitsky reports 'make xconfig' crashes since that commit
(https://lkml.org/lkml/2020/7/18/411)

Or, the following is simple test code that makes it crash:

    menu "Menu"

    config FOO
            bool "foo"
            default y

    menuconfig BAR
            bool "bar"
            depends on FOO

    endmenu

Select the Split View mode, and double-click "bar" in the right
window, then you will see Segmentation fault.

When 'last' is not set for symbolMode, the following code in
ConfigList::updateList() calls firstChild().

  item = last ? last->nextSibling() : firstChild();

However, the pointer returned by ConfigList::firstChild() does not
seem to be compatible with (ConfigItem *), which seems another bug.

I'd rather want to reconsider whether hiding the goback icon is the
right thing to do.

In the following test code, the Split View shows "Menu2" and "Menu3"
in the right window. You can descend into "Menu3", but there is no way
to ascend back to "Menu2" from "Menu3".

    menu "Menu1"

    config FOO
            bool "foo"
            default y

    menu "Menu2"
            depends on FOO

    menu "Menu3"

    config BAZ
            bool "baz"

    endmenu

    endmenu

    endmenu

It is true that the goback button is currently not functional due to
yet another bug, but hiding the problem is not the right way to go.

Anyway, Segmentation fault is fatal. Revert the offending commit for
now, and we should find the right solution.

Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agoRevert "kconfig: qconf: Change title for the item window"
Masahiro Yamada [Wed, 29 Jul 2020 17:46:17 +0000 (02:46 +0900)]
Revert "kconfig: qconf: Change title for the item window"

This reverts commit 5752ff07fd90d764d96e3c586cc95c09598abfdd.

It added dead code to ConfigList:ConfigList().

The constructor of ConfigList has the initializer, mode(singleMode).

    if (mode == symbolMode)
           setHeaderLabels(QStringList() << "Item" << "Name" << "N" << "M" << "Y" << "Value");
    else
           setHeaderLabels(QStringList() << "Option" << "Name" << "N" << "M" << "Y" << "Value");

... always takes the else part.

The change to ConfigList::updateSelection() is strange too.
When you click the split view icon for the first time, the titles in
both windows show "Option". After you click something in the right
window, the title suddenly changes to "Item".

ConfigList::updateSelection() is not the right place to do this,
at least. It was not a good idea, I think.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agokconfig: qconf: remove "goBack" debug message
Masahiro Yamada [Wed, 29 Jul 2020 17:12:40 +0000 (02:12 +0900)]
kconfig: qconf: remove "goBack" debug message

Every time the goback icon is clicked, the annoying message "goBack"
is displayed on the console.

I guess this line is the left-over debug code of commit af737b4defe1
("kconfig: qconf: simplify the goBack() logic").

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agokconfig: qconf: use delete[] instead of delete to free array
Masahiro Yamada [Wed, 29 Jul 2020 17:02:39 +0000 (02:02 +0900)]
kconfig: qconf: use delete[] instead of delete to free array

cppcheck reports "Mismatching allocation and deallocation".

$ cppcheck scripts/kconfig/qconf.cc
Checking scripts/kconfig/qconf.cc ...
scripts/kconfig/qconf.cc:1242:10: error: Mismatching allocation and deallocation: data [mismatchAllocDealloc]
  delete data;
         ^
scripts/kconfig/qconf.cc:1236:15: note: Mismatching allocation and deallocation: data
 char *data = new char[count + 1];
              ^
scripts/kconfig/qconf.cc:1242:10: note: Mismatching allocation and deallocation: data
  delete data;
         ^
scripts/kconfig/qconf.cc:1255:10: error: Mismatching allocation and deallocation: data [mismatchAllocDealloc]
  delete data;
         ^
scripts/kconfig/qconf.cc:1236:15: note: Mismatching allocation and deallocation: data
 char *data = new char[count + 1];
              ^
scripts/kconfig/qconf.cc:1255:10: note: Mismatching allocation and deallocation: data
  delete data;
         ^

Fixes: c4f7398bee9c ("kconfig: qconf: make debug links work again")
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agokconfig: qconf: compile moc object separately
Masahiro Yamada [Wed, 29 Jul 2020 17:02:38 +0000 (02:02 +0900)]
kconfig: qconf: compile moc object separately

Currently, qconf.moc is included from qconf.cc but they can be compiled
independently.

When you modify qconf.cc, qconf.moc does not need recompiling.

Rename qconf.moc to qconf-moc.cc, and split it out as an independent
compilation unit.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agokconfig: qconf: use if_changed for qconf.moc rule
Masahiro Yamada [Wed, 29 Jul 2020 17:02:37 +0000 (02:02 +0900)]
kconfig: qconf: use if_changed for qconf.moc rule

Regenerate qconf.moc when the moc command is changed.

This also allows 'make mrproper' to clean it up. Previously, it was
not cleaned up because 'clean-files += qconf.moc' was missing.
Now 'make mrproper' correctly cleans it up because files listed in
'targets' are cleaned.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
David S. Miller [Sat, 1 Aug 2020 00:19:47 +0000 (17:19 -0700)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2020-07-31

The following pull-request contains BPF updates for your *net* tree.

We've added 5 non-merge commits during the last 21 day(s) which contain
a total of 5 files changed, 126 insertions(+), 18 deletions(-).

The main changes are:

1) Fix a map element leak in HASH_OF_MAPS map type, from Andrii Nakryiko.

2) Fix a NULL pointer dereference in __btf_resolve_helper_id() when no
   btf_vmlinux is available, from Peilin Ye.

3) Init pos variable in __bpfilter_process_sockopt(), from Christoph Hellwig.

4) Fix a cgroup sockopt verifier test by specifying expected attach type,
   from Jean-Philippe Brucker.

Note that when net gets merged into net-next later on, there is a small
merge conflict in kernel/bpf/btf.c between commit 5b801dfb7feb ("bpf: Fix
NULL pointer dereference in __btf_resolve_helper_id()") from the bpf tree
and commit 138b9a0511c7 ("bpf: Remove btf_id helpers resolving") from the
net-next tree.

Resolve as follows: remove the old hunk with the __btf_resolve_helper_id()
function. Change the btf_resolve_helper_id() so it actually tests for a
NULL btf_vmlinux and bails out:

int btf_resolve_helper_id(struct bpf_verifier_log *log,
                          const struct bpf_func_proto *fn, int arg)
{
        int id;

        if (fn->arg_type[arg] != ARG_PTR_TO_BTF_ID || !btf_vmlinux)
                return -EINVAL;
        id = fn->btf_id[arg];
        if (!id || id > btf_vmlinux->nr_types)
                return -EINVAL;
        return id;
}

Let me know if you run into any others issues (CC'ing Jiri Olsa so he's in
the loop with regards to merge conflict resolution).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
David S. Miller [Sat, 1 Aug 2020 00:10:53 +0000 (17:10 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2020-07-31

1) Fix policy matching with mark and mask on userspace interfaces.
   From Xin Long.

2) Several fixes for the new ESP in TCP encapsulation.
   From Sabrina Dubroca.

3) Fix crash when the hold queue is used. The assumption that
   xdst->path and dst->child are not a NULL pointer only if dst->xfrm
   is not a NULL pointer is true with the exception of using the
   hold queue. Fix this by checking for hold queue usage before
   dereferencing xdst->path or dst->child.

4) Validate pfkey_dump parameter before sending them.
   From Mark Salyzyn.

5) Fix the location of the transport header with ESP in UDPv6
   encapsulation. From Sabrina Dubroca.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'mlx5-fixes-2020-07-30' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Sat, 1 Aug 2020 00:05:54 +0000 (17:05 -0700)]
Merge tag 'mlx5-fixes-2020-07-30' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2020-07-30

This small patchset introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

For -stable v4.18:
 ('net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq')

For -stable v5.7:
 ('net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agords: Prevent kernel-infoleak in rds_notify_queue_get()
Peilin Ye [Thu, 30 Jul 2020 19:20:26 +0000 (15:20 -0400)]
rds: Prevent kernel-infoleak in rds_notify_queue_get()

rds_notify_queue_get() is potentially copying uninitialized kernel stack
memory to userspace since the compiler may leave a 4-byte hole at the end
of `cmsg`.

In 2016 we tried to fix this issue by doing `= { 0 };` on `cmsg`, which
unfortunately does not always initialize that 4-byte hole. Fix it by using
memset() instead.

Cc: stable@vger.kernel.org
Fixes: f037590fff30 ("rds: fix a leak of kernel memory")
Fixes: bdbe6fbc6a2f ("RDS: recv.c")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Fri, 31 Jul 2020 23:51:58 +0000 (16:51 -0700)]
Merge branch '1GbE' of git://git./linux/kernel/git/jkirsher/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2020-07-30

This series contains updates to the e1000e and igb drivers.

Aaron Ma allows PHY initialization to continue if ULP disable failed for
e1000e.

Francesco Ruggeri fixes race conditions in igb reset that could cause panics.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/sched: The error lable position is corrected in ct_init_module
liujian [Thu, 30 Jul 2020 08:14:28 +0000 (16:14 +0800)]
net/sched: The error lable position is corrected in ct_init_module

Exchange the positions of the err_tbl_init and err_register labels in
ct_init_module function.

Fixes: c34b961a2492 ("net/sched: act_ct: Create nf flow table per zone")
Signed-off-by: liujian <liujian56@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Fri, 31 Jul 2020 19:50:54 +0000 (12:50 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Some I2C core improvements to prevent NULL pointer usage and a
  MAINTAINERS update"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: slave: add sanity check when unregistering
  i2c: slave: improve sanity check when registering
  MAINTAINERS: Update GENI I2C maintainers list
  i2c: also convert placeholder function to return errno

3 years agoMerge tag 'powerpc-5.8-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Fri, 31 Jul 2020 16:38:39 +0000 (09:38 -0700)]
Merge tag 'powerpc-5.8-8' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
 "Fix a bug introduced by the changes we made to lockless page table
  walking this cycle.

  When using the hash MMU, and perf with callchain recording, we can
  deadlock if the PMI interrupts a hash fault, and the callchain
  recording then takes a hash fault on the same page.

  Thanks to Nicholas Piggin, Aneesh Kumar K.V, Anton Blanchard, and
  Athira Rajeev"

* tag 'powerpc-5.8-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s/hash: Fix hash_preload running with interrupts enabled

3 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 31 Jul 2020 16:36:03 +0000 (09:36 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "The main one is to fix the build after Willy's per-cpu entropy changes
  this week. Although that was already resolved elsewhere, the arm64 fix
  here is useful cleanup anyway.

  Other than that, we've got a fix for building with Clang's integrated
  assembler and a fix to make our IPv4 checksumming robust against
  invalid header lengths (this only seems to be triggerable by injected
  errors).

   - Fix build breakage due to circular headers

   - Fix build regression when using Clang's integrated assembler

   - Fix IPv4 header checksum code to deal with invalid length field

   - Fix broken path for Arm PMU entry in MAINTAINERS"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  MAINTAINERS: Include drivers subdirs for ARM PMU PROFILING AND DEBUGGING entry
  arm64: csum: Fix handling of bad packets
  arm64: Drop unnecessary include from asm/smp.h
  arm64/alternatives: move length validation inside the subsection

3 years agoMerge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Linus Torvalds [Fri, 31 Jul 2020 16:33:45 +0000 (09:33 -0700)]
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:

 - avoid invoking overflow handler for uaccess watchpoints

 - fix incorrect clock_gettime64 availability

 - fix EFI crash in create_mapping_late()

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8988/1: mmu: fix crash in EFI calls due to p4d typo in create_mapping_late()
  ARM: 8987/1: VDSO: Fix incorrect clock_gettime64
  ARM: 8986/1: hw_breakpoint: Don't invoke overflow handler on uaccess watchpoints

3 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Fri, 31 Jul 2020 16:22:10 +0000 (09:22 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Two more merge window regressions, a corruption bug in hfi1 and a few
  other small fixes.

   - Missing user input validation regression in ucma

   - Disallowing a previously allowed user combination regression in
     mlx5

   - ODP prefetch memory leaking triggerable by userspace

   - Memory corruption in hf1 due to faulty ring buffer logic

   - Missed mutex initialization crash in mlx5

   - Two small defects with RDMA DIM"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/core: Free DIM memory in error unwind
  RDMA/core: Stop DIM before destroying CQ
  RDMA/mlx5: Initialize QP mutex for the debug kernels
  IB/rdmavt: Fix RQ counting issues causing use of an invalid RWQE
  RDMA/mlx5: Allow providing extra scatter CQE QP flag
  RDMA/mlx5: Fix prefetch memory leak if get_prefetchable_mr fails
  RDMA/cm: Add min length checks to user structure copies

3 years agoMerge tag 'sound-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Fri, 31 Jul 2020 16:17:24 +0000 (09:17 -0700)]
Merge tag 'sound-5.8' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A few wrap-up small fixes for the usual HD-audio and USB-audio stuff:

   - A regression fix for S3 suspend on old Intel platforms

   - A fix for possible Oops in ASoC HD-audio binding

   - Trivial quirks for various devices"

* tag 'sound-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Fixed HP right speaker no sound
  ALSA: hda: fix NULL pointer dereference during suspend
  ALSA: hda/hdmi: Fix keep_power assignment for non-component devices
  ALSA: hda: Workaround for spurious wakeups on some Intel platforms
  ALSA: hda/realtek: Fix add a "ultra_low_power" function for intel reference board (alc256)
  ALSA: hda/realtek: typo_fix: enable headset mic of ASUS ROG Zephyrus G14(GA401) series with ALC289
  ALSA: hda/realtek: enable headset mic of ASUS ROG Zephyrus G15(GA502) series with ALC289
  ALSA: usb-audio: Add implicit feedback quirk for SSL2

3 years agolibtraceevent: Fix build with binutils 2.35
Ben Hutchings [Sat, 25 Jul 2020 01:06:23 +0000 (02:06 +0100)]
libtraceevent: Fix build with binutils 2.35

In binutils 2.35, 'nm -D' changed to show symbol versions along with
symbol names, with the usual @@ separator.  When generating
libtraceevent-dynamic-list we need just the names, so strip off the
version suffix if present.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-trace-devel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tools: Fix record failure when mixed with ARM SPE event
Wei Li [Fri, 24 Jul 2020 07:11:10 +0000 (15:11 +0800)]
perf tools: Fix record failure when mixed with ARM SPE event

When recording with cache-misses and arm_spe_x event, I found that it
will just fail without showing any error info if i put cache-misses
after 'arm_spe_x' event.

  [root@localhost 0620]# perf record -e cache-misses \
-e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.067 MB perf.data ]
  [root@localhost 0620]#
  [root@localhost 0620]# perf record -e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ \
     -e  cache-misses sleep 1
  [root@localhost 0620]#

The current code can only work if the only event to be traced is an
'arm_spe_x', or if it is the last event to be specified. Otherwise the
last event type will be checked against all the arm_spe_pmus[i]->types,
none will match and an out of bound 'i' index will be used in
arm_spe_recording_init().

We don't support concurrent multiple arm_spe_x events currently, that
is checked in arm_spe_recording_options(), and it will show the relevant
info. So add the check and record of the first found 'arm_spe_pmu' to
fix this issue here.

Fixes: ffd3d18c20b8 ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200724071111.35593-2-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tests: Fix test 68 zstd compression for s390
Thomas Richter [Wed, 29 Jul 2020 13:53:14 +0000 (15:53 +0200)]
perf tests: Fix test 68 zstd compression for s390

Commit 5aa98879efe7 ("s390/cpum_sf: prohibit callchain data collection")
prohibits call graph sampling for hardware events on s390. The
information recorded is out of context and does not match.

On s390 this commit now breaks test case 68 Zstd perf.data
compression/decompression.

Therefore omit call graph sampling on s390 in this test.

Output before:
  [root@t35lp46 perf]# ./perf test -Fv 68
  68: Zstd perf.data compression/decompression              :
  --- start ---
  Collecting compressed record file:
  Error:
  cycles: PMU Hardware doesn't support sampling/overflow-interrupts.
                                Try 'perf stat'
  ---- end ----
  Zstd perf.data compression/decompression: FAILED!
  [root@t35lp46 perf]#

Output after:
[root@t35lp46 perf]# ./perf test -Fv 68
  68: Zstd perf.data compression/decompression              :
  --- start ---
  Collecting compressed record file:
  500+0 records in
  500+0 records out
  256000 bytes (256 kB, 250 KiB) copied, 0.00615638 s, 41.6 MB/s
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.004 MB /tmp/perf.data.X3M,
                        compressed (original 0.002 MB, ratio is 3.609) ]
  Checking compressed events stats:
  # compressed : Zstd, level = 1, ratio = 4
        COMPRESSED events:          1
  2ELIFREPh---- end ----
  Zstd perf.data compression/decompression: Ok
  [root@t35lp46 perf]#

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200729135314.91281-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agotools lib traceevent: Fix memory leak in process_dynamic_array_len
Philippe Duplessis-Guindon [Thu, 30 Jul 2020 15:02:36 +0000 (11:02 -0400)]
tools lib traceevent: Fix memory leak in process_dynamic_array_len

I compiled with AddressSanitizer and I had these memory leaks while I
was using the tep_parse_format function:

    Direct leak of 28 byte(s) in 4 object(s) allocated from:
        #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe)
        #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985
        #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140
        #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206
        #4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291
        #5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299
        #6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849
        #7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161
        #8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207
        #9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786
        #10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285
        #11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369
        #12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335
        #13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389
        #14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431
        #15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251
        #16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284
        #17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593
        #18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727
        #19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048
        #20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127
        #21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152
        #22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252
        #23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347
        #24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461
        #25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673
        #26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

The token variable in the process_dynamic_array_len function is
allocated in the read_expect_type function, but is not freed before
calling the read_token function.

Free the token variable before calling read_token in order to plug the
leak.

Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoKVM: SVM: Fix disable pause loop exit/pause filtering capability on SVM
Wanpeng Li [Fri, 31 Jul 2020 03:12:21 +0000 (11:12 +0800)]
KVM: SVM: Fix disable pause loop exit/pause filtering capability on SVM

'Commit 8566ac8b8e7c ("KVM: SVM: Implement pause loop exit logic in SVM")'
drops disable pause loop exit/pause filtering capability completely, I
guess it is a merge fault by Radim since disable vmexits capabilities and
pause loop exit for SVM patchsets are merged at the same time. This patch
reintroduces the disable pause loop exit/pause filtering capability support.

Reported-by: Haiwei Li <lihaiwei@tencent.com>
Tested-by: Haiwei Li <lihaiwei@tencent.com>
Fixes: 8566ac8b ("KVM: SVM: Implement pause loop exit logic in SVM")
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1596165141-28874-3-git-send-email-wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: LAPIC: Prevent setting the tscdeadline timer if the lapic is hw disabled
Wanpeng Li [Fri, 31 Jul 2020 03:12:19 +0000 (11:12 +0800)]
KVM: LAPIC: Prevent setting the tscdeadline timer if the lapic is hw disabled

Prevent setting the tscdeadline timer if the lapic is hw disabled.

Fixes: bce87cce88 (KVM: x86: consolidate different ways to test for in-kernel LAPIC)
Cc: <stable@vger.kernel.org>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1596165141-28874-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoMerge tag 'drm-fixes-2020-07-31' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 31 Jul 2020 04:26:42 +0000 (21:26 -0700)]
Merge tag 'drm-fixes-2020-07-31' of git://anongit.freedesktop.org/drm/drm

Pull more drm fixes from Dave Airlie:
 "As mentioned previously this contains the nouveau regression fix.

  amdgpu had three fixes outstanding as well, one revert, an info leak
  and use after free. The use after free is a bit trickier than I'd
  like, and I've personally gone over it to confirm I'm happy that it is
  doing what it says.

  nouveau:
   - final modifiers regression fix

  amdgpu:
   - Revert a fix which caused other regressions
   - Fix potential kernel info leak
   - Fix a use-after-free bug that was uncovered by another change in 5.7"

* tag 'drm-fixes-2020-07-31' of git://anongit.freedesktop.org/drm/drm:
  drm/nouveau: Accept 'legacy' format modifiers
  Revert "drm/amdgpu: Fix NULL dereference in dpm sysfs handlers"
  drm/amd/display: Clear dm_state for fast updates
  drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl()

3 years agoMerge tag 'amd-drm-fixes-5.8-2020-07-30' of git://people.freedesktop.org/~agd5f/linux...
Dave Airlie [Fri, 31 Jul 2020 03:04:00 +0000 (13:04 +1000)]
Merge tag 'amd-drm-fixes-5.8-2020-07-30' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

amd-drm-fixes-5.8-2020-07-30:

amdgpu:
- Revert a fix which caused other regressions
- Fix potential kernel info leak
- Fix a use-after-free bug that was uncovered by another change in 5.7

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200730154338.244104-1-alexander.deucher@amd.com
3 years agodrm/nouveau: Accept 'legacy' format modifiers
James Jones [Thu, 30 Jul 2020 23:58:23 +0000 (16:58 -0700)]
drm/nouveau: Accept 'legacy' format modifiers

Accept the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK()
family of modifiers to handle broken userspace
Xorg modesetting and Mesa drivers. Existing Mesa
drivers are still aware of only these older
format modifiers which do not differentiate
between different variations of the block linear
layout. When the format modifier support flag was
flipped in the nouveau kernel driver, the X.org
modesetting driver began attempting to use its
format modifier-enabled framebuffer path. Because
the set of format modifiers advertised by the
kernel prior to this change do not intersect with
the set of format modifiers advertised by Mesa,
allocating GBM buffers using format modifiers
fails and the modesetting driver falls back to
non-modifier allocation. However, it still later
queries the modifier of the GBM buffer when
creating its DRM-KMS framebuffer object, receives
the old-format modifier from Mesa, and attempts
to create a framebuffer with it. Since the kernel
is still not aware of these formats, this fails.

Userspace should not be attempting to query format
modifiers of GBM buffers allocated with a non-
format-modifier-aware allocation path, but to
avoid breaking existing userspace behavior, this
change accepts the old-style format modifiers when
creating framebuffers and applying them to planes
by translating them to the equivalent new-style
modifier. To accomplish this, some layout
parameters must be assumed to match properties of
the device targeted by the relevant ioctls. To
avoid perpetuating misuse of the old-style
modifiers, this change does not advertise support
for them. Doing so would imply compatibility
between devices with incompatible memory layouts.

Tested with Xorg 1.20 modesetting driver,
weston@c46c70dac84a4b3030cd05b380f9f410536690fc,
gnome & KDE wayland desktops from Ubuntu 18.04,
and sway 1.5

Reported-by: Kirill A. Shutemov <kirill@shutemov.name>
Fixes: fa4f4c213f5f ("drm/nouveau/kms: Support NVIDIA format modifiers")
Link: https://lkml.org/lkml/2020/6/30/1251
Signed-off-by: James Jones <jajones@nvidia.com>
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
3 years agonet/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq
Xin Xiong [Thu, 30 Jul 2020 10:29:41 +0000 (18:29 +0800)]
net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq

The function invokes bpf_prog_inc(), which increases the reference
count of a bpf_prog object "rq->xdp_prog" if the object isn't NULL.

The refcount leak issues take place in two error handling paths. When
either mlx5_wq_ll_create() or mlx5_wq_cyc_create() fails, the function
simply returns the error code and forgets to drop the reference count
increased earlier, causing a reference count leak of "rq->xdp_prog".

Fix this issue by jumping to the error handling path err_rq_wq_destroy
while either function fails.

Fixes: 422d4c401edd ("net/mlx5e: RX, Split WQ objects for different RQ types")
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
3 years agonet/mlx5e: E-Switch, Specify flow_source for rule with no in_port
Jianbo Liu [Fri, 3 Jul 2020 02:34:23 +0000 (02:34 +0000)]
net/mlx5e: E-Switch, Specify flow_source for rule with no in_port

The flow_source must be specified, even for rule without matching
source vport, because some actions are only allowed in uplink.
Otherwise, rule can't be offloaded and firmware syndrome happens.

Fixes: 6fb0701a9cfa ("net/mlx5: E-Switch, Add support for offloading rules with no in_port")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
3 years agonet/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring
Jianbo Liu [Thu, 2 Jul 2020 01:06:37 +0000 (01:06 +0000)]
net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring

The modified flow_context fields in FTE must be indicated in
modify_enable bitmask. Previously, the misc bit in modify_enable is
always set as source vport must be set for each rule. So, when parsing
vxlan/gre/geneve/qinq rules, this bit is not set because those are all
from the same misc fileds that source vport fields are located at, and
we don't need to set the indicator twice.

After adding per vport tables for mirroring, misc bit is not set, then
firmware syndrome happens. To fix it, set the bit wherever misc fileds
are changed. This also makes it unnecessary to check misc fields and set
the misc bit accordingly in metadata matching, so here remove it.

Besides, flow_source must be specified for uplink because firmware
will check it and some actions are only allowed for packets received
from uplink.

Fixes: 96e326878fa5 ("net/mlx5e: Eswitch, Use per vport tables for mirroring")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
3 years agonet/mlx5e: CT: Support restore ipv6 tunnel
Jianbo Liu [Mon, 20 Jul 2020 01:36:45 +0000 (01:36 +0000)]
net/mlx5e: CT: Support restore ipv6 tunnel

Currently the driver restores only IPv4 tunnel headers.
Add support for restoring IPv6 tunnel header.

Fixes: b8ce90370977 ("net/mlx5e: Restore tunnel metadata on miss")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
3 years agoMerge tag 'mac80211-for-davem-2020-07-30' of git://git.kernel.org/pub/scm/linux/kerne...
David S. Miller [Fri, 31 Jul 2020 00:47:34 +0000 (17:47 -0700)]
Merge tag 'mac80211-for-davem-2020-07-30' of git://git./linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
A couple of more changes:
 * remove a warning that can trigger in certain races
 * check a function pointer before using it
 * check before adding 6 GHz to avoid a warning in mesh
 * fix two memory leaks in mesh
 * fix a TX status bug leading to a memory leak
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: gemini: Fix missing clk_disable_unprepare() in error path of gemini_ethernet_por...
Wang Hai [Thu, 30 Jul 2020 07:30:00 +0000 (15:30 +0800)]
net: gemini: Fix missing clk_disable_unprepare() in error path of gemini_ethernet_port_probe()

Fix the missing clk_disable_unprepare() before return
from gemini_ethernet_port_probe() in the error handling case.

Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: unlock queue mutex in error path
Shannon Nelson [Wed, 29 Jul 2020 17:52:17 +0000 (10:52 -0700)]
ionic: unlock queue mutex in error path

On an error return, jump to the unlock at the end to be sure
to unlock the queue_lock mutex.

Fixes: 0925e9db4dc8 ("ionic: use mutex to protect queue operations")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoatm: fix atm_dev refcnt leaks in atmtcp_remove_persistent
Xin Xiong [Wed, 29 Jul 2020 13:06:59 +0000 (21:06 +0800)]
atm: fix atm_dev refcnt leaks in atmtcp_remove_persistent

atmtcp_remove_persistent() invokes atm_dev_lookup(), which returns a
reference of atm_dev with increased refcount or NULL if fails.

The refcount leaks issues occur in two error handling paths. If
dev_data->persist is zero or PRIV(dev)->vcc isn't NULL, the function
returns 0 without decreasing the refcount kept by a local variable,
resulting in refcount leaks.

Fix the issue by adding atm_dev_put() before returning 0 both when
dev_data->persist is zero or PRIV(dev)->vcc isn't NULL.

Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ethernet: mtk_eth_soc: fix MTU warnings
Landen Chao [Wed, 29 Jul 2020 08:15:17 +0000 (10:15 +0200)]
net: ethernet: mtk_eth_soc: fix MTU warnings

in recent kernel versions there are warnings about incorrect MTU size
like these:

eth0: mtu greater than device maximum
mtk_soc_eth 1b100000.ethernet eth0: error -22 setting MTU to include DSA overhead

Fixes: bfcb813203e6 ("net: dsa: configure the MTU for switch ports")
Fixes: 72579e14a1d3 ("net: dsa: don't fail to probe if we couldn't set the MTU")
Fixes: 7a4c53bee332 ("net: report invalid mtu value via netlink extack")
Signed-off-by: Landen Chao <landen.chao@mediatek.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: nixge: fix potential memory leak in nixge_probe()
Lu Wei [Wed, 29 Jul 2020 03:50:05 +0000 (11:50 +0800)]
net: nixge: fix potential memory leak in nixge_probe()

If some processes in nixge_probe() fail, free_netdev(dev)
needs to be called to aviod a memory leak.

Fixes: 87ab207981ec ("net: nixge: Separate ctrl and dma resources")
Fixes: abcd3d6fc640 ("net: nixge: Fix error path for obtaining mac address")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: ignore -EOPNOTSUPP errors on dumpit
Jakub Kicinski [Tue, 28 Jul 2020 23:15:07 +0000 (16:15 -0700)]
devlink: ignore -EOPNOTSUPP errors on dumpit

Number of .dumpit functions try to ignore -EOPNOTSUPP errors.
Recent change missed that, and started reporting all errors
but -EMSGSIZE back from dumps. This leads to situation like
this:

$ devlink dev info
devlink answers: Operation not supported

Dump should not report an error just because the last device
to be queried could not provide an answer.

To fix this and avoid similar confusion make sure we clear
err properly, and not leave it set to an error if we don't
terminate the iteration.

Fixes: c62c2cfb801b ("net: devlink: don't ignore errors during dumpit")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agorxrpc: Fix race between recvmsg and sendmsg on immediate call failure
David Howells [Tue, 28 Jul 2020 23:03:56 +0000 (00:03 +0100)]
rxrpc: Fix race between recvmsg and sendmsg on immediate call failure

There's a race between rxrpc_sendmsg setting up a call, but then failing to
send anything on it due to an error, and recvmsg() seeing the call
completion occur and trying to return the state to the user.

An assertion fails in rxrpc_recvmsg() because the call has already been
released from the socket and is about to be released again as recvmsg deals
with it.  (The recvmsg_q queue on the socket holds a ref, so there's no
problem with use-after-free.)

We also have to be careful not to end up reporting an error twice, in such
a way that both returns indicate to userspace that the user ID supplied
with the call is no longer in use - which could cause the client to
malfunction if it recycles the user ID fast enough.

Fix this by the following means:

 (1) When sendmsg() creates a call after the point that the call has been
     successfully added to the socket, don't return any errors through
     sendmsg(), but rather complete the call and let recvmsg() retrieve
     them.  Make sendmsg() return 0 at this point.  Further calls to
     sendmsg() for that call will fail with ESHUTDOWN.

     Note that at this point, we haven't send any packets yet, so the
     server doesn't yet know about the call.

 (2) If sendmsg() returns an error when it was expected to create a new
     call, it means that the user ID wasn't used.

 (3) Mark the call disconnected before marking it completed to prevent an
     oops in rxrpc_release_call().

 (4) recvmsg() will then retrieve the error and set MSG_EOR to indicate
     that the user ID is no longer known by the kernel.

An oops like the following is produced:

kernel BUG at net/rxrpc/recvmsg.c:605!
...
RIP: 0010:rxrpc_recvmsg+0x256/0x5ae
...
Call Trace:
 ? __init_waitqueue_head+0x2f/0x2f
 ____sys_recvmsg+0x8a/0x148
 ? import_iovec+0x69/0x9c
 ? copy_msghdr_from_user+0x5c/0x86
 ___sys_recvmsg+0x72/0xaa
 ? __fget_files+0x22/0x57
 ? __fget_light+0x46/0x51
 ? fdget+0x9/0x1b
 do_recvmmsg+0x15e/0x232
 ? _raw_spin_unlock+0xa/0xb
 ? vtime_delta+0xf/0x25
 __x64_sys_recvmmsg+0x2c/0x2f
 do_syscall_64+0x4c/0x78
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 357f5ef64628 ("rxrpc: Call rxrpc_release_call() on error in rxrpc_new_client_call()")
Reported-by: syzbot+b54969381df354936d96@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMAINTAINERS: Replace Thor Thayer as Altera Triple Speed Ethernet maintainer
Joyce Ooi [Mon, 27 Jul 2020 09:46:41 +0000 (17:46 +0800)]
MAINTAINERS: Replace Thor Thayer as Altera Triple Speed Ethernet maintainer

This patch is to replace Thor Thayer as Altera Triple Speed Ethernet
maintainer as he is moving to a different role.

Signed-off-by: Joyce Ooi <joyce.ooi@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests/bpf: fix netdevsim trap_flow_action_cookie read
Hangbin Liu [Mon, 27 Jul 2020 11:04:55 +0000 (19:04 +0800)]
selftests/bpf: fix netdevsim trap_flow_action_cookie read

When read netdevsim trap_flow_action_cookie, we need to init it first,
or we will get "Invalid argument" error.

Fixes: d3cbb907ae57 ("netdevsim: add ACL trap reporting cookie as a metadata")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoipv6: fix memory leaks on IPV6_ADDRFORM path
Cong Wang [Sat, 25 Jul 2020 22:40:53 +0000 (15:40 -0700)]
ipv6: fix memory leaks on IPV6_ADDRFORM path

IPV6_ADDRFORM causes resource leaks when converting an IPv6 socket
to IPv4, particularly struct ipv6_ac_socklist. Similar to
struct ipv6_mc_socklist, we should just close it on this path.

This bug can be easily reproduced with the following C program:

  #include <stdio.h>
  #include <string.h>
  #include <sys/types.h>
  #include <sys/socket.h>
  #include <arpa/inet.h>

  int main()
  {
    int s, value;
    struct sockaddr_in6 addr;
    struct ipv6_mreq m6;

    s = socket(AF_INET6, SOCK_DGRAM, 0);
    addr.sin6_family = AF_INET6;
    addr.sin6_port = htons(5000);
    inet_pton(AF_INET6, "::ffff:192.168.122.194", &addr.sin6_addr);
    connect(s, (struct sockaddr *)&addr, sizeof(addr));

    inet_pton(AF_INET6, "fe80::AAAA", &m6.ipv6mr_multiaddr);
    m6.ipv6mr_interface = 5;
    setsockopt(s, SOL_IPV6, IPV6_JOIN_ANYCAST, &m6, sizeof(m6));

    value = AF_INET;
    setsockopt(s, SOL_IPV6, IPV6_ADDRFORM, &value, sizeof(value));

    close(s);
    return 0;
  }

Reported-by: ch3332xr@gmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/bpfilter: Initialize pos in __bpfilter_process_sockopt
Christoph Hellwig [Thu, 30 Jul 2020 16:09:00 +0000 (18:09 +0200)]
net/bpfilter: Initialize pos in __bpfilter_process_sockopt

__bpfilter_process_sockopt never initialized the pos variable passed
to the pipe write. This has been mostly harmless in the past as pipes
ignore the offset, but the switch to kernel_write now verified the
position, which can lead to a failure depending on the exact stack
initialization pattern. Initialize the variable to zero to make
rw_verify_area happy.

Fixes: 6955a76fbcd5 ("bpfilter: switch to kernel_write")
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Reported-by: Rodrigo Madera <rodrigo.madera@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Rodrigo Madera <rodrigo.madera@gmail.com>
Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/bpf/20200730160900.187157-1-hch@lst.de
3 years agoMerge tag 'kvmarm-fixes-5.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmar...
Paolo Bonzini [Thu, 30 Jul 2020 22:10:26 +0000 (18:10 -0400)]
Merge tag 'kvmarm-fixes-5.8-4' of git://git./linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm64 fixes for Linux 5.8, take #3

- Fix a corner case of a new mapping inheriting exec permission without
  and yet bypassing invalidation of the I-cache
- Make sure PtrAuth predicates oinly generate inline code for the
  non-VHE hypervisor code