linux-2.6-block.git
3 years agopowerpc/rtas: dispatch partition migration requests to pseries
Nathan Lynch [Mon, 7 Dec 2020 21:51:47 +0000 (15:51 -0600)]
powerpc/rtas: dispatch partition migration requests to pseries

sys_rtas() cannot call ibm,suspend-me directly in the same way it
handles other inputs. Instead it must dispatch the request to code
that can first perform the H_JOIN sequence before any call to
ibm,suspend-me can succeed. Over time kernel/rtas.c has accreted a fair
amount of platform-specific code to implement this.

Since a different, more robust implementation of the suspend sequence
is now in the pseries platform code, we want to dispatch the request
there.

Note that invoking ibm,suspend-me via the RTAS syscall is all but
deprecated; this change preserves ABI compatibility for old programs
while providing to them the benefit of the new partition suspend
implementation. This is a behavior change in that the kernel performs
the device tree update and firmware activation before returning, but
experimentation indicates this is tolerated fine by legacy user space.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-16-nathanl@linux.ibm.com
3 years agopowerpc/pseries/mobility: retry partition suspend after error
Nathan Lynch [Mon, 7 Dec 2020 21:51:46 +0000 (15:51 -0600)]
powerpc/pseries/mobility: retry partition suspend after error

This is a mitigation for the relatively rare occurrence where a
virtual IOA can be in a transient state that prevents the
suspend/migration from succeeding, resulting in an error from
ibm,suspend-me.

If the join/suspend sequence returns an error, it is acceptable to
retry as long as the VASI suspend session state is still
"Suspending" (i.e. the platform is still waiting for the OS to
suspend).

Retry a few times on suspend failure while this condition holds,
progressively increasing the delay between attempts. We don't want to
retry indefinitey because firmware emits an error log event on each
unsuccessful attempt.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-15-nathanl@linux.ibm.com
3 years agopowerpc/pseries/mobility: signal suspend cancellation to platform
Nathan Lynch [Mon, 7 Dec 2020 21:51:45 +0000 (15:51 -0600)]
powerpc/pseries/mobility: signal suspend cancellation to platform

If we're returning an error to user space, use H_VASI_SIGNAL to send a
cancellation request to the platform. This isn't strictly required but
it communicates that Linux will not attempt to complete the suspend,
which allows the various entities involved to promptly end the
operation in progress.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-14-nathanl@linux.ibm.com
3 years agopowerpc/pseries/mobility: use stop_machine for join/suspend
Nathan Lynch [Mon, 7 Dec 2020 21:51:44 +0000 (15:51 -0600)]
powerpc/pseries/mobility: use stop_machine for join/suspend

The partition suspend sequence as specified in the platform
architecture requires that all active processor threads call
H_JOIN, which:

- suspends the calling thread until it is the target of
  an H_PROD; or
- immediately returns H_CONTINUE, if the calling thread is the last to
  call H_JOIN. This thread is expected to call ibm,suspend-me to
  completely suspend the partition.

Upon returning from ibm,suspend-me the calling thread must wake all
others using H_PROD.

rtas_ibm_suspend_me_unsafe() uses on_each_cpu() to implement this
protocol, but because of its synchronizing nature this is susceptible
to deadlock versus users of stop_machine() or other callers of
on_each_cpu().

Not only is stop_machine() intended for use cases like this, it
handles error propagation and allows us to keep the data shared
between CPUs minimal: a single atomic counter which ensures exactly
one CPU will wake the others from their joined states.

Switch the migration code to use stop_machine() and a less complex
local implementation of the H_JOIN/ibm,suspend-me logic, which
carries additional benefits:

- more informative error reporting, appropriately ratelimited
- resets the lockup detector / watchdog on resume to prevent lockup
  warnings when the OS has been suspended for a time exceeding the
  threshold.

Fixes: 91dc182ca6e2 ("[PATCH] powerpc: special-case ibm,suspend-me RTAS call")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-13-nathanl@linux.ibm.com
3 years agopowerpc/pseries/mobility: extract VASI session polling logic
Nathan Lynch [Mon, 7 Dec 2020 21:51:43 +0000 (15:51 -0600)]
powerpc/pseries/mobility: extract VASI session polling logic

The behavior of rtas_ibm_suspend_me_unsafe() is to return -EAGAIN to
the caller until the specified VASI suspend session state makes the
transition from H_VASI_ENABLED to H_VASI_SUSPENDING. In the interest
of separating concerns to prepare for a new implementation of the
join/suspend sequence, extract VASI session polling logic into a
couple of local functions. Waiting for the session state to reach
H_VASI_SUSPENDING before calling rtas_ibm_suspend_me_unsafe() ensures
that we will never get an EAGAIN result necessitating a retry. No
user-visible change in behavior is intended.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-12-nathanl@linux.ibm.com
3 years agopowerpc/pseries/mobility: use rtas_activate_firmware() on resume
Nathan Lynch [Mon, 7 Dec 2020 21:51:42 +0000 (15:51 -0600)]
powerpc/pseries/mobility: use rtas_activate_firmware() on resume

It's incorrect to abort post-suspend processing if
ibm,activate-firmware isn't available. Use rtas_activate_firmware(),
which logs this condition appropriately and allows us to proceed.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-11-nathanl@linux.ibm.com
3 years agopowerpc/pseries/mobility: error message improvements
Nathan Lynch [Mon, 7 Dec 2020 21:51:41 +0000 (15:51 -0600)]
powerpc/pseries/mobility: error message improvements

- Convert printk(KERN_ERR) to pr_err().
- Include errno in property update failure message.
- Remove reference to "Post-mobility" from device tree update message:
  with pr_err() it will have a "mobility:" prefix.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-10-nathanl@linux.ibm.com
3 years agopowerpc/pseries/mobility: add missing break to default case
Nathan Lynch [Mon, 7 Dec 2020 21:51:40 +0000 (15:51 -0600)]
powerpc/pseries/mobility: add missing break to default case

update_dt_node() has a switch statement where the default case lacks a
break statement.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-9-nathanl@linux.ibm.com
3 years agopowerpc/pseries/mobility: don't error on absence of ibm, update-nodes
Nathan Lynch [Mon, 7 Dec 2020 21:51:39 +0000 (15:51 -0600)]
powerpc/pseries/mobility: don't error on absence of ibm, update-nodes

Treat the absence of the ibm,update-nodes function as benign instead
of reporting an error. If the platform does not provide that facility,
it's not a problem for Linux.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-8-nathanl@linux.ibm.com
3 years agopowerpc/hvcall: add token and codes for H_VASI_SIGNAL
Nathan Lynch [Mon, 7 Dec 2020 21:51:38 +0000 (15:51 -0600)]
powerpc/hvcall: add token and codes for H_VASI_SIGNAL

H_VASI_SIGNAL can be used by a partition to request cancellation of
its migration. To be used in future changes.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-7-nathanl@linux.ibm.com
3 years agopowerpc/rtas: add rtas_activate_firmware()
Nathan Lynch [Mon, 7 Dec 2020 21:51:37 +0000 (15:51 -0600)]
powerpc/rtas: add rtas_activate_firmware()

Provide a documented wrapper function for the ibm,activate-firmware
service, which must be called after a partition migration or
hibernation.

If the function is absent or the call fails, the OS will continue to
run normally with the current firmware, so there is no need to perform
any recovery. Just log it and continue.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-6-nathanl@linux.ibm.com
3 years agopowerpc/rtas: add rtas_ibm_suspend_me()
Nathan Lynch [Mon, 7 Dec 2020 21:51:36 +0000 (15:51 -0600)]
powerpc/rtas: add rtas_ibm_suspend_me()

Now that the name is available, provide a simple wrapper for
ibm,suspend-me which returns both a Linux errno and optionally the
actual RTAS status to the caller.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-5-nathanl@linux.ibm.com
3 years agopowerpc/rtas: rtas_ibm_suspend_me -> rtas_ibm_suspend_me_unsafe
Nathan Lynch [Mon, 7 Dec 2020 21:51:35 +0000 (15:51 -0600)]
powerpc/rtas: rtas_ibm_suspend_me -> rtas_ibm_suspend_me_unsafe

The pseries partition suspend sequence requires that all active CPUs
call H_JOIN, which suspends all but one of them with interrupts
disabled. The "chosen" CPU is then to call ibm,suspend-me to complete
the suspend. Upon returning from ibm,suspend-me, the chosen CPU is to
use H_PROD to wake the joined CPUs.

Using on_each_cpu() for this, as rtas_ibm_suspend_me() does to
implement partition migration, is susceptible to deadlock with other
users of on_each_cpu() and with users of stop_machine APIs. The
callback passed to on_each_cpu() is not allowed to synchronize with
other CPUs in the way it is used here.

Complicating the fix is the fact that rtas_ibm_suspend_me() also
occupies the function name that should be used to provide a more
conventional wrapper for ibm,suspend-me. Rename rtas_ibm_suspend_me()
to rtas_ibm_suspend_me_unsafe() to free up the name and indicate that
it should not gain users.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-4-nathanl@linux.ibm.com
3 years agopowerpc/rtas: complete ibm,suspend-me status codes
Nathan Lynch [Mon, 7 Dec 2020 21:51:34 +0000 (15:51 -0600)]
powerpc/rtas: complete ibm,suspend-me status codes

We don't completely account for the possible return codes for
ibm,suspend-me. Add definitions for these.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-3-nathanl@linux.ibm.com
3 years agopowerpc/rtas: prevent suspend-related sys_rtas use on LE
Nathan Lynch [Mon, 7 Dec 2020 21:51:33 +0000 (15:51 -0600)]
powerpc/rtas: prevent suspend-related sys_rtas use on LE

While drmgr has had work in some areas to make its RTAS syscall
interactions endian-neutral, its code for performing partition
migration via the syscall has never worked on LE. While it is able to
complete ibm,suspend-me successfully, it crashes when attempting the
subsequent ibm,update-nodes call.

drmgr is the only known (or plausible) user of ibm,suspend-me,
ibm,update-nodes, and ibm,update-properties, so allow them only in
big-endian configurations.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207215200.1785968-2-nathanl@linux.ibm.com
3 years agopowerpc/book3s64/kuap: Improve error reporting with KUAP
Aneesh Kumar K.V [Tue, 8 Dec 2020 03:15:39 +0000 (08:45 +0530)]
powerpc/book3s64/kuap: Improve error reporting with KUAP

This partially reverts commit eb232b162446 ("powerpc/book3s64/kuap: Improve
error reporting with KUAP") and update the fault handler to print

[   55.022514] Kernel attempted to access user page (7e6725b70000) - exploit attempt? (uid: 0)
[   55.022528] BUG: Unable to handle kernel data access on read at 0x7e6725b70000
[   55.022533] Faulting instruction address: 0xc000000000e8b9bc
[   55.022540] Oops: Kernel access of bad area, sig: 11 [#1]
....

when the kernel access userspace address without unlocking AMR.

bad_kuap_fault() is added as part of commit 5e5be3aed230 ("powerpc/mm: Detect
bad KUAP faults") to catch userspace access incorrectly blocked by AMR. Hence
retain the full stack dump there even with hash translation. Also, add a comment
explaining the difference between hash and radix.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201208031539.84878-1-aneesh.kumar@linux.ibm.com
3 years agopowerpc/powernv/idle: Restore CIABR after idle for Power9
Jordan Niethe [Mon, 7 Dec 2020 01:05:19 +0000 (12:05 +1100)]
powerpc/powernv/idle: Restore CIABR after idle for Power9

On Power9, CIABR is lost after idle. This means that instruction
breakpoints set by xmon which use CIABR do not work. Fix this by
restoring CIABR after idle.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207010519.15597-2-jniethe5@gmail.com
3 years agopowerpc/book3s64/kexec: Clear CIABR on kexec
Jordan Niethe [Mon, 7 Dec 2020 01:05:18 +0000 (12:05 +1100)]
powerpc/book3s64/kexec: Clear CIABR on kexec

The value in CIABR persists across kexec which can lead to unintended
results when the new kernel hits the old kernel's breakpoint. For
example:

0:mon> bi $loadavg_proc_show
0:mon> b
   type            address
1 inst   c000000000519060  loadavg_proc_show+0x0/0x130
0:mon> x

$ kexec -l /mnt/vmlinux --initrd=/mnt/rootfs.cpio.gz --append='xmon=off'
$ kexec -e

$ cat /proc/loadavg
Trace/breakpoint trap

Make sure CIABR is cleared so this does not happen.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207010519.15597-1-jniethe5@gmail.com
3 years agopowerpc: Remove ucache_bsize
Christophe Leroy [Tue, 17 Nov 2020 05:07:59 +0000 (05:07 +0000)]
powerpc: Remove ucache_bsize

ppc601 and e200 were the users of ucache_bsize.
ppc601 and e200 are now gone.

Remove ucache_bsize.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/288b6048597c0fdc495b203fda57a223d89499d2.1605589460.git.christophe.leroy@csgroup.eu
3 years agopowerpc: Retire e200 core (mpc555x processor)
Christophe Leroy [Tue, 17 Nov 2020 05:07:58 +0000 (05:07 +0000)]
powerpc: Retire e200 core (mpc555x processor)

There is no defconfig selecting CONFIG_E200, and no platform.

e200 is an earlier version of booke, a predecessor of e500,
with some particularities like an unified cache instead of both an
instruction cache and a data cache.

Remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/34ebc3ba2c768d97f363bd5f2deea2356e9ae127.1605589460.git.christophe.leroy@csgroup.eu
3 years agopowerpc: Fix update form addressing in inline assembly
Christophe Leroy [Thu, 22 Oct 2020 09:29:21 +0000 (09:29 +0000)]
powerpc: Fix update form addressing in inline assembly

In several places, inline assembly uses the "%Un" modifier
to enable the use of instruction with update form addressing,
but the associated "<>" constraint is missing.

As mentioned in previous patch, this fails with gcc 4.9, so
"<>" can't be used directly.

Use UPD_CONSTR macro everywhere %Un modifier is used.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/62eab5ca595485c192de1765bdac099f633a21d0.1603358942.git.christophe.leroy@csgroup.eu
3 years agopowerpc: Fix incorrect stw{, ux, u, x} instructions in __set_pte_at
Mathieu Desnoyers [Thu, 22 Oct 2020 09:29:20 +0000 (09:29 +0000)]
powerpc: Fix incorrect stw{, ux, u, x} instructions in __set_pte_at

The placeholder for instruction selection should use the second
argument's operand, which is %1, not %0. This could generate incorrect
assembly code if the memory addressing of operand %0 is a different
form from that of operand %1.

Also remove the %Un placeholder because having %Un placeholders
for two operands which are based on the same local var (ptep) doesn't
make much sense. By the way, it doesn't change the current behaviour
because "<>" constraint is missing for the associated "=m".

[chleroy: revised commit log iaw segher's comments and removed %U0]

Fixes: 9bf2b5cdc5fe ("powerpc: Fixes for CONFIG_PTE_64BIT for SMP support")
Cc: <stable@vger.kernel.org> # v2.6.28+
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/96354bd77977a6a933fe9020da57629007fdb920.1603358942.git.christophe.leroy@csgroup.eu
3 years agopowerpc/xmon: Change printk() to pr_cont()
Christophe Leroy [Fri, 4 Dec 2020 10:35:38 +0000 (10:35 +0000)]
powerpc/xmon: Change printk() to pr_cont()

Since some time now, printk() adds carriage return, leading to
unusable xmon output if there is no udbg backend available:

  [   54.288722] sysrq: Entering xmon
  [   54.292209] Vector: 0  at [cace3d2c]
  [   54.292274]     pc:
  [   54.292331] c0023650
  [   54.292468] : xmon+0x28/0x58
  [   54.292519]
  [   54.292574]     lr:
  [   54.292630] c0023724
  [   54.292749] : sysrq_handle_xmon+0xa4/0xfc
  [   54.292801]
  [   54.292867]     sp: cace3de8
  [   54.292931]    msr: 9032
  [   54.292999]   current = 0xc28d0000
  [   54.293072]     pid   = 377, comm = sh
  [   54.293157] Linux version 5.10.0-rc6-s3k-dev-01364-gedf13f0ccd76-dirty (root@po17688vm.idsi0.si.c-s.fr) (powerpc64-linux-gcc (GCC) 10.1.0, GNU ld (GNU Binutils) 2.34) #4211 PREEMPT Fri Dec 4 09:32:11 UTC 2020
  [   54.293287] enter ? for help
  [   54.293470] [cace3de8]
  [   54.293532] c0023724
  [   54.293654]  sysrq_handle_xmon+0xa4/0xfc
  [   54.293711]  (unreliable)
  ...
  [   54.296002]
  [   54.296159] --- Exception: c01 (System Call) at
  [   54.296217] 0fd4e784
  [   54.296303]
  [   54.296375] SP (7fca6ff0) is in userspace
  [   54.296431] mon>
  [   54.296484]  <no input ...>

Use pr_cont() instead.

Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Mention that it only happens when udbg is not available]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c8a6ec704416ecd5ff2bd26213c9bc026bdd19de.1607077340.git.christophe.leroy@csgroup.eu
3 years agopowerpc/powernv/npu: Do not attempt NPU2 setup on POWER8NVL NPU
Alexey Kardashevskiy [Sun, 22 Nov 2020 07:38:28 +0000 (18:38 +1100)]
powerpc/powernv/npu: Do not attempt NPU2 setup on POWER8NVL NPU

We execute certain NPU2 setup code (such as mapping an LPID to a device
in NPU2) unconditionally if an Nvlink bridge is detected. However this
cannot succeed on POWER8NVL machines and errors appear in dmesg. This is
harmless as skiboot returns an error and the only place we check it is
vfio-pci but that code does not get called on P8+ either.

This adds a check if pnv_npu2_xxx helpers are called on a machine with
NPU2 which initializes pnv_phb::npu in pnv_npu2_init();
pnv_phb::npu==NULL on POWER8/NVL (Naples).

While at this, fix NULL derefencing in pnv_npu_peers_take_ownership/
pnv_npu_peers_release_ownership which occurs when GPUs on mentioned P8s
cause EEH which happens if "vfio-pci" disables devices using
the D3 power state; the vfio-pci's disable_idle_d3 module parameter
controls this and must be set on Naples. The EEH handling clears
the entire pnv_ioda_pe struct in pnv_ioda_free_pe() hence
the NULL derefencing. We cannot recover from that but at least we stop
crashing.

Tested on
- POWER9 pvr=004e1201, Ubuntu 19.04 host, Ubuntu 18.04 vm,
  NVIDIA GV100 10de:1db1 driver 418.39
- POWER8 pvr=004c0100, RHEL 7.6 host, Ubuntu 16.10 vm,
  NVIDIA P100 10de:15f9 driver 396.47

Fixes: 1b785611e119 ("powerpc/powernv/npu: Add release_ownership hook")
Cc: stable@vger.kernel.org # 5.0
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201122073828.15446-1-aik@ozlabs.ru
3 years agolkdtm/powerpc: Add SLB multihit test
Ganesh Goudar [Mon, 30 Nov 2020 08:30:57 +0000 (14:00 +0530)]
lkdtm/powerpc: Add SLB multihit test

To check machine check handling, add support to inject slb
multihit errors.

Co-developed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
[mpe: Use CONFIG_PPC_BOOK3S_64 to fix compile errors reported by lkp@intel.com]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201130083057.135610-1-ganeshgr@linux.ibm.com
3 years agopowernv/pci: Print an error when device enable is blocked
Oliver O'Halloran [Thu, 9 Apr 2020 06:13:37 +0000 (16:13 +1000)]
powernv/pci: Print an error when device enable is blocked

If the platform decides to block enabling the device nothing is printed
currently. This can lead to some confusion since the dmesg output will
usually print an error with no context e.g.

e1000e: probe of 0022:01:00.0 failed with error -22

This shouldn't be spammy since pci_enable_device() already prints a
messages when it succeeds.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200409061337.9187-1-oohall@gmail.com
3 years agopowerpc/pci: Remove LSI mappings on device teardown
Oliver O'Halloran [Wed, 2 Dec 2020 00:52:22 +0000 (11:52 +1100)]
powerpc/pci: Remove LSI mappings on device teardown

When a passthrough IO adapter is removed from a pseries machine using hash
MMU and the XIVE interrupt mode, the POWER hypervisor expects the guest OS
to clear all page table entries related to the adapter. If some are still
present, the RTAS call which isolates the PCI slot returns error 9001
"valid outstanding translations" and the removal of the IO adapter fails.
This is because when the PHBs are scanned, Linux maps automatically the
INTx interrupts in the Linux interrupt number space but these are never
removed.

This problem can be fixed by adding the corresponding unmap operation when
the device is removed. There's no pcibios_* hook for the remove case, but
the same effect can be achieved using a bus notifier.

Because INTx are shared among PHBs (and potentially across the system),
this adds tracking of virq to unmap them only when the last user is gone.

[aik: added refcounter]

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201202005222.5477-1-aik@ozlabs.ru
3 years agopowerpc: add security.config, enforcing lockdown=integrity
Daniel Axtens [Thu, 3 Dec 2020 04:28:07 +0000 (15:28 +1100)]
powerpc: add security.config, enforcing lockdown=integrity

It's sometimes handy to have a config that boots a bit like a system
under secure boot (forcing lockdown=integrity, without needing any
extra stuff like a command line option).

This config file allows that, and also turns on a few assorted security
and hardening options for good measure.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201203042807.1293655-1-dja@axtens.net
3 years agopowerpc/44x: Don't support 47x code and non 47x code at the same time
Christophe Leroy [Sun, 18 Oct 2020 17:25:18 +0000 (17:25 +0000)]
powerpc/44x: Don't support 47x code and non 47x code at the same time

440/460 variants and 470 variants are not compatible, no
need to make code supporting both and using MMU features.

Just use CONFIG_PPC_47x to decide what to build.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c3e64da3d5d068c69a201e03bbae7da055761e5b.1603041883.git.christophe.leroy@csgroup.eu
3 years agopowerpc/44x: Don't support 440 when CONFIG_PPC_47x is set
Christophe Leroy [Sun, 18 Oct 2020 17:25:17 +0000 (17:25 +0000)]
powerpc/44x: Don't support 440 when CONFIG_PPC_47x is set

As stated in platform/44x/Kconfig, CONFIG_PPC_47x is not
compatible with 440 and 460 variants.

This is confirmed in asm/cache.h as L1_CACHE_SHIFT is different
for 47x, meaning a kernel built for 47x will not run correctly
on a 440.

In cputable, opt out all 440 and 460 variants when CONFIG_PPC_47x
is set. Also add a default match dedicated to 470.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/822833ce3dc10634339818f7d1ab616edf63b0c6.1603041883.git.christophe.leroy@csgroup.eu
3 years agopowerpc/feature: Remove CPU_FTR_NODSISRALIGN
Christophe Leroy [Tue, 13 Oct 2020 11:11:21 +0000 (11:11 +0000)]
powerpc/feature: Remove CPU_FTR_NODSISRALIGN

CPU_FTR_NODSISRALIGN has not been used since
commit 31bfdb036f12 ("powerpc: Use instruction emulation
infrastructure to handle alignment faults")

Remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/05d98136b24bbf11525445414bb18cffe2724f48.1602587470.git.christophe.leroy@csgroup.eu
3 years agopowerpc/mm: Desintegrate MMU_FTR_PPCAS_ARCH_V2
Christophe Leroy [Mon, 12 Oct 2020 08:04:24 +0000 (08:04 +0000)]
powerpc/mm: Desintegrate MMU_FTR_PPCAS_ARCH_V2

MMU_FTR_PPCAS_ARCH_V2 is defined in cpu_table.h
as MMU_FTR_TLBIEL | MMU_FTR_16M_PAGE.

MMU_FTR_TLBIEL and MMU_FTR_16M_PAGE are defined in mmu.h

MMU_FTR_PPCAS_ARCH_V2 is used only in mmu.h and it is used only once.

Remove MMU_FTR_PPCAS_ARCH_V2 and use
directly MMU_FTR_TLBIEL | MMU_FTR_16M_PAGE

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/829ae1aed1d2fc6b5fc5818362e573dee5d6ecde.1602489852.git.christophe.leroy@csgroup.eu
3 years agopowerpc/mm: MMU_FTR_NEED_DTLB_SW_LRU is only possible with CONFIG_PPC_83xx
Christophe Leroy [Mon, 12 Oct 2020 08:05:49 +0000 (08:05 +0000)]
powerpc/mm: MMU_FTR_NEED_DTLB_SW_LRU is only possible with CONFIG_PPC_83xx

Only mpc83xx will set MMU_FTR_NEED_DTLB_SW_LRU and its
definition is enclosed in #ifdef CONFIG_PPC_83xx.

Make MMU_FTR_NEED_DTLB_SW_LRU possible only when
CONFIG_PPC_83xx is set.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d01d7613664fafa43de1f1ae89924075bc24241c.1602489931.git.christophe.leroy@csgroup.eu
3 years agopowerpc/mm: Remove useless #ifndef CPU_FTR_COHERENT_ICACHE in mem.c
Christophe Leroy [Mon, 12 Oct 2020 08:02:30 +0000 (08:02 +0000)]
powerpc/mm: Remove useless #ifndef CPU_FTR_COHERENT_ICACHE in mem.c

Since commit 10b35d9978ac ("[PATCH] powerpc: merged asm/cputable.h"),
CPU_FTR_COHERENT_ICACHE has always been defined.

Remove the #ifndef CPU_FTR_COHERENT_ICACHE block.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e26ddc1d6f6aca739dd8d2b7c67351ead559b084.1602489664.git.christophe.leroy@csgroup.eu
3 years agopowerpc/feature: Add CPU_FTR_NOEXECUTE to G2_LE
Christophe Leroy [Mon, 12 Oct 2020 08:02:13 +0000 (08:02 +0000)]
powerpc/feature: Add CPU_FTR_NOEXECUTE to G2_LE

G2_LE has a 603 core, add CPU_FTR_NOEXECUTE.

Fixes: 385e89d5b20f ("powerpc/mm: add exec protection on powerpc 603")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/39a530ee41d83f49747ab3af8e39c056450b9b4d.1602489653.git.christophe.leroy@csgroup.eu
3 years agopowerpc/mm: Fix verification of MMU_FTR_TYPE_44x
Christophe Leroy [Sat, 10 Oct 2020 17:30:59 +0000 (17:30 +0000)]
powerpc/mm: Fix verification of MMU_FTR_TYPE_44x

MMU_FTR_TYPE_44x cannot be checked by cpu_has_feature()

Use mmu_has_feature() instead

Fixes: 23eb7f560a2a ("powerpc: Convert flush_icache_range & friends to C")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ceede82fadf37f3b8275e61fcf8cf29a3e2ec7fe.1602351011.git.christophe.leroy@csgroup.eu
3 years agopowerpc/time: Remove ifdef in get_vtb()
Christophe Leroy [Thu, 1 Oct 2020 10:59:20 +0000 (10:59 +0000)]
powerpc/time: Remove ifdef in get_vtb()

SPRN_VTB and CPU_FTR_ARCH_207S are always defined,
no need of an ifdef.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a0fc81cd85121407726bcf480fc9a0d8e7617fce.1601549933.git.christophe.leroy@csgroup.eu
3 years agopowerpc/32: Use SPRN_SPRG_SCRATCH2 in exception prologs
Christophe Leroy [Wed, 25 Nov 2020 07:10:53 +0000 (07:10 +0000)]
powerpc/32: Use SPRN_SPRG_SCRATCH2 in exception prologs

Use SPRN_SPRG_SCRATCH2 as a third scratch register in
exception prologs in order to simplify them and avoid
data going back and forth from/to CR.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6f5c8a7faa8cc54acb89c55c20aa579a2f30a4e9.1606285014.git.christophe.leroy@csgroup.eu
3 years agopowerpc/32s: Use SPRN_SPRG_SCRATCH2 in DSI prolog
Christophe Leroy [Wed, 25 Nov 2020 07:10:52 +0000 (07:10 +0000)]
powerpc/32s: Use SPRN_SPRG_SCRATCH2 in DSI prolog

Use SPRN_SPRG_SCRATCH2 as an alternative scratch register in
the early part of DSI prolog in order to avoid clobbering
SPRN_SPRG_SCRATCH0/1 used by other prologs.

The 603 doesn't like a jump from DataLoadTLBMiss to the 10 nops
that are now in the beginning of DSI exception as a result of
the feature section. To workaround this, add a jump as alternative.
It also avoids fetching 10 nops for nothing.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f9f8df2a2be93568768ef1ac793639f7914cf103.1606285014.git.christophe.leroy@csgroup.eu
3 years agopowerpc/32: Simplify EXCEPTION_PROLOG_1 macro
Christophe Leroy [Wed, 25 Nov 2020 07:10:51 +0000 (07:10 +0000)]
powerpc/32: Simplify EXCEPTION_PROLOG_1 macro

Make code more readable with a clear CONFIG_VMAP_STACK
section and a clear non CONFIG_VMAP_STACK section.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c0f16cf432d22fc80097264d94649460d3dd761d.1606285014.git.christophe.leroy@csgroup.eu
3 years agopowerpc/603: Use SPRN_SDR1 to store the pgdir phys address
Christophe Leroy [Wed, 25 Nov 2020 07:10:50 +0000 (07:10 +0000)]
powerpc/603: Use SPRN_SDR1 to store the pgdir phys address

On the 603, SDR1 is not used.

In order to free SPRN_SPRG2, use SPRN_SDR1 to store the pgdir
phys addr.

But only some bits of SDR1 can be used (0xffff01ff).
As the pgdir is 4k aligned, rotate it by 4 bits to the left.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7370574b49d8476878ce5480726197993cb76108.1606285014.git.christophe.leroy@csgroup.eu
3 years agopowerpc/32s: Don't use SPRN_SPRG_PGDIR in hash_page
Christophe Leroy [Wed, 25 Nov 2020 07:10:49 +0000 (07:10 +0000)]
powerpc/32s: Don't use SPRN_SPRG_PGDIR in hash_page

SPRN_SPRG_PGDIR is there mainly to speedup SW TLB miss handlers
for powerpc 603.

We need to free SPRN_SPRG2 to reduce the mess with CONFIG_VMAP_STACK.

In hash_page(), reading PGDIR from thread_struct will be in the noise
performance wise.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4adca19b7120cdf619956768ed09e74fc6a558f3.1606285014.git.christophe.leroy@csgroup.eu
3 years agopowerpc/32s: Fix an FTR_SECTION_ELSE
Christophe Leroy [Wed, 25 Nov 2020 07:10:48 +0000 (07:10 +0000)]
powerpc/32s: Fix an FTR_SECTION_ELSE

An FTR_SECTION_ELSE is in the middle of
BEGIN_MMU_FTR_SECTION/ALT_MMU_FTR_SECTION_END_IFSET

Change it to MMU_FTR_SECTION_ELSE

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/61790f1a91692950a6bb5bb53d6d514d9bcdad74.1606285014.git.christophe.leroy@csgroup.eu
3 years agopowerpc/32s: Don't hash_preload() kernel text
Christophe Leroy [Wed, 25 Nov 2020 07:10:47 +0000 (07:10 +0000)]
powerpc/32s: Don't hash_preload() kernel text

We now always map kernel text with BATs. Neither need to preload
hash with kernel text addresses nor ensure they are never evicted.

This is more or less a revert of commit ee4f2ea48674 ("[POWERPC] Fix
32-bit mm operations when not using BATs")

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0a0bab7fadd89aa829e33420fbc10d60c59040a7.1606285014.git.christophe.leroy@csgroup.eu
3 years agopowerpc/32s: Always map kernel text and rodata with BATs
Christophe Leroy [Wed, 25 Nov 2020 07:10:46 +0000 (07:10 +0000)]
powerpc/32s: Always map kernel text and rodata with BATs

Since commit 2b279c0348af ("powerpc/32s: Allow mapping with BATs with
DEBUG_PAGEALLOC"), there is no real situation where mapping without
BATs is required.

In order to simplify memory handling, always map kernel text
and rodata with BATs even when "nobats" kernel parameter is set.

Also fix the 603 TLB miss exceptions that don't require anymore
kernel page table if DEBUG_PAGEALLOC.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/da51f7ec632825a4ce43290a904aad61648408c0.1606285013.git.christophe.leroy@csgroup.eu
3 years agoocxl: Add new kernel traces
Christophe Lombard [Wed, 25 Nov 2020 15:50:13 +0000 (16:50 +0100)]
ocxl: Add new kernel traces

Add specific kernel traces which provide information on mmu notifier and on
pages range.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-6-clombard@linux.vnet.ibm.com
3 years agoocxl: Add mmu notifier
Christophe Lombard [Wed, 25 Nov 2020 15:50:12 +0000 (16:50 +0100)]
ocxl: Add mmu notifier

Add invalidate_range mmu notifier, when required (ATSD access of MMIO
registers is available), to initiate TLB invalidation commands.
For the time being, the ATSD0 set of registers is used by default.

The pasid and bdf values have to be configured in the Process Element
Entry.
The PEE must be set up to match the BDF/PASID of the AFU.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-5-clombard@linux.vnet.ibm.com
3 years agoocxl: Update the Process Element Entry
Christophe Lombard [Wed, 25 Nov 2020 15:50:11 +0000 (16:50 +0100)]
ocxl: Update the Process Element Entry

To complete the MMIO based mechanism, the fields: PASID, bus, device and
function of the Process Element Entry have to be filled. (See
OpenCAPI Power Platform Architecture document)

                   Hypervisor Process Element Entry
Word
    0 1 .... 7  8  ...... 12  13 ..15  16.... 19  20 ........... 31
0                  OSL Configuration State (0:31)
1                  OSL Configuration State (32:63)
2               PASID                      |    Reserved
3       Bus   |   Device    |Function |        Reserved
4                             Reserved
5                             Reserved
6                               ....

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-4-clombard@linux.vnet.ibm.com
3 years agoocxl: Initiate a TLB invalidate command
Christophe Lombard [Wed, 25 Nov 2020 15:50:10 +0000 (16:50 +0100)]
ocxl: Initiate a TLB invalidate command

When a TLB Invalidate is required for the Logical Partition, the following
sequence has to be performed:

1. Load MMIO ATSD AVA register with the necessary value, if required.
2. Write the MMIO ATSD launch register to initiate the TLB Invalidate
command.
3. Poll the MMIO ATSD status register to determine when the TLB Invalidate
   has been completed.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-3-clombard@linux.vnet.ibm.com
3 years agoocxl: Assign a register set to a Logical Partition
Christophe Lombard [Wed, 25 Nov 2020 15:50:09 +0000 (16:50 +0100)]
ocxl: Assign a register set to a Logical Partition

Platform specific function to assign a register set to a Logical Partition.
The "ibm,mmio-atsd" property, provided by the firmware, contains the 16
base ATSD physical addresses (ATSD0 through ATSD15) of the set of MMIO
registers (XTS MMIO ATSDx LPARID/AVA/launch/status register).

For the time being, the ATSD0 set of registers is used by default.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-2-clombard@linux.vnet.ibm.com
3 years agopowerpc/perf: MMCR0 control for PMU registers under PMCC=00
Athira Rajeev [Thu, 26 Nov 2020 16:54:44 +0000 (11:54 -0500)]
powerpc/perf: MMCR0 control for PMU registers under PMCC=00

PowerISA v3.1 introduces new control bit (PMCCEXT) for restricting
access to group B PMU registers in problem state when
MMCR0 PMCC=0b00. In problem state and when MMCR0 PMCC=0b00,
setting the Monitor Mode Control Register bit 54 (MMCR0 PMCCEXT),
will restrict read permission on Group B Performance Monitor
Registers (SIER, SIAR, SDAR and MMCR1). When this bit is set to zero,
group B registers will be readable. In other platforms (like power9),
the older behaviour is retained where group B PMU SPRs are readable.

Patch adds support for MMCR0 PMCCEXT bit in power10 by enabling
this bit during boot and during the PMU event enable/disable callback
functions.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-8-git-send-email-atrajeev@linux.vnet.ibm.com
3 years agopowerpc/perf: Fix to update cache events with l2l3 events in power10
Athira Rajeev [Thu, 26 Nov 2020 16:54:43 +0000 (11:54 -0500)]
powerpc/perf: Fix to update cache events with l2l3 events in power10

Export l2l3 events (PM_L2_ST_MISS and PM_L2_ST) and LLC-prefetches
(PM_L3_PF_MISS_L3) via sysfs, and also add these to list of
cache_events.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-7-git-send-email-atrajeev@linux.vnet.ibm.com
3 years agopowerpc/perf: Fix to update generic event codes for power10
Athira Rajeev [Thu, 26 Nov 2020 16:54:42 +0000 (11:54 -0500)]
powerpc/perf: Fix to update generic event codes for power10

Fix the event code for events: branch-instructions (to PM_BR_FIN),
branch-misses (to PM_MPRED_BR_FIN) and cache-misses (to
PM_LD_DEMAND_MISS_L1_FIN) for power10 PMU. Update the
list of generic events with this modified event code.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-6-git-send-email-atrajeev@linux.vnet.ibm.com
3 years agopowerpc/perf: Add generic and cache event list for power10 DD1
Athira Rajeev [Thu, 26 Nov 2020 16:54:41 +0000 (11:54 -0500)]
powerpc/perf: Add generic and cache event list for power10 DD1

There are event code updates for some of the generic events
and cache events for power10. Inorder to maintain the current
event codes work with DD1 also, create a new array of generic_events,
cache_events and pmu_attr_groups with suffix _dd1, example,
power10_events_attr_dd1. So that further updates to event codes
can be made in the original list, ie, power10_events_attr. Update the
power10 pmu init code to pick the dd1 list while registering
the power PMU, based on the pvr (Processor Version Register) value.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-5-git-send-email-atrajeev@linux.vnet.ibm.com
3 years agopowerpc/perf: Fix the PMU group constraints for threshold events in power10
Athira Rajeev [Thu, 26 Nov 2020 16:54:40 +0000 (11:54 -0500)]
powerpc/perf: Fix the PMU group constraints for threshold events in power10

The PMU group constraints mask for threshold events covers
all thresholding bits which includes threshold control value
(start/stop), select value as well as thresh_cmp value (MMCRA[9:18].
In power9, thresh_cmp bits were part of the event code. But in case
of power10, thresh_cmp bits are not part of event code due to
inclusion of MMCR3 bits. Hence thresh_cmp is not valid for
group constraints for power10.

Fix the PMU group constraints checking for threshold events in
power10 by using constraint mask and value for only threshold control
and select bits.

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-4-git-send-email-atrajeev@linux.vnet.ibm.com
3 years agopowerpc/perf: Update the PMU group constraints for l2l3 events in power10
Athira Rajeev [Thu, 26 Nov 2020 16:54:39 +0000 (11:54 -0500)]
powerpc/perf: Update the PMU group constraints for l2l3 events in power10

In Power9, L2/L3 bus events are always available as a
"bank" of 4 events. To obtain the counts for any of the
l2/l3 bus events in a given bank, the user will have to
program PMC4 with corresponding l2/l3 bus event for that
bank.

Commit 59029136d750 ("powerpc/perf: Add constraints for power9 l2/l3 bus events")
enforced this rule in Power9. But this is not valid for
Power10, since in Power10 Monitor Mode Control Register2
(MMCR2) has bits to configure l2/l3 event bits. Hence remove
this PMC4 constraint check from power10.

Since the l2/l3 bits in MMCR2 are not per-pmc, patch handles
group constrints checks for l2/l3 bits in MMCR2.

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-3-git-send-email-atrajeev@linux.vnet.ibm.com
3 years agopowerpc/perf: Fix to update radix_scope_qual in power10
Athira Rajeev [Thu, 26 Nov 2020 16:54:38 +0000 (11:54 -0500)]
powerpc/perf: Fix to update radix_scope_qual in power10

power10 uses bit 9 of the raw event code as RADIX_SCOPE_QUAL.
This bit is used for enabling the radix process events.
Patch fixes the PMU counter support functions to program bit
18 of MMCR1 ( Monitor Mode Control Register1 ) with the
RADIX_SCOPE_QUAL bit value. Since this field is not per-pmc,
add this to PMU group constraints to make sure events in a
group will have same bit value for this field. Use bit 21 as
constraint bit field for radix_scope_qual. Patch also updates
the power10 raw event encoding layout information, format field
and constraints bit layout to include the radix_scope_qual bit.

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-2-git-send-email-atrajeev@linux.vnet.ibm.com
3 years agopowerpc/book3s64/pkeys: Optimize KUAP and KUEP feature disabled case
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:24 +0000 (10:14 +0530)]
powerpc/book3s64/pkeys: Optimize KUAP and KUEP feature disabled case

If FTR_BOOK3S_KUAP is disabled, kernel will continue to run with the same AMR
value with which it was entered. Hence there is a high chance that
we can return without restoring the AMR value. This also helps the case
when applications are not using the pkey feature. In this case, different
applications will have the same AMR values and hence we can avoid restoring
AMR in this case too.

Also avoid isync() if not really needed.

Do the same for IAMR.

null-syscall benchmark results:

With smap/smep disabled:
Without patch:
957.95 ns    2778.17 cycles
With patch:
858.38 ns    2489.30 cycles

With smap/smep enabled:
Without patch:
1017.26 ns    2950.36 cycles
With patch:
1021.51 ns    2962.44 cycles

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-23-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kup: Check max key supported before enabling kup
Aneesh Kumar K.V [Wed, 2 Dec 2020 04:38:54 +0000 (10:08 +0530)]
powerpc/book3s64/kup: Check max key supported before enabling kup

Don't enable KUEP/KUAP if we support less than or equal to 3 keys.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201202043854.76406-1-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/hash/kuep: Enable KUEP on hash
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:22 +0000 (10:14 +0530)]
powerpc/book3s64/hash/kuep: Enable KUEP on hash

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-21-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/hash/kuap: Enable kuap on hash
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:21 +0000 (10:14 +0530)]
powerpc/book3s64/hash/kuap: Enable kuap on hash

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-20-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kuep: Use Key 3 to implement KUEP with hash translation.
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:20 +0000 (10:14 +0530)]
powerpc/book3s64/kuep: Use Key 3 to implement KUEP with hash translation.

Radix use IAMR Key 0 and hash translation use IAMR key 3.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-19-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kuap: Use Key 3 to implement KUAP with hash translation.
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:19 +0000 (10:14 +0530)]
powerpc/book3s64/kuap: Use Key 3 to implement KUAP with hash translation.

Radix use AMR Key 0 and hash translation use AMR key 3.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-18-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kuap: Improve error reporting with KUAP
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:18 +0000 (10:14 +0530)]
powerpc/book3s64/kuap: Improve error reporting with KUAP

With hash translation use DSISR_KEYFAULT to identify a wrong access.
With Radix we look at the AMR value and type of fault.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-17-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kuap: Restrict access to userspace based on userspace AMR
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:17 +0000 (10:14 +0530)]
powerpc/book3s64/kuap: Restrict access to userspace based on userspace AMR

If an application has configured address protection such that read/write is
denied using pkey even the kernel should receive a FAULT on accessing the same.

This patch use user AMR value stored in pt_regs.amr to achieve the same.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-16-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/pkeys: Don't update SPRN_AMR when in kernel mode.
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:16 +0000 (10:14 +0530)]
powerpc/book3s64/pkeys: Don't update SPRN_AMR when in kernel mode.

Now that kernel correctly store/restore userspace AMR/IAMR values, avoid
manipulating AMR and IAMR from the kernel on behalf of userspace.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-15-aneesh.kumar@linux.ibm.com
3 years agopowerpc/ptrace-view: Use pt_regs values instead of thread_struct based one.
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:15 +0000 (10:14 +0530)]
powerpc/ptrace-view: Use pt_regs values instead of thread_struct based one.

We will remove thread.amr/iamr/uamor in a later patch

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-14-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/pkeys: Reset userspace AMR correctly on exec
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:14 +0000 (10:14 +0530)]
powerpc/book3s64/pkeys: Reset userspace AMR correctly on exec

On fork, we inherit from the parent and on exec, we should switch to default_amr values.

Also, avoid changing the AMR register value within the kernel. The kernel now runs with
different AMR values.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-13-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/pkeys: Inherit correctly on fork.
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:13 +0000 (10:14 +0530)]
powerpc/book3s64/pkeys: Inherit correctly on fork.

Child thread.kuap value is inherited from the parent in copy_thread_tls. We still
need to make sure when the child returns from a fork in the kernel we start with the kernel
default AMR value.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-12-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/pkeys: Store/restore userspace AMR/IAMR correctly on entry and exit...
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:12 +0000 (10:14 +0530)]
powerpc/book3s64/pkeys: Store/restore userspace AMR/IAMR correctly on entry and exit from kernel

This prepare kernel to operate with a different value than userspace AMR/IAMR.
For this, AMR/IAMR need to be saved and restored on entry and return from the
kernel.

With KUAP we modify kernel AMR when accessing user address from the kernel
via copy_to/from_user interfaces. We don't need to modify IAMR value in
similar fashion.

If MMU_FTR_PKEY is enabled we need to save AMR/IAMR in pt_regs on entering
kernel from userspace. If not we can assume that AMR/IAMR is not modified
from userspace.

We need to save AMR if we have MMU_FTR_BOOK3S_KUAP feature enabled and we are
interrupted within kernel. This is required so that if we get interrupted
within copy_to/from_user we continue with the right AMR value.

If we hae MMU_FTR_BOOK3S_KUEP enabled we need to restore IAMR on
return to userspace beause kernel will be running with a different
IAMR value.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-11-aneesh.kumar@linux.ibm.com
3 years agopowerpc/exec: Set thread.regs early during exec
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:11 +0000 (10:14 +0530)]
powerpc/exec: Set thread.regs early during exec

In later patches during exec, we would like to access default regs.amr to
control access to the user mapping. Having thread.regs set early makes the
code changes simpler.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-10-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:10 +0000 (10:14 +0530)]
powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation

This patch updates kernel hash page table entries to use storage key 3
for its mapping. This implies all kernel access will now use key 3 to
control READ/WRITE. The patch also prevents the allocation of key 3 from
userspace and UAMOR value is updated such that userspace cannot modify key 3.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-9-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kuap: Rename MMU_FTR_RADIX_KUAP and MMU_FTR_KUEP
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:09 +0000 (10:14 +0530)]
powerpc/book3s64/kuap: Rename MMU_FTR_RADIX_KUAP and MMU_FTR_KUEP

This is in preparation to adding support for kuap with hash translation.
In preparation for that rename/move kuap related functions to
non radix names. Also move the feature bit closer to MMU_FTR_KUEP.

MMU_FTR_KUEP is renamed to MMU_FTR_BOOK3S_KUEP to indicate the feature
is only relevant to BOOK3S_64

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-8-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kuep: Move KUEP related function outside radix
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:08 +0000 (10:14 +0530)]
powerpc/book3s64/kuep: Move KUEP related function outside radix

The next set of patches adds support for kuep with hash translation.
In preparation for that rename/move kuap related functions to
non radix names.

Also set MMU_FTR_KUEP and add the missing isync().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-7-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kuap: Move KUAP related function outside radix
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:07 +0000 (10:14 +0530)]
powerpc/book3s64/kuap: Move KUAP related function outside radix

The next set of patches adds support for kuap with hash translation.
In preparation for that rename/move kuap related functions to
non radix names.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-6-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kuap/kuep: Move uamor setup to pkey init
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:06 +0000 (10:14 +0530)]
powerpc/book3s64/kuap/kuep: Move uamor setup to pkey init

This patch consolidates UAMOR update across pkey, kuap and kuep features.
The boot cpu initialize UAMOR via pkey init and both radix/hash do the
secondary cpu UAMOR init in early_init_mmu_secondary.

We don't check for mmu_feature in radix secondary init because UAMOR
is a supported SPRN with all CPUs supporting radix translation.
The old code was not updating UAMOR if we had smap disabled and smep enabled.
This change handles that case.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-5-aneesh.kumar@linux.ibm.com
3 years agopowerpc/book3s64/kuap/kuep: Add PPC_PKEY config on book3s64
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:05 +0000 (10:14 +0530)]
powerpc/book3s64/kuap/kuep: Add PPC_PKEY config on book3s64

The config CONFIG_PPC_PKEY is used to select the base support that is
required for PPC_MEM_KEYS, KUAP, and KUEP. Adding this dependency
reduces the code complexity(in terms of #ifdefs) and enables us to
move some of the initialization code to pkeys.c

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-4-aneesh.kumar@linux.ibm.com
3 years agoKVM: PPC: BOOK3S: PR: Ignore UAMOR SPR
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:04 +0000 (10:14 +0530)]
KVM: PPC: BOOK3S: PR: Ignore UAMOR SPR

With power7 and above we expect the cpu to support keys. The
number of keys are firmware controlled based on device tree.
PR KVM do not expose key details via device tree. Hence when running with PR KVM
we do run with MMU_FTR_KEY support disabled. But we can still
get updates on UAMOR. Hence ignore access to them and for mfstpr return
0 indicating no AMR/IAMR update is no allowed.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-3-aneesh.kumar@linux.ibm.com
3 years agopowerpc: Add new macro to handle NESTED_IFCLR
Aneesh Kumar K.V [Fri, 27 Nov 2020 04:44:03 +0000 (10:14 +0530)]
powerpc: Add new macro to handle NESTED_IFCLR

This will be used by the following patches

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201127044424.40686-2-aneesh.kumar@linux.ibm.com
3 years agopowerpc/64s: Tidy machine check SLB logging
Nicholas Piggin [Sat, 28 Nov 2020 07:07:28 +0000 (17:07 +1000)]
powerpc/64s: Tidy machine check SLB logging

Since ISA v3.0, SLB no longer uses the slb_cache, and stab_rr is no
longer correlated with SLB allocation. Move those to pre-3.0.

While here, improve some alignments and reduce whitespace.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201128070728.825934-9-npiggin@gmail.com
3 years agopowerpc/64s: Remove "Host" from MCE logging
Nicholas Piggin [Sat, 28 Nov 2020 07:07:27 +0000 (17:07 +1000)]
powerpc/64s: Remove "Host" from MCE logging

"Host" caused machine check is printed when the kernel sees a MCE
hit in this kernel or userspace, and "Guest" if it hit one of its
guests. This is confusing when a guest kernel handles a hypervisor-
delivered MCE, it also prints "Host".

Just remove "Host". "Guest" is adequate to make the distinction.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201128070728.825934-8-npiggin@gmail.com
3 years agopowerpc/64s/pseries: Add ERAT specific machine check handler
Nicholas Piggin [Sat, 28 Nov 2020 07:07:26 +0000 (17:07 +1000)]
powerpc/64s/pseries: Add ERAT specific machine check handler

Don't treat ERAT MCEs as SLB, don't save the SLB and use a specific
ERAT flush to recover it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201128070728.825934-7-npiggin@gmail.com
3 years agopowerpc/64s/powernv: Ratelimit harmless HMI error printing
Nicholas Piggin [Sat, 28 Nov 2020 07:07:25 +0000 (17:07 +1000)]
powerpc/64s/powernv: Ratelimit harmless HMI error printing

Harmless HMI errors can be triggered by guests in some cases, and don't
contain much useful information anyway. Ratelimit these to avoid
flooding the console/logs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Use dedicated ratelimit state, not printk_ratelimit()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201128070728.825934-6-npiggin@gmail.com
3 years agoKVM: PPC: Book3S HV: Ratelimit machine check messages coming from guests
Nicholas Piggin [Sat, 28 Nov 2020 07:07:24 +0000 (17:07 +1000)]
KVM: PPC: Book3S HV: Ratelimit machine check messages coming from guests

A number of machine check exceptions are triggerable by the guest.
Ratelimit these to avoid a guest flooding the host console and logs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Use dedicated ratelimit state, not printk_ratelimit()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201128070728.825934-5-npiggin@gmail.com
3 years agoKVM: PPC: Book3S HV: Don't attempt to recover machine checks for FWNMI enabled guests
Nicholas Piggin [Sat, 28 Nov 2020 07:07:23 +0000 (17:07 +1000)]
KVM: PPC: Book3S HV: Don't attempt to recover machine checks for FWNMI enabled guests

Guests that can deal with machine checks would actually prefer the
hypervisor not to try recover for them. For example if SLB multi-hits
are recovered by the hypervisor by clearing the SLB then the guest
will not be able to log the contents and debug its programming error.

If guests don't register for FWNMI, they may not be so capable and so
the hypervisor will continue to recover for those.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201128070728.825934-4-npiggin@gmail.com
3 years agopowerpc/64s/powernv: Allow KVM to handle guest machine check details
Nicholas Piggin [Sat, 28 Nov 2020 07:07:22 +0000 (17:07 +1000)]
powerpc/64s/powernv: Allow KVM to handle guest machine check details

KVM has strategies to perform machine check recovery. If a MCE hits
in a guest, have the low level handler just decode and save the MCE
but not try to recover anything, so KVM can deal with it.

The host does not own SLBs and does not need to report the SLB state
in case of a multi-hit for example, or know about the virtual memory
map of the guest.

UE and memory poisoning of guest pages in the host is one thing that
is possibly not completely robust at the moment, but this too needs
to go via KVM (possibly via the guest and back out to host via hcall)
rather than being handled at a low level in the host handler.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201128070728.825934-3-npiggin@gmail.com
3 years agopowerpc/ps3: make system bus's remove and shutdown callbacks return void
Uwe Kleine-König [Thu, 26 Nov 2020 16:59:50 +0000 (17:59 +0100)]
powerpc/ps3: make system bus's remove and shutdown callbacks return void

The driver core ignores the return value of struct device_driver::remove
because there is only little that can be done. For the shutdown callback
it's ps3_system_bus_shutdown() which ignores the return value.

To simplify the quest to make struct device_driver::remove return void,
let struct ps3_system_bus_driver::remove return void, too. All users
already unconditionally return 0, this commit makes it obvious that
returning an error code is a bad idea and ensures future users behave
accordingly.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201126165950.2554997-2-u.kleine-koenig@pengutronix.de
3 years agoALSA: ppc: drop if block with always false condition
Uwe Kleine-König [Thu, 26 Nov 2020 16:59:49 +0000 (17:59 +0100)]
ALSA: ppc: drop if block with always false condition

The remove callback is only called for devices that were probed
successfully before. As the matching probe function cannot complete
without error if dev->match_id != PS3_MATCH_ID_SOUND, we don't have to
check this here.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201126165950.2554997-1-u.kleine-koenig@pengutronix.de
3 years agopowerpc/paravirt: Use is_kvm_guest() in vcpu_is_preempted()
Srikar Dronamraju [Wed, 2 Dec 2020 05:04:56 +0000 (10:34 +0530)]
powerpc/paravirt: Use is_kvm_guest() in vcpu_is_preempted()

If its a shared LPAR but not a KVM guest, then see if the vCPU is
related to the calling vCPU. On PowerVM, only cores can be preempted.
So if one vCPU is a non-preempted state, we can decipher that all
other vCPUs sharing the same core are in non-preempted state.

Performance results:

  $ perf stat -r 5 -a perf bench sched pipe -l 10000000 (lesser time is better)

  powerpc/next
       35,107,951.20 msec cpu-clock                 #  255.898 CPUs utilized            ( +-  0.31% )
          23,655,348      context-switches          #    0.674 K/sec                    ( +-  3.72% )
              14,465      cpu-migrations            #    0.000 K/sec                    ( +-  5.37% )
              82,463      page-faults               #    0.002 K/sec                    ( +-  8.40% )
   1,127,182,328,206      cycles                    #    0.032 GHz                      ( +-  1.60% )  (66.67%)
      78,587,300,622      stalled-cycles-frontend   #    6.97% frontend cycles idle     ( +-  0.08% )  (50.01%)
     654,124,218,432      stalled-cycles-backend    #   58.03% backend cycles idle      ( +-  1.74% )  (50.01%)
     834,013,059,242      instructions              #    0.74  insn per cycle
                                                    #    0.78  stalled cycles per insn  ( +-  0.73% )  (66.67%)
     132,911,454,387      branches                  #    3.786 M/sec                    ( +-  0.59% )  (50.00%)
       2,890,882,143      branch-misses             #    2.18% of all branches          ( +-  0.46% )  (50.00%)

             137.195 +- 0.419 seconds time elapsed  ( +-  0.31% )

  powerpc/next + patchset
       29,981,702.64 msec cpu-clock                 #  255.881 CPUs utilized            ( +-  1.30% )
          40,162,456      context-switches          #    0.001 M/sec                    ( +-  0.01% )
               1,110      cpu-migrations            #    0.000 K/sec                    ( +-  5.20% )
              62,616      page-faults               #    0.002 K/sec                    ( +-  3.93% )
   1,430,030,626,037      cycles                    #    0.048 GHz                      ( +-  1.41% )  (66.67%)
      83,202,707,288      stalled-cycles-frontend   #    5.82% frontend cycles idle     ( +-  0.75% )  (50.01%)
     744,556,088,520      stalled-cycles-backend    #   52.07% backend cycles idle      ( +-  1.39% )  (50.01%)
     940,138,418,674      instructions              #    0.66  insn per cycle
                                                    #    0.79  stalled cycles per insn  ( +-  0.51% )  (66.67%)
     146,452,852,283      branches                  #    4.885 M/sec                    ( +-  0.80% )  (50.00%)
       3,237,743,996      branch-misses             #    2.21% of all branches          ( +-  1.18% )  (50.01%)

              117.17 +- 1.52 seconds time elapsed  ( +-  1.30% )

This is around 14.6% improvement in performance.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Waiman Long <longman@redhat.com>
[mpe: Fold in performance results from cover letter]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201202050456.164005-5-srikar@linux.vnet.ibm.com
3 years agopowerpc: Reintroduce is_kvm_guest() as a fast-path check
Srikar Dronamraju [Wed, 2 Dec 2020 05:04:55 +0000 (10:34 +0530)]
powerpc: Reintroduce is_kvm_guest() as a fast-path check

Introduce a static branch that would be set during boot if the OS
happens to be a KVM guest. Subsequent checks to see if we are on KVM
will rely on this static branch. This static branch would be used in
vcpu_is_preempted() in a subsequent patch.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201202050456.164005-4-srikar@linux.vnet.ibm.com
3 years agopowerpc: Rename is_kvm_guest() to check_kvm_guest()
Srikar Dronamraju [Wed, 2 Dec 2020 05:04:54 +0000 (10:34 +0530)]
powerpc: Rename is_kvm_guest() to check_kvm_guest()

We want to reuse the is_kvm_guest() name in a subsequent patch but
with a new body. Hence rename is_kvm_guest() to check_kvm_guest(). No
additional changes.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: kernel test robot <lkp@intel.com> # int -> bool fix
[mpe: Fold in fix from lkp to use true/false not 0/1]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201202050456.164005-3-srikar@linux.vnet.ibm.com
3 years agopowerpc: Refactor is_kvm_guest() declaration to new header
Srikar Dronamraju [Wed, 2 Dec 2020 05:04:53 +0000 (10:34 +0530)]
powerpc: Refactor is_kvm_guest() declaration to new header

Only code/declaration movement, in anticipation of doing a KVM-aware
vcpu_is_preempted(). No additional changes.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201202050456.164005-2-srikar@linux.vnet.ibm.com
3 years agopowerpc: show registers when unwinding interrupt frames
Nicholas Piggin [Sat, 7 Nov 2020 02:33:05 +0000 (12:33 +1000)]
powerpc: show registers when unwinding interrupt frames

It's often useful to know the register state for interrupts in
the stack frame. In the below example (with this patch applied),
the important information is the state of the page fault.

A blatant case like this probably rather should have the page
fault regs passed down to the warning, but quite often there are
less obvious cases where an interrupt shows up that might give
some more clues.

The downside is longer and more complex bug output.

  Bug: Write fault blocked by AMR!
  WARNING: CPU: 0 PID: 72 at arch/powerpc/include/asm/book3s/64/kup-radix.h:164 __do_page_fault+0x880/0xa90
  Modules linked in:
  CPU: 0 PID: 72 Comm: systemd-gpt-aut Not tainted
  NIP:  c00000000006e2f0 LR: c00000000006e2ec CTR: 0000000000000000
  REGS: c00000000a4f3420 TRAP: 0700
  MSR:  8000000000021033 <SF,ME,IR,DR,RI,LE>  CR: 28002840  XER: 20040000
  CFAR: c000000000128be0 IRQMASK: 3
  GPR00: c00000000006e2ec c00000000a4f36c0 c0000000014f0700 0000000000000020
  GPR04: 0000000000000001 c000000001290f50 0000000000000001 c000000001290f80
  GPR08: c000000001612b08 0000000000000000 0000000000000000 00000000ffffe0f7
  GPR12: 0000000048002840 c0000000016e0000 c00c000000021c80 c000000000fd6f60
  GPR16: 0000000000000000 c00000000a104698 0000000000000003 c0000000087f0000
  GPR20: 0000000000000100 c0000000070330b8 0000000000000000 0000000000000004
  GPR24: 0000000002000000 0000000000000300 0000000002000000 c00000000a5b0c00
  GPR28: 0000000000000000 000000000a000000 00007fffb2a90038 c00000000a4f3820
  NIP [c00000000006e2f0] __do_page_fault+0x880/0xa90
  LR [c00000000006e2ec] __do_page_fault+0x87c/0xa90
  Call Trace:
  [c00000000a4f36c0] [c00000000006e2ec] __do_page_fault+0x87c/0xa90 (unreliable)
  [c00000000a4f3780] [c000000000e1c034] do_page_fault+0x34/0x90
  [c00000000a4f37b0] [c000000000008908] data_access_common_virt+0x158/0x1b0
  --- interrupt: 300 at __copy_tofrom_user_base+0x9c/0x5a4
  NIP:  c00000000009b028 LR: c000000000802978 CTR: 0000000000000800
  REGS: c00000000a4f3820 TRAP: 0300
  MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24004840  XER: 00000000
  CFAR: c00000000009aff4 DAR: 00007fffb2a90038 DSISR: 0a000000 IRQMASK: 0
  GPR00: 0000000000000000 c00000000a4f3ac0 c0000000014f0700 00007fffb2a90028
  GPR04: c000000008720010 0000000000010000 0000000000000000 0000000000000000
  GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
  GPR12: 0000000000004000 c0000000016e0000 c00c000000021c80 c000000000fd6f60
  GPR16: 0000000000000000 c00000000a104698 0000000000000003 c0000000087f0000
  GPR20: 0000000000000100 c0000000070330b8 0000000000000000 0000000000000004
  GPR24: c00000000a4f3c80 c000000008720000 0000000000010000 0000000000000000
  GPR28: 0000000000010000 0000000008720000 0000000000010000 c000000001515b98
  NIP [c00000000009b028] __copy_tofrom_user_base+0x9c/0x5a4
  LR [c000000000802978] copyout+0x68/0xc0
  --- interrupt: 300
  [c00000000a4f3af0] [c0000000008074b8] copy_page_to_iter+0x188/0x540
  [c00000000a4f3b50] [c00000000035c678] generic_file_buffered_read+0x358/0xd80
  [c00000000a4f3c40] [c0000000004c1e90] blkdev_read_iter+0x50/0x80
  [c00000000a4f3c60] [c00000000045733c] new_sync_read+0x12c/0x1c0
  [c00000000a4f3d00] [c00000000045a1f0] vfs_read+0x1d0/0x240
  [c00000000a4f3d50] [c00000000045a7f4] ksys_read+0x84/0x140
  [c00000000a4f3da0] [c000000000033a60] system_call_exception+0x100/0x280
  [c00000000a4f3e10] [c00000000000c508] system_call_common+0xf8/0x2f8
  Instruction dump:
  eae10078 3be0000b 4bfff890 60420000 792917e1 4182ff18 3c82ffab 3884a5e0
  3c62ffab 3863a6e8 480ba891 60000000 <0fe000003be0000b 4bfff860 e93c0938

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201107023305.2384874-1-npiggin@gmail.com
3 years agopowerpc/perf: Invoke per-CPU variable access with disabled interrupts
Athira Rajeev [Tue, 1 Dec 2020 09:28:00 +0000 (04:28 -0500)]
powerpc/perf: Invoke per-CPU variable access with disabled interrupts

The power_pmu_event_init() callback access per-cpu variable
(cpu_hw_events) to check for event constraints and Branch Stack
(BHRB). Current usage is to disable preemption when accessing the
per-cpu variable, but this does not prevent timer callback from
interrupting event_init. Fix this by using 'local_irq_save/restore'
to make sure the code path is invoked with disabled interrupts.

This change is tested in mambo simulator to ensure that, if a timer
interrupt comes in during the per-cpu access in event_init, it will be
soft masked and replayed later. For testing purpose, introduced a
udelay() in power_pmu_event_init() to make sure a timer interrupt arrives
while in per-cpu variable access code between local_irq_save/resore.
As expected the timer interrupt was replayed later during local_irq_restore
called from power_pmu_event_init. This was confirmed by adding
breakpoint in mambo and checking the backtrace when timer_interrupt
was hit.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606814880-1720-1-git-send-email-atrajeev@linux.vnet.ibm.com
3 years agoselftests/powerpc: Fix uninitialized variable warning
Harish [Tue, 1 Dec 2020 09:24:03 +0000 (14:54 +0530)]
selftests/powerpc: Fix uninitialized variable warning

Patch fixes uninitialized variable warning in bad_accesses test
which causes the selftests build to fail in older distibutions

bad_accesses.c: In function ‘bad_access’:
bad_accesses.c:52:9: error: ‘x’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
   printf("Bad - no SEGV! (%c)\n", x);
         ^
cc1: all warnings being treated as errors

Signed-off-by: Harish <harish@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201201092403.238182-1-harish@linux.ibm.com
3 years agoselftests/powerpc: update .gitignore
Daniel Axtens [Tue, 1 Dec 2020 14:44:27 +0000 (01:44 +1100)]
selftests/powerpc: update .gitignore

I did an in-place build of the self-tests and found that it left
the tree dirty.

Add missed test binaries to .gitignore

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201201144427.1228745-1-dja@axtens.net
3 years agopowerpc/feature-fixups: use a semicolon rather than a comma
Daniel Axtens [Tue, 1 Dec 2020 14:43:44 +0000 (01:43 +1100)]
powerpc/feature-fixups: use a semicolon rather than a comma

In a bunch of our security flushes, we use a comma rather than
a semicolon to 'terminate' an assignment. Nothing breaks, but
checkpatch picks it up if you copy it into another flush.

Switch to semicolons for ending statements.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201201144344.1228421-1-dja@axtens.net
3 years agopowerpc/pseries: Define PCI bus speed for Gen4 and Gen5
Frederic Barrat [Mon, 30 Nov 2020 15:29:49 +0000 (16:29 +0100)]
powerpc/pseries: Define PCI bus speed for Gen4 and Gen5

Update bus speed definition for PCI Gen4 and 5.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201130152949.26467-1-fbarrat@linux.ibm.com
3 years agopowerpc: Allow relative pointers in bug table entries
Jordan Niethe [Tue, 1 Dec 2020 00:52:03 +0000 (11:52 +1100)]
powerpc: Allow relative pointers in bug table entries

This enables GENERIC_BUG_RELATIVE_POINTERS on Power so that 32-bit
offsets are stored in the bug entries rather than 64-bit pointers.
While this doesn't save space for 32-bit machines, use it anyway so
there is only one code path.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201201005203.15210-1-jniethe5@gmail.com
3 years agopowerpc/xmon: Fix build failure for 8xx
Ravi Bangoria [Mon, 30 Nov 2020 03:44:06 +0000 (09:14 +0530)]
powerpc/xmon: Fix build failure for 8xx

With CONFIG_PPC_8xx and CONFIG_XMON set, kernel build fails with

  arch/powerpc/xmon/xmon.c:1379:12: error: 'find_free_data_bpt' defined
  but not used [-Werror=unused-function]

Fix it by enclosing find_free_data_bpt() inside #ifndef CONFIG_PPC_8xx.

Fixes: 30df74d67d48 ("powerpc/watchpoint/xmon: Support 2nd DAWR")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201130034406.288047-1-ravi.bangoria@linux.ibm.com