linux-block.git
6 months agodrm/xe/guc: Fix missing topology init
Zhanjun Dong [Tue, 27 Feb 2024 16:49:22 +0000 (08:49 -0800)]
drm/xe/guc: Fix missing topology init

init_steering_dss need topology dss mask to be init ahead.
Fixed by moving xe_gt_topology_init ahead of xe_gt_mcr_init

Fixes: bf8ec3c3e82c ("drm/xe: Initialize GuC earlier during probe")
Cc: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240227164922.281346-2-zhanjun.dong@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
6 months agodrm/xe/xe2: fix 64-bit division in pte_update_size
Arnd Bergmann [Mon, 26 Feb 2024 12:46:38 +0000 (13:46 +0100)]
drm/xe/xe2: fix 64-bit division in pte_update_size

This function does not build on 32-bit targets when the compiler
fails to reduce DIV_ROUND_UP() into a shift:

ld.lld: error: undefined symbol: __aeabi_uldivmod
>>> referenced by xe_migrate.c
>>>               drivers/gpu/drm/xe/xe_migrate.o:(pte_update_size) in archive vmlinux.a

There are two instances in this function. Change the first to
use an open-coded shift with the same behavior, and the second
one to a 32-bit calculation, which is sufficient here as the size
is never more than 2^32 pages (16TB).

Fixes: 237412e45390 ("drm/xe: Enable 32bits build")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-3-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
6 months agodrm/xe/mmio: fix build warning for BAR resize on 32-bit
Arnd Bergmann [Mon, 26 Feb 2024 12:46:37 +0000 (13:46 +0100)]
drm/xe/mmio: fix build warning for BAR resize on 32-bit

clang complains about a nonsensical test on builds with a 32-bit phys_addr_t,
which means resizing will always fail:

drivers/gpu/drm/xe/xe_mmio.c:109:23: error: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
  109 |                     root_res->start > 0x100000000ull)
      |                     ~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~

Previously, BAR resize was always disallowed on 32-bit kernels, but
this apparently changed recently. Since 32-bit machines can in theory
support PAE/LPAE for large address spaces, this may end up useful,
so change the driver to shut up the warning but still work when
phys_addr_t/resource_size_t is 64 bit wide.

Fixes: 9a6e6c14bfde ("drm/xe/mmio: Use non-atomic writeq/readq variant for 32b")
Fixes: 237412e45390 ("drm/xe: Enable 32bits build")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-2-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
6 months agodrm/xe/kunit: fix link failure with built-in xe
Arnd Bergmann [Mon, 26 Feb 2024 12:46:36 +0000 (13:46 +0100)]
drm/xe/kunit: fix link failure with built-in xe

When the driver is built-in but the tests are in loadable modules,
the helpers don't actually get put into the driver:

ERROR: modpost: "xe_kunit_helper_alloc_xe_device" [drivers/gpu/drm/xe/tests/xe_test.ko] undefined!

Change the Makefile to ensure they are always part of the driver
even when the rest of the kunit tests are in loadable modules.

Fixes: 5095d13d758b ("drm/xe/kunit: Define helper functions to allocate fake xe device")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-1-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
6 months agodrm/xe: Deny unbinds if uapi ufence pending
Mika Kuoppala [Thu, 15 Feb 2024 18:11:52 +0000 (20:11 +0200)]
drm/xe: Deny unbinds if uapi ufence pending

If user fence was provided for MAP in vm_bind_ioctl
and it has still not been signalled, deny UNMAP of said
vma with EBUSY as long as unsignalled fence exists.

This guarantees that MAP vs UNMAP sequences won't
escape under the radar if we ever want to track the
client's state wrt to completed and accessible MAPs.
By means of intercepting the ufence release signalling.

v2: find ufence with num_fences > 1 (Matt)
v3: careful on clearing vma ufence (Matt)

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1159
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215181152.450082-3-mika.kuoppala@linux.intel.com
6 months agodrm/xe: Expose user fence from xe_sync_entry
Mika Kuoppala [Thu, 15 Feb 2024 18:11:51 +0000 (20:11 +0200)]
drm/xe: Expose user fence from xe_sync_entry

By allowing getting reference to user fence, we can
control the lifetime outside of sync entries.

This is needed to allow vma to track the associated
user fence that was provided with bind ioctl.

v2: xe_user_fence can be kept opaque (Jani, Matt)
v3: indent fix (Matt)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215181152.450082-2-mika.kuoppala@linux.intel.com
6 months agodrm/xe/guc: Handle timing out of signaled jobs gracefully
Matthew Brost [Fri, 23 Feb 2024 20:46:59 +0000 (12:46 -0800)]
drm/xe/guc: Handle timing out of signaled jobs gracefully

Timing out of signaled jobs can happen during regular operations (e.g.
an exec queue closed immediately after last fence signaled). The TDR can
pass the worker which free jobs. Rather than running through the TDR if
signaled job is found, simply free it without any debug messages.

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reported-by: José Roberto de Souza <jose.souza@intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1271
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240223204659.40750-1-matthew.brost@intel.com
6 months agodrm/xe: get rid of MAX_BINDS
Paulo Zanoni [Thu, 15 Feb 2024 00:53:53 +0000 (16:53 -0800)]
drm/xe: get rid of MAX_BINDS

Mesa has been issuing a single bind operation per ioctl since xe.ko
changed to GPUVA due xe.ko bug #746. If I change Mesa to try again to
issue every single bind operation it can in the same ioctl, it hits
the MAX_BINDS assertion when running Vulkan conformance tests.

Test dEQP-VK.sparse_resources.transfer_queue.3d.rgba32i.1024_128_8
issues 960 bind operations in a single ioctl, it's the most I could
find in the conformance suite.

I don't see a reason to keep the MAX_BINDS restriction: it doesn't
seem to be preventing any specific issue. If the number is too big for
the memory allocations, then those will fail. Nothing related to
num_binds seems to be using the stack. Let's just get rid of it.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Testcase: dEQP-VK.sparse_resources.transfer_queue.3d.rgba32i.1024_128_8
References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/746
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215005353.1295420-1-paulo.r.zanoni@intel.com
6 months agodrm/xe: Use vmalloc for array of bind allocation in bind IOCTL
Matthew Brost [Mon, 26 Feb 2024 15:55:54 +0000 (07:55 -0800)]
drm/xe: Use vmalloc for array of bind allocation in bind IOCTL

Use vmalloc in effort to allow a user pass in a large number of binds in
an IOCTL (mesa use case). Also use array allocations rather open coding
the size calculation.

v2: Use __GFP_ACCOUNT for allocations (Thomas)

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226155554.103384-1-matthew.brost@intel.com
6 months agodrm/xe: Extend uAPI to query HuC micro-controler firmware version
Francois Dugast [Thu, 8 Feb 2024 18:35:39 +0000 (10:35 -0800)]
drm/xe: Extend uAPI to query HuC micro-controler firmware version

The infrastructure to query GuC firmware version is already in place. It
is extended with a new micro-controller type to query the HuC firmware
version. It can be used from user space to know if HuC is running.

Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240208183539.185095-2-jose.souza@intel.com
6 months agodrm/xe: Remove useless mem_access on PAT dumps
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:37 +0000 (11:39 -0500)]
drm/xe: Remove useless mem_access on PAT dumps

PAT dumps are already protected by the xe_pm_runtime_{get,put}
around the debugfs call. So, these can be removed.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-14-rodrigo.vivi@intel.com
6 months agodrm/xe: Convert gt_reset from mem_access to xe_pm_runtime
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:36 +0000 (11:39 -0500)]
drm/xe: Convert gt_reset from mem_access to xe_pm_runtime

We need to ensure that device is in D0 on any kind of GT reset.
We are likely already protected by outer bounds like exec,
but if exec/sched ref gets dropped on a hang, we might transition
to D3 before we are able to perform the gt_reset and recover.

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-13-rodrigo.vivi@intel.com
6 months agodrm/xe: Remove mem_access from suspend and resume functions
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:35 +0000 (11:39 -0500)]
drm/xe: Remove mem_access from suspend and resume functions

At these points, we are sure that device is awake in D0.
Likely in the middle of the transition, but awake. So,
these extra protections are useless. Let's remove it and
continue with the killing of xe_device_mem_access.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-12-rodrigo.vivi@intel.com
6 months agodrm/xe: Convert gsc_work from mem_access to xe_pm_runtime
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:34 +0000 (11:39 -0500)]
drm/xe: Convert gsc_work from mem_access to xe_pm_runtime

Let's directly use xe_pm_runtime_{get,put} instead of the
mem_access helpers that are going away soon.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-11-rodrigo.vivi@intel.com
6 months agodrm/xe: Remove useless mem_access protection for query ioctls
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:33 +0000 (11:39 -0500)]
drm/xe: Remove useless mem_access protection for query ioctls

Every IOCTL is already protected on its outer bounds by
xe_pm_runtime_{get,put} calls, so we can now remove
these.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-10-rodrigo.vivi@intel.com
6 months agodrm/xe: Convert hwmon from mem_access to xe_pm_runtime calls
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:32 +0000 (11:39 -0500)]
drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls

Continue the work to kill the mem_access in favor of a pure runtime pm.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-9-rodrigo.vivi@intel.com
6 months agodrm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:31 +0000 (11:39 -0500)]
drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls

Continue on the path to entirely remove mem_access helpers in
favour of the direct xe_pm_runtime calls. This item is one of
the direct outer bounds of the protection.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-8-rodrigo.vivi@intel.com
6 months agodrm/xe: Runtime PM wake on every debugfs call
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:30 +0000 (11:39 -0500)]
drm/xe: Runtime PM wake on every debugfs call

Let's ensure our PCI device is awaken on every debugfs call.
Let's increase the runtime_pm protection and start moving
that to the outer bounds.

Also let's remove the mem_access_{get,put} from where they are
not needed anymore.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-7-rodrigo.vivi@intel.com
6 months agodrm/xe: Remove mem_access from guc_pc calls
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:29 +0000 (11:39 -0500)]
drm/xe: Remove mem_access from guc_pc calls

We are now protected by init, sysfs, or removal and don't
need these mem_access protections around GuC_PC anymore.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-6-rodrigo.vivi@intel.com
6 months agodrm/xe: Runtime PM wake on every sysfs call
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:28 +0000 (11:39 -0500)]
drm/xe: Runtime PM wake on every sysfs call

Let's ensure our PCI device is awaken on every sysfs call.
Let's increase the runtime_pm protection and start moving
that to the outer bounds.

For now, for the files with small number of attr functions,
let's only call the runtime pm functions directly.
For the hw_engines entries with many files, let's add
the sysfs_ops wrapper.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-5-rodrigo.vivi@intel.com
6 months agodrm/xe: Convert kunit tests from mem_access to xe_pm_runtime
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:27 +0000 (11:39 -0500)]
drm/xe: Convert kunit tests from mem_access to xe_pm_runtime

Let's convert the kunit tests that are currently relying on
xe_device_mem_access_{get,put} towards the direct xe_pm_runtime_{get,put}.
While doing this we need to move the get/put calls towards the outer
bounds of the tests to ensure consistency with the other usages of
pm_runtime on the regular paths.

v2: include xe_pm.h in tests/xe_mocs.c and sort the include block
    while at it.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-4-rodrigo.vivi@intel.com
6 months agodrm/xe: Runtime PM wake on every IOCTL
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:26 +0000 (11:39 -0500)]
drm/xe: Runtime PM wake on every IOCTL

Let's ensure our PCI device is awaken on every IOCTL entry.
Let's increase the runtime_pm protection and start moving
that to the outer bounds.

v2: minor typo fix and renaming function to make it clear
    that is intended to be used by ioctl only. (Matt)
v3: Make it NULL if CONFIG_COMPAT is not selected.

Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-3-rodrigo.vivi@intel.com
6 months agodrm/xe: Convert mem_access assertion towards the runtime_pm state
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:25 +0000 (11:39 -0500)]
drm/xe: Convert mem_access assertion towards the runtime_pm state

The mem_access helpers are going away and getting replaced by
direct calls of the xe_pm_runtime_{get,put} functions. However, an
assertion with a warning splat is desired when we hit the worst
case of a memory access with the device really in the 'suspended'
state.

Also, this needs to be the first step. Otherwise, the upcoming
conversion would be really noise with warn splats of missing mem_access
gets.

v2: Minor doc changes as suggested by Matt

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-2-rodrigo.vivi@intel.com
6 months agodrm/xe: Document Xe PM component
Rodrigo Vivi [Thu, 22 Feb 2024 16:39:24 +0000 (11:39 -0500)]
drm/xe: Document Xe PM component

Replace outdated information with a proper PM documentation.
Already establish the rules for the runtime PM get and put that
Xe needs to follow.

Also add missing function documentation to all the "exported" functions.

v2: updated after Francois' feedback.
    s/grater/greater (Matt)
v3: detach D3 from runtime_pm
    remove opportunistic S0iX (Anshuman)

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Acked-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com> #v2
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-1-rodrigo.vivi@intel.com
6 months agodrm/xe: Don't support execlists in xe_gt_tlb_invalidation layer
Matthew Brost [Thu, 22 Feb 2024 23:20:21 +0000 (15:20 -0800)]
drm/xe: Don't support execlists in xe_gt_tlb_invalidation layer

The xe_gt_tlb_invalidation layer implements TLB invalidations for a GuC
backend. Simply return if in execlists mode. A follow up may properly
implement the xe_gt_tlb_invalidation layer for both GuC and execlists.

Fixes: a9351846d945 ("drm/xe: Break of TLB invalidation into its own file")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222232021.3911545-4-matthew.brost@intel.com
6 months agodrm/xe: Cleanup some layering in GGTT
Matthew Brost [Thu, 22 Feb 2024 23:20:20 +0000 (15:20 -0800)]
drm/xe: Cleanup some layering in GGTT

xe_ggtt.c touched GuC layers which is incorrect. Call into
xe_gt_tlb_invalidation layer instead.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222232021.3911545-3-matthew.brost@intel.com
6 months agodrm/xe: Fix execlist splat
Matthew Brost [Thu, 22 Feb 2024 23:20:19 +0000 (15:20 -0800)]
drm/xe: Fix execlist splat

Although execlist submission is not supported it should be kept in a
basic working state as it can be used for very early hardware bring up.
Fix the below splat.

WARNING: CPU: 3 PID: 11 at drivers/gpu/drm/xe/xe_execlist.c:217 execlist_run_job+0x1c2/0x220 [xe]
Modules linked in: xe drm_kunit_helpers drm_gpuvm drm_ttm_helper ttm drm_exec drm_suballoc_helper drm_buddy gpu_sched mei_pxp mei_hdcp wmi_bmof x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core video snd_pcm mei_me mei wmi fuse e1000e i2c_i801 ptp i2c_smbus pps_core intel_lpss_pci
CPU: 3 PID: 11 Comm: kworker/u16:0 Tainted: G     U             6.8.0-rc3-guc+ #1046
Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWI1.R00.3243.A01.2006102133 06/10/2020
Workqueue: rcs0 drm_sched_run_job_work [gpu_sched]
RIP: 0010:execlist_run_job+0x1c2/0x220 [xe]
Code: 8b f8 03 00 00 4c 89 39 e9 e2 fe ff ff 49 8d 7d 20 be ff ff ff ff e8 ed fd a6 e1 85 c0 0f 85 e1 fe ff ff 0f 0b e9 da fe ff ff <0f> 0b 0f 0b 41 83 fc 03 0f 86 8a fe ff ff 0f 0b e9 83 fe ff ff be
RSP: 0018:ffffc9000013bdb8 EFLAGS: 00010246
RAX: ffff888105021a00 RBX: ffff888105078400 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffc9000013bd14 RDI: ffffc90001609090
RBP: ffff88811e3f0040 R08: 0000000000000088 R09: 00000000ffffff81
R10: 0000000000000001 R11: ffff88810c10c000 R12: 00000000fffffffe
R13: ffff888109b72c28 R14: ffff8881050784a0 R15: ffff888105078408
FS:  0000000000000000(0000) GS:ffff88849f980000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000563459d130f8 CR3: 000000000563a001 CR4: 0000000000f70ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 ? __warn+0x7f/0x170
 ? execlist_run_job+0x1c2/0x220 [xe]
 ? report_bug+0x1c7/0x1d0
 ? handle_bug+0x3c/0x70
 ? exc_invalid_op+0x18/0x70
 ? asm_exc_invalid_op+0x1a/0x20
 ? execlist_run_job+0x1c2/0x220 [xe]
 ? execlist_run_job+0x2c/0x220 [xe]
 drm_sched_run_job_work+0x246/0x3f0 [gpu_sched]
 ? process_one_work+0x18d/0x4e0
 process_one_work+0x1f7/0x4e0
 worker_thread+0x1da/0x3e0
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xfe/0x130
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2c/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1b/0x30
 </TASK>

Fixes: 9b9529ce379a ("drm/xe: Rename engine to exec_queue")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222232021.3911545-2-matthew.brost@intel.com
6 months agodrm/xe/uapi: Remove unused flags
Francois Dugast [Thu, 22 Feb 2024 23:23:56 +0000 (18:23 -0500)]
drm/xe/uapi: Remove unused flags

Those cases missed in previous uAPI cleanups were mostly accidentally
brought in from i915 or created to exercise the possibilities of gpuvm
but they are not used by userspace yet, so let's remove them. They can
still be brought back later if needed.

v2:
- Fix XE_VM_FLAG_FAULT_MODE support in xe_lrc.c (Brian Welty)
- Leave DRM_XE_VM_BIND_OP_UNMAP_ALL (José Roberto de Souza)
- Ensure invalid flag values are rejected (Rodrigo Vivi)

v3: Rebase after removal of persistent exec_queues (Francois Dugast)

v4: Rodrigo: Rebase after the new dumpable flag.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222232356.175431-1-rodrigo.vivi@intel.com
6 months agodrm/xe: Prefer struct_size over open coded arithmetic
Erick Archer [Sat, 10 Feb 2024 14:19:12 +0000 (15:19 +0100)]
drm/xe: Prefer struct_size over open coded arithmetic

This is an effort to get rid of all multiplications from allocation
functions in order to prevent integer overflows [1].

As the "q" variable is a pointer to "struct xe_exec_queue" and this
structure ends in a flexible array:

struct xe_exec_queue {
[...]
struct xe_lrc lrc[];
};

the preferred way in the kernel is to use the struct_size() helper to
do the arithmetic instead of the argument "size + size * count" in the
kzalloc() function.

This way, the code is more readable and more safer.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
Link: https://github.com/KSPP/linux/issues/160
Signed-off-by: Erick Archer <erick.archer@gmx.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240210141913.6611-1-erick.archer@gmx.com
6 months agodrm/xe: Use pointers in trace events
Lucas De Marchi [Thu, 22 Feb 2024 14:41:24 +0000 (06:41 -0800)]
drm/xe: Use pointers in trace events

Commit a0df2cc858c3 ("drm/xe/xe_bo_move: Enhance xe_bo_move trace")
inadvertently reverted commit 8d038f49c1f3 ("drm/xe: Fix cast on trace
variable"), breaking the build on 32bits.

As noted by Ville, there's no point in converting the pointers to u64
and add casts everywhere. In fact, it's better to just use %p and let
the address be hashed. Convert all the cases in xe_trace.h to use
pointers.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222144125.2862546-1-lucas.demarchi@intel.com
6 months agodrm/xe: Do not include current dir for generated/xe_wa_oob.h
Dafna Hirschfeld [Wed, 21 Feb 2024 08:36:22 +0000 (10:36 +0200)]
drm/xe: Do not include current dir for generated/xe_wa_oob.h

The generated file 'generated/xe_wa_oob.h' is included using:
"generated/xe_wa_oob.h"
which first look inside the source code. But the file resides
in the build directory and should therefore be included using:
<generated/xe_wa_oob.h>

Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221083622.1584492-1-dhirschfeld@habana.ai
6 months agodrm/xe: Add debug prints for skipping rebinds
Matthew Brost [Wed, 21 Feb 2024 03:27:43 +0000 (19:27 -0800)]
drm/xe: Add debug prints for skipping rebinds

Will help debug issues with VM binds.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221032743.3698849-1-matthew.brost@intel.com
6 months agodrm/xe: Implement VM snapshot support for BO's and userptr
Maarten Lankhorst [Wed, 21 Feb 2024 13:30:21 +0000 (14:30 +0100)]
drm/xe: Implement VM snapshot support for BO's and userptr

Since we cannot immediately capture the BO's and userptr, perform it in
2 stages. The immediate stage takes a reference to each BO and userptr,
while a delayed worker captures the contents and then frees the
reference.

This is required because in signaling context, no locks can be taken, no
memory can be allocated, and no waits on userspace can be performed.

With the delayed worker, all of this can be performed very easily,
without having to resort to hacks.

Changes since v1:
- Fix crash on NULL captured vm.
- Use ascii85_encode to capture BO contents and save some space.
- Add length to coredump output for each captured area.
Changes since v2:
- Dump each mapping on their own line, to simplify tooling.
- Fix null pointer deref in xe_vm_snapshot_free.
Changes since v3:
- Don't add uninitialized value to snap->ofs. (Souza)
- Use kernel types for u32 and u64.
- Move snap_mutex destruction to final vm destruction. (Souza)
Changes since v4:
- Remove extra memset. (Souza)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221133024.898315-6-maarten.lankhorst@linux.intel.com
6 months agodrm/xe: Add vm snapshot mutex for easily taking a vm snapshot during devcoredump
Maarten Lankhorst [Wed, 21 Feb 2024 13:30:20 +0000 (14:30 +0100)]
drm/xe: Add vm snapshot mutex for easily taking a vm snapshot during devcoredump

The devcoredump is done in fence signaling context. Because of this, we
cannot take any of the normal mutexes or we would invert.

Normal: Take vm->lock, dma_fence_wait()
Devcoredump: from dma_fence_wait() context, take vm->lock.

This doesn't work, and we only care about integrity, so take the locks
around additions and removals of vma's.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221133024.898315-5-maarten.lankhorst@linux.intel.com
6 months agodrm/xe: Annotate each dumpable vma as such
Maarten Lankhorst [Wed, 21 Feb 2024 13:30:19 +0000 (14:30 +0100)]
drm/xe: Annotate each dumpable vma as such

In preparation for snapshot dumping, mark each dumpable VMA as such, so
we can walk over the VM later and dump it.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221133024.898315-4-maarten.lankhorst@linux.intel.com
6 months agodrm/xe: Add uapi for dumpable bos
Maarten Lankhorst [Wed, 21 Feb 2024 13:30:18 +0000 (14:30 +0100)]
drm/xe: Add uapi for dumpable bos

Add the flag XE_VM_BIND_FLAG_DUMPABLE to notify devcoredump that this
mapping should be dumped.

This is not hooked up, but the uapi should be ready before merging.

It's likely easier to dump the contents of the bo's at devcoredump
readout time, so it's better if the bos will stay unmodified after
a hang. The NEEDS_CPU_MAPPING flag is removed as requirement.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221133024.898315-3-maarten.lankhorst@linux.intel.com
6 months agodrm/xe: Clear all snapshot members after deleting coredump
Maarten Lankhorst [Wed, 21 Feb 2024 13:30:17 +0000 (14:30 +0100)]
drm/xe: Clear all snapshot members after deleting coredump

It's not strictly needed to clear right now, but this prevents bugs
from dangling pointers.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221133024.898315-2-maarten.lankhorst@linux.intel.com
6 months agodrm/xe/snapshot: Remove drm_err on guc alloc failures
Maarten Lankhorst [Wed, 21 Feb 2024 13:30:16 +0000 (14:30 +0100)]
drm/xe/snapshot: Remove drm_err on guc alloc failures

The kernel will complain loudly if allocation fails, no need to do it
ourselves.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221133024.898315-1-maarten.lankhorst@linux.intel.com
6 months agodrm/xe: Fix modpost warning on xe_mocs kunit module
Ashutosh Dixit [Tue, 13 Feb 2024 03:35:48 +0000 (19:35 -0800)]
drm/xe: Fix modpost warning on xe_mocs kunit module

$ make W=1 -j100 M=drivers/gpu/drm/xe
MODPOST drivers/gpu/drm/xe/Module.symvers
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_mocs_test.o

Fix is identical to '1d425066f15f ("drm/xe: Fix modpost warning on kunit
modules")'.

Fixes: a6a4ea6d7d37 ("drm/xe: Add mocs kunit")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
6 months agodrm/xe/xe_gt_idle: Drop redundant newline in name
Ashutosh Dixit [Tue, 6 Feb 2024 19:27:31 +0000 (11:27 -0800)]
drm/xe/xe_gt_idle: Drop redundant newline in name

Newline in name is redunant and produces an unnecessary empty line during
'cat name'. Newline is added during sysfs_emit. See '27a1a1e2e47d ("drm/xe:
stringify the argument to avoid potential vulnerability")'.

v2: Add Fixes tag (Riana)

Fixes: 7b076d14f21a ("drm/xe/mtl: Add support to get C6 residency/status of MTL")
Reviewed-by: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
6 months agodrm/xe/guc: Remove usage of the deprecated ida_simple_xx() API
Christophe JAILLET [Sun, 14 Jan 2024 15:09:16 +0000 (16:09 +0100)]
drm/xe/guc: Remove usage of the deprecated ida_simple_xx() API

ida_alloc() and ida_free() should be preferred to the deprecated
ida_simple_get() and ida_simple_remove().

Note that the upper limit of ida_simple_get() is exclusive, but the one of
ida_alloc_max() is inclusive. So a -1 has been added when needed.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/d6a9ec9dc426fca372eaa1423a83632bd743c5d9.1705244938.git.christophe.jaillet@wanadoo.fr
6 months agodrm/xe: Initialize GuC earlier during probe
Michał Winiarski [Mon, 19 Feb 2024 13:05:30 +0000 (14:05 +0100)]
drm/xe: Initialize GuC earlier during probe

SR-IOV VF has limited access to MMIO registers. Fortunately, it is able
to access a curated subset that is needed to initialize the driver by
communicating with SR-IOV PF using GuC CT.
Initialize GuC earlier in order to keep the unified probe ordering
between VF and PF modes.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219130530.1406044-4-michal.winiarski@intel.com
6 months agodrm/xe/guc: Move GuC power control init to "post-hwconfig"
Michał Winiarski [Mon, 19 Feb 2024 13:05:29 +0000 (14:05 +0100)]
drm/xe/guc: Move GuC power control init to "post-hwconfig"

SLPC is not used at "hwconfig" stage. Move the initialization of data
structures used for SLPC to a later point in probe.
Also - move the xe_guc_pc_init_early to happen just prior to initial
"hwconfig" load.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219130530.1406044-3-michal.winiarski@intel.com
6 months agodrm/xe/huc: Realloc HuC FW in vram for post-hwconfig
Michał Winiarski [Mon, 19 Feb 2024 13:05:28 +0000 (14:05 +0100)]
drm/xe/huc: Realloc HuC FW in vram for post-hwconfig

Similar to GuC, we're using system memory for the initial stage, and
move the image to vram when it's available for subsequent loads (e.g.
after reset).

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219130530.1406044-2-michal.winiarski@intel.com
6 months agodrm/xe/guc: Allocate GuC data structures in system memory for initial load
Michał Winiarski [Mon, 19 Feb 2024 13:05:27 +0000 (14:05 +0100)]
drm/xe/guc: Allocate GuC data structures in system memory for initial load

GuC load will need to happen at an earlier point in probe, where local
memory is not yet available. Use system memory for GuC data structures
used for initial "hwconfig" load, and realloc at a later,
"post-hwconfig" load if needed, when local memory is available.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219130530.1406044-1-michal.winiarski@intel.com
6 months agoMerge drm/drm-next into drm-xe-next
Lucas De Marchi [Tue, 20 Feb 2024 17:26:12 +0000 (09:26 -0800)]
Merge drm/drm-next into drm-xe-next

Bring changes from drm-misc-next that got merged in drm-next back to
drm-xe so they can be used for additional features.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
6 months agodrm/xe: Return 2MB page size for compact 64k PTEs
Matthew Brost [Mon, 19 Feb 2024 21:19:42 +0000 (13:19 -0800)]
drm/xe: Return 2MB page size for compact 64k PTEs

Compact 64k PTEs are only intended to be used within a single VMA which
covers the entire 2MB range of the compact 64k PTEs. Add
XE_VMA_PTE_COMPACT VMA flag to indicate compact 64k PTEs are used and
update xe_vma_max_pte_size to return at least 2MB if set.

v2: Include missing changes

Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds")
Fixes: c47794bdd63d ("drm/xe: Set max pte size when skipping rebinds")
Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/758
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219211942.3633795-4-matthew.brost@intel.com
6 months agodrm/xe: Add XE_VMA_PTE_64K VMA flag
Matthew Brost [Mon, 19 Feb 2024 21:19:41 +0000 (13:19 -0800)]
drm/xe: Add XE_VMA_PTE_64K VMA flag

Add XE_VMA_PTE_64K VMA flag to ensure skipping rebinds does not cross
64k page boundaries.

Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds")
Fixes: c47794bdd63d ("drm/xe: Set max pte size when skipping rebinds")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219211942.3633795-3-matthew.brost@intel.com
6 months agodrm/xe: Fix xe_vma_set_pte_size
Matthew Brost [Mon, 19 Feb 2024 21:19:40 +0000 (13:19 -0800)]
drm/xe: Fix xe_vma_set_pte_size

xe_vma_set_pte_size had a return value and did not set the 4k VMA flag.
Both of these were incorrect. Fix these.

Fixes: c47794bdd63d ("drm/xe: Set max pte size when skipping rebinds")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219211942.3633795-2-matthew.brost@intel.com
6 months agodrm/xe/xe_bo_move: Enhance xe_bo_move trace
Priyanka Dandamudi [Tue, 20 Feb 2024 04:47:48 +0000 (10:17 +0530)]
drm/xe/xe_bo_move: Enhance xe_bo_move trace

Enhanced xe_bo_move trace to be more readable.
It will help to show the migration details.
Src and dst details.

v2: Modify trace_xe_bo_move(), it takes the integer mem_type
rather than a string.
Make mem_type_to_name() extern, it will be used by trace.(Thomas)

v3: Move mem_type_to_name() to xe_bo.[ch] (Thomas, Matt)

v4: Add device details to reduce ambiquity related to vram0/vram1. (Oak)

v5: Rename mem_type_to_name to xe_mem_type_to_name. (Thomas)

v6: Optimised code to use xe_bo_device(__entry->bo). (Thomas)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Kempczynski Zbigniew <Zbigniew.Kempczynski@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240220044748.948496-1-priyanka.dandamudi@intel.com
6 months agodrm/xe: Enable 32bits build
Lucas De Marchi [Fri, 19 Jan 2024 00:16:12 +0000 (16:16 -0800)]
drm/xe: Enable 32bits build

Now that all the issues with 32bits are fixed, enable it again.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240119001612.2991381-6-lucas.demarchi@intel.com
7 months agodrm/xe/uapi: Remove support for persistent exec_queues
Thomas Hellström [Fri, 9 Feb 2024 11:34:44 +0000 (12:34 +0100)]
drm/xe/uapi: Remove support for persistent exec_queues

Persistent exec_queues delays explicit destruction of exec_queues
until they are done executing, but destruction on process exit
is still immediate. It turns out no UMD is relying on this
functionality, so remove it. If there turns out to be a use-case
in the future, let's re-add.

Persistent exec_queues were never used for LR VMs

v2:
- Don't add an "UNUSED" define for the missing property
  (Lucas, Rodrigo)
v3:
- Remove the remaining struct xe_exec_queue::persistent state
  (Niranjana, Lucas)

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240209113444.8396-1-thomas.hellstrom@linux.intel.com
7 months agoMerge tag 'drm-misc-next-2024-02-15' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 16 Feb 2024 03:16:39 +0000 (13:16 +1000)]
Merge tag 'drm-misc-next-2024-02-15' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v6.9:

UAPI Changes:

Cross-subsystem Changes:

arch:
- powerpc/ps3: select CONFIG_VIDEO

Core Changes:

ci:
- msm: fix apq8016 runner

display:
- use newer DRM print helpers

documentation:
- fix typos

print:
- add device-specific error and debug printers

sysfb:
- set Linux parent device for firmware framebuffer

tests:
- mm: use newer DRM print helpers

Driver Changes:

bridge:
- switch to ->read_edid callback throughout the bridge
drivers
- remove old ->get_edid callback

i915:
- use newer DRM print helpers

lima:
- improve stability by fixes to error handling and recovery

mediathek:
- switch to ->read_edid callback

msm:
- switch to ->read_edid callback

omap:
- switch to ->read_edid callback

panel:
- add Powkiddy RGB10MAX3 plus DT bindings
- st7703: support panel rotation plus DT bindings

rockchip:
- DT bindings: remove port, add power-domains

xe:
- use newer DRM print helpers

xlnx:
- switch to ->read_edid callback

Signed-off-by: Dave Airlie <airlied@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmXOD/oACgkQaA3BHVML
# eiMWMAgArTVXF4UQ+FUxYZB5QTm2veYIpilvwmzaQLNxsM9SsWpzwMIVAi+xf93g
# uqUqkl6QvZ9pJg6bxuXRNcJw/GObIO4x6tn+LkbccczgHiHwvn6ydNdUoMx8ulne
# EsGC0z8bb5Gpwh9b/pnBul2AoIE7PHAJltgH271/O2xnhFMUbchQ0ckHvWnn8/GA
# Nef145ySX4gkYtY8u2TRr4r6Bkp7Tpiyv6ipU7Cpu7KqyveTDMx3c9r5FaiHnJT/
# Hx/5s87q0Bx2m+iNjlBLJzYjF2UWth+pbfiu3xwyWOE7hdkPLwCQ5mqHWcFFqxfb
# Vuj9jP+Vb68L7EvGpq2LArLdhZjHIQ==
# =SsjX
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 15 Feb 2024 23:22:02 AEST
# gpg:                using RSA key 7217FBAC8CE9CF6344A168E5680DC11D530B7A23
# gpg: Can't check signature: No public key
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215132610.GA1464@localhost.localdomain
7 months agoMerge tag 'drm-intel-gt-next-2024-02-15' of git://anongit.freedesktop.org/drm/drm...
Dave Airlie [Fri, 16 Feb 2024 01:19:14 +0000 (11:19 +1000)]
Merge tag 'drm-intel-gt-next-2024-02-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

UAPI Changes:

- Add GuC submission interface version query (Tvrtko Ursulin)

Driver Changes:

Fixes/improvements/new stuff:

- Atomically invalidate userptr on mmu-notifier (Jonathan Cavitt)
- Update handling of MMIO triggered reports (Umesh Nerlige Ramappa)
- Don't make assumptions about intel_wakeref_t type (Jani Nikula)
- Add workaround 14019877138 [xelpg] (Tejas Upadhyay)
- Allow for very slow HuC loading [huc] (John Harrison)
- Flush context destruction worker at suspend [guc] (Alan Previn)
- Close deregister-context race against CT-loss [guc] (Alan Previn)
- Avoid circular locking issue on busyness flush [guc] (John Harrison)
- Use rc6.supported flag from intel_gt for rc6_enable sysfs (Juan Escamilla)
- Reflect the true and current status of rc6_enable (Juan Escamilla)
- Wake GT before sending H2G message [mtl] (Vinay Belgaumkar)
- Restart the heartbeat timer when forcing a pulse (John Harrison)

Future platform enablement:

- Extend driver code of Xe_LPG to Xe_LPG+ [xelpg] (Harish Chegondi)
- Extend some workarounds/tuning to gfx version 12.74 [xelpg] (Matt Roper)

Miscellaneous:

- Reconcile Excess struct member kernel-doc warnings (Randy Dunlap)
- Change wa and EU_PERF_CNTL registers to MCR type [guc] (Shuicheng Lin)
- Add flex arrays to struct i915_syncmap (Erick Archer)
- Increasing the sleep time for live_rc6_manual [selftests] (Anirban Sk)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zc3iIVsiAwo+bu10@tursulin-desk
7 months agoMerge tag 'drm-intel-next-2024-02-07' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Thu, 15 Feb 2024 20:52:03 +0000 (06:52 +1000)]
Merge tag 'drm-intel-next-2024-02-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

drm/i915 feature pull for v6.9:

Features and functionality:
- Early transport for panel replay and PSR (Jouni)
- New ARL PCI IDs (Matt)
- DP TPS4 PHY test pattern support (Khaled)

Refactoring and cleanups:
- Unify and improve VSC SDP for PSR and non-PSR cases (Jouni)
- Refactor memory regions and improve debug logging (Ville)
- Rework global state serialization (Ville)
- Remove unused CDCLK divider fields (Gustavo)
- Unify HDCP connector logging format (Jani)
- Use display instead of graphics version in display code (Jani)
- Move VBT and opregion debugfs next to the implementation (Jani)
- Abstract opregion interface, use opaque type (Jani)

Fixes:
- Fix MTL stolen memory access (Ville)
- Fix initial display plane readout for MTL (Ville)
- Fix HPD handling during driver init/shutdown (Imre)
- Cursor vblank evasion fixes (Ville)
- Various VSC SDP fixes (Jouni)
- Allow PSR mode changes without full modeset (Jouni)
- Fix CDCLK sanitization on module load for Xe2_LPD (Gustavo)
- Fix the max DSC bpc supported by the source (Ankit)
- Add missing LNL ALPM AUX wake configuration (Jouni)
- Cx0 PHY state readout and verify fixes (Mika)
- Fix PSR (panel replay) debugfs for MST connectors (Imre)
- Fail HDCP repeater authentication if Type1 device not present (Suraj)
- Ratelimit debug logging in vm_fault_ttm (Nirmoy)
- Use a fake PCH for MTL because south display is not on the PCH (Haridhar)
- Disable DSB for Xe driver for now (José)
- Fix some LNL display register changes (Lucas)
- Fix build on ChromeOS (Paz Zcharya)
- Preserve current shared DPLL for fastsets on Type-C ports (Ville)
- Fix state checker warnings for MG/TC/TBT PLLs (Ville)
- Fix HDCP repeater ctl register value on errors (Jani)
- Allow FBC with CCS modifiers on SKL+ (Ville)
- Fix HDCP GGTT pinning (Ville)

DRM core changes:
- Add ratelimited drm dbg print (Nirmoy)
- DPCD PSR early transport macro (Jouni)

Merges:
- Backmerge drm-next to bring Xe driver to drm-intel-next (Jani)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87cyt8cxsh.fsf@intel.com
7 months agodrm/xe: avoid function cast warnings
Arnd Bergmann [Tue, 13 Feb 2024 09:56:48 +0000 (10:56 +0100)]
drm/xe: avoid function cast warnings

clang-16 warns about a cast between incompatible function types:

drivers/gpu/drm/xe/xe_range_fence.c:155:10: error: cast from 'void (*)(const void *)' to 'void (*)(struct xe_range_fence *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
  155 |         .free = (void (*)(struct xe_range_fence *rfence)) kfree,
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Avoid this with a trivial helper function that calls kfree() here.

v2:
- s/* rfence/*rfence/ (Thomas)

Fixes: 845f64bdbfc9 ("drm/xe: Introduce a range-fence utility")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213095719.454865-1-arnd@kernel.org
7 months agodrm/i915/gt: Restart the heartbeat timer when forcing a pulse
John Harrison [Wed, 10 Jan 2024 21:02:16 +0000 (13:02 -0800)]
drm/i915/gt: Restart the heartbeat timer when forcing a pulse

The context persistence code does things like send super high priority
heartbeat pulses to ensure any leaked context can still be pre-empted
and thus isn't a total denial of service but only a minor denial of
service. Unfortunately, it wasn't bothering to restart the heartbeat
worker with a fresh timeout. Thus, if a persistent context happened to
be closed just before the heartbeat was going to go ping anyway then
the forced pulse would get a negligble execution time. And as the
forced pulse is super high priority, the worker thread's next step is
a reset. Which means a potentially innocent system randomly goes boom
when attempting to close a context. So, force a re-schedule of the
worker thread with the appropriate timeout.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240110210216.4125092-1-John.C.Harrison@Intel.com
7 months agodrm/xe: Remove exec queue bind.fence_*
Matthew Brost [Tue, 13 Feb 2024 04:32:51 +0000 (20:32 -0800)]
drm/xe: Remove exec queue bind.fence_*

struct xe_exec_queue bind.fence_* members are unused. Remove these.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213043251.3482928-1-matthew.brost@intel.com
7 months agodrm/i915: Add GuC submission interface version query
Tvrtko Ursulin [Thu, 8 Feb 2024 08:25:10 +0000 (08:25 +0000)]
drm/i915: Add GuC submission interface version query

Add a new query to the GuC submission interface version.

Mesa intends to use this information to check for old firmware versions
with a known bug where using the render and compute command streamers
simultaneously can cause GPU hangs due issues in firmware scheduling.

Based on patches from Vivaik and Joonas.

Compile tested only.

v2:
 * Added branch version.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jose Souza <jose.souza@intel.com>
Cc: Sagar Ghuge <sagar.ghuge@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Vivaik Balasubrawmanian <vivaik.balasubrawmanian@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240208082510.1363268-1-tvrtko.ursulin@linux.intel.com
7 months agodrm: ci: use clk_ignore_unused for apq8016
Dmitry Baryshkov [Wed, 14 Feb 2024 08:37:08 +0000 (10:37 +0200)]
drm: ci: use clk_ignore_unused for apq8016

If the ADV7511 bridge driver is compiled as a module, while DRM_MSM is
built-in, the clk_disable_unused congests with the runtime PM handling
of the DSI PHY for the clk_prepare_lock(). This causes apq8016 runner to
fail without completing any jobs ([1]). Drop the BM_CMDLINE which
duplicate the command line from the .baremetal-igt-arm64 clause and
enforce the clk_ignore_unused kernelarg instead to make apq8016 runner
work.

[1] https://gitlab.freedesktop.org/drm/msm/-/jobs/54990475

Fixes: 0119c894ab0d ("drm: Add initial ci/ subdirectory")
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Helen Koike <helen.koike@collabora.com>
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240214083708.2323967-1-dmitry.baryshkov@linaro.org
7 months agoiosys-map: fix typo
Randy Dunlap [Tue, 13 Feb 2024 22:42:19 +0000 (14:42 -0800)]
iosys-map: fix typo

Correct a spello/typo in comments.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213224219.10644-1-rdunlap@infradead.org
7 months agodrm: drm_crtc: correct some comments
Randy Dunlap [Tue, 13 Feb 2024 06:17:33 +0000 (22:17 -0800)]
drm: drm_crtc: correct some comments

Fix some typos and punctuation.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alex Deucher <alexdeucher@gmail.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213061733.8068-1-rdunlap@infradead.org
7 months agofbdev/efifb: Remove framebuffer relocation tracking
Thomas Zimmermann [Mon, 12 Feb 2024 09:06:16 +0000 (10:06 +0100)]
fbdev/efifb: Remove framebuffer relocation tracking

If the firmware framebuffer has been reloacted, the sysfb code
fixes the screen_info state before it creates the framebuffer's
platform device. Efifb will automatically receive a screen_info
with updated values. Hence remove the tracking from efifb.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212090736.11464-9-tzimmermann@suse.de
7 months agofirmware/sysfb: Update screen_info for relocated EFI framebuffers
Thomas Zimmermann [Mon, 12 Feb 2024 09:06:15 +0000 (10:06 +0100)]
firmware/sysfb: Update screen_info for relocated EFI framebuffers

On ARM PCI systems, the PCI hierarchy might be reconfigured during
boot and the firmware framebuffer might move as a result of that.
The values in screen_info will then be invalid.

Work around this problem by tracking the framebuffer's initial
location before it get relocated; then fix the screen_info state
between reloaction and creating the firmware framebuffer's device.

This functionality has been lifted from efifb. See the commit message
of commit 55d728a40d36 ("efi/fb: Avoid reconfiguration of BAR that
covers the framebuffer") for more information.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212090736.11464-8-tzimmermann@suse.de
7 months agofbdev/efifb: Do not track parent device status
Thomas Zimmermann [Mon, 12 Feb 2024 09:06:14 +0000 (10:06 +0100)]
fbdev/efifb: Do not track parent device status

There will be no EFI framebuffer device for disabled parent devices
and thus we never probe efifb in that case. Hence remove the tracking
code from efifb.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212090736.11464-7-tzimmermann@suse.de
7 months agofirmware/sysfb: Create firmware device only for enabled PCI devices
Thomas Zimmermann [Mon, 12 Feb 2024 09:06:13 +0000 (10:06 +0100)]
firmware/sysfb: Create firmware device only for enabled PCI devices

Test if the firmware framebuffer's parent PCI device, if any, has
been enabled. If not, the firmware framebuffer is most likely not
working. Hence, do not create a device for the firmware framebuffer
on disabled PCI devices.

So far, efifb tracked the status of the PCI parent device internally
and did not bind if it was disabled. This patch implements the
functionality for all PCI-based firmware framebuffers.

v3:
* make commit message more precise (Sui)
v2:
* rework sysfb_pci_dev_is_enabled() (Javier)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212090736.11464-6-tzimmermann@suse.de
7 months agofbdev/efifb: Remove PM for parent device
Thomas Zimmermann [Mon, 12 Feb 2024 09:06:12 +0000 (10:06 +0100)]
fbdev/efifb: Remove PM for parent device

The EFI device has the correct parent device set. This allows Linux
to handle the power management internally. Hence, remove the manual
PM management for the parent device from efifb.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212090736.11464-5-tzimmermann@suse.de
7 months agofirmware/sysfb: Set firmware-framebuffer parent device
Thomas Zimmermann [Mon, 12 Feb 2024 09:06:11 +0000 (10:06 +0100)]
firmware/sysfb: Set firmware-framebuffer parent device

Set the firmware framebuffer's parent device, which usually is the
graphics hardware's physical device. Integrates the framebuffer in
the Linux device hierarchy and lets Linux handle dependencies among
devices. For example, the graphics hardware won't be suspended while
the firmware device is still active.

v4:
* fix build for CONFIG_SYSFB_SIMPLEFB=n, again
v3:
* fix build for CONFIG_SYSFB_SIMPLEFB=n (Sui)
* test result of screen_info_pci_dev() for errors (Sui)
v2:
* detect parent device in sysfb_parent_dev()

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212090736.11464-4-tzimmermann@suse.de
7 months agovideo: Provide screen_info_get_pci_dev() to find screen_info's PCI device
Thomas Zimmermann [Mon, 12 Feb 2024 09:06:10 +0000 (10:06 +0100)]
video: Provide screen_info_get_pci_dev() to find screen_info's PCI device

Add screen_info_get_pci_dev() to find the PCI device of an instance
of screen_info. Does nothing on systems without PCI bus.

v3:
* search PCI device with pci_get_base_class() (Sui)
v2:
* remove ret from screen_info_pci_dev() (Javier)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212090736.11464-3-tzimmermann@suse.de
7 months agovideo: Add helpers for decoding screen_info
Thomas Zimmermann [Mon, 12 Feb 2024 09:06:09 +0000 (10:06 +0100)]
video: Add helpers for decoding screen_info

The plain values as stored in struct screen_info need to be decoded
before being used. Add helpers that decode the type of video output
and the framebuffer I/O aperture.

Old or non-x86 systems may not set the type of video directly, but
only indicate the presence by storing 0x01 in orig_video_isVGA. The
decoding logic in screen_info_video_type() takes this into account.
It then follows similar code in vgacon's vgacon_startup() to detect
the video type from the given values.

A call to screen_info_resources() returns all known resources of the
given screen_info. The resources' values have been taken from existing
code in vgacon and vga16fb. These drivers can later be converted to
use the new interfaces.

v2:
* return ssize_t from screen_info_resources()
* don't call __screen_info_has_lfb() unnecessarily

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212090736.11464-2-tzimmermann@suse.de
7 months agodrm/xe: Add uAPI to query GuC firmware submission version
José Roberto de Souza [Thu, 8 Feb 2024 18:35:38 +0000 (10:35 -0800)]
drm/xe: Add uAPI to query GuC firmware submission version

Due to a bug in GuC firmware, Mesa can't enable by default the usage of
compute engines in DG2 and newer.

A new GuC firmware fixed the issue but until now there was no way
for Mesa to know if KMD was running with the fixed GuC version or not,
so this uAPI is required.

It may be expanded in future to query other firmware versions too.

This is querying XE_UC_FW_VER_COMPATIBILITY/submission version because
that is also supported by VFs, while XE_UC_FW_VER_RELEASE don't.

i915 uAPI: https://patchwork.freedesktop.org/series/129627/
Mesa usage: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25233

v2:
- fixed drm_xe_query_uc_fw_version documentation
- moved branch_ver as the first version number

Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240208183539.185095-1-jose.souza@intel.com
7 months agodt-bindings: display: rockchip,dw-hdmi: add power-domains property
Johan Jonker [Wed, 31 Jan 2024 21:16:24 +0000 (22:16 +0100)]
dt-bindings: display: rockchip,dw-hdmi: add power-domains property

Most Rockchip hdmi nodes are part of a power domain.
Add a power-domains property and include it to the example
with some reordering to align with the (new) documentation
about property ordering.

Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/5c45527a-e218-40a3-8e71-a5815417e5f7@gmail.com
7 months agodt-bindings: display: rockchip: rockchip,dw-hdmi: remove port property
Johan Jonker [Wed, 31 Jan 2024 21:14:29 +0000 (22:14 +0100)]
dt-bindings: display: rockchip: rockchip,dw-hdmi: remove port property

The hdmi-connector nodes are now functional and the new way to model
hdmi ports nodes with both in and output port subnodes. Unfortunately
with the conversion to YAML the old method with only an input port node
was used. Later the new method was also added to the binding.
A binding must be unambiguously, so remove the old port property
entirely and make port@0 and port@1 a requirement as all
upstream dts files are updated as well and because checking
deprecated stuff is a bit pointless.
Update the example to avoid use of the removed property.

Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/a493c65e-7cf9-455f-95d5-8c98cad35710@gmail.com
7 months agodrm/panel: st7703: Add Panel Rotation Support
Chris Morgan [Mon, 12 Feb 2024 18:49:47 +0000 (12:49 -0600)]
drm/panel: st7703: Add Panel Rotation Support

Add support for panel rotation to ST7703 based devices.

Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212184950.52210-5-macroalpha82@gmail.com
7 months agodt-bindings: display: rocktech,jh057n00900: Document panel rotation
Chris Morgan [Mon, 12 Feb 2024 18:49:46 +0000 (12:49 -0600)]
dt-bindings: display: rocktech,jh057n00900: Document panel rotation

Document the rotation property for rocktech,jh057n00900 panels.

Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212184950.52210-4-macroalpha82@gmail.com
7 months agodrm/panel: st7703: Add Powkiddy RGB10MAX3 Panel Support
Chris Morgan [Mon, 12 Feb 2024 18:49:45 +0000 (12:49 -0600)]
drm/panel: st7703: Add Powkiddy RGB10MAX3 Panel Support

The Powkiddy RGB10MAX3 is a handheld device with a 5 inch 720x1280
display panel with a Sitronix ST7703 display controller. The panel
is installed rotated 270 degrees.

Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212184950.52210-3-macroalpha82@gmail.com
7 months agodt-bindings: display: Add Powkiddy RGB10MAX3 panel
Chris Morgan [Mon, 12 Feb 2024 18:49:44 +0000 (12:49 -0600)]
dt-bindings: display: Add Powkiddy RGB10MAX3 panel

The RGB10MAX3 panel is a panel specific to the Powkiddy RGB10MAX3
handheld device that measures 5 inches diagonally with a resolution
of 720x1280.

Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212184950.52210-2-macroalpha82@gmail.com
7 months agogpu: host1x: bus: make host1x_bus_type const
Ricardo B. Marliere [Tue, 13 Feb 2024 14:44:40 +0000 (11:44 -0300)]
gpu: host1x: bus: make host1x_bus_type const

Since commit d492cc2573a0 ("driver core: device.h: make struct
bus_type a const *"), the driver core can properly handle constant
struct bus_type, move the host1x_bus_type variable to be a constant
structure as well, placing it into read-only memory which can not be
modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213-bus_cleanup-host1x-v1-1-54ec51b5d14f@marliere.net
7 months agodrm/xe/vf: Don't support MCR registers if VF
Michal Wajdeczko [Tue, 13 Feb 2024 15:43:55 +0000 (16:43 +0100)]
drm/xe/vf: Don't support MCR registers if VF

VF drivers can't operate on MCR registers.  Make sure that driver
is not trying to read nor write using any of MCR register.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-9-michal.wajdeczko@intel.com
7 months agodrm/xe/vf: Don't program PAT if VF
Michal Wajdeczko [Tue, 13 Feb 2024 15:43:54 +0000 (16:43 +0100)]
drm/xe/vf: Don't program PAT if VF

PAT programming can only be done by the PF driver.
Besides VF drivers don't have access to control registers.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-8-michal.wajdeczko@intel.com
7 months agodrm/xe/vf: Don't enable hwmon if VF
Michal Wajdeczko [Tue, 13 Feb 2024 15:43:53 +0000 (16:43 +0100)]
drm/xe/vf: Don't enable hwmon if VF

Registers used by hwmon are not available for VF drivers.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-7-michal.wajdeczko@intel.com
7 months agodrm/xe/vf: Don't check if LMEM is initialized if VF
Michal Wajdeczko [Tue, 13 Feb 2024 15:43:52 +0000 (16:43 +0100)]
drm/xe/vf: Don't check if LMEM is initialized if VF

It is PF driver responsibility to verify that LMEM was correctly
initialized, also VF drivers don't have access to GU_CNTL register.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-6-michal.wajdeczko@intel.com
7 months agodrm/xe/vf: Don't initialize stolen memory manager if VF
Michal Wajdeczko [Tue, 13 Feb 2024 15:43:51 +0000 (16:43 +0100)]
drm/xe/vf: Don't initialize stolen memory manager if VF

VF drivers don't have access to the stolen memory.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-5-michal.wajdeczko@intel.com
7 months agodrm/xe/vf: Don't program MOCS if VF
Michal Wajdeczko [Tue, 13 Feb 2024 15:43:50 +0000 (16:43 +0100)]
drm/xe/vf: Don't program MOCS if VF

MOCS programming may only be done by the PF driver.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-4-michal.wajdeczko@intel.com
7 months agodrm/xe/vf: Don't try to capture engine data unavailable to VF
Michal Wajdeczko [Tue, 13 Feb 2024 15:43:49 +0000 (16:43 +0100)]
drm/xe/vf: Don't try to capture engine data unavailable to VF

Don't capture engine ring registers as thoe are not available for
the VF driver.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-3-michal.wajdeczko@intel.com
7 months agodrm/xe/vf: Assume fixed GSM size if VF
Michal Wajdeczko [Tue, 13 Feb 2024 15:43:48 +0000 (16:43 +0100)]
drm/xe/vf: Assume fixed GSM size if VF

VFs can't use size mirrored from PCI config, but it should be
safe to assume it covers full 4GiB GGTT.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-2-michal.wajdeczko@intel.com
7 months agodrm/i915/selftests: Increasing the sleep time for live_rc6_manual
Anirban Sk [Mon, 12 Feb 2024 05:07:38 +0000 (10:37 +0530)]
drm/i915/selftests: Increasing the sleep time for live_rc6_manual

Sometimes gt_pm live_rc6_manual selftest fails due to no power being
measured for the rc6 disabled period. Therefore increasing the rc6 disable
period from 250ms to 1000ms to rule out such sporadic failure.

v3:
- More descriptive and improved commit message (Anshuman)

Signed-off-by: Anirban Sk <sk.anirban@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212050738.1162198-1-sk.anirban@intel.com
7 months agodrm/xe: fix arguments to drm_err_printer()
Jani Nikula [Tue, 13 Feb 2024 08:49:54 +0000 (10:49 +0200)]
drm/xe: fix arguments to drm_err_printer()

The commit below changed drm_err_printer() arguments, but failed to
update all places.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/r/20240213120410.75c45763@canb.auug.org.au
Fixes: 5e0c04c8c40b ("drm/print: make drm_err_printer() device specific by using drm_err()")
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213084954.878643-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
7 months agoMerge tag 'amd-drm-next-6.9-2024-02-09' of https://gitlab.freedesktop.org/agd5f/linux...
Dave Airlie [Tue, 13 Feb 2024 01:32:23 +0000 (11:32 +1000)]
Merge tag 'amd-drm-next-6.9-2024-02-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.9-2024-02-09:

amdgpu:
- Validate DMABuf imports in compute VMs
- Add RAS ACA framework
- PSP 13 fixes
- Misc code cleanups
- Replay fixes
- Atom interpretor PS, WS bounds checking
- DML2 fixes
- Audio fixes
- DCN 3.5 Z state fixes
- Remove deprecated ida_simple usage
- UBSAN fixes
- RAS fixes
- Enable seq64 infrastructure
- DC color block enablement
- Documentation updates
- DC documentation updates
- DMCUB updates
- S3 fixes
- VCN 4.0.5 fixes
- DP MST fixes
- SR-IOV fixes

amdkfd:
- Validate DMABuf imports in compute VMs
- SVM fixes
- Trap handler updates

radeon:
- Atom interpretor PS, WS bounds checking
- Misc code cleanups

UAPI:
- Bump KFD version so UMDs know that the fixes that enable the management of
  VA mappings in compute VMs using the GEM_VA ioctl for DMABufs exported from KFD are present
- Add INFO query for input power.  This matches the existing INFO query for average
  power.  Used in gaming HUDs, etc.
  Example userspace: https://github.com/Umio-Yasuno/libdrm-amdgpu-sys-rs/tree/input_power

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240209221459.5453-1-alexander.deucher@amd.com
7 months agodrm/xe/pt: Allow for stricter type- and range checking
Thomas Hellström [Fri, 9 Feb 2024 11:26:55 +0000 (12:26 +0100)]
drm/xe/pt: Allow for stricter type- and range checking

Distinguish between xe_pt and the xe_pt_dir subclass when
allocating and freeing. Also use a fixed-size array for the
xe_pt_dir page entries to make life easier for dynamic range-
checkers. Finally rename the page-directory child pointer array
to "children".

While no functional change, this fixes ubsan splats similar to:

[   51.463021] ------------[ cut here ]------------
[   51.463022] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/xe/xe_pt.c:47:9
[   51.463023] index 0 is out of range for type 'xe_ptw *[*]'
[   51.463024] CPU: 5 PID: 2778 Comm: xe_vm Tainted: G     U             6.8.0-rc1+ #218
[   51.463026] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023
[   51.463027] Call Trace:
[   51.463028]  <TASK>
[   51.463029]  dump_stack_lvl+0x47/0x60
[   51.463030]  __ubsan_handle_out_of_bounds+0x95/0xd0
[   51.463032]  xe_pt_destroy+0xa5/0x150 [xe]
[   51.463088]  __xe_pt_unbind_vma+0x36c/0x9b0 [xe]
[   51.463144]  xe_vm_unbind+0xd8/0x580 [xe]
[   51.463204]  ? drm_exec_prepare_obj+0x3f/0x60 [drm_exec]
[   51.463208]  __xe_vma_op_execute+0x5da/0x910 [xe]
[   51.463268]  ? __drm_gpuvm_sm_unmap+0x1cb/0x220 [drm_gpuvm]
[   51.463272]  ? radix_tree_node_alloc.constprop.0+0x89/0xc0
[   51.463275]  ? drm_gpuva_it_remove+0x1f3/0x2a0 [drm_gpuvm]
[   51.463279]  ? drm_gpuva_remove+0x2f/0xc0 [drm_gpuvm]
[   51.463283]  xe_vm_bind_ioctl+0x1a55/0x20b0 [xe]
[   51.463344]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[   51.463414]  drm_ioctl_kernel+0xb6/0x120
[   51.463416]  drm_ioctl+0x287/0x4e0
[   51.463418]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[   51.463481]  __x64_sys_ioctl+0x94/0xd0
[   51.463484]  do_syscall_64+0x86/0x170
[   51.463486]  ? syscall_exit_to_user_mode+0x7d/0x200
[   51.463488]  ? do_syscall_64+0x96/0x170
[   51.463490]  ? do_syscall_64+0x96/0x170
[   51.463492]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[   51.463494] RIP: 0033:0x7f246bfe817d
[   51.463498] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
[   51.463501] RSP: 002b:00007ffc1bd19ad0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   51.463502] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f246bfe817d
[   51.463504] RDX: 00007ffc1bd19b60 RSI: 0000000040886445 RDI: 0000000000000003
[   51.463505] RBP: 00007ffc1bd19b20 R08: 0000000000000000 R09: 0000000000000000
[   51.463506] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc1bd19b60
[   51.463508] R13: 0000000040886445 R14: 0000000000000003 R15: 0000000000010000
[   51.463510]  </TASK>
[   51.463517] ---[ end trace ]---

v2
- Fix kerneldoc warning (Matthew Brost)

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240209112655.4872-1-thomas.hellstrom@linux.intel.com
7 months agodrm/xe: use drm based debugging instead of dev
Jani Nikula [Mon, 12 Feb 2024 14:57:57 +0000 (16:57 +0200)]
drm/xe: use drm based debugging instead of dev

Prefer drm_dbg() over dev_dbg().

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212145757.645094-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
7 months agodrm/i915: Add flex arrays to struct i915_syncmap
Erick Archer [Thu, 8 Feb 2024 18:13:18 +0000 (19:13 +0100)]
drm/i915: Add flex arrays to struct i915_syncmap

The "struct i915_syncmap" uses a dynamically sized set of trailing
elements. It can use an "u32" array or a "struct i915_syncmap *"
array.

So, use the preferred way in the kernel declaring flexible arrays [1].
Because there are two possibilities for the trailing arrays, it is
necessary to declare a union and use the DECLARE_FLEX_ARRAY macro.

The comment can be removed as the union is now clear enough.

Also, avoid the open-coded arithmetic in the memory allocator functions
[2] using the "struct_size" macro.

Moreover, refactor the "__sync_seqno" and "__sync_child" functions due
to now it is possible to use the union members added to the structure.
This way, it is also possible to avoid the open-coded arithmetic in
pointers.

Link: https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
Signed-off-by: Erick Archer <erick.archer@gmx.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240208181318.4259-1-erick.archer@gmx.com
7 months agodrm/xe: Fix a missing argument to drm_err_printer
Thomas Hellström [Mon, 12 Feb 2024 10:38:33 +0000 (11:38 +0100)]
drm/xe: Fix a missing argument to drm_err_printer

The indicated commit below added a device argument to the
function, but there was a call in the xe driver that was
not properly changed.

Fixes: 5e0c04c8c40b ("drm/print: make drm_err_printer() device specific by using drm_err()")
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212103833.138263-1-thomas.hellstrom@linux.intel.com
7 months agodrm/tests: mm: Convert to drm_dbg_printer
Michał Winiarski [Fri, 9 Feb 2024 14:08:18 +0000 (15:08 +0100)]
drm/tests: mm: Convert to drm_dbg_printer

Fix one of the tests in drm_mm that was not converted prior to
drm_debug_printer removal, causing tests build failure.

Fixes: e154c4fc7bf2 ("drm: remove drm_debug_printer in favor of drm_dbg_printer")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240209140818.106685-1-michal.winiarski@intel.com
7 months agodrm/lima: standardize debug messages by ip name
Erico Nunes [Wed, 24 Jan 2024 02:59:47 +0000 (03:59 +0100)]
drm/lima: standardize debug messages by ip name

Some debug messages carried the ip name, or included "lima", or
included both the ip name and then the numbered ip name again.
Make the messages more consistent by always looking up and showing
the ip name first.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124025947.2110659-9-nunes.erico@gmail.com
7 months agodrm/lima: increase default job timeout to 10s
Erico Nunes [Wed, 24 Jan 2024 02:59:46 +0000 (03:59 +0100)]
drm/lima: increase default job timeout to 10s

The previous 500ms default timeout was fairly optimistic and could be
hit by real world applications. Many distributions targeting devices
with a Mali-4xx already bumped this timeout to a higher limit.
We can be generous here with a high value as 10s since this should
mostly catch buggy jobs like infinite loop shaders, and these don't
seem to happen very often in real applications.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124025947.2110659-8-nunes.erico@gmail.com
7 months agodrm/lima: remove guilty drm_sched context handling
Erico Nunes [Wed, 24 Jan 2024 02:59:45 +0000 (03:59 +0100)]
drm/lima: remove guilty drm_sched context handling

Marking the context as guilty currently only makes the application which
hits a single timeout problem to stop its rendering context entirely.
All jobs submitted later are dropped from the guilty context.

Lima runs on fairly underpowered hardware for modern standards and it is
not entirely unreasonable that a rendering job may time out occasionally
due to high system load or too demanding application stack. In this case
it would be generally preferred to report the error but try to keep the
application going.

Other similar embedded GPU drivers don't make use of the guilty context
flag. Now that there are reliability improvements to the lima timeout
recovery handling, drop the guilty contexts to let the application keep
running in this case.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124025947.2110659-7-nunes.erico@gmail.com
7 months agodrm/lima: handle spurious timeouts due to high irq latency
Erico Nunes [Wed, 24 Jan 2024 02:59:44 +0000 (03:59 +0100)]
drm/lima: handle spurious timeouts due to high irq latency

There are several unexplained and unreproduced cases of rendering
timeouts with lima, for which one theory is high IRQ latency coming from
somewhere else in the system.
This kind of occurrence may cause applications to trigger unnecessary
resets of the GPU or even applications to hang if it hits an issue in
the recovery path.
Panfrost already does some special handling to account for such
"spurious timeouts", it makes sense to have this in lima too to reduce
the chance that it hit users.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124025947.2110659-6-nunes.erico@gmail.com
7 months agodrm/lima: set gp bus_stop bit before hard reset
Erico Nunes [Wed, 24 Jan 2024 02:59:43 +0000 (03:59 +0100)]
drm/lima: set gp bus_stop bit before hard reset

This is required for reliable hard resets. Otherwise, doing a hard reset
while a task is still running (such as a task which is being stopped by
the drm_sched timeout handler) may result in random mmu write timeouts
or lockups which cause the entire gpu to hang.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124025947.2110659-5-nunes.erico@gmail.com
7 months agodrm/lima: set pp bus_stop bit before hard reset
Erico Nunes [Wed, 24 Jan 2024 02:59:42 +0000 (03:59 +0100)]
drm/lima: set pp bus_stop bit before hard reset

This is required for reliable hard resets. Otherwise, doing a hard reset
while a task is still running (such as a task which is being stopped by
the drm_sched timeout handler) may result in random mmu write timeouts
or lockups which cause the entire gpu to hang.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124025947.2110659-4-nunes.erico@gmail.com