ChunTao Tso [Tue, 22 Oct 2024 06:54:50 +0000 (14:54 +0800)]
drm/amd/display: Add a Panel Replay config option
[Why]
Replay needs a special policy for the Teams scenario.
Add a flag to indicate whether the special policy applies.
[How]
Add a config option intended for future use for video conferencing applications.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: ChunTao Tso <ChunTao.Tso@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aurabindo Pillai [Tue, 18 Mar 2025 21:25:16 +0000 (17:25 -0400)]
drm/amd/display: use drm_warn instead of DRM_WARN
drm_warn prints the DRM device instance, which is helpful when
debugging multi-GPU issues.
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
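The benefit of the device-aware helpers named in this commit can be illustrated in user space. This is only a sketch: `example_device`, `example_log_global()` and `example_log_dev()` are hypothetical stand-ins, and the output format is illustrative rather than the format the kernel's `drm_warn()` actually produces.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a DRM device; only the name matters here. */
struct example_device {
	const char *name;
};

/* Device-agnostic: two GPUs reporting the same event yield identical lines. */
int example_log_global(char *out, size_t len, const char *msg)
{
	return snprintf(out, len, "[drm] %s", msg);
}

/* Device-aware: the instance name becomes part of every line. */
int example_log_dev(char *out, size_t len, const struct example_device *dev,
		    const char *msg)
{
	assert(dev != NULL && dev->name != NULL);
	return snprintf(out, len, "[drm] %s: %s", dev->name, msg);
}
```

With two devices, the device-aware variant produces distinguishable lines for the same message, which is the property that matters on multi-GPU systems.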
Aurabindo Pillai [Tue, 18 Mar 2025 21:05:50 +0000 (17:05 -0400)]
drm/amd/display: use drm_info instead of DRM_INFO
drm_info prints the DRM device instance, which is helpful when
debugging multi-GPU issues.
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dillon Varone [Thu, 13 Mar 2025 19:24:59 +0000 (15:24 -0400)]
drm/amd/display: Consider downspread against max clocks in DML2.1
[WHY&HOW]
Core should evaluate support based on the max clocks after considering
downspread.
Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Robin Chen [Tue, 18 Mar 2025 01:14:47 +0000 (09:14 +0800)]
drm/amd/display: Enable Replay Low Hz feature flag
Enable replay low refresh rate support.
Reviewed-by: ChunTao Tso <chuntao.tso@amd.com>
Signed-off-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Joshua Aberback [Fri, 14 Mar 2025 22:33:43 +0000 (18:33 -0400)]
drm/amd/display: Use meaningful size for block_sequence array
[Why]
This array was initially defined as size 50. There were array overflow
issues so the size was increased to 100. To ensure such issues are
avoided in the future, the size should be set based on the possible
contents instead of an arbitrary value.
[How]
- upper bound, assume every update occurs on max number of pipes
- define array sizes for function parameters, for static analysis
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
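The sizing rule from the block_sequence commit (an upper bound that assumes every update occurs on the maximum number of pipes, plus explicit array sizes in function parameters) might look roughly like the sketch below. All names and numeric limits are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits, chosen only for illustration. */
#define EXAMPLE_MAX_PIPES      6
#define EXAMPLE_STEPS_PER_PIPE 20	/* distinct per-pipe update steps */
#define EXAMPLE_GLOBAL_STEPS   4	/* steps that run once per update */

/* Upper bound: every per-pipe step occurs on every pipe. */
#define EXAMPLE_MAX_BLOCK_SEQUENCE \
	(EXAMPLE_MAX_PIPES * EXAMPLE_STEPS_PER_PIPE + EXAMPLE_GLOBAL_STEPS)

struct example_step {
	int pipe;
	int func;
};

/*
 * Spelling out the array size in the parameter documents the contract
 * and gives static analysis tools something to check callers against.
 * Returns 0 on success, -1 if the sequence is already full.
 */
int example_add_step(struct example_step seq[EXAMPLE_MAX_BLOCK_SEQUENCE],
		     size_t *num_steps, int pipe, int func)
{
	assert(seq != NULL && num_steps != NULL);
	if (*num_steps >= EXAMPLE_MAX_BLOCK_SEQUENCE)
		return -1;	/* refuse rather than write past the end */
	seq[*num_steps].pipe = pipe;
	seq[*num_steps].func = func;
	(*num_steps)++;
	return 0;
}
```

Because the bound is derived from the possible contents, adding a new per-pipe step only requires bumping one constant instead of guessing a new arbitrary size.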
Austin Zheng [Mon, 17 Mar 2025 17:29:47 +0000 (13:29 -0400)]
drm/amd/display: Set ODM Factor Based On DML Architecture
[Why]
Mapping of ODM enum is different for DML2.0 vs DML2.1.
Configs using DML2.1 will incorrectly trigger an assert meant for DML2.0.
[How]
Use if/else to separate logic between DML2.0 and DML2.1.
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aurabindo Pillai [Tue, 11 Mar 2025 19:34:53 +0000 (15:34 -0400)]
drm/amd/display: convert more DRM_ERROR to drm_err
prefer drm_err instead of DRM_ERROR since the former prints the
associated DRM device, which is helpful when debugging multi-gpu
use cases.
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aurabindo Pillai [Tue, 11 Mar 2025 19:55:55 +0000 (15:55 -0400)]
drm/amd/display: use drm_err in create_validate_stream_for_sink()
make the drm device available in create_validate_stream_for_sink()
so that drm_err() can be used
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aurabindo Pillai [Tue, 11 Mar 2025 19:51:03 +0000 (15:51 -0400)]
drm/amd/display: use drm_err in hpd rx offload
add amdgpu_device pointer to data associated with the work struct
such that hpd handlers have access to the drm device for use with
drm_err()
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aurabindo Pillai [Tue, 11 Mar 2025 19:43:07 +0000 (15:43 -0400)]
drm/amd/display: convert DRM_ERROR to drm_err in hpd_rx_irq_create_workqueue()
pass in a pointer to amdgpu_device directly to the function.
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Mon, 17 Mar 2025 07:03:46 +0000 (15:03 +0800)]
drm/amd/pm: Use gpu_metrics_v1_8 for smu_v13_0_12
Use gpu_metrics_v1_8 for smu_v13_0_12 to fill metrics data
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Mon, 17 Mar 2025 06:55:38 +0000 (14:55 +0800)]
drm/amd/pm: Use gpu_metrics_v1_8 for smu_v13_0_6
Use gpu_metrics_v1_8 for smu_v13_0_6 to fill metrics data
v2: Move exposing caps to separate patch, move smu_v13.0.12 gpu metrics
1.8 usage to separate patch (Lijo)
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Mon, 17 Mar 2025 06:37:37 +0000 (14:37 +0800)]
drm/amd/pm: Expose smu_v13_0_6 caps
Expose smu_v13_0_6 caps by moving it to common header
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Charles Han [Thu, 13 Feb 2025 07:08:37 +0000 (15:08 +0800)]
Documentation: Remove repeated word in docs
Remove the repeated word "the" in docs.
Signed-off-by: Charles Han <hanchunchao@inspur.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:47 +0000 (11:18 -0600)]
Documentation/gpu: Add an intro about MES
MES is an important firmware that lacks some essential documentation.
This commit introduces an overview of it and how it works.
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:46 +0000 (11:18 -0600)]
Documentation/gpu: Create a GC entry in the amdgpu documentation
GC is a large block that plays a vital role for amdgpu; for this reason,
this commit creates one specific page for GC and adds extra information
about the CP component.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:45 +0000 (11:18 -0600)]
Documentation/gpu: Add explanation about AMD Pipes and Queues
Pipes and Queues are two common terms that pervade discussions
around amdgpu core features. The definition and explanation of those
components are spread around multiple places in the code, mailing list,
and Gitlab, which sometimes leads to the wrong interpretation of these
concepts. This commit attempts to centralize the definition and
explanation of Pipe and Queue from amdgpu perspective in a kernel doc.
Most of the information in this doc was derived from:
- https://lore.kernel.org/amd-gfx/CADnq5_Pcz2x4aJzKbVrN3jsZhD6sTydtDw=6PaN4O3m4t+Grtg@mail.gmail.com/T/#m9a670b55ab20e0f7c46c80f802a0a4be255a719d
- https://gitlab.freedesktop.org/mesa/mesa/-/issues/11759
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:44 +0000 (11:18 -0600)]
Documentation/gpu: Create a documentation entry just for hardware info
The APU and dGPU tables are hidden in the driver misc info, which makes
it hard to find specific hardware info when users need it. This commit
creates a single page for this information and adds it to the top of the
amdgpu list to improve searchability.
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:43 +0000 (11:18 -0600)]
Documentation/gpu: Change index order to show driver core first
Since driver-core has an overview of the AMD GPU hardware structure, it
makes more sense to keep it first. This commit moves driver-core up in
the index list.
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rodrigo Siqueira [Tue, 25 Mar 2025 17:18:42 +0000 (11:18 -0600)]
Documentation/gpu: Add new acronyms
This commit introduces some new acronyms extracted from the source code
and found on some web pages around the internet (most of them came from
ArchLinux, Gentoo, and Wikipedia links).
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 19 Mar 2025 15:58:19 +0000 (11:58 -0400)]
drm/amdgpu/gfx11: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
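The class of bug described in the CSIB fixes (returning from inside the section loop, so the remainder of the buffer is never written) can be sketched as follows. The buffer layout, section IDs and marker values are hypothetical; the real clear-state buffer contents differ per GFX generation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum example_sect_id { EXAMPLE_SECT_CONTEXT, EXAMPLE_SECT_OTHER };

struct example_section {
	enum example_sect_id id;
	uint32_t value;
};

#define EXAMPLE_PREAMBLE 0xAAAA0000u	/* hypothetical marker words */
#define EXAMPLE_TRAILER  0xFFFF0000u

/*
 * Writes: preamble, one word per context section, trailer.
 * Returns the number of words written. The caller must provide room
 * for count + 2 words.
 */
size_t example_build_csib(uint32_t *buf, const struct example_section *sects,
			  size_t count)
{
	size_t n = 0;

	assert(buf != NULL);
	buf[n++] = EXAMPLE_PREAMBLE;

	for (size_t i = 0; i < count; i++) {
		/* An early `return` here would skip the trailer below. */
		if (sects[i].id != EXAMPLE_SECT_CONTEXT)
			continue;
		buf[n++] = sects[i].value;
	}

	buf[n++] = EXAMPLE_TRAILER;	/* the "rest of the CSIB" */
	return n;
}
```

Skipping the unwanted section with `continue` is one way to keep the trailer reachable; the driver patches may structure the loop differently.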
Alex Deucher [Wed, 19 Mar 2025 15:58:03 +0000 (11:58 -0400)]
drm/amdgpu/gfx10: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 19 Mar 2025 15:57:49 +0000 (11:57 -0400)]
drm/amdgpu/gfx9: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 19 Mar 2025 15:57:34 +0000 (11:57 -0400)]
drm/amdgpu/gfx8: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 19 Mar 2025 15:57:19 +0000 (11:57 -0400)]
drm/amdgpu/gfx7: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 19 Mar 2025 15:56:02 +0000 (11:56 -0400)]
drm/amdgpu/gfx6: fix CSIB handling
We shouldn't return after the last section.
We need to update the rest of the CSIB.
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 20 Mar 2025 18:24:47 +0000 (14:24 -0400)]
drm/amdgpu/gfx: assign the actual me0 queues per pipe
Set the actual number of queues per pipe for ME0 (gfx).
This way we will dump all of the queues properly in
dev core dumps.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 20 Mar 2025 18:10:17 +0000 (14:10 -0400)]
drm/amdgpu/gfx: decouple the number of kgqs from the hw
The driver currently sets up one kgq per pipe. As such
adev->gfx.me.num_queue_per_pipe is hardcoded to 1 everywhere.
This is fine for kernel queues, but when we enable user queues
we need to know the actual number of queues per pipe. Decouple
the kgq setup from the actual hardware count. For dev core
dumps and user queues, we want to know the actual number
of queues per pipe.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 20 Mar 2025 17:34:33 +0000 (13:34 -0400)]
drm/amdgpu/gfx: make amdgpu_gfx_me_queue_to_bit() static
It's not used outside of amdgpu_gfx.c.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Srinivasan Shanmugam [Wed, 26 Mar 2025 07:23:01 +0000 (12:53 +0530)]
drm/amdgpu/gfx10: Add Cleaner Shader Support for GFX10.3.x GPUs
Enable the cleaner shader for other GFX10.3.x series of GPUs to provide
data isolation between GPU workloads. The cleaner shader is responsible
for clearing the Local Data Store (LDS), Vector General Purpose
Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which
helps prevent data leakage and ensures accurate computation results.
This update extends cleaner shader support to GFX10.3.x GPUs, previously
available for GFX10.3.0. It enhances security by clearing GPU memory
between processes and maintains a consistent GPU state across KGD and
KFD workloads.
Cc: Mario Sopena-Novales <mario.novales@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 25 Mar 2025 14:07:50 +0000 (10:07 -0400)]
drm/amdgpu: drop some dead code
Drop the cgs smu firmware code for SI, it's not used.
The smu firmware fetching for SI is done in si_dpm.c.
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 3 Mar 2025 22:35:32 +0000 (17:35 -0500)]
drm/amdgpu: add initial documentation for debugfs files
Describes what debugfs files are available and what
they are used for.
v2: fix some typos (Mark Glines)
v3: Address comments from Siqueira and Kent
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:47:00 +0000 (21:47 -0400)]
drm/amdgpu: continue cleaning up sid.h and si_enums.h
Remove more duplicated defines and move some in sid.h for coherence with
CIK.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ananta Srikar [Tue, 25 Mar 2025 01:49:12 +0000 (21:49 -0400)]
drm/amd/amdgpu: Fix typo
Fixes a typo in the word "version" in an error message.
Signed-off-by: Ananta Srikar <srikarananta01@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Andres Urian Florez [Tue, 25 Mar 2025 00:07:21 +0000 (19:07 -0500)]
drm/amdgpu: Replace deprecated function strcpy() with strscpy()
Instead of using the strcpy() deprecated function to populate the
fw_name, use the strscpy() function
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Signed-off-by: Andres Urian Florez <andres.emb.sys@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
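For readers unfamiliar with why strscpy() is preferred, the sketch below models its documented behaviour in user space: the destination is always NUL-terminated and truncation is reported instead of overflowing. `example_strscpy()` and `EXAMPLE_E2BIG` are hypothetical stand-ins, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_E2BIG 7	/* stand-in for the kernel's E2BIG */

/*
 * Copies src into dst, writing at most size bytes including the NUL.
 * Returns the number of characters copied (excluding the NUL), or
 * -EXAMPLE_E2BIG if size is 0 or src had to be truncated.
 */
long example_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -EXAMPLE_E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';	/* always terminated, unlike strncpy() */

	return src[i] == '\0' ? (long)i : -EXAMPLE_E2BIG;
}
```

Unlike strcpy(), the caller passes the destination size, so an oversized firmware name yields an error code rather than a buffer overrun.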
Alex Deucher [Thu, 27 Feb 2025 17:31:28 +0000 (12:31 -0500)]
drm/amdgpu: add rebar parameter
Add a new parameter to disable BAR resizing. Note that this
only disables the driver from attempting to resize the BAR;
the BIOS may have resized the BAR at boot.
Some teams have found this useful in debugging P2P DMA
issues on systems where the available MMIO space did not allow
for all of the GPUs present to resize their BARs.
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:59 +0000 (21:46 -0400)]
drm/amdgpu: cleanup DCE6 a bit more
Use shifts already available in DCE6's defines, masks and shifts.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:58 +0000 (21:46 -0400)]
drm/amdgpu: keep removing sid.h dependency from si_dma.c
Move and rename DMA_SEM_INCOMPLETE_TIMER_CNTL and DMA_SEM_WAIT_FAIL_TIMER_CNTL
in oss_1_0_d.h
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:56 +0000 (21:46 -0400)]
drm/amdgpu: move si_dma.c away from sid.h and si_enums.h
Replace defines for the ones in oss_1_0_d.h and oss_1_0_sh_mask.h
Taking the opportunity to add some comments taken from cik_sdma.c
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:54 +0000 (21:46 -0400)]
drm/amdgpu: make GFX6 easier to read
Just fix the style and add a comment for readability.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:51 +0000 (21:46 -0400)]
drm/amdgpu: add missing GFX6 defines
They will be used later when switching away from sid.h/si_enums.h.
v2: fix whitespace (Alex)
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:50 +0000 (21:46 -0400)]
drm/amdgpu: add missing DMA defines, shifts and masks
They will be used later when switching away from sid.h/si_enums.h.
v2: fix up whitespace (Alex)
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:49 +0000 (21:46 -0400)]
drm/amdgpu: move DCE6 away from sid.h and si_enums.h defines
This cleans up DCE6.
I added some minor tweaks taken from CIK to exit early
v2: minor fixes (Alex)
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:48 +0000 (21:46 -0400)]
drm/amdgpu: use GRPH_SECONDARY_SURFACE_ADDRESS_MASK with GRPH_SECONDARY_SURFACE_ADDRESS in DCE6
This looks like a copy-paste error: since we are working with
mmGRPH_SECONDARY_SURFACE_ADDRESS,
GRPH_SECONDARY_SURFACE_ADDRESS__GRPH_SECONDARY_SURFACE_ADDRESS_MASK
should be used.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:47 +0000 (21:46 -0400)]
drm/amdgpu: move si_ih.c away from sid.h defines
They are properly defined under oss_1_0_d.h
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:46 +0000 (21:46 -0400)]
drm/amdgpu: remove PACKET3 duplicated defines from si_enums.h
PACKET3 is already in sid.h, as it is done under cikd.h for CIK
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:45 +0000 (21:46 -0400)]
drm/amdgpu: use proper defines, shifts and masks in DCE6 code
By replacing VGA_VSTATUS_CNTL by VGA_RENDER_CONTROL__VGA_VSTATUS_CNTL_MASK,
we also need to fix its usage in GMC6.
Note: VGA_VSTATUS_CNTL's binary value was inverted in dce_6_0_sh_mask.h,
so we need to invert its value where it was used.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:44 +0000 (21:46 -0400)]
drm/amdgpu: wire up defines, shifts and masks through SI code
To be able to remove as many duplicated defines as possible, the different
files containing definitions, shifts and masks must be properly included.
Once done, the code will be migrated where needed to shifts, masks and
proper defines, before the useless defines are removed at the end.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 01:46:43 +0000 (21:46 -0400)]
drm/amdgpu: move GFX6 defines into gfx_v6_0.c
Move a few GFX6 defines to where they are used in GFX6.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 18:39:00 +0000 (14:39 -0400)]
drm/radeon: fix MAX_POWER_SHIFT value
While I don't think it is being used anywhere, if it were used, it would
be wrong. We can base this assumption on MAX_POWER_MASK, where the shift is
by 16 bits.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
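The reasoning in the MAX_POWER_SHIFT commit (the shift must agree with the mask, which is shifted by 16 bits) can be checked mechanically. In this sketch only the 16-bit shift comes from the commit text; the field width `0x3FFF` and all `example_` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The 0x3FFF width is assumed; the shift of 16 is from the commit text. */
#define EXAMPLE_MAX_POWER_MASK  (0x3FFFu << 16)
#define EXAMPLE_MAX_POWER_SHIFT 16

/* Position of the lowest set bit of a non-zero mask. */
unsigned int example_mask_to_shift(uint32_t mask)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while ((mask & 1u) == 0) {
		mask >>= 1;
		shift++;
	}
	return shift;
}

/* Extracts a register field given its mask and shift. */
uint32_t example_get_field(uint32_t reg, uint32_t mask, unsigned int shift)
{
	return (reg & mask) >> shift;
}
```

A shift that disagrees with the mask would make `example_get_field()` return the field still positioned in the upper half of the register, which is the kind of latent error the commit removes.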
Alexandre Demers [Sat, 22 Mar 2025 18:37:45 +0000 (14:37 -0400)]
drm/amdgpu: move X_GB_ADDR_CONFIG_GOLDEN in GFX7
[BONAIRE|HAWAII]_GB_ADDR_CONFIG_GOLDEN are only used by GFX7. So keep them
where they are needed.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 18:37:44 +0000 (14:37 -0400)]
drm/amdgpu: small cleanup to CIK SDMA
Tidy cik_sdma_hw_init() by returning cik_sdma_start()'s result directly.
Keep amdgpu_cik_gpu_check_soft_reset() early declaration with others.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 18:37:43 +0000 (14:37 -0400)]
drm/amdgpu: use cik_sdma_is_idle() in CIK SDMA
cik_sdma_is_idle() does exactly what we need, so use it.
V2: fix parameter (Alex)
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Sat, 22 Mar 2025 18:37:42 +0000 (14:37 -0400)]
drm/amdgpu: use gmc_v7_0_is_idle() since it is available under GMC7
gmc_v7_0_is_idle() does exactly what we need, so use it.
v2: fix parameter (Alex)
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Saleemkhan Jamadar [Fri, 21 Mar 2025 11:30:09 +0000 (17:00 +0530)]
drm/amd/display: add proper error message for vblank init
v1 - DRM_ERROR to drm_err (Mario)
Update message to identify the vblank initialization fail case
Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Saleemkhan Jamadar [Fri, 21 Mar 2025 11:30:09 +0000 (17:00 +0530)]
drm/amd/display: add proper error message for vblank init
v1 - DRM_ERROR to dev_err (Mario)
Update message to identify the vblank initialization fail case
Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Tue, 31 Dec 2024 04:53:13 +0000 (12:53 +0800)]
drm/amdgpu/vcn: during dpc recovery will corrupt VCPU buffer
err_event_athub and dpc recovery will corrupt VCPU buffer,
so we need to restore fw data and clear buffer in amdgpu_vcn_resume()
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Fri, 21 Mar 2025 02:11:18 +0000 (10:11 +0800)]
drm/amdgpu: Multi-GPU DPC recovery support
Add support for DPC recovery based on refactored code
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Thu, 20 Mar 2025 10:12:40 +0000 (18:12 +0800)]
drm/amdgpu: refactor amdgpu_device_gpu_recover
Split amdgpu_device_gpu_recover into the following stages:
halt activities, ASIC reset, schedule resume and amdgpu resume.
The reason is that the DPC recovery code added subsequently
will be very similar to GPU reset.
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ce Sun [Fri, 22 Nov 2024 07:19:16 +0000 (15:19 +0800)]
drm/amd/pm: Add link reset for SMU 13.0.6
Add link reset implementation
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Thu, 20 Mar 2025 05:01:32 +0000 (10:31 +0530)]
drm/amdkfd: Use dev_* instead of pr_* for messages
To get the device context, replace pr_ with dev_ functions.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Sun, 16 Mar 2025 15:46:52 +0000 (10:46 -0500)]
drm/amd/display: DC v3.2.326
Summary:
* DML 2.1 resync
* Vblank disable fixes
* Visual confirm debug improvements
* Add command for reading ABM histogram
* Bug fixes & improvements
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
JinZe.Xu [Wed, 12 Mar 2025 10:02:16 +0000 (18:02 +0800)]
drm/amd/display: Use sync version of indirect register access.
[Why]
Access to indirect registers by DC and other components is not synchronized.
[How]
Use sync version of indirect register access.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: JinZe.Xu <JinZe.Xu@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aric Cyr [Thu, 13 Mar 2025 17:43:41 +0000 (13:43 -0400)]
drm/amd/display: Create a temporary scratch dc_link
Create a temporary scratch dc_link for programming purposes
and fix a copy of pipe_ctx on the stack to a pointer reference.
Reviewed-by: Josip Pavic <josip.pavic@amd.com>
Signed-off-by: Aric Cyr <Aric.Cyr@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Charlene Liu [Wed, 12 Mar 2025 21:30:25 +0000 (17:30 -0400)]
drm/amd/display: fix zero value for APU watermark_c
[why]
The is_apu guards were not in sync, so no watermark_c was output.
Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chun-Liang Chang [Fri, 21 Feb 2025 16:36:22 +0000 (10:36 -0600)]
drm/amd/display: Add Read Histogram command header
[Why]
Read the histogram for VariBright validation
[How]
Add dc/dmub functions to read histogram and ACE
Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Chun-Liang Chang <Chun-Liang.Chang@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Paul Hsieh [Tue, 11 Mar 2025 09:16:57 +0000 (17:16 +0800)]
drm/amd/display: Skip to enable dsc if it has been off
[Why]
DSC gets enabled when we commit a stream that needs to stay
powered off. A later pipe reset then skips disabling DSC
because the power is already off. This can leave DSC
unexpectedly enabled on the pipe for the next new stream,
which may not support DSC.
[HOW]
Check whether DSC is in use on the current pipe when updating the
stream. Skip enabling it if it has been off. DSC should be enabled
when power is turned on.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Paul Hsieh <Paul.Hsieh@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Austin Zheng [Fri, 7 Mar 2025 18:36:53 +0000 (13:36 -0500)]
drm/amd/display: DML21 Reintegration
[Why]
To bring in latest changes in DML21
[List of Changes]
- Unification of DML logging to use DML_LOG_* macro
- Clean up variables that are exclusively used for logging
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cruise [Thu, 6 Mar 2025 02:17:48 +0000 (10:17 +0800)]
drm/amd/display: Remove BW Allocation from DPIA notification
[Why]
USB4 BW Allocation response will be handled in HPD IRQ.
No need to handle it in DPIA notification callback.
[How]
Remove DP BW allocation response code in DPIA notification.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Cruise <Cruise.Hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Leo Zeng [Wed, 26 Feb 2025 19:35:05 +0000 (14:35 -0500)]
drm/amd/display: Get visual confirm color for stream
[WHY]
We want to output visual confirm color based on stream.
[HOW]
If visual confirm is for DMUB, use DMUB to get color.
Otherwise, find plane with highest layer index, output visual confirm color
of pipe that contains plane with highest index.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Leo Zeng <Leo.Zeng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
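The non-DMUB path described in the visual confirm commit (pick the plane with the highest layer index and use its color) reduces to a maximum search. The sketch below flattens pipes and planes into one hypothetical `example_plane` array; the real DC structures and the DMUB path are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened view of a plane and its visual confirm color. */
struct example_plane {
	int layer_index;
	uint32_t confirm_color;
};

/*
 * Returns the index of the plane with the highest layer_index,
 * or -1 if there are no planes. Ties keep the earliest plane.
 */
int example_find_top_plane(const struct example_plane *planes, size_t count)
{
	int top = -1;

	assert(planes != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (top < 0 || planes[i].layer_index > planes[top].layer_index)
			top = (int)i;
	}
	return top;
}

/* Color for the whole stream: the top plane's color, else a fallback. */
uint32_t example_stream_confirm_color(const struct example_plane *planes,
				      size_t count, uint32_t fallback)
{
	int top = example_find_top_plane(planes, count);

	return top < 0 ? fallback : planes[top].confirm_color;
}
```

The fallback parameter is a choice of this sketch so that a stream without planes still yields a defined color.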
Leo Zeng [Tue, 25 Feb 2025 20:59:59 +0000 (15:59 -0500)]
drm/amd/display: Add override for visual confirm
[WHY]
We want to allow the display manager to override the visual
confirm color in DC when required.
[HOW]
Add new visual confirm mode VISUAL_CONFIRM_EXPLICIT, check mode before
setting visual confirm color.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Leo Zeng <Leo.Zeng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Thu, 9 Jan 2025 16:57:56 +0000 (11:57 -0500)]
drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P
Try pinning into VRAM to allow P2P with RDMA NICs without ODP
support if all attachments can do P2P. If any attachment can't do
P2P just pin into GTT instead.
Acked-by: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Tested-by: Pak Nin Lui <pak.lui@amd.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
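The placement rule in the DMA-buf pinning commit (VRAM only if every attachment can do P2P, otherwise GTT) can be sketched as a simple scan. The types and names below are hypothetical, and the sketch covers only the domain choice, not the pin attempt itself or any fallback when the VRAM pin fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum example_domain { EXAMPLE_DOMAIN_GTT, EXAMPLE_DOMAIN_VRAM };

/* Hypothetical view of an importer attachment. */
struct example_attachment {
	bool peer2peer;	/* importer can access the exporter's VRAM directly */
};

/*
 * VRAM is chosen only when every attachment supports P2P; a single
 * non-P2P importer forces GTT. With no attachments the sketch picks
 * VRAM, which is an assumption of this example.
 */
enum example_domain
example_pick_pin_domain(const struct example_attachment *att, size_t count)
{
	assert(att != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (!att[i].peer2peer)
			return EXAMPLE_DOMAIN_GTT;
	}
	return EXAMPLE_DOMAIN_VRAM;
}
```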
Maarten Lankhorst [Thu, 27 Mar 2025 19:51:28 +0000 (20:51 +0100)]
drm/amdgpu: Add cgroups implementation
Similar to xe, enable some simple management of VRAM only.
Reviewed-by: Christian König <christian.koenig@amd.com>
Co-developed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jay Cornwall [Fri, 21 Mar 2025 18:19:05 +0000 (13:19 -0500)]
drm/amdgpu: Increase KIQ invalidate_tlbs timeout
KIQ invalidate_tlbs request has been seen to marginally exceed the
configured 100 ms timeout on systems under load.
All other KIQ requests in the driver use a 10 second timeout. Use a
similar timeout implementation on the invalidate_tlbs path.
v2: Poll once before msleep
v3: Fix return value
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Cc: Kent Russell <kent.russell@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
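The "poll once before msleep" note in the KIQ commit above describes a common wait-loop shape. The sketch below shows that shape in plain C under stated assumptions: wait_poll_then_sleep(), the callback types and the demo helpers are hypothetical, and the interval and timeout are parameters because the commit message only gives the overall timeouts (100 ms before, 10 seconds after).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callbacks standing in for the fence poll and msleep(). */
typedef bool (*poll_fn)(void *ctx);
typedef void (*sleep_ms_fn)(unsigned int ms);

/*
 * Returns 0 when poll() reports completion and -1 on timeout.
 * poll() is tried before the first sleep, so a request that has
 * already completed never pays the sleep interval.
 */
static int wait_poll_then_sleep(poll_fn poll, void *ctx, sleep_ms_fn sleep_ms,
                                unsigned int interval_ms,
                                unsigned int timeout_ms)
{
    unsigned int waited = 0;

    assert(poll && sleep_ms && interval_ms > 0);

    for (;;) {
        if (poll(ctx))
            return 0;
        if (waited >= timeout_ms)
            return -1;
        sleep_ms(interval_ms);
        waited += interval_ms;
    }
}

/* Demo helper: reports completion once the counter in ctx reaches zero. */
static bool poll_countdown(void *ctx)
{
    unsigned int *remaining = ctx;

    if (*remaining == 0)
        return true;
    (*remaining)--;
    return false;
}

/* Demo helper: a sleep that returns immediately. */
static void sleep_noop(unsigned int ms)
{
    (void)ms;
}
```

Polling first keeps the fast path cheap, while the longer overall timeout only matters on loaded systems where the request is slow to complete.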
Jonathan Kim [Thu, 27 Mar 2025 15:50:42 +0000 (11:50 -0400)]
drm/amdkfd: limit sdma queue reset caps flagging for gfx9
ASICs post GFX 9 are being flagged as SDMA per queue reset supported
in the KGD but KFD and scheduler FW currently have no support.
Limit SDMA queue reset capabilities to GFX 9.
Fixes: ceb7114c961b ("drm/amdkfd: flag per-sdma queue reset supported to user space")
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Mario Limonciello [Thu, 27 Mar 2025 19:07:55 +0000 (14:07 -0500)]
drm/amd/display: Add HP Elitebook 645 to the quirk list for eDP on DP1
[Why]
HP Elitebook 645 has DP0 and DP1 swapped.
[How]
Add HP Elitebook 645 to DP0/DP1 swap quirk list.
Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3701
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Thu, 6 Mar 2025 17:29:20 +0000 (11:29 -0600)]
drm/amd/display: Add HP Probook 445 and 465 to the quirk list for eDP on DP1
[Why]
HP Probook 445 and 465 have DP0 and DP1 swapped.
[How]
Add HP Probook 445 and 465 to DP0/DP1 swap quirk list.
Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3995
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Anson Tsao <anson.tsao@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huacai Chen [Thu, 27 Mar 2025 09:53:34 +0000 (17:53 +0800)]
drm/amd/display: Protect FPU in dml2_validate()/dml21_validate()
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removed the FP context protection of dml2_create(), stating that
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml2_validate()/dml21_validate() are not protected by their
callers, causing errors such as:
do_fpu invoked from kernel context![#1]:
CPU: 10 UID: 0 PID: 331 Comm: kworker/10:1H Not tainted 6.14.0-rc6+ #4
Workqueue: events_highpri dm_irq_work_func [amdgpu]
pc ffff800003191eb0 ra ffff800003191e60 tp 9000000107a94000 sp 9000000107a975b0
a0 9000000140ce4910 a1 0000000000000000 a2 9000000140ce49b0 a3 9000000140ce49a8
a4 9000000140ce49a8 a5 0000000100000000 a6 0000000000000001 a7 9000000107a97660
t0 ffff800003790000 t1 9000000140ce5000 t2 0000000000000001 t3 0000000000000000
t4 0000000000000004 t5 0000000000000000 t6 0000000000000000 t7 0000000000000000
t8 0000000100000000 u0 ffff8000031a3b9c s9 9000000130bc0000 s0 9000000132400000
s1 9000000140ec0000 s2 9000000132400000 s3 9000000140ce0000 s4 90000000057f8b88
s5 9000000140ec0000 s6 9000000140ce4910 s7 0000000000000001 s8 9000000130d45010
ra: ffff800003191e60 dml21_map_dc_state_into_dml_display_cfg+0x40/0x1140 [amdgpu]
ERA: ffff800003191eb0 dml21_map_dc_state_into_dml_display_cfg+0x90/0x1140 [amdgpu]
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 00000004 (PPLV0 +PIE -PWE)
EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 000f0000 [FPD] (IS= ECode=15 EsubCode=0)
PRID: 0014d010 (Loongson-64bit, Loongson-3C6000/S)
Process kworker/10:1H (pid: 331, threadinfo=000000007bf9ddb0, task=00000000cc4ab9f3)
Stack :
0000000100000000 0000043800000780 0000000100000001 0000000100000001
0000000000000000 0000078000000000 0000000000000438 0000078000000000
0000000000000438 0000078000000000 0000000000000438 0000000100000000
0000000100000000 0000000100000000 0000000100000000 0000000100000000
0000000000000001 9000000140ec0000 9000000132400000 9000000132400000
ffff800003408000 ffff800003408000 9000000132400000 9000000140ce0000
9000000140ce0000 ffff800003193850 0000000000000001 9000000140ec0000
9000000132400000 9000000140ec0860 9000000140ec0738 0000000000000001
90000001405e8000 9000000130bc0000 9000000140ec02a8 ffff8000031b5db8
0000000000000000 0000043800000780 0000000000000003 ffff8000031b79cc
...
Call Trace:
[<ffff800003191eb0>] dml21_map_dc_state_into_dml_display_cfg+0x90/0x1140 [amdgpu]
[<ffff80000319384c>] dml21_validate+0xcc/0x520 [amdgpu]
[<ffff8000031b8948>] dc_validate_global_state+0x2e8/0x460 [amdgpu]
[<ffff800002e94034>] create_validate_stream_for_sink+0x3d4/0x420 [amdgpu]
[<ffff800002e940e4>] amdgpu_dm_connector_mode_valid+0x64/0x240 [amdgpu]
[<900000000441d6b8>] drm_connector_mode_valid+0x38/0x80
[<900000000441d824>] __drm_helper_update_and_validate+0x124/0x3e0
[<900000000441ddc0>] drm_helper_probe_single_connector_modes+0x2e0/0x620
[<90000000044050dc>] drm_client_modeset_probe+0x23c/0x1780
[<9000000004420384>] __drm_fb_helper_initial_config_and_unlock+0x44/0x5a0
[<9000000004403acc>] drm_client_dev_hotplug+0xcc/0x140
[<ffff800002e9ab50>] handle_hpd_irq_helper+0x1b0/0x1e0 [amdgpu]
[<90000000038f5da0>] process_one_work+0x160/0x300
[<90000000038f6718>] worker_thread+0x318/0x440
[<9000000003901b8c>] kthread+0x12c/0x220
[<90000000038b1484>] ret_from_kernel_thread+0x8/0xa4
Unfortunately, protecting dml2_validate()/dml21_validate() from outside
DML2 causes "sleeping function called from invalid context", so protect
them with DC_FP_START() and DC_FP_END() inside the functions instead.
Fixes: 7da55c27e767 ("drm/amd/display: Remove incorrect FP context start")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Tested-by: Dongyan Qian <qiandongyan@loongson.cn>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huacai Chen [Thu, 27 Mar 2025 09:53:33 +0000 (17:53 +0800)]
drm/amd/display: Protect FPU in dml2_init()/dml21_init()
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removed the FP context protection of dml2_create(), stating that
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml2_init()/dml21_init() are not protected by their callers,
causing errors such as:
do_fpu invoked from kernel context![#1]:
CPU: 0 UID: 0 PID: 239 Comm: kworker/0:5 Not tainted 6.14.0-rc6+ #2
Workqueue: events work_for_cpu_fn
pc ffff80000319de80 ra ffff80000319de5c tp 900000010575c000 sp 900000010575f840
a0 0000000000000000 a1 900000012f210130 a2 900000012f000000 a3 ffff80000357e268
a4 ffff80000357e260 a5 900000012ea52cf0 a6 0000000400000004 a7 0000012c00001388
t0 00001900000015e0 t1 ffff80000379d000 t2 0000000010624dd3 t3 0000006400000014
t4 00000000000003e8 t5 0000005000000018 t6 0000000000000020 t7 0000000f00000064
t8 000000000000002f u0 5f5e9200f8901912 s9 900000012d380010 s0 900000012ea51fd8
s1 900000012f000000 s2 9000000109296000 s3 0000000000000001 s4 0000000000001fd8
s5 0000000000000001 s6 ffff800003415000 s7 900000012d390000 s8 ffff800003211f80
ra: ffff80000319de5c dml21_apply_soc_bb_overrides+0x3c/0x960 [amdgpu]
ERA: ffff80000319de80 dml21_apply_soc_bb_overrides+0x60/0x960 [amdgpu]
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 00000004 (PPLV0 +PIE -PWE)
EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 000f0000 [FPD] (IS= ECode=15 EsubCode=0)
PRID: 0014d010 (Loongson-64bit, Loongson-3C6000/S)
Process kworker/0:5 (pid: 239, threadinfo=00000000927eadc6, task=000000008fd31682)
Stack :
00040dc000003164 0000000000000001 900000012f210130 900000012eabeeb8
900000012f000000 ffff80000319fe48 900000012f210000 900000012f210130
900000012f000000 900000012eabeeb8 0000000000000001 ffff8000031a0064
900000010575f9f0 900000012f210130 900000012eac0000 900000012ea80000
900000012f000000 ffff8000031cefc4 900000010575f9f0 ffff8000035859c0
ffff800003414000 900000010575fa78 900000012f000000 ffff8000031b4c50
0000000000000000 9000000101c9d700 9000000109c40000 5f5e9200f8901912
900000012d3c4bd0 900000012d3c5000 ffff8000034aed18 900000012d380010
900000012d3c4bd0 ffff800003414000 900000012d380000 ffff800002ea49dc
0000000000000001 900000012d3c6000 00000000ffffe423 0000000000010000
...
Call Trace:
[<ffff80000319de80>] dml21_apply_soc_bb_overrides+0x60/0x960 [amdgpu]
[<ffff80000319fe44>] dml21_init+0xa4/0x280 [amdgpu]
[<ffff8000031a0060>] dml21_create+0x40/0x80 [amdgpu]
[<ffff8000031cefc0>] dc_state_create+0x100/0x160 [amdgpu]
[<ffff8000031b4c4c>] dc_create+0x44c/0x640 [amdgpu]
[<ffff800002ea49d8>] amdgpu_dm_init+0x3f8/0x2060 [amdgpu]
[<ffff800002ea6658>] dm_hw_init+0x18/0x60 [amdgpu]
[<ffff800002b16738>] amdgpu_device_init+0x1938/0x27e0 [amdgpu]
[<ffff800002b18e80>] amdgpu_driver_load_kms+0x20/0xa0 [amdgpu]
[<ffff800002b0c8f0>] amdgpu_pci_probe+0x1b0/0x580 [amdgpu]
[<900000000448eae4>] local_pci_probe+0x44/0xc0
[<9000000003b02b18>] work_for_cpu_fn+0x18/0x40
[<9000000003b05da0>] process_one_work+0x160/0x300
[<9000000003b06718>] worker_thread+0x318/0x440
[<9000000003b11b8c>] kthread+0x12c/0x220
[<9000000003ac1484>] ret_from_kernel_thread+0x8/0xa4
Unfortunately, protecting dml2_init()/dml21_init() from outside DML2
causes "sleeping function called from invalid context", so protect them
with DC_FP_START() and DC_FP_END() inside the functions instead.
Fixes: 7da55c27e767 ("drm/amd/display: Remove incorrect FP context start")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huacai Chen [Thu, 27 Mar 2025 09:53:32 +0000 (17:53 +0800)]
drm/amd/display: Protect FPU in dml21_copy()
Commit 7da55c27e76749b9 ("drm/amd/display: Remove incorrect FP context
start") removed the FP context protection of dml2_create(), stating that
"All the DC_FP_START/END should be used before call anything from DML2".
However, dml21_copy() is not protected by its callers, causing errors
such as:
do_fpu invoked from kernel context![#1]:
CPU: 0 UID: 0 PID: 240 Comm: kworker/0:5 Not tainted 6.14.0-rc6+ #1
Workqueue: events work_for_cpu_fn
pc ffff80000318bd2c ra ffff80000315750c tp 9000000105910000 sp 9000000105913810
a0 0000000000000000 a1 0000000000000002 a2 900000013140d728 a3 900000013140d720
a4 0000000000000000 a5 9000000131592d98 a6 0000000000017ae8 a7 00000000001312d0
t0 9000000130751ff0 t1 ffff800003790000 t2 ffff800003790000 t3 9000000131592e28
t4 000000000004c6a8 t5 00000000001b7740 t6 0000000000023e38 t7 0000000000249f00
t8 0000000000000002 u0 0000000000000000 s9 900000012b010000 s0 9000000131400000
s1 9000000130751fd8 s2 ffff800003408000 s3 9000000130752c78 s4 9000000131592da8
s5 9000000131592120 s6 9000000130751ff0 s7 9000000131592e28 s8 9000000131400008
ra: ffff80000315750c dml2_top_soc15_initialize_instance+0x20c/0x300 [amdgpu]
ERA: ffff80000318bd2c mcg_dcn4_build_min_clock_table+0x14c/0x600 [amdgpu]
CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
PRMD: 00000004 (PPLV0 +PIE -PWE)
EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 000f0000 [FPD] (IS= ECode=15 EsubCode=0)
PRID: 0014d010 (Loongson-64bit, Loongson-3C6000/S)
Process kworker/0:5 (pid: 240, threadinfo=00000000f1700428, task=0000000020d2e962)
Stack :
0000000000000000 0000000000000000 0000000000000000 9000000130751fd8
9000000131400000 ffff8000031574e0 9000000130751ff0 0000000000000000
9000000131592e28 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 f9175936df5d7fd2
900000012b00ff08 900000012b000000 ffff800003409000 ffff8000034a1780
90000001019634c0 900000012b000010 90000001307beeb8 90000001306b0000
0000000000000001 ffff8000031942b4 9000000130780000 90000001306c0000
9000000130780000 ffff8000031c276c 900000012b044bd0 ffff800003408000
...
Call Trace:
[<ffff80000318bd2c>] mcg_dcn4_build_min_clock_table+0x14c/0x600 [amdgpu]
[<ffff800003157508>] dml2_top_soc15_initialize_instance+0x208/0x300 [amdgpu]
[<ffff8000031942b0>] dml21_create_copy+0x30/0x60 [amdgpu]
[<ffff8000031c2768>] dc_state_create_copy+0x68/0xe0 [amdgpu]
[<ffff800002e98ea0>] amdgpu_dm_init+0x8c0/0x2060 [amdgpu]
[<ffff800002e9a658>] dm_hw_init+0x18/0x60 [amdgpu]
[<ffff800002b0a738>] amdgpu_device_init+0x1938/0x27e0 [amdgpu]
[<ffff800002b0ce80>] amdgpu_driver_load_kms+0x20/0xa0 [amdgpu]
[<ffff800002b008f0>] amdgpu_pci_probe+0x1b0/0x580 [amdgpu]
[<9000000003c7eae4>] local_pci_probe+0x44/0xc0
[<90000000032f2b18>] work_for_cpu_fn+0x18/0x40
[<90000000032f5da0>] process_one_work+0x160/0x300
[<90000000032f6718>] worker_thread+0x318/0x440
[<9000000003301b8c>] kthread+0x12c/0x220
[<90000000032b1484>] ret_from_kernel_thread+0x8/0xa4
Unfortunately, protecting dml21_copy() from outside DML2 causes "sleeping
function called from invalid context", so protect it with DC_FP_START()
and DC_FP_END() inside the function instead.
Fixes: 7da55c27e767 ("drm/amd/display: Remove incorrect FP context start")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
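The three FPU fixes above share one pattern: the entry point brackets its own floating-point work instead of relying on the caller. The sketch below models that pattern in userspace C. It is an illustration only: DC_FP_START() and DC_FP_END() are redefined here as stubs that merely track nesting depth, and validate_fp_body() and validate_entry() are hypothetical names, not DML2 functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stubs for the kernel macros; in this sketch they only track nesting. */
static int fp_depth;
#define DC_FP_START() (fp_depth++)
#define DC_FP_END()   (fp_depth--)

/* Hypothetical floating-point body; it must run inside the FP section. */
static bool validate_fp_body(double pixel_clock_mhz, double max_clock_mhz)
{
    assert(fp_depth > 0); /* models the "do_fpu invoked" failure */
    return pixel_clock_mhz <= max_clock_mhz;
}

/* The entry point opens and closes the FP section around the FP work. */
static bool validate_entry(double pixel_clock_mhz, double max_clock_mhz)
{
    bool ok;

    DC_FP_START();
    ok = validate_fp_body(pixel_clock_mhz, max_clock_mhz);
    DC_FP_END();
    return ok;
}
```

Keeping the bracket inside the entry point keeps the protected region short, which fits the commit messages' observation that wrapping the calls from outside DML2 led to "sleeping function called from invalid context".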
Tom Chung [Wed, 19 Mar 2025 08:31:31 +0000 (16:31 +0800)]
drm/amd/display: Do not enable Replay and PSR while VRR is on in amdgpu_dm_commit_planes()
[Why]
Replay and PSR will cause some video corruption while VRR is enabled.
[How]
Do not enable the Replay and PSR while VRR is active in
amdgpu_dm_enable_self_refresh().
Fixes: 67edb81d6e9a ("drm/amd/display: Disable replay and psr while VRR is enabled")
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Emily Deng [Fri, 28 Mar 2025 10:14:17 +0000 (18:14 +0800)]
drm/amdkfd: sriov doesn't support per queue reset
Disable per queue reset for sriov.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Flora Cui [Wed, 26 Mar 2025 12:06:13 +0000 (20:06 +0800)]
drm/amdgpu/ip_discovery: add missing ip_discovery fw
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Matthew Auld [Mon, 7 Apr 2025 14:18:25 +0000 (15:18 +0100)]
drm/amdgpu/dma_buf: fix page_link check
The page_link lower bits of the first sg could contain something like
SG_END, if we are mapping a single VRAM page or contiguous blob which
fits into one sg entry. Instead, pull out the struct page and use that in
our check to know whether we mapped struct pages or VRAM.
Fixes: f44ffd677fb3 ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
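The page_link problem described in the commit above comes from flag bits sharing a word with the page pointer. The sketch below is a userspace model, not the kernel scatterlist code: struct sg_model, has_page_raw() and has_page_masked() are hypothetical, and the flag values mirror the kernel's SG_CHAIN and SG_END bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of the flag bits kept in the low bits of page_link. */
#define SG_CHAIN ((uintptr_t)0x01)
#define SG_END   ((uintptr_t)0x02)
#define SG_PAGE_LINK_MASK (SG_CHAIN | SG_END)

struct sg_model {
    uintptr_t page_link; /* page pointer with flags in the low two bits */
};

/* Problematic check: any set flag bit makes the entry look page-backed. */
static bool has_page_raw(const struct sg_model *sg)
{
    return sg->page_link != 0;
}

/* Safer check: strip the flag bits, then test the page pointer itself. */
static bool has_page_masked(const struct sg_model *sg)
{
    return (sg->page_link & ~SG_PAGE_LINK_MASK) != 0;
}
```

A single-entry VRAM mapping has no struct page but still carries SG_END, so the raw test reports a page where the masked test does not.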
Christian König [Thu, 20 Mar 2025 13:46:18 +0000 (14:46 +0100)]
drm/amdgpu: immediately use GTT for new allocations
Only use GTT as a fallback if we already have a backing store. This
prevents evictions when an application constantly allocates and frees new
memory.
Partially fixes
https://gitlab.freedesktop.org/drm/amd/-/issues/3844#note_2833985.
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 216c1282dde3 ("drm/amdgpu: use GTT only as fallback for VRAM|GTT")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
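The placement rule in the GTT commit above can be written as a small decision table. This sketch is an interpretation of the commit message rather than the driver logic: struct bo_model, enum gtt_role and gtt_role_for() are hypothetical names, and any additional conditions in the real placement code are not modelled.

```c
#include <stdbool.h>

/* Hypothetical model of a buffer object; not the amdgpu structure. */
struct bo_model {
    bool has_backing_store; /* memory is already allocated for the BO */
    bool allows_vram;       /* VRAM is in the allowed domains */
    bool allows_gtt;        /* GTT is in the allowed domains */
};

enum gtt_role { GTT_UNUSED, GTT_FALLBACK, GTT_IMMEDIATE };

/*
 * GTT is a fallback only for VRAM|GTT buffers that already have a
 * backing store. New allocations may use GTT right away, so they do
 * not have to evict other buffers from VRAM first.
 */
static enum gtt_role gtt_role_for(const struct bo_model *bo)
{
    if (!bo->allows_gtt)
        return GTT_UNUSED;
    if (bo->allows_vram && bo->has_backing_store)
        return GTT_FALLBACK;
    return GTT_IMMEDIATE;
}
```

An application that constantly allocates and frees memory only ever hits the immediate case in this model, which is the eviction-avoiding behaviour the commit describes.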
Alex Deucher [Thu, 27 Mar 2025 21:33:49 +0000 (17:33 -0400)]
drm/amdgpu/mes11: optimize MES pipe FW version fetching
Don't fetch it again if we already have it. It seems the
registers don't reliably have the value at resume in some
cases.
Fixes: 028c3fb37e70 ("drm/amdgpu/mes11: initiate mes v11 support")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4083
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
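The "don't fetch it again if we already have it" rule in the MES commit above is a simple read-once cache. The sketch below illustrates it under assumptions: struct mes_pipe_model, get_fw_version() and the fake register helper are hypothetical, and treating 0 as "not fetched yet" is a choice made for this example.

```c
#include <stdint.h>

/* Hypothetical register accessor type; not the amdgpu register API. */
typedef uint32_t (*read_reg_fn)(void);

struct mes_pipe_model {
    uint32_t fw_version; /* 0 means "not fetched yet" in this sketch */
};

/* Read the register only when no version has been cached yet. */
static uint32_t get_fw_version(struct mes_pipe_model *pipe, read_reg_fn read_reg)
{
    if (pipe->fw_version == 0)
        pipe->fw_version = read_reg();
    return pipe->fw_version;
}

/* Demo helper: a fake register whose value and read count can be checked. */
static unsigned int reg_reads;
static uint32_t reg_value = 0x6e;

static uint32_t read_fake_reg(void)
{
    reg_reads++;
    return reg_value;
}
```

Once a version is cached, a register that reads back an unreliable value at resume no longer affects the result.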
Alex Deucher [Thu, 20 Mar 2025 16:09:11 +0000 (12:09 -0400)]
drm/amdgpu/gfx12: fix num_mec
GC12 only has 1 mec.
Fixes: 52cb80c12e8a ("drm/amdgpu: Add gfx v12_0 ip block support (v6)")
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 26 Mar 2025 13:35:02 +0000 (09:35 -0400)]
drm/amdgpu/gfx11: fix num_mec
GC11 only has 1 mec.
Fixes: 3d879e81f0f9 ("drm/amdgpu: add init support for GFX11 (v2)")
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Wed, 12 Mar 2025 06:07:49 +0000 (14:07 +0800)]
drm/amd/pm: Add gpu_metrics_v1_8
Add new gpu_metrics_v1_8 to acquire below host limit counters
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Tue, 25 Mar 2025 06:12:08 +0000 (11:42 +0530)]
drm/amdgpu: Prefer shadow rom when available
Fetch the VBIOS from shadow ROM when available before trying other
methods such as the EFI method.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: 9c081c11c621 ("drm/amdgpu: Reorder to read EFI exported ROM first")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4066
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Asad Kamal [Mon, 17 Mar 2025 06:17:51 +0000 (14:17 +0800)]
drm/amd/pm: Update smu metrics table for smu_v13_0_6
Update smu metrics table to version 0x10 for smu_v13_0_6
v2: Host metrics support removal moved to separate patch (Lijo)
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asad Kamal [Mon, 17 Mar 2025 06:16:04 +0000 (14:16 +0800)]
drm/amd/pm: Remove host limit metrics support
The firmware algorithm changed and the values in this version
are not accurate, so remove host limit metric support
for smu_v13_0_6, smu_v13_0_12 & smu_v13_0_14
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Candice Li [Wed, 26 Mar 2025 05:41:01 +0000 (13:41 +0800)]
Remove unnecessary firmware version check for gc v9_4_2
GC v9_4_2 uses a new versioning scheme for CP firmware, making
the warning ("CP firmware version too old, please update!") irrelevant.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Christian König [Tue, 18 Mar 2025 15:15:12 +0000 (16:15 +0100)]
drm/amdgpu: stop unmapping MQD for kernel queues v3
This looks unnecessary and actually extremely harmful since using kmap()
is not possible while inside the ring reset.
Remove all the extra mapping and unmapping of the MQDs.
v2: also fix debugfs
v3: fix coding style typo
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jesse.zhang@amd.com [Tue, 25 Mar 2025 07:01:19 +0000 (15:01 +0800)]
Revert "drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA"
This temporarily reverts commit
6ec04e38b2f6 ("drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA"),
as it causes a regression.
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiang Liu [Mon, 24 Mar 2025 09:19:54 +0000 (17:19 +0800)]
drm/amdgpu: Parse all deferred errors with UMC aca handle
We should only increase the deferred error count in the UMC block.
Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stanley.Yang [Tue, 25 Mar 2025 03:10:43 +0000 (11:10 +0800)]
drm/amdgpu: Update ta ras block
Update the TA RAS block to keep it in sync with the RAS TA.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Mon, 24 Mar 2025 07:56:26 +0000 (13:26 +0530)]
drm/amdgpu: Add NPS2 to DPX compatible mode
Compute partition DPX is possible in NPS2 mode. Update the compatible
modes for DPX.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Xiang Liu [Fri, 21 Mar 2025 12:47:23 +0000 (20:47 +0800)]
drm/amdgpu: Use correct gfx deferred error count
In the case of parsing GFX deferred error from SMU corrected error
channel, the error count should be set to 1 instead of parsing from
MISC0 register, which is 0.
Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>