linux-block.git
2 weeks ago drm/amd/amdgpu: apply command submission parser for JPEG v2+
David (Ming Qiang) Wu [Fri, 16 Aug 2024 15:43:05 +0000 (11:43 -0400)]
drm/amd/amdgpu: apply command submission parser for JPEG v2+

This patch extends the same cs parser from JPEG v4.0.3 to
other JPEG versions (v2 and above).

Rename jpeg_v4_0_3_dec_ring_parse_cs() to the more generic name
jpeg_v2_dec_ring_parse_cs().

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 weeks ago drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3
Kenneth Feng [Fri, 6 Sep 2024 12:46:54 +0000 (20:46 +0800)]
drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3

Fix the pp_dpm_pcie issue on smu v14.0.2/3 shown below:
0: 2.5GT/s, x4 250Mhz
1: 8.0GT/s, x4 616Mhz *
2: 8.0GT/s, x4 1143Mhz *
The middle level can be removed since it is always skipped on
smu v14.0.2/3.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 weeks ago drm/amd/pm: update the features set on smu v14.0.2/3
Kenneth Feng [Thu, 5 Sep 2024 07:38:18 +0000 (15:38 +0800)]
drm/amd/pm: update the features set on smu v14.0.2/3

Update the feature set on smu v14.0.2/3.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 weeks ago drm/amdkfd: Fix resource leak in criu restore queue
Jesse Zhang [Fri, 6 Sep 2024 03:29:55 +0000 (11:29 +0800)]
drm/amdkfd: Fix resource leak in criu restore queue

To avoid memory leaks, release q_extra_data when exiting the restore queue.
v2: Correct the proto (Alex)

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 weeks ago drm/amd/display: Do not reset planes based on crtc zpos_changed
Leo Li [Thu, 5 Sep 2024 22:45:04 +0000 (18:45 -0400)]
drm/amd/display: Do not reset planes based on crtc zpos_changed

[Why]

drm_normalize_zpos will set the crtc_state->zpos_changed to 1 if any of
its assigned planes changes zpos, or is added to or removed from it.

To have amdgpu_dm request a plane reset on this is too broad. For
example, if only the cursor plane was moved from one crtc to another,
the crtc's zpos_changed will be set to true. But that does not mean that
the underlying primary plane requires a reset.

[How]

Narrow it down so that only the plane that has a change in zpos will
require a reset.

As a future TODO, we can further optimize this by only requiring a reset
on z-order change. Z-order is different from z-pos, since a zpos change
doesn't necessarily mean the z-ordering changed, and DC should only
require a reset if the z-ordering changed.
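
The difference between a zpos change and a z-order change can be sketched
in plain C. This is an illustrative user-space model, not amdgpu_dm code;
struct plane_zpos and z_order_changed() are hypothetical names introduced
only for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: old and new zpos of one plane on a CRTC. */
struct plane_zpos {
    int old_zpos;
    int new_zpos;
};

/*
 * Return true if the relative stacking of any two planes differs between
 * the old and the new state. A uniform shift of zpos values returns false.
 */
static bool z_order_changed(const struct plane_zpos *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            bool was_above = p[i].old_zpos > p[j].old_zpos;
            bool is_above = p[i].new_zpos > p[j].new_zpos;

            if (was_above != is_above)
                return true;
        }
    }
    return false;
}
```

A check along these lines would only flag a reset when two planes actually
swap places, which is the optimization the TODO describes.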

For example, the following zpos update does not change z-ordering:

    Plane A: zpos 2 -> 3
    Plane B: zpos 1 -> 2
    => Plane A is still on top of plane B: no reset needed

Whereas this one does change z-ordering:

    Plane A: zpos 2 -> 1
    Plane B: zpos 1 -> 2
    => Plane A changed from on top, to below plane B: reset needed

Fixes: 38e0c3df6dbd ("drm/amd/display: Move PRIMARY plane zpos higher")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3569
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: drop redundant W=1 warnings from Makefile
Jani Nikula [Thu, 23 May 2024 13:37:07 +0000 (16:37 +0300)]
drm/amdgpu: drop redundant W=1 warnings from Makefile

Since commit a61ddb4393ad ("drm: enable (most) W=1 warnings by default
across the subsystem"), most of the extra warnings in the driver
Makefile are redundant. Remove them.

Note that -Wmissing-declarations and -Wmissing-prototypes are always
enabled by default in scripts/Makefile.extrawarn.

Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: revert "use CPU for page table update if SDMA is unavailable"
Christian König [Tue, 27 Aug 2024 14:15:06 +0000 (16:15 +0200)]
drm/amdgpu: revert "use CPU for page table update if SDMA is unavailable"

That is clearly not something we should do upstream. The SDMA is
mandatory for the driver to work correctly.

We could do this for emulation and bringup, but in those cases the
engineer should probably enable CPU based updates manually.

This reverts commit 62eefd10ac1c7e976bda47ff311bd87cee40ab8d.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/mes11: Indent an if statment
Dan Carpenter [Wed, 4 Sep 2024 08:01:43 +0000 (11:01 +0300)]
drm/amdgpu/mes11: Indent an if statment

Indent the "break" statement one more tab.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdkfd: Document and define SVM events message macro
Philip Yang [Fri, 16 Feb 2024 16:00:10 +0000 (11:00 -0500)]
drm/amdkfd: Document and define SVM events message macro

Document how to use the SMI (system management interface) to enable and
receive SVM events. Document the SVM event triggers.

Define an SVM events message string format macro that user mode can pass
to sscanf to parse the event. Add it to the uAPI header file to make it
obvious that changing it in the future is a uAPI change.
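
As a rough illustration of the idea, a single format string can be shared
between the producer and a user-mode sscanf consumer. The format, the
struct, and the function below are hypothetical and are not the macros
defined by this commit; the real definitions live in the KFD uAPI header.

```c
#include <stdio.h>

/* Hypothetical format; not one of the real KFD event format macros. */
#define EXAMPLE_SVM_EVENT_FMT "%x %lld -%d @%lx"

/* Hypothetical parsed event: id, timestamp in ns, pid, address. */
struct example_svm_event {
    unsigned int id;
    long long ns;
    int pid;
    unsigned long addr;
};

/* Return 0 if all four fields were parsed from line, -1 otherwise. */
static int example_parse_svm_event(const char *line,
                                   struct example_svm_event *ev)
{
    int n = sscanf(line, EXAMPLE_SVM_EVENT_FMT,
                   &ev->id, &ev->ns, &ev->pid, &ev->addr);

    return n == 4 ? 0 : -1;
}
```

Because the parser depends on the exact field order and separators, any
change to such a format string breaks existing user-mode parsers.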

No functional changes.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdkfd: Select reset method for poison handling
Hawking Zhang [Fri, 6 Sep 2024 08:06:13 +0000 (16:06 +0800)]
drm/amdkfd: Select reset method for poison handling

Driver mode-2 is only supported by relatively new
smc firmware.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdkfd: fix missed queue reset on queue destroy
Jonathan Kim [Thu, 22 Aug 2024 14:44:39 +0000 (10:44 -0400)]
drm/amdkfd: fix missed queue reset on queue destroy

If a queue is being destroyed but causes a HWS hang on removal, the KFD
may issue an unnecessary gpu reset if the destroyed queue can be fixed
by a queue reset.

This is because the queue has been removed from the KFD's queue list
prior to the preemption action on destroy so the reset call will fail to
match the HQD PQ reset information against the KFD's queue record to do
the actual reset.

To fix this, deactivate the queue prior to preemption, since it's being
destroyed anyway, and remove the queue from the KFD's queue list after
preemption.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: Surface svm_default_granularity, a RW module parameter
Ramesh Errabolu [Tue, 20 Aug 2024 21:05:30 +0000 (16:05 -0500)]
drm/amdgpu: Surface svm_default_granularity, a RW module parameter

Enables users to update SVM's default granularity, used in
buffer migration and handling of recoverable page faults.
The param value is the base-2 logarithm of the number of pages in the
buffer, e.g. 9 for a 2 MiB buffer.
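
The arithmetic behind that example can be checked with a small sketch. It
assumes 4 KiB pages, and example_granularity() is a hypothetical helper,
not a driver function: 2 MiB / 4 KiB = 512 pages, and log2(512) = 9.

```c
#include <stddef.h>

/* Assumes 4 KiB pages; the real page size depends on the platform. */
#define EXAMPLE_PAGE_SIZE 4096u

/* Return floor(log2(number of pages)) for a buffer of the given size. */
static unsigned int example_granularity(size_t buffer_bytes)
{
    size_t pages = buffer_bytes / EXAMPLE_PAGE_SIZE;
    unsigned int log2_pages = 0;

    while (pages > 1) {
        pages >>= 1;
        log2_pages++;
    }
    return log2_pages;
}
```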

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: fix queue reset issue by mmio
Jesse Zhang [Wed, 4 Sep 2024 09:47:06 +0000 (17:47 +0800)]
drm/amdgpu: fix queue reset issue by mmio

Initialize the queue type before resetting the queue using mmio.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Add kdoc entry for 'program_isharp_1dlut' in 'dpp401_dscl_program_is...
Srinivasan Shanmugam [Wed, 4 Sep 2024 07:40:59 +0000 (13:10 +0530)]
drm/amd/display: Add kdoc entry for 'program_isharp_1dlut' in 'dpp401_dscl_program_isharp'

Added a descriptor for the 'program_isharp_1dlut' parameter, which is a
flag used to determine whether to program the isharp 1D LUT.

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:963: warning: Function parameter or struct member 'program_isharp_1dlut' not described in 'dpp401_dscl_program_isharp'

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: Replace 'amdgpu_job_submit_direct' with 'drm_sched_entity' in cleaner...
Srinivasan Shanmugam [Wed, 4 Sep 2024 07:00:16 +0000 (12:30 +0530)]
drm/amdgpu: Replace 'amdgpu_job_submit_direct' with 'drm_sched_entity' in cleaner shader

This commit replaces the use of amdgpu_job_submit_direct, which submits
the job to the ring directly, with drm_sched_entity in the cleaner
shader job submission process.

The job is now submitted to the GPU using the
drm_sched_entity_push_job function, which allows the GPU scheduler to
manage the cleaner shader job.

This change improves the reliability of the cleaner shader job
submission process by leveraging the capabilities of the GPU scheduler.

Fixes: d361ad5d2fc0 ("drm/amdgpu: Add sysfs interface for running cleaner shader")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/: Add missing kdoc entry in amdgpu_vm_handle_fault function
Srinivasan Shanmugam [Mon, 26 Aug 2024 13:23:50 +0000 (18:53 +0530)]
drm/amdgpu/: Add missing kdoc entry in amdgpu_vm_handle_fault function

This commit adds a description for the 'ts' parameter in the
amdgpu_vm_handle_fault function's comment block.

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2781: warning: Function parameter or struct member 'ts' not described in 'amdgpu_vm_handle_fault'

Cc: Xiaogang.Chen <Xiaogang.Chen@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202408251419.vgZHg3GV-lkp@intel.com/
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Xiaogang Chen <Xiaogang.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: fix dccg root clock optimization related hang
Qili Lu [Wed, 21 Aug 2024 20:26:13 +0000 (16:26 -0400)]
drm/amd/display: fix dccg root clock optimization related hang

[Why]
Enabling dpp rcg before we disable dppclk in hw_init causes a system
hang/reboot.

[How]
Move the dccg rcg related code out of init into a separate function and
call it after we init the pipe.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Qili Lu <qili.lu@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Refactor dccg35_get_other_enabled_symclk_fe
Nicholas Susanto [Tue, 20 Aug 2024 19:10:45 +0000 (15:10 -0400)]
drm/amd/display: Refactor dccg35_get_other_enabled_symclk_fe

[Why]

The function was used to check the number of FEs connected to the current
BE. The result was then used to determine whether the symclk could be
disabled, which is the case once all FEs are disconnected. However, the
function would skip over the primary FE and return 0 when the primary FE
was still connected. This caused black screens on driver disable with an
MST daisy chain hooked up.

[How]

Refactor the function to correctly return the number of FEs connected
to the input BE. Also, rename it for clarity.
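
A minimal sketch of the intended counting behavior, in plain C. The
struct, the constant, and the function name are hypothetical and only
model the idea that every enabled FE, including the primary one, must be
counted.

```c
#include <stdbool.h>

#define EXAMPLE_MAX_FE 4

/* Hypothetical model: whether an FE is enabled and which BE it feeds. */
struct example_fe {
    bool enabled;
    int be_inst;
};

/* Count every enabled FE routed to be_inst, including the primary FE. */
static int example_count_fes_on_be(const struct example_fe *fe, int be_inst)
{
    int count = 0;

    for (int i = 0; i < EXAMPLE_MAX_FE; i++) {
        if (fe[i].enabled && fe[i].be_inst == be_inst)
            count++;
    }
    return count;
}
```

With a count like this, the symclk would only be treated as safe to
disable when the result is 0.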

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Nicholas Susanto <Nicholas.Susanto@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: Normalize reg offsets on JPEG v4.0.3
Lijo Lazar [Fri, 16 Aug 2024 07:10:43 +0000 (12:40 +0530)]
drm/amdgpu: Normalize reg offsets on JPEG v4.0.3

On VFs and SOCs with GC 9.4.4, VCN RRMT is disabled.
Only local register offsets should be used on JPEG v4.0.3 as they cannot
handle remote access to other AIDs. Since only local offsets are used,
the special write to MCM_ADDR register is no longer needed.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Avoid race between dcn35_set_drr() and dc_state_destruct()
Tobias Jakobi [Mon, 2 Sep 2024 09:40:27 +0000 (11:40 +0200)]
drm/amd/display: Avoid race between dcn35_set_drr() and dc_state_destruct()

dc_state_destruct() nulls the resource context of the DC state. The pipe
context passed to dcn35_set_drr() is a member of this resource context.

If dc_state_destruct() is called in parallel with the IRQ processing (which
calls dcn35_set_drr() at some point), we can end up using already nulled
function callback fields of struct stream_resource.

The logic in dcn35_set_drr() already tries to avoid this, by checking tg
against NULL. But if the nulling happens exactly after the NULL check and
before the next access, then we get a race.

Avoid this by copying tg first to a local variable, and then use this
variable for all the operations. This should work, as long as nobody
frees the resource pool where the timing generators live.
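
The pattern can be sketched in user-space C as follows. The types and the
function are hypothetical stand-ins, not the DC structures. In kernel code
a single load would typically be forced with READ_ONCE(); the plain
assignment here is a simplification.

```c
#include <stddef.h>

/* Hypothetical stand-in for a timing generator. */
struct example_tg {
    int drr_calls;
};

/* Hypothetical stand-in for a pipe context. */
struct example_pipe {
    struct example_tg *tg; /* may be set to NULL by another context */
};

/*
 * Read each tg pointer once into a local variable, check the local copy,
 * and use only that copy afterwards. Returns the number of pipes handled.
 */
static int example_set_drr(struct example_pipe **pipes, int num_pipes)
{
    int programmed = 0;

    for (int i = 0; i < num_pipes; i++) {
        struct example_tg *tg = pipes[i]->tg;

        if (tg == NULL)
            continue;
        tg->drr_calls++;
        programmed++;
    }
    return programmed;
}
```

The check and every later use refer to the same pointer value, so a
concurrent store of NULL to the field cannot slip in between them.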

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3142
Fixes: 06ad7e164256 ("drm/amd/display: Destroy DC context while keeping DML and DML2")
Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Avoid race between dcn10_set_drr() and dc_state_destruct()
Tobias Jakobi [Mon, 2 Sep 2024 09:40:26 +0000 (11:40 +0200)]
drm/amd/display: Avoid race between dcn10_set_drr() and dc_state_destruct()

dc_state_destruct() nulls the resource context of the DC state. The pipe
context passed to dcn10_set_drr() is a member of this resource context.

If dc_state_destruct() is called in parallel with the IRQ processing (which
calls dcn10_set_drr() at some point), we can end up using already nulled
function callback fields of struct stream_resource.

The logic in dcn10_set_drr() already tries to avoid this, by checking tg
against NULL. But if the nulling happens exactly after the NULL check and
before the next access, then we get a race.

Avoid this by copying tg first to a local variable, and then use this
variable for all the operations. This should work, as long as nobody
frees the resource pool where the timing generators live.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3142
Fixes: 06ad7e164256 ("drm/amd/display: Destroy DC context while keeping DML and DML2")
Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Tested-by: Raoul van Rüschen <raoul.van.rueschen@gmail.com>
Tested-by: Christopher Snowhill <chris@kode54.net>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Tested-by: Sefa Eyeoglu <contact@scrumplex.net>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: use clamp() in amdgpu_vm_adjust_size()
Li Zetao [Fri, 30 Aug 2024 01:22:15 +0000 (09:22 +0800)]
drm/amdgpu: use clamp() in amdgpu_vm_adjust_size()

When a value needs to be kept within a certain interval, using clamp()
makes the code easier to understand than min(max()).
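
A user-space sketch of the two styles, for comparison. The helpers below
are hypothetical stand-ins written for this example; the kernel provides
its own min(), max() and clamp() macros.

```c
/* User-space stand-ins for the kernel's min() and max(). */
static int example_min(int a, int b) { return a < b ? a : b; }
static int example_max(int a, int b) { return a > b ? a : b; }

/* Old style: nested min(max()), bounds are read inside out. */
static int example_bound_nested(int val, int lo, int hi)
{
    return example_min(example_max(val, lo), hi);
}

/* New style: one call with the value and both bounds stated in order. */
static int example_clamp(int val, int lo, int hi)
{
    if (val < lo)
        return lo;
    if (val > hi)
        return hi;
    return val;
}
```

Both functions return the same result whenever lo <= hi; the second form
simply states the intent directly.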

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd: use clamp() in amdgpu_pll_get_fb_ref_div()
Li Zetao [Fri, 30 Aug 2024 01:22:14 +0000 (09:22 +0800)]
drm/amd: use clamp() in amdgpu_pll_get_fb_ref_div()

When a value needs to be kept within a certain interval, using clamp()
makes the code easier to understand than min(max()).

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: enable gfxoff quirk on HP 705G4
Peng Liu [Fri, 30 Aug 2024 07:27:08 +0000 (15:27 +0800)]
drm/amdgpu: enable gfxoff quirk on HP 705G4

Enabling the gfxoff quirk results in a perfectly usable
graphical user interface on HP 705G4 DM with R5 2400G.

Without the quirk, the X server is completely unusable, as
every few seconds there is a GPU reset due to a ring gfx timeout.

Signed-off-by: Peng Liu <liupeng01@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: add raven1 gfxoff quirk
Peng Liu [Fri, 30 Aug 2024 07:25:54 +0000 (15:25 +0800)]
drm/amdgpu: add raven1 gfxoff quirk

Fix screen corruption with openkylin.

Link: https://bbs.openkylin.top/t/topic/171497
Signed-off-by: Peng Liu <liupeng01@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Fix spelling mistake "recompte" -> "recompute"
Colin Ian King [Wed, 28 Aug 2024 09:32:50 +0000 (10:32 +0100)]
drm/amd/display: Fix spelling mistake "recompte" -> "recompute"

There is a spelling mistake in a DRM_DEBUG_DRIVER message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdkfd: Add cache line size info
David Belanger [Fri, 23 Aug 2024 17:50:03 +0000 (13:50 -0400)]
drm/amdkfd: Add cache line size info

Populate cache line size info in topology based on information from IP
discovery table.

Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Sreekant Somasekharan <Sreekant.Somasekharan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Add missing kdoc entry for 'bs_coeffs_updated' in dpp401_dscl_progra...
Srinivasan Shanmugam [Wed, 28 Aug 2024 11:25:23 +0000 (16:55 +0530)]
drm/amd/display: Add missing kdoc entry for 'bs_coeffs_updated' in dpp401_dscl_program_isharp

This commit addresses a missing kdoc for the 'bs_coeffs_updated'
parameter in the 'dpp401_dscl_program_isharp' function. The
'bs_coeffs_updated' is a flag indicating whether the Blur and Scale
Coefficients have been updated.

The 'dpp401_dscl_program_isharp' function is responsible for programming
the isharp, which includes setting the isharp filter, noise gain, and
blur and scale coefficients. If the 'bs_coeffs_updated' flag is set to
true, the function updates the blur and scale coefficients.

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:961: warning: Function parameter or struct member 'bs_coeffs_updated' not described in 'dpp401_dscl_program_isharp'

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Suggested-by: Tom Chung <chiahsuan.chung@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: fix invalid fence handling in amdgpu_vm_tlb_flush
Lang Yu [Sun, 1 Sep 2024 12:56:07 +0000 (08:56 -0400)]
drm/amdgpu: fix invalid fence handling in amdgpu_vm_tlb_flush

CPU based update doesn't produce a fence, handle such cases properly.

Fixes: d8a3f0a0348d ("drm/amdgpu: implement TLB flush fence")
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu: re-work VM syncing
Christian König [Tue, 20 Aug 2024 10:01:22 +0000 (12:01 +0200)]
drm/amdgpu: re-work VM syncing

Rework how VM operations synchronize to submissions. Provide an
amdgpu_sync container to the backends instead of a reservation
object and fill in the amdgpu_sync object in the higher layers
of the code.

No intended functional change, just prepares for upcoming changes.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago Revert "drm/amdgpu: align pp_power_profile_mode with kernel docs"
Alex Deucher [Thu, 5 Sep 2024 18:24:38 +0000 (14:24 -0400)]
Revert "drm/amdgpu: align pp_power_profile_mode with kernel docs"

This reverts commit bbb05f8a9cd87f5046d05a0c596fddfb714ee457.

This breaks some manual setting of the profile mode in
certain cases.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3600
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx10: use rlc safe mode for soft recovery
Alex Deucher [Wed, 24 Jul 2024 22:20:34 +0000 (18:20 -0400)]
drm/amdgpu/gfx10: use rlc safe mode for soft recovery

Protect the MMIO access with safe mode.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx11: use rlc safe mode for soft recovery
Alex Deucher [Wed, 24 Jul 2024 22:20:23 +0000 (18:20 -0400)]
drm/amdgpu/gfx11: use rlc safe mode for soft recovery

Protect the MMIO access with safe mode.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx12: use rlc safe mode for soft recovery
Alex Deucher [Wed, 24 Jul 2024 22:20:13 +0000 (18:20 -0400)]
drm/amdgpu/gfx12: use rlc safe mode for soft recovery

Protect the MMIO access with safe mode.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx12: use proper rlc safe mode helpers
Alex Deucher [Wed, 24 Jul 2024 22:11:52 +0000 (18:11 -0400)]
drm/amdgpu/gfx12: use proper rlc safe mode helpers

Rather than open coding it for the queue reset.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx11: use proper rlc safe mode helpers
Alex Deucher [Wed, 24 Jul 2024 22:10:04 +0000 (18:10 -0400)]
drm/amdgpu/gfx11: use proper rlc safe mode helpers

Rather than open coding it for the queue reset.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx10: use proper rlc safe mode helpers
Alex Deucher [Wed, 24 Jul 2024 22:07:28 +0000 (18:07 -0400)]
drm/amdgpu/gfx10: use proper rlc safe mode helpers

Rather than open coding it for the queue reset.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx12: per queue reset only on bare metal
Alex Deucher [Thu, 18 Jul 2024 14:22:00 +0000 (10:22 -0400)]
drm/amdgpu/gfx12: per queue reset only on bare metal

It's not supported under SR-IOV at the moment.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx11: per queue reset only on bare metal
Alex Deucher [Thu, 18 Jul 2024 14:21:45 +0000 (10:21 -0400)]
drm/amdgpu/gfx11: per queue reset only on bare metal

It's not supported under SR-IOV at the moment.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx10: per queue reset only on bare metal
Alex Deucher [Thu, 18 Jul 2024 14:21:21 +0000 (10:21 -0400)]
drm/amdgpu/gfx10: per queue reset only on bare metal

It's not supported under SR-IOV at the moment.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/mes11: implement mmio queue reset for gfx11
Jiadong Zhu [Thu, 4 Jul 2024 04:32:01 +0000 (12:32 +0800)]
drm/amdgpu/mes11: implement mmio queue reset for gfx11

Implement queue reset for graphics and compute queues.

v2: use amdgpu_gfx_rlc funcs to enter/exit safe mode.
v3: use gfx_v11_0_request_gfx_index_mutex()
v4: fix mutex handling

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/mes: implement amdgpu_mes_reset_hw_queue_mmio
Jiadong Zhu [Thu, 4 Jul 2024 04:26:16 +0000 (12:26 +0800)]
drm/amdgpu/mes: implement amdgpu_mes_reset_hw_queue_mmio

The reset_queue api could be used from kfd or kgd.

v2: add use_mmio parameter for mes_reset_legacy_queue.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/mes: modify mes api for mmio queue reset
Jiadong Zhu [Thu, 4 Jul 2024 04:10:59 +0000 (12:10 +0800)]
drm/amdgpu/mes: modify mes api for mmio queue reset

Add me/pipe/queue parameters for queue reset input.

v2: fix build (Alex)

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx12: fallback to driver reset compute queue directly
Alex Deucher [Mon, 1 Jul 2024 22:22:24 +0000 (18:22 -0400)]
drm/amdgpu/gfx12: fallback to driver reset compute queue directly

MES FW resets of the kernel compute queue always fail, which may be
caused by the KIQ failing to process the unmap of the KCQ. So, until the
MES FW works properly, fall back to having the driver execute the dequeue
and reset the SPI directly. Besides, rework the ring reset function so
that each busy ring type is reset in its own function.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx12: add ring reset callbacks
Alex Deucher [Mon, 3 Jun 2024 21:07:56 +0000 (17:07 -0400)]
drm/amdgpu/gfx12: add ring reset callbacks

Add ring reset callbacks for gfx and compute.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx10: rework reset sequence
Alex Deucher [Mon, 1 Jul 2024 22:14:14 +0000 (18:14 -0400)]
drm/amdgpu/gfx10: rework reset sequence

To match other GFX IPs.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx10: wait for reset done before remap
Jiadong Zhu [Tue, 2 Jul 2024 01:17:14 +0000 (09:17 +0800)]
drm/amdgpu/gfx10: wait for reset done before remap

There is a race condition in which the CP firmware modifies the
MQD during the reset sequence after the driver has updated it for
remapping. We have to wait until CP_HQD_ACTIVE becomes
false before remapping the queue.
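
The wait-then-remap ordering can be sketched as a bounded polling loop.
This is a user-space model under stated assumptions: the callback type,
the function names, and the use of bit 0 as the active flag are all
hypothetical, and the second function is only a test double.

```c
#include <stdbool.h>

/* Hypothetical register read callback; bit 0 set means the HQD is active. */
typedef unsigned int (*example_read_active_fn)(void *ctx);

/* Poll until the queue reports inactive or max_polls reads are used up. */
static bool example_wait_hqd_inactive(example_read_active_fn read_active,
                                      void *ctx, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if ((read_active(ctx) & 1u) == 0)
            return true;   /* inactive: the queue may be remapped now */
    }
    return false;          /* timed out: the caller should not remap */
}

/* Test double: reports active until the counter behind ctx reaches zero. */
static unsigned int example_countdown_active(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 1;
    }
    return 0;
}
```

Remapping only after this returns true keeps the firmware's late MQD
writes from overwriting the driver's update.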

v2: fix KIQ locking (Alex)
v3: fix KIQ locking harder (Jessie)

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx10: remap queue after reset successfully
Jiadong Zhu [Fri, 14 Jun 2024 05:46:36 +0000 (13:46 +0800)]
drm/amdgpu/gfx10: remap queue after reset successfully

The KIQ command unmap_queues only does the dequeueing action.
We have to map the queue back with a clean MQD.

v2: fix up error handling (Alex)

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx10: add ring reset callbacks
Alex Deucher [Fri, 24 May 2024 16:37:50 +0000 (12:37 -0400)]
drm/amdgpu/gfx10: add ring reset callbacks

Add ring reset callbacks for gfx and compute.

v2: fix gfx handling
v3: wait for KIQ to complete

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx11: wait for reset done before remap
Jiadong Zhu [Tue, 2 Jul 2024 02:01:21 +0000 (10:01 +0800)]
drm/amdgpu/gfx11: wait for reset done before remap

There is a race condition in which the CP firmware modifies the
MQD during the reset sequence after the driver has updated it for
remapping. We have to wait until CP_HQD_ACTIVE becomes
false before remapping the queue.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx11: rename gfx_v11_0_gfx_init_queue()
Alex Deucher [Mon, 1 Jul 2024 22:04:40 +0000 (18:04 -0400)]
drm/amdgpu/gfx11: rename gfx_v11_0_gfx_init_queue()

Rename to gfx_v11_0_kgq_init_queue() to better align with
the other naming in the file.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amdgpu/gfx11: fallback to driver reset compute queue directly (v2)
Prike Liang [Fri, 14 Jun 2024 13:25:44 +0000 (21:25 +0800)]
drm/amdgpu/gfx11: fallback to driver reset compute queue directly (v2)

MES FW resets of the kernel compute queue always fail, which may be
caused by the KIQ failing to process the unmap of the KCQ. So, until the
MES FW works properly, fall back to having the driver execute the dequeue
and reset the SPI directly. Besides, rework the ring reset function so
that each busy ring type is reset in its own function.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: 3.2.299
Aric Cyr [Sun, 25 Aug 2024 23:40:51 +0000 (19:40 -0400)]
drm/amd/display: 3.2.299

This version brings along the following:

- DCN35 fixes
- DML2 fixes
- IPS fixes
- ODM fixes
- Miscellaneous cleanups
- MST fixes
- SPL fixes

Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Fix flickering caused by dccg
Hansen Dsouza [Wed, 14 Aug 2024 15:20:08 +0000 (11:20 -0400)]
drm/amd/display: Fix flickering caused by dccg

Always allow un-gating. Follow the legacy workaround for repeated
dppclk dto updates.

Reviewed-by: Muhammad Ahmed <ahmed.ahmed@amd.com>
Signed-off-by: Hansen Dsouza <Hansen.Dsouza@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks ago drm/amd/display: Block timing sync for different signals in PMO
Dillon Varone [Thu, 22 Aug 2024 21:52:57 +0000 (17:52 -0400)]
drm/amd/display: Block timing sync for different signals in PMO

PMO assumes that like timings can be synchronized, but DC only allows
this if the signal types match.
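
The rule can be sketched as a predicate that requires both conditions. The
struct and its fields are hypothetical simplifications, not the DC or PMO
stream types.

```c
#include <stdbool.h>

/* Hypothetical, simplified stream description. */
struct example_stream {
    int signal_type;   /* e.g. an enum value for HDMI or DisplayPort */
    int h_total;
    int v_total;
    int pix_clk_khz;
};

/* Allow synchronization only if timings and signal types both match. */
static bool example_can_timing_sync(const struct example_stream *a,
                                    const struct example_stream *b)
{
    bool like_timing = a->h_total == b->h_total &&
                       a->v_total == b->v_total &&
                       a->pix_clk_khz == b->pix_clk_khz;

    return like_timing && a->signal_type == b->signal_type;
}
```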

Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agodrm/amd/display: fix graphics hang in multi-display mst case
Gabe Teeger [Fri, 23 Aug 2024 13:50:22 +0000 (09:50 -0400)]
drm/amd/display: fix graphics hang in multi-display mst case

[what]
Graphics hang observed with 3 displays connected to DP2.0 mst dock.

[why]
There's a mismatch in dml and dc between the assignments of hpo link
encoders.

[how]
Add a new array in dml that tracks the current mapping of HPO stream
encoders to HPO link encoders in dc.

Reviewed-by: Sung joon Kim <sungjoon.kim@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Gabe Teeger <Gabe.Teeger@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agodrm/amd/display: Add sharpness control interface
Relja Vojvodic [Wed, 21 Aug 2024 13:34:21 +0000 (09:34 -0400)]
drm/amd/display: Add sharpness control interface

- Add interface for controlling sharpness level input into DCN.
- Update SPL to support custom sharpness values.
- Add support for different sharpness values depending on YUV/RGB
  content.

Reviewed-by: Samson Tam <samson.tam@amd.com>
Signed-off-by: Relja Vojvodic <Relja.Vojvodic@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agoRevert "drm/amd/display: Wait for all pending cleared before full update"
Dillon Varone [Tue, 20 Aug 2024 19:13:14 +0000 (15:13 -0400)]
Revert "drm/amd/display: Wait for all pending cleared before full update"

This reverts commit f0b7dcf25834afd17df316367dfe5d4c890c713c.

It is causing graphics hangs.

Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agodrm/amd/display: disable sharpness if HDR Multiplier is too large
Samson Tam [Thu, 22 Aug 2024 00:17:23 +0000 (20:17 -0400)]
drm/amd/display: disable sharpness if HDR Multiplier is too large

[Why]
Certain profiles have a higher HDR multiplier than the SDR boost max,
which is not currently supported.

[How]
Disable sharpness for these profiles

Fixes: 1b0ce903fe74 ("drm/amd/display: add improvements for text display and HDR DWM and MPO")
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agodrm/amd/display: Add dpia debug option to control power management
Meenakshikumar Somasundaram [Tue, 20 Aug 2024 17:15:38 +0000 (13:15 -0400)]
drm/amd/display: Add dpia debug option to control power management

[Why]
To provide an option to control dpia power management

[How]
By adding disable_usb4_pm_support bit field in dpia_debug option to
control dpia power management

Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agodrm/amdgpu/gfx11: add ring reset callbacks
Alex Deucher [Fri, 24 May 2024 16:20:10 +0000 (12:20 -0400)]
drm/amdgpu/gfx11: add ring reset callbacks

Add ring reset callbacks for gfx and compute.

Acked-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agodrm/amd/display: re-enable Dynamic ODM policy
Samson Tam [Wed, 21 Aug 2024 15:03:11 +0000 (11:03 -0400)]
drm/amd/display: re-enable Dynamic ODM policy

[Why]
Dynamic ODM policy was previously disabled due to an underflow issue
with the sharpener. The issue is resolved after updating the sharpening
policy to apply to both windowed and fullscreen video.

[How]
Remove sharpness check disabling Dynamic ODM policy

Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agodrm/amd/display: Lock DC and exit IPS when changing backlight
Leo Li [Tue, 20 Aug 2024 18:34:15 +0000 (14:34 -0400)]
drm/amd/display: Lock DC and exit IPS when changing backlight

Backlight updates require aux and/or register access. Therefore, driver
needs to disallow IPS beforehand.

So, acquire the dc lock before calling into dc to update backlight - we
should be doing this regardless of IPS. Then, while the lock is held,
disallow IPS before calling into dc, then allow IPS afterwards (if it
was previously allowed).
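
The ordering described above can be sketched in plain C. This is only an
illustration under stated assumptions: struct fake_dc, the fake_dc_*()
helpers and set_backlight() are hypothetical stand-ins, not the driver's
real types or functions, and the lock is modelled as a simple flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the display core state (not the real struct dc). */
struct fake_dc {
    bool lock_held;    /* models the dc lock as a flag */
    bool ips_allowed;  /* true when idle power states may be entered */
    int backlight;     /* last programmed backlight level */
};

static void fake_dc_lock(struct fake_dc *dc)
{
    assert(!dc->lock_held);
    dc->lock_held = true;
}

static void fake_dc_unlock(struct fake_dc *dc)
{
    assert(dc->lock_held);
    dc->lock_held = false;
}

/* Models aux/register access: only legal with the lock held and IPS disallowed. */
static void fake_dc_program_backlight(struct fake_dc *dc, int level)
{
    assert(dc->lock_held && !dc->ips_allowed);
    dc->backlight = level;
}

void set_backlight(struct fake_dc *dc, int level)
{
    bool reallow_ips;

    fake_dc_lock(dc);
    reallow_ips = dc->ips_allowed;  /* remember the prior state */
    dc->ips_allowed = false;        /* disallow IPS before touching hardware */
    fake_dc_program_backlight(dc, level);
    dc->ips_allowed = reallow_ips;  /* allow again only if it was allowed before */
    fake_dc_unlock(dc);
}
```

Saving the previous IPS state inside the locked region is what lets the
function restore it without re-enabling IPS for a caller that had it
disallowed.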

Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agodrm/amd/display: only trigger BIOS related assert for older ASICs
Daniel Sa [Tue, 20 Aug 2024 18:19:26 +0000 (14:19 -0400)]
drm/amd/display: only trigger BIOS related assert for older ASICs

[Why]
Some asserts are always hit on startup/Pnp when they should only be used
to indicate when something has gone wrong.

[How]
Ignore result of getting function from bios cmd table for newer asics.

Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Daniel Sa <Daniel.Sa@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agodrm/amd/display: Fix DCN35 set min dispclk logic
Nicholas Susanto [Tue, 20 Aug 2024 15:05:54 +0000 (11:05 -0400)]
drm/amd/display: Fix DCN35 set min dispclk logic

[Why]

Setting min dispclk to 50MHz outside clock lowering function causes
unnecessary calls to SMU to lower dispclk and causes dentist hangs when
there is no stream on the pipes.

[How]

Move the set minimum dispclk logic inside the lowering dispclk if
statement.

Fixes: 234441320552 ("DCN35 set min dispclk to 50Mhz")
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Nicholas Susanto <Nicholas.Susanto@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 weeks agodrm/amdgpu/gfx9.4.3: Implement compute pipe reset
Prike Liang [Thu, 29 Aug 2024 03:47:12 +0000 (11:47 +0800)]
drm/amdgpu/gfx9.4.3: Implement compute pipe reset

Implement the compute pipe reset; the driver will
fall back to pipe reset when queue reset fails.
The pipe reset only deactivates the queue that is
scheduled in the pipe, and meanwhile the MEC pipe
is reset to the firmware _start pointer. Pipe reset
therefore appears to cost more cycles than queue
reset, so the driver tries to recover with a queue
reset first.
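
The escalation order can be sketched as follows. This is a minimal
illustration, not the driver code: struct fake_ring, the fake_*_reset()
helpers, enum reset_level and recover_ring() are all hypothetical names,
and the hardware is modelled by two flags saying whether each reset
succeeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of which reset level recovered the ring. */
enum reset_level { RESET_NONE, RESET_QUEUE, RESET_PIPE };

/* Hypothetical ring model: flags say whether each reset would succeed. */
struct fake_ring {
    bool queue_reset_works;
    bool pipe_reset_works;
};

static int fake_queue_reset(const struct fake_ring *r)
{
    return r->queue_reset_works ? 0 : -1;
}

static int fake_pipe_reset(const struct fake_ring *r)
{
    return r->pipe_reset_works ? 0 : -1;
}

/* Returns 0 on success; *used records the level that recovered the ring. */
int recover_ring(const struct fake_ring *ring, enum reset_level *used)
{
    assert(ring && used);
    *used = RESET_NONE;

    /* Cheaper option first: reset only the hung queue. */
    if (fake_queue_reset(ring) == 0) {
        *used = RESET_QUEUE;
        return 0;
    }
    /* More expensive fallback: reset the whole pipe. */
    if (fake_pipe_reset(ring) == 0) {
        *used = RESET_PIPE;
        return 0;
    }
    return -1;
}
```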

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu: always allocate cleared VRAM for GEM allocations
Alex Deucher [Tue, 26 Mar 2024 15:28:29 +0000 (11:28 -0400)]
drm/amdgpu: always allocate cleared VRAM for GEM allocations

This adds allocation latency, but aligns better with user
expectations.  The latency should improve with the drm buddy
clearing patches that Arun has been working on.

In addition this fixes the high CPU spikes seen when doing
wipe on release.

v2: always set AMDGPU_GEM_CREATE_VRAM_CLEARED (Christian)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3528
Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
Acked-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Cc: Christian König <christian.koenig@amd.com>
4 weeks agodrm/amdgpu/mes: add mes mapping legacy queue switch
Jack Xiao [Thu, 22 Aug 2024 10:18:51 +0000 (18:18 +0800)]
drm/amdgpu/mes: add mes mapping legacy queue switch

Old mes11 firmware has an issue mapping legacy queues, so
add a flag to control whether mes maps legacy queues.

Fixes: f9d8c5c7855d ("drm/amdgpu/gfx: enable mes to map legacy queue support")
Reported-by: Andrew Worsley <amworsley@gmail.com>
Link: https://lists.freedesktop.org/archives/amd-gfx/2024-August/112773.html
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdkfd: Don't drain ih1 for APU
Yifan Zhang [Tue, 27 Aug 2024 07:14:31 +0000 (15:14 +0800)]
drm/amdkfd: Don't drain ih1 for APU

ih1 is not initialized for APUs. Don't drain it, or a NULL pointer
error will be triggered.

Fixes: 6ef29715ac06 ("drm/amdkfd: Change kfd/svm page fault drain handling")
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu/gfx12: return early in preempt_ib()
Alex Deucher [Thu, 15 Aug 2024 16:58:14 +0000 (12:58 -0400)]
drm/amdgpu/gfx12: return early in preempt_ib()

When MES is enabled KIQ is not available.  Return an error
when someone uses the debugfs preempt test interface in
that case.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu/gfx11: return early in preempt_ib()
Alex Deucher [Wed, 14 Aug 2024 13:15:24 +0000 (09:15 -0400)]
drm/amdgpu/gfx11: return early in preempt_ib()

When MES is enabled KIQ is not available.  Return an error
when someone uses the debugfs preempt test interface in
that case.

Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Determine IPS mode by ASIC and PMFW versions
Leo Li [Tue, 27 Aug 2024 15:29:53 +0000 (11:29 -0400)]
drm/amd/display: Determine IPS mode by ASIC and PMFW versions

[Why]

DCN IPS interoperates with other system idle power features, such as
Zstates.

On DCN35, there is a known issue where system Z8 + DCN IPS2 causes a
hard hang. We observe this on systems where the SBIOS allows Z8.

Though there is a SBIOS fix, there's no guarantee that users will get it
any time soon, or even install it. A workaround is needed to prevent
this from rearing its head in the wild.

[How]

For DCN35, check the pmfw version to determine whether the SBIOS has the
fix. If not, set IPS1+RCG as the deepest possible state in all cases
except for s0ix and display off (DPMS). Otherwise, enable all IPS

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu: Move the dumping log out of for loop
Sunil Khatri [Wed, 28 Aug 2024 08:06:23 +0000 (13:36 +0530)]
drm/amdgpu: Move the dumping log out of for loop

The log message "Dumping IP State Completed" needs to
be logged only once, when state dumping is complete.

Hence move it out of the for loop.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Trigger Huang <Trigger.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/amdgpu: move drain_workqueue before shutdown is set
Victor Zhao [Sun, 25 Aug 2024 16:14:26 +0000 (00:14 +0800)]
drm/amd/amdgpu: move drain_workqueue before shutdown is set

[background] When unloading the amdgpu driver right after running a
workload, drain_workqueue causes "Fence fallback timer
expired on ring sdma0.0". Under sriov, this issue causes an sriov
full access timeout and a reset.

Move drain_workqueue before shutdown is set, to allow ih processing, and
before entering full access under sriov, to avoid the full access time cost.

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu: Do core dump immediately when job tmo
Trigger Huang [Mon, 19 Aug 2024 08:04:52 +0000 (16:04 +0800)]
drm/amdgpu: Do core dump immediately when job tmo

Do the coredump immediately after a job timeout to get a closer
representation of GPU's error status.

V2: This will skip printing vram_lost as the GPU reset has not
happened yet (Alex)

V3: Unconditionally call the core dump as we care about all the reset
functions (soft-recovery, queue reset, and full adapter reset) (Alex)

V4: Do the dump after adev->job_hang = true (Sunil)

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Acked-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu: skip printing vram_lost if needed
Trigger Huang [Mon, 19 Aug 2024 07:53:22 +0000 (15:53 +0800)]
drm/amdgpu: skip printing vram_lost if needed

The vram_lost status can only be obtained after a GPU reset occurs, but
sometimes a dev core dump can happen before the GPU reset. So a new
argument is added to tell the dev core dump implementation whether to
skip printing the vram_lost status in the dump.
This patch also tries to decouple the core dump function from
the GPU reset function, by replacing the argument amdgpu_reset_context
with amdgpu_job to specify the context for the core dump.

V2: Inform user if VRAM lost check is skipped so users don't assume
VRAM wasn't lost (Alex)

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu/gfx9: put queue resets behind a debug option
Alex Deucher [Tue, 20 Aug 2024 20:21:15 +0000 (16:21 -0400)]
drm/amdgpu/gfx9: put queue resets behind a debug option

Pending extended validation.

Reviewed-and-tested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu: add experimental resets debug flag
Alex Deucher [Tue, 20 Aug 2024 19:19:04 +0000 (15:19 -0400)]
drm/amdgpu: add experimental resets debug flag

Add this flag to enable experimental resets for testing before they
are fully validated.

Reviewed-and-tested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu/display: Fix a mistake in revert commit
Fangzhi Zuo [Tue, 27 Aug 2024 20:08:13 +0000 (16:08 -0400)]
drm/amdgpu/display: Fix a mistake in revert commit

[why]
Fix try_disable_dsc(), which was broken by a mis-revert of
commit 338567d17627 ("drm/amd/display: Fix MST BW calculation Regression")

[How]
Fix restoring minimum compression bw by 'max_kbps', instead of native bw 'stream_kbps'

Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu/swsmu: always force a state reprogram on init
Alex Deucher [Fri, 23 Aug 2024 01:54:24 +0000 (21:54 -0400)]
drm/amdgpu/swsmu: always force a state reprogram on init

Always reprogram the hardware state on init.  This ensures
the PMFW state is explicitly programmed and we are not relying
on the default PMFW state.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3131
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu/display: remove unnecessary TODO spl_os_types.h
Zaeem Mohamed [Fri, 23 Aug 2024 04:30:15 +0000 (00:30 -0400)]
drm/amdgpu/display: remove unnecessary TODO spl_os_types.h

Remove unnecessary TODO from spl_os_types.h

Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu/display: SPDX copyright for spl_os_types.h
Zaeem Mohamed [Thu, 22 Aug 2024 21:36:10 +0000 (17:36 -0400)]
drm/amdgpu/display: SPDX copyright for spl_os_types.h

Use appropriate SPDX copyright for spl_os_types.h

Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Add DSC Debug Log
Fangzhi Zuo [Fri, 2 Aug 2024 19:03:39 +0000 (15:03 -0400)]
drm/amd/display: Add DSC Debug Log

Add DSC logging to each critical routine to facilitate debugging.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: 3.2.298
Aric Cyr [Mon, 19 Aug 2024 01:39:06 +0000 (21:39 -0400)]
drm/amd/display: 3.2.298

This version brings along the following fixes:
- Fix MS/MP mismatches in dml21 for dcn401
- Resolved Coverity issues
- Add back quality EASF and ISHARP and dc dependency changes
- Add sharpness support for windowed YUV420 video
- Add improvements for text display and HDR DWM and MPO
- Fix Synaptics Cascaded Panamera DSC Determination
- Allocate DCN35 clock table transfer buffers in GART
- Add Replay Low Refresh Rate parameters in dc type

Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: add sharpness support for windowed YUV420 video
Samson Tam [Sat, 17 Aug 2024 23:24:27 +0000 (19:24 -0400)]
drm/amd/display: add sharpness support for windowed YUV420 video

[Why]
Previously, sharpness was only applied to fullscreen YUV420 video.

[How]
Remove the fullscreen restriction and apply sharpness to windowed
 YUV420 video as well.

Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: add improvements for text display and HDR DWM and MPO
Samson Tam [Sat, 17 Aug 2024 23:16:53 +0000 (19:16 -0400)]
drm/amd/display: add improvements for text display and HDR DWM and MPO

[Why]
Tune settings for improved text display.
Handle differences between DWM and MPO in HDR path.

[How]
Update sharpener LBA table.
Use HDR multiplier to calculate scalar matrix coefficients
 for HDR RGB MPO path.
Update unit tests.

Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Add Replay Low Refresh Rate parameters in dc type.
Dennis Chan [Fri, 19 Jul 2024 07:08:35 +0000 (15:08 +0800)]
drm/amd/display: Add Replay Low Refresh Rate parameters in dc type.

Why:
To support Low Refresh Rate panels for the Replay feature,
add some parameters to record Low Refresh Rate information.

Reviewed-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Dennis Chan <dennis.chan@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: add back quality EASF and ISHARP and dc dependency changes
Samson Tam [Fri, 16 Aug 2024 15:42:35 +0000 (11:42 -0400)]
drm/amd/display: add back quality EASF and ISHARP and dc dependency changes

[Why]
Addressed previous issues with quality changes and new issues due to
 rolling back quality changes.

[How]
This reverts commit f9e6759888866748f31b6b6c2142a481d587f51f, fixes merge conflicts, and fixes some
 formatting errors.
Store current sharpness level for each pregen table to minimize
 calculating sharpness table every time.
Disable dynamic ODM when sharpness is enabled.

Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Notify DMCUB of D0/D3 state
Nicholas Kazlauskas [Fri, 21 Jun 2024 20:11:28 +0000 (16:11 -0400)]
drm/amd/display: Notify DMCUB of D0/D3 state

[Why]
We want to avoid arming the HPD timer in firmware when preparing for
S0i3 entry when DC is considered in D3.

[How]
Notify DMCUB of the power state transitions so it can decide to arm
the HPD timer for idle in DCN35 only in D0.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Fix Synaptics Cascaded Panamera DSC Determination
Fangzhi Zuo [Mon, 12 Aug 2024 16:13:44 +0000 (12:13 -0400)]
drm/amd/display: Fix Synaptics Cascaded Panamera DSC Determination

Synaptics Cascaded Panamera topology needs to unconditionally
acquire root aux for dsc decoding.

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Retry Replay residency
ChunTao Tso [Thu, 8 Aug 2024 09:25:55 +0000 (17:25 +0800)]
drm/amd/display: Retry Replay residency

[Why]
The DMUB GPINT sometimes times out,
 which causes us to return 0 as the residency number.

[How]
Retry the query to avoid this.
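
A bounded retry of this kind can be sketched as below. This is only an
illustration: RESIDENCY_MAX_TRIES, the residency_query_fn callback type,
read_residency_with_retry() and the flaky_query() test double are
hypothetical names, and the retry count of 3 is an assumption rather
than the value the driver uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RESIDENCY_MAX_TRIES 3  /* hypothetical retry budget */

/* Hypothetical query callback: returns false when the request times out. */
typedef bool (*residency_query_fn)(void *ctx, uint32_t *residency);

/* Retries the query a bounded number of times; reports 0 only if all tries fail. */
bool read_residency_with_retry(residency_query_fn query, void *ctx,
                               uint32_t *residency)
{
    assert(query && residency);

    for (int i = 0; i < RESIDENCY_MAX_TRIES; i++) {
        if (query(ctx, residency))
            return true;
    }
    *residency = 0;
    return false;
}

/* Test double: times out *(int *)ctx times, then reports a residency of 42. */
bool flaky_query(void *ctx, uint32_t *residency)
{
    int *timeouts_left = ctx;

    if (*timeouts_left > 0) {
        (*timeouts_left)--;
        return false;
    }
    *residency = 42;
    return true;
}
```

Returning a separate success flag lets a caller tell a genuine residency
of 0 apart from a query that never completed.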

Reviewed-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: ChunTao Tso <ChunTao.Tso@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Allocate DCN35 clock table transfer buffers in GART
Nicholas Kazlauskas [Thu, 15 Aug 2024 20:31:44 +0000 (16:31 -0400)]
drm/amd/display: Allocate DCN35 clock table transfer buffers in GART

[Why]
Request from PMFW to use GART for clock table transfer tables as
framebuffer is being deprecated on APU.

[How]
Switch over to GART via the allocation flag.

Reviewed-by: Sung joon Kim <sungjoon.kim@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: do not set traslate_by_source for DCN401 cursor
Aurabindo Pillai [Wed, 14 Aug 2024 21:56:17 +0000 (17:56 -0400)]
drm/amd/display: do not set traslate_by_source for DCN401 cursor

translate_by_source need not be set for DCN401 onwards since
cursor composition comes after the scaler in the hardware pipeline.
Hence the offset calculation has been reworked, and this setting no
longer needs to be enabled.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Resolve Coverity Issues
Daniel Sa [Mon, 12 Aug 2024 19:24:27 +0000 (15:24 -0400)]
drm/amd/display: Resolve Coverity Issues

[WHY]
Remove coverity issues that were originally ignored.

[HOW]
Ran coverity locally on driver, used output report to find existing
coverity issues, resolved them

Reviewed-by: Nicholas Choi <nicholas.choi@amd.com>
Signed-off-by: Daniel Sa <Daniel.Sa@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Fix MS/MP mismatches in dml21 for dcn401
Dillon Varone [Wed, 14 Aug 2024 21:32:16 +0000 (17:32 -0400)]
drm/amd/display: Fix MS/MP mismatches in dml21 for dcn401

[WHY]
Prefetch calculations did not guarantee that bandwidth required in
mode support was less than mode programming which can cause failures.

[HOW]
Fix bandwidth calculations to assume fixed times for OTO schedule,
and choose which schedule to use based on time to fetch pixel data.

Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Wait for all pending cleared before full update
Alvin Lee [Thu, 8 Aug 2024 14:19:54 +0000 (10:19 -0400)]
drm/amd/display: Wait for all pending cleared before full update

[Description]
Before every full update we must wait for all pending updates to be
cleared - this is particularly important for minimal transitions
because if we don't wait for pending cleared, it will be as if
there was no minimal transition at all. In OTG we must read 3 different
status registers for pending cleared, one specifically for OTG updates,
one specifically for OPTC updates, and the last for surface related
updates

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: guard write a 0 post_divider value to HW
Ahmed, Muhammad [Tue, 13 Aug 2024 21:11:55 +0000 (17:11 -0400)]
drm/amd/display: guard write a 0 post_divider value to HW

[why]
post_divider_value should not be 0.

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Ahmed, Muhammad <Ahmed.Ahmed@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd/display: Don't skip clock updates in overclocking
Alvin Lee [Thu, 20 Jun 2024 18:32:21 +0000 (14:32 -0400)]
drm/amd/display: Don't skip clock updates in overclocking

[Description]
Skipping clock updates is not a hard requirement for overclocking
and only an optimization. Remove the skip as this can cause issues
for FAMS transitions during the overclock sequence. If FAMS
is enabled we must disable UCLK switch on any full update (which
requires update clocks to be called).

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amd: Introduce additional IPS debug flags
Leo Li [Tue, 6 Aug 2024 17:29:13 +0000 (13:29 -0400)]
drm/amd: Introduce additional IPS debug flags

[Why]

Idle power states (IPS) describe levels of power-gating within DCN. DM
and DC are responsible for ensuring that we are out of IPS before any DCN
programming happens. Any DCN programming while we're in IPS leads to
undefined behavior (mostly hangs).

Because IPS intersects with all display features, the ability to disable
IPS by default while ironing out the known issues is desired. However,
disabling it completely will cause important features such as s0ix entry
to fail.

Therefore, more granular IPS debug flags are desired.

[How]

Extend the dc debug mask bits to include the available list of IPS
debug flags.

All the flags should work as documented, with the exception of
IPS_DISABLE_DYNAMIC. It requires dm changes which will be done in
later changes.

v2: enable docs and fix docstring format

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
4 weeks agodrm/amdgpu/smu13.0.7: print index for profiles
Alex Deucher [Thu, 22 Aug 2024 20:20:10 +0000 (16:20 -0400)]
drm/amdgpu/smu13.0.7: print index for profiles

Print the index for the profiles.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3543
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>