amdgpu/soc15: enable asic reset for dGPU in case of suspend abort
author: Jiang Liu <gerry@linux.alibaba.com>
Mon, 13 Jan 2025 03:40:12 +0000 (11:40 +0800)
committer: Alex Deucher <alexander.deucher@amd.com>
Thu, 13 Feb 2025 02:02:58 +0000 (21:02 -0500)
When a GPU suspend is aborted, reset the soc15 ASIC on dGPUs as well,
as is already done for APUs. Otherwise resume may fail with errors such
as the following:
[  547.229463] amdgpu 0001:81:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110)

[  555.126827] amdgpu 0000:0a:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_0.2.1.0 test failed (-110)
[  555.126901] [drm:amdgpu_gfx_enable_kcq [amdgpu]] *ERROR* KCQ enable failed
[  555.126957] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_4_3> failed -110
[  555.126959] amdgpu 0000:0a:00.0: amdgpu: amdgpu_device_ip_resume failed (-110).
[  555.126965] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -110
[  555.126966] PM: Device 0000:0a:00.0 failed to resume async: error -110

This fix has been tested on MI308X.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Tested-by: Shuo Liu <shuox.liu@linux.alibaba.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/2462b4b12eb9d025e82525178d568cbaa4c223ff.1736739303.git.gerry@linux.alibaba.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/soc15.c

index a59b4c36cad73378abfe9273dd7857273afcab31..0e1daefd1a8ea3e870470b9cac2b5db7c387256a 100644
@@ -605,12 +605,10 @@ soc15_asic_reset_method(struct amdgpu_device *adev)
 static bool soc15_need_reset_on_resume(struct amdgpu_device *adev)
 {
        /* Will reset for the following suspend abort cases.
-        * 1) Only reset on APU side, dGPU hasn't checked yet.
-        * 2) S3 suspend aborted in the normal S3 suspend or
-        *    performing pm core test.
+        * 1) S3 suspend aborted in the normal S3 suspend
+        * 2) S3 suspend aborted in performing pm core test.
         */
-       if (adev->flags & AMD_IS_APU && adev->in_s3 &&
-                       !pm_resume_via_firmware())
+       if (adev->in_s3 && !pm_resume_via_firmware())
                return true;
        else
                return false;
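
The behavioral change in this hunk can be illustrated outside the kernel with a small user-space sketch. It is not driver code: `struct mock_device`, `need_reset_old()` and `need_reset_new()` are hypothetical names introduced only for illustration, and the `resumed_via_firmware` parameter stands in for the result of `pm_resume_via_firmware()`, on the assumption that it returns false when the suspend was aborted before platform firmware took over.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the amdgpu_device state the check reads. */
struct mock_device {
    bool is_apu; /* models (adev->flags & AMD_IS_APU) */
    bool in_s3;  /* models adev->in_s3 */
};

/* Predicate before the patch: only APUs were reset after an aborted S3. */
static bool need_reset_old(const struct mock_device *dev,
                           bool resumed_via_firmware)
{
    return dev->is_apu && dev->in_s3 && !resumed_via_firmware;
}

/* Predicate after the patch: the APU restriction is dropped, so dGPUs
 * are also reset when S3 was entered but firmware did not resume us. */
static bool need_reset_new(const struct mock_device *dev,
                           bool resumed_via_firmware)
{
    return dev->in_s3 && !resumed_via_firmware;
}
```

Under these assumptions, a dGPU with `in_s3` set and `resumed_via_firmware` false yields false from the old predicate and true from the new one, while the result for an APU is unchanged. A resume that did go through firmware requests no reset in either version.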