drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11
author Tim Huang <Tim.Huang@amd.com>
Wed, 27 Mar 2024 05:10:37 +0000 (13:10 +0800)
committer Alex Deucher <alexander.deucher@amd.com>
Wed, 10 Apr 2024 01:50:16 +0000 (21:50 -0400)
While doing multiple S4 stress tests, GC/RLC/PMFW get into
an invalid state, resulting in hard hangs.

Adding a GFX reset as a workaround just before sending the
MP1_UNLOAD message avoids this failure.

Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c

index e8119918ef6b129e7a26ab997c652f4dbb431a9a..88f1a0d878f339890e630e11f1128622dd49fc4e 100644
@@ -226,8 +226,18 @@ static int smu_v13_0_4_system_features_control(struct smu_context *smu, bool en)
        struct amdgpu_device *adev = smu->adev;
        int ret = 0;
 
-       if (!en && !adev->in_s0ix)
+       if (!en && !adev->in_s0ix) {
+               /* Adds a GFX reset as workaround just before sending the
+                * MP1_UNLOAD message to prevent GC/RLC/PMFW from entering
+                * an invalid state.
+                */
+               ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset,
+                                                     SMU_RESET_MODE_2, NULL);
+               if (ret)
+                       return ret;
+
                ret = smu_cmn_send_smc_msg(smu, SMU_MSG_PrepareMp1ForUnload, NULL);
+       }
 
        return ret;
 }
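
The ordering introduced by the hunk above can be exercised outside the kernel with a small user-space sketch. Everything prefixed with mock_ or MOCK_ below is a hypothetical stand-in invented for illustration; none of it is amdgpu driver API, and the numeric values are placeholders. The sketch only models the control flow: on disable outside s0ix, the reset message goes out first, and the unload message is sent only if the reset succeeded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the SMU message IDs used in the patch. */
enum mock_msg {
	MOCK_MSG_GFX_DEVICE_DRIVER_RESET,
	MOCK_MSG_PREPARE_MP1_FOR_UNLOAD,
};

#define MOCK_RESET_MODE_2 2	/* placeholder value, not the driver's */
#define MOCK_LOG_MAX 8

/* Minimal mock of the state the real function reads, plus a message log. */
struct mock_smu {
	bool in_s0ix;			/* mirrors adev->in_s0ix */
	int fail_on;			/* message ID to fail on, -1 for none */
	enum mock_msg log[MOCK_LOG_MAX];
	int log_len;
};

/* Record every attempted message; return -1 for the one chosen to fail. */
static int mock_send_msg(struct mock_smu *smu, enum mock_msg msg,
			 unsigned int param)
{
	(void)param;
	assert(smu->log_len < MOCK_LOG_MAX);
	smu->log[smu->log_len++] = msg;
	return (smu->fail_on == (int)msg) ? -1 : 0;
}

/* Same shape as the patched function: reset first, then prepare unload. */
static int mock_system_features_control(struct mock_smu *smu, bool en)
{
	int ret = 0;

	if (!en && !smu->in_s0ix) {
		ret = mock_send_msg(smu, MOCK_MSG_GFX_DEVICE_DRIVER_RESET,
				    MOCK_RESET_MODE_2);
		if (ret)
			return ret;	/* unload message is never sent */

		ret = mock_send_msg(smu, MOCK_MSG_PREPARE_MP1_FOR_UNLOAD, 0);
	}

	return ret;
}
```

Calling mock_system_features_control() with en set to false and in_s0ix cleared should leave two entries in the log, reset followed by unload. With in_s0ix set, or with en set to true, the log should stay empty, which matches the unchanged behavior of the original condition.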