drm/amdgpu: Log RAS errors during load
author Lijo Lazar <lijo.lazar@amd.com>
Tue, 6 May 2025 11:08:27 +0000 (16:38 +0530)
committer Alex Deucher <alexander.deucher@amd.com>
Tue, 13 May 2025 13:34:02 +0000 (09:34 -0400)
During driver load, the RAS event manager may not be initialized yet. In
that case, any ATHUB event raised during driver load is skipped without
appearing in the dmesg log. Log the error in dmesg for easier diagnosis.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c

index cf794cf7e262986f372ec4b9ce79f3e541a6940a..dc07936d2fcb2c443f86be0baab30fd7b536d0d9 100644
@@ -4498,8 +4498,11 @@ void amdgpu_ras_global_ras_isr(struct amdgpu_device *adev)
                enum ras_event_type type = RAS_EVENT_TYPE_FATAL;
                u64 event_id;
 
-               if (amdgpu_ras_mark_ras_event(adev, type))
+               if (amdgpu_ras_mark_ras_event(adev, type)) {
+                       dev_err(adev->dev,
+                               "uncorrectable hardware error (ERREVENT_ATHUB_INTERRUPT) detected!\n");
                        return;
+               }
 
                event_id = amdgpu_ras_acquire_event_id(adev, type);
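
The control flow changed by this hunk can be modeled in user space. The sketch below is illustrative only and is not kernel code: struct model_dev, model_dev_err(), model_mark_ras_event() and model_global_ras_isr() are hypothetical stand-ins, and it assumes the mark step returns nonzero when the event manager is not ready. The success-path message is a placeholder.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct amdgpu_device; not the real kernel type. */
struct model_dev {
	bool event_mgr_ready;   /* false while the driver is still loading */
	char log[256];          /* stand-in for the dmesg ring buffer */
};

/* Stand-in for dev_err(): append a message to the device's log buffer. */
static void model_dev_err(struct model_dev *dev, const char *msg)
{
	size_t used = strlen(dev->log);

	snprintf(dev->log + used, sizeof(dev->log) - used, "%s", msg);
}

/* Assumption: nonzero means the event could not be recorded. */
static int model_mark_ras_event(const struct model_dev *dev)
{
	return dev->event_mgr_ready ? 0 : -1;
}

/* Mirrors the patched flow: log the error before the early return. */
static void model_global_ras_isr(struct model_dev *dev)
{
	if (model_mark_ras_event(dev)) {
		model_dev_err(dev,
			"uncorrectable hardware error (ERREVENT_ATHUB_INTERRUPT) detected!\n");
		return;
	}

	/* Placeholder for the normal path that logs with an event id. */
	model_dev_err(dev, "event recorded by event manager\n");
}
```

With event_mgr_ready set to false, the model leaves the ATHUB message in the log buffer even though it returns early, which is the behavior the patch adds.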