drm/amdgpu: Add Runtime Bad Page message definitions for VFs
author Ellen Pan <yunru.pan@amd.com>
Tue, 29 Apr 2025 20:22:53 +0000 (16:22 -0400)
committer Alex Deucher <alexander.deucher@amd.com>
Wed, 7 May 2025 21:41:43 +0000 (17:41 -0400)
Currently, VFs rely on the poison consumption interrupt from HW
to kick off the bad page retirement process. This process
includes a VF reset.

This patch adds the following:

1) Host Bad Pages notification message.
2) Guest request bad pages message.

When combined, these messages let VFs reserve the bad pages early and
potentially avoid future poison consumption, which would disrupt user
services through the resulting FLR.

Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Signed-off-by: Ellen Pan <yunru.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h

index bea724981309cdacc5dc6ee4baf6649872248f37..3b0c55f67fe46c2012a7c6bf7b9f41b2e5323699 100644
@@ -331,6 +331,7 @@ enum amd_sriov_mailbox_request_message {
        MB_REQ_MSG_RAS_POISON = 202,
        MB_REQ_RAS_ERROR_COUNT = 203,
        MB_REQ_RAS_CPER_DUMP = 204,
+       MB_REQ_RAS_BAD_PAGES = 205,
 };
 
 /* mailbox message send from host to guest  */
@@ -348,6 +349,8 @@ enum amd_sriov_mailbox_response_message {
        MB_RES_MSG_GPU_RMA                      = 10,
        MB_RES_MSG_RAS_ERROR_COUNT_READY        = 11,
        MB_REQ_RAS_CPER_DUMP_READY              = 14,
+       MB_RES_MSG_RAS_BAD_PAGES_READY          = 15,
+       MB_RES_MSG_RAS_BAD_PAGES_NOTIFICATION   = 16,
        MB_RES_MSG_TEXT_MESSAGE                 = 255
 };
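
The patch only defines the message IDs; the handlers are not part of it. As a rough illustration of how a guest might use them, the sketch below assumes that the host sends MB_RES_MSG_RAS_BAD_PAGES_NOTIFICATION when new bad pages exist, that the guest answers with MB_REQ_RAS_BAD_PAGES, and that the host replies with MB_RES_MSG_RAS_BAD_PAGES_READY once the list can be read. That ordering is an assumption inferred from the message names and the commit description. `struct example_vf_state` and `example_handle_host_msg()` are hypothetical and do not exist in amdgpu.

```c
#include <stdbool.h>

/* Message IDs copied from the patch above. */
enum { MB_REQ_RAS_BAD_PAGES = 205 };
enum {
	MB_RES_MSG_RAS_BAD_PAGES_READY        = 15,
	MB_RES_MSG_RAS_BAD_PAGES_NOTIFICATION = 16,
};

/* Hypothetical guest-side bookkeeping; not part of the patch. */
struct example_vf_state {
	bool request_pending;   /* an MB_REQ_RAS_BAD_PAGES is outstanding */
	unsigned int refreshes; /* how often the bad page list was re-read */
};

/*
 * Hypothetical handler for a message received from the host.
 * Returns the request ID the guest should send next, or 0 for none.
 */
static int example_handle_host_msg(struct example_vf_state *vf, int msg)
{
	switch (msg) {
	case MB_RES_MSG_RAS_BAD_PAGES_NOTIFICATION:
		/* Assumed: ask the host for the list, but only once at a time. */
		if (vf->request_pending)
			return 0;
		vf->request_pending = true;
		return MB_REQ_RAS_BAD_PAGES;
	case MB_RES_MSG_RAS_BAD_PAGES_READY:
		/* Assumed: the list is readable; a real driver would reserve pages here. */
		vf->request_pending = false;
		vf->refreshes++;
		return 0;
	default:
		return 0; /* unrelated message, nothing to send */
	}
}
```

In this sketch, a notification that arrives while a request is still outstanding is ignored rather than queued. A real driver would have to decide whether a second notification should trigger another request after the first one completes.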