drm/amdgpu: extend xnack limit page fault timeout
author Alex Sierra <alex.sierra@amd.com>
Mon, 12 Apr 2021 18:34:57 +0000 (13:34 -0500)
committer Alex Deucher <alexander.deucher@amd.com>
Fri, 23 Apr 2021 21:16:20 +0000 (17:16 -0400)
Extending this timeout keeps the IH from being flooded by an interrupt
storm coming from SDMA while a page fault is active. Currently, on
Aldebaran, handling that many interrupts can take a lot of CPU time
(up to 4 seconds), which eventually causes timeouts in other tasks.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
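diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 5715be6770ecc0314e0c40d71b9412a7771d474a..823a367990bf6d17f51006a6b596af6af539512b 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c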
@@ -1109,6 +1109,8 @@ static void sdma_v4_0_ctx_switch_enable(struct amdgpu_device *adev, bool enable)
                if (adev->asic_type == CHIP_ARCTURUS &&
                    adev->sdma.instance[i].fw_version >= 14)
                        WREG32_SDMA(i, mmSDMA0_PUB_DUMMY_REG2, enable);
+               /* Extend page fault timeout to avoid interrupt storm */
+               WREG32_SDMA(i, mmSDMA0_UTCL1_TIMEOUT, 0x00800080);
        }
 
 }
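
Note on the constant: the value 0x00800080 written to mmSDMA0_UTCL1_TIMEOUT reads naturally as two 16-bit limits of 0x80 (128) each. The sketch below assumes, without confirmation from this patch, that the register packs one XNACK limit in bits 15:0 and another in bits 31:16 (for example a read limit and a write limit). The helper names and shift macros are hypothetical and are not part of the amdgpu driver; the authoritative field layout is in the SDMA register headers.

```c
#include <stdint.h>

/*
 * Hypothetical layout, assumed for illustration only:
 *   bits 15:0  -> first XNACK limit
 *   bits 31:16 -> second XNACK limit
 */
#define UTCL1_TIMEOUT_LO_SHIFT   0
#define UTCL1_TIMEOUT_HI_SHIFT   16
#define UTCL1_TIMEOUT_FIELD_MASK 0xFFFFu

/* Pack two 16-bit limits into one 32-bit register value. */
static inline uint32_t utcl1_timeout_pack(uint16_t lo_limit, uint16_t hi_limit)
{
	return ((uint32_t)hi_limit << UTCL1_TIMEOUT_HI_SHIFT) |
	       ((uint32_t)lo_limit << UTCL1_TIMEOUT_LO_SHIFT);
}

/* Extract the limit stored in the low half of the register value. */
static inline uint16_t utcl1_timeout_lo(uint32_t reg)
{
	return (uint16_t)((reg >> UTCL1_TIMEOUT_LO_SHIFT) & UTCL1_TIMEOUT_FIELD_MASK);
}

/* Extract the limit stored in the high half of the register value. */
static inline uint16_t utcl1_timeout_hi(uint32_t reg)
{
	return (uint16_t)((reg >> UTCL1_TIMEOUT_HI_SHIFT) & UTCL1_TIMEOUT_FIELD_MASK);
}
```

Under that assumed layout, utcl1_timeout_pack(0x80, 0x80) yields the 0x00800080 used in the patch, so a named-field form of the write would make it easier to tune the two limits independently later.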