drm/amdkfd: Handle deallocated VGPRs in gfx11+ trap handler
author Jay Cornwall <jay.cornwall@amd.com>
Tue, 28 May 2024 20:55:56 +0000 (15:55 -0500)
committer Alex Deucher <alexander.deucher@amd.com>
Wed, 5 Jun 2024 15:05:57 +0000 (11:05 -0400)
commit c5afb313e7e623a06cd3428f0a651b2235211430
tree 64533707377886b1391ce8cecdf09e0cd4d17ffe
parent 4002a6c55e99046b4a09ae255d38d3620b31fb1d

A wavefront may deallocate its VGPRs at the end of a program while
waiting for memory transactions to complete. If it subsequently
receives a context save exception, it will be unable to save its
state, since saving requires VGPRs. In this case, the trap handler
should terminate the wavefront.
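
The decision described above can be modelled roughly as follows. This
is an illustrative C sketch only, not the trap handler's assembly: the
names wave_state, trap_action and on_context_save are hypothetical and
do not appear in the patch.

```c
#include <stdbool.h>

/* Hypothetical model of the trap handler's choice; all names are illustrative. */
enum trap_action {
    TRAP_SAVE_CONTEXT,   /* proceed with the normal context save */
    TRAP_TERMINATE_WAVE  /* end the wavefront without saving */
};

struct wave_state {
    bool vgprs_deallocated; /* wave released its VGPRs at the end of its program */
};

/* Called when a context save exception reaches the wave. */
static enum trap_action on_context_save(const struct wave_state *w)
{
    /* Saving needs VGPRs. A wave that has already given them up has no
     * remaining program to run, so it is terminated instead of saved. */
    if (w->vgprs_deallocated)
        return TRAP_TERMINATE_WAVE;
    return TRAP_SAVE_CONTEXT;
}
```

In the handler itself the terminate case corresponds to ending the wave
with S_ENDPGM, as noted under V2.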

Fixes intermittent VM faults under context switching load.

V2: Use S_ENDPGM instead of S_ENDPGM_SAVED for performance counters

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Lancelot Six <lancelot.six@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm