From: shaoyunl Date: Sun, 14 Nov 2021 17:38:18 +0000 (-0500) Subject: drm/amd/amdkfd: Fix kernel panic when reset failed and been triggered again X-Git-Url: https://git.kernel.dk/?a=commitdiff_plain;h=74aafe99efb68f15e50be9f7032c2168512f98a8;p=linux-block.git drm/amd/amdkfd: Fix kernel panic when reset failed and been triggered again [ Upstream commit 2cf49e00d40d5132e3d067b5aa6d84791929ab15 ] In SRIOV configuration, the reset may failed to bring asic back to normal but stop cpsch already been called, the start_cpsch will not be called since there is no resume in this case. When reset been triggered again, driver should avoid to do uninitialization again. Signed-off-by: shaoyunl Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index 352a32dc609b..2645ebc63a14 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -1207,6 +1207,11 @@ static int stop_cpsch(struct device_queue_manager *dqm) bool hanging; dqm_lock(dqm); + if (!dqm->sched_running) { + dqm_unlock(dqm); + return 0; + } + if (!dqm->is_hws_hang) unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0); hanging = dqm->is_hws_hang || dqm->is_resetting;