drm/xe: Fix early wedge on GuC load failure
authorDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Wed, 11 Jun 2025 21:44:54 +0000 (14:44 -0700)
committerLucas De Marchi <lucas.demarchi@intel.com>
Thu, 12 Jun 2025 22:17:12 +0000 (15:17 -0700)
commit0b93b7dcd9eb888a6ac7546560877705d4ad61bf
treeb5069827175d594f3d0c8825b4a68a5559aac9aa
parent3a1edef8f4b58b0ba826bc68bf4bce4bdf59ecf3
drm/xe: Fix early wedge on GuC load failure

When the GuC fails to load we declare the device wedged. However, the
very first GuC load attempt on GT0 (from xe_gt_init_hwconfig) is done
before the GT1 GuC objects are initialized, so things go bad when the
wedge code attempts to cleanup GT1. To fix this, check the initialization
status in the functions called during wedge.

Fixes: 7dbe8af13c18 ("drm/xe: Wedge the entire device")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Zhanjun Dong <zhanjun.dong@intel.com>
Cc: stable@vger.kernel.org # v6.12+: 1e1981b16bb1: drm/xe: Fix taking invalid lock on wedge
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250611214453.1159846-2-daniele.ceraolospurio@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
drivers/gpu/drm/xe/xe_guc_ct.c
drivers/gpu/drm/xe/xe_guc_ct.h
drivers/gpu/drm/xe/xe_guc_submit.c