drm/nouveau: sched: avoid job races between entities
author Danilo Krummrich <dakr@redhat.com>
Fri, 11 Aug 2023 01:06:25 +0000 (03:06 +0200)
committer Danilo Krummrich <dakr@redhat.com>
Tue, 22 Aug 2023 17:32:13 +0000 (19:32 +0200)
commit 7baf605564133405443556b415692d3c7aa54351
tree c32b181bda09cfba5ae6887f169229323541fce4
parent 504245a5ab6b6e1bfe0280baa4885c551e082099

If a sched job depends on a dma-fence from a job that belongs to the
same GPU scheduler instance but a different scheduler entity, the GPU
scheduler only waits for that job to be scheduled, rather than for it
to fully complete. This is because the GPU scheduler assumes that
there is one scheduler instance per ring. However, the current
implementation, in order to avoid arbitrary amounts of kthreads, has a
single scheduler instance while scheduler entities represent rings.

As a workaround, set the DRM_SCHED_FENCE_DONT_PIPELINE flag on all
out-fences in order to force the scheduler to wait for full job
completion when a dependent job comes from a different entity of the
same scheduler instance.
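
The decision this workaround influences can be sketched as a small
userspace model. This is only an illustration under stated assumptions,
not the scheduler's or Nouveau's actual code: the model_* names, the
struct layout and the flag value are all hypothetical, and the real
dependency handling performs additional checks that are omitted here.

```c
#include <stddef.h>

/* Hypothetical stand-in for DRM_SCHED_FENCE_DONT_PIPELINE (not the kernel value). */
#define MODEL_FENCE_DONT_PIPELINE (1u << 0)

/* What a dependent job ends up waiting for. */
enum model_wait {
	MODEL_WAIT_SCHEDULED,	/* dependency only has to be scheduled */
	MODEL_WAIT_FINISHED,	/* dependency has to fully complete */
};

/* Simplified fence: remembers which scheduler instance produced it. */
struct model_fence {
	const void *sched;	/* producing scheduler instance */
	unsigned int flags;	/* MODEL_FENCE_* bits */
};

/* Assumed workaround step: mark a job's out-fence as non-pipelinable. */
static inline void model_mark_out_fence(struct model_fence *fence)
{
	fence->flags |= MODEL_FENCE_DONT_PIPELINE;
}

/*
 * Assumed decision: a dependency produced by the same scheduler instance
 * may be pipelined (wait for "scheduled" only) unless the flag is set.
 * Everything else waits for full completion.
 */
static inline enum model_wait
model_dependency_wait(const struct model_fence *dep, const void *sched)
{
	if (dep->sched == sched && !(dep->flags & MODEL_FENCE_DONT_PIPELINE))
		return MODEL_WAIT_SCHEDULED;

	return MODEL_WAIT_FINISHED;
}
```

In this model, an unflagged fence from the same scheduler instance yields
MODEL_WAIT_SCHEDULED, which is the race described above when entities
stand in for rings. After model_mark_out_fence() the same dependency
yields MODEL_WAIT_FINISHED.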

There is some work in progress [1] to address the issues of firmware
schedulers; once it is in-tree the scheduler topology in Nouveau should
be re-worked accordingly.

[1] https://lore.kernel.org/dri-devel/20230801205103.627779-1-matthew.brost@intel.com/

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230811010632.2473-1-dakr@redhat.com
drivers/gpu/drm/nouveau/nouveau_sched.c