Alex Deucher [Tue, 18 Feb 2025 18:06:19 +0000 (13:06 -0500)]
drm/amdgpu: add ring flag for no user submissions
This would be set by IPs which only accept submissions
from the kernel, not userspace, such as when kernel
queues are disabled. Don't expose the rings to userspace
and reject any submissions in the CS IOCTL.
v2: fix error code (Alex)
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
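The check described in "add ring flag for no user submissions" can be sketched roughly as follows. This is a minimal user-space model, not the driver code: struct demo_ring and demo_cs_check_ring() are hypothetical names, and -EINVAL is an assumed error code (the message only says v2 fixed the error code, not which one it is).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for a ring object. */
struct demo_ring {
	bool no_user_submission; /* set by IPs that only take kernel submissions */
};

/*
 * Submission-path check: returns 0 when userspace may use the ring,
 * or a negative errno otherwise. -EINVAL is an assumed error code.
 */
static int demo_cs_check_ring(const struct demo_ring *ring)
{
	if (!ring || ring->no_user_submission)
		return -EINVAL;
	return 0;
}
```

The same flag could also be consulted when enumerating rings for userspace, so that such rings are never advertised in the first place.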
Alex Deucher [Tue, 18 Feb 2025 15:33:53 +0000 (10:33 -0500)]
drm/amdgpu: add parameter to disable kernel queues
On chips that support user queues, setting this option
disables kernel queues so that user queues can be
validated without kernel queues.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 20 Mar 2025 16:49:30 +0000 (12:49 -0400)]
drm/amdgpu/userq: prevent runtime pm when userqs are active
Similar to KFD, prevent runtime pm while user queues are active.
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 20 Feb 2025 20:56:24 +0000 (15:56 -0500)]
drm/amdgpu: store userq_managers in a list in adev
So we can iterate across them when we need to manage
all user queues.
v2: add uq_mgr to adev list in amdgpu_userq_mgr_init
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 24 Mar 2025 20:29:03 +0000 (16:29 -0400)]
drm/amdgpu: bump version for user queue IP support query
Add the user queue IP support query to the drm_amdgpu_info_device
query.
Cc: marek.olsak@amd.com
Cc: prike.liang@amd.com
Cc: sunil.khatri@amd.com
Cc: yogesh.mohanmarimuthu@amd.com
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 24 Mar 2025 20:26:00 +0000 (16:26 -0400)]
drm/amdgpu: add UAPI to query if user queues are supported
Add an INFO query to check if user queues are supported.
v2: switch to a mask of IPs (Marek)
v3: move to drm_amdgpu_info_device (Marek)
Cc: marek.olsak@amd.com
Cc: prike.liang@amd.com
Cc: sunil.khatri@amd.com
Cc: yogesh.mohanmarimuthu@amd.com
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
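The "mask of IPs" idea from v2 of "add UAPI to query if user queues are supported" can be illustrated with a small sketch. The enum values and function names below are hypothetical and only model the concept of one bit per IP type; they are not the UAPI definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical IP type indices; bit N of the mask refers to IP type N. */
enum uqinfo_ip {
	UQINFO_IP_GFX,
	UQINFO_IP_COMPUTE,
	UQINFO_IP_DMA,
	UQINFO_IP_COUNT
};

/* Build a mask with one bit set for each IP that supports user queues. */
static uint32_t uqinfo_build_mask(const bool supported[UQINFO_IP_COUNT])
{
	uint32_t mask = 0;

	for (int i = 0; i < UQINFO_IP_COUNT; i++)
		if (supported[i])
			mask |= UINT32_C(1) << i;
	return mask;
}

/* What a userspace consumer would do with the returned mask. */
static bool uqinfo_ip_has_userq(uint32_t mask, enum uqinfo_ip ip)
{
	return (mask & (UINT32_C(1) << ip)) != 0;
}
```

A mask lets userspace learn about every IP in a single query instead of issuing one query per IP type.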
Alex Deucher [Wed, 26 Mar 2025 16:21:22 +0000 (12:21 -0400)]
drm/amdgpu/gfx12: split userq setup to a separate switch
Add a separate switch statement for the userq callback
assignment so that we can assign the callbacks for each
asic as the firmware becomes available.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 26 Mar 2025 16:09:12 +0000 (12:09 -0400)]
drm/amdgpu/gfx11: clean up and consolidate sw_init
With the ME details fixed, we can now consolidate
this state. Also split out the userq setup into a separate
switch statement so that we can set them per IP version
when the firmwares are ready.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arvind Yadav [Tue, 18 Mar 2025 13:15:40 +0000 (18:45 +0530)]
drm/amdgpu: Fix display freezing issue when resizing apps
The display freezes because amdgpu_userq_wait_ioctl()
is waiting for a non-user queue fence (specifically, the PT update fence).
Root cause:
The resume_work is initiated by both amdgpu_userq_suspend and
amdgpu_userqueue_ensure_ev_fence at the same time. amdgpu_userq_suspend
signals a dma-fence and subsequently triggers the resume_work, which is
intended to replace the existing fence by creating a new dma-fence. However,
following this, amdgpu_userqueue_ensure_ev_fence schedules another
resume_work that generates a new dma-fence, thereby replacing the one
created by amdgpu_userq_suspend. Consequently, the original fence will
never be signaled.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 20 Mar 2025 14:18:58 +0000 (10:18 -0400)]
drm/amdgpu/mes: warn on unexpected pipe numbers
Warn if the number of pipes exceeds what the MES supports.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 26 Feb 2025 17:31:46 +0000 (12:31 -0500)]
drm/amdgpu/mes: centralize gfx_hqd mask management
Move it to amdgpu_mes to align with the compute and
sdma hqd masks. No functional change.
v2: rebase on new changes
v3: misc optimizations
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 12 Mar 2025 17:47:33 +0000 (13:47 -0400)]
drm/amdgpu: remove is_mes_queue flag
This was leftover from MES bring up when we had MES
user queues in the kernel. It's no longer used so
remove it.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 26 Feb 2025 20:39:02 +0000 (15:39 -0500)]
drm/amdgpu/mes: remove unused functions
Leftover from the MES self tests that were removed previously.
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 26 Feb 2025 21:31:57 +0000 (16:31 -0500)]
drm/amdgpu: validate user queue parameters
Make sure these are set properly to ensure compatibility if
we ever update the IOCTL interface.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
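One common way to keep an IOCTL extensible, in the spirit of "validate user queue parameters", is to reject any bits or fields the kernel does not yet understand. The sketch below is an assumption about the general pattern, not the checks this commit adds: the structure, the flag names, and the -EINVAL return are all hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flags; only these two are "known" to this version. */
#define DEMO_USERQ_FLAG_A	(UINT32_C(1) << 0)
#define DEMO_USERQ_FLAG_B	(UINT32_C(1) << 1)
#define DEMO_USERQ_KNOWN_FLAGS	(DEMO_USERQ_FLAG_A | DEMO_USERQ_FLAG_B)

/* Hypothetical IOCTL argument block. */
struct demo_userq_args {
	uint32_t flags;
	uint32_t pad; /* reserved, must be zero */
};

/* Reject anything a future interface revision might give meaning to. */
static int demo_userq_validate(const struct demo_userq_args *args)
{
	if (args->flags & ~DEMO_USERQ_KNOWN_FLAGS)
		return -EINVAL;
	if (args->pad != 0)
		return -EINVAL;
	return 0;
}
```

Because unknown input is refused today, a later kernel can assign meaning to those bits without breaking applications that were passing garbage.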
Arvind Yadav [Tue, 18 Feb 2025 13:26:25 +0000 (18:56 +0530)]
drm/amdgpu: fix the memleak caused by fence not released
A taint issue is encountered when unloading gpu_sched
because a fence is not released/put. In this context,
amdgpu_vm_clear_freed is responsible for creating a job to
update the page table (PT). It allocates a drm_sched_fence
from the kmem_cache and returns the finished fence associated
with job->base.s_fence. In the case of a usermode queue, this
finished fence is added to the timeline sync object through
amdgpu_gem_update_bo_mapping, which user space uses to
ensure the completion of the PT update.
[ 508.900587] =============================================================================
[ 508.900605] BUG drm_sched_fence (Tainted: G N): Objects remaining in drm_sched_fence on __kmem_cache_shutdown()
[ 508.900617] -----------------------------------------------------------------------------
[ 508.900627] Slab 0xffffe0cc04548780 objects=32 used=2 fp=0xffff8ea81521f000 flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
[ 508.900645] CPU: 3 UID: 0 PID: 2337 Comm: rmmod Tainted: G N 6.12.0+ #1
[ 508.900651] Tainted: [N]=TEST
[ 508.900653] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F34 06/10/2021
[ 508.900656] Call Trace:
[ 508.900659] <TASK>
[ 508.900665] dump_stack_lvl+0x70/0x90
[ 508.900674] dump_stack+0x14/0x20
[ 508.900678] slab_err+0xcb/0x110
[ 508.900687] ? srso_return_thunk+0x5/0x5f
[ 508.900692] ? try_to_grab_pending+0xd3/0x1d0
[ 508.900697] ? srso_return_thunk+0x5/0x5f
[ 508.900701] ? mutex_lock+0x17/0x50
[ 508.900708] __kmem_cache_shutdown+0x144/0x2d0
[ 508.900713] ? flush_rcu_work+0x50/0x60
[ 508.900719] kmem_cache_destroy+0x46/0x1f0
[ 508.900728] drm_sched_fence_slab_fini+0x19/0x970 [gpu_sched]
[ 508.900736] __do_sys_delete_module.constprop.0+0x184/0x320
[ 508.900744] ? srso_return_thunk+0x5/0x5f
[ 508.900747] ? debug_smp_processor_id+0x1b/0x30
[ 508.900754] __x64_sys_delete_module+0x16/0x20
[ 508.900758] x64_sys_call+0xdf/0x20d0
[ 508.900763] do_syscall_64+0x51/0x120
[ 508.900769] entry_SYSCALL_64_after_hwframe+0x76/0x7e
v2: call dma_fence_put in amdgpu_gem_va_update_vm
v3: Addressed review comments from Christian.
- calling amdgpu_gem_update_timeline_node before switch.
- putting a dma_fence in case of error or !timeline_syncobj.
v4: Addressed review comments from Christian.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 28 Feb 2025 19:55:57 +0000 (14:55 -0500)]
drm/amdgpu/userq: move the header to amdgpu directory
To align with other headers.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 19 Feb 2025 21:46:52 +0000 (16:46 -0500)]
drm/amdgpu/userq: remove BROKEN from config
This can be enabled now. We have the firmware checks
in place.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 28 Feb 2025 19:50:11 +0000 (14:50 -0500)]
drm/amdgpu: add userq firmware version checks
Currently disabled until the firmwares are officially
released.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 28 Feb 2025 19:45:37 +0000 (14:45 -0500)]
drm/amdgpu/gfx11: fix config guard
s/CONFIG_DRM_AMD_USERQ_GFX/CONFIG_DRM_AMDGPU_NAVI3X_USERQ/
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 28 Feb 2025 19:37:31 +0000 (14:37 -0500)]
drm/amdgpu/Kconfig: fix wording of DRM_AMDGPU_NAVI3X_USERQ
The feature is not navi3x specific at this point.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 28 Feb 2025 19:14:35 +0000 (14:14 -0500)]
drm/amdgpu: return an error in the userq IOCTL when DRM_AMDGPU_NAVI3X_USERQ=n
I'd swear this was already fixed, but I guess the patch never
landed. Add it now.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 20 Feb 2025 14:44:39 +0000 (09:44 -0500)]
drm/amdgpu/userq: handle runtime pm
Take a reference when we create a queue and drop it
when we destroy the queue. We need to keep the device
active while user queues are active.
v2: squash in fix from Sunil
v3: squash in fix from Prike
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
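The reference counting described in "handle runtime pm" can be modeled in a few lines. This is a user-space sketch under stated assumptions: struct demo_pm_dev and the functions below are hypothetical, and a plain counter stands in for the kernel's runtime PM usage count.

```c
#include <stdbool.h>

/* Hypothetical device with a counter standing in for the PM usage count. */
struct demo_pm_dev {
	unsigned int pm_refs;
};

/* Creating a queue takes a reference and keeps the device active. */
static void demo_userq_create(struct demo_pm_dev *dev)
{
	dev->pm_refs++;
}

/* Destroying a queue drops the reference taken at creation. */
static void demo_userq_destroy(struct demo_pm_dev *dev)
{
	if (dev->pm_refs > 0)
		dev->pm_refs--;
}

/* The device may only runtime-suspend once no user queue is left. */
static bool demo_may_runtime_suspend(const struct demo_pm_dev *dev)
{
	return dev->pm_refs == 0;
}
```

Pairing the get with queue creation and the put with queue destruction means the lifetime of the reference matches the lifetime of the queue.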
Alex Deucher [Thu, 20 Feb 2025 21:08:02 +0000 (16:08 -0500)]
drm/amdgpu/userq: fix hardcoded uq functions
Use the IP type to look up the userq functions rather
than hardcoding it.
Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
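Looking up the userq functions by IP type, as "fix hardcoded uq functions" describes, amounts to indexing a per-device table. The sketch below uses hypothetical types and names to show the shape of such a lookup; it is not the driver's data structure.

```c
#include <stddef.h>

/* Hypothetical IP type indices. */
enum demo_hw_ip {
	DEMO_HW_IP_GFX,
	DEMO_HW_IP_COMPUTE,
	DEMO_HW_IP_DMA,
	DEMO_HW_IP_NUM
};

/* Hypothetical per-IP callback set, reduced to a name for illustration. */
struct demo_userq_funcs {
	const char *name;
};

/* Each IP registers its callbacks in the slot for its own type. */
struct demo_uq_dev {
	const struct demo_userq_funcs *userq_funcs[DEMO_HW_IP_NUM];
};

/* Returns the callbacks for ip_type, or NULL if none are registered. */
static const struct demo_userq_funcs *
demo_userq_funcs_for(const struct demo_uq_dev *dev, unsigned int ip_type)
{
	if (ip_type >= DEMO_HW_IP_NUM)
		return NULL;
	return dev->userq_funcs[ip_type];
}
```

A NULL result doubles as "user queues are not supported on this IP", so the caller needs only one check.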
Arvind Yadav [Mon, 27 Jan 2025 12:52:01 +0000 (18:22 +0530)]
drm/amdgpu: Fix display freeze lockup error
A deadlock has arisen between the userq
signal ioctl and the eviction fence. In this scenario,
amdgpu_userq_signal_ioctl() has acquired a reservation
lock on the read/write buffer object (BO) through drm_exec.
Subsequently, it calls amdgpu_userqueue_ensure_ev_fence(),
which waits for the userq resume work.
Meanwhile, the userq suspend worker has initiated the userq resume
work (amdgpu_userqueue_resume_worker). This userq resume work attempts
to validate the vm->done BO, leading to amdgpu_userqueue_validate_bos
also attempting to take the reservation lock on the same write BO that
is already locked by amdgpu_userq_signal_ioctl.
As a result, the resume work stalls, causing
amdgpu_userqueue_ensure_ev_fence to remain in a waiting state.
Call Trace:
[ 242.836469] INFO: task gnome-shel:cs0:1288 blocked for more than 120 seconds.
[ 242.836486] Tainted: G OE 6.12.0-rc2rebased-oct-24+ #4
[ 242.836491] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.836494] task:gnome-shel:cs0 state:D stack:0 pid:1288 tgid:1282 ppid:1180 flags:0x00000002
[ 242.836503] Call Trace:
[ 242.836508] <TASK>
[ 242.836517] __schedule+0x3e0/0xb10
[ 242.836530] ? srso_return_thunk+0x5/0x5f
[ 242.836541] schedule+0x31/0x120
[ 242.836546] schedule_timeout+0x150/0x160
[ 242.836551] ? srso_return_thunk+0x5/0x5f
[ 242.836555] ? sysvec_call_function+0x69/0xd0
[ 242.836562] ? srso_return_thunk+0x5/0x5f
[ 242.836567] ? preempt_count_add+0x7f/0xd0
[ 242.836577] __wait_for_common+0x91/0x180
[ 242.836582] ? __pfx_schedule_timeout+0x10/0x10
[ 242.836590] wait_for_completion+0x28/0x30
[ 242.836595] __flush_work+0x16c/0x290
[ 242.836602] ? __pfx_wq_barrier_func+0x10/0x10
[ 242.836611] flush_delayed_work+0x3a/0x60
[ 242.836621] amdgpu_userqueue_ensure_ev_fence+0x2d/0xb0 [amdgpu]
[ 242.836966] amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu]
[ 242.837171] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ 242.837365] drm_ioctl_kernel+0xae/0x100 [drm]
[ 242.837398] drm_ioctl+0x2a1/0x500 [drm]
[ 242.837420] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ 242.837622] ? srso_return_thunk+0x5/0x5f
[ 242.837627] ? srso_return_thunk+0x5/0x5f
[ 242.837630] ? _raw_spin_unlock_irqrestore+0x2b/0x50
[ 242.837635] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[ 242.837811] __x64_sys_ioctl+0x99/0xd0
[ 242.837820] x64_sys_call+0x1209/0x20d0
[ 242.837825] do_syscall_64+0x51/0x120
[ 242.837830] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 242.837835] RIP: 0033:0x7f2f33f1a94f
[ 242.837838] RSP: 002b:00007f2f24ffea30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 242.837842] RAX: ffffffffffffffda RBX: 00007f2f24ffebd0 RCX: 00007f2f33f1a94f
[ 242.837845] RDX: 00007f2f24ffebd0 RSI: 00000000c0306457 RDI: 000000000000000d
[ 242.837847] RBP: 00007f2f24ffeab0 R08: 0000000000000000 R09: 0000000000000000
[ 242.837849] R10: 00007f2f24ffecd0 R11: 0000000000000246 R12: 00007f2f25000640
[ 242.837851] R13: 00000000c0306457 R14: 000000000000000d R15: 00007fff3b39c1e0
[ 242.837858] </TASK>
[ 242.837865] INFO: task Xwayland:cs0:1517 blocked for more than 120 seconds.
[ 242.837869] Tainted: G OE 6.12.0-rc2rebased-oct-24+ #4
[ 242.837872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.837874] task:Xwayland:cs0 state:D stack:0 pid:1517 tgid:1338 ppid:1282 flags:0x00004002
[ 242.837878] Call Trace:
[ 242.837880] <TASK>
[ 242.837883] __schedule+0x3e0/0xb10
[ 242.837890] schedule+0x31/0x120
[ 242.837894] schedule_preempt_disabled+0x1c/0x30
[ 242.837897] __mutex_lock.constprop.0+0x386/0x6e0
[ 242.837902] ? srso_return_thunk+0x5/0x5f
[ 242.837905] ? __timer_delete_sync+0x81/0xe0
[ 242.837911] __mutex_lock_slowpath+0x13/0x20
[ 242.837915] mutex_lock+0x3b/0x50
[ 242.837919] amdgpu_userqueue_ensure_ev_fence+0x35/0xb0 [amdgpu]
[ 242.838138] amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu]
[ 242.838340] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ 242.838531] drm_ioctl_kernel+0xae/0x100 [drm]
[ 242.838559] drm_ioctl+0x2a1/0x500 [drm]
[ 242.838580] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ 242.838778] ? srso_return_thunk+0x5/0x5f
[ 242.838783] ? srso_return_thunk+0x5/0x5f
[ 242.838786] ? _raw_spin_unlock_irqrestore+0x2b/0x50
[ 242.838791] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[ 242.838967] __x64_sys_ioctl+0x99/0xd0
[ 242.838972] x64_sys_call+0x1209/0x20d0
[ 242.838975] do_syscall_64+0x51/0x120
[ 242.838979] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 242.838982] RIP: 0033:0x7f9118b1a94f
[ 242.838985] RSP: 002b:00007f910cdff760 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 242.838989] RAX: ffffffffffffffda RBX: 00007f910cdff910 RCX: 00007f9118b1a94f
[ 242.838991] RDX: 00007f910cdff910 RSI: 00000000c0306457 RDI: 000000000000000c
[ 242.838993] RBP: 00007f910cdff7e0 R08: 0000000000000000 R09: 0000000000000001
[ 242.838995] R10: 00007f910cdff9d4 R11: 0000000000000246 R12: 00007f910ce00640
[ 242.838997] R13: 00000000c0306457 R14: 000000000000000c R15: 00007fff9dd11d10
[ 242.839004] </TASK>
v2: Addressed review comments from Christian.
v3/v4: Addressed review comments from Christian.
- Move the drm_exec loop after userq fence create.
- cleanup the newly created userq fence in case of error.
v5 - Addressed review comments from Christian.
- Create a new amdgpu_userq_fence_alloc() function for allocation.
- Calling dma_fence_put for cleanup procedure.
- make amdgpu_userq_fence_create() function static.
- drm_exec_init is called after mutex_unlock.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Mon, 10 Feb 2025 16:47:28 +0000 (22:17 +0530)]
drm/amdgpu: Modify the seq64 VM cache policy
The seq64 VM cache policy should be set to UC (uncached) to
match the cache settings of the kernel-mapped memory used for
the userqueue fence address.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Wed, 1 Jan 2025 08:52:29 +0000 (14:22 +0530)]
drm/amdgpu: Fix out-of-bounds issue in user fence
Fix an out-of-bounds issue in userq fence create when
accessing the userq xa structure. Add a lock to
protect against the race condition.
v2:(Christian)
- Allocate memory with GFP_ATOMIC.
v3:
- Moved to 2 xa approach.
v4:(Christian)
- Lock the xa_for_each blocks and memory allocation part
as well to make sure that xa is not modified in between
the 2 xa_for_each blocks.
BUG: KASAN: slab-out-of-bounds in amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[ +0.000006] Call Trace:
[ +0.000005] <TASK>
[ +0.000005] dump_stack_lvl+0x6c/0x90
[ +0.000011] print_report+0xc4/0x5e0
[ +0.000009] ? srso_return_thunk+0x5/0x5f
[ +0.000008] ? kasan_complete_mode_report_info+0x26/0x1d0
[ +0.000007] ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[ +0.000405] kasan_report+0xdf/0x120
[ +0.000009] ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[ +0.000405] __asan_report_store8_noabort+0x17/0x20
[ +0.000007] amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[ +0.000406] ? __pfx_amdgpu_userq_fence_create+0x10/0x10 [amdgpu]
[ +0.000408] ? srso_return_thunk+0x5/0x5f
[ +0.000008] ? ttm_resource_move_to_lru_tail+0x235/0x4f0 [ttm]
[ +0.000013] ? srso_return_thunk+0x5/0x5f
[ +0.000008] amdgpu_userq_signal_ioctl+0xd29/0x1c70 [amdgpu]
[ +0.000412] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ +0.000404] ? try_to_wake_up+0x165/0x1840
[ +0.000010] ? __pfx_futex_wake_mark+0x10/0x10
[ +0.000011] drm_ioctl_kernel+0x178/0x2f0 [drm]
[ +0.000050] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ +0.000404] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[ +0.000043] ? __kasan_check_read+0x11/0x20
[ +0.000007] ? srso_return_thunk+0x5/0x5f
[ +0.000007] ? __kasan_check_write+0x14/0x20
[ +0.000008] drm_ioctl+0x513/0xd20 [drm]
[ +0.000040] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ +0.000407] ? __pfx_drm_ioctl+0x10/0x10 [drm]
[ +0.000044] ? srso_return_thunk+0x5/0x5f
[ +0.000007] ? _raw_spin_lock_irqsave+0x99/0x100
[ +0.000007] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ +0.000006] ? __rseq_handle_notify_resume+0x188/0xc30
[ +0.000008] ? srso_return_thunk+0x5/0x5f
[ +0.000008] ? srso_return_thunk+0x5/0x5f
[ +0.000006] ? _raw_spin_unlock_irqrestore+0x27/0x50
[ +0.000010] amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu]
[ +0.000388] __x64_sys_ioctl+0x135/0x1b0
[ +0.000009] x64_sys_call+0x1205/0x20d0
[ +0.000007] do_syscall_64+0x4d/0x120
[ +0.000008] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000007] RIP: 0033:0x7f7c3d31a94f
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Saleemkhan Jamadar [Mon, 6 Jan 2025 07:20:50 +0000 (12:50 +0530)]
drm/amdgpu: add db size and offset range for VCN and VPE
VCN and VPE have different offset ranges; update the doorbell
offset range accordingly.
The doorbell size for VCN and VPE is 32 bits.
v1 : add gfx switch case and fix checkpatch warnings (Shashank)
Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Saleemkhan Jamadar [Fri, 3 Jan 2025 13:32:59 +0000 (19:02 +0530)]
drm/amdgpu: map doorbell for the requested userq
Introduce a db_info structure to populate the doorbell
information that needs to be mapped.
Make the doorbell mapping function more generic by moving
parameters that vary by IP and/or use case
into the db_info structure.
v2 - Fix space alignment and checkpatch warnings(Shashank)
Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Christian König [Fri, 20 Dec 2024 12:44:23 +0000 (13:44 +0100)]
drm/amdgpu: fix call to amdgpu_eviction_fence_detach
That needs to be done after grabbing the lock, not before.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arvind Yadav [Thu, 19 Dec 2024 14:13:54 +0000 (19:43 +0530)]
drm/amdgpu: Fix Illegal opcode in command stream Error
When an application closes, it triggers the drm_file_free
function, which subsequently releases all allocated buffer
objects. Concurrently, the resume_worker thread will attempt
to map the usermode queue. However, since the wptr buffer
object has already been deallocated, this results in
an Illegal opcode error being raised in the command stream.
Replace drm_release() with a new function,
amdgpu_drm_release(). This function sets a flag to
prevent the scheduling of any new queue resume/map, stops
all queues and then calls drm_release().
V2:
- Replace drm_release with amdgpu_drm_release(Christian).
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
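The ordering that "Fix Illegal opcode in command stream Error" relies on (set a flag, stop the queues, then release) can be shown with a small single-threaded model. Everything here is hypothetical: struct demo_file_ctx, its fields, and both functions only illustrate the sequence and ignore the locking that real concurrent code would need.

```c
#include <stdbool.h>

/* Hypothetical per-file state. */
struct demo_file_ctx {
	bool closing;       /* set once release has started */
	int  active_queues; /* queues currently mapped */
	bool released;      /* buffer objects have been freed */
};

/* A resume/map request is refused once the file is closing. */
static bool demo_schedule_resume(struct demo_file_ctx *ctx)
{
	if (ctx->closing)
		return false;
	ctx->active_queues++;
	return true;
}

/* Release path: block new resumes, stop queues, only then free memory. */
static void demo_release(struct demo_file_ctx *ctx)
{
	ctx->closing = true;    /* 1. no new resume/map may be scheduled */
	ctx->active_queues = 0; /* 2. stop all queues */
	ctx->released = true;   /* 3. generic release frees the buffers */
}
```

With this order, nothing can map a queue against a wptr buffer that the release path has already freed.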
Arunpravin Paneer Selvam [Thu, 12 Dec 2024 14:06:16 +0000 (19:36 +0530)]
drm/amdgpu: Apply sign extension to seq64
Apply sign extension to the seq64 VA.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
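Sign extension of a virtual address, as in "Apply sign extension to seq64", copies the highest implemented address bit into all bits above it. The helper below is a generic sketch with a hypothetical name; the 48-bit width used in the usage note is an assumption for illustration, not a value taken from the commit.

```c
#include <stdint.h>

/*
 * Sign-extend a virtual address that has `bits` implemented bits
 * (1 <= bits <= 64). Bit (bits - 1) is copied into all higher bits.
 */
static uint64_t demo_sign_extend_va(uint64_t va, unsigned int bits)
{
	uint64_t sign = UINT64_C(1) << (bits - 1);
	uint64_t low = va & ((sign << 1) - 1); /* keep only implemented bits */

	return (low ^ sign) - sign;
}
```

With an assumed 48-bit address space, 0x0000800000000000 becomes 0xFFFF800000000000, while addresses below that boundary are returned unchanged.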
Christian König [Mon, 9 Dec 2024 17:40:48 +0000 (23:10 +0530)]
drm/amdgpu: Modify the MES process va end limit
Modify the MES process va end limit to max pfn.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Mon, 9 Dec 2024 17:34:34 +0000 (23:04 +0530)]
drm/amdgpu: Fix the use-after-free issue in wait IOCTL
The xarray pointer which has the userqueue xarray structure
reference should be cleared when the userqueue gets
destroyed. Otherwise, we may access the freed xa memory and
see the below warnings.
warning 1:
BUG: KASAN: slab-use-after-free in _raw_spin_lock+0x7a/0xe0
[ +0.000044] Call Trace:
[ +0.000017] <TASK>
[ +0.000016] dump_stack_lvl+0x6c/0x90
[ +0.000025] print_report+0xc4/0x5e0
[ +0.000025] ? srso_return_thunk+0x5/0x5f
[ +0.000024] ? kasan_complete_mode_report_info+0x60/0x1d0
[ +0.000030] ? _raw_spin_lock+0x7a/0xe0
[ +0.000023] kasan_report+0xdf/0x120
[ +0.000023] ? _raw_spin_lock+0x7a/0xe0
[ +0.000025] kasan_check_range+0xf7/0x1b0
[ +0.000025] __kasan_check_write+0x14/0x20
[ +0.000024] _raw_spin_lock+0x7a/0xe0
[ +0.000023] ? __pfx__raw_spin_lock+0x10/0x10
[ +0.000024] ? amdgpu_userq_wait_ioctl+0xac0/0x1f30 [amdgpu]
[ +0.000442] amdgpu_userq_wait_ioctl+0x18fc/0x1f30 [amdgpu]
[ +0.000428] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[ +0.000424] ? __pfx_idr_alloc_u32+0x10/0x10
[ +0.000027] ? srso_return_thunk+0x5/0x5f
[ +0.000024] ? __kasan_check_write+0x14/0x20
[ +0.000025] ? srso_return_thunk+0x5/0x5f
[ +0.000024] ? idr_alloc+0x72/0xc0
[ +0.000023] ? srso_return_thunk+0x5/0x5f
[ +0.000023] ? fput+0x1c/0x2f0
[ +0.000025] drm_ioctl_kernel+0x178/0x2f0 [drm]
[ +0.000065] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[ +0.000425] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[ +0.000064] ? srso_return_thunk+0x5/0x5f
[ +0.000023] ? __kasan_check_write+0x14/0x20
[ +0.000025] drm_ioctl+0x513/0xd20 [drm]
[ +0.000058] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[ +0.000428] ? __pfx_drm_ioctl+0x10/0x10 [drm]
[ +0.000061] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ +0.000027] ? __count_memcg_events+0x11f/0x3a0
[ +0.000027] ? srso_return_thunk+0x5/0x5f
[ +0.001040] ? srso_return_thunk+0x5/0x5f
[ +0.000969] ? _raw_spin_unlock_irqrestore+0x27/0x50
[ +0.000966] amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu]
[ +0.001352] __x64_sys_ioctl+0x135/0x1b0
[ +0.000966] x64_sys_call+0x1205/0x20d0
[ +0.000968] do_syscall_64+0x4d/0x120
[ +0.000960] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000962] RIP: 0033:0x7f42af11a94f
warning 2:
WARNING: at lib/xarray.c:1849 __xa_alloc+0x13a/0x150
[ 366.491409] RIP: 0010:__xa_alloc+0x13a/0x150
[ 366.491434] Call Trace:
[ 366.491437] <TASK>
[ 366.491440] ? show_regs+0x6d/0x80
[ 366.491445] ? __warn+0x91/0x140
[ 366.491450] ? __xa_alloc+0x13a/0x150
[ 366.491453] ? report_bug+0x1c9/0x1e0
[ 366.491459] ? handle_bug+0x63/0xa0
[ 366.491463] ? exc_invalid_op+0x1d/0x80
[ 366.491467] ? asm_exc_invalid_op+0x1f/0x30
[ 366.491476] ? __xa_alloc+0x13a/0x150
[ 366.491484] amdgpu_userq_wait_ioctl+0xe0e/0xfe0 [amdgpu]
[ 366.491743] ? idr_alloc_u32+0x97/0xd0
[ 366.491749] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[ 366.491912] drm_ioctl_kernel+0xae/0x100 [drm]
[ 366.491942] drm_ioctl+0x2a1/0x500 [drm]
[ 366.491961] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[ 366.492127] ? srso_return_thunk+0x5/0x5f
[ 366.492132] ? srso_return_thunk+0x5/0x5f
[ 366.492135] ? _raw_spin_unlock_irqrestore+0x2b/0x50
[ 366.492139] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[ 366.492288] __x64_sys_ioctl+0x99/0xd0
[ 366.492295] x64_sys_call+0x1209/0x20d0
[ 366.492299] do_syscall_64+0x51/0x120
[ 366.492303] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 366.492418] RIP: 0033:0x7f86f3b1a94f
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
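The fix pattern in "Fix the use-after-free issue in wait IOCTL" is to clear a stored pointer at the moment the object it refers to is destroyed. The sketch below models that with plain heap memory; struct demo_uq_ref and both functions are hypothetical, and an int stands in for the userqueue xarray.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical holder that outlives the queue and points at its table. */
struct demo_uq_ref {
	int *table; /* stands in for the userqueue xarray */
};

/* Destroy the queue's table and clear the reference to it. */
static void demo_uq_destroy(struct demo_uq_ref *ref)
{
	free(ref->table);
	ref->table = NULL; /* without this, later lookups touch freed memory */
}

/* Later users check the pointer instead of dereferencing blindly. */
static int demo_uq_lookup(const struct demo_uq_ref *ref)
{
	if (!ref->table)
		return -ENOENT;
	return *ref->table;
}
```

Clearing the pointer turns a silent use-after-free into an explicit, checkable "queue is gone" condition.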
Arunpravin Paneer Selvam [Mon, 9 Dec 2024 17:32:28 +0000 (23:02 +0530)]
drm/amdgpu: Fix NULL ptr dereference issue for non userq fences
Add the correct fences count variable [num_fences] in the fences
array iteration to handle the userq / non-userq fences.
v2:(Christian)
- All fences in the array either come from some reservation object
or drm_syncobj. If any of those are NULL then there is a bug
somewhere else.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Mon, 9 Dec 2024 13:50:56 +0000 (19:20 +0530)]
drm/amdgpu: Add mqd for userq compute queue
Add mqd for userq compute queue for gfx11/gfx12
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Wed, 20 Nov 2024 17:45:33 +0000 (18:45 +0100)]
drm/amdgpu: enable eviction fence
This patch enables attachment and detachment of eviction fences.
This is just a fork of eviction fence enabling code from the first
patch of the series so that the CI testing can happen on fully
fledged code.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Wed, 11 Dec 2024 11:09:00 +0000 (12:09 +0100)]
drm/amdgpu: simplify eviction fence suspend/resume
The basic idea in this redesign is to add an eviction fence only in the UQ
resume path. When no userqueue is present, keep ev_fence as NULL.
Main changes are:
- do not create the eviction fence during evf_mgr_init, keeping
evf_mgr->ev_fence=NULL until a UQ becomes active.
- do not replace the ev_fence in evf_resume path, but replace it only in
uq_resume path, so remove all the unnecessary code from ev_fence_resume.
- add a new helper function (amdgpu_userqueue_ensure_ev_fence) which
will do the following:
- flush any pending uq_resume work, so that it could create an
eviction_fence
- if there is no pending uq_resume_work, add a uq_resume work and
wait for it to execute so that we always have a valid ev_fence
- call this helper function from two places, to ensure we have a valid
ev_fence:
- when a new uq is created
- when a new uq completion fence is created
v2: Worked on review comments by Christian.
v3: Addressed few more review comments by Christian.
v4: Move mutex lock outside of the amdgpu_userqueue_suspend()
function (Christian).
v5: squash in build fix (Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
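The helper described in "simplify eviction fence suspend/resume" can be approximated by a synchronous, single-threaded model. This is only a sketch: the types and function names are hypothetical, a sequence number stands in for the eviction fence, and running the work inline stands in for flushing a delayed work item. It does not model the concurrency that later fixes in this log address.

```c
#include <stdbool.h>

/* Hypothetical eviction fence manager state. */
struct demo_evf_mgr {
	bool resume_pending; /* a uq_resume work item is queued */
	int  ev_fence_seq;   /* 0 means no eviction fence (NULL in the text) */
	int  next_seq;
};

/* Stand-in for the uq_resume work: installs a fresh eviction fence. */
static void demo_uq_resume_work(struct demo_evf_mgr *mgr)
{
	mgr->ev_fence_seq = ++mgr->next_seq;
	mgr->resume_pending = false;
}

/* Guarantee a valid eviction fence before a queue or fence is created. */
static void demo_ensure_ev_fence(struct demo_evf_mgr *mgr)
{
	if (!mgr->resume_pending)
		mgr->resume_pending = true; /* nothing queued: queue a resume */
	demo_uq_resume_work(mgr);           /* "flush": run it to completion */
}
```

Whichever branch is taken, the caller returns with a non-zero ev_fence_seq, which mirrors the "always have a valid ev_fence" goal stated in the message.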
Arunpravin Paneer Selvam [Tue, 26 Nov 2024 14:51:08 +0000 (15:51 +0100)]
drm/amdgpu: enable userqueue secure sem for GFX 12
- Add a field in struct amdgpu_mqd_prop for userqueue
secure sem fence address since now we have a generic
file for mes_userqueue.c
- Add secure sem fence address mqd support to gfx12 into
their corresponding init functions.
- Enable secure semaphore IRQ handling
V2: Address review comment from Alex:
Use fence_address instead of fenceaddress (Shashank)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Somalapuram Amaranath [Thu, 10 Oct 2024 18:08:06 +0000 (20:08 +0200)]
drm/amdgpu: enable userqueue support for GFX12
This patch enables Usermode queue support across GFX, Compute
and SDMA IPs on GFX12/SDMA7. It typically reuses Navi3X userqueue
IP functions to create and destroy MQDs.
v2: rebase on proposed changes (Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 26 Nov 2024 14:45:19 +0000 (15:45 +0100)]
drm/amdgpu/uq: make MES UQ setup generic
Now that all of the IP specific code has been moved into
the IP specific functions, we can make this code generic.
V2: Fixed build errors and porting logic (Shashank)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 21 Oct 2024 13:16:12 +0000 (15:16 +0200)]
drm/amdgpu/uq: remove gfx11 specifics from UQ setup
This can all be handled in the IP specific mqd init
code.
V2: Removed setting of gds_va, which was removed during UAPI
review (Shashank)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 18 Oct 2024 18:15:51 +0000 (14:15 -0400)]
drm/amdgpu/sdma7: update mqd init for UQ
Set the addresses for the UQ metadata.
V2: Fix lower offset mask (Shashank)
V3: Use lower_32_bits for mqd objects (Alex)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 18 Oct 2024 18:15:31 +0000 (14:15 -0400)]
drm/amdgpu/sdma6: update mqd init for UQ
Set the addresses for the UQ metadata.
V2: Fix lower address mask (Shashank)
V3: Use lower_32_bits for MQD objects (Alex)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 18 Oct 2024 18:15:12 +0000 (14:15 -0400)]
drm/amdgpu/gfx12: update mqd init for UQ
Set the addresses for the UQ metadata.
V2: Fix lower address mask (Shashank)
V3: Use lower_32_bits() for MQD objects (Alex)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Amaranath Somalapuram [Wed, 27 Nov 2024 16:06:45 +0000 (17:06 +0100)]
drm/amdgpu: fix IGT CI regression with eviction fence
This patch fixes one of the regressions in eviction fence code with
IGT tests.
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Amaranath Somalapuram <amaranath.somalapuram@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 18 Oct 2024 18:14:34 +0000 (14:14 -0400)]
drm/amdgpu/gfx11: update mqd init for UQ
Set the addresses for the UQ metadata.
V2: Fix lower address (Shashank)
V3: Restore lower_32_bits() for MQD addresses (Alex)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 18 Oct 2024 17:58:23 +0000 (13:58 -0400)]
drm/amdgpu: add some additional members to amdgpu_mqd_prop
These are needed to make userqueue infrastructure generic.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Wed, 20 Nov 2024 17:04:33 +0000 (18:04 +0100)]
drm/amdgpu: handle eviction fence race
The eviction process can get into a race condition between the eviction
fence suspend work (which replaces the old fence with new) and kms_close
(which destroys the fence and doesn't expect a new one).
This patch:
- adds a flag to indicate that fd is closing, so fence replacement is
not required (evf_mgr->fd_closing)
- adds a flush_work() during the ev_fence_destroy routine
V2: Addressed review comments from Christian:
- Do not use mutex to sync
- Use flush_work and wait for suspend_work to be done
V3: Fixed the state machine for queue->active, which added to the race between
suspend/resume and queue ops
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Wed, 20 Nov 2024 18:02:26 +0000 (19:02 +0100)]
drm/amdgpu: resume gfx userqueues
This patch adds support for userqueue resume. What it typically does is
this:
- adds a new delayed work for resuming all the queues.
- schedules this delayed work from the suspend work.
- validates the BOs and replaces the eviction fence before resuming all
the queues running under this instance of userq manager.
V2: Addressed Christian's review comments:
- declare local variables like ret at the bottom.
- lock all the object first, then start attaching the new fence.
- don't replace the old eviction fence, just attach the new eviction fence.
- no error logs for drm_exec_lock failures
- no need to reserve bos after drm_exec_locked
- schedule the resume worker immediately (not after 100 ms)
- check for NULL BO (Arvind)
V5: Rebased wrt changes in suspend patch
- moved amdgpu_userqueue_validate_vm_bo in this patch
- initialized ret in resume_all
V6: Rebase
V7: Addressed review comments from Christian
- Do not use list_for_each_safe() with vm->invalidated, it's not the
correct way
V8: Fixed the race condition between suspend/close/fence
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Wed, 20 Nov 2024 17:59:49 +0000 (18:59 +0100)]
drm/amdgpu: suspend gfx userqueues
This patch adds suspend support for gfx userqueues. It typically does
the following:
- adds an enable_signaling function for the eviction fence, so that it
can trigger the userqueue suspend,
- adds a delayed work to handle suspending of the eviction_fence
- adds a suspend function to handle suspending of userqueues which
suspends all the queues under this userq manager and signals the
eviction fence,
- adds a function to replace the old eviction fence with a new one and
attach it to each of the objects,
- adds reference of userq manager in the eviction fence container so
that it can be used in the suspend function.
V2: Addressed Christian's review comments:
- schedule suspend work immediately
V4: Addressed Christian's review comments:
- wait for pending uq fences before starting suspend, added
queue->last_fence for the same
- accommodate ev_fence_mgr into existing code
- some bug fixes and NULL checks
V5: Addressed Christian's review comments (gitlab)
- Wait for eviction fence to get signaled in destroy,
don't signal it
- Wait for eviction fence to get signaled in replace fence,
don't signal it
V6: Addressed Christian's review comments
- Do not destroy the old eviction fence until we have it replaced
- Change the sequence of fence replacement sub-tasks
- reusing the ev_fence delayed work for userqueue suspend as well
(Shashank).
V7: Addressed Christian's review comments
- give evf_mgr as argument (instead of fpriv) to replace_fence()
- save ptr to evf_mgr in ev_fence (instead of uq_mgr)
- modify suspend_all_queues logic to reflect error properly
- remove the garbage drm_exec_lock section in wait_for_signal
- grab the userqueue mutex before starting the wait for fence
- remove the unrelated gobj check from signal_ioctl
V8: Added race condition fixes
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Mon, 3 Jun 2024 09:13:03 +0000 (11:13 +0200)]
drm/amdgpu: add userqueue suspend/resume functions
This patch adds userqueue suspend/resume functions at
core MES V11 IP level.
V2: use true/false for queue_active status (Christian)
added Christian's R-B
V3: reset/set queue status in mqd.create and mqd.destroy
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Tue, 27 Aug 2024 10:14:43 +0000 (15:44 +0530)]
drm/amdgpu: add gfx eviction fence helpers
This patch adds basic eviction fence framework for the gfx buffers.
The idea is:
- One eviction fence is created per gfx process, at kms_open.
- This fence is attached to all the gem buffers created
by this process.
- This fence is detached from all the gem buffers at postclose_kms.
This framework will be further used for usermode queues.
V2: Addressed review comments from Christian
- keep fence_ctx and fence_seq directly in fpriv
- eviction_fence should be dynamically allocated
- do not save eviction fence instance in BO, there could be many
such fences attached to one BO
- use dma_resv_replace_fence() in detach
V3: Addressed review comments from Christian
- eviction fence create and destroy functions should be called
only once from fpriv create/destroy
- use dma_fence_put() in eviction_fence_destroy
V4: Addressed review comments from Christian:
- create a separate ev_fence_mgr structure
- cleanup fence init part
- do not add a domain for fence owner KGD
V5: Addressed review comments from Christian:
- drop the dma_fence_is_signaled check
- use a local variable to access evf_mgr->ev_fence under the
spin_lock() multiple places
- remove the vm->is_compute_ctx check to attach gfx eviction fence,
in gem_object_open
V6: Addressed review comments from Christian:
- drop the return value from eviction_fence_signal
- reserve_fence should be the first thing inside the
attach_eviction_fence function, also keep the resv_add_fence inside
the lock
- remove the unwanted ev_fence check inside detach function
- fix wrong variable check in eviction_fence_init function
- return the error value of eviction_fence_init to the caller, don't
keep it void.
- fail gem_object_open if attaching of eviction_fence fails
- detach the eviction fence only when amdgpu_vm_is_bo_always_valid
is not true.
V7: Addressed review comments from Christian:
- Do not add a uq_mgr ptr in ev_fence, rather add evf_mgr
V8: Move eviction fence enabling into separate patch for CI
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Sunil Khatri [Wed, 13 Nov 2024 08:10:33 +0000 (13:40 +0530)]
drm/amdgpu: add the argument description for gpu_addr
Add argument description for the input argument
gpu_addr for amdgpu_seq64_alloc.
Fixes the warning raised by the compiler:
drivers/gpu/drm/amd/amdgpu/amdgpu_seq64.c:168:
warning: Function parameter or struct member 'gpu_addr' not described in 'amdgpu_seq64_alloc'
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Wed, 30 Oct 2024 14:39:42 +0000 (15:39 +0100)]
drm/amdgpu: add new AMDGPU_INFO subquery for userq objects
This patch adds a new subquery (AMDGPU_INFO_UQ_FW_AREAS) in
AMDGPU_INFO_IOCTL to get the size and alignment of shadow
and csa objects from the FW setup. This information is
required for the userqueue consumers.
V2: Added Alex's suggestions and addressed review comments:
- make this query IP specific (GFX/SDMA etc)
- give a better title (AMDGPU_INFO_UQ_METADATA)
- restructured the code as per sample code shared by Alex
V3: Split the UAPI patch from shadow_size_fn modifications
V4: Addressed review comments from UAPI review (Marek/Pierre-Eric)
- Change the query name to AMDGPU_INFO_UQ_FW_AREAS
- remove unused input parameter for AMDGPU_HW_IP*
link: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/400/
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Thu, 24 Oct 2024 15:07:42 +0000 (17:07 +0200)]
drm/amdgpu: add get_gfx_shadow_info callback for gfx12
This callback gets the size and alignment requirements
for the gfx shadow buffer for preemption.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Mon, 11 Nov 2024 07:13:07 +0000 (12:43 +0530)]
drm/amdgpu: Modify userq signal/wait struct field names
Modify kernel UAPI userq signal/wait struct field names and
description corresponding to the libdrm UAPI review comments.
libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Wed, 30 Oct 2024 14:32:27 +0000 (15:32 +0100)]
drm/amdgpu: bypass SRIOV check for shadow size info
Currently, the shadow FW space size and alignment information is
protected under a flag (adev->gfx.cp_gfx_shadow) which gets set
only in case of SRIOV setups.
if (amdgpu_sriov_vf(adev))
adev->gfx.cp_gfx_shadow = true;
But we need this information for GFX Userqueues, so that user can
create these objects while creating userqueue. This patch series
creates a method to get this information bypassing the dependency
on this check.
This patch:
- adds a new input parameter flag to the gfx.funcs->get_gfx_shadow_info
fptr definition, so that it can accommodate the information without the
check (adev->gfx.cp_gfx_shadow) on request.
- updates the existing definition of amdgpu_gfx_get_gfx_shadow_info to
adjust with this new flag.
Next patch in the series is adding a UAPI which will consume this info.
V2: split this patch from the new UAPI patch
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
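A minimal userspace sketch of the flag described in the commit above, assuming the new parameter simply lets a caller ask for the shadow size and alignment even when cp_gfx_shadow is not set. All names (demo_gfx, demo_shadow_info, demo_get_gfx_shadow_info, skip_check) and the returned values are hypothetical placeholders, not the driver's actual definitions. The kernel-internal -ENOTSUPP is not available in userspace, so -EOPNOTSUPP stands in for it here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the shadow/CSA size and alignment info. */
struct demo_shadow_info {
	uint32_t shadow_size;
	uint32_t shadow_alignment;
};

/* Hypothetical stand-in for the part of adev->gfx that matters here. */
struct demo_gfx {
	bool cp_gfx_shadow; /* in the commit text this is only set on SRIOV */
};

/*
 * Fill @info when shadowing is enabled, or when the caller explicitly
 * asks to skip the check (the userqueue case). Otherwise zero the
 * structure and report "not supported".
 */
static int demo_get_gfx_shadow_info(const struct demo_gfx *gfx,
				    struct demo_shadow_info *info,
				    bool skip_check)
{
	if (gfx->cp_gfx_shadow || skip_check) {
		info->shadow_size = 4096;     /* placeholder value */
		info->shadow_alignment = 256; /* placeholder value */
		return 0;
	}

	memset(info, 0, sizeof(*info));
	return -EOPNOTSUPP;
}
```

With this shape, existing callers pass false and keep the old behaviour, while a query path for userqueues can pass true.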
Shashank Sharma [Mon, 11 Nov 2024 11:34:30 +0000 (12:34 +0100)]
drm/amdgpu: fix userqueue UAPI comments
This patch fixes some of the pending UAPI review comments
from the libDRM/UAPI review process.
- It updates some outdated comments in the userqueue UAPI header
highlighted during the libdrm UAPI review.
- It removes the GDS BO support which was found unused.
- It also removes the unused flags parameter from the UAPI.
- It also adds padding variables to the userqueue in/out structures.
(Pierre-Eric and Marek)
- clarify comments on top of drm_amdgpu_userq_in
- clarify comment for queue_id (in)
- clarify comment for mqd
- clarify comment for compute MQD size
- clarify comment for queue_id (out)
- remove GDS object from BO object list
- remove the unused flags parameter
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Tue, 28 Nov 2023 14:22:30 +0000 (19:52 +0530)]
Revert "drm/amdgpu: don't allow userspace to create a doorbell BO"
This reverts commit
6be2ad4f0073c541146caa66c5ae936c955a8224.
This patch was meant to block userspace from using the doorbell manager UAPI
until the usermode queue UAPI got approved. The UQ UAPI got approved in the
following MR:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arvind Yadav [Wed, 25 Sep 2024 16:10:41 +0000 (18:10 +0200)]
drm/amdgpu: Add input fence to sync bo map/unmap
This patch adds input fences to VM_IOCTL for buffer object.
The kernel will map/unmap the BO only when the fence is signaled.
The UAPI for the same has been approved here:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392
V2: Bug fix (Arvind)
V3: Bug fix (Arvind)
V4: Rename UAPI objects as per UAPI review (Marek)
V5: Addressed review comments from Christian
- function should return error.
- Add 'TODO' comment
- The input fence should be independent of the operation.
V6: Addressed review comments from Christian
- Release the memory allocated by memdup_user().
V7: Addressed review comments from Christian
- Drop the debug print and add "return r;" for the error handling.
V11: Rebase
v12: Fix 32-bit holes issue in struct drm_amdgpu_gem_va.
v13: Fix deadlock issue.
v14: Fix merge conflict.
v15: Fix review comment by renaming syncobj handles.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:56:37 +0000 (11:26 +0530)]
drm/amdgpu: add userq specific kernel config for fence ioctls
Keep the user queue fence signal and wait IOCTLs in the
kernel config CONFIG_DRM_AMDGPU_NAVI3X_USERQ.
v2(Christian):
- Remove the userq specific config added for kernel queues fence init
function.
v3(Alex):
- It will be better to return an error(-ENOTSUPP) in these cases.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
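The config gating described in the commit above can be sketched as follows. This is only an illustration of the pattern, assuming the IOCTL entry points compile to a stub that returns an error when CONFIG_DRM_AMDGPU_NAVI3X_USERQ is not set. The function name demo_userq_signal_ioctl is hypothetical, and -EOPNOTSUPP is used because the kernel-internal -ENOTSUPP named in v3 does not exist in userspace headers.

```c
#include <errno.h>

#ifdef CONFIG_DRM_AMDGPU_NAVI3X_USERQ
/* Config enabled: the real handler would do the signal work here. */
static inline int demo_userq_signal_ioctl(void)
{
	return 0;
}
#else
/* Config disabled: keep the entry point but report "not supported". */
static inline int demo_userq_signal_ioctl(void)
{
	return -EOPNOTSUPP;
}
#endif
```

Keeping a stub rather than removing the entry point means callers get a clear error code instead of an unknown IOCTL.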
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:49:26 +0000 (11:19 +0530)]
drm/amdgpu: Add gpu_addr support to seq64 allocation
Add gpu address support to seq64 alloc function.
v1:(Christian)
- Add the user of this new interface change to the same
patch.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:29:04 +0000 (10:59 +0530)]
drm/amdgpu: Add separate array of read and write for BO handles
Drop AMDGPU_USERQ_BO_WRITE as this should not be a global option
of the IOCTL; it should be an option per buffer. Hence add separate
arrays for read and write BO handles.
v2(Marek):
- Internal kernel details shouldn't be here. This file should only
document the observed behavior, not the implementation.
v3:
- Fix DAL CI clang issue.
v4:
- Added Alex RB to merge the kernel UAPI changes since he has
already approved the amdgpu_drm.h changes.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Suggested-by: Marek Olšák <marek.olsak@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:26:57 +0000 (10:56 +0530)]
drm/amdgpu: add vm root BO lock before accessing the vm
Add a vm root BO lock before accessing the userqueue VM.
v1:(Christian)
- Keep the VM locked until you are done with the mapping.
- Grab a temporary BO reference, drop the VM lock and acquire the BO.
When you are done with everything just drop the BO lock and
then the temporary BO reference.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:25:22 +0000 (10:55 +0530)]
drm/amdgpu: Add the missing error handling for xa_store() call
Add the missing error handling for xa_store() call in the function
amdgpu_userq_fence_driver_alloc().
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:21:58 +0000 (10:51 +0530)]
drm/amdgpu: Few optimization and fixes for userq fence driver
A few optimizations and fixes for the userq fence driver.
v1:(Christian):
- Remove unnecessary comments.
- In the drm_exec_init call give num_bo_handles as the last parameter; it
makes allocation of the array more efficient
- Handle return value of __xa_store() and improve the error handling of
amdgpu_userq_fence_driver_alloc().
v2:(Christian):
- Revert userq_xa xarray init to XA_FLAGS_LOCK_IRQ.
- move the xa_unlock before the error check of the call xa_err(__xa_store())
and moved this change to a separate patch as this is adding a missing error
handling.
- Removed the unnecessary comments.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:16:49 +0000 (10:46 +0530)]
drm/amdgpu: Remove the MES self test
Remove the MES self test as it conflicts with the userqueue fence
interrupts.
v2:(Christian)
- remove the amdgpu_mes_self_test() function and any now unused code.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arvind Yadav [Wed, 25 Sep 2024 16:09:49 +0000 (18:09 +0200)]
drm/amdgpu: update userqueue BOs and PDs
This patch updates the VM_IOCTL to allow userspace to synchronize
the mapping/unmapping of a BO in the page table.
The major changes are:
- it adds a drm_timeline object as an input parameter to the VM IOCTL.
- this object is used by the kernel to sync the update of the BO in
the page table during the mapping of the object.
- the kernel also synchronizes the tlb flush of the page table entry of
this object during the unmapping (Added in this series:
https://patchwork.freedesktop.org/series/131276/ and
https://patchwork.freedesktop.org/patch/584182/)
- the userspace can wait on this timeline, and then the BO is ready to
be consumed by the GPU.
The UAPI for the same has been approved here:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/392
V2:
- remove the eviction fence coupling
V3:
- added the drm timeline support instead of input/output fence
(Christian)
V4:
- made timeline 64-bit (Christian)
- bug fix (Arvind)
V5: GLCTS bug fix (Arvind)
V6: Rename syncobj_handle -> timeline_syncobj_out
Rename point -> timeline_point_in (Marek)
V7: Addressed review comments from Christian:
- do not send last_update fence in case of vm_clear_freed, instead
return the fence from gen_va_update_vm
- move the functions to update bo_mapping to amdgpu_gem.c
- do not use amdgpu_userq_update_vm anymore in userq_create()
V8: Addressed review comments from Christian:
- Split amdgpu_gem_update_bo_mapping function.
- amdgpu_gem_va_update_vm should return stub for error.
V9: Addressed review comments from Christian:
- Rename the function amdgpu_gem_update_timeline_node.
- amdgpu_gem_update_timeline_node should be void function.
- when timeline_point is zero don't allocate a chain and
call drm_syncobj_replace_fence() instead of
drm_syncobj_add_point().
V11: rebase
V12: Fix 32-bit holes issue in struct drm_amdgpu_gem_va.
V13: Fix the review comment by renaming timeline syncobj (Marek)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:08:46 +0000 (10:38 +0530)]
drm/amdgpu: Enable userq fence interrupt support
Add support to handle the userqueue protected fence signal hardware
interrupt.
Create an xarray which maps the doorbell index to the fence driver address.
This helps to retrieve the fence driver information when a userq fence
interrupt is triggered. Firmware sends the doorbell offset value and
this info is compared with the queue's mqd doorbell offset value.
If they are the same, we process the userq fence interrupt.
v1:(Christian):
- use xa_load to extract the fence driver.
- move the amdgpu_userq_fence_driver_process call within the xa_lock
as there is a chance that fence_drv might be freed.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
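The lookup described in the commit above can be sketched in userspace as below. A fixed-size pointer table stands in for the kernel xarray, and all names (demo_doorbell_map, demo_fence_drv, demo_map_store, demo_handle_fence_irq) are hypothetical. The locking that the v1 note asks for around the lookup and processing is deliberately left out of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_DOORBELLS 64u

/* Hypothetical per-queue fence driver state. */
struct demo_fence_drv {
	uint64_t processed; /* number of fence interrupts handled */
};

/* Stand-in for the xarray: doorbell index -> fence driver, NULL if unused. */
struct demo_doorbell_map {
	struct demo_fence_drv *slots[DEMO_MAX_DOORBELLS];
};

/* Register a fence driver under its doorbell index; -1 if out of range. */
static int demo_map_store(struct demo_doorbell_map *map, uint32_t doorbell,
			  struct demo_fence_drv *drv)
{
	if (doorbell >= DEMO_MAX_DOORBELLS)
		return -1;
	map->slots[doorbell] = drv;
	return 0;
}

/*
 * Interrupt path: the firmware reports a doorbell offset. Process the
 * interrupt only if a queue registered that doorbell.
 * Returns 1 if handled, 0 if no queue owns the doorbell.
 */
static int demo_handle_fence_irq(struct demo_doorbell_map *map,
				 uint32_t doorbell)
{
	struct demo_fence_drv *drv;

	if (doorbell >= DEMO_MAX_DOORBELLS)
		return 0;

	drv = map->slots[doorbell];
	if (!drv)
		return 0;

	drv->processed++;
	return 1;
}
```

An interrupt for a doorbell that no queue has registered is simply ignored, which matches the "if they are the same, we process" rule in the description.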
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 05:03:28 +0000 (10:33 +0530)]
drm/amdgpu: Add wait IOCTL timeline syncobj support
Add user fence wait IOCTL timeline syncobj support.
v2:(Christian)
- handle dma_fence_wait() return value.
- shorten the variable name syncobj_timeline_points a bit.
- move num_points up to avoid padding issues.
v3:(Christian)
- Handle errors from the timeline drm_syncobj_find_fence()
call
- Use dma_fence_unwrap_for_each() in timeline fence as
there could be more than one fence.
v4:(Christian)
- Drop the first num_fences since fence is always included in
the dma_fence_unwrap_for_each() iteration, when fence != f
then fence is most likely just a container.
v5: Added Alex RB to merge the kernel UAPI changes since he has
already approved the amdgpu_drm.h changes.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 04:56:07 +0000 (10:26 +0530)]
drm/amdgpu: Implement userqueue signal/wait IOCTL
This patch introduces new IOCTLs for the userqueue secure semaphore.
For the signal IOCTL, the userspace application creates a drm
syncobj and an array of BO GEM handles and passes them as parameters
to the driver, which installs the fence into them.
The wait IOCTL gets an array of drm syncobjs, finds the fences
attached to the drm syncobjs and obtains the array of
memory_address/fence_value combinations, which are returned to
userspace.
v2: (Christian)
- Install fence into GEM BO object.
- Lock all BO's using the dma resv subsystem
- Reorder the sequence in signal IOCTL function.
- Get write pointer from the shadow wptr
- use userq_fence to fetch the va/value in wait IOCTL.
v3: (Christian)
- Use drm_exec helper for the proper BO drm reserve and avoid BO
lock/unlock issues.
- fence/fence driver reference count logic for signal/wait IOCTLs.
v4: (Christian)
- Fixed the drm_exec calling sequence
- use dma_resv_for_each_fence_unlock if BO's are not locked
- Modified the fence_info array storing logic.
v5: (Christian)
- Keep fence_drv until wait queue execution.
- Add dma_fence_wait for other fences.
- Lock BO's using drm_exec as the number of fences in them could
change.
- Install signaled fences as well into BO/Syncobj.
- Move Syncobj fence installation code after the drm_exec_prepare_array.
- Directly add dma_resv_usage_rw(args->bo_flags....
- remove unnecessary dma_fence_put.
v6: (Christian)
- Add xarray stuff to store the fence_drv
- Implement a function to iterate over the xarray and drop
the fence_drv references.
- Add drm_exec_until_all_locked() wrapper
- Add a check that we haven't exceeded the user allocated num_fences
before adding dma_fence to the fences array.
v7: (Christian)
- Use memdup_user() for kmalloc_array + copy_from_user
- Move the fence_drv references from the xarray into the newly created fence
and drop the fence_drv references when we signal this fence.
- Move this locking of BOs before the "if (!wait_info->num_fences)",
this way you need this code block only once.
- Merge the error handling code and the cleanup + return 0 code.
- Initializing the xa should probably be done in the userq code.
- Remove the userq back pointer stored in fence_drv.
- Pass xarray as parameter in amdgpu_userq_walk_and_drop_fence_drv()
v8: (Christian)
- Moving the fence_drv references must come before adding the fence to the list.
- Use xa_lock_irqsave_nested for nested spinlock operations.
- userq_mgr should be per fpriv and not one per device.
- Restructure the interrupt process code for the early exit of the loop.
- The reference acquired in the syncobj fence replace code needs to be
kept around.
- Modify the dma_fence acquire placement in wait IOCTL.
- Move USERQ_BO_WRITE flag to UAPI header file.
- drop the fence drv reference after telling the hw to stop accessing it.
- Add multi sync object support to userq signal IOCTL.
V9: (Christian)
- Store all the fence_drv ref to other drivers and not ourself.
- Remove the userq fence xa implementation and replace with
kvmalloc_array.
v10: (Christian)
- Add a comment for the userq_xa xarray
- drop the if check of userq_fence->fence_drv_array
- use the i variable to initialize userq_fence->fence_drv_array_count
- drop the fence reference before you free the array in the error handling,
otherwise it could be that some references leaked.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Wed, 30 Oct 2024 04:39:50 +0000 (10:09 +0530)]
drm/amdgpu: UAPI headers for userqueue Secure semaphore
Add UAPI header support for userqueue Secure semaphore
v2: Worked on review comments from Christian for the following
modifications
- Add bo handles, bo flags and padding fields.
- Include value/va in a combined array.
v3: Worked on review comments from Christian
- Add num_fences field to obtain the number of objects required
to allocate memory for userq_fence_info.
- Replace obj_handle name with syncobj_handle.
- Replace point name with syncobj_point.
- Replace count_handles name with num_syncobj_handles.
- Fix structure padding related issues.
v4: Worked on review comments from Christian
- Modify the bo flags description.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
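The num_fences field mentioned in v3 above suggests a two-pass calling pattern: ask for the required count first, allocate, then call again to get the va/value pairs. The sketch below illustrates that pattern only. It assumes a first call with num_fences == 0 reports the count, and every name in it (demo_wait_args, demo_fence_info, demo_userq_wait, the fixed demo_pending table) is hypothetical rather than taken from the UAPI header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical va/value pair returned to the caller. */
struct demo_fence_info {
	uint64_t va;
	uint64_t value;
};

/* Hypothetical wait arguments. */
struct demo_wait_args {
	struct demo_fence_info *out_fences; /* caller-allocated; NULL on pass 1 */
	uint32_t num_fences;                /* in: capacity, out: count */
};

/* Fixed data standing in for the fences found on the syncobjs and BOs. */
static const struct demo_fence_info demo_pending[] = {
	{ 0x1000, 7 }, { 0x2000, 9 }, { 0x3000, 4 },
};
#define DEMO_NUM_PENDING (sizeof(demo_pending) / sizeof(demo_pending[0]))

static int demo_userq_wait(struct demo_wait_args *args)
{
	uint32_t i;

	/* Pass 1: capacity 0 means "tell me how many entries I need". */
	if (args->num_fences == 0) {
		args->num_fences = (uint32_t)DEMO_NUM_PENDING;
		return 0;
	}

	/* Pass 2: never write past the capacity the caller allocated. */
	if (!args->out_fences || args->num_fences < DEMO_NUM_PENDING)
		return -1;

	for (i = 0; i < DEMO_NUM_PENDING; i++)
		args->out_fences[i] = demo_pending[i];
	args->num_fences = (uint32_t)DEMO_NUM_PENDING;
	return 0;
}
```

The capacity check in the second pass protects against the fence count growing between the two calls, in which case the caller would retry.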
Arunpravin Paneer Selvam [Fri, 25 Oct 2024 10:45:02 +0000 (16:15 +0530)]
drm/amdgpu: screen freeze and userq driver crash
Screen freeze and userq fence driver crash while playing Xonotic
v2: (Christian)
- There is a chance that the fence might signal in between testing
and grabbing the lock. Hence we can move the lock above the
if..else check and use the dma_fence_is_signaled_locked().
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arunpravin Paneer Selvam [Fri, 25 Oct 2024 10:14:02 +0000 (15:44 +0530)]
drm/amdgpu: Add mqd support for the fence address
- Add a field in struct v11_gfx_mqd for userqueue
fence address.
- Assign fence gpu VA address to the userqueue mqd
fence address fields.
v2: Remove the mask and replace with lower_32_bits (Christian)
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
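As a small illustration of the v2 note above, a 64-bit fence GPU VA can be split into two 32-bit MQD words with helpers modeled on the kernel's lower_32_bits() and upper_32_bits(). The helpers lo32()/hi32() are userspace stand-ins, and the struct and field names are hypothetical, not the actual v11_gfx_mqd layout.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's lower_32_bits()/upper_32_bits(). */
static inline uint32_t lo32(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

static inline uint32_t hi32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Hypothetical MQD fragment holding only the two fence address words. */
struct demo_mqd {
	uint32_t fence_address_lo;
	uint32_t fence_address_hi;
};

/* Store the fence GPU VA into the MQD as a lo/hi pair. */
static void demo_mqd_set_fence_addr(struct demo_mqd *mqd,
				    uint64_t fence_gpu_addr)
{
	mqd->fence_address_lo = lo32(fence_gpu_addr);
	mqd->fence_address_hi = hi32(fence_gpu_addr);
}
```

Using named helpers instead of an open-coded mask keeps the intent visible and avoids a mask with the wrong width.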
Arunpravin Paneer Selvam [Fri, 25 Oct 2024 10:11:53 +0000 (15:41 +0530)]
drm/amdgpu: Implement a new userqueue fence driver
Developed a userqueue fence driver for the userqueue process shared
BO synchronization.
Create a dma fence having write pointer as the seqno and allocate a
seq64 memory for each user queue process and feed this memory address
into the firmware/hardware, thus the firmware writes the read pointer
into the given address when the process completes its execution.
Compare wptr and rptr; if rptr >= wptr, signal the fences for the waiting
process to consume the buffers.
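The rptr/wptr comparison described above can be sketched roughly as below. This is a userspace illustration, not the driver implementation: all `sketch_*` names are hypothetical, the pending fences are assumed to be kept in increasing seqno order, and counter wrap-around and locking are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fence: seqno is the wptr value at submission time. */
struct sketch_fence {
    uint64_t seqno;
    bool signaled;
};

/* Hypothetical per-queue fence driver state. */
struct sketch_fence_drv {
    const volatile uint64_t *rptr_addr; /* memory the firmware writes rptr to */
    struct sketch_fence *fences;        /* pending fences, ordered by seqno */
    size_t count;
};

/* Signal every pending fence whose seqno the rptr has reached.
 * Returns the number of fences newly signaled by this call. */
static size_t sketch_fence_drv_process(struct sketch_fence_drv *drv)
{
    assert(drv && drv->rptr_addr);
    uint64_t rptr = *drv->rptr_addr; /* read once, outside the loop */
    size_t signaled = 0;

    for (size_t i = 0; i < drv->count; i++) {
        struct sketch_fence *f = &drv->fences[i];

        if (f->signaled)
            continue;
        if (rptr < f->seqno)
            break; /* ordered list: later fences are not done either */
        f->signaled = true;
        signaled++;
    }
    return signaled;
}
```

In this sketch a caller would invoke sketch_fence_drv_process() from whatever path learns that the firmware has updated the rptr memory, for example an interrupt handler.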
v2: Worked on review comments from Christian for the following
modifications
- Add wptr as sequence number into the fence
- Add a reference count for the fence driver
- Add dma_fence_put below the list_del as it might
free the userq fence.
- Trim unnecessary code in interrupt handler.
- Check dma fence signaled state in dma fence creation
function for a potential problem of hardware completing
the job processing beforehand.
- Add necessary locks.
- Create a list and process all the unsignaled fences.
- clean up fences in destroy function.
- implement .signaled callback function
v3: Worked on review comments from Christian
- Modify naming convention for reference counted objects
- Fix fence driver reference drop issue
- Drop amdgpu_userq_fence_driver_process() function return value
v4: Worked on review comments from Christian
- Moved fence driver allocation into amdgpu_userq_fence_driver_alloc()
- Added detailed doc describing the differences between the
two spinlocks declared.
v5: Worked on review comments from Christian
- Check before upcast and remove local variable
- Add error handling in fence_drv alloc function.
- Move rptr read fn outside of the loop and remove WARN_ON in
destroy function.
v6:
- clear the seq64 memory in user fence driver(Christian)
- fix for the wptr va bo mapping(Christian)
- move the fence_drv xa entry erase code from the interrupt handler
into user fence destroy function
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Tue, 27 Aug 2024 09:25:35 +0000 (14:55 +0530)]
drm/amdgpu: add kernel config for gfx-userqueue
This patch:
- adds a kernel config option "CONFIG_DRM_AMDGPU_NAVI3X_USERQ"
- moves the userqueue initialization code for all IPs under
this flag
- covers the core userqueue functions under this config
- adds stub function for userqueue ioctl.
so that the userqueue works only when the config is enabled.
V9: Introduce this patch
V10: Call it CONFIG_DRM_AMDGPU_NAVI3X_USERQ instead of
CONFIG_DRM_AMDGPU_USERQ_GFX (Christian)
V11: Add GFX in the config help description message.
V12: Add depends on BROKEN for this config, remove this when the rest of
the code is available.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arvind Yadav [Tue, 27 Aug 2024 09:59:49 +0000 (15:29 +0530)]
drm/amdgpu: fix MES GFX mask
Current MES GFX mask prevents FW from enabling oversubscription. This patch
does the following:
- Fixes the mask values and adds a description for the same
- Removes the central mask setup and makes it IP specific, as it would
be different when the number of pipes and queues are different.
v2: squash in fix from Shashank
Cc: Christian König <Christian.Koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Thu, 9 May 2024 12:31:15 +0000 (14:31 +0200)]
drm/amdgpu: enable compute/gfx usermode queue
This patch does the necessary changes required to
enable compute workload support using the existing
usermode queues infrastructure.
V9: Patch introduced
V10: Add custom IP specific mqd structure for compute (Alex)
V11: Rename drm_amdgpu_userq_mqd_compute_gfx_v11 to
drm_amdgpu_userq_mqd_compute_gfx11 (Marek)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arvind Yadav [Tue, 27 Aug 2024 09:22:07 +0000 (14:52 +0530)]
drm/amdgpu: enable SDMA usermode queues
This patch does necessary modifications to enable the SDMA
usermode queues using the existing userqueue infrastructure.
V9: introduced this patch in the series
V10: use header file instead of extern (Alex)
V11: rename drm_amdgpu_userq_mqd_sdma_gfx_v11 to
drm_amdgpu_userq_mqd_sdma_gfx11 (Marek)
Cc: Christian König <Christian.Koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Thu, 2 May 2024 10:37:30 +0000 (12:37 +0200)]
drm/amdgpu: enable GFX-V11 userqueue support
This patch enables GFX-v11 IP support in the usermode queue base
code. It typically:
- adds a GFX_v11 specific MQD structure
- sets IP functions to create and destroy MQDs
- sets MQD objects coming from userspace
V10: introduced this separate patch for GFX V11 enabling (Alex).
V11: Addressed review comments:
- update the comments in GFX mqd structure informing user about using
the INFO IOCTL for object sizes (Alex)
- rename struct drm_amdgpu_userq_mqd_gfx_v11 to
drm_amdgpu_userq_mqd_gfx11 (Marek)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Tue, 10 Oct 2023 10:17:50 +0000 (12:17 +0200)]
drm/amdgpu: cleanup leftover queues
This patch adds code to cleanup any leftover userqueues which
a user might have failed to destroy due to a crash or any other
programming error.
V7: Added Alex's R-B
V8: Rebase
V9: Rebase
V10: Rebase
V11: Rebase
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Thu, 9 May 2024 12:17:13 +0000 (14:17 +0200)]
drm/amdgpu: generate doorbell index for userqueue
The userspace sends us the doorbell object and the relative doorbell
index in the object to be used for the usermode queue, but the FW
expects the absolute doorbell index on the PCI BAR in the MQD. This
patch adds a function to convert this relative doorbell index to
absolute doorbell index.
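The relative-to-absolute conversion can be illustrated with the sketch below. It rests on assumptions not stated in the commit message: doorbell slots are taken to be 32 bits wide, and `obj_bar_offset` is taken to be the byte offset of the doorbell object from the start of the doorbell BAR. The function name is hypothetical, and the real helper may scale the relative index differently (for instance for 64-bit doorbells).

```c
#include <assert.h>
#include <stdint.h>

/* Convert a doorbell index relative to a doorbell object into an
 * absolute index on the doorbell BAR.
 *
 * Assumptions (not from the commit): slots are 32-bit wide and
 * obj_bar_offset is the object's byte offset from the BAR start. */
static uint64_t sketch_doorbell_index_on_bar(uint64_t obj_bar_offset,
                                             uint32_t relative_index)
{
    assert(obj_bar_offset % sizeof(uint32_t) == 0);
    return obj_bar_offset / sizeof(uint32_t) + relative_index;
}
```

With these assumptions, an object placed 0x1000 bytes into the BAR and a relative index of 3 give an absolute index of 1027.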
V5: Fix the db object reference leak (Christian)
V6: Pin the doorbell bo in userqueue_create() function, and unpin it
in userqueue destroy (Christian)
V7: Added missing kfree for queue in error cases
Added Alex's R-B
V8: Rebase
V9: Changed the function names from gfx_v11* to mes_v11*
V10: Rebase
V11: Rebase
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Mon, 22 Apr 2024 17:21:20 +0000 (19:21 +0200)]
drm/amdgpu: map wptr BO into GART
To support oversubscription, MES FW expects WPTR BOs to
be mapped into GART, before they are submitted to usermode
queues. This patch adds a function for the same.
V4: fix the wptr value before mapping lookup (Bas, Christian).
V5: Addressed review comments from Christian:
- Either pin object or allocate from GART, but not both.
- All the handling must be done with the VM locks held.
V7: Addressed review comments from Christian:
- Do not take vm->eviction_lock
- Use amdgpu_bo_gpu_offset to get the wptr_bo GPU offset
V8: Rebase
V9: Changed the function names from gfx_v11* to mes_v11*
V10: Remove unused adev (Harish)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Thu, 2 May 2024 10:13:37 +0000 (12:13 +0200)]
drm/amdgpu: map usermode queue into MES
This patch adds new functions to map/unmap a usermode queue into
the FW, using the MES ring. As soon as this mapping is done, the
queue would be considered ready to accept the workload.
V1: Addressed review comments from Alex on the RFC patch series
- Map/Unmap should be IP specific.
V2:
Addressed review comments from Christian:
- Fix the wptr_mc_addr calculation (moved into another patch)
Addressed review comments from Alex:
- Do not add fptrs for map/unmap
V3: Integration with doorbell manager
V4: Rebase
V5: Use gfx_v11_0 for function names (Alex)
V6: Removed queue->proc/gang/fw_ctx_address variables and doing the
address calculations locally to keep the queue structure GEN
independent (Alex)
V7: Added R-B from Alex
V8: Rebase
V9: Rebase
V10: Rebase
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Thu, 2 May 2024 11:46:06 +0000 (13:46 +0200)]
drm/amdgpu: create context space for usermode queue
The MES FW expects us to allocate at least one page as context
space to process gang and process related context data. This
patch creates a joint object for the same, and calculates GPU
space offsets of these spaces.
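As a rough sketch of the joint-object idea, the process and gang context areas can be laid out back to back inside one allocation and addressed by offset. The field names proc_ctx_gpu_addr and gang_ctx_gpu_addr follow the names mentioned in the V6 notes; the page size, the ordering of the two areas, and every `sketch_*` name are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical result of laying out the FW context space. */
struct sketch_fw_ctx {
    uint64_t proc_ctx_gpu_addr; /* process context area */
    uint64_t gang_ctx_gpu_addr; /* gang context area */
    uint64_t total_size;        /* size of the single joint object */
};

/* Place the process context at the start of the object and the gang
 * context directly after it (ordering is an assumption). */
static struct sketch_fw_ctx sketch_fw_ctx_layout(uint64_t obj_gpu_addr,
                                                 uint64_t proc_ctx_size,
                                                 uint64_t gang_ctx_size)
{
    struct sketch_fw_ctx ctx;

    assert(obj_gpu_addr % SKETCH_PAGE_SIZE == 0);
    ctx.proc_ctx_gpu_addr = obj_gpu_addr;
    ctx.gang_ctx_gpu_addr = obj_gpu_addr + proc_ctx_size;
    ctx.total_size = proc_ctx_size + gang_ctx_size;
    return ctx;
}
```

Using one object with computed offsets means a single allocation and a single free cover both areas, which matches the V2 review note about allocating only one object for the total FW space.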
V1: Addressed review comments on RFC patch:
Alex: Make this function IP specific
V2: Addressed review comments from Christian
- Allocate only one object for total FW space, and calculate
offsets for each of these objects.
V3: Integration with doorbell manager
V4: Review comments:
- Remove shadow from FW space list from cover letter (Alex)
- Alignment of macro (Luben)
V5: Merged patches 5 and 6 into this single patch
Addressed review comments:
- Use lower_32_bits instead of mask (Christian)
- gfx_v11_0 instead of gfx_v11 in function names (Alex)
- Shadow and GDS objects are now coming from userspace (Christian,
Alex)
V6:
- Add a comment to replace amdgpu_bo_create_kernel() with
amdgpu_bo_create() during fw_ctx object creation (Christian).
- Move proc_ctx_gpu_addr, gang_ctx_gpu_addr and fw_ctx_gpu_addr out
of generic queue structure and make it gen11 specific (Alex).
V7:
- Using helper function to create/destroy userqueue objects.
- Removed FW object space allocation.
V8:
- Updating FW object address from user values.
V9:
- updated function name from gfx_v11_* to mes_v11_*
V10:
- making this patch independent of IP based changes, moving any
GFX object related changes in GFX specific patch (Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Acked-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Mon, 26 Aug 2024 17:42:21 +0000 (23:12 +0530)]
drm/amdgpu: create MES-V11 usermode queue for GFX
A Memory queue descriptor (MQD) of a userqueue defines it in
the hw's context. As MQD format can vary between different
graphics IPs, we need gfx GEN specific handlers to create MQDs.
This patch:
- Adds a new file which will be used for MES based userqueue
functions targeting GFX and SDMA IP.
- Introduces MQD handler functions for the usermode queues.
V1: Worked on review comments from Alex:
- Make MQD functions GEN and IP specific
V2: Worked on review comments from Alex:
- Reuse the existing adev->mqd[ip] for MQD creation
- Formatting and arrangement of code
V3:
- Integration with doorbell manager
V4: Review comments addressed:
- Do not create a new file for userq, reuse gfx_v11_0.c (Alex)
- Align name of structure members (Luben)
- Don't break up the Cc tag list and the Sob tag list in commit
message (Luben)
V5:
- No need to reserve the bo for MQD (Christian).
- Some more changes to support IP specific MQD creation.
V6:
- Add a comment reminding us to replace the amdgpu_bo_create_kernel()
calls while creating MQD object to amdgpu_bo_create() once eviction
fences are ready (Christian).
V7:
- Re-arrange userqueue functions in adev instead of uq_mgr (Alex)
- Use memdup_user instead of copy_from_user (Christian)
V9:
- Moved userqueue code from gfx_v11_0.c to new file mes_v11_0.c so
that it can be reused for SDMA userqueues as well (Shashank, Alex)
V10: Addressed review comments from Alex
- Making this patch independent of IP engine(GFX/SDMA/Compute) and
specific to MES V11 only, using the generic MQD structure.
- Splitting out a separate patch to enable GFX support from here.
- Verify mqd va address to be non-NULL.
- Add a separate header file.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Tue, 10 Oct 2023 10:17:44 +0000 (12:17 +0200)]
drm/amdgpu: add helpers to create userqueue object
This patch introduces amdgpu_userqueue_object and its helper
functions to create and destroy this object. The helper
functions create/destroy a base amdgpu_bo, kmap/unmap it and
save the respective GPU and CPU addresses in the encapsulating
userqueue object.
These helpers will be used to create/destroy userqueue MQD, WPTR
and FW areas.
V7:
- Forked out this new patch from V11-gfx-userqueue patch to prevent
that patch from growing very big.
- Using amdgpu_bo_create instead of amdgpu_bo_create_kernel in prep
for eviction fences (Christian)
V9:
- Rebase
V10:
- Added Alex's R-B
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Tue, 10 Oct 2023 10:17:43 +0000 (12:17 +0200)]
drm/amdgpu: add new IOCTL for usermode queue
This patch adds:
- A new IOCTL function to create and destroy usermode queues.
- A new structure to keep all the user queue data in one place.
- A function to generate unique index for the queue.
V1: Worked on review comments from RFC patch series:
- Alex: Keep a list of queues, instead of single queue per process.
- Christian: Use the queue manager instead of global ptrs,
Don't keep the queue structure in amdgpu_ctx
V2: Worked on review comments:
- Christian:
- Formatting of text
- There is no need for queuing of userqueues, with idr in place
- Alex:
- Remove use_doorbell, it's unnecessary
- Reuse amdgpu_mqd_props for saving mqd fields
- Code formatting and re-arrangement
V3:
- Integration with doorbell manager
V4:
- Accommodate MQD union related changes in UAPI (Alex)
- Do not set the queue size twice (Bas)
V5:
- Remove wrapper functions for queue indexing (Christian)
- Do not save the queue id/idr in queue itself (Christian)
- Move the idr allocation in the IP independent generic space
(Christian)
V6:
- Check the validity of input IP type (Christian)
V7:
- Move uq_func from uq_mgr to adev (Alex)
- Add missing free(queue) for error cases (Yifan)
V9:
- Rebase
V10: Addressed review comments from Christian, and added R-B:
- Do not initialize the local variable
- Convert DRM_ERROR to DEBUG.
V11:
- check the input flags to be zero (Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Mon, 26 Aug 2024 17:34:13 +0000 (23:04 +0530)]
drm/amdgpu: add usermode queue base code
This patch adds IP independent skeleton code for amdgpu
usermode queue. It contains:
- A new file with init functions of usermode queues.
- A queue context manager in driver private data.
V1: Worked on design review comments from RFC patch series:
(https://patchwork.freedesktop.org/series/112214/)
- Alex: Keep a list of queues, instead of single queue per process.
- Christian: Use the queue manager instead of global ptrs,
Don't keep the queue structure in amdgpu_ctx
V2:
- Reformatted code, split the big patch into two
V3:
- Integration with doorbell manager
V4:
- Align the structure member names to the largest member's column
(Luben)
- Added SPDX license (Luben)
V5:
- Do not add amdgpu.h in amdgpu_userqueue.h (Christian).
- Move struct amdgpu_userq_mgr into amdgpu_userqueue.h (Christian).
V6: Rebase
V9: Rebase
V10: Rebase + Alex's R-B
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 10 Oct 2023 10:17:41 +0000 (12:17 +0200)]
drm/amdgpu: UAPI for user queue management
This patch introduces new UAPI/IOCTL for usermode graphics
queue. The userspace app will fill this structure and request
the graphics driver to add a graphics work queue for it. The
output of this UAPI is a queue id.
This UAPI maps the queue into GPU, so the graphics app can start
submitting work to the queue as soon as the call returns.
V2: Addressed review comments from Alex and Christian
- Make the doorbell offset's comment clearer
- Change the output parameter name to queue_id
V3: Integration with doorbell manager
V4:
- Updated the UAPI doc (Pierre-Eric)
- Created a Union for engine specific MQDs (Alex)
- Added Christian's R-B
V5:
- Add variables for GDS and CSA in MQD structure (Alex)
- Make MQD data a ptr-size pair instead of union (Alex)
V9:
- renamed struct drm_amdgpu_userq_mqd_gfx_v11 to struct
drm_amdgpu_userq_mqd as it's being used for SDMA and
compute queues as well
V10:
- keeping the drm_amdgpu_userq_mqd IP independent, moving the
_gfx_v11 objects in a separate structure in other patch.
(Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Mon, 7 Apr 2025 05:52:30 +0000 (01:52 -0400)]
drm/amdgpu: still cleanup sid.h
The defines, shifts and masks are already available in dce_6_0_d.h,
dce_6_0_sh_mask.h.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Fri, 4 Apr 2025 05:42:25 +0000 (01:42 -0400)]
drm/amdgpu: fill in gmc_v6_0_set_clockgating_state()
Pretty much was already there, just not ported to amdgpu.
Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alexandre Demers [Fri, 4 Apr 2025 05:42:24 +0000 (01:42 -0400)]
drm/amd/display/dc: reclassify DCE6 resources and hw sequencer
Classify DCE6 resource and sequencer as they are for other DCE versions
Put dce60_resource.c and .h under amd/display/dc/resource/dce60
Put and rename dce60_hw_sequencer.c and .h under amd/display/dc/hwss/dce60
v2: fix build when CONFIG_DRM_AMD_DC_SI=n (Alex)
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Lijo Lazar [Tue, 8 Apr 2025 03:25:33 +0000 (08:55 +0530)]
drm/amdgpu: Reset RAS table if header is invalid
If a valid header is not found during RAS eeprom init, consider it as
new and reset RAS table info.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tao Zhou [Thu, 3 Apr 2025 03:39:49 +0000 (11:39 +0800)]
drm/amdgpu: add loop bits for NPS2 page retirement
Support NPS2 RAS.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kenneth Feng [Tue, 1 Apr 2025 07:56:41 +0000 (15:56 +0800)]
drm/amd/amdgpu: decouple ASPM with pcie dpm
ASPM doesn't need to be disabled if pcie dpm is disabled.
So ASPM can be independently enabled.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ruili Ji [Mon, 24 Mar 2025 05:15:25 +0000 (01:15 -0400)]
amd/amdgpu: Init vcn hardware per instance for vcn 4.0.3
Add interface for hardware init by vcn instance.
v2: fix code format
Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Victor Skvortsov [Wed, 2 Apr 2025 21:35:56 +0000 (17:35 -0400)]
drm/amdgpu: Disable ACA on VFs
VFs query RAS error counts directly from host with
AMDGPU_RAS_VIRT_ERROR_COUNT_QUERY. When ACA is enabled,
an unusable aca_sysfs is created rather than amdgpu_ras_sysfs_create().
Likewise, VFs depend on host support to query CPERs, rather than ACA component.
Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 26 Mar 2025 17:50:10 +0000 (13:50 -0400)]
Documentation: fix typo in debugfs.rst
In reference to memory carved out for APUs,
s/cave out/carve out/
Reviewed-by: shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 26 Mar 2025 17:49:00 +0000 (13:49 -0400)]
Documentation: update KIQ documentation
KIQ is replaced with MES on GFX 11 and newer.
Reviewed-by: shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>