linux-block.git
7 weeks agodrm/amdgpu: convert bios_hardcoded_edid to drm_edid
Thomas Weißschuh [Fri, 26 Jul 2024 13:40:15 +0000 (15:40 +0200)]
drm/amdgpu: convert bios_hardcoded_edid to drm_edid

Instead of manually passing around 'struct edid *' and its size,
use 'struct drm_edid', which encapsulates a validated combination of
both.

As the drm_edid_ can handle NULL gracefully, the explicit checks can be
dropped.

Also save a few characters by transforming '&array[0]' to the equivalent
'array' and using 'max_t(int, ...)' instead of manual casts.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amdgpu: Fix APU handling in amdgpu_pm_load_smu_firmware()
Alex Deucher [Thu, 25 Jul 2024 21:30:37 +0000 (17:30 -0400)]
drm/amdgpu: Fix APU handling in amdgpu_pm_load_smu_firmware()

We only need to skip this on modern APUs.  It's required
on older APUs as it's where start_smu gets called from.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3502
Fixes: 064d92436b69 ("drm/amd/pm: avoid to load smu firmware for APUs")
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Tim Huang <Tim.Huang@amd.com>
7 weeks agodrm/amdgpu: trigger ip dump before suspend of IP's
Sunil Khatri [Fri, 26 Jul 2024 12:37:41 +0000 (18:07 +0530)]
drm/amdgpu: trigger ip dump before suspend of IP's

Problem:
IP dump right now is done post suspend of all
IP's which for some IP's could change power
state and software state too which we do not want
to reflect in the dump as it might not be same at
the time of hang.

Solution:
IP should be dumped as close to the HW state when
the GPU was in hung state without trying to reinitialize
any resource.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Avoid overflow assignment in link_dp_cts
Alex Hung [Wed, 17 Jul 2024 15:17:56 +0000 (09:17 -0600)]
drm/amd/display: Avoid overflow assignment in link_dp_cts

sampling_rate is an uint8_t but is assigned an unsigned int, and thus it
can overflow. As a result, sampling_rate is changed to uint32_t.

Similarly, LINK_QUAL_PATTERN_SET has a size of 2 bits, and it should
only be assigned to a value less or equal than 4.

This fixes 2 INTEGER_OVERFLOW issues reported by Coverity.

Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Use correct cm_helper function
Ilya Bakoulin [Tue, 16 Jul 2024 17:39:10 +0000 (13:39 -0400)]
drm/amd/display: Use correct cm_helper function

Need to use cm3_helper function with DCN401 to avoid cases where high
RGB component values can get set to zero if using the TF curve generated
by cm_helper.

Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Add seamless boot support for more DIG operation modes
Nicholas Kazlauskas [Tue, 16 Jul 2024 21:41:54 +0000 (17:41 -0400)]
drm/amd/display: Add seamless boot support for more DIG operation modes

[Why]
When pre-OS firmware enables display support for displays that operate
the DIG in 2 pixels per cycle processing modes the inferred pixel rate
from get_pixel_clk_frequency_100hz does not account for the true pixel
rate since we're outputting 2 per cycle past the stream encoder.

This causes seamless boot validation to abort early.

[How]
Add a new stream encoder function for getting pixels per cycle from the
stream encoder. If the pixels per cycle is greater than 1 and the driver
policy is to enable 2 pixels per cycle for post-OS then allow seamless
boot to continue.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Duncan Ma <duncan.ma@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Reset VRR config during resume
Tom Chung [Fri, 12 Jul 2024 09:29:07 +0000 (17:29 +0800)]
drm/amd/display: Reset VRR config during resume

[Why]
After resume the system, the new_crtc_state->vrr_infopacket does not
synchronize with the current state.  It will affect the
update_freesync_state_on_stream() does not update the state correctly.

The previous patch causes a PSR SU regression that cannot let panel go
into self-refresh mode.

[How]
Reset the VRR config during resume to force update the VRR config later.

Fixes: eb6dfbb7a9c6 ("drm/amd/display: Reset freesync config before update new state")
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Add a missing PSR state
Tom Chung [Fri, 12 Jul 2024 10:02:30 +0000 (18:02 +0800)]
drm/amd/display: Add a missing PSR state

[Why & How]
Add a missing PSR state to make the dmub_psr_get_state() return a
correct PSR state.

Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: sync dmub output event type.
Charlene Liu [Tue, 16 Jul 2024 19:58:35 +0000 (15:58 -0400)]
drm/amd/display: sync dmub output event type.

[why]
dmubfw added a new event type, update amdgpu to avoid "notify type 6
invalid"

Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Reviewed-by: Chris Park <chris.park@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: restore immediate_disable_crtc for w/a
Charlene Liu [Tue, 16 Jul 2024 17:47:43 +0000 (13:47 -0400)]
drm/amd/display: restore immediate_disable_crtc for w/a

[why]
immediate_disable_crtc does not reset ODM.  if switching to disable_crtc
which will disable ODM as well.  i.e. need to restore ODM mem cfg at
reenable it at end of w/a.

Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Reviewed-by: Xi (Alex) Liu <xi.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amdgpu: increase mes log buffer size for gfx12
Michael Chen [Tue, 23 Jul 2024 21:45:23 +0000 (17:45 -0400)]
drm/amdgpu: increase mes log buffer size for gfx12

MES firmware requires larger log buffer for gfx12. Allocate
proper buffer respectively for gfx11 and gfx12.

Signed-off-by: Michael Chen <michael.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Check stream_status before it is used
Alex Hung [Mon, 15 Jul 2024 16:37:28 +0000 (10:37 -0600)]
drm/amd/display: Check stream_status before it is used

[WHAT & HOW]
dc_state_get_stream_status can return null, and therefore null must be
checked before stream_status is used.

This fixes 1 NULL_RETURNS issue reported by Coverity.

Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Check null pointers before using them
Alex Hung [Mon, 15 Jul 2024 16:24:58 +0000 (10:24 -0600)]
drm/amd/display: Check null pointers before using them

[WHAT & HOW]
dc_link is null checked previously in the same function, indicating it
might be null as reported by Coverity.

This fixes 1 FORWARD_NULL issue reported by Coverity.

Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Fix possible overflow in integer multiplication
Alex Hung [Sat, 8 Jun 2024 04:09:53 +0000 (22:09 -0600)]
drm/amd/display: Fix possible overflow in integer multiplication

[WHAT & HOW]
Integer multiplies integer may overflow in context that expects an
expression of unsigned long long (64 bits). This can be fixed by casting
integer to unsigned long long to force 64 bits results.

This fixes 2 OVERFLOW_BEFORE_WIDEN issues reported by Coverity.

Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Add option to disable unbounded req in DML21
Alvin Lee [Mon, 15 Jul 2024 17:54:18 +0000 (13:54 -0400)]
drm/amd/display: Add option to disable unbounded req in DML21

Use debug option for disabling unbounded req in DML21

Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Reviewed-by: Austin Zheng <Austin.Zheng@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Refactor for dio
Bhuvanachandra Pinninti [Tue, 16 Jul 2024 13:23:03 +0000 (18:53 +0530)]
drm/amd/display: Refactor for dio

Moved files to respective folders to improve DIO code.

Signed-off-by: Bhuvanachandra Pinninti <bpinnint@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Request 0MHz dispclk for zero display case
Nicholas Kazlauskas [Mon, 15 Jul 2024 19:52:46 +0000 (15:52 -0400)]
drm/amd/display: Request 0MHz dispclk for zero display case

[Why]
If we aren't entering RCG/IPS2 or CLKSTOP is not supported by PMFW then
we should be requesting a dispclk value of 0MHz to PMFW.

Currenly we run at max clock since there's an assumption in APU clock
table formulation where we can run at any DISPCLK at any state so the
real clock value ends up as 1200Mhz - the maximum.

[How]
Set to 0 instead of the minimum value in the state array.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Duncan Ma <duncan.ma@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amdgpu: print VCN instance dump for valid instance
Sunil Khatri [Fri, 26 Jul 2024 09:39:59 +0000 (15:09 +0530)]
drm/amdgpu: print VCN instance dump for valid instance

VCN dump is dependent on power state of the ip. Dump is
valid if VCN was powered up at the time of ip dump.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Add two dmmuy I2C entry for GPIO port mapping issue
Chris Park [Fri, 12 Jul 2024 16:50:48 +0000 (12:50 -0400)]
drm/amd/display: Add two dmmuy I2C entry for GPIO port mapping issue

[Why]
When only 4 I2C is declared, two dummies are required to correctly map
GPIO port.

[How]
Add one more I2C dummy entry to match GPIO port.

Signed-off-by: Chris Park <chris.park@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Run idle optimizations at end of vblank handler
Leo Li [Thu, 11 Jul 2024 18:38:11 +0000 (14:38 -0400)]
drm/amd/display: Run idle optimizations at end of vblank handler

[Why & How]
1. After allowing idle optimizations, hw programming is disallowed.
2. Before hw programming, we need to disallow idle optimizations.

Otherwise, in scenario 1, we will immediately kick hw out of idle
optimizations with register access.

Scenario 2 is less of a concern, since any register access will kick
hw out of idle optimizations. But we'll do it early for correctness.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Let drm_crtc_vblank_on/off manage interrupts
Leo Li [Thu, 11 Jul 2024 18:31:27 +0000 (14:31 -0400)]
drm/amd/display: Let drm_crtc_vblank_on/off manage interrupts

[Why]
We manage interrupts for CRTCs in two places:
1. In manage_dm_interrupts(), when CRTC get enabled or disabled
2. When drm_vblank_get/put() starts or kills the vblank counter, calling
   into amdgpu_dm_crtc_set_vblank()

The interrupts managed by these twp places should be identical.

[How]
Since manage_dm_interrupts() already use drm_crtc_vblank_on/off(), just
move all CRTC interrupt management into amdgpu_dm_crtc_set_vblank().

This has the added benefit of disabling all CRTC and HUBP interrupts
when there are no vblank requestors.

Note that there is a TODO item - unchanged from when it was first
introduced - to properly identify the HUBP instance from the OTG
instance, rather than just assume direct mapping.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: roll back quality EASF and ISHARP and dc dependency changes
Samson Tam [Sun, 14 Jul 2024 20:31:05 +0000 (16:31 -0400)]
drm/amd/display: roll back quality EASF and ISHARP and dc dependency changes

[Why]
Seeing several regressions related to quality EASF and ISHARP changes
and removing dc dependency changes.

[How]
Roll back SPL changes

Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Reviewed-by: Martin Leung <martin.leung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amdkfd: Fix missing error code in kfd_queue_acquire_buffers
Srinivasan Shanmugam [Fri, 26 Jul 2024 06:47:12 +0000 (12:17 +0530)]
drm/amdkfd: Fix missing error code in kfd_queue_acquire_buffers

The fix involves setting 'err' to '-EINVAL' before each 'goto
out_err_unreserve'.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_queue.c:265 kfd_queue_acquire_buffers()
warn: missing error code 'err'

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_queue.c
    226 int kfd_queue_acquire_buffers(struct kfd_process_device *pdd, struct queue_properties *properties)
    227 {
    228         struct kfd_topology_device *topo_dev;
    229         struct amdgpu_vm *vm;
    230         u32 total_cwsr_size;
    231         int err;
    232
    233         topo_dev = kfd_topology_device_by_id(pdd->dev->id);
    234         if (!topo_dev)
    235                 return -EINVAL;
    236
    237         vm = drm_priv_to_vm(pdd->drm_priv);
    238         err = amdgpu_bo_reserve(vm->root.bo, false);
    239         if (err)
    240                 return err;
    241
    242         err = kfd_queue_buffer_get(vm, properties->write_ptr, &properties->wptr_bo, PAGE_SIZE);
    243         if (err)
    244                 goto out_err_unreserve;
    245
    246         err = kfd_queue_buffer_get(vm, properties->read_ptr, &properties->rptr_bo, PAGE_SIZE);
    247         if (err)
    248                 goto out_err_unreserve;
    249
    250         err = kfd_queue_buffer_get(vm, (void *)properties->queue_address,
    251                                    &properties->ring_bo, properties->queue_size);
    252         if (err)
    253                 goto out_err_unreserve;
    254
    255         /* only compute queue requires EOP buffer and CWSR area */
    256         if (properties->type != KFD_QUEUE_TYPE_COMPUTE)
    257                 goto out_unreserve;

This is clearly a success path.

    258
    259         /* EOP buffer is not required for all ASICs */
    260         if (properties->eop_ring_buffer_address) {
    261                 if (properties->eop_ring_buffer_size != topo_dev->node_props.eop_buffer_size) {
    262                         pr_debug("queue eop bo size 0x%lx not equal to node eop buf size 0x%x\n",
    263                                 properties->eop_buf_bo->tbo.base.size,
    264                                 topo_dev->node_props.eop_buffer_size);
--> 265                         goto out_err_unreserve;

This has err in the label name.  err = -EINVAL?

    266                 }
    267                 err = kfd_queue_buffer_get(vm, (void *)properties->eop_ring_buffer_address,
    268                                            &properties->eop_buf_bo,
    269                                            properties->eop_ring_buffer_size);
    270                 if (err)
    271                         goto out_err_unreserve;
    272         }
    273
    274         if (properties->ctl_stack_size != topo_dev->node_props.ctl_stack_size) {
    275                 pr_debug("queue ctl stack size 0x%x not equal to node ctl stack size 0x%x\n",
    276                         properties->ctl_stack_size,
    277                         topo_dev->node_props.ctl_stack_size);
    278                 goto out_err_unreserve;

err?

    279         }
    280
    281         if (properties->ctx_save_restore_area_size != topo_dev->node_props.cwsr_size) {
    282                 pr_debug("queue cwsr size 0x%x not equal to node cwsr size 0x%x\n",
    283                         properties->ctx_save_restore_area_size,
    284                         topo_dev->node_props.cwsr_size);
    285                 goto out_err_unreserve;

err?  Not sure.

    286         }
    287
    288         total_cwsr_size = (topo_dev->node_props.cwsr_size + topo_dev->node_props.debug_memory_size)
    289                           * NUM_XCC(pdd->dev->xcc_mask);
    290         total_cwsr_size = ALIGN(total_cwsr_size, PAGE_SIZE);
    291
    292         err = kfd_queue_buffer_get(vm, (void *)properties->ctx_save_restore_area_address,
    293                                    &properties->cwsr_bo, total_cwsr_size);
    294         if (!err)
    295                 goto out_unreserve;
    296
    297         amdgpu_bo_unreserve(vm->root.bo);
    298
    299         err = kfd_queue_buffer_svm_get(pdd, properties->ctx_save_restore_area_address,
    300                                        total_cwsr_size);
    301         if (err)
    302                 goto out_err_release;
    303
    304         return 0;
    305
    306 out_unreserve:
    307         amdgpu_bo_unreserve(vm->root.bo);
    308         return 0;
    309
    310 out_err_unreserve:
    311         amdgpu_bo_unreserve(vm->root.bo);
    312 out_err_release:
    313         kfd_queue_release_buffers(pdd, properties);
    314         return err;
    315 }

Fixes: 629568d25fea ("drm/amdkfd: Validate queue cwsr area and eop buffer size")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Philip Yang <Philip.Yang@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream
Srinivasan Shanmugam [Thu, 25 Jul 2024 01:53:48 +0000 (07:23 +0530)]
drm/amd/display: Add null check for top_pipe_to_program in commit_planes_for_stream

This commit addresses a null pointer dereference issue in the
`commit_planes_for_stream` function at line 4140. The issue could occur
when `top_pipe_to_program` is null.

The fix adds a check to ensure `top_pipe_to_program` is not null before
accessing its stream_res. This prevents a null pointer dereference.

Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4140 commit_planes_for_stream() error: we previously assumed 'top_pipe_to_program' could be null (see line 3906)

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe
Srinivasan Shanmugam [Thu, 25 Jul 2024 02:44:56 +0000 (08:14 +0530)]
drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe

This commit addresses a null pointer dereference issue in the
`dcn20_program_pipe` function. The issue could occur when
`pipe_ctx->plane_state` is null.

The fix adds a check to ensure `pipe_ctx->plane_state` is not null
before accessing. This prevents a null pointer dereference.

Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn20/dcn20_hwseq.c:1925 dcn20_program_pipe() error: we previously assumed 'pipe_ctx->plane_state' could be null (see line 1877)

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amdgpu: Add MFD support for ISP I2C bus
Venkata Narendra Kumar Gutta [Wed, 19 Jun 2024 01:16:52 +0000 (18:16 -0700)]
drm/amdgpu: Add MFD support for ISP I2C bus

ISP I2C bus device can't be enumerated via ACPI mechanism
since it shares the memory map with the AMDGPU.
So use the MFD mechanism for registering the ISP I2C device
and add the required resources.

Signed-off-by: Venkata Narendra Kumar Gutta <vengutta@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amdgpu: fix contiguous handling for IB parsing v2
Christian König [Wed, 24 Jul 2024 07:24:02 +0000 (09:24 +0200)]
drm/amdgpu: fix contiguous handling for IB parsing v2

Otherwise we won't get correct access to the IB.

v2: keep setting AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS to avoid problems in
    the VRAM backend.

Signed-off-by: Christian König <christian.koenig@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3501
Fixes: e362b7c8f8c7 ("drm/amdgpu: Modify the contiguous flags behaviour")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Tested-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amdgpu: add print support for vcn_v3_0 ip dump
Sunil Khatri [Wed, 24 Jul 2024 11:18:28 +0000 (16:48 +0530)]
drm/amdgpu: add print support for vcn_v3_0 ip dump

Add support for logging the registers in devcoredump
buffer for vcn_v3_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amdgpu: add vcn_v3_0 ip dump support
Sunil Khatri [Wed, 24 Jul 2024 11:05:41 +0000 (16:35 +0530)]
drm/amdgpu: add vcn_v3_0 ip dump support

Add support of vcn ip dump in the devcoredump
for vcn_v3_0.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amdgpu: add macro to calculate offset with instance
Sunil Khatri [Wed, 24 Jul 2024 17:05:56 +0000 (22:35 +0530)]
drm/amdgpu: add macro to calculate offset with instance

Add macro definition which calculate offset of the
register with index override.

This is useful in case when there is an array of
registers which is common for all instances.
To read registers in that case it is easy to define
registers once and the index value is manually passed
to calculate proper offset of register for each instance.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
7 weeks agodrm/amdgpu: add vcn ip dump ptr in vcn global struct
Sunil Khatri [Tue, 23 Jul 2024 07:38:55 +0000 (13:08 +0530)]
drm/amdgpu: add vcn ip dump ptr in vcn global struct

Add pointer to the vcn ip dump in the vcn global structure
to be accessible for all vcn version via global adev.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: remove unneeded semicolon
Jiapeng Chong [Thu, 25 Jul 2024 01:57:12 +0000 (09:57 +0800)]
drm/amd/display: remove unneeded semicolon

No functional modification involved.

./drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:481:2-3: Unneeded semicolon.
./drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:3783:168-169: Unneeded semicolon.
./drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:3782:166-167: Unneeded semicolon.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9575
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdkfd: allow users to target recommended SDMA engines
Jonathan Kim [Tue, 21 May 2024 17:22:15 +0000 (13:22 -0400)]
drm/amdkfd: allow users to target recommended SDMA engines

Certain GPUs have better copy performance over xGMI on specific
SDMA engines depending on the source and destination GPU.
Allow users to create SDMA queues on these recommended engines.
Close to 2x overall performance has been observed with this
optimization.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/pm: support gpu_metrics sysfs interface for smu v14.0.2/3
Kenneth Feng [Thu, 4 Jul 2024 00:14:15 +0000 (08:14 +0800)]
drm/amdgpu/pm: support gpu_metrics sysfs interface for smu v14.0.2/3

support gpu_metrics sysfs interface for smu v14.0.2/3

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: use swap() in sort()
Jiapeng Chong [Wed, 24 Jul 2024 07:37:49 +0000 (15:37 +0800)]
drm/amd/display: use swap() in sort()

Use existing swap() function rather than duplicating its implementation.

./drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn3.c:17:29-30: WARNING opportunity for swap().

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9573
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Reapply 2fde4fdddc1f
Nathan Chancellor [Wed, 24 Jul 2024 15:49:35 +0000 (08:49 -0700)]
drm/amd/display: Reapply 2fde4fdddc1f

Commit 2563391e57b5 ("drm/amd/display: DML2.1 resynchronization") blew
away the compiler warning fix from commit 2fde4fdddc1f
("drm/amd/display: Avoid -Wenum-float-conversion in
add_margin_and_round_to_dfs_grainularity()"), causing the warning to
reappear.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c:183:58: error: arithmetic between enumeration type 'enum dentist_divider_range' and floating-point type 'double' [-Werror,-Wenum-float-conversion]
    183 |         divider = (unsigned int)(DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz));
        |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

Apply the fix again to resolve the warning.

Fixes: 2563391e57b5 ("drm/amd/display: DML2.1 resynchronization")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Fix spelling mistake "tolarance" -> "tolerance"
Colin Ian King [Wed, 24 Jul 2024 13:24:28 +0000 (14:24 +0100)]
drm/amd/display: Fix spelling mistake "tolarance" -> "tolerance"

There is a spelling mistake in a dml2_printf message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/radeon: properly handle vbios fake edid sizing
Alex Deucher [Tue, 23 Jul 2024 17:31:58 +0000 (13:31 -0400)]
drm/radeon: properly handle vbios fake edid sizing

The comment in the vbios structure says:
// = 128 means EDID length is 128 bytes, otherwise the EDID length = ucFakeEDIDLength*128

This fake edid struct has not been used in a long time, so I'm
not sure if there were actually any boards out there with a non-128 byte
EDID, but align the code with the comment.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Reported-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/109964.html
Fixes: c324acd5032f ("drm/radeon/kms: parse the extended LCD info block")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu: properly handle vbios fake edid sizing
Alex Deucher [Tue, 23 Jul 2024 17:23:56 +0000 (13:23 -0400)]
drm/amdgpu: properly handle vbios fake edid sizing

The comment in the vbios structure says:
// = 128 means EDID length is 128 bytes, otherwise the EDID length = ucFakeEDIDLength*128

This fake edid struct has not been used in a long time, so I'm
not sure if there were actually any boards out there with a non-128 byte
EDID, but align the code with the comment.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Reported-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/109964.html
Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdkfd: Validate queue cwsr area and eop buffer size
Philip Yang [Wed, 26 Jun 2024 19:03:05 +0000 (15:03 -0400)]
drm/amdkfd: Validate queue cwsr area and eop buffer size

When creating KFD user compute queue, check if queue eop buffer size,
cwsr area size, ctl stack size equal to the size of KFD node
properities.

Check the entire cwsr area which may split into multiple svm ranges
aligned to granularity boundary.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdkfd: Store queue cwsr area size to node properties
Philip Yang [Wed, 26 Jun 2024 18:52:28 +0000 (14:52 -0400)]
drm/amdkfd: Store queue cwsr area size to node properties

Use the queue eop buffer size, cwsr area size, ctl stack size
calculation from Thunk, store the value to KFD node properties.

Those will be used to validate queue eop buffer size, cwsr area size,
ctl stack size when creating KFD user compute queue.

Those will be exposed to user space via sysfs KFD node properties, to
remove the duplicate calculation code from Thunk.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu: reset vm state machine after gpu reset(vram lost)
ZhenGuo Yin [Fri, 19 Jul 2024 08:10:40 +0000 (16:10 +0800)]
drm/amdgpu: reset vm state machine after gpu reset(vram lost)

[Why]
Page table of compute VM in the VRAM will lost after gpu reset.
VRAM won't be restored since compute VM has no shadows.

[How]
Use higher 32-bit of vm->generation to record a vram_lost_counter.
Reset the VM state machine when vm->genertaion is not equal to
the new generation token.

v2: Check vm->generation instead of calling drm_sched_entity_error
in amdgpu_vm_validate.
v3: Use new generation token instead of vram_lost_counter for check.

Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add null check for set_output_gamma in dcn30_set_output_transfer_func
Srinivasan Shanmugam [Mon, 22 Jul 2024 11:48:17 +0000 (17:18 +0530)]
drm/amd/display: Add null check for set_output_gamma in dcn30_set_output_transfer_func

This commit adds a null check for the set_output_gamma function pointer
in the  dcn30_set_output_transfer_func function. Previously,
set_output_gamma was being checked for nullity at line 386, but then it
was being dereferenced without any nullity check at line 401. This
could potentially lead to a null pointer dereference error if
set_output_gamma is indeed null.

To fix this, we now ensure that set_output_gamma is not null before
dereferencing it. We do this by adding a nullity check for
set_output_gamma before the call to set_output_gamma at line 401. If
set_output_gamma is null, we log an error message and do not call the
function.

This fix prevents a potential null pointer dereference error.

drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:401 dcn30_set_output_transfer_func()
error: we previously assumed 'mpc->funcs->set_output_gamma' could be null (see line 386)

drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c
    373 bool dcn30_set_output_transfer_func(struct dc *dc,
    374                                 struct pipe_ctx *pipe_ctx,
    375                                 const struct dc_stream_state *stream)
    376 {
    377         int mpcc_id = pipe_ctx->plane_res.hubp->inst;
    378         struct mpc *mpc = pipe_ctx->stream_res.opp->ctx->dc->res_pool->mpc;
    379         const struct pwl_params *params = NULL;
    380         bool ret = false;
    381
    382         /* program OGAM or 3DLUT only for the top pipe*/
    383         if (pipe_ctx->top_pipe == NULL) {
    384                 /*program rmu shaper and 3dlut in MPC*/
    385                 ret = dcn30_set_mpc_shaper_3dlut(pipe_ctx, stream);
    386                 if (ret == false && mpc->funcs->set_output_gamma) {
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If this is NULL

    387                         if (stream->out_transfer_func.type == TF_TYPE_HWPWL)
    388                                 params = &stream->out_transfer_func.pwl;
    389                         else if (pipe_ctx->stream->out_transfer_func.type ==
    390                                         TF_TYPE_DISTRIBUTED_POINTS &&
    391                                         cm3_helper_translate_curve_to_hw_format(
    392                                         &stream->out_transfer_func,
    393                                         &mpc->blender_params, false))
    394                                 params = &mpc->blender_params;
    395                          /* there are no ROM LUTs in OUTGAM */
    396                         if (stream->out_transfer_func.type == TF_TYPE_PREDEFINED)
    397                                 BREAK_TO_DEBUGGER();
    398                 }
    399         }
    400
--> 401         mpc->funcs->set_output_gamma(mpc, mpcc_id, params);
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Then it will crash

    402         return ret;
    403 }

Fixes: d99f13878d6f ("drm/amd/display: Add DCN3 HWSEQ")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Hersen Wu <hersenxs.wu@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu: add missed harvest check for VCN IP v4/v5
Tim Huang [Tue, 23 Jul 2024 08:54:34 +0000 (16:54 +0800)]
drm/amdgpu: add missed harvest check for VCN IP v4/v5

To prevent below probe failure, add a check for models with VCN
IP v4.0.6 where VCN1 may be harvested.

v2:
Apply the same check to VCN IP v4.0 and v5.0.

[   54.070117] RIP: 0010:vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[   54.071055] Code: 80 fb ff 8d 82 00 80 fe ff 81 fe 00 06 00 00 0f 43
c2 49 69 d5 38 0d 00 00 48 8d 71 04 c1 e8 02 4c 01 f2 48 89 b2 50 f6 02
00 <89> 01 48 8b 82 50 f6 02 00 48 8d 48 04 48 89 8a 50 f6 02 00 c7 00
[   54.072408] RSP: 0018:ffffb17985f736f8 EFLAGS: 00010286
[   54.072793] RAX: 00000000000000d6 RBX: ffff99a82f680000 RCX:
0000000000000000
[   54.073315] RDX: ffff99a82f680000 RSI: 0000000000000004 RDI:
ffff99a82f680000
[   54.073835] RBP: ffffb17985f73730 R08: 0000000000000001 R09:
0000000000000000
[   54.074353] R10: 0000000000000008 R11: ffffb17983c05000 R12:
0000000000000000
[   54.074879] R13: 0000000000000000 R14: ffff99a82f680000 R15:
0000000000000001
[   54.075400] FS:  00007f8d9c79a000(0000) GS:ffff99ab2f140000(0000)
knlGS:0000000000000000
[   54.075988] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   54.076408] CR2: 0000000000000000 CR3: 0000000140c3a000 CR4:
0000000000750ef0
[   54.076927] PKRU: 55555554
[   54.077132] Call Trace:
[   54.077319]  <TASK>
[   54.077484]  ? show_regs+0x69/0x80
[   54.077747]  ? __die+0x28/0x70
[   54.077979]  ? page_fault_oops+0x180/0x4b0
[   54.078286]  ? do_user_addr_fault+0x2d2/0x680
[   54.078610]  ? exc_page_fault+0x84/0x190
[   54.078910]  ? asm_exc_page_fault+0x2b/0x30
[   54.079224]  ? vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[   54.079941]  ? vcn_v4_0_5_start_dpg_mode+0xe6/0x36b0 [amdgpu]
[   54.080617]  vcn_v4_0_5_set_powergating_state+0x82/0x19b0 [amdgpu]
[   54.081316]  amdgpu_device_ip_set_powergating_state+0x64/0xc0
[amdgpu]
[   54.082057]  amdgpu_vcn_ring_begin_use+0x6f/0x1d0 [amdgpu]
[   54.082727]  amdgpu_ring_alloc+0x44/0x70 [amdgpu]
[   54.083351]  amdgpu_vcn_dec_sw_ring_test_ring+0x40/0x110 [amdgpu]
[   54.084054]  amdgpu_ring_test_helper+0x22/0x90 [amdgpu]
[   54.084698]  vcn_v4_0_5_hw_init+0x87/0xc0 [amdgpu]
[   54.085307]  amdgpu_device_init+0x1f96/0x2780 [amdgpu]
[   54.085951]  amdgpu_driver_load_kms+0x1e/0xc0 [amdgpu]
[   54.086591]  amdgpu_pci_probe+0x19f/0x550 [amdgpu]
[   54.087215]  local_pci_probe+0x48/0xa0
[   54.087509]  pci_device_probe+0xc9/0x250
[   54.087812]  really_probe+0x1a4/0x3f0
[   54.088101]  __driver_probe_device+0x7d/0x170
[   54.088443]  driver_probe_device+0x24/0xa0
[   54.088765]  __driver_attach+0xdd/0x1d0
[   54.089068]  ? __pfx___driver_attach+0x10/0x10
[   54.089417]  bus_for_each_dev+0x8e/0xe0
[   54.089718]  driver_attach+0x22/0x30
[   54.090000]  bus_add_driver+0x120/0x220
[   54.090303]  driver_register+0x62/0x120
[   54.090606]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
[   54.091255]  __pci_register_driver+0x62/0x70
[   54.091593]  amdgpu_init+0x67/0xff0 [amdgpu]
[   54.092190]  do_one_initcall+0x5f/0x330
[   54.092495]  do_init_module+0x68/0x240
[   54.092794]  load_module+0x201c/0x2110
[   54.093093]  init_module_from_file+0x97/0xd0
[   54.093428]  ? init_module_from_file+0x97/0xd0
[   54.093777]  idempotent_init_module+0x11c/0x2a0
[   54.094134]  __x64_sys_finit_module+0x64/0xc0
[   54.094476]  do_syscall_64+0x58/0x120
[   54.094767]  entry_SYSCALL_64_after_hwframe+0x6e/0x76

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu: skip kfd init if GFX is not ready.
Yifan Zhang [Thu, 18 Jul 2024 05:18:53 +0000 (13:18 +0800)]
drm/amdgpu: skip kfd init if GFX is not ready.

avoid kfd init crash in that case.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdkfd: Validate user queue update
Philip Yang [Thu, 20 Jun 2024 17:00:48 +0000 (13:00 -0400)]
drm/amdkfd: Validate user queue update

Ensure update queue new ring buffer is mapped on GPU with correct size.

Decrease queue old ring_bo queue_refcount and increase new ring_bo
queue_refcount.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdkfd: Validate user queue svm memory residency
Philip Yang [Thu, 20 Jun 2024 16:44:57 +0000 (12:44 -0400)]
drm/amdkfd: Validate user queue svm memory residency

Queue CWSR area maybe registered to GPU as svm memory, create queue to
ensure svm mapped to GPU with KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag.

Add queue_refcount to struct svm_range, to track queue CWSR area usage.

Because unmap mmu notifier callback return value is ignored, if
application unmap the CWSR area while queue is active, pr_warn message
in dmesg log. To be safe, evict user queue.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx9.4.3: Enable bad opcode interrupt
Alex Deucher [Fri, 12 Jul 2024 22:57:14 +0000 (18:57 -0400)]
drm/amdgpu/gfx9.4.3: Enable bad opcode interrupt

For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx9: Enable bad opcode interrupt
Alex Deucher [Fri, 12 Jul 2024 22:50:26 +0000 (18:50 -0400)]
drm/amdgpu/gfx9: Enable bad opcode interrupt

For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx12: Enable bad opcode interrupt
Jesse Zhang [Fri, 12 Jul 2024 22:42:53 +0000 (18:42 -0400)]
drm/amdgpu/gfx12: Enable bad opcode interrupt

For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.

v2: update irq naming (drop priv) (Alex)

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx10: Enable bad opcode interrupt
Jesse Zhang [Fri, 12 Jul 2024 22:14:52 +0000 (18:14 -0400)]
drm/amdgpu/gfx10: Enable bad opcode interrupt

For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.

v2: update irq naming (drop priv) (Alex)

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx11: Enable bad opcode interrupt
Jesse Zhang [Thu, 11 Jul 2024 02:38:03 +0000 (10:38 +0800)]
drm/amdgpu/gfx11: Enable bad opcode interrupt

For the bad opcode case, it will cause CP/ME hang.
The firmware will prevent the ME side from hanging by raising a bad opcode interrupt.
And the driver needs to perform a vmid reset when receiving the interrupt.

v2: update irq naming (drop priv) (Alex)

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx: add bad opcode interrupt
Alex Deucher [Fri, 12 Jul 2024 22:01:06 +0000 (18:01 -0400)]
drm/amdgpu/gfx: add bad opcode interrupt

Add the irq source for bad opcodes.

Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx9: properly handle error ints on all pipes
Alex Deucher [Tue, 2 Jul 2024 14:24:59 +0000 (10:24 -0400)]
drm/amdgpu/gfx9: properly handle error ints on all pipes

Need to handle the interrupt enables for all pipes.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx12: properly handle error ints on all pipes
Alex Deucher [Mon, 1 Jul 2024 21:40:55 +0000 (17:40 -0400)]
drm/amdgpu/gfx12: properly handle error ints on all pipes

Need to handle the interrupt enables for all pipes.

v2: fix indexing (Jessie)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx11: properly handle error ints on all pipes
Alex Deucher [Mon, 1 Jul 2024 15:18:00 +0000 (11:18 -0400)]
drm/amdgpu/gfx11: properly handle error ints on all pipes

Need to handle the interrupt enables for all pipes.

v2: fix indexing (Jessie)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx10: properly handle error ints on all pipes
Alex Deucher [Mon, 1 Jul 2024 15:08:52 +0000 (11:08 -0400)]
drm/amdgpu/gfx10: properly handle error ints on all pipes

Need to handle the interrupt enables for all pipes.

v2: fix indexing (Jessie)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx12: enable wave kill for compute queues
Alex Deucher [Fri, 12 Jul 2024 22:20:37 +0000 (18:20 -0400)]
drm/amdgpu/gfx12: enable wave kill for compute queues

It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx11: enable wave kill for compute queues
Alex Deucher [Fri, 12 Jul 2024 22:19:42 +0000 (18:19 -0400)]
drm/amdgpu/gfx11: enable wave kill for compute queues

It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx10: enable wave kill for compute queues
Alex Deucher [Fri, 12 Jul 2024 22:17:18 +0000 (18:17 -0400)]
drm/amdgpu/gfx10: enable wave kill for compute queues

It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu: Fix eeprom max record count
Stanley.Yang [Thu, 18 Jul 2024 02:58:04 +0000 (10:58 +0800)]
drm/amdgpu: Fix eeprom max record count

The eeprom table is empty before initializing,
set eeprom table version first before initializing.

Changed from V1:
Reuse amdgpu_ras_set_eeprom_table_version function

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw
Srinivasan Shanmugam [Mon, 22 Jul 2024 11:14:40 +0000 (16:44 +0530)]
drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw

This commit addresses a potential null pointer dereference issue in the
`dcn32_init_hw` function. The issue could occur when `dc->clk_mgr` is
null.

The fix adds a check to ensure `dc->clk_mgr` is not null before
accessing its functions. This prevents a potential null pointer
dereference.

Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn32/dcn32_hwseq.c:961 dcn32_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 782)

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu: fix ras UE error injection failure issue
YiPeng Chai [Fri, 19 Jul 2024 12:43:04 +0000 (20:43 +0800)]
drm/amdgpu: fix ras UE error injection failure issue

The ras command shared memory is allocated from
VRAM and the response status of the command
buffer will not be zero due to gpu being in
fatal error state after ras UE error injection.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdkfd: Ensure user queue buffers residency
Philip Yang [Thu, 20 Jun 2024 16:31:36 +0000 (12:31 -0400)]
drm/amdkfd: Ensure user queue buffers residency

Add atomic queue_refcount to struct bo_va, return -EBUSY to fail unmap
BO from the GPU if the bo_va queue_refcount is not zero.

Create queue to increase the bo_va queue_refcount, destroy queue to
decrease the bo_va queue_refcount, to ensure the queue buffers mapped on
the GPU when queue is active.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx9.4.3: implement wave kill for compute queues
Alex Deucher [Fri, 12 Jul 2024 22:27:37 +0000 (18:27 -0400)]
drm/amdgpu/gfx9.4.3: implement wave kill for compute queues

Based on gfx9.0 implementation.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu: add print support for sdma_v_4_4_2 ip_dump
Sunil Khatri [Wed, 17 Jul 2024 13:15:50 +0000 (18:45 +0530)]
drm/amdgpu: add print support for sdma_v_4_4_2 ip_dump

Add print support for ip dump for sdma_v_4_4_2 in
devcoredump.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn401_init_hw
Srinivasan Shanmugam [Mon, 22 Jul 2024 11:28:32 +0000 (16:58 +0530)]
drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn401_init_hw

This commit addresses a potential null pointer dereference issue in the
`dcn401_init_hw` function. The issue could occur when `dc->clk_mgr` or
`dc->clk_mgr->funcs` is null.

The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` is
not null before accessing its functions. This prevents a potential null
pointer dereference.

Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn401/dcn401_hwseq.c:416 dcn401_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 225)

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw
Srinivasan Shanmugam [Mon, 22 Jul 2024 10:51:19 +0000 (16:21 +0530)]
drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw

This commit addresses a potential null pointer dereference issue in the
`dcn30_init_hw` function. The issue could occur when `dc->clk_mgr` or
`dc->clk_mgr->funcs` is null.

The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` is
not null before accessing its functions. This prevents a potential null
pointer dereference.

Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:789 dcn30_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 628)

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdkfd: Validate user queue buffers
Philip Yang [Thu, 20 Jun 2024 16:21:57 +0000 (12:21 -0400)]
drm/amdkfd: Validate user queue buffers

Find user queue rptr, ring buf, eop buffer and cwsr area BOs, and
check BOs are mapped on the GPU with correct size and take the BO
reference.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx9: enable wave kill for compute queues
Alex Deucher [Fri, 12 Jul 2024 22:21:48 +0000 (18:21 -0400)]
drm/amdgpu/gfx9: enable wave kill for compute queues

It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx8: enable wave kill for compute queues
Alex Deucher [Fri, 12 Jul 2024 22:29:59 +0000 (18:29 -0400)]
drm/amdgpu/gfx8: enable wave kill for compute queues

It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amdgpu/gfx7: enable wave kill for compute queues
Alex Deucher [Fri, 12 Jul 2024 22:29:20 +0000 (18:29 -0400)]
drm/amdgpu/gfx7: enable wave kill for compute queues

It should work the same for compute as well as gfx.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pip...
Srinivasan Shanmugam [Sun, 21 Jul 2024 14:00:16 +0000 (19:30 +0530)]
drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer

This commit addresses a potential null pointer dereference issue in the
`dcn32_acquire_idle_pipe_for_head_pipe_in_layer` function. The issue
could occur when `head_pipe` is null.

The fix adds a check to ensure `head_pipe` is not null before asserting
it. If `head_pipe` is null, the function returns NULL to prevent a
potential null pointer dereference.

Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn32/dcn32_resource.c:2690 dcn32_acquire_idle_pipe_for_head_pipe_in_layer() error: we previously assumed 'head_pipe' could be null (see line 2681)

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer
Srinivasan Shanmugam [Sun, 21 Jul 2024 13:48:58 +0000 (19:18 +0530)]
drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer

This commit addresses a potential null pointer dereference issue in the
`dcn201_acquire_free_pipe_for_layer` function. The issue could occur
when `head_pipe` is null.

The fix adds a check to ensure `head_pipe` is not null before asserting
it. If `head_pipe` is null, the function returns NULL to prevent a
potential null pointer dereference.

Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn201/dcn201_resource.c:1016 dcn201_acquire_free_pipe_for_layer() error: we previously assumed 'head_pipe' could be null (see line 1010)

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Fix index out of bounds in DCN30 color transformation
Srinivasan Shanmugam [Sat, 20 Jul 2024 12:35:20 +0000 (18:05 +0530)]
drm/amd/display: Fix index out of bounds in DCN30 color transformation

This commit addresses a potential index out of bounds issue in the
`cm3_helper_translate_curve_to_hw_format` function in the DCN30 color
management module. The issue could occur when the index 'i' exceeds the
number of transfer function points (TRANSFER_FUNC_POINTS).

The fix adds a check to ensure 'i' is within bounds before accessing the
transfer function points. If 'i' is out of bounds, the function returns
false to indicate an error.

drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:180 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:181 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:182 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Implement bounds check for stream encoder creation in DCN401
Srinivasan Shanmugam [Fri, 19 Jul 2024 16:09:57 +0000 (21:39 +0530)]
drm/amd/display: Implement bounds check for stream encoder creation in DCN401

'stream_enc_regs' array is an array of dcn10_stream_enc_registers
structures. The array is initialized with four elements, corresponding
to the four calls to stream_enc_regs() in the array initializer. This
means that valid indices for this array are 0, 1, 2, and 3.

The error message 'stream_enc_regs' 4 <= 5 below, is indicating that
there is an attempt to access this array with an index of 5, which is
out of bounds. This could lead to undefined behavior

Here, eng_id is used as an index to access the stream_enc_regs array. If
eng_id is 5, this would result in an out-of-bounds access on the
stream_enc_regs array.

Thus fixing Buffer overflow error in dcn401_stream_encoder_create

Found by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn401/dcn401_resource.c:1209 dcn401_stream_encoder_create() error: buffer overflow 'stream_enc_regs' 4 <= 5

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Fix index out of bounds in degamma hardware format translation
Srinivasan Shanmugam [Sat, 20 Jul 2024 12:18:27 +0000 (17:48 +0530)]
drm/amd/display: Fix index out of bounds in degamma hardware format translation

Fixes index out of bounds issue in
`cm_helper_translate_curve_to_degamma_hw_format` function. The issue
could occur when the index 'i' exceeds the number of transfer function
points (TRANSFER_FUNC_POINTS).

The fix adds a check to ensure 'i' is within bounds before accessing the
transfer function points. If 'i' is out of bounds the function returns
false to indicate an error.

Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:594 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:595 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:596 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation
Srinivasan Shanmugam [Sat, 20 Jul 2024 13:14:02 +0000 (18:44 +0530)]
drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation

This commit addresses a potential index out of bounds issue in the
`cm3_helper_translate_curve_to_degamma_hw_format` function in the DCN30
color  management module. The issue could occur when the index 'i'
exceeds the  number of transfer function points (TRANSFER_FUNC_POINTS).

The fix adds a check to ensure 'i' is within bounds before accessing the
transfer function points. If 'i' is out of bounds, the function returns
false to indicate an error.

Reported by smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:338 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:339 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:340 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max

Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add kdoc entry for 'bs_coeffs_updated' in dpp401_dscl_program_isharp
Srinivasan Shanmugam [Mon, 22 Jul 2024 12:45:18 +0000 (18:15 +0530)]
drm/amd/display: Add kdoc entry for 'bs_coeffs_updated' in dpp401_dscl_program_isharp

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:961: warning: Function parameter or struct member 'bs_coeffs_updated' not described in 'dpp401_dscl_program_isharp'

Fixes: 94beb4ac1b3b ("drm/amd/display: ensure EASF and ISHARP coefficients are programmed together")
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: 3.2.293
Aric Cyr [Mon, 15 Jul 2024 01:54:49 +0000 (21:54 -0400)]
drm/amd/display: 3.2.293

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: remove unused folder
Aurabindo Pillai [Mon, 15 Jul 2024 20:03:30 +0000 (16:03 -0400)]
drm/amd/display: remove unused folder

dc/{dcn401,dcn303} are unused since the files in it got moved under their
respective new components location. Hence they are no longer necessary

Fixes: 2d62bb450ed1 ("drm/amd/display: Refactor DCN3X into component folder")
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Remove duplicate HWSS interfaces
Joshua Aberback [Thu, 6 Jun 2024 19:51:16 +0000 (15:51 -0400)]
drm/amd/display: Remove duplicate HWSS interfaces

[Why]
Some interface functions are defined in both the public and private HWSS
interfaces, which can lead to confusion and runtime issues, therefore
the duplicates should be eliminated.

[How]
- power_down should only be private, because it's only used within HWSS.
- update_plane_addr should only be public, as it's used outside HWSS.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Various DML2 fixes for FAMS2
Dillon Varone [Mon, 20 May 2024 15:12:07 +0000 (11:12 -0400)]
drm/amd/display: Various DML2 fixes for FAMS2

The disable fams2 operation was reworked, but some of the old code
remained. This commit removes the disable_fams2_drr from the
dml2_stream_parameters.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Remove old comments
Rodrigo Siqueira [Thu, 11 Jul 2024 16:53:41 +0000 (10:53 -0600)]
drm/amd/display: Remove old comments

Remove some old comments from DCN32/321.

Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Check link_res->hpo_dp_link_enc before using it
Alex Hung [Thu, 27 Jun 2024 22:45:39 +0000 (16:45 -0600)]
drm/amd/display: Check link_res->hpo_dp_link_enc before using it

[WHAT & HOW]
Functions dp_enable_link_phy and dp_disable_link_phy can pass link_res
without initializing hpo_dp_link_enc and it is necessary to check for
null before dereferencing.

This fixes 1 FORWARD_NULL issue reported by Coverity.

Fixes: 0beca868cde8 ("drm/amd/display: Check link_res->hpo_dp_link_enc before using it")
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add MST debug message when link detection fails
Alex Hung [Fri, 12 Jul 2024 15:39:13 +0000 (09:39 -0600)]
drm/amd/display: Add MST debug message when link detection fails

[WHY & HOW]
dc_link_detect returns a boolean value which can be used to print debug
messages when it fails.

This fixes 1 CHECKED_RETURN issue reported by Coverity.

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Check top sink only when multiple streams for DP2
Sung Joon Kim [Thu, 11 Jul 2024 15:24:07 +0000 (11:24 -0400)]
drm/amd/display: Check top sink only when multiple streams for DP2

[why]
When switching from extended to second display only
mode, the top remote sink is not removed while the top stream
itself is released. This causes DML to think there is no
DP2 output encoder because top remote sink does not match
with the second stream and disables DTBCLK and causes
hang.

[how]
For DP2.0 MST hubs, only treat 1st remote sink as an encoder
only when there are multiple displays connected.

Reviewed-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Fix Potential Null Dereference
Gabe Teeger [Thu, 11 Jul 2024 18:56:29 +0000 (14:56 -0400)]
drm/amd/display: Fix Potential Null Dereference

[what & why]
System hang after s4 regression points to code change here.
Removing possible NULL dereference.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add helper function to check for non-address fast updates
Ilya Bakoulin [Tue, 9 Jul 2024 17:11:55 +0000 (13:11 -0400)]
drm/amd/display: Add helper function to check for non-address fast updates

[Why/How]
Need to identify which fast updates will update more than just the
address.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Ilya Bakoulin <ilya.bakoulin@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: rename dcn401_soc to dcn4_variant_a_soc
Aurabindo Pillai [Thu, 4 Jul 2024 18:41:58 +0000 (18:41 +0000)]
drm/amd/display: rename dcn401_soc to dcn4_variant_a_soc

To distinguish between different soc with same DCN IP, use variants
starting with alphabets

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: rename dcn3/dcn4 to more sound terms
Aurabindo Pillai [Thu, 4 Jul 2024 18:33:02 +0000 (18:33 +0000)]
drm/amd/display: rename dcn3/dcn4 to more sound terms

Use more accurate names to refer to the asic architecture.
dcn3 in DML actually refers to DCN32 and DCN321, so rename it to dcn32x
dcn4 refers to any DCN4x soc., and hence rename dcn4 to dcn4x

Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add source select helper functions
Hansen Dsouza [Thu, 11 Jul 2024 14:58:51 +0000 (10:58 -0400)]
drm/amd/display: Add source select helper functions

[why & how]
Add source select helpers based on DCCG spec

Reviewed-by: Daniel Miess <daniel.miess@amd.com>
Signed-off-by: Hansen Dsouza <hansen.dsouza@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Check if Mode is Supported Before Returning Result
Austin Zheng [Wed, 10 Jul 2024 18:15:57 +0000 (14:15 -0400)]
drm/amd/display: Check if Mode is Supported Before Returning Result

[Why]
Even if the mode is not supported dml2_check_mode_supported() would still return true.
This causes an unsupported mode to be programmed.

[How]
Check if the mode is supported or not and return the proper result.

Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Signed-off-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: ensure EASF and ISHARP coefficients are programmed together
Samson Tam [Wed, 10 Jul 2024 21:09:04 +0000 (17:09 -0400)]
drm/amd/display: ensure EASF and ISHARP coefficients are programmed together

[Why]
EASF coefficients are programmed to RAM and then RAM selector is toggled.
 ISHARP coefficients are programmed after so they will not be in the same
 RAM block

[How]
Move ISHARP programming before EASF programming
Add flag if ISHARP coefficients are updated.  If so, then
 force EASF coefficients programming

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Samson Tam <samson.tam@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Fix visual confirm bug for SubVP
Ryan Seto [Wed, 10 Jul 2024 21:03:32 +0000 (17:03 -0400)]
drm/amd/display: Fix visual confirm bug for SubVP

[Why]
Visual confirm was incorrect on dual monitor SubVP setup

[How]
Adjusted p_state assignment for dual monitor SubVP setup

Signed-off-by: Ryan Seto <ryanseto@amd.com>
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add RCG helper functions
Hansen Dsouza [Tue, 9 Jul 2024 20:50:05 +0000 (16:50 -0400)]
drm/amd/display: Add RCG helper functions

[why & how]
Add standard RCG helpers based on DCCG spec

Reviewed-by: Daniel Miess <daniel.miess@amd.com>
Reviewed-by: Muhammad Ahmed <ahmed.ahmed@amd.com>
Signed-off-by: Hansen Dsouza <hansen.dsouza@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Remove ASSERT if significance is zero in math_ceil2
Rodrigo Siqueira [Thu, 4 Jul 2024 17:54:34 +0000 (11:54 -0600)]
drm/amd/display: Remove ASSERT if significance is zero in math_ceil2

In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Refactoring HPO
Revalla Hari Krishna [Mon, 8 Jul 2024 10:05:08 +0000 (15:35 +0530)]
drm/amd/display: Refactoring HPO

[Why]
To refactor HPO files

[How]
Moved hpo related files to specific hpo folder and
update Makefiles.

Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Revalla Hari Krishna <harikrishna.revalla@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Add private data type for RCG
Hansen Dsouza [Tue, 9 Jul 2024 19:56:36 +0000 (15:56 -0400)]
drm/amd/display: Add private data type for RCG

[why & how]
Add private data types for better RCG control

Reviewed-by: Chris Park <chris.park@amd.com>
Reviewed-by: Yihan Zhu <yihan.zhu@amd.com>
Signed-off-by: Hansen Dsouza <hansen.dsouza@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 weeks agodrm/amd/display: Check for NULL pointer
Sung Joon Kim [Mon, 8 Jul 2024 23:29:49 +0000 (19:29 -0400)]
drm/amd/display: Check for NULL pointer

[why & how]
Need to make sure plane_state is initialized
before accessing its members.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Xi (Alex) Liu <xi.liu@amd.com>
Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>