qed: Don't collect too many protection override GRC elements
In the protection override dump path, the firmware can return far too
many GRC elements, causing the driver to write past the end of the
previously kmalloc'ed dump buffer.
This will result in a kernel panic with reason:
BUG: unable to handle kernel paging request at ADDRESS
where "ADDRESS" is just past the end of the protection override dump
buffer. The start address of the buffer is:
p_hwfn->cdev->dbg_features[DBG_FEATURE_PROTECTION_OVERRIDE].dump_buf
and the size of the buffer is buf_size in the same data structure.
The panic can be reached through either the qede Ethernet driver path:
[exception RIP: qed_grc_dump_addr_range+0x108]
qed_protection_override_dump at ffffffffc02662ed [qed]
qed_dbg_protection_override_dump at ffffffffc0267792 [qed]
qed_dbg_feature at ffffffffc026aa8f [qed]
qed_dbg_all_data at ffffffffc026b211 [qed]
qed_fw_fatal_reporter_dump at ffffffffc027298a [qed]
devlink_health_do_dump at ffffffff82497f61
devlink_health_report at ffffffff8249cf29
qed_report_fatal_error at ffffffffc0272baf [qed]
qede_sp_task at ffffffffc045ed32 [qede]
process_one_work at ffffffff81d19783
or the qedf storage driver path:
[exception RIP: qed_grc_dump_addr_range+0x108]
qed_protection_override_dump at ffffffffc068b2ed [qed]
qed_dbg_protection_override_dump at ffffffffc068c792 [qed]
qed_dbg_feature at ffffffffc068fa8f [qed]
qed_dbg_all_data at ffffffffc0690211 [qed]
qed_fw_fatal_reporter_dump at ffffffffc069798a [qed]
devlink_health_do_dump at ffffffff8aa95e51
devlink_health_report at ffffffff8aa9ae19
qed_report_fatal_error at ffffffffc0697baf [qed]
qed_hw_err_notify at ffffffffc06d32d7 [qed]
qed_spq_post at ffffffffc06b1011 [qed]
qed_fcoe_destroy_conn at ffffffffc06b2e91 [qed]
qedf_cleanup_fcport at ffffffffc05e7597 [qedf]
qedf_rport_event_handler at ffffffffc05e7bf7 [qedf]
fc_rport_work at ffffffffc02da715 [libfc]
process_one_work at ffffffff8a319663
Resolve this by clamping the firmware's return value to the maximum
number of legal elements the firmware should return.
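
The clamping idea can be sketched in standalone C as follows. This is an
illustrative sketch only, not the driver change itself: the function
names and the OVERRIDE_* constants are hypothetical, and the real limits
are whatever the qed driver and firmware define.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits for illustration; not taken from the qed driver. */
#define OVERRIDE_ELEMENT_DWORDS	2u
#define OVERRIDE_MAX_ELEMENTS	20u

/*
 * Convert a firmware-reported element count into a dword count,
 * clamped to the largest count the dump buffer was sized for.
 */
uint32_t override_dwords_clamped(uint32_t fw_elements)
{
	/* Clamp before multiplying so the product cannot overflow. */
	if (fw_elements > OVERRIDE_MAX_ELEMENTS)
		fw_elements = OVERRIDE_MAX_ELEMENTS;

	return fw_elements * OVERRIDE_ELEMENT_DWORDS;
}

/*
 * Copy the override window into the dump buffer. The caller must
 * provide buffers of at least OVERRIDE_MAX_ELEMENTS *
 * OVERRIDE_ELEMENT_DWORDS dwords. Returns the number of dwords written.
 */
uint32_t dump_override_window(uint32_t *buf, const uint32_t *window,
			      uint32_t fw_elements)
{
	uint32_t dwords = override_dwords_clamped(fw_elements);
	uint32_t i;

	assert(buf != NULL && window != NULL);

	/* Never copies more than the clamped size, whatever firmware says. */
	for (i = 0; i < dwords; i++)
		buf[i] = window[i];

	return dwords;
}
```

With these assumed limits, a bogus firmware count such as 1000 elements
still results in at most 40 dwords being written, so the copy stays
within a buffer sized for the legal maximum.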
Fixes: d52c89f120de8 ("qed*: Utilize FW 8.37.2.0")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Link: https://patch.msgid.link/f8e1182934aa274c18d0682a12dbaf347595469c.1757485536.git.jamie.bainbridge@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>