cxl: Fix cxl_endpoint_get_perf_coordinate() support for RCH
authorDave Jiang <dave.jiang@intel.com>
Fri, 26 Apr 2024 22:47:56 +0000 (15:47 -0700)
committerDave Jiang <dave.jiang@intel.com>
Mon, 29 Apr 2024 16:03:26 +0000 (09:03 -0700)
commit5d211c7090590033581175d6405ae40917ca3a06
tree3239fa3608f69ef70452b3a92b530de31b7f4434
parente67572cd2204894179d89bd7b984072f19313b03
cxl: Fix cxl_endpoint_get_perf_coordinate() support for RCH

Robert reported the following when booting a CXL host with Restricted CXL
Host (RCH) topology:
 [   39.815379] cxl_acpi ACPI0017:00: not a cxl_port device
 [   39.827123] WARNING: CPU: 46 PID: 1754 at drivers/cxl/core/port.c:592 to_cxl_port+0x56/0x70 [cxl_core]

... plus some related subsequent NULL pointer dereference:

 [   40.718708] BUG: kernel NULL pointer dereference, address: 00000000000002d8

The iterator to walk the PCIe path did not account for RCH topology.
However RCH does not support hotplug and the memory exported by the
Restricted CXL Device (RCD) should be covered by HMAT and therefore no
access_coordinate is needed. Add check to see if the endpoint device is
RCD and skip calculation.

Also add a call to cxl_endpoint_get_perf_coordinates() in cxl_test in order
to exercise the topology iterator. The dev_is_pci() check added is to help
with this test and should be harmless for normal operation.

Reported-by: Robert Richter <rrichter@amd.com>
Closes: https://lore.kernel.org/all/Ziv8GfSMSbvlBB0h@rric.localdomain/
Fixes: 592780b8391f ("cxl: Fix retrieving of access_coordinates in PCIe path")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/20240426224913.1027420-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
drivers/cxl/core/port.c
tools/testing/cxl/test/cxl.c