+++ /dev/null
-.. SPDX-License-Identifier: GPL-2.0
-.. include:: <isonum.txt>
-
-==================================
-CXL Access Coordinates Computation
-==================================
-
-Shared Upstream Link Calculation
-================================
-For certain CXL region construction with endpoints behind CXL switches (SW) or
-Root Ports (RP), there is the possibility of the total bandwidth for all
-the endpoints behind a switch being more than the switch upstream link.
-A similar situation can occur within the host, upstream of the root ports.
-The CXL driver performs an additional pass after all the targets have
-arrived for a region in order to recalculate the bandwidths with possible
-upstream link being a limiting factor in mind.
-
-The algorithm assumes the configuration is a symmetric topology as that
-maximizes performance. When asymmetric topology is detected, the calculation
-is aborted. An asymmetric topology is detected during topology walk where the
-number of RPs detected as a grandparent is not equal to the number of devices
-iterated in the same iteration loop. The assumption is made that subtle
-asymmetry in properties does not happen and all paths to EPs are equal.
-
-There can be multiple switches under an RP. There can be multiple RPs under
-a CXL Host Bridge (HB). There can be multiple HBs under a CXL Fixed Memory
-Window Structure (CFMWS).
-
-An example hierarchy:
-
-> CFMWS 0
-> |
-> _________|_________
-> | |
-> ACPI0017-0 ACPI0017-1
-> GP0/HB0/ACPI0016-0 GP1/HB1/ACPI0016-1
-> | | | |
-> RP0 RP1 RP2 RP3
-> | | | |
-> SW 0 SW 1 SW 2 SW 3
-> | | | | | | | |
-> EP0 EP1 EP2 EP3 EP4 EP5 EP6 EP7
-
-Computation for the example hierarchy:
-
-Min (GP0 to CPU BW,
- Min(SW 0 Upstream Link to RP0 BW,
- Min(SW0SSLBIS for SW0DSP0 (EP0), EP0 DSLBIS, EP0 Upstream Link) +
- Min(SW0SSLBIS for SW0DSP1 (EP1), EP1 DSLBIS, EP1 Upstream link)) +
- Min(SW 1 Upstream Link to RP1 BW,
- Min(SW1SSLBIS for SW1DSP0 (EP2), EP2 DSLBIS, EP2 Upstream Link) +
- Min(SW1SSLBIS for SW1DSP1 (EP3), EP3 DSLBIS, EP3 Upstream link))) +
-Min (GP1 to CPU BW,
- Min(SW 2 Upstream Link to RP2 BW,
- Min(SW2SSLBIS for SW2DSP0 (EP4), EP4 DSLBIS, EP4 Upstream Link) +
- Min(SW2SSLBIS for SW2DSP1 (EP5), EP5 DSLBIS, EP5 Upstream link)) +
- Min(SW 3 Upstream Link to RP3 BW,
- Min(SW3SSLBIS for SW3DSP0 (EP6), EP6 DSLBIS, EP6 Upstream Link) +
- Min(SW3SSLBIS for SW3DSP1 (EP7), EP7 DSLBIS, EP7 Upstream link))))
-
-The calculation starts at cxl_region_shared_upstream_perf_update(). A xarray
-is created to collect all the endpoint bandwidths via the
-cxl_endpoint_gather_bandwidth() function. The min() of bandwidth from the
-endpoint CDAT and the upstream link bandwidth is calculated. If the endpoint
-has a CXL switch as a parent, then min() of calculated bandwidth and the
-bandwidth from the SSLBIS for the switch downstream port that is associated
-with the endpoint is calculated. The final bandwidth is stored in a
-'struct cxl_perf_ctx' in the xarray indexed by a device pointer. If the
-endpoint is direct attached to a root port (RP), the device pointer would be an
-RP device. If the endpoint is behind a switch, the device pointer would be the
-upstream device of the parent switch.
-
-At the next stage, the code walks through one or more switches if they exist
-in the topology. For endpoints directly attached to RPs, this step is skipped.
-If there is another switch upstream, the code takes the min() of the current
-gathered bandwidth and the upstream link bandwidth. If there's a switch
-upstream, then the SSLBIS of the upstream switch.
-
-Once the topology walk reaches the RP, whether it's direct attached endpoints
-or walking through the switch(es), cxl_rp_gather_bandwidth() is called. At
-this point all the bandwidths are aggregated per each host bridge, which is
-also the index for the resulting xarray.
-
-The next step is to take the min() of the per host bridge bandwidth and the
-bandwidth from the Generic Port (GP). The bandwidths for the GP is retrieved
-via ACPI tables SRAT/HMAT. The min bandwidth are aggregated under the same
-ACPI0017 device to form a new xarray.
-
-Finally, the cxl_region_update_bandwidth() is called and the aggregated
-bandwidth from all the members of the last xarray is updated for the
-access coordinates residing in the cxl region (cxlr) context.
Compute Express Link
====================
+CXL device configuration involves a complex handoff between the platform
+(Hardware, BIOS, EFI), the OS (early boot, core kernel, driver), and user
+policy decisions, each of which impacts the others. The docs here break up
+the configuration steps.
+
+.. toctree::
+ :maxdepth: 2
+ :caption: Overview
+
+ theory-of-operation
+ maturity-map
+
.. toctree::
:maxdepth: 1
+ :caption: Linux Kernel Configuration
- memory-devices
- access-coordinates
+ linux/access-coordinates
- maturity-map
.. only:: subproject and html
--- /dev/null
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+==================================
+CXL Access Coordinates Computation
+==================================
+
+Shared Upstream Link Calculation
+================================
+For certain CXL region constructions with endpoints behind CXL switches (SW) or
+Root Ports (RP), it is possible for the total bandwidth of all the endpoints
+behind a switch to exceed the bandwidth of the switch upstream link.
+A similar situation can occur within the host, upstream of the root ports.
+The CXL driver performs an additional pass after all the targets for a region
+have arrived in order to recalculate the bandwidths, taking the upstream links
+into account as possible limiting factors.
+
+The algorithm assumes the configuration is a symmetric topology, as that
+maximizes performance. When an asymmetric topology is detected, the calculation
+is aborted. An asymmetric topology is detected during the topology walk when the
+number of RPs detected as grandparents is not equal to the number of devices
+iterated over in the same loop. The assumption is made that subtle asymmetry
+in properties does not happen and that all paths to EPs are equal.
+
+There can be multiple switches under an RP. There can be multiple RPs under
+a CXL Host Bridge (HB). There can be multiple HBs under a CXL Fixed Memory
+Window Structure (CFMWS).
+
+An example hierarchy:
+
+>                CFMWS 0
+>                  |
+>         _________|_________
+>        |                   |
+>  ACPI0017-0            ACPI0017-1
+> GP0/HB0/ACPI0016-0   GP1/HB1/ACPI0016-1
+>    |          |        |           |
+>   RP0        RP1      RP2         RP3
+>    |          |        |           |
+>  SW 0       SW 1     SW 2        SW 3
+>  |   |      |   |    |   |       |   |
+> EP0 EP1    EP2 EP3  EP4 EP5     EP6 EP7
+
+Computation for the example hierarchy:
+
+Min (GP0 to CPU BW,
+     Min(SW 0 Upstream Link to RP0 BW,
+         Min(SW0SSLBIS for SW0DSP0 (EP0), EP0 DSLBIS, EP0 Upstream Link) +
+         Min(SW0SSLBIS for SW0DSP1 (EP1), EP1 DSLBIS, EP1 Upstream Link)) +
+     Min(SW 1 Upstream Link to RP1 BW,
+         Min(SW1SSLBIS for SW1DSP0 (EP2), EP2 DSLBIS, EP2 Upstream Link) +
+         Min(SW1SSLBIS for SW1DSP1 (EP3), EP3 DSLBIS, EP3 Upstream Link))) +
+Min (GP1 to CPU BW,
+     Min(SW 2 Upstream Link to RP2 BW,
+         Min(SW2SSLBIS for SW2DSP0 (EP4), EP4 DSLBIS, EP4 Upstream Link) +
+         Min(SW2SSLBIS for SW2DSP1 (EP5), EP5 DSLBIS, EP5 Upstream Link)) +
+     Min(SW 3 Upstream Link to RP3 BW,
+         Min(SW3SSLBIS for SW3DSP0 (EP6), EP6 DSLBIS, EP6 Upstream Link) +
+         Min(SW3SSLBIS for SW3DSP1 (EP7), EP7 DSLBIS, EP7 Upstream Link))))
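+
+To make the computation concrete, a minimal userspace C sketch of the same
+nested min() aggregation follows. All bandwidth numbers are invented for
+illustration; the driver performs the equivalent computation with CDAT
+DSLBIS/SSLBIS data and Generic Port data from SRAT/HMAT::
+
+    #include <stdio.h>
+
+    static unsigned int min2(unsigned int a, unsigned int b)
+    {
+            return a < b ? a : b;
+    }
+
+    static unsigned int min3(unsigned int a, unsigned int b, unsigned int c)
+    {
+            return min2(a, min2(b, c));
+    }
+
+    int main(void)
+    {
+            /* Per-endpoint limit: SSLBIS to its DSP, its DSLBIS, its link */
+            unsigned int ep = min3(16, 12, 16);     /* -> 12 GB/s */
+            /* Two endpoints share one 32 GB/s switch upstream link */
+            unsigned int sw = min2(32, ep + ep);    /* -> 24, link not limiting */
+            /* Two switch links aggregate under one 40 GB/s GP */
+            unsigned int gp = min2(40, sw + sw);    /* -> 40, GP is limiting */
+
+            printf("bandwidth via this GP: %u GB/s\n", gp);
+            return 0;
+    }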
+
+The calculation starts at cxl_region_shared_upstream_perf_update(). An xarray
+is created to collect all the endpoint bandwidths via the
+cxl_endpoint_gather_bandwidth() function. The min() of the bandwidth from the
+endpoint CDAT and the upstream link bandwidth is calculated. If the endpoint
+has a CXL switch as a parent, then the min() of the calculated bandwidth and
+the bandwidth from the SSLBIS for the switch downstream port that is associated
+with the endpoint is calculated. The final bandwidth is stored in a
+'struct cxl_perf_ctx' in the xarray, indexed by a device pointer. If the
+endpoint is directly attached to a root port (RP), the device pointer is the
+RP device. If the endpoint is behind a switch, the device pointer is the
+upstream device of the parent switch.
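+
+The following is a simplified sketch, not the driver's actual code, of the
+accumulation pattern described above: a hypothetical perf context is looked up
+in the xarray by the upstream device pointer, allocated on first use, and
+endpoints that share the same upstream link sum into the same entry::
+
+    struct perf_ctx {
+            unsigned int bw;        /* aggregated bandwidth, simplified */
+    };
+
+    static int gather_bw(struct xarray *xa, struct device *uport,
+                         unsigned int ep_bw)
+    {
+            struct perf_ctx *ctx;
+
+            ctx = xa_load(xa, (unsigned long)uport);
+            if (!ctx) {
+                    ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+                    if (!ctx)
+                            return -ENOMEM;
+                    xa_store(xa, (unsigned long)uport, ctx, GFP_KERNEL);
+            }
+            ctx->bw += ep_bw;       /* endpoints behind one link sum up */
+            return 0;
+    }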
+
+At the next stage, the code walks through one or more switches if they exist
+in the topology. For endpoints directly attached to RPs, this step is skipped.
+If there is another switch upstream, the code takes the min() of the current
+gathered bandwidth and the upstream link bandwidth, and the SSLBIS bandwidth
+of that upstream switch is factored into the min() as well.
+
+Once the topology walk reaches the RP, whether from directly attached
+endpoints or by walking through the switch(es), cxl_rp_gather_bandwidth() is
+called. At this point all the bandwidths are aggregated per host bridge, which
+is also the index for the resulting xarray.
+
+The next step is to take the min() of the per host bridge bandwidth and the
+bandwidth from the Generic Port (GP). The bandwidths for the GP are retrieved
+via the ACPI SRAT/HMAT tables. The minimum bandwidths are aggregated under the
+same ACPI0017 device to form a new xarray.
+
+Finally, cxl_region_update_bandwidth() is called and the aggregated
+bandwidth from all the members of the last xarray is updated for the
+access coordinates residing in the cxl region (cxlr) context.
+++ /dev/null
-.. SPDX-License-Identifier: GPL-2.0
-.. include:: <isonum.txt>
-
-===================================
-Compute Express Link Memory Devices
-===================================
-
-A Compute Express Link Memory Device is a CXL component that implements the
-CXL.mem protocol. It contains some amount of volatile memory, persistent memory,
-or both. It is enumerated as a PCI device for configuration and passing
-messages over an MMIO mailbox. Its contribution to the System Physical
-Address space is handled via HDM (Host Managed Device Memory) decoders
-that optionally define a device's contribution to an interleaved address
-range across multiple devices underneath a host-bridge or interleaved
-across host-bridges.
-
-CXL Bus: Theory of Operation
-============================
-Similar to how a RAID driver takes disk objects and assembles them into a new
-logical device, the CXL subsystem is tasked to take PCIe and ACPI objects and
-assemble them into a CXL.mem decode topology. The need for runtime configuration
-of the CXL.mem topology is also similar to RAID in that different environments
-with the same hardware configuration may decide to assemble the topology in
-contrasting ways. One may choose performance (RAID0) striping memory across
-multiple Host Bridges and endpoints while another may opt for fault tolerance
-and disable any striping in the CXL.mem topology.
-
-Platform firmware enumerates a menu of interleave options at the "CXL root port"
-(Linux term for the top of the CXL decode topology). From there, PCIe topology
-dictates which endpoints can participate in which Host Bridge decode regimes.
-Each PCIe Switch in the path between the root and an endpoint introduces a point
-at which the interleave can be split. For example platform firmware may say at a
-given range only decodes to 1 one Host Bridge, but that Host Bridge may in turn
-interleave cycles across multiple Root Ports. An intervening Switch between a
-port and an endpoint may interleave cycles across multiple Downstream Switch
-Ports, etc.
-
-Here is a sample listing of a CXL topology defined by 'cxl_test'. The 'cxl_test'
-module generates an emulated CXL topology of 2 Host Bridges each with 2 Root
-Ports. Each of those Root Ports are connected to 2-way switches with endpoints
-connected to those downstream ports for a total of 8 endpoints::
-
- # cxl list -BEMPu -b cxl_test
- {
- "bus":"root3",
- "provider":"cxl_test",
- "ports:root3":[
- {
- "port":"port5",
- "host":"cxl_host_bridge.1",
- "ports:port5":[
- {
- "port":"port8",
- "host":"cxl_switch_uport.1",
- "endpoints:port8":[
- {
- "endpoint":"endpoint9",
- "host":"mem2",
- "memdev":{
- "memdev":"mem2",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x1",
- "numa_node":1,
- "host":"cxl_mem.1"
- }
- },
- {
- "endpoint":"endpoint15",
- "host":"mem6",
- "memdev":{
- "memdev":"mem6",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x5",
- "numa_node":1,
- "host":"cxl_mem.5"
- }
- }
- ]
- },
- {
- "port":"port12",
- "host":"cxl_switch_uport.3",
- "endpoints:port12":[
- {
- "endpoint":"endpoint17",
- "host":"mem8",
- "memdev":{
- "memdev":"mem8",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x7",
- "numa_node":1,
- "host":"cxl_mem.7"
- }
- },
- {
- "endpoint":"endpoint13",
- "host":"mem4",
- "memdev":{
- "memdev":"mem4",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x3",
- "numa_node":1,
- "host":"cxl_mem.3"
- }
- }
- ]
- }
- ]
- },
- {
- "port":"port4",
- "host":"cxl_host_bridge.0",
- "ports:port4":[
- {
- "port":"port6",
- "host":"cxl_switch_uport.0",
- "endpoints:port6":[
- {
- "endpoint":"endpoint7",
- "host":"mem1",
- "memdev":{
- "memdev":"mem1",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0",
- "numa_node":0,
- "host":"cxl_mem.0"
- }
- },
- {
- "endpoint":"endpoint14",
- "host":"mem5",
- "memdev":{
- "memdev":"mem5",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x4",
- "numa_node":0,
- "host":"cxl_mem.4"
- }
- }
- ]
- },
- {
- "port":"port10",
- "host":"cxl_switch_uport.2",
- "endpoints:port10":[
- {
- "endpoint":"endpoint16",
- "host":"mem7",
- "memdev":{
- "memdev":"mem7",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x6",
- "numa_node":0,
- "host":"cxl_mem.6"
- }
- },
- {
- "endpoint":"endpoint11",
- "host":"mem3",
- "memdev":{
- "memdev":"mem3",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x2",
- "numa_node":0,
- "host":"cxl_mem.2"
- }
- }
- ]
- }
- ]
- }
- ]
- }
-
-In that listing each "root", "port", and "endpoint" object correspond a kernel
-'struct cxl_port' object. A 'cxl_port' is a device that can decode CXL.mem to
-its descendants. So "root" claims non-PCIe enumerable platform decode ranges and
-decodes them to "ports", "ports" decode to "endpoints", and "endpoints"
-represent the decode from SPA (System Physical Address) to DPA (Device Physical
-Address).
-
-Continuing the RAID analogy, disks have both topology metadata and on device
-metadata that determine RAID set assembly. CXL Port topology and CXL Port link
-status is metadata for CXL.mem set assembly. The CXL Port topology is enumerated
-by the arrival of a CXL.mem device. I.e. unless and until the PCIe core attaches
-the cxl_pci driver to a CXL Memory Expander there is no role for CXL Port
-objects. Conversely for hot-unplug / removal scenarios, there is no need for
-the Linux PCI core to tear down switch-level CXL resources because the endpoint
-->remove() event cleans up the port data that was established to support that
-Memory Expander.
-
-The port metadata and potential decode schemes that a give memory device may
-participate can be determined via a command like::
-
- # cxl list -BDMu -d root -m mem3
- {
- "bus":"root3",
- "provider":"cxl_test",
- "decoders:root3":[
- {
- "decoder":"decoder3.1",
- "resource":"0x8030000000",
- "size":"512.00 MiB (536.87 MB)",
- "volatile_capable":true,
- "nr_targets":2
- },
- {
- "decoder":"decoder3.3",
- "resource":"0x8060000000",
- "size":"512.00 MiB (536.87 MB)",
- "pmem_capable":true,
- "nr_targets":2
- },
- {
- "decoder":"decoder3.0",
- "resource":"0x8020000000",
- "size":"256.00 MiB (268.44 MB)",
- "volatile_capable":true,
- "nr_targets":1
- },
- {
- "decoder":"decoder3.2",
- "resource":"0x8050000000",
- "size":"256.00 MiB (268.44 MB)",
- "pmem_capable":true,
- "nr_targets":1
- }
- ],
- "memdevs:root3":[
- {
- "memdev":"mem3",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x2",
- "numa_node":0,
- "host":"cxl_mem.2"
- }
- ]
- }
-
-...which queries the CXL topology to ask "given CXL Memory Expander with a kernel
-device name of 'mem3' which platform level decode ranges may this device
-participate". A given expander can participate in multiple CXL.mem interleave
-sets simultaneously depending on how many decoder resource it has. In this
-example mem3 can participate in one or more of a PMEM interleave that spans to
-Host Bridges, a PMEM interleave that targets a single Host Bridge, a Volatile
-memory interleave that spans 2 Host Bridges, and a Volatile memory interleave
-that only targets a single Host Bridge.
-
-Conversely the memory devices that can participate in a given platform level
-decode scheme can be determined via a command like the following::
-
- # cxl list -MDu -d 3.2
- [
- {
- "memdevs":[
- {
- "memdev":"mem1",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0",
- "numa_node":0,
- "host":"cxl_mem.0"
- },
- {
- "memdev":"mem5",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x4",
- "numa_node":0,
- "host":"cxl_mem.4"
- },
- {
- "memdev":"mem7",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x6",
- "numa_node":0,
- "host":"cxl_mem.6"
- },
- {
- "memdev":"mem3",
- "pmem_size":"256.00 MiB (268.44 MB)",
- "ram_size":"256.00 MiB (268.44 MB)",
- "serial":"0x2",
- "numa_node":0,
- "host":"cxl_mem.2"
- }
- ]
- },
- {
- "root decoders":[
- {
- "decoder":"decoder3.2",
- "resource":"0x8050000000",
- "size":"256.00 MiB (268.44 MB)",
- "pmem_capable":true,
- "nr_targets":1
- }
- ]
- }
- ]
-
-...where the naming scheme for decoders is "decoder<port_id>.<instance_id>".
-
-Driver Infrastructure
-=====================
-
-This section covers the driver infrastructure for a CXL memory device.
-
-CXL Memory Device
------------------
-
-.. kernel-doc:: drivers/cxl/pci.c
- :doc: cxl pci
-
-.. kernel-doc:: drivers/cxl/pci.c
- :internal:
-
-.. kernel-doc:: drivers/cxl/mem.c
- :doc: cxl mem
-
-.. kernel-doc:: drivers/cxl/cxlmem.h
- :internal:
-
-.. kernel-doc:: drivers/cxl/core/memdev.c
- :identifiers:
-
-CXL Port
---------
-.. kernel-doc:: drivers/cxl/port.c
- :doc: cxl port
-
-CXL Core
---------
-.. kernel-doc:: drivers/cxl/cxl.h
- :doc: cxl objects
-
-.. kernel-doc:: drivers/cxl/cxl.h
- :internal:
-
-.. kernel-doc:: drivers/cxl/core/hdm.c
- :doc: cxl core hdm
-
-.. kernel-doc:: drivers/cxl/core/hdm.c
- :identifiers:
-
-.. kernel-doc:: drivers/cxl/core/cdat.c
- :identifiers:
-
-.. kernel-doc:: drivers/cxl/core/port.c
- :doc: cxl core
-
-.. kernel-doc:: drivers/cxl/core/port.c
- :identifiers:
-
-.. kernel-doc:: drivers/cxl/core/pci.c
- :doc: cxl core pci
-
-.. kernel-doc:: drivers/cxl/core/pci.c
- :identifiers:
-
-.. kernel-doc:: drivers/cxl/core/pmem.c
- :doc: cxl pmem
-
-.. kernel-doc:: drivers/cxl/core/regs.c
- :doc: cxl registers
-
-.. kernel-doc:: drivers/cxl/core/mbox.c
- :doc: cxl mbox
-
-CXL Regions
------------
-.. kernel-doc:: drivers/cxl/core/region.c
- :doc: cxl core region
-
-.. kernel-doc:: drivers/cxl/core/region.c
- :identifiers:
-
-External Interfaces
-===================
-
-CXL IOCTL Interface
--------------------
-
-.. kernel-doc:: include/uapi/linux/cxl_mem.h
- :doc: UAPI
-
-.. kernel-doc:: include/uapi/linux/cxl_mem.h
- :internal:
--- /dev/null
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+===============================================
+Compute Express Link Driver Theory of Operation
+===============================================
+
+A Compute Express Link Memory Device is a CXL component that implements the
+CXL.mem protocol. It contains some amount of volatile memory, persistent memory,
+or both. It is enumerated as a PCI device for configuration and passing
+messages over an MMIO mailbox. Its contribution to the System Physical
+Address space is handled via HDM (Host Managed Device Memory) decoders
+that optionally define a device's contribution to an interleaved address
+range across multiple devices underneath a host-bridge or interleaved
+across host-bridges.
+
+The CXL Bus
+===========
+Similar to how a RAID driver takes disk objects and assembles them into a new
+logical device, the CXL subsystem is tasked to take PCIe and ACPI objects and
+assemble them into a CXL.mem decode topology. The need for runtime configuration
+of the CXL.mem topology is also similar to RAID in that different environments
+with the same hardware configuration may decide to assemble the topology in
+contrasting ways. One may choose performance (RAID0), striping memory across
+multiple Host Bridges and endpoints, while another may opt for fault tolerance
+and disable any striping in the CXL.mem topology.
+
+Platform firmware enumerates a menu of interleave options at the "CXL root port"
+(Linux term for the top of the CXL decode topology). From there, PCIe topology
+dictates which endpoints can participate in which Host Bridge decode regimes.
+Each PCIe Switch in the path between the root and an endpoint introduces a point
+at which the interleave can be split. For example, platform firmware may say a
+given range only decodes to one Host Bridge, but that Host Bridge may in turn
+interleave cycles across multiple Root Ports. An intervening Switch between a
+port and an endpoint may interleave cycles across multiple Downstream Switch
+Ports, etc.
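+
+As an illustration of how an interleave splits cycles at each such point,
+here is a minimal userspace C sketch (not driver code) of the modulo-based
+target selection a decoder performs. The base address matches decoder3.1 in
+the root decoder listing later in this document; the granularity and way
+count are example values::
+
+    #include <stdint.h>
+    #include <stdio.h>
+
+    /* Which of 'ways' targets decodes this host physical address? */
+    static unsigned int interleave_target(uint64_t hpa, uint64_t base,
+                                          uint64_t gran, unsigned int ways)
+    {
+            return (unsigned int)(((hpa - base) / gran) % ways);
+    }
+
+    int main(void)
+    {
+            uint64_t base = 0x8030000000ULL;        /* decoder base */
+            uint64_t gran = 256;                    /* 256B granularity */
+            unsigned int ways = 2;                  /* 2-way interleave */
+
+            for (uint64_t off = 0; off < 1024; off += 256)
+                    printf("hpa %#llx -> target %u\n",
+                           (unsigned long long)(base + off),
+                           interleave_target(base + off, base, gran, ways));
+            return 0;
+    }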
+
+Here is a sample listing of a CXL topology defined by 'cxl_test'. The 'cxl_test'
+module generates an emulated CXL topology of 2 Host Bridges, each with 2 Root
+Ports. Each of those Root Ports is connected to a 2-way switch with endpoints
+connected to those downstream ports, for a total of 8 endpoints::
+
+ # cxl list -BEMPu -b cxl_test
+ {
+ "bus":"root3",
+ "provider":"cxl_test",
+ "ports:root3":[
+ {
+ "port":"port5",
+ "host":"cxl_host_bridge.1",
+ "ports:port5":[
+ {
+ "port":"port8",
+ "host":"cxl_switch_uport.1",
+ "endpoints:port8":[
+ {
+ "endpoint":"endpoint9",
+ "host":"mem2",
+ "memdev":{
+ "memdev":"mem2",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x1",
+ "numa_node":1,
+ "host":"cxl_mem.1"
+ }
+ },
+ {
+ "endpoint":"endpoint15",
+ "host":"mem6",
+ "memdev":{
+ "memdev":"mem6",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x5",
+ "numa_node":1,
+ "host":"cxl_mem.5"
+ }
+ }
+ ]
+ },
+ {
+ "port":"port12",
+ "host":"cxl_switch_uport.3",
+ "endpoints:port12":[
+ {
+ "endpoint":"endpoint17",
+ "host":"mem8",
+ "memdev":{
+ "memdev":"mem8",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x7",
+ "numa_node":1,
+ "host":"cxl_mem.7"
+ }
+ },
+ {
+ "endpoint":"endpoint13",
+ "host":"mem4",
+ "memdev":{
+ "memdev":"mem4",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x3",
+ "numa_node":1,
+ "host":"cxl_mem.3"
+ }
+ }
+ ]
+ }
+ ]
+ },
+ {
+ "port":"port4",
+ "host":"cxl_host_bridge.0",
+ "ports:port4":[
+ {
+ "port":"port6",
+ "host":"cxl_switch_uport.0",
+ "endpoints:port6":[
+ {
+ "endpoint":"endpoint7",
+ "host":"mem1",
+ "memdev":{
+ "memdev":"mem1",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0",
+ "numa_node":0,
+ "host":"cxl_mem.0"
+ }
+ },
+ {
+ "endpoint":"endpoint14",
+ "host":"mem5",
+ "memdev":{
+ "memdev":"mem5",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x4",
+ "numa_node":0,
+ "host":"cxl_mem.4"
+ }
+ }
+ ]
+ },
+ {
+ "port":"port10",
+ "host":"cxl_switch_uport.2",
+ "endpoints:port10":[
+ {
+ "endpoint":"endpoint16",
+ "host":"mem7",
+ "memdev":{
+ "memdev":"mem7",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x6",
+ "numa_node":0,
+ "host":"cxl_mem.6"
+ }
+ },
+ {
+ "endpoint":"endpoint11",
+ "host":"mem3",
+ "memdev":{
+ "memdev":"mem3",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x2",
+ "numa_node":0,
+ "host":"cxl_mem.2"
+ }
+ }
+ ]
+ }
+ ]
+ }
+ ]
+ }
+
+In that listing each "root", "port", and "endpoint" object corresponds to a
+kernel 'struct cxl_port' object. A 'cxl_port' is a device that can decode
+CXL.mem to its descendants. So "root" claims non-PCIe-enumerable platform
+decode ranges and decodes them to "ports", "ports" decode to "endpoints", and
+"endpoints" represent the decode from SPA (System Physical Address) to DPA
+(Device Physical Address).
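+
+To sketch what that final SPA-to-DPA step looks like, here is a small
+userspace C model, under the assumption of a simple modulo interleave; it is
+illustrative only and not the kernel's implementation (the real translation
+helpers live under drivers/cxl/core/)::
+
+    #include <stdint.h>
+    #include <stdio.h>
+
+    /*
+     * Translate an SPA to a device-local DPA offset for the endpoint at
+     * interleave position 'pos' of a 'ways'-way set with granularity
+     * 'gran' bytes. Returns -1 if the address maps to another endpoint.
+     */
+    static int spa_to_dpa(uint64_t spa, uint64_t base, uint64_t gran,
+                          unsigned int ways, unsigned int pos, uint64_t *dpa)
+    {
+            uint64_t off = spa - base;
+
+            if ((off / gran) % ways != pos)
+                    return -1;      /* decoded by a different endpoint */
+
+            /* Squeeze the other ways' slots out of the offset */
+            *dpa = (off / (gran * ways)) * gran + (off % gran);
+            return 0;
+    }
+
+    int main(void)
+    {
+            uint64_t dpa;
+
+            /* 2-way, 256B granularity: offset 0x300 lands on position 1 */
+            if (!spa_to_dpa(0x8030000300ULL, 0x8030000000ULL, 256, 2, 1, &dpa))
+                    printf("dpa offset: %#llx\n", (unsigned long long)dpa);
+            return 0;
+    }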
+
+Continuing the RAID analogy, disks have both topology metadata and on-device
+metadata that determine RAID set assembly. CXL Port topology and CXL Port link
+status are the metadata for CXL.mem set assembly. The CXL Port topology is
+enumerated by the arrival of a CXL.mem device. I.e. unless and until the PCIe
+core attaches the cxl_pci driver to a CXL Memory Expander there is no role for
+CXL Port objects. Conversely, for hot-unplug / removal scenarios, there is no
+need for the Linux PCI core to tear down switch-level CXL resources because the
+endpoint ->remove() event cleans up the port data that was established to
+support that Memory Expander.
+
+The port metadata and potential decode schemes that a given memory device may
+participate in can be determined via a command like::
+
+ # cxl list -BDMu -d root -m mem3
+ {
+ "bus":"root3",
+ "provider":"cxl_test",
+ "decoders:root3":[
+ {
+ "decoder":"decoder3.1",
+ "resource":"0x8030000000",
+ "size":"512.00 MiB (536.87 MB)",
+ "volatile_capable":true,
+ "nr_targets":2
+ },
+ {
+ "decoder":"decoder3.3",
+ "resource":"0x8060000000",
+ "size":"512.00 MiB (536.87 MB)",
+ "pmem_capable":true,
+ "nr_targets":2
+ },
+ {
+ "decoder":"decoder3.0",
+ "resource":"0x8020000000",
+ "size":"256.00 MiB (268.44 MB)",
+ "volatile_capable":true,
+ "nr_targets":1
+ },
+ {
+ "decoder":"decoder3.2",
+ "resource":"0x8050000000",
+ "size":"256.00 MiB (268.44 MB)",
+ "pmem_capable":true,
+ "nr_targets":1
+ }
+ ],
+ "memdevs:root3":[
+ {
+ "memdev":"mem3",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x2",
+ "numa_node":0,
+ "host":"cxl_mem.2"
+ }
+ ]
+ }
+
+...which queries the CXL topology to ask "given the CXL Memory Expander with a
+kernel device name of 'mem3', which platform level decode ranges may this
+device participate in". A given expander can participate in multiple CXL.mem
+interleave sets simultaneously depending on how many decoder resources it has.
+In this example mem3 can participate in one or more of a PMEM interleave that
+spans 2 Host Bridges, a PMEM interleave that targets a single Host Bridge, a
+Volatile memory interleave that spans 2 Host Bridges, and a Volatile memory
+interleave that only targets a single Host Bridge.
+
+Conversely the memory devices that can participate in a given platform level
+decode scheme can be determined via a command like the following::
+
+ # cxl list -MDu -d 3.2
+ [
+ {
+ "memdevs":[
+ {
+ "memdev":"mem1",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0",
+ "numa_node":0,
+ "host":"cxl_mem.0"
+ },
+ {
+ "memdev":"mem5",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x4",
+ "numa_node":0,
+ "host":"cxl_mem.4"
+ },
+ {
+ "memdev":"mem7",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x6",
+ "numa_node":0,
+ "host":"cxl_mem.6"
+ },
+ {
+ "memdev":"mem3",
+ "pmem_size":"256.00 MiB (268.44 MB)",
+ "ram_size":"256.00 MiB (268.44 MB)",
+ "serial":"0x2",
+ "numa_node":0,
+ "host":"cxl_mem.2"
+ }
+ ]
+ },
+ {
+ "root decoders":[
+ {
+ "decoder":"decoder3.2",
+ "resource":"0x8050000000",
+ "size":"256.00 MiB (268.44 MB)",
+ "pmem_capable":true,
+ "nr_targets":1
+ }
+ ]
+ }
+ ]
+
+...where the naming scheme for decoders is "decoder<port_id>.<instance_id>".
+
+Driver Infrastructure
+=====================
+
+This section covers the driver infrastructure for a CXL memory device.
+
+CXL Memory Device
+-----------------
+
+.. kernel-doc:: drivers/cxl/pci.c
+ :doc: cxl pci
+
+.. kernel-doc:: drivers/cxl/pci.c
+ :internal:
+
+.. kernel-doc:: drivers/cxl/mem.c
+ :doc: cxl mem
+
+.. kernel-doc:: drivers/cxl/cxlmem.h
+ :internal:
+
+.. kernel-doc:: drivers/cxl/core/memdev.c
+ :identifiers:
+
+CXL Port
+--------
+.. kernel-doc:: drivers/cxl/port.c
+ :doc: cxl port
+
+CXL Core
+--------
+.. kernel-doc:: drivers/cxl/cxl.h
+ :doc: cxl objects
+
+.. kernel-doc:: drivers/cxl/cxl.h
+ :internal:
+
+.. kernel-doc:: drivers/cxl/core/hdm.c
+ :doc: cxl core hdm
+
+.. kernel-doc:: drivers/cxl/core/hdm.c
+ :identifiers:
+
+.. kernel-doc:: drivers/cxl/core/cdat.c
+ :identifiers:
+
+.. kernel-doc:: drivers/cxl/core/port.c
+ :doc: cxl core
+
+.. kernel-doc:: drivers/cxl/core/port.c
+ :identifiers:
+
+.. kernel-doc:: drivers/cxl/core/pci.c
+ :doc: cxl core pci
+
+.. kernel-doc:: drivers/cxl/core/pci.c
+ :identifiers:
+
+.. kernel-doc:: drivers/cxl/core/pmem.c
+ :doc: cxl pmem
+
+.. kernel-doc:: drivers/cxl/core/regs.c
+ :doc: cxl registers
+
+.. kernel-doc:: drivers/cxl/core/mbox.c
+ :doc: cxl mbox
+
+CXL Regions
+-----------
+.. kernel-doc:: drivers/cxl/core/region.c
+ :doc: cxl core region
+
+.. kernel-doc:: drivers/cxl/core/region.c
+ :identifiers:
+
+External Interfaces
+===================
+
+CXL IOCTL Interface
+-------------------
+
+.. kernel-doc:: include/uapi/linux/cxl_mem.h
+ :doc: UAPI
+
+.. kernel-doc:: include/uapi/linux/cxl_mem.h
+ :internal: