mirror of
https://github.com/torvalds/linux.git
synced 2026-06-03 12:03:54 +02:00
cxl: doc/linux/access-coordinates Update access coordinates calculation methods

Add documentation on how to calculate the access coordinates for a given
CXL region in detail.

Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250515000923.2590820-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>

parent 1ce91b37b2
commit fc78561595

@@ -5,6 +5,83 @@
CXL Access Coordinates Computation
==================================

Latency and Bandwidth Calculation
=================================
A memory region's performance coordinates (latency and bandwidth) are typically
provided via the ACPI tables :doc:`SRAT <../platform/acpi/srat>` and
:doc:`HMAT <../platform/acpi/hmat>`. However, the platform firmware (BIOS)
cannot annotate those for hot-plugged CXL devices, since they do not exist
during platform firmware initialization. The CXL driver can instead compute
the performance coordinates by retrieving data from several components.

The :doc:`SRAT <../platform/acpi/srat>` provides a Generic Port Affinity
subtable that ties a proximity domain to a device handle, which in this case
is the CXL host bridge. Using this association, the performance coordinates
for the Generic Port can be retrieved from the
:doc:`HMAT <../platform/acpi/hmat>` subtable. This piece represents the
performance coordinates between a CPU and a Generic Port (CXL host bridge).

The :doc:`CDAT <../platform/cdat>` provides the performance coordinates for
the CXL device itself, that is, the bandwidth and latency to access that
device's memory region. The DSMAS subtable provides a DSMADHandle that is tied
to a Device Physical Address (DPA) range. The DSLBIS subtable provides the
performance coordinates that are tied to a DSMADHandle. The handle ties the two
table entries together to provide the performance coordinates for each DPA
region. For example, if a device exports a DRAM region and a PMEM region,
then there would be different performance characteristics for each of those
regions.

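The pairing of DSMAS and DSLBIS entries through the handle can be illustrated
with a small sketch. It only models the lookup and is not the driver's code:
the ``Dsmas`` and ``Dslbis`` classes, their field names, and the sample values
are hypothetical simplifications of the CDAT structures (a real DSLBIS entry
carries one data type per entry, which is flattened here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dsmas:
    """Hypothetical, simplified DSMAS entry: a handle tied to a DPA range."""
    handle: int
    dpa_base: int
    dpa_length: int


@dataclass(frozen=True)
class Dslbis:
    """Hypothetical, simplified DSLBIS entry: performance tied to a handle."""
    handle: int
    read_latency_ps: int
    read_bandwidth_mb_s: int


def perf_by_dpa_range(dsmas_entries, dslbis_entries):
    """Join the two tables on the handle, keyed by (dpa_base, dpa_length)."""
    perf = {entry.handle: entry for entry in dslbis_entries}
    return {
        (d.dpa_base, d.dpa_length): perf[d.handle]
        for d in dsmas_entries
        if d.handle in perf  # ranges without performance data are skipped
    }


# Made-up device exporting a DRAM range (handle 0) and a PMEM range (handle 1).
dsmas = [Dsmas(0, 0x0, 0x4000_0000), Dsmas(1, 0x4000_0000, 0x4000_0000)]
dslbis = [Dslbis(0, 100_000, 30_000), Dslbis(1, 400_000, 10_000)]
ranges = perf_by_dpa_range(dsmas, dslbis)
```

Keying the result by DPA range reflects the point made above: each exported
region ends up with its own set of performance numbers.
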
If there's a CXL switch in the topology, then the performance coordinates for
the switch are provided by the SSLBIS subtable. This provides the bandwidth and
latency for traversing the switch between the switch upstream port and the
switch downstream port that points to the endpoint device.

Simple topology example::

 GP0/HB0/ACPI0016-0
        RP0
         |
         | L0
         |
     SW 0 / USP0
     SW 0 / DSP0
         |
         | L1
         |
        EP0

In this example, there is a CXL switch between an endpoint and a root port.
Latency in this example is calculated as follows:

* L(EP0) - Latency from EP0 CDAT DSMAS+DSLBIS
* L(L1) - Link latency between EP0 and SW0 DSP0
* L(SW0) - Latency for the switch from SW0 CDAT SSLBIS
* L(L0) - Link latency between SW0 and RP0
* L(RP0) - Latency from root port to CPU via SRAT and HMAT (Generic Port)

The total read and write latencies are each the sum of all these parts.

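As a minimal sketch of the latency summation, the snippet below adds up the
per-hop values for one access path. The function name and all numbers are
made up for illustration, and every value is assumed to already be expressed
in picoseconds.

```python
def total_latency_ps(parts):
    """Sum the per-hop latencies (all in picoseconds) along one path."""
    return sum(parts.values())


# Made-up read latencies for the hops of the example topology, in picoseconds.
read_latency = {
    "L(EP0)": 100_000,
    "L(L1)": 17_000,
    "L(SW0)": 30_000,
    "L(L0)": 17_000,
    "L(RP0)": 80_000,
}
total_read_latency = total_latency_ps(read_latency)
```

The write latency would be computed the same way from the write values of
each hop.
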
Bandwidth in this example is calculated as follows:

* B(EP0) - Bandwidth from EP0 CDAT DSMAS+DSLBIS
* B(L1) - Link bandwidth between EP0 and SW0 DSP0
* B(SW0) - Bandwidth for the switch from SW0 CDAT SSLBIS
* B(L0) - Link bandwidth between SW0 and RP0
* B(RP0) - Bandwidth from root port to CPU via SRAT and HMAT (Generic Port)

The total read and write bandwidths are each the min() of all these parts.

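The bandwidth rule can be sketched the same way: the slowest hop limits the
whole path. The helper names and the numbers below are hypothetical, with all
values assumed to be in MB/s.

```python
def path_bandwidth_mb_s(parts):
    """Return the bandwidth of the path: the minimum over all hops."""
    return min(parts.values())


def bottleneck(parts):
    """Return the name of the hop that limits the path bandwidth."""
    return min(parts, key=parts.get)


# Made-up read bandwidths for the hops of the example topology, in MB/s.
read_bandwidth = {
    "B(EP0)": 30_000,
    "B(L1)": 32_000,
    "B(SW0)": 64_000,
    "B(L0)": 64_000,
    "B(RP0)": 50_000,
}
total_read_bandwidth = path_bandwidth_mb_s(read_bandwidth)
limiting_hop = bottleneck(read_bandwidth)
```

With these sample values the endpoint itself is the limiting hop, so faster
links or switches upstream would not raise the result.
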
To calculate the link bandwidth::

  DataRatePerLink (MB/s) = LinkOperatingFrequency / 8
  Bandwidth (MB/s) = PCIeCurrentLinkWidth * DataRatePerLink

where LinkOperatingFrequency (GT/s) is the current negotiated link speed and
PCIeCurrentLinkWidth is the number of lanes in the link.

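A short sketch of the link bandwidth formula follows. It makes two assumptions
that are not stated above: the negotiated speed is first expressed in Mb/s per
lane (so 32 GT/s is taken as 32000), and encoding overhead is ignored. The
function and parameter names are hypothetical.

```python
BITS_PER_BYTE = 8


def link_bandwidth_mb_s(speed_mbit_s_per_lane, lanes):
    """Link bandwidth in MB/s from per-lane speed (Mb/s) and lane count."""
    # DataRatePerLink (MB/s) = LinkOperatingFrequency / 8
    data_rate_per_lane = speed_mbit_s_per_lane // BITS_PER_BYTE
    # Bandwidth (MB/s) = PCIeCurrentLinkWidth * DataRatePerLink
    return lanes * data_rate_per_lane


# Assumed example: a x8 link negotiated at 32 GT/s, taken as 32000 Mb/s per lane.
x8_link = link_bandwidth_mb_s(32_000, 8)
# Assumed example: a x16 link negotiated at 16 GT/s.
x16_link = link_bandwidth_mb_s(16_000, 16)
```

Both sample links come out at the same bandwidth, which shows how width and
speed trade off in the formula.
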
To calculate the link latency::

  LinkLatency (picoseconds) = FlitSize / LinkBandwidth (MB/s)

See `CXL Memory Device SW Guide r1.0 <https://www.intel.com/content/www/us/en/content-details/643805/cxl-memory-device-software-guide.html>`_,
sections 2.11.3 and 2.11.4 for details.

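The unit handling in the link latency formula can be made explicit with a
small sketch. It assumes the flit size is given in bytes; dividing bytes by
MB/s yields microseconds, so the result is scaled by 10^6 to reach
picoseconds. The function name, the 68-byte and 256-byte flit sizes, and the
4000 MB/s bandwidth are illustrative inputs, and the sketch does not settle
whether the bandwidth should be taken per lane or for the whole link.

```python
PS_PER_US = 1_000_000  # picoseconds per microsecond


def link_latency_ps(flit_size_bytes, link_bandwidth_mb_s):
    """Time to move one flit across the link, in picoseconds."""
    # bytes / (MB/s) gives microseconds; scale up to picoseconds.
    return flit_size_bytes * PS_PER_US // link_bandwidth_mb_s


# Illustrative inputs only: two flit sizes at an assumed 4000 MB/s.
latency_68b = link_latency_ps(68, 4_000)
latency_256b = link_latency_ps(256, 4_000)
```
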
In the end, the access coordinates for a constructed memory region are
calculated from one or more memory partitions from each of the CXL device(s).

Shared Upstream Link Calculation
================================
For certain CXL region construction with endpoints behind CXL switches (SW) or

@@ -91,5 +168,11 @@ Finally, the cxl_region_update_bandwidth() is called and the aggregated
bandwidth from all the members of the last xarray is updated for the
access coordinates residing in the cxl region (cxlr) context.

.. kernel-doc:: drivers/cxl/acpi.c
   :identifiers: cxl_acpi_evaluate_qtg_dsm

QTG ID
======
Each CFMWS entry in the :doc:`CEDT <../platform/acpi/cedt>` has a QTG ID field.
This field provides the ID of the QoS Throttling Group (QTG) associated with
that CFMWS window. Once the access coordinates are calculated, an ACPI Device
Specific Method can be issued to the ACPI0016 device to retrieve the QTG ID
that corresponds to the access coordinates provided. The QTG ID for the device
can then be used as guidance to match the device to a CFMWS and so set up the
Linux root decoder best suited to the device's performance.

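Using the QTG ID as guidance can be pictured as ranking the available windows
rather than filtering them. The sketch below is one possible interpretation,
not the driver's logic: the window dictionaries, the ``rank_windows`` helper,
and the IDs are hypothetical, and the device's QTG ID is assumed to have
already been obtained from the Device Specific Method.

```python
def rank_windows(windows, device_qtg_id):
    """Order windows so that those matching the device's QTG ID come first.

    sorted() is stable, so windows keep their original relative order
    within the matching and non-matching groups.
    """
    return sorted(windows, key=lambda w: w["qtg_id"] != device_qtg_id)


# Made-up CFMWS windows with their QTG IDs.
windows = [
    {"name": "CFMWS0", "qtg_id": 0},
    {"name": "CFMWS1", "qtg_id": 2},
    {"name": "CFMWS2", "qtg_id": 2},
]
ranked = [w["name"] for w in rank_windows(windows, device_qtg_id=2)]
```

Ranking instead of filtering keeps non-matching windows available as a
fallback, which is in line with the QTG ID being described as guidance.
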