cxl: docs/platform/acpi reference documentation

Add basic ACPI table information needed to understand the CXL
driver probe process.

Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512162134.3596150-6-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Gregory Price 2025-05-12 12:21:22 -04:00 committed by Dave Jiang
parent e4528b9ef0
commit d1ba364627
7 changed files with 264 additions and 0 deletions

@ -26,6 +26,7 @@ that have impacts on each other. The docs here break up configurations steps.
:caption: Platform Configuration
platform/bios-and-efi
platform/acpi
.. toctree::
:maxdepth: 1

@ -0,0 +1,76 @@
.. SPDX-License-Identifier: GPL-2.0
===========
ACPI Tables
===========
ACPI is the "Advanced Configuration and Power Interface", a standard that
defines how platforms and operating systems manage power and configure
computer hardware. For the purpose of this theory of operation, "ACPI" will
usually mean "ACPI Tables", which are the way a platform (BIOS/EFI)
communicates static configuration information to the operating system.
The following ACPI tables contain *static* configuration and performance data
about CXL devices.
.. toctree::
:maxdepth: 1
acpi/cedt.rst
acpi/srat.rst
acpi/hmat.rst
acpi/slit.rst
acpi/dsdt.rst
The SRAT may also contain generic port/initiator entries. These describe the
generic port itself, but not the rest of the path to the endpoint.
Linux uses these tables to configure kernel resources for statically configured
(by BIOS/EFI) CXL devices, such as:
- NUMA nodes
- Memory Tiers
- NUMA Abstract Distances
- SystemRAM Memory Regions
- Weighted Interleave Node Weights
ACPI Debugging
==============
The :code:`acpidump -b` command dumps each ACPI table to a binary file.
The :code:`iasl -d` command disassembles those files into human-readable form.
Example :code:`acpidump -b && iasl -d cedt.dat` ::
[000h 0000 4] Signature : "CEDT" [CXL Early Discovery Table]
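Before disassembling a dumped table, it can help to confirm that the binary
file is intact. The following is a minimal sketch, assuming the standard
36-byte ACPI table header layout; the names :code:`parse_acpi_header` and
:code:`AcpiHeader` are illustrative and not part of any existing tool.

```python
import struct
from typing import NamedTuple

# Common ACPI table header (little-endian, 36 bytes): signature, length,
# revision, checksum, OEM ID, OEM table ID, OEM revision, creator ID,
# creator revision.
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")


class AcpiHeader(NamedTuple):
    signature: str
    length: int
    revision: int
    checksum_ok: bool


def parse_acpi_header(blob: bytes) -> AcpiHeader:
    """Parse the common header and verify the whole-table checksum."""
    if len(blob) < ACPI_HEADER.size:
        raise ValueError("blob is shorter than an ACPI table header")
    sig, length, revision, *_ = ACPI_HEADER.unpack_from(blob)
    if length < ACPI_HEADER.size or length > len(blob):
        raise ValueError("header length field does not fit the blob")
    # All bytes of a valid table sum to zero modulo 256.
    checksum_ok = sum(blob[:length]) % 256 == 0
    return AcpiHeader(sig.decode("ascii"), length, revision, checksum_ok)
```

Calling :code:`parse_acpi_header` on the bytes of :code:`cedt.dat` should
report the signature "CEDT" and a passing checksum if the dump is sound.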
Common Issues
-------------
Most failures described here result in a failure of the driver to surface
memory as a DAX device and/or kmem.
* CEDT CFMWS targets list UIDs do not match CEDT CHBS UIDs.
* CEDT CFMWS targets list UIDs do not match DSDT CXL Host Bridge UIDs.
* CEDT CFMWS Restriction Bits are not correct.
* CEDT CFMWS Memory regions are poorly aligned.
* CEDT CFMWS Memory regions span a platform memory hole.
* CEDT CHBS UIDs do not match DSDT CXL Host Bridge UIDs.
* CEDT CHBS Specification version is incorrect.
* SRAT is missing regions described in CEDT CFMWS.
* Result: failure to create a NUMA node for the region, or
region is placed in wrong node.
* HMAT is missing data for regions described in CEDT CFMWS.
* Result: NUMA node being placed in the wrong memory tier.
* SLIT has bad data.
* Result: Kernel mechanisms that rely on NUMA distances may make poor
placement decisions.
All of these issues will appear to users as if the driver is failing to
support CXL - when in reality they are all the failure of a platform to
configure the ACPI tables correctly.
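The UID mismatches listed above can be checked mechanically once the UIDs
have been read out of the disassembled tables. This is a sketch only: the
function name and the plain-list inputs are assumptions made for
illustration, and collecting the UIDs from the tables is left to the reader.

```python
def check_host_bridge_uids(cfmws_targets, chbs_uids, dsdt_uids):
    """Return human-readable findings; an empty list means the UIDs agree."""
    findings = []
    targets, chbs, dsdt = set(cfmws_targets), set(chbs_uids), set(dsdt_uids)
    # CFMWS targets must name host bridges that the CHBS entries describe.
    for uid in sorted(targets - chbs):
        findings.append(f"CFMWS target {uid:#x} has no matching CHBS")
    # CFMWS targets must also exist as CXL host bridges in the DSDT.
    for uid in sorted(targets - dsdt):
        findings.append(f"CFMWS target {uid:#x} has no matching DSDT host bridge")
    # CHBS and DSDT must describe the same set of host bridges.
    for uid in sorted(chbs ^ dsdt):
        findings.append(f"UID {uid:#x} appears in only one of CHBS and DSDT")
    return findings
```

For example, :code:`check_host_bridge_uids([7, 6], [7, 6], [7, 6])` returns
an empty list, while dropping UID 6 from either table produces a finding.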

@ -0,0 +1,62 @@
.. SPDX-License-Identifier: GPL-2.0
================================
CEDT - CXL Early Discovery Table
================================
The CXL Early Discovery Table is generated by BIOS to describe the CXL memory
regions configured at boot by the BIOS.
CHBS
====
The CXL Host Bridge Structure describes CXL host bridges. Besides describing
device register information, it reports the UID of the host bridge. These
host bridge UIDs are referenced in other tables.
Example ::
Subtable Type : 00 [CXL Host Bridge Structure]
Reserved : 00
Length : 0020
Associated host bridge : 00000007 <- Host bridge _UID
Specification version : 00000001
Reserved : 00000000
Register base : 0000010370400000
Register length : 0000000000010000
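The listing above maps directly onto a fixed 32-byte (0x20) record. The
sketch below decodes one CHBS from raw bytes, assuming the fields are packed
little-endian in the order iasl prints them; :code:`parse_chbs` and
:code:`Chbs` are hypothetical names used only for this example.

```python
import struct
from typing import NamedTuple

# Field order follows the iasl listing: type, reserved, length, UID,
# specification version, reserved, register base, register length.
CHBS = struct.Struct("<BBHIIIQQ")


class Chbs(NamedTuple):
    uid: int
    version: int
    register_base: int
    register_length: int


def parse_chbs(raw: bytes) -> Chbs:
    """Decode a single CXL Host Bridge Structure from a CEDT subtable."""
    subtype, _, length, uid, version, _, base, size = CHBS.unpack_from(raw)
    if subtype != 0x00 or length != CHBS.size:
        raise ValueError("not a CHBS subtable")
    return Chbs(uid, version, base, size)
```

The :code:`uid` field is the value to compare against the DSDT host bridge
_UID and the CFMWS target list.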
CFMWS
=====
The CXL Fixed Memory Window structure describes a memory region associated
with one or more CXL host bridges (as described by the CHBS). It additionally
describes any inter-host-bridge interleave configuration that may have been
programmed by BIOS.
Example ::
Subtable Type : 01 [CXL Fixed Memory Window Structure]
Reserved : 00
Length : 002C
Reserved : 00000000
Window base address : 000000C050000000 <- Memory Region
Window size : 0000003CA0000000
Interleave Members (2^n) : 01 <- Interleave configuration
Interleave Arithmetic : 00
Reserved : 0000
Granularity : 00000000
Restrictions : 0006
QtgId : 0001
First Target : 00000007 <- Host Bridge _UID
Next Target : 00000006 <- Host Bridge _UID
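The CFMWS is a 36-byte fixed part followed by one 32-bit host bridge _UID per
interleave target, which is why the two-target example has a length of 0x2C.
The following sketch decodes it under that assumption. The function name and
dictionary keys are illustrative, and the interleave-ways calculation simply
follows the "2^n" label in the listing; any other encodings are not handled.

```python
import struct

# Fixed part in iasl order: type, reserved, length, reserved, window base,
# window size, interleave members, interleave arithmetic, reserved,
# granularity, restrictions, QTG ID.
CFMWS_FIXED = struct.Struct("<BBHIQQBBHIHH")


def parse_cfmws(raw: bytes) -> dict:
    """Decode one CXL Fixed Memory Window Structure and its target list."""
    (subtype, _, length, _, base, size, members, arithmetic, _,
     granularity, restrictions, qtg_id) = CFMWS_FIXED.unpack_from(raw)
    if subtype != 0x01:
        raise ValueError("not a CFMWS subtable")
    if length < CFMWS_FIXED.size or length > len(raw):
        raise ValueError("length field does not fit the buffer")
    n_targets = (length - CFMWS_FIXED.size) // 4
    targets = struct.unpack_from(f"<{n_targets}I", raw, CFMWS_FIXED.size)
    return {
        "base": base,
        "size": size,
        "end": base + size,           # first address past the window
        "ways": 1 << members,         # assumes the 2^n encoding shown above
        "arithmetic": arithmetic,
        "granularity_code": granularity,  # left undecoded in this sketch
        "restrictions": restrictions,
        "qtg_id": qtg_id,
        "targets": targets,           # host bridge _UIDs
    }
```

For the example above this yields two interleave ways and the target tuple
:code:`(7, 6)`.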
The restriction field dictates what this SPA range may be used for (memory type,
volatile vs persistent, etc). One or more bits may be set. ::
Bit[0]: CXL Type 2 Memory
Bit[1]: CXL Type 3 Memory
Bit[2]: Volatile Memory
Bit[3]: Persistent Memory
Bit[4]: Fixed Config (HPA cannot be re-used)
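As a quick illustration of the bit list above, the sketch below turns a
restrictions value into the names of the bits that are set. The helper name
:code:`decode_restrictions` is hypothetical.

```python
# Bit positions and names as listed in the restrictions table above.
RESTRICTION_BITS = {
    0: "CXL Type 2 Memory",
    1: "CXL Type 3 Memory",
    2: "Volatile Memory",
    3: "Persistent Memory",
    4: "Fixed Config",
}


def decode_restrictions(value: int) -> list[str]:
    """Return the names of all restriction bits set in value."""
    return [name for bit, name in RESTRICTION_BITS.items() if value & (1 << bit)]
```

The value 0x0006 from the CFMWS example decodes to CXL Type 3 Memory and
Volatile Memory.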
INTRA-host-bridge interleave (multiple devices on one host bridge) is NOT
reported in this structure, and is solely defined via CXL device decoder
programming (host bridge and endpoint decoders).

@ -0,0 +1,28 @@
.. SPDX-License-Identifier: GPL-2.0
==============================================
DSDT - Differentiated System Description Table
==============================================
This table describes what peripherals a machine has.
The UIDs this table assigns to CXL devices, specifically host bridges, must be
consistent with the contents of the CEDT; otherwise the CXL driver will
fail to probe correctly.
Example Compute Express Link Host Bridge ::
Scope (_SB)
{
Device (S0D0)
{
Name (_HID, "ACPI0016" /* Compute Express Link Host Bridge */) // _HID: Hardware ID
Name (_CID, Package (0x02) // _CID: Compatible ID
{
EisaId ("PNP0A08") /* PCI Express Bus */,
EisaId ("PNP0A03") /* PCI Bus */
})
...
Name (_UID, 0x05) // _UID: Unique ID
...
}
}
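To compare DSDT UIDs against the CEDT, the CXL host bridge _UID values can be
pulled out of the disassembled DSDT text. This is a rough sketch that assumes
iasl-style output, a numeric _UID, and no nested devices inside the host
bridge before its _UID; :code:`cxl_host_bridge_uids` is a hypothetical helper.

```python
import re

DEVICE_RE = re.compile(r"Device \((\w+)\)")
# iasl prints small integers as Zero/One and others as hex literals.
UID_RE = re.compile(r"Name \(_UID,\s*(0x[0-9A-Fa-f]+|Zero|One)\)")
AML_WORDS = {"Zero": 0, "One": 1}


def cxl_host_bridge_uids(dsl_text: str) -> dict[str, int]:
    """Map device name to _UID for every ACPI0016 device in the DSL text."""
    uids = {}
    # With one capture group, split() yields [prefix, name1, body1, name2, ...].
    parts = DEVICE_RE.split(dsl_text)
    for name, body in zip(parts[1::2], parts[2::2]):
        if '"ACPI0016"' not in body:
            continue  # not a CXL host bridge
        match = UID_RE.search(body)
        if match:
            token = match.group(1)
            uids[name] = AML_WORDS[token] if token in AML_WORDS else int(token, 16)
    return uids
```

Run against the example above, this reports device S0D0 with UID 5.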

@ -0,0 +1,32 @@
.. SPDX-License-Identifier: GPL-2.0
===========================================
HMAT - Heterogeneous Memory Attribute Table
===========================================
The Heterogeneous Memory Attribute Table contains information such as cache
attributes and bandwidth and latency details for memory proximity domains.
For the purpose of this document, we will only discuss the SLLBI entry.
SLLBI
=====
The System Locality Latency and Bandwidth Information (SLLBI) structure
records latency and bandwidth information for proximity domains.
This table is used by Linux to configure interleave weights and memory tiers.
Example (Heavily truncated for brevity) ::
Structure Type : 0001 [SLLBI]
Data Type : 00 <- Latency
Target Proximity Domain List : 00000000
Target Proximity Domain List : 00000001
Entry : 0080 <- DRAM LTC
Entry : 0100 <- CXL LTC
Structure Type : 0001 [SLLBI]
Data Type : 03 <- Bandwidth
Target Proximity Domain List : 00000000
Target Proximity Domain List : 00000001
Entry : 1200 <- DRAM BW
Entry : 0200 <- CXL BW
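To give a feel for how bandwidth entries could translate into interleave
weights, the sketch below reduces per-node bandwidth to the smallest integer
ratio. This is an illustrative simplification and is not claimed to be the
algorithm Linux uses; :code:`interleave_weights` is a hypothetical name.

```python
from functools import reduce
from math import gcd


def interleave_weights(bandwidth_by_node: dict[int, int]) -> dict[int, int]:
    """Reduce per-node bandwidth values to the smallest integer ratio."""
    if not bandwidth_by_node or any(bw <= 0 for bw in bandwidth_by_node.values()):
        raise ValueError("every node needs a positive bandwidth value")
    divisor = reduce(gcd, bandwidth_by_node.values())
    return {node: bw // divisor for node, bw in bandwidth_by_node.items()}
```

With the example entries (0x1200 for DRAM, 0x0200 for CXL) the ratio is 9:1,
meaning nine pages would go to the DRAM node for every page placed on the CXL
node under this simplified model.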

@ -0,0 +1,21 @@
.. SPDX-License-Identifier: GPL-2.0
========================================
SLIT - System Locality Information Table
========================================
The System Locality Information Table provides "abstract distances" between
accessor and memory nodes. Nodes without initiators (CPUs) are infinitely (FF)
distant from all other nodes.
The abstract distance described in this table does not describe any real
latency or bandwidth information.
Example ::
Signature : "SLIT" [System Locality Information Table]
Localities : 0000000000000004
Locality 0 : 0A 14 14 1E
Locality 1 : 14 0A 1E 14
Locality 2 : FF FF 0A FF
Locality 3 : FF FF FF 0A
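A SLIT can be sanity-checked with a few lines of code. The sketch below
assumes the matrix has already been converted to integers, that a locality's
distance to itself must be 10, and that 0xFF means unreachable. The function
names are hypothetical.

```python
UNREACHABLE = 0xFF  # "infinite" distance
LOCAL = 10          # expected distance from a locality to itself


def slit_problems(slit):
    """Return a list of structural problems found in a SLIT matrix."""
    problems = []
    size = len(slit)
    for i, row in enumerate(slit):
        if len(row) != size:
            problems.append(f"locality {i}: row has {len(row)} entries, expected {size}")
            continue
        if row[i] != LOCAL:
            problems.append(f"locality {i}: self-distance {row[i]} is not {LOCAL}")
    return problems


def memory_only_nodes(slit):
    """Return localities that cannot reach any other locality (no initiators)."""
    result = []
    for i, row in enumerate(slit):
        others = [d for j, d in enumerate(row) if j != i]
        if others and all(d == UNREACHABLE for d in others):
            result.append(i)
    return result
```

For a four-locality matrix shaped like the example, localities 2 and 3 are
reported as memory-only nodes.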

@ -0,0 +1,44 @@
.. SPDX-License-Identifier: GPL-2.0
=====================================
SRAT - Static Resource Affinity Table
=====================================
The System/Static Resource Affinity Table describes resource (CPU, Memory)
affinity to "Proximity Domains". This table is technically optional, but it
must be present for Linux to enumerate performance information (see "HMAT").
There is a careful dance between the CEDT and SRAT tables and how NUMA nodes are
created. If things don't look quite the way you expect - check the SRAT Memory
Affinity entries and CEDT CFMWS to determine what your platform actually
supports in terms of flexible topologies.
The SRAT may statically assign portions of a CFMWS SPA range to a specific
proximity domain. See "NUMA Node Creation" for more information about how
this is presented in the NUMA topology.
Proximity Domain
================
A proximity domain is ROUGHLY equivalent to "NUMA Node" - though a 1-to-1
mapping is not guaranteed. There are scenarios where "Proximity Domain 4" may
map to "NUMA Node 3", for example. (See "NUMA Node Creation")
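One way to picture why the mapping is not 1-to-1 is a simplified model in
which node numbers are handed out in the order proximity domains are first
seen. The sketch below implements only that model; it is an assumption made
for illustration, and the real behavior is covered by "NUMA Node Creation".

```python
def map_pxm_to_nodes(pxms_in_discovery_order):
    """Assign the next free node number to each proximity domain first seen."""
    node_of = {}
    for pxm in pxms_in_discovery_order:
        if pxm not in node_of:
            node_of[pxm] = len(node_of)
    return node_of
```

If a platform reports proximity domains 0, 1, 2 and 4, with no domain 3, this
model maps "Proximity Domain 4" to "NUMA Node 3".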
Memory Affinity
===============
Generally speaking, if a host does any amount of CXL fabric (decoder)
programming in BIOS, an SRAT entry for that memory needs to be present.
Example ::
Subtable Type : 01 [Memory Affinity]
Length : 28
Proximity Domain : 00000001 <- NUMA Node 1
Reserved1 : 0000
Base Address : 000000C050000000 <- Physical Memory Region
Address Length : 0000003CA0000000
Reserved2 : 00000000
Flags (decoded below) : 0000000B
Enabled : 1
Hot Pluggable : 1
Non-Volatile : 0
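Whether the SRAT fully covers a CFMWS window can be checked by comparing
address ranges. The sketch below assumes each range is given as a
(base, length) pair taken from the disassembled tables;
:code:`uncovered_ranges` is a hypothetical helper.

```python
def uncovered_ranges(window, srat_entries):
    """Return (base, length) pieces of window not covered by any SRAT entry."""
    start, end = window[0], window[0] + window[1]
    gaps = []
    cursor = start  # lowest address not yet known to be covered
    for base, length in sorted(srat_entries):
        # Clamp the SRAT entry to the window.
        lo, hi = max(base, start), min(base + length, end)
        if lo >= hi:
            continue  # entry lies entirely outside the window
        if lo > cursor:
            gaps.append((cursor, lo - cursor))
        cursor = max(cursor, hi)
    if cursor < end:
        gaps.append((cursor, end - cursor))
    return gaps
```

In the examples in this document the Memory Affinity entry and the CFMWS
window share the same base and length, so no gaps are reported.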