Commit Graph

99 Commits

Author SHA1 Message Date
Linus Torvalds
12bffaef28 CXL changes for v7.1
Misc patches:
 tools/testing/cxl: Enable replay of user regions as auto regions
 cxl/region: Add a region sysfs interface for region lock status
 cxl/core: Check existence of cxl_memdev_state in poison test
 cxl: Add endpoint decoder flags clear when PCI reset happens
 ACPI: NUMA: Only parse CFMWS at boot when CXL_ACPI is on
 MAINTAINERS: Update address for Dan Williams
 cxl/hdm: Add support for 32 switch decoders
 MAINTAINERS: Update Jonathan Cameron's email address
 
 Ensure endpoint completes initialization before usage:
 cxl/pci: Check memdev driver binding status in cxl_reset_done()
 cxl/pci: Hold memdev lock in cxl_event_trace_record()
 
 Type2 preparatory patches:
 cxl/region: Factor out interleave granularity setup
 cxl/region: Factor out interleave ways setup
 cxl: Make region type based on endpoint type
 cxl/pci: Remove redundant cxl_pci_find_port() call
 cxl: Move pci generic code from cxl_pci to core/cxl_pci
 cxl: export internal structs for external Type2 drivers
 cxl: support Type2 when initializing cxl_dev_state
 
 Patch series that deals with soft reserved memory conflict between CXL and HMEM:
 tools/testing/cxl: Test dax_hmem takeover of CXL regions
 tools/testing/cxl: Simulate auto-assembly failure
 dax/hmem: Parent dax_hmem devices
 dax/hmem: Fix singleton confusion between dax_hmem_work and hmem devices
 dax/hmem: Reduce visibility of dax_cxl coordination symbols
 cxl/region: Constify cxl_region_resource_contains()
 cxl/region: Limit visibility of cxl_region_contains_resource()
 dax/cxl: Fix HMEM dependencies
 cxl/region: Fix use-after-free from auto assembly failure
 dax/hmem, cxl: Defer and resolve Soft Reserved ownership
 cxl/region: Add helper to check Soft Reserved containment by CXL regions
 dax: Track all dax_region allocations under a global resource tree
 dax/cxl, hmem: Initialize hmem early and defer dax_cxl binding
 dax/hmem: Gate Soft Reserved deferral on DEV_DAX_CXL
 dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges
 dax/hmem: Factor HMEM registration into __hmem_register_device()
 dax/bus: Use dax_region_put() in alloc_dax_region() error path
 
 Refactor CXL core/region code to make region code more manageable by
 splitting out DAX and PMEM code from RAM handling code:
 cxl/core: use cleanup.h for devm_cxl_add_dax_region
 cxl/core/region: move dax region device logic into region_dax.c
 cxl/core/region: move pmem region driver logic into region_pmem.c
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmnhDpYACgkQYGjFFmlT
 OEovUw//cRx5xgX6JIYb2UNOkBD4HEF7pTGTANZD9iAnHpMXg8cUuDjyd/Fw6+oJ
 dt84ntzX9olJXioc8NONpfDwlqQs0c3Yx+J/81PoQKrcproibCUCEcc/XFZazAJM
 1dSDQLy27H1TzCycwOqSwl9aKUXBVyHQgqYRL0kTf9mjroAM8FFhwa5ICxOSi0DS
 LX89CXvXHi7hJmoC4QHGbAKs8u76bC1uBazCfHIR25uuF57Q2LQhNQkd8rRPO5H4
 9c57BzxPGMSjrW9TEabtwwOCPqyTgwGmhlEbL2spJjkkY2PEsFZ+omGVWf3Y559s
 BFrSx68L1g/SySRqj4eqlFzyxvmd+bhOzsta081e2ysLi3cKfXyksRfukze3zg+n
 HQjNeLNWnIxfLj78ZnxBkYXhuZGF4LQQvaeJhVCSgjLR9Fb2Ke3as6uK0FQs+xZj
 S9B/PdboDX1OMuv0Mkyw5wAYMOnv90eODVPBXUkpmB57xy6UntQS6WFsIQZIquX+
 y4NGEb2yVaFREWal5fEaPNsFacp8A+M6AdTi/zkz8k6zQ78Q6F68bjNsKkhv+ncb
 HkI7QFtZXvYhJ5xLGBMvYl6sxzGb6nv9GEFpCn1Fg/FkaWP76z8/0YxBu6awrmE5
 DMuTC/1jli/YUHRgVoW2H1QLDZKerB/x6fDHvv1LudeeyznMwK0=
 =uJpx
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL (Compute Express Link) updates from Dave Jiang:
 "The significant change of interest is the handling of soft reserved
  memory conflict between CXL and HMEM. In essence CXL will be the first
  to claim the soft reserved memory ranges that belong to CXL and
  attempt to enumerate them with best effort. If CXL is not able to
  enumerate the ranges it will punt them to HMEM.

  There are also MAINTAINERS email changes from Dan Williams and
  Jonathan Cameron"

* tag 'cxl-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (37 commits)
  MAINTAINERS: Update Jonathan Cameron's email address
  cxl/hdm: Add support for 32 switch decoders
  MAINTAINERS: Update address for Dan Williams
  tools/testing/cxl: Enable replay of user regions as auto regions
  cxl/region: Add a region sysfs interface for region lock status
  tools/testing/cxl: Test dax_hmem takeover of CXL regions
  tools/testing/cxl: Simulate auto-assembly failure
  dax/hmem: Parent dax_hmem devices
  dax/hmem: Fix singleton confusion between dax_hmem_work and hmem devices
  dax/hmem: Reduce visibility of dax_cxl coordination symbols
  cxl/region: Constify cxl_region_resource_contains()
  cxl/region: Limit visibility of cxl_region_contains_resource()
  dax/cxl: Fix HMEM dependencies
  cxl/region: Fix use-after-free from auto assembly failure
  cxl/core: Check existence of cxl_memdev_state in poison test
  cxl/core: use cleanup.h for devm_cxl_add_dax_region
  cxl/core/region: move dax region device logic into region_dax.c
  cxl/core/region: move pmem region driver logic into region_pmem.c
  dax/hmem, cxl: Defer and resolve Soft Reserved ownership
  cxl/region: Add helper to check Soft Reserved containment by CXL regions
  ...
2026-04-17 15:52:58 -07:00
Li Ming
3624a22783 cxl/hdm: Add support for 32 switch decoders
Per CXL r4.0 section 8.2.4.20.1. CXL host bridge and switch ports can
support 32 HDM decoders. Current implementation misses some decoders on
CXL host bridge and switch in the case that the value of Decoder Count
field in CXL HDM decoder Capability Register is greater than or equal to
9.

Update calculation implementation to ensure the decoder count calculation
is correct for CXL host bridge/switch ports.
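
A minimal sketch of the decoder count decode follows. It assumes the field encoding as I read the spec: 0h means one decoder, 1h through 8h mean twice the field value, and 9h through Ch step by four from 20 up to 32. The helper name demo_hdm_decoder_count() is hypothetical and is not the kernel function; check the table against the spec section cited above before relying on it.

```c
#include <assert.h>

/*
 * Decode the 4-bit Decoder Count field into a number of HDM decoders.
 * The encoding table is an assumption (see the text above), and the
 * function name is hypothetical.
 */
static int demo_hdm_decoder_count(unsigned int field)
{
	assert(field <= 0xf);		/* the field is 4 bits wide */

	if (field == 0)
		return 1;		/* 0h: a single decoder */
	if (field <= 8)
		return (int)field * 2;	/* 1h..8h: 2..16 decoders */
	if (field <= 12)
		return ((int)field - 4) * 4; /* 9h..Ch: 20..32 decoders */
	return -1;			/* other encodings treated as reserved */
}
```

A plain "field * 2" rule would report 18 decoders for a field value of 9h where this table gives 20, which is the kind of undercount the commit message describes for values of 9 and above.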

Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260321061459.1910205-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-04-10 08:32:14 -07:00
Smita Koralahalli
75cea0776d cxl/hdm: Avoid incorrect DVSEC fallback when HDM decoders are enabled
Check the global CXL_HDM_DECODER_ENABLE bit instead of looping over
per-decoder COMMITTED bits to determine whether to fall back to DVSEC
range emulation. When the HDM decoder capability is globally enabled,
ignore DVSEC range registers regardless of individual decoder commit
state.

should_emulate_decoders() currently loops over per-decoder COMMITTED
bits, which leads to an incorrect DVSEC fallback when those bits are
zero. One way to trigger this is to destroy a region and bounce the
memdev:

  cxl disable-region region0
  cxl destroy-region region0
  cxl disable-memdev mem0
  cxl enable-memdev mem0

Region teardown zeroes the HDM decoder registers including the committed
bits. The subsequent memdev re-probe finds uncommitted decoders and falls
back to DVSEC emulation, even though HDM remains globally enabled.
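
The difference between the two checks can be modeled in userspace as below. This is a simplified sketch, not the kernel's should_emulate_decoders(): the struct, the function names, and the bit position of DEMO_HDM_GLOBAL_ENABLE are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_DECODERS 4
#define DEMO_HDM_GLOBAL_ENABLE (UINT32_C(1) << 1) /* hypothetical bit position */

struct demo_hdm {
	uint32_t global_ctrl;			/* models the HDM global control register */
	bool committed[DEMO_MAX_DECODERS];	/* models per-decoder COMMITTED bits */
	unsigned int nr_decoders;
};

/* Old behavior as described above: emulate unless some decoder is committed. */
static bool demo_should_emulate_old(const struct demo_hdm *hdm)
{
	assert(hdm->nr_decoders <= DEMO_MAX_DECODERS);

	for (unsigned int i = 0; i < hdm->nr_decoders; i++)
		if (hdm->committed[i])
			return false;
	return true;
}

/* New behavior: only the global enable bit decides. */
static bool demo_should_emulate_new(const struct demo_hdm *hdm)
{
	return !(hdm->global_ctrl & DEMO_HDM_GLOBAL_ENABLE);
}
```

With the global bit set and every committed bit cleared (the state left behind by region teardown), the old check asks for DVSEC emulation while the new one does not.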

Observed failures:

  should_emulate_decoders: cxl_port endpoint6: decoder6.0: committed: 0 base: 0x0_00000000 size: 0x0_00000000
  devm_cxl_setup_hdm: cxl_port endpoint6: Fallback map 1 range register
  ..
  devm_cxl_add_region: cxl_acpi ACPI0017:00: decoder0.0: created region0
  __construct_region: cxl_pci 0000:e1:00.0: mem1:decoder6.0:
  __construct_region region0 res: [mem 0x850000000-0x284fffffff flags 0x200] iw: 1 ig: 4096
  cxl region0: pci0000:e0:port1 cxl_port_setup_targets expected iw: 1 ig: 4096 ..
  cxl region0: pci0000:e0:port1 cxl_port_setup_targets got iw: 1 ig: 256 state: disabled ..
  cxl_port endpoint6: failed to attach decoder6.0 to region0: -6
  ..
  devm_cxl_add_region: cxl_acpi ACPI0017:00: decoder0.0: created region4
  alloc_hpa: cxl region4: HPA allocation error (-34) ..

Fixes: 52cc48ad2a ("cxl/hdm: Limit emulation to the number of range registers")
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20260316201950.224567-1-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-03-16 16:58:32 -07:00
Alison Schofield
0a70b7cd39 cxl: Test CXL_DECODER_F_LOCK as a bitmask
The CXL decoder flags are defined as bitmasks, not bit indices.
Using test_bit() to check them interprets the mask value as a bit
index, which is the wrong test.

For CXL_DECODER_F_LOCK the test reads beyond the defined bits, causing
the test to always return false and allowing resets that should have
been blocked.

Replace test_bit() with a bitmask check.
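
The mismatch can be illustrated with a small userspace model. demo_test_bit() mimics an index-style bit test and demo_flag_set() is the mask-style check; both names and the value chosen for DEMO_F_LOCK are assumptions for illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_F_LOCK (UINT64_C(1) << 5)	/* a mask (0x20), hypothetical position */

/* Index-style test: 'nr' is a bit number, the way test_bit() treats it. */
static bool demo_test_bit(unsigned int nr, uint64_t word)
{
	assert(nr < 64);
	return (word >> nr) & 1;
}

/* Mask-style test: 'mask' is already a shifted value. */
static bool demo_flag_set(uint64_t flags, uint64_t mask)
{
	return (flags & mask) != 0;
}
```

Passing the mask 0x20 to demo_test_bit() inspects bit 32 rather than bit 5, so a word with only the lock flag set tests as false, while demo_flag_set() reports it correctly.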

Fixes: 2230c4bdc4 ("cxl: Add handling of locked CXL decoder")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Tested-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/98851c4770e4631753cf9f75b58a3a6daeca2ea2.1771873256.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-24 08:33:30 -07:00
Linus Torvalds
e812928be2 cxl changes for v7.0
- A set of commits that introduces cxl_memdev_attach and paves the way for
   soft reserved handling, type2 accelerator enabling, and LSA 2.0
   enabling. All these series require the endpoint driver to settle
   before continuing the memdev driver probe.
 
 dax/hmem, e820, resource: Defer Soft Reserved insertion until hmem is ready
 cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
 cxl/mem: Drop @host argument to devm_cxl_add_memdev()
 cxl/mem: Convert devm_cxl_add_memdev() to scope-based-cleanup
 cxl/port: Arrange for always synchronous endpoint attach
 cxl/mem: Arrange for always-synchronous memdev attach
 cxl/mem: Fix devm_cxl_memdev_edac_release() confusion
 
 - A set to address CXL port error protocol handling and reporting. The
   large patch series was split into 3 parts. Part 1 and 2 are included
   here with part 3 coming later. Part 1 consists of a series of code
   refactoring to PCI AER sub-system that addresses CXL and also CXL
   RAS code to prepare for port error handling. Part 2 refactors the
   CXL code to move management of component registers to cxl_port
   objects to allow all CXL AER errors to be handled through the
   cxl_port hierarchy.
 
 Part 2:
 cxl/port: Move endpoint component register management to cxl_port
 cxl/port: Map Port RAS registers
 cxl/port: Move dport RAS setup to dport add time
 cxl/port: Move dport probe operations to a driver event
 cxl/port: Move decoder setup before dport creation
 cxl/port: Cleanup dport removal with a devres group
 cxl/port: Reduce number of @dport variables in cxl_port_add_dport()
 cxl/port: Cleanup handling of the nr_dports 0 -> 1 transition
 
 Part 1:
 cxl: Update RAS handler interfaces to also support CXL Ports
 cxl/mem: Clarify @host for devm_cxl_add_nvdimm()
 PCI/AER: Update struct aer_err_info with kernel-doc formatting
 PCI/AER: Report CXL or PCIe bus type in AER trace logging
 PCI/AER: Use guard() in cxl_rch_handle_error_iter()
 PCI/AER: Move CXL RCH error handling to aer_cxl_rch.c
 PCI/AER: Update is_internal_error() to be non-static is_aer_internal_error()
 PCI/AER: Export pci_aer_unmask_internal_errors()
 cxl/pci: Move CXL driver's RCH error handling into core/ras_rch.c
 PCI/AER: Replace PCIEAER_CXL symbol with CXL_RAS
 cxl/pci: Remove CXL VH handling in CONFIG_PCIEAER_CXL conditional blocks from core/pci.c
 PCI: Replace cxl_error_is_native() with pcie_aer_is_native()
 cxl/pci: Remove unnecessary CXL RCH handling helper functions
 cxl/pci: Remove unnecessary CXL Endpoint handling helper functions
 PCI: Introduce pcie_is_cxl()
 PCI: Update CXL DVSEC definitions
 PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h
 
 - A set of patches to provide AMD Zen5 platform address translation for
   CXL using ACPI PRMT. Set includes a conventions document to explain
   why this is needed and how it's implemented.
 
 cxl: Disable HPA/SPA translation handlers for Normalized Addressing
 cxl/region: Factor out code into cxl_region_setup_poison()
 cxl/atl: Lock decoders that need address translation
 cxl: Enable AMD Zen5 address translation using ACPI PRMT
 cxl/acpi: Prepare use of EFI runtime services
 cxl: Introduce callback for HPA address ranges translation
 cxl/region: Use region data to get the root decoder
 cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
 cxl/region: Separate region parameter setup and region construction
 cxl: Simplify cxl_root_ops allocation and handling
 cxl/region: Store HPA range in struct cxl_region
 cxl/region: Store root decoder in struct cxl_region
 cxl/region: Rename misleading variable name @hpa to @hpa_range
 Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
 cxl, doc: Moving conventions in separate files
 cxl, doc: Remove isonum.txt inclusion
 
 - A set of misc CXL patches of fixes, cleanups, and updates. Including
   CXL address translation for unaligned MOD3 regions.
 
 cxl: Fix premature commit_end increment on decoder commit failure
 cxl/region: Use do_div() for 64-bit modulo operation
 cxl/region: Translate HPA to DPA and memdev in unaligned regions
 cxl/region: Translate DPA->HPA in unaligned MOD3 regions
 cxl/core: Fix cxl_dport debugfs EINJ entries
 cxl/acpi: Remove cxl_acpi_set_cache_size()
 cxl/hdm: Fix newline character in dev_err() messages
 cxl/pci: Remove outdated FIXME comment and BUILD_BUG_ON
 Documentation/driver-api/cxl: device hotplug section
 Documentation/driver-api/cxl: BIOS/EFI expectation update
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmmOFXcACgkQYGjFFmlT
 OEojaxAApQJFLyX1MkPbhtm6j6GRzzEAEWTBX2XsmliZf1JhfahsNMWI69kO33rm
 LddF+nyZNEl/foyHgUaxVzlQwqWuihyp7Qk2djXnMzLsuCAsWhPbB9j0RgJUN8h5
 N4U76AmOdmhLlXH4CCqoW2jNy0OjxNdgp1FtTHv7VO7RxgRE9MFJRkLulKxB03wy
 t6lRZXPofEFcHen40DlYRtW26vy1BYUO0dng2f16DxWrb1ztdACH/zVqCJJtdoFc
 FAT5EaQCeRYZ9Yz4dONw3DcUjYlG6NcRN9FWNiptBn1Pb7pUX55Le8lfD3qZg0an
 m3lWRs1T/lGz7pWmz4GPUKDwGFCEqLqd4oSz5v+dFR3JJxjJpRzKa19y5TfqK/LF
 diqNZsDD9gCXE1HXzNr1YcbllpU2cPRPf58gWG9bLmG5xUUmScib8LoTMfgcCJW5
 SlC6kf7BFLkJfDTcFaILc/UANeZaLGhrV0vyJntfGyT5EqKOcfjQEvrZvofA8mef
 bdxt0IRDW4D+7kkcuR33OipTVUFG3ban8yYq4zXD64dmeHF76gwdJm3nyXsqdtpc
 IYIIhz0W6pbTKjJ2fy1rZcTac1ZaALstyaF4bYWIjyF3NylPM8tDi48DFr+DGgeX
 xkFs2B9p5vY5Cq73gCmSWsi3PBPTjWzeRp7YZrV6VoBd9uqewUs=
 =blFQ
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL updates from Dave Jiang:

 - Introduce cxl_memdev_attach and pave way for soft reserved handling,
   type2 accelerator enabling, and LSA 2.0 enabling. All these series
   require the endpoint driver to settle before continuing the memdev
   driver probe.

 - Address CXL port error protocol handling and reporting.

   The large patch series was split into three parts. The first two
   parts are included here with the final part coming later.

   The first part consists of a series of code refactoring to PCI AER
   sub-system that addresses CXL and also CXL RAS code to prepare for
   port error handling.

   The second part refactors the CXL code to move management of
   component registers to cxl_port objects to allow all CXL AER errors
   to be handled through the cxl_port hierarchy.

 - Provide AMD Zen5 platform address translation for CXL using ACPI
   PRMT. This includes a conventions document to explain why this is
   needed and how it's implemented.

 - Misc CXL patches of fixes, cleanups, and updates. Including CXL
   address translation for unaligned MOD3 regions.

[ TLA service: CXL is "Compute Express Link" ]

* tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits)
  cxl: Disable HPA/SPA translation handlers for Normalized Addressing
  cxl/region: Factor out code into cxl_region_setup_poison()
  cxl/atl: Lock decoders that need address translation
  cxl: Enable AMD Zen5 address translation using ACPI PRMT
  cxl/acpi: Prepare use of EFI runtime services
  cxl: Introduce callback for HPA address ranges translation
  cxl/region: Use region data to get the root decoder
  cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
  cxl/region: Separate region parameter setup and region construction
  cxl: Simplify cxl_root_ops allocation and handling
  cxl/region: Store HPA range in struct cxl_region
  cxl/region: Store root decoder in struct cxl_region
  cxl/region: Rename misleading variable name @hpa to @hpa_range
  Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
  cxl, doc: Moving conventions in separate files
  cxl, doc: Remove isonum.txt inclusion
  cxl/port: Unify endpoint and switch port lookup
  cxl/port: Move endpoint component register management to cxl_port
  cxl/port: Map Port RAS registers
  cxl/port: Move dport RAS setup to dport add time
  ...
2026-02-12 16:33:05 -08:00
Dave Jiang
0da3050bdd Merge branch 'for-7.0/cxl-aer-prep' into cxl-for-next
Fixup and refactor downstream port enumeration to prepare for CXL port
protocol error handling. Main motivation is to move endpoint
component register mapping to a port object.

cxl/port: Unify endpoint and switch port lookup
cxl/port: Move endpoint component register management to cxl_port
cxl/port: Map Port RAS registers
cxl/port: Move dport RAS setup to dport add time
cxl/port: Move dport probe operations to a driver event
cxl/port: Move decoder setup before dport creation
cxl/port: Cleanup dport removal with a devres group
cxl/port: Reduce number of @dport variables in cxl_port_add_dport()
cxl/port: Cleanup handling of the nr_dports 0 -> 1 transition
2026-02-02 09:39:41 -07:00
Dan Williams
3864cb60da cxl/port: Move dport probe operations to a driver event
In preparation for adding more register setup to the cxl_port_add_dport()
path (for RAS register mapping), move the dport creation event to a driver
callback. This achieves two goals: it puts driver operations logically
where they belong, in a driver, and it obviates the gymnastics of
DECLARE_TESTABLE(), which just makes a mess of grepping for CXL symbols.

In other words, a driver callback is less of an ongoing maintenance burden
than this DECLARE_TESTABLE arrangement that does not scale and diminishes
the grep-ability of the codebase.

cxl_port_add_dport() moves mostly unmodified from drivers/cxl/core/port.c.
The only deliberate change is that it now assumes that the device_lock is
held on entry and the driver is attached (just like cxl_port_probe()).

Reviewed-by: Terry Bowman <terry.bowman@amd.com>
Tested-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20260131000403.2135324-6-dan.j.williams@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-02 08:41:29 -07:00
Yuxiong Wang
7b6f9d9b1e cxl: Fix premature commit_end increment on decoder commit failure
In cxl_decoder_commit(), commit_end is incremented before verifying
whether the commit succeeded, and the CXL_DECODER_F_ENABLE bit in
cxld->flags is only set after a successful commit. As a result, if the
commit fails, commit_end has been incremented and cxld->reset() has no
effect since the flag is not set, so commit_end remains incorrectly
incremented. The inconsistency between commit_end and CXL_DECODER_F_ENABLE
causes failures during subsequent commit or reset operations.

Fix this by incrementing commit_end only after confirming the commit
succeeded. Also, remove the ineffective cxld->reset() call. According to
CXL Spec r4.0 8.2.4.20.12 Committing Decoder Programming, since
cxld_await_commit() has cleared the decoder commit bit on failure, no
additional reset is required.
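
The ordering fix can be sketched with a small model. The structs, the hw_ok parameter standing in for the hardware commit result, and the DEMO_F_ENABLE value are hypothetical; this is not the body of cxl_decoder_commit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_F_ENABLE 0x1u	/* hypothetical flag value */

struct demo_port { int commit_end; };		/* index of the last committed decoder */
struct demo_decoder { unsigned int flags; };

/* hw_ok stands in for the outcome of waiting on the hardware commit. */
static int demo_commit(struct demo_port *port, struct demo_decoder *d, bool hw_ok)
{
	assert(port != NULL && d != NULL);

	if (!hw_ok)
		return -1;		/* failure: bookkeeping left untouched */

	port->commit_end++;		/* advance only after the commit succeeded */
	d->flags |= DEMO_F_ENABLE;
	return 0;
}
```

Because both updates happen on the success path only, commit_end and the enable flag cannot disagree after a failed commit.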

[dj: Fixed commit log 80 char wrapping. ]
[dj: Fix "Fixes" tag to correct hash length. ]
[dj: Change spec to r4.0. ]

Fixes: 176baefb2e ("cxl/hdm: Commit decoder state to hardware")
Signed-off-by: Yuxiong Wang <yuxiong.wang@linux.alibaba.com>
Acked-by: Huang Ying <ying.huang@linux.alibaba.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20260129064552.31180-1-yuxiong.wang@linux.alibaba.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-29 11:00:35 -07:00
Robert Richter
e5b1887619 cxl/hdm: Fix newline character in dev_err() messages
The newline character is not placed at the end of the string. This
causes unintended line wraps, broken log level and unterminated log
messages. Fix that for all messages.

Note that the messages are changed to use colons now instead of
parentheses, which is more common use.

Fixes: 24b1819718 ("cxl/hdm: Extend DVSEC range register emulation for region enumeration")
Fixes: 9c57cde0dc ("cxl/hdm: Enumerate allocated DPA")
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260109122952.639231-1-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-22 16:58:13 -07:00
Robert Richter
8441c7d3bd cxl: Check for invalid addresses returned from translation functions on errors
Translation functions may return an invalid address in case of errors.
If the address is not checked the further use of the invalid value
will cause an address corruption.

Consistently check for a valid address returned by translation
functions. Use RESOURCE_SIZE_MAX to indicate an invalid address for
type resource_size_t. Depending on the type either RESOURCE_SIZE_MAX
or ULLONG_MAX is used to indicate an address error.

Propagating an invalid address from a failed translation may cause
userspace to think it has received a valid SPA, when in fact it is
wrong. The CXL userspace API, using trace events, expects ULLONG_MAX
to indicate a translation failure. If ULLONG_MAX is not returned
immediately, subsequent calculations can transform that bad address
into a different value (!ULLONG_MAX), and an invalid SPA may be
returned to userspace. This can lead to incorrect diagnostics and
erroneous corrective actions.
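
A minimal sketch of the sentinel check, assuming ULLONG_MAX as the error value as described above. The two-step translation, its placeholder math, and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_ADDR_INVALID ULLONG_MAX

/* Hypothetical first step: fails for DPAs at or beyond 'size'. */
static unsigned long long demo_dpa_to_hpa(unsigned long long dpa,
					  unsigned long long size)
{
	if (dpa >= size)
		return DEMO_ADDR_INVALID;
	return dpa * 2;	/* placeholder math standing in for interleave decode */
}

/* Check right away so later arithmetic cannot disguise the sentinel. */
static unsigned long long demo_dpa_to_spa(unsigned long long dpa,
					  unsigned long long size,
					  unsigned long long base)
{
	unsigned long long hpa = demo_dpa_to_hpa(dpa, size);

	if (hpa == DEMO_ADDR_INVALID)
		return DEMO_ADDR_INVALID;
	return hpa + base;
}
```

Without the early return, adding a base of 0x1000 to ULLONG_MAX wraps around to 0xfff, which looks like an ordinary address to anything downstream.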

[ dj: Added user impact statement from Alison. ]
[ dj: Fixed checkpatch tab alignment issue. ]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Fixes: c3dd67681c ("cxl/region: Add inject and clear poison by region offset")
Fixes: b78b9e7b79 ("cxl/region: Refactor address translation funcs for testing")
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260107120544.410993-1-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-13 08:30:40 -07:00
Li Ming
d4026a4462 cxl/hdm: Fix potential infinite loop in __cxl_dpa_reserve()
__cxl_dpa_reserve() checks whether the new resource range is contained in
one of the partitions of the CXL memory device. cxlds->nr_partitions
records how many partitions the CXL memory device has. If the driver
cannot find a partition that includes the new resource range, the loop
never terminates.
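
A bounded version of such a search might look like the sketch below. The types and the function name are hypothetical, and the commit message does not show the original loop, so this only illustrates a terminating containment search.

```c
#include <assert.h>
#include <stddef.h>

struct demo_range {
	unsigned long long start;
	unsigned long long end;	/* inclusive */
};

/* Return the index of the partition containing 'res', or -1 if none does. */
static int demo_find_partition(const struct demo_range *parts, int nr_parts,
			       struct demo_range res)
{
	assert(parts != NULL || nr_parts == 0);

	for (int i = 0; i < nr_parts; i++)	/* bounded by nr_parts, so it ends */
		if (res.start >= parts[i].start && res.end <= parts[i].end)
			return i;
	return -1;
}
```

The "not found" case falls out of the loop and is reported to the caller instead of spinning.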

[ dj: Removed incorrect fixes tag ]

Fixes: 991d98f17d ("cxl: Make cxl_dpa_alloc() DPA partition number agnostic")
Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260112120526.530232-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-12 08:59:16 -07:00
Dave Jiang
2230c4bdc4 cxl: Add handling of locked CXL decoder
When a decoder is locked, it means that its configuration cannot be
changed. CXL spec r3.2 8.2.4.20.13 discusses the details regarding
locked decoders. Locking happens when bit 8 of the decoder control
register is set and then the decoder is committed afterwards (CXL
spec r3.2 8.2.4.20.7).

Given that the driver creates a virtual decoder for each CFMWS, the
Fixed Device Configuration (bit 4) of the Window Restriction field is
considered as locking for the virtual decoder by the driver.

The current driver code disregards the locked status and a region can
be destroyed regardless of the locking state.

Add a region flag to indicate the region is in a locked configuration.
The driver will consider a region locked if the CFMWS or any decoder
is configured as locked. The consideration is all or nothing regarding
the locked state. It is reasonable to determine the region "locked"
status while the region is being assembled based on the decoders.

Add a check in region commit_store() to intercept when a 0 is written
to the commit sysfs attribute in order to prevent the destruction of a
region when in locked state. This should be the only entry point from user
space to destroy a region.

Add a check to cxl_decoder_reset() to prevent resetting a locked
decoder within the kernel driver.
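
The all-or-nothing rule and the commit check can be modeled as follows. The struct, the function names, and the choice of -EPERM as the error code are assumptions for illustration; the real sysfs handler is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_region {
	bool locked;
	bool committed;
};

/* A region is locked if the window is fixed or any decoder is locked. */
static void demo_region_assemble(struct demo_region *r, bool window_fixed,
				 const bool *decoder_locked, int nr_decoders)
{
	assert(r != NULL);

	r->locked = window_fixed;
	for (int i = 0; i < nr_decoders; i++)
		if (decoder_locked[i])
			r->locked = true;
}

/* Models writing 0 or 1 to the commit attribute. */
static int demo_commit_store(struct demo_region *r, bool commit)
{
	if (!commit && r->locked)
		return -EPERM;	/* assumed error code: refuse to tear down */
	r->committed = commit;
	return 0;
}
```

A single locked decoder is enough to make the whole region refuse a decommit in this model.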

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251105201826.2901915-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-12 13:17:19 -07:00
Dave Jiang
46037455cb Merge branch 'for-6.18/cxl-delay-dport' into cxl-for-next
Add changes to delay the allocation and setup of dports until when the
endpoint device is being probed. At this point, the CXL link is
established from endpoint to host bridge. Addresses issues seen on
some platforms when dports are probed earlier.

Link: https://lore.kernel.org/linux-cxl/20250829180928.842707-1-dave.jiang@intel.com/
2025-09-18 14:34:51 -07:00
Dave Jiang
644685abc1 cxl/test: Adjust the mock version of devm_cxl_switch_port_decoders_setup()
With devm_cxl_switch_port_decoders_setup() being called within cxl_core
instead of by the port driver probe, adjustments are needed to deal with
circular symbol dependency when this function is being mock'd. Add the
appropriate changes to get around the circular dependency.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-18 09:55:23 -07:00
Dave Jiang
4f06d81e7c cxl: Defer dport allocation for switch ports
The current implementation enumerates the dports during the cxl_port
driver probe. Without an endpoint connected, the dport may not be
active during port probe. This scheme may prevent a valid hardware
dport id from being retrieved and MMIO registers from being read when an
endpoint is hot-plugged. Move the dport allocation and setup until after
memdev probe so the endpoint is guaranteed to be connected.

In the original enumeration behavior, there are 3 phases (or 2 if no CXL
switches) for port creation. cxl_acpi() creates a Root Port (RP) from the
ACPI0017.N device. Through that it enumerates downstream ports composed
of ACPI0016.N devices through add_host_bridge_dport(). Once done, it
uses add_host_bridge_uport() to create the ports that enumerate the PCI
RPs as the dports of these ports. Every time a port is created, the port
driver is attached, cxl_switch_port_probe() is called and
devm_cxl_port_enumerate_dports() is invoked to enumerate and probe
the dports.

The second phase is if there are any CXL switches. When the pci endpoint
device driver (cxl_pci) calls probe, it will add a mem device and trigger
cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports()
and attempts to discover and create all the ports that represent CXL switches.
During this phase, a port is created per switch and the attached dports
are also enumerated and probed.

The last phase is creating endpoint port which happens for all endpoint
devices.

In the new sequence, instead of creating all possible dports at initial
port creation, port instantiation is deferred until a memdev beneath that
dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize
the creation and extension of ports with new dports as memory devices
arrive. As part of this rework, the switch decoder target list is amended
at runtime as dports show up.

The decoders are still allocated during the port driver probe, but they
must now also be updated later, since previously they were set up once
all the dports had been set up. Now, every time a dport is set up for an
endpoint, the switch target list needs to be updated with the new dport. A
guard(rwsem_write) is used to update decoder targets. This is similar to
when decoder_populate_target() is called and the decoder programming
must be protected.
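
The runtime target update can be pictured with the model below. It assumes a cached list of hardware port ids that is known at init, plus a slot per target that is filled in when the matching dport arrives. All names are hypothetical and the locking mentioned above is left out.

```c
#include <assert.h>

#define DEMO_MAX_TARGETS 4
#define DEMO_NO_DPORT (-1)

struct demo_switch_decoder {
	int target_map[DEMO_MAX_TARGETS];	/* cached hardware port ids */
	int target[DEMO_MAX_TARGETS];		/* bound dport id, or DEMO_NO_DPORT */
	int nr_targets;
};

/* Called when a dport shows up; returns the slot it filled, or -1. */
static int demo_dport_arrived(struct demo_switch_decoder *d, int port_id)
{
	assert(d->nr_targets <= DEMO_MAX_TARGETS);

	for (int i = 0; i < d->nr_targets; i++) {
		if (d->target_map[i] == port_id) {
			d->target[i] = port_id;
			return i;
		}
	}
	return -1;	/* this decoder does not target that dport */
}
```

Each arrival fills exactly one slot, so the target list converges on the cached map as endpoints are probed.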

Also the port registers are probed the first time when the first dport
shows up. This ensures that the CXL link is established when the port
registers are probed.

[dj] Use ERR_CAST() (Jonathan)

Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-18 09:55:22 -07:00
Dave Jiang
68d5d9734c cxl/test: Refactor decoder setup to reduce cxl_test burden
Group the decoder setup code in switch and endpoint port probe into a
single function for each to reduce the number of functions to be mocked
in cxl_test. Introduce devm_cxl_switch_port_decoders_setup() and
devm_cxl_endpoint_decoders_setup(). These two functions will be mocked
instead with some functions optimized out since the mock version does
not do anything. Remove devm_cxl_setup_hdm(),
devm_cxl_add_passthrough_decoder(), and devm_cxl_enumerate_decoders() in
cxl_test mock code. In turn, mock_cxl_add_passthrough_decoder() can be
removed since cxl_test does not setup passthrough decoders.
__wrap_cxl_hdm_decode_init() and __wrap_cxl_dvsec_rr_decode() can be
removed as well since they only return 0 when called.

[dj: drop 'struct cxl_port' forward declaration (Robert)]

Suggested-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-18 09:54:50 -07:00
Dave Jiang
02edab6cee cxl: Add a cached copy of target_map to cxl_decoder
Add a cached copy of the hardware port-id list that is available at init
before all @dport objects have been instantiated. Change is in preparation
of delayed dport instantiation.

Reviewed-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-17 08:53:24 -07:00
Xichao Zhao
22fb4ad898 cxl/hdm: Use str_plural() to simplify the code
Use the string choice helper function str_plural() to simplify the code.

Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250811122519.543554-1-zhao.xichao@vivo.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-08-12 14:39:24 -07:00
Dave Jiang
b873adfdde Merge branch 'for-6.17/cxl-acquire' into cxl-for-next
Introduce ACQUIRE() and ACQUIRE_ERR() for conditional locks.
Convert CXL subsystem to use the new macros.
2025-07-16 13:30:17 -07:00
Dan Williams
d03fcf50ba cxl: Convert to ACQUIRE() for conditional rwsem locking
Use ACQUIRE() to clean up conditional locking paths in the CXL driver.
The ACQUIRE() macro and its associated ACQUIRE_ERR() helpers, like
scoped_cond_guard(), arrange for scope-based conditional locking. Unlike
scoped_cond_guard(), these macros arrange for an ERR_PTR() to be retrieved
representing the state of the conditional lock.

The goal of this conversion is to complete the removal of all explicit
unlock calls in the subsystem. I.e. the methods to acquire a lock are
solely via guard(), scoped_guard() (for limited cases), or ACQUIRE(). All
unlock is implicit / scope-based. In order to make sure all lock sites are
converted, the existing rwsem's are consolidated and renamed in 'struct
cxl_rwsem'. While that makes the patch noisier it gives a clean cut-off
between old-world (explicit unlock allowed), and new world (explicit unlock
deleted).
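
The shape of the pattern can be sketched in userspace with the compiler's
cleanup attribute (a GCC/Clang extension). Everything below is a toy
model under that assumption: struct toy_lock, TOY_ACQUIRE() and
update_value() are made-up names, and the real ACQUIRE() and
ACQUIRE_ERR() macros differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: just enough state to observe acquire and release. */
struct toy_lock {
	bool held;
	bool fail_next;	/* simulates an interrupted conditional acquire */
};

/* Guard object: remembers the lock and the result of the acquire attempt. */
struct toy_guard {
	struct toy_lock *lock;
	int err;
};

static struct toy_guard toy_acquire(struct toy_lock *lock)
{
	struct toy_guard g = { .lock = lock, .err = 0 };

	if (lock->fail_next)
		g.err = -EINTR;
	else if (lock->held)
		g.err = -EBUSY;
	else
		lock->held = true;
	return g;
}

/* Runs when the guard leaves scope; unlocks only if the acquire succeeded. */
static void toy_release(struct toy_guard *g)
{
	if (g->err == 0)
		g->lock->held = false;
}

#define TOY_ACQUIRE(name, lock) \
	struct toy_guard name __attribute__((cleanup(toy_release))) = \
		toy_acquire(lock)

/* Every return path drops the lock implicitly; there is no unlock call. */
static int update_value(struct toy_lock *lock, int *value, int new_value)
{
	assert(lock != NULL && value != NULL);

	TOY_ACQUIRE(guard, lock);

	if (guard.err)
		return guard.err;
	if (new_value < 0)
		return -EINVAL;
	*value = new_value;
	return 0;
}
```

The caller sees the outcome of the conditional acquire as an error code,
and an early return on a validation failure cannot leak the lock.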

Cc: David Lechner <dlechner@baylibre.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Shiju Jose <shiju.jose@huawei.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250711234932.671292-9-dan.j.williams@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-07-16 11:34:36 -07:00
Dan Williams
55a89d9c99 cxl/decoder: Drop pointless locking
cxl_dpa_rwsem coordinates changes to dpa allocation settings for a given
decoder. cxl_decoder_reset() has no need for a consistent snapshot of the
dpa settings since it is merely clearing out whatever was there previously.

Otherwise, cxl_region_rwsem protects against 'reset' racing 'setup'.

In preparation for converting to rw_semaphore_acquire semantics, drop this
locking.

Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250711234932.671292-5-dan.j.williams@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-07-16 11:34:36 -07:00
Dan Williams
7cb3b42a6b cxl/decoder: Move decoder register programming to a helper
In preparation for converting to rw_semaphore_acquire semantics move the
contents of an open-coded {down,up}_read(&cxl_dpa_rwsem) section to a
helper function.

Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250711234932.671292-4-dan.j.williams@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-07-16 11:34:36 -07:00
Li Ming
5b6031c832 cxl/core: Introduce a new helper cxl_resource_contains_addr()
In the CXL subsystem, many functions need to check whether an address is
valid by checking if a resource range contains it. Provide a new helper
function, cxl_resource_contains_addr(), to check if the resource range
contains the input address.
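
A minimal userspace sketch of such a containment check follows. It
assumes an inclusive [start, end] range; struct toy_resource and
toy_resource_contains_addr() are illustrative names rather than the
kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a resource: an inclusive address range. */
struct toy_resource {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* True when addr falls inside [res->start, res->end]. */
static bool toy_resource_contains_addr(const struct toy_resource *res,
					uint64_t addr)
{
	assert(res->start <= res->end);
	return addr >= res->start && addr <= res->end;
}
```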

Suggested-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Li Ming <ming.li@zohomail.com>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250711032357.127355-2-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-07-11 09:46:53 -07:00
Dan Carpenter
a223ce1957 cxl/hdm: Clean up a debug printk
Smatch complains that %pa is for phys_addr_t types and "size" is a u64.

    drivers/cxl/core/hdm.c:521 cxl_dpa_alloc() error: '%pa' expects
    argument of type 'phys_addr_t*', argument 4 has type 'ullong*'

Looking at this, to me it seems more useful to print the sizes as
decimal instead of hex.  Let's do that.
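
For comparison, a userspace sketch of printing 64-bit sizes as decimal
with a matching format specifier is shown below. The message text and the
function name format_alloc_failure() are invented for illustration and
are not the driver's debug message.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Print u64 sizes in decimal; PRIu64 keeps specifier and argument in sync. */
static int format_alloc_failure(char *buf, size_t len, uint64_t size,
				uint64_t avail)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
			"requested %" PRIu64 " bytes, %" PRIu64 " available",
			size, avail);
}
```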

[dj: Adjusted based on latest code changes. ]

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/3d3d969d-651d-4e9d-a892-900876a60ab5@moroto.mountain
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 14:04:08 -07:00
Robert Richter
98a863fee2 cxl: Add a dev_dbg() when a decoder was added to a port
Improve debugging by adding and unifying messages whenever a decoder
was added to a port. It is especially useful to get the decoder
mapping of the involved CXL host bridge or PCI device. This avoids a
complex lookup of the decoder/port/device mappings in sysfs.

Example log messages:

  cxl_acpi ACPI0017:00: decoder0.0 added to root0
  cxl_acpi ACPI0017:00: decoder0.1 added to root0
  ...
   pci0000:e0: decoder1.0 added to port1
   pci0000:e0: decoder1.1 added to port1
  ...
  cxl_mem mem0: decoder5.0 added to endpoint5
  cxl_mem mem0: decoder5.1 added to endpoint5

Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Gregory Price <gourry@gourry.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250509150700.2817697-15-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-09 09:57:43 -07:00
Dave Jiang
b6faa9c613 Merge branch 'for-6.15/guard_cleanups' into cxl-for-next2
A series of CXL refactoring using scope based resource management to
remove goto patterns on the cleanup paths.
2025-03-14 16:11:06 -07:00
Li Ming
a81ebe7d19 cxl/core: Use guard() to drop goto pattern of cxl_dpa_alloc()
In cxl_dpa_alloc(), some checking and operations need to be protected by
a rwsem called cxl_dpa_rwsem, so there is a goto pattern in
cxl_dpa_alloc() to release the rwsem. The goto pattern can be optimized
by using guard() to hold the rwsem.

Create a new function called __cxl_dpa_alloc() that contains all checks
and operations that need to be protected by cxl_dpa_rwsem, and use
guard(rwsem_write()) to hold cxl_dpa_rwsem at the beginning of the new
function.
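
The before/after shape of this kind of conversion can be sketched in
userspace as follows. This is a toy model assuming the GCC/Clang cleanup
attribute; struct toy_rwsem, TOY_GUARD_WRITE() and both alloc functions
are hypothetical and only mirror the structure, not cxl_dpa_alloc().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy write lock: a counter is enough to check balance. */
struct toy_rwsem {
	int writers;
};

static void toy_down_write(struct toy_rwsem *sem) { sem->writers++; }
static void toy_up_write(struct toy_rwsem *sem) { sem->writers--; }

/* Cleanup handler invoked when the guard variable goes out of scope. */
static void toy_guard_exit(struct toy_rwsem **sem) { toy_up_write(*sem); }

#define TOY_GUARD_WRITE(sem) \
	struct toy_rwsem *toy_guard __attribute__((cleanup(toy_guard_exit))) = \
		(toy_down_write(sem), (sem))

/* Before: every failure jumps to a shared unlock label. */
static int alloc_with_goto(struct toy_rwsem *sem, long avail, long size)
{
	int rc = 0;

	assert(sem != NULL);
	toy_down_write(sem);
	if (size <= 0) {
		rc = -EINVAL;
		goto out;
	}
	if (size > avail)
		rc = -ENOSPC;
out:
	toy_up_write(sem);
	return rc;
}

/* After: the locked section takes the guard first and returns directly. */
static int alloc_with_guard(struct toy_rwsem *sem, long avail, long size)
{
	assert(sem != NULL);

	TOY_GUARD_WRITE(sem);

	if (size <= 0)
		return -EINVAL;
	if (size > avail)
		return -ENOSPC;
	return 0;
}
```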

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Li Ming <ming.li@zohomail.com>
Link: https://patch.msgid.link/20250221012453.126366-6-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14 14:45:04 -07:00
Li Ming
16fe6ec4ac cxl/core: Use guard() to drop the goto pattern of cxl_dpa_free()
cxl_dpa_free() has a goto pattern to call up_write() for cxl_dpa_rwsem,
it can be removed by using a guard() to replace the down_write() and
up_write().

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Li Ming <ming.li@zohomail.com>
Link: https://patch.msgid.link/20250221012453.126366-5-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14 14:37:54 -07:00
Li Ming
eeba74747a cxl/core: Use guard() to replace open-coded down_read/write()
Some down/up_read() and down/up_write() pairs can be replaced by a
guard() to drop the explicit unlock call. This also aligns the coding
style with the rest of the CXL subsystem.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Li Ming <ming.li@zohomail.com>
Link: https://patch.msgid.link/20250221012453.126366-2-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-03-14 14:37:01 -07:00
Dan Williams
be5cbd0840 cxl: Kill enum cxl_decoder_mode
Now that the operational mode of DPA capacity (ram vs pmem... etc) is
tracked in the partition, and no code paths have dependencies on the
mode implying the partition index, the ambiguous 'enum cxl_decoder_mode'
can be cleaned up, specifically this ambiguity on whether the operation
mode implied anything about the partition order.

Endpoint decoders simply reference their assigned partition where the
operational mode can be retrieved as partition mode.

With this in place PMEM can now be partition0 which happens today when
the RAM capacity size is zero. Dynamic RAM can appear above PMEM when
DCD arrives, etc. Code sequences that hard coded the "PMEM after RAM"
assumption can now just iterate partitions and consult the partition
mode after the fact.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Alejandro Lucero <alucerop@amd.com>
Link: https://patch.msgid.link/173864306972.668823.3327008645125276726.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-04 13:48:19 -07:00
Dan Williams
991d98f17d cxl: Make cxl_dpa_alloc() DPA partition number agnostic
cxl_dpa_alloc() is a hard coded nest of assumptions around PMEM
allocations being distinct from RAM allocations in specific ways when in
practice the allocation rules are only relative to DPA partition index.

The rules for cxl_dpa_alloc() are:

- allocations can only come from 1 partition

- if allocating at partition-index-N, all free space in partitions less
  than partition-index-N must be skipped over

Use the new 'struct cxl_dpa_partition' array to support allocation with
an arbitrary number of DPA partitions on the device.
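
The two rules can be modeled with a small userspace sketch. It is a
simplification, not the driver's algorithm: it assumes each partition is
filled contiguously from its start, and struct toy_partition and
toy_dpa_alloc() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy partition: capacity is consumed contiguously from 'start'. */
struct toy_partition {
	uint64_t start;
	uint64_t size;
	uint64_t used;
};

/*
 * Allocate 'size' bytes from partition 'index' only. Free space in every
 * lower partition is skipped and becomes unusable afterwards.
 * Returns 0 and fills *start and *skip, or -ENOSPC.
 */
static int toy_dpa_alloc(struct toy_partition *parts, int nr, int index,
			 uint64_t size, uint64_t *start, uint64_t *skip)
{
	struct toy_partition *p;
	uint64_t skipped = 0;

	assert(index >= 0 && index < nr);
	p = &parts[index];

	/* Rule 1: the allocation must fit in a single partition. */
	if (size == 0 || size > p->size - p->used)
		return -ENOSPC;

	/* Rule 2: skip over all free space in lower partitions. */
	for (int i = 0; i < index; i++) {
		skipped += parts[i].size - parts[i].used;
		parts[i].used = parts[i].size;
	}

	*start = p->start + p->used;
	*skip = skipped;
	p->used += size;
	return 0;
}
```

Because the loop only depends on the partition index, the same code
handles two partitions or more without special cases for RAM and PMEM.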

A follow-on patch can go further to cleanup 'enum cxl_decoder_mode'
concept and supersede it with looking up the memory properties from
partition metadata. Until then cxl_part_mode() temporarily bridges code
that looks up partitions by @cxled->mode.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Alejandro Lucero <alucerop@amd.com>
Link: https://patch.msgid.link/173864306400.668823.12143134425285426523.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-04 13:48:19 -07:00
Dan Williams
8e4c411c53 cxl: Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info'
The pending efforts to add CXL Accelerator (type-2) device [1], and
Dynamic Capacity (DCD) support [2], tripped on the
no-longer-fit-for-purpose design in the CXL subsystem for tracking
device-physical-address (DPA) metadata. Trip hazards include:

- CXL Memory Devices need to consider a PMEM partition, but Accelerator
  devices with CXL.mem likely do not in the common case.

- CXL Memory Devices enumerate DPA through Memory Device mailbox
  commands like Partition Info, Accelerator devices do not.

- CXL Memory Devices that support DCD support more than 2 partitions.
  Some of the driver algorithms are awkward to expand to > 2 partition
  cases.

- DPA performance data is a general capability that can be shared with
  accelerators, so tracking it in 'struct cxl_memdev_state' is no longer
  suitable.

- Hardcoded assumptions around the PMEM partition always being index-1
  if RAM is zero-sized or PMEM is zero sized.

- 'enum cxl_decoder_mode' is sometimes a partition id and sometimes a
  memory property, it should be phased out in favor of a partition id,
  with the memory property coming from the partition info.

Towards cleaning up those issues and allowing a smoother landing for the
aforementioned pending efforts, introduce a 'struct cxl_dpa_partition'
array to 'struct cxl_dev_state', and 'struct cxl_range_info' as a shared
way for Memory Devices and Accelerators to initialize the DPA information
in 'struct cxl_dev_state'.

For now, split a new cxl_dpa_setup() from cxl_mem_create_range_info() to
get the new data structure initialized, and cleanup some qos_class init.
Follow on patches will go further to use the new data structure to
cleanup algorithms that are better suited to loop over all possible
partitions.

cxl_dpa_setup() follows the locking expectations of mutating the device
DPA map, and is suitable for Accelerator drivers to use. Accelerators
likely only have one hardcoded 'ram' partition to convey to the
cxl_core.

Link: http://lore.kernel.org/20241230214445.27602-1-alejandro.lucero-palau@amd.com [1]
Link: http://lore.kernel.org/20241210-dcd-type2-upstream-v8-0-812852504400@intel.com [2]
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Alejandro Lucero <alucerop@amd.com>
Link: https://patch.msgid.link/173864305827.668823.13978794102080021276.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-04 13:48:19 -07:00
Dan Williams
d77ca6c2b5 cxl: Introduce to_{ram,pmem}_{res,perf}() helpers
In preparation for consolidating all DPA partition information into an
array of DPA metadata, introduce helpers that hide the layout of the
current data. I.e. make the eventual replacement of ->ram_res,
->pmem_res, ->ram_perf, and ->pmem_perf with a new DPA metadata array a
no-op for code paths that consume that information, and reduce the noise
of follow-on patches.

The end goal is to consolidate all DPA information in 'struct
cxl_dev_state', but for now the helpers just make it appear that all DPA
metadata is relative to @cxlds.

As the conversion to generic partition metadata walking is completed,
these helpers will naturally be eliminated, or reduced in scope.

Cc: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Alejandro Lucero <alucerop@amd.com>
Link: https://patch.msgid.link/173864305238.668823.16553986866633608541.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-04 13:48:18 -07:00
Dan Williams
188e9529a6 cxl: Remove the CXL_DECODER_MIXED mistake
CXL_DECODER_MIXED is a safety mechanism introduced for the case where
platform firmware has programmed an endpoint decoder that straddles a
DPA partition boundary. While the kernel is careful to only allocate DPA
capacity within a single partition there is no guarantee that platform
firmware, or anything that touched the device before the current kernel,
gets that right.

However, __cxl_dpa_reserve() will never get to the CXL_DECODER_MIXED
designation because of the way it tracks partition boundaries. A
request_resource() that spans ->ram_res and ->pmem_res fails with the
following signature:

    __cxl_dpa_reserve: cxl_port endpoint15: decoder15.0: failed to reserve allocation

CXL_DECODER_MIXED is dead defensive programming after the driver has
already given up on the device. It has never offered any protection in
practice, just delete it.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Alejandro Lucero <alucerop@amd.com>
Link: https://patch.msgid.link/173864304660.668823.17000888505587850279.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-02-04 13:48:18 -07:00
Zijun Hu
523c6b3ed7 driver core: Correct API device_for_each_child_reverse_from() prototype
For API device_for_each_child_reverse_from(..., const void *data,
		int (*fn)(struct device *dev, const void *data))

- Type of @data is const pointer, and means caller's data @*data is not
  allowed to be modified, but that usually is not proper for such non
  finding device iterating API.

- Types for both @data and @fn are not consistent with all other
  for_each device iterating APIs device_for_each_child(_reverse)(),
  bus_for_each_dev() and (driver|class)_for_each_device().

Correct its prototype by removing const from parameter types, then adapt
for various existing usages.
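
Why a non-const @data matters for a non-finding iterator can be shown
with a userspace sketch. The types and the function
toy_for_each_child_reverse_from() are hypothetical and use an array
index rather than a device pointer; they do not reproduce the driver
core API.

```c
#include <assert.h>
#include <stddef.h>

struct toy_device {
	int id;
};

/* Callback takes mutable data so it can accumulate state across children. */
typedef int (*toy_iter_fn)(struct toy_device *dev, void *data);

/* Walk children in reverse starting at index 'from'; stop on non-zero. */
static int toy_for_each_child_reverse_from(struct toy_device *children,
					   int nr, int from, void *data,
					   toy_iter_fn fn)
{
	int rc = 0;

	assert(children != NULL && from < nr);
	for (int i = from; i >= 0 && rc == 0; i--)
		rc = fn(&children[i], data);
	return rc;
}

/* Example callback: writes through @data, which a const pointer forbids. */
static int toy_sum_ids(struct toy_device *dev, void *data)
{
	int *sum = data;

	*sum += dev->id;
	return 0;
}
```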

A dedicated typedef device_iter_t will be introduced as @fn() type for
various for_each device iterating APIs later.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20250105-class_fix-v6-6-3a2f1768d4d4@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-10 15:26:12 +01:00
Peter Zijlstra
cdd30ebb1b module: Convert symbol namespace to string literal
Clean up the existing export namespace code along the same lines of
commit 33def8498f ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason, it is not desired for the
namespace argument to be a macro expansion itself.
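
The hazard can be reproduced in plain C. In the sketch below the TOY_*
macros are hypothetical stand-ins for the old and new schemes, and the
clashing "#define CXL 42" is an invented example of an unrelated macro
that happens to share the namespace name.

```c
#include <assert.h>
#include <string.h>

#define TOY_STRINGIFY_1(x) #x
#define TOY_STRINGIFY(x) TOY_STRINGIFY_1(x)

/* Old scheme: bare token, stringified after macro expansion. */
#define TOY_NS_OLD(ns) TOY_STRINGIFY(ns)
/* New scheme: the caller passes a string literal, nothing is expanded. */
#define TOY_NS_NEW(ns) ns

/* Hypothetical unrelated definition that clashes with the namespace name. */
#define CXL 42

static const char *toy_old_namespace(void)
{
	return TOY_NS_OLD(CXL);		/* expands to "42" */
}

static const char *toy_new_namespace(void)
{
	return TOY_NS_NEW("CXL");	/* stays "CXL" */
}

/* Returns 1 when the old scheme silently changed the namespace name. */
static int toy_namespace_was_mangled(void)
{
	return strcmp(toy_old_namespace(), toy_new_namespace()) != 0;
}
```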

Scripted using

  git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
  do
    awk -i inplace '
      /^#define EXPORT_SYMBOL_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /^#define MODULE_IMPORT_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /MODULE_IMPORT_NS/ {
        $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
      }
      /EXPORT_SYMBOL_NS/ {
        if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
  	if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
  	    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
  	    $0 !~ /^my/) {
  	  getline line;
  	  gsub(/[[:space:]]*\\$/, "");
  	  gsub(/[[:space:]]/, "", line);
  	  $0 = $0 " " line;
  	}

  	$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
  		    "\\1(\\2, \"\\3\")", "g");
        }
      }
      { print }' $file;
  done

Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02 11:34:44 -08:00
Linus Torvalds
563cb0b1e7 cxl changes for v6.13
- Constify range_contains() input parameters to prevent changes.
 - Add support for displaying RCD capabilities in sysfs to support lspci for CXL device.
 - Downgrade warning message to debug in cxl_probe_component_regs().
 - Add support for adding a printf specifier '%pra' to emit 'struct range' content.
   - Add sanity tests for 'struct resource'.
   - Add documentation for special case.
   - Add %pra for 'struct range'.
   - Add %pra usage in CXL code.
 - Add preparation code for DCD support
   - Add range_overlaps().
   - Add CDAT DSMAS table shared and read only flag in ACPICA.
   - Add documentation to 'struct dev_dax_range'.
   - Delay event buffer allocation in CXL PCI code until needed.
   - Use guard() in cxl_dpa_set_mode().
   - Refactor create region code to consolidate common code.

Merge tag 'cxl-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dave Jiang:

 - Constify range_contains() input parameters to prevent changes

 - Add support for displaying RCD capabilities in sysfs to support lspci
   for CXL device

 - Downgrade warning message to debug in cxl_probe_component_regs()

 - Add support for adding a printf specifier '%pra' to emit 'struct
   range' content:
     - Add sanity tests for 'struct resource'
     - Add documentation for special case
     - Add %pra for 'struct range'
     - Add %pra usage in CXL code

 - Add preparation code for DCD support:
     - Add range_overlaps()
     - Add CDAT DSMAS table shared and read only flag in ACPICA
     - Add documentation to 'struct dev_dax_range'
     - Delay event buffer allocation in CXL PCI code until needed
     - Use guard() in cxl_dpa_set_mode()
     - Refactor create region code to consolidate common code

* tag 'cxl-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/region: Refactor common create region code
  cxl/hdm: Use guard() in cxl_dpa_set_mode()
  cxl/pci: Delay event buffer allocation
  dax: Document struct dev_dax_range
  ACPI/CDAT: Add CDAT/DSMAS shared and read only flag values
  range: Add range_overlaps()
  cxl/cdat: Use %pra for dpa range outputs
  printf: Add print format (%pra) for struct range
  Documentation/printf: struct resource add start == end special case
  test printf: Add very basic struct resource tests
  cxl: downgrade a warning message to debug level in cxl_probe_component_regs()
  cxl/pci: Add sysfs attribute for CXL 1.1 device link status
  cxl/core/regs: Add rcd_pcie_cap initialization
  kernel/range: Const-ify range_contains parameters
2024-11-22 12:33:52 -08:00
Ira Weiny
27fcfb4168 cxl/hdm: Use guard() in cxl_dpa_set_mode()
Additional DCD functionality is being added to this call which will be
simplified by the use of guard() with the cxl_dpa_rwsem.

Convert the function to use guard() prior to adding DCD functionality.

Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20241107-dcd-type2-upstream-v7-5-56a84e66bc36@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-11-08 09:39:31 -07:00
Dan Williams
101c268bd2 cxl/port: Fix use-after-free, permit out-of-order decoder shutdown
In support of investigating an initialization failure report [1],
cxl_test was updated to register mock memory-devices after the mock
root-port/bus device had been registered. That led to cxl_test crashing
with a use-after-free bug with the following signature:

    cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1
    cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1
    cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0
1)  cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1
    [..]
    cxld_unregister: cxl decoder14.0:
    cxl_region_decode_reset: cxl_region region3:
    mock_decoder_reset: cxl_port port3: decoder3.0 reset
2)  mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1
    cxl_endpoint_decoder_release: cxl decoder14.0:
    [..]
    cxld_unregister: cxl decoder7.0:
3)  cxl_region_decode_reset: cxl_region region3:
    Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI
    [..]
    RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
    [..]
    Call Trace:
     <TASK>
     cxl_region_decode_reset+0x69/0x190 [cxl_core]
     cxl_region_detach+0xe8/0x210 [cxl_core]
     cxl_decoder_kill_region+0x27/0x40 [cxl_core]
     cxld_unregister+0x5d/0x60 [cxl_core]

At 1) a region has been established with 2 endpoint decoders (7.0 and
14.0). Those endpoints share a common switch-decoder in the topology
(3.0). At teardown, 2), decoder14.0 is the first to be removed and hits
the "out of order reset case" in the switch decoder. The effect though
is that region3 cleanup is aborted leaving it intact and
referencing decoder14.0. At 3) the second attempt to teardown region3
trips over the stale decoder14.0 object which has long since been
deleted.

The fix here is to recognize that the CXL specification places no
mandate on in-order shutdown of switch-decoders, the driver enforces
in-order allocation, and hardware enforces in-order commit. So, rather
than fail and leave objects dangling, always remove them.

In support of making cxl_region_decode_reset() always succeed,
cxl_region_invalidate_memregion() failures are turned into warnings.
Crashing the kernel is ok there since system integrity is at risk if
caches cannot be managed around physical address mutation events like
CXL region destruction.

A new device_for_each_child_reverse_from() is added to cleanup
port->commit_end after all dependent decoders have been disabled. In
other words if decoders are allocated 0->1->2 and disabled 1->2->0 then
port->commit_end only decrements from 2 after 2 has been disabled, and
it decrements all the way to zero since 1 was disabled previously.
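
The commit_end bookkeeping described above can be modeled in userspace.
This is a sketch of the idea only: struct toy_port and the toy_*
functions are hypothetical, and the real driver walks child devices
rather than an array.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_DECODERS 8

struct toy_port {
	bool enabled[TOY_MAX_DECODERS];
	int commit_end;	/* highest committed decoder id, -1 when none */
};

static void toy_port_init(struct toy_port *port)
{
	for (int i = 0; i < TOY_MAX_DECODERS; i++)
		port->enabled[i] = false;
	port->commit_end = -1;
}

/* Commit stays strictly in order: only commit_end + 1 is accepted. */
static int toy_commit(struct toy_port *port, int id)
{
	if (id < 0 || id >= TOY_MAX_DECODERS || id != port->commit_end + 1)
		return -EBUSY;
	port->enabled[id] = true;
	port->commit_end = id;
	return 0;
}

/* Reset always succeeds, in any order. */
static void toy_reset(struct toy_port *port, int id)
{
	assert(id >= 0 && id < TOY_MAX_DECODERS);
	port->enabled[id] = false;

	/* Only disabling the top of the committed range moves commit_end. */
	if (id != port->commit_end)
		return;

	/* Walk backwards past every decoder that was disabled earlier. */
	while (port->commit_end >= 0 && !port->enabled[port->commit_end])
		port->commit_end--;
}
```

With decoders committed 0->1->2 and reset 1->2->0, commit_end stays at 2
after the first reset, drops to 0 after the second, and reaches -1 after
the third.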

Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1]
Cc: stable@vger.kernel.org
Fixes: 176baefb2e ("cxl/hdm: Commit decoder state to hardware")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Zijun Hu <quic_zijuhu@quicinc.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25 16:07:03 -05:00
Yao Xingtao
84328c5ace cxl/region: check interleave capability
Since interleave capability is not verified, if the interleave
capability of a target does not match what the region needs, committing
the decoder only fails at the device end.

To catch this error as early as possible, the driver needs to check the
interleave capability of the target while attaching it to the region.

Per CXL specification r3.1(8.2.4.20.1 CXL HDM Decoder Capability Register),
bits 11 and 12 indicate the capability to establish interleaving in 3, 6,
12 and 16 ways. If these bits are not set, the target cannot be attached to
a region utilizing such interleave ways.

Additionally, bits 8 and 9 represent the capability of the bits used for
interleaving in the address, Linux tracks this in the cxl_port
interleave_mask.

Per CXL specification r3.1(8.2.4.20.13 Decoder Protection):
  eIW means encoded Interleave Ways.
  eIG means encoded Interleave Granularity.

  in HPA:
  if eIW is 0 or 8 (interleave ways: 1, 3), all the bits of HPA are used,
  the interleave bits are none, the following check is ignored.

  if eIW is less than 8 (interleave ways: 2, 4, 8, 16), the interleave bits
  start at bit position eIG + 8 and end at eIG + eIW + 8 - 1.

  if eIW is greater than 8 (interleave ways: 6, 12), the interleave bits
  start at bit position eIG + 8 and end at eIG + eIW - 1.

  if the interleave mask is insufficient to cover the required interleave
  bits, the target cannot be attached to the region.
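
The bit-position rules above translate into a small mask computation.
The userspace sketch below follows the rules as written in this message;
the function names and the example port mask are illustrative
assumptions, not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Mask with bits [low, high] set, for 0 <= low <= high <= 63. */
static uint64_t toy_genmask(unsigned int high, unsigned int low)
{
	assert(low <= high && high < 64);
	return (~UINT64_C(0) >> (63 - high)) & (~UINT64_C(0) << low);
}

/* HPA bits consumed by interleaving for the given eIW and eIG. */
static uint64_t toy_required_interleave_bits(unsigned int eiw,
					     unsigned int eig)
{
	unsigned int low = eig + 8;
	unsigned int high;

	/* 1-way and 3-way interleave use no dedicated HPA bits. */
	if (eiw == 0 || eiw == 8)
		return 0;

	if (eiw > 8)
		high = eig + eiw - 1;		/* 6 and 12 ways */
	else
		high = eig + eiw + 8 - 1;	/* 2, 4, 8 and 16 ways */

	return toy_genmask(high, low);
}

/* Reject the target when the port mask does not cover the needed bits. */
static int toy_check_interleave(uint64_t port_mask, unsigned int eiw,
				unsigned int eig)
{
	if (toy_required_interleave_bits(eiw, eig) & ~port_mask)
		return -ENXIO;
	return 0;
}
```

For example, with a port mask covering bits 11:8, a 4-way region
(eIW = 2) at eIG = 1 needs bits 10:9 and passes, while a 16-way region
(eIW = 4) at eIG = 2 needs bits 13:10 and is rejected.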

Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20240614084755.59503-2-yaoxt.fnst@fujitsu.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-06-25 14:45:27 -07:00
Ira Weiny
6ef37af6f4 cxl/hdm: Debug, use decoder name function
The decoder enum has a name conversion function defined now.

Use that instead of open coding.

Suggested-by: Navneet Singh <navneet.singh@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230604-dcd-type2-upstream-v2-1-f740c47e7916@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30 10:43:48 -07:00
Alison Schofield
4afaed94bc cxl/hdm: dev_warn() on unsupported mixed mode decoder
A mixed mode decoder is programmed with device physical addresses
that span both ram and pmem partitions of a memdev.

Linux does not support mixed mode decoders. The driver rejects
sysfs writes that try to set decoder mode to mixed, and if a
resource being allocated is not wholly contained in either the
pmem or ram partition of a memdev, it is also rejected. Basically,
the CXL region driver is not going to create regions with mixed
mode decoders, but the BIOS could.

If the kernel driver sees the mixed mode decoder, it will fail to
enable the region, and emit a dev_dbg() message.

A dev_dbg() is not noisy enough in this case. Change the message
to be a dev_warn() that explicitly says mixed mode is not supported.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230218013834.31237-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30 10:43:48 -07:00
Huang Ying
54e8dd59a7 cxl/hdm: Add debug message for invalid interleave granularity
There's no debug message for invalid interleave granularity.  This
makes it hard to debug related bugs.  So, this is added in this patch.

Signed-off-by: Huang, Ying <ying.huang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240402061016.388408-1-ying.huang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30 10:43:48 -07:00
Dan Williams
6f5c4eca48 cxl/hdm: Fix dpa translation locking
The helper, cxl_dpa_resource_start(), snapshots the dpa-address of an
endpoint-decoder after acquiring the cxl_dpa_rwsem. However, it is
sufficient to assert that cxl_dpa_rwsem is held rather than acquire it
in the helper. Otherwise, it triggers multiple lockdep reports:

1/ Tracing callbacks are in an atomic context that can not acquire sleeping
locks:

    BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1525
    in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1288, name: bash
    preempt_count: 2, expected: 0
    RCU nest depth: 0, expected: 0
    [..]
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230524-3.fc38 05/24/2023
    Call Trace:
     <TASK>
     dump_stack_lvl+0x71/0x90
     __might_resched+0x1b2/0x2c0
     down_read+0x1a/0x190
     cxl_dpa_resource_start+0x15/0x50 [cxl_core]
     cxl_trace_hpa+0x122/0x300 [cxl_core]
     trace_event_raw_event_cxl_poison+0x1c9/0x2d0 [cxl_core]

2/ The rwsem is already held in the inject poison path:

    WARNING: possible recursive locking detected
    6.7.0-rc2+ #12 Tainted: G        W  OE    N
    --------------------------------------------
    bash/1288 is trying to acquire lock:
    ffffffffc05f73d0 (cxl_dpa_rwsem){++++}-{3:3}, at: cxl_dpa_resource_start+0x15/0x50 [cxl_core]

    but task is already holding lock:
    ffffffffc05f73d0 (cxl_dpa_rwsem){++++}-{3:3}, at: cxl_inject_poison+0x7d/0x1e0 [cxl_core]
    [..]
    Call Trace:
     <TASK>
     dump_stack_lvl+0x71/0x90
     __might_resched+0x1b2/0x2c0
     down_read+0x1a/0x190
     cxl_dpa_resource_start+0x15/0x50 [cxl_core]
     cxl_trace_hpa+0x122/0x300 [cxl_core]
     trace_event_raw_event_cxl_poison+0x1c9/0x2d0 [cxl_core]
     __traceiter_cxl_poison+0x5c/0x80 [cxl_core]
     cxl_inject_poison+0x1bc/0x1e0 [cxl_core]

This appears to have been an issue since the initial implementation and
was uncovered by the new cxl-poison.sh test [1]. That test is now passing
with these changes.
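
The difference between acquiring the lock in the helper and asserting that the caller holds it can be modeled in userspace C. This is a sketch under stated assumptions: the toy_* types and functions are hypothetical, a plain reader count stands in for the rwsem, and assert() stands in for whatever held-lock assertion the kernel helper uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an rwsem that only counts read holders. */
struct toy_rwsem {
	int readers;
};

static void toy_down_read(struct toy_rwsem *sem)
{
	sem->readers++;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

static bool toy_is_held(const struct toy_rwsem *sem)
{
	return sem->readers > 0;
}

struct toy_decoder {
	struct toy_rwsem *dpa_lock;
	uint64_t dpa_start;
};

/*
 * The helper asserts rather than acquires, so it never sleeps and
 * never re-takes a lock that its caller already holds.
 */
static uint64_t toy_dpa_resource_start(const struct toy_decoder *dec)
{
	assert(toy_is_held(dec->dpa_lock));
	return dec->dpa_start;
}

/* The caller owns the locking, as in the inject poison path above. */
static uint64_t toy_inject_poison(struct toy_decoder *dec)
{
	uint64_t start;

	toy_down_read(dec->dpa_lock);
	start = toy_dpa_resource_start(dec);
	toy_up_read(dec->dpa_lock);
	return start;
}
```

With this shape, both reports in the message go away in the model: the helper does no blocking work, and the lock is taken exactly once per call chain.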

Fixes: 28a3ae4ff6 ("cxl/trace: Add an HPA to cxl_poison trace events")
Link: http://lore.kernel.org/r/e4f2716646918135ddbadf4146e92abb659de734.1700615159.git.alison.schofield@intel.com [1]
Cc: <stable@vger.kernel.org>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-07 19:14:04 -08:00
Dave Jiang
36a1c2ee50 cxl/hdm: Fix a benign lockdep splat
The new helper "cxl_num_decoders_committed()" added a lockdep assertion
to validate that port->commit_end is protected against modification.
That assertion fires in init_hdm_decoder() where it is initializing
port->commit_end. Given that it is both accessing and writing that
property it ostensibly needs the lock.

In practice, CXL decoder commit rules (must commit in order) and the
in-order discovery of device decoders make the manipulation of
->commit_end in init_hdm_decoder() safe. However, rather than rely on
the subtle rules of CXL hardware, just make the implementation obviously
correct from a software perspective.

The Fixes: tag is only for cleaning up a lockdep splat; there is no
functional issue addressed by this fix.

Fixes: 458ba8189c ("cxl: Add cxl_decoders_committed() helper")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/170025232811.2147250.16376901801315194121.stgit@djiang5-mobl3
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-11-22 16:34:30 -08:00
Dan Williams
5d09c63f11 cxl/hdm: Remove broken error path
Dan reports that cxl_decoder_commit() potentially leaks a hold of
cxl_dpa_rwsem. The potential error case is a "should not" happen
scenario. Turn it into a "can not" happen scenario by adding the error
check to cxl_port_setup_targets() where other setting validation occurs.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: http://lore.kernel.org/r/63295673-5d63-4919-b851-3b06d48734c0@moroto.mountain
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Fixes: 176baefb2e ("cxl/hdm: Commit decoder state to hardware")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-31 14:10:04 -07:00
Dan Carpenter
69d56b15a7 cxl/hdm: Fix && vs || bug
If "info" is NULL then this code will crash.  || was intended instead of
&&.
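
The bug class can be shown with a small standalone C sketch. The struct and function names are hypothetical and only illustrate the short-circuit behaviour; they are not the driver's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the "info" structure mentioned above. */
struct toy_info {
	bool mem_enabled;
};

/*
 * Buggy form:  !info && !info->mem_enabled
 *   When info is NULL, !info is true, so && goes on to evaluate
 *   info->mem_enabled and dereferences the NULL pointer.
 *
 * Fixed form:  !info || !info->mem_enabled
 *   When info is NULL, || short-circuits and the dereference is skipped.
 */
static bool toy_info_unusable(const struct toy_info *info)
{
	return !info || !info->mem_enabled;
}
```

A NULL pointer and a disabled structure are both reported as unusable, and neither case touches memory behind a NULL pointer.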

Fixes: 8ce520fdea ("cxl/hdm: Use stored Component Register mappings to map HDM decoder capability")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/60028378-d3d5-4d6d-90fd-f915f061e731@moroto.mountain
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-31 14:09:50 -07:00
Dan Williams
b3cfdbf6a0 Merge branch 'for-6.7/cxl-commited' into cxl/next
Add the committed decoder sysfs attribute for v6.7.
2023-10-31 11:00:08 -07:00
Dan Williams
7f946e6d83 Merge branch 'for-6.7/cxl-rch-eh' into cxl/next
Restricted CXL Host (RCH) Error Handling undoes the topology munging of
CXL 1.1 to enable some AER recovery, and lands some base infrastructure
for handling Root-Complex-Event-Collectors (RCECs) with CXL. Include
this long running series finally for v6.7.
2023-10-31 10:59:00 -07:00
Dave Jiang
458ba8189c cxl: Add cxl_decoders_committed() helper
Add a helper to retrieve the number of decoders committed for the port.
Replace all the open coding of the calculation with the helper.
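
A minimal sketch of what such a helper might look like, assuming that commit_end holds the id of the highest committed decoder and is -1 when none are committed. The toy_* names are hypothetical, and the locking assertion a kernel helper would carry is omitted from this userspace model.

```c
/* Hypothetical port model; only the field needed for the count is kept. */
struct toy_port {
	int commit_end;	/* assumed: id of highest committed decoder, -1 if none */
};

/* One place for the "+ 1" that callers would otherwise open code. */
static int toy_num_decoders_committed(const struct toy_port *port)
{
	return port->commit_end + 1;
}
```

Centralizing the calculation gives callers a single name to use and a single place to attach locking checks later.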

Link: https://lore.kernel.org/linux-cxl/651c98472dfed_ae7e729495@dwillia2-xfh.jf.intel.com.notmuch/
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jim Harris <jim.harris@samsung.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/169747906849.272156.1729290904857372335.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27 20:29:41 -07:00