linux/drivers
Shiju Jose 588ca944c2 cxl/edac: Add CXL memory device memory sparing control feature
Memory sparing is defined as a repair function that replaces a portion of
memory with a portion of functional memory at that same DPA. The subclasses
for this operation vary in terms of the scope of the sparing being
performed. The cacheline sparing subclass refers to a sparing action that
can replace a full cacheline. Row sparing is provided as an alternative to
PPR sparing functions and its scope is that of a single DDR row.
As per CXL r3.2 Table 8-125 footnote 1, memory sparing is preferred over
PPR when possible.
Bank sparing allows an entire bank to be replaced. Rank sparing is defined
as an operation in which an entire DDR rank is replaced.
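
As an illustration only, the four subclasses above can be modelled as an
ordered enumeration. The numeric values below are arbitrary ordering keys,
not the encodings from the CXL specification, and the "prefer the narrowest
scope" helper is a hypothetical userspace policy, not something this patch
implements.

```python
from enum import IntEnum


class SparingSubclass(IntEnum):
    """Sparing subclasses, ordered from narrowest to widest scope.

    The integer values are illustrative ordering keys only; they are
    NOT the encodings used by the CXL specification.
    """
    CACHELINE = 0  # replaces a full cacheline
    ROW = 1        # replaces a single DDR row
    BANK = 2       # replaces an entire bank
    RANK = 3       # replaces an entire DDR rank


def narrowest_supported(supported):
    """Return the narrowest-scope subclass in `supported`, or None if empty.

    Hypothetical policy helper: picks the repair that replaces the
    least memory among the subclasses a device advertises.
    """
    return min(supported, default=None)
```

For a device that advertises only row and rank sparing, this helper would
select row sparing.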

Memory sparing maintenance operations may be supported by CXL devices
that implement the CXL.mem protocol. A sparing maintenance operation
requests the CXL device to perform a repair operation on its media.
For example, a CXL device with DRAM components that support memory sparing
features may implement sparing maintenance operations.

The host may issue a query command by setting the query resources flag in
the input payload (CXL spec 3.2 Table 8-120) to determine the availability
of sparing resources for a given address. In response to a query request,
the device shall report the resource availability by producing the memory
sparing event record (CXL spec 3.2 Table 8-60) in which the Channel, Rank,
Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields are a copy
of the values specified in the request.

During the execution of a sparing maintenance operation, a CXL memory
device:
- may not retain data
- may not be able to process CXL.mem requests correctly.
These CXL memory device capabilities are specified by restriction flags
in the memory sparing feature readable attributes.

When a CXL device identifies an error on a memory component, the device
may inform the host about the need for a memory sparing maintenance
operation by using a DRAM event record in which the 'maintenance needed'
flag may be set. The event record contains some of the DPA, Channel, Rank,
Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields identifying
the memory that should be repaired. The userspace tool requests a
maintenance operation if the 'maintenance needed' flag is set in the CXL
DRAM error record.

CXL spec 3.2 section 8.2.10.7.1.4 describes the device's memory sparing
maintenance operation feature.

CXL spec 3.2 section 8.2.10.7.2.3 describes the memory sparing feature
discovery and configuration.

Add support for controlling the CXL memory device memory sparing feature.
Register with the EDAC driver, which gets the memory repair attr
descriptors from the EDAC memory repair driver and exposes sysfs repair
control attributes for memory sparing to userspace. For example, the CXL
memory sparing control for the CXL mem0 device is exposed in
/sys/bus/edac/devices/cxl_mem0/mem_repairX/
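
A minimal userspace sketch of how a tool might address that directory and
write control attributes into it. Only the path layout comes from the text
above. The helper names are hypothetical, and the attribute file names are
deliberately left to the caller because this message does not list them;
consult the EDAC memory repair ABI documentation for the real names.

```python
import os


def repair_dir(mem_dev, instance):
    """Build the sysfs directory for one memory repair instance.

    `mem_dev` is the CXL memory device name (for example "mem0") and
    `instance` replaces the X in mem_repairX.
    """
    return f"/sys/bus/edac/devices/cxl_{mem_dev}/mem_repair{instance}"


def write_attrs(directory, attrs):
    """Write each value to a same-named attribute file under `directory`.

    `attrs` maps attribute file names to values. Writes are done in the
    mapping's insertion order, so a caller can place the attribute that
    triggers the operation last. The file names are supplied by the
    caller and are not defined by this sketch.
    """
    for name, value in attrs.items():
        with open(os.path.join(directory, name), "w") as f:
            f.write(f"{value}\n")
```

Taking the directory as a parameter keeps the write helper testable against
a temporary directory, without a CXL device present.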

Use case
========
1. The CXL device identifies a failure in a memory component and reports
   it to userspace in a CXL DRAM trace event with the DPA and other
   attributes of the memory to repair, such as channel, rank, nibble mask,
   bank group, bank, row, column, sub-channel.

2. Rasdaemon processes the trace event and may issue a query request in
   sysfs to check the resources available for memory sparing if any of the
   following conditions is met.
   - 'maintenance needed' flag set in the event record.
   - 'threshold event' flag set for CVME threshold feature.
   - When the number of corrected errors reported on a CXL.mem media to
     the userspace exceeds the threshold value for corrected error count
     defined by the userspace policy.

3. Rasdaemon processes the memory sparing trace event and issues a repair
   request for memory sparing.

The kernel CXL driver shall report the memory sparing event record to
userspace with the resource availability so that rasdaemon can process the
event record and issue a repair request in sysfs for the memory sparing
operation in the CXL device.
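
The three conditions in step 2 can be sketched as a single userspace
predicate. This is an illustration under assumptions, not rasdaemon code:
the `DramEvent` type, its field names and the function name are
hypothetical stand-ins for the flags carried in the trace event.

```python
from dataclasses import dataclass


@dataclass
class DramEvent:
    """Hypothetical view of the flags in a CXL DRAM trace event."""
    maintenance_needed: bool = False  # 'maintenance needed' flag
    threshold_event: bool = False     # 'threshold event' flag (CVME)


def should_check_sparing(event, corrected_errors, ce_threshold):
    """Return True if any of the step 2 conditions holds.

    `corrected_errors` is the corrected error count userspace has seen
    for the media, and `ce_threshold` is the limit set by the userspace
    policy. The count must strictly exceed the limit.
    """
    return (event.maintenance_needed
            or event.threshold_event
            or corrected_errors > ce_threshold)
```

A count equal to the threshold does not trigger the check, matching the
word "exceeds" in the third condition.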

Note: Based on feedback from the community, the 'query' sysfs attribute is
removed and reporting the memory sparing error record to userspace is not
supported. Instead, userspace issues the sparing operation and the kernel
forwards it to the CXL memory device when the 'maintenance needed' flag is
set in the DRAM event record.

Add checks to ensure that the memory to be repaired is offline or, if it
is online, that the repair request originates from a CXL DRAM error record
reported in the current boot, before requesting a memory sparing operation
on the device.
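
The gating rule above can be sketched as follows. This is a simplified
model, not the driver code: the type and function names are hypothetical,
and matching a request against stored records on every field is an
assumption made for brevity (the driver may compare only the fields that
are relevant to the requested sparing subclass).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepairTarget:
    """Attributes identifying the memory to repair (hypothetical type)."""
    dpa: int
    channel: int
    rank: int
    nibble_mask: int
    bank_group: int
    bank: int
    row: int
    column: int
    sub_channel: int


def may_request_sparing(memory_online, request, boot_records):
    """Decide whether a sparing request may be sent to the device.

    Offline memory may always be repaired. Online memory may be repaired
    only if `request` matches a target taken from a CXL DRAM error
    record seen in the current boot (`boot_records`, a set of targets).
    """
    if not memory_online:
        return True
    return request in boot_records
```

The frozen dataclass is hashable, so recorded targets can be kept in a set
and looked up directly.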

Note: Tested memory sparing feature control with QEMU patch
      "hw/cxl: Add emulation for memory sparing control feature"
      https://lore.kernel.org/linux-cxl/20250509172229.726-1-shiju.jose@huawei.com/T/#m5f38512a95670d75739f9dad3ee91b95c7f5c8d6

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-8-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-23 13:24:53 -07:00
..
accel accel/ivpu: Add cmdq_id to job related logs 2025-04-11 12:07:44 +02:00
accessibility treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
acpi gcc-15: disable '-Wunterminated-string-initialization' entirely for now 2025-04-20 15:30:53 -07:00
amba
android binder: fix offset calculation in debug log 2025-04-15 15:11:12 +02:00
ata ata: libata-scsi: Improve CDL control 2025-04-22 16:06:05 +09:00
atm treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
auxdisplay treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
base vfs-6.15-rc4.fixes 2025-04-25 15:57:21 -07:00
bcma
block block-6.15-20250424 2025-04-25 11:34:39 -07:00
bluetooth Bluetooth: vhci: Avoid needless snprintf() calls 2025-04-16 16:50:47 -04:00
bus treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
cache
cdrom
cdx Merge branches 'apple/dart', 'arm/smmu/updates', 'arm/smmu/bindings', 'rockchip', 's390', 'core', 'intel/vt-d' and 'amd/amd-vi' into next 2025-03-20 09:11:09 +01:00
char Char/Misc driver fixes for 6.15-rc4 2025-04-25 10:30:40 -07:00
clk ARM and clkdev updates for 6.15-rc1 2025-04-03 12:21:44 -07:00
clocksource RISC-V Patches for the 6.15 Merge Window, Part 1 2025-04-04 09:49:17 -07:00
comedi comedi: jr3_pci: Fix synchronous deletion of timer 2025-04-15 15:18:55 +02:00
connector
counter Char/Misc fixes for 6.15-rc1 2025-04-02 18:03:34 -07:00
cpufreq ARM cpufreq fixes for 6.15-rc 2025-04-23 14:55:11 +02:00
cpuidle pmdomain core: 2025-03-25 20:40:51 -07:00
crypto crypto: atmel-sha204a - Set hwrng quality to lowest possible 2025-04-23 09:32:57 +08:00
cxl cxl/edac: Add CXL memory device memory sparing control feature 2025-05-23 13:24:53 -07:00
dax device/dax: properly refcount device dax pages when mapping 2025-03-17 22:06:41 -07:00
dca
devfreq
dio
dma treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
dma-buf dma-buf/sw_sync: Decrement refcount on error in sw_sync_ioctl_get_deadline() 2025-04-11 14:22:22 +02:00
dpll Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-03-20 21:38:01 +01:00
edac cxl/edac: Add CXL memory device memory sparing control feature 2025-05-23 13:24:53 -07:00
eisa
extcon
firewire treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
firmware Char/Misc driver fixes for 6.15-rc4 2025-04-25 10:30:40 -07:00
fpga fpga: tests: add module descriptions 2025-04-11 17:32:38 -07:00
fsi
fwctl fwctl: Fix repeated device word in log message 2025-04-11 20:47:45 -03:00
gnss
gpio gpiolib: Allow to use setters with return value for output-only gpios 2025-04-14 20:31:00 +02:00
gpu drm fixes for 6.15-rc4 2025-04-26 08:32:29 -07:00
greybus treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
hid treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
hsi treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
hte treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
hv - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
hwmon treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
hwspinlock hwspinlock: Remove unused hwspin_lock_get_id() 2025-03-21 17:12:04 -05:00
hwtracing intel_th: avoid using deprecated page->mapping, index fields 2025-04-15 13:29:03 +02:00
i2c i2c-host-fixes for v6.15-rc3 2025-04-18 23:42:56 +02:00
i3c i3c: Add NULL pointer check in i3c_master_queue_ibi() 2025-03-31 11:44:00 +02:00
idle Power management updates for 6.15-rc1 2025-03-25 15:00:18 -07:00
iio gcc-15: add '__nonstring' markers to byte arrays 2025-04-20 11:57:54 -07:00
infiniband RDMA/bnxt_re: Remove unusable nq variable 2025-04-10 14:47:55 -03:00
input gcc-15: add '__nonstring' markers to byte arrays 2025-04-20 11:57:54 -07:00
interconnect
iommu iommu/amd: WARN if KVM attempts to set vCPU affinity without posted intrrupts 2025-04-24 09:52:31 -04:00
ipack
irqchip irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode() 2025-04-26 10:17:24 +02:00
isdn treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
leds treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
macintosh treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
mailbox treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
mcb mcb: fix a double free bug in chameleon_parse_gdd() 2025-04-15 18:21:39 +02:00
md gcc-15: get rid of misc extra NUL character padding 2025-04-20 11:57:54 -07:00
media treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
memory treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
memstick treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
message SCSI misc on 20250326 2025-03-26 19:57:34 -07:00
mfd * Maxim MAX77705: 2025-03-29 14:33:13 -07:00
misc pci-v6.15-fixes-3 2025-04-26 13:02:36 -07:00
mmc treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
most treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
mtd mtd: rawnand: Add status chack in r852_ready() 2025-04-07 09:02:49 +02:00
mux
net No fixes from any subtree. 2025-04-24 09:14:50 -07:00
nfc treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
ntb Bug fixes for NTB Switchtec driver mw negative shift, Intel NTB link 2025-04-04 14:23:07 -07:00
nubus
nvdimm libnvdimm additions for 6.15 2025-04-02 20:27:18 -07:00
nvme block-6.15-20250424 2025-04-25 11:34:39 -07:00
nvmem nvmem: qfprom: switch to 4-byte aligned reads 2025-04-11 14:41:22 +02:00
of Devicetree for v6.15: 2025-03-29 11:23:16 -07:00
opp
parisc
parport treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
pci pci-v6.15-fixes-3 2025-04-26 13:02:36 -07:00
pcmcia treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
peci
perf pci-v6.15-changes 2025-03-28 19:36:53 -07:00
phy phy-for-6.15 2025-04-01 12:47:11 -07:00
pinctrl Pin control changes for the v6.15 kernel cycle: 2025-03-29 16:59:16 -07:00
platform platform/x86: msi-wmi-platform: Workaround a ACPI firmware bug 2025-04-16 11:15:22 +03:00
pmdomain pmdomain: arm: scmi_pm_domain: Remove redundant state verification 2025-03-17 11:12:01 +01:00
pnp Staging driver updates for 6.15-rc1 2025-04-02 18:09:17 -07:00
power gcc-15: get rid of misc extra NUL character padding 2025-04-20 11:57:54 -07:00
powercap Power management updates for 6.15-rc1 2025-03-25 15:00:18 -07:00
pps pps: generators: tio: fix platform_set_drvdata() 2025-04-15 18:22:32 +02:00
ps3
ptp ptp: ocp: fix start time alignment in ptp_ocp_signal_set 2025-04-16 18:23:57 -07:00
pwm pwm: A set of fixes for pwm core and various drivers 2025-04-12 08:11:19 -07:00
rapidio
ras RAS/AMD/FMPM: Get masked address 2025-04-08 19:30:58 +02:00
regulator These are objtool fixes and updates by Josh Poimboeuf, centered 2025-04-02 10:30:10 -07:00
remoteproc remoteproc: qcom_q6v5_pas: Make single-PD handling more robust 2025-03-22 08:42:39 -05:00
reset remoteproc updates for v6.15 2025-03-29 17:18:50 -07:00
rpmsg
rtc treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
s390 s390/virtio_ccw: Don't allocate/assign airqs for non-existing queues 2025-04-09 12:12:41 +02:00
sbus
scsi ata fixes for 6.15-rc4 2025-04-25 16:31:10 -07:00
sh
siox
slimbus
soc soc: drivers for 6.15, part 2 2025-04-04 09:06:32 -07:00
soundwire soundwire updates for 6.15 2025-04-01 12:43:13 -07:00
spi spi: spi-imx: Add check for spi_imx_setupxfer() 2025-04-17 12:25:12 +01:00
spmi
ssb
staging treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
target scsi: target: iscsi: Fix timeout on deleted connection 2025-04-11 22:13:00 -04:00
tc
tee
thermal thermal: intel: int340x: Fix Panther Lake DLVR support 2025-04-15 18:57:25 +02:00
thunderbolt USB/Thunderbolt update for 6.15-rc1 2025-04-02 18:23:31 -07:00
tty serial: sifive: lock port in startup()/shutdown() callbacks 2025-04-15 15:02:39 +02:00
ufs scsi: ufs: core: Add NULL check in ufshcd_mcq_compl_pending_transfer() 2025-04-21 20:50:11 -04:00
uio
usb USB-serial device ids for 6.15-rc3 2025-04-18 06:49:40 +02:00
vdpa
vfio vfio/pci: Virtualize zero INTx PIN if no pdev->irq 2025-04-14 08:31:45 -06:00
vhost vhost-scsi: Fix vhost_scsi_send_status() 2025-04-18 10:08:11 -04:00
video treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
virt treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
virtio virtgpu: don't reset on shutdown 2025-04-18 10:05:49 -04:00
w1
watchdog treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
xen x86/xen: fix balloon target initialization for PVH dom0 2025-04-07 11:24:12 +02:00
zorro
Kconfig
Makefile