Linux kernel source tree
Go to file
Shuai Xue 410063c9e1 ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on synchronous events
[ Upstream commit a70297d221 ]

There are two major types of uncorrected recoverable (UCR) errors :

 - Synchronous error: The error is detected and raised at the point of
   the consumption in the execution flow, e.g. when a CPU tries to
   access a poisoned cache line. The CPU will take a synchronous error
   exception such as Synchronous External Abort (SEA) on Arm64 and
   Machine Check Exception (MCE) on X86. OS requires to take action (for
   example, offline failure page/kill failure thread) to recover this
   uncorrectable error.

 - Asynchronous error: The error is detected out of processor execution
   context, e.g. when an error is detected by a background scrubber.
   Some data in the memory are corrupted. But the data have not been
   consumed. OS is optional to take action to recover this uncorrectable
   error.

When APEI firmware first is enabled, a platform may describe one error
source for the handling of synchronous errors (e.g. MCE or SEA notification
), or for handling asynchronous errors (e.g. SCI or External Interrupt
notification). In other words, we can distinguish synchronous errors by
APEI notification. For synchronous errors, kernel will kill the current
process which accessing the poisoned page by sending SIGBUS with
BUS_MCEERR_AR. In addition, for asynchronous errors, kernel will notify the
process who owns the poisoned page by sending SIGBUS with BUS_MCEERR_AO in
early kill mode. However, the GHES driver always sets mf_flags to 0 so that
all synchronous errors are handled as asynchronous errors in memory failure.

To this end, set memory failure flags as MF_ACTION_REQUIRED on synchronous
events.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Tested-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Xiaofei Tan <tanxiaofei@huawei.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-02-05 20:14:15 +00:00
arch x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel 2024-02-05 20:14:14 +00:00
block block: Move checking GENHD_FL_NO_PART to bdev_add_partition() 2024-01-31 16:19:12 -08:00
certs certs: Reference revocation list for all keyrings 2023-08-17 20:12:41 +00:00
crypto crypto: api - Disallow identical driver names 2024-01-31 16:18:49 -08:00
Documentation Documentation/sphinx: fix Python string escapes 2024-02-05 20:14:13 +00:00
drivers ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on synchronous events 2024-02-05 20:14:15 +00:00
fs cifs: fix stray unlock in cifs_chan_skip_or_disable 2024-01-31 16:19:13 -08:00
include arm64: irq: set the correct node for VMAP stack 2024-02-05 20:14:14 +00:00
init rootfs: Fix support for rootfstype= when root= is given 2024-01-25 15:35:46 -08:00
io_uring io_uring: adjust defer tw counting 2024-01-25 15:36:00 -08:00
ipc Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
kernel audit: Send netlink ACK before setting connection in auditd_set 2024-02-05 20:14:14 +00:00
lib debugobjects: Stop accessing objects after releasing hash bucket lock 2024-02-05 20:14:14 +00:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm memblock: fix crash when reserved memory is not added to memory 2024-01-31 16:19:12 -08:00
net net/bpf: Avoid unused "sin_addr_len" warning when CONFIG_CGROUP_BPF is not set 2024-01-31 16:19:09 -08:00
rust rust: Ignore preserve-most functions 2024-01-25 15:35:41 -08:00
samples vfio/mtty: Overhaul mtty interrupt handling 2024-01-10 17:16:55 +01:00
scripts scripts/get_abi: fix source path leak 2024-01-31 16:18:55 -08:00
security lsm: new security_file_ioctl_compat() hook 2024-01-31 16:18:54 -08:00
sound soundwire: fix initializing sysfs for same devices on different buses 2024-01-31 16:18:47 -08:00
tools kunit: tool: fix parsing of test attributes 2024-02-05 20:14:15 +00:00
usr initramfs: Encode dependency on KBUILD_BUILD_TIMESTAMP 2023-06-06 17:54:49 +09:00
virt ARM: 2023-09-07 13:52:20 -07:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: rpm-pkg: rename binkernel.spec to kernel.spec 2023-07-25 00:59:33 +09:00
.mailmap 20 hotfixes. 12 are cc:stable and the remainder address post-6.5 issues 2023-10-24 09:52:16 -10:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING
CREDITS USB: Remove Wireless USB and UWB documentation 2023-08-09 14:17:32 +02:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Char/Misc driver fixes for 6.6-final 2023-10-28 07:51:27 -10:00
Makefile Linux 6.6.15 2024-01-31 16:19:14 -08:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.