linux/include
Lukas Wunner 51d0656959 genirq/manage: Reduce priority of forced secondary interrupt handler
Crystal reports that the PCIe Advanced Error Reporting driver gets stuck
in an infinite loop on PREEMPT_RT:

Both the primary interrupt handler aer_irq() as well as the secondary
handler aer_isr() are forced into threads with identical priority.
Crystal writes that on the ARM system in question, the primary handler
has to clear an error in the Root Error Status register...

   "before the next error happens, or else the hardware will set the
    Multiple ERR_COR Received bit.  If that bit is set, then aer_isr()
    can't rely on the Error Source Identification register, so it scans
    through all devices looking for errors -- and for some reason, on
    this system, accessing the AER registers (or any Config Space above
    0x400, even though there are capabilities located there) generates
    an Unsupported Request Error (but returns valid data).  Since this
    happens more than once, without aer_irq() preempting, it causes
    another multi error and we get stuck in a loop."

The issue does not show on non-PREEMPT_RT because the primary handler
runs in hardirq context and thus can preempt the threaded secondary
handler, clear the Root Error Status register and prevent the secondary
handler from getting stuck.

Emulate the same behavior on PREEMPT_RT by assigning a lower default
priority to the secondary handler if the primary handler is forced into
a thread.

Reported-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Crystal Wood <crwood@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/f6dcdb41be2694886b8dbf4fe7b3ab89e9d5114c.1761569303.git.lukas@wunner.de
Closes: https://lore.kernel.org/r/20250902224441.368483-1-crwood@redhat.com/
2025-11-01 21:30:02 +01:00
..
acpi More power management updates for 6.18-rc1 2025-10-07 09:39:51 -07:00
asm-generic hyperv-next for v6.18 2025-10-07 08:40:15 -07:00
clocksource
crypto This update includes the following changes: 2025-10-04 14:59:29 -07:00
cxl
drm drm/gpusvm, drm/xe: Fix userptr to not allow device private pages 2025-10-02 21:57:52 -07:00
dt-bindings There's a bunch of patches here across drivers/clk/ to migrate drivers to use 2025-10-07 09:28:37 -07:00
hyperv hyperv: Remove the spurious null directive line 2025-10-02 21:21:24 +00:00
keys
kunit linux_kselftest-kunit-6.18-rc1 2025-10-01 19:15:11 -07:00
kvm KVM/arm64 updates for 6.18 2025-09-30 13:23:28 -04:00
linux genirq/manage: Reduce priority of forced secondary interrupt handler 2025-11-01 21:30:02 +01:00
math-emu
media
memory
misc
net net: psp: don't assume reply skbs will have a socket 2025-10-03 10:23:50 -07:00
pcmcia
ras
rdma
rv kernel-6.18-rc1.clone3 2025-09-29 10:36:50 -07:00
scsi SCSI misc on 20251002 2025-10-03 19:17:48 -07:00
soc There's a bunch of patches here across drivers/clk/ to migrate drivers to use 2025-10-07 09:28:37 -07:00
sound
target
trace dma-mapping fixes for Linux 6.18: 2025-10-07 12:48:06 -07:00
uapi bpf-fixes 2025-10-11 10:31:38 -07:00
ufs scsi: ufs: core: Include UTP error in INT_FATAL_ERRORS 2025-09-30 16:10:29 -04:00
vdso Updates for the VDSO subsystem: 2025-09-30 16:58:21 -07:00
video
xen
Kbuild