Mirror of https://github.com/torvalds/linux.git, synced 2026-05-31 02:24:24 +02:00
x86/virt: Silence RCU lockdep splat in emergency virt callback path

x86_virt_invoke_kvm_emergency_callback() reaches rcu_dereference() through
machine_crash_shutdown() with IRQs disabled but with RCU not necessarily
watching the crashing CPU, which triggers a suspicious RCU usage splat on
debug kernels (CONFIG_PROVE_RCU=y) during panic/kdump:

  WARNING: suspicious RCU usage
  arch/x86/virt/hw.c:52 suspicious rcu_dereference_check() usage!
  rcu_scheduler_active = 2, debug_locks = 1
  1 lock held by tee/11119:
   #0: ffff8881fa32c440 (sb_writers#3){.+.+}-{0:0}, at: ksys_write
  Call Trace:
   <TASK>
   dump_stack_lvl+0x84/0xd0
   lockdep_rcu_suspicious.cold+0x37/0x8f
   x86_virt_invoke_kvm_emergency_callback+0x5f/0x70
   x86_svm_emergency_disable_virtualization_cpu+0x2a/0x30
   x86_virt_emergency_disable_virtualization_cpu+0x6b/0x90
   native_machine_crash_shutdown+0x72/0x170
   __crash_kexec+0x137/0x280
   panic+0xce/0xd0
   sysrq_handle_crash+0x1f/0x20
   __handle_sysrq.cold+0x192/0x335
   write_sysrq_trigger+0x8c/0xc0
   proc_reg_write+0x1c3/0x3c0
   vfs_write+0x1d0/0xf80
   ksys_write+0x116/0x250
   do_syscall_64+0x11c/0x1480
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
   </TASK>

A truly correct fix is non-trivial: the RCU usage genuinely is wrong in
panic context (RCU may ignore the crashing CPU during synchronization),
and a concurrent KVM module unload could in principle race with the
callback read; see commit 2baa33a8dd ("KVM: x86: Leave user-return
notifier registered on reboot/shutdown") which notes that nothing
prevents module unload during panic/reboot.

However, the alternatives are worse:

 - smp_store_release()/smp_load_acquire() handles ordering but not
   liveness; the kernel still needs to keep the module text alive while
   the callback is in flight.

 - Taking a lock in the panic path is risky: any lock could be held by a
   CPU that has already been NMI'd to a halt.

Use rcu_dereference_raw() to silence the splat and accept the vanishingly
small remaining race. Panic context inherently cannot guarantee complete
correctness; the goal here is to keep debug builds quiet on the kdump
path so the splat doesn't obscure the actual kernel state being captured.

Reproducible on a debug kernel (CONFIG_PROVE_LOCKING=y,
CONFIG_PROVE_RCU=y) with kvm_amd or kvm_intel loaded by triggering kdump:

  echo c > /proc/sysrq-trigger

Suggested-by: Sean Christopherson <seanjc@google.com>
Fixes: 428afac5a8 ("KVM: x86: Move bulk of emergency virtualizaton logic to virt subsystem")
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Acked-by: Sean Christopherson <seanjc@google.com>
Link: https://patch.msgid.link/20260504235435.90957-1-mikhail.v.gavrilov@gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

parent 6d3790bc68
commit fff82ea9d9

@@ -49,7 +49,20 @@ static void x86_virt_invoke_kvm_emergency_callback(void)
 {
 	cpu_emergency_virt_cb *kvm_callback;
 
-	kvm_callback = rcu_dereference(kvm_emergency_callback);
+	/*
+	 * RCU may not be watching the crashing CPU here, so rcu_dereference()
+	 * triggers a suspicious-RCU-usage splat.  In principle, a concurrent
+	 * KVM module unload could race with this read; see commit 2baa33a8ddd6
+	 * ("KVM: x86: Leave user-return notifier registered on reboot/shutdown")
+	 * which notes that nothing prevents module unload during panic/reboot.
+	 *
+	 * However, taking a lock here would be riskier than the current race:
+	 * the system is going down via NMI shootdown, and any lock could be
+	 * held by an already-stopped CPU.  Use rcu_dereference_raw() to silence
+	 * the lockdep splat and accept the comically small remaining race;
+	 * panic context inherently cannot guarantee complete correctness.
+	 */
+	kvm_callback = rcu_dereference_raw(kvm_emergency_callback);
 	if (kvm_callback)
 		kvm_callback();
 }
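
The "ordering but not liveness" point in the commit message can be sketched in userspace with C11 atomics. This is only an analogy, not kernel code: `emergency_cb`, `register_callback()`, `unregister_callback()` and `invoke_emergency_callback()` are hypothetical names, and the release/acquire pair stands in for smp_store_release()/smp_load_acquire(). The reader sees either NULL or a fully published pointer, but nothing stops the writer from clearing the pointer (and, in the kernel case, unloading the module text) while a reader that already loaded it is still running the callback.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's cpu_emergency_virt_cb type. */
typedef void (emergency_cb)(void);

/* Hypothetical stand-in for the __rcu-annotated kvm_emergency_callback. */
static emergency_cb *_Atomic emergency_callback;

static int demo_calls;

/* Example callback; a real one would disable virtualization on the CPU. */
static void demo_cb(void)
{
	demo_calls++;
}

/* Publish the callback; release ordering makes prior setup visible first. */
static void register_callback(emergency_cb *cb)
{
	atomic_store_explicit(&emergency_callback, cb, memory_order_release);
}

/*
 * Clear the callback.  Ordering is handled, liveness is not: a reader
 * that loaded the old pointer just before this store may still be
 * executing it.  In the kernel, a grace-period wait would normally
 * close that window, which is what cannot be relied on in panic context.
 */
static void unregister_callback(void)
{
	atomic_store_explicit(&emergency_callback, NULL, memory_order_release);
}

/* Load the pointer once and call it if set; returns 1 if it was called. */
static int invoke_emergency_callback(void)
{
	emergency_cb *cb;

	cb = atomic_load_explicit(&emergency_callback, memory_order_acquire);
	if (!cb)
		return 0;

	cb();
	return 1;
}
```

The reader takes no lock, which mirrors the reasoning in the code comment: a lock taken here could already be held by a CPU that has been stopped by the NMI shootdown.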