linux/include
Thomas Gleixner 763aacf86f clocksource: Rewrite watchdog code completely
The clocksource watchdog code has over time reached the state of an
impenetrable maze of duct tape and staples. The original design, which was
made in the context of systems far smaller than today, is based on the
assumption that the to be monitored clocksource (TSC) can be trivially
compared against a known to be stable clocksource (HPET/ACPI-PM timer).

Over the years it turned out that this approach has major flaws:

  - Long delays between watchdog invocations can result in wrap arounds
    of the reference clocksource

  - Scalability of the reference clocksource readout can degrade on large
    multi-socket systems due to interconnect congestion

This was addressed with various heuristics which degraded the accuracy of
the watchdog to the point that it fails to detect actual TSC problems on
older hardware which exposes slow inter CPU drifts due to firmware
manipulating the TSC to hide SMI time.

To address this and bring back sanity to the watchdog, rewrite the code
completely with a different approach:

  1) Restrict the validation against a reference clocksource to the boot
     CPU, which is usually the CPU/Socket closest to the legacy block which
     contains the reference source (HPET/ACPI-PM timer). Validate that the
     reference readout is within a bound latency so that the actual
     comparison against the TSC stays within 500ppm as long as the clocks
     are stable.

  2) Compare the TSCs of the other CPUs in a round robin fashion against
     the boot CPU in the same way the TSC synchronization on CPU hotplug
     works. This still can suffer from delayed reaction of the remote CPU
     to the SMP function call and the latency of the control variable cache
     line. But this latency is not affecting correctness. It only affects
     the accuracy. With low contention the readout latency is in the low
     nanoseconds range, which detects even slight skews between CPUs. Under
     high contention this becomes obviously less accurate, but still
     detects slow skews reliably as it solely relies on subsequent readouts
     being monotonically increasing. It just can take slightly longer to
     detect the issue.

  3) Rewrite the watchdog test so it tests the various mechanisms one by
     one and validating the result against the expectation.

Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Daniel J Blueman <daniel@quora.org>
Reviewed-by: Jiri Wiesner <jwiesner@suse.de>
Reviewed-by: Daniel J Blueman <daniel@quora.org>
Link: https://patch.msgid.link/20260123231521.926490888@kernel.org
Link: https://patch.msgid.link/87h5qeomm5.ffs@tglx
2026-03-20 13:36:32 +01:00
..
acpi mailbox: platform and core updates 2026-02-14 11:13:32 -08:00
asm-generic hrtimer: Push reprogramming timers into the interrupt return path 2026-02-27 16:40:14 +01:00
clocksource
crypto Networking changes for 7.0 2026-02-11 19:31:52 -08:00
cxl
drm drm/pagemap: pass pagemap_addr by reference 2026-02-17 19:39:44 -05:00
dt-bindings phy-for-7.0 2026-02-17 11:40:04 -08:00
hyperv hyperv-next for v7.0 2026-02-20 08:48:31 -08:00
keys
kunit treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
kvm KVM: arm64: Use standard seq_file iterator for vgic-debug debugfs 2026-02-02 10:59:25 +00:00
linux clocksource: Rewrite watchdog code completely 2026-03-20 13:36:32 +01:00
math-emu
media [GIT PULL for v7.0] media updates 2026-02-11 12:20:25 -08:00
memory
misc
net Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
pcmcia
ras
rdma RDMA v7.0 merge window 2026-02-12 17:05:20 -08:00
rv rv: Fix multiple definition of __pcpu_unique_da_mon_this 2026-02-20 13:12:00 +01:00
scsi SCSI misc on 20260212 2026-02-12 15:43:02 -08:00
soc
sound ASoC: Updates for v7.0 2026-02-09 17:39:11 +01:00
target
trace hrtimer: Drop unnecessary pointer indirection in hrtimer_expire_entry event 2026-03-12 12:15:55 +01:00
uapi drm next fixes for 7.0-rc1 2026-02-20 15:36:38 -08:00
ufs scsi: ufs: host: mediatek: Require CONFIG_PM 2026-02-03 22:28:44 -05:00
vdso
video
xen Partial revert "x86/xen: fix balloon target initialization for PVH dom0" 2026-02-02 07:31:22 +01:00
Kbuild