linux/drivers/clocksource
Thomas Gleixner 763aacf86f clocksource: Rewrite watchdog code completely
The clocksource watchdog code has over time reached the state of an
impenetrable maze of duct tape and staples. The original design, which was
made in the context of systems far smaller than today's, is based on the
assumption that the clocksource to be monitored (TSC) can be trivially
compared against a clocksource known to be stable (HPET/ACPI-PM timer).

Over the years it turned out that this approach has major flaws:

  - Long delays between watchdog invocations can result in wraparounds
    of the reference clocksource

  - Scalability of the reference clocksource readout can degrade on large
    multi-socket systems due to interconnect congestion

This was addressed with various heuristics, which degraded the accuracy of
the watchdog to the point that it fails to detect actual TSC problems on
older hardware which exhibits slow inter-CPU drift due to firmware
manipulating the TSC to hide SMI time.

To address this and bring back sanity to the watchdog, rewrite the code
completely with a different approach:

  1) Restrict the validation against a reference clocksource to the boot
     CPU, which is usually the CPU/Socket closest to the legacy block which
     contains the reference source (HPET/ACPI-PM timer). Validate that the
     reference readout is within a bounded latency so that the actual
     comparison against the TSC stays within 500ppm as long as the clocks
     are stable.

  2) Compare the TSCs of the other CPUs in a round robin fashion against
     the boot CPU in the same way the TSC synchronization on CPU hotplug
     works. This still can suffer from delayed reaction of the remote CPU
     to the SMP function call and the latency of the control variable cache
     line. But this latency is not affecting correctness. It only affects
     the accuracy. With low contention the readout latency is in the low
     nanoseconds range, which detects even slight skews between CPUs. Under
     high contention this becomes obviously less accurate, but still
     detects slow skews reliably as it solely relies on subsequent readouts
     being monotonically increasing. It just can take slightly longer to
     detect the issue.

  3) Rewrite the watchdog test so it tests the various mechanisms one by
     one and validates the result against the expectation.
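
The checks behind steps 1) and 2) can be illustrated with a minimal
userspace sketch. This is not the kernel implementation: the function
names, the nanosecond units, and the latency bound MAX_READ_LATENCY_NS are
assumptions made for illustration. Only the 500ppm threshold and the
monotonicity requirement come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bound on one reference readout; not a kernel value. */
#define MAX_READ_LATENCY_NS	2000ULL
/* Deviation threshold named in the description above. */
#define MAX_DEVIATION_PPM	500ULL

/*
 * Step 1a: the reference readout is bracketed by two timestamps. The
 * sample is only usable if the bracket is narrow enough. If 'after' is
 * smaller than 'before', the unsigned subtraction wraps to a huge value
 * and the sample is rejected as well.
 */
static bool readout_is_usable(uint64_t before_ns, uint64_t after_ns)
{
	return after_ns - before_ns <= MAX_READ_LATENCY_NS;
}

/*
 * Step 1b: compare the interval measured by the reference with the
 * interval measured by the monitored clocksource.
 *
 *   diff / ref <= ppm / 1e6   <=>   diff * 1e6 <= ref * ppm
 *
 * The multiplications fit in 64 bits for intervals of a few hours.
 */
static bool within_ppm(uint64_t ref_delta_ns, uint64_t mon_delta_ns)
{
	uint64_t diff = ref_delta_ns > mon_delta_ns ?
			ref_delta_ns - mon_delta_ns :
			mon_delta_ns - ref_delta_ns;

	return diff * 1000000ULL <= ref_delta_ns * MAX_DEVIATION_PPM;
}

/*
 * Step 2: readouts taken alternately on two CPUs are stored in the order
 * they were observed. A skew shows up as a readout that is smaller than
 * its predecessor. Equal neighbours are accepted in this sketch.
 */
static bool readouts_monotonic(const uint64_t *samples, size_t n)
{
	assert(samples != NULL);

	for (size_t i = 1; i < n; i++) {
		if (samples[i] < samples[i - 1])
			return false;
	}
	return true;
}
```

With these definitions, a one second reference interval against a
monitored interval that is 400us longer (400ppm) passes within_ppm(),
while one that is 600us longer (600ppm) fails. Latency in the readout
sequence only delays the point at which readouts_monotonic() sees a
backwards step; it does not produce a false positive.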

Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Daniel J Blueman <daniel@quora.org>
Reviewed-by: Jiri Wiesner <jwiesner@suse.de>
Reviewed-by: Daniel J Blueman <daniel@quora.org>
Link: https://patch.msgid.link/20260123231521.926490888@kernel.org
Link: https://patch.msgid.link/87h5qeomm5.ffs@tglx
2026-03-20 13:36:32 +01:00
..
acpi_pm.c clocksource: Rewrite watchdog code completely 2026-03-20 13:36:32 +01:00
arc_timer.c
arm_arch_timer_mmio.c clocksource/drivers/arm_arch_timer_mmio: Prevent driver unbind 2025-11-26 11:24:47 +01:00
arm_arch_timer.c clocksource/drivers/arm_arch_timer_mmio: Switch over to standalone driver 2025-09-23 12:31:50 +02:00
arm_global_timer.c clocksource/drivers/arm_global_timer: Add auto-detection for initial prescaler values 2025-09-23 12:41:58 +02:00
armv7m_systick.c
asm9260_timer.c clocksource/drivers/asm9260: Add missing clk_disable_unprepare in asm9260_timer_init 2024-09-06 14:49:21 +02:00
bcm_kona_timer.c clocksource/drivers/bcm_kona: Convert to SPDX identifier 2022-05-18 11:08:59 +02:00
bcm2835_timer.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
clksrc_st_lpc.c
clksrc-dbx500-prcmu.c
clps711x-timer.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
dummy_timer.c
dw_apb_timer_of.c
dw_apb_timer.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
em_sti.c clocksource: remove MODULE_LICENSE in non-modules 2023-04-13 13:13:51 -07:00
exynos_mct.c clocksource/drivers/exynos_mct: Fixed a spelling error 2025-03-07 17:55:59 +01:00
hyperv_timer.c x86/paravirt: Move paravirt_sched_clock() related code into tsc.c 2026-01-12 18:47:39 +01:00
i8253.c clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable() 2025-05-05 15:34:49 +02:00
ingenic-ost.c clocksource/drivers/ingenic: Use devm_clk_get_enabled() helpers 2024-09-06 14:49:20 +02:00
ingenic-sysost.c Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses 2026-02-22 08:26:33 -08:00
ingenic-timer.c Convert 'alloc_flex' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
jcore-pit.c irqchip/jcore-aic, clocksource/drivers/jcore: Fix jcore-pit interrupt request 2025-02-17 23:27:49 +01:00
Kconfig MIPS: Don't select CLOCKSOURCE_WATCHDOG 2026-03-12 12:23:26 +01:00
Makefile clocksource/drivers: Add Realtek system timer driver 2025-11-26 11:25:15 +01:00
mips-gic-timer.c clocksource/drivers/mips-gic-timer: Move GIC timer to request_percpu_irq() 2026-01-20 18:07:24 +01:00
mmio.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
mps2-timer.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
mxs_timer.c
nomadik-mtu.c clocksource: Explicitly include correct DT includes 2023-08-28 13:30:57 -05:00
numachip.c
renesas-ostm.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
samsung_pwm_timer.c of: remove internal arguments from of_property_for_each_u32() 2024-07-25 06:53:47 -05:00
scx200_hrt.c clocksource/drivers/scx200: Add module owner 2025-09-23 10:21:24 +02:00
sh_cmt.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
sh_mtu2.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
sh_tmu.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
timer-armada-370-xp.c clocksource/drivers/armada-370-xp: Fix dead link to timer binding 2026-01-20 18:06:45 +01:00
timer-atmel-pit.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-atmel-st.c
timer-atmel-tcb.c clocksource/drivers/timer-atmel-tcb: Fix initialization on SAM9 hardware 2023-10-13 12:56:50 +02:00
timer-cadence-ttc.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-clint.c riscv: Use IPIs for remote cache/TLB flushes by default 2024-04-29 10:49:26 -07:00
timer-cs5535.c clocksource/drivers/cs5535: Add module owner 2025-09-23 10:52:23 +02:00
timer-davinci.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-digicolor.c clocksource/drivers/digicolor: Convert to SPDX identifier 2022-05-18 11:08:59 +02:00
timer-econet-en751221.c clocksource/timer-econet-en751221: Convert comma to semicolon 2025-09-23 10:56:13 +02:00
timer-ep93xx.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-fsl-ftm.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-fttmr010.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-goldfish.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-gx6605s.c
timer-gxp.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-imx-gpt.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-imx-sysctr.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-imx-tpm.c clocksource/drivers/imx-tpm: Fix next event not taking effect sometime 2024-09-02 10:04:15 +02:00
timer-integrator-ap.c clocksource: Explicitly include correct DT includes 2023-08-28 13:30:57 -05:00
timer-ixp4xx.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-keystone.c
timer-loongson1-pwm.c clocksource/drivers/loongson1: Set variable ls1x_timer_lock storage-class-specifier to static 2023-08-18 12:13:03 +02:00
timer-lpc32xx.c clocksource/drivers/lpc32xx: Convert to SPDX identifier 2022-05-18 11:08:59 +02:00
timer-mediatek-cpux.c clocksource/drivers/timer-mediatek: Split out CPUXGPT timers 2023-04-24 16:56:13 +02:00
timer-mediatek.c clocksource/drivers/timer-mediatek: Split out CPUXGPT timers 2023-04-24 16:56:13 +02:00
timer-meson6.c
timer-microchip-pit64b.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-milbeaut.c
timer-mp-csky.c
timer-msc313e.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-npcm7xx.c clocksource/drivers/timer-npcm7xx: Enable timer 1 clock before use 2022-12-02 12:48:28 +01:00
timer-nxp-pit.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-nxp-stm.c clocksource/drivers/nxp-stm: Prevent driver unbind 2025-11-26 11:25:03 +01:00
timer-of.c clocksource/drivers/timer-of: Remove percpu irq related code 2024-09-02 10:04:15 +02:00
timer-of.h clocksource/drivers/timer-of: Remove percpu irq related code 2024-09-02 10:04:15 +02:00
timer-orion.c arm: Handle KCOV __init vs inline mismatches 2025-07-21 21:43:39 -07:00
timer-owl.c
timer-pistachio.c clocksource/drivers/pistachio: Convert to SPDX identifier 2022-05-18 11:08:59 +02:00
timer-probe.c
timer-pxa.c
timer-qcom.c clocksource/drivers/qcom: Remove clockevents shutdown call on offlining 2024-10-31 10:41:43 +01:00
timer-ralink.c clocksource/drivers/ralink: Fix resource leaks in init error path 2025-11-26 11:24:34 +01:00
timer-rda.c clocksource/drivers/rda: Add sched_clock_register for RDA8810PL SoC 2025-11-26 11:25:11 +01:00
timer-realtek.c clocksource/drivers: Add Realtek system timer driver 2025-11-26 11:25:15 +01:00
timer-riscv.c riscv: clocksource: Fix stimecmp update hazard on RV32 2026-01-14 17:42:46 -07:00
timer-rockchip.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-rtl-otto.c clocksource/drivers/timer-rtl-otto: Simplify documentation 2025-09-23 12:41:26 +02:00
timer-sp.h
timer-sp804.c clocksource/drivers/timer-sp804: Fix an Oops when read_current_timer is called on ARM32 platforms where the SP804 is not registered as the sched_clock. 2026-01-20 18:06:54 +01:00
timer-sprd.c clocksource/drivers/sprd: Enable register for timer counter from 32 bit to 64 bit 2025-11-26 11:24:26 +01:00
timer-stm32-lp.c clocksource/drivers/stm32-lp: Drop unused module alias 2025-11-26 11:25:15 +01:00
timer-stm32.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-sun4i.c clocksource/drivers/timer-sun4i: Add CLOCK_EVT_FEAT_DYNIRQ 2023-02-13 13:10:17 +01:00
timer-sun5i.c clocksource/drivers/sun5i: Add module owner 2025-09-23 10:51:44 +02:00
timer-tegra.c clocksource/drivers/timer-tegra: Remove clockevents shutdown call on offlining 2024-10-31 10:41:43 +01:00
timer-tegra186.c clocksource/drivers/timer-tegra186: Don't print superfluous errors 2025-09-23 12:41:39 +02:00
timer-ti-32k.c clocksource/drivers/ti-32K: Fix misuse of "/**" comment 2024-01-22 13:16:32 +01:00
timer-ti-dm-systimer.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
timer-ti-dm.c clocksource/drivers/timer-ti-dm : Capture functionality for OMAP DM timer 2025-09-23 12:32:40 +02:00
timer-versatile.c
timer-vt8500.c
timer-zevio.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00