Mirror of https://github.com/torvalds/linux.git, synced 2026-05-12 16:18:45 +02:00
doc: watchdog: further improvements
Make further additions and alterations to the watchdog documentation.

Link: https://lkml.kernel.org/r/acF3tXBxSr0KOP9b@pathway.suse.cz
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Max Kellermann <max.kellermann@ionos.com>
Cc: Mayank Rungta <mrungta@google.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Stephane Erainan <eranian@google.com>
Cc: Wang Jinchao <wangjinchao600@gmail.com>
Cc: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent cb8615f3cb
commit 4580900fe1
@@ -41,31 +41,35 @@ is a trade-off between fast response to lockups and detection overhead.
 Implementation
 ==============
 
-The soft lockup detector is built on top of the hrtimer subsystem.
-The hard lockup detector is built on top of the perf subsystem
-(on architectures that support it) or uses an SMP "buddy" system.
-
-Softlockup Detector
--------------------
-
-The watchdog job runs in a stop scheduling thread that updates a
-timestamp every time it is scheduled. If that timestamp is not updated
-for 2*watchdog_thresh seconds (the softlockup threshold) the
-'softlockup detector' (coded inside the hrtimer callback function)
-will dump useful debug information to the system log, after which it
-will call panic if it was instructed to do so or resume execution of
-other kernel code.
+The soft and hard lockup detectors are built around a hrtimer.
+In addition, the softlockup detector regularly schedules a job, and
+the hard lockup detector might use Perf/NMI events on architectures
+that support it.
 
 Frequency and Heartbeats
 ------------------------
 
-The hrtimer used by the softlockup detector serves a dual purpose:
-it detects softlockups, and it also generates the interrupts
-(heartbeats) that the hardlockup detectors use to verify CPU liveness.
+The core of the detectors is a hrtimer. It serves multiple purposes:
 
-The period of this hrtimer is 2*watchdog_thresh/5. This means the
-hrtimer has two or three chances to generate an interrupt before the
-NMI hardlockup detector kicks in.
+- schedules watchdog job for the softlockup detector
+- bumps the interrupt counter for hardlockup detectors (heartbeat)
+- detects softlockups
+- detects hardlockups in Buddy mode
+
+The period of this hrtimer is 2*watchdog_thresh/5, which is 4 seconds
+by default. The hrtimer has two or three chances to generate an interrupt
+(heartbeat) before the hardlockup detector kicks in.
+
+Softlockup Detector
+-------------------
+
+The watchdog job is scheduled by the hrtimer and runs in a stop scheduling
+thread. It updates a timestamp every time it is scheduled. If that timestamp
+is not updated for 2*watchdog_thresh seconds (the softlockup threshold) the
+'softlockup detector' (coded inside the hrtimer callback function)
+will dump useful debug information to the system log, after which it
+will call panic if it was instructed to do so or resume execution of
+other kernel code.
 
 Hardlockup Detector (NMI/Perf)
 ------------------------------
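
The timing figures quoted in the patch text (a 4-second hrtimer period by default, a softlockup threshold of 2*watchdog_thresh, and "two or three chances" for a heartbeat) can be checked with a little arithmetic. The sketch below is not kernel code. The function names are hypothetical, the default of 10 seconds for watchdog_thresh is inferred from the stated 4-second period, and the assumption that the hardlockup check window is watchdog_thresh seconds comes from the wider watchdog documentation rather than from this hunk.

```python
import math
from fractions import Fraction

# Assumed default, consistent with "2*watchdog_thresh/5, which is 4 seconds".
DEFAULT_WATCHDOG_THRESH = 10  # seconds


def hrtimer_period(watchdog_thresh):
    """Period of the watchdog hrtimer in seconds: 2*watchdog_thresh/5."""
    return Fraction(2 * watchdog_thresh, 5)


def softlockup_threshold(watchdog_thresh):
    """Seconds without a timestamp update before a softlockup is reported."""
    return 2 * watchdog_thresh


def heartbeat_chances(watchdog_thresh):
    """(min, max) hrtimer ticks that fit into one hardlockup check window.

    Assumes the window is watchdog_thresh seconds long, so the ratio is
    always 5/2 and the answer is "two or three" for any threshold.
    """
    ratio = Fraction(watchdog_thresh) / hrtimer_period(watchdog_thresh)
    return math.floor(ratio), math.ceil(ratio)


if __name__ == "__main__":
    t = DEFAULT_WATCHDOG_THRESH
    print("hrtimer period (s):", hrtimer_period(t))
    print("softlockup threshold (s):", softlockup_threshold(t))
    print("heartbeat chances:", heartbeat_chances(t))
```

Because the window-to-period ratio is fixed at 5/2, changing watchdog_thresh scales every interval but leaves the "two or three chances" property intact under these assumptions.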