linux/kernel/time
Tom Hromatka 75e674419d sysrq: Reset the watchdog timers while displaying high-resolution timers
[ Upstream commit 0107042768 ]

On systems with a large number of CPUs, running sysrq-<q> can cause
watchdog timeouts.  There are two slow sections of code in the sysrq-<q>
path in timer_list.c.

1. print_active_timers() - This function is called by print_cpu() and
   contains a slow goto loop.  On a machine with hundreds of CPUs, this
   loop took approximately 100ms for the first CPU in a NUMA node.
   (Subsequent CPUs in the same node ran much quicker.)  The total time
   to print all of the CPUs is ultimately long enough to trigger the
   soft lockup watchdog.

2. print_tickdevice() - This function outputs a large amount of textual
   information.  This function also took approximately 100ms per CPU.

Since sysrq-<q> is not a performance-critical path, there should be no
harm in touching the NMI watchdog in both slow sections above.  Touching
it in just one location was insufficient on systems with hundreds of
CPUs, as occasional timeouts were still observed during testing.

This issue was observed on an Oracle T7 machine with 128 CPUs, but I
anticipate it may affect other systems with similarly large numbers of
CPUs.

Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-22 09:23:21 +01:00
alarmtimer.c alarmtimer: don't rate limit one-shot timers 2017-07-27 15:06:10 -07:00
clockevents.c
clocksource.c clocksource: Allow unregistering the watchdog 2016-09-15 08:27:47 +02:00
hrtimer.c hrtimer: Ensure POSIX compliance (relative CLOCK_REALTIME hrtimers) 2018-03-03 10:19:41 +01:00
itimer.c
jiffies.c
Kconfig
Makefile
ntp_internal.h
ntp.c ntp: Fix ADJ_SETOFFSET being used w/ ADJ_NANO 2016-09-15 08:27:47 +02:00
posix-clock.c
posix-cpu-timers.c posix_cpu_timer: Exit early when process has been reaped 2016-08-10 11:49:29 +02:00
posix-timers.c posix-timer: Properly check sigevent->sigev_notify 2018-02-16 20:09:40 +01:00
sched_clock.c timers, sched_clock: Update timeout for clock wrap 2018-03-22 09:23:21 +01:00
test_udelay.c
tick-broadcast-hrtimer.c
tick-broadcast.c tick/broadcast: Prevent NULL pointer dereference 2017-01-12 11:22:51 +01:00
tick-common.c
tick-internal.h
tick-oneshot.c
tick-sched.c nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() 2018-01-02 20:33:28 +01:00
tick-sched.h
time.c
timeconst.bc
timeconv.c
timecounter.c
timekeeping_debug.c timekeeping: Cap array access in timekeeping_debug 2016-09-15 08:27:52 +02:00
timekeeping_internal.h
timekeeping.c time: Fix clock->read(clock) race around clocksource changes 2017-06-29 12:48:51 +02:00
timekeeping.h
timer_list.c sysrq: Reset the watchdog timers while displaying high-resolution timers 2018-03-22 09:23:21 +01:00
timer_stats.c
timer.c timers: Plug locking race vs. timer migration 2018-01-31 12:06:08 +01:00