linux/kernel
Michal Hocko e033782a26 OOM, PM: OOM killed task shouldn't escape PM suspend
commit 5695be142e upstream.

PM freezer relies on having all tasks frozen by the time devices are
getting frozen so that no task will touch them while they are getting
frozen. But OOM killer is allowed to kill an already frozen task in
order to handle OOM situtation. In order to protect from late wake ups
OOM killer is disabled after all tasks are frozen. This, however, still
keeps a window open when a killed task didn't manage to die by the time
freeze_processes finishes.

Reduce the race window by checking all tasks after OOM killer has been
disabled. This is still not race free completely unfortunately because
oom_killer_disable cannot stop an already ongoing OOM killer so a task
might still wake up from the fridge and get killed without
freeze_processes noticing. Full synchronization of OOM and freezer is,
however, too heavy weight for this highly unlikely case.

Introduce and check oom_kills counter which gets incremented early when
the allocator enters __alloc_pages_may_oom path and only check all the
tasks if the counter changes during the freezing attempt. The counter
is updated so early to reduce the race window since allocator checked
oom_killer_disabled which is set by PM-freezing code. A false positive
will push the PM-freezer into a slow path but that is not a big deal.

Changes since v1
- push the re-check loop out of freeze_processes into
  check_frozen_processes and invert the condition to make the code more
  readable as per Rafael

Fixes: f660daac47 (oom: thaw threads if oom killed thread is frozen before deferring)
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14 08:47:58 -08:00
..
cpu sched, idle: Fix the idle polling state logic 2013-11-29 11:11:42 -08:00
debug kgdb/sysrq: fix inconstistent help message of sysrq key 2013-04-30 17:04:10 -07:00
events perf: fix perf bug in fork() 2014-10-09 12:18:42 -07:00
gcov kernel/gcov: remove depends on CONFIG_EXPERIMENTAL 2013-01-11 11:39:33 -08:00
irq genirq: Sanitize spurious interrupt detection of threaded irqs 2014-06-30 20:09:45 -07:00
power OOM, PM: OOM killed task shouldn't escape PM suspend 2014-11-14 08:47:58 -08:00
sched printk: rename printk_sched to printk_deferred 2014-08-07 14:30:26 -07:00
time alarmtimer: Lock k_itimer during timer callback 2014-10-05 14:54:14 -07:00
trace tracing/syscalls: Ignore numbers outside NR_syscalls' range 2014-11-14 08:47:53 -08:00
.gitignore kernel/hz.bc: ignore. 2013-04-22 07:09:06 -07:00
acct.c fs: Fix hang with BSD accounting on frozen filesystem 2013-05-04 14:57:58 -04:00
async.c async: rename and redefine async_func_ptr 2013-03-12 13:59:14 -07:00
audit_tree.c kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules() 2013-06-12 16:29:46 -07:00
audit_watch.c audit: catch possible NULL audit buffers 2013-01-11 14:54:55 -08:00
audit.c CAPABILITIES: remove undefined caps from all processes 2014-09-17 09:03:57 -07:00
audit.h audit: fix mq_open and mq_unlink to add the MQ root as a hidden parent audit_names record 2013-12-04 10:57:03 -08:00
auditfilter.c auditfilter.c: fix kernel-doc warnings 2013-05-24 16:22:52 -07:00
auditsc.c auditsc: audit_krule mask accesses need bounds checking 2014-06-16 13:42:53 -07:00
backtracetest.c
bounds.c
capability.c CAPABILITIES: remove undefined caps from all processes 2014-09-17 09:03:57 -07:00
cgroup_freezer.c cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free() 2012-11-19 08:13:38 -08:00
cgroup.c cgroup: use a dedicated workqueue for cgroup destruction 2013-12-04 10:57:20 -08:00
compat.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-05-01 07:21:43 -07:00
configs.c proc: Supply PDE attribute setting accessor functions 2013-05-01 17:29:18 -04:00
context_tracking.c Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-06-20 08:18:35 -10:00
cpu_pm.c
cpu.c sched: Fix hotplug vs. set_cpus_allowed_ptr() 2014-06-11 12:03:24 -07:00
cpuset.c cpuset,mempolicy: fix sleeping function called from invalid context 2014-07-17 15:58:00 -07:00
crash_dump.c
cred.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-18 10:55:28 -08:00
delayacct.c cputime: Use accessors to read task cputime stats 2013-01-27 19:23:31 +01:00
dma.c
elfcore.c
exec_domain.c
exit.c introduce for_each_thread() to replace the buggy while_each_thread() 2014-10-05 14:54:15 -07:00
extable.c extable: Flip the sorting message 2013-04-15 13:25:16 +02:00
fork.c perf: fix perf bug in fork() 2014-10-09 12:18:42 -07:00
freezer.c freezer: Do not freeze tasks killed by OOM killer 2014-11-14 08:47:58 -08:00
futex_compat.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-02-23 18:50:11 -08:00
futex.c futex: Make lookup_pi_state more robust 2014-06-07 13:25:41 -07:00
groups.c
hrtimer.c hrtimer: Set expiry time before switch_hrtimer_base() 2014-06-07 13:25:31 -07:00
hung_task.c
irq_work.c Merge branch 'nohz/printk-v8' into irq/core 2013-02-05 00:48:46 +01:00
itimer.c
jump_label.c
kallsyms.c kernel: kallsyms: memory override issue, need check destination buffer length 2013-04-15 15:17:26 +09:30
kcmp.c kcmp: fix standard comparison bug 2014-10-05 14:54:13 -07:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/mutex: Disable optimistic spinning on some architectures 2014-07-28 08:00:07 -07:00
Kconfig.preempt
kexec.c PCI: Disable Bus Master only on kexec reboot 2013-12-20 07:45:08 -08:00
kmod.c usermodehelper: check subprocess_info->path != NULL 2013-05-16 12:01:11 -07:00
kprobes.c kprobes: Fix to free gone and unused optprobes 2013-05-28 10:37:59 +02:00
ksysfs.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-12-11 18:10:49 -08:00
kthread.c kthread: implement probe_kthread_data() 2013-04-30 17:04:02 -07:00
latencytop.c
lglock.c
lockdep_internals.h
lockdep_proc.c
lockdep_states.h
lockdep.c Merge branch 'for-3.10/drivers' of git://git.kernel.dk/linux-block 2013-05-08 11:51:05 -07:00
Makefile We get rid of the general module prefix confusion with a binary config option, 2013-05-05 10:58:06 -07:00
modsign_certificate.S CONFIG_SYMBOL_PREFIX: cleanup. 2013-03-15 15:09:43 +10:30
modsign_pubkey.c keys: use keyring_alloc() to create module signing keyring 2012-12-20 17:40:21 -08:00
module_signing.c MODSIGN: Don't use enum-type bitfields in module signature info block 2012-12-05 11:27:24 +10:30
module-internal.h
module.c modules, lock around setting of MODULE_STATE_UNFORMED 2014-11-14 08:47:55 -08:00
mutex-debug.c
mutex-debug.h
mutex.c mutex: Back out architecture specific check for negative mutex count 2013-04-19 09:33:36 +02:00
mutex.h
notifier.c
nsproxy.c proc: Split the namespace stuff out into linux/proc_ns.h 2013-05-01 17:29:39 -04:00
padata.c padata: use __this_cpu_read per-cpu helper 2012-12-06 17:16:23 +08:00
panic.c dump_stack: implement arch-specific hardware description in task dumps 2013-04-30 17:04:02 -07:00
params.c params: Fix potential memory leak in add_sysfs_param() 2013-03-18 11:40:21 +00:00
pid_namespace.c pid_namespace: pidns_get() should check task_active_pid_ns() != NULL 2014-04-26 17:15:34 -07:00
pid.c pidns: Fix hang in zap_pid_ns_processes by sending a potentially extra wakeup 2013-09-26 17:18:27 -07:00
posix-cpu-timers.c posix_timers: Fix pre-condition to stop the tick on full dynticks 2013-04-22 19:59:25 +02:00
posix-timers.c posix-timers: Remove unused variable 2013-04-18 12:51:19 +02:00
printk.c printk: rename printk_sched to printk_deferred 2014-08-07 14:30:26 -07:00
profile.c proc: Supply PDE attribute setting accessor functions 2013-05-01 17:29:18 -04:00
ptrace.c exec/ptrace: fix get_dumpable() incorrect tests 2013-11-29 11:11:44 -08:00
range.c range: Do not add new blank slot with add_range_with_merge 2013-06-18 11:32:10 -05:00
rcu.h rcu: Provide RCU CPU stall warnings for tiny RCU 2013-01-28 22:06:21 -08:00
rcupdate.c Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', 'tagcb.2013.01.24a' and 'tiny.2013.01.29b' into HEAD 2013-01-28 22:25:21 -08:00
rcutiny_plugin.h rcu: Provide RCU CPU stall warnings for tiny RCU 2013-01-28 22:06:21 -08:00
rcutiny.c Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', 'tagcb.2013.01.24a' and 'tiny.2013.01.29b' into HEAD 2013-01-28 22:25:21 -08:00
rcutorture.c rcu: Allow rcutorture to be built at low optimization levels 2013-02-04 12:18:20 -08:00
rcutree_plugin.h rcu: Don't allocate bootmem from rcu_init() 2013-05-15 10:41:12 -07:00
rcutree_trace.c rcutrace: single_open() leaks 2013-05-05 00:16:35 -04:00
rcutree.c rcu: Fix deadlock with CPU hotplug, RCU GP init, and timer migration 2013-06-10 13:37:12 -07:00
rcutree.h rcu: Don't call wakeup() with rcu_node structure ->lock held 2013-06-10 13:37:11 -07:00
relay.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
res_counter.c res_counter: return amount of charges after res_counter_uncharge() 2012-12-18 15:02:12 -08:00
resource.c mem hotunplug: fix kfree() of bootmem memory 2013-04-29 15:54:40 -07:00
rtmutex_common.h
rtmutex-debug.c sched/rt: Move rt specific bits into new header file 2013-02-07 20:51:08 +01:00
rtmutex-debug.h rtmutex: Handle deadlock detection smarter 2014-07-17 15:58:04 -07:00
rtmutex-tester.c locking/rtmutex/tester: Set correct permissions on sysfs files 2013-04-10 14:48:37 +02:00
rtmutex.c rtmutex: Plug slow unlock race 2014-07-17 15:58:04 -07:00
rtmutex.h rtmutex: Handle deadlock detection smarter 2014-07-17 15:58:04 -07:00
rwsem.c Revert "rw_semaphore: remove up/down_read_non_owner" 2013-03-23 15:53:52 -07:00
seccomp.c seccomp: allow BPF_XOR based ALU instructions. 2013-03-26 11:07:19 +11:00
semaphore.c semaphore: use `bool' type for semaphore_waiter's up 2013-04-30 17:04:08 -07:00
signal.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-05-01 14:08:52 -07:00
smp.c kernel/smp.c:on_each_cpu_cond(): fix warning in fallback path 2014-09-17 09:03:57 -07:00
smpboot.c kthread: Prevent unpark race which puts threads on the wrong cpu 2013-04-12 14:18:43 +02:00
smpboot.h
softirq.c irq: Force hardirq exit's softirq processing on its own stack 2013-10-13 16:08:34 -07:00
spinlock.c
srcu.c srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock() 2013-02-07 15:19:36 -08:00
stacktrace.c
stop_machine.c stop_machine: Mark per cpu stopper enabled early 2013-02-26 22:25:17 +01:00
sys_ni.c unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE 2013-05-09 13:46:38 -04:00
sys.c reboot: rigrate shutdown/reboot to boot cpu 2013-06-12 16:29:44 -07:00
sysctl_binary.c switch compat_sys_sysctl to COMPAT_SYSCALL_DEFINE 2013-05-09 14:53:20 -04:00
sysctl.c perf: Enforce 1 as lower limit for perf_event_max_sample_rate 2014-06-11 12:03:27 -07:00
task_work.c
taskstats.c
test_kprobes.c kernel/: rename random32() to prandom_u32() 2013-04-29 18:28:42 -07:00
time.c jiffies: Fix timeval conversion to jiffies 2014-10-09 12:18:43 -07:00
timeconst.bc kernel: Replace timeconst.pl with a bc script 2013-02-16 23:17:25 +01:00
timer.c timer: Prevent overflow in apply_slack 2014-06-07 13:25:30 -07:00
tracepoint.c tracepoint: Do not waste memory on mods with no tracepoints 2014-05-30 21:52:11 -07:00
tsacct.c cputime: Use accessors to read task cputime stats 2013-01-27 19:23:31 +01:00
uid16.c make SYSCALL_DEFINE<n>-generated wrappers do asmlinkage_protect 2013-03-03 22:58:33 -05:00
up.c
user_namespace.c user namespace: fix incorrect memory barriers 2014-04-26 17:15:34 -07:00
user-return-notifier.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
user.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
utsname_sysctl.c kernel/utsname_sysctl.c: put get/get_uts() into CONFIG_PROC_SYSCTL code block 2013-02-27 19:10:22 -08:00
utsname.c proc: Split the namespace stuff out into linux/proc_ns.h 2013-05-01 17:29:39 -04:00
wait.c propagate name change to comments in kernel source 2012-12-06 10:39:54 +01:00
watchdog.c watchdog: Add comments to explain the watchdog_disabled variable 2013-03-14 08:24:05 +01:00
workqueue_internal.h workqueue: include workqueue info when printing debug dump of a worker task 2013-04-30 17:04:02 -07:00
workqueue.c workqueue: zero cpumask of wq_numa_possible_cpumask on init 2014-07-17 15:58:00 -07:00