linux/kernel
Srivatsa S. Bhat c62f9945ef CPU hotplug, cpusets, suspend: Don't modify cpusets during suspend/resume
commit d35be8bab9 upstream.

In the event of CPU hotplug, the kernel modifies the cpusets' cpus_allowed
masks as and when necessary to ensure that the tasks belonging to the cpusets
have some place (online CPUs) to run on. And regular CPU hotplug is
destructive in the sense that the kernel doesn't remember the original cpuset
configurations set by the user, across hotplug operations.

However, suspend/resume (which uses CPU hotplug) is a special case in which
the kernel has the responsibility to restore the system (during resume), to
exactly the same state it was in before suspend.

In order to achieve that, do the following:

1. Don't modify cpusets during suspend/resume. At all.
   In particular, don't move the tasks from one cpuset to another, and
   don't modify any cpuset's cpus_allowed mask. So, simply ignore cpusets
   during the CPU hotplug operations that are carried out in the
   suspend/resume path.

2. However, cpusets and sched domains are related. We just want to avoid
   altering cpusets alone. So, to keep the sched domains updated, build
   a single sched domain (containing all active cpus) during each of the
   CPU hotplug operations carried out in s/r path, effectively ignoring
   the cpusets' cpus_allowed masks.

   (Since userspace is frozen while doing all this, it will go unnoticed.)

3. During the last CPU online operation during resume, build the sched
   domains by looking up the (unaltered) cpusets' cpus_allowed masks.
   That will bring back the system to the same original state as it was in
   before suspend.

Ultimately, this will not only solve the cpuset problem related to suspend
resume (ie., restores the cpusets to exactly what it was before suspend, by
not touching it at all) but also speeds up suspend/resume because we avoid
running cpuset update code for every CPU being offlined/onlined.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20120524141611.3692.20155.stgit@srivatsabhat.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-13 05:38:57 +09:00
..
debug KGDB/KDB regression fixes 2012-04-04 17:26:08 -07:00
events perf_event: Switch to internal refcount, fix race with close() 2012-10-02 10:29:54 -07:00
gcov gcov: disable CONSTRUCTORS for UML 2011-07-26 16:49:45 -07:00
irq random: remove rand_initialize_irq() 2012-08-15 08:10:29 -07:00
power ftrace: Disable function tracing during suspend/resume and hibernation, again 2012-08-09 08:31:29 -07:00
sched CPU hotplug, cpusets, suspend: Don't modify cpusets during suspend/resume 2012-10-13 05:38:57 +09:00
time time: Move ktime_t overflow checking into timespec_valid_strict 2012-10-02 10:30:36 -07:00
trace splice: fix racy pipe->buffers uses 2012-07-16 09:04:42 -07:00
.gitignore
acct.c Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
async.c Fix a dead loop in async_synchronize_full() 2012-10-02 10:30:35 -07:00
audit_tree.c audit: fix refcounting in audit-tree 2012-09-14 10:00:18 -07:00
audit_watch.c kill path_lookup() 2011-03-14 09:15:23 -04:00
audit.c constify path argument of audit_log_d_path() 2012-03-20 21:29:40 -04:00
audit.h audit: remove AUDIT_SETUP_CONTEXT as it isn't used 2012-01-17 16:16:57 -05:00
auditfilter.c audit: allow interfield comparison in audit rules 2012-01-17 16:17:01 -05:00
auditsc.c kernel-doc: fix new warnings in auditsc.c 2012-01-23 08:44:53 -08:00
backtracetest.c
bounds.c memcg: remove direct page_cgroup-to-page pointer 2011-03-23 19:46:28 -07:00
capability.c Revert "capabitlies: ns_capable can use the cap helpers rather than lsm call" 2012-01-17 10:19:41 -08:00
cgroup_freezer.c cgroup: remove cgroup_subsys argument from callbacks 2012-02-02 09:20:22 -08:00
cgroup.c cgroup: cgroup_attach_task() could return -errno after success 2012-03-29 22:03:33 -07:00
compat.c compat: Fix RT signal mask corruption via sigprocmask 2012-05-10 08:58:33 -07:00
configs.c kernel/configs.c: include MODULE_*() when CONFIG_IKCONFIG_PROC=n 2011-07-25 20:57:15 -07:00
cpu_pm.c cpu_pm: call notifiers during suspend 2011-09-23 12:05:29 +05:30
cpu.c Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
cpuset.c CPU hotplug, cpusets, suspend: Don't modify cpusets during suspend/resume 2012-10-13 05:38:57 +09:00
crash_dump.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
cred.c cred: copy_process() should clear child->replacement_session_keyring 2012-04-11 08:20:11 -07:00
delayacct.c KVM: Steal time implementation 2011-07-14 12:59:14 +03:00
dma.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
elfcore.c
exec_domain.c
exit.c posix_types.h: Cleanup stale __NFDBITS and related definitions 2012-08-09 08:31:39 -07:00
extable.c extable, core_kernel_data(): Make sure all archs define _sdata 2011-05-20 08:56:56 +02:00
fork.c mm/fork: fix overflow in vma length when copying mmap on clone 2012-06-10 00:36:06 +09:00
freezer.c PM / Freezer: Remove references to TIF_FREEZE in comments 2012-03-04 23:08:54 +01:00
futex_compat.c futex: Mark get_robust_list as deprecated 2012-03-29 11:37:17 +02:00
futex.c futex: Forbid uaddr == uaddr2 in futex_wait_requeue_pi() 2012-08-09 08:31:53 -07:00
groups.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
hrtimer.c hrtimer: Update hrtimer base offsets each hrtimer_interrupt 2012-07-19 08:59:00 -07:00
hung_task.c hung_task: fix the broken rcu_lock_break() logic 2012-03-05 15:49:42 -08:00
irq_work.c irq_work: fix compile failure on tile from missing include 2012-04-13 13:15:16 -04:00
itimer.c itimer: Use printk_once instead of WARN_ONCE 2012-04-10 11:00:30 +02:00
jump_label.c static keys: Inline the static_key_enabled() function 2012-02-28 20:01:08 +01:00
kallsyms.c Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2011-03-25 17:52:22 -07:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/kconfig: Simplify INLINE_SPIN_UNLOCK usage 2012-03-23 13:18:57 +01:00
Kconfig.preempt locking/kconfig: Simplify INLINE_SPIN_UNLOCK usage 2012-03-23 13:18:57 +01:00
kexec.c Merge branch 'akpm' (Andrew's patch-bomb) 2012-03-28 17:19:28 -07:00
kfifo.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
kmod.c PM / Sleep: Mitigate race between the freezer and request_firmware() 2012-03-28 23:30:28 +02:00
kprobes.c kprobes: return proper error code from register_kprobe() 2012-03-05 15:49:42 -08:00
ksysfs.c kernel: ksysfs.c is implicitly using stat.h 2011-10-31 09:20:13 -04:00
kthread.c kthread_worker: reimplement flush_kthread_work() to allow freeing the work item being executed 2012-10-02 10:30:40 -07:00
latencytop.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
lockdep_internals.h
lockdep_proc.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
lockdep_states.h
lockdep.c lockdep: Add CPU-idle/offline warning to lockdep-RCU splat 2012-02-21 09:06:06 -08:00
Makefile sysctl: Move the implementation into fs/proc/proc_sysctl.c 2012-01-24 16:37:54 -08:00
module.c module: Remove module size limit 2012-03-26 12:50:53 +10:30
mutex-debug.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
mutex-debug.h mutex: Use p->on_cpu for the adaptive spin 2011-04-14 08:52:33 +02:00
mutex.c sched/rt: Use schedule_preempt_disabled() 2012-03-01 10:28:03 +01:00
mutex.h mutex: Use p->on_cpu for the adaptive spin 2011-04-14 08:52:33 +02:00
notifier.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
nsproxy.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
padata.c padata: Fix cpu hotplug 2012-03-29 19:52:46 +08:00
panic.c kdump: Execute kmsg_dump(KMSG_DUMP_PANIC) after smp_send_stop() 2012-06-22 11:36:56 -07:00
params.c params: <level>_initcall-like kernel parameters 2012-03-26 12:50:51 +10:30
pid_namespace.c pidns: add reboot_pid_ns() to handle the reboot syscall 2012-03-28 17:14:36 -07:00
pid.c vfs: fix panic in __d_lookup() with high dentry hashtable counts 2012-02-13 20:45:38 -05:00
posix-cpu-timers.c [S390] cputime: add sparse checking and cleanup 2011-12-15 14:56:19 +01:00
posix-timers.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
printk.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-20 10:31:44 -07:00
profile.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
ptrace.c ptrace: remove PTRACE_SEIZE_DEVEL bit 2012-03-23 16:58:41 -07:00
range.c range: fix bogus misuse of module.h to get printk() 2011-10-31 09:20:11 -04:00
rcu.h rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit() 2012-02-21 09:06:12 -08:00
rcupdate.c rcu: Check for illegal use of RCU from offlined CPUs 2012-02-21 09:06:03 -08:00
rcutiny_plugin.h rcu: Simplify unboosting checks 2012-02-21 09:03:43 -08:00
rcutiny.c rcu: Add RCU_NONIDLE() for idle-loop RCU read-side critical sections 2012-02-21 09:06:13 -08:00
rcutorture.c PTR_ERR should be called before its argument is cleared. 2012-02-21 09:06:10 -08:00
rcutree_plugin.h rcu: Hold off RCU_FAST_NO_HZ after timer posted 2012-02-21 09:42:30 -08:00
rcutree_trace.c rcu: Rework detection of use of RCU by offline CPUs 2012-02-21 09:06:07 -08:00
rcutree.c rcu: Fix day-one dyntick-idle stall-warning bug 2012-10-13 05:38:55 +09:00
rcutree.h rcu: Rework detection of use of RCU by offline CPUs 2012-02-21 09:06:07 -08:00
relay.c splice: fix racy pipe->buffers uses 2012-07-16 09:04:42 -07:00
res_counter.c net: introduce res_counter_charge_nofail() for socket allocations 2012-01-22 15:08:46 -05:00
resource.c kernel/resource.c: move EXPORT_SYMBOL right after definition 2012-02-03 23:37:07 +01:00
rtmutex_common.h rtmutex: Simplify PI algorithm and make highest prio task get lock 2011-01-27 21:13:51 -05:00
rtmutex-debug.c lockdep, rtmutex, bug: Show taint flags on error 2011-12-06 08:16:49 +01:00
rtmutex-debug.h
rtmutex-tester.c rtmutex-tester: convert sysdev_class to a regular subsystem 2011-12-14 14:54:22 -08:00
rtmutex.c Revert "rcu: Permit rt_mutex_unlock() with irqs disabled" 2011-12-11 10:33:18 -08:00
rtmutex.h
rwsem.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
seccomp.c seccomp: audit abnormal end to a process due to seccomp 2012-01-17 16:16:55 -05:00
semaphore.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
signal.c Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
smp.c smp: add func to IPI cpus based on parameter func 2012-03-28 17:14:35 -07:00
softirq.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-20 10:32:09 -07:00
spinlock.c locking/kconfig: Simplify INLINE_SPIN_UNLOCK usage 2012-03-23 13:18:57 +01:00
srcu.c rcu: Call out dangers of expedited RCU primitives 2012-02-21 09:06:08 -08:00
stacktrace.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
stop_machine.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
sys_ni.c Cross Memory Attach 2011-10-31 17:30:44 -07:00
sys.c kernel/sys.c: call disable_nonboot_cpus() in kernel_restart() 2012-10-13 05:38:38 +09:00
sysctl_binary.c binary_sysctl(): fix memory leak 2011-12-20 10:25:04 -08:00
sysctl.c sysctl: fix write access to dmesg_restrict/kptr_restrict 2012-04-05 14:51:43 +10:00
taskstats.c Make TASKSTATS require root access 2011-09-19 17:04:37 -07:00
test_kprobes.c
time.c time: Remove bogus comments 2012-03-15 18:17:55 -07:00
timeconst.pl
timer.c Merge branch 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-01-06 07:53:34 -08:00
tracepoint.c static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]() 2012-02-24 10:05:59 +01:00
tsacct.c [S390] cputime: add sparse checking and cleanup 2011-12-15 14:56:19 +01:00
uid16.c userns: user namespaces: convert several capable() calls 2011-03-23 19:47:08 -07:00
up.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user_namespace.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user-return-notifier.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
utsname_sysctl.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
utsname.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
wait.c lockdep/waitqueues: Add better annotation 2011-12-21 10:07:39 +01:00
watchdog.c kernel/watchdog.c: add comment to watchdog() exit path 2012-03-23 16:58:32 -07:00
workqueue_sched.h
workqueue.c workqueue: add missing smp_wmb() in process_one_work() 2012-10-13 05:38:39 +09:00