linux/kernel
David Rientjes 89e8a244b9 cpusets: avoid looping when storing to mems_allowed if one node remains set
{get,put}_mems_allowed() exist so that general kernel code may locklessly
access a task's set of allowable nodes without having the chance that a
concurrent write will cause the nodemask to be empty on configurations
where MAX_NUMNODES > BITS_PER_LONG.

This could incur a significant delay, however, especially in low memory
conditions because the page allocator is blocking and reclaim requires
get_mems_allowed() itself.  It is not atypical to see writes to
cpuset.mems take over 2 seconds to complete, for example.  In low memory
conditions, this is problematic because it's one of the most imporant
times to change cpuset.mems in the first place!

The only way a task's set of allowable nodes may change is through cpusets
by writing to cpuset.mems and when attaching a task to a generic code is
not reading the nodemask with get_mems_allowed() at the same time, and
then clearing all the old nodes.  This prevents the possibility that a
reader will see an empty nodemask at the same time the writer is storing a
new nodemask.

If at least one node remains unchanged, though, it's possible to simply
set all new nodes and then clear all the old nodes.  Changing a task's
nodemask is protected by cgroup_mutex so it's guaranteed that two threads
are not changing the same task's nodemask at the same time, so the
nodemask is guaranteed to be stored before another thread changes it and
determines whether a node remains set or not.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Paul Menage <paul@paulmenage.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-02 16:07:00 -07:00
..
debug kgdb: follow rename pack_hex_byte() to hex_byte_pack() 2011-10-31 17:30:56 -07:00
events mm: distinguish between mlocked and pinned pages 2011-10-31 17:30:46 -07:00
gcov
irq Merge branch 'next/dt' of git://git.linaro.org/people/arnd/arm-soc 2011-11-01 21:02:35 -07:00
power
time
trace
.gitignore
acct.c
async.c
audit_tree.c
audit_watch.c
audit.c
audit.h
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c
cgroup_freezer.c
cgroup.c cgroups: don't attach task to subsystem if migration failed 2011-11-02 16:06:59 -07:00
compat.c
configs.c
cpu_pm.c
cpu.c
cpuset.c cpusets: avoid looping when storing to mems_allowed if one node remains set 2011-11-02 16:07:00 -07:00
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c oom: remove oom_disable_count 2011-10-31 17:30:45 -07:00
extable.c
fork.c oom: remove oom_disable_count 2011-10-31 17:30:45 -07:00
freezer.c
futex_compat.c
futex.c
groups.c
hrtimer.c
hung_task.c
irq_work.c
itimer.c
jump_label.c
kallsyms.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c
kfifo.c
kmod.c
kprobes.c
ksysfs.c
kthread.c
latencytop.c
lockdep_internals.h
lockdep_proc.c
lockdep_states.h
lockdep.c
Makefile
module.c
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
nsproxy.c
padata.c
panic.c
params.c
pid_namespace.c
pid.c
posix-cpu-timers.c
posix-timers.c
printk.c printk: remove bounds checking for log_prefix 2011-10-31 17:30:53 -07:00
profile.c
ptrace.c
range.c
rcu.h
rcupdate.c
rcutiny_plugin.h
rcutiny.c
rcutorture.c
rcutree_plugin.h
rcutree_trace.c
rcutree.c
rcutree.h
relay.c
res_counter.c
resource.c
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
sched_autogroup.c
sched_autogroup.h
sched_clock.c
sched_cpupri.c
sched_cpupri.h
sched_debug.c
sched_fair.c
sched_features.h
sched_idletask.c
sched_rt.c
sched_stats.h
sched_stoptask.c
sched.c
seccomp.c
semaphore.c
signal.c
smp.c
softirq.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c stop_machine: make stop_machine safe and efficient to call early 2011-10-31 17:30:53 -07:00
sys_ni.c Cross Memory Attach 2011-10-31 17:30:44 -07:00
sys.c
sysctl_binary.c
sysctl_check.c
sysctl.c Merge branch 'akpm' (Andrew's incoming) 2011-10-31 17:46:07 -07:00
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c
tracepoint.c
tsacct.c
uid16.c
up.c
user_namespace.c
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
wait.c
watchdog.c watchdog: move watchdog_*_all_cpus under CONFIG_SYSCTL 2011-10-31 17:30:53 -07:00
workqueue_sched.h
workqueue.c