linux/kernel
Tejun Heo 8852aac25e workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay
8376fe22c7 ("workqueue: implement mod_delayed_work[_on]()")
implemented mod_delayed_work[_on]() using the improved
try_to_grab_pending().  The function is later used, among others, to
replace [__]cancel_delayed_work() + queue_delayed_work() combinations.

Unfortunately, a delayed_work item w/ zero @delay is handled slightly
differently by mod_delayed_work_on() compared to
queue_delayed_work_on().  The latter skips the timer altogether and
directly queues it using queue_work_on() while the former schedules
a timer which will expire on the closest tick.  This means, when @delay
is zero, that [__]cancel_delayed_work() + queue_delayed_work_on()
makes the target item immediately executable while
mod_delayed_work_on() may induce a delay of up to a full tick.
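
The difference described above can be sketched as a small userspace
model.  This is not kernel code: every model_* and MODEL_* name is
hypothetical and only mirrors the zero-delay branch as this message
describes it, and the worst-case tick count for a non-zero delay is a
deliberate simplification.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum model_path {
	MODEL_QUEUED_DIRECTLY,	/* work is handed to the workqueue at once */
	MODEL_TIMER_ARMED,	/* work waits for a timer to expire */
};

/* Models queue_delayed_work_on() before the fix: 0 delay skips the timer. */
static enum model_path model_queue_delayed_work_on(unsigned long delay)
{
	if (!delay)
		return MODEL_QUEUED_DIRECTLY;
	return MODEL_TIMER_ARMED;
}

/* Models mod_delayed_work_on() before the fix: no 0 delay special case. */
static enum model_path model_mod_delayed_work_on(unsigned long delay)
{
	(void)delay;
	return MODEL_TIMER_ARMED;
}

/*
 * Simplified worst-case wait in ticks.  An armed timer with 0 delay
 * still has to wait for the closest tick, so it costs up to one tick.
 */
static unsigned long model_worst_case_ticks(enum model_path path,
					    unsigned long delay)
{
	assert(path == MODEL_QUEUED_DIRECTLY || path == MODEL_TIMER_ARMED);
	if (path == MODEL_QUEUED_DIRECTLY)
		return 0;
	return delay ? delay : 1;
}
```

In this model a zero delay costs nothing through the queue path but up
to one tick through the mod path, which is the gap the converted users
tripped over.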

This somewhat subtle difference breaks some of the converted users.
For example, block queue plugging uses delayed_work for deferred
processing and uses mod_delayed_work_on() when the queue needs to be
immediately unplugged.  The above problem manifested as a noticeably
higher number of context switches under certain circumstances.

The difference in behavior was caused by missing special case handling
for 0 delay in mod_delayed_work_on() compared to
queue_delayed_work_on().  Joonsoo Kim posted a patch to add it -
("workqueue: optimize mod_delayed_work_on() when @delay == 0")[1].
The patch was queued for 3.8 but it was described as an optimization
and I missed that it was a correctness issue.

As both queue_delayed_work_on() and mod_delayed_work_on() use
__queue_delayed_work() for queueing, it seems that the better approach
is to move the 0 delay special handling to the function instead of
duplicating it in mod_delayed_work_on().

Fix the problem by moving 0 delay special case handling from
queue_delayed_work_on() to __queue_delayed_work().  This replaces
Joonsoo's patch.
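
The shape of the fix can be sketched with a userspace model in which
both entry points go through one helper that owns the zero-delay check.
All fixed_* and FIXED_* names are hypothetical stand-ins, and the
pending-state handling of the real functions is left out, so this only
illustrates where the check lives, not the actual patch.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum fixed_path {
	FIXED_QUEUED_DIRECTLY,	/* work is handed to the workqueue at once */
	FIXED_TIMER_ARMED,	/* work waits for a timer to expire */
};

/* Stands in for __queue_delayed_work(): the 0 delay check lives here. */
static enum fixed_path fixed_queue_delayed_work_helper(unsigned long delay)
{
	if (!delay)
		return FIXED_QUEUED_DIRECTLY;
	return FIXED_TIMER_ARMED;
}

/* Stands in for queue_delayed_work_on(); pending check omitted. */
static enum fixed_path fixed_queue_delayed_work_on(unsigned long delay)
{
	return fixed_queue_delayed_work_helper(delay);
}

/* Stands in for mod_delayed_work_on(); grabbing a pending item omitted. */
static enum fixed_path fixed_mod_delayed_work_on(unsigned long delay)
{
	return fixed_queue_delayed_work_helper(delay);
}

/* Both callers share the helper, so they must agree for any delay. */
static void fixed_check_callers_agree(unsigned long delay)
{
	assert(fixed_queue_delayed_work_on(delay) ==
	       fixed_mod_delayed_work_on(delay));
}
```

Keeping the check in the shared helper means neither caller can drift
from the other on the zero-delay case.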

[1] http://thread.gmane.org/gmane.linux.kernel/1379011/focus=1379012

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Anders Kaseorg <andersk@MIT.EDU>
Reported-and-tested-by: Zlatko Calusic <zlatko.calusic@iskon.hr>
LKML-Reference: <alpine.DEB.2.00.1211280953350.26602@dr-wily.mit.edu>
LKML-Reference: <50A78AA9.5040904@iskon.hr>
Cc: Joonsoo Kim <js1304@gmail.com>
2012-12-01 16:43:18 -08:00
..
debug KGDB/KDB fixes and cleanups 2012-10-13 11:16:58 +09:00
events perf, powerpc: Fix hw breakpoints returning -ENOSPC 2012-10-30 10:07:58 +01:00
gcov
irq irqdomain: augment add_simple() to allocate descs 2012-10-10 08:57:26 +02:00
power Merge branch 'pm-qos' 2012-09-17 20:25:51 +02:00
sched Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-10-12 22:13:05 +09:00
time Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-10-12 22:17:48 +09:00
trace Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent 2012-10-21 19:53:34 +02:00
.gitignore
acct.c vfs: make path_openat take a struct filename pointer 2012-10-12 20:15:09 -04:00
async.c [SCSI] async: make async_synchronize_full() flush all work regardless of domain 2012-07-20 09:07:37 +01:00
audit_tree.c audit: clean up refcounting in audit-tree 2012-08-15 12:55:22 +02:00
audit_watch.c audit: optimize audit_compare_dname_path 2012-10-12 00:32:02 -04:00
audit.c fs: handle failed audit_log_start properly 2012-10-09 23:33:37 -04:00
audit.h audit: optimize audit_compare_dname_path 2012-10-12 00:32:02 -04:00
auditfilter.c audit: optimize audit_compare_dname_path 2012-10-12 00:32:02 -04:00
auditsc.c audit: make audit_inode take struct filename 2012-10-12 20:15:09 -04:00
backtracetest.c
bounds.c
capability.c
cgroup_freezer.c cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them 2012-09-14 12:01:16 -07:00
cgroup.c Revert "cgroup: Remove task_lock() from cgroup_post_fork()" 2012-10-19 14:09:35 -07:00
compat.c
configs.c
cpu_pm.c
cpu.c CPU hotplug, debug: detect imbalance between get_online_cpus() and put_online_cpus() 2012-10-09 16:22:15 +09:00
cpuset.c cpusets: Remove/update outdated comments 2012-07-24 13:53:28 +02:00
crash_dump.c
cred.c userns: Make credential debugging user namespace safe. 2012-08-23 22:54:18 -07:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
extable.c
fork.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2012-10-10 12:02:25 +09:00
freezer.c
futex_compat.c
futex.c futex: avoid wake_futex() for a PI futex_q 2012-11-26 17:41:24 -08:00
groups.c
hrtimer.c hrtimer: Update hrtimer base offsets each hrtimer_interrupt 2012-07-11 23:34:39 +02:00
hung_task.c
irq_work.c
itimer.c
jump_label.c jump_label: Export jump_label_rate_limit() 2012-08-06 19:00:35 +03:00
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking: Adjust spin lock inlining Kconfig options 2012-09-13 17:56:13 +02:00
Kconfig.preempt
kexec.c kdump: remove unneeded include 2012-10-06 03:05:19 +09:00
kfifo.c
kmod.c infrastructure for saner ret_from_kernel_thread semantics 2012-10-12 13:35:07 -04:00
kprobes.c kprobes/x86: Fix to support jprobes on ftrace-based kprobe 2012-09-13 22:52:11 -04:00
ksysfs.c
kthread.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2012-10-13 10:05:52 +09:00
latencytop.c
lglock.c
lockdep_internals.h
lockdep_proc.c
lockdep_states.h
lockdep.c lockdep: Check if nested lock is actually held 2012-09-13 17:00:44 +02:00
Makefile Makefile: Documentation for external tool should be correct 2012-10-25 16:00:53 -07:00
modsign_pubkey.c MODSIGN: Provide module signing public keys to the kernel 2012-10-10 20:01:22 +10:30
module_signing.c module_signing: fix printk format warning 2012-10-22 08:56:34 +03:00
module-internal.h MODSIGN: Move the magic string to the end of a module and eliminate the search 2012-10-19 17:30:40 -07:00
module.c module: fix out-by-one error in kallsyms 2012-10-31 13:56:37 +10:30
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
nsproxy.c
padata.c
panic.c panic: fix a possible deadlock in panic() 2012-07-30 17:25:13 -07:00
params.c
pid_namespace.c pidns: limit the nesting depth of pid namespaces 2012-10-25 14:37:53 -07:00
pid.c net ip6 flowlabel: Make owner a union of struct pid * and kuid_t 2012-08-14 21:49:25 -07:00
posix-cpu-timers.c
posix-timers.c
printk.c printk: Fix scheduling-while-atomic problem in console_cpu_notify() 2012-10-16 18:17:44 -07:00
profile.c
ptrace.c ptrace: mark __ptrace_may_access() static 2012-08-03 14:47:17 +10:00
range.c
rcu.h
rcupdate.c rcu: Add PROVE_RCU_DELAY to provoke difficult races 2012-09-23 07:42:49 -07:00
rcutiny_plugin.h rcu: Move TINY_PREEMPT_RCU away from raw_local_irq_save() 2012-09-23 07:42:51 -07:00
rcutiny.c rcu: Move TINY_RCU quiescent state out of extended quiescent state 2012-09-23 07:42:52 -07:00
rcutorture.c rcu: Prevent initialization race in rcutorture kthreads 2012-09-23 07:42:23 -07:00
rcutree_plugin.h rcu: Make RCU_FAST_NO_HZ handle adaptive ticks 2012-09-26 15:44:02 +02:00
rcutree_trace.c Merge remote-tracking branch 'tip/smp/hotplug' into next.2012.09.25b 2012-09-25 10:01:45 -07:00
rcutree.c rcu: Grace-period initialization excludes only RCU notifier 2012-10-08 09:06:38 -07:00
rcutree.h rcu: Grace-period initialization excludes only RCU notifier 2012-10-08 09:06:38 -07:00
relay.c
res_counter.c
resource.c kernel/resource.c: fix stack overflow in __reserve_region_with_split() 2012-10-06 03:05:31 +09:00
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
seccomp.c
semaphore.c
signal.c coredump: pass siginfo_t* to do_coredump() and below, not merely signr 2012-10-06 03:05:16 +09:00
smp.c
smpboot.c hotplug: Fix UP bug in smpboot hotplug code 2012-08-13 17:01:07 +02:00
smpboot.h smpboot: Provide infrastructure for percpu hotplug threads 2012-08-13 17:01:07 +02:00
softirq.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-10-01 10:43:39 -07:00
spinlock.c
srcu.c workqueue: deprecate system_nrt[_freezable]_wq 2012-08-20 14:51:24 -07:00
stacktrace.c
stop_machine.c
sys_ni.c
sys.c use clamp_t in UNAME26 fix 2012-10-19 18:51:17 -07:00
sysctl_binary.c mm: prepare for removal of obsolete /proc/sys/vm/nr_pdflush_threads 2012-07-31 18:42:40 -07:00
sysctl.c Kconfig: clean up the "#if defined(arch)" list for exception-trace sysctl entry 2012-10-09 16:22:14 +09:00
task_work.c task_work: task_work_add() should not succeed after exit_task_work() 2012-09-13 16:47:34 +02:00
taskstats.c taskstats: cgroupstats_user_cmd() may leak on error 2012-10-06 03:05:31 +09:00
test_kprobes.c
time.c time: Move update_vsyscall definitions to timekeeper_internal.h 2012-09-24 12:38:06 -04:00
timeconst.pl
timer.c timers: Fix endless looping between cascade() and internal_add_timer() 2012-10-09 21:27:14 +02:00
tracepoint.c
tsacct.c userns: Convert taskstats to handle the user and pid namespaces. 2012-09-18 01:01:32 -07:00
uid16.c
up.c
user_namespace.c userns: Add kprojid_t and associated infrastructure in projid.h 2012-09-18 01:01:37 -07:00
user-return-notifier.c
user.c userns: Add kprojid_t and associated infrastructure in projid.h 2012-09-18 01:01:37 -07:00
utsname_sysctl.c
utsname.c
wait.c
watchdog.c watchdog: using u64 in get_sample_period() 2012-11-26 17:41:24 -08:00
workqueue_sched.h
workqueue.c workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay 2012-12-01 16:43:18 -08:00