linux/kernel
Lai Jiangshan e36bce4466 workqueue: Update the rescuer's affinity only when it is detached
When a rescuer is attached to a pool, its affinity should be only
managed by the pool.

But updating a detached rescuer's affinity is still meaningful, so
that it does not disrupt isolated CPUs when it is woken up.

However, commit d64f2fa064 ("kernel/workqueue: Let rescuers follow
unbound wq cpumask changes") updates the affinity unconditionally,
which causes several issues:

1) It also changes the affinity when the rescuer is already attached to
   a pool, which violates the pool's ownership of the affinity.

2) The said commit tries to update the affinity of the rescuers, but it
   misses the rescuers of the PERCPU workqueues, so isolated CPUs can
   still be disrupted by these rescuers when they are summoned.

3) The affinity to set to the rescuers should be consistent in all paths
   when a rescuer is in detached state. The affinity could be either
   wq_unbound_cpumask or unbound_effective_cpumask(wq). Related paths:
       rescuer's worker_detach_from_pool()
       update wq_unbound_cpumask
       update wq's cpumask
       init_rescuer()
   Both affinities are OK as long as they are consistent in all paths.
   But using unbound_effective_cpumask(wq) requires much more code to
   maintain the consistency, and it doesn't make much sense since the
   affinity is only effective when the rescuer is not processing works.
   wq_unbound_cpumask is therefore preferable.

Fix issue 1) by testing rescuer->pool, with wq_pool_attach_mutex held,
before updating the affinity.

Fix issue 2) by moving the rescuer's affinity update to the code that
updates wq_unbound_cpumask, and by making it also cover PERCPU
workqueues.

Partially clean up consistency issue 3) by using wq_unbound_cpumask, so
that the "updating wq's cpumask" path doesn't need to maintain it, and
both the "updating wq_unbound_cpumask" and "rescuer's
worker_detach_from_pool()" paths use wq_unbound_cpumask.
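
The first two fixes can be illustrated with a small userspace model. This
is only a sketch, not the kernel code: every name ending in _model is
hypothetical, a pthread mutex stands in for wq_pool_attach_mutex, an
unsigned long stands in for a cpumask, and the model assumes a single
caller changes the global mask at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-ins; none of these are the kernel's real definitions. */
typedef unsigned long cpumask_model_t;      /* one bit per CPU */

struct rescuer_model {
	void *pool;                         /* non-NULL while attached to a pool */
	cpumask_model_t affinity;
};

struct wq_model {
	int percpu;                         /* models a PERCPU workqueue */
	struct rescuer_model *rescuer;      /* NULL if the wq has no rescuer */
};

/* Stands in for wq_pool_attach_mutex. */
static pthread_mutex_t attach_mutex_model = PTHREAD_MUTEX_INITIALIZER;
/* Stands in for wq_unbound_cpumask. */
static cpumask_model_t unbound_cpumask_model = ~0UL;

/*
 * Fix 1: test the pool pointer with the attach mutex held and only touch
 * the affinity of a detached rescuer. Returns 1 if the affinity was updated.
 */
static int update_detached_rescuer_model(struct rescuer_model *r)
{
	int updated = 0;

	assert(r != NULL);
	pthread_mutex_lock(&attach_mutex_model);
	if (!r->pool) {
		r->affinity = unbound_cpumask_model;
		updated = 1;
	}
	pthread_mutex_unlock(&attach_mutex_model);
	return updated;
}

/*
 * Fix 2: do the update where the global mask changes and walk every
 * workqueue. The percpu flag is deliberately not consulted, so rescuers
 * of PERCPU workqueues are covered too. Returns the number of rescuers
 * whose affinity was updated.
 */
static int set_unbound_cpumask_model(struct wq_model *wqs, size_t n,
				     cpumask_model_t mask)
{
	int count = 0;

	unbound_cpumask_model = mask;
	for (size_t i = 0; i < n; i++) {
		if (wqs[i].rescuer)
			count += update_detached_rescuer_model(wqs[i].rescuer);
	}
	return count;
}
```

In this model an attached rescuer keeps whatever affinity its pool gave
it, while a detached one follows the global mask regardless of whether
its workqueue is PERCPU.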

Making init_rescuer() consistent with this affinity can be done in the
future.

Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-11-20 10:27:55 -10:00
..
bpf bpf: Conditionally include dynptr copy kfuncs 2025-10-24 09:44:47 -07:00
cgroup cgroup: Fixes for v6.18-rc2 2025-10-20 09:41:27 -10:00
configs kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI 2025-09-24 14:29:14 -07:00
debug kdb: remove redundant check for scancode 0xe0 2025-09-20 21:19:09 +01:00
dma dma-debug: don't report false positives with DMA_BOUNCE_UNALIGNED_KMALLOC 2025-10-15 13:24:33 -07:00
entry hyperv-next for v6.18 2025-10-07 08:40:15 -07:00
events perf/core: Fix MMAP2 event device with backing files 2025-10-14 10:38:10 +02:00
futex A set of updates for futexes and related selftests: 2025-09-30 16:07:10 -07:00
gcov kbuild: require gcc-8 and binutils-2.30 2025-04-30 21:53:35 +02:00
irq genirq/manage: Add buslock back in to enable_irq() 2025-10-24 11:38:39 +02:00
kcsan Kernel Concurrency Sanitizer (KCSAN) updates for v6.18 2025-10-02 08:31:44 -07:00
livepatch sched,livepatch: Untangle cond_resched() and live-patching 2025-05-14 13:16:24 +02:00
locking locking/local_lock: Introduce local_lock_is_locked(). 2025-09-29 09:42:35 +02:00
module kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI 2025-09-24 14:29:14 -07:00
power PM: sleep: Allow pm_restrict_gfp_mask() stacking 2025-10-29 18:55:32 +01:00
printk printk changes for 6.18 2025-10-04 11:13:11 -07:00
rcu hyperv-next for v6.18 2025-10-07 08:40:15 -07:00
sched sched_ext: Fixes for v6.18-rc3 2025-10-27 10:52:18 -07:00
time timekeeping: Fix aux clocks sysfs initialization loop bound 2025-10-20 19:56:12 +02:00
trace rv: Make rtapp/pagefault monitor depends on CONFIG_MMU 2025-10-20 12:47:40 +02:00
unwind unwind: Finish up unwind when a task exits 2025-07-31 10:20:11 -04:00
.gitignore kheaders: rebuild kheaders_data.tar.xz when a file is modified within a minute 2025-06-24 20:30:37 +09:00
acct.c kernel/acct.c: saner struct file treatment 2025-09-27 20:13:56 -04:00
async.c
audit_fsnotify.c VFS/audit: introduce kern_path_parent() for audit 2025-09-23 12:37:35 +02:00
audit_tree.c mount-related stuff for this cycle 2025-10-03 10:19:44 -07:00
audit_watch.c VFS/audit: introduce kern_path_parent() for audit 2025-09-23 12:37:35 +02:00
audit.c audit: fix skb leak when audit rate limit is exceeded 2025-09-10 19:55:00 -04:00
audit.h audit: create audit_stamp structure 2025-08-30 10:15:28 -04:00
auditfilter.c audit/stable-6.18 PR 20250926 2025-09-30 08:22:16 -07:00
auditsc.c audit: add record for multiple object contexts 2025-08-30 10:15:30 -04:00
backtracetest.c
bounds.c
capability.c capability: Remove unused has_capability 2025-03-07 22:03:09 -06:00
cfi.c cfi: Move BPF CFI types and helpers to generic code 2025-07-31 18:23:53 -07:00
compat.c
configs.c
context_tracking.c context_tracking: Make RCU watch ct_kernel_exit_state() warning 2025-03-04 18:44:29 -08:00
cpu_pm.c
cpu.c cpu: Remove obsolete comment from takedown_cpu() 2025-08-06 22:48:12 +02:00
crash_core_test.c crash: add KUnit tests for crash_exclude_mem_range 2025-09-13 17:32:55 -07:00
crash_core.c crash: add KUnit tests for crash_exclude_mem_range 2025-09-13 17:32:55 -07:00
crash_dump_dm_crypt.c crash_dump: retrieve dm crypt keys in kdump kernel 2025-05-21 10:48:21 -07:00
crash_reserve.c kdump: implement reserve_crashkernel_cma 2025-07-19 19:08:23 -07:00
cred.c copy_process: pass clone_flags as u64 across calltree 2025-09-01 15:31:34 +02:00
delayacct.c delayacct: remove redundant code and adjust indentation 2025-05-27 19:40:33 -07:00
dma.c
elfcorehdr.c
exec_domain.c
exit.c task_stack.h: clean-up stack_not_used() implementation 2025-09-21 14:22:00 -07:00
exit.h
extable.c
fail_function.c
fork.c Patch series in this pull request: 2025-10-02 18:44:54 -07:00
freezer.c mm/oom_kill: thaw the entire OOM victim process 2025-09-21 14:22:35 -07:00
gen_kheaders.sh kheaders: make it possible to override TAR 2025-08-06 10:23:36 +09:00
groups.c
hung_task.c hung_task: dump blocker task if it is not hung 2025-09-13 17:32:43 -07:00
iomem.c mm/memremap: Pass down MEMREMAP_* flags to arch_memremap_wb() 2025-02-21 15:05:38 +01:00
irq_work.c kasan: make kasan_record_aux_stack_noalloc() the default behaviour 2025-01-13 22:40:36 -08:00
jump_label.c jump_label: Use RCU in all users of __module_text_address(). 2025-03-10 11:54:46 +01:00
kallsyms_internal.h
kallsyms_selftest.c kallsyms: use kmalloc_array() instead of kmalloc() 2025-09-28 11:36:14 -07:00
kallsyms_selftest.h
kallsyms.c bpf: Clean up individual BTF_ID code 2025-07-16 18:34:42 -07:00
kcmp.c kcmp: improve performance adding an unlikely hint to task comparisons 2025-02-21 10:25:33 +01:00
Kconfig.freezer
Kconfig.hz kernel: Fix "select" wording on HZ_250 description 2025-02-21 09:20:30 +01:00
Kconfig.kexec crash: add KUnit tests for crash_exclude_mem_range 2025-09-13 17:32:55 -07:00
Kconfig.locks
Kconfig.preempt softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT 2025-09-17 16:25:41 +02:00
kcov.c kcov: use write memory barrier after memcpy() in kcov_move_area() 2025-09-13 17:32:44 -07:00
kexec_core.c kexec_core: remove redundant 0 value initialization 2025-09-13 17:32:49 -07:00
kexec_elf.c kexec: initialize ELF lowest address to ULONG_MAX 2025-03-16 22:30:47 -07:00
kexec_file.c x86/kexec: carry forward the boot DTB on kexec 2025-09-13 17:32:43 -07:00
kexec_handover.c kho: add support for preserving vmalloc allocations 2025-10-07 13:48:55 -07:00
kexec_internal.h kexec: enable CMA based contiguous allocation 2025-08-02 12:01:38 -07:00
kexec.c kexec: enable CMA based contiguous allocation 2025-08-02 12:01:38 -07:00
kheaders.c kheaders: Simplify attribute through __BIN_ATTR_SIMPLE_RO() 2024-12-24 09:46:49 +01:00
kprobes.c kprobes: Add missing kerneldoc for __get_insn_slot 2025-07-15 18:45:34 +09:00
kstack_erase.c stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth 2025-07-21 21:40:39 -07:00
ksyms_common.c
ksysfs.c kernel/ksysfs.c: simplify bin_attribute definition 2025-01-07 16:59:15 +01:00
kthread.c ipvs: Fix estimator kthreads preferred affinity 2025-08-13 08:34:33 +02:00
latencytop.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
Makefile Patch series in this pull request: 2025-10-02 18:44:54 -07:00
module_signature.c
notifier.c
nscommon.c ns: drop assert 2025-09-25 09:23:54 +02:00
nsproxy.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
nstree.c ns: move ns type into struct ns_common 2025-09-25 09:23:54 +02:00
padata.c padata: WQ_PERCPU added to alloc_workqueue users 2025-09-13 12:11:06 +08:00
panic.c panic: remove CONFIG_PANIC_ON_OOPS_VALUE 2025-09-28 11:36:13 -07:00
params.c params: Replace deprecated strcpy() with strscpy() and memcpy() 2025-08-16 21:47:25 +02:00
pid_namespace.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
pid_sysctl.h treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
pid.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
profile.c
ptrace.c ptrace: introduce PTRACE_SET_SYSCALL_INFO request 2025-05-11 17:48:15 -07:00
range.c
reboot.c - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
regset.c
relay.c relayfs: support a counter tracking if data is too big to write 2025-07-09 22:57:52 -07:00
resource_kunit.c
resource.c resource: improve child resource handling in release_mem_region_adjustable() 2025-09-21 14:22:34 -07:00
rseq.c rseq: Protect event mask against membarrier IPI 2025-09-13 19:51:59 +02:00
scftorture.c
scs.c
seccomp.c Performance events updates for v6.18: 2025-09-30 11:11:21 -07:00
signal.c signal: Fix memory leak for PIDFD_SELF* sentinels 2025-08-19 13:51:28 +02:00
smp.c smp: Fix up and expand the smp_call_function_many() kerneldoc 2025-09-18 22:21:28 +02:00
smpboot.c sched/smp: Use the SMP version of idle_thread_set_boot_cpu() 2025-06-13 08:47:20 +02:00
smpboot.h
softirq.c softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT 2025-09-17 16:25:41 +02:00
stacktrace.c
static_call_inline.c Modules changes for 6.15-rc1 2025-03-30 15:44:36 -07:00
static_call.c
stop_machine.c sched/core: Fix migrate_swap() vs. hotplug 2025-07-01 15:02:03 +02:00
sys_ni.c uprobes/x86: Add uprobe syscall to speed up uprobe 2025-08-21 20:09:20 +02:00
sys.c Patch series in this pull request: 2025-10-02 18:44:54 -07:00
sysctl-test.c sysctl: move u8 register test to lib/test_sysctl.c 2025-04-14 14:13:41 +02:00
sysctl.c sysctl: rename kern_table -> sysctl_subsys_table 2025-07-23 11:56:02 +02:00
task_work.c kasan: make kasan_record_aux_stack_noalloc() the default behaviour 2025-01-13 22:40:36 -08:00
taskstats.c
torture.c torture: Delay CPU-hotplug operations until boot completes 2025-08-14 15:26:30 -07:00
tracepoint.c tracepoint: Print the function symbol when tracepoint_debug is set 2025-03-21 15:30:10 -04:00
tsacct.c pid: change bacct_add_tsk() to use task_ppid_nr_ns() 2025-08-19 13:38:20 +02:00
ucount.c ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below() 2025-08-02 12:01:38 -07:00
uid16.c
uid16.h
umh.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
up.c
user_namespace.c ns: move ns type into struct ns_common 2025-09-25 09:23:54 +02:00
user-return-notifier.c
user.c ns: move ns type into struct ns_common 2025-09-25 09:23:54 +02:00
utsname_sysctl.c treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
utsname.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
vhost_task.c vhost: Take a reference on the task in struct vhost_task. 2025-09-21 17:44:20 -04:00
vmcore_info.c crash: export PAGE_UNACCEPTED_MAPCOUNT_VALUE to vmcoreinfo 2025-05-11 17:54:04 -07:00
watch_queue.c vfs-6.15-rc1.pipe 2025-03-24 09:52:37 -07:00
watchdog_buddy.c watchdog: fix opencoded cpumask_next_wrap() in watchdog_next_cpu() 2025-07-31 11:28:03 -04:00
watchdog_perf.c watchdog: skip checks when panic is in progress 2025-09-13 17:32:53 -07:00
watchdog.c watchdog: skip checks when panic is in progress 2025-09-13 17:32:53 -07:00
workqueue_internal.h
workqueue.c workqueue: Update the rescuer's affinity only when it is detached 2025-11-20 10:27:55 -10:00