mirror of
https://github.com/torvalds/linux.git
On systems where many CPUs share one LLC, unbound workqueues using
WQ_AFFN_CACHE collapse to a single worker pool, causing heavy spinlock
contention on pool->lock. For example, Chuck Lever measured 39% of
cycles lost to native_queued_spin_lock_slowpath on a 12-core shared-L3
NFS-over-RDMA system. The existing affinity hierarchy (cpu, smt, cache,
numa, system) offers no intermediate option between per-LLC and
per-SMT-core granularity.

Add WQ_AFFN_CACHE_SHARD, which subdivides each LLC into groups of at
most wq_cache_shard_size cores (default 8, tunable via boot parameter).
Shards are always split on core (SMT group) boundaries so that
Hyper-Threading siblings are never placed in different pods. Cores are
distributed across shards as evenly as possible -- for example, 36
cores in a single LLC with max shard size 8 produce 5 shards of
8+7+7+7+7 cores.

The implementation follows the same comparator pattern as the other
affinity scopes: precompute_cache_shard_ids() pre-fills the
cpu_shard_id[] array from the already-initialized WQ_AFFN_CACHE and
WQ_AFFN_SMT topology, and cpus_share_cache_shard() is passed to
init_pod_type().

Benchmarks on NVIDIA Grace (72 CPUs, single LLC, 50k items/thread) show
cache_shard delivers ~5x the throughput and ~6.5x lower p50 latency
than the cache scope on this single-LLC system.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>