linux/kernel/rcu
Uladzislau Rezki (Sony) 09a27d6620 kvfree_rcu: Use same set of GFP flags as does single-argument
[ Upstream commit ee6ddf5847 ]

Running an rcuscale stress-suite can lead to "Out of memory" of a
system. This can happen under high memory pressure with a small amount
of physical memory.

For example, a KVM test configuration with 64 CPUs and 512 megabytes
can result in OOM when running rcuscale with below parameters:

../kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig CONFIG_NR_CPUS=64 \
--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 \
  rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot" --trust-make

<snip>
[   12.054448] kworker/1:1H invoked oom-killer: gfp_mask=0x2cc0(GFP_KERNEL|__GFP_NOWARN), order=0, oom_score_adj=0
[   12.055303] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510
[   12.055416] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
[   12.056485] Workqueue: events_highpri fill_page_cache_func
[   12.056485] Call Trace:
[   12.056485]  dump_stack+0x57/0x6a
[   12.056485]  dump_header+0x4c/0x30a
[   12.056485]  ? del_timer_sync+0x20/0x30
[   12.056485]  out_of_memory.cold.47+0xa/0x7e
[   12.056485]  __alloc_pages_slowpath.constprop.123+0x82f/0xc00
[   12.056485]  __alloc_pages_nodemask+0x289/0x2c0
[   12.056485]  __get_free_pages+0x8/0x30
[   12.056485]  fill_page_cache_func+0x39/0xb0
[   12.056485]  process_one_work+0x1ed/0x3b0
[   12.056485]  ? process_one_work+0x3b0/0x3b0
[   12.060485]  worker_thread+0x28/0x3c0
[   12.060485]  ? process_one_work+0x3b0/0x3b0
[   12.060485]  kthread+0x138/0x160
[   12.060485]  ? kthread_park+0x80/0x80
[   12.060485]  ret_from_fork+0x22/0x30
[   12.062156] Mem-Info:
[   12.062350] active_anon:0 inactive_anon:0 isolated_anon:0
[   12.062350]  active_file:0 inactive_file:0 isolated_file:0
[   12.062350]  unevictable:0 dirty:0 writeback:0
[   12.062350]  slab_reclaimable:2797 slab_unreclaimable:80920
[   12.062350]  mapped:1 shmem:2 pagetables:8 bounce:0
[   12.062350]  free:10488 free_pcp:1227 free_cma:0
...
[   12.101610] Out of memory and no killable processes...
[   12.102042] Kernel panic - not syncing: System is deadlocked on memory
[   12.102583] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510
[   12.102600] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
<snip>

Because kvfree_rcu() has a fallback path, memory allocation failure is
not the end of the world.  Furthermore, the added overhead of aggressive
GFP settings must be balanced against the overhead of the fallback path,
which is a cache miss for double-argument kvfree_rcu() and a call to
synchronize_rcu() for single-argument kvfree_rcu().  The current choice
of GFP_KERNEL|__GFP_NOWARN can result in longer latencies than a call
to synchronize_rcu(), so less-tenacious GFP flags would be helpful.

Here is the tradeoff that must be balanced:
    a) Minimize use of the fallback path,
    b) Avoid pushing the system into OOM,
    c) Bound allocation latency to that of synchronize_rcu(), and
    d) Leave the emergency reserves to use cases lacking fallbacks.

This commit therefore changes GFP flags from GFP_KERNEL|__GFP_NOWARN to
GFP_KERNEL|__GFP_NORETRY|__GFP_NOMEMALLOC|__GFP_NOWARN.  This combination
leaves the emergency reserves alone and can initiate reclaim, but will
not invoke the OOM killer.

Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11 14:47:23 +02:00
..
Kconfig rcu: Reduce leaf fanout for strict RCU grace periods 2020-08-24 18:40:23 -07:00
Kconfig.debug Merge branch 'strictgp.2020.08.24a' into HEAD 2020-09-03 09:47:42 -07:00
Makefile rcuperf: Change rcuperf to rcuscale 2020-08-24 18:39:24 -07:00
rcu_segcblist.c rcu/segcblist: Prevent useless GP start if no CBs to accelerate 2020-09-03 09:39:59 -07:00
rcu_segcblist.h rcu: Remove kfree_rcu() special casing and lazy-callback handling 2020-01-24 10:24:31 -08:00
rcu.h treewide: Make all debug_obj_descriptors const 2020-09-24 21:56:25 +02:00
rcuscale.c rcuperf: Change rcuperf to rcuscale 2020-08-24 18:39:24 -07:00
rcutorture.c rcutorture: Allow pointer leaks to test diagnostic code 2020-08-24 18:45:36 -07:00
refscale.c refperf: Avoid null pointer dereference when buf fails to allocate 2020-08-24 18:45:35 -07:00
srcutiny.c rcu: Use CONFIG_PREEMPTION where appropriate 2019-12-09 12:37:51 -08:00
srcutree.c srcu: Remove KCSAN stubs 2020-08-24 18:36:03 -07:00
sync.c rcu/sync: Simplify the state machine 2019-05-28 09:05:23 -07:00
tasks.h rcu-tasks: Move RCU-tasks initialization to before early_initcall() 2021-01-19 18:27:28 +01:00
tiny.c rcu: Rename *_kfree_callback/*_kfree_rcu_offset/kfree_call_* 2020-06-29 11:59:25 -07:00
tree_exp.h rcu: Initialize at declaration time in rcu_exp_handler() 2020-08-24 18:36:03 -07:00
tree_plugin.h rcu/nocb: Perform deferred wake up before last idle's need_resched() check 2021-03-04 11:38:35 +01:00
tree_stall.h rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled 2020-11-10 17:10:38 -08:00
tree.c kvfree_rcu: Use same set of GFP flags as does single-argument 2021-05-11 14:47:23 +02:00
tree.h Merge branch 'strictgp.2020.08.24a' into HEAD 2020-09-03 09:47:42 -07:00
update.c Merge tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2020-10-18 14:34:50 -07:00