linux/include
Nico Pache ed5d4efb4d oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup
commit e4a38402c3 upstream.

The pthread struct is allocated on PRIVATE|ANONYMOUS memory [1] which
can be targeted by the oom reaper.  This mapping is used to store the
futex robust list head; the kernel does not keep a copy of the robust
list and instead references a userspace address to maintain the
robustness during a process death.

A race can occur between exit_mm and the oom reaper that allows the oom
reaper to free the memory of the futex robust list before the exit path
has handled the futex death:

    CPU1                               CPU2
    --------------------------------------------------------------------
    page_fault
    do_exit "signal"
    wake_oom_reaper
                                        oom_reaper
                                        oom_reap_task_mm (invalidates mm)
    exit_mm
    exit_mm_release
    futex_exit_release
    futex_cleanup
    exit_robust_list
    get_user (EFAULT- can't access memory)

If the get_user EFAULT's, the kernel will be unable to recover the
waiters on the robust_list, leaving userspace mutexes hung indefinitely.

Delay the OOM reaper, allowing more time for the exit path to perform
the futex cleanup.

Reproducer: https://gitlab.com/jsavitz/oom_futex_reproducer

Based on a patch by Michal Hocko.

Link: https://elixir.bootlin.com/glibc/glibc-2.35/source/nptl/allocatestack.c#L370 [1]
Link: https://lkml.kernel.org/r/20220414144042.677008-1-npache@redhat.com
Fixes: 2129258024 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Signed-off-by: Nico Pache <npache@redhat.com>
Co-developed-by: Joel Savitz <jsavitz@redhat.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Herton R. Krzesinski <herton@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Savitz <jsavitz@redhat.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27 13:53:54 +02:00
..
acpi ACPICA: actypes.h: Expand the ACPI_ACCESS_ definitions 2022-01-27 10:54:18 +01:00
asm-generic tlb: hugetlb: Add more sizes to tlb_remove_huge_tlb_entry 2022-04-20 09:23:22 +02:00
clocksource clocksource/drivers/timer-ti-dm: Save and restore timer TIOCP_CFG 2021-07-14 16:56:12 +02:00
crypto crypto: public_key: fix overflow during implicit conversion 2021-09-18 13:40:08 +02:00
drm drm: protect drm_master pointers in drm_lease.c 2021-09-18 13:40:19 +02:00
dt-bindings clk: imx8mq: remove SYS PLL 1/2 clock gates 2021-07-14 16:56:20 +02:00
keys certs: Add EFI_CERT_X509_GUID support for dbx entries 2021-06-30 08:47:30 -04:00
kunit kunit: fix display of failed expectations for strings 2020-11-10 13:45:15 -07:00
kvm ARM: 2020-10-23 11:17:56 -07:00
linux oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup 2022-04-27 13:53:54 +02:00
math-emu
media media: subdev: disallow ioctl for saa6588/davinci 2021-07-19 09:45:02 +02:00
memory memory: renesas-rpc-if: Correct QSPI data transfer in Manual mode 2021-11-18 14:03:47 +01:00
misc
net ipv6: make ip6_rt_gc_expire an atomic_t 2022-04-27 13:53:51 +02:00
pcmcia
ras mm,hwpoison: introduce MF_MSG_UNSPLIT_THP 2020-10-16 11:11:17 -07:00
rdma RDMA/netlink: Add __maybe_unused to static inline in C file 2021-11-26 10:39:21 +01:00
scsi scsi: iscsi: Fix conn cleanup and stop race during iscsid restart 2022-04-20 09:23:17 +02:00
soc firmware: raspberrypi: Keep count of all consumers 2021-09-15 09:50:41 +02:00
sound ALSA: pcm: Fix potential AB/BA lock with buffer_mutex and mmap_lock 2022-04-08 14:39:53 +02:00
target scsi: target: Fix ordered tag handling 2021-11-26 10:39:11 +01:00
trace SUNRPC: Fix the svc_deferred_event trace class 2022-04-20 09:23:11 +02:00
uapi can: isotp: set default value for N_As to 50 micro seconds 2022-04-13 21:01:00 +02:00
vdso
video gpu: ipu-v3: remove unused functions 2020-10-26 10:42:38 +01:00
xen xen/gnttab: fix gnttab_end_foreign_access() without page specified 2022-03-11 12:11:54 +01:00