Linux kernel source tree
Go to file
Nicholas Piggin b664645274 mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race
commit d53c3dfb23 upstream.

Reading and modifying current->mm and current->active_mm and switching
mm should be done with irqs off, to prevent races seeing an intermediate
state.

This is similar to commit 38cf307c1f ("mm: fix kthread_use_mm() vs TLB
invalidate"). At exec-time when the new mm is activated, the old one
should usually be single-threaded and no longer used, unless something
else is holding an mm_users reference (which may be possible).

Absent other mm_users, there is also a race with preemption and lazy tlb
switching. Consider the kernel_execve case where the current thread is
using a lazy tlb active mm:

  call_usermodehelper()
    kernel_execve()
      old_mm = current->mm;
      active_mm = current->active_mm;
      *** preempt *** -------------------->  schedule()
                                               prev->active_mm = NULL;
                                               mmdrop(prev active_mm);
                                             ...
                      <--------------------  schedule()
      current->mm = mm;
      current->active_mm = mm;
      if (!old_mm)
          mmdrop(active_mm);

If we switch back to the kernel thread from a different mm, there is a
double free of the old active_mm, and a missing free of the new one.

Closing this race only requires interrupts to be disabled while ->mm
and ->active_mm are being switched, but the TLB problem requires also
holding interrupts off over activate_mm. Unfortunately not all archs
can do that yet, e.g., arm defers the switch if irqs are disabled and
expects finish_arch_post_lock_switch() to be called to complete the
flush; um takes a blocking lock in activate_mm().

So as a first step, disable interrupts across the mm/active_mm updates
to close the lazy tlb preempt race, and provide an arch option to
extend that to activate_mm which allows architectures doing IPI based
TLB shootdowns to close the second race.

This is a bit ugly, but in the interest of fixing the bug and backporting
before all architectures are converted this is a compromise.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[mpe: Manual backport to 4.19 due to membarrier_exec_mmap(mm) changes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200914045219.3736466-2-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-05 11:08:38 +01:00
arch mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race 2020-11-05 11:08:38 +01:00
block Revert "block: ratelimit handle_bad_sector() message" 2020-11-05 11:08:35 +01:00
certs export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR() 2018-08-22 23:21:44 +09:00
crypto crypto: algif_skcipher - EBUSY on aio should be an error 2020-10-29 09:55:01 +01:00
Documentation xen/events: defer eoi in case of excessive number of events 2020-11-05 11:08:37 +01:00
drivers ata: sata_nv: Fix retrieving of active qcs 2020-11-05 11:08:38 +01:00
firmware Fix built-in early-load Intel microcode alignment 2020-01-23 08:21:29 +01:00
fs mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race 2020-11-05 11:08:38 +01:00
include xen/events: add a new "late EOI" evtchn framework 2020-11-05 11:08:36 +01:00
init printk: queue wake_up_klogd irq_work only if per-CPU areas are ready 2020-07-22 09:32:13 +02:00
ipc ipc/util.c: sysvipc_find_ipc() incorrectly updates position index 2020-05-20 08:18:40 +02:00
kernel futex: Fix incorrect should_fail_futex() handling 2020-11-05 11:08:38 +01:00
lib lib/crc32.c: fix trivial typo in preprocessor condition 2020-10-30 10:38:21 +01:00
LICENSES LICENSES: Remove CC-BY-SA-4.0 license text 2018-10-18 11:28:50 +02:00
mm mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary 2020-10-29 09:55:15 +01:00
net tipc: fix memory leak caused by tipc_buf_append() 2020-11-05 11:08:33 +01:00
samples misc: vop: add round_up(x,4) for vring_size to avoid kernel panic 2020-10-30 10:38:29 +01:00
scripts scripts/setlocalversion: make git describe output more reliable 2020-11-05 11:08:31 +01:00
security evm: Check size of security.evm before using it 2020-11-05 11:08:34 +01:00
sound ALSA: seq: oss: Avoid mutex lock for a long-time ioctl 2020-10-29 09:55:11 +01:00
tools bpf: Fix comment for helper bpf_current_task_under_cgroup() 2020-11-05 11:08:34 +01:00
usr initramfs: restore default compression behavior 2020-04-13 10:44:59 +02:00
virt KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch 2020-10-01 13:14:54 +02:00
.clang-format clang-format: Set IndentWrappedFunctionNames false 2018-08-01 18:38:51 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap libnvdimm-for-4.19_misc 2018-08-25 18:13:10 -07:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS 9p: remove Ron Minnich from MAINTAINERS 2018-08-17 16:20:26 -07:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS Documentation/llvm: add documentation on building w/ Clang/LLVM 2020-09-26 18:01:31 +02:00
Makefile Linux 4.19.154 2020-10-30 10:38:33 +01:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.