linux/kernel
Peter Zijlstra 4a5cc64d8a perf/core: Fix race between close() and fork()
commit 1cf8dfe8a6 upstream.

Syzcaller reported the following Use-after-Free bug:

	close()						clone()

							  copy_process()
							    perf_event_init_task()
							      perf_event_init_context()
							        mutex_lock(parent_ctx->mutex)
								inherit_task_group()
								  inherit_group()
								    inherit_event()
								      mutex_lock(event->child_mutex)
								      // expose event on child list
								      list_add_tail()
								      mutex_unlock(event->child_mutex)
							        mutex_unlock(parent_ctx->mutex)

							    ...
							    goto bad_fork_*

							  bad_fork_cleanup_perf:
							    perf_event_free_task()

	  perf_release()
	    perf_event_release_kernel()
	      list_for_each_entry()
		mutex_lock(ctx->mutex)
		mutex_lock(event->child_mutex)
		// event is from the failing inherit
		// on the other CPU
		perf_remove_from_context()
		list_move()
		mutex_unlock(event->child_mutex)
		mutex_unlock(ctx->mutex)

							      mutex_lock(ctx->mutex)
							      list_for_each_entry_safe()
							        // event already stolen
							      mutex_unlock(ctx->mutex)

							    delayed_free_task()
							      free_task()

	     list_for_each_entry_safe()
	       list_del()
	       free_event()
	         _free_event()
		   // and so event->hw.target
		   // is the already freed failed clone()
		   if (event->hw.target)
		     put_task_struct(event->hw.target)
		       // WHOOPSIE, already quite dead

Which puts the lie to the the comment on perf_event_free_task():
'unexposed, unused context' not so much.

Which is a 'fun' confluence of fail; copy_process() doing an
unconditional free_task() and not respecting refcounts, and perf having
creative locking. In particular:

  82d94856fa ("perf/core: Fix lock inversion between perf,trace,cpuhp")

seems to have overlooked this 'fun' parade.

Solve it by using the fact that detached events still have a reference
count on their (previous) context. With this perf_event_free_task()
can detect when events have escaped and wait for their destruction.

Debugged-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reported-by: syzbot+a24c397a29ad22d86c98@syzkaller.appspotmail.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 82d94856fa ("perf/core: Fix lock inversion between perf,trace,cpuhp")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-28 08:29:28 +02:00
..
bpf bpf: silence warning messages in core 2019-07-26 09:14:06 +02:00
cgroup cpuset: restore sanity to cpuset_cpus_allowed_fallback() 2019-07-10 09:53:39 +02:00
configs
debug kdb: Don't back trace on a cpu that didn't round up 2019-02-12 19:47:19 +01:00
dma dma-direct: do not include SME mask in the DMA supported check 2019-01-13 09:51:05 +01:00
events perf/core: Fix race between close() and fork() 2019-07-28 08:29:28 +02:00
gcov
irq genirq: Add optional hardware synchronization for shutdown 2019-07-21 09:03:13 +02:00
livepatch module: Fix livepatch/ftrace module text permissions race 2019-07-10 09:53:40 +02:00
locking locking/lockdep: Fix merging of hlocks with non-zero references 2019-07-26 09:14:03 +02:00
power x86/power: Fix 'nosmt' vs hibernation triple fault during resume 2019-06-11 12:20:52 +02:00
printk
rcu rcuperf: Fix cleanup path for invalid perf_type strings 2019-05-31 06:46:30 -07:00
sched sched/fair: Fix "runnable_avg_yN_inv" not used warnings 2019-07-26 09:14:08 +02:00
time timer_list: Guard procfs specific code 2019-07-26 09:14:10 +02:00
trace ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code() 2019-07-10 09:53:44 +02:00
.gitignore
acct.c acct_on(): don't mess with freeze protection 2019-05-31 06:46:05 -07:00
async.c
audit_fsnotify.c
audit_tree.c
audit_watch.c
audit.c
audit.h
auditfilter.c audit: fix a memory leak bug 2019-05-31 06:46:17 -07:00
auditsc.c
backtracetest.c
bounds.c
capability.c
compat.c
configs.c
context_tracking.c
cpu_pm.c
cpu.c cpu/hotplug: Fix out-of-bounds read when setting fail state 2019-07-21 09:03:11 +02:00
crash_core.c
crash_dump.c
cred.c ptrace: restore smp_rmb() in __ptrace_may_access() 2019-06-19 08:18:00 +02:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c cgroup/pids: turn cgroup_subsys->free() into cgroup_subsys->release() to fix the accounting 2019-04-05 22:33:13 +02:00
extable.c
fail_function.c
fork.c userfaultfd: use RCU to free the task struct when fork fails 2019-05-22 07:37:41 +02:00
freezer.c
futex_compat.c
futex.c locking/futex: Allow low-level atomic operations to return -EAGAIN 2019-05-10 17:54:11 +02:00
groups.c
hung_task.c kernel: hung_task.c: disable on suspend 2019-04-20 09:16:02 +02:00
iomem.c
irq_work.c irq_work: Do not raise an IPI when queueing work on the local CPU 2019-05-31 06:46:19 -07:00
jump_label.c jump_label: move 'asm goto' support test to Kconfig 2019-06-04 08:02:34 +02:00
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kernel/kcov.c: mark write_comp_data() as notrace 2019-02-12 19:47:20 +01:00
kexec_core.c
kexec_file.c
kexec_internal.h
kexec.c
kmod.c
kprobes.c kprobes: Fix error check when reusing optimized probes 2019-04-27 09:36:37 +02:00
ksysfs.c
kthread.c
latencytop.c
Makefile x86/uaccess, kcov: Disable stack protector 2019-06-19 08:18:01 +02:00
memremap.c mm, devm_memremap_pages: add MEMORY_DEVICE_PRIVATE support 2019-01-13 09:51:04 +01:00
module_signing.c
module-internal.h
module.c jump_label: move 'asm goto' support test to Kconfig 2019-06-04 08:02:34 +02:00
notifier.c
nsproxy.c
padata.c padata: use smp_mb in padata_reorder to avoid orphaned padata jobs 2019-07-26 09:14:25 +02:00
panic.c panic: avoid deadlocks in re-entrant console drivers 2018-12-29 13:37:57 +01:00
params.c
pid_namespace.c signal/pid_namespace: Fix reboot_pid_ns to use send_sig not force_sig 2019-07-26 09:14:01 +02:00
pid.c Fix failure path in alloc_pid() 2019-01-13 09:51:06 +01:00
profile.c
ptrace.c ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME 2019-07-10 09:53:41 +02:00
range.c
reboot.c
relay.c relay: check return of create_buf_file() properly 2019-03-13 14:02:35 -07:00
resource.c
rseq.c
seccomp.c
signal.c kernel/signal.c: trace_signal_deliver when signal_group_exit 2019-06-09 09:17:20 +02:00
smp.c cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM 2019-02-12 19:47:25 +01:00
smpboot.c
smpboot.h
softirq.c
stacktrace.c
stop_machine.c
sys_ni.c
sys.c kernel/sys.c: prctl: fix false positive in validate_prctl_map() 2019-06-15 11:54:01 +02:00
sysctl_binary.c
sysctl.c sysctl: return -EINVAL if val violates minmax 2019-06-15 11:53:59 +02:00
task_work.c
taskstats.c
test_kprobes.c
torture.c
tracepoint.c
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c
up.c
user_namespace.c
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
watchdog_hld.c
watchdog.c watchdog: Respect watchdog cpumask on CPU hotplug 2019-04-03 06:26:29 +02:00
workqueue_internal.h
workqueue.c workqueue: Try to catch flush_work() without INIT_WORK(). 2019-05-02 09:58:56 +02:00