linux/kernel
Mark Rutland 708d19eaf3 perf/core: Fix group {cpu,task} validation
commit 64aee2a965 upstream.

Regardless of which events form a group, it does not make sense for the
events to target different tasks and/or CPUs, as this leaves the group
inconsistent and impossible to schedule. The core perf code assumes that
these are consistent across (successfully intialised) groups.

Core perf code only verifies this when moving SW events into a HW
context. Thus, we can violate this requirement for pure SW groups and
pure HW groups, unless the relevant PMU driver happens to perform this
verification itself. These mismatched groups subsequently wreak havoc
elsewhere.

For example, we handle watchpoints as SW events, and reserve watchpoint
HW on a per-CPU basis at pmu::event_init() time to ensure that any event
that is initialised is guaranteed to have a slot at pmu::add() time.
However, the core code only checks the group leader's cpu filter (via
event_filter_match()), and can thus install follower events onto CPUs
violating thier (mismatched) CPU filters, potentially installing them
into a CPU without sufficient reserved slots.

This can be triggered with the below test case, resulting in warnings
from arch backends.

  #define _GNU_SOURCE
  #include <linux/hw_breakpoint.h>
  #include <linux/perf_event.h>
  #include <sched.h>
  #include <stdio.h>
  #include <sys/prctl.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
			   int group_fd, unsigned long flags)
  {
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
  }

  char watched_char;

  struct perf_event_attr wp_attr = {
	.type = PERF_TYPE_BREAKPOINT,
	.bp_type = HW_BREAKPOINT_RW,
	.bp_addr = (unsigned long)&watched_char,
	.bp_len = 1,
	.size = sizeof(wp_attr),
  };

  int main(int argc, char *argv[])
  {
	int leader, ret;
	cpu_set_t cpus;

	/*
	 * Force use of CPU0 to ensure our CPU0-bound events get scheduled.
	 */
	CPU_ZERO(&cpus);
	CPU_SET(0, &cpus);
	ret = sched_setaffinity(0, sizeof(cpus), &cpus);
	if (ret) {
		printf("Unable to set cpu affinity\n");
		return 1;
	}

	/* open leader event, bound to this task, CPU0 only */
	leader = perf_event_open(&wp_attr, 0, 0, -1, 0);
	if (leader < 0) {
		printf("Couldn't open leader: %d\n", leader);
		return 1;
	}

	/*
	 * Open a follower event that is bound to the same task, but a
	 * different CPU. This means that the group should never be possible to
	 * schedule.
	 */
	ret = perf_event_open(&wp_attr, 0, 1, leader, 0);
	if (ret < 0) {
		printf("Couldn't open mismatched follower: %d\n", ret);
		return 1;
	} else {
		printf("Opened leader/follower with mismastched CPUs\n");
	}

	/*
	 * Open as many independent events as we can, all bound to the same
	 * task, CPU0 only.
	 */
	do {
		ret = perf_event_open(&wp_attr, 0, 0, -1, 0);
	} while (ret >= 0);

	/*
	 * Force enable/disble all events to trigger the erronoeous
	 * installation of the follower event.
	 */
	printf("Opened all events. Toggling..\n");
	for (;;) {
		prctl(PR_TASK_PERF_EVENTS_DISABLE, 0, 0, 0, 0);
		prctl(PR_TASK_PERF_EVENTS_ENABLE, 0, 0, 0, 0);
	}

	return 0;
  }

Fix this by validating this requirement regardless of whether we're
moving events.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhou Chengming <zhouchengming1@huawei.com>
Link: http://lkml.kernel.org/r/1498142498-15758-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-30 10:19:25 +02:00
..
bpf bpf: prevent leaking pointer via xadd on unpriviledged 2017-07-21 07:44:55 +02:00
configs kconfig: tinyconfig: provide whole choice blocks to avoid warnings 2016-09-24 10:07:42 +02:00
debug kernel/debug/debug_core.c: more properly delay for secondary CPUs 2017-01-06 11:16:16 +01:00
events perf/core: Fix group {cpu,task} validation 2017-08-30 10:19:25 +02:00
gcov
irq genirq: Release resources in __setup_irq() error path 2017-06-26 07:13:10 +02:00
livepatch
locking locking/rtmutex: Use READ_ONCE() in rt_mutex_owner() 2016-12-15 08:49:22 -08:00
power PM / sleep: fix device reference leak in test_suspend 2016-11-26 09:54:53 +01:00
printk printk: use rcuidle console tracepoint 2017-02-23 17:43:10 +01:00
rcu rcu: Fix soft lockup for rcu_nocb_kthread 2016-12-08 07:15:24 +01:00
sched sched/cputime: Fix prev steal time accouting during CPU hotplug 2017-08-06 19:19:43 -07:00
time alarmtimer: don't rate limit one-shot timers 2017-07-27 15:06:10 -07:00
trace tracing: Fix freeing of filter in create_filter() when set_str is false 2017-08-30 10:19:24 +02:00
.gitignore
acct.c
async.c
audit_fsnotify.c
audit_tree.c
audit_watch.c audit: Fix use after free in audit_remove_watch_rule() 2017-08-24 17:02:35 -07:00
audit.c
audit.h
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c exec: Ensure mm->user_ns contains the execed files 2017-01-06 11:16:14 +01:00
cgroup_freezer.c
cgroup_pids.c
cgroup.c cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups 2017-04-21 09:30:04 +02:00
compat.c
configs.c
context_tracking.c
cpu_pm.c
cpu.c stable-fixup: hotplug: fix unused function warning 2017-01-12 11:22:48 +01:00
cpuset.c cpuset: fix a deadlock due to incomplete patching of cpusets_enabled() 2017-08-16 13:40:28 -07:00
crash_dump.c
cred.c cred: Reject inodes with invalid ids in set_create_file_as() 2016-09-15 08:27:49 +02:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c
extable.c kernel/extable.c: mark core_kernel_text notrace 2017-07-21 07:44:56 +02:00
fork.c stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms 2017-06-14 13:16:23 +02:00
freezer.c
futex_compat.c
futex.c futex: Add missing error handling to FUTEX_REQUEUE_PI 2017-03-22 12:04:19 +01:00
groups.c
hung_task.c
irq_work.c
jump_label.c jump_labels: API for flushing deferred jump label updates 2017-01-19 20:17:19 +01:00
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec_core.c
kexec_file.c kexec: fix double-free when failing to relocate the purgatory 2016-09-24 10:07:36 +02:00
kexec_internal.h
kexec.c
kmod.c
kprobes.c tracing/kprobes: Enforce kprobes teardown after testing 2017-05-25 14:30:17 +02:00
ksysfs.c
kthread.c cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups 2017-04-21 09:30:04 +02:00
latencytop.c
Makefile
membarrier.c Fix: Disable sys_membarrier when nohz_full is enabled 2017-03-12 06:37:26 +01:00
memremap.c mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done} 2017-01-19 20:17:18 +01:00
module_signing.c
module-internal.h
module.c
notifier.c
nsproxy.c
padata.c padata: free correct variable 2017-05-20 14:27:02 +02:00
panic.c kernel/panic.c: add missing \n 2017-07-05 14:37:19 +02:00
params.c
pid_namespace.c pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes 2017-05-25 14:30:11 +02:00
pid.c pids: make task_tgid_nr_ns() safe 2017-08-24 17:02:36 -07:00
profile.c
ptrace.c ptrace: Properly initialize ptracer_cred on fork 2017-06-14 13:16:20 +02:00
range.c
reboot.c
relay.c
resource.c /proc/iomem: only expose physical resource addresses to privileged users 2017-08-06 19:19:42 -07:00
seccomp.c
signal.c signal: protect SIGNAL_UNKILLABLE from unintentional clearing. 2017-08-11 09:08:59 -07:00
smp.c
smpboot.c
smpboot.h
softirq.c
stacktrace.c
stop_machine.c
sys_ni.c
sys.c
sysctl_binary.c
sysctl.c sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvec 2017-07-15 11:57:46 +02:00
task_work.c
taskstats.c
test_kprobes.c
torture.c
tracepoint.c
tsacct.c
uid16.c
up.c
user_namespace.c
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
watchdog.c kernel/watchdog: use nmi registers snapshot in hardlockup handler 2017-01-06 11:16:16 +01:00
workqueue_internal.h
workqueue.c workqueue: implicit ordered attribute should be overridable 2017-08-11 09:09:00 -07:00