linux/tools
Stephane Eranian d99c22eabe perf record: Add num-synthesize-threads option
To control degree of parallelism of the synthesize_mmap() code which
is scanning /proc/PID/task/PID/maps and can be time consuming.
Mimic perf top way of handling the option.
If not specified will default to 1 thread, i.e. default behavior before
this option.

On a desktop computer the processing of /proc/PID/task/PID/maps isn't
slow enough to warrant parallel processing and the thread creation has
some cost - hence the default of 1. On a loaded server with
>100 cores it is possible to see synthesis times in the order of
seconds and in this case having the option is desirable.

As the processing is a synchronization point, it is legitimate to worry if
Amdahl's law will apply to this patch. Profiling with this patch in
place:
https://lore.kernel.org/lkml/20200415054050.31645-4-irogers@google.com/
shows:
...
      - 32.59% __perf_event__synthesize_threads
         - 32.54% __event__synthesize_thread
            + 22.13% perf_event__synthesize_mmap_events
            + 6.68% perf_event__get_comm_ids.constprop.0
            + 1.49% process_synthesized_event
            + 1.29% __GI___readdir64
            + 0.60% __opendir
...
That is the processing is 1.49% of execution time and there is plenty to
make parallel. This is shown in the benchmark in this patch:

https://lore.kernel.org/lkml/20200415054050.31645-2-irogers@google.com/

  Computing performance of multi threaded perf event synthesis by
  synthesizing events on CPU 0:
   Number of synthesis threads: 1
     Average synthesis took: 127729.000 usec (+- 3372.880 usec)
     Average num. events: 21548.600 (+- 0.306)
     Average time per event 5.927 usec
   Number of synthesis threads: 2
     Average synthesis took: 88863.500 usec (+- 385.168 usec)
     Average num. events: 21552.800 (+- 0.327)
     Average time per event 4.123 usec
   Number of synthesis threads: 3
     Average synthesis took: 83257.400 usec (+- 348.617 usec)
     Average num. events: 21553.200 (+- 0.327)
     Average time per event 3.863 usec
   Number of synthesis threads: 4
     Average synthesis took: 75093.000 usec (+- 422.978 usec)
     Average num. events: 21554.200 (+- 0.200)
     Average time per event 3.484 usec
   Number of synthesis threads: 5
     Average synthesis took: 64896.600 usec (+- 353.348 usec)
     Average num. events: 21558.000 (+- 0.000)
     Average time per event 3.010 usec
   Number of synthesis threads: 6
     Average synthesis took: 59210.200 usec (+- 342.890 usec)
     Average num. events: 21560.000 (+- 0.000)
     Average time per event 2.746 usec
   Number of synthesis threads: 7
     Average synthesis took: 54093.900 usec (+- 306.247 usec)
     Average num. events: 21562.000 (+- 0.000)
     Average time per event 2.509 usec
   Number of synthesis threads: 8
     Average synthesis took: 48938.700 usec (+- 341.732 usec)
     Average num. events: 21564.000 (+- 0.000)
     Average time per event 2.269 usec

Where average time per synthesized event goes from 5.927 usec with 1
thread to 2.269 usec with 8. This isn't a linear speed up as not all of
synthesize code has been made parallel. If the synthesis time was about
10 seconds then using 8 threads may bring this down to less than 4.

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tony Jones <tonyj@suse.de>
Cc: yuzhoujian <yuzhoujian@didichuxing.com>
Link: http://lore.kernel.org/lkml/20200422155038.9380-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-23 11:10:41 -03:00
..
accounting SPDX patches for 5.7-rc1. 2020-04-03 13:12:26 -07:00
arch tools arch x86: Sync asm/cpufeatures.h with the kernel sources 2020-04-14 09:08:23 -03:00
bootconfig New tracing features: 2020-04-05 10:36:18 -07:00
bpf tools, bpftool: Fix struct_ops command invalid pointer free 2020-04-14 21:33:53 +02:00
build tools/build: tweak unused value workaround 2020-04-21 11:11:55 -07:00
cgroup .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
debugging
edid tools/edid: Move EDID data sets from Documentation/ 2020-02-19 04:10:32 -07:00
firewire
firmware
gpio This is the bulk of GPIO development for the v5.7 kernel cycle. 2020-04-04 10:27:00 -07:00
hv Tools: hv: Reopen the devices if read() or write() returns errors 2020-01-26 22:10:10 -05:00
iio .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
include tools headers: Synchronize linux/bits.h with the kernel sources 2020-04-14 11:40:05 -03:00
io_uring
kvm/kvm_stat tools/kvm_stat: add command line switch '-c' to log in csv format 2020-03-23 15:44:21 -04:00
laptop change email address for Pali Rohár 2020-04-10 15:36:22 -07:00
leds .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
lib perf evlist: Remove duplicate headers 2020-04-22 10:01:33 -03:00
memory-model .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
nfsd
objtool objtool: Make BP scratch register warning more robust 2020-04-14 12:24:22 +02:00
pci tools: PCI: Add 'e' to clear IRQ 2020-04-02 17:57:10 +01:00
pcmcia .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
perf perf record: Add num-synthesize-threads option 2020-04-23 11:10:41 -03:00
power SPDX patches for 5.7-rc1. 2020-04-03 13:12:26 -07:00
scripts Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-03-25 18:58:11 -07:00
spi SPDX patches for 5.7-rc1. 2020-04-03 13:12:26 -07:00
testing Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-04-16 14:52:29 -07:00
thermal/tmon - Convert tsens configuration DT binding to yaml (Rajeshwari) 2020-04-07 20:00:16 -07:00
time
usb .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
virtio tools/virtio: make asm/barrier.h self contained 2020-04-17 06:05:29 -04:00
vm tools/vm: fix cross-compile build 2020-04-21 11:11:56 -07:00
wmi
Makefile tools: bootconfig: Add bootconfig command 2020-01-13 13:19:39 -05:00