$ map_perf_test 128
speed of HASH bpf_map_lookup_elem() in lookups per second
        w/o JIT   w/JIT
before  46M       58M
after   42M       74M
perf report
before:
54.23% map_perf_test [kernel.kallsyms] [k] __htab_map_lookup_elem
14.24% map_perf_test [kernel.kallsyms] [k] lookup_elem_raw
8.84% map_perf_test [kernel.kallsyms] [k] htab_map_lookup_elem
5.93% map_perf_test [kernel.kallsyms] [k] bpf_map_lookup_elem
2.30% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2
1.49% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler
after:
60.03% map_perf_test [kernel.kallsyms] [k] __htab_map_lookup_elem
18.07% map_perf_test [kernel.kallsyms] [k] lookup_elem_raw
2.91% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2
1.94% map_perf_test [kernel.kallsyms] [k] _einittext
1.90% map_perf_test [kernel.kallsyms] [k] __audit_syscall_exit
1.72% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler
Notice that bpf_map_lookup_elem() and htab_map_lookup_elem() are trivial
functions, yet they take a sizeable amount of CPU time.
htab_map_gen_lookup() removes bpf_map_lookup_elem() and converts
htab_map_lookup_elem() into three BPF insns, which causes the CPU time
for bpf_prog_da4fc6a3f41761a2() to increase slightly.
$ map_perf_test 256
speed of ARRAY bpf_map_lookup_elem() in lookups per second
        w/o JIT   w/JIT
before  97M       174M
after   64M       280M
before:
37.33% map_perf_test [kernel.kallsyms] [k] array_map_lookup_elem
13.95% map_perf_test [kernel.kallsyms] [k] bpf_map_lookup_elem
6.54% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2
4.57% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler
after:
32.86% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2
6.54% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler
array_map_gen_lookup() removes calls to array_map_lookup_elem()
and bpf_map_lookup_elem() and replaces them with 7 bpf insns.
Without JIT the lookups become slower (97M to 64M per second), since
executing the extra insns in the interpreter is slower than running
native C code, but with JIT the gain is clear (174M to 280M per second),
since native C->x86 code is replaced with fewer bpf->x86 instructions.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
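
The relative changes behind the two tables in the commit message above can be worked out with a few lines of shell. This is only an illustrative sketch: the helper name `pct_change` is hypothetical and not part of the samples, and the inputs are the lookup rates (in millions per second) quoted in the tables.

```shell
#!/bin/sh
# Hypothetical helper (not part of samples/bpf): relative change in percent
# between a "before" and an "after" rate, printed with an explicit sign.
pct_change() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%+.1f\n", (after - before) / before * 100 }'
}

# Rates are taken from the HASH and ARRAY tables (millions of lookups/s).
echo "HASH  w/o JIT: $(pct_change 46 42)%"
echo "HASH  w/JIT:   $(pct_change 58 74)%"
echo "ARRAY w/o JIT: $(pct_change 97 64)%"
echo "ARRAY w/JIT:   $(pct_change 174 280)%"
```

With JIT enabled this gives roughly +27.6% for the hash map and +60.9% for the array map; without JIT the same change costs about 8.7% and 34.0% respectively.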
| File | | |
|---|---|---|
| bpf_helpers.h | ||
| bpf_load.c | ||
| bpf_load.h | ||
| cgroup_helpers.c | ||
| cgroup_helpers.h | ||
| fds_example.c | ||
| lathist_kern.c | ||
| lathist_user.c | ||
| libbpf.h | ||
| lwt_len_hist_kern.c | ||
| lwt_len_hist_user.c | ||
| lwt_len_hist.sh | ||
| Makefile | ||
| map_perf_test_kern.c | ||
| map_perf_test_user.c | ||
| offwaketime_kern.c | ||
| offwaketime_user.c | ||
| parse_ldabs.c | ||
| parse_simple.c | ||
| parse_varlen.c | ||
| README.rst | ||
| sampleip_kern.c | ||
| sampleip_user.c | ||
| sock_example.c | ||
| sock_example.h | ||
| sock_flags_kern.c | ||
| sockex1_kern.c | ||
| sockex1_user.c | ||
| sockex2_kern.c | ||
| sockex2_user.c | ||
| sockex3_kern.c | ||
| sockex3_user.c | ||
| spintest_kern.c | ||
| spintest_user.c | ||
| tc_l2_redirect_kern.c | ||
| tc_l2_redirect_user.c | ||
| tc_l2_redirect.sh | ||
| tcbpf1_kern.c | ||
| tcbpf2_kern.c | ||
| test_cgrp2_array_pin.c | ||
| test_cgrp2_attach.c | ||
| test_cgrp2_attach2.c | ||
| test_cgrp2_sock.c | ||
| test_cgrp2_sock.sh | ||
| test_cgrp2_sock2.c | ||
| test_cgrp2_sock2.sh | ||
| test_cgrp2_tc_kern.c | ||
| test_cgrp2_tc.sh | ||
| test_cls_bpf.sh | ||
| test_current_task_under_cgroup_kern.c | ||
| test_current_task_under_cgroup_user.c | ||
| test_ipip.sh | ||
| test_lru_dist.c | ||
| test_lwt_bpf.c | ||
| test_lwt_bpf.sh | ||
| test_overhead_kprobe_kern.c | ||
| test_overhead_tp_kern.c | ||
| test_overhead_user.c | ||
| test_probe_write_user_kern.c | ||
| test_probe_write_user_user.c | ||
| test_tunnel_bpf.sh | ||
| trace_event_kern.c | ||
| trace_event_user.c | ||
| trace_output_kern.c | ||
| trace_output_user.c | ||
| tracex1_kern.c | ||
| tracex1_user.c | ||
| tracex2_kern.c | ||
| tracex2_user.c | ||
| tracex3_kern.c | ||
| tracex3_user.c | ||
| tracex4_kern.c | ||
| tracex4_user.c | ||
| tracex5_kern.c | ||
| tracex5_user.c | ||
| tracex6_kern.c | ||
| tracex6_user.c | ||
| xdp_tx_iptunnel_common.h | ||
| xdp_tx_iptunnel_kern.c | ||
| xdp_tx_iptunnel_user.c | ||
| xdp1_kern.c | ||
| xdp1_user.c | ||
| xdp2_kern.c | ||

eBPF sample programs
====================

This directory contains test stubs, a verifier test-suite and examples
for using eBPF. The examples use libbpf from tools/lib/bpf.

Build dependencies
==================

Compiling requires having installed:

* clang >= version 3.4.0
* llvm >= version 3.7.1

Note that LLVM's tool 'llc' must support target 'bpf'. List the version
and supported targets with the command: ``llc --version``

Kernel headers
--------------

There are usually dependencies on header files of the current kernel.
To avoid installing devel kernel headers system wide, as a normal
user, simply call::

 make headers_install

This will create a local "usr/include" directory in the git/build top
level directory, which the make system automatically picks up first.

Compiling
=========

For building the BPF samples, issue the below command from the kernel
top level directory::

 make samples/bpf/

Do notice the "/" slash after the directory name.

It is also possible to call make from this directory. This will just
hide the invocation of make as above with the appended "/".

Manually compiling LLVM with 'bpf' support
------------------------------------------

Since version 3.7.0, LLVM includes a proper LLVM backend target for the
BPF bytecode architecture.

By default llvm will build all non-experimental backends including bpf.
To generate a smaller llc binary one can use::

 -DLLVM_TARGETS_TO_BUILD="BPF"

Quick snippet for manually compiling LLVM and clang
(build dependencies are cmake and gcc-c++)::

 $ git clone http://llvm.org/git/llvm.git
 $ cd llvm/tools
 $ git clone --depth 1 http://llvm.org/git/clang.git
 $ cd ..; mkdir build; cd build
 $ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86"
 $ make -j $(getconf _NPROCESSORS_ONLN)

It is also possible to point make to the newly compiled 'llc' or
'clang' command by redefining LLC or CLANG on the make command line::

 make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
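
Before building, it can help to confirm that the toolchain meets the build dependencies listed above. The following is a minimal sketch, not part of the samples: the helper names `version_ge` and `llc_has_bpf` are hypothetical, reading the llc path from an ``LLC`` environment variable is an assumption made for this example, and ``sort -V`` is assumed to be available (it is provided by GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if dotted version $1 is >= version $2.
# Assumes a sort that supports -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: succeeds if the given llc binary lists a 'bpf'
# target in the output of 'llc --version'.
llc_has_bpf() {
    "$1" --version 2>/dev/null | grep -qiw 'bpf'
}

# Assumption: take the llc path from $LLC, falling back to 'llc' in PATH.
LLC="${LLC:-llc}"
if llc_has_bpf "$LLC"; then
    echo "$LLC: bpf target available"
else
    echo "$LLC: no bpf target found (or llc is not installed)"
fi
```

A version string obtained from the tools can then be compared against the minimum, for example ``version_ge "3.8.0" "3.7.1"`` for llvm. Extracting the version from ``clang --version`` or ``llc --version`` is left out here because the output format differs between releases and distributions.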