linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-15 01:43:11 +02:00


Josh Poimboeuf 3193c0836f bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()	2019-07-18 21:01:06 +02:00

On x86-64, with CONFIG_RETPOLINE=n, GCC's "global common subexpression
elimination" optimization results in ___bpf_prog_run()'s jumptable code
changing from this:

	select_insn:
		jmp jumptable(, %rax, 8)
		...
	ALU64_ADD_X:
		...
		jmp jumptable(, %rax, 8)
	ALU_ADD_X:
		...
		jmp jumptable(, %rax, 8)

to this:

	select_insn:
		mov jumptable, %r12
		jmp (%r12, %rax, 8)
		...
	ALU64_ADD_X:
		...
		jmp (%r12, %rax, 8)
	ALU_ADD_X:
		...
		jmp (%r12, %rax, 8)

The jumptable address is placed in a register once, at the beginning of
the function. The function execution can then go through multiple
indirect jumps which rely on that same register value. This has a few
issues:

1) Objtool isn't smart enough to be able to track such a register value
   across multiple recursive indirect jumps through the jump table.

2) With CONFIG_RETPOLINE enabled, this optimization actually results in
   a small slowdown. I measured a ~4.7% slowdown in the test_bpf
   "tcpdump port 22" selftest.

   This slowdown is actually predicted by the GCC manual:

     Note: When compiling a program using computed gotos, a GCC
     extension, you may get better run-time performance if you disable
     the global common subexpression elimination pass by adding
     -fno-gcse to the command line.

So just disable the optimization for this function.

Fixes: `e55a73251d` ("bpf: Fix ORC unwinding in non-JIT BPF code")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/30c3ca29ba037afcbd860a8672eef0021addf9fe.1563413318.git.jpoimboe@redhat.com
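
The per-function approach described in the commit message can be sketched with a toy interpreter. This is a minimal illustration only, not the kernel's code: the opcodes, `struct toy_insn`, `toy_run()` and the `NO_GCSE` macro are all hypothetical names. It assumes GCC's `optimize` function attribute and the computed-goto ("labels as values") extension; on other compilers the macro expands to nothing.

```c
#include <stdint.h>

/* Hypothetical opcodes for a toy interpreter (not the BPF instruction set). */
enum { OP_ADD, OP_SUB, OP_EXIT };

struct toy_insn {
    uint8_t op;
    int32_t imm;
};

/*
 * Assumption: GCC's optimize attribute is used to turn off GCSE for one
 * function only. Clang does not implement it, so the macro is empty there.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_GCSE __attribute__((optimize("-fno-gcse")))
#else
#define NO_GCSE
#endif

/* Runs a program until OP_EXIT and returns the accumulator. */
NO_GCSE int64_t toy_run(const struct toy_insn *insn)
{
    /* One label address per opcode: the jump table. */
    static const void *const jumptable[] = {
        [OP_ADD]  = &&do_add,
        [OP_SUB]  = &&do_sub,
        [OP_EXIT] = &&do_exit,
    };
    int64_t acc = 0;

select_insn:
    /* Each dispatch is an indirect jump through the table. */
    goto *jumptable[insn->op];

do_add:
    acc += insn->imm;
    insn++;
    goto select_insn;

do_sub:
    acc -= insn->imm;
    insn++;
    goto select_insn;

do_exit:
    return acc;
}
```

Scoping the attribute to the interpreter function keeps the rest of the translation unit compiled with the default optimization passes, which is the trade-off the commit message argues for.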
acpi	It's been a relatively busy cycle for docs:	2019-07-09 12:34:26 -07:00
asm-generic	include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures	2019-07-16 19:23:24 -07:00
clocksource	clocksource/drivers: Continue making Hyper-V clocksource ISA agnostic	2019-07-03 11:00:59 +02:00
crypto	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	2019-07-08 20:57:08 -07:00
drm	drm main pull request for v5.3-rc1 (sans mm changes)	2019-07-15 19:04:27 -07:00
dt-bindings	This round of clk driver and framework updates is heavy on the driver update	2019-07-17 10:07:48 -07:00
keys	request_key improvements	2019-07-08 19:19:37 -07:00
kvm	KVM: arm/arm64: Support chained PMU counters	2019-07-05 13:56:22 +01:00
linux	bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()	2019-07-18 21:01:06 +02:00
math-emu
media	media updates for v5.3-rc1	2019-07-09 09:47:22 -07:00
memory	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500	2019-06-19 17:09:55 +02:00
misc	powerpc updates for 5.3	2019-07-13 16:08:36 -07:00
net	net: sched: Fix NULL-pointer dereference in tc_indr_block_ing_cmd()	2019-07-12 15:21:53 -07:00
pcmcia	It's been a relatively busy cycle for docs:	2019-07-09 12:34:26 -07:00
ras
rdma	RDMA/core: Make rdma_counter.h compile stand alone	2019-07-09 09:44:47 -03:00
scsi	SCSI misc on 20190709	2019-07-11 15:14:01 -07:00
soc	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500	2019-06-19 17:09:55 +02:00
sound	ASoC: Updates for v5.3	2019-07-08 14:45:34 +02:00
target
trace	for-5.3-tag	2019-07-16 15:12:56 -07:00
uapi	virtio, vhost: fixes, features, performance	2019-07-17 11:26:09 -07:00
vdso	vdso: Remove superfluous #ifdef __KERNEL__ in vdso/datapage.h	2019-06-26 07:28:09 +02:00
video	drm main pull request for v5.3-rc1 (sans mm changes)	2019-07-15 19:04:27 -07:00
xen
Kbuild	kbuild: compile-test kernel headers to ensure they are self-contained	2019-07-09 21:44:37 +09:00