linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-07 05:55:44 +02:00

Daniel Borkmann 6d2eea6fb0 bpf, arm64: optimize 32/64 immediate emission

Improve the JIT to emit 64 and 32 bit immediates, the current algorithm is not optimal and we often emit more instructions than actually needed. arm64 has movz, movn, movk variants but for the current 64 bit immediates we only use movz with a series of movk when needed.

For example loading ffffffffffffabab emits the following 4 instructions in the JIT today:

  * movz: abab, shift:  0, result: 000000000000abab
  * movk: ffff, shift: 16, result: 00000000ffffabab
  * movk: ffff, shift: 32, result: 0000ffffffffabab
  * movk: ffff, shift: 48, result: ffffffffffffabab

Whereas after the patch the same load only needs a single instruction:

  * movn: 5454, shift:  0, result: ffffffffffffabab

Another example where two extra instructions can be saved:

  * movz: abab, shift:  0, result: 000000000000abab
  * movk: 1f2f, shift: 16, result: 000000001f2fabab
  * movk: ffff, shift: 32, result: 0000ffff1f2fabab
  * movk: ffff, shift: 48, result: ffffffff1f2fabab

After the patch:

  * movn: e0d0, shift: 16, result: ffffffff1f2fffff
  * movk: abab, shift:  0, result: ffffffff1f2fabab

Another example with movz, before:

  * movz: 0000, shift:  0, result: 0000000000000000
  * movk: fea0, shift: 32, result: 0000fea000000000

After:

  * movz: fea0, shift: 32, result: 0000fea000000000

Moreover, reuse emit_a64_mov_i() for 32 bit immediates that are loaded via emit_a64_mov_i64() which is a similar optimization as done in `6fe8b9c1f4` ("bpf, x64: save several bytes by using mov over movabsq when possible"). On arm64, the latter allows to use a single instruction with movn due to zero extension where otherwise two would be needed. And last but not least add a missing optimization in emit_a64_mov_i() where movn is used but the subsequent movk not needed. With some of the Cilium programs in use, this shrinks the needed instructions by about three percent. Tested on Cavium ThunderX CN8890.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

2018-05-14 19:11:45 -07:00
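
The 64 bit selection described in the commit message can be sketched in standalone C. This is an illustration only, not the code from the patch: the names mov_insn, count_blocks(), plan_mov_i64() and run_plan() are hypothetical, and the sketch assumes the choice between movz and movn is made by comparing how many 16 bit blocks of the value are 0xffff versus 0x0000. It does not model the 32 bit emit_a64_mov_i() path.

```c
#include <stdint.h>

/* Hypothetical types for this sketch; not the kernel's JIT structures. */
enum mov_op { MOVZ, MOVN, MOVK };

struct mov_insn {
	enum mov_op op;
	uint16_t imm;	/* 16 bit immediate */
	int shift;	/* 0, 16, 32 or 48 */
};

/* Count the 16 bit blocks of val that are equal to fill. */
static int count_blocks(uint64_t val, uint16_t fill)
{
	int n = 0;

	for (int shift = 0; shift < 64; shift += 16)
		if (((val >> shift) & 0xffff) == fill)
			n++;
	return n;
}

/*
 * Plan the moves that load val. Writes at most 4 entries to out and
 * returns how many were written.
 */
static int plan_mov_i64(uint64_t val, struct mov_insn out[4])
{
	/* Assumption: start from all ones if that leaves fewer blocks to patch. */
	int inverse = count_blocks(val, 0xffff) > count_blocks(val, 0x0000);
	uint16_t fill = inverse ? 0xffff : 0x0000;
	int shift = 48, n = 0;

	/* Skip high blocks that the first instruction fills in for free. */
	while (shift > 0 && ((val >> shift) & 0xffff) == fill)
		shift -= 16;

	uint16_t block = (val >> shift) & 0xffff;

	/* movn takes the inverted block, movz takes it unchanged. */
	out[n].op = inverse ? MOVN : MOVZ;
	out[n].imm = inverse ? (uint16_t)~block : block;
	out[n].shift = shift;
	n++;

	/* Patch only the lower blocks that differ from the fill pattern. */
	for (shift -= 16; shift >= 0; shift -= 16) {
		block = (val >> shift) & 0xffff;
		if (block != fill) {
			out[n] = (struct mov_insn){ MOVK, block, shift };
			n++;
		}
	}
	return n;
}

/* Replay a plan on a simulated 64 bit register. */
static uint64_t run_plan(const struct mov_insn *insn, int n)
{
	uint64_t reg = 0;

	for (int i = 0; i < n; i++) {
		uint64_t imm = (uint64_t)insn[i].imm << insn[i].shift;
		uint64_t mask = 0xffffULL << insn[i].shift;

		switch (insn[i].op) {
		case MOVZ: reg = imm; break;			/* other bits zero */
		case MOVN: reg = ~imm; break;			/* other bits one */
		case MOVK: reg = (reg & ~mask) | imm; break;	/* keep other bits */
		}
	}
	return reg;
}
```

Calling plan_mov_i64() with ffffffff1f2fabab yields the two step sequence from the second example (movn e0d0 at shift 16, then movk abab at shift 0), and run_plan() on that sequence reproduces the original value.
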
..
alpha	mm: introduce MAP_FIXED_NOREPLACE	2018-04-11 10:28:38 -07:00
arc	kbuild: mark $(targets) as .SECONDARY and remove .PRECIOUS markers	2018-04-07 19:04:02 +09:00
arm	bpf, arm32: save 4 bytes of unneeded stack space	2018-05-14 19:11:45 -07:00
arm64	bpf, arm64: optimize 32/64 immediate emission	2018-05-14 19:11:45 -07:00
c6x	c6x: pass endianness info to sparse	2018-04-10 09:58:58 -04:00
h8300	h8300: remove extraneous __BIG_ENDIAN definition	2018-03-22 17:07:01 -07:00
hexagon	hexagon: export csum_partial_copy_nocheck	2018-05-01 15:49:50 -05:00
ia64	pci-v4.17-changes	2018-04-06 18:31:06 -07:00
m68k	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu	2018-04-09 09:15:46 -07:00
microblaze	Microblaze patches for 4.17-rc1	2018-04-12 10:18:02 -07:00
mips	bpf, mips: remove unused function	2018-05-14 19:11:45 -07:00
nds32	page cache: use xa_lock	2018-04-11 10:28:39 -07:00
nios2	nios2 update for v4.17-rc1	2018-04-11 16:02:18 -07:00
openrisc	OpenRISC updates for v4.17	2018-04-15 12:27:58 -07:00
parisc	parisc: Fix section mismatches	2018-05-02 21:47:35 +02:00
powerpc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	2018-05-07 23:35:08 -04:00
riscv	RISC-V: build vdso-dummy.o with -no-pie	2018-04-24 10:54:46 -07:00
s390	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	2018-05-07 23:35:08 -04:00
sh	Merge branch 'akpm' (patches from Andrew)	2018-04-14 08:50:50 -07:00
sparc	bpf, sparc: remove unused variable	2018-05-14 19:11:45 -07:00
um	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml	2018-04-11 16:36:47 -07:00
unicore32	unicore32: turn flush_dcache_mmap_lock into a no-op	2018-04-11 10:28:39 -07:00
x86	bpf, x64: clean up retpoline emission slightly	2018-05-14 19:11:45 -07:00
xtensa	mm: introduce MAP_FIXED_NOREPLACE	2018-04-11 10:28:38 -07:00
.gitignore
Kconfig	kbuild: remove incremental linking option	2018-03-26 02:01:19 +09:00