linux/kernel/bpf
Kumar Kartikeya Dwivedi e723608bf4 bpf: Add verifier support for timed may_goto
Implement support in the verifier for switching the may_goto implementation
from a counter-based approach to one which samples time on the local CPU,
allowing a larger loop bound.

We implement it by maintaining 16 bytes per stack frame, using 8
bytes for the count that amortizes time sampling, and 8
bytes for the starting timestamp. To minimize overhead, we need to avoid
spilling and filling of registers around this sequence, so we push this
cost into the time sampling function 'arch_bpf_timed_may_goto'. This is
a JIT-specific wrapper around bpf_check_timed_may_goto which returns us
the count to store into the stack through BPF_REG_AX. All caller-saved
registers (r0-r5) are guaranteed to remain untouched.
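
As an illustration of the 16-byte per-frame state described above, here is a
minimal userspace sketch. The struct name and helper are hypothetical and are
not taken from the patch; only the two 8-byte fields and their order (count
first, timestamp 8 bytes after it) come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: per-frame state as described in the commit message. */
struct timed_may_goto_slot {
	uint64_t count;     /* iterations left before the next time sample */
	uint64_t timestamp; /* time at which the loop's budget started */
};

/* The text reserves exactly 16 bytes, with the timestamp 8 bytes in. */
static_assert(sizeof(struct timed_may_goto_slot) == 16,
	      "16 bytes per stack frame");
static_assert(offsetof(struct timed_may_goto_slot, timestamp) == 8,
	      "timestamp follows the count");

/* Size of the per-frame reservation, in bytes. */
static inline size_t timed_may_goto_slot_size(void)
{
	return sizeof(struct timed_may_goto_slot);
}
```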

The loop can be broken by returning a count of 0. Otherwise, we dispatch
into the function when the count drops to 0, and the runtime chooses
either to refresh it (by returning count as BPF_MAX_TIMED_LOOPS) or to
return 0 and abort the loop on the next iteration.
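
The count/refresh protocol can be modelled in plain C. This is a userspace
sketch under stated assumptions, not the kernel code: the names slot,
check_timed and cond_break_step are hypothetical, the refresh value 65535 is
taken from the benchmark row below, the 250 ms budget from the limit quoted
later in this message, and a timestamp of 0 is assumed to mean "not sampled
yet". The clock is passed in as a parameter so the logic can be exercised
without a real time source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_TIMED_LOOPS 0xffffULL     /* assumed refresh value (65535) */
#define TIME_BUDGET_NS  250000000ULL  /* assumed budget: 250 ms */

struct slot {
	uint64_t count;     /* iterations until the next time sample */
	uint64_t timestamp; /* 0 is assumed to mean "not sampled yet" */
};

/* Slow path: given a sampled time, refresh the count or return 0. */
static uint64_t check_timed(struct slot *s, uint64_t now_ns)
{
	if (s->timestamp == 0) {
		/* First sample for this frame: start the budget. */
		s->timestamp = now_ns;
		return MAX_TIMED_LOOPS;
	}
	if (now_ns - s->timestamp >= TIME_BUDGET_NS)
		return 0; /* budget exhausted: next check will break */
	return MAX_TIMED_LOOPS;
}

/* One cond_break check; returns true when the loop must break. */
static bool cond_break_step(struct slot *s, uint64_t now_ns)
{
	uint64_t count = s->count;

	if (count == 0)
		return true;                    /* zero check right after the load */
	if (--count == 0)
		count = check_timed(s, now_ns); /* amortized time sampling */
	s->count = count;
	return false;
}
```

In this model a slot that starts with a nonzero count samples the clock only
once per MAX_TIMED_LOOPS iterations, and a returned 0 is stored first and
acted on by the following check.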

Since the check for 0 is done right after loading the count from the
stack, all subsequent cond_break sequences, whether in the same loop or
in later loops in the program, should immediately break as well.

We pass in the stack_depth of the count (and thus the timestamp, by
adding 8 to it) to the arch_bpf_timed_may_goto call so that it can be
passed in to bpf_check_timed_may_goto as an argument after r1 is saved,
by adding the offset to r10/fp. This adjustment will be arch specific,
and the next patch will introduce support for x86.
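
The address arithmetic described above can be sketched in portable C. The
helper names are hypothetical, and the real adjustment happens in the
arch-specific wrapper rather than in C; the sketch only shows that the count
lives at fp plus a (negative) stack offset and the timestamp 8 bytes after it.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the count: frame pointer plus a (negative) stack offset. */
static inline uint64_t *count_slot(uint8_t *fp, int32_t off)
{
	return (uint64_t *)(fp + off);
}

/* The timestamp sits 8 bytes after the count. */
static inline uint64_t *timestamp_slot(uint8_t *fp, int32_t off)
{
	return (uint64_t *)(fp + off + 8);
}
```

For example, with an offset of -16 the count occupies fp-16..fp-9 and the
timestamp fp-8..fp-1.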

Note that depending on loop complexity, time spent in the loop can be
more than the current limit (250 ms), but imposing an upper bound on
program runtime is an orthogonal problem which will be addressed when
program cancellations are supported.

The current time afforded by cond_break may not be enough for cases
where BPF programs want to implement locking algorithms inline, and use
cond_break as a promise to the verifier that they will eventually
terminate.

Below are some benchmarking numbers on the time taken per-iteration for
an empty loop that counts the number of iterations until cond_break
fires. For comparison, we measure it against bpf_for/bpf_repeat, which is
another way to achieve the same number of spins (BPF_MAX_LOOPS). The
hardware used for benchmarking was a Sapphire Rapids Intel server with
the performance governor and mitigations enabled.

+-----------------------------+--------------+--------------+------------------+
| Loop type                   | Iterations   |  Time (ms)   |   Time/iter (ns) |
+-----------------------------+--------------+--------------+------------------+
| may_goto                    | 8388608      |  3           |   0.36           |
| timed_may_goto (count=65535)| 589674932    |  250         |   0.42           |
| bpf_for                     | 8388608      |  10          |   1.19           |
+-----------------------------+--------------+--------------+------------------+

This gives a good approximation at low overhead while staying close to
the current implementation.
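
The Time/iter column in the table above is the measured time divided by the
iteration count. A small helper (hypothetical name) makes the conversion
from milliseconds to nanoseconds per iteration explicit:

```c
#include <assert.h>
#include <stdint.h>

/* Average nanoseconds per iteration, given a total time in milliseconds. */
static double ns_per_iter(uint64_t iterations, double time_ms)
{
	return time_ms * 1e6 / (double)iterations;
}
```

For the may_goto row, ns_per_iter(8388608, 3) comes to roughly 0.36, and the
other two rows round to 0.42 and 1.19 in the same way.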

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250304003239.2390751-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15 11:48:28 -07:00
..
preload
arena.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf bpf-6.14-rc4 2025-02-20 18:13:57 -08:00
arraymap.c bpf: Remove migrate_{disable|enable} in ->map_for_each_callback 2025-01-08 18:06:35 -08:00
bloom_filter.c
bpf_cgrp_storage.c bpf: Fix deadlock when freeing cgroup storage 2025-01-29 18:38:19 -08:00
bpf_inode_storage.c bpf: Disable migration when destroying inode storage 2025-01-08 18:06:36 -08:00
bpf_iter.c bpf: Make every prog keep a copy of ctx_arg_info 2025-02-17 18:47:27 -08:00
bpf_local_storage.c bpf: Remove migrate_{disable|enable} from bpf_selem_free() 2025-01-08 18:06:37 -08:00
bpf_lru_list.c
bpf_lru_list.h
bpf_lsm.c bpf: lsm: Add two more sleepable hooks 2025-02-13 19:35:31 -08:00
bpf_struct_ops.c bpf: Allow struct_ops prog to return referenced kptr 2025-02-17 18:47:27 -08:00
bpf_task_storage.c bpf: Remove migrate_{disable|enable} from bpf_task_storage_lock helpers 2025-01-08 18:06:36 -08:00
btf_iter.c bpf: Remove custom build rule 2024-08-30 08:55:26 -07:00
btf_relocate.c bpf: Remove custom build rule 2024-08-30 08:55:26 -07:00
btf.c selftests/bpf: Test gen_pro/epilogue that generate kfuncs 2025-02-25 19:04:43 -08:00
cgroup_iter.c
cgroup.c bpf: Allow pre-ordering for bpf cgroup progs 2025-03-15 11:48:25 -07:00
core.c bpf: Add verifier support for timed may_goto 2025-03-15 11:48:28 -07:00
cpumap.c xdp: get rid of xdp_frame::mem.id 2024-12-12 18:22:52 -08:00
cpumask.c bpf: Remove migrate_{disable,enable} in bpf_cpumask_release() 2025-01-08 18:06:37 -08:00
crypto.c bpf: crypto: make state and IV dynptr nullable 2024-06-13 16:33:04 -07:00
devmap.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-12-12 14:19:05 -08:00
disasm.c
disasm.h
dispatcher.c bpf: Add kernel symbol for struct_ops trampoline 2024-11-12 17:13:46 -08:00
hashtab.c bpf: Fix kmemleak warning for percpu hashmap 2025-02-24 12:11:00 -08:00
helpers.c bpf/helpers: Introduce bpf_dynptr_copy kfunc 2025-03-15 11:48:16 -07:00
inode.c bpf: Preserve param->string when parsing mount options 2024-10-22 12:56:38 -07:00
Kconfig
kmem_cache_iter.c bpf: Add open coded version of kmem_cache iterator 2024-11-01 11:08:32 -07:00
link_iter.c
local_storage.c bpf: Replace 8 seq_puts() calls by seq_putc() calls 2024-07-29 12:53:00 -07:00
log.c bpf: Introduce support for bpf_local_irq_{save,restore} 2024-12-04 08:38:29 -08:00
lpm_trie.c bpf: Remove migrate_{disable|enable} from LPM trie 2025-01-08 18:06:35 -08:00
Makefile bpf: Avoid deadlock caused by nested kprobe and fentry bpf programs 2024-12-14 09:49:27 -08:00
map_in_map.c bpf: switch maps to CLASS(fd, ...) 2024-08-13 15:58:17 -07:00
map_in_map.h
map_iter.c
memalloc.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 2024-11-13 12:52:51 -08:00
mmap_unlock_work.h
mprog.c
net_namespace.c
offload.c
percpu_freelist.c
percpu_freelist.h
prog_iter.c
queue_stack_maps.c
range_tree.c bpf: Disable migration before calling ops->map_free() 2025-01-08 18:06:36 -08:00
range_tree.h bpf: Introduce range_tree data structure and use it in bpf arena 2024-11-13 13:52:45 -08:00
relo_core.c bpf: Remove custom build rule 2024-08-30 08:55:26 -07:00
reuseport_array.c bpf: Use sockfd_put() helper 2024-08-30 08:57:47 -07:00
ringbuf.c bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic 2025-01-29 09:49:50 -08:00
stackmap.c bpf: wire up sleepable bpf_get_stack() and bpf_get_task_stack() helpers 2024-09-11 09:58:31 -07:00
syscall.c bpf: no longer acquire map_idr_lock in bpf_map_inc_not_zero() 2025-03-15 11:48:26 -07:00
sysfs_btf.c btf: Switch vmlinux BTF attribute to sysfs_bin_attr_simple_read() 2025-01-09 10:44:06 +01:00
task_iter.c vfs-6.13.file 2024-11-18 10:30:29 -08:00
tcx.c
tnum.c
token.c remove pointless includes of <linux/fdtable.h> 2024-10-07 13:34:41 -04:00
trampoline.c bpf: Add kernel symbol for struct_ops trampoline 2024-11-12 17:13:46 -08:00
verifier.c bpf: Add verifier support for timed may_goto 2025-03-15 11:48:28 -07:00