Verify that btf__new_empty_opts() adds layouts for all supported kinds,
and that, after adding kind-related types for an unknown kind, parsing
uses this layout info when that kind is encountered rather than
giving up.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260326145444.2076244-9-alan.maguire@oracle.com
The verifier log output may contain multiple lines that start with
18: (bf) r0 = r6
Teach reg_bounds to look for lines that have ';' in them,
since the reg_bounds test is looking for:
18: (bf) r0 = r6 ; R0=... R6=...
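A minimal sketch of the line-selection idea, assuming the log is a NUL-terminated, newline-separated buffer. The helper name find_state_line() is hypothetical and is not the code used by the reg_bounds test:

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the first line in 'log' that starts with 'prefix'
 * and also contains ';' (the separator that precedes the register
 * state), or NULL if no such line exists.
 */
static const char *find_state_line(const char *log, const char *prefix)
{
	size_t plen = strlen(prefix);
	const char *line = log;

	while (line && *line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		/* Match the instruction text and require the ';' marker. */
		if (len >= plen && strncmp(line, prefix, plen) == 0 &&
		    memchr(line, ';', len))
			return line;

		line = eol ? eol + 1 : NULL;
	}
	return NULL;
}
```

Lines that repeat the instruction text without a ';' are skipped, so only the line carrying register state is returned.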
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260325012242.45606-1-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
(tcp_congestion_ops)->cwnd_event() is called very often, with
@event oscillating between CA_EVENT_TX_START and other values.
This is not branch prediction friendly.
Provide a new cwnd_event_tx_start pointer dedicated for CA_EVENT_TX_START.
Both BBR and CUBIC benefit from this change, since they only care
about CA_EVENT_TX_START.
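The idea can be sketched with a toy model in plain C. All toy_* names below are hypothetical and do not mirror the real struct tcp_congestion_ops layout. A call site that knows statically that it is reporting a TX start calls the dedicated pointer, so the generic handler no longer sees that event:

```c
#include <stddef.h>

/* Hypothetical event set; the real enum lives in the kernel headers. */
enum toy_ca_event { TOY_CA_EVENT_TX_START, TOY_CA_EVENT_LOSS };

struct toy_sock {
	int tx_starts;
	int other_events;
};

struct toy_ca_ops {
	/* Generic handler for everything except TX start. */
	void (*cwnd_event)(struct toy_sock *sk, enum toy_ca_event ev);
	/* Dedicated handler, called only for TX start. */
	void (*cwnd_event_tx_start)(struct toy_sock *sk);
};

/* Used by call sites that always report a TX start. */
static void toy_ca_event_tx_start(struct toy_sock *sk,
				  const struct toy_ca_ops *ops)
{
	if (ops->cwnd_event_tx_start)
		ops->cwnd_event_tx_start(sk);
}

/* Used by call sites that report any other event. */
static void toy_ca_event(struct toy_sock *sk, const struct toy_ca_ops *ops,
			 enum toy_ca_event ev)
{
	if (ops->cwnd_event)
		ops->cwnd_event(sk, ev);
}

static void toy_on_tx_start(struct toy_sock *sk)
{
	sk->tx_starts++;
}

/* An algorithm that only cares about TX start leaves cwnd_event NULL. */
static const struct toy_ca_ops toy_tx_only_ops = {
	.cwnd_event = NULL,
	.cwnd_event_tx_start = toy_on_tx_start,
};
```

With this split, an algorithm that only handles TX start is never invoked for the other events.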
No change in kernel size:
$ scripts/bloat-o-meter -t vmlinux.0 vmlinux
add/remove: 4/4 grow/shrink: 3/1 up/down: 564/-568 (-4)
Function                                     old     new   delta
bbr_cwnd_event_tx_start                        -     450    +450
cubictcp_cwnd_event_tx_start                   -      70     +70
__pfx_cubictcp_cwnd_event_tx_start             -      16     +16
__pfx_bbr_cwnd_event_tx_start                  -      16     +16
tcp_unregister_congestion_control             93      99      +6
tcp_update_congestion_control                518     521      +3
tcp_register_congestion_control              422     425      +3
__tcp_transmit_skb                          3308    3306      -2
__pfx_cubictcp_cwnd_event                     16       -     -16
__pfx_bbr_cwnd_event                          16       -     -16
cubictcp_cwnd_event                           80       -     -80
bbr_cwnd_event                               454       -    -454
Total: Before=25240512, After=25240508, chg -0.00%
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260323234920.1097858-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a test to make sure that variable-length stack writes
scrub STACK_SPILL into STACK_MISC.
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260324215938.81733-2-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now that the UAPI headers provide the required definitions, use those.
Some symbols have been renamed, adapt to those.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Now that the UAPI headers provide the required definitions, use those.
Some symbols have been renamed, adapt to those.
Also adapt the include path for the custom sign-file rule in the
bpf selftests.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
trampoline_count fills all trampoline attachment slots for a single
target function and verifies that one extra attach fails with -E2BIG.
It currently targets bpf_modify_return_test, which is also used by
other selftests such as modify_return, get_func_ip_test, and
get_func_args_test. When such tests run in parallel, they can contend
for the same per-function trampoline quota and cause unexpected attach
failures. This issue is currently masked by harness serialization.
Move trampoline_count to a dedicated bpf_testmod target and register it
for fmod_ret attachment. Also route the final trigger through
trigger_module_test_read(), so the execution path exercises the same
dedicated target.
This keeps the test semantics unchanged while isolating it from other
selftests, so it no longer needs to run in serial mode. Remove the
TODO comment as well.
Tested:
./test_progs -t trampoline_count -vv
./test_progs -j$(nproc) -t trampoline_count -vv
./test_progs -j$(nproc) -t \
trampoline_count,modify_return,get_func_ip_test,get_func_args_test -vv
20 runs of:
./test_progs -j$(nproc) -t \
trampoline_count,modify_return,get_func_ip_test,get_func_args_test
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260324044949.869801-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Previously I added a FIONREAD test for sockmap, but it can occasionally
fail in CI [1].
The test sends 10 bytes in two segments (2 + 8). For UDP, FIONREAD only
reports the length of the first datagram, not the total queued data.
The original code used recv_timeout() expecting all 10 bytes, but under
high system load, the second datagram may not yet be processed by the
protocol stack, so recv would only return the first 2-byte datagram,
causing a size mismatch failure.
Fix this by receiving exactly the expected bytes (matching FIONREAD) in
the first recv. The remaining datagram is then consumed in a second recv
block, which is only reachable for UDP since TCP's expected already
equals sizeof(buf).
Test:
./test_progs -a sockmap_basic
410/1 sockmap_basic/sockmap create_update_free:OK
...
Summary: 1/35 PASSED, 0 SKIPPED, 0 FAILED
[1] https://github.com/kernel-patches/bpf/actions/runs/22919385910/job/66515395423
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Fixes: 17e2ce02bf ("selftests/bpf: Add tests for FIONREAD and copied_seq")
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Link: https://lore.kernel.org/r/20260312072549.6766-1-jiayuan.chen@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
A test failure was discovered in BPF CI [1] caused by connection timeout.
The current test timeout of 500ms is insufficient for CI environments,
particularly under high load.
While the optimal timeout is unclear, this test was converted from the
original test_tc_tunnel.sh script. The original script used nc with "-w 1"
to specify a 1-second timeout [2]. Therefore, this test restores the
timeout to 1s.
Test:
./test_progs -a tc_tunnel
#478/1 tc_tunnel/ipip_none:OK
#478/2 tc_tunnel/ipip6_none:OK
#478/3 tc_tunnel/ip6tnl_none:OK
#478/4 tc_tunnel/sit_none:OK
#478/5 tc_tunnel/vxlan_eth:OK
#478/6 tc_tunnel/ip6vxlan_eth:OK
#478/7 tc_tunnel/gre_none:OK
#478/8 tc_tunnel/gre_eth:OK
#478/9 tc_tunnel/gre_mpls:OK
#478/10 tc_tunnel/ip6gre_none:OK
#478/11 tc_tunnel/ip6gre_eth:OK
#478/12 tc_tunnel/ip6gre_mpls:OK
#478/13 tc_tunnel/udp_none:OK
#478/14 tc_tunnel/udp_eth:OK
#478/15 tc_tunnel/udp_mpls:OK
#478/16 tc_tunnel/ip6udp_none:OK
#478/17 tc_tunnel/ip6udp_eth:OK
#478/18 tc_tunnel/ip6udp_mpls:OK
#478 tc_tunnel:OK
Summary: 1/18 PASSED, 0 SKIPPED, 0 FAILED
[1] https://github.com/kernel-patches/bpf/actions/runs/22674350732/job/65728072723
[2] https://lore.kernel.org/all/20251027-tc_tunnel-v3-4-505c12019f9d@bootlin.com/
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Link: https://lore.kernel.org/r/20260312083615.31835-1-jiayuan.chen@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add single and multi-level pointer parameters and return value test
coverage for BPF trampolines. Includes verifier tests for single and
multi-level pointers. The tests check verifier logs for pointers
inferred as scalar() type.
Signed-off-by: Slava Imameev <slava.imameev@crowdstrike.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260314082127.7939-3-slava.imameev@crowdstrike.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a test verifying that stacksafe() correctly handles 32-bit scalar
spills when comparing stack states for equivalence during state pruning.
A 32-bit scalar spill creates slot[0-3] = STACK_INVALID and
slot[4-7] = STACK_SPILL. Without the im=4 check in stacksafe(), the
STACK_SPILL vs STACK_MISC mismatch at byte 4 causes pruning to fail,
forcing the verifier to re-explore a path that is provably safe.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260323022410.75444-2-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a selftest to verify that the verifier correctly identifies refcounted
arguments in struct_ops programs, even when they are not the first
argument. This ensures that the restriction on tail calls for programs
with __ref arguments is properly enforced regardless of which argument
they appear in.
This test verifies the fix for check_struct_ops_btf_id() proposed by
Keisuke Nishimura [0], which corrected a bug where only the first
argument was checked for the refcounted flag.
The test includes:
- An update to bpf_testmod to add 'test_refcounted_multi', an operator with
three arguments where the third is tagged with "__ref".
- A BPF program 'test_refcounted_multi' that attempts a tail call.
- A test runner that asserts the verifier rejects the program with
"program with __ref argument cannot tail call".
[0]: https://lore.kernel.org/bpf/20260320130219.63711-1-keisuke.nishimura@inria.fr/
Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
Link: https://lore.kernel.org/r/20260321214038.80479-1-varunrmallya@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Fix compiler warnings about an unused parameter, narrowing of a
non-constant into a smaller type, and comparison between integers of
different sizes.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260323231133.859941-1-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The connect_force_port test fails intermittently in CI because the
hardcoded server ports (60123/60124) may already be in use by other
tests or processes [1].
Fix this by passing port 0 to start_server(), letting the kernel assign
a free port dynamically. The actual assigned port is then propagated to
the BPF programs by writing it into the .bss map's initial value (via
bpf_map__initial_value()) before loading, so the BPF programs use the
correct backend port at runtime.
[1] https://github.com/kernel-patches/bpf/actions/runs/22697676317/job/65808536038
Suggested-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260323081131.65604-1-varunrmallya@gmail.com
Cross-merge BPF and other fixes after downstream PR.
Minor conflicts in:
tools/testing/selftests/bpf/progs/exceptions_fail.c
tools/testing/selftests/bpf/progs/verifier_bounds.c
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add multiple test cases for linked register tracking with alu32 ops:
- Add a test that checks sync_linked_regs() regarding reg->id (the linked
target register) for BPF_ADD_CONST32 rather than known_reg->id (the
branch register).
- Add a test case for linked register tracking that exposes the cross-type
sync_linked_regs() bug. One register uses alu32 (w7 += 1, BPF_ADD_CONST32)
and another uses alu64 (r8 += 2, BPF_ADD_CONST64), both linked to the
same base register.
- Add a test case that exercises regsafe() path pruning when two execution
paths reach the same program point with linked registers carrying
different ADD_CONST flags (BPF_ADD_CONST32 from alu32 vs BPF_ADD_CONST64
from alu64). This particular test passes with and without the fix since
the pruning will fail due to different ranges, but it would still be
useful to carry this one as a regression test for the unreachable div
by zero.
With the fix applied all the tests pass:
# LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_linked_scalars
[...]
./test_progs -t verifier_linked_scalars
#602/1 verifier_linked_scalars/scalars: find linked scalars:OK
#602/2 verifier_linked_scalars/sync_linked_regs_preserves_id:OK
#602/3 verifier_linked_scalars/scalars_neg:OK
#602/4 verifier_linked_scalars/scalars_neg_sub:OK
#602/5 verifier_linked_scalars/scalars_neg_alu32_add:OK
#602/6 verifier_linked_scalars/scalars_neg_alu32_sub:OK
#602/7 verifier_linked_scalars/scalars_pos:OK
#602/8 verifier_linked_scalars/scalars_sub_neg_imm:OK
#602/9 verifier_linked_scalars/scalars_double_add:OK
#602/10 verifier_linked_scalars/scalars_sync_delta_overflow:OK
#602/11 verifier_linked_scalars/scalars_sync_delta_overflow_large_range:OK
#602/12 verifier_linked_scalars/scalars_alu32_big_offset:OK
#602/13 verifier_linked_scalars/scalars_alu32_basic:OK
#602/14 verifier_linked_scalars/scalars_alu32_wrap:OK
#602/15 verifier_linked_scalars/scalars_alu32_zext_linked_reg:OK
#602/16 verifier_linked_scalars/scalars_alu32_alu64_cross_type:OK
#602/17 verifier_linked_scalars/scalars_alu32_alu64_regsafe_pruning:OK
#602/18 verifier_linked_scalars/alu32_negative_offset:OK
#602/19 verifier_linked_scalars/spurious_precision_marks:OK
#602 verifier_linked_scalars:OK
Summary: 1/19 PASSED, 0 SKIPPED, 0 FAILED
Co-developed-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260319211507.213816-2-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a test that verifies bpf_program__clone() respects caller-provided
attach_btf_id in bpf_prog_load_opts.
The BPF program has SEC("fentry/bpf_fentry_test1"). It is cloned twice
from the same prepared object: first with no opts, verifying the
callback resolves attach_btf_id from sec_name to bpf_fentry_test1;
then with attach_btf_id overridden to bpf_fentry_test2, verifying the
loaded program is actually attached to bpf_fentry_test2. Both results
are checked via bpf_prog_get_info_by_fd().
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260317-veristat_prepare-v4-3-74193d4cc9d9@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Replace veristat's per-program object re-opening with
bpf_program__clone().
Previously, veristat opened a separate bpf_object for every program in a
multi-program object file, iterated all programs to enable only the
target one, and then loaded the entire object.
Use bpf_object__prepare() once, then call bpf_program__clone() for each
program individually. This lets veristat load programs one at a time
from a single prepared object.
The caller now owns the returned fd and closes it after collecting stats.
Remove the special single-program fast path and the per-file early exit
in handle_verif_mode() so all files are always processed.
Split fixup_obj() into fixup_obj_maps() for object-wide map fixups that
must run before bpf_object__prepare(), and fixup_obj() for per-program
fixups (struct_ops masking, freplace type guessing) that run before each
bpf_program__clone() call.
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260317-veristat_prepare-v4-2-74193d4cc9d9@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add three test cases to verifier_bounds.c to verify that
maybe_fork_scalars() correctly tracks register values for BPF_OR
operations with constant source operands:
1. or_scalar_fork_rejects_oob: After ARSH 63 + OR 8, the pushed
path should have dst = 8. With value_size = 8, accessing
map_value + 8 is out of bounds and must be rejected.
2. and_scalar_fork_still_works: Regression test ensuring AND
forking continues to work. ARSH 63 + AND 4 produces pushed
dst = 0 and current dst = 4, both within value_size = 8.
3. or_scalar_fork_allows_inbounds: After ARSH 63 + OR 4, the
pushed path has dst = 4, which is within value_size = 8
and should be accepted.
These tests exercise the fix in the previous patch, which makes the
pushed path re-execute the ALU instruction so it computes the correct
result for BPF_OR.
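The arithmetic behind the three cases can be checked with a small userspace model. This is only a sketch: the fork_vals struct and helper names are hypothetical, and the mapping of the zero sign fill to the pushed path follows the values quoted above rather than the verifier source:

```c
#include <stdint.h>

/* ARSH 63 leaves only the sign fill: 0 for x >= 0, -1 for x < 0. */
static int64_t arsh63(int64_t x)
{
	return x < 0 ? -1 : 0;
}

struct fork_vals {
	int64_t pushed;		/* path where the sign fill is 0 */
	int64_t current;	/* path where the sign fill is -1 */
};

/* Apply OR with a constant on both forked paths. */
static struct fork_vals fork_or(int64_t imm)
{
	struct fork_vals v;

	v.pushed = arsh63(1) | imm;
	v.current = arsh63(-1) | imm;
	return v;
}

/* Apply AND with a constant on both forked paths. */
static struct fork_vals fork_and(int64_t imm)
{
	struct fork_vals v;

	v.pushed = arsh63(1) & imm;
	v.current = arsh63(-1) & imm;
	return v;
}

/* A one-byte access at 'off' must fall inside [0, value_size). */
static int in_bounds(int64_t off, int64_t value_size)
{
	return off >= 0 && off < value_size;
}
```

In this model OR 8 gives a pushed value of 8, which fails in_bounds(8, 8), while AND 4 gives 0 and 4, both of which pass.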
Signed-off-by: Daniel Wade <danjwade95@gmail.com>
Reviewed-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260314021521.128361-3-danjwade95@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add tests to verify that signed 32-bit division and modulo operations
produce correct results when the dividend is INT_MIN (0x80000000).
The bug fixed in the previous commit only affects the BPF interpreter
path. When JIT is enabled (the default on most architectures), the
native CPU division instruction produces the correct result and these
tests pass regardless. With bpf_jit_enable=0, the interpreter is used
and without the previous fix, INT_MIN / 2 incorrectly returns
0x40000000 instead of 0xC0000000 due to abs(S32_MIN) undefined
behavior, causing these tests to fail.
Test cases:
- SDIV32 INT_MIN / 2 = -1073741824 (imm and reg divisor)
- SMOD32 INT_MIN % 2 = 0 (positive and negative divisor)
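One way to avoid the abs(S32_MIN) problem is to take the magnitude in unsigned arithmetic. The following is a sketch under that assumption and is not the interpreter's actual code; the helper names are hypothetical, and the divide-by-zero handling (quotient 0, dividend returned for modulo) reflects the usual BPF convention:

```c
#include <stdint.h>

/* Magnitude of v as an unsigned value; INT32_MIN maps to 0x80000000. */
static uint32_t abs_u32(int32_t v)
{
	/* Negate in unsigned arithmetic, which is well defined. */
	return v < 0 ? 0u - (uint32_t)v : (uint32_t)v;
}

/* Signed 32-bit division, truncating toward zero. */
static int32_t sdiv32(int32_t a, int32_t b)
{
	uint32_t q;

	if (b == 0)
		return 0;
	q = abs_u32(a) / abs_u32(b);
	if ((a < 0) != (b < 0))
		q = 0u - q;	/* result is negative when signs differ */
	return (int32_t)q;
}

/* Signed 32-bit modulo; the result takes the sign of the dividend. */
static int32_t smod32(int32_t a, int32_t b)
{
	uint32_t r;

	if (b == 0)
		return a;
	r = abs_u32(a) % abs_u32(b);
	if (a < 0)
		r = 0u - r;
	return (int32_t)r;
}
```

With these helpers, sdiv32(INT32_MIN, 2) yields -1073741824 (0xC0000000), and smod32(INT32_MIN, 2) yields 0 for both divisor signs.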
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Jenny Guanni Qu <qguanni@gmail.com>
Link: https://lore.kernel.org/r/20260311011116.2108005-3-qguanni@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The testcase "./test_verifier 190" fails if
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is not set on LoongArch.
#190/p calls: two calls that return map_value with incorrect bool check FAIL
...
misaligned access off (0x0; 0xffffffffffffffff)+0 size 8
...
Summary: 0 PASSED, 0 SKIPPED, 1 FAILED
It means that the program has unaligned accesses, but the kernel sets
CONFIG_ARCH_STRICT_ALIGN by default to enable -mstrict-align to prevent
unaligned accesses, so add a flag F_NEEDS_EFFICIENT_UNALIGNED_ACCESS
into the testcase to avoid the failure.
This is somewhat similar to commit ce1f289f54 ("selftests/bpf:
Add F_NEEDS_EFFICIENT_UNALIGNED_ACCESS to some tests").
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260310064507.4228-3-yangtiezhu@loongson.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The sleepable context check for global function calls in
check_func_call() open-codes the same checks that in_sleepable_context()
already performs. Replace the open-coded check with a call to
in_sleepable_context() and use non_sleepable_context_description() for
the error message, consistent with check_helper_call() and
check_kfunc_call().
Note that in_sleepable_context() also checks active_locks, which
overlaps with the existing active_locks check above it. However, the two
checks serve different purposes: the active_locks check rejects all
global function calls while holding a lock (not just sleepable ones), so
it must remain as a separate guard.
Update the expected error messages in the irq and preempt_lock selftests
to match.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260318174327.3151925-4-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
check_helper_call() prints the error message for every
env->cur_state->active* element when calling a sleepable helper.
Consolidate all of them into a single print statement.
The check for env->cur_state->active_locks was not part of the removed
print statements and will not be triggered by the consolidated print
either, because it is checked in do_check() before check_helper_call()
is even reached.
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260318174327.3151925-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add test cases to ensure the verifier correctly rejects bpf_throw from
subprogs when RCU, preempt, or IRQ locks are held:
* reject_subprog_rcu_lock_throw: subprog acquires bpf_rcu_read_lock and
then calls bpf_throw
* reject_subprog_throw_preempt_lock: always-throwing subprog called while
caller holds bpf_preempt_disable
* reject_subprog_throw_irq_lock: always-throwing subprog called while
caller holds bpf_local_irq_save
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260320000809.643798-2-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
process_bpf_exit_full() passes check_lock = !curframe to
check_resource_leak(), which is false in cases when bpf_throw() is
called from a static subprog. This makes check_resource_leak() skip
validation of active_rcu_locks, active_preempt_locks, and
active_irq_id on exception exits from subprogs.
At runtime bpf_throw() unwinds the stack via ORC without releasing any
user-acquired locks, which may cause various issues as a result.
Fix by setting check_lock = true for exception exits regardless of
curframe, since exceptions bypass all intermediate frame
cleanup. Update the error message prefix to "bpf_throw" for exception
exits to distinguish them from normal BPF_EXIT.
Fix reject_subprog_with_rcu_read_lock test which was previously
passing for the wrong reason. Test program returned directly from the
subprog call without closing the RCU section, so the error was
triggered by the unclosed RCU lock on normal exit, not by
bpf_throw. Update __msg annotations for affected tests to match the
new "bpf_throw" error prefix.
The spin_lock case is not affected because it is already checked [1]
at the call site in do_check_insn() before bpf_throw can run.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?h=v7.0-rc4#n21098
Assisted-by: Claude:claude-opus-4-6
Fixes: f18b03faba ("bpf: Implement BPF exceptions")
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260320000809.643798-1-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
modify_return's fmod_ret programs can override bpf_modify_return_test()'s
return value, which conflicts with get_func_ip_test when selftests run in
parallel.
Store current tgid in BSS and make modify_return hooks act only for that
tgid. For other tasks, fentry/fexit become no-ops and fmod_ret returns the
original ret.
Drop the serial-only restriction and remove the TODO comment.
Tested:
sudo ./test_progs -t modify_return
sudo ./test_progs -t get_func_ip_test
sudo ./test_progs -j$(nproc) -t modify_return
sudo ./test_progs -j$(nproc) -t get_func_ip_test
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Link: https://lore.kernel.org/r/20260313034540.255805-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
perf_link creates a system-wide perf event pinned to CPU 0 (pid=-1, cpu=0)
and also pins the test thread to CPU 0. Under concurrent selftests this
can lead to cross-test interference and CPU 0 contention, making the test
flaky.
Create a per-task perf event instead (pid=0, cpu=-1) and drop CPU pinning
from burn_cpu(). Use barrier() to prevent the burn loop from being
optimized away. Drop the serial_ prefix so the test can run in parallel.
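A burn loop with a compiler barrier could look like the sketch below. The barrier() definition is the common GCC/Clang inline-asm form and is an assumption here, as are the loop body and the iteration-count parameter, which are hypothetical:

```c
/* Compiler barrier: the compiler may not drop or reorder across it. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Spin for 'iters' iterations and return the sum 0 + 1 + ... + (iters - 1). */
static unsigned long burn_cpu(unsigned long iters)
{
	unsigned long i, acc = 0;

	for (i = 0; i < iters; i++) {
		acc += i;
		barrier();	/* keeps every iteration in the generated code */
	}
	return acc;
}
```

Because the asm statement is volatile and sits inside the loop, the compiler has to emit each iteration instead of folding the loop into a constant.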
Also remove the stale TODO comment.
Tested:
./test_progs -t perf_link -vv
./test_progs -j$(nproc) -t perf_link -vv
for i in $(seq 1 50); do ./test_progs -j$(nproc) -t perf_link; done
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260305084306.283983-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Introduce SKIP_LLVM, SKIP_LIBBFD, and SKIP_CRYPTO build flags that let
users build bpftool without these optional dependencies.
SKIP_LLVM=1 skips LLVM even when detected. SKIP_LIBBFD=1 prevents the
libbfd JIT disassembly fallback when LLVM is absent. Together, they
produce a bpftool with no disassembly support.
SKIP_CRYPTO=1 excludes sign.c and removes the -lcrypto link dependency.
Inline stubs in main.h return errors with a clear message if signing
functions are called at runtime.
Use BPFTOOL_WITHOUT_CRYPTO (not HAVE_LIBCRYPTO_SUPPORT) as the C
define, following the BPFTOOL_WITHOUT_SKELETONS naming convention for
bpftool-internal build config, leaving HAVE_LIBCRYPTO_SUPPORT free for
proper feature detection in the future.
All three flags are propagated through the selftests Makefile to bpftool
sub-builds.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260312-b4-bpftool_build-v2-1-4c9d57133644@meta.com
Add tests that demonstrate the verifier support for deep call stacks
while still enforcing maximum stack size limits.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260316161225.128011-3-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The BPF verifier currently enforces a call stack depth of 8 frames,
regardless of the actual stack space consumption of those frames. The
limit is necessary for static call stacks, because the bookkeeping data
structures used by the verifier when stepping into static functions
during verification only support 8 stack frames. However, this
limitation only matters for static stack frames: Global subprogs are
verified by themselves and do not require limiting the call depth.
Relax this limitation to only apply to static stack frames. Verification
now only fails when there is a sequence of 8 calls to non-global
subprogs. Calling into a global subprog resets the counter. This allows
deeper call stacks, provided all frames still fit in the stack.
The change does not increase the maximum size of the call stack, only
the maximum number of frames we can place in it.
Also change the progs/test_global_func3.c selftest to use static
functions, since with the new patch it would otherwise unexpectedly
pass verification.
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260316161225.128011-2-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This new selftest demonstrates the improvement of bounds refinement from
the previous patch. It is inspired by a set of reg_bounds_sync inputs
generated using CBMC [1] by Shung-Hsi:
reg.smin_value=0x8000000000000002
reg.smax_value=2
reg.umin_value=2
reg.umax_value=19
reg.s32_min_value=2
reg.s32_max_value=3
reg.u32_min_value=2
reg.u32_max_value=3
reg_bounds_sync returns R=[2; 3] without the previous patch, and R=2
with it. __reg64_deduce_bounds is able to derive that u64=2, but before
the previous patch, those bounds are overwritten in
__reg_deduce_mixed_bounds using the 32-bit bounds.
To arrive at these reg_bounds_sync inputs, we bound the 32-bit value
first to [2; 3]. We can then upper-bound s64 without impacting u64. At
that point, the refinement to u64=2 doesn't happen because the ranges
still overlap in two points:
0 umin=2 umax=0xff..ff00..03 U64_MAX
| [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] |
|----------------------------|------------------------------|
|xx] [xxxxxxxxxxxxxxxxxxxxxxxxxxxx|
0 smax=2 smin=0x800..02 -1
With an upper-bound check at value 19, we can reach the above inputs for
reg_bounds_sync. At that point, the refinement to u64=2 happens and
because it isn't overwritten by __reg_deduce_mixed_bounds anymore,
reg_bounds_sync returns with reg=2.
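The refinement can be illustrated with a toy range intersection. This is a sketch only, not the kernel's __reg64_deduce_bounds; the struct and function names are hypothetical. A signed range that crosses the sign boundary is treated as two unsigned arcs, and the unsigned range is narrowed only when exactly one arc overlaps it:

```c
#include <stdbool.h>
#include <stdint.h>

struct u64_range {
	uint64_t min, max;
	bool empty;
};

/* Intersection of two closed unsigned intervals. */
static struct u64_range intersect(uint64_t amin, uint64_t amax,
				  uint64_t bmin, uint64_t bmax)
{
	struct u64_range r;

	r.min = amin > bmin ? amin : bmin;
	r.max = amax < bmax ? amax : bmax;
	r.empty = r.min > r.max;
	return r;
}

/* Narrow [umin, umax] using the signed range [smin, smax]. */
static struct u64_range refine_u64(int64_t smin, int64_t smax,
				   uint64_t umin, uint64_t umax)
{
	struct u64_range keep = { umin, umax, umin > umax };
	struct u64_range lo, hi;

	/* Same-sign signed ranges are one contiguous unsigned interval. */
	if (!(smin < 0 && smax >= 0))
		return intersect((uint64_t)smin, (uint64_t)smax, umin, umax);

	/* Sign-crossing range: arcs [0, smax] and [(u64)smin, U64_MAX]. */
	lo = intersect(0, (uint64_t)smax, umin, umax);
	hi = intersect((uint64_t)smin, UINT64_MAX, umin, umax);

	if (lo.empty && !hi.empty)
		return hi;
	if (hi.empty && !lo.empty)
		return lo;
	return keep;	/* both arcs overlap (or neither): no refinement */
}
```

For smin=0x8000000000000002, smax=2, umin=2, umax=19 only the low arc overlaps, so the result collapses to the single value 2. With the wider umax from the diagram both arcs overlap and the range is left unchanged.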
The test validates this result by including an illegal instruction in
the (dead) branch reg != 2.
Link: https://github.com/shunghsiyu/reg_bounds_sync-review/ [1]
Co-developed-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/622dc51c581cd4d652fff362188b2a5f73c1fe99.1773401138.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
On powerpc, immediate load instructions are sign extended. In case
of unsigned types, arguments should be explicitly zero-extended by
the caller. For kfunc calls, this needs to be handled in the JIT code.
In bpf_kfunc_call_test4(), which tests sign-extension of signed
argument types in kfunc calls, add some additional failure checks.
Also add bpf_kfunc_call_test5() to test zero-extension of unsigned
argument types in kfunc calls.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260312080113.843408-1-hbathini@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Building selftests with
clang 23.0.0 (6fae863eba8a72cdd82f37e7111a46a70be525e0) triggers
the following error:
tools/testing/selftests/bpf/prog_tests/fexit_bpf2bpf.c:117:12:
error: assigning to 'char *' from 'const char *' discards qualifiers
[-Werror,-Wincompatible-pointer-types-discards-qualifiers]
The variable `tgt_name` is declared as `char *`, but it stores the
result of strstr(prog_name[i], "/"). Since `prog_name[i]` is a
`const char *`, the returned pointer should also be treated as
const-qualified.
Update `tgt_name` to `const char *` to match the type of the underlying
string and silence the compiler warning.
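The pattern in isolation, as a sketch: target_name() is a hypothetical helper and not the test's code. It shows the const-qualified pointer receiving the strstr() result when the haystack is itself const:

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the part of a "prog/target" string that follows the first '/',
 * or NULL if there is no '/'. The result points into prog_name, so it
 * keeps the const qualifier of the input.
 */
static const char *target_name(const char *prog_name)
{
	const char *tgt_name = strstr(prog_name, "/");

	return tgt_name ? tgt_name + 1 : NULL;
}
```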
Signed-off-by: Varun R Mallya <varunrmallya@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Menglong Dong <menglong.dong@linux.dev>
Link: https://lore.kernel.org/bpf/20260305222132.470700-1-varunrmallya@gmail.com
livepatch_trampoline relies on livepatch sysfs and livepatch-sample.ko.
When CONFIG_LIVEPATCH is disabled or the samples module isn't built, the
test fails with ENOENT and causes false failures in minimal CI configs.
Skip the test when livepatch sysfs or the sample module is unavailable.
Also avoid writing to livepatch sysfs when it's not present.
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260309104448.817401-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Patch 1/2 added PID filtering to the probe_user BPF program to avoid
cross-test interference from the global connect() hooks.
With the interference removed, drop the serial_ prefix and remove the
stale TODO comment so the test can run in parallel.
Tested:
./test_progs -t probe_user -v
./test_progs -j$(nproc) -t probe_user
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260306083330.518627-2-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The test installs a kprobe on __sys_connect and checks that
bpf_probe_write_user() can modify the syscall argument. However, any
concurrent thread in any other test that calls connect() will also
trigger the kprobe and have its sockaddr silently overwritten, causing
flaky failures in unrelated tests.
Constrain the hook to the current test process by filtering on a PID
stored as a global variable in .bss. Initialize the .bss value from
user space before bpf_object__load() using bpf_map__set_initial_value(),
and validate the bss map value size to catch layout mismatches.
No new map is introduced and the test keeps the existing non-skeleton
flow.
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260306083330.518627-1-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The module_attach test contains subtests which check that unloading a
module while there are BPF programs attached to its functions is not
possible because the module is still referenced.
The problem is that the test calls the generic unload_module() helper
function, which is used for module cleanup after test_progs terminates
and tries to wait until all module references are released. This
unnecessarily slows down the module_attach subtests since each
unsuccessful call to unload_module() takes about 1 second.
Introduce try_unload_module() which takes the number of retries as a
parameter. Make unload_module() call it with the currently used amount
of 10000 retries but call it with just 1 retry from module_attach tests
as it is always expected to fail. This speeds up the module_attach
test significantly.
Before:
# time ./test_progs -t module_attach
[...]
Summary: 1/14 PASSED, 0 SKIPPED, 0 FAILED
real 0m5.011s
user 0m0.293s
sys 0m0.108s
After:
# time ./test_progs -t module_attach
[...]
Summary: 1/14 PASSED, 0 SKIPPED, 0 FAILED
real 0m0.350s
user 0m0.197s
sys 0m0.063s
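The retry split described above can be sketched in plain C. This is a
minimal user-space illustration, not the selftest helper itself: the
try_unload() name, its callback parameter, the 100us pause between
attempts and the always_busy() stub are all assumptions made for the
example.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical callback standing in for the real module unload call. */
typedef int (*unload_fn)(const char *name);

/* Counter used by the stub below so callers can see how many attempts ran. */
int attempts;

/* Stub that behaves like a module which is still referenced. */
int always_busy(const char *name)
{
	(void)name;
	attempts++;
	return -EAGAIN;
}

/*
 * Try to unload a module at most 'retries' times. Only -EAGAIN is
 * retried; any other result (success or a hard error) is returned
 * right away. The pause length is an assumption for illustration.
 */
int try_unload(const char *name, int retries, unload_fn do_unload)
{
	int err = -EINVAL;
	int i;

	for (i = 0; i < retries; i++) {
		err = do_unload(name);
		if (err != -EAGAIN)
			return err;
		if (i + 1 < retries)
			usleep(100);
	}
	return err;
}
```

A caller that expects the unload to fail would pass retries=1 and pay
for a single attempt, while a cleanup path would pass a large count.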
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260306101628.3822284-1-vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently BPF selftests will fail to compile if CONFIG_SMC
is not set.
Use BPF CO-RE to work around the case where CONFIG_SMC is
not set; use ___local variants of relevant structures and
utilize bpf_core_field_exists() for net->smc.
The test continues to pass where
CONFIG_SMC=y
CONFIG_SMC_HS_CTRL_BPF=y
but these changes allow the selftests to build in the absence
of CONFIG_SMC=y.
Also ensure that we get a pure skip rather than a skip+fail
by removing the "SMC is unsupported" part from the ASSERT_FALSE()
in get_smc_nl_family(); doing this means we get a skip without
a fail when CONFIG_SMC is not set:
$ sudo ./test_progs -t bpf_smc
Summary: 1/0 PASSED, 1 SKIPPED, 0 FAILED
Fixes: beb3c67297 ("bpf/selftests: Add selftest for bpf_smc_hs_ctrl")
Reported-by: Colm Harrington <colm.harrington@oracle.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Tested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://patch.msgid.link/20260310111330.601765-1-alan.maguire@oracle.com
For commit b0dcdcb9ae ("resolve_btfids: Fix linker flags detection"),
I suggested setting HOSTPKG_CONFIG to $PKG_CONFIG when compiling
resolve_btfids, but I forgot the quotes around that variable.
As a result, when running vmtest.sh with static linking, it fails as
follows:
$ LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh
[...]
make: unrecognized option '--static'
Usage: make [options] [target] ...
[...]
This worked when I tested it because HOSTPKG_CONFIG didn't have a
default value in the resolve_btfids Makefile, but once it does, the
quotes aren't preserved and it fails on the next make call.
Fixes: b0dcdcb9ae ("resolve_btfids: Fix linker flags detection")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/abADBwn_ykblpABE@mail.gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Replace hardcoded enum values with bpf_core_enum_value() calls in
cgroup_iter_memcg test to improve portability across different
kernel versions.
The change adds runtime enum value resolution for:
- node_stat_item: NR_ANON_MAPPED, NR_SHMEM, NR_FILE_PAGES,
NR_FILE_MAPPED
- vm_event_item: PGFAULT
This ensures the BPF program can adapt to enum value changes
between kernel versions.
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Reviewed-by: JP Kobryn <jp.kobryn@linux.dev>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Link: https://lore.kernel.org/r/ca6eb1a1a4fd7a17ffe995acf52c9a4ceb7bac13.1772505399.git.zhuhui@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When cgroup.memory=nokmem is set in the kernel command line, kmem
accounting is disabled. This causes the test_kmem subtest in
cgroup_iter_memcg to fail because it expects non-zero kmem values.
Remove the kmem subtest altogether since the remaining subtests
(shmem, file, pgfault) already provide sufficient coverage for
the cgroup iter memcg functionality.
Reviewed-by: JP Kobryn <jp.kobryn@linux.dev>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Link: https://lore.kernel.org/r/35fa32a019361ec26265c8a789ee31e448d4dbda.1772505399.git.zhuhui@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add two tests to check non-null pointer detection when JEQ and JNE
have a register as the second operand and its value is known to be 0.
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260304195018.181396-4-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a test case to ensure that BPF_END operations correctly break
register's scalar ID ties.
The test creates a scenario where r1 is a copy of r0, r0 undergoes a
byte swap, and then r0 is checked against a constant.
- Without the fix in the verifier, the bounds learned from r0 are
incorrectly propagated to r1, making the verifier believe r1 is
bounded and wrongly allowing subsequent pointer arithmetic.
- With the fix, r1 remains an unbounded scalar, and the verifier
correctly rejects the arithmetic operation between the frame pointer
and the unbounded register.
Co-developed-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Tianci Cao <ziye@zju.edu.cn>
Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260304083228.142016-3-tangyazhou@zju.edu.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now that sleepable programs are always enabled on syscalls, let
refcounted_kptr tests use syscalls rather than bpf_testmod_test_read,
which is not sleepable with error injection disabled.
The tests just check that the verifier can handle usage of RCU locks in
sleepable programs and never actually attach. So, the attachment target
doesn't matter (as long as it is sleepable) and with syscalls, the tests
pass on kernels with disabled error injection.
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/8b6626eae384559855f7a0e846a16e83f25f06f6.1773055375.git.vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The "|| echo -lzstd" default makes zstd an unconditional link
dependency of resolve_btfids. On systems where libzstd-dev is not
installed and pkg-config fails, the linker fails:
ld: cannot find -lzstd: No such file or directory
libzstd is a transitive dependency of libelf, so the -lzstd flag is
strictly necessary only for static builds [1].
Remove ZSTD_LIBS variable, and instead set LIBELF_LIBS depending on
whether the build is static or not. Use $(HOSTPKG_CONFIG) as primary
source of the flags list.
Also add a default value for HOSTPKG_CONFIG in case it's not built via
the toplevel Makefile. Pass it from selftests/bpf too.
[1] https://lore.kernel.org/bpf/4ff82800-2daa-4b9f-95a9-6f512859ee70@linux.dev/
Reported-by: BPF CI Bot (Claude Opus 4.6) <bot+bpf-ci@kernel.org>
Reported-by: Vitaly Chikunov <vt@altlinux.org>
Closes: https://lore.kernel.org/bpf/aaWqMcK-2AQw5dx8@altlinux.org/
Fixes: 4021848a90 ("selftests/bpf: Pass through build flags to bpftool and resolve_btfids")
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Reviewed-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260305014730.3123382-1-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a test for the scenario described in the previous commit:
an iterator loop with two paths where one ties r2/r7 via
shared scalar id and skips a call, while the other goes
through the call. Precision marks from the linked registers
get spuriously propagated to the call path via
propagate_precision(), hitting "backtracking call unexpected
regs" in backtrack_insn().
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-2-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Fix an inconsistency between func_states_equal() and
collect_linked_regs():
- regsafe() uses check_ids() to verify that cached and current states
have identical register id mapping.
- func_states_equal() calls regsafe() only for registers computed as
live by compute_live_registers().
- clean_live_states() is supposed to remove dead registers from cached
states, but it can skip states belonging to an iterator-based loop.
- collect_linked_regs() collects all registers sharing the same id,
ignoring the marks computed by compute_live_registers().
Linked registers are stored in the state's jump history.
- backtrack_insn() marks all linked registers for an instruction
as precise whenever one of the linked registers is precise.
The above might lead to a scenario:
- There is an instruction I with register rY known to be dead at I.
- Instruction I is reached via two paths: first A, then B.
- On path A:
- There is an id link between registers rX and rY.
- Checkpoint C is created at I.
- Linked register set {rX, rY} is saved to the jump history.
- rX is marked as precise at I, causing both rX and rY
to be marked precise at C.
- On path B:
- There is no id link between registers rX and rY,
otherwise register states are sub-states of those in C.
- Because rY is dead at I, check_ids() returns true.
- Current state is considered equal to checkpoint C,
propagate_precision() propagates spurious precision
mark for register rY along the path B.
- Depending on a program, this might hit verifier_bug()
in the backtrack_insn(), e.g. if rY ∈ [r1..r5]
and backtrack_insn() spots a function call.
The reproducer program is in the next patch.
This was hit by sched_ext scx_lavd scheduler code.
Changes in tests:
- verifier_scalar_ids.c selftests need modification to preserve
some registers as live for __msg() checks.
- exceptions_assert.c adjusted to match changes in the verifier log,
R0 is dead after conditional instruction and thus does not get
range.
- precise.c adjusted to match changes in the verifier log, register r9
is dead after comparison and its range is not important for the test.
Reported-by: Emil Tsalapatis <emil@etsalapatis.com>
Fixes: 0fb3cf6110 ("bpf: use register liveness information for func_states_equal")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-1-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Two test cases for signed/unsigned 32-bit bounds refinement
when s32 range crosses the sign boundary:
- s32 range [S32_MIN..1] overlapping with u32 range [3..U32_MAX],
s32 range tail before sign boundary overlaps with u32 range.
- s32 range [-3..5] overlapping with u32 range [0..S32_MIN+3],
s32 range head after the sign boundary overlaps with u32 range.
This covers both branches added to __reg32_deduce_bounds().
Also, crossing_32_bit_signed_boundary_2() no longer triggers invariant
violations.
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Reviewed-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-2-f7f67e060a6b@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Same as in __reg64_deduce_bounds(), refine s32/u32 ranges
in __reg32_deduce_bounds() in the following situations:
- s32 range crosses U32_MAX/0 boundary, positive part of the s32 range
overlaps with u32 range:
0                                                   U32_MAX
|  [xxxxxxxxxxxxxx u32 range xxxxxxxxxxxxxx]              |
|----------------------------|----------------------------|
|xxxxx s32 range xxxxxxxxx]                       [xxxxxxx|
0                     S32_MAX S32_MIN                    -1
- s32 range crosses U32_MAX/0 boundary, negative part of the s32 range
overlaps with u32 range:
0                                                   U32_MAX
|  [xxxxxxxxxxxxxx u32 range xxxxxxxxxxxxxx]              |
|----------------------------|----------------------------|
|xxxxxxxxx]                       [xxxxxxxxxxxx s32 range |
0                     S32_MAX S32_MIN                    -1
- No refinement if ranges overlap in two intervals.
This helps, for example, with the following program:
call %[bpf_get_prandom_u32];
w0 &= 0xffffffff;
if w0 < 0x3 goto 1f; // on fall-through u32 range [3..U32_MAX]
if w0 s> 0x1 goto 1f; // on fall-through s32 range [S32_MIN..1]
if w0 s< 0x0 goto 1f; // range can be narrowed to [S32_MIN..-1]
r10 = 0;
1: ...;
The reg_bounds.c selftest is updated to incorporate identical logic,
refinement based on non-overflowing range halves:
((x ∩ [0, smax]) ∩ (y ∩ [0, smax])) ∪
((x ∩ [smin,-1]) ∩ (y ∩ [smin,-1]))
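The range-halves formula above can be sketched in user-space C. This is
only an illustration of the set arithmetic, not the verifier or the
reg_bounds.c code: refine_s32(), struct range and the helpers are
hypothetical names, and the sketch returns the signed hull when both
halves overlap (the case where the commit applies no refinement).

```c
#include <stdbool.h>
#include <stdint.h>

/* Closed interval over int64_t; empty when lo > hi. */
struct range {
	int64_t lo, hi;
};

static struct range intersect(struct range a, struct range b)
{
	struct range r = {
		a.lo > b.lo ? a.lo : b.lo,
		a.hi < b.hi ? a.hi : b.hi,
	};
	return r;
}

static bool is_empty(struct range r)
{
	return r.lo > r.hi;
}

/*
 * Intersect an s32 range with a u32 range half by half:
 * non-negative values [0, S32_MAX] and negative values [S32_MIN, -1].
 * Returns false when no 32-bit value satisfies both ranges.
 */
bool refine_s32(int32_t smin, int32_t smax, uint32_t umin, uint32_t umax,
		int32_t *out_min, int32_t *out_max)
{
	const struct range pos = { 0, INT32_MAX };
	const struct range neg = { INT32_MIN, -1 };
	const struct range s = { smin, smax };
	const struct range u = { umin, umax };
	const struct range u_high = { 0x80000000LL, 0xffffffffLL };
	struct range u_neg = intersect(u, u_high);
	struct range p, n;

	/* Map the upper half of the u32 range onto negative s32 values. */
	u_neg.lo -= 0x100000000LL;
	u_neg.hi -= 0x100000000LL;

	p = intersect(intersect(s, pos), intersect(u, pos));
	n = intersect(intersect(s, neg), u_neg);
	if (is_empty(p) && is_empty(n))
		return false;

	*out_min = (int32_t)(is_empty(n) ? p.lo : n.lo);
	*out_max = (int32_t)(is_empty(p) ? n.hi : p.hi);
	return true;
}
```

For the program above, refine_s32(S32_MIN, 1, 3, U32_MAX, ...) leaves
only the negative half, which matches the narrowed [S32_MIN..-1] range.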
Reported-by: Andrea Righi <arighi@nvidia.com>
Reported-by: Emil Tsalapatis <emil@etsalapatis.com>
Closes: https://lore.kernel.org/bpf/aakqucg4vcujVwif@gpd4/T/
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-1-f7f67e060a6b@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Extend existing kprobe_multi_test subtests to validate the
kprobe.session exact function name optimization:
In kprobe_multi_session.c, add test_kprobe_syms which attaches a
kprobe.session program to an exact function name (bpf_fentry_test1)
exercising the fast syms[] path that bypasses kallsyms parsing. It
calls session_check() so bpf_fentry_test1 is hit by both the wildcard
and exact probes, and test_session_skel_api validates
kprobe_session_result[0] == 4 (entry + return from each probe).
In test_attach_api_fails, add fail_7 and fail_8 verifying error code
consistency between the wildcard pattern path (slow, parses kallsyms)
and the exact function name path (fast, uses syms[] array). Both
paths must return -ENOENT for non-existent functions.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260302200837.317907-4-andrey.grodzovsky@crowdstrike.com
The perf_event subtest relies on SW_CPU_CLOCK sampling to trigger the BPF
program, but the current CPU burn loop can be too short on slower systems
and may fail to generate any overflow sample. This leaves pe_res unchanged
and makes the test flaky.
Make burn_cpu() take a loop count and use a longer burn only for the
perf_event subtest. Also scope perf_event_open() to the current task to
avoid wasting samples on unrelated activity.
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20260228074555.122950-3-sun.jian.kdev@gmail.com
The kprobe_multi subtests rely on bpf_testmod fentry ksyms.
When bpf_testmod isn't available, libbpf fails to resolve
bpf_testmod_fentry_test* and skeleton load fails with -ESRCH, causing
false failures.
Skip these subtests when env.has_testmod is false.
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20260228074555.122950-2-sun.jian.kdev@gmail.com
Add a test that verifies btf__add_btf() correctly handles merging
multiple split BTF objects that share the same base BTF. The test
creates two sibling split BTFs on a common base, merges them into
a combined split BTF, and validates that base type references are
preserved while split type references are properly remapped.
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/64a8c947bff1ae89efa9ba8c099466477762490f.1772657690.git.josef@toxicpanda.com
Merge tag 'net-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from CAN, netfilter and wireless.
Current release - new code bugs:
- sched: cake: fixup cake_mq rate adjustment for diffserv config
- wifi: fix missing ieee80211_eml_params member initialization
Previous releases - regressions:
- tcp: give up on stronger sk_rcvbuf checks (for now)
Previous releases - always broken:
- net: fix rcu_tasks stall in threaded busypoll
- sched:
- fq: clear q->band_pkt_count[] in fq_reset()
- only allow act_ct to bind to clsact/ingress qdiscs and shared
blocks
- bridge: check relevant per-VLAN options in VLAN range grouping
- xsk: fix fragment node deletion to prevent buffer leak
Misc:
- spring cleanup of inactive maintainers"
* tag 'net-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits)
xdp: produce a warning when calculated tailroom is negative
net: enetc: use truesize as XDP RxQ info frag_size
libeth, idpf: use truesize as XDP RxQ info frag_size
i40e: use xdp.frame_sz as XDP RxQ info frag_size
i40e: fix registering XDP RxQ info
ice: change XDP RxQ frag_size from DMA write length to xdp.frame_sz
ice: fix rxq info registering in mbuf packets
xsk: introduce helper to determine rxq->frag_size
xdp: use modulo operation to calculate XDP frag tailroom
selftests/tc-testing: Add tests exercising act_ife metalist replace behaviour
net/sched: act_ife: Fix metalist update behavior
selftests: net: add test for IPv4 route with loopback IPv6 nexthop
net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthop
net: vxlan: fix nd_tbl NULL dereference when IPv6 is disabled
net: bridge: fix nd_tbl NULL dereference when IPv6 is disabled
MAINTAINERS: remove Thomas Falcon from IBM ibmvnic
MAINTAINERS: remove Claudiu Manoil and Alexandre Belloni from Ocelot switch
MAINTAINERS: replace Taras Chornyi with Elad Nachman for Marvell Prestera
MAINTAINERS: remove Jonathan Lemon from OpenCompute PTP
MAINTAINERS: replace Clark Wang with Frank Li for Freescale FEC
...
The test verifies attachment to various hooks in a kernel module,
however, everything is flattened into a single test. This makes it
impossible to run or skip test cases selectively.
Isolate each BPF program into a separate subtest. This is done by
disabling auto-loading of programs and loading and testing each program
separately.
At the same time, modernize the test to use ASSERT* instead of CHECK and
replace `return` by `goto cleanup` where necessary.
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Link: https://lore.kernel.org/r/20260225120904.1529112-1-vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add additional testing for void global functions. The tests
ensure that calls to void global functions properly keep
R0 invalid. Also make sure that exception callbacks still
require a return value.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260228184759.108145-6-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Global subprogs are currently not allowed to return void. Adjust
verifier logic to allow global functions with a void return type.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260228184759.108145-5-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The test_bpftool.sh script runs a python unittest script checking
bpftool json output on different commands. As part of the ongoing effort
to get rid of any standalone test, this script should either be
converted to test_progs or removed.
As validating bpftool json output does not bring much value to the test
base (and because it would need test_progs to bring in a json parser),
remove the standalone test script.
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/r/20260227-bpftool_feature-v1-1-a25860fd52fb@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add tests to ensure PTR_TO_CTX supports fixed offsets for program types
that don't rewrite accesses to it. Ensure that variable offsets and
negative offsets are still rejected. An extra test also checks writing
into ctx with modified offset for syscall progs. Other program types do
not support writes (notably, writable tracepoints offer a pointer for a
writable buffer through ctx, but don't allow writing to the ctx itself).
Before the fix made in the previous commit, these tests do not succeed,
except the ones testing for failures regardless of the change.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260227005725.1247305-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The A64_MOV macro unconditionally uses ADD Rd, Rn, #0 to implement
register moves. While functionally correct, this is not the canonical
encoding when both operands are general-purpose registers.
On AArch64, MOV has two aliases depending on the operand registers:
- MOV <Xd|SP>, <Xn|SP> → ADD <Xd|SP>, <Xn|SP>, #0
- MOV <Xd>, <Xn> → ORR <Xd>, XZR, <Xn>
The ADD form is required when the stack pointer is involved (as ORR
does not accept SP), while the ORR form is the preferred encoding for
general-purpose registers.
The ORR encoding is also measurably faster on modern microarchitectures.
A microbenchmark [1] comparing dependent chains of MOV (ORR) vs ADD #0
on an ARM Neoverse-V2 (72-core, 3.4 GHz) shows:
=== mov (ORR Xd, XZR, Xn) ===
run1 cycles/op=0.749859456
run2 cycles/op=0.749991250
run3 cycles/op=0.749601847
avg cycles/op=0.749817518
=== add0 (ADD Xd, Xn, #0) ===
run1 cycles/op=1.004777689
run2 cycles/op=1.004558266
run3 cycles/op=1.004806559
avg cycles/op=1.004714171
The ORR form completes in ~0.75 cycles/op vs ~1.00 cycles/op for ADD #0,
a ~25% improvement. This is likely because the CPU's register renaming
hardware can eliminate ORR-based moves, while ADD #0 must go through the
ALU pipeline.
Update A64_MOV to select the appropriate encoding at JIT time:
use ADD when either register is A64_SP, and ORR (via
aarch64_insn_gen_move_reg()) otherwise.
Update verifier_private_stack selftests to expect "mov x7, x0" instead
of "add x7, x0, #0x0" in the JITed instruction checks, matching the
new ORR-based encoding.
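The encoding choice can be illustrated with a small stand-alone C
sketch. It is not the kernel's A64_MOV macro or
aarch64_insn_gen_move_reg(); encode_mov64() and REG_SP are hypothetical
names, only the 64-bit forms are covered, and register number 31 is
treated as SP as in the MOV (to/from SP) alias.

```c
#include <stdint.h>

/* Register number 31 names SP in the ADD (immediate) form. */
#define REG_SP 31u

/*
 * Return the 64-bit AArch64 instruction word for "mov rd, rn".
 * ADD Xd|SP, Xn|SP, #0 is used when SP is involved, because the
 * ORR form would read register 31 as XZR instead of SP.
 */
uint32_t encode_mov64(unsigned int rd, unsigned int rn)
{
	if (rd == REG_SP || rn == REG_SP)
		return 0x91000000u | (rn << 5) | rd;	/* ADD rd, rn, #0 */
	return 0xAA0003E0u | (rn << 16) | rd;		/* ORR rd, XZR, rn */
}
```

With this sketch, encode_mov64(7, 0) yields the ORR word that
disassembles as "mov x7, x0", while any move involving SP keeps the ADD
form.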
[1] https://github.com/puranjaymohan/scripts/blob/main/arm64/bench/run_mov_vs_add0.sh
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20260225134339.2723288-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a test that attaches a BPF program to a usdt probe in two scenarios:
- attach the program on top of usdt_1, which is a single nop instruction,
so the probe stays on the nop instruction and is not optimized.
- attach the program on top of usdt_2, which is a probe defined on top
of the nop,nop5 combo, so the probe is placed on top of nop5 and
is optimized.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260224103915.1369690-5-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Syncing latest usdt.h change [1].
Now that we have nop5 optimization support in kernel, let's emit
nop,nop5 for usdt probe. We leave it up to the library to use
desirable nop instruction.
[1] c9865d1589
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260224103915.1369690-4-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
fsession is already supported on x86_64, arm64, riscv and s390, so there
is no need to disable it at compile time based on the architecture.
Factor out the fsession tests so that they can be disabled manually on
architectures that don't support it.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20260224092208.1395085-4-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
exe_ctx test fails on s390, because get_preempt_count() is not
implemented and its fallback path always returns 0. Implement it
using the new bpf_get_lowcore() kfunc.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20260217160813.100855-3-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Implementing BPF version of preempt_count() requires accessing lowcore
from BPF. Since lowcore can be relocated, open-coding
(struct lowcore *)0 does not work, so add a kfunc.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20260217160813.100855-2-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a selftest to verify that changing xmit_hash_policy to vlan+srcmac
is rejected when a native XDP program is loaded on a bond in 802.3ad
mode. Without the fix in bond_option_xmit_hash_policy_set(), the change
succeeds silently, creating an inconsistent state that triggers a kernel
WARNING in dev_xdp_uninstall() when the bond is torn down.
The test attaches native XDP to bond0 (802.3ad, layer2+3), then
attempts to switch xmit_hash_policy to vlan+srcmac and asserts the
operation fails. It also verifies the change succeeds after XDP is
detached, confirming the rejection is specific to the XDP-loaded state.
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Link: https://patch.msgid.link/20260226080306.98766-3-jiayuan.chen@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The reg_bounds_crafted tests validate the verifier's range analysis
logic. They focus on the actual ranges and thus ignore the tnum. As a
consequence, they carry the assumption that the tested cases can be
reproduced in userspace without using the tnum information.
Unfortunately, the previous change to the refinement logic breaks that
assumption for one test case:
(u64)2147483648 (u32)<op> [4294967294; 0x100000000]
The tested bytecode is shown below. Without our previous improvement, on
the false branch of the condition, R7 is only known to have u64 range
[0xfffffffe; 0x100000000]. With our improvement, and using the tnum
information, we can deduce that R7 equals 0x100000000.
19: (bc) w0 = w6 ; R6=0x80000000
20: (bc) w0 = w7 ; R7=scalar(smin=umin=0xfffffffe,smax=umax=0x100000000,smin32=-2,smax32=0,var_off=(0x0; 0x1ffffffff))
21: (be) if w6 <= w7 goto pc+3 ; R6=0x80000000 R7=0x100000000
R7's tnum is (0; 0x1ffffffff). On the false branch, regs_refine_cond_op
refines R7's u32 range to [0; 0x7fffffff]. Then, __reg32_deduce_bounds
refines the s32 range to 0 using u32 and finally also sets u32=0.
From this, __reg_bound_offset improves the tnum to (0; 0x100000000).
Finally, our previous patch uses this new tnum to deduce that it only
intersects with u64=[0xfffffffe; 0x100000000] in a single value:
0x100000000.
Because the verifier uses the tnum to reach this constant value, the
selftest is unable to reproduce it by only simulating ranges. The
solution implemented in this patch is to change the test case such that
there is more than one overlap value between u64 and the tnum. The max.
u64 value is thus changed from 0x100000000 to 0x300000000.
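The single-overlap argument can be checked by brute force in user-space
C. This is only a sketch: tnum_values_in_range() is a hypothetical
helper, not verifier code, and the tnum (0; 0x300000000) used for the
updated test case is an assumption about the analogous refinement.

```c
#include <stdint.h>

/*
 * Count the concrete values represented by the tnum (value; mask)
 * that fall inside the u64 range [lo, hi]. Every subset of the
 * unknown bits in 'mask' is enumerated, so this is only practical
 * when few bits are unknown.
 */
unsigned int tnum_values_in_range(uint64_t value, uint64_t mask,
				  uint64_t lo, uint64_t hi)
{
	unsigned int count = 0;
	uint64_t sub = mask;

	for (;;) {
		uint64_t v = value | sub;

		if (v >= lo && v <= hi)
			count++;
		if (sub == 0)
			break;
		/* Step to the next smaller subset of the mask bits. */
		sub = (sub - 1) & mask;
	}
	return count;
}
```

With the original bounds, (0; 0x100000000) meets [0xfffffffe;
0x100000000] in exactly one value, which is why the verifier can derive
a constant. Under the assumption above, raising the maximum to
0x300000000 leaves three candidate values.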
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/50641c6a7ef39520595dcafa605692427c1006ec.1772225741.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch introduces selftests to cover the new bounds refinement
logic introduced in the previous patch. Without the previous patch,
the first two tests fail because of the invariant violation they
trigger. The last test fails because the R10 access is not detected as
dead code. In addition, all three tests fail because of R0 having a
non-constant value in the verifier logs.
In addition, the last two cases are covering the negative cases: when we
shouldn't refine the bounds because the u64 and tnum overlap in at least
two values.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/90d880c8cf587b9f7dc715d8961cd1b8111d01a8.1772225741.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a couple of tests to ensure that the refcount drops to zero when we
exercise the race where creation of a special field follows the logical
bpf_obj_free_fields done when deleting an element. Prior to the previous
changes, the fields would be freed eagerly, then be repopulated and end
up leaking, so the reference would never be dropped. Running this
test on a kernel without the fixes will cause a hang in delete_module,
since the module reference stays active due to the leaked kptr not
dropping it. After the fixes, the tests succeed as expected.
Reviewed-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260227224806.646888-6-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Dmabuf name allocations can be less than DMA_BUF_NAME_LEN characters,
but bpf_probe_read_kernel always tries to read exactly that many bytes.
If a name is less than DMA_BUF_NAME_LEN characters,
bpf_probe_read_kernel will read past the end. bpf_probe_read_kernel_str
stops at the first NUL terminator so use it instead, like
iter_dmabuf_for_each already does.
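The difference between a fixed-size read and a string read can be shown
with a user-space analogue. This is only a sketch of the "stop at the
first NUL" behaviour: read_str_bounded() is a hypothetical stand-in,
not the BPF helper, and it simply mirrors the convention of returning
the copied length including the trailing NUL.

```c
#include <stddef.h>

/*
 * Copy at most size - 1 bytes from src into dst, stopping at the
 * first NUL, and always NUL-terminate dst. Returns the number of
 * bytes written including the terminator, or 0 when size is 0.
 * Bytes of src past its terminator are never touched.
 */
long read_str_bounded(char *dst, size_t size, const char *src)
{
	size_t i;

	if (size == 0)
		return 0;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return (long)(i + 1);
}
```

A name stored in a 4-byte allocation ("abc" plus NUL) is read safely
even when the destination buffer is much larger, because the loop never
looks beyond the terminator.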
Fixes: ae5d2c59ec ("selftests/bpf: Add test for dmabuf_iter")
Reported-by: Jerome Lee <jaewookl@quicinc.com>
Signed-off-by: T.J. Mercier <tjmercier@google.com>
Link: https://lore.kernel.org/r/20260225003349.113746-1-tjmercier@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Test whether tail call count is incorrectly accounted for, when the
tail call fails due to a missing BPF program.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Link: https://lore.kernel.org/r/20260216090802.1805655-1-hbathini@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The test_sys_enter_exit test was setting target_pid before attaching
the BPF programs, which causes syscalls made during the attach phase
to be counted. This is flaky because, apparently, there is no
guarantee that both on_enter and on_exit will trigger during the
attachment.
Move the target_pid assignment to after task_local_storage__attach()
so that only explicit sys_gettid() calls are counted.
Reported-by: BPF CI Bot (Claude Opus 4.6) <bot+bpf-ci@kernel.org>
Closes: https://github.com/kernel-patches/vmtest/issues/448
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260224211202.214325-1-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The bpftool_maps_access and bpftool_metadata tests may fail on BPF CI
with "command not found", depending on the workflow.
This happens because detect_bpftool_path() only checks two hardcoded
relative paths:
- ./tools/sbin/bpftool
- ../tools/sbin/bpftool
Add support for a BPFTOOL environment variable that allows specifying
the exact path to the bpftool binary.
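The lookup order described above can be sketched as follows. pick_bpftool() is a hypothetical helper written for this example, not the selftest's detect_bpftool_path(); in particular, returning NULL when an explicit override is not executable is an assumption, since the commit message does not say whether the real code falls back to the default paths in that case.

```c
#include <stddef.h>
#include <unistd.h>

/* Fallback locations named in the commit message. */
static const char *const default_paths[] = {
	"./tools/sbin/bpftool",
	"../tools/sbin/bpftool",
};

/*
 * Hypothetical helper: prefer an explicit override (for example the value
 * of getenv("BPFTOOL")), otherwise return the first default path that is
 * executable. Returns NULL when nothing usable is found.
 */
static const char *pick_bpftool(const char *override)
{
	size_t i;

	if (override && override[0] != '\0')
		return access(override, X_OK) == 0 ? override : NULL;

	for (i = 0; i < sizeof(default_paths) / sizeof(default_paths[0]); i++)
		if (access(default_paths[i], X_OK) == 0)
			return default_paths[i];
	return NULL;
}
```

A caller would typically invoke it as pick_bpftool(getenv("BPFTOOL")), so that an unset variable falls through to the relative paths.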
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260223191118.655185-2-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
- kmem_cache_iter: remove unnecessary debug output
- lwt_seg6local: change the type of foobar to char[]
  - sizeof(foobar) returned the pointer size and not a string
    length as intended
- verifier_log: increase prog_name buffer size in verif_log_subtest()
  - the compiler has a conservative estimate of the fixed_log_sz value,
    making ASAN complain on the snprintf() call
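The sizeof pitfall behind the lwt_seg6local change can be shown in isolation. This is a minimal sketch; the string contents and function names are chosen for illustration and are not taken from the test.

```c
#include <stddef.h>

/* With a pointer, sizeof yields the size of the pointer itself. */
static size_t size_via_pointer(void)
{
	const char *foobar = "foobar";

	return sizeof(foobar);
}

/* With an array, sizeof yields the array size: 6 characters plus the NUL. */
static size_t size_via_array(void)
{
	const char foobar[] = "foobar";

	return sizeof(foobar);
}
```

size_via_array() returns 7, while size_via_pointer() returns sizeof(char *), which depends on the target and is unrelated to the string.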
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260223191118.655185-1-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Compiler cannot infer upper bound for labels.cnt and warns about
potential buffer overflow in snprintf. Add an explicit bounds
check (... && i < MAX_LOCAL_LABELS) in the loop condition to fix the
warning.
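A minimal sketch of this kind of loop bound is shown below. The structure layout, the buffer sizes, and the value of MAX_LOCAL_LABELS are assumptions made for the example; only the shape of the loop condition follows the description above.

```c
#include <stdio.h>

#define MAX_LOCAL_LABELS 32 /* value assumed for the example */

/* Hypothetical container: cnt is a runtime value the compiler cannot bound. */
struct labels {
	int cnt;
	char names[MAX_LOCAL_LABELS][16];
};

/* Fills one name per label and returns how many names were written. */
static int fill_names(struct labels *labels)
{
	int i;

	/* The extra "i < MAX_LOCAL_LABELS" gives the compiler a provable bound. */
	for (i = 0; i < labels->cnt && i < MAX_LOCAL_LABELS; i++)
		snprintf(labels->names[i], sizeof(labels->names[i]), "l%d", i);
	return i;
}
```

Even if cnt were corrupted to a large value, the loop would stop at MAX_LOCAL_LABELS, which is what lets the compiler drop the overflow warning.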
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260223190736.649171-18-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
ASAN reported a resource leak due to the bpf_object not being tracked
in test_sysctl. Add obj field to struct sysctl_test to properly clean
it up.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260223190736.649171-17-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
ASAN reported a "joining already joined thread" error. The
release_child() may be called multiple times for the same struct
child.
Fix by resetting child->thread to 0 after pthread_join.
Also memset(0) static child variable in test_attach_api().
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260223190736.649171-15-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
ASAN reported a use-after-free in close_xsk().
The xsk->socket internally references xsk->umem via socket->ctx->umem,
so the socket must be deleted before the umem. Fix the order of
operations in close_xsk().
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260223190736.649171-14-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The Close() macro uses the passed in expression three times, which
leads to repeated execution in case it has side effects. That is,
Close(i--) would decrement i three times.
ASAN caught a stack-buffer-underflow error at a point where this was
overlooked. Fix it.
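The multiple-evaluation hazard can be reproduced with a small sketch. CLOSE_UNSAFE only approximates the shape described above (the argument appears three times), CLOSE_SAFE is one possible single-evaluation alternative and not necessarily the fix that was applied, and fake_close() is a stand-in so that no real file descriptors are touched.

```c
static int closed_count; /* counts fake_close() calls */

/* Stand-in for close(2) so the example has no real side effects. */
static void fake_close(int fd)
{
	(void)fd;
	closed_count++;
}

/* Unsafe: FD appears three times, so side effects in FD run three times. */
#define CLOSE_UNSAFE(FD)                \
	do {                            \
		if ((FD) >= 0) {        \
			fake_close(FD); \
			FD = -1;        \
		}                       \
	} while (0)

/* Safer: the argument is evaluated once, when its address is taken. */
static void close_once(int *fd)
{
	if (*fd >= 0) {
		fake_close(*fd);
		*fd = -1;
	}
}
#define CLOSE_SAFE(FD) close_once(&(FD))

/* Returns the index after one unsafe close: i is decremented three times. */
static int index_after_unsafe(void)
{
	int fds[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
	int i = 6;

	CLOSE_UNSAFE(fds[i--]);
	return i;
}

/* Returns the index after one safe close: i is decremented once. */
static int index_after_safe(void)
{
	int fds[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
	int i = 6;

	CLOSE_SAFE(fds[i--]);
	return i;
}
```

Starting from i = 6, the unsafe variant leaves i at 3 and tests, closes, and resets three different array slots, while the safe variant leaves i at 5. In a loop that walks an array downwards, the unsafe form can step below index 0, which matches the kind of underflow ASAN reports.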
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260223190736.649171-12-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
ASAN reported a memory leak in bpf_get_ksyms(): it allocates a struct
ksyms internally and never frees it.
Move struct ksyms to trace_helpers.h and return it from the
bpf_get_ksyms(), giving ownership to the caller. Add filtered_syms and
filtered_cnt fields to the ksyms to hold the filtered array of
symbols, previously returned by bpf_get_ksyms().
Fix up the call sites: kprobe_multi_test and bench_trigger.
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260223190736.649171-10-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a denylist file for tests that should be skipped when built with
userspace ASAN:
$ make ... SAN_CFLAGS="-fsanitize=address -fno-omit-frame-pointer"
Skip the following tests:
- *arena*: userspace ASAN does not understand BPF arena maps and gets
confused particularly when map_extra is non-zero
- non-zero map_extra leads to mmap with MAP_FIXED, and ASAN treats
this as an unknown memory region
- task_local_data: ASAN complains about "incorrect" aligned_alloc()
usage, but it's intentional in the test
- uprobe_multi_test: very slow with ASAN enabled
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260223190736.649171-9-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
EXTRA_* and SAN_* build flags were not correctly propagated to bpftool
and resolve_btfids when building selftests/bpf. This led to various
build errors when attempting to build with SAN_CFLAGS="-fsanitize=address",
for example.
Fix the makefiles to address this:
- Pass SAN_CFLAGS/SAN_LDFLAGS to bpftool and resolve_btfids build
- Propagate EXTRA_LDFLAGS to resolve_btfids link command
- Use pkg-config to detect zlib and zstd for resolve_btfids, similar
  to libelf handling
Also check for ASAN flag in selftests/bpf/Makefile for convenience.
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260223190736.649171-7-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Replace strncpy() with memcpy() in cases where the source is
not NUL-terminated and the copy length is known.
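A minimal sketch of the pattern, assuming a destination that should end up as a C string: copy_field() is a hypothetical helper written for this example and does not appear in the selftests.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes from a source that is not NUL-terminated and terminate
 * the destination explicitly. Returns 0 on success, or -1 if dst cannot
 * hold len bytes plus the terminating NUL.
 */
static int copy_field(char *dst, size_t dst_size, const char *src, size_t len)
{
	if (len >= dst_size)
		return -1;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return 0;
}
```

Because the length is known up front, memcpy() states the intent directly and never scans the source for a terminator that is not there.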
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260223190736.649171-6-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Change the license of the task local data mini library to LGPL-2.1 or
BSD-2-Clause to allow it to be used in a wider range of projects.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Link: https://lore.kernel.org/r/20260219225849.2426421-1-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
As arm64 JIT now supports instruction array, make sure
all relevant tests run on this architecture.
Summary: 1/9 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Abhishek Dubey <adubey@linux.ibm.com>
Acked-by: Anton Protopopov <a.s.protopopov@gmail.com>
Link: https://lore.kernel.org/r/20260223203511.118475-1-adubey@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Allow bpf_kptr_xchg to directly operate on pointers marked with
NON_OWN_REF | MEM_RCU.
In the example demonstrated in this patch, as long as "struct
bpf_refcount ref" exists, the __kptr pointer is guaranteed to
carry the MEM_RCU flag. The ref member itself does not need to
be explicitly used.
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
Link: https://lore.kernel.org/r/20260214124042.62229-6-pilgrimtao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
1. Allow using bpf_kptr_xchg while holding a lock.
2. When the rb_node contains a __kptr pointer, we do not need to
perform a remove-read-add operation.
This patch implements the following workflow:
1. Construct a rbtree with 16 elements.
2. Traverse the rbtree, locate the kptr pointer in the target node,
and read the content pointed to by the pointer.
3. Remove all nodes from the rbtree.
Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Link: https://lore.kernel.org/r/20260214124042.62229-4-pilgrimtao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Some architectures require mappings to be aligned to the system page size.
The build_id selftest currently uses a smaller alignment, which can result
in madvise operations executing on a different page than intended.
Increase the mapping alignment to 64K so the buffer is page-aligned on
all supported architectures.
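The effect of page size on which page an address belongs to can be sketched with a little arithmetic. page_base() is a hypothetical helper and the addresses are made up for illustration.

```c
#include <stdint.h>

/* Returns the start of the page containing addr; page_size must be a power of two. */
static uintptr_t page_base(uintptr_t addr, uintptr_t page_size)
{
	return addr & ~(page_size - 1);
}
```

For an address such as 0x11000, page_base() returns 0x11000 with 4K pages but 0x10000 with 64K pages, so a page-granular operation on a 64K system would start at a different location than the buffer itself. An address aligned to 64K, such as 0x20000, is its own page base for both page sizes.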
Signed-off-by: Gregory Bell <grbell@redhat.com>
Link: https://lore.kernel.org/r/93543253b32d1cb178ab6e31e4291e387ba1c372.1771338492.git.grbell@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The build_id selftest occasionally fails because MADV_PAGEOUT
does not guarantee the immediate eviction of the page. The test
assumes eviction happens and proceeds without verifying
that the page was actually reclaimed, leading to false test
failures.
Fix the test by retrying the page-out sequence until eviction
is successful, instead of relying on a single MADV_PAGEOUT attempt.
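The retry approach can be sketched generically. Everything here is hypothetical: evict_with_retry() and the callback type are written for this example, and the real test's page-out and residency checks are not reproduced. In a real setting the callback might issue madvise() with MADV_PAGEOUT and then check residency, for instance with mincore().

```c
#include <stdbool.h>

/*
 * Hypothetical callback: performs one page-out attempt and reports
 * whether the page is now non-resident.
 */
typedef bool (*evict_attempt_fn)(void *ctx);

/*
 * Retry the attempt up to max_tries times. Returns the number of tries
 * it took, or -1 if the page was never evicted.
 */
static int evict_with_retry(evict_attempt_fn attempt, void *ctx, int max_tries)
{
	int tries;

	for (tries = 1; tries <= max_tries; tries++)
		if (attempt(ctx))
			return tries;
	return -1;
}

/* Example attempt used for demonstration: succeeds on the third call. */
static bool succeed_on_third(void *ctx)
{
	int *calls = ctx;

	return ++(*calls) >= 3;
}
```

Bounding the number of tries keeps a test from spinning forever on a system where the page can never be reclaimed.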
Signed-off-by: Gregory Bell <grbell@redhat.com>
Link: https://lore.kernel.org/r/038bd27c69dd3a16958894fcb19e4fb6fbfe317e.1771338492.git.grbell@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The verification signature header generation requires converting a
binary certificate to a C array. Previously this only worked with xxd,
and a switch to hexdump has been done in commit b640d556a2
("selftests/bpf: Remove xxd util dependency").
hexdump is a more common utility program, yet it might not be installed
by default. When it is not installed, BPF selftests build without
errors, but test_progs is unusable: it exits with code 255 and
without any error message. When manually reproducing the issue, it is
not too hard to find out that the generated verification_cert.h file is
incorrect, but that is time-consuming. When digging through the BPF
selftests build logs, this line can be seen among thousands of others,
but it is easily overlooked:
/bin/sh: 2: hexdump: not found
Here, od is used instead of hexdump. od comes from the coreutils
package, and this new od command produces the same output when using od
from GNU coreutils, uutils, and even busybox. This is more portable, and
it produces similar results to what was done before with hexdump:
there is an extra comma at the end instead of trailing whitespace,
but the C code is not impacted.
Fixes: b640d556a2 ("selftests/bpf: Remove xxd util dependency")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/20260218-bpf-sft-hexdump-od-v2-1-2f9b3ee5ab86@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
- Replace linux/* includes with vmlinux.h
- Include errno.h
- Include bpf_tracing_net.h for TC_ACT_* and ETH_*
- Use BPF_STDERR instead of BPF_STREAM_STDERR
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260218215651.2057673-2-ihor.solodrai@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
get_preempt_count() is enabled to return preempt_count for powerpc,
so that bpf_in_interrupt()/bpf_in_nmi()/bpf_in_serving_softirq()/
bpf_in_task()/bpf_in_hardirq()/get_preempt_count() work on
powerpc as well.
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Link: https://lore.kernel.org/r/20260212092558.370623-1-skb99@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This commit consolidates static and varying pointer offset tracking
logic. All offsets are now represented solely using `.var_off` and
min/max fields. The reasons are twofold:
- This simplifies pointer tracking code, as each relevant function
needs to check the `.var_off` field anyway.
- It makes it easier to widen pointer registers for the purpose of loop
convergence checks, by forgoing the `regsafe()` logic demanding
`.off` fields to be identical.
The changes are spread across many functions and are hard to group
into smaller patches. Some of the logical changes include:
- Checks in __check_ptr_off_reg() are reordered so that the
tnum_is_const() check is done before operating on reg->var_off.value.
- check_packet_access() now uses check_mem_region_access() to handle
possible 'off' overflow cases.
- In check_helper_mem_access() utility functions like
check_packet_access() are now called with 'off=0', as these utility
functions now account for the complete register offset range.
- In check_reg_type() a call to __check_ptr_off_reg() is added before
a call to btf_struct_ids_match(). This prevents
btf_struct_ids_match() from potentially working on non-constant
reg->var_off.value.
- regsafe() is relaxed to avoid comparing '.off' field for pointers.
As a precaution, the changes are verified in [1] by adding a pass
checking that no pointer has non-zero '.off' field on each
do_check_insn() iteration.
[1] https://github.com/eddyz87/bpf/tree/ptrs-off-migration
Notable selftests changes:
- `.var_off` value changed because it now combines static and varying
offsets. Affected tests:
- linked_list/incorrect_node_var_off
- linked_list/incorrect_head_var_off2
- verifier_align/packet_variable_offset
- Overflowing `smax_value` bound leads to a pointer with big negative
or positive offset to be rejected immediately (previously overflowing
`rX += const` instruction updated `.off` field avoiding the overflow).
Affected tests:
- verifier_align/dubious_pointer_arithmetic
- verifier_bounds/var_off_insn_off_test1
- Invalid access to packet now reports full offset inside a packet.
Affected tests:
- verifier_direct_packet_access/test23_x_pkt_ptr_4
- A change in check_mem_region_access() behavior:
when register `.smin_value` is negative, it reports
"rX min value is negative..." before calling into __check_mem_access()
which reports "invalid access to ...".
In the tests below, the `.off` field was negative, while `.smin_value`
remained positive. This is no longer the case after the changes in
this commit. Affected tests:
- verifier_gotox/jump_table_invalid_mem_acceess_neg
- verifier_helper_packet_access/test15_cls_helper_fail_sub
- verifier_helper_value_access/imm_out_of_bound_2
- verifier_helper_value_access/reg_out_of_bound_2
- verifier_meta_access/meta_access_test2
- verifier_value_ptr_arith/known_scalar_from_different_maps
- lower_oob_arith_test_1
- value_ptr_known_scalar_3
- access_value_ptr_known_scalar
- Usage of check_mem_region_access() instead of __check_mem_access()
in check_packet_access() changes the reported message from
"rX offset is outside ..." to "rX min/max value is outside ...".
Affected tests:
- verifier_xdp_direct_packet_access/*
- In check_func_arg_reg_off() the check for zero offset now operates
on `.var_off` field instead of `.off` field. For tests where the
pattern looks like `kfunc(reg_with_var_off, ...)`, this changes the
reported error:
- previously the error "variable ... access ... disallowed"
was reported by __check_ptr_off_reg();
- now "R1 must have zero offset ..." is reported by
check_func_arg_reg_off() itself.
Affected tests:
- verifier/calls.c
"calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset"
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260212-ptrs-off-migration-v2-2-00820e4d3438@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now that the RISC-V trampoline JIT supports BPF_TRACE_FSESSION, run
the fsession selftest on riscv64 as well as x86_64.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Tested-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20260208053311.698352-4-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Commit c27cea4416 ("rcu: Re-implement RCU Tasks Trace in terms of SRCU-fast")
broke map_kptr selftest since it removed the function we were kprobing.
Use a new kfunc that invokes call_rcu_tasks_trace and sets a program
provided pointer to an integer to 1. Technically this can be unsafe if
the memory being written to from the callback disappears, but this is
just for usage in a test where we ensure we spin until we see the value
to be set to 1, so it's ok.
Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Fixes: c27cea4416 ("rcu: Re-implement RCU Tasks Trace in terms of SRCU-fast")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260211185747.3630539-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
While working on pointer tracking changes I found it necessary to
update expected log messages in align.c series of tests.
As a preliminary step, migrate these tests to test_loader framework.
The tests in question load a BPF program and check whether the expected
log is produced. The log is specified as:
.matches = {
...
{4, "R3", "32"},
...
}
Where:
- '4' is an *instruction number* (contrary to the field name in
struct bpf_reg_match).
- 'R3' is the name of the register to check.
- '32' is the value expected for this register.
Mimic the same logic using __msg macro.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260211051310.2782558-1-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Total patches: 107
Reviews/patch: 1.07
Reviewed rate: 67%
- The 2 patch series "ocfs2: give ocfs2 the ability to reclaim
suballocator free bg" from Heming Zhao saves disk space by teaching
ocfs2 to reclaim suballocator block group space.
- The 4 patch series "Add ARRAY_END(), and use it to fix off-by-one
bugs" from Alejandro Colomar adds the ARRAY_END() macro and uses it in
various places.
- The 2 patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than
PAGE_SIZE" from Pnina Feder makes the vmcore code future-safe, if
VMCOREINFO_BYTES ever exceeds the page size.
- The 7 patch series "kallsyms: Prevent invalid access when showing
module buildid" from Petr Mladek cleans up kallsyms code related to
module buildid and fixes an invalid access crash when printing
backtraces.
- The 3 patch series "Address page fault in
ima_restore_measurement_list()" from Harshit Mogalapalli fixes a
kexec-related crash that can occur when booting the second-stage kernel
on x86.
- The 6 patch series "kho: ABI headers and Documentation updates" from
Mike Rapoport updates the kexec handover ABI documentation.
- The 4 patch series "Align atomic storage" from Finn Thain adds the
__aligned attribute to atomic_t and atomic64_t definitions to get
natural alignment of both types on csky, m68k, microblaze, nios2,
openrisc and sh.
- The 2 patch series "kho: clean up page initialization logic" from
Pratyush Yadav simplifies the page initialization logic in
kho_restore_page().
- The 6 patch series "Unload linux/kernel.h" from Yury Norov moves
several things out of kernel.h and into more appropriate places.
- The 7 patch series "don't abuse task_struct.group_leader" from Oleg
Nesterov removes the usage of ->group_leader when it is "obviously
unnecessary".
- The 5 patch series "list private v2 & luo flb" from Pasha Tatashin
adds some infrastructure improvements to the live update orchestrator.
Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves
disk space by teaching ocfs2 to reclaim suballocator block group
space (Heming Zhao)
- "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the
ARRAY_END() macro and uses it in various places (Alejandro Colomar)
- "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes
the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the
page size (Pnina Feder)
- "kallsyms: Prevent invalid access when showing module buildid" cleans
up kallsyms code related to module buildid and fixes an invalid
access crash when printing backtraces (Petr Mladek)
- "Address page fault in ima_restore_measurement_list()" fixes a
kexec-related crash that can occur when booting the second-stage
kernel on x86 (Harshit Mogalapalli)
- "kho: ABI headers and Documentation updates" updates the kexec
handover ABI documentation (Mike Rapoport)
- "Align atomic storage" adds the __aligned attribute to atomic_t and
atomic64_t definitions to get natural alignment of both types on
csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain)
- "kho: clean up page initialization logic" simplifies the page
initialization logic in kho_restore_page() (Pratyush Yadav)
- "Unload linux/kernel.h" moves several things out of kernel.h and into
more appropriate places (Yury Norov)
- "don't abuse task_struct.group_leader" removes the usage of
->group_leader when it is "obviously unnecessary" (Oleg Nesterov)
- "list private v2 & luo flb" adds some infrastructure improvements to
the live update orchestrator (Pasha Tatashin)
* tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits)
watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency
procfs: fix missing RCU protection when reading real_parent in do_task_stat()
watchdog/softlockup: fix sample ring index wrap in need_counting_irqs()
kcsan, compiler_types: avoid duplicate type issues in BPF Type Format
kho: fix doc for kho_restore_pages()
tests/liveupdate: add in-kernel liveupdate test
liveupdate: luo_flb: introduce File-Lifecycle-Bound global state
liveupdate: luo_file: Use private list
list: add kunit test for private list primitives
list: add primitives for private list manipulations
delayacct: fix uapi timespec64 definition
panic: add panic_force_cpu= parameter to redirect panic to a specific CPU
netclassid: use thread_group_leader(p) in update_classid_task()
RDMA/umem: don't abuse current->group_leader
drm/pan*: don't abuse current->group_leader
drm/amd: kill the outdated "Only the pthreads threading model is supported" checks
drm/amdgpu: don't abuse current->group_leader
android/binder: use same_thread_group(proc->tsk, current) in binder_mmap()
android/binder: don't abuse current->group_leader
kho: skip memoryless NUMA nodes when reserving scratch areas
...
bpf_local_storage_free() already does not rely on local_storage->smap
since switching to kmalloc_nolock(). As local_storage->smap is removed,
fix the outdated test by dropping the local_storage->smap check. Keep
the second map in task local storage map test to test that multiple
elements can be added to the storage similar to sk storage test.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260205222916.1788211-18-ameryhung@gmail.com
bpf_cgrp_storage_busy has been removed. Use bpf_bprintf_nest_level
instead. This percpu variable is also in the bpf subsystem so that
if it is removed in the future, BPF-CI will catch this type of CI-
breaking change.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260205222916.1788211-17-ameryhung@gmail.com
Remove a test in test_maps that checks if the updating of the percpu
counter in task local storage map is preemption safe as the percpu
counter is now removed.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260205222916.1788211-16-ameryhung@gmail.com
Adjust the error code we are checking against as
bpf_task_storage_delete() now returns -EDEADLK or -ETIMEDOUT when
deadlock happens.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260205222916.1788211-15-ameryhung@gmail.com
Update the expected result of the selftest, as the recursion restrictions
on the task local storage syscall and helpers have been relaxed. Now that
the percpu counter is removed, the task local storage helpers
bpf_task_storage_get() and bpf_task_storage_delete() can run on the same
CPU at the same time unless they cause a deadlock.
Note that since there is no percpu counter preventing recursion in
task local storage helpers, bpf_trampoline now catches the recursion
of on_update as reported by recursion_misses.
on_enter: tp_btf/sys_enter
on_update: fentry/bpf_local_storage_update
Old behavior                                New behavior
____________                                ____________
on_enter                                    on_enter
  bpf_task_storage_get(&map_a)                bpf_task_storage_get(&map_a)
    bpf_task_storage_trylock succeed            bpf_local_storage_update(&map_a)
    bpf_local_storage_update(&map_a)
      on_update                                   on_update
        bpf_task_storage_get(&map_a)                bpf_task_storage_get(&map_a)
          bpf_task_storage_trylock fail               on_update::misses++ (1)
          return NULL                                 create and return map_a::ptr
                                                    map_a::ptr += 1 (1)
                                                    bpf_task_storage_delete(&map_a)
                                                      return 0
        bpf_task_storage_get(&map_b)                bpf_task_storage_get(&map_b)
          bpf_task_storage_trylock fail               on_update::misses++ (2)
          return NULL                                 create and return map_b::ptr
                                                    map_b::ptr += 1 (1)
    create and return map_a::ptr                create and return map_a::ptr
  map_a::ptr = 200                            map_a::ptr = 200
  bpf_task_storage_get(&map_b)                bpf_task_storage_get(&map_b)
    bpf_task_storage_trylock succeed            lockless lookup succeed
    bpf_local_storage_update(&map_b)            return map_b::ptr
      on_update
        bpf_task_storage_get(&map_a)
          bpf_task_storage_trylock fail
          lockless lookup succeed
          return map_a::ptr
        map_a::ptr += 1 (201)
        bpf_task_storage_delete(&map_a)
          bpf_task_storage_trylock fail
          return -EBUSY
        nr_del_errs++ (1)
        bpf_task_storage_get(&map_b)
          bpf_task_storage_trylock fail
          return NULL
    create and return ptr
  map_b::ptr = 100
Expected result:
map_a::ptr = 201                            map_a::ptr = 200
map_b::ptr = 100                            map_b::ptr = 1
nr_del_err = 1                              nr_del_err = 0
on_update::recursion_misses = 0             on_update::recursion_misses = 2
on_enter::recursion_misses = 0              on_enter::recursion_misses = 0
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260205222916.1788211-14-ameryhung@gmail.com
Check sk_omem_alloc when the caller of bpf_local_storage_destroy()
returns. bpf_local_storage_destroy() now returns the amount of memory to
uncharge to the caller instead of uncharging it directly. Therefore, in
the sk_storage_omem_uncharge test, check sk_omem_alloc when
bpf_sk_storage_free() returns instead of when bpf_local_storage_destroy()
returns.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260205222916.1788211-13-ameryhung@gmail.com
The issue occurs in the TOO_MANY_FRAGS test case when xdp_zc_max_segs is
set to an odd number.
The TOO_MANY_FRAGS test case contains an invalid packet consisting of
(xdp_zc_max_segs) frags. Every frag, even the last one, has the
XDP_PKT_CONTD flag set. This packet is expected to be dropped. After that,
there is a valid linear packet, which is expected to be received back.
When (xdp_zc_max_segs) is an odd number, the last packet cannot be
received if packet forwarding between Rx and Tx interfaces relies on
the ethernet header, e.g. checks for ETH_P_LOOPBACK. The packet is
malformed if all traffic is looped.
It turns out that the sending function processes multiple invalid frags as
if they were in 2-frag packets. So when the invalid mbuf packet contains an
odd number of those, the valid packet that follows gets paired with the
previous invalid descriptor and hence does not get an ethernet header
generated, so it is either dropped or malformed.
Make invalid packets in verbatim mode always have only a single frag. For
such packets, the number of frags is otherwise meaningless, as descriptor
flags are pre-configured in verbatim mode and packet data is not generated
for invalid descriptors.
Fixes: 697604492b ("selftests/xsk: add invalid descriptor test for multi-buffer")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20260203155103.2305816-3-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The referenced commit reduced the scope of the variable pkt, so now it has
to be reinitialized via pkt_stream_get_next_rx_pkt(), which also increments
some counters. When the packet is interrupted by the batch ending, the pkt
stream therefore proceeds to the next packet while the xsk ring still
contains the previous one, which results in a pkt_nb mismatch.
Decrement the affected counters when a packet is interrupted.
Fixes: 8913e653e9 ("selftests/xsk: Iterate over all the sockets in the receive pkts function")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20260203155103.2305816-2-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add tests for linked register tracking with negative offsets, BPF_SUB,
and alu32. These test for all edge cases like overflows, etc.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260204151741.2678118-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Previously, the verifier only tracked positive constant deltas between
linked registers using BPF_ADD. This limitation meant that patterns like
the following could not be verified precisely:
r1 = r0;
r1 += -4;
if r1 s>= 0 goto l0_%=; // r1 >= 0 implies r0 >= 4
// verifier couldn't propagate bounds back to r0
if r0 != 0 goto l0_%=;
r0 /= 0; // Verifier thinks this is reachable
l0_%=:
A similar limitation existed for 32-bit registers.
With this change, the verifier can now track negative deltas in reg->off
enabling bound propagation for the above pattern.
For alu32, we make sure the destination register has the upper 32 bits
as 0s before creating the link. BPF_ADD_CONST is split into
BPF_ADD_CONST64 and BPF_ADD_CONST32, the latter is used in case of alu32
and sync_linked_regs uses this to zext the result if known_reg has this
flag.
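The bound propagation between linked registers can be sketched with plain arithmetic. This is not the verifier's implementation: struct reg_sketch and propagate_linked() are hypothetical, only signed bounds are modeled, and overflow handling is deliberately left out, so the sketch assumes the subtractions do not overflow.

```c
#include <stdint.h>

/* Hypothetical register model that tracks only signed bounds. */
struct reg_sketch {
	int64_t smin;
	int64_t smax;
};

/*
 * Given the relation known = other + delta, tighten other's bounds from
 * known's bounds. Assumes known->smin - delta and known->smax - delta do
 * not overflow; the real verifier has to handle those cases.
 */
static void propagate_linked(const struct reg_sketch *known, int64_t delta,
			     struct reg_sketch *other)
{
	int64_t lo = known->smin - delta;
	int64_t hi = known->smax - delta;

	if (lo > other->smin)
		other->smin = lo;
	if (hi < other->smax)
		other->smax = hi;
}
```

With r1 = r0 + (-4) and r1 known to lie in [0, 100], the sketch tightens r0 to [4, 104], which is the "r1 >= 0 implies r0 >= 4" inference from the pattern above.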
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260204151741.2678118-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now BPF_END has bitwise tracking support. This patch adds selftests to
cover various cases of BPF_END (`bswap(16|32|64)`, `be(16|32|64)`,
`le(16|32|64)`) with bitwise propagation.
This patch is based on the existing `verifier_bswap.c` and adds several
types of new tests:
1. Unconditional byte swap operations:
- bswap16/bswap32/bswap64 with unknown bytes
2. Endian conversion operations (architecture-aware):
- be16/be32/be64: convert to big-endian
* on little-endian: do swap
* on big-endian: truncation (16/32-bit) or no-op (64-bit)
- le16/le32/le64: convert to little-endian
* on big-endian: do swap
* on little-endian: truncation (16/32-bit) or no-op (64-bit)
Each test simulates realistic networking scenarios where a value is
masked with unknown bits (e.g., var_off=(0x0; 0x3f00), range=[0,0x3f00]),
then byte-swapped, and the verifier must prove the result stays within
expected bounds.
Specifically, these selftests are based on dead code elimination:
if the BPF verifier can precisely track bits through byte swap
operations, it can prune the trap path (invalid memory access) that
should be unreachable, allowing the program to pass verification.
If bitwise tracking is incorrect, the verifier cannot prove the trap
is unreachable, causing verification failure.
The tests use preprocessor conditionals (#ifdef __BYTE_ORDER__) to
verify correct behavior on both little-endian and big-endian
architectures, and require Clang 18+ for bswap instruction support.
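As a rough illustration of what bitwise tracking through a byte swap
means, the sketch below applies a 16-bit swap to a simplified
(value, mask) pair in which mask bits are unknown. struct tnum_model and
its helpers are hypothetical names for this example and only approximate
the verifier's tnum representation.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified tristate number: bits set in mask are unknown. */
struct tnum_model {
	uint64_t value;
	uint64_t mask;
};

static uint16_t swap16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/*
 * A byte swap only moves bits around, so known and unknown bits can be
 * swapped independently: apply the same swap to value and to mask.
 */
static struct tnum_model tnum_model_bswap16(struct tnum_model t)
{
	struct tnum_model r;

	/* Invariant of the representation: a bit is known or unknown. */
	assert((t.value & t.mask) == 0);

	r.value = swap16((uint16_t)t.value);
	r.mask = swap16((uint16_t)t.mask);
	return r;
}

/* Largest value the tracked register can hold: all unknown bits set. */
static uint64_t tnum_model_umax(struct tnum_model t)
{
	return t.value | t.mask;
}
```

For the var_off=(0x0; 0x3f00) example above, the swapped pair is
(0x0; 0x3f), so the result is bounded by 0x3f and a trap guarded by a
larger bound can be proven unreachable.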
Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Co-developed-by: Yazhou Tang <tangyazhou518@outlook.com>
Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com>
Signed-off-by: Tianci Cao <ziye@zju.edu.cn>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260204111503.77871-3-ziye@zju.edu.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now bpf_timer can be used in tracepoints, so these tests are no longer
relevant.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260201025403.66625-9-alexei.starovoitov@gmail.com
Add stress tests for BPF timers that run in NMI context using perf_event
programs attached to PERF_COUNT_HW_CPU_CYCLES.
The tests cover three scenarios:
- nmi_race: Tests concurrent timer start and async cancel operations.
- nmi_update: Tests updating a map element (for an array map, effectively
  deleting the element and inserting a new one) from within a timer
  callback.
- nmi_cancel: Tests a timer self-cancellation attempt.
A common test_common() helper is used to share timer setup logic across
all test modes.
The tests spawn multiple threads in a child process to generate
perf events, which trigger the BPF programs in NMI context. Hit counters
verify that the NMI code paths were actually exercised.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260201025403.66625-8-alexei.starovoitov@gmail.com
Refactor the timer selftests, extracting the stress test into a separate
test. This makes test failures easier to debug and the tests easier to
extend.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260201025403.66625-5-alexei.starovoitov@gmail.com
Add a selftest to ensure BPF stream functions can now be called
while holding a lock.
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260203180424.14057-5-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Test that two registers with their id=0 (unlinked) in the cached state
can be mapped to a single id (linked) in the current state.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260203165102.2302462-6-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Scalar register IDs are used by the verifier to track relationships
between registers and enable bounds propagation across those
relationships. Once an ID becomes singular (i.e. only a single
register/stack slot carries it), it can no longer contribute to bounds
propagation and effectively becomes stale. The previous commit makes the
verifier clear such ids before caching the state.
When comparing the current and cached states for pruning, these stale
IDs can cause technically equivalent states to be considered different
and thus prevent pruning.
For example, in the selftest added in the next commit, two registers,
r6 and r7, are not linked to any other registers and get cached with
id=0, while in the current state they are linked to each other with
id=A. Before this commit, check_scalar_ids() would give temporary ids to
r6 and r7 (say tid1 and tid2), and check_ids() would map tid1->A. When
it then saw tid2->A, it would not consider these states equivalent.
Relax scalar ID equivalence by treating rold->id == 0 as "independent":
if the old state did not rely on any ID relationships for a register,
then any ID/linking present in the current state only adds constraints
and is always safe to accept for pruning. Implement this by returning
true immediately in check_scalar_ids() when old_id == 0.
Maintain correctness for the opposite direction (old_id != 0 && cur_id
== 0) by still allocating a temporary ID for cur_id == 0. This avoids
incorrectly allowing multiple independent current registers (id==0) to
satisfy a single linked old ID during mapping.
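The relaxed rule can be sketched with a small user-space model of the
ID map. This is an illustration under stated assumptions, not the
kernel code: struct idmap_model, map_ids() and scalar_ids_match() are
hypothetical names, and the table size is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IDMAP_MODEL_SIZE 16

struct idmap_model {
	uint32_t old_ids[IDMAP_MODEL_SIZE];
	uint32_t cur_ids[IDMAP_MODEL_SIZE];
	uint32_t tmp_id_gen;	/* must start above any real ID */
};

/* Require a one-to-one mapping between old IDs and current IDs. */
static bool map_ids(struct idmap_model *m, uint32_t old_id, uint32_t cur_id)
{
	for (int i = 0; i < IDMAP_MODEL_SIZE; i++) {
		if (!m->old_ids[i]) {
			/* Empty slot: first time this old ID is seen. */
			m->old_ids[i] = old_id;
			m->cur_ids[i] = cur_id;
			return true;
		}
		if (m->old_ids[i] == old_id)
			return m->cur_ids[i] == cur_id;
		if (m->cur_ids[i] == cur_id)
			return false;
	}
	return false;	/* table full: conservatively not equivalent */
}

static bool scalar_ids_match(struct idmap_model *m, uint32_t old_id,
			     uint32_t cur_id)
{
	assert(m);

	/* Old state relied on no link: any current ID is acceptable. */
	if (old_id == 0)
		return true;
	/* Unlinked current register: give it a unique temporary ID. */
	if (cur_id == 0)
		cur_id = ++m->tmp_id_gen;
	return map_ids(m, old_id, cur_id);
}
```

In this model, two old registers with id=0 both match current registers
sharing id=A, while two old registers sharing one ID do not match two
current registers with id=0, because each of the latter gets a different
temporary ID.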
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260203165102.2302462-5-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The fsession addition does not prevent fsession programs from being used
on architectures that have not added fsession support.
For example, running the fsession tests on arm64 gives:
test_fsession_basic:PASS:fsession_test__open_and_load 0 nsec
test_fsession_basic:PASS:fsession_attach 0 nsec
check_result:FAIL:test_run_opts err unexpected error: -14 (errno 14)
In order to prevent such errors, add bpf_jit_supports_fsession() to guard
those architectures.
Fixes: 2d419c4465 ("bpf: add fsession support")
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Tested-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260131144950.16294-2-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding support to call the bpf_get_stackid helper from trigger programs;
so far it is added only for kprobe multi.
Adding the --stacktrace/-g option to enable it.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260126211837.472802-7-jolsa@kernel.org
Adding test that attaches fentry/fexit and verifies the
ORC stacktrace matches expected functions.
The test is only for ORC unwinder to keep it simple.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260126211837.472802-6-jolsa@kernel.org
Adding test that attaches kprobe/kretprobe and verifies the
ORC stacktrace matches expected functions.
The test is only for ORC unwinder to keep it simple.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260126211837.472802-5-jolsa@kernel.org
We now include the attached function in the stack trace,
fixing the test accordingly.
Fixes: c9e208fa93 ("selftests/bpf: Add stacktrace ips test for kprobe_multi/kretprobe_multi")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260126211837.472802-4-jolsa@kernel.org
Recent x86 kernels export __preempt_count as a ksym, while some old kernels
between v6.1 and v6.14 expose the preemption counter via
pcpu_hot.preempt_count. The existing selftest helper unconditionally
dereferenced __preempt_count, which breaks BPF program loading on such old
kernels.
Make the x86 preemption count lookup version-agnostic by:
- Marking __preempt_count and pcpu_hot as weak ksyms.
- Introducing a BTF-described pcpu_hot___local layout with
preserve_access_index.
- Selecting the appropriate access path at runtime using
bpf_ksym_exists() and bpf_core_field_exists().
This allows a single BPF binary to run correctly across kernel versions
(e.g., v6.18 vs. v6.13) without relying on compile-time version checks.
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Link: https://lore.kernel.org/r/20260130021843.154885-1-changwoo@igalia.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding test that makes sure we can't mix sleepable and non-sleepable
bpf programs in the BPF_MAP_TYPE_PROG_ARRAY map and that we can do
a tail call in the sleepable program.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20260130081208.1130204-3-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>