linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-03 12:03:54 +02:00

Author	SHA1	Message	Date
Eduard Zingerman	223ffb6a3d	selftests/bpf: add reproducer for spurious precision propagation through calls Add a test for the scenario described in the previous commit: an iterator loop with two paths where one ties r2/r7 via shared scalar id and skips a call, while the other goes through the call. Precision marks from the linked registers get spuriously propagated to the call path via propagate_precision(), hitting "backtracking call unexpected regs" in backtrack_insn(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-2-18e859be570d@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-06 21:50:05 -08:00
Eduard Zingerman	2658a1720a	bpf: collect only live registers in linked regs Fix an inconsistency between func_states_equal() and collect_linked_regs(): - regsafe() uses check_ids() to verify that cached and current states have identical register id mapping. - func_states_equal() calls regsafe() only for registers computed as live by compute_live_registers(). - clean_live_states() is supposed to remove dead registers from cached states, but it can skip states belonging to an iterator-based loop. - collect_linked_regs() collects all registers sharing the same id, ignoring the marks computed by compute_live_registers(). Linked registers are stored in the state's jump history. - backtrack_insn() marks all linked registers for an instruction as precise whenever one of the linked registers is precise. The above might lead to a scenario: - There is an instruction I with register rY known to be dead at I. - Instruction I is reached via two paths: first A, then B. - On path A: - There is an id link between registers rX and rY. - Checkpoint C is created at I. - Linked register set {rX, rY} is saved to the jump history. - rX is marked as precise at I, causing both rX and rY to be marked precise at C. - On path B: - There is no id link between registers rX and rY, otherwise register states are sub-states of those in C. - Because rY is dead at I, check_ids() returns true. - Current state is considered equal to checkpoint C, propagate_precision() propagates spurious precision mark for register rY along the path B. - Depending on a program, this might hit verifier_bug() in the backtrack_insn(), e.g. if rY ∈ [r1..r5] and backtrack_insn() spots a function call. The reproducer program is in the next patch. This was hit by sched_ext scx_lavd scheduler code. Changes in tests: - verifier_scalar_ids.c selftests need modification to preserve some registers as live for __msg() checks. - exceptions_assert.c adjusted to match changes in the verifier log, R0 is dead after conditional instruction and thus does not get range. - precise.c adjusted to match changes in the verifier log, register r9 is dead after comparison and it's range is not important for test. Reported-by: Emil Tsalapatis <emil@etsalapatis.com> Fixes: `0fb3cf6110` ("bpf: use register liveness information for func_states_equal") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260306-linked-regs-and-propagate-precision-v1-1-18e859be570d@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-06 21:49:40 -08:00
Eduard Zingerman	d87c9305a8	Revert "selftests/bpf: Update reg_bound range refinement logic" This reverts commit `da653de268`. Removed logic is now covered by range_refine_in_halves() which handles both 32-bit and 64-bit refinements. Suggested-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-3-f7f67e060a6b@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-06 18:16:17 -08:00
Eduard Zingerman	f81fdfd167	selftests/bpf: test refining u32/s32 bounds when ranges cross min/max boundary Two test cases for signed/unsigned 32-bit bounds refinement when s32 range crosses the sign boundary: - s32 range [S32_MIN..1] overlapping with u32 range [3..U32_MAX], s32 range tail before sign boundary overlaps with u32 range. - s32 range [-3..5] overlapping with u32 range [0..S32_MIN+3], s32 range head after the sign boundary overlaps with u32 range. This covers both branches added in the __reg32_deduce_bounds(). Also, crossing_32_bit_signed_boundary_2() no longer triggers invariant violations. Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Reviewed-by: Paul Chaignon <paul.chaignon@gmail.com> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-2-f7f67e060a6b@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-06 18:16:17 -08:00
Eduard Zingerman	fbc7aef517	bpf: Fix u32/s32 bounds when ranges cross min/max boundary Same as in __reg64_deduce_bounds(), refine s32/u32 ranges in __reg32_deduce_bounds() in the following situations: - s32 range crosses U32_MAX/0 boundary, positive part of the s32 range overlaps with u32 range: 0 U32_MAX \| [xxxxxxxxxxxxxx u32 range xxxxxxxxxxxxxx] \| \|----------------------------\|----------------------------\| \|xxxxx s32 range xxxxxxxxx] [xxxxxxx\| 0 S32_MAX S32_MIN -1 - s32 range crosses U32_MAX/0 boundary, negative part of the s32 range overlaps with u32 range: 0 U32_MAX \| [xxxxxxxxxxxxxx u32 range xxxxxxxxxxxxxx] \| \|----------------------------\|----------------------------\| \|xxxxxxxxx] [xxxxxxxxxxxx s32 range \| 0 S32_MAX S32_MIN -1 - No refinement if ranges overlap in two intervals. This helps for e.g. consider the following program: call %[bpf_get_prandom_u32]; w0 &= 0xffffffff; if w0 < 0x3 goto 1f; // on fall-through u32 range [3..U32_MAX] if w0 s> 0x1 goto 1f; // on fall-through s32 range [S32_MIN..1] if w0 s< 0x0 goto 1f; // range can be narrowed to [S32_MIN..-1] r10 = 0; 1: ...; The reg_bounds.c selftest is updated to incorporate identical logic, refinement based on non-overflowing range halves: ((x ∩ [0, smax]) ∩ (y ∩ [0, smax])) ∪ ((x ∩ [smin,-1]) ∩ (y ∩ [smin,-1])) Reported-by: Andrea Righi <arighi@nvidia.com> Reported-by: Emil Tsalapatis <emil@etsalapatis.com> Closes: https://lore.kernel.org/bpf/aakqucg4vcujVwif@gpd4/T/ Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260306-bpf-32-bit-range-overflow-v3-1-f7f67e060a6b@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-06 18:16:06 -08:00
Andrey Grodzovsky	a28441dd29	selftests/bpf: Add tests for kprobe.session optimization Extend existing kprobe_multi_test subtests to validate the kprobe.session exact function name optimization: In kprobe_multi_session.c, add test_kprobe_syms which attaches a kprobe.session program to an exact function name (bpf_fentry_test1) exercising the fast syms[] path that bypasses kallsyms parsing. It calls session_check() so bpf_fentry_test1 is hit by both the wildcard and exact probes, and test_session_skel_api validates kprobe_session_result[0] == 4 (entry + return from each probe). In test_attach_api_fails, add fail_7 and fail_8 verifying error code consistency between the wildcard pattern path (slow, parses kallsyms) and the exact function name path (fast, uses syms[] array). Both paths must return -ENOENT for non-existent functions. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260302200837.317907-4-andrey.grodzovsky@crowdstrike.com	2026-03-05 15:14:53 -08:00
Sun Jian	7f20d371fd	selftests/bpf: bpf_cookie: Make perf_event subtest trigger reliably The perf_event subtest relies on SW_CPU_CLOCK sampling to trigger the BPF program, but the current CPU burn loop can be too short on slower systems and may fail to generate any overflow sample. This leaves pe_res unchanged and makes the test flaky. Make burn_cpu() take a loop count and use a longer burn only for the perf_event subtest. Also scope perf_event_open() to the current task to avoid wasting samples on unrelated activity. Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20260228074555.122950-3-sun.jian.kdev@gmail.com	2026-03-05 15:07:40 -08:00
Sun Jian	74d3305e62	selftests/bpf: bpf_cookie: Skip kprobe_multi tests without bpf_testmod The kprobe_multi subtests rely on bpf_testmod fentry ksyms. When bpf_testmod isn't available, libbpf fails to resolve bpf_testmod_fentry_test* and skeleton load fails with -ESRCH, causing false failures. Skip these subtests when env.has_testmod is false. Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20260228074555.122950-2-sun.jian.kdev@gmail.com	2026-03-05 15:07:40 -08:00
Josef Bacik	fefeeec612	selftests/bpf: Add test for btf__add_btf() with split BTF sources Add a test that verifies btf__add_btf() correctly handles merging multiple split BTF objects that share the same base BTF. The test creates two sibling split BTFs on a common base, merges them into a combined split BTF, and validates that base type references are preserved while split type references are properly remapped. Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/64a8c947bff1ae89efa9ba8c099466477762490f.1772657690.git.josef@toxicpanda.com	2026-03-05 15:03:04 -08:00
Linus Torvalds	abacaf5599	Including fixes from CAN, netfilter and wireless. Current release - new code bugs: - sched: cake: fixup cake_mq rate adjustment for diffserv config - wifi: fix missing ieee80211_eml_params member initialization Previous releases - regressions: - tcp: give up on stronger sk_rcvbuf checks (for now) Previous releases - always broken: - net: fix rcu_tasks stall in threaded busypoll - sched: fq: clear q->band_pkt_count[] in fq_reset() - sched: only allow act_ct to bind to clsact/ingress qdiscs and shared blocks - bridge: check relevant per-VLAN options in VLAN range grouping - xsk: fix fragment node deletion to prevent buffer leak Misc: - spring cleanup of inactive maintainers Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmmptYEACgkQMUZtbf5S Irsraw/+L+L512Sbh1UlmbZjhT+AQkERHNkkfUMfXAeVb4uwHOqaydVdffvqRlTT zOK8Oqzqf5ojRezDZ02skXnJTh39MF9IFiugF9JHApxwT2ALv0S7PXPFUJeRQeAY +OiLT5+iy8wMfM6eryL6OtpM9PC8zwzH32oCYd5m4Ixf90Woj5G7x8Vooz7wUg1n 0cAliam8QLIRBrKXqctf7J8n23AK+WcrLcAt58J+qWCGqiiXdJXMvWXv1PjQ7vs/ KZysy0QaGwh3rw+5SquXmXwjhNIvvs58v3NV/4QbBdIKfJ5uYpTpyVgXJBQ6B4Jv 8SATHNwGbuUHuZl8OHn9ysaPCE3ZuD5pMnHbLnbKR6fyic95GxMIx/BNAOVvvwOH l+GWEqch8hy6r+BVAJsoSEJzIf9aqUAlEhy0wEhVOP15yn5RWfMRQKpAaD6JKQYm 0Q6i+PsdS8xaANcUzi1Ec6aqyaX+iIBY6srE/twU3PW23Uv2ejqAG89x4s7t9LPu GdMQ+iAEsR8Auph8Y5mshs4e9MrdlD3jzPCiFhkrqncWl/UcPpBgmHlD80vkTa1/ miMyYG5wq3g9pAFT43aAuoE85K6ZdIW0xGp3wGYMiW8Zy6Ea5EdnM2Wg8kbi/om0 W0pjfcI/2FInsZqK0g/PDeccWFKxl8C1SnfNDvy9rJHBwMkZHm4= =XGBM -----END PGP SIGNATURE----- Merge tag 'net-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from CAN, netfilter and wireless. Current release - new code bugs: - sched: cake: fixup cake_mq rate adjustment for diffserv config - wifi: fix missing ieee80211_eml_params member initialization Previous releases - regressions: - tcp: give up on stronger sk_rcvbuf checks (for now) Previous releases - always broken: - net: fix rcu_tasks stall in threaded busypoll - sched: - fq: clear q->band_pkt_count[] in fq_reset() - only allow act_ct to bind to clsact/ingress qdiscs and shared blocks - bridge: check relevant per-VLAN options in VLAN range grouping - xsk: fix fragment node deletion to prevent buffer leak Misc: - spring cleanup of inactive maintainers" * tag 'net-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits) xdp: produce a warning when calculated tailroom is negative net: enetc: use truesize as XDP RxQ info frag_size libeth, idpf: use truesize as XDP RxQ info frag_size i40e: use xdp.frame_sz as XDP RxQ info frag_size i40e: fix registering XDP RxQ info ice: change XDP RxQ frag_size from DMA write length to xdp.frame_sz ice: fix rxq info registering in mbuf packets xsk: introduce helper to determine rxq->frag_size xdp: use modulo operation to calculate XDP frag tailroom selftests/tc-testing: Add tests exercising act_ife metalist replace behaviour net/sched: act_ife: Fix metalist update behavior selftests: net: add test for IPv4 route with loopback IPv6 nexthop net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthop net: vxlan: fix nd_tbl NULL dereference when IPv6 is disabled net: bridge: fix nd_tbl NULL dereference when IPv6 is disabled MAINTAINERS: remove Thomas Falcon from IBM ibmvnic MAINTAINERS: remove Claudiu Manoil and Alexandre Belloni from Ocelot switch MAINTAINERS: replace Taras Chornyi with Elad Nachman for Marvell Prestera MAINTAINERS: remove Jonathan Lemon from OpenCompute PTP MAINTAINERS: replace Clark Wang with Frank Li for Freescale FEC ...	2026-03-05 11:00:46 -08:00
Feng Yang	e8ae16d65a	selftests/bpf: Add selftests for the invocation of bpf_lwt_xmit_push_encap Calling bpf_lwt_xmit_push_encap will not cause a crash when dst is missing. Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260304094429.168521-3-yangfeng59949@163.com	2026-03-04 16:44:29 -08:00
Viktor Malik	05c9b2eda8	selftests/bpf: Split module_attach into subtests The test verifies attachment to various hooks in a kernel module, however, everything is flattened into a single test. This makes it impossible to run or skip test cases selectively. Isolate each BPF program into a separate subtest. This is done by disabling auto-loading of programs and loading and testing each program separately. At the same time, modernize the test to use ASSERT* instead of CHECK and replace `return` by `goto cleanup` where necessary. Signed-off-by: Viktor Malik <vmalik@redhat.com> Link: https://lore.kernel.org/r/20260225120904.1529112-1-vmalik@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 09:23:57 -08:00
Emil Tsalapatis	422f1efabd	selftests: bpf: Add tests for void global subprogs Add additional testing for void global functions. The tests ensure that calls to void global functions properly keep R0 invalid. Also make sure that exception callbacks still require a return value. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260228184759.108145-6-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:47:23 -08:00
Emil Tsalapatis	8446ded1e1	bpf: Allow void global functions in the verifier Global subprogs are currently not allowed to return void. Adjust verifier logic to allow global functions with a void return type. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260228184759.108145-5-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:47:23 -08:00
Alexis Lothoré (eBPF Foundation)	7242b0951d	selftests/bpf: drop test_bpftool.sh The test_bpftool.sh script runs a python unittest script checking bpftool json output on different commands. As part of the ongoing effort to get rid of any standalone test, this script should either be converted to test_progs or removed. As validating bpftool json output does not bring much value to the test base (and because it would need test_progs to bring in a json parser), remove the standalone test script. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/r/20260227-bpftool_feature-v1-1-a25860fd52fb@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:45:53 -08:00
Kumar Kartikeya Dwivedi	f6312e7175	selftests/bpf: Add tests for ctx fixed offset support Add tests to ensure PTR_TO_CTX supports fixed offsets for program types that don't rewrite accesses to it. Ensure that variable offsets and negative offsets are still rejected. An extra test also checks writing into ctx with modified offset for syscall progs. Other program types do not support writes (notably, writable tracepoints offer a pointer for a writable buffer through ctx, but don't allow writing to the ctx itself). Before the fix made in the previous commit, these tests do not succeed, except the ones testing for failures regardless of the change. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260227005725.1247305-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:45:16 -08:00
Puranjay Mohan	b1d6bd5462	bpf, arm64: Use ORR-based MOV for general-purpose registers The A64_MOV macro unconditionally uses ADD Rd, Rn, #0 to implement register moves. While functionally correct, this is not the canonical encoding when both operands are general-purpose registers. On AArch64, MOV has two aliases depending on the operand registers: - MOV <Xd\|SP>, <Xn\|SP> → ADD <Xd\|SP>, <Xn\|SP>, #0 - MOV <Xd>, <Xn> → ORR <Xd>, XZR, <Xn> The ADD form is required when the stack pointer is involved (as ORR does not accept SP), while the ORR form is the preferred encoding for general-purpose registers. The ORR encoding is also measurably faster on modern microarchitectures. A microbenchmark [1] comparing dependent chains of MOV (ORR) vs ADD #0 on an ARM Neoverse-V2 (72-core, 3.4 GHz) shows: === mov (ORR Xd, XZR, Xn) === run1 cycles/op=0.749859456 run2 cycles/op=0.749991250 run3 cycles/op=0.749601847 avg cycles/op=0.749817518 === add0 (ADD Xd, Xn, #0) === run1 cycles/op=1.004777689 run2 cycles/op=1.004558266 run3 cycles/op=1.004806559 avg cycles/op=1.004714171 The ORR form completes in ~0.75 cycles/op vs ~1.00 cycles/op for ADD #0, a ~25% improvement. This is likely because the CPU's register renaming hardware can eliminate ORR-based moves, while ADD #0 must go through the ALU pipeline. Update A64_MOV to select the appropriate encoding at JIT time: use ADD when either register is A64_SP, and ORR (via aarch64_insn_gen_move_reg()) otherwise. Update verifier_private_stack selftests to expect "mov x7, x0" instead of "add x7, x0, #0x0" in the JITed instruction checks, matching the new ORR-based encoding. [1] https://github.com/puranjaymohan/scripts/blob/main/arm64/bench/run_mov_vs_add0.sh Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20260225134339.2723288-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:43:10 -08:00
Jiri Olsa	0c4fc6bd61	selftests/bpf: Add usdt trigger bench Adding usdt trigger bench for usdt: trig-usdt-nop - usdt on top of nop1 instruction trig-usdt-nop5 - usdt on top of nop1/nop5 combo Adding it to benchs/run_bench_uprobes.sh script. Example run on x86_64 kernel with uprobe syscall: # ./benchs/run_bench_uprobes.sh usermode-count : 152.507 ± 0.098M/s syscall-count : 14.309 ± 0.093M/s uprobe-nop : 3.190 ± 0.012M/s uprobe-push : 3.057 ± 0.004M/s uprobe-ret : 1.095 ± 0.009M/s uprobe-nop5 : 7.305 ± 0.034M/s uretprobe-nop : 2.175 ± 0.005M/s uretprobe-push : 2.109 ± 0.003M/s uretprobe-ret : 0.945 ± 0.002M/s uretprobe-nop5 : 3.530 ± 0.006M/s usdt-nop : 3.235 ± 0.008M/s <-- added usdt-nop5 : 7.511 ± 0.045M/s <-- added Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20260224103915.1369690-6-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:39:22 -08:00
Jiri Olsa	304841967c	selftests/bpf: Add test for checking correct nop of optimized usdt Adding test that attaches bpf program on usdt probe in 2 scenarios; - attach program on top of usdt_1, which is single nop instruction, so the probe stays on nop instruction and is not optimized. - attach program on top of usdt_2 which is probe defined on top of nop,nop5 combo, so the probe is placed on top of nop5 and is optimized. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20260224103915.1369690-5-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:39:22 -08:00
Jiri Olsa	0c178e9deb	selftests/bpf: Emit nop,nop5 instructions combo for x86_64 arch Syncing latest usdt.h change [1]. Now that we have nop5 optimization support in kernel, let's emit nop,nop5 for usdt probe. We leave it up to the library to use desirable nop instruction. [1] `c9865d1589` Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20260224103915.1369690-4-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:39:22 -08:00
Menglong Dong	c1eee8d1e8	selftests/bpf: factor out get_func_* tests for fsession The fsession is already supported by x86_64, arm64, riscv and s390, so we don't need to disable it in the compile time according to the architecture. Factor out the testings for it. Therefore, the testing can be disabled for the architecture that doesn't support it manually. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20260224092208.1395085-4-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:36:26 -08:00
Ilya Leoshkevich	0f87614c4d	bpf/s390: Implement get_preempt_count() exe_ctx test fails on s390, because get_preempt_count() is not implemented and its fallback path always returns 0. Implement it using the new bpf_get_lowcore() kfunc. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20260217160813.100855-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:35:07 -08:00
Ilya Leoshkevich	6fe54677bc	s390: Introduce bpf_get_lowcore() kfunc Implementing BPF version of preempt_count() requires accessing lowcore from BPF. Since lowcore can be relocated, open-coding (struct lowcore *)0 does not work, so add a kfunc. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20260217160813.100855-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-03 08:35:07 -08:00
Jiayuan Chen	181cafbd8a	selftests/bpf: add test for xdp_bonding xmit_hash_policy compat Add a selftest to verify that changing xmit_hash_policy to vlan+srcmac is rejected when a native XDP program is loaded on a bond in 802.3ad mode. Without the fix in bond_option_xmit_hash_policy_set(), the change succeeds silently, creating an inconsistent state that triggers a kernel WARNING in dev_xdp_uninstall() when the bond is torn down. The test attaches native XDP to a bond0 (802.3ad, layer2+3), then attempts to switch xmit_hash_policy to vlan+srcmac and asserts the operation fails. It also verifies the change succeeds after XDP is detached, confirming the rejection is specific to the XDP-loaded state. Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com> Link: https://patch.msgid.link/20260226080306.98766-3-jiayuan.chen@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-03-03 10:47:38 +01:00
Alexei Starovoitov	309d8808ee	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf before 7.0-rc2 Cross-merge BPF and other fixes after downstream PR. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-03-01 09:04:00 -08:00
Paul Chaignon	024cea2d64	selftests/bpf: Avoid simplification of crafted bounds test The reg_bounds_crafted tests validate the verifier's range analysis logic. They focus on the actual ranges and thus ignore the tnum. As a consequence, they carry the assumption that the tested cases can be reproduced in userspace without using the tnum information. Unfortunately, the previous change the refinement logic breaks that assumption for one test case: (u64)2147483648 (u32)<op> [4294967294; 0x100000000] The tested bytecode is shown below. Without our previous improvement, on the false branch of the condition, R7 is only known to have u64 range [0xfffffffe; 0x100000000]. With our improvement, and using the tnum information, we can deduce that R7 equals 0x100000000. 19: (bc) w0 = w6 ; R6=0x80000000 20: (bc) w0 = w7 ; R7=scalar(smin=umin=0xfffffffe,smax=umax=0x100000000,smin32=-2,smax32=0,var_off=(0x0; 0x1ffffffff)) 21: (be) if w6 <= w7 goto pc+3 ; R6=0x80000000 R7=0x100000000 R7's tnum is (0; 0x1ffffffff). On the false branch, regs_refine_cond_op refines R7's u32 range to [0; 0x7fffffff]. Then, __reg32_deduce_bounds refines the s32 range to 0 using u32 and finally also sets u32=0. From this, __reg_bound_offset improves the tnum to (0; 0x100000000). Finally, our previous patch uses this new tnum to deduce that it only intersect with u64=[0xfffffffe; 0x100000000] in a single value: 0x100000000. Because the verifier uses the tnum to reach this constant value, the selftest is unable to reproduce it by only simulating ranges. The solution implemented in this patch is to change the test case such that there is more than one overlap value between u64 and the tnum. The max. u64 value is thus changed from 0x100000000 to 0x300000000. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/50641c6a7ef39520595dcafa605692427c1006ec.1772225741.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-27 16:11:50 -08:00
Paul Chaignon	e6ad477d1b	selftests/bpf: Test refinement of single-value tnum This patch introduces selftests to cover the new bounds refinement logic introduced in the previous patch. Without the previous patch, the first two tests fail because of the invariant violation they trigger. The last test fails because the R10 access is not detected as dead code. In addition, all three tests fail because of R0 having a non-constant value in the verifier logs. In addition, the last two cases are covering the negative cases: when we shouldn't refine the bounds because the u64 and tnum overlap in at least two values. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/90d880c8cf587b9f7dc715d8961cd1b8111d01a8.1772225741.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-27 16:11:50 -08:00
Kumar Kartikeya Dwivedi	2939d7b3b0	selftests/bpf: Add tests for special fields races Add a couple of tests to ensure that the refcount drops to zero when we exercise the race where creation of a special field succeeds the logical bpf_obj_free_fields done when deleting an element. Prior to previous changes, the fields would be freed eagerly and repopulate and end up leaking, causing the reference to not drop down correctly. Running this test on a kernel without fixes will cause a hang in delete_module, since the module reference stays active due to the leaked kptr not dropping it. After the fixes tests succeed as expected. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260227224806.646888-6-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-27 15:39:00 -08:00
T.J. Mercier	6881af27f9	selftests/bpf: Fix OOB read in dmabuf_collector Dmabuf name allocations can be less than DMA_BUF_NAME_LEN characters, but bpf_probe_read_kernel always tries to read exactly that many bytes. If a name is less than DMA_BUF_NAME_LEN characters, bpf_probe_read_kernel will read past the end. bpf_probe_read_kernel_str stops at the first NUL terminator so use it instead, like iter_dmabuf_for_each already does. Fixes: `ae5d2c59ec` ("selftests/bpf: Add test for dmabuf_iter") Reported-by: Jerome Lee <jaewookl@quicinc.com> Signed-off-by: T.J. Mercier <tjmercier@google.com> Link: https://lore.kernel.org/r/20260225003349.113746-1-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-26 11:28:04 -08:00
Ihor Solodrai	60e3cbef35	selftests/bpf: Fix a memory leak in xdp_flowtable test test_progs run with ASAN reported [1]: ==126==ERROR: LeakSanitizer: detected memory leaks Direct leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7f1ff3cfa340 in calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77 #1 0x5610c15bb520 in bpf_program_attach_fd /codebuild/output/src685977285/src/actions-runner/_work/vmtest/vmtest/src/tools/lib/bpf/libbpf.c:13164 #2 0x5610c15bb740 in bpf_program__attach_xdp /codebuild/output/src685977285/src/actions-runner/_work/vmtest/vmtest/src/tools/lib/bpf/libbpf.c:13204 #3 0x5610c14f91d3 in test_xdp_flowtable /codebuild/output/src685977285/src/actions-runner/_work/vmtest/vmtest/src/tools/testing/selftests/bpf/prog_tests/xdp_flowtable.c:138 #4 0x5610c1533566 in run_one_test /codebuild/output/src685977285/src/actions-runner/_work/vmtest/vmtest/src/tools/testing/selftests/bpf/test_progs.c:1406 #5 0x5610c1537fb0 in main /codebuild/output/src685977285/src/actions-runner/_work/vmtest/vmtest/src/tools/testing/selftests/bpf/test_progs.c:2097 #6 0x7f1ff25df1c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb) #7 0x7f1ff25df28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb) #8 0x5610c0bd3180 in _start (/tmp/work/vmtest/vmtest/selftests/bpf/test_progs+0x593180) (BuildId: cdf9f103f42307dc0a2cd6cfc8afcbc1366cf8bd) Fix by properly destroying bpf_link on exit in xdp_flowtable test. [1] https://github.com/kernel-patches/vmtest/actions/runs/22361085418/job/64716490680 Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/20260225003351.465104-1-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-26 11:27:08 -08:00
Hari Bathini	8ebfe65e22	selftests/bpf: Test accounting of tail calls when prog is NULL Test whether tail call count is incorrectly accounted for, when the tail call fails due to a missing BPF program. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/r/20260216090802.1805655-1-hbathini@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 17:09:27 -08:00
Ihor Solodrai	c89b50cc6b	selftests/bpf: Fix flakiness of task_local_storage/sys_enter_exit The test_sys_enter_exit test was setting target_pid before attaching the BPF programs, which causes syscalls made during the attach phase to be counted. This is flaky because, apparently, there is no guarantee that both on_enter and on_exit will trigger during the attachment. Move the target_pid assignment to after task_local_storage__attach() so that only explicit sys_gettid() calls are counted. Reported-by: BPF CI Bot (Claude Opus 4.6) <bot+bpf-ci@kernel.org> Closes: https://github.com/kernel-patches/vmtest/issues/448 Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260224211202.214325-1-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 14:53:28 -08:00
Ihor Solodrai	4c9d07865c	selftests/bpf: Don't override SIGSEGV handler with ASAN test_progs has custom SIGSEGV handler, which interferes with the address sanitizer [1]. Add an #ifndef to avoid this. Additionally, declare an __asan_on_error() to dump the test logs in the same way it happens in the custom SIGSEGV handler. [1] https://lore.kernel.org/bpf/73d832948b01dbc0ebc60d85574bdf8537f3a810.camel@gmail.com/ Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223191118.655185-3-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	a2714e7303	selftests/bpf: Check BPFTOOL env var in detect_bpftool_path() The bpftool_maps_access and bpftool_metadata tests may fail on BPF CI with "command not found", depending on a workflow. This happens because detect_bpftool_path() only checks two hardcoded relative paths: - ./tools/sbin/bpftool - ../tools/sbin/bpftool Add support for a BPFTOOL environment variable that allows specifying the exact path to the bpftool binary. Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223191118.655185-2-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	ad90ecedad	selftests/bpf: Fix out-of-bounds array access bugs reported by ASAN - kmem_cache_iter: remove unnecessary debug output - lwt_seg6local: change the type of foobar to char[] - the sizeof(foobar) returned the pointer size and not a string length as intended - verifier_log: increase prog_name buffer size in verif_log_subtest() - compiler has a conservative estimate of fixed_log_sz value, making ASAN complain on snprint() call Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223191118.655185-1-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	3e711c8e47	selftests/bpf: Fix array bounds warning in jit_disasm_helpers Compiler cannot infer upper bound for labels.cnt and warns about potential buffer overflow in snprintf. Add an explicit bounds check (... && i < MAX_LOCAL_LABELS) in the loop condition to fix the warning. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-18-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	2bb270a0ac	selftests/bpf: Free bpf_object in test_sysctl ASAN reported a resource leak due to the bpf_object not being tracked in test_sysctl. Add obj field to struct sysctl_test to properly clean it up. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-17-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	71dca2950f	selftests/bpf: Fix resource leaks caused by missing cleanups ASAN reported a number of resource leaks: - Add missing *__destroy(skel) calls - Replace bpf_link__detach() with bpf_link__destroy() where appropriate - cgrp_local_storage: Add bpf_link__destroy() when bpf_iter_create fails - dynptr: Add missing bpf_object__close() Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-16-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	68b8bea1ee	selftests/bpf: Fix double thread join in uprobe_multi_test ASAN reported a "joining already joined thread" error. The release_child() may be called multiple times for the same struct child. Fix by resetting child->thread to 0 after pthread_join. Also memset(0) static child variable in test_attach_api(). Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-15-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	f7ac552be7	selftests/bpf: Fix use-after-free in xdp_metadata test ASAN reported a use-after-free in close_xsk(). The xsk->socket internally references xsk->umem via socket->ctx->umem, so the socket must be deleted before the umem. Fix the order of operations in close_xsk(). Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-14-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	ff9511410d	veristat: Fix a memory leak for preset ENUMERATOR ASAN detected a memory leak in veristat. The cleanup code handling ENUMERATOR value missed freeing strdup-ed svalue. Fix it. Acked-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-13-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	3eb4a2e399	selftests/bpf: Fix cleanup in check_fd_array_cnt__fd_array_too_big() The Close() macro uses the passed in expression three times, which leads to repeated execution in case it has side effects. That is, Close(i--) would decrement i three times. ASAN caught a stack-buffer-undeflow error at a point where this was overlooked. Fix it. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-12-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	9d0272c91f	selftests/bpf: Fix memory leaks in tests Fix trivial memory leaks detected by userspace ASAN: - htab_update: free value buffer in test_reenter_update cleanup - test_xsk: inline pkt_stream_replace() in testapp_stats_rx_full() and testapp_stats_fill_empty() - testing_helpers: free buffer allocated by getline() in parse_test_list_file Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-11-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	a1a771bd64	selftests/bpf: Refactor bpf_get_ksyms() trace helper ASAN reported a memory leak in bpf_get_ksyms(): it allocates a struct ksyms internally and never frees it. Move struct ksyms to trace_helpers.h and return it from the bpf_get_ksyms(), giving ownership to the caller. Add filtered_syms and filtered_cnt fields to the ksyms to hold the filtered array of symbols, previously returned by bpf_get_ksyms(). Fixup the call sites: kprobe_multi_test and bench_trigger. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260223190736.649171-10-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	45897ced3c	selftests/bpf: Add DENYLIST.asan Add a denylist file for tests that should be skipped when built with userspace ASAN: $ make ... SAN_CFLAGS="-fsanitize=address -fno-omit-frame-pointer" Skip the following tests: - arena: userspace ASAN does not understand BPF arena maps and gets confused particularly when map_extra is non-zero - non-zero map_extra leads to mmap with MAP_FIXED, and ASAN treats this as an unknown memory region - task_local_data: ASAN complains about "incorrect" aligned_alloc() usage, but it's intentional in the test - uprobe_multi_test: very slow with ASAN enabled Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-9-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	4021848a90	selftests/bpf: Pass through build flags to bpftool and resolve_btfids EXTRA_* and SAN_* build flags were not correctly propagated to bpftool and resolve_btids when building selftests/bpf. This led to various build errors on attempt to build with SAN_CFLAGS="-fsanitize=address", for example. Fix the makefiles to address this: - Pass SAN_CFLAGS/SAN_LDFLAGS to bpftool and resolve_btfids build - Propagate EXTRA_LDFLAGS to resolve_btfids link command - Use pkg-config to detect zlib and zstd for resolve_btfids, similar libelf handling Also check for ASAN flag in selftests/bpf/Makefile for convenience. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-7-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	9d8685239e	selftests/bpf: Use memcpy() for bounded non-NULL-terminated copies Replace strncpy() with memcpy() in cases where the source is non-NULL-terminated and the copy length is known. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-6-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:49 -08:00
Ihor Solodrai	3ed0bc2d49	selftests/bpf: Use strscpy in bpftool_helpers.c Replace strncpy() calls in bpftool_helpers.c with strscpy(). Pass the destination buffer size to detect_bpftool_path() instead of hardcoding BPFTOOL_PATH_MAX_LEN. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-5-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:48 -08:00
Ihor Solodrai	43277d8f57	selftests/bpf: Replace strncpy() with strscpy() strncpy() does not guarantee NULL-termination and is considered deprecated [1]. Replace strncpy() calls with strscpy(). [1] https://docs.kernel.org/process/deprecated.html#strncpy-on-nul-terminated-strings Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-4-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-24 08:19:33 -08:00
Ihor Solodrai	6a087ac23f	selftests/bpf: Replace strcpy() calls with strscpy() strcpy() does not perform bounds checking and is considered deprecated [1]. Replace strcpy() calls with strscpy() defined in bpf_util.h. [1] https://docs.kernel.org/process/deprecated.html#strcpy Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-3-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-23 18:40:07 -08:00
Ihor Solodrai	63c49efc98	selftests/bpf: Add simple strscpy() implementation Replace bpf_strlcpy() in bpf_util.h with a sized_strscpy(), which is a simplified sized_strscpy() from the kernel (lib/string.c [1]). It: * takes a count (destination size) parameter * guarantees NULL-termination * returns the number of characters copied or -E2BIG Re-define strscpy macro similar to in-kernel implementation [2]: allow the count parameter to be optional. Add #ifdef-s to tools/include/linux/args.h, as they may be defined in other system headers (for example, __CONCAT in sys/cdefs.h). Fixup the single existing bpf_strlcpy() call in cgroup_helpers.c [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/string.c?h=v6.19#n113 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/string.h?h=v6.19#n91 Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260223190736.649171-2-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-23 18:40:07 -08:00
Amery Hung	055d8dd553	selftests/bpf: Update task local data license Change the license of task local data mini library to LGPL-2.1 or BSD-2-Clause to allow it being in a wider range of projects. Signed-off-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260219225849.2426421-1-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-23 18:06:47 -08:00
Abhishek Dubey	99726ece0c	selftests/bpf: Enable test for instruction array on arm64 As arm64 JIT now supports instruction array, make sure all relevant tests run on this architecture. Summary: 1/9 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Abhishek Dubey <adubey@linux.ibm.com> Acked-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20260223203511.118475-1-adubey@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-23 18:05:07 -08:00
Kaitao Cheng	ed8fa4b889	selftests/bpf: Add test case for rbtree nodes that contain both bpf_refcount and kptr fields. Allow bpf_kptr_xchg to directly operate on pointers marked with NON_OWN_REF \| MEM_RCU. In the example demonstrated in this patch, as long as "struct bpf_refcount ref" exists, the __kptr pointer is guaranteed to carry the MEM_RCU flag. The ref member itself does not need to be explicitly used. Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn> Link: https://lore.kernel.org/r/20260214124042.62229-6-pilgrimtao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-23 17:37:06 -08:00
Kaitao Cheng	580fa3430b	selftests/bpf: Add supplementary tests for bpf_kptr_xchg 1. Allow using bpf_kptr_xchg while holding a lock. 2. When the rb_node contains a __kptr pointer, we do not need to perform a remove-read-add operation. This patch implements the following workflow: 1. Construct a rbtree with 16 elements. 2. Traverse the rbtree, locate the kptr pointer in the target node, and read the content pointed to by the pointer. 3. Remove all nodes from the rbtree. Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn> Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Link: https://lore.kernel.org/r/20260214124042.62229-4-pilgrimtao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-23 17:37:06 -08:00
Gregory Bell	18a1d365e8	selftests/bpf: align build_id test mapping to 64K page size Some architectures require mappings to be aligned to the system page size. The build_id selftest currently uses a smaller alignment, which can result in madvise operations executing on a different page than intended. Increase the mapping alignment to 64K so the buffer is page-aligned on all supported architectures. Signed-off-by: Gregory Bell <grbell@redhat.com> Link: https://lore.kernel.org/r/93543253b32d1cb178ab6e31e4291e387ba1c372.1771338492.git.grbell@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-19 11:29:41 -08:00
Gregory Bell	d820fa3114	selftests/bpf: fix flaky build_id test The build_id selftest occasionally fails because MADV_PAGEOUT does not guarantee the immediate eviction of the page. The test assumes eviction happens and proceeds without verifying that the page was actually reclaimed, leading to false test failures. Fix the test by retrying the page-out sequence until eviction is successful, instead of relying on a single MADV_PAGEOUT attempt. Signed-off-by: Gregory Bell <grbell@redhat.com> Link: https://lore.kernel.org/r/038bd27c69dd3a16958894fcb19e4fb6fbfe317e.1771338492.git.grbell@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-19 11:29:41 -08:00
Alexei Starovoitov	b0a67f310b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf before 7.0-rc1 Cross-merge BPF and other fixes after downstream PR. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-19 11:08:53 -08:00
Matthieu Baerts (NGI0)	1e5c009126	selftests/bpf: Remove hexdump dependency The verification signature header generation requires converting a binary certificate to a C array. Previously this only worked with xxd, and a switch to hexdump has been done in commit `b640d556a2` ("selftests/bpf: Remove xxd util dependency"). hexdump is a more common utility program, yet it might not be installed by default. When it is not installed, BPF selftests build without errors, but tests_progs is unusable: it exits with the 255 code and without any error messages. When manually reproducing the issue, it is not too hard to find out that the generated verification_cert.h file is incorrect, but that's time consuming. When digging the BPF selftests build logs, this line can be seen amongst thousands others, but ignored: /bin/sh: 2: hexdump: not found Here, od is used instead of hexdump. od is coming from the coreutils package, and this new od command produces the same output when using od from GNU coreutils, uutils, and even busybox. This is more portable, and it produces a similar results to what was done before with hexdump: there is an extra comma at the end instead of trailing whitespaces, but the C code is not impacted. Fixes: `b640d556a2` ("selftests/bpf: Remove xxd util dependency") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20260218-bpf-sft-hexdump-od-v2-1-2f9b3ee5ab86@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-18 15:11:20 -08:00
Ihor Solodrai	b3dfa128f7	selftests/bpf: Use vmlinux.h in test_xdp_meta - Replace linux/* includes with vmlinux.h - Include errno.h - Include bpf_tracing_net.h for TC_ACT_* and ETH_* - Use BPF_STDERR instead of BPF_STREAM_STDERR Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260218215651.2057673-2-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-18 14:58:47 -08:00
Saket Kumar Bhaskar	4c51f90d45	selftests/bpf: Add powerpc support for get_preempt_count() in selftest get_preempt_count() is enabled to return preempt_count for powerpc, so that bpf_in_interrupt()/bpf_in_nmi()/bpf_in_serving_softirq()/ bpf_in_task()/bpf_in_hardirq()/get_preempt_count() works for powerpc as well. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/r/20260212092558.370623-1-skb99@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-13 14:42:38 -08:00
Eduard Zingerman	022ac07508	bpf: use reg->var_off instead of reg->off for pointers This commit consolidates static and varying pointer offset tracking logic. All offsets are now represented solely using `.var_off` and min/max fields. The reasons are twofold: - This simplifies pointer tracking code, as each relevant function needs to check the `.var_off` field anyway. - It makes it easier to widen pointer registers for the purpose of loop convergence checks, by forgoing the `regsafe()` logic demanding `.off` fields to be identical. The changes are spread across many functions and are hard to group into smaller patches. Some of the logical changes include: - Checks in __check_ptr_off_reg() are reordered so that the tnum_is_const() check is done before operating on reg->var_off.value. - check_packet_access() now uses check_mem_region_access() to handle possible 'off' overflow cases. - In check_helper_mem_access() utility functions like check_packet_access() are now called with 'off=0', as these utility functions now account for the complete register offset range. - In check_reg_type() a call to __check_ptr_off_reg() is added before a call to btf_struct_ids_match(). This prevents btf_struct_ids_match() from potentially working on non-constant reg->var_off.value. - regsafe() is relaxed to avoid comparing '.off' field for pointers. As a precaution, the changes are verified in [1] by adding a pass checking that no pointer has non-zero '.off' field on each do_check_insn() iteration. [1] https://github.com/eddyz87/bpf/tree/ptrs-off-migration Notable selftests changes: - `.var_off` value changed because it now combines static and varying offsets. Affected tests: - linked_list/incorrect_node_var_off - linked_list/incorrect_head_var_off2 - verifier_align/packet_variable_offset - Overflowing `smax_value` bound leads to a pointer with big negative or positive offset to be rejected immediately (previously overflowing `rX += const` instruction updated `.off` field avoiding the overflow). Affected tests: - verifier_align/dubious_pointer_arithmetic - verifier_bounds/var_off_insn_off_test1 - Invalid access to packet now reports full offset inside a packet. Affected tests: - verifier_direct_packet_access/test23_x_pkt_ptr_4 - A change in check_mem_region_access() behavior: when register `.smin_value` is negative, it reports "rX min value is negative..." before calling into __check_mem_access() which reports "invalid access to ...". In the tests below, the `.off` field was negative, while `.smin_value` remained positive. This is no longer the case after the changes in this commit. Affected tests: - verifier_gotox/jump_table_invalid_mem_acceess_neg - verifier_helper_packet_access/test15_cls_helper_fail_sub - verifier_helper_value_access/imm_out_of_bound_2 - verifier_helper_value_access/reg_out_of_bound_2 - verifier_meta_access/meta_access_test2 - verifier_value_ptr_arith/known_scalar_from_different_maps - lower_oob_arith_test_1 - value_ptr_known_scalar_3 - access_value_ptr_known_scalar - Usage of check_mem_region_access() instead of __check_mem_access() in check_packet_access() changes the reported message from "rX offset is outside ..." to "rX min/max value is outside ...". Affected tests: - verifier_xdp_direct_packet_access/* - In check_func_arg_reg_off() the check for zero offset now operates on `.var_off` field instead of `.off` field. For tests where the pattern looks like `kfunc(reg_with_var_off, ...)`, this changes the reported error: - previously the error "variable ... access ... disallowed" was reported by __check_ptr_off_reg(); - now "R1 must have zero offset ..." is reported by check_func_arg_reg_off() itself. Affected tests: - verifier/calls.c "calls: invalid kfunc call: PTR_TO_BTF_ID with variable offset" Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260212-ptrs-off-migration-v2-2-00820e4d3438@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-13 14:41:22 -08:00
Menglong Dong	0265c1fd91	selftests/bpf: enable fsession_test on riscv64 Now that the RISC-V trampoline JIT supports BPF_TRACE_FSESSION, run the fsession selftest on riscv64 as well as x86_64. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Tested-by: Björn Töpel <bjorn@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20260208053311.698352-4-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-13 14:14:27 -08:00
Kumar Kartikeya Dwivedi	2669dde7a8	selftests/bpf: Fix map_kptr grace period wait Commit `c27cea4416` ("rcu: Re-implement RCU Tasks Trace in terms of SRCU-fast") broke map_kptr selftest since it removed the function we were kprobing. Use a new kfunc that invokes call_rcu_tasks_trace and sets a program provided pointer to an integer to 1. Technically this can be unsafe if the memory being written to from the callback disappears, but this is just for usage in a test where we ensure we spin until we see the value to be set to 1, so it's ok. Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Fixes: `c27cea4416` ("rcu: Re-implement RCU Tasks Trace in terms of SRCU-fast") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260211185747.3630539-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-13 14:14:27 -08:00
Ihor Solodrai	48f624c3dc	selftests/bpf: Adjust selftest due to function rename do_filp_open() was renamed in commit `541003b576` ("rename do_filp_open() to do_file_open()") This broke test_profiler, because it uses a kretprobe on that function. Fix it by renaming accordingly. Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Closes: https://lore.kernel.org/bpf/djwjf2vfb7gro3rfag666bojod6ytcectahnb5z6hx2hawimtj@sx47ghzjg4lw/ Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260210235855.215679-1-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-13 14:14:27 -08:00
Eduard Zingerman	f632de6e19	selftests/bpf: Migrate align.c tests to test_loader framework While working on pointer tracking changes I found it necessary to update expected log messages in align.c series of tests. As a preliminary step, migrate these tests to test_loader framework. The tests in question load BPF program and check if expected log is produced, the log is specified as: .matches = { ... {4, "R3", "32"}, ... } Where: - '4' is an instruction number (contrary to the field name in struct bpf_reg_match). - 'R3' is the name of the register to check. - '32' is the value expected for this register. Mimic the same logic using __msg macro. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260211051310.2782558-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-13 14:12:28 -08:00
Linus Torvalds	136114e0ab	mm.git review status for linus..mm-nonmm-stable Total patches: 107 Reviews/patch: 1.07 Reviewed rate: 67% - The 2 patch series "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" from Heming Zhao saves disk space by teaching ocfs2 to reclaim suballocator block group space. - The 4 patch series "Add ARRAY_END(), and use it to fix off-by-one bugs" from Alejandro Colomar adds the ARRAY_END() macro and uses it in various places. - The 2 patch series "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" from Pnina Feder makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size. - The 7 patch series "kallsyms: Prevent invalid access when showing module buildid" from Petr Mladek cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces. - The 3 patch series "Address page fault in ima_restore_measurement_list()" from Harshit Mogalapalli fixes a kexec-related crash that can occur when booting the second-stage kernel on x86. - The 6 patch series "kho: ABI headers and Documentation updates" from Mike Rapoport updates the kexec handover ABI documentation. - The 4 patch series "Align atomic storage" from Finn Thain adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh. - The 2 patch series "kho: clean up page initialization logic" from Pratyush Yadav simplifies the page initialization logic in kho_restore_page(). - The 6 patch series "Unload linux/kernel.h" from Yury Norov moves several things out of kernel.h and into more appropriate places. - The 7 patch series "don't abuse task_struct.group_leader" from Oleg Nesterov removes the usage of ->group_leader when it is "obviously unnecessary". - The 5 patch series "list private v2 & luo flb" from Pasha Tatashin adds some infrastructure improvements to the live update orchestrator. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY4giAAKCRDdBJ7gKXxA jgusAQDnKkP8UWTqXPC1jI+OrDJGU5ciAx8lzLeBVqMKzoYk9AD/TlhT2Nlx+Ef6 0HCUHUD0FMvAw/7/Dfc6ZKxwBEIxyww= =mmsH -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ...	2026-02-12 12:13:01 -08:00
Amery Hung	97b859b5ed	selftests/bpf: Fix outdated test on storage->smap bpf_local_storage_free() already does not rely on local_storage->smap since switching to kmalloc_nolock(). As local_storage->smap is removed, fix the outdated test by dropping the local_storage->smap check. Keep the second map in task local storage map test to test that multiple elements can be added to the storage similar to sk storage test. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260205222916.1788211-18-ameryhung@gmail.com	2026-02-06 14:48:05 -08:00
Amery Hung	cdce7b0848	selftests/bpf: Choose another percpu variable in bpf for btf_dump test bpf_cgrp_storage_busy has been removed. Use bpf_bprintf_nest_level instead. This percpu variable is also in the bpf subsystem so that if it is removed in the future, BPF-CI will catch this type of CI- breaking change. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260205222916.1788211-17-ameryhung@gmail.com	2026-02-06 14:48:05 -08:00
Amery Hung	e02cf06b85	selftests/bpf: Remove test_task_storage_map_stress_lookup Remove a test in test_maps that checks if the updating of the percpu counter in task local storage map is preemption safe as the percpu counter is now removed. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260205222916.1788211-16-ameryhung@gmail.com	2026-02-06 14:48:05 -08:00
Amery Hung	902a79b638	selftests/bpf: Update task_local_storage/task_storage_nodeadlock test Adjust the error code we are checking against as bpf_task_storage_delete() now returns -EDEADLK or -ETIMEDOUT when deadlock happens. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260205222916.1788211-15-ameryhung@gmail.com	2026-02-06 14:48:05 -08:00
Amery Hung	e4772031d1	selftests/bpf: Update task_local_storage/recursion test Update the expected result of the selftest as recursion of task local storage syscall and helpers have been relaxed. Now that the percpu counter is removed, task local storage helpers, bpf_task_storage_get() and bpf_task_storage_delete() can now run on the same CPU at the same time unless they cause deadlock. Note that since there is no percpu counter preventing recursion in task local storage helpers, bpf_trampoline now catches the recursion of on_update as reported by recursion_misses. on_enter: tp_btf/sys_enter on_update: fentry/bpf_local_storage_update Old behavior New behavior ____________ ____________ on_enter on_enter bpf_task_storage_get(&map_a) bpf_task_storage_get(&map_a) bpf_task_storage_trylock succeed bpf_local_storage_update(&map_a) bpf_local_storage_update(&map_a) on_update on_update bpf_task_storage_get(&map_a) bpf_task_storage_get(&map_a) bpf_task_storage_trylock fail on_update::misses++ (1) return NULL create and return map_a::ptr map_a::ptr += 1 (1) bpf_task_storage_delete(&map_a) return 0 bpf_task_storage_get(&map_b) bpf_task_storage_get(&map_b) bpf_task_storage_trylock fail on_update::misses++ (2) return NULL create and return map_b::ptr map_b::ptr += 1 (1) create and return map_a::ptr create and return map_a::ptr map_a::ptr = 200 map_a::ptr = 200 bpf_task_storage_get(&map_b) bpf_task_storage_get(&map_b) bpf_task_storage_trylock succeed lockless lookup succeed bpf_local_storage_update(&map_b) return map_b::ptr on_update bpf_task_storage_get(&map_a) bpf_task_storage_trylock fail lockless lookup succeed return map_a::ptr map_a::ptr += 1 (201) bpf_task_storage_delete(&map_a) bpf_task_storage_trylock fail return -EBUSY nr_del_errs++ (1) bpf_task_storage_get(&map_b) bpf_task_storage_trylock fail return NULL create and return ptr map_b::ptr = 100 Expected result: map_a::ptr = 201 map_a::ptr = 200 map_b::ptr = 100 map_b::ptr = 1 nr_del_err = 1 nr_del_err = 0 on_update::recursion_misses = 0 on_update::recursion_misses = 2 On_enter::recursion_misses = 0 on_enter::recursion_misses = 0 Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260205222916.1788211-14-ameryhung@gmail.com	2026-02-06 14:48:05 -08:00
Amery Hung	d652f425d5	selftests/bpf: Update sk_storage_omem_uncharge test Check sk_omem_alloc when the caller of bpf_local_storage_destroy() returns. bpf_local_storage_destroy() now returns the memory to uncharge to the caller instead of directly uncharge. Therefore, in the sk_storage_omem_uncharge, check sk_omem_alloc when bpf_sk_storage_free() returns instead of bpf_local_storage_destroy(). Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260205222916.1788211-13-ameryhung@gmail.com	2026-02-06 14:48:04 -08:00
Larysa Zaremba	88af9fefed	selftests/xsk: fix number of Tx frags in invalid packet The issue occurs in TOO_MANY_FRAGS test case when xdp_zc_max_segs is set to an odd number. TOO_MANY_FRAGS test case contains an invalid packet consisting of (xdp_zc_max_segs) frags. Every frag, even the last one has XDP_PKT_CONTD flag set. This packet is expected to be dropped. After that, there is a valid linear packet, which is expected to be received back. Once (xdp_zc_max_segs) is an odd number, the last packet cannot be received, if packet forwarding between Rx and Tx interfaces relies on the ethernet header, e.g. checks for ETH_P_LOOPBACK. Packet is malformed, if all traffic is looped. Turns out, sending function processes multiple invalid frags as if they were in 2-frag packets. So once the invalid mbuf packet contains an odd number of those, the valid packet after gets paired with the previous invalid descriptor, and hence does not get an ethernet header generated, so it is either dropped or malformed. Make invalid packets in verbatim mode always have only a single frag. For such packets, number of frags is otherwise meaningless, as descriptor flags are pre-configured in verbatim mode and packet data is not generated for invalid descriptors. Fixes: `697604492b` ("selftests/xsk: add invalid descriptor test for multi-buffer") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20260203155103.2305816-3-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-06 09:36:36 -08:00
Larysa Zaremba	42e41b2a0a	selftests/xsk: properly handle batch ending in the middle of a packet Referenced commit reduced the scope of the variable pkt, so now it has to be reinitialized via pkt_stream_get_next_rx_pkt(), which also increments some counters. When the packet is interrupted by the batch ending, pkt stream therefore proceeds to the next packet, while xsk ring still contains the previous one, this results in a pkt_nb mismatch. Decrement the affected counters when packet is interrupted. Fixes: `8913e653e9` ("selftests/xsk: Iterate over all the sockets in the receive pkts function") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20260203155103.2305816-2-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-06 09:36:36 -08:00
Puranjay Mohan	47fcf4dc0a	selftests/bpf: Add tests for improved linked register tracking Add tests for linked register tracking with negative offsets, BPF_SUB, and alu32. These test for all edge cases like overflows, etc. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260204151741.2678118-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-04 13:35:29 -08:00
Puranjay Mohan	7a433e5193	bpf: Support negative offsets, BPF_SUB, and alu32 for linked register tracking Previously, the verifier only tracked positive constant deltas between linked registers using BPF_ADD. This limitation meant patterns like: r1 = r0; r1 += -4; if r1 s>= 0 goto l0_%=; // r1 >= 0 implies r0 >= 4 // verifier couldn't propagate bounds back to r0 if r0 != 0 goto l0_%=; r0 /= 0; // Verifier thinks this is reachable l0_%=: Similar limitation exists for 32-bit registers. With this change, the verifier can now track negative deltas in reg->off enabling bound propagation for the above pattern. For alu32, we make sure the destination register has the upper 32 bits as 0s before creating the link. BPF_ADD_CONST is split into BPF_ADD_CONST64 and BPF_ADD_CONST32, the latter is used in case of alu32 and sync_linked_regs uses this to zext the result if known_reg has this flag. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260204151741.2678118-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-04 13:35:28 -08:00
Tianci Cao	56415363e0	selftests/bpf: Add tests for BPF_END bitwise tracking Now BPF_END has bitwise tracking support. This patch adds selftests to cover various cases of BPF_END (`bswap(16\|32\|64)`, `be(16\|32\|64)`, `le(16\|32\|64)`) with bitwise propagation. This patch is based on existing `verifier_bswap.c`, and add several types of new tests: 1. Unconditional byte swap operations: - bswap16/bswap32/bswap64 with unknown bytes 2. Endian conversion operations (architecture-aware): - be16/be32/be64: convert to big-endian * on little-endian: do swap * on big-endian: truncation (16/32-bit) or no-op (64-bit) - le16/le32/le64: convert to little-endian * on big-endian: do swap * on little-endian: truncation (16/32-bit) or no-op (64-bit) Each test simulates realistic networking scenarios where a value is masked with unknown bits (e.g., var_off=(0x0; 0x3f00), range=[0,0x3f00]), then byte-swapped, and the verifier must prove the result stays within expected bounds. Specifically, these selftests are based on dead code elimination: If the BPF verifier can precisely track bitwise through byte swap operations, it can prune the trap path (invalid memory access) that should be unreachable, allowing the program to pass verification. If bitwise tracking is incorrect, the verifier cannot prove the trap is unreachable, causing verification failure. The tests use preprocessor conditionals (#ifdef __BYTE_ORDER__) to verify correct behavior on both little-endian and big-endian architectures, and require Clang 18+ for bswap instruction support. Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com> Co-developed-by: Yazhou Tang <tangyazhou518@outlook.com> Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com> Signed-off-by: Tianci Cao <ziye@zju.edu.cn> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260204111503.77871-3-ziye@zju.edu.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-04 13:23:28 -08:00
Alexei Starovoitov	6e65cf81ac	selftests/bpf: Strengthen timer_start_deadlock test Strengthen timer_start_deadlock test and check for recursion now Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260204055147.54960-5-alexei.starovoitov@gmail.com	2026-02-04 13:12:50 -08:00
Alexei Starovoitov	67ee5ad27d	selftests/bpf: Add a testcase for deadlock avoidance Add a testcase that checks that deadlock avoidance is working as expected. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260204055147.54960-3-alexei.starovoitov@gmail.com	2026-02-04 13:12:50 -08:00
Alexei Starovoitov	b135beb077	selftests/bpf: Add a test to stress bpf_timer_start and map_delete race Add a test to stress bpf_timer_start and map_delete race Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260201025403.66625-10-alexei.starovoitov@gmail.com	2026-02-03 16:58:47 -08:00
Mykyta Yatsenko	3f7a841520	selftests/bpf: Removed obsolete tests Now bpf_timer can be used in tracepoints, so these tests are no longer relevant. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260201025403.66625-9-alexei.starovoitov@gmail.com	2026-02-03 16:58:47 -08:00
Mykyta Yatsenko	083c5a4bab	selftests/bpf: Add timer stress test in NMI context Add stress tests for BPF timers that run in NMI context using perf_event programs attached to PERF_COUNT_HW_CPU_CYCLES. The tests cover three scenarios: - nmi_race: Tests concurrent timer start and async cancel operations - nmi_update: Tests updating a map element (effectively deleting and inserting new for array map) from within a timer callback - nmi_cancel: Tests timer self-cancellation attempt. A common test_common() helper is used to share timer setup logic across all test modes. The tests spawn multiple threads in a child process to generate perf events, which trigger the BPF programs in NMI context. Hit counters verify that the NMI code paths were actually exercised. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260201025403.66625-8-alexei.starovoitov@gmail.com	2026-02-03 16:58:47 -08:00
Mykyta Yatsenko	fe9d205cec	selftests/bpf: Verify bpf_timer_cancel_async works Add test that verifies that bpf_timer_cancel_async works: can cancel callback successfully. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260201025403.66625-7-alexei.starovoitov@gmail.com	2026-02-03 16:58:47 -08:00
Mykyta Yatsenko	d02fdd7195	selftests/bpf: Add stress test for timer async cancel Extend BPF timer selftest to run stress test for async cancel. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260201025403.66625-6-alexei.starovoitov@gmail.com	2026-02-03 16:58:47 -08:00
Mykyta Yatsenko	10653c0dd8	selftests/bpf: Refactor timer selftests Refactor timer selftests, extracting stress test into a separate test. This makes it easier to debug test failures and allows to extend. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260201025403.66625-5-alexei.starovoitov@gmail.com	2026-02-03 16:58:47 -08:00
Emil Tsalapatis	4d99137eea	selftests/bpf: Add selftests for stream functions under lock Add a selftest to ensure BPF stream functions can now be called while holding a lock. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260203180424.14057-5-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-03 10:41:16 -08:00
Emil Tsalapatis	954fa97e21	selftests/bpf: Add selftests for bpf_stream_print_stack Add selftests for the new bpf_stream_print_stack kfunc. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260203180424.14057-3-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-03 10:41:16 -08:00
Puranjay Mohan	f6ef5584cc	selftests/bpf: Add a test for ids=0 to verifier_scalar_ids test Test that two registers with their id=0 (unlinked) in the cached state can be mapped to a single id (linked) in the current state. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-6-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-03 10:34:33 -08:00
Puranjay Mohan	b0388bafa4	bpf: Relax scalar id equivalence for state pruning Scalar register IDs are used by the verifier to track relationships between registers and enable bounds propagation across those relationships. Once an ID becomes singular (i.e. only a single register/stack slot carries it), it can no longer contribute to bounds propagation and effectively becomes stale. The previous commit makes the verifier clear such ids before caching the state. When comparing the current and cached states for pruning, these stale IDs can cause technically equivalent states to be considered different and thus prevent pruning. For example, in the selftest added in the next commit, two registers - r6 and r7 are not linked to any other registers and get cached with id=0, in the current state, they are both linked to each other with id=A. Before this commit, check_scalar_ids would give temporary ids to r6 and r7 (say tid1 and tid2) and then check_ids() would map tid1->A, and when it would see tid2->A, it would not consider these state equivalent. Relax scalar ID equivalence by treating rold->id == 0 as "independent": if the old state did not rely on any ID relationships for a register, then any ID/linking present in the current state only adds constraints and is always safe to accept for pruning. Implement this by returning true immediately in check_scalar_ids() when old_id == 0. Maintain correctness for the opposite direction (old_id != 0 && cur_id == 0) by still allocating a temporary ID for cur_id == 0. This avoids incorrectly allowing multiple independent current registers (id==0) to satisfy a single linked old ID during mapping. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-02-03 10:34:23 -08:00
Leon Hwang	7f10da2133	selftests/bpf: Enable get_func_args and get_func_ip tests on arm64 Allow get_func_args, and get_func_ip fsession selftests to run on arm64. Acked-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260131144950.16294-4-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-31 13:51:04 -08:00
Leon Hwang	8798902f2b	bpf: Add bpf_jit_supports_fsession() The added fsession does not prevent running on those architectures, that haven't added fsession support. For example, try to run fsession tests on arm64: test_fsession_basic:PASS:fsession_test__open_and_load 0 nsec test_fsession_basic:PASS:fsession_attach 0 nsec check_result:FAIL:test_run_opts err unexpected error: -14 (errno 14) In order to prevent such errors, add bpf_jit_supports_fsession() to guard those architectures. Fixes: `2d419c4465` ("bpf: add fsession support") Acked-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260131144950.16294-2-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-31 13:51:04 -08:00
Paul Chaignon	f0b5b3d6b5	selftests/bpf: Test access from RO map from xdp_store_bytes This new test simply checks that helper bpf_xdp_store_bytes can successfully read from a read-only map. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/4fdb934a713b2d7cf133288c77f6cfefe9856440.1769875479.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-31 13:49:43 -08:00
Jiri Olsa	4173b494d9	selftests/bpf: Allow to benchmark trigger with stacktrace Adding support to call bpf_get_stackid helper from trigger programs, so far added for kprobe multi. Adding the --stacktrace/-g option to enable it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260126211837.472802-7-jolsa@kernel.org	2026-01-30 13:40:09 -08:00
Jiri Olsa	e5d532be4a	selftests/bpf: Add stacktrace ips test for fentry/fexit Adding test that attaches fentry/fexitand verifies the ORC stacktrace matches expected functions. The test is only for ORC unwinder to keep it simple. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260126211837.472802-6-jolsa@kernel.org	2026-01-30 13:40:08 -08:00
Jiri Olsa	7373f97e86	selftests/bpf: Add stacktrace ips test for kprobe/kretprobe Adding test that attaches kprobe/kretprobe and verifies the ORC stacktrace matches expected functions. The test is only for ORC unwinder to keep it simple. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260126211837.472802-5-jolsa@kernel.org	2026-01-30 13:40:08 -08:00
Jiri Olsa	0207f94971	selftests/bpf: Fix kprobe multi stacktrace_ips test We now include the attached function in the stack trace, fixing the test accordingly. Fixes: `c9e208fa93` ("selftests/bpf: Add stacktrace ips test for kprobe_multi/kretprobe_multi") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260126211837.472802-4-jolsa@kernel.org	2026-01-30 13:40:08 -08:00
Changwoo Min	cd77618c41	selftests/bpf: Make bpf get_preempt_count() work for v6.14+ kernels Recent x86 kernels export __preempt_count as a ksym, while some old kernels between v6.1 and v6.14 expose the preemption counter via pcpu_hot.preempt_count. The existing selftest helper unconditionally dereferenced __preempt_count, which breaks BPF program loading on such old kernels. Make the x86 preemption count lookup version-agnostic by: - Marking __preempt_count and pcpu_hot as weak ksyms. - Introducing a BTF-described pcpu_hot___local layout with preserve_access_index. - Selecting the appropriate access path at runtime using ksym availability and bpf_ksym_exists() and bpf_core_field_exists(). This allows a single BPF binary to run correctly across kernel versions (e.g., v6.18 vs. v6.13) without relying on compile-time version checks. Signed-off-by: Changwoo Min <changwoo@igalia.com> Link: https://lore.kernel.org/r/20260130021843.154885-1-changwoo@igalia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-30 12:20:25 -08:00
Jiri Olsa	15ac1adf0f	selftests/bpf: Add test for sleepable program tailcalls Adding test that makes sure we can't mix sleepable and non-sleepable bpf programs in the BPF_MAP_TYPE_PROG_ARRAY map and that we can do tail call in the sleepable program. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20260130081208.1130204-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-30 12:17:47 -08:00
Luis Gerhorst	60d2c438c1	bpf: Test nospec after dead stack write in helper Without the fix from the previous commit, the selftest fails: $ ./tools/testing/selftests/bpf/vmtest.sh -- \ ./test_progs -t verifier_unpriv [...] run_subtest:PASS:obj_open_mem 0 nsec libbpf: BTF loading error: -EPERM libbpf: Error loading .BTF into kernel: -EPERM. BTF is optional, ignoring. libbpf: prog 'unpriv_nospec_after_helper_stack_write': BPF program load failed: -EFAULT libbpf: prog 'unpriv_nospec_after_helper_stack_write': failed to load: -EFAULT libbpf: failed to load object 'verifier_unpriv' run_subtest:FAIL:unexpected_load_failure unexpected error: -14 (errno 14) VERIFIER LOG: ============= 0: R1=ctx() R10=fp0 0: (b7) r0 = 0 ; R0=P0 1: (55) if r0 != 0x1 goto pc+6 2: R0=Pscalar() R1=ctx() R10=fp0 2: (b7) r2 = 0 ; R2=P0 3: (bf) r3 = r10 ; R3=fp0 R10=fp0 4: (07) r3 += -16 ; R3=fp-16 5: (b7) r4 = 4 ; R4=P4 6: (b7) r5 = 0 ; R5=P0 7: (85) call bpf_skb_load_bytes_relative#68 verifier bug: speculation barrier after jump instruction may not have the desired effect (BPF_CLASS(insn->code) == BPF_JMP \|\| BPF_CLASS(insn->code) == BPF_JMP32) processed 9 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 ============= [...] The test is based on the PoC from the report. Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de> Reported-by: Yinhao Hu <dddddd@hust.edu.cn> Reported-by: Kaiyan Mei <M202472210@hust.edu.cn> Reported-by: Dongliang Mu <dzm91@hust.edu.cn> Link: https://lore.kernel.org/bpf/7678017d-b760-4053-a2d8-a6879b0dbeeb@hust.edu.cn/ Link: https://lore.kernel.org/r/20260127115912.3026761-3-luis.gerhorst@fau.de Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-28 18:41:57 -08:00
Mykyta Yatsenko	b640d556a2	selftests/bpf: Remove xxd util dependency The verification signature header generation requires converting a binary certificate to a C array. Previously this only worked with xxd (part of vim-common package). As xxd may not be available on some systems building selftests, it makes sense to substitute it with more common utils: hexdump, wc, sed to generate equivalent C array output. Tested by generating header with both xxd and hexdump and comparing them. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20260128190552.242335-1-mykyta.yatsenko5@gmail.com	2026-01-28 13:23:51 -08:00
Jiayuan Chen	17e2ce02bf	selftests/bpf: Add tests for FIONREAD and copied_seq This commit adds two new test functions: one to reproduce the bug reported by syzkaller [1], and another to cover the calculation of copied_seq. The tests primarily involve installing and uninstalling sockmap on sockets, then reading data to verify proper functionality. Additionally, extend the do_test_sockmap_skb_verdict_fionread() function to support UDP FIONREAD testing. [1] https://syzkaller.appspot.com/bug?extid=06dbd397158ec0ea4983 Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20260124113314.113584-4-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-27 09:12:04 -08:00
Matt Bobrowski	1456ebb291	selftests/bpf: cover BPF_CGROUP_ITER_CHILDREN control option Extend some of the existing CSS iterator selftests such that they cover the newly introduced BPF_CGROUP_ITER_CHILDREN iterator control option. Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Link: https://lore.kernel.org/r/20260127085112.3608687-2-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-27 09:06:03 -08:00
Alexis Lothoré (eBPF Foundation)	2d96bbdfd3	selftests/bpf: convert test_bpftool_map_access.sh into test_progs framework The test_bpftool_map.sh script tests that maps read/write accesses are being properly allowed/refused by the kernel depending on a specific fmod_ret program being attached on security_bpf_map function. Rewrite this test to integrate it in the test_progs. The new test spawns a few subtests: #36/1 bpftool_maps_access/unprotected_unpinned:OK #36/2 bpftool_maps_access/unprotected_pinned:OK #36/3 bpftool_maps_access/protected_unpinned:OK #36/4 bpftool_maps_access/protected_pinned:OK #36/5 bpftool_maps_access/nested_maps:OK #36/6 bpftool_maps_access/btf_list:OK #36 bpftool_maps_access:OK Summary: 1/6 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/r/20260123-bpftool-tests-v4-3-a6653a7f28e7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-26 18:47:31 -08:00
Alexis Lothoré (eBPF Foundation)	1c0b505908	selftests/bpf: convert test_bpftool_metadata.sh into test_progs framework The test_bpftool_metadata.sh script validates that bpftool properly returns in its ouptput any metadata generated by bpf programs through some .rodata sections. Port this test to the test_progs framework so that it can be executed automatically in CI. The new test, similarly to the former script, checks that valid data appears both for textual output and json output, as well as for both data not used at all and used data. For the json check part, the expected json string is hardcoded to avoid bringing a new external dependency (eg: a json deserializer) for test_progs. As the test is now converted into test_progs, remove the former script. The newly converted test brings two new subtests: #37/1 bpftool_metadata/metadata_unused:OK #37/2 bpftool_metadata/metadata_used:OK #37 bpftool_metadata:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20260123-bpftool-tests-v4-2-a6653a7f28e7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-26 18:47:31 -08:00
Alexis Lothoré (eBPF Foundation)	f21fae5774	selftests/bpf: Add a few helpers for bpftool testing In order to integrate some bpftool tests into test_progs, define a few specific helpers that allow to execute bpftool commands, while possibly retrieving the command output. Those helpers most notably set the path to the bpftool binary under test. This version checks different possible paths relative to the directories where the different test_progs runners are executed, as we want to make sure not to accidentally use a bootstrap version of the binary. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20260123-bpftool-tests-v4-1-a6653a7f28e7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-26 18:47:21 -08:00
Leon Hwang	78980b4c7f	selftests/bpf: Harden cpu flags test for lru_percpu_hash map CI occasionally reports failures in the percpu_alloc/cpu_flag_lru_percpu_hash selftest, for example: First test_progs failure (test_progs_no_alu32-x86_64-llvm-21): #264/15 percpu_alloc/cpu_flag_lru_percpu_hash ... test_percpu_map_op_cpu_flag:FAIL:bpf_map_lookup_batch value on specified cpu unexpected bpf_map_lookup_batch value on specified cpu: actual 0 != expected 3735929054 The unexpected value indicates that an element was removed from the map. However, the test never calls delete_elem(), so the only possible cause is LRU eviction. This can happen when the current task migrates to another CPU: an update_elem() triggers eviction because there is no available LRU node on local freelist and global freelist. Harden the test against this behavior by provisioning sufficient spare elements. Set max_entries to 'nr_cpus * 2' and restrict the test to using the first nr_cpus entries, ensuring that updates do not spuriously trigger LRU eviction. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260119133417.19739-1-leon.hwang@linux.dev	2026-01-26 10:59:07 -08:00
Changwoo Min	221b5e76c1	selftests/bpf: Add tests for execution context helpers Add a new selftest suite `exe_ctx` to verify the accuracy of the bpf_in_task(), bpf_in_hardirq(), and bpf_in_serving_softirq() helpers introduced in bpf_experimental.h. Testing these execution contexts deterministically requires crossing context boundaries within a single CPU. To achieve this, the test implements a "Trigger-Observer" pattern using bpf_testmod: 1. Trigger: A BPF syscall program calls a new bpf_testmod kfunc bpf_kfunc_trigger_ctx_check(). 2. Task to HardIRQ: The kfunc uses irq_work_queue() to trigger a self-IPI on the local CPU. 3. HardIRQ to SoftIRQ: The irq_work handler calls a dummy function (observed by BPF fentry) and then schedules a tasklet to transition into SoftIRQ context. The user-space runner ensures determinism by pinning itself to CPU 0 before execution, forcing the entire interrupt chain to remain on a single core. Dummy noinline functions with compiler barriers are added to bpf_testmod.c to serve as stable attachment points for fentry programs. A retry loop is used in user-space to wait for the asynchronous SoftIRQ to complete. Note that testing on s390x is avoided because supporting those helpers purely in BPF on s390x is not possible at this point. Reviewed-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Changwoo Min <changwoo@igalia.com> Link: https://lore.kernel.org/r/20260125115413.117502-3-changwoo@igalia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-25 08:20:50 -08:00
Changwoo Min	c31df36bd2	selftests/bpf: Introduce execution context detection helpers Introduce bpf_in_nmi(), bpf_in_hardirq(), bpf_in_serving_softirq(), and bpf_in_task() inline helpers in bpf_experimental.h. These allow BPF programs to query the current execution context with higher granularity than the existing bpf_in_interrupt() helper. While BPF programs can often infer their context from attachment points, subsystems like sched_ext may call the same BPF logic from multiple contexts (e.g., task-to-task wake-ups vs. interrupt-to-task wake-ups). These helpers provide a reliable way for logic to branch based on the current CPU execution state. Implementing these as BPF-native inline helpers wrapping get_preempt_count() allows the compiler and JIT to inline the logic. The implementation accounts for differences in preempt_count layout between standard and PREEMPT_RT kernels. Reviewed-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Changwoo Min <changwoo@igalia.com> Link: https://lore.kernel.org/r/20260125115413.117502-2-changwoo@igalia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-25 08:20:50 -08:00
Menglong Dong	cb4bfacfb0	selftests/bpf: test fsession mixed with fentry and fexit Test the fsession when it is used together with fentry, fexit. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260124062008.8657-14-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:37 -08:00
Menglong Dong	8909b3fb23	selftests/bpf: add testcases for fsession cookie Test session cookie for fsession. Multiple fsession BPF progs is attached to bpf_fentry_test1() and session cookie is read and write in the testcase. bpf_get_func_ip() will influence the layout of the session cookies, so we test the cookie in two case: with and without bpf_get_func_ip(). Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260124062008.8657-13-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:36 -08:00
Menglong Dong	a5533a6eaa	selftests/bpf: test bpf_get_func_* for fsession Test following bpf helper for fsession: bpf_get_func_arg() bpf_get_func_arg_cnt() bpf_get_func_ret() bpf_get_func_ip() Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260124062008.8657-12-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:36 -08:00
Menglong Dong	f7afef5617	selftests/bpf: add testcases for fsession Add testcases for BPF_TRACE_FSESSION. The function arguments and return value are tested both in the entry and exit. And the kfunc bpf_session_is_ret() is also tested. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260124062008.8657-11-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:36 -08:00
Menglong Dong	8fe4dc4f64	bpf: change prototype of bpf_session_{cookie,is_return} Add the function argument of "void *ctx" to bpf_session_cookie() and bpf_session_is_return(), which is a preparation of the next patch. The two kfunc is seldom used now, so it will not introduce much effect to change their function prototype. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20260124062008.8657-4-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:35 -08:00
Menglong Dong	2d419c4465	bpf: add fsession support The fsession is something that similar to kprobe session. It allow to attach a single BPF program to both the entry and the exit of the target functions. Introduce the struct bpf_fsession_link, which allows to add the link to both the fentry and fexit progs_hlist of the trampoline. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Co-developed-by: Leon Hwang <leon.hwang@linux.dev> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260124062008.8657-2-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:35 -08:00
Yonghong Song	c7900f225a	selftests/bpf: Fix xdp_pull_data failure with 64K page If the argument 'pull_len' of run_test() is 'PULL_MAX' or 'PULL_MAX \| PULL_PLUS_ONE', the eventual pull_len size will close to the page size. On arm64 systems with 64K pages, the pull_len size will be close to 64K. But the existing buffer will be close to 9000 which is not enough to pull. For those failed run_tests(), make buff size to pg_sz + (pg_sz / 2) This way, there will be enough buffer space to pull regardless of page size. Tested-by: Alan Maguire <alan.maguire@oracle.com> Cc: Amery Hung <ameryhung@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260123055128.495265-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 16:51:22 -08:00
Yonghong Song	d8df878140	selftests/bpf: Fix task_local_data failure with 64K page On arm64 systems with 64K pages, the selftest task_local_data has the following failures: ... test_task_local_data_basic:PASS:tld_create_key 0 nsec test_task_local_data_basic:FAIL:tld_create_key unexpected tld_create_key: actual 0 != expected -28 ... test_task_local_data_basic_thread:PASS:run task_main 0 nsec test_task_local_data_basic_thread:FAIL:task_main retval unexpected error: 2 (errno 0) test_task_local_data_basic_thread:FAIL:tld_get_data value0 unexpected tld_get_data value0: actual 0 != expected 6268 ... #447/1 task_local_data/task_local_data_basic:FAIL ... #447/2 task_local_data/task_local_data_race:FAIL #447 task_local_data:FAIL When TLD_DYN_DATA_SIZE is 64K page size, for struct tld_meta_u { _Atomic __u8 cnt; __u16 size; struct tld_metadata metadata[]; }; field 'cnt' would overflow. For example, for 4K page, 'cnt' will be 4096/64 = 64. But for 64K page, 'cnt' will be 65536/64 = 1024 and 'cnt' is not enough for 1024. To accommodate 64K page, '_Atomic __u8 cnt' becomes '_Atomic __u16 cnt'. A few other places are adjusted accordingly. In test_task_local_data.c, the value for TLD_DYN_DATA_SIZE is changed from 4096 to (getpagesize() - 8) since the maximum buffer size for TLD_DYN_DATA_SIZE is (getpagesize() - 8). Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Tested-by: Alan Maguire <alan.maguire@oracle.com> Cc: Amery Hung <ameryhung@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260123055122.494352-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 16:51:21 -08:00
Kery Qi	a32ae26584	selftests/bpf: Fix resource leak in serial_test_wq on attach failure When wq__attach() fails, serial_test_wq() returns early without calling wq__destroy(), leaking the skeleton resources allocated by wq__open_and_load(). This causes ASAN leak reports in selftests runs. Fix this by jumping to a common clean_up label that calls wq__destroy() on all exit paths after successful open_and_load. Note that the early return after wq__open_and_load() failure is correct and doesn't need fixing, since that function returns NULL on failure (after internally cleaning up any partial allocations). Fixes: `8290dba519` ("selftests/bpf: wq: add bpf_wq_start() checks") Signed-off-by: Kery Qi <qikeyu2017@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20260121094114.1801-3-qikeyu2017@gmail.com	2026-01-21 16:12:10 -08:00
Yuzuki Ishiyama	f4924ad0b1	selftests/bpf: Test kfunc bpf_strncasecmp Add testsuites for kfunc bpf_strncasecmp. Signed-off-by: Yuzuki Ishiyama <ishiyama@hpc.is.uec.ac.jp> Acked-by: Viktor Malik <vmalik@redhat.com> Link: https://lore.kernel.org/r/20260121033328.1850010-3-ishiyama@hpc.is.uec.ac.jp Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-21 09:42:53 -08:00
Menglong Dong	1ed7977643	selftests/bpf: test bpf_get_func_arg() for tp_btf Test bpf_get_func_arg() and bpf_get_func_arg_cnt() for tp_btf. The code is most copied from test1 and test2. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260121044348.113201-3-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-21 09:31:35 -08:00
Menglong Dong	4fca95095c	selftests/bpf: test the jited inline of bpf_get_current_task Add the testcase for the jited inline of bpf_get_current_task(). Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260120070555.233486-3-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 20:39:01 -08:00
Li RongQing	e700f5d156	watchdog: softlockup: panic when lockup duration exceeds N thresholds The softlockup_panic sysctl is currently a binary option: panic immediately or never panic on soft lockups. Panicking on any soft lockup, regardless of duration, can be overly aggressive for brief stalls that may be caused by legitimate operations. Conversely, never panicking may allow severe system hangs to persist undetected. Extend softlockup_panic to accept an integer threshold, allowing the kernel to panic only when the normalized lockup duration exceeds N watchdog threshold periods. This provides finer-grained control to distinguish between transient delays and persistent system failures. The accepted values are: - 0: Don't panic (unchanged) - 1: Panic when duration >= 1 * threshold (20s default, original behavior) - N > 1: Panic when duration >= N * threshold (e.g., 2 = 40s, 3 = 60s.) The original behavior is preserved for values 0 and 1, maintaining full backward compatibility while allowing systems to tolerate brief lockups while still catching severe, persistent hangs. [lirongqing@baidu.com: v2] Link: https://lkml.kernel.org/r/20251218074300.4080-1-lirongqing@baidu.com Link: https://lkml.kernel.org/r/20251216074521.2796-1-lirongqing@baidu.com Signed-off-by: Li RongQing <lirongqing@baidu.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Lance Yang <lance.yang@linux.dev> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-20 19:44:20 -08:00
Matt Bobrowski	dd341eacdb	selftests/bpf: update verifier test for default trusted pointer semantics Replace the verifier test for default trusted pointer semantics, which previously relied on BPF kfunc bpf_get_root_mem_cgroup(), with a new test utilizing dedicated BPF kfuncs defined within the bpf_testmod. bpf_get_root_mem_cgroup() was modified such that it again relies on KF_ACQUIRE semantics, therefore no longer making it a suitable candidate to test BPF verifier default trusted pointer semantics against. Link: https://lore.kernel.org/bpf/20260113083949.2502978-2-mattbobrowski@google.com Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Link: https://lore.kernel.org/r/20260120091630.3420452-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 17:11:24 -08:00
Yazhou Tang	c9e440bf25	selftests/bpf: Add tests for BPF_DIV and BPF_MOD range tracking Now BPF_DIV has range tracking support via interval analysis. This patch adds selftests to cover various cases of BPF_DIV and BPF_MOD operations when the divisor is a constant, also covering both signed and unsigned variants. This patch includes several types of tests in 32-bit and 64-bit variants: 1. For UDIV - positive divisor - zero divisor 2. For SDIV - positive divisor, positive dividend - positive divisor, negative dividend - positive divisor, mixed sign dividend - negative divisor, positive dividend - negative divisor, negative dividend - negative divisor, mixed sign dividend - zero divisor - overflow (SIGNED_MIN/-1), normal dividend - overflow (SIGNED_MIN/-1), constant dividend 3. For UMOD - positive divisor - positive divisor, small dividend - zero divisor 4. For SMOD - positive divisor, positive dividend - positive divisor, negative dividend - positive divisor, mixed sign dividend - positive divisor, mixed sign dividend, small dividend - negative divisor, positive dividend - negative divisor, negative dividend - negative divisor, mixed sign dividend - negative divisor, mixed sign dividend, small dividend - zero divisor - overflow (SIGNED_MIN/-1), normal dividend - overflow (SIGNED_MIN/-1), constant dividend Specifically, these selftests are based on dead code elimination: If the BPF verifier can precisely analyze the result of BPF_DIV/BPF_MOD instruction, it can prune the path that leads to an error (here we use invalid memory access as the error case), allowing the program to pass verification. Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com> Co-developed-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com> Link: https://lore.kernel.org/r/20260119085458.182221-3-tangyazhou@zju.edu.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:41:54 -08:00
Yazhou Tang	44fdd581d2	bpf: Add range tracking for BPF_DIV and BPF_MOD This patch implements range tracking (interval analysis) for BPF_DIV and BPF_MOD operations when the divisor is a constant, covering both signed and unsigned variants. While LLVM typically optimizes integer division and modulo by constants into multiplication and shift sequences, this optimization is less effective for the BPF target when dealing with 64-bit arithmetic. Currently, the verifier does not track bounds for scalar division or modulo, treating the result as "unbounded". This leads to false positive rejections for safe code patterns. For example, the following code (compiled with -O2): ```c int test(struct pt_regs ctx) { char buffer[6] = {1}; __u64 x = bpf_ktime_get_ns(); __u64 res = x % sizeof(buffer); char value = buffer[res]; bpf_printk("res = %llu, val = %d", res, value); return 0; } ``` Generates a raw `BPF_MOD64` instruction: ```asm ; __u64 res = x % sizeof(buffer); 1: 97 00 00 00 06 00 00 00 r0 %= 0x6 ; char value = buffer[res]; 2: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll 4: 0f 01 00 00 00 00 00 00 r1 += r0 5: 91 14 00 00 00 00 00 00 r4 = (s8 *)(r1 + 0x0) ``` Without this patch, the verifier fails with "math between map_value pointer and register with unbounded min value is not allowed" because it cannot deduce that `r0` is within [0, 5]. According to the BPF instruction set[1], the instruction's offset field (`insn->off`) is used to distinguish between signed (`off == 1`) and unsigned division (`off == 0`). Moreover, we also follow the BPF division and modulo runtime behavior (semantics) to handle special cases, such as division by zero and signed division overflow. - UDIV: dst = (src != 0) ? (dst / src) : 0 - SDIV: dst = (src == 0) ? 0 : ((src == -1 && dst == LLONG_MIN) ? LLONG_MIN : (dst / src)) - UMOD: dst = (src != 0) ? (dst % src) : dst - SMOD: dst = (src == 0) ? dst : ((src == -1 && dst == LLONG_MIN) ? 0: (dst s% src)) Here is the overview of the changes made in this patch (See the code comments for more details and examples): 1. For BPF_DIV: Firstly check whether the divisor is zero. If so, set the destination register to zero (matching runtime behavior). For non-zero constant divisors: goto `scalar(32)?_min_max_(u\|s)div` functions. - General cases: compute the new range by dividing max_dividend and min_dividend by the constant divisor. - Overflow case (SIGNED_MIN / -1) in signed division: mark the result as unbounded if the dividend is not a single number. 2. For BPF_MOD: Firstly check whether the divisor is zero. If so, leave the destination register unchanged (matching runtime behavior). For non-zero constant divisors: goto `scalar(32)?_min_max_(u\|s)mod` functions. - General case: For signed modulo, the result's sign matches the dividend's sign. And the result's absolute value is strictly bounded by `min(abs(dividend), abs(divisor) - 1)`. - Special care is taken when the divisor is SIGNED_MIN. By casting to unsigned before negation and subtracting 1, we avoid signed overflow and correctly calculate the maximum possible magnitude (`res_max_abs` in the code). - "Small dividend" case: If the dividend is already within the possible result range (e.g., [-2, 5] % 10), the operation is an identity function, and the destination register remains unchanged. 3. In `scalar(32)?_min_max_(u\|s)(div\|mod)` functions: After updating current range, reset other ranges and tnum to unbounded/unknown. e.g., in `scalar_min_max_sdiv`, signed 64-bit range is updated. Then reset unsigned 64-bit range and 32-bit range to unbounded, and tnum to unknown. Exception: in BPF_MOD's "small dividend" case, since the result remains unchanged, we do not reset other ranges/tnum. 4. Also updated existing selftests based on the expected BPF_DIV and BPF_MOD behavior. [1] https://www.kernel.org/doc/Documentation/bpf/standardization/instruction-set.rst Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com> Co-developed-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com> Tested-by: syzbot@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20260119085458.182221-2-tangyazhou@zju.edu.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:41:53 -08:00
Ihor Solodrai	bd06b977e0	selftests/bpf: Migrate struct_ops_assoc test to KF_IMPLICIT_ARGS A test kfunc named bpf_kfunc_multi_st_ops_test_1_impl() is a user of __prog suffix. Subsequent patch removes __prog support in favor of KF_IMPLICIT_ARGS, so migrate this kfunc to use implicit argument. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120222638.3976562-12-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:22:38 -08:00
Ihor Solodrai	d806f31012	bpf: Migrate bpf_stream_vprintk() to KF_IMPLICIT_ARGS Implement bpf_stream_vprintk with an implicit bpf_prog_aux argument, and remote bpf_stream_vprintk_impl from the kernel. Update the selftests to use the new API with implicit argument. bpf_stream_vprintk macro is changed to use the new bpf_stream_vprintk kfunc, and the extern definition of bpf_stream_vprintk_impl is replaced accordingly. Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120222638.3976562-11-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:22:38 -08:00
Ihor Solodrai	6e663ffdf7	bpf: Migrate bpf_task_work_schedule_* kfuncs to KF_IMPLICIT_ARGS Implement bpf_task_work_schedule_* with an implicit bpf_prog_aux argument, and remove corresponding _impl funcs from the kernel. Update special kfunc checks in the verifier accordingly. Update the selftests to use the new API with implicit argument. Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120222638.3976562-10-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:22:20 -08:00
Ihor Solodrai	b97931a25a	bpf: Migrate bpf_wq_set_callback_impl() to KF_IMPLICIT_ARGS Implement bpf_wq_set_callback() with an implicit bpf_prog_aux argument, and remove bpf_wq_set_callback_impl(). Update special kfunc checks in the verifier accordingly. Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120222638.3976562-8-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:15:57 -08:00
Ihor Solodrai	e939f3d16d	selftests/bpf: Add tests for KF_IMPLICIT_ARGS Add trivial end-to-end tests to validate that KF_IMPLICIT_ARGS flag is properly handled by both resolve_btfids and the verifier. Declare kfuncs in bpf_testmod. Check that bpf_prog_aux pointer is set in the kfunc implementation. Verify that calls with implicit args and a legacy case all work. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120222638.3976562-7-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:15:57 -08:00
Gyutae Bae	2e6690d4f7	selftests/bpf: Add perfbuf multi-producer benchmark Add a multi-producer benchmark for perfbuf to complement the existing ringbuf multi-producer test. Unlike ringbuf which uses a shared buffer and experiences contention, perfbuf uses per-CPU buffers so the test measures scaling behavior rather than contention. This allows developers to compare perfbuf vs ringbuf performance under multi-producer workloads when choosing between the two for their systems. Signed-off-by: Gyutae Bae <gyutae.bae@navercorp.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260120090716.82927-1-gyutae.opensource@navercorp.com	2026-01-20 11:37:25 -08:00
Yonghong Song	efad162f5a	selftests/bpf: Fix map_kptr test failure On my arm64 machine, I get the following failure: ... tester_init:PASS:tester_log_buf 0 nsec process_subtest:PASS:obj_open_mem 0 nsec process_subtest:PASS:specs_alloc 0 nsec serial_test_map_kptr:PASS:rcu_tasks_trace_gp__open_and_load 0 nsec ... test_map_kptr_success:PASS:map_kptr__open_and_load 0 nsec test_map_kptr_success:PASS:test_map_kptr_ref1 refcount 0 nsec test_map_kptr_success:FAIL:test_map_kptr_ref1 retval unexpected error: 2 (errno 2) test_map_kptr_success:PASS:test_map_kptr_ref2 refcount 0 nsec test_map_kptr_success:FAIL:test_map_kptr_ref2 retval unexpected error: 1 (errno 2) ... #201/21 map_kptr/success-map:FAIL In serial_test_map_kptr(), before test_map_kptr_success(), one kern_sync_rcu() is used to have some delay for freeing the map. But in my environment, one kern_sync_rcu() seems not enough and caused the test failure. In bpf_map_free_in_work() in syscall.c, the queue time for queue_work(system_dfl_wq, &map->work) may be longer than expected. This may cause the test failure since test_map_kptr_success() expects all previous maps having been freed. Since it is not clear how long queue_work() time takes, a bpf prog is added to count the reference after bpf_kfunc_call_test_acquire(). If the number of references is 2 (for initial ref and the one just acquired), all previous maps should have been released. This will resolve the above 'retval unexpected error' issue. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20260116052245.3692405-1-yonghong.song@linux.dev	2026-01-16 14:59:57 -08:00
Alan Maguire	47d440d0a5	selftests/bpf: Support when CONFIG_VXLAN=m If CONFIG_VXLAN is 'm', struct vxlanhdr will not be in vmlinux.h. Add a ___local variant to support cases where vxlan is a module. Fixes: `8517b1abe5` ("selftests/bpf: Integrate test_tc_tunnel.sh tests into test_progs") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260115163457.146267-1-alan.maguire@oracle.com	2026-01-16 14:54:10 -08:00
Puranjay Mohan	086c99fbe4	selftests: bpf: Add test for multiple syncs from linked register Before the last commit, sync_linked_regs() corrupted the register whose bounds are being updated by copying known_reg's id to it. The ids are the same in value but known_reg has the BPF_ADD_CONST flag which is wrongly copied to reg. This later causes issues when creating new links to this reg. assign_scalar_id_before_mov() sees this BPF_ADD_CONST and gives a new id to this register and breaks the old links. This is exposed by the added selftest. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260115151143.1344724-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-16 10:08:59 -08:00
Jiri Olsa	934d9746ed	selftests/bpf: Add test for bpf_override_return helper We do not actually test the bpf_override_return helper functionality itself at the moment, only the bpf program being able to attach it. Adding test that override prctl syscall return value on top of kprobe and kprobe.multi. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20260112121157.854473-2-jolsa@kernel.org	2026-01-15 16:15:26 -08:00
Anton Protopopov	7c8e817e44	selftests/bpf: Extend live regs tests with a test for gotox Add a test which checks that the destination register of a gotox instruction is marked as used and that the union of jump targets is considered as live. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20260114162544.83253-3-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-14 19:08:09 -08:00
Alexei Starovoitov	e3d0dbb3b5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5 Cross-merge BPF and other fixes after downstream PR. No conflicts. Adjacent: Auto-merging MAINTAINERS Auto-merging Makefile Auto-merging kernel/bpf/verifier.c Auto-merging kernel/sched/ext.c Auto-merging mm/memcontrol.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-14 15:22:01 -08:00
Anton Protopopov	c656807675	selftests/bpf: Add tests for loading insn array values with offsets The ldimm64 instruction for map value supports an offset. For insn array maps it wasn't tested before, as normally such instructions aren't generated. However, this is still possible to pass such instructions, so add a few tests to check that correct offsets work properly and incorrect offsets are rejected. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20260111153047.8388-4-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 19:36:47 -08:00
Matt Bobrowski	bbdbed193b	selftests/bpf: assert BPF kfunc default trusted pointer semantics The BPF verifier was recently updated to treat pointers to struct types returned from BPF kfuncs as implicitly trusted by default. Add a new test case to exercise this new implicit trust semantic. The KF_ACQUIRE flag was dropped from the bpf_get_root_mem_cgroup() kfunc because it returns a global pointer to root_mem_cgroup without performing any explicit reference counting. This makes it an ideal candidate to verify the new implicit trusted pointer semantics. Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260113083949.2502978-3-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 19:19:13 -08:00
Matt Bobrowski	f8ade2342e	bpf: return PTR_TO_BTF_ID \| PTR_TRUSTED from BPF kfuncs by default Teach the BPF verifier to treat pointers to struct types returned from BPF kfuncs as implicitly trusted (PTR_TO_BTF_ID \| PTR_TRUSTED) by default. Returning untrusted pointers to struct types from BPF kfuncs should be considered an exception only, and certainly not the norm. Update existing selftests to reflect the change in register type printing (e.g. `ptr_` becoming `trusted_ptr_` in verifier error messages). Link: https://lore.kernel.org/bpf/aV4nbCaMfIoM0awM@google.com/ Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260113083949.2502978-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 19:19:13 -08:00
Donglin Peng	a3acd7d434	selftests/bpf: Add test cases for btf__permute functionality This patch introduces test cases for the btf__permute function to ensure it works correctly with both base BTF and split BTF scenarios. The test suite includes: - test_permute_base: Validates permutation on base BTF - test_permute_split: Tests permutation on split BTF Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20260109130003.3313716-3-dolinux.peng@gmail.com	2026-01-13 16:10:39 -08:00
Alexei Starovoitov	9160335317	selftests/bpf: Add tests for s>>=31 and s>>=63 Add tests for special arithmetic shift right. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Co-developed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260112201424.816836-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:33:38 -08:00
Yonghong Song	951d79017e	selftests/bpf: Fix verifier_arena_globals1 failure with 64K page With 64K page on arm64, verifier_arena_globals1 failed like below: ... libbpf: map 'arena': failed to create: -E2BIG ... #509/1 verifier_arena_globals1/check_reserve1:FAIL ... For 64K page, if the number of arena pages is (1UL << 20), the total memory will exceed 4G and this will cause map creation failure. Adjusting ARENA_PAGES based on the actual page size fixed the problem. Cc: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260113061033.3798549-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:32:16 -08:00
Yonghong Song	d2f7cd20a7	selftests/bpf: Fix sk_bypass_prot_mem failure with 64K page The current selftest sk_bypass_prot_mem only supports 4K page. When running with 64K page on arm64, the following failure happens: ... check_bypass:FAIL:no bypass unexpected no bypass: actual 3 <= expected 32 ... #385/1 sk_bypass_prot_mem/TCP :FAIL ... check_bypass:FAIL:no bypass unexpected no bypass: actual 4 <= expected 32 ... #385/2 sk_bypass_prot_mem/UDP :FAIL ... Adding support to 64K page as well fixed the failure. Cc: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260113061028.3798326-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:32:16 -08:00
Yonghong Song	2465a08d43	selftests/bpf: Fix dmabuf_iter/lots_of_buffers failure with 64K page On arm64 with 64K page , I observed the following test failure: ... subtest_dmabuf_iter_check_lots_of_buffers:FAIL:total_bytes_read unexpected total_bytes_read: actual 4696 <= expected 65536 #97/3 dmabuf_iter/lots_of_buffers:FAIL With 4K page on x86, the total_bytes_read is 4593. With 64K page on arm64, the total_byte_read is 4696. In progs/dmabuf_iter.c, for each iteration, the output is BPF_SEQ_PRINTF(seq, "%lu\n%llu\n%s\n%s\n", inode, size, name, exporter); The only difference between 4K and 64K page is 'size' in the above BPF_SEQ_PRINTF. The 4K page will output '4096' and the 64K page will output '65536'. So the total_bytes_read with 64K page is slighter greater than 4K page. Adjusting the total_bytes_read from 65536 to 4096 fixed the issue. Cc: T.J. Mercier <tjmercier@google.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260113061023.3798085-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:32:16 -08:00
Sami Tolvanen	ba7f1024a1	selftests/bpf: Use the correct destructor kfunc type With CONFIG_CFI enabled, the kernel strictly enforces that indirect function calls use a function pointer type that matches the target function. As bpf_testmod_ctx_release() signature differs from the btf_dtor_kfunc_t pointer type used for the destructor calls in bpf_obj_free_fields(), add a stub function with the correct type to fix the type mismatch. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260110082548.113748-9-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-12 18:53:57 -08:00
Jose E. Marchesi	681600647c	bpf: GCC requires function attributes before the declarator GCC insists in placing attributes before the declarators in function declarations. Now that GCC supports btf_decl_tag and therefore __tag1 and __tag2 expand to actual attributes, the compiler is complaining about it for static __noinline int foo(int x __tag1 __tag2) __tag1 __tag2 progs/test_btf_decl_tag.c:36:1: error: attributes should be specified \ before the declarator in a function definition This patch simply places the tags before the declarator. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260106173650.18191-3-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 21:04:11 -08:00
Jose E. Marchesi	97fb54d86d	bpf: adapt selftests to GCC 16 -Wunused-but-set-variable GCC 16 has changed the semantics of -Wunused-but-set-variable, as well as introducing new options -Wunused-but-set-variable={0,1,2,3} to adjust the level of support. One of the changes is that GCC now treats 'sum += 1' and 'sum++' as non-usage, whereas clang (and GCC < 16) considers the first as usage and the second as non-usage, which is sort of inconsistent. The GCC 16 -Wunused-but-set-variable=2 option implements the previous semantics of -Wunused-but-set-variable, but since it is a new option, it cannot be used unconditionally for forward-compatibility, just for backwards-compatibility. So this patch adds pragmas to the two self-tests impacted by this, progs/free_timer.c and progs/rcu_read_lock.c, to make gcc to ignore -Wunused-but-set-variable warnings when compiling them with GCC > 15. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44677#c25 for details on why this regression got introduced in GCC upstream. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260106173650.18191-2-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 21:04:11 -08:00
Leon Hwang	07bf7aa58e	selftests/bpf: Add cases to test BPF_F_CPU and BPF_F_ALL_CPUS flags Add test coverage for the new BPF_F_CPU and BPF_F_ALL_CPUS flags support in percpu maps. The following APIs are exercised: * bpf_map_update_batch() * bpf_map_lookup_batch() * bpf_map_update_elem() * bpf_map__update_elem() * bpf_map_lookup_elem_flags() * bpf_map__lookup_elem() For lru_percpu_hash map, set max_entries to 'libbpf_num_possible_cpus() + 1' and only use the first 'libbpf_num_possible_cpus()' entries. This ensures a spare entry is always available in the LRU free list, avoiding eviction. When updating an existing key in lru_percpu_hash map: 1. l_new = prealloc_lru_pop(); /* Borrow from free list / 2. l_old = lookup_elem_raw(); / Found, key exists / 3. pcpu_copy_value(); / In-place update / 4. bpf_lru_push_free(); / Return l_new to free list */ Also add negative tests to verify that non-percpu array and hash maps reject the BPF_F_CPU and BPF_F_ALL_CPUS flags. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260107022022.12843-8-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 20:48:32 -08:00
Emil Tsalapatis	b81d5e9d96	selftests/bpf: add tests for arena kfuncs under lock Add selftests to ensure the verifier permits calling the arena kfunc API while holding a lock. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260106-arena-under-lock-v2-3-378e9eab3066@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 17:44:13 -08:00
Toke Høiland-Jørgensen	ab86d0bf01	selftests/bpf: Update xdp_context_test_run test to check maximum metadata size Update the selftest to check that the metadata size check takes the xdp_frame size into account in bpf_prog_test_run. The original check (for meta size 256) was broken because the data frame supplied was smaller than this, triggering a different EINVAL return. So supply a larger data frame for this test to make sure we actually exercise the check we think we are. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260105114747.1358750-2-toke@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 11:41:41 -08:00
Puranjay Mohan	cf503eb2c6	selftests: bpf: Fix test_bpf_nf for trusted args becoming default With trusted args now being the default, passing NULL to kfunc parameters that are pointers causes verifier rejection rather than a runtime error. The test_bpf_nf test was failing because it attempted to pass NULL to bpf_xdp_ct_lookup() to verify runtime error handling. Since the NULL check now happens at verification time, remove the runtime test case that passed NULL to the bpf_tuple parameter and instead add verification-time tests to ensure the verifier correctly rejects programs that pass NULL to trusted arguments. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-11-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	cf82580c86	selftests: bpf: fix cgroup_hierarchical_stats The cgroup_hierarchical_stats selftests uses an fentry program attached to cgroup_attach_task and then passes the received &dst_cgrp->self to the css_rstat_updated() kfunc. The verifier now assumes that all kfuncs only takes trusted pointer arguments, and pointers received by fentry are not marked trustes by default. Use a tp_btf program in place for fentry for this test, pointers received by tp_btf programs are marked trusted by the verifier. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-10-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	230b0118e4	selftests: bpf: fix test_kfunc_dynptr_param As verifier now assumes that all kfuncs only takes trusted pointer arguments, passing 0 (NULL) to a kfunc that doesn't mark the argument as __nullable or __opt will be rejected with a failure message of: Possibly NULL pointer passed to trusted arg<n> Pass a non-null value to the kfunc to test the expected failure mode. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-9-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	03cc77b10e	selftests: bpf: Update failure message for rbtree_fail The rbtree_api_use_unchecked_remove_retval() selftest passes a pointer received from bpf_rbtree_remove() to bpf_rbtree_add() without checking for NULL, this was earlier caught by __check_ptr_off_reg() in the verifier. Now the verifier assumes every kfunc only takes trusted pointer arguments, so it catches this NULL pointer earlier in the path and provides a more accurate failure message. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-8-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	df5004579b	selftests: bpf: Update kfunc_param_nullable test for new error message With trusted args now being the default, the NULL pointer check runs before type-specific validation. Update test3 to expect the new error message "Possibly NULL pointer passed to trusted arg0" instead of the old dynptr-specific error message. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-7-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	7646c7afd9	bpf: Remove redundant KF_TRUSTED_ARGS flag from all kfuncs Now that KF_TRUSTED_ARGS is the default for all kfuncs, remove the explicit KF_TRUSTED_ARGS flag from all kfunc definitions and remove the flag itself. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:28 -08:00
Puranjay Mohan	c286e7e9d1	selftests/bpf: veristat: fix printing order in output_stats() The order of the variables in the printf() doesn't match the text and therefore veristat prints something like this: Done. Processed 24 files, 0 programs. Skipped 62 files, 0 programs. When it should print: Done. Processed 24 files, 62 programs. Skipped 0 files, 0 programs. Fix the order of variables in the printf() call. Fixes: `518fee8bfa` ("selftests/bpf: make veristat skip non-BPF and failing-to-open BPF objects") Tested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20251231221052.759396-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-31 16:11:49 -08:00
Ihor Solodrai	1a8fa7faf4	resolve_btfids: Implement --patch_btfids Recent changes in BTF generation [1] rely on ${OBJCOPY} command to update .BTF_ids section data in target ELF files. This exposed a bug in llvm-objcopy --update-section code path, that may lead to corruption of a target ELF file. Specifically, because of the bug st_shndx of some symbols may be (incorrectly) set to 0xffff (SHN_XINDEX) [2][3]. While there is a pending fix for LLVM, it'll take some time before it lands (likely in 22.x). And the kernel build must keep working with older LLVM toolchains in the foreseeable future. Using GNU objcopy for .BTF_ids update would work, but it would require changes to LLVM-based build process, likely breaking existing build environments as discussed in [2]. To work around llvm-objcopy bug, implement --patch_btfids code path in resolve_btfids as a drop-in replacement for: ${OBJCOPY} --update-section .BTF_ids=${btf_ids} ${elf} Which works specifically for .BTF_ids section: ${RESOLVE_BTFIDS} --patch_btfids ${btf_ids} ${elf} This feature in resolve_btfids can be removed at some point in the future, when llvm-objcopy with a relevant bugfix becomes common. [1] https://lore.kernel.org/bpf/20251219181321.1283664-1-ihor.solodrai@linux.dev/ [2] https://lore.kernel.org/bpf/20251224005752.201911-1-ihor.solodrai@linux.dev/ [3] https://github.com/llvm/llvm-project/issues/168060#issuecomment-3533552952 Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20251231012558.1699758-1-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-31 09:04:09 -08:00
Eduard Zingerman	4fd99103ee	selftests/bpf: iterator based loop and STACK_MISC states pruning The test case first initializes 9 stack slots as STACK_MISC, then conditionally updates each of them to SCALAR spill inside an iterator based loop. This leads to 2**9 combinations of MISC/SPILL marks for these slots at the iterator next call. The loop converges only if the verifier treats such states as equivalent, otherwise visited states are evicted from the states cache too quickly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251230-loop-stack-misc-pruning-v1-2-585cfd6cec51@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-31 09:01:13 -08:00
Eduard Zingerman	e6f2612f0e	selftests/bpf: test cases for bpf_loop SCC and state graph backedges Test for state graph backedges accumulation for SCCs formed by bpf_loop(). Equivalent to the following C program: int main(void) { 1: fp[-8] = bpf_get_prandom_u32(); 2: fp[-16] = -32; // used in a memory access below 3: bpf_loop(7, loop_cb4, fp, 0); 4: return 0; } int loop_cb4(int i, void ctx) { 5: if (unlikely(ctx[-8] > bpf_get_prandom_u32())) 6: (u64 *)(fp + ctx[-16]) = 42; // aligned access expected 7: if (unlikely(fp[-8] > bpf_get_prandom_u32())) 8: ctx[-16] = -31; // makes said access unaligned 9: return 0; } If state graph backedges are not accumulated properly at the SCC formed by loop_cb4() call from bpf_loop(), the state {ctx[-16]=-32} injected at instruction 9 on verification path 1,2,3,5,7,9,4 would be considered fully verified and would lack precision mark for ctx[-16]. This would lead to early pruning of verification path 1,2,3,5,7,8,9 in state {ctx[-16]=-31}, which in turn leads to the incorrect assumption that the above program is safe. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251229-scc-for-callbacks-v1-2-ceadfe679900@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-30 15:42:42 -08:00
Puranjay Mohan	317a5df78f	selftests/bpf: Fix verifier_arena_large/big_alloc3 test The big_alloc3() test tries to allocate 2051 pages at once in non-sleepable context and this can fail sporadically on resource contrained systems, so skip this test in case of such failures. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20251230195134.599463-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-30 12:35:23 -08:00
Puranjay Mohan	efecc9e825	selftests: bpf: test non-sleepable arena allocations As arena kfuncs can now be called from non-sleepable contexts, test this by adding non-sleepable copies of tests in verifier_arena, this is done by using a socket program instead of syscall. Add a new test case in verifier_arena_large to check that the bpf_arena_alloc_pages() works for more than 1024 pages. 1024 * sizeof(struct page *) is the upper limit of kmalloc_nolock() but bpf_arena_alloc_pages() should still succeed because it re-uses this array in a loop. Augment the arena_list selftest to also run in non-sleepable context by taking rcu_read_lock. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20251222195022.431211-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-23 11:30:00 -08:00
Puranjay Mohan	83dd46ecb6	selftests: bpf: fix tests with raw_tp calling kfuncs As the previous commit allowed raw_tp programs to call kfuncs, so of the selftests that were expected to fail will now succeed. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20251222133250.1890587-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-22 22:23:38 -08:00
JP Kobryn	6bce6ddbe6	bpf: selftests: selftests for memcg stat kfuncs Add test coverage for the kfuncs that fetch memcg stats. Using some common stats, test scenarios ensuring that the given stat increases by some arbitrary amount. The stats selected cover the three categories represented by the enums: node_stat_item, memcg_stat_item, vm_event_item. Since only a subset of all stats are queried, use a static struct made up of fields for each stat. Write to the struct with the fetched values when the bpf program is invoked and read the fields in the user mode program for verification. Signed-off-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Link: https://lore.kernel.org/r/20251223044156.208250-6-roman.gushchin@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-22 22:20:22 -08:00
Matt Bobrowski	d2749ae85a	selftests/bpf: add test case for BPF LSM hook bpf_lsm_mmap_file Add a trivial test case asserting that the BPF verifier enforces PTR_MAYBE_NULL semantics on the struct file pointer argument of BPF LSM hook bpf_lsm_mmap_file(). Dereferencing the struct file pointer passed into bpf_lsm_mmap_file() without explicitly performing a NULL check first should not be permitted by the BPF verifier as it can lead to NULL pointer dereferences and a kernel crash. Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20251216133000.3690723-2-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-21 10:56:33 -08:00
Ihor Solodrai	522397d05e	resolve_btfids: Change in-place update with raw binary output Currently resolve_btfids updates .BTF_ids section of an ELF file in-place, based on the contents of provided BTF, usually within the same input file, and optionally a BTF base. Change resolve_btfids behavior to enable BTF transformations as part of its main operation. To achieve this, in-place ELF write in resolve_btfids is replaced with generation of the following binaries: * ${1}.BTF with .BTF section data * ${1}.BTF_ids with .BTF_ids section data if it existed in ${1} * ${1}.BTF.base with .BTF.base section data for out-of-tree modules The execution of resolve_btfids and consumption of its output is orchestrated by scripts/gen-btf.sh introduced in this patch. The motivation for emitting binary data is that it allows simplifying resolve_btfids implementation by delegating ELF update to the $OBJCOPY tool [1], which is already widely used across the codebase. There are two distinct paths for BTF generation and resolve_btfids application in the kernel build: for vmlinux and for kernel modules. For the vmlinux binary a .BTF section is added in a roundabout way to ensure correct linking. The patch doesn't change this approach, only the implementation is a little different. Before this patch it worked as follows: * pahole consumed .tmp_vmlinux1 [2] and added .BTF section with llvm-objcopy [3] to it * then everything except the .BTF section was stripped from .tmp_vmlinux1 into a .tmp_vmlinux1.bpf.o object [2], later linked into vmlinux * resolve_btfids was executed later on vmlinux.unstripped [4], updating it in-place After this patch gen-btf.sh implements the following: * pahole consumes .tmp_vmlinux1 and produces a detached file with raw BTF data * resolve_btfids consumes .tmp_vmlinux1 and detached BTF to produce (potentially modified) .BTF, and .BTF_ids sections data * a .tmp_vmlinux1.bpf.o object is then produced with objcopy copying BTF output of resolve_btfids * .BTF_ids data gets embedded into vmlinux.unstripped in link-vmlinux.sh by objcopy --update-section For kernel modules, creating a special .bpf.o file is not necessary, and so embedding of sections data produced by resolve_btfids is straightforward with objcopy. With this patch an ELF file becomes effectively read-only within resolve_btfids, which allows deleting elf_update() call and satellite code (like compressed_section_fix [5]). Endianness handling of .BTF_ids data is also changed. Previously the "flags" part of the section was bswapped in sets_patch() [6], and then Elf_Type was modified before elf_update() to signal to libelf that bswap may be necessary. With this patch we explicitly bswap entire data buffer on load and on dump. [1] https://lore.kernel.org/bpf/131b4190-9c49-4f79-a99d-c00fac97fa44@linux.dev/ [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/link-vmlinux.sh?h=v6.18#n110 [3] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/tree/btf_encoder.c?h=v1.31#n1803 [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/link-vmlinux.sh?h=v6.18#n284 [5] https://lore.kernel.org/bpf/20200819092342.259004-1-jolsa@kernel.org/ [6] https://lore.kernel.org/bpf/cover.1707223196.git.vmalik@redhat.com/ Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251219181825.1289460-3-ihor.solodrai@linux.dev	2025-12-19 10:55:40 -08:00
Ihor Solodrai	014e1cdb5f	selftests/bpf: Run resolve_btfids only for relevant .test.o objects A selftest targeting resolve_btfids functionality relies on a resolved .BTF_ids section to be available in the TRUNNER_BINARY. The underlying BTF data is taken from a special BPF program (btf_data.c), and so resolve_btfids is executed as a part of a TRUNNER_BINARY build recipe on the final binary. Subsequent patches in this series allow resolve_btfids to modify BTF before resolving the symbols, which means that the test needs access to that modified BTF [1]. Currently the test simply reads in btf_data.bpf.o on the assumption that BTF hasn't changed. Implement resolve_btfids call only for particular test objects (just resolve_btfids.test.o for now). The test objects are linked into the TRUNNER_BINARY, and so .BTF_ids section will be available there. This will make it trivial for the resolve_btfids test to access BTF modified by resolve_btfids. [1] https://lore.kernel.org/bpf/CAErzpmvsgSDe-QcWH8SFFErL6y3p3zrqNri5-UHJ9iK2ChyiBw@mail.gmail.com/ Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251219181825.1289460-2-ihor.solodrai@linux.dev	2025-12-19 10:55:40 -08:00
Alexei Starovoitov	ec439c3801	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.19-rc1 Cross-merge BPF and other fixes after downstream PR. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-16 21:29:38 -08:00
Linus Torvalds	ea1013c153	bpf-fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmlCBmwACgkQ6rmadz2v bToUZA//ZY0IE1x1nCixEAqGF/nGpDzVX4YQQfjrUoXQOD4ykzt35yTNXl6B1IVA dliVSI6kUtdoThUa7xJUxMSkDsVBsEMT/zYXQEXJG1zXvJANCB9wTzsC3OCBWbXt BRczcEkq0OHC9/l5CrILR6ocwxKGDIMIysfeOSABgfqckSEhylWy3+EWZQCk08ka gNpXlDJUG7dYpcZD/zhuC7e5Rg1uNvN7WiTv+Biig8xZCsEtYOq+qC5C/sOnsypI nqfECfbx48cVl49SjatdgquuHn/INESdLRCHisshkurA2Mp5PQuCmrwlXbv4JG59 v9b7lsFQlkpvEXMdo9VYe6K2gjfkOPRdWsVPu2oXA1qISRmrDqX8cKOpapUIwRhL p3ASruMOnz0KFqVaET8+5u2SwtALeW+c+1p1aHMfVGF/qbXuyG05qBkLoGFJR+Xr WznXUXY80Z7pjD57SpA6U3DigAkGqKCBXUwdifaOq8HQonwsnQGqkW/3NngNULGP IC4u0JXn61VgQsM/kAw+ucc4bdKI0g4oKJR56lT48elrj6Yxrjpde4oOqzZ0IQKu VQ0YnzWqqT2tjh4YNMOwkNPbFR4ALd329zI6TUkWib/jByEBNcfjSj9BRANd1KSx JgSHAE6agrbl6h3nOx584YCasX3Zq+nfv1Sj4Z/5GaHKKW3q/Vw= =wHLt -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix BPF builds due to -fms-extensions. selftests (Alexei Starovoitov), bpftool (Quentin Monnet). - Fix build of net/smc when CONFIG_BPF_SYSCALL=y, but CONFIG_BPF_JIT=n (Geert Uytterhoeven) - Fix livepatch/BPF interaction and support reliable unwinding through BPF stack frames (Josh Poimboeuf) - Do not audit capability check in arm64 JIT (Ondrej Mosnacek) - Fix truncated dmabuf BPF iterator reads (T.J. Mercier) - Fix verifier assumptions of bpf_d_path's output buffer (Shuran Liu) - Fix warnings in libbpf when built with -Wdiscarded-qualifiers under C23 (Mikhail Gavrilov) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: add regression test for bpf_d_path() bpf: Fix verifier assumptions of bpf_d_path's output buffer selftests/bpf: Add test for truncated dmabuf_iter reads bpf: Fix truncated dmabuf iterator reads x86/unwind/orc: Support reliable unwinding through BPF stack frames bpf: Add bpf_has_frame_pointer() bpf, arm64: Do not audit capability check in do_jit() libbpf: Fix -Wdiscarded-qualifiers under C23 bpftool: Fix build warnings due to MS extensions net: smc: SMC_HS_CTRL_BPF should depend on BPF_JIT selftests/bpf: Add -fms-extensions to bpf build flags	2025-12-17 15:54:58 +12:00
Emil Tsalapatis	19f12431b6	selftests/bpf: Add tests for the arena offset of globals Add tests for the new libbpf globals arena offset logic. The tests cover the case of globals being as large as the arena itself, and being smaller than the arena. In that case, the data is placed at the end of the arena, and the beginning of the arena is free. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251216173325.98465-6-emil@etsalapatis.com	2025-12-16 10:42:55 -08:00
Emil Tsalapatis	c1f61171d4	libbpf: Move arena globals to the end of the arena Arena globals are currently placed at the beginning of the arena by libbpf. This is convenient, but prevents users from reserving guard pages in the beginning of the arena to identify NULL pointer dereferences. Adjust the load logic to place the globals at the end of the arena instead. Also modify bpftool to set the arena pointer in the program's BPF skeleton to point to the globals. Users now call bpf_map__initial_value() to find the beginning of the arena mapping and use the arena pointer in the skeleton to determine which part of the mapping holds the arena globals and which part is free. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251216173325.98465-5-emil@etsalapatis.com	2025-12-16 10:42:55 -08:00
Emil Tsalapatis	12a1fe6e12	bpf/verifier: Do not limit maximum direct offset into arena map The verifier currently limits direct offsets into a map to 512MiB to avoid overflow during pointer arithmetic. However, this prevents arena maps from using direct addressing instructions to access data at the end of > 512MiB arena maps. This is necessary when moving arena globals to the end of the arena instead of the front. Refactor the verifier code to remove the offset calculation during direct value access calculations. This is possible because the only two map types that implement .map_direct_value_addr() are arrays and arenas, and they both do their own internal checks to ensure the offset is within bounds. Adjust selftests that expect the old error. These tests still fail because the verifier identifies the access as out of bounds for the map, so change them to expect an "invalid access to map value pointer" error instead. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251216173325.98465-3-emil@etsalapatis.com	2025-12-16 10:42:55 -08:00
Emil Tsalapatis	0355911ac0	selftests/bpf: Explicitly account for globals in verifier_arena_large The big_alloc1 test in verifier_arena_large assumes that the arena base and the first page allocated by bpf_arena_alloc_pages are identical. This is not the case, because the first page in the arena is populated by global arena data. The test still passes because the code makes the tacit assumption that the first page is on offset PAGE_SIZE instead of 0. Make this distinction explicit in the code, and adjust the page offsets requested during the test to count from the beginning of the arena instead of using the address of the first allocated page. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251216173325.98465-2-emil@etsalapatis.com	2025-12-16 10:42:55 -08:00
Shuran Liu	79e247d660	selftests/bpf: add regression test for bpf_d_path() Add a regression test for bpf_d_path() to cover incorrect verifier assumptions caused by an incorrect function prototype. The test attaches to the fallocate hook, calls bpf_d_path() and verifies that a simple prefix comparison on the returned pathname behaves correctly after the fix in patch 1. It ensures the verifier does not assume the buffer remains unwritten. Co-developed-by: Zesen Liu <ftyg@live.com> Signed-off-by: Zesen Liu <ftyg@live.com> Co-developed-by: Peili Gao <gplhust955@gmail.com> Signed-off-by: Peili Gao <gplhust955@gmail.com> Co-developed-by: Haoran Ni <haoran.ni.cs@gmail.com> Signed-off-by: Haoran Ni <haoran.ni.cs@gmail.com> Signed-off-by: Shuran Liu <electronlsr@gmail.com> Link: https://lore.kernel.org/r/20251206141210.3148-3-electronlsr@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-10 01:36:26 -08:00
Cupertino Miranda	a5b4867fad	selftests/bpf: add verifier sign extension bound computation tests. This commit adds 3 tests to verify a common compiler generated pattern for sign extension (r1 <<= 32; r1 s>>= 32). The tests make sure the register bounds are correctly computed both for positive and negative register values. Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com> Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Cc: David Faust <david.faust@oracle.com> Cc: Jose Marchesi <jose.marchesi@oracle.com> Cc: Elena Zannoni <elena.zannoni@oracle.com> Link: https://lore.kernel.org/r/20251202180220.11128-3-cupertino.miranda@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-10 00:12:09 -08:00
Kohei Enju	18352f8fae	selftests/bpf: add tests for attaching invalid fd Add test cases for situations where adding the following types of file descriptors to a cpumap entry should fail: - Non-BPF file descriptor (expect -EINVAL) - Nonexistent file descriptor (expect -EBADF) Also tighten the assertion for the expected error when adding a non-BPF_XDP_CPUMAP program to a cpumap entry. Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20251208131449.73036-3-enjuk@amazon.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-09 23:53:27 -08:00
T.J. Mercier	9489d457d4	selftests/bpf: Add test for truncated dmabuf_iter reads If many dmabufs are present, reads of the dmabuf iterator can be truncated at PAGE_SIZE or user buffer size boundaries before the fix in "bpf: Fix truncated dmabuf iterator reads". Add a test to confirm truncation does not occur. Signed-off-by: T.J. Mercier <tjmercier@google.com> Link: https://lore.kernel.org/r/20251204000348.1413593-2-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-09 23:49:04 -08:00
H. Peter Anvin	c278d72b99	selftests/bpf: replace "__auto_type" with "auto" Replace instances of "__auto_type" with "auto" in: tools/testing/selftests/bpf/prog_tests/socket_helpers.h This file does not seem to be including <linux/compiler_types.h> directly or indirectly, so copy the definition but guard it with !defined(auto). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>	2025-12-08 15:32:15 -08:00
Linus Torvalds	509d3f4584	Significant patch series in this pull request: - The 6 patch series "panic: sys_info: Refactor and fix a potential issue" from Andy Shevchenko fixes a build issue and does some cleanup in ib/sys_info.c. - The 9 patch series "Implement mul_u64_u64_div_u64_roundup()" from David Laight enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions. - The 2 patch series "scripts/gdb/symbols: make BPF debug info available to GDB" from Ilya Leoshkevich makes BPF symbol names, sizes, and line numbers available to the GDB debugger. - The 4 patch series "Enable hung_task and lockup cases to dump system info on demand" from Feng Tang adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire. - The 6 patch series "lib/base64: add generic encoder/decoder, migrate users" from Kuan-Wei Chiu adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations. - The 2 patch series "rbree: inline rb_first() and rb_last()" from Eric Dumazet makes TCP a little faster. - The 9 patch series "liveupdate: Rework KHO for in-kernel users" from Pasha Tatashin reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients. - The 13 patch series "kho: simplify state machine and enable dynamic updates" from Pasha Tatashin increases the flexibility of KEXEC Handover. Also preparation for LUO. - The 18 patch series "Live Update Orchestrator" from Pasha Tatashin is a major new feature targeted at cloud environments. Quoting the [0/N]: This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition. As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot. Mike Rappaport merits a mention here, for his extensive review and testing work. - The 3 patch series "kexec: reorganize kexec and kdump sysfs" from Sourabh Jain moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day. - The 2 patch series "kho: fixes for vmalloc restoration" from Mike Rapoport fixes a BUG which was being hit during KHO restoration of vmalloc() regions. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaTSAkQAKCRDdBJ7gKXxA jrkiAP9QKfsRv46XZaM5raScjY1ayjP+gqb2rgt6BQ/gZvb2+wD/cPAYOR6BiX52 n0pVpQmG5P/KyOmpLztn96ejL4heKwQ= =JY96 -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko) fixes a build issue and does some cleanup in ib/sys_info.c - "Implement mul_u64_u64_div_u64_roundup()" (David Laight) enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions - "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich) makes BPF symbol names, sizes, and line numbers available to the GDB debugger - "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang) adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire - "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu) adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations - "rbree: inline rb_first() and rb_last()" (Eric Dumazet) makes TCP a little faster - "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin) reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients - "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin) increases the flexibility of KEXEC Handover. Also preparation for LUO - "Live Update Orchestrator" (Pasha Tatashin) is a major new feature targeted at cloud environments. Quoting the cover letter: This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition. As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot. Mike Rappaport merits a mention here, for his extensive review and testing work. - "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain) moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day - "kho: fixes for vmalloc restoration" (Mike Rapoport) fixes a BUG which was being hit during KHO restoration of vmalloc() regions * tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits) calibrate: update header inclusion Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()" vmcoreinfo: track and log recoverable hardware errors kho: fix restoring of contiguous ranges of order-0 pages kho: kho_restore_vmalloc: fix initialization of pages array MAINTAINERS: TPM DEVICE DRIVER: update the W-tag init: replace simple_strtoul with kstrtoul to improve lpj_setup KHO: fix boot failure due to kmemleak access to non-PRESENT pages Documentation/ABI: new kexec and kdump sysfs interface Documentation/ABI: mark old kexec sysfs deprecated kexec: move sysfs entries to /sys/kernel/kexec test_kho: always print restore status kho: free chunks using free_page() instead of kfree() selftests/liveupdate: add kexec test for multiple and empty sessions selftests/liveupdate: add simple kexec-based selftest for LUO selftests/liveupdate: add userspace API selftests docs: add documentation for memfd preservation via LUO mm: memfd_luo: allow preserving memfd liveupdate: luo_file: add private argument to store runtime state mm: shmem: export some functions to internal.h ...	2025-12-06 14:01:20 -08:00
Amery Hung	0e841d1926	selftests/bpf: Test getting associated struct_ops in timer callback Make sure 1) a timer callback can also reference the associated struct_ops, and then make sure 2) the timer callback cannot get a dangled pointer to the struct_ops when the map is freed. The test schedules a timer callback from a struct_ops program since struct_ops programs do not pin the map. It is possible for the timer callback to run after the map is freed. The timer callback calls a kfunc that runs .test_1() of the associated struct_ops, which should return MAP_MAGIC when the map is still alive or -1 when the map is gone. The first subtest added in this patch schedules the timer callback to run immediately, while the map is still alive. The second subtest added schedules the callback to run 500ms after syscall_prog runs and then frees the map right after syscall_prog runs. Both subtests then wait until the callback runs to check the return of the kfunc. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-7-ameryhung@gmail.com	2025-12-05 16:17:58 -08:00
Amery Hung	04fd12df4e	selftests/bpf: Test ambiguous associated struct_ops Add a test to make sure implicit struct_ops association does not break backward compatibility nor return incorrect struct_ops. struct_ops programs should still be allowed to be reused in different struct_ops map. The associated struct_ops map set implicitly however will be poisoned. Trying to read it through the helper bpf_prog_get_assoc_struct_ops() should result in a NULL pointer. While recursion of test_1() cannot happen due to the associated struct_ops being ambiguois, explicitly check for it to prevent stack overflow if the test regresses. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-6-ameryhung@gmail.com	2025-12-05 16:17:58 -08:00
Amery Hung	33a165f9c2	selftests/bpf: Test BPF_PROG_ASSOC_STRUCT_OPS command Test BPF_PROG_ASSOC_STRUCT_OPS command that associates a BPF program with a struct_ops. The test follows the same logic in commit `ba7000f1c3` ("selftests/bpf: Test multi_st_ops and calling kfuncs from different programs"), but instead of using map id to identify a specific struct_ops, this test uses the new BPF command to associate a struct_ops with a program. The test consists of two sets of almost identical struct_ops maps and BPF programs associated with the map. Their only difference is the unique value returned by bpf_testmod_multi_st_ops::test_1(). The test first loads the programs and associates them with struct_ops maps. Then, it exercises the BPF programs. They will in turn call kfunc bpf_kfunc_multi_st_ops_test_1_prog_arg() to trigger test_1() of the associated struct_ops map, and then check if the right unique value is returned. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-5-ameryhung@gmail.com	2025-12-05 16:17:58 -08:00
Alexei Starovoitov	835a507535	selftests/bpf: Add -fms-extensions to bpf build flags The kernel is now built with -fms-extensions, therefore generated vmlinux.h contains types like: struct slab { .. struct freelist_counters; }; Use -fms-extensions and -Wno-microsoft-anon-tag flags to build bpf programs that #include "vmlinux.h" Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-03 20:20:11 -08:00
Linus Torvalds	8f7aa3d3c7	Networking changes for 6.19. Core & protocols ---------------- - Replace busylock at the Tx queuing layer with a lockless list. Resulting in a 300% (4x) improvement on heavy TX workloads, sending twice the number of packets per second, for half the cpu cycles. - Allow constantly busy flows to migrate to a more suitable CPU/NIC queue. Normally we perform queue re-selection when flow comes out of idle, but under extreme circumstances the flows may be constantly busy. Add sysctl to allow periodic rehashing even if it'd risk packet reordering. - Optimize the NAPI skb cache, make it larger, use it in more paths. - Attempt returning Tx skbs to the originating CPU (like we already did for Rx skbs). - Various data structure layout and prefetch optimizations from Eric. - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly quite expensive on recent AMD machines. - Extend threaded NAPI polling to allow the kthread busy poll for packets. - Make MPTCP use Rx backlog processing. This lowers the lock pressure, improving the Rx performance. - Support memcg accounting of MPTCP socket memory. - Allow admin to opt sockets out of global protocol memory accounting (using a sysctl or BPF-based policy). The global limits are a poor fit for modern container workloads, where limits are imposed using cgroups. - Improve heuristics for when to kick off AF_UNIX garbage collection. - Allow users to control TCP SACK compression, and default to 33% of RTT. - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily aggressive rcvbuf growth and overshot when the connection RTT is low. - Preserve skb metadata space across skb_push / skb_pull operations. - Support for IPIP encapsulation in the nftables flowtable offload. - Support appending IP interface information to ICMP messages (RFC 5837). - Support setting max record size in TLS (RFC 8449). - Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL. - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock. - Let users configure the number of write buffers in SMC. - Add new struct sockaddr_unsized for sockaddr of unknown length, from Kees. - Some conversions away from the crypto_ahash API, from Eric Biggers. - Some preparations for slimming down struct page. - YAML Netlink protocol spec for WireGuard. - Add a tool on top of YAML Netlink specs/lib for reporting commonly computed derived statistics and summarized system state. Driver API ---------- - Add CAN XL support to the CAN Netlink interface. - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as defined by the OPEN Alliance's "Advanced diagnostic features for 100BASE-T1 automotive Ethernet PHYs" specification. - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x). - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec and performs RSS. - Add info to devlink params whether the current setting is the default or a user override. Allow resetting back to default. - Add standard device stats for PSP crypto offload. - Leverage DSA frame broadcast to implement simple HSR frame duplication for a lot of switches without dedicated HSR offload. - Add uAPI defines for 1.6Tbps link modes. Device drivers -------------- - Add Motorcomm YT921x gigabit Ethernet switch support. - Add MUCSE driver for N500/N210 1GbE NIC series. - Convert drivers to support dedicated ops for timestamping control, and away from the direct IOCTL handling. While at it support GET operations for PHY timestamping. - Add (and convert most drivers to) a dedicated ethtool callback for reading the Rx ring count. - Significant refactoring efforts in the STMMAC driver, which supports Synopsys turn-key MAC IP integrated into a ton of SoCs. - Ethernet high-speed NICs: - Broadcom (bnxt): - support PPS in/out on all pins - Intel (100G, ice, idpf): - ice: implement standard ethtool and timestamping stats - i40e: support setting the max number of MAC addresses per VF - iavf: support RSS of GTP tunnels for 5G and LTE deployments - nVidia/Mellanox (mlx5): - reduce downtime on interface reconfiguration - disable being an XDP redirect target by default (same as other drivers) to avoid wasting resources if feature is unused - Meta (fbnic): - add support for Linux-managed PCS on 25G, 50G, and 100G links - Wangxun: - support Rx descriptor merge, and Tx head writeback - support Rx coalescing offload - support 25G SPF and 40G QSFP modules - Ethernet virtual: - Google (gve): - allow ethtool to configure rx_buf_len - implement XDP HW RX Timestamping support for DQ descriptor format - Microsoft vNIC (mana): - support HW link state events - handle hardware recovery events when probing the device - Ethernet NICs consumer, and embedded: - usbnet: add support for Byte Queue Limits (BQL) - AMD (amd-xgbe): - add device selftests - NXP (enetc): - add i.MX94 support - Broadcom integrated MACs (bcmgenet, bcmasp): - bcmasp: add support for PHY-based Wake-on-LAN - Broadcom switches (b53): - support port isolation - support BCM5389/97/98 and BCM63XX ARL formats - Lantiq/MaxLinear switches: - support bridge FDB entries on the CPU port - use regmap for register access - allow user to enable/disable learning - support Energy Efficient Ethernet - support configuring RMII clock delays - add tagging driver for MaxLinear GSW1xx switches - Synopsys (stmmac): - support using the HW clock in free running mode - add Eswin EIC7700 support - add Rockchip RK3506 support - add Altera Agilex5 support - Cadence (macb): - cleanup and consolidate descriptor and DMA address handling - add EyeQ5 support - TI: - icssg-prueth: support AF_XDP - Airoha access points: - add missing Ethernet stats and link state callback - add AN7583 support - support out-of-order Tx completion processing - Power over Ethernet: - pd692x0: preserve PSE configuration across reboots - add support for TPS23881B devices - Ethernet PHYs: - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support - Support 50G SerDes and 100G interfaces in Linux-managed PHYs - micrel: - support for non PTP SKUs of lan8814 - enable in-band auto-negotiation on lan8814 - realtek: - cable testing support on RTL8224 - interrupt support on RTL8221B - motorcomm: support for PHY LEDs on YT853 - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag - mscc: support for PHY LED control - CAN drivers: - m_can: add support for optional reset and system wake up - remove can_change_mtu() obsoleted by core handling - mcp251xfd: support GPIO controller functionality - Bluetooth: - add initial support for PASTa - WiFi: - split ieee80211.h file, it's way too big - improvements in VHT radiotap reporting, S1G, Channel Switch Announcement handling, rate tracking in mesh networks - improve multi-radio monitor mode support, and add a cfg80211 debugfs interface for it - HT action frame handling on 6 GHz - initial chanctx work towards NAN - MU-MIMO sniffer improvements - WiFi drivers: - RealTek (rtw89): - support USB devices RTL8852AU and RTL8852CU - initial work for RTL8922DE - improved injection support - Intel: - iwlwifi: new sniffer API support - MediaTek (mt76): - WED support for >32-bit DMA - airoha NPU support - regdomain improvements - continued WiFi7/MLO work - Qualcomm/Atheros: - ath10k: factory test support - ath11k: TX power insertion support - ath12k: BSS color change support - ath12k: statistics improvements - brcmfmac: Acer A1 840 tablet quirk - rtl8xxxu: 40 MHz connection fixes/support Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmkveRQACgkQMUZtbf5S IrvY7A/+Nb0o4BxLHjPkAl1m3t3q2d0Y29B7SNkwnwEtxAV8EkNeZ3GWrdtDnTQY MYhmc7LEzvz8/lihapr7UJkcokzSASUV54hbez5jDBKC8EEoyUk8FdWDPerwlcRI zmCFNAVFyh9GX8i7wcrzKbDTHT5+GZLbSlGl9U5mhLsDdRlJgH7d8PJ7vWcmtLFY XN0paDyaeHfCl8wReWNAYx4C/I0ODOvlscpO0tnAKhB0ngJbQCKY2t6tn3rOYdif ZSQ5KwVRnJtQ4fYOFMOy9+FSCjVXtyrxF8KLxD+mqom2ZhmO00UpOMl09tqhq3uT WnvwoHUVBt6F+iITHwg5kMgIDPUq1kpUvL4S4UbVSuUm9ZKD+4KRU2ZHRBYMx+MU bsqmtY8/IULClUoRz+tZhltA8eb0NEqNZE2JPOFDiJHn1YiCCkFwxibhir893oM3 sB7x65D7LQI2ty2BBGVGYnwYDPtyaxOA/s3WTwPvLEi3+Y/TGNIIrS9lBLA4U+Yr Gi93WQGVjttMmVyaHgXBUGmi3L52hvolm0AZ8zSRGrnIEpecjhly2KfYuaOzuxXC IHEQ6AFLdRh6JzafXGb/mQwGCHNmhwsY8A49i94fakWQamaL/L6A+1dyPu4LXMqi NwqCmlVb/LKGlfNG+V4wT27srJ+yBA2Vk3tpR1sZQQytFh0LKHI= =UoDR -----END PGP SIGNATURE----- Merge tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Replace busylock at the Tx queuing layer with a lockless list. Resulting in a 300% (4x) improvement on heavy TX workloads, sending twice the number of packets per second, for half the cpu cycles. - Allow constantly busy flows to migrate to a more suitable CPU/NIC queue. Normally we perform queue re-selection when flow comes out of idle, but under extreme circumstances the flows may be constantly busy. Add sysctl to allow periodic rehashing even if it'd risk packet reordering. - Optimize the NAPI skb cache, make it larger, use it in more paths. - Attempt returning Tx skbs to the originating CPU (like we already did for Rx skbs). - Various data structure layout and prefetch optimizations from Eric. - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly quite expensive on recent AMD machines. - Extend threaded NAPI polling to allow the kthread busy poll for packets. - Make MPTCP use Rx backlog processing. This lowers the lock pressure, improving the Rx performance. - Support memcg accounting of MPTCP socket memory. - Allow admin to opt sockets out of global protocol memory accounting (using a sysctl or BPF-based policy). The global limits are a poor fit for modern container workloads, where limits are imposed using cgroups. - Improve heuristics for when to kick off AF_UNIX garbage collection. - Allow users to control TCP SACK compression, and default to 33% of RTT. - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily aggressive rcvbuf growth and overshot when the connection RTT is low. - Preserve skb metadata space across skb_push / skb_pull operations. - Support for IPIP encapsulation in the nftables flowtable offload. - Support appending IP interface information to ICMP messages (RFC 5837). - Support setting max record size in TLS (RFC 8449). - Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL. - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock. - Let users configure the number of write buffers in SMC. - Add new struct sockaddr_unsized for sockaddr of unknown length, from Kees. - Some conversions away from the crypto_ahash API, from Eric Biggers. - Some preparations for slimming down struct page. - YAML Netlink protocol spec for WireGuard. - Add a tool on top of YAML Netlink specs/lib for reporting commonly computed derived statistics and summarized system state. Driver API: - Add CAN XL support to the CAN Netlink interface. - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as defined by the OPEN Alliance's "Advanced diagnostic features for 100BASE-T1 automotive Ethernet PHYs" specification. - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x). - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec and performs RSS. - Add info to devlink params whether the current setting is the default or a user override. Allow resetting back to default. - Add standard device stats for PSP crypto offload. - Leverage DSA frame broadcast to implement simple HSR frame duplication for a lot of switches without dedicated HSR offload. - Add uAPI defines for 1.6Tbps link modes. Device drivers: - Add Motorcomm YT921x gigabit Ethernet switch support. - Add MUCSE driver for N500/N210 1GbE NIC series. - Convert drivers to support dedicated ops for timestamping control, and away from the direct IOCTL handling. While at it support GET operations for PHY timestamping. - Add (and convert most drivers to) a dedicated ethtool callback for reading the Rx ring count. - Significant refactoring efforts in the STMMAC driver, which supports Synopsys turn-key MAC IP integrated into a ton of SoCs. - Ethernet high-speed NICs: - Broadcom (bnxt): - support PPS in/out on all pins - Intel (100G, ice, idpf): - ice: implement standard ethtool and timestamping stats - i40e: support setting the max number of MAC addresses per VF - iavf: support RSS of GTP tunnels for 5G and LTE deployments - nVidia/Mellanox (mlx5): - reduce downtime on interface reconfiguration - disable being an XDP redirect target by default (same as other drivers) to avoid wasting resources if feature is unused - Meta (fbnic): - add support for Linux-managed PCS on 25G, 50G, and 100G links - Wangxun: - support Rx descriptor merge, and Tx head writeback - support Rx coalescing offload - support 25G SPF and 40G QSFP modules - Ethernet virtual: - Google (gve): - allow ethtool to configure rx_buf_len - implement XDP HW RX Timestamping support for DQ descriptor format - Microsoft vNIC (mana): - support HW link state events - handle hardware recovery events when probing the device - Ethernet NICs consumer, and embedded: - usbnet: add support for Byte Queue Limits (BQL) - AMD (amd-xgbe): - add device selftests - NXP (enetc): - add i.MX94 support - Broadcom integrated MACs (bcmgenet, bcmasp): - bcmasp: add support for PHY-based Wake-on-LAN - Broadcom switches (b53): - support port isolation - support BCM5389/97/98 and BCM63XX ARL formats - Lantiq/MaxLinear switches: - support bridge FDB entries on the CPU port - use regmap for register access - allow user to enable/disable learning - support Energy Efficient Ethernet - support configuring RMII clock delays - add tagging driver for MaxLinear GSW1xx switches - Synopsys (stmmac): - support using the HW clock in free running mode - add Eswin EIC7700 support - add Rockchip RK3506 support - add Altera Agilex5 support - Cadence (macb): - cleanup and consolidate descriptor and DMA address handling - add EyeQ5 support - TI: - icssg-prueth: support AF_XDP - Airoha access points: - add missing Ethernet stats and link state callback - add AN7583 support - support out-of-order Tx completion processing - Power over Ethernet: - pd692x0: preserve PSE configuration across reboots - add support for TPS23881B devices - Ethernet PHYs: - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support - Support 50G SerDes and 100G interfaces in Linux-managed PHYs - micrel: - support for non PTP SKUs of lan8814 - enable in-band auto-negotiation on lan8814 - realtek: - cable testing support on RTL8224 - interrupt support on RTL8221B - motorcomm: support for PHY LEDs on YT853 - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag - mscc: support for PHY LED control - CAN drivers: - m_can: add support for optional reset and system wake up - remove can_change_mtu() obsoleted by core handling - mcp251xfd: support GPIO controller functionality - Bluetooth: - add initial support for PASTa - WiFi: - split ieee80211.h file, it's way too big - improvements in VHT radiotap reporting, S1G, Channel Switch Announcement handling, rate tracking in mesh networks - improve multi-radio monitor mode support, and add a cfg80211 debugfs interface for it - HT action frame handling on 6 GHz - initial chanctx work towards NAN - MU-MIMO sniffer improvements - WiFi drivers: - RealTek (rtw89): - support USB devices RTL8852AU and RTL8852CU - initial work for RTL8922DE - improved injection support - Intel: - iwlwifi: new sniffer API support - MediaTek (mt76): - WED support for >32-bit DMA - airoha NPU support - regdomain improvements - continued WiFi7/MLO work - Qualcomm/Atheros: - ath10k: factory test support - ath11k: TX power insertion support - ath12k: BSS color change support - ath12k: statistics improvements - brcmfmac: Acer A1 840 tablet quirk - rtl8xxxu: 40 MHz connection fixes/support" * tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits) net: page_pool: sanitise allocation order net: page pool: xa init with destroy on pp init net/mlx5e: Support XDP target xmit with dummy program net/mlx5e: Update XDP features in switch channels selftests/tc-testing: Test CAKE scheduler when enqueue drops packets net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop wireguard: netlink: generate netlink code wireguard: uapi: generate header with ynl-gen wireguard: uapi: move flag enums wireguard: uapi: move enum wg_cmd wireguard: netlink: add YNL specification selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py selftests: drv-net: introduce Iperf3Runner for measurement use cases selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive() Documentation: net: dsa: mention simple HSR offload helpers Documentation: net: dsa: mention availability of RedBox ...	2025-12-03 17:24:33 -08:00
Alexis Lothoré (eBPF Foundation)	1d17bcce6a	selftests/bpf: do not hardcode target rate in test_tc_edt BPF program test_tc_edt currently defines the target rate in both the userspace and BPF parts. This value could be defined once in the userspace part if we make it able to configure the BPF program before starting the test. Add a target_rate variable in the BPF part, and make the userspace part set it to the desired rate before attaching the shaping program. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-4-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Alexis Lothoré (eBPF Foundation)	50ce5ea5f7	selftests/bpf: remove test_tc_edt.sh Now that test_tc_edt has been integrated in test_progs, remove the legacy shell script. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-3-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Alexis Lothoré (eBPF Foundation)	b0f82e7ab6	selftests/bpf: integrate test_tc_edt into test_progs test_tc_edt.sh uses a pair of veth and a BPF program attached to the TX veth to shape the traffic to 5MBps. It then checks that the amount of received bytes (at interface level), compared to the TX duration, indeed matches 5Mbps. Convert this test script to the test_progs framework: - keep the double veth setup, isolated in two veths - run a small tcp server, and connect client to server - push a pre-configured amount of bytes, and measure how much time has been needed to push those - ensure that this rate is in a 2% error margin around the target rate This two percent value, while being tight, is hopefully large enough to not make the test too flaky in CI, while also turning it into a small example of BPF-based shaping. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-2-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Alexis Lothoré (eBPF Foundation)	4b4833acc6	selftests/bpf: rename test_tc_edt.bpf.c section to expose program type The test_tc_edt BPF program uses a custom section name, which works fine when manually loading it with tc, but prevents it from being loaded with libbpf. Update the program section name to "tc" to be able to manipulate it with a libbpf-based C test. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-1-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Kumar Kartikeya Dwivedi	3448375e71	selftests/bpf: Add success stats to rqspinlock stress test Add stats to observe the success and failure rate of lock acquisition attempts in various contexts. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251128232802.1031906-7-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:35:36 -08:00
Hoyeon Lee	bd5bdd200c	bpf: Remove runqslower tool runqslower was added in commit `9c01546d26` "tools/bpf: Add runqslower tool to tools/bpf" as a BCC port to showcase early BPF CO-RE + libbpf workflows. runqslower continues to live in BCC (libbpf-tools), so there is no need to keep building and maintaining it. Drop tools/bpf/runqslower and remove all build hooks in tools/bpf and selftests accordingly. Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Link: https://lore.kernel.org/r/20251126093821.373291-1-hoyeon.lee@suse.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-28 15:20:16 -08:00
Amery Hung	a3a60cc120	selftests/bpf: Remove usage of lsm/file_alloc_security in selftest file_alloc_security hook is disabled. Use other LSM hooks in selftests instead. Signed-off-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20251126202927.2584874-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-28 15:18:28 -08:00
Anton Protopopov	7feff23cdf	bpf: force BPF_F_RDONLY_PROG on insn array creation The original implementation added a hack to check_mem_access() to prevent programs from writing into insn arrays. To get rid of this hack, enforce BPF_F_RDONLY_PROG on map creation. Also fix the corresponding selftest, as the error message changes with this patch. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251128063224.1305482-2-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-28 15:15:43 -08:00
Bala-Vignesh-Reddy	e6fbd1759c	selftests: complete kselftest include centralization This follow-up patch completes centralization of kselftest.h and ksefltest_harness.h includes in remaining seltests files, replacing all relative paths with a non-relative paths using shared -I include path in lib.mk Tested with gcc-13.3 and clang-18.1, and cross-compiled successfully on riscv, arm64, x86_64 and powerpc arch. [reddybalavignesh9979@gmail.com: add selftests include path for kselftest.h] Link: https://lkml.kernel.org/r/20251017090201.317521-1-reddybalavignesh9979@gmail.com Link: https://lkml.kernel.org/r/20251016104409.68985-1-reddybalavignesh9979@gmail.com Signed-off-by: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/lkml/20250820143954.33d95635e504e94df01930d0@linux-foundation.org/ Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Günther Noack <gnoack@google.com> Cc: Jakub Kacinski <kuba@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mickael Salaun <mic@digikod.net> Cc: Ming Lei <ming.lei@redhat.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Simon Horman <horms@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-27 14:24:31 -08:00
Eric Dumazet	9a5e5334ad	tcp: remove icsk->icsk_retransmit_timer Now sk->sk_timer is no longer used by TCP keepalive, we can use its storage for TCP and MPTCP retransmit timers for better cache locality. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124175013.1473655-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-25 19:28:29 -08:00
Eric Dumazet	08dfe37023	tcp: introduce icsk->icsk_keepalive_timer sk->sk_timer has been used for TCP keepalives. Keepalive timers are not in fast path, we want to use sk->sk_timer storage for retransmit timers, for better cache locality. Create icsk->icsk_keepalive_timer and change keepalive code to no longer use sk->sk_timer. Added space is reclaimed in the following patch. This includes changes to MPTCP, which was also using sk_timer. Alias icsk->mptcp_tout_timer and icsk->icsk_keepalive_timer for inet_sk_diag_fill() sake. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124175013.1473655-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-25 19:28:29 -08:00
Kumar Kartikeya Dwivedi	88337b587b	selftests/bpf: Make CS length configurable for rqspinlock stress test Allow users to configure the critical section delay for both task/normal and NMI contexts, and set to 20ms and 10ms as before by default. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-25 15:30:14 -08:00
Kumar Kartikeya Dwivedi	6173c1d620	selftests/bpf: Add lock wait time stats to rqspinlock stress test Add statistics per-CPU broken down by context and various timing windows for the time taken to acquire an rqspinlock. Cases where all acquisitions fit into the 10ms window are skipped from printing, otherwise the full breakdown is displayed when printing the summary. This allows capturing precisely the number of times outlier attempts happened for a given lock in a given context. A critical detail is that time is captured regardless of success or failure, which is important to capture events for failed but long waiting timeout attempts. Output: [ 64.279459] rqspinlock acquisition latency histogram (ms): [ 64.279472] cpu1: total 528426 (normal 526559, nmi 1867) [ 64.279477] 0-1ms: total 524697 (normal 524697, nmi 0) [ 64.279480] 2-2ms: total 3652 (normal 1811, nmi 1841) [ 64.279482] 3-3ms: total 66 (normal 47, nmi 19) [ 64.279485] 4-4ms: total 2 (normal 1, nmi 1) [ 64.279487] 5-5ms: total 1 (normal 1, nmi 0) [ 64.279489] 6-6ms: total 1 (normal 0, nmi 1) [ 64.279490] 101-150ms: total 1 (normal 0, nmi 1) [ 64.279492] >= 251ms: total 6 (normal 2, nmi 4) ... Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-25 15:30:14 -08:00
Kumar Kartikeya Dwivedi	224de8d5a3	selftests/bpf: Relax CPU requirements for rqspinlock stress test Only require 2 CPUs for AA, 3 for ABBA, 4 for ABBCCA, which is calculated nicely by adding to the mode enum. Enables running single CPU AA tests. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-25 15:30:13 -08:00
Menglong Dong	f2cb0660ac	selftests/bpf: Call bpf_get_numa_node_id() in trigger_count() The bench test "trig-kernel-count" can be used as a baseline comparison for fentry and other benchmarks, and the calling to bpf_get_numa_node_id() should be considered as composition of the baseline. So, let's call it in trigger_count(). Meanwhile, rename trigger_count() to trigger_kernel_count() to make it easier understand. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251116014242.151110-1-dongml2@chinatelecom.cn	2025-11-25 14:32:50 -08:00

... 2 3 4 5 6 ...

5219 Commits