linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-03 20:14:06 +02:00

Author	SHA1	Message	Date
Mykyta Yatsenko	b640d556a2	selftests/bpf: Remove xxd util dependency The verification signature header generation requires converting a binary certificate to a C array. Previously this only worked with xxd (part of vim-common package). As xxd may not be available on some systems building selftests, it makes sense to substitute it with more common utils: hexdump, wc, sed to generate equivalent C array output. Tested by generating header with both xxd and hexdump and comparing them. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20260128190552.242335-1-mykyta.yatsenko5@gmail.com	2026-01-28 13:23:51 -08:00
Jiayuan Chen	17e2ce02bf	selftests/bpf: Add tests for FIONREAD and copied_seq This commit adds two new test functions: one to reproduce the bug reported by syzkaller [1], and another to cover the calculation of copied_seq. The tests primarily involve installing and uninstalling sockmap on sockets, then reading data to verify proper functionality. Additionally, extend the do_test_sockmap_skb_verdict_fionread() function to support UDP FIONREAD testing. [1] https://syzkaller.appspot.com/bug?extid=06dbd397158ec0ea4983 Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20260124113314.113584-4-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-27 09:12:04 -08:00
Matt Bobrowski	1456ebb291	selftests/bpf: cover BPF_CGROUP_ITER_CHILDREN control option Extend some of the existing CSS iterator selftests such that they cover the newly introduced BPF_CGROUP_ITER_CHILDREN iterator control option. Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Link: https://lore.kernel.org/r/20260127085112.3608687-2-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-27 09:06:03 -08:00
Alexis Lothoré (eBPF Foundation)	2d96bbdfd3	selftests/bpf: convert test_bpftool_map_access.sh into test_progs framework The test_bpftool_map.sh script tests that maps read/write accesses are being properly allowed/refused by the kernel depending on a specific fmod_ret program being attached on security_bpf_map function. Rewrite this test to integrate it in the test_progs. The new test spawns a few subtests: #36/1 bpftool_maps_access/unprotected_unpinned:OK #36/2 bpftool_maps_access/unprotected_pinned:OK #36/3 bpftool_maps_access/protected_unpinned:OK #36/4 bpftool_maps_access/protected_pinned:OK #36/5 bpftool_maps_access/nested_maps:OK #36/6 bpftool_maps_access/btf_list:OK #36 bpftool_maps_access:OK Summary: 1/6 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/r/20260123-bpftool-tests-v4-3-a6653a7f28e7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-26 18:47:31 -08:00
Alexis Lothoré (eBPF Foundation)	1c0b505908	selftests/bpf: convert test_bpftool_metadata.sh into test_progs framework The test_bpftool_metadata.sh script validates that bpftool properly returns in its ouptput any metadata generated by bpf programs through some .rodata sections. Port this test to the test_progs framework so that it can be executed automatically in CI. The new test, similarly to the former script, checks that valid data appears both for textual output and json output, as well as for both data not used at all and used data. For the json check part, the expected json string is hardcoded to avoid bringing a new external dependency (eg: a json deserializer) for test_progs. As the test is now converted into test_progs, remove the former script. The newly converted test brings two new subtests: #37/1 bpftool_metadata/metadata_unused:OK #37/2 bpftool_metadata/metadata_used:OK #37 bpftool_metadata:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20260123-bpftool-tests-v4-2-a6653a7f28e7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-26 18:47:31 -08:00
Alexis Lothoré (eBPF Foundation)	f21fae5774	selftests/bpf: Add a few helpers for bpftool testing In order to integrate some bpftool tests into test_progs, define a few specific helpers that allow to execute bpftool commands, while possibly retrieving the command output. Those helpers most notably set the path to the bpftool binary under test. This version checks different possible paths relative to the directories where the different test_progs runners are executed, as we want to make sure not to accidentally use a bootstrap version of the binary. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20260123-bpftool-tests-v4-1-a6653a7f28e7@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-26 18:47:21 -08:00
Leon Hwang	78980b4c7f	selftests/bpf: Harden cpu flags test for lru_percpu_hash map CI occasionally reports failures in the percpu_alloc/cpu_flag_lru_percpu_hash selftest, for example: First test_progs failure (test_progs_no_alu32-x86_64-llvm-21): #264/15 percpu_alloc/cpu_flag_lru_percpu_hash ... test_percpu_map_op_cpu_flag:FAIL:bpf_map_lookup_batch value on specified cpu unexpected bpf_map_lookup_batch value on specified cpu: actual 0 != expected 3735929054 The unexpected value indicates that an element was removed from the map. However, the test never calls delete_elem(), so the only possible cause is LRU eviction. This can happen when the current task migrates to another CPU: an update_elem() triggers eviction because there is no available LRU node on local freelist and global freelist. Harden the test against this behavior by provisioning sufficient spare elements. Set max_entries to 'nr_cpus * 2' and restrict the test to using the first nr_cpus entries, ensuring that updates do not spuriously trigger LRU eviction. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260119133417.19739-1-leon.hwang@linux.dev	2026-01-26 10:59:07 -08:00
Changwoo Min	221b5e76c1	selftests/bpf: Add tests for execution context helpers Add a new selftest suite `exe_ctx` to verify the accuracy of the bpf_in_task(), bpf_in_hardirq(), and bpf_in_serving_softirq() helpers introduced in bpf_experimental.h. Testing these execution contexts deterministically requires crossing context boundaries within a single CPU. To achieve this, the test implements a "Trigger-Observer" pattern using bpf_testmod: 1. Trigger: A BPF syscall program calls a new bpf_testmod kfunc bpf_kfunc_trigger_ctx_check(). 2. Task to HardIRQ: The kfunc uses irq_work_queue() to trigger a self-IPI on the local CPU. 3. HardIRQ to SoftIRQ: The irq_work handler calls a dummy function (observed by BPF fentry) and then schedules a tasklet to transition into SoftIRQ context. The user-space runner ensures determinism by pinning itself to CPU 0 before execution, forcing the entire interrupt chain to remain on a single core. Dummy noinline functions with compiler barriers are added to bpf_testmod.c to serve as stable attachment points for fentry programs. A retry loop is used in user-space to wait for the asynchronous SoftIRQ to complete. Note that testing on s390x is avoided because supporting those helpers purely in BPF on s390x is not possible at this point. Reviewed-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Changwoo Min <changwoo@igalia.com> Link: https://lore.kernel.org/r/20260125115413.117502-3-changwoo@igalia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-25 08:20:50 -08:00
Changwoo Min	c31df36bd2	selftests/bpf: Introduce execution context detection helpers Introduce bpf_in_nmi(), bpf_in_hardirq(), bpf_in_serving_softirq(), and bpf_in_task() inline helpers in bpf_experimental.h. These allow BPF programs to query the current execution context with higher granularity than the existing bpf_in_interrupt() helper. While BPF programs can often infer their context from attachment points, subsystems like sched_ext may call the same BPF logic from multiple contexts (e.g., task-to-task wake-ups vs. interrupt-to-task wake-ups). These helpers provide a reliable way for logic to branch based on the current CPU execution state. Implementing these as BPF-native inline helpers wrapping get_preempt_count() allows the compiler and JIT to inline the logic. The implementation accounts for differences in preempt_count layout between standard and PREEMPT_RT kernels. Reviewed-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Changwoo Min <changwoo@igalia.com> Link: https://lore.kernel.org/r/20260125115413.117502-2-changwoo@igalia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-25 08:20:50 -08:00
Menglong Dong	cb4bfacfb0	selftests/bpf: test fsession mixed with fentry and fexit Test the fsession when it is used together with fentry, fexit. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260124062008.8657-14-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:37 -08:00
Menglong Dong	8909b3fb23	selftests/bpf: add testcases for fsession cookie Test session cookie for fsession. Multiple fsession BPF progs is attached to bpf_fentry_test1() and session cookie is read and write in the testcase. bpf_get_func_ip() will influence the layout of the session cookies, so we test the cookie in two case: with and without bpf_get_func_ip(). Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260124062008.8657-13-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:36 -08:00
Menglong Dong	a5533a6eaa	selftests/bpf: test bpf_get_func_* for fsession Test following bpf helper for fsession: bpf_get_func_arg() bpf_get_func_arg_cnt() bpf_get_func_ret() bpf_get_func_ip() Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260124062008.8657-12-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:36 -08:00
Menglong Dong	f7afef5617	selftests/bpf: add testcases for fsession Add testcases for BPF_TRACE_FSESSION. The function arguments and return value are tested both in the entry and exit. And the kfunc bpf_session_is_ret() is also tested. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260124062008.8657-11-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:36 -08:00
Menglong Dong	8fe4dc4f64	bpf: change prototype of bpf_session_{cookie,is_return} Add the function argument of "void *ctx" to bpf_session_cookie() and bpf_session_is_return(), which is a preparation of the next patch. The two kfunc is seldom used now, so it will not introduce much effect to change their function prototype. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20260124062008.8657-4-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:35 -08:00
Menglong Dong	2d419c4465	bpf: add fsession support The fsession is something that similar to kprobe session. It allow to attach a single BPF program to both the entry and the exit of the target functions. Introduce the struct bpf_fsession_link, which allows to add the link to both the fentry and fexit progs_hlist of the trampoline. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Co-developed-by: Leon Hwang <leon.hwang@linux.dev> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260124062008.8657-2-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 18:49:35 -08:00
Yonghong Song	c7900f225a	selftests/bpf: Fix xdp_pull_data failure with 64K page If the argument 'pull_len' of run_test() is 'PULL_MAX' or 'PULL_MAX \| PULL_PLUS_ONE', the eventual pull_len size will close to the page size. On arm64 systems with 64K pages, the pull_len size will be close to 64K. But the existing buffer will be close to 9000 which is not enough to pull. For those failed run_tests(), make buff size to pg_sz + (pg_sz / 2) This way, there will be enough buffer space to pull regardless of page size. Tested-by: Alan Maguire <alan.maguire@oracle.com> Cc: Amery Hung <ameryhung@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260123055128.495265-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 16:51:22 -08:00
Yonghong Song	d8df878140	selftests/bpf: Fix task_local_data failure with 64K page On arm64 systems with 64K pages, the selftest task_local_data has the following failures: ... test_task_local_data_basic:PASS:tld_create_key 0 nsec test_task_local_data_basic:FAIL:tld_create_key unexpected tld_create_key: actual 0 != expected -28 ... test_task_local_data_basic_thread:PASS:run task_main 0 nsec test_task_local_data_basic_thread:FAIL:task_main retval unexpected error: 2 (errno 0) test_task_local_data_basic_thread:FAIL:tld_get_data value0 unexpected tld_get_data value0: actual 0 != expected 6268 ... #447/1 task_local_data/task_local_data_basic:FAIL ... #447/2 task_local_data/task_local_data_race:FAIL #447 task_local_data:FAIL When TLD_DYN_DATA_SIZE is 64K page size, for struct tld_meta_u { _Atomic __u8 cnt; __u16 size; struct tld_metadata metadata[]; }; field 'cnt' would overflow. For example, for 4K page, 'cnt' will be 4096/64 = 64. But for 64K page, 'cnt' will be 65536/64 = 1024 and 'cnt' is not enough for 1024. To accommodate 64K page, '_Atomic __u8 cnt' becomes '_Atomic __u16 cnt'. A few other places are adjusted accordingly. In test_task_local_data.c, the value for TLD_DYN_DATA_SIZE is changed from 4096 to (getpagesize() - 8) since the maximum buffer size for TLD_DYN_DATA_SIZE is (getpagesize() - 8). Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Tested-by: Alan Maguire <alan.maguire@oracle.com> Cc: Amery Hung <ameryhung@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260123055122.494352-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-24 16:51:21 -08:00
Kery Qi	a32ae26584	selftests/bpf: Fix resource leak in serial_test_wq on attach failure When wq__attach() fails, serial_test_wq() returns early without calling wq__destroy(), leaking the skeleton resources allocated by wq__open_and_load(). This causes ASAN leak reports in selftests runs. Fix this by jumping to a common clean_up label that calls wq__destroy() on all exit paths after successful open_and_load. Note that the early return after wq__open_and_load() failure is correct and doesn't need fixing, since that function returns NULL on failure (after internally cleaning up any partial allocations). Fixes: `8290dba519` ("selftests/bpf: wq: add bpf_wq_start() checks") Signed-off-by: Kery Qi <qikeyu2017@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20260121094114.1801-3-qikeyu2017@gmail.com	2026-01-21 16:12:10 -08:00
Yuzuki Ishiyama	f4924ad0b1	selftests/bpf: Test kfunc bpf_strncasecmp Add testsuites for kfunc bpf_strncasecmp. Signed-off-by: Yuzuki Ishiyama <ishiyama@hpc.is.uec.ac.jp> Acked-by: Viktor Malik <vmalik@redhat.com> Link: https://lore.kernel.org/r/20260121033328.1850010-3-ishiyama@hpc.is.uec.ac.jp Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-21 09:42:53 -08:00
Menglong Dong	1ed7977643	selftests/bpf: test bpf_get_func_arg() for tp_btf Test bpf_get_func_arg() and bpf_get_func_arg_cnt() for tp_btf. The code is most copied from test1 and test2. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260121044348.113201-3-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-21 09:31:35 -08:00
Menglong Dong	4fca95095c	selftests/bpf: test the jited inline of bpf_get_current_task Add the testcase for the jited inline of bpf_get_current_task(). Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260120070555.233486-3-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 20:39:01 -08:00
Li RongQing	e700f5d156	watchdog: softlockup: panic when lockup duration exceeds N thresholds The softlockup_panic sysctl is currently a binary option: panic immediately or never panic on soft lockups. Panicking on any soft lockup, regardless of duration, can be overly aggressive for brief stalls that may be caused by legitimate operations. Conversely, never panicking may allow severe system hangs to persist undetected. Extend softlockup_panic to accept an integer threshold, allowing the kernel to panic only when the normalized lockup duration exceeds N watchdog threshold periods. This provides finer-grained control to distinguish between transient delays and persistent system failures. The accepted values are: - 0: Don't panic (unchanged) - 1: Panic when duration >= 1 * threshold (20s default, original behavior) - N > 1: Panic when duration >= N * threshold (e.g., 2 = 40s, 3 = 60s.) The original behavior is preserved for values 0 and 1, maintaining full backward compatibility while allowing systems to tolerate brief lockups while still catching severe, persistent hangs. [lirongqing@baidu.com: v2] Link: https://lkml.kernel.org/r/20251218074300.4080-1-lirongqing@baidu.com Link: https://lkml.kernel.org/r/20251216074521.2796-1-lirongqing@baidu.com Signed-off-by: Li RongQing <lirongqing@baidu.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Lance Yang <lance.yang@linux.dev> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-20 19:44:20 -08:00
Matt Bobrowski	dd341eacdb	selftests/bpf: update verifier test for default trusted pointer semantics Replace the verifier test for default trusted pointer semantics, which previously relied on BPF kfunc bpf_get_root_mem_cgroup(), with a new test utilizing dedicated BPF kfuncs defined within the bpf_testmod. bpf_get_root_mem_cgroup() was modified such that it again relies on KF_ACQUIRE semantics, therefore no longer making it a suitable candidate to test BPF verifier default trusted pointer semantics against. Link: https://lore.kernel.org/bpf/20260113083949.2502978-2-mattbobrowski@google.com Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Link: https://lore.kernel.org/r/20260120091630.3420452-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 17:11:24 -08:00
Yazhou Tang	c9e440bf25	selftests/bpf: Add tests for BPF_DIV and BPF_MOD range tracking Now BPF_DIV has range tracking support via interval analysis. This patch adds selftests to cover various cases of BPF_DIV and BPF_MOD operations when the divisor is a constant, also covering both signed and unsigned variants. This patch includes several types of tests in 32-bit and 64-bit variants: 1. For UDIV - positive divisor - zero divisor 2. For SDIV - positive divisor, positive dividend - positive divisor, negative dividend - positive divisor, mixed sign dividend - negative divisor, positive dividend - negative divisor, negative dividend - negative divisor, mixed sign dividend - zero divisor - overflow (SIGNED_MIN/-1), normal dividend - overflow (SIGNED_MIN/-1), constant dividend 3. For UMOD - positive divisor - positive divisor, small dividend - zero divisor 4. For SMOD - positive divisor, positive dividend - positive divisor, negative dividend - positive divisor, mixed sign dividend - positive divisor, mixed sign dividend, small dividend - negative divisor, positive dividend - negative divisor, negative dividend - negative divisor, mixed sign dividend - negative divisor, mixed sign dividend, small dividend - zero divisor - overflow (SIGNED_MIN/-1), normal dividend - overflow (SIGNED_MIN/-1), constant dividend Specifically, these selftests are based on dead code elimination: If the BPF verifier can precisely analyze the result of BPF_DIV/BPF_MOD instruction, it can prune the path that leads to an error (here we use invalid memory access as the error case), allowing the program to pass verification. Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com> Co-developed-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com> Link: https://lore.kernel.org/r/20260119085458.182221-3-tangyazhou@zju.edu.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:41:54 -08:00
Yazhou Tang	44fdd581d2	bpf: Add range tracking for BPF_DIV and BPF_MOD This patch implements range tracking (interval analysis) for BPF_DIV and BPF_MOD operations when the divisor is a constant, covering both signed and unsigned variants. While LLVM typically optimizes integer division and modulo by constants into multiplication and shift sequences, this optimization is less effective for the BPF target when dealing with 64-bit arithmetic. Currently, the verifier does not track bounds for scalar division or modulo, treating the result as "unbounded". This leads to false positive rejections for safe code patterns. For example, the following code (compiled with -O2): ```c int test(struct pt_regs ctx) { char buffer[6] = {1}; __u64 x = bpf_ktime_get_ns(); __u64 res = x % sizeof(buffer); char value = buffer[res]; bpf_printk("res = %llu, val = %d", res, value); return 0; } ``` Generates a raw `BPF_MOD64` instruction: ```asm ; __u64 res = x % sizeof(buffer); 1: 97 00 00 00 06 00 00 00 r0 %= 0x6 ; char value = buffer[res]; 2: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0x0 ll 4: 0f 01 00 00 00 00 00 00 r1 += r0 5: 91 14 00 00 00 00 00 00 r4 = (s8 *)(r1 + 0x0) ``` Without this patch, the verifier fails with "math between map_value pointer and register with unbounded min value is not allowed" because it cannot deduce that `r0` is within [0, 5]. According to the BPF instruction set[1], the instruction's offset field (`insn->off`) is used to distinguish between signed (`off == 1`) and unsigned division (`off == 0`). Moreover, we also follow the BPF division and modulo runtime behavior (semantics) to handle special cases, such as division by zero and signed division overflow. - UDIV: dst = (src != 0) ? (dst / src) : 0 - SDIV: dst = (src == 0) ? 0 : ((src == -1 && dst == LLONG_MIN) ? LLONG_MIN : (dst / src)) - UMOD: dst = (src != 0) ? (dst % src) : dst - SMOD: dst = (src == 0) ? dst : ((src == -1 && dst == LLONG_MIN) ? 0: (dst s% src)) Here is the overview of the changes made in this patch (See the code comments for more details and examples): 1. For BPF_DIV: Firstly check whether the divisor is zero. If so, set the destination register to zero (matching runtime behavior). For non-zero constant divisors: goto `scalar(32)?_min_max_(u\|s)div` functions. - General cases: compute the new range by dividing max_dividend and min_dividend by the constant divisor. - Overflow case (SIGNED_MIN / -1) in signed division: mark the result as unbounded if the dividend is not a single number. 2. For BPF_MOD: Firstly check whether the divisor is zero. If so, leave the destination register unchanged (matching runtime behavior). For non-zero constant divisors: goto `scalar(32)?_min_max_(u\|s)mod` functions. - General case: For signed modulo, the result's sign matches the dividend's sign. And the result's absolute value is strictly bounded by `min(abs(dividend), abs(divisor) - 1)`. - Special care is taken when the divisor is SIGNED_MIN. By casting to unsigned before negation and subtracting 1, we avoid signed overflow and correctly calculate the maximum possible magnitude (`res_max_abs` in the code). - "Small dividend" case: If the dividend is already within the possible result range (e.g., [-2, 5] % 10), the operation is an identity function, and the destination register remains unchanged. 3. In `scalar(32)?_min_max_(u\|s)(div\|mod)` functions: After updating current range, reset other ranges and tnum to unbounded/unknown. e.g., in `scalar_min_max_sdiv`, signed 64-bit range is updated. Then reset unsigned 64-bit range and 32-bit range to unbounded, and tnum to unknown. Exception: in BPF_MOD's "small dividend" case, since the result remains unchanged, we do not reset other ranges/tnum. 4. Also updated existing selftests based on the expected BPF_DIV and BPF_MOD behavior. [1] https://www.kernel.org/doc/Documentation/bpf/standardization/instruction-set.rst Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com> Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com> Co-developed-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Tianci Cao <ziye@zju.edu.cn> Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com> Tested-by: syzbot@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20260119085458.182221-2-tangyazhou@zju.edu.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:41:53 -08:00
Ihor Solodrai	bd06b977e0	selftests/bpf: Migrate struct_ops_assoc test to KF_IMPLICIT_ARGS A test kfunc named bpf_kfunc_multi_st_ops_test_1_impl() is a user of __prog suffix. Subsequent patch removes __prog support in favor of KF_IMPLICIT_ARGS, so migrate this kfunc to use implicit argument. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120222638.3976562-12-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:22:38 -08:00
Ihor Solodrai	d806f31012	bpf: Migrate bpf_stream_vprintk() to KF_IMPLICIT_ARGS Implement bpf_stream_vprintk with an implicit bpf_prog_aux argument, and remote bpf_stream_vprintk_impl from the kernel. Update the selftests to use the new API with implicit argument. bpf_stream_vprintk macro is changed to use the new bpf_stream_vprintk kfunc, and the extern definition of bpf_stream_vprintk_impl is replaced accordingly. Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120222638.3976562-11-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:22:38 -08:00
Ihor Solodrai	6e663ffdf7	bpf: Migrate bpf_task_work_schedule_* kfuncs to KF_IMPLICIT_ARGS Implement bpf_task_work_schedule_* with an implicit bpf_prog_aux argument, and remove corresponding _impl funcs from the kernel. Update special kfunc checks in the verifier accordingly. Update the selftests to use the new API with implicit argument. Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120222638.3976562-10-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:22:20 -08:00
Ihor Solodrai	b97931a25a	bpf: Migrate bpf_wq_set_callback_impl() to KF_IMPLICIT_ARGS Implement bpf_wq_set_callback() with an implicit bpf_prog_aux argument, and remove bpf_wq_set_callback_impl(). Update special kfunc checks in the verifier accordingly. Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120222638.3976562-8-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:15:57 -08:00
Ihor Solodrai	e939f3d16d	selftests/bpf: Add tests for KF_IMPLICIT_ARGS Add trivial end-to-end tests to validate that KF_IMPLICIT_ARGS flag is properly handled by both resolve_btfids and the verifier. Declare kfuncs in bpf_testmod. Check that bpf_prog_aux pointer is set in the kfunc implementation. Verify that calls with implicit args and a legacy case all work. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120222638.3976562-7-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-20 16:15:57 -08:00
Gyutae Bae	2e6690d4f7	selftests/bpf: Add perfbuf multi-producer benchmark Add a multi-producer benchmark for perfbuf to complement the existing ringbuf multi-producer test. Unlike ringbuf which uses a shared buffer and experiences contention, perfbuf uses per-CPU buffers so the test measures scaling behavior rather than contention. This allows developers to compare perfbuf vs ringbuf performance under multi-producer workloads when choosing between the two for their systems. Signed-off-by: Gyutae Bae <gyutae.bae@navercorp.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260120090716.82927-1-gyutae.opensource@navercorp.com	2026-01-20 11:37:25 -08:00
Yonghong Song	efad162f5a	selftests/bpf: Fix map_kptr test failure On my arm64 machine, I get the following failure: ... tester_init:PASS:tester_log_buf 0 nsec process_subtest:PASS:obj_open_mem 0 nsec process_subtest:PASS:specs_alloc 0 nsec serial_test_map_kptr:PASS:rcu_tasks_trace_gp__open_and_load 0 nsec ... test_map_kptr_success:PASS:map_kptr__open_and_load 0 nsec test_map_kptr_success:PASS:test_map_kptr_ref1 refcount 0 nsec test_map_kptr_success:FAIL:test_map_kptr_ref1 retval unexpected error: 2 (errno 2) test_map_kptr_success:PASS:test_map_kptr_ref2 refcount 0 nsec test_map_kptr_success:FAIL:test_map_kptr_ref2 retval unexpected error: 1 (errno 2) ... #201/21 map_kptr/success-map:FAIL In serial_test_map_kptr(), before test_map_kptr_success(), one kern_sync_rcu() is used to have some delay for freeing the map. But in my environment, one kern_sync_rcu() seems not enough and caused the test failure. In bpf_map_free_in_work() in syscall.c, the queue time for queue_work(system_dfl_wq, &map->work) may be longer than expected. This may cause the test failure since test_map_kptr_success() expects all previous maps having been freed. Since it is not clear how long queue_work() time takes, a bpf prog is added to count the reference after bpf_kfunc_call_test_acquire(). If the number of references is 2 (for initial ref and the one just acquired), all previous maps should have been released. This will resolve the above 'retval unexpected error' issue. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20260116052245.3692405-1-yonghong.song@linux.dev	2026-01-16 14:59:57 -08:00
Alan Maguire	47d440d0a5	selftests/bpf: Support when CONFIG_VXLAN=m If CONFIG_VXLAN is 'm', struct vxlanhdr will not be in vmlinux.h. Add a ___local variant to support cases where vxlan is a module. Fixes: `8517b1abe5` ("selftests/bpf: Integrate test_tc_tunnel.sh tests into test_progs") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260115163457.146267-1-alan.maguire@oracle.com	2026-01-16 14:54:10 -08:00
Puranjay Mohan	086c99fbe4	selftests: bpf: Add test for multiple syncs from linked register Before the last commit, sync_linked_regs() corrupted the register whose bounds are being updated by copying known_reg's id to it. The ids are the same in value but known_reg has the BPF_ADD_CONST flag which is wrongly copied to reg. This later causes issues when creating new links to this reg. assign_scalar_id_before_mov() sees this BPF_ADD_CONST and gives a new id to this register and breaks the old links. This is exposed by the added selftest. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260115151143.1344724-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-16 10:08:59 -08:00
Jiri Olsa	934d9746ed	selftests/bpf: Add test for bpf_override_return helper We do not actually test the bpf_override_return helper functionality itself at the moment, only the bpf program being able to attach it. Adding test that override prctl syscall return value on top of kprobe and kprobe.multi. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20260112121157.854473-2-jolsa@kernel.org	2026-01-15 16:15:26 -08:00
Anton Protopopov	7c8e817e44	selftests/bpf: Extend live regs tests with a test for gotox Add a test which checks that the destination register of a gotox instruction is marked as used and that the union of jump targets is considered as live. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20260114162544.83253-3-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-14 19:08:09 -08:00
Alexei Starovoitov	e3d0dbb3b5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5 Cross-merge BPF and other fixes after downstream PR. No conflicts. Adjacent: Auto-merging MAINTAINERS Auto-merging Makefile Auto-merging kernel/bpf/verifier.c Auto-merging kernel/sched/ext.c Auto-merging mm/memcontrol.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-14 15:22:01 -08:00
Anton Protopopov	c656807675	selftests/bpf: Add tests for loading insn array values with offsets The ldimm64 instruction for map value supports an offset. For insn array maps it wasn't tested before, as normally such instructions aren't generated. However, this is still possible to pass such instructions, so add a few tests to check that correct offsets work properly and incorrect offsets are rejected. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20260111153047.8388-4-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 19:36:47 -08:00
Matt Bobrowski	bbdbed193b	selftests/bpf: assert BPF kfunc default trusted pointer semantics The BPF verifier was recently updated to treat pointers to struct types returned from BPF kfuncs as implicitly trusted by default. Add a new test case to exercise this new implicit trust semantic. The KF_ACQUIRE flag was dropped from the bpf_get_root_mem_cgroup() kfunc because it returns a global pointer to root_mem_cgroup without performing any explicit reference counting. This makes it an ideal candidate to verify the new implicit trusted pointer semantics. Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260113083949.2502978-3-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 19:19:13 -08:00
Matt Bobrowski	f8ade2342e	bpf: return PTR_TO_BTF_ID \| PTR_TRUSTED from BPF kfuncs by default Teach the BPF verifier to treat pointers to struct types returned from BPF kfuncs as implicitly trusted (PTR_TO_BTF_ID \| PTR_TRUSTED) by default. Returning untrusted pointers to struct types from BPF kfuncs should be considered an exception only, and certainly not the norm. Update existing selftests to reflect the change in register type printing (e.g. `ptr_` becoming `trusted_ptr_` in verifier error messages). Link: https://lore.kernel.org/bpf/aV4nbCaMfIoM0awM@google.com/ Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260113083949.2502978-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 19:19:13 -08:00
Donglin Peng	a3acd7d434	selftests/bpf: Add test cases for btf__permute functionality This patch introduces test cases for the btf__permute function to ensure it works correctly with both base BTF and split BTF scenarios. The test suite includes: - test_permute_base: Validates permutation on base BTF - test_permute_split: Tests permutation on split BTF Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20260109130003.3313716-3-dolinux.peng@gmail.com	2026-01-13 16:10:39 -08:00
Alexei Starovoitov	9160335317	selftests/bpf: Add tests for s>>=31 and s>>=63 Add tests for special arithmetic shift right. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Co-developed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260112201424.816836-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:33:38 -08:00
Yonghong Song	951d79017e	selftests/bpf: Fix verifier_arena_globals1 failure with 64K page With 64K page on arm64, verifier_arena_globals1 failed like below: ... libbpf: map 'arena': failed to create: -E2BIG ... #509/1 verifier_arena_globals1/check_reserve1:FAIL ... For 64K page, if the number of arena pages is (1UL << 20), the total memory will exceed 4G and this will cause map creation failure. Adjusting ARENA_PAGES based on the actual page size fixed the problem. Cc: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260113061033.3798549-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:32:16 -08:00
Yonghong Song	d2f7cd20a7	selftests/bpf: Fix sk_bypass_prot_mem failure with 64K page The current selftest sk_bypass_prot_mem only supports 4K page. When running with 64K page on arm64, the following failure happens: ... check_bypass:FAIL:no bypass unexpected no bypass: actual 3 <= expected 32 ... #385/1 sk_bypass_prot_mem/TCP :FAIL ... check_bypass:FAIL:no bypass unexpected no bypass: actual 4 <= expected 32 ... #385/2 sk_bypass_prot_mem/UDP :FAIL ... Adding support to 64K page as well fixed the failure. Cc: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260113061028.3798326-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:32:16 -08:00
Yonghong Song	2465a08d43	selftests/bpf: Fix dmabuf_iter/lots_of_buffers failure with 64K page On arm64 with 64K page , I observed the following test failure: ... subtest_dmabuf_iter_check_lots_of_buffers:FAIL:total_bytes_read unexpected total_bytes_read: actual 4696 <= expected 65536 #97/3 dmabuf_iter/lots_of_buffers:FAIL With 4K page on x86, the total_bytes_read is 4593. With 64K page on arm64, the total_byte_read is 4696. In progs/dmabuf_iter.c, for each iteration, the output is BPF_SEQ_PRINTF(seq, "%lu\n%llu\n%s\n%s\n", inode, size, name, exporter); The only difference between 4K and 64K page is 'size' in the above BPF_SEQ_PRINTF. The 4K page will output '4096' and the 64K page will output '65536'. So the total_bytes_read with 64K page is slighter greater than 4K page. Adjusting the total_bytes_read from 65536 to 4096 fixed the issue. Cc: T.J. Mercier <tjmercier@google.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260113061023.3798085-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-13 09:32:16 -08:00
Sami Tolvanen	ba7f1024a1	selftests/bpf: Use the correct destructor kfunc type With CONFIG_CFI enabled, the kernel strictly enforces that indirect function calls use a function pointer type that matches the target function. As bpf_testmod_ctx_release() signature differs from the btf_dtor_kfunc_t pointer type used for the destructor calls in bpf_obj_free_fields(), add a stub function with the correct type to fix the type mismatch. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260110082548.113748-9-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-12 18:53:57 -08:00
Jose E. Marchesi	681600647c	bpf: GCC requires function attributes before the declarator GCC insists in placing attributes before the declarators in function declarations. Now that GCC supports btf_decl_tag and therefore __tag1 and __tag2 expand to actual attributes, the compiler is complaining about it for static __noinline int foo(int x __tag1 __tag2) __tag1 __tag2 progs/test_btf_decl_tag.c:36:1: error: attributes should be specified \ before the declarator in a function definition This patch simply places the tags before the declarator. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260106173650.18191-3-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 21:04:11 -08:00
Jose E. Marchesi	97fb54d86d	bpf: adapt selftests to GCC 16 -Wunused-but-set-variable GCC 16 has changed the semantics of -Wunused-but-set-variable, as well as introducing new options -Wunused-but-set-variable={0,1,2,3} to adjust the level of support. One of the changes is that GCC now treats 'sum += 1' and 'sum++' as non-usage, whereas clang (and GCC < 16) considers the first as usage and the second as non-usage, which is sort of inconsistent. The GCC 16 -Wunused-but-set-variable=2 option implements the previous semantics of -Wunused-but-set-variable, but since it is a new option, it cannot be used unconditionally for forward-compatibility, just for backwards-compatibility. So this patch adds pragmas to the two self-tests impacted by this, progs/free_timer.c and progs/rcu_read_lock.c, to make gcc to ignore -Wunused-but-set-variable warnings when compiling them with GCC > 15. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44677#c25 for details on why this regression got introduced in GCC upstream. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Cc: david.faust@oracle.com Cc: cupertino.miranda@oracle.com Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20260106173650.18191-2-jose.marchesi@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 21:04:11 -08:00
Leon Hwang	07bf7aa58e	selftests/bpf: Add cases to test BPF_F_CPU and BPF_F_ALL_CPUS flags Add test coverage for the new BPF_F_CPU and BPF_F_ALL_CPUS flags support in percpu maps. The following APIs are exercised: * bpf_map_update_batch() * bpf_map_lookup_batch() * bpf_map_update_elem() * bpf_map__update_elem() * bpf_map_lookup_elem_flags() * bpf_map__lookup_elem() For lru_percpu_hash map, set max_entries to 'libbpf_num_possible_cpus() + 1' and only use the first 'libbpf_num_possible_cpus()' entries. This ensures a spare entry is always available in the LRU free list, avoiding eviction. When updating an existing key in lru_percpu_hash map: 1. l_new = prealloc_lru_pop(); /* Borrow from free list / 2. l_old = lookup_elem_raw(); / Found, key exists / 3. pcpu_copy_value(); / In-place update / 4. bpf_lru_push_free(); / Return l_new to free list */ Also add negative tests to verify that non-percpu array and hash maps reject the BPF_F_CPU and BPF_F_ALL_CPUS flags. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260107022022.12843-8-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 20:48:32 -08:00
Emil Tsalapatis	b81d5e9d96	selftests/bpf: add tests for arena kfuncs under lock Add selftests to ensure the verifier permits calling the arena kfunc API while holding a lock. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260106-arena-under-lock-v2-3-378e9eab3066@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 17:44:13 -08:00
Toke Høiland-Jørgensen	ab86d0bf01	selftests/bpf: Update xdp_context_test_run test to check maximum metadata size Update the selftest to check that the metadata size check takes the xdp_frame size into account in bpf_prog_test_run. The original check (for meta size 256) was broken because the data frame supplied was smaller than this, triggering a different EINVAL return. So supply a larger data frame for this test to make sure we actually exercise the check we think we are. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260105114747.1358750-2-toke@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-06 11:41:41 -08:00
Puranjay Mohan	cf503eb2c6	selftests: bpf: Fix test_bpf_nf for trusted args becoming default With trusted args now being the default, passing NULL to kfunc parameters that are pointers causes verifier rejection rather than a runtime error. The test_bpf_nf test was failing because it attempted to pass NULL to bpf_xdp_ct_lookup() to verify runtime error handling. Since the NULL check now happens at verification time, remove the runtime test case that passed NULL to the bpf_tuple parameter and instead add verification-time tests to ensure the verifier correctly rejects programs that pass NULL to trusted arguments. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-11-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	cf82580c86	selftests: bpf: fix cgroup_hierarchical_stats The cgroup_hierarchical_stats selftests uses an fentry program attached to cgroup_attach_task and then passes the received &dst_cgrp->self to the css_rstat_updated() kfunc. The verifier now assumes that all kfuncs only takes trusted pointer arguments, and pointers received by fentry are not marked trustes by default. Use a tp_btf program in place for fentry for this test, pointers received by tp_btf programs are marked trusted by the verifier. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-10-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	230b0118e4	selftests: bpf: fix test_kfunc_dynptr_param As verifier now assumes that all kfuncs only takes trusted pointer arguments, passing 0 (NULL) to a kfunc that doesn't mark the argument as __nullable or __opt will be rejected with a failure message of: Possibly NULL pointer passed to trusted arg<n> Pass a non-null value to the kfunc to test the expected failure mode. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-9-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	03cc77b10e	selftests: bpf: Update failure message for rbtree_fail The rbtree_api_use_unchecked_remove_retval() selftest passes a pointer received from bpf_rbtree_remove() to bpf_rbtree_add() without checking for NULL, this was earlier caught by __check_ptr_off_reg() in the verifier. Now the verifier assumes every kfunc only takes trusted pointer arguments, so it catches this NULL pointer earlier in the path and provides a more accurate failure message. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-8-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	df5004579b	selftests: bpf: Update kfunc_param_nullable test for new error message With trusted args now being the default, the NULL pointer check runs before type-specific validation. Update test3 to expect the new error message "Possibly NULL pointer passed to trusted arg0" instead of the old dynptr-specific error message. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-7-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:29 -08:00
Puranjay Mohan	7646c7afd9	bpf: Remove redundant KF_TRUSTED_ARGS flag from all kfuncs Now that KF_TRUSTED_ARGS is the default for all kfuncs, remove the explicit KF_TRUSTED_ARGS flag from all kfunc definitions and remove the flag itself. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-02 12:04:28 -08:00
Puranjay Mohan	c286e7e9d1	selftests/bpf: veristat: fix printing order in output_stats() The order of the variables in the printf() doesn't match the text and therefore veristat prints something like this: Done. Processed 24 files, 0 programs. Skipped 62 files, 0 programs. When it should print: Done. Processed 24 files, 62 programs. Skipped 0 files, 0 programs. Fix the order of variables in the printf() call. Fixes: `518fee8bfa` ("selftests/bpf: make veristat skip non-BPF and failing-to-open BPF objects") Tested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20251231221052.759396-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-31 16:11:49 -08:00
Ihor Solodrai	1a8fa7faf4	resolve_btfids: Implement --patch_btfids Recent changes in BTF generation [1] rely on ${OBJCOPY} command to update .BTF_ids section data in target ELF files. This exposed a bug in llvm-objcopy --update-section code path, that may lead to corruption of a target ELF file. Specifically, because of the bug st_shndx of some symbols may be (incorrectly) set to 0xffff (SHN_XINDEX) [2][3]. While there is a pending fix for LLVM, it'll take some time before it lands (likely in 22.x). And the kernel build must keep working with older LLVM toolchains in the foreseeable future. Using GNU objcopy for .BTF_ids update would work, but it would require changes to LLVM-based build process, likely breaking existing build environments as discussed in [2]. To work around llvm-objcopy bug, implement --patch_btfids code path in resolve_btfids as a drop-in replacement for: ${OBJCOPY} --update-section .BTF_ids=${btf_ids} ${elf} Which works specifically for .BTF_ids section: ${RESOLVE_BTFIDS} --patch_btfids ${btf_ids} ${elf} This feature in resolve_btfids can be removed at some point in the future, when llvm-objcopy with a relevant bugfix becomes common. [1] https://lore.kernel.org/bpf/20251219181321.1283664-1-ihor.solodrai@linux.dev/ [2] https://lore.kernel.org/bpf/20251224005752.201911-1-ihor.solodrai@linux.dev/ [3] https://github.com/llvm/llvm-project/issues/168060#issuecomment-3533552952 Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20251231012558.1699758-1-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-31 09:04:09 -08:00
Eduard Zingerman	4fd99103ee	selftests/bpf: iterator based loop and STACK_MISC states pruning The test case first initializes 9 stack slots as STACK_MISC, then conditionally updates each of them to SCALAR spill inside an iterator based loop. This leads to 2**9 combinations of MISC/SPILL marks for these slots at the iterator next call. The loop converges only if the verifier treats such states as equivalent, otherwise visited states are evicted from the states cache too quickly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251230-loop-stack-misc-pruning-v1-2-585cfd6cec51@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-31 09:01:13 -08:00
Eduard Zingerman	e6f2612f0e	selftests/bpf: test cases for bpf_loop SCC and state graph backedges Test for state graph backedges accumulation for SCCs formed by bpf_loop(). Equivalent to the following C program: int main(void) { 1: fp[-8] = bpf_get_prandom_u32(); 2: fp[-16] = -32; // used in a memory access below 3: bpf_loop(7, loop_cb4, fp, 0); 4: return 0; } int loop_cb4(int i, void ctx) { 5: if (unlikely(ctx[-8] > bpf_get_prandom_u32())) 6: (u64 *)(fp + ctx[-16]) = 42; // aligned access expected 7: if (unlikely(fp[-8] > bpf_get_prandom_u32())) 8: ctx[-16] = -31; // makes said access unaligned 9: return 0; } If state graph backedges are not accumulated properly at the SCC formed by loop_cb4() call from bpf_loop(), the state {ctx[-16]=-32} injected at instruction 9 on verification path 1,2,3,5,7,9,4 would be considered fully verified and would lack precision mark for ctx[-16]. This would lead to early pruning of verification path 1,2,3,5,7,8,9 in state {ctx[-16]=-31}, which in turn leads to the incorrect assumption that the above program is safe. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251229-scc-for-callbacks-v1-2-ceadfe679900@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-30 15:42:42 -08:00
Puranjay Mohan	317a5df78f	selftests/bpf: Fix verifier_arena_large/big_alloc3 test The big_alloc3() test tries to allocate 2051 pages at once in non-sleepable context and this can fail sporadically on resource contrained systems, so skip this test in case of such failures. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20251230195134.599463-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-30 12:35:23 -08:00
Puranjay Mohan	efecc9e825	selftests: bpf: test non-sleepable arena allocations As arena kfuncs can now be called from non-sleepable contexts, test this by adding non-sleepable copies of tests in verifier_arena, this is done by using a socket program instead of syscall. Add a new test case in verifier_arena_large to check that the bpf_arena_alloc_pages() works for more than 1024 pages. 1024 * sizeof(struct page *) is the upper limit of kmalloc_nolock() but bpf_arena_alloc_pages() should still succeed because it re-uses this array in a loop. Augment the arena_list selftest to also run in non-sleepable context by taking rcu_read_lock. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20251222195022.431211-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-23 11:30:00 -08:00
Puranjay Mohan	83dd46ecb6	selftests: bpf: fix tests with raw_tp calling kfuncs As the previous commit allowed raw_tp programs to call kfuncs, so of the selftests that were expected to fail will now succeed. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20251222133250.1890587-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-22 22:23:38 -08:00
JP Kobryn	6bce6ddbe6	bpf: selftests: selftests for memcg stat kfuncs Add test coverage for the kfuncs that fetch memcg stats. Using some common stats, test scenarios ensuring that the given stat increases by some arbitrary amount. The stats selected cover the three categories represented by the enums: node_stat_item, memcg_stat_item, vm_event_item. Since only a subset of all stats are queried, use a static struct made up of fields for each stat. Write to the struct with the fetched values when the bpf program is invoked and read the fields in the user mode program for verification. Signed-off-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Link: https://lore.kernel.org/r/20251223044156.208250-6-roman.gushchin@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-22 22:20:22 -08:00
Matt Bobrowski	d2749ae85a	selftests/bpf: add test case for BPF LSM hook bpf_lsm_mmap_file Add a trivial test case asserting that the BPF verifier enforces PTR_MAYBE_NULL semantics on the struct file pointer argument of BPF LSM hook bpf_lsm_mmap_file(). Dereferencing the struct file pointer passed into bpf_lsm_mmap_file() without explicitly performing a NULL check first should not be permitted by the BPF verifier as it can lead to NULL pointer dereferences and a kernel crash. Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20251216133000.3690723-2-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-21 10:56:33 -08:00
Ihor Solodrai	522397d05e	resolve_btfids: Change in-place update with raw binary output Currently resolve_btfids updates .BTF_ids section of an ELF file in-place, based on the contents of provided BTF, usually within the same input file, and optionally a BTF base. Change resolve_btfids behavior to enable BTF transformations as part of its main operation. To achieve this, in-place ELF write in resolve_btfids is replaced with generation of the following binaries: * ${1}.BTF with .BTF section data * ${1}.BTF_ids with .BTF_ids section data if it existed in ${1} * ${1}.BTF.base with .BTF.base section data for out-of-tree modules The execution of resolve_btfids and consumption of its output is orchestrated by scripts/gen-btf.sh introduced in this patch. The motivation for emitting binary data is that it allows simplifying resolve_btfids implementation by delegating ELF update to the $OBJCOPY tool [1], which is already widely used across the codebase. There are two distinct paths for BTF generation and resolve_btfids application in the kernel build: for vmlinux and for kernel modules. For the vmlinux binary a .BTF section is added in a roundabout way to ensure correct linking. The patch doesn't change this approach, only the implementation is a little different. Before this patch it worked as follows: * pahole consumed .tmp_vmlinux1 [2] and added .BTF section with llvm-objcopy [3] to it * then everything except the .BTF section was stripped from .tmp_vmlinux1 into a .tmp_vmlinux1.bpf.o object [2], later linked into vmlinux * resolve_btfids was executed later on vmlinux.unstripped [4], updating it in-place After this patch gen-btf.sh implements the following: * pahole consumes .tmp_vmlinux1 and produces a detached file with raw BTF data * resolve_btfids consumes .tmp_vmlinux1 and detached BTF to produce (potentially modified) .BTF, and .BTF_ids sections data * a .tmp_vmlinux1.bpf.o object is then produced with objcopy copying BTF output of resolve_btfids * .BTF_ids data gets embedded into vmlinux.unstripped in link-vmlinux.sh by objcopy --update-section For kernel modules, creating a special .bpf.o file is not necessary, and so embedding of sections data produced by resolve_btfids is straightforward with objcopy. With this patch an ELF file becomes effectively read-only within resolve_btfids, which allows deleting elf_update() call and satellite code (like compressed_section_fix [5]). Endianness handling of .BTF_ids data is also changed. Previously the "flags" part of the section was bswapped in sets_patch() [6], and then Elf_Type was modified before elf_update() to signal to libelf that bswap may be necessary. With this patch we explicitly bswap entire data buffer on load and on dump. [1] https://lore.kernel.org/bpf/131b4190-9c49-4f79-a99d-c00fac97fa44@linux.dev/ [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/link-vmlinux.sh?h=v6.18#n110 [3] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/tree/btf_encoder.c?h=v1.31#n1803 [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/link-vmlinux.sh?h=v6.18#n284 [5] https://lore.kernel.org/bpf/20200819092342.259004-1-jolsa@kernel.org/ [6] https://lore.kernel.org/bpf/cover.1707223196.git.vmalik@redhat.com/ Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251219181825.1289460-3-ihor.solodrai@linux.dev	2025-12-19 10:55:40 -08:00
Ihor Solodrai	014e1cdb5f	selftests/bpf: Run resolve_btfids only for relevant .test.o objects A selftest targeting resolve_btfids functionality relies on a resolved .BTF_ids section to be available in the TRUNNER_BINARY. The underlying BTF data is taken from a special BPF program (btf_data.c), and so resolve_btfids is executed as a part of a TRUNNER_BINARY build recipe on the final binary. Subsequent patches in this series allow resolve_btfids to modify BTF before resolving the symbols, which means that the test needs access to that modified BTF [1]. Currently the test simply reads in btf_data.bpf.o on the assumption that BTF hasn't changed. Implement resolve_btfids call only for particular test objects (just resolve_btfids.test.o for now). The test objects are linked into the TRUNNER_BINARY, and so .BTF_ids section will be available there. This will make it trivial for the resolve_btfids test to access BTF modified by resolve_btfids. [1] https://lore.kernel.org/bpf/CAErzpmvsgSDe-QcWH8SFFErL6y3p3zrqNri5-UHJ9iK2ChyiBw@mail.gmail.com/ Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251219181825.1289460-2-ihor.solodrai@linux.dev	2025-12-19 10:55:40 -08:00
Alexei Starovoitov	ec439c3801	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.19-rc1 Cross-merge BPF and other fixes after downstream PR. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-16 21:29:38 -08:00
Linus Torvalds	ea1013c153	bpf-fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmlCBmwACgkQ6rmadz2v bToUZA//ZY0IE1x1nCixEAqGF/nGpDzVX4YQQfjrUoXQOD4ykzt35yTNXl6B1IVA dliVSI6kUtdoThUa7xJUxMSkDsVBsEMT/zYXQEXJG1zXvJANCB9wTzsC3OCBWbXt BRczcEkq0OHC9/l5CrILR6ocwxKGDIMIysfeOSABgfqckSEhylWy3+EWZQCk08ka gNpXlDJUG7dYpcZD/zhuC7e5Rg1uNvN7WiTv+Biig8xZCsEtYOq+qC5C/sOnsypI nqfECfbx48cVl49SjatdgquuHn/INESdLRCHisshkurA2Mp5PQuCmrwlXbv4JG59 v9b7lsFQlkpvEXMdo9VYe6K2gjfkOPRdWsVPu2oXA1qISRmrDqX8cKOpapUIwRhL p3ASruMOnz0KFqVaET8+5u2SwtALeW+c+1p1aHMfVGF/qbXuyG05qBkLoGFJR+Xr WznXUXY80Z7pjD57SpA6U3DigAkGqKCBXUwdifaOq8HQonwsnQGqkW/3NngNULGP IC4u0JXn61VgQsM/kAw+ucc4bdKI0g4oKJR56lT48elrj6Yxrjpde4oOqzZ0IQKu VQ0YnzWqqT2tjh4YNMOwkNPbFR4ALd329zI6TUkWib/jByEBNcfjSj9BRANd1KSx JgSHAE6agrbl6h3nOx584YCasX3Zq+nfv1Sj4Z/5GaHKKW3q/Vw= =wHLt -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix BPF builds due to -fms-extensions. selftests (Alexei Starovoitov), bpftool (Quentin Monnet). - Fix build of net/smc when CONFIG_BPF_SYSCALL=y, but CONFIG_BPF_JIT=n (Geert Uytterhoeven) - Fix livepatch/BPF interaction and support reliable unwinding through BPF stack frames (Josh Poimboeuf) - Do not audit capability check in arm64 JIT (Ondrej Mosnacek) - Fix truncated dmabuf BPF iterator reads (T.J. Mercier) - Fix verifier assumptions of bpf_d_path's output buffer (Shuran Liu) - Fix warnings in libbpf when built with -Wdiscarded-qualifiers under C23 (Mikhail Gavrilov) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: add regression test for bpf_d_path() bpf: Fix verifier assumptions of bpf_d_path's output buffer selftests/bpf: Add test for truncated dmabuf_iter reads bpf: Fix truncated dmabuf iterator reads x86/unwind/orc: Support reliable unwinding through BPF stack frames bpf: Add bpf_has_frame_pointer() bpf, arm64: Do not audit capability check in do_jit() libbpf: Fix -Wdiscarded-qualifiers under C23 bpftool: Fix build warnings due to MS extensions net: smc: SMC_HS_CTRL_BPF should depend on BPF_JIT selftests/bpf: Add -fms-extensions to bpf build flags	2025-12-17 15:54:58 +12:00
Emil Tsalapatis	19f12431b6	selftests/bpf: Add tests for the arena offset of globals Add tests for the new libbpf globals arena offset logic. The tests cover the case of globals being as large as the arena itself, and being smaller than the arena. In that case, the data is placed at the end of the arena, and the beginning of the arena is free. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251216173325.98465-6-emil@etsalapatis.com	2025-12-16 10:42:55 -08:00
Emil Tsalapatis	c1f61171d4	libbpf: Move arena globals to the end of the arena Arena globals are currently placed at the beginning of the arena by libbpf. This is convenient, but prevents users from reserving guard pages in the beginning of the arena to identify NULL pointer dereferences. Adjust the load logic to place the globals at the end of the arena instead. Also modify bpftool to set the arena pointer in the program's BPF skeleton to point to the globals. Users now call bpf_map__initial_value() to find the beginning of the arena mapping and use the arena pointer in the skeleton to determine which part of the mapping holds the arena globals and which part is free. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251216173325.98465-5-emil@etsalapatis.com	2025-12-16 10:42:55 -08:00
Emil Tsalapatis	12a1fe6e12	bpf/verifier: Do not limit maximum direct offset into arena map The verifier currently limits direct offsets into a map to 512MiB to avoid overflow during pointer arithmetic. However, this prevents arena maps from using direct addressing instructions to access data at the end of > 512MiB arena maps. This is necessary when moving arena globals to the end of the arena instead of the front. Refactor the verifier code to remove the offset calculation during direct value access calculations. This is possible because the only two map types that implement .map_direct_value_addr() are arrays and arenas, and they both do their own internal checks to ensure the offset is within bounds. Adjust selftests that expect the old error. These tests still fail because the verifier identifies the access as out of bounds for the map, so change them to expect an "invalid access to map value pointer" error instead. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251216173325.98465-3-emil@etsalapatis.com	2025-12-16 10:42:55 -08:00
Emil Tsalapatis	0355911ac0	selftests/bpf: Explicitly account for globals in verifier_arena_large The big_alloc1 test in verifier_arena_large assumes that the arena base and the first page allocated by bpf_arena_alloc_pages are identical. This is not the case, because the first page in the arena is populated by global arena data. The test still passes because the code makes the tacit assumption that the first page is on offset PAGE_SIZE instead of 0. Make this distinction explicit in the code, and adjust the page offsets requested during the test to count from the beginning of the arena instead of using the address of the first allocated page. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251216173325.98465-2-emil@etsalapatis.com	2025-12-16 10:42:55 -08:00
Shuran Liu	79e247d660	selftests/bpf: add regression test for bpf_d_path() Add a regression test for bpf_d_path() to cover incorrect verifier assumptions caused by an incorrect function prototype. The test attaches to the fallocate hook, calls bpf_d_path() and verifies that a simple prefix comparison on the returned pathname behaves correctly after the fix in patch 1. It ensures the verifier does not assume the buffer remains unwritten. Co-developed-by: Zesen Liu <ftyg@live.com> Signed-off-by: Zesen Liu <ftyg@live.com> Co-developed-by: Peili Gao <gplhust955@gmail.com> Signed-off-by: Peili Gao <gplhust955@gmail.com> Co-developed-by: Haoran Ni <haoran.ni.cs@gmail.com> Signed-off-by: Haoran Ni <haoran.ni.cs@gmail.com> Signed-off-by: Shuran Liu <electronlsr@gmail.com> Link: https://lore.kernel.org/r/20251206141210.3148-3-electronlsr@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-10 01:36:26 -08:00
Cupertino Miranda	a5b4867fad	selftests/bpf: add verifier sign extension bound computation tests. This commit adds 3 tests to verify a common compiler generated pattern for sign extension (r1 <<= 32; r1 s>>= 32). The tests make sure the register bounds are correctly computed both for positive and negative register values. Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com> Signed-off-by: Andrew Pinski <andrew.pinski@oss.qualcomm.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Cc: David Faust <david.faust@oracle.com> Cc: Jose Marchesi <jose.marchesi@oracle.com> Cc: Elena Zannoni <elena.zannoni@oracle.com> Link: https://lore.kernel.org/r/20251202180220.11128-3-cupertino.miranda@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-10 00:12:09 -08:00
Kohei Enju	18352f8fae	selftests/bpf: add tests for attaching invalid fd Add test cases for situations where adding the following types of file descriptors to a cpumap entry should fail: - Non-BPF file descriptor (expect -EINVAL) - Nonexistent file descriptor (expect -EBADF) Also tighten the assertion for the expected error when adding a non-BPF_XDP_CPUMAP program to a cpumap entry. Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20251208131449.73036-3-enjuk@amazon.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-09 23:53:27 -08:00
T.J. Mercier	9489d457d4	selftests/bpf: Add test for truncated dmabuf_iter reads If many dmabufs are present, reads of the dmabuf iterator can be truncated at PAGE_SIZE or user buffer size boundaries before the fix in "bpf: Fix truncated dmabuf iterator reads". Add a test to confirm truncation does not occur. Signed-off-by: T.J. Mercier <tjmercier@google.com> Link: https://lore.kernel.org/r/20251204000348.1413593-2-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-09 23:49:04 -08:00
H. Peter Anvin	c278d72b99	selftests/bpf: replace "__auto_type" with "auto" Replace instances of "__auto_type" with "auto" in: tools/testing/selftests/bpf/prog_tests/socket_helpers.h This file does not seem to be including <linux/compiler_types.h> directly or indirectly, so copy the definition but guard it with !defined(auto). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>	2025-12-08 15:32:15 -08:00
Linus Torvalds	509d3f4584	Significant patch series in this pull request: - The 6 patch series "panic: sys_info: Refactor and fix a potential issue" from Andy Shevchenko fixes a build issue and does some cleanup in ib/sys_info.c. - The 9 patch series "Implement mul_u64_u64_div_u64_roundup()" from David Laight enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions. - The 2 patch series "scripts/gdb/symbols: make BPF debug info available to GDB" from Ilya Leoshkevich makes BPF symbol names, sizes, and line numbers available to the GDB debugger. - The 4 patch series "Enable hung_task and lockup cases to dump system info on demand" from Feng Tang adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire. - The 6 patch series "lib/base64: add generic encoder/decoder, migrate users" from Kuan-Wei Chiu adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations. - The 2 patch series "rbree: inline rb_first() and rb_last()" from Eric Dumazet makes TCP a little faster. - The 9 patch series "liveupdate: Rework KHO for in-kernel users" from Pasha Tatashin reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients. - The 13 patch series "kho: simplify state machine and enable dynamic updates" from Pasha Tatashin increases the flexibility of KEXEC Handover. Also preparation for LUO. - The 18 patch series "Live Update Orchestrator" from Pasha Tatashin is a major new feature targeted at cloud environments. Quoting the [0/N]: This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition. As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot. Mike Rappaport merits a mention here, for his extensive review and testing work. - The 3 patch series "kexec: reorganize kexec and kdump sysfs" from Sourabh Jain moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day. - The 2 patch series "kho: fixes for vmalloc restoration" from Mike Rapoport fixes a BUG which was being hit during KHO restoration of vmalloc() regions. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaTSAkQAKCRDdBJ7gKXxA jrkiAP9QKfsRv46XZaM5raScjY1ayjP+gqb2rgt6BQ/gZvb2+wD/cPAYOR6BiX52 n0pVpQmG5P/KyOmpLztn96ejL4heKwQ= =JY96 -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko) fixes a build issue and does some cleanup in ib/sys_info.c - "Implement mul_u64_u64_div_u64_roundup()" (David Laight) enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions - "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich) makes BPF symbol names, sizes, and line numbers available to the GDB debugger - "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang) adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire - "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu) adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations - "rbree: inline rb_first() and rb_last()" (Eric Dumazet) makes TCP a little faster - "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin) reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients - "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin) increases the flexibility of KEXEC Handover. Also preparation for LUO - "Live Update Orchestrator" (Pasha Tatashin) is a major new feature targeted at cloud environments. Quoting the cover letter: This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition. As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot. Mike Rappaport merits a mention here, for his extensive review and testing work. - "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain) moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day - "kho: fixes for vmalloc restoration" (Mike Rapoport) fixes a BUG which was being hit during KHO restoration of vmalloc() regions * tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits) calibrate: update header inclusion Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()" vmcoreinfo: track and log recoverable hardware errors kho: fix restoring of contiguous ranges of order-0 pages kho: kho_restore_vmalloc: fix initialization of pages array MAINTAINERS: TPM DEVICE DRIVER: update the W-tag init: replace simple_strtoul with kstrtoul to improve lpj_setup KHO: fix boot failure due to kmemleak access to non-PRESENT pages Documentation/ABI: new kexec and kdump sysfs interface Documentation/ABI: mark old kexec sysfs deprecated kexec: move sysfs entries to /sys/kernel/kexec test_kho: always print restore status kho: free chunks using free_page() instead of kfree() selftests/liveupdate: add kexec test for multiple and empty sessions selftests/liveupdate: add simple kexec-based selftest for LUO selftests/liveupdate: add userspace API selftests docs: add documentation for memfd preservation via LUO mm: memfd_luo: allow preserving memfd liveupdate: luo_file: add private argument to store runtime state mm: shmem: export some functions to internal.h ...	2025-12-06 14:01:20 -08:00
Amery Hung	0e841d1926	selftests/bpf: Test getting associated struct_ops in timer callback Make sure 1) a timer callback can also reference the associated struct_ops, and then make sure 2) the timer callback cannot get a dangled pointer to the struct_ops when the map is freed. The test schedules a timer callback from a struct_ops program since struct_ops programs do not pin the map. It is possible for the timer callback to run after the map is freed. The timer callback calls a kfunc that runs .test_1() of the associated struct_ops, which should return MAP_MAGIC when the map is still alive or -1 when the map is gone. The first subtest added in this patch schedules the timer callback to run immediately, while the map is still alive. The second subtest added schedules the callback to run 500ms after syscall_prog runs and then frees the map right after syscall_prog runs. Both subtests then wait until the callback runs to check the return of the kfunc. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-7-ameryhung@gmail.com	2025-12-05 16:17:58 -08:00
Amery Hung	04fd12df4e	selftests/bpf: Test ambiguous associated struct_ops Add a test to make sure implicit struct_ops association does not break backward compatibility nor return incorrect struct_ops. struct_ops programs should still be allowed to be reused in different struct_ops map. The associated struct_ops map set implicitly however will be poisoned. Trying to read it through the helper bpf_prog_get_assoc_struct_ops() should result in a NULL pointer. While recursion of test_1() cannot happen due to the associated struct_ops being ambiguois, explicitly check for it to prevent stack overflow if the test regresses. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-6-ameryhung@gmail.com	2025-12-05 16:17:58 -08:00
Amery Hung	33a165f9c2	selftests/bpf: Test BPF_PROG_ASSOC_STRUCT_OPS command Test BPF_PROG_ASSOC_STRUCT_OPS command that associates a BPF program with a struct_ops. The test follows the same logic in commit `ba7000f1c3` ("selftests/bpf: Test multi_st_ops and calling kfuncs from different programs"), but instead of using map id to identify a specific struct_ops, this test uses the new BPF command to associate a struct_ops with a program. The test consists of two sets of almost identical struct_ops maps and BPF programs associated with the map. Their only difference is the unique value returned by bpf_testmod_multi_st_ops::test_1(). The test first loads the programs and associates them with struct_ops maps. Then, it exercises the BPF programs. They will in turn call kfunc bpf_kfunc_multi_st_ops_test_1_prog_arg() to trigger test_1() of the associated struct_ops map, and then check if the right unique value is returned. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251203233748.668365-5-ameryhung@gmail.com	2025-12-05 16:17:58 -08:00
Alexei Starovoitov	835a507535	selftests/bpf: Add -fms-extensions to bpf build flags The kernel is now built with -fms-extensions, therefore generated vmlinux.h contains types like: struct slab { .. struct freelist_counters; }; Use -fms-extensions and -Wno-microsoft-anon-tag flags to build bpf programs that #include "vmlinux.h" Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-12-03 20:20:11 -08:00
Linus Torvalds	8f7aa3d3c7	Networking changes for 6.19. Core & protocols ---------------- - Replace busylock at the Tx queuing layer with a lockless list. Resulting in a 300% (4x) improvement on heavy TX workloads, sending twice the number of packets per second, for half the cpu cycles. - Allow constantly busy flows to migrate to a more suitable CPU/NIC queue. Normally we perform queue re-selection when flow comes out of idle, but under extreme circumstances the flows may be constantly busy. Add sysctl to allow periodic rehashing even if it'd risk packet reordering. - Optimize the NAPI skb cache, make it larger, use it in more paths. - Attempt returning Tx skbs to the originating CPU (like we already did for Rx skbs). - Various data structure layout and prefetch optimizations from Eric. - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly quite expensive on recent AMD machines. - Extend threaded NAPI polling to allow the kthread busy poll for packets. - Make MPTCP use Rx backlog processing. This lowers the lock pressure, improving the Rx performance. - Support memcg accounting of MPTCP socket memory. - Allow admin to opt sockets out of global protocol memory accounting (using a sysctl or BPF-based policy). The global limits are a poor fit for modern container workloads, where limits are imposed using cgroups. - Improve heuristics for when to kick off AF_UNIX garbage collection. - Allow users to control TCP SACK compression, and default to 33% of RTT. - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily aggressive rcvbuf growth and overshot when the connection RTT is low. - Preserve skb metadata space across skb_push / skb_pull operations. - Support for IPIP encapsulation in the nftables flowtable offload. - Support appending IP interface information to ICMP messages (RFC 5837). - Support setting max record size in TLS (RFC 8449). - Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL. - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock. - Let users configure the number of write buffers in SMC. - Add new struct sockaddr_unsized for sockaddr of unknown length, from Kees. - Some conversions away from the crypto_ahash API, from Eric Biggers. - Some preparations for slimming down struct page. - YAML Netlink protocol spec for WireGuard. - Add a tool on top of YAML Netlink specs/lib for reporting commonly computed derived statistics and summarized system state. Driver API ---------- - Add CAN XL support to the CAN Netlink interface. - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as defined by the OPEN Alliance's "Advanced diagnostic features for 100BASE-T1 automotive Ethernet PHYs" specification. - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x). - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec and performs RSS. - Add info to devlink params whether the current setting is the default or a user override. Allow resetting back to default. - Add standard device stats for PSP crypto offload. - Leverage DSA frame broadcast to implement simple HSR frame duplication for a lot of switches without dedicated HSR offload. - Add uAPI defines for 1.6Tbps link modes. Device drivers -------------- - Add Motorcomm YT921x gigabit Ethernet switch support. - Add MUCSE driver for N500/N210 1GbE NIC series. - Convert drivers to support dedicated ops for timestamping control, and away from the direct IOCTL handling. While at it support GET operations for PHY timestamping. - Add (and convert most drivers to) a dedicated ethtool callback for reading the Rx ring count. - Significant refactoring efforts in the STMMAC driver, which supports Synopsys turn-key MAC IP integrated into a ton of SoCs. - Ethernet high-speed NICs: - Broadcom (bnxt): - support PPS in/out on all pins - Intel (100G, ice, idpf): - ice: implement standard ethtool and timestamping stats - i40e: support setting the max number of MAC addresses per VF - iavf: support RSS of GTP tunnels for 5G and LTE deployments - nVidia/Mellanox (mlx5): - reduce downtime on interface reconfiguration - disable being an XDP redirect target by default (same as other drivers) to avoid wasting resources if feature is unused - Meta (fbnic): - add support for Linux-managed PCS on 25G, 50G, and 100G links - Wangxun: - support Rx descriptor merge, and Tx head writeback - support Rx coalescing offload - support 25G SPF and 40G QSFP modules - Ethernet virtual: - Google (gve): - allow ethtool to configure rx_buf_len - implement XDP HW RX Timestamping support for DQ descriptor format - Microsoft vNIC (mana): - support HW link state events - handle hardware recovery events when probing the device - Ethernet NICs consumer, and embedded: - usbnet: add support for Byte Queue Limits (BQL) - AMD (amd-xgbe): - add device selftests - NXP (enetc): - add i.MX94 support - Broadcom integrated MACs (bcmgenet, bcmasp): - bcmasp: add support for PHY-based Wake-on-LAN - Broadcom switches (b53): - support port isolation - support BCM5389/97/98 and BCM63XX ARL formats - Lantiq/MaxLinear switches: - support bridge FDB entries on the CPU port - use regmap for register access - allow user to enable/disable learning - support Energy Efficient Ethernet - support configuring RMII clock delays - add tagging driver for MaxLinear GSW1xx switches - Synopsys (stmmac): - support using the HW clock in free running mode - add Eswin EIC7700 support - add Rockchip RK3506 support - add Altera Agilex5 support - Cadence (macb): - cleanup and consolidate descriptor and DMA address handling - add EyeQ5 support - TI: - icssg-prueth: support AF_XDP - Airoha access points: - add missing Ethernet stats and link state callback - add AN7583 support - support out-of-order Tx completion processing - Power over Ethernet: - pd692x0: preserve PSE configuration across reboots - add support for TPS23881B devices - Ethernet PHYs: - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support - Support 50G SerDes and 100G interfaces in Linux-managed PHYs - micrel: - support for non PTP SKUs of lan8814 - enable in-band auto-negotiation on lan8814 - realtek: - cable testing support on RTL8224 - interrupt support on RTL8221B - motorcomm: support for PHY LEDs on YT853 - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag - mscc: support for PHY LED control - CAN drivers: - m_can: add support for optional reset and system wake up - remove can_change_mtu() obsoleted by core handling - mcp251xfd: support GPIO controller functionality - Bluetooth: - add initial support for PASTa - WiFi: - split ieee80211.h file, it's way too big - improvements in VHT radiotap reporting, S1G, Channel Switch Announcement handling, rate tracking in mesh networks - improve multi-radio monitor mode support, and add a cfg80211 debugfs interface for it - HT action frame handling on 6 GHz - initial chanctx work towards NAN - MU-MIMO sniffer improvements - WiFi drivers: - RealTek (rtw89): - support USB devices RTL8852AU and RTL8852CU - initial work for RTL8922DE - improved injection support - Intel: - iwlwifi: new sniffer API support - MediaTek (mt76): - WED support for >32-bit DMA - airoha NPU support - regdomain improvements - continued WiFi7/MLO work - Qualcomm/Atheros: - ath10k: factory test support - ath11k: TX power insertion support - ath12k: BSS color change support - ath12k: statistics improvements - brcmfmac: Acer A1 840 tablet quirk - rtl8xxxu: 40 MHz connection fixes/support Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmkveRQACgkQMUZtbf5S IrvY7A/+Nb0o4BxLHjPkAl1m3t3q2d0Y29B7SNkwnwEtxAV8EkNeZ3GWrdtDnTQY MYhmc7LEzvz8/lihapr7UJkcokzSASUV54hbez5jDBKC8EEoyUk8FdWDPerwlcRI zmCFNAVFyh9GX8i7wcrzKbDTHT5+GZLbSlGl9U5mhLsDdRlJgH7d8PJ7vWcmtLFY XN0paDyaeHfCl8wReWNAYx4C/I0ODOvlscpO0tnAKhB0ngJbQCKY2t6tn3rOYdif ZSQ5KwVRnJtQ4fYOFMOy9+FSCjVXtyrxF8KLxD+mqom2ZhmO00UpOMl09tqhq3uT WnvwoHUVBt6F+iITHwg5kMgIDPUq1kpUvL4S4UbVSuUm9ZKD+4KRU2ZHRBYMx+MU bsqmtY8/IULClUoRz+tZhltA8eb0NEqNZE2JPOFDiJHn1YiCCkFwxibhir893oM3 sB7x65D7LQI2ty2BBGVGYnwYDPtyaxOA/s3WTwPvLEi3+Y/TGNIIrS9lBLA4U+Yr Gi93WQGVjttMmVyaHgXBUGmi3L52hvolm0AZ8zSRGrnIEpecjhly2KfYuaOzuxXC IHEQ6AFLdRh6JzafXGb/mQwGCHNmhwsY8A49i94fakWQamaL/L6A+1dyPu4LXMqi NwqCmlVb/LKGlfNG+V4wT27srJ+yBA2Vk3tpR1sZQQytFh0LKHI= =UoDR -----END PGP SIGNATURE----- Merge tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Replace busylock at the Tx queuing layer with a lockless list. Resulting in a 300% (4x) improvement on heavy TX workloads, sending twice the number of packets per second, for half the cpu cycles. - Allow constantly busy flows to migrate to a more suitable CPU/NIC queue. Normally we perform queue re-selection when flow comes out of idle, but under extreme circumstances the flows may be constantly busy. Add sysctl to allow periodic rehashing even if it'd risk packet reordering. - Optimize the NAPI skb cache, make it larger, use it in more paths. - Attempt returning Tx skbs to the originating CPU (like we already did for Rx skbs). - Various data structure layout and prefetch optimizations from Eric. - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly quite expensive on recent AMD machines. - Extend threaded NAPI polling to allow the kthread busy poll for packets. - Make MPTCP use Rx backlog processing. This lowers the lock pressure, improving the Rx performance. - Support memcg accounting of MPTCP socket memory. - Allow admin to opt sockets out of global protocol memory accounting (using a sysctl or BPF-based policy). The global limits are a poor fit for modern container workloads, where limits are imposed using cgroups. - Improve heuristics for when to kick off AF_UNIX garbage collection. - Allow users to control TCP SACK compression, and default to 33% of RTT. - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily aggressive rcvbuf growth and overshot when the connection RTT is low. - Preserve skb metadata space across skb_push / skb_pull operations. - Support for IPIP encapsulation in the nftables flowtable offload. - Support appending IP interface information to ICMP messages (RFC 5837). - Support setting max record size in TLS (RFC 8449). - Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL. - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock. - Let users configure the number of write buffers in SMC. - Add new struct sockaddr_unsized for sockaddr of unknown length, from Kees. - Some conversions away from the crypto_ahash API, from Eric Biggers. - Some preparations for slimming down struct page. - YAML Netlink protocol spec for WireGuard. - Add a tool on top of YAML Netlink specs/lib for reporting commonly computed derived statistics and summarized system state. Driver API: - Add CAN XL support to the CAN Netlink interface. - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as defined by the OPEN Alliance's "Advanced diagnostic features for 100BASE-T1 automotive Ethernet PHYs" specification. - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x). - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec and performs RSS. - Add info to devlink params whether the current setting is the default or a user override. Allow resetting back to default. - Add standard device stats for PSP crypto offload. - Leverage DSA frame broadcast to implement simple HSR frame duplication for a lot of switches without dedicated HSR offload. - Add uAPI defines for 1.6Tbps link modes. Device drivers: - Add Motorcomm YT921x gigabit Ethernet switch support. - Add MUCSE driver for N500/N210 1GbE NIC series. - Convert drivers to support dedicated ops for timestamping control, and away from the direct IOCTL handling. While at it support GET operations for PHY timestamping. - Add (and convert most drivers to) a dedicated ethtool callback for reading the Rx ring count. - Significant refactoring efforts in the STMMAC driver, which supports Synopsys turn-key MAC IP integrated into a ton of SoCs. - Ethernet high-speed NICs: - Broadcom (bnxt): - support PPS in/out on all pins - Intel (100G, ice, idpf): - ice: implement standard ethtool and timestamping stats - i40e: support setting the max number of MAC addresses per VF - iavf: support RSS of GTP tunnels for 5G and LTE deployments - nVidia/Mellanox (mlx5): - reduce downtime on interface reconfiguration - disable being an XDP redirect target by default (same as other drivers) to avoid wasting resources if feature is unused - Meta (fbnic): - add support for Linux-managed PCS on 25G, 50G, and 100G links - Wangxun: - support Rx descriptor merge, and Tx head writeback - support Rx coalescing offload - support 25G SPF and 40G QSFP modules - Ethernet virtual: - Google (gve): - allow ethtool to configure rx_buf_len - implement XDP HW RX Timestamping support for DQ descriptor format - Microsoft vNIC (mana): - support HW link state events - handle hardware recovery events when probing the device - Ethernet NICs consumer, and embedded: - usbnet: add support for Byte Queue Limits (BQL) - AMD (amd-xgbe): - add device selftests - NXP (enetc): - add i.MX94 support - Broadcom integrated MACs (bcmgenet, bcmasp): - bcmasp: add support for PHY-based Wake-on-LAN - Broadcom switches (b53): - support port isolation - support BCM5389/97/98 and BCM63XX ARL formats - Lantiq/MaxLinear switches: - support bridge FDB entries on the CPU port - use regmap for register access - allow user to enable/disable learning - support Energy Efficient Ethernet - support configuring RMII clock delays - add tagging driver for MaxLinear GSW1xx switches - Synopsys (stmmac): - support using the HW clock in free running mode - add Eswin EIC7700 support - add Rockchip RK3506 support - add Altera Agilex5 support - Cadence (macb): - cleanup and consolidate descriptor and DMA address handling - add EyeQ5 support - TI: - icssg-prueth: support AF_XDP - Airoha access points: - add missing Ethernet stats and link state callback - add AN7583 support - support out-of-order Tx completion processing - Power over Ethernet: - pd692x0: preserve PSE configuration across reboots - add support for TPS23881B devices - Ethernet PHYs: - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support - Support 50G SerDes and 100G interfaces in Linux-managed PHYs - micrel: - support for non PTP SKUs of lan8814 - enable in-band auto-negotiation on lan8814 - realtek: - cable testing support on RTL8224 - interrupt support on RTL8221B - motorcomm: support for PHY LEDs on YT853 - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag - mscc: support for PHY LED control - CAN drivers: - m_can: add support for optional reset and system wake up - remove can_change_mtu() obsoleted by core handling - mcp251xfd: support GPIO controller functionality - Bluetooth: - add initial support for PASTa - WiFi: - split ieee80211.h file, it's way too big - improvements in VHT radiotap reporting, S1G, Channel Switch Announcement handling, rate tracking in mesh networks - improve multi-radio monitor mode support, and add a cfg80211 debugfs interface for it - HT action frame handling on 6 GHz - initial chanctx work towards NAN - MU-MIMO sniffer improvements - WiFi drivers: - RealTek (rtw89): - support USB devices RTL8852AU and RTL8852CU - initial work for RTL8922DE - improved injection support - Intel: - iwlwifi: new sniffer API support - MediaTek (mt76): - WED support for >32-bit DMA - airoha NPU support - regdomain improvements - continued WiFi7/MLO work - Qualcomm/Atheros: - ath10k: factory test support - ath11k: TX power insertion support - ath12k: BSS color change support - ath12k: statistics improvements - brcmfmac: Acer A1 840 tablet quirk - rtl8xxxu: 40 MHz connection fixes/support" * tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits) net: page_pool: sanitise allocation order net: page pool: xa init with destroy on pp init net/mlx5e: Support XDP target xmit with dummy program net/mlx5e: Update XDP features in switch channels selftests/tc-testing: Test CAKE scheduler when enqueue drops packets net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop wireguard: netlink: generate netlink code wireguard: uapi: generate header with ynl-gen wireguard: uapi: move flag enums wireguard: uapi: move enum wg_cmd wireguard: netlink: add YNL specification selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py selftests: drv-net: introduce Iperf3Runner for measurement use cases selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive() Documentation: net: dsa: mention simple HSR offload helpers Documentation: net: dsa: mention availability of RedBox ...	2025-12-03 17:24:33 -08:00
Alexis Lothoré (eBPF Foundation)	1d17bcce6a	selftests/bpf: do not hardcode target rate in test_tc_edt BPF program test_tc_edt currently defines the target rate in both the userspace and BPF parts. This value could be defined once in the userspace part if we make it able to configure the BPF program before starting the test. Add a target_rate variable in the BPF part, and make the userspace part set it to the desired rate before attaching the shaping program. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-4-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Alexis Lothoré (eBPF Foundation)	50ce5ea5f7	selftests/bpf: remove test_tc_edt.sh Now that test_tc_edt has been integrated in test_progs, remove the legacy shell script. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-3-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Alexis Lothoré (eBPF Foundation)	b0f82e7ab6	selftests/bpf: integrate test_tc_edt into test_progs test_tc_edt.sh uses a pair of veth and a BPF program attached to the TX veth to shape the traffic to 5MBps. It then checks that the amount of received bytes (at interface level), compared to the TX duration, indeed matches 5Mbps. Convert this test script to the test_progs framework: - keep the double veth setup, isolated in two veths - run a small tcp server, and connect client to server - push a pre-configured amount of bytes, and measure how much time has been needed to push those - ensure that this rate is in a 2% error margin around the target rate This two percent value, while being tight, is hopefully large enough to not make the test too flaky in CI, while also turning it into a small example of BPF-based shaping. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-2-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Alexis Lothoré (eBPF Foundation)	4b4833acc6	selftests/bpf: rename test_tc_edt.bpf.c section to expose program type The test_tc_edt BPF program uses a custom section name, which works fine when manually loading it with tc, but prevents it from being loaded with libbpf. Update the program section name to "tc" to be able to manipulate it with a libbpf-based C test. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-1-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:37:41 -08:00
Kumar Kartikeya Dwivedi	3448375e71	selftests/bpf: Add success stats to rqspinlock stress test Add stats to observe the success and failure rate of lock acquisition attempts in various contexts. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251128232802.1031906-7-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-29 09:35:36 -08:00
Hoyeon Lee	bd5bdd200c	bpf: Remove runqslower tool runqslower was added in commit `9c01546d26` "tools/bpf: Add runqslower tool to tools/bpf" as a BCC port to showcase early BPF CO-RE + libbpf workflows. runqslower continues to live in BCC (libbpf-tools), so there is no need to keep building and maintaining it. Drop tools/bpf/runqslower and remove all build hooks in tools/bpf and selftests accordingly. Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Link: https://lore.kernel.org/r/20251126093821.373291-1-hoyeon.lee@suse.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-28 15:20:16 -08:00
Amery Hung	a3a60cc120	selftests/bpf: Remove usage of lsm/file_alloc_security in selftest file_alloc_security hook is disabled. Use other LSM hooks in selftests instead. Signed-off-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20251126202927.2584874-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-28 15:18:28 -08:00
Anton Protopopov	7feff23cdf	bpf: force BPF_F_RDONLY_PROG on insn array creation The original implementation added a hack to check_mem_access() to prevent programs from writing into insn arrays. To get rid of this hack, enforce BPF_F_RDONLY_PROG on map creation. Also fix the corresponding selftest, as the error message changes with this patch. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251128063224.1305482-2-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-28 15:15:43 -08:00
Bala-Vignesh-Reddy	e6fbd1759c	selftests: complete kselftest include centralization This follow-up patch completes centralization of kselftest.h and ksefltest_harness.h includes in remaining seltests files, replacing all relative paths with a non-relative paths using shared -I include path in lib.mk Tested with gcc-13.3 and clang-18.1, and cross-compiled successfully on riscv, arm64, x86_64 and powerpc arch. [reddybalavignesh9979@gmail.com: add selftests include path for kselftest.h] Link: https://lkml.kernel.org/r/20251017090201.317521-1-reddybalavignesh9979@gmail.com Link: https://lkml.kernel.org/r/20251016104409.68985-1-reddybalavignesh9979@gmail.com Signed-off-by: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/lkml/20250820143954.33d95635e504e94df01930d0@linux-foundation.org/ Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Günther Noack <gnoack@google.com> Cc: Jakub Kacinski <kuba@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mickael Salaun <mic@digikod.net> Cc: Ming Lei <ming.lei@redhat.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Simon Horman <horms@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-27 14:24:31 -08:00
Eric Dumazet	9a5e5334ad	tcp: remove icsk->icsk_retransmit_timer Now sk->sk_timer is no longer used by TCP keepalive, we can use its storage for TCP and MPTCP retransmit timers for better cache locality. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124175013.1473655-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-25 19:28:29 -08:00
Eric Dumazet	08dfe37023	tcp: introduce icsk->icsk_keepalive_timer sk->sk_timer has been used for TCP keepalives. Keepalive timers are not in fast path, we want to use sk->sk_timer storage for retransmit timers, for better cache locality. Create icsk->icsk_keepalive_timer and change keepalive code to no longer use sk->sk_timer. Added space is reclaimed in the following patch. This includes changes to MPTCP, which was also using sk_timer. Alias icsk->mptcp_tout_timer and icsk->icsk_keepalive_timer for inet_sk_diag_fill() sake. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124175013.1473655-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-25 19:28:29 -08:00
Kumar Kartikeya Dwivedi	88337b587b	selftests/bpf: Make CS length configurable for rqspinlock stress test Allow users to configure the critical section delay for both task/normal and NMI contexts, and set to 20ms and 10ms as before by default. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-25 15:30:14 -08:00
Kumar Kartikeya Dwivedi	6173c1d620	selftests/bpf: Add lock wait time stats to rqspinlock stress test Add statistics per-CPU broken down by context and various timing windows for the time taken to acquire an rqspinlock. Cases where all acquisitions fit into the 10ms window are skipped from printing, otherwise the full breakdown is displayed when printing the summary. This allows capturing precisely the number of times outlier attempts happened for a given lock in a given context. A critical detail is that time is captured regardless of success or failure, which is important to capture events for failed but long waiting timeout attempts. Output: [ 64.279459] rqspinlock acquisition latency histogram (ms): [ 64.279472] cpu1: total 528426 (normal 526559, nmi 1867) [ 64.279477] 0-1ms: total 524697 (normal 524697, nmi 0) [ 64.279480] 2-2ms: total 3652 (normal 1811, nmi 1841) [ 64.279482] 3-3ms: total 66 (normal 47, nmi 19) [ 64.279485] 4-4ms: total 2 (normal 1, nmi 1) [ 64.279487] 5-5ms: total 1 (normal 1, nmi 0) [ 64.279489] 6-6ms: total 1 (normal 0, nmi 1) [ 64.279490] 101-150ms: total 1 (normal 0, nmi 1) [ 64.279492] >= 251ms: total 6 (normal 2, nmi 4) ... Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-25 15:30:14 -08:00
Kumar Kartikeya Dwivedi	224de8d5a3	selftests/bpf: Relax CPU requirements for rqspinlock stress test Only require 2 CPUs for AA, 3 for ABBA, 4 for ABBCCA, which is calculated nicely by adding to the mode enum. Enables running single CPU AA tests. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-25 15:30:13 -08:00
Menglong Dong	f2cb0660ac	selftests/bpf: Call bpf_get_numa_node_id() in trigger_count() The bench test "trig-kernel-count" can be used as a baseline comparison for fentry and other benchmarks, and the calling to bpf_get_numa_node_id() should be considered as composition of the baseline. So, let's call it in trigger_count(). Meanwhile, rename trigger_count() to trigger_kernel_count() to make it easier understand. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251116014242.151110-1-dongml2@chinatelecom.cn	2025-11-25 14:32:50 -08:00
Saket Kumar Bhaskar	590699d858	selftests/bpf: Fix htab_update/reenter_update selftest failure Since commit `31158ad02d` ("rqspinlock: Add deadlock detection and recovery") the updated path on re-entrancy now reports deadlock via -EDEADLK instead of the previous -EBUSY. Also, the way reentrancy was exercised (via fentry/lookup_elem_raw) has been fragile because lookup_elem_raw may be inlined (find_kernel_btf_id() will return -ESRCH). To fix this fentry is attached to bpf_obj_free_fields() instead of lookup_elem_raw() and: - The htab map is made to use a BTF-described struct val with a struct bpf_timer so that check_and_free_fields() reliably calls bpf_obj_free_fields() on element replacement. - The selftest is updated to do two updates to the same key (insert + replace) in prog_test. - The selftest is updated to align with expected errno with the kernel’s current behavior. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/r/20251117060752.129648-1-skb99@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-24 16:48:28 -08:00
Alan Maguire	ad93ba0267	selftests/bpf: Allow selftests to build with older xxd Currently selftests require xxd with the "-n <name>" option which allows the user to specify a name not derived from the input object path. Instead of relying on this newer feature, older xxd can be used if we link our desired name ("test_progs_verification_cert") to the input object. Many distros ship xxd in vim-common package and do not have the latest xxd with -n support. Fixes: `b720903e2b` ("selftests/bpf: Enable signature verification for some lskel tests") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20251120084754.640405-3-alan.maguire@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-24 10:00:16 -08:00
Puranjay Mohan	cf49ec5705	selftests: bpf: Add tests for unbalanced rcu_read_lock As verifier now supports nested rcu critical sections, add new test cases to make sure unbalanced usage of rcu_read_lock()/unlock() is rejected. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251117200411.25563-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-21 18:34:59 -08:00
Puranjay Mohan	4167096cb9	bpf: support nested rcu critical sections Currently, nested rcu critical sections are rejected by the verifier and rcu_lock state is managed by a boolean variable. Add support for nested rcu critical sections by make active_rcu_locks a counter similar to active_preempt_locks. bpf_rcu_read_lock() increments this counter and bpf_rcu_read_unlock() decrements it, MEM_RCU -> PTR_UNTRUSTED transition happens when active_rcu_locks drops to 0. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251117200411.25563-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-21 18:34:59 -08:00
Eduard Zingerman	8f7cf305a1	bpf: test the correct stack liveness of tail calls A new test is added: caller_stack_write_tail_call tests that the live stack is correctly tracked for a tail call. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Martin Teichmann <martin.teichmann@xfel.eu> Link: https://lore.kernel.org/r/20251119160355.1160932-5-martin.teichmann@xfel.eu Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-21 17:45:30 -08:00
Martin Teichmann	978da762ea	bpf: test the proper verification of tail calls Three tests are added: - invalidate_pkt_pointers_by_tail_call checks that one can use the packet pointer after a tail call. This was originally possible and also poses not problems, but was made impossible by `1a4607ffba`. - invalidate_pkt_pointers_by_static_tail_call tests a corner case found by Eduard Zingerman during the discussion of the original fix, which was broken in that fix. - subprog_result_tail_call tests that precision propagation works correctly across tail calls. This did not work before. Signed-off-by: Martin Teichmann <martin.teichmann@xfel.eu> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251119160355.1160932-3-martin.teichmann@xfel.eu Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-21 17:45:30 -08:00
Xing Guo	b7f7d76d6e	selftests/bpf: Update test_tag to use sha256 commit `603b441623` ("bpf: Update the bpf_prog_calc_tag to use SHA256") changed digest of prog_tag to SHA256 but forgot to update tests correspondingly. Fix it. Fixes: `603b441623` ("bpf: Update the bpf_prog_calc_tag to use SHA256") Signed-off-by: Xing Guo <higuoxing@gmail.com> Link: https://lore.kernel.org/r/20251121061458.3145167-1-higuoxing@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-21 16:56:08 -08:00
Matt Bobrowski	ae24fc8a16	selftests/bpf: Improve reliability of test_perf_branches_no_hw() Currently, test_perf_branches_no_hw() relies on the busy loop within test_perf_branches_common() being slow enough to allow at least one perf event sample tick to occur before starting to tear down the backing perf event BPF program. With a relatively small fixed iteration count of 1,000,000, this is not guaranteed on modern fast CPUs, resulting in the test run to subsequently fail with the following: bpf_testmod.ko is already unloaded. Loading bpf_testmod.ko... Successfully loaded bpf_testmod.ko. test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_good_sample:PASS:output not valid 0 nsec check_good_sample:PASS:read_branches_size 0 nsec check_good_sample:PASS:read_branches_stack 0 nsec check_good_sample:PASS:read_branches_stack 0 nsec check_good_sample:PASS:read_branches_global 0 nsec check_good_sample:PASS:read_branches_global 0 nsec check_good_sample:PASS:read_branches_size 0 nsec test_perf_branches_no_hw:PASS:perf_event_open 0 nsec test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_bad_sample:FAIL:output not valid no valid sample from prog Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED Successfully unloaded bpf_testmod.ko. On a modern CPU (i.e. one with a 3.5 GHz clock rate), executing 1 million increments of a volatile integer can take significantly less than 1 millisecond. If the spin loop and detachment of the perf event BPF program elapses before the first 1 ms sampling interval elapses, the perf event will never end up firing. Fix this by bumping the loop iteration counter a little within test_perf_branches_common(), along with ensuring adding another loop termination condition which is directly influenced by the backing perf event BPF program executing. Notably, a concious decision was made to not adjust the sample_freq value as that is just not a reliable way to go about fixing the problem. It effectively still leaves the race window open. Fixes: `67306f84ca` ("selftests/bpf: Add bpf_read_branch_records() selftest") Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251119143540.2911424-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-21 16:49:16 -08:00
Matt Bobrowski	27746aaf1b	selftests/bpf: skip test_perf_branches_hw() on unsupported platforms Gracefully skip the test_perf_branches_hw subtest on platforms that do not support LBR or require specialized perf event attributes to enable branch sampling. For example, AMD's Milan (Zen 3) supports BRS rather than traditional LBR. This requires specific configurations (attr.type = PERF_TYPE_RAW, attr.config = RETIRED_TAKEN_BRANCH_INSTRUCTIONS) that differ from the generic setup used within this test. Notably, it also probably doesn't hold much value to special case perf event configurations for selected micro architectures. Fixes: `67306f84ca` ("selftests/bpf: Add bpf_read_branch_records() selftest") Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20251120142059.2836181-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-21 16:47:58 -08:00
Puranjay Mohan	d8774a3623	selftests: bpf: Enable gotox tests from arm64 arm64 JIT now supports gotox instruction and jumptables, so run tests in verifier_gotox.c for arm64. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251117130732.11107-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-21 16:40:21 -08:00
Hoyeon Lee	db354a1577	selftests/bpf: Use sockaddr_storage instead of sa46 in select_reuseport test The select_reuseport selftest uses a custom sa46 union to represent IPv4 and IPv6 addresses. This custom wrapper requires extra manual handling for address family and field extraction. Replace sa46 with sockaddr_storage and update the helper functions to operate on native socket structures. This simplifies the code and removes unnecessary custom address-handling logic. No functional changes intended. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251121081332.2309838-3-hoyeon.lee@suse.com	2025-11-21 10:46:31 -08:00
Hoyeon Lee	fd6ed07a05	selftests/bpf: Use sockaddr_storage directly in cls_redirect test The cls_redirect test uses a custom addr_port/tuple wrapper to represent IPv4/IPv6 addresses and ports. This custom wrapper requires extra conversion logic and specific helpers such as fill_addr_port(), which are no longer necessary when using standard socket address structures. This commit replaces addr_port/tuple with the standard sockaddr_storage so test handles address families and ports using native socket types. It removes the custom helper, eliminates redundant casts, and simplifies the setup helpers without functional changes. set_up_conn() and build_input() now take src/dst sockaddr_storage directly. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251121081332.2309838-2-hoyeon.lee@suse.com	2025-11-21 10:46:01 -08:00
Matt Bobrowski	d088da9042	selftests/bpf: Use ASSERT_STRNEQ to factor in long slab cache names subtest_kmem_cache_iter_check_slabinfo() fundamentally compares slab cache names parsed out from /proc/slabinfo against those stored within struct kmem_cache_result. The current problem is that the slab cache name within struct kmem_cache_result is stored within a bounded fixed-length array (sized to SLAB_NAME_MAX(32)), whereas the name parsed out from /proc/slabinfo is not. Meaning, using ASSERT_STREQ() can certainly lead to test failures, particularly when dealing with slab cache names that are longer than SLAB_NAME_MAX(32) bytes. Notably, kmem_cache_create() allows callers to create slab caches with somewhat arbitrarily sized names via its __name identifier argument, so exceeding the SLAB_NAME_MAX(32) limit that is in place now can certainly happen. Make subtest_kmem_cache_iter_check_slabinfo() more reliable by only checking up to sizeof(struct kmem_cache_result.name) - 1 using ASSERT_STRNEQ(). Fixes: `a496d0cdc8` ("selftests/bpf: Add a test for kmem_cache_iter") Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://patch.msgid.link/20251118073734.4188710-1-mattbobrowski@google.com	2025-11-20 09:26:06 -08:00
Jakub Kicinski	9e203721ec	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.18-rc7). No conflicts, adjacent changes: tools/testing/selftests/net/af_unix/Makefile `e1bb28bf13` ("selftest: af_unix: Add test for SO_PEEK_OFF.") `45a1cd8346` ("selftests: af_unix: Add tests for ECONNRESET and EOF semantics") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-20 09:13:26 -08:00
Hoyeon Lee	ec12ab2cda	selftests/bpf: Replace TCP CC string comparisons with bpf_strncmp The connect4_prog and bpf_iter_setsockopt tests duplicate the same open-coded TCP congestion control string comparison logic. Since bpf_strncmp() provides the same functionality, use it instead to avoid repeated open-coded loops. This change applies only to functional BPF tests and does not affect the verifier performance benchmarks (veristat.cfg). No functional changes intended. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251115225550.1086693-5-hoyeon.lee@suse.com	2025-11-18 14:57:53 -08:00
Hoyeon Lee	f700b37314	selftests/bpf: Move common TCP helpers into bpf_tracing_net.h Some BPF selftests contain identical copies of the min(), max(), before(), and after() helpers. These repeated snippets are the same across the tests and do not need to be defined separately. Move these helpers into bpf_tracing_net.h so they can be shared by TCP related BPF programs. This removes repeated code and keeps the helpers in a single place. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251115225550.1086693-4-hoyeon.lee@suse.com	2025-11-18 14:57:45 -08:00
Martin KaFai Lau	6cc73f3540	selftests/bpf: Test bpf_skb_check_mtu(BPF_MTU_CHK_SEGS) when transport_header is not set Add a test to check that bpf_skb_check_mtu(BPF_MTU_CHK_SEGS) is rejected (-EINVAL) if skb->transport_header is not set. The test needs to lower the MTU of the loopback device. Thus, take this opportunity to run the test in a netns by adding "ns_" to the test name. The "serial_" prefix can then be removed. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20251112232331.1566074-2-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-14 18:49:18 -08:00
Mykyta Yatsenko	a4d31f451d	selftests/bpf: Align kfuncs renamed in bpf tree bpf_task_work_schedule_resume() and bpf_task_work_schedule_signal() have been renamed in bpf tree to bpf_task_work_schedule_resume_impl() and bpf_task_work_schedule_signal_impl() accordingly. There are few uses of these kfuncs in selftests that are not in bpf tree, so that when we port [1] into bpf-next, those BPF programs will not compile. This patch aligns those remaining callsites with the kfunc renaming. It should go on top of [1] when applying on bpf-next. 1: https://lore.kernel.org/all/20251104-implv2-v3-0-4772b9ae0e06@meta.com/ Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20251105132105.597344-1-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-14 17:45:43 -08:00
Alexei Starovoitov	e47b68bda4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.18-rc5+ Cross-merge BPF and other fixes after downstream PR. Minor conflict in kernel/bpf/helpers.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-14 17:43:41 -08:00
Paul Houssel	a69e09823e	selftests/bpf: Add BTF dedup tests for recursive typedef definitions Add several ./test_progs tests: 1. btf/dedup:recursive typedef ensures that deduplication no longer fails on recursive typedefs. 2. btf/dedup:typedef ensures that typedefs are deduplicated correctly just as they were before this patch. Signed-off-by: Paul Houssel <paul.houssel@orange.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/9fac2f744089f6090257d4c881914b79f6cd6c6a.1763037045.git.paul.houssel@orange.com	2025-11-14 17:07:20 -08:00
Alexei Starovoitov	c133390398	selftests/bpf: Fix failure paths in send_signal test When test_send_signal_kern__open_and_load() fails parent closes the pipe which cases ASSERT_EQ(read(pipe_p2c...)) to fail, but child continues and enters infinite loop, while parent is stuck in wait(NULL). Other error paths have similar issue, so kill the child before waiting on it. The bug was discovered while compiling all of selftests with -O1 instead of -O2 which caused progs/test_send_signal_kern.c to fail to load. Fixes: `ab8b7f0cb3` ("tools/bpf: Add self tests for bpf_send_signal_thread()") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251113171153.2583-1-alexei.starovoitov@gmail.com	2025-11-14 17:02:25 -08:00
Alexei Starovoitov	63066b7a8e	selftests/bpf: Convert glob_match() to bpf arena Increase arena test coverage. Convert glob_match() to bpf arena in two steps: 1. Copy paste lib/glob.c into bpf_arena_strsearch.h Copy paste lib/globtests.c into progs/arena_strsearch.c 2. Add __arena to pointers Add __arg_arena to global functions that accept arena pointers Add cond_break to loops The test also serves as a good example of what's possible with bpf arena and how existing algorithms can be converted. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251111032931.21430-1-alexei.starovoitov@gmail.com	2025-11-14 13:57:28 -08:00
Eduard Zingerman	6c762611fe	selftests/bpf: Test widen_imprecise_scalars() with different stack depth A test case for a situation when widen_imprecise_scalars() is called with old->allocated_stack > cur->allocated_stack. Test structure: def widening_stack_size_bug(): r1 = 0 for r6 in 0..1: iterator_with_diff_stack_depth(r1) r1 = 42 def iterator_with_diff_stack_depth(r1): if r1 != 42: use 128 bytes of stack iterator based loop iterator_with_diff_stack_depth() is verified with r1 == 0 first and r1 == 42 next. Causing stack usage of 128 bytes on a first visit and 8 bytes on a second. Such arrangement triggered a KASAN error in widen_imprecise_scalars(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251114025730.772723-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-14 09:26:28 -08:00
Matt Bobrowski	93ce3bee31	selftests/bpf: retry bpf_map_update_elem() when E2BIG is returned Executing the test_maps binary on platforms with extremely high core counts may cause intermittent assertion failures in test_update_delete() (called via test_map_parallel()). This can occur because bpf_map_update_elem() under some circumstances (specifically in this case while performing bpf_map_update_elem() with BPF_NOEXIST on a BPF_MAP_TYPE_HASH with its map_flags set to BPF_F_NO_PREALLOC) can return an E2BIG error code i.e. error -7 7 tools/testing/selftests/bpf/test_maps.c:#: void test_update_delete(unsigned int, void ): Assertion `err == 0' failed. tools/testing/selftests/bpf/test_maps.c:#: void __run_parallel(unsigned int, void ()(unsigned int, void ), void ): Assertion `status == 0' failed. As it turns out, is_map_full() which is called from alloc_htab_elem() can take on a conservative approach when htab->use_percpu_counter is true (which is the case here because the percpu_counter is used when a BPF_MAP_TYPE_HASH is created with its map_flags set to BPF_F_NO_PREALLOC). This conservative approach prioritizes preventing over-allocation and potential issues that could arise from possibly exceeding htab->map.max_entries in highly concurrent environments, even if it means slightly under-utilizing the htab map's capacity. Given that bpf_map_update_elem() from test_update_delete() can return E2BIG, update can_retry() such that it also accounts for the E2BIG error code (specifically only when running with map_flags being set to BPF_F_NO_PREALLOC). The retry loop will allow the global count belonging to the percpu_counter to become synchronized and better reflect the current htab map's capacity. Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20251113092519.2632079-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-13 14:36:28 -08:00
Jiayuan Chen	cb730e4ac1	selftests/bpf: Add mptcp test with sockmap Add test cases to verify that when MPTCP falls back to plain TCP sockets, they can properly work with sockmap. Additionally, add test cases to ensure that sockmap correctly rejects MPTCP sockets as expected. Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251111060307.194196-4-jiayuan.chen@linux.dev	2025-11-13 13:18:25 -08:00
Leon Hwang	c1cbf0d21c	selftests/bpf: Add test to verify freeing the special fields in pcpu maps Add test to verify that updating [lru_,]percpu_hash maps decrements refcount when BPF_KPTR_REF objects are involved. The tests perform the following steps: . Call update_elem() to insert an initial value. . Use bpf_refcount_acquire() to increment the refcount. . Store the node pointer in the map value. . Add the node to a linked list. . Probe-read the refcount and verify it is 2. . Call update_elem() again to trigger refcount decrement. . Probe-read the refcount and verify it is 1. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20251105151407.12723-3-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-13 09:15:33 -08:00
D. Wythe	beb3c67297	bpf/selftests: Add selftest for bpf_smc_hs_ctrl This tests introduces a tiny smc_hs_ctrl for filtering SMC connections based on IP pairs, and also adds a realistic topology model to verify it. Also, we can only use SMC loopback under CI test, so an additional configuration needs to be enabled. Follow the steps below to run this test. make -C tools/testing/selftests/bpf cd tools/testing/selftests/bpf sudo ./test_progs -t smc Results shows: Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Tested-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://patch.msgid.link/20251107035632.115950-4-alibuda@linux.alibaba.com	2025-11-10 12:00:45 -08:00
Jakub Sitnicki	d2c5cca3fb	selftests/bpf: Cover skb metadata access after bpf_skb_change_proto Add a test to verify that skb metadata remains accessible after calling bpf_skb_change_proto(), which modifies packet headroom to accommodate different IP header sizes. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-16-5ceb08a9b37b@cloudflare.com	2025-11-10 10:52:33 -08:00
Jakub Sitnicki	85d454afef	selftests/bpf: Cover skb metadata access after change_head/tail helper Add a test to verify that skb metadata remains accessible after calling bpf_skb_change_head() and bpf_skb_change_tail(), which modify packet headroom/tailroom and can trigger head reallocation. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-15-5ceb08a9b37b@cloudflare.com	2025-11-10 10:52:33 -08:00
Jakub Sitnicki	29960e635b	selftests/bpf: Cover skb metadata access after bpf_skb_adjust_room Add a test to verify that skb metadata remains accessible after calling bpf_skb_adjust_room(), which modifies the packet headroom and can trigger head reallocation. The helper expects an Ethernet frame carrying an IP packet so switch test packet identification by source MAC address since we can no longer rely on Ethernet proto being set to zero. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-14-5ceb08a9b37b@cloudflare.com	2025-11-10 10:52:33 -08:00
Jakub Sitnicki	354d020c29	selftests/bpf: Cover skb metadata access after vlan push/pop helper Add a test to verify that skb metadata remains accessible after calling bpf_skb_vlan_push() and bpf_skb_vlan_pop(), which modify the packet headroom. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-13-5ceb08a9b37b@cloudflare.com	2025-11-10 10:52:32 -08:00
Jakub Sitnicki	1e1357fde8	selftests/bpf: Expect unclone to preserve skb metadata Since pskb_expand_head() no longer clears metadata on unclone, update tests for cloned packets to expect metadata to remain intact. Also simplify the clone_dynptr_kept_on_{data,meta}_slice_write tests. Creating an r/w dynptr slice is sufficient to trigger an unclone in the prologue, so remove the extraneous writes to the data/meta slice. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-12-5ceb08a9b37b@cloudflare.com	2025-11-10 10:52:32 -08:00
Jakub Sitnicki	9ef9ac15a5	selftests/bpf: Dump skb metadata on verification failure Add diagnostic output when metadata verification fails to help with troubleshooting test failures. Introduce a check_metadata() helper that prints both expected and received metadata to the BPF program's stderr stream on mismatch. The userspace test reads and dumps this stream on failure. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-11-5ceb08a9b37b@cloudflare.com	2025-11-10 10:52:32 -08:00
Jakub Sitnicki	967534e57c	selftests/bpf: Verify skb metadata in BPF instead of userspace Move metadata verification into the BPF TC programs. Previously, userspace read metadata from a map and verified it once at test end. Now TC programs compare metadata directly using __builtin_memcmp() and set a test_pass flag. This enables verification at multiple points during test execution rather than a single final check. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-10-5ceb08a9b37b@cloudflare.com	2025-11-10 10:52:32 -08:00
Alexis Lothoré (eBPF Foundation)	5b7d6c9198	selftests/bpf: Use start_server_str rather than start_reuseport_server in tc_tunnel Now that start_server_str enforces SO_REUSEADDR, there's no need to keep using start_reusport_server in tc_tunnel, especially since it only uses one server at a time. Replace start_reuseport_server with start_server_str in tc_tunnel test. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-start-server-soreuseaddr-v1-2-1bbd9c1f8d65@bootlin.com	2025-11-06 15:23:04 -08:00
Alexis Lothoré (eBPF Foundation)	38e36514fc	selftests/bpf: Systematically add SO_REUSEADDR in start_server_addr Some tests have to stop/start a server multiple time with the same listening address. Doing so without SO_REUSADDR leads to failures due to the socket still being in TIME_WAIT right after the first instance stop/before the second instance start. Instead of letting each test manually set SO_REUSEADDR on their servers, it can be done automatically by start_server_addr for all tests (and without any major downside). Enforce SO_REUSEADDR in start_server_addr for all tests. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-start-server-soreuseaddr-v1-1-1bbd9c1f8d65@bootlin.com	2025-11-06 15:23:00 -08:00
Anton Protopopov	ac4d838ce1	selftests/bpf: add C-level selftests for indirect jumps Add C-level selftests for indirect jumps to validate LLVM and libbpf functionality. The tests are intentionally disabled, to be run locally by developers, but will not make the CI red. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251105090410.1250500-13-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-05 17:53:56 -08:00
Anton Protopopov	ccbdb48ce5	selftests/bpf: add new verifier_gotox test Add a set of tests to validate core gotox functionality without need to rely on compilers. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251105090410.1250500-12-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-05 17:53:23 -08:00
Anton Protopopov	ae48162a66	selftests/bpf: test instructions arrays with blinding Add a specific test for instructions arrays with blinding enabled. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251105090410.1250500-7-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-05 17:53:22 -08:00
Anton Protopopov	218edd6db6	selftests/bpf: add selftests for new insn_array map Add the following selftests for new insn_array map: * Incorrect instruction indexes are rejected * Two programs can't use the same map * BPF progs can't operate the map * no changes to code => map is the same * expected changes when instructions are added * expected changes when instructions are deleted * expected changes when multiple functions are present Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251105090410.1250500-5-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-05 17:53:18 -08:00
Jiri Olsa	3490d29964	selftests/bpf: Add stacktrace ips test for raw_tp Adding test that verifies we get expected initial 2 entries from stacktrace for rawtp probe via ORC unwind. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-5-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-05 17:05:19 -08:00
Jiri Olsa	c9e208fa93	selftests/bpf: Add stacktrace ips test for kprobe_multi/kretprobe_multi Adding test that attaches kprobe/kretprobe multi and verifies the ORC stacktrace matches expected functions. Adding bpf_testmod_stacktrace_test function to bpf_testmod kernel module which is called through several functions so we get reliable call path for stacktrace. The test is only for ORC unwinder to keep it simple. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251104215405.168643-4-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-05 17:05:19 -08:00
Kees Cook	85cb0757d7	net: Convert proto_ops connect() callbacks to use sockaddr_unsized Update all struct proto_ops connect() callback function prototypes from "struct sockaddr " to "struct sockaddr_unsized " to avoid lying to the compiler about object sizes. Calls into struct proto handlers gain casts that will be removed in the struct proto conversion patch. No binary changes expected. Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20251104002617.2752303-3-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-04 19:10:32 -08:00
Kees Cook	0e50474fa5	net: Convert proto_ops bind() callbacks to use sockaddr_unsized Update all struct proto_ops bind() callback function prototypes from "struct sockaddr " to "struct sockaddr_unsized " to avoid lying to the compiler about object sizes. Calls into struct proto handlers gain casts that will be removed in the struct proto conversion patch. No binary changes expected. Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20251104002617.2752303-2-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-11-04 19:10:32 -08:00
Mykyta Yatsenko	137cc92ffe	bpf: add _impl suffix for bpf_stream_vprintk() kfunc Rename bpf_stream_vprintk() to bpf_stream_vprintk_impl(). This makes bpf_stream_vprintk() follow the already established "_impl" suffix-based naming convention for kfuncs with the bpf_prog_aux argument provided by the verifier implicitly. This convention will be taken advantage of with the upcoming KF_IMPLICIT_ARGS feature to preserve backwards compatibility to BPF programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20251104-implv2-v3-2-4772b9ae0e06@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-11-04 17:50:25 -08:00
Mykyta Yatsenko	ea0714d61d	bpf:add _impl suffix for bpf_task_work_schedule* kfuncs Rename: bpf_task_work_schedule_resume()->bpf_task_work_schedule_resume_impl() bpf_task_work_schedule_signal()->bpf_task_work_schedule_signal_impl() This aligns task work scheduling kfuncs with the established naming scheme for kfuncs with the bpf_prog_aux argument provided by the verifier implicitly. This convention will be taken advantage of with the upcoming KF_IMPLICIT_ARGS feature to preserve backwards compatibility to BPF programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20251104-implv2-v3-1-4772b9ae0e06@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-11-04 17:50:25 -08:00
Alan Maguire	cc77a20389	selftests/bpf: Test parsing of (multi-)split BTF Write raw BTF to files, parse it and compare to original; this allows us to test parsing of (multi-)split BTF code. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251104203309.318429-3-alan.maguire@oracle.com	2025-11-04 13:44:12 -08:00
KaFai Wan	9f32bfec54	selftests/bpf: Add test for conditional jumps on same scalar register Add test cases to verify the correctness of the BPF verifier's branch analysis when conditional jumps are performed on the same scalar register. And make sure that JGT does not trigger verifier BUG. Signed-off-by: KaFai Wan <kafai.wan@linux.dev> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251103063108.1111764-3-kafai.wan@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-03 17:44:53 -08:00
Song Liu	62d2d0a338	selftests/bpf: Add tests for livepatch + bpf trampoline Both livepatch and BPF trampoline use ftrace. Special attention is needed when livepatch and fexit program touch the same function at the same time, because livepatch updates a kernel function and the BPF trampoline need to call into the right version of the kernel function. Use samples/livepatch/livepatch-sample.ko for the test. The test covers two cases: 1) When a fentry program is loaded first. This exercises the modify_ftrace_direct code path. 2) When a fentry program is loaded first. This exercises the register_ftrace_direct code path. Signed-off-by: Song Liu <song@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251027175023.1521602-4-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-03 17:22:06 -08:00
Alexis Lothoré (eBPF Foundation)	e6e10c51fb	selftests/bpf: Add checks in tc_tunnel when entering net namespaces test_tc_tunnel is missing checks on any open_netns. Add those checks anytime we try to enter a net namespace, and skip the related operations if we fail. While at it, reduce the number of open_netns/close_netns for cases involving operations in two distinct namespaces: the test currently does the following: nstoken = open_netns("foo") do_operation(); close(nstoken); nstoken = open_netns("bar") do_another_operation(); close(nstoken); As already stated in reviews for the initial test, we don't need to go back to the root net namespace to enter a second namespace, so just do: ntoken_client = open_netns("foo") do_operation(); nstoken_server = open_netns("bar") do_another_operation(); close(nstoken_server); close(nstoken_client); Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251031-tc_tunnel_improv-v1-2-0ffe44d27eda@bootlin.com	2025-10-31 15:00:30 -07:00
Alexis Lothoré (eBPF Foundation)	c076fd5bb4	selftests/bpf: Skip tc_tunnel subtest if its setup fails A subtest setup can fail in a wide variety of ways, so make sure not to run it if an issue occurs during its setup. The return value is already representing whether the setup succeeds or fails, it is just about wiring it. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251031-tc_tunnel_improv-v1-1-0ffe44d27eda@bootlin.com	2025-10-31 15:00:30 -07:00
Bastien Curutchet (eBPF Foundation)	d1aec26fce	selftests/bpf: test_xsk: Integrate test_xsk.c to test_progs framework test_xsk.c isn't part of the test_progs framework. Integrate the tests defined by test_xsk.c into the test_progs framework through a new file : prog_tests/xsk.c. ZeroCopy mode isn't tested in it as veth peers don't support it. Move test_xsk{.c/.h} to prog_tests/. Add the find_bit library to test_progs sources in the Makefile as it is is used by test_xsk.c Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-15-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:39 -07:00
Bastien Curutchet (eBPF Foundation)	75fc630867	selftests/bpf: test_xsk: Isolate non-CI tests Following tests won't fit in the CI: - XDP_ADJUST_TAIL_* and SEND_RECEIVE_9K_PACKETS because of their flakyness - UNALIGNED_* because they depend on huge page allocations - *_RING_SIZE because they depend on HW rings - TEARDOWN because it's too long Remove these tests from the nominal tests table so they won't be run by the CI in upcoming patch. Create a skip_ci_tests table to hold them. Use this skip_ci table in xskxceiver.c to keep all the tests available from the test_xsk.sh script. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-14-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:39 -07:00
Bastien Curutchet (eBPF Foundation)	7a96615f2e	selftests/bpf: test_xsk: Don't exit immediately on allocation failures If any allocation in the pkt_stream_*() helpers fail, exit_with_error() is called. This terminates the program immediately. It prevents the following tests from running and isn't compliant with the CI. Return NULL in case of allocation failure. Return TEST_FAILURE when something goes wrong in the packet generation. Clean up the resources if a failure happens between two steps of a test. Move exit_with_error()'s definition into xskxceiver.c as it isn't used anywhere else now. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-13-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:39 -07:00
Bastien Curutchet (eBPF Foundation)	844b13a9ff	selftests/bpf: test_xsk: Don't exit immediately if validate_traffic fails __testapp_validate_traffic() calls exit_with_error() on failures. This exits the program immediately. It prevents the following tests from running and isn't compliant with the CI. Return TEST_FAILURE instead of calling exit_with_error(). Release the resource of the 1st thread if a failure happens between its creation and the creation of the second thread. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-12-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:39 -07:00
Bastien Curutchet (eBPF Foundation)	5b2a757a16	selftests/bpf: test_xsk: Don't exit immediately when workers fail TX and RX workers can fail in many places. These failures trigger a call to exit_with_error() which exits the program immediately. It prevents the following tests from running and isn't compliant with the CI. Add return value to functions that can fail. Handle failures more smoothly through report_failure(). Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-11-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:39 -07:00
Bastien Curutchet (eBPF Foundation)	3f09728f90	selftests/bpf: test_xsk: Don't exit immediately when gettimeofday fails exit_with_error() is called when gettimeofday() fails. This exits the program immediately. It prevents the following tests from being run and isn't compliant with the CI. Return TEST_FAILURE instead of calling exit_on_error(). Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-10-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:38 -07:00
Bastien Curutchet (eBPF Foundation)	f12f1b5d14	selftests/bpf: test_xsk: Don't exit immediately when xsk_attach fails xsk_reattach_xdp calls exit_with_error() on failures. This exits the program immediately. It prevents the following tests from being run and isn't compliant with the CI. Add a return value to the functions handling XDP attachments to handle errors more smoothly. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-9-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:38 -07:00
Bastien Curutchet (eBPF Foundation)	e645bcfb16	selftests/bpf: test_xsk: Add return value to init_iface() init_iface() doesn't have any return value while it can fail. In case of failure it calls exit_on_error() which exits the application immediately. This prevents the following tests from being run and isn't compliant with the CI Add a return value to init_iface() so errors can be handled more smoothly. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-8-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:38 -07:00
Bastien Curutchet (eBPF Foundation)	f477b0fd75	selftests/bpf: test_xsk: Release resources when swap fails testapp_validate_traffic() doesn't release the sockets and the umem created by the threads if the test isn't currently in its last step. Thus, if the swap_xsk_resources() fails before the last step, the created resources aren't cleaned up. Clean the sockets and the umem in case of swap_xsk_resources() failure. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-7-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:38 -07:00
Bastien Curutchet (eBPF Foundation)	e3dfa0faf1	selftests/bpf: test_xsk: Wrap test clean-up in functions The clean-up done at the end of a test in __testapp_validate_traffic() isn't wrapped in a function. It isn't convenient if we want to use it somewhere else in the code. Wrap the clean-up in two new functions : the first deletes the sockets, the second releases the umem. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-6-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:38 -07:00
Bastien Curutchet (eBPF Foundation)	bea4f03897	selftests/bpf: test_xsk: fix memory leak in testapp_xdp_shared_umem() testapp_xdp_shared_umem() generates pkt_stream on each xsk from xsk_arr, where normally xsk_arr[0] gets pkt_streams and xsk_arr[1] have them NULLed. At the end of the test pkt_stream_restore_default() only releases xsk_arr[0] which leads to memory leaks. Release the missing pkt_stream at the end of testapp_xdp_shared_umem() Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-5-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:38 -07:00
Bastien Curutchet (eBPF Foundation)	d66e49ffa0	selftests/bpf: test_xsk: fix memory leak in testapp_stats_rx_dropped() testapp_stats_rx_dropped() generates pkt_stream twice. The last generated is released by pkt_stream_restore_default() at the end of the test but we lose the pointer of the first pkt_stream. Release the 'middle' pkt_stream when it's getting replaced to prevent memory leaks. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-4-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:38 -07:00
Bastien Curutchet (eBPF Foundation)	cadc0c1fd7	selftests/bpf: test_xsk: Fix __testapp_validate_traffic()'s return value __testapp_validate_traffic is supposed to return an integer value that tells if the test passed (0), failed (-1) or was skiped (2). It actually returns a boolean in the end. This doesn't harm when the test is successful but can lead to misinterpretation in case of failure as 1 will be returned instead of -1. Return TEST_FAILURE (-1) in case of failure, TEST_PASS (0) otherwise. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-3-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:38 -07:00
Bastien Curutchet (eBPF Foundation)	2233ef8bba	selftests/bpf: test_xsk: Initialize bitmap before use bitmap is used before being initialized. Initialize it to zero before using it. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-2-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:38 -07:00
Bastien Curutchet (eBPF Foundation)	3ab77f35a7	selftests/bpf: test_xsk: Split xskxceiver AF_XDP features are tested by the test_xsk.sh script but not by the test_progs framework. The tests used by the script are defined in xksxceiver.c which can't be integrated in the test_progs framework as is. Extract these test definitions from xskxceiver{.c/.h} to put them in new test_xsk{.c/.h} files. Keep the main() function and its unshared dependencies in xksxceiver to avoid impacting the test_xsk.sh script which is often used to test real hardware. Move ksft_test_result_*() calls to xskxceiver.c to keep the kselftest's report valid Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20251031-xsk-v7-1-39fe486593a3@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-31 09:24:38 -07:00
Kumar Kartikeya Dwivedi	a8a0abf097	selftests/bpf: Add ABBCCA case for rqspinlock stress test Introduce a new mode for the rqspinlock stress test that exercises a deadlock that won't be detected by the AA and ABBA checks, such that we always reliably trigger the timeout fallback. We need 4 CPUs for this particular case, as CPU 0 is untouched, and three participant CPUs for triggering the ABBCCA case. Refactor the lock acquisition paths in the module to better reflect the three modes and choose the right lock depending on the context. Also drop ABBA case from running by default as part of test progs, since the stress test can consume a significant amount of time. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251029181828.231529-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-29 18:17:56 -07:00
Mykyta Yatsenko	5913e936f6	selftests/bpf: Fix intermittent failures in file_reader test file_reader/on_open_expect_fault intermittently fails when test_progs runs tests in parallel, because it expects a page fault on first read. Another file_reader test running concurrently may have already pulled the same pages into the page cache, eliminating the fault and causing a spurious failure. Make file_reader/on_open_expect_fault read from a file region that does not overlap with other file_reader tests, so the initial access still faults even under parallel execution. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20251029195907.858217-1-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-29 18:15:30 -07:00
Alexis Lothoré (eBPF Foundation)	5d3591607d	selftests/bpf: Remove test_tc_tunnel.sh Now that test_tc_tunnel.sh scope has been ported to the test_progs framework, remove it. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251027-tc_tunnel-v3-4-505c12019f9d@bootlin.com	2025-10-29 12:17:24 -07:00
Alexis Lothoré (eBPF Foundation)	8517b1abe5	selftests/bpf: Integrate test_tc_tunnel.sh tests into test_progs The test_tc_tunnel.sh script checks that a large variety of tunneling mechanisms handled by the kernel can be handled as well by eBPF programs. While this test shares similarities with test_tunnel.c (which is already integrated in test_progs), those are testing slightly different things: - test_tunnel.c creates a tunnel interface, and then get and set tunnel keys in packet metadata, from BPF programs. - test_tc_tunnels.sh manually parses/crafts packets content Bring the tests covered by test_tc_tunnel.sh into the test_progs framework, by creating a dedicated test_tc_tunnel.sh. This new test defines a "generic" runner which, for each test configuration: - will configure the relevant veth pair, each of those isolated in a dedicated namespace - will check that traffic will fail if there is only an encapsulating program attached to one veth egress - will check that traffic succeed if we enable some decapsulation module on kernel side - will check that traffic still succeeds if we replace the kernel decapsulation with some eBPF ingress decapsulation. Example of the new test execution: # ./test_progs -a tc_tunnel #447/1 tc_tunnel/ipip_none:OK #447/2 tc_tunnel/ipip6_none:OK #447/3 tc_tunnel/ip6tnl_none:OK #447/4 tc_tunnel/sit_none:OK #447/5 tc_tunnel/vxlan_eth:OK #447/6 tc_tunnel/ip6vxlan_eth:OK #447/7 tc_tunnel/gre_none:OK #447/8 tc_tunnel/gre_eth:OK #447/9 tc_tunnel/gre_mpls:OK #447/10 tc_tunnel/ip6gre_none:OK #447/11 tc_tunnel/ip6gre_eth:OK #447/12 tc_tunnel/ip6gre_mpls:OK #447/13 tc_tunnel/udp_none:OK #447/14 tc_tunnel/udp_eth:OK #447/15 tc_tunnel/udp_mpls:OK #447/16 tc_tunnel/ip6udp_none:OK #447/17 tc_tunnel/ip6udp_eth:OK #447/18 tc_tunnel/ip6udp_mpls:OK #447 tc_tunnel:OK Summary: 1/18 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251027-tc_tunnel-v3-3-505c12019f9d@bootlin.com	2025-10-29 12:17:22 -07:00
Alexis Lothoré (eBPF Foundation)	86433db932	selftests/bpf: Make test_tc_tunnel.bpf.c compatible with big endian platforms When trying to run bpf-based encapsulation in a s390x environment, some parts of test_tc_tunnel.bpf.o do not encapsulate correctly the traffic, leading to tests failures. Adding some logs shows for example that packets about to be sent on an interface with the ip6vxlan_eth program attached do not have the expected value 5 in the ip header ihl field, and so are ignored by the program. This phenomenon appears when trying to cross-compile the selftests, rather than compiling it from a virtualized host: the selftests build system may then wrongly pick some host headers. If <asm/byteorder.h> ends up being picked on the host (and if the host has a endianness different from the target one), it will then expose wrong endianness defines (e.g __LITTLE_ENDIAN_BITFIELD instead of __BIT_ENDIAN_BITFIELD), and it will for example mess up the iphdr structure layout used in the ebpf program. To prevent this, directly use the vmlinux.h header generated by the selftests build system rather than including directly specific kernel headers. As a consequence, add some missing definitions that are not exposed by vmlinux.h, and adapt the bitfield manipulations to allow building and using the program on both types of platforms. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251027-tc_tunnel-v3-2-505c12019f9d@bootlin.com	2025-10-29 11:07:26 -07:00
Alexis Lothoré (eBPF Foundation)	1d5137c8d1	selftests/bpf: Add tc helpers The test_tunnel.c file defines small fonctions to easily attach eBPF programs to tc hooks, either on egress, ingress or both. Create a shared helper in network_helpers.c so that other tests can benefit from it. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251027-tc_tunnel-v3-1-505c12019f9d@bootlin.com	2025-10-29 11:07:24 -07:00
Xu Kuohai	f9db3a3822	selftests/bpf/benchs: Add overwrite mode benchmark for BPF ring buffer Add --rb-overwrite option to benchmark BPF ring buffer in overwrite mode. Since overwrite mode is not yet supported by libbpf for consumer, also add --rb-bench-producer option to benchmark producer directly without a consumer. Benchmarks on an x86_64 and an arm64 CPU are shown below for reference. - AMD EPYC 9654 (x86_64) Ringbuf, multi-producer contention in overwrite mode, no consumer ================================================================= rb-prod nr_prod 1 32.180 ± 0.033M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 2 9.617 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 3 8.810 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 4 9.272 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 8 9.173 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 12 3.086 ± 0.032M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 16 2.945 ± 0.021M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 20 2.519 ± 0.021M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 24 2.545 ± 0.021M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 28 2.363 ± 0.024M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 32 2.357 ± 0.021M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 36 2.267 ± 0.011M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 40 2.284 ± 0.020M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 44 2.215 ± 0.025M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 48 2.193 ± 0.023M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 52 2.208 ± 0.024M/s (drops 0.000 ± 0.000M/s) - HiSilicon Kunpeng 920 (arm64) Ringbuf, multi-producer contention in overwrite mode, no consumer ================================================================= rb-prod nr_prod 1 14.478 ± 0.006M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 2 21.787 ± 0.010M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 3 6.045 ± 0.001M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 4 5.352 ± 0.003M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 8 4.850 ± 0.002M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 12 3.542 ± 0.016M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 16 3.509 ± 0.021M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 20 3.171 ± 0.010M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 24 3.154 ± 0.014M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 28 2.974 ± 0.015M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 32 3.167 ± 0.014M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 36 2.903 ± 0.010M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 40 2.866 ± 0.010M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 44 2.914 ± 0.010M/s (drops 0.000 ± 0.000M/s) rb-prod nr_prod 48 2.806 ± 0.012M/s (drops 0.000 ± 0.000M/s) Rb-prod nr_prod 52 2.840 ± 0.012M/s (drops 0.000 ± 0.000M/s) Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251018035738.4039621-4-xukuohai@huaweicloud.com	2025-10-27 19:47:32 -07:00
Xu Kuohai	8f7a86ecde	selftests/bpf: Add overwrite mode test for BPF ring buffer Add overwrite mode test for BPF ring buffer. The test creates a BPF ring buffer in overwrite mode, then repeatedly reserves and commits records to check if the ring buffer works as expected both before and after overwriting occurs. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251018035738.4039621-3-xukuohai@huaweicloud.com	2025-10-27 19:46:32 -07:00
Mykyta Yatsenko	784cdf9315	selftests/bpf: add file dynptr tests Introducing selftests for validating file-backed dynptr works as expected. * validate implementation supports dynptr slice and read operations * validate destructors should be paired with initializers * validate sleepable progs can page in. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251026203853.135105-11-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-27 09:56:27 -07:00
Mykyta Yatsenko	531b87d865	bpf: widen dynptr size/offset to 64 bit Dynptr currently caps size and offset at 24 bits, which isn’t sufficient for file-backed use cases; even 32 bits can be limiting. Refactor dynptr helpers/kfuncs to use 64-bit size and offset, ensuring consistency across the APIs. This change does not affect internals of xdp, skb or other dynptrs, which continue to behave as before. Also it does not break binary compatibility. The widening enables large-file access support via dynptr, implemented in the next patches. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251026203853.135105-3-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-27 09:56:26 -07:00
Mykyta Yatsenko	a61a257ff5	selftests/bpf: remove unnecessary kfunc prototypes Remove unnecessary kfunc prototypes from test programs, these are provided by vmlinux.h Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251026203853.135105-2-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-27 09:56:26 -07:00
Jakub Kicinski	2b7553db91	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.18-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-23 10:53:08 -07:00
Puranjay Mohan	7361c86485	selftests/bpf: Fix list_del() in arena list The __list_del fuction doesn't set the previous node's next pointer to the next node of the node to be deleted. It just updates the local variable and not the actual pointer in the previous node. The test was passing up till now because the bpf code is doing bpf_free() after list_del and therfore reading head->first from the userspace will read all zeroes. But after arena_list_del() is finished, head->first should point to NULL; Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20251017141727.51355-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-18 19:27:26 -07:00
Yonghong Song	4f8543b5f2	selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22 With latest llvm22, I hit the verif_scale_strobemeta selftest failure below: $ ./test_progs -n 618 libbpf: prog 'on_event': BPF program load failed: -E2BIG libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG -- BPF program is too large. Processed 1000001 insn verification time 7019091 usec stack depth 488 processed 1000001 insns (limit 1000000) max_states_per_insn 28 total_states 33927 peak_states 12813 mark_read 0 -- END PROG LOAD LOG -- libbpf: prog 'on_event': failed to load: -E2BIG libbpf: failed to load object 'strobemeta.bpf.o' scale_test:FAIL:expect_success unexpected error: -7 (errno 7) #618 verif_scale_strobemeta:FAIL But if I increase the verificaiton insn limit from 1M to 10M, the above test_progs run actually will succeed. The below is the result from veristat: $ ./veristat strobemeta.bpf.o Processing 'strobemeta.bpf.o'... File Program Verdict Duration (us) Insns States Program size Jited size ---------------- -------- ------- ------------- ------- ------ ------------ ---------- strobemeta.bpf.o on_event success 90250893 9777685 358230 15954 80794 ---------------- -------- ------- ------------- ------- ------ ------------ ---------- Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs. Further debugging shows the llvm commit [1] is responsible for the verificaiton failure as it tries to convert certain switch statement to if-condition. Such change may cause different transformation compared to original switch statement. In bpf program strobemeta.c case, the initial llvm ir for read_int_var() function is define internal void @read_int_var(ptr noundef %0, i64 noundef %1, ptr noundef %2, ptr noundef %3, ptr noundef %4) #2 !dbg !535 { %6 = alloca ptr, align 8 %7 = alloca i64, align 8 %8 = alloca ptr, align 8 %9 = alloca ptr, align 8 %10 = alloca ptr, align 8 %11 = alloca ptr, align 8 %12 = alloca i32, align 4 ... %20 = icmp ne ptr %19, null, !dbg !561 br i1 %20, label %22, label %21, !dbg !562 21: ; preds = %5 store i32 1, ptr %12, align 4 br label %48, !dbg !563 22: %23 = load ptr, ptr %9, align 8, !dbg !564 ... 47: ; preds = %38, %22 store i32 0, ptr %12, align 4, !dbg !588 br label %48, !dbg !588 48: ; preds = %47, %21 call void @llvm.lifetime.end.p0(ptr %11) #4, !dbg !588 %49 = load i32, ptr %12, align 4 switch i32 %49, label %51 [ i32 0, label %50 i32 1, label %50 ] 50: ; preds = %48, %48 ret void, !dbg !589 51: ; preds = %48 unreachable } Note that the above 'switch' statement is added by clang frontend. Without [1], the switch statement will survive until SelectionDag, so the switch statement acts like a 'barrier' and prevents some transformation involved with both 'before' and 'after' the switch statement. But with [1], the switch statement will be removed during middle end optimization and later middle end passes (esp. after inlining) have more freedom to reorder the code. The following is the related source code: static void calc_location(struct strobe_value_loc loc, void tls_base): bpf_probe_read_user(&tls_ptr, sizeof(void ), dtv); /* if pointer has (void )-1 value, then TLS wasn't initialized yet / return tls_ptr && tls_ptr != (void )-1 ? tls_ptr + tls_index.offset : NULL; In read_int_var() func, we have: void location = calc_location(&cfg->int_locs[idx], tls_base); if (!location) return; bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location); ... The static func calc_location() is called inside read_int_var(). The asm code without [1]: 77: .123....89 (85) call bpf_probe_read_user#112 78: ........89 (79) r1 = (u64 )(r10 -368) 79: .1......89 (79) r2 = (u64 )(r10 -8) 80: .12.....89 (bf) r3 = r2 81: .123....89 (0f) r3 += r1 82: ..23....89 (07) r2 += 1 83: ..23....89 (79) r4 = (u64 )(r10 -464) 84: ..234...89 (a5) if r2 < 0x2 goto pc+13 85: ...34...89 (15) if r3 == 0x0 goto pc+12 86: ...3....89 (bf) r1 = r10 87: .1.3....89 (07) r1 += -400 88: .1.3....89 (b4) w2 = 16 In this case, 'r2 < 0x2' and 'r3 == 0x0' go to null 'locaiton' place, so the verifier actually prefers to do verification first at 'r1 = r10' etc. The asm code with [1]: 119: .123....89 (85) call bpf_probe_read_user#112 120: ........89 (79) r1 = (u64 )(r10 -368) 121: .1......89 (79) r2 = (u64 )(r10 -8) 122: .12.....89 (bf) r3 = r2 123: .123....89 (0f) r3 += r1 124: ..23....89 (07) r2 += -1 125: ..23....89 (a5) if r2 < 0xfffffffe goto pc+6 126: ........89 (05) goto pc+17 ... 144: ........89 (b4) w1 = 0 145: .1......89 (6b) (u16 )(r8 +80) = r1 In this case, if 'r2 < 0xfffffffe' is true, the control will go to non-null 'location' branch, so 'goto pc+17' will actually go to null 'location' branch. This seems causing tremendous amount of verificaiton state. To fix the issue, rewrite the following code return tls_ptr && tls_ptr != (void *)-1 ? tls_ptr + tls_index.offset : NULL; to if/then statement and hopefully these explicit if/then statements are sticky during middle-end optimizations. Test with llvm20 and llvm21 as well and all strobemeta related selftests are passed. [1] https://github.com/llvm/llvm-project/pull/161000 Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20251014051639.1996331-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-18 19:25:03 -07:00
Yafang Shao	7484e7cd8a	bpf: mark vma->{vm_mm,vm_file} as __safe_trusted_or_null The vma->vm_mm might be NULL and it can be accessed outside of RCU. Thus, we can mark it as trusted_or_null. With this change, BPF helpers can safely access vma->vm_mm to retrieve the associated mm_struct from the VMA. Then we can make policy decision from the VMA. The "trusted" annotation enables direct access to vma->vm_mm within kfuncs marked with KF_TRUSTED_ARGS or KF_RCU, such as bpf_task_get_cgroup1() and bpf_task_under_cgroup(). Conversely, "null" enforcement requires all callsites using vma->vm_mm to perform NULL checks. The lsm selftest must be modified because it directly accesses vma->vm_mm without a NULL pointer check; otherwise it will break due to this change. For the VMA based THP policy, the use case is as follows, @mm = @vma->vm_mm; // vm_area_struct::vm_mm is trusted or null if (!@mm) return; bpf_rcu_read_lock(); // rcu lock must be held to dereference the owner @owner = @mm->owner; // mm_struct::owner is rcu trusted or null if (!@owner) goto out; @cgroup1 = bpf_task_get_cgroup1(@owner, MEMCG_HIERARCHY_ID); /* make the decision based on the @cgroup1 attribute / bpf_cgroup_release(@cgroup1); // release the associated cgroup out: bpf_rcu_read_unlock(); PSI memory information can be obtained from the associated cgroup to inform policy decisions. Since upstream PSI support is currently limited to cgroup v2, the following example demonstrates cgroup v2 implementation: @owner = @mm->owner; if (@owner) { // @ancestor_cgid is user-configured @ancestor = bpf_cgroup_from_id(@ancestor_cgid); if (bpf_task_under_cgroup(@owner, @ancestor)) { @psi_group = @ancestor->psi; / Extract PSI metrics from @psi_group and * implement policy logic based on the values */ } } The vma::vm_file can also be marked with __safe_trusted_or_null. No additional selftests are required since vma->vm_file and vma->vm_mm are already validated in the existing selftest suite. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Link: https://lore.kernel.org/r/20251016063929.13830-3-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-18 19:23:08 -07:00
Tiezhu Yang	c67f4ae737	selftests/bpf: Silence unused-but-set build warnings There are some set but not used build errors when compiling bpf selftests with the latest upstream mainline GCC, at the beginning add the attribute __maybe_unused for the variables, but it is better to just add the option -Wno-unused-but-set-variable to CFLAGS in Makefile to disable the errors instead of hacking the tests. tools/testing/selftests/bpf/map_tests/lpm_trie_map_basic_ops.c:229:36: error: variable ‘n_matches_after_delete’ set but not used [-Werror=unused-but-set-variable=] tools/testing/selftests/bpf/map_tests/lpm_trie_map_basic_ops.c:229:25: error: variable ‘n_matches’ set but not used [-Werror=unused-but-set-variable=] tools/testing/selftests/bpf/prog_tests/bpf_cookie.c:426:22: error: variable ‘j’ set but not used [-Werror=unused-but-set-variable=] tools/testing/selftests/bpf/prog_tests/find_vma.c:52:22: error: variable ‘j’ set but not used [-Werror=unused-but-set-variable=] tools/testing/selftests/bpf/prog_tests/perf_branches.c:67:22: error: variable ‘j’ set but not used [-Werror=unused-but-set-variable=] tools/testing/selftests/bpf/prog_tests/perf_link.c:15:22: error: variable ‘j’ set but not used [-Werror=unused-but-set-variable=] Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/20251018082815.20622-1-yangtiezhu@loongson.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-18 19:21:29 -07:00
Alexei Starovoitov	50de48a4dd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf at 6.18-rc2 Cross-merge BPF and other fixes after downstream PR. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-18 18:20:57 -07:00
Brahmajit Das	a1e83d4c03	selftests/bpf: Fix redefinition of 'off' as different kind of symbol This fixes the following build error CLNG-BPF [test_progs] verifier_global_ptr_args.bpf.o progs/verifier_global_ptr_args.c:228:5: error: redefinition of 'off' as different kind of symbol 228 \| u32 off; \| ^ The symbol 'off' was previously defined in tools/testing/selftests/bpf/tools/include/vmlinux.h, which includes an enum i40e_ptp_gpio_pin_state from drivers/net/ethernet/intel/i40e/i40e_ptp.c: enum i40e_ptp_gpio_pin_state { end = -2, invalid = -1, off = 0, in_A = 1, in_B = 2, out_A = 3, out_B = 4, }; This enum is included when CONFIG_I40E is enabled. As of commit `032676ff82` ("LoongArch: Update Loongson-3 default config file"), CONFIG_I40E is set in the defconfig, which leads to the conflict. Renaming the local variable avoids the redefinition and allows the build to succeed. Suggested-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Brahmajit Das <listout@listout.xyz> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20251017171551.53142-1-listout@listout.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-17 11:33:23 -07:00
Kuniyuki Iwashima	5f941dd87b	selftests/bpf: Add test for sk->sk_bypass_prot_mem. The test does the following for IPv4/IPv6 x TCP/UDP sockets with/without sk->sk_bypass_prot_mem, which can be turned on by net.core.bypass_prot_mem or bpf_setsockopt(SK_BPF_BYPASS_PROT_MEM). 1. Create socket pairs 2. Send NR_PAGES (32) of data (TCP consumes around 35 pages, and UDP consuems 66 pages due to skb overhead) 3. Read memory_allocated from sk->sk_prot->memory_allocated and sk->sk_prot->memory_per_cpu_fw_alloc 4. Check if unread data is charged to memory_allocated If sk->sk_bypass_prot_mem is set, memory_allocated should not be changed, but we allow a small error (up to 10 pages) in case other processes on the host use some amounts of TCP/UDP memory. The amount of allocated pages are buffered to per-cpu variable {tcp,udp}_memory_per_cpu_fw_alloc up to +/- net.core.mem_pcpu_rsv before reported to {tcp,udp}_memory_allocated. At 3., memory_allocated is calculated from the 2 variables at fentry of socket create function. We drain the receive queue only for UDP before close() because UDP recv queue is destroyed after RCU grace period. When I printed memory_allocated, UDP bypass cases sometimes saw the no-bypass case's leftover, but it's still in the small error range (<10 pages). bpf_trace_printk: memory_allocated: 0 <-- TCP no-bypass bpf_trace_printk: memory_allocated: 35 bpf_trace_printk: memory_allocated: 0 <-- TCP w/ sysctl bpf_trace_printk: memory_allocated: 0 bpf_trace_printk: memory_allocated: 0 <-- TCP w/ bpf bpf_trace_printk: memory_allocated: 0 bpf_trace_printk: memory_allocated: 0 <-- UDP no-bypass bpf_trace_printk: memory_allocated: 66 bpf_trace_printk: memory_allocated: 2 <-- UDP w/ sysctl (2 pages leftover) bpf_trace_printk: memory_allocated: 2 bpf_trace_printk: memory_allocated: 2 <-- UDP w/ bpf (2 pages leftover) bpf_trace_printk: memory_allocated: 2 We prefer finishing tests faster than oversleeping for call_rcu() + sk_destruct(). The test completes within 2s on QEMU (64 CPUs) w/ KVM. # time ./test_progs -t sk_bypass #371/1 sk_bypass_prot_mem/TCP :OK #371/2 sk_bypass_prot_mem/UDP :OK #371/3 sk_bypass_prot_mem/TCPv6:OK #371/4 sk_bypass_prot_mem/UDPv6:OK #371 sk_bypass_prot_mem:OK Summary: 1/4 PASSED, 0 SKIPPED, 0 FAILED real 0m1.481s user 0m0.181s sys 0m0.441s Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Link: https://patch.msgid.link/20251014235604.3057003-7-kuniyu@google.com	2025-10-16 12:04:47 -07:00
Xing Guo	0c1999ed33	selftests: arg_parsing: Ensure data is flushed to disk before reading. test_parse_test_list_file writes some data to /tmp/bpf_arg_parsing_test.XXXXXX and parse_test_list_file() will read the data back. However, after writing data to that file, we forget to call fsync() and it's causing testing failure in my laptop. This patch helps fix it by adding the missing fsync() call. Fixes: `64276f01dc` ("selftests/bpf: Test_progs can read test lists from file") Signed-off-by: Xing Guo <higuoxing@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251016035330.3217145-1-higuoxing@gmail.com	2025-10-16 09:34:39 -07:00
Andrii Nakryiko	e603a342cf	selftests/bpf: make arg_parsing.c more robust to crashes We started getting a crash in BPF CI, which seems to originate from test_parse_test_list_file() test and is happening at this line: ASSERT_OK(strcmp("test_with_spaces", set.tests[0].name), "test 0 name"); One way we can crash there is if set.cnt zero, which is checked for with ASSERT_EQ() above, but we proceed after this regardless of the outcome. Instead of crashing, we should bail out with test failure early. Similarly, if parse_test_list_file() fails, we shouldn't be even looking at set, so bail even earlier if ASSERT_OK() fails. Fixes: `64276f01dc` ("selftests/bpf: Test_progs can read test lists from file") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20251014202037.72922-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-14 16:39:33 -07:00
Alexei Starovoitov	39e9d5f630	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf before 6.18-rc1 Cross-merge BPF and other fixes after downstream PR. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-11 18:27:47 -07:00
Mykyta Yatsenko	bca2b74ea9	selftests/bpf: Add more bpf_wq tests Add bpf_wq selftests to verify: * BPF program using non-constant offset of struct bpf_wq is rejected * BPF program using map with no BTF for storing struct bpf_wq is rejected Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Tested-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251010164606.147298-2-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-10 11:12:51 -07:00
Paul Chaignon	bc3eeb4259	selftests/bpf: Test direct packet access on non-linear skbs This patch adds new selftests in the direct packet access suite, to cover the non-linear case. The first six tests cover the behavior of the bounds check with a non-linear skb. The last test adds a call to bpf_skb_pull_data() to be able to access the packet. Note that the size of the linear area includes the L2 header, but for some program types like cgroup_skb, ctx->data points to the L3 header. Therefore, a linear area of 22 bytes will have only 8 bytes accessible to the BPF program (22 - ETH_HLEN). For that reason, the cgroup_skb test cases access the packet at an offset of 8 bytes. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/ceedbfd719e58f0d49dcceb8592f5e6bd38ce5fe.1760037899.git.paul.chaignon@gmail.com	2025-10-10 10:43:04 -07:00
Paul Chaignon	8d45d0398d	selftests/bpf: Support non-linear flag in test loader This patch adds support for a new tag __linear_size in the test loader, to specify the size of the linear area in case of non-linear skbs. If the tag is absent or null, a linear skb is crafted. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/7ad928ec7591daef4f1b84032aeb86c918b3e5a7.1760037899.git.paul.chaignon@gmail.com	2025-10-10 10:43:04 -07:00
KaFai Wan	accb9a7e87	selftests/bpf: Add test for unpinning htab with internal timer struct Add test to verify that unpinning hash tables containing internal timer structures does not trigger context warnings. Each subtest (timer_prealloc and timer_no_prealloc) can trigger the context warning when unpinning, but the warning cannot be triggered twice within a short time interval (a HZ), which is expected behavior. Signed-off-by: KaFai Wan <kafai.wan@linux.dev> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20251008102628.808045-3-kafai.wan@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-10 10:10:08 -07:00
Rong Tao	eca0b643ef	selftests/bpf: Test bpf_strcasestr,bpf_strncasestr kfuncs Add tests for new kfuncs bpf_strcasestr() and bpf_strncasestr(). Signed-off-by: Rong Tao <rongtao@cestc.cn> Link: https://lore.kernel.org/r/tencent_4F1A340A8966155C52AA9CBDB68FD221FE0A@qq.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-10 10:05:32 -07:00
Kumar Kartikeya Dwivedi	5b1b5d380a	selftests/bpf: Add tests for async cb context Add tests to verify that async callback's sleepable attribute is correctly determined by the callback type, not the arming program's context, reflecting its true execution context. Introduce verifier_async_cb_context.c with tests for all three async callback primitives: bpf_timer, bpf_wq, and bpf_task_work. Each primitive is tested when armed from both sleepable (lsm.s/file_open) and non-sleepable (fentry) programs. Test coverage: - bpf_timer callbacks: Verify they are never sleepable, even when armed from sleepable programs. Both tests should fail when attempting to use sleepable helper bpf_copy_from_user() in the callback. - bpf_wq callbacks: Verify they are always sleepable, even when armed from non-sleepable programs. Both tests should succeed when using sleepable helpers in the callback. - bpf_task_work callbacks: Verify they are always sleepable, even when armed from non-sleepable programs. Both tests should succeed when using sleepable helpers in the callback. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251007220349.3852807-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-10 10:04:51 -07:00
Rong Tao	de7342228b	bpf: Finish constification of 1st parameter of bpf_d_path() The commit `1b8abbb121` ("bpf...d_path(): constify path argument") constified the first parameter of the bpf_d_path(), but failed to update it in all places. Finish constification. Otherwise the selftest fail to build: .../selftests/bpf/bpf_experimental.h:222:12: error: conflicting types for 'bpf_path_d_path' 222 \| extern int bpf_path_d_path(const struct path path, char buf, size_t buf__sz) __ksym; \| ^ .../selftests/bpf/tools/include/vmlinux.h:153922:12: note: previous declaration is here 153922 \| extern int bpf_path_d_path(struct path path, char buf, size_t buf__sz) __weak __ksym; Fixes: `1b8abbb121` ("bpf...d_path(): constify path argument") Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-04 09:05:23 -07:00
Linus Torvalds	cbf33b8e0b	bpf-fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmjfBtMACgkQ6rmadz2v bToTuA/+JizieMIW6eqzCDrrhj45DmuKu6gUMrkmM0xte3QCoLly5FDGJ9HMmrqf L5xfz/TAExg/Nh9a0zpCGE7/fMr9P10Z5vVHMe/I06rcW+/fDrAERXDurlfpfliv /7KqF/lq1D6kp7bcBzWX4dzkT66Nh/Ax6ZJKIXF/K3iwFXBb1A/95Wwgqxuc3xNM n0kthEhnX6x5cojy6fla2zYNoebNLVxPvafTcAtnE0QM8vxdOl0Q7KWrdQpVE5V/ z6WF2M822eTQeUIy6k/ZRckFlmEwa++I+w5EnygBFP8INUHARdYNBQpQVFkg31jz GYGlbTs7jaDbMNWkEhKyDbuhL5Xq9/5FGOaLI9CvDtpPjbfYTZMGtIn9L902XtqE VEfBcTA/g/qd8Urx5622sfPwau64LrSAKBRE/4CJ7/dsnPx4BdKavsA5DMVuyRIm Au6tOxlRFhPWhultnmmbFBjwn33xTleYNUF9wpWQnijNq+Ph29gKZk8Ac5eKKS6A dTSCxv7LGafezN/evr779vXohZyF9vWfqHdITVzTo40tsSbi5v4uyfJ2qzoaS7Gf UQ4F2nkjqddhG/O1W2TY8aFzmWXP2q92V6yqkcq5zjA3D1Ue6JOxRzG/3kNzgsLX Gf1DmPosUyeLNpwdDYRD83Dk7tJ7rAI+x3i6Iwh8qivIhC4CzrU= =Kzf9 -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix selftests/bpf (typo, conflicts) and unbreak BPF CI (Jiri Olsa) - Remove linux/unaligned.h dependency for libbpf_sha256 (Andrii Nakryiko) and add a test (Eric Biggers) - Reject negative offsets for ALU operations in the verifier (Yazhou Tang) and add a test (Eduard Zingerman) - Skip scalar adjustment for BPF_NEG operation if destination register is a pointer (Brahmajit Das) and add a test (KaFai Wan) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: libbpf: Fix missing #pragma in libbpf_utils.c selftests/bpf: Add tests for rejection of ALU ops with negative offsets selftests/bpf: Add test for libbpf_sha256() bpf: Reject negative offsets for ALU ops libbpf: remove linux/unaligned.h dependency for libbpf_sha256() libbpf: move libbpf_sha256() implementation into libbpf_utils.c libbpf: move libbpf_errstr() into libbpf_utils.c libbpf: remove unused libbpf_strerror_r and STRERR_BUFSIZE libbpf: make libbpf_errno.c into more generic libbpf_utils.c selftests/bpf: Add test for BPF_NEG alu on CONST_PTR_TO_MAP bpf: Skip scalar adjustment for BPF_NEG if dst is a pointer selftests/bpf: Fix realloc size in bpf_get_addrs selftests/bpf: Fix typo in subtest_basic_usdt after merge conflict selftests/bpf: Fix open-coded gettid syscall in uprobe syscall tests	2025-10-03 19:38:19 -07:00
Linus Torvalds	50647a1176	file->f_path constification Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaN3daAAKCRBZ7Krx/gZQ 6zNWAP9kD6rOJRNqDgea4pibDPa47Tps/WM5tsDv3dsLliY29gEA6sveOWZ3guAj 4oY3ts/NtHLWXvhI7Vd/1mr2aTKEZQk= =YNK+ -----END PGP SIGNATURE----- Merge tag 'pull-f_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull file->f_path constification from Al Viro: "Only one thing was modifying ->f_path of an opened file - acct(2). Massaging that away and constifying a bunch of struct path * arguments in functions that might be given &file->f_path ends up with the situation where we can turn ->f_path into an anon union of const struct path f_path and struct path __f_path, the latter modified only in a few places in fs/{file_table,open,namei}.c, all for struct file instances that are yet to be opened" * tag 'pull-f_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (23 commits) Have cc(1) catch attempts to modify ->f_path kernel/acct.c: saner struct file treatment configfs:get_target() - release path as soon as we grab configfs_item reference apparmor/af_unix: constify struct path * arguments ovl_is_real_file: constify realpath argument ovl_sync_file(): constify path argument ovl_lower_dir(): constify path argument ovl_get_verity_digest(): constify path argument ovl_validate_verity(): constify {meta,data}path arguments ovl_ensure_verity_loaded(): constify datapath argument ksmbd_vfs_set_init_posix_acl(): constify path argument ksmbd_vfs_inherit_posix_acl(): constify path argument ksmbd_vfs_kern_path_unlock(): constify path argument ksmbd_vfs_path_lookup_locked(): root_share_path can be const struct path * check_export(): constify path argument export_operations->open(): constify path argument rqst_exp_get_by_name(): constify path argument nfs: constify path argument of __vfs_getattr() bpf...d_path(): constify path argument done_path_create(): constify path argument ...	2025-10-03 16:32:36 -07:00
Linus Torvalds	07fdad3a93	Networking changes for 6.18. Core & protocols ---------------- - Improve drop account scalability on NUMA hosts for RAW and UDP sockets and the backlog, almost doubling the Pps capacity under DoS. - Optimize the UDP RX performance under stress, reducing contention, revisiting the binary layout of the involved data structs and implementing NUMA-aware locking. This improves UDP RX performance by an additional 50%, even more under extreme conditions. - Add support for PSP encryption of TCP connections; this mechanism has some similarities with IPsec and TLS, but offers superior HW offloads capabilities. - Ongoing work to support Accurate ECN for TCP. AccECN allows more than one congestion notification signal per RTT and is a building block for Low Latency, Low Loss, and Scalable Throughput (L4S). - Reorganize the TCP socket binary layout for data locality, reducing the number of touched cachelines in the fastpath. - Refactor skb deferral free to better scale on large multi-NUMA hosts, this improves TCP and UDP RX performances significantly on such HW. - Increase the default socket memory buffer limits from 256K to 4M to better fit modern link speeds. - Improve handling of setups with a large number of nexthop, making dump operating scaling linearly and avoiding unneeded synchronize_rcu() on delete. - Improve bridge handling of VLAN FDB, storing a single entry per bridge instead of one entry per port; this makes the dump order of magnitude faster on large switches. - Restore IP ID correctly for encapsulated packets at GSO segmentation time, allowing GRO to merge packets in more scenarios. - Improve netfilter matching performance on large sets. - Improve MPTCP receive path performance by leveraging recently introduced core infrastructure (skb deferral free) and adopting recent TCP autotuning changes. - Allow bridges to redirect to a backup port when the bridge port is administratively down. - Introduce MPTCP 'laminar' endpoint that con be used only once per connection and simplify common MPTCP setups. - Add RCU safety to dst->dev, closing a lot of possible races. - A significant crypto library API for SCTP, MPTCP and IPv6 SR, reducing code duplication. - Supports pulling data from an skb frag into the linear area of an XDP buffer. Things we sprinkled into general kernel code -------------------------------------------- - Generate netlink documentation from YAML using an integrated YAML parser. Driver API ---------- - Support using IPv6 Flow Label in Rx hash computation and RSS queue selection. - Introduce API for fetching the DMA device for a given queue, allowing TCP zerocopy RX on more H/W setups. - Make XDP helpers compatible with unreadable memory, allowing more easily building DevMem-enabled drivers with a unified XDP/skbs datapath. - Add a new dedicated ethtool callback enabling drivers to provide the number of RX rings directly, improving efficiency and clarity in RX ring queries and RSS configuration. - Introduce a burst period for the health reporter, allowing better handling of multiple errors due to the same root cause. - Support for DPLL phase offset exponential moving average, controlling the average smoothing factor. Device drivers -------------- - Add a new Huawei driver for 3rd gen NIC (hinic3). - Add a new SpacemiT driver for K1 ethernet MAC. - Add a generic abstraction for shared memory communication devices (dibps) - Ethernet high-speed NICs: - nVidia/Mellanox: - Use multiple per-queue doorbell, to avoid MMIO contention issues - support adjacent functions, allowing them to delegate their SR-IOV VFs to sibling PFs - support RSS for IPSec offload - support exposing raw cycle counters in PTP and mlx5 - support for disabling host PFs. - Intel (100G, ice, idpf): - ice: support for SRIOV VFs over an Active-Active link aggregate - ice: support for firmware logging via debugfs - ice: support for Earliest TxTime First (ETF) hardware offload - idpf: support basic XDP functionalities and XSk - Broadcom (bnxt): - support Hyper-V VF ID - dynamic SRIOV resource allocations for RoCE - Meta (fbnic): - support queue API, zero-copy Rx and Tx - support basic XDP functionalities - devlink health support for FW crashes and OTP mem corruptions - expand hardware stats coverage to FEC, PHY, and Pause - Wangxun: - support ethtool coalesce options - support for multiple RSS contexts - Ethernet virtual: - Macsec: - replace custom netlink attribute checks with policy-level checks - Bonding: - support aggregator selection based on port priority - Microsoft vNIC: - use page pool fragments for RX buffers instead of full pages to improve memory efficiency - Ethernet NICs consumer, and embedded: - Qualcomm: support Ethernet function for IPQ9574 SoC - Airoha: implement wlan offloading via NPU - Freescale - enetc: add NETC timer PTP driver and add PTP support - fec: enable the Jumbo frame support for i.MX8QM - Renesas (R-Car S4): support HW offloading for layer 2 switching - support for RZ/{T2H, N2H} SoCs - Cadence (macb): support TAPRIO traffic scheduling - TI: - support for Gigabit ICSS ethernet SoC (icssm-prueth) - Synopsys (stmmac): a lot of cleanups - Ethernet PHYs: - Support 10g-qxgmi phy-mode for AQR412C, Felix DSA and Lynx PCS driver - Support bcm63268 GPHY power control - Support for Micrel lan8842 PHY and PTP - Support for Aquantia AQR412 and AQR115 - CAN: - a large CAN-XL preparation work - reorganize raw_sock and uniqframe struct to minimize memory usage - rcar_canfd: update the CAN-FD handling - WiFi: - extended Neighbor Awareness Networking (NAN) support - S1G channel representation cleanup - improve S1G support - WiFi drivers: - Intel (iwlwifi): - major refactor and cleanup - Broadcom (brcm80211): - support for AP isolation - RealTek (rtw88/89) rtw88/89: - preparation work for RTL8922DE support - MediaTek (mt76): - HW restart improvements - MLO support - Qualcomm/Atheros (ath10k_ - GTK rekey fixes - Bluetooth drivers: - btusb: support for several new IDs for MT7925 - btintel: support for BlazarIW core - btintel_pcie: support for _suspend() / _resume() - btintel_pcie: support for Scorpious, Panther Lake-H484 IDs Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmjdBdsSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkWnAP/2Jz0ExnYMDxLxkkKqMte+3JNeF/KNjB zVIslYN/5ekkd1TMXyDLFlWHqUOUMSF9+cK5ZRMHG6/lzOueSzFuuuDFsWs6Kf2f q7Q9MzlXhR9++YCsdES1uS5x3PPjMInzo2ZivMFa6fUDLLFSzeAOKL9k+RS0EggU VlXv2Wkm73R0O6KAssgDsHke9cnNz+F0DzhQ0S3qkyZF9tS5NrDeUl7fZ47XZgwb ZuXdEzKmTTepo2XvZGxJoF2D7nekWFlHhLjEPpDJtET19nwhukCry41/FplJwlKR dWsYkqLkrOEQKFQ+q++/5c3BgFOG+IrQLbP5bGQF1tZRMl4Dqp9PDxK5fKJfccbS 0VY3Y2qWOSYejT2Ji2kEHR5E5rPyFm3Y5A4Rnpz0yEHj14vL2v4zf7CZRkCyyDfD doIZXFGkM0+N7QeXLEN833cje9zjaXuP9GAE+2bb+wAWAZAqof4KX8JgHh+y5Rwm pvUtvFxmEtntlMwNBap8aT3FquGtfZncU8pzAE5kvWjuMvyF0NVWiE5rB2kSQM/X NLmUdvDyjwwJqthXAJG0fl+a0mNJ/kOAqSOKJDJrfKDGWa+ovwY0iY06SpK0wIbO Wz7tpMk5MSlYXW8xWMlbyhvvU6T9xuoQ2KV4QTdMxc6Ir3sNX6YkQr+gjQjxB0gx ST2QF6lZeWFh =w2Kz -----END PGP SIGNATURE----- Merge tag 'net-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core & protocols: - Improve drop account scalability on NUMA hosts for RAW and UDP sockets and the backlog, almost doubling the Pps capacity under DoS - Optimize the UDP RX performance under stress, reducing contention, revisiting the binary layout of the involved data structs and implementing NUMA-aware locking. This improves UDP RX performance by an additional 50%, even more under extreme conditions - Add support for PSP encryption of TCP connections; this mechanism has some similarities with IPsec and TLS, but offers superior HW offloads capabilities - Ongoing work to support Accurate ECN for TCP. AccECN allows more than one congestion notification signal per RTT and is a building block for Low Latency, Low Loss, and Scalable Throughput (L4S) - Reorganize the TCP socket binary layout for data locality, reducing the number of touched cachelines in the fastpath - Refactor skb deferral free to better scale on large multi-NUMA hosts, this improves TCP and UDP RX performances significantly on such HW - Increase the default socket memory buffer limits from 256K to 4M to better fit modern link speeds - Improve handling of setups with a large number of nexthop, making dump operating scaling linearly and avoiding unneeded synchronize_rcu() on delete - Improve bridge handling of VLAN FDB, storing a single entry per bridge instead of one entry per port; this makes the dump order of magnitude faster on large switches - Restore IP ID correctly for encapsulated packets at GSO segmentation time, allowing GRO to merge packets in more scenarios - Improve netfilter matching performance on large sets - Improve MPTCP receive path performance by leveraging recently introduced core infrastructure (skb deferral free) and adopting recent TCP autotuning changes - Allow bridges to redirect to a backup port when the bridge port is administratively down - Introduce MPTCP 'laminar' endpoint that con be used only once per connection and simplify common MPTCP setups - Add RCU safety to dst->dev, closing a lot of possible races - A significant crypto library API for SCTP, MPTCP and IPv6 SR, reducing code duplication - Supports pulling data from an skb frag into the linear area of an XDP buffer Things we sprinkled into general kernel code: - Generate netlink documentation from YAML using an integrated YAML parser Driver API: - Support using IPv6 Flow Label in Rx hash computation and RSS queue selection - Introduce API for fetching the DMA device for a given queue, allowing TCP zerocopy RX on more H/W setups - Make XDP helpers compatible with unreadable memory, allowing more easily building DevMem-enabled drivers with a unified XDP/skbs datapath - Add a new dedicated ethtool callback enabling drivers to provide the number of RX rings directly, improving efficiency and clarity in RX ring queries and RSS configuration - Introduce a burst period for the health reporter, allowing better handling of multiple errors due to the same root cause - Support for DPLL phase offset exponential moving average, controlling the average smoothing factor Device drivers: - Add a new Huawei driver for 3rd gen NIC (hinic3) - Add a new SpacemiT driver for K1 ethernet MAC - Add a generic abstraction for shared memory communication devices (dibps) - Ethernet high-speed NICs: - nVidia/Mellanox: - Use multiple per-queue doorbell, to avoid MMIO contention issues - support adjacent functions, allowing them to delegate their SR-IOV VFs to sibling PFs - support RSS for IPSec offload - support exposing raw cycle counters in PTP and mlx5 - support for disabling host PFs. - Intel (100G, ice, idpf): - ice: support for SRIOV VFs over an Active-Active link aggregate - ice: support for firmware logging via debugfs - ice: support for Earliest TxTime First (ETF) hardware offload - idpf: support basic XDP functionalities and XSk - Broadcom (bnxt): - support Hyper-V VF ID - dynamic SRIOV resource allocations for RoCE - Meta (fbnic): - support queue API, zero-copy Rx and Tx - support basic XDP functionalities - devlink health support for FW crashes and OTP mem corruptions - expand hardware stats coverage to FEC, PHY, and Pause - Wangxun: - support ethtool coalesce options - support for multiple RSS contexts - Ethernet virtual: - Macsec: - replace custom netlink attribute checks with policy-level checks - Bonding: - support aggregator selection based on port priority - Microsoft vNIC: - use page pool fragments for RX buffers instead of full pages to improve memory efficiency - Ethernet NICs consumer, and embedded: - Qualcomm: support Ethernet function for IPQ9574 SoC - Airoha: implement wlan offloading via NPU - Freescale - enetc: add NETC timer PTP driver and add PTP support - fec: enable the Jumbo frame support for i.MX8QM - Renesas (R-Car S4): - support HW offloading for layer 2 switching - support for RZ/{T2H, N2H} SoCs - Cadence (macb): support TAPRIO traffic scheduling - TI: - support for Gigabit ICSS ethernet SoC (icssm-prueth) - Synopsys (stmmac): a lot of cleanups - Ethernet PHYs: - Support 10g-qxgmi phy-mode for AQR412C, Felix DSA and Lynx PCS driver - Support bcm63268 GPHY power control - Support for Micrel lan8842 PHY and PTP - Support for Aquantia AQR412 and AQR115 - CAN: - a large CAN-XL preparation work - reorganize raw_sock and uniqframe struct to minimize memory usage - rcar_canfd: update the CAN-FD handling - WiFi: - extended Neighbor Awareness Networking (NAN) support - S1G channel representation cleanup - improve S1G support - WiFi drivers: - Intel (iwlwifi): - major refactor and cleanup - Broadcom (brcm80211): - support for AP isolation - RealTek (rtw88/89) rtw88/89: - preparation work for RTL8922DE support - MediaTek (mt76): - HW restart improvements - MLO support - Qualcomm/Atheros (ath10k): - GTK rekey fixes - Bluetooth drivers: - btusb: support for several new IDs for MT7925 - btintel: support for BlazarIW core - btintel_pcie: support for _suspend() / _resume() - btintel_pcie: support for Scorpious, Panther Lake-H484 IDs" * tag 'net-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1536 commits) net: stmmac: Add support for Allwinner A523 GMAC200 dt-bindings: net: sun8i-emac: Add A523 GMAC200 compatible Revert "Documentation: net: add flow control guide and document ethtool API" octeontx2-pf: fix bitmap leak octeontx2-vf: fix bitmap leak net/mlx5e: Use extack in set rxfh callback net/mlx5e: Introduce mlx5e_rss_params for RSS configuration net/mlx5e: Introduce mlx5e_rss_init_params net/mlx5e: Remove unused mdev param from RSS indir init net/mlx5: Improve QoS error messages with actual depth values net/mlx5e: Prevent entering switchdev mode with inconsistent netns net/mlx5: HWS, Generalize complex matchers net/mlx5: Improve write-combining test reliability for ARM64 Grace CPUs selftests/net: add tcp_port_share to .gitignore Revert "net/mlx5e: Update and set Xon/Xoff upon MTU set" net: add NUMA awareness to skb_attempt_defer_free() net: use llist for sd->defer_list net: make softnet_data.defer_count an atomic selftests: drv-net: psp: add tests for destroying devices selftests: drv-net: psp: add test for auto-adjusting TCP MSS ...	2025-10-02 15:17:01 -07:00
Eduard Zingerman	2ce61c63e7	selftests/bpf: Add tests for rejection of ALU ops with negative offsets Define a simple program using MOD, DIV, ADD instructions, make sure that the program is rejected if invalid offset field is used for instruction. These are test cases for commit `55c0ced59f` ("bpf: Reject negative offsets for ALU ops") Link: https://lore.kernel.org/all/tencent_70D024BAE70A0A309A4781694C7B764B0608@qq.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-02 13:34:08 -07:00
Eric Biggers	f09f57c746	selftests/bpf: Add test for libbpf_sha256() Test that libbpf_sha256() calculates SHA-256 digests correctly. Tested with: make -C tools/testing/selftests/bpf/ ./tools/testing/selftests/bpf/test_progs -t sha256 -v Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-01 16:00:25 -07:00
KaFai Wan	8709c16852	selftests/bpf: Add test for BPF_NEG alu on CONST_PTR_TO_MAP Add a test case for BPF_NEG operation on CONST_PTR_TO_MAP. Tests if BPF_NEG operation on map_ptr is rejected in unprivileged mode and is a scalar value and do not trigger Oops in privileged mode. Signed-off-by: KaFai Wan <kafai.wan@linux.dev> Signed-off-by: Brahmajit Das <listout@listout.xyz> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251001191739.2323644-3-listout@listout.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-01 13:55:00 -07:00
Jiri Olsa	0c342bfc99	selftests/bpf: Fix realloc size in bpf_get_addrs We will segfault once we call realloc in bpf_get_addrs due to wrong size argument. Fixes: `6302bdeb91` ("selftests/bpf: Add a kprobe_multi subtest to use addrs instead of syms") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-01 13:37:30 -07:00
Jiri Olsa	4b2b38ea20	selftests/bpf: Fix typo in subtest_basic_usdt after merge conflict Use proper 'called' variable name. Fixes: `ae28ed4578` ("Merge tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/aN0JVRynHxqKy4lw@krava/ Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-01 13:37:30 -07:00
Jiri Olsa	f83fcec784	selftests/bpf: Fix open-coded gettid syscall in uprobe syscall tests Commit `0e2fb011a0` ("selftests/bpf: Clean up open-coded gettid syscall invocations") addressed the issue that older libc may not have a gettid() function call wrapper for the associated syscall. The uprobe syscall tests got in from tip tree, using sys_gettid in there. Fixes: `0e2fb011a0` ("selftests/bpf: Clean up open-coded gettid syscall invocations") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-10-01 13:37:29 -07:00
Linus Torvalds	ae28ed4578	bpf-next-6.18 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmjZH40ACgkQ6rmadz2v bTrG7w//X/5CyDoKIYJCqynYRdMtfqYuCe8Jhud4p5++iBVqkDyS6Y8EFLqZVyg/ UHTqaSE4Nz8/pma0WSjhUYn6Chs1AeH+Rw/g109SovE/YGkek2KNwY3o2hDrtPMX +oD0my8qF2HLKgEyteXXyZ5Ju+AaF92JFiGko4/wNTX8O99F9nyz2pTkrctS9Vl9 VwuTxrEXpmhqrhP3WCxkfNfcbs9HP+AALpgOXZKdMI6T4KI0N1gnJ0ZWJbiXZ8oT tug0MTPkNRidYMl0wHY2LZ6ZG8Q3a7Sgc+M0xFzaHGvGlJbBg1HjsDMtT6j34CrG TIVJ/O8F6EJzAnQ5Hio0FJk8IIgMRgvng5Kd5GXidU+mE6zokTyHIHOXitYkBQNH Hk+lGA7+E2cYqUqKvB5PFoyo+jlucuIH7YwrQlyGfqz+98n65xCgZKcmdVXr0hdB 9v3WmwJFtVIoPErUvBC3KRANQYhFk4eVk1eiGV/20+eIVyUuNbX6wqSWSA9uEXLy n5fm/vlk4RjZmrPZHxcJ0dsl9LTF1VvQQHkgoC1Sz/Cc+jA6k4I+ECVHAqEbk36p 1TUF52yPOD2ViaJKkj+962JaaaXlUn6+Dq7f1GMP6VuyHjz4gsI3mOo4XarqNdWd c7TnYmlGO/cGwqd4DdbmWiF1DDsrBcBzdbC8+FgffxQHLPXGzUg= =LeQi -----END PGP SIGNATURE----- Merge tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Support pulling non-linear xdp data with bpf_xdp_pull_data() kfunc (Amery Hung) Applied as a stable branch in bpf-next and net-next trees. - Support reading skb metadata via bpf_dynptr (Jakub Sitnicki) Also a stable branch in bpf-next and net-next trees. - Enforce expected_attach_type for tailcall compatibility (Daniel Borkmann) - Replace path-sensitive with path-insensitive live stack analysis in the verifier (Eduard Zingerman) This is a significant change in the verification logic. More details, motivation, long term plans are in the cover letter/merge commit. - Support signed BPF programs (KP Singh) This is another major feature that took years to materialize. Algorithm details are in the cover letter/marge commit - Add support for may_goto instruction to s390 JIT (Ilya Leoshkevich) - Add support for may_goto instruction to arm64 JIT (Puranjay Mohan) - Fix USDT SIB argument handling in libbpf (Jiawei Zhao) - Allow uprobe-bpf program to change context registers (Jiri Olsa) - Support signed loads from BPF arena (Kumar Kartikeya Dwivedi and Puranjay Mohan) - Allow access to union arguments in tracing programs (Leon Hwang) - Optimize rcu_read_lock() + migrate_disable() combination where it's used in BPF subsystem (Menglong Dong) - Introduce bpf_task_work_schedule() kfuncs to schedule deferred execution of BPF callback in the context of a specific task using the kernel’s task_work infrastructure (Mykyta Yatsenko) - Enforce RCU protection for KF_RCU_PROTECTED kfuncs (Kumar Kartikeya Dwivedi) - Add stress test for rqspinlock in NMI (Kumar Kartikeya Dwivedi) - Improve the precision of tnum multiplier verifier operation (Nandakumar Edamana) - Use tnums to improve is_branch_taken() logic (Paul Chaignon) - Add support for atomic operations in arena in riscv JIT (Pu Lehui) - Report arena faults to BPF error stream (Puranjay Mohan) - Search for tracefs at /sys/kernel/tracing first in bpftool (Quentin Monnet) - Add bpf_strcasecmp() kfunc (Rong Tao) - Support lookup_and_delete_elem command in BPF_MAP_STACK_TRACE (Tao Chen) tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (197 commits) libbpf: Replace AF_ALG with open coded SHA-256 selftests/bpf: Add stress test for rqspinlock in NMI selftests/bpf: Add test case for different expected_attach_type bpf: Enforce expected_attach_type for tailcall compatibility bpftool: Remove duplicate string.h header bpf: Remove duplicate crypto/sha2.h header libbpf: Fix error when st-prefix_ops and ops from differ btf selftests/bpf: Test changing packet data from kfunc selftests/bpf: Add stacktrace map lookup_and_delete_elem test case selftests/bpf: Refactor stacktrace_map case with skeleton bpf: Add lookup_and_delete_elem for BPF_MAP_STACK_TRACE selftests/bpf: Fix flaky bpf_cookie selftest selftests/bpf: Test changing packet data from global functions with a kfunc bpf: Emit struct bpf_xdp_sock type in vmlinux BTF selftests/bpf: Task_work selftest cleanup fixes MAINTAINERS: Delete inactive maintainers from AF_XDP bpf: Mark kfuncs as __noclone selftests/bpf: Add kprobe multi write ctx attach test selftests/bpf: Add kprobe write ctx attach test selftests/bpf: Add uprobe context ip register change test ...	2025-09-30 17:58:11 -07:00
Linus Torvalds	e4dcbdff11	Performance events updates for v6.18: Core perf code updates: - Convert mmap() related reference counts to refcount_t. This is in reaction to the recently fixed refcount bugs, which could have been detected earlier and could have mitigated the bug somewhat. (Thomas Gleixner, Peter Zijlstra) - Clean up and simplify the callchain code, in preparation for sframes. (Steven Rostedt, Josh Poimboeuf) Uprobes updates: - Add support to optimize usdt probes on x86-64, which gives a substantial speedup. (Jiri Olsa) - Cleanups and fixes on x86 (Peter Zijlstra) PMU driver updates: - Various optimizations and fixes to the Intel PMU driver (Dapeng Mi) Misc cleanups and fixes: - Remove redundant __GFP_NOWARN (Qianfeng Rong) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmjWpGIRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1iHvxAAvO8qWbbhUdF3EZaFU0Wx6oh5KBhImU49 VZ107xe9llA0Szy3hIl1YpdOQA2NAHtma6We/ebonrPVTTkcSCGq8absc+GahA3I CHIomx2hjD0OQ01aHvTqgHJUdFUQQ0yzE3+FY6Tsn05JsNZvDmqpAMIoMQT0LuuG 7VvVRLBuDXtuMtNmGaGCvfDGKTZkGGxD6iZS1iWHuixvVAz4IECK0vYqSyh31UGA w9Jwa0thwjKm2EZTmcSKaHSM2zw3N8QXJ3SNPPThuMrtO6QDz2+3Da9kO+vhGcRP Jls9KnWC2wxNxqIs3dr80Mzn4qMplc67Ekx2tUqX4tYEGGtJQxW6tm3JOKKIgFMI g/KF9/WJPXp0rVI9mtoQkgndzyswR/ZJBAwfEQu+nAqlp3gmmQR9+MeYPCyNnyhB 2g22PTMbXkihJmRPAVeH+WhwFy1YY3nsRhh61ha3/N0ULXTHUh0E+hWwUVMifYSV SwXqQx4srlo6RJJNTji1d6R3muNjXCQNEsJ0lCOX6ajVoxWZsPH2x7/W1A8LKmY+ FLYQUi6X9ogQbOO3WxCjUhzp5nMTNA2vvo87MUzDlZOCLPqYZmqcjntHuXwdjPyO lPcfTzc2nK1Ud26bG3+p2Bk3fjqkX9XcTMFniOvjKfffEfwpAq4xRPBQ3uRlzn0V pf9067JYF+c= =sVXH -----END PGP SIGNATURE----- Merge tag 'perf-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: "Core perf code updates: - Convert mmap() related reference counts to refcount_t. This is in reaction to the recently fixed refcount bugs, which could have been detected earlier and could have mitigated the bug somewhat (Thomas Gleixner, Peter Zijlstra) - Clean up and simplify the callchain code, in preparation for sframes (Steven Rostedt, Josh Poimboeuf) Uprobes updates: - Add support to optimize usdt probes on x86-64, which gives a substantial speedup (Jiri Olsa) - Cleanups and fixes on x86 (Peter Zijlstra) PMU driver updates: - Various optimizations and fixes to the Intel PMU driver (Dapeng Mi) Misc cleanups and fixes: - Remove redundant __GFP_NOWARN (Qianfeng Rong)" * tag 'perf-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits) selftests/bpf: Fix uprobe_sigill test for uprobe syscall error value uprobes/x86: Return error from uprobe syscall when not called from trampoline perf: Skip user unwind if the task is a kernel thread perf: Simplify get_perf_callchain() user logic perf: Use current->flags & PF_KTHREAD\|PF_USER_WORKER instead of current->mm == NULL perf: Have get_perf_callchain() return NULL if crosstask and user are set perf: Remove get_perf_callchain() init_nr argument perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap() perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error perf/x86/intel: Use early_initcall() to hook bts_init() uprobes: Remove redundant __GFP_NOWARN selftests/seccomp: validate uprobe syscall passes through seccomp seccomp: passthrough uprobe systemcall without filtering selftests/bpf: Fix uprobe syscall shadow stack test selftests/bpf: Change test_uretprobe_regs_change for uprobe and uretprobe selftests/bpf: Add uprobe_regs_equal test selftests/bpf: Add optimized usdt variant for basic usdt test ...	2025-09-30 11:11:21 -07:00
Kumar Kartikeya Dwivedi	15cf39221e	selftests/bpf: Add stress test for rqspinlock in NMI Introduce a kernel module that will exercise lock acquisition in the NMI path, and bias toward creating contention such that NMI waiters end up being non-head waiters. Prior to the rqspinlock fix made in the commit `0d80e7f951` ("rqspinlock: Choose trylock fallback for NMI waiters"), it was possible for the queueing path of non-head waiters to get stuck in NMI, which this stress test reproduces fairly easily with just 3 CPUs. Both AA and ABBA flavors are supported, and it will serve as a test case for future fixes that address this corner case. More information about the problem in question is available in the commit cited above. When the fix is reverted, this stress test will lock up the system. To enable this test automatically through the test_progs infrastructure, add a load_module_params API to exercise both AA and ABBA cases when running the test. Note that the test runs for at most 5 seconds, and becomes a noop after that, in order to allow the system to make forward progress. In addition, CPU 0 is always kept untouched by the created threads and NMIs. The test will automatically scale to the number of available online CPUs. Note that at least 3 CPUs are necessary to run this test, hence skip the selftest in case the environment has less than 3 CPUs available. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250927205304.199760-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-28 03:18:40 -07:00
Daniel Borkmann	0e8e60e86c	selftests/bpf: Add test case for different expected_attach_type Add a small test case which adds two programs - one calling the other through a tailcall - and check that BPF rejects them in case of different expected_attach_type values: # ./vmtest.sh -- ./test_progs -t xdp_devmap [...] #641/1 xdp_devmap_attach/DEVMAP with programs in entries:OK #641/2 xdp_devmap_attach/DEVMAP with frags programs in entries:OK #641/3 xdp_devmap_attach/Verifier check of DEVMAP programs:OK #641/4 xdp_devmap_attach/DEVMAP with programs in entries on veth:OK #641 xdp_devmap_attach:OK Summary: 2/4 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20250926171201.188490-2-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-27 06:24:27 -07:00
Matthieu Baerts (NGI0)	c5273f6ca1	mptcp: pm: rename 'subflows' to 'extra_subflows' A few variables linked to the Path-Managers are confusing, and it would help current and future developers, to clarify them. One of them is 'subflows', which in fact represents the number of extra subflows: all the additional subflows created after the initial one, and not the total number of subflows. While at it, add an additional name for the corresponding variable in MPTCP INFO: mptcpi_extra_subflows. Not to break the current uAPI, the new name is added as a 'define' pointing to the former name. This will then also help userspace devs. No functional changes intended. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-5-ad126cc47c6b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-26 17:44:04 -07:00
Amery Hung	991e555eff	selftests/bpf: Test changing packet data from kfunc bpf_xdp_pull_data() is the first kfunc that changes packet data. Make sure the verifier clear all packet pointers after calling packet data changing kfunc. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250926164142.1850176-1-ameryhung@gmail.com	2025-09-26 10:44:51 -07:00
Tao Chen	d43029ff7d	selftests/bpf: Add stacktrace map lookup_and_delete_elem test case Add tests for stacktrace map lookup and delete: 1. use bpf_map_lookup_and_delete_elem to lookup and delete the target stack_id, 2. lookup the deleted stack_id again to double check. Signed-off-by: Tao Chen <chen.dylane@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250925175030.1615837-3-chen.dylane@linux.dev	2025-09-25 16:17:30 -07:00
Tao Chen	363b17e273	selftests/bpf: Refactor stacktrace_map case with skeleton The loading method of the stacktrace_map test case looks too outdated, refactor it with skeleton, and we can use global variable feature in the next patch. Signed-off-by: Tao Chen <chen.dylane@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250925175030.1615837-2-chen.dylane@linux.dev	2025-09-25 16:17:14 -07:00
Mykyta Yatsenko	105eb5dc74	selftests/bpf: Fix flaky bpf_cookie selftest bpf_cookie can fail on perf_event_open(), when it runs after the task_work selftest. The task_work test causes perf to lower sysctl_perf_event_sample_rate, and bpf_cookie uses sample_freq, which is validated against that sysctl. As a result, perf_event_open() rejects the attr if the (now tighter) limit is exceeded. >From perf_event_open(): if (attr.freq) { if (attr.sample_freq > sysctl_perf_event_sample_rate) return -EINVAL; } else { if (attr.sample_period & (1ULL << 63)) return -EINVAL; } Switch bpf_cookie to use sample_period, which is not checked against sysctl_perf_event_sample_rate. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250925215230.265501-1-mykyta.yatsenko5@gmail.com	2025-09-25 15:55:44 -07:00
Amery Hung	1193c46c17	selftests/bpf: Test changing packet data from global functions with a kfunc The verifier should invalidate all packet pointers after a packet data changing kfunc is called. So, similar to commit `3f23ee5590` ("selftests/bpf: test for changing packet data from global functions"), test changing packet data from global functions to make sure packet pointers are indeed invalidated. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250925170013.1752561-2-ameryhung@gmail.com	2025-09-25 14:52:09 -07:00
Mykyta Yatsenko	5730dacb3f	selftests/bpf: Task_work selftest cleanup fixes task_work selftest does not properly handle cleanup during failures: * destroy bpf_link * perf event fd is passed to bpf_link, no need to close it if link was created successfully * goto cleanup if fork() failed, close pipe. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250924142954.129519-2-mykyta.yatsenko5@gmail.com	2025-09-25 11:00:01 -07:00
Jakub Kicinski	5e3fee34f6	bpf-next-for-netdev -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCaNNwBQAKCRAraS+Z+3y5 E8heAQDdJTR9rwAL7gD79cldlHP5PTmjyidLIoFG/efaGSbN1AD9EdvrykDU4xOG aGaO8TooGUZf7vAL8tIFuMeydYvi/gM= =Qu4T -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-09-23 We've added 9 non-merge commits during the last 33 day(s) which contain a total of 10 files changed, 480 insertions(+), 53 deletions(-). The main changes are: 1) A new bpf_xdp_pull_data kfunc that supports pulling data from a frag into the linear area of a xdp_buff, from Amery Hung. This includes changes in the xdp_native.bpf.c selftest, which Nimrod's future work depends on. It is a merge from a stable branch 'xdp_pull_data' which has also been merged to bpf-next. There is a conflict with recent changes in 'include/net/xdp.h' in the net-next tree that will need to be resolved. 2) A compiler warning fix when CONFIG_NET=n in the recent dynptr skb_meta support, from Jakub Sitnicki. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests: drv-net: Pull data before parsing headers selftests/bpf: Test bpf_xdp_pull_data bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN bpf: Make variables in bpf_prog_test_run_xdp less confusing bpf: Clear packet pointers after changing packet data in kfuncs bpf: Support pulling non-linear xdp data bpf: Allow bpf_xdp_shrink_data to shrink a frag from head and tail bpf: Clear pfmemalloc flag when freeing all fragments bpf: Return an error pointer for skb metadata when CONFIG_NET=n ==================== Link: https://patch.msgid.link/20250924050303.2466356-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-24 10:22:37 -07:00
Jiri Olsa	3d237467a4	selftests/bpf: Add kprobe multi write ctx attach test Adding test to check we can't attach kprobe multi program that writes to the context. It's x86_64 specific test. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20250916215301.664963-7-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-24 02:25:06 -07:00
Jiri Olsa	1b881ee294	selftests/bpf: Add kprobe write ctx attach test Adding test to check we can't attach standard kprobe program that writes to the context. It's x86_64 specific test. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20250916215301.664963-6-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-24 02:25:06 -07:00
Jiri Olsa	6a4ea0d1cb	selftests/bpf: Add uprobe context ip register change test Adding test to check we can change the application execution through instruction pointer change through uprobe program. It's x86_64 specific test. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20250916215301.664963-5-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-24 02:25:06 -07:00
Jiri Olsa	7f8a05c5d3	selftests/bpf: Add uprobe context registers changes test Adding test to check we can change common register values through uprobe program. It's x86_64 specific test. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20250916215301.664963-4-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-24 02:25:06 -07:00
Martin KaFai Lau	34f033a6c9	Merge branch 'bpf-next/xdp_pull_data' into 'bpf-next/master' Merge the xdp_pull_data stable branch into the master branch. No conflict. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2025-09-23 16:23:58 -07:00
Amery Hung	323302f54d	selftests/bpf: Test bpf_xdp_pull_data Test bpf_xdp_pull_data() with xdp packets with different layouts. The xdp bpf program first checks if the layout is as expected. Then, it calls bpf_xdp_pull_data(). Finally, it checks the 0xbb marker at offset 1024 using directly packet access. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250922233356.3356453-8-ameryhung@gmail.com	2025-09-23 13:35:12 -07:00
Amery Hung	fe9544ed1a	bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN To test bpf_xdp_pull_data(), an xdp packet containing fragments as well as free linear data area after xdp->data_end needs to be created. However, bpf_prog_test_run_xdp() always fills the linear area with data_in before creating fragments, leaving no space to pull data. This patch will allow users to specify the linear data size through ctx->data_end. Currently, ctx_in->data_end must match data_size_in and will not be the final ctx->data_end seen by xdp programs. This is because ctx->data_end is populated according to the xdp_buff passed to test_run. The linear data area available in an xdp_buff, max_linear_sz, is alawys filled up before copying data_in into fragments. This patch will allow users to specify the size of data that goes into the linear area. When ctx_in->data_end is different from data_size_in, only ctx_in->data_end bytes of data will be put into the linear area when creating the xdp_buff. While ctx_in->data_end will be allowed to be different from data_size_in, it cannot be larger than the data_size_in as there will be no data to copy from user space. If it is larger than the maximum linear data area size, the layout suggested by the user will not be honored. Data beyond max_linear_sz bytes will still be copied into fragments. Finally, since it is possible for a NIC to produce a xdp_buff with empty linear data area, allow it when calling bpf_test_init() from bpf_prog_test_run_xdp() so that we can test XDP kfuncs with such xdp_buff. This is done by moving lower-bound check to callers as most of them already do except bpf_prog_test_run_skb(). The change also fixes a bug that allows passing an xdp_buff with data < ETH_HLEN. This can happen when ctx is used and metadata is at least ETH_HLEN. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250922233356.3356453-7-ameryhung@gmail.com	2025-09-23 13:35:12 -07:00
Leon Hwang	1c6686bf7f	selftests/bpf: Add union argument tests using fexit programs Add test coverage for union argument support using fexit programs: * 8B union argument - verify that the verifier accepts it and that fexit programs can trace such functions. * 16B union argument - verify that the verifier accepts it and that fexit programs can access the argument, which is passed using two registers. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20250919044110.23729-3-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-23 12:07:47 -07:00
Puranjay Mohan	f616549124	selftests: bpf: Add tests for signed loads from arena Add tests for loading 8, 16, and 32 bits with sign extension from arena, also verify that exception handling is working correctly and correct assembly is being generated by the x86 and arm64 JITs. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20250923110157.18326-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-23 12:00:23 -07:00
Mykyta Yatsenko	c6ae18e0af	selftests/bpf: add bpf task work stress tests Add stress tests for BPF task-work scheduling kfuncs. The tests spawn multiple threads that concurrently schedule task_work callbacks against the same and different map values to exercise the kfuncs under high contention. Verify callbacks are reliably enqueued and executed with no drops. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20250923112404.668720-10-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-23 07:34:39 -07:00
Mykyta Yatsenko	39fd74dfd5	selftests/bpf: BPF task work scheduling tests Introducing selftests that check BPF task work scheduling mechanism. Validate that verifier does not accepts incorrect calls to bpf_task_work_schedule kfunc. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250923112404.668720-9-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-23 07:34:39 -07:00
KP Singh	b720903e2b	selftests/bpf: Enable signature verification for some lskel tests The test harness uses the verify_sig_setup.sh to generate the required key material for program signing. Generate key material for signing LSKEL some lskel programs and use xxd to convert the verification certificate into a C header file. Finally, update the main test runner to load this certificate into the session keyring via the add_key() syscall before executing any tests. Use the session keyring in the tests with signed programs. Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250921160120.9711-6-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-22 19:17:55 -07:00
Yonghong Song	5a427fddec	selftests/bpf: Fix selftest verifier_arena_large failure With latest llvm22, I got the following verification failure: ... ; int big_alloc2(void ctx) @ verifier_arena_large.c:207 0: (b4) w6 = 1 ; R6_w=1 ... ; if (err) @ verifier_arena_large.c:233 53: (56) if w6 != 0x0 goto pc+62 ; R6=0 54: (b7) r7 = -4 ; R7_w=-4 55: (18) r8 = 0x7f4000000000 ; R8_w=scalar() 57: (bf) r9 = addr_space_cast(r8, 0, 1) ; R8_w=scalar() R9_w=arena 58: (b4) w6 = 5 ; R6_w=5 ; pg = page[i]; @ verifier_arena_large.c:238 59: (bf) r1 = r7 ; R1_w=-4 R7_w=-4 60: (07) r1 += 4 ; R1_w=0 61: (79) r2 = (u64 )(r9 +0) ; R2_w=scalar() R9_w=arena ; if (pg != i) @ verifier_arena_large.c:239 62: (bf) r3 = addr_space_cast(r2, 0, 1) ; R2_w=scalar() R3_w=arena 63: (71) r3 = (u8 )(r3 +0) ; R3_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) 64: (5d) if r1 != r3 goto pc+51 ; R1_w=0 R3_w=0 ; bpf_arena_free_pages(&arena, (void __arena )pg, 2); @ verifier_arena_large.c:241 65: (18) r1 = 0xff11000114548000 ; R1_w=map_ptr(map=arena,ks=0,vs=0) 67: (b4) w3 = 2 ; R3_w=2 68: (85) call bpf_arena_free_pages#72675 ; 69: (b7) r1 = 0 ; R1_w=0 ; page[i + 1] = NULL; @ verifier_arena_large.c:243 70: (7b) (u64 *)(r8 +8) = r1 R8 invalid mem access 'scalar' processed 61 insns (limit 1000000) max_states_per_insn 0 total_states 6 peak_states 6 mark_read 2 ============= #489/5 verifier_arena_large/big_alloc2:FAIL The main reason is that 'r8' in insn '70' is not an arena pointer. Further debugging at llvm side shows that llvm commit ([1]) caused the failure. For the original code: page[i] = NULL; page[i + 1] = NULL; the llvm transformed it to something like below at source level: __builtin_memset(&page[i], 0, 16) Such transformation prevents llvm BPFCheckAndAdjustIR pass from generating proper addr_space_cast insns ([2]). Adding support in llvm BPFCheckAndAdjustIR pass should work, but not sure that such a pattern exists or not in real applications. At the same time, simply adding a memory barrier between two 'page' assignment can fix the issue. [1] https://github.com/llvm/llvm-project/pull/155415 [2] https://github.com/llvm/llvm-project/pull/84410 Cc: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250920045805.3288551-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-20 15:49:35 -07:00
Eduard Zingerman	fdcecdff90	selftests/bpf: test cases for callchain sensitive live stack tracking - simple propagation of read/write marks; - joining read/write marks from conditional branches; - avoid must_write marks in when same instruction accesses different stack offsets on different execution paths; - avoid must_write marks in case same instruction accesses stack and non-stack pointers on different execution paths; - read/write marks propagation to outer stack frame; - independent read marks for different callchains ending with the same function; - bpf_calls_callback() dependent logic in liveness.c:bpf_stack_slot_alive(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-12-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:24 -07:00
Eduard Zingerman	34c513be3d	selftests/bpf: __not_msg() tag for test_loader framework This patch adds tags __not_msg(<msg>) and __not_msg_unpriv(<msg>). Test fails if <msg> is found in verifier log. If __msg_not() is situated between __msg() tags framework matches __msg() tags first, and then checks that <msg> is not present in a portion of a log between bracketing __msg() tags. __msg_not() tags bracketed by a same __msg() group are effectively unordered. The idea is borrowed from LLVM's CheckFile with its CHECK-NOT syntax. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-11-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00
Eduard Zingerman	107e169799	bpf: disable and remove registers chain based liveness Remove register chain based liveness tracking: - struct bpf_reg_state->{parent,live} fields are no longer needed; - REG_LIVE_WRITTEN marks are superseded by bpf_mark_stack_write() calls; - mark_reg_read() calls are superseded by bpf_mark_stack_read(); - log.c:print_liveness() is superseded by logging in liveness.c; - propagate_liveness() is superseded by bpf_update_live_stack(); - no need to establish register chains in is_state_visited() anymore; - fix a bunch of tests expecting "_w" suffixes in verifier log messages. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-9-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:23 -07:00
KP Singh	ea2e6467ac	bpf: Return hashes of maps in BPF_OBJ_GET_INFO_BY_FD Currently only array maps are supported, but the implementation can be extended for other maps and objects. The hash is memoized only for exclusive and frozen maps as their content is stable until the exclusive program modifies the map. This is required for BPF signing, enabling a trusted loader program to verify a map's integrity. The loader retrieves the map's runtime hash from the kernel and compares it against an expected hash computed at build time. Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-7-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 19:11:42 -07:00
KP Singh	6c850cbca8	selftests/bpf: Add tests for exclusive maps Check if access is denied to another program for an exclusive map Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-6-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 19:11:42 -07:00
Kumar Kartikeya Dwivedi	8b788d6638	selftests/bpf: Add tests for KF_RCU_PROTECTED Add a couple of test cases to ensure RCU protection is kicked in automatically, and the return type is as expected. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250917032755.4068726-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 15:36:17 -07:00
Kumar Kartikeya Dwivedi	1512231b6c	bpf: Enforce RCU protection for KF_RCU_PROTECTED Currently, KF_RCU_PROTECTED only applies to iterator APIs and that too in a convoluted fashion: the presence of this flag on the kfunc is used to set MEM_RCU in iterator type, and the lack of RCU protection results in an error only later, once next() or destroy() methods are invoked on the iterator. While there is no bug, this is certainly a bit unintuitive, and makes the enforcement of the flag iterator specific. In the interest of making this flag useful for other upcoming kfuncs, e.g. scx_bpf_cpu_curr() [0][1], add enforcement for invoking the kfunc in an RCU critical section in general. This would also mean that iterator APIs using KF_RCU_PROTECTED will error out earlier, instead of throwing an error for lack of RCU CS protection when next() or destroy() methods are invoked. In addition to this, if the kfuncs tagged KF_RCU_PROTECTED return a pointer value, ensure that this pointer value is only usable in an RCU critical section. There might be edge cases where the return value is special and doesn't need to imply MEM_RCU semantics, but in general, the assumption should hold for the majority of kfuncs, and we can revisit things if necessary later. [0]: https://lore.kernel.org/all/20250903212311.369697-3-christian.loehle@arm.com [1]: https://lore.kernel.org/all/20250909195709.92669-1-arighi@nvidia.com Tested-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250917032755.4068726-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-18 15:36:17 -07:00
Eduard Zingerman	a24a2dda70	selftests/bpf: trigger verifier.c:maybe_exit_scc() for a speculative state This is a test case minimized from a syzbot reproducer from [1]. The test case triggers verifier.c:maybe_exit_scc() w/o preceding call to verifier.c:maybe_enter_scc() on a speculative symbolic execution path. Here is verifier log for the test case: Live regs before insn: 0: .......... (b7) r0 = 100 1 1: 0......... (7b) (u64 )(r10 -512) = r0 1 2: 0......... (b5) if r0 <= 0x0 goto pc-2 3: 0......... (95) exit 0: R1=ctx() R10=fp0 0: (b7) r0 = 100 ; R0_w=100 1: (7b) (u64 )(r10 -512) = r0 ; R0_w=100 R10=fp0 fp-512_w=100 2: (b5) if r0 <= 0x0 goto pc-2 mark_precise: ... 2: R0_w=100 3: (95) exit from 2 to 1 (speculative execution): R0_w=scalar() R1=ctx() R10=fp0 fp-512_w=100 1: R0_w=scalar() R1=ctx() R10=fp0 fp-512_w=100 1: (7b) (u64 )(r10 -512) = r0 processed 5 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 - Non-speculative execution path 0-3 does not allocate any checkpoints (and hence does not call maybe_enter_scc()), and schedules a speculative jump from 2 to 1. - Speculative execution path stops immediately because of an infinite loop detection and triggers verifier.c:update_branch_counts() -> maybe_exit_scc() calls. [1] https://lore.kernel.org/bpf/68c85acd.050a0220.2ff435.03a4.GAE@google.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250916212251.3490455-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-17 11:19:58 -07:00
Paul Chaignon	180a46bc1a	selftests/bpf: Test accesses to ctx padding This patch adds tests covering the various paddings in ctx structures. In case of sk_lookup BPF programs, the behavior is a bit different because accesses to the padding are explicitly allowed. Other cases result in a clear reject from the verifier. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/3dc5f025e350aeb2bb1c257b87c577518e574aeb.1758094761.git.paul.chaignon@gmail.com	2025-09-17 16:15:57 +02:00
Paul Chaignon	7c60f6e488	selftests/bpf: Move macros to bpf_misc.h Move the sizeof_field and offsetofend macros from individual test files to the common bpf_misc.h to avoid duplication. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/97a3f3788bd3aec309100bc073a5c77130e371fd.1758094761.git.paul.chaignon@gmail.com	2025-09-17 16:15:37 +02:00
Al Viro	1b8abbb121	bpf...d_path(): constify path argument Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2025-09-15 21:17:08 -04:00
Alan Maguire	3ae4c52708	selftests/bpf: More open-coded gettid syscall cleanup Commit `0e2fb011a0` ("selftests/bpf: Clean up open-coded gettid syscall invocations") addressed the issue that older libc may not have a gettid() function call wrapper for the associated syscall. A few more instances have crept into tests, use sys_gettid() instead, and poison raw gettid() usage to avoid future issues. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20250911163056.543071-1-alan.maguire@oracle.com	2025-09-15 20:39:24 +02:00
Kumar Kartikeya Dwivedi	a8250d167c	selftests/bpf: Add a test for bpf_cgroup_from_id lookup in non-root cgns Make sure that we only switch the cgroup namespace and enter a new cgroup in a child process separate from test_progs, to not mess up the environment for subsequent tests. To remove this cgroup, we need to wait for the child to exit, and then rmdir its cgroup. If the read call fails, or waitpid succeeds, we know the child exited (read call would fail when the last pipe end is closed, otherwise waitpid waits until exit(2) is called). We then invoke a newly introduced remove_cgroup_pid() helper, that identifies cgroup path using the passed in pid of the now dead child, instead of using the current process pid (getpid()). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250915032618.1551762-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-15 10:53:15 -07:00
Saket Kumar Bhaskar	a9d4e9f0e8	selftests/bpf: Fix arena_spin_lock selftest failure For systems having CONFIG_NR_CPUS set to > 1024 in kernel config the selftest fails as arena_spin_lock_irqsave() returns EOPNOTSUPP. (eg - incase of powerpc default value for CONFIG_NR_CPUS is 8192) The selftest is skipped incase bpf program returns EOPNOTSUPP, with a descriptive message logged. Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Link: https://lore.kernel.org/r/20250913091337.1841916-1-skb99@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-15 10:49:18 -07:00
Leon Hwang	f7528e4412	selftests/bpf: Skip timer_interrupt case when bpf_timer is not supported Like commit `fbdd61c94b` ("selftests/bpf: Skip timer cases when bpf_timer is not supported"), 'timer_interrupt' test case should be skipped if verifier rejects bpf_timer with returning -EOPNOTSUPP. cd tools/testing/selftests/bpf ./test_progs -t timer 461 timer_interrupt:SKIP Summary: 6/0 PASSED, 7 SKIPPED, 0 FAILED Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250915121657.28084-1-leon.hwang@linux.dev	2025-09-15 10:22:58 -07:00
Jiri Olsa	6d48436560	selftests/bpf: Fix uprobe_sigill test for uprobe syscall error value The uprobe syscall now returns -ENXIO errno when called outside kernel trampoline, fixing the current sigill test to reflect that and renaming it to uprobe_error. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org>	2025-09-15 13:46:29 +02:00
Jakub Kicinski	fc3a281041	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.17-rc6). Conflicts: net/netfilter/nft_set_pipapo.c net/netfilter/nft_set_pipapo_avx2.c `c4eaca2e10` ("netfilter: nft_set_pipapo: don't check genbit from packetpath lookups") `84c1da7b38` ("netfilter: nft_set_pipapo: use avx2 algorithm for insertions too") Only trivial adjacent changes (in a doc and a Makefile). Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-09-11 17:40:13 -07:00
Puranjay Mohan	86f2225065	selftests/bpf: Add tests for arena fault reporting Add selftests for testing the reporting of arena page faults through BPF streams. Two new bpf programs are added that read and write to an unmapped arena address and the fault reporting is verified in the userspace through streams. The added bpf programs need to access the user_vm_start in struct bpf_arena, this is done by casting &arena to struct bpf_arena *, but barrier_var() is used on this ptr before accessing ptr->user_vm_start; to stop GCC from issuing an out-of-bound access due to the cast from smaller map struct to larger "struct bpf_arena" Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250911145808.58042-7-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 13:00:44 -07:00
Puranjay Mohan	edd03fcd76	selftests: bpf: use __stderr in stream error tests Start using __stderr directly in the bpf programs to test the reporting of may_goto timeout detection and spin_lock dead lock detection. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250911145808.58042-6-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 13:00:44 -07:00
Puranjay Mohan	744eeb2b27	selftests: bpf: introduce __stderr and __stdout Add __stderr and __stdout to validate the output of BPF streams for bpf selftests. Similar to __xlated, __jited, etc., __stderr/out can be used in the BPF progs to compare a string (regex supported) to the output in the bpf streams. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250911145808.58042-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 13:00:44 -07:00
Alexei Starovoitov	5d87e96a49	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5 Cross-merge BPF and other fixes after downstream PR. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-11 09:34:37 -07:00
Leon Hwang	fbdd61c94b	selftests/bpf: Skip timer cases when bpf_timer is not supported When enable CONFIG_PREEMPT_RT, verifier will reject bpf_timer with returning -EOPNOTSUPP. Therefore, skip test cases when errno is EOPNOTSUPP. cd tools/testing/selftests/bpf ./test_progs -t timer 125 free_timer:SKIP 456 timer:SKIP 457/1 timer_crash/array:SKIP 457/2 timer_crash/hash:SKIP 457 timer_crash:SKIP 458 timer_lockup:SKIP 459 timer_mim:SKIP Summary: 5/0 PASSED, 6 SKIPPED, 0 FAILED Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20250910125740.52172-3-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-10 12:34:09 -07:00
Rong Tao	6624fb2f33	selftests/bpf: Add tests for bpf_strnstr Add tests for bpf_strnstr(): bpf_strnstr("", "", 0) = 0 bpf_strnstr("hello world", "hello", 5) = 0 bpf_strnstr(str, "hello", 4) = -ENOENT bpf_strnstr("", "a", 0) = -ENOENT Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/tencent_2ED218F8082565C95D86A804BDDA8DBA200A@qq.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-09 15:07:58 -07:00
Ilya Leoshkevich	5d40c038c8	selftests/bpf: Fix "expression result unused" warnings with icecc icecc is a compiler wrapper that distributes compile jobs over a build farm [1]. It works by sending toolchain binaries and preprocessed source code to remote machines. Unfortunately using it with BPF selftests causes build failures due to a clang bug [2]. The problem is that clang suppresses the -Wunused-value warning if the unused expression comes from a macro expansion. Since icecc compiles preprocessed source code, this information is not available. This leads to -Wunused-value false positives. obj_new_no_struct() and obj_new_acq() use the bpf_obj_new() macro and discard the result. arena_spin_lock_slowpath() uses two macros that produce values and ignores the results. Add (void) casts to explicitly indicate that this is intentional and suppress the warning. An alternative solution is to change the macros to not produce values. This would work today for the arena_spin_lock_slowpath() issue, but in the future there may appear users who need them. Another potential solution is to replace these macros with functions. Unfortunately this would not work, because these macros work with unknown types and control flow. [1] https://github.com/icecc/icecream [2] https://github.com/llvm/llvm-project/issues/142614 Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250829030017.102615-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-09 15:07:58 -07:00
Daniel Borkmann	3aa9b9a165	selftests/bpf: Extend crypto_sanity selftest with invalid dst buffer Small cleanup and test extension to probe the bpf_crypto_{encrypt,decrypt}() kfunc when a bad dst buffer is passed in to assert that an error is returned. Also, encrypt_sanity() and skb_crypto_setup() were explicit to set the global status variable to zero before any test, so do the same for decrypt_sanity(). Do not explicitly zero the on-stack err before bpf_crypto_ctx_create() given the kfunc is expected to do it internally for the success case. Before kernel fix: # ./vmtest.sh -- ./test_progs -t crypto [...] [ 1.531200] bpf_testmod: loading out-of-tree module taints kernel. [ 1.533388] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel #87/1 crypto_basic/crypto_release:OK #87/2 crypto_basic/crypto_acquire:OK #87 crypto_basic:OK test_crypto_sanity:PASS:skel open 0 nsec test_crypto_sanity:PASS:ip netns add crypto_sanity_ns 0 nsec test_crypto_sanity:PASS:ip -net crypto_sanity_ns -6 addr add face::1/128 dev lo nodad 0 nsec test_crypto_sanity:PASS:ip -net crypto_sanity_ns link set dev lo up 0 nsec test_crypto_sanity:PASS:open_netns 0 nsec test_crypto_sanity:PASS:AF_ALG init fail 0 nsec test_crypto_sanity:PASS:if_nametoindex lo 0 nsec test_crypto_sanity:PASS:skb_crypto_setup fd 0 nsec test_crypto_sanity:PASS:skb_crypto_setup 0 nsec test_crypto_sanity:PASS:skb_crypto_setup retval 0 nsec test_crypto_sanity:PASS:skb_crypto_setup status 0 nsec test_crypto_sanity:PASS:create qdisc hook 0 nsec test_crypto_sanity:PASS:make_sockaddr 0 nsec test_crypto_sanity:PASS:attach encrypt filter 0 nsec test_crypto_sanity:PASS:encrypt socket 0 nsec test_crypto_sanity:PASS:encrypt send 0 nsec test_crypto_sanity:FAIL:encrypt status unexpected error: -5 (errno 95) #88 crypto_sanity:FAIL Summary: 1/2 PASSED, 0 SKIPPED, 1 FAILED After kernel fix: # ./vmtest.sh -- ./test_progs -t crypto [...] [ 1.540963] bpf_testmod: loading out-of-tree module taints kernel. [ 1.542404] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel #87/1 crypto_basic/crypto_release:OK #87/2 crypto_basic/crypto_acquire:OK #87 crypto_basic:OK #88 crypto_sanity:OK Summary: 2/2 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://lore.kernel.org/r/20250829143657.318524-2-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-09 15:07:57 -07:00
Jiayuan Chen	f85981327a	selftests/bpf: Fix incorrect array size calculation The loop in bench_sockmap_prog_destroy() has two issues: 1. Using 'sizeof(ctx.fds)' as the loop bound results in the number of bytes, not the number of file descriptors, causing the loop to iterate far more times than intended. 2. The condition 'ctx.fds[0] > 0' incorrectly checks only the first fd for all iterations, potentially leaving file descriptors unclosed. Change it to 'ctx.fds[i] > 0' to check each fd properly. These fixes ensure correct cleanup of all file descriptors when the benchmark exits. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250909124721.191555-1-jiayuan.chen@linux.dev Closes: https://lore.kernel.org/bpf/aLqfWuRR9R_KTe5e@stanley.mountain/	2025-09-09 09:23:47 -07:00
Feng Yang	93a83d0443	selftests/bpf: Fix the issue where the error code is 0 The error message printed here only uses the previous err value, which results in it being printed as 0. When bpf_map__attach_struct_ops encounters an error, it uses libbpf_err_ptr(err) to set errno = -err and returns NULL. Therefore, Using -errno can fix this issue. Fix before: run_subtest:FAIL:1019 bpf_map__attach_struct_ops failed for map pro_epilogue: err=0 Fix after: run_subtest:FAIL:1019 bpf_map__attach_struct_ops failed for map pro_epilogue: err=-9 Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250908060810.1054341-1-yangfeng59949@163.com	2025-09-08 09:51:13 -07:00
Mykyta Yatsenko	e12873ee85	selftests/bpf: Add BPF program dump in veristat Add the ability to dump BPF program instructions directly from veristat. Previously, inspecting a program required separate bpftool invocations: one to load and another to dump it, which meant running multiple commands. During active development, it's common for developers to use veristat for testing verification. Integrating instruction dumping into veristat reduces the need to switch tools and simplifies the workflow. By making this information more readily accessible, this change aims to streamline the BPF development cycle and improve usability for developers. This implementation leverages bpftool, by running it directly via popen to avoid any code duplication and keep veristat simple. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250905140835.1416179-1-mykyta.yatsenko5@gmail.com	2025-09-05 12:12:09 -07:00
Leon Hwang	88a3bde432	selftests/bpf: Add case to test bpf_in_interrupt() Add a timer test case to test 'bpf_in_interrupt()'. cd tools/testing/selftests/bpf ./test_progs -t timer_interrupt 462 timer_interrupt:OK Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20250903140438.59517-3-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-04 09:05:58 -07:00
Leon Hwang	4b69e31329	selftests/bpf: Introduce experimental bpf_in_interrupt() Filtering pid_tgid is meanlingless when the current task is preempted by an interrupt. To address this, introduce 'bpf_in_interrupt()' helper function, which allows BPF programs to determine whether they are executing in interrupt context. 'get_preempt_count()': * On x86, '(int ) bpf_this_cpu_ptr(&__preempt_count)'. * On arm64, 'bpf_get_current_task_btf()->thread_info.preempt.count'. Then 'bpf_in_interrupt()' will be: * If !PREEMPT_RT, 'get_preempt_count() & (NMI_MASK \| HARDIRQ_MASK \| SOFTIRQ_MASK)'. * If PREEMPT_RT, '(get_preempt_count() & (NMI_MASK \| HARDIRQ_MASK)) \| (bpf_get_current_task_btf()->softirq_disable_cnt & SOFTIRQ_MASK)'. As for other archs, it can be added support by updating 'get_preempt_count()'. Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20250903140438.59517-2-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-04 09:05:58 -07:00
Rong Tao	abc8a952d4	selftests/bpf: Test kfunc bpf_strcasecmp Add testsuites for kfunc bpf_strcasecmp. Signed-off-by: Rong Tao <rongtao@cestc.cn> Link: https://lore.kernel.org/r/tencent_81A1A0ACC04B68158C57C4D151C46A832B07@qq.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-04 09:00:57 -07:00
Menglong Dong	a85d888768	selftests/bpf: add benchmark testing for kprobe-multi-all For now, the benchmark for kprobe-multi is single, which means there is only 1 function is hooked during testing. Add the testing "kprobe-multi-all", which will hook all the kernel functions during the benchmark. And the "kretprobe-multi-all" is added too. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20250904021011.14069-4-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-04 09:00:25 -07:00
Menglong Dong	adf6b57ce4	selftests/bpf: skip recursive functions for kprobe_multi Some functions is recursive for the kprobe_multi and impact the benchmark results. So just skip them. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20250904021011.14069-3-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-04 09:00:25 -07:00
Menglong Dong	8bad31edf5	selftests/bpf: move get_ksyms and get_addrs to trace_helpers.c We need to get all the kernel function that can be traced sometimes, so we move the get_syms() and get_addrs() in kprobe_multi_test.c to trace_helpers.c and rename it to bpf_get_ksyms() and bpf_get_addrs(). Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20250904021011.14069-2-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-09-04 09:00:25 -07:00
Ricardo B. Marlière	c9110e6f72	selftests/bpf: Fix count write in testapp_xdp_metadata_copy() Commit `4b30209255` ("selftests/xsk: Add tail adjustment tests and support check") added a new global to xsk_xdp_progs.c, but left out the access in the testapp_xdp_metadata_copy() function. Since bpf_map_update_elem() will write to the whole bss section, it gets truncated. Fix by writing to skel_rx->bss->count directly. Fixes: `4b30209255` ("selftests/xsk: Add tail adjustment tests and support check") Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250829-selftests-bpf-xsk_regression_fix-v1-1-5f5acdb9fe6b@suse.com	2025-09-03 16:46:36 -07:00
Ricardo B. Marlière	2a912258c9	selftests/bpf: Upon failures, exit with code 1 in test_xsk.sh Currently, even if some subtests fails, the end result will still yield "ok 1 selftests: bpf: test_xsk.sh". Fix it by exiting with 1 if there are any failures. Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20250828-selftests-bpf-test_xsk_ret-v1-1-e6656c01f397@suse.com	2025-09-03 16:44:22 -07:00
Ricardo B. Marlière	98857d111c	selftests/bpf: Fix bpf_prog_detach2 usage in test_lirc_mode2 Commit `e9fc3ce99b` ("libbpf: Streamline error reporting for high-level APIs") redefined the way that bpf_prog_detach2() returns. Therefore, adapt the usage in test_lirc_mode2_user.c. Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250828-selftests-bpf-v1-1-c7811cd8b98c@suse.com	2025-08-28 09:49:46 -07:00
Eric Dumazet	51132b99f0	udp: add drop_counters to udp socket When a packet flood hits one or more UDP sockets, many cpus have to update sk->sk_drops. This slows down other cpus, because currently sk_drops is in sock_write_rx group. Add a socket_drop_counters structure to udp sockets. Using dedicated cache lines to hold drop counters makes sure that consumers no longer suffer from false sharing if/when producers only change sk->sk_drops. This adds 128 bytes per UDP socket. Tested with the following stress test, sending about 11 Mpps to a dual socket AMD EPYC 7B13 64-Core. super_netperf 20 -t UDP_STREAM -H DUT -l10 -- -n -P,1000 -m 120 Note: due to socket lookup, only one UDP socket is receiving packets on DUT. Then measure receiver (DUT) behavior. We can see both consumer and BH handlers can process more packets per second. Before: nstat -n ; sleep 1 ; nstat \| grep Udp Udp6InDatagrams 615091 0.0 Udp6InErrors 3904277 0.0 Udp6RcvbufErrors 3904277 0.0 After: nstat -n ; sleep 1 ; nstat \| grep Udp Udp6InDatagrams 816281 0.0 Udp6InErrors 7497093 0.0 Udp6RcvbufErrors 7497093 0.0 Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250826125031.1578842-5-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-08-28 13:14:50 +02:00
Matt Fleming	737433c6a5	selftests/bpf: Add LPM trie microbenchmarks Add benchmarks for the standard set of operations: LOOKUP, INSERT, UPDATE, DELETE. Also include benchmarks to measure the overhead of the bench framework itself (NOOP) as well as the overhead of generating keys (BASELINE). Lastly, this includes a benchmark for FREE (trie_free()) which is known to have terrible performance for maps with many entries. Benchmarks operate on tries without gaps in the key range, i.e. each test begins or ends with a trie with valid keys in the range [0, nr_entries). This is intended to cause maximum branching when traversing the trie. LOOKUP, UPDATE, DELETE, and FREE fill a BPF LPM trie from userspace using bpf_map_update_batch() and run the corresponding benchmark operation via bpf_loop(). INSERT starts with an empty map and fills it kernel-side from bpf_loop(). FREE records the time to free a filled LPM trie by attaching and destroying a BPF prog. NOOP measures the overhead of the test harness by running an empty function with bpf_loop(). BASELINE is similar to NOOP except that the function generates a key. Each operation runs 10,000 times using bpf_loop(). Note that this value is intentionally independent of the number of entries in the LPM trie so that the stability of the results isn't affected by the number of entries. For those benchmarks that need to reset the LPM trie once it's full (INSERT) or empty (DELETE), throughput and latency results are scaled by the fraction of a second the operation actually ran to ignore any time spent reinitialising the trie. By default, benchmarks run using sequential keys in the range [0, nr_entries). BASELINE, LOOKUP, and UPDATE can use random keys via the --random parameter but beware there is a runtime cost involved in generating random keys. Other benchmarks are prohibited from using random keys because it can skew the results, e.g. when inserting an existing key or deleting a missing one. All measurements are recorded from within the kernel to eliminate syscall overhead. Most benchmarks run an XDP program to generate stats but FREE needs to collect latencies using fentry/fexit on map_free_deferred() because it's not possible to use fentry directly on lpm_trie.c since commit `c83508da56` ("bpf: Avoid deadlock caused by nested kprobe and fentry bpf programs") and there's no way to create/destroy a map from within an XDP program. Here is example output from an AMD EPYC 9684X 96-Core machine for each of the benchmarks using a trie with 10K entries and a 32-bit prefix length, e.g. $ ./bench lpm-trie-$op \ --prefix_len=32 \ --producers=1 \ --nr_entries=10000 noop: throughput 74.417 ± 0.032 M ops/s ( 74.417M ops/prod), latency 13.438 ns/op baseline: throughput 70.107 ± 0.171 M ops/s ( 70.107M ops/prod), latency 14.264 ns/op lookup: throughput 8.467 ± 0.047 M ops/s ( 8.467M ops/prod), latency 118.109 ns/op insert: throughput 2.440 ± 0.015 M ops/s ( 2.440M ops/prod), latency 409.290 ns/op update: throughput 2.806 ± 0.042 M ops/s ( 2.806M ops/prod), latency 356.322 ns/op delete: throughput 4.625 ± 0.011 M ops/s ( 4.625M ops/prod), latency 215.613 ns/op free: throughput 0.578 ± 0.006 K ops/s ( 0.578K ops/prod), latency 1.730 ms/op And the same benchmarks using random keys: $ ./bench lpm-trie-$op \ --prefix_len=32 \ --producers=1 \ --nr_entries=10000 \ --random noop: throughput 74.259 ± 0.335 M ops/s ( 74.259M ops/prod), latency 13.466 ns/op baseline: throughput 35.150 ± 0.144 M ops/s ( 35.150M ops/prod), latency 28.450 ns/op lookup: throughput 7.119 ± 0.048 M ops/s ( 7.119M ops/prod), latency 140.469 ns/op insert: N/A update: throughput 2.736 ± 0.012 M ops/s ( 2.736M ops/prod), latency 365.523 ns/op delete: N/A free: N/A Signed-off-by: Matt Fleming <mfleming@cloudflare.com> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20250827140149.1001557-1-matt@readmodwrite.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-08-27 17:28:14 -07:00
Puranjay Mohan	22b22bf9ee	selftests/bpf: Enable timed may_goto tests for arm64 As arm64 JIT now supports timed may_goto instruction, make sure all relevant tests run on this architecture. Some tests were enabled and other required modifications to work properly on arm64. $ ./test_progs -a "stream","may_goto*",verifier_bpf_fastcall #404 stream_errors:OK [...] #406/2 stream_success/stream_cond_break:OK [...] #494/23 verifier_bpf_fastcall/may_goto_interaction_x86_64:SKIP #494/24 verifier_bpf_fastcall/may_goto_interaction_arm64:OK [...] #539/1 verifier_may_goto_1/may_goto 0:OK #539/2 verifier_may_goto_1/batch 2 of may_goto 0:OK #539/3 verifier_may_goto_1/may_goto batch with offsets 2/1/0:OK #539/4 verifier_may_goto_1/may_goto batch with offsets 2/0:OK #539 verifier_may_goto_1:OK #540/1 verifier_may_goto_2/C code with may_goto 0:OK #540 verifier_may_goto_2:OK Summary: 7/16 PASSED, 25 SKIPPED, 0 FAILED Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20250827113245.52629-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-08-27 17:16:22 -07:00
Jiawei Zhao	69424097ee	selftests/bpf: Enrich subtest_basic_usdt case in selftests to cover SIB handling logic When using GCC on x86-64 to compile an usdt prog with -O1 or higher optimization, the compiler will generate SIB addressing mode for global array, e.g. "1@-96(%rbp,%rax,8)". In this patch: - enrich subtest_basic_usdt test case to cover SIB addressing usdt argument spec handling logic Signed-off-by: Jiawei Zhao <phoenix500526@163.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250827053128.1301287-3-phoenix500526@163.com	2025-08-27 15:48:05 -07:00
Shubham Sharma	d3abefe897	selftests/bpf: Fix typos and grammar in test sources Fix spelling typos and grammar errors in BPF selftests source code. Signed-off-by: Shubham Sharma <slopixelz@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250826125746.17983-1-slopixelz@gmail.com	2025-08-27 15:13:08 -07:00
Nandakumar Edamana	2660b9d477	bpf: Add selftest to check the verifier's abstract multiplication Add new selftest to test the abstract multiplication technique(s) used by the verifier, following the recent improvement in tnum multiplication (tnum_mul). One of the newly added programs, verifier_mul/mul_precise, results in a false positive with the old tnum_mul, while the program passes with the latest one. Signed-off-by: Nandakumar Edamana <nandakumar@nandakumar.co.in> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250826034524.2159515-2-nandakumar@nandakumar.co.in	2025-08-27 15:00:31 -07:00
Ilya Leoshkevich	21bce56940	selftests/bpf: Remove may_goto tests from DENYLIST.s390x The may_goto instruction is now fully supported on s390x, including the timed implementation, so remove the respective test from the denylist. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250821113339.292434-6-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-08-26 16:51:52 -07:00
Ilya Leoshkevich	7197dbcba2	selftests/bpf: Enable timed may_goto verifier tests on s390x Now that the timed may_goto implementation is available on s390x, enable the respective verifier tests. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250821113339.292434-5-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-08-26 16:51:52 -07:00
Ilya Leoshkevich	1e4e6b9e26	selftests/bpf: Add __arch_s390x macro Make it possible to limit certain tests to s390x, just like it's already done for x86_64, arm64, and riscv64. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250821113339.292434-4-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-08-26 16:51:52 -07:00
Ilya Leoshkevich	b68dfcc12a	selftests/bpf: Add a missing newline to the "bad arch spec" message Fix error messages like this one: parse_test_spec:FAIL:569 bad arch spec: 's390x'process_subtest:FAIL:1153 Can't parse test spec for program 'may_goto_simple' Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250821113339.292434-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-08-26 16:51:52 -07:00
Tiezhu Yang	d0f27ff27c	selftests/bpf: Remove entries from config.{arch} already present in config `config.{arch}` had entries already present in `config`. When generating the config used by vmtest, concatenate the `config` file with the `config.{arch}` one, making those entries duplicated, so remove those duplications. Use the following command to get the differences: $ comm -1 -2 <(sort tools/testing/selftests/bpf/config.x86_64) <(sort tools/testing/selftests/bpf/config) $ comm -1 -2 <(sort tools/testing/selftests/bpf/config.aarch64) <(sort tools/testing/selftests/bpf/config) $ comm -1 -2 <(sort tools/testing/selftests/bpf/config.riscv64) <(sort tools/testing/selftests/bpf/config) $ comm -1 -2 <(sort tools/testing/selftests/bpf/config.ppc64el) <(sort tools/testing/selftests/bpf/config) $ comm -1 -2 <(sort tools/testing/selftests/bpf/config.s390x) <(sort tools/testing/selftests/bpf/config) This is similar with commit `7a42af4b94` ("selftests/bpf: Remove entries from config.s390x already present in config"). Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20250826065057.11415-1-yangtiezhu@loongson.cn	2025-08-26 13:47:03 +02:00
Paul Chaignon	0780f54ab1	selftests/bpf: Tests for is_scalar_branch_taken tnum logic This patch adds tests for the new jeq and jne logic in is_scalar_branch_taken. The following shows the first test failing before the previous patch is applied. Once the previous patch is applied, the verifier can use the tnum values to deduce that instruction 7 is dead code. 0: call bpf_get_prandom_u32#7 ; R0_w=scalar() 1: w0 = w0 ; R0_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 2: r0 >>= 30 ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=3,var_off=(0x0; 0x3)) 3: r0 <<= 30 ; R0_w=scalar(smin=0,smax=umax=umax32=0xc0000000,smax32=0x40000000,var_off=(0x0; 0xc0000000)) 4: r1 = r0 ; R0_w=scalar(id=1,smin=0,smax=umax=umax32=0xc0000000,smax32=0x40000000,var_off=(0x0; 0xc0000000)) R1_w=scalar(id=1,smin=0,smax=umax=umax32=0xc0000000,smax32=0x40000000,var_off=(0x0; 0xc0000000)) 5: r1 += 1024 ; R1_w=scalar(smin=umin=umin32=1024,smax=umax=umax32=0xc0000400,smin32=0x80000400,smax32=0x40000400,var_off=(0x400; 0xc0000000)) 6: if r1 != r0 goto pc+1 ; R0_w=scalar(id=1,smin=umin=umin32=1024,smax=umax=umax32=0xc0000000,smin32=0x80000400,smax32=0x40000000,var_off=(0x400; 0xc0000000)) R1_w=scalar(smin=umin=umin32=1024,smax=umax=umax32=0xc0000000,smin32=0x80000400,smax32=0x40000400,var_off=(0x400; 0xc0000000)) 7: r10 = 0 frame pointer is read only Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/550004f935e2553bdb2fb1f09cbde7d0452112d0.1755694148.git.paul.chaignon@gmail.com	2025-08-22 18:12:34 +02:00
Hengqi Chen	21aeabb682	selftests/bpf: Use vmlinux.h for BPF programs Some of the bpf test progs still use linux/libc headers. Let's use vmlinux.h instead like the rest of test progs. This will also ease cross compiling. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250821030254.398826-1-hengqi.chen@gmail.com	2025-08-21 11:32:25 -07:00
Jiri Olsa	52718438af	selftests/bpf: Fix uprobe syscall shadow stack test Now that we have uprobe syscall working properly with shadow stack, we can remove testing limitations for shadow stack tests and make sure uprobe gets properly optimized. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250821141557.13233-1-jolsa@kernel.org	2025-08-21 20:09:25 +02:00
Jiri Olsa	3abf4298c6	selftests/bpf: Change test_uretprobe_regs_change for uprobe and uretprobe Changing the test_uretprobe_regs_change test to test both uprobe and uretprobe by adding entry consumer handler to the testmod and making it to change one of the registers. Making sure that changed values both uprobe and uretprobe handlers propagate to the user space. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250720112133.244369-20-jolsa@kernel.org	2025-08-21 20:09:25 +02:00
Jiri Olsa	275eae6789	selftests/bpf: Add uprobe_regs_equal test Changing uretprobe_regs_trigger to allow the test for both uprobe and uretprobe and renaming it to uprobe_regs_equal. We check that both uprobe and uretprobe probes (bpf programs) see expected registers with few exceptions. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250720112133.244369-19-jolsa@kernel.org	2025-08-21 20:09:25 +02:00
Jiri Olsa	875e1705ad	selftests/bpf: Add optimized usdt variant for basic usdt test Adding optimized usdt variant for basic usdt test to check that usdt arguments are properly passed in optimized code path. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250720112133.244369-18-jolsa@kernel.org	2025-08-21 20:09:24 +02:00
Jiri Olsa	c11661bd9a	selftests/bpf: Add uprobe syscall sigill signal test Make sure that calling uprobe syscall from outside uprobe trampoline results in sigill signal. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250720112133.244369-17-jolsa@kernel.org	2025-08-21 20:09:24 +02:00
Jiri Olsa	c8be59667c	selftests/bpf: Add hit/attach/detach race optimized uprobe test Adding test that makes sure parallel execution of the uprobe and attach/detach of optimized uprobe on it works properly. By default the test runs for 500ms, which is adjustable by using BPF_SELFTESTS_UPROBE_SYSCALL_RACE_MSEC env variable. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250720112133.244369-16-jolsa@kernel.org	2025-08-21 20:09:24 +02:00
Jiri Olsa	d5c86c3370	selftests/bpf: Add uprobe/usdt syscall tests Adding tests for optimized uprobe/usdt probes. Checking that we get expected trampoline and attached bpf programs get executed properly. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250720112133.244369-15-jolsa@kernel.org	2025-08-21 20:09:24 +02:00
Jiri Olsa	7932c4cf57	selftests/bpf: Rename uprobe_syscall_executed prog to test_uretprobe_multi Renaming uprobe_syscall_executed prog to test_uretprobe_multi to fit properly in the following changes that add more programs. Plus adding pid filter and increasing executed variable. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250720112133.244369-14-jolsa@kernel.org	2025-08-21 20:09:23 +02:00
Jiri Olsa	4e7005223e	selftests/bpf: Reorg the uprobe_syscall test function Adding __test_uprobe_syscall with non x86_64 stub to execute all the tests, so we don't need to keep adding non x86_64 stub functions for new tests. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250720112133.244369-13-jolsa@kernel.org	2025-08-21 20:09:23 +02:00
Jiri Olsa	17c3b00157	selftests/bpf: Import usdt.h from libbpf/usdt project Importing usdt.h from libbpf/usdt project. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250720112133.244369-12-jolsa@kernel.org	2025-08-21 20:09:23 +02:00
Martin KaFai Lau	5c42715e63	Merge branch 'bpf-next/skb-meta-dynptr' into 'bpf-next/master' Merge 'skb-meta-dynptr' branch into 'master' branch. No conflict. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2025-08-18 17:59:26 -07:00
Jakub Sitnicki	403fae5978	selftests/bpf: Cover metadata access from a modified skb clone Demonstrate that, when processing an skb clone, the metadata gets truncated if the program contains a direct write to either the payload or the metadata, due to an implicit unclone in the prologue, and otherwise the dynptr to the metadata is limited to being read-only. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-9-8a39e636e0fb@cloudflare.com	2025-08-18 10:29:43 -07:00
Jakub Sitnicki	bd1b51b319	selftests/bpf: Cover read/write to skb metadata at an offset Exercise r/w access to skb metadata through an offset-adjusted dynptr, read/write helper with an offset argument, and a slice starting at an offset. Also check for the expected errors when the offset is out of bounds. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-8-8a39e636e0fb@cloudflare.com	2025-08-18 10:29:43 -07:00
Jakub Sitnicki	ed93360807	selftests/bpf: Cover write access to skb metadata via dynptr Add tests what exercise writes to skb metadata in two ways: 1. indirectly, using bpf_dynptr_write helper, 2. directly, using a read-write dynptr slice. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-7-8a39e636e0fb@cloudflare.com	2025-08-18 10:29:43 -07:00
Jakub Sitnicki	153f6bfd48	selftests/bpf: Cover read access to skb metadata via dynptr Exercise reading from SKB metadata area in two new ways: 1. indirectly, with bpf_dynptr_read(), and 2. directly, with bpf_dynptr_slice(). Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-6-8a39e636e0fb@cloudflare.com	2025-08-18 10:29:42 -07:00
Jakub Sitnicki	dd9f6cfb4e	selftests/bpf: Parametrize test_xdp_context_tuntap We want to add more test cases to cover different ways to access the metadata area. Prepare for it. Pull up the skeleton management. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-5-8a39e636e0fb@cloudflare.com	2025-08-18 10:29:42 -07:00
Jakub Sitnicki	6dfd5e01e1	selftests/bpf: Pass just bpf_map to xdp_context_test helper Prepare for parametrizing the xdp_context tests. The assert_test_result helper doesn't need the whole skeleton. Pass just what it needs. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-4-8a39e636e0fb@cloudflare.com	2025-08-18 10:29:42 -07:00
Jakub Sitnicki	0e74eb4d57	selftests/bpf: Cover verifier checks for skb_meta dynptr type dynptr for skb metadata behaves the same way as the dynptr for skb data with one exception - writes to skb_meta dynptr don't invalidate existing skb and skb_meta slices. Duplicate those the skb dynptr tests which we can, since bpf_dynptr_from_skb_meta kfunc can be called only from TC BPF, to cover the skb_meta dynptr verifier checks. Also add a couple of new tests (skb_data_valid_*) to ensure we don't invalidate the slices in the mentioned case, which are specific to skb_meta dynptr. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-3-8a39e636e0fb@cloudflare.com	2025-08-18 10:29:42 -07:00
Ilya Leoshkevich	1274163035	selftests/bpf: Clobber a lot of registers in tailcall_bpf2bpf_hierarchy tests Clobbering a lot of registers and stack slots helps exposing tail call counter overwrite bugs in JITs. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20250813121016.163375-5-iii@linux.ibm.com	2025-08-18 15:08:30 +02:00
Yureka Lilian	7f8fa9d370	selftests/bpf: Add test for DEVMAP reuse The test covers basic re-use of a pinned DEVMAP map, with both matching and mismatching parameters. Signed-off-by: Yureka Lilian <yuka@yuka.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250814180113.1245565-4-yuka@yuka.dev	2025-08-15 16:52:52 -07:00
Matt Bobrowski	c80d797206	bpf/selftests: Fix test_tcpnotify_user Based on a bisect, it appears that commit `7ee9887703` ("timers: Implement the hierarchical pull model") has somehow inadvertently broken BPF selftest test_tcpnotify_user. The error that is being generated by this test is as follows: FAILED: Wrong stats Expected 10 calls, got 8 It looks like the change allows timer functions to be run on CPUs different from the one they are armed on. The test had pinned itself to CPU 0, and in the past the retransmit attempts also occurred on CPU 0. The test had set the max_entries attribute for BPF_MAP_TYPE_PERF_EVENT_ARRAY to 2 and was calling bpf_perf_event_output() with BPF_F_CURRENT_CPU, so the entry was likely to be in range. With the change to allow timers to run on other CPUs, the current CPU tasked with performing the retransmit might be bumped and in turn fall out of range, as the event will be filtered out via __bpf_perf_event_output() using: if (unlikely(index >= array->map.max_entries)) return -E2BIG; A possible change would be to explicitly set the max_entries attribute for perf_event_map in test_tcpnotify_kern.c to a value that's at least as large as the number of CPUs. As it turns out however, if the field is left unset, then the libbpf will determine the number of CPUs available on the underlying system and update the max_entries attribute accordingly in map_set_def_max_entries(). A further problem with the test is that it has a thread that continues running up until the program exits. The main thread cleans up some LIBBPF data structures, while the other thread continues to use them, which inevitably will trigger a SIGSEGV. This can be dealt with by telling the thread to run for as long as necessary and doing a pthread_join on it before exiting the program. Finally, I don't think binding the process to CPU 0 is meaningful for this test any more, so get rid of that. Fixes: `435f90a338` ("selftests/bpf: add a test case for sock_ops perf-event notification") Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/aJ8kHhwgATmA3rLf@google.com	2025-08-15 13:05:29 -07:00

... 4 5 6 7 8 ...

5219 Commits