From 17ce3e5949bc37557305ad46316f41c7875d6366 Mon Sep 17 00:00:00 2001 From: Kuniyuki Iwashima Date: Tue, 22 Jul 2025 22:40:37 +0000 Subject: [PATCH 1/3] bpf: Disable migration in nf_hook_run_bpf(). syzbot reported that the netfilter bpf prog can be called without migration disabled in xmit path. Then the assertion in __bpf_prog_run() fails, triggering the splat below. [0] Let's use bpf_prog_run_pin_on_cpu() in nf_hook_run_bpf(). [0]: BUG: assuming non migratable context at ./include/linux/filter.h:703 in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 5829, name: sshd-session 3 locks held by sshd-session/5829: #0: ffff88807b4e4218 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1667 [inline] #0: ffff88807b4e4218 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendmsg+0x20/0x50 net/ipv4/tcp.c:1395 #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline] #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline] #1: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: __ip_queue_xmit+0x69/0x26c0 net/ipv4/ip_output.c:470 #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline] #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline] #2: ffffffff8e5c4e00 (rcu_read_lock){....}-{1:3}, at: nf_hook+0xb2/0x680 include/linux/netfilter.h:241 CPU: 0 UID: 0 PID: 5829 Comm: sshd-session Not tainted 6.16.0-rc6-syzkaller-00002-g155a3c003e55 #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 Call Trace: __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x16c/0x1f0 lib/dump_stack.c:120 __cant_migrate kernel/sched/core.c:8860 [inline] __cant_migrate+0x1c7/0x250 kernel/sched/core.c:8834 __bpf_prog_run include/linux/filter.h:703 [inline] bpf_prog_run include/linux/filter.h:725 [inline] nf_hook_run_bpf+0x83/0x1e0 net/netfilter/nf_bpf_link.c:20 nf_hook_entry_hookfn include/linux/netfilter.h:157 [inline] nf_hook_slow+0xbb/0x200 net/netfilter/core.c:623 nf_hook+0x370/0x680 include/linux/netfilter.h:272 NF_HOOK_COND include/linux/netfilter.h:305 [inline] ip_output+0x1bc/0x2a0 net/ipv4/ip_output.c:433 dst_output include/net/dst.h:459 [inline] ip_local_out net/ipv4/ip_output.c:129 [inline] __ip_queue_xmit+0x1d7d/0x26c0 net/ipv4/ip_output.c:527 __tcp_transmit_skb+0x2686/0x3e90 net/ipv4/tcp_output.c:1479 tcp_transmit_skb net/ipv4/tcp_output.c:1497 [inline] tcp_write_xmit+0x1274/0x84e0 net/ipv4/tcp_output.c:2838 __tcp_push_pending_frames+0xaf/0x390 net/ipv4/tcp_output.c:3021 tcp_push+0x225/0x700 net/ipv4/tcp.c:759 tcp_sendmsg_locked+0x1870/0x42b0 net/ipv4/tcp.c:1359 tcp_sendmsg+0x2e/0x50 net/ipv4/tcp.c:1396 inet_sendmsg+0xb9/0x140 net/ipv4/af_inet.c:851 sock_sendmsg_nosec net/socket.c:712 [inline] __sock_sendmsg net/socket.c:727 [inline] sock_write_iter+0x4aa/0x5b0 net/socket.c:1131 new_sync_write fs/read_write.c:593 [inline] vfs_write+0x6c7/0x1150 fs/read_write.c:686 ksys_write+0x1f8/0x250 fs/read_write.c:738 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0x4c0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fe7d365d407 Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff RSP: Fixes: fd9c663b9ad67 ("bpf: minimal support for programs hooked into netfilter framework") Reported-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6879466d.a00a0220.3af5df.0022.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima Signed-off-by: Martin KaFai Lau Tested-by: syzbot+40f772d37250b6d10efc@syzkaller.appspotmail.com Acked-by: Florian Westphal Link: https://patch.msgid.link/20250722224041.112292-1-kuniyu@google.com --- net/netfilter/nf_bpf_link.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c index 06b084844700..25bbac8986c2 100644 --- a/net/netfilter/nf_bpf_link.c +++ b/net/netfilter/nf_bpf_link.c @@ -17,7 +17,7 @@ static unsigned int nf_hook_run_bpf(void *bpf_prog, struct sk_buff *skb, .skb = skb, }; - return bpf_prog_run(prog, &ctx); + return bpf_prog_run_pin_on_cpu(prog, &ctx); } struct bpf_nf_link { From e09299225d5ba3916c91ef70565f7d2187e4cca0 Mon Sep 17 00:00:00 2001 From: Paul Chaignon Date: Tue, 22 Jul 2025 16:32:32 +0200 Subject: [PATCH 2/3] bpf: Reject narrower access to pointer ctx fields The following BPF program, simplified from a syzkaller repro, causes a kernel warning: r0 = *(u8 *)(r1 + 169); exit; With pointer field sk being at offset 168 in __sk_buff. This access is detected as a narrower read in bpf_skb_is_valid_access because it doesn't match offsetof(struct __sk_buff, sk). It is therefore allowed and later proceeds to bpf_convert_ctx_access. Note that for the "is_narrower_load" case in the convert_ctx_accesses(), the insn->off is aligned, so the cnt may not be 0 because it matches the offsetof(struct __sk_buff, sk) in the bpf_convert_ctx_access. However, the target_size stays 0 and the verifier errors with a kernel warning: verifier bug: error during ctx access conversion(1) This patch fixes that to return a proper "invalid bpf_context access off=X size=Y" error on the load instruction. The same issue affects multiple other fields in context structures that allow narrow access. Some other non-affected fields (for sk_msg, sk_lookup, and sockopt) were also changed to use bpf_ctx_range_ptr for consistency. Note this syzkaller crash was reported in the "Closes" link below, which used to be about a different bug, fixed in commit fce7bd8e385a ("bpf/verifier: Handle BPF_LOAD_ACQ instructions in insn_def_regno()"). Because syzbot somehow confused the two bugs, the new crash and repro didn't get reported to the mailing list. Fixes: f96da09473b52 ("bpf: simplify narrower ctx access") Fixes: 0df1a55afa832 ("bpf: Warn on internal verifier errors") Reported-by: syzbot+0ef84a7bdf5301d4cbec@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0ef84a7bdf5301d4cbec Signed-off-by: Paul Chaignon Signed-off-by: Martin KaFai Lau Acked-by: Eduard Zingerman Link: https://patch.msgid.link/3b8dcee67ff4296903351a974ddd9c4dca768b64.1753194596.git.paul.chaignon@gmail.com --- kernel/bpf/cgroup.c | 8 ++++---- net/core/filter.c | 20 ++++++++++---------- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c index f4885514f007..deb88fade249 100644 --- a/kernel/bpf/cgroup.c +++ b/kernel/bpf/cgroup.c @@ -2440,22 +2440,22 @@ static bool cg_sockopt_is_valid_access(int off, int size, } switch (off) { - case offsetof(struct bpf_sockopt, sk): + case bpf_ctx_range_ptr(struct bpf_sockopt, sk): if (size != sizeof(__u64)) return false; info->reg_type = PTR_TO_SOCKET; break; - case offsetof(struct bpf_sockopt, optval): + case bpf_ctx_range_ptr(struct bpf_sockopt, optval): if (size != sizeof(__u64)) return false; info->reg_type = PTR_TO_PACKET; break; - case offsetof(struct bpf_sockopt, optval_end): + case bpf_ctx_range_ptr(struct bpf_sockopt, optval_end): if (size != sizeof(__u64)) return false; info->reg_type = PTR_TO_PACKET_END; break; - case offsetof(struct bpf_sockopt, retval): + case bpf_ctx_range(struct bpf_sockopt, retval): if (size != size_default) return false; return prog->expected_attach_type == BPF_CGROUP_GETSOCKOPT; diff --git a/net/core/filter.c b/net/core/filter.c index 2eb8947d8097..c09a85c17496 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -8699,7 +8699,7 @@ static bool bpf_skb_is_valid_access(int off, int size, enum bpf_access_type type if (size != sizeof(__u64)) return false; break; - case offsetof(struct __sk_buff, sk): + case bpf_ctx_range_ptr(struct __sk_buff, sk): if (type == BPF_WRITE || size != sizeof(__u64)) return false; info->reg_type = PTR_TO_SOCK_COMMON_OR_NULL; @@ -9277,7 +9277,7 @@ static bool sock_addr_is_valid_access(int off, int size, return false; } break; - case offsetof(struct bpf_sock_addr, sk): + case bpf_ctx_range_ptr(struct bpf_sock_addr, sk): if (type != BPF_READ) return false; if (size != sizeof(__u64)) @@ -9327,17 +9327,17 @@ static bool sock_ops_is_valid_access(int off, int size, if (size != sizeof(__u64)) return false; break; - case offsetof(struct bpf_sock_ops, sk): + case bpf_ctx_range_ptr(struct bpf_sock_ops, sk): if (size != sizeof(__u64)) return false; info->reg_type = PTR_TO_SOCKET_OR_NULL; break; - case offsetof(struct bpf_sock_ops, skb_data): + case bpf_ctx_range_ptr(struct bpf_sock_ops, skb_data): if (size != sizeof(__u64)) return false; info->reg_type = PTR_TO_PACKET; break; - case offsetof(struct bpf_sock_ops, skb_data_end): + case bpf_ctx_range_ptr(struct bpf_sock_ops, skb_data_end): if (size != sizeof(__u64)) return false; info->reg_type = PTR_TO_PACKET_END; @@ -9346,7 +9346,7 @@ static bool sock_ops_is_valid_access(int off, int size, bpf_ctx_record_field_size(info, size_default); return bpf_ctx_narrow_access_ok(off, size, size_default); - case offsetof(struct bpf_sock_ops, skb_hwtstamp): + case bpf_ctx_range(struct bpf_sock_ops, skb_hwtstamp): if (size != sizeof(__u64)) return false; break; @@ -9416,17 +9416,17 @@ static bool sk_msg_is_valid_access(int off, int size, return false; switch (off) { - case offsetof(struct sk_msg_md, data): + case bpf_ctx_range_ptr(struct sk_msg_md, data): info->reg_type = PTR_TO_PACKET; if (size != sizeof(__u64)) return false; break; - case offsetof(struct sk_msg_md, data_end): + case bpf_ctx_range_ptr(struct sk_msg_md, data_end): info->reg_type = PTR_TO_PACKET_END; if (size != sizeof(__u64)) return false; break; - case offsetof(struct sk_msg_md, sk): + case bpf_ctx_range_ptr(struct sk_msg_md, sk): if (size != sizeof(__u64)) return false; info->reg_type = PTR_TO_SOCKET; @@ -11632,7 +11632,7 @@ static bool sk_lookup_is_valid_access(int off, int size, return false; switch (off) { - case offsetof(struct bpf_sk_lookup, sk): + case bpf_ctx_range_ptr(struct bpf_sk_lookup, sk): info->reg_type = PTR_TO_SOCKET_OR_NULL; return size == sizeof(__u64); From ba578b87fe2beef95b37264f8a98c0b505b93de9 Mon Sep 17 00:00:00 2001 From: Paul Chaignon Date: Tue, 22 Jul 2025 16:33:37 +0200 Subject: [PATCH 3/3] selftests/bpf: Test invalid narrower ctx load This patch adds selftests to cover invalid narrower loads on the context. These used to cause kernel warnings before the previous patch. To trigger the warning, the load had to be aligned, to read an affected context field (ex., skb->sk), and not starting at the beginning of the field. The nine new cases all fail without the previous patch. Suggested-by: Eduard Zingerman Signed-off-by: Paul Chaignon Signed-off-by: Martin KaFai Lau Acked-by: Eduard Zingerman Link: https://patch.msgid.link/44cd83ea9c6868079943f0a436c6efa850528cc1.1753194596.git.paul.chaignon@gmail.com --- .../selftests/bpf/progs/verifier_ctx.c | 25 +++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/tools/testing/selftests/bpf/progs/verifier_ctx.c b/tools/testing/selftests/bpf/progs/verifier_ctx.c index a83809a1dbbf..0450840c92d9 100644 --- a/tools/testing/selftests/bpf/progs/verifier_ctx.c +++ b/tools/testing/selftests/bpf/progs/verifier_ctx.c @@ -218,4 +218,29 @@ __naked void null_check_8_null_bind(void) : __clobber_all); } +#define narrow_load(type, ctx, field) \ + SEC(type) \ + __description("narrow load on field " #field " of " #ctx) \ + __failure __msg("invalid bpf_context access") \ + __naked void invalid_narrow_load##ctx##field(void) \ + { \ + asm volatile (" \ + r1 = *(u32 *)(r1 + %[off]); \ + r0 = 0; \ + exit;" \ + : \ + : __imm_const(off, offsetof(struct ctx, field) + 4) \ + : __clobber_all); \ + } + +narrow_load("cgroup/getsockopt", bpf_sockopt, sk); +narrow_load("cgroup/getsockopt", bpf_sockopt, optval); +narrow_load("cgroup/getsockopt", bpf_sockopt, optval_end); +narrow_load("tc", __sk_buff, sk); +narrow_load("cgroup/bind4", bpf_sock_addr, sk); +narrow_load("sockops", bpf_sock_ops, sk); +narrow_load("sockops", bpf_sock_ops, skb_data); +narrow_load("sockops", bpf_sock_ops, skb_data_end); +narrow_load("sockops", bpf_sock_ops, skb_hwtstamp); + char _license[] SEC("license") = "GPL";