linux

mirror of https://github.com/torvalds/linux.git synced 2026-06-07 05:55:44 +02:00


John Fastabend fd09af0107 bpf: sock_ops ctx access may stomp registers in corner case

I had a sockmap program that after doing some refactoring started spewing this splat at me:

    [18610.807284] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
    [...]
    [18610.807359] Call Trace:
    [18610.807370]  ? 0xffffffffc114d0d5
    [18610.807382]  __cgroup_bpf_run_filter_sock_ops+0x7d/0xb0
    [18610.807391]  tcp_connect+0x895/0xd50
    [18610.807400]  tcp_v4_connect+0x465/0x4e0
    [18610.807407]  __inet_stream_connect+0xd6/0x3a0
    [18610.807412]  ? __inet_stream_connect+0x5/0x3a0
    [18610.807417]  inet_stream_connect+0x3b/0x60
    [18610.807425]  __sys_connect+0xed/0x120

After some debugging I was able to build this simple reproducer,

    __section("sockops/reproducer_bad")
    int bpf_reproducer_bad(struct bpf_sock_ops *skops)
    {
            volatile __maybe_unused __u32 i = skops->snd_ssthresh;
            return 0;
    }

And along the way noticed that the program below ran without a splat,

    __section("sockops/reproducer_good")
    int bpf_reproducer_good(struct bpf_sock_ops *skops)
    {
            volatile __maybe_unused __u32 i = skops->snd_ssthresh;
            volatile __maybe_unused __u32 family;

            compiler_barrier();

            family = skops->family;
            return 0;
    }

So I decided to check out the code we generate for the above two programs and noticed each generates the BPF code you would expect,

    0000000000000000 <bpf_reproducer_bad>:
    ;       volatile __maybe_unused __u32 i = skops->snd_ssthresh;
           0:       r1 = *(u32 *)(r1 + 96)
           1:       *(u32 *)(r10 - 4) = r1
    ;       return 0;
           2:       r0 = 0
           3:       exit

    0000000000000000 <bpf_reproducer_good>:
    ;       volatile __maybe_unused __u32 i = skops->snd_ssthresh;
           0:       r2 = *(u32 *)(r1 + 96)
           1:       *(u32 *)(r10 - 4) = r2
    ;       family = skops->family;
           2:       r1 = *(u32 *)(r1 + 20)
           3:       *(u32 *)(r10 - 8) = r1
    ;       return 0;
           4:       r0 = 0
           5:       exit

So we get reasonable assembly, but still something was causing the null pointer dereference. So, we load the programs and dump the xlated version observing that line 0 above 'r* = *(u32 *)(r1 +96)' is going to be translated by the skops access helpers.

    int bpf_reproducer_bad(struct bpf_sock_ops * skops):
    ; volatile __maybe_unused __u32 i = skops->snd_ssthresh;
       0: (61) r1 = *(u32 *)(r1 +28)
       1: (15) if r1 == 0x0 goto pc+2
       2: (79) r1 = *(u64 *)(r1 +0)
       3: (61) r1 = *(u32 *)(r1 +2340)
    ; volatile __maybe_unused __u32 i = skops->snd_ssthresh;
       4: (63) *(u32 *)(r10 -4) = r1
    ; return 0;
       5: (b7) r0 = 0
       6: (95) exit

    int bpf_reproducer_good(struct bpf_sock_ops * skops):
    ; volatile __maybe_unused __u32 i = skops->snd_ssthresh;
       0: (61) r2 = *(u32 *)(r1 +28)
       1: (15) if r2 == 0x0 goto pc+2
       2: (79) r2 = *(u64 *)(r1 +0)
       3: (61) r2 = *(u32 *)(r2 +2340)
    ; volatile __maybe_unused __u32 i = skops->snd_ssthresh;
       4: (63) *(u32 *)(r10 -4) = r2
    ; family = skops->family;
       5: (79) r1 = *(u64 *)(r1 +0)
       6: (69) r1 = *(u16 *)(r1 +16)
    ; family = skops->family;
       7: (63) *(u32 *)(r10 -8) = r1
    ; return 0;
       8: (b7) r0 = 0
       9: (95) exit

Then we look at lines 0 and 2 above. In the good case we do the zero check in r2 and then load 'r1 + 0' at line 2. Do a quick cross-check into the bpf_sock_ops check and we can confirm that is the 'struct sock *sk' pointer field. But, in the bad case,

       0: (61) r1 = *(u32 *)(r1 +28)
       1: (15) if r1 == 0x0 goto pc+2
       2: (79) r1 = *(u64 *)(r1 +0)

Oh no, we read 'r1 +28' into r1, this is skops->fullsock, and then in line 2 we read the 'r1 +0' as a pointer. Now jumping back to our splat,

    [18610.807284] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001

The 0x01 makes sense because that is exactly the fullsock value. And it's not a valid dereference so we splat.

To fix we need to guard the case when a program is doing a sock_ops field access with src_reg == dst_reg. This is already handled in the load case where the ctx_access handler uses a tmp register, being careful to store the old value and restore it. To fix the get case, test if src_reg == dst_reg and in this case do the is_fullsock test in the temporary register, remembering to restore the temporary register before writing to either dst_reg or src_reg to avoid smashing the pointer into the struct holding the tmp variable.

Adding this inline code to test_tcpbpf_kern will now be generated correctly from,

       9: r2 = *(u32 *)(r2 + 96)

to xlated code,

      12: (7b) *(u64 *)(r2 +32) = r9
      13: (61) r9 = *(u32 *)(r2 +28)
      14: (15) if r9 == 0x0 goto pc+4
      15: (79) r9 = *(u64 *)(r2 +32)
      16: (79) r2 = *(u64 *)(r2 +0)
      17: (61) r2 = *(u32 *)(r2 +2348)
      18: (05) goto pc+1
      19: (79) r9 = *(u64 *)(r2 +32)

And in the normal case we keep the original code, because really this is an edge case. From this,

       9: r2 = *(u32 *)(r6 + 96)

to xlated code,

      22: (61) r2 = *(u32 *)(r6 +28)
      23: (15) if r2 == 0x0 goto pc+2
      24: (79) r2 = *(u64 *)(r6 +0)
      25: (61) r2 = *(u32 *)(r2 +2348)

So three additional instructions if dst == src register, but I scanned my current code base and did not see this pattern anywhere so it should not be a big deal. Further, it seems no one else has hit this or at least reported it, so it must be a fairly rare pattern.

Fixes: `9b1f3d6e5a` ("bpf: Refactor sock_ops_convert_ctx_access")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/159718347772.4728.2781381670567919577.stgit@john-Precision-5820-Tower

2020-08-13 22:40:36 +02:00
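The register clobbering described in the commit message can be reproduced in user space. The following is only a minimal sketch and not the kernel's code: `fake_sock`, `fake_sock_ops_kern`, `struct regs`, `naive_load_clobbers_ctx` and `fixed_load` are hypothetical names invented for illustration, and the struct layout does not match the real offsets (+0, +28, +32) from the xlated dumps. It mirrors the two instruction sequences for the dst_reg == src_reg case: the naive one overwrites the only copy of the ctx pointer with the is_fullsock value, while the fixed one does the is_fullsock test in a spilled scratch register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures (illustrative layout only). */
struct fake_sock {
	uint32_t snd_ssthresh;
};

struct fake_sock_ops_kern {
	struct fake_sock *sk;   /* the "+0" load in the xlated dump */
	uint32_t is_fullsock;   /* the "+28" load */
	uint64_t temp;          /* scratch slot, the "+32" store */
};

/* A simulated register holds either a pointer or a scalar. */
typedef uintptr_t reg_t;

struct regs {
	reg_t r2; /* src_reg == dst_reg: ctx pointer in, field value out */
	reg_t r9; /* unrelated live register borrowed as the temporary */
};

#define CTX(r) ((struct fake_sock_ops_kern *)(r))

/*
 * Naive sequence with dst == src. Returns 1 if the next instruction
 * would dereference a register that no longer holds the ctx pointer.
 */
int naive_load_clobbers_ctx(reg_t r1)
{
	reg_t ctx_ptr = r1;

	assert(r1 != 0);
	r1 = CTX(r1)->is_fullsock;  /* 0: overwrites the only ctx pointer */
	if (r1 == 0)                /* 1: not a full socket, skip the loads */
		return 0;
	/* 2: would be r1 = *(u64 *)(r1 + 0), but r1 is now the flag value */
	return r1 != ctx_ptr;
}

/* Fixed sequence: the is_fullsock test happens in the spilled r9. */
void fixed_load(struct regs *r)
{
	assert(r->r2 != 0);
	CTX(r->r2)->temp = r->r9;                /* 12: spill r9 into ctx */
	r->r9 = CTX(r->r2)->is_fullsock;         /* 13: test value in r9 */
	if (r->r9 == 0) {                        /* 14: goto 19 */
		r->r9 = CTX(r->r2)->temp;        /* 19: restore r9 */
		return;
	}
	r->r9 = CTX(r->r2)->temp;                /* 15: restore r9 first */
	r->r2 = (reg_t)CTX(r->r2)->sk;           /* 16: only now clobber r2 */
	r->r2 = ((struct fake_sock *)r->r2)->snd_ssthresh; /* 17 */
}
```

In this sketch, r9 is restored before r2 is overwritten because r2 is the only way to reach the spill slot, which is the ordering the commit message calls out. On the not-a-full-socket path the sketch leaves r2 untouched, matching what the dump above shows for this version of the code.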
..
bpf_sk_storage.c	bpf: Change uapi for bpf iterator map elements	2020-08-06 16:39:14 -07:00
datagram.c	net: use indirect call wrappers for skb_copy_datagram_iter()	2020-03-25 11:30:40 -07:00
datagram.h
dev_addr_lists.c	net: explain the lockdep annotations for dev_uc_unsync()	2020-06-28 21:38:27 -07:00
dev_ioctl.c	net: Call into DSA netdevice_ops wrappers	2020-07-20 16:48:22 -07:00
dev.c	bpf: Fix XDP FD-based attach/detach logic around XDP_FLAGS_UPDATE_IF_NOEXIST	2020-08-12 18:00:49 -07:00
devlink.c	devlink: Pass extack when setting trap's action and group's parameters	2020-08-03 18:06:46 -07:00
drop_monitor.c	net: Add MODULE_DESCRIPTION entries to network modules	2020-06-20 21:33:57 -07:00
dst_cache.c
dst.c	net/dst: use a smaller percpu_counter batch for dst entries accounting	2020-05-08 21:33:33 -07:00
failover.c
fib_notifier.c	net: fib_notifier: propagate extack down to the notifier block callback	2019-10-04 11:10:56 -07:00
fib_rules.c	fib: Fix undef compile warning	2020-08-03 18:01:49 -07:00
filter.c	bpf: sock_ops ctx access may stomp registers in corner case	2020-08-13 22:40:36 +02:00
flow_dissector.c	net/flow_dissector: add packet hash dissection	2020-07-24 15:23:31 -07:00
flow_offload.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2020-07-25 17:49:04 -07:00
gen_estimator.c	net_sched: gen_estimator: extend packet counter to 64bit	2019-11-06 21:51:36 -08:00
gen_stats.c	docs: networking: convert gen_stats.txt to ReST	2020-04-28 14:39:46 -07:00
gro_cells.c
hwbm.c	net: hwbm: Make the hwbm_pool lock a mutex	2019-06-09 19:40:10 -07:00
link_watch.c	net: Add IF_OPER_TESTING	2020-04-20 12:43:24 -07:00
lwt_bpf.c	net: add net available in build_state	2020-03-29 22:30:57 -07:00
lwtunnel.c	net: ipv6: add rpl sr tunnel	2020-03-29 22:30:57 -07:00
Makefile	ethtool: move to its own directory	2019-12-12 17:07:05 -08:00
neighbour.c	net: neighbor: add fdb extended attribute	2020-06-24 14:36:33 -07:00
net_namespace.c	nsproxy: add struct nsset	2020-05-09 13:57:12 +02:00
net-procfs.c	net: procfs: use index hashlist instead of name hashlist	2019-10-01 14:47:19 -07:00
net-sysfs.c	The main changes in this cycle were:	2020-08-03 14:58:38 -07:00
net-sysfs.h	net-sysfs: add netdev_change_owner()	2020-02-26 20:07:25 -08:00
net-traces.c	page_pool: add tracepoints for page_pool with details need by XDP	2019-06-19 11:23:13 -04:00
netclassid_cgroup.c	cgroup, netclassid: remove double cond_resched	2020-04-21 15:44:30 -07:00
netevent.c
netpoll.c	netpoll: accept NULL np argument in netpoll_send_skb()	2020-05-07 18:11:07 -07:00
netprio_cgroup.c	netprio_cgroup: Fix unlimited memory leak of v2 cgroups	2020-05-09 20:59:21 -07:00
page_pool.c	net: page pool: allow to pass zero flags to page_pool_init()	2020-03-29 21:49:20 -07:00
pktgen.c	docs: networking: convert pktgen.txt to ReST	2020-04-30 12:56:37 -07:00
ptp_classifier.c
request_sock.c	tcp: add rcu protection around tp->fastopen_rsk	2019-10-13 10:13:08 -07:00
rtnetlink.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	2020-08-03 18:27:40 -07:00
scm.c	fs: Add receive_fd() wrapper for __receive_fd()	2020-07-13 11:03:44 -07:00
secure_seq.c	crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h	2020-05-08 15:32:17 +10:00
skbuff.c	net: Use helper function ip_is_fragment()	2020-08-08 14:24:53 -07:00
skmsg.c	bpf, sockmap: RCU dereferenced psock may be used outside RCU block	2020-06-28 08:33:28 -07:00
sock_diag.c	sock: make cookie generation global instead of per netns	2019-08-09 13:14:46 -07:00
sock_map.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2020-07-11 00:46:00 -07:00
sock_reuseport.c	udp: Copy has_conns in reuseport_grow().	2020-07-21 15:31:02 -07:00
sock.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next	2020-08-05 20:13:21 -07:00
stream.c	tcp: make sure EPOLLOUT wont be missed	2019-08-19 13:07:43 -07:00
sysctl_net_core.c	bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()	2020-07-08 16:01:21 -07:00
timestamping.c	net: Introduce a new MII time stamping interface.	2019-12-25 19:51:33 -08:00
tso.c	net: tso: add UDP segmentation support	2020-06-18 20:46:23 -07:00
utils.c	net: Fix skb->csum update in inet_proto_csum_replace16().	2020-01-24 20:54:30 +01:00
xdp.c	bpf, xdp: Remove XDP_QUERY_PROG and XDP_QUERY_PROG_HW XDP commands	2020-07-25 20:37:02 -07:00