mirror of
https://github.com/torvalds/linux.git
synced 2026-05-17 11:08:02 +02:00
The kernel crash was caused by a BPF program attached to the
"lsm_cgroup/socket_sock_rcv_skb" hook, which performed a call to
`bpf_setsockopt()` in order to set the TCP_NODELAY flag as an
example. Flags like TCP_NODELAY can prompt the kernel to flush a
socket's outgoing queue, and this hook
"lsm_cgroup/socket_sock_rcv_skb" is frequently triggered by
softirqs. The issue was that in certain circumstances, when
`tcp_write_xmit()` was called to flush the queue, it would also allow
BH (bottom-half) to run. This could lead to our program attempting to
flush the same socket recursively, which caused a `skbuff` to be
unlinked twice.
`security_sock_rcv_skb()` is triggered by `tcp_filter()`. This occurs
before the sock ownership is checked in `tcp_v4_rcv()`. Consequently,
if a bpf program runs on `security_sock_rcv_skb()` while under softirq
conditions, it may not possess the lock needed for `bpf_setsockopt()`,
thus presenting an issue.
The patch fixes this issue by ensuring that a BPF program attached to
the "lsm_cgroup/socket_sock_rcv_skb" hook is not allowed to call
`bpf_setsockopt()`.
The differences from v1 are
- changing commit log to explain holding the lock of the sock,
- emphasizing that TCP_NODELAY is not the only flag, and
- adding the fixes tag.
v1: https://lore.kernel.org/bpf/20230125000244.1109228-1-kuifeng@meta.com/
Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Fixes:
|
||
|---|---|---|
| .. | ||
| preload | ||
| arraymap.c | ||
| bloom_filter.c | ||
| bpf_cgrp_storage.c | ||
| bpf_inode_storage.c | ||
| bpf_iter.c | ||
| bpf_local_storage.c | ||
| bpf_lru_list.c | ||
| bpf_lru_list.h | ||
| bpf_lsm.c | ||
| bpf_struct_ops_types.h | ||
| bpf_struct_ops.c | ||
| bpf_task_storage.c | ||
| btf.c | ||
| cgroup_iter.c | ||
| cgroup.c | ||
| core.c | ||
| cpumap.c | ||
| devmap.c | ||
| disasm.c | ||
| disasm.h | ||
| dispatcher.c | ||
| hashtab.c | ||
| helpers.c | ||
| inode.c | ||
| Kconfig | ||
| link_iter.c | ||
| local_storage.c | ||
| lpm_trie.c | ||
| Makefile | ||
| map_in_map.c | ||
| map_in_map.h | ||
| map_iter.c | ||
| memalloc.c | ||
| mmap_unlock_work.h | ||
| net_namespace.c | ||
| offload.c | ||
| percpu_freelist.c | ||
| percpu_freelist.h | ||
| prog_iter.c | ||
| queue_stack_maps.c | ||
| reuseport_array.c | ||
| ringbuf.c | ||
| stackmap.c | ||
| syscall.c | ||
| sysfs_btf.c | ||
| task_iter.c | ||
| tnum.c | ||
| trampoline.c | ||
| verifier.c | ||