linux/net/core
Jakub Kicinski 3988164fe9 net: stream: don't purge sk_error_queue in sk_stream_kill_queues()
[ Upstream commit 24bcbe1cc6 ]

sk_stream_kill_queues() can be called on close when there are
still outstanding skbs to transmit. Those skbs may try to queue
notifications to the error queue (e.g. timestamps).
If sk_stream_kill_queues() purges the queue without taking
its lock, the queue may get corrupted and skbs leaked.

This shows up as a warning about an rmem leak:

WARNING: CPU: 24 PID: 0 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x...

The leak is always a multiple of 0x300 bytes (the value is in
%rax on my builds, so RAX: 0000000000000300). 0x300 is the truesize
of an empty sk_buff. Indeed, if we dump the socket state at the time
of the warning, the sk_error_queue is often (but not always)
corrupted: the ->next pointer points back at the list head,
but the ->prev pointer does not. We can find the leaked skb
by scanning the kernel memory for something that looks like
an skb with ->sk = socket in question, and ->truesize = 0x300.
The contents of ->cb[] of the skb confirms the suspicion that
it is indeed a timestamp notification (as generated in
__skb_complete_tx_timestamp()).
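
The corrupted state described above (->next at the list head, ->prev
elsewhere) can be reproduced in a single-threaded replay of one possible
interleaving. This is only a sketch under assumptions: the commit message
does not say which interleaving actually occurred, and struct node,
list_init(), list_add_tail(), unlink_first() and replay_race() are
hypothetical stand-ins for the sk_buff_head list, not kernel code. The
assumed sequence is that the enqueuer samples the tail pointer, the purger
then empties the queue without the lock, and the enqueuer finishes its
insert with the stale tail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the circular doubly linked list behind
 * sk_buff_head. This is an illustration, not kernel code. */
struct node {
	struct node *next;
	struct node *prev;
};

static void list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

/* Append n at the tail; used here only to build the initial state. */
static void list_add_tail(struct node *head, struct node *n)
{
	struct node *prev = head->prev;

	n->next = head;
	n->prev = prev;
	head->prev = n;
	prev->next = n;
}

/* Unlink the first element without any locking, as an unlocked purge
 * would. Returns NULL once the list is empty. */
static struct node *unlink_first(struct node *head)
{
	struct node *n = head->next;
	struct node *next, *prev;

	if (n == head)
		return NULL;
	next = n->next;
	prev = n->prev;
	n->next = n->prev = NULL;
	next->prev = prev;
	prev->next = next;
	return n;
}

/* Replay one assumed interleaving on a non-empty queue:
 *   1. the enqueuer samples the current tail;
 *   2. the purger drains the queue (no lock taken);
 *   3. the enqueuer completes its insert using the stale tail.
 * Returns 1 if `fresh` cannot be reached by walking ->next from head,
 * i.e. a later purge would never find it. */
static int replay_race(struct node *head, struct node *fresh)
{
	struct node *stale_tail, *p;

	assert(head->next != head);          /* queue must start non-empty */

	stale_tail = head->prev;             /* step 1 */

	while (unlink_first(head) != NULL)   /* step 2 */
		;

	fresh->next = head;                  /* step 3 */
	fresh->prev = stale_tail;
	head->prev = fresh;
	stale_tail->next = fresh;            /* in a real kernel this entry may already be freed */

	for (p = head->next; p != head; p = p->next)
		if (p == fresh)
			return 0;
	return 1;
}
```

Calling replay_race() on a queue holding a single entry leaves head.next
pointing at the head and head.prev pointing at the new entry, so the new
entry is invisible to any walk that follows ->next.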

Removing purging of sk_error_queue should be okay, since
inet_sock_destruct() does it again once all socket refs
are gone. Eric suggests this may cause sockets that go
through disconnect() to maintain notifications from the
previous incarnations of the socket, but that should be
okay since the race was there anyway, and disconnect()
is not exactly dependable.

Thanks to Jonathan Lemon and Omar Sandoval for help at various
stages of tracing the issue.

Fixes: cb9eff0978 ("net: new user space API for time stamping of incoming and outgoing packets")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:16:34 +01:00
bpf_sk_storage.c net: in_irq() cleanup 2021-08-13 14:09:19 -07:00
datagram.c
datagram.h
dev_addr_lists.c net: dev_addr_list: handle first address in __hw_addr_add_ex 2021-09-30 13:29:09 +01:00
dev_ioctl.c net: core: don't call SIOCBRADD/DELIF for non-bridge devices 2021-08-05 11:36:59 +01:00
dev.c net: sched: update default qdisc visibility after Tx queue cnt changes 2021-11-18 19:16:10 +01:00
devlink.c devlink: Clear whole devlink_flash_notify struct 2021-08-14 13:59:10 +01:00
drop_monitor.c net: Remove redundant if statements 2021-08-05 13:27:50 +01:00
dst_cache.c
dst.c net: Remove redundant if statements 2021-08-05 13:27:50 +01:00
failover.c
fib_notifier.c
fib_rules.c memcg: enable accounting for IP address and routing-related objects 2021-07-20 06:00:38 -07:00
filter.c bpf: tcp: Allow bpf-tcp-cc to call bpf_(get|set)sockopt 2021-08-25 17:40:35 -07:00
flow_dissector.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-07-31 09:14:46 -07:00
flow_offload.c net: Fix offloading indirect devices dependency on qdisc order creation 2021-08-19 13:19:30 +01:00
gen_estimator.c
gen_stats.c
gro_cells.c
hwbm.c
link_watch.c net: linkwatch: fix failure to restore device state across suspend/resume 2021-08-11 14:43:16 -07:00
lwt_bpf.c
lwtunnel.c netfilter: add netfilter hooks to SRv6 data plane 2021-08-30 01:51:36 +02:00
Makefile sock_map: Relax config dependency to CONFIG_NET 2021-07-15 18:17:49 -07:00
neighbour.c net, neigh: Fix NTF_EXT_LEARNED in combination with NTF_USE 2021-11-18 19:16:32 +01:00
net_namespace.c net: net_namespace: Fix undefined member in key_remove_domain() 2021-11-18 19:16:24 +01:00
net-procfs.c Revert "net: procfs: add seq_puts() statement for dev_mcast" 2021-10-13 17:24:38 -07:00
net-sysfs.c net-sysfs: try not to restart the syscall if it will fail eventually 2021-11-18 19:16:14 +01:00
net-sysfs.h
net-traces.c tcp: add tracepoint for checksum errors 2021-05-14 15:26:03 -07:00
netclassid_cgroup.c bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode 2021-09-13 16:35:58 -07:00
netevent.c net: core: Correct function name netevent_unregister_notifier() in the kerneldoc 2021-03-28 17:56:56 -07:00
netpoll.c asm-generic/unaligned: Unify asm/unaligned.h around struct helper 2021-07-02 12:43:40 -07:00
netprio_cgroup.c bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode 2021-09-13 16:35:58 -07:00
page_pool.c page_pool: use relaxed atomic for release side accounting 2021-08-24 10:46:31 +01:00
pktgen.c pktgen: remove unused variable 2021-09-03 11:48:28 +01:00
ptp_classifier.c bpf: Refactor BPF_PROG_RUN into a function 2021-08-17 00:45:07 +02:00
request_sock.c
rtnetlink.c rtnetlink: fix if_nlmsg_stats_size() under estimation 2021-10-06 15:09:46 +01:00
scm.c memcg: enable accounting for scm_fp_list objects 2021-07-20 06:00:38 -07:00
secure_seq.c
selftests.c net: selftests: add MTU test 2021-07-22 00:52:04 -07:00
skbuff.c skb_expand_head() adjust skb->truesize incorrectly 2021-10-22 12:35:51 -07:00
skmsg.c skmsg: Extract and reuse sk_msg_is_readable() 2021-10-26 12:29:33 -07:00
sock_destructor.h skb_expand_head() adjust skb->truesize incorrectly 2021-10-22 12:35:51 -07:00
sock_diag.c
sock_map.c af_unix: Add unix_stream_proto for sockmap 2021-08-16 18:43:39 -07:00
sock_reuseport.c tcp: Add stats for socket migration. 2021-06-23 12:56:08 -07:00
sock.c af_unix: fix races in sk_peer_pid and sk_peer_cred accesses 2021-09-30 14:18:40 +01:00
stream.c net: stream: don't purge sk_error_queue in sk_stream_kill_queues() 2021-11-18 19:16:34 +01:00
sysctl_net_core.c bpf: Prevent increasing bpf_jit_limit above max 2021-10-22 17:23:53 -07:00
timestamping.c
tso.c
utils.c
xdp.c xdp: Move the rxq_info.mem clearing to unreg_mem_model() 2021-06-28 23:07:59 +02:00