linux/net
Paolo Abeni cf742d0955 mptcp: fix inconsistent state on fastopen race
[ Upstream commit 4fd19a3070 ]

The netlink PM can race with fastopen self-connect attempts, shutting
down the first subflow via:

MPTCP_PM_CMD_DEL_ADDR -> mptcp_nl_remove_id_zero_address ->
  mptcp_pm_nl_rm_subflow_received -> mptcp_close_ssk

and transitioning such subflow to FIN_WAIT1 status before the syn-ack
packet is processed. The MPTCP code does not react to such state change,
leaving the connection in not-fallback status and the subflow handshake
uncompleted, triggering the following splat:

  WARNING: CPU: 0 PID: 10630 at net/mptcp/subflow.c:1405 subflow_data_ready+0x39f/0x690 net/mptcp/subflow.c:1405
  Modules linked in:
  CPU: 0 PID: 10630 Comm: kworker/u4:11 Not tainted 6.6.0-syzkaller-14500-g1c41041124bd #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
  Workqueue: bat_events batadv_nc_worker
  RIP: 0010:subflow_data_ready+0x39f/0x690 net/mptcp/subflow.c:1405
  Code: 18 89 ee e8 e3 d2 21 f7 40 84 ed 75 1f e8 a9 d7 21 f7 44 89 fe bf 07 00 00 00 e8 0c d3 21 f7 41 83 ff 07 74 07 e8 91 d7 21 f7 <0f> 0b e8 8a d7 21 f7 48 89 df e8 d2 b2 ff ff 31 ff 89 c5 89 c6 e8
  RSP: 0018:ffffc90000007448 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff888031efc700 RCX: ffffffff8a65baf4
  RDX: ffff888043222140 RSI: ffffffff8a65baff RDI: 0000000000000005
  RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000007
  R10: 000000000000000b R11: 0000000000000000 R12: 1ffff92000000e89
  R13: ffff88807a534d80 R14: ffff888021c11a00 R15: 000000000000000b
  FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fa19a0ffc81 CR3: 000000007a2db000 CR4: 00000000003506f0
  DR0: 000000000000d8dd DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Call Trace:
   <IRQ>
   tcp_data_ready+0x14c/0x5b0 net/ipv4/tcp_input.c:5128
   tcp_data_queue+0x19c3/0x5190 net/ipv4/tcp_input.c:5208
   tcp_rcv_state_process+0x11ef/0x4e10 net/ipv4/tcp_input.c:6844
   tcp_v4_do_rcv+0x369/0xa10 net/ipv4/tcp_ipv4.c:1929
   tcp_v4_rcv+0x3888/0x3b30 net/ipv4/tcp_ipv4.c:2329
   ip_protocol_deliver_rcu+0x9f/0x480 net/ipv4/ip_input.c:205
   ip_local_deliver_finish+0x2e4/0x510 net/ipv4/ip_input.c:233
   NF_HOOK include/linux/netfilter.h:314 [inline]
   NF_HOOK include/linux/netfilter.h:308 [inline]
   ip_local_deliver+0x1b6/0x550 net/ipv4/ip_input.c:254
   dst_input include/net/dst.h:461 [inline]
   ip_rcv_finish+0x1c4/0x2e0 net/ipv4/ip_input.c:449
   NF_HOOK include/linux/netfilter.h:314 [inline]
   NF_HOOK include/linux/netfilter.h:308 [inline]
   ip_rcv+0xce/0x440 net/ipv4/ip_input.c:569
   __netif_receive_skb_one_core+0x115/0x180 net/core/dev.c:5527
   __netif_receive_skb+0x1f/0x1b0 net/core/dev.c:5641
   process_backlog+0x101/0x6b0 net/core/dev.c:5969
   __napi_poll.constprop.0+0xb4/0x540 net/core/dev.c:6531
   napi_poll net/core/dev.c:6600 [inline]
   net_rx_action+0x956/0xe90 net/core/dev.c:6733
   __do_softirq+0x21a/0x968 kernel/softirq.c:553
   do_softirq kernel/softirq.c:454 [inline]
   do_softirq+0xaa/0xe0 kernel/softirq.c:441
   </IRQ>
   <TASK>
   __local_bh_enable_ip+0xf8/0x120 kernel/softirq.c:381
   spin_unlock_bh include/linux/spinlock.h:396 [inline]
   batadv_nc_purge_paths+0x1ce/0x3c0 net/batman-adv/network-coding.c:471
   batadv_nc_worker+0x9b1/0x10e0 net/batman-adv/network-coding.c:722
   process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
   process_scheduled_works kernel/workqueue.c:2703 [inline]
   worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784
   kthread+0x33c/0x440 kernel/kthread.c:388
   ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
   ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
   </TASK>

To address the issue, catch the racing subflow state change and
use it to cause the MPTCP fallback. Such fallback is also used to
cause the first subflow state propagation to the msk socket via
mptcp_set_connected(). After this change, the first subflow can
additionally propagate the TCP_FIN_WAIT1 state, so rename the
helper accordingly.

Finally, if the state propagation is delayed to the msk release
callback, the first subflow can change to a different state in between.
Cache the relevant target state in a new msk-level field and use
such value to update the msk state at release time.

Fixes: 1e777f39b4 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
Cc: stable@vger.kernel.org
Reported-by: <syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/458
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-05 15:19:42 +01:00
..
6lowpan
9p net: 9p: avoid freeing uninit memory in p9pdu_vreadf 2024-01-01 12:42:41 +00:00
802
8021q net: check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev() 2024-01-01 12:42:32 +00:00
appletalk appletalk: Fix Use-After-Free in atalk_ioctl 2023-12-20 17:01:50 +01:00
atm atm: Fix Use-After-Free in do_vcc_ioctl 2023-12-20 17:01:48 +01:00
ax25 ax25: Kconfig: Update link for linux-ax25.org 2023-09-18 12:56:58 +01:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-08-24 10:51:39 -07:00
bluetooth Bluetooth: Add more enc key size check 2024-01-01 12:42:41 +00:00
bpf bpf: Prevent inlining of bpf_fentry_test7() 2023-08-30 08:36:17 +02:00
bpfilter
bridge netfilter: nf_conntrack_bridge: initialize err to 0 2023-11-28 17:19:52 +00:00
caif
can can: isotp: isotp_sendmsg(): fix TX state detection and wait behavior 2023-10-06 12:54:33 +02:00
ceph libceph: use kernel_connect() 2023-10-09 13:35:24 +02:00
core net: avoid build bug in skb extension length calculation 2024-01-01 12:42:42 +00:00
dcb
dccp dccp/tcp: Call security_inet_conn_request() after setting IPv6 addresses. 2023-11-20 11:59:35 +01:00
devlink devlink: Hold devlink lock on health reporter dump get 2023-10-06 15:56:46 -07:00
dns_resolver keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry 2024-01-01 12:42:33 +00:00
dsa
ethernet
ethtool ethtool: don't propagate EOPNOTSUPP from dumps 2023-12-08 08:52:23 +01:00
handshake net/handshake: fix file ref count in handshake_nl_accept_doit() 2023-10-23 10:19:33 -07:00
hsr hsr: Prevent use after free in prp_create_tagged_frame() 2023-11-20 11:59:34 +01:00
ieee802154 sysctl-6.6-rc1 2023-08-29 17:39:15 -07:00
ife net: sched: ife: fix potential use-after-free 2024-01-01 12:42:30 +00:00
ipv4 net: Remove acked SYN flag from packet in the transmit queue correctly 2023-12-20 17:01:49 +01:00
ipv6 net/ipv6: Revert remove expired routes with a separated list of routes 2024-01-01 12:42:33 +00:00
iucv
kcm kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg(). 2023-09-14 10:43:51 +02:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-08-18 12:44:56 -07:00
l2tp udp: annotate data-races around udp->encap_type 2023-11-20 11:58:56 +01:00
l3mdev
lapb
llc llc: verify mac len before reading mac header 2023-11-20 11:59:34 +01:00
mac80211 wifi: mac80211: mesh_plink: fix matches_local logic 2024-01-01 12:42:27 +00:00
mac802154
mctp mctp: perform route lookups under a RCU read-side lock 2023-10-10 19:43:22 -07:00
mpls
mptcp mptcp: fix inconsistent state on fastopen race 2024-01-05 15:19:42 +01:00
ncsi Revert ncsi: Propagate carrier gain/loss events to the NCSI controller 2023-11-28 17:20:10 +00:00
netfilter netfilter: nft_set_pipapo: skip inactive elements during set walk 2023-12-13 18:45:35 +01:00
netlabel
netlink drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group 2023-12-13 18:45:10 +01:00
netrom netrom: Deny concurrent connect(). 2023-08-28 06:58:46 +01:00
nfc nfc: nci: fix possible NULL pointer dereference in send_acknowledge() 2023-10-16 17:34:53 -07:00
nsh
openvswitch net/sched: act_ct: Always fill offloading tuple iifidx 2023-11-20 11:59:37 +01:00
packet packet: Move reference count in packet_sock to atomic_long_t 2023-12-13 18:45:23 +01:00
phonet
psample psample: Require 'CAP_NET_ADMIN' when joining "packets" group 2023-12-13 18:45:10 +01:00
qrtr
rds net: prevent address rewrite in kernel_bind() 2023-10-01 19:31:29 +01:00
rfkill net: rfkill: gpio: set GPIO direction 2024-01-01 12:42:41 +00:00
rose net/rose: fix races in rose_kill_by_device() 2024-01-01 12:42:31 +00:00
rxrpc rxrpc: Fix some minor issues with bundle tracing 2023-12-20 17:01:55 +01:00
sched net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table 2023-12-20 17:01:47 +01:00
sctp sctp: update hb timer immediately after users change hb_interval 2023-10-04 17:29:58 -07:00
smc net/smc: fix missing byte order conversion in CLC handshake 2023-12-13 18:45:11 +01:00
strparser
sunrpc SUNRPC: Revert 5f7fc5d69f 2024-01-01 12:42:26 +00:00
switchdev
tipc tipc: Fix kernel-infoleak due to uninitialized TLV value 2023-11-28 17:19:51 +00:00
tls net: tls, update curr on splice as well 2023-12-13 18:45:10 +01:00
unix bpf, sockmap: af_unix stream sockets need to hold ref for pair sock 2023-12-08 08:52:23 +01:00
vmw_vsock vsock/virtio: Fix unsigned integer wrap around in virtio_transport_has_space() 2023-12-20 17:01:50 +01:00
wireless wifi: cfg80211: fix certs build to not depend on file order 2024-01-01 12:42:39 +00:00
x25
xdp xsk: Skip polling event check for unbound socket 2023-12-13 18:45:06 +01:00
xfrm ipsec-2023-10-17 2023-10-17 18:21:13 -07:00
compat.c
devres.c
Kconfig
Kconfig.debug
Makefile
socket.c net: prevent address rewrite in kernel_bind() 2023-10-01 19:31:29 +01:00
sysctl_net.c