linux/include
Peilin Ye 720b6ced85 ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode
[ Upstream commit 31c417c948 ]

As pointed out by Jakub Kicinski, currently using TUNNEL_SEQ in
collect_md mode is racy for [IP6]GRE[TAP] devices.  Consider the
following sequence of events:

1. An [IP6]GRE[TAP] device is created in collect_md mode using "ip link
   add ... external".  "ip" ignores "[o]seq" if "external" is specified,
   so TUNNEL_SEQ is off, and the device is marked as NETIF_F_LLTX (i.e.
   it uses lockless TX);
2. Someone sets TUNNEL_SEQ on outgoing skb's, using e.g.
   bpf_skb_set_tunnel_key() in an eBPF program attached to this device;
3. gre_fb_xmit() or __gre6_xmit() processes these skb's:

	gre_build_header(skb, tun_hlen,
			 flags, protocol,
			 tunnel_id_to_key32(tun_info->key.tun_id),
			 (flags & TUNNEL_SEQ) ? htonl(tunnel->o_seqno++)
					      : 0);   ^^^^^^^^^^^^^^^^^

Since we are not using the TX lock (&txq->_xmit_lock), multiple CPUs may
try to do this tunnel->o_seqno++ in parallel, which is racy.  Fix it by
making o_seqno atomic_t.

As mentioned by Eric Dumazet in commit b790e01aee ("ip_gre: lockless
xmit"), making o_seqno atomic_t increases "chance for packets being out
of order at receiver" when NETIF_F_LLTX is on.

Maybe a better fix would be:

1. Do not ignore "oseq" in external mode.  Users MUST specify "oseq" if
   they want the kernel to allow sequencing of outgoing packets;
2. Reject all outgoing TUNNEL_SEQ packets if the device was not created
   with "oseq".

Unfortunately, that would break userspace.

We could now make [IP6]GRE[TAP] devices always NETIF_F_LLTX, but let us
do it in separate patches to keep this fix minimal.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 77a5196a80 ("gre: add sequence number for collect md mode.")
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09 09:05:04 +02:00
..
acpi ACPICA: actypes.h: Expand the ACPI_ACCESS_ definitions 2022-01-27 10:54:18 +01:00
asm-generic tlb: hugetlb: Add more sizes to tlb_remove_huge_tlb_entry 2022-04-20 09:23:22 +02:00
clocksource clocksource/drivers/timer-ti-dm: Save and restore timer TIOCP_CFG 2021-07-14 16:56:12 +02:00
crypto crypto: public_key: fix overflow during implicit conversion 2021-09-18 13:40:08 +02:00
drm drm: protect drm_master pointers in drm_lease.c 2021-09-18 13:40:19 +02:00
dt-bindings clk: imx8mq: remove SYS PLL 1/2 clock gates 2021-07-14 16:56:20 +02:00
keys certs: Add EFI_CERT_X509_GUID support for dbx entries 2021-06-30 08:47:30 -04:00
kunit kunit: fix display of failed expectations for strings 2020-11-10 13:45:15 -07:00
kvm ARM: 2020-10-23 11:17:56 -07:00
linux mtd: fix 'part' field data corruption in mtd_info 2022-05-09 09:05:02 +02:00
math-emu
media media: subdev: disallow ioctl for saa6588/davinci 2021-07-19 09:45:02 +02:00
memory memory: renesas-rpc-if: Fix HF/OSPI data transfer in Manual Mode 2022-05-09 09:05:02 +02:00
misc
net ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode 2022-05-09 09:05:04 +02:00
pcmcia
ras mm,hwpoison: introduce MF_MSG_UNSPLIT_THP 2020-10-16 11:11:17 -07:00
rdma RDMA/netlink: Add __maybe_unused to static inline in C file 2021-11-26 10:39:21 +01:00
scsi scsi: iscsi: Fix conn cleanup and stop race during iscsid restart 2022-04-20 09:23:17 +02:00
soc firmware: raspberrypi: Keep count of all consumers 2021-09-15 09:50:41 +02:00
sound ALSA: pcm: Fix potential AB/BA lock with buffer_mutex and mmap_lock 2022-04-08 14:39:53 +02:00
target scsi: target: Fix ordered tag handling 2021-11-26 10:39:11 +01:00
trace SUNRPC: Fix the svc_deferred_event trace class 2022-04-20 09:23:11 +02:00
uapi can: isotp: set default value for N_As to 50 micro seconds 2022-04-13 21:01:00 +02:00
vdso
video gpu: ipu-v3: remove unused functions 2020-10-26 10:42:38 +01:00
xen xen/gnttab: fix gnttab_end_foreign_access() without page specified 2022-03-11 12:11:54 +01:00