linux/net
Julian Anastasov 7106f94330 ipvs: allow connection reuse for unconfirmed conntrack
[ Upstream commit f0a5e4d7a5 ]

YangYuxi is reporting that connection reuse
is causing one-second delay when SYN hits
existing connection in TIME_WAIT state.
Such delay was added to give time to expire
both the IPVS connection and the corresponding
conntrack. This was considered a rare case
at that time but it is causing problem for
some environments such as Kubernetes.

As nf_conntrack_tcp_packet() can decide to
release the conntrack in TIME_WAIT state and
to replace it with a fresh NEW conntrack, we
can use this to allow rescheduling just by
tuning our check: if the conntrack is
confirmed we can not schedule it to different
real server and the one-second delay still
applies but if new conntrack was created,
we are free to select new real server without
any delays.

YangYuxi lists some of the problem reports:

- One second connection delay in masquerading mode:
https://marc.info/?t=151683118100004&r=1&w=2

- IPVS low throughput #70747
https://github.com/kubernetes/kubernetes/issues/70747

- Apache Bench can fill up ipvs service proxy in seconds #544
https://github.com/cloudnativelabs/kube-router/issues/544

- Additional 1s latency in `host -> service IP -> pod`
https://github.com/kubernetes/kubernetes/issues/90854

Fixes: f719e3754e ("ipvs: drop first packet to redirect conntrack")
Co-developed-by: YangYuxi <yx.atom1@gmail.com>
Signed-off-by: YangYuxi <yx.atom1@gmail.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19 08:14:56 +02:00
..
6lowpan 6lowpan: Off by one handling ->nexthdr 2020-01-27 14:50:41 +01:00
9p net/9p: validate fds in p9_fd_open 2020-08-11 15:32:32 +02:00
802
8021q vlan: vlan_changelink() should propagate errors 2020-01-12 12:17:28 +01:00
appletalk appletalk: Set error code if register_snap_client failed 2019-12-13 08:52:59 +01:00
atm net: use skb_queue_empty_lockless() in poll() handlers 2019-11-10 11:27:48 +01:00
ax25 AX.25: Prevent integer overflows in connect and sendmsg 2020-07-31 18:37:48 +02:00
batman-adv batman-adv: Revert "disable ethtool link speed detection when auto negotiation off" 2020-06-22 09:05:12 +02:00
bluetooth Bluetooth: add a mutex lock to avoid UAF in do_enale_set 2020-08-19 08:14:50 +02:00
bpf
bpfilter signal/bpfilter: Fix bpfilter_kernl to use send_sig not force_sig 2020-01-27 14:50:51 +01:00
bridge net: bridge: enfore alignment for ethernet address 2020-06-30 23:17:03 -04:00
caif net: use skb_queue_empty_lockless() in poll() handlers 2019-11-10 11:27:48 +01:00
can can: gw: Fix error path of cgw_module_init 2019-08-29 08:28:30 +02:00
ceph libceph: don't omit recovery_deletes in target_copy() 2020-07-22 09:32:13 +02:00
core rtnetlink: Fix memory(net_device) leak when ->newlink fails 2020-07-31 18:37:49 +02:00
dcb
dccp net: ipv6: add net argument to ip6_dst_lookup_flow 2020-04-29 16:31:16 +02:00
decnet net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2020-01-04 19:13:37 +01:00
dns_resolver KEYS: Don't write out to userspace while holding key semaphore 2020-04-23 10:30:24 +02:00
dsa net: dsa: mt7530: fix roaming from DSA user ports 2020-06-03 08:19:03 +02:00
ethernet net: add annotations on hh->hh_len lockless accesses 2020-01-09 10:19:09 +01:00
hsr hsr: check protocol version in hsr_newlink() 2020-04-21 09:03:03 +02:00
ieee802154 nl802154: add missing attribute validation for dev_type 2020-03-18 07:14:15 +01:00
ife
ipv4 net: gre: recompute gre csum for sctp over gre tunnels 2020-08-11 15:32:34 +02:00
ipv6 ipv6: fix memory leaks on IPV6_ADDRFORM path 2020-08-11 15:32:34 +02:00
iucv net/af_iucv: always register net_device notifier 2020-01-27 14:50:56 +01:00
kcm kcm: switch order of device registration to fix a crash 2019-04-17 08:38:40 +02:00
key af_key: fix leaks in key_pol_get_resp and dump_sp. 2019-07-26 09:14:01 +02:00
l2tp l2tp: remove skb_dst_set() from l2tp_xmit_skb() 2020-07-22 09:31:59 +02:00
l3mdev
lapb lapb: fixed leak of control-blocks. 2019-06-22 08:15:13 +02:00
llc llc: make sure applications use ARPHRD_ETHER 2020-07-22 09:31:59 +02:00
mac80211 mac80211: mesh: Free pending skb when destroying a mpath 2020-08-05 10:06:04 +02:00
mac802154
mpls net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup 2020-04-29 16:31:17 +02:00
ncsi
netfilter ipvs: allow connection reuse for unconfirmed conntrack 2020-08-19 08:14:56 +02:00
netlabel netlabel: cope with NULL catmap 2020-05-20 08:18:35 +02:00
netlink genetlink: remove genl_bind 2020-07-22 09:31:58 +02:00
netrom net: netrom: Fix potential nr_neigh refcnt leak in nr_add_node 2020-04-29 16:31:21 +02:00
nfc nfc: add missing attribute validation for vendor subcommand 2020-03-18 07:14:17 +01:00
nsh
openvswitch openvswitch: Prevent kernel-infoleak in ovs_ct_put_key() 2020-08-11 15:32:35 +02:00
packet net/packet: tpacket_rcv: avoid a producer race condition 2020-04-02 15:28:11 +02:00
phonet net: use skb_queue_empty_lockless() in poll() handlers 2019-11-10 11:27:48 +01:00
psample net: psample: fix skb_over_panic 2019-12-05 09:21:30 +01:00
qrtr qrtr: orphan socket in qrtr_release() 2020-07-31 18:37:48 +02:00
rds rds: Prevent kernel-infoleak in rds_notify_queue_get() 2020-08-05 10:06:01 +02:00
rfkill rfkill: Fix incorrect check to avoid NULL pointer dereference 2020-01-12 12:17:17 +01:00
rose net/rose: fix unbound loop in rose_loopback_timer() 2019-05-02 09:59:00 +02:00
rxrpc rxrpc: Fix race between recvmsg and sendmsg on immediate call failure 2020-08-11 15:32:35 +02:00
sched sched: consistently handle layer3 header accesses in the presence of VLANs 2020-07-22 09:32:00 +02:00
sctp sctp: implement memory accounting on tx path 2020-08-05 10:06:00 +02:00
smc net/smc: cancel event worker during device removal 2020-03-18 07:14:25 +01:00
strparser net: strparser: partially revert "strparser: Call skb_unclone conditionally" 2019-05-16 19:41:27 +02:00
sunrpc SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment() 2020-06-30 23:17:18 -04:00
switchdev
tipc tipc: clean up skb list lock handling on send path 2020-07-29 10:16:47 +02:00
tls net/tls: Fix to avoid gettig invalid tls record 2020-03-05 16:42:17 +01:00
unix af_unix: add compat_ioctl support 2020-01-17 19:47:07 +01:00
vmw_vsock vsock: fix timeout in vsock_accept() 2020-06-10 21:34:59 +02:00
wimax
wireless cfg80211: check vendor command doit pointer before use 2020-08-11 15:32:33 +02:00
x25 net/x25: Fix null-ptr-deref in x25_disconnect 2020-08-05 10:06:02 +02:00
xdp xdp: Fix xsk_generic_xmit errno 2020-06-25 15:33:05 +02:00
xfrm xfrm: Fix double ESP trailer insertion in IPsec crypto offload. 2020-06-30 23:17:10 -04:00
compat.c sock: Make sock->sk_stamp thread-safe 2019-01-09 17:38:33 +01:00
Kconfig
Makefile
socket.c compat_ioctl: handle SIOCOUTQNSD 2020-01-17 19:47:07 +01:00
sysctl_net.c