linux/net
Priyaranjan Jha 610058f519 tcp_bbr: adapt cwnd based on ack aggregation estimation
commit 78dc70ebaa upstream.

Aggregation effects are extremely common with wifi, cellular, and cable
modem link technologies, ACK decimation in middleboxes, and LRO and GRO
in receiving hosts. The aggregation can happen in either direction,
data or ACKs, but in either case the aggregation effect is visible
to the sender in the ACK stream.

Previously BBR's sending was often limited by cwnd under severe ACK
aggregation/decimation because BBR sized the cwnd at 2*BDP. If packets
were acked in bursts after long delays (e.g. one ACK acking 5*BDP after
5*RTT), BBR's sending was halted after sending 2*BDP over 2*RTT, leaving
the bottleneck idle for potentially long periods. Note that loss-based
congestion control does not have this issue because when facing
aggregation it continues increasing cwnd after bursts of ACKs, growing
cwnd until the buffer is full.

To achieve good throughput in the presence of aggregation effects, this
algorithm allows the BBR sender to put extra data in flight to keep the
bottleneck utilized during silences in the ACK stream that it has evidence
to suggest were caused by aggregation.

A summary of the algorithm: when a burst of packets are acked by a
stretched ACK or a burst of ACKs or both, BBR first estimates the expected
amount of data that should have been acked, based on its estimated
bandwidth. Then the surplus ("extra_acked") is recorded in a windowed-max
filter to estimate the recent level of observed ACK aggregation. Then cwnd
is increased by the ACK aggregation estimate. The larger cwnd avoids BBR
being cwnd-limited in the face of ACK silences that recent history suggests
were caused by aggregation. As a sanity check, the ACK aggregation degree
is upper-bounded by the cwnd (at the time of measurement) and a global max
of BW * 100ms. The algorithm is further described by the following
presentation:
https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00

In our internal testing, we observed a significant increase in BBR
throughput (measured using netperf), in a basic wifi setup.
- Host1 (sender on ethernet) -> AP -> Host2 (receiver on wifi)
- 2.4 GHz -> BBR before: ~73 Mbps; BBR after: ~102 Mbps; CUBIC: ~100 Mbps
- 5.0 GHz -> BBR before: ~362 Mbps; BBR after: ~593 Mbps; CUBIC: ~601 Mbps

Also, this code is running globally on YouTube TCP connections and produced
significant bandwidth increases for YouTube traffic.

This is based on Ian Swett's max_ack_height_ algorithm from the
QUIC BBR implementation.

Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-26 18:01:32 +02:00
..
6lowpan 6lowpan: Off by one handling ->nexthdr 2020-01-27 14:50:41 +01:00
9p net/9p: validate fds in p9_fd_open 2020-08-11 15:32:32 +02:00
802
8021q vlan: vlan_changelink() should propagate errors 2020-01-12 12:17:28 +01:00
appletalk appletalk: Set error code if register_snap_client failed 2019-12-13 08:52:59 +01:00
atm net: use skb_queue_empty_lockless() in poll() handlers 2019-11-10 11:27:48 +01:00
ax25 AX.25: Prevent integer overflows in connect and sendmsg 2020-07-31 18:37:48 +02:00
batman-adv batman-adv: bla: use netif_rx_ni when not in interrupt context 2020-09-09 19:04:24 +02:00
bluetooth Bluetooth: add a mutex lock to avoid UAF in do_enale_set 2020-08-19 08:14:50 +02:00
bpf
bpfilter signal/bpfilter: Fix bpfilter_kernl to use send_sig not force_sig 2020-01-27 14:50:51 +01:00
bridge net: bridge: enfore alignment for ethernet address 2020-06-30 23:17:03 -04:00
caif net: use skb_queue_empty_lockless() in poll() handlers 2019-11-10 11:27:48 +01:00
can can: gw: Fix error path of cgw_module_init 2019-08-29 08:28:30 +02:00
ceph libceph: don't omit recovery_deletes in target_copy() 2020-07-22 09:32:13 +02:00
core net: handle the return value of pskb_carve_frag_list() correctly 2020-09-23 12:10:57 +02:00
dcb net: DCB: Validate DCB_ATTR_DCB_BUFFER argument 2020-09-26 18:01:29 +02:00
dccp net: ipv6: add net argument to ip6_dst_lookup_flow 2020-04-29 16:31:16 +02:00
decnet net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2020-01-04 19:13:37 +01:00
dns_resolver KEYS: Don't write out to userspace while holding key semaphore 2020-04-23 10:30:24 +02:00
dsa dsa: Allow forwarding of redirected IGMP traffic 2020-09-23 12:10:56 +02:00
ethernet net: add annotations on hh->hh_len lockless accesses 2020-01-09 10:19:09 +01:00
hsr hsr: check protocol version in hsr_newlink() 2020-04-21 09:03:03 +02:00
ieee802154 nl802154: add missing attribute validation for dev_type 2020-03-18 07:14:15 +01:00
ife
ipv4 tcp_bbr: adapt cwnd based on ack aggregation estimation 2020-09-26 18:01:32 +02:00
ipv6 net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC 2020-09-26 18:01:29 +02:00
iucv net/af_iucv: always register net_device notifier 2020-01-27 14:50:56 +01:00
kcm kcm: switch order of device registration to fix a crash 2019-04-17 08:38:40 +02:00
key af_key: pfkey_dump needs parameter validation 2020-09-26 18:01:28 +02:00
l2tp l2tp: remove skb_dst_set() from l2tp_xmit_skb() 2020-07-22 09:31:59 +02:00
l3mdev
lapb lapb: fixed leak of control-blocks. 2019-06-22 08:15:13 +02:00
llc llc: make sure applications use ARPHRD_ETHER 2020-07-22 09:31:59 +02:00
mac80211 mac80211: fix misplaced while instead of if 2020-08-21 11:05:32 +02:00
mac802154
mpls net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup 2020-04-29 16:31:17 +02:00
ncsi
netfilter netfilter: conntrack: allow sctp hearbeat after connection re-use 2020-09-17 13:45:24 +02:00
netlabel netlabel: fix problems with mapping removal 2020-09-12 13:40:22 +02:00
netlink genetlink: remove genl_bind 2020-07-22 09:31:58 +02:00
netrom net: netrom: Fix potential nr_neigh refcnt leak in nr_add_node 2020-04-29 16:31:21 +02:00
nfc net/nfc/rawsock.c: add CAP_NET_RAW check. 2020-08-19 08:15:03 +02:00
nsh
openvswitch openvswitch: Prevent kernel-infoleak in ovs_ct_put_key() 2020-08-11 15:32:35 +02:00
packet af_packet: TPACKET_V3: fix fill status rwlock imbalance 2020-08-19 08:15:03 +02:00
phonet net: use skb_queue_empty_lockless() in poll() handlers 2019-11-10 11:27:48 +01:00
psample net: psample: fix skb_over_panic 2019-12-05 09:21:30 +01:00
qrtr net: qrtr: check skb_put_padto() return value 2020-09-26 18:01:30 +02:00
rds rds: Prevent kernel-infoleak in rds_notify_queue_get() 2020-08-05 10:06:01 +02:00
rfkill rfkill: Fix incorrect check to avoid NULL pointer dereference 2020-01-12 12:17:17 +01:00
rose net/rose: fix unbound loop in rose_loopback_timer() 2019-05-02 09:59:00 +02:00
rxrpc rxrpc: Fix race between recvmsg and sendmsg on immediate call failure 2020-08-11 15:32:35 +02:00
sched net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc 2020-09-26 18:01:30 +02:00
sctp sctp: not disable bh in the whole sctp_get_port_local() 2020-09-12 13:40:22 +02:00
smc net/smc: Prevent kernel-infoleak in __smc_diag_dump() 2020-09-03 11:24:17 +02:00
strparser net: strparser: partially revert "strparser: Call skb_unclone conditionally" 2019-05-16 19:41:27 +02:00
sunrpc SUNRPC: stop printk reading past end of string 2020-09-23 12:10:58 +02:00
switchdev
tipc tipc: use skb_unshare() instead in tipc_buf_append() 2020-09-26 18:01:30 +02:00
tls net/tls: Fix kmap usage 2020-08-19 08:15:03 +02:00
unix af_unix: add compat_ioctl support 2020-01-17 19:47:07 +01:00
vmw_vsock vsock: fix timeout in vsock_accept() 2020-06-10 21:34:59 +02:00
wimax
wireless cfg80211: regulatory: reject invalid hints 2020-09-09 19:04:32 +02:00
x25 net/x25: Fix null-ptr-deref in x25_disconnect 2020-08-05 10:06:02 +02:00
xdp xdp: Fix xsk_generic_xmit errno 2020-06-25 15:33:05 +02:00
xfrm xfrm: Fix double ESP trailer insertion in IPsec crypto offload. 2020-06-30 23:17:10 -04:00
compat.c net/compat: Add missing sock updates for SCM_RIGHTS 2020-08-21 11:05:32 +02:00
Kconfig
Makefile
socket.c net: Set fput_needed iff FDPUT_FPUT is set 2020-08-19 08:15:03 +02:00
sysctl_net.c