Linux kernel source tree
Go to file
Neal Cardwell 96a3ddb870 tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited
[ Upstream commit f4ce91ce12 ]

This commit fixes a bug in the tracking of max_packets_out and
is_cwnd_limited. This bug can cause the connection to fail to remember
that is_cwnd_limited is true, causing the connection to fail to grow
cwnd when it should, causing throughput to be lower than it should be.

The following event sequence is an example that triggers the bug:

 (a) The connection is cwnd_limited, but packets_out is not at its
     peak due to TSO deferral deciding not to send another skb yet.
     In such cases the connection can advance max_packets_seq and set
     tp->is_cwnd_limited to true and max_packets_out to a small
     number.

(b) Then later in the round trip the connection is pacing-limited (not
     cwnd-limited), and packets_out is larger. In such cases the
     connection would raise max_packets_out to a bigger number but
     (unexpectedly) flip tp->is_cwnd_limited from true to false.

This commit fixes that bug.

One straightforward fix would be to separately track (a) the next
window after max_packets_out reaches a maximum, and (b) the next
window after tp->is_cwnd_limited is set to true. But this would
require consuming an extra u32 sequence number.

Instead, to save space we track only the most important
information. Specifically, we track the strongest available signal of
the degree to which the cwnd is fully utilized:

(1) If the connection is cwnd-limited then we remember that fact for
the current window.

(2) If the connection not cwnd-limited then we track the maximum
number of outstanding packets in the current window.

In particular, note that the new logic cannot trigger the buggy
(a)/(b) sequence above because with the new logic a condition where
tp->packets_out > tp->max_packets_out can only trigger an update of
tp->is_cwnd_limited if tp->is_cwnd_limited is false.

This first showed up in a testing of a BBRv2 dev branch, but this
buggy behavior highlighted a general issue with the
tcp_cwnd_validate() logic that can cause cwnd to fail to increase at
the proper rate for any TCP congestion control, including Reno or
CUBIC.

Fixes: ca8a226343 ("tcp: make cwnd-limited checks measurement-based, and gentler")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Kevin(Yudong) Yang <yyd@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-26 13:25:23 +02:00
arch x86/cpu: Include the header of init_ia32_feat_ctl()'s prototype 2022-10-26 13:25:21 +02:00
block block: fix inflight statistics of part0 2022-10-26 13:25:11 +02:00
certs certs/blacklist_hashes.c: fix const confusion in certs blacklist 2022-06-22 14:13:17 +02:00
crypto KEYS: asymmetric: enforce SM2 signature use pkey algo 2022-08-21 15:16:22 +02:00
Documentation ARM: dts: fix Moxa SDIO 'compatible', remove 'sdhci' misnomer 2022-10-15 07:55:52 +02:00
drivers mISDN: fix use-after-free bugs in l1oip timer handlers 2022-10-26 13:25:22 +02:00
fs nfsd: Fix a memory leak in an error handling path 2022-10-26 13:25:18 +02:00
include tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited 2022-10-26 13:25:23 +02:00
init Kconfig: Add option for asm goto w/ tied outputs to workaround clang-13 bug 2022-06-09 10:21:25 +02:00
ipc ipc/mqueue: use get_tree_nodev() in mqueue_get_tree() 2022-06-09 10:21:17 +02:00
kernel bpf: Ensure correct locking around vulnerable function find_vpid() 2022-10-26 13:25:21 +02:00
lib lib/vdso: Mark do_hres_timens() and do_coarse_timens() __always_inline() 2022-09-05 10:28:58 +02:00
LICENSES
mm mm/mmap: undo ->mmap() when arch_validate_flags() fails 2022-10-26 13:25:11 +02:00
net tcp: fix tcp_cwnd_validate() to not forget is_cwnd_limited 2022-10-26 13:25:23 +02:00
samples x86: Prepare inline-asm for straight-line-speculation 2022-07-25 11:26:29 +02:00
scripts selinux: use "grep -E" instead of "egrep" 2022-10-26 13:25:17 +02:00
security hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero 2022-10-26 13:25:12 +02:00
sound ASoC: wcd934x: fix order of Slimbus unprepare/disable 2022-10-26 13:25:09 +02:00
tools selftests/xsk: Avoid use-after-free on ctx 2022-10-26 13:25:20 +02:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-02-01 17:25:48 +01:00
virt KVM: SEV: add cache flush to solve SEV cache incoherency issues 2022-09-28 11:10:28 +02:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore kbuild: generate Module.symvers only when vmlinux exists 2021-05-19 10:12:59 +02:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild
Kconfig
MAINTAINERS MAINTAINERS: add Amir as xfs maintainer for 5.10.y 2022-07-02 16:39:22 +02:00
Makefile hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero 2022-10-26 13:25:12 +02:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.