linux/net
Eric Dumazet 5e25ba5003 tcp: TSO packets automatic sizing
[ Upstream commits 6d36824e730f247b602c90e8715a792003e3c5a7,
  02cf4ebd82, and parts of
  7eec4174ff ]

After hearing many people over past years complaining against TSO being
bursty or even buggy, we are proud to present automatic sizing of TSO
packets.

One part of the problem is that tcp_tso_should_defer() uses an heuristic
relying on upcoming ACKS instead of a timer, but more generally, having
big TSO packets makes little sense for low rates, as it tends to create
micro bursts on the network, and general consensus is to reduce the
buffering amount.

This patch introduces a per socket sk_pacing_rate, that approximates
the current sending rate, and allows us to size the TSO packets so
that we try to send one packet every ms.

This field could be set by other transports.

Patch has no impact for high speed flows, where having large TSO packets
makes sense to reach line rate.

For other flows, this helps better packet scheduling and ACK clocking.

This patch increases performance of TCP flows in lossy environments.

A new sysctl (tcp_min_tso_segs) is added, to specify the
minimal size of a TSO packet (default being 2).

A follow-up patch will provide a new packet scheduler (FQ), using
sk_pacing_rate as an input to perform optional per flow pacing.

This explains why we chose to set sk_pacing_rate to twice the current
rate, allowing 'slow start' ramp up.

sk_pacing_rate = 2 * cwnd * mss / srtt

v2: Neal Cardwell reported a suspect deferring of last two segments on
initial write of 10 MSS, I had to change tcp_tso_should_defer() to take
into account tp->xmit_size_goal_segs

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Cc: Tom Herbert <therbert@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-04 04:30:59 -08:00
..
9p 9p: fix off by one causing access violations and memory corruption 2013-07-28 16:29:58 -07:00
802 net/802/mrp: fix lockdep splat 2013-05-14 13:02:30 -07:00
8021q vlan: fix a race in egress prio management 2013-07-28 16:30:05 -07:00
appletalk appletalk: info leak in ->getname() 2013-04-25 01:47:58 -04:00
atm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
ax25 ax25: fix info leak via msg_name in ax25_recvmsg() 2013-04-07 16:28:00 -04:00
batman-adv batman-adv: Don't handle address updates when bla is disabled 2013-06-10 08:42:18 +02:00
bluetooth Bluetooth: Fix rfkill functionality during the HCI setup stage 2013-10-13 16:08:32 -07:00
bridge bridge: fix NULL pointer deref of br_port_get_rcu 2013-10-13 16:08:29 -07:00
caif caif: Add missing braces to multiline if in cfctrl_linkup_request 2013-10-13 16:08:28 -07:00
can Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
ceph libceph: use pg_num_mask instead of pgp_num_mask for pg.seed calc 2013-09-26 17:18:29 -07:00
core tcp: TSO packets automatic sizing 2013-11-04 04:30:59 -08:00
dcb
dccp net:dccp: do not report ICMP redirects to user space 2013-10-13 16:08:30 -07:00
decnet decnet: remove duplicated include from dn_table.c 2013-04-07 17:12:01 -04:00
dns_resolver
dsa
ethernet
ieee802154 ieee802154/nl-mac.c: make some MLME operations optional 2013-04-08 12:00:16 -04:00
ipv4 tcp: TSO packets automatic sizing 2013-11-04 04:30:59 -08:00
ipv6 ip6tnl: allow to use rtnl ops on fb tunnel 2013-10-13 16:08:31 -07:00
ipx
irda net: irda: using kzalloc() instead of kmalloc() to avoid strncpy() issue. 2013-05-19 15:10:47 -07:00
iucv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-04-22 20:32:51 -04:00
key af_key: more info leaks in pfkey messages 2013-08-11 18:35:25 -07:00
l2tp l2tp: add missing .owner to struct pppox_proto 2013-07-28 16:29:49 -07:00
lapb
llc llc: Fix missing msg_namelen update in llc_ui_recvmsg() 2013-04-07 16:28:01 -04:00
mac80211 mac80211: add a flag to indicate CCK support for HT clients 2013-09-07 22:09:59 -07:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-04-30 03:55:20 -04:00
netfilter ip: generate unique IP identificator if local fragmentation is allowed 2013-10-13 16:08:30 -07:00
netlabel netlabel: improve domain mapping validation 2013-05-19 14:49:55 -07:00
netlink genl: Hold reference on correct module while netlink-dump. 2013-09-14 06:54:55 -07:00
netrom netrom: info leak in ->getname() 2013-04-25 01:47:58 -04:00
nfc NFC: llcp: Fix non blocking sockets connections 2013-08-29 09:47:30 -07:00
openvswitch openvswitch: Remove unneeded ovs_netdev_get_ifindex() 2013-04-30 00:19:11 -04:00
packet packet: restore packet statistics tp_packets to include drops 2013-09-14 06:54:55 -07:00
phonet
rds
rfkill Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2013-04-22 14:58:14 -04:00
rose rose: fix info leak via msg_name in rose_recvmsg() 2013-04-07 16:28:02 -04:00
rxrpc
sched net_sched: htb: fix a typo in htb_change_class() 2013-10-13 16:08:29 -07:00
sctp net: sctp: rfc4443: do not report ICMP redirects to user space 2013-10-13 16:08:29 -07:00
sunrpc rpc: let xdr layer allocate gssproxy receieve pages 2013-10-01 09:17:48 -07:00
tipc tipc: set sk_err correctly when connection fails 2013-09-14 06:54:56 -07:00
unix af_unix: fix a fatal race with bit fields 2013-05-01 15:13:49 -04:00
vmw_vsock Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-04-30 03:55:20 -04:00
wimax
wireless nl80211: fix another nl80211_fam.attrbuf race 2013-08-20 08:43:04 -07:00
x25 x25: Fix broken locking in ioctl error paths. 2013-07-28 16:29:45 -07:00
xfrm xfrm: force a garbage collection after deleting a policy 2013-05-31 17:30:07 -07:00
compat.c net: Unbreak compat_sys_{send,recv}msg 2013-06-06 11:52:14 -07:00
Kconfig netlink: kconfig: move mmap i/o into netlink kconfig 2013-05-01 15:02:42 -04:00
Makefile
nonet.c
socket.c net: Unbreak compat_sys_{send,recv}msg 2013-06-06 11:52:14 -07:00
sysctl_net.c net: Update the sysctl permissions handler to test effective uid/gid 2013-10-13 16:08:34 -07:00