Linux kernel source tree
Eric Dumazet 024158d3b5 net: avoid 32 x truesize under-estimation for tiny skbs
[ Upstream commit 3226b158e6 ]

Both virtio net and napi_get_frags() allocate skbs
with a very small skb->head.

While using page fragments instead of a kmalloc-backed skb->head might give
a small performance improvement in some cases, there is a huge risk of
underestimating memory usage.

For both GOOD_COPY_LEN and GRO_MAX_HEAD, we can fit at least 32 allocations
per page (an order-3 page on x86), or even 64 on PowerPC.
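
The arithmetic behind the "32 x" figure can be sketched as follows. This is an
illustration only, not kernel code: the 1024-byte per-head footprint is an
assumed round number chosen to match the "at least 32 allocations per page"
figure above, the 64 KiB PowerPC page size is likewise an assumption, and the
function names are hypothetical.

```python
# Illustrative arithmetic only; not kernel code.
# ASSUMED_HEAD_BYTES is an assumption: a round per-head footprint that is
# consistent with "at least 32 allocations per page" in the commit text.
ASSUMED_HEAD_BYTES = 1024


def heads_per_frag_page(frag_page_bytes, head_bytes=ASSUMED_HEAD_BYTES):
    """Number of skb heads carved out of one page-fragment page."""
    return frag_page_bytes // head_bytes


def worst_case_underestimation(frag_page_bytes, head_bytes=ASSUMED_HEAD_BYTES):
    """Ratio of memory pinned to memory accounted when a single
    surviving head keeps the whole fragment page from being freed."""
    return frag_page_bytes / head_bytes


if __name__ == "__main__":
    x86_order3 = 4096 << 3   # order-3 page with 4 KiB base pages = 32 KiB
    ppc_64k = 65536          # assumed 64 KiB page on PowerPC
    print(heads_per_frag_page(x86_order3))         # 32
    print(heads_per_frag_page(ppc_64k))            # 64
    print(worst_case_underestimation(x86_order3))  # 32.0
```

In this model, one long-lived skb sitting in a socket queue is charged for
roughly 1 KiB while it keeps the full 32 KiB fragment page allocated.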

We have been tracking OOM issues on GKE hosts hitting tcp_mem limits
but consuming far more memory for TCP buffers than allowed by tcp_mem[2].
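
To compare the accounted TCP memory with the configured limit on a host,
something like the sketch below could be used. It assumes the usual procfs
files: /proc/sys/net/ipv4/tcp_mem (three page counts: min, pressure, max) and
/proc/net/sockstat (a "TCP:" line with a "mem" field, in pages). The helper
names are hypothetical. Note that the "mem" counter is derived from skb
truesize, so it is the very number this commit describes as under-estimated:
a value below tcp_mem[2] does not prove that real usage is within budget.

```python
def parse_tcp_mem(text):
    """Return (min, pressure, max) page counts from tcp_mem contents."""
    low, pressure, high = (int(field) for field in text.split())
    return low, pressure, high


def parse_sockstat_tcp_mem(text):
    """Return the 'mem' value (pages) from the 'TCP:' line of sockstat."""
    for line in text.splitlines():
        if line.startswith("TCP:"):
            fields = line.split()
            # The value follows the literal 'mem' key on the same line.
            return int(fields[fields.index("mem") + 1])
    raise ValueError("no 'TCP:' line found")


def over_limit(sockstat_text, tcp_mem_text):
    """True if accounted TCP pages exceed tcp_mem[2]."""
    accounted = parse_sockstat_tcp_mem(sockstat_text)
    return accounted > parse_tcp_mem(tcp_mem_text)[2]
```

Pass the contents of the two procfs files to over_limit(); the functions do
no I/O themselves, so they can also be run against saved snapshots.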

Even if we force napi_alloc_skb() to only use order-0 pages, the issue
would still be there on architectures with PAGE_SIZE >= 32768.

This patch makes sure that small skb heads are kmalloc-backed, so that
other objects in the slab page can be reused instead of being held as long
as skbs are sitting in socket queues.
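
The resulting policy can be pictured with the sketch below. It is a
simplified model, not the patched kernel function: the 1024-byte cutoff for
"small" heads is an assumption made for illustration, and
choose_head_backing is a hypothetical name.

```python
# Simplified model of the policy described above; not kernel code.
SMALL_HEAD_LIMIT = 1024   # assumed cutoff for "small" skb heads, in bytes


def choose_head_backing(head_bytes, small_limit=SMALL_HEAD_LIMIT):
    """Return which allocator backs skb->head in this model."""
    if head_bytes <= small_limit:
        # Small heads come from the slab: a queued skb holds only its own
        # object, and the rest of the slab page stays reusable.
        return "kmalloc"
    # Larger heads keep using page fragments.
    return "page_frag"
```

For example, choose_head_backing(128) selects "kmalloc" in this model, while
choose_head_backing(2048) selects "page_frag".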

Note that in the future we might use the sk_buff napi cache
instead of going through a more expensive __alloc_skb().

Another idea would be to use separate page sizes depending
on the allocated length (to never have more than 4 frags per page).
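
One possible reading of that idea is sketched below. Nothing here comes from
the kernel: the 4 KiB base page, the order-3 ceiling, and the function name
are assumptions, and only the cap of 4 fragments per page is taken from the
text above.

```python
MAX_FRAGS_PER_PAGE = 4   # cap taken from the commit text


def pick_frag_page_bytes(length, base_page=4096, max_order=3):
    """Return the largest page size (base_page << order) that fits the
    request and holds at most MAX_FRAGS_PER_PAGE fragments of `length`
    bytes, or None if no order up to max_order satisfies both."""
    if length <= 0:
        raise ValueError("length must be positive")
    best = None
    for order in range(max_order + 1):
        size = base_page << order
        if length <= size and size // length <= MAX_FRAGS_PER_PAGE:
            best = size
    return best
```

Under these assumptions a 2048-byte request would use an 8 KiB page, and a
512-byte request gets None because even an order-0 page would hold 8 of them;
in this model such requests would be left to kmalloc.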

I would like to thank Greg Thelen for his precious help on this matter;
analysing crash dumps is always a time-consuming task.

Fixes: fd11a83dd3 ("net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20210113161819.1155526-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-23 16:04:03 +01:00
arch x86/hyperv: Initialize clockevents after LAPIC is initialized 2021-01-23 16:03:57 +01:00
block blk-mq-debugfs: Add decode for BLK_MQ_F_TAG_HCTX_SHARED 2021-01-19 18:27:29 +01:00
certs .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
crypto X.509: Fix crash caused by NULL pointer 2021-01-23 16:03:58 +01:00
Documentation dt-bindings: display: sii902x: Add supply bindings 2021-01-19 18:27:19 +01:00
drivers net: stmmac: fix taprio configuration when base_time is in the past 2021-01-23 16:04:02 +01:00
fs nfsd4: readdirplus shouldn't return parent of export 2021-01-23 16:03:58 +01:00
include rcu-tasks: Move RCU-tasks initialization to before early_initcall() 2021-01-19 18:27:28 +01:00
init rcu-tasks: Move RCU-tasks initialization to before early_initcall() 2021-01-19 18:27:28 +01:00
ipc ipc: adjust proc_ipc_sem_dointvec definition to match prototype 2020-09-05 12:14:29 -07:00
kernel bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callback 2021-01-23 16:03:59 +01:00
lib lib/raid6: Let $(UNROLL) rules work with macOS userland 2021-01-19 18:27:25 +01:00
LICENSES LICENSES/deprecated: add Zlib license text 2020-09-16 14:33:49 +02:00
mm mm, slub: consider rest of partial list if acquire_slab() fails 2021-01-19 18:27:32 +01:00
net net: avoid 32 x truesize under-estimation for tiny skbs 2021-01-23 16:04:03 +01:00
samples samples/bpf: Fix possible hang in xdpsock with multiple threads 2020-12-30 11:53:49 +01:00
scripts Revert "kconfig: remove 'kvmconfig' and 'xenconfig' shorthands" 2021-01-23 16:03:57 +01:00
security dump_common_audit_data(): fix racy accesses to ->d_name 2021-01-19 18:27:29 +01:00
sound ALSA: firewire-tascam: Fix integer overflow in midi_port_work() 2021-01-19 18:27:33 +01:00
tools bpf: Fix selftest compilation on clang 11 2021-01-23 16:03:57 +01:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt kvm: check tlbs_dirty directly 2021-01-12 20:18:22 +01:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: docs: ignore sphinx_*/ directories 2020-09-10 10:44:31 -06:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-12-10 15:30:13 -08:00
Makefile Linux 5.10.9 2021-01-19 18:27:34 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.