linux/net/ipv4
Jakub Kicinski a8585fdf42 net: ip: avoid OOM kills with large UDP sends over loopback
[ Upstream commit 6d123b81ac ]

Dave observed number of machines hitting OOM on the UDP send
path. The workload seems to be sending large UDP packets over
loopback. Since loopback has MTU of 64k kernel will try to
allocate an skb with up to 64k of head space. This has a good
chance of failing under memory pressure. What's worse if
the message length is <32k the allocation may trigger an
OOM killer.

This is entirely avoidable, we can use an skb with page frags.

af_unix solves a similar problem by limiting the head
length to SKB_MAX_ALLOC. This seems like a good and simple
approach. It means that UDP messages > 16kB will now
use fragments if underlying device supports SG, if extra
allocator pressure causes regressions in real workloads
we can switch to trying the large allocation first and
falling back.

v4: pre-calculate all the additions to alloclen so
    we can be sure it won't go over order-2

Reported-by: Dave Jones <dsj@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19 09:44:53 +02:00
..
bpfilter net: Revert "net: optimize the sockptr_t for unified kernel/user address spaces" 2020-08-10 12:06:44 -07:00
netfilter netfilter: arp_tables: add pre_exit hook for table unregister 2021-04-21 13:00:56 +02:00
af_inet.c inet: annotate data race in inet_send_prepare() and inet_dgram_connect() 2021-06-30 08:47:20 -04:00
ah4.c xfrm: Use actual socket sk instead of skb socket for xfrm_output_resume 2021-04-14 08:42:05 +02:00
arp.c net: Exempt multicast addresses from five-second neighbor lifetime 2020-11-13 14:24:39 -08:00
bpf_tcp_ca.c bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON 2020-09-25 13:58:01 -07:00
cipso_ipv4.c net: ipv4: fix memory leak in netlbl_cipsov4_add_std 2021-06-23 14:42:41 +02:00
datagram.c inet: stop leaking jiffies on the wire 2019-11-01 14:57:52 -07:00
devinet.c net: ipv4: Remove unneed BUG() function 2021-06-30 08:47:20 -04:00
esp4_offload.c xfrm: Provide private skb extensions for segmented and hw offloaded ESP packets 2021-04-14 08:42:07 +02:00
esp4.c xfrm: xfrm_state_mtu should return at least 1280 for ipv6 2021-07-14 16:56:14 +02:00
fib_frontend.c net/ipv4: swap flow ports when validating source 2021-07-14 16:56:25 +02:00
fib_lookup.h net: add net available in build_state 2020-03-29 22:30:57 -07:00
fib_notifier.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
fib_rules.c fib: use indirect call wrappers in the most common fib_rules_ops 2020-07-28 17:42:31 -07:00
fib_semantics.c net: Fix the arp error in some cases 2020-06-18 20:21:51 -07:00
fib_trie.c ipv4: Silence suspicious RCU usage warning 2020-08-26 15:58:48 -07:00
fou.c genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
gre_demux.c erspan: fix version 1 check in gre_parse_header() 2021-01-12 20:18:12 +01:00
gre_offload.c net: gre: recompute gre csum for sctp over gre tunnels 2020-08-03 15:29:44 -07:00
icmp.c icmp: don't send out ICMP messages with a source address of 0.0.0.0 2021-06-23 14:42:47 +02:00
igmp.c net: ipv4: fix memory leak in ip_mc_add1_src 2021-06-23 14:42:46 +02:00
inet_connection_sock.c tcp: relookup sock for RST+ACK packets handled by obsolete req sock 2021-03-30 14:31:59 +02:00
inet_diag.c inet_diag: Fix error path to cancel the meseage in inet_req_diag_fill() 2020-11-17 16:08:36 -08:00
inet_fragment.c inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
inet_hashtables.c tcp: fix race condition when creating child sockets from syncookies 2020-11-23 16:32:33 -08:00
inet_timewait_sock.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
inetpeer.c inetpeer: fix data-race in inet_putpeer / inet_putpeer 2019-11-07 16:15:56 -08:00
ip_forward.c ipv4: Revert removal of rt_uses_gateway 2019-09-20 18:23:33 -07:00
ip_fragment.c inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
ip_gre.c ip_gre: set dev->hard_header_len and dev->needed_headroom properly 2020-10-13 18:35:29 -07:00
ip_input.c bpf: Add socket assign support 2020-03-30 13:45:04 -07:00
ip_options.c net: clean up codestyle for net/ipv4 2020-08-25 06:28:02 -07:00
ip_output.c net: ip: avoid OOM kills with large UDP sends over loopback 2021-07-19 09:44:53 +02:00
ip_sockglue.c net: Remove duplicated midx check against 0 2020-08-25 06:23:59 -07:00
ip_tunnel_core.c tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies 2020-11-09 15:39:39 -08:00
ip_tunnel.c net: always use icmp{,v6}_ndo_send from ndo_start_xmit 2021-03-17 17:06:12 +01:00
ip_vti.c net: always use icmp{,v6}_ndo_send from ndo_start_xmit 2021-03-17 17:06:12 +01:00
ipcomp.c ipcomp: assign if_id to child tunnel from parent tunnel 2020-07-09 12:55:37 +02:00
ipconfig.c net: ipconfig: Don't override command-line hostnames or domains 2021-06-18 10:00:05 +02:00
ipip.c net: ipip: implement header_ops->parse_protocol for AF_PACKET 2020-06-30 12:29:39 -07:00
ipmr_base.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
ipmr.c ipmr: Use full VIF ID in netlink cache reports 2020-09-10 12:25:51 -07:00
Kconfig net: ipv4: remove duplicate "the the" phrase in Kconfig text 2020-08-18 16:02:16 -07:00
Makefile udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
metrics.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
netfilter.c netfilter: use actual socket sk rather than skb sk when routing harder 2020-10-30 12:57:39 +01:00
netlink.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
nexthop.c nexthop: Do not flush blackhole nexthops when loopback goes down 2021-03-17 17:06:14 +01:00
ping.c ping: Check return value of function 'ping_queue_rcv_skb' 2021-06-30 08:47:20 -04:00
proc.c tcp: skip DSACKs with dubious sequence ranges 2020-09-24 20:15:45 -07:00
protocol.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
raw_diag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-03-12 22:34:48 -07:00
raw.c networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
route.c net: lwtunnel: handle MTU calculation in forwading 2021-07-14 16:56:32 +02:00
syncookies.c net: Update window_clamp if SOCK_RCVBUF is set 2020-11-10 17:42:35 -08:00
sysctl_net_ipv4.c net: Make tcp_allowed_congestion_control readonly in non-init netns 2021-04-21 13:00:57 +02:00
tcp_bbr.c tcp: only postpone PROBE_RTT if RTT is < current min_rtt estimate 2020-11-17 11:03:22 -08:00
tcp_bic.c tcp: fix stretch ACK bugs in BIC 2020-03-16 18:26:54 -07:00
tcp_bpf.c bpf, sockmap: Ensure SO_RCVBUF memory is observed on ingress redirect 2020-11-18 00:12:34 +01:00
tcp_cdg.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_cong.c net: Only allow init netns to set default tcp cong to a restricted algo 2021-05-14 09:50:46 +02:00
tcp_cubic.c tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT 2020-06-25 16:08:47 -07:00
tcp_dctcp.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tcp_dctcp.h tcp: refactor DCTCP ECN ACK handling 2018-10-10 22:26:00 -07:00
tcp_diag.c inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data 2020-02-27 18:50:19 -08:00
tcp_fastopen.c bpf: tcp: Add bpf_skops_established() 2020-08-24 14:35:00 -07:00
tcp_highspeed.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_htcp.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_hybla.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_illinois.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_input.c net: tcp better handling of reordering then loss cases 2021-07-19 09:44:45 +02:00
tcp_ipv4.c tcp: Fix potential use-after-free due to double kfree() 2021-01-27 11:55:28 +01:00
tcp_lp.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_metrics.c genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
tcp_minisocks.c tcp: relookup sock for RST+ACK packets handled by obsolete req sock 2021-03-30 14:31:59 +02:00
tcp_nv.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_offload.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tcp_output.c tcp: make TCP_USER_TIMEOUT accurate for zero window probes 2021-02-03 23:28:51 +01:00
tcp_rate.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
tcp_recovery.c tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN 2021-02-03 23:28:52 +01:00
tcp_scalable.c net: ipv4: delete repeated words 2020-08-24 17:31:20 -07:00
tcp_timer.c tcp: make TCP_USER_TIMEOUT accurate for zero window probes 2021-02-03 23:28:51 +01:00
tcp_ulp.c bpf: sockmap: Only check ULP for TCP sockets 2020-03-09 22:34:58 +01:00
tcp_vegas.c tcp: use semicolons rather than commas to separate statements 2020-10-13 17:11:52 -07:00
tcp_vegas.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
tcp_veno.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_westwood.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
tcp_yeah.c tcp: fix stretch ACK bugs in Yeah 2020-03-16 18:26:55 -07:00
tcp.c tcp: add sanity tests to TCP_QUEUE_SEQ 2021-03-17 17:06:11 +01:00
tunnel4.c tunnel4: add cb_handler to struct xfrm_tunnel 2020-07-09 12:51:36 +02:00
udp_bpf.c net: sk_msg: Simplify sk_psock initialization 2020-08-21 15:16:11 -07:00
udp_diag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-03-12 22:34:48 -07:00
udp_impl.h net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
udp_offload.c net: Fix gro aggregation for udp encaps with zero csum 2021-03-17 17:06:11 +01:00
udp_tunnel_core.c udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
udp_tunnel_nic.c udp_tunnel: add the ability to share port tables 2020-09-28 12:50:12 -07:00
udp_tunnel_stub.c udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
udp.c udp: fix race between close() and udp_abort() 2021-06-23 14:42:42 +02:00
udplite.c net/ipv4: remove compat_ip_{get,set}sockopt 2020-07-19 18:16:41 -07:00
xfrm4_input.c xfrm: state: remove extract_input indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm4_output.c xfrm: fix unused variable warning if CONFIG_NETFILTER=n 2020-05-11 15:12:27 +02:00
xfrm4_policy.c net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2019-12-24 22:28:54 -08:00
xfrm4_protocol.c xfrm: add route lookup to xfrm4_rcv_encap 2019-12-09 09:59:07 +01:00
xfrm4_state.c xfrm: remove output_finish indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm4_tunnel.c xfrm: interface: fix the priorities for ipip and ipv6 tunnels 2020-10-09 12:29:48 +02:00