linux/net
Willem de Bruijn d21daf7f3e packet: on direct_xmit, limit tso and csum to supported devices
[ Upstream commit 104ba78c98 ]

When transmitting on a packet socket with PACKET_VNET_HDR and
PACKET_QDISC_BYPASS, validate device support for features requested
in vnet_hdr.

Drop TSO packets sent to devices that do not support TSO or have the
feature disabled. Note that devices with the feature disabled currently
do process those packets correctly, even though they do not advertise
the feature.

Because of SKB_GSO_DODGY, it is not sufficient to test device features
with netif_needs_gso. Full validate_xmit_skb is needed.

Switch to software checksum for non-TSO packets that request checksum
offload if that device feature is unsupported or disabled. Note that
similar to the TSO case, device drivers may perform checksum offload
correctly even when not advertising it.
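
The policy described above can be sketched as a small userspace decision
function. This is only an illustration of the rules stated in this commit
message, not the kernel code path: the feature bits, the verdict enum and
direct_xmit_verdict() are hypothetical names, and the real check goes
through validate_xmit_skb as noted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature bits for this sketch; not the kernel's NETIF_F_* values. */
enum dev_feature {
	DEV_F_TSO  = 1u << 0,	/* device advertises TSO and it is enabled */
	DEV_F_CSUM = 1u << 1,	/* device advertises tx checksum offload */
};

/* Hypothetical outcome of the direct-xmit validation step. */
enum xmit_verdict {
	XMIT_PASS_TO_DEVICE,	/* hand the packet to the driver unchanged */
	XMIT_SOFTWARE_CSUM,	/* compute the checksum in software first */
	XMIT_DROP,		/* refuse the packet */
};

/*
 * Apply the rules from the commit message:
 *  - a TSO packet is dropped if the device does not offer TSO;
 *  - a non-TSO packet that requests checksum offload falls back to
 *    software checksum if the device does not offer checksum offload;
 *  - everything else is passed through.
 */
static enum xmit_verdict direct_xmit_verdict(unsigned int dev_features,
					     bool wants_tso,
					     bool wants_csum_offload)
{
	if (wants_tso) {
		if (!(dev_features & DEV_F_TSO))
			return XMIT_DROP;
		return XMIT_PASS_TO_DEVICE;
	}

	if (wants_csum_offload && !(dev_features & DEV_F_CSUM))
		return XMIT_SOFTWARE_CSUM;

	return XMIT_PASS_TO_DEVICE;
}
```

For example, direct_xmit_verdict(DEV_F_CSUM, true, true) models the
"tso off tx on" test case and yields XMIT_DROP, while
direct_xmit_verdict(0, false, true) models "tx off" and yields
XMIT_SOFTWARE_CSUM.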

When switching to software checksum, packets hit skb_checksum_help,
which has two BUG_ON checks that fire if the checksum field is not in
the linear segment. Packet sockets always allocate at least
csum_start + csum_off + 2 bytes as linear, so these checks are satisfied.
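
A minimal userspace sketch of the software fallback and the linear-segment
bound, assuming the whole packet sits in one linear buffer (the kernel
helper also has to cover paged data). sw_checksum_help() and csum_fold32()
are hypothetical names for this illustration; where the kernel would
BUG_ON, the sketch returns an error instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement checksum. */
static uint16_t csum_fold32(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffffu) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Sum the bytes from csum_start to the end of the buffer as big-endian
 * 16-bit words and store the folded result at csum_start + csum_off.
 * Whatever is already in the checksum field (for example a pseudo-header
 * seed) is included in the sum.
 *
 * Returns 0 on success, or -1 if the 2-byte checksum field does not fit
 * in the linear buffer, i.e. if csum_start + csum_off + 2 > linear_len.
 */
static int sw_checksum_help(uint8_t *linear, size_t linear_len,
			    size_t csum_start, size_t csum_off)
{
	uint32_t sum = 0;
	uint16_t csum;
	size_t i;

	/* Bound check written to avoid overflow in the additions. */
	if (csum_start > linear_len ||
	    csum_off > linear_len - csum_start ||
	    linear_len - csum_start - csum_off < 2)
		return -1;

	for (i = csum_start; i + 1 < linear_len; i += 2)
		sum += ((uint32_t)linear[i] << 8) | linear[i + 1];
	if (i < linear_len)			/* odd trailing byte */
		sum += (uint32_t)linear[i] << 8;

	csum = csum_fold32(sum);
	linear[csum_start + csum_off] = (uint8_t)(csum >> 8);
	linear[csum_start + csum_off + 1] = (uint8_t)(csum & 0xff);
	return 0;
}
```

The sketch accumulates into 32 bits, which is enough for buffers below
64 KiB. A caller would run it only after the device has been found to
lack checksum offload.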

Tested by running github.com/wdebruij/kerneltools/psock_txring_vnet.c

  ethtool -K eth0 tso off tx on
  psock_txring_vnet -d $dst -s $src -i eth0 -l 2000 -n 1 -q -v
  psock_txring_vnet -d $dst -s $src -i eth0 -l 2000 -n 1 -q -v -N

  ethtool -K eth0 tx off
  psock_txring_vnet -d $dst -s $src -i eth0 -l 1000 -n 1 -q -v -G
  psock_txring_vnet -d $dst -s $src -i eth0 -l 1000 -n 1 -q -v -G -N

v2:
  - add EXPORT_SYMBOL_GPL(validate_xmit_skb_list)

Fixes: d346a3fae3 ("packet: introduce PACKET_QDISC_BYPASS socket option")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-15 07:46:39 +01:00
6lowpan 6lowpan: put mcast compression in an own function 2015-10-21 00:49:25 +02:00
9p IB/cma: Add support for network namespaces 2015-10-28 12:32:48 -04:00
802
8021q net: add recursion limit to GRO 2016-11-15 07:46:38 +01:00
appletalk
atm
ax25 AX.25: Close socket connection on session completion 2016-07-11 09:31:12 -07:00
batman-adv batman-adv: remove unused callback from batadv_algo_ops struct 2016-10-07 15:23:47 +02:00
bluetooth Bluetooth: Fix l2cap_sock_setsockopt() with optname BT_RCVMTU 2016-08-20 18:09:19 +02:00
bridge bridge: multicast: restore perm router ports on multicast enable 2016-11-15 07:46:38 +01:00
caif net: caif: fix misleading indentation 2016-09-30 10:18:35 +02:00
can
ceph libceph: apply new_state before new_up_client on incrementals 2016-08-10 11:49:29 +02:00
core packet: on direct_xmit, limit tso and csum to supported devices 2016-11-15 07:46:39 +01:00
dcb
dccp tcp/dccp: remove obsolete WARN_ON() in icmp handlers 2016-04-20 15:42:04 +09:00
decnet decnet: Do not build routes to devices without decnet private data. 2016-05-18 17:06:35 -07:00
dns_resolver net: dns_resolver: convert time_t to time64_t 2015-11-18 16:27:46 -05:00
dsa net: dsa: use switchdev obj for VLAN add/del ops 2015-11-01 15:56:11 -05:00
ethernet net: add recursion limit to GRO 2016-11-15 07:46:38 +01:00
hsr net/hsr: fix a warning message 2015-11-23 14:56:15 -05:00
ieee802154 net: fix percpu memory leaks 2015-11-02 22:47:14 -05:00
ipv4 udp: fix IP_CHECKSUM handling 2016-11-15 07:46:39 +01:00
ipv6 udp: fix IP_CHECKSUM handling 2016-11-15 07:46:39 +01:00
ipx
irda net/irda: handle iriap_register_lsap() allocation failure 2016-09-30 10:18:36 +02:00
iucv af_iucv: Validate socket address length in iucv_sock_bind() 2016-03-03 15:07:03 -08:00
key af_key: fix two typos 2015-10-23 03:05:19 -07:00
l2tp l2tp: fix configuration passed to setup_udp_tunnel_sock() 2016-06-24 10:18:17 -07:00
l3mdev
lapb
llc net: fix infoleak in llc 2016-05-18 17:06:40 -07:00
mac80211 mac80211: discard multicast and 4-addr A-MSDUs 2016-11-10 16:36:35 +01:00
mac802154 mac802154: llsec: use kzfree 2015-10-21 00:49:24 +02:00
mpls mpls: find_outdev: check for err ptr in addition to NULL check 2016-04-20 15:42:07 +09:00
netfilter ipvs: fix bind to link-local mcast IPv6 address in backup 2016-10-07 15:23:41 +02:00
netlabel netlabel: add address family checks to netlbl_{sock,req}_delattr() 2016-08-20 18:09:22 +02:00
netlink netlink: do not enter direct reclaim from netlink_dump() 2016-11-15 07:46:37 +01:00
netrom
nfc net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
openvswitch vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices 2016-06-24 10:18:18 -07:00
packet packet: on direct_xmit, limit tso and csum to supported devices 2016-11-15 07:46:39 +01:00
phonet phonet: properly unshare skbs in phonet_rcv() 2016-01-31 11:29:00 -08:00
rds rds: fix an infoleak in rds_inc_info_copy 2016-09-15 08:27:51 +02:00
rfkill rfkill: fix rfkill_fop_read wait_event usage 2016-03-03 15:07:26 -08:00
rose
rxrpc net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
sched net sched filters: fix notification of filter delete with proper handle 2016-11-15 07:46:39 +01:00
sctp sctp: validate chunk len before actually using it 2016-11-15 07:46:39 +01:00
sunrpc sunrpc: fix write space race causing stalls 2016-10-28 03:01:31 -04:00
switchdev switchdev: pass pointer to fib_info instead of copy 2016-06-24 10:18:16 -07:00
tipc tipc: fix NULL pointer dereference in shutdown() 2016-09-30 10:18:36 +02:00
unix af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock' 2016-09-30 10:18:36 +02:00
vmw_vsock VSOCK: do not disconnect socket when peer has shutdown SEND only 2016-05-18 17:06:41 -07:00
wimax
wireless nl80211: validate number of probe response CSA counters 2016-09-30 10:18:38 +02:00
x25 net: fix a kernel infoleak in x25 module 2016-05-18 17:06:43 -07:00
xfrm xfrm: Fix crash observed during device unregistration and decryption 2016-04-20 15:42:05 +09:00
compat.c
Kconfig
Makefile
socket.c net: Fix use after free in the recvmmsg exit path 2016-04-20 15:42:03 +09:00
sysctl_net.c net: Use ns_capable_noaudit() when determining net sysctl permissions 2016-09-15 08:27:50 +02:00