linux/net
Tonghao Zhang 38a6f08657 net: sched: support hash selecting tx queue
This patch allows users to pick queue_mapping, range
from A to B. Then we can load balance packets from A
to B tx queue. The range is an unsigned 16bit value
in decimal format.

$ tc filter ... action skbedit queue_mapping skbhash A B

"skbedit queue_mapping QUEUE_MAPPING" (from "man 8 tc-skbedit")
is enhanced with flags: SKBEDIT_F_TXQ_SKBHASH

  +----+      +----+      +----+
  | P1 |      | P2 |      | Pn |
  +----+      +----+      +----+
    |           |           |
    +-----------+-----------+
                |
                | clsact/skbedit
                |      MQ
                v
    +-----------+-----------+
    | q0        | qn        | qm
    v           v           v
  HTB/FQ       FIFO   ...  FIFO

For example:
If P1 sends out packets to different Pods on other host, and
we want distribute flows from qn - qm. Then we can use skb->hash
as hash.

setup commands:
$ NETDEV=eth0
$ ip netns add n1
$ ip link add ipv1 link $NETDEV type ipvlan mode l2
$ ip link set ipv1 netns n1
$ ip netns exec n1 ifconfig ipv1 2.2.2.100/24 up

$ tc qdisc add dev $NETDEV clsact
$ tc filter add dev $NETDEV egress protocol ip prio 1 \
        flower skip_hw src_ip 2.2.2.100 action skbedit queue_mapping skbhash 2 6
$ tc qdisc add dev $NETDEV handle 1: root mq
$ tc qdisc add dev $NETDEV parent 1:1 handle 2: htb
$ tc class add dev $NETDEV parent 2: classid 2:1 htb rate 100kbit
$ tc class add dev $NETDEV parent 2: classid 2:2 htb rate 200kbit
$ tc qdisc add dev $NETDEV parent 1:2 tbf rate 100mbit burst 100mb latency 1
$ tc qdisc add dev $NETDEV parent 1:3 pfifo
$ tc qdisc add dev $NETDEV parent 1:4 pfifo
$ tc qdisc add dev $NETDEV parent 1:5 pfifo
$ tc qdisc add dev $NETDEV parent 1:6 pfifo
$ tc qdisc add dev $NETDEV parent 1:7 pfifo

$ ip netns exec n1 iperf3 -c 2.2.2.1 -i 1 -t 10 -P 10

pick txqueue from 2 - 6:
$ ethtool -S $NETDEV | grep -i tx_queue_[0-9]_bytes
     tx_queue_0_bytes: 42
     tx_queue_1_bytes: 0
     tx_queue_2_bytes: 11442586444
     tx_queue_3_bytes: 7383615334
     tx_queue_4_bytes: 3981365579
     tx_queue_5_bytes: 3983235051
     tx_queue_6_bytes: 6706236461
     tx_queue_7_bytes: 42
     tx_queue_8_bytes: 0
     tx_queue_9_bytes: 0

txqueues 2 - 6 are mapped to classid 1:3 - 1:7
$ tc -s class show dev $NETDEV
...
class mq 1:3 root leaf 8002:
 Sent 11949133672 bytes 7929798 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq 1:4 root leaf 8003:
 Sent 7710449050 bytes 5117279 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq 1:5 root leaf 8004:
 Sent 4157648675 bytes 2758990 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq 1:6 root leaf 8005:
 Sent 4159632195 bytes 2759990 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq 1:7 root leaf 8006:
 Sent 7003169603 bytes 4646912 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
...

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexander Lobakin <alobakin@pm.me>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Talal Ahmad <talalahmad@google.com>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Antoine Tenart <atenart@kernel.org>
Cc: Wei Wang <weiwan@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-19 12:20:45 +02:00
..
6lowpan net: don't include ndisc.h from ipv6.h 2022-02-04 14:15:11 -08:00
9p xen/grant-table: remove readonly parameter from functions 2022-03-15 20:34:40 -05:00
802
8021q vlan: use correct format characters 2022-03-17 16:34:49 -07:00
appletalk net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
atm net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
ax25 net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
batman-adv batman-adv: Use netif_rx(). 2022-03-07 11:40:41 +00:00
bluetooth net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
bpf bpf, test_run: Fix packet size check for live packet mode 2022-03-11 22:01:26 +01:00
bpfilter uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
bridge net: bridge: fdb: add support for flush filtering based on ifindex and vlan 2022-04-13 12:46:26 +01:00
caif net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
can net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
ceph libceph: drop else branches in prepare_read_data{,_cont} 2022-03-01 18:26:36 +01:00
core net: sched: use queue_mapping to pick tx queue 2022-04-19 12:20:45 +02:00
dcb net: dcb: disable softirqs in dcbnl_flush_dev() 2022-03-03 08:01:55 -08:00
dccp net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
decnet net: decnet: use time_is_before_jiffies() instead of open coding it 2022-02-28 13:21:32 +00:00
dns_resolver
dsa Revert "net: dsa: setup master before ports" 2022-04-13 12:38:06 +01:00
ethernet net: ethernet: set default assignment identifier to NET_NAME_ENUM 2022-04-07 21:04:03 -07:00
ethtool net: ethtool: move checks before rtnl_lock() in ethnl_set_rings 2022-04-15 11:41:45 -07:00
hsr net: add per-cpu storage and net->core_stats 2022-03-11 23:17:24 -08:00
ieee802154 net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
ife
ipv4 tcp: fix signed/unsigned comparison 2022-04-18 10:28:40 +01:00
ipv6 net/ipv6: Introduce accept_unsolicited_na knob to implement router-side changes for RFC9131 2022-04-17 13:23:49 +01:00
iucv net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
kcm net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
key net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
l2tp net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
l3mdev net: Add l3mdev index to flow struct and avoid oif reset for port devices 2022-03-15 20:20:02 -07:00
lapb
llc llc: only change llc->dev when bind() succeeds 2022-03-25 16:55:41 -07:00
mac80211 mac80211: fix ht_capa printout in debugfs 2022-04-11 11:57:27 +02:00
mac802154
mctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-07 23:24:23 -07:00
mpls net: mpls: fix memdup.cocci warning 2022-04-07 21:06:41 -07:00
mptcp net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
ncsi all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate 2022-01-15 08:47:31 -08:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-15 09:26:00 +02:00
netlabel netlabel: fix out-of-bounds memory accesses 2022-03-21 10:59:11 +00:00
netlink net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
netrom net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-15 09:26:00 +02:00
nsh
openvswitch net: openvswitch: fix leak of nested actions 2022-04-06 13:36:50 +01:00
packet net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
phonet net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
psample
qrtr net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-12-16 16:13:19 -08:00
rfkill rfkill: make new event layout opt-in 2022-03-18 13:09:17 +02:00
rose net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
rxrpc rxrpc: fix a race in rxrpc_exit_net() 2022-04-06 13:48:51 +01:00
sched net: sched: support hash selecting tx queue 2022-04-19 12:20:45 +02:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-15 09:26:00 +02:00
smc net/smc: Fix af_ops of child socket pointing to released memory 2022-04-11 18:28:03 -07:00
strparser
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-15 09:26:00 +02:00
switchdev net: switchdev: remove lag_mod_cb from switchdev_handle_fdb_event_to_device 2022-02-24 21:31:43 -08:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-23 10:53:49 -07:00
tls tls: rx: only copy IV from the packet for TLS 1.2 2022-04-13 11:45:39 +01:00
unix net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
vmw_vsock net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
wireless cfg80211: hold bss_lock while updating nontrans_list 2022-04-11 11:55:36 +02:00
x25 net: remove noblock parameter from skb_recv_datagram() 2022-04-06 13:45:26 +01:00
xdp xsk: Do not write NULL in SW ring at allocation failure 2022-03-28 19:56:28 -07:00
xfrm net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
compat.c
devres.c
Kconfig page_pool: Add allocation stats 2022-03-03 09:55:28 +00:00
Kconfig.debug net: add networking namespace refcount tracker 2021-12-10 06:38:26 -08:00
Makefile
socket.c fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
sysctl_net.c