Linux kernel source tree
Eric Dumazet a7f46e18ec net: sched: fix reordering issues
[ Upstream commit b88dd52c62 ]

Whenever MQ is not used on a multiqueue device, we experience
serious reordering problems. Bisection found the cited
commit.

The issue can be described this way :

- A single qdisc hierarchy is shared by all transmit queues.
  (eg : tc qdisc replace dev eth0 root fq_codel)

- When/if try_bulk_dequeue_skb_slow() dequeues a packet targeting
  a different transmit queue than the one used to build a packet train,
  we stop building the current list and save the 'bad' skb (P1) in a
  special queue (bad_txq).

- When dequeue_skb() calls qdisc_dequeue_skb_bad_txq() and finds this
  skb (P1), it checks if the associated transmit queue is still in the
  frozen state. If the queue is still blocked (by BQL or NIC tx ring full),
  we leave the skb in bad_txq and return NULL.

- dequeue_skb() calls q->dequeue() to get another packet (P2)

  The other packet can target the problematic queue (that we found
  in frozen state for the bad_txq packet), but another cpu just ran
  TX completion and made room in the txq that is now ready to accept
  new packets.

- Packet P2 is sent while P1 is still held in bad_txq; P1 might be sent
  at the next round. In practice P2 is the lead of a big packet train
  (P2,P3,P4 ...) filling the BQL budget and delaying P1 by many packets :/

To solve this problem, we have to block the dequeue process as long
as the first packet in bad_txq cannot be sent. Reordering issues
disappear and no side effects have been seen.
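
The scenario and the fix described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
kernel code and not the actual patch: struct toy_qdisc, toy_dequeue(),
toy_run() and the block_on_bad flag are hypothetical names invented for
this illustration. The frozen state of the transmit queue is passed in as
a plain boolean, and the packet train is reduced to one packet per round,
so P1 is delayed by a single packet here rather than by a whole train.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_PKT (-1)

/* Toy qdisc (hypothetical): a FIFO of packet ids plus a one-slot
 * holding area standing in for bad_txq. */
struct toy_qdisc {
	const int *fifo;	/* packet ids in arrival order */
	size_t len, head;
	int bad_txq;		/* held packet id, or NO_PKT */
	bool block_on_bad;	/* true: block dequeue while bad_txq is stuck */
};

/* Return the next packet id to transmit, or NO_PKT.
 * txq_frozen is the state of the held packet's txq at check time. */
static int toy_dequeue(struct toy_qdisc *q, bool txq_frozen)
{
	if (q->bad_txq != NO_PKT) {
		if (!txq_frozen) {
			int id = q->bad_txq;

			q->bad_txq = NO_PKT;
			return id;
		}
		if (q->block_on_bad)
			return NO_PKT;	/* nothing may overtake the held packet */
		/* Otherwise fall through: a later packet can overtake P1. */
	}
	if (q->head < q->len)
		return q->fifo[q->head++];
	return NO_PKT;
}

/* Run the scenario: P1 sits in bad_txq, P2..P4 wait in the FIFO, and
 * the txq looks frozen only during the first round. Records the
 * transmit order in out[] and returns the number of packets sent. */
static size_t toy_run(bool block_on_bad, int *out, size_t cap)
{
	static const int fifo[] = { 2, 3, 4 };
	struct toy_qdisc q = {
		.fifo = fifo, .len = 3, .head = 0,
		.bad_txq = 1, .block_on_bad = block_on_bad,
	};
	size_t n = 0;

	for (int round = 0; round < 8; round++) {
		int id = toy_dequeue(&q, round == 0);

		if (id == NO_PKT)
			continue;
		assert(n < cap);
		out[n++] = id;
	}
	return n;
}
```

In this model, toy_run(false, ...) records the order 2, 1, 3, 4 (P2
overtakes P1), while toy_run(true, ...) records 1, 2, 3, 4.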

Fixes: a53851e2c3 ("net: sched: explicit locking in gso_cpu fallback")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-19 09:09:31 +02:00
arch powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts 2019-09-16 08:22:25 +02:00
block blk-mq: free hw queue's resource in hctx's release handler 2019-09-16 08:22:13 +02:00
certs export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR() 2018-08-22 23:21:44 +09:00
crypto crypto: chacha20poly1305 - fix atomic sleep when using async algorithm 2019-07-26 09:14:19 +02:00
Documentation drm/panel: Add support for Armadeus ST0700 Adapt 2019-09-16 08:22:21 +02:00
drivers net: phylink: Fix flow control resolution 2019-09-19 09:09:30 +02:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs ext4: unsigned int compared against zero 2019-09-16 08:22:24 +02:00
include isdn/capi: check message length in capi_write() 2019-09-19 09:09:29 +02:00
init initramfs: free initrd memory if opening /initrd.image fails 2019-06-15 11:54:01 +02:00
ipc ipc/mqueue.c: only perform resource calculation if user valid 2019-08-06 19:06:52 +02:00
kernel resource: fix locking in find_next_iomem_res() 2019-09-16 08:22:20 +02:00
lib lib: logic_pio: Add logic_pio_unregister_range() 2019-09-06 10:22:19 +02:00
LICENSES LICENSES: Remove CC-BY-SA-4.0 license text 2018-10-18 11:28:50 +02:00
mm mm/migrate.c: initialize pud_entry in migrate_vma() 2019-09-16 08:22:22 +02:00
net net: sched: fix reordering issues 2019-09-19 09:09:31 +02:00
samples samples, bpf: suppress compiler warning 2019-07-14 08:11:04 +02:00
scripts scripts/decode_stacktrace: match basepath using shell prefix operator, not regex 2019-09-16 08:21:44 +02:00
security apparmor: reset pos on failure to unpack for various functions 2019-09-16 08:22:16 +02:00
sound ALSA: hda - Fix intermittent CORB/RIRB stall on Intel chips 2019-09-16 08:22:21 +02:00
tools selftests: fib_rule_tests: use pre-defined DEV_ADDR 2019-09-16 08:21:42 +02:00
usr initramfs: move gen_initramfs_list.sh from scripts/ to usr/ 2018-08-22 23:21:44 +09:00
virt kvm: Check irqchip mode before assign irqfd 2019-09-16 08:22:15 +02:00
.clang-format clang-format: Set IndentWrappedFunctionNames false 2018-08-01 18:38:51 +02:00
.cocciconfig
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap libnvdimm-for-4.19_misc 2018-08-25 18:13:10 -07:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS 9p: remove Ron Minnich from MAINTAINERS 2018-08-17 16:20:26 -07:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS platform/x86: Add Intel AtomISP2 dummy / power-management driver 2019-04-20 09:16:02 +02:00
Makefile Linux 4.19.73 2019-09-16 08:22:25 +02:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.