Linux kernel source tree
Go to file
Mathieu Desnoyers 7799ad4d18 tracepoint: Fix static call function vs data state mismatch
commit 231264d692 upstream.

On a 1->0->1 callbacks transition, there is an issue with the new
callback using the old callback's data.

Considering __DO_TRACE_CALL:

        do {                                                            \
                struct tracepoint_func *it_func_ptr;                    \
                void *__data;                                           \
                it_func_ptr =                                           \
                        rcu_dereference_raw((&__tracepoint_##name)->funcs); \
                if (it_func_ptr) {                                      \
                        __data = (it_func_ptr)->data;                   \

----> [ delayed here on one CPU (e.g. vcpu preempted by the host) ]

                        static_call(tp_func_##name)(__data, args);      \
                }                                                       \
        } while (0)

It has loaded the tp->funcs of the old callback, so it will try to use the old
data. This can be fixed by adding a RCU sync anywhere in the 1->0->1
transition chain.

On a N->2->1 transition, we need an rcu-sync because you may have a
sequence of 3->2->1 (or 1->2->1) where the element 0 data is unchanged
between 2->1, but was changed from 3->2 (or from 1->2), which may be
observed by the static call. This can be fixed by adding an
unconditional RCU sync in transition 2->1.

Note, this fixes a correctness issue at the cost of adding a tremendous
performance regression to the disabling of tracepoints.

Before this commit:

  # trace-cmd start -e all
  # time trace-cmd start -p nop

  real	0m0.778s
  user	0m0.000s
  sys	0m0.061s

After this commit:

  # trace-cmd start -e all
  # time trace-cmd start -p nop

  real	0m10.593s
  user	0m0.017s
  sys	0m0.259s

A follow up fix will introduce a more lightweight scheme based on RCU
get_state and cond_sync, that will return the performance back to what it
was. As both this change and the lightweight versions are complex on their
own, for bisecting any issues that this may cause, they are kept as two
separate changes.

Link: https://lkml.kernel.org/r/20210805132717.23813-3-mathieu.desnoyers@efficios.com
Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/

Cc: stable@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Stefan Metzmacher <metze@samba.org>
Fixes: d25e37d89d ("tracepoint: Optimize using static_call()")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-08-12 13:22:12 +02:00
arch mips: Fix non-POSIX regexp 2021-08-12 13:22:07 +02:00
block blk-iolatency: error out if blk_get_queue() failed in iolatency_set_limit() 2021-08-12 13:22:08 +02:00
certs certs: add 'x509_revocation_list' to gitignore 2021-07-20 16:05:35 +02:00
crypto crypto: sm2 - fix a memory leak in sm2 2021-07-14 16:56:06 +02:00
Documentation Documentation: Fix intiramfs script name 2021-07-28 14:35:47 +02:00
drivers clk: fix leak on devm_clk_bulk_get_all() unwind 2021-08-12 13:22:11 +02:00
fs btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction 2021-08-08 09:05:22 +02:00
include usb: otg-fsm: Fix hrtimer list corruption 2021-08-12 13:22:11 +02:00
init sched/core: Initialize the idle task with preemption disabled 2021-07-14 16:55:50 +02:00
ipc ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry 2021-05-26 12:06:54 +02:00
kernel tracepoint: Fix static call function vs data state mismatch 2021-08-12 13:22:12 +02:00
lib net: add kcov handle to skb extensions 2021-07-28 14:35:33 +02:00
LICENSES LICENSES/deprecated: add Zlib license text 2020-09-16 14:33:49 +02:00
mm memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions 2021-07-28 14:35:46 +02:00
net Bluetooth: defer cleanup of resources in hci_unregister_dev() 2021-08-12 13:22:08 +02:00
samples samples/bpf: Fix the error return code of xdp_redirect's main() 2021-07-14 16:56:23 +02:00
scripts scripts/tracing: fix the bug that can't parse raw_trace_func 2021-08-12 13:22:12 +02:00
security smackfs: restrict bytes count in smk_set_cipso() 2021-07-19 09:45:03 +02:00
sound ALSA: usb-audio: Add registration quirk for JBL Quantum 600 2021-08-12 13:22:10 +02:00
tools selftest/bpf: Verifier tests for var-off access 2021-08-08 09:05:24 +02:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt KVM: add missing compat KVM_CLEAR_DIRTY_LOG 2021-08-04 12:46:39 +02:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore kbuild: generate Module.symvers only when vmlinux exists 2021-05-19 10:12:59 +02:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS f2fs: move ioctl interface definitions to separated file 2021-05-19 10:13:00 +02:00
Makefile Linux 5.10.57 2021-08-08 09:05:24 +02:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.