Linux kernel source tree
Go to file
Waiman Long 76723ed1fb locking/rwsem: Make handoff bit handling more consistent
[ Upstream commit d257cc8cb8 ]

There are some inconsistency in the way that the handoff bit is being
handled in readers and writers that lead to a race condition.

Firstly, when a queue head writer set the handoff bit, it will clear
it when the writer is being killed or interrupted on its way out
without acquiring the lock. That is not the case for a queue head
reader. The handoff bit will simply be inherited by the next waiter.

Secondly, in the out_nolock path of rwsem_down_read_slowpath(), both
the waiter and handoff bits are cleared if the wait queue becomes
empty.  For rwsem_down_write_slowpath(), however, the handoff bit is
not checked and cleared if the wait queue is empty. This can
potentially make the handoff bit set with empty wait queue.

Worse, the situation in rwsem_down_write_slowpath() relies on wstate,
a variable set outside of the critical section containing the ->count
manipulation, this leads to race condition where RWSEM_FLAG_HANDOFF
can be double subtracted, corrupting ->count.

To make the handoff bit handling more consistent and robust, extract
out handoff bit clearing code into the new rwsem_del_waiter() helper
function. Also, completely eradicate wstate; always evaluate
everything inside the same critical section.

The common function will only use atomic_long_andnot() to clear bits
when the wait queue is empty to avoid possible race condition.  If the
first waiter with handoff bit set is killed or interrupted to exit the
slowpath without acquiring the lock, the next waiter will inherit the
handoff bit.

While at it, simplify the trylock for loop in
rwsem_down_write_slowpath() to make it easier to read.

Fixes: 4f23dbc1e6 ("locking/rwsem: Implement lock handoff to prevent lock starvation")
Reported-by: Zhenhua Ma <mazhenhua@xiaomi.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211116012912.723980-1-longman@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-01 09:04:54 +01:00
arch MIPS: use 3-level pgtable for 64KB page size on MIPS_VA_BITS_48 2021-12-01 09:04:53 +01:00
block block: Check ADMIN before NICE for IOPRIO_CLASS_RT 2021-11-25 09:48:45 +01:00
certs certs: Add support for using elliptic curve keys for signing modules 2021-08-23 19:55:42 +03:00
crypto crypto: pcrypt - Delay write to padata->info 2021-11-18 19:16:44 +01:00
Documentation netfilter: ipvs: Fix reuse connection if RS weight is 0 2021-12-01 09:04:45 +01:00
drivers net: mscc: ocelot: correctly report the timestamping RX filters in ethtool 2021-12-01 09:04:54 +01:00
fs erofs: fix deadlock when shrink erofs slab 2021-12-01 09:04:50 +01:00
include net: ipv6: add fib6_nh_release_dsts stub 2021-12-01 09:04:49 +01:00
init init: make unknown command line param message clearer 2021-11-18 19:17:11 +01:00
ipc shm: extend forced shm destroy to support objects from several IPC nses 2021-11-25 09:48:42 +01:00
kernel locking/rwsem: Make handoff bit handling more consistent 2021-12-01 09:04:54 +01:00
lib printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces 2021-11-25 09:48:45 +01:00
LICENSES LICENSES/dual/CC-BY-4.0: Git rid of "smart quotes" 2021-07-15 06:31:24 -06:00
mm hugetlbfs: flush TLBs correctly after huge_pmd_unshare 2021-11-25 09:49:07 +01:00
net net/smc: Don't call clcsock shutdown twice when smc shutdown 2021-12-01 09:04:53 +01:00
samples samples/bpf: Fix incorrect use of strlen in xdp_redirect_cpu 2021-11-25 09:48:33 +01:00
scripts leaking_addresses: Always print a trailing newline 2021-11-18 19:16:16 +01:00
security selinux: fix NULL-pointer dereference when hashtab allocation fails 2021-11-25 09:49:07 +01:00
sound ALSA: intel-dsp-config: add quirk for JSL devices based on ES8336 codec 2021-12-01 09:04:49 +01:00
tools tools build: Fix removal of feature-sync-compare-and-swap feature detection 2021-11-25 09:48:40 +01:00
usr .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
virt KVM: Remove tlbs_dirty 2021-09-23 11:01:12 -04:00
.clang-format clang-format: Update with the latest for_each macro list 2021-05-12 23:32:39 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap mailmap: add Andrej Shadura 2021-10-18 20:22:03 -10:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Daniel Drake to credits 2021-09-21 08:34:58 +03:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS drm fixes for 5.15 final 2021-10-28 12:17:01 -07:00
Makefile Linux 5.15.5 2021-11-25 09:49:08 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.