mirror of
https://github.com/torvalds/linux.git
synced 2026-06-08 22:52:35 +02:00
Currently the percpu-rwsem switches to (global) atomic ops while a writer is waiting; which could be quite a while and slows down releasing the readers. This patch cures this problem by ordering the reader-state vs reader-count (see the comments in __percpu_down_read() and percpu_down_write()). This changes a global atomic op into a full memory barrier, which doesn't have the global cacheline contention. This also enables using the percpu-rwsem with rcu_sync disabled in order to bias the implementation differently, reducing the writer latency by adding some cost to readers. Mailing-list-URL: https://lkml.org/lkml/2016/8/9/181 Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> [jstultz: Backported to 4.4] Change-Id: I8ea04b4dca2ec36f1c2469eccafde1423490572f Signed-off-by: John Stultz <john.stultz@linaro.org> |
||
|---|---|---|
| .. | ||
| lglock.c | ||
| lockdep_internals.h | ||
| lockdep_proc.c | ||
| lockdep_states.h | ||
| lockdep.c | ||
| locktorture.c | ||
| Makefile | ||
| mcs_spinlock.h | ||
| mutex-debug.c | ||
| mutex-debug.h | ||
| mutex.c | ||
| mutex.h | ||
| osq_lock.c | ||
| percpu-rwsem.c | ||
| qrwlock.c | ||
| qspinlock_paravirt.h | ||
| qspinlock.c | ||
| rtmutex_common.h | ||
| rtmutex-debug.c | ||
| rtmutex-debug.h | ||
| rtmutex.c | ||
| rtmutex.h | ||
| rwsem-spinlock.c | ||
| rwsem-xadd.c | ||
| rwsem.c | ||
| rwsem.h | ||
| semaphore.c | ||
| spinlock_debug.c | ||
| spinlock.c | ||