sched: Document wait_var_event() family of functions and wake_up_var()

wake_up_var(), wait_var_event() and related interfaces are not
documented but have important ordering requirements.  This patch adds
documentation and makes these requirements explicit.

The return values for those wait_var_event_* functions which return a
value are documented.  Note that these are, perhaps surprisingly,
sometimes different from comparable wait_on_bit() functions.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240925053405.3960701-4-neilb@suse.de
NeilBrown 2024-09-25 15:31:40 +10:00 committed by Peter Zijlstra
parent 3cdee6b359
commit bf39882edc
2 changed files with 101 additions and 0 deletions

@@ -282,6 +282,22 @@ __out: __ret; \
___wait_var_event(var, condition, TASK_UNINTERRUPTIBLE, 0, 0, \
schedule())
/**
* wait_var_event - wait for a variable to be updated and notified
* @var: the address of variable being waited on
* @condition: the condition to wait for
*
* Wait for a @condition to be true, only re-checking when a wake up is
* received for the given @var (an arbitrary kernel address which need
* not be directly related to the given condition, but usually is).
*
* The process will wait on a waitqueue selected by hash from a shared
* pool. It will only be woken on a wake_up for the given address.
*
* The condition should normally use smp_load_acquire() or a similarly
* ordered access to ensure that any changes to memory made before the
* condition became true will be visible after the wait completes.
*/
#define wait_var_event(var, condition) \
do { \
might_sleep(); \
@@ -294,6 +310,24 @@ do { \
___wait_var_event(var, condition, TASK_KILLABLE, 0, 0, \
schedule())
/**
* wait_var_event_killable - wait for a variable to be updated and notified
* @var: the address of variable being waited on
* @condition: the condition to wait for
*
* Wait for a @condition to be true or a fatal signal to be received,
* only re-checking the condition when a wake up is received for the given
* @var (an arbitrary kernel address which need not be directly related
* to the given condition, but usually is).
*
* This is similar to wait_var_event() but returns a value which is
* 0 if the condition became true, or %-ERESTARTSYS if a fatal signal
* was received.
*
* The condition should normally use smp_load_acquire() or a similarly
* ordered access to ensure that any changes to memory made before the
* condition became true will be visible after the wait completes.
*/
#define wait_var_event_killable(var, condition) \
({ \
int __ret = 0; \
@@ -308,6 +342,26 @@ do { \
TASK_UNINTERRUPTIBLE, 0, timeout, \
__ret = schedule_timeout(__ret))
/**
* wait_var_event_timeout - wait for a variable to be updated or a timeout to expire
* @var: the address of variable being waited on
* @condition: the condition to wait for
* @timeout: maximum time to wait in jiffies
*
* Wait for a @condition to be true or a timeout to expire, only
* re-checking the condition when a wake up is received for the given
* @var (an arbitrary kernel address which need not be directly related
* to the given condition, but usually is).
*
* This is similar to wait_var_event() but returns a value which is 0 if
* the timeout expired and the condition was still false, or the
* remaining time left in the timeout (but at least 1) if the condition
* was found to be true.
*
* The condition should normally use smp_load_acquire() or a similarly
* ordered access to ensure that any changes to memory made before the
* condition became true will be visible after the wait completes.
*/
#define wait_var_event_timeout(var, condition, timeout) \
({ \
long __ret = timeout; \
@@ -321,6 +375,23 @@ do { \
___wait_var_event(var, condition, TASK_INTERRUPTIBLE, 0, 0, \
schedule())
/**
* wait_var_event_interruptible - wait for a variable to be updated and notified
* @var: the address of variable being waited on
* @condition: the condition to wait for
*
* Wait for a @condition to be true or a signal to be received, only
* re-checking the condition when a wake up is received for the given
* @var (an arbitrary kernel address which need not be directly related
* to the given condition, but usually is).
*
* This is similar to wait_var_event() but returns a value which is 0 if
* the condition became true, or %-ERESTARTSYS if a signal was received.
*
* The condition should normally use smp_load_acquire() or a similarly
* ordered access to ensure that any changes to memory made before the
* condition became true will be visible after the wait completes.
*/
#define wait_var_event_interruptible(var, condition) \
({ \
int __ret = 0; \
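
The return value documented above for wait_var_event_timeout() can be
summarised as a small pure function. This is a hypothetical userspace
model, not kernel code: the name model_timeout_result() and both of its
parameters are assumptions introduced only for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical model, not kernel code: maps the final state of a timed
 * wait onto the return value documented for wait_var_event_timeout().
 *   cond_true - whether the condition was found true when the wait ended
 *   remaining - jiffies left on the timeout at that moment (>= 0)
 */
static long model_timeout_result(bool cond_true, long remaining)
{
	/* Timeout expired with the condition still false. */
	if (!cond_true)
		return 0;

	/*
	 * Condition true: report the time left, but never 0, so that a
	 * caller can always tell success apart from a timeout.
	 */
	return remaining > 0 ? remaining : 1;
}
```

In this model a caller can treat a zero result as "timed out" and any
positive result as "condition met", even when the condition became true
just as the timeout ran out.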

@@ -196,6 +196,36 @@ void init_wait_var_entry(struct wait_bit_queue_entry *wbq_entry, void *var, int
}
EXPORT_SYMBOL(init_wait_var_entry);
/**
* wake_up_var - wake up waiters on a variable (kernel address)
* @var: the address of the variable being waited on
*
* Wake up any process waiting in wait_var_event() or similar for the
* given variable to change. wait_var_event() can be waiting for an
* arbitrary condition to be true and associates that condition with an
* address. Calling wake_up_var() suggests that the condition has been
* made true, but does not strictly require the condition to use the
* address given.
*
* The wake-up is sent to tasks in a waitqueue selected by hash from a
* shared pool. Only those tasks on that queue which have requested
* wake_up on this specific address will be woken.
*
* In order for this to function properly there must be a full memory
* barrier after the variable is updated (or more accurately, after the
* condition waited on has been made to be true) and before this function
* is called. If the variable was updated atomically, such as by
* atomic_dec(), then smp_mb__after_atomic() can be used. If the
* variable was updated by a fully ordered operation such as
* atomic_dec_and_test() then no extra barrier is required. Otherwise
* smp_mb() is needed.
*
* Normally the variable should be updated (the condition should be made
* to be true) by an operation with RELEASE semantics such as
* smp_store_release() so that any changes to memory made before the
* variable was updated are guaranteed to be visible after the matching
* wait_var_event() completes.
*/
void wake_up_var(void *var)
{
__wake_up_bit(__var_waitqueue(var), var, -1);
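
The ordering described in the wake_up_var() comment can be sketched with
a userspace analogy built on C11 atomics and pthreads. This is not kernel
code and all names in it (payload, ready, waker(), wait_for_payload(),
run_demo()) are assumptions made for illustration. The release store
stands in for smp_store_release(), the sequentially consistent fence
marks where smp_mb() would go before wake_up_var(), and the acquire load
plays the role of smp_load_acquire() in the wait condition. In this
analogy the mutex already provides enough ordering on its own; the
explicit fence is kept only to show where the kernel barrier belongs.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;                 /* data published before the flag */
static atomic_int ready;            /* the "variable" being waited on */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

/* Waker side: make the condition true, order it, then wake waiters. */
static void *waker(void *arg)
{
	(void)arg;
	payload = 42;
	/* Analogue of smp_store_release(): publishes payload with the flag. */
	atomic_store_explicit(&ready, 1, memory_order_release);
	/* Analogue of the full barrier required before wake_up_var(). */
	atomic_thread_fence(memory_order_seq_cst);
	/* Analogue of wake_up_var(&ready). */
	pthread_mutex_lock(&lock);
	pthread_cond_broadcast(&cond);
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Waiter side: re-check the condition with an acquire load after each wake. */
static int wait_for_payload(void)
{
	pthread_mutex_lock(&lock);
	while (!atomic_load_explicit(&ready, memory_order_acquire))
		pthread_cond_wait(&cond, &lock);
	pthread_mutex_unlock(&lock);
	/* The acquire load saw ready == 1, so the payload write is visible. */
	return payload;
}

/* Starts one waker thread, waits for it, and returns the payload seen. */
static int run_demo(void)
{
	pthread_t t;
	int seen;

	if (pthread_create(&t, NULL, waker, NULL) != 0)
		return -1;
	seen = wait_for_payload();
	pthread_join(t, NULL);
	return seen;
}
```

Calling run_demo() from main() returns the value the waiter observed.
The release/acquire pairing is what guarantees that the waiter sees the
payload written before the flag was set.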