mirror of
https://github.com/torvalds/linux.git
synced 2026-05-22 14:12:07 +02:00
ocfs2/dlm: fix possible conversion deadlock
We found there is a conversion deadlock when the owner of a lockres
happens to crash before sending DLM_PROXY_AST_MSG for a downconverting
lock. The situation is as follows:
Node1                           Node2                       Node3
                                the owner of lockresA
lock_1 granted at EX mode
and call ocfs2_cluster_unlock
to decrease ex_holders.
                                                            converting lock_3 from
                                                            NL to EX
                                send DLM_PROXY_AST_MSG
                                to Node1, asking Node1
                                to downconvert.
receiving DLM_PROXY_AST_MSG,
thread ocfs2dc send
DLM_CONVERT_LOCK_MSG
to Node2 to downconvert
lock_1(EX->NL).
                                lock_1 can be granted and
                                put it into pending_asts
                                list, return DLM_NORMAL.
                                then something happened
                                and Node2 crashed.
received DLM_NORMAL, waiting
for DLM_PROXY_AST_MSG.
                                                            selected as the recovery
                                                            master, receiving migrate
                                                            lock from Node1, queue
                                                            lock_1 to the tail of
                                                            converting list.
After dlm recovery, the converting list in the master of lockresA (Node3)
will be: converting list head <-> lock_3(NL->EX) <-> lock_1(EX->NL).
The requested mode of lock_3 is not compatible with the granted mode of
lock_1, so lock_3 cannot be granted, and lock_1 cannot downconvert
because the converting queue is strictly FIFO. So a deadlock is created.
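The stall described above can be reproduced with a small user-space model. This is only a sketch under stated assumptions: the `struct lock` layout, the three-mode compatibility rule, and the helper names `mode_compatible`, `convert_head` and `stuck_after_drain` are hypothetical and are not taken from the ocfs2 sources. The model only captures the rule that the first converting lock is the sole candidate and must be compatible with the granted mode of every other lock.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified lock modes (assumed values); only the three used in the scenario. */
enum mode { MODE_NL = 0, MODE_PR = 3, MODE_EX = 5 };

/* Hypothetical, trimmed-down lock record kept in converting-queue order. */
struct lock {
	const char *name;
	enum mode type;          /* currently granted mode */
	enum mode convert_type;  /* requested mode */
	int converting;          /* 1 while the conversion is still pending */
};

/* Two modes are compatible if either is NL, or if both are PR. */
static int mode_compatible(enum mode a, enum mode b)
{
	if (a == MODE_NL || b == MODE_NL)
		return 1;
	return a == MODE_PR && b == MODE_PR;
}

/*
 * Strict FIFO: only the first pending conversion is considered. It is
 * granted only if its requested mode is compatible with the granted mode
 * of every other lock. Returns 1 if the head was converted, 0 otherwise.
 */
static int convert_head(struct lock *q, size_t n)
{
	size_t head, i;

	for (head = 0; head < n && !q[head].converting; head++)
		;
	if (head == n)
		return 0;  /* nothing left to convert */

	for (i = 0; i < n; i++)
		if (i != head &&
		    !mode_compatible(q[head].convert_type, q[i].type))
			return 0;  /* head is blocked, so everyone behind it waits */

	q[head].type = q[head].convert_type;
	q[head].converting = 0;
	return 1;
}

/* Convert until the head is blocked; return how many locks are still pending. */
static size_t stuck_after_drain(struct lock *q, size_t n)
{
	size_t i, stuck = 0;

	assert(q != NULL);
	while (convert_head(q, n))
		;
	for (i = 0; i < n; i++)
		stuck += (size_t)q[i].converting;
	return stuck;
}
```

With the queue ordered lock_3(NL->EX), lock_1(EX->NL), both conversions stay pending in this model. With lock_1 placed first, both complete.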
We think function dlm_process_recovery_data() should queue_ast for
lock_1 or alter the order of lock_1 and lock_3, so dlm_thread can
process lock_1 first. And if there are multiple downconverting locks,
they must convert from PR to NL, so there is no need to sort them.
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 55b465b668
commit 6718cb5e0e
@@ -1986,7 +1986,15 @@ static int dlm_process_recovery_data(struct dlm_ctxt *dlm,
 			}
 			if (!bad) {
 				dlm_lock_get(newlock);
-				list_add_tail(&newlock->list, queue);
+				if (mres->flags & DLM_MRES_RECOVERY &&
+						ml->list == DLM_CONVERTING_LIST &&
+						newlock->ml.type >
+						newlock->ml.convert_type) {
+					/* newlock is doing downconvert, add it to the
+					 * head of converting list */
+					list_add(&newlock->list, queue);
+				} else
+					list_add_tail(&newlock->list, queue);
 				mlog(0, "%s:%.*s: added lock for node %u, "
 						"setting refmap bit\n", dlm->name,
 						res->lockname.len, res->lockname.name, ml->node);
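The effect of choosing list_add() over list_add_tail() for a downconverting lock can be sketched in user space. This is a minimal illustration, not the kernel code. The `struct list_head` here is a hand-written stand-in for the kernel list, and `struct lock`, the mode values, `queue_converting` and `first_lock` are hypothetical names introduced for this example. The `in_recovery` argument stands in for the `DLM_MRES_RECOVERY` and `DLM_CONVERTING_LIST` checks in the hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_insert(struct list_head *item,
			struct list_head *prev, struct list_head *next)
{
	next->prev = item;
	item->next = next;
	item->prev = prev;
	prev->next = item;
}

/* Insert right after the head, i.e. at the front of the queue. */
static void list_add(struct list_head *item, struct list_head *head)
{
	list_insert(item, head, head->next);
}

/* Insert right before the head, i.e. at the back of the queue. */
static void list_add_tail(struct list_head *item, struct list_head *head)
{
	list_insert(item, head->prev, head);
}

/* Assumed mode values; a higher number means a stronger mode. */
enum { MODE_NL = 0, MODE_PR = 3, MODE_EX = 5 };

/*
 * Hypothetical, trimmed-down lock. `list` is kept as the first member so
 * that a list_head pointer can be cast back to the containing lock.
 */
struct lock {
	struct list_head list;
	int type;          /* granted mode */
	int convert_type;  /* requested mode */
};

/* Downconverts seen during recovery go to the front, everything else to the back. */
static void queue_converting(struct lock *lk, struct list_head *queue,
			     int in_recovery)
{
	if (in_recovery && lk->type > lk->convert_type)
		list_add(&lk->list, queue);
	else
		list_add_tail(&lk->list, queue);
}

/* Return the lock at the front of a non-empty queue. */
static struct lock *first_lock(struct list_head *queue)
{
	assert(queue->next != queue);  /* queue must not be empty */
	return (struct lock *)queue->next;
}
```

Queuing lock_3(NL->EX) and then lock_1(EX->NL) with `in_recovery` set leaves lock_1 at the front, so a FIFO consumer would see the downconvert first. With `in_recovery` cleared, the arrival order is preserved.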