Linux kernel source tree
Go to file
Coly Li 59afd4f287 bcache: avoid journal no-space deadlock by reserving 1 journal bucket
commit 32feee36c3 upstream.

The journal no-space deadlock was reported time to time. Such deadlock
can happen in the following situation.

When all journal buckets are fully filled by active jset with heavy
write I/O load, the cache set registration (after a reboot) will load
all active jsets and inserting them into the btree again (which is
called journal replay). If a journaled bkey is inserted into a btree
node and results btree node split, new journal request might be
triggered. For example, the btree grows one more level after the node
split, then the root node record in cache device super block will be
upgrade by bch_journal_meta() from bch_btree_set_root(). But there is no
space in journal buckets, the journal replay has to wait for new journal
bucket to be reclaimed after at least one journal bucket replayed. This
is one example that how the journal no-space deadlock happens.

The solution to avoid the deadlock is to reserve 1 journal bucket in
run time, and only permit the reserved journal bucket to be used during
cache set registration procedure for things like journal replay. Then
the journal space will never be fully filled, there is no chance for
journal no-space deadlock to happen anymore.

This patch adds a new member "bool do_reserve" in struct journal, it is
inititalized to 0 (false) when struct journal is allocated, and set to
1 (true) by bch_journal_space_reserve() when all initialization done in
run_cache_set(). In the run time when journal_reclaim() tries to
allocate a new journal bucket, free_journal_buckets() is called to check
whether there are enough free journal buckets to use. If there is only
1 free journal bucket and journal->do_reserve is 1 (true), the last
bucket is reserved and free_journal_buckets() will return 0 to indicate
no free journal bucket. Then journal_reclaim() will give up, and try
next time to see whetheer there is free journal bucket to allocate. By
this method, there is always 1 jouranl bucket reserved in run time.

During the cache set registration, journal->do_reserve is 0 (false), so
the reserved journal bucket can be used to avoid the no-space deadlock.

Reported-by: Nikhil Kshirsagar <nkshirsagar@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220524102336.10684-5-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09 10:21:28 +02:00
arch xtensa/simdisk: fix proc_read_simdisk() 2022-06-09 10:21:27 +02:00
block bfq: Track whether bfq_group is still online 2022-06-09 10:21:22 +02:00
certs certs: Trigger creation of RSA module signing key if it's not an RSA key 2021-09-15 09:50:29 +02:00
crypto crypto: cryptd - Protect per-CPU resource by disabling BH. 2022-06-09 10:21:17 +02:00
Documentation spi: qcom-qspi: Add minItems to interconnect-names 2022-06-09 10:20:59 +02:00
drivers bcache: avoid journal no-space deadlock by reserving 1 journal bucket 2022-06-09 10:21:28 +02:00
fs ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock 2022-06-09 10:21:24 +02:00
include nodemask.h: fix compilation error with GCC12 2022-06-09 10:21:27 +02:00
init Kconfig: Add option for asm goto w/ tied outputs to workaround clang-13 bug 2022-06-09 10:21:25 +02:00
ipc ipc/mqueue: use get_tree_nodev() in mqueue_get_tree() 2022-06-09 10:21:17 +02:00
kernel ftrace: Clean up hash direct_functions on register failures 2022-06-09 10:21:26 +02:00
lib lib/crypto: add prompts back to crypto libraries 2022-06-06 08:42:42 +02:00
LICENSES LICENSES/deprecated: add Zlib license text 2020-09-16 14:33:49 +02:00
mm hugetlb: fix huge_pmd_unshare address update 2022-06-09 10:21:27 +02:00
net mac80211: upgrade passive scan to active scan on DFS channels after beacon rx 2022-06-09 10:21:26 +02:00
samples samples/bpf, xdpsock: Fix race when running for fix duration of time 2022-04-08 14:40:21 +02:00
scripts scripts/faddr2line: Fix overlapping text section failures 2022-06-09 10:21:08 +02:00
security ima: remove the IMA_TEMPLATE Kconfig option 2022-06-09 10:21:25 +02:00
sound ASoC: rt5514: Fix event generation for "DSP Voice Wake Up" control 2022-06-09 10:21:27 +02:00
tools perf jevents: Fix event syntax error caused by ExtSel 2022-06-09 10:21:21 +02:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-02-01 17:25:48 +01:00
virt KVM: Prevent module exit until all VMs are freed 2022-04-08 14:40:38 +02:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore kbuild: generate Module.symvers only when vmlinux exists 2021-05-19 10:12:59 +02:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: add git tree for random.c 2022-05-30 09:33:24 +02:00
Makefile Linux 5.10.120 2022-06-06 08:42:45 +02:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.