linux/drivers/md
Yu Kuai 41425f96d7 dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape
For raid456, if reshape is still in progress, then IO across reshape
position will wait for reshape to make progress. However, for dm-raid,
in following cases reshape will never make progress hence IO will hang:

1) the array is read-only;
2) MD_RECOVERY_WAIT is set;
3) MD_RECOVERY_FROZEN is set;

After commit c467e97f07 ("md/raid6: use valid sector values to determine
if an I/O should wait on the reshape") fix the problem that IO across
reshape position doesn't wait for reshape, the dm-raid test
shell/lvconvert-raid-reshape.sh start to hang:

[root@fedora ~]# cat /proc/979/stack
[<0>] wait_woken+0x7d/0x90
[<0>] raid5_make_request+0x929/0x1d70 [raid456]
[<0>] md_handle_request+0xc2/0x3b0 [md_mod]
[<0>] raid_map+0x2c/0x50 [dm_raid]
[<0>] __map_bio+0x251/0x380 [dm_mod]
[<0>] dm_submit_bio+0x1f0/0x760 [dm_mod]
[<0>] __submit_bio+0xc2/0x1c0
[<0>] submit_bio_noacct_nocheck+0x17f/0x450
[<0>] submit_bio_noacct+0x2bc/0x780
[<0>] submit_bio+0x70/0xc0
[<0>] mpage_readahead+0x169/0x1f0
[<0>] blkdev_readahead+0x18/0x30
[<0>] read_pages+0x7c/0x3b0
[<0>] page_cache_ra_unbounded+0x1ab/0x280
[<0>] force_page_cache_ra+0x9e/0x130
[<0>] page_cache_sync_ra+0x3b/0x110
[<0>] filemap_get_pages+0x143/0xa30
[<0>] filemap_read+0xdc/0x4b0
[<0>] blkdev_read_iter+0x75/0x200
[<0>] vfs_read+0x272/0x460
[<0>] ksys_read+0x7a/0x170
[<0>] __x64_sys_read+0x1c/0x30
[<0>] do_syscall_64+0xc6/0x230
[<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74

This is because reshape can't make progress.

For md/raid, the problem doesn't exist because register new sync_thread
doesn't rely on the IO to be done any more:

1) If array is read-only, it can switch to read-write by ioctl/sysfs;
2) md/raid never set MD_RECOVERY_WAIT;
3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold
   'reconfig_mutex', hence it can be cleared and reshape can continue by
   sysfs api 'sync_action'.

However, I'm not sure yet how to avoid the problem in dm-raid yet. This
patch on the one hand make sure raid_message() can't change
sync_thread() through raid_message() after presuspend(), on the other
hand detect the above 3 cases before wait for IO do be done in
dm_suspend(), and let dm-raid requeue those IO.

Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240305072306.2562024-9-yukuai1@huaweicloud.com
2024-03-05 12:53:33 -08:00
..
bcache bcache: pass queue_limits to blk_mq_alloc_disk 2024-02-19 16:58:24 -07:00
persistent-data dm thin metadata: Fix ABBA deadlock by resetting dm_bufio_client 2023-06-16 18:24:13 -04:00
dm-audit.c dm: add missing SPDX-License-Indentifiers 2023-02-14 14:23:06 -05:00
dm-audit.h dm: introduce audit event module for device mapper 2021-10-27 16:53:47 -04:00
dm-bio-prison-v1.c dm: Annotate struct dm_bio_prison with __counted_by 2023-10-02 09:48:53 -07:00
dm-bio-prison-v1.h dm bio prison v1: add dm_cell_key_has_valid_range 2023-03-30 15:57:51 -04:00
dm-bio-prison-v2.c dm: address space issues relative to switch/while/for/... 2023-02-14 14:23:06 -05:00
dm-bio-prison-v2.h dm: change "unsigned" to "unsigned int" 2023-02-14 14:23:06 -05:00
dm-bio-record.h dm: add missing SPDX-License-Indentifiers 2023-02-14 14:23:06 -05:00
dm-bufio.c mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
dm-builtin.c dm: adjust EXPORT_SYMBOL() to follow functions immediately 2023-02-14 14:23:07 -05:00
dm-cache-background-tracker.c dm: change "unsigned" to "unsigned int" 2023-02-14 14:23:06 -05:00
dm-cache-background-tracker.h dm: change "unsigned" to "unsigned int" 2023-02-14 14:23:06 -05:00
dm-cache-block-types.h dm: add missing SPDX-License-Indentifiers 2023-02-14 14:23:06 -05:00
dm-cache-metadata.c Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
dm-cache-metadata.h dm: change "unsigned" to "unsigned int" 2023-02-14 14:23:06 -05:00
dm-cache-policy-internal.h dm: add missing empty lines 2023-02-14 14:23:06 -05:00
dm-cache-policy-smq.c dm cache policy smq: ensure IO doesn't prevent cleaner policy progress 2023-07-25 11:55:50 -04:00
dm-cache-policy.c dm: change "unsigned" to "unsigned int" 2023-02-14 14:23:06 -05:00
dm-cache-policy.h dm: address indent/space issues 2023-02-14 14:23:06 -05:00
dm-cache-target.c block: replace fmode_t with a block-specific type for block open flags 2023-06-12 08:04:05 -06:00
dm-clone-metadata.c dm clone metadata: remove unused function 2021-04-19 13:20:31 -04:00
dm-clone-metadata.h
dm-clone-target.c block: replace fmode_t with a block-specific type for block open flags 2023-06-12 08:04:05 -06:00
dm-core.h dm: limit the number of targets and parameter size area 2024-01-30 14:05:02 -05:00
dm-crypt.c dm-crypt, dm-verity: disable tasklets 2024-02-02 12:33:50 -05:00
dm-delay.c dm-delay: avoid duplicate logic 2023-11-17 14:41:14 -05:00
dm-dust.c dm: add helper macro for simple DM target module init and exit 2023-04-11 12:09:08 -04:00
dm-ebs-target.c dm: add helper macro for simple DM target module init and exit 2023-04-11 12:09:08 -04:00
dm-era-target.c block: replace fmode_t with a block-specific type for block open flags 2023-06-12 08:04:05 -06:00
dm-exception-store.c dm: change "unsigned" to "unsigned int" 2023-02-14 14:23:06 -05:00
dm-exception-store.h dm: avoid spaces before function arguments or in favour of tabs 2023-02-14 14:23:06 -05:00
dm-flakey.c mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
dm-ima.c dm: avoid inline filenames 2023-02-14 14:23:07 -05:00
dm-ima.h dm: avoid inline filenames 2023-02-14 14:23:07 -05:00
dm-init.c dm: open code dm_get_dev_t in dm_init_init 2023-06-05 10:57:40 -06:00
dm-integrity.c dm-integrity: don't modify bio's immutable bio_vec in integrity_metadata() 2023-12-18 13:11:05 -05:00
dm-io-rewind.c dm: avoid void function return statements 2023-02-14 14:23:07 -05:00
dm-io-tracker.h dm: add missing SPDX-License-Indentifiers 2023-02-14 14:23:06 -05:00
dm-io.c dm: remove unnecessary (void*) conversions 2023-04-11 12:01:01 -04:00
dm-ioctl.c dm: limit the number of targets and parameter size area 2024-01-30 14:05:02 -05:00
dm-kcopyd.c block: remove support for the host aware zone model 2023-12-19 20:17:43 -07:00
dm-linear.c dm: shortcut the calls to linear_map and stripe_map 2023-10-06 19:09:25 -04:00
dm-log-userspace-base.c dm log userspace: replace deprecated strncpy with strscpy 2023-10-23 13:02:48 -04:00
dm-log-userspace-transfer.c dm: avoid split of quoted strings where possible 2023-02-14 14:23:07 -05:00
dm-log-userspace-transfer.h dm: add missing SPDX-License-Indentifiers 2023-02-14 14:23:06 -05:00
dm-log-writes.c dm: add helper macro for simple DM target module init and exit 2023-04-11 12:09:08 -04:00
dm-log.c dm: remove unnecessary (void*) conversions 2023-04-11 12:01:01 -04:00
dm-mpath.c dm: push error reporting down to dm_register_target() 2023-04-11 12:01:01 -04:00
dm-mpath.h dm: change "unsigned" to "unsigned int" 2023-02-14 14:23:06 -05:00
dm-path-selector.c dm: adjust EXPORT_SYMBOL() to follow functions immediately 2023-02-14 14:23:07 -05:00
dm-path-selector.h dm: avoid spaces before function arguments or in favour of tabs 2023-02-14 14:23:06 -05:00
dm-ps-historical-service-time.c dm: add missing SPDX-License-Indentifiers 2023-02-14 14:23:06 -05:00
dm-ps-io-affinity.c dm: address space issues relative to switch/while/for/... 2023-02-14 14:23:06 -05:00
dm-ps-queue-length.c dm: avoid spaces before function arguments or in favour of tabs 2023-02-14 14:23:06 -05:00
dm-ps-round-robin.c dm: correct block comments format. 2023-02-14 14:23:06 -05:00
dm-ps-service-time.c dm: avoid spaces before function arguments or in favour of tabs 2023-02-14 14:23:06 -05:00
dm-raid.c dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape 2024-03-05 12:53:33 -08:00
dm-raid1.c dm: remove unnecessary (void*) conversions 2023-04-11 12:01:01 -04:00
dm-region-hash.c dm: correct block comments format. 2023-02-14 14:23:06 -05:00
dm-rq.c dm: avoid using symbolic permissions 2023-02-14 14:23:07 -05:00
dm-rq.h dm: change "unsigned" to "unsigned int" 2023-02-14 14:23:06 -05:00
dm-snap-persistent.c dm: remove unnecessary (void*) conversions 2023-04-11 12:01:01 -04:00
dm-snap-transient.c dm: avoid split of quoted strings where possible 2023-02-14 14:23:07 -05:00
dm-snap.c block: replace fmode_t with a block-specific type for block open flags 2023-06-12 08:04:05 -06:00
dm-stats.c dm stats: limit the number of entries 2024-01-30 14:06:44 -05:00
dm-stats.h dm stats: check for and propagate alloc_percpu failure 2023-03-16 13:37:06 -04:00
dm-stripe.c - Update DM core to directly call the map function for both the linear 2023-11-01 12:55:54 -10:00
dm-switch.c dm: add helper macro for simple DM target module init and exit 2023-04-11 12:09:08 -04:00
dm-sysfs.c dm sysfs: make kobj_type structure constant 2023-02-14 14:23:08 -05:00
dm-table.c dm: use queue_limits_set 2024-03-01 08:54:42 -07:00
dm-target.c dm error: Add support for zoned block devices 2023-10-31 11:06:21 -04:00
dm-thin-metadata.c - Update DM crypt to allocate compound pages if possible. 2023-06-30 12:16:00 -07:00
dm-thin-metadata.h dm: add missing SPDX-License-Indentifiers 2023-02-14 14:23:06 -05:00
dm-thin.c - Update DM crypt to allocate compound pages if possible. 2023-06-30 12:16:00 -07:00
dm-uevent.c dm: avoid spaces before function arguments or in favour of tabs 2023-02-14 14:23:06 -05:00
dm-uevent.h dm: fix undue/missing spaces 2023-02-14 14:23:06 -05:00
dm-unstripe.c dm: add helper macro for simple DM target module init and exit 2023-04-11 12:09:08 -04:00
dm-verity-fec.c dm-verity: align struct dm_verity_fec_io properly 2023-11-29 12:58:06 -05:00
dm-verity-fec.h dm: change "unsigned" to "unsigned int" 2023-02-14 14:23:06 -05:00
dm-verity-loadpin.c dm: verity-loadpin: Add NULL pointer check for 'bdev' parameter 2023-06-28 10:43:04 -07:00
dm-verity-target.c dm-crypt, dm-verity: disable tasklets 2024-02-02 12:33:50 -05:00
dm-verity-verify-sig.c dm: add missing SPDX-License-Indentifiers 2023-02-14 14:23:06 -05:00
dm-verity-verify-sig.h dm: add missing SPDX-License-Indentifiers 2023-02-14 14:23:06 -05:00
dm-verity.h dm-crypt, dm-verity: disable tasklets 2024-02-02 12:33:50 -05:00
dm-writecache.c dm writecache: allow allocations larger than 2GiB 2024-01-30 14:07:30 -05:00
dm-zero.c dm: add helper macro for simple DM target module init and exit 2023-04-11 12:09:08 -04:00
dm-zone.c dm zone: Use the bitmap API to allocate bitmaps 2023-06-16 18:24:13 -04:00
dm-zoned-metadata.c block: remove gfp_flags from blkdev_zone_mgmt 2024-02-12 08:41:16 -07:00
dm-zoned-reclaim.c dm kcopyd: avoid useless atomic operations 2021-06-04 12:07:24 -04:00
dm-zoned-target.c block: remove support for the host aware zone model 2023-12-19 20:17:43 -07:00
dm-zoned.h dm/dm-zoned: Use the enum req_op type 2022-07-14 12:14:31 -06:00
dm.c block: pass a queue_limits argument to blk_alloc_disk 2024-02-19 16:58:23 -07:00
dm.h dm: shortcut the calls to linear_map and stripe_map 2023-10-06 19:09:25 -04:00
Kconfig for-6.8/block-2024-01-08 2024-01-11 13:58:04 -08:00
Makefile md: Remove deprecated CONFIG_MD_FAULTY 2023-12-19 10:37:50 -08:00
md-autodetect.c md: Remove deprecated CONFIG_MD_LINEAR 2023-12-19 10:16:51 -08:00
md-bitmap.c md/md-bitmap: fix incorrect usage for sb_index 2024-02-26 13:34:44 -08:00
md-bitmap.h md-bitmap: don't use ->index for pages backing the bitmap file 2023-07-27 00:13:29 -07:00
md-cluster.c md-cluster: check for timeout while a new disk adding 2023-10-12 09:16:19 -07:00
md-cluster.h
md.c dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape 2024-03-05 12:53:33 -08:00
md.h dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape 2024-03-05 12:53:33 -08:00
raid1-10.c md/raid1-10: factor out a new helper raid1_should_read_first() 2024-02-29 22:49:46 -08:00
raid1.c md/raid1: factor out helpers to choose the best rdev from read_balance() 2024-02-29 22:49:46 -08:00
raid1.h md/raid1: record nonrot rdevs while adding/removing rdevs to conf 2024-02-29 22:49:45 -08:00
raid5-cache.c md/raid5: remove rcu protection to access rdev from conf 2023-11-27 15:49:05 -08:00
raid5-log.h md/raid5-ppl: Drop unused argument from ppl_handle_flush_request() 2022-08-02 17:14:31 -06:00
raid5-ppl.c md/raid5: remove rcu protection to access rdev from conf 2023-11-27 15:49:05 -08:00
raid5.c dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape 2024-03-05 12:53:33 -08:00
raid5.h md/raid5: remove rcu protection to access rdev from conf 2023-11-27 15:49:05 -08:00
raid10.c md/raid1-10: factor out a new helper raid1_should_read_first() 2024-02-29 22:49:46 -08:00
raid10.h md/raid10: switch to use md_account_bio() for io accounting 2023-07-27 00:13:29 -07:00
raid0.c md: raid0: account for split bio in iostat accounting 2023-08-17 21:11:31 -07:00
raid0.h md/raid0: add discard support for the 'original' layout 2023-06-30 15:43:50 -07:00