linux/drivers/md
Heinz Mauelshagen 0cd2d2577a dm raid: fix inconclusive reshape layout on fast raid4/5/6 table reload sequences
commit f99a8e4373 upstream.

If fast table reloads occur during an ongoing reshape of raid4/5/6
devices the target may race reading a superblock vs the the MD resync
thread; causing an inconclusive reshape state to be read in its
constructor.

lvm2 test lvconvert-raid-reshape-stripes-load-reload.sh can cause
BUG_ON() to trigger in md_run(), e.g.:
"kernel BUG at drivers/md/raid5.c:7567!".

Scenario triggering the bug:

1. the MD sync thread calls end_reshape() from raid5_sync_request()
   when done reshaping. However end_reshape() _only_ updates the
   reshape position to MaxSector keeping the changed layout
   configuration though (i.e. any delta disks, chunk sector or RAID
   algorithm changes). That inconclusive configuration is stored in
   the superblock.

2. dm-raid constructs a mapping, loading named inconsistent superblock
   as of step 1 before step 3 is able to finish resetting the reshape
   state completely, and calls md_run() which leads to mentioned bug
   in raid5.c.

3. the MD RAID personality's finish_reshape() is called; which resets
   the reshape information on chunk sectors, delta disks, etc. This
   explains why the bug is rarely seen on multi-core machines, as MD's
   finish_reshape() superblock update races with the dm-raid
   constructor's superblock load in step 2.

Fix identifies inconclusive superblock content in the dm-raid
constructor and resets it before calling md_run(), factoring out
identifying checks into rs_is_layout_change() to share in existing
rs_reshape_requested() and new rs_reset_inclonclusive_reshape(). Also
enhance a comment and remove an empty line.

Cc: stable@vger.kernel.org
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11 14:47:36 +02:00
..
bcache bcache: Move journal work to new flush wq 2021-03-04 11:38:26 +01:00
persistent-data dm thin metadata: Remove unused local variable when create thin and snap 2020-09-29 16:33:11 -04:00
dm-bio-prison-v1.c
dm-bio-prison-v1.h
dm-bio-prison-v2.c
dm-bio-prison-v2.h
dm-bio-record.h
dm-bufio.c dm bufio: subtract the number of initial sectors in dm_bufio_get_device_size 2021-03-09 11:11:12 +01:00
dm-builtin.c
dm-cache-background-tracker.c
dm-cache-background-tracker.h
dm-cache-block-types.h
dm-cache-metadata.c dm cache metadata: Avoid returning cmd->bm wild pointer on error 2020-09-02 13:38:24 -04:00
dm-cache-metadata.h
dm-cache-policy-internal.h
dm-cache-policy-smq.c
dm-cache-policy.c
dm-cache-policy.h
dm-cache-target.c Revert "dm cache: fix arm link errors with inline" 2020-12-01 15:43:36 -05:00
dm-clone-metadata.c dm clone metadata: Fix return type of dm_clone_nr_of_hydrated_regions() 2020-03-27 14:42:51 -04:00
dm-clone-metadata.h dm clone metadata: Fix return type of dm_clone_nr_of_hydrated_regions() 2020-03-27 14:42:51 -04:00
dm-clone-target.c writeback: remove bdi->congested_fn 2020-07-08 17:20:46 -06:00
dm-core.h dm: fix deadlock when swapping to encrypted device 2021-03-04 11:38:44 +01:00
dm-crypt.c dm: fix deadlock when swapping to encrypted device 2021-03-04 11:38:44 +01:00
dm-delay.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-dust.c dm dust: add interface to list all badblocks 2020-07-20 11:17:41 -04:00
dm-ebs-target.c dm ebs: Fix incorrect checking for REQ_OP_FLUSH 2020-08-04 16:01:40 -04:00
dm-era-target.c dm era: only resize metadata in preresume 2021-03-04 11:38:46 +01:00
dm-exception-store.c
dm-exception-store.h
dm-flakey.c
dm-historical-service-time.c dm mpath: add Historical Service Time Path Selector 2020-05-15 10:29:36 -04:00
dm-init.c dm init: Set file local variable static 2020-08-04 15:51:28 -04:00
dm-integrity.c dm integrity: conditionally disable "recalculate" feature 2021-01-27 11:54:55 +01:00
dm-io.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
dm-ioctl.c dm ioctl: fix out of bounds array access when no devices 2021-03-30 14:31:56 +02:00
dm-kcopyd.c
dm-linear.c dm: add support for REQ_NOWAIT and enable it for linear target 2020-09-25 08:20:03 -06:00
dm-log-userspace-base.c
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c dm: replace zero-length array with flexible-array 2020-05-20 17:09:44 -04:00
dm-log.c
dm-mpath.c dm: use dm_table_get_device_name() where appropriate in targets 2020-09-29 16:33:08 -04:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h dm mpath: pass IO start time to path selector 2020-05-15 10:29:36 -04:00
dm-queue-length.c dm mpath: pass IO start time to path selector 2020-05-15 10:29:36 -04:00
dm-raid.c dm raid: fix inconclusive reshape layout on fast raid4/5/6 table reload sequences 2021-05-11 14:47:36 +02:00
dm-raid1.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-region-hash.c
dm-round-robin.c
dm-rq.c dm table: make 'struct dm_table' definition accessible to all of DM core 2020-09-29 16:33:07 -04:00
dm-rq.h
dm-service-time.c dm mpath: pass IO start time to path selector 2020-05-15 10:29:36 -04:00
dm-snap-persistent.c dm snap persistent: simplify area_io() 2020-09-29 16:33:12 -04:00
dm-snap-transient.c
dm-snap.c dm snapshot: flush merged data before committing metadata 2021-01-19 18:27:21 +01:00
dm-stats.c dm: replace zero-length array with flexible-array 2020-05-20 17:09:44 -04:00
dm-stats.h
dm-stripe.c dm: replace zero-length array with flexible-array 2020-05-20 17:09:44 -04:00
dm-switch.c dm: replace zero-length array with flexible-array 2020-05-20 17:09:44 -04:00
dm-sysfs.c
dm-table.c dm table: Fix zoned model check and zone sectors check 2021-03-30 14:32:06 +02:00
dm-target.c
dm-thin-metadata.c dm thin metadata: Remove unused local variable when create thin and snap 2020-09-29 16:33:11 -04:00
dm-thin-metadata.h
dm-thin.c writeback: remove bdi->congested_fn 2020-07-08 17:20:46 -06:00
dm-uevent.c
dm-uevent.h
dm-unstripe.c
dm-verity-fec.c dm verity fec: fix misaligned RS roots IO 2021-04-21 13:00:54 +02:00
dm-verity-fec.h dm verity fec: fix misaligned RS roots IO 2021-04-21 13:00:54 +02:00
dm-verity-target.c dm verity: fix DM_VERITY_OPTS_MAX value 2021-03-30 14:31:56 +02:00
dm-verity-verify-sig.c
dm-verity-verify-sig.h dm verity: Fix compilation warning 2020-08-04 15:48:13 -04:00
dm-verity.h dm verity: add "panic_on_corruption" error handling mode 2020-07-13 11:47:33 -04:00
dm-writecache.c dm writecache: fix writing beyond end of underlying device when shrinking 2021-03-04 11:38:45 +01:00
dm-zero.c
dm-zoned-metadata.c dm zoned: Fix zone reclaim trigger 2020-07-08 12:21:53 -04:00
dm-zoned-reclaim.c dm zoned: Fix zone reclaim trigger 2020-07-08 12:21:53 -04:00
dm-zoned-target.c dm table: Fix zoned model check and zone sectors check 2021-03-30 14:32:06 +02:00
dm-zoned.h dm zoned: select reclaim zone based on device index 2020-06-05 14:59:53 -04:00
dm.c dm table: fix DAX iterate_devices based device capability checks 2021-03-04 11:38:44 +01:00
dm.h dm table: fix DAX iterate_devices based device capability checks 2021-03-04 11:38:44 +01:00
Kconfig dm integrity: select CRYPTO_SKCIPHER 2021-01-27 11:54:57 +01:00
Makefile md: move the early init autodetect code to drivers/md/ 2020-07-16 15:34:47 +02:00
md-autodetect.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
md-bitmap.c md/bitmap: fix memory leak of temporary bitmap 2020-10-08 22:37:39 -07:00
md-bitmap.h
md-cluster.c md/cluster: fix deadlock when node is doing resync job 2020-12-30 11:54:25 +01:00
md-cluster.h
md-faulty.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
md-linear.c block: add a new revalidate_disk_size helper 2020-09-02 08:00:07 -06:00
md-linear.h md/raid1: Replace zero-length array with flexible-array 2020-05-13 12:02:23 -07:00
md-multipath.c writeback: remove bdi->congested_fn 2020-07-08 17:20:46 -06:00
md-multipath.h
md.c md: Set prev_flush_start and flush_bio in an atomic way 2021-02-10 09:29:22 +01:00
md.h Revert "md: change mddev 'chunk_sectors' from int to unsigned" 2020-12-14 19:33:01 +01:00
raid1-10.c
raid1.c md/raid1: properly indicate failure when ending a failed write request 2021-05-11 14:47:36 +02:00
raid1.h md/raid1: Replace zero-length array with flexible-array 2020-05-13 12:02:23 -07:00
raid5-cache.c raid5-cache: hold spinlock instead of mutex in r5c_journal_mode_show 2020-08-02 23:03:52 -07:00
raid5-log.h
raid5-ppl.c md/raid456: convert macro STRIPE_* to RAID5_STRIPE_* 2020-07-21 17:18:12 -07:00
raid5.c md/raid5: fix oops during stripe resizing 2020-10-08 22:38:10 -07:00
raid5.h md/raid5: let multiple devices of stripe_head share page 2020-09-24 16:44:44 -07:00
raid10.c md/raid10: initialize r10_bio->read_slot before use. 2021-01-06 14:56:49 +01:00
raid10.h Revert "md/raid10: improve discard request for far layout" 2020-12-09 20:46:00 -08:00
raid0.c Revert "md: add md_submit_discard_bio() for submitting discard bio" 2020-12-09 20:46:01 -08:00
raid0.h