md/raid5: fix bug that could result in reads from a failed device.

commit 355840e7a7 upstream.

commit a847627709 in linux-3.0.9
attempted to backport this to 3.0 but only made one change were two
were necessary.  This add the second change.

This bug was introduced in 415e72d034
which was in 2.6.36.

There is a small window of time between when a device fails and when
it is removed from the array.  During this time we might still read
from it, but we won't write to it - so it is possible that we could
read stale data.

We didn't need the test of 'Faulty' before because the test on
In_sync is sufficient.  Since we started allowing reads from the early
part of non-In_sync devices we need a test on Faulty too.

This is suitable for any kernel from 2.6.36 onwards, though the patch
might need a bit of tweaking in 3.0 and earlier.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This commit is contained in:
NeilBrown 2011-12-15 10:54:39 +11:00 committed by Greg Kroah-Hartman
parent bf3673c5e3
commit d8091fda78

View File

@ -3078,7 +3078,7 @@ static void handle_stripe5(struct stripe_head *sh)
/* Not in-sync */;
else if (test_bit(In_sync, &rdev->flags))
set_bit(R5_Insync, &dev->flags);
else {
else if (!test_bit(Faulty, &rdev->flags)) {
/* could be in-sync depending on recovery/reshape status */
if (sh->sector + STRIPE_SECTORS <= rdev->recovery_offset)
set_bit(R5_Insync, &dev->flags);