mirror of
https://github.com/torvalds/linux.git
synced 2026-05-30 18:13:41 +02:00
btrfs: fix the incorrect max_bytes value for find_lock_delalloc_range()
[BUG]
With my local branch to enable bs > ps support for btrfs, sometimes I
hit the following ASSERT() inside submit_one_sector():
ASSERT(block_start != EXTENT_MAP_HOLE);
Please note that it's not yet possible to hit this ASSERT() in the wild
yet, as it requires btrfs bs > ps support, which is not even in the
development branch.
But on the other hand, there is also a very low chance to hit above
ASSERT() with bs < ps cases, so this is an existing bug affect not only
the incoming bs > ps support but also the existing bs < ps support.
[CAUSE]
Firstly that ASSERT() means we're trying to submit a dirty block but
without a real extent map nor ordered extent map backing it.
Furthermore with extra debugging, the folio triggering such ASSERT() is
always larger than the fs block size in my bs > ps case.
(8K block size, 4K page size)
After some more debugging, the ASSERT() is trigger by the following
sequence:
extent_writepage()
| We got a 32K folio (4 fs blocks) at file offset 0, and the fs block
| size is 8K, page size is 4K.
| And there is another 8K folio at file offset 32K, which is also
| dirty.
| So the filemap layout looks like the following:
|
| "||" is the filio boundary in the filemap.
| "//| is the dirty range.
|
| 0 8K 16K 24K 32K 40K
| |////////| |//////////////////////||////////|
|
|- writepage_delalloc()
| |- find_lock_delalloc_range() for [0, 8K)
| | Now range [0, 8K) is properly locked.
| |
| |- find_lock_delalloc_range() for [16K, 40K)
| | |- btrfs_find_delalloc_range() returned range [16K, 40K)
| | |- lock_delalloc_folios() locked folio 0 successfully
| | |
| | | The filemap range [32K, 40K) got dropped from filemap.
| | |
| | |- lock_delalloc_folios() failed with -EAGAIN on folio 32K
| | | As the folio at 32K is dropped.
| | |
| | |- loops = 1;
| | |- max_bytes = PAGE_SIZE;
| | |- goto again;
| | | This will re-do the lookup for dirty delalloc ranges.
| | |
| | |- btrfs_find_delalloc_range() called with @max_bytes == 4K
| | | This is smaller than block size, so
| | | btrfs_find_delalloc_range() is unable to return any range.
| | \- return false;
| |
| \- Now only range [0, 8K) has an OE for it, but for dirty range
| [16K, 32K) it's dirty without an OE.
| This breaks the assumption that writepage_delalloc() will find
| and lock all dirty ranges inside the folio.
|
|- extent_writepage_io()
|- submit_one_sector() for [0, 8K)
| Succeeded
|
|- submit_one_sector() for [16K, 24K)
Triggering the ASSERT(), as there is no OE, and the original
extent map is a hole.
Please note that, this also exposed the same problem for bs < ps
support. E.g. with 64K page size and 4K block size.
If we failed to lock a folio, and falls back into the "loops = 1;"
branch, we will re-do the search using 64K as max_bytes.
Which may fail again to lock the next folio, and exit early without
handling all dirty blocks inside the folio.
[FIX]
Instead of using the fixed size PAGE_SIZE as @max_bytes, use
@sectorsize, so that we are ensured to find and lock any remaining
blocks inside the folio.
And since we're here, add an extra ASSERT() to
before calling btrfs_find_delalloc_range() to make sure the @max_bytes is
at least no smaller than a block to avoid false negative.
Cc: stable@vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This commit is contained in:
parent
62701f4190
commit
7b26da4074
|
|
@ -393,6 +393,13 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
|
|||
/* step one, find a bunch of delalloc bytes starting at start */
|
||||
delalloc_start = *start;
|
||||
delalloc_end = 0;
|
||||
|
||||
/*
|
||||
* If @max_bytes is smaller than a block, btrfs_find_delalloc_range() can
|
||||
* return early without handling any dirty ranges.
|
||||
*/
|
||||
ASSERT(max_bytes >= fs_info->sectorsize);
|
||||
|
||||
found = btrfs_find_delalloc_range(tree, &delalloc_start, &delalloc_end,
|
||||
max_bytes, &cached_state);
|
||||
if (!found || delalloc_end <= *start || delalloc_start > orig_end) {
|
||||
|
|
@ -423,13 +430,14 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
|
|||
delalloc_end);
|
||||
ASSERT(!ret || ret == -EAGAIN);
|
||||
if (ret == -EAGAIN) {
|
||||
/* some of the folios are gone, lets avoid looping by
|
||||
* shortening the size of the delalloc range we're searching
|
||||
/*
|
||||
* Some of the folios are gone, lets avoid looping by
|
||||
* shortening the size of the delalloc range we're searching.
|
||||
*/
|
||||
btrfs_free_extent_state(cached_state);
|
||||
cached_state = NULL;
|
||||
if (!loops) {
|
||||
max_bytes = PAGE_SIZE;
|
||||
max_bytes = fs_info->sectorsize;
|
||||
loops = 1;
|
||||
goto again;
|
||||
} else {
|
||||
|
|
|
|||
Loading…
Reference in New Issue
Block a user