linux/fs
Filipe Manana 688a45aff2 btrfs: send: avoid unaligned encoded writes when attempting to clone range
[ Upstream commit a11452a370 ]

When trying to see if we can clone a file range, there are cases where we
end up sending two write operations in case the inode from the source root
has an i_size that is not sector size aligned and the length from the
current offset to its i_size is less than the remaining length we are
trying to clone.

Issuing two write operations when we could instead issue a single write
operation is not incorrect. However it is not optimal, specially if the
extents are compressed and the flag BTRFS_SEND_FLAG_COMPRESSED was passed
to the send ioctl. In that case we can end up sending an encoded write
with an offset that is not sector size aligned, which makes the receiver
fallback to decompressing the data and writing it using regular buffered
IO (so re-compressing the data in case the fs is mounted with compression
enabled), because encoded writes fail with -EINVAL when an offset is not
sector size aligned.

The following example, which triggered a bug in the receiver code for the
fallback logic of decompressing + regular buffer IO and is fixed by the
patchset referred in a Link at the bottom of this changelog, is an example
where we have the non-optimal behaviour due to an unaligned encoded write:

   $ cat test.sh
   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   mkfs.btrfs -f $DEV > /dev/null
   mount -o compress $DEV $MNT

   # File foo has a size of 33K, not aligned to the sector size.
   xfs_io -f -c "pwrite -S 0xab 0 33K" $MNT/foo

   xfs_io -f -c "pwrite -S 0xcd 0 64K" $MNT/bar

   # Now clone the first 32K of file bar into foo at offset 0.
   xfs_io -c "reflink $MNT/bar 0 0 32K" $MNT/foo

   # Snapshot the default subvolume and create a full send stream (v2).
   btrfs subvolume snapshot -r $MNT $MNT/snap

   btrfs send --compressed-data -f /tmp/test.send $MNT/snap

   echo -e "\nFile bar in the original filesystem:"
   od -A d -t x1 $MNT/snap/bar

   umount $MNT
   mkfs.btrfs -f $DEV > /dev/null
   mount $DEV $MNT

   echo -e "\nReceiving stream in a new filesystem..."
   btrfs receive -f /tmp/test.send $MNT

   echo -e "\nFile bar in the new filesystem:"
   od -A d -t x1 $MNT/snap/bar

   umount $MNT

Before this patch, the send stream included one regular write and one
encoded write for file 'bar', with the later being not sector size aligned
and causing the receiver to fallback to decompression + buffered writes.
The output of the btrfs receive command in verbose mode (-vvv):

   (...)
   mkfile o258-7-0
   rename o258-7-0 -> bar
   utimes
   clone bar - source=foo source offset=0 offset=0 length=32768
   write bar - offset=32768 length=1024
   encoded_write bar - offset=33792, len=4096, unencoded_offset=33792, unencoded_file_len=31744, unencoded_len=65536, compression=1, encryption=0
   encoded_write bar - falling back to decompress and write due to errno 22 ("Invalid argument")
   (...)

This patch avoids the regular write followed by an unaligned encoded write
so that we end up sending a single encoded write that is aligned. So after
this patch the stream content is (output of btrfs receive -vvv):

   (...)
   mkfile o258-7-0
   rename o258-7-0 -> bar
   utimes
   clone bar - source=foo source offset=0 offset=0 length=32768
   encoded_write bar - offset=32768, len=4096, unencoded_offset=32768, unencoded_file_len=32768, unencoded_len=65536, compression=1, encryption=0
   (...)

So we get more optimal behaviour and avoid the silent data loss bug in
versions of btrfs-progs affected by the bug referred by the Link tag
below (btrfs-progs v5.19, v5.19.1, v6.0 and v6.0.1).

Link: https://lore.kernel.org/linux-btrfs/cover.1668529099.git.fdmanana@suse.com/
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-14 11:31:53 +01:00
..
9p 9p: missing chunk of "fs/9p: Don't update file type when updating file attributes" 2022-06-22 14:13:12 +02:00
adfs
affs fs/affs: release old buffer head on error path 2021-03-04 11:38:37 +01:00
afs afs: Fix fileserver probe RTT handling 2022-12-08 11:23:56 +01:00
autofs
befs
bfs bfs: don't use WARNING: string when it's just info. 2021-01-06 14:56:52 +01:00
btrfs btrfs: send: avoid unaligned encoded writes when attempting to clone range 2022-12-14 11:31:53 +01:00
cachefiles fs/cachefiles: Remove wait_bit_key layout dependency 2021-03-30 14:32:07 +02:00
ceph ceph: fix NULL pointer dereference for req->r_session 2022-12-02 17:40:03 +01:00
cifs cifs: add check for returning value of SMB2_set_info_init 2022-11-25 17:45:48 +01:00
coda
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-03-02 11:42:52 +01:00
cramfs
crypto fscrypt: fix keyring memory leak on mount failure 2022-11-10 18:14:25 +01:00
debugfs debugfs: add debugfs_lookup_and_remove() 2022-09-15 11:32:03 +02:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-02-01 17:25:39 +01:00
dlm fs: dlm: handle -EBUSY first in lock arg validation 2022-10-26 13:25:08 +02:00
ecryptfs Revert "ecryptfs: replace BUG_ON with error handling code" 2021-05-26 12:06:55 +02:00
efivarfs
efs
erofs erofs: avoid consecutive detection for Highmem memory 2022-08-21 15:15:35 +02:00
exfat exfat: check if cluster num is valid 2022-06-06 08:42:42 +02:00
exportfs
ext2 ext2: Add more validity checks for inode counts 2022-08-21 15:15:28 +02:00
ext4 ext4: fix use-after-free in ext4_ext_shift_extents 2022-12-02 17:40:02 +01:00
f2fs ext4,f2fs: fix readahead of verity data 2022-11-10 18:14:29 +01:00
fat fat: add ratelimit to fat*_ent_bread() 2022-06-09 10:20:58 +02:00
freevxfs
fscache fscache: Fix cookie key hashing 2021-09-18 13:40:15 +02:00
fuse fuse: lock inode unconditionally in fuse_fallocate() 2022-12-02 17:40:07 +01:00
gfs2 gfs2: Switch from strlcpy to strscpy 2022-11-25 17:45:56 +01:00
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-31 08:16:12 +02:00
hfsplus hfsplus: prevent corruption in shrinking truncate 2021-05-19 10:13:10 +02:00
hostfs hostfs: fix memory handling in follow_link() 2021-04-14 08:42:06 +02:00
hpfs
hugetlbfs mm, hugetlb: allow for "high" userspace addresses 2022-04-27 13:53:54 +02:00
iomap xfs: use current->journal_info for detecting transaction recursion 2022-07-07 17:52:19 +02:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-11-12 14:58:33 +01:00
jbd2 jbd2: add miss release buffer head in fc_do_one_pass() 2022-10-26 13:25:13 +02:00
jffs2 jffs2: fix memory leak in jffs2_do_fill_super 2022-06-14 18:32:35 +02:00
jfs fs: jfs: fix possible NULL pointer dereference in dbFree() 2022-06-09 10:20:57 +02:00
kernfs kernfs: fix use-after-free in __kernfs_remove 2022-11-03 23:57:50 +09:00
lockd lockd: lockd server-side shouldn't set fl_ops 2021-09-18 13:40:30 +02:00
minix minix: fix bug when opening a file with O_DIRECT 2022-04-13 21:01:01 +02:00
nfs NFSv4: Retry LOCK on OLD_STATEID during delegation return 2022-11-25 17:45:40 +01:00
nfs_common
nfsd NFSD: fix use-after-free on source server when doing inter-server copy 2022-10-26 13:25:45 +02:00
nilfs2 nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry() 2022-12-08 11:23:57 +01:00
nls
notify fsnotify: fix wrong lockdep annotations 2022-06-09 10:21:03 +02:00
ntfs ntfs: check overflow when iterating ATTR_RECORDs 2022-11-25 17:45:57 +01:00
ocfs2 ocfs2: fix BUG when iput after ocfs2_mknod fails 2022-10-30 09:41:15 +01:00
omfs
openpromfs
orangefs orangefs: Fix the size of a memory allocation in orangefs_bufmap_alloc() 2022-01-20 09:17:50 +01:00
overlayfs ovl: drop WARN_ON() dentry is NULL in ovl_encode_fh() 2022-08-21 15:15:23 +02:00
proc mm: /proc/pid/smaps_rollup: fix no vma's null-deref 2022-10-30 09:41:19 +01:00
pstore pstore: Don't use semaphores in always-atomic-context code 2022-04-08 14:39:56 +02:00
qnx4 qnx4: work around gcc false positive warning bug 2021-09-30 10:11:08 +02:00
qnx6
quota quota: Check next/prev free block number after reading from quota file 2022-10-26 13:25:09 +02:00
ramfs
reiserfs reiserfs: check directory items on read from disk 2021-08-12 13:22:19 +02:00
romfs
squashfs squashfs: fix divide error in calculate_skip() 2021-05-19 10:13:10 +02:00
sysfs
sysv
tracefs tracefs: Only clobber mode/uid/gid on remount if asked 2022-09-20 12:38:31 +02:00
ubifs ubifs: Rectify space amount budget for mkdir/tmpfile operations 2022-04-13 21:00:53 +02:00
udf udf: Fix a slab-out-of-bounds write bug in udf_find_entry() 2022-11-16 09:57:17 +01:00
ufs
unicode
vboxsf vboxfs: fix broken legacy mount signature checking 2021-10-17 10:43:33 +02:00
verity fs-verity: fix signed integer overflow with i_size near S64_MAX 2021-10-06 15:55:46 +02:00
xfs xfs: validate inode fork size against fork format 2022-09-28 11:10:29 +02:00
zonefs zonefs: fix zone report size in __zonefs_io_error() 2022-12-02 17:40:05 +01:00
aio.c aio: fix use-after-free due to missing POLLFREE handling 2021-12-14 11:32:40 +01:00
anon_inodes.c
attr.c vfs: Check the truncate maximum size in inode_newsize_ok() 2022-08-21 15:15:22 +02:00
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c coredump: Snapshot the vmas in do_coredump 2022-04-08 14:40:44 +02:00
binfmt_elf.c fs/binfmt_elf: Fix memory leak in load_elf_binary() 2022-11-03 23:57:49 +09:00
binfmt_em86.c
binfmt_flat.c binfmt_flat: do not stop relocating GOT entries prematurely on riscv 2022-06-09 10:20:47 +02:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-17 17:06:35 +01:00
binfmt_script.c
block_dev.c block: fix a race between del_gendisk and BLKRRPART 2021-06-03 09:00:45 +02:00
buffer.c mm: fs: initialize fsdata passed to write_begin/write_end interface 2022-11-25 17:45:56 +01:00
char_dev.c
compat_binfmt_elf.c
coredump.c coredump: Use the vma snapshot in fill_files_note 2022-04-08 14:40:45 +02:00
d_path.c
dax.c dax: fix cache flush on PMD-mapped pages 2022-06-09 10:21:16 +02:00
dcache.c
dcookies.c
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-14 08:41:58 +02:00
drop_caches.c
eventfd.c
eventpoll.c epoll: autoremove wakers even more aggressively 2022-08-21 15:15:28 +02:00
exec.c exec: Copy oldsighand->action under spin-lock 2022-11-03 23:57:49 +09:00
fcntl.c fcntl: fix potential deadlocks for &fown_struct.lock 2022-10-30 09:41:18 +01:00
fhandle.c
file_table.c SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() 2022-05-18 10:23:48 +02:00
file.c fs: fix fd table size alignment properly 2022-04-08 14:40:30 +02:00
filesystems.c
fs_context.c memcg: charge fs_context and legacy_fs_context 2022-02-08 18:30:36 +01:00
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages 2022-06-09 10:21:22 +02:00
fsopen.c
init.c
inode.c fs: fix UAF/GPF bug in nilfs_mdt_destroy 2022-10-15 07:55:51 +02:00
internal.h cgroup1: fix leaked context root causing sporadic NULL deref in LTP 2021-07-31 08:16:11 +02:00
io_uring.c io_uring: don't hold uring_lock when calling io_run_task_work* 2022-12-08 11:23:58 +01:00
io-wq.c io-wq: fix wakeup race when adding new work 2021-09-18 13:40:06 +02:00
io-wq.h io_uring: always batch cancel in *cancel_files() 2021-02-13 13:54:56 +01:00
ioctl.c fs: fix an infinite loop in iomap_fiemap 2022-05-25 09:17:54 +02:00
Kconfig tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha 2021-02-17 11:02:21 +01:00
Kconfig.binfmt
kernel_read_file.c vfs: check fd has read access in kernel_read_file_from_fd() 2021-10-27 09:56:51 +02:00
libfs.c
locks.c Revert "nfsd4: a client's own opens needn't prevent delegations" 2021-03-20 10:43:44 +01:00
Makefile
mbcache.c
mount.h
mpage.c
namei.c mm: fs: initialize fsdata passed to write_begin/write_end interface 2022-11-25 17:45:56 +01:00
namespace.c fs: warn about impending deprecation of mandatory locks 2021-08-26 08:35:57 -04:00
no-block.c
nsfs.c
open.c open: don't silently ignore unknown O-flags in openat2() 2021-07-14 16:55:59 +02:00
pipe.c pipe: Fix missing lock in pipe_resize_ring() 2022-06-06 08:42:41 +02:00
pnode.c
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-17 17:06:13 +01:00
posix_acl.c
proc_namespace.c
read_write.c
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-21 13:00:54 +02:00
remap_range.c fs/remap: constrain dedupe of EOF blocks 2022-07-21 21:20:01 +02:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-29 10:26:11 +01:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-20 16:05:59 +02:00
signalfd.c io_uring: disable polling pollfree files 2022-09-05 10:28:58 +02:00
splice.c Revert "fs: check FMODE_LSEEK to control internal pipe splicing" 2022-10-17 17:26:07 +02:00
stack.c
stat.c stat: fix inconsistency between struct stat and struct compat_stat 2022-04-27 13:53:54 +02:00
statfs.c
super.c fscrypt: fix keyring memory leak on mount failure 2022-11-10 18:14:25 +01:00
sync.c vfs: make sync_filesystem return errors from ->sync_fs 2022-08-31 17:15:14 +02:00
timerfd.c
userfaultfd.c userfaultfd: open userfaultfds with O_RDONLY 2022-10-26 13:25:17 +02:00
utimes.c
xattr.c