linux/fs
Filipe Manana e2419c5709 btrfs: fix race causing unnecessary inode logging during link and rename
[ Upstream commit de53d892e5 ]

When we are doing a rename or a link operation for an inode that was logged
in the previous transaction and that transaction is still committing, we
have a time window where we incorrectly consider that the inode was logged
previously in the current transaction and therefore decide to log it to
update it in the log. The following steps give an example on how this
happens during a link operation:

1) Inode X is logged in transaction 1000, so its logged_trans field is set
   to 1000;

2) Task A starts to commit transaction 1000;

3) The state of transaction 1000 is changed to TRANS_STATE_UNBLOCKED;

4) Task B starts a link operation for inode X, and as a consequence it
   starts transaction 1001;

5) Task A is still committing transaction 1000, therefore the value stored
   at fs_info->last_trans_committed is still 999;

6) Task B calls btrfs_log_new_name(), it reads a value of 999 from
   fs_info->last_trans_committed and because the logged_trans field of
   inode X has a value of 1000, the function does not return immediately,
   instead it proceeds to logging the inode, which should not happen
   because the inode was logged in the previous transaction (1000) and
   not in the current one (1001).

This is not a functional problem, just wasted time and space logging an
inode that does not need to be logged, contributing to higher latency
for link and rename operations.

So fix this by comparing the inodes' logged_trans field with the
generation of the current transaction instead of comparing with the value
stored in fs_info->last_trans_committed.

This case is often hit when running dbench for a long enough duration, as
it does lots of rename operations.

This patch belongs to a patch set that is comprised of the following
patches:

  btrfs: fix race causing unnecessary inode logging during link and rename
  btrfs: fix race that results in logging old extents during a fast fsync
  btrfs: fix race that causes unnecessary logging of ancestor inodes
  btrfs: fix race that makes inode logging fallback to transaction commit
  btrfs: fix race leading to unnecessary transaction commit when logging inode
  btrfs: do not block inode logging for so long during transaction commit

Performance results are mentioned in the change log of the last patch.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-08 09:05:22 +02:00
..
9p fs: 9p: add generic splice_write file operation 2020-12-01 21:40:47 +01:00
adfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
affs fs/affs: release old buffer head on error path 2021-03-04 11:38:37 +01:00
afs afs: Fix tracepoint string placement with built-in AFS 2021-07-28 14:35:41 +02:00
autofs autofs: harden ioctl table 2020-10-16 11:11:22 -07:00
befs
bfs bfs: don't use WARNING: string when it's just info. 2021-01-06 14:56:52 +01:00
btrfs btrfs: fix race causing unnecessary inode logging during link and rename 2021-08-08 09:05:22 +02:00
cachefiles fs/cachefiles: Remove wait_bit_key layout dependency 2021-03-30 14:32:07 +02:00
ceph ceph: don't WARN if we're still opening a session to an MDS 2021-07-28 14:35:40 +02:00
cifs SMB3: fix readpage for large swap cache 2021-08-04 12:46:45 +02:00
coda
configfs configfs: fix memleak in configfs_release_bin_file 2021-07-14 16:56:48 +02:00
cramfs
crypto fscrypt: fix derivation of SipHash keys on big endian CPUs 2021-07-14 16:56:53 +02:00
debugfs debugfs: Make debugfs_allow RO after init 2021-05-19 10:13:19 +02:00
devpts
dlm fs: dlm: fix memory leak when fenced 2021-07-14 16:55:59 +02:00
ecryptfs Revert "ecryptfs: replace BUG_ON with error handling code" 2021-05-26 12:06:55 +02:00
efivarfs efivarfs: revert "fix memory leak in efivarfs_create()" 2020-11-25 16:55:02 +01:00
efs
erofs erofs: fix error return code in erofs_read_superblock() 2021-07-14 16:56:53 +02:00
exfat exfat: handle wrong stream entry size in exfat_readdir() 2021-07-14 16:56:52 +02:00
exportfs
ext2 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
ext4 ext4: fix memory leak in ext4_fill_super 2021-07-19 09:45:03 +02:00
f2fs f2fs: Show casefolding support only when supported 2021-07-25 14:36:17 +02:00
fat
freevxfs
fscache
fuse fuse: reject internal errno 2021-07-14 16:55:47 +02:00
gfs2 gfs2: Fix error handling in init_statfs 2021-07-14 16:55:38 +02:00
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-31 08:16:12 +02:00
hfsplus hfsplus: prevent corruption in shrinking truncate 2021-05-19 10:13:10 +02:00
hostfs hostfs: fix memory handling in follow_link() 2021-04-14 08:42:06 +02:00
hpfs
hugetlbfs hugetlbfs: fix mount mode command line processing 2021-07-28 14:35:46 +02:00
iomap iomap: remove the length variable in iomap_seek_hole 2021-07-31 08:16:12 +02:00
isofs isofs: release buffer head before return 2021-03-04 11:38:00 +01:00
jbd2 ext4: fix debug format string warning 2021-05-19 10:13:19 +02:00
jffs2 jffs2: check the validity of dstlen in jffs2_zlib_compress() 2021-05-11 14:47:36 +02:00
jfs fs/jfs: Fix missing error code in lmLogInit() 2021-07-20 16:05:40 +02:00
kernfs kernfs: wire up ->splice_read and ->splice_write 2021-01-27 11:55:29 +01:00
lockd lockd: don't use interval-based rebinding over TCP 2020-12-30 11:53:30 +01:00
minix
nfs NFSv4/pNFS: Don't call _nfs4_pnfs_v3_ds_connect multiple times 2021-07-20 16:05:53 +02:00
nfs_common nfs_common: need lock during iterate through the list 2020-12-30 11:53:45 +01:00
nfsd nfsd: Reduce contention for the nfsd_file nf_rwsem 2021-07-20 16:05:53 +02:00
nilfs2 nilfs2: fix memory leak in nilfs_sysfs_delete_device_group 2021-06-30 08:47:24 -04:00
nls
notify fanotify: fix copy_event_to_user() fid error clean up 2021-06-23 14:42:41 +02:00
ntfs ntfs: fix validity check for file name attribute 2021-07-14 16:55:38 +02:00
ocfs2 ocfs2: issue zeroout to EOF blocks 2021-08-04 12:46:40 +02:00
omfs fs: omfs: use kmemdup() rather than kmalloc+memcpy 2020-09-22 23:39:45 -04:00
openpromfs
orangefs orangefs: fix orangefs df output. 2021-07-20 16:05:48 +02:00
overlayfs ovl: invalidate readdir cache on changes to dir with origin 2021-05-14 09:50:35 +02:00
proc proc: Avoid mixing integer types in mem_rw() 2021-07-28 14:35:42 +02:00
pstore mark pstore-blk as broken 2021-07-14 16:56:12 +02:00
qnx4
qnx6
quota quota: Fix memory leak when handling corrupted quota file 2021-03-04 11:37:53 +01:00
ramfs ramfs: fix nommu mmap with gaps in the page cache 2020-10-16 11:11:22 -07:00
reiserfs reiserfs: add check for invalid 1st journal block 2021-07-19 09:44:39 +02:00
romfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
squashfs squashfs: fix divide error in calculate_skip() 2021-05-19 10:13:10 +02:00
sysfs sysfs: Add sysfs_emit and sysfs_emit_at to format sysfs output 2020-10-02 12:02:30 +02:00
sysv
tracefs
ubifs ubifs: Set/Clear I_LINKABLE under i_lock for whiteout inode 2021-07-20 16:05:51 +02:00
udf udf: Fix NULL pointer dereference in udf_symlink function 2021-07-19 09:44:40 +02:00
ufs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-10-24 12:26:05 -07:00
unicode
vboxsf Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2020-10-15 15:11:56 -07:00
verity
xfs xfs: fix return of uninitialized value in variable error 2021-05-14 09:50:34 +02:00
zonefs zonefs: fix to update .i_wr_refcnt correctly in zonefs_open_zone() 2021-03-25 09:04:05 +01:00
aio.c vfs: separate __sb_start_write into blocking and non-blocking helpers 2020-11-10 16:53:07 -08:00
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot 2020-10-16 11:11:21 -07:00
binfmt_elf.c fs: Replace zero-length array with flexible-array member 2020-10-29 17:22:59 -05:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-17 17:06:35 +01:00
binfmt_script.c
block_dev.c block: fix a race between del_gendisk and BLKRRPART 2021-06-03 09:00:45 +02:00
buffer.c mm, memcg: rework remote charging API to support nesting 2020-10-18 09:27:09 -07:00
char_dev.c
compat_binfmt_elf.c
coredump.c coredump: fix core_pattern parse error 2020-12-06 10:19:07 -08:00
d_path.c fs: fix NULL dereference due to data race in prepend_path() 2020-10-14 14:54:45 -07:00
dax.c dax: fix ENOMEM handling in grab_mapping_entry() 2021-07-14 16:56:13 +02:00
dcache.c
dcookies.c
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-14 08:41:58 +02:00
drop_caches.c
eventfd.c
eventpoll.c fs/epoll: restore waking from ep_done_scan() 2021-05-11 14:47:12 +02:00
exec.c Add a reference to ucounts for each cred 2021-07-14 16:55:48 +02:00
fcntl.c fcntl: Fix potential deadlock in send_sig{io, urg}() 2021-01-06 14:56:53 +01:00
fhandle.c
file_table.c task_work: cleanup notification modes 2020-10-17 15:05:30 -06:00
file.c kernel/io_uring: cancel io_uring before task works 2021-01-30 13:55:18 +01:00
filesystems.c
fs_context.c
fs_parser.c fs_parse: mark fs_param_bad_value() as static 2020-10-13 18:38:27 -07:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c writeback: fix obtain a reference to a freeing memcg css 2021-07-14 16:56:31 +02:00
fsopen.c
init.c
inode.c fs: Handle I_DONTCACHE in iput_final() instead of generic_drop_inode() 2020-12-30 11:53:49 +01:00
internal.h cgroup1: fix leaked context root causing sporadic NULL deref in LTP 2021-07-31 08:16:11 +02:00
io_uring.c io_uring: fix null-ptr-deref in io_sq_offload_start() 2021-08-04 12:46:39 +02:00
io-wq.c io_uring: fix false WARN_ONCE 2021-07-19 09:44:51 +02:00
io-wq.h io_uring: always batch cancel in *cancel_files() 2021-02-13 13:54:56 +01:00
ioctl.c
Kconfig tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha 2021-02-17 11:02:21 +01:00
Kconfig.binfmt
kernel_read_file.c fs/kernel_file_read: Add "offset" arg for partial reads 2020-10-05 13:37:04 +02:00
libfs.c libfs: fix error cast of negative value in simple_attr_write() 2020-11-22 10:48:22 -08:00
locks.c Revert "nfsd4: a client's own opens needn't prevent delegations" 2021-03-20 10:43:44 +01:00
Makefile Refactored code for 5.10: 2020-10-23 11:33:41 -07:00
mbcache.c
mount.h
mpage.c
namei.c LOOKUP_MOUNTPOINT: we are cleaning "jumped" flag too late 2021-04-14 08:41:58 +02:00
namespace.c umount(2): move the flag validity checks first 2021-01-19 18:27:32 +01:00
no-block.c
nsfs.c
open.c open: don't silently ignore unknown O-flags in openat2() 2021-07-14 16:55:59 +02:00
pipe.c pipe: make pipe writes always wake up readers 2021-08-04 12:46:39 +02:00
pnode.c
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-17 17:06:13 +01:00
posix_acl.c
proc_namespace.c proc mountinfo: make splice available again 2020-12-30 11:54:02 +01:00
read_write.c Refactored code for 5.10: 2020-10-23 11:33:41 -07:00
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-21 13:00:54 +02:00
remap_range.c vfs: move the remap range helpers to remap_range.c 2020-10-15 09:48:49 -07:00
select.c kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() 2021-03-25 09:04:16 +01:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-20 16:05:59 +02:00
signalfd.c
splice.c io_uring-5.10-2020-10-24 2020-10-24 12:40:18 -07:00
stack.c
stat.c fs: fix reporting supported extra file attributes for statx() 2021-05-11 14:47:33 +02:00
statfs.c
super.c vfs: move __sb_{start,end}_write* to fs.h 2020-11-10 16:53:11 -08:00
sync.c
timerfd.c
userfaultfd.c userfaultfd: do not untag user pointers 2021-07-28 14:35:46 +02:00
utimes.c
xattr.c fs/xattr.c: fix kernel-doc warnings for setxattr & removexattr 2020-10-13 18:38:27 -07:00