linux/fs
Chao Yu 1fd916e879 f2fs: fix to avoid deadlock of atomic file operations
commit 48432984d7 upstream.

Thread A				Thread B
- __fput
 - f2fs_release_file
  - drop_inmem_pages
   - mutex_lock(&fi->inmem_lock)
   - __revoke_inmem_pages
    - lock_page(page)
					- open
					- f2fs_setattr
					- truncate_setsize
					 - truncate_inode_pages_range
					  - lock_page(page)
					  - truncate_cleanup_page
					   - f2fs_invalidate_page
					    - drop_inmem_page
					    - mutex_lock(&fi->inmem_lock);

We may encounter above ABBA deadlock as reported by Kyungtae Kim:

I'm reporting a bug in linux-4.17.19: "INFO: task hung in
drop_inmem_page" (no reproducer)

I think this might be somehow related to the following:
https://groups.google.com/forum/#!searchin/syzkaller-bugs/INFO$3A$20task$20hung$20in$20%7Csort:date/syzkaller-bugs/c6soBTrdaIo/AjAzPeIzCgAJ

=========================================
INFO: task syz-executor7:10822 blocked for more than 120 seconds.
      Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7   D27024 10822   6346 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 schedule_preempt_disabled+0x18/0x30 kernel/sched/core.c:3617
 __mutex_lock_common kernel/locking/mutex.c:833 [inline]
 __mutex_lock+0x5bd/0x1410 kernel/locking/mutex.c:893
 mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:908
 drop_inmem_page+0xcb/0x810 fs/f2fs/segment.c:327
 f2fs_invalidate_page+0x337/0x5e0 fs/f2fs/data.c:2401
 do_invalidatepage mm/truncate.c:165 [inline]
 truncate_cleanup_page+0x261/0x330 mm/truncate.c:187
 truncate_inode_pages_range+0x552/0x1610 mm/truncate.c:367
 truncate_inode_pages mm/truncate.c:478 [inline]
 truncate_pagecache+0x6d/0x90 mm/truncate.c:801
 truncate_setsize+0x81/0xa0 mm/truncate.c:826
 f2fs_setattr+0x44f/0x1270 fs/f2fs/file.c:781
 notify_change+0xa62/0xe80 fs/attr.c:313
 do_truncate+0x12e/0x1e0 fs/open.c:63
 do_last fs/namei.c:2955 [inline]
 path_openat+0x2042/0x29f0 fs/namei.c:3505
 do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
 do_sys_open+0x35e/0x4e0 fs/open.c:1101
 __do_sys_open fs/open.c:1119 [inline]
 __se_sys_open fs/open.c:1114 [inline]
 __x64_sys_open+0x89/0xc0 fs/open.c:1114
 do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e459c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e45a6cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e45a700
INFO: task syz-executor7:10858 blocked for more than 120 seconds.
      Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7   D28880 10858   6346 0x00000004
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 __rwsem_down_write_failed_common kernel/locking/rwsem-xadd.c:565 [inline]
 rwsem_down_write_failed+0x5e6/0xc90 kernel/locking/rwsem-xadd.c:594
 call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
 __down_write arch/x86/include/asm/rwsem.h:142 [inline]
 down_write+0x58/0xa0 kernel/locking/rwsem.c:72
 inode_lock include/linux/fs.h:713 [inline]
 do_truncate+0x120/0x1e0 fs/open.c:61
 do_last fs/namei.c:2955 [inline]
 path_openat+0x2042/0x29f0 fs/namei.c:3505
 do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
 do_sys_open+0x35e/0x4e0 fs/open.c:1101
 __do_sys_open fs/open.c:1119 [inline]
 __se_sys_open fs/open.c:1114 [inline]
 __x64_sys_open+0x89/0xc0 fs/open.c:1114
 do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f734e3b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f734e3b56cc RCX: 00000000004497b9
RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
RBP: 000000000071c238 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e3b5700
INFO: task syz-executor5:10829 blocked for more than 120 seconds.
      Not tainted 4.17.19 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor5   D28760 10829   6308 0x80000002
Call Trace:
 context_switch kernel/sched/core.c:2867 [inline]
 __schedule+0x721/0x1e60 kernel/sched/core.c:3515
 schedule+0x88/0x1c0 kernel/sched/core.c:3559
 io_schedule+0x21/0x80 kernel/sched/core.c:5179
 wait_on_page_bit_common mm/filemap.c:1100 [inline]
 __lock_page+0x2b5/0x390 mm/filemap.c:1273
 lock_page include/linux/pagemap.h:483 [inline]
 __revoke_inmem_pages+0xb35/0x11c0 fs/f2fs/segment.c:231
 drop_inmem_pages+0xa3/0x3e0 fs/f2fs/segment.c:306
 f2fs_release_file+0x2c7/0x330 fs/f2fs/file.c:1556
 __fput+0x2c7/0x780 fs/file_table.c:209
 ____fput+0x1a/0x20 fs/file_table.c:243
 task_work_run+0x151/0x1d0 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x8ba/0x30a0 kernel/exit.c:865
 do_group_exit+0x13b/0x3a0 kernel/exit.c:968
 get_signal+0x6bb/0x1650 kernel/signal.c:2482
 do_signal+0x84/0x1b70 arch/x86/kernel/signal.c:810
 exit_to_usermode_loop+0x155/0x190 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
 do_syscall_64+0x445/0x4e0 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4497b9
RSP: 002b:00007f1c68e74ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 000000000071bf80 RCX: 00000000004497b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000071bf80
RBP: 000000000071bf80 R08: 0000000000000000 R09: 000000000071bf58
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f1c68e759c0 R15: 00007f1c68e75700

This patch tries to use trylock_page to mitigate such deadlock condition
for fix.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-27 14:14:42 +09:00
..
9p 9p: use inode->i_lock to protect i_size_write() under 32-bit 2019-03-23 20:09:38 +01:00
adfs
affs
afs afs: Fix key refcounting in file locking code 2019-02-27 10:08:56 +01:00
autofs autofs: fix error return in autofs_fill_super() 2019-03-13 14:02:32 -07:00
befs
bfs
btrfs Btrfs: fix corruption reading shared and compressed extents after hole punching 2019-03-23 20:10:00 +01:00
cachefiles
ceph ceph: avoid repeatedly adding inode to mdsc->snap_flush_list 2019-02-27 10:08:50 +01:00
cifs SMB3: Fix SMB3.1.1 guest mounts to Samba 2019-03-27 14:14:41 +09:00
coda
configfs
cramfs
crypto
debugfs debugfs: fix debugfs_rename parameter checking 2019-02-15 08:10:11 +01:00
devpts fs/devpts: always delete dcache dentry-s in dput() 2019-03-23 20:09:59 +01:00
dlm dlm: Don't swamp the CPU with callbacks queued during recovery 2019-02-12 19:46:58 +01:00
ecryptfs
efivarfs
efs
exofs
exportfs
ext2 ext2: Fix underflow in ext2_max_size() 2019-03-23 20:10:03 +01:00
ext4 ext4: brelse all indirect buffer in ext4_ind_remove_space() 2019-03-27 14:14:41 +09:00
f2fs f2fs: fix to avoid deadlock of atomic file operations 2019-03-27 14:14:42 +09:00
fat
freevxfs
fscache
fuse fuse: handle zero sized retrieve correctly 2019-02-12 19:47:24 +01:00
gfs2 gfs2: Fix missed wakeups in find_insert_glock 2019-03-13 14:02:40 -07:00
hfs
hfsplus
hostfs
hpfs
hugetlbfs hugetlbfs: fix races and page leaks during migration 2019-03-05 17:58:53 +01:00
isofs
jbd2 jbd2: fix compile warning when using JBUFFER_TRACE 2019-03-23 20:10:06 +01:00
jffs2 jffs2: Fix use of uninitialized delayed_work, lockdep breakage 2019-01-26 09:32:37 +01:00
jfs
kernfs fix cgroup_do_mount() handling of failure exits 2019-03-23 20:09:53 +01:00
lockd lockd: Show pid of lockd for remote locks 2019-01-13 09:51:08 +01:00
minix
nfs NFSv4.1: Reinitialise sequence results before retransmitting a request 2019-03-23 20:10:10 +01:00
nfs_common
nfsd nfsd: fix wrong check in write_v4_end_grace() 2019-03-23 20:10:09 +01:00
nilfs2
nls
notify inotify: Fix fd refcount leak in inotify_add_watch(). 2019-01-31 08:14:34 +01:00
ntfs
ocfs2 ocfs2: improve ocfs2 Makefile 2019-02-12 19:47:18 +01:00
omfs
openpromfs
orangefs
overlayfs ovl: Do not lose security.capability xattr over metadata file copy-up 2019-03-23 20:09:59 +01:00
proc proc: fix /proc/net/* after setns(2) 2019-03-13 14:02:32 -07:00
pstore pstore/ram: Do not treat empty buffers as valid 2019-01-26 09:32:37 +01:00
qnx4
qnx6
quota quota: Lock s_umount in exclusive mode for Q_XQUOTA{ON,OFF} quotactls. 2019-01-26 09:32:42 +01:00
ramfs
reiserfs
romfs
squashfs
sysfs
sysv
tracefs
ubifs
udf udf: Fix crash on IO error during truncate 2019-03-27 14:14:39 +09:00
ufs
xfs xfs: eof trim writeback mapping as soon as it is cached 2019-02-12 19:47:23 +01:00
aio.c aio: Fix locking in aio_poll() 2019-03-10 07:17:21 +01:00
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c Revert "exec: load_script: don't blindly truncate shebang string" 2019-02-15 09:09:54 +01:00
block_dev.c blockdev: Fix livelocks on loop device 2019-01-22 21:40:36 +01:00
buffer.c fs: ratelimit __find_get_block_slow() failure message. 2019-03-13 14:02:38 -07:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c
compat.c
coredump.c
d_path.c
dax.c
dcache.c fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb() 2019-02-06 17:30:11 +01:00
dcookies.c
direct-io.c direct-io: allow direct writes to empty inodes 2019-03-05 17:58:50 +01:00
drop_caches.c fs/drop_caches.c: avoid softlockups in drop_pagecache_sb() 2019-03-13 14:02:32 -07:00
eventfd.c
eventpoll.c fs/epoll: drop ovflist branch prediction 2019-02-12 19:47:19 +01:00
exec.c exec: Fix mem leak in kernel_read_file 2019-03-10 07:17:21 +01:00
fcntl.c
fhandle.c
file_table.c
file.c
filesystems.c
fs_pin.c
fs_struct.c
fs-writeback.c writeback: synchronize sync(2) against cgroup writeback membership switches 2019-03-05 17:58:50 +01:00
inode.c Revert "mm: don't reclaim inodes with many attached pages" 2019-02-20 10:25:47 +01:00
internal.h
ioctl.c
iomap.c iomap: fix a use after free in iomap_dio_rw 2019-03-13 14:02:29 -07:00
Kconfig
Kconfig.binfmt
libfs.c
locks.c
Makefile
mbcache.c
mount.h
mpage.c
namei.c
namespace.c
no-block.c
nsfs.c
open.c
pipe.c splice: don't merge into linked buffers 2019-03-23 20:09:59 +01:00
pnode.c
pnode.h
posix_acl.c
proc_namespace.c
read_write.c
readdir.c
select.c
seq_file.c
signalfd.c
splice.c splice: don't merge into linked buffers 2019-03-23 20:09:59 +01:00
stack.c
stat.c
statfs.c
super.c
sync.c
timerfd.c
userfaultfd.c userfaultfd: clear flag if remap event not enabled 2019-01-26 09:32:43 +01:00
utimes.c
xattr.c