Linux kernel source tree
Go to file
Jens Axboe f3ac7a5996 io_uring: fix SQPOLL IORING_OP_CLOSE cancelation state
commit 607ec89ed1 upstream.

IORING_OP_CLOSE is special in terms of cancelation, since it has an
intermediate state where we've removed the file descriptor but hasn't
closed the file yet. For that reason, it's currently marked with
IO_WQ_WORK_NO_CANCEL to prevent cancelation. This ensures that the op
is always run even if canceled, to prevent leaving us with a live file
but an fd that is gone. However, with SQPOLL, since a cancel request
doesn't carry any resources on behalf of the request being canceled, if
we cancel before any of the close op has been run, we can end up with
io-wq not having the ->files assigned. This can result in the following
oops reported by Joseph:

BUG: kernel NULL pointer dereference, address: 00000000000000d8
PGD 800000010b76f067 P4D 800000010b76f067 PUD 10b462067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 1 PID: 1788 Comm: io_uring-sq Not tainted 5.11.0-rc4 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:__lock_acquire+0x19d/0x18c0
Code: 00 00 8b 1d fd 56 dd 08 85 db 0f 85 43 05 00 00 48 c7 c6 98 7b 95 82 48 c7 c7 57 96 93 82 e8 9a bc f5 ff 0f 0b e9 2b 05 00 00 <48> 81 3f c0 ca 67 8a b8 00 00 00 00 41 0f 45 c0 89 04 24 e9 81 fe
RSP: 0018:ffffc90001933828 EFLAGS: 00010002
RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000000d8
RBP: 0000000000000246 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: ffff888106e8a140 R15: 00000000000000d8
FS:  0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000d8 CR3: 0000000106efa004 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 lock_acquire+0x31a/0x440
 ? close_fd_get_file+0x39/0x160
 ? __lock_acquire+0x647/0x18c0
 _raw_spin_lock+0x2c/0x40
 ? close_fd_get_file+0x39/0x160
 close_fd_get_file+0x39/0x160
 io_issue_sqe+0x1334/0x14e0
 ? lock_acquire+0x31a/0x440
 ? __io_free_req+0xcf/0x2e0
 ? __io_free_req+0x175/0x2e0
 ? find_held_lock+0x28/0xb0
 ? io_wq_submit_work+0x7f/0x240
 io_wq_submit_work+0x7f/0x240
 io_wq_cancel_cb+0x161/0x580
 ? io_wqe_wake_worker+0x114/0x360
 ? io_uring_get_socket+0x40/0x40
 io_async_find_and_cancel+0x3b/0x140
 io_issue_sqe+0xbe1/0x14e0
 ? __lock_acquire+0x647/0x18c0
 ? __io_queue_sqe+0x10b/0x5f0
 __io_queue_sqe+0x10b/0x5f0
 ? io_req_prep+0xdb/0x1150
 ? mark_held_locks+0x6d/0xb0
 ? mark_held_locks+0x6d/0xb0
 ? io_queue_sqe+0x235/0x4b0
 io_queue_sqe+0x235/0x4b0
 io_submit_sqes+0xd7e/0x12a0
 ? _raw_spin_unlock_irq+0x24/0x30
 ? io_sq_thread+0x3ae/0x940
 io_sq_thread+0x207/0x940
 ? do_wait_intr_irq+0xc0/0xc0
 ? __ia32_sys_io_uring_enter+0x650/0x650
 kthread+0x134/0x180
 ? kthread_create_worker_on_cpu+0x90/0x90
 ret_from_fork+0x1f/0x30

Fix this by moving the IO_WQ_WORK_NO_CANCEL until _after_ we've modified
the fdtable. Canceling before this point is totally fine, and running
it in the io-wq context _after_ that point is also fine.

For 5.12, we'll handle this internally and get rid of the no-cancel
flag, as IORING_OP_CLOSE is the only user of it.

Cc: stable@vger.kernel.org
Fixes: b5dba59e0c ("io_uring: add support for IORING_OP_CLOSE")
Reported-by: "Abaci <abaci@linux.alibaba.com>"
Reviewed-and-tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27 11:55:15 +01:00
arch x86/setup: don't remove E820_TYPE_RAM for pfn 0 2021-01-27 11:55:14 +01:00
block blk-mq-debugfs: Add decode for BLK_MQ_F_TAG_HCTX_SHARED 2021-01-19 18:27:29 +01:00
certs .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
crypto crypto: xor - Fix divide error in do_xor_speed() 2021-01-27 11:54:52 +01:00
Documentation x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery 2021-01-27 11:55:01 +01:00
drivers irqchip/mips-cpu: Set IPI domain parent chip 2021-01-27 11:55:13 +01:00
fs io_uring: fix SQPOLL IORING_OP_CLOSE cancelation state 2021-01-27 11:55:15 +01:00
include xen: Fix event channel callback via INTX/GSI 2021-01-27 11:55:00 +01:00
init rcu-tasks: Move RCU-tasks initialization to before early_initcall() 2021-01-19 18:27:28 +01:00
ipc ipc: adjust proc_ipc_sem_dointvec definition to match prototype 2020-09-05 12:14:29 -07:00
kernel printk: fix kmsg_dump_get_buffer length calulations 2021-01-27 11:55:08 +01:00
lib iov_iter: fix the uaccess area in copy_compat_iovec_from_user 2021-01-27 11:55:09 +01:00
LICENSES LICENSES/deprecated: add Zlib license text 2020-09-16 14:33:49 +02:00
mm mm: fix numa stats for thp migration 2021-01-27 11:55:14 +01:00
net xsk: Clear pool even for inactive queues 2021-01-27 11:55:10 +01:00
samples samples/bpf: Fix possible hang in xdpsock with multiple threads 2020-12-30 11:53:49 +01:00
scripts Revert "kconfig: remove 'kvmconfig' and 'xenconfig' shorthands" 2021-01-23 16:03:57 +01:00
security dump_common_audit_data(): fix racy accesses to ->d_name 2021-01-19 18:27:29 +01:00
sound ALSA: hda: Balance runtime/system PM if direct-complete is disabled 2021-01-27 11:55:10 +01:00
tools perf evlist: Fix id index for heterogeneous systems 2021-01-27 11:55:11 +01:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt kvm: check tlbs_dirty directly 2021-01-12 20:18:22 +01:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: docs: ignore sphinx_*/ directories 2020-09-10 10:44:31 -06:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-12-10 15:30:13 -08:00
Makefile Linux 5.10.10 2021-01-23 16:04:06 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.