linux/drivers/block
Josef Bacik dc2b77642e nbd: handle device refs for DESTROY_ON_DISCONNECT properly
commit c9a2f90f4d upstream.

There exists a race where we can be attempting to create a new nbd
configuration while a previous configuration is going down, both
configured with DESTROY_ON_DISCONNECT.  Normally devices all have a
reference of 1, as they won't be cleaned up until the module is torn
down.  However with DESTROY_ON_DISCONNECT we'll make sure that there is
only 1 reference (generally) on the device for the config itself, and
then once the config is dropped, the device is torn down.

The race that exists looks like this

TASK1					TASK2
nbd_genl_connect()
  idr_find()
    refcount_inc_not_zero(nbd)
      * count is 2 here ^^
					nbd_config_put()
					  nbd_put(nbd) (count is 1)
    setup new config
      check DESTROY_ON_DISCONNECT
	put_dev = true
    if (put_dev) nbd_put(nbd)
	* free'd here ^^

In nbd_genl_connect() we assume that the nbd ref count will be 2,
however clearly that won't be true if the nbd device had been setup as
DESTROY_ON_DISCONNECT with its prior configuration.  Fix this by getting
rid of the runtime flag to check if we need to mess with the nbd device
refcount, and use the device NBD_DESTROY_ON_DISCONNECT flag to check if
we need to adjust the ref counts.  This was reported by syzkaller with
the following kasan dump

BUG: KASAN: use-after-free in instrument_atomic_read include/linux/instrumented.h:71 [inline]
BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
BUG: KASAN: use-after-free in refcount_dec_not_one+0x71/0x1e0 lib/refcount.c:76
Read of size 4 at addr ffff888143bf71a0 by task systemd-udevd/8451

CPU: 0 PID: 8451 Comm: systemd-udevd Not tainted 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:230
 __kasan_report mm/kasan/report.c:396 [inline]
 kasan_report.cold+0x79/0xd5 mm/kasan/report.c:413
 check_memory_region_inline mm/kasan/generic.c:179 [inline]
 check_memory_region+0x13d/0x180 mm/kasan/generic.c:185
 instrument_atomic_read include/linux/instrumented.h:71 [inline]
 atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
 refcount_dec_not_one+0x71/0x1e0 lib/refcount.c:76
 refcount_dec_and_mutex_lock+0x19/0x140 lib/refcount.c:115
 nbd_put drivers/block/nbd.c:248 [inline]
 nbd_release+0x116/0x190 drivers/block/nbd.c:1508
 __blkdev_put+0x548/0x800 fs/block_dev.c:1579
 blkdev_put+0x92/0x570 fs/block_dev.c:1632
 blkdev_close+0x8c/0xb0 fs/block_dev.c:1640
 __fput+0x283/0x920 fs/file_table.c:280
 task_work_run+0xdd/0x190 kernel/task_work.c:140
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:174 [inline]
 exit_to_user_mode_prepare+0x249/0x250 kernel/entry/common.c:201
 __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
 syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fc1e92b5270
Code: 73 01 c3 48 8b 0d 38 7d 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 59 c1 20 00 00 75 10 b8 03 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ee fb ff ff 48 89 04 24
RSP: 002b:00007ffe8beb2d18 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000007 RCX: 00007fc1e92b5270
RDX: 000000000aba9500 RSI: 0000000000000000 RDI: 0000000000000007
RBP: 00007fc1ea16f710 R08: 000000000000004a R09: 0000000000000008
R10: 0000562f8cb0b2a8 R11: 0000000000000246 R12: 0000000000000000
R13: 0000562f8cb0afd0 R14: 0000000000000003 R15: 000000000000000e

Allocated by task 1:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:401 [inline]
 ____kasan_kmalloc.constprop.0+0x82/0xa0 mm/kasan/common.c:429
 kmalloc include/linux/slab.h:552 [inline]
 kzalloc include/linux/slab.h:682 [inline]
 nbd_dev_add+0x44/0x8e0 drivers/block/nbd.c:1673
 nbd_init+0x250/0x271 drivers/block/nbd.c:2394
 do_one_initcall+0x103/0x650 init/main.c:1223
 do_initcall_level init/main.c:1296 [inline]
 do_initcalls init/main.c:1312 [inline]
 do_basic_setup init/main.c:1332 [inline]
 kernel_init_freeable+0x605/0x689 init/main.c:1533
 kernel_init+0xd/0x1b8 init/main.c:1421
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Freed by task 8451:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:356
 ____kasan_slab_free+0xe1/0x110 mm/kasan/common.c:362
 kasan_slab_free include/linux/kasan.h:192 [inline]
 slab_free_hook mm/slub.c:1547 [inline]
 slab_free_freelist_hook+0x5d/0x150 mm/slub.c:1580
 slab_free mm/slub.c:3143 [inline]
 kfree+0xdb/0x3b0 mm/slub.c:4139
 nbd_dev_remove drivers/block/nbd.c:243 [inline]
 nbd_put.part.0+0x180/0x1d0 drivers/block/nbd.c:251
 nbd_put drivers/block/nbd.c:295 [inline]
 nbd_config_put+0x6dd/0x8c0 drivers/block/nbd.c:1242
 nbd_release+0x103/0x190 drivers/block/nbd.c:1507
 __blkdev_put+0x548/0x800 fs/block_dev.c:1579
 blkdev_put+0x92/0x570 fs/block_dev.c:1632
 blkdev_close+0x8c/0xb0 fs/block_dev.c:1640
 __fput+0x283/0x920 fs/file_table.c:280
 task_work_run+0xdd/0x190 kernel/task_work.c:140
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:174 [inline]
 exit_to_user_mode_prepare+0x249/0x250 kernel/entry/common.c:201
 __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
 syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff888143bf7000
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 416 bytes inside of
 1024-byte region [ffff888143bf7000, ffff888143bf7400)
The buggy address belongs to the page:
page:000000005238f4ce refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x143bf0
head:000000005238f4ce order:3 compound_mapcount:0 compound_pincount:0
flags: 0x57ff00000010200(slab|head)
raw: 057ff00000010200 ffffea00004b1400 0000000300000003 ffff888010c41140
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888143bf7080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888143bf7100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888143bf7180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                               ^
 ffff888143bf7200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Reported-and-tested-by: syzbot+429d3f82d757c211bff3@syzkaller.appspotmail.com
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07 12:34:06 +01:00
..
aoe block: lift setting the readahead size into the block layer 2020-09-24 13:43:39 -06:00
drbd block-5.10-2020-10-12 2020-10-13 12:12:44 -07:00
mtip32xx blk-mq: move failure injection out of blk_mq_complete_request 2020-06-24 09:15:57 -06:00
paride paride/pcd: use bdev_check_media_change 2020-09-10 09:32:31 -06:00
rnbd block/rnbd-clt: avoid module unload race with close confirmation 2021-01-17 14:17:05 +01:00
rsxx rsxx: Use fallthrough pseudo-keyword 2020-10-02 17:54:45 -06:00
xen-blkback xen-blkback: fix error handling in xen_blkbk_map() 2021-02-23 15:53:24 +01:00
zram block-5.10-2020-10-24 2020-10-24 12:46:42 -07:00
amiflop.c amiflop: use bdev_check_media_change 2020-09-10 09:32:30 -06:00
ataflop.c ataflop: use bdev_check_media_change 2020-09-10 09:32:30 -06:00
brd.c bdi: remove BDI_CAP_SYNCHRONOUS_IO 2020-09-24 13:43:39 -06:00
cryptoloop.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 30 2019-05-24 17:27:10 +02:00
floppy.c floppy: reintroduce O_NDELAY fix 2021-03-04 11:38:33 +01:00
Kconfig block: rsxx: select CONFIG_CRC32 2021-01-17 14:17:03 +01:00
loop.c loop: Fix occasional uevent drop 2020-11-12 13:59:04 -07:00
loop.h
Makefile block/rnbd: include client and server modules into kernel compilation 2020-05-17 18:57:17 -03:00
nbd.c nbd: handle device refs for DESTROY_ON_DISCONNECT properly 2021-03-07 12:34:06 +01:00
null_blk_main.c null_blk: add support for max open/active zone limit for zoned devices 2020-09-29 08:12:48 -06:00
null_blk_trace.c null_blk: add tracepoint helpers for zoned mode 2020-03-27 13:39:10 -06:00
null_blk_trace.h null_blk: add tracepoint helpers for zoned mode 2020-03-27 13:39:10 -06:00
null_blk_zoned.c null_blk: Fail zone append to conventional zones 2020-12-30 11:54:29 +01:00
null_blk.h null_blk: Fix scheduling in atomic with zoned mode 2020-11-06 09:36:42 -07:00
pktcdvd.c pktcdvd: use blkdev_get_by_dev instead of open coding it 2020-09-23 10:43:19 -06:00
ps3disk.c ps3disk: use the default segment boundary 2020-05-19 00:10:35 +10:00
ps3vram.c block: move ->make_request_fn to struct block_device_operations 2020-07-01 07:27:24 -06:00
rbd_types.h libceph, rbd: replace zero-length array with flexible-array 2020-06-01 13:22:53 +02:00
rbd.c We have: 2020-10-21 10:34:10 -07:00
skd_main.c skd_main: remove unused including <linux/version.h> 2020-10-17 08:11:14 -06:00
skd_s1120.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497 2019-06-19 17:09:53 +02:00
sunvdc.c compat_ioctl: block: handle cdrom compat ioctl in non-cdrom drivers 2020-01-03 09:33:15 +01:00
swim_asm.S treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
swim.c swim: simplify media change handling 2020-09-10 09:32:30 -06:00
swim3.c swim3: use bdev_check_media_changed 2020-09-10 09:32:31 -06:00
sx8.c
umem.c block: move ->make_request_fn to struct block_device_operations 2020-07-01 07:27:24 -06:00
umem.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 348 2019-06-05 17:37:08 +02:00
virtio_blk.c block: add a new revalidate_disk_size helper 2020-09-02 08:00:07 -06:00
xen-blkfront.c xen-blkfront: allow discard-* nodes to be optional 2021-02-03 23:28:44 +01:00
xsysace.c xsysace: use platform_get_resource() and platform_get_irq_optional() 2020-10-29 08:22:33 -06:00
z2ram.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00