linux/lib
Ross Zwisler 9f418224e8 radix tree: fix multi-order iteration race
Fix a race in the multi-order iteration code which causes the kernel to
hit a GP fault.  This was first seen with a production v4.15 based
kernel (4.15.6-300.fc27.x86_64) utilizing a DAX workload which used
order 9 PMD DAX entries.

The race has to do with how we tear down multi-order sibling entries
when we are removing an item from the tree.  Remember for example that
an order 2 entry looks like this:

  struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]

where 'entry' is in some slot in the struct radix_tree_node, and the
three slots following 'entry' contain sibling pointers which point back
to 'entry'.
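
The layout above can be sketched as a small user-space toy model.  This
is not the kernel code: struct toy_node, TOY_MAP_SIZE, TOY_INTERNAL and
the toy_*() helpers are hypothetical names, and the node is only four
slots wide so that one order 2 entry fills it.  The model assumes that a
sibling pointer is the address of the canonical slot with a low tag bit
set.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAP_SIZE 4                  /* toy node width, not the kernel's */
#define TOY_INTERNAL ((uintptr_t)1)     /* low bit tags a non-data pointer */

struct toy_node {
	void *slots[TOY_MAP_SIZE];
};

/* A sibling pointer is the address of the canonical slot, tagged. */
static void *toy_make_sibling(void **canonical)
{
	return (void *)((uintptr_t)canonical | TOY_INTERNAL);
}

/* Recover the slot address that a sibling pointer refers to. */
static void **toy_sibling_target(void *sibling)
{
	return (void **)((uintptr_t)sibling & ~TOY_INTERNAL);
}

/* Store 'item' as an order 2 entry: one canonical slot plus three siblings. */
static void toy_store_order2(struct toy_node *node, void *item)
{
	assert(((uintptr_t)item & TOY_INTERNAL) == 0);  /* data must be untagged */
	node->slots[0] = item;
	for (int i = 1; i < TOY_MAP_SIZE; i++)
		node->slots[i] = toy_make_sibling(&node->slots[0]);
}
```

After toy_store_order2(), slots[1] through slots[3] all hold the same
tagged value, and toy_sibling_target() maps each of them back to
&slots[0].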

When we delete 'entry' from the tree, we call:

  radix_tree_delete()
    radix_tree_delete_item()
      __radix_tree_delete()
        replace_slot()

replace_slot() first removes the siblings in order from the first to the
last, and then replaces 'entry' with NULL.  This means that for a
brief period of time we end up with one or more of the siblings removed,
so:

  struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]

This causes an issue if you have a reader iterating over the slots in
the tree via radix_tree_for_each_slot() while only under
rcu_read_lock()/rcu_read_unlock() protection.  This is a common case in
mm/filemap.c.
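
The teardown order described above can be modeled one store at a time,
which makes the intermediate states visible.  This is a user-space
sketch under the same toy assumptions (four-slot node, tagged slot
address as the sibling pointer); toy_delete_step() is a hypothetical
helper, not replace_slot() itself.  A lockless reader may observe the
node between any two steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAP_SIZE 4                  /* toy node width, not the kernel's */
#define TOY_INTERNAL ((uintptr_t)1)     /* low bit tags a non-data pointer */

struct toy_node {
	void *slots[TOY_MAP_SIZE];
};

static void *toy_make_sibling(void **canonical)
{
	return (void *)((uintptr_t)canonical | TOY_INTERNAL);
}

/* Build [entry][sibling][sibling][sibling]. */
static void toy_store_order2(struct toy_node *node, void *item)
{
	node->slots[0] = item;
	for (int i = 1; i < TOY_MAP_SIZE; i++)
		node->slots[i] = toy_make_sibling(&node->slots[0]);
}

/*
 * Perform teardown step 'step' (0 .. TOY_MAP_SIZE - 1).
 * Siblings are cleared first to last; the entry itself goes last.
 */
static void toy_delete_step(struct toy_node *node, int step)
{
	assert(step >= 0 && step < TOY_MAP_SIZE);
	if (step < TOY_MAP_SIZE - 1)
		node->slots[step + 1] = NULL;   /* clear the next sibling */
	else
		node->slots[0] = NULL;          /* finally clear the entry */
}
```

After step 0 the model holds [entry][NULL][sibling][sibling], which is
the window in which the iteration race occurs.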

The issue is that when __radix_tree_next_slot() => skip_siblings() tries
to skip over the sibling entries in the slots, it currently does so with
an exact match on the slot directly preceding our current slot.
Normally this works:

                                      V preceding slot
  struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                              ^ current slot

This lets you find the first sibling, and you skip them all in order.

But in the case where one of the siblings is NULL, that slot is skipped
and then our sibling detection is interrupted:

                                             V preceding slot
  struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                    ^ current slot

This means that the sibling pointers aren't recognized since they point
all the way back to 'entry', so we think that they are normal internal
radix tree pointers.  This causes us to think we need to walk down to a
struct radix_tree_node starting at the address of 'entry'.

In a real running kernel this will crash the thread with a GP fault when
you try to dereference the slots in your broken node starting at
'entry'.
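
The misdetection can be reproduced in a user-space toy model.  This is
a sketch, not the kernel's skip_siblings(): toy_skip_exact() is a
hypothetical helper that, like the check described above, only
recognizes a sibling when it points at the slot directly preceding the
starting slot.  The four-slot node and the tagged-address encoding are
assumptions of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAP_SIZE 4                  /* toy node width, not the kernel's */
#define TOY_INTERNAL ((uintptr_t)1)     /* low bit tags a non-data pointer */

struct toy_node {
	void *slots[TOY_MAP_SIZE];
};

static void *toy_make_sibling(void **canonical)
{
	return (void *)((uintptr_t)canonical | TOY_INTERNAL);
}

/*
 * Exact-match skip: starting at 'start', skip NULL slots and slots that
 * point back at slots[start - 1].  Returns the first offset that is not
 * skipped, or TOY_MAP_SIZE if every remaining slot was skipped.
 */
static int toy_skip_exact(struct toy_node *node, int start)
{
	assert(start >= 1 && start <= TOY_MAP_SIZE);
	void *sib = toy_make_sibling(&node->slots[start - 1]);

	for (int i = start; i < TOY_MAP_SIZE; i++)
		if (node->slots[i] && node->slots[i] != sib)
			return i;       /* caller treats this as a child node */
	return TOY_MAP_SIZE;
}
```

With an intact entry, toy_skip_exact(node, 1) skips all three siblings.
With [entry][NULL][sibling][sibling] and the current slot at offset 2,
the expected sibling value points at offset 1, so the real sibling at
offset 2 is returned as if it were a pointer to a child node.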

We fix this race by fixing the way that skip_siblings() detects sibling
entries.  Instead of testing against the preceding slot, we look
for siblings via is_sibling_entry(), which checks whether the pointer falls
within the struct radix_tree_node.slots[] array.  This ensures that sibling
entries are properly identified, even if they are no longer contiguous
with the 'entry' they point to.
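
The idea behind the fix can be sketched in a user-space toy model.  This
is not the patch itself: toy_is_sibling() and toy_skip_range() are
hypothetical helpers, and the four-slot node and tagged-address sibling
encoding are assumptions of the model.  A value counts as a sibling
whenever it points anywhere inside the node's own slots[] array, so it
no longer matters which slot precedes the current one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAP_SIZE 4                  /* toy node width, not the kernel's */
#define TOY_INTERNAL ((uintptr_t)1)     /* low bit tags a non-data pointer */

struct toy_node {
	void *slots[TOY_MAP_SIZE];
};

static void *toy_make_sibling(void **canonical)
{
	return (void *)((uintptr_t)canonical | TOY_INTERNAL);
}

/* True if 'entry' points into node->slots[], with or without the tag bit. */
static bool toy_is_sibling(const struct toy_node *node, const void *entry)
{
	uintptr_t p = (uintptr_t)entry;
	uintptr_t lo = (uintptr_t)&node->slots[0];
	uintptr_t hi = (uintptr_t)&node->slots[TOY_MAP_SIZE];

	return lo <= p && p < hi;
}

/*
 * Range-based skip: starting at 'start', skip NULL slots and any slot
 * holding a sibling pointer.  Returns the first offset that is not
 * skipped, or TOY_MAP_SIZE if every remaining slot was skipped.
 */
static int toy_skip_range(const struct toy_node *node, int start)
{
	assert(start >= 0 && start <= TOY_MAP_SIZE);
	for (int i = start; i < TOY_MAP_SIZE; i++)
		if (node->slots[i] && !toy_is_sibling(node, node->slots[i]))
			return i;
	return TOY_MAP_SIZE;
}
```

For [entry][NULL][sibling][sibling], toy_skip_range(node, 2) skips both
remaining siblings even though the slot before them is NULL, while the
canonical entry at offset 0 is still reported as a normal entry.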

Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com
Fixes: 148deab223 ("radix-tree: improve multiorder iterators")
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: CR, Sapthagirish <sapthagirish.cr@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18 17:17:12 -07:00
842
fonts
lz4
lzo
mpi lib/mpi: Fix umul_ppmm() for MIPS64r6 2017-12-22 19:39:09 +11:00
raid6 powerpc updates for 4.17 2018-04-07 12:08:19 -07:00
reed_solomon
xz
zlib_deflate
zlib_inflate
zstd lib: zstd: clean up Makefile for simpler composite object handling 2018-03-26 02:01:27 +09:00
.gitignore
argv_split.c
ashldi3.c move libgcc.h to include/linux 2017-12-01 13:09:40 -08:00
ashrdi3.c move libgcc.h to include/linux 2017-12-01 13:09:40 -08:00
asn1_decoder.c ASN.1: check for error from ASN1_OP_END__ACT actions 2017-12-08 15:13:27 +00:00
assoc_array.c lib/assoc_array: Remove smp_read_barrier_depends() 2017-12-04 10:52:56 -08:00
atomic64_test.c
atomic64.c
audit.c
bcd.c
bch.c
bitmap.c lib: fix stall in __bitmap_parselist() 2018-04-05 21:36:21 -07:00
bitrev.c
bsearch.c
btree.c btree: avoid variable-length allocations 2018-03-14 16:55:29 -07:00
bucket_locks.c spinlock: Add library function to allocate spinlock buckets array 2017-12-11 09:58:39 -05:00
bug.c lib/bug.c: exclude non-BUG/WARN exceptions from report_bug() 2018-03-09 16:40:01 -08:00
build_OID_registry
bust_spinlocks.c
chacha20.c crypto: chacha20 - use rol32() macro from bitops.h 2018-01-12 23:03:01 +11:00
check_signature.c
checksum.c
clz_ctz.c
clz_tab.c
cmdline.c
cmpdi2.c move libgcc.h to include/linux 2017-12-01 13:09:40 -08:00
compat_audit.c
cordic.c
cpu_rmap.c
cpumask.c lib: optimize cpumask_next_and() 2018-02-06 18:32:44 -08:00
crc-ccitt.c lib/crc-ccitt: Add CCITT-FALSE CRC16 variant 2018-01-08 10:08:33 +00:00
crc-itu-t.c
crc-t10dif.c
crc4.c
crc7.c
crc8.c
crc16.c
crc32.c
crc32defs.h
crc32test.c
ctype.c
debug_info.c
debug_locks.c
debugobjects.c debugobjects: Avoid another unused variable warning 2018-03-14 20:20:01 +01:00
dec_and_lock.c
decompress_bunzip2.c
decompress_inflate.c
decompress_unlz4.c
decompress_unlzma.c
decompress_unlzo.c
decompress_unxz.c
decompress.c
devres.c devres: combine function devm_ioremap* 2018-03-15 18:08:55 +01:00
digsig.c
div64.c
dma-debug.c dma-debug: fix memory leak in debug_dma_alloc_coherent 2018-02-22 15:02:33 -08:00
dma-direct.c dma-direct: don't retry allocation for no-op GFP_DMA 2018-04-23 14:43:27 +02:00
dma-virt.c
dump_stack.c printk: move dump stack related code to lib/dump_stack.c 2018-03-15 13:25:36 +01:00
dynamic_debug.c
dynamic_queue_limits.c
earlycpio.c
error-inject.c error-injection: Add injectable error types 2018-01-12 17:33:38 -08:00
errseq.c errseq: Always report a writeback error once 2018-04-27 08:51:26 -04:00
extable.c
fault-inject.c
fdt_empty_tree.c
fdt_ro.c
fdt_rw.c
fdt_strerror.c
fdt_sw.c
fdt_wip.c
fdt.c
find_bit_benchmark.c lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit() 2018-05-11 17:28:45 -07:00
find_bit.c lib: optimize cpumask_next_and() 2018-02-06 18:32:44 -08:00
flex_array.c
flex_proportions.c
gcd.c
gen_crc32table.c
genalloc.c lib/genalloc.c: make the avail variable an atomic_long_t 2017-11-17 16:10:02 -08:00
glob.c
globtest.c
hexdump.c
hweight.c
idr.c idr: Fix handling of IDs above INT_MAX 2018-02-26 14:39:30 -05:00
inflate.c
int_sqrt.c lib: Add strongly typed 64bit int_sqrt 2018-02-04 10:17:21 +00:00
interval_tree_test.c lib/rbtree-test: lower default params 2017-11-17 16:10:02 -08:00
interval_tree.c
iomap_copy.c
iomap.c
iommu-common.c
iommu-helper.c
ioremap.c mm/vmalloc: add interfaces to free unmapped page table 2018-03-22 17:07:01 -07:00
iov_iter.c
irq_poll.c
irq_regs.c
is_single_threaded.c
jedec_ddr_data.c
kasprintf.c
Kconfig lib: Add generic PIO mapping method 2018-03-21 17:18:34 -05:00
Kconfig.debug lib/Kconfig.debug: Debug Lockups and Hangs: keep SOFTLOCKUP options together 2018-04-11 10:28:35 -07:00
Kconfig.kasan kasan: rework Kconfig settings 2018-02-06 18:32:47 -08:00
Kconfig.kgdb
Kconfig.ubsan lib: add testing module for UBSAN 2018-04-11 10:28:35 -07:00
kfifo.c kfifo: fix inaccurate comment 2018-03-27 11:15:42 +02:00
klist.c
kobject_uevent.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
kobject.c kobject: don't use WARN for registration failures 2018-04-23 13:14:55 +02:00
kstrtox.c
kstrtox.h
lcm.c
libcrc32c.c libcrc32c: Add crc32c_impl function 2018-03-26 15:09:38 +02:00
list_debug.c lib/list_debug.c: print unmangled addresses 2018-04-11 10:28:35 -07:00
list_sort.c
llist.c
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-rtmutex.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c
lockref.c lockref: Add lockref_put_not_zero 2018-04-12 09:41:19 -07:00
logic_pio.c lib: Add generic PIO mapping method 2018-03-21 17:18:34 -05:00
lru_cache.c
lshrdi3.c move libgcc.h to include/linux 2017-12-01 13:09:40 -08:00
Makefile lib: add testing module for UBSAN 2018-04-11 10:28:35 -07:00
memory-notifier-error-inject.c
memweight.c
muldi3.c move libgcc.h to include/linux 2017-12-01 13:09:40 -08:00
net_utils.c
netdev-notifier-error-inject.c
nlattr.c netlink: Relax attr validation for fixed length types 2017-12-07 14:00:57 -05:00
nmi_backtrace.c lib/nmi_backtrace.c: fix kernel text address leak 2017-11-17 16:10:02 -08:00
nodemask.c
notifier-error-inject.c
notifier-error-inject.h
of-reconfig-notifier-error-inject.c
oid_registry.c 509: fix printing uninitialized stack memory when OID is empty 2017-12-08 15:13:28 +00:00
once.c
parman.c
parser.c
pci_iomap.c PCI: Add SPDX GPL-2.0 when no license was specified 2018-01-26 11:45:16 -06:00
percpu_counter.c
percpu_ida.c
percpu_test.c
percpu-refcount.c percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods 2018-03-19 10:09:44 -07:00
plist.c
pm-notifier-error-inject.c
prime_numbers.c
radix-tree.c radix tree: fix multi-order iteration race 2018-05-18 17:17:12 -07:00
random32.c treewide: Switch DEFINE_TIMER callbacks to struct timer_list * 2017-11-21 15:57:05 -08:00
ratelimit.c
rational.c
rbtree_test.c lib/rbtree-test: lower default params 2017-11-17 16:10:02 -08:00
rbtree.c lib/rbtree,drm/mm: add rbtree_replace_node_cached() 2017-12-14 16:00:48 -08:00
reciprocal_div.c
refcount.c
rhashtable.c rhashtable: add schedule points 2018-03-31 23:25:39 -04:00
sbitmap.c sbitmap: use test_and_set_bit_lock()/clear_bit_unlock() 2018-02-28 12:23:35 -07:00
scatterlist.c lib/scatterlist: add sg_init_marker() helper 2018-03-30 22:50:15 +02:00
seq_buf.c
sg_pool.c
sg_split.c
sha1.c
sha256.c kernel/kexec_file.c: move purgatories sha256 to common code 2018-04-13 17:10:28 -07:00
show_mem.c
siphash.c
smp_processor_id.c lib: do not use print_symbol() 2018-01-05 15:24:00 +01:00
sort.c
stackdepot.c lib/stackdepot.c: use a non-instrumented version of memcmp() 2018-02-06 18:32:44 -08:00
stmp_device.c
string_helpers.c
string.c lib/strscpy: Shut up KASAN false-positives in strscpy() 2018-02-01 12:20:21 -08:00
strncpy_from_user.c
strnlen_user.c
swiotlb.c swiotlb: silent unwanted warning "buffer is full" 2018-05-12 11:57:37 +02:00
syscall.c
test_bitmap.c lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly 2018-05-18 17:17:12 -07:00
test_bpf.c test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment() 2018-03-29 14:33:29 -04:00
test_debug_virtual.c
test_firmware.c headers: untangle kmemleak.h from mm.h 2018-04-05 21:36:27 -07:00
test_hash.c
test_hexdump.c
test_kasan.c kasan: fix invalid-free test crashing the kernel 2018-04-11 10:28:32 -07:00
test_kmod.c lib/test_kmod.c: fix limit check on number of test devices created 2018-03-09 16:40:02 -08:00
test_list_sort.c lib/test: delete five error messages for failed memory allocations 2017-11-17 16:10:01 -08:00
test_module.c
test_parman.c
test_printf.c printk: hash addresses printed with %p 2017-11-29 12:09:02 +11:00
test_rhashtable.c test_rhashtable: add test case for rhltable with duplicate objects 2018-03-07 10:44:03 -05:00
test_siphash.c
test_sort.c lib/test_sort.c: add module unload support 2018-02-06 18:32:45 -08:00
test_static_key_base.c
test_static_keys.c
test_string.c lib: add module support to string tests 2017-11-17 16:10:01 -08:00
test_sysctl.c
test_ubsan.c lib/test_ubsan.c: make test_ubsan_misaligned_access() static 2018-04-11 10:28:35 -07:00
test_user_copy.c treewide: simplify Kconfig dependencies for removed archs 2018-03-26 15:55:57 +02:00
test_uuid.c
test-kstrtox.c
test-string_helpers.c
textsearch.c textsearch: fix kernel-doc warnings and add kernel-api section 2018-04-16 18:53:13 -04:00
timerqueue.c timerqueue: Document return values of timerqueue_add/del() 2017-12-29 23:13:10 +01:00
ts_bm.c
ts_fsm.c
ts_kmp.c
ubsan.c lib/ubsan: remove returns-nonnull-attribute checks 2018-02-06 18:32:46 -08:00
ubsan.h lib/ubsan: remove returns-nonnull-attribute checks 2018-02-06 18:32:46 -08:00
ucmpdi2.c move libgcc.h to include/linux 2017-12-01 13:09:40 -08:00
ucs2_string.c
usercopy.c Fix misannotated out-of-line _copy_to_user() 2017-12-11 09:35:11 -05:00
uuid.c Documentation: add UUID/GUID to kernel-api 2017-12-11 15:03:08 -07:00
vsprintf.c vsprintf: Replace memory barrier with static_key for random_ptr_key update 2018-05-16 09:01:41 -04:00
win_minmax.c
xxhash.c