linux/include
Johannes Weiner f84311d7cd mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
commit 22f2ac51b6 upstream.

Antonio reports the following crash when using fuse under memory pressure:

  kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: all of them
  CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu
  Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
  task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000
  RIP: shadow_lru_isolate+0x181/0x190
  Call Trace:
    __list_lru_walk_one.isra.3+0x8f/0x130
    list_lru_walk_one+0x23/0x30
    scan_shadow_nodes+0x34/0x50
    shrink_slab.part.40+0x1ed/0x3d0
    shrink_zone+0x2ca/0x2e0
    kswapd+0x51e/0x990
    kthread+0xd8/0xf0
    ret_from_fork+0x3f/0x70

which corresponds to the following sanity check in the shadow node
tracking:

  BUG_ON(node->count & RADIX_TREE_COUNT_MASK);

The workingset code tracks radix tree nodes that exclusively contain
shadow entries of evicted pages, and this (somewhat obscure) line checks
whether there are real pages left that would interfere with reclaim of
the radix tree node under memory pressure.
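
As a rough illustration of what that check inspects, the following
userspace sketch models a node counter that packs the page count into
the low bits and the shadow count into the bits above them.  Every
toy_* name and the shift value of 7 are assumptions made for this
example only; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout for this sketch: low 7 bits count pages, the rest shadows. */
#define TOY_COUNT_SHIFT 7
#define TOY_COUNT_MASK  ((1UL << TOY_COUNT_SHIFT) - 1)

/* Hypothetical stand-in for a radix tree node; only the counter is modeled. */
struct toy_node {
	unsigned long count;
};

/* Number of real pages recorded in the node. */
static unsigned long toy_node_pages(const struct toy_node *node)
{
	return node->count & TOY_COUNT_MASK;
}

/* Number of shadow entries recorded in the node. */
static unsigned long toy_node_shadows(const struct toy_node *node)
{
	return node->count >> TOY_COUNT_SHIFT;
}

/* A node may be reclaimed only if it holds shadows and no pages. */
static bool toy_node_reclaimable(const struct toy_node *node)
{
	return toy_node_pages(node) == 0 && toy_node_shadows(node) > 0;
}

/* Userspace analogue of the sanity check: no page bits may be set. */
static void toy_shrinker_check(const struct toy_node *node)
{
	assert((node->count & TOY_COUNT_MASK) == 0);
}
```

In this model a node with three shadows and no pages passes
toy_shrinker_check(), while adding a single page to the counter makes
toy_node_reclaimable() return false.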

While discussing how fuse might sneak pages into the radix tree
past the workingset code, Miklos pointed to replace_page_cache_page(),
and indeed there is a problem there: it properly accounts for the old
page being removed - __delete_from_page_cache() does that - but then
does a raw radix_tree_insert(), not accounting for the replacement
page.  Eventually the page count bits in node->count underflow while
leaving the node incorrectly linked to the shadow node LRU.
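
The underflow described above can be reproduced in a simplified
userspace model.  This is only a sketch: the toy_* names, the shift of
7 and the on_shadow_lru flag are assumptions standing in for the real
node counter and LRU linkage, and the real deletion path does more work
than shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout for this sketch: low 7 bits count pages, the rest shadows. */
#define TOY_COUNT_SHIFT 7
#define TOY_COUNT_MASK  ((1UL << TOY_COUNT_SHIFT) - 1)

struct toy_node {
	unsigned long count;	/* packed page and shadow counts */
	bool on_shadow_lru;	/* stand-in for the shadow node LRU linkage */
};

/* Tracked removal: drop one page, link the node once only shadows remain. */
static void toy_delete_page(struct toy_node *node)
{
	node->count--;	/* borrows from the shadow bits if pages were 0 */
	if ((node->count & TOY_COUNT_MASK) == 0 &&
	    (node->count >> TOY_COUNT_SHIFT) > 0)
		node->on_shadow_lru = true;
}

/* Untracked insertion: the slot is filled but the counter is not updated. */
static void toy_raw_insert(struct toy_node *node)
{
	(void)node;
}

/* True if the shrinker's sanity check would trip on this node. */
static bool toy_shrinker_would_bug(const struct toy_node *node)
{
	return node->on_shadow_lru && (node->count & TOY_COUNT_MASK) != 0;
}

/* Replace a page without accounting, then remove the replacement. */
static bool toy_demo_untracked_replace(void)
{
	struct toy_node node = {
		.count = (2UL << TOY_COUNT_SHIFT) | 1,	/* 2 shadows, 1 page */
		.on_shadow_lru = false,
	};

	toy_delete_page(&node);	/* old page out: pages 1 -> 0, node linked */
	toy_raw_insert(&node);	/* new page in, but not counted */
	toy_delete_page(&node);	/* new page removed: 0 - 1 underflows */

	assert(node.on_shadow_lru);	/* still linked to the LRU */
	return toy_shrinker_would_bug(&node);
}
```

In this model the second removal borrows from the shadow bits, so the
page bits become nonzero while the node stays on the LRU, which is the
state the BUG_ON() catches.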

To address this, make sure replace_page_cache_page() uses the tracked
page insertion code, page_cache_tree_insert().  This fixes the page
accounting and makes sure page-containing nodes are properly unlinked
from the shadow node LRU again.
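
The effect of a tracked insertion can be sketched in the same kind of
simplified userspace model.  Here toy_tracked_insert() is a
hypothetical stand-in for the accounting side of
page_cache_tree_insert(), not its actual implementation, and the
toy_* names and shift value are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout for this sketch: low 7 bits count pages, the rest shadows. */
#define TOY_COUNT_SHIFT 7
#define TOY_COUNT_MASK  ((1UL << TOY_COUNT_SHIFT) - 1)

struct toy_node {
	unsigned long count;	/* packed page and shadow counts */
	bool on_shadow_lru;	/* stand-in for the shadow node LRU linkage */
};

/* Tracked removal: drop one page, link the node once only shadows remain. */
static void toy_delete_page(struct toy_node *node)
{
	node->count--;
	if ((node->count & TOY_COUNT_MASK) == 0 &&
	    (node->count >> TOY_COUNT_SHIFT) > 0)
		node->on_shadow_lru = true;
}

/* Tracked insertion: count the page and unlink the node from the LRU. */
static void toy_tracked_insert(struct toy_node *node)
{
	node->count++;
	node->on_shadow_lru = false;
}

/* Replace a page with full accounting and return the resulting node. */
static struct toy_node toy_demo_tracked_replace(void)
{
	struct toy_node node = {
		.count = (2UL << TOY_COUNT_SHIFT) | 1,	/* 2 shadows, 1 page */
		.on_shadow_lru = false,
	};

	toy_delete_page(&node);		/* old page out: node linked */
	toy_tracked_insert(&node);	/* new page in: counted, node unlinked */

	assert(!node.on_shadow_lru);
	return node;
}
```

With the insertion accounted for, the counter returns to one page and
two shadows and the node is no longer on the LRU, so a later removal of
the replacement page cannot underflow.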

Also, make the sanity checks a bit less obscure by using the helpers for
checking the number of pages and shadows in a radix tree node.

[mhocko@suse.com: backport for 4.4]
Fixes: 449dd6984d ("mm: keep page cache radix tree nodes in check")
Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Antonio SJ Musumeci <trapexit@spawn.link>
Debugged-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28 03:01:34 -04:00
acpi Merge branch 'acpi-pci' 2015-11-07 01:30:10 +01:00
asm-generic asm-generic: make copy_from_user() zero the destination properly 2016-09-24 10:07:45 +02:00
clocksource
crypto crypto: ghash-generic - move common definitions to a new header file 2016-10-22 12:26:56 +02:00
drm drm/i915/skl: Add missing SKL ids 2016-09-15 08:27:44 +02:00
dt-bindings ARM: DT updates for v4.4 2015-11-10 15:06:26 -08:00
keys
kvm KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active 2015-11-24 18:07:40 +01:00
linux mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page() 2016-10-28 03:01:34 -04:00
math-emu
media videobuf2-core: Check user space planes array in dqbuf 2016-05-04 14:48:50 -07:00
memory
misc
net af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock' 2016-09-30 10:18:36 +02:00
pcmcia
ras
rdma IB/security: Restrict use of the write() interface 2016-05-04 14:48:48 -07:00
rxrpc
scsi scsi: Add intermediate STARGET_REMOVE state to scsi_target_state 2016-06-01 12:15:54 -07:00
soc ARM: SoC driver updates for v4.4 2015-11-10 15:00:03 -08:00
sound ALSA: rawmidi: Make snd_rawmidi_transmit() race-free 2016-02-17 12:30:58 -08:00
target target: Fix ordered task CHECK_CONDITION early exception handling 2016-08-20 18:09:26 +02:00
trace SUNRPC: Don't allocate a full sockaddr_storage for tracing 2016-08-20 18:09:26 +02:00
uapi cxlflash: Fix to avoid virtual LUN failover failure 2016-09-15 08:27:49 +02:00
video drm/imx: Match imx-ipuv3-crtc components using device node in platform data 2016-06-07 18:14:37 -07:00
xen xen: Fix page <-> pfn conversion on 32 bit systems 2016-05-11 11:21:14 +02:00
Kbuild