linux/mm
Xishi Qiu 6843d9254c mm: fix process accidentally killed by mce because of huge page migration
Based on upstream commit c8721bbbdd, with only the
bugfix portion pulled out.

Hi Naoya or Greg,

We found a bug in 3.10.x.
The problem is that a hwpoisoned hugepage can accidentally end up in the
free hugepage list. It can happen in the following scenario:

        process A                           process B

  migrate_huge_page
  put_page (old hugepage)
    linked to free hugepage list
                                     hugetlb_fault
                                       hugetlb_no_page
                                         alloc_huge_page
                                           dequeue_huge_page_vma
                                             dequeue_huge_page_node
                                               (steal hwpoisoned hugepage)
  set_page_hwpoison_huge_page
  dequeue_hwpoisoned_huge_page
    (fail to dequeue)
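
The interleaving above can be replayed sequentially in a small userspace
sketch. Everything here is hypothetical and simplified for illustration:
struct toy_hugepage and the toy_* helpers are not kernel code, and the three
flags only stand in for the free-list linkage, the hwpoison mark, and
ownership by the faulting process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a huge page; not the kernel's struct page. */
struct toy_hugepage {
	bool on_free_list;	/* linked to the free hugepage list */
	bool hwpoison;		/* marked as hardware-poisoned */
	bool in_use;		/* owned by a faulting process */
};

/* Process A, first step: the old hugepage is released after migration. */
static void toy_put_page(struct toy_hugepage *p)
{
	assert(!p->in_use);
	p->on_free_list = true;
}

/* Process B: the allocator takes whatever sits on the free list. */
static struct toy_hugepage *toy_alloc(struct toy_hugepage *p)
{
	if (!p->on_free_list)
		return NULL;
	p->on_free_list = false;
	p->in_use = true;
	return p;
}

/* Process A, second step: mark the page poisoned, then try to dequeue it. */
static bool toy_poison_and_dequeue(struct toy_hugepage *p)
{
	p->hwpoison = true;
	if (!p->on_free_list)
		return false;	/* "fail to dequeue": someone already took it */
	p->on_free_list = false;
	return true;
}

/*
 * Replays the scenario in the order shown in the diagram. Returns true when
 * process B ends up owning a page that process A has marked hwpoisoned.
 */
static bool toy_replay_race(void)
{
	struct toy_hugepage page = { false, false, false };
	struct toy_hugepage *stolen;
	bool dequeued;

	toy_put_page(&page);			/* A: put_page (old hugepage) */
	stolen = toy_alloc(&page);		/* B: steals the page */
	dequeued = toy_poison_and_dequeue(&page);	/* A: too late */

	return stolen != NULL && !dequeued && stolen->hwpoison && stolen->in_use;
}
```

Calling toy_replay_race() returns true, which corresponds to the state in
which a later access by process B hits the poisoned page.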

I reproduced this bug: one process keeps allocating huge pages while I
use the sysfs interface to soft offline a huge page, and then I received:
"MCE: Killing UCP:2717 due to hardware memory corruption fault at 8200034"

Upstream kernel is free from this bug because of these two commits:

f15bdfa802
mm/memory-failure.c: fix memory leak in successful soft offlining

c8721bbbdd
mm: memory-hotplug: enable memory hotplug to handle hugepage

Although the first commit addresses a memory leak, it also moves
unset_migratetype_isolate(), which is important to avoid the race.
The latter is not a bug fix and is too big, so I wrote a smaller one.

The following patch fixes this bug (please apply f15bdfa802 first).
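
The patch body is not reproduced in this message. As a rough illustration
only, and assuming the fix relies on the page staying isolated until it has
been marked hwpoisoned, an allocator that skips isolated free pages might be
sketched as below. struct toy_page and toy_dequeue_skip_isolated() are
hypothetical names for this sketch, not the kernel's data structures or the
actual change to mm/hugetlb.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical free-list entry; not the kernel's struct page. */
struct toy_page {
	bool isolated;	/* still isolated by the soft-offline path */
	bool taken;	/* already handed out by the allocator */
};

/*
 * Return the first free page that is not isolated, or NULL when every
 * remaining free page is isolated or already taken. An isolated page is
 * left alone so the soft-offline path can still dequeue it later.
 */
static struct toy_page *toy_dequeue_skip_isolated(struct toy_page *free_list,
						  size_t n)
{
	size_t i;

	assert(free_list != NULL || n == 0);
	for (i = 0; i < n; i++) {
		struct toy_page *p = &free_list[i];

		if (p->taken || p->isolated)
			continue;
		p->taken = true;
		return p;
	}
	return NULL;
}
```

With a list holding one isolated and one ordinary page, the first call
returns the ordinary page and the second returns NULL, so the allocation
fails rather than handing out the page that is about to be poisoned.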

Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-20 11:06:12 -08:00
backing-dev.c
balloon_compaction.c
bootmem.c
bounce.c mm/bounce.c: fix a regression where MS_SNAP_STABLE (stable pages snapshotting) was ignored 2013-10-13 16:08:33 -07:00
cleancache.c
compaction.c mm/compaction: respect ignore_skip_hint in update_pageblock_skip 2014-01-09 12:24:23 -08:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap_xip.c
filemap.c
fremap.c mm: fix use-after-free in sys_remap_file_pages 2014-01-09 12:24:24 -08:00
frontswap.c frontswap: fix incorrect zeroing and allocation size for frontswap_map 2013-06-12 16:29:46 -07:00
highmem.c
huge_memory.c thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only 2014-01-25 08:27:12 -08:00
hugetlb_cgroup.c
hugetlb.c mm: fix process accidentally killed by mce because of huge page migration 2014-02-20 11:06:12 -08:00
hwpoison-inject.c
init-mm.c
internal.h
interval_tree.c
Kconfig
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c
ksm.c
maccess.c
madvise.c
Makefile
memblock.c
memcontrol.c memcg: fix memcg_size() calculation 2014-01-09 12:24:24 -08:00
memory_hotplug.c mm/memory_hotplug.c: fix printk format warnings 2013-05-24 16:22:52 -07:00
memory-failure.c mm/memory-failure.c: fix memory leak in successful soft offlining 2014-02-20 11:06:12 -08:00
memory.c mm: numa: Sanitize task_numa_fault() callsites 2013-11-13 12:05:34 +09:00
mempolicy.c mm/mempolicy.c: fix mempolicy printing in numa_maps 2014-02-06 11:08:12 -08:00
mempool.c
migrate.c mm: numa: avoid unnecessary work on the failure path 2014-01-09 12:24:23 -08:00
mincore.c
mlock.c
mm_init.c
mmap.c mm: ensure get_unmapped_area() returns higher address than mmap_min_addr 2013-12-04 10:56:39 -08:00
mmu_context.c
mmu_notifier.c mm: mmu_notifier: re-fix freed page still mapped in secondary MMU 2013-05-24 16:22:51 -07:00
mmzone.c
mprotect.c mm: fix TLB flush race between migration, and change_protection_range 2014-01-09 12:24:23 -08:00
mremap.c
msync.c
nobootmem.c
nommu.c
oom_kill.c mm, oom: base root bonus on current usage 2014-02-13 13:48:02 -08:00
page_alloc.c mm/memory-hotplug: fix lowmem count overflow when offline pages 2013-07-21 18:21:36 -07:00
page_cgroup.c
page_io.c
page_isolation.c
page-writeback.c mm: __set_page_dirty_nobuffers() uses spin_lock_irqsave() instead of spin_lock_irq() 2014-02-20 11:06:11 -08:00
pagewalk.c mm/pagewalk.c: fix walk_page_range() access of wrong PTEs 2013-11-13 12:05:34 +09:00
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c mm: fix TLB flush race between migration, and change_protection_range 2014-01-09 12:24:23 -08:00
process_vm_access.c
quicklist.c
readahead.c
rmap.c mm/hugetlb: check for pte NULL pointer in __page_check_address() 2014-01-09 12:24:23 -08:00
shmem.c cope with potentially long ->d_dname() output for shmem/hugetlb 2013-10-18 07:45:45 -07:00
slab_common.c slab: prevent warnings when allocating with __GFP_NOWARN 2013-06-13 10:01:58 +03:00
slab.c slab: fix init_lock_keys 2013-07-21 18:21:26 -07:00
slab.h memcg: check that kmem_cache has memcg_params before accessing it 2013-09-07 22:09:58 -07:00
slob.c
slub.c slub: Fix calculation of cpu slabs 2014-02-13 13:48:00 -08:00
sparse-vmemmap.c
sparse.c
swap_state.c swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion 2013-06-12 16:29:45 -07:00
swap.c mm: hugetlbfs: fix hugetlbfs optimization 2014-02-06 11:08:12 -08:00
swapfile.c frontswap: fix incorrect zeroing and allocation size for frontswap_map 2013-06-12 16:29:46 -07:00
truncate.c
util.c
vmalloc.c mm/vmalloc.c: fix an overflow bug in alloc_vmap_area() 2013-11-13 12:05:34 +09:00
vmpressure.c
vmscan.c mm/page-writeback.c: do not count anon pages as dirtyable memory 2014-02-13 13:48:00 -08:00
vmstat.c mm: numa: return the number of base pages altered by protection changes 2013-12-08 07:29:27 -08:00