linux/mm
Jan Kara e418b3bbe9 mm: fix XFS oops due to dirty pages without buffers on s390
commit ef5d437f71 upstream.

On s390 any write to a page (even from the kernel itself) sets the
architecture-specific page dirty bit.  Thus when a page is written to via
a buffered write, the HW dirty bit gets set, and when we later map and
unmap the page, page_remove_rmap() finds the dirty bit and calls
set_page_dirty().

Dirtying a page which shouldn't be dirty can cause all sorts of
problems for filesystems.  The bug we observed in practice is that
buffers from the page get freed, so when the page later gets marked as
dirty and writeback writes it, XFS crashes due to an assertion
BUG_ON(!PagePrivate(page)) in page_buffers() called from
xfs_count_page_state().

A similar problem can also happen when the zero_user_segment() call from
xfs_vm_writepage() (or block_write_full_page() for that matter) sets the
hardware dirty bit during writeback, the buffers later get freed, and
then the page is unmapped.

Fix the issue by ignoring the s390 HW dirty bit for page cache pages of
mappings with mapping_cap_account_dirty().  This is safe because for
such mappings, when a page gets marked as writeable in the PTE it is also
marked dirty in do_wp_page() or do_page_fault().  When the dirty bit is
cleared by clear_page_dirty_for_io(), the page gets write-protected in
page_mkclean().  So a pagecache page is writeable if and only if it is
dirty.
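
The effect of the fix can be illustrated with a small userspace model.
This is only a sketch, not the kernel code: every toy_* name below is
hypothetical, the "old behaviour" branch is simplified to "always
transfer the bit", and the structs merely stand in for struct page,
struct address_space and the s390 storage key.  It shows that, with the
fix, the last unmap no longer transfers the hardware dirty bit for
mappings that account dirty pages, while mappings without dirty
accounting (tmpfs and the like) still get it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; all names here are hypothetical. */
struct toy_mapping {
	bool account_dirty;	/* models mapping_cap_account_dirty() */
};

struct toy_page {
	struct toy_mapping *mapping;
	bool hw_dirty;		/* models the s390 hardware dirty bit */
	bool sw_dirty;		/* models PageDirty */
};

/* Any store to the page, even from the kernel, sets the hardware bit. */
static void toy_kernel_write(struct toy_page *page)
{
	page->hw_dirty = true;
}

/* Return the hardware dirty bit and clear it. */
static bool toy_test_and_clear_hw_dirty(struct toy_page *page)
{
	bool was_dirty = page->hw_dirty;

	page->hw_dirty = false;
	return was_dirty;
}

/*
 * Models the dirty-bit transfer done when the last mapping goes away.
 * with_fix == false: simplified old behaviour, always transfer the bit.
 * with_fix == true:  skip the transfer for dirty-accounting mappings.
 */
static void toy_remove_rmap(struct toy_page *page, bool with_fix)
{
	bool transfer = true;

	assert(page != NULL);
	if (with_fix)
		transfer = page->mapping && !page->mapping->account_dirty;

	if (transfer && toy_test_and_clear_hw_dirty(page))
		page->sw_dirty = true;
}

/*
 * Scenario from the text: the kernel writes to a page that is clean in
 * software terms, then the page is unmapped.  Returns true if the unmap
 * left the page marked dirty.
 */
static bool toy_unmap_marks_dirty(bool account_dirty, bool with_fix)
{
	struct toy_mapping mapping = { .account_dirty = account_dirty };
	struct toy_page page = { .mapping = &mapping };

	toy_kernel_write(&page);
	toy_remove_rmap(&page, with_fix);
	return page.sw_dirty;
}
```

In this model, toy_unmap_marks_dirty(true, false) reports the spurious
dirtying described above, toy_unmap_marks_dirty(true, true) does not,
and toy_unmap_marks_dirty(false, true) still transfers the bit for a
mapping without dirty accounting.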

Thanks to Hugh Dickins for pointing out mapping has to have
mapping_cap_account_dirty() for things to work and proposing a cleaned
up variant of the patch.

The patch has survived about two hours of running fsx-linux on tmpfs
while heavily swapping and several days of running on our build machines
where the original problem was triggered.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-05 09:44:26 +01:00
backing-dev.c backing-dev: ensure wakeup_timer is deleted 2011-11-21 14:31:25 -08:00
bootmem.c bootmem/sparsemem: remove limit constraint in alloc_bootmem_section 2012-04-02 09:27:11 -07:00
bounce.c bounce: call flush_dcache_page() after bounce_copy_vec() 2010-09-09 18:57:25 -07:00
cleancache.c mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
compaction.c mm: compaction: introduce sync-light migration for use by compaction 2012-08-01 12:27:18 -07:00
debug-pagealloc.c generic debug pagealloc 2009-04-01 08:59:13 -07:00
dmapool.c mm/dmapool.c: use TASK_UNINTERRUPTIBLE in dma_pool_alloc() 2011-01-13 17:32:48 -08:00
fadvise.c readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM 2010-03-06 11:26:25 -08:00
failslab.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
filemap_xip.c mm/filemap_xip.c: fix race condition in xip_file_fault() 2012-02-13 11:06:07 -08:00
filemap.c cpuset: mm: reduce large amounts of memory barrier related damage v3 2012-08-01 12:27:20 -07:00
fremap.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
highmem.c mm,x86: fix kmap_atomic_push vs ioremap_32.c 2010-10-27 18:03:05 -07:00
huge_memory.c mm: thp: fix BUG on mm->nr_ptes 2012-03-12 10:32:56 -07:00
hugetlb.c mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables 2012-08-15 12:04:29 -07:00
hwpoison-inject.c Fix common misspellings 2011-03-31 11:26:23 -03:00
init-mm.c mm: convert mm->cpu_vm_cpumask into cpumask_var_t 2011-05-25 08:39:21 -07:00
internal.h mm: thp: tail page refcounting fix 2011-11-11 09:36:29 -08:00
Kconfig mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
Kconfig.debug mm: debug-pagealloc: fix kconfig dependency warning 2011-03-22 17:44:02 -07:00
kmemcheck.c kmemcheck: Fix build errors due to missing slab.h 2010-03-30 22:02:32 +09:00
kmemleak-test.c kmemleak: remove memset by using kzalloc 2011-01-27 18:31:51 +00:00
kmemleak.c kmemleak: Do not return a pointer to an object that kmemleak did not get 2011-05-19 17:35:28 +01:00
ksm.c ksm: fix NULL pointer dereference in scan_get_next_rmap_item() 2011-06-15 20:04:02 -07:00
maccess.c maccess,probe_kernel: Make write/read src const void * 2011-05-25 19:56:23 -04:00
madvise.c mm: Hold a file reference in madvise_remove 2012-07-16 08:47:52 -07:00
Makefile mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
memblock.c mm/memblock: properly handle overlaps and fix error path 2011-03-22 17:44:09 -07:00
memcontrol.c mm: change isolate mode from #define to bitwise type 2012-08-01 12:27:16 -07:00
memory_hotplug.c memory hotplug: fix section info double registration bug 2012-10-02 09:47:26 -07:00
memory-failure.c mm: fix wrong argument of migrate_huge_pages() in soft_offline_huge_page() 2012-08-15 12:04:10 -07:00
memory.c mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode 2012-04-02 09:27:10 -07:00
mempolicy.c mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() 2012-10-13 05:28:14 +09:00
mempool.c mm: remove broken 'kzalloc' mempool 2009-09-22 07:17:35 -07:00
migrate.c mm: compaction: introduce sync-light migration for use by compaction 2012-08-01 12:27:18 -07:00
mincore.c mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode 2012-04-02 09:27:10 -07:00
mlock.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
mm_init.c mm: mminit_loglevel cannot be __meminitdata anymore 2008-08-20 15:40:30 -07:00
mmap.c mm: get rid of the most spurious find_vma_prev() users 2011-06-16 00:35:09 -07:00
mmu_context.c exit: fix oops in sync_mm_rss 2010-03-24 16:31:21 -07:00
mmu_notifier.c mm: mmu_notifier: fix freed page still mapped in secondary MMU 2012-08-15 12:04:10 -07:00
mmzone.c mm: page allocator: adjust the per-cpu counter threshold when memory is low 2011-01-13 17:32:31 -08:00
mprotect.c thp: mprotect: transparent huge page support 2011-01-13 17:32:44 -08:00
mremap.c mm: Convert i_mmap_lock to a mutex 2011-05-25 08:39:18 -07:00
msync.c sanitize vfs_fsync calling conventions 2010-05-21 18:31:21 -04:00
nobootmem.c mm: nobootmem: fix sign extend problem in __free_pages_memory() 2012-05-21 09:40:02 -07:00
nommu.c NOMMU: Don't need to clear vm_mm when deleting a VMA 2012-03-12 10:32:56 -07:00
oom_kill.c oom: fix integer overflow of points in oom_badness 2012-01-06 14:13:51 -08:00
page_alloc.c mm/page_alloc: fix the page address of higher page's buddy calculation 2012-10-02 09:47:25 -07:00
page_cgroup.c memcg: fix init_page_cgroup nid with sparsemem 2011-06-15 20:04:01 -07:00
page_io.c block: kill off REQ_UNPLUG 2011-03-10 08:52:27 +01:00
page_isolation.c mm: page_isolation: codeclean fix comment and rm unneeded val init 2010-10-26 16:52:11 -07:00
page-writeback.c writeback: introduce .tagged_writepages for the WB_SYNC_NONE sync stage 2011-10-03 11:40:43 -07:00
pagewalk.c mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode 2012-04-02 09:27:10 -07:00
percpu-km.c percpu: clear memory allocated with the km allocator 2010-10-02 10:28:42 +03:00
percpu-vm.c percpu: fix chunk range calculation 2011-12-21 12:57:37 -08:00
percpu.c percpu: pcpu_embed_first_chunk() should free unused parts after all allocs are complete 2012-05-21 09:40:02 -07:00
pgtable-generic.c mm/pgtable-generic.c: fix CONFIG_SWAP=n build 2011-01-26 10:49:58 +10:00
prio_tree.c sanitize <linux/prefetch.h> usage 2011-05-20 12:50:29 -07:00
quicklist.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
readahead.c readahead: readahead page allocations are OK to fail 2011-05-25 08:39:25 -07:00
rmap.c mm: fix XFS oops due to dirty pages without buffers on s390 2012-11-05 09:44:26 +01:00
shmem.c tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking 2012-10-21 09:17:10 -07:00
slab.c cpuset: mm: reduce large amounts of memory barrier related damage v3 2012-08-01 12:27:20 -07:00
slob.c mm: Remove support for kmem_cache_name() 2011-01-23 21:00:05 +02:00
slub.c cpuset: mm: reduce large amounts of memory barrier related damage v3 2012-08-01 12:27:20 -07:00
sparse-vmemmap.c tree-wide: fix comment/printk typos 2010-11-01 15:38:34 -04:00
sparse.c bootmem/sparsemem: remove limit constraint in alloc_bootmem_section 2012-04-02 09:27:11 -07:00
swap_state.c mm: fix s390 BUG by __set_page_dirty_no_writeback on swap 2012-04-27 09:51:07 -07:00
swap.c mm: fix UP THP spin_is_locked BUGs 2012-02-13 11:06:11 -08:00
swapfile.c mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode 2012-04-02 09:27:10 -07:00
thrash.c vmscan: implement swap token priority aging 2011-06-15 20:03:59 -07:00
truncate.c mm: fix invalidate_complete_page2() lock ordering 2012-10-13 05:28:10 +09:00
util.c mm: nommu: sort mm->mmap list properly 2011-05-25 08:39:05 -07:00
vmalloc.c mm: fix faulty initialization in vmalloc_init() 2012-06-17 11:23:13 -07:00
vmscan.c vmscan: fix initial shrinker size handling 2012-08-01 12:27:20 -07:00
vmstat.c mm/vmstat.c: cache align vm_stat 2012-08-01 12:26:54 -07:00