linux/include
Michal Hocko 820ca57722 mm: allow GFP_{FS,IO} for page_cache_read page cache allocation
commit c20cd45eb0 upstream.

page_cache_read has historically used page_cache_alloc_cold to
allocate a new page.  This means that mapping_gfp_mask is used as the
base for the gfp_mask.  Many filesystems set this mask to GFP_NOFS to
prevent fs recursion issues.  page_cache_read is called from the
vm_operations_struct::fault() context during a page fault.  This
context doesn't normally need the reclaim protection.

ceph and ocfs2, which call filemap_fault from their fault handlers, seem
to be OK because they do not take any fs lock before invoking the generic
implementation.  xfs, which takes XFS_MMAPLOCK_SHARED, is safe from the
reclaim recursion POV because this lock serializes truncate and punch
hole with page faults and does not get involved in reclaim.

There is simply no reason to deliberately use a weaker allocation
context when __GFP_FS | __GFP_IO can be used.  The GFP_NOFS protection
might even be harmful.  There is a push to fail GFP_NOFS allocations
rather than loop within the allocator indefinitely with a very limited
reclaim ability.  Once we start failing those requests, the OOM killer
might be triggered prematurely because the page cache allocation failure
is propagated up the page fault path and ends up in
pagefault_out_of_memory.

We cannot play with mapping_gfp_mask directly because that would be racy
wrt. parallel page faults and it might interfere with other users who
really rely on NOFS semantics from the stored gfp_mask.  The mask also
belongs to the inode, so it would even be a layering violation.  What we
can do instead is to push the gfp_mask into struct vm_fault and allow the
fs layer to overwrite it should the callback need to be called with a
different allocation context.
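
A minimal userspace sketch of the idea above, not the kernel code itself:
the mask travels with each fault, so a filesystem handler can narrow it
for its own fault without touching the mask stored in the shared
mapping.  All sketch_* names and the flag values are hypothetical
stand-ins chosen for illustration.

```c
/* Hypothetical flag bits for illustration; not the kernel's real values. */
typedef unsigned int gfp_t;
#define SKETCH_GFP_IO 0x1u
#define SKETCH_GFP_FS 0x2u

/* Stand-in for the per-inode mapping; shared by every fault on the file. */
struct sketch_mapping {
	gfp_t gfp_mask;
};

/* Stand-in for struct vm_fault: one instance per page fault. */
struct sketch_vm_fault {
	gfp_t gfp_mask;
};

/* Generic handler: "allocates" with whatever mask the fault carries. */
static gfp_t sketch_generic_fault(const struct sketch_vm_fault *vmf)
{
	return vmf->gfp_mask;
}

/*
 * Hypothetical fs handler that holds a lock reclaim could also need,
 * so it drops the FS bit for this one fault before calling the generic
 * implementation.  The shared mapping mask is never written.
 */
static gfp_t sketch_fs_fault(struct sketch_vm_fault *vmf)
{
	vmf->gfp_mask &= ~SKETCH_GFP_FS;
	return sketch_generic_fault(vmf);
}
```

Because each fault owns its copy of the mask, two parallel faults on the
same file cannot race on it, which is the problem with editing
mapping_gfp_mask in place.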

Initialize the default to (mapping_gfp_mask | __GFP_FS | __GFP_IO)
because this should normally be safe from the page fault path.  Why do
we care about mapping_gfp_mask at all then? Because it holds not only
reclaim protection flags but might also contain zone and movability
restrictions (GFP_DMA32, __GFP_MOVABLE and others), so we have to
respect those.
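
The default described above can be sketched as a plain bitwise OR.  This
is an illustrative userspace model, assuming made-up flag values and a
hypothetical helper name (sketch_fault_gfp_mask); it only shows that
OR-ing in the FS and IO bits leaves zone and movability bits from the
mapping intact.

```c
/* Hypothetical flag bits for illustration; not the kernel's real values. */
typedef unsigned int gfp_t;
#define SKETCH_GFP_DMA32   0x04u	/* zone restriction */
#define SKETCH_GFP_MOVABLE 0x08u	/* movability hint */
#define SKETCH_GFP_IO      0x40u	/* reclaim may start I/O */
#define SKETCH_GFP_FS      0x80u	/* reclaim may call into the fs */

/*
 * Default mask for a page fault: keep every bit the mapping asked for
 * and additionally allow FS and IO reclaim.
 */
static gfp_t sketch_fault_gfp_mask(gfp_t mapping_mask)
{
	return mapping_mask | SKETCH_GFP_FS | SKETCH_GFP_IO;
}
```

For a NOFS-style mapping mask of (DMA32 | MOVABLE | IO), the helper
returns (DMA32 | MOVABLE | IO | FS): the restrictions survive and only
the reclaim permissions widen.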

Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Jan Kara <jack@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24 09:32:11 +02:00
acpi Merge branch 'acpi-pci' 2015-11-07 01:30:10 +01:00
asm-generic mm/vmalloc: add interfaces to free unmapped page table 2018-03-28 18:40:14 +02:00
clocksource
crypto crypto: poly1305 - remove ->setkey() method 2018-02-16 20:09:43 +01:00
drm drm: Allow determining if current task is output poll worker 2018-03-18 11:17:48 +01:00
dt-bindings ARM: dts: Fix omap3 off mode pull defines 2017-11-21 09:21:19 +01:00
keys
kvm KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active 2015-11-24 18:07:40 +01:00
linux mm: allow GFP_{FS,IO} for page_cache_read page cache allocation 2018-04-24 09:32:11 +02:00
math-emu
media videobuf2-core: Check user space planes array in dqbuf 2016-05-04 14:48:50 -07:00
memory
misc
net slip: Check if rstate is initialized before uncompressing 2018-04-24 09:32:04 +02:00
pcmcia
ras
rdma RDMA/ucma: Introduce safer rdma_addr_size() variants 2018-04-08 11:51:59 +02:00
rxrpc
scsi scsi: sg: disable SET_FORCE_LOW_DMA 2018-01-23 19:50:14 +01:00
soc ARM: at91: define LPDDR types 2017-03-12 06:37:24 +01:00
sound ALSA: pcm: Return -EBUSY for OSS ioctls changing busy streams 2018-04-24 09:32:09 +02:00
target target: Avoid early CMD_T_PRE_EXECUTE failures during ABORT_TASK 2018-01-17 09:35:31 +01:00
trace clk: fix a panic error caused by accessing NULL pointer 2018-02-25 11:03:41 +01:00
uapi PCI: Make PCI_ROM_ADDRESS_MASK a 32-bit constant 2018-04-08 11:51:57 +02:00
video drm/imx: Match imx-ipuv3-crtc components using device node in platform data 2016-06-07 18:14:37 -07:00
xen fix xen_swiotlb_dma_mmap prototype 2017-10-05 09:41:48 +02:00
Kbuild