linux/include
KAMEZAWA Hiroyuki ea1c627781 memcg: add mem_cgroup_replace_page_cache() to fix LRU issue
commit ab936cbcd0 upstream.

Commit ef6a3c6311 ("mm: add replace_page_cache_page() function") added a
function replace_page_cache_page().  This function replaces a page in the
radix-tree with a new page.  WHen doing this, memory cgroup needs to fix
up the accounting information.  memcg need to check PCG_USED bit etc.

In some(many?) cases, 'newpage' is on LRU before calling
replace_page_cache().  So, memcg's LRU accounting information should be
fixed, too.

This patch adds mem_cgroup_replace_page_cache() and removes the old hooks.
 In that function, old pages will be unaccounted without touching
res_counter and new page will be accounted to the memcg (of old page).
WHen overwriting pc->mem_cgroup of newpage, take zone->lru_lock and avoid
races with LRU handling.

Background:
  replace_page_cache_page() is called by FUSE code in its splice() handling.
  Here, 'newpage' is replacing oldpage but this newpage is not a newly allocated
  page and may be on LRU. LRU mis-accounting will be critical for memory cgroup
  because rmdir() checks the whole LRU is empty and there is no account leak.
  If a page is on the other LRU than it should be, rmdir() will fail.

This bug was added in March 2011, but no bug report yet.  I guess there
are not many people who use memcg and FUSE at the same time with upstream
kernels.

The result of this bug is that admin cannot destroy a memcg because of
account leak.  So, no panic, no deadlock.  And, even if an active cgroup
exist, umount can succseed.  So no problem at shutdown.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Miklos Szeredi <mszeredi@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-25 17:24:43 -08:00
..
acpi Merge branches 'd3cold', 'bugzilla-37412' and 'bugzilla-38152' into release 2011-07-14 00:16:38 -04:00
asm-generic Merge branches 'gpio/merge' and 'spi/merge' of git://git.secretlab.ca/git/linux-2.6 2011-06-17 10:36:32 -07:00
crypto
drm drm/radeon/kms: add some new pci ids 2011-12-21 12:57:45 -08:00
keys
linux memcg: add mem_cgroup_replace_page_cache() to fix LRU issue 2012-01-25 17:24:43 -08:00
math-emu
media [media] tuner-core/v4l2-subdev: document that the type field has to be filled in 2011-07-07 15:04:23 -03:00
mtd
net sctp: fix incorrect overflow check on autoclose 2012-01-06 14:14:08 -08:00
pcmcia pcmcia: Make declaration and uses of struct pcmcia_device_id const 2011-05-06 07:46:15 +02:00
rdma Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband 2011-05-26 12:13:57 -07:00
rxrpc
scsi [SCSI] libsas: Add option for SATA soft reset 2011-05-26 22:49:33 -05:00
sound ALSA: sb16 - Fix build errors on MIPS and others with 13bit ioctl size 2011-06-30 15:33:57 +02:00
target [SCSI] target: Convert REPORT_LUNs to use int_to_scsilun 2011-05-24 13:02:42 -04:00
trace Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2011-06-21 10:22:35 -07:00
video Merge branches 'common/fbdev' and 'common/fbdev-meram' of master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6 2011-05-24 15:49:57 +09:00
xen xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX. 2012-01-25 17:24:41 -08:00
Kbuild