linux/arch
Stefano Stabellini b9247be526 xen: mark local pages as FOREIGN in the m2p_override
commit b9e0d95c04 upstream.

When the frontend and the backend reside on the same domain, even if we
add pages to the m2p_override, these pages will never be returned by
mfn_to_pfn because the check "get_phys_to_machine(pfn) != mfn" will
always fail, so the pfn of the frontend will be returned instead
(resulting in a deadlock because the frontend pages are already locked).

INFO: task qemu-system-i38:1085 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-system-i38 D ffff8800cfc137c0     0  1085      1 0x00000000
 ffff8800c47ed898 0000000000000282 ffff8800be4596b0 00000000000137c0
 ffff8800c47edfd8 ffff8800c47ec010 00000000000137c0 00000000000137c0
 ffff8800c47edfd8 00000000000137c0 ffffffff82213020 ffff8800be4596b0
Call Trace:
 [<ffffffff81101ee0>] ? __lock_page+0x70/0x70
 [<ffffffff81a0fdd9>] schedule+0x29/0x70
 [<ffffffff81a0fe80>] io_schedule+0x60/0x80
 [<ffffffff81101eee>] sleep_on_page+0xe/0x20
 [<ffffffff81a0e1ca>] __wait_on_bit_lock+0x5a/0xc0
 [<ffffffff81101ed7>] __lock_page+0x67/0x70
 [<ffffffff8106f750>] ? autoremove_wake_function+0x40/0x40
 [<ffffffff811867e6>] ? bio_add_page+0x36/0x40
 [<ffffffff8110b692>] set_page_dirty_lock+0x52/0x60
 [<ffffffff81186021>] bio_set_pages_dirty+0x51/0x70
 [<ffffffff8118c6b4>] do_blockdev_direct_IO+0xb24/0xeb0
 [<ffffffff811e71a0>] ? ext3_get_blocks_handle+0xe00/0xe00
 [<ffffffff8118ca95>] __blockdev_direct_IO+0x55/0x60
 [<ffffffff811e71a0>] ? ext3_get_blocks_handle+0xe00/0xe00
 [<ffffffff811e91c8>] ext3_direct_IO+0xf8/0x390
 [<ffffffff811e71a0>] ? ext3_get_blocks_handle+0xe00/0xe00
 [<ffffffff81004b60>] ? xen_mc_flush+0xb0/0x1b0
 [<ffffffff81104027>] generic_file_aio_read+0x737/0x780
 [<ffffffff813bedeb>] ? gnttab_map_refs+0x15b/0x1e0
 [<ffffffff811038f0>] ? find_get_pages+0x150/0x150
 [<ffffffff8119736c>] aio_rw_vect_retry+0x7c/0x1d0
 [<ffffffff811972f0>] ? lookup_ioctx+0x90/0x90
 [<ffffffff81198856>] aio_run_iocb+0x66/0x1a0
 [<ffffffff811998b8>] do_io_submit+0x708/0xb90
 [<ffffffff81199d50>] sys_io_submit+0x10/0x20
 [<ffffffff81a18d69>] system_call_fastpath+0x16/0x1b

The explanation is in the comment within the code:

We need to do this because the pages shared by the frontend
(xen-blkfront) can be already locked (lock_page, called by
do_read_cache_page); when the userspace backend tries to use them
with direct_IO, mfn_to_pfn returns the pfn of the frontend, so
do_blockdev_direct_IO is going to try to lock the same pages
again resulting in a deadlock.

A simplified call graph looks like this:

pygrub                          QEMU
-----------------------------------------------
do_read_cache_page              io_submit
  |                              |
lock_page                       ext3_direct_IO
                                 |
                                bio_add_page
                                 |
                                lock_page

Internally the xen-blkback uses m2p_add_override to swizzle (temporarily)
a 'struct page' to have a different MFN (so that it can point to another
guest). It also can easily find out whether another pfn corresponding
to the mfn exists in the m2p, and can set the FOREIGN bit
in the p2m, making sure that mfn_to_pfn returns the pfn of the backend.

This allows the backend to perform direct_IO on these pages, but as a
side effect prevents the frontend from using get_user_pages_fast on
them while they are being shared with the backend.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-26 15:00:40 -07:00
..
alpha alpha: silence 'const' warning in sys_marvel.c 2012-05-02 15:54:06 -04:00
arm Input: eeti_ts: pass gpio value instead of IRQ 2012-08-15 08:10:33 -07:00
avr32 avr32: fix nop compile fails from system.h split up 2012-04-04 08:23:44 -07:00
blackfin blackfin: fix ifdef fustercluck in mach-bf538/boards/ezkit.c 2012-04-26 14:46:51 -04:00
c6x irq: Kill pointless irqd_to_hw export 2012-04-10 22:39:17 -06:00
cris Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
frv frv: delete incorrect task prototypes causing compile fail 2012-05-17 18:00:51 -07:00
h8300 Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
hexagon hexagon: add missing cpu.h include 2012-04-23 12:57:24 -05:00
ia64 random: remove rand_initialize_irq() 2012-08-15 08:10:29 -07:00
m32r Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
m68k m68k: Correct the Atari ALLOWINT definition 2012-08-09 08:31:53 -07:00
microblaze microblaze: Do not select GENERIC_GPIO by default 2012-06-10 00:36:05 +09:00
mips posix_types.h: Cleanup stale __NFDBITS and related definitions 2012-08-09 08:31:39 -07:00
mn10300 mn10300/CPU hotplug: Add missing call to notify_cpu_starting() 2012-05-15 18:16:57 -07:00
openrisc Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
parisc PARISC: fix TLB fault path on PA2.0 narrow systems 2012-06-10 00:36:07 +09:00
powerpc powerpc/85xx: use the BRx registers to enable indirect mode on the P1022DS 2012-08-09 08:31:27 -07:00
s390 s390/compat: fix mmap compat system calls 2012-08-26 15:00:39 -07:00
score Delete all instances of asm/system.h 2012-03-28 18:30:03 +01:00
sh sh: Fix up tracepoint build fallout from static key introduction. 2012-04-27 11:12:38 +09:30
sparc KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compat 2012-06-01 15:18:16 +08:00
tile tile: fix bug where fls(0) was not returning 0 2012-06-01 15:18:27 +08:00
um um: Implement a custom pte_same() function 2012-06-01 15:18:18 +08:00
unicore32 Merge branch 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping 2012-04-04 17:13:43 -07:00
x86 xen: mark local pages as FOREIGN in the m2p_override 2012-08-26 15:00:40 -07:00
xtensa xtensa: fix build fail on undefined ack_bad_irq 2012-04-26 18:35:32 -04:00
.gitignore
Kconfig Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile 2012-03-29 14:49:45 -07:00