linux/include
Zhang Yi a42efb79d5 futex: Take hugepages into account when generating futex_key
commit 13d60f4b6a upstream.

The futex_keys of process shared futexes are generated from the page
offset, the mapping host and the mapping index of the futex user space
address. This should result in a unique identifier for each futex.

However, this is not true when futexes are located in different subpages
of a hugepage. The reason is that the mapping index for all those
futexes evaluates to the index of the base page of the hugetlbfs
mapping. So a futex at offset 0 of the hugepage mapping and another
one at offset PAGE_SIZE of the same hugepage mapping have identical
futex_keys. This happens because the futex code blindly uses
page->index.
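
The collision can be illustrated with a small user-space model. This is
only a sketch, not the kernel code: the struct, the function names and
the constants (4 KiB base pages, 2 MiB huge pages) are assumptions made
for illustration, and the index is modelled as "number of the huge page
within the file", which is what every subpage reports before the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sizes (assumption): 4 KiB base pages, 2 MiB huge pages. */
#define MODEL_PAGE_SIZE   4096UL
#define MODEL_HPAGE_SIZE  (2UL * 1024 * 1024)

/* Hypothetical stand-in for the shared-futex part of a futex_key. */
struct model_futex_key {
    uintptr_t mapping;     /* identifies the mapping host */
    unsigned long index;   /* mapping index, as taken from page->index */
    unsigned long offset;  /* offset of the futex inside its base page */
};

/*
 * Model of the pre-fix behaviour: every subpage of a huge page yields
 * the index of the huge page itself, so the subpage number is lost.
 */
static struct model_futex_key model_buggy_key(uintptr_t mapping,
                                              unsigned long file_off)
{
    struct model_futex_key key;

    assert(file_off % sizeof(uint32_t) == 0); /* futex words are 4-byte aligned */
    key.mapping = mapping;
    key.index = file_off / MODEL_HPAGE_SIZE;
    key.offset = file_off % MODEL_PAGE_SIZE;
    return key;
}

/* Two keys refer to the same futex only if all three fields match. */
static bool model_keys_equal(struct model_futex_key a, struct model_futex_key b)
{
    return a.mapping == b.mapping && a.index == b.index && a.offset == b.offset;
}
```

In this model, the keys for file offsets 0 and MODEL_PAGE_SIZE of the
same mapping compare equal, although they describe two different futexes.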

Steps to reproduce the bug:

1. Map a file from hugetlbfs. Initialize pthread_mutex1 at offset 0
   and pthread_mutex2 at offset PAGE_SIZE of the hugetlbfs
   mapping.

   The mutexes must be initialized as PTHREAD_PROCESS_SHARED because
   PTHREAD_PROCESS_PRIVATE mutexes are not affected by this issue as
   their keys solely depend on the user space address.

2. Lock mutex1 and mutex2

3. Create thread1 and in the thread function lock mutex1, which
   results in thread1 blocking on the locked mutex1.

4. Create thread2 and in the thread function lock mutex2, which
   results in thread2 blocking on the locked mutex2.

5. Unlock mutex2. Despite the fact that mutex2 got unlocked, thread2
   still blocks on mutex2 because the futex_key points to mutex1.
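
Step 1 above could be sketched as follows. The helper name is
hypothetical, and mapping the hugetlbfs file (an mmap() of a file on a
hugetlbfs mount) is assumed to be done by the caller, who passes the
base address of the mapping. The helper would be called once with
offset 0 and once with offset PAGE_SIZE, for example the value returned
by sysconf(_SC_PAGESIZE).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Initialise one PTHREAD_PROCESS_SHARED mutex at base + offset.
 * Returns 0 on success or a pthread error number; on success *out
 * points at the initialised mutex.
 */
static int init_pshared_mutex_at(void *base, size_t offset,
                                 pthread_mutex_t **out)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t *m;
    int rc;

    assert(base != NULL && out != NULL);
    m = (pthread_mutex_t *)((char *)base + offset);

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Process-shared mutexes use the shared futex_key path. */
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    if (rc == 0)
        *out = m;
    return rc;
}
```

Steps 2 to 5 then only need pthread_mutex_lock(), pthread_create() and
pthread_mutex_unlock() on the two returned mutexes.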

To solve this issue we need to take the normal page index of the page
which contains the futex into account, if the futex is in a hugetlbfs
mapping. In other words, we calculate the normal page mapping index of
the subpage in the hugetlbfs mapping.
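
A minimal sketch of that calculation, assuming the index of the head
page counts huge pages and that a huge page consists of 2^order base
pages. The function and parameter names are hypothetical and this is a
user-space illustration, not the helper added by the patch.

```c
#include <assert.h>

/*
 * Compute the base-page ("normal page") mapping index of a subpage.
 *
 * head_index: mapping index of the huge page, in units of huge pages
 * order:      log2 of the number of base pages per huge page
 * subpage_nr: position of the subpage inside the huge page
 */
static unsigned long sketch_basepage_index(unsigned long head_index,
                                           unsigned int order,
                                           unsigned long subpage_nr)
{
    assert(order < 8 * sizeof(unsigned long));
    assert(subpage_nr < (1UL << order));

    /* Scale the huge page index to base pages, then add the subpage. */
    return (head_index << order) + subpage_nr;
}
```

With order 9 (2 MiB huge pages on 4 KiB base pages), the subpages at
offset 0 and offset PAGE_SIZE of the first huge page get the indices 0
and 1, so their keys no longer collide. With order 0 the result is the
head index itself, which matches the unchanged behaviour for normal
pages.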

Mappings which are not based on hugetlbfs are not affected and still
use page->index.

Thanks to Mel Gorman who provided a patch for adding proper evaluation
functions to the hugetlbfs code to avoid exposing hugetlbfs specific
details to the futex code.

[ tglx: Massaged changelog ]

Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
Reviewed-by: 'Mel Gorman' <mgorman@suse.de>
Acked-by: 'Darren Hart' <dvhart@linux.intel.com>
Cc: 'Peter Zijlstra' <peterz@infradead.org>
Link: http://lkml.kernel.org/r/000101ce71a6%24a83c5880%24f8b50980%24@com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20 08:26:28 -07:00
acpi Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux 2012-05-05 10:06:06 -07:00
asm-generic mm: allow arch code to control the user page table ceiling 2013-05-07 19:51:55 -07:00
crypto crypto: user - Fix lookup of algorithms with IV generator 2012-03-29 19:52:47 +08:00
drm drm/radeon: add new richland pci ids 2013-05-11 13:48:14 -07:00
keys
linux futex: Take hugepages into account when generating futex_key 2013-08-20 08:26:28 -07:00
math-emu
media Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2012-05-14 11:23:37 -07:00
misc
mtd
net ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET pending data 2013-07-28 16:26:02 -07:00
pcmcia
rdma infiniband: pass rdma_cm module to netlink_dump_start 2012-10-28 10:14:15 -07:00
rxrpc
scsi SCSI: libsas: fix taskfile corruption in sas_ata_qc_fill_rtf 2012-07-16 09:04:37 -07:00
sound ALSA: Add a reference counter to card instance 2012-11-17 13:16:13 -08:00
target target: Add link_magic for fabric allow_link destination target_items 2013-01-21 11:45:24 -08:00
trace xen/mmu: Use Xen specific TLB flush instead of the generic one. 2012-11-17 13:15:54 -08:00
video atmel_lcdfb: fix 16-bpp modes on older SOCs 2013-03-20 13:05:00 -07:00
xen xen/blkback: correctly respond to unknown, non-native requests 2013-04-05 10:04:18 -07:00
Kbuild