linux/arch
Paolo Abeni e47e717666 x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings
commit 236222d393 upstream.

According to the Intel datasheet, the REP MOVSB instruction
exposes a pretty heavy setup cost (50 ticks), which hurts
short string copy operations.

This change tries to avoid this cost by calling the explicit
loop available in the unrolled code for strings shorter
than 64 bytes.

The 64 bytes cutoff value is arbitrary from the code logic
point of view - it has been selected based on measurements,
as the largest value that still ensures a measurable gain.

Micro benchmarks of the __copy_from_user() function with
lengths in the [0-63] range show this performance gain
(shorter the string, larger the gain):

 - in the [55%-4%] range on Intel Xeon(R) CPU E5-2690 v4
 - in the [72%-9%] range on Intel Core i7-4810MQ

Other tested CPUs - namely Intel Atom S1260 and AMD Opteron
8216 - show no difference, because they do not expose the
ERMS feature bit.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4533a1d101fd460f80e21329a34928fad521c1d4.1498744345.git.pabeni@redhat.com
[ Clarified the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
2017-07-15 11:57:47 +02:00
..
alpha osf_wait4(): fix infoleak 2017-05-25 14:30:17 +02:00
arc mm: larger stack guard gap, between vmas 2017-06-26 07:13:11 +02:00
arm ARM: 8685/1: ensure memblock-limit is pmd-aligned 2017-07-05 14:37:22 +02:00
arm64 ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation 2017-07-05 14:37:21 +02:00
avr32 avr32: off by one in at32_init_pio() 2016-10-07 15:23:45 +02:00
blackfin net: smc91x: fix SMC accesses 2016-09-30 10:18:37 +02:00
c6x c6x/ptrace: Remove useless PTRACE_SETREGSET implementation 2017-03-31 09:49:53 +02:00
cris cris: Only build flash rescue image if CONFIG_ETRAX_AXISFLASHMAP is selected 2017-01-12 11:22:48 +01:00
frv mm: larger stack guard gap, between vmas 2017-06-26 07:13:11 +02:00
h8300 h8300/ptrace: Fix incorrect register transfer count 2017-03-31 09:49:53 +02:00
hexagon hexagon: fix strncpy_from_user() error return 2016-09-24 10:07:44 +02:00
ia64 ia64: copy_from_user() should zero the destination on access_ok() failure 2016-09-24 10:07:46 +02:00
m32r m32r: fix __get_user() 2016-09-24 10:07:43 +02:00
m68k m68k: Fix ndelay() macro 2016-12-15 08:49:23 -08:00
metag metag/uaccess: Check access_ok in strncpy_from_user 2017-05-25 14:30:16 +02:00
microblaze microblaze: fix copy_from_user() 2016-09-24 10:07:43 +02:00
mips MIPS: ralink: fix MT7628 wled_an pinmux gpio 2017-07-05 14:37:17 +02:00
mn10300 mn10300: copy_from_user() should zero on access_ok() failure... 2016-09-24 10:07:45 +02:00
nios2 nios2: reserve boot memory for device tree 2017-04-12 12:38:34 +02:00
openrisc openrisc: fix the fix of copy_from_user() 2016-09-24 10:07:46 +02:00
parisc mm: larger stack guard gap, between vmas 2017-06-26 07:13:11 +02:00
powerpc powerpc/eeh: Enable IO path on permanent error 2017-07-05 14:37:18 +02:00
s390 s390/ctl_reg: make __ctl_load a full memory barrier 2017-07-05 14:37:20 +02:00
score score: fix copy_from_user() and friends 2016-09-24 10:07:44 +02:00
sh mm: larger stack guard gap, between vmas 2017-06-26 07:13:11 +02:00
sparc mm: larger stack guard gap, between vmas 2017-06-26 07:13:11 +02:00
tile mm: larger stack guard gap, between vmas 2017-06-26 07:13:11 +02:00
um um: Don't discard .text.exit section 2016-09-07 08:32:38 +02:00
unicore32 pwm: Changes for v4.4-rc1 2015-11-11 09:16:10 -08:00
x86 x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings 2017-07-15 11:57:47 +02:00
xtensa mm: larger stack guard gap, between vmas 2017-06-26 07:13:11 +02:00
.gitignore
Kconfig