linux/arch/x86/kernel
Rik van Riel 5f409e20b7 x86/fpu: Defer FPU state load until return to userspace
Defer loading of FPU state until return to userspace. This gives
the kernel the potential to skip loading FPU state for tasks that
stay in kernel mode, or for tasks that end up with repeated
invocations of kernel_fpu_begin() & kernel_fpu_end().

The fpregs_lock/unlock() section ensures that the registers remain
unchanged. Otherwise a context switch or a bottom half could save the
registers to its FPU context and the processor's FPU registers would
became random if modified at the same time.

KVM swaps the host/guest registers on entry/exit path. This flow has
been kept as is. First it ensures that the registers are loaded and then
saves the current (host) state before it loads the guest's registers. The
swap is done at the very end with disabled interrupts so it should not
change anymore before theg guest is entered. The read/save version seems
to be cheaper compared to memcpy() in a micro benchmark.

Each thread gets TIF_NEED_FPU_LOAD set as part of fork() / fpu__copy().
For kernel threads, this flag gets never cleared which avoids saving /
restoring the FPU state for kernel threads and during in-kernel usage of
the FPU registers.

 [
   bp: Correct and update commit message and fix checkpatch warnings.
   s/register/registers/ where it is used in plural.
   minor comment corrections.
   remove unused trace_x86_fpu_activate_state() TP.
 ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Babu Moger <Babu.Moger@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Waiman Long <longman@redhat.com>
Cc: x86-ml <x86@kernel.org>
Cc: Yi Wang <wang.yi59@zte.com.cn>
Link: https://lkml.kernel.org/r/20190403164156.19645-24-bigeasy@linutronix.de
2019-04-12 19:34:47 +02:00
..
acpi treewide: add checks for the return value of memblock_alloc*() 2019-03-12 10:04:02 -07:00
apic treewide: add checks for the return value of memblock_alloc*() 2019-03-12 10:04:02 -07:00
cpu x86/resctrl: Remove unused variable 2019-03-24 22:09:27 +01:00
fpu x86/fpu: Defer FPU state load until return to userspace 2019-04-12 19:34:47 +02:00
kprobes x86/kprobes: Prohibit probing on IRQ handlers directly 2019-02-13 08:16:39 +01:00
.gitignore
alternative.c Merge branch 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-06 08:45:46 -08:00
amd_gart_64.c x86/amd_gart: fix unmapping of non-GART mappings 2019-01-05 08:27:32 +01:00
amd_nb.c x86/amd_nb: Add PCI device IDs for family 17h, model 30h 2018-11-07 21:36:09 +01:00
apb_timer.c
aperture_64.c x86/gart: Exclude GART aperture from kcore 2019-03-23 12:11:49 +01:00
apm_32.c x86/APM: Fix build warning when PROC_FS is not enabled 2018-09-15 10:16:25 +02:00
asm-offsets_32.c x86/entry/32: Load task stack from x86_tss.sp1 in SYSENTER handler 2018-07-20 01:11:36 +02:00
asm-offsets_64.c x86/paravirt: Move the pv_irq_ops under the PARAVIRT_XXL umbrella 2018-09-03 16:50:36 +02:00
asm-offsets.c x86/kernel: Fix more -Wmissing-prototypes warnings 2018-12-08 12:24:35 +01:00
audit_64.c
bootflag.c
check.c x86/headers: Fix -Wmissing-prototypes warning 2018-11-23 07:59:59 +01:00
cpuid.c
crash_dump_32.c
crash_dump_64.c x86: Fix various typos in comments 2018-12-03 10:49:13 +01:00
crash.c x86/kexec: Fix a kexec_file_load() failure 2019-01-15 12:12:50 +01:00
devicetree.c x86/headers: Fix -Wmissing-prototypes warning 2018-11-23 07:59:59 +01:00
doublefault.c
dumpstack_32.c
dumpstack_64.c
dumpstack.c x86/process: Don't mix user/kernel regs in 64bit __show_regs() 2018-09-06 14:33:12 +02:00
e820.c Merge branch 'akpm' (patches from Andrew) 2019-03-12 10:39:53 -07:00
early_printk.c efi/x86: Convert x86 EFI earlyprintk into generic earlycon implementation 2019-02-04 08:27:30 +01:00
early-quirks.c On GEM side: 2018-07-20 12:29:24 +10:00
ebda.c
eisa.c x86/EISA: Don't probe EISA bus for Xen PV guests 2018-09-11 23:36:50 +02:00
espfix_64.c
ftrace_32.S
ftrace_64.S x86/ftrace: Do not call function graph from dynamic trampolines 2018-12-19 22:43:37 -05:00
ftrace.c The biggest change for this release is in the histogram code. 2019-03-11 17:01:32 -07:00
head_32.S x86/pgtable/32: Allocate 8k page-tables when PTI is enabled 2018-07-20 01:11:41 +02:00
head_64.S xen/pvh: Split CONFIG_XEN_PVH into CONFIG_PVH and CONFIG_XEN_PVH 2018-12-13 13:41:49 -05:00
head32.c x86/boot: Mostly revert commit ae7e1238e6 ("Add ACPI RSDP address to setup_header") 2018-11-20 09:43:10 +01:00
head64.c x86/boot: Mostly revert commit ae7e1238e6 ("Add ACPI RSDP address to setup_header") 2018-11-20 09:43:10 +01:00
hpet.c x86/hpet: Prevent potential NULL pointer dereference 2019-03-21 12:24:38 +01:00
hw_breakpoint.c x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error 2019-03-22 17:08:17 +01:00
i8237.c
i8253.c
i8259.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
idt.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
ima_arch.c x86/ima: retry detecting secure boot mode 2018-12-11 07:19:45 -05:00
io_delay.c
ioport.c
irq_32.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
irq_64.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
irq_work.c
irq.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
irqflags.S x86/paravirt: Make native_save_fl() extern inline 2018-07-03 10:56:27 +02:00
irqinit.c x86: Don't include linux/irq.h from asm/hardirq.h 2018-08-05 09:53:13 +02:00
itmt.c
jailhouse.c x86/headers: Fix -Wmissing-prototypes warning 2018-11-23 07:59:59 +01:00
jump_label.c jump_label: move 'asm goto' support test to Kconfig 2019-01-06 09:46:51 +09:00
kdebugfs.c
kexec-bzimage64.c Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-03-10 17:32:04 -07:00
kgdb.c x86/kernel: Mark expected switch-case fall-throughs 2019-01-26 11:19:13 +01:00
ksysfs.c
kvm.c KVM: x86: WARN_ONCE if sending a PV IPI returns a fatal error 2019-01-25 19:11:33 +01:00
kvmclock.c x86: kvmguest: use TSC clocksource if invariant TSC is exposed 2019-02-20 22:48:52 +01:00
ldt.c x86/ldt: Remove unused variable in map_ldt_struct() 2018-11-06 21:35:11 +01:00
livepatch.c
machine_kexec_32.c x86/kexec: Allocate 8k PGDs for PTI 2018-07-30 13:53:48 +02:00
machine_kexec_64.c x86/kdump: Export the SME mask to vmcoreinfo 2019-01-11 16:09:25 +01:00
Makefile jump_label: move 'asm goto' support test to Kconfig 2019-01-06 09:46:51 +09:00
mmconf-fam10h_64.c
module.c x86: Add support for 64-bit place relative relocations 2018-09-27 17:56:47 +02:00
mpparse.c x86/mm: Don't leak kernel addresses 2019-03-19 12:10:56 +01:00
msr.c x86: Clean up 'sizeof x' => 'sizeof(x)' 2018-10-29 07:13:28 +01:00
nmi_selftest.c
nmi.c
paravirt_patch_32.c x86/paravirt: Remove unused _paravirt_ident_32 2018-10-30 09:55:31 +01:00
paravirt_patch_64.c x86/paravirt: Remove unused _paravirt_ident_32 2018-10-30 09:55:31 +01:00
paravirt-spinlocks.c x86/paravirt: Use a single ops structure 2018-09-03 16:50:35 +02:00
paravirt.c x86/paravirt: Remove unused _paravirt_ident_32 2018-10-30 09:55:31 +01:00
pci-calgary_64.c x86/calgary: remove the mapping_error dma_map_ops method 2018-12-06 06:56:46 -08:00
pci-dma.c dma-mapping: bypass indirect calls for dma-direct 2018-12-13 21:06:18 +01:00
pci-iommu_table.c x86/iommu: Use NULL instead of 0 2018-08-02 14:33:19 +02:00
pci-swiotlb.c dma-direct: merge swiotlb_dma_ops into the dma_direct code 2018-12-13 21:06:17 +01:00
pcspeaker.c x86/platform/pcspeaker: Use PTR_ERR_OR_ZERO() to fix ptr_ret.cocci warning 2018-07-24 09:46:42 +02:00
perf_regs.c
platform-quirks.c
pmem.c
probe_roms.c
process_32.c x86/fpu: Defer FPU state load until return to userspace 2019-04-12 19:34:47 +02:00
process_64.c x86/fpu: Defer FPU state load until return to userspace 2019-04-12 19:34:47 +02:00
process.c x86/fpu: Defer FPU state load until return to userspace 2019-04-12 19:34:47 +02:00
process.h x86/speculation: Change misspelled STIPB to STIBP 2018-12-06 11:49:15 +01:00
ptrace.c x86/fsgsbase/64: Fix the base write helper functions 2018-12-18 14:26:09 +01:00
pvclock.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
quirks.c x86/headers: Fix -Wmissing-prototypes warning 2018-11-23 07:59:59 +01:00
reboot_fixups_32.c
reboot.c
relocate_kernel_32.S
relocate_kernel_64.S
resource.c
rtc.c
setup_percpu.c memblock: drop memblock_alloc_*_nopanic() variants 2019-03-12 10:04:02 -07:00
setup.c Merge branch 'mount.part1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-01-05 13:25:58 -08:00
signal_compat.c
signal.c x86/fpu: Remove fpu->initialized 2019-04-10 15:42:40 +02:00
smp.c x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d 2018-08-05 09:53:13 +02:00
smpboot.c Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-07 16:36:57 -08:00
stacktrace.c Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
step.c
sys_x86_64.c x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT 2018-11-01 12:59:25 +01:00
sysfb_efi.c x86/kernel: Fix more -Wmissing-prototypes warnings 2018-12-08 12:24:35 +01:00
sysfb_simplefb.c
sysfb.c
tboot.c iommu/vtd: Cleanup dma_remapping.h header 2018-11-12 14:22:56 +01:00
tce_64.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
time.c Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 19:07:25 +01:00
tls.c
tls.h
topology.c x86/xen: Disable CPU0 hotplug for Xen PV 2018-09-12 21:15:02 +02:00
trace_clock.c
tracepoint.c x86/kernel: Fix more -Wmissing-prototypes warnings 2018-12-08 12:24:35 +01:00
traps.c x86/fpu: Use a feature number instead of mask in two more helpers 2019-04-10 18:20:27 +02:00
tsc_msr.c x86/cpu: Sanitize FAM6_ATOM naming 2018-10-02 10:14:32 +02:00
tsc_sync.c
tsc.c x86/tsc: Make calibration refinement more robust 2018-11-06 21:53:15 +01:00
umip.c signal/x86: Use force_sig_fault where appropriate 2018-09-21 15:30:54 +02:00
unwind_frame.c x86/unwind: Handle NULL pointer calls better in frame unwinder 2019-03-06 23:03:26 +01:00
unwind_guess.c
unwind_orc.c x86/unwind: Add hardcoded ORC entry for NULL 2019-03-06 23:03:26 +01:00
uprobes.c x86/kernel: Mark expected switch-case fall-throughs 2019-01-26 11:19:13 +01:00
verify_cpu.S
vm86_32.c Remove 'type' argument from access_ok() function 2019-01-03 18:57:57 -08:00
vmlinux.lds.S x86/build: Use the single-argument OUTPUT_FORMAT() linker script command 2019-01-16 12:21:53 +01:00
vsmp_64.c x86/vsmp: Remove dependency on pv_irq_ops 2018-11-06 21:35:11 +01:00
x86_init.c x86/acpi, x86/boot: Take RSDP address for boot params if available 2018-10-10 10:44:22 +02:00