linux/arch
Sean Christopherson 18eca69f88 KVM: x86/mmu: Calculate and check "full" mmu_role for nested MMU
commit 654430efde upstream.

Calculate and check the full mmu_role when initializing the MMU context
for the nested MMU, where "full" means the bits and pieces of the role
that aren't handled by kvm_calc_mmu_role_common().  While the nested MMU
isn't used for shadow paging, things like the number of levels in the
guest's page tables are surprisingly important when walking the guest
page tables.  Failure to reinitialize the nested MMU context if L2's
paging mode changes can result in unexpected and/or missed page faults,
and likely other explosions.

E.g. if an L1 vCPU is running both a 32-bit PAE L2 and a 64-bit L2, the
"common" role calculation will yield the same role for both L2s.  If the
64-bit L2 is run after the 32-bit PAE L2, L0 will fail to reinitialize
the nested MMU context, ultimately resulting in a bad walk of L2's page
tables as the MMU will still have a guest root_level of PT32E_ROOT_LEVEL.

  WARNING: CPU: 4 PID: 167334 at arch/x86/kvm/vmx/vmx.c:3075 ept_save_pdptrs+0x15/0xe0 [kvm_intel]
  Modules linked in: kvm_intel]
  CPU: 4 PID: 167334 Comm: CPU 3/KVM Not tainted 5.13.0-rc1-d849817d5673-reqs #185
  Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
  RIP: 0010:ept_save_pdptrs+0x15/0xe0 [kvm_intel]
  Code: <0f> 0b c3 f6 87 d8 02 00f
  RSP: 0018:ffffbba702dbba00 EFLAGS: 00010202
  RAX: 0000000000000011 RBX: 0000000000000002 RCX: ffffffff810a2c08
  RDX: ffff91d7bc30acc0 RSI: 0000000000000011 RDI: ffff91d7bc30a600
  RBP: ffff91d7bc30a600 R08: 0000000000000010 R09: 0000000000000007
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff91d7bc30a600
  R13: ffff91d7bc30acc0 R14: ffff91d67c123460 R15: 0000000115d7e005
  FS:  00007fe8e9ffb700(0000) GS:ffff91d90fb00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000029f15a001 CR4: 00000000001726e0
  Call Trace:
   kvm_pdptr_read+0x3a/0x40 [kvm]
   paging64_walk_addr_generic+0x327/0x6a0 [kvm]
   paging64_gva_to_gpa_nested+0x3f/0xb0 [kvm]
   kvm_fetch_guest_virt+0x4c/0xb0 [kvm]
   __do_insn_fetch_bytes+0x11a/0x1f0 [kvm]
   x86_decode_insn+0x787/0x1490 [kvm]
   x86_decode_emulated_instruction+0x58/0x1e0 [kvm]
   x86_emulate_instruction+0x122/0x4f0 [kvm]
   vmx_handle_exit+0x120/0x660 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0xe25/0x1cb0 [kvm]
   kvm_vcpu_ioctl+0x211/0x5a0 [kvm]
   __x64_sys_ioctl+0x83/0xb0
   do_syscall_64+0x40/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Fixes: bf627a9288 ("x86/kvm/mmu: check if MMU reconfiguration is needed in init_kvm_nested_mmu()")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210610220026.1364486-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-23 14:42:51 +02:00
..
alpha local64.h: make <asm/local64.h> mandatory 2021-01-12 20:18:16 +01:00
arc ARC: mm: Use max_high_pfn as a HIGHMEM zone border 2021-05-19 10:13:10 +02:00
arm ARM: OMAP2+: Fix build warning when mmc_omap is not built 2021-06-18 10:00:04 +02:00
arm64 KVM: arm64: Fix debug register indexing 2021-06-10 13:39:28 +02:00
c6x arch-cleanup-2020-10-22 2020-10-23 10:06:38 -07:00
csky csky: change a Kconfig symbol name to fix e1000 build error 2021-04-28 13:40:02 +02:00
h8300 h8300: fix PREEMPTION build, TI_PRE_COUNT undefined 2021-02-17 11:02:28 +01:00
hexagon local64.h: make <asm/local64.h> mandatory 2021-01-12 20:18:16 +01:00
ia64 tweewide: Fix most Shebang lines 2021-05-22 11:40:55 +02:00
m68k m68k: Add missing mmap_read_lock() to sys_cacheflush() 2021-05-14 09:50:19 +02:00
microblaze local64.h: make <asm/local64.h> mandatory 2021-01-12 20:18:16 +01:00
mips MIPS: Fix kernel hang under FUNCTION_GRAPH_TRACER and PREEMPT_TRACER 2021-06-16 12:01:37 +02:00
nds32 nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff 2021-04-14 08:41:58 +02:00
nios2 nios2: fixed broken sys_clone syscall 2021-03-04 11:38:16 +01:00
openrisc openrisc: Define memory barrier mb 2021-06-03 09:00:44 +02:00
parisc parisc: avoid a warning on u8 cast for cmpxchg on u8 pointers 2021-04-14 08:41:59 +02:00
powerpc powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c controllers 2021-06-16 12:01:38 +02:00
riscv riscv: Use -mno-relax when using lld linker 2021-06-18 10:00:04 +02:00
s390 KVM: s390: extend kvm_s390_shadow_fault to return entry pointer 2021-05-14 09:50:03 +02:00
sh sh: Remove unused HAVE_COPY_THREAD_TLS macro 2021-01-27 11:55:20 +01:00
sparc sparc64: Fix opcode filtering in handling of no fault loads 2021-03-30 14:31:50 +02:00
um um: Disable CONFIG_GCOV with MODULES 2021-05-22 11:40:53 +02:00
x86 KVM: x86/mmu: Calculate and check "full" mmu_role for nested MMU 2021-06-23 14:42:51 +02:00
xtensa xtensa: move coprocessor_flush to the .text section 2021-04-07 15:00:09 +02:00
.gitignore
Kconfig fanotify: Fix sys_fanotify_mark() on native x86-32 2021-01-17 14:16:59 +01:00