linux/arch/arm64
Oliver Upton f587661f21 KVM: arm64: Don't split hugepages outside of MMU write lock
It is possible to take a stage-2 permission fault on a page larger than
PAGE_SIZE. For example, when running a guest backed by 2M HugeTLB, KVM
eagerly maps at the largest possible block size. When dirty logging is
enabled on a memslot, KVM does *not* eagerly split these 2M stage-2
mappings and instead clears the write bit on the pte.

Since dirty logging is always performed at PAGE_SIZE granularity, KVM
lazily splits these 2M block mappings down to PAGE_SIZE in the stage-2
fault handler. This operation must be done under the write lock. Since
commit f783ef1c0e ("KVM: arm64: Add fast path to handle permission
relaxation during dirty logging"), the stage-2 fault handler
conditionally takes the read lock on permission faults with dirty
logging enabled. To that end, it is possible to split a 2M block mapping
while only holding the read lock.

The problem is demonstrated by running kvm_page_table_test with 2M
anonymous HugeTLB, which splats like so:

  WARNING: CPU: 5 PID: 15276 at arch/arm64/kvm/hyp/pgtable.c:153 stage2_map_walk_leaf+0x124/0x158

  [...]

  Call trace:
  stage2_map_walk_leaf+0x124/0x158
  stage2_map_walker+0x5c/0xf0
  __kvm_pgtable_walk+0x100/0x1d4
  __kvm_pgtable_walk+0x140/0x1d4
  __kvm_pgtable_walk+0x140/0x1d4
  kvm_pgtable_walk+0xa0/0xf8
  kvm_pgtable_stage2_map+0x15c/0x198
  user_mem_abort+0x56c/0x838
  kvm_handle_guest_abort+0x1fc/0x2a4
  handle_exit+0xa4/0x120
  kvm_arch_vcpu_ioctl_run+0x200/0x448
  kvm_vcpu_ioctl+0x588/0x664
  __arm64_sys_ioctl+0x9c/0xd4
  invoke_syscall+0x4c/0x144
  el0_svc_common+0xc4/0x190
  do_el0_svc+0x30/0x8c
  el0_svc+0x28/0xcc
  el0t_64_sync_handler+0x84/0xe4
  el0t_64_sync+0x1a4/0x1a8

Fix the issue by only acquiring the read lock if the guest faulted on a
PAGE_SIZE granule w/ dirty logging enabled. Add a WARN to catch locking
bugs in future changes.

Fixes: f783ef1c0e ("KVM: arm64: Add fast path to handle permission relaxation during dirty logging")
Cc: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220401194652.950240-1-oupton@google.com
2022-04-06 10:41:05 +01:00
..
boot SoC: fixes for 5.18, part 1 2022-04-01 13:21:19 -07:00
configs ARM: DT updates for 5.18 2022-03-23 18:37:22 -07:00
crypto crypto: arm64 - cleanup comments 2022-03-09 15:12:32 +12:00
hyperv arm64: hyperv: Initialize hypervisor on boot 2021-08-04 16:54:36 +00:00
include Char/Misc and other driver updates for 5.18-rc1 2022-03-28 12:27:35 -07:00
kernel ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
kvm KVM: arm64: Don't split hugepages outside of MMU write lock 2022-04-06 10:41:05 +01:00
lib Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2022-03-21 16:02:36 -07:00
mm kasan, arm64: don't tag executable vmalloc allocations 2022-03-24 19:06:48 -07:00
net kasan, arm64: don't tag executable vmalloc allocations 2022-03-24 19:06:48 -07:00
tools Merge branch 'for-next/spectre-bhb' into for-next/core 2022-03-14 19:08:31 +00:00
xen xen: allow pv-only hypercalls only with CONFIG_XEN_PV 2021-11-02 08:11:01 -05:00
Kbuild kbuild: use more subdir- for visiting subdirectories while cleaning 2021-10-24 13:49:46 +09:00
Kconfig Char/Misc and other driver updates for 5.18-rc1 2022-03-28 12:27:35 -07:00
Kconfig.debug
Kconfig.platforms ARM: DT updates for 5.18 2022-03-23 18:37:22 -07:00
Makefile arm64/xor: use EOR3 instructions when available 2021-12-14 12:14:26 +00:00