123 Commits

Author SHA1 Message Date
Vincent Donnefort
9019e82c7e KVM: arm64: Add PKVM_DISABLE_STAGE2_ON_PANIC
On NVHE_EL2_DEBUG, when using pKVM, the host stage-2 is relaxed to grant
the kernel access to the stacktrace, hypervisor bug table and text to
symbolize addresses. This is unsafe for production. In preparation for
adding more debug options to NVHE_EL2_DEBUG, decouple the stage-2
relaxation into a separate option.

While at it, rename PROTECTED_NVHE_STACKTRACE into PKVM_STACKTRACE,
following the same naming scheme as PKVM_DISABLE_STAGE2_ON_PANIC.

Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260309162516.2623589-20-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-11 08:51:16 +00:00
Marc Zyngier
892f7c38ba KVM: arm64: Fix WFxT handling of nested virt
The spec for WFxT indicates that the parameter to the WFxT instruction
is relative to the reading of CNTVCT_EL0. This means that the implementation
needs to take the execution context into account, as CNTVOFF_EL2
does not always affect readings of CNTVCT_EL0 (such as when HCR_EL2.E2H
is 1 and we're in host context).
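
The dependency on execution context can be illustrated with a minimal C sketch. This is not the kernel's implementation: the function names and the offset_applies flag are hypothetical, and the flag simply stands in for "CNTVOFF_EL2 affects CNTVCT_EL0 reads in this context".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Value that a read of CNTVCT_EL0 would return in the current context.
 * offset_applies is a hypothetical flag: false models a context in which
 * CNTVOFF_EL2 has no effect on the read.
 */
static uint64_t effective_cntvct(uint64_t phys_count, uint64_t cntvoff,
				 bool offset_applies)
{
	return offset_applies ? phys_count - cntvoff : phys_count;
}

/* Ticks left before a WFxT deadline expires; 0 if it has already passed. */
static uint64_t wfxt_ticks_remaining(uint64_t deadline, uint64_t now)
{
	return deadline > now ? deadline - now : 0;
}
```

With a physical count of 1000 and an offset of 400, a deadline of 900 is still 300 ticks away when the offset applies, but has already expired when it does not.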

This also rids us of the last instance of KVM_REG_ARM_TIMER_CNT
outside of the userspace interaction code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-10-13 14:42:41 +01:00
Linus Torvalds
f3826aa996 guest_memfd:
* Add support for host userspace mapping of guest_memfd-backed memory for VM
   types that do NOT support KVM_MEMORY_ATTRIBUTE_PRIVATE (which isn't
   precisely the same thing as CoCo VMs, since x86's SEV-MEM and SEV-ES have
   no way to detect private vs. shared).
 
   This lays the groundwork for removal of guest memory from the kernel direct
   map, as well as for limited mmap() for guest_memfd-backed memory.
 
   For more information see:
   * a6ad54137a ("Merge branch 'guest-memfd-mmap' into HEAD", 2025-08-27)
   * https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding
     (guest_memfd in Firecracker)
   * https://lore.kernel.org/all/20250221160728.1584559-1-roypat@amazon.co.uk/
     (direct map removal)
   * https://lore.kernel.org/all/20250328153133.3504118-1-tabba@google.com/
     (mmap support)
 
 ARM:
 
 * Add support for FF-A 1.2 as the secure memory conduit for pKVM,
   allowing more registers to be used as part of the message payload.
 
 * Change the way pKVM allocates its VM handles, making sure that the
   privileged hypervisor is never tricked into using uninitialised
   data.
 
 * Speed up MMIO range registration by avoiding unnecessary RCU
   synchronisation, which results in VMs starting much quicker.
 
 * Add the dump of the instruction stream when panic-ing in the EL2
   payload, just like the rest of the kernel has always done. This will
   hopefully help debugging non-VHE setups.
 
 * Add 52bit PA support to the stage-1 page-table walker, and make use
   of it to populate the fault level reported to the guest on failing
   to translate a stage-1 walk.
 
 * Add NV support to the GICv3-on-GICv5 emulation code, ensuring
   feature parity for guests, irrespective of the host platform.
 
 * Fix some really ugly architecture problems when dealing with debug
   in a nested VM. This has some bad performance impacts, but is at
   least correct.
 
 * Add enough infrastructure to be able to disable EL2 features and
   give effective values to the EL2 control registers. This then allows
   a bunch of features to be turned off, which helps cross-host
   migration.
 
 * Large rework of the selftest infrastructure to allow most tests to
   transparently run at EL2. This is the first step towards enabling
   NV testing.
 
 * Various fixes and improvements all over the map, including one BE
   fix, just in time for the removal of the feature.
 
 LoongArch:
 
 * Detect page table walk feature on new hardware
 
 * Add sign extension with kernel MMIO/IOCSR emulation
 
 * Improve in-kernel IPI emulation
 
 * Improve in-kernel PCH-PIC emulation
 
 * Move kvm_iocsr tracepoint out of generic code
 
 RISC-V:
 
 * Added SBI FWFT extension for Guest/VM with misaligned delegation and
   pointer masking PMLEN features
 
 * Added ONE_REG interface for SBI FWFT extension
 
 * Added Zicbop and bfloat16 extensions for Guest/VM
 
 * Enabled more common KVM selftests for RISC-V
 
 * Added SBI v3.0 PMU enhancements in KVM and perf driver
 
 s390:
 
 * Improve the selection of the CPU to wake up for an interrupt, in particular
   the heuristic to decide which vCPU to deliver a floating interrupt to.
 
 * Clear the PTE when discarding a swapped page because of CMMA; this
   bug was introduced in 6.16 when refactoring gmap code.
 
 x86 selftests:
 
 * Add #DE coverage in the fastops test (the only exception that's
   guest-triggerable in fastop-emulated instructions).
 
 * Fix PMU selftests errors encountered on Granite Rapids (GNR), Sierra
   Forest (SRF) and Clearwater Forest (CWF).
 
 * Minor cleanups and improvements
 
 x86 (guest side):
 
 * Force the legacy PCI hole (memory between TOLUD and 4GiB) to UC when
   overriding guest MTRR for TDX/SNP to fix an issue where ACPI auto-mapping
   could map devices as WB and prevent the device drivers from mapping their
   devices with UC/UC-.
 
 * Make kvm_async_pf_task_wake() a local static helper and remove its
   export.
 
 * Use native qspinlocks when running in a VM with dedicated vCPU=>pCPU
   bindings even when PV_UNHALT is unsupported.
 
 Generic:
 
 * Remove a redundant __GFP_NOWARN from kvm_setup_async_pf() as __GFP_NOWARN is
   now included in GFP_NOWAIT.

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "This excludes the bulk of the x86 changes, which I will send
  separately. They have two not complex but relatively unusual conflicts
  so I will wait for other dust to settle.

  guest_memfd:

   - Add support for host userspace mapping of guest_memfd-backed memory
     for VM types that do NOT support KVM_MEMORY_ATTRIBUTE_PRIVATE
     (which isn't precisely the same thing as CoCo VMs, since x86's
     SEV-MEM and SEV-ES have no way to detect private vs. shared).

     This lays the groundwork for removal of guest memory from the
     kernel direct map, as well as for limited mmap() for
     guest_memfd-backed memory.

     For more information see:
       - commit a6ad54137a ("Merge branch 'guest-memfd-mmap' into HEAD")
       - guest_memfd in Firecracker:
           https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding
       - direct map removal:
           https://lore.kernel.org/all/20250221160728.1584559-1-roypat@amazon.co.uk/
       - mmap support:
           https://lore.kernel.org/all/20250328153133.3504118-1-tabba@google.com/

  ARM:

   - Add support for FF-A 1.2 as the secure memory conduit for pKVM,
     allowing more registers to be used as part of the message payload.

   - Change the way pKVM allocates its VM handles, making sure that the
     privileged hypervisor is never tricked into using uninitialised
     data.

   - Speed up MMIO range registration by avoiding unnecessary RCU
     synchronisation, which results in VMs starting much quicker.

   - Add the dump of the instruction stream when panic-ing in the EL2
     payload, just like the rest of the kernel has always done. This
     will hopefully help debugging non-VHE setups.

   - Add 52bit PA support to the stage-1 page-table walker, and make use
     of it to populate the fault level reported to the guest on failing
     to translate a stage-1 walk.

   - Add NV support to the GICv3-on-GICv5 emulation code, ensuring
     feature parity for guests, irrespective of the host platform.

   - Fix some really ugly architecture problems when dealing with debug
     in a nested VM. This has some bad performance impacts, but is at
     least correct.

   - Add enough infrastructure to be able to disable EL2 features and
     give effective values to the EL2 control registers. This then
     allows a bunch of features to be turned off, which helps cross-host
     migration.

   - Large rework of the selftest infrastructure to allow most tests to
     transparently run at EL2. This is the first step towards enabling
     NV testing.

   - Various fixes and improvements all over the map, including one BE
     fix, just in time for the removal of the feature.

  LoongArch:

   - Detect page table walk feature on new hardware

   - Add sign extension with kernel MMIO/IOCSR emulation

   - Improve in-kernel IPI emulation

   - Improve in-kernel PCH-PIC emulation

   - Move kvm_iocsr tracepoint out of generic code

  RISC-V:

   - Added SBI FWFT extension for Guest/VM with misaligned delegation
     and pointer masking PMLEN features

   - Added ONE_REG interface for SBI FWFT extension

   - Added Zicbop and bfloat16 extensions for Guest/VM

   - Enabled more common KVM selftests for RISC-V

   - Added SBI v3.0 PMU enhancements in KVM and perf driver

  s390:

   - Improve the selection of the CPU to wake up for an interrupt, in
     particular the heuristic to decide which vCPU to deliver a floating
     interrupt to.

   - Clear the PTE when discarding a swapped page because of CMMA; this
     bug was introduced in 6.16 when refactoring gmap code.

  x86 selftests:

   - Add #DE coverage in the fastops test (the only exception that's
     guest-triggerable in fastop-emulated instructions).

   - Fix PMU selftests errors encountered on Granite Rapids (GNR),
     Sierra Forest (SRF) and Clearwater Forest (CWF).

   - Minor cleanups and improvements

  x86 (guest side):

   - Force the legacy PCI hole (memory between TOLUD and 4GiB) to UC when
     overriding guest MTRR for TDX/SNP to fix an issue where ACPI
     auto-mapping could map devices as WB and prevent the device drivers
     from mapping their devices with UC/UC-.

   - Make kvm_async_pf_task_wake() a local static helper and remove its
     export.

   - Use native qspinlocks when running in a VM with dedicated
     vCPU=>pCPU bindings even when PV_UNHALT is unsupported.

  Generic:

   - Remove a redundant __GFP_NOWARN from kvm_setup_async_pf() as
     __GFP_NOWARN is now included in GFP_NOWAIT.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (178 commits)
  KVM: s390: Fix to clear PTE when discarding a swapped page
  KVM: arm64: selftests: Cover ID_AA64ISAR3_EL1 in set_id_regs
  KVM: arm64: selftests: Remove a duplicate register listing in set_id_regs
  KVM: arm64: selftests: Cope with arch silliness in EL2 selftest
  KVM: arm64: selftests: Add basic test for running in VHE EL2
  KVM: arm64: selftests: Enable EL2 by default
  KVM: arm64: selftests: Initialize HCR_EL2
  KVM: arm64: selftests: Use the vCPU attr for setting nr of PMU counters
  KVM: arm64: selftests: Use hyp timer IRQs when test runs at EL2
  KVM: arm64: selftests: Select SMCCC conduit based on current EL
  KVM: arm64: selftests: Provide helper for getting default vCPU target
  KVM: arm64: selftests: Alias EL1 registers to EL2 counterparts
  KVM: arm64: selftests: Create a VGICv3 for 'default' VMs
  KVM: arm64: selftests: Add unsanitised helpers for VGICv3 creation
  KVM: arm64: selftests: Add helper to check for VGICv3 support
  KVM: arm64: selftests: Initialize VGICv3 only once
  KVM: arm64: selftests: Provide kvm_arch_vm_post_create() in library code
  KVM: selftests: Add ex_str() to print human friendly name of exception vectors
  selftests/kvm: remove stale TODO in xapic_state_test
  KVM: selftests: Handle Intel Atom errata that leads to PMU event overcount
  ...
2025-10-04 08:52:16 -07:00
Kees Cook
23ef9d4397 kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
The kernel's CFI implementation uses the KCFI ABI specifically, and is
not strictly tied to a particular compiler. In preparation for GCC
supporting KCFI, rename CONFIG_CFI_CLANG to CONFIG_CFI (along with
associated options).

Use new "transitional" Kconfig option for old CONFIG_CFI_CLANG that will
enable CONFIG_CFI during olddefconfig.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20250923213422.1105654-3-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-09-24 14:29:14 -07:00
Mostafa Saleh
6f1ece1e86 KVM: arm64: Map hyp text as RO and dump instr on panic
Map the hyp text section as RO. There are no secrets there, and this
allows the kernel to extract info for debugging.

In case of a panic, we can now dump the faulting instructions, similar
to the kernel.

Signed-off-by: Mostafa Saleh <smostafa@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15 13:04:23 +01:00
Mostafa Saleh
92b7624fe0 KVM: arm64: Dump instruction on hyp panic
Similar to the kernel panic, where the instruction code is printed,
we can do the same for hypervisor panics.

This patch does that only in case of CONFIG_NVHE_EL2_DEBUG or nVHE.

The next patch adds support for pKVM.

Also, remove the hardcoded argument from dump_kernel_instr().

Signed-off-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Kunwu Chan <chentao@kylinos.cn>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15 13:04:22 +01:00
Oliver Upton
77ee70a073 KVM: arm64: nv: Honor SError exception routing / masking
To date KVM has used HCR_EL2.VSE to track the state of a pending SError
for the guest. With this bit set, hardware respects the EL1 exception
routing / masking rules and injects the vSError when appropriate.

This isn't correct for NV guests as hardware is oblivious to vEL2's
intentions for SErrors. Better yet, with FEAT_NV2 the guest can change
the routing behind our back as HCR_EL2 is redirected to memory. Cope
with this mess by:

 - Using a flag (instead of HCR_EL2.VSE) to track the pending SError
   state when SErrors are unconditionally masked for the current context

 - Resampling the routing / masking of a pending SError on every guest
   entry/exit

 - Emulating exception entry when SError routing implies a translation
   regime change

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250708172532.1699409-7-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-08 11:36:31 -07:00
Marc Zyngier
1d6fea7663 KVM: arm64: Add helper to identify a nested context
A common idiom in the KVM code is to check if we are currently
dealing with a "nested" context, defined as having NV enabled,
but being in the EL1&0 translation regime.

This is usually expressed as:

	if (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu) ... )

which is a mouthful and a bit hard to read, especially when followed
by additional conditions.

Introduce a new helper that encapsulates these two terms, allowing
the above to be written as

	if (is_nested_context(vcpu) ... )

which is both shorter and easier to read, and makes more obvious
the potential for simplification on some code paths.
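
As a rough, self-contained illustration of the idiom (not the kernel's actual definitions), the helper can be modelled as below. The struct vcpu_model type and its two fields are hypothetical stand-ins for struct kvm_vcpu and for the state that vcpu_has_nv() and is_hyp_ctxt() consult.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm_vcpu: only what the helper needs. */
struct vcpu_model {
	bool has_nv;	/* models the result of vcpu_has_nv(vcpu) */
	bool hyp_ctxt;	/* models the result of is_hyp_ctxt(vcpu) */
};

static inline bool vcpu_has_nv(const struct vcpu_model *vcpu)
{
	return vcpu->has_nv;
}

static inline bool is_hyp_ctxt(const struct vcpu_model *vcpu)
{
	return vcpu->hyp_ctxt;
}

/* NV is enabled, but we are in the EL1&0 translation regime. */
static inline bool is_nested_context(const struct vcpu_model *vcpu)
{
	return vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu);
}
```

Only the combination "NV enabled, not in hyp context" yields true; a vCPU without NV is never in a nested context, whatever the second flag says.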

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250708172532.1699409-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-08 10:40:30 -07:00
Marc Zyngier
7f3225fe8b Merge branch kvm-arm64/nv-nv into kvmarm-master/next
* kvm-arm64/nv-nv:
  : .
  : Flick the switch on the NV support by adding the missing piece
  : in the form of the VNCR page management. From the cover letter:
  :
  : "This is probably the most interesting bit of the whole NV adventure.
  : So far, everything else has been a walk in the park, but this one is
  : where the real fun takes place.
  :
  : With FEAT_NV2, most of the NV support revolves around tricking a guest
  : into accessing memory while it tries to access system registers. The
  : hypervisor's job is to handle the context switch of the actual
  : registers with the state in memory as needed."
  : .
  KVM: arm64: nv: Release faulted-in VNCR page from mmu_lock critical section
  KVM: arm64: nv: Handle TLBI S1E2 for VNCR invalidation with mmu_lock held
  KVM: arm64: nv: Hold mmu_lock when invalidating VNCR SW-TLB before translating
  KVM: arm64: Document NV caps and vcpu flags
  KVM: arm64: Allow userspace to request KVM_ARM_VCPU_EL2*
  KVM: arm64: nv: Remove dead code from ERET handling
  KVM: arm64: nv: Plumb TLBI S1E2 into system instruction dispatch
  KVM: arm64: nv: Add S1 TLB invalidation primitive for VNCR_EL2
  KVM: arm64: nv: Program host's VNCR_EL2 to the fixmap address
  KVM: arm64: nv: Handle VNCR_EL2 invalidation from MMU notifiers
  KVM: arm64: nv: Handle mapping of VNCR_EL2 at EL2
  KVM: arm64: nv: Handle VNCR_EL2-triggered faults
  KVM: arm64: nv: Add userspace and guest handling of VNCR_EL2
  KVM: arm64: nv: Add pseudo-TLB backing VNCR_EL2
  KVM: arm64: nv: Don't adjust PSTATE.M when L2 is nesting
  KVM: arm64: nv: Move TLBI range decoding to a helper
  KVM: arm64: nv: Snapshot S1 ASID tagging information during walk
  KVM: arm64: nv: Extract translation helper from the AT code
  KVM: arm64: nv: Allocate VNCR page when required
  arm64: sysreg: Add layout for VNCR_EL2

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:58:57 +01:00
Marc Zyngier
fef3acf5ae Merge branch kvm-arm64/fgt-masks into kvmarm-master/next
* kvm-arm64/fgt-masks: (43 commits)
  : .
  : Large rework of the way KVM deals with trap bits in conjunction with
  : the CPU feature registers. It now draws a direct link between the
  : feature set, the system registers that need to UNDEF to match
  : the configuration and bits that need to behave as RES0 or RES1 in
  : the trap registers that are visible to the guest.
  :
  : Best of all, these definitions are mostly automatically generated
  : from the JSON description published by ARM under a permissive
  : license.
  : .
  KVM: arm64: Handle TSB CSYNC traps
  KVM: arm64: Add FGT descriptors for FEAT_FGT2
  KVM: arm64: Allow sysreg ranges for FGT descriptors
  KVM: arm64: Add context-switch for FEAT_FGT2 registers
  KVM: arm64: Add trap routing for FEAT_FGT2 registers
  KVM: arm64: Add sanitisation for FEAT_FGT2 registers
  KVM: arm64: Add FEAT_FGT2 registers to the VNCR page
  KVM: arm64: Use HCR_EL2 feature map to drive fixed-value bits
  KVM: arm64: Use HCRX_EL2 feature map to drive fixed-value bits
  KVM: arm64: Allow kvm_has_feat() to take variable arguments
  KVM: arm64: Use FGT feature maps to drive RES0 bits
  KVM: arm64: Validate FGT register descriptions against RES0 masks
  KVM: arm64: Switch to table-driven FGU configuration
  KVM: arm64: Handle PSB CSYNC traps
  KVM: arm64: Use KVM-specific HCRX_EL2 RES0 mask
  KVM: arm64: Remove hand-crafted masks for FGT registers
  KVM: arm64: Use computed FGT masks to setup FGT registers
  KVM: arm64: Propagate FGT masks to the nVHE hypervisor
  KVM: arm64: Unconditionally configure fine-grain traps
  KVM: arm64: Use computed masks as sanitisers for FGT registers
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:58:15 +01:00
Marc Zyngier
98dbe56a01 KVM: arm64: Handle TSB CSYNC traps
The architecture introduces a trap for TSB CSYNC that fits in
the same EC as LS64 and PSB CSYNC. Let's deal with it in a similar
way.

It's not that we expect this to be useful any time soon anyway.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-19 11:36:21 +01:00
Marc Zyngier
069a05e535 KVM: arm64: nv: Handle VNCR_EL2-triggered faults
As VNCR_EL2.BADDR contains a VA, it is bound to trigger faults.

These faults can have multiple sources:

- We haven't mapped anything on the host: we need to compute the
  resulting translation, populate a TLB, and eventually map
  the corresponding page

- The permissions are out of whack: we need to tell the guest about
  this state of affairs

Note that the kernel doesn't support S1POE for itself yet, so
the particular case of a VNCR page mapped with no permissions
or with write-only permissions is not correctly handled yet.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250514103501.2225951-10-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-19 08:01:19 +01:00
Marc Zyngier
397411c743 KVM: arm64: Handle PSB CSYNC traps
The architecture introduces a trap for PSB CSYNC that fits in
the same EC as LS64. Let's deal with it in a similar way as
LS64.

It's not that we expect this to be useful any time soon anyway.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-10 11:04:35 +01:00
Mostafa Saleh
446692759b KVM: arm64: Handle UBSAN faults
Now that UBSAN can be enabled, handle brk64 exits from UBSAN.
Re-use the decoding code from the kernel, and panic with the
UBSAN message.

Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250430162713.1997569-5-smostafa@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-07 11:21:35 +01:00
Marc Zyngier
5329358c22 KVM: arm64: Plug FEAT_GCS handling
We don't seem to be handling the GCS-specific exception class.
Handle it by delivering an UNDEF to the guest, and populate the
relevant trap bits.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-06 17:35:19 +01:00
Marc Zyngier
2e04378f1a KVM: arm64: Handle trapping of FEAT_LS64* instructions
We generally don't expect FEAT_LS64* instructions to trap, unless
they are trapped by a guest hypervisor.

Otherwise, this is just the guest playing tricks on us by using
an instruction that isn't advertised, which we handle with a well
deserved UNDEF.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-06 17:35:14 +01:00
Jintack Lim
69c9176c38 KVM: arm64: nv: Respect virtual HCR_EL2.TWx setting
Forward exceptions due to WFI or WFE instructions to the virtual EL2 if
they are not coming from the virtual EL2 and virtual HCR_EL2.TWx is set.
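
The forwarding rule can be sketched in C as follows. This is a simplified model, not the code from the patch: should_forward_wfx() and its parameters are hypothetical names, and the HCR_TWI/HCR_TWE bit positions (13 and 14) are assumptions that should be checked against the architecture reference.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed HCR_EL2 bit positions for the WFI and WFE trap controls. */
#define HCR_TWI	(UINT64_C(1) << 13)
#define HCR_TWE	(UINT64_C(1) << 14)

/*
 * Decide whether a trapped WFI/WFE should be forwarded to virtual EL2.
 * in_virtual_el2: the instruction was executed by the guest hypervisor.
 * virt_hcr_el2:   the guest hypervisor's view of HCR_EL2.
 * is_wfe:         true for WFE, false for WFI.
 */
static bool should_forward_wfx(bool in_virtual_el2, uint64_t virt_hcr_el2,
			       bool is_wfe)
{
	if (in_virtual_el2)
		return false;

	return (virt_hcr_el2 & (is_wfe ? HCR_TWE : HCR_TWI)) != 0;
}
```

A WFI from a nested guest is forwarded only if the virtual TWI bit is set; the same instruction executed at virtual EL2 is never forwarded.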

Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-12-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03 14:57:10 -08:00
Oliver Upton
b0ee51033a KVM: arm64: nv: Honor MDCR_EL2.TDE routing for debug exceptions
Inject debug exceptions into vEL2 if MDCR_EL2.TDE is set.

Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241219224116.3941496-17-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-12-20 09:04:11 +00:00
Marc Zyngier
2ca3f03bf5 KVM: arm64: Manage software step state at load/put
KVM takes over the guest's software step state machine if the VMM is
debugging the guest, but it does the save/restore fiddling for every
guest entry.

Note that the only constraint on host usage of software step is that the
guest's configuration remains visible to userspace via the ONE_REG
ioctls. So, we can cut down on the amount of fiddling by doing this at
load/put instead.

Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241219224116.3941496-16-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-12-20 09:04:06 +00:00
Oliver Upton
8c2899e770 Merge branch kvm-arm64/nv-sve into kvmarm/next
* kvm-arm64/nv-sve:
  : CPTR_EL2, FPSIMD/SVE support for nested
  :
  : This series brings support for honoring the guest hypervisor's CPTR_EL2
  : trap configuration when running a nested guest, along with support for
  : FPSIMD/SVE usage at L1 and L2.
  KVM: arm64: Allow the use of SVE+NV
  KVM: arm64: nv: Add additional trap setup for CPTR_EL2
  KVM: arm64: nv: Add trap description for CPTR_EL2
  KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper
  KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2
  KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap
  KVM: arm64: nv: Handle CPACR_EL1 traps
  KVM: arm64: Spin off helper for programming CPTR traps
  KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state
  KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest
  KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context
  KVM: arm64: nv: Load guest hyp's ZCR into EL1 state
  KVM: arm64: nv: Handle ZCR_EL2 traps
  KVM: arm64: nv: Forward SVE traps to guest hypervisor
  KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-07-14 00:27:06 +00:00
Oliver Upton
399debfc97 KVM: arm64: nv: Forward SVE traps to guest hypervisor
Similar to FPSIMD traps, don't load SVE state if the guest hypervisor
has SVE traps enabled and forward the trap instead. Note that ZCR_EL2
will require some special handling, as it takes a sysreg trap to EL2
when HCR_EL2.NV = 1.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:01:20 +00:00
Jintack Lim
d2b2ecba8d KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor
Give precedence to the guest hypervisor's trap configuration when
routing an FP/ASIMD trap taken to EL2. Take advantage of the
infrastructure for translating CPTR_EL2 into the VHE (i.e. EL1) format
and base the trap decision solely on the VHE view of the register. The
in-memory value of CPTR_EL2 will always be up to date for the guest
hypervisor (more on that later), so just read it directly from memory.

Bury all of this behind a macro keyed off of the CPTR bitfield in
anticipation of supporting other traps (e.g. SVE).

[maz: account for HCR_EL2.E2H when testing for TFP/FPEN, with
 all the hard work actually being done by Chase Conklin]
[ oliver: translate nVHE->VHE format for testing traps; macro for reuse
 in other CPTR_EL2.xEN fields ]

Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 19:01:20 +00:00
Pierre-Clément Tosi
eca4ba5b6d KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2
The compiler implements kCFI by adding type information (u32) above
every function that might be indirectly called and, whenever a function
pointer is called, injects a read-and-compare of that u32 against the
value corresponding to the expected type. In case of a mismatch, a BRK
instruction gets executed. When the hypervisor triggers such an
exception in nVHE, it panics and triggers an exception return to EL1.
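
The check the compiler injects can be modelled in plain C, purely as an illustration. Everything here is hypothetical: the struct typed_fn pairing, the HANDLER_TYPE_ID value and the cfi_trapped flag (which stands in for executing a BRK) are not what the compiler actually emits.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int (*handler_fn)(int);

/*
 * Hypothetical model: the real type ID lives just above the function's
 * entry point; here it is simply paired with the pointer in a struct.
 */
struct typed_fn {
	uint32_t type_id;
	handler_fn fn;
};

#define HANDLER_TYPE_ID 0x5a17c0deu	/* made-up hash for handler_fn */

static bool cfi_trapped;	/* stands in for the BRK being executed */

/* Compare the stored type ID with the expected one before calling. */
static int checked_call(const struct typed_fn *target, uint32_t expected,
			int arg)
{
	if (target->type_id != expected) {
		cfi_trapped = true;
		return -1;
	}
	return target->fn(arg);
}

static int double_it(int x)
{
	return 2 * x;
}
```

Calling checked_call() on a target whose type_id matches HANDLER_TYPE_ID runs the function; a mismatching type_id sets cfi_trapped instead of making the call.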

Therefore, teach nvhe_hyp_panic_handler() to detect kCFI errors from the
ESR and report them. If necessary, remind the user that EL2 kCFI is not
affected by CONFIG_CFI_PERMISSIVE.

Pass $(CC_FLAGS_CFI) to the compiler when building the nVHE hyp code.

Use SYM_TYPED_FUNC_START() for __pkvm_init_switch_pgd, as nVHE can't
call it directly and must use a PA function pointer from C (because it
is part of the idmap page), which would trigger a kCFI failure if the
type ID wasn't present.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-9-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 17:40:54 +00:00
Pierre-Clément Tosi
8f3873a395 KVM: arm64: Introduce print_nvhe_hyp_panic helper
Add a helper to display a panic banner soon to also be used for kCFI
failures, to ensure that we remain consistent.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-8-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 17:40:54 +00:00
Pierre-Clément Tosi
7a928b32f1 arm64: Introduce esr_brk_comment, esr_is_cfi_brk
As it is already used in two places, move esr_comment() to a header for
re-use, with a clearer name.

Introduce esr_is_cfi_brk() to detect kCFI BRK syndromes, currently used
by early_brk64() but soon to also be used by hypervisor code.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-7-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-06-20 17:40:54 +00:00
Marc Zyngier
814ad8f96e KVM: arm64: Drop trapping of PAuth instructions/keys
We currently insist on disabling PAuth on vcpu_load(), and get to
enable it on first guest use of an instruction or a key (ignoring
the NV case for now).

It isn't clear at all what this is trying to achieve: guests tend
to use PAuth when available, and nothing forces you to expose it
to the guest if you don't want to. This also isn't totally free:
we take a full GPR save/restore between host and guest, only to
write ten 64bit registers. The "value proposition" escapes me.

So let's forget this stuff and enable PAuth eagerly if exposed to
the guest. This results in much simpler code. Performance wise,
that's not bad either (tested on M2 Pro running a fully automated
Debian installer as the workload):

- On a non-NV guest, I can see a reduction of 0.24% in the number
  of cycles (measured with perf over 10 consecutive runs)

- On a NV guest (L2), I see a 2% reduction in wall-clock time
  (measured with 'time', as M2 doesn't have a PMUv3 and NV
  doesn't support it either)

So overall, a much reduced complexity and a (small) performance
improvement.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-16-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20 12:42:51 +01:00
Marc Zyngier
213b3d1ea1 KVM: arm64: nv: Handle ERETA[AB] instructions
Now that we have some emulation in place for ERETA[AB], we can
plug it into the exception handling machinery.

As for a bare ERET, an "easy" ERETAx instruction is processed as
a fixup, while something that requires a translation regime
transition or an exception delivery is left to the slow path.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-14-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20 12:42:51 +01:00
Marc Zyngier
15db034733 KVM: arm64: nv: Reinject PAC exceptions caused by HCR_EL2.API==0
In order for a L1 hypervisor to correctly handle PAuth instructions,
it must observe traps caused by a L1 PAuth instruction when
HCR_EL2.API==0. Since we already handle the case for API==1 as
a fixup, only the exception injection case needs to be handled.

Rework the kvm_handle_ptrauth() callback to reinject the trap
in this case. Note that APK==0 is already handled by the existing
triage_sysreg_trap() helper.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-11-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20 12:42:51 +01:00
Marc Zyngier
95537f06b9 KVM: arm64: nv: Add trap forwarding for ERET and SMC
Honor the trap forwarding bits for both ERET and SMC, using a new
helper that checks for common conditions.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Co-developed-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-7-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20 12:42:50 +01:00
Marc Zyngier
80d8b55a57 KVM: arm64: Add helpers for ESR_ELx_ERET_ISS_ERET*
The ESR_ELx_ERET_ISS_ERET* macros are a bit confusing:

- ESR_ELx_ERET_ISS_ERET really indicates that we have trapped an
  ERETA* instruction, as opposed to an ERET

- ESR_ELx_ERET_ISS_ERETA really indicates that we have trapped
  an ERETAB instruction, as opposed to an ERETAA.

We could repaint those to make more sense, but these are the
names that are present in the ARM ARM, and we are sentimentally
attached to those.

Instead, add two new helpers:

- esr_iss_is_eretax() being true tells you that you need to
  authenticate the ERET

- esr_iss_is_eretab() tells you that you need to use the B key
  instead of the A key

Following patches will make use of these primitives.
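
As a rough illustration of the two helpers described above, the sketch
below decodes the two low ISS bits of a trapped ERET. The bit values
(0x2 for ESR_ELx_ERET_ISS_ERET, 0x1 for ESR_ELx_ERET_ISS_ERETA) are
assumptions based on the architectural ISS encoding for this trap, and
describe_eret() is a hypothetical name used only for this example. It is
a standalone userspace model, not the code added by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ISS encoding for a trapped ERET* (not copied from the patch). */
#define ESR_ELx_ERET_ISS_ERET	0x2	/* set: ERETAA/ERETAB, clear: plain ERET */
#define ESR_ELx_ERET_ISS_ERETA	0x1	/* set: ERETAB (B key), clear: ERETAA (A key) */

/* True when the trapped instruction is an authenticated ERET. */
static inline bool esr_iss_is_eretax(uint64_t esr)
{
	return esr & ESR_ELx_ERET_ISS_ERET;
}

/* True when the B key must be used; only meaningful if esr_iss_is_eretax(). */
static inline bool esr_iss_is_eretab(uint64_t esr)
{
	return esr & ESR_ELx_ERET_ISS_ERETA;
}

/* Hypothetical helper: name the trapped instruction from its ISS bits. */
static const char *describe_eret(uint64_t esr)
{
	if (!esr_iss_is_eretax(esr))
		return "ERET";
	return esr_iss_is_eretab(esr) ? "ERETAB" : "ERETAA";
}
```

The point of the helpers is that callers ask "do I need to authenticate?"
and "which key?" rather than reasoning about the oddly named macros.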

Suggested-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240419102935.1935571-3-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20 12:42:50 +01:00
Marc Zyngier
ea3b27d8de KVM: arm64: nv: Expand ERET trap forwarding to handle FGT
We already handle ERET being trapped from a L1 guest in hyp context.
However, with FGT, we can also have ERET being trapped from L2, and
this needs to be reinjected into L1.

Add the required exception routing.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-25-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
a77b31dce4 KVM: arm64: nv: Add SVC trap forwarding
HFGITR_EL2 allows the trap of SVC instructions to EL2. Allow these
traps to be forwarded. Take this opportunity to deny any 32bit activity
when NV is enabled.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-24-maz@kernel.org
2023-08-17 10:00:27 +01:00
Oliver Upton
37c8e49479 KVM: arm64: Let errors from SMCCC emulation to reach userspace
Typically a negative return from an exit handler is used to request a
return to userspace with the specified error. KVM's handling of SMCCC
emulation (i.e. both HVCs and SMCs) deviates from the trend and resumes
the guest instead.

Stop handling negative returns this way and instead let the error
percolate to userspace.
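
A minimal sketch of the convention described above, assuming that a
positive handler return resumes the guest, zero exits to userspace
normally, and a negative value exits to userspace as an error. The
struct and function names are hypothetical and exist only for this
illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcome of one trip through the exit handling path. */
struct run_outcome {
	bool resume_guest;	/* re-enter the guest */
	int user_ret;		/* value handed back to userspace otherwise */
};

/* Assumed convention: >0 resume guest, 0 exit to userspace, <0 error. */
static struct run_outcome dispatch_exit(int handler_ret)
{
	struct run_outcome out = {
		.resume_guest = handler_ret > 0,
		.user_ret = 0,
	};

	if (handler_ret < 0)
		out.user_ret = handler_ret;	/* let the error percolate */
	return out;
}

/* Model of the old SMCCC behaviour: errors were swallowed, guest resumed. */
static struct run_outcome dispatch_smccc_exit_old(int handler_ret)
{
	if (handler_ret < 0)
		handler_ret = 1;
	return dispatch_exit(handler_ret);
}
```

With this model, dispatch_exit(-EINVAL) reports the error to userspace,
whereas dispatch_smccc_exit_old(-EINVAL) silently resumes the guest,
which is the deviation this patch removes.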

Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230404154050.2270077-12-oliver.upton@linux.dev
2023-04-05 12:07:42 +01:00
Oliver Upton
d824dff191 KVM: arm64: Add support for KVM_EXIT_HYPERCALL
In anticipation of user hypercall filters, add the necessary plumbing to
get SMCCC calls out to userspace. Even though the exit structure has
space for KVM to pass register arguments, let's just avoid it altogether
and let userspace poke at the registers via KVM_GET_ONE_REG.

This deliberately stretches the definition of a 'hypercall' to cover
SMCs from EL1 in addition to the HVCs we know and love. KVM doesn't
support EL1 calls into secure services, but now we can paint that as a
userspace problem and be done with it.

Finally, we need a flag to let userspace know what conduit instruction
was used (i.e. SMC vs. HVC).

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230404154050.2270077-9-oliver.upton@linux.dev
2023-04-05 12:07:41 +01:00
Oliver Upton
c2d2e9b3d8 KVM: arm64: Start handling SMCs from EL1
Whelp, the architecture gods have spoken and confirmed that the function
ID space is common between SMCs and HVCs. Not only that, the expectation
is that hypervisors handle calls to both SMC and HVC conduits. KVM
recently picked up support for SMCCCs in commit bd36b1a9eb ("KVM:
arm64: nv: Handle SMCs taken from virtual EL2") but scoped it only to a
nested hypervisor.

Let's just open the floodgates and let EL1 access our SMCCC
implementation with the SMC instruction as well.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230404154050.2270077-6-oliver.upton@linux.dev
2023-04-05 12:07:41 +01:00
Oliver Upton
aac9496812 KVM: arm64: Rename SMC/HVC call handler to reflect reality
KVM handles SMCCC calls from virtual EL2 that use the SMC instruction
since commit bd36b1a9eb ("KVM: arm64: nv: Handle SMCs taken from
virtual EL2"). Thus, the function name of the handler no longer reflects
reality.

Normalize the name on SMCCC, since that's the only hypercall interface
KVM supports in the first place. No functional change intended.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230404154050.2270077-5-oliver.upton@linux.dev
2023-04-05 12:07:41 +01:00
Jintack Lim
bd36b1a9eb KVM: arm64: nv: Handle SMCs taken from virtual EL2
Non-nested guests have used the hvc instruction to initiate SMCCC
calls into KVM. This is quite a poor fit for NV as hvc exceptions are
always taken to EL2. In other words, KVM needs to unconditionally
forward the hvc exception back into vEL2 to uphold the architecture.

Instead, treat the smc instruction from vEL2 as we would a guest
hypercall, thereby allowing the vEL2 to interact with KVM's hypercall
surface. Note that on NV-capable hardware HCR_EL2.TSC causes smc
instructions executed in non-secure EL1 to trap to EL2, even if EL3 is
not implemented.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-13-maz@kernel.org
[Oliver: redo commit message, only handle smc from vEL2]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 10:08:39 +00:00
Christoffer Dall
6898a55ce3 KVM: arm64: nv: Handle trapped ERET from virtual EL2
When a guest hypervisor running virtual EL2 in EL1 executes an ERET
instruction, we will have set HCR_EL2.NV which traps ERET to EL2, so
that we can emulate the exception return in software.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-12-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 09:16:11 +00:00
Jintack Lim
93c33702cd KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
As we expect all PSCI calls from the L1 hypervisor to be performed
using SMC when nested virtualization is enabled, it is clear that
all HVC instructions from the VM (including from the virtual EL2)
are supposed to be handled in the virtual EL2.

Forward these to EL2 as required.
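
The routing decision can be pictured with the sketch below. It assumes
that HCR_EL2.HCD sits at bit 29 and that, per the "[maz: add handling of
HCR_EL2.HCD]" note, an HVC executed while the guest hypervisor has set
HCD results in an UNDEF rather than a forwarded exception. The enum and
route_hvc() are hypothetical names for this illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define HCR_HCD	(UINT64_C(1) << 29)	/* assumed position of HCR_EL2.HCD */

enum hvc_action {
	HVC_HANDLE_AS_HYPERCALL,	/* non-nested guest: SMCCC call into KVM */
	HVC_INJECT_INTO_VEL2,		/* nested: reflect the exception to virtual EL2 */
	HVC_INJECT_UNDEF,		/* nested, HVC disabled by the guest hypervisor */
};

/* Decide what to do with a trapped HVC given the guest's view of HCR_EL2. */
static enum hvc_action route_hvc(bool nv_enabled, uint64_t virtual_hcr_el2)
{
	if (!nv_enabled)
		return HVC_HANDLE_AS_HYPERCALL;
	if (virtual_hcr_el2 & HCR_HCD)
		return HVC_INJECT_UNDEF;
	return HVC_INJECT_INTO_VEL2;
}
```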

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
[maz: add handling of HCR_EL2.HCD]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-11-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 09:16:11 +00:00
Reiji Watanabe
370531d1e9 KVM: arm64: Clear PSTATE.SS when the Software Step state was Active-pending
While userspace enables single-step, if the Software Step state at the
last guest exit was "Active-pending", clear PSTATE.SS on guest entry
to restore the state.

Currently, KVM sets PSTATE.SS to 1 on every guest entry while userspace
enables single-step for the vCPU (with KVM_GUESTDBG_SINGLESTEP).
It means KVM always makes the vCPU's Software Step state
"Active-not-pending" on the guest entry, which lets the VCPU perform
single-step (then Software Step exception is taken). This could cause
extra single-step (without returning to userspace) if the Software Step
state at the last guest exit was "Active-pending" (i.e. the last
exit was triggered by an asynchronous exception after the single-step
is performed, but before the Software Step exception is taken.
See "Figure D2-3 Software step state machine" and "D2.12.7 Behavior
in the active-pending state" in ARM DDI 0487I.a for more info about
this behavior).

Fix this by clearing PSTATE.SS on guest entry if the Software Step state
at the last exit was "Active-pending" so that KVM restores the state (and
the exception is taken before further single-step is performed).
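
The entry-time decision can be sketched as follows. This is a userspace
model under stated assumptions: PSTATE.SS is taken to be bit 21 of the
saved PSTATE, and struct step_state and pstate_for_entry() are
hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PSTATE_SS	(UINT64_C(1) << 21)	/* assumed position of PSTATE.SS */

/* Hypothetical per-vCPU debug bookkeeping. */
struct step_state {
	bool single_step_enabled;	/* userspace set KVM_GUESTDBG_SINGLESTEP */
	bool ss_active_pending;		/* state recorded at the last guest exit */
};

/* Compute the PSTATE value to use on the next guest entry. */
static uint64_t pstate_for_entry(uint64_t pstate, const struct step_state *s)
{
	assert(s != NULL);

	if (!s->single_step_enabled)
		return pstate;

	/* Active-pending: the step exception must fire before any instruction. */
	if (s->ss_active_pending)
		return pstate & ~PSTATE_SS;

	/* Active-not-pending: let the vCPU execute exactly one instruction. */
	return pstate | PSTATE_SS;
}
```

Before the fix, the model would correspond to always taking the last
branch, which is what allowed the extra step.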

Fixes: 337b99bf7e ("KVM: arm64: guest debug, add support for single-step")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220917010600.532642-3-reijiw@google.com
2022-09-19 10:48:53 +01:00
Marc Zyngier
0982c8d859 Merge branch kvm-arm64/nvhe-stacktrace into kvmarm-master/next
* kvm-arm64/nvhe-stacktrace: (27 commits)
  : .
  : Add an overflow stack to the nVHE EL2 code, allowing
  : the implementation of an unwinder, courtesy of
  : Kalesh Singh. From the cover letter (slightly edited):
  :
  : "nVHE has two modes of operation: protected (pKVM) and unprotected
  : (conventional nVHE). Depending on the mode, a slightly different approach
  : is used to dump the hypervisor stacktrace but the core unwinding logic
  : remains the same.
  :
  : * Protected nVHE (pKVM) stacktraces:
  :
  : In protected nVHE mode, the host cannot directly access hypervisor memory.
  :
  : The hypervisor stack unwinding happens in EL2 and is made accessible to
  : the host via a shared buffer. Symbolizing and printing the stacktrace
  : addresses is delegated to the host and happens in EL1.
  :
  : * Non-protected (Conventional) nVHE stacktraces:
  :
  : In non-protected mode, the host is able to directly access the hypervisor
  : stack pages.
  :
  : The hypervisor stack unwinding and dumping of the stacktrace is performed
  : by the host in EL1, as this avoids the memory overhead of setting up
  : shared buffers between the host and hypervisor."
  :
  : Additional patches from Oliver Upton and Marc Zyngier, tidying up
  : the initial series.
  : .
  arm64: Update 'unwinder howto'
  KVM: arm64: Don't open code ARRAY_SIZE()
  KVM: arm64: Move nVHE-only helpers into kvm/stacktrace.c
  KVM: arm64: Make unwind()/on_accessible_stack() per-unwinder functions
  KVM: arm64: Move nVHE stacktrace unwinding into its own compilation unit
  KVM: arm64: Move PROTECTED_NVHE_STACKTRACE around
  KVM: arm64: Introduce pkvm_dump_backtrace()
  KVM: arm64: Implement protected nVHE hyp stack unwinder
  KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace
  KVM: arm64: Stub implementation of pKVM HYP stack unwinder
  KVM: arm64: Allocate shared pKVM hyp stacktrace buffers
  KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig
  KVM: arm64: Introduce hyp_dump_backtrace()
  KVM: arm64: Implement non-protected nVHE hyp stack unwinder
  KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace
  KVM: arm64: Stub implementation of non-protected nVHE HYP stack unwinder
  KVM: arm64: On stack overflow switch to hyp overflow_stack
  arm64: stacktrace: Add description of stacktrace/common.h
  arm64: stacktrace: Factor out common unwind()
  arm64: stacktrace: Handle frame pointer from different address spaces
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-27 18:33:27 +01:00
Marc Zyngier
9f5fee05f6 KVM: arm64: Move nVHE stacktrace unwinding into its own compilation unit
The unwinding code doesn't really belong to the exit handling
code. Instead, move it to a file (conveniently named stacktrace.c
to confuse the reviewer), and move all the stacktrace-related
stuff there.

It will be joined by more code very soon.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Tested-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20220727142906.1856759-3-maz@kernel.org
2022-07-27 18:18:03 +01:00
Kalesh Singh
3a7e1b55aa KVM: arm64: Introduce pkvm_dump_backtrace()
Dumps the pKVM hypervisor backtrace from EL1 by reading the unwound
addresses from the shared stacktrace buffer.

The nVHE hyp backtrace is dumped on hyp_panic(), before panicking the
host.

[  111.623091] kvm [367]: nVHE call trace:
[  111.623215] kvm [367]:  [<ffff8000090a6570>] __kvm_nvhe_hyp_panic+0xac/0xf8
[  111.623448] kvm [367]:  [<ffff8000090a65cc>] __kvm_nvhe_hyp_panic_bad_stack+0x10/0x10
[  111.623642] kvm [367]:  [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34
. . .
[  111.640366] kvm [367]:  [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34
[  111.640467] kvm [367]:  [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34
[  111.640574] kvm [367]:  [<ffff8000090a5de4>] __kvm_nvhe___kvm_vcpu_run+0x30/0x40c
[  111.640676] kvm [367]:  [<ffff8000090a8b64>] __kvm_nvhe_handle___kvm_vcpu_run+0x30/0x48
[  111.640778] kvm [367]:  [<ffff8000090a88b8>] __kvm_nvhe_handle_trap+0xc4/0x128
[  111.640880] kvm [367]:  [<ffff8000090a7864>] __kvm_nvhe___host_exit+0x64/0x64
[  111.640996] kvm [367]: ---[ end nVHE call trace ]---
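
The host-side dump can be pictured as a simple walk over the shared
buffer. The sketch below is a userspace model only: it assumes the
hypervisor terminates a short trace with a zero entry, uses a
hypothetical hyp_offset parameter to stand in for whatever address
translation the host applies, and prints raw addresses instead of
symbolizing them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical model: `buf` holds return addresses written by the
 * hypervisor, ended by a zero entry if shorter than `nr_entries`.
 * Returns the number of frames printed.
 */
static size_t dump_shared_backtrace(const uint64_t *buf, size_t nr_entries,
				    uint64_t hyp_offset)
{
	size_t i;

	assert(buf != NULL);

	printf("nVHE call trace:\n");
	for (i = 0; i < nr_entries && buf[i]; i++)
		printf(" [<%016llx>]\n",
		       (unsigned long long)(buf[i] + hyp_offset));
	printf("---[ end nVHE call trace ]---\n");

	return i;
}
```

Because the unwinding already happened at EL2, the host side needs no
access to the hypervisor stack; it only consumes the addresses left in
the buffer.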

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-18-kaleshsingh@google.com
2022-07-26 10:51:39 +01:00
Kalesh Singh
314a61dc31 KVM: arm64: Introduce hyp_dump_backtrace()
In non-protected nVHE mode, unwinds and dumps the hypervisor backtrace
from EL1. This is possible because the host can directly access the
hypervisor stack pages in non-protected mode.

The nVHE backtrace is dumped on hyp_panic(), before panicking the host.

[  101.498183] kvm [377]: nVHE call trace:
[  101.498363] kvm [377]:  [<ffff8000090a6570>] __kvm_nvhe_hyp_panic+0xac/0xf8
[  101.499045] kvm [377]:  [<ffff8000090a65cc>] __kvm_nvhe_hyp_panic_bad_stack+0x10/0x10
[  101.499498] kvm [377]:  [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34
. . .
[  101.524929] kvm [377]:  [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34
[  101.525062] kvm [377]:  [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34
[  101.525195] kvm [377]:  [<ffff8000090a5de4>] __kvm_nvhe___kvm_vcpu_run+0x30/0x40c
[  101.525333] kvm [377]:  [<ffff8000090a8b64>] __kvm_nvhe_handle___kvm_vcpu_run+0x30/0x48
[  101.525468] kvm [377]:  [<ffff8000090a88b8>] __kvm_nvhe_handle_trap+0xc4/0x128
[  101.525602] kvm [377]:  [<ffff8000090a7864>] __kvm_nvhe___host_exit+0x64/0x64
[  101.525745] kvm [377]: ---[ end nVHE call trace ]---

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-12-kaleshsingh@google.com
2022-07-26 10:49:50 +01:00
Marc Zyngier
aeb7942b64 Merge branch kvm-arm64/misc-5.20 into kvmarm-master/next
* kvm-arm64/misc-5.20:
  : .
  : Misc fixes for 5.20:
  :
  : - Tidy up the hyp/nvhe Makefile
  :
  : - Fix functions pointlessly returning a void value
  :
  : - Fix vgic_init selftest to handle the GICv3-on-v3 case
  :
  : - Fix hypervisor symbolisation when CONFIG_RANDOMIZE_BASE=y
  : .
  KVM: arm64: Fix hypervisor address symbolization
  KVM: arm64: selftests: Add support for GICv2 on v3
  KVM: arm64: Don't return from void function
  KVM: arm64: nvhe: Add intermediates to 'targets' instead of extra-y
  KVM: arm64: nvhe: Rename confusing obj-y

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:45:22 +01:00
Kalesh Singh
ed6313a93f KVM: arm64: Fix hypervisor address symbolization
With CONFIG_RANDOMIZE_BASE=y vmlinux addresses will resolve incorrectly
from kallsyms. Fix this by adding the KASLR offset before printing the
symbols.

Fixes: 6ccf9cb557 ("KVM: arm64: Symbolize the nVHE HYP addresses")
Reported-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220715235824.2549012-1-kaleshsingh@google.com
2022-07-17 11:43:40 +01:00
Marc Zyngier
eebc538d8e KVM: arm64: Move vcpu WFIT flag to the state flag set
The host kernel uses the WFIT flag to remember that a vcpu has used
this instruction and wake it up as required. Move it to the state
set, as nothing in the hypervisor uses this information.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29 10:23:23 +01:00
Linus Torvalds
bf9095424d Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "S390:

   - ultravisor communication device driver

   - fix TEID on terminating storage key ops

  RISC-V:

   - Added Sv57x4 support for G-stage page table

   - Added range based local HFENCE functions

   - Added remote HFENCE functions based on VCPU requests

   - Added ISA extension registers in ONE_REG interface

   - Updated KVM RISC-V maintainers entry to cover selftests support

  ARM:

   - Add support for the ARMv8.6 WFxT extension

   - Guard pages for the EL2 stacks

   - Trap and emulate AArch32 ID registers to hide unsupported features

   - Ability to select and save/restore the set of hypercalls exposed to
     the guest

   - Support for PSCI-initiated suspend in collaboration with userspace

   - GICv3 register-based LPI invalidation support

   - Move host PMU event merging into the vcpu data structure

   - GICv3 ITS save/restore fixes

   - The usual set of small-scale cleanups and fixes

  x86:

   - New ioctls to get/set TSC frequency for a whole VM

   - Allow userspace to opt out of hypercall patching

   - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr

  AMD SEV improvements:

   - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

   - V_TSC_AUX support

  Nested virtualization improvements for AMD:

   - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE,
     nested vGIF)

   - Allow AVIC to co-exist with a nested guest running

   - Fixes for LBR virtualizations when a nested guest is running, and
     nested LBR virtualization support

   - PAUSE filtering for nested hypervisors

  Guest support:

   - Decoupling of vcpu_is_preempted from PV spinlocks"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits)
  KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest
  KVM: selftests: x86: Sync the new name of the test case to .gitignore
  Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND
  x86, kvm: use correct GFP flags for preemption disabled
  KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer
  x86/kvm: Alloc dummy async #PF token outside of raw spinlock
  KVM: x86: avoid calling x86 emulator without a decoded instruction
  KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak
  x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave)
  s390/uv_uapi: depend on CONFIG_S390
  KVM: selftests: x86: Fix test failure on arch lbr capable platforms
  KVM: LAPIC: Trace LAPIC timer expiration on every vmentry
  KVM: s390: selftest: Test suppression indication on key prot exception
  KVM: s390: Don't indicate suppression on dirtying, failing memop
  selftests: drivers/s390x: Add uvdevice tests
  drivers/s390/char: Add Ultravisor io device
  MAINTAINERS: Update KVM RISC-V entry to cover selftests support
  RISC-V: KVM: Introduce ISA extension register
  RISC-V: KVM: Cleanup stale TLB entries when host CPU changes
  RISC-V: KVM: Add remote HFENCE functions based on VCPU requests
  ...
2022-05-26 14:20:14 -07:00
Marc Zyngier
d25f30fe41 Merge branch kvm-arm64/aarch32-idreg-trap into kvmarm-master/next
* kvm-arm64/aarch32-idreg-trap:
  : .
  : Add trapping/sanitising infrastructure for AArch32 system registers,
  : allowing more control over what we actually expose (such as the PMU).
  :
  : Patches courtesy of Oliver and Alexandru.
  : .
  KVM: arm64: Fix new instances of 32bit ESRs
  KVM: arm64: Hide AArch32 PMU registers when not available
  KVM: arm64: Start trapping ID registers for 32 bit guests
  KVM: arm64: Plumb cp10 ID traps through the AArch64 sysreg handler
  KVM: arm64: Wire up CP15 feature registers to their AArch64 equivalents
  KVM: arm64: Don't write to Rt unless sys_reg emulation succeeds
  KVM: arm64: Return a bool from emulate_cp()

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-04 09:42:45 +01:00
Marc Zyngier
904cabf471 Merge branch kvm-arm64/hyp-stack-guard into kvmarm-master/next
* kvm-arm64/hyp-stack-guard:
  : .
  : Harden the EL2 stack by providing stack guards, courtesy of
  : Kalesh Singh.
  : .
  KVM: arm64: Symbolize the nVHE HYP addresses
  KVM: arm64: Detect and handle hypervisor stack overflows
  KVM: arm64: Add guard pages for pKVM (protected nVHE) hypervisor stack
  KVM: arm64: Add guard pages for KVM nVHE hypervisor stack
  KVM: arm64: Introduce pkvm_alloc_private_va_range()
  KVM: arm64: Introduce hyp_alloc_private_va_range()

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-04 09:42:37 +01:00