- Add support for tracing in the standalone EL2 hypervisor code, which
should help both debugging and performance analysis. This uses the
new infrastructure for 'remote' trace buffers that can be exposed
by non-kernel entities such as firmware, and which came through the
tracing tree.
- Add support for GICv5 Per Processor Interrupts (PPIs), as the starting
point for supporting the new GIC architecture in KVM.
- Finally add support for pKVM protected guests, where pages are unmapped
from the host as they are faulted into the guest and can be shared back
from the guest using pKVM hypercalls. Protected guests are created
using a new machine type identifier. As the elusive guestmem has not
yet delivered on its promises, anonymous memory is also supported.
This is only a first step towards full isolation from the host; for
example, the CPU register state and DMA accesses are not yet isolated.
Because this does not really yet bring fully what it promises, it is
hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and
also triggers TAINT_USER when a VM is created. Caveat emptor.
- Rework the dreaded user_mem_abort() function to make it more
maintainable, reducing the amount of state being exposed to the
various helpers and rendering a substantial amount of state immutable.
- Expand the Stage-2 page table dumper to support NV shadow page tables
on a per-VM basis.
- Tidy up the pKVM PSCI proxy code to be slightly less hard to follow.
- Fix both SPE and TRBE in non-VHE configurations so that they do not
generate spurious, out of context table walks that ultimately lead
to very bad HW lockups.
- A small set of patches fixing the Stage-2 MMU freeing in error cases.
- Tighten-up accepted SMC immediate value to be only #0 for host
SMCCC calls.
- The usual cleanups and other selftest churn.
LoongArch:
- Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel().
- Add DMSINTC irqchip in kernel support.
RISC-V:
- Fix steal time shared memory alignment checks
- Fix vector context allocation leak
- Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi()
- Fix double-free of sdata in kvm_pmu_clear_snapshot_area()
- Fix integer overflow in kvm_pmu_validate_counter_mask()
- Fix shift-out-of-bounds in make_xfence_request()
- Fix lost write protection on huge pages during dirty logging
- Split huge pages during fault handling for dirty logging
- Skip CSR restore if VCPU is reloaded on the same core
- Implement kvm_arch_has_default_irqchip() for KVM selftests
- Factored-out ISA checks into separate sources
- Added hideleg to struct kvm_vcpu_config
- Factored-out VCPU config into separate sources
- Support configuration of per-VM HGATP mode from KVM user space
s390:
- Support for ESA (31-bit) guests inside nested hypervisors.
- Remove restriction on memslot alignment, which is not needed anymore with
the new gmap code.
- Fix LPSW/E to update the bear (which of course is the breaking event
address register).
x86:
- Shut up various UBSAN warnings on reading module parameter before they
were initialized.
- Don't zero-allocate page tables that are used for splitting hugepages in
the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and
thus write all bytes.
- As an optimization, bail early when trying to unsync 4KiB mappings if the
target gfn can just be mapped with a 2MiB hugepage.
x86 generic:
- Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely
struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM
would dereference stack pointer after an exit to userspace.
- Clean up and comment the emulated MMIO code to try to make it easier to
maintain (not necessarily "easy", but "easier").
- Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of VMX
and SVM enabling) as it is needed for trusted I/O.
- Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions
- Immediately fail the build if a required #define is missing in one of
KVM's headers that is included multiple times.
- Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected
exception, mostly to prevent syzkaller from abusing the uAPI to
trigger WARNs, but also because it can help prevent userspace from
unintentionally crashing the VM.
- Exempt SMM from CPUID faulting on Intel, as per the spec.
- Misc hardening and cleanup changes.
x86 (AMD):
- Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU
so that KVM doesn't prematurely re-enable AVIC if multiple
vCPUs have to-be-injected IRQs.
- Clean up and optimize the OSVW handling, avoiding a bug in which KVM would
overwrite state when enabling virtualization on multiple CPUs in parallel.
This should not be a problem because OSVW should usually be the same for
all CPUs.
- Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a
"too large" size based purely on user input.
- Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION.
- Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as
doing so for an SNP guest will crash the host due to an RMP violation
page fault.
- Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries
are required to hold kvm->lock, and enforce it by lockdep. Fix various
bugs where sev_guest() was not ensured to be stable for the whole
duration of a function or ioctl.
- Convert a pile of kvm->lock SEV code to guard().
- Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD,
for which KVM needs to set CR2 and DR6 as a response to ioctls such as
KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2
rather than CR2, for example). Only set CR2 and DR6 when consumption of
the payload is imminent, but on the other hand force delivery of the
payload in all paths where userspace retrieves CR2 or DR6.
- Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead
of vmcb02->save.cr2. The value is out of sync after a save/restore
or after a #PF is injected into L2.
- Fix a class of nSVM bugs where some fields written by the CPU are not
synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not
up-to-date when saved by KVM_GET_NESTED_STATE.
- Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and
KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after
save+restore.
- Add a variety of missing nSVM consistency checks.
- Fix several bugs where KVM failed to correctly update VMCB fields on
nested #VMEXIT.
- Fix several bugs where KVM failed to correctly synthesize #UD or #GP for
SVM-related instructions.
- Add support for save+restore of virtualized LBRs (on SVM).
- Refactor various helpers and macros to improve clarity and (hopefully)
make the code easier to maintain.
- Aggressively sanitize fields when copying from vmcb12, to guard against
unintentionally allowing L1 to utilize yet-to-be-defined features.
- Fix several bugs where KVM botched rAX legality checks when emulating SVM
instructions. There are remaining issues in that KVM doesn't handle size
prefix overrides for 64-bit guests.
- Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of
somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's
architectural but sketchy behavior of generating #GP for "unsupported"
addresses).
- Cache all used vmcb12 fields to further harden against TOCTOU bugs.
x86 (Intel):
- Drop obsolete branch hint prefixes from the VMX instruction macros.
- Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a
register input when appropriate.
- Code cleanups.
guest_memfd:
- Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support
reclaim, the memory is unevictable, and there is no storage to write
back to.
LoongArch selftests:
- Add KVM PMU test cases
s390 selftests:
- Enable more memory selftests.
x86 selftests:
- Add support for Hygon CPUs in KVM selftests.
- Fix a bug in the MSR test where it would get false failures on AMD/Hygon
CPUs with exactly one of RDPID or RDTSCP.
- Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a
bug where the kernel would attempt to collapse guest_memfd folios against
KVM's will.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmnftRQUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPAzwf+NKO4Ktv+7A22ImN0SBl0nlUuulsz
vTcw3+hxdRoIw83GdNS+hG5js0wrpMDnbv3t4+VliDNBSSxrBzcSWX2wpilW0Xtw
qGo1MWhs2lKPy1NlaRVOwPS6j7uF3AR0TQ1iQLGMedQuCU9WpiKJxyhNXJdbLrt3
8EgFzsvtEsv+jKNRUNDf9+d0j4gZsFyIe+Brhianbw+u3/UCiUClLCdsKPc4+5ZX
08otYXytacGNIf/5Ev1vT4pHkHL0yqKXAtX7LEtaS3+0KrPuLjV4slemivzE9vf5
Evafm5AhA4wpaNMb1ZerhY3T94lsMaJpWxotjR//0Q7C9B59pCQnXCm8mg==
=CcE0
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"Arm:
- Add support for tracing in the standalone EL2 hypervisor code,
which should help both debugging and performance analysis. This
uses the new infrastructure for 'remote' trace buffers that can be
exposed by non-kernel entities such as firmware, and which came
through the tracing tree
- Add support for GICv5 Per Processor Interrupts (PPIs), as the
starting point for supporting the new GIC architecture in KVM
- Finally add support for pKVM protected guests, where pages are
unmapped from the host as they are faulted into the guest and can
be shared back from the guest using pKVM hypercalls. Protected
guests are created using a new machine type identifier. As the
elusive guestmem has not yet delivered on its promises, anonymous
memory is also supported
This is only a first step towards full isolation from the host; for
example, the CPU register state and DMA accesses are not yet
isolated. Because this does not really yet bring fully what it
promises, it is hidden behind CONFIG_ARM_PKVM_GUEST +
'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is
created. Caveat emptor
- Rework the dreaded user_mem_abort() function to make it more
maintainable, reducing the amount of state being exposed to the
various helpers and rendering a substantial amount of state
immutable
- Expand the Stage-2 page table dumper to support NV shadow page
tables on a per-VM basis
- Tidy up the pKVM PSCI proxy code to be slightly less hard to
follow
- Fix both SPE and TRBE in non-VHE configurations so that they do not
generate spurious, out of context table walks that ultimately lead
to very bad HW lockups
- A small set of patches fixing the Stage-2 MMU freeing in error
cases
- Tighten-up accepted SMC immediate value to be only #0 for host
SMCCC calls
- The usual cleanups and other selftest churn
LoongArch:
- Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel()
- Add DMSINTC irqchip in kernel support
RISC-V:
- Fix steal time shared memory alignment checks
- Fix vector context allocation leak
- Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi()
- Fix double-free of sdata in kvm_pmu_clear_snapshot_area()
- Fix integer overflow in kvm_pmu_validate_counter_mask()
- Fix shift-out-of-bounds in make_xfence_request()
- Fix lost write protection on huge pages during dirty logging
- Split huge pages during fault handling for dirty logging
- Skip CSR restore if VCPU is reloaded on the same core
- Implement kvm_arch_has_default_irqchip() for KVM selftests
- Factored-out ISA checks into separate sources
- Added hideleg to struct kvm_vcpu_config
- Factored-out VCPU config into separate sources
- Support configuration of per-VM HGATP mode from KVM user space
s390:
- Support for ESA (31-bit) guests inside nested hypervisors
- Remove restriction on memslot alignment, which is not needed
anymore with the new gmap code
- Fix LPSW/E to update the bear (which of course is the breaking
event address register)
x86:
- Shut up various UBSAN warnings on reading module parameter before
they were initialized
- Don't zero-allocate page tables that are used for splitting
hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in
the page table and thus write all bytes
- As an optimization, bail early when trying to unsync 4KiB mappings
if the target gfn can just be mapped with a 2MiB hugepage
x86 generic:
- Copy single-chunk MMIO write values into struct kvm_vcpu (more
precisely struct kvm_mmio_fragment) to fix use-after-free stack
bugs where KVM would dereference stack pointer after an exit to
userspace
- Clean up and comment the emulated MMIO code to try to make it
easier to maintain (not necessarily "easy", but "easier")
- Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of
VMX and SVM enabling) as it is needed for trusted I/O
- Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions
- Immediately fail the build if a required #define is missing in one
of KVM's headers that is included multiple times
- Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected
exception, mostly to prevent syzkaller from abusing the uAPI to
trigger WARNs, but also because it can help prevent userspace from
unintentionally crashing the VM
- Exempt SMM from CPUID faulting on Intel, as per the spec
- Misc hardening and cleanup changes
x86 (AMD):
- Fix and optimize IRQ window inhibit handling for AVIC; make it
per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple
vCPUs have to-be-injected IRQs
- Clean up and optimize the OSVW handling, avoiding a bug in which
KVM would overwrite state when enabling virtualization on multiple
CPUs in parallel. This should not be a problem because OSVW should
usually be the same for all CPUs
- Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains
about a "too large" size based purely on user input
- Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION
- Disallow synchronizing a VMSA of an already-launched/encrypted
vCPU, as doing so for an SNP guest will crash the host due to an
RMP violation page fault
- Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped
queries are required to hold kvm->lock, and enforce it by lockdep.
Fix various bugs where sev_guest() was not ensured to be stable for
the whole duration of a function or ioctl
- Convert a pile of kvm->lock SEV code to guard()
- Play nicer with userspace that does not enable
KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6
as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the
payload would end up in EXITINFO2 rather than CR2, for example).
Only set CR2 and DR6 when consumption of the payload is imminent,
but on the other hand force delivery of the payload in all paths
where userspace retrieves CR2 or DR6
- Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT
instead of vmcb02->save.cr2. The value is out of sync after a
save/restore or after a #PF is injected into L2
- Fix a class of nSVM bugs where some fields written by the CPU are
not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so
are not up-to-date when saved by KVM_GET_NESTED_STATE
- Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE
and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly
initialized after save+restore
- Add a variety of missing nSVM consistency checks
- Fix several bugs where KVM failed to correctly update VMCB fields
on nested #VMEXIT
- Fix several bugs where KVM failed to correctly synthesize #UD or
#GP for SVM-related instructions
- Add support for save+restore of virtualized LBRs (on SVM)
- Refactor various helpers and macros to improve clarity and
(hopefully) make the code easier to maintain
- Aggressively sanitize fields when copying from vmcb12, to guard
against unintentionally allowing L1 to utilize yet-to-be-defined
features
- Fix several bugs where KVM botched rAX legality checks when
emulating SVM instructions. There are remaining issues in that KVM
doesn't handle size prefix overrides for 64-bit guests
- Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
instead of somewhat arbitrarily synthesizing #GP (i.e. don't double
down on AMD's architectural but sketchy behavior of generating #GP
for "unsupported" addresses)
- Cache all used vmcb12 fields to further harden against TOCTOU bugs
x86 (Intel):
- Drop obsolete branch hint prefixes from the VMX instruction macros
- Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a
register input when appropriate
- Code cleanups
guest_memfd:
- Don't mark guest_memfd folios as accessed, as guest_memfd doesn't
support reclaim, the memory is unevictable, and there is no storage
to write back to
LoongArch selftests:
- Add KVM PMU test cases
s390 selftests:
- Enable more memory selftests
x86 selftests:
- Add support for Hygon CPUs in KVM selftests
- Fix a bug in the MSR test where it would get false failures on
AMD/Hygon CPUs with exactly one of RDPID or RDTSCP
- Add an MADV_COLLAPSE testcase for guest_memfd as a regression test
for a bug where the kernel would attempt to collapse guest_memfd
folios against KVM's will"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits)
KVM: x86: use inlines instead of macros for is_sev_*guest
x86/virt: Treat SVM as unsupported when running as an SEV+ guest
KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails
KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper
KVM: SEV: use mutex guard in snp_handle_guest_req()
KVM: SEV: use mutex guard in sev_mem_enc_unregister_region()
KVM: SEV: use mutex guard in sev_mem_enc_ioctl()
KVM: SEV: use mutex guard in snp_launch_update()
KVM: SEV: Assert that kvm->lock is held when querying SEV+ support
KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe"
KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y
KVM: SEV: WARN on unhandled VM type when initializing VM
KVM: LoongArch: selftests: Add PMU overflow interrupt test
KVM: LoongArch: selftests: Add basic PMU event counting test
KVM: LoongArch: selftests: Add cpucfg read/write helpers
LoongArch: KVM: Add DMSINTC inject msi to vCPU
LoongArch: KVM: Add DMSINTC device support
LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function
LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch
LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch
...
fail to provide the mandatory irq_write_msi_msg() callback, which prevents
a NULL pointer dereference later.
-----BEGIN PGP SIGNATURE-----
iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmnbuiMQHHRnbHhAa2Vy
bmVsLm9yZwAKCRCmGPVMDXSYob6MEAC/Xg2JpcgffaqkdSaKMR6s0Hvoz4YevD2k
tlVc7C1X8fYx/AgIYteGWw5G9kacp3FhU1bLprCF8HCaCbsP5Gi5vstaUwfuMZBH
0Lkmg9J/rZKP253rEC6D0AkliOfW1PfoNMcqlNMQyCB+iIxBZqX8ZhQwaZwtb6ul
CoKyiuEaFRsJKyo52p+M39o3dgvB6sFV93rj/UbVnYSnAO4uMJ91yL4tAkcIIbiK
mpoSmYqUdN7CaWYPhZ/6bdGZkDl422dwErszt3KC7p4yYqYWj9ghbmq5h2Tf3Gb6
GBRKXDr7zXBDb7vom/BBduCIYK3KKo7xSaolS0nrpaQcLlAbRQ7xe5aAB4KHf8+U
deO30KHMXaHXHCQ5Y2D2AGo1b9iSrynlBzYwEvdn54M6rK9/RpzwSs9X7IvGs06u
yyNtJzk3TGSRwtLk0/gOE6+jk2ZqVlqR25swPliX+IaUGBCPg0IWu/oT0M76ZhjO
Kya8Z0TmF1oBTtXTSI3EKo5F5qd1tyxUCFKqfr+T9hZPkypagxKMpZCwSsyuK13Q
fcCKpx+0CXLDKY6WINjEp2S1A7iU3KDZUJ+H0khu9XGG6mVNq75HysZaNR3FF1tN
WiU+Ftj5DmRexLoZc/q+7NnmPvw4cN39l2jo8G3ihuwUSngUOn5lvhzid3+hlwEv
GHvsH73pjg==
=ztWy
-----END PGP SIGNATURE-----
Merge tag 'irq-msi-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MSI interrupt update from Thomas Gleixner:
"A small update for the MSI interrupt library to check for callers
which fail to provide the mandatory irq_write_msi_msg() callback,
which prevents a NULL pointer dereference later"
* tag 'irq-msi-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/msi-lib: Refuse initialization when irq_write_msi_msg() is missing
- A large refactoring for the Renesas RZV2H driver to add new interrupt
types cleanly.
- A large refactoring for the Renesas RZG2L driver to add support the new
RZ/G3L variant.
- Add support for the new NXP S32N79 chip in the IMX irq-steer driver.
- Add support for the Apple AICv3 variant
- Enhance the Loongson PCH LPC driver so it can be used on MIPS with
device tree firmware
- Allow the PIC32 EVIC driver to be built independent of MIPS in compile
tests.
- The usual small fixes and enhancements all over the place
-----BEGIN PGP SIGNATURE-----
iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmnbuaAQHHRnbHhAa2Vy
bmVsLm9yZwAKCRCmGPVMDXSYoQc6D/9i6QTUNw4ZqpZSjJvrxX2mv49ej3FkWsQf
1P59ej9A0/w1+0A8SyAFd6ExuvO3am9k64nhpsFlwv0lMTu57va3Oj14E28fjN+M
H1XWJYCLX3p7MWGsGFCmbHIeJJQnwCqyiZcs3DG0BMA5iYglJEcOijB+h+FCAwCN
jjiBW4bbqPxPcpT90uQZhwwa3eUvkrkwzZEedKkbtvgPy3jdmKvrNCAte9GJSQu3
CvLgAB1Zraq1yIDFQN/km5NByan7pVmbGuv10K/jqirqmjplM2H0X+fwXlWz4M0Q
uo33xDtJvtOegnySiJ6EimY+GTHDiloZpn2nqqUnHRIR0hxYjYudukSRorxbjXWv
c93CGk1/Mq9WHqCsWvX5xe75Ttuf5xsfej8DETYvP+cTkGV/Kk9EUNmDUoYnGkiv
UZyZBUMWejJLjnNmMpouUwW9Vc/08OLAtmqJgQukl2cbh/ujcmGC0TSA2SpBKlJT
FpGU6ZsHWxpm1UHVv+9Uhx5dArRVNeTvjNgLFtu+ZkGM+C2oBs/MKHpu44mBfAf8
Ex4sFzCslKXbYo6TwO48w9VIMQJ6DDPrwza1YN6hRn4IwBkRXl3SQsT3pJxG1WMg
NcP4dMRfe68o9y4ywxnoomH+qlux4j/tq3J7PqRY5s3WzWe8jXqfiQGNE1f10rop
gGZe5f+/Mw==
=kvZ2
-----END PGP SIGNATURE-----
Merge tag 'irq-drivers-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull interrupt chip driver updates from Thomas Gleixner:
- A large refactoring for the Renesas RZV2H driver to add new interrupt
types cleanly
- A large refactoring for the Renesas RZG2L driver to add support the
new RZ/G3L variant
- Add support for the new NXP S32N79 chip in the IMX irq-steer driver
- Add support for the Apple AICv3 variant
- Enhance the Loongson PCH LPC driver so it can be used on MIPS with
device tree firmware
- Allow the PIC32 EVIC driver to be built independent of MIPS in
compile tests
- The usual small fixes and enhancements all over the place
* tag 'irq-drivers-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
irqchip/irq-pic32-evic: Add __maybe_unused for board_bind_eic_interrupt in COMPILE_TEST
irqchip/renesas-rzv2h: Kill icu_err string
irqchip/renesas-rzv2h: Kill swint_names[]
irqchip/renesas-rzv2h: Kill swint_idx[]
irqchip/renesas-rzg2l: Add NMI support
irqchip/renesas-rzg2l: Clear the shared interrupt bit in rzg2l_irqc_free()
irqchip/renesas-rzg2l: Replace raw_spin_{lock,unlock} with guard() in rzg2l_irq_set_type()
irqchip/gic-v3: Print a warning for out-of-range interrupt numbers
irqchip/renesas-rzg2l: Add shared interrupt support
irqchip/renesas-rzg2l: Add RZ/G3L support
irqchip/renesas-rzg2l: Drop IRQC_IRQ_COUNT macro
irqchip/renesas-rzg2l: Drop IRQC_TINT_START macro
irqchip/renesas-rzg2l: Drop IRQC_NUM_IRQ macro
irqchip/renesas-rzg2l: Dynamically allocate fwspec array
irqchip/renesas-rzg2l: Split rzfive_irqc_{mask,unmask} into separate IRQ and TINT handlers
irqchip/renesas-rzg2l: Split rzfive_tint_irq_endisable() into separate IRQ and TINT helpers
irqchip/renesas-rzg2l: Replace rzg2l_irqc_irq_{enable,disable} with TINT-specific handlers
irqchip/renesas-rzg2l: Split set_type handler into separate IRQ and TINT functions
irqchip/renesas-rzg2l: Split EOI handler into separate IRQ and TINT functions
irqchip/renesas-rzg2l: Replace single irq_chip with per-region irq_chip instances
...
* New features:
- Add support for tracing in the standalone EL2 hypervisor code,
which should help both debugging and performance analysis.
This comes with a full infrastructure for 'remote' trace buffers
that can be exposed by non-kernel entities such as firmware.
- Add support for GICv5 Per Processor Interrupts (PPIs), as the
starting point for supporting the new GIC architecture in KVM.
- Finally add support for pKVM protected guests, with anonymous
memory being used as a backing store. About time!
* Improvements and bug fixes:
- Rework the dreaded user_mem_abort() function to make it more
maintainable, reducing the amount of state being exposed to
the various helpers and rendering a substantial amount of
state immutable.
- Expand the Stage-2 page table dumper to support NV shadow
page tables on a per-VM basis.
- Tidy up the pKVM PSCI proxy code to be slightly less hard
to follow.
- Fix both SPE and TRBE in non-VHE configurations so that they
do not generate spurious, out of context table walks that
ultimately lead to very bad HW lockups.
- A small set of patches fixing the Stage-2 MMU freeing in error
cases.
- Tighten-up accepted SMC immediate value to be only #0 for host
SMCCC calls.
- The usual cleanups and other selftest churn.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmnWdswACgkQI9DQutE9
ekNYvBAAxj5Zmsx8sJ2CYDTJc2w4XkEjSgDugA+J/s0TMgrzExeBlWCstdhVTncy
68nwOjQl3TotnIrt7q36kko9u7IdD0pHNrk34NtlggLjHfB61n9SNcAA6j4F6zJa
GFkHpJSrSnZuUPqapkDnlyhuPkgTIAkEUk2Am9siksSfY4HvRyHZJm2FTdxsdIBn
NN9wvQqw2wefTXOQ8gS+oHbPVp1cPbwrF2a3EhzXXv/6W3mUBstXgsijgo07UzCp
W6vHCv2wqHbHdf67z3Q3hL+VXlVH6oHlyW99/swqISvqRkH/iSB90+oUojnMRrSm
yB6Wmhh8jboCaajWMJhG+veZw+7GMXU4nOrGd1rbnY8cwRl/TQ5YibhRm7DIdvjO
xeUluTLJ0NdweQUwE2k4OlgKOuGang3E2p0clmkUO4SstA48MdqR/kpST6guIlWw
U5syuNaaaiuwP5QOi9qZmMCNmQ3ZfnZG3nseJFdoyGjhVhf5jyQyv4Du9vGZQFF/
Zkg7yTqC4OWiC+3GkW9YYAySM1MyetivLtd47PGzHPTdtaZziWhNvQ0y+8QjQ+R+
CJNvyS/DvsT7epSya4sLgMP1ZAlih9xkz5sQ6k8NJLBYYXi0v33qwqditErgLLyj
S4Ci4WNhHHWIusvCVM7JUBkH0AElpmi506f7F6iHoFLlkYR4t9U=
=/SuQ
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 7.1
* New features:
- Add support for tracing in the standalone EL2 hypervisor code,
which should help both debugging and performance analysis.
This comes with a full infrastructure for 'remote' trace buffers
that can be exposed by non-kernel entities such as firmware.
- Add support for GICv5 Per Processor Interrupts (PPIs), as the
starting point for supporting the new GIC architecture in KVM.
- Finally add support for pKVM protected guests, with anonymous
memory being used as a backing store. About time!
* Improvements and bug fixes:
- Rework the dreaded user_mem_abort() function to make it more
maintainable, reducing the amount of state being exposed to
the various helpers and rendering a substantial amount of
state immutable.
- Expand the Stage-2 page table dumper to support NV shadow
page tables on a per-VM basis.
- Tidy up the pKVM PSCI proxy code to be slightly less hard
to follow.
- Fix both SPE and TRBE in non-VHE configurations so that they
do not generate spurious, out of context table walks that
ultimately lead to very bad HW lockups.
- A small set of patches fixing the Stage-2 MMU freeing in error
cases.
- Tighten-up accepted SMC immediate value to be only #0 for host
SMCCC calls.
- The usual cleanups and other selftest churn.
There are a few ifdefs in this driver so that it can be compiled on all
architectures when COMPILE_TEST is set. board_bind_eic_interrupt is
defined in arch/mips/ for normal usage, however when this driver is
compiled with COMPILE_TEST on other architectures, it is defined as a
static variable inside this driver. This causes the following warning:
drivers/irqchip/irq-pic32-evic.c:54:15: warning: variable
'board_bind_eic_interrupt' set but not used [-Wunused-but-set-global]
54 | static void (*board_bind_eic_interrupt)(int irq,
int regset);
| ^
Annotate the static variable with __maybe_unused to avoid having to put
even more ifdefs into this driver.
Fixes: 282f8b547d ("irqchip/irq-pic32-evic: Define board_bind_eic_interrupt for !MIPS builds")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Brian Masney <bmasney@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260403-irq-pic32-evic-unused-v1-1-447cdc0675ec@redhat.com
Closes: https://lore.kernel.org/oe-kbuild-all/202603300715.4HuMMAFb-lkp@intel.com/
The array swint_names[] just contains expansions of "int-ca55-%u".
Replace it by formatting the strings where needed, to improve
readability.
Despite the two error messages can no longer be shared with the ICU
error cases, this reduces generated code size by 56 bytes.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://patch.msgid.link/aceab3fbc307ef428dfd62d8d846b68704dea012.1775205874.git.geert+renesas@glider.be
rzg2l_irqc_free() invokes irq_domain_free_irqs_common(), which internally
calls irq_domain_reset_irq_data(). That explicitly sets irq_data->hwirq to
0. Consequently, irqd_to_hwirq(d) returns 0 when called after it.
Since 0 falls outside the valid shared IRQ ranges,
rzg2l_irqc_is_shared_and_get_irq_num() evaluates to false, completely
bypassing the test_and_clear_bit() operation.
This leaves the bit set in priv->used_irqs, causing future allocations to
fail with -EBUSY.
Fix this by retrieving irq_data and caching hwirq before calling
irq_domain_free_irqs_common().
Fixes: e0fcae27ff ("irqchip/renesas-rzg2l: Add shared interrupt support")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260328103324.134131-2-biju.das.jz@bp.renesas.com
On ACPI systems, the aplic's pm_domain is set to acpi_general_pm_domain,
which provides its own power management callbacks (e.g., runtime_suspend
via acpi_subsys_runtime_suspend).
aplic_pm_add() unconditionally calls dev_pm_genpd_add_notifier() when
dev->pm_domain is non‑NULL, leading to a comparison between runtime_suspend
and genpd_runtime_suspend. This results in the following errors when ACPI
is enabled:
riscv-aplic RSCV0002:00: failed to create APLIC context
riscv-aplic RSCV0002:00: error -ENODEV: failed to setup APLIC in MSI mode
Fix this by checking for dev->of_node before adding or removing the genpd
notifier, ensuring it is only used for device tree based systems.
Fixes: 95a8ddde36 ("irqchip/riscv-aplic: Preserve APLIC states across suspend/resume")
Signed-off-by: Jessica Liu <liu.xuemei1@zte.com.cn>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260331093029749vRpdH-0qoEqjS0Wnn9M4x@zte.com.cn
Simplify the locking logic in rzg2l_irq_set_type() by using guard(),
eliminating the need for an explicit unlock call.
[ tglx: Remove the pointless cleanup.h include. The spinlock guards come
from spinlock.h ]
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260328103324.134131-3-biju.das.jz@bp.renesas.com
gic_irq_domain_translate() does not check if an interrupt number lies
within the valid range of the specified interrupt type. Add these checks,
and print a warning if the interrupt number is out of range.
This can help flagging incorrectly described Extended SPI and PPI
interrupts in DT.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://patch.msgid.link/ce695ea46decc816974179314a86f2b9b5cad6a9.1772799134.git.geert+renesas@glider.be
The RZ/G3L SoC has 16 external interrupts, of which 8 are shared with TINT
(GPIO interrupts), whereas RZ/G2L has only 8 external interrupts with no
sharing. The shared interrupt line selection between external interrupt and
GPIO interrupt is based on the INTTSEL register. Add shared_irq_cnt
variable to struct rzg2l_hw_info handle these differences.
Add used_irqs bitmap to struct rzg2l_irqc_priv to track allocation state.
In the alloc callback, use test_and_set_bit() to enforce mutual exclusion
and configure the INTTSEL register to route to either the external
interrupt or TINT. In the free callback, use test_and_clear_bit() to
release the shared interrupt line and reset the INTTSEL. Also add INTTSEL
register save/restore support to the suspend/resume path.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-17-biju.das.jz@bp.renesas.com
The IRQC block on the RZ/G3L SoC is almost identical to the one found on
the RZ/G2L SoC, with the following differences:
- The number of GPIO interrupts for TINT selection is 113 instead of 123.
- The pin index and TINT selection index are not in the 1:1 map.
- The number of external interrupts are 16 instead of 8, out of these
8 external interrupts are shared with TINT.
Add support for the RZ/G3L driver by filling the rzg2l_hw_info table and
adding LUT for mapping between pin index and TINT selection index.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-16-biju.das.jz@bp.renesas.com
The total number of external interrupts in RZ/G2L and RZ/G3L SoC are
different. The RZ/G3L has 16 external interrupts whereas RZ/G2L has only 8
external interrupts. Add irq_count variable in struct rzg2l_hw_info to
handle these differences and drop the macro IRQC_IRQ_COUNT.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-15-biju.das.jz@bp.renesas.com
The IRQC_TINT_START value is different for RZ/G3L and RZ/G2L SoC. Add
tint_start variable in struct rzg2l_hw_info to handle this difference and
drop the macro IRQC_TINT_START.
While at it, update the variable type of titseln, tssr_offset, tssr_index,
index, and sense to unsigned int, in rzg2l_tint_set_edge() as these
variables are used only for calculation.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-14-biju.das.jz@bp.renesas.com
The total number of interrupts in RZ/G2L and RZ/G3L SoC are different.
Introduce struct rzg2l_hw_info to handle the hardware differences and
replace the macro IRQC_NUM_IRQ with num_irq variable in struct
rzg2l_hw_info.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-13-biju.das.jz@bp.renesas.com
The total number of interrupts in RZ/G2L and RZ/G3L SoC are different. The
RZ/G3L has 16 external interrupts whereas RZ/G2L has only 8 external
interrupts. Dynamically allocate fwspec memory instead of static allocation
to support both SoCs.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-12-biju.das.jz@bp.renesas.com
rzfive_irqc_mask() and rzfive_irqc_unmask() use hw_irq range checks to
dispatch between IRQ and TINT masking operations. Split each into two
dedicated handlers — rzfive_irqc_irq_mask(), rzfive_irqc_tint_mask(),
rzfive_irqc_irq_unmask(), and rzfive_irqc_tint_unmask() — each operating
unconditionally on its respective interrupt type, removing the runtime
conditionals.
Assign the IRQ-specific handlers to rzfive_irqc_irq_chip and the
TINT-specific handlers to rzfive_irqc_tint_chip, consistent with the
separation applied to the EOI, set_type, and enable/disable callbacks in
previous patches.
While at it, simplify rzfive_irqc_{irq,tint}_{mask,unmask}() by replacing
raw_spin_lock locking/unlocking with scoped_guard().
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-11-biju.das.jz@bp.renesas.com
rzfive_tint_irq_endisable() handles both IRQ and TINT enable/disable paths
via a hw_irq range check.
Split this into two dedicated helpers, rzfive_irq_endisable() for IRQ
interrupts and rzfive_tint_endisable() for TINT interrupts, each operating
unconditionally on their respective interrupt type.
While at it, simplify rzfive_{irq,tint}_endisable by replacing
raw_spin_lock locking/unlocking with guard() and update the variable types
of offset, tssr_offset, and tssr_index to unsigned int, as these variables
are used only for calculation.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-10-biju.das.jz@bp.renesas.com
rzg2l_irqc_irq_disable() and rzg2l_irqc_irq_enable() are used by both the
IRQ and TINT chips, but only perform TINT-specific work via
rzg2l_tint_irq_endisable(), guarded by a hw_irq range check.
Since the IRQ chip does not require this extra enable/disable handling,
replace its callbacks with the generic irq_chip_disable_parent() and
irq_chip_enable_parent() directly.
While at it, simplify rzfive_irqc_irq_enable() by replacing raw_spin_lock
locking/unlocking with guard() and update the variable types of offset,
tssr_offset, and tssr_index to unsigned int, as these variables are used
only for calculation.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-9-biju.das.jz@bp.renesas.com
The common rzg2l_irqc_set_type() handler uses hw_irq range checks to
dispatch to either rzg2l_irq_set_type() or rzg2l_tint_set_edge().
Split this into two dedicated handlers, rzg2l_irqc_irq_set_type() and
rzg2l_irqc_tint_set_type(), each calling only their respective type
configuration function without runtime conditionals.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-8-biju.das.jz@bp.renesas.com
The common rzg2l_irqc_eoi() handler uses a conditional to determine whether
to clear an IRQ or an TINT interrupt.
Split this into two dedicated handlers, rzg2l_irqc_irq_eoi() and
rzg2l_irqc_tint_eoi(), each handling only their respective interrupt type
without the need for range checks.
While at it, simplify rzg2l_irqc_{irq,tint}_eoi() by replacing
raw_spin_lock locking/unlocking with scoped_guard().
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-7-biju.das.jz@bp.renesas.com
The driver uses a single irq_chip instance shared across all interrupt
types, relying on dispatcher callbacks to differentiate between IRQ and
TINT regions at runtime.
Replace the per-SoC irq_chip and its dispatcher callbacks with dedicated
irq_chip instances for each interrupt region: IRQ and TINT. Subsequent
patches will add per-region callbacks for IRQ and TINT from the common
code.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-6-biju.das.jz@bp.renesas.com
The check `hwirq < IRQC_TINT_START` in rzg2l_irqc_alloc() is unnecessary as
the condition is already guaranteed to be false at that point in the code.
The outer `if (hwirq > IRQC_IRQ_COUNT)` block ensures that hwirq is always
above IRQC_IRQ_COUNT before reaching this check, and since IRQC_TINT_START
<= IRQC_IRQ_COUNT, the guard can never trigger.
Remove the dead code to simplify the allocation path.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-5-biju.das.jz@bp.renesas.com
Replace pm_runtime_put() with pm_runtime_put_sync() when
irq_domain_create_hierarchy() fails to ensure the device suspends
synchronously before devres cleanup disables runtime PM via
pm_runtime_disable().
[ tglx: Fix up subject and change log to be precise ]
Fixes: 7de11369ef ("irqchip/renesas-rzg2l: Use devm_pm_runtime_enable()")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260325192451.172562-4-biju.das.jz@bp.renesas.com
As the driver now supports OF-based platforms, it's now possible to use it
on MIPS Loongson64 machines.
Drop the requirement of LOONGARCH for this driver, to allow build on
both MIPS-based and LoongArch-based Loongson systems.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://patch.msgid.link/20260321092032.3502701-7-zhengxingda@iscas.ac.cn
The OF-based MIPS Loongson-3 systems can also have a PCH LPC interrupt
controller.
Add OF-based initialization code for this driver.
Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://patch.msgid.link/20260321092032.3502701-6-zhengxingda@iscas.ac.cn
A lot of code can be shared between the existing ACPI init flow with the
upcoming OF init flow.
Extract it into a dedicated function.
The re-ordering of parent interrupt allocation requires the architecture
code to reserve legacy interrupts from the dynamic allocation by overriding
arch_dynirq_lower_bound(), otherwise the parent of LPC irqchip will be
allocated in the intended static range of LPC interrupts, which leads to
allocation failure of LPC interrupts.
Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://patch.msgid.link/20260321092032.3502701-5-zhengxingda@iscas.ac.cn
Replace pm_runtime_put() with pm_runtime_put_sync() when
irq_domain_create_hierarchy() fails to ensure the device suspends
synchronously before devres cleanup disables runtime PM via
pm_runtime_disable().
Fixes: 5ec8cabc3b ("irqchip/renesas-rzv2h: Use devm_pm_runtime_enable()")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260323124917.41602-1-biju.das.jz@bp.renesas.com
The mbox_client for qcom-mpm sends NULL doorbell messages via
mbox_send_message() but never signals TX completion.
Set knows_txdone=true and call mbox_client_txdone() after a successful
send, matching the pattern used by other Qualcomm mailbox clients (smp2p,
smsm, qcom_aoss etc).
Fixes: a6199bb514 "irqchip: Add Qualcomm MPM controller driver"
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260322171533.608436-1-jassisinghbrar@gmail.com
GICv5 does not support configuring the handling mode or trigger mode
of PPIs at runtime - these choices are made at implementation time,
and most of the architected PPIs have an architected handling mode (as
reported in the ICH_PPI_HMRn_EL1 registers). As chip->set_irq_type()
is optional, this has not been implemented for GICv5 PPIs as it served
no real purpose.
However, although the set_irq_type() function is marked as optional,
the lack of it breaks attempts to create a domain hierarchy on top of
GICv5's PPI domain. This is due to __irq_set_trigger() calling
chip->set_irq_type(), which returns -ENOSYS if the parent domain
doesn't implement the set_irq_type() call.
In order to make things work, this change introduces a set_irq_type()
call for GICv5 PPIs. This performs a basic sanity check (that the
hardware's handling mode (Level/Edge) matches what is being set as the
type, and does nothing else.
This is sufficient to get hierarchical domains working for GICv5 PPIs
(such as the one KVM introduces for the arch timer). It has the side
benefit (or drawback) that it will catch cases where the firmware
description doesn't match what the hardware reports.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Link: https://patch.msgid.link/20260319154937.3619520-31-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
When riscv_acpi_get_gsi_info() fails, the mailbox channel previously
requested via mbox_request_channel() is not freed. Add the missing
mbox_free_channel() call to prevent the resource leak.
Fixes: 4752b0cfbc ("irqchip/riscv-rpmi-sysmsi: Add ACPI support")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
Reviewed-by: Rahul Pathak <rahul@summations.net>
Link: https://patch.msgid.link/20260315-sysmsi-v1-1-5f090c86c2ca@gmail.com
Introduce support for the new AICv3 hardware block in t8122 and t603x
SoCs. AICv3 is similar to AICv2 but has an increased IRQ config offset.
These MMIO offsets are coded as properties of the "aic,3" node in Apple's
device tree. The actual offsets are the same for all SoCs starting from M3
through at least M5.
So do not bother to follow suit but use AICv3 specific defines in the
driver. The compatible string is SoC specific so future SoCs with AICv3
and different offsets would just use their own compatible string as base
and add their new offsets.
Signed-off-by: Janne Grunau <j@jannau.net>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Sven Peter <sven@kernel.org>
Link: https://patch.msgid.link/20260223-irq-apple-aic3-v3-2-2b7328076b8d@jannau.net
Add support for the interrupt steering controller found in NXP S32N79
series automotive SoCs.
The S32N79 IRQ_STEER variant differs from the i.MX version by not
implementing the CHANCTRL register. To handle this hardware difference,
introduce a device type data structure with quirks field. The
IRQSTEER_QUIRK_NO_CHANCTRL quirk skips CHANCTRL register access for S32N79
variants.
The interrupt routing functionality and register layout are otherwise
identical between the two variants.
Co-developed-by: Larisa Grigore <larisa.grigore@nxp.com>
Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com>
Signed-off-by: Ciprian Marian Costea <ciprianmarian.costea@oss.nxp.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260311081154.381881-4-ciprianmarian.costea@oss.nxp.com
Since commit 95a8ddde36 ("irqchip/riscv-aplic: Preserve APLIC
states across suspend/resume"), when multiple NUMA nodes exist
and AIA is not configured as "none", aplic_probe() is called
multiple times. This leads to register_syscore(&aplic_syscore)
being invoked repeatedly, causing the following Oops:
list_add double add: new=ffffffffb91461f0, prev=ffffffffb91461f0, next=ffffffffb915c408.
[<ffffffffb7b5c8ca>] __list_add_valid_or_report+0x60/0xc0
[<ffffffffb7cc3236>] register_syscore+0x3e/0x70
[<ffffffffb7b8d61c>] aplic_probe+0xc6/0x112
Fix this by registering syscore operations only once, using a static
variable aplic_syscore_registered to track registration.
[ tglx: Trim backtrace properly ]
Fixes: 95a8ddde36 ("irqchip/riscv-aplic: Preserve APLIC states across suspend/resume")
Signed-off-by: Jessica Liu <liu.xuemei1@zte.com.cn>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260310141731145xMwLsyvXl9Gw-m6A4VRYj@zte.com.cn
aplic_probe() calls acpi_dev_clear_dependencies() unconditionally at the
end, even when the preceding setup (MSI or direct mode) has failed. This is
incorrect because if the device failed to probe, it should not be
considered as active and should not clear dependencies for other devices
waiting on it.
Fix this by returning immediately when the setup fails, skipping the ACPI
dependency cleanup. Also, explicitly return 0 on success instead of relying
on the value of 'rc' to make the success path clear.
Fixes: 5122e380c2 ("irqchip/riscv-aplic: Add ACPI support")
Signed-off-by: Jessica Liu <liu.xuemei1@zte.com.cn>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260310141600411Fu8H8-GXOOgKISU48Tjgx@zte.com.cn
Prefer using IS_ERR_OR_NULL() over using IS_ERR() and a open coded NULL
pointer check.
Change generated with coccinelle.
To: Marc Zyngier <maz@kernel.org>
To: Thomas Gleixner <tglx@kernel.org>
To: Andrew Lunn <andrew@lunn.ch>
To: Gregory Clement <gregory.clement@bootlin.com>
To: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: Philipp Hahn <phahn-oss@avm.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260310-b4-is_err_or_null-v1-39-bd63b656022d@avm.de
Handle the RZ/V2H ICU error interrupt to help diagnose latched bus,
ECC RAM, and CA55/IP error conditions.
Support error injection via ICU_SWPE to allow testing the pseudo error
error interrupts.
Account for SoC differences in ECC RAM error register coverage so the
handler only iterates over valid ECC status/clear banks, and route the
RZ/V2N compatible to a probe path with the correct ECC range while
keeping the existing RZ/V2H and RZ/G3E handling.
[ tglx: Convert to hwirq_within() and upgrade to pr_warn() for those errors ]
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260304113317.129339-8-prabhakar.mahadev-lad.rj@bp.renesas.com
The RZ/V2H ICU exposes four software-triggerable interrupts targeting
the CA55 cores (int-ca55-0 to int-ca55-3). Add support for these
interrupts to enable IRQ injection via the generic IRQ injection
framework.
Add a dedicated rzv2h_icu_swint_chip irq_chip for the CA55 region and
implement rzv2h_icu_irq_set_irqchip_state() to handle software interrupt
injection.
[ tglx: Convert to hwirq_within() ]
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260304113317.129339-7-prabhakar.mahadev-lad.rj@bp.renesas.com
Replace the single rzv2h_icu_chip and its dispatcher callbacks with
dedicated irq_chip instances for each interrupt region: NMI, IRQ, and
TINT.
Move the irqd_is_level_type() check ahead of the scoped_guard in
rzv2h_icu_tint_eoi() and rzv2h_icu_irq_eoi() to avoid acquiring the
spinlock unnecessarily for level-type interrupts.
Drop the ICU_TINT_START guard from rzv2h_tint_irq_endisable() since it
is now only reachable via the TINT chip path.
[ tglx: Convert to hwirq_within() ]
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260304113317.129339-6-prabhakar.mahadev-lad.rj@bp.renesas.com
Introduce ICU_IRQ_LAST and ICU_TINT_LAST macros to make range boundaries
explicit and reduce the chance of off-by-one errors.
Extract the TINT information up front in rzv2h_icu_alloc() and validate
the resulting hardware IRQ against the full TINT range
[ICU_TINT_START, ICU_TINT_LAST].
[ tglx: Convert the hard to parse inverse conditions to use a simple helper
macro hwirq_within() which is easy to read, less error prone and
avoids a lot of typing ]
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260304113317.129339-5-prabhakar.mahadev-lad.rj@bp.renesas.com
Make use of dev_err_probe() to simplify rzv2h_icu_probe_common().
Keep dev_err() for -ENOMEM paths, as dev_err_probe() does not print for
allocation failures, ensuring they remain visible in logs.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260304113317.129339-4-prabhakar.mahadev-lad.rj@bp.renesas.com
Avoid dereferencing pdev->dev.of_node again in rzv2h_icu_probe_common().
Reuse the already available local node pointer when mapping the ICU
register space.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260304113317.129339-2-prabhakar.mahadev-lad.rj@bp.renesas.com
* Make sure we don't leak any S1POE state from guest to guest when
the feature is supported on the HW, but not enabled on the host
* Propagate the ID registers from the host into non-protected VMs
managed by pKVM, ensuring that the guest sees the intended feature set
* Drop double kern_hyp_va() from unpin_host_sve_state(), which could
bite us if we were to change kern_hyp_va() to not being idempotent
* Don't leak stage-2 mappings in protected mode
* Correctly align the faulting address when dealing with single page
stage-2 mappings for PAGE_SIZE > 4kB
* Fix detection of virtualisation-capable GICv5 IRS, due to the
maintainer being obviously fat fingered... [his words, not mine]
* Remove duplication of code retrieving the ASID for the purpose of
S1 PT handling
* Fix slightly abusive const-ification in vgic_set_kvm_info()
Generic:
* Remove internal Kconfigs that are now set on all architectures.
* Remove per-architecture code to enable KVM_CAP_SYNC_MMU, all
architectures finally enable it in Linux 7.0.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmmkSMUUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroO2tggAgOHvH9lwgHxFADAOKXPEsv+Tw6cB
zuUBPo6e5XQQx6BJQVo8ctveCBbA0ybRJXbju6lgS/Bg9qQYKjZbNrYdKwBwhw3b
JBUFTIJgl7lLU5gEANNtvgiiNUc2z1GfG8NQ5ovMJ0/MDEEI0YZjkGoe8ZyJ54vg
qrNHBROKvt54wVHXFGhZ1z8u4RqJ/WCmy21wYbwiMgQGXq9ugRRfhnu3iGNO8BcK
bSSh4cP7qusO5Nj0m/XYD/68VwvyogaEkh4sNS1VH0FOoONRWs80Q6iT2HjCGgv6
mAzCx/yB9ziSQRis3oYYEBH23HZxRVDVBeSBlPcbxE3VM7SGzp7O8L692g==
=+mC+
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Arm:
- Make sure we don't leak any S1POE state from guest to guest when
the feature is supported on the HW, but not enabled on the host
- Propagate the ID registers from the host into non-protected VMs
managed by pKVM, ensuring that the guest sees the intended feature
set
- Drop double kern_hyp_va() from unpin_host_sve_state(), which could
bite us if we were to change kern_hyp_va() to not being idempotent
- Don't leak stage-2 mappings in protected mode
- Correctly align the faulting address when dealing with single page
stage-2 mappings for PAGE_SIZE > 4kB
- Fix detection of virtualisation-capable GICv5 IRS, due to the
maintainer being obviously fat fingered... [his words, not mine]
- Remove duplication of code retrieving the ASID for the purpose of
S1 PT handling
- Fix slightly abusive const-ification in vgic_set_kvm_info()
Generic:
- Remove internal Kconfigs that are now set on all architectures
- Remove per-architecture code to enable KVM_CAP_SYNC_MMU, all
architectures finally enable it in Linux 7.0"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: always define KVM_CAP_SYNC_MMU
KVM: remove CONFIG_KVM_GENERIC_MMU_NOTIFIER
KVM: arm64: Deduplicate ASID retrieval code
irqchip/gic-v5: Fix inversion of IRS_IDR0.virt flag
KVM: arm64: Revert accidental drop of kvm_uninit_stage2_mmu() for non-NV VMs
KVM: arm64: Fix protected mode handling of pages larger than 4kB
KVM: arm64: vgic: Handle const qualifier from gic_kvm_info allocation type
KVM: arm64: Remove redundant kern_hyp_va() in unpin_host_sve_state()
KVM: arm64: Fix ID register initialization for non-protected pKVM guests
KVM: arm64: Optimise away S1POE handling when not supported by host
KVM: arm64: Hide S1POE from guests when not supported by the host
- Fix frozen interrupt bug in the sifive-plic driver
- Limit per-device MSI interrupts on uncommon gic-v3-its
hardware variants
- Address Sparse warning by constifying a variable in
the MMP driver
- Revert broken commit and also fix an error check
in the ls-extirq driver
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmmj/MMRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1iRdg//eKX6ScDG6IT/hzKHBVP9tGqFcvfR+LWf
3Q5I/5YipYNf2aqZS7UFcfhsOvMkwALF55CrNUCFatW2W4VAUR5pgt5JWNfMS4VO
+9sGkkAhUenhLlGgGnIpuIHgIrkJFCqhQIy2pwMpVF5Sp/7jzJfIwonYziv3UUyD
cWhv6zQULWFMzkLOjBC8Ba44gIfSSq6wERE0JZiE2aZMJ3Azjh+prKwkH5Q48adb
Ni7b19wFFLDSvv8KGzpoFrA3S5Wwzppy4YJpTEKGXUq9AQr08BUtv2KNeLz660ma
pWD9ZzPnclxUiWhmAdi3x+vmn8209Id3pYRGD2auZlT0+7cE1yvyFNl30IWwac5K
IoGQ54F3nWYynT6ScYNJilKlN/kDDX6Azt1AELl+fHhOvp5VpjxlDpiUahvqJGpX
bhNo1GC8fQSZFKehA987IZol1vp9z2utqxEQhm0hZX4FhIRaS8LfUa1e7A66FZ1Q
cp+Hz/8oYczZAEb3vXNdEVkhtIFTzS90uRhrjaRLlO67Yo/0xWxwP34DuBv8MiL7
XWK+rj3TSxYR3U6cAIAkU0bft+XFIVI8fKc92duE2Cx5Nx2s0gHRYsM/F2lFrhIw
jETLC60lWj1Sf1HP03VKi7EjFht6qSVs4PgPVYopZtn94eXV+HK+sZ5CmkxqgfFT
OYpG1VbMxEU=
=hIKS
-----END PGP SIGNATURE-----
Merge tag 'irq-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irqchip driver fixes from Ingo Molnar:
- Fix frozen interrupt bug in the sifive-plic driver
- Limit per-device MSI interrupts on uncommon gic-v3-its hardware
variants
- Address Sparse warning by constifying a variable in the MMP driver
- Revert broken commit and also fix an error check in the ls-extirq
driver
* tag 'irq-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/ls-extirq: Fix devm_of_iomap() error check
Revert "irqchip/ls-extirq: Use for_each_of_imap_item iterator"
irqchip/mmp: Make icu_irq_chip variable static const
irqchip/gic-v3-its: Limit number of per-device MSIs to the range the ITS supports
irqchip/sifive-plic: Fix frozen interrupt due to affinity setting