linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-13 00:28:54 +02:00

Author	SHA1	Message	Date
Breno Leitao	5cbb61bf41	arm64/fpsimd: ptrace: zero target's fpsimd_state, not the tracer's sve_set_common() is the backend for PTRACE_SETREGSET(NT_ARM_SVE) and PTRACE_SETREGSET(NT_ARM_SSVE). Every write in the function operates on the tracee (target) - except a single memset that uses current instead, zeroing the tracer's saved V0-V31 / FPSR / FPCR shadow on every ptrace SETREGSET call. The memset is meant to give the tracee a defined zero register image before the user-supplied payload is copied in (for partial writes, header-only writes, and FPSIMD<->SVE format switches). Aiming it at current both denies the tracee that clean slate and silently corrupts the tracer. The corruption of the tracer's saved FPSIMD state is not always observable. Where the tracer's state is live on a CPU, this may be reused without loading the corrupted state from memory, and will eventually be written back over the corrupted state. Where the tracer's state is saved in SVE_PT_REGS_SVE format, only the FPSR and FPCR are clobbered, and the effective copy of the vectors is in the task's sve_state. Reproducible on an arm64 kernel with SVE: a single-threaded tracer that loads a known pattern into V0-V31, issues PTRACE_SETREGSET(NT_ARM_SVE) on a child, and reads V0-V31 back observes them all zeroed within tens of thousands of iterations when a sibling thread keeps stealing the FPSIMD CPU binding. Fixes: `316283f276` ("arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE") Cc: <stable@vger.kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-05-06 12:11:49 +01:00
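A minimal userspace sketch of the bug class described in the commit above, not the kernel code itself. `struct fpsimd_image`, `struct task_model`, `setregset_buggy()` and `setregset_fixed()` are hypothetical names introduced for illustration; only the idea (zero the target's register image, not the caller's, before copying a possibly partial payload) comes from the commit message.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a task's saved V0-V31 / FPSR / FPCR image. */
struct fpsimd_image {
	uint64_t vregs[32][2];	/* 32 registers, 128 bits each */
	uint32_t fpsr;
	uint32_t fpcr;
};

/* Hypothetical stand-in for a task; the real task_struct is far larger. */
struct task_model {
	struct fpsimd_image fpsimd;
};

/* Never copy more than the register image can hold. */
static size_t clamp_len(size_t len)
{
	return len > sizeof(struct fpsimd_image) ? sizeof(struct fpsimd_image) : len;
}

/* Buggy shape: the memset is aimed at the caller (tracer), not the target. */
static void setregset_buggy(struct task_model *tracer, struct task_model *target,
			    const void *payload, size_t len)
{
	memset(&tracer->fpsimd, 0, sizeof(tracer->fpsimd));	/* wrong task */
	memcpy(&target->fpsimd, payload, clamp_len(len));
}

/* Fixed shape: the target gets a zeroed image, then the payload on top. */
static void setregset_fixed(struct task_model *tracer, struct task_model *target,
			    const void *payload, size_t len)
{
	(void)tracer;						/* tracer is left untouched */
	memset(&target->fpsimd, 0, sizeof(target->fpsimd));
	memcpy(&target->fpsimd, payload, clamp_len(len));
}
```

With a partial payload, the buggy variant leaves stale data in the rest of the target's image and wipes the tracer's, which matches the two symptoms the commit message lists.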
Kevin Brodsky	030e8a40ff	arm64: signal: Preserve POR_EL0 if poe_context is missing Commit `2e8a1acea8` ("arm64: signal: Improve POR_EL0 handling to avoid uaccess failures") delayed the write to POR_EL0 in rt_sigreturn to avoid spurious uaccess failures. This change however relies on the poe_context frame record being present: on a system supporting POE, calling sigreturn without a poe_context record now results in writing arbitrary data from the kernel stack into POR_EL0. Fix this by adding a __valid_fields member to struct user_access_state, and zeroing the struct on allocation. restore_poe_context() then indicates that the por_el0 field is valid by setting the corresponding bit in __valid_fields, and restore_user_access_state() only touches POR_EL0 if there is a valid value to set it to. This is in line with how POR_EL0 was originally handled; all frame records are currently optional, except fpsimd_context. To ensure that __valid_fields is kept in sync, fields (currently just por_el0) are now accessed via accessors and prefixed with __ to discourage direct access. Fixes: `2e8a1acea8` ("arm64: signal: Improve POR_EL0 handling to avoid uaccess failures") Cc: <stable@vger.kernel.org> Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-05-01 17:44:25 +01:00
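The valid-bit pattern described in the commit above can be sketched in plain C as follows. This is an illustrative model under stated assumptions, not the kernel's definition: the struct layout, the bit name `UA_STATE_POR_EL0_VALID` and the helper names are hypothetical, and the `__` prefix the commit mentions is dropped because such identifiers are reserved in ordinary C programs.

```c
#include <stdbool.h>
#include <stdint.h>

#define UA_STATE_POR_EL0_VALID	(1u << 0)	/* hypothetical bit name */

/* Model of a state struct whose fields are only meaningful when flagged. */
struct user_access_state_model {
	uint32_t valid_fields;	/* bitmask of fields that hold a real value */
	uint64_t por_el0;	/* access only through the helpers below */
};

/* Setter: stores the value and marks the field valid in one step. */
static void ua_state_set_por_el0(struct user_access_state_model *st, uint64_t val)
{
	st->por_el0 = val;
	st->valid_fields |= UA_STATE_POR_EL0_VALID;
}

static bool ua_state_por_el0_valid(const struct user_access_state_model *st)
{
	return (st->valid_fields & UA_STATE_POR_EL0_VALID) != 0;
}

/*
 * Restore step: the register (modelled here as *reg) is written only if a
 * frame record actually supplied a value; otherwise it keeps its old value.
 */
static void ua_state_restore(const struct user_access_state_model *st, uint64_t *reg)
{
	if (ua_state_por_el0_valid(st))
		*reg = st->por_el0;
}
```

Zero-initialising the struct (for example `struct user_access_state_model st = { 0 };`) is what makes the missing-record case safe: no valid bit is set, so the restore step is a no-op.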
Wentao Guan	4023b7424e	arm64/scs: Fix potential sign extension issue of advance_loc4 The expression (*opcode++ << 24) and exp * code_alignment_factor may overflow signed int and become negative. Fix this by casting each byte to u64 before shifting. Also fix the misaligned break statement while we are here. Example of the result can be seen here: Link: https://godbolt.org/z/zhY8d3595 It may not be a real problem, but could be an issue in future. Fixes: `d499e9627d` ("arm64/scs: Fix handling of advance_loc4") Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-27 12:16:26 +01:00
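A small standalone sketch of the widening fix described in the commit above, assuming a 4-byte little-endian operand as used by `DW_CFA_advance_loc4`. The function names are hypothetical and this is not the kernel's code. A byte promoted to `int` and shifted left by 24 can land in the sign bit; the second function models the sign-extended result that common compilers produce, using conversions that are implementation-defined rather than undefined.

```c
#include <stdint.h>

/*
 * Widened form: each byte is cast to a 64-bit unsigned type before the
 * shift, so no intermediate value ever passes through a signed int.
 */
static uint64_t advance_loc4_widened(const uint8_t *opcode,
				     uint64_t code_alignment_factor)
{
	uint64_t delta = (uint64_t)opcode[0]
		       | ((uint64_t)opcode[1] << 8)
		       | ((uint64_t)opcode[2] << 16)
		       | ((uint64_t)opcode[3] << 24);

	return delta * code_alignment_factor;
}

/*
 * Model of the problem: the top byte is treated as a 32-bit signed value
 * after shifting, then widened. With a top byte >= 0x80 the result is
 * sign-extended on typical two's complement targets.
 */
static uint64_t advance_loc4_sign_extended_model(const uint8_t *opcode,
						 uint64_t code_alignment_factor)
{
	int32_t top = (int32_t)((uint32_t)opcode[3] << 24);
	uint64_t delta = (uint64_t)opcode[0]
		       | ((uint64_t)opcode[1] << 8)
		       | ((uint64_t)opcode[2] << 16);

	return (delta + (uint64_t)(int64_t)top) * code_alignment_factor;
}
```

For the operand bytes `{0x00, 0x00, 0x00, 0x80}` and a factor of 1, the widened form yields `0x80000000`, while the model yields a value with the upper 32 bits set on typical targets.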
Linus Torvalds	13f24586a2	arm64 updates for 7.1 (second round): Core features: - Add workaround for C1-Pro erratum 4193714 - early CME (SME unit) DVMSync acknowledgement. The fix consists of sending IPIs on TLB maintenance to those CPUs running in user space with SME enabled - Include kernel-hwcap.h in list of generated files (missed in a recent commit generating the KERNEL_HWCAP_* macros) CCA: - Fix RSI_INCOMPLETE error check in arm-cca-guest MPAM: - Fix an unmount->remount problem with the CDP emulation, uninitialised variable and checker warnings Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull more arm64 updates from Catalin Marinas: "The main 'feature' is a workaround for C1-Pro erratum 4193714 requiring IPIs during TLB maintenance if a process is running in user space with SME enabled. The hardware acknowledges the DVMSync messages before completing in-flight SME accesses, with security implications. The workaround makes use of the mm_cpumask() to track the cores that need interrupting (arm64 hasn't used this mask before). 
The rest are fixes for MPAM, CCA and generated header that turned up during the merging window or shortly before. Summary: Core features: - Add workaround for C1-Pro erratum 4193714 - early CME (SME unit) DVMSync acknowledgement. The fix consists of sending IPIs on TLB maintenance to those CPUs running in user space with SME enabled - Include kernel-hwcap.h in list of generated files (missed in a recent commit generating the KERNEL_HWCAP_* macros) CCA: - Fix RSI_INCOMPLETE error check in arm-cca-guest MPAM: - Fix an unmount->remount problem with the CDP emulation, uninitialised variable and checker warnings" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm_mpam: resctrl: Make resctrl_mon_ctx_waiters static arm_mpam: resctrl: Fix the check for no monitor components found arm_mpam: resctrl: Fix MBA CDP alloc_capable handling on unmount virt: arm-cca-guest: fix error check for RSI_INCOMPLETE arm64/hwcap: Include kernel-hwcap.h in list of generated files arm64: errata: Work around early CME DVMSync acknowledgement arm64: cputype: Add C1-Pro definitions arm64: tlb: Pass the corresponding mm to __tlbi_sync_s1ish() arm64: tlb: Introduce __tlbi_sync_s1ish_{kernel,batch}() for TLB maintenance	2026-04-20 16:46:22 -07:00
Catalin Marinas	858fbd7248	Merge branch 'for-next/c1-pro-erratum-4193714' into for-next/core * for-next/c1-pro-erratum-4193714: : Work around C1-Pro erratum 4193714 (CVE-2026-0995) arm64: errata: Work around early CME DVMSync acknowledgement arm64: cputype: Add C1-Pro definitions arm64: tlb: Pass the corresponding mm to __tlbi_sync_s1ish() arm64: tlb: Introduce __tlbi_sync_s1ish_{kernel,batch}() for TLB maintenance	2026-04-20 13:12:35 +01:00
Linus Torvalds	87768582a4	dma-mapping updates for Linux 7.0: - added support for batched cache sync, which improves performance of dma_map/unmap_sg() operations on ARM64 architecture (Barry Song) - introduced DMA_ATTR_CC_SHARED attribute for explicitly shared memory used in confidential computing (Jiri Pirko) - refactored spaghetti-like code in drivers/of/of_reserved_mem.c and its clients (Marek Szyprowski, shared branch with device-tree updates to avoid merge conflicts) - prepared Contiguous Memory Allocator related code for making dma-buf drivers modularized (Maxime Ripard) - added support for benchmarking dma_map_sg() calls to tools/dma utility (Qinxin Xia) Merge tag 'dma-mapping-7.1-2026-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping updates from Marek Szyprowski: - added support for batched cache sync, which improves performance of dma_map/unmap_sg() operations on ARM64 architecture (Barry Song) - introduced DMA_ATTR_CC_SHARED attribute for explicitly shared memory used in confidential computing (Jiri Pirko) - refactored spaghetti-like code in drivers/of/of_reserved_mem.c and its clients (Marek Szyprowski, shared branch with device-tree updates to avoid merge conflicts) - prepared Contiguous Memory Allocator related code for making dma-buf drivers modularized (Maxime Ripard) - added support for benchmarking dma_map_sg() calls to tools/dma utility (Qinxin Xia) * tag 'dma-mapping-7.1-2026-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: (24 commits) dma-buf: heaps: system: document system_cc_shared heap dma-buf: heaps: system: add system_cc_shared heap for explicitly shared memory dma-mapping: introduce DMA_ATTR_CC_SHARED for shared memory mm: cma: Export cma_alloc(), 
cma_release() and cma_get_name() dma: contiguous: Export dev_get_cma_area() dma: contiguous: Make dma_contiguous_default_area static dma: contiguous: Make dev_get_cma_area() a proper function dma: contiguous: Turn heap registration logic around of: reserved_mem: rework fdt_init_reserved_mem_node() of: reserved_mem: clarify fdt_scan_reserved_mem*() functions of: reserved_mem: rearrange code a bit of: reserved_mem: replace CMA quirks by generic methods of: reserved_mem: switch to ops based OF_DECLARE() of: reserved_mem: use -ENODEV instead of -ENOENT of: reserved_mem: remove fdt node from the structure dma-mapping: fix false kernel-doc comment marker dma-mapping: Support batch mode for dma_direct_{map,unmap}_sg dma-mapping: Separate DMA sync issuing and completion waiting arm64: Provide dcache_inval_poc_nosync helper arm64: Provide dcache_clean_poc_nosync helper ...	2026-04-17 11:12:42 -07:00
Linus Torvalds	01f492e181	Arm: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This uses the new infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware, and which came through the tracing tree. - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM. - Finally add support for pKVM protected guests, where pages are unmapped from the host as they are faulted into the guest and can be shared back from the guest using pKVM hypercalls. Protected guests are created using a new machine type identifier. As the elusive guestmem has not yet delivered on its promises, anonymous memory is also supported. This is only a first step towards full isolation from the host; for example, the CPU register state and DMA accesses are not yet isolated. Because this does not really yet bring fully what it promises, it is hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is created. Caveat emptor. - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable. - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis. - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow. - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups. - A small set of patches fixing the Stage-2 MMU freeing in error cases. - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls. - The usual cleanups and other selftest churn. LoongArch: - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel(). - Add DMSINTC irqchip in kernel support. 
RISC-V: - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space s390: - Support for ESA (31-bit) guests inside nested hypervisors. - Remove restriction on memslot alignment, which is not needed anymore with the new gmap code. - Fix LPSW/E to update the bear (which of course is the breaking event address register). x86: - Shut up various UBSAN warnings on reading module parameter before they were initialized. - Don't zero-allocate page tables that are used for splitting hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and thus write all bytes. - As an optimization, bail early when trying to unsync 4KiB mappings if the target gfn can just be mapped with a 2MiB hugepage. x86 generic: - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM would dereference stack pointer after an exit to userspace. - Clean up and comment the emulated MMIO code to try to make it easier to maintain (not necessarily "easy", but "easier"). - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not all of VMX and SVM enabling) as it is needed for trusted I/O. 
- Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions - Immediately fail the build if a required #define is missing in one of KVM's headers that is included multiple times. - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected exception, mostly to prevent syzkaller from abusing the uAPI to trigger WARNs, but also because it can help prevent userspace from unintentionally crashing the VM. - Exempt SMM from CPUID faulting on Intel, as per the spec. - Misc hardening and cleanup changes. x86 (AMD): - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs. - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would overwrite state when enabling virtualization on multiple CPUs in parallel. This should not be a problem because OSVW should usually be the same for all CPUs. - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input. - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION. - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will crash the host due to an RMP violation page fault. - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock, and enforce it by lockdep. Fix various bugs where sev_guest() was not ensured to be stable for the whole duration of a function or ioctl. - Convert a pile of kvm->lock SEV code to guard(). - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6 as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2 rather than CR2, for example). Only set CR2 and DR6 when consumption of the payload is imminent, but on the other hand force delivery of the payload in all paths where userspace retrieves CR2 or DR6. 
- Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead of vmcb02->save.cr2. The value is out of sync after a save/restore or after a #PF is injected into L2. - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE. - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore. - Add a variety of missing nSVM consistency checks. - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT. - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions. - Add support for save+restore of virtualized LBRs (on SVM). - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain. - Aggressively sanitize fields when copying from vmcb12, to guard against unintentionally allowing L1 to utilize yet-to-be-defined features. - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. There are remaining issues in that KVM doesn't handle size prefix overrides for 64-bit guests. - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's architectural but sketchy behavior of generating #GP for "unsupported" addresses). - Cache all used vmcb12 fields to further harden against TOCTOU bugs. x86 (Intel): - Drop obsolete branch hint prefixes from the VMX instruction macros. - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a register input when appropriate. - Code cleanups. guest_memfd: - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support reclaim, the memory is unevictable, and there is no storage to write back to. 
LoongArch selftests: - Add KVM PMU test cases s390 selftests: - Enable more memory selftests. x86 selftests: - Add support for Hygon CPUs in KVM selftests. - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP. - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will. Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "Arm: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This uses the new infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware, and which came through the tracing tree - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM - Finally add support for pKVM protected guests, where pages are unmapped from the host as they are faulted into the guest and can be shared back from the guest using pKVM hypercalls. Protected guests are created using a new machine type identifier. As the elusive guestmem has not yet delivered on its promises, anonymous memory is also supported This is only a first step towards full isolation from the host; for example, the CPU register state and DMA accesses are not yet isolated. 
Because this does not really yet bring fully what it promises, it is hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is created. Caveat emptor - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups - A small set of patches fixing the Stage-2 MMU freeing in error cases - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls - The usual cleanups and other selftest churn LoongArch: - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel() - Add DMSINTC irqchip in kernel support RISC-V: - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space s390: - Support for ESA (31-bit) guests inside nested hypervisors - Remove restriction on memslot alignment, which is not needed anymore with the new gmap code - Fix LPSW/E to update the bear (which of course is the 
breaking event address register) x86: - Shut up various UBSAN warnings on reading module parameter before they were initialized - Don't zero-allocate page tables that are used for splitting hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and thus write all bytes - As an optimization, bail early when trying to unsync 4KiB mappings if the target gfn can just be mapped with a 2MiB hugepage x86 generic: - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM would dereference stack pointer after an exit to userspace - Clean up and comment the emulated MMIO code to try to make it easier to maintain (not necessarily "easy", but "easier") - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not all of VMX and SVM enabling) as it is needed for trusted I/O - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions - Immediately fail the build if a required #define is missing in one of KVM's headers that is included multiple times - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected exception, mostly to prevent syzkaller from abusing the uAPI to trigger WARNs, but also because it can help prevent userspace from unintentionally crashing the VM - Exempt SMM from CPUID faulting on Intel, as per the spec - Misc hardening and cleanup changes x86 (AMD): - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would overwrite state when enabling virtualization on multiple CPUs in parallel. 
This should not be a problem because OSVW should usually be the same for all CPUs - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will crash the host due to an RMP violation page fault - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock, and enforce it by lockdep. Fix various bugs where sev_guest() was not ensured to be stable for the whole duration of a function or ioctl - Convert a pile of kvm->lock SEV code to guard() - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6 as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2 rather than CR2, for example). Only set CR2 and DR6 when consumption of the payload is imminent, but on the other hand force delivery of the payload in all paths where userspace retrieves CR2 or DR6 - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead of vmcb02->save.cr2. 
The value is out of sync after a save/restore or after a #PF is injected into L2 - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore - Add a variety of missing nSVM consistency checks - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions - Add support for save+restore of virtualized LBRs (on SVM) - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain - Aggressively sanitize fields when copying from vmcb12, to guard against unintentionally allowing L1 to utilize yet-to-be-defined features - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. There are remaining issues in that KVM doesn't handle size prefix overrides for 64-bit guests - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. 
don't double down on AMD's architectural but sketchy behavior of generating #GP for "unsupported" addresses) - Cache all used vmcb12 fields to further harden against TOCTOU bugs x86 (Intel): - Drop obsolete branch hint prefixes from the VMX instruction macros - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a register input when appropriate - Code cleanups guest_memfd: - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support reclaim, the memory is unevictable, and there is no storage to write back to LoongArch selftests: - Add KVM PMU test cases s390 selftests: - Enable more memory selftests x86 selftests: - Add support for Hygon CPUs in KVM selftests - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits) KVM: x86: use inlines instead of macros for is_sev_*guest x86/virt: Treat SVM as unsupported when running as an SEV+ guest KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper KVM: SEV: use mutex guard in snp_handle_guest_req() KVM: SEV: use mutex guard in sev_mem_enc_unregister_region() KVM: SEV: use mutex guard in sev_mem_enc_ioctl() KVM: SEV: use mutex guard in snp_launch_update() KVM: SEV: Assert that kvm->lock is held when querying SEV+ support KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe" KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y KVM: SEV: WARN on unhandled VM type when initializing VM KVM: LoongArch: selftests: Add PMU overflow interrupt test KVM: LoongArch: selftests: Add basic PMU event counting test KVM: LoongArch: selftests: Add cpucfg read/write helpers LoongArch: 
KVM: Add DMSINTC inject msi to vCPU LoongArch: KVM: Add DMSINTC device support LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch ...	2026-04-17 07:18:03 -07:00
Linus Torvalds	440d6635b2	mm.git review status for linus..mm-nonmm-stable Total patches: 126 Reviews/patch: 0.92 Reviewed rate: 76% - The 2 patch series "pid: make sub-init creation retryable" from Oleg Nesterov increases the robustness of our creation of init in a new namespace by clearing away some historical cruft which is no longer needed. Also some documentation fixups are provided. - The 2 patch series "selftests/fchmodat2: Error handling and general" from Mark Brown has a fixup and a cleanup for the fchmodat2() syscall selftest. - The 3 patch series "lib: polynomial: Move to math/ and clean up" from Andy Shevchenko does as advertised. - The 3 patch series "hung_task: Provide runtime reset interface for hung task detector" from Aaron Tomlin gives administrators the ability to zero out /proc/sys/kernel/hung_task_detect_count. - The 2 patch series "tools/getdelays: use the static UAPI headers from tools/include/uapi" from Thomas Weißschuh teaches getdelays to use the in-kernel UAPI headers rather than the system-provided ones. - The 5 patch series "watchdog/hardlockup: Improvements to hardlockup" from Mayank Rungta provides several cleanups and fixups to the hardlockup detector code and its documentation. - The 2 patch series "lib/bch: fix undefined behavior from signed left-shifts" from Josh Law provides a couple of small/theoretical fixes in the bch code. - The 2 patch series "ocfs2/dlm: fix two bugs in dlm_match_regions()" from Junrui Luo does what it claims. - The 27 patch series "cleanup the RAID5 XOR library" from Christoph Hellwig is a quite far-reaching cleanup to this code. I can't do better than to quote Christoph: The XOR library used for the RAID5 parity is a bit of a mess right now. The main file sits in crypto/ despite not being cryptography and not using the crypto API, with the generic implementations sitting in include/asm-generic and the arch implementations sitting in an asm/ header in theory. 
The latter doesn't work for many cases, so architectures often build the code directly into the core kernel, or create another module for the architecture code. Change this to a single module in lib/ that also contains the architecture optimizations, similar to the library work Eric Biggers has done for the CRC and crypto libraries later. After that it changes to better calling conventions that allow for smarter architecture implementations (although none is contained here yet), and uses static_call to avoid indirection function call overhead. - The 2 patch series "lib/list_sort: Clean up list_sort() scheduling workarounds" from Kuan-Wei Chiu cleans up this library code by removing a hacky thing which was added for UBIFS, which UBIFS doesn't actually need. - The 5 patch series "Fix bugs in extract_iter_to_sg()" from Christian Ehrhardt fixes a few bugs in the scatterlist code, adds in-kernel tests for the now-fixed bugs and fixes a leak in the test itself. - The 3 patch series "kdump: Enable LUKS-encrypted dump target support in ARM64 and PowerPC" from Coiby Xu enables support of the LUKS-encrypted device dump target on arm64 and powerpc. - The 4 patch series "ocfs2: consolidate extent list validation into block read callbacks" from Joseph Qi addresses ocfs2's validation of extent list fields - cleanup, simplification, robustness. (Kernel test robot loves mounting corrupted fs images!) Merge tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "pid: make sub-init creation retryable" (Oleg Nesterov) Make creation of init in a new namespace more robust by clearing away some historical cruft which is no longer needed. 
Also some documentation fixups - "selftests/fchmodat2: Error handling and general" (Mark Brown) Fix and a cleanup for the fchmodat2() syscall selftest - "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko) - "hung_task: Provide runtime reset interface for hung task detector" (Aaron Tomlin) Give administrators the ability to zero out /proc/sys/kernel/hung_task_detect_count - "tools/getdelays: use the static UAPI headers from tools/include/uapi" (Thomas Weißschuh) Teach getdelays to use the in-kernel UAPI headers rather than the system-provided ones - "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta) Several cleanups and fixups to the hardlockup detector code and its documentation - "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law) A couple of small/theoretical fixes in the bch code - "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo) - "cleanup the RAID5 XOR library" (Christoph Hellwig) A quite far-reaching cleanup to this code. I can't do better than to quote Christoph: "The XOR library used for the RAID5 parity is a bit of a mess right now. The main file sits in crypto/ despite not being cryptography and not using the crypto API, with the generic implementations sitting in include/asm-generic and the arch implementations sitting in an asm/ header in theory. The latter doesn't work for many cases, so architectures often build the code directly into the core kernel, or create another module for the architecture code. Change this to a single module in lib/ that also contains the architecture optimizations, similar to the library work Eric Biggers has done for the CRC and crypto libraries later. 
After that it changes to better calling conventions that allow for smarter architecture implementations (although none is contained here yet), and uses static_call to avoid indirection function call overhead" - "lib/list_sort: Clean up list_sort() scheduling workarounds" (Kuan-Wei Chiu) Clean up this library code by removing a hacky thing which was added for UBIFS, which UBIFS doesn't actually need - "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt) Fix a few bugs in the scatterlist code, add in-kernel tests for the now-fixed bugs and fix a leak in the test itself - "kdump: Enable LUKS-encrypted dump target support in ARM64 and PowerPC" (Coiby Xu) Enable support of the LUKS-encrypted device dump target on arm64 and powerpc - "ocfs2: consolidate extent list validation into block read callbacks" (Joseph Qi) Cleanup, simplify, and make more robust ocfs2's validation of extent list fields (Kernel test robot loves mounting corrupted fs images!) * tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits) ocfs2: validate group add input before caching ocfs2: validate bg_bits during freefrag scan ocfs2: fix listxattr handling when the buffer is full doc: watchdog: fix typos etc update Sean's email address ocfs2: use get_random_u32() where appropriate ocfs2: split transactions in dio completion to avoid credit exhaustion ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path() ocfs2: validate extent block list fields during block read ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec() ocfs2: validate dx_root extent list fields during block read ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY ocfs2: handle invalid dinode in ocfs2_group_extend .get_maintainer.ignore: add Askar ocfs2: validate bg_list extent bounds in discontig groups checkpatch: exclude forward declarations of const structs tools/accounting: handle truncated taskstats netlink messages taskstats: set version 
in TGID exit notifications ocfs2/heartbeat: fix slot mapping rollback leaks on error paths arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel ...	2026-04-16 20:11:56 -07:00
Linus Torvalds	c43267e679	arm64 updates for 7.1: Core features: - Add support for FEAT_LSUI, allowing futex atomic operations without toggling Privileged Access Never (PAN) - Further refactor the arm64 exception handling code towards the generic entry infrastructure - Optimise __READ_ONCE() with CONFIG_LTO=y and allow alias analysis through it Memory management: - Refactor the arm64 TLB invalidation API and implementation for better control over barrier placement and level-hinted invalidation - Enable batched TLB flushes during memory hot-unplug - Fix rodata=full block mapping support for realm guests (when BBML2_NOABORT is available) Perf and PMU: - Add support for a whole bunch of system PMUs featured in NVIDIA's Tegra410 SoC (cspmu extensions for the fabric and PCIe, new drivers for CPU/C2C memory latency PMUs) - Clean up iomem resource handling in the Arm CMN driver - Fix signedness handling of AA64DFR0.{PMUVer,PerfMon} MPAM (Memory Partitioning And Monitoring): - Add architecture context-switch and hiding of the feature from KVM - Add interface to allow MPAM to be exposed to user-space using resctrl - Add errata workaround for some existing platforms - Add documentation for using MPAM and what shape of platforms can use resctrl Miscellaneous: - Check DAIF (and PMR, where relevant) at task-switch time - Skip TFSR_EL1 checks and barriers in synchronous MTE tag check mode (only relevant to asynchronous or asymmetric tag check modes) - Remove a duplicate allocation in the kexec code - Remove redundant save/restore of SCS SP on entry to/from EL0 - Generate the KERNEL_HWCAP_ definitions from the arm64 hwcap descriptions - Add kselftest coverage for cmpbr_sigill() - Update sysreg definitions Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "The biggest changes are MPAM enablement in drivers/resctrl and new PMU support under drivers/perf. On the core side, FEAT_LSUI lets futex atomic operations run with EL0 permissions, avoiding PAN toggling. The rest is mostly TLB invalidation refactoring, further generic entry work, sysreg updates and a few fixes. 
Core features: - Add support for FEAT_LSUI, allowing futex atomic operations without toggling Privileged Access Never (PAN) - Further refactor the arm64 exception handling code towards the generic entry infrastructure - Optimise __READ_ONCE() with CONFIG_LTO=y and allow alias analysis through it Memory management: - Refactor the arm64 TLB invalidation API and implementation for better control over barrier placement and level-hinted invalidation - Enable batched TLB flushes during memory hot-unplug - Fix rodata=full block mapping support for realm guests (when BBML2_NOABORT is available) Perf and PMU: - Add support for a whole bunch of system PMUs featured in NVIDIA's Tegra410 SoC (cspmu extensions for the fabric and PCIe, new drivers for CPU/C2C memory latency PMUs) - Clean up iomem resource handling in the Arm CMN driver - Fix signedness handling of AA64DFR0.{PMUVer,PerfMon} MPAM (Memory Partitioning And Monitoring): - Add architecture context-switch and hiding of the feature from KVM - Add interface to allow MPAM to be exposed to user-space using resctrl - Add errata workaround for some existing platforms - Add documentation for using MPAM and what shape of platforms can use resctrl Miscellaneous: - Check DAIF (and PMR, where relevant) at task-switch time - Skip TFSR_EL1 checks and barriers in synchronous MTE tag check mode (only relevant to asynchronous or asymmetric tag check modes) - Remove a duplicate allocation in the kexec code - Remove redundant save/restore of SCS SP on entry to/from EL0 - Generate the KERNEL_HWCAP_ definitions from the arm64 hwcap descriptions - Add kselftest coverage for cmpbr_sigill() - Update sysreg definitions" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (109 commits) arm64: rsi: use linear-map alias for realm config buffer arm64: Kconfig: fix duplicate word in CMDLINE help text arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode arm64/sysreg: Update ID_AA64SMFR0_EL1 
description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12 arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps arm64: kexec: Remove duplicate allocation for trans_pgd ACPI: AGDI: fix missing newline in error message arm64: Check DAIF (and PMR) at task-switch time arm64: entry: Use split preemption logic arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode() arm64: entry: Consistently prefix arm64-specific wrappers arm64: entry: Don't preempt with SError or Debug masked entry: Split preemption from irqentry_exit_to_kernel_mode() entry: Split kernel mode logic from irqentry_{enter,exit}() entry: Move irqentry_enter() prototype later entry: Remove local_irq_{enable,disable}_exit_to_user() ...	2026-04-14 16:48:56 -07:00
Linus Torvalds	5d0d362330	Kbuild/Kconfig updates for 7.1 Kbuild changes ============== * tools/build: Reject unexpected values for LLVM= * kbuild: uapi: remove usage of toolchain headers * kbuild: Switch from '-fms-extensions' to '-fms-anonymous-structs' when available (currently: clang >= 23.0.0) * kbuild: Reduce the number of compiler-generated suffixes for clang thin-lto build * kbuild: reduce output spam ("GEN Makefile") when building out of tree * check-uapi: improve portability for testing headers * uapi: also test UAPI headers against C++ compilers * kbuild: vdso_install: drop build ID architecture allow-list * checksyscalls: only run when necessary * Documentation: kbuild: Update the debug information notes in reproducible-builds.rst * kconfig: forbid multiple entries with the same symbol in a choice * kbuild: expand inlining hints with -fdiagnostics-show-inlining-chain Kconfig changes =============== * kconfig: Error out on duplicated kconfig inclusion Cc: Alexander Coffin <alex@cyberialabs.net> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bill Wendling <morbo@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Dodji Seketeli <dodji@seketeli.org> Cc: H. 
Peter Anvin <hpa@zytor.com> Cc: Helge Deller <deller@gmx.de> Cc: John Moon <john@jmoon.dev> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kees Cook <kees@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Song Liu <song@kernel.org> Cc: Thomas Weißschuh <linux@weissschuh.net> Cc: Yonghong Song <yonghong.song@linux.dev> Cc: kernel-team@fb.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-efi@vger.kernel.org Cc: linux-hexagon@vger.kernel.org Cc: linux-kbuild@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: llvm@lists.linux.dev Cc: loongarch@lists.linux.dev Merge tag 'kbuild-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild/Kconfig updates from Nicolas Schier: "Kbuild: - reject unexpected values for LLVM= - uapi: remove usage of toolchain headers - switch from '-fms-extensions' to 
'-fms-anonymous-structs' when available (currently: clang >= 23.0.0) - reduce the number of compiler-generated suffixes for clang thin-lto build - reduce output spam ("GEN Makefile") when building out of tree - improve portability for testing headers - also test UAPI headers against C++ compilers - drop build ID architecture allow-list in vdso_install - only run checksyscalls when necessary - update the debug information notes in reproducible-builds.rst - expand inlining hints with -fdiagnostics-show-inlining-chain Kconfig: - forbid multiple entries with the same symbol in a choice - error out on duplicated kconfig inclusion" * tag 'kbuild-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (35 commits) kbuild: expand inlining hints with -fdiagnostics-show-inlining-chain kconfig: forbid multiple entries with the same symbol in a choice Documentation: kbuild: Update the debug information notes in reproducible-builds.rst checksyscalls: move instance functionality into generic code checksyscalls: only run when necessary checksyscalls: fail on all intermediate errors checksyscalls: move path to reference table to a variable kbuild: vdso_install: drop build ID architecture allow-list kbuild: vdso_install: gracefully handle images without build ID kbuild: vdso_install: hide readelf warnings kbuild: vdso_install: split out the readelf invocation kbuild: uapi: also test UAPI headers against C++ compilers kbuild: uapi: provide a C++ compatible dummy definition of NULL kbuild: uapi: handle UML in architecture-specific exclusion lists kbuild: uapi: move all include path flags together kbuild: uapi: move some compiler arguments out of the command definition check-uapi: use dummy libc includes check-uapi: honor ${CROSS_COMPILE} setting check-uapi: link into shared objects kbuild: reduce output spam when building out of tree ...	2026-04-14 09:18:40 -07:00
Linus Torvalds	2e31b16101	ACPI support updates for 7.1-rc1 - Update maintainers information regarding ACPICA (Rafael Wysocki) - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy() (Kees Cook) - Trigger an ordered system power off after encountering a fatal error operator in AML (Armin Wolf) - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao) - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure from the ACPI PPTT parser (Ben Horgan) - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate DeSimone) - Address multiple assorted issues and clean up the code in the ACPI processor idle driver (Huisong Li) - Replace strlcat() in the ACPI processor idle driver with a better alternative (Andy Shevchenko) - Rearrange and clean up acpi_processor_errata_piix4() (Rafael Wysocki) - Move reference performance to capabilities and fix an uninitialized variable in the ACPI CPPC library (Pengjie Zhang) - Add support for the Performance Limited Register to the ACPI CPPC library (Sumit Gupta) - Add cppc_get_perf() API to read performance controls, extend cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC library warn on missing mandatory DESIRED_PERF register (Sumit Gupta) - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in target callbacks to allow it to control performance bounds via standard scaling_min_freq and scaling_max_freq sysfs attributes and add sysfs documentation for the Performance Limited Register to it (Sumit Gupta) - Add ACPI support to the platform device interface in the CMOS RTC driver, make the ACPI core device enumeration code create a platform device for the CMOS RTC, and drop CMOS RTC PNP device support (Rafael Wysocki) - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD driver and clean up the CMOS RTC ACPI address space handler (Rafael Wysocki) - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT and allow that driver to work without a dedicated IRQ if the ACPI alarm is 
used (Rafael Wysocki) - Clean up the ACPI TAD driver in various ways and add an RTC class device interface, including both the RTC setting/reading and alarm timer support, to it (Rafael Wysocki) - Clean up the ACPI AC and ACPI PAD (processor aggregator device) drivers (Rafael Wysocki) - Rework checking for duplicate video bus devices and consolidate pnp.bus_id workarounds handling in the ACPI video bus driver (Rafael Wysocki) - Update the ACPI core device drivers to stop setting acpi_device_name() unnecessarily (Rafael Wysocki) - Rearrange code using acpi_device_class() in the ACPI core device drivers and update them to stop setting acpi_device_class() unnecessarily (Rafael Wysocki) - Define ACPI_AC_CLASS in one place (Rafael Wysocki) - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver to bind to platform devices instead of ACPI devices (Rafael Wysocki) - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI hisi driver, and add an NVIDIA vendor CPER record handler (Kai-Heng Feng) - Consolidate the interface for obtaining a CPU UID from ACPI across architectures and use it to address incorrect PCI TPH Steering Tag on ARM64 resulting from the invalid assumption that the ACPI Processor UID would always be the same as the corresponding logical CPU ID in Linux (Chengwen Feng) Merge tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI support updates from Rafael Wysocki: "These include an update of the 
CMOS RTC driver and the related ACPI and x86 code that, among other things, switches it over to using the platform device interface for device binding on x86 instead of the PNP device driver interface (which allows the code in question to be simplified quite a bit), a major update of the ACPI Time and Alarm Device (TAD) driver adding an RTC class device interface to it, and updates of core ACPI drivers that remove some unnecessary and not really useful code from them. Apart from that, two drivers are converted to using the platform driver interface for device binding instead of the ACPI driver one, which is slated for removal, support for the Performance Limited register is added to the ACPI CPPC library and there are some janitorial updates of it and the related cpufreq CPPC driver, the ACPI processor driver is fixed and cleaned up, and NVIDIA vendor CPER record handler is added to the APEI GHES code. Also, the interface for obtaining a CPU UID from ACPI is consolidated across architectures and used for fixing a problem with the PCI TPH Steering Tag on ARM64, there are two updates related to ACPICA, a minor ACPI OS Services Layer (OSL) update, and a few assorted updates related to ACPI tables parsing. 
Specifics: - Update maintainers information regarding ACPICA (Rafael Wysocki) - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy() (Kees Cook) - Trigger an ordered system power off after encountering a fatal error operator in AML (Armin Wolf) - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao) - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure from the ACPI PPTT parser (Ben Horgan) - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate DeSimone) - Address multiple assorted issues and clean up the code in the ACPI processor idle driver (Huisong Li) - Replace strlcat() in the ACPI processor idle driver with a better alternative (Andy Shevchenko) - Rearrange and clean up acpi_processor_errata_piix4() (Rafael Wysocki) - Move reference performance to capabilities and fix an uninitialized variable in the ACPI CPPC library (Pengjie Zhang) - Add support for the Performance Limited Register to the ACPI CPPC library (Sumit Gupta) - Add cppc_get_perf() API to read performance controls, extend cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC library warn on missing mandatory DESIRED_PERF register (Sumit Gupta) - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in target callbacks to allow it to control performance bounds via standard scaling_min_freq and scaling_max_freq sysfs attributes and add sysfs documentation for the Performance Limited Register to it (Sumit Gupta) - Add ACPI support to the platform device interface in the CMOS RTC driver, make the ACPI core device enumeration code create a platform device for the CMOS RTC, and drop CMOS RTC PNP device support (Rafael Wysocki) - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD driver and clean up the CMOS RTC ACPI address space handler (Rafael Wysocki) - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT and allow that driver to work without a dedicated IRQ if the ACPI alarm is used (Rafael Wysocki) - Clean up the ACPI TAD 
driver in various ways and add an RTC class device interface, including both the RTC setting/reading and alarm timer support, to it (Rafael Wysocki) - Clean up the ACPI AC and ACPI PAD (processor aggregator device) drivers (Rafael Wysocki) - Rework checking for duplicate video bus devices and consolidate pnp.bus_id workarounds handling in the ACPI video bus driver (Rafael Wysocki) - Update the ACPI core device drivers to stop setting acpi_device_name() unnecessarily (Rafael Wysocki) - Rearrange code using acpi_device_class() in the ACPI core device drivers and update them to stop setting acpi_device_class() unnecessarily (Rafael Wysocki) - Define ACPI_AC_CLASS in one place (Rafael Wysocki) - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver to bind to platform devices instead of ACPI devices (Rafael Wysocki) - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI hisi driver, and add an NVIDIA vendor CPER record handler (Kai-Heng Feng) - Consolidate the interface for obtaining a CPU UID from ACPI across architectures and use it to address incorrect PCI TPH Steering Tag on ARM64 resulting from the invalid assumption that the ACPI Processor UID would always be the same as the corresponding logical CPU ID in Linux (Chengwen Feng)" * tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits) ACPICA: Update maintainers information watchdog: ni903x_wdt: Convert to a platform driver ACPI: PAD: xen: Convert to a platform driver ACPI: processor: idle: Reset cpuidle on C-state list changes cpuidle: Extract and export no-lock variants of cpuidle_unregister_device() PCI/TPH: Pass ACPI Processor UID to Cache Locality _DSM ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu() perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu() ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval RISC-V: ACPI: 
Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler PCI: hisi: Use devm_ghes_register_vendor_record_notifier() ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier() ACPI: tables: Enable FPDT on LoongArch ACPI: processor: idle: Fix NULL pointer dereference in hotplug path ACPI: processor: idle: Reset power_setup_done flag on initialization failure ACPI: TAD: Add alarm support to the RTC class device interface ...	2026-04-13 19:25:07 -07:00
Linus Torvalds	d568788baa	hardening updates for v7.1-rc1 - randomize_kstack: Improve implementation across arches (Ryan Roberts) - lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test - refcount: Remove unused __signed_wrap function annotations Merge tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - randomize_kstack: Improve implementation across arches (Ryan Roberts) - lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test - refcount: Remove unused __signed_wrap function annotations * tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test refcount: Remove unused __signed_wrap function annotations randomize_kstack: Unify random source across arches randomize_kstack: Maintain kstack_offset per task	2026-04-13 17:52:29 -07:00
Linus Torvalds	ef3da345cc	vfs-7.1-rc1.misc Please consider pulling these changes from the signed vfs-7.1-rc1.misc tag. Thanks! Christian Merge tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - coredump: add tracepoint for coredump events - fs: hide file and bfile caches behind runtime const machinery Fixes: - fix architecture-specific compat_ftruncate64 implementations - dcache: Limit the minimal number of bucket to two - fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START - fs/mbcache: cancel shrink work before destroying the cache - dcache: permit dynamic_dname()s up to NAME_MAX Cleanups: - remove or unexport unused fs_context infrastructure - trivial ->setattr cleanups - selftests/filesystems: Assume that TIOCGPTPEER is defined - writeback: fix kernel-doc function name mismatch for wb_put_many() - autofs: replace manual symlink buffer allocation in autofs_dir_symlink - init/initramfs.c: trivial fix: FSM -> Finite-state machine - fs: remove stale and duplicate forward declarations - readdir: Introduce dirent_size() - fs: Replace user_access_{begin/end} by scoped user access - kernel: acct: fix duplicate word in comment - fs: write a better comment in step_into() concerning .mnt assignment - fs: attr: fix comment formatting and spelling issues" * tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits) dcache: permit dynamic_dname()s up to NAME_MAX fs: attr: fix comment formatting and spelling issues fs: hide file and bfile caches behind runtime const machinery fs: write a better comment in step_into() concerning .mnt assignment proc: rename proc_notify_change to proc_setattr proc: rename proc_setattr to 
proc_nochmod_setattr affs: rename affs_notify_change to affs_setattr adfs: rename adfs_notify_change to adfs_setattr hfs: update comments on hfs_inode_setattr kernel: acct: fix duplicate word in comment fs: Replace user_access_{begin/end} by scoped user access readdir: Introduce dirent_size() coredump: add tracepoint for coredump events fs: remove do_sys_truncate fs: pass on FTRUNCATE_* flags to do_truncate fs: fix archiecture-specific compat_ftruncate64 fs: remove stale and duplicate forward declarations init/initramfs.c: trivial fix: FSM -> Finite-state machine autofs: replace manual symlink buffer allocation in autofs_dir_symlink fs/mbcache: cancel shrink work before destroying the cache ...	2026-04-13 14:20:11 -07:00
Paolo Bonzini	e74c3a8891	KVM/arm64 updates for 7.1 * New features: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This comes with a full infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware. - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM. - Finally add support for pKVM protected guests, with anonymous memory being used as a backing store. About time! * Improvements and bug fixes: - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable. - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis. - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow. - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups. - A small set of patches fixing the Stage-2 MMU freeing in error cases. - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls. - The usual cleanups and other selftest churn. 
Merge tag 'kvmarm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 7.1 * New features: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This comes with a full infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware. - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM. - Finally add support for pKVM protected guests, with anonymous memory being used as a backing store. About time! * Improvements and bug fixes: - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable. - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis. - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow. 
- Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups. - A small set of patches fixing the Stage-2 MMU freeing in error cases. - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls. - The usual cleanups and other selftest churn.	2026-04-13 11:49:54 +02:00
Catalin Marinas	0baba94a97	arm64: errata: Work around early CME DVMSync acknowledgement C1-Pro acknowledges DVMSync messages before completing the SME/CME memory accesses. Work around this by issuing an IPI to the affected CPUs if they are running in EL0 with SME enabled. Note that we avoid the local DSB in the IPI handler as the kernel runs with SCTLR_EL1.IESB=1. This is sufficient to complete SME memory accesses at EL0 on taking an exception to EL1. On the return to user path, no barrier is necessary either. See the comment in sme_set_active() and the more detailed explanation in the link below. To avoid a potential IPI flood from malicious applications (e.g. madvise(MADV_PAGEOUT) in a tight loop), track where a process is active via mm_cpumask() and only interrupt those CPUs. Link: https://lore.kernel.org/r/ablEXwhfKyJW1i7l@J2N7QTR9R3 Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Reviewed-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-10 19:46:14 +01:00
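The targeting rule in the commit above (interrupt only CPUs where the process is tracked as active and that are running in EL0 with SME enabled) can be modelled in plain C. This is a userspace sketch only: `model_mm`, `model_cpu`, `MODEL_NR_CPUS` and both functions are hypothetical names, and a 32-bit mask stands in for `mm_cpumask()`. It is not the kernel's errata code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 8 /* hypothetical CPU count for the model */

/* Hypothetical stand-in for the per-mm CPU mask (mm_cpumask() in the kernel). */
struct model_mm {
	uint32_t cpumask;
};

/* Hypothetical per-CPU state: is this CPU in EL0 with SME enabled? */
struct model_cpu {
	bool in_el0_with_sme;
};

/* Record that the mm has been active on the given CPU. */
static void model_mm_mark_active(struct model_mm *mm, unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	mm->cpumask |= UINT32_C(1) << cpu;
}

/*
 * Return the set of CPUs that would receive the workaround IPI:
 * only CPUs tracked for this mm AND currently in EL0 with SME enabled.
 */
static uint32_t model_ipi_targets(const struct model_mm *mm,
				  const struct model_cpu cpus[MODEL_NR_CPUS])
{
	uint32_t targets = 0;

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		bool tracked = mm->cpumask & (UINT32_C(1) << cpu);

		if (tracked && cpus[cpu].in_el0_with_sme)
			targets |= UINT32_C(1) << cpu;
	}
	return targets;
}
```

In this model a CPU running SME code for a different process is never interrupted, which is the property the commit relies on to bound the IPI rate.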
Catalin Marinas	d9fb08ba94	arm64: tlb: Pass the corresponding mm to __tlbi_sync_s1ish() The mm structure will be used for workarounds that need limiting to specific tasks. Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-10 19:46:14 +01:00
Catalin Marinas	480a9e57cc	Merge branches 'for-next/misc', 'for-next/tlbflush', 'for-next/ttbr-macros-cleanup', 'for-next/kselftest', 'for-next/feat_lsui', 'for-next/mpam', 'for-next/hotplug-batched-tlbi', 'for-next/bbml2-fixes', 'for-next/sysreg', 'for-next/generic-entry' and 'for-next/acpi', remote-tracking branches 'arm64/for-next/perf' and 'arm64/for-next/read-once' into for-next/core * arm64/for-next/perf: : Perf updates perf/arm-cmn: Fix resource_size_t printk specifier in arm_cmn_init_dtc() perf/arm-cmn: Fix incorrect error check for devm_ioremap() perf: add NVIDIA Tegra410 C2C PMU perf: add NVIDIA Tegra410 CPU Memory Latency PMU perf/arm_cspmu: nvidia: Add Tegra410 PCIE-TGT PMU perf/arm_cspmu: nvidia: Add Tegra410 PCIE PMU perf/arm_cspmu: Add arm_cspmu_acpi_dev_get perf/arm_cspmu: nvidia: Add Tegra410 UCF PMU perf/arm_cspmu: nvidia: Rename doc to Tegra241 perf/arm-cmn: Stop claiming entire iomem region arm64: cpufeature: Use pmuv3_implemented() function arm64: cpufeature: Make PMUVer and PerfMon unsigned KVM: arm64: Read PMUVer as unsigned * arm64/for-next/read-once: : Fixes for __READ_ONCE() with CONFIG_LTO=y arm64, compiler-context-analysis: Permit alias analysis through __READ_ONCE() with CONFIG_LTO=y arm64: Optimize __READ_ONCE() with CONFIG_LTO=y * for-next/misc: : Miscellaneous cleanups/fixes arm64: rsi: use linear-map alias for realm config buffer arm64: Kconfig: fix duplicate word in CMDLINE help text arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps arm64: kexec: Remove duplicate allocation for trans_pgd arm64: mm: Use generic enum pgtable_level arm64: scs: Remove redundant save/restore of SCS SP on entry to/from EL0 arm64: remove ARCH_INLINE_* * for-next/tlbflush: : Refactor the arm64 TLB invalidation API and implementation arm64: mm: __ptep_set_access_flags must hint correct TTL arm64: mm: Provide level hint for flush_tlb_page() arm64: mm: Wrap 
flush_tlb_page() around __do_flush_tlb_range() arm64: mm: More flags for __flush_tlb_range() arm64: mm: Refactor __flush_tlb_range() to take flags arm64: mm: Refactor flush_tlb_page() to use __tlbi_level_asid() arm64: mm: Simplify __flush_tlb_range_limit_excess() arm64: mm: Simplify __TLBI_RANGE_NUM() macro arm64: mm: Re-implement the __flush_tlb_range_op macro in C arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range() arm64: mm: Push __TLBI_VADDR() into __tlbi_level() arm64: mm: Implicitly invalidate user ASID based on TLBI operation arm64: mm: Introduce a C wrapper for by-range TLB invalidation arm64: mm: Re-implement the __tlbi_level macro as a C function * for-next/ttbr-macros-cleanup: : Cleanups of the TTBR1_* macros arm64/mm: Directly use TTBRx_EL1_CnP arm64/mm: Directly use TTBRx_EL1_ASID_MASK arm64/mm: Describe TTBR1_BADDR_4852_OFFSET * for-next/kselftest: : arm64 kselftest updates selftests/arm64: Implement cmpbr_sigill() to hwcap test * for-next/feat_lsui: : Futex support using FEAT_LSUI instructions to avoid toggling PAN arm64: armv8_deprecated: Disable swp emulation when FEAT_LSUI present arm64: Kconfig: Add support for LSUI KVM: arm64: Use CAST instruction for swapping guest descriptor arm64: futex: Support futex with FEAT_LSUI arm64: futex: Refactor futex atomic operation KVM: arm64: kselftest: set_id_regs: Add test for FEAT_LSUI KVM: arm64: Expose FEAT_LSUI to guests arm64: cpufeature: Add FEAT_LSUI * for-next/mpam: (40 commits) : Expose MPAM to user-space via resctrl: : - Add architecture context-switch and hiding of the feature from KVM. : - Add interface to allow MPAM to be exposed to user-space using resctrl. : - Add errata workaround for some existing platforms. 
: - Add documentation for using MPAM and what shape of platforms can use resctrl arm64: mpam: Add initial MPAM documentation arm_mpam: Quirk CMN-650's CSU NRDY behaviour arm_mpam: Add workaround for T241-MPAM-6 arm_mpam: Add workaround for T241-MPAM-4 arm_mpam: Add workaround for T241-MPAM-1 arm_mpam: Add quirk framework arm_mpam: resctrl: Call resctrl_init() on platforms that can support resctrl arm64: mpam: Select ARCH_HAS_CPU_RESCTRL arm_mpam: resctrl: Add empty definitions for assorted resctrl functions arm_mpam: resctrl: Update the rmid reallocation limit arm_mpam: resctrl: Add resctrl_arch_rmid_read() arm_mpam: resctrl: Allow resctrl to allocate monitors arm_mpam: resctrl: Add support for csu counters arm_mpam: resctrl: Add monitor initialisation and domain boilerplate arm_mpam: resctrl: Add kunit test for control format conversions arm_mpam: resctrl: Add support for 'MB' resource arm_mpam: resctrl: Wait for cacheinfo to be ready arm_mpam: resctrl: Add rmid index helpers arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats arm_mpam: resctrl: Hide CDP emulation behind CONFIG_EXPERT ... 
* for-next/hotplug-batched-tlbi: : arm64/mm: Enable batched TLB flush in unmap_hotplug_range() arm64/mm: Reject memory removal that splits a kernel leaf mapping arm64/mm: Enable batched TLB flush in unmap_hotplug_range() * for-next/bbml2-fixes: : Fixes for realm guest and BBML2_NOABORT arm64: mm: Remove pmd_sect() and pud_sect() arm64: mm: Handle invalid large leaf mappings correctly arm64: mm: Fix rodata=full block mapping support for realm guests * for-next/sysreg: : arm64 sysreg updates arm64/sysreg: Update ID_AA64SMFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12 arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12 arm64/sysreg: Update SMIDR_EL1 to DDI0601 2025-06 * for-next/generic-entry: : More arm64 refactoring towards using the generic entry code arm64: Check DAIF (and PMR) at task-switch time arm64: entry: Use split preemption logic arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode() arm64: entry: Consistently prefix arm64-specific wrappers arm64: entry: Don't preempt with SError or Debug masked entry: Split preemption from irqentry_exit_to_kernel_mode() entry: Split kernel mode logic from irqentry_{enter,exit}() entry: Move irqentry_enter() prototype later entry: Remove local_irq_{enable,disable}_exit_to_user() entry: Fix stale comment for irqentry_enter() * for-next/acpi: : arm64 ACPI updates ACPI: AGDI: fix missing newline in error message	2026-04-10 14:22:24 +01:00
Aneesh Kumar K.V (Arm)	34e563947c	arm64: rsi: use linear-map alias for realm config buffer rsi_get_realm_config() passes its argument to virt_to_phys(), but &config is a kernel image address and not a linear-map alias. On arm64 this triggers the below warning: virt_to_phys used for non-linear address: (____ptrval____) (config+0x0/0x1000) WARNING: arch/arm64/mm/physaddr.c:15 at __virt_to_phys+0x50/0x70, CPU#0: swapper/0 Modules linked in: ..... Hardware name: linux,dummy-virt (DT) pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __virt_to_phys+0x50/0x70 lr : __virt_to_phys+0x4c/0x70 ..... ...... Call trace: __virt_to_phys+0x50/0x70 (P) arm64_rsi_init+0xa0/0x1b8 setup_arch+0x13c/0x1a0 start_kernel+0x68/0x398 __primary_switched+0x88/0x90 Pass lm_alias(&config) instead so the RSI call uses the linear-map alias of the same buffer and avoids the boot-time warning. Signed-off-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-10 11:52:04 +01:00
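The distinction behind this fix can be sketched with a small userspace model of two virtual windows onto the same physical memory. All base addresses and `model_*` names below are hypothetical and chosen only for illustration. The model mirrors the idea that `virt_to_phys()` is only meaningful for linear-map addresses, while `lm_alias()` yields the linear-map address of a kernel image symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout constants for the model; not real arm64 values. */
#define MODEL_PHYS_BASE    UINT64_C(0x40000000)         /* start of RAM */
#define MODEL_LINEAR_BASE  UINT64_C(0xffff000000000000) /* linear map window */
#define MODEL_IMAGE_BASE   UINT64_C(0xffff800080000000) /* kernel image window */
#define MODEL_IMAGE_PHYS   UINT64_C(0x40200000)         /* where the image sits in RAM */
#define MODEL_WINDOW_SIZE  UINT64_C(0x10000000)

/* True if va lies inside the modelled linear map. */
static bool model_is_linear_addr(uint64_t va)
{
	return va >= MODEL_LINEAR_BASE &&
	       va - MODEL_LINEAR_BASE < MODEL_WINDOW_SIZE;
}

/* Model of virt_to_phys(): only valid for linear-map addresses. */
static uint64_t model_virt_to_phys(uint64_t va)
{
	assert(model_is_linear_addr(va)); /* the kernel warns here instead */
	return va - MODEL_LINEAR_BASE + MODEL_PHYS_BASE;
}

/* Model of translating a kernel image symbol to its physical address. */
static uint64_t model_pa_symbol(uint64_t image_va)
{
	return image_va - MODEL_IMAGE_BASE + MODEL_IMAGE_PHYS;
}

/* Model of lm_alias(): linear-map address of the same physical memory. */
static uint64_t model_lm_alias(uint64_t image_va)
{
	return model_pa_symbol(image_va) - MODEL_PHYS_BASE + MODEL_LINEAR_BASE;
}
```

Passing an image address straight to `model_virt_to_phys()` trips the assertion, which corresponds to the boot-time warning quoted in the commit; passing `model_lm_alias()` of the same address does not.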
Muhammad Usama Anjum	249bf97331	arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode With KASAN_HW_TAGS (MTE) in synchronous mode, tag check faults are reported as immediate Data Abort exceptions. The TFSR_EL1.TF1 bit is never set since faults never go through the asynchronous path. Therefore, reading TFSR_EL1 and executing data and instruction barriers on kernel entry, exit, context switch and suspend is unnecessary overhead. As with the check_mte_async_tcf and clear_mte_async_tcf paths for TFSRE0_EL1, extend the same optimisation to kernel entry/exit, context switch and suspend. All mte kselftests pass. The kunit tests show the same results before and after the patch. A selection of test_vmalloc benchmarks running on an arm64 machine. v6.19 is the baseline. (>0 is faster, <0 is slower, (R)/(I) = statistically significant Regression/Improvement). Based on significance and ignoring the noise, the benchmarks improved. * 77 result classes were considered, with 9 wins, 0 losses and 68 ties Results of fastpath [1] on v6.19 vs this patch:
+----------------------------+----------------------------------------------------------+------------+
| Benchmark                  | Result Class                                             |   barriers |
+============================+==========================================================+============+
| micromm/fork               | fork: p:1, d:10 (seconds)                                |  (I) 2.75% |
|                            | fork: p:512, d:10 (seconds)                              |      0.96% |
+----------------------------+----------------------------------------------------------+------------+
| micromm/munmap             | munmap: p:1, d:10 (seconds)                              |     -1.78% |
|                            | munmap: p:512, d:10 (seconds)                            |      5.02% |
+----------------------------+----------------------------------------------------------+------------+
| micromm/vmalloc            | fix_align_alloc_test: p:1, h:0, l:500000 (usec)          |     -0.56% |
|                            | fix_size_alloc_test: p:1, h:0, l:500000 (usec)           |      0.70% |
|                            | fix_size_alloc_test: p:4, h:0, l:500000 (usec)           |      1.18% |
|                            | fix_size_alloc_test: p:16, h:0, l:500000 (usec)          |     -5.01% |
|                            | fix_size_alloc_test: p:16, h:1, l:500000 (usec)          |     13.81% |
|                            | fix_size_alloc_test: p:64, h:0, l:100000 (usec)          |      6.51% |
|                            | fix_size_alloc_test: p:64, h:1, l:100000 (usec)          |     32.87% |
|                            | fix_size_alloc_test: p:256, h:0, l:100000 (usec)         |      4.17% |
|                            | fix_size_alloc_test: p:256, h:1, l:100000 (usec)         |      8.40% |
|                            | fix_size_alloc_test: p:512, h:0, l:100000 (usec)         |     -0.48% |
|                            | fix_size_alloc_test: p:512, h:1, l:100000 (usec)         |     -0.74% |
|                            | full_fit_alloc_test: p:1, h:0, l:500000 (usec)           |      0.53% |
|                            | kvfree_rcu_1_arg_vmalloc_test: p:1, h:0, l:500000 (usec) |     -2.81% |
|                            | kvfree_rcu_2_arg_vmalloc_test: p:1, h:0, l:500000 (usec) |     -2.06% |
|                            | long_busy_list_alloc_test: p:1, h:0, l:500000 (usec)     |     -0.56% |
|                            | pcpu_alloc_test: p:1, h:0, l:500000 (usec)               |     -0.41% |
|                            | random_size_align_alloc_test: p:1, h:0, l:500000 (usec)  |      0.89% |
|                            | random_size_alloc_test: p:1, h:0, l:500000 (usec)        |      1.71% |
|                            | vm_map_ram_test: p:1, h:0, l:500000 (usec)               |      0.83% |
+----------------------------+----------------------------------------------------------+------------+
| schbench/thread-contention | -m 16 -t 1 -r 10 -s 1000, avg_rps (req/sec)              |      0.05% |
|                            | -m 16 -t 1 -r 10 -s 1000, req_latency_p99 (usec)         |      0.60% |
|                            | -m 16 -t 1 -r 10 -s 1000, wakeup_latency_p99 (usec)      |      0.00% |
|                            | -m 16 -t 4 -r 10 -s 1000, avg_rps (req/sec)              |     -0.34% |
|                            | -m 16 -t 4 -r 10 -s 1000, req_latency_p99 (usec)         |     -0.58% |
|                            | -m 16 -t 4 -r 10 -s 1000, wakeup_latency_p99 (usec)      |      9.09% |
|                            | -m 16 -t 16 -r 10 -s 1000, avg_rps (req/sec)             |     -0.74% |
|                            | -m 16 -t 16 -r 10 -s 1000, req_latency_p99 (usec)        |     -1.40% |
|                            | -m 16 -t 16 -r 10 -s 1000, wakeup_latency_p99 (usec)     |      0.00% |
|                            | -m 16 -t 64 -r 10 -s 1000, avg_rps (req/sec)             |     -0.78% |
|                            | -m 16 -t 64 -r 10 -s 1000, req_latency_p99 (usec)        |     -0.11% |
|                            | -m 16 -t 64 -r 10 -s 1000, wakeup_latency_p99 (usec)     |      0.11% |
|                            | -m 16 -t 256 -r 10 -s 1000, avg_rps (req/sec)            |      2.64% |
|                            | -m 16 -t 256 -r 10 -s 1000, req_latency_p99 (usec)       |      3.15% |
|                            | -m 16 -t 256 -r 10 -s 1000, wakeup_latency_p99 (usec)    |     17.54% |
|                            | -m 32 -t 1 -r 10 -s 1000, avg_rps (req/sec)              |     -1.22% |
|                            | -m 32 -t 1 -r 10 -s 1000, req_latency_p99 (usec)         |      0.85% |
|                            | -m 32 -t 1 -r 10 -s 1000, wakeup_latency_p99 (usec)      |      0.00% |
|                            | -m 32 -t 4 -r 10 -s 1000, avg_rps (req/sec)              |     -0.34% |
|                            | -m 32 -t 4 -r 10 -s 1000, req_latency_p99 (usec)         |      1.05% |
|                            | -m 32 -t 4 -r 10 -s 1000, wakeup_latency_p99 (usec)      |      0.00% |
|                            | -m 32 -t 16 -r 10 -s 1000, avg_rps (req/sec)             |     -0.41% |
|                            | -m 32 -t 16 -r 10 -s 1000, req_latency_p99 (usec)        |      0.58% |
|                            | -m 32 -t 16 -r 10 -s 1000, wakeup_latency_p99 (usec)     |      2.13% |
|                            | -m 32 -t 64 -r 10 -s 1000, avg_rps (req/sec)             |      0.67% |
|                            | -m 32 -t 64 -r 10 -s 1000, req_latency_p99 (usec)        |      2.07% |
|                            | -m 32 -t 64 -r 10 -s 1000, wakeup_latency_p99 (usec)     |     -1.28% |
|                            | -m 32 -t 256 -r 10 -s 1000, avg_rps (req/sec)            |      1.01% |
|                            | -m 32 -t 256 -r 10 -s 1000, req_latency_p99 (usec)       |      0.69% |
|                            | -m 32 -t 256 -r 10 -s 1000, wakeup_latency_p99 (usec)    |     13.12% |
|                            | -m 64 -t 1 -r 10 -s 1000, avg_rps (req/sec)              |     -0.25% |
|                            | -m 64 -t 1 -r 10 -s 1000, req_latency_p99 (usec)         |     -0.48% |
|                            | -m 64 -t 1 -r 10 -s 1000, wakeup_latency_p99 (usec)      |     10.53% |
|                            | -m 64 -t 4 -r 10 -s 1000, avg_rps (req/sec)              |     -0.06% |
|                            | -m 64 -t 4 -r 10 -s 1000, req_latency_p99 (usec)         |      0.00% |
|                            | -m 64 -t 4 -r 10 -s 1000, wakeup_latency_p99 (usec)      |      0.00% |
|                            | -m 64 -t 16 -r 10 -s 1000, avg_rps (req/sec)             |     -0.36% |
|                            | -m 64 -t 16 -r 10 -s 1000, req_latency_p99 (usec)        |      0.52% |
|                            | -m 64 -t 16 -r 10 -s 1000, wakeup_latency_p99 (usec)     |      0.11% |
|                            | -m 64 -t 64 -r 10 -s 1000, avg_rps (req/sec)             |      0.52% |
|                            | -m 64 -t 64 -r 10 -s 1000, req_latency_p99 (usec)        |      3.53% |
|                            | -m 64 -t 64 -r 10 -s 1000, wakeup_latency_p99 (usec)     |     -0.10% |
|                            | -m 64 -t 256 -r 10 -s 1000, avg_rps (req/sec)            |      2.53% |
|                            | -m 64 -t 256 -r 10 -s 1000, req_latency_p99 (usec)       |      1.82% |
|                            | -m 64 -t 256 -r 10 -s 1000, wakeup_latency_p99 (usec)    |     -5.80% |
+----------------------------+----------------------------------------------------------+------------+
| syscall/getpid             | mean (ns)                                                | (I) 15.98% |
|                            | p99 (ns)                                                 | (I) 11.11% |
|                            | p99.9 (ns)                                               | (I) 16.13% |
+----------------------------+----------------------------------------------------------+------------+
| syscall/getppid            | mean (ns)                                                | (I) 14.82% |
|                            | p99 (ns)                                                 | (I) 17.86% |
|                            | p99.9 (ns)                                               |  (I) 9.09% |
+----------------------------+----------------------------------------------------------+------------+
| syscall/invalid            | mean (ns)                                                | (I) 17.78% |
|                            | p99 (ns)                                                 | (I) 11.11% |
|                            | p99.9 (ns)                                               |     13.33% |
+----------------------------+----------------------------------------------------------+------------+
[1] https://gitlab.arm.com/tooling/fastpath Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com> Reviewed-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-09 18:56:22 +01:00
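The optimisation in the MTE commit above amounts to an early return: in synchronous mode the fault-status check and its barriers are skipped entirely. A minimal userspace sketch follows, assuming hypothetical `model_*` names and counters in place of the real TFSR_EL1 read and the DSB/ISB barriers; it is not the kernel's entry code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tag-check-fault modes for the model. */
enum model_tcf_mode { MODEL_TCF_SYNC, MODEL_TCF_ASYNC };

struct model_cpu {
	enum model_tcf_mode mode;
	bool tfsr_tf1;           /* models the TFSR_EL1.TF1 bit */
	unsigned int tfsr_reads; /* counts modelled TFSR_EL1 reads */
	unsigned int barriers;   /* counts modelled barrier sequences */
};

/*
 * Returns true if a pending asynchronous tag check fault was found
 * (and cleared). In synchronous mode faults arrive as Data Aborts,
 * so TF1 is never set and the register read and barriers are skipped.
 */
static bool model_check_async_fault(struct model_cpu *cpu)
{
	assert(cpu != NULL);

	if (cpu->mode == MODEL_TCF_SYNC)
		return false;

	cpu->barriers++;
	cpu->tfsr_reads++;
	if (cpu->tfsr_tf1) {
		cpu->tfsr_tf1 = false;
		return true;
	}
	return false;
}
```

The saving shows up on paths that run this check very often, which is consistent with the syscall rows being the statistically significant improvements in the table.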
Wang Wensheng	ee020bf6f1	arm64: kexec: Remove duplicate allocation for trans_pgd trans_pgd_create_copy() already allocates trans_pgd, so remove the duplicate allocation made before calling it. Fixes: `3744b5280e` ("arm64: kexec: install a copy of the linear-map") Signed-off-by: Wang Wensheng <wsw9603@163.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-08 17:49:08 +01:00
Mark Rutland	8d13386c76	arm64: Check DAIF (and PMR) at task-switch time When __switch_to() switches from a 'prev' task to a 'next' task, various pieces of CPU state are expected to have specific values, such that these do not need to be saved/restored. If any of these hold an unexpected value when switching away from the prev task, they could lead to surprising behaviour in the context of the next task, and it would be difficult to determine where they were configured to their unexpected value. Add some checks for DAIF and PMR at task-switch time so that we can detect such issues. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jinjie Ruan <ruanjinjie@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@kernel.org> Cc: Vladimir Murzin <vladimir.murzin@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-08 17:40:04 +01:00
Mark Rutland	ae654112ea	arm64: entry: Use split preemption logic The generic irqentry code now provides irqentry_exit_to_kernel_mode_preempt() and irqentry_exit_to_kernel_mode_after_preempt(), which can be used where architectures have different state requirements for involuntary preemption and exception return, as is the case on arm64. Use the new functions on arm64, aligning our exit to kernel mode logic with the style of our exit to user mode logic. This removes the need for the recently-added bodge in arch_irqentry_exit_need_resched(), and allows preemption to occur when returning from any exception taken from kernel mode, which is nicer for RT. In an ideal world, we'd remove arch_irqentry_exit_need_resched(), and fold the conditionality directly into the architecture-specific entry code. That way all the logic necessary to avoid preempting from a pseudo-NMI could be constrained specifically to the EL1 IRQ/FIQ paths, avoiding redundant work for other exceptions, and making the flow a bit clearer. At present it looks like that would require a larger refactoring (e.g. for the PREEMPT_DYNAMIC logic), and so I've left that as-is for now. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jinjie Ruan <ruanjinjie@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@kernel.org> Cc: Vladimir Murzin <vladimir.murzin@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-08 17:40:04 +01:00
Mark Rutland	a07b7b2142	arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode() The generic irqentry code now provides irqentry_enter_from_kernel_mode() and irqentry_exit_to_kernel_mode(), which can be used when an exception is known to be taken from kernel mode. These can be inlined into architecture-specific entry code, and avoid redundant work to test whether the exception was taken from user mode. Use these in arm64_enter_from_kernel_mode() and arm64_exit_to_kernel_mode(), which are only used for exceptions known to be taken from kernel mode. This will remove a small amount of redundant work, and will permit further changes to arm64_exit_to_kernel_mode() in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jinjie Ruan <ruanjinjie@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@kernel.org> Cc: Vladimir Murzin <vladimir.murzin@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-08 17:40:04 +01:00
Mark Rutland	6879ef1302	arm64: entry: Consistently prefix arm64-specific wrappers For historical reasons, arm64's entry code has arm64-specific functions named enter_from_kernel_mode() and exit_to_kernel_mode(), which are wrappers for similarly-named functions from the generic irqentry code. Other arm64-specific wrappers have an 'arm64_' prefix to clearly distinguish them from their generic counterparts, e.g. arm64_enter_from_user_mode() and arm64_exit_to_user_mode(). For consistency and clarity, add an 'arm64_' prefix to these functions. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jinjie Ruan <ruanjinjie@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@kernel.org> Cc: Vladimir Murzin <vladimir.murzin@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-08 17:40:04 +01:00
Marc Zyngier	d77f4792db	Merge branch kvm-arm64/vgic-fixes-7.1 into kvmarm-master/next * kvm-arm64/vgic-fixes-7.1: : . : First pass at fixing a number of vgic-v5 bugs that were found : after the merge of the initial series. : . KVM: arm64: Advertise ID_AA64PFR2_EL1.GCIE KVM: arm64: vgic-v5: Fold PPI state for all exposed PPIs KVM: arm64: set_id_regs: Allow GICv3 support to be set at runtime KVM: arm64: Don't advertises GICv3 in ID_PFR1_EL1 if AArch32 isn't supported KVM: arm64: Correctly plumb ID_AA64PFR2_EL1 into pkvm idreg handling KVM: arm64: Move GICv5 timer PPI validation into timer_irqs_are_valid() KVM: arm64: Remove evaluation of timer state in kvm_cpu_has_pending_timer() KVM: arm64: Kill arch_timer_context::direct field KVM: arm64: vgic-v5: Correctly set dist->ready once initialised KVM: arm64: vgic-v5: Make the effective priority mask a strict limit KVM: arm64: vgic-v5: Cast vgic_apr to u32 to avoid undefined behaviours KVM: arm64: vgic-v5: Transfer edge pending state to ICH_PPI_PENDRx_EL2 KVM: arm64: vgic-v5: Hold config_lock while finalizing GICv5 PPIs KVM: arm64: Account for RESx bits in __compute_fgt() KVM: arm64: Fix writeable mask for ID_AA64PFR2_EL1 arm64: Fix field references for ICH_PPI_DVIR[01]_EL2 KVM: arm64: Don't skip per-vcpu NV initialisation KVM: arm64: vgic: Don't reset cpuif/redist addresses at finalize time Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-04-08 12:26:00 +01:00
Marc Zyngier	b693940e81	Merge branch kvm-arm64/pkvm-psci into kvmarm-master/next * kvm-arm64/pkvm-psci: : . : Cleanup of the pKVM PSCI relay CPU entry code, making it slightly : easier to follow, should someone have to wade into these waters : ever again. : . KVM: arm64: Remove extra ISBs when using msr_hcr_el2 KVM: arm64: pkvm: Use direct function pointers for cpu_{on,resume} KVM: arm64: pkvm: Turn __kvm_hyp_init_cpu into an inner label KVM: arm64: pkvm: Simplify BTI handling on CPU boot KVM: arm64: pkvm: Move error handling to the end of kvm_hyp_cpu_entry Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-04-08 12:23:24 +01:00
Marc Zyngier	2de32a25a3	Merge branch kvm-arm64/hyp-tracing into kvmarm-master/next * kvm-arm64/hyp-tracing: (40 commits) : . : EL2 tracing support, adding both 'remote' ring-buffer : infrastructure and the tracing itself, courtesy of : Vincent Donnefort. From the cover letter: : : "The growing set of features supported by the hypervisor in protected : mode necessitates debugging and profiling tools. Tracefs is the : ideal candidate for this task: : : * It is simple to use and to script. : : * It is supported by various tools, from the trace-cmd CLI to the : Android web-based perfetto. : : * The ring-buffer, where are stored trace events consists of linked : pages, making it an ideal structure for sharing between kernel and : hypervisor. : : This series first introduces a new generic way of creating remote events and : remote buffers. Then it adds support to the pKVM hypervisor." : . tracing: selftests: Extend hotplug testing for trace remotes tracing: Non-consuming read for trace remotes with an offline CPU tracing: Adjust cmd_check_undefined to show unexpected undefined symbols tracing: Restore accidentally removed SPDX tag KVM: arm64: avoid unused-variable warning tracing: Generate undef symbols allowlist for simple_ring_buffer KVM: arm64: tracing: add ftrace dependency tracing: add more symbols to whitelist tracing: Update undefined symbols allow list for simple_ring_buffer KVM: arm64: Fix out-of-tree build for nVHE/pKVM tracing tracing: selftests: Add hypervisor trace remote tests KVM: arm64: Add selftest event support to nVHE/pKVM hyp KVM: arm64: Add hyp_enter/hyp_exit events to nVHE/pKVM hyp KVM: arm64: Add event support to the nVHE/pKVM hyp and trace remote KVM: arm64: Add trace reset to the nVHE/pKVM hyp KVM: arm64: Sync boot clock with the nVHE/pKVM hyp KVM: arm64: Add trace remote for the nVHE/pKVM hyp KVM: arm64: Add tracing capability for the nVHE/pKVM hyp KVM: arm64: Support unaligned fixmap in the pKVM hyp KVM: arm64: Initialise hyp_nr_cpus for nVHE 
hyp ... Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-04-08 12:21:51 +01:00
Chengwen Feng	7cd5f5659a	arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval As a step towards unifying the interface for retrieving ACPI CPU UID across architectures, introduce a new function acpi_get_cpu_uid() for arm64. While at it, add input validation to make the code more robust. Reimplement get_cpu_for_acpi_id() based on acpi_get_cpu_uid() for consistency, and move its implementation next to the new function for code coherence. Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://patch.msgid.link/20260401081640.26875-2-fengchengwen@huawei.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2026-04-06 16:55:15 +02:00
Marc Zyngier	7e629348df	KVM: arm64: Advertise ID_AA64PFR2_EL1.GCIE As we are missing ID_AA64PFR2_EL1.GCIE from the kernel feature set, userspace cannot write ID_AA64PFR2_EL1 with GCIE set, even if we are on a GICv5 host. Add the required field description. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://patch.msgid.link/20260401170017.369529-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-04-05 08:15:35 +01:00
Linus Torvalds	441c63ff42	arm64 fix for v7.0 - Implement a basic static call trampoline to fix CFI failures with the generic implementation. Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Will Deacon: - Implement a basic static call trampoline to fix CFI failures with the generic implementation * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Use static call trampolines when kCFI is enabled	2026-04-03 08:47:13 -07:00
Coiby Xu	e3a84be1ec	arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel CONFIG_CRASH_DM_CRYPT has been introduced to support a LUKS-encrypted device as the dump target by addressing two challenges [1], - Kdump kernel may not be able to decrypt the LUKS partition. For some machines, a system administrator may not have a chance to enter the password to decrypt the device in kdump initramfs after the 1st kernel crashes - LUKS2 by default uses the memory-hard Argon2 key derivation function, which is quite memory-consuming compared to the limited memory reserved for kdump. To also enable this feature for ARM64 and PowerPC, the missing piece is to let the kdump kernel know where to find the dm-crypt keys which are randomly stored in memory reserved for kdump. Introduce a new device tree property dmcryptkeys [2], similar to elfcorehdr, to pass the memory address of the stored info of dm-crypt keys to the kdump kernel. Since this property is only needed by the kdump kernel, it won't be exposed to userspace. Link: https://lkml.kernel.org/r/20260225060347.718905-4-coxu@redhat.com Link: https://lore.kernel.org/all/20250502011246.99238-1-coxu@redhat.com/ [1] Link: https://github.com/devicetree-org/dt-schema/pull/181 [2] Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com> Cc: Arnaud Lefebvre <arnaud.lefebvre@clever-cloud.com> Cc: Baoquan he <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Kairui Song <ryncsn@gmail.com> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Thomas Staudt <tstaudt@de.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-04-02 23:36:24 -07:00
Ard Biesheuvel	54ac9ff8f1	arm64: Use static call trampolines when kCFI is enabled Implement arm64 support for the 'unoptimized' static call variety, which routes all calls through a trampoline that performs a tail call to the chosen function, and wire it up for use when kCFI is enabled. This works around an issue with kCFI and generic static calls, where the prototypes of default handlers such as __static_call_nop() and __static_call_ret0() don't match the expected prototype of the call site, resulting in kCFI false positives [0]. Since static call targets may be located in modules loaded out of direct branching range, this needs an ADRP/LDR pair to load the branch target into R16 and a branch-to-register (BR) instruction to perform an indirect call. Unlike on x86, there is no pressing need on arm64 to avoid indirect calls at all cost, but hiding it from the compiler as is done here does have some benefits: - the literal is located in .rodata, which gives us the same robustness advantage that code patching does; - no D-cache pollution from fetching hash values from .text sections. From an execution speed PoV, this is unlikely to make any difference at all. Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Kees Cook <kees@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will McVicker <willmcvicker@google.com> Reported-by: Carlos Llamas <cmllamas@google.com> Closes: https://lore.kernel.org/all/20260311225822.1565895-1-cmllamas@google.com/ [0] Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-04-01 15:29:59 +01:00
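The control flow of the "unoptimized" static call described above can be approximated in portable C: every call site goes through one trampoline, which loads the current target from a literal and transfers control to it. The sketch below is only a model under stated assumptions. All `model_*` names are hypothetical, a struct field stands in for the .rodata literal, and an ordinary function-pointer call stands in for the ADRP/LDR plus BR sequence, which C cannot express directly.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*model_target_fn)(int);

/* Hypothetical static call: one literal holding the current target. */
struct model_static_call {
	model_target_fn target; /* the kernel keeps this literal in .rodata */
};

/* Two example targets sharing the call site's prototype. */
static int model_default_handler(int arg)
{
	(void)arg;
	return 0;
}

static int model_real_handler(int arg)
{
	return 2 * arg;
}

/* Trampoline: load the target from the literal, then jump to it. */
static int model_trampoline(const struct model_static_call *sc, int arg)
{
	assert(sc != NULL && sc->target != NULL);
	return sc->target(arg);
}

/* Retargeting rewrites only the literal; call sites are never patched. */
static void model_static_call_update(struct model_static_call *sc,
				     model_target_fn fn)
{
	assert(sc != NULL && fn != NULL);
	sc->target = fn;
}
```

Because callers always branch to the trampoline, switching from `model_default_handler` to `model_real_handler` changes behaviour at every call site with a single store.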
Yeoreum Yun	e223258ed8	arm64: armv8_deprecated: Disable swp emulation when FEAT_LSUI present The purpose of supporting LSUI is to eliminate PAN toggling. CPUs that support LSUI are unlikely to support a 32-bit runtime. Rather than emulating the SWP instruction using LSUI instructions in order to remove PAN toggling, simply disable SWP emulation. Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com> [catalin.marinas@arm.com: some tweaks to the in-code comment] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-03-27 17:29:10 +00:00
Ben Horgan	37fe0f984d	arm64: mpam: Initialise and context switch the MPAMSM_EL1 register The MPAMSM_EL1 sets the MPAM labels, PMG and PARTID, for loads and stores generated by a shared SMCU. Disable the traps so the kernel can use it and set it to the same configuration as the per-EL cpu MPAM configuration. If an SMCU is not shared with other cpus then it is implementation defined whether the configuration from MPAMSM_EL1 is used or that from the appropriate MPAMy_ELx. As we set the same, PMG_D and PARTID_D, configuration for MPAM0_EL1, MPAM1_EL1 and MPAMSM_EL1 the resulting configuration is the same regardless. The range of valid configurations for the PARTID and PMG in MPAMSM_EL1 is not currently specified in Arm Architectural Reference Manual but the architect has confirmed that it is intended to be the same as that for the cpu configuration in the MPAMy_ELx registers. Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com> Tested-by: Jesse Chick <jessechick@os.amperecomputing.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2026-03-27 15:29:02 +00:00
James Morse	735dad9999	arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs The MPAM system registers will be lost if the CPU is reset during PSCI's CPU_SUSPEND. Add a PM notifier to restore them. mpam_thread_switch(current) can't be used as this won't make any changes if the in-memory copy says the register already has the correct value. In reality the system register is UNKNOWN out of reset. Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com> Tested-by: Jesse Chick <jessechick@os.amperecomputing.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Co-developed-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2026-03-27 15:28:54 +00:00
James Morse	831a7f1672	arm64: mpam: Advertise the CPUs MPAM limits to the driver Requesters need to populate the MPAM fields for any traffic they send on the interconnect. For the CPUs these values are taken from the corresponding MPAMy_ELx register. Each requester may have a limit on the largest PARTID or PMG value that can be used. The MPAM driver has to determine the system-wide minimum supported PARTID and PMG values. To do this, the driver needs to be told what each requestor's limit is. CPUs are special, but this infrastructure is also needed for the SMMU and GIC ITS. Call the helper to tell the MPAM driver what the CPUs can do. The return value can be ignored by the arch code as it runs well before the MPAM driver starts probing. Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com> Tested-by: Jesse Chick <jessechick@os.amperecomputing.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Co-developed-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> [ morse: requestor->requester as argued by ispell ] Signed-off-by: James Morse <james.morse@arm.com>	2026-03-27 15:28:42 +00:00
James Morse	87b78a5d70	arm64: mpam: Re-initialise MPAM regs when CPU comes online Now that the MPAM system registers are expected to have values that change, reprogram them based on the previous value when a CPU is brought online. Previously MPAM's 'default PARTID' of 0 was always used for MPAM in kernel-space as this is the PARTID that hardware guarantees to reset. Because there are a limited number of PARTID, this value is exposed to user-space, meaning resctrl changes to the resctrl default group would also affect kernel threads. Instead, use the task's PARTID value for kernel work on behalf of user-space too. The default of 0 is kept for both user-space and kernel-space when MPAM is not enabled. Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com> Tested-by: Jesse Chick <jessechick@os.amperecomputing.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Co-developed-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2026-03-27 15:28:25 +00:00
James Morse	8e06d04ff1	arm64: mpam: Context switch the MPAM registers MPAM allows traffic in the SoC to be labeled by the OS; these labels are used to apply policy in caches and bandwidth regulators, and to monitor traffic in the SoC. The label is made up of a PARTID and PMG value. The x86 equivalent calls these CLOSID and RMID, but they don't map precisely. MPAM has two CPU system registers that are used to hold the PARTID and PMG values that traffic generated at each exception level will use. These can be set per-task by the resctrl file system. (resctrl is the de facto interface for controlling this stuff). Add a helper to switch this. struct task_struct's separate CLOSID and RMID fields are insufficient to implement resctrl using MPAM, as resctrl can change the PARTID (CLOSID) and PMG (sort of like the RMID) separately. On x86, the rmid is an independent number, so a race that writes a mismatched closid and rmid into hardware is benign. On arm64, the pmg bits extend the partid. (i.e. partid-5 has a pmg-0 that is not the same as partid-6's pmg-0). In this case, mismatching the values will 'dirty' a pmg value that resctrl believes is clean, and is not tracking with its 'limbo' code. To avoid this, the partid and pmg are always read and written as a pair. This requires a new u64 field. In struct task_struct there are two u32, rmid and closid for the x86 case, but as we can't use them here do something else. Add this new field, mpam_partid_pmg, to struct thread_info to avoid adding more architecture specific code to struct task_struct. Always use READ_ONCE()/WRITE_ONCE() when accessing this field. Resctrl allows a per-cpu 'default' value to be set, this overrides the values when scheduling a task in the default control-group, which has PARTID 0. The way 'code data prioritisation' gets emulated means the register value for the default group needs to be a variable. 
The current system register value is kept in a per-cpu variable to avoid writing to the system register if the value isn't going to change. Writes to this register may reset the hardware state for regulating bandwidth. Finally, there is no reason to context switch these registers unless there is a driver changing the values in struct task_struct. Hide the whole thing behind a static key. This also allows the driver to disable MPAM in response to errors reported by hardware. Move the existing static key to belong to the arch code, as in the future the MPAM driver may become a loadable module. All this should depend on whether there is an MPAM driver, hide it behind CONFIG_ARM64_MPAM. Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Zeng Heng <zengheng4@huawei.com> Tested-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com> Tested-by: Jesse Chick <jessechick@os.amperecomputing.com> CC: Amit Singh Tomar <amitsinght@marvell.com> Reviewed-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Co-developed-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: Ben Horgan <ben.horgan@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2026-03-27 15:27:59 +00:00
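The two ideas in the commit above (PARTID and PMG travelling together in one 64-bit value, and a per-cpu copy of the register that lets redundant writes be skipped) can be sketched in plain userspace C. This is only an illustration under stated assumptions: the bit packing, `struct fake_cpu`, and every function name below are hypothetical, and nothing here touches a real system register or reproduces the kernel's actual helper.

```c
#include <stdint.h>

/*
 * Illustrative packing only: PARTID in the low half, PMG in the high half.
 * This is NOT the real MPAMy_ELx bit layout; it just shows that both values
 * live in one u64 so they can be read and written as a pair.
 */
static inline uint64_t pack_partid_pmg(uint16_t partid, uint8_t pmg)
{
	return ((uint64_t)pmg << 32) | partid;
}

static inline uint16_t unpack_partid(uint64_t v)
{
	return (uint16_t)(v & 0xffff);
}

static inline uint8_t unpack_pmg(uint64_t v)
{
	return (uint8_t)(v >> 32);
}

/* Hypothetical stand-in for one CPU's register and its per-cpu shadow copy. */
struct fake_cpu {
	uint64_t hw_reg;        /* pretend system register */
	uint64_t shadow;        /* last value written to hw_reg */
	unsigned int hw_writes; /* counts how often the "register" was written */
};

/* Write the register only when the incoming task needs a different value. */
static void fake_mpam_switch(struct fake_cpu *cpu, uint64_t next)
{
	if (cpu->shadow == next)
		return;         /* value unchanged: skip the write */

	cpu->hw_reg = next;
	cpu->shadow = next;
	cpu->hw_writes++;
}
```

Switching twice in a row to a task with the same packed value leaves `hw_writes` at 1 in this sketch; only a task with a different PARTID/PMG pair triggers a second write. Because the pair is a single `uint64_t`, a reader can never observe the PARTID of one group combined with the PMG of another.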
Yeoreum Yun	7181f718cb	arm64: cpufeature: Add FEAT_LSUI Since Armv9.6, FEAT_LSUI introduces atomic instructions that allow privileged code to access user memory without clearing the PSTATE.PAN bit. Add CPU feature detection for FEAT_LSUI. Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com> [catalin.marinas@arm.com: Remove commit log references to SW_PAN] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-03-26 18:19:07 +00:00
Ryan Roberts	a96ef5848c	randomize_kstack: Unify random source across arches Previously different architectures were using random sources of differing strength and cost to decide the random kstack offset. A number of architectures (loongarch, powerpc, s390, x86) were using their timestamp counter, at whatever the frequency happened to be. Other arches (arm64, riscv) were using entropy from the crng via get_random_u16(). There have been concerns that in some cases the timestamp counters may be too weak, because they can be easily guessed or influenced by user space. And get_random_u16() has been shown to be too costly for the level of protection kstack offset randomization provides. So let's use a common, architecture-agnostic source of entropy; a per-cpu prng, seeded at boot-time from the crng. This has a few benefits: - We can remove choose_random_kstack_offset(); That was only there to try to make the timestamp counter value a bit harder to influence from user space []. - The architecture code is simplified. All it has to do now is call add_random_kstack_offset() in the syscall path. - The strength of the randomness can be reasoned about independently of the architecture. - Arches previously using get_random_u16() now have much faster syscall paths, see below results. [] Additionally, this gets rid of some redundant work on s390 and x86. Before this patch, those architectures called choose_random_kstack_offset() under arch_exit_to_user_mode_prepare(), which is also called for exception returns to userspace which were not syscalls (e.g. regular interrupts). Getting rid of choose_random_kstack_offset() avoids a small amount of redundant work for the non-syscall cases. In some configurations, add_random_kstack_offset() will now call instrumentable code, so for a couple of arches, I have moved the call a bit later to the first point where instrumentation is allowed. This doesn't impact the efficacy of the mechanism. 
There have been some claims that a prng may be less strong than the timestamp counter if not regularly reseeded. But the prng has a period of about 2^113. So as long as the prng state remains secret, it should not be possible to guess. If the prng state can be accessed, we have bigger problems. Additionally, we are only consuming 6 bits to randomize the stack, so there are only 64 possible random offsets. I assert that it would be trivial for an attacker to brute force by repeating their attack and waiting for the random stack offset to be the desired one. The prng approach seems entirely proportional to this level of protection. Performance data are provided below. The baseline is v6.18 with rndstack on for each respective arch. (I)/(R) indicate statistically significant improvement/regression. arm64 platform is AWS Graviton3 (m7g.metal). x86_64 platform is AWS Sapphire Rapids (m7i.24xlarge):

+-----------------+--------------+---------------+---------------+
| Benchmark       | Result Class | per-cpu-prng  | per-cpu-prng  |
|                 |              | arm64 (metal) | x86_64 (VM)   |
+=================+==============+===============+===============+
| syscall/getpid  | mean (ns)    | (I) -9.50%    | (I) -17.65%   |
|                 | p99 (ns)     | (I) -59.24%   | (I) -24.41%   |
|                 | p99.9 (ns)   | (I) -59.52%   | (I) -28.52%   |
+-----------------+--------------+---------------+---------------+
| syscall/getppid | mean (ns)    | (I) -9.52%    | (I) -19.24%   |
|                 | p99 (ns)     | (I) -59.25%   | (I) -25.03%   |
|                 | p99.9 (ns)   | (I) -59.50%   | (I) -28.17%   |
+-----------------+--------------+---------------+---------------+
| syscall/invalid | mean (ns)    | (I) -10.31%   | (I) -18.56%   |
|                 | p99 (ns)     | (I) -60.79%   | (I) -20.06%   |
|                 | p99.9 (ns)   | (I) -61.04%   | (I) -25.04%   |
+-----------------+--------------+---------------+---------------+

I tested an earlier version of this change on x86 bare metal and it showed a smaller but still significant improvement. 
The bare metal system wasn't available this time around so testing was done in a VM instance. I'm guessing the cost of rdtsc is higher for VMs. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://patch.msgid.link/20260303150840.3789438-3-ryan.roberts@arm.com Signed-off-by: Kees Cook <kees@kernel.org>	2026-03-24 21:12:03 -07:00
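To make the "6 bits, 64 possible offsets" point concrete, here is a minimal userspace sketch of drawing a stack offset from a small per-cpu generator. It assumes a xorshift32 generator purely as a stand-in; it is not the kernel's prng (which the message says has a period of about 2^113), and `struct fake_rnd_state`, `fake_prng_next()` and `fake_kstack_offset()` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical per-cpu generator state; must be seeded with a non-zero value. */
struct fake_rnd_state {
	uint32_t s;
};

/* xorshift32 step: a simple stand-in generator, NOT the kernel's prng. */
static uint32_t fake_prng_next(struct fake_rnd_state *st)
{
	uint32_t x = st->s;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	st->s = x;
	return x;
}

#define FAKE_KSTACK_OFFSET_BITS 6 /* the message says only 6 bits are consumed */

/* Keep the low 6 bits, giving one of 64 possible offsets (0..63). */
static unsigned int fake_kstack_offset(struct fake_rnd_state *st)
{
	return fake_prng_next(st) & ((1u << FAKE_KSTACK_OFFSET_BITS) - 1);
}
```

In this sketch each call advances the state with a few shifts and XORs and never leaves the CPU-local structure, which is the kind of cost profile the message contrasts with calling into the crng on every syscall.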
James Clark	15ed3fa23c	arm64: cpufeature: Use pmuv3_implemented() function Other places that are doing this version comparison are already using pmuv3_implemented(), so might as well use it here too for consistency. Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Colton Lewis <coltonlewis@google.com> Signed-off-by: Will Deacon <will@kernel.org>	2026-03-24 12:33:49 +00:00
James Clark	d1dcc20bcc	arm64: cpufeature: Make PMUVer and PerfMon unsigned On the host, this change doesn't make a difference because the fields are defined as FTR_EXACT. However, KVM allows userspace to set these fields for a guest and overrides the type to be FTR_LOWER_SAFE. And while KVM used to do an unsigned comparison to validate that the new value is lower than what the hardware provides, since the linked commit it uses the generic sanitization framework which does a signed comparison. Fix it by defining these fields as unsigned. In theory, without this fix, userspace could set a higher PMU version than the hardware supports by providing any value with the top bit set. Fixes: `c118cead07` ("KVM: arm64: Use generic sanitisation for ID_(AA64)DFR0_EL1") Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Colton Lewis <coltonlewis@google.com> Signed-off-by: Will Deacon <will@kernel.org>	2026-03-24 12:33:49 +00:00
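The signed-versus-unsigned problem described above can be reproduced with a 4-bit field in isolation. The sketch below is a userspace illustration, not the kernel's sanitisation code: `field_signed()`, `field_unsigned()` and `lower_safe_ok()` are hypothetical helpers, and the "lower is safe" rule is modelled simply as "requested value must not exceed the hardware value".

```c
#include <stdbool.h>
#include <stdint.h>

#define PMUVER_SHIFT 8 /* ID_AA64DFR0_EL1.PMUVer occupies bits [11:8] */

/* Extract a 4-bit ID register field as an unsigned value (0..15). */
static unsigned int field_unsigned(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/* Extract the same field as a signed value (-8..7) by sign-extending bit 3. */
static int field_signed(uint64_t reg, unsigned int shift)
{
	int v = (int)field_unsigned(reg, shift);

	return (v & 0x8) ? v - 16 : v;
}

/* Hypothetical "lower is safe" check: accept req only if it does not exceed hw. */
static bool lower_safe_ok(uint64_t hw, uint64_t req, unsigned int shift,
			  bool is_signed)
{
	if (is_signed)
		return field_signed(req, shift) <= field_signed(hw, shift);

	return field_unsigned(req, shift) <= field_unsigned(hw, shift);
}
```

With a hardware field value of 0x6 and a requested value of 0x8, the signed comparison sees -8 <= 6 and accepts the request, while the unsigned comparison sees 8 <= 6 and rejects it. That is the "any value with the top bit set" case the commit message describes.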
Christoph Hellwig	e43dce8a0b	fs: fix architecture-specific compat_ftruncate64 The "small" argument to do_sys_ftruncate indicates if > 32-bit size should be rejected, but all the arch-specific compat ftruncate64 implementations get this wrong. Merge do_sys_ftruncate and ksys_ftruncate, replace the integer as boolean small flag with a descriptive one about LFS semantics, and use it correctly in the architecture-specific ftruncate64 implementations. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Fixes: `3dd681d944` ("arm64: 32-bit (compat) applications support") Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20260323070205.2939118-2-hch@lst.de Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>	2026-03-23 12:41:57 +01:00
Marc Zyngier	54a3cc1454	KVM: arm64: Remove extra ISBs when using msr_hcr_el2 The msr_hcr_el2 macro is slightly awkward, as it provides an ISB when CONFIG_AMPERE_ERRATUM_AC04_CPU_23 is present, and none otherwise. Note that this option is 'default y', meaning that it is likely to be selected. Most instances of msr_hcr_el2 are also immediately followed by an ISB, meaning that in most cases, you end up with two back-to-back ISBs. This isn't a big deal, but once you have seen that, you can't unsee it. Rework the msr_hcr_el2 macro to always provide the ISB, and drop the superfluous ISBs everywhere else. Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260321212419.2803972-6-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>	2026-03-23 11:03:53 +00:00
Linus Torvalds	165160265e	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "There's a small crop of fixes for the MPAM resctrl driver, a fix for SCS/PAC patching with the AMDGPU driver and a page-table fix for realms running with 52-bit physical addresses: - Fix DWARF parsing for SCS/PAC patching to work with very large modules (such as the amdgpu driver) - Fixes to the mpam resctrl driver - Fix broken handling of 52-bit physical addresses when sharing memory from within a realm" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: realm: Fix PTE_NS_SHARED for 52bit PA support arm_mpam: Force __iomem casts arm_mpam: Disable preemption when making accesses to fake MSC in kunit test arm_mpam: Fix null pointer dereference when restoring bandwidth counters arm64/scs: Fix handling of advance_loc4	2026-03-20 09:23:01 -07:00
Suzuki K Poulose	8c6e9b60f5	arm64: realm: Fix PTE_NS_SHARED for 52bit PA support With LPA/LPA2, the top bits of the PFN (Bits[51:48]) end up in the lower bits of the PTE. So, simply creating a mask of the "top IPA bit" doesn't work well for these configurations to set the "top" bit at the output of Stage1 translation. Fix this by using the __phys_to_pte_val() to do the right thing for all configurations. Tested using kvmtool, placing the memory at a higher address (-m <size>@<Addr>). e.g: # lkvm run --realm -c 4 -m 512M@@128T -k Image --console serial sh-5.0# dmesg | grep "LPA2\|RSI" [ 0.000000] RME: Using RSI version 1.0 [ 0.000000] CPU features: detected: 52-bit Virtual Addressing (LPA2) [ 0.777354] CPU features: detected: 52-bit Virtual Addressing for KVM (LPA2) Fixes: `3993069549` ("arm64: realm: Query IPA size from the RMM") Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Steven Price <steven.price@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>	2026-03-19 12:46:05 +00:00
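Why a plain "top IPA bit" mask goes wrong can be shown with a small userspace model of the address folding. The sketch assumes one particular layout, the 64K-page 52-bit case as I understand it, where physical address bits [51:48] are stored in PTE bits [15:12]; other configurations place the high bits elsewhere. The macro values and the `fake_`/`naive_`/`folded_` names are assumptions for illustration, not the kernel's definitions.

```c
#include <stdint.h>

/* Assumed layout (64K pages, 52-bit PA): PA[47:16] stay put, PA[51:48] -> PTE[15:12]. */
#define FAKE_PTE_ADDR_LOW        0x0000ffffffff0000ULL /* PTE bits [47:16] */
#define FAKE_PTE_ADDR_HIGH       0x000000000000f000ULL /* PTE bits [15:12] */
#define FAKE_PTE_ADDR_HIGH_SHIFT 36                    /* 48 - 12 */

/* Fold a 64K-aligned physical address into its PTE address bits. */
static uint64_t fake_phys_to_pte_val(uint64_t phys)
{
	return (phys | (phys >> FAKE_PTE_ADDR_HIGH_SHIFT)) &
	       (FAKE_PTE_ADDR_LOW | FAKE_PTE_ADDR_HIGH);
}

/* Naive approach: use the top IPA bit directly as a PTE mask. */
static uint64_t naive_top_ipa_bit(unsigned int ipa_bits)
{
	return 1ULL << (ipa_bits - 1);
}

/* Approach described in the fix: run the top IPA bit through the folding helper. */
static uint64_t folded_top_ipa_bit(unsigned int ipa_bits)
{
	return fake_phys_to_pte_val(1ULL << (ipa_bits - 1));
}
```

For a 48-bit IPA both helpers agree on bit 47. For a 52-bit IPA the naive mask sets bit 51 of the PTE, which is not where this layout keeps address bit 51, whereas the folded value sets PTE bit 15.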
Linus Torvalds	11e8c7e947	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "Quite a large pull request, partly due to skipping last week and therefore having material from ~all submaintainers in this one. About a fourth of it is a new selftest, and a couple more changes are large in number of files touched (fixing a -Wflex-array-member-not-at-end compiler warning) or lines changed (reformatting of a table in the API documentation, thanks rST). But who am I kidding---it's a lot of commits and there are a lot of bugs being fixed here, some of them on the nastier side like the RISC-V ones. ARM: - Correctly handle deactivation of interrupts that were activated from LRs. 
Since EOIcount only denotes deactivation of interrupts that are not present in an LR, start EOIcount deactivation walk after the last irq that made it into an LR - Avoid calling into the stubs to probe for ICH_VTR_EL2.TDS when pKVM is already enabled -- not only this isn't possible (pKVM will reject the call), but it is also useless: this can only happen for a CPU that has already booted once, and the capability will not change - Fix a couple of low-severity bugs in our S2 fault handling path, affecting the recently introduced LS64 handling and the even more esoteric handling of hwpoison in a nested context - Address yet another syzkaller finding in the vgic initialisation, where we would end-up destroying an uninitialised vgic with nasty consequences - Address an annoying case of pKVM failing to boot when some of the memblock regions that the host is faulting in are not page-aligned - Inject some sanity in the NV stage-2 walker by checking the limits against the advertised PA size, and correctly report the resulting faults PPC: - Fix a PPC e500 build error due to a long-standing wart that was exposed by the recent conversion to kmalloc_obj(); rip out all the ugliness that led to the wart RISC-V: - Prevent speculative out-of-bounds access using array_index_nospec() in APLIC interrupt handling, ONE_REG register access, AIA CSR access, float register access, and PMU counter access - Fix potential use-after-free issues in kvm_riscv_gstage_get_leaf(), kvm_riscv_aia_aplic_has_attr(), and kvm_riscv_aia_imsic_has_attr() - Fix potential null pointer dereference in kvm_riscv_vcpu_aia_rmw_topei() - Fix off-by-one array access in SBI PMU - Skip THP support check during dirty logging - Fix error code returned for Smstateen and Ssaia ONE_REG interface - Check host Ssaia extension when creating AIA irqchip x86: - Fix cases where CPUID mitigation features were incorrectly marked as available whenever the kernel used scattered feature words for them - Validate _all_ GVAs, rather 
than just the first GVA, when processing a range of GVAs for Hyper-V's TLB flush hypercalls - Fix a brown paper bug in add_atomic_switch_msr() - Use hlist_for_each_entry_srcu() when traversing mask_notifier_list, to fix a lockdep warning; KVM doesn't hold RCU, just irq_srcu - Ensure AVIC VMCB fields are initialized if the VM has an in-kernel local APIC (and AVIC is enabled at the module level) - Update CR8 write interception when AVIC is (de)activated, to fix a bug where the guest can run in perpetuity with the CR8 intercept enabled - Add a quirk to skip the consistency check on FREEZE_IN_SMM, i.e. to allow L1 hypervisors to set FREEZE_IN_SMM. This reverts (by default) an unintentional tightening of userspace ABI in 6.17, and provides some amount of backwards compatibility with hypervisors who want to freeze PMCs on VM-Entry - Validate the VMCS/VMCB on return to a nested guest from SMM, because either userspace or the guest could stash invalid values in memory and trigger the processor's consistency checks Generic: - Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from being unnecessary and confusing, triggered compiler warnings due to -Wflex-array-member-not-at-end - Document that vcpu->mutex is taken outside of kvm->slots_lock and kvm->slots_arch_lock, which is intentional and desirable despite being rather unintuitive Selftests: - Increase the maximum number of NUMA nodes in the guest_memfd selftest to 64 (from 8)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits) KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8 Documentation: kvm: fix formatting of the quirks table KVM: x86: clarify leave_smm() return value selftests: kvm: add a test that VMX validates controls on RSM selftests: kvm: extract common functionality out of smm_test.c KVM: SVM: check validity of VMCB controls when returning from SMM KVM: VMX: check validity of VMCS controls when returning from SMM KVM: SVM: Set/clear CR8 write 
interception when AVIC is (de)activated KVM: SVM: Initialize AVIC VMCB fields if AVIC is enabled with in-kernel APIC KVM: x86: Introduce KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM KVM: x86: Fix SRCU list traversal in kvm_fire_mask_notifiers() KVM: VMX: Fix a wrong MSR update in add_atomic_switch_msr() KVM: x86: hyper-v: Validate all GVAs during PV TLB flush KVM: x86: synthesize CPUID bits only if CPU capability is set KVM: PPC: e500: Rip out "struct tlbe_ref" KVM: PPC: e500: Fix build error due to using kmalloc_obj() with wrong type KVM: selftests: Increase 'maxnode' for guest_memfd tests KVM: arm64: pkvm: Don't reprobe for ICH_VTR_EL2.TDS on CPU hotplug KVM: arm64: vgic: Pick EOIcount deactivations from AP-list tail KVM: arm64: Remove the redundant ISB in __kvm_at_s1e2() ...	2026-03-15 12:22:10 -07:00
Anshuman Khandual	be6e9dee0e	arm64/mm: Directly use TTBRx_EL1_CnP Replace all TTBR_CNP_BIT macro instances with TTBRx_EL1_CNP_BIT which is a standard field from tools sysreg format. Drop the now redundant custom macro TTBR_CNP_BIT. No functional change. Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oupton@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: kvmarm@lists.linux.dev Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-03-14 16:12:27 +00:00
Anshuman Khandual	d989010bbe	arm64/mm: Directly use TTBRx_EL1_ASID_MASK Replace all TTBR_ASID_MASK macro instances with TTBRx_EL1_ASID_MASK which is a standard field mask from tools sysreg format. Drop the now redundant custom macro TTBR_ASID_MASK. No functional change. Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oupton@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: kvmarm@lists.linux.dev Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2026-03-14 16:12:27 +00:00
Barry Song	2c92eff008	arm64: Provide dcache_by_myline_op_nosync helper dcache_by_myline_op ensures completion of the data cache operations for a region, while dcache_by_myline_op_nosync only issues them without waiting. This enables deferred synchronization so completion for multiple regions can be handled together later. Cc: Leon Romanovsky <leon@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Ada Couprie Diaz <ada.coupriediaz@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tangquan Zheng <zhengtangquan@oppo.com> Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com> Signed-off-by: Barry Song <baohua@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260228221216.59886-1-21cnbao@gmail.com	2026-03-13 23:46:32 +01:00


5110 Commits