linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-22 14:12:07 +02:00

Author	SHA1	Message	Date
Linus Torvalds	ff57d59200	LoongArch changes for v7.1 1, Adjust build infrastructure for 32BIT/64BIT; 2, Add HIGHMEM (PKMAP and FIX_KMAP) support; 3, Show and handle CPU vulnerabilites correctly; 4, Batch the icache maintenance for jump_label; 5, Add more atomic instructions support for BPF JIT; 6, Add more features (e.g. fsession) support for BPF trampoline; 7, Some bug fixes and other small changes. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmnpwWgWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImeiAXD/0RSRhj2y8LYGhVSPStMgN4uwMl 1ylbkRg0biTvV0g8sD1R3MQ58/tKBZY5wTeLjwT50rl+JgOqVdrN6OMAxjwOKzJ6 7C0rgpxBG5/YHI93paFVIYszsiWhRQaB5qfZCUOr230ZDJzvnfF1aH6JLybeHoMp HvERNURQsRbZo9yc69YxhrmHETEbum37u9hsrY5mJSEs5Fh+QxvTSYjE36z3Dtal YFqopTCaBgAhVw6BldVAcyvGvVK+d6iQEA035jObNLKKReNkwsQixxgnJhDSkbbG Z3md+hWp+YQQElGIP5q6+rj1rJZGrs/XL3HAnTQfXN+8bXIUO9AOf2/l5f9fZx7o 2Vtt8n2/vVdzsVnKiHXGtsZ5uXrw4/kLiMZSCrUMZCtEOxJV9mmrVskPeie0iq0/ nDG9uCgRldL8Xpg7d5NM9coECui3J+ztNkv06tL/JLm02bJPuqNwt3FeA1T/aH1c l2Hpw3Xuzl7lYuAYoa5CMm4X6yD/RA6w44pW1NKnb6j6llIOk6V6NwcwggWUnqht oB5VIqPKMOYjZ+fLurI2o9VWqWokJxDdzyrHhXyaG0JRK9Vak06C8UI5BQuosu88 9WBoxK77PyNa60m56C32OZ5tu4UoPT8PgZWXDQDwn82SWzuYKWRruS2ng5A/JF7r H2Ez4iBjs2/P7vTQHA== =FiFl -----END PGP SIGNATURE----- Merge tag 'loongarch-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Adjust build infrastructure for 32BIT/64BIT - Add HIGHMEM (PKMAP and FIX_KMAP) support - Show and handle CPU vulnerabilites correctly - Batch the icache maintenance for jump_label - Add more atomic instructions support for BPF JIT - Add more features (e.g. fsession) support for BPF trampoline - Some bug fixes and other small changes * tag 'loongarch-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (21 commits) selftests/bpf: Enable CAN_USE_LOAD_ACQ_STORE_REL for LoongArch LoongArch: BPF: Add fsession support for trampolines LoongArch: BPF: Introduce emit_store_stack_imm64() helper LoongArch: BPF: Support up to 12 function arguments for trampoline LoongArch: BPF: Support small struct arguments for trampoline LoongArch: BPF: Open code and remove invoke_bpf_mod_ret() LoongArch: BPF: Support load-acquire and store-release instructions LoongArch: BPF: Support 8 and 16 bit read-modify-write instructions LoongArch: BPF: Add the default case in emit_atomic() and rename it LoongArch: Define instruction formats for AM{SWAP/ADD}.{B/H} and DBAR LoongArch: Batch the icache maintenance for jump_label LoongArch: Add flush_icache_all()/local_flush_icache_all() LoongArch: Add spectre boundry for syscall dispatch table LoongArch: Show CPU vulnerabilites correctly LoongArch: Make arch_irq_work_has_interrupt() true only if IPI HW exist LoongArch: Use get_random_canary() for stack canary init LoongArch: Improve the logging of disabling KASLR LoongArch: Align FPU register state to 32 bytes LoongArch: Handle CONFIG_32BIT in syscall_get_arch() LoongArch: Add HIGHMEM (PKMAP and FIX_KMAP) support ...	2026-04-24 09:54:45 -07:00
Tiezhu Yang	1dd3e8a8ee	LoongArch: Define instruction formats for AM{SWAP/ADD}.{B/H} and DBAR The 8 and 16 bit read-modify-write atomic instructions amadd.{b/h} and amswap.{b/h} were newly added in the latest LoongArch Reference Manual, define the instruction format and check whether support via CPUCFG. Furthermore, define the instruction format for DBAR which will be used to support BPF load-acquire and store-release instructions. This is preparation for later patches. Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-22 15:45:13 +08:00
Youling Tang	2c749f734e	LoongArch: Batch the icache maintenance for jump_label Switch to the batched version of the jump label update functions so instruction cache maintenance is deferred until the end of the update. Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-22 15:45:13 +08:00
Greg Kroah-Hartman	0c965d2784	LoongArch: Add spectre boundry for syscall dispatch table The LoongArch syscall number is directly controlled by userspace, but does not have a array_index_nospec() boundry to prevent access past the syscall function pointer tables. Cc: stable@vger.kernel.org Assisted-by: gkh_clanker_2000 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-22 15:45:12 +08:00
Huacai Chen	37e57e8ad9	LoongArch: Show CPU vulnerabilites correctly Most LoongArch processors are vulnerable to Spectre-V1 Proof-of-Concept (PoC). And the generic mechanism, __user pointer sanitization, can be used as a mitigation. This means to use array_index_nospec() to prevent out of boundry access in syscall and other critical paths. Implement the arch-specific cpu_show_spectre_v1() to show CPU Spectre-V1 vulnerabilites correctly. Cc: stable@vger.kernel.org Link: https://cc-sw.com/chinese-loongarch-architecture-evaluation-part-3-of-3/ Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-22 15:45:12 +08:00
Yuqian Yang	847634955b	LoongArch: Improve the logging of disabling KASLR Whether KASLR is disabled is not handled in nokaslr() which is the early param "nokaslr" setup function, but in kaslr_disabled(). However, the logging was previously done in nokaslr() and lack detail. So we move the logging to the right place and add more specific infomation about why it is disabled. Suggested-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Yuqian Yang <yangyuqian@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-22 15:45:11 +08:00
Lisa Robinson	e3f4591f79	LoongArch: Align FPU register state to 32 bytes Move fpr to the beginning of struct loongarch_fpu so it is naturally aligned to FPU_ALIGN (32 bytes), improving 256-bit SIMD (LASX) context switch performance. Also adjust process.c and fpu.S to work well with the new loongarch_fpu layout. Signed-off-by: Lisa Robinson <lisa@bytefly.space> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-22 15:45:11 +08:00
Huacai Chen	3d9aba6618	LoongArch: Adjust build infrastructure for 32BIT/64BIT Adjust build infrastructure (Kconfig, Makefile and ld scripts) to let us enable both 32BIT/64BIT kernel build. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-22 15:44:26 +08:00
Linus Torvalds	01f492e181	Arm: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This uses the new infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware, and which came through the tracing tree. - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM. - Finally add support for pKVM protected guests, where pages are unmapped from the host as they are faulted into the guest and can be shared back from the guest using pKVM hypercalls. Protected guests are created using a new machine type identifier. As the elusive guestmem has not yet delivered on its promises, anonymous memory is also supported. This is only a first step towards full isolation from the host; for example, the CPU register state and DMA accesses are not yet isolated. Because this does not really yet bring fully what it promises, it is hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is created. Caveat emptor. - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable. - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis. - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow. - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups. - A small set of patches fixing the Stage-2 MMU freeing in error cases. - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls. - The usual cleanups and other selftest churn. LoongArch: - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel(). - Add DMSINTC irqchip in kernel support. RISC-V: - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space s390: - Support for ESA (31-bit) guests inside nested hypervisors. - Remove restriction on memslot alignment, which is not needed anymore with the new gmap code. - Fix LPSW/E to update the bear (which of course is the breaking event address register). x86: - Shut up various UBSAN warnings on reading module parameter before they were initialized. - Don't zero-allocate page tables that are used for splitting hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and thus write all bytes. - As an optimization, bail early when trying to unsync 4KiB mappings if the target gfn can just be mapped with a 2MiB hugepage. x86 generic: - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM would dereference stack pointer after an exit to userspace. - Clean up and comment the emulated MMIO code to try to make it easier to maintain (not necessarily "easy", but "easier"). - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not all of VMX and SVM enabling) as it is needed for trusted I/O. - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions - Immediately fail the build if a required #define is missing in one of KVM's headers that is included multiple times. - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected exception, mostly to prevent syzkaller from abusing the uAPI to trigger WARNs, but also because it can help prevent userspace from unintentionally crashing the VM. - Exempt SMM from CPUID faulting on Intel, as per the spec. - Misc hardening and cleanup changes. x86 (AMD): - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs. - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would overwrite state when enabling virtualization on multiple CPUs in parallel. This should not be a problem because OSVW should usually be the same for all CPUs. - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input. - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION. - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will crash the host due to an RMP violation page fault. - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock, and enforce it by lockdep. Fix various bugs where sev_guest() was not ensured to be stable for the whole duration of a function or ioctl. - Convert a pile of kvm->lock SEV code to guard(). - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6 as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2 rather than CR2, for example). Only set CR2 and DR6 when consumption of the payload is imminent, but on the other hand force delivery of the payload in all paths where userspace retrieves CR2 or DR6. - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead of vmcb02->save.cr2. The value is out of sync after a save/restore or after a #PF is injected into L2. - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE. - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore. - Add a variety of missing nSVM consistency checks. - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT. - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions. - Add support for save+restore of virtualized LBRs (on SVM). - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain. - Aggressively sanitize fields when copying from vmcb12, to guard against unintentionally allowing L1 to utilize yet-to-be-defined features. - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. There are remaining issues in that KVM doesn't handle size prefix overrides for 64-bit guests. - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's architectural but sketchy behavior of generating #GP for "unsupported" addresses). - Cache all used vmcb12 fields to further harden against TOCTOU bugs. x86 (Intel): - Drop obsolete branch hint prefixes from the VMX instruction macros. - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a register input when appropriate. - Code cleanups. guest_memfd: - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support reclaim, the memory is unevictable, and there is no storage to write back to. LoongArch selftests: - Add KVM PMU test cases s390 selftests: - Enable more memory selftests. x86 selftests: - Add support for Hygon CPUs in KVM selftests. - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP. - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmnftRQUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPAzwf+NKO4Ktv+7A22ImN0SBl0nlUuulsz vTcw3+hxdRoIw83GdNS+hG5js0wrpMDnbv3t4+VliDNBSSxrBzcSWX2wpilW0Xtw qGo1MWhs2lKPy1NlaRVOwPS6j7uF3AR0TQ1iQLGMedQuCU9WpiKJxyhNXJdbLrt3 8EgFzsvtEsv+jKNRUNDf9+d0j4gZsFyIe+Brhianbw+u3/UCiUClLCdsKPc4+5ZX 08otYXytacGNIf/5Ev1vT4pHkHL0yqKXAtX7LEtaS3+0KrPuLjV4slemivzE9vf5 Evafm5AhA4wpaNMb1ZerhY3T94lsMaJpWxotjR//0Q7C9B59pCQnXCm8mg== =CcE0 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "Arm: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This uses the new infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware, and which came through the tracing tree - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM - Finally add support for pKVM protected guests, where pages are unmapped from the host as they are faulted into the guest and can be shared back from the guest using pKVM hypercalls. Protected guests are created using a new machine type identifier. As the elusive guestmem has not yet delivered on its promises, anonymous memory is also supported This is only a first step towards full isolation from the host; for example, the CPU register state and DMA accesses are not yet isolated. Because this does not really yet bring fully what it promises, it is hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is created. Caveat emptor - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups - A small set of patches fixing the Stage-2 MMU freeing in error cases - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls - The usual cleanups and other selftest churn LoongArch: - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel() - Add DMSINTC irqchip in kernel support RISC-V: - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space s390: - Support for ESA (31-bit) guests inside nested hypervisors - Remove restriction on memslot alignment, which is not needed anymore with the new gmap code - Fix LPSW/E to update the bear (which of course is the breaking event address register) x86: - Shut up various UBSAN warnings on reading module parameter before they were initialized - Don't zero-allocate page tables that are used for splitting hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and thus write all bytes - As an optimization, bail early when trying to unsync 4KiB mappings if the target gfn can just be mapped with a 2MiB hugepage x86 generic: - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM would dereference stack pointer after an exit to userspace - Clean up and comment the emulated MMIO code to try to make it easier to maintain (not necessarily "easy", but "easier") - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not all of VMX and SVM enabling) as it is needed for trusted I/O - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions - Immediately fail the build if a required #define is missing in one of KVM's headers that is included multiple times - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected exception, mostly to prevent syzkaller from abusing the uAPI to trigger WARNs, but also because it can help prevent userspace from unintentionally crashing the VM - Exempt SMM from CPUID faulting on Intel, as per the spec - Misc hardening and cleanup changes x86 (AMD): - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would overwrite state when enabling virtualization on multiple CPUs in parallel. This should not be a problem because OSVW should usually be the same for all CPUs - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will crash the host due to an RMP violation page fault - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock, and enforce it by lockdep. Fix various bugs where sev_guest() was not ensured to be stable for the whole duration of a function or ioctl - Convert a pile of kvm->lock SEV code to guard() - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6 as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2 rather than CR2, for example). Only set CR2 and DR6 when consumption of the payload is imminent, but on the other hand force delivery of the payload in all paths where userspace retrieves CR2 or DR6 - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead of vmcb02->save.cr2. The value is out of sync after a save/restore or after a #PF is injected into L2 - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore - Add a variety of missing nSVM consistency checks - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions - Add support for save+restore of virtualized LBRs (on SVM) - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain - Aggressively sanitize fields when copying from vmcb12, to guard against unintentionally allowing L1 to utilize yet-to-be-defined features - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. There are remaining issues in that KVM doesn't handle size prefix overrides for 64-bit guests - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's architectural but sketchy behavior of generating #GP for "unsupported" addresses) - Cache all used vmcb12 fields to further harden against TOCTOU bugs x86 (Intel): - Drop obsolete branch hint prefixes from the VMX instruction macros - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a register input when appropriate - Code cleanups guest_memfd: - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support reclaim, the memory is unevictable, and there is no storage to write back to LoongArch selftests: - Add KVM PMU test cases s390 selftests: - Enable more memory selftests x86 selftests: - Add support for Hygon CPUs in KVM selftests - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits) KVM: x86: use inlines instead of macros for is_sev_*guest x86/virt: Treat SVM as unsupported when running as an SEV+ guest KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper KVM: SEV: use mutex guard in snp_handle_guest_req() KVM: SEV: use mutex guard in sev_mem_enc_unregister_region() KVM: SEV: use mutex guard in sev_mem_enc_ioctl() KVM: SEV: use mutex guard in snp_launch_update() KVM: SEV: Assert that kvm->lock is held when querying SEV+ support KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe" KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y KVM: SEV: WARN on unhandled VM type when initializing VM KVM: LoongArch: selftests: Add PMU overflow interrupt test KVM: LoongArch: selftests: Add basic PMU event counting test KVM: LoongArch: selftests: Add cpucfg read/write helpers LoongArch: KVM: Add DMSINTC inject msi to vCPU LoongArch: KVM: Add DMSINTC device support LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch ...	2026-04-17 07:18:03 -07:00
Linus Torvalds	f21f7b5162	Update to the VDSO subsystem: - Make the handling of compat functions consistent and more robust - Rework the underlying data store so that it is dynamically allocated, which allows the conversion of the last holdout SPARC64 to the generic VDSO implementation - Rework the SPARC64 VDSO to utilize the generic implementation - Mop up the left overs of the non-generic VDSO support in the core code. - Expand the VDSO selftest and make them more robust - Allow time namespaces to be enabled independently of the generic VDSO support, which was not possible before due to SPARC64 not using it. - Various cleanups and improvements in the related code. -----BEGIN PGP SIGNATURE----- iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmnb0v8QHHRnbHhAa2Vy bmVsLm9yZwAKCRCmGPVMDXSYocfqD/9ywgnvwRH6B612mY4PI3qCbLHs6n9f78aH YwyXmmfBZ5vt1ZtptHD+BAxiIMm9GC+/exdj5zhcOWucnBVhorcloE6evxhkJAMn RhTQFKkEmcA/UV2Yfct9r+33kgZRyu4IIul4J7hgn2o5T1BqwZbOil0W/O5adr5P MDLxjT1OLV80ZZWI9qbWcR/aR7W7sHcdwfVPPqjhombRY7f391Mo3dZeM5C2y55x 8TXCEqVpN1RJzFinWEgQN7QpP4OmF0rRuXSrDQpkH6pk/+RSqNlT/QGG7MJtmCQR E6CeBjNRUn318KiroaGyTKlM9xsL3gNoiCY24ZTwzZxx3g5gSAR3KTCTJhQU0hpu Svxj+ksqEAyW7fAOIsbce6W8fUPKC2KM+juXgPKcqZ5hjE2fALD+eEYMlq00jSiu sj71007cM9tZKOXPdWs3Fv7AY2Yj7iiRiRz9gv1wqS1z7ybxiaFjxjLYYakej0tr rmwBDEGhNow7msZZttr01BRZk9hDUWfIiJtL+0BrgRLNzst2A7WoagtZ2s0Z7Psl RjtWgYNBDJ878xK0J+Djqb9TyLraGWZShIIna9uYCAJX9i954xfKJ//NOnUkZhcl jslDLHhdttyJ+TmgIsc1ntUGvYvHqH5ywQpyDfWepMKyIYdaJLHOr2K6bwFnGHdw uocXvLrkXw== =8ixX -----END PGP SIGNATURE----- Merge tag 'timers-vdso-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull vdso updates from Thomas Gleixner: - Make the handling of compat functions consistent and more robust - Rework the underlying data store so that it is dynamically allocated, which allows the conversion of the last holdout SPARC64 to the generic VDSO implementation - Rework the SPARC64 VDSO to utilize the generic implementation - Mop up the left overs of the non-generic VDSO support in the core code - Expand the VDSO selftest and make them more robust - Allow time namespaces to be enabled independently of the generic VDSO support, which was not possible before due to SPARC64 not using it - Various cleanups and improvements in the related code * tag 'timers-vdso-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) timens: Use task_lock guard in timens_get*() timens: Use mutex guard in proc_timens_set_offset() timens: Simplify some calls to put_time_ns() timens: Add a __free() wrapper for put_time_ns() timens: Remove dependency on the vDSO vdso/timens: Move functions to new file selftests: vDSO: vdso_test_correctness: Add a test for time() selftests: vDSO: vdso_test_correctness: Use facilities from parse_vdso.c selftests: vDSO: vdso_test_correctness: Handle different tv_usec types selftests: vDSO: vdso_test_correctness: Drop SYS_getcpu fallbacks selftests: vDSO: vdso_test_gettimeofday: Remove nolibc checks Revert "selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers" random: vDSO: Remove ifdeffery random: vDSO: Trim vDSO includes vdso/datapage: Trim down unnecessary includes vdso/datapage: Remove inclusion of gettimeofday.h vdso/helpers: Explicitly include vdso/processor.h vdso/gettimeofday: Add explicit includes random: vDSO: Add explicit includes MIPS: vdso: Explicitly include asm/vdso/vdso.h ...	2026-04-14 10:53:44 -07:00
Linus Torvalds	c0ecb2a9ee	Updates for the interrupt chip driver subsystem: - A large refactoring for the Renesas RZV2H driver to add new interrupt types cleanly. - A large refactoring for the Renesas RZG2L driver to add support the new RZ/G3L variant. - Add support for the new NXP S32N79 chip in the IMX irq-steer driver. - Add support for the Apple AICv3 variant - Enhance the Loongson PCH LPC driver so it can be used on MIPS with device tree firmware - Allow the PIC32 EVIC driver to be built independent of MIPS in compile tests. - The usual small fixes and enhancements all over the place -----BEGIN PGP SIGNATURE----- iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmnbuaAQHHRnbHhAa2Vy bmVsLm9yZwAKCRCmGPVMDXSYoQc6D/9i6QTUNw4ZqpZSjJvrxX2mv49ej3FkWsQf 1P59ej9A0/w1+0A8SyAFd6ExuvO3am9k64nhpsFlwv0lMTu57va3Oj14E28fjN+M H1XWJYCLX3p7MWGsGFCmbHIeJJQnwCqyiZcs3DG0BMA5iYglJEcOijB+h+FCAwCN jjiBW4bbqPxPcpT90uQZhwwa3eUvkrkwzZEedKkbtvgPy3jdmKvrNCAte9GJSQu3 CvLgAB1Zraq1yIDFQN/km5NByan7pVmbGuv10K/jqirqmjplM2H0X+fwXlWz4M0Q uo33xDtJvtOegnySiJ6EimY+GTHDiloZpn2nqqUnHRIR0hxYjYudukSRorxbjXWv c93CGk1/Mq9WHqCsWvX5xe75Ttuf5xsfej8DETYvP+cTkGV/Kk9EUNmDUoYnGkiv UZyZBUMWejJLjnNmMpouUwW9Vc/08OLAtmqJgQukl2cbh/ujcmGC0TSA2SpBKlJT FpGU6ZsHWxpm1UHVv+9Uhx5dArRVNeTvjNgLFtu+ZkGM+C2oBs/MKHpu44mBfAf8 Ex4sFzCslKXbYo6TwO48w9VIMQJ6DDPrwza1YN6hRn4IwBkRXl3SQsT3pJxG1WMg NcP4dMRfe68o9y4ywxnoomH+qlux4j/tq3J7PqRY5s3WzWe8jXqfiQGNE1f10rop gGZe5f+/Mw== =kvZ2 -----END PGP SIGNATURE----- Merge tag 'irq-drivers-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull interrupt chip driver updates from Thomas Gleixner: - A large refactoring for the Renesas RZV2H driver to add new interrupt types cleanly - A large refactoring for the Renesas RZG2L driver to add support the new RZ/G3L variant - Add support for the new NXP S32N79 chip in the IMX irq-steer driver - Add support for the Apple AICv3 variant - Enhance the Loongson PCH LPC driver so it can be used on MIPS with device tree firmware - Allow the PIC32 EVIC driver to be built independent of MIPS in compile tests - The usual small fixes and enhancements all over the place * tag 'irq-drivers-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) irqchip/irq-pic32-evic: Add __maybe_unused for board_bind_eic_interrupt in COMPILE_TEST irqchip/renesas-rzv2h: Kill icu_err string irqchip/renesas-rzv2h: Kill swint_names[] irqchip/renesas-rzv2h: Kill swint_idx[] irqchip/renesas-rzg2l: Add NMI support irqchip/renesas-rzg2l: Clear the shared interrupt bit in rzg2l_irqc_free() irqchip/renesas-rzg2l: Replace raw_spin_{lock,unlock} with guard() in rzg2l_irq_set_type() irqchip/gic-v3: Print a warning for out-of-range interrupt numbers irqchip/renesas-rzg2l: Add shared interrupt support irqchip/renesas-rzg2l: Add RZ/G3L support irqchip/renesas-rzg2l: Drop IRQC_IRQ_COUNT macro irqchip/renesas-rzg2l: Drop IRQC_TINT_START macro irqchip/renesas-rzg2l: Drop IRQC_NUM_IRQ macro irqchip/renesas-rzg2l: Dynamically allocate fwspec array irqchip/renesas-rzg2l: Split rzfive_irqc_{mask,unmask} into separate IRQ and TINT handlers irqchip/renesas-rzg2l: Split rzfive_tint_irq_endisable() into separate IRQ and TINT helpers irqchip/renesas-rzg2l: Replace rzg2l_irqc_irq_{enable,disable} with TINT-specific handlers irqchip/renesas-rzg2l: Split set_type handler into separate IRQ and TINT functions irqchip/renesas-rzg2l: Split EOI handler into separate IRQ and TINT functions irqchip/renesas-rzg2l: Replace single irq_chip with per-region irq_chip instances ...	2026-04-14 10:18:10 -07:00
Linus Torvalds	2e31b16101	ACPI support updates for 7.1-rc1 - Update maintainers information regarding ACPICA (Rafael Wysocki) - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy() (Kees Cook) - Trigger an ordered system power off after encountering a fatal error operator in AML (Armin Wolf) - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao) - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure from the ACPI PPTT parser (Ben Horgan) - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate DeSimone) - Address multiple assorted issues and clean up the code in the ACPI processor idle driver (Huisong Li) - Replace strlcat() in the ACPI processor idle drive with a better alternative (Andy Shevchenko) - Rearrange and clean up acpi_processor_errata_piix4() (Rafael Wysocki) - Move reference performance to capabilities and fix an uninitialized variable in the ACPI CPPC library (Pengjie Zhang) - Add support for the Performance Limited Register to the ACPI CPPC library (Sumit Gupta) - Add cppc_get_perf() API to read performance controls, extend cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC library warn on missing mandatory DESIRED_PERF register (Sumit Gupta) - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in target callbacks to allow it to control performance bounds via standard scaling_min_freq and scaling_max_freq sysfs attributes and add sysfs documentation for the Performance Limited Register to it (Sumit Gupta) - Add ACPI support to the platform device interface in the CMOS RTC driver, make the ACPI core device enumeration code create a platform device for the CMOS RTC, and drop CMOS RTC PNP device support (Rafael Wysocki) - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD driver and clean up the CMOS RTC ACPI address space handler (Rafael Wysocki) - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT and allow that driver to work without a dedicated IRQ if the ACPI alarm is used (Rafael Wysocki) - Clean up the ACPI TAD driver in various ways and add an RTC class device interface, including both the RTC setting/reading and alarm timer support, to it (Rafael Wysocki) - Clean up the ACPI AC and ACPI PAD (processor aggregator device) drivers (Rafael Wysocki) - Rework checking for duplicate video bus devices and consolidate pnp.bus_id workarounds handling in the ACPI video bus driver (Rafael Wysocki) - Update the ACPI core device drivers to stop setting acpi_device_name() unnecessarily (Rafael Wysocki) - Rearrange code using acpi_device_class() in the ACPI core device drivers and update them to stop setting acpi_device_class() unnecessarily (Rafael Wysocki) - Define ACPI_AC_CLASS in one place (Rafael Wysocki) - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver to bind to platform devices instead of ACPI devices (Rafael Wysocki) - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng Feng) - Consolidate the interface for obtaining a CPU UID from ACPI across architectures and use it to address incorrect PCI TPH Steering Tag on ARM64 resulting from the invalid assumption that the ACPI Processor UID would always be the same as the corresponding logical CPU ID in Linux (Chengwen Feng) -----BEGIN PGP SIGNATURE----- iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmnY/bcSHHJqd0Byand5 c29ja2kubmV0AAoJEO5fvZ0v1OO1chAH/1cGRzh9lSgQ3ZdzIIA5rpRtwKC+CTNz iNDvQ97W73B2N+WYzMaloOh+ZVA1Vdqc+8921aH6HI+v7wtg/ZV3h/hU7TagHNY/ bRFDYaeRXVj4aBXNfoVdn7G5UU9j/kIDcV25I2ubOBqZaO6T5p8p1BK0j0vEj+sG yR7XwpEhr2OUQwlIFGKskJwFaH57QJXPEY8wf+o+lMEx/7o/JQRJzKFwsYu01ZZV kQy9Ee08P/rsNJwU2ibmZu5P3JMnhategAT8VAMBvkfLScv2sKX+1Vz19NGXzm71 ARaT7y8MSPNb7SAvWmNZ/rVYrYIL+D3a76Gd7MOGrbVWEn6oXIbCIhY= =6vEK -----END PGP SIGNATURE----- Merge tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI support updates from Rafael Wysocki: "These include an update of the CMOS RTC driver and the related ACPI and x86 code that, among other things, switches it over to using the platform device interface for device binding on x86 instead of the PNP device driver interface (which allows the code in question to be simplified quite a bit), a major update of the ACPI Time and Alarm Device (TAD) driver adding an RTC class device interface to it, and updates of core ACPI drivers that remove some unnecessary and not really useful code from them. Apart from that, two drivers are converted to using the platform driver interface for device binding instead of the ACPI driver one, which is slated for removal, support for the Performance Limited register is added to the ACPI CPPC library and there are some janitorial updates of it and the related cpufreq CPPC driver, the ACPI processor driver is fixed and cleaned up, and NVIDIA vendor CPER record handler is added to the APEI GHES code. Also, the interface for obtaining a CPU UID from ACPI is consolidated across architectures and used for fixing a problem with the PCI TPH Steering Tag on ARM64, there are two updates related to ACPICA, a minor ACPI OS Services Layer (OSL) update, and a few assorted updates related to ACPI tables parsing. Specifics: - Update maintainers information regarding ACPICA (Rafael Wysocki) - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy() (Kees Cook) - Trigger an ordered system power off after encountering a fatal error operator in AML (Armin Wolf) - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao) - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure from the ACPI PPTT parser (Ben Horgan) - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate DeSimone) - Address multiple assorted issues and clean up the code in the ACPI processor idle driver (Huisong Li) - Replace strlcat() in the ACPI processor idle drive with a better alternative (Andy Shevchenko) - Rearrange and clean up acpi_processor_errata_piix4() (Rafael Wysocki) - Move reference performance to capabilities and fix an uninitialized variable in the ACPI CPPC library (Pengjie Zhang) - Add support for the Performance Limited Register to the ACPI CPPC library (Sumit Gupta) - Add cppc_get_perf() API to read performance controls, extend cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC library warn on missing mandatory DESIRED_PERF register (Sumit Gupta) - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in target callbacks to allow it to control performance bounds via standard scaling_min_freq and scaling_max_freq sysfs attributes and add sysfs documentation for the Performance Limited Register to it (Sumit Gupta) - Add ACPI support to the platform device interface in the CMOS RTC driver, make the ACPI core device enumeration code create a platform device for the CMOS RTC, and drop CMOS RTC PNP device support (Rafael Wysocki) - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD driver and clean up the CMOS RTC ACPI address space handler (Rafael Wysocki) - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT and allow that driver to work without a dedicated IRQ if the ACPI alarm is used (Rafael Wysocki) - Clean up the ACPI TAD driver in various ways and add an RTC class device interface, including both the RTC setting/reading and alarm timer support, to it (Rafael Wysocki) - Clean up the ACPI AC and ACPI PAD (processor aggregator device) drivers (Rafael Wysocki) - Rework checking for duplicate video bus devices and consolidate pnp.bus_id workarounds handling in the ACPI video bus driver (Rafael Wysocki) - Update the ACPI core device drivers to stop setting acpi_device_name() unnecessarily (Rafael Wysocki) - Rearrange code using acpi_device_class() in the ACPI core device drivers and update them to stop setting acpi_device_class() unnecessarily (Rafael Wysocki) - Define ACPI_AC_CLASS in one place (Rafael Wysocki) - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver to bind to platform devices instead of ACPI devices (Rafael Wysocki) - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng Feng) - Consolidate the interface for obtaining a CPU UID from ACPI across architectures and use it to address incorrect PCI TPH Steering Tag on ARM64 resulting from the invalid assumption that the ACPI Processor UID would always be the same as the corresponding logical CPU ID in Linux (Chengwen Feng)" * tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits) ACPICA: Update maintainers information watchdog: ni903x_wdt: Convert to a platform driver ACPI: PAD: xen: Convert to a platform driver ACPI: processor: idle: Reset cpuidle on C-state list changes cpuidle: Extract and export no-lock variants of cpuidle_unregister_device() PCI/TPH: Pass ACPI Processor UID to Cache Locality _DSM ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu() perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu() ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval RISC-V: ACPI: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler PCI: hisi: Use devm_ghes_register_vendor_record_notifier() ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier() ACPI: tables: Enable FPDT on LoongArch ACPI: processor: idle: Fix NULL pointer dereference in hotplug path ACPI: processor: idle: Reset power_setup_done flag on initialization failure ACPI: TAD: Add alarm support to the RTC class device interface ...	2026-04-13 19:25:07 -07:00
Linus Torvalds	d568788baa	hardening updates for v7.1-rc1 - randomize_kstack: Improve implementation across arches (Ryan Roberts) - lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test - refcount: Remove unused __signed_wrap function annotations -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCad16PwAKCRA2KwveOeQk u7crAP4qz8gXCjes76KsZm/YQS8PtOG5JroAVu5Oa4ohw0RfaQD+K/XLow1plcNF 4Bi8zSuv2ifcLysh9qEAbx5+wcHijgo= =woB3 -----END PGP SIGNATURE----- Merge tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - randomize_kstack: Improve implementation across arches (Ryan Roberts) - lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test - refcount: Remove unused __signed_wrap function annotations * tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test refcount: Remove unused __signed_wrap function annotations randomize_kstack: Unify random source across arches randomize_kstack: Maintain kstack_offset per task	2026-04-13 17:52:29 -07:00
Bibo Mao	c43dce6f13	LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function vcpu_is_preempted() is performance sensitive that called in function osq_lock(), here make it as a macro. So that parameter is not parsed at most time, it can avoid cache line thrashing across numa nodes. Here is part of UnixBench result on Loongson-3C5000 DualWay machine with 32 cores and 2 numa nodes. original inline macro execl 7025.7 6991.2 7242.3 fstime 474.6 703.1 1071 From the test result, making vcpu_is_preempted() as a macro is the best, and there is some improvment compared with the original function method. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-04-09 18:56:36 +08:00
Chengwen Feng	d78ef9d2e1	LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval As a step towards unifying the interface for retrieving ACPI CPU UID across architectures, introduce a new function acpi_get_cpu_uid() for loongarch. While at it, add input validation to make the code more robust. Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260401081640.26875-3-fengchengwen@huawei.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2026-04-06 16:55:15 +02:00
Icenowy Zheng	dc30127cd0	LoongArch: Override arch_dynirq_lower_bound to reserve LPC IRQs Loongson 7A PCH chips all contain a LPC controller, which is used in some devices to connect legacy ISA devices (e.g. 8259 PS/2 controller). The LPC irqchip driver will register LPC interrupts at the fixed range 0~15, and the PCH PIC irqchip driver uses dynamic allocation. However the LPC interrupt numbers are currently not exempted from dynamic allocation. The current setup work by accident because the LPC interrupt controller is the first consumer of PIC interrupt controller, and the PIC interrupt number is allocated after LPC interrupts are registered. Such setup is fragile and will stop to work when the LPC irqchip driver is reworked. Override arch_dynirq_lower_bound() to reserve LPC interrupts from dynamic allocation, to prevent interrupt number collision and allow rework of the LPC irqchip driver. Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://patch.msgid.link/20260321092032.3502701-3-zhengxingda@iscas.ac.cn	2026-03-26 16:15:03 +01:00
Xi Ruoyao	e4878c37f6	LoongArch: vDSO: Emit GNU_EH_FRAME correctly With -fno-asynchronous-unwind-tables and --no-eh-frame-hdr (the default of the linker), the GNU_EH_FRAME segment (specified by vdso.lds.S) is empty. This is not valid, as the current DWARF specification mandates the first byte of the EH frame to be the version number 1. It causes some unwinders to complain, for example the ClickHouse query profiler spams the log with messages: clickhouse-server[365854]: libunwind: unsupported .eh_frame_hdr version: 127 at 7ffffffb0000 Here "127" is just the byte located at the p_vaddr (0, i.e. the beginning of the vDSO) of the empty GNU_EH_FRAME segment. Cross- checking with /proc/365854/maps has also proven 7ffffffb0000 is the start of vDSO in the process VM image. In LoongArch the -fno-asynchronous-unwind-tables option seems just a MIPS legacy, and MIPS only uses this option to satisfy the MIPS-specific "genvdso" program, per the commit `cfd75c2db1` ("MIPS: VDSO: Explicitly use -fno-asynchronous-unwind-tables"). IIRC it indicates some inherent limitation of the MIPS ELF ABI and has nothing to do with LoongArch. So we can simply flip it over to -fasynchronous-unwind-tables and pass --eh-frame-hdr for linking the vDSO, allowing the profilers to unwind the stack for statistics even if the sample point is taken when the PC is in the vDSO. However simply adjusting the options above would exploit an issue: when the libgcc unwinder saw the invalid GNU_EH_FRAME segment, it silently falled back to a machine-specific routine to match the code pattern of rt_sigreturn() and extract the registers saved in the sigframe if the code pattern is matched. As unwinding from signal handlers is vital for libgcc to support pthread cancellation etc., the fall-back routine had been silently keeping the LoongArch Linux systems functioning since Linux 5.19. But when we start to emit GNU_EH_FRAME with the correct format, fall-back routine will no longer be used and libgcc will fail to unwind the sigframe, and unwinding from signal handlers will no longer work, causing dozens of glibc test failures. To make it possible to unwind from signal handlers again, it's necessary to code the unwind info in __vdso_rt_sigreturn via .cfi_* directives. The offsets in the .cfi_* directives depend on the layout of struct sigframe, notably the offset of sigcontext in the sigframe. To use the offset in the assembly file, factor out struct sigframe into a header to allow asm-offsets.c to output the offset for assembly. To work around a long-term issue in the libgcc unwinder (the pc is unconditionally substracted by 1: doing so is technically incorrect for a signal frame), a nop instruction is included with the two real instructions in __vdso_rt_sigreturn in the same FDE PC range. The same hack has been used on x86 for a long time. Cc: stable@vger.kernel.org Fixes: `c6b99bed6b` ("LoongArch: Add VDSO and VSYSCALL support") Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-03-26 14:29:09 +08:00
Li Jun	3a28daa9b7	LoongArch: Fix missing NULL checks for kstrdup() 1. Replace "of_find_node_by_path("/")" with "of_root" to avoid multiple calls to "of_node_put()". 2. Fix a potential kernel oops during early boot when memory allocation fails while parsing CPU model from device tree. Cc: stable@vger.kernel.org Signed-off-by: Li Jun <lijun01@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-03-26 14:29:08 +08:00
Ryan Roberts	a96ef5848c	randomize_kstack: Unify random source across arches Previously different architectures were using random sources of differing strength and cost to decide the random kstack offset. A number of architectures (loongarch, powerpc, s390, x86) were using their timestamp counter, at whatever the frequency happened to be. Other arches (arm64, riscv) were using entropy from the crng via get_random_u16(). There have been concerns that in some cases the timestamp counters may be too weak, because they can be easily guessed or influenced by user space. And get_random_u16() has been shown to be too costly for the level of protection kstack offset randomization provides. So let's use a common, architecture-agnostic source of entropy; a per-cpu prng, seeded at boot-time from the crng. This has a few benefits: - We can remove choose_random_kstack_offset(); That was only there to try to make the timestamp counter value a bit harder to influence from user space []. - The architecture code is simplified. All it has to do now is call add_random_kstack_offset() in the syscall path. - The strength of the randomness can be reasoned about independently of the architecture. - Arches previously using get_random_u16() now have much faster syscall paths, see below results. [] Additionally, this gets rid of some redundant work on s390 and x86. Before this patch, those architectures called choose_random_kstack_offset() under arch_exit_to_user_mode_prepare(), which is also called for exception returns to userspace which were not syscalls (e.g. regular interrupts). Getting rid of choose_random_kstack_offset() avoids a small amount of redundant work for the non-syscall cases. In some configurations, add_random_kstack_offset() will now call instrumentable code, so for a couple of arches, I have moved the call a bit later to the first point where instrumentation is allowed. This doesn't impact the efficacy of the mechanism. There have been some claims that a prng may be less strong than the timestamp counter if not regularly reseeded. But the prng has a period of about 2^113. So as long as the prng state remains secret, it should not be possible to guess. If the prng state can be accessed, we have bigger problems. Additionally, we are only consuming 6 bits to randomize the stack, so there are only 64 possible random offsets. I assert that it would be trivial for an attacker to brute force by repeating their attack and waiting for the random stack offset to be the desired one. The prng approach seems entirely proportional to this level of protection. Performance data are provided below. The baseline is v6.18 with rndstack on for each respective arch. (I)/(R) indicate statistically significant improvement/regression. arm64 platform is AWS Graviton3 (m7g.metal). x86_64 platform is AWS Sapphire Rapids (m7i.24xlarge): +-----------------+--------------+---------------+---------------+ \| Benchmark \| Result Class \| per-cpu-prng \| per-cpu-prng \| \| \| \| arm64 (metal) \| x86_64 (VM) \| +=================+==============+===============+===============+ \| syscall/getpid \| mean (ns) \| (I) -9.50% \| (I) -17.65% \| \| \| p99 (ns) \| (I) -59.24% \| (I) -24.41% \| \| \| p99.9 (ns) \| (I) -59.52% \| (I) -28.52% \| +-----------------+--------------+---------------+---------------+ \| syscall/getppid \| mean (ns) \| (I) -9.52% \| (I) -19.24% \| \| \| p99 (ns) \| (I) -59.25% \| (I) -25.03% \| \| \| p99.9 (ns) \| (I) -59.50% \| (I) -28.17% \| +-----------------+--------------+---------------+---------------+ \| syscall/invalid \| mean (ns) \| (I) -10.31% \| (I) -18.56% \| \| \| p99 (ns) \| (I) -60.79% \| (I) -20.06% \| \| \| p99.9 (ns) \| (I) -61.04% \| (I) -25.04% \| +-----------------+--------------+---------------+---------------+ I tested an earlier version of this change on x86 bare metal and it showed a smaller but still significant improvement. The bare metal system wasn't available this time around so testing was done in a VM instance. I'm guessing the cost of rdtsc is higher for VMs. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://patch.msgid.link/20260303150840.3789438-3-ryan.roberts@arm.com Signed-off-by: Kees Cook <kees@kernel.org>	2026-03-24 21:12:03 -07:00
Tiezhu Yang	d3b8491961	LoongArch: No need to flush icache if text copy failed If copy_to_kernel_nofault() failed, no need to flush icache and just return immediately. Cc: stable@vger.kernel.org Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-03-16 10:36:01 +08:00
Tiezhu Yang	431ce839da	LoongArch: Check return values for set_memory_{rw,rox} set_memory_rw() and set_memory_rox() may fail, so we should check the return values and return immediately in larch_insn_text_copy(). Cc: stable@vger.kernel.org Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-03-16 10:36:01 +08:00
Xi Ruoyao	8a69d02481	LoongArch: Fix calling smp_processor_id() in preemptible code Fix the warning: BUG: using smp_processor_id() in preemptible [00000000] code: systemd/1 caller is larch_insn_text_copy+0x40/0xf0 Simply changing it to raw_smp_processor_id() is not enough: if preempt and CPU hotplug happens after raw_smp_processor_id() but before calling stop_machine(), the CPU where raw_smp_processor_id() has run may become offline when stop_machine() and no CPU will run copy_to_kernel_nofault() in text_copy_cb(). Thus guard the larch_insn_text_copy() calls with cpus_read_lock() and change stop_machine() to stop_machine_cpuslocked() to prevent this. I've considered moving the locks inside larch_insn_text_copy() but doing so seems not an easy hack. In bpf_arch_text_poke() obviously the memcpy() call must be guarded by text_mutex, so we have to leave the acquire of text_mutex out of larch_insn_text_copy(). But in the entire kernel the acquire of mutexes is always after cpus_read_lock(), so we cannot put cpus_read_lock() into larch_insn_text_copy() while leaving the text_mutex acquire out (or we risk a deadlock due to inconsistent lock acquire order). So let's fix the bug first and leave the posssible refactor as future work. Fixes: `9fbd18cf4c` ("LoongArch: BPF: Add dynamic code modification support") Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-03-16 10:36:01 +08:00
Thomas Weißschuh	55434071cd	LoongArch: vDSO: Explicitly include asm/vdso/vdso.h The usage of 'struct old_timespec32' requires asm/vdso/vdso.h. Currently this header is included transitively, but that transitive inclusion is about to go away. Explicitly include the header. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260227-vdso-header-cleanups-v2-6-35d60acf7410@linutronix.de	2026-03-11 15:12:41 +01:00
Nathan Chancellor	8678591b47	kbuild: Split .modinfo out from ELF_DETAILS Commit `3e86e4d74c` ("kbuild: keep .modinfo section in vmlinux.unstripped") added .modinfo to ELF_DETAILS while removing it from COMMON_DISCARDS, as it was needed in vmlinux.unstripped and ELF_DETAILS was present in all architecture specific vmlinux linker scripts. While this shuffle is fine for vmlinux, ELF_DETAILS and COMMON_DISCARDS may be used by other linker scripts, such as the s390 and x86 compressed boot images, which may not expect to have a .modinfo section. In certain circumstances, this could result in a bootloader failing to load the compressed kernel [1]. Commit `ddc6cbef3e` ("s390/boot/vmlinux.lds.S: Ensure bzImage ends with SecureBoot trailer") recently addressed this for the s390 bzImage but the same bug remains for arm, parisc, and x86. The presence of .modinfo in the x86 bzImage was the root cause of the issue worked around with commit `d50f210913` ("kbuild: align modinfo section for Secureboot Authenticode EDK2 compat"). misc.c in arch/x86/boot/compressed includes lib/decompress_unzstd.c, which in turn includes lib/xxhash.c and its MODULE_LICENSE / MODULE_DESCRIPTION macros due to the STATIC definition. Split .modinfo out from ELF_DETAILS into its own macro and handle it in all vmlinux linker scripts. Discard .modinfo in the places where it was previously being discarded from being in COMMON_DISCARDS, as it has never been necessary in those uses. Cc: stable@vger.kernel.org Fixes: `3e86e4d74c` ("kbuild: keep .modinfo section in vmlinux.unstripped") Reported-by: Ed W <lists@wildgooses.com> Closes: https://lore.kernel.org/587f25e0-a80e-46a5-9f01-87cb40cfa377@wildgooses.com/ [1] Tested-by: Ed W <lists@wildgooses.com> # x86_64 Link: https://patch.msgid.link/20260225-separate-modinfo-from-elf-details-v1-1-387ced6baf4b@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>	2026-02-26 11:50:19 -07:00
Linus Torvalds	32a92f8c89	Convert more 'alloc_obj' cases to default GFP_KERNEL arguments This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 20:03:00 -08:00
Linus Torvalds	323bbfcf1e	Convert 'alloc_flex' family to use the new default GFP_KERNEL argument This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Linus Torvalds	64275e9fda	LoongArch changes for v7.0 1, Select HAVE_CMPXCHG_{LOCAL,DOUBLE}; 2, Add 128-bit atomic cmpxchg support; 3, Add HOTPLUG_SMT implementation; 4, Wire up memfd_secret system call; 5, Fix boot errors and unwind errors for KASAN; 6, Use BPF prog pack allocator and add BPF arena support; 7, Update dts files to add nand controllers; 8, Some bug fixes and other small changes. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmmMTnsWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImeo0AEACFniyK/cbaBchYAONJb5TxXcW6 7pvFEAbNrTzvQ8TTGpt+EBsOZlqE+y/afB/NlR06Aow8ifvUnOxJu9Ur1afo2r6A syB3Y7OsuUd8nxsATgrfJrNZnqq30dCJWxnBlP+YCCHQ2FFjLHIGcheRNM7rTrzd LvGCnBwHSKmKv5wGxsDJufYxbHgeb4YvrwZiNJC0ELRM9VqMSCogkIlayJrfC26S Or89+6i2XLC3K+Rrd1MgPp2HX6W9utzhB7kSmro0piUyX6F5UtL1YGHC9t1hamIZ yuTStXOZA2bYQPwEmXNNVucX8UfmPOeUQgl0P0n8XG09RGq0uNKFhfkSy9d+lxUl 2jftUZGujgV3/RsehrsKcto1ZBwwd2FyKL7uLWucuop+XJvrqIus/hsR+M2FI9IY 6sngOJZkKWfxMECTL7+FAMOGuxnghRk0VBZRJ8PqHTU/9YkKLQf0iyYqmvl+wOgu ByJmEapmVdrdGG78zUHsMDAqUFo518ixABhExWuqwEE2/zSj2jQIliIAcHRSJkvT ZOW1CZBX54AuFfRvjelYucSz1Q89lHC7U9WjYkte8Rv4tyPOYnTUmg3ouBPm5W2+ MuPVt1Y4rJN8RnD+1sSHJa4laMo7gZN2Cr4LsELc0mURxOfRK/hU1bjA9JM4mv2X 2L69IvQDbaG2H61qBQ== =WKfm -----END PGP SIGNATURE----- Merge tag 'loongarch-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Select HAVE_CMPXCHG_{LOCAL,DOUBLE} - Add 128-bit atomic cmpxchg support - Add HOTPLUG_SMT implementation - Wire up memfd_secret system call - Fix boot errors and unwind errors for KASAN - Use BPF prog pack allocator and add BPF arena support - Update dts files to add nand controllers - Some bug fixes and other small changes * tag 'loongarch-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: dts: loongson-2k1000: Add nand controller support LoongArch: dts: loongson-2k0500: Add nand controller support LoongArch: BPF: Implement bpf_addr_space_cast instruction LoongArch: BPF: Implement PROBE_MEM32 pseudo instructions LoongArch: BPF: Use BPF prog pack allocator LoongArch: Use IS_ERR_PCPU() macro for KGDB LoongArch: Rework KASAN initialization for PTW-enabled systems LoongArch: Disable instrumentation for setup_ptwalker() LoongArch: Remove some extern variables in source files LoongArch: Guard percpu handler under !CONFIG_PREEMPT_RT LoongArch: Handle percpu handler address for ORC unwinder LoongArch: Use %px to print unmodified unwinding address LoongArch: Prefer top-down allocation after arch_mem_init() LoongArch: Add HOTPLUG_SMT implementation LoongArch: Make cpumask_of_node() robust against NUMA_NO_NODE LoongArch: Wire up memfd_secret system call LoongArch: Replace seq_printf() with seq_puts() for simple strings LoongArch: Add 128-bit atomic cmpxchg support LoongArch: Add detection for SC.Q support LoongArch: Select HAVE_CMPXCHG_LOCAL in Kconfig	2026-02-14 12:47:15 -08:00
Linus Torvalds	cb5573868e	Loongarch: - Add more CPUCFG mask bits. - Improve feature detection. - Add lazy load support for FPU and binary translation (LBT) register state. - Fix return value for memory reads from and writes to in-kernel devices. - Add support for detecting preemption from within a guest. - Add KVM steal time test case to tools/selftests. ARM: - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception. - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process. - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers. - More pKVM fixes for features that are not supposed to be exposed to guests. - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor. - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding. - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest. - Preliminary work for guest GICv5 support. - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures. - A small set of FPSIMD cleanups. - Selftest fixes addressing the incorrect alignment of page allocation. - Other assorted low-impact fixes and spelling fixes. RISC-V: - Fixes for issues discoverd by KVM API fuzzing in kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and kvm_riscv_vcpu_aia_imsic_update() - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM - Transparent huge page support for hypervisor page tables - Adjust the number of available guest irq files based on MMIO register sizes found in the device tree or the ACPI tables - Add RISC-V specific paging modes to KVM selftests - Detect paging mode at runtime for selftests s390: - Performance improvement for vSIE (aka nested virtualization) - Completely new memory management. s390 was a special snowflake that enlisted help from the architecture's page table management to build hypervisor page tables, in particular enabling sharing the last level of page tables. This however was a lot of code (~3K lines) in order to support KVM, and also blocked several features. The biggest advantages is that the page size of userspace is completely independent of the page size used by the guest: userspace can mix normal pages, THPs and hugetlbfs as it sees fit, and in fact transparent hugepages were not possible before. It's also now possible to have nested guests and guests with huge pages running on the same host. - Maintainership change for s390 vfio-pci - Small quality of life improvement for protected guests x86: - Add support for giving the guest full ownership of PMU hardware (contexted switched around the fastpath run loop) and allowing direct access to data MSRs and PMCs (restricted by the vPMU model). KVM still intercepts access to control registers, e.g. to enforce event filtering and to prevent the guest from profiling sensitive host state. This is more accurate, since it has no risk of contention and thus dropped events, and also has significantly less overhead. For more information, see the commit message for merge commit `bf2c3138ae` ("Merge tag 'kvm-x86-pmu-6.20' of https://github.com/kvm-x86/linux into HEAD"). - Disallow changing the virtual CPU model if L2 is active, for all the same reasons KVM disallows change the model after the first KVM_RUN. - Fix a bug where KVM would incorrectly reject host accesses to PV MSRs when running with KVM_CAP_ENFORCE_PV_FEATURE_CPUID enabled, even if those were advertised as supported to userspace, - Fix a bug with protected guest state (SEV-ES/SNP and TDX) VMs, where KVM would attempt to read CR3 configuring an async #PF entry. - Fail the build if EXPORT_SYMBOL_GPL or EXPORT_SYMBOL is used in KVM (for x86 only) to enforce usage of EXPORT_SYMBOL_FOR_KVM_INTERNAL. Only a few exports that are intended for external usage, and those are allowed explicitly. - When checking nested events after a vCPU is unblocked, ignore -EBUSY instead of WARNing. Userspace can sometimes put the vCPU into what should be an impossible state, and spurious exit to userspace on -EBUSY does not really do anything to solve the issue. - Also throw in the towel and drop the WARN on INIT/SIPI being blocked when vCPU is in Wait-For-SIPI, which also resulted in playing whack-a-mole with syzkaller stuffing architecturally impossible states into KVM. - Add support for new Intel instructions that don't require anything beyond enumerating feature flags to userspace. - Grab SRCU when reading PDPTRs in KVM_GET_SREGS2. - Add WARNs to guard against modifying KVM's CPU caps outside of the intended setup flow, as nested VMX in particular is sensitive to unexpected changes in KVM's golden configuration. - Add a quirk to allow userspace to opt-in to actually suppress EOI broadcasts when the suppression feature is enabled by the guest (currently limited to split IRQCHIP, i.e. userspace I/O APIC). Sadly, simply fixing KVM to honor Suppress EOI Broadcasts isn't an option as some userspaces have come to rely on KVM's buggy behavior (KVM advertises Supress EOI Broadcast irrespective of whether or not userspace I/O APIC supports Directed EOIs). - Clean up KVM's handling of marking mapped vCPU pages dirty. - Drop a pile of ancient sanity checks hidden behind in KVM's unused ASSERT() macro, most of which could be trivially triggered by the guest and/or user, and all of which were useless. - Fold "struct dest_map" into its sole user, "struct rtc_status", to make it more obvious what the weird parameter is used for, and to allow fropping these RTC shenanigans if CONFIG_KVM_IOAPIC=n. - Bury all of ioapic.h, i8254.h and related ioctls (including KVM_CREATE_IRQCHIP) behind CONFIG_KVM_IOAPIC=y. - Add a regression test for recent APICv update fixes. - Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv() to consolidate the updates, and to co-locate SVI updates with the updates for KVM's own cache of ISR information. - Drop a dead function declaration. - Minor cleanups. x86 (Intel): - Rework KVM's handling of VMCS updates while L2 is active to temporarily switch to vmcs01 instead of deferring the update until the next nested VM-Exit. The deferred updates approach directly contributed to several bugs, was proving to be a maintenance burden due to the difficulty in auditing the correctness of deferred updates, and was polluting "struct nested_vmx" with a growing pile of booleans. - Fix an SGX bug where KVM would incorrectly try to handle EPCM page faults, and instead always reflect them into the guest. Since KVM doesn't shadow EPCM entries, EPCM violations cannot be due to KVM interference and can't be resolved by KVM. - Fix a bug where KVM would register its posted interrupt wakeup handler even if loading kvm-intel.ko ultimately failed. - Disallow access to vmcb12 fields that aren't fully supported, mostly to avoid weirdness and complexity for FRED and other features, where KVM wants enable VMCS shadowing for fields that conditionally exist. - Print out the "bad" offsets and values if kvm-intel.ko refuses to load (or refuses to online a CPU) due to a VMCS config mismatch. x86 (AMD): - Drop a user-triggerable WARN on nested_svm_load_cr3() failure. - Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS relies on an upcoming, publicly announced change in the APM to reduce the set of conditions where hardware (i.e. KVM) must flush the RAP. - Ignore nSVM intercepts for instructions that are not supported according to L1's virtual CPU model. - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath for EPT Misconfig. - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM, and allow userspace to restore nested state with GIF=0. - Treat exit_code as an unsigned 64-bit value through all of KVM. - Add support for fetching SNP certificates from userspace. - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD or VMSAVE on behalf of L2. - Misc fixes and cleanups. x86 selftests: - Add a regression test for TPR<=>CR8 synchronization and IRQ masking. - Overhaul selftest's MMU infrastructure to genericize stage-2 MMU support, and extend x86's infrastructure to support EPT and NPT (for L2 guests). - Extend several nested VMX tests to also cover nested SVM. - Add a selftest for nested VMLOAD/VMSAVE. - Rework the nested dirty log test, originally added as a regression test for PML where KVM logged L2 GPAs instead of L1 GPAs, to improve test coverage and to hopefully make the test easier to understand and maintain. guest_memfd: - Remove kvm_gmem_populate()'s preparation tracking and half-baked hugepage handling. SEV/SNP was the only user of the tracking and it can do it via the RMP. - Retroactively document and enforce (for SNP) that KVM_SEV_SNP_LAUNCH_UPDATE and KVM_TDX_INIT_MEM_REGION require the source page to be 4KiB aligned, to avoid non-trivial complexity for something that no known VMM seems to be doing and to avoid an API special case for in-place conversion, which simply can't support unaligned sources. - When populating guest_memfd memory, GUP the source page in common code and pass the refcounted page to the vendor callback, instead of letting vendor code do the heavy lifting. Doing so avoids a looming deadlock bug with in-place due an AB-BA conflict betwee mmap_lock and guest_memfd's filemap invalidate lock. Generic: - Fix a bug where KVM would ignore the vCPU's selected address space when creating a vCPU-specific mapping of guest memory. Actually this bug could not be hit even on x86, the only architecture with multiple address spaces, but it's a bug nevertheless. -----BEGIN PGP SIGNATURE----- iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmmNqwwUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPaZAf/cJx5B67lnST272esz0j29MIuT/Ti jnf6PI9b7XubKYOtNvlu5ZW4Jsa5dqRG0qeO/JmcXDlwBf5/UkWOyvqIXyiuTl0l KcSUlKPtTgKZSoZpJpTppuuDE8FSYqEdcCmjNvoYzcJoPjmaeJbK6aqO0AkBbb6e L5InrLV7nV9iua6rFvA0s/G8/Eq2DG8M9hTRHe6NcI/z4hvslOudvpUXtC8Jygoo cV8vFavUwc+atrmvhAOLvSitnrjfNa4zcG6XMOlwXPfIdvi3zqTlQTgUpwGKiAGQ RIDUVZ/9bcWgJqbPRsdEWwaYRkNQWc5nmrAHRpEEaYV/NeBBNf4v6qfKSw== =SkJ1 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "Loongarch: - Add more CPUCFG mask bits - Improve feature detection - Add lazy load support for FPU and binary translation (LBT) register state - Fix return value for memory reads from and writes to in-kernel devices - Add support for detecting preemption from within a guest - Add KVM steal time test case to tools/selftests ARM: - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers - More pKVM fixes for features that are not supposed to be exposed to guests - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest - Preliminary work for guest GICv5 support - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures - A small set of FPSIMD cleanups - Selftest fixes addressing the incorrect alignment of page allocation - Other assorted low-impact fixes and spelling fixes RISC-V: - Fixes for issues discoverd by KVM API fuzzing in kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and kvm_riscv_vcpu_aia_imsic_update() - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM - Transparent huge page support for hypervisor page tables - Adjust the number of available guest irq files based on MMIO register sizes found in the device tree or the ACPI tables - Add RISC-V specific paging modes to KVM selftests - Detect paging mode at runtime for selftests s390: - Performance improvement for vSIE (aka nested virtualization) - Completely new memory management. s390 was a special snowflake that enlisted help from the architecture's page table management to build hypervisor page tables, in particular enabling sharing the last level of page tables. This however was a lot of code (~3K lines) in order to support KVM, and also blocked several features. The biggest advantages is that the page size of userspace is completely independent of the page size used by the guest: userspace can mix normal pages, THPs and hugetlbfs as it sees fit, and in fact transparent hugepages were not possible before. It's also now possible to have nested guests and guests with huge pages running on the same host - Maintainership change for s390 vfio-pci - Small quality of life improvement for protected guests x86: - Add support for giving the guest full ownership of PMU hardware (contexted switched around the fastpath run loop) and allowing direct access to data MSRs and PMCs (restricted by the vPMU model). KVM still intercepts access to control registers, e.g. to enforce event filtering and to prevent the guest from profiling sensitive host state. This is more accurate, since it has no risk of contention and thus dropped events, and also has significantly less overhead. For more information, see the commit message for merge commit `bf2c3138ae` ("Merge tag 'kvm-x86-pmu-6.20' ...") - Disallow changing the virtual CPU model if L2 is active, for all the same reasons KVM disallows change the model after the first KVM_RUN - Fix a bug where KVM would incorrectly reject host accesses to PV MSRs when running with KVM_CAP_ENFORCE_PV_FEATURE_CPUID enabled, even if those were advertised as supported to userspace, - Fix a bug with protected guest state (SEV-ES/SNP and TDX) VMs, where KVM would attempt to read CR3 configuring an async #PF entry - Fail the build if EXPORT_SYMBOL_GPL or EXPORT_SYMBOL is used in KVM (for x86 only) to enforce usage of EXPORT_SYMBOL_FOR_KVM_INTERNAL. Only a few exports that are intended for external usage, and those are allowed explicitly - When checking nested events after a vCPU is unblocked, ignore -EBUSY instead of WARNing. Userspace can sometimes put the vCPU into what should be an impossible state, and spurious exit to userspace on -EBUSY does not really do anything to solve the issue - Also throw in the towel and drop the WARN on INIT/SIPI being blocked when vCPU is in Wait-For-SIPI, which also resulted in playing whack-a-mole with syzkaller stuffing architecturally impossible states into KVM - Add support for new Intel instructions that don't require anything beyond enumerating feature flags to userspace - Grab SRCU when reading PDPTRs in KVM_GET_SREGS2 - Add WARNs to guard against modifying KVM's CPU caps outside of the intended setup flow, as nested VMX in particular is sensitive to unexpected changes in KVM's golden configuration - Add a quirk to allow userspace to opt-in to actually suppress EOI broadcasts when the suppression feature is enabled by the guest (currently limited to split IRQCHIP, i.e. userspace I/O APIC). Sadly, simply fixing KVM to honor Suppress EOI Broadcasts isn't an option as some userspaces have come to rely on KVM's buggy behavior (KVM advertises Supress EOI Broadcast irrespective of whether or not userspace I/O APIC supports Directed EOIs) - Clean up KVM's handling of marking mapped vCPU pages dirty - Drop a pile of ancient sanity checks hidden behind in KVM's unused ASSERT() macro, most of which could be trivially triggered by the guest and/or user, and all of which were useless - Fold "struct dest_map" into its sole user, "struct rtc_status", to make it more obvious what the weird parameter is used for, and to allow fropping these RTC shenanigans if CONFIG_KVM_IOAPIC=n - Bury all of ioapic.h, i8254.h and related ioctls (including KVM_CREATE_IRQCHIP) behind CONFIG_KVM_IOAPIC=y - Add a regression test for recent APICv update fixes - Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv() to consolidate the updates, and to co-locate SVI updates with the updates for KVM's own cache of ISR information - Drop a dead function declaration - Minor cleanups x86 (Intel): - Rework KVM's handling of VMCS updates while L2 is active to temporarily switch to vmcs01 instead of deferring the update until the next nested VM-Exit. The deferred updates approach directly contributed to several bugs, was proving to be a maintenance burden due to the difficulty in auditing the correctness of deferred updates, and was polluting "struct nested_vmx" with a growing pile of booleans - Fix an SGX bug where KVM would incorrectly try to handle EPCM page faults, and instead always reflect them into the guest. Since KVM doesn't shadow EPCM entries, EPCM violations cannot be due to KVM interference and can't be resolved by KVM - Fix a bug where KVM would register its posted interrupt wakeup handler even if loading kvm-intel.ko ultimately failed - Disallow access to vmcb12 fields that aren't fully supported, mostly to avoid weirdness and complexity for FRED and other features, where KVM wants enable VMCS shadowing for fields that conditionally exist - Print out the "bad" offsets and values if kvm-intel.ko refuses to load (or refuses to online a CPU) due to a VMCS config mismatch x86 (AMD): - Drop a user-triggerable WARN on nested_svm_load_cr3() failure - Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS relies on an upcoming, publicly announced change in the APM to reduce the set of conditions where hardware (i.e. KVM) must flush the RAP - Ignore nSVM intercepts for instructions that are not supported according to L1's virtual CPU model - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath for EPT Misconfig - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM, and allow userspace to restore nested state with GIF=0 - Treat exit_code as an unsigned 64-bit value through all of KVM - Add support for fetching SNP certificates from userspace - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD or VMSAVE on behalf of L2 - Misc fixes and cleanups x86 selftests: - Add a regression test for TPR<=>CR8 synchronization and IRQ masking - Overhaul selftest's MMU infrastructure to genericize stage-2 MMU support, and extend x86's infrastructure to support EPT and NPT (for L2 guests) - Extend several nested VMX tests to also cover nested SVM - Add a selftest for nested VMLOAD/VMSAVE - Rework the nested dirty log test, originally added as a regression test for PML where KVM logged L2 GPAs instead of L1 GPAs, to improve test coverage and to hopefully make the test easier to understand and maintain guest_memfd: - Remove kvm_gmem_populate()'s preparation tracking and half-baked hugepage handling. SEV/SNP was the only user of the tracking and it can do it via the RMP - Retroactively document and enforce (for SNP) that KVM_SEV_SNP_LAUNCH_UPDATE and KVM_TDX_INIT_MEM_REGION require the source page to be 4KiB aligned, to avoid non-trivial complexity for something that no known VMM seems to be doing and to avoid an API special case for in-place conversion, which simply can't support unaligned sources - When populating guest_memfd memory, GUP the source page in common code and pass the refcounted page to the vendor callback, instead of letting vendor code do the heavy lifting. Doing so avoids a looming deadlock bug with in-place due an AB-BA conflict betwee mmap_lock and guest_memfd's filemap invalidate lock Generic: - Fix a bug where KVM would ignore the vCPU's selected address space when creating a vCPU-specific mapping of guest memory. Actually this bug could not be hit even on x86, the only architecture with multiple address spaces, but it's a bug nevertheless" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (267 commits) KVM: s390: Increase permitted SE header size to 1 MiB MAINTAINERS: Replace backup for s390 vfio-pci KVM: s390: vsie: Fix race in acquire_gmap_shadow() KVM: s390: vsie: Fix race in walk_guest_tables() KVM: s390: Use guest address to mark guest page dirty irqchip/riscv-imsic: Adjust the number of available guest irq files RISC-V: KVM: Transparent huge page support RISC-V: KVM: selftests: Add Zalasr extensions to get-reg-list test RISC-V: KVM: Allow Zalasr extensions for Guest/VM KVM: riscv: selftests: Add riscv vm satp modes KVM: riscv: selftests: add Zilsd and Zclsd extension to get-reg-list test riscv: KVM: allow Zilsd and Zclsd extensions for Guest/VM RISC-V: KVM: Skip IMSIC update if vCPU IMSIC state is not initialized RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_rw_attr() RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_has_attr() RISC-V: KVM: Remove unnecessary 'ret' assignment KVM: s390: Add explicit padding to struct kvm_s390_keyop KVM: LoongArch: selftests: Add steal time test case LoongArch: KVM: Add paravirt vcpu_is_preempted() support in guest side LoongArch: KVM: Add paravirt preempt feature in hypervisor side ...	2026-02-13 11:31:15 -08:00
Linus Torvalds	4cff5c05e0	mm.git review status for linus..mm-stable Everything: Total patches: 325 Reviews/patch: 1.39 Reviewed rate: 72% Excluding DAMON: Total patches: 262 Reviews/patch: 1.63 Reviewed rate: 82% Excluding DAMON and zram: Total patches: 248 Reviews/patch: 1.72 Reviewed rate: 86% - The 14 patch series "powerpc/64s: do not re-activate batched TLB flush" from Alexander Gordeev makes arch_{enter\|leave}_lazy_mmu_mode() nest properly. It adds a generic enter/leave layer and switches architectures to use it. Various hacks were removed in the process. - The 7 patch series "zram: introduce compressed data writeback" from Richard Chang and Sergey Senozhatsky implements data compression for zram writeback. - The 8 patch series "mm: folio_zero_user: clear page ranges" from David Hildenbrand adds clearing of contiguous page ranges for hugepages. Large improvements during demand faulting are demonstrated. - The 2 patch series "memcg cleanups" from Chen Ridong tideis up some memcg code. - The 12 patch series "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos stats" from SeongJae Park improves DAMOS stat's provided information, deterministic control, and readability. - The 3 patch series "selftests/mm: hugetlb cgroup charging: robustness fixes" from Li Wang fixes a few issues in the hugetlb cgroup charging selftests. - The 5 patch series "Fix va_high_addr_switch.sh test failure - again" from Chunyu Hu addresses several issues in the va_high_addr_switch test. - The 5 patch series "mm/damon/tests/core-kunit: extend existing test scenarios" from Shu Anzai improves the KUnit test coverage for DAMON. - The 2 patch series "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" from Shivank Garg fixes a glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN. - The 29 patch series "arch, mm: consolidate hugetlb early reservation" from Mike Rapoport reworks and consolidates a pile of straggly code related to reservation of hugetlb memory from bootmem and creation of CMA areas for hugetlb. - The 9 patch series "mm: clean up anon_vma implementation" from Lorenzo Stoakes cleans up the anon_vma implementation in various ways. - The 3 patch series "tweaks for __alloc_pages_slowpath()" from Vlastimil Babka does a little streamlining of the page allocator's slowpath code. - The 8 patch series "memcg: separate private and public ID namespaces" from Shakeel Butt cleans up the memcg ID code and prevents the internal-only private IDs from being exposed to userspace. - The 6 patch series "mm: hugetlb: allocate frozen gigantic folio" from Kefeng Wang cleans up the allocation of frozen folios and avoids some atomic refcount operations. - The 11 patch series "mm/damon: advance DAMOS-based LRU sorting" from SeongJae Park improves DAMOS's movement of memory betewwn the active and inactive LRUs and adds auto-tuning of the ratio-based quotas and of monitoring intervals. - The 18 patch series "Support page table check on PowerPC" from Andrew Donnellan makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc. - The 3 patch series "nodemask: align nodes_and{,not} with underlying bitmap ops" from Yury Norov makes nodes_and() and nodes_andnot() propagate the return values from the underlying bit operations, enabling some cleanup in calling code. - The 5 patch series "mm/damon: hide kdamond and kdamond_lock from API callers" from SeongJae Park cleans up some DAMON internal interfaces. - The 4 patch series "mm/khugepaged: cleanups and scan limit fix" from Shivank Garg does some cleanup work in khupaged and fixes a scan limit accounting issue. - The 24 patch series "mm: balloon infrastructure cleanups" from David Hildenbrand goes to town on the balloon infrastructure and its page migration function. Mainly cleanups, also some locking simplification. - The 2 patch series "mm/vmscan: add tracepoint and reason for kswapd_failures reset" from Jiayuan Chen adds additional tracepoints to the page reclaim code. - The 3 patch series "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" from Marco Crivellari is part of Marco's kernel-wide migration from the legacy workqueue APIs over to the preferred unbound workqueues. - The 9 patch series "Various mm kselftests improvements/fixes" from Kevin Brodsky provides various unrelated improvements/fixes for the mm kselftests. - The 5 patch series "mm: accelerate gigantic folio allocation" from Kefeng Wang greatly speeds up gigantic folio allocation, mainly by avoiding unnecessary work in pfn_range_valid_contig(). - The 5 patch series "selftests/damon: improve leak detection and wss estimation reliability" from SeongJae Park improves the reliability of two of the DAMON selftests. - The 8 patch series "mm/damon: cleanup kdamond, damon_call(), damos filter and DAMON_MIN_REGION" from SeongJae Park does some cleanup work in the core DAMON code. - The 8 patch series "Docs/mm/damon: update intro, modules, maintainer profile, and misc" from SeongJae Park performs maintenance work on the DAMON documentation. - The 10 patch series "mm: add and use vma_assert_stabilised() helper" from Lorenzo Stoakes refactors and cleans up the core VMA code. The main aim here is to be able to use the mmap write lock's lockdep state to perform various assertions regarding the locking which the VMA code requires. - The 19 patch series "mm, swap: swap table phase II: unify swapin use" from Kairui Song removes some old swap code (swap cache bypassing and swap synchronization) which wasn't working very well. Various other cleanups and simplifications were made. The end result is a 20% speedup in one benchmark. - The 8 patch series "enable PT_RECLAIM on more 64-bit architectures" from Qi Zheng makes PT_RECLAIM available on 64-bit alpha, loongarch, mips, parisc, um, Various cleanups were performed along the way. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY1HfAAKCRDdBJ7gKXxA jqhZAP9H8ZlKKqCEgnr6U5XXmJ63Ep2FDQpl8p35yr9yVuU9+gEAgfyWiJ43l1fP rT0yjsUW3KQFBi/SEA3R6aYarmoIBgI= =+HLt -----END PGP SIGNATURE----- Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "powerpc/64s: do not re-activate batched TLB flush" makes arch_{enter\|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev) It adds a generic enter/leave layer and switches architectures to use it. Various hacks were removed in the process. - "zram: introduce compressed data writeback" implements data compression for zram writeback (Richard Chang and Sergey Senozhatsky) - "mm: folio_zero_user: clear page ranges" adds clearing of contiguous page ranges for hugepages. Large improvements during demand faulting are demonstrated (David Hildenbrand) - "memcg cleanups" tidies up some memcg code (Chen Ridong) - "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos stats" improves DAMOS stat's provided information, deterministic control, and readability (SeongJae Park) - "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few issues in the hugetlb cgroup charging selftests (Li Wang) - "Fix va_high_addr_switch.sh test failure - again" addresses several issues in the va_high_addr_switch test (Chunyu Hu) - "mm/damon/tests/core-kunit: extend existing test scenarios" improves the KUnit test coverage for DAMON (Shu Anzai) - "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN (Shivank Garg) - "arch, mm: consolidate hugetlb early reservation" reworks and consolidates a pile of straggly code related to reservation of hugetlb memory from bootmem and creation of CMA areas for hugetlb (Mike Rapoport) - "mm: clean up anon_vma implementation" cleans up the anon_vma implementation in various ways (Lorenzo Stoakes) - "tweaks for __alloc_pages_slowpath()" does a little streamlining of the page allocator's slowpath code (Vlastimil Babka) - "memcg: separate private and public ID namespaces" cleans up the memcg ID code and prevents the internal-only private IDs from being exposed to userspace (Shakeel Butt) - "mm: hugetlb: allocate frozen gigantic folio" cleans up the allocation of frozen folios and avoids some atomic refcount operations (Kefeng Wang) - "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement of memory betewwn the active and inactive LRUs and adds auto-tuning of the ratio-based quotas and of monitoring intervals (SeongJae Park) - "Support page table check on PowerPC" makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan) - "nodemask: align nodes_and{,not} with underlying bitmap ops" makes nodes_and() and nodes_andnot() propagate the return values from the underlying bit operations, enabling some cleanup in calling code (Yury Norov) - "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up some DAMON internal interfaces (SeongJae Park) - "mm/khugepaged: cleanups and scan limit fix" does some cleanup work in khupaged and fixes a scan limit accounting issue (Shivank Garg) - "mm: balloon infrastructure cleanups" goes to town on the balloon infrastructure and its page migration function. Mainly cleanups, also some locking simplification (David Hildenbrand) - "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds additional tracepoints to the page reclaim code (Jiayuan Chen) - "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is part of Marco's kernel-wide migration from the legacy workqueue APIs over to the preferred unbound workqueues (Marco Crivellari) - "Various mm kselftests improvements/fixes" provides various unrelated improvements/fixes for the mm kselftests (Kevin Brodsky) - "mm: accelerate gigantic folio allocation" greatly speeds up gigantic folio allocation, mainly by avoiding unnecessary work in pfn_range_valid_contig() (Kefeng Wang) - "selftests/damon: improve leak detection and wss estimation reliability" improves the reliability of two of the DAMON selftests (SeongJae Park) - "mm/damon: cleanup kdamond, damon_call(), damos filter and DAMON_MIN_REGION" does some cleanup work in the core DAMON code (SeongJae Park) - "Docs/mm/damon: update intro, modules, maintainer profile, and misc" performs maintenance work on the DAMON documentation (SeongJae Park) - "mm: add and use vma_assert_stabilised() helper" refactors and cleans up the core VMA code. The main aim here is to be able to use the mmap write lock's lockdep state to perform various assertions regarding the locking which the VMA code requires (Lorenzo Stoakes) - "mm, swap: swap table phase II: unify swapin use" removes some old swap code (swap cache bypassing and swap synchronization) which wasn't working very well. Various other cleanups and simplifications were made. The end result is a 20% speedup in one benchmark (Kairui Song) - "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM available on 64-bit alpha, loongarch, mips, parisc, and um. Various cleanups were performed along the way (Qi Zheng) * tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits) mm/memory: handle non-split locks correctly in zap_empty_pte_table() mm: move pte table reclaim code to memory.c mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config um: mm: enable MMU_GATHER_RCU_TABLE_FREE parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE mips: mm: enable MMU_GATHER_RCU_TABLE_FREE LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles zsmalloc: make common caches global mm: add SPDX id lines to some mm source files mm/zswap: use %pe to print error pointers mm/vmscan: use %pe to print error pointers mm/readahead: fix typo in comment mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file() mm: refactor vma_map_pages to use vm_insert_pages mm/damon: unify address range representation with damon_addr_range mm/cma: replace snprintf with strscpy in cma_new_area ...	2026-02-12 11:32:37 -08:00
Linus Torvalds	57cb845067	- A nice cleanup to the paravirt code containing a unification of the paravirt clock interface, taming the include hell by splitting the pv_ops structure and removing of a bunch of obsolete code. Work by Juergen Gross. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmmLKHAACgkQEsHwGGHe VUrURg//Ucf+3EAIkLCmFkH0WwYmQl2JjRYww8bPAw3iJMIVxy4dMnaBbsUiAtUp kYza+pgEtvyAwwd8RIEs85c9VhZn0DKoaWV8goBH3zFH6YvIRiLwb0w2QvjkF+70 FNU+4zlvt/I3FD+tWNElAgVtkFL3Gmzm44qyLLsPtlYaJ71xFl2XB7V+TlqXMHzE m8BMenP9/CrbTlBBdNJGzAkAbWi1uAP+IydvuFNolH/F2lqVM2z5Ta3gUWWCIk/q jWrPLDZCHr2WlBZNUGamKVVH9NEh+7YNwBAGUrSNYGZFoaFjqeX6lN3djzS+wXIj 0nDoW35jN0QNKz239MdXZDf1mfpb6ZQd/iOhFjo4dAvbm+J8WPAMr98ac8wR3Dyb 2LF/BxkoKWRabxQApXSCrLPXEuqT6Qc1+lDA0bNHg51zBoqP5vRNVZRwArnzGB+O LxDKx+o4VYOf+UCaB6oQHjylbSgFvIedZ9p822hBe3QG9act8indRE8LWip7Utld peoJGgvlQ0xtClh6FjVHpvmVfAvk7Zki5ywj2GwmB/TZ0yywuGStAjE3UqY168/M gb7MSajh+HHZNj1/2+b/se4CUYlAgIPDQ+SwHJPm5TqyopvnOVi/2XWmjbx8I5jT jS0nxaxD+SbESSZ6IMAsppnAAxAYbvRHGIS+6mtNCXVkaV1pMbA= =AeFt -----END PGP SIGNATURE----- Merge tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Borislav Petkov: - A nice cleanup to the paravirt code containing a unification of the paravirt clock interface, taming the include hell by splitting the pv_ops structure and removing of a bunch of obsolete code (Juergen Gross) * tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/paravirt: Use XOR r32,r32 to clear register in pv_vcpu_is_preempted() x86/paravirt: Remove trailing semicolons from alternative asm templates x86/pvlocks: Move paravirt spinlock functions into own header x86/paravirt: Specify pv_ops array in paravirt macros x86/paravirt: Allow pv-calls outside paravirt.h objtool: Allow multiple pv_ops arrays x86/xen: Drop xen_mmu_ops x86/xen: Drop xen_cpu_ops x86/xen: Drop xen_irq_ops x86/paravirt: Move pv_native_*() prototypes to paravirt.c x86/paravirt: Introduce new paravirt-base.h header x86/paravirt: Move paravirt_sched_clock() related code into tsc.c x86/paravirt: Use common code for paravirt_steal_clock() riscv/paravirt: Use common code for paravirt_steal_clock() loongarch/paravirt: Use common code for paravirt_steal_clock() arm64/paravirt: Use common code for paravirt_steal_clock() arm/paravirt: Use common code for paravirt_steal_clock() sched: Move clock related paravirt code to kernel/sched paravirt: Remove asm/paravirt_api_clock.h x86/paravirt: Move thunk macros to paravirt_types.h ...	2026-02-10 19:01:45 -08:00
Carlos López	f5db714646	LoongArch: Use IS_ERR_PCPU() macro for KGDB In commit `a759e37fb4` ("err.h: add ERR_PTR_PCPU(), PTR_ERR_PCPU() and IS_ERR_PCPU() macros"), specialized macros were added to check an error within a __percpu pointer, so use them instead of manually casting with __force, like all other users of register_wide_hw_breakpoint(). Signed-off-by: Carlos López <clopez@suse.de> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-10 19:31:17 +08:00
Tiezhu Yang	0e6f596d6a	LoongArch: Remove some extern variables in source files There are declarations of the variable "eentry", "pcpu_handlers[]" and "exception_handlers[]" in asm/setup.h, the source files already include this header file directly or indirectly, so no need to declare them in the source files, just remove the code. Cc: stable@vger.kernel.org Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-10 19:31:14 +08:00
Tiezhu Yang	70b0faae35	LoongArch: Guard percpu handler under !CONFIG_PREEMPT_RT After commit `88fd2b7012` ("LoongArch: Fix sleeping in atomic context for PREEMPT_RT"), it should guard percpu handler under !CONFIG_PREEMPT_RT to avoid redundant operations. Cc: stable@vger.kernel.org Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-10 19:31:13 +08:00
Tiezhu Yang	055c7e7519	LoongArch: Handle percpu handler address for ORC unwinder After commit `4cd641a79e` ("LoongArch: Remove unnecessary checks for ORC unwinder"), the system can not boot normally under some configs (such as enable KASAN), there are many error messages "cannot find unwind pc". The kernel boots normally with the defconfig, so no problem found out at the first time. Here is one way to reproduce: cd linux make mrproper defconfig -j"$(nproc)" scripts/config -e KASAN make olddefconfig all -j"$(nproc)" sudo make modules_install sudo make install sudo reboot The address that can not unwind is not a valid kernel address which is between "pcpu_handlers[cpu]" and "pcpu_handlers[cpu] + vec_sz" due to the code of eentry was copied to the new area of pcpu_handlers[cpu] in setup_tlb_handler(), handle this special case to get the valid address to unwind normally. Cc: stable@vger.kernel.org Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-10 19:31:13 +08:00
Tiezhu Yang	77403a06d8	LoongArch: Use %px to print unmodified unwinding address Currently, use %p to prevent leaking information about the kernel memory layout when printing the PC address, but the kernel log messages are not useful to debug problem if bt_address() returns 0. Given that the type of "pc" variable is unsigned long, it should use %px to print the unmodified unwinding address. Cc: stable@vger.kernel.org Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-10 19:31:13 +08:00
Huacai Chen	2172d6ebac	LoongArch: Prefer top-down allocation after arch_mem_init() Currently we use bottom-up allocation after sparse_init(), the reason is sparse_init() need a lot of memory, and bottom-up allocation may exhaust precious low memory (below 4GB). On the other hand, SWIOTLB and CMA need low memories for DMA32, so swiotlb_init() and dma_contiguous_reserve() need bottom-up allocation. Since swiotlb_init() and dma_contiguous_reserve() are both called in arch_mem_init(), we no longer need bottom-up allocation after that. So we set the allocation policy to top-down at the end of arch_mem_init(), in order to avoid later memory allocations (such as KASAN) exhaust low memory. This solve at least two problems: 1. Some buggy BIOSes use 0xfd000000~0xfe000000 for secondary CPUs, but didn't reserve this range, which causes smpboot failures. 2. Some DMA32 devices, such as Loongson-DRM and OHCI, cannot work with KASAN enabled. Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-10 19:31:13 +08:00
Huacai Chen	009ee0c964	LoongArch: Add HOTPLUG_SMT implementation For benchmarking or debugging purpose, we usually want to control SMT via boot parameter and sysfs knobs. So add HOTPLUG_SMT implementation. 1. Boot parameters: nosmt: Disable SMT, can be enabled via sysfs knobs. nosmt=force: Disable SMT, cannot be enabled via sysfs knobs. 2. Runtime sysfs controls: Write "on", "off", "forceoff" or the number of SMT threads (1, 2, ...) to /sys/devices/system/cpu/smt/control. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-10 19:31:13 +08:00
Lain "Fearyncess" Yang	abca6583a2	LoongArch: Wire up memfd_secret system call LoongArch supports ARCH_HAS_SET_DIRECT_MAP, therefore wire up the memfd_secret system call, which just depends on it. Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Signed-off-by: Lain "Fearyncess" Yang <fearyncess@aosc.io> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-10 19:31:12 +08:00
George Guo	52c1dbf4cb	LoongArch: Replace seq_printf() with seq_puts() for simple strings Fix warnings like: "Prefer seq_puts to seq_printf" by checkpatch.pl. Replace seq_printf() calls with seq_puts() in show_cpuinfo() when outputting simple constant strings without format specifiers. This improves performance slightly as seq_puts() avoids parsing the format string. Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-10 19:31:12 +08:00
George Guo	48543c4283	LoongArch: Add detection for SC.Q support Check the CPUCFG2_SCQ bit to determine if the current CPU supports the SC.Q instruction. Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com> Tested-by: Hengqi Chen <hengqi.chen@gmail.com> Co-developed-by: Yangyang Lian <lianyangyang@kylinos.cn> Signed-off-by: Yangyang Lian <lianyangyang@kylinos.cn> Signed-off-by: George Guo <guodongtai@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-10 19:31:06 +08:00
Linus Torvalds	0c61526621	EFI updates for v7.0 - Quirk the broken EFI framebuffer geometry on the Valve Steam Deck - Capture the EDID information of the primary display also on non-x86 EFI systems when booting via the EFI stub. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCaYoTAgAKCRAwbglWLn0t XFe8AQDJe2GSNfzWgqoTqgT6tcH7lFG2SjdpIb+jHSmvgHckbAD/cUaY8YnhdYkm nz6URLJN/2NHuaDq1mUL8CwJwIot4wk= =41tn -----END PGP SIGNATURE----- Merge tag 'efi-next-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Quirk the broken EFI framebuffer geometry on the Valve Steam Deck - Capture the EDID information of the primary display also on non-x86 EFI systems when booting via the EFI stub. * tag 'efi-next-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Support EDID information sysfb: Move edid_info into sysfb_primary_display sysfb: Pass sysfb_primary_display to devices sysfb: Replace screen_info with sysfb_primary_display sysfb: Add struct sysfb_display_info efi: sysfb_efi: Reduce number of references to global screen_info efi: earlycon: Reduce number of references to global screen_info efi: sysfb_efi: Fix efidrmfb and simpledrmfb on Valve Steam Deck efi: sysfb_efi: Convert swap width and height quirk to a callback efi: sysfb_efi: Fix lfb_linelength calculation when applying quirks efi: sysfb_efi: Replace open coded swap with the macro	2026-02-09 20:49:19 -08:00
Bibo Mao	872d277a1e	LoongArch: KVM: Add paravirt vcpu_is_preempted() support in guest side Function vcpu_is_preempted() is used to check whether vCPU is preempted or not. Here add the implementation with vcpu_is_preempted() when option CONFIG_PARAVIRT is enabled. Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-02-06 09:28:01 +08:00
Mike Rapoport (Microsoft)	4267739cab	arch, mm: consolidate initialization of SPARSE memory model Every architecture calls sparse_init() during setup_arch() although the data structures created by sparse_init() are not used until the initialization of the core MM. Beside the code duplication, calling sparse_init() from architecture specific code causes ordering differences of vmemmap and HVO initialization on different architectures. Move the call to sparse_init() from architecture specific code to free_area_init() to ensure that vmemmap and HVO initialization order is always the same. Link: https://lkml.kernel.org/r/20260111082105.290734-25-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alex Shi <alexs@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 20:02:18 -08:00
Mike Rapoport (Microsoft)	d49004c5f0	arch, mm: consolidate initialization of nodes, zones and memory map To initialize node, zone and memory map data structures every architecture calls free_area_init() during setup_arch() and passes it an array of zone limits. Beside code duplication it creates "interesting" ordering cases between allocation and initialization of hugetlb and the memory map. Some architectures allocate hugetlb pages very early in setup_arch() in certain cases, some only create hugetlb CMA areas in setup_arch() and sometimes hugetlb allocations happen mm_core_init(). With arch_zone_limits_init() helper available now on all architectures it is no longer necessary to call free_area_init() from architecture setup code. Rather core MM initialization can call arch_zone_limits_init() in a single place. This allows to unify ordering of hugetlb vs memory map allocation and initialization. Remove the call to free_area_init() from architecture specific code and place it in a new mm_core_init_early() function that is called immediately after setup_arch(). After this refactoring it is possible to consolidate hugetlb allocations and eliminate differences in ordering of hugetlb and memory map initialization among different architectures. As the first step of this consolidation move hugetlb_bootmem_alloc() to mm_core_early_init(). Link: https://lkml.kernel.org/r/20260111082105.290734-24-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alex Shi <alexs@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 20:02:18 -08:00
Lisa Robinson	a91f86e270	LoongArch: Fix PMU counter allocation for mixed-type event groups When validating a perf event group, validate_group() unconditionally attempts to allocate hardware PMU counters for the leader, sibling events and the new event being added. This is incorrect for mixed-type groups. If a PERF_TYPE_SOFTWARE event is part of the group, the current code still tries to allocate a hardware PMU counter for it, which can wrongly consume hardware PMU resources and cause spurious allocation failures. Fix this by only allocating PMU counters for hardware events during group validation, and skipping software events. A trimmed down reproducer is as simple as this: #include <stdio.h> #include <assert.h> #include <unistd.h> #include <string.h> #include <sys/syscall.h> #include <linux/perf_event.h> int main (int argc, char *argv[]) { struct perf_event_attr attr = { 0 }; int fds[5]; attr.disabled = 1; attr.exclude_kernel = 1; attr.exclude_hv = 1; attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED \| PERF_FORMAT_TOTAL_TIME_RUNNING \| PERF_FORMAT_ID \| PERF_FORMAT_GROUP; attr.size = sizeof (attr); attr.type = PERF_TYPE_SOFTWARE; attr.config = PERF_COUNT_SW_DUMMY; fds[0] = syscall (SYS_perf_event_open, &attr, 0, -1, -1, 0); assert (fds[0] >= 0); attr.type = PERF_TYPE_HARDWARE; attr.config = PERF_COUNT_HW_CPU_CYCLES; fds[1] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0); assert (fds[1] >= 0); attr.type = PERF_TYPE_HARDWARE; attr.config = PERF_COUNT_HW_INSTRUCTIONS; fds[2] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0); assert (fds[2] >= 0); attr.type = PERF_TYPE_HARDWARE; attr.config = PERF_COUNT_HW_BRANCH_MISSES; fds[3] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0); assert (fds[3] >= 0); attr.type = PERF_TYPE_HARDWARE; attr.config = PERF_COUNT_HW_CACHE_REFERENCES; fds[4] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0); assert (fds[4] >= 0); printf ("PASSED\n"); return 0; } Cc: stable@vger.kernel.org Fixes: `b37042b2bb` ("LoongArch: Add perf events support") Signed-off-by: Lisa Robinson <lisa@bytefly.space> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-01-17 10:56:43 +08:00
Huacai Chen	0d26ca8ec4	LoongArch: Remove redundant code in head.S SETUP_MODES already setup the initial values of CSR.CRMD, CSR.PRMD and CSR.EUEN, so the redundant open code can be removed. Fixes: `7b2afeafaf` ("LoongArch: Adjust boot & setup for 32BIT/64BIT") Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2026-01-15 18:50:48 +08:00
Juergen Gross	b8431b901e	loongarch/paravirt: Use common code for paravirt_steal_clock() Remove the arch specific variant of paravirt_steal_clock() and use the common one instead. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Bibo Mao <maobibo@loongson.cn> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260105110520.21356-10-jgross@suse.com	2026-01-12 16:47:20 +01:00
Juergen Gross	e6b2aa6d40	sched: Move clock related paravirt code to kernel/sched Paravirt clock related functions are available in multiple archs. In order to share the common parts, move the common static keys to kernel/sched/ and remove them from the arch specific files. Make a common paravirt_steal_clock() implementation available in kernel/sched/cputime.c, guarding it with a new config option CONFIG_HAVE_PV_STEAL_CLOCK_GEN, which can be selected by an arch in case it wants to use that common variant. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260105110520.21356-7-jgross@suse.com	2026-01-12 15:39:14 +01:00
Chenghao Duan	45cb47c628	LoongArch: Refactor register restoration in ftrace_common_return Refactor the register restoration sequence in the ftrace_common_return function to clearly distinguish between the logic of normal returns and direct call returns in function tracing scenarios. The logic is as follows: 1. In the case of a normal return, the execution flow returns to the traced function, and ftrace must ensure that the register data is consistent with the state when the function was entered. ra = parent return address; t0 = traced function return address. 2. In the case of a direct call return, the execution flow jumps to the custom trampoline function, and ftrace must ensure that the register data is consistent with the state when ftrace was entered. ra = traced function return address; t0 = parent return address. Cc: stable@vger.kernel.org Fixes: `9cdc3b6a29` ("LoongArch: ftrace: Add direct call support") Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2025-12-31 15:19:20 +08:00

1 2 3 4 5 ...

552 Commits