mirror of
https://github.com/torvalds/linux.git
synced 2026-05-13 08:39:31 +02:00
master
6980 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
c21b90f776 |
x86/CPU/AMD: Prevent improper isolation of shared resources in Zen2's op cache
Make sure resources are not improperly shared in the op cache and cause instruction corruption this way. Signed-off-by: Prathyushi Nangia <prathyushi.nangia@amd.com> Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
8fd12b03c7 |
hyperv-next for v7.1
-----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmnobFATHHdlaS5saXVA a2VybmVsLm9yZwAKCRB2FHBfkEGgXhafB/wMvf8yu4FgapbSRIlPboeW/ONDyuHd k0y1a0IFMQLspWfCxK8+snEcHT3g9xzG3ksqcnac0SPgzqxQ4dlK/c7Xr+5EBXuf TEt/bHrt6KT5BUpb1k/XxcdoObCsvZd3vfqR020OsHijw1Ni9PrdqeZxk56/vFJs nvgyKvsrAEyfALOH1Vgwg0gNWpqJBj1KcT3Kl1o4p5lwQzVpUREZii6RyvgXT/pu mckN63FrEPOpDaJllHmCPcfFSzqNi+wFQcFxm35w9rDQsVdnMsRWoJbcXNYW5QGM +KOZ1/tzw4Z3queW78hcxsH6sXiElLDsJgtDohBhxbVvUXy+xHrJCOm5 =4Ke3 -----END PGP SIGNATURE----- Merge tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V updates from Wei Liu: - Fix cross-compilation for hv tools (Aditya Garg) - Fix vmemmap_shift exceeding MAX_FOLIO_ORDER in mshv_vtl (Naman Jain) - Limit channel interrupt scan to relid high water mark (Michael Kelley) - Export hv_vmbus_exists() and use it in pci-hyperv (Dexuan Cui) - Fix cleanup and shutdown issues for MSHV (Jork Loeser) - Introduce more tracing support for MSHV (Stanislav Kinsburskii) * tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Skip LP/VP creation on kexec x86/hyperv: move stimer cleanup to hv_machine_shutdown() Drivers: hv: vmbus: fix hyperv_cpuhp_online variable shadowing mshv: Add tracepoint for GPA intercept handling mshv_vtl: Fix vmemmap_shift exceeding MAX_FOLIO_ORDER tools: hv: Fix cross-compilation Drivers: hv: vmbus: Export hv_vmbus_exists() and use it in pci-hyperv mshv: Introduce tracing support Drivers: hv: vmbus: Limit channel interrupt scan to relid high water mark |
||
|
|
5170a82e89 |
x86/hyperv: Skip LP/VP creation on kexec
After a kexec the logical processors and virtual processors already exist in the hypervisor because they were created by the previous kernel. Attempting to add them again causes either a BUG_ON or corrupted VP state leading to MCEs in the new kernel. Add hv_lp_exists() to probe whether an LP is already present by calling HVCALL_GET_LOGICAL_PROCESSOR_RUN_TIME. When it succeeds the LP exists and we skip the add-LP and create-VP loops entirely. Also add hv_call_notify_all_processors_started() which informs the hypervisor that all processors are online. This is required after adding LPs (fresh boot) and is a no-op on kexec since we skip that path. Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Co-developed-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com> Signed-off-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com> Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> |
||
|
|
f7ce370b52 |
x86/hyperv: move stimer cleanup to hv_machine_shutdown()
Move hv_stimer_global_cleanup() from vmbus's hv_kexec_handler() to hv_machine_shutdown() in the platform code. This ensures stimer cleanup happens before the vmbus unload, which is required for root partition kexec to work correctly. Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> |
||
|
|
01f492e181 |
Arm:
- Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This uses the new infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware, and which came through the tracing tree. - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM. - Finally add support for pKVM protected guests, where pages are unmapped from the host as they are faulted into the guest and can be shared back from the guest using pKVM hypercalls. Protected guests are created using a new machine type identifier. As the elusive guestmem has not yet delivered on its promises, anonymous memory is also supported. This is only a first step towards full isolation from the host; for example, the CPU register state and DMA accesses are not yet isolated. Because this does not really yet bring fully what it promises, it is hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is created. Caveat emptor. - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable. - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis. - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow. - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups. - A small set of patches fixing the Stage-2 MMU freeing in error cases. - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls. - The usual cleanups and other selftest churn. LoongArch: - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel(). - Add DMSINTC irqchip in kernel support. RISC-V: - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space s390: - Support for ESA (31-bit) guests inside nested hypervisors. - Remove restriction on memslot alignment, which is not needed anymore with the new gmap code. - Fix LPSW/E to update the bear (which of course is the breaking event address register). x86: - Shut up various UBSAN warnings on reading module parameter before they were initialized. - Don't zero-allocate page tables that are used for splitting hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and thus write all bytes. - As an optimization, bail early when trying to unsync 4KiB mappings if the target gfn can just be mapped with a 2MiB hugepage. x86 generic: - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM would dereference stack pointer after an exit to userspace. - Clean up and comment the emulated MMIO code to try to make it easier to maintain (not necessarily "easy", but "easier"). - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of VMX and SVM enabling) as it is needed for trusted I/O. - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions - Immediately fail the build if a required #define is missing in one of KVM's headers that is included multiple times. - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected exception, mostly to prevent syzkaller from abusing the uAPI to trigger WARNs, but also because it can help prevent userspace from unintentionally crashing the VM. - Exempt SMM from CPUID faulting on Intel, as per the spec. - Misc hardening and cleanup changes. x86 (AMD): - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs. - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would overwrite state when enabling virtualization on multiple CPUs in parallel. This should not be a problem because OSVW should usually be the same for all CPUs. - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input. - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION. - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will crash the host due to an RMP violation page fault. - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock, and enforce it by lockdep. Fix various bugs where sev_guest() was not ensured to be stable for the whole duration of a function or ioctl. - Convert a pile of kvm->lock SEV code to guard(). - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6 as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2 rather than CR2, for example). Only set CR2 and DR6 when consumption of the payload is imminent, but on the other hand force delivery of the payload in all paths where userspace retrieves CR2 or DR6. - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead of vmcb02->save.cr2. The value is out of sync after a save/restore or after a #PF is injected into L2. - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE. - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore. - Add a variety of missing nSVM consistency checks. - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT. - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions. - Add support for save+restore of virtualized LBRs (on SVM). - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain. - Aggressively sanitize fields when copying from vmcb12, to guard against unintentionally allowing L1 to utilize yet-to-be-defined features. - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. There are remaining issues in that KVM doesn't handle size prefix overrides for 64-bit guests. - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's architectural but sketchy behavior of generating #GP for "unsupported" addresses). - Cache all used vmcb12 fields to further harden against TOCTOU bugs. x86 (Intel): - Drop obsolete branch hint prefixes from the VMX instruction macros. - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a register input when appropriate. - Code cleanups. guest_memfd: - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support reclaim, the memory is unevictable, and there is no storage to write back to. LoongArch selftests: - Add KVM PMU test cases s390 selftests: - Enable more memory selftests. x86 selftests: - Add support for Hygon CPUs in KVM selftests. - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP. - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmnftRQUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPAzwf+NKO4Ktv+7A22ImN0SBl0nlUuulsz vTcw3+hxdRoIw83GdNS+hG5js0wrpMDnbv3t4+VliDNBSSxrBzcSWX2wpilW0Xtw qGo1MWhs2lKPy1NlaRVOwPS6j7uF3AR0TQ1iQLGMedQuCU9WpiKJxyhNXJdbLrt3 8EgFzsvtEsv+jKNRUNDf9+d0j4gZsFyIe+Brhianbw+u3/UCiUClLCdsKPc4+5ZX 08otYXytacGNIf/5Ev1vT4pHkHL0yqKXAtX7LEtaS3+0KrPuLjV4slemivzE9vf5 Evafm5AhA4wpaNMb1ZerhY3T94lsMaJpWxotjR//0Q7C9B59pCQnXCm8mg== =CcE0 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "Arm: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This uses the new infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware, and which came through the tracing tree - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM - Finally add support for pKVM protected guests, where pages are unmapped from the host as they are faulted into the guest and can be shared back from the guest using pKVM hypercalls. Protected guests are created using a new machine type identifier. As the elusive guestmem has not yet delivered on its promises, anonymous memory is also supported This is only a first step towards full isolation from the host; for example, the CPU register state and DMA accesses are not yet isolated. Because this does not really yet bring fully what it promises, it is hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is created. Caveat emptor - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups - A small set of patches fixing the Stage-2 MMU freeing in error cases - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls - The usual cleanups and other selftest churn LoongArch: - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel() - Add DMSINTC irqchip in kernel support RISC-V: - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space s390: - Support for ESA (31-bit) guests inside nested hypervisors - Remove restriction on memslot alignment, which is not needed anymore with the new gmap code - Fix LPSW/E to update the bear (which of course is the breaking event address register) x86: - Shut up various UBSAN warnings on reading module parameter before they were initialized - Don't zero-allocate page tables that are used for splitting hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and thus write all bytes - As an optimization, bail early when trying to unsync 4KiB mappings if the target gfn can just be mapped with a 2MiB hugepage x86 generic: - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM would dereference stack pointer after an exit to userspace - Clean up and comment the emulated MMIO code to try to make it easier to maintain (not necessarily "easy", but "easier") - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of VMX and SVM enabling) as it is needed for trusted I/O - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions - Immediately fail the build if a required #define is missing in one of KVM's headers that is included multiple times - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected exception, mostly to prevent syzkaller from abusing the uAPI to trigger WARNs, but also because it can help prevent userspace from unintentionally crashing the VM - Exempt SMM from CPUID faulting on Intel, as per the spec - Misc hardening and cleanup changes x86 (AMD): - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would overwrite state when enabling virtualization on multiple CPUs in parallel. This should not be a problem because OSVW should usually be the same for all CPUs - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will crash the host due to an RMP violation page fault - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock, and enforce it by lockdep. Fix various bugs where sev_guest() was not ensured to be stable for the whole duration of a function or ioctl - Convert a pile of kvm->lock SEV code to guard() - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6 as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2 rather than CR2, for example). Only set CR2 and DR6 when consumption of the payload is imminent, but on the other hand force delivery of the payload in all paths where userspace retrieves CR2 or DR6 - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead of vmcb02->save.cr2. The value is out of sync after a save/restore or after a #PF is injected into L2 - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore - Add a variety of missing nSVM consistency checks - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions - Add support for save+restore of virtualized LBRs (on SVM) - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain - Aggressively sanitize fields when copying from vmcb12, to guard against unintentionally allowing L1 to utilize yet-to-be-defined features - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. There are remaining issues in that KVM doesn't handle size prefix overrides for 64-bit guests - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's architectural but sketchy behavior of generating #GP for "unsupported" addresses) - Cache all used vmcb12 fields to further harden against TOCTOU bugs x86 (Intel): - Drop obsolete branch hint prefixes from the VMX instruction macros - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a register input when appropriate - Code cleanups guest_memfd: - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support reclaim, the memory is unevictable, and there is no storage to write back to LoongArch selftests: - Add KVM PMU test cases s390 selftests: - Enable more memory selftests x86 selftests: - Add support for Hygon CPUs in KVM selftests - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits) KVM: x86: use inlines instead of macros for is_sev_*guest x86/virt: Treat SVM as unsupported when running as an SEV+ guest KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper KVM: SEV: use mutex guard in snp_handle_guest_req() KVM: SEV: use mutex guard in sev_mem_enc_unregister_region() KVM: SEV: use mutex guard in sev_mem_enc_ioctl() KVM: SEV: use mutex guard in snp_launch_update() KVM: SEV: Assert that kvm->lock is held when querying SEV+ support KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe" KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y KVM: SEV: WARN on unhandled VM type when initializing VM KVM: LoongArch: selftests: Add PMU overflow interrupt test KVM: LoongArch: selftests: Add basic PMU event counting test KVM: LoongArch: selftests: Add cpucfg read/write helpers LoongArch: KVM: Add DMSINTC inject msi to vCPU LoongArch: KVM: Add DMSINTC device support LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch ... |
||
|
|
e55d98e775 |
x86/CPU: Fix FPDSS on Zen1
Zen1's hardware divider can leave, under certain circumstances, partial results from previous operations. Those results can be leaked by another, attacker thread. Fix that with a chicken bit. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
334fbe734e |
mm.git review status for linus..mm-stable
Everything: Total patches: 368 Reviews/patch: 1.56 Reviewed rate: 74% Excluding DAMON: Total patches: 316 Reviews/patch: 1.77 Reviewed rate: 81% Excluding DAMON and zram: Total patches: 306 Reviews/patch: 1.81 Reviewed rate: 82% Excluding DAMON, zram and maple_tree: Total patches: 276 Reviews/patch: 2.01 Reviewed rate: 91% Significant patch series in this merge: - The 30 patch series "maple_tree: Replace big node with maple copy" from Liam Howlett is mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - The 12 patch series "mm, swap: swap table phase III: remove swap_map" from Kairui Song offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - The 2 patch series "mm: memfd_luo: preserve file seals" from Pratyush Yadav adds file seal preservation to LUO's memfd code. - The 2 patch series "mm: zswap: add per-memcg stat for incompressible pages" from Jiayuan Chen adds additional userspace stats reportng to zswap. - The 4 patch series "arch, mm: consolidate empty_zero_page" from Mike Rapoport implements some cleanups for our handling of ZERO_PAGE() and zero_pfn. - The 2 patch series "mm/kmemleak: Improve scan_should_stop() implementation" from Zhongqiu Han provides an robustness improvement and some cleanups in the kmemleak code. - The 4 patch series "Improve khugepaged scan logic" from Vernon Yang "improves the khugepaged scan logic and reduces CPU consumption by prioritizing scanning tasks that access memory frequently". - The 2 patch series "Make KHO Stateless" from Jason Miu simplifies Kexec Handover by "transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel" - The 3 patch series "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" from Thomas Ballasi and Steven Rostedt enhances vmscan's tracepointing. - The 5 patch series "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" from Catalin Marinas is a cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation. - The 2 patch series "Fix KASAN support for KHO restored vmalloc regions" from Pasha Tatashin fixes a WARN() which can be emitted the KHO restores a vmalloc area. - The 4 patch series "mm: Remove stray references to pagevec" from Tal Zussman provides several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago. - The 17 patch series "mm: Eliminate fake head pages from vmemmap optimization" from Kiryl Shutsemau simplifies the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page. - The 2 patch series "mm/damon/core: improve DAMOS quota efficiency for core layer filters" from SeongJae Park improves two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used. - The 3 patch series "mm/damon: strictly respect min_nr_regions" from SeongJae Park improves DAMON usability by extending the treatment of the min_nr_regions user-settable parameter. - The 3 patch series "mm/page_alloc: pcp locking cleanup" from Vlastimil Babka is a proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ennsed. - The 16 patch series "mm: cleanups around unmapping / zapping" from David Hildenbrand implements "a bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions". - The 6 patch series "support batched checking of the young flag for MGLRU" from Baolin Wang supports batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64. - The 5 patch series "memcg: obj stock and slab stat caching cleanups" from Johannes Weiner provides memcg cleanup and robustness improvements. - The 5 patch series "Allow order zero pages in page reporting" from Yuvraj Sakshith enhances page_reporting's free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - The 6 patch series "mm: vma flag tweaks" from Lorenzo Stoakes is cleanup work following from the recent conversion of the VMA flags to a bitmap. - The 10 patch series "mm/damon: add optional debugging-purpose sanity checks" from SeongJae Park adds some more developer-facing debug checks into DAMON core. - The 2 patch series "mm/damon: test and document power-of-2 min_region_sz requirement" from SeongJae Park adds an additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling. - The 3 patch series "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" from SeongJae Park fixes a hard-to-hit time overflow issue in DAMON core. - The 7 patch series "mm/damon: improve/fixup/update ratio calculation, test and documentation" from SeongJae Park is a "batch of misc/minor improvements and fixups" for DAMON. - The 4 patch series "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" from David Hildenbrand fixes a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - The 6 patch series "zram: recompression cleanups and tweaks" from Sergey Senozhatsky provides "a somewhat random mix of fixups, recompression cleanups and improvements" in the zram code. - The 11 patch series "mm/damon: support multiple goal-based quota tuning algorithms" from SeongJae Park extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select. - The 4 patch series "mm: thp: reduce unnecessary start_stop_khugepaged()" from Breno Leitao fixes the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged. - The 3 patch series "mm: improve map count checks" from Lorenzo Stoakes provides some cleanups and slight fixes in the mremap, mmap and vma code. - The 5 patch series "mm/damon: support addr_unit on default monitoring targets for modules" from SeongJae Park extends the use of DAMON core's addr_unit tunable. - The 5 patch series "mm: khugepaged cleanups and mTHP prerequisites" from Nico Pache provides cleanups in the khugepaged and is a base for Nico's planned khugepaged mTHP support. - The 15 patch series "mm: memory hot(un)plug and SPARSEMEM cleanups" from David Hildenbrand implements code movement and cleanups in the memhotplug and sparsemem code. - The 2 patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" from David Hildenbrand rationalizes some memhotplug Kconfig support. - The 6 patch series "change young flag check functions to return bool" from Baolin Wang is "a cleanup patchset to change all young flag check functions to return bool". - The 3 patch series "mm/damon/sysfs: fix memory leak and NULL dereference issues" from Josh Law and SeongJae Park fixes a few potential DAMON bugs. - The 25 patch series "mm/vma: convert vm_flags_t to vma_flags_t in vma code" from "converts a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it". Mainly in the vma code. - The 21 patch series "mm: expand mmap_prepare functionality and usage" from Lorenzo Stoakes "expands the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time". Cleanups, documentation, extension of mmap_prepare into filesystem drivers. - The 13 patch series "mm/huge_memory: refactor zap_huge_pmd()" from Lorenzo Stoakes simplifies and cleans up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad3HDQAKCRDdBJ7gKXxA jrUQAPwNhPk5nPSxnyxjAeQtOBHqgCdnICeEismLajPKd9aYRgEA0s2XAu3tSUYi GrBnWImHG3s4ePQxVcPCegWTsOUrXgQ= =1Q7o -----END PGP SIGNATURE----- Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "maple_tree: Replace big node with maple copy" (Liam Howlett) Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - "mm, swap: swap table phase III: remove swap_map" (Kairui Song) Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - "mm: memfd_luo: preserve file seals" (Pratyush Yadav) File seal preservation to LUO's memfd code - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan Chen) Additional userspace stats reportng to zswap - "arch, mm: consolidate empty_zero_page" (Mike Rapoport) Some cleanups for our handling of ZERO_PAGE() and zero_pfn - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu Han) A robustness improvement and some cleanups in the kmemleak code - "Improve khugepaged scan logic" (Vernon Yang) Improve khugepaged scan logic and reduce CPU consumption by prioritizing scanning tasks that access memory frequently - "Make KHO Stateless" (Jason Miu) Simplify Kexec Handover by transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas Ballasi and Steven Rostedt) Enhance vmscan's tracepointing - "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" (Catalin Marinas) Cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin) Fix a WARN() which can be emitted the KHO restores a vmalloc area - "mm: Remove stray references to pagevec" (Tal Zussman) Several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl Shutsemau) Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page - "mm/damon/core: improve DAMOS quota efficiency for core layer filters" (SeongJae Park) Improve two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used - "mm/damon: strictly respect min_nr_regions" (SeongJae Park) Improve DAMON usability by extending the treatment of the min_nr_regions user-settable parameter - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka) The proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ensued - "mm: cleanups around unmapping / zapping" (David Hildenbrand) A bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions - "support batched checking of the young flag for MGLRU" (Baolin Wang) Batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner) memcg cleanup and robustness improvements - "Allow order zero pages in page reporting" (Yuvraj Sakshith) Enhance free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - "mm: vma flag tweaks" (Lorenzo Stoakes) Cleanup work following from the recent conversion of the VMA flags to a bitmap - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae Park) Add some more developer-facing debug checks into DAMON core - "mm/damon: test and document power-of-2 min_region_sz requirement" (SeongJae Park) An additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling - "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" (SeongJae Park) Fix a hard-to-hit time overflow issue in DAMON core - "mm/damon: improve/fixup/update ratio calculation, test and documentation" (SeongJae Park) A batch of misc/minor improvements and fixups for DAMON - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David Hildenbrand) Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky) A somewhat random mix of fixups, recompression cleanups and improvements in the zram code - "mm/damon: support multiple goal-based quota tuning algorithms" (SeongJae Park) Extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao) Fix the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged - "mm: improve map count checks" (Lorenzo Stoakes) Provide some cleanups and slight fixes in the mremap, mmap and vma code - "mm/damon: support addr_unit on default monitoring targets for modules" (SeongJae Park) Extend the use of DAMON core's addr_unit tunable - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache) Cleanups to khugepaged and is a base for Nico's planned khugepaged mTHP support - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand) Code movement and cleanups in the memhotplug and sparsemem code - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" (David Hildenbrand) Rationalize some memhotplug Kconfig support - "change young flag check functions to return bool" (Baolin Wang) Cleanups to change all young flag check functions to return bool - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh Law and SeongJae Park) Fix a few potential DAMON bugs - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo Stoakes) Convert a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it. Mainly in the vma code. - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes) Expand the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time. Cleanups, documentation, extension of mmap_prepare into filesystem drivers - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes) Simplify and clean up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. * tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm: fix deferred split queue races during migration mm/khugepaged: fix issue with tracking lock mm/huge_memory: add and use has_deposited_pgtable() mm/huge_memory: add and use normal_or_softleaf_folio_pmd() mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio() mm/huge_memory: separate out the folio part of zap_huge_pmd() mm/huge_memory: use mm instead of tlb->mm mm/huge_memory: remove unnecessary sanity checks mm/huge_memory: deduplicate zap deposited table call mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE() mm/huge_memory: add a common exit path to zap_huge_pmd() mm/huge_memory: handle buggy PMD entry in zap_huge_pmd() mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc mm/huge: avoid big else branch in zap_huge_pmd() mm/huge_memory: simplify vma_is_specal_huge() mm: on remap assert that input range within the proposed VMA mm: add mmap_action_map_kernel_pages[_full]() uio: replace deprecated mmap hook with mmap_prepare in uio_info drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare mm: allow handling of stacked mmap_prepare hooks in more drivers ... |
||
|
|
508fed6795 |
- Add new AMD MCA bank names and types to the MCA code, preceded by a clean
up of the relevant places to have them more developer-friendly (read: sort them alphanumerically and clean up comments) such that adding new banks is easy -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmndH/cACgkQEsHwGGHe VUpqFBAAvjQCWdL5GQ0sV4EyYVToj4OKU3DmUCJLMKEh3n3yrQpPsbU9+KfxyndP B68lRfRqqV/uUxQGebh7Rnp8o2jWphU2hf1Lr0Ssl6y5ouKWs5Up4foLlG4hAhzC 2MmHVz+jj8Z3FWKLxMEymxqq6wLF+0H3Issd/l23DkK6hMQCkjKc6WrSNC6JBDCA sSF5kR/E4Q/lcW12ncq4pUYwkKox2lcdsNtI/nC7W7W+CoqwpOq8MfomCDIII+A0 Ib7baeRxagOk0WHlfy15fGaDoKlHW6ImT3cVYBK/tomp8dpG2zRMXHHQExan2rBR rHzvk3aHEgOr02DZJ/dxOT+libQIkBwno+DheEhJHcirB/gS5Z51ERhkyzqLReGv +XSO1Eq9j5bqiVn8RdPeJIVLtfqnOrpcks+cCmyH0AlLIx1WV5mSRUtmVl1kWyq1 GBos0yOnH4PgMxqv8fNkfNjm1ATnHyrVjYl5YNKSzJHhu/8BYcQJ4X8R0f2m0pXS WI6uXf35C6rJcKj25qo1Nnhmj5YDWJgelJjes9ZtmRMPDNNooD4VLk1W6ox7VuOY QaNMNwrroLRdfOlaz7oYIUAuoaZbZnTqbz8Lfmb4UScLd9LfI5ZPqs7pB5VORApF 5IYM/Wli+kQl2Qbz0CD6ZtfdidqR09H7oJBE/r6bEFePot2EpUY= =aivF -----END PGP SIGNATURE----- Merge tag 'ras_core_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: - Add new AMD MCA bank names and types to the MCA code, preceded by a clean up of the relevant places to have them more developer-friendly (read: sort them alphanumerically and clean up comments) such that adding new banks is easy * tag 'ras_core_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce, EDAC/mce_amd: Add new SMCA bank types x86/mce, EDAC/mce_amd: Update CS bank type naming x86/mce, EDAC/mce_amd: Reorder SMCA bank type enums |
||
|
|
970216e023 |
— Reference the tip tree maintainer handbook directly from the relevant
MAINTAINERS file entries (covering timers, IRQ, locking, scheduling, perf, x86, and others) so that contributors and tooling can know where to look — Enable interrupt remapping in defconfig, which is an architectural requirement for x2APIC to function correctly on bare metal. Without it, x2APIC was effectively enabled but non-functional. — Ensure that drivers which register custom restart handlers (such as those needed for SoC-based x86 devices like Intel Lightning Mountain) are actually invoked during reboot, bringing x86 in line with how other architectures handle this. - Cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmndUHwACgkQEsHwGGHe VUridxAAgSJ7wLZ8rypjCA6W17KLTMOn/ymvrR3NEU/3T8R4m5ax5gHNUgyo/cb9 T9H9eer5NRnDjlpKHXG9QTExxVMyrMqZEBcV04wxExKrWTBqWs8tc8Y1ux6sDm5I 84iPjnaI1ideP6Mu8YwxOgUUGHExICh7gdHsvZGAPt7QupGTc3liT0AftT9qNMd7 c33kx05gJshqdIuI8HFVZGlTu8FTfgYl2AGjVikjGx1i1A8LYlERhDsUu8VEDAsA w2KzIH0rINXWQPvfjtKEK9r9KVsqe4Ye28ku3OmVh3ScflBXrAVOzJSH5DXvoy6f ktw5/rn5zUDX+FHNt0amaZItBqXNB5I/nyn1F8qraTTvK9obC7urYL1Q7x/Sg6j8 zqRTdhBQcqvmGiO9FSvUfJyk+VMAakNXGWHNGIoivuoopfIyieeN3jpaQObZ9dld szrOL0WBduQ9gnke6KJ8VMFhIt2vAqCw+PS+OABwbBvN9rSXnpMBys6dLblU1aoF l93L3g8oHIZimPunGNxqFe77SunA/iVlT4dnXDVGIWmte/0fn967SgtEzRESkOEN XipPW8cBhlZBAD0qN9UBkEU7EUM5G3iLFNIDyA0ONbL+OheE+iVtmO7RelAo4sPh ChM4bOaXMcxRrZf1ExoT86pYRH3Z+4lrhnIjC8ba93BkEi6dfoA= =uJyt -----END PGP SIGNATURE----- Merge tag 'x86_misc_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Borislav Petkov: - Reference the tip tree maintainer handbook directly from the relevant MAINTAINERS file entries (covering timers, IRQ, locking, scheduling, perf, x86, and others) so that contributors and tooling can know where to look - Enable interrupt remapping in defconfig, which is an architectural requirement for x2APIC to function correctly on bare metal. Without it, x2APIC was effectively enabled but non-functional. - Ensure that drivers which register custom restart handlers (such as those needed for SoC-based x86 devices like Intel Lightning Mountain) are actually invoked during reboot, bringing x86 in line with how other architectures handle this. - Cleanups * tag 'x86_misc_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Add references to tip tree handbook x86/64/defconfig: Add CONFIG_IRQ_REMAP x86/reboot: Execute the kernel restart handler upon machine restart x86/mtrr: Use kstrtoul() in parse_mtrr_spare_reg() |
||
|
|
cd4cdc53cc |
- The kernel carries a table of Intel CPUs family,model,stepping,...,etc
tuples which say what is the latest microcode for that particular CPU. Some CPU variants differ only by the platform ID which determines what microcode needs to be loaded on them. Carve out the platform ID handling from the microcode loader and make it available in a more generic place so that the old microcode verification machinery can use it -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmndSBoACgkQEsHwGGHe VUoJShAAjwkQMAYzFPrNXCAZzmyXr0At6YI7QVfhCiZnnbhROiRn9BxpRYby5QX9 VbsBG56uwhHmDklKCeIqz8RyNSDAb0I7JxDVJk+tL4HzoSJjmD8k17ETrl89Sne0 QB7J6PXAETfXAwEVPI1KID0fcGSE6EWkxJqM5v1CcV/hsaY2Ye3CNI7fwZNDAx1l mCbhZ0KqdFH6xjp77z8WlwDquNl7/0vyeTpNYv0CL1TMOZsSUXx8ABY/oI9iTNou 27yd9L1vATVtAUfDhq67eh9bVV/rrXwBGLIiflgVchxEZ6FDgdZ2a3NxmtMz3xAc yam2JM0dSo4ZaB1WfJpYJMfm2jBMOtA4Qx10wi19ytfdfKocceTGr2zZ01zQp/5T 9qEilQDt/bjpFZQ93qfXcdAK/u7xd3ZYJ+6ZwYWhrMDiwK9+0iRcdqrqUI6ETdqr UhUMgU0w4ukneWRqfqyhNeZQrQDYpkCsOGCwF+ruW6U3dHOljEKtMD6TyT6gZmX2 aUFrSLEq3LnJVjvAuDq/sHneWMYBIU10cSfl10FrD7G4ypnMLjoY2e5RB2fwlfje sFJthPl0OGt64Ix/cWuDfR6wUIfRi8IB57xajvAPQEUFKHZP2LHrpygBY9aOsPfW fqNoXp52CqznijhLp83kfTGezW8UY7bVNsAJ5a5oKlJ0X4X9d8E= =W/iT -----END PGP SIGNATURE----- Merge tag 'x86_microcode_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode loading updates from Borislav Petkov: "The kernel carries a table of Intel CPUs family, model, stepping, etc tuples which say what is the latest microcode for that particular CPU. Some CPU variants differ only by the platform ID which determines what microcode needs to be loaded on them. Carve out the platform ID handling from the microcode loader and make it available in a more generic place so that the old microcode verification machinery can use it" * tag 'x86_microcode_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode: Add platform mask to Intel microcode "old" list x86/cpu: Add platform ID to CPU matching structure x86/cpu: Add platform ID to CPU info structure x86/microcode: Refactor platform ID enumeration into a helper |
||
|
|
e9635f2a73 |
- We made the FRED support an opt-in initially out of fear of it breaking
machines left and right in the case of a hw bug in the first generation of machines supporting it. Now that that the FRED code has seen a lot of hammering, flip the logic to be opt-out as is the usual case with new hw features -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmndQUQACgkQEsHwGGHe VUpSQRAAqn7bxSqhZoLyzejovGxTSvHqSMaJGDhnkafA1xUNsiAvgONdMyQqxq3r XNQuLST/BCtYqllM7YLfJkKWAV+rpdp6teLA7QnxbU4BFEPwzKQ9OFDeiFDOTBMq 3Bde+t4YbwB05PpwPN8veZEEIXVDxPzzHe9I4M4x5VdsXhDHYVKuonR/DCAK1rpQ UKIXNhDGEcBdSCauOujtNwDZcmYTtcVpxWFkIwzOW0CiM8n/ys6forBTK90W3wg2 geL9r0LZqgBfeYdL16qqFtOCvqpE1pF9ot8JrPaQZa8r/Zw/wJ06faSjQcygQg/d 8gEJzC92eSU/IbwPdJBw/25D5vf/ZXztUgw2DDXCzQ6tBZ7N584YamVpHbh/WFRg degZ2Qu9AnLZquknBwJqp9y3nx8NQvIjsJZ/Q04+USh73LBlFhqHxwOcMGQ7yJHV nPhM9idlzcyTamjiJ1xw9WKx/07sI8P5SNmnte+XWJkbfwM5+mkhIuaRwH7IfRXu y/IODC3ulOLIKYyWN4ybUZnVXbUou5Fsz6o0tv6msAjStUj0O/mv/ImfiDLi6unU 73NQdXzYlFz6YKBvXfQxvjP3/r48i4XWeCc1wJMDAeOegXsBbJRDBuv27bE4kujp +7apQCe8BVyQDk0M+DdlnMUJRt0w6NHNakB671wJl+ooFDBENas= =XX9t -----END PGP SIGNATURE----- Merge tag 'x86_fred_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FRED updates from Borislav Petkov: "We made the FRED support an opt-in initially out of fear of it breaking machines left and right in the case of a hw bug in the first generation of machines supporting it. Now that that the FRED code has seen a lot of hammering, flip the logic to be opt-out as is the usual case with new hw features" * tag 'x86_fred_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fred: Remove kernel log message when initializing exceptions x86/fred: Enable FRED by default |
||
|
|
9f2bb6c7b3 |
- Complete LASS enabling: deal with vsyscall and EFI
- Clean up CPUID usage in newer Intel audio driver -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmndDsMACgkQaDWVMHDJ krAOhRAAi3BsToTbmKayQJKWwVr5aGbLYXe3FiBv5juPprsJDF4otd0ALkdbf5Ls n7nND4L9RBfcGZBu25vo60Y1I1b2JReu9Y8YXxsrEx+WIxTPYClOJa7AqakyNxc4 zbTCTv6TuEyXqG5VWe3AItWrMToc49TvUVoN2fi+V89fLWA8IOkMoIhbAWwQPBFI ir0N4UwC/RtRBFf9qc9jesirqGSgh6SEAK6oBYM3C2PV/njoKljTconaCnF7uJlq WxxVcb07dIvOdpdE941OZ6jyS6eePr2feXOazNjnPb0py5Ai4WCsEGdK68ltQ5fM EhY24Ex6ltudgphB/cajKIKZ8LUCWgnMwTotY8IQMQiQ47JKmyrucdLUwvhl0qTD IAWiOMvb5d3X+CoKt3lXlzc29WhbogXzvxjZE29ad8Dm7DVBEOV57bXDwhnHb7LS rro7odiohsw7FMhfqxlb6NhYzCbdpBbmY3IFzXVgTZlWIXU1g/yBVz/2O0qsKLaw QQ2NKre6lioP2H3pWc0fi53336svzCSuuExCiu47Hx9WcYBX1Poq/AkpAoxPErq+ DRn8lXVwqMpefevlHK9XlZjSpwwwltULmId3LFxh3Z53b1NABrKpVEEJ/VChpzzA xC8dLdu1pbJa2jdccL6mBDYlMvyWLCmKfAlAHdRh8nLmOZxKRHg= =kPr7 -----END PGP SIGNATURE----- Merge tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu updates from Dave Hansen: - Complete LASS enabling: deal with vsyscall and EFI The existing Linear Address Space Separation (LASS) support punted on support for common EFI and vsyscall configs. Complete the implementation by supporting EFI and vsyscall=xonly. - Clean up CPUID usage in newer Intel "avs" audio driver and update the x86-cpuid-db file * tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools/x86/kcpuid: Update bitfields to x86-cpuid-db v3.0 ASoC: Intel: avs: Include CPUID header at file scope ASoC: Intel: avs: Check maximum valid CPUID leaf x86/cpu: Remove LASS restriction on vsyscall emulation x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE x86/vsyscall: Restore vsyscall=xonly mode under LASS x86/traps: Consolidate user fixups in the #GP handler x86/vsyscall: Reorganize the page fault emulation code x86/cpu: Remove LASS restriction on EFI x86/efi: Disable LASS while executing runtime services x86/cpu: Defer LASS enabling until userspace comes up |
||
|
|
0972ba5605 |
x86/platform changes for v7.1:
- Remove M486/M486SX/ELAN support, first minimal step (Ingo Molnar)
- Print AGESA string from DMI additional information entry
(Yazen Ghannam, Mario Limonciello)
- Improve and fix the DMI code (Mario Limonciello):
- Correct an indexing error in <linux/dmi.h>
- Adjust dmi_decode() to use enums <linux/dmi.h>
- Add pr_fmt() for dmi_scan.c to fix & standardize
the log prefixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmncskARHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1htlQ//ftceW6GTd/MbFnpHlm6yJDbxAYVNtRJD
eiaRd/A3p73+ljRuaUNQflkP+4IYpFDNd2ZmwvY2MDVx1lYRVR5xbPLcHqJC7s95
gDN/v4VV4iseiGMxx0hB7MqILgK6eSqrSaiQWBVxIsEHc+Y92tlQvnM+SgFJqaiw
K0ZRDQOWWNWGE0SS5GcMwlzvdg01XGn9S/eVrvSZ1UO81a0IjFltz861z+YxzOP2
b1fo9K4HdIH0C8fYcakJDNsxFzijZKXpCFhC/XwTJ918k5PFFM9plmgW1VDZzAkP
XAnx4m55GmL+s+7vU1fQZZYqvgL02URnAlc7CQpbHEQ+WAJqdj3ACaQCj3OExHVH
rqifH62SvpOhBqLgSTODKwL5X9jGwX3+aOjQ9SsALncVPZIGqqKMc6WXAGeOrNgW
gelIn7p1DO3hKEdP3XtucU8CRcCVs27HSUjaJTAKYFYgNKzzuznXulg0LYzesJIe
85xlDiQxN3AiPuAnYmsM8UVj9jHqDfRjTPPjFHDIi8UAEG0ZJtu98bw4zxdyEmPb
fjbpjap52+iPFPISyZMqfy8n2L24B/SbftnllO/YGWudSgtPPCVt/KLJkG4i2AtJ
YBlbJ3eBfz1gU7hZGN6BSXHXHQuEv4Az3q6ThiZ1Edpw0cEInow5CPowmY7/xqjH
Fjv0QI4iMNY=
=dr9A
-----END PGP SIGNATURE-----
Merge tag 'x86-platform-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
- Remove M486/M486SX/ELAN support, first minimal step (Ingo Molnar)
- Print AGESA string from DMI additional information entry (Yazen
Ghannam, Mario Limonciello)
- Improve and fix the DMI code (Mario Limonciello):
- Correct an indexing error in <linux/dmi.h>
- Adjust dmi_decode() to use enums <linux/dmi.h>
- Add pr_fmt() for dmi_scan.c to fix & standardize the log prefixes
* tag 'x86-platform-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/CPU/AMD: Print AGESA string from DMI additional information entry
firmware: dmi: Add pr_fmt() for dmi_scan.c
firmware: dmi: Adjust dmi_decode() to use enums
firmware: dmi: Correct an indexing error in dmi.h
x86/cpu: Remove M486/M486SX/ELAN support
|
||
|
|
ac633ba77c |
Miscellaneous x86 cleanups for v7.1:
- Consolidate AMD and Hygon cases in parse_topology() (Wei Wang) - asm constraints cleanups in __iowrite32_copy() (Uros Bizjak) - Drop AMD Extended Interrupt LVT macros (Naveen N Rao) - Don't use REALLY_SLOW_IO for delays (Juergen Gross) - paravirt cleanups (Juergen Gross) - FPU code cleanups (Borislav Petkov) - split-lock handling code cleanups (Borislav Petkov, Ronan Pigott) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmncsF0RHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1gZ1A//TmQrq/spNIRx7KqSjT9u9166OaQaBeA/ r535C9n1d/rZxw0l10vIoWeSOpDIXEPLpMguvs463pvgLVfdrNOXABn1Kw1RR6dv yTHV47KUE1j9FIuJ4Y2fQeHhdTC8cdddrC06fEFOezftTMiAMIR/GMaeVA5ExzQd 9tcyocH10gjhtKCF+ILFGt7OdPn75YDIc8ysJAAPrsF6Dw222K5E7p4XedmEYL54 W7WVknLK2jP/BdXp17wDVunQP/Hl7huiM9DMgNlv6eliWV6nyH/hONRm5NrgBUEG s2URPPEu30thveMHQ1qv31P6ZY6lVFi0VylubJ+OdPofUJDCdCINRk22Bc6kXurZ Y8ZV93UyuIgVfvlI9J5UoHSkpi3owMjvrQShquxH2hDbCzzBvwpI7/+KHwWjgVsH 9+xdOkjR40UrlmwhyyzqTzmB10mg2SM1/YK5Ca2DcneibIkQRlfXdNXQqNikWqhN COAEX6U5ayKEu/TjbiNH4zNInJCEQMI65Jiz+oTmdnf+iCQ1L2sp+zSOB6SoyQtp rTyubHDDGu6pq9IEATx3hn5BYO7t6Ly4KJksWCAJ0G8lnP3HRESD9l6QvjqipMWB JToVwWsuqgL3zWqCpuvBOErpHslgzN6Usbym6blyrp8ERKVIb2elDt9lDAWyz5X3 7hS8sNulqDw= =Ox7s -----END PGP SIGNATURE----- Merge tag 'x86-cleanups-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: - Consolidate AMD and Hygon cases in parse_topology() (Wei Wang) - asm constraints cleanups in __iowrite32_copy() (Uros Bizjak) - Drop AMD Extended Interrupt LVT macros (Naveen N Rao) - Don't use REALLY_SLOW_IO for delays (Juergen Gross) - paravirt cleanups (Juergen Gross) - FPU code cleanups (Borislav Petkov) - split-lock handling code cleanups (Borislav Petkov, Ronan Pigott) * tag 'x86-cleanups-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Correct the comment explaining what xfeatures_in_use() does x86/split_lock: Don't warn about unknown split_lock_detect parameter x86/fpu: Correct misspelled xfeaures_to_write local var x86/apic: Drop AMD Extended Interrupt LVT macros x86/cpu/topology: Consolidate AMD and Hygon cases in parse_topology() block/floppy: Don't use REALLY_SLOW_IO for delays x86/paravirt: Replace io_delay() hook with a bool x86/irqflags: Preemptively move include paravirt.h directive where it belongs x86/split_lock: Restructure the unwieldy switch-case in sld_state_show() x86/local: Remove trailing semicolon from _ASM_XADD in local_add_return() x86/asm: Use inout "+" asm onstraint modifiers in __iowrite32_copy() |
||
|
|
db23954eea |
Update for the core interrupt subsystem:
- Invoke add_interrupt_randomness() in handle_percpu_devid_irq() and
cleanup the workaround in the Hyper-V driver, which would now invoke
it twice on ARM64. Removing it from the driver requires to add it to
the x86 system vector entry point.
- Remove the pointles cpu_read_lock() around reading CPU possible mask,
which is read only after init.
- Add documentation for the interaction between device tree bindings and
the interrupt type defines in irq.h.
- Delete stale defines in the matrix allocator and the equivalent in
loongarch.
-----BEGIN PGP SIGNATURE-----
iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmnbt64QHHRnbHhAa2Vy
bmVsLm9yZwAKCRCmGPVMDXSYoY/xD/9hdmaaSXX/JySPatJPgkAhekR8cl/brK9t
K6qhMltg/XoUhmU1y0XSfuNDJwEOa4qvW2DbwfnzknIDr0AXvL39S9oNn+1o9I0x
BjIbWwHTAsuLVyjQesqPWwyQtZ8HJCIM+3Ju2rUYz4jYuY8q15GYsbh6QN0pYDNG
F54/OpuNV42/UhX/O01Gxw930GMGnsMkV6ou9dquap23U4FdlIsoY+zU/b09VEU2
MmJZPGwryPpzhSLbCdOGuWA9oM6wd/FoBFYYU31W5OSXm4B0cRs+weE31WMogYKM
lqudBuyNAhCIuZ06sdSvBPRswdvkuTUJovrJwG2r+rV6h953ExujNn+/ZxqraPg7
iPK70Kwp/O2uyMFhHp/6r1u4bylK/AL7q6hace4cZFLqB/Htx5BW+hh/H7Cz6Yan
H3cyBz9XIE7K5BI3lotKnlWVFwkrHgwYXUzHHrMq/UPdQKog1IPh/E29JekqtR+p
RS9QzSF+Gs45I7oMP+7P8o5jAZYuuW+cODnXLlQkuz/bJff3QeftFhOKZkzbFk5x
AFT0DREttxVCGcUCQaQYrWhFshp63+eDXgKGQT1YULleFUC1f9KH2W+6KlqjDwFR
rvrUgmpMxkH2JJz1owct6vJ6wyffFNQtPeCTmq+7GM87HNm6Ji6AaRzQjRnhVGCt
mh9VbBTnFw==
=FvJG
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core irq updates from Thomas Gleixner:
- Invoke add_interrupt_randomness() in handle_percpu_devid_irq() and
cleanup the workaround in the Hyper-V driver, which would now invoke
it twice on ARM64. Removing it from the driver requires to add it to
the x86 system vector entry point
- Remove the pointles cpu_read_lock() around reading CPU possible mask,
which is read only after init
- Add documentation for the interaction between device tree bindings
and the interrupt type defines in irq.h
- Delete stale defines in the matrix allocator and the equivalent in
loongarch
* tag 'irq-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Drivers: hv: Move add_interrupt_randomness() to hypervisor callback sysvec
genirq/chip: Invoke add_interrupt_randomness() in handle_percpu_devid_irq()
genirq/affinity: Remove cpus_read_lock() while reading cpu_possible_mask
genirq/matrix, LoongArch: Delete IRQ_MATRIX_BITS leftovers
genirq: Document interaction between <linux/irq.h> and DT binding defines
|
||
|
|
a970ed1881 |
bitmap updates for v7.1
- new API: bitmap_weight_from() and bitmap_weighted_xor() (Yury); - drop unused __find_nth_andnot_bit() (Yury); - new tests and test improvements (Andy, Akinobu, Yury); - fixes for count_zeroes API (Yury); - cleanup bitmap_print_to_pagebuf() mess (Yury); - documentation updates (Andy, Kai, Kit). -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmnb8vkACgkQsUSA/Tof vsjzKgv/RI6HDkwRgjT/jPVAZzaNFrdoL0nIQ1ZriyE70b/0HtjMzbQBO0P3Vmsa 5k13Nus0eBi9CeEAK0NvjQXy8NRj4E7favqF3faV7l4+J6STHpOKeHZglUAj00CG +23WGInz+TS5RBjXnvT00wuTAVQjT6dvYng9606psVDF/nlh8ZtXmYDjLauseoUH a1EEKwLGXbk3/MhDgVq/R5RvZoNscL4Hky7QWMZiqLutwF8EDrZotF142tfbxkmW mu+2Bn1W66F+8A42HJBDRevcuvsRzMggP2kXxDk50XNL1zTN9f/4iE0r+/5x8UVF s3WiGnuLSkRIK4osey12Z9BAtGJTn3gTPvIPYOWvRiJHskOa1yvGSgcvmzc53x0Q FZgDq1JkBDsF3OZceSjGIp9QOqg+YJArlzun+mNxLbfnahEbhx21Z/ls65vLJCae ENIPAzet5Fxa8mZeJIyiV0zR05DcV+g64FOhcGJ7al4fRWtYVP8qa9FAyGFMV4L2 JL4xHuRO =pEBo -----END PGP SIGNATURE----- Merge tag 'bitmap-for-v7.1' of https://github.com/norov/linux Pull bitmap updates from Yury Norov: - new API: bitmap_weight_from() and bitmap_weighted_xor() (Yury) - drop unused __find_nth_andnot_bit() (Yury) - new tests and test improvements (Andy, Akinobu, Yury) - fixes for count_zeroes API (Yury) - cleanup bitmap_print_to_pagebuf() mess (Yury) - documentation updates (Andy, Kai, Kit). * tag 'bitmap-for-v7.1' of https://github.com/norov/linux: (24 commits) bitops: Update kernel-doc for sign_extendXX() powerpc/xive: simplify xive_spapr_debug_show() thermal: intel: switch cpumask_get() to using cpumask_print_to_pagebuf() coresight: don't use bitmap_print_to_pagebuf() lib/prime_numbers: drop temporary buffer in dump_primes() drm/xe: switch xe_pagefault_queue_init() to using bitmap_weighted_or() ice: use bitmap_empty() in ice_vf_has_no_qs_ena ice: use bitmap_weighted_xor() in ice_find_free_recp_res_idx() bitmap: introduce bitmap_weighted_xor() bitmap: add test_zero_nbits() bitmap: exclude nbits == 0 cases from bitmap test bitmap: test bitmap_weight() for more asm-generic/bitops: Fix a comment typo in instrumented-atomic.h bitops: fix kernel-doc parameter name for parity8() lib: count_zeros: unify count_{leading,trailing}_zeros() lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() lib: crypto: fix comments for count_leading_zeros() x86/topology: use bitmap_weight_from() bitmap: add bitmap_weight_from() lib/find_bit_benchmark: avoid clearing randomly filled bitmap in test_find_first_bit() ... |
||
|
|
d7c8087a9c |
Power management updates for 7.1-rc1
- Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)
- Update cpufreq-dt-platdev blocklist (Faruque Ansari)
- Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
Rosen Penev)
- Add MAINTAINERS entry for CPPC driver (Viresh Kumar)
- Add support for new features: CPPC performance priority, Dynamic EPP,
Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy,
Mario Limonciello)
- Fix sysfs files being present when HW missing and broken/outdated
documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)
- Pass the policy to cpufreq_driver->adjust_perf() to avoid using
cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which
leads to a scheduling-while-atomic bug (K Prateek Nayak)
- Clean up dead code in Kconfig for cpufreq (Julian Braha)
- Remove max_freq_req update for pre-existing cpufreq policy and add a
boost_freq_req QoS request to save the boost constraint instead of
overwriting the last scaling_max_freq constraint (Pierre Gondois)
- Embed cpufreq QoS freq_req objects in cpufreq policy so they all
are allocated in one go along with the policy to simplify lifetime
rules and avoid error handling issues (Viresh Kumar)
- Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
scaling driver (Henry Tseng)
- Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
of cpumask_weight() because the former is more efficient (Yury Norov)
- Use sysfs_emit() in sysfs show functions for cpufreq governor
attributes (Thorsten Blum)
- Update intel_pstate to stop returning an error when "off" is written
to its status sysfs attribute while the driver is already off (Fabio
De Francesco)
- Include current frequency in the debug message printed by
__cpufreq_driver_target() (Pengjie Zhang)
- Refine stopped tick handling in the menu cpuidle governor and
rearrange stopped tick handling in the teo cpuidle governor (Rafael
Wysocki)
- Add Panther Lake C-states table to the intel_idle driver (Artem
Bityutskiy)
- Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)
- Simplify cpuidle_register_device() with guard() (Huisong Li)
- Use performance level if available to distinguish between rates in
OPP debugfs (Manivannan Sadhasivam)
- Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)
- Return -ENODATA if the snapshot image is not loaded (Alberto Garcia)
- Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
Biggers)
- Clean up and rearrange the intel_rapl power capping driver to make
the respective interface drivers (TPMI, MSR, and MMOI) hold their
own settings and primitives and consolidate PL4 and PMU support
flags into rapl_defaults (Kuppuswamy Sathyanarayanan)
- Correct kernel-doc function parameter names in the power capping core
code (Randy Dunlap)
- Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)
- Use _visible attribute to replace create/remove_sysfs_files() in
devfreq (Pengjie Zhang)
- Add Tegra114 support to activity monitor device in tegra30-devfreq as
a preparation to upcoming EMC controller support (Svyatoslav Ryhel)
- Fix mistakes in cpupower man pages, add the boost and epp options to
the cpupower-frequency-info man page, and add the perf-bias option to
the cpupower-info man page (Roberto Ricci)
- Remove unnecessary extern declarations from getopt.h in arguments
parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmnY9TISHHJqd0Byand5
c29ja2kubmV0AAoJEO5fvZ0v1OO1G9gH/j5mEqfPpiwX6fQ/ZwOGdNOOPVA5w9j4
KPHSMwMD5lZkoaZfasp2vt27KY5SOoVVvRZ2DKkFJ3Jai4I3cUPZYypga2nre1ag
tgzX4vOjcw2r40Eda6ezWl1h4mca/xJJBX7xH2+hn1JY+Y1in37g50CqMIjKh96z
Uugkk6UZytL1XcF55PMhIUgDf6pDtRT5UOW9xOKOkUt8FVWTJ7ei3HaWyV5kDmVq
b5eQ42+OH7y6sWNnoKczFd8fStvh6J/avoJurBEvcOQhMcjaIaB48G19+KjDg73E
NjrVcgG20P2rltBvV2d0J1TKskZHkaP7XjIeWfkwjGZhee3FL7ssS/g=
=fRCO
-----END PGP SIGNATURE-----
Merge tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"Once again, cpufreq is the most active development area, mostly
because of the new feature additions and documentation updates in the
amd-pstate driver, but there are also changes in the cpufreq core
related to boost support and other assorted updates elsewhere.
Next up are power capping changes due to the major cleanup of the
Intel RAPL driver.
On the cpuidle front, a new C-states table for Intel Panther Lake is
added to the intel_idle driver, the stopped tick handling in the menu
and teo governors is updated, and there are a couple of cleanups.
Apart from the above, support for Tegra114 is added to devfreq and
there are assorted cleanups of that code, there are also two updates
of the operating performance points (OPP) library, two minor updates
related to hibernation, and cpupower utility man pages updates and
cleanups.
Specifics:
- Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)
- Update cpufreq-dt-platdev blocklist (Faruque Ansari)
- Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
Rosen Penev)
- Add MAINTAINERS entry for CPPC driver (Viresh Kumar)
- Add support for new features: CPPC performance priority, Dynamic
EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham
Shenoy, Mario Limonciello)
- Fix sysfs files being present when HW missing and broken/outdated
documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)
- Pass the policy to cpufreq_driver->adjust_perf() to avoid using
cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate
which leads to a scheduling-while-atomic bug (K Prateek Nayak)
- Clean up dead code in Kconfig for cpufreq (Julian Braha)
- Remove max_freq_req update for pre-existing cpufreq policy and add
a boost_freq_req QoS request to save the boost constraint instead
of overwriting the last scaling_max_freq constraint (Pierre
Gondois)
- Embed cpufreq QoS freq_req objects in cpufreq policy so they all
are allocated in one go along with the policy to simplify lifetime
rules and avoid error handling issues (Viresh Kumar)
- Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
scaling driver (Henry Tseng)
- Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
of cpumask_weight() because the former is more efficient (Yury
Norov)
- Use sysfs_emit() in sysfs show functions for cpufreq governor
attributes (Thorsten Blum)
- Update intel_pstate to stop returning an error when "off" is
written to its status sysfs attribute while the driver is already
off (Fabio De Francesco)
- Include current frequency in the debug message printed by
__cpufreq_driver_target() (Pengjie Zhang)
- Refine stopped tick handling in the menu cpuidle governor and
rearrange stopped tick handling in the teo cpuidle governor (Rafael
Wysocki)
- Add Panther Lake C-states table to the intel_idle driver (Artem
Bityutskiy)
- Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)
- Simplify cpuidle_register_device() with guard() (Huisong Li)
- Use performance level if available to distinguish between rates in
OPP debugfs (Manivannan Sadhasivam)
- Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)
- Return -ENODATA if the snapshot image is not loaded (Alberto
Garcia)
- Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
Biggers)
- Clean up and rearrange the intel_rapl power capping driver to make
the respective interface drivers (TPMI, MSR, and MMOI) hold their
own settings and primitives and consolidate PL4 and PMU support
flags into rapl_defaults (Kuppuswamy Sathyanarayanan)
- Correct kernel-doc function parameter names in the power capping
core code (Randy Dunlap)
- Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)
- Use _visible attribute to replace create/remove_sysfs_files() in
devfreq (Pengjie Zhang)
- Add Tegra114 support to activity monitor device in tegra30-devfreq
as a preparation to upcoming EMC controller support (Svyatoslav
Ryhel)
- Fix mistakes in cpupower man pages, add the boost and epp options
to the cpupower-frequency-info man page, and add the perf-bias
option to the cpupower-info man page (Roberto Ricci)
- Remove unnecessary extern declarations from getopt.h in arguments
parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)"
* tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
cpupower: remove extern declarations in cmd functions
cpuidle: Simplify cpuidle_register_device() with guard()
PM / devfreq: tegra30-devfreq: add support for Tegra114
PM / devfreq: use _visible attribute to replace create/remove_sysfs_files()
PM / devfreq: Remove unneeded casting for HZ_PER_KHZ
MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
cpufreq/amd-pstate-ut: Add a unit test for raw EPP
cpufreq/amd-pstate: Add support for raw EPP writes
cpufreq/amd-pstate: Add support for platform profile class
cpufreq/amd-pstate: add kernel command line to override dynamic epp
cpufreq/amd-pstate: Add dynamic energy performance preference
Documentation: amd-pstate: fix dead links in the reference section
cpufreq/amd-pstate: Cache the max frequency in cpudata
Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
...
|
||
|
|
4a530993da |
KVM x86 VMXON and EFER.SVME extraction for 7.1
Move _only_ VMXON+VMXOFF and EFER.SVME toggling out of KVM (versus all of VMX and SVM enabling) out of KVM and into the core kernel so that non-KVM TDX enabling, e.g. for trusted I/O, can make SEAMCALLs without needing to ensure KVM is fully loaded. TDX isn't a hypervisor, and isn't trying to be a hypervisor. Specifically, TDX should _never_ have it's own VMCSes (that are visible to the host; the TDX-Module has it's own VMCSes to do SEAMCALL/SEAMRET), and so there is simply no reason to move that functionality out of KVM. With that out of the way, dealing with VMXON/VMXOFF and EFER.SVME is a fairly simple refcounting game. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmnZJkYACgkQOlYIJqCj N/21chAAjg9tb/E8+vqBZDT5vO9Bu6c333irV2vqBBJZWUx6xKhtk77kL6kISWyf aI57hJ5IwbUkfDcomSY+MyRXxw/X4OioSs5qqvcC2XHatGA8XwifJE47cN5ZT0+D hzZjru8Z9VGHf5wUXS41yTHtm+INiEYMgJiseUQR6sbWx3H+zDcLIooNQx/ZLYrV vR+VPtaMYpJ0TTDDqb8PrCnjgXoXFenAnzAj9bAikWP60kaDXrxN9KPc5woDo29+ TrkTyr2mmQvKpNhLCDwAMNa9bXxgzkHEGx8J2WZTbUi9ZBv4MwVsnGLLsaUKQlaa 4V1JDiICzYptjMzU+ka4iTF+m0KEz4EykP7mVVK+5MAHc0NOUVfDW6JP2PM/66dh NyyjGhbrfH0PwqzDn4N2h0MmWT4YNCIxESClecEMtEzsCyWfYOMitxbDbzHnu9Vw a/C0pwWKJ34Trr0O79SevAWJBlu596mya0YvMeCAWxCvSUGknbo5IXdrmtp6htGp Gz5+0ZyvVRbYpwxS+OOpWMkZuPvvEcWTbMAG/scbSHh80P/uCVyuLsRZR2HSB8EV tYnnLDDDQ1KmLV7xmw5XnkN9hFffAM8eXA7KX9TPjCXjd25lCJGgquQEH0oAHe5q 1qXf+lWttP7MIbD5/Ga5CO+FqXAE6xmFRWjEBgLx32kSAWXqxPs= =SuxR -----END PGP SIGNATURE----- Merge tag 'kvm-x86-vmxon-7.1' of https://github.com/kvm-x86/linux into HEAD KVM x86 VMXON and EFER.SVME extraction for 7.1 Move _only_ VMXON+VMXOFF and EFER.SVME toggling out of KVM (versus all of VMX and SVM enabling) out of KVM and into the core kernel so that non-KVM TDX enabling, e.g. for trusted I/O, can make SEAMCALLs without needing to ensure KVM is fully loaded. TIO isn't a hypervisor, and isn't trying to be a hypervisor. Specifically, TIO should _never_ have it's own VMCSes (that are visible to the host; the TDX-Module has it's own VMCSes to do SEAMCALL/SEAMRET), and so there is simply no reason to move that functionality out of KVM. With that out of the way, dealing with VMXON/VMXOFF and EFER.SVME is a fairly simple refcounting game. |
||
|
|
fbe80bd699 |
x86/split_lock: Don't warn about unknown split_lock_detect parameter
The split_lock_detect command line parameter is handled in sld_setup() shortly after cpu_parse_early_param() but still before parse_early_param(). Add a dummy parsing function so that parse_early_param() doesn't later complain about the "unknown" parameter split_lock_detect=, and pass it along to init. [ bp: Massage commit message. ] Signed-off-by: Ronan Pigott <ronan@rjp.ie> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260405181807.3906-1-ronan@rjp.ie |
||
|
|
52a9e9cd18 |
mm: rename zap_vma_ptes() to zap_special_vma_range()
zap_vma_ptes() is the only zapping function we export to modules. It's essentially a wrapper around zap_vma_range(), however, with some safety checks: * That the passed range fits fully into the VMA * That it's only used for VM_PFNMAP We will add support for VM_MIXEDMAP next, so use the more-generic term "special vma", although "special" is a bit overloaded. Maybe we'll later just support any VM_SPECIAL flag. While at it, improve the kerneldoc. Link: https://lkml.kernel.org/r/20260227200848.114019-16-david@kernel.org Signed-off-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Leon Romanovsky <leon@kernel.org> [drivers/infiniband] Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Arve <arve@android.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Carlos Llamas <cmllamas@google.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Dave Airlie <airlied@gmail.com> Cc: David Ahern <dsahern@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hartley Sweeten <hsweeten@visionengravers.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ian Abbott <abbotti@mev.co.uk> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jakub Kacinski <kuba@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jann Horn <jannh@google.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Namhyung kim <namhyung@kernel.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Todd Kjos <tkjos@android.com> Cc: Tvrtko Ursulin <tursulin@ursulin.net> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
0422b07bc4 |
x86/mce/amd: Filter bogus hardware errors on Zen3 clients
Users have been observing multiple L3 cache deferred errors after recent
kernel rework of deferred error handling.¹ ⁴
The errors are bogus due to inconsistent status values. Also, user verified
that bogus MCA_DESTAT values are present on the system even with an older
kernel.²
The errors seem to be garbage values present in the MCA_DESTAT of some L3
cache banks. These were implicitly ignored before the recent kernel rework
because these do not generate a deferred error interrupt.
A later revision of the rework patch was merged for v6.19. This naturally
filtered out most of the bogus error logs. However, a few signatures still
remain.³
Minimize the scope of the filter to the reported CPU
family/model/stepping and only for errors which don't have the Enabled
bit in the MCi status MSR.
¹ https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
² https://lore.kernel.org/6e1eda7dd55f6fa30405edf7b0f75695cf55b237.camel@web.de
³ https://lore.kernel.org/21ba47fa8893b33b94370c2a42e5084cf0d2e975.camel@web.de
⁴ https://lore.kernel.org/r/CAKFB093B2k3sKsGJ_QNX1jVQsaXVFyy=wNwpzCGLOXa_vSDwXw@mail.gmail.com
[ bp: Generalize the condition according to which errors are bogus. ]
Fixes:
|
||
|
|
e8be82c2d7 |
Drivers: hv: Move add_interrupt_randomness() to hypervisor callback sysvec
The Hyper-V ISRs, for normal guests and when running in the hypervisor root patition, are calling add_interrupt_randomness() as a primary source of entropy. The call is currently in the ISRs as a common place to handle both x86/x64 and arm64. On x86/x64, hypervisor interrupts come through a custom sysvec entry, and do not go through a generic interrupt handler. On arm64, hypervisor interrupts come through an emulated GICv3. GICv3 uses the generic handler handle_percpu_devid_irq(), which does not do add_interrupt_randomness() -- unlike its counterpart handle_percpu_irq(). But handle_percpu_devid_irq() is now updated to do the add_interrupt_randomness(). So add_interrupt_randomness() is now needed only in Hyper-V's x86/x64 custom sysvec path. Move add_interrupt_randomness() from the Hyper-V ISRs into the Hyper-V x86/x64 custom sysvec path, matching the existing STIMER0 sysvec path. With this change, add_interrupt_randomness() is no longer called from any device drivers, which is appropriate. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://patch.msgid.link/20260402202400.1707-3-mhklkml@zohomail.com |
||
|
|
5cdfedf68e |
amd-pstate new content for 7.1 (2026-04-02)
Add support for new features:
* CPPC performance priority
* Dynamic EPP
* Raw EPP
* New unit tests for new features
Fixes for:
* PREEMPT_RT
* sysfs files being present when HW missing
* Broken/outdated documentation
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEECwtuSU6dXvs5GA2aLRkspiR3AnYFAmnOpNgTHHN1cGVybTFA
a2VybmVsLm9yZwAKCRAtGSymJHcCduR7EADexgetxq0l6/iV2DyI1/YJcf+cNPoS
yxE93vN9i3A2xcx87klncVF0C2zIZaZFkp6o7VY/AReL/UyUOh6snz371OXBl7pm
A/uppkT5QdzTpmknJMyqkLRlHfkMjNRzWv4sdh4kyJSB3SkgaN7zSVi6Zxamt/vJ
VNCgExZQeDqk4VL2X/NBfaBagYSnPnBmBdXoY6aPYqFrqKj4SlDxYNbJsQlcyE9Z
z0naVGb5YPEJOaMvE+5z+DwX4EmtN3si+vfi8VuQOXPnoDGOG763rpMLnz7xYvfW
poPu2fnitN39MaT96btRShD6XuCg9eaPAEmpb3j6c93n1kUo+joLLbalhfc0HMeL
1/8ndz+KatEUMQTCVgs8cboob1PpRvqhIb+vrs6aTEqCsgqUKUZ7GYgglBamyRka
mivC5Q+ssCxq47/ilGfECFr8vK0oV3rTu9Ltp4MS5zN70tI0YYZk3o1454nY5dhc
Byv5e9bft/n9AA576y5vXENcWCSez/8UFGl5RjoxQZ7SFKNFnbSic1BT4uMRVX/G
4QUk5TWwC8WdOp7YsO30LwZ0y9vtxmfBn8BF/6n/dYGhM1/DVQ1nX9iyzhCHZ3XH
fgyrkUktdI1dsm/xKvbqxK9Djw0tkMsfH1yI6iQccefnlo4gRSvTRFiM2yepY6py
E8MZpz1ML8T2Pw==
=XTdh
-----END PGP SIGNATURE-----
Merge tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Pull amd-pstate new content for 7.1 (2026-04-02) from Mario Limonciello:
"Add support for new features:
* CPPC performance priority
* Dynamic EPP
* Raw EPP
* New unit tests for new features
Fixes for:
* PREEMPT_RT
* sysfs files being present when HW missing
* Broken/outdated documentation"
* tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: (22 commits)
MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
cpufreq/amd-pstate-ut: Add a unit test for raw EPP
cpufreq/amd-pstate: Add support for raw EPP writes
cpufreq/amd-pstate: Add support for platform profile class
cpufreq/amd-pstate: add kernel command line to override dynamic epp
cpufreq/amd-pstate: Add dynamic energy performance preference
Documentation: amd-pstate: fix dead links in the reference section
cpufreq/amd-pstate: Cache the max frequency in cpudata
Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
amd-pstate-ut: Add module parameter to select testcases
amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
amd-pstate: Add sysfs support for floor_freq and floor_count
amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
x86/cpufeatures: Add AMD CPPC Performance Priority feature.
amd-pstate: Make certain freq_attrs conditionally visible
...
|
||
|
|
5635c8bfd3 |
x86/apic: Drop AMD Extended Interrupt LVT macros
AMD defines Extended Interrupt Local Vector Table (EILVT) registers to allow for additional interrupt sources. While the APIC registers for those are unique to AMD, the format of those registers follows the standard LVT registers. Drop EILVT-specific macros in favor of the standard APIC LVT macros. Drop unused APIC_EILVT_NR_AMD_K8 and APIC_EILVT_LVTOFF while at it. No functional change. [ bp: Merge the two cleanup patches into one. ] Signed-off-by: Naveen N Rao (AMD) <naveen@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Manali Shukla <manali.shukla@amd.com> Link: https://patch.msgid.link/b98d69037c0102d2ccd082a941888a689cd214c9.1775019269.git.naveen@kernel.org |
||
|
|
172100088f |
x86/cpufeatures: Add AMD CPPC Performance Priority feature.
Some future AMD processors have feature named "CPPC Performance Priority" which lets userspace specify different floor performance levels for different CPUs. The platform firmware takes these different floor performance levels into consideration while throttling the CPUs under power/thermal constraints. The presence of this feature is indicated by bit 16 of the EDX register for CPUID leaf 0x80000007. More details can be found in AMD Publication titled "AMD64 Collaborative Processor Performance Control (CPPC) Performance Priority" Revision 1.10. Define a new feature bit named X86_FEATURE_CPPC_PERF_PRIO to map to CPUID 0x80000007.EDX[16]. Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> |
||
|
|
bc91133e26 |
x86/CPU/AMD: Print AGESA string from DMI additional information entry
Type 40 entries (Additional Information) are summarized in section 7.41 as part of the SMBIOS specification. Generally, these entries aren't interesting to save. However on some AMD Zen systems, the AGESA version is stored here. This is useful to save to the kernel message logs for debugging. It can be used to cross-reference issues. Implement an iterator for the Additional Information entries. Use this to find and print the AGESA string. Do so in AMD code, since the use case is AMD-specific. [ bp: Match only "AGESA". ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Co-developed-by: "Mario Limonciello (AMD)" <superm1@kernel.org> Signed-off-by: "Mario Limonciello (AMD)" <superm1@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Jean Delvare <jdelvare@suse.de> Link: https://patch.msgid.link/20260307141024.819807-6-superm1@kernel.org |
||
|
|
ac66a73be0 |
x86/fred: Enable FRED by default
When FRED was added to the mainline kernel, it was set up as an explicit opt-in due to the risk of regressions before hardware was available publicly. Now, Panther Lake (Core Ultra 300 series) has been released, and benchmarking by Phoronix has shown that it provides a significant performance benefit on most workloads: https://www.phoronix.com/review/intel-fred-panther-lake Accordingly, enable FRED by default if the CPU supports it. FRED can of course still be disabled via the fred=off command line option. Touch up Kconfig help too. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://patch.msgid.link/20260325230151.1898287-2-hpa@zytor.com |
||
|
|
c6a98dff41 |
x86/topology: use bitmap_weight_from()
Switch topo_unit_count() to use bitmap_weight_from(). Signed-off-by: Yury Norov <ynorov@nvidia.com> |
||
|
|
6a9fe1ad90 |
x86/cpu/topology: Consolidate AMD and Hygon cases in parse_topology()
Merge the two separate switch cases for AMD and Hygon as they share the common cpu_parse_topology_amd(). Also drop the IS_ENABLED(CONFIG_CPU_SUP_AMD/HYGON) guards, because 1) they are dead code: when a vendor's CONFIG_CPU_SUP_* is disabled, its vendor detection code (in amd.c / hygon.c) is not compiled, so x86_vendor will never be set to X86_VENDOR_AMD / X86_VENDOR_HYGON, instead it will default to X86_VENDOR_UNKNOWN and those switch cases are unreachable. 2) topology_amd.o is always built (obj-y), so cpu_parse_topology_amd() is always available regardless of CPU_SUP_* configuration. Signed-off-by: Wei Wang <wei.w.wang@hotmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Yongwei Xu <xuyongwei@open-hieco.net> Link: https://patch.msgid.link/SI2PR01MB4393D6B7E17AB05612AEE925DC4BA@SI2PR01MB4393.apcprd01.prod.exchangelabs.com |
||
|
|
a3e93cac25 |
x86/cpu: Add comment clarifying CRn pinning
To avoid future confusion on the purpose and design of the CRn pinning code. Also note that if the attacker controls page-tables, the CRn bits lose much of the attraction anyway. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260320092521.GG3739106@noisy.programming.kicks-ass.net |
||
|
|
411df123c0 |
x86/cpu: Remove X86_CR4_FRED from the CR4 pinned bits mask
Commit in Fixes added the FRED CR4 bit to the CR4 pinned bits mask so
that whenever something else modifies CR4, that bit remains set. Which
in itself is a perfectly fine idea.
However, there's an issue when during boot FRED is initialized: first on
the BSP and later on the APs. Thus, there's a window in time when
exceptions cannot be handled.
This becomes particularly nasty when running as SEV-{ES,SNP} or TDX
guests which, when they manage to trigger exceptions during that short
window described above, triple fault due to FRED MSRs not being set up
yet.
See Link tag below for a much more detailed explanation of the
situation.
So, as a result, the commit in that Link URL tried to address this
shortcoming by temporarily disabling CR4 pinning when an AP is not
online yet.
However, that is a problem in itself because in this case, an attack on
the kernel needs to only modify the online bit - a single bit in RW
memory - and then disable CR4 pinning and then disable SM*P, leading to
more and worse things to happen to the system.
So, instead, remove the FRED bit from the CR4 pinning mask, thus
obviating the need to temporarily disable CR4 pinning.
If someone manages to disable FRED when poking at CR4, then
idt_invalidate() would make sure the system would crash'n'burn on the
first exception triggered, which is a much better outcome security-wise.
Fixes:
|
||
|
|
05243d490b |
x86/cpu: Enable FSGSBASE early in cpu_init_exception_handling()
Move FSGSBASE enablement from identify_cpu() to cpu_init_exception_handling()
to ensure it is enabled before any exceptions can occur on both boot and
secondary CPUs.
== Background ==
Exception entry code (paranoid_entry()) uses ALTERNATIVE patching based on
X86_FEATURE_FSGSBASE to decide whether to use RDGSBASE/WRGSBASE instructions
or the slower RDMSR/SWAPGS sequence for saving/restoring GSBASE.
On boot CPU, ALTERNATIVE patching happens after enabling FSGSBASE in CR4.
When the feature is available, the code is permanently patched to use
RDGSBASE/WRGSBASE, which require CR4.FSGSBASE=1 to execute without triggering
== Boot Sequence ==
Boot CPU (with CR pinning enabled):
trap_init()
cpu_init() <- Uses unpatched code (RDMSR/SWAPGS)
x2apic_setup()
...
arch_cpu_finalize_init()
identify_boot_cpu()
identify_cpu()
cr4_set_bits(X86_CR4_FSGSBASE) # Enables the feature
# This becomes part of cr4_pinned_bits
...
alternative_instructions() <- Patches code to use RDGSBASE/WRGSBASE
Secondary CPUs (with CR pinning enabled):
start_secondary()
cr4_init() <- Code already patched, CR4.FSGSBASE=1
set implicitly via cr4_pinned_bits
cpu_init() <- exceptions work because FSGSBASE is
already enabled
Secondary CPU (with CR pinning disabled):
start_secondary()
cr4_init() <- Code already patched, CR4.FSGSBASE=0
cpu_init()
x2apic_setup()
rdmsrq(MSR_IA32_APICBASE) <- Triggers #VC in SNP guests
exc_vmm_communication()
paranoid_entry() <- Uses RDGSBASE with CR4.FSGSBASE=0
(patched code)
...
ap_starting()
identify_secondary_cpu()
identify_cpu()
cr4_set_bits(X86_CR4_FSGSBASE) <- Enables the feature, which is
too late
== CR Pinning ==
Currently, for secondary CPUs, CR4.FSGSBASE is set implicitly through
CR-pinning: the boot CPU sets it during identify_cpu(), it becomes part of
cr4_pinned_bits, and cr4_init() applies those pinned bits to secondary CPUs.
This works but creates an undocumented dependency between cr4_init() and the
pinning mechanism.
== Problem ==
Secondary CPUs boot after alternatives have been applied globally. They
execute already-patched paranoid_entry() code that uses RDGSBASE/WRGSBASE
instructions, which require CR4.FSGSBASE=1. Upcoming changes to CR pinning
behavior will break the implicit dependency, causing secondary CPUs to
generate #UD.
This issue manifests itself on AMD SEV-SNP guests, where the rdmsrq() in
x2apic_setup() triggers a #VC exception early during cpu_init(). The #VC
handler (exc_vmm_communication()) executes the patched paranoid_entry() path.
Without CR4.FSGSBASE enabled, RDGSBASE instructions trigger #UD.
== Fix ==
Enable FSGSBASE explicitly in cpu_init_exception_handling() before loading
exception handlers. This makes the dependency explicit and ensures both
boot and secondary CPUs have FSGSBASE enabled before paranoid_entry()
executes.
Fixes:
|
||
|
|
8d8bd2a5aa |
Miscellaneous x86 fixes:
- Improve Qemu MCE-injection behavior by only using
AMD SMCA MSRs if the feature bit is set.
- Fix the relative path of gettimeofday.c inclusion
in vclock_gettime.c
- Fix a boot crash on UV clusters when a socket is
marked as 'deconfigured' which are mapped to the
SOCK_EMPTY (0xffff [short -1]) node ID by the UV firmware,
while Linux APIs expect NUMA_NO_NODE (0xffffffff [int -1]).
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmm/qYwRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1imUBAAmL2xQCgOZCeLPWFZ28S5EiKf5xvyC9R4
uIm/WncXYQDJZN0JaTABdjmLiaBQnlyULpmqN47Xuiy8avMO532S92yreFpWR2OB
5TEE/v9+wcbSQOJBELhch3XNzUu7cNPQ+HbOuhot4rpK/MlyJ8rHaHLYVSVhRBYG
ynM3iTDvD6ENtyAOGT1Afkxg/sqMOZG8jdwrN1z8BH3AMU8BJ6OgfguqKuEi99Vp
Lz31a3bR0LcRPZ4ECKH0ModcKgjkqcgnVccOYCDtq8XReciGQo1Zg0diiV6VRn2a
hAqoSvdd9e1XUGoU8dEi0KwR5YXYJh3uOmjSH56rbceXr5XxPcKRJ4mOOX0GX+KC
KZl5KdwJo4veHodxWTYFe+67hi84BwKoB4gM36nRhDmzLS/XHoI4RUOjZVXSPVzJ
dwBBZDKd9tP6aaFruSa+v5sH7EbVartwLuEkQx6SDydS/4htGbESKTxgdqr63gUZ
jwmFRVaPo3FUWkOhS9GExt6XewhBVaJ9/cHZkcpW0vrm9RBlFoyTX9YLfkF4x9O1
h3wxXsULyZuYn5/tunIFfDsLpcmgsz85NXnW+0A1URAAWFNRYfVIKxaO7tUUF2OA
asnOe0NrztvKoZfKzCi7s14HVCPT34aEp3Vf2Am/QRMy5JYLhS3xAwApiZ/hrZ4R
Kb+4dVfh12Q=
=uIqH
-----END PGP SIGNATURE-----
Merge tag 'x86-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Improve Qemu MCE-injection behavior by only using AMD SMCA MSRs if
the feature bit is set
- Fix the relative path of gettimeofday.c inclusion in vclock_gettime.c
- Fix a boot crash on UV clusters when a socket is marked as
'deconfigured' which are mapped to the SOCK_EMPTY node ID by
the UV firmware, while Linux APIs expect NUMA_NO_NODE.
The difference being (0xffff [unsigned short ~0]) vs [int -1]
* tag 'x86-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/uv: Handle deconfigured sockets
x86/entry/vdso: Fix path of included gettimeofday.c
x86/mce/amd: Check SMCA feature bit before accessing SMCA MSRs
|
||
|
|
9eece49856 |
x86/paravirt: Replace io_delay() hook with a bool
The io_delay() paravirt hook is in no way performance critical and all users setting it to a different function than native_io_delay() are using an empty function as replacement. Allow replacing the hook with a bool indicating whether native_io_delay() should be called. [ bp: Massage commit message. ] Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/20260119182632.596369-3-jgross@suse.com |
||
|
|
c3d13784d5 |
hyperv-fixes for v7.0-rc5
-----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmm80s8THHdlaS5saXVA a2VybmVsLm9yZwAKCRB2FHBfkEGgXuD2B/9lj5tU54YNxRkB7iLjicrX6RfYTTC2 xynQTHRQhAOEBF3woPR7BkTLVBvRyf10fplRP/nSxpAUAzdulz99rprBGlckEJji VlBhvuWz3fQ5XWsAovWaghhcSw9juf9k2x1x2VudwSLKcsgtqft96JQb4JxnuH02 cNJLeprLuKRwafqF+Wmwj+o4uWEe78Sa7yDg6vai10+e2Q4sE+QrvyXqEJGFDdjj W4+aQmeAElvYTxKRGMrUsEaraif/BXNRmihCD96b9v8/w0BIZjPIVGvKE0NA4BKS rkPxs8KfD2Gh7dSvBvvnptd1+36XeyXsW+nFmndumwxqY2NKHuuA9fAt =Cqrd -----END PGP SIGNATURE----- Merge tag 'hyperv-fixes-signed-20260319' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V fixes from Wei Liu: - Fix ARM64 MSHV support (Anirudh Rayabharam) - Fix MSHV driver memory handling issues (Stanislav Kinsburskii) - Update maintainers for Hyper-V DRM driver (Saurabh Sengar) - Misc clean up in MSHV crashdump code (Ard Biesheuvel, Uros Bizjak) - Minor improvements to MSHV code (Mukesh R, Wei Liu) - Revert not yet released MSHV scrub partition hypercall (Wei Liu) * tag 'hyperv-fixes-signed-20260319' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: mshv: Fix error handling in mshv_region_pin MAINTAINERS: Update maintainers for Hyper-V DRM driver mshv: Fix use-after-free in mshv_map_user_memory error path mshv: pass struct mshv_user_mem_region by reference x86/hyperv: Use any general-purpose register when saving %cr2 and %cr8 x86/hyperv: Use current_stack_pointer to avoid asm() in hv_hvcrash_ctxt_save() x86/hyperv: Save segment registers directly to memory in hv_hvcrash_ctxt_save() x86/hyperv: Use __naked attribute to fix stackless C function Revert "mshv: expose the scrub partition hypercall" mshv: add arm64 support for doorbell & intercept SINTs mshv: refactor synic init and cleanup x86/hyperv: print out reserved vectors in hexadecimal |
||
|
|
584d752b8a |
x86/cpu: Remove LASS restriction on vsyscall emulation
Vsyscall emulation has two modes of operation: XONLY and EMULATE. The default XONLY mode is now supported with a LASS-triggered #GP. OTOH, LASS is disabled if someone requests the deprecated EMULATE mode via the vsyscall=emulate command line option. So, remove the restriction on LASS when the overall vsyscall emulation support is compiled in. As a result, there is no need for setup_lass() anymore. LASS is enabled by default through a late_initcall(). Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com> Reviewed-by: Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Link: https://patch.msgid.link/20260309181029.398498-6-sohil.mehta@intel.com |
||
|
|
201bc182ad |
x86/mce/amd: Check SMCA feature bit before accessing SMCA MSRs
People do effort to inject MCEs into guests in order to simulate/test
handling of hardware errors. The real use case behind it is testing the
handling of SIGBUS which the memory failure code sends to the process.
If that process is QEMU, instead of killing the whole guest, the MCE can
be injected into the guest kernel so that latter can attempt proper
handling and kill the user *process* in the guest, instead, which
caused the MCE. The assumption being here that the whole injection flow
can supply enough information that the guest kernel can pinpoint the
right process. But that's a different topic...
Regardless of virtualization or not, access to SMCA-specific registers
like MCA_DESTAT should only be done after having checked the smca
feature bit. And there are AMD machines like Bulldozer (the one before
Zen1) which do support deferred errors but are not SMCA machines.
Therefore, properly check the feature bit before accessing related MSRs.
[ bp: Rewrite commit message. ]
Fixes:
|
||
|
|
04e43ec9f0 |
x86/split_lock: Restructure the unwieldy switch-case in sld_state_show()
Split the handling in two parts: 1. handle the sld_state option first 2. handle X86_FEATURE flag-based printing afterwards This splits the function nicely into two, separate logical things which are easier to parse and understand. Also, zap the printing in the disabled case. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://patch.msgid.link/20260226145033.GAaaBduQ0rWXydOkAm@fat_crate.local |
||
|
|
b90d398138 |
x86/mce, EDAC/mce_amd: Add new SMCA bank types
Recognize new SMCA bank types and include their short names for sysfs and long names for decoding. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260307163316.345923-4-yazen.ghannam@amd.com |
||
|
|
b595a00972 |
x86/mce, EDAC/mce_amd: Update CS bank type naming
Recent documentation updated the "CS" bank type name from "Coherent Slave" to "Coherent Station". Apply this change in the kernel also. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260307163316.345923-3-yazen.ghannam@amd.com |
||
|
|
bee9f4178b |
x86/mce, EDAC/mce_amd: Reorder SMCA bank type enums
Originally, the SMCA bank type enums were ordered based on processor documentation. However, the ordering became inconsistent after new bank types were added over time. Sort the bank type enums alphanumerically in most places. Sort the "enum to HWID/McaType" mapping by HWID/McaType. Drop redundant code comments. No functional changes. [ bp: Sort them alphanumerically. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260307163316.345923-2-yazen.ghannam@amd.com |
||
|
|
7989c39341 |
x86/microcode: Add platform mask to Intel microcode "old" list
Intel sometimes has CPUs with identical family/model/stepping but
which need different microcode. These CPUs are differentiated with the
platform ID.
The Intel "microcode-20250512" release was used to generate the
existing contents of intel-ucode-defs.h. Use that same release and add
the platform mask to the definitions.
This makes the list a few entries longer because some CPUs previously
that shared a definition now need two or more. for example for the
ancient Pentium III there are two CPUs that differ only in their
platform and have two different microcode versions (note:
.driver_data is the microcode version):
{ ..., .model = 0x05, .steppings = 0x0001, .platform_mask = 0x01, .driver_data = 0x40 },
{ ..., .model = 0x05, .steppings = 0x0001, .platform_mask = 0x08, .driver_data = 0x45 },
Another example is the state-of-the-art Granite Rapids:
{ ..., .model = 0xad, .steppings = 0x0002, .platform_mask = 0x20, .driver_data = 0xa0000d1 },
{ ..., .model = 0xad, .steppings = 0x0002, .platform_mask = 0x95, .driver_data = 0x10003a2 },
As you can see, this differentiation with platform ID has been
necessary for a long time and is still relevant today.
Without the platform matching, the old microcode table is incomplete.
For instance, it might lead someone with a Pentium III, platform 0x0,
and microcode 0x40 to think that they should have microcode 0x45,
which is really only for platform 0x4 (.platform_mask==0x08).
In practice, this meant that folks with fully updated microcode were
seeing "Vulnerable" in the "old_microcode" file.
1. https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files
Closes: https://lore.kernel.org/all/38660F8F-499E-48CD-B58B-4822228A5941@nutanix.com/
Fixes:
|
||
|
|
fab0c75d50 |
x86/cpu: Add platform ID to CPU matching structure
The existing x86_match_cpu() infrastructure can be used to match a bunch of attributes of a CPU: vendor, family, model, steppings and CPU features. But, there's one more attribute that's missing and unable to be matched against: the platform ID, enumerated on Intel CPUs in MSR_IA32_PLATFORM_ID. It is a little more obscure and is only queried during microcode loading. This is because Intel sometimes has CPUs with identical family/model/stepping but which need different microcode. These CPUs are differentiated with the platform ID. Add a field in 'struct x86_cpu_id' for the platform ID. Similar to the stepping field, make the new field a mask of platform IDs. Some examples: 0x01: matches only platform ID 0x0 0x02: matches only platform ID 0x1 0x03: matches platform IDs 0x0 or 0x1 0x80: matches only platform ID 0x7 0xff: matches all 8 possible platform IDs Since the mask is only a byte wide, it nestles in next to another u8 and does not even increase the size of 'struct x86_cpu_id'. Reserve the all 0's value as the wildcard (X86_PLATFORM_ANY). This avoids forcing changes to existing 'struct x86_cpu_id' users. They can just continue to fill the field with 0's and their matching will work exactly as before. Note: If someone is ever looking for space in 'struct x86_cpu_id', this new field could probably get stuck over in ->driver_data for the one user that there is. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://patch.msgid.link/20260304181022.058DF07C@davehans-spike.ostc.intel.com |
||
|
|
d8630b67ca |
x86/cpu: Add platform ID to CPU info structure
The end goal here is to be able to do x86_match_cpu() and match on a specific platform ID. While it would be possible to stash this ID off somewhere or read it dynamically, that approaches would not be consistent with the other fields which can be matched. Read the platform ID and store it in cpuinfo_x86. There are lots of sites to set this new field. Place it near the place c->microcode is established since the platform ID is so closely intertwined with microcode updates. Note: This should not grow the size of 'struct cpuinfo_x86' in practice since the u8 fits next to another u8 in the structure. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Link: https://patch.msgid.link/20260304181020.8D518228@davehans-spike.ostc.intel.com |
||
|
|
238be4ba87 |
x86/microcode: Refactor platform ID enumeration into a helper
Today, the only code that cares about the platform ID is the microcode update code itself. To facilitate storing the platform ID in a more generic place and using it outside of the microcode update itself, put the enumeration into a helper function. Mirror intel_get_microcode_revision()'s naming and location. But, moving away from intel_collect_cpu_info() means that the model and family information in CPUID is not readily available. Just call CPUID again. Note that the microcode header is a mask of supported platform IDs. Only stick the ID part in the helper. Leave the 1<<id part in the microcode handling. Also note that the PII is weird. It does not really have a platform ID because it doesn't even have the MSR. Just consider it to be platform ID 0. Instead of saying >=PII, say <=PII. The PII is the real oddball here being the only CPU with Linux microcode updates but no platform ID. It's worth calling it out by name. This does subtly change the sig->pf for the PII though from 0x0 to 0x1. Make up for that by ignoring sig->pf when the microcode update platform mask is 0x0. [ dhansen: reflow comment for bpetkov ] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Link: https://patch.msgid.link/20260304181018.EB6404F8@davehans-spike.ostc.intel.com |
||
|
|
405b7c2793 |
KVM: VMX: Unconditionally allocate root VMCSes during boot CPU bringup
Allocate the root VMCS (misleading called "vmxarea" and "kvm_area" in KVM) for each possible CPU during early boot CPU bringup, before early TDX initialization, so that TDX can eventually do VMXON on-demand (to make SEAMCALLs) without needing to load kvm-intel.ko. Allocate the pages early on, e.g. instead of trying to do so on-demand, to avoid having to juggle allocation failures at runtime. Opportunistically rename the per-CPU pointers to better reflect the role of the VMCS. Use Intel's "root VMCS" terminology, e.g. from various VMCS patents[1][2] and older SDMs, not the more opaque "VMXON region" used in recent versions of the SDM. While it's possible the VMCS passed to VMXON no longer serves as _the_ root VMCS on modern CPUs, it is still in effect a "root mode VMCS", as described in the patents. Link: https://patentimages.storage.googleapis.com/c7/e4/32/d7a7def5580667/WO2013101191A1.pdf [1] Link: https://patentimages.storage.googleapis.com/13/f6/8d/1361fab8c33373/US20080163205A1.pdf [2] Tested-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Sagi Shahar <sagis@google.com> Link: https://patch.msgid.link/20260214012702.2368778-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> |
||
|
|
59674fc9d0 |
x86/resctrl: Fix SNC detection
Now that the x86 topology code has a sensible nodes-per-package measure, that does not depend on the online status of CPUs, use this to divinate the SNC mode. Note that when Cluster on Die (CoD) is configured on older systems this will also show multiple NUMA nodes per package. Intel Resource Director Technology is incomaptible with CoD. Print a warning and do not use the fixup MSR_RMID_SNC_CONFIG. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Tested-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com> Link: https://patch.msgid.link/aaCxbbgjL6OZ6VMd@agluck-desk3 Link: https://patch.msgid.link/20260303110100.367976706@infradead.org |
||
|
|
ae6730ff42 |
x86/topo: Add topology_num_nodes_per_package()
Use the MADT and SRAT table data to compute __num_nodes_per_package. Specifically, SRAT has already been parsed in x86_numa_init(), which is called before acpi_boot_init() which parses MADT. So both are available in topology_init_possible_cpus(). This number is useful to divinate the various Intel CoD/SNC and AMD NPS modes, since the platforms are failing to provide this otherwise. Doing it this way is independent of the number of online CPUs and other such shenanigans. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Tested-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com> Tested-by: Kyle Meyer <kyle.meyer@hpe.com> Link: https://patch.msgid.link/20260303110100.004091624@infradead.org |
||
|
|
68400c1aaf |
x86/cpu: Remove LASS restriction on EFI
The initial LASS enabling has been deferred to much later during boot, and EFI runtime services now run with LASS temporarily disabled. This removes LASS from the path of all EFI services. Remove the LASS restriction on EFI config, as the two can now coexist. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Link: https://patch.msgid.link/20260120234730.2215498-4-sohil.mehta@intel.com |
||
|
|
b3226af5ad |
x86/cpu: Defer LASS enabling until userspace comes up
LASS blocks any kernel access to the lower half of the virtual address space. Unfortunately, some EFI accesses happen during boot with bit 63 cleared, which causes a #GP fault when LASS is enabled. Notably, the SetVirtualAddressMap() call can only happen in EFI physical mode. Also, EFI_BOOT_SERVICES_CODE/_DATA could be accessed even after ExitBootServices(). The boot services memory is truly freed during efi_free_boot_services() after SVAM has completed. To prevent EFI from tripping LASS, at a minimum, LASS enabling must be deferred until EFI has completely finished entering virtual mode (including freeing boot services memory). Moving setup_lass() to arch_cpu_finalize_init() would do the trick, but that would make the implementation very fragile. Something else might come in the future that would need the LASS enabling to be moved again. In general, security features such as LASS provide limited value before userspace is up. They aren't necessary during early boot while only trusted ring0 code is executing. Introduce a generic late initcall to defer activating some CPU features until userspace is enabled. For now, only move the LASS CR4 programming to this initcall. As APs are already up by the time late initcalls run, some extra steps are needed to enable LASS on all CPUs. Use a CPU hotplug callback instead of on_each_cpu() or smp_call_function(). This ensures that LASS is enabled on every CPU that is currently online as well as any future CPUs that come online later. Note, even though hotplug callbacks run with preemption enabled, cr4_set_bits() would disable interrupts while updating CR4. Keep the existing logic in place to clear the LASS feature bits early. setup_clear_cpu_cap() must be called before boot_cpu_data is finalized and alternatives are patched. Eventually, the entire setup_lass() logic can go away once the restrictions based on vsyscall emulation and EFI are removed. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Link: https://patch.msgid.link/20260120234730.2215498-2-sohil.mehta@intel.com |