Commit Graph

21302 Commits

Author SHA1 Message Date
Prathyushi Nangia
c21b90f776 x86/CPU/AMD: Prevent improper isolation of shared resources in Zen2's op cache
Make sure resources are not improperly shared in the op cache and
cause instruction corruption this way.

Signed-off-by: Prathyushi Nangia <prathyushi.nangia@amd.com>
Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-05-11 20:06:36 -07:00
David Gow
5772f65352 x86/boot/e820: Re-enable BIOS fallback if e820 table is empty
In commit:

  157266edcc ("x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables")

the check on the number of entries in the e820 table was removed. The intention
was to support single-entry maps, but by removing the check entirely, we also
skip the fallback (to, e.g., the BIOS 88h function).

This means that if no E820 map is passed in from the bootloader (which is the
case on some bootloaders, like linld), we end up with an empty memory map, and
the kernel fails to boot (either by deadlocking on OOM, or by failing to
allocate the real mode trampoline, or similar).

Re-instate the check in append_e820_table(), but only check that nr_entries is
non-zero. This allows e820__memory_setup_default() to fall back to other memory
size sources, and doesn't affect e820__memory_setup_extended(), as the latter
ignores the return value from append_e820_table().

In doing so, we also update the return values to be proper error codes, with
-ENOENT for this case (there are no entries), and -EINVAL for the case where an
entry appears invalid. Given none of the callers check the actual value -- just
whether it's nonzero -- this is largely aesthetic in practice.

Tested against linld, and the kernel boots again fine.

[ mingo: Readability edits to the comment and the changelog. ]

Fixes: 157266edcc ("x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables")
Signed-off-by: David Gow <david@davidgow.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: stable@vger.kernel.org
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://patch.msgid.link/20260416065746.1896647-1-david@davidgow.net
2026-05-07 10:04:54 +02:00
Linus Torvalds
8f4e8687c8 Miscellaneous x86 fixes:
- Prevent deadlock during shstk sigreturn (Rick Edgecombe)
 
  - Disable FRED when PTI is forced on (Dave Hansen)
 
  - Revert a CPA INVLPGB optimization that did not properly handle
    discontiguous virtual addresses (Dave Hansen)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmnrdcARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jdgg//Rc1HsR9foqgOZLunF7no1jiscoFqHSF2
 0sa+Hdmujw3JLBNzTRWRC4PFs1rBuxXSRl/rgZboH2QJa675xsjiHbnieN0XEymM
 gd7m03ljbLRHS/JY52CtKC1DqoAY0YWMl+qSnQVBNS1uegHO8eyNkByVIL0SyALe
 aiBgd0gDoABFxbKvfKI6pKgr7NfdpGxeIhn/cl52FObCjk4d5Eq0Fu2PfCTWZjPG
 xXDdqWyi/HWUCbCUkLtnar/CYO6gbIjC8TSVY4awRZ0rAYMDiuY/H4mp6QtCakm7
 c+82NZFxDjFWGQBlN+XZW8sePS0AECNL6jnRzPmBZSdn82jazyyKAyTodChlvXF4
 1UROkOCR2UZs6iXxLIweS32CU8u9YiHPKslbXw+fYIPL4JSsUwxkrsLrjO3FkGap
 ke/Mn4W9hG6L/drRY6PW7j2728+2Kb0nQFefACMepxozRfKbuKJIeQ7Saji8/KKB
 ga3f1PECRbzD5YgUkifIqUUV21phyw8zvw4x/s8mWkXCezOwxMbheaG7DX5f7tdw
 jaR0SCS+cikYNATj69LMHs+x08AcITtFglV18DVTujVcpYSX09BVWA/jPWyGX+eC
 qzX7wnxbY/MkJvirrcZPa+ZL8tsrcEy9ZYQhpj5Bj859R8qLew3DvxhmbyRVEkk6
 B93tXZmyx7M=
 =hABk
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2026-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

 - Prevent deadlock during shstk sigreturn (Rick Edgecombe)

 - Disable FRED when PTI is forced on (Dave Hansen)

 - Revert a CPA INVLPGB optimization that did not properly handle
   discontiguous virtual addresses (Dave Hansen)

* tag 'x86-urgent-2026-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Revert INVLPGB optimization for set_memory code
  x86/cpu: Disable FRED when PTI is forced on
  x86/shstk: Prevent deadlock during shstk sigreturn
2026-04-24 10:05:42 -07:00
Linus Torvalds
45dcf5e288 PCMCIA fixes and cleanups for v7.1
A number of minor PCMCIA bugfixes and cleanups, and a patch removing
 obsolete host controller drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCgA5FiEEmgXaWKgmjrvkPhLCmpdgiUyNow0FAmnqWwIbHGxpbnV4QGRv
 bWluaWticm9kb3dza2kubmV0AAoJEJqXYIlMjaMNifYP/R89OwUCazfebvE+mzTT
 Nlj11sxOaMF9ZMDkWZlP49TkoBZjQ9MEb6hGgw99SLLf5bYoHQFHZ0RMIwM/7UXv
 y66ZLqfuGgdenUre8QROsYP1uegO4gh41aRm9a7cswoAkNlnIgKhH1k2/jhRwlXT
 rrl2CObgkdAtpnZDb2GFyjL2Y5VEKoKJf3nrNlDs1R4NBDVmfisojlfLeYFGGBsq
 CESwNcj9eEj+NuUhdWOmHXjXOvsDxk2PpTUEettswcdInOj8MApScbmyj/UqCX6A
 RTI/myAeLeGTMgTrZCfIqhCiaZbBdVcoKYakRfAZVyM1KuHQo/PmAlMPhv5C4lRT
 V8Fqz3RTtcYxblTMoxWh8pMaCKoH905YIINs54rkFwAqgq6TXgu26QdZtH+YR2V/
 /0JI613MfSDhC2WwEF3j+sW0WlCnGSFZ5yzUZQmoQYf4US3ZOhp1JgaK+18QtqUa
 00e0OQkLNN03Nj7UcoMfLHANOBVjLV8C8U+zkHT1t5AI4bRlP8FSTNrFP0f2VKWZ
 rm0S7CWovRS9mewOOXVHv2Wo0bxSaBsBxrD2G8n1PRE96xpXZvjuUKQtSyX6Totk
 lacS2HziABD7IiuKH0djC1MPPZkbZcct3TO8PggVYkrDa9nyQAsG4Z2+S85f8+Qg
 ALVHDj4B4bbJEVyn4duU9y+F
 =y3cK
 -----END PGP SIGNATURE-----

Merge tag 'pcmcia-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux

Pull PCMCIA updates from Dominik Brodowski:
 "A number of minor PCMCIA bugfixes and cleanups, and a patch removing
  obsolete host controller drivers"

* tag 'pcmcia-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux:
  pcmcia: remove obsolete host controller drivers
  pcmcia: Convert to use less arguments in pci_bus_for_each_resource()
  PCMCIA: Fix garbled log messages for KERN_CONT
2026-04-23 11:22:16 -07:00
Linus Torvalds
38ee6e1fb6 kgdb patches for 7.1
Only a very small update for kgdb this cycle: a single patch from Kexin Sun
 that fixes some outdated comments.
 
 Signed-off-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEELzVBU1D3lWq6cKzwfOMlXTn3iKEFAmno2OcACgkQfOMlXTn3
 iKHVNw/+IXsSmpUTllkwlzmib+D7zYVEoyLOdoBRQyJDjjT9Qe811budkIs02sIk
 NUgMBnffInU73lCw2TbRpKag9+dl8ZglKFvM3UpElFzY5ePnUc/fHeFtsXdzVjSK
 PVzL3fCtN66R1MuMeCdMNjhtSVHNJCrclj0We39JRgli7kWK1zH6kY8X/j/ExBTR
 9S5e3/9Yn/V9Z2Z33lwBMCKLHyd/5r0ezKbvfVGmqT/sTE/IbTcTfVMipEHGtHfA
 JHs5PC+1obJ9jeq/1c1fV8tZ8kO+VjGI0yUibbWEBmPx0pVUW/FG1z+WwhlbcyOl
 cyhKQ7d4h4e718M+bWAXeoRGk9bXyAWHqWojM98Xh+TvR2H20qAzUC44qeT9tt7B
 SOfvZM4Xwsn+uYqmthtMqLMqe8VIX2WbUVjsntYZ46fRTq8RAUY3z7yLv6bm7N4l
 p4i7/cdCsbzi+tN6uaKANywvt9ygEAcPILGfMhfIS4Pa7aQ/JVjKgxKhP1bUuH9u
 ftqIcJwvX3d8MDQY4FsLpsRToj3rADfr35HnQCc7fmqRIFi7++fXzPerktdF/5Sd
 6NMR0+mdFewdW9JfYKGZMTybf9XXyC+F6HlKgNOBddPje7qQlLUIQBajn71W753i
 TNKI5Go6idMgWEukNVXGD2XqObyKD2eBgCj/42Kc8bDfo7D66UA=
 =MegD
 -----END PGP SIGNATURE-----

Merge tag 'kgdb-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux

Pull kgdb update from Daniel Thompson:
 "Only a very small update for kgdb this cycle: a single patch from
  Kexin Sun that fixes some outdated comments"

* tag 'kgdb-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
  kgdb: update outdated references to kgdb_wait()
2026-04-22 14:26:58 -07:00
Linus Torvalds
8fd12b03c7 hyperv-next for v7.1
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmnobFATHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXhafB/wMvf8yu4FgapbSRIlPboeW/ONDyuHd
 k0y1a0IFMQLspWfCxK8+snEcHT3g9xzG3ksqcnac0SPgzqxQ4dlK/c7Xr+5EBXuf
 TEt/bHrt6KT5BUpb1k/XxcdoObCsvZd3vfqR020OsHijw1Ni9PrdqeZxk56/vFJs
 nvgyKvsrAEyfALOH1Vgwg0gNWpqJBj1KcT3Kl1o4p5lwQzVpUREZii6RyvgXT/pu
 mckN63FrEPOpDaJllHmCPcfFSzqNi+wFQcFxm35w9rDQsVdnMsRWoJbcXNYW5QGM
 +KOZ1/tzw4Z3queW78hcxsH6sXiElLDsJgtDohBhxbVvUXy+xHrJCOm5
 =4Ke3
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull Hyper-V updates from Wei Liu:

 - Fix cross-compilation for hv tools (Aditya Garg)

 - Fix vmemmap_shift exceeding MAX_FOLIO_ORDER in mshv_vtl (Naman Jain)

 - Limit channel interrupt scan to relid high water mark (Michael
   Kelley)

 - Export hv_vmbus_exists() and use it in pci-hyperv (Dexuan Cui)

 - Fix cleanup and shutdown issues for MSHV (Jork Loeser)

 - Introduce more tracing support for MSHV (Stanislav Kinsburskii)

* tag 'hyperv-next-signed-20260421' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: Skip LP/VP creation on kexec
  x86/hyperv: move stimer cleanup to hv_machine_shutdown()
  Drivers: hv: vmbus: fix hyperv_cpuhp_online variable shadowing
  mshv: Add tracepoint for GPA intercept handling
  mshv_vtl: Fix vmemmap_shift exceeding MAX_FOLIO_ORDER
  tools: hv: Fix cross-compilation
  Drivers: hv: vmbus: Export hv_vmbus_exists() and use it in pci-hyperv
  mshv: Introduce tracing support
  Drivers: hv: vmbus: Limit channel interrupt scan to relid high water mark
2026-04-22 09:50:46 -07:00
Jork Loeser
5170a82e89 x86/hyperv: Skip LP/VP creation on kexec
After a kexec the logical processors and virtual processors already
exist in the hypervisor because they were created by the previous
kernel. Attempting to add them again causes either a BUG_ON or
corrupted VP state leading to MCEs in the new kernel.

Add hv_lp_exists() to probe whether an LP is already present by
calling HVCALL_GET_LOGICAL_PROCESSOR_RUN_TIME. When it succeeds the
LP exists and we skip the add-LP and create-VP loops entirely.

Also add hv_call_notify_all_processors_started() which informs the
hypervisor that all processors are online. This is required after
adding LPs (fresh boot) and is a no-op on kexec since we skip that
path.

Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Co-developed-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com>
Signed-off-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com>
Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com>
Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-04-22 06:23:25 +00:00
Jork Loeser
f7ce370b52 x86/hyperv: move stimer cleanup to hv_machine_shutdown()
Move hv_stimer_global_cleanup() from vmbus's hv_kexec_handler() to
hv_machine_shutdown() in the platform code. This ensures stimer cleanup
happens before the vmbus unload, which is required for root partition
kexec to work correctly.

Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Signed-off-by: Jork Loeser <jloeser@linux.microsoft.com>
Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-04-22 06:23:25 +00:00
Kexin Sun
256e5254ef kgdb: update outdated references to kgdb_wait()
The function kgdb_wait() was folded into the static function
kgdb_cpu_enter() by commit 62fae31219 ("kgdb: eliminate
kgdb_wait(), all cpus enter the same way").  Update the four stale
references accordingly:

 - include/linux/kgdb.h and arch/x86/kernel/kgdb.c: the
   kgdb_roundup_cpus() kdoc describes what other CPUs are rounded up
   to call.  Because kgdb_cpu_enter() is static, the correct public
   entry point is kgdb_handle_exception(); also fix a pre-existing
   grammar error ("get them be" -> "get them into") and reflow the
   text.

 - kernel/debug/debug_core.c: replace with the generic description
   "the debug trap handler", since the actual entry path is
   architecture-specific.

 - kernel/debug/gdbstub.c: kgdb_cpu_enter() is correct here (it
   describes internal state, not a call target); add the missing
   parentheses.

Suggested-by: Daniel Thompson <daniel@riscstar.com>
Assisted-by: unnamed:deepseek-v3.2 coccinelle
Signed-off-by: Kexin Sun <kexinsun@smail.nju.edu.cn>
2026-04-21 16:41:54 +01:00
Rick Edgecombe
9874b2917b x86/shstk: Prevent deadlock during shstk sigreturn
During sigreturn the shadow stack signal frame is popped. The kernel does
this by reading the shadow stack using normal read accesses. When it can't
assume the memory is shadow stack, it takes extra steps to makes sure it is
reading actual shadow stack memory and not other normal readable memory. It
does this by holding the mmap read lock while doing the access and checking
the flags of the VMA.

Unfortunately that is not safe. If the read of the shadow stack sigframe
hits a page fault, the fault handler will try to recursively grab another
mmap read lock. This normally works ok, but if a writer on another CPU is
also waiting, the second read lock could fail and cause a deadlock.

Fix this by not holding mmap lock during the read access to userspace.

Instead use mmap_lock_speculate_...() to watch for changes between dropping
mmap lock and the userspace access. Retry if anything grabbed an mmap write
lock in between and could have changed the VMA.

These mmap_lock_speculate_...() helpers use mm::mm_lock_seq, which is only
available when PER_VMA_LOCK is configured. So make X86_USER_SHADOW_STACK
depend on it. On x86, PER_VMA_LOCK is a default configuration for SMP
kernels. So drop support for the other configs under the assumption that
the !SMP shadow stack user base does not exist.

Currently there is a check that skips the lookup work when the SSP can be
assumed to be on a shadow stack. While reorganizing the function, remove
the optimization to make the tricky code flows more common, such that
issues like this cannot escape detection for so long.

Fixes: 7fad2a432c ("x86/shstk: Check that signal frame is shadow stack mem")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
2026-04-20 22:54:24 +02:00
Linus Torvalds
9055c64567 memblock: updates for 7.0-rc1
* improve debugability of reserve_mem kernel parameter handling with print
   outs in case of a failure and debugfs info showing what was actually
   reserved
 * Make memblock_free_late() and free_reserved_area() use the same core
   logic for freeing the memory to buddy and ensure it takes care of
   updating memblock arrays when ARCH_KEEP_MEMBLOCK is enabled.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmnjRmsQHHJwcHRAa2Vy
 bmVsLm9yZwAKCRA5A4Ymyw79kYh0CAC4NpZGFqpEBep1eQcfqsPH05dvp1LUXDNk
 i5GwS2ht/F5D9GcD+EyoYRQjRM8k+XZyOe3sqEF01Uav/rHAv3XrITg/pfiA92AR
 K7CvQv4NvyQqUNcv/mEb+P8niriJ4oHRXCag9inop1jo/x3Mym07oEy73rknAx9r
 ZQKwoFNOM/QQGVb9hZUANKCkE8cAsUXG89yEOH0n17FOahC0PZbK/vxjeO+br3IL
 HxEoC5l1j4cUauf8XEhsVXXdch0iqit/fB3ROePYFNCx7koVYHk6Yl1w++AM0RUA
 ypOmfPsSiqLY2ciuTIAnpTeMfQkkhEmMI3mp6T5BUBwSKJxLRaSM
 =c1xd
 -----END PGP SIGNATURE-----

Merge tag 'memblock-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock updates from Mike Rapoport:

 - improve debuggability of reserve_mem kernel parameter handling with
   print outs in case of a failure and debugfs info showing what was
   actually reserved

 - Make memblock_free_late() and free_reserved_area() use the same core
   logic for freeing the memory to buddy and ensure it takes care of
   updating memblock arrays when ARCH_KEEP_MEMBLOCK is enabled.

* tag 'memblock-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  x86/alternative: delay freeing of smp_locks section
  memblock: warn when freeing reserved memory before memory map is initialized
  memblock, treewide: make memblock_free() handle late freeing
  memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=y
  memblock: extract page freeing from free_reserved_area() into a helper
  memblock: make free_reserved_area() more robust
  mm: move free_reserved_area() to mm/memblock.c
  powerpc: opal-core: pair alloc_pages_exact() with free_pages_exact()
  powerpc: fadump: pair alloc_pages_exact() with free_pages_exact()
  memblock: reserve_mem: fix end caclulation in reserve_mem_release_by_name()
  memblock: move reserve_bootmem_range() to memblock.c and make it static
  memblock: Add reserve_mem debugfs info
  memblock: Print out errors on reserve_mem parser
2026-04-18 11:29:14 -07:00
Linus Torvalds
01f492e181 Arm:
- Add support for tracing in the standalone EL2 hypervisor code, which
   should help both debugging and performance analysis.  This uses the
   new infrastructure for 'remote' trace buffers that can be exposed
   by non-kernel entities such as firmware, and which came through the
   tracing tree.
 
 - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting
   point for supporting the new GIC architecture in KVM.
 
 - Finally add support for pKVM protected guests, where pages are unmapped
   from the host as they are faulted into the guest and can be shared back
   from the guest using pKVM hypercalls.  Protected guests are created
   using a new machine type identifier.  As the elusive guestmem has not
   yet delivered on its promises, anonymous memory is also supported.
 
   This is only a first step towards full isolation from the host; for
   example, the CPU register state and DMA accesses are not yet isolated.
   Because this does not really yet bring fully what it promises, it is
   hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and
   also triggers TAINT_USER when a VM is created.  Caveat emptor.
 
 - Rework the dreaded user_mem_abort() function to make it more
   maintainable, reducing the amount of state being exposed to the
   various helpers and rendering a substantial amount of state immutable.
 
 - Expand the Stage-2 page table dumper to support NV shadow page tables
   on a per-VM basis.
 
 - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow.
 
 - Fix both SPE and TRBE in non-VHE configurations so that they do not
   generate spurious, out of context table walks that ultimately lead
   to very bad HW lockups.
 
 - A small set of patches fixing the Stage-2 MMU freeing in error cases.
 
 - Tighten-up accepted SMC immediate value to be only #0 for host
   SMCCC calls.
 
 - The usual cleanups and other selftest churn.
 
 LoongArch:
 
 - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel().
 
 - Add DMSINTC irqchip in kernel support.
 
 RISC-V:
 
 - Fix steal time shared memory alignment checks
 
 - Fix vector context allocation leak
 
 - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi()
 
 - Fix double-free of sdata in kvm_pmu_clear_snapshot_area()
 
 - Fix integer overflow in kvm_pmu_validate_counter_mask()
 
 - Fix shift-out-of-bounds in make_xfence_request()
 
 - Fix lost write protection on huge pages during dirty logging
 
 - Split huge pages during fault handling for dirty logging
 
 - Skip CSR restore if VCPU is reloaded on the same core
 
 - Implement kvm_arch_has_default_irqchip() for KVM selftests
 
 - Factored-out ISA checks into separate sources
 
 - Added hideleg to struct kvm_vcpu_config
 
 - Factored-out VCPU config into separate sources
 
 - Support configuration of per-VM HGATP mode from KVM user space
 
 s390:
 
 - Support for ESA (31-bit) guests inside nested hypervisors.
 
 - Remove restriction on memslot alignment, which is not needed anymore with
   the new gmap code.
 
 - Fix LPSW/E to update the bear (which of course is the breaking event
   address register).
 
 x86:
 
 - Shut up various UBSAN warnings on reading module parameter before they
   were initialized.
 
 - Don't zero-allocate page tables that are used for splitting hugepages in
   the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and
   thus write all bytes.
 
 - As an optimization, bail early when trying to unsync 4KiB mappings if the
   target gfn can just be mapped with a 2MiB hugepage.
 
 x86 generic:
 
 - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely
   struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM
   would dereference stack pointer after an exit to userspace.
 
 - Clean up and comment the emulated MMIO code to try to make it easier to
   maintain (not necessarily "easy", but "easier").
 
 - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of VMX
   and SVM enabling) as it is needed for trusted I/O.
 
 - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions
 
 - Immediately fail the build if a required #define is missing in one of
   KVM's headers that is included multiple times.
 
 - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected
   exception, mostly to prevent syzkaller from abusing the uAPI to
   trigger WARNs, but also because it can help prevent userspace from
   unintentionally crashing the VM.
 
 - Exempt SMM from CPUID faulting on Intel, as per the spec.
 
 - Misc hardening and cleanup changes.
 
 x86 (AMD):
 
 - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU
   so that KVM doesn't prematurely re-enable AVIC if multiple
   vCPUs have to-be-injected IRQs.
 
 - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would
   overwrite state when enabling virtualization on multiple CPUs in parallel.
   This should not be a problem because OSVW should usually be the same for
   all CPUs.
 
 - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a
   "too large" size based purely on user input.
 
 - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION.
 
 - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as
   doing so for an SNP guest will crash the host due to an RMP violation
   page fault.
 
 - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries
   are required to hold kvm->lock, and enforce it by lockdep.  Fix various
   bugs where sev_guest() was not ensured to be stable for the whole
   duration of a function or ioctl.
 
 - Convert a pile of kvm->lock SEV code to guard().
 
 - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD,
   for which KVM needs to set CR2 and DR6 as a response to ioctls such as
   KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2
   rather than CR2, for example).  Only set CR2 and DR6 when consumption of
   the payload is imminent, but on the other hand force delivery of the
   payload in all paths where userspace retrieves CR2 or DR6.
 
 - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead
   of vmcb02->save.cr2.  The value is out of sync after a save/restore
   or after a #PF is injected into L2.
 
 - Fix a class of nSVM bugs where some fields written by the CPU are not
   synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not
   up-to-date when saved by KVM_GET_NESTED_STATE.
 
 - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and
   KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after
   save+restore.
 
 - Add a variety of missing nSVM consistency checks.
 
 - Fix several bugs where KVM failed to correctly update VMCB fields on
   nested #VMEXIT.
 
 - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for
   SVM-related instructions.
 
 - Add support for save+restore of virtualized LBRs (on SVM).
 
 - Refactor various helpers and macros to improve clarity and (hopefully)
   make the code easier to maintain.
 
 - Aggressively sanitize fields when copying from vmcb12, to guard against
   unintentionally allowing L1 to utilize yet-to-be-defined features.
 
 - Fix several bugs where KVM botched rAX legality checks when emulating SVM
   instructions.  There are remaining issues in that KVM doesn't handle size
   prefix overrides for 64-bit guests.
 
 - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of
   somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's
   architectural but sketchy behavior of generating #GP for "unsupported"
   addresses).
 
 - Cache all used vmcb12 fields to further harden against TOCTOU bugs.
 
 x86 (Intel):
 
 - Drop obsolete branch hint prefixes from the VMX instruction macros.
 
 - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a
   register input when appropriate.
 
 - Code cleanups.
 
 guest_memfd:
 
 - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support
   reclaim, the memory is unevictable, and there is no storage to write
   back to.
 
 LoongArch selftests:
 
 - Add KVM PMU test cases
 
 s390 selftests:
 
 - Enable more memory selftests.
 
 x86 selftests:
 
 - Add support for Hygon CPUs in KVM selftests.
 
 - Fix a bug in the MSR test where it would get false failures on AMD/Hygon
   CPUs with exactly one of RDPID or RDTSCP.
 
 - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a
   bug where the kernel would attempt to collapse guest_memfd folios against
   KVM's will.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmnftRQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPAzwf+NKO4Ktv+7A22ImN0SBl0nlUuulsz
 vTcw3+hxdRoIw83GdNS+hG5js0wrpMDnbv3t4+VliDNBSSxrBzcSWX2wpilW0Xtw
 qGo1MWhs2lKPy1NlaRVOwPS6j7uF3AR0TQ1iQLGMedQuCU9WpiKJxyhNXJdbLrt3
 8EgFzsvtEsv+jKNRUNDf9+d0j4gZsFyIe+Brhianbw+u3/UCiUClLCdsKPc4+5ZX
 08otYXytacGNIf/5Ev1vT4pHkHL0yqKXAtX7LEtaS3+0KrPuLjV4slemivzE9vf5
 Evafm5AhA4wpaNMb1ZerhY3T94lsMaJpWxotjR//0Q7C9B59pCQnXCm8mg==
 =CcE0
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "Arm:

   - Add support for tracing in the standalone EL2 hypervisor code,
     which should help both debugging and performance analysis. This
     uses the new infrastructure for 'remote' trace buffers that can be
     exposed by non-kernel entities such as firmware, and which came
     through the tracing tree

   - Add support for GICv5 Per Processor Interrupts (PPIs), as the
     starting point for supporting the new GIC architecture in KVM

   - Finally add support for pKVM protected guests, where pages are
     unmapped from the host as they are faulted into the guest and can
     be shared back from the guest using pKVM hypercalls. Protected
     guests are created using a new machine type identifier. As the
     elusive guestmem has not yet delivered on its promises, anonymous
     memory is also supported

     This is only a first step towards full isolation from the host; for
     example, the CPU register state and DMA accesses are not yet
     isolated. Because this does not really yet bring fully what it
     promises, it is hidden behind CONFIG_ARM_PKVM_GUEST +
     'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is
     created. Caveat emptor

   - Rework the dreaded user_mem_abort() function to make it more
     maintainable, reducing the amount of state being exposed to the
     various helpers and rendering a substantial amount of state
     immutable

   - Expand the Stage-2 page table dumper to support NV shadow page
     tables on a per-VM basis

   - Tidy up the pKVM PSCI proxy code to be slightly less hard to
     follow

   - Fix both SPE and TRBE in non-VHE configurations so that they do not
     generate spurious, out of context table walks that ultimately lead
     to very bad HW lockups

   - A small set of patches fixing the Stage-2 MMU freeing in error
     cases

   - Tighten-up accepted SMC immediate value to be only #0 for host
     SMCCC calls

   - The usual cleanups and other selftest churn

  LoongArch:

   - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel()

   - Add DMSINTC irqchip in kernel support

  RISC-V:

   - Fix steal time shared memory alignment checks

   - Fix vector context allocation leak

   - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi()

   - Fix double-free of sdata in kvm_pmu_clear_snapshot_area()

   - Fix integer overflow in kvm_pmu_validate_counter_mask()

   - Fix shift-out-of-bounds in make_xfence_request()

   - Fix lost write protection on huge pages during dirty logging

   - Split huge pages during fault handling for dirty logging

   - Skip CSR restore if VCPU is reloaded on the same core

   - Implement kvm_arch_has_default_irqchip() for KVM selftests

   - Factored-out ISA checks into separate sources

   - Added hideleg to struct kvm_vcpu_config

   - Factored-out VCPU config into separate sources

   - Support configuration of per-VM HGATP mode from KVM user space

  s390:

   - Support for ESA (31-bit) guests inside nested hypervisors

   - Remove restriction on memslot alignment, which is not needed
     anymore with the new gmap code

   - Fix LPSW/E to update the bear (which of course is the breaking
     event address register)

  x86:

   - Shut up various UBSAN warnings on reading module parameter before
     they were initialized

   - Don't zero-allocate page tables that are used for splitting
     hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in
     the page table and thus write all bytes

   - As an optimization, bail early when trying to unsync 4KiB mappings
     if the target gfn can just be mapped with a 2MiB hugepage

  x86 generic:

   - Copy single-chunk MMIO write values into struct kvm_vcpu (more
     precisely struct kvm_mmio_fragment) to fix use-after-free stack
     bugs where KVM would dereference stack pointer after an exit to
     userspace

   - Clean up and comment the emulated MMIO code to try to make it
     easier to maintain (not necessarily "easy", but "easier")

   - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of
     VMX and SVM enabling) as it is needed for trusted I/O

   - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions

   - Immediately fail the build if a required #define is missing in one
     of KVM's headers that is included multiple times

   - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected
     exception, mostly to prevent syzkaller from abusing the uAPI to
     trigger WARNs, but also because it can help prevent userspace from
     unintentionally crashing the VM

   - Exempt SMM from CPUID faulting on Intel, as per the spec

   - Misc hardening and cleanup changes

  x86 (AMD):

   - Fix and optimize IRQ window inhibit handling for AVIC; make it
     per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple
     vCPUs have to-be-injected IRQs

   - Clean up and optimize the OSVW handling, avoiding a bug in which
     KVM would overwrite state when enabling virtualization on multiple
     CPUs in parallel. This should not be a problem because OSVW should
     usually be the same for all CPUs

   - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains
     about a "too large" size based purely on user input

   - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION

   - Disallow synchronizing a VMSA of an already-launched/encrypted
     vCPU, as doing so for an SNP guest will crash the host due to an
     RMP violation page fault

   - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped
     queries are required to hold kvm->lock, and enforce it by lockdep.
     Fix various bugs where sev_guest() was not ensured to be stable for
     the whole duration of a function or ioctl

   - Convert a pile of kvm->lock SEV code to guard()

   - Play nicer with userspace that does not enable
     KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6
     as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the
     payload would end up in EXITINFO2 rather than CR2, for example).
     Only set CR2 and DR6 when consumption of the payload is imminent,
     but on the other hand force delivery of the payload in all paths
     where userspace retrieves CR2 or DR6

   - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT
     instead of vmcb02->save.cr2. The value is out of sync after a
     save/restore or after a #PF is injected into L2

   - Fix a class of nSVM bugs where some fields written by the CPU are
     not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so
     are not up-to-date when saved by KVM_GET_NESTED_STATE

   - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE
     and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly
     initialized after save+restore

   - Add a variety of missing nSVM consistency checks

   - Fix several bugs where KVM failed to correctly update VMCB fields
     on nested #VMEXIT

   - Fix several bugs where KVM failed to correctly synthesize #UD or
     #GP for SVM-related instructions

   - Add support for save+restore of virtualized LBRs (on SVM)

   - Refactor various helpers and macros to improve clarity and
     (hopefully) make the code easier to maintain

   - Aggressively sanitize fields when copying from vmcb12, to guard
     against unintentionally allowing L1 to utilize yet-to-be-defined
     features

   - Fix several bugs where KVM botched rAX legality checks when
     emulating SVM instructions. There are remaining issues in that KVM
     doesn't handle size prefix overrides for 64-bit guests

   - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
     instead of somewhat arbitrarily synthesizing #GP (i.e. don't double
     down on AMD's architectural but sketchy behavior of generating #GP
     for "unsupported" addresses)

   - Cache all used vmcb12 fields to further harden against TOCTOU bugs

  x86 (Intel):

   - Drop obsolete branch hint prefixes from the VMX instruction macros

   - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a
     register input when appropriate

   - Code cleanups

  guest_memfd:

   - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't
     support reclaim, the memory is unevictable, and there is no storage
     to write back to

  LoongArch selftests:

   - Add KVM PMU test cases

  s390 selftests:

   - Enable more memory selftests

  x86 selftests:

   - Add support for Hygon CPUs in KVM selftests

   - Fix a bug in the MSR test where it would get false failures on
     AMD/Hygon CPUs with exactly one of RDPID or RDTSCP

   - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test
     for a bug where the kernel would attempt to collapse guest_memfd
     folios against KVM's will"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits)
  KVM: x86: use inlines instead of macros for is_sev_*guest
  x86/virt: Treat SVM as unsupported when running as an SEV+ guest
  KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails
  KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper
  KVM: SEV: use mutex guard in snp_handle_guest_req()
  KVM: SEV: use mutex guard in sev_mem_enc_unregister_region()
  KVM: SEV: use mutex guard in sev_mem_enc_ioctl()
  KVM: SEV: use mutex guard in snp_launch_update()
  KVM: SEV: Assert that kvm->lock is held when querying SEV+ support
  KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe"
  KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y
  KVM: SEV: WARN on unhandled VM type when initializing VM
  KVM: LoongArch: selftests: Add PMU overflow interrupt test
  KVM: LoongArch: selftests: Add basic PMU event counting test
  KVM: LoongArch: selftests: Add cpucfg read/write helpers
  LoongArch: KVM: Add DMSINTC inject msi to vCPU
  LoongArch: KVM: Add DMSINTC device support
  LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function
  LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch
  LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch
  ...
2026-04-17 07:18:03 -07:00
Borislav Petkov (AMD)
e55d98e775 x86/CPU: Fix FPDSS on Zen1
Zen1's hardware divider can leave, under certain circumstances, partial
results from previous operations.  Those results can be leaked by
another, attacker thread.

Fix that with a chicken bit.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-04-17 06:04:42 -07:00
Linus Torvalds
440d6635b2 mm.git review status for linus..mm-nonmm-stable
Total patches:       126
 Reviews/patch:       0.92
 Reviewed rate:       76%
 
 - The 2 patch series "pid: make sub-init creation retryable" from Oleg
   Nesterov increases the robustness of our creation of init in a new
   namespace.  By clearing away some historical cruft which is no longer
   needed.  Also some documentation fixups are provided.
 
 - The 2 patch series "selftests/fchmodat2: Error handling and general"
   from Mark Brown has a fixup and a cleanup for the fchmodat2() syscall
   selftest.
 
 - The 3 patch series "lib: polynomial: Move to math/ and clean up" from
   Andy Shevchenko does as advertised.
 
 - The 3 patch series "hung_task: Provide runtime reset interface for
   hung task detector" from Aaron Tomlin gives administrators the ability
   to zero out /proc/sys/kernel/hung_task_detect_count.
 
 - The 2 patch series "tools/getdelays: use the static UAPI headers from
   tools/include/uapi" from Thomas Weißschuh teaches getdelays to use the
   in-kernel UAPI headers rather than the system-provided ones.
 
 - The 5 patch series "watchdog/hardlockup: Improvements to hardlockup"
   from Mayank Rungta provides several cleanups and fixups to the
   hardlockup detector code and its documentation.
 
 - The 2 patch series "lib/bch: fix undefined behavior from signed
   left-shifts" from Josh Law provides a couple of small/theoretical fixes
   in the bch code.
 
 - The 2 patch series "ocfs2/dlm: fix two bugs in dlm_match_regions()"
   from Junrui Luo does what is claims.
 
 - The 27 patch series "cleanup the RAID5 XOR library" from Christoph
   Hellwig is a quite far-reaching cleanup to this code.  I can't do better
   than to quote Christoph:
 
     The XOR library used for the RAID5 parity is a bit of a mess right
     now.  The main file sits in crypto/ despite not being cryptography and
     not using the crypto API, with the generic implementations sitting in
     include/asm-generic and the arch implementations sitting in an asm/
     header in theory.  The latter doesn't work for many cases, so
     architectures often build the code directly into the core kernel, or
     create another module for the architecture code.
 
     Change this to a single module in lib/ that also contains the
     architecture optimizations, similar to the library work Eric Biggers
     has done for the CRC and crypto libraries later.  After that it
     changes to better calling conventions that allow for smarter
     architecture implementations (although none is contained here yet),
     and uses static_call to avoid indirection function call overhead.
 
 - The 2 patch series "lib/list_sort: Clean up list_sort() scheduling
   workarounds" from Kuan-Wei Chiu cleans up this library code by removing
   a hacky thing which was added for UBIFS, which UBIFS doesn't actually
   need.
 
 - The 5 patch series "Fix bugs in extract_iter_to_sg()" from Christian
   Ehrhardt fixes a few bugs in the scatterlist code, adds in-kernel tests
   for the now-fixed bugs and fixes a leak in the test itself.
 
 - The 3 patch series "kdump: Enable LUKS-encrypted dump target support
   in ARM64 and PowerPC" from Coiby Xu eenables support of the
   LUKS-encrypted device dump target on arm64 and powerpc.
 
 - The 4 patch series "ocfs2: consolidate extent list validation into
   block read callbacks" from Joseph Qi addresses ocfs2's validation of
   extent list fields - cleanup, simplification, robustness.  (Kernel test
   robot loves mounting corrupted fs images!)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad90rQAKCRDdBJ7gKXxA
 jl7rAQD4/Rq7ZSSnEv6FS4gOwc3MgTdWcZZaXkqL1KiWyYhRwAEA+cVCO344+AKb
 znBOjet/hUr+/kBwyViifiC8LHzchwM=
 =Nfnf
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - "pid: make sub-init creation retryable" (Oleg Nesterov)

   Make creation of init in a new namespace more robust by clearing away
   some historical cruft which is no longer needed. Also some
   documentation fixups

 - "selftests/fchmodat2: Error handling and general" (Mark Brown)

   Fix and a cleanup for the fchmodat2() syscall selftest

 - "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko)

 - "hung_task: Provide runtime reset interface for hung task detector"
   (Aaron Tomlin)

   Give administrators the ability to zero out
   /proc/sys/kernel/hung_task_detect_count

 - "tools/getdelays: use the static UAPI headers from
   tools/include/uapi" (Thomas Weißschuh)

   Teach getdelays to use the in-kernel UAPI headers rather than the
   system-provided ones

 - "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta)

   Several cleanups and fixups to the hardlockup detector code and its
   documentation

 - "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law)

   A couple of small/theoretical fixes in the bch code

 - "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo)

 - "cleanup the RAID5 XOR library" (Christoph Hellwig)

   A quite far-reaching cleanup to this code. I can't do better than to
   quote Christoph:

     "The XOR library used for the RAID5 parity is a bit of a mess right
      now. The main file sits in crypto/ despite not being cryptography
      and not using the crypto API, with the generic implementations
      sitting in include/asm-generic and the arch implementations
      sitting in an asm/ header in theory. The latter doesn't work for
      many cases, so architectures often build the code directly into
      the core kernel, or create another module for the architecture
      code.

      Change this to a single module in lib/ that also contains the
      architecture optimizations, similar to the library work Eric
      Biggers has done for the CRC and crypto libraries later. After
      that it changes to better calling conventions that allow for
      smarter architecture implementations (although none is contained
      here yet), and uses static_call to avoid indirection function call
      overhead"

 - "lib/list_sort: Clean up list_sort() scheduling workarounds"
   (Kuan-Wei Chiu)

   Clean up this library code by removing a hacky thing which was added
   for UBIFS, which UBIFS doesn't actually need

 - "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt)

   Fix a few bugs in the scatterlist code, add in-kernel tests for the
   now-fixed bugs and fix a leak in the test itself

 - "kdump: Enable LUKS-encrypted dump target support in ARM64 and
   PowerPC" (Coiby Xu)

   Enable support of the LUKS-encrypted device dump target on arm64 and
   powerpc

 - "ocfs2: consolidate extent list validation into block read callbacks"
   (Joseph Qi)

   Cleanup, simplify, and make more robust ocfs2's validation of extent
   list fields (Kernel test robot loves mounting corrupted fs images!)

* tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits)
  ocfs2: validate group add input before caching
  ocfs2: validate bg_bits during freefrag scan
  ocfs2: fix listxattr handling when the buffer is full
  doc: watchdog: fix typos etc
  update Sean's email address
  ocfs2: use get_random_u32() where appropriate
  ocfs2: split transactions in dio completion to avoid credit exhaustion
  ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
  ocfs2: validate extent block list fields during block read
  ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
  ocfs2: validate dx_root extent list fields during block read
  ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
  ocfs2: handle invalid dinode in ocfs2_group_extend
  .get_maintainer.ignore: add Askar
  ocfs2: validate bg_list extent bounds in discontig groups
  checkpatch: exclude forward declarations of const structs
  tools/accounting: handle truncated taskstats netlink messages
  taskstats: set version in TGID exit notifications
  ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
  arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel
  ...
2026-04-16 20:11:56 -07:00
Linus Torvalds
334fbe734e mm.git review status for linus..mm-stable
Everything:
 
 Total patches:       368
 Reviews/patch:       1.56
 Reviewed rate:       74%
 
 Excluding DAMON:
 
 Total patches:       316
 Reviews/patch:       1.77
 Reviewed rate:       81%
 
 Excluding DAMON and zram:
 
 Total patches:       306
 Reviews/patch:       1.81
 Reviewed rate:       82%
 
 Excluding DAMON, zram and maple_tree:
 
 Total patches:       276
 Reviews/patch:       2.01
 Reviewed rate:       91%
 
 Significant patch series in this merge:
 
 - The 30 patch series "maple_tree: Replace big node with maple copy"
   from Liam Howlett is mainly prepararatory work for ongoing development
   but it does reduce stack usage and is an improvement.
 
 - The 12 patch series "mm, swap: swap table phase III: remove swap_map"
   from Kairui Song offers memory savings by removing the static swap_map.
   It also yields some CPU savings and implements several cleanups.
 
 - The 2 patch series "mm: memfd_luo: preserve file seals" from Pratyush
   Yadav adds file seal preservation to LUO's memfd code.
 
 - The 2 patch series "mm: zswap: add per-memcg stat for incompressible
   pages" from Jiayuan Chen adds additional userspace stats reportng to
   zswap.
 
 - The 4 patch series "arch, mm: consolidate empty_zero_page" from Mike
   Rapoport implements some cleanups for our handling of ZERO_PAGE() and
   zero_pfn.
 
 - The 2 patch series "mm/kmemleak: Improve scan_should_stop()
   implementation" from Zhongqiu Han provides an robustness improvement and
   some cleanups in the kmemleak code.
 
 - The 4 patch series "Improve khugepaged scan logic" from Vernon Yang
   "improves the khugepaged scan logic and reduces CPU consumption by
   prioritizing scanning tasks that access memory frequently".
 
 - The 2 patch series "Make KHO Stateless" from Jason Miu simplifies
   Kexec Handover by "transitioning KHO from an xarray-based metadata
   tracking system with serialization to a radix tree data structure that
   can be passed directly to the next kernel"
 
 - The 3 patch series "mm: vmscan: add PID and cgroup ID to vmscan
   tracepoints" from Thomas Ballasi and Steven Rostedt enhances vmscan's
   tracepointing.
 
 - The 5 patch series "mm: arch/shstk: Common shadow stack mapping helper
   and VM_NOHUGEPAGE" from Catalin Marinas is a cleanup for the shadow
   stack code: remove per-arch code in favour of a generic implementation.
 
 - The 2 patch series "Fix KASAN support for KHO restored vmalloc
   regions" from Pasha Tatashin fixes a WARN() which can be emitted the KHO
   restores a vmalloc area.
 
 - The 4 patch series "mm: Remove stray references to pagevec" from Tal
   Zussman provides several cleanups, mainly udpating references to "struct
   pagevec", which became folio_batch three years ago.
 
 - The 17 patch series "mm: Eliminate fake head pages from vmemmap
   optimization" from Kiryl Shutsemau simplifies the HugeTLB vmemmap
   optimization (HVO) by changing how tail pages encode their relationship
   to the head page.
 
 - The 2 patch series "mm/damon/core: improve DAMOS quota efficiency for
   core layer filters" from SeongJae Park improves two problematic
   behaviors of DAMOS that makes it less efficient when core layer filters
   are used.
 
 - The 3 patch series "mm/damon: strictly respect min_nr_regions" from
   SeongJae Park improves DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter.
 
 - The 3 patch series "mm/page_alloc: pcp locking cleanup" from Vlastimil
   Babka is a proper fix for a previously hotfixed SMP=n issue.  Code
   simplifications and cleanups ennsed.
 
 - The 16 patch series "mm: cleanups around unmapping / zapping" from
   David Hildenbrand implements "a bunch of cleanups around unmapping and
   zapping.  Mostly simplifications, code movements, documentation and
   renaming of zapping functions".
 
 - The 6 patch series "support batched checking of the young flag for
   MGLRU" from Baolin Wang supports batched checking of the young flag for
   MGLRU.  It's part cleanups; one benchmark shows large performance
   benefits for arm64.
 
 - The 5 patch series "memcg: obj stock and slab stat caching cleanups"
   from Johannes Weiner provides memcg cleanup and robustness improvements.
 
 - The 5 patch series "Allow order zero pages in page reporting" from
   Yuvraj Sakshith enhances page_reporting's free page reporting - it is
   presently and undesirably order-0 pages when reporting free memory.
 
 - The 6 patch series "mm: vma flag tweaks" from Lorenzo Stoakes is
   cleanup work following from the recent conversion of the VMA flags to a
   bitmap.
 
 - The 10 patch series "mm/damon: add optional debugging-purpose sanity
   checks" from SeongJae Park adds some more developer-facing debug checks
   into DAMON core.
 
 - The 2 patch series "mm/damon: test and document power-of-2
   min_region_sz requirement" from SeongJae Park adds an additional DAMON
   kunit test and makes some adjustments to the addr_unit parameter
   handling.
 
 - The 3 patch series "mm/damon/core: make passed_sample_intervals
   comparisons overflow-safe" from SeongJae Park fixes a hard-to-hit time
   overflow issue in DAMON core.
 
 - The 7 patch series "mm/damon: improve/fixup/update ratio calculation,
   test and documentation" from SeongJae Park is a "batch of misc/minor
   improvements and fixups" for DAMON.
 
 - The 4 patch series "mm: move vma_(kernel|mmu)_pagesize() out of
   hugetlb.c" from David Hildenbrand fixes a possible issue with dax-device
   when CONFIG_HUGETLB=n.  Some code movement was required.
 
 - The 6 patch series "zram: recompression cleanups and tweaks" from
   Sergey Senozhatsky provides "a somewhat random mix of fixups,
   recompression cleanups and improvements" in the zram code.
 
 - The 11 patch series "mm/damon: support multiple goal-based quota
   tuning algorithms" from SeongJae Park extend DAMOS quotas goal
   auto-tuning to support multiple tuning algorithms that users can select.
 
 - The 4 patch series "mm: thp: reduce unnecessary
   start_stop_khugepaged()" from Breno Leitao fixes the khugpaged sysfs
   handling so we no longer spam the logs with reams of junk when
   starting/stopping khugepaged.
 
 - The 3 patch series "mm: improve map count checks" from Lorenzo Stoakes
   provides some cleanups and slight fixes in the mremap, mmap and vma
   code.
 
 - The 5 patch series "mm/damon: support addr_unit on default monitoring
   targets for modules" from SeongJae Park extends the use of DAMON core's
   addr_unit tunable.
 
 - The 5 patch series "mm: khugepaged cleanups and mTHP prerequisites"
   from Nico Pache provides cleanups in the khugepaged and is a base for
   Nico's planned khugepaged mTHP support.
 
 - The 15 patch series "mm: memory hot(un)plug and SPARSEMEM cleanups"
   from David Hildenbrand implements code movement and cleanups in the
   memhotplug and sparsemem code.
 
 - The 2 patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and
   cleanup CONFIG_MIGRATION" from David Hildenbrand rationalizes some
   memhotplug Kconfig support.
 
 - The 6 patch series "change young flag check functions to return bool"
   from Baolin Wang is "a cleanup patchset to change all young flag check
   functions to return bool".
 
 - The 3 patch series "mm/damon/sysfs: fix memory leak and NULL
   dereference issues" from Josh Law and SeongJae Park fixes a few
   potential DAMON bugs.
 
 - The 25 patch series "mm/vma: convert vm_flags_t to vma_flags_t in vma
   code" from "converts a lot of the existing use of the legacy vm_flags_t
   data type to the new vma_flags_t type which replaces it".  Mainly in the
   vma code.
 
 - The 21 patch series "mm: expand mmap_prepare functionality and usage"
   from Lorenzo Stoakes "expands the mmap_prepare functionality, which is
   intended to replace the deprecated f_op->mmap hook which has been the
   source of bugs and security issues for some time".  Cleanups,
   documentation, extension of mmap_prepare into filesystem drivers.
 
 - The 13 patch series "mm/huge_memory: refactor zap_huge_pmd()" from
   Lorenzo Stoakes simplifies and cleans up zap_huge_pmd().  Additional
   cleanups around vm_normal_folio_pmd() and the softleaf functionality are
   performed.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad3HDQAKCRDdBJ7gKXxA
 jrUQAPwNhPk5nPSxnyxjAeQtOBHqgCdnICeEismLajPKd9aYRgEA0s2XAu3tSUYi
 GrBnWImHG3s4ePQxVcPCegWTsOUrXgQ=
 =1Q7o
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "maple_tree: Replace big node with maple copy" (Liam Howlett)

   Mainly prepararatory work for ongoing development but it does reduce
   stack usage and is an improvement.

 - "mm, swap: swap table phase III: remove swap_map" (Kairui Song)

   Offers memory savings by removing the static swap_map. It also yields
   some CPU savings and implements several cleanups.

 - "mm: memfd_luo: preserve file seals" (Pratyush Yadav)

   File seal preservation to LUO's memfd code

 - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
   Chen)

   Additional userspace stats reportng to zswap

 - "arch, mm: consolidate empty_zero_page" (Mike Rapoport)

   Some cleanups for our handling of ZERO_PAGE() and zero_pfn

 - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
   Han)

   A robustness improvement and some cleanups in the kmemleak code

 - "Improve khugepaged scan logic" (Vernon Yang)

   Improve khugepaged scan logic and reduce CPU consumption by
   prioritizing scanning tasks that access memory frequently

 - "Make KHO Stateless" (Jason Miu)

   Simplify Kexec Handover by transitioning KHO from an xarray-based
   metadata tracking system with serialization to a radix tree data
   structure that can be passed directly to the next kernel

 - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
   Ballasi and Steven Rostedt)

   Enhance vmscan's tracepointing

 - "mm: arch/shstk: Common shadow stack mapping helper and
   VM_NOHUGEPAGE" (Catalin Marinas)

   Cleanup for the shadow stack code: remove per-arch code in favour of
   a generic implementation

 - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)

   Fix a WARN() which can be emitted the KHO restores a vmalloc area

 - "mm: Remove stray references to pagevec" (Tal Zussman)

   Several cleanups, mainly udpating references to "struct pagevec",
   which became folio_batch three years ago

 - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
   Shutsemau)

   Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
   pages encode their relationship to the head page

 - "mm/damon/core: improve DAMOS quota efficiency for core layer
   filters" (SeongJae Park)

   Improve two problematic behaviors of DAMOS that makes it less
   efficient when core layer filters are used

 - "mm/damon: strictly respect min_nr_regions" (SeongJae Park)

   Improve DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter

 - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)

   The proper fix for a previously hotfixed SMP=n issue. Code
   simplifications and cleanups ensued

 - "mm: cleanups around unmapping / zapping" (David Hildenbrand)

   A bunch of cleanups around unmapping and zapping. Mostly
   simplifications, code movements, documentation and renaming of
   zapping functions

 - "support batched checking of the young flag for MGLRU" (Baolin Wang)

   Batched checking of the young flag for MGLRU. It's part cleanups; one
   benchmark shows large performance benefits for arm64

 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)

   memcg cleanup and robustness improvements

 - "Allow order zero pages in page reporting" (Yuvraj Sakshith)

   Enhance free page reporting - it is presently and undesirably order-0
   pages when reporting free memory.

 - "mm: vma flag tweaks" (Lorenzo Stoakes)

   Cleanup work following from the recent conversion of the VMA flags to
   a bitmap

 - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
   Park)

   Add some more developer-facing debug checks into DAMON core

 - "mm/damon: test and document power-of-2 min_region_sz requirement"
   (SeongJae Park)

   An additional DAMON kunit test and makes some adjustments to the
   addr_unit parameter handling

 - "mm/damon/core: make passed_sample_intervals comparisons
   overflow-safe" (SeongJae Park)

   Fix a hard-to-hit time overflow issue in DAMON core

 - "mm/damon: improve/fixup/update ratio calculation, test and
   documentation" (SeongJae Park)

   A batch of misc/minor improvements and fixups for DAMON

 - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
   Hildenbrand)

   Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
   movement was required.

 - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)

   A somewhat random mix of fixups, recompression cleanups and
   improvements in the zram code

 - "mm/damon: support multiple goal-based quota tuning algorithms"
   (SeongJae Park)

   Extend DAMOS quotas goal auto-tuning to support multiple tuning
   algorithms that users can select

 - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)

   Fix the khugpaged sysfs handling so we no longer spam the logs with
   reams of junk when starting/stopping khugepaged

 - "mm: improve map count checks" (Lorenzo Stoakes)

   Provide some cleanups and slight fixes in the mremap, mmap and vma
   code

 - "mm/damon: support addr_unit on default monitoring targets for
   modules" (SeongJae Park)

   Extend the use of DAMON core's addr_unit tunable

 - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)

   Cleanups to khugepaged and is a base for Nico's planned khugepaged
   mTHP support

 - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)

   Code movement and cleanups in the memhotplug and sparsemem code

 - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
   CONFIG_MIGRATION" (David Hildenbrand)

   Rationalize some memhotplug Kconfig support

 - "change young flag check functions to return bool" (Baolin Wang)

   Cleanups to change all young flag check functions to return bool

 - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
   Law and SeongJae Park)

   Fix a few potential DAMON bugs

 - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
   Stoakes)

   Convert a lot of the existing use of the legacy vm_flags_t data type
   to the new vma_flags_t type which replaces it. Mainly in the vma
   code.

 - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)

   Expand the mmap_prepare functionality, which is intended to replace
   the deprecated f_op->mmap hook which has been the source of bugs and
   security issues for some time. Cleanups, documentation, extension of
   mmap_prepare into filesystem drivers

 - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)

   Simplify and clean up zap_huge_pmd(). Additional cleanups around
   vm_normal_folio_pmd() and the softleaf functionality are performed.

* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm: fix deferred split queue races during migration
  mm/khugepaged: fix issue with tracking lock
  mm/huge_memory: add and use has_deposited_pgtable()
  mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
  mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
  mm/huge_memory: separate out the folio part of zap_huge_pmd()
  mm/huge_memory: use mm instead of tlb->mm
  mm/huge_memory: remove unnecessary sanity checks
  mm/huge_memory: deduplicate zap deposited table call
  mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
  mm/huge_memory: add a common exit path to zap_huge_pmd()
  mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
  mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
  mm/huge: avoid big else branch in zap_huge_pmd()
  mm/huge_memory: simplify vma_is_specal_huge()
  mm: on remap assert that input range within the proposed VMA
  mm: add mmap_action_map_kernel_pages[_full]()
  uio: replace deprecated mmap hook with mmap_prepare in uio_info
  drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
  mm: allow handling of stacked mmap_prepare hooks in more drivers
  ...
2026-04-15 12:59:16 -07:00
Linus Torvalds
508fed6795 - Add new AMD MCA bank names and types to the MCA code, preceded by a clean
up of the relevant places to have them more developer-friendly (read: sort
   them alphanumerically and clean up comments) such that adding new banks is
   easy
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmndH/cACgkQEsHwGGHe
 VUpqFBAAvjQCWdL5GQ0sV4EyYVToj4OKU3DmUCJLMKEh3n3yrQpPsbU9+KfxyndP
 B68lRfRqqV/uUxQGebh7Rnp8o2jWphU2hf1Lr0Ssl6y5ouKWs5Up4foLlG4hAhzC
 2MmHVz+jj8Z3FWKLxMEymxqq6wLF+0H3Issd/l23DkK6hMQCkjKc6WrSNC6JBDCA
 sSF5kR/E4Q/lcW12ncq4pUYwkKox2lcdsNtI/nC7W7W+CoqwpOq8MfomCDIII+A0
 Ib7baeRxagOk0WHlfy15fGaDoKlHW6ImT3cVYBK/tomp8dpG2zRMXHHQExan2rBR
 rHzvk3aHEgOr02DZJ/dxOT+libQIkBwno+DheEhJHcirB/gS5Z51ERhkyzqLReGv
 +XSO1Eq9j5bqiVn8RdPeJIVLtfqnOrpcks+cCmyH0AlLIx1WV5mSRUtmVl1kWyq1
 GBos0yOnH4PgMxqv8fNkfNjm1ATnHyrVjYl5YNKSzJHhu/8BYcQJ4X8R0f2m0pXS
 WI6uXf35C6rJcKj25qo1Nnhmj5YDWJgelJjes9ZtmRMPDNNooD4VLk1W6ox7VuOY
 QaNMNwrroLRdfOlaz7oYIUAuoaZbZnTqbz8Lfmb4UScLd9LfI5ZPqs7pB5VORApF
 5IYM/Wli+kQl2Qbz0CD6ZtfdidqR09H7oJBE/r6bEFePot2EpUY=
 =aivF
 -----END PGP SIGNATURE-----

Merge tag 'ras_core_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS updates from Borislav Petkov:

 - Add new AMD MCA bank names and types to the MCA code, preceded by a
   clean up of the relevant places to have them more developer-friendly
   (read: sort them alphanumerically and clean up comments) such that
   adding new banks is easy

* tag 'ras_core_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce, EDAC/mce_amd: Add new SMCA bank types
  x86/mce, EDAC/mce_amd: Update CS bank type naming
  x86/mce, EDAC/mce_amd: Reorder SMCA bank type enums
2026-04-14 15:32:39 -07:00
Linus Torvalds
970216e023 — Reference the tip tree maintainer handbook directly from the relevant
MAINTAINERS file entries (covering timers, IRQ, locking, scheduling, perf,
   x86, and others) so that contributors and tooling can know where to
   look
 
 — Enable interrupt remapping in defconfig, which is an architectural
   requirement for x2APIC to function correctly on bare metal.  Without it,
   x2APIC was effectively enabled but non-functional.
 
 — Ensure that drivers which register custom restart handlers (such as those
   needed for SoC-based x86 devices like Intel Lightning Mountain) are actually
   invoked during reboot, bringing x86 in line with how other architectures
   handle this.
 
 - Cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmndUHwACgkQEsHwGGHe
 VUridxAAgSJ7wLZ8rypjCA6W17KLTMOn/ymvrR3NEU/3T8R4m5ax5gHNUgyo/cb9
 T9H9eer5NRnDjlpKHXG9QTExxVMyrMqZEBcV04wxExKrWTBqWs8tc8Y1ux6sDm5I
 84iPjnaI1ideP6Mu8YwxOgUUGHExICh7gdHsvZGAPt7QupGTc3liT0AftT9qNMd7
 c33kx05gJshqdIuI8HFVZGlTu8FTfgYl2AGjVikjGx1i1A8LYlERhDsUu8VEDAsA
 w2KzIH0rINXWQPvfjtKEK9r9KVsqe4Ye28ku3OmVh3ScflBXrAVOzJSH5DXvoy6f
 ktw5/rn5zUDX+FHNt0amaZItBqXNB5I/nyn1F8qraTTvK9obC7urYL1Q7x/Sg6j8
 zqRTdhBQcqvmGiO9FSvUfJyk+VMAakNXGWHNGIoivuoopfIyieeN3jpaQObZ9dld
 szrOL0WBduQ9gnke6KJ8VMFhIt2vAqCw+PS+OABwbBvN9rSXnpMBys6dLblU1aoF
 l93L3g8oHIZimPunGNxqFe77SunA/iVlT4dnXDVGIWmte/0fn967SgtEzRESkOEN
 XipPW8cBhlZBAD0qN9UBkEU7EUM5G3iLFNIDyA0ONbL+OheE+iVtmO7RelAo4sPh
 ChM4bOaXMcxRrZf1ExoT86pYRH3Z+4lrhnIjC8ba93BkEi6dfoA=
 =uJyt
 -----END PGP SIGNATURE-----

Merge tag 'x86_misc_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 updates from Borislav Petkov:

 - Reference the tip tree maintainer handbook directly from the relevant
   MAINTAINERS file entries (covering timers, IRQ, locking, scheduling,
   perf, x86, and others) so that contributors and tooling can know
   where to look

 - Enable interrupt remapping in defconfig, which is an architectural
   requirement for x2APIC to function correctly on bare metal. Without
   it, x2APIC was effectively enabled but non-functional.

 - Ensure that drivers which register custom restart handlers (such as
   those needed for SoC-based x86 devices like Intel Lightning Mountain)
   are actually invoked during reboot, bringing x86 in line with how
   other architectures handle this.

 - Cleanups

* tag 'x86_misc_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Add references to tip tree handbook
  x86/64/defconfig: Add CONFIG_IRQ_REMAP
  x86/reboot: Execute the kernel restart handler upon machine restart
  x86/mtrr: Use kstrtoul() in parse_mtrr_spare_reg()
2026-04-14 15:15:08 -07:00
Linus Torvalds
cd4cdc53cc - The kernel carries a table of Intel CPUs family,model,stepping,...,etc
tuples which say what is the latest microcode for that particular CPU. Some
   CPU variants differ only by the platform ID which determines what microcode
   needs to be loaded on them.
 
   Carve out the platform ID handling from the microcode loader and make it
   available in a more generic place so that the old microcode
   verification machinery can use it
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmndSBoACgkQEsHwGGHe
 VUoJShAAjwkQMAYzFPrNXCAZzmyXr0At6YI7QVfhCiZnnbhROiRn9BxpRYby5QX9
 VbsBG56uwhHmDklKCeIqz8RyNSDAb0I7JxDVJk+tL4HzoSJjmD8k17ETrl89Sne0
 QB7J6PXAETfXAwEVPI1KID0fcGSE6EWkxJqM5v1CcV/hsaY2Ye3CNI7fwZNDAx1l
 mCbhZ0KqdFH6xjp77z8WlwDquNl7/0vyeTpNYv0CL1TMOZsSUXx8ABY/oI9iTNou
 27yd9L1vATVtAUfDhq67eh9bVV/rrXwBGLIiflgVchxEZ6FDgdZ2a3NxmtMz3xAc
 yam2JM0dSo4ZaB1WfJpYJMfm2jBMOtA4Qx10wi19ytfdfKocceTGr2zZ01zQp/5T
 9qEilQDt/bjpFZQ93qfXcdAK/u7xd3ZYJ+6ZwYWhrMDiwK9+0iRcdqrqUI6ETdqr
 UhUMgU0w4ukneWRqfqyhNeZQrQDYpkCsOGCwF+ruW6U3dHOljEKtMD6TyT6gZmX2
 aUFrSLEq3LnJVjvAuDq/sHneWMYBIU10cSfl10FrD7G4ypnMLjoY2e5RB2fwlfje
 sFJthPl0OGt64Ix/cWuDfR6wUIfRi8IB57xajvAPQEUFKHZP2LHrpygBY9aOsPfW
 fqNoXp52CqznijhLp83kfTGezW8UY7bVNsAJ5a5oKlJ0X4X9d8E=
 =W/iT
 -----END PGP SIGNATURE-----

Merge tag 'x86_microcode_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 microcode loading updates from Borislav Petkov:
 "The kernel carries a table of Intel CPUs family, model, stepping, etc
  tuples which say what is the latest microcode for that particular CPU.

  Some CPU variants differ only by the platform ID which determines what
  microcode needs to be loaded on them.

  Carve out the platform ID handling from the microcode loader and make it
  available in a more generic place so that the old microcode
  verification machinery can use it"

* tag 'x86_microcode_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Add platform mask to Intel microcode "old" list
  x86/cpu: Add platform ID to CPU matching structure
  x86/cpu: Add platform ID to CPU info structure
  x86/microcode: Refactor platform ID enumeration into a helper
2026-04-14 14:57:29 -07:00
Linus Torvalds
e9635f2a73 - We made the FRED support an opt-in initially out of fear of it breaking
machines left and right in the case of a hw bug in the first generation of
   machines supporting it. Now that that the FRED code has seen a lot of
   hammering, flip the logic to be opt-out as is the usual case with new hw
   features
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmndQUQACgkQEsHwGGHe
 VUpSQRAAqn7bxSqhZoLyzejovGxTSvHqSMaJGDhnkafA1xUNsiAvgONdMyQqxq3r
 XNQuLST/BCtYqllM7YLfJkKWAV+rpdp6teLA7QnxbU4BFEPwzKQ9OFDeiFDOTBMq
 3Bde+t4YbwB05PpwPN8veZEEIXVDxPzzHe9I4M4x5VdsXhDHYVKuonR/DCAK1rpQ
 UKIXNhDGEcBdSCauOujtNwDZcmYTtcVpxWFkIwzOW0CiM8n/ys6forBTK90W3wg2
 geL9r0LZqgBfeYdL16qqFtOCvqpE1pF9ot8JrPaQZa8r/Zw/wJ06faSjQcygQg/d
 8gEJzC92eSU/IbwPdJBw/25D5vf/ZXztUgw2DDXCzQ6tBZ7N584YamVpHbh/WFRg
 degZ2Qu9AnLZquknBwJqp9y3nx8NQvIjsJZ/Q04+USh73LBlFhqHxwOcMGQ7yJHV
 nPhM9idlzcyTamjiJ1xw9WKx/07sI8P5SNmnte+XWJkbfwM5+mkhIuaRwH7IfRXu
 y/IODC3ulOLIKYyWN4ybUZnVXbUou5Fsz6o0tv6msAjStUj0O/mv/ImfiDLi6unU
 73NQdXzYlFz6YKBvXfQxvjP3/r48i4XWeCc1wJMDAeOegXsBbJRDBuv27bE4kujp
 +7apQCe8BVyQDk0M+DdlnMUJRt0w6NHNakB671wJl+ooFDBENas=
 =XX9t
 -----END PGP SIGNATURE-----

Merge tag 'x86_fred_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 FRED updates from Borislav Petkov:
 "We made the FRED support an opt-in initially out of fear of it
  breaking machines left and right in the case of a hw bug in the first
  generation of machines supporting it.

  Now that that the FRED code has seen a lot of hammering, flip the
  logic to be opt-out as is the usual case with new hw features"

* tag 'x86_fred_for_v7.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fred: Remove kernel log message when initializing exceptions
  x86/fred: Enable FRED by default
2026-04-14 14:50:51 -07:00
Linus Torvalds
9f2bb6c7b3 - Complete LASS enabling: deal with vsyscall and EFI
- Clean up CPUID usage in newer Intel audio driver
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmndDsMACgkQaDWVMHDJ
 krAOhRAAi3BsToTbmKayQJKWwVr5aGbLYXe3FiBv5juPprsJDF4otd0ALkdbf5Ls
 n7nND4L9RBfcGZBu25vo60Y1I1b2JReu9Y8YXxsrEx+WIxTPYClOJa7AqakyNxc4
 zbTCTv6TuEyXqG5VWe3AItWrMToc49TvUVoN2fi+V89fLWA8IOkMoIhbAWwQPBFI
 ir0N4UwC/RtRBFf9qc9jesirqGSgh6SEAK6oBYM3C2PV/njoKljTconaCnF7uJlq
 WxxVcb07dIvOdpdE941OZ6jyS6eePr2feXOazNjnPb0py5Ai4WCsEGdK68ltQ5fM
 EhY24Ex6ltudgphB/cajKIKZ8LUCWgnMwTotY8IQMQiQ47JKmyrucdLUwvhl0qTD
 IAWiOMvb5d3X+CoKt3lXlzc29WhbogXzvxjZE29ad8Dm7DVBEOV57bXDwhnHb7LS
 rro7odiohsw7FMhfqxlb6NhYzCbdpBbmY3IFzXVgTZlWIXU1g/yBVz/2O0qsKLaw
 QQ2NKre6lioP2H3pWc0fi53336svzCSuuExCiu47Hx9WcYBX1Poq/AkpAoxPErq+
 DRn8lXVwqMpefevlHK9XlZjSpwwwltULmId3LFxh3Z53b1NABrKpVEEJ/VChpzzA
 xC8dLdu1pbJa2jdccL6mBDYlMvyWLCmKfAlAHdRh8nLmOZxKRHg=
 =kPr7
 -----END PGP SIGNATURE-----

Merge tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu updates from Dave Hansen:

 - Complete LASS enabling: deal with vsyscall and EFI

   The existing Linear Address Space Separation (LASS) support punted
   on support for common EFI and vsyscall configs. Complete the
   implementation by supporting EFI and vsyscall=xonly.

 - Clean up CPUID usage in newer Intel "avs" audio driver and update the
   x86-cpuid-db file

* tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/x86/kcpuid: Update bitfields to x86-cpuid-db v3.0
  ASoC: Intel: avs: Include CPUID header at file scope
  ASoC: Intel: avs: Check maximum valid CPUID leaf
  x86/cpu: Remove LASS restriction on vsyscall emulation
  x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE
  x86/vsyscall: Restore vsyscall=xonly mode under LASS
  x86/traps: Consolidate user fixups in the #GP handler
  x86/vsyscall: Reorganize the page fault emulation code
  x86/cpu: Remove LASS restriction on EFI
  x86/efi: Disable LASS while executing runtime services
  x86/cpu: Defer LASS enabling until userspace comes up
2026-04-14 14:24:45 -07:00
Linus Torvalds
0972ba5605 x86/platform changes for v7.1:
- Remove M486/M486SX/ELAN support, first minimal step (Ingo Molnar)
 
  - Print AGESA string from DMI additional information entry
    (Yazen Ghannam, Mario Limonciello)
 
  - Improve and fix the DMI code (Mario Limonciello):
     - Correct an indexing error in <linux/dmi.h>
     - Adjust dmi_decode() to use enums <linux/dmi.h>
     - Add pr_fmt() for dmi_scan.c to fix & standardize
       the log prefixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmncskARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1htlQ//ftceW6GTd/MbFnpHlm6yJDbxAYVNtRJD
 eiaRd/A3p73+ljRuaUNQflkP+4IYpFDNd2ZmwvY2MDVx1lYRVR5xbPLcHqJC7s95
 gDN/v4VV4iseiGMxx0hB7MqILgK6eSqrSaiQWBVxIsEHc+Y92tlQvnM+SgFJqaiw
 K0ZRDQOWWNWGE0SS5GcMwlzvdg01XGn9S/eVrvSZ1UO81a0IjFltz861z+YxzOP2
 b1fo9K4HdIH0C8fYcakJDNsxFzijZKXpCFhC/XwTJ918k5PFFM9plmgW1VDZzAkP
 XAnx4m55GmL+s+7vU1fQZZYqvgL02URnAlc7CQpbHEQ+WAJqdj3ACaQCj3OExHVH
 rqifH62SvpOhBqLgSTODKwL5X9jGwX3+aOjQ9SsALncVPZIGqqKMc6WXAGeOrNgW
 gelIn7p1DO3hKEdP3XtucU8CRcCVs27HSUjaJTAKYFYgNKzzuznXulg0LYzesJIe
 85xlDiQxN3AiPuAnYmsM8UVj9jHqDfRjTPPjFHDIi8UAEG0ZJtu98bw4zxdyEmPb
 fjbpjap52+iPFPISyZMqfy8n2L24B/SbftnllO/YGWudSgtPPCVt/KLJkG4i2AtJ
 YBlbJ3eBfz1gU7hZGN6BSXHXHQuEv4Az3q6ThiZ1Edpw0cEInow5CPowmY7/xqjH
 Fjv0QI4iMNY=
 =dr9A
 -----END PGP SIGNATURE-----

Merge tag 'x86-platform-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 platform updates from Ingo Molnar:

 - Remove M486/M486SX/ELAN support, first minimal step (Ingo Molnar)

 - Print AGESA string from DMI additional information entry (Yazen
   Ghannam, Mario Limonciello)

 - Improve and fix the DMI code (Mario Limonciello):
     - Correct an indexing error in <linux/dmi.h>
     - Adjust dmi_decode() to use enums <linux/dmi.h>
     - Add pr_fmt() for dmi_scan.c to fix & standardize the log prefixes

* tag 'x86-platform-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Print AGESA string from DMI additional information entry
  firmware: dmi: Add pr_fmt() for dmi_scan.c
  firmware: dmi: Adjust dmi_decode() to use enums
  firmware: dmi: Correct an indexing error in dmi.h
  x86/cpu: Remove M486/M486SX/ELAN support
2026-04-14 14:10:44 -07:00
Linus Torvalds
ac633ba77c Miscellaneous x86 cleanups for v7.1:
- Consolidate AMD and Hygon cases in parse_topology() (Wei Wang)
  - asm constraints cleanups in __iowrite32_copy() (Uros Bizjak)
  - Drop AMD Extended Interrupt LVT macros (Naveen N Rao)
  - Don't use REALLY_SLOW_IO for delays (Juergen Gross)
  - paravirt cleanups (Juergen Gross)
  - FPU code cleanups (Borislav Petkov)
  - split-lock handling code cleanups (Borislav Petkov, Ronan Pigott)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmncsF0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gZ1A//TmQrq/spNIRx7KqSjT9u9166OaQaBeA/
 r535C9n1d/rZxw0l10vIoWeSOpDIXEPLpMguvs463pvgLVfdrNOXABn1Kw1RR6dv
 yTHV47KUE1j9FIuJ4Y2fQeHhdTC8cdddrC06fEFOezftTMiAMIR/GMaeVA5ExzQd
 9tcyocH10gjhtKCF+ILFGt7OdPn75YDIc8ysJAAPrsF6Dw222K5E7p4XedmEYL54
 W7WVknLK2jP/BdXp17wDVunQP/Hl7huiM9DMgNlv6eliWV6nyH/hONRm5NrgBUEG
 s2URPPEu30thveMHQ1qv31P6ZY6lVFi0VylubJ+OdPofUJDCdCINRk22Bc6kXurZ
 Y8ZV93UyuIgVfvlI9J5UoHSkpi3owMjvrQShquxH2hDbCzzBvwpI7/+KHwWjgVsH
 9+xdOkjR40UrlmwhyyzqTzmB10mg2SM1/YK5Ca2DcneibIkQRlfXdNXQqNikWqhN
 COAEX6U5ayKEu/TjbiNH4zNInJCEQMI65Jiz+oTmdnf+iCQ1L2sp+zSOB6SoyQtp
 rTyubHDDGu6pq9IEATx3hn5BYO7t6Ly4KJksWCAJ0G8lnP3HRESD9l6QvjqipMWB
 JToVwWsuqgL3zWqCpuvBOErpHslgzN6Usbym6blyrp8ERKVIb2elDt9lDAWyz5X3
 7hS8sNulqDw=
 =Ox7s
 -----END PGP SIGNATURE-----

Merge tag 'x86-cleanups-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Ingo Molnar:

 - Consolidate AMD and Hygon cases in parse_topology() (Wei Wang)

 - asm constraints cleanups in __iowrite32_copy() (Uros Bizjak)

 - Drop AMD Extended Interrupt LVT macros (Naveen N Rao)

 - Don't use REALLY_SLOW_IO for delays (Juergen Gross)

 - paravirt cleanups (Juergen Gross)

 - FPU code cleanups (Borislav Petkov)

 - split-lock handling code cleanups (Borislav Petkov, Ronan Pigott)

* tag 'x86-cleanups-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Correct the comment explaining what xfeatures_in_use() does
  x86/split_lock: Don't warn about unknown split_lock_detect parameter
  x86/fpu: Correct misspelled xfeaures_to_write local var
  x86/apic: Drop AMD Extended Interrupt LVT macros
  x86/cpu/topology: Consolidate AMD and Hygon cases in parse_topology()
  block/floppy: Don't use REALLY_SLOW_IO for delays
  x86/paravirt: Replace io_delay() hook with a bool
  x86/irqflags: Preemptively move include paravirt.h directive where it belongs
  x86/split_lock: Restructure the unwieldy switch-case in sld_state_show()
  x86/local: Remove trailing semicolon from _ASM_XADD in local_add_return()
  x86/asm: Use inout "+" asm onstraint modifiers in __iowrite32_copy()
2026-04-14 14:03:27 -07:00
Linus Torvalds
2ee08a8963 x86/asm updates for v7.1, by Uros Bizjak:
- Remove unnecessary "memory" clobbers from FS/GS base (read-) accessors
  - Remove unnecessary "memory" clobber from savesegment()
  - Use ASM_INPUT_RM in __loadsegment_fs()
  - Implement loadsegment()/savesegment() macros with static inline helpers
  - Use savesegment() for segment register reads in ELF core dump
  - Use savesegment() in __show_regs() instead of inline asm
  - Use correct type for 'gs' variable in __show_regs() to avoid zero-extension
  - Clean up 'sel' variable usage in do_set_thread_area()
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmncrpURHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g9WQ//WchCbgsHbYx36liwbf2vTS/4DY1hhBer
 oEtWj5LBLASpEnKeYAmfqbLpUM2c4Nb5oP6ZHQtnz1AUnfpBl31P1RUNWdWEKzfY
 OZQ7MoEl7MQ6sPFxryStX6UGBj/TmajgsIWHm9jVBxHB4zPPcPW3oZCB8lpWh4Ku
 ZC3AAEWJ14q8zyYmtGszSIbwJZp8ub15NKoNh5cp+upVMaC1biroVJRpZWba6bvc
 W0W+s4o2h4fGru80anQTEvCihFbfSMVU2MCkEl5NPreLqOfRBwjeHCWeSRR0h1Na
 IFLVMswsni8wQ415uP/A9rl7vLCHUXhL7UhE/O9J2Eu9iDMy7rvPDS3V+y6xuyTs
 vEhBAFirF/Hzn2Mv3MfUPVqh3Mgp3jybgxxUThQEL6CjL8ddC1nUWf9Hu37lZfOq
 ec6WXBrQxXQSLv9E07ffjzTwQuQPQQFQsDG1UayqpNRB7zd+1DXtbWxUlTL6wo4N
 YJN9CS8elp/rke9sZLc0RTresR+dm9PAEbNrpaSV5JHBLSAIZS8YuBNTe2+HxDGE
 6ID4fJFW2i1a7pP9HrxwPFSRMig15Jmc75dQ57DbZQ0cNy1M2PTj3PRVB7eerLt+
 0+TApLvHpYD7Ctr3Me1RMWPgoQOlyMxHvcFVqpeSXKTh6SMer3zF8V4pRPxQMqRM
 YOb+2KRPngc=
 =trkZ
 -----END PGP SIGNATURE-----

Merge tag 'x86-asm-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 asm from Ingo Molnar:
 "x86 asm cleanups by Uros Bizjak:

   - Remove unnecessary memory clobbers from FS/GS base (read-)
     accessors and savesegment()

   - Use ASM_INPUT_RM in __loadsegment_fs() to work around clang code
     generation problems

   - Implement loadsegment()/savesegment() macros with static inline
     helpers

   - Use savesegment() for segment register reads in ELF core dump and
     __show_regs()

   - Use correct type for 'gs' variable in __show_regs() to avoid
     zero-extension

   - Clean up 'sel' variable usage in do_set_thread_area()"

* tag 'x86-asm-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tls: Clean up 'sel' variable usage in do_set_thread_area()
  x86/process/32: Use correct type for 'gs' variable in __show_regs() to avoid zero-extension
  x86/process/64: Use savesegment() in __show_regs() instead of inline asm
  x86/elf: Use savesegment() for segment register reads in ELF core dump
  x86/asm/segment: Implement loadsegment()/savesegment() macros with static inline helpers
  x86/asm/segment: Use ASM_INPUT_RM in __loadsegment_fs()
  x86/asm/segment: Remove unnecessary "memory" clobber from savesegment()
  x86/asm/fsgsbase: Remove unnecessary "memory" clobbers from FS/GS base (read-) accessors
2026-04-14 13:54:17 -07:00
Linus Torvalds
4b2bdc2221 objtool updates for v7.1:
- KLP support updates and fixes (Song Liu)
 
  - KLP-build script updates and fixes (Joe Lawrence)
 
  - Support Clang RAX DRAP sequence, to address clang
    false positive (Josh Poimboeuf)
 
  - Reorder ORC register numbering to match regular x86
    register numbering (Josh Poimboeuf)
 
  - Misc cleanups (Wentong Tian, Song Liu)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmnco9wRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hHcw//ZviR0NjNMluGcPHAEzNrpVn1O3rAcE/8
 WVIn1ZH0ESOe+b4xeTvlixuVzneHjkq5pqmsC6xtf8f8m39pNnvGDXRmNpqqRVtp
 lIuOBPpiyf5d0anxOwdNtXicK75r/Bl6CWV5J1ngUt4Ei5jcg+gS9naeM3z0VzNS
 N6yTzo/Xg0EwBR03FPmGps6Gem7ywLQK4QKCKz/4ZBodiBnWxByqX4IZ2yn2zbPI
 oRVPl0rH50pPGDfJh76uCx6gQHK24MAC7RY801NL5Kcsde/Mbbl3UPtnbEeOz7aL
 T05a1T1plLX587ZEyl2lUQgOms3/mblFpOOGVquxYAaVXGOB7z81o18fQT64/l1P
 IiRco2v65gTMMTy930txfxrmGN66cmXNLwgajZynRvwnpgKbWEcYFURJJF3uF5b2
 ZbcEW1VoKu0VSq9+tpZLPCn5whQaCXvPUK8EHMwjT1QHkMyRH8ztAPvdrJXYOtjM
 VXpoWU2di4hcOEjldHTmz+FHqqunThzigcxhP367oXIw17Z2CusFt/tsbdVbZtFv
 +jTr+J1M8adbXfJnZ8eU8Ht2F1kpa7Icjhe0HDstiMCj90QJIUfzDv+oRQ1DVmpu
 DuRo6TCz+6q0Phjf0jEkUougIco1+XmWaa7s7PqtkC6QRQCQMyOumdMFfHMWI6uv
 Zh/NMYt5LaI=
 =kPmZ
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:

 - KLP support updates and fixes (Song Liu)

 - KLP-build script updates and fixes (Joe Lawrence)

 - Support Clang RAX DRAP sequence, to address clang false positive
   (Josh Poimboeuf)

 - Reorder ORC register numbering to match regular x86 register
   numbering (Josh Poimboeuf)

 - Misc cleanups (Wentong Tian, Song Liu)

* tag 'objtool-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool/x86: Reorder ORC register numbering
  objtool: Support Clang RAX DRAP sequence
  livepatch/klp-build: report patch validation fuzz
  livepatch/klp-build: add terminal color output
  livepatch/klp-build: provide friendlier error messages
  livepatch/klp-build: improve short-circuit validation
  livepatch/klp-build: fix shellcheck complaints
  livepatch/klp-build: add Makefile with check target
  livepatch/klp-build: add grep-override function
  livepatch/klp-build: switch to GNU patch and recountdiff
  livepatch/klp-build: support patches that add/remove files
  objtool/klp: Correlate locals to globals
  objtool/klp: Match symbols based on demangled_name for global variables
  objtool/klp: Remove .llvm suffix in demangle_name()
  objtool/klp: Also demangle global objects
  objtool/klp: Use sym->demangled_name for symbol_name hash
  objtool/klp: Remove trailing '_' in demangle_name()
  objtool/klp: Remove redundant strcmp() in correlate_symbols()
  objtool: Use section/symbol type helpers
2026-04-14 13:00:04 -07:00
Linus Torvalds
c1fe867b5b Updates for the timer/timekeeping core:
- A rework of the hrtimer subsystem to reduce the overhead for frequently
     armed timers, especially the hrtick scheduler timer.
 
       - Better timer locality decision
 
       - Simplification of the evaluation of the first expiry time by
         keeping track of the neighbor timers in the RB-tree by providing a
         RB-tree variant with neighbor links. That avoids walking the
         RB-tree on removal to find the next expiry time, but even more
         important allows to quickly evaluate whether a timer which is
         rearmed changes the position in the RB-tree with the modified
         expiry time or not. If not, the dequeue/enqueue sequence which both
         can end up in rebalancing can be completely avoided.
 
       - Deferred reprogramming of the underlying clock event device. This
         optimizes for the situation where a hrtimer callback sets the need
         resched bit. In that case the code attempts to defer the
         re-programming of the clock event device up to the point where the
         scheduler has picked the next task and has the next hrtick timer
         armed. In case that there is no immediate reschedule or soft
         interrupts have to be handled before reaching the reschedule point
         in the interrupt entry code the clock event is reprogrammed in one
         of those code paths to prevent that the timer becomes stale.
 
       - Support for clocksource coupled clockevents
 
       	The TSC deadline timer is coupled to the TSC. The next event is
       	programmed in TSC time. Currently this is done by converting the
       	CLOCK_MONOTONIC based expiry value into a relative timeout,
       	converting it into TSC ticks, reading the TSC adding the delta
       	ticks and writing the deadline MSR.
 
 	As the timekeeping core has the conversion factors for the TSC
 	already, the whole back and forth conversion can be completely
 	avoided. The timekeeping core calculates the reverse conversion
 	factors from nanoseconds to TSC ticks and utilizes the base
 	timestamps of TSC and CLOCK_MONOTONIC which are updated once per
 	tick. This allows a direct conversion into the TSC deadline value
 	without reading the time and as a bonus keeps the deadline
 	conversion in sync with the TSC conversion factors, which are
 	updated by adjtimex() on systems with NTP/PTP enabled.
 
      - Allow inlining of the clocksource read and clockevent write
        functions when they are tiny enough, e.g. on x86 RDTSC and WRMSR.
 
     With all those enhancements in place a hrtick enabled scheduler
     provides the same performance as without hrtick. But also other hrtimer
     users obviously benefit from these optimizations.
 
   - Robustness improvements and cleanups of historical sins in the hrtimer
     and timekeeping code.
 
   - Rewrite of the clocksource watchdog.
 
     The clocksource watchdog code has over time reached the state of an
     impenetrable maze of duct tape and staples. The original design, which was
     made in the context of systems far smaller than today, is based on the
     assumption that the to be monitored clocksource (TSC) can be trivially
     compared against a known to be stable clocksource (HPET/ACPI-PM timer).
 
     Over the years this rather naive approach turned out to have major
     flaws. Long delays between the watchdog invocations can cause wrap
     arounds of the reference clocksource. The access to the reference
     clocksource degrades on large multi-sockets systems dure to
     interconnect congestion. This has been addressed with various
     heuristics which degraded the accuracy of the watchdog to the point
     that it fails to detect actual TSC problems on older hardware which
     exposes slow inter CPU drifts due to firmware manipulating the TSC to
     hide SMI time.
 
     The rewrite addresses this by:
 
       - Restricting the validation against the reference clocksource to the
         boot CPU which is usually closest to the legacy block which
         contains the reference clocksource (HPET/ACPI-PM).
 
       - Do a round robin validation betwen the boot CPU and the other CPUs
         based only on the TSC with an algorithm similar to the TSC
         synchronization code during CPU hotplug.
 
       - Being more leniant versus remote timeouts
 
   - The usual tiny fixes, cleanups and enhancements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmnbxdMQHHRnbHhAa2Vy
 bmVsLm9yZwAKCRCmGPVMDXSYoeq6EAC4h9wuBr5yCkxmog1Bhlk9cnK0oX1THb7V
 Q4z6DrYAiDXP6z4IDwqSW+3vvmNw1QXOeqpyMTiiIcQ5mNSs1IDnCt5HOEwY+ICm
 fiSUMYkXkH6xdFWspYWFkD7aExHJRT3hd/bo+WnXGHhHclPj5NHZssLMIDboHrzX
 jLV1hljmthfwg/uOXDGmQUPRFjqr2ZQjo7zGA5SwfVg8Krz7g/qRVy2wUns9TdW/
 NYwihDm1YV7qkK/+f1GnMdd70toqb1OZo/fS9FPbBrPLdyi8V+UbnFSUeZu8kCwV
 KubAzjLZR4xYCnrlaHhoi208GMd0TOvHMOrdAA0zkQHfhmszGl4N0pbF/EI29Ft2
 tQG/FUTG+nzgNOrMCPN2nr3u/UOXLP+gO+2hkyyQjqUP35IyaTYQn10JgBmPTdJL
 Ab6E8WL9gTMCd+t/bVjdU/B8W9ruMihKBtWkTfMBCcQ9uNJFCEGzrcMF8hzFYRTs
 /4rMDr3NlGYydAnbKPj6bkC5gtjBvh/L08kOdUFyXCMSqIzvJkZJ4241ogl1Awi6
 VfdwjF5KZCQo3M1ujpep+1L010wC/yulqLt2brQMO9Nt05dRhgwM3lxy7cnlMNm3
 NdfMgi+OG0CzQ+ZUpvo20hCgTDUVgWN9g5R3rar8FJX+Ym3T+ZoEKlShZF+fSRjf
 YAUIbUyi7A==
 =2qc8
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer core updates from Thomas Gleixner:

 - A rework of the hrtimer subsystem to reduce the overhead for
   frequently armed timers, especially the hrtick scheduler timer:

     - Better timer locality decision

     - Simplification of the evaluation of the first expiry time by
       keeping track of the neighbor timers in the RB-tree by providing
       a RB-tree variant with neighbor links. That avoids walking the
       RB-tree on removal to find the next expiry time, but even more
       important allows to quickly evaluate whether a timer which is
       rearmed changes the position in the RB-tree with the modified
       expiry time or not. If not, the dequeue/enqueue sequence which
       both can end up in rebalancing can be completely avoided.

     - Deferred reprogramming of the underlying clock event device. This
       optimizes for the situation where a hrtimer callback sets the
       need resched bit. In that case the code attempts to defer the
       re-programming of the clock event device up to the point where
       the scheduler has picked the next task and has the next hrtick
       timer armed. In case that there is no immediate reschedule or
       soft interrupts have to be handled before reaching the reschedule
       point in the interrupt entry code the clock event is reprogrammed
       in one of those code paths to prevent that the timer becomes
       stale.

     - Support for clocksource coupled clockevents

       The TSC deadline timer is coupled to the TSC. The next event is
       programmed in TSC time. Currently this is done by converting the
       CLOCK_MONOTONIC based expiry value into a relative timeout,
       converting it into TSC ticks, reading the TSC adding the delta
       ticks and writing the deadline MSR.

       As the timekeeping core has the conversion factors for the TSC
       already, the whole back and forth conversion can be completely
       avoided. The timekeeping core calculates the reverse conversion
       factors from nanoseconds to TSC ticks and utilizes the base
       timestamps of TSC and CLOCK_MONOTONIC which are updated once per
       tick. This allows a direct conversion into the TSC deadline value
       without reading the time and as a bonus keeps the deadline
       conversion in sync with the TSC conversion factors, which are
       updated by adjtimex() on systems with NTP/PTP enabled.

     - Allow inlining of the clocksource read and clockevent write
       functions when they are tiny enough, e.g. on x86 RDTSC and WRMSR.

   With all those enhancements in place a hrtick enabled scheduler
   provides the same performance as without hrtick. But also other
   hrtimer users obviously benefit from these optimizations.

 - Robustness improvements and cleanups of historical sins in the
   hrtimer and timekeeping code.

 - Rewrite of the clocksource watchdog.

   The clocksource watchdog code has over time reached the state of an
   impenetrable maze of duct tape and staples. The original design,
   which was made in the context of systems far smaller than today, is
   based on the assumption that the to be monitored clocksource (TSC)
   can be trivially compared against a known to be stable clocksource
   (HPET/ACPI-PM timer).

   Over the years this rather naive approach turned out to have major
   flaws. Long delays between the watchdog invocations can cause wrap
   arounds of the reference clocksource. The access to the reference
   clocksource degrades on large multi-sockets systems dure to
   interconnect congestion. This has been addressed with various
   heuristics which degraded the accuracy of the watchdog to the point
   that it fails to detect actual TSC problems on older hardware which
   exposes slow inter CPU drifts due to firmware manipulating the TSC to
   hide SMI time.

   The rewrite addresses this by:

     - Restricting the validation against the reference clocksource to
       the boot CPU which is usually closest to the legacy block which
       contains the reference clocksource (HPET/ACPI-PM).

     - Do a round robin validation betwen the boot CPU and the other
       CPUs based only on the TSC with an algorithm similar to the TSC
       synchronization code during CPU hotplug.

     - Being more leniant versus remote timeouts

 - The usual tiny fixes, cleanups and enhancements all over the place

* tag 'timers-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
  alarmtimer: Access timerqueue node under lock in suspend
  hrtimer: Fix incorrect #endif comment for BITS_PER_LONG check
  posix-timers: Fix stale function name in comment
  timers: Get this_cpu once while clearing the idle state
  clocksource: Rewrite watchdog code completely
  clocksource: Don't use non-continuous clocksources as watchdog
  x86/tsc: Handle CLOCK_SOURCE_VALID_FOR_HRES correctly
  MIPS: Don't select CLOCKSOURCE_WATCHDOG
  parisc: Remove unused clocksource flags
  hrtimer: Add a helper to retrieve a hrtimer from its timerqueue node
  hrtimer: Remove trailing comma after HRTIMER_MAX_CLOCK_BASES
  hrtimer: Mark index and clockid of clock base as const
  hrtimer: Drop unnecessary pointer indirection in hrtimer_expire_entry event
  hrtimer: Drop spurious space in 'enum hrtimer_base_type'
  hrtimer: Don't zero-initialize ret in hrtimer_nanosleep()
  hrtimer: Remove hrtimer_get_expires_ns()
  timekeeping: Mark offsets array as const
  timekeeping/auxclock: Consistently use raw timekeeper for tk_setup_internals()
  timer_list: Print offset as signed integer
  tracing: Use explicit array size instead of sentinel elements in symbol printing
  ...
2026-04-14 10:27:07 -07:00
Linus Torvalds
db23954eea Update for the core interrupt subsystem:
- Invoke add_interrupt_randomness() in handle_percpu_devid_irq() and
      cleanup the workaround in the Hyper-V driver, which would now invoke
      it twice on ARM64. Removing it from the driver requires to add it to
      the x86 system vector entry point.
 
    - Remove the pointles cpu_read_lock() around reading CPU possible mask,
      which is read only after init.
 
    - Add documentation for the interaction between device tree bindings and
      the interrupt type defines in irq.h.
 
    - Delete stale defines in the matrix allocator and the equivalent in
      loongarch.
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmnbt64QHHRnbHhAa2Vy
 bmVsLm9yZwAKCRCmGPVMDXSYoY/xD/9hdmaaSXX/JySPatJPgkAhekR8cl/brK9t
 K6qhMltg/XoUhmU1y0XSfuNDJwEOa4qvW2DbwfnzknIDr0AXvL39S9oNn+1o9I0x
 BjIbWwHTAsuLVyjQesqPWwyQtZ8HJCIM+3Ju2rUYz4jYuY8q15GYsbh6QN0pYDNG
 F54/OpuNV42/UhX/O01Gxw930GMGnsMkV6ou9dquap23U4FdlIsoY+zU/b09VEU2
 MmJZPGwryPpzhSLbCdOGuWA9oM6wd/FoBFYYU31W5OSXm4B0cRs+weE31WMogYKM
 lqudBuyNAhCIuZ06sdSvBPRswdvkuTUJovrJwG2r+rV6h953ExujNn+/ZxqraPg7
 iPK70Kwp/O2uyMFhHp/6r1u4bylK/AL7q6hace4cZFLqB/Htx5BW+hh/H7Cz6Yan
 H3cyBz9XIE7K5BI3lotKnlWVFwkrHgwYXUzHHrMq/UPdQKog1IPh/E29JekqtR+p
 RS9QzSF+Gs45I7oMP+7P8o5jAZYuuW+cODnXLlQkuz/bJff3QeftFhOKZkzbFk5x
 AFT0DREttxVCGcUCQaQYrWhFshp63+eDXgKGQT1YULleFUC1f9KH2W+6KlqjDwFR
 rvrUgmpMxkH2JJz1owct6vJ6wyffFNQtPeCTmq+7GM87HNm6Ji6AaRzQjRnhVGCt
 mh9VbBTnFw==
 =FvJG
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core irq updates from Thomas Gleixner:

 - Invoke add_interrupt_randomness() in handle_percpu_devid_irq() and
   cleanup the workaround in the Hyper-V driver, which would now invoke
   it twice on ARM64. Removing it from the driver requires to add it to
   the x86 system vector entry point

 - Remove the pointles cpu_read_lock() around reading CPU possible mask,
   which is read only after init

 - Add documentation for the interaction between device tree bindings
   and the interrupt type defines in irq.h

 - Delete stale defines in the matrix allocator and the equivalent in
   loongarch

* tag 'irq-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Drivers: hv: Move add_interrupt_randomness() to hypervisor callback sysvec
  genirq/chip: Invoke add_interrupt_randomness() in handle_percpu_devid_irq()
  genirq/affinity: Remove cpus_read_lock() while reading cpu_possible_mask
  genirq/matrix, LoongArch: Delete IRQ_MATRIX_BITS leftovers
  genirq: Document interaction between <linux/irq.h> and DT binding defines
2026-04-14 10:02:41 -07:00
Linus Torvalds
a970ed1881 bitmap updates for v7.1
- new API: bitmap_weight_from() and bitmap_weighted_xor() (Yury);
  - drop unused __find_nth_andnot_bit() (Yury);
  - new tests and test improvements (Andy, Akinobu, Yury);
  - fixes for count_zeroes API (Yury);
  - cleanup bitmap_print_to_pagebuf() mess (Yury);
  - documentation updates (Andy, Kai, Kit).
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmnb8vkACgkQsUSA/Tof
 vsjzKgv/RI6HDkwRgjT/jPVAZzaNFrdoL0nIQ1ZriyE70b/0HtjMzbQBO0P3Vmsa
 5k13Nus0eBi9CeEAK0NvjQXy8NRj4E7favqF3faV7l4+J6STHpOKeHZglUAj00CG
 +23WGInz+TS5RBjXnvT00wuTAVQjT6dvYng9606psVDF/nlh8ZtXmYDjLauseoUH
 a1EEKwLGXbk3/MhDgVq/R5RvZoNscL4Hky7QWMZiqLutwF8EDrZotF142tfbxkmW
 mu+2Bn1W66F+8A42HJBDRevcuvsRzMggP2kXxDk50XNL1zTN9f/4iE0r+/5x8UVF
 s3WiGnuLSkRIK4osey12Z9BAtGJTn3gTPvIPYOWvRiJHskOa1yvGSgcvmzc53x0Q
 FZgDq1JkBDsF3OZceSjGIp9QOqg+YJArlzun+mNxLbfnahEbhx21Z/ls65vLJCae
 ENIPAzet5Fxa8mZeJIyiV0zR05DcV+g64FOhcGJ7al4fRWtYVP8qa9FAyGFMV4L2
 JL4xHuRO
 =pEBo
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-v7.1' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - new API: bitmap_weight_from() and bitmap_weighted_xor() (Yury)

 - drop unused __find_nth_andnot_bit() (Yury)

 - new tests and test improvements (Andy, Akinobu, Yury)

 - fixes for count_zeroes API (Yury)

 - cleanup bitmap_print_to_pagebuf() mess (Yury)

 - documentation updates (Andy, Kai, Kit).

* tag 'bitmap-for-v7.1' of https://github.com/norov/linux: (24 commits)
  bitops: Update kernel-doc for sign_extendXX()
  powerpc/xive: simplify xive_spapr_debug_show()
  thermal: intel: switch cpumask_get() to using cpumask_print_to_pagebuf()
  coresight: don't use bitmap_print_to_pagebuf()
  lib/prime_numbers: drop temporary buffer in dump_primes()
  drm/xe: switch xe_pagefault_queue_init() to using bitmap_weighted_or()
  ice: use bitmap_empty() in ice_vf_has_no_qs_ena
  ice: use bitmap_weighted_xor() in ice_find_free_recp_res_idx()
  bitmap: introduce bitmap_weighted_xor()
  bitmap: add test_zero_nbits()
  bitmap: exclude nbits == 0 cases from bitmap test
  bitmap: test bitmap_weight() for more
  asm-generic/bitops: Fix a comment typo in instrumented-atomic.h
  bitops: fix kernel-doc parameter name for parity8()
  lib: count_zeros: unify count_{leading,trailing}_zeros()
  lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  lib: crypto: fix comments for count_leading_zeros()
  x86/topology: use bitmap_weight_from()
  bitmap: add bitmap_weight_from()
  lib/find_bit_benchmark: avoid clearing randomly filled bitmap in test_find_first_bit()
  ...
2026-04-14 08:55:18 -07:00
Linus Torvalds
d7c8087a9c Power management updates for 7.1-rc1
- Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)
 
  - Update cpufreq-dt-platdev blocklist (Faruque Ansari)
 
  - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
    Rosen Penev)
 
  - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)
 
  - Add support for new features: CPPC performance priority, Dynamic EPP,
    Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy,
    Mario Limonciello)
 
  - Fix sysfs files being present when HW missing and broken/outdated
    documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)
 
  - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
    cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which
    leads to a scheduling-while-atomic bug (K Prateek Nayak)
 
  - Clean up dead code in Kconfig for cpufreq (Julian Braha)
 
  - Remove max_freq_req update for pre-existing cpufreq policy and add a
    boost_freq_req QoS request to save the boost constraint instead of
    overwriting the last scaling_max_freq constraint (Pierre Gondois)
 
  - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
    are allocated in one go along with the policy to simplify lifetime
    rules and avoid error handling issues (Viresh Kumar)
 
  - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
    scaling driver (Henry Tseng)
 
  - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
    of cpumask_weight() because the former is more efficient (Yury Norov)
 
  - Use sysfs_emit() in sysfs show functions for cpufreq governor
    attributes (Thorsten Blum)
 
  - Update intel_pstate to stop returning an error when "off" is written
    to its status sysfs attribute while the driver is already off (Fabio
    De Francesco)
 
  - Include current frequency in the debug message printed by
    __cpufreq_driver_target() (Pengjie Zhang)
 
  - Refine stopped tick handling in the menu cpuidle governor and
    rearrange stopped tick handling in the teo cpuidle governor (Rafael
    Wysocki)
 
  - Add Panther Lake C-states table to the intel_idle driver (Artem
    Bityutskiy)
 
  - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)
 
  - Simplify cpuidle_register_device() with guard() (Huisong Li)
 
  - Use performance level if available to distinguish between rates in
    OPP debugfs (Manivannan Sadhasivam)
 
  - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)
 
  - Return -ENODATA if the snapshot image is not loaded (Alberto Garcia)
 
  - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
    Biggers)
 
  - Clean up and rearrange the intel_rapl power capping driver to make
    the respective interface drivers (TPMI, MSR, and MMOI) hold their
    own settings and primitives and consolidate PL4 and PMU support
    flags into rapl_defaults (Kuppuswamy Sathyanarayanan)
 
  - Correct kernel-doc function parameter names in the power capping core
    code (Randy Dunlap)
 
  - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)
 
  - Use _visible attribute to replace create/remove_sysfs_files() in
    devfreq (Pengjie Zhang)
 
  - Add Tegra114 support to activity monitor device in tegra30-devfreq as
    a preparation to upcoming EMC controller support (Svyatoslav Ryhel)
 
  - Fix mistakes in cpupower man pages, add the boost and epp options to
    the cpupower-frequency-info man page, and add the perf-bias option to
    the cpupower-info man page (Roberto Ricci)
 
  - Remove unnecessary extern declarations from getopt.h in arguments
    parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
    cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmnY9TISHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1G9gH/j5mEqfPpiwX6fQ/ZwOGdNOOPVA5w9j4
 KPHSMwMD5lZkoaZfasp2vt27KY5SOoVVvRZ2DKkFJ3Jai4I3cUPZYypga2nre1ag
 tgzX4vOjcw2r40Eda6ezWl1h4mca/xJJBX7xH2+hn1JY+Y1in37g50CqMIjKh96z
 Uugkk6UZytL1XcF55PMhIUgDf6pDtRT5UOW9xOKOkUt8FVWTJ7ei3HaWyV5kDmVq
 b5eQ42+OH7y6sWNnoKczFd8fStvh6J/avoJurBEvcOQhMcjaIaB48G19+KjDg73E
 NjrVcgG20P2rltBvV2d0J1TKskZHkaP7XjIeWfkwjGZhee3FL7ssS/g=
 =fRCO
 -----END PGP SIGNATURE-----

Merge tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "Once again, cpufreq is the most active development area, mostly
  because of the new feature additions and documentation updates in the
  amd-pstate driver, but there are also changes in the cpufreq core
  related to boost support and other assorted updates elsewhere.

  Next up are power capping changes due to the major cleanup of the
  Intel RAPL driver.

  On the cpuidle front, a new C-states table for Intel Panther Lake is
  added to the intel_idle driver, the stopped tick handling in the menu
  and teo governors is updated, and there are a couple of cleanups.

  Apart from the above, support for Tegra114 is added to devfreq and
  there are assorted cleanups of that code, there are also two updates
  of the operating performance points (OPP) library, two minor updates
  related to hibernation, and cpupower utility man pages updates and
  cleanups.

  Specifics:

   - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)

   - Update cpufreq-dt-platdev blocklist (Faruque Ansari)

   - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
     Rosen Penev)

   - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)

   - Add support for new features: CPPC performance priority, Dynamic
     EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham
     Shenoy, Mario Limonciello)

   - Fix sysfs files being present when HW missing and broken/outdated
     documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)

   - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
     cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate
     which leads to a scheduling-while-atomic bug (K Prateek Nayak)

   - Clean up dead code in Kconfig for cpufreq (Julian Braha)

   - Remove max_freq_req update for pre-existing cpufreq policy and add
     a boost_freq_req QoS request to save the boost constraint instead
     of overwriting the last scaling_max_freq constraint (Pierre
     Gondois)

   - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
     are allocated in one go along with the policy to simplify lifetime
     rules and avoid error handling issues (Viresh Kumar)

   - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
     scaling driver (Henry Tseng)

   - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
     of cpumask_weight() because the former is more efficient (Yury
     Norov)

   - Use sysfs_emit() in sysfs show functions for cpufreq governor
     attributes (Thorsten Blum)

   - Update intel_pstate to stop returning an error when "off" is
     written to its status sysfs attribute while the driver is already
     off (Fabio De Francesco)

   - Include current frequency in the debug message printed by
     __cpufreq_driver_target() (Pengjie Zhang)

   - Refine stopped tick handling in the menu cpuidle governor and
     rearrange stopped tick handling in the teo cpuidle governor (Rafael
     Wysocki)

   - Add Panther Lake C-states table to the intel_idle driver (Artem
     Bityutskiy)

   - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)

   - Simplify cpuidle_register_device() with guard() (Huisong Li)

   - Use performance level if available to distinguish between rates in
     OPP debugfs (Manivannan Sadhasivam)

   - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)

   - Return -ENODATA if the snapshot image is not loaded (Alberto
     Garcia)

   - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
     Biggers)

   - Clean up and rearrange the intel_rapl power capping driver to make
     the respective interface drivers (TPMI, MSR, and MMOI) hold their
     own settings and primitives and consolidate PL4 and PMU support
     flags into rapl_defaults (Kuppuswamy Sathyanarayanan)

   - Correct kernel-doc function parameter names in the power capping
     core code (Randy Dunlap)

   - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)

   - Use _visible attribute to replace create/remove_sysfs_files() in
     devfreq (Pengjie Zhang)

   - Add Tegra114 support to activity monitor device in tegra30-devfreq
     as a preparation to upcoming EMC controller support (Svyatoslav
     Ryhel)

   - Fix mistakes in cpupower man pages, add the boost and epp options
     to the cpupower-frequency-info man page, and add the perf-bias
     option to the cpupower-info man page (Roberto Ricci)

   - Remove unnecessary extern declarations from getopt.h in arguments
     parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
     cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)"

* tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
  cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
  cpupower: remove extern declarations in cmd functions
  cpuidle: Simplify cpuidle_register_device() with guard()
  PM / devfreq: tegra30-devfreq: add support for Tegra114
  PM / devfreq: use _visible attribute to replace create/remove_sysfs_files()
  PM / devfreq: Remove unneeded casting for HZ_PER_KHZ
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  ...
2026-04-13 19:47:52 -07:00
Linus Torvalds
2e31b16101 ACPI support updates for 7.1-rc1
- Update maintainers information regarding ACPICA (Rafael Wysocki)
 
  - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy() (Kees
    Cook)
 
  - Trigger an ordered system power off after encountering a fatal error
    operator in AML (Armin Wolf)
 
  - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao)
 
  - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure from
    the ACPI PPTT parser (Ben Horgan)
 
  - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate
    DeSimone)
 
  - Address multiple assorted issues and clean up the code in the ACPI
    processor idle driver (Huisong Li)
 
  - Replace strlcat() in the ACPI processor idle drive with a better
    alternative (Andy Shevchenko)
 
  - Rearrange and clean up acpi_processor_errata_piix4() (Rafael Wysocki)
 
  - Move reference performance to capabilities and fix an uninitialized
    variable in the ACPI CPPC library (Pengjie Zhang)
 
  - Add support for the Performance Limited Register to the ACPI CPPC
    library (Sumit Gupta)
 
  - Add cppc_get_perf() API to read performance controls, extend
    cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC
    library warn on missing mandatory DESIRED_PERF register (Sumit Gupta)
 
  - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in target
    callbacks to allow it to control performance bounds via standard
    scaling_min_freq and scaling_max_freq sysfs attributes and add sysfs
    documentation for the Performance Limited Register to it (Sumit Gupta)
 
  - Add ACPI support to the platform device interface in the CMOS RTC
    driver, make the ACPI core device enumeration code create a platform
    device for the CMOS RTC, and drop CMOS RTC PNP device support (Rafael
    Wysocki)
 
  - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD
    driver and clean up the CMOS RTC ACPI address space handler (Rafael
    Wysocki)
 
  - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT
    and allow that driver to work without a dedicated IRQ if the ACPI
    alarm is used (Rafael Wysocki)
 
  - Clean up the ACPI TAD driver in various ways and add an RTC class
    device interface, including both the RTC setting/reading and alarm
    timer support, to it (Rafael Wysocki)
 
  - Clean up the ACPI AC and ACPI PAD (processor aggregator device)
    drivers (Rafael Wysocki)
 
  - Rework checking for duplicate video bus devices and consolidate
    pnp.bus_id workarounds handling in the ACPI video bus driver (Rafael
    Wysocki)
 
  - Update the ACPI core device drivers to stop setting acpi_device_name()
    unnecessarily (Rafael Wysocki)
 
  - Rearrange code using acpi_device_class() in the ACPI core device
    drivers and update them to stop setting acpi_device_class()
    unnecessarily (Rafael Wysocki)
 
  - Define ACPI_AC_CLASS in one place (Rafael Wysocki)
 
  - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver to
    bind to platform devices instead of ACPI devices (Rafael Wysocki)
 
  - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI
    hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng
    Feng)
 
  - Consolidate the interface for obtaining a CPU UID from ACPI across
    architectures and use it to address incorrect PCI TPH Steering Tag
    on ARM64 resulting from the invalid assumption that the ACPI
    Processor UID would always be the same as the corresponding logical
    CPU ID in Linux (Chengwen Feng)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmnY/bcSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1chAH/1cGRzh9lSgQ3ZdzIIA5rpRtwKC+CTNz
 iNDvQ97W73B2N+WYzMaloOh+ZVA1Vdqc+8921aH6HI+v7wtg/ZV3h/hU7TagHNY/
 bRFDYaeRXVj4aBXNfoVdn7G5UU9j/kIDcV25I2ubOBqZaO6T5p8p1BK0j0vEj+sG
 yR7XwpEhr2OUQwlIFGKskJwFaH57QJXPEY8wf+o+lMEx/7o/JQRJzKFwsYu01ZZV
 kQy9Ee08P/rsNJwU2ibmZu5P3JMnhategAT8VAMBvkfLScv2sKX+1Vz19NGXzm71
 ARaT7y8MSPNb7SAvWmNZ/rVYrYIL+D3a76Gd7MOGrbVWEn6oXIbCIhY=
 =6vEK
 -----END PGP SIGNATURE-----

Merge tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI support updates from Rafael Wysocki:
 "These include an update of the CMOS RTC driver and the related ACPI
  and x86 code that, among other things, switches it over to using the
  platform device interface for device binding on x86 instead of the PNP
  device driver interface (which allows the code in question to be
  simplified quite a bit), a major update of the ACPI Time and Alarm
  Device (TAD) driver adding an RTC class device interface to it, and
  updates of core ACPI drivers that remove some unnecessary and not
  really useful code from them.

  Apart from that, two drivers are converted to using the platform
  driver interface for device binding instead of the ACPI driver one,
  which is slated for removal, support for the Performance Limited
  register is added to the ACPI CPPC library and there are some
  janitorial updates of it and the related cpufreq CPPC driver, the ACPI
  processor driver is fixed and cleaned up, and NVIDIA vendor CPER
  record handler is added to the APEI GHES code.

  Also, the interface for obtaining a CPU UID from ACPI is consolidated
  across architectures and used for fixing a problem with the PCI TPH
  Steering Tag on ARM64, there are two updates related to ACPICA, a
  minor ACPI OS Services Layer (OSL) update, and a few assorted updates
  related to ACPI tables parsing.

  Specifics:

   - Update maintainers information regarding ACPICA (Rafael Wysocki)

   - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy()
     (Kees Cook)

   - Trigger an ordered system power off after encountering a fatal
     error operator in AML (Armin Wolf)

   - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao)

   - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure
     from the ACPI PPTT parser (Ben Horgan)

   - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate
     DeSimone)

   - Address multiple assorted issues and clean up the code in the ACPI
     processor idle driver (Huisong Li)

   - Replace strlcat() in the ACPI processor idle drive with a better
     alternative (Andy Shevchenko)

   - Rearrange and clean up acpi_processor_errata_piix4() (Rafael
     Wysocki)

   - Move reference performance to capabilities and fix an uninitialized
     variable in the ACPI CPPC library (Pengjie Zhang)

   - Add support for the Performance Limited Register to the ACPI CPPC
     library (Sumit Gupta)

   - Add cppc_get_perf() API to read performance controls, extend
     cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC
     library warn on missing mandatory DESIRED_PERF register (Sumit
     Gupta)

   - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in
     target callbacks to allow it to control performance bounds via
     standard scaling_min_freq and scaling_max_freq sysfs attributes and
     add sysfs documentation for the Performance Limited Register to it
     (Sumit Gupta)

   - Add ACPI support to the platform device interface in the CMOS RTC
     driver, make the ACPI core device enumeration code create a
     platform device for the CMOS RTC, and drop CMOS RTC PNP device
     support (Rafael Wysocki)

   - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD
     driver and clean up the CMOS RTC ACPI address space handler (Rafael
     Wysocki)

   - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT
     and allow that driver to work without a dedicated IRQ if the ACPI
     alarm is used (Rafael Wysocki)

   - Clean up the ACPI TAD driver in various ways and add an RTC class
     device interface, including both the RTC setting/reading and alarm
     timer support, to it (Rafael Wysocki)

   - Clean up the ACPI AC and ACPI PAD (processor aggregator device)
     drivers (Rafael Wysocki)

   - Rework checking for duplicate video bus devices and consolidate
     pnp.bus_id workarounds handling in the ACPI video bus driver
     (Rafael Wysocki)

   - Update the ACPI core device drivers to stop setting
     acpi_device_name() unnecessarily (Rafael Wysocki)

   - Rearrange code using acpi_device_class() in the ACPI core device
     drivers and update them to stop setting acpi_device_class()
     unnecessarily (Rafael Wysocki)

   - Define ACPI_AC_CLASS in one place (Rafael Wysocki)

   - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver
     to bind to platform devices instead of ACPI devices (Rafael
     Wysocki)

   - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI
     hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng
     Feng)

   - Consolidate the interface for obtaining a CPU UID from ACPI across
     architectures and use it to address incorrect PCI TPH Steering Tag
     on ARM64 resulting from the invalid assumption that the ACPI
     Processor UID would always be the same as the corresponding logical
     CPU ID in Linux (Chengwen Feng)"

* tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits)
  ACPICA: Update maintainers information
  watchdog: ni903x_wdt: Convert to a platform driver
  ACPI: PAD: xen: Convert to a platform driver
  ACPI: processor: idle: Reset cpuidle on C-state list changes
  cpuidle: Extract and export no-lock variants of cpuidle_unregister_device()
  PCI/TPH: Pass ACPI Processor UID to Cache Locality _DSM
  ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu()
  perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu()
  ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h
  x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  RISC-V: ACPI: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler
  PCI: hisi: Use devm_ghes_register_vendor_record_notifier()
  ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()
  ACPI: tables: Enable FPDT on LoongArch
  ACPI: processor: idle: Fix NULL pointer dereference in hotplug path
  ACPI: processor: idle: Reset power_setup_done flag on initialization failure
  ACPI: TAD: Add alarm support to the RTC class device interface
  ...
2026-04-13 19:25:07 -07:00
Linus Torvalds
ef3da345cc vfs-7.1-rc1.misc
Please consider pulling these changes from the signed vfs-7.1-rc1.misc tag.
 
 Thanks!
 Christian
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCadjZCwAKCRCRxhvAZXjc
 ohhBAQCAmQMlMRAXAgUZFYMTZpeQlcujP5rv+/vT2Tf/xS76YwD/dRDaw1FH294+
 qtk/Z1NjleNixzE2sld1K9J32NxeyAc=
 =+g9q
 -----END PGP SIGNATURE-----

Merge tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "Features:
   - coredump: add tracepoint for coredump events
   - fs: hide file and bfile caches behind runtime const machinery

  Fixes:
   - fix architecture-specific compat_ftruncate64 implementations
   - dcache: Limit the minimal number of bucket to two
   - fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START
   - fs/mbcache: cancel shrink work before destroying the cache
   - dcache: permit dynamic_dname()s up to NAME_MAX

  Cleanups:
   - remove or unexport unused fs_context infrastructure
   - trivial ->setattr cleanups
   - selftests/filesystems: Assume that TIOCGPTPEER is defined
   - writeback: fix kernel-doc function name mismatch for wb_put_many()
   - autofs: replace manual symlink buffer allocation in autofs_dir_symlink
   - init/initramfs.c: trivial fix: FSM -> Finite-state machine
   - fs: remove stale and duplicate forward declarations
   - readdir: Introduce dirent_size()
   - fs: Replace user_access_{begin/end} by scoped user access
   - kernel: acct: fix duplicate word in comment
   - fs: write a better comment in step_into() concerning .mnt assignment
   - fs: attr: fix comment formatting and spelling issues"

* tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  dcache: permit dynamic_dname()s up to NAME_MAX
  fs: attr: fix comment formatting and spelling issues
  fs: hide file and bfile caches behind runtime const machinery
  fs: write a better comment in step_into() concerning .mnt assignment
  proc: rename proc_notify_change to proc_setattr
  proc: rename proc_setattr to proc_nochmod_setattr
  affs: rename affs_notify_change to affs_setattr
  adfs: rename adfs_notify_change to adfs_setattr
  hfs: update comments on hfs_inode_setattr
  kernel: acct: fix duplicate word in comment
  fs: Replace user_access_{begin/end} by scoped user access
  readdir: Introduce dirent_size()
  coredump: add tracepoint for coredump events
  fs: remove do_sys_truncate
  fs: pass on FTRUNCATE_* flags to do_truncate
  fs: fix archiecture-specific compat_ftruncate64
  fs: remove stale and duplicate forward declarations
  init/initramfs.c: trivial fix: FSM -> Finite-state machine
  autofs: replace manual symlink buffer allocation in autofs_dir_symlink
  fs/mbcache: cancel shrink work before destroying the cache
  ...
2026-04-13 14:20:11 -07:00
Paolo Bonzini
4a530993da KVM x86 VMXON and EFER.SVME extraction for 7.1
Move _only_ VMXON+VMXOFF and EFER.SVME toggling out of KVM (versus all of VMX
 and SVM enabling) out of KVM and into the core kernel so that non-KVM TDX
 enabling, e.g. for trusted I/O, can make SEAMCALLs without needing to ensure
 KVM is fully loaded.
 
 TDX isn't a hypervisor, and isn't trying to be a hypervisor. Specifically, TDX
 should _never_ have it's own VMCSes (that are visible to the host; the
 TDX-Module has it's own VMCSes to do SEAMCALL/SEAMRET), and so there is simply
 no reason to move that functionality out of KVM.
 
 With that out of the way, dealing with VMXON/VMXOFF and EFER.SVME is a fairly
 simple refcounting game.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmnZJkYACgkQOlYIJqCj
 N/21chAAjg9tb/E8+vqBZDT5vO9Bu6c333irV2vqBBJZWUx6xKhtk77kL6kISWyf
 aI57hJ5IwbUkfDcomSY+MyRXxw/X4OioSs5qqvcC2XHatGA8XwifJE47cN5ZT0+D
 hzZjru8Z9VGHf5wUXS41yTHtm+INiEYMgJiseUQR6sbWx3H+zDcLIooNQx/ZLYrV
 vR+VPtaMYpJ0TTDDqb8PrCnjgXoXFenAnzAj9bAikWP60kaDXrxN9KPc5woDo29+
 TrkTyr2mmQvKpNhLCDwAMNa9bXxgzkHEGx8J2WZTbUi9ZBv4MwVsnGLLsaUKQlaa
 4V1JDiICzYptjMzU+ka4iTF+m0KEz4EykP7mVVK+5MAHc0NOUVfDW6JP2PM/66dh
 NyyjGhbrfH0PwqzDn4N2h0MmWT4YNCIxESClecEMtEzsCyWfYOMitxbDbzHnu9Vw
 a/C0pwWKJ34Trr0O79SevAWJBlu596mya0YvMeCAWxCvSUGknbo5IXdrmtp6htGp
 Gz5+0ZyvVRbYpwxS+OOpWMkZuPvvEcWTbMAG/scbSHh80P/uCVyuLsRZR2HSB8EV
 tYnnLDDDQ1KmLV7xmw5XnkN9hFffAM8eXA7KX9TPjCXjd25lCJGgquQEH0oAHe5q
 1qXf+lWttP7MIbD5/Ga5CO+FqXAE6xmFRWjEBgLx32kSAWXqxPs=
 =SuxR
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-vmxon-7.1' of https://github.com/kvm-x86/linux into HEAD

KVM x86 VMXON and EFER.SVME extraction for 7.1

Move _only_ VMXON+VMXOFF and EFER.SVME toggling out of KVM (versus all of VMX
and SVM enabling) out of KVM and into the core kernel so that non-KVM TDX
enabling, e.g. for trusted I/O, can make SEAMCALLs without needing to ensure
KVM is fully loaded.

TIO isn't a hypervisor, and isn't trying to be a hypervisor. Specifically, TIO
should _never_ have it's own VMCSes (that are visible to the host; the
TDX-Module has it's own VMCSes to do SEAMCALL/SEAMRET), and so there is simply
no reason to move that functionality out of KVM.

With that out of the way, dealing with VMXON/VMXOFF and EFER.SVME is a fairly
simple refcounting game.
2026-04-13 13:04:48 +02:00
Linus Torvalds
d71358127c Fix incorrect hardware errors reported on Zen3 CPUs, such
as bogus L3 cache deferred errors. (Yazen Ghannam)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmnbR60RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hksA/9E29kFCKEc+MpZ634k1Gb+fsUG0mPvUKa
 DMDqopOAlpW2TgcVkjEecPMwXIVVn2jC68pMGpW9BRkI5VxRbVq2UkmzvYsPGb4S
 lGO83ng4cxu+WHmWST4+7Mm/8XUZNxA3iunFJb/B1rxfA6R6s94VulcQ5PckE5ws
 8X8QdueFOYr/ZgscHpan4lXzWXzJnqvhypUr+UYeRQtweScgI9/HA19+zzfO4+Er
 TzdcperNeBTS0TaXjZTdDtCJ9rQD4iQvokfUqItEkfBLbv136BM9dWkJp3mbTlv9
 RUqbdPlXuFxxEUzJTpg2Y6qmo9n11nrt75UvE9WGuNPRQASqRvzIOfSfW7cMccJ0
 PyH34KovrjUATUNHCuhvk8KnIRjy+sToT/wGkKD8hXeszG7dsH2W3ED85wqGvpbs
 m6tb0JLYsLJ3eg40I+wwMmm3eMGJZNsxmstk7j7pBEqMPCh3X9vojlALfM+oiHwH
 Fw4/jwukKQsqWR8cupOkM1EaIlEC3vsjdVXrshVok+sZ6s1bJ5iBn0zO82YC5pHJ
 U4u5mY5g1c/mcZ1Y7Mt9+ZYeWvq8yD3jjgmvyJEsJN2FgsiOnmWLBPQNCR2faflm
 QlDEiW6GwBhrt5esaTT50+5Jm8e8gb1dsu9GpOgMb4Mfc+jWIX1h65NVOBItHkuA
 fjxy9HLW+p8=
 =vLvD
 -----END PGP SIGNATURE-----

Merge tag 'ras-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 MCE fix from Ingo Molnar:
 "Fix incorrect hardware errors reported on Zen3 CPUs, such as bogus
  L3 cache deferred errors (Yazen Ghannam)"

* tag 'ras-urgent-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce/amd: Filter bogus hardware errors on Zen3 clients
2026-04-12 08:27:09 -07:00
Thomas Gleixner
ff1c0c5d07 Merge branch 'timers/urgent' into timers/core
to resolve the conflict with urgent fixes.
2026-04-11 07:58:33 +02:00
Rafael J. Wysocki
83e990310d Merge branch 'pm-cpufreq'
Merge cpufreq updates for 7.1-rc1:

 - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)

 - Update cpufreq-dt-platdev blocklist (Faruque Ansari)

 - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
   Rosen Penev)

 - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)

 - Add support for new features: CPPC performance priority, Dynamic EPP,
   Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy,
   Mario Limonciello)

 - Fix sysfs files being present when HW missing and broken/outdated
   documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)

 - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
   cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which
   leads to a scheduling-while-atomic bug (K Prateek Nayak)

 - Clean up dead code in Kconfig for cpufreq (Julian Braha)

 - Remove max_freq_req update for pre-existing cpufreq policy and add a
   boost_freq_req QoS request to save the boost constraint instead of
   overwriting the last scaling_max_freq constraint (Pierre Gondois)

 - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
   are allocated in one go along with the policy to simplify lifetime
   rules and avoid error handling issues (Viresh Kumar)

 - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
   scaling driver (Henry Tseng)

 - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
   of cpumask_weight() because the former is more efficient (Yury Norov)

 - Use sysfs_emit() in sysfs show functions for cpufreq governor
   attributes (Thorsten Blum)

 - Update intel_pstate to stop returning an error when "off" is written
   to its status sysfs attribute while the driver is already off (Fabio
   De Francesco)

 - Include current frequency in the debug message printed by
   __cpufreq_driver_target() (Pengjie Zhang)

* pm-cpufreq: (38 commits)
  cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  amd-pstate-ut: Add module parameter to select testcases
  amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
  amd-pstate: Add sysfs support for floor_freq and floor_count
  amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
  x86/cpufeatures: Add AMD CPPC Performance Priority feature.
  ...
2026-04-10 12:05:32 +02:00
Rafael J. Wysocki
d07060217b Merge branch 'acpi-cmos-rtc'
Merge updates related to the CMOS RTC driver and x86/ACPI CMOS RTC
support for 7.1-rc1:

 - Add ACPI support to the platform device interface in the CMOS RTC
   driver, make the ACPI core device enumeration code create a platform
   device for the CMOS RTC, and drop CMOS RTC PNP device support (Rafael
   Wysocki)

 - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD
   driver and clean up the CMOS RTC ACPI address space handler (Rafael
   Wysocki)

 - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT
   and allow that driver to work without a dedicated IRQ if the ACPI
   alarm is used (Rafael Wysocki)

* acpi-cmos-rtc:
  rtc: cmos: Do not require IRQ if ACPI alarm is used
  rtc: cmos: Enable ACPI alarm if advertised in ACPI FADT
  ACPI: TAD/x86: cmos_rtc: Consolidate address space handler setup
  rtc: cmos: Drop PNP device support
  x86: rtc: Drop PNP device check
  ACPI: PNP: Drop CMOS RTC PNP device support
  ACPI: x86/rtc-cmos: Use platform device for driver binding
  ACPI: x86: cmos_rtc: Create a CMOS RTC platform device
  ACPI: x86: cmos_rtc: Improve coordination with ACPI TAD driver
  ACPI: x86: cmos_rtc: Clean up address space handler driver
2026-04-09 21:40:22 +02:00
Linus Torvalds
52f657e34d x86: shadow stacks: proper error handling for mmap lock
김영민 reports that shstk_pop_sigframe() doesn't check for errors from
mmap_read_lock_killable(), which is a silly oversight, and also shows
that we haven't marked those functions with "__must_check", which would
have immediately caught it.

So let's fix both issues.

Reported-by: 김영민 <osori@hspace.io>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-04-08 13:18:57 -07:00
Chengwen Feng
3cfe889f89 x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
As a step towards unifying the interface for retrieving ACPI CPU UID
across architectures, introduce a new function acpi_get_cpu_uid() for
x86. While at it, add input validation to make the code more robust.

Update Xen-related code to use acpi_get_cpu_uid() instead of the legacy
cpu_acpi_id() function, and remove the now-unused cpu_acpi_id() to clean
up redundant code.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://patch.msgid.link/20260401081640.26875-5-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-06 16:55:15 +02:00
Ronan Pigott
fbe80bd699 x86/split_lock: Don't warn about unknown split_lock_detect parameter
The split_lock_detect command line parameter is handled in sld_setup() shortly
after cpu_parse_early_param() but still before parse_early_param().

Add a dummy parsing function so that parse_early_param() doesn't later
complain about the "unknown" parameter split_lock_detect=, and pass it along
to init.

  [ bp: Massage commit message. ]

Signed-off-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260405181807.3906-1-ronan@rjp.ie
2026-04-05 23:12:40 +02:00
David Hildenbrand (Arm)
52a9e9cd18 mm: rename zap_vma_ptes() to zap_special_vma_range()
zap_vma_ptes() is the only zapping function we export to modules.

It's essentially a wrapper around zap_vma_range(), however, with some
safety checks:
* That the passed range fits fully into the VMA
* That it's only used for VM_PFNMAP

We will add support for VM_MIXEDMAP next, so use the more-generic term
"special vma", although "special" is a bit overloaded.  Maybe we'll later
just support any VM_SPECIAL flag.

While at it, improve the kerneldoc.

Link: https://lkml.kernel.org/r/20260227200848.114019-16-david@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Leon Romanovsky <leon@kernel.org>	[drivers/infiniband]
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Arve <arve@android.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Carlos Llamas <cmllamas@google.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Daniel Borkman <daniel@iogearbox.net>
Cc: Dave Airlie <airlied@gmail.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jakub Kacinski <kuba@kernel.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Namhyung kim <namhyung@kernel.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Todd Kjos <tkjos@android.com>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:15 -07:00
Catalin Marinas
a515ffc9de x86: shstk: use the new common vm_mmap_shadow_stack() helper
Replace part of the x86 alloc_shstk() content with a call to
vm_mmap_shadow_stack(). There is no functional change.

Link: https://lkml.kernel.org/r/20260225161404.3157851-5-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Tested-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Deepak Gupta <debug@rivosinc.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:06 -07:00
Mike Rapoport (Microsoft)
6215d9f447 arch, mm: consolidate empty_zero_page
Reduce 22 declarations of empty_zero_page to 3 and 23 declarations of
ZERO_PAGE() to 4.

Every architecture defines empty_zero_page that way or another, but for the
most of them it is always a page aligned page in BSS and most definitions
of ZERO_PAGE do virt_to_page(empty_zero_page).

Move Linus vetted x86 definition of empty_zero_page and ZERO_PAGE() to the
core MM and drop these definitions in architectures that do not implement
colored zero page (MIPS and s390).

ZERO_PAGE() remains a macro because turning it to a wrapper for a static
inline causes severe pain in header dependencies.

For the most part the change is mechanical, with these being noteworthy:

* alpha: aliased empty_zero_page with ZERO_PGE that was also used for boot
  parameters. Switching to a generic empty_zero_page removes the aliasing
  and keeps ZERO_PGE for boot parameters only
* arm64: uses __pa_symbol() in ZERO_PAGE() so that definition of
  ZERO_PAGE() is kept intact.
* m68k/parisc/um: allocated empty_zero_page from memblock,
  although they do not support zero page coloring and having it in BSS
  will work fine.
* sparc64 can have empty_zero_page in BSS rather allocate it, but it
  can't use virt_to_page() for BSS. Keep it's definition of ZERO_PAGE()
  but instead of allocating it, make mem_map_zero point to
  empty_zero_page.
* sh: used empty_zero_page for boot parameters at the very early boot.
  Rename the parameters page to boot_params_page and let sh use the generic
  empty_zero_page.
* hexagon: had an amusing comment about empty_zero_page

	/* A handy thing to have if one has the RAM. Declared in head.S */

  that unfortunately had to go :)

Link: https://lkml.kernel.org/r/20260211103141.3215197-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Helge Deller <deller@gmx.de>		[parisc]
Tested-by: Helge Deller <deller@gmx.de>		[parisc]
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>	[alpha]
Acked-by: Dinh Nguyen <dinguyen@kernel.org>	[nios2]
Acked-by: Andreas Larsson <andreas@gaisler.com>	[sparc]
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:01 -07:00
Seongsu Park
3d443691ed mm/pkeys: remove unused tsk parameter from arch_set_user_pkey_access()
The tsk parameter in arch_set_user_pkey_access() is never used in the
function implementations across all architectures (arm64, powerpc, x86).

Link: https://lkml.kernel.org/r/20260219063506.545148-1-sgsu.park@samsung.com
Signed-off-by: Seongsu Park <sgsu.park@samsung.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:52:57 -07:00
Yazen Ghannam
0422b07bc4 x86/mce/amd: Filter bogus hardware errors on Zen3 clients
Users have been observing multiple L3 cache deferred errors after recent
kernel rework of deferred error handling.¹ ⁴

The errors are bogus due to inconsistent status values. Also, user verified
that bogus MCA_DESTAT values are present on the system even with an older
kernel.²

The errors seem to be garbage values present in the MCA_DESTAT of some L3
cache banks. These were implicitly ignored before the recent kernel rework
because these do not generate a deferred error interrupt.

A later revision of the rework patch was merged for v6.19. This naturally
filtered out most of the bogus error logs. However, a few signatures still
remain.³

Minimize the scope of the filter to the reported CPU
family/model/stepping and only for errors which don't have the Enabled
bit in the MCi status MSR.

¹ https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
² https://lore.kernel.org/6e1eda7dd55f6fa30405edf7b0f75695cf55b237.camel@web.de
³ https://lore.kernel.org/21ba47fa8893b33b94370c2a42e5084cf0d2e975.camel@web.dehttps://lore.kernel.org/r/CAKFB093B2k3sKsGJ_QNX1jVQsaXVFyy=wNwpzCGLOXa_vSDwXw@mail.gmail.com

  [ bp: Generalize the condition according to which errors are bogus. ]

Fixes: 7cb735d7c0 ("x86/mce: Unify AMD DFR handler with MCA Polling")
Closes: https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
Reported-by: Bert Karwatzki <spasswolf@web.de>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-By: Bert Karwatzki <spasswolf@web.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250915010010.3547-1-spasswolf@web.de
2026-04-05 12:42:22 +02:00
Michael Kelley
e8be82c2d7 Drivers: hv: Move add_interrupt_randomness() to hypervisor callback sysvec
The Hyper-V ISRs, for normal guests and when running in the hypervisor root
patition, are calling add_interrupt_randomness() as a primary source of
entropy. The call is currently in the ISRs as a common place to handle both
x86/x64 and arm64.

On x86/x64, hypervisor interrupts come through a custom sysvec entry, and
do not go through a generic interrupt handler.

On arm64, hypervisor interrupts come through an emulated GICv3. GICv3 uses
the generic handler handle_percpu_devid_irq(), which does not do
add_interrupt_randomness() -- unlike its counterpart
handle_percpu_irq().

But handle_percpu_devid_irq() is now updated to do the
add_interrupt_randomness(). So add_interrupt_randomness() is now needed
only in Hyper-V's x86/x64 custom sysvec path.

Move add_interrupt_randomness() from the Hyper-V ISRs into the Hyper-V
x86/x64 custom sysvec path, matching the existing STIMER0 sysvec path.

With this change, add_interrupt_randomness() is no longer called from any
device drivers, which is appropriate.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://patch.msgid.link/20260402202400.1707-3-mhklkml@zohomail.com
2026-04-04 21:01:56 +02:00
Rafael J. Wysocki
5cdfedf68e amd-pstate new content for 7.1 (2026-04-02)
Add support for new features:
  * CPPC performance priority
  * Dynamic EPP
  * Raw EPP
  * New unit tests for new features
 Fixes for:
  * PREEMPT_RT
  * sysfs files being present when HW missing
  * Broken/outdated documentation
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEECwtuSU6dXvs5GA2aLRkspiR3AnYFAmnOpNgTHHN1cGVybTFA
 a2VybmVsLm9yZwAKCRAtGSymJHcCduR7EADexgetxq0l6/iV2DyI1/YJcf+cNPoS
 yxE93vN9i3A2xcx87klncVF0C2zIZaZFkp6o7VY/AReL/UyUOh6snz371OXBl7pm
 A/uppkT5QdzTpmknJMyqkLRlHfkMjNRzWv4sdh4kyJSB3SkgaN7zSVi6Zxamt/vJ
 VNCgExZQeDqk4VL2X/NBfaBagYSnPnBmBdXoY6aPYqFrqKj4SlDxYNbJsQlcyE9Z
 z0naVGb5YPEJOaMvE+5z+DwX4EmtN3si+vfi8VuQOXPnoDGOG763rpMLnz7xYvfW
 poPu2fnitN39MaT96btRShD6XuCg9eaPAEmpb3j6c93n1kUo+joLLbalhfc0HMeL
 1/8ndz+KatEUMQTCVgs8cboob1PpRvqhIb+vrs6aTEqCsgqUKUZ7GYgglBamyRka
 mivC5Q+ssCxq47/ilGfECFr8vK0oV3rTu9Ltp4MS5zN70tI0YYZk3o1454nY5dhc
 Byv5e9bft/n9AA576y5vXENcWCSez/8UFGl5RjoxQZ7SFKNFnbSic1BT4uMRVX/G
 4QUk5TWwC8WdOp7YsO30LwZ0y9vtxmfBn8BF/6n/dYGhM1/DVQ1nX9iyzhCHZ3XH
 fgyrkUktdI1dsm/xKvbqxK9Djw0tkMsfH1yI6iQccefnlo4gRSvTRFiM2yepY6py
 E8MZpz1ML8T2Pw==
 =XTdh
 -----END PGP SIGNATURE-----

Merge tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Pull amd-pstate new content for 7.1 (2026-04-02) from Mario Limonciello:

"Add support for new features:
  * CPPC performance priority
  * Dynamic EPP
  * Raw EPP
  * New unit tests for new features
 Fixes for:
  * PREEMPT_RT
  * sysfs files being present when HW missing
  * Broken/outdated documentation"

* tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: (22 commits)
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  amd-pstate-ut: Add module parameter to select testcases
  amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
  amd-pstate: Add sysfs support for floor_freq and floor_count
  amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
  x86/cpufeatures: Add AMD CPPC Performance Priority feature.
  amd-pstate: Make certain freq_attrs conditionally visible
  ...
2026-04-04 20:55:56 +02:00
Borislav Petkov (AMD)
f44cc3a48a x86/fpu: Correct misspelled xfeaures_to_write local var
It happens. Fix it.

No functional changes.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260404120048.14765-1-bp@kernel.org
2026-04-04 14:11:46 +02:00
Naveen N Rao (AMD)
5635c8bfd3 x86/apic: Drop AMD Extended Interrupt LVT macros
AMD defines Extended Interrupt Local Vector Table (EILVT) registers to allow
for additional interrupt sources. While the APIC registers for those are
unique to AMD, the format of those registers follows the standard LVT
registers. Drop EILVT-specific macros in favor of the standard APIC
LVT macros.

Drop unused APIC_EILVT_NR_AMD_K8 and APIC_EILVT_LVTOFF while at it.

No functional change.

  [ bp: Merge the two cleanup patches into one. ]

Signed-off-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Manali Shukla <manali.shukla@amd.com>
Link: https://patch.msgid.link/b98d69037c0102d2ccd082a941888a689cd214c9.1775019269.git.naveen@kernel.org
2026-04-04 00:56:40 +02:00
Mike Rapoport (Microsoft)
d575951980 x86/alternative: delay freeing of smp_locks section
On SMP systems alternative_instructions() frees memory occupied by
smp_locks section immediately after patching the lock instructions.

The memory is freed using free_init_pages() that calls free_reserved_area()
that essentially does __free_page() for every page in the range.

Up until recently it didn't update memblock state so in cases when
CONFIG_ARCH_KEEP_MEMBLOCK is enabled (on x86 it is selected by
INTEL_TDX_HOST), the state of memblock and the memory map would be
inconsistent.

Additionally, with CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled, freeing of
smp_locks happens before the memory map is fully initialized and freeing
reserved memory may cause an access to not-yet-initialized struct page when
__free_page() searches for a buddy page.

Following the discussion in [1], implementation of memblock_free_late() and
free_reserved_area() was unified to ensure that reserved memory that's
freed after memblock transfers the pages to the buddy allocator is actually
freed and that the memblock and the memory map are consistent. As a part of
these changes, free_reserved_area() now WARN()s when it is called before
the initialization of the memory map is complete.

The memory map is fully initialized in page_alloc_init_late() that
completes before initcalls are executed, so it is safe to free reserved
memory in any initcall except early_initcall().

Move freeing of smp_locks section to an initcall to ensure it will happen
after the memory map is fully initialized. Since it does not matter which
exactly initcall to use and the code lives in arch/, pick arch_initcall.

[1] https://lore.kernel.org/all/ec2aaef14783869b3be6e3c253b2dcbf67dbc12a.camel@kernel.crashing.org

Reported-By: Bert Karwatzki <spasswolf@web.de>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202603302154.b50adaf1-lkp@intel.com
Tested-By: Bert Karwatzki <spasswolf@web.de>
Link: https://lore.kernel.org/r/20260327140109.7561-1-spasswolf@web.de
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Fixes: b2129a3951 ("memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=y")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2026-04-03 17:38:34 +03:00
Coiby Xu
03738dd159 crash_dump/dm-crypt: don't print in arch-specific code
Patch series "kdump: Enable LUKS-encrypted dump target support in ARM64
and PowerPC", v5.

CONFIG_CRASH_DM_CRYPT has been introduced to support LUKS-encrypted device
dump target by addressing two challenges [1],

 - Kdump kernel may not be able to decrypt the LUKS partition. For some
   machines, a system administrator may not have a chance to enter the
   password to decrypt the device in kdump initramfs after the 1st kernel
   crashes

 - LUKS2 by default use the memory-hard Argon2 key derivation function
   which is quite memory-consuming compared to the limited memory reserved
   for kdump.

To also enable this feature for ARM64 and PowerPC, we need to add a device
tree property dmcryptkeys [2] as similar to elfcorehdr to pass the memory
address of the stored info of dm-crypt keys to the kdump kernel.


This patch (of 3):

When the vmcore dumping target is not a LUKS-encrypted target, it's
expected that there is no dm-crypt key thus no need to return -ENOENT. 
Also print more logs in crash_load_dm_crypt_keys.  The benefit is
arch-specific code can be more succinct.

Link: https://lkml.kernel.org/r/20260225060347.718905-1-coxu@redhat.com
Link: https://lkml.kernel.org/r/20260225060347.718905-2-coxu@redhat.com
Link: https://lore.kernel.org/all/20250502011246.99238-1-coxu@redhat.com/ [1]
Link: https://github.com/devicetree-org/dt-schema/pull/181 [2]
Signed-off-by: Coiby Xu <coxu@redhat.com>
Suggested-by: Will Deacon <will@kernel.org>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Arnaud Lefebvre <arnaud.lefebvre@clever-cloud.com>
Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Thomas Staudt <tstaudt@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-02 23:36:24 -07:00
Gautham R. Shenoy
172100088f x86/cpufeatures: Add AMD CPPC Performance Priority feature.
Some future AMD processors have feature named "CPPC Performance
Priority" which lets userspace specify different floor performance
levels for different CPUs. The platform firmware takes these different
floor performance levels into consideration while throttling the CPUs
under power/thermal constraints. The presence of this feature is
indicated by bit 16 of the EDX register for CPUID leaf
0x80000007. More details can be found in AMD Publication titled "AMD64
Collaborative Processor Performance Control (CPPC) Performance
Priority" Revision 1.10.

Define a new feature bit named X86_FEATURE_CPPC_PERF_PRIO to map to
CPUID 0x80000007.EDX[16].

Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:21 -05:00