Commit Graph

6441 Commits

Author SHA1 Message Date
Linus Torvalds
cd546f7ae2 Assorted arm64, ACPI and kselftest fixes for 7.1-rc2:
- Avoid writing an uninitialised stack variable to POR_EL0 on
    sigreturn if the poe_context record is absent
 
  - Reserve one more page for the early 4K-page kernel mapping to cover
    the extra [_text, _stext) split introduced by the non-executable
    read-only mapping
 
  - Force the arch_local_irq_*() wrappers to be __always_inline so that
    noinstr entry and idle paths cannot call out-of-line, instrumentable
    copies
 
  - Fix potential sign extension in the arm64 SCS unwinder's DWARF
    advance_loc4 decoding
 
  - Tolerate arm64 ACPI platforms with only WFI and no deeper PSCI idle
    states, restoring cpuidle registration on such systems
 
  - Include the UAPI <asm/ptrace.h> header in the arm64 GCS libc test
    rather than carrying a duplicate struct user_gcs definition (the
    original #ifdef NT_ARM_GCS was wrong to cover the structure
    definition as it would be masked out if the toolchain defined it)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmn03FoACgkQa9axLQDI
 XvELcxAAmhoarEo1Te6wWyybco9LqvfZPzirij+YYLw0GWuqnN99N+f79FZirTbz
 ug9AZiG1PPQY0hCurNWwEjQfWJ6dJYo/4mIT9R1rbeU2MwcxHawIePrM0T8PMBF8
 nHMZaEy/EZ8hX3pam98d78F38yFUvxaikghhxQvHLFlQA4nU19IElQCyMogofe05
 RTE71nDdMZAnfoOS6cVk7wnH99VLfbqiyl97zUOjnyFNdye99UDovayXPUdUkgbN
 clF2qxWInS8TPuoKQPz5hzYkbuR0doFwIasLjSMnOQx+FMZdMmPXEZbwqI/hYl7l
 xc5bjKtJH/AQqdoEkZW9MUJ1GhzMttTpoYW9//wgRpJtBDNxisdOE9LpcsCMMNIM
 wKLrLVLTXsv5jyPeEFMRtUjd0tJ7bV0f3cO/sv5EVBd238CGT76zwCgjpMtZQqbj
 KWsTJpM5oYAsKkBHAYE6XCa5h7kre0/249zH/CYhI/mXJkaHJRM8Ub2CnqBgqeTG
 KobtDIUJt+TPAhThj/2OQ/HxP6SLzgBgsgVmVqE1nhkOPlcfg3YYBsgpgN+bzMfG
 Z7h14yyCAhunoGRVBMtyUgksAvflR+PIS06soRjLZ5cXcOp/3h+sXs6/XVXHtOr/
 UCeO5mfaNUNAr3xJ9oYhuAAT74b7zKXY3YM4NRVASfq6rQS0nWg=
 =a9er
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Avoid writing an uninitialised stack variable to POR_EL0 on sigreturn
   if the poe_context record is absent

 - Reserve one more page for the early 4K-page kernel mapping to cover
   the extra [_text, _stext) split introduced by the non-executable
   read-only mapping

 - Force the arch_local_irq_*() wrappers to be __always_inline so that
   noinstr entry and idle paths cannot call out-of-line, instrumentable
   copies

 - Fix potential sign extension in the arm64 SCS unwinder's DWARF
   advance_loc4 decoding

 - Tolerate arm64 ACPI platforms with only WFI and no deeper PSCI idle
   states, restoring cpuidle registration on such systems

 - Include the UAPI <asm/ptrace.h> header in the arm64 GCS libc test
   rather than carrying a duplicate struct user_gcs definition (the
   original #ifdef NT_ARM_GCS was wrong to cover the structure
   definition as it would be masked out if the toolchain defined it)

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: signal: Preserve POR_EL0 if poe_context is missing
  arm64: Reserve an extra page for early kernel mapping
  kselftest/arm64: Include <asm/ptrace.h> for user_gcs definition
  ACPI: arm64: cpuidle: Tolerate platforms with no deep PSCI idle states
  arm64/irqflags: __always_inline the arch_local_irq_*() helpers
  arm64/scs: Fix potential sign extension issue of advance_loc4
2026-05-01 16:32:42 -07:00
Zhaoyang Huang
4d8e74ad45 arm64: Reserve an extra page for early kernel mapping
The final part of [data, end) segment may overflow into the next page of
init_pg_end[1] which is the gap page before early_init_stack[2]:

[1]
crash_arm64_v9.0.1> vtop ffffffed00601000
VIRTUAL           PHYSICAL
ffffffed00601000  83401000

PAGE DIRECTORY: ffffffecffd62000
   PGD: ffffffecffd62da0 => 10000000833fb003
   PMD: ffffff80033fb018 => 10000000833fe003
   PTE: ffffff80033fe008 => 68000083401f03
  PAGE: 83401000

     PTE        PHYSICAL  FLAGS
68000083401f03  83401000  (VALID|SHARED|AF|NG|PXN|UXN)

      PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
fffffffec00d0040 83401000                0        0  1 4000 reserved

[2]
ffffffed002c8000 (r) __pi__data
ffffffed0054e000 (d) __pi___bss_start
ffffffed005f5000 (b) __pi_init_pg_dir
ffffffed005fe000 (b) __pi_init_pg_end
ffffffed005ff000 (B) early_init_stack
ffffffed00608000 (b) __pi__end

For 4K pages, the early kernel mapping may use 2MB block entries but the
kernel segments are only 64KB aligned. Segment boundaries that fall
within a 2MB block therefore require a PTE table so that different
attributes can be applied on either side of the boundary.

KERNEL_SEGMENT_COUNT still correctly counts the five permanent kernel
VMAs registered by declare_kernel_vmas(). However, since commit
5973a62efa ("arm64: map [_text, _stext) virtual address range
non-executable+read-only"), the early mapper also maps [_text, _stext)
separately from [_stext, _etext). This adds one more early-only split
and can require one more page-table page than the existing
EARLY_SEGMENT_EXTRA_PAGES allowance reserves.

Increase the 4K-page early mapping allowance by one page to cover that
additional split.

Fixes: 5973a62efa ("arm64: map [_text, _stext) virtual address range non-executable+read-only")
Assisted-by: TRAE:GLM-5.1
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
[catalin.marinas@arm.com: rewrote part of the commit log]
[catalin.marinas@arm.com: expanded the code comment]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-05-01 16:20:35 +01:00
Breno Leitao
caecde119e arm64/irqflags: __always_inline the arch_local_irq_*() helpers
The arch_local_irq_*() wrappers in <asm/irqflags.h> dispatch between two
underlying primitives: the __daif_* path on most systems, and the
__pmr_* path on builds that use GIC PMR-based masking (Pseudo-NMI). The
leaf primitives are already __always_inline, but the wrappers themselves
are plain "static inline".

That is unsafe for noinstr callers: nothing prevents the compiler from
emitting an out-of-line copy of e.g. arch_local_irq_disable(), and an
out-of-line copy can be instrumented (ftrace, kcov, sanitizers), which
breaks the noinstr contract on the entry/idle paths that rely on these
helpers.

x86 hit and fixed exactly this class of bug in commit 7a745be1cc
("x86/entry: __always_inline irqflags for noinstr").

Force-inline all of the arch_local_irq_*() wrappers so they cannot be
emitted out-of-line:

  - arch_local_irq_enable()
  - arch_local_irq_disable()
  - arch_local_save_flags()
  - arch_irqs_disabled_flags()
  - arch_irqs_disabled()
  - arch_local_irq_save()
  - arch_local_irq_restore()

The primary motivation is noinstr safety. There is a useful side effect
for fleet-wide profiling: when the wrapper is emitted out-of-line,
samples taken inside it during the post-WFI IRQ unmask in
default_idle_call() are attributed to arch_local_irq_enable rather than
default_idle_call(), and the FP-unwinder loses default_idle_call() from
the chain.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Leonardo Bras <leo.bras@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-27 13:13:36 +01:00
Paolo Bonzini
909eac682c KVM/arm64 fixes for 7.1, take #1
- Allow tracing for non-pKVM, which was accidentally disabled when
   the series was merged
 
 - Rationalise the way the pKVM hypercall ranges are defined by using
   the same mechanism as already used for the vcpu_sysreg enum
 
 - Enforce that SMCCC function numbers relayed by the pKVM proxy are
   actually compliant with the specification
 
 - Fix a couple of feature to idreg mappings which resulted in the
   wrong sanitisation being applied
 
 - Fix the GICD_IIDR revision number field that could never been
   written correctly by userspace
 
 - Make kvm_vcpu_initialized() correctly use its parameter instead
   of relying on the surrounding context
 
 - Enforce correct ordering in __pkvm_init_vcpu(), plugging a
   potential pin leak at the same time
 
 - Move __pkvm_init_finalise() to a less dangerous spot, avoiding
   future problems
 
 - Restore functional userspace irqchip support after a four year
   breakage (last functional kernel was 5.18...). This is obviously
   ripe for garbage collection.
 
 - ... and the usual lot of spelling fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmnrhPoACgkQI9DQutE9
 ekNyIxAAgXhyAJzOEvL22uk8bsCNh+mkV/33cI6uEdxJDNWl6yqcaiWqh9PMK0b6
 JtV/TqNwr9ydqbmJPhlpoRA7tRmoOPXVI7tU0BvqYMdG1FXSqVlPK+DAw/GnOwYD
 2vBz3I6Rwm1C5GAggcZNbU+DWXXpFnnILxSEd0N5HHmhPYp3q20jXMcKKfe7WRVn
 DDn2BIAGe65y1pWrG6f2TMxHAg4SghHy0CCA1+v0cfLyklseUlRVbAjaDO4x/2vT
 qJnjd5dDAzktarOiKFe141HrX4UE13Y3vvOlWDSog3iuACrr09HM8wVEh/49cz+5
 55UKoldaQokTOHhe5p560gfzvsIfIjFPrWBkHJ1rke4ajE4Igg1FQirfl+CaZ3L3
 h8b6gLqu8/i2e+Nj45AoDcvoxCuxTTwPIW/X/yJYBUMCfl5DIRj9SO5W7FHv3iLv
 Aa0ZdDb0rgvg7IW6kiFwlysNPvMAHpigkj4hCEonfP7dQTXjaxWybB8I3a4pOL5Q
 2wSkAcqaYo+UgMXo5r4rbsEWgdrql4jxT9xcEMdv9pxpPck2CWVG1zdmgbHW1rk/
 Pyh0qWbvdnxY9tDCZFxIoNhrynrcZUaoWJPScEU7lHb7T8+Gcb7ylnoJQjNu3K7z
 ZDS2QccLncILTPJabGcFm0a0DnmFfyqwqSo5iMeBtQDnwlKSLko=
 =p/JR
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 7.1, take #1

- Allow tracing for non-pKVM, which was accidentally disabled when
  the series was merged

- Rationalise the way the pKVM hypercall ranges are defined by using
  the same mechanism as already used for the vcpu_sysreg enum

- Enforce that SMCCC function numbers relayed by the pKVM proxy are
  actually compliant with the specification

- Fix a couple of feature to idreg mappings which resulted in the
  wrong sanitisation being applied

- Fix the GICD_IIDR revision number field that could never been
  written correctly by userspace

- Make kvm_vcpu_initialized() correctly use its parameter instead
  of relying on the surrounding context

- Enforce correct ordering in __pkvm_init_vcpu(), plugging a
  potential pin leak at the same time

- Move __pkvm_init_finalise() to a less dangerous spot, avoiding
  future problems

- Restore functional userspace irqchip support after a four year
  breakage (last functional kernel was 5.18...). This is obviously
  ripe for garbage collection.

- ... and the usual lot of spelling fixes
2026-04-27 04:24:34 -04:00
Fuad Tabba
d89fdda7dd KVM: arm64: Fix kvm_vcpu_initialized() macro parameter
The macro is defined with parameter 'v' but the body references the
literal token 'vcpu' instead, causing it to silently operate on whatever
'vcpu' resolves to in the caller's scope rather than the value passed by
the caller. All current call sites happen to use a variable named 'vcpu',
so the bug is latent.

Fixes: e016333745 ("KVM: arm64: Only reset vCPU-scoped feature ID regs once")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260424084908.370776-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
2026-04-24 12:03:57 +01:00
Linus Torvalds
2a4c0c11c0 s390 updates for 7.1 merge window
- Add support for CONFIG_PAGE_TABLE_CHECK and enable it in
   debug_defconfig. s390 can only tell user from kernel PTEs via the mm,
   so mm_struct is now passed into pxx_user_accessible_page() callbacks
 
 - Expose the PCI function UID as an arch-specific slot attribute in
   sysfs so a function can be identified by its user-defined id while
   still in standby. Introduces a generic ARCH_PCI_SLOT_GROUPS hook in
   drivers/pci/slot.c
 
 - Refresh s390 PCI documentation to reflect current behavior and cover
   previously undocumented sysfs attributes
 
 - zcrypt device driver cleanup series: consistent field types, clearer
   variable naming, a kernel-doc warning fix, and a comment explaining
   the intentional synchronize_rcu() in pkey_handler_register()
 
 - Provide an s390 arch_raw_cpu_ptr() that avoids the detour via
   get_lowcore() using alternatives, shrinking defconfig by ~27 kB
 
 - Guard identity-base randomization with kaslr_enabled() so nokaslr
   keeps the identity mapping at 0 even with
   CONFIG_RANDOMIZE_IDENTITY_BASE=y
 
 - Build S390_MODULES_SANITY_TEST as a module only by requiring
   KUNIT && m, since built-in would not exercise module loading
 
 - Remove the permanently commented-out HMCDRV_DEV_CLASS create_class()
   code in the hmcdrv driver
 
 - Drop stale ident_map_size extern conflicting with asm/page.h
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmno78kACgkQjYWKoQLX
 FBiHbggAmW5hPIDf4F8HLomMREaaQb7QAyYwfeefwhcFUXSMu8td8S68aN4UkOnS
 DGSFjb+V6Nqd+ewrF7IS9pRU9YFsmBqo3MnLdcJ/ojZFz8BlwoAi+E4AD1a38hY2
 9zh2siPBMjydqBRUn6zjsK8auk4e8r44iS5MNNMXDF2ePE/PnPKTm93GhbtnnM6r
 a7mQkiPbi6j0sN/UU+pQkhS4fm2XNaGpCGGX0W0v2RdLIYZ9zQQdg4TaEsjQ5wZA
 OC3P8LG3OyJjnxsY2J8PIKK0VM0JP67KUGnQOi1y8HbN1LkFfAWF6CK7tsyUE/JM
 TYg7ENs2mUMmaa8niOGkiXzjjAxD0g==
 =NpmP
 -----END PGP SIGNATURE-----

Merge tag 's390-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Add support for CONFIG_PAGE_TABLE_CHECK and enable it in
   debug_defconfig. s390 can only tell user from kernel PTEs via the mm,
   so mm_struct is now passed into pxx_user_accessible_page() callbacks

 - Expose the PCI function UID as an arch-specific slot attribute in
   sysfs so a function can be identified by its user-defined id while
   still in standby. Introduces a generic ARCH_PCI_SLOT_GROUPS hook in
   drivers/pci/slot.c

 - Refresh s390 PCI documentation to reflect current behavior and cover
   previously undocumented sysfs attributes

 - zcrypt device driver cleanup series: consistent field types, clearer
   variable naming, a kernel-doc warning fix, and a comment explaining
   the intentional synchronize_rcu() in pkey_handler_register()

 - Provide an s390 arch_raw_cpu_ptr() that avoids the detour via
   get_lowcore() using alternatives, shrinking defconfig by ~27 kB

 - Guard identity-base randomization with kaslr_enabled() so nokaslr
   keeps the identity mapping at 0 even with RANDOMIZE_IDENTITY_BASE=y

 - Build S390_MODULES_SANITY_TEST as a module only by requiring KUNIT &&
   m, since built-in would not exercise module loading

 - Remove the permanently commented-out HMCDRV_DEV_CLASS create_class()
   code in the hmcdrv driver

 - Drop stale ident_map_size extern conflicting with asm/page.h

* tag 's390-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/zcrypt: Fix warning about wrong kernel doc comment
  PCI: s390: Expose the UID as an arch specific PCI slot attribute
  docs: s390/pci: Improve and update PCI documentation
  s390/pkey: Add comment about synchronize_rcu() to pkey base
  s390/hmcdrv: Remove commented out code
  s390/zcrypt: Slight rework on the agent_id field
  s390/zcrypt: Explicitly use a card variable in _zcrypt_send_cprb
  s390/zcrypt: Rework MKVP fields and handling
  s390/zcrypt: Make apfs a real unsigned int field
  s390/zcrypt: Rework domain processing within zcrypt device driver
  s390/zcrypt: Move inline function rng_type6cprb_msgx from header to code
  s390/percpu: Provide arch_raw_cpu_ptr()
  s390: Enable page table check for debug_defconfig
  s390/pgtable: Add s390 support for page table check
  s390/pgtable: Use set_pmd_bit() to invalidate PMD entry
  mm/page_table_check: Pass mm_struct to pxx_user_accessible_page()
  s390/boot: Respect kaslr_enabled() for identity randomization
  s390/Kconfig: Make modules sanity test a module-only option
  s390/setup: Drop stale ident_map_size declaration
2026-04-22 11:13:45 -07:00
Linus Torvalds
13f24586a2 arm64 updates for 7.1 (second round):
Core features:
 
  - Add workaround for C1-Pro erratum 4193714 - early CME (SME unit)
    DVMSync acknowledgement. The fix consists of sending IPIs on TLB
    maintenance to those CPUs running in user space with SME enabled
 
  - Include kernel-hwcap.h in list of generated files (missed in a recent
    commit generating the KERNEL_HWCAP_* macros)
 
 CCA:
 
  - Fix RSI_INCOMPLETE error check in arm-cca-guest
 
 MPAM:
 
  - Fix an unmount->remount problem with the CDP emulation, uninitialised
    variable and checker warnings
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmnmQZcACgkQa9axLQDI
 XvGouA//SXlo7hQyM41rkgRru9oqftrGg0y6nxz4Z089kv50cm3Jlf/nUuti6vah
 BMBLCGXA1iOQrGIVmuvtCxDRrfYZWpfKGuT9A0gmEoMqrGIpWl9gfBQG+uR+YrQX
 4kp5DLqB85WrJIPiy7HUV6GQoCbFuMrRJwxl89IdWZSobaei3SczTmnttwyJtxG5
 /BMitl024TYdiOPNo8bhiML1wIJCaTHvH4IrtCHPyUHEAtsHSMy00y0OrSKBtA/9
 ZHZRpY7Po/jnL7YUs1AfYwsaSXjkvqXN0K1Tdavzm75k6lpJmbM3VsZabG/CEuvK
 PCOGV++is4Y/A+7aQsCwXKeVnY3b6AC4sextytNq0g3GZ7I+Ht9O6nbsp5ZmyXzB
 HRiFxmFS1pSQOMX9f1neKi3vxDMTy1tKPeccTTzL8dNnxTvUBXnoWfPoJh3cpbjm
 Dbhe1kksiEn01WWFacGtkIPDa9c+Bkd2T+8wrsk85Z+u3Z0JPM5PfOn6v3X9YlKl
 K7W8fhvlDL1wP+iyWcMT5zdo+xzHY4ZxuyWbi9a4RhKc6lFHVVG2mpUuPwSsh2ma
 NnxkDouriuoADHBir89U71N483HSnNfSjhlVSFYD2LFCre5KOZM4KYZ2vwWb8Sy4
 79q+BlVRUTQ5O6XjePoSPjUW4APPNviHJsF4E4IiqHkd9O5lMZU=
 =LNY2
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 updates from Catalin Marinas:
 "The main 'feature' is a workaround for C1-Pro erratum 4193714
  requiring IPIs during TLB maintenance if a process is running in user
  space with SME enabled.

  The hardware acknowledges the DVMSync messages before completing
  in-flight SME accesses, with security implications. The workaround
  makes use of the mm_cpumask() to track the cores that need
  interrupting (arm64 hasn't used this mask before).

  The rest are fixes for MPAM, CCA and generated header that turned up
  during the merging window or shortly before.

  Summary:

  Core features:

   - Add workaround for C1-Pro erratum 4193714 - early CME (SME unit)
     DVMSync acknowledgement. The fix consists of sending IPIs on TLB
     maintenance to those CPUs running in user space with SME enabled

   - Include kernel-hwcap.h in list of generated files (missed in a
     recent commit generating the KERNEL_HWCAP_* macros)

  CCA:

   - Fix RSI_INCOMPLETE error check in arm-cca-guest

  MPAM:

   - Fix an unmount->remount problem with the CDP emulation,
     uninitialised variable and checker warnings"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm_mpam: resctrl: Make resctrl_mon_ctx_waiters static
  arm_mpam: resctrl: Fix the check for no monitor components found
  arm_mpam: resctrl: Fix MBA CDP alloc_capable handling on unmount
  virt: arm-cca-guest: fix error check for RSI_INCOMPLETE
  arm64/hwcap: Include kernel-hwcap.h in list of generated files
  arm64: errata: Work around early CME DVMSync acknowledgement
  arm64: cputype: Add C1-Pro definitions
  arm64: tlb: Pass the corresponding mm to __tlbi_sync_s1ish()
  arm64: tlb: Introduce __tlbi_sync_s1ish_{kernel,batch}() for TLB maintenance
2026-04-20 16:46:22 -07:00
Catalin Marinas
858fbd7248 Merge branch 'for-next/c1-pro-erratum-4193714' into for-next/core
* for-next/c1-pro-erratum-4193714:
  : Work around C1-Pro erratum 4193714 (CVE-2026-0995)
  arm64: errata: Work around early CME DVMSync acknowledgement
  arm64: cputype: Add C1-Pro definitions
  arm64: tlb: Pass the corresponding mm to __tlbi_sync_s1ish()
  arm64: tlb: Introduce __tlbi_sync_s1ish_{kernel,batch}() for TLB maintenance
2026-04-20 13:12:35 +01:00
Catalin Marinas
818f644ec6 Merge branches 'for-next/misc' and 'for-next/mpam' into for-next/core
* for-next/misc:
  : Miscellaneous cleanups/fixes
  virt: arm-cca-guest: fix error check for RSI_INCOMPLETE
  arm64/hwcap: Include kernel-hwcap.h in list of generated files

* for-next/mpam:
  : Fix an unmount->remount problem with the CDP emulation, uninitialised
  : variable and checker warnings
  arm_mpam: resctrl: Make resctrl_mon_ctx_waiters static
  arm_mpam: resctrl: Fix the check for no monitor components found
  arm_mpam: resctrl: Fix MBA CDP alloc_capable handling on unmount
2026-04-20 13:11:50 +01:00
Marc Zyngier
f05799491d KVM: arm64: pkvm: Adopt MARKER() to define host hypercall ranges
The EL2 code defines ranges of host hypercalls that are either
enabled at boot-time only, used by [nh]VHE KVM, or reserved to pKVM.

The way these ranges are delineated is error prone, as the enum symbols
defining the limits are expressed in terms of actual function symbols.
This means that should a new function be added, special care must be
taken to also update the limit symbol.

Improve this by reusing the mechanism introduced for the vcpu_sysreg
enum, which uses a MARKER() macro and some extra trickery to make
the limit symbol standalone. Crucially, the limit symbol has the
same value as the *following* symbol.

The handle_host_hcall() function is then updated to make use of
the new limit definitions and get rid of the brittle default
upper limit. This allows for some more strict checks at build
time, and the removal of an comparison at run time.

Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260414160528.2218858-1-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-18 09:07:13 +01:00
Linus Torvalds
87768582a4 dma-mapping updates for Linux 7.0:
- added support for batched cache sync, what improves performance of
   dma_map/unmap_sg() operations on ARM64 architecture (Barry Song)
 
 - introduced DMA_ATTR_CC_SHARED attribute for explicitly shared memory
   used in confidential computing (Jiri Pirko)
 
 - refactored spaghetti-like code in drivers/of/of_reserved_mem.c and its
   clients (Marek Szyprowski, shared branch with device-tree updates to
   avoid merge conflicts)
 
 - prepared Contiguous Memory Allocator related code for making dma-buf
   drivers modularized (Maxime Ripard)
 
 - added support for benchmarking dma_map_sg() calls to tools/dma utility
   (Qinxin Xia)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSrngzkoBtlA8uaaJ+Jp1EFxbsSRAUCaeCbdQAKCRCJp1EFxbsS
 RHbWAQCt70dzrU0lu0omTR1HdDP4GTYfuM6nZR91e8/itGN1+QD/XH4I/0wuybzk
 v5uxbIC6lR3abQRc3YNRXfi+i5j26A4=
 =Oee2
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-7.1-2026-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux

Pull dma-mapping updates from Marek Szyprowski:

 - added support for batched cache sync, what improves performance of
   dma_map/unmap_sg() operations on ARM64 architecture (Barry Song)

 - introduced DMA_ATTR_CC_SHARED attribute for explicitly shared memory
   used in confidential computing (Jiri Pirko)

 - refactored spaghetti-like code in drivers/of/of_reserved_mem.c and
   its clients (Marek Szyprowski, shared branch with device-tree updates
   to avoid merge conflicts)

 - prepared Contiguous Memory Allocator related code for making dma-buf
   drivers modularized (Maxime Ripard)

 - added support for benchmarking dma_map_sg() calls to tools/dma
   utility (Qinxin Xia)

* tag 'dma-mapping-7.1-2026-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: (24 commits)
  dma-buf: heaps: system: document system_cc_shared heap
  dma-buf: heaps: system: add system_cc_shared heap for explicitly shared memory
  dma-mapping: introduce DMA_ATTR_CC_SHARED for shared memory
  mm: cma: Export cma_alloc(), cma_release() and cma_get_name()
  dma: contiguous: Export dev_get_cma_area()
  dma: contiguous: Make dma_contiguous_default_area static
  dma: contiguous: Make dev_get_cma_area() a proper function
  dma: contiguous: Turn heap registration logic around
  of: reserved_mem: rework fdt_init_reserved_mem_node()
  of: reserved_mem: clarify fdt_scan_reserved_mem*() functions
  of: reserved_mem: rearrange code a bit
  of: reserved_mem: replace CMA quirks by generic methods
  of: reserved_mem: switch to ops based OF_DECLARE()
  of: reserved_mem: use -ENODEV instead of -ENOENT
  of: reserved_mem: remove fdt node from the structure
  dma-mapping: fix false kernel-doc comment marker
  dma-mapping: Support batch mode for dma_direct_{map,unmap}_sg
  dma-mapping: Separate DMA sync issuing and completion waiting
  arm64: Provide dcache_inval_poc_nosync helper
  arm64: Provide dcache_clean_poc_nosync helper
  ...
2026-04-17 11:12:42 -07:00
Vincent Donnefort
ccab51d69b KVM: arm64: Re-allow hyp tracing HVCs for [nh]VHE
The introduction of __KVM_HOST_SMCCC_FUNC_MAX_NO_PKVM excluded hyp
tracing HVCs from the common [nh]VHE/pKVM list. Re-allow them.

Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://patch.msgid.link/20260414100231.1859687-1-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-17 15:21:05 +01:00
Linus Torvalds
01f492e181 Arm:
- Add support for tracing in the standalone EL2 hypervisor code, which
   should help both debugging and performance analysis.  This uses the
   new infrastructure for 'remote' trace buffers that can be exposed
   by non-kernel entities such as firmware, and which came through the
   tracing tree.
 
 - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting
   point for supporting the new GIC architecture in KVM.
 
 - Finally add support for pKVM protected guests, where pages are unmapped
   from the host as they are faulted into the guest and can be shared back
   from the guest using pKVM hypercalls.  Protected guests are created
   using a new machine type identifier.  As the elusive guestmem has not
   yet delivered on its promises, anonymous memory is also supported.
 
   This is only a first step towards full isolation from the host; for
   example, the CPU register state and DMA accesses are not yet isolated.
   Because this does not really yet bring fully what it promises, it is
   hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and
   also triggers TAINT_USER when a VM is created.  Caveat emptor.
 
 - Rework the dreaded user_mem_abort() function to make it more
   maintainable, reducing the amount of state being exposed to the
   various helpers and rendering a substantial amount of state immutable.
 
 - Expand the Stage-2 page table dumper to support NV shadow page tables
   on a per-VM basis.
 
 - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow.
 
 - Fix both SPE and TRBE in non-VHE configurations so that they do not
   generate spurious, out of context table walks that ultimately lead
   to very bad HW lockups.
 
 - A small set of patches fixing the Stage-2 MMU freeing in error cases.
 
 - Tighten-up accepted SMC immediate value to be only #0 for host
   SMCCC calls.
 
 - The usual cleanups and other selftest churn.
 
 LoongArch:
 
 - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel().
 
 - Add DMSINTC irqchip in kernel support.
 
 RISC-V:
 
 - Fix steal time shared memory alignment checks
 
 - Fix vector context allocation leak
 
 - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi()
 
 - Fix double-free of sdata in kvm_pmu_clear_snapshot_area()
 
 - Fix integer overflow in kvm_pmu_validate_counter_mask()
 
 - Fix shift-out-of-bounds in make_xfence_request()
 
 - Fix lost write protection on huge pages during dirty logging
 
 - Split huge pages during fault handling for dirty logging
 
 - Skip CSR restore if VCPU is reloaded on the same core
 
 - Implement kvm_arch_has_default_irqchip() for KVM selftests
 
 - Factored-out ISA checks into separate sources
 
 - Added hideleg to struct kvm_vcpu_config
 
 - Factored-out VCPU config into separate sources
 
 - Support configuration of per-VM HGATP mode from KVM user space
 
 s390:
 
 - Support for ESA (31-bit) guests inside nested hypervisors.
 
 - Remove restriction on memslot alignment, which is not needed anymore with
   the new gmap code.
 
 - Fix LPSW/E to update the bear (which of course is the breaking event
   address register).
 
 x86:
 
 - Shut up various UBSAN warnings on reading module parameter before they
   were initialized.
 
 - Don't zero-allocate page tables that are used for splitting hugepages in
   the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and
   thus write all bytes.
 
 - As an optimization, bail early when trying to unsync 4KiB mappings if the
   target gfn can just be mapped with a 2MiB hugepage.
 
 x86 generic:
 
 - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely
   struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM
   would dereference stack pointer after an exit to userspace.
 
 - Clean up and comment the emulated MMIO code to try to make it easier to
   maintain (not necessarily "easy", but "easier").
 
 - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of VMX
   and SVM enabling) as it is needed for trusted I/O.
 
 - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions
 
 - Immediately fail the build if a required #define is missing in one of
   KVM's headers that is included multiple times.
 
 - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected
   exception, mostly to prevent syzkaller from abusing the uAPI to
   trigger WARNs, but also because it can help prevent userspace from
   unintentionally crashing the VM.
 
 - Exempt SMM from CPUID faulting on Intel, as per the spec.
 
 - Misc hardening and cleanup changes.
 
 x86 (AMD):
 
 - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU
   so that KVM doesn't prematurely re-enable AVIC if multiple
   vCPUs have to-be-injected IRQs.
 
 - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would
   overwrite state when enabling virtualization on multiple CPUs in parallel.
   This should not be a problem because OSVW should usually be the same for
   all CPUs.
 
 - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a
   "too large" size based purely on user input.
 
 - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION.
 
 - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as
   doing so for an SNP guest will crash the host due to an RMP violation
   page fault.
 
 - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries
   are required to hold kvm->lock, and enforce it by lockdep.  Fix various
   bugs where sev_guest() was not ensured to be stable for the whole
   duration of a function or ioctl.
 
 - Convert a pile of kvm->lock SEV code to guard().
 
 - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD,
   for which KVM needs to set CR2 and DR6 as a response to ioctls such as
   KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2
   rather than CR2, for example).  Only set CR2 and DR6 when consumption of
   the payload is imminent, but on the other hand force delivery of the
   payload in all paths where userspace retrieves CR2 or DR6.
 
 - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead
   of vmcb02->save.cr2.  The value is out of sync after a save/restore
   or after a #PF is injected into L2.
 
 - Fix a class of nSVM bugs where some fields written by the CPU are not
   synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not
   up-to-date when saved by KVM_GET_NESTED_STATE.
 
 - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and
   KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after
   save+restore.
 
 - Add a variety of missing nSVM consistency checks.
 
 - Fix several bugs where KVM failed to correctly update VMCB fields on
   nested #VMEXIT.
 
 - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for
   SVM-related instructions.
 
 - Add support for save+restore of virtualized LBRs (on SVM).
 
 - Refactor various helpers and macros to improve clarity and (hopefully)
   make the code easier to maintain.
 
 - Aggressively sanitize fields when copying from vmcb12, to guard against
   unintentionally allowing L1 to utilize yet-to-be-defined features.
 
 - Fix several bugs where KVM botched rAX legality checks when emulating SVM
   instructions.  There are remaining issues in that KVM doesn't handle size
   prefix overrides for 64-bit guests.
 
 - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of
   somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's
   architectural but sketchy behavior of generating #GP for "unsupported"
   addresses).
 
 - Cache all used vmcb12 fields to further harden against TOCTOU bugs.
 
 x86 (Intel):
 
 - Drop obsolete branch hint prefixes from the VMX instruction macros.
 
 - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a
   register input when appropriate.
 
 - Code cleanups.
 
 guest_memfd:
 
 - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support
   reclaim, the memory is unevictable, and there is no storage to write
   back to.
 
 LoongArch selftests:
 
 - Add KVM PMU test cases
 
 s390 selftests:
 
 - Enable more memory selftests.
 
 x86 selftests:
 
 - Add support for Hygon CPUs in KVM selftests.
 
 - Fix a bug in the MSR test where it would get false failures on AMD/Hygon
   CPUs with exactly one of RDPID or RDTSCP.
 
 - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a
   bug where the kernel would attempt to collapse guest_memfd folios against
   KVM's will.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmnftRQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPAzwf+NKO4Ktv+7A22ImN0SBl0nlUuulsz
 vTcw3+hxdRoIw83GdNS+hG5js0wrpMDnbv3t4+VliDNBSSxrBzcSWX2wpilW0Xtw
 qGo1MWhs2lKPy1NlaRVOwPS6j7uF3AR0TQ1iQLGMedQuCU9WpiKJxyhNXJdbLrt3
 8EgFzsvtEsv+jKNRUNDf9+d0j4gZsFyIe+Brhianbw+u3/UCiUClLCdsKPc4+5ZX
 08otYXytacGNIf/5Ev1vT4pHkHL0yqKXAtX7LEtaS3+0KrPuLjV4slemivzE9vf5
 Evafm5AhA4wpaNMb1ZerhY3T94lsMaJpWxotjR//0Q7C9B59pCQnXCm8mg==
 =CcE0
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "Arm:

   - Add support for tracing in the standalone EL2 hypervisor code,
     which should help both debugging and performance analysis. This
     uses the new infrastructure for 'remote' trace buffers that can be
     exposed by non-kernel entities such as firmware, and which came
     through the tracing tree

   - Add support for GICv5 Per Processor Interrupts (PPIs), as the
     starting point for supporting the new GIC architecture in KVM

   - Finally add support for pKVM protected guests, where pages are
     unmapped from the host as they are faulted into the guest and can
     be shared back from the guest using pKVM hypercalls. Protected
     guests are created using a new machine type identifier. As the
     elusive guestmem has not yet delivered on its promises, anonymous
     memory is also supported

     This is only a first step towards full isolation from the host; for
     example, the CPU register state and DMA accesses are not yet
     isolated. Because this does not really yet bring fully what it
     promises, it is hidden behind CONFIG_ARM_PKVM_GUEST +
     'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is
     created. Caveat emptor

   - Rework the dreaded user_mem_abort() function to make it more
     maintainable, reducing the amount of state being exposed to the
     various helpers and rendering a substantial amount of state
     immutable

   - Expand the Stage-2 page table dumper to support NV shadow page
     tables on a per-VM basis

   - Tidy up the pKVM PSCI proxy code to be slightly less hard to
     follow

   - Fix both SPE and TRBE in non-VHE configurations so that they do not
     generate spurious, out of context table walks that ultimately lead
     to very bad HW lockups

   - A small set of patches fixing the Stage-2 MMU freeing in error
     cases

   - Tighten-up accepted SMC immediate value to be only #0 for host
     SMCCC calls

   - The usual cleanups and other selftest churn

  LoongArch:

   - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel()

   - Add DMSINTC irqchip in kernel support

  RISC-V:

   - Fix steal time shared memory alignment checks

   - Fix vector context allocation leak

   - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi()

   - Fix double-free of sdata in kvm_pmu_clear_snapshot_area()

   - Fix integer overflow in kvm_pmu_validate_counter_mask()

   - Fix shift-out-of-bounds in make_xfence_request()

   - Fix lost write protection on huge pages during dirty logging

   - Split huge pages during fault handling for dirty logging

   - Skip CSR restore if VCPU is reloaded on the same core

   - Implement kvm_arch_has_default_irqchip() for KVM selftests

   - Factored-out ISA checks into separate sources

   - Added hideleg to struct kvm_vcpu_config

   - Factored-out VCPU config into separate sources

   - Support configuration of per-VM HGATP mode from KVM user space

  s390:

   - Support for ESA (31-bit) guests inside nested hypervisors

   - Remove restriction on memslot alignment, which is not needed
     anymore with the new gmap code

   - Fix LPSW/E to update the bear (which of course is the breaking
     event address register)

  x86:

   - Shut up various UBSAN warnings on reading module parameter before
     they were initialized

   - Don't zero-allocate page tables that are used for splitting
     hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in
     the page table and thus write all bytes

   - As an optimization, bail early when trying to unsync 4KiB mappings
     if the target gfn can just be mapped with a 2MiB hugepage

  x86 generic:

   - Copy single-chunk MMIO write values into struct kvm_vcpu (more
     precisely struct kvm_mmio_fragment) to fix use-after-free stack
     bugs where KVM would dereference stack pointer after an exit to
     userspace

   - Clean up and comment the emulated MMIO code to try to make it
     easier to maintain (not necessarily "easy", but "easier")

   - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of
     VMX and SVM enabling) as it is needed for trusted I/O

   - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions

   - Immediately fail the build if a required #define is missing in one
     of KVM's headers that is included multiple times

   - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected
     exception, mostly to prevent syzkaller from abusing the uAPI to
     trigger WARNs, but also because it can help prevent userspace from
     unintentionally crashing the VM

   - Exempt SMM from CPUID faulting on Intel, as per the spec

   - Misc hardening and cleanup changes

  x86 (AMD):

   - Fix and optimize IRQ window inhibit handling for AVIC; make it
     per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple
     vCPUs have to-be-injected IRQs

   - Clean up and optimize the OSVW handling, avoiding a bug in which
     KVM would overwrite state when enabling virtualization on multiple
     CPUs in parallel. This should not be a problem because OSVW should
     usually be the same for all CPUs

   - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains
     about a "too large" size based purely on user input

   - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION

   - Disallow synchronizing a VMSA of an already-launched/encrypted
     vCPU, as doing so for an SNP guest will crash the host due to an
     RMP violation page fault

   - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped
     queries are required to hold kvm->lock, and enforce it by lockdep.
     Fix various bugs where sev_guest() was not ensured to be stable for
     the whole duration of a function or ioctl

   - Convert a pile of kvm->lock SEV code to guard()

   - Play nicer with userspace that does not enable
     KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6
     as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the
     payload would end up in EXITINFO2 rather than CR2, for example).
     Only set CR2 and DR6 when consumption of the payload is imminent,
     but on the other hand force delivery of the payload in all paths
     where userspace retrieves CR2 or DR6

   - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT
     instead of vmcb02->save.cr2. The value is out of sync after a
     save/restore or after a #PF is injected into L2

   - Fix a class of nSVM bugs where some fields written by the CPU are
     not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so
     are not up-to-date when saved by KVM_GET_NESTED_STATE

   - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE
     and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly
     initialized after save+restore

   - Add a variety of missing nSVM consistency checks

   - Fix several bugs where KVM failed to correctly update VMCB fields
     on nested #VMEXIT

   - Fix several bugs where KVM failed to correctly synthesize #UD or
     #GP for SVM-related instructions

   - Add support for save+restore of virtualized LBRs (on SVM)

   - Refactor various helpers and macros to improve clarity and
     (hopefully) make the code easier to maintain

   - Aggressively sanitize fields when copying from vmcb12, to guard
     against unintentionally allowing L1 to utilize yet-to-be-defined
     features

   - Fix several bugs where KVM botched rAX legality checks when
     emulating SVM instructions. There are remaining issues in that KVM
     doesn't handle size prefix overrides for 64-bit guests

   - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
     instead of somewhat arbitrarily synthesizing #GP (i.e. don't double
     down on AMD's architectural but sketchy behavior of generating #GP
     for "unsupported" addresses)

   - Cache all used vmcb12 fields to further harden against TOCTOU bugs

  x86 (Intel):

   - Drop obsolete branch hint prefixes from the VMX instruction macros

   - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a
     register input when appropriate

   - Code cleanups

  guest_memfd:

   - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't
     support reclaim, the memory is unevictable, and there is no storage
     to write back to

  LoongArch selftests:

   - Add KVM PMU test cases

  s390 selftests:

   - Enable more memory selftests

  x86 selftests:

   - Add support for Hygon CPUs in KVM selftests

   - Fix a bug in the MSR test where it would get false failures on
     AMD/Hygon CPUs with exactly one of RDPID or RDTSCP

   - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test
     for a bug where the kernel would attempt to collapse guest_memfd
     folios against KVM's will"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits)
  KVM: x86: use inlines instead of macros for is_sev_*guest
  x86/virt: Treat SVM as unsupported when running as an SEV+ guest
  KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails
  KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper
  KVM: SEV: use mutex guard in snp_handle_guest_req()
  KVM: SEV: use mutex guard in sev_mem_enc_unregister_region()
  KVM: SEV: use mutex guard in sev_mem_enc_ioctl()
  KVM: SEV: use mutex guard in snp_launch_update()
  KVM: SEV: Assert that kvm->lock is held when querying SEV+ support
  KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe"
  KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y
  KVM: SEV: WARN on unhandled VM type when initializing VM
  KVM: LoongArch: selftests: Add PMU overflow interrupt test
  KVM: LoongArch: selftests: Add basic PMU event counting test
  KVM: LoongArch: selftests: Add cpucfg read/write helpers
  LoongArch: KVM: Add DMSINTC inject msi to vCPU
  LoongArch: KVM: Add DMSINTC device support
  LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function
  LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch
  LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch
  ...
2026-04-17 07:18:03 -07:00
Linus Torvalds
440d6635b2 mm.git review status for linus..mm-nonmm-stable
Total patches:       126
 Reviews/patch:       0.92
 Reviewed rate:       76%
 
 - The 2 patch series "pid: make sub-init creation retryable" from Oleg
   Nesterov increases the robustness of our creation of init in a new
   namespace.  By clearing away some historical cruft which is no longer
   needed.  Also some documentation fixups are provided.
 
 - The 2 patch series "selftests/fchmodat2: Error handling and general"
   from Mark Brown has a fixup and a cleanup for the fchmodat2() syscall
   selftest.
 
 - The 3 patch series "lib: polynomial: Move to math/ and clean up" from
   Andy Shevchenko does as advertised.
 
 - The 3 patch series "hung_task: Provide runtime reset interface for
   hung task detector" from Aaron Tomlin gives administrators the ability
   to zero out /proc/sys/kernel/hung_task_detect_count.
 
 - The 2 patch series "tools/getdelays: use the static UAPI headers from
   tools/include/uapi" from Thomas Weißschuh teaches getdelays to use the
   in-kernel UAPI headers rather than the system-provided ones.
 
 - The 5 patch series "watchdog/hardlockup: Improvements to hardlockup"
   from Mayank Rungta provides several cleanups and fixups to the
   hardlockup detector code and its documentation.
 
 - The 2 patch series "lib/bch: fix undefined behavior from signed
   left-shifts" from Josh Law provides a couple of small/theoretical fixes
   in the bch code.
 
 - The 2 patch series "ocfs2/dlm: fix two bugs in dlm_match_regions()"
   from Junrui Luo does what is claims.
 
 - The 27 patch series "cleanup the RAID5 XOR library" from Christoph
   Hellwig is a quite far-reaching cleanup to this code.  I can't do better
   than to quote Christoph:
 
     The XOR library used for the RAID5 parity is a bit of a mess right
     now.  The main file sits in crypto/ despite not being cryptography and
     not using the crypto API, with the generic implementations sitting in
     include/asm-generic and the arch implementations sitting in an asm/
     header in theory.  The latter doesn't work for many cases, so
     architectures often build the code directly into the core kernel, or
     create another module for the architecture code.
 
     Change this to a single module in lib/ that also contains the
     architecture optimizations, similar to the library work Eric Biggers
     has done for the CRC and crypto libraries later.  After that it
     changes to better calling conventions that allow for smarter
     architecture implementations (although none is contained here yet),
     and uses static_call to avoid indirection function call overhead.
 
 - The 2 patch series "lib/list_sort: Clean up list_sort() scheduling
   workarounds" from Kuan-Wei Chiu cleans up this library code by removing
   a hacky thing which was added for UBIFS, which UBIFS doesn't actually
   need.
 
 - The 5 patch series "Fix bugs in extract_iter_to_sg()" from Christian
   Ehrhardt fixes a few bugs in the scatterlist code, adds in-kernel tests
   for the now-fixed bugs and fixes a leak in the test itself.
 
 - The 3 patch series "kdump: Enable LUKS-encrypted dump target support
   in ARM64 and PowerPC" from Coiby Xu eenables support of the
   LUKS-encrypted device dump target on arm64 and powerpc.
 
 - The 4 patch series "ocfs2: consolidate extent list validation into
   block read callbacks" from Joseph Qi addresses ocfs2's validation of
   extent list fields - cleanup, simplification, robustness.  (Kernel test
   robot loves mounting corrupted fs images!)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad90rQAKCRDdBJ7gKXxA
 jl7rAQD4/Rq7ZSSnEv6FS4gOwc3MgTdWcZZaXkqL1KiWyYhRwAEA+cVCO344+AKb
 znBOjet/hUr+/kBwyViifiC8LHzchwM=
 =Nfnf
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - "pid: make sub-init creation retryable" (Oleg Nesterov)

   Make creation of init in a new namespace more robust by clearing away
   some historical cruft which is no longer needed. Also some
   documentation fixups

 - "selftests/fchmodat2: Error handling and general" (Mark Brown)

   Fix and a cleanup for the fchmodat2() syscall selftest

 - "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko)

 - "hung_task: Provide runtime reset interface for hung task detector"
   (Aaron Tomlin)

   Give administrators the ability to zero out
   /proc/sys/kernel/hung_task_detect_count

 - "tools/getdelays: use the static UAPI headers from
   tools/include/uapi" (Thomas Weißschuh)

   Teach getdelays to use the in-kernel UAPI headers rather than the
   system-provided ones

 - "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta)

   Several cleanups and fixups to the hardlockup detector code and its
   documentation

 - "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law)

   A couple of small/theoretical fixes in the bch code

 - "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo)

 - "cleanup the RAID5 XOR library" (Christoph Hellwig)

   A quite far-reaching cleanup to this code. I can't do better than to
   quote Christoph:

     "The XOR library used for the RAID5 parity is a bit of a mess right
      now. The main file sits in crypto/ despite not being cryptography
      and not using the crypto API, with the generic implementations
      sitting in include/asm-generic and the arch implementations
      sitting in an asm/ header in theory. The latter doesn't work for
      many cases, so architectures often build the code directly into
      the core kernel, or create another module for the architecture
      code.

      Change this to a single module in lib/ that also contains the
      architecture optimizations, similar to the library work Eric
      Biggers has done for the CRC and crypto libraries later. After
      that it changes to better calling conventions that allow for
      smarter architecture implementations (although none is contained
      here yet), and uses static_call to avoid indirection function call
      overhead"

 - "lib/list_sort: Clean up list_sort() scheduling workarounds"
   (Kuan-Wei Chiu)

   Clean up this library code by removing a hacky thing which was added
   for UBIFS, which UBIFS doesn't actually need

 - "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt)

   Fix a few bugs in the scatterlist code, add in-kernel tests for the
   now-fixed bugs and fix a leak in the test itself

 - "kdump: Enable LUKS-encrypted dump target support in ARM64 and
   PowerPC" (Coiby Xu)

   Enable support of the LUKS-encrypted device dump target on arm64 and
   powerpc

 - "ocfs2: consolidate extent list validation into block read callbacks"
   (Joseph Qi)

   Cleanup, simplify, and make more robust ocfs2's validation of extent
   list fields (Kernel test robot loves mounting corrupted fs images!)

* tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits)
  ocfs2: validate group add input before caching
  ocfs2: validate bg_bits during freefrag scan
  ocfs2: fix listxattr handling when the buffer is full
  doc: watchdog: fix typos etc
  update Sean's email address
  ocfs2: use get_random_u32() where appropriate
  ocfs2: split transactions in dio completion to avoid credit exhaustion
  ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
  ocfs2: validate extent block list fields during block read
  ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
  ocfs2: validate dx_root extent list fields during block read
  ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
  ocfs2: handle invalid dinode in ocfs2_group_extend
  .get_maintainer.ignore: add Askar
  ocfs2: validate bg_list extent bounds in discontig groups
  checkpatch: exclude forward declarations of const structs
  tools/accounting: handle truncated taskstats netlink messages
  taskstats: set version in TGID exit notifications
  ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
  arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel
  ...
2026-04-16 20:11:56 -07:00
Linus Torvalds
334fbe734e mm.git review status for linus..mm-stable
Everything:
 
 Total patches:       368
 Reviews/patch:       1.56
 Reviewed rate:       74%
 
 Excluding DAMON:
 
 Total patches:       316
 Reviews/patch:       1.77
 Reviewed rate:       81%
 
 Excluding DAMON and zram:
 
 Total patches:       306
 Reviews/patch:       1.81
 Reviewed rate:       82%
 
 Excluding DAMON, zram and maple_tree:
 
 Total patches:       276
 Reviews/patch:       2.01
 Reviewed rate:       91%
 
 Significant patch series in this merge:
 
 - The 30 patch series "maple_tree: Replace big node with maple copy"
   from Liam Howlett is mainly prepararatory work for ongoing development
   but it does reduce stack usage and is an improvement.
 
 - The 12 patch series "mm, swap: swap table phase III: remove swap_map"
   from Kairui Song offers memory savings by removing the static swap_map.
   It also yields some CPU savings and implements several cleanups.
 
 - The 2 patch series "mm: memfd_luo: preserve file seals" from Pratyush
   Yadav adds file seal preservation to LUO's memfd code.
 
 - The 2 patch series "mm: zswap: add per-memcg stat for incompressible
   pages" from Jiayuan Chen adds additional userspace stats reportng to
   zswap.
 
 - The 4 patch series "arch, mm: consolidate empty_zero_page" from Mike
   Rapoport implements some cleanups for our handling of ZERO_PAGE() and
   zero_pfn.
 
 - The 2 patch series "mm/kmemleak: Improve scan_should_stop()
   implementation" from Zhongqiu Han provides an robustness improvement and
   some cleanups in the kmemleak code.
 
 - The 4 patch series "Improve khugepaged scan logic" from Vernon Yang
   "improves the khugepaged scan logic and reduces CPU consumption by
   prioritizing scanning tasks that access memory frequently".
 
 - The 2 patch series "Make KHO Stateless" from Jason Miu simplifies
   Kexec Handover by "transitioning KHO from an xarray-based metadata
   tracking system with serialization to a radix tree data structure that
   can be passed directly to the next kernel"
 
 - The 3 patch series "mm: vmscan: add PID and cgroup ID to vmscan
   tracepoints" from Thomas Ballasi and Steven Rostedt enhances vmscan's
   tracepointing.
 
 - The 5 patch series "mm: arch/shstk: Common shadow stack mapping helper
   and VM_NOHUGEPAGE" from Catalin Marinas is a cleanup for the shadow
   stack code: remove per-arch code in favour of a generic implementation.
 
 - The 2 patch series "Fix KASAN support for KHO restored vmalloc
   regions" from Pasha Tatashin fixes a WARN() which can be emitted the KHO
   restores a vmalloc area.
 
 - The 4 patch series "mm: Remove stray references to pagevec" from Tal
   Zussman provides several cleanups, mainly udpating references to "struct
   pagevec", which became folio_batch three years ago.
 
 - The 17 patch series "mm: Eliminate fake head pages from vmemmap
   optimization" from Kiryl Shutsemau simplifies the HugeTLB vmemmap
   optimization (HVO) by changing how tail pages encode their relationship
   to the head page.
 
 - The 2 patch series "mm/damon/core: improve DAMOS quota efficiency for
   core layer filters" from SeongJae Park improves two problematic
   behaviors of DAMOS that makes it less efficient when core layer filters
   are used.
 
 - The 3 patch series "mm/damon: strictly respect min_nr_regions" from
   SeongJae Park improves DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter.
 
 - The 3 patch series "mm/page_alloc: pcp locking cleanup" from Vlastimil
   Babka is a proper fix for a previously hotfixed SMP=n issue.  Code
   simplifications and cleanups ennsed.
 
 - The 16 patch series "mm: cleanups around unmapping / zapping" from
   David Hildenbrand implements "a bunch of cleanups around unmapping and
   zapping.  Mostly simplifications, code movements, documentation and
   renaming of zapping functions".
 
 - The 6 patch series "support batched checking of the young flag for
   MGLRU" from Baolin Wang supports batched checking of the young flag for
   MGLRU.  It's part cleanups; one benchmark shows large performance
   benefits for arm64.
 
 - The 5 patch series "memcg: obj stock and slab stat caching cleanups"
   from Johannes Weiner provides memcg cleanup and robustness improvements.
 
 - The 5 patch series "Allow order zero pages in page reporting" from
   Yuvraj Sakshith enhances page_reporting's free page reporting - it is
   presently and undesirably order-0 pages when reporting free memory.
 
 - The 6 patch series "mm: vma flag tweaks" from Lorenzo Stoakes is
   cleanup work following from the recent conversion of the VMA flags to a
   bitmap.
 
 - The 10 patch series "mm/damon: add optional debugging-purpose sanity
   checks" from SeongJae Park adds some more developer-facing debug checks
   into DAMON core.
 
 - The 2 patch series "mm/damon: test and document power-of-2
   min_region_sz requirement" from SeongJae Park adds an additional DAMON
   kunit test and makes some adjustments to the addr_unit parameter
   handling.
 
 - The 3 patch series "mm/damon/core: make passed_sample_intervals
   comparisons overflow-safe" from SeongJae Park fixes a hard-to-hit time
   overflow issue in DAMON core.
 
 - The 7 patch series "mm/damon: improve/fixup/update ratio calculation,
   test and documentation" from SeongJae Park is a "batch of misc/minor
   improvements and fixups" for DAMON.
 
 - The 4 patch series "mm: move vma_(kernel|mmu)_pagesize() out of
   hugetlb.c" from David Hildenbrand fixes a possible issue with dax-device
   when CONFIG_HUGETLB=n.  Some code movement was required.
 
 - The 6 patch series "zram: recompression cleanups and tweaks" from
   Sergey Senozhatsky provides "a somewhat random mix of fixups,
   recompression cleanups and improvements" in the zram code.
 
 - The 11 patch series "mm/damon: support multiple goal-based quota
   tuning algorithms" from SeongJae Park extend DAMOS quotas goal
   auto-tuning to support multiple tuning algorithms that users can select.
 
 - The 4 patch series "mm: thp: reduce unnecessary
   start_stop_khugepaged()" from Breno Leitao fixes the khugpaged sysfs
   handling so we no longer spam the logs with reams of junk when
   starting/stopping khugepaged.
 
 - The 3 patch series "mm: improve map count checks" from Lorenzo Stoakes
   provides some cleanups and slight fixes in the mremap, mmap and vma
   code.
 
 - The 5 patch series "mm/damon: support addr_unit on default monitoring
   targets for modules" from SeongJae Park extends the use of DAMON core's
   addr_unit tunable.
 
 - The 5 patch series "mm: khugepaged cleanups and mTHP prerequisites"
   from Nico Pache provides cleanups in the khugepaged and is a base for
   Nico's planned khugepaged mTHP support.
 
 - The 15 patch series "mm: memory hot(un)plug and SPARSEMEM cleanups"
   from David Hildenbrand implements code movement and cleanups in the
   memhotplug and sparsemem code.
 
 - The 2 patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and
   cleanup CONFIG_MIGRATION" from David Hildenbrand rationalizes some
   memhotplug Kconfig support.
 
 - The 6 patch series "change young flag check functions to return bool"
   from Baolin Wang is "a cleanup patchset to change all young flag check
   functions to return bool".
 
 - The 3 patch series "mm/damon/sysfs: fix memory leak and NULL
   dereference issues" from Josh Law and SeongJae Park fixes a few
   potential DAMON bugs.
 
 - The 25 patch series "mm/vma: convert vm_flags_t to vma_flags_t in vma
   code" from "converts a lot of the existing use of the legacy vm_flags_t
   data type to the new vma_flags_t type which replaces it".  Mainly in the
   vma code.
 
 - The 21 patch series "mm: expand mmap_prepare functionality and usage"
   from Lorenzo Stoakes "expands the mmap_prepare functionality, which is
   intended to replace the deprecated f_op->mmap hook which has been the
   source of bugs and security issues for some time".  Cleanups,
   documentation, extension of mmap_prepare into filesystem drivers.
 
 - The 13 patch series "mm/huge_memory: refactor zap_huge_pmd()" from
   Lorenzo Stoakes simplifies and cleans up zap_huge_pmd().  Additional
   cleanups around vm_normal_folio_pmd() and the softleaf functionality are
   performed.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCad3HDQAKCRDdBJ7gKXxA
 jrUQAPwNhPk5nPSxnyxjAeQtOBHqgCdnICeEismLajPKd9aYRgEA0s2XAu3tSUYi
 GrBnWImHG3s4ePQxVcPCegWTsOUrXgQ=
 =1Q7o
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "maple_tree: Replace big node with maple copy" (Liam Howlett)

   Mainly prepararatory work for ongoing development but it does reduce
   stack usage and is an improvement.

 - "mm, swap: swap table phase III: remove swap_map" (Kairui Song)

   Offers memory savings by removing the static swap_map. It also yields
   some CPU savings and implements several cleanups.

 - "mm: memfd_luo: preserve file seals" (Pratyush Yadav)

   File seal preservation to LUO's memfd code

 - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
   Chen)

   Additional userspace stats reportng to zswap

 - "arch, mm: consolidate empty_zero_page" (Mike Rapoport)

   Some cleanups for our handling of ZERO_PAGE() and zero_pfn

 - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
   Han)

   A robustness improvement and some cleanups in the kmemleak code

 - "Improve khugepaged scan logic" (Vernon Yang)

   Improve khugepaged scan logic and reduce CPU consumption by
   prioritizing scanning tasks that access memory frequently

 - "Make KHO Stateless" (Jason Miu)

   Simplify Kexec Handover by transitioning KHO from an xarray-based
   metadata tracking system with serialization to a radix tree data
   structure that can be passed directly to the next kernel

 - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
   Ballasi and Steven Rostedt)

   Enhance vmscan's tracepointing

 - "mm: arch/shstk: Common shadow stack mapping helper and
   VM_NOHUGEPAGE" (Catalin Marinas)

   Cleanup for the shadow stack code: remove per-arch code in favour of
   a generic implementation

 - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)

   Fix a WARN() which can be emitted the KHO restores a vmalloc area

 - "mm: Remove stray references to pagevec" (Tal Zussman)

   Several cleanups, mainly udpating references to "struct pagevec",
   which became folio_batch three years ago

 - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
   Shutsemau)

   Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
   pages encode their relationship to the head page

 - "mm/damon/core: improve DAMOS quota efficiency for core layer
   filters" (SeongJae Park)

   Improve two problematic behaviors of DAMOS that makes it less
   efficient when core layer filters are used

 - "mm/damon: strictly respect min_nr_regions" (SeongJae Park)

   Improve DAMON usability by extending the treatment of the
   min_nr_regions user-settable parameter

 - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)

   The proper fix for a previously hotfixed SMP=n issue. Code
   simplifications and cleanups ensued

 - "mm: cleanups around unmapping / zapping" (David Hildenbrand)

   A bunch of cleanups around unmapping and zapping. Mostly
   simplifications, code movements, documentation and renaming of
   zapping functions

 - "support batched checking of the young flag for MGLRU" (Baolin Wang)

   Batched checking of the young flag for MGLRU. It's part cleanups; one
   benchmark shows large performance benefits for arm64

 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)

   memcg cleanup and robustness improvements

 - "Allow order zero pages in page reporting" (Yuvraj Sakshith)

   Enhance free page reporting - it is presently and undesirably order-0
   pages when reporting free memory.

 - "mm: vma flag tweaks" (Lorenzo Stoakes)

   Cleanup work following from the recent conversion of the VMA flags to
   a bitmap

 - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
   Park)

   Add some more developer-facing debug checks into DAMON core

 - "mm/damon: test and document power-of-2 min_region_sz requirement"
   (SeongJae Park)

   An additional DAMON kunit test and makes some adjustments to the
   addr_unit parameter handling

 - "mm/damon/core: make passed_sample_intervals comparisons
   overflow-safe" (SeongJae Park)

   Fix a hard-to-hit time overflow issue in DAMON core

 - "mm/damon: improve/fixup/update ratio calculation, test and
   documentation" (SeongJae Park)

   A batch of misc/minor improvements and fixups for DAMON

 - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
   Hildenbrand)

   Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
   movement was required.

 - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)

   A somewhat random mix of fixups, recompression cleanups and
   improvements in the zram code

 - "mm/damon: support multiple goal-based quota tuning algorithms"
   (SeongJae Park)

   Extend DAMOS quotas goal auto-tuning to support multiple tuning
   algorithms that users can select

 - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)

   Fix the khugpaged sysfs handling so we no longer spam the logs with
   reams of junk when starting/stopping khugepaged

 - "mm: improve map count checks" (Lorenzo Stoakes)

   Provide some cleanups and slight fixes in the mremap, mmap and vma
   code

 - "mm/damon: support addr_unit on default monitoring targets for
   modules" (SeongJae Park)

   Extend the use of DAMON core's addr_unit tunable

 - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)

   Cleanups to khugepaged and is a base for Nico's planned khugepaged
   mTHP support

 - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)

   Code movement and cleanups in the memhotplug and sparsemem code

 - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
   CONFIG_MIGRATION" (David Hildenbrand)

   Rationalize some memhotplug Kconfig support

 - "change young flag check functions to return bool" (Baolin Wang)

   Cleanups to change all young flag check functions to return bool

 - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
   Law and SeongJae Park)

   Fix a few potential DAMON bugs

 - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
   Stoakes)

   Convert a lot of the existing use of the legacy vm_flags_t data type
   to the new vma_flags_t type which replaces it. Mainly in the vma
   code.

 - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)

   Expand the mmap_prepare functionality, which is intended to replace
   the deprecated f_op->mmap hook which has been the source of bugs and
   security issues for some time. Cleanups, documentation, extension of
   mmap_prepare into filesystem drivers

 - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)

   Simplify and clean up zap_huge_pmd(). Additional cleanups around
   vm_normal_folio_pmd() and the softleaf functionality are performed.

* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm: fix deferred split queue races during migration
  mm/khugepaged: fix issue with tracking lock
  mm/huge_memory: add and use has_deposited_pgtable()
  mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
  mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
  mm/huge_memory: separate out the folio part of zap_huge_pmd()
  mm/huge_memory: use mm instead of tlb->mm
  mm/huge_memory: remove unnecessary sanity checks
  mm/huge_memory: deduplicate zap deposited table call
  mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
  mm/huge_memory: add a common exit path to zap_huge_pmd()
  mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
  mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
  mm/huge: avoid big else branch in zap_huge_pmd()
  mm/huge_memory: simplify vma_is_specal_huge()
  mm: on remap assert that input range within the proposed VMA
  mm: add mmap_action_map_kernel_pages[_full]()
  uio: replace deprecated mmap hook with mmap_prepare in uio_info
  drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
  mm: allow handling of stacked mmap_prepare hooks in more drivers
  ...
2026-04-15 12:59:16 -07:00
Linus Torvalds
c43267e679 arm64 updates for 7.1:
Core features:
 
  - Add support for FEAT_LSUI, allowing futex atomic operations without
    toggling Privileged Access Never (PAN)
 
  - Further refactor the arm64 exception handling code towards the
    generic entry infrastructure
 
  - Optimise __READ_ONCE() with CONFIG_LTO=y and allow alias analysis
    through it
 
 Memory management:
 
  - Refactor the arm64 TLB invalidation API and implementation for better
    control over barrier placement and level-hinted invalidation
 
  - Enable batched TLB flushes during memory hot-unplug
 
  - Fix rodata=full block mapping support for realm guests (when
    BBML2_NOABORT is available)
 
 Perf and PMU:
 
  - Add support for a whole bunch of system PMUs featured in NVIDIA's
    Tegra410 SoC (cspmu extensions for the fabric and PCIe, new drivers
    for CPU/C2C memory latency PMUs)
 
  - Clean up iomem resource handling in the Arm CMN driver
 
  - Fix signedness handling of AA64DFR0.{PMUVer,PerfMon}
 
 MPAM (Memory Partitioning And Monitoring):
 
  - Add architecture context-switch and hiding of the feature from KVM
 
  - Add interface to allow MPAM to be exposed to user-space using resctrl
 
  - Add errata workaround for some existing platforms
 
  - Add documentation for using MPAM and what shape of platforms can use
    resctrl
 
 Miscellaneous:
 
  - Check DAIF (and PMR, where relevant) at task-switch time
 
  - Skip TFSR_EL1 checks and barriers in synchronous MTE tag check mode
    (only relevant to asynchronous or asymmetric tag check modes)
 
  - Remove a duplicate allocation in the kexec code
 
  - Remove redundant save/restore of SCS SP on entry to/from EL0
 
  - Generate the KERNEL_HWCAP_ definitions from the arm64 hwcap
    descriptions
 
  - Add kselftest coverage for cmpbr_sigill()
 
  - Update sysreg definitions
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmnc8DEACgkQa9axLQDI
 XvFauRAAhc1cIgoRpgtdZd7+3/g457teDPYA3L/CjJzI28aesIpV/ECrEw2GL4xs
 HrQfijF4oyCDbBwh0sAascO/H7RoyOranlbuc+fVJ6Bj6gP9STzR4GmscsWkAMSJ
 vA3Jd1DREdDBO2sjw+hGhht84nRlcfY1FyORJP+1JaFH4oWTWsRNeOZIiI3BhxR8
 EtFP9E8r2Esxi/FmZb/47m7kYCEH+XsrzQvBQNLVCH899QX2Hn0kAY70ndq2ZiQl
 n+zLAe7FBFwKzUVmlgWuhjrWMmK+1TthK/XQuOtxg13dHmX+vE/j+A+dOqRWSfHY
 ktNcWaf6m4+TWKVeVTe4E1cnSuwTQTm4VQKd9zaeQxiZYyYJhCQjXuEZg3vDmDbq
 F6D3MpTaJHRRWp0rEurxnSBlmQPCBE2IxEBdSrjd/WJ6T9e1oYwWiSJSS7bGCgGr
 dd/XLsOY7Um5n4ooIFEZc1de6VO6/VTKjmxnBMgU+Sa1REbLpD438IX/6CjzG5qM
 l5Ulke/c6/a/faeVCEpZpD8JuvNOzo9RISDPrNg1KKAL+OSU+9tgmVjIFPhDDB0w
 zNTqT7YJIhxlJxnUGWDk8YNsTjT3OzyquY9UT1tBTBqC0k13J2i2ev30toUez7xj
 2aV+9qMpunbLtwYhXNun1hBFiYrCxpX7I8ha0hXiXL0CywVOPTI=
 =CnVn
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "The biggest changes are MPAM enablement in drivers/resctrl and new PMU
  support under drivers/perf.

  On the core side, FEAT_LSUI lets futex atomic operations with EL0
  permissions, avoiding PAN toggling.

  The rest is mostly TLB invalidation refactoring, further generic entry
  work, sysreg updates and a few fixes.

  Core features:

   - Add support for FEAT_LSUI, allowing futex atomic operations without
     toggling Privileged Access Never (PAN)

   - Further refactor the arm64 exception handling code towards the
     generic entry infrastructure

   - Optimise __READ_ONCE() with CONFIG_LTO=y and allow alias analysis
     through it

  Memory management:

   - Refactor the arm64 TLB invalidation API and implementation for
     better control over barrier placement and level-hinted invalidation

   - Enable batched TLB flushes during memory hot-unplug

   - Fix rodata=full block mapping support for realm guests (when
     BBML2_NOABORT is available)

  Perf and PMU:

   - Add support for a whole bunch of system PMUs featured in NVIDIA's
     Tegra410 SoC (cspmu extensions for the fabric and PCIe, new drivers
     for CPU/C2C memory latency PMUs)

   - Clean up iomem resource handling in the Arm CMN driver

   - Fix signedness handling of AA64DFR0.{PMUVer,PerfMon}

  MPAM (Memory Partitioning And Monitoring):

   - Add architecture context-switch and hiding of the feature from KVM

   - Add interface to allow MPAM to be exposed to user-space using
     resctrl

   - Add errata workaround for some existing platforms

   - Add documentation for using MPAM and what shape of platforms can
     use resctrl

  Miscellaneous:

   - Check DAIF (and PMR, where relevant) at task-switch time

   - Skip TFSR_EL1 checks and barriers in synchronous MTE tag check mode
     (only relevant to asynchronous or asymmetric tag check modes)

   - Remove a duplicate allocation in the kexec code

   - Remove redundant save/restore of SCS SP on entry to/from EL0

   - Generate the KERNEL_HWCAP_ definitions from the arm64 hwcap
     descriptions

   - Add kselftest coverage for cmpbr_sigill()

   - Update sysreg definitions"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (109 commits)
  arm64: rsi: use linear-map alias for realm config buffer
  arm64: Kconfig: fix duplicate word in CMDLINE help text
  arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode
  arm64/sysreg: Update ID_AA64SMFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12
  arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps
  arm64: kexec: Remove duplicate allocation for trans_pgd
  ACPI: AGDI: fix missing newline in error message
  arm64: Check DAIF (and PMR) at task-switch time
  arm64: entry: Use split preemption logic
  arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode()
  arm64: entry: Consistently prefix arm64-specific wrappers
  arm64: entry: Don't preempt with SError or Debug masked
  entry: Split preemption from irqentry_exit_to_kernel_mode()
  entry: Split kernel mode logic from irqentry_{enter,exit}()
  entry: Move irqentry_enter() prototype later
  entry: Remove local_irq_{enable,disable}_exit_to_user()
  ...
2026-04-14 16:48:56 -07:00
Linus Torvalds
7393febcb1 Locking updates for v7.1:
Mutexes:
 
  - Add killable flavor to guard definitions (Davidlohr Bueso)
  - Remove the list_head from struct mutex (Matthew Wilcox)
  - Rename mutex_init_lockep() (Davidlohr Bueso)
 
 rwsems:
 
  - Remove the list_head from struct rw_semaphore and
    replace it with a single pointer (Matthew Wilcox)
  - Fix logic error in rwsem_del_waiter() (Andrei Vagin)
 
 Semaphores:
 
  - Remove the list_head from struct semaphore (Matthew Wilcox)
 
 Jump labels:
 
  - Use ATOMIC_INIT() for initialization of .enabled (Thomas Weißschuh)
  - Remove workaround for old compilers in initializations
    (Thomas Weißschuh)
 
 Lock context analysis changes and improvements:
 
  - Add context analysis for rwsems (Peter Zijlstra)
  - Fix rwlock and spinlock lock context annotations (Bart Van Assche)
  - Fix rwlock support in <linux/spinlock_up.h> (Bart Van Assche)
  - Add lock context annotations in the spinlock implementation
    (Bart Van Assche)
  - signal: Fix the lock_task_sighand() annotation (Bart Van Assche)
  - ww-mutex: Fix the ww_acquire_ctx function annotations
    (Bart Van Assche)
  - Add lock context support in do_raw_{read,write}_trylock()
    (Bart Van Assche)
  - arm64, compiler-context-analysis: Permit alias analysis through
    __READ_ONCE() with CONFIG_LTO=y (Marco Elver)
  - Add __cond_releases() (Peter Zijlstra)
  - Add context analysis for mutexes (Peter Zijlstra)
  - Add context analysis for rtmutexes (Peter Zijlstra)
  - Convert futexes to compiler context analysis (Peter Zijlstra)
 
 Rust integration updates:
 
  - Add atomic fetch_sub() implementation (Andreas Hindborg)
  - Refactor various rust_helper_ methods for expansion (Boqun Feng)
  - Add Atomic<*{mut,const} T> support (Boqun Feng)
  - Add atomic operation helpers over raw pointers (Boqun Feng)
  - Add performance-optimal Flag type for atomic booleans, to avoid
    slow byte-sized RMWs on architectures that don't support them.
    (FUJITA Tomonori)
  - Misc cleanups and fixes (Andreas Hindborg, Boqun Feng,
    FUJITA Tomonori)
 
 LTO support updates:
 
  - arm64: Optimize __READ_ONCE() with CONFIG_LTO=y (Marco Elver)
  - compiler: Simplify generic RELOC_HIDE() (Marco Elver)
 
 Miscellaneous fixes and cleanups by Peter Zijlstra, Randy Dunlap,
 Thomas Weißschuh, Davidlohr Bueso and Mikhail Gavrilov.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmncnGERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ig7Q//a3UgHjUe96/zuJIv/X1lt5MU0GHP/m/n
 Rf6c39P0VWV6iupJtZ6gPmtQBQDyqWsnfE9S6PFDW4P/Njn0CGEBhk5bcYiN7dc5
 pN0hfM67rY1Ids2FJo5JVIxw2pNZpZHU4v3dJC84xH1cPwmccHxt3XW67iQnJCY9
 6m7RJ3nUfmNC1qLGKtAFQp3N91hK+BYxqZQ1Wn6a0lRWfmYY7WDs8qrr5N6Ezn7W
 53ZNXXbXUC09iOO/slOZmFD5tDrp5Z1nPYTeOdFnWYC5SoTvkfauTqmfZRN5sFad
 8vRxXHuCsdBthNF+ljobBUhZx9QL4UJMGOJTFVp9dZSj13vI7UNlbfAtwMKM8lsR
 L+v+GSsGdQWwrhzaiz53k6ZuUUDECltjwKFFUBy9RPFMtKkpKsgjW+X6I+SFeTQW
 QAPzaA/fEK45bvSPUcjn09kKKC1EVIjHQ6NvByoVjPABz92PpJgOR0si2PW5F314
 7sKzk4Ra5x8NGDSEiC9uSwB7mIv56/lUq5zuVoz3CB2rehqIpPdeieE8TaXvcmdK
 8otsWYcUXDcj/d6en9XBzb0t3LpQ1TumcOGw3xUJhrSoB4DvmWgW788SbGIKklR9
 KFQLU2USlm4u8JHEFXOAeaDhWME+eCqP5FCq3YTqxLksiA+oYx3Xui1R+5L4yjRs
 bVbp+BIs3N4=
 =aw95
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
 "Mutexes:

   - Add killable flavor to guard definitions (Davidlohr Bueso)

   - Remove the list_head from struct mutex (Matthew Wilcox)

   - Rename mutex_init_lockep() (Davidlohr Bueso)

  rwsems:

   - Remove the list_head from struct rw_semaphore and
     replace it with a single pointer (Matthew Wilcox)

   - Fix logic error in rwsem_del_waiter() (Andrei Vagin)

  Semaphores:

   - Remove the list_head from struct semaphore (Matthew Wilcox)

  Jump labels:

   - Use ATOMIC_INIT() for initialization of .enabled (Thomas Weißschuh)

   - Remove workaround for old compilers in initializations
     (Thomas Weißschuh)

  Lock context analysis changes and improvements:

   - Add context analysis for rwsems (Peter Zijlstra)

   - Fix rwlock and spinlock lock context annotations (Bart Van Assche)

   - Fix rwlock support in <linux/spinlock_up.h> (Bart Van Assche)

   - Add lock context annotations in the spinlock implementation
     (Bart Van Assche)

   - signal: Fix the lock_task_sighand() annotation (Bart Van Assche)

   - ww-mutex: Fix the ww_acquire_ctx function annotations
     (Bart Van Assche)

   - Add lock context support in do_raw_{read,write}_trylock()
     (Bart Van Assche)

   - arm64, compiler-context-analysis: Permit alias analysis through
     __READ_ONCE() with CONFIG_LTO=y (Marco Elver)

   - Add __cond_releases() (Peter Zijlstra)

   - Add context analysis for mutexes (Peter Zijlstra)

   - Add context analysis for rtmutexes (Peter Zijlstra)

   - Convert futexes to compiler context analysis (Peter Zijlstra)

  Rust integration updates:

   - Add atomic fetch_sub() implementation (Andreas Hindborg)

   - Refactor various rust_helper_ methods for expansion (Boqun Feng)

   - Add Atomic<*{mut,const} T> support (Boqun Feng)

   - Add atomic operation helpers over raw pointers (Boqun Feng)

   - Add performance-optimal Flag type for atomic booleans, to avoid
     slow byte-sized RMWs on architectures that don't support them.
     (FUJITA Tomonori)

   - Misc cleanups and fixes (Andreas Hindborg, Boqun Feng, FUJITA
     Tomonori)

  LTO support updates:

   - arm64: Optimize __READ_ONCE() with CONFIG_LTO=y (Marco Elver)

   - compiler: Simplify generic RELOC_HIDE() (Marco Elver)

  Miscellaneous fixes and cleanups by Peter Zijlstra, Randy Dunlap,
  Thomas Weißschuh, Davidlohr Bueso and Mikhail Gavrilov"

* tag 'locking-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  compiler: Simplify generic RELOC_HIDE()
  locking: Add lock context annotations in the spinlock implementation
  locking: Add lock context support in do_raw_{read,write}_trylock()
  locking: Fix rwlock support in <linux/spinlock_up.h>
  lockdep: Raise default stack trace limits when KASAN is enabled
  cleanup: Optimize guards
  jump_label: remove workaround for old compilers in initializations
  jump_label: use ATOMIC_INIT() for initialization of .enabled
  futex: Convert to compiler context analysis
  locking/rwsem: Fix logic error in rwsem_del_waiter()
  locking/rwsem: Add context analysis
  locking/rtmutex: Add context analysis
  locking/mutex: Add context analysis
  compiler-context-analysys: Add __cond_releases()
  locking/mutex: Remove the list_head from struct mutex
  locking/semaphore: Remove the list_head from struct semaphore
  locking/rwsem: Remove the list_head from struct rw_semaphore
  rust: atomic: Update a safety comment in impl of `fetch_add()`
  rust: sync: atomic: Update documentation for `fetch_add()`
  rust: sync: atomic: Add fetch_sub()
  ...
2026-04-14 12:36:25 -07:00
Linus Torvalds
f21f7b5162 Update to the VDSO subsystem:
- Make the handling of compat functions consistent and more robust
 
      - Rework the underlying data store so that it is dynamically
        allocated, which allows the conversion of the last holdout SPARC64
        to the generic VDSO implementation
 
      - Rework the SPARC64 VDSO to utilize the generic implementation
 
      - Mop up the left overs of the non-generic VDSO support in the core
        code.
 
      - Expand the VDSO selftest and make them more robust
 
      - Allow time namespaces to be enabled independently of the generic
        VDSO support, which was not possible before due to SPARC64 not
        using it.
 
      - Various cleanups and improvements in the related code.
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmnb0v8QHHRnbHhAa2Vy
 bmVsLm9yZwAKCRCmGPVMDXSYocfqD/9ywgnvwRH6B612mY4PI3qCbLHs6n9f78aH
 YwyXmmfBZ5vt1ZtptHD+BAxiIMm9GC+/exdj5zhcOWucnBVhorcloE6evxhkJAMn
 RhTQFKkEmcA/UV2Yfct9r+33kgZRyu4IIul4J7hgn2o5T1BqwZbOil0W/O5adr5P
 MDLxjT1OLV80ZZWI9qbWcR/aR7W7sHcdwfVPPqjhombRY7f391Mo3dZeM5C2y55x
 8TXCEqVpN1RJzFinWEgQN7QpP4OmF0rRuXSrDQpkH6pk/+RSqNlT/QGG7MJtmCQR
 E6CeBjNRUn318KiroaGyTKlM9xsL3gNoiCY24ZTwzZxx3g5gSAR3KTCTJhQU0hpu
 Svxj+ksqEAyW7fAOIsbce6W8fUPKC2KM+juXgPKcqZ5hjE2fALD+eEYMlq00jSiu
 sj71007cM9tZKOXPdWs3Fv7AY2Yj7iiRiRz9gv1wqS1z7ybxiaFjxjLYYakej0tr
 rmwBDEGhNow7msZZttr01BRZk9hDUWfIiJtL+0BrgRLNzst2A7WoagtZ2s0Z7Psl
 RjtWgYNBDJ878xK0J+Djqb9TyLraGWZShIIna9uYCAJX9i954xfKJ//NOnUkZhcl
 jslDLHhdttyJ+TmgIsc1ntUGvYvHqH5ywQpyDfWepMKyIYdaJLHOr2K6bwFnGHdw
 uocXvLrkXw==
 =8ixX
 -----END PGP SIGNATURE-----

Merge tag 'timers-vdso-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull vdso updates from Thomas Gleixner:

 - Make the handling of compat functions consistent and more robust

 - Rework the underlying data store so that it is dynamically allocated,
   which allows the conversion of the last holdout SPARC64 to the
   generic VDSO implementation

 - Rework the SPARC64 VDSO to utilize the generic implementation

 - Mop up the left overs of the non-generic VDSO support in the core
   code

 - Expand the VDSO selftest and make them more robust

 - Allow time namespaces to be enabled independently of the generic VDSO
   support, which was not possible before due to SPARC64 not using it

 - Various cleanups and improvements in the related code

* tag 'timers-vdso-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  timens: Use task_lock guard in timens_get*()
  timens: Use mutex guard in proc_timens_set_offset()
  timens: Simplify some calls to put_time_ns()
  timens: Add a __free() wrapper for put_time_ns()
  timens: Remove dependency on the vDSO
  vdso/timens: Move functions to new file
  selftests: vDSO: vdso_test_correctness: Add a test for time()
  selftests: vDSO: vdso_test_correctness: Use facilities from parse_vdso.c
  selftests: vDSO: vdso_test_correctness: Handle different tv_usec types
  selftests: vDSO: vdso_test_correctness: Drop SYS_getcpu fallbacks
  selftests: vDSO: vdso_test_gettimeofday: Remove nolibc checks
  Revert "selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers"
  random: vDSO: Remove ifdeffery
  random: vDSO: Trim vDSO includes
  vdso/datapage: Trim down unnecessary includes
  vdso/datapage: Remove inclusion of gettimeofday.h
  vdso/helpers: Explicitly include vdso/processor.h
  vdso/gettimeofday: Add explicit includes
  random: vDSO: Add explicit includes
  MIPS: vdso: Explicitly include asm/vdso/vdso.h
  ...
2026-04-14 10:53:44 -07:00
Mark Brown
680b961ebf arm64/hwcap: Include kernel-hwcap.h in list of generated files
When adding generation for the kernel internal constants for hwcaps the
generated file was not explicitly flagged as such in the build system,
causing it to be regenerated on each build. This wasn't obvious when the
series the change was included in was developed since it was all about
changes that trigger rebuilds anyway.

Fixes: abed23c3c4 ("arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps")
Reported-by: Marek Vasut <marex@nabladev.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-14 16:23:40 +01:00
Linus Torvalds
2e31b16101 ACPI support updates for 7.1-rc1
- Update maintainers information regarding ACPICA (Rafael Wysocki)
 
  - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy() (Kees
    Cook)
 
  - Trigger an ordered system power off after encountering a fatal error
    operator in AML (Armin Wolf)
 
  - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao)
 
  - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure from
    the ACPI PPTT parser (Ben Horgan)
 
  - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate
    DeSimone)
 
  - Address multiple assorted issues and clean up the code in the ACPI
    processor idle driver (Huisong Li)
 
  - Replace strlcat() in the ACPI processor idle drive with a better
    alternative (Andy Shevchenko)
 
  - Rearrange and clean up acpi_processor_errata_piix4() (Rafael Wysocki)
 
  - Move reference performance to capabilities and fix an uninitialized
    variable in the ACPI CPPC library (Pengjie Zhang)
 
  - Add support for the Performance Limited Register to the ACPI CPPC
    library (Sumit Gupta)
 
  - Add cppc_get_perf() API to read performance controls, extend
    cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC
    library warn on missing mandatory DESIRED_PERF register (Sumit Gupta)
 
  - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in target
    callbacks to allow it to control performance bounds via standard
    scaling_min_freq and scaling_max_freq sysfs attributes and add sysfs
    documentation for the Performance Limited Register to it (Sumit Gupta)
 
  - Add ACPI support to the platform device interface in the CMOS RTC
    driver, make the ACPI core device enumeration code create a platform
    device for the CMOS RTC, and drop CMOS RTC PNP device support (Rafael
    Wysocki)
 
  - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD
    driver and clean up the CMOS RTC ACPI address space handler (Rafael
    Wysocki)
 
  - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT
    and allow that driver to work without a dedicated IRQ if the ACPI
    alarm is used (Rafael Wysocki)
 
  - Clean up the ACPI TAD driver in various ways and add an RTC class
    device interface, including both the RTC setting/reading and alarm
    timer support, to it (Rafael Wysocki)
 
  - Clean up the ACPI AC and ACPI PAD (processor aggregator device)
    drivers (Rafael Wysocki)
 
  - Rework checking for duplicate video bus devices and consolidate
    pnp.bus_id workarounds handling in the ACPI video bus driver (Rafael
    Wysocki)
 
  - Update the ACPI core device drivers to stop setting acpi_device_name()
    unnecessarily (Rafael Wysocki)
 
  - Rearrange code using acpi_device_class() in the ACPI core device
    drivers and update them to stop setting acpi_device_class()
    unnecessarily (Rafael Wysocki)
 
  - Define ACPI_AC_CLASS in one place (Rafael Wysocki)
 
  - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver to
    bind to platform devices instead of ACPI devices (Rafael Wysocki)
 
  - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI
    hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng
    Feng)
 
  - Consolidate the interface for obtaining a CPU UID from ACPI across
    architectures and use it to address incorrect PCI TPH Steering Tag
    on ARM64 resulting from the invalid assumption that the ACPI
    Processor UID would always be the same as the corresponding logical
    CPU ID in Linux (Chengwen Feng)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmnY/bcSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1chAH/1cGRzh9lSgQ3ZdzIIA5rpRtwKC+CTNz
 iNDvQ97W73B2N+WYzMaloOh+ZVA1Vdqc+8921aH6HI+v7wtg/ZV3h/hU7TagHNY/
 bRFDYaeRXVj4aBXNfoVdn7G5UU9j/kIDcV25I2ubOBqZaO6T5p8p1BK0j0vEj+sG
 yR7XwpEhr2OUQwlIFGKskJwFaH57QJXPEY8wf+o+lMEx/7o/JQRJzKFwsYu01ZZV
 kQy9Ee08P/rsNJwU2ibmZu5P3JMnhategAT8VAMBvkfLScv2sKX+1Vz19NGXzm71
 ARaT7y8MSPNb7SAvWmNZ/rVYrYIL+D3a76Gd7MOGrbVWEn6oXIbCIhY=
 =6vEK
 -----END PGP SIGNATURE-----

Merge tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI support updates from Rafael Wysocki:
 "These include an update of the CMOS RTC driver and the related ACPI
  and x86 code that, among other things, switches it over to using the
  platform device interface for device binding on x86 instead of the PNP
  device driver interface (which allows the code in question to be
  simplified quite a bit), a major update of the ACPI Time and Alarm
  Device (TAD) driver adding an RTC class device interface to it, and
  updates of core ACPI drivers that remove some unnecessary and not
  really useful code from them.

  Apart from that, two drivers are converted to using the platform
  driver interface for device binding instead of the ACPI driver one,
  which is slated for removal, support for the Performance Limited
  register is added to the ACPI CPPC library and there are some
  janitorial updates of it and the related cpufreq CPPC driver, the ACPI
  processor driver is fixed and cleaned up, and NVIDIA vendor CPER
  record handler is added to the APEI GHES code.

  Also, the interface for obtaining a CPU UID from ACPI is consolidated
  across architectures and used for fixing a problem with the PCI TPH
  Steering Tag on ARM64, there are two updates related to ACPICA, a
  minor ACPI OS Services Layer (OSL) update, and a few assorted updates
  related to ACPI tables parsing.

  Specifics:

   - Update maintainers information regarding ACPICA (Rafael Wysocki)

   - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy()
     (Kees Cook)

   - Trigger an ordered system power off after encountering a fatal
     error operator in AML (Armin Wolf)

   - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao)

   - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure
     from the ACPI PPTT parser (Ben Horgan)

   - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate
     DeSimone)

   - Address multiple assorted issues and clean up the code in the ACPI
     processor idle driver (Huisong Li)

   - Replace strlcat() in the ACPI processor idle drive with a better
     alternative (Andy Shevchenko)

   - Rearrange and clean up acpi_processor_errata_piix4() (Rafael
     Wysocki)

   - Move reference performance to capabilities and fix an uninitialized
     variable in the ACPI CPPC library (Pengjie Zhang)

   - Add support for the Performance Limited Register to the ACPI CPPC
     library (Sumit Gupta)

   - Add cppc_get_perf() API to read performance controls, extend
     cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC
     library warn on missing mandatory DESIRED_PERF register (Sumit
     Gupta)

   - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in
     target callbacks to allow it to control performance bounds via
     standard scaling_min_freq and scaling_max_freq sysfs attributes and
     add sysfs documentation for the Performance Limited Register to it
     (Sumit Gupta)

   - Add ACPI support to the platform device interface in the CMOS RTC
     driver, make the ACPI core device enumeration code create a
     platform device for the CMOS RTC, and drop CMOS RTC PNP device
     support (Rafael Wysocki)

   - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD
     driver and clean up the CMOS RTC ACPI address space handler (Rafael
     Wysocki)

   - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT
     and allow that driver to work without a dedicated IRQ if the ACPI
     alarm is used (Rafael Wysocki)

   - Clean up the ACPI TAD driver in various ways and add an RTC class
     device interface, including both the RTC setting/reading and alarm
     timer support, to it (Rafael Wysocki)

   - Clean up the ACPI AC and ACPI PAD (processor aggregator device)
     drivers (Rafael Wysocki)

   - Rework checking for duplicate video bus devices and consolidate
     pnp.bus_id workarounds handling in the ACPI video bus driver
     (Rafael Wysocki)

   - Update the ACPI core device drivers to stop setting
     acpi_device_name() unnecessarily (Rafael Wysocki)

   - Rearrange code using acpi_device_class() in the ACPI core device
     drivers and update them to stop setting acpi_device_class()
     unnecessarily (Rafael Wysocki)

   - Define ACPI_AC_CLASS in one place (Rafael Wysocki)

   - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver
     to bind to platform devices instead of ACPI devices (Rafael
     Wysocki)

   - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI
     hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng
     Feng)

   - Consolidate the interface for obtaining a CPU UID from ACPI across
     architectures and use it to address incorrect PCI TPH Steering Tag
     on ARM64 resulting from the invalid assumption that the ACPI
     Processor UID would always be the same as the corresponding logical
     CPU ID in Linux (Chengwen Feng)"

* tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits)
  ACPICA: Update maintainers information
  watchdog: ni903x_wdt: Convert to a platform driver
  ACPI: PAD: xen: Convert to a platform driver
  ACPI: processor: idle: Reset cpuidle on C-state list changes
  cpuidle: Extract and export no-lock variants of cpuidle_unregister_device()
  PCI/TPH: Pass ACPI Processor UID to Cache Locality _DSM
  ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu()
  perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu()
  ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h
  x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  RISC-V: ACPI: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
  ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler
  PCI: hisi: Use devm_ghes_register_vendor_record_notifier()
  ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()
  ACPI: tables: Enable FPDT on LoongArch
  ACPI: processor: idle: Fix NULL pointer dereference in hotplug path
  ACPI: processor: idle: Reset power_setup_done flag on initialization failure
  ACPI: TAD: Add alarm support to the RTC class device interface
  ...
2026-04-13 19:25:07 -07:00
Linus Torvalds
370c388319 Crypto library updates for 7.1
- Migrate more hash algorithms from the traditional crypto subsystem
   to lib/crypto/.
 
   Like the algorithms migrated earlier (e.g. SHA-*), this simplifies
   the implementations, improves performance, enables further
   simplifications in calling code, and solves various other issues:
 
     - AES CBC-based MACs (AES-CMAC, AES-XCBC-MAC, and AES-CBC-MAC)
 
         - Support these algorithms in lib/crypto/ using the AES
           library and the existing arm64 assembly code
 
         - Reimplement the traditional crypto API's "cmac(aes)",
           "xcbc(aes)", and "cbcmac(aes)" on top of the library
 
         - Convert mac80211 to use the AES-CMAC library. Note: several
           other subsystems can use it too and will be converted later
 
         - Drop the broken, nonstandard, and likely unused support for
           "xcbc(aes)" with key lengths other than 128 bits
 
         - Enable optimizations by default
 
     - GHASH
 
         - Migrate the standalone GHASH code into lib/crypto/
 
         - Integrate the GHASH code more closely with the very similar
           POLYVAL code, and improve the generic GHASH implementation
           to resist cache-timing attacks and use much less memory
 
         - Reimplement the AES-GCM library and the "gcm" crypto_aead
           template on top of the GHASH library. Remove "ghash" from
           the crypto_shash API, as it's no longer needed
 
         - Enable optimizations by default
 
     - SM3
 
         - Migrate the kernel's existing SM3 code into lib/crypto/, and
           reimplement the traditional crypto API's "sm3" on top of it
 
         - I don't recommend using SM3, but this cleanup is worthwhile
           to organize the code the same way as other algorithms
 
 - Testing improvements
 
     - Add a KUnit test suite for each of the new library APIs
 
     - Migrate the existing ChaCha20Poly1305 test to KUnit
 
     - Make the KUnit all_tests.config enable all crypto library tests
 
     - Move the test kconfig options to the Runtime Testing menu
 
 - Other updates to arch-optimized crypto code
 
     - Optimize SHA-256 for Zhaoxin CPUs using the Padlock Hash Engine
 
     - Remove some MD5 implementations that are no longer worth keeping
 
     - Drop big endian and voluntary preemption support from the arm64
       code, as those configurations are no longer supported on arm64
 
 - Make jitterentropy and samples/tsm-mr use the crypto library APIs
 
 Note: the overall diffstat is neutral, but when the test code is
 excluded it is significantly negative:
 
     Tests:     13 files changed, 1982 insertions(+),  888 deletions(-)
     Non-test: 141 files changed, 2897 insertions(+), 3987 deletions(-)
     All:      154 files changed, 4879 insertions(+), 4875 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCadWPyxQcZWJpZ2dlcnNA
 a2VybmVsLm9yZwAKCRDzXCl4vpKOK8QCAQD0i98miI1mu01RKuEwrBzmn7L/2sUH
 ReYV/dFDtnN0GwD+KMCiNAM2XTVLRKq5t3OxPHpKZ4y+gZwRowAJeFA02Q8=
 =5rip
 -----END PGP SIGNATURE-----

Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library updates from Eric Biggers:

 - Migrate more hash algorithms from the traditional crypto subsystem to
   lib/crypto/

   Like the algorithms migrated earlier (e.g. SHA-*), this simplifies
   the implementations, improves performance, enables further
   simplifications in calling code, and solves various other issues:

     - AES CBC-based MACs (AES-CMAC, AES-XCBC-MAC, and AES-CBC-MAC)

         - Support these algorithms in lib/crypto/ using the AES library
           and the existing arm64 assembly code

         - Reimplement the traditional crypto API's "cmac(aes)",
           "xcbc(aes)", and "cbcmac(aes)" on top of the library

         - Convert mac80211 to use the AES-CMAC library. Note: several
           other subsystems can use it too and will be converted later

         - Drop the broken, nonstandard, and likely unused support for
           "xcbc(aes)" with key lengths other than 128 bits

         - Enable optimizations by default

     - GHASH

         - Migrate the standalone GHASH code into lib/crypto/

         - Integrate the GHASH code more closely with the very similar
           POLYVAL code, and improve the generic GHASH implementation to
           resist cache-timing attacks and use much less memory

         - Reimplement the AES-GCM library and the "gcm" crypto_aead
           template on top of the GHASH library. Remove "ghash" from the
           crypto_shash API, as it's no longer needed

         - Enable optimizations by default

     - SM3

         - Migrate the kernel's existing SM3 code into lib/crypto/, and
           reimplement the traditional crypto API's "sm3" on top of it

         - I don't recommend using SM3, but this cleanup is worthwhile
           to organize the code the same way as other algorithms

 - Testing improvements:

     - Add a KUnit test suite for each of the new library APIs

     - Migrate the existing ChaCha20Poly1305 test to KUnit

     - Make the KUnit all_tests.config enable all crypto library tests

     - Move the test kconfig options to the Runtime Testing menu

 - Other updates to arch-optimized crypto code:

     - Optimize SHA-256 for Zhaoxin CPUs using the Padlock Hash Engine

     - Remove some MD5 implementations that are no longer worth keeping

     - Drop big endian and voluntary preemption support from the arm64
       code, as those configurations are no longer supported on arm64

 - Make jitterentropy and samples/tsm-mr use the crypto library APIs

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (66 commits)
  lib/crypto: arm64: Assume a little-endian kernel
  arm64: fpsimd: Remove obsolete cond_yield macro
  lib/crypto: arm64/sha3: Remove obsolete chunking logic
  lib/crypto: arm64/sha512: Remove obsolete chunking logic
  lib/crypto: arm64/sha256: Remove obsolete chunking logic
  lib/crypto: arm64/sha1: Remove obsolete chunking logic
  lib/crypto: arm64/poly1305: Remove obsolete chunking logic
  lib/crypto: arm64/gf128hash: Remove obsolete chunking logic
  lib/crypto: arm64/chacha: Remove obsolete chunking logic
  lib/crypto: arm64/aes: Remove obsolete chunking logic
  lib/crypto: Include <crypto/utils.h> instead of <crypto/algapi.h>
  lib/crypto: aesgcm: Don't disable IRQs during AES block encryption
  lib/crypto: aescfb: Don't disable IRQs during AES block encryption
  lib/crypto: tests: Migrate ChaCha20Poly1305 self-test to KUnit
  lib/crypto: sparc: Drop optimized MD5 code
  lib/crypto: mips: Drop optimized MD5 code
  lib: Move crypto library tests to Runtime Testing menu
  crypto: sm3 - Remove 'struct sm3_state'
  crypto: sm3 - Remove the original "sm3_block_generic()"
  crypto: sm3 - Remove sm3_base.h
  ...
2026-04-13 17:31:39 -07:00
Linus Torvalds
fdcbb1bc06 Merge branch 'nocache-cleanup'
This series cleans up some of the special user copy functions naming and
semantics.  In particular, get rid of the (very traditional) double
underscore names and behavior: the whole "optimize away the range check"
model has been largely excised from the other user accessors because
it's so subtle and can be unsafe, but also because it's just not a
relevant optimization any more.

To do that, a couple of drivers that misused the "user" copies as kernel
copies in order to get non-temporal stores had to be fixed up, but that
kind of code should never have been allowed anyway.

The x86-only "nocache" version was also renamed to more accurately
reflect what it actually does.

This was all done because I looked at this code due to a report by Jann
Horn, and I just couldn't stand the inconsistent naming, the horrible
semantics, and the random misuse of these functions.  This code should
probably be cleaned up further, but it's at least slightly closer to
normal semantics.

I had a more intrusive series that went even further in trying to
normalize the semantics, but that ended up hitting so many other
inconsistencies between different architectures in this area (eg
'size_t' vs 'unsigned long' vs 'int' as size arguments, and various
iovec check differences that Vasily Gorbik pointed out) that I ended up
with this more limited version that fixed the worst of the issues.

Reported-by: Jann Horn <jannh@google.com>
Tested-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/all/CAHk-=wgg1QVWNWG-UCFo1hx0zqrPnB3qhPzUTrWNft+MtXQXig@mail.gmail.com/

* nocache-cleanup:
  x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache
  x86: rename and clean up __copy_from_user_inatomic_nocache()
  x86-64: rename misleadingly named '__copy_user_nocache()' function
2026-04-13 08:39:51 -07:00
Paolo Bonzini
e74c3a8891 KVM/arm64 updates for 7.1
* New features:
 
 - Add support for tracing in the standalone EL2 hypervisor code,
   which should help both debugging and performance analysis.
   This comes with a full infrastructure for 'remote' trace buffers
   that can be exposed by non-kernel entities such as firmware.
 
 - Add support for GICv5 Per Processor Interrupts (PPIs), as the
   starting point for supporting the new GIC architecture in KVM.
 
 - Finally add support for pKVM protected guests, with anonymous
   memory being used as a backing store. About time!
 
 * Improvements and bug fixes:
 
 - Rework the dreaded user_mem_abort() function to make it more
   maintainable, reducing the amount of state being exposed to
   the various helpers and rendering a substantial amount of
   state immutable.
 
 - Expand the Stage-2 page table dumper to support NV shadow
   page tables on a per-VM basis.
 
 - Tidy up the pKVM PSCI proxy code to be slightly less hard
   to follow.
 
 - Fix both SPE and TRBE in non-VHE configurations so that they
   do not generate spurious, out of context table walks that
   ultimately lead to very bad HW lockups.
 
 - A small set of patches fixing the Stage-2 MMU freeing in error
   cases.
 
 - Tighten-up accepted SMC immediate value to be only #0 for host
   SMCCC calls.
 
 - The usual cleanups and other selftest churn.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmnWdswACgkQI9DQutE9
 ekNYvBAAxj5Zmsx8sJ2CYDTJc2w4XkEjSgDugA+J/s0TMgrzExeBlWCstdhVTncy
 68nwOjQl3TotnIrt7q36kko9u7IdD0pHNrk34NtlggLjHfB61n9SNcAA6j4F6zJa
 GFkHpJSrSnZuUPqapkDnlyhuPkgTIAkEUk2Am9siksSfY4HvRyHZJm2FTdxsdIBn
 NN9wvQqw2wefTXOQ8gS+oHbPVp1cPbwrF2a3EhzXXv/6W3mUBstXgsijgo07UzCp
 W6vHCv2wqHbHdf67z3Q3hL+VXlVH6oHlyW99/swqISvqRkH/iSB90+oUojnMRrSm
 yB6Wmhh8jboCaajWMJhG+veZw+7GMXU4nOrGd1rbnY8cwRl/TQ5YibhRm7DIdvjO
 xeUluTLJ0NdweQUwE2k4OlgKOuGang3E2p0clmkUO4SstA48MdqR/kpST6guIlWw
 U5syuNaaaiuwP5QOi9qZmMCNmQ3ZfnZG3nseJFdoyGjhVhf5jyQyv4Du9vGZQFF/
 Zkg7yTqC4OWiC+3GkW9YYAySM1MyetivLtd47PGzHPTdtaZziWhNvQ0y+8QjQ+R+
 CJNvyS/DvsT7epSya4sLgMP1ZAlih9xkz5sQ6k8NJLBYYXi0v33qwqditErgLLyj
 S4Ci4WNhHHWIusvCVM7JUBkH0AElpmi506f7F6iHoFLlkYR4t9U=
 =/SuQ
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 7.1

* New features:

- Add support for tracing in the standalone EL2 hypervisor code,
  which should help both debugging and performance analysis.
  This comes with a full infrastructure for 'remote' trace buffers
  that can be exposed by non-kernel entities such as firmware.

- Add support for GICv5 Per Processor Interrupts (PPIs), as the
  starting point for supporting the new GIC architecture in KVM.

- Finally add support for pKVM protected guests, with anonymous
  memory being used as a backing store. About time!

* Improvements and bug fixes:

- Rework the dreaded user_mem_abort() function to make it more
  maintainable, reducing the amount of state being exposed to
  the various helpers and rendering a substantial amount of
  state immutable.

- Expand the Stage-2 page table dumper to support NV shadow
  page tables on a per-VM basis.

- Tidy up the pKVM PSCI proxy code to be slightly less hard
  to follow.

- Fix both SPE and TRBE in non-VHE configurations so that they
  do not generate spurious, out of context table walks that
  ultimately lead to very bad HW lockups.

- A small set of patches fixing the Stage-2 MMU freeing in error
  cases.

- Tighten-up accepted SMC immediate value to be only #0 for host
  SMCCC calls.

- The usual cleanups and other selftest churn.
2026-04-13 11:49:54 +02:00
Catalin Marinas
0baba94a97 arm64: errata: Work around early CME DVMSync acknowledgement
C1-Pro acknowledges DVMSync messages before completing the SME/CME
memory accesses. Work around this by issuing an IPI to the affected CPUs
if they are running in EL0 with SME enabled.

Note that we avoid the local DSB in the IPI handler as the kernel runs
with SCTLR_EL1.IESB=1. This is sufficient to complete SME memory
accesses at EL0 on taking an exception to EL1. On the return to user
path, no barrier is necessary either. See the comment in
sme_set_active() and the more detailed explanation in the link below.

To avoid a potential IPI flood from malicious applications (e.g.
madvise(MADV_PAGEOUT) in a tight loop), track where a process is active
via mm_cpumask() and only interrupt those CPUs.

Link: https://lore.kernel.org/r/ablEXwhfKyJW1i7l@J2N7QTR9R3
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Reviewed-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-10 19:46:14 +01:00
Catalin Marinas
2c99561016 arm64: cputype: Add C1-Pro definitions
Add cputype definitions for C1-Pro. These will be used for errata
detection in subsequent patches.

These values can be found in "Table A-303: MIDR_EL1 bit descriptions" in
issue 07 of the C1-Pro TRM:

  https://documentation-service.arm.com/static/6930126730f8f55a656570af

Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: James Morse <james.morse@arm.com>
Reviewed-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-10 19:46:14 +01:00
Catalin Marinas
d9fb08ba94 arm64: tlb: Pass the corresponding mm to __tlbi_sync_s1ish()
The mm structure will be used for workarounds that need limiting to
specific tasks.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-10 19:46:14 +01:00
Catalin Marinas
6bfbf574a3 arm64: tlb: Introduce __tlbi_sync_s1ish_{kernel,batch}() for TLB maintenance
Add __tlbi_sync_s1ish_kernel() similar to __tlbi_sync_s1ish() and use it
for kernel TLB maintenance. Also use this function in flush_tlb_all()
which is only used in relation to kernel mappings. Subsequent patches
can differentiate between workarounds that apply to user only or both
user and kernel.

A subsequent patch will add mm_struct to __tlbi_sync_s1ish(). Since
arch_tlbbatch_flush() is not specific to an mm, add a corresponding
__tlbi_sync_s1ish_batch() helper.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-10 19:46:14 +01:00
Catalin Marinas
480a9e57cc Merge branches 'for-next/misc', 'for-next/tlbflush', 'for-next/ttbr-macros-cleanup', 'for-next/kselftest', 'for-next/feat_lsui', 'for-next/mpam', 'for-next/hotplug-batched-tlbi', 'for-next/bbml2-fixes', 'for-next/sysreg', 'for-next/generic-entry' and 'for-next/acpi', remote-tracking branches 'arm64/for-next/perf' and 'arm64/for-next/read-once' into for-next/core
* arm64/for-next/perf:
  : Perf updates
  perf/arm-cmn: Fix resource_size_t printk specifier in arm_cmn_init_dtc()
  perf/arm-cmn: Fix incorrect error check for devm_ioremap()
  perf: add NVIDIA Tegra410 C2C PMU
  perf: add NVIDIA Tegra410 CPU Memory Latency PMU
  perf/arm_cspmu: nvidia: Add Tegra410 PCIE-TGT PMU
  perf/arm_cspmu: nvidia: Add Tegra410 PCIE PMU
  perf/arm_cspmu: Add arm_cspmu_acpi_dev_get
  perf/arm_cspmu: nvidia: Add Tegra410 UCF PMU
  perf/arm_cspmu: nvidia: Rename doc to Tegra241
  perf/arm-cmn: Stop claiming entire iomem region
  arm64: cpufeature: Use pmuv3_implemented() function
  arm64: cpufeature: Make PMUVer and PerfMon unsigned
  KVM: arm64: Read PMUVer as unsigned

* arm64/for-next/read-once:
  : Fixes for __READ_ONCE() with CONFIG_LTO=y
  arm64, compiler-context-analysis: Permit alias analysis through __READ_ONCE() with CONFIG_LTO=y
  arm64: Optimize __READ_ONCE() with CONFIG_LTO=y

* for-next/misc:
  : Miscellaneous cleanups/fixes
  arm64: rsi: use linear-map alias for realm config buffer
  arm64: Kconfig: fix duplicate word in CMDLINE help text
  arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode
  arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps
  arm64: kexec: Remove duplicate allocation for trans_pgd
  arm64: mm: Use generic enum pgtable_level
  arm64: scs: Remove redundant save/restore of SCS SP on entry to/from EL0
  arm64: remove ARCH_INLINE_*

* for-next/tlbflush:
  : Refactor the arm64 TLB invalidation API and implementation
  arm64: mm: __ptep_set_access_flags must hint correct TTL
  arm64: mm: Provide level hint for flush_tlb_page()
  arm64: mm: Wrap flush_tlb_page() around __do_flush_tlb_range()
  arm64: mm: More flags for __flush_tlb_range()
  arm64: mm: Refactor __flush_tlb_range() to take flags
  arm64: mm: Refactor flush_tlb_page() to use __tlbi_level_asid()
  arm64: mm: Simplify __flush_tlb_range_limit_excess()
  arm64: mm: Simplify __TLBI_RANGE_NUM() macro
  arm64: mm: Re-implement the __flush_tlb_range_op macro in C
  arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range()
  arm64: mm: Push __TLBI_VADDR() into __tlbi_level()
  arm64: mm: Implicitly invalidate user ASID based on TLBI operation
  arm64: mm: Introduce a C wrapper for by-range TLB invalidation
  arm64: mm: Re-implement the __tlbi_level macro as a C function

* for-next/ttbr-macros-cleanup:
  : Cleanups of the TTBR1_* macros
  arm64/mm: Directly use TTBRx_EL1_CnP
  arm64/mm: Directly use TTBRx_EL1_ASID_MASK
  arm64/mm: Describe TTBR1_BADDR_4852_OFFSET

* for-next/kselftest:
  : arm64 kselftest updates
  selftests/arm64: Implement cmpbr_sigill() to hwcap test

* for-next/feat_lsui:
  : Futex support using FEAT_LSUI instructions to avoid toggling PAN
  arm64: armv8_deprecated: Disable swp emulation when FEAT_LSUI present
  arm64: Kconfig: Add support for LSUI
  KVM: arm64: Use CAST instruction for swapping guest descriptor
  arm64: futex: Support futex with FEAT_LSUI
  arm64: futex: Refactor futex atomic operation
  KVM: arm64: kselftest: set_id_regs: Add test for FEAT_LSUI
  KVM: arm64: Expose FEAT_LSUI to guests
  arm64: cpufeature: Add FEAT_LSUI

* for-next/mpam: (40 commits)
  : Expose MPAM to user-space via resctrl:
  :  - Add architecture context-switch and hiding of the feature from KVM.
  :  - Add interface to allow MPAM to be exposed to user-space using resctrl.
  :  - Add errata workaoround for some existing platforms.
  :  - Add documentation for using MPAM and what shape of platforms can use resctrl
  arm64: mpam: Add initial MPAM documentation
  arm_mpam: Quirk CMN-650's CSU NRDY behaviour
  arm_mpam: Add workaround for T241-MPAM-6
  arm_mpam: Add workaround for T241-MPAM-4
  arm_mpam: Add workaround for T241-MPAM-1
  arm_mpam: Add quirk framework
  arm_mpam: resctrl: Call resctrl_init() on platforms that can support resctrl
  arm64: mpam: Select ARCH_HAS_CPU_RESCTRL
  arm_mpam: resctrl: Add empty definitions for assorted resctrl functions
  arm_mpam: resctrl: Update the rmid reallocation limit
  arm_mpam: resctrl: Add resctrl_arch_rmid_read()
  arm_mpam: resctrl: Allow resctrl to allocate monitors
  arm_mpam: resctrl: Add support for csu counters
  arm_mpam: resctrl: Add monitor initialisation and domain boilerplate
  arm_mpam: resctrl: Add kunit test for control format conversions
  arm_mpam: resctrl: Add support for 'MB' resource
  arm_mpam: resctrl: Wait for cacheinfo to be ready
  arm_mpam: resctrl: Add rmid index helpers
  arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats
  arm_mpam: resctrl: Hide CDP emulation behind CONFIG_EXPERT
  ...

* for-next/hotplug-batched-tlbi:
  : arm64/mm: Enable batched TLB flush in unmap_hotplug_range()
  arm64/mm: Reject memory removal that splits a kernel leaf mapping
  arm64/mm: Enable batched TLB flush in unmap_hotplug_range()

* for-next/bbml2-fixes:
  : Fixes for realm guest and BBML2_NOABORT
  arm64: mm: Remove pmd_sect() and pud_sect()
  arm64: mm: Handle invalid large leaf mappings correctly
  arm64: mm: Fix rodata=full block mapping support for realm guests

* for-next/sysreg:
  : arm64 sysreg updates
  arm64/sysreg: Update ID_AA64SMFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ZFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64FPFR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ISAR2_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update ID_AA64ISAR0_EL1 description to DDI0601 2025-12
  arm64/sysreg: Update SMIDR_EL1 to DDI0601 2025-06

* for-next/generic-entry:
  : More arm64 refactoring towards using the generic entry code
  arm64: Check DAIF (and PMR) at task-switch time
  arm64: entry: Use split preemption logic
  arm64: entry: Use irqentry_{enter_from,exit_to}_kernel_mode()
  arm64: entry: Consistently prefix arm64-specific wrappers
  arm64: entry: Don't preempt with SError or Debug masked
  entry: Split preemption from irqentry_exit_to_kernel_mode()
  entry: Split kernel mode logic from irqentry_{enter,exit}()
  entry: Move irqentry_enter() prototype later
  entry: Remove local_irq_{enable,disable}_exit_to_user()
  entry: Fix stale comment for irqentry_enter()

* for-next/acpi:
  : arm64 ACPI updates
  ACPI: AGDI: fix missing newline in error message
2026-04-10 14:22:24 +01:00
Muhammad Usama Anjum
249bf97331 arm64: mte: Skip TFSR_EL1 checks and barriers in synchronous tag check mode
With KASAN_HW_TAGS (MTE) in synchronous mode, tag check faults are
reported as immediate Data Abort exceptions. The TFSR_EL1.TF1 bit is
never set since faults never go through the asynchronous path.
Therefore, reading TFSR_EL1 and executing data and instruction barriers
on kernel entry, exit, context switch and suspend is unnecessary
overhead.

As with the check_mte_async_tcf and clear_mte_async_tcf paths for
TFSRE0_EL1, extend the same optimisation to kernel entry/exit, context
switch and suspend.

All mte kselftests pass. The kunit before and after the patch show same
results.

A selection of test_vmalloc benchmarks running on a arm64 machine.
v6.19 is the baseline. (>0 is faster, <0 is slower, (R)/(I) =
statistically significant Regression/Improvement). Based on significance
and ignoring the noise, the benchmarks improved.

* 77 result classes were considered, with 9 wins, 0 losses and 68 ties

Results of fastpath [1] on v6.19 vs this patch:

+----------------------------+----------------------------------------------------------+------------+
| Benchmark                  | Result Class                                             |   barriers |
+============================+==========================================================+============+
| micromm/fork               | fork: p:1, d:10 (seconds)                                |  (I) 2.75% |
|                            | fork: p:512, d:10 (seconds)                              |      0.96% |
+----------------------------+----------------------------------------------------------+------------+
| micromm/munmap             | munmap: p:1, d:10 (seconds)                              |     -1.78% |
|                            | munmap: p:512, d:10 (seconds)                            |      5.02% |
+----------------------------+----------------------------------------------------------+------------+
| micromm/vmalloc            | fix_align_alloc_test: p:1, h:0, l:500000 (usec)          |     -0.56% |
|                            | fix_size_alloc_test: p:1, h:0, l:500000 (usec)           |      0.70% |
|                            | fix_size_alloc_test: p:4, h:0, l:500000 (usec)           |      1.18% |
|                            | fix_size_alloc_test: p:16, h:0, l:500000 (usec)          |     -5.01% |
|                            | fix_size_alloc_test: p:16, h:1, l:500000 (usec)          |     13.81% |
|                            | fix_size_alloc_test: p:64, h:0, l:100000 (usec)          |      6.51% |
|                            | fix_size_alloc_test: p:64, h:1, l:100000 (usec)          |     32.87% |
|                            | fix_size_alloc_test: p:256, h:0, l:100000 (usec)         |      4.17% |
|                            | fix_size_alloc_test: p:256, h:1, l:100000 (usec)         |      8.40% |
|                            | fix_size_alloc_test: p:512, h:0, l:100000 (usec)         |     -0.48% |
|                            | fix_size_alloc_test: p:512, h:1, l:100000 (usec)         |     -0.74% |
|                            | full_fit_alloc_test: p:1, h:0, l:500000 (usec)           |      0.53% |
|                            | kvfree_rcu_1_arg_vmalloc_test: p:1, h:0, l:500000 (usec) |     -2.81% |
|                            | kvfree_rcu_2_arg_vmalloc_test: p:1, h:0, l:500000 (usec) |     -2.06% |
|                            | long_busy_list_alloc_test: p:1, h:0, l:500000 (usec)     |     -0.56% |
|                            | pcpu_alloc_test: p:1, h:0, l:500000 (usec)               |     -0.41% |
|                            | random_size_align_alloc_test: p:1, h:0, l:500000 (usec)  |      0.89% |
|                            | random_size_alloc_test: p:1, h:0, l:500000 (usec)        |      1.71% |
|                            | vm_map_ram_test: p:1, h:0, l:500000 (usec)               |      0.83% |
+----------------------------+----------------------------------------------------------+------------+
| schbench/thread-contention | -m 16 -t 1 -r 10 -s 1000, avg_rps (req/sec)              |      0.05% |
|                            | -m 16 -t 1 -r 10 -s 1000, req_latency_p99 (usec)         |      0.60% |
|                            | -m 16 -t 1 -r 10 -s 1000, wakeup_latency_p99 (usec)      |      0.00% |
|                            | -m 16 -t 4 -r 10 -s 1000, avg_rps (req/sec)              |     -0.34% |
|                            | -m 16 -t 4 -r 10 -s 1000, req_latency_p99 (usec)         |     -0.58% |
|                            | -m 16 -t 4 -r 10 -s 1000, wakeup_latency_p99 (usec)      |      9.09% |
|                            | -m 16 -t 16 -r 10 -s 1000, avg_rps (req/sec)             |     -0.74% |
|                            | -m 16 -t 16 -r 10 -s 1000, req_latency_p99 (usec)        |     -1.40% |
|                            | -m 16 -t 16 -r 10 -s 1000, wakeup_latency_p99 (usec)     |      0.00% |
|                            | -m 16 -t 64 -r 10 -s 1000, avg_rps (req/sec)             |     -0.78% |
|                            | -m 16 -t 64 -r 10 -s 1000, req_latency_p99 (usec)        |     -0.11% |
|                            | -m 16 -t 64 -r 10 -s 1000, wakeup_latency_p99 (usec)     |      0.11% |
|                            | -m 16 -t 256 -r 10 -s 1000, avg_rps (req/sec)            |      2.64% |
|                            | -m 16 -t 256 -r 10 -s 1000, req_latency_p99 (usec)       |      3.15% |
|                            | -m 16 -t 256 -r 10 -s 1000, wakeup_latency_p99 (usec)    |     17.54% |
|                            | -m 32 -t 1 -r 10 -s 1000, avg_rps (req/sec)              |     -1.22% |
|                            | -m 32 -t 1 -r 10 -s 1000, req_latency_p99 (usec)         |      0.85% |
|                            | -m 32 -t 1 -r 10 -s 1000, wakeup_latency_p99 (usec)      |      0.00% |
|                            | -m 32 -t 4 -r 10 -s 1000, avg_rps (req/sec)              |     -0.34% |
|                            | -m 32 -t 4 -r 10 -s 1000, req_latency_p99 (usec)         |      1.05% |
|                            | -m 32 -t 4 -r 10 -s 1000, wakeup_latency_p99 (usec)      |      0.00% |
|                            | -m 32 -t 16 -r 10 -s 1000, avg_rps (req/sec)             |     -0.41% |
|                            | -m 32 -t 16 -r 10 -s 1000, req_latency_p99 (usec)        |      0.58% |
|                            | -m 32 -t 16 -r 10 -s 1000, wakeup_latency_p99 (usec)     |      2.13% |
|                            | -m 32 -t 64 -r 10 -s 1000, avg_rps (req/sec)             |      0.67% |
|                            | -m 32 -t 64 -r 10 -s 1000, req_latency_p99 (usec)        |      2.07% |
|                            | -m 32 -t 64 -r 10 -s 1000, wakeup_latency_p99 (usec)     |     -1.28% |
|                            | -m 32 -t 256 -r 10 -s 1000, avg_rps (req/sec)            |      1.01% |
|                            | -m 32 -t 256 -r 10 -s 1000, req_latency_p99 (usec)       |      0.69% |
|                            | -m 32 -t 256 -r 10 -s 1000, wakeup_latency_p99 (usec)    |     13.12% |
|                            | -m 64 -t 1 -r 10 -s 1000, avg_rps (req/sec)              |     -0.25% |
|                            | -m 64 -t 1 -r 10 -s 1000, req_latency_p99 (usec)         |     -0.48% |
|                            | -m 64 -t 1 -r 10 -s 1000, wakeup_latency_p99 (usec)      |     10.53% |
|                            | -m 64 -t 4 -r 10 -s 1000, avg_rps (req/sec)              |     -0.06% |
|                            | -m 64 -t 4 -r 10 -s 1000, req_latency_p99 (usec)         |      0.00% |
|                            | -m 64 -t 4 -r 10 -s 1000, wakeup_latency_p99 (usec)      |      0.00% |
|                            | -m 64 -t 16 -r 10 -s 1000, avg_rps (req/sec)             |     -0.36% |
|                            | -m 64 -t 16 -r 10 -s 1000, req_latency_p99 (usec)        |      0.52% |
|                            | -m 64 -t 16 -r 10 -s 1000, wakeup_latency_p99 (usec)     |      0.11% |
|                            | -m 64 -t 64 -r 10 -s 1000, avg_rps (req/sec)             |      0.52% |
|                            | -m 64 -t 64 -r 10 -s 1000, req_latency_p99 (usec)        |      3.53% |
|                            | -m 64 -t 64 -r 10 -s 1000, wakeup_latency_p99 (usec)     |     -0.10% |
|                            | -m 64 -t 256 -r 10 -s 1000, avg_rps (req/sec)            |      2.53% |
|                            | -m 64 -t 256 -r 10 -s 1000, req_latency_p99 (usec)       |      1.82% |
|                            | -m 64 -t 256 -r 10 -s 1000, wakeup_latency_p99 (usec)    |     -5.80% |
+----------------------------+----------------------------------------------------------+------------+
| syscall/getpid             | mean (ns)                                                | (I) 15.98% |
|                            | p99 (ns)                                                 | (I) 11.11% |
|                            | p99.9 (ns)                                               | (I) 16.13% |
+----------------------------+----------------------------------------------------------+------------+
| syscall/getppid            | mean (ns)                                                | (I) 14.82% |
|                            | p99 (ns)                                                 | (I) 17.86% |
|                            | p99.9 (ns)                                               |  (I) 9.09% |
+----------------------------+----------------------------------------------------------+------------+
| syscall/invalid            | mean (ns)                                                | (I) 17.78% |
|                            | p99 (ns)                                                 | (I) 11.11% |
|                            | p99.9 (ns)                                               |     13.33% |
+----------------------------+----------------------------------------------------------+------------+

[1] https://gitlab.arm.com/tooling/fastpath

Signed-off-by: Muhammad Usama Anjum <usama.anjum@arm.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-09 18:56:22 +01:00
Mark Brown
abed23c3c4 arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps
Currently for each hwcap we define both the HWCAPn_NAME definition which is
exposed to userspace and a kernel internal KERNEL_HWCAP_NAME definition
which we use internally. This is tedious and repetitive, instead use a
script to generate the KERNEL_HWCAP_ definitions from the UAPI definitions.

No functional changes intended.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-09 16:50:23 +01:00
Mark Rutland
ae654112ea arm64: entry: Use split preemption logic
The generic irqentry code now provides
irqentry_exit_to_kernel_mode_preempt() and
irqentry_exit_to_kernel_mode_after_preempt(), which can be used
where architectures have different state requirements for involuntary
preemption and exception return, as is the case on arm64.

Use the new functions on arm64, aligning our exit to kernel mode logic
with the style of our exit to user mode logic. This removes the need for
the recently-added bodge in arch_irqentry_exit_need_resched(), and
allows preemption to occur when returning from any exception taken from
kernel mode, which is nicer for RT.

In an ideal world, we'd remove arch_irqentry_exit_need_resched(), and
fold the conditionality directly into the architecture-specific entry
code. That way all the logic necessary to avoid preempting from a
pseudo-NMI could be constrained specifically to the EL1 IRQ/FIQ paths,
avoiding redundant work for other exceptions, and making the flow a bit
clearer. At present it looks like that would require a larger
refactoring (e.g. for the PREEMPT_DYNAMIC logic), and so I've left that
as-is for now.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-08 17:40:04 +01:00
Mark Rutland
2371bd83b3 arm64: entry: Don't preempt with SError or Debug masked
On arm64, involuntary kernel preemption has been subtly broken since the
move to the generic irqentry code. When preemption occurs, the new task
may run with SError and Debug exceptions masked unexpectedly, leading to
a loss of RAS events, breakpoints, watchpoints, and single-step
exceptions.

Prior to moving to the generic irqentry code, involuntary preemption of
kernel mode would only occur when returning from regular interrupts, in
a state where interrupts were masked and all other arm64-specific
exceptions (SError, Debug, and pseudo-NMI) were unmasked. This is the
only state in which it is valid to switch tasks.

As part of moving to the generic irqentry code, the involuntary
preemption logic was moved such that involuntary preemption could occur
when returning from any (non-NMI) exception. As most exception handlers
mask all arm64-specific exceptions before this point, preemption could
occur in a state where arm64-specific exceptions were masked. This is
not a valid state to switch tasks, and resulted in the loss of
exceptions described above.

As a temporary bodge, avoid the loss of exceptions by avoiding
involuntary preemption when SError and/or Debug exceptions are masked.
Practically speaking this means that involuntary preemption will only
occur when returning from regular interrupts, as was the case before
moving to the generic irqentry code.

Fixes: 99eb057ccd ("arm64: entry: Move arm64_preempt_schedule_irq() into __exit_to_kernel_mode()")
Reported-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-08 17:40:04 +01:00
Marc Zyngier
d77f4792db Merge branch kvm-arm64/vgic-fixes-7.1 into kvmarm-master/next
* kvm-arm64/vgic-fixes-7.1:
  : .
  : FIrst pass at fixing a number of vgic-v5 bugs that were found
  : after the merge of the initial series.
  : .
  KVM: arm64: Advertise ID_AA64PFR2_EL1.GCIE
  KVM: arm64: vgic-v5: Fold PPI state for all exposed PPIs
  KVM: arm64: set_id_regs: Allow GICv3 support to be set at runtime
  KVM: arm64: Don't advertises GICv3 in ID_PFR1_EL1 if AArch32 isn't supported
  KVM: arm64: Correctly plumb ID_AA64PFR2_EL1 into pkvm idreg handling
  KVM: arm64: Move GICv5 timer PPI validation into timer_irqs_are_valid()
  KVM: arm64: Remove evaluation of timer state in kvm_cpu_has_pending_timer()
  KVM: arm64: Kill arch_timer_context::direct field
  KVM: arm64: vgic-v5: Correctly set dist->ready once initialised
  KVM: arm64: vgic-v5: Make the effective priority mask a strict limit
  KVM: arm64: vgic-v5: Cast vgic_apr to u32 to avoid undefined behaviours
  KVM: arm64: vgic-v5: Transfer edge pending state to ICH_PPI_PENDRx_EL2
  KVM: arm64: vgic-v5: Hold config_lock while finalizing GICv5 PPIs
  KVM: arm64: Account for RESx bits in __compute_fgt()
  KVM: arm64: Fix writeable mask for ID_AA64PFR2_EL1
  arm64: Fix field references for ICH_PPI_DVIR[01]_EL2
  KVM: arm64: Don't skip per-vcpu NV initialisation
  KVM: arm64: vgic: Don't reset cpuif/redist addresses at finalize time

Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08 12:26:00 +01:00
Marc Zyngier
83a3980750 Merge branch kvm-arm64/pkvm-protected-guest into kvmarm-master/next
* kvm-arm64/pkvm-protected-guest: (41 commits)
  : .
  : pKVM support for protected guests, implementing the very long
  : awaited support for anonymous memory, as the elusive guestmem
  : has failed to deliver on its promises despite a multi-year
  : effort. Patches courtesy of Will Deacon. From the initial cover
  : letter:
  :
  : "[...] this patch series implements support for protected guest
  : memory with pKVM, where pages are unmapped from the host as they are
  : faulted into the guest and can be shared back from the guest using pKVM
  : hypercalls. Protected guests are created using a new machine type
  : identifier and can be booted to a shell using the kvmtool patches
  : available at [2], which finally means that we are able to test the pVM
  : logic in pKVM. Since this is an incremental step towards full isolation
  : from the host (for example, the CPU register state and DMA accesses are
  : not yet isolated), creating a pVM requires a developer Kconfig option to
  : be enabled in addition to booting with 'kvm-arm.mode=protected' and
  : results in a kernel taint."
  : .
  KVM: arm64: Don't hold 'vm_table_lock' across guest page reclaim
  KVM: arm64: Allow get_pkvm_hyp_vm() to take a reference to a dying VM
  KVM: arm64: Prevent teardown finalisation of referenced 'hyp_vm'
  drivers/virt: pkvm: Add Kconfig dependency on DMA_RESTRICTED_POOL
  KVM: arm64: Rename PKVM_PAGE_STATE_MASK
  KVM: arm64: Extend pKVM page ownership selftests to cover guest hvcs
  KVM: arm64: Extend pKVM page ownership selftests to cover forced reclaim
  KVM: arm64: Register 'selftest_vm' in the VM table
  KVM: arm64: Extend pKVM page ownership selftests to cover guest donation
  KVM: arm64: Add some initial documentation for pKVM
  KVM: arm64: Allow userspace to create protected VMs when pKVM is enabled
  KVM: arm64: Implement the MEM_UNSHARE hypercall for protected VMs
  KVM: arm64: Implement the MEM_SHARE hypercall for protected VMs
  KVM: arm64: Add hvc handler at EL2 for hypercalls from protected VMs
  KVM: arm64: Return -EFAULT from VCPU_RUN on access to a poisoned pte
  KVM: arm64: Reclaim faulting page from pKVM in spurious fault handler
  KVM: arm64: Introduce hypercall to force reclaim of a protected page
  KVM: arm64: Annotate guest donations with handle and gfn in host stage-2
  KVM: arm64: Change 'pkvm_handle_t' to u16
  KVM: arm64: Introduce host_stage2_set_owner_metadata_locked()
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08 12:25:39 +01:00
Marc Zyngier
73bb0bc2f4 Merge branch kvm-arm64/spe-trbe-nvhe into kvmarm-master/next
* kvm-arm64/spe-trbe-nvhe:
  : .
  : Fix SPE and TRBE nVHE world switch which can otherwise result in
  : pretty bad behaviours, as they have the nasty habit of performing
  : out of context speculative page table walks.
  :
  : Patches courtesy of Will Deacon.
  : .
  KVM: arm64: Don't pass host_debug_state to BRBE world-switch routines
  KVM: arm64: Disable SPE Profiling Buffer when running in guest context
  KVM: arm64: Disable TRBE Trace Buffer Unit when running in guest context

Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08 12:23:58 +01:00
Marc Zyngier
b693940e81 Merge branch kvm-arm64/pkvm-psci into kvmarm-master/next
* kvm-arm64/pkvm-psci:
  : .
  : Cleanup of the pKVM PSCI relay CPU entry code, making it slightly
  : easier to follow, should someone have to wade into these waters
  : ever again.
  : .
  KVM: arm64: Remove extra ISBs when using msr_hcr_el2
  KVM: arm64: pkvm: Use direct function pointers for cpu_{on,resume}
  KVM: arm64: pkvm: Turn __kvm_hyp_init_cpu into an inner label
  KVM: arm64: pkvm: Simplify BTI handling on CPU boot
  KVM: arm64: pkvm: Move error handling to the end of kvm_hyp_cpu_entry

Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08 12:23:24 +01:00
Marc Zyngier
e85d1c0cc7 Merge branch kvm-arm64/nv-s2-debugfs into kvmarm-master/next
* kvm-arm64/nv-s2-debugfs:
  : .
  : Expand the stage-2 ptdump infrastructure to be able to display
  : the content of the shadow s2 tables generated by nested virt.
  :
  : Patches courtesy of Wei-Lin Chang.
  : .
  KVM: arm64: ptdump: Initialize parser_state before pgtable walk
  KVM: arm64: nv: Expose shadow page tables in debugfs
  KVM: arm64: ptdump: Make KVM ptdump code s2 mmu aware

Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08 12:22:55 +01:00
Marc Zyngier
f8078d51ee Merge branch kvm-arm64/vgic-v5-ppi into kvmarm-master/next
* kvm-arm64/vgic-v5-ppi: (40 commits)
  : .
  : Add initial GICv5 support for KVM guests, only adding PPI support
  : for the time being. Patches courtesy of Sascha Bischoff.
  :
  : From the cover letter:
  :
  : "This is v7 of the patch series to add the virtual GICv5 [1] device
  : (vgic_v5). Only PPIs are supported by this initial series, and the
  : vgic_v5 implementation is restricted to the CPU interface,
  : only. Further patch series are to follow in due course, and will add
  : support for SPIs, LPIs, the GICv5 IRS, and the GICv5 ITS."
  : .
  KVM: arm64: selftests: Add no-vgic-v5 selftest
  KVM: arm64: selftests: Introduce a minimal GICv5 PPI selftest
  KVM: arm64: gic-v5: Communicate userspace-driveable PPIs via a UAPI
  Documentation: KVM: Introduce documentation for VGICv5
  KVM: arm64: gic-v5: Probe for GICv5 device
  KVM: arm64: gic-v5: Set ICH_VCTLR_EL2.En on boot
  KVM: arm64: gic-v5: Introduce kvm_arm_vgic_v5_ops and register them
  KVM: arm64: gic-v5: Hide FEAT_GCIE from NV GICv5 guests
  KVM: arm64: gic: Hide GICv5 for protected guests
  KVM: arm64: gic-v5: Mandate architected PPI for PMU emulation on GICv5
  KVM: arm64: gic-v5: Enlighten arch timer for GICv5
  irqchip/gic-v5: Introduce minimal irq_set_type() for PPIs
  KVM: arm64: gic-v5: Initialise ID and priority bits when resetting vcpu
  KVM: arm64: gic-v5: Create and initialise vgic_v5
  KVM: arm64: gic-v5: Support GICv5 interrupts with KVM_IRQ_LINE
  KVM: arm64: gic-v5: Implement direct injection of PPIs
  KVM: arm64: Introduce set_direct_injection irq_op
  KVM: arm64: gic-v5: Trap and mask guest ICC_PPI_ENABLERx_EL1 writes
  KVM: arm64: gic-v5: Check for pending PPIs
  KVM: arm64: gic-v5: Clear TWI if single task running
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08 12:22:35 +01:00
Marc Zyngier
2de32a25a3 Merge branch kvm-arm64/hyp-tracing into kvmarm-master/next
* kvm-arm64/hyp-tracing: (40 commits)
  : .
  : EL2 tracing support, adding both 'remote' ring-buffer
  : infrastructure and the tracing itself, courtesy of
  : Vincent Donnefort. From the cover letter:
  :
  : "The growing set of features supported by the hypervisor in protected
  : mode necessitates debugging and profiling tools. Tracefs is the
  : ideal candidate for this task:
  :
  :   * It is simple to use and to script.
  :
  :   * It is supported by various tools, from the trace-cmd CLI to the
  :     Android web-based perfetto.
  :
  :   * The ring-buffer, where are stored trace events consists of linked
  :     pages, making it an ideal structure for sharing between kernel and
  :     hypervisor.
  :
  : This series first introduces a new generic way of creating remote events and
  : remote buffers. Then it adds support to the pKVM hypervisor."
  : .
  tracing: selftests: Extend hotplug testing for trace remotes
  tracing: Non-consuming read for trace remotes with an offline CPU
  tracing: Adjust cmd_check_undefined to show unexpected undefined symbols
  tracing: Restore accidentally removed SPDX tag
  KVM: arm64: avoid unused-variable warning
  tracing: Generate undef symbols allowlist for simple_ring_buffer
  KVM: arm64: tracing: add ftrace dependency
  tracing: add more symbols to whitelist
  tracing: Update undefined symbols allow list for simple_ring_buffer
  KVM: arm64: Fix out-of-tree build for nVHE/pKVM tracing
  tracing: selftests: Add hypervisor trace remote tests
  KVM: arm64: Add selftest event support to nVHE/pKVM hyp
  KVM: arm64: Add hyp_enter/hyp_exit events to nVHE/pKVM hyp
  KVM: arm64: Add event support to the nVHE/pKVM hyp and trace remote
  KVM: arm64: Add trace reset to the nVHE/pKVM hyp
  KVM: arm64: Sync boot clock with the nVHE/pKVM hyp
  KVM: arm64: Add trace remote for the nVHE/pKVM hyp
  KVM: arm64: Add tracing capability for the nVHE/pKVM hyp
  KVM: arm64: Support unaligned fixmap in the pKVM hyp
  KVM: arm64: Initialise hyp_nr_cpus for nVHE hyp
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-04-08 12:21:51 +01:00
Chengwen Feng
a7034e9e44 ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu()
Update acpi/pptt.c to use acpi_get_cpu_uid() and remove unused
get_acpi_id_for_cpu() from arm64/loongarch/riscv, completing PPTT's
migration to the unified ACPI CPU UID interface

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-8-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-06 16:55:16 +02:00
Chengwen Feng
f652d0a4e1 ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h
Centralize acpi_get_cpu_uid() in include/linux/acpi.h (global scope) and
remove arch-specific declarations from arm64/loongarch/riscv/x86
asm/acpi.h. This unifies the interface across architectures and
simplifies maintenance by eliminating duplicate prototypes.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-6-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-06 16:55:16 +02:00
Chengwen Feng
7cd5f5659a arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
As a step towards unifying the interface for retrieving ACPI CPU UID
across architectures, introduce a new function acpi_get_cpu_uid() for
arm64. While at it, add input validation to make the code more robust.

Reimplement get_cpu_for_acpi_id() based on acpi_get_cpu_uid() for
consistency, and move its implementation next to the new function for
code coherence.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://patch.msgid.link/20260401081640.26875-2-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-06 16:55:15 +02:00
Lorenzo Stoakes (Oracle)
3a6455d56b mm: convert do_brk_flags() to use vma_flags_t
In order to be able to do this, we need to change VM_DATA_DEFAULT_FLAGS
and friends and update the architecture-specific definitions also.

We then have to update some KSM logic to handle VMA flags, and introduce
VMA_STACK_FLAGS to define the vma_flags_t equivalent of VM_STACK_FLAGS.

We also introduce two helper functions for use during the time we are
converting legacy flags to vma_flags_t values - vma_flags_to_legacy() and
legacy_to_vma_flags().

This enables us to iteratively make changes to break these changes up into
separate parts.

We use these explicitly here to keep VM_STACK_FLAGS around for certain
users which need to maintain the legacy vm_flags_t values for the time
being.

We are no longer able to rely on the simple VM_xxx being set to zero if
the feature is not enabled, so in the case of VM_DROPPABLE we introduce
VMA_DROPPABLE as the vma_flags_t equivalent, which is set to
EMPTY_VMA_FLAGS if the droppable flag is not available.

While we're here, we make the description of do_brk_flags() into a kdoc
comment, as it almost was already.

We use vma_flags_to_legacy() to not need to update the vm_get_page_prot()
logic as this time.

Note that in create_init_stack_vma() we have to replace the BUILD_BUG_ON()
with a VM_WARN_ON_ONCE() as the tested values are no longer build time
available.

We also update mprotect_fixup() to use VMA flags where possible, though we
have to live with a little duplication between vm_flags_t and vma_flags_t
values for the time being until further conversions are made.

While we're here, update VM_SPECIAL to be defined in terms of
VMA_SPECIAL_FLAGS now we have vma_flags_to_legacy().

Finally, we update the VMA tests to reflect these changes.

Link: https://lkml.kernel.org/r/d02e3e45d9a33d7904b149f5604904089fd640ae.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>	[SELinux]
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:40 -07:00
Baolin Wang
42e26354c4 mm: change to return bool for pmdp_test_and_clear_young()
Callers use pmdp_test_and_clear_young() to clear the young flag and check
whether it was set for this PMD entry.  Change the return type to bool to
make the intention clearer.

Link: https://lkml.kernel.org/r/f1d31307a13365d3d0fed5809727dcc2dd59631b.1774075004.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:35 -07:00
Baolin Wang
06c4dfa3ce mm: change to return bool for ptep_clear_flush_young()/clear_flush_young_ptes()
The ptep_clear_flush_young() and clear_flush_young_ptes() are used to
clear the young flag and flush the TLB, returning whether the young flag
was set.  Change the return type to bool to make the intention clearer.

Link: https://lkml.kernel.org/r/24af5144b96103631594501f77d4525f2475c1be.1774075004.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:35 -07:00
Baolin Wang
a62ca3f40f mm: change to return bool for ptep_test_and_clear_young()
Patch series "change young flag check functions to return bool", v2.

This is a cleanup patchset to change all young flag check functions to
return bool, as discussed with David in the previous thread[1].  Since
callers only care about whether the young flag was set, returning bool
makes the intention clearer.  No functional changes intended.


This patch (of 6):

Callers use ptep_test_and_clear_young() to clear the young flag and check
whether it was set.  Change the return type to bool to make the intention
clearer.

Link: https://lkml.kernel.org/r/cover.1774075004.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/57e70efa9703d43959aa645246ea3cbdba14fa17.1774075004.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:35 -07:00
Baolin Wang
9970a9a27f arm64: mm: implement the architecture-specific test_and_clear_young_ptes()
Implement the Arm64 architecture-specific test_and_clear_young_ptes() to
enable batched checking of young flags, improving performance during large
folio reclamation when MGLRU is enabled.

While we're at it, simplify ptep_test_and_clear_young() by calling
test_and_clear_young_ptes().  Since callers guarantee that PTEs are
present before calling these functions, we can use pte_cont() to check the
CONT_PTE flag instead of pte_valid_cont().

Performance testing:

Enable MGLRU, then allocate 10G clean file-backed folios by mmap() in a
memory cgroup, and try to reclaim 8G file-backed folios via the
memory.reclaim interface.  I can observe 60%+ performance improvement on
my Arm64 32-core server (and about 15% improvement on my X86 machine).

W/o patchset:
real	0m0.470s
user	0m0.000s
sys	0m0.470s

W/ patchset:
real	0m0.180s
user	0m0.001s
sys	0m0.179s

Link: https://lkml.kernel.org/r/7f891d42a720cc2e57862f3b79e4f774404f313c.1772778858.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Rik van Riel <riel@surriel.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Xu <weixugc@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yuanchu Xie <yuanchu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:16 -07:00
Mike Rapoport (Microsoft)
26513781d1 mm: cache struct page for empty_zero_page and return it from ZERO_PAGE()
For most architectures every invocation of ZERO_PAGE() does
virt_to_page(empty_zero_page).  But empty_zero_page is in BSS and it is
enough to get its struct page once at initialization time and then use it
whenever a zero page should be accessed.

Add yet another __zero_page variable that will be initialized as
virt_to_page(empty_zero_page) for most architectures in a weak
arch_setup_zero_pages() function.

For architectures that use colored zero pages (MIPS and s390) rename their
setup_zero_pages() to arch_setup_zero_pages() and make it global rather
than static.

For architectures that cannot use virt_to_page() for BSS (arm64 and
sparc64) add override of arch_setup_zero_pages().

Link: https://lkml.kernel.org/r/20260211103141.3215197-5-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Magnus Lindholm <linmag7@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:01 -07:00
Mike Rapoport (Microsoft)
6215d9f447 arch, mm: consolidate empty_zero_page
Reduce 22 declarations of empty_zero_page to 3 and 23 declarations of
ZERO_PAGE() to 4.

Every architecture defines empty_zero_page that way or another, but for the
most of them it is always a page aligned page in BSS and most definitions
of ZERO_PAGE do virt_to_page(empty_zero_page).

Move Linus vetted x86 definition of empty_zero_page and ZERO_PAGE() to the
core MM and drop these definitions in architectures that do not implement
colored zero page (MIPS and s390).

ZERO_PAGE() remains a macro because turning it to a wrapper for a static
inline causes severe pain in header dependencies.

For the most part the change is mechanical, with these being noteworthy:

* alpha: aliased empty_zero_page with ZERO_PGE that was also used for boot
  parameters. Switching to a generic empty_zero_page removes the aliasing
  and keeps ZERO_PGE for boot parameters only
* arm64: uses __pa_symbol() in ZERO_PAGE() so that definition of
  ZERO_PAGE() is kept intact.
* m68k/parisc/um: allocated empty_zero_page from memblock,
  although they do not support zero page coloring and having it in BSS
  will work fine.
* sparc64 can have empty_zero_page in BSS rather allocate it, but it
  can't use virt_to_page() for BSS. Keep it's definition of ZERO_PAGE()
  but instead of allocating it, make mem_map_zero point to
  empty_zero_page.
* sh: used empty_zero_page for boot parameters at the very early boot.
  Rename the parameters page to boot_params_page and let sh use the generic
  empty_zero_page.
* hexagon: had an amusing comment about empty_zero_page

	/* A handy thing to have if one has the RAM. Declared in head.S */

  that unfortunately had to go :)

Link: https://lkml.kernel.org/r/20260211103141.3215197-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Helge Deller <deller@gmx.de>		[parisc]
Tested-by: Helge Deller <deller@gmx.de>		[parisc]
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com>	[alpha]
Acked-by: Dinh Nguyen <dinguyen@kernel.org>	[nios2]
Acked-by: Andreas Larsson <andreas@gaisler.com>	[sparc]
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:01 -07:00
Seongsu Park
3d443691ed mm/pkeys: remove unused tsk parameter from arch_set_user_pkey_access()
The tsk parameter in arch_set_user_pkey_access() is never used in the
function implementations across all architectures (arm64, powerpc, x86).

Link: https://lkml.kernel.org/r/20260219063506.545148-1-sgsu.park@samsung.com
Signed-off-by: Seongsu Park <sgsu.park@samsung.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:52:57 -07:00