mirror of
https://github.com/torvalds/linux.git
synced 2026-05-22 14:12:07 +02:00
master
237 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
0baba94a97 |
arm64: errata: Work around early CME DVMSync acknowledgement
C1-Pro acknowledges DVMSync messages before completing the SME/CME memory accesses. Work around this by issuing an IPI to the affected CPUs if they are running in EL0 with SME enabled. Note that we avoid the local DSB in the IPI handler as the kernel runs with SCTLR_EL1.IESB=1. This is sufficient to complete SME memory accesses at EL0 on taking an exception to EL1. On the return to user path, no barrier is necessary either. See the comment in sme_set_active() and the more detailed explanation in the link below. To avoid a potential IPI flood from malicious applications (e.g. madvise(MADV_PAGEOUT) in a tight loop), track where a process is active via mm_cpumask() and only interrupt those CPUs. Link: https://lore.kernel.org/r/ablEXwhfKyJW1i7l@J2N7QTR9R3 Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Reviewed-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
|
|
63de2b3859 |
arm64/efi: Remove unneeded SVE/SME fallback preserve/store handling
Since commit
|
||
|
|
f617d24606 |
arm64 FPSIMD buffer on-stack for 6.19
In v6.8, the size of task_struct on arm64 increased by 528 bytes due to the new 'kernel_fpsimd_state' field. This field was added to allow kernel-mode FPSIMD code to be preempted. Unfortunately, 528 bytes is kind of a lot for task_struct. This regression in the task_struct size was noticed and reported. Recover that space by making this state be allocated on the stack at the beginning of each kernel-mode FPSIMD section. To make it easier for all the users of kernel-mode FPSIMD to do that correctly, introduce and use a 'scoped_ksimd' abstraction. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaSuxbxQcZWJpZ2dlcnNA a2VybmVsLm9yZwAKCRDzXCl4vpKOKysnAQCbN4Jed8IqwGUEkkjZrnMeN0pEO4RI lAhb2Obj3n/grQEAiPBmqWVjXaIPO4lSgLQxY6XoVLr+utMod4TMTYHfnAY= =0zQQ -----END PGP SIGNATURE----- Merge tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull arm64 FPSIMD on-stack buffer updates from Eric Biggers: "This is a core arm64 change. However, I was asked to take this because most uses of kernel-mode FPSIMD are in crypto or CRC code. In v6.8, the size of task_struct on arm64 increased by 528 bytes due to the new 'kernel_fpsimd_state' field. This field was added to allow kernel-mode FPSIMD code to be preempted. Unfortunately, 528 bytes is kind of a lot for task_struct. This regression in the task_struct size was noticed and reported. Recover that space by making this state be allocated on the stack at the beginning of each kernel-mode FPSIMD section. To make it easier for all the users of kernel-mode FPSIMD to do that correctly, introduce and use a 'scoped_ksimd' abstraction" * tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (23 commits) lib/crypto: arm64: Move remaining algorithms to scoped ksimd API lib/crypto: arm/blake2b: Move to scoped ksimd API arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack arm64/fpu: Enforce task-context only for generic kernel mode FPU net/mlx5: Switch to more abstract scoped ksimd guard API on arm64 arm64/xorblocks: Switch to 'ksimd' scoped guard API crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API crypto/arm64: polyval - Switch to 'ksimd' scoped guard API crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API raid6: Move to more abstract 'ksimd' guard API crypto: aegis128-neon - Move to more abstract 'ksimd' guard API crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API ... |
||
|
|
4fa617cc68 |
arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
Commit
|
||
|
|
7137a203b2 |
arm64/fpsimd: Permit kernel mode NEON with IRQs off
Currently, may_use_simd() will return false when called from a context where IRQs are disabled. One notable case where this happens is when calling the ResetSystem() EFI runtime service from the reboot/poweroff code path. For this case alone, there is a substantial amount of FP/SIMD support code to handle the corner case where a EFI runtime service is invoked with IRQs disabled. The only reason kernel mode SIMD is not allowed when IRQs are disabled is that re-enabling softirqs in this case produces a noisy diagnostic when lockdep is enabled. The warning is valid, in the sense that delivering pending softirqs over the back of the call to local_bh_enable() is problematic when IRQs are disabled. While the API lacks a facility to simply mask and unmask softirqs without triggering their delivery, disabling softirqs is not needed to begin with when IRQs are disabled, given that softirqs are only every taken asynchronously over the back of a hard IRQ. So dis/enable softirq processing conditionally, based on whether IRQs are enabled, and relax the check in may_use_simd(). Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
|
|
1d038e8018 |
arm64/fpsimd: Don't warn when EFI execution context is preemptible
Kernel mode FP/SIMD no longer requires preemption to be disabled, so only warn on uses of FP/SIMD from preemptible context if the fallback path is taken for cases where kernel mode NEON would not be allowed otherwise. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
|
|
19dd484cd1 |
arm64/fpsimd: simplify sme_setup()
The function checks info->vq_map for emptiness right before calling find_last_bit(). We can use the find_last_bit() output and save on bitmap_empty() call, which is O(N). Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com> Signed-off-by: Will Deacon <will@kernel.org> |
||
|
|
53a087046a |
Merge branch 'for-next/sme-fixes' into for-next/core
* for-next/sme-fixes: (35 commits)
arm64/fpsimd: Allow CONFIG_ARM64_SME to be selected
arm64/fpsimd: ptrace: Gracefully handle errors
arm64/fpsimd: ptrace: Mandate SVE payload for streaming-mode state
arm64/fpsimd: ptrace: Do not present register data for inactive mode
arm64/fpsimd: ptrace: Save task state before generating SVE header
arm64/fpsimd: ptrace/prctl: Ensure VL changes leave task in a valid state
arm64/fpsimd: ptrace/prctl: Ensure VL changes do not resurrect stale data
arm64/fpsimd: Make clone() compatible with ZA lazy saving
arm64/fpsimd: Clear PSTATE.SM during clone()
arm64/fpsimd: Consistently preserve FPSIMD state during clone()
arm64/fpsimd: Remove redundant task->mm check
arm64/fpsimd: signal: Use SMSTOP behaviour in setup_return()
arm64/fpsimd: Add task_smstop_sm()
arm64/fpsimd: Factor out {sve,sme}_state_size() helpers
arm64/fpsimd: Clarify sve_sync_*() functions
arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE
arm64/fpsimd: signal: Consistently read FPSIMD context
arm64/fpsimd: signal: Mandate SVE payload for streaming-mode state
arm64/fpsimd: signal: Clear PSTATE.SM when restoring FPSIMD frame only
arm64/fpsimd: Do not discard modified SVE state
...
|
||
|
|
b87c8c4aca |
arm64/fpsimd: ptrace/prctl: Ensure VL changes leave task in a valid state
Currently, vec_set_vector_length() can manipulate a task into an invalid
state as a result of a prctl/ptrace syscall which changes the SVE/SME
vector length, resulting in several problems:
(1) When changing the SVE vector length, if the task initially has
PSTATE.ZA==1, and sve_alloc() fails to allocate memory, the task
will be left with PSTATE.ZA==1 and sve_state==NULL. This is not a
legitimate state, and could result in a subsequent null pointer
dereference.
(2) When changing the SVE vector length, if the task initially has
PSTATE.SM==1, the task will be left with PSTATE.SM==1 and
fp_type==FP_STATE_FPSIMD. Streaming mode state always needs to be
saved in SVE format, so this is not a legitimate state.
Attempting to restore this state may cause a task to erroneously
inherit stale streaming mode predicate registers and FFR contents,
behaving non-deterministically and potentially leaving information
from another task.
While in this state, reads of the NT_ARM_SSVE regset will indicate
that the registers are not stored in SVE format. For the NT_ARM_SSVE
regset specifically, debuggers interpret this as meaning that
PSTATE.SM==0.
(3) When changing the SME vector length, if the task initially has
PSTATE.SM==1, the lower 128 bits of task's streaming mode vector
state will be migrated to non-streaming mode, rather than these bits
being zeroed as is usually the case for changes to PSTATE.SM.
To fix the first issue, we can eagerly allocate the new sve_state and
sme_state before modifying the task. This makes it possible to handle
memory allocation failure without modifying the task state at all, and
removes the need to clear TIF_SVE and TIF_SME.
To fix the second issue, we either need to clear PSTATE.SM or not change
the saved fp_type. Given we're going to eagerly allocate sve_state and
sme_state, the simplest option is to preserve PSTATE.SM and the saves
fp_type, and consistently truncate the SVE state. This ensures that the
task always stays in a valid state, and by virtue of not exiting
streaming mode, this also sidesteps the third issue.
I believe these changes should not be problematic for realistic usage:
* When the SVE/SME vector length is changed via prctl(), syscall entry
will have cleared PSTATE.SM. Unless the task's state has been
manipulated via ptrace after entry, the task will have PSTATE.SM==0.
* When the SVE/SME vector length is changed via a write to the
NT_ARM_SVE or NT_ARM_SSVE regsets, PSTATE.SM will be forced
immediately after the length change, and new vector state will be
copied from userspace.
* When the SME vector length is changed via a write to the NT_ARM_ZA
regset, the (S)SVE state is clobbered today, so anyone who cares about
the specific state would need to install this after writing to the
NT_ARM_ZA regset.
As we need to free the old SVE state while TIF_SVE may still be set, we
cannot use sve_free(), and using kfree() directly makes it clear that
the free pairs with the subsequent assignment. As this leaves sve_free()
unused, I've removed the existing sve_free() and renamed __sve_free() to
mirror sme_free().
Fixes:
|
||
|
|
49ce484187 |
arm64/fpsimd: ptrace/prctl: Ensure VL changes do not resurrect stale data
The SVE/SME vector lengths can be changed via prctl/ptrace syscalls. Changes to the SVE/SME vector lengths are documented as preserving the lower 128 bits of the Z registers (i.e. the bits shared with the FPSIMD V registers). To ensure this, vec_set_vector_length() explicitly copies register values from a task's saved SVE state to its saved FPSIMD state when dropping the task to FPSIMD-only. The logic for this was not updated when when FPSIMD/SVE state tracking was changed across commits: |
||
|
|
6ef1d778ce |
arm64/fpsimd: Add task_smstop_sm()
In a few places we want to transition a task from streaming mode to
non-streaming mode, e.g. signal delivery where we historically tried to
use an SMSTOP SM instruction.
Add a new helper to manipulate a task's state in the same way as an
SMSTOP SM instruction. I have not added a corresponding helper to
simulate the effects of SMSTART SM. Only ptrace transitions a task into
streaming mode, and ptrace has distinct semantics for such transitions.
Per ARM DDI 0487 L.a, section B1.4.6:
| RRSWFQ
| When the Effective value of PSTATE.SM is changed by any method from 0
| to 1, an entry to Streaming SVE mode is performed, and all implemented
| bits of Streaming SVE register state are set to zero.
| RKFRQZ
| When the Effective value of PSTATE.SM is changed by any method from 1
| to 0, an exit from Streaming SVE mode is performed, and in the
| newly-entered mode, all implemented bits of the SVE scalable vector
| registers, SVE predicate registers, and FFR, are set to zero.
Per ARM DDI 0487 L.a, section C5.2.9:
| On entry to or exit from Streaming SVE mode, FPMR is set to 0
Per ARM DDI 0487 L.a, section C5.2.10:
| On entry to or exit from Streaming SVE mode, FPSR.{IOC, DZC, OFC, UFC,
| IXC, IDC, QC} are set to 1 and the remaining bits are set to 0.
This means bits 0, 1, 2, 3, 4, 7, and 27 respectively, i.e. 0x0800009f
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-9-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
||
|
|
8738288a08 |
arm64/fpsimd: Factor out {sve,sme}_state_size() helpers
In subsequent patches we'll need to determine the SVE/SME state size for a given SVE VL and SME VL regardless of whether a task is currently configured with those VLs. Split the sizing logic out of sve_state_size() and sme_state_size() so that we don't need to open-code this logic elsewhere. At the same time, apply minor cleanups: * Move sve_state_size() into fpsimd.h, matching the placement of sme_state_size(). * Remove the feature checks from sve_state_size(). We only call sve_state_size() when at least one of SVE and SME are supported, and when either of the two is not supported, the task's corresponding SVE/SME vector length will be zero. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-8-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
|
|
b255be4269 |
arm64/fpsimd: Clarify sve_sync_*() functions
The sve_sync_{to,from}_fpsimd*() functions are intended to
extract/insert the currently effective FPSIMD state of a task regardless
of whether the task's state is saved in FPSIMD format or SVE format.
Historically they were only used by ptrace, but sve_sync_to_fpsimd() is
now used more widely, and sve_sync_from_fpsimd_zeropad() may be used
more widely in future.
When FPSIMD/SVE state tracking was changed across commits:
|
||
|
|
316283f276 |
arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE
Partial writes to the NT_ARM_SVE and NT_ARM_SSVE regsets using an payload are handled inconsistently and non-deterministically. A comment within sve_set_common() indicates that we intended that a partial write would preserve any effective FPSIMD/SVE state which was not overwritten, but this has never worked consistently, and during syscalls the FPSIMD vector state may be non-deterministically preserved and may be erroneously migrated between streaming and non-streaming SVE modes. The simplest fix is to handle a partial write by consistently zeroing the remaining state. As detailed below I do not believe this will adversely affect any real usage. Neither GDB nor LLDB attempt partial writes to these regsets, and the documentation (in Documentation/arch/arm64/sve.rst) has always indicated that state preservation was not guaranteed, as is says: | The effect of writing a partial, incomplete payload is unspecified. When the logic was originally introduced in commit: |
||
|
|
398edaa12f |
arm64/fpsimd: Do not discard modified SVE state
Historically SVE state was discarded deterministically early in the syscall entry path, before ptrace is notified of syscall entry. This permitted ptrace to modify SVE state before and after the "real" syscall logic was executed, with the modified state being retained. This behaviour was changed by commit: |
||
|
|
f699c66691 |
arm64/fpsimd: Avoid warning when sve_to_fpsimd() is unused
Historically fpsimd_to_sve() and sve_to_fpsimd() were (conditionally) called by functions which were defined regardless of CONFIG_ARM64_SVE. Hence it was necessary that both fpsimd_to_sve() and sve_to_fpsimd() were always defined and not guarded by ifdeffery. As a result of the removal of fpsimd_signal_preserve_current_state() in commit: |
||
|
|
e04796c8b5 |
arm64/fpsimd: Avoid unnecessary per-CPU buffers for EFI runtime calls
The EFI specification has some elaborate rules about which runtime services may be called while another runtime service call is already in progress. In Linux, however, for simplicity, all EFI runtime service invocations are serialized via the efi_runtime_lock semaphore. This implies that calls to the helper pair arch_efi_call_virt_setup() and arch_efi_call_virt_teardown() are serialized too, and are guaranteed not to nest. Furthermore, the arm64 arch code has its own spinlock to serialize use of the EFI runtime stack, of which only a single instance exists. This all means that the FP/SIMD and SVE state preserve/restore logic in __efi_fpsimd_begin() and __efi_fpsimd_end() are also serialized, and only a single instance of the associated per-CPU variables can ever be in use at the same time. There is therefore no need at all for per-CPU variables here, and they can all be replaced with singleton instances. This saves a non-trivial amount of memory on systems with many CPUs. To be more robust against potential future changes in the core EFI code that may invalidate the reasoning above, move the invocations of __efi_fpsimd_begin() and __efi_fpsimd_end() into the critical section covered by the efi_rt_lock spinlock. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250318132421.3155799-2-ardb+git@google.com Signed-off-by: Will Deacon <will@kernel.org> |
||
|
|
929fa99b12 |
arm64/fpsimd: signal: Always save+flush state early
There are several issues with the way the native signal handling code
manipulates FPSIMD/SVE/SME state, described in detail below. These
issues largely result from races with preemption and inconsistent
handling of live state vs saved state.
Known issues with native FPSIMD/SVE/SME state management include:
* On systems with FPMR, the code to save/restore the FPMR accesses the
register while it is not owned by the current task. Consequently, this
may corrupt the FPMR of the current task and/or may corrupt the FPMR
of an unrelated task. The FPMR save/restore has been broken since it
was introduced in commit:
|
||
|
|
d3a181588d |
arm64/fpsimd: Add fpsimd_save_and_flush_current_state()
When the current task's FPSIMD/SVE/SME state may be live on *any* CPU in the system, special care must be taken when manipulating that state, as this manipulation can race with preemption and/or asynchronous usage of FPSIMD/SVE/SME (e.g. kernel-mode NEON in softirq handlers). Even when manipulation is is protected with get_cpu_fpsimd_context() and get_cpu_fpsimd_context(), the logic necessary when the state is live on the current CPU can be wildly different from the logic necessary when the state is not live on the current CPU. A number of historical and extant issues result from failing to handle these cases consistetntly and/or correctly. To make it easier to get such manipulation correct, add a new fpsimd_save_and_flush_current_state() helper function, which ensures that the current task's state has been saved to memory and any stale state on any CPU has been "flushed" such that is not live on any CPU in the system. This will allow code to safely manipulate the saved state without risk of races. Subsequent patches will use the new function. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409164010.3480271-11-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
|
|
c94f2f3261 |
arm64/fpsimd: Fix merging of FPSIMD state during signal return
For backwards compatibility reasons, when a signal return occurs which restores SVE state, the effective lower 128 bits of each of the SVE vector registers are restored from the corresponding FPSIMD vector register in the FPSIMD signal frame, overriding the values in the SVE signal frame. This is intended to be the case regardless of streaming mode. To make this happen, restore_sve_fpsimd_context() uses fpsimd_update_current_state() to merge the lower 128 bits from the FPSIMD signal frame into the SVE register state. Unfortunately, fpsimd_update_current_state() performs this merging dependent upon TIF_SVE, which is not always correct for streaming SVE register state: * When restoring non-streaming SVE register state there is no observable problem, as the signal return code configures TIF_SVE and the saved fp_type to match before calling fpsimd_update_current_state(), which observes either: - TIF_SVE set AND fp_type == FP_STATE_SVE - TIF_SVE clear AND fp_type == FP_STATE_FPSIMD * On systems which have SME but not SVE, TIF_SVE cannot be set. Thus the merging will never happen for the streaming SVE register state. * On systems which have SVE and SME, TIF_SVE can be set and cleared independently of PSTATE.SM. Thus the merging may or may not happen for streaming SVE register state. As TIF_SVE can be cleared non-deterministically during syscalls (including at the start of sigreturn()), the merging may occur non-deterministically from the perspective of userspace. This logic has been broken since its introduction in commit: |
||
|
|
a90878f297 |
arm64/fpsimd: Reset FPMR upon exec()
An exec() is expected to reset all FPSIMD/SVE/SME state, and barring
special handling of the vector lengths, the state is expected to reset
to zero. This reset is handled in fpsimd_flush_thread(), which the core
exec() code calls via flush_thread().
When support was added for FPMR, no logic was added to
fpsimd_flush_thread() to reset the FPMR value, and thus it is
erroneously inherited across an exec().
Add the missing reset of FPMR.
Fixes:
|
||
|
|
01098d893f |
arm64/fpsimd: Avoid clobbering kernel FPSIMD state with SMSTOP
On system with SME, a thread's kernel FPSIMD state may be erroneously
clobbered during a context switch immediately after that state is
restored. Systems without SME are unaffected.
If the CPU happens to be in streaming SVE mode before a context switch
to a thread with kernel FPSIMD state, fpsimd_thread_switch() will
restore the kernel FPSIMD state using fpsimd_load_kernel_state() while
the CPU is still in streaming SVE mode. When fpsimd_thread_switch()
subsequently calls fpsimd_flush_cpu_state(), this will execute an
SMSTOP, causing an exit from streaming SVE mode. The exit from
streaming SVE mode will cause the hardware to reset a number of
FPSIMD/SVE/SME registers, clobbering the FPSIMD state.
Fix this by calling fpsimd_flush_cpu_state() before restoring the kernel
FPSIMD state.
Fixes:
|
||
|
|
e5fa85fce0 |
arm64/fpsimd: Don't corrupt FPMR when streaming mode changes
When the effective value of PSTATE.SM is changed from 0 to 1 or from 1
to 0 by any method, an entry or exit to/from streaming SVE mode is
performed, and hardware automatically resets a number of registers. As
of ARM DDI 0487 L.a, this means:
* All implemented bits of the SVE vector registers are set to zero.
* All implemented bits of the SVE predicate registers are set to zero.
* All implemented bits of FFR are set to zero, if FFR is implemented in
the new mode.
* FPSR is set to 0x0000_0000_0800_009f.
* FPMR is set to 0, if FPMR is implemented.
Currently task_fpsimd_load() restores FPMR before restoring SVCR (which
is an accessor for PSTATE.{SM,ZA}), and so the restored value of FPMR
may be clobbered if the restored value of PSTATE.SM happens to differ
from the initial value of PSTATE.SM.
Fix this by moving the restore of FPMR later.
Note: this was originally posted as [1].
Fixes:
|
||
|
|
d3eaab3c70 |
arm64/fpsimd: Discard stale CPU state when handling SME traps
The logic for handling SME traps manipulates saved FPSIMD/SVE/SME state incorrectly, and a race with preemption can result in a task having TIF_SME set and TIF_FOREIGN_FPSTATE clear even though the live CPU state is stale (e.g. with SME traps enabled). This can result in warnings from do_sme_acc() where SME traps are not expected while TIF_SME is set: | /* With TIF_SME userspace shouldn't generate any traps */ | if (test_and_set_thread_flag(TIF_SME)) | WARN_ON(1); This is very similar to the SVE issue we fixed in commit: |
||
|
|
d7649a4a60 |
arm64/fpsimd: Remove opportunistic freeing of SME state
When a task's SVE vector length (NSVL) is changed, and the task happens
to have SVCR.{SM,ZA}=={0,0}, vec_set_vector_length() opportunistically
frees the task's sme_state and clears TIF_SME.
The opportunistic freeing was added with no rationale in commit:
|
||
|
|
45fd86986b |
arm64/fpsimd: Remove redundant SVE trap manipulation
When task_fpsimd_load() loads the saved FPSIMD/SVE/SME state, it
configures EL0 SVE traps by calling sve_user_{enable,disable}(). This is
unnecessary, and this is suspicious/confusing as task_fpsimd_load() does
not configure EL0 SME traps.
All calls to task_fpsimd_load() are followed by a call to
fpsimd_bind_task_to_cpu(), where the latter configures traps for SVE and
SME dependent upon the current values of TIF_SVE and TIF_SME, overriding
any trap configuration performed by task_fpsimd_load().
The calls to sve_user_{enable,disable}() calls in task_fpsimd_load()
have been redundant (though benign) since they were introduced in
commit:
|
||
|
|
61db0e0ba3 |
arm64/fpsimd: Remove unused fpsimd_force_sync_to_sve()
There have been no users of fpsimd_force_sync_to_sve() since commit:
|
||
|
|
95507570fb |
arm64/fpsimd: Avoid RES0 bits in the SME trap handler
The SME trap handler consumes RES0 bits from the ESR when determining
the reason for the trap, and depends upon those bits reading as zero.
This may break in future when those RES0 bits are allocated a meaning
and stop reading as zero.
For SME traps taken with ESR_ELx.EC == 0b011101, the specific reason for
the trap is indicated by ESR_ELx.ISS.SMTC ("SME Trap Code"). This field
occupies bits [2:0] of ESR_ELx.ISS, and as of ARM DDI 0487 L.a, bits
[24:3] of ESR_ELx.ISS are RES0. ESR_ELx.ISS itself occupies bits [24:0]
of ESR_ELx.
Extract the SMTC field specifically, matching the way we handle ESR_ELx
fields elsewhere, and ensuring that the handler is future-proof.
Fixes:
|
||
|
|
3bb7dcebd0 |
KVM/arm64 fixes for 6.14, take #2
- Large set of fixes for vector handling, specially in the interactions between host and guest state. This fixes a number of bugs affecting actual deployments, and greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland for dealing with this thankless task. - Fix an ugly race between vcpu and vgic creation/init, resulting in unexpected behaviours. - Fix use of kernel VAs at EL2 when emulating timers with nVHE. - Small set of pKVM improvements and cleanups. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmevLKMACgkQI9DQutE9 ekP3hQ//db7pAPzLr++//PAyam0GP+ooKlgpB0ImZisQwkrTrTMP+IjNJG+NCJ46 y88anBErFijvWb3BINpeTM/dux7DmuoaolGx7lquFu+i0L8UfFFjYG7UU+NZscim KE4j0tJz8jm5ksN4iwaj3RIkGKc1zJtRyoPny3j1blOtm8aTtujRJB7/Gx2QefZR 1Z13RaIzk1tKdY0JxAmPpGkaRY99MQahx96iBsk2u4rlypcxmVr9aQ1Madp7Pc6Y pBcX9jZwLf75cj6CAK93YSjFF3j/x4QM8jSupLCu5tyin6YZ4sRaZa6sy52byk2v zes7i83l5g3+JEKv5oZVwjD5SFBu02UPbnMGSxKQitgz4Zej3qMIq5BxgII2kHZV jwXrNEx4trNegEcoqwFX5xA0FMUr1/g3Cr4+rZBoUramj80cBhzbBdUkhyWd3eey j2EOuAG3pgUD5Wv9SyojlbHBwmSAcBEtr3vqJpTjWQS6AyEmdKNvzh/8JCH1h7UM fBo4+LIEylzmZXbqDrZNwXh31tELoTCR9Ur3pTCEO3Yfg9npTLWmvKs+tAgO/282 IOjZE0N/ZtzPJ6Cgr+2efBGd+id81HXh+H8gWo35Dyx3EH2k44FHwQ3rW2NKOVzo 10eSbswYpjk3gi/6GxwC0lDqFi4Bk6ILvC6roqTghixBf7xThfY= =L5HS -----END PGP SIGNATURE----- Merge tag 'kvmarm-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.14, take #2 - Large set of fixes for vector handling, specially in the interactions between host and guest state. This fixes a number of bugs affecting actual deployments, and greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland for dealing with this thankless task. - Fix an ugly race between vcpu and vgic creation/init, resulting in unexpected behaviours. - Fix use of kernel VAs at EL2 when emulating timers with nVHE. - Small set of pKVM improvements and cleanups. |
||
|
|
fbc7e61195 |
KVM: arm64: Unconditionally save+flush host FPSIMD/SVE/SME state
There are several problems with the way hyp code lazily saves the host's FPSIMD/SVE state, including: * Host SVE being discarded unexpectedly due to inconsistent configuration of TIF_SVE and CPACR_ELx.ZEN. This has been seen to result in QEMU crashes where SVE is used by memmove(), as reported by Eric Auger: https://issues.redhat.com/browse/RHEL-68997 * Host SVE state is discarded *after* modification by ptrace, which was an unintentional ptrace ABI change introduced with lazy discarding of SVE state. * The host FPMR value can be discarded when running a non-protected VM, where FPMR support is not exposed to a VM, and that VM uses FPSIMD/SVE. In these cases the hyp code does not save the host's FPMR before unbinding the host's FPSIMD/SVE/SME state, leaving a stale value in memory. Avoid these by eagerly saving and "flushing" the host's FPSIMD/SVE/SME state when loading a vCPU such that KVM does not need to save any of the host's FPSIMD/SVE/SME state. For clarity, fpsimd_kvm_prepare() is removed and the necessary call to fpsimd_save_and_flush_cpu_state() is placed in kvm_arch_vcpu_load_fp(). As 'fpsimd_state' and 'fpmr_ptr' should not be used, they are set to NULL; all uses of these will be removed in subsequent patches. Historical problems go back at least as far as v5.17, e.g. erroneous assumptions about TIF_SVE being clear in commit: |
||
|
|
1751f872cc |
treewide: const qualify ctl_tables where applicable
Add the const qualifier to all the ctl_tables in the tree except for
watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls,
loadpin_sysctl_table and the ones calling register_net_sysctl (./net,
drivers/inifiniband dirs). These are special cases as they use a
registration function with a non-const qualified ctl_table argument or
modify the arrays before passing them on to the registration function.
Constifying ctl_table structs will prevent the modification of
proc_handler function pointers as the arrays would reside in .rodata.
This is made possible after commit
|
||
|
|
ba1f9c8fe3 |
arm64 updates for 6.13:
* Support for running Linux in a protected VM under the Arm Confidential
Compute Architecture (CCA)
* Guarded Control Stack user-space support. Current patches follow the
x86 ABI of implicitly creating a shadow stack on clone(). Subsequent
patches (already on the list) will add support for clone3() allowing
finer-grained control of the shadow stack size and placement from libc
* AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are
getting close with the upcoming dpISA support)
* Other arch features:
- In-kernel use of the memcpy instructions, FEAT_MOPS (previously only
exposed to user; uaccess support not merged yet)
- MTE: hugetlbfs support and the corresponding kselftests
- Optimise CRC32 using the PMULL instructions
- Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG
- Optimise the kernel TLB flushing to use the range operations
- POE/pkey (permission overlays): further cleanups after bringing the
signal handler in line with the x86 behaviour for 6.12
* arm64 perf updates:
- Support for the NXP i.MX91 PMU in the existing IMX driver
- Support for Ampere SoCs in the Designware PCIe PMU driver
- Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC
- Support for Samsung's 'Mongoose' CPU PMU
- Support for PMUv3.9 finer-grained userspace counter access control
- Switch back to platform_driver::remove() now that it returns 'void'
- Add some missing events for the CXL PMU driver
* Miscellaneous arm64 fixes/cleanups:
- Page table accessors cleanup: type updates, drop unused macros,
reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity
check addresses before runtime P4D/PUD folding
- Command line override for ID_AA64MMFR0_EL1.ECV (advertising the
FEAT_ECV for the generic timers) allowing Linux to boot with
firmware deployments that don't set SCTLR_EL3.ECVEn
- ACPI/arm64: tighten the check for the array of platform timer
structures and adjust the error handling procedure in
gtdt_parse_timer_block()
- Optimise the cache flush for the uprobes xol slot (skip if no
change) and other uprobes/kprobes cleanups
- Fix the context switching of tpidrro_el0 when kpti is enabled
- Dynamic shadow call stack fixes
- Sysreg updates
- Various arm64 kselftest improvements
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmc5POIACgkQa9axLQDI
XvEDYA//a3eeNkgMuGdnSCVcLz+zy+oNwAwboG/4X1DqL8jiCbI4npwugPx95RIA
YZOUvo9T2aL3OyefpUHll4gFHqx9OwoZIig2F70TEUmlPsGUbh0KBkdfQF3xZPdl
EwV0kHSGEqMWMBwsGJGwgCYrUaf1MUQzh1GBl7VJ2ts5XsJBaBeOyKkysij26wtZ
V+aHq2IUx7qQS7+HC/4P6IoHxKziFcsCMovaKaynP4cw9xXBQbDMcNlHEwndOMyk
pu2zrv7GG0j3KQuVP/2Alf5FKhmI0GVGP/6Nc/zsOmw96w8Kf7HfzEtkHawr2aRq
rqg/c9ivzDn1p+fUBo4ZYtrRk4IAY+yKu6hdzdLTP5+bQrBTWTO9rjQVBm9FAGYT
sCdEj1NqzvExvNHD7X6ut/GJ05lmce3K+qeSXSEysN9gqiT3eomYWMXrD2V2lxzb
rIDDcb/icfaqjt14Mksh19r/rzNeq7noj9CGSmcqw0BHZfHzl38Lai6pdfYzCNyn
vCM/c4c1D/WWX8/lifO1JZVbhDk1jy82Iphg2KEhL8iKPxDsKBBZLmYuU1oa7tMo
WryGAz9+GQwd+W9chFuaOEtMnzvW2scEJ5Eb2fEf0Qj0aEurkL+C9dZR6o1GN77V
DBUxtU628Ef4PJJGfbNCwZzdd8UPYG3a/mKfQQ3dz0oz2LySlW4=
=wDot
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- Support for running Linux in a protected VM under the Arm
Confidential Compute Architecture (CCA)
- Guarded Control Stack user-space support. Current patches follow the
x86 ABI of implicitly creating a shadow stack on clone(). Subsequent
patches (already on the list) will add support for clone3() allowing
finer-grained control of the shadow stack size and placement from
libc
- AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are
getting close with the upcoming dpISA support)
- Other arch features:
- In-kernel use of the memcpy instructions, FEAT_MOPS (previously
only exposed to user; uaccess support not merged yet)
- MTE: hugetlbfs support and the corresponding kselftests
- Optimise CRC32 using the PMULL instructions
- Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG
- Optimise the kernel TLB flushing to use the range operations
- POE/pkey (permission overlays): further cleanups after bringing
the signal handler in line with the x86 behaviour for 6.12
- arm64 perf updates:
- Support for the NXP i.MX91 PMU in the existing IMX driver
- Support for Ampere SoCs in the Designware PCIe PMU driver
- Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC
- Support for Samsung's 'Mongoose' CPU PMU
- Support for PMUv3.9 finer-grained userspace counter access
control
- Switch back to platform_driver::remove() now that it returns
'void'
- Add some missing events for the CXL PMU driver
- Miscellaneous arm64 fixes/cleanups:
- Page table accessors cleanup: type updates, drop unused macros,
reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity
check addresses before runtime P4D/PUD folding
- Command line override for ID_AA64MMFR0_EL1.ECV (advertising the
FEAT_ECV for the generic timers) allowing Linux to boot with
firmware deployments that don't set SCTLR_EL3.ECVEn
- ACPI/arm64: tighten the check for the array of platform timer
structures and adjust the error handling procedure in
gtdt_parse_timer_block()
- Optimise the cache flush for the uprobes xol slot (skip if no
change) and other uprobes/kprobes cleanups
- Fix the context switching of tpidrro_el0 when kpti is enabled
- Dynamic shadow call stack fixes
- Sysreg updates
- Various arm64 kselftest improvements
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (168 commits)
arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled
kselftest/arm64: Try harder to generate different keys during PAC tests
kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all()
arm64/ptrace: Clarify documentation of VL configuration via ptrace
kselftest/arm64: Corrupt P0 in the irritator when testing SSVE
acpi/arm64: remove unnecessary cast
arm64/mm: Change protval as 'pteval_t' in map_range()
kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c
kselftest/arm64: Add FPMR coverage to fp-ptrace
kselftest/arm64: Expand the set of ZA writes fp-ptrace does
kselftets/arm64: Use flag bits for features in fp-ptrace assembler code
kselftest/arm64: Enable build of PAC tests with LLVM=1
kselftest/arm64: Check that SVCR is 0 in signal handlers
selftests/mm: Fix unused function warning for aarch64_write_signal_pkey()
kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests
kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test
kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests
kselftest/arm64: Fix build with stricter assemblers
arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux()
arm64/scs: Deal with 64-bit relative offsets in FDE frames
...
|
||
|
|
751ecf6afd |
arm64/sve: Discard stale CPU state when handling SVE traps
The logic for handling SVE traps manipulates saved FPSIMD/SVE state
incorrectly, and a race with preemption can result in a task having
TIF_SVE set and TIF_FOREIGN_FPSTATE clear even though the live CPU state
is stale (e.g. with SVE traps enabled). This has been observed to result
in warnings from do_sve_acc() where SVE traps are not expected while
TIF_SVE is set:
| if (test_and_set_thread_flag(TIF_SVE))
| WARN_ON(1); /* SVE access shouldn't have trapped */
Warnings of this form have been reported intermittently, e.g.
https://lore.kernel.org/linux-arm-kernel/CA+G9fYtEGe_DhY2Ms7+L7NKsLYUomGsgqpdBj+QwDLeSg=JhGg@mail.gmail.com/
https://lore.kernel.org/linux-arm-kernel/000000000000511e9a060ce5a45c@google.com/
The race can occur when the SVE trap handler is preempted before and
after manipulating the saved FPSIMD/SVE state, starting and ending on
the same CPU, e.g.
| void do_sve_acc(unsigned long esr, struct pt_regs *regs)
| {
| // Trap on CPU 0 with TIF_SVE clear, SVE traps enabled
| // task->fpsimd_cpu is 0.
| // per_cpu_ptr(&fpsimd_last_state, 0) is task.
|
| ...
|
| // Preempted; migrated from CPU 0 to CPU 1.
| // TIF_FOREIGN_FPSTATE is set.
|
| get_cpu_fpsimd_context();
|
| if (test_and_set_thread_flag(TIF_SVE))
| WARN_ON(1); /* SVE access shouldn't have trapped */
|
| sve_init_regs() {
| if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
| ...
| } else {
| fpsimd_to_sve(current);
| current->thread.fp_type = FP_STATE_SVE;
| }
| }
|
| put_cpu_fpsimd_context();
|
| // Preempted; migrated from CPU 1 to CPU 0.
| // task->fpsimd_cpu is still 0
| // If per_cpu_ptr(&fpsimd_last_state, 0) is still task then:
| // - Stale HW state is reused (with SVE traps enabled)
| // - TIF_FOREIGN_FPSTATE is cleared
| // - A return to userspace skips HW state restore
| }
Fix the case where the state is not live and TIF_FOREIGN_FPSTATE is set
by calling fpsimd_flush_task_state() to detach from the saved CPU
state. This ensures that a subsequent context switch will not reuse the
stale CPU state, and will instead set TIF_FOREIGN_FPSTATE, forcing the
new state to be reloaded from memory prior to a return to userspace.
Fixes:
|
||
|
|
525fd6a1b3 |
arm64/fpsimd: Fix a typo
s/FPSMID/FPSIMD/ M and I swapped. Fix it. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/2cbcb42615e9265bccc9b746465d7998382e605d.1730539907.git.christophe.jaillet@wanadoo.fr Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
|
|
78eb4ea25c |
sysctl: treewide: constify the ctl_table argument of proc_handlers
const qualify the struct ctl_table argument in the proc_handler function
signatures. This is a prerequisite to moving the static ctl_table
structs into .rodata data which will ensure that proc_handler function
pointers cannot be modified.
This patch has been generated by the following coccinelle script:
```
virtual patch
@r1@
identifier ctl, write, buffer, lenp, ppos;
identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)";
@@
int func(
- struct ctl_table *ctl
+ const struct ctl_table *ctl
,int write, void *buffer, size_t *lenp, loff_t *ppos);
@r2@
identifier func, ctl, write, buffer, lenp, ppos;
@@
int func(
- struct ctl_table *ctl
+ const struct ctl_table *ctl
,int write, void *buffer, size_t *lenp, loff_t *ppos)
{ ... }
@r3@
identifier func;
@@
int func(
- struct ctl_table *
+ const struct ctl_table *
,int , void *, size_t *, loff_t *);
@r4@
identifier func, ctl;
@@
int func(
- struct ctl_table *ctl
+ const struct ctl_table *ctl
,int , void *, size_t *, loff_t *);
@r5@
identifier func, write, buffer, lenp, ppos;
@@
int func(
- struct ctl_table *
+ const struct ctl_table *
,int write, void *buffer, size_t *lenp, loff_t *ppos);
```
* Code formatting was adjusted in xfs_sysctl.c to comply with code
conventions. The xfs_stats_clear_proc_handler,
xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where
adjusted.
* The ctl_table argument in proc_watchdog_common was const qualified.
This is called from a proc_handler itself and is calling back into
another proc_handler, making it necessary to change it as part of the
proc_handler migration.
Co-developed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Co-developed-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Joel Granados <j.granados@samsung.com>
|
||
|
|
e92bee9f86 |
arm64/fpsimd: Avoid erroneous elide of user state reload
TIF_FOREIGN_FPSTATE is a 'convenience' flag that should reflect whether
the current CPU holds the most recent user mode FP/SIMD state of the
current task. It combines two conditions:
- whether the current CPU's FP/SIMD state belongs to the task;
- whether that state is the most recent associated with the task (as a
task may have executed on other CPUs as well).
When a task is scheduled in and TIF_KERNEL_FPSTATE is set, it means the
task was in a kernel mode NEON section when it was scheduled out, and so
the kernel mode FP/SIMD state is restored. Since this implies that the
current CPU is *not* holding the most recent user mode FP/SIMD state of
the current task, the TIF_FOREIGN_FPSTATE flag is set too, so that the
user mode FP/SIMD state is reloaded from memory when returning to
userland.
However, the task may be scheduled out after completing the kernel mode
NEON section, but before returning to userland. When this happens, the
TIF_FOREIGN_FPSTATE flag will not be preserved, but will be set as usual
the next time the task is scheduled in, and will be based on the above
conditions.
This means that, rather than setting TIF_FOREIGN_FPSTATE when scheduling
in a task with TIF_KERNEL_FPSTATE set, the underlying state should be
updated so that TIF_FOREIGN_FPSTATE will assume the expected value as a
result.
So instead, call fpsimd_flush_cpu_state(), which takes care of this.
Closes: https://lore.kernel.org/all/cb8822182231850108fa43e0446a4c7f@kernel.org
Reported-by: Johannes Nixdorf <mixi@shadowice.org>
Fixes:
|
||
|
|
f481bb32d6 |
Reapply "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD"
This reverts commit
|
||
|
|
b8995a1841 |
Revert "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD"
This reverts commit
|
||
|
|
6d75c6f40a |
arm64 updates for 6.9:
* Reorganise the arm64 kernel VA space and add support for LPA2 (at
stage 1, KVM stage 2 was merged earlier) - 52-bit VA/PA address range
with 4KB and 16KB pages
* Enable Rust on arm64
* Support for the 2023 dpISA extensions (data processing ISA), host only
* arm64 perf updates:
- StarFive's StarLink (integrates one or more CPU cores with a shared
L3 memory system) PMU support
- Enable HiSilicon Erratum 162700402 quirk for HIP09
- Several updates for the HiSilicon PCIe PMU driver
- Arm CoreSight PMU support
- Convert all drivers under drivers/perf/ to use .remove_new()
* Miscellaneous:
- Don't enable workarounds for "rare" errata by default
- Clean up the DAIF flags handling for EL0 returns (in preparation for
NMI support)
- Kselftest update for ptrace()
- Update some of the sysreg field definitions
- Slight improvement in the code generation for inline asm I/O
accessors to permit offset addressing
- kretprobes: acquire regs via a BRK exception (previously done via a
trampoline handler)
- SVE/SME cleanups, comment updates
- Allow CALL_OPS+CC_OPTIMIZE_FOR_SIZE with clang (previously disabled
due to gcc silently ignoring -falign-functions=N)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmXxiSgACgkQa9axLQDI
XvHd7hAAjQrQqxJogPT2ahM5/gxct8qTrXpIgX0B1Y7bb5R8ztvOUN9MJNuDyRsj
0s28SSZw387LReM5OUu+U6G/iahcuNAyP/8d9qeac32Tidd255fV3KPEh4C4eC+u
0HeOqLBZ+stmNoa71tBC2K6SmchizhYyYduvRnri8km8K4OMDawHWqWRTXl0PNRT
RMVJvZTDJMPfMBFeD4+B7EnSFOoP14tKCw9MZvlbpT2PEV0kINjhCQiojW2jJgqv
w36vm/dhwsg1avSzT1xhy3KE+m+7n28+IC/wr1HB7c1WumvYKv7Z84ieCp3PlO3Z
owvVO7dKJC6X3RkoY6Kge5p2RHU6poDerDVHYiAvG+Zi57nrDmHyAubskThsGTGR
AibSEeJ5nQ0yM6hx7zAIQa5XEo4l0svD1ZM7NynY+5JR44W9cdAH3SnEsvIBMGIf
/ja+iZ1W4ZQnIESQXD5uDPSxILfqQ8Ebhdorpw+Qg3rB7OhdTdGSSGQCi6V2PcJH
d/ErFO+i0lFRBPJtBbUAN4EEu3HJcVYEoEnVJYQahC+6KyNGLxO+7L6sH0YO7Pag
P1LRa6h8ktuBMrbCrOPWdmJYNDYCbb5rRtmcCwO0ItZ4g5tYWp9djFc8pyctCaNB
MZxxRrUCNwXTOcFTDiYzyk+JCvpf3EvXfvj8AH+P8BMjFWgqHqw=
=KTD/
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"The major features are support for LPA2 (52-bit VA/PA with 4K and 16K
pages), the dpISA extension and Rust enabled on arm64. The changes are
mostly contained within the usual arch/arm64/, drivers/perf, the arm64
Documentation and kselftests. The exception is the Rust support which
touches some generic build files.
Summary:
- Reorganise the arm64 kernel VA space and add support for LPA2 (at
stage 1, KVM stage 2 was merged earlier) - 52-bit VA/PA address
range with 4KB and 16KB pages
- Enable Rust on arm64
- Support for the 2023 dpISA extensions (data processing ISA), host
only
- arm64 perf updates:
- StarFive's StarLink (integrates one or more CPU cores with a
shared L3 memory system) PMU support
- Enable HiSilicon Erratum 162700402 quirk for HIP09
- Several updates for the HiSilicon PCIe PMU driver
- Arm CoreSight PMU support
- Convert all drivers under drivers/perf/ to use .remove_new()
- Miscellaneous:
- Don't enable workarounds for "rare" errata by default
- Clean up the DAIF flags handling for EL0 returns (in preparation
for NMI support)
- Kselftest update for ptrace()
- Update some of the sysreg field definitions
- Slight improvement in the code generation for inline asm I/O
accessors to permit offset addressing
- kretprobes: acquire regs via a BRK exception (previously done
via a trampoline handler)
- SVE/SME cleanups, comment updates
- Allow CALL_OPS+CC_OPTIMIZE_FOR_SIZE with clang (previously
disabled due to gcc silently ignoring -falign-functions=N)"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (134 commits)
Revert "mm: add arch hook to validate mmap() prot flags"
Revert "arm64: mm: add support for WXN memory translation attribute"
Revert "ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512"
ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512
kselftest/arm64: Add 2023 DPISA hwcap test coverage
kselftest/arm64: Add basic FPMR test
kselftest/arm64: Handle FPMR context in generic signal frame parser
arm64/hwcap: Define hwcaps for 2023 DPISA features
arm64/ptrace: Expose FPMR via ptrace
arm64/signal: Add FPMR signal handling
arm64/fpsimd: Support FEAT_FPMR
arm64/fpsimd: Enable host kernel access to FPMR
arm64/cpufeature: Hook new identification registers up to cpufeature
docs: perf: Fix build warning of hisi-pcie-pmu.rst
perf: starfive: Only allow COMPILE_TEST for 64-bit architectures
MAINTAINERS: Add entry for StarFive StarLink PMU
docs: perf: Add description for StarFive's StarLink PMU
dt-bindings: perf: starfive: Add JH8100 StarLink PMU
perf: starfive: Add StarLink PMU support
docs: perf: Update usage for target filter of hisi-pcie-pmu
...
|
||
|
|
0c5ade742e |
Merge branches 'for-next/reorg-va-space', 'for-next/rust-for-arm64', 'for-next/misc', 'for-next/daif-cleanup', 'for-next/kselftest', 'for-next/documentation', 'for-next/sysreg' and 'for-next/dpisa', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf: (39 commits)
docs: perf: Fix build warning of hisi-pcie-pmu.rst
perf: starfive: Only allow COMPILE_TEST for 64-bit architectures
MAINTAINERS: Add entry for StarFive StarLink PMU
docs: perf: Add description for StarFive's StarLink PMU
dt-bindings: perf: starfive: Add JH8100 StarLink PMU
perf: starfive: Add StarLink PMU support
docs: perf: Update usage for target filter of hisi-pcie-pmu
drivers/perf: hisi_pcie: Merge find_related_event() and get_event_idx()
drivers/perf: hisi_pcie: Relax the check on related events
drivers/perf: hisi_pcie: Check the target filter properly
drivers/perf: hisi_pcie: Add more events for counting TLP bandwidth
drivers/perf: hisi_pcie: Fix incorrect counting under metric mode
drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_event_ctrl_val()
drivers/perf: hisi_pcie: Rename hisi_pcie_pmu_{config,clear}_filter()
drivers/perf: hisi: Enable HiSilicon Erratum 162700402 quirk for HIP09
perf/arm_cspmu: Add devicetree support
dt-bindings/perf: Add Arm CoreSight PMU
perf/arm_cspmu: Simplify counter reset
perf/arm_cspmu: Simplify attribute groups
perf/arm_cspmu: Simplify initialisation
...
* for-next/reorg-va-space:
: Reorganise the arm64 kernel VA space in preparation for LPA2 support
: (52-bit VA/PA).
arm64: kaslr: Adjust randomization range dynamically
arm64: mm: Reclaim unused vmemmap region for vmalloc use
arm64: vmemmap: Avoid base2 order of struct page size to dimension region
arm64: ptdump: Discover start of vmemmap region at runtime
arm64: ptdump: Allow all region boundaries to be defined at boot time
arm64: mm: Move fixmap region above vmemmap region
arm64: mm: Move PCI I/O emulation region above the vmemmap region
* for-next/rust-for-arm64:
: Enable Rust support for arm64
arm64: rust: Enable Rust support for AArch64
rust: Refactor the build target to allow the use of builtin targets
* for-next/misc:
: Miscellaneous arm64 patches
ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512
arm64: Remove enable_daif macro
arm64/hw_breakpoint: Directly use ESR_ELx_WNR for an watchpoint exception
arm64: cpufeatures: Clean up temporary variable to simplify code
arm64: Update setup_arch() comment on interrupt masking
arm64: remove unnecessary ifdefs around is_compat_task()
arm64: ftrace: Don't forbid CALL_OPS+CC_OPTIMIZE_FOR_SIZE with Clang
arm64/sme: Ensure that all fields in SMCR_EL1 are set to known values
arm64/sve: Ensure that all fields in ZCR_EL1 are set to known values
arm64/sve: Document that __SVE_VQ_MAX is much larger than needed
arm64: make member of struct pt_regs and it's offset macro in the same order
arm64: remove unneeded BUILD_BUG_ON assertion
arm64: kretprobes: acquire the regs via a BRK exception
arm64: io: permit offset addressing
arm64: errata: Don't enable workarounds for "rare" errata by default
* for-next/daif-cleanup:
: Clean up DAIF handling for EL0 returns
arm64: Unmask Debug + SError in do_notify_resume()
arm64: Move do_notify_resume() to entry-common.c
arm64: Simplify do_notify_resume() DAIF masking
* for-next/kselftest:
: Miscellaneous arm64 kselftest patches
kselftest/arm64: Test that ptrace takes effect in the target process
* for-next/documentation:
: arm64 documentation patches
arm64/sme: Remove spurious 'is' in SME documentation
arm64/fp: Clarify effect of setting an unsupported system VL
arm64/sme: Fix cut'n'paste in ABI document
arm64/sve: Remove bitrotted comment about syscall behaviour
* for-next/sysreg:
: sysreg updates
arm64/sysreg: Update ID_AA64DFR0_EL1 register
arm64/sysreg: Update ID_DFR0_EL1 register fields
arm64/sysreg: Add register fields for ID_AA64DFR1_EL1
* for-next/dpisa:
: Support for 2023 dpISA extensions
kselftest/arm64: Add 2023 DPISA hwcap test coverage
kselftest/arm64: Add basic FPMR test
kselftest/arm64: Handle FPMR context in generic signal frame parser
arm64/hwcap: Define hwcaps for 2023 DPISA features
arm64/ptrace: Expose FPMR via ptrace
arm64/signal: Add FPMR signal handling
arm64/fpsimd: Support FEAT_FPMR
arm64/fpsimd: Enable host kernel access to FPMR
arm64/cpufeature: Hook new identification registers up to cpufeature
|
||
|
|
203f2b95a8 |
arm64/fpsimd: Support FEAT_FPMR
FEAT_FPMR defines a new EL0 accessible register FPMR use to configure the FP8 related features added to the architecture at the same time. Detect support for this register and context switch it for EL0 when present. Due to the sharing of responsibility for saving floating point state between the host kernel and KVM FP8 support is not yet implemented in KVM and a stub similar to that used for SVCR is provided for FPMR in order to avoid bisection issues. To make it easier to share host state with the hypervisor we store FPMR as a hardened usercopy field in uw (along with some padding). Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-3-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
|
|
93576e3498 |
arm64/sme: Ensure that all fields in SMCR_EL1 are set to known values
At present nothing in our CPU initialisation code ever sets unknown fields in SMCR_EL1 to known values, all updates to SMCR_EL1 are read/modify/write sequences. All the unknown fields are RES0, explicitly initialise them as such to avoid future surprises. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240213-arm64-fp-init-vec-cr-v1-2-7e7c2d584f26@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
|
|
2f0090549b |
arm64/sve: Ensure that all fields in ZCR_EL1 are set to known values
At present nothing in our CPU initialisation code ever sets unknown fields in ZCR_EL1 to known values, all updates to ZCR_EL1 are read/modify/write sequences for LEN. All the unknown fields are RES0, explicitly initialise them as such to avoid future surprises. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240213-arm64-fp-init-vec-cr-v1-1-7e7c2d584f26@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
|
|
d7b77a0d56 |
arm64/sme: Restore SMCR_EL1.EZT0 on exit from suspend
The fields in SMCR_EL1 reset to an architecturally UNKNOWN value. Since we
do not otherwise manage the traps configured in this register at runtime we
need to reconfigure them after a suspend in case nothing else was kind
enough to preserve them for us. Do so for SMCR_EL1.EZT0.
Fixes:
|
||
|
|
9533864816 |
arm64/sme: Restore SME registers on exit from suspend
The fields in SMCR_EL1 and SMPRI_EL1 reset to an architecturally UNKNOWN
value. Since we do not otherwise manage the traps configured in this
register at runtime we need to reconfigure them after a suspend in case
nothing else was kind enough to preserve them for us.
The vector length will be restored as part of restoring the SME state for
the next SME using task.
Fixes:
|
||
|
|
61da7c8e2a |
arm64/signal: Don't assume that TIF_SVE means we saved SVE state
When we are in a syscall we will only save the FPSIMD subset even though
the task still has access to the full register set, and on context switch
we will only remove TIF_SVE when loading the register state. This means
that the signal handling code should not assume that TIF_SVE means that
the register state is stored in SVE format, it should instead check the
format that was recorded during save.
Fixes:
|
||
|
|
dc7eb87557 |
arm64/sme: Always exit sme_alloc() early with existing storage
When sme_alloc() is called with existing storage and we are not flushing we
will always allocate new storage, both leaking the existing storage and
corrupting the state. Fix this by separating the checks for flushing and
for existing storage as we do for SVE.
Callers that reallocate (eg, due to changing the vector length) should
call sme_free() themselves.
Fixes:
|
||
|
|
8410186ca4 |
arm64/fpsimd: Remove spurious check for SVE support
There is no need to check for SVE support when changing vector lengths,
even if the system is SME only we still need SVE storage for the streaming
SVE state.
Fixes:
|
||
|
|
79eb42b269 |
Merge branch 'for-next/fpsimd' into for-next/core
* for-next/fpsimd: arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD arm64: fpsimd: Preserve/restore kernel mode NEON at context switch arm64: fpsimd: Drop unneeded 'busy' flag |
||
|
|
63a2d92e14 |
arm64: Cleanup system cpucap handling
Recent changes to remove cpus_have_const_cap() introduced new users of
cpus_have_cap() in the period between detecting system cpucaps and
patching alternatives. It would be preferable to defer these until after
the relevant cpucaps have been patched so that these can use the usual
feature check helper functions, which is clearer and has less risk of
accidental usage of code relying upon an alternative which has not yet
been patched.
This patch reworks the system-wide cpucap detection and patching to
minimize this transient period:
* The detection, enablement, and patching of system cpucaps is moved
into a new setup_system_capabilities() function so that these can be
grouped together more clearly, with no other functions called in the
period between detection and patching. This is called from
setup_system_features() before the subsequent checks that depend on
the cpucaps.
The logging of TTBR0 PAN and cpucaps with a mask is also moved here to
keep these as close as possible to update_cpu_capabilities().
At the same time, comments are corrected and improved to make the
intent clearer.
* As hyp_mode_check() only tests system register values (not hwcaps) and
must be called prior to patching, the call to hyp_mode_check() is
moved before the call to setup_system_features().
* In setup_system_features(), the use of system_uses_ttbr0_pan() is
restored, now that this occurs after alternatives are patched. This is
a partial revert of commit:
|