ASoC: Fixes for v7.1

A bigger batch of fixes than usual due to -next not happening last week,
 this is mostly stuff for laptops - a lot of quirks and small fixes,
 mainly for x86 and SoundWire.  Nothing too big or exciting individually,
 just two weeks' worth.

Merge tag 'asoc-fix-v7.1-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

Takashi Iwai 2026-05-22 08:25:18 +02:00
commit 2519003dd5
854 changed files with 12051 additions and 5947 deletions

@ -682,6 +682,7 @@ Peter A Jonsson <pj@ludd.ltu.se>
Peter Hilber <peter.hilber@oss.qualcomm.com> <quic_philber@quicinc.com>
Peter Oruba <peter.oruba@amd.com>
Peter Oruba <peter@oruba.de>
Peter Rosin <peda@lysator.liu.se> <peda@axentia.se>
Pierre-Louis Bossart <pierre-louis.bossart@linux.dev> <pierre-louis.bossart@linux.intel.com>
Pratyush Anand <pratyush.anand@gmail.com> <pratyush.anand@st.com>
Pratyush Yadav <pratyush@kernel.org> <ptyadav@amazon.de>
@ -856,6 +857,7 @@ Tobias Klauser <tklauser@distanz.ch> <klto@zhaw.ch>
Tobias Klauser <tklauser@distanz.ch> <tklauser@nuerscht.ch>
Tobias Klauser <tklauser@distanz.ch> <tklauser@xenon.tklauser.home>
Todor Tomov <todor.too@gmail.com> <todor.tomov@linaro.org>
Tomasz Jeznach <tomasz.jeznach@linux.dev> <tjeznach@rivosinc.com>
Tony Luck <tony.luck@intel.com>
Trilok Soni <quic_tsoni@quicinc.com> <tsoni@codeaurora.org>
TripleX Chung <xxx.phy@gmail.com> <triplex@zh-kernel.org>

@ -47,21 +47,19 @@ Please note that implementation details can be changed.
Called when swp_entry's refcnt goes down to 0. A charge against swap
disappears.
3. charge-commit-cancel
3. charge-commit
=======================
Memcg pages are charged in two steps:
- mem_cgroup_try_charge()
- mem_cgroup_commit_charge() or mem_cgroup_cancel_charge()
- commit_charge()
At try_charge(), there are no flags to say "this page is charged".
at this point, usage += PAGE_SIZE.
At commit(), the page is associated with the memcg.
At cancel(), simply usage -= PAGE_SIZE.
Under below explanation, we assume CONFIG_SWAP=y.
4. Anonymous

@ -21,13 +21,13 @@ call at each patchable function entry, and patches it dynamically at runtime to
enable or disable the redirection. In the case of RISC-V, 2 instructions,
AUIPC + JALR, are required to compose a function call. However, it is impossible
to patch 2 instructions and expect that a concurrent read-side executes them
without a race condition. This series makes atmoic code patching possible in
without a race condition. This series makes atomic code patching possible in
RISC-V ftrace. Kernel preemption makes things even worse as it allows the old
state to persist across the patching process with stop_machine().
In order to get rid of stop_machine() and run dynamic ftrace with full kernel
preemption, we partially initialize each patchable function entry at boot-time,
setting the first instruction to AUIPC, and the second to NOP. Now, atmoic
setting the first instruction to AUIPC, and the second to NOP. Now, atomic
patching is possible because the kernel only has to update one instruction.
According to Ziccif, as long as an instruction is naturally aligned, the ISA
guarantees an atomic update.
@ -36,8 +36,8 @@ By fixing down the first instruction, AUIPC, the range of the ftrace trampoline
is limited to +-2K from the predetermined target, ftrace_caller, due to the lack
of immediate encoding space in RISC-V. To address the issue, we introduce
CALL_OPS, where an 8B naturally align metadata is added in front of each
pacthable function. The metadata is resolved at the first trampoline, then the
execution can be derect to another custom trampoline.
patchable function. The metadata is resolved at the first trampoline, then the
execution can be directed to another custom trampoline.
CMODX in the User Space
-----------------------

@ -78,7 +78,7 @@ the program.
Per-task indirect branch tracking state can be monitored and
controlled via the :c:macro:`PR_GET_CFI` and :c:macro:`PR_SET_CFI`
``prctl()` arguments (respectively), by supplying
``prctl()`` arguments (respectively), by supplying
:c:macro:`PR_CFI_BRANCH_LANDING_PADS` as the second argument. These
are architecture-agnostic, and will return -EINVAL if the underlying
functionality is not supported.

@ -16,10 +16,15 @@ allOf:
properties:
compatible:
enum:
- amlogic,meson6-i2c # Meson6, Meson8 and compatible SoCs
- amlogic,meson-gxbb-i2c # GXBB and compatible SoCs
- amlogic,meson-axg-i2c # AXG and compatible SoCs
oneOf:
- items:
- enum:
- amlogic,t7-i2c
- const: amlogic,meson-axg-i2c
- enum:
- amlogic,meson6-i2c # Meson6, Meson8 and compatible SoCs
- amlogic,meson-gxbb-i2c # GXBB and compatible SoCs
- amlogic,meson-axg-i2c # AXG and compatible SoCs
reg:
maxItems: 1

@ -22,7 +22,9 @@ properties:
compatible:
oneOf:
- items:
- const: apple,t6020-i2c
- enum:
- apple,t6020-i2c
- apple,t8122-i2c
- const: apple,t8103-i2c
- items:
- enum:

@ -18,7 +18,9 @@ properties:
description: Phandles of rt5650 and rt5514 codecs
items:
- description: phandle of rt5650 codec
maxItems: 1
- description: phandle of rt5514 codec
maxItems: 1
mediatek,platform:
$ref: /schemas/types.yaml#/definitions/phandle

@ -22,5 +22,5 @@ The following sensors are supported
sysfs-Interface
---------------
temp0_input
temp1_input
- Temperature of external NTC (milli-degree C)

@ -135,4 +135,4 @@ References
4. **Lenovo IdeaPad Laptop Driver:** Reference for DMI-based hardware
feature gating in Lenovo laptops.
https://github.com/torvalds/linux/blob/master/drivers/platform/x86/ideapad-laptop.c
https://github.com/torvalds/linux/blob/master/drivers/platform/x86/lenovo/ideapad-laptop.c

@ -69,6 +69,15 @@ properties:
header:
description: For C-compatible languages, header which already defines this value.
type: string
scope:
description: |
Visibility of this definition. "uapi" (default) renders into
the uAPI header, "kernel" renders into the kernel-side
generated header, "user" renders into the user-side
generated header. When combined with `header:`, the
definition is not rendered, and the named header is
included only by code matching the scope.
enum: [ uapi, kernel, user ]
type:
enum: [ const, enum, flags ]
doc:
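
The `scope` rules described in the schema text above can be sketched as a small decision function. This is only an illustration in Python, not the actual generator code: the function names `render_target` and `includes_header` and the target labels are hypothetical, and only the three scope values and the `header:` interaction come from the description.

```python
from typing import Optional

# Labels are illustrative only; the real output file names are not given in the text.
TARGETS = {
    "uapi": "uAPI header",
    "kernel": "kernel-side generated header",
    "user": "user-side generated header",
}


def render_target(definition: dict) -> Optional[str]:
    """Return where a definition is rendered, or None if it is not rendered."""
    scope = definition.get("scope", "uapi")  # "uapi" is the documented default
    if scope not in TARGETS:
        raise ValueError(f"unknown scope: {scope!r}")
    if "header" in definition:
        # Combined with `header:`, the definition itself is not rendered.
        return None
    return TARGETS[scope]


def includes_header(definition: dict, code_scope: str) -> bool:
    """Return True if code of the given scope includes the named header."""
    if "header" not in definition:
        return False
    return definition.get("scope", "uapi") == code_scope
```

For example, a definition with `scope: kernel` and no `header:` would land in the kernel-side generated header, while the same definition with a `header:` would be skipped and its header pulled in only by kernel-scope code.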

@ -83,6 +83,15 @@ properties:
header:
description: For C-compatible languages, header which already defines this value.
type: string
scope:
description: |
Visibility of this definition. "uapi" (default) renders into
the uAPI header, "kernel" renders into the kernel-side
generated header, "user" renders into the user-side
generated header. When combined with `header:`, the
definition is not rendered, and the named header is
included only by code matching the scope.
enum: [ uapi, kernel, user ]
type:
enum: [ const, enum, flags, struct ] # Trim
doc:

@ -55,6 +55,15 @@ properties:
header:
description: For C-compatible languages, header which already defines this value.
type: string
scope:
description: |
Visibility of this definition. "uapi" (default) renders into
the uAPI header, "kernel" renders into the kernel-side
generated header, "user" renders into the user-side
generated header. When combined with `header:`, the
definition is not rendered, and the named header is
included only by code matching the scope.
enum: [ uapi, kernel, user ]
type:
enum: [ const, enum, flags ]
doc:

@ -87,6 +87,15 @@ properties:
header:
description: For C-compatible languages, header which already defines this value.
type: string
scope:
description: |
Visibility of this definition. "uapi" (default) renders into
the uAPI header, "kernel" renders into the kernel-side
generated header, "user" renders into the user-side
generated header. When combined with `header:`, the
definition is not rendered, and the named header is
included only by code matching the scope.
enum: [ uapi, kernel, user ]
type:
enum: [ const, enum, flags, struct ] # Trim
doc:

@ -33,6 +33,11 @@ doc: |
@cap-get operation.
definitions:
-
type: const
name: max-handle-id
value: 0x3fffffe
scope: kernel
-
type: enum
name: scope
@ -140,6 +145,8 @@ attribute-sets:
-
name: id
type: u32
checks:
max: max-handle-id
doc: |
Numeric identifier of a shaper. The id semantic depends on
the scope. For @queue scope it's the queue id and for @node
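
The effect of the new `max: max-handle-id` check on the `id` attribute can be sketched as follows. This is a hedged Python illustration, not the generated policy code: the helper name `handle_id_is_valid` is hypothetical, and treating the maximum as inclusive is an assumption. Only the constant value `0x3fffffe` and the `u32` type come from the spec change.

```python
MAX_HANDLE_ID = 0x3FFFFFE  # value of the `max-handle-id` const in the spec
U32_MAX = 0xFFFFFFFF       # upper bound of the attribute's u32 type


def handle_id_is_valid(value: int) -> bool:
    """Check a shaper id against the u32 range and the max-handle-id bound.

    Assumption: the `max` check is inclusive, so MAX_HANDLE_ID itself passes.
    """
    if not 0 <= value <= U32_MAX:
        return False
    return value <= MAX_HANDLE_ID
```

Under that assumption, `0x3fffffe` is accepted and `0x3ffffff` is rejected, although both fit in a `u32`.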

@ -86,6 +86,7 @@ regressions and security problems.
debugging/index
handling-regressions
security-bugs
threat-model
cve
embargoed-hardware-issues

@ -66,6 +66,42 @@ In addition, the following information are highly desirable:
the issue appear. It is useful to share them, as they can be helpful to
keep end users protected during the time it takes them to apply the fix.
What qualifies as a security bug
--------------------------------
It is important that most bugs are handled publicly so as to involve the widest
possible audience and find the best solution. By nature, bugs that are handled
in closed discussions between a small set of participants are less likely to
produce the best possible fix (e.g., risk of missing valid use cases, limited
testing abilities).
It turns out that the majority of the bugs reported via the security team are
just regular bugs that have been improperly qualified as security bugs due to
a lack of awareness of the Linux kernel's threat model, as described in
Documentation/process/threat-model.rst, and ought to have been sent through
the normal channels described in Documentation/admin-guide/reporting-issues.rst
instead.
The security list exists for urgent bugs that grant an attacker a capability
they are not supposed to have on a correctly configured production system, and
can be easily exploited, representing an imminent threat to many users. Before
reporting, consider whether the issue actually crosses a trust boundary on such
a system.
**If you resorted to AI assistance to identify a bug, you must treat it as
public**. While you may have valid reasons to believe it is not, the security
team's experience shows that bugs discovered this way systematically surface
simultaneously across multiple researchers, often on the same day. In this
case, do not publicly share a reproducer, as this could cause unintended harm;
just mention that one is available and maintainers might ask for it privately
if they need it.
If you are unsure whether an issue qualifies, err on the side of reporting
privately: the security team would rather triage a borderline report than miss
a real vulnerability. Reporting ordinary bugs to the security list, however,
does not make them move faster and consumes triage capacity that other reports
need.
Identifying contacts
--------------------
@ -74,7 +110,7 @@ affected subsystem's maintainers and Cc: the Linux kernel security team. Do
not send it to a public list at this stage, unless you have good reasons to
consider the issue as being public or trivial to discover (e.g. result of a
widely available automated vulnerability scanning tool that can be repeated by
anyone).
anyone, or use of AI-based tools).
If you're sending a report for issues affecting multiple parts in the kernel,
even if they're fairly similar issues, please send individual messages (think
@ -131,6 +167,64 @@ the Linux kernel security team only. Your message will be triaged, and you
will receive instructions about whom to contact, if needed. Your message may
equally be forwarded as-is to the relevant maintainers.
Responsible use of AI to find bugs
----------------------------------
A significant fraction of bug reports submitted to the security team are
actually the result of code reviews assisted by AI tools. While this can be an
efficient means to find bugs in rarely explored areas, it causes an overload on
maintainers, who are sometimes forced to ignore such reports due to their poor
quality or accuracy. As such, reporters must be particularly cautious about a
number of points which tend to make these reports needlessly difficult to
handle:
* **Length**: AI-generated reports tend to be excessively long, containing
multiple sections and excessive detail. This makes it difficult to spot
important information such as affected files, versions, and impact. Please
ensure that a clear summary of the problem and all critical details are
presented first. Do not require triage engineers to scan multiple pages of
text. Configure your tools to produce concise, human-style reports.
* **Formatting**: Most AI-generated reports are littered with Markdown tags.
These decorations complicate the search for important information and do
not survive the quoting processes involved in forwarding or replying.
Please **always convert your report to plain text** without any formatting
decorations before sending it.
* **Impact Evaluation**: Many AI-generated reports lack an understanding
of the kernel's threat model (see Documentation/process/threat-model.rst)
and go to great lengths inventing theoretical consequences. This adds
noise and complicates triage. Please stick to verifiable facts (e.g.,
"this bug permits any user to gain CAP_NET_ADMIN") without enumerating
speculative implications. Have your tool read this documentation as
part of the evaluation process.
* **Reproducer**: AI-based tools are often capable of generating reproducers.
Please always ensure your tool provides one and **test it thoroughly**. If
the reproducer does not work, or if the tool cannot produce one, the
validity of the report should be seriously questioned. Note that since the
report will be posted to a public list, the reproducer should only be
shared upon maintainers' request.
* **Propose a Fix**: Many AI tools are actually better at writing code than
evaluating it. Please ask your tool to propose a fix and **test it** before
reporting the problem. If the fix cannot be tested because it relies on
rare hardware or almost extinct network protocols, the issue is likely not
a security bug. In any case, if a fix is proposed, it must adhere to
Documentation/process/submitting-patches.rst and include a 'Fixes:' tag
designating the commit that introduced the bug.
Failure to consider these points exposes your report to the risk of being
ignored.
Use common sense when evaluating the report. If the affected file has not been
touched for more than one year and is maintained by a single individual, it is
likely that usage has declined and exposed users are virtually non-existent
(e.g., drivers for very old hardware, obsolete filesystems). In such cases,
there is no need to consume a maintainer's time with an unimportant report. If
the issue is clearly trivial and publicly discoverable, you should report it
directly to the public mailing lists.
Sending the report
------------------
@ -148,7 +242,15 @@ run additional tests. Reports where the reporter does not respond promptly
or cannot effectively discuss their findings may be abandoned if the
communication does not quickly improve.
The report must be sent to maintainers, with the security team in ``Cc:``.
The report must be sent to maintainers. If there are two or fewer
recipients in your message, you must also always Cc: the Linux kernel
security team who will ensure the message is delivered to the proper
people, and will be able to assist small maintainer teams with processes
they may not be familiar with. For larger teams, Cc: the Linux kernel
security team for your first few reports or when seeking specific help,
such as when resending a message which got no response within a week.
Once you have become comfortable with the process for a few reports, it is
no longer necessary to Cc: the security list when sending to large teams.
The Linux kernel security team can be contacted by email at
<security@kernel.org>. This is a private list of security officers
who will help verify the bug report and assist developers working on a fix.

@ -0,0 +1,235 @@
The Linux Kernel threat model
=============================
There are a lot of assumptions regarding what the kernel does and does not
protect against. These assumptions tend to cause confusion for bug reports
(:doc:`security-related ones <security-bugs>` vs :doc:`non-security ones
<../admin-guide/reporting-issues>`), and can complicate security enforcement
when the responsibilities for some boundaries are not clear between the kernel,
distros, administrators and users.
This document tries to clarify the responsibilities of the kernel in this
domain.
The kernel's responsibilities
-----------------------------
The kernel abstracts access to local hardware resources and to remote systems
in a way that allows multiple local users to get a fair share of the available
resources granted to them, and, when the underlying hardware permits, to assign
a level of confidentiality to their communications and to the data they are
processing or storing.
The kernel assumes that the underlying hardware behaves according to its
specifications. This includes the integrity of the CPU's instruction set, the
transparency of the branch prediction unit and the cache units, the consistency
of the Memory Management Unit (MMU), the isolation of DMA-capable peripherals
(e.g., via IOMMU), state transitions in controllers, ranges of values read from
registers, the respect of documented hardware limitations, etc.
When hardware fails to maintain its specified isolation (e.g., CPU bugs,
side-channels, hardware response to unexpected inputs), the kernel will usually
attempt to implement reasonable mitigations. These are best-effort measures
intended to reduce the attack surface or elevate the cost of an attack within
the limits of the hardware's facilities; they do not constitute a
kernel-provided safety guarantee.
Users always perform their activities under the authority of an administrator
who is able to grant or deny various types of permissions that may affect how
users benefit from available resources, or the level of confidentiality of
their activities. Administrators may also delegate all or part of their own
permissions to some users, notably but not exclusively via capabilities. All this
is performed via configuration (sysctl, file-system permissions etc).
The Linux Kernel applies a certain collection of default settings that match
its threat model. Distros have their own threat model and will come with their
own configuration presets, that the administrator may have to adjust to better
suit their expectations (relax or restrict).
By default, the Linux Kernel guarantees the following protections when running
on common processors featuring privilege levels and memory management units:
* **User-based isolation**: an unprivileged user may restrict access to their
own data from other unprivileged users running on the same system. This
includes:
* stored data, via file system permissions
* in-memory data (pages are not accessible by default to other users)
* process activity (ptrace is not permitted to other users)
* inter-process communication (other users may not observe data exchanged via
UNIX domain sockets or other IPC mechanisms).
* network communications within the same or with other systems
* **Capability-based protection**:
* users not having elevated capabilities (including but not limited to
CAP_SYS_ADMIN) may not alter the
kernel's configuration, memory nor state, change other users' view of the
file system layout, grant any user capabilities they do not have, nor
affect the system's availability (shutdown, reboot, panic, hang, or making
the system unresponsive via unbounded resource exhaustion).
* users not having the ``CAP_NET_ADMIN`` capability may not alter the network
configuration, intercept nor spoof network communications from other users
nor systems.
* users not having ``CAP_SYS_PTRACE`` may not observe other users' processes
activities.
When ``CONFIG_USER_NS`` is set, the kernel also permits unprivileged users to
create their own user namespace in which they have all capabilities, but with a
number of restrictions (they may not perform actions that have impacts on the
initial user namespace, such as changing time, loading modules or mounting
block devices). Please refer to ``user_namespaces(7)`` for more details; the
possibilities of user namespaces are not covered in this document.
The kernel also offers a lot of troubleshooting and debugging facilities, which
can constitute attack vectors when placed in wrong hands. While some of them
are designed to be accessible to regular local users with a low risk (e.g.
kernel logs via ``/proc/kmsg``), some would expose enough information to
represent a risk in most places and the decision to expose them is under the
administrator's responsibility (perf events, traces), and others are not
designed to be accessed by non-privileged users (e.g. debugfs). Access to these
facilities by a user who has been explicitly granted permission by an
administrator does not constitute a security breach.
Bugs that permit violating the principles above constitute security breaches.
However, bugs that permit one violation only once another one was already
achieved are only weaknesses. The kernel applies a number of self-protection
measures whose purpose is to avoid crossing a security boundary when certain
classes of bugs are found, but a failure of these extra protections does not
constitute a vulnerability alone.
What does not constitute a security bug
---------------------------------------
In the Linux kernel's threat model, the following classes of problems are
**NOT** considered as Linux Kernel security bugs. However, when it is believed
that the kernel could do better, they should be reported, so that they can be
reviewed and fixed where reasonably possible, but they will be handled as any
regular bug:
* **Configuration**:
* outdated kernels and particularly end-of-life branches are out of the scope
of the kernel's threat model: administrators are responsible for keeping
their system up to date. For a bug to qualify as a security bug, it must be
demonstrated that it affects actively maintained versions.
* build-level: changes to the kernel configuration that are explicitly
documented as lowering the security level (e.g. ``CONFIG_NOMMU``), or
targeted at developers only.
* OS-level: changes to command line parameters, sysctls, filesystem
permissions, user capabilities, exposure of privileged interfaces, that
explicitly increase exposure by either offering non-default access to
unprivileged users, or reduce the kernel's ability to enforce some
protections or mitigations. Example: write access to procfs or debugfs.
* issues triggered only when using features intended for development or
debugging (e.g., LOCKDEP, KASAN, FAULT_INJECTION): these features are known
to introduce overhead and potential instability and are not intended for
production use.
* issues affecting drivers exposed under CONFIG_STAGING, as well as features
marked EXPERIMENTAL in the configuration.
* loading of explicitly insecure/broken/staging modules, and generally
using any subsystem marked as experimental or not intended for production
use.
* running out-of-tree modules or unofficial kernel forks; these should be
reported to the relevant vendor.
* **Excess of initial privileges**:
* actions performed by a user already possessing the privileges required to
perform that action or modify that state (e.g. ``CAP_SYS_ADMIN``,
``CAP_NET_ADMIN``, ``CAP_SYS_RAWIO``, ``CAP_SYS_MODULE`` with no further
boundary being crossed).
* actions performed in user namespace that do not bypass the restrictions
imposed on the initial user (e.g. ptrace usage, signal delivery, resource
usage, access to FS/device/sysctl/memory, network binding, system/network
configuration etc).
* anything performed by the root user in the initial namespace (e.g. kernel
oops when writing to a privileged device).
* **Out of production use**:
This covers theoretical/probabilistic attacks that rely on laboratory
conditions with zero system noise, or those requiring an unrealistic number
of attempts (e.g., billions of trials) that would be detected by standard
system monitoring long before success, such as:
* prediction of random numbers that only works in a totally silent
environment (such as IP ID, TCP ports or sequence numbers that can only be
guessed in a lab).
* activity observation and information leaks based on probabilistic
approaches that are prone to measurement noise and not realistically
reproducible on a production system.
* issues that can only be triggered by heavy attacks (e.g. brute force) whose
impact on the system makes it unlikely or impossible to remain undetected
before they succeed (e.g. consuming all memory before succeeding).
* problems seen only under development simulators, emulators, or combinations
that do not exist on real systems at the time of reporting (issues
involving tens of millions of threads, tens of thousands of CPUs,
unrealistic CPU frequencies, RAM sizes or disk capacities, or network speeds).
* issues whose reproduction requires hardware modification or emulation,
including fake USB devices that pretend to be another one.
* as well as issues that can be triggered at a cost that is orders of
magnitude higher than the expected benefits (e.g. fully functional keyboard
emulator only to retrieve 7 uninitialized bytes in a structure, or
brute-force method involving millions of connection attempts to guess a
port number).
* **Hardening failures**:
* ability to bypass some of the kernel's hardening measures with no
demonstrable exploit path (e.g. ASLR bypass, events timing or probing with
no demonstrable consequence). These are just weaknesses, not
vulnerabilities.
* missing argument checks and failure to report certain errors with no
immediate consequence.
* **Random information leaks**:
This concerns information leaks of small data parts that happen to be there
and that cannot be chosen by the attacker, or face access restrictions:
* structure padding reported by syscalls or other interfaces.
* identifiers, partial data, non-terminated strings reported in error
messages.
* Leaks of kernel memory addresses/pointers do not constitute an immediately
exploitable vector and are not security bugs, though they must be reported
and fixed.
* **Crafted file system images**:
* bugs triggered by mounting a corrupted or maliciously crafted file system
image are generally not security bugs, as the kernel assumes the underlying
storage media is under the administrator's control, unless the filesystem
driver is specifically documented as being hardened against untrusted media.
* issues that are resolved, mitigated, or detected by running a filesystem
consistency check (fsck) on the image prior to mounting.
* **Physical access**:
Issues that require physical access to the machine, hardware modification, or
the use of specialized hardware (e.g., logic analyzers, DMA-attack tools over
PCI-E/Thunderbolt) are out of scope unless the system is explicitly
configured with technologies meant to defend against such attacks
(e.g. IOMMU).
* **Functional and performance regressions**:
Any issue that can be mitigated by setting proper permissions and limits
doesn't qualify as a security bug.

@ -24,6 +24,97 @@ Quick access to CPU number, node ID
Allows to implement per CPU data efficiently. Documentation is in code and
selftests. :(
Optimized RSEQ V2
-----------------
On architectures which utilize the generic entry code and generic TIF bits
the kernel supports runtime optimizations for RSEQ, which also enable
enhanced features like scheduler time slice extensions.
To enable them a task has to register the RSEQ region with at least the
length advertised by getauxval(AT_RSEQ_FEATURE_SIZE).
If existing binaries register with RSEQ_ORIG_SIZE (32 bytes), the kernel
keeps the legacy low performance mode enabled to fulfil the expectations
of existing users regarding the original RSEQ implementation behaviour.
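
The selection between the two modes described above can be sketched as a small function. This is a rough Python model under stated assumptions, not kernel code: the name `rseq_mode` is hypothetical, `feature_size` stands for the value a task would obtain from getauxval(AT_RSEQ_FEATURE_SIZE), and checking the legacy size first is an assumption. The text does not say what happens for other lengths, so those are reported as unspecified.

```python
RSEQ_ORIG_SIZE = 32  # legacy registration length named in the text


def rseq_mode(registered_len: int, feature_size: int) -> str:
    """Classify an RSEQ registration by its length.

    feature_size models getauxval(AT_RSEQ_FEATURE_SIZE); it is passed in
    so that this sketch stays independent of the running kernel.
    """
    if registered_len == RSEQ_ORIG_SIZE:
        # Existing binaries: the kernel keeps the legacy mode enabled.
        return "legacy"
    if registered_len >= feature_size:
        # At least the advertised length: optimized V2 is enabled.
        return "optimized-v2"
    # Not covered by the documentation quoted above.
    return "unspecified"
```

For instance, with a hypothetical advertised size of 40 bytes, a 32-byte registration stays in legacy mode and a 40-byte registration selects the optimized V2 mode.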
The following table documents the ABI and behavioral guarantees of the
legacy and the optimized V2 mode.
.. list-table:: RSEQ modes
:header-rows: 1
* - Nr
- What
- Legacy
- Optimized V2
* - 1
- The cpu_id_start, cpu_id, node_id and mm_cid fields (User mode read
only)
.. Legacy
- Updated by the kernel unconditionally after each context switch and
before signal delivery
.. Optimized V2
- Updated by the kernel if and only if they change, i.e. if the task
is migrated or mm_cid changes
* - 2
- The rseq_cs critical section field
.. Legacy
- Evaluated and handled unconditionally after each context switch and
before signal delivery
.. Optimized V2
- Evaluated and handled conditionally only when user space was
interrupted and was scheduled out or before delivering a signal in
the interrupted context.
* - 3
- Read only fields
.. Legacy
- No strict enforcement except in debug mode
.. Optimized V2
- Strict enforcement
* - 4
- membarrier(...RSEQ)
.. Legacy
- All running threads of the process are interrupted and the ID fields
are rewritten and eventually active critical sections are aborted
before they return to user space. All threads which are scheduled
out whether voluntary or not are covered by #1/#2 above.
.. Optimized V2
- All running threads of the process are interrupted and eventually
active critical sections are aborted before these threads return to
user space. The ID fields are only updated if changed as a
consequence of the interrupt. All threads which are scheduled out
whether voluntary or not are covered by #1/#2 above.
* - 5
- Time slice extensions
.. Legacy
- Not supported
.. Optimized V2
- Supported
The legacy mode is obviously less performant as it does unconditional
updates and critical section checks even if not strictly required by the
ABI contract. That can't be changed anymore as some users depend on that
observed behavior, which in turn enables them to violate the ABI and
overwrite the cpu_id_start field for their own purposes. This is obviously
discouraged as it renders RSEQ incompatible with the intended usage and
breaks the expectation of other libraries in the same application.
The ABI compliant optimized v2 mode, which respects the read only fields,
does not require unconditional updates and therefore is way more
performant. The kernel validates the read only fields for compliance. If
user space modifies them, the process is killed. Compliant usage allows
multiple libraries in the same application to benefit from the RSEQ
functionality without disturbing each other. The ABI compliant optimized v2
mode also enables extended RSEQ features like time slice extensions.
Scheduler time slice extensions
-------------------------------
@ -37,7 +128,8 @@ The prerequisites for this functionality are:
* Enabled at boot time (default is enabled)
* A rseq userspace pointer has been registered for the thread
* A rseq userspace pointer has been registered for the thread in
optimized V2 mode
The thread has to enable the functionality via prctl(2)::

@ -656,8 +656,8 @@ References
See [white-paper]_, [api-spec]_, [amd-apm]_, [kvm-forum]_, and [snp-fw-abi]_
for more info.
.. [white-paper] https://developer.amd.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf
.. [api-spec] https://support.amd.com/TechDocs/55766_SEV-KM_API_Specification.pdf
.. [amd-apm] https://support.amd.com/TechDocs/24593.pdf (section 15.34)
.. [white-paper] https://docs.amd.com/v/u/en-US/memory-encryption-white-paper
.. [api-spec] https://docs.amd.com/v/u/en-US/55766_PUB_3.24_SEV_API
.. [amd-apm] https://docs.amd.com/v/u/en-US/24593_3.44_APM_Vol2 (section 15.34)
.. [kvm-forum] https://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf
.. [snp-fw-abi] https://www.amd.com/system/files/TechDocs/56860.pdf
.. [snp-fw-abi] https://www.amd.com/content/dam/amd/en/documents/developer/56860.pdf

@ -68,6 +68,12 @@ Maintainers List
first. When adding to this list, please keep the entries in
alphabetical order.
3C509 NETWORK DRIVER
M: "Maciej W. Rozycki" <macro@orcam.me.uk>
L: netdev@vger.kernel.org
S: Maintained
F: drivers/net/ethernet/3com/3c509.c
3C59X NETWORK DRIVER
M: Steffen Klassert <klassert@kernel.org>
L: netdev@vger.kernel.org
@ -2015,7 +2021,7 @@ F: Documentation/hwmon/aquacomputer_d5next.rst
F: drivers/hwmon/aquacomputer_d5next.c
AQUANTIA ETHERNET DRIVER (atlantic)
M: Igor Russkikh <irusskikh@marvell.com>
M: Sukhdeep Singh <sukhdeeps@marvell.com>
L: netdev@vger.kernel.org
S: Maintained
W: https://www.marvell.com/
@ -2024,7 +2030,7 @@ F: Documentation/networking/device_drivers/ethernet/aquantia/atlantic.rst
F: drivers/net/ethernet/aquantia/atlantic/
AQUANTIA ETHERNET DRIVER PTP SUBSYSTEM
M: Egor Pomozov <epomozov@marvell.com>
M: Sukhdeep Singh <sukhdeeps@marvell.com>
L: netdev@vger.kernel.org
S: Maintained
W: http://www.aquantia.com
@ -4181,8 +4187,8 @@ F: include/uapi/linux/sonet.h
F: net/atm/
ATMEL MACB ETHERNET DRIVER
M: Nicolas Ferre <nicolas.ferre@microchip.com>
M: Claudiu Beznea <claudiu.beznea@tuxon.dev>
M: Théo Lebrun <theo.lebrun@bootlin.com>
R: Conor Dooley <conor.dooley@microchip.com>
S: Maintained
F: drivers/net/ethernet/cadence/
@ -4299,18 +4305,16 @@ F: Documentation/devicetree/bindings/leds/backlight/awinic,aw99706.yaml
F: drivers/video/backlight/aw99706.c
AXENTIA ARM DEVICES
M: Peter Rosin <peda@axentia.se>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
S: Orphan
F: arch/arm/boot/dts/microchip/at91-linea.dtsi
F: arch/arm/boot/dts/microchip/at91-natte.dtsi
F: arch/arm/boot/dts/microchip/at91-nattis-2-natte-2.dts
F: arch/arm/boot/dts/microchip/at91-tse850-3.dts
AXENTIA ASOC DRIVERS
M: Peter Rosin <peda@axentia.se>
L: linux-sound@vger.kernel.org
S: Maintained
S: Orphan
F: Documentation/devicetree/bindings/sound/axentia,*
F: sound/soc/atmel/tse850-pcm5142.c
@ -6358,6 +6362,7 @@ F: include/uapi/linux/comedi.h
COMMON CLK FRAMEWORK
M: Michael Turquette <mturquette@baylibre.com>
M: Stephen Boyd <sboyd@kernel.org>
R: Brian Masney <bmasney@redhat.com>
L: linux-clk@vger.kernel.org
S: Maintained
Q: http://patchwork.kernel.org/project/linux-clk/list/
@ -7077,6 +7082,12 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/debugobjec
F: include/linux/debugobjects.h
F: lib/debugobjects.c
DEC LANCE NETWORK DRIVER
M: "Maciej W. Rozycki" <macro@orcam.me.uk>
L: netdev@vger.kernel.org
S: Maintained
F: drivers/net/ethernet/amd/declance.c
DECSTATION PLATFORM SUPPORT
M: "Maciej W. Rozycki" <macro@orcam.me.uk>
L: linux-mips@vger.kernel.org
@ -8193,10 +8204,9 @@ F: include/uapi/drm/nouveau_drm.h
CORE DRIVER FOR NVIDIA GPUS [RUST]
M: Danilo Krummrich <dakr@kernel.org>
M: Alexandre Courbot <acourbot@nvidia.com>
L: nouveau@lists.freedesktop.org
L: nova-gpu@lists.linux.dev
S: Supported
W: https://rust-for-linux.com/nova-gpu-driver
Q: https://patchwork.freedesktop.org/project/nouveau/
B: https://gitlab.freedesktop.org/drm/nova/-/issues
C: irc://irc.oftc.net/nouveau
T: git https://gitlab.freedesktop.org/drm/rust/kernel.git drm-rust-next
@ -8205,10 +8215,9 @@ F: drivers/gpu/nova-core/
DRM DRIVER FOR NVIDIA GPUS [RUST]
M: Danilo Krummrich <dakr@kernel.org>
L: nouveau@lists.freedesktop.org
L: nova-gpu@lists.linux.dev
S: Supported
W: https://rust-for-linux.com/nova-gpu-driver
Q: https://patchwork.freedesktop.org/project/nouveau/
B: https://gitlab.freedesktop.org/drm/nova/-/issues
C: irc://irc.oftc.net/nouveau
T: git https://gitlab.freedesktop.org/drm/rust/kernel.git drm-rust-next
@ -12046,7 +12055,7 @@ F: Documentation/i2c/busses/i2c-nvidia-gpu.rst
F: drivers/i2c/busses/i2c-nvidia-gpu.c
I2C MUXES
M: Peter Rosin <peda@axentia.se>
M: Peter Rosin <peda@lysator.liu.se>
L: linux-i2c@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/i2c/i2c-arb*
@ -12447,7 +12456,7 @@ F: drivers/iio/industrialio-backend.c
F: include/linux/iio/backend.h
IIO DIGITAL POTENTIOMETER DAC
M: Peter Rosin <peda@axentia.se>
M: Peter Rosin <peda@lysator.liu.se>
L: linux-iio@vger.kernel.org
S: Maintained
F: Documentation/ABI/testing/sysfs-bus-iio-dac-dpot-dac
@ -12455,7 +12464,7 @@ F: Documentation/devicetree/bindings/iio/dac/dpot-dac.yaml
F: drivers/iio/dac/dpot-dac.c
IIO ENVELOPE DETECTOR
M: Peter Rosin <peda@axentia.se>
M: Peter Rosin <peda@lysator.liu.se>
L: linux-iio@vger.kernel.org
S: Maintained
F: Documentation/ABI/testing/sysfs-bus-iio-adc-envelope-detector
@ -12471,7 +12480,7 @@ F: include/linux/iio/iio-gts-helper.h
F: drivers/iio/test/iio-test-gts.c
IIO MULTIPLEXER
M: Peter Rosin <peda@axentia.se>
M: Peter Rosin <peda@lysator.liu.se>
L: linux-iio@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/iio/multiplexer/io-channel-mux.yaml
@ -12502,7 +12511,7 @@ F: include/linux/iio/
F: tools/iio/
IIO UNIT CONVERTER
M: Peter Rosin <peda@axentia.se>
M: Peter Rosin <peda@lysator.liu.se>
L: linux-iio@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/iio/afe/current-sense-amplifier.yaml
@ -12779,7 +12788,6 @@ M: Cezary Rojewski <cezary.rojewski@intel.com>
M: Liam Girdwood <liam.r.girdwood@linux.intel.com>
M: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
M: Bard Liao <yung-chuan.liao@linux.intel.com>
M: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
M: Kai Vehmanen <kai.vehmanen@linux.intel.com>
R: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
L: linux-sound@vger.kernel.org
@ -14052,6 +14060,7 @@ KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)
M: Marc Zyngier <maz@kernel.org>
M: Oliver Upton <oupton@kernel.org>
R: Joey Gouly <joey.gouly@arm.com>
R: Steffen Eiden <seiden@linux.ibm.com>
R: Suzuki K Poulose <suzuki.poulose@arm.com>
R: Zenghui Yu <yuzenghui@huawei.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
@ -15718,7 +15727,7 @@ F: Documentation/devicetree/bindings/media/i2c/maxim,max96717.yaml
F: drivers/media/i2c/max96717.c
MAX9860 MONO AUDIO VOICE CODEC DRIVER
M: Peter Rosin <peda@axentia.se>
M: Peter Rosin <peda@lysator.liu.se>
L: linux-sound@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/sound/max9860.txt
@ -15933,7 +15942,7 @@ F: Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml
F: drivers/net/can/spi/mcp251xfd/
MCP4018 AND MCP4531 MICROCHIP DIGITAL POTENTIOMETER DRIVERS
M: Peter Rosin <peda@axentia.se>
M: Peter Rosin <peda@lysator.liu.se>
L: linux-iio@vger.kernel.org
S: Maintained
F: Documentation/ABI/testing/sysfs-bus-iio-potentiometer-mcp4531
@ -18238,7 +18247,7 @@ F: include/linux/mmc/
F: include/uapi/linux/mmc/
MULTIPLEXER SUBSYSTEM
M: Peter Rosin <peda@axentia.se>
M: Peter Rosin <peda@lysator.liu.se>
S: Odd Fixes
F: Documentation/ABI/testing/sysfs-class-mux*
F: Documentation/devicetree/bindings/mux/
@ -19347,7 +19356,7 @@ F: include/dt-bindings/display/tda998x.h
K: "nxp,tda998x"
NXP TFA9879 DRIVER
M: Peter Rosin <peda@axentia.se>
M: Peter Rosin <peda@lysator.liu.se>
L: linux-sound@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/sound/trivial-codec.yaml
@ -19445,7 +19454,6 @@ F: include/misc/ocxl*
F: include/uapi/misc/ocxl.h
OMAP AUDIO SUPPORT
M: Peter Ujfalusi <peter.ujfalusi@gmail.com>
M: Jarkko Nikula <jarkko.nikula@bitmer.com>
L: linux-sound@vger.kernel.org
L: linux-omap@vger.kernel.org
@ -20348,13 +20356,14 @@ F: Documentation/devicetree/bindings/pci/marvell,armada8k-pcie.yaml
F: drivers/pci/controller/dwc/pcie-armada8k.c
PCI DRIVER FOR CADENCE PCIE IP
R: Aksh Garg <a-garg7@ti.com>
L: linux-pci@vger.kernel.org
S: Orphan
F: Documentation/devicetree/bindings/pci/cdns,*
F: drivers/pci/controller/cadence/*cadence*
F: drivers/pci/controller/cadence/
PCI DRIVER FOR CIX Sky1
M: Hans Zhang <hans.zhang@cixtech.com>
M: Hans Zhang <18255117159@163.com>
L: linux-pci@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/pci/cix,sky1-pcie-*.yaml
@ -20466,7 +20475,7 @@ F: drivers/pci/controller/plda/pcie-plda-host.c
F: drivers/pci/controller/plda/pcie-plda.h
PCI DRIVER FOR RENESAS R-CAR
M: Marek Vasut <marek.vasut+renesas@gmail.com>
M: Marek Vasut <marek.vasut+renesas@mailbox.org>
M: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
L: linux-pci@vger.kernel.org
L: linux-renesas-soc@vger.kernel.org
@ -22940,7 +22949,7 @@ N: riscv
K: riscv
RISC-V IOMMU
M: Tomasz Jeznach <tjeznach@rivosinc.com>
M: Tomasz Jeznach <tomasz.jeznach@linux.dev>
L: iommu@lists.linux.dev
L: linux-riscv@lists.infradead.org
S: Maintained
@ -24650,6 +24659,7 @@ S: Maintained
F: fs/smb/client/smbdirect.*
F: fs/smb/smbdirect/
F: fs/smb/server/transport_rdma.*
F: include/linux/smbdirect.h
SMC91x ETHERNET DRIVER
M: Nicolas Pitre <nico@fluxnic.net>
@ -25053,7 +25063,6 @@ SOUND - SOUND OPEN FIRMWARE (SOF) DRIVERS
M: Liam Girdwood <lgirdwood@gmail.com>
M: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
M: Bard Liao <yung-chuan.liao@linux.intel.com>
M: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
M: Daniel Baluta <daniel.baluta@nxp.com>
R: Kai Vehmanen <kai.vehmanen@linux.intel.com>
R: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
@ -26346,7 +26355,7 @@ F: arch/xtensa/
F: drivers/irqchip/irq-xtensa-*
TEXAS INSTRUMENTS ASoC DRIVERS
M: Peter Ujfalusi <peter.ujfalusi@gmail.com>
M: Sen Wang <sen@ti.com>
L: linux-sound@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/sound/davinci-mcasp-audio.yaml
@ -26848,12 +26857,6 @@ S: Maintained
F: Documentation/devicetree/bindings/iio/adc/ti,tsc2046.yaml
F: drivers/iio/adc/ti-tsc2046.c
TI TWL4030 SERIES SOC CODEC DRIVER
M: Peter Ujfalusi <peter.ujfalusi@gmail.com>
L: linux-sound@vger.kernel.org
S: Maintained
F: sound/soc/codecs/twl4030*
TI VPE/CAL DRIVERS
M: Yemike Abhilash Chandra <y-abhilashchandra@ti.com>
L: linux-media@vger.kernel.org

@ -2,7 +2,7 @@
VERSION = 7
PATCHLEVEL = 1
SUBLEVEL = 0
EXTRAVERSION = -rc2
EXTRAVERSION = -rc4
NAME = Baby Opossum Posse
# *DOCUMENTATION*
@ -486,6 +486,8 @@ export rust_common_flags := --edition=2021 \
-Wclippy::as_ptr_cast_mut \
-Wclippy::as_underscore \
-Wclippy::cast_lossless \
-Aclippy::collapsible_if \
-Aclippy::collapsible_match \
-Wclippy::ignored_unit_patterns \
-Aclippy::incompatible_msrv \
-Wclippy::mut_mut \

@ -23,6 +23,7 @@ static inline u64 tcr_el2_ps_to_tcr_el1_ips(u64 tcr_el2)
static inline u64 translate_tcr_el2_to_tcr_el1(u64 tcr)
{
return TCR_EPD1_MASK | /* disable TTBR1_EL1 */
((tcr & TCR_EL2_DS) ? TCR_DS : 0) |
((tcr & TCR_EL2_TBI) ? TCR_TBI0 : 0) |
tcr_el2_ps_to_tcr_el1_ips(tcr) |
(tcr & TCR_EL2_TG0_MASK) |

@ -844,7 +844,7 @@
#define INIT_SCTLR_EL2_MMU_ON \
(SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_SA | SCTLR_ELx_I | \
SCTLR_ELx_IESB | SCTLR_ELx_WXN | ENDIAN_SET_EL2 | \
SCTLR_ELx_ITFSB | SCTLR_EL2_RES1)
SCTLR_ELx_ITFSB | SCTLR_ELx_EIS | SCTLR_ELx_EOS | SCTLR_EL2_RES1)
#define INIT_SCTLR_EL2_MMU_OFF \
(SCTLR_EL2_RES1 | ENDIAN_SET_EL2)

@ -62,6 +62,13 @@ static void noinstr arm64_exit_to_kernel_mode(struct pt_regs *regs,
irqentry_exit_to_kernel_mode_after_preempt(regs, state);
}
static __always_inline void arm64_syscall_enter_from_user_mode(struct pt_regs *regs)
{
enter_from_user_mode(regs);
mte_disable_tco_entry(current);
sme_enter_from_user_mode();
}
/*
* Handle IRQ/context state management when entering from user mode.
* Before this function is called it is not safe to call regular kernel code,
@ -70,20 +77,30 @@ static void noinstr arm64_exit_to_kernel_mode(struct pt_regs *regs,
static __always_inline void arm64_enter_from_user_mode(struct pt_regs *regs)
{
enter_from_user_mode(regs);
rseq_note_user_irq_entry();
mte_disable_tco_entry(current);
sme_enter_from_user_mode();
}
static __always_inline void arm64_syscall_exit_to_user_mode(struct pt_regs *regs)
{
local_irq_disable();
syscall_exit_to_user_mode_prepare(regs);
local_daif_mask();
sme_exit_to_user_mode();
mte_check_tfsr_exit();
exit_to_user_mode();
}
/*
* Handle IRQ/context state management when exiting to user mode.
* After this function returns it is not safe to call regular kernel code,
* instrumentable code, or any code which may trigger an exception.
*/
static __always_inline void arm64_exit_to_user_mode(struct pt_regs *regs)
{
local_irq_disable();
exit_to_user_mode_prepare_legacy(regs);
irqentry_exit_to_user_mode_prepare(regs);
local_daif_mask();
sme_exit_to_user_mode();
mte_check_tfsr_exit();
@ -92,7 +109,7 @@ static __always_inline void arm64_exit_to_user_mode(struct pt_regs *regs)
asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs)
{
arm64_exit_to_user_mode(regs);
arm64_syscall_exit_to_user_mode(regs);
}
/*
@ -716,12 +733,12 @@ static void noinstr el0_brk64(struct pt_regs *regs, unsigned long esr)
static void noinstr el0_svc(struct pt_regs *regs)
{
arm64_enter_from_user_mode(regs);
arm64_syscall_enter_from_user_mode(regs);
cortex_a76_erratum_1463225_svc_handler();
fpsimd_syscall_enter();
local_daif_restore(DAIF_PROCCTX);
do_el0_svc(regs);
arm64_exit_to_user_mode(regs);
arm64_syscall_exit_to_user_mode(regs);
fpsimd_syscall_exit();
}
@ -868,11 +885,11 @@ static void noinstr el0_cp15(struct pt_regs *regs, unsigned long esr)
static void noinstr el0_svc_compat(struct pt_regs *regs)
{
arm64_enter_from_user_mode(regs);
arm64_syscall_enter_from_user_mode(regs);
cortex_a76_erratum_1463225_svc_handler();
local_daif_restore(DAIF_PROCCTX);
do_el0_svc_compat(regs);
arm64_exit_to_user_mode(regs);
arm64_syscall_exit_to_user_mode(regs);
}
static void noinstr el0_bkpt32(struct pt_regs *regs, unsigned long esr)

@ -983,8 +983,8 @@ static int sve_set_common(struct task_struct *target,
}
/* Always zero V regs, FPSR, and FPCR */
memset(&current->thread.uw.fpsimd_state, 0,
sizeof(current->thread.uw.fpsimd_state));
memset(&target->thread.uw.fpsimd_state, 0,
sizeof(target->thread.uw.fpsimd_state));
/* Registers: FPSIMD-only case */
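
The hunk above swaps `current` for `target` in the `memset()`. A minimal sketch of why that matters, assuming a setter that one task (the tracer) calls on behalf of another (the tracee). `struct task`, `current_task`, and both helper names are hypothetical stand-ins, not kernel definitions:

```c
#include <string.h>

/* Hypothetical stand-in for a task with per-thread FP/SIMD state. */
struct task {
	unsigned char fpsimd_state[16];
};

/* Stand-in for the kernel's "current": the task running the setter. */
static struct task *current_task;

/* Buggy form: clears the caller's state and leaves the target untouched. */
static void zero_fpsimd_buggy(struct task *target)
{
	(void)target;
	memset(current_task->fpsimd_state, 0,
	       sizeof(current_task->fpsimd_state));
}

/* Fixed form: clears the state of the task that was actually passed in. */
static void zero_fpsimd_fixed(struct task *target)
{
	memset(target->fpsimd_state, 0, sizeof(target->fpsimd_state));
}
```

With distinct tracer and tracee objects, the buggy helper wipes the tracer's registers, while the fixed helper only touches the tracee.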

@ -4,6 +4,7 @@
* Author: Christoffer Dall <c.dall@virtualopensystems.com>
*/
#include <linux/arm-smccc.h>
#include <linux/bug.h>
#include <linux/cpu_pm.h>
#include <linux/errno.h>
@ -2638,6 +2639,22 @@ static int init_pkvm_host_sve_state(void)
return 0;
}
static int pkvm_check_sme_dvmsync_fw_call(void)
{
struct arm_smccc_res res;
if (!cpus_have_final_cap(ARM64_WORKAROUND_4193714))
return 0;
arm_smccc_1_1_smc(ARM_SMCCC_CPU_WORKAROUND_4193714, &res);
if (res.a0) {
kvm_err("pKVM requires firmware support for C1-Pro erratum 4193714\n");
return -ENODEV;
}
return 0;
}
/*
* Finalizes the initialization of hyp mode, once everything else is initialized
* and the initialization process cannot fail.
@ -2838,6 +2855,10 @@ static int __init init_hyp_mode(void)
if (err)
goto out_err;
err = pkvm_check_sme_dvmsync_fw_call();
if (err)
goto out_err;
err = kvm_hyp_init_protection(hyp_va_bits);
if (err) {
kvm_err("Failed to init hyp memory protection\n");

@ -245,7 +245,7 @@ static inline void __activate_traps_ich_hfgxtr(struct kvm_vcpu *vcpu)
__activate_fgt(hctxt, vcpu, ICH_HFGITR_EL2);
}
#define __deactivate_fgt(htcxt, vcpu, reg) \
#define __deactivate_fgt(hctxt, vcpu, reg) \
do { \
write_sysreg_s(ctxt_sys_reg(hctxt, reg), \
SYS_ ## reg); \
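
The one-word change above fixes a macro parameter that was spelled `htcxt` while the body used `hctxt`. The old form still compiles whenever the call site happens to have a variable named `hctxt` in scope, which is presumably why it went unnoticed. A minimal sketch of the effect in plain C, using hypothetical names (`struct ctx`, `READ_BUGGY`, `READ_FIXED`) rather than the kernel's:

```c
/* Hypothetical stand-in for a context structure; not the kernel's type. */
struct ctx {
	int value;
};

/*
 * Buggy form: the parameter is spelled "htcxt" but the body reads "hctxt".
 * The argument is ignored and the body binds to whatever variable named
 * "hctxt" is visible where the macro is expanded.
 */
#define READ_BUGGY(htcxt)	((hctxt)->value)

/* Fixed form: parameter and body agree, so the argument is used. */
#define READ_FIXED(hctxt)	((hctxt)->value)

static int read_buggy(struct ctx *hctxt, struct ctx *other)
{
	(void)other;	/* the macro never expands its argument */
	return READ_BUGGY(other);	/* reads the local "hctxt" instead */
}

static int read_fixed(struct ctx *hctxt, struct ctx *other)
{
	(void)hctxt;
	return READ_FIXED(other);	/* reads "other", as requested */
}
```

The buggy macro only gives the right answer when the argument and the captured variable are the same object.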

@ -35,6 +35,9 @@ void trace_clock_update(u32 mult, u32 shift, u64 epoch_ns, u64 epoch_cyc)
struct clock_data *clock = &trace_clock_data;
u64 bank = clock->cur ^ 1;
if (!mult || shift >= 64)
return;
clock->data[bank].mult = mult;
clock->data[bank].shift = shift;
clock->data[bank].epoch_ns = epoch_ns;
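
The new guard rejects a zero multiplier and a shift of 64 or more before the values are stored. A rough sketch of the reasoning, assuming the usual `(cycles * mult) >> shift` conversion. `struct clock_params` and both function names are hypothetical, and overflow of the multiplication is ignored for brevity:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clock parameters, mirroring the mult/shift pair in the hunk. */
struct clock_params {
	uint32_t mult;
	uint32_t shift;
};

/*
 * Accept new parameters only if they are usable: a zero multiplier would
 * make every conversion return 0, and shifting a 64-bit value by 64 or
 * more bits is undefined behaviour in C.
 */
static bool clock_params_update(struct clock_params *p,
				uint32_t mult, uint32_t shift)
{
	if (!mult || shift >= 64)
		return false;

	p->mult = mult;
	p->shift = shift;
	return true;
}

/* Convert a cycle delta to nanoseconds: (cycles * mult) >> shift. */
static uint64_t cycles_to_ns(const struct clock_params *p, uint64_t cycles)
{
	return (cycles * p->mult) >> p->shift;
}
```

Rejected updates leave the previous parameters in place, so a reader never sees a pair that would trigger an out-of-range shift.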

@ -5,6 +5,7 @@
*/
#include <linux/kvm_host.h>
#include <asm/kvm_emulate.h>
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
@ -14,6 +15,7 @@
#include <hyp/fault.h>
#include <nvhe/arm-smccc.h>
#include <nvhe/gfp.h>
#include <nvhe/memory.h>
#include <nvhe/mem_protect.h>
@ -29,6 +31,19 @@ static struct hyp_pool host_s2_pool;
static DEFINE_PER_CPU(struct pkvm_hyp_vm *, __current_vm);
#define current_vm (*this_cpu_ptr(&__current_vm))
static void pkvm_sme_dvmsync_fw_call(void)
{
if (alternative_has_cap_unlikely(ARM64_WORKAROUND_4193714)) {
struct arm_smccc_res res;
/*
* Ignore the return value. Probing for the workaround
* availability took place in init_hyp_mode().
*/
hyp_smccc_1_1_smc(ARM_SMCCC_CPU_WORKAROUND_4193714, &res);
}
}
static void guest_lock_component(struct pkvm_hyp_vm *vm)
{
hyp_spin_lock(&vm->lock);
@ -574,8 +589,14 @@ static int host_stage2_set_owner_metadata_locked(phys_addr_t addr, u64 size,
ret = host_stage2_try(kvm_pgtable_stage2_annotate, &host_mmu.pgt,
addr, size, &host_s2_pool,
KVM_HOST_INVALID_PTE_TYPE_DONATION, annotation);
if (!ret)
if (!ret) {
/*
* After stage2 maintenance has happened, but before the page
* owner has changed.
*/
pkvm_sme_dvmsync_fw_call();
__host_update_page_state(addr, size, PKVM_NOPAGE);
}
return ret;
}
@ -1369,6 +1390,22 @@ int __pkvm_host_reclaim_page_guest(u64 gfn, struct pkvm_hyp_vm *vm)
return ret && ret != -EHWPOISON ? ret : 0;
}
/*
* share/donate install at most one stage-2 leaf (PAGE_SIZE, or one
* KVM_PGTABLE_LAST_LEVEL - 1 block for share). kvm_mmu_cache_min_pages()
* bounds the worst-case allocation: exact for the PAGE_SIZE leaf,
* conservative by one for the block.
*/
static int __guest_check_pgtable_memcache(struct pkvm_hyp_vcpu *vcpu)
{
struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(vcpu);
if (vcpu->vcpu.arch.pkvm_memcache.nr_pages < kvm_mmu_cache_min_pages(vm->pgt.mmu))
return -ENOMEM;
return 0;
}
int __pkvm_host_donate_guest(u64 pfn, u64 gfn, struct pkvm_hyp_vcpu *vcpu)
{
struct pkvm_hyp_vm *vm = pkvm_hyp_vcpu_to_hyp_vm(vcpu);
@ -1388,6 +1425,10 @@ int __pkvm_host_donate_guest(u64 pfn, u64 gfn, struct pkvm_hyp_vcpu *vcpu)
if (ret)
goto unlock;
ret = __guest_check_pgtable_memcache(vcpu);
if (ret)
goto unlock;
meta = host_stage2_encode_gfn_meta(vm, gfn);
WARN_ON(host_stage2_set_owner_metadata_locked(phys, PAGE_SIZE,
PKVM_ID_GUEST, meta));
@ -1453,6 +1494,10 @@ int __pkvm_host_share_guest(u64 pfn, u64 gfn, u64 nr_pages, struct pkvm_hyp_vcpu
}
}
ret = __guest_check_pgtable_memcache(vcpu);
if (ret)
goto unlock;
for_each_hyp_page(page, phys, size) {
set_host_state(page, PKVM_PAGE_SHARED_OWNED);
page->host_share_guest_count++;

@ -752,16 +752,30 @@ static struct pkvm_hyp_vcpu selftest_vcpu = {
struct pkvm_hyp_vcpu *init_selftest_vm(void *virt)
{
struct hyp_page *p = hyp_virt_to_page(virt);
unsigned long min_pages, seeded = 0;
int i;
selftest_vm.kvm.arch.mmu.vtcr = host_mmu.arch.mmu.vtcr;
WARN_ON(kvm_guest_prepare_stage2(&selftest_vm, virt));
/*
* Mirror pkvm_refill_memcache() for the share/donate pre-checks;
* the selftest invokes those functions directly and would
* otherwise see an empty memcache.
*/
min_pages = kvm_mmu_cache_min_pages(&selftest_vm.kvm.arch.mmu);
for (i = 0; i < pkvm_selftest_pages(); i++) {
if (p[i].refcount)
continue;
p[i].refcount = 1;
hyp_put_page(&selftest_vm.pool, hyp_page_to_virt(&p[i]));
if (seeded < min_pages) {
push_hyp_memcache(&selftest_vcpu.vcpu.arch.pkvm_memcache,
hyp_page_to_virt(&p[i]), hyp_virt_to_phys);
seeded++;
} else {
hyp_put_page(&selftest_vm.pool, hyp_page_to_virt(&p[i]));
}
}
selftest_vm.kvm.arch.pkvm.handle = __pkvm_reserve_vm();

@ -663,7 +663,8 @@ static void __noreturn __hyp_call_panic(u64 spsr, u64 elr, u64 par)
host_ctxt = host_data_ptr(host_ctxt);
vcpu = host_ctxt->__hyp_running_vcpu;
__deactivate_traps(vcpu);
if (vcpu)
__deactivate_traps(vcpu);
sysreg_restore_host_state_vhe(host_ctxt);
panic("HYP panic:\nPS:%08llx PC:%016llx ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n",

@ -1576,21 +1576,24 @@ struct kvm_s2_fault_desc {
static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
{
bool write_fault, exec_fault;
bool perm_fault = kvm_vcpu_trap_is_permission_fault(s2fd->vcpu);
enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED;
enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
struct kvm_pgtable *pgt = s2fd->vcpu->arch.hw_mmu->pgt;
unsigned long mmu_seq;
struct page *page;
struct kvm *kvm = s2fd->vcpu->kvm;
void *memcache;
void *memcache = NULL;
kvm_pfn_t pfn;
gfn_t gfn;
int ret;
memcache = get_mmu_memcache(s2fd->vcpu);
ret = topup_mmu_memcache(s2fd->vcpu, memcache);
if (ret)
return ret;
if (!perm_fault) {
memcache = get_mmu_memcache(s2fd->vcpu);
ret = topup_mmu_memcache(s2fd->vcpu, memcache);
if (ret)
return ret;
}
if (s2fd->nested)
gfn = kvm_s2_trans_output(s2fd->nested) >> PAGE_SHIFT;
@ -1631,9 +1634,19 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
goto out_unlock;
}
ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE,
__pfn_to_phys(pfn), prot,
memcache, flags);
if (perm_fault) {
/*
* Drop the SW bits in favour of those stored in the
* PTE, which will be preserved.
*/
prot &= ~KVM_NV_GUEST_MAP_SZ;
ret = KVM_PGT_FN(kvm_pgtable_stage2_relax_perms)(pgt, s2fd->fault_ipa,
prot, flags);
} else {
ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE,
__pfn_to_phys(pfn), prot,
memcache, flags);
}
out_unlock:
kvm_release_faultin_page(kvm, page, !!ret, prot & KVM_PGTABLE_PROT_W);

@ -3,7 +3,7 @@ obj-y += mm/
obj-y += net/
obj-y += vdso/
obj-$(CONFIG_KVM) += kvm/
obj-$(subst m,y,$(CONFIG_KVM)) += kvm/
# for cleaning
subdir- += boot

@ -220,6 +220,7 @@ menu "Kernel type and options"
choice
prompt "Kernel type"
default 64BIT # Keep existing behavior
config 32BIT
bool "32-bit kernel"

@ -55,9 +55,11 @@ endif
ifdef CONFIG_32BIT
tool-archpref = $(32bit-tool-archpref)
UTS_MACHINE := loongarch32
cflags-y += $(call cc-option,-m32)
else
tool-archpref = $(64bit-tool-archpref)
UTS_MACHINE := loongarch64
cflags-y += $(call cc-option,-m64)
endif
ifneq ($(SUBARCH),$(ARCH))

@ -20,3 +20,23 @@ asmlinkage void noinstr __no_stack_protector ret_from_kernel_thread(struct task_
struct pt_regs *regs,
int (*fn)(void *),
void *fn_arg);
struct kvm_run;
struct kvm_vcpu;
struct loongarch_fpu;
void kvm_exc_entry(void);
int kvm_enter_guest(struct kvm_run *run, struct kvm_vcpu *vcpu);
void kvm_save_fpu(struct loongarch_fpu *fpu);
void kvm_restore_fpu(struct loongarch_fpu *fpu);
#ifdef CONFIG_CPU_HAS_LSX
void kvm_save_lsx(struct loongarch_fpu *fpu);
void kvm_restore_lsx(struct loongarch_fpu *fpu);
#endif
#ifdef CONFIG_CPU_HAS_LASX
void kvm_save_lasx(struct loongarch_fpu *fpu);
void kvm_restore_lasx(struct loongarch_fpu *fpu);
#endif

@ -87,7 +87,6 @@ struct kvm_context {
struct kvm_world_switch {
int (*exc_entry)(void);
int (*enter_guest)(struct kvm_run *run, struct kvm_vcpu *vcpu);
unsigned long page_order;
};
#define MAX_PGTABLE_LEVELS 4
@ -359,8 +358,6 @@ void kvm_exc_entry(void);
int kvm_enter_guest(struct kvm_run *run, struct kvm_vcpu *vcpu);
extern unsigned long vpid_mask;
extern const unsigned long kvm_exception_size;
extern const unsigned long kvm_enter_guest_size;
extern struct kvm_world_switch *kvm_loongarch_ops;
#define SW_GCSR (1 << 0)

@ -69,7 +69,7 @@
9, 10, 11, 12, 13, 14, 15, 16, \
17, 18, 19, 20, 21, 22, 23, 24, \
25, 26, 27, 28, 29, 30, 31; \
.cfi_offset \num, SC_REGS + \num * SZREG; \
.cfi_offset \num, SC_REGS + \num * 8; \
.endr; \
\
nop; \

@ -85,12 +85,6 @@ static __always_inline u64 __arch_get_hw_counter(s32 clock_mode,
return count;
}
static inline bool loongarch_vdso_hres_capable(void)
{
return true;
}
#define __arch_vdso_hres_capable loongarch_vdso_hres_capable
#endif /* CONFIG_GENERIC_GETTIMEOFDAY */
#endif /* !__ASSEMBLER__ */

@ -7,11 +7,12 @@ include $(srctree)/virt/kvm/Makefile.kvm
obj-$(CONFIG_KVM) += kvm.o
obj-y += switch.o
kvm-y += exit.o
kvm-y += interrupt.o
kvm-y += main.o
kvm-y += mmu.o
kvm-y += switch.o
kvm-y += timer.o
kvm-y += tlb.o
kvm-y += vcpu.o

@ -390,6 +390,7 @@ int kvm_emu_mmio_read(struct kvm_vcpu *vcpu, larch_inst inst)
run->mmio.len = 8;
break;
default:
ret = EMULATE_FAIL;
break;
}
break;

@ -28,23 +28,29 @@ static unsigned int priority_to_irq[EXCCODE_INT_NUM] = {
static int kvm_irq_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
{
unsigned int irq = 0;
unsigned long old, new;
clear_bit(priority, &vcpu->arch.irq_pending);
if (priority < EXCCODE_INT_NUM)
irq = priority_to_irq[priority];
if (kvm_guest_has_msgint(&vcpu->arch) && (priority == INT_AVEC)) {
dmsintc_inject_irq(vcpu);
set_gcsr_estat(irq);
return 1;
}
switch (priority) {
case INT_AVEC:
if (!kvm_guest_has_msgint(&vcpu->arch))
break;
dmsintc_inject_irq(vcpu);
fallthrough;
case INT_TI:
case INT_IPI:
case INT_SWI0:
case INT_SWI1:
old = kvm_read_hw_gcsr(LOONGARCH_CSR_TVAL);
set_gcsr_estat(irq);
new = kvm_read_hw_gcsr(LOONGARCH_CSR_TVAL);
/* Inject TI if TVAL inverted */
if (new > old)
set_gcsr_estat(CPU_TIMER);
break;
case INT_HWI0 ... INT_HWI7:
@ -61,22 +67,28 @@ static int kvm_irq_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
static int kvm_irq_clear(struct kvm_vcpu *vcpu, unsigned int priority)
{
unsigned int irq = 0;
unsigned long old, new;
clear_bit(priority, &vcpu->arch.irq_clear);
if (priority < EXCCODE_INT_NUM)
irq = priority_to_irq[priority];
if (kvm_guest_has_msgint(&vcpu->arch) && (priority == INT_AVEC)) {
clear_gcsr_estat(irq);
return 1;
}
switch (priority) {
case INT_AVEC:
if (!kvm_guest_has_msgint(&vcpu->arch))
break;
fallthrough;
case INT_TI:
case INT_IPI:
case INT_SWI0:
case INT_SWI1:
old = kvm_read_hw_gcsr(LOONGARCH_CSR_TVAL);
clear_gcsr_estat(irq);
new = kvm_read_hw_gcsr(LOONGARCH_CSR_TVAL);
/* Inject TI if TVAL inverted */
if (new > old)
set_gcsr_estat(CPU_TIMER);
break;
case INT_HWI0 ... INT_HWI7:
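
Both hunks above read TVAL before and after touching ESTAT and inject a timer interrupt when the second read is larger. A small sketch of that comparison, on the assumption that TVAL is a down-counter that wraps to a large value once it passes zero. The function name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * For a counter that only counts down, a later sample should never be
 * larger than an earlier one. If it is, the counter passed zero and
 * wrapped in between, i.e. the timer expired during that window.
 */
static bool countdown_wrapped(uint64_t before, uint64_t after)
{
	return after > before;
}
```

The check is deliberately cheap: two reads and one comparison, with no need to know the wrap value.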

@ -348,8 +348,7 @@ void kvm_arch_disable_virtualization_cpu(void)
static int kvm_loongarch_env_init(void)
{
int cpu, order, ret;
void *addr;
int cpu, ret;
struct kvm_context *context;
vmcs = alloc_percpu(struct kvm_context);
@ -365,30 +364,8 @@ static int kvm_loongarch_env_init(void)
return -ENOMEM;
}
/*
* PGD register is shared between root kernel and kvm hypervisor.
* So world switch entry should be in DMW area rather than TLB area
* to avoid page fault reenter.
*
* In future if hardware pagetable walking is supported, we won't
* need to copy world switch code to DMW area.
*/
order = get_order(kvm_exception_size + kvm_enter_guest_size);
addr = (void *)__get_free_pages(GFP_KERNEL, order);
if (!addr) {
free_percpu(vmcs);
vmcs = NULL;
kfree(kvm_loongarch_ops);
kvm_loongarch_ops = NULL;
return -ENOMEM;
}
memcpy(addr, kvm_exc_entry, kvm_exception_size);
memcpy(addr + kvm_exception_size, kvm_enter_guest, kvm_enter_guest_size);
flush_icache_range((unsigned long)addr, (unsigned long)addr + kvm_exception_size + kvm_enter_guest_size);
kvm_loongarch_ops->exc_entry = addr;
kvm_loongarch_ops->enter_guest = addr + kvm_exception_size;
kvm_loongarch_ops->page_order = order;
kvm_loongarch_ops->exc_entry = (void *)kvm_exc_entry;
kvm_loongarch_ops->enter_guest = (void *)kvm_enter_guest;
vpid_mask = read_csr_gstat();
vpid_mask = (vpid_mask & CSR_GSTAT_GIDBIT) >> CSR_GSTAT_GIDBIT_SHIFT;
@ -428,16 +405,10 @@ static int kvm_loongarch_env_init(void)
static void kvm_loongarch_env_exit(void)
{
unsigned long addr;
if (vmcs)
free_percpu(vmcs);
if (kvm_loongarch_ops) {
if (kvm_loongarch_ops->exc_entry) {
addr = (unsigned long)kvm_loongarch_ops->exc_entry;
free_pages(addr, kvm_loongarch_ops->page_order);
}
kfree(kvm_loongarch_ops);
}

@ -95,7 +95,7 @@ static int kvm_flush_pte(kvm_pte_t *pte, phys_addr_t addr, kvm_ptw_ctx *ctx)
else
kvm->stat.pages--;
*pte = ctx->invalid_entry;
kvm_set_pte(pte, ctx->invalid_entry);
return 1;
}

@ -4,9 +4,11 @@
*/
#include <linux/linkage.h>
#include <linux/kvm_types.h>
#include <asm/asm.h>
#include <asm/asmmacro.h>
#include <asm/loongarch.h>
#include <asm/page.h>
#include <asm/regdef.h>
#include <asm/unwind_hints.h>
@ -100,11 +102,16 @@
* - is still in guest mode, such as pgd table/vmid registers etc,
* - will fix with hw page walk enabled in future
* load kvm_vcpu from reserved CSR KVM_VCPU_KS, and save a2 to KVM_TEMP_KS
*
* PGD register is shared between root kernel and kvm hypervisor.
* So world switch entry should be in DMW area rather than TLB area
* to avoid page fault re-enter.
*/
.text
.p2align PAGE_SHIFT
.cfi_sections .debug_frame
SYM_CODE_START(kvm_exc_entry)
UNWIND_HINT_UNDEFINED
UNWIND_HINT_END_OF_STACK
csrwr a2, KVM_TEMP_KS
csrrd a2, KVM_VCPU_KS
addi.d a2, a2, KVM_VCPU_ARCH
@ -190,8 +197,8 @@ ret_to_host:
kvm_restore_host_gpr a2
jr ra
SYM_INNER_LABEL(kvm_exc_entry_end, SYM_L_LOCAL)
SYM_CODE_END(kvm_exc_entry)
EXPORT_SYMBOL_FOR_KVM(kvm_exc_entry)
/*
* int kvm_enter_guest(struct kvm_run *run, struct kvm_vcpu *vcpu)
@ -215,8 +222,8 @@ SYM_FUNC_START(kvm_enter_guest)
/* Save kvm_vcpu to kscratch */
csrwr a1, KVM_VCPU_KS
kvm_switch_to_guest
SYM_INNER_LABEL(kvm_enter_guest_end, SYM_L_LOCAL)
SYM_FUNC_END(kvm_enter_guest)
EXPORT_SYMBOL_FOR_KVM(kvm_enter_guest)
SYM_FUNC_START(kvm_save_fpu)
fpu_save_csr a0 t1
@ -224,6 +231,7 @@ SYM_FUNC_START(kvm_save_fpu)
fpu_save_cc a0 t1 t2
jr ra
SYM_FUNC_END(kvm_save_fpu)
EXPORT_SYMBOL_FOR_KVM(kvm_save_fpu)
SYM_FUNC_START(kvm_restore_fpu)
fpu_restore_double a0 t1
@ -231,6 +239,7 @@ SYM_FUNC_START(kvm_restore_fpu)
fpu_restore_cc a0 t1 t2
jr ra
SYM_FUNC_END(kvm_restore_fpu)
EXPORT_SYMBOL_FOR_KVM(kvm_restore_fpu)
#ifdef CONFIG_CPU_HAS_LSX
SYM_FUNC_START(kvm_save_lsx)
@ -239,6 +248,7 @@ SYM_FUNC_START(kvm_save_lsx)
lsx_save_data a0 t1
jr ra
SYM_FUNC_END(kvm_save_lsx)
EXPORT_SYMBOL_FOR_KVM(kvm_save_lsx)
SYM_FUNC_START(kvm_restore_lsx)
lsx_restore_data a0 t1
@ -246,6 +256,7 @@ SYM_FUNC_START(kvm_restore_lsx)
fpu_restore_csr a0 t1 t2
jr ra
SYM_FUNC_END(kvm_restore_lsx)
EXPORT_SYMBOL_FOR_KVM(kvm_restore_lsx)
#endif
#ifdef CONFIG_CPU_HAS_LASX
@ -255,6 +266,7 @@ SYM_FUNC_START(kvm_save_lasx)
lasx_save_data a0 t1
jr ra
SYM_FUNC_END(kvm_save_lasx)
EXPORT_SYMBOL_FOR_KVM(kvm_save_lasx)
SYM_FUNC_START(kvm_restore_lasx)
lasx_restore_data a0 t1
@ -262,10 +274,8 @@ SYM_FUNC_START(kvm_restore_lasx)
fpu_restore_csr a0 t1 t2
jr ra
SYM_FUNC_END(kvm_restore_lasx)
EXPORT_SYMBOL_FOR_KVM(kvm_restore_lasx)
#endif
.section ".rodata"
SYM_DATA(kvm_exception_size, .quad kvm_exc_entry_end - kvm_exc_entry)
SYM_DATA(kvm_enter_guest_size, .quad kvm_enter_guest_end - kvm_enter_guest)
#ifdef CONFIG_CPU_HAS_LBT
STACK_FRAME_NON_STANDARD kvm_restore_fpu

@ -96,15 +96,21 @@ void kvm_restore_timer(struct kvm_vcpu *vcpu)
* and set CSR TVAL with -1
*/
write_gcsr_timertick(0);
__delay(2); /* Wait cycles until timer interrupt injected */
/*
* Writing CSR_TINTCLR_TI to LOONGARCH_CSR_TINTCLR will clear
* timer interrupt, and CSR TVAL keeps unchanged with -1, it
* avoids spurious timer interrupt
*/
if (!(estat & CPU_TIMER))
if (!(estat & CPU_TIMER)) {
__delay(2); /* Wait cycles until timer interrupt injected */
/* Write TVAL with max value if no TI shot */
estat = kvm_read_hw_gcsr(LOONGARCH_CSR_ESTAT);
if (!(estat & CPU_TIMER))
write_gcsr_timertick(CSR_TCFG_VAL);
gcsr_write(CSR_TINTCLR_TI, LOONGARCH_CSR_TINTCLR);
}
return;
}

@ -125,7 +125,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = 1;
break;
case KVM_CAP_NR_VCPUS:
r = num_online_cpus();
r = min_t(unsigned int, num_online_cpus(), KVM_MAX_VCPUS);
break;
case KVM_CAP_MAX_VCPUS:
r = KVM_MAX_VCPUS;
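
The change above clamps the value reported for `KVM_CAP_NR_VCPUS` so it can never exceed the one reported for `KVM_CAP_MAX_VCPUS`. A minimal sketch of the clamp in plain C. `EXAMPLE_MAX_VCPUS` is a made-up limit for illustration, since the real `KVM_MAX_VCPUS` is defined per architecture:

```c
/* Hypothetical limit for illustration; not the kernel's KVM_MAX_VCPUS. */
#define EXAMPLE_MAX_VCPUS 256u

/* Recommended vCPU count: the online CPU count, capped at the hard limit. */
static unsigned int recommended_vcpus(unsigned int online_cpus)
{
	return online_cpus < EXAMPLE_MAX_VCPUS ? online_cpus : EXAMPLE_MAX_VCPUS;
}
```

On a host with more online CPUs than the limit, the unclamped value would advertise a recommended count that a later vCPU creation could not honour.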

@ -61,11 +61,16 @@ static void acpi_release_root_info(struct acpi_pci_root_info *ci)
static int acpi_prepare_root_resources(struct acpi_pci_root_info *ci)
{
int status;
unsigned long long pci_h = 0;
struct resource_entry *entry, *tmp;
struct acpi_device *device = ci->bridge;
status = acpi_pci_probe_root_resources(ci);
if (status > 0) {
acpi_evaluate_integer(device->handle, "PCIH", NULL, &pci_h);
if (pci_h)
return status;
resource_list_for_each_entry_safe(entry, tmp, &ci->resources) {
if (entry->res->flags & IORESOURCE_MEM) {
entry->offset = ci->root->mcfg_addr & GENMASK_ULL(63, 40);

@ -132,6 +132,9 @@ static void loongson_gpu_fixup_dma_hang(struct pci_dev *pdev, bool on)
crtc_reg = regbase;
crtc_offset = 0x400;
break;
default:
iounmap(regbase);
return;
}
for (i = 0; i < CRTC_NUM_MAX; i++, crtc_reg += crtc_offset) {

@ -12,6 +12,8 @@ obj-vdso-$(CONFIG_GENERIC_GETTIMEOFDAY) += vgettimeofday.o
ccflags-vdso := \
$(filter -I%,$(KBUILD_CFLAGS)) \
$(filter -E%,$(KBUILD_CFLAGS)) \
$(filter -m32,$(KBUILD_CFLAGS)) \
$(filter -m64,$(KBUILD_CFLAGS)) \
$(filter -march=%,$(KBUILD_CFLAGS)) \
$(filter -m%-float,$(KBUILD_CFLAGS)) \
$(CLANG_FLAGS) \

@ -174,15 +174,21 @@ ifeq ($(KBUILD_EXTMOD),)
# this hack.
prepare: vdso_prepare
vdso_prepare: prepare0
$(if $(CONFIG_64BIT),$(Q)$(MAKE) \
$(build)=arch/parisc/kernel/vdso64 include/generated/vdso64-offsets.h)
$(if $(CONFIG_PA11)$(CONFIG_COMPAT),$(Q)$(MAKE) \
ifdef CONFIG_64BIT
$(Q)$(MAKE) $(build)=arch/parisc/kernel/vdso64 include/generated/vdso64-offsets.h
$(if $(CONFIG_COMPAT),$(Q)$(MAKE) \
$(build)=arch/parisc/kernel/vdso32 include/generated/vdso32-offsets.h)
else
$(Q)$(MAKE) $(build)=arch/parisc/kernel/vdso32 include/generated/vdso32-offsets.h
endif
endif
vdso-install-$(CONFIG_PA11) += arch/parisc/kernel/vdso32/vdso32.so
ifdef CONFIG_64BIT
vdso-install-y += arch/parisc/kernel/vdso64/vdso64.so
vdso-install-$(CONFIG_COMPAT) += arch/parisc/kernel/vdso32/vdso32.so
vdso-install-$(CONFIG_64BIT) += arch/parisc/kernel/vdso64/vdso64.so
else
vdso-install-y += arch/parisc/kernel/vdso32/vdso32.so
endif
install: KBUILD_IMAGE := vmlinux
zinstall: KBUILD_IMAGE := vmlinuz

@ -6,13 +6,14 @@
#ifdef CONFIG_64BIT
#include <generated/vdso64-offsets.h>
#define VDSO64_SYMBOL(tsk, name) ((tsk)->mm->context.vdso_base + (vdso64_offset_##name))
#endif
#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT)
#include <generated/vdso32-offsets.h>
#endif
#define VDSO64_SYMBOL(tsk, name) ((tsk)->mm->context.vdso_base + (vdso64_offset_##name))
#define VDSO32_SYMBOL(tsk, name) ((tsk)->mm->context.vdso_base + (vdso32_offset_##name))
#else
#define VDSO32_SYMBOL(tsk, name) 0UL
#endif
#endif /* __ASSEMBLER__ */

@ -46,6 +46,9 @@ obj-$(CONFIG_KEXEC_FILE) += kexec_file.o
# vdso
obj-y += vdso.o
obj-$(CONFIG_64BIT) += vdso64/
obj-$(CONFIG_PA11) += vdso32/
ifdef CONFIG_64BIT
obj-y += vdso64/
obj-$(CONFIG_COMPAT) += vdso32/
else
obj-y += vdso32/
endif

@ -41,9 +41,7 @@
const struct dma_map_ops *hppa_dma_ops __ro_after_init;
EXPORT_SYMBOL(hppa_dma_ops);
static struct device root = {
.init_name = "parisc",
};
static struct device *root;
static inline int check_dev(struct device *dev)
{
@ -89,7 +87,7 @@ static int for_each_padev(int (*fn)(struct device *, void *), void * data)
.obj = data,
.fn = fn,
};
return device_for_each_child(&root, &recurse_data, descend_children);
return device_for_each_child(root, &recurse_data, descend_children);
}
/**
@ -290,7 +288,7 @@ const struct parisc_device *
find_pa_parent_type(const struct parisc_device *padev, int type)
{
const struct device *dev = &padev->dev;
while (dev != &root) {
while (dev != root) {
struct parisc_device *candidate = to_parisc_device(dev);
if (candidate->id.hw_type == type)
return candidate;
@ -319,7 +317,7 @@ static void get_node_path(struct device *dev, struct hardware_path *path)
dev = dev->parent;
}
while (dev != &root) {
while (dev != root) {
if (dev_is_pci(dev)) {
unsigned int devfn = to_pci_dev(dev)->devfn;
path->bc[i--] = PCI_SLOT(devfn) | (PCI_FUNC(devfn)<< 5);
@ -482,7 +480,7 @@ static struct parisc_device * __init alloc_tree_node(
static struct parisc_device *create_parisc_device(struct hardware_path *modpath)
{
int i;
struct device *parent = &root;
struct device *parent = root;
for (i = 0; i < 6; i++) {
if (modpath->bc[i] == -1)
continue;
@ -755,7 +753,7 @@ parse_tree_node(struct device *parent, int index, struct hardware_path *modpath)
struct device *hwpath_to_device(struct hardware_path *modpath)
{
int i;
struct device *parent = &root;
struct device *parent = root;
for (i = 0; i < 6; i++) {
if (modpath->bc[i] == -1)
continue;
@ -880,7 +878,7 @@ void __init walk_central_bus(void)
{
walk_native_bus(CENTRAL_BUS_ADDR,
CENTRAL_BUS_ADDR + (MAX_NATIVE_DEVICES * NATIVE_DEVICE_OFFSET),
&root);
root);
}
static __init void print_parisc_device(struct parisc_device *dev)
@ -907,9 +905,10 @@ void __init init_parisc_bus(void)
{
if (bus_register(&parisc_bus_type))
panic("Could not register PA-RISC bus type\n");
if (device_register(&root))
root = root_device_register("parisc");
if (IS_ERR(root))
panic("Could not register PA-RISC root device\n");
get_device(&root);
}
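
The switch to `root_device_register()` also changes the failure check to `IS_ERR(root)`, because that helper reports failure through an error-encoded pointer rather than NULL. A rough userspace sketch of the encoding follows; the `ex_` names are illustrative stand-ins, not the kernel's `err.h` definitions:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_ERRNO 4095

/* Encode a negative errno value in the top page of the address space. */
static void *ex_err_ptr(long error)
{
	return (void *)error;
}

/* Recover the errno value from an error-encoded pointer. */
static long ex_ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* True only for pointers in the last EX_MAX_ERRNO bytes of the address space. */
static int ex_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-EX_MAX_ERRNO;
}

static int ex_dummy_device;

/* Hypothetical registration helper: valid pointer on success, encoded errno on failure. */
static void *ex_register(int fail)
{
	return fail ? ex_err_ptr(-ENOMEM) : (void *)&ex_dummy_device;
}
```

A plain `if (!root)` test would miss such a failure, since the encoded error is a non-NULL value.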
static __init void qemu_header(void)

@ -83,11 +83,10 @@ config MSI_BITMAP_SELFTEST
depends on DEBUG_KERNEL
config GUEST_STATE_BUFFER_TEST
def_tristate n
def_tristate KUNIT_ALL_TESTS
prompt "Enable Guest State Buffer unit tests"
depends on KUNIT
depends on KVM_BOOK3S_HV_POSSIBLE
default KUNIT_ALL_TESTS
help
The Guest State Buffer is a data format specified in the PAPR.
It is by hcalls to communicate the state of L2 guests between

@ -76,7 +76,6 @@ CONFIG_SERIAL_8250_CONSOLE=y
# CONFIG_HW_RANDOM is not set
# CONFIG_HWMON is not set
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_TILEBLITTING=y
CONFIG_FB_RADEON=y
CONFIG_FB_3DFX=y

@ -76,7 +76,6 @@ CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_NVRAM=y
# CONFIG_HWMON is not set
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_OF=y
CONFIG_FB_MATROX=y
CONFIG_FB_MATROX_MILLENIUM=y

@ -85,6 +85,8 @@ CONFIG_PMAC_SMU=y
CONFIG_MAC_EMUMOUSEBTN=y
CONFIG_WINDFARM=y
CONFIG_WINDFARM_PM81=y
CONFIG_WINDFARM_PM72=y
CONFIG_WINDFARM_RM31=y
CONFIG_WINDFARM_PM91=y
CONFIG_WINDFARM_PM112=y
CONFIG_WINDFARM_PM121=y
@ -121,7 +123,6 @@ CONFIG_I2C_CHARDEV=y
CONFIG_AGP=m
CONFIG_AGP_UNINORTH=m
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_TILEBLITTING=y
CONFIG_FB_OF=y
CONFIG_FB_NVIDIA=y

@ -98,7 +98,6 @@ CONFIG_SENSORS_LM85=y
CONFIG_SENSORS_LM90=y
CONFIG_DRM=y
CONFIG_DRM_RADEON=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_TILEBLITTING=y
CONFIG_FB_VGA16=y
CONFIG_FB_NVIDIA=y

@ -196,7 +196,6 @@ CONFIG_I2C_CHARDEV=y
# CONFIG_PTP_1588_CLOCK is not set
CONFIG_DRM=y
CONFIG_DRM_AST=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_OF=y
CONFIG_FB_MATROX=m
CONFIG_FB_MATROX_MILLENIUM=y

@ -249,7 +249,6 @@ CONFIG_I2C_CHARDEV=y
CONFIG_I2C_AMD8111=y
CONFIG_I2C_PASEMI=y
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_OF=y
CONFIG_FB_MATROX=y
CONFIG_FB_MATROX_MILLENIUM=y

@ -118,7 +118,6 @@ CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_I2C_CHARDEV=y
CONFIG_I2C_AMD8111=y
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_OF=y
CONFIG_FB_MATROX=y
CONFIG_FB_MATROX_MILLENIUM=y

@ -214,7 +214,6 @@ CONFIG_SENSORS_IBMPOWERNV=m
CONFIG_DRM=m
CONFIG_DRM_AST=m
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
# CONFIG_VGA_CONSOLE is not set
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_LOGO=y

@ -79,10 +79,6 @@ extern int pmac_i2c_match_adapter(struct device_node *dev,
struct i2c_adapter *adapter);
/* (legacy) Locking functions exposed to i2c-keywest */
extern int pmac_low_i2c_lock(struct device_node *np);
extern int pmac_low_i2c_unlock(struct device_node *np);
/* Access functions for platform code */
extern int pmac_i2c_open(struct pmac_i2c_bus *bus, int polled);
extern void pmac_i2c_close(struct pmac_i2c_bus *bus);

@ -458,6 +458,10 @@ DEFINE_PER_CPU(u8, irq_work_pending);
#endif /* 32 vs 64 bit */
/*
* Must be called with preemption disabled since it updates
* per-CPU irq_work state and programs the local CPU decrementer.
*/
void arch_irq_work_raise(void)
{
/*
@ -471,10 +475,8 @@ void arch_irq_work_raise(void)
* which could get tangled up if we're messing with the same state
* here.
*/
preempt_disable();
set_irq_work_pending_flag();
set_dec(1);
preempt_enable();
}
static void set_dec_or_work(u64 val)

@ -62,6 +62,12 @@ CC32FLAGSREMOVE += -fno-stack-clash-protection
# 32-bit one. clang validates the values passed to these arguments during
# parsing, even when -fno-stack-protector is passed afterwards.
CC32FLAGSREMOVE += -mstack-protector-guard%
# ftrace is disabled for the vdso but arch/powerpc/Makefile adds this define to
# KBUILD_CPPFLAGS, which enables use of the 'patchable_function_entry'
# attribute in the 'inline' define via 'notrace'. This attribute is not
# supported for the powerpcle target, resulting in many instances of
# -Wunknown-attributes.
CC32FLAGSREMOVE += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
endif
LD32FLAGS := -Wl,-soname=linux-vdso32.so.1
AS32FLAGS := -D__VDSO32__

@ -16,4 +16,4 @@ GCOV_PROFILE_core_$(BITS).o := n
KCOV_INSTRUMENT_core_$(BITS).o := n
UBSAN_SANITIZE_core_$(BITS).o := n
KASAN_SANITIZE_core.o := n
KASAN_SANITIZE_core_$(BITS) := n
KASAN_SANITIZE_core_$(BITS).o := n

@ -52,7 +52,14 @@ int exit_vmx_usercopy(void)
}
EXPORT_SYMBOL(exit_vmx_usercopy);
int enter_vmx_ops(void)
/*
* Can be called from kexec copy_page() path with MMU off. The kexec
* code sets preempt_count to HARDIRQ_OFFSET so we return early here.
* Since in_interrupt() is always inline, __no_sanitize_address on this
* function is sufficient to avoid KASAN shadow memory accesses in real
* mode.
*/
int __no_sanitize_address enter_vmx_ops(void)
{
if (in_interrupt())
return 0;

@ -2242,6 +2242,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
const u64 last_period = event->hw.last_period;
s64 prev, delta, left;
int record = 0;
int mark_event = regs->dsisr & MMCRA_SAMPLE_ENABLE;
if (event->hw.state & PERF_HES_STOPPED) {
write_pmc(event->hw.idx, 0);
@ -2304,9 +2305,9 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
* In ISA v3.0 and before values "0" and "7" are considered reserved.
* In ISA v3.1, value "7" has been used to indicate "larx/stcx".
* Drop the sample if "type" has reserved values for this field with a
* ISA version check.
* ISA version check for marked events.
*/
if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
if (mark_event && event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
ppmu->get_mem_data_src) {
val = (regs->dar & SIER_TYPE_MASK) >> SIER_TYPE_SHIFT;
if (val == 0 || (val == 7 && !cpu_has_feature(CPU_FTR_ARCH_31))) {

@ -210,7 +210,7 @@ static ssize_t processor_bus_topology_show(struct device *dev, struct device_att
0, 0, buf, &n, arg);
if (!ret)
return n;
goto out_success;
if (ret != H_PARAMETER)
goto out;
@ -244,12 +244,14 @@ static ssize_t processor_bus_topology_show(struct device *dev, struct device_att
starting_index, 0, buf, &n, arg);
if (!ret)
return n;
goto out_success;
if (ret != H_PARAMETER)
goto out;
}
out_success:
put_cpu_var(hv_gpci_reqb);
return n;
out:
@ -278,7 +280,7 @@ static ssize_t processor_config_show(struct device *dev, struct device_attribute
0, 0, buf, &n, arg);
if (!ret)
return n;
goto out_success;
if (ret != H_PARAMETER)
goto out;
@ -312,12 +314,14 @@ static ssize_t processor_config_show(struct device *dev, struct device_attribute
starting_index, 0, buf, &n, arg);
if (!ret)
return n;
goto out_success;
if (ret != H_PARAMETER)
goto out;
}
out_success:
put_cpu_var(hv_gpci_reqb);
return n;
out:
@ -346,7 +350,7 @@ static ssize_t affinity_domain_via_virtual_processor_show(struct device *dev,
0, 0, buf, &n, arg);
if (!ret)
return n;
goto out_success;
if (ret != H_PARAMETER)
goto out;
@ -382,12 +386,14 @@ static ssize_t affinity_domain_via_virtual_processor_show(struct device *dev,
starting_index, secondary_index, buf, &n, arg);
if (!ret)
return n;
goto out_success;
if (ret != H_PARAMETER)
goto out;
}
out_success:
put_cpu_var(hv_gpci_reqb);
return n;
out:
@ -416,7 +422,7 @@ static ssize_t affinity_domain_via_domain_show(struct device *dev, struct device
0, 0, buf, &n, arg);
if (!ret)
return n;
goto out_success;
if (ret != H_PARAMETER)
goto out;
@ -448,12 +454,14 @@ static ssize_t affinity_domain_via_domain_show(struct device *dev, struct device
starting_index, 0, buf, &n, arg);
if (!ret)
return n;
goto out_success;
if (ret != H_PARAMETER)
goto out;
}
out_success:
put_cpu_var(hv_gpci_reqb);
return n;
out:
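
The hunks in this file route the success path through a label that calls `put_cpu_var()` instead of returning directly. A simplified model of why that matters, with a counter standing in for the pinning that `get_cpu_var()` takes (all `ex_` names are hypothetical, and the two labels of the original are folded into one here):

```c
/* Models the nesting depth taken by get_cpu_var() and dropped by put_cpu_var(). */
static int ex_pin_depth;

static void ex_get(void)
{
	ex_pin_depth++;
}

static void ex_put(void)
{
	ex_pin_depth--;
}

/*
 * Returns n when the (simulated) hypervisor call succeeds and -1 otherwise.
 * Every path leaves through the label, so the get/put pair stays balanced.
 */
static long ex_show(int hcall_ret, long n)
{
	long ret;

	ex_get();
	if (!hcall_ret) {
		ret = n;	/* success: still must release before returning */
		goto out;
	}
	ret = -1;
out:
	ex_put();
	return ret;
}
```

A bare `return n;` in the success branch would leave `ex_pin_depth` at 1, which is the imbalance the diff appears to address.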

@ -293,6 +293,8 @@ static int pika_dtm_thread(void __iomem *fpga)
schedule_timeout(HZ);
}
put_device(&client->dev);
return 0;
}

@ -27,8 +27,8 @@
static void __init km82xx_pic_init(void)
{
struct device_node *np __free(device_node);
np = of_find_compatible_node(NULL, NULL, "fsl,pq2-pic");
struct device_node *np __free(device_node) = of_find_compatible_node(NULL,
NULL, "fsl,pq2-pic");
if (!np) {
pr_err("PIC init: can not find cpm-pic node\n");
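
The change above merges the declaration of a `__free()` variable with its initialisation. A sketch of the underlying mechanism using the GCC/Clang `cleanup` attribute, on the assumption that the kernel helper is built on it; `ex_auto_free`, `ex_release`, and `ex_use` are invented for this example:

```c
#include <stdlib.h>

/* Counts how many times the cleanup handler actually freed something. */
static int ex_released;

/* Cleanup handler: receives a pointer to the variable going out of scope. */
static void ex_release(int **p)
{
	if (*p) {
		free(*p);
		ex_released++;
	}
}

#define ex_auto_free __attribute__((cleanup(ex_release)))

/*
 * The variable is initialised in the same statement that declares it,
 * so the handler never runs on an indeterminate pointer, whichever
 * return path is taken.
 */
static int ex_use(int fail_alloc)
{
	int *node ex_auto_free = fail_alloc ? NULL : malloc(sizeof(*node));

	if (!node)
		return -1;
	*node = 7;
	return *node;	/* value is read before the handler frees node */
}
```

Splitting declaration and assignment leaves a window in which a scope exit would hand the handler an uninitialised pointer.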

@ -477,7 +477,7 @@ int cpm1_gpiochip_add16(struct device *dev)
struct device_node *np = dev->of_node;
struct cpm1_gpio16_chip *cpm1_gc;
struct gpio_chip *gc;
u16 mask;
u32 mask;
cpm1_gc = devm_kzalloc(dev, sizeof(*cpm1_gc), GFP_KERNEL);
if (!cpm1_gc)
@ -485,7 +485,7 @@ int cpm1_gpiochip_add16(struct device *dev)
spin_lock_init(&cpm1_gc->lock);
if (!of_property_read_u16(np, "fsl,cpm1-gpio-irq-mask", &mask)) {
if (!of_property_read_u32(np, "fsl,cpm1-gpio-irq-mask", &mask)) {
int i, j;
for (i = 0, j = 0; i < 16; i++)

@ -272,13 +272,12 @@ void __init pas_pci_init(void)
{
struct device_node *root = of_find_node_by_path("/");
struct device_node *np;
int res;
pci_set_flags(PCI_SCAN_ALL_PCIE_DEVS);
np = of_find_compatible_node(root, NULL, "pasemi,rootbus");
if (np) {
res = pas_add_bridge(np);
pas_add_bridge(np);
of_node_put(np);
}
of_node_put(root);

@ -1058,40 +1058,6 @@ int pmac_i2c_match_adapter(struct device_node *dev, struct i2c_adapter *adapter)
}
EXPORT_SYMBOL_GPL(pmac_i2c_match_adapter);
int pmac_low_i2c_lock(struct device_node *np)
{
struct pmac_i2c_bus *bus, *found = NULL;
list_for_each_entry(bus, &pmac_i2c_busses, link) {
if (np == bus->controller) {
found = bus;
break;
}
}
if (!found)
return -ENODEV;
return pmac_i2c_open(bus, 0);
}
EXPORT_SYMBOL_GPL(pmac_low_i2c_lock);
int pmac_low_i2c_unlock(struct device_node *np)
{
struct pmac_i2c_bus *bus, *found = NULL;
list_for_each_entry(bus, &pmac_i2c_busses, link) {
if (np == bus->controller) {
found = bus;
break;
}
}
if (!found)
return -ENODEV;
pmac_i2c_close(bus);
return 0;
}
EXPORT_SYMBOL_GPL(pmac_low_i2c_unlock);
int pmac_i2c_open(struct pmac_i2c_bus *bus, int polled)
{
int rc;

@ -950,8 +950,6 @@ static int __init ps3_start_probe_thread(enum ps3_bus_type bus_type)
static int __init ps3_register_devices(void)
{
int result;
if (!firmware_has_feature(FW_FEATURE_PS3_LV1))
return -ENODEV;
@ -959,7 +957,7 @@ static int __init ps3_register_devices(void)
/* ps3_repository_dump_bus_info(); */
result = ps3_start_probe_thread(PS3_BUS_TYPE_STORAGE);
ps3_start_probe_thread(PS3_BUS_TYPE_STORAGE);
ps3_register_vuart_devices();

@ -16,6 +16,7 @@ static void *htm_buf;
static void *htm_status_buf;
static void *htm_info_buf;
static void *htm_caps_buf;
static void *htm_mem_buf;
static u32 nodeindex;
static u32 nodalchipindex;
static u32 coreindexonchip;
@ -86,7 +87,7 @@ static ssize_t htm_return_check(long rc)
static ssize_t htmdump_read(struct file *filp, char __user *ubuf,
size_t count, loff_t *ppos)
{
void *htm_buf = filp->private_data;
void *htm_buf_data = filp->private_data;
unsigned long page, read_size, available;
loff_t offset;
long rc, ret;
@ -100,7 +101,7 @@ static ssize_t htmdump_read(struct file *filp, char __user *ubuf,
* - last three values are address, size and offset
*/
rc = htm_hcall_wrapper(htmflags, nodeindex, nodalchipindex, coreindexonchip,
htmtype, H_HTM_OP_DUMP_DATA, virt_to_phys(htm_buf),
htmtype, H_HTM_OP_DUMP_DATA, virt_to_phys(htm_buf_data),
PAGE_SIZE, page);
ret = htm_return_check(rc);
@ -112,7 +113,61 @@ static ssize_t htmdump_read(struct file *filp, char __user *ubuf,
available = PAGE_SIZE;
read_size = min(count, available);
*ppos += read_size;
return simple_read_from_buffer(ubuf, count, &offset, htm_buf, available);
return simple_read_from_buffer(ubuf, count, &offset, htm_buf_data, available);
}
static ssize_t htmsystem_mem_read(struct file *filp, char __user *ubuf,
size_t count, loff_t *ppos)
{
void *htm_mem_data = filp->private_data;
long rc, ret;
u64 *num_entries;
u64 to_copy = 0;
loff_t offset = 0;
u64 mem_offset = 0;
/*
* Invoke H_HTM call with:
* - operation as htm status (H_HTM_OP_STATUS)
* - last three values as addr, size and offset. "offset"
* is value from output buffer header that points to next
* entry to dump. 0 is the first entry to dump. next entry
* is read from the output bufferbyte offset 0x8.
*
* When first time hcall is invoked, mem_offset should be
* zero because zero is the first entry.
* In the next hcall, offset of next entry to read from is
* picked from output buffer header itself. So don't fill
* mem_offset for first read.
*
* If there is no further data to read in next iteration,
* offset value from output buffer header will point to -1.
*/
if (*ppos) {
mem_offset = *(u64 *)(htm_mem_data + 0x8);
if (mem_offset == -1)
return 0;
}
rc = htm_hcall_wrapper(htmflags, nodeindex, nodalchipindex, coreindexonchip,
htmtype, H_HTM_OP_DUMP_SYSMEM_CONF, virt_to_phys(htm_mem_data),
PAGE_SIZE, be64_to_cpu(mem_offset));
ret = htm_return_check(rc);
if (ret <= 0) {
pr_debug("H_HTM hcall returned for op: H_HTM_OP_DUMP_SYSMEM_CONF with hcall returning %ld\n", ret);
return ret;
}
/*
* HTM system mem buffer, start of buffer + 0x10 gives the
* number of HTM entries in the buffer.
* So total count to copy is:
* 32 bytes (for first 5 fields) + (number of HTM entries * entry size)
*/
num_entries = htm_mem_data + 0x10;
to_copy = 32 + (be64_to_cpu(*num_entries) * 32);
*ppos += to_copy;
return simple_read_from_buffer(ubuf, count, &offset, htm_mem_data, to_copy);
}
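
The comments in `htmsystem_mem_read()` describe a big-endian buffer header: the next-entry offset at byte 0x8 (all ones when nothing is left) and the entry count at byte 0x10, followed by fixed-size entries after a 32-byte header. A userspace sketch of that arithmetic, with hypothetical `ex_` helpers and a scratch buffer used only for illustration:

```c
#include <stdint.h>
#include <string.h>

#define EX_HDR_LEN         32u
#define EX_NEXT_OFF        0x8u
#define EX_NUM_ENTRIES_OFF 0x10u

/* Read a big-endian 64-bit value byte by byte, independent of host endianness. */
static uint64_t ex_read_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Store a 64-bit value in big-endian byte order. */
static void ex_write_be64(unsigned char *p, uint64_t v)
{
	for (int i = 7; i >= 0; i--) {
		p[i] = (unsigned char)(v & 0xff);
		v >>= 8;
	}
}

/* Total bytes to hand to the reader: header plus entries * entry size. */
static uint64_t ex_bytes_to_copy(const unsigned char *buf, uint64_t entry_size)
{
	return EX_HDR_LEN + ex_read_be64(buf + EX_NUM_ENTRIES_OFF) * entry_size;
}

/* An all-ones "next" field marks the end of the data. */
static int ex_has_more(const unsigned char *buf)
{
	return ex_read_be64(buf + EX_NEXT_OFF) != UINT64_MAX;
}

/* Build a header in a scratch buffer so the helpers can be exercised. */
static const unsigned char *ex_fill(uint64_t next, uint64_t entries)
{
	static unsigned char buf[EX_HDR_LEN];

	memset(buf, 0, sizeof(buf));
	ex_write_be64(buf + EX_NEXT_OFF, next);
	ex_write_be64(buf + EX_NUM_ENTRIES_OFF, entries);
	return buf;
}
```

With three 32-byte entries, `ex_bytes_to_copy()` yields 32 + 3 * 32 = 128 bytes, matching the formula in the comment above.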
static const struct file_operations htmdump_fops = {
@ -121,6 +176,12 @@ static const struct file_operations htmdump_fops = {
.open = simple_open,
};
static const struct file_operations htmsystem_mem_fops = {
.llseek = NULL,
.read = htmsystem_mem_read,
.open = simple_open,
};
static int htmconfigure_set(void *data, u64 val)
{
long rc, ret;
@ -226,20 +287,31 @@ static int htmstart_get(void *data, u64 *val)
static ssize_t htmstatus_read(struct file *filp, char __user *ubuf,
size_t count, loff_t *ppos)
{
void *htm_status_buf = filp->private_data;
void *htm_status_data = filp->private_data;
long rc, ret;
u64 *num_entries;
u64 to_copy;
int htmstatus_flag;
loff_t offset = 0;
u64 status_offset = 0;
/*
* Invoke H_HTM call with:
* - operation as htm status (H_HTM_OP_STATUS)
* - last three values as addr, size and offset
* - last three values as addr, size and offset.
* "offset" is value from output buffer header
* that points to next entry to dump. 0 is the first
* entry to dump. next entry is read from the output
* bufferbyte offset 0x8.
*/
if (*ppos) {
status_offset = *(u64 *)(htm_status_data + 0x8);
if (status_offset == -1)
return 0;
}
rc = htm_hcall_wrapper(htmflags, nodeindex, nodalchipindex, coreindexonchip,
htmtype, H_HTM_OP_STATUS, virt_to_phys(htm_status_buf),
PAGE_SIZE, 0);
htmtype, H_HTM_OP_STATUS, virt_to_phys(htm_status_data),
PAGE_SIZE, be64_to_cpu(status_offset));
ret = htm_return_check(rc);
if (ret <= 0) {
@ -255,13 +327,15 @@ static ssize_t htmstatus_read(struct file *filp, char __user *ubuf,
* So total count to copy is:
* 32 bytes (for first 7 fields) + (number of HTM entries * entry size)
*/
num_entries = htm_status_buf + 0x10;
num_entries = htm_status_data + 0x10;
if (htmtype == 0x2)
htmstatus_flag = 0x8;
else
htmstatus_flag = 0x6;
to_copy = 32 + (be64_to_cpu(*num_entries) * htmstatus_flag);
return simple_read_from_buffer(ubuf, count, ppos, htm_status_buf, to_copy);
*ppos += to_copy;
return simple_read_from_buffer(ubuf, count, &offset, htm_status_data, to_copy);
}
static const struct file_operations htmstatus_fops = {
@ -273,19 +347,30 @@ static const struct file_operations htmstatus_fops = {
static ssize_t htminfo_read(struct file *filp, char __user *ubuf,
size_t count, loff_t *ppos)
{
void *htm_info_buf = filp->private_data;
void *htm_info_data = filp->private_data;
long rc, ret;
u64 *num_entries;
u64 to_copy;
loff_t offset = 0;
u64 info_offset = 0;
/*
* Invoke H_HTM call with:
* - operation as htm status (H_HTM_OP_STATUS)
* - last three values as addr, size and offset
* "offset" is value from output buffer header
* that points to next entry to dump. 0 is the first
* entry to dump. next entry is read from the output
* bufferbyte offset 0x8.
*/
if (*ppos) {
info_offset = *(u64 *)(htm_info_data + 0x8);
if (info_offset == -1)
return 0;
}
rc = htm_hcall_wrapper(htmflags, nodeindex, nodalchipindex, coreindexonchip,
htmtype, H_HTM_OP_DUMP_SYSPROC_CONF, virt_to_phys(htm_info_buf),
PAGE_SIZE, 0);
htmtype, H_HTM_OP_DUMP_SYSPROC_CONF, virt_to_phys(htm_info_data),
PAGE_SIZE, be64_to_cpu(info_offset));
ret = htm_return_check(rc);
if (ret <= 0) {
@ -301,15 +386,17 @@ static ssize_t htminfo_read(struct file *filp, char __user *ubuf,
* So total count to copy is:
* 32 bytes (for first 5 fields) + (number of HTM entries * entry size)
*/
num_entries = htm_info_buf + 0x10;
num_entries = htm_info_data + 0x10;
to_copy = 32 + (be64_to_cpu(*num_entries) * 16);
return simple_read_from_buffer(ubuf, count, ppos, htm_info_buf, to_copy);
*ppos += to_copy;
return simple_read_from_buffer(ubuf, count, &offset, htm_info_data, to_copy);
}
static ssize_t htmcaps_read(struct file *filp, char __user *ubuf,
size_t count, loff_t *ppos)
{
void *htm_caps_buf = filp->private_data;
void *htm_caps_data = filp->private_data;
long rc, ret;
/*
@ -319,7 +406,7 @@ static ssize_t htmcaps_read(struct file *filp, char __user *ubuf,
* and zero
*/
rc = htm_hcall_wrapper(htmflags, nodeindex, nodalchipindex, coreindexonchip,
htmtype, H_HTM_OP_CAPABILITIES, virt_to_phys(htm_caps_buf),
htmtype, H_HTM_OP_CAPABILITIES, virt_to_phys(htm_caps_data),
0x80, 0);
ret = htm_return_check(rc);
@ -328,7 +415,7 @@ static ssize_t htmcaps_read(struct file *filp, char __user *ubuf,
return ret;
}
return simple_read_from_buffer(ubuf, count, ppos, htm_caps_buf, 0x80);
return simple_read_from_buffer(ubuf, count, ppos, htm_caps_data, 0x80);
}
static const struct file_operations htminfo_fops = {
@ -457,9 +544,17 @@ static int htmdump_init_debugfs(void)
return -ENOMEM;
}
/* Memory to present HTM system memory configuration */
htm_mem_buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
if (!htm_mem_buf) {
pr_err("Failed to allocate htm mem buf\n");
return -ENOMEM;
}
debugfs_create_file("htmstatus", 0400, htmdump_debugfs_dir, htm_status_buf, &htmstatus_fops);
debugfs_create_file("htminfo", 0400, htmdump_debugfs_dir, htm_info_buf, &htminfo_fops);
debugfs_create_file("htmcaps", 0400, htmdump_debugfs_dir, htm_caps_buf, &htmcaps_fops);
debugfs_create_file("htmsystem_mem", 0400, htmdump_debugfs_dir, htm_mem_buf, &htmsystem_mem_fops);
return 0;
}
@ -482,6 +577,10 @@ static void __exit htmdump_exit(void)
{
debugfs_remove_recursive(htmdump_debugfs_dir);
kfree(htm_buf);
kfree(htm_status_buf);
kfree(htm_info_buf);
kfree(htm_caps_buf);
kfree(htm_mem_buf);
}
module_init(htmdump_init);

@ -190,33 +190,34 @@ static int hvpipe_rtas_recv_msg(char __user *buf, int size)
return -ENOMEM;
}
ret = rtas_ibm_receive_hvpipe_msg(work_area, &srcID,
&bytes_written);
if (!ret) {
/*
* Recv HVPIPE RTAS is successful.
* When releasing FD or no one is waiting on the
* specific source, issue recv HVPIPE RTAS call
* so that pipe is not blocked - this func is called
* with NULL buf.
*/
if (buf) {
if (size < bytes_written) {
pr_err("Received the payload size = %d, but the buffer size = %d\n",
bytes_written, size);
bytes_written = size;
}
ret = copy_to_user(buf,
rtas_work_area_raw_buf(work_area),
bytes_written);
if (!ret)
ret = bytes_written;
}
} else {
pr_err("ibm,receive-hvpipe-msg failed with %d\n",
ret);
/*
* Recv HVPIPE RTAS is successful.
* When releasing FD or no one is waiting on the
* specific source, issue recv HVPIPE RTAS call
* so that pipe is not blocked - this func is called
* with NULL buf.
*/
ret = rtas_ibm_receive_hvpipe_msg(work_area, &srcID, &bytes_written);
if (ret) {
pr_err("ibm,receive-hvpipe-msg failed with %d\n", ret);
goto out;
}
if (!buf)
goto out;
if (size < bytes_written) {
pr_err("Received the payload size = %d, but the buffer size = %d\n",
bytes_written, size);
bytes_written = size;
}
if (copy_to_user(buf, rtas_work_area_raw_buf(work_area), bytes_written))
ret = -EFAULT;
else
ret = bytes_written;
out:
rtas_work_area_free(work_area);
return ret;
}
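
The rewritten function no longer passes the result of `copy_to_user()` back to the caller. That call returns the number of bytes it could not copy, which is a positive count and not an errno. A small model of the corrected flow, where `ex_copy_out()` is a made-up stand-in that fails after `writable` bytes:

```c
#include <errno.h>
#include <string.h>

/* Scratch destination used to exercise the helpers. */
static char ex_user_buf[16];

/*
 * Hypothetical stand-in for copy_to_user(): copies at most `writable`
 * bytes and returns how many bytes were NOT copied (0 means success).
 */
static unsigned long ex_copy_out(char *dst, const char *src,
				 unsigned long len, unsigned long writable)
{
	unsigned long n = len < writable ? len : writable;

	memcpy(dst, src, n);
	return len - n;
}

/* Returns the number of payload bytes delivered, or -EFAULT on a short copy. */
static long ex_recv(char *dst, unsigned long dst_size, unsigned long writable,
		    const char *msg, unsigned long msg_len)
{
	if (dst_size < msg_len)
		msg_len = dst_size;	/* truncate to the caller's buffer */

	if (ex_copy_out(dst, msg, msg_len, writable))
		return -EFAULT;		/* never leak the leftover byte count */

	return (long)msg_len;
}
```

Returning the leftover count directly would look like a successful short read to the caller, which is why the diff maps it to `-EFAULT`.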
@ -327,8 +328,8 @@ static ssize_t papr_hvpipe_handle_read(struct file *file,
{
struct hvpipe_source_info *src_info = file->private_data;
struct papr_hvpipe_hdr hdr;
long ret;
struct papr_hvpipe_hdr hdr = {};
ssize_t ret = 0;
/*
* Return -ENXIO during migration
@ -376,7 +377,7 @@ static ssize_t papr_hvpipe_handle_read(struct file *file,
ret = copy_to_user(buf, &hdr, HVPIPE_HDR_LEN);
if (ret)
return ret;
return -EFAULT;
/*
* Message event has payload, so get the payload with
@ -385,19 +386,23 @@ static ssize_t papr_hvpipe_handle_read(struct file *file,
if (hdr.flags & HVPIPE_MSG_AVAILABLE) {
ret = hvpipe_rtas_recv_msg(buf + HVPIPE_HDR_LEN,
size - HVPIPE_HDR_LEN);
if (ret > 0) {
/*
* Always clear MSG_AVAILABLE once the RTAS call has drained
* the message, regardless of whether copy_to_user succeeded.
*/
if (ret >= 0 || ret == -EFAULT)
src_info->hvpipe_status &= ~HVPIPE_MSG_AVAILABLE;
ret += HVPIPE_HDR_LEN;
}
} else if (hdr.flags & HVPIPE_LOST_CONNECTION) {
/*
* Hypervisor is closing the pipe for the specific
* source. So notify user space.
*/
src_info->hvpipe_status &= ~HVPIPE_LOST_CONNECTION;
ret = HVPIPE_HDR_LEN;
}
if (ret >= 0)
ret += HVPIPE_HDR_LEN;
return ret;
}
@ -444,16 +449,18 @@ static int papr_hvpipe_handle_release(struct inode *inode,
struct file *file)
{
struct hvpipe_source_info *src_info;
unsigned long flags;
/*
* Hold the lock, remove source from src_list, reset the
* hvpipe status and release the lock to prevent any race
* with message event IRQ.
*/
spin_lock(&hvpipe_src_list_lock);
spin_lock_irqsave(&hvpipe_src_list_lock, flags);
src_info = file->private_data;
list_del(&src_info->list);
file->private_data = NULL;
spin_unlock_irqrestore(&hvpipe_src_list_lock, flags);
/*
* If the pipe for this specific source has any pending
* payload, issue recv HVPIPE RTAS so that pipe will not
@ -461,10 +468,8 @@ static int papr_hvpipe_handle_release(struct inode *inode,
*/
if (src_info->hvpipe_status & HVPIPE_MSG_AVAILABLE) {
src_info->hvpipe_status = 0;
spin_unlock(&hvpipe_src_list_lock);
hvpipe_rtas_recv_msg(NULL, 0);
} else
spin_unlock(&hvpipe_src_list_lock);
}
kfree(src_info);
return 0;
@ -479,50 +484,53 @@ static const struct file_operations papr_hvpipe_handle_ops = {
static int papr_hvpipe_dev_create_handle(u32 srcID)
{
struct hvpipe_source_info *src_info __free(kfree) = NULL;
spin_lock(&hvpipe_src_list_lock);
/*
* Do not allow more than one process communicates with
* each source.
*/
src_info = hvpipe_find_source(srcID);
if (src_info) {
spin_unlock(&hvpipe_src_list_lock);
pr_err("pid(%d) is already using the source(%d)\n",
src_info->tsk->pid, srcID);
return -EALREADY;
}
spin_unlock(&hvpipe_src_list_lock);
struct hvpipe_source_info *src_info;
int fd;
unsigned long flags;
src_info = kzalloc_obj(*src_info, GFP_KERNEL_ACCOUNT);
if (!src_info)
return -ENOMEM;
src_info->srcID = srcID;
src_info->tsk = current;
init_waitqueue_head(&src_info->recv_wqh);
FD_PREPARE(fdf, O_RDONLY | O_CLOEXEC,
anon_inode_getfile("[papr-hvpipe]", &papr_hvpipe_handle_ops,
(void *)src_info, O_RDWR));
if (fdf.err)
return fdf.err;
retain_and_null_ptr(src_info);
spin_lock(&hvpipe_src_list_lock);
/*
* If two processes are executing ioctl() for the same
* source ID concurrently, prevent the second process to
* acquire FD.
* Do not allow more than one process communicates with
* each source.
*/
spin_lock_irqsave(&hvpipe_src_list_lock, flags);
if (hvpipe_find_source(srcID)) {
spin_unlock(&hvpipe_src_list_lock);
spin_unlock_irqrestore(&hvpipe_src_list_lock, flags);
pr_err("pid(%s:%d) could not get the source(%d)\n",
current->comm, task_pid_nr(current), srcID);
kfree(src_info);
return -EALREADY;
}
list_add(&src_info->list, &hvpipe_src_list);
spin_unlock(&hvpipe_src_list_lock);
return fd_publish(fdf);
spin_unlock_irqrestore(&hvpipe_src_list_lock, flags);
fd = FD_ADD(O_RDONLY | O_CLOEXEC,
anon_inode_getfile("[papr-hvpipe]", &papr_hvpipe_handle_ops,
(void *)src_info, O_RDWR));
if (fd < 0) {
spin_lock_irqsave(&hvpipe_src_list_lock, flags);
list_del(&src_info->list);
spin_unlock_irqrestore(&hvpipe_src_list_lock, flags);
/*
* if we fail to add FD, that means no userspace program is
* polling. In that case if there is a msg pending because the
* interrupt was fired after the src_info was added to the
* global list, then let's consume it here, to unblock the
* hvpipe
*/
if (src_info->hvpipe_status & HVPIPE_MSG_AVAILABLE)
hvpipe_rtas_recv_msg(NULL, 0);
kfree(src_info);
return fd;
}
return fd;
}
/*
@ -685,20 +693,19 @@ static int __init enable_hvpipe_IRQ(void)
struct device_node *np;
hvpipe_check_exception_token = rtas_function_token(RTAS_FN_CHECK_EXCEPTION);
if (hvpipe_check_exception_token == RTAS_UNKNOWN_SERVICE)
if (hvpipe_check_exception_token == RTAS_UNKNOWN_SERVICE)
return -ENODEV;
/* hvpipe events */
np = of_find_node_by_path("/event-sources/ibm,hvpipe-msg-events");
if (np != NULL) {
request_event_sources_irqs(np, hvpipe_event_interrupt,
"HPIPE_EVENT");
of_node_put(np);
} else {
pr_err("Can not enable hvpipe event IRQ\n");
if (!np) {
pr_err("No device node found, could not enable hvpipe event IRQ\n");
return -ENODEV;
}
request_event_sources_irqs(np, hvpipe_event_interrupt, "HPIPE_EVENT");
of_node_put(np);
return 0;
}
@ -775,23 +782,29 @@ static int __init papr_hvpipe_init(void)
}
ret = enable_hvpipe_IRQ();
if (!ret) {
ret = set_hvpipe_sys_param(1);
if (!ret)
ret = misc_register(&papr_hvpipe_dev);
}
if (ret)
goto out_wq;
if (!ret) {
pr_info("hvpipe feature is enabled\n");
hvpipe_feature = true;
return 0;
}
ret = misc_register(&papr_hvpipe_dev);
if (ret)
goto out_wq;
pr_err("hvpipe feature is not enabled %d\n", ret);
ret = set_hvpipe_sys_param(1);
if (ret)
goto out_misc;
pr_info("hvpipe feature is enabled\n");
hvpipe_feature = true;
return 0;
out_misc:
misc_deregister(&papr_hvpipe_dev);
out_wq:
destroy_workqueue(papr_hvpipe_wq);
out:
kfree(papr_hvpipe_work);
papr_hvpipe_work = NULL;
pr_err("hvpipe feature is not enabled %d\n", ret);
return ret;
}
machine_device_initcall(pseries, papr_hvpipe_init);
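
The reworked `papr_hvpipe_init()` uses stacked `goto` labels so that each failure undoes only the steps already completed, in reverse order. A compact sketch of that pattern with simulated steps (the step names and `ex_init()` are illustrative, not the driver's code):

```c
enum ex_step { EX_NONE, EX_WORKQUEUE, EX_DEVICE, EX_ENABLE };

#define EX_BIT(s) (1u << (s))

/* Bitmask of the resources that are currently set up. */
static unsigned int ex_live;

/* Runs three setup steps; `fail_at` selects the step that should fail. */
static int ex_init(enum ex_step fail_at)
{
	if (fail_at == EX_WORKQUEUE)
		goto out;
	ex_live |= EX_BIT(EX_WORKQUEUE);

	if (fail_at == EX_DEVICE)
		goto out_wq;
	ex_live |= EX_BIT(EX_DEVICE);

	if (fail_at == EX_ENABLE)
		goto out_dev;
	ex_live |= EX_BIT(EX_ENABLE);
	return 0;

out_dev:
	ex_live &= ~EX_BIT(EX_DEVICE);		/* undo the device registration */
out_wq:
	ex_live &= ~EX_BIT(EX_WORKQUEUE);	/* undo the workqueue */
out:
	return -1;
}
```

Each label falls through to the ones below it, so a late failure releases everything acquired before it and an early failure releases nothing it did not acquire.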

@ -21,7 +21,6 @@ struct hvpipe_source_info {
u32 srcID;
u32 hvpipe_status;
wait_queue_head_t recv_wqh; /* wake up poll() waitq */
struct task_struct *tsk;
};
/*

@ -937,6 +937,28 @@ config RISCV_VECTOR_MISALIGNED
help
Enable detecting support for vector misaligned loads and stores.
config RISCV_SBI_FWFT_DELEGATE_MISALIGNED
bool "Request firmware delegation of unaligned access exceptions"
depends on RISCV_SBI
depends on NONPORTABLE
help
Use SBI FWFT to request delegation of load address misaligned and
store address misaligned exceptions, if possible, and prefer Linux
kernel emulation of these accesses to firmware emulation.
Unfortunately, Linux's emulation is still incomplete. Namely, it
currently does not handle vector instructions and KVM guest accesses.
On platforms where these accesses would have been handled by firmware,
enabling this causes unexpected kernel oopses, userspaces crashes and
KVM guest crashes. If you are sure that these are not a problem for
your platform, you can say Y here, which may improve performance.
Saying N here will not worsen emulation support for unaligned accesses
even in the case where the firmware also has incomplete support. It
simply keeps the firmware's emulation enabled.
If you don't know what to do here, say N.
choice
prompt "Unaligned Accesses Support"
default RISCV_PROBE_UNALIGNED_ACCESS

@ -57,7 +57,7 @@ void mips_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
}
tmp = (1U << alt->patch_id);
if (cpu_req_errata && tmp) {
if (cpu_req_errata & tmp) {
mutex_lock(&text_mutex);
patch_text_nosync(ALT_OLD_PTR(alt), ALT_ALT_PTR(alt),
alt->alt_len);
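
The one-character fix above replaces a logical AND with a bitwise AND. A short illustration of the difference, using hypothetical helper names:

```c
/*
 * Buggy form: logical AND is true whenever both operands are non-zero,
 * so any non-empty errata mask matches every patch id.
 */
static int ex_needs_patch_buggy(unsigned int required, unsigned int patch_id)
{
	unsigned int bit = 1u << patch_id;

	return required && bit;
}

/* Fixed form: bitwise AND tests whether this specific bit is set in the mask. */
static int ex_needs_patch(unsigned int required, unsigned int patch_id)
{
	unsigned int bit = 1u << patch_id;

	return (required & bit) != 0;
}
```

With a mask of `0x1` and patch id 2, the buggy test reports a match while the fixed test does not, so the old code could apply alternatives for errata the CPU did not require.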

@ -107,6 +107,8 @@ static long compat_restore_sigcontext(struct pt_regs *regs,
/* sc_regs is structured the same as the start of pt_regs */
err = __copy_from_user(&cregs, &sc->sc_regs, sizeof(sc->sc_regs));
if (unlikely(err))
return err;
cregs_to_regs(&cregs, regs);

View File

@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
/* Copyright (C) 2023 Rivos Inc. */
#include <linux/cfi_types.h>
#include <linux/linkage.h>
#include <asm/asm.h>
@ -9,7 +10,7 @@
/* void __riscv_copy_words_unaligned(void *, const void *, size_t) */
/* Performs a memcpy without aligning buffers, using word loads and stores. */
/* Note: The size is truncated to a multiple of 8 * SZREG */
SYM_FUNC_START(__riscv_copy_words_unaligned)
SYM_TYPED_FUNC_START(__riscv_copy_words_unaligned)
andi a4, a2, ~((8*SZREG)-1)
beqz a4, 2f
add a3, a1, a4
@ -41,7 +42,7 @@ SYM_FUNC_END(__riscv_copy_words_unaligned)
/* void __riscv_copy_bytes_unaligned(void *, const void *, size_t) */
/* Performs a memcpy without aligning buffers, using only byte accesses. */
/* Note: The size is truncated to a multiple of 8 */
SYM_FUNC_START(__riscv_copy_bytes_unaligned)
SYM_TYPED_FUNC_START(__riscv_copy_bytes_unaligned)
andi a4, a2, ~(8-1)
beqz a4, 2f
add a3, a1, a4

View File

@ -896,10 +896,8 @@ static void __init riscv_fill_hwcap_from_isa_string(unsigned long *isa2hwcap)
* CPU cores with the ratified spec will contain non-zero
* marchid.
*/
if (acpi_disabled && boot_vendorid == THEAD_VENDOR_ID && boot_archid == 0x0) {
this_hwcap &= ~isa2hwcap[RISCV_ISA_EXT_v];
if (acpi_disabled && boot_vendorid == THEAD_VENDOR_ID && boot_archid == 0x0)
clear_bit(RISCV_ISA_EXT_v, source_isa);
}
riscv_resolve_isa(source_isa, isainfo->isa, &this_hwcap, isa2hwcap);
@ -1104,16 +1102,16 @@ early_param("riscv_isa_fallback", riscv_isa_fallback_setup);
void __init riscv_fill_hwcap(void)
{
char print_str[NUM_ALPHA_EXTS + 1];
unsigned long isa2hwcap[26] = {0};
unsigned long isa2hwcap[RISCV_ISA_EXT_BASE] = {0};
int i, j;
isa2hwcap['i' - 'a'] = COMPAT_HWCAP_ISA_I;
isa2hwcap['m' - 'a'] = COMPAT_HWCAP_ISA_M;
isa2hwcap['a' - 'a'] = COMPAT_HWCAP_ISA_A;
isa2hwcap['f' - 'a'] = COMPAT_HWCAP_ISA_F;
isa2hwcap['d' - 'a'] = COMPAT_HWCAP_ISA_D;
isa2hwcap['c' - 'a'] = COMPAT_HWCAP_ISA_C;
isa2hwcap['v' - 'a'] = COMPAT_HWCAP_ISA_V;
isa2hwcap[RISCV_ISA_EXT_i] = COMPAT_HWCAP_ISA_I;
isa2hwcap[RISCV_ISA_EXT_m] = COMPAT_HWCAP_ISA_M;
isa2hwcap[RISCV_ISA_EXT_a] = COMPAT_HWCAP_ISA_A;
isa2hwcap[RISCV_ISA_EXT_f] = COMPAT_HWCAP_ISA_F;
isa2hwcap[RISCV_ISA_EXT_d] = COMPAT_HWCAP_ISA_D;
isa2hwcap[RISCV_ISA_EXT_c] = COMPAT_HWCAP_ISA_C;
isa2hwcap[RISCV_ISA_EXT_v] = COMPAT_HWCAP_ISA_V;
if (!acpi_disabled) {
riscv_fill_hwcap_from_isa_string(isa2hwcap);

View File

@ -577,8 +577,8 @@ static int compat_riscv_gpr_set(struct task_struct *target,
struct compat_user_regs_struct cregs;
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &cregs, 0, -1);
cregs_to_regs(&cregs, task_pt_regs(target));
if (!ret)
cregs_to_regs(&cregs, task_pt_regs(target));
return ret;
}

View File

@ -584,7 +584,7 @@ static int cpu_online_check_unaligned_access_emulated(unsigned int cpu)
static bool misaligned_traps_delegated;
#ifdef CONFIG_RISCV_SBI
#if defined(CONFIG_RISCV_SBI_FWFT_DELEGATE_MISALIGNED)
static int cpu_online_sbi_unaligned_setup(unsigned int cpu)
{

View File

@ -109,15 +109,16 @@ void set_indir_lp_lock(struct task_struct *task, bool lock)
task->thread_info.user_cfi_state.ufcfi_locked = lock;
}
/*
* If size is 0, then to be compatible with regular stack we want it to be as big as
* regular stack. Else PAGE_ALIGN it and return back
* The shadow stack only stores the return address and not any variables
* this should be more than sufficient for most applications.
* Else PAGE_ALIGN it and return back
*/
static unsigned long calc_shstk_size(unsigned long size)
{
if (size)
return PAGE_ALIGN(size);
return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G));
return PAGE_ALIGN(min(rlimit(RLIMIT_STACK) / 2, SZ_2G));
}
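
The sizing rule can be sketched in isolation. This assumes the second `return` line in the hunk above, `min(rlimit(RLIMIT_STACK) / 2, SZ_2G)`, is the added one. The 4 KiB page size, the `sketch_` names, and passing the stack rlimit as a parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size */
#define SKETCH_SZ_2G     (2ULL << 30)

/* Round x up to the next multiple of the assumed page size. */
static uint64_t sketch_page_align(uint64_t x)
{
	assert(x <= UINT64_MAX - (SKETCH_PAGE_SIZE - 1));	/* no wrap-around */
	return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * An explicit size is page-aligned as is. A size of 0 falls back to
 * half of the stack rlimit, capped at 2 GiB.
 */
static uint64_t sketch_shstk_size(uint64_t size, uint64_t stack_rlimit)
{
	uint64_t half = stack_rlimit / 2;

	if (size)
		return sketch_page_align(size);
	return sketch_page_align(half < SKETCH_SZ_2G ? half : SKETCH_SZ_2G);
}
```

Under these assumptions an 8 MiB stack limit gives a 4 MiB default shadow stack, and an unlimited stack limit gives the 2 GiB cap.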
/*

View File

@ -2,6 +2,7 @@
/* Copyright (C) 2024 Rivos Inc. */
#include <linux/args.h>
#include <linux/cfi_types.h>
#include <linux/linkage.h>
#include <asm/asm.h>
@ -16,7 +17,7 @@
/* void __riscv_copy_vec_words_unaligned(void *, const void *, size_t) */
/* Performs a memcpy without aligning buffers, using word loads and stores. */
/* Note: The size is truncated to a multiple of WORD_EEW */
SYM_FUNC_START(__riscv_copy_vec_words_unaligned)
SYM_TYPED_FUNC_START(__riscv_copy_vec_words_unaligned)
andi a4, a2, ~(WORD_EEW-1)
beqz a4, 2f
add a3, a1, a4
@ -38,7 +39,7 @@ SYM_FUNC_END(__riscv_copy_vec_words_unaligned)
/* void __riscv_copy_vec_bytes_unaligned(void *, const void *, size_t) */
/* Performs a memcpy without aligning buffers, using only byte accesses. */
/* Note: The size is truncated to a multiple of 8 */
SYM_FUNC_START(__riscv_copy_vec_bytes_unaligned)
SYM_TYPED_FUNC_START(__riscv_copy_vec_bytes_unaligned)
andi a4, a2, ~(8-1)
beqz a4, 2f
add a3, a1, a4

View File

@ -792,6 +792,27 @@ static void __init set_mmap_rnd_bits_max(void)
mmap_rnd_bits_max = MMAP_VA_BITS - PAGE_SHIFT - 3;
}
static bool __init is_vaddr_valid(unsigned long va)
{
unsigned long up = 0;
switch (satp_mode) {
case SATP_MODE_39:
up = 1UL << 38;
break;
case SATP_MODE_48:
up = 1UL << 47;
break;
case SATP_MODE_57:
up = 1UL << 56;
break;
default:
return false;
}
return (va < up) || (va >= (ULONG_MAX - up + 1));
}
/*
* There is a simple way to determine if 4-level is supported by the
* underlying hardware: establish 1:1 mapping in 4-level page table mode
@ -833,6 +854,9 @@ static __init void set_satp_mode(uintptr_t dtb_pa)
set_satp_mode_pmd + PMD_SIZE,
PMD_SIZE, PAGE_KERNEL_EXEC);
retry:
if (!is_vaddr_valid(set_satp_mode_pmd))
goto out;
create_pgd_mapping(early_pg_dir,
set_satp_mode_pmd,
pgtable_l5_enabled ?
@ -855,6 +879,7 @@ static __init void set_satp_mode(uintptr_t dtb_pa)
disable_pgtable_l4();
}
out:
memset(early_pg_dir, 0, PAGE_SIZE);
memset(early_p4d, 0, PAGE_SIZE);
memset(early_pud, 0, PAGE_SIZE);
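
The range test in `is_vaddr_valid()` accepts an address only if it lies in the lower half or the upper half of the address space for the current translation mode. The following user-space sketch expresses the same test. It assumes 64-bit addresses and takes the number of virtual address bits (39, 48, or 57) as a plain parameter instead of reading `satp_mode`. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch only: va_bits stands in for the SATP mode (39, 48 or 57).
 * An address is accepted if it is below 2^(va_bits - 1) or within the
 * top 2^(va_bits - 1) bytes of the 64-bit address space.
 */
static bool sketch_vaddr_valid(uint64_t va, unsigned int va_bits)
{
	uint64_t up;

	switch (va_bits) {
	case 39:
	case 48:
	case 57:
		up = 1ULL << (va_bits - 1);
		break;
	default:
		return false;	/* unknown mode: reject, as the original does */
	}

	return va < up || va >= (UINT64_MAX - up + 1);
}
```

For 39 bits the boundaries fall at `0x4000000000` (first rejected address) and `0xffffffc000000000` (first accepted address in the upper half).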

View File

@ -3310,8 +3310,7 @@ static void aen_host_forward(unsigned long si)
struct zpci_gaite *gaite;
struct kvm *kvm;
gaite = (struct zpci_gaite *)aift->gait +
(si * sizeof(struct zpci_gaite));
gaite = aift->gait + si;
if (gaite->count == 0)
return;
if (gaite->aisb != 0)
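
The simplification to `aift->gait + si` relies on C pointer arithmetic already scaling by the element size. If `aift->gait` is a `struct zpci_gaite *` (an assumption here, since the declaration is not shown), multiplying the index by `sizeof` as well scales the offset twice. A sketch with a hypothetical 16-byte entry type:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte entry standing in for the real table entry. */
struct sketch_entry {
	uint32_t count;
	uint32_t pad[3];
};

static struct sketch_entry sketch_table[64];

/* Old pattern: the index is multiplied by sizeof before the pointer add. */
static ptrdiff_t byte_offset_old(struct sketch_entry *base, size_t si)
{
	/* stay inside the 64-entry table so the arithmetic is defined */
	assert(si * sizeof(struct sketch_entry) <= 64);
	struct sketch_entry *e = base + (si * sizeof(struct sketch_entry));

	return (char *)e - (char *)base;
}

/* New pattern: the pointer add alone scales by the element size. */
static ptrdiff_t byte_offset_new(struct sketch_entry *base, size_t si)
{
	assert(si <= 64);
	struct sketch_entry *e = base + si;

	return (char *)e - (char *)base;
}
```

For index 2 the new pattern lands 32 bytes into the table, which is entry 2. The old pattern lands 512 bytes in, which is entry 32.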

View File

@ -166,7 +166,7 @@ static int kvm_zpci_set_airq(struct zpci_dev *zdev)
fib.fmt0.noi = airq_iv_end(zdev->aibv);
fib.fmt0.aibv = virt_to_phys(zdev->aibv->vector);
fib.fmt0.aibvo = 0;
fib.fmt0.aisb = virt_to_phys(aift->sbv->vector + (zdev->aisb / 64) * 8);
fib.fmt0.aisb = virt_to_phys(aift->sbv->vector) + (zdev->aisb / 64) * 8;
fib.fmt0.aisbo = zdev->aisb & 63;
fib.gd = zdev->gisa;
@ -290,8 +290,7 @@ static int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
phys_to_virt(fib->fmt0.aibv));
spin_lock_irq(&aift->gait_lock);
gaite = (struct zpci_gaite *)aift->gait + (zdev->aisb *
sizeof(struct zpci_gaite));
gaite = aift->gait + zdev->aisb;
/* If assist not requested, host will get all alerts */
if (assist)
@ -309,7 +308,7 @@ static int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
/* Update guest FIB for re-issue */
fib->fmt0.aisbo = zdev->aisb & 63;
fib->fmt0.aisb = virt_to_phys(aift->sbv->vector + (zdev->aisb / 64) * 8);
fib->fmt0.aisb = virt_to_phys(aift->sbv->vector) + (zdev->aisb / 64) * 8;
fib->fmt0.isc = gisc;
/* Save some guest fib values in the host for later use */
@ -357,8 +356,7 @@ static int kvm_s390_pci_aif_disable(struct zpci_dev *zdev, bool force)
if (zdev->kzdev->fib.fmt0.aibv == 0)
goto out;
spin_lock_irq(&aift->gait_lock);
gaite = (struct zpci_gaite *)aift->gait + (zdev->aisb *
sizeof(struct zpci_gaite));
gaite = aift->gait + zdev->aisb;
isc = gaite->gisc;
gaite->count--;
if (gaite->count == 0) {

View File

@ -1294,13 +1294,16 @@ int x86_perf_rdpmc_index(struct perf_event *event)
return event->hw.event_base_rdpmc;
}
static inline int match_prev_assignment(struct hw_perf_event *hwc,
static inline int match_prev_assignment(struct perf_event *event,
struct cpu_hw_events *cpuc,
int i)
{
struct hw_perf_event *hwc = &event->hw;
return hwc->idx == cpuc->assign[i] &&
hwc->last_cpu == smp_processor_id() &&
hwc->last_tag == cpuc->tags[i];
hwc->last_cpu == smp_processor_id() &&
hwc->last_tag == cpuc->tags[i] &&
!is_acr_event_group(event);
}
static void x86_pmu_start(struct perf_event *event, int flags);
@ -1346,7 +1349,7 @@ static void x86_pmu_enable(struct pmu *pmu)
* - no other event has used the counter since
*/
if (hwc->idx == -1 ||
match_prev_assignment(hwc, cpuc, i))
match_prev_assignment(event, cpuc, i))
continue;
/*
@ -1367,7 +1370,7 @@ static void x86_pmu_enable(struct pmu *pmu)
event = cpuc->event_list[i];
hwc = &event->hw;
if (!match_prev_assignment(hwc, cpuc, i))
if (!match_prev_assignment(event, cpuc, i))
x86_assign_hw_event(event, cpuc, i);
else if (i < n_running)
continue;

View File

@ -3118,11 +3118,11 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
intel_set_masks(event, idx);
/*
* Enable IRQ generation (0x8), if not PEBS,
* and enable ring-3 counting (0x2) and ring-0 counting (0x1)
* if requested:
* Enable IRQ generation (0x8), if not PEBS or self-reloaded
* ACR event, and enable ring-3 counting (0x2) and ring-0
* counting (0x1) if requested:
*/
if (!event->attr.precise_ip)
if (!event->attr.precise_ip && !is_acr_self_reload_event(event))
bits |= INTEL_FIXED_0_ENABLE_PMI;
if (hwc->config & ARCH_PERFMON_EVENTSEL_USR)
bits |= INTEL_FIXED_0_USER;
@ -3306,6 +3306,15 @@ static void intel_pmu_enable_event(struct perf_event *event)
intel_set_masks(event, idx);
static_call_cond(intel_pmu_enable_acr_event)(event);
static_call_cond(intel_pmu_enable_event_ext)(event);
/*
* For self-reloaded ACR event, don't enable PMI since
* HW won't set overflow bit in GLOBAL_STATUS. Otherwise,
* the PMI would be recognized as a suspicious NMI.
*/
if (is_acr_self_reload_event(event))
hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
else if (!event->attr.precise_ip)
hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
__x86_pmu_enable_event(hwc, enable_mask);
break;
case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
@ -3332,23 +3341,41 @@ static void intel_pmu_enable_event(struct perf_event *event)
static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
{
struct perf_event *event, *leader;
int i, j, idx;
int i, j, k, bit, idx;
/*
* FIXME: ACR mask parsing relies on cpuc->event_list[] (active events only).
* Disabling an ACR event causes bit-shifting errors in the acr_mask of
* remaining group members. As ACR sampling requires all events to be active,
* this limitation is acceptable for now. Revisit if independent event toggling
* is required.
*/
for (i = 0; i < cpuc->n_events; i++) {
leader = cpuc->event_list[i];
if (!is_acr_event_group(leader))
continue;
/* The ACR events must be contiguous. */
/* Find the last event of the ACR group. */
for (j = i; j < cpuc->n_events; j++) {
event = cpuc->event_list[j];
if (event->group_leader != leader->group_leader)
break;
for_each_set_bit(idx, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
if (i + idx >= cpuc->n_events ||
!is_acr_event_group(cpuc->event_list[i + idx]))
return;
__set_bit(cpuc->assign[i + idx], (unsigned long *)&event->hw.config1);
}
/*
* Translate the user-space ACR mask (attr.config2) into the physical
* counter bitmask (hw.config1) for each ACR event in the group.
* NOTE: ACR event contiguity is guaranteed by intel_pmu_hw_config().
*/
for (k = i; k < j; k++) {
event = cpuc->event_list[k];
event->hw.config1 = 0;
for_each_set_bit(bit, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
idx = i + bit;
/* Event index of ACR group must locate in [i, j). */
if (idx >= j || !is_acr_event_group(cpuc->event_list[idx]))
continue;
__set_bit(cpuc->assign[idx], (unsigned long *)&event->hw.config1);
}
}
i = j - 1;
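
The translation described in the comment above maps each set bit of the user-supplied mask (a position relative to the start of the ACR group) to the physical counter assigned to that event. The standalone sketch below shows that mapping using plain integers instead of kernel bitmaps. It omits the `is_acr_event_group()` check, and all names in it are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slot-to-counter assignment for a four-entry event list. */
static const int sketch_assign[] = { 5, 2, 7, 0 };

/*
 * Translate a group-relative mask into a physical counter mask.
 * The group occupies event-list slots [i, j); bit n of user_mask refers
 * to slot i + n. Bits that point outside the group are skipped.
 */
static uint64_t sketch_translate_mask(uint64_t user_mask, const int *assign,
				      int i, int j)
{
	uint64_t hw_mask = 0;

	assert(i >= 0 && i <= j);
	for (int bit = 0; bit < 64; bit++) {
		int idx = i + bit;

		if (!(user_mask & (1ULL << bit)))
			continue;
		if (idx >= j)
			continue;	/* outside the group: ignore */
		hw_mask |= 1ULL << assign[idx];
	}
	return hw_mask;
}
```

With the group in slots [1, 3), a user mask of `0x3` selects slots 1 and 2, whose counters are 2 and 7, giving a hardware mask of `0x84`. Starting from an empty `hw_mask` on every call mirrors the `event->hw.config1 = 0` reset in the hunk above.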
@ -7504,6 +7531,7 @@ static __always_inline void intel_pmu_init_pnc(struct pmu *pmu)
hybrid(pmu, event_constraints) = intel_pnc_event_constraints;
hybrid(pmu, pebs_constraints) = intel_pnc_pebs_event_constraints;
hybrid(pmu, extra_regs) = intel_pnc_extra_regs;
static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
}
static __always_inline void intel_pmu_init_skt(struct pmu *pmu)

View File

@ -137,6 +137,16 @@ static inline bool is_acr_event_group(struct perf_event *event)
return check_leader_group(event->group_leader, PERF_X86_EVENT_ACR);
}
static inline bool is_acr_self_reload_event(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
if (hwc->idx < 0)
return false;
return test_bit(hwc->idx, (unsigned long *)&hwc->config1);
}
struct amd_nb {
int nb_id; /* NorthBridge id */
int refcnt; /* reference count */

View File

@ -137,7 +137,8 @@ extern void __init efi_dump_pagetable(void);
extern void __init efi_apply_memmap_quirks(void);
extern int __init efi_reuse_config(u64 tables, int nr_tables);
extern void efi_delete_dummy_variable(void);
extern void efi_crash_gracefully_on_page_fault(unsigned long phys_addr);
extern void efi_crash_gracefully_on_page_fault(unsigned long phys_addr,
const struct pt_regs *regs);
extern void efi_unmap_boot_services(void);
void arch_efi_call_virt_setup(void);

View File

@ -803,9 +803,10 @@
#define MSR_AMD64_LBR_SELECT 0xc000010e
/* Zen4 */
#define MSR_ZEN4_BP_CFG 0xc001102e
#define MSR_ZEN4_BP_CFG 0xc001102e
#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT 4
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
#define MSR_ZEN2_BP_CFG_BUG_FIX_BIT 33
/* Fam 19h MSRs */
#define MSR_F19H_UMC_PERF_CTL 0xc0010800

View File

@ -88,19 +88,19 @@ static void amd_set_max_freq_ratio(void)
rc = cppc_get_perf_caps(0, &perf_caps);
if (rc) {
pr_warn("Could not retrieve perf counters (%d)\n", rc);
pr_debug("Could not retrieve perf counters (%d)\n", rc);
return;
}
rc = amd_get_boost_ratio_numerator(0, &numerator);
if (rc) {
pr_warn("Could not retrieve highest performance (%d)\n", rc);
pr_debug("Could not retrieve highest performance (%d)\n", rc);
return;
}
nominal_perf = perf_caps.nominal_perf;
if (!nominal_perf) {
pr_warn("Could not retrieve nominal performance\n");
pr_debug("Could not retrieve nominal performance\n");
return;
}

View File

@ -989,6 +989,9 @@ static void init_amd_zen2(struct cpuinfo_x86 *c)
/* Correct misconfigured CPUID on some clients. */
clear_cpu_cap(c, X86_FEATURE_INVLPGB);
if (!cpu_has(c, X86_FEATURE_HYPERVISOR))
msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN2_BP_CFG_BUG_FIX_BIT);
}
static void init_amd_zen3(struct cpuinfo_x86 *c)

View File

@ -90,7 +90,6 @@ struct mca_config mca_cfg __read_mostly = {
};
static DEFINE_PER_CPU(struct mce_hw_err, hw_errs_seen);
static unsigned long mce_need_notify;
/*
* MCA banks polled by the period polling timer for corrected events.
@ -152,8 +151,10 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm);
void mce_log(struct mce_hw_err *err)
{
if (mce_gen_pool_add(err))
if (mce_gen_pool_add(err)) {
pr_info(HW_ERR "Machine check events logged\n");
irq_work_queue(&mce_irq_work);
}
}
EXPORT_SYMBOL_GPL(mce_log);
@ -585,28 +586,6 @@ bool mce_is_correctable(struct mce *m)
}
EXPORT_SYMBOL_GPL(mce_is_correctable);
/*
* Notify the user(s) about new machine check events.
* Can be called from interrupt context, but not from machine check/NMI
* context.
*/
static bool mce_notify_irq(void)
{
/* Not more than two messages every minute */
static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
if (test_and_clear_bit(0, &mce_need_notify)) {
mce_work_trigger();
if (__ratelimit(&ratelimit))
pr_info(HW_ERR "Machine check events logged\n");
return true;
}
return false;
}
static int mce_early_notifier(struct notifier_block *nb, unsigned long val,
void *data)
{
@ -618,9 +597,7 @@ static int mce_early_notifier(struct notifier_block *nb, unsigned long val,
/* Emit the trace record: */
trace_mce_record(err);
set_bit(0, &mce_need_notify);
mce_notify_irq();
mce_work_trigger();
return NOTIFY_DONE;
}
@ -1804,7 +1781,7 @@ static void mce_timer_fn(struct timer_list *t)
* Alert userspace if needed. If we logged an MCE, reduce the polling
* interval, otherwise increase the polling interval.
*/
if (mce_notify_irq())
if (!mce_gen_pool_empty())
iv = max(iv / 2, (unsigned long) HZ/100);
else
iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
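
The interval adjustment in `mce_timer_fn()` halves the polling interval when events were logged, with a floor of `HZ/100`, and doubles it otherwise, up to a ceiling of `check_interval * HZ`. A sketch of that arithmetic follows. It omits the `round_jiffies_relative()` rounding and takes `hz` and `check_interval` as parameters. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the adaptive polling interval, in jiffies.
 * logged: whether any machine check events are pending in the pool.
 */
static unsigned long sketch_next_interval(unsigned long iv, bool logged,
					  unsigned long hz,
					  unsigned long check_interval)
{
	unsigned long floor = hz / 100;
	unsigned long ceiling = check_interval * hz;

	assert(hz > 0);
	if (logged) {
		iv /= 2;		/* poll more often after an event */
		return iv > floor ? iv : floor;
	}
	iv *= 2;			/* back off while nothing is logged */
	return iv < ceiling ? iv : ceiling;
}
```

With `hz == 1000` and `check_interval == 300`, an interval of 1000 jiffies drops to 500 after a logged event and grows to 2000 otherwise.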

Some files were not shown because too many files have changed in this diff.