When concurrently bringing up and down two SMT threads of a physical
core, many warning call traces occur as below:
The issue timeline is as follows:
1. When the system starts,
cpufreq: CPU: 220, policy->related_cpus: 220-221, policy->cpus: 220-221
2. Offline CPU 220 and CPU 221.
3. Online CPU 220
- CPU 221 is now offline, as acpi_get_psd_map() use
for_each_online_cpu(), so the cpu_data->shared_cpu_map,
policy->cpus, and related_cpus has only CPU 220.
cpufreq: CPU: 220, policy->related_cpus: 220, policy->cpus: 220
4. Offline CPU 220
5. Online CPU 221, the below call trace occurs:
- Since CPU 220 and CPU 221 share one policy, and
policy->related_cpus = 220 after step 3, so CPU 221
is not in policy->related_cpus but
per_cpu(cpufreq_cpu_data, cpu221) is not NULL.
After reverting commit 56eb0c0ed3 ("ACPI: CPPC: Fix remaining
for_each_possible_cpu() to use online CPUs"), the issue disappeared.
The _PSD (P-State Dependency) defines the hardware-level dependency of
frequency control across CPU cores. Since this relationship is a physical
attribute of the hardware topology, it remains constant regardless of the
online or offline status of the CPUs.
Using for_each_online_cpu() in acpi_get_psd_map() is problematic. If a
CPU is offline, it will be excluded from the shared_cpu_map.
Consequently, if that CPU is brought online later, the kernel will fail
to recognize it as part of any shared frequency domain.
Switch back to for_each_possible_cpu() to ensure that all cores defined
in the ACPI tables are correctly mapped into their respective performance
domains from the start. This aligns with the logic of policy->related_cpus,
which must encompass all potentially available cores in the domain to
prevent logic gaps during CPU hotplug operations.
To resolve the original issue regarding the "nosmt" or "nosmt=force"
boot parameter, as send_pcc_cmd() function already does if (!desc)
continue, so reverting that loop back to for_each_possible_cpu() is ok,
only need to change the match_cpc_ptr NULL case in acpi_get_psd_map() to
continue as Sean suggested.
How to reproduce, on arm64 machine with SMT support which use acpi cppc
cpufreq driver:
bash test.sh 220 & bash test.sh 221 &
The test.sh is as below:
while true
do
echo 0 > /sys/devices/system/cpu/cpu${1}/online
sleep 0.5
cat /sys/devices/system/cpu/cpu${1}/cpufreq/related_cpus
echo 1 > /sys/devices/system/cpu/cpu${1}/online
cat /sys/devices/system/cpu/cpu${1}/cpufreq/related_cpus
done
CPU: 221 PID: 1119 Comm: cpuhp/221 Kdump: loaded Not tainted 6.6.0debug+ #5
Hardware name: To be filled by O.E.M. S920X20/BC83AMDA01-7270Z, BIOS 20.39 09/04/2024
pstate: a1400009 (NzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : cpufreq_online+0x8ac/0xa90
lr : cpuhp_cpufreq_online+0x18/0x30
sp : ffff80008739bce0
x29: ffff80008739bce0 x28: 0000000000000000 x27: ffff28400ca32200
x26: 0000000000000000 x25: 0000000000000003 x24: ffffd483503ff000
x23: ffffd483504051a0 x22: ffffd48350024a00 x21: 00000000000000dd
x20: 000000000000001d x19: ffff28400ca32000 x18: 0000000000000000
x17: 0000000000000020 x16: ffffd4834e6a3fc8 x15: 0000000000000020
x14: 0000000000000008 x13: 0000000000000001 x12: 00000000ffffffff
x11: 0000000000000040 x10: ffffd48350430728 x9 : ffffd4834f087c78
x8 : 0000000000000001 x7 : ffff2840092bdf00 x6 : ffffd483504264f0
x5 : ffffd48350405000 x4 : ffff283f7f95cc60 x3 : 0000000000000000
x2 : ffff53bc2f94b000 x1 : 00000000000000dd x0 : 0000000000000000
Call trace:
cpufreq_online+0x8ac/0xa90
cpuhp_cpufreq_online+0x18/0x30
cpuhp_invoke_callback+0x128/0x580
cpuhp_thread_fun+0x110/0x1b0
smpboot_thread_fn+0x140/0x190
kthread+0xec/0x100
ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---
Cc: All applicable <stable@vger.kernel.org>
Fixes: 56eb0c0ed3 ("ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs")
Co-developed-by: Sean Kelley <skelley@nvidia.com>
Signed-off-by: Sean Kelley <skelley@nvidia.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260417040112.3727756-1-ruanjinjie@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)
- Update cpufreq-dt-platdev blocklist (Faruque Ansari)
- Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
Rosen Penev)
- Add MAINTAINERS entry for CPPC driver (Viresh Kumar)
- Add support for new features: CPPC performance priority, Dynamic EPP,
Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy,
Mario Limonciello)
- Fix sysfs files being present when HW missing and broken/outdated
documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)
- Pass the policy to cpufreq_driver->adjust_perf() to avoid using
cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which
leads to a scheduling-while-atomic bug (K Prateek Nayak)
- Clean up dead code in Kconfig for cpufreq (Julian Braha)
- Remove max_freq_req update for pre-existing cpufreq policy and add a
boost_freq_req QoS request to save the boost constraint instead of
overwriting the last scaling_max_freq constraint (Pierre Gondois)
- Embed cpufreq QoS freq_req objects in cpufreq policy so they all
are allocated in one go along with the policy to simplify lifetime
rules and avoid error handling issues (Viresh Kumar)
- Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
scaling driver (Henry Tseng)
- Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
of cpumask_weight() because the former is more efficient (Yury Norov)
- Use sysfs_emit() in sysfs show functions for cpufreq governor
attributes (Thorsten Blum)
- Update intel_pstate to stop returning an error when "off" is written
to its status sysfs attribute while the driver is already off (Fabio
De Francesco)
- Include current frequency in the debug message printed by
__cpufreq_driver_target() (Pengjie Zhang)
- Refine stopped tick handling in the menu cpuidle governor and
rearrange stopped tick handling in the teo cpuidle governor (Rafael
Wysocki)
- Add Panther Lake C-states table to the intel_idle driver (Artem
Bityutskiy)
- Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)
- Simplify cpuidle_register_device() with guard() (Huisong Li)
- Use performance level if available to distinguish between rates in
OPP debugfs (Manivannan Sadhasivam)
- Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)
- Return -ENODATA if the snapshot image is not loaded (Alberto Garcia)
- Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
Biggers)
- Clean up and rearrange the intel_rapl power capping driver to make
the respective interface drivers (TPMI, MSR, and MMOI) hold their
own settings and primitives and consolidate PL4 and PMU support
flags into rapl_defaults (Kuppuswamy Sathyanarayanan)
- Correct kernel-doc function parameter names in the power capping core
code (Randy Dunlap)
- Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)
- Use _visible attribute to replace create/remove_sysfs_files() in
devfreq (Pengjie Zhang)
- Add Tegra114 support to activity monitor device in tegra30-devfreq as
a preparation to upcoming EMC controller support (Svyatoslav Ryhel)
- Fix mistakes in cpupower man pages, add the boost and epp options to
the cpupower-frequency-info man page, and add the perf-bias option to
the cpupower-info man page (Roberto Ricci)
- Remove unnecessary extern declarations from getopt.h in arguments
parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmnY9TISHHJqd0Byand5
c29ja2kubmV0AAoJEO5fvZ0v1OO1G9gH/j5mEqfPpiwX6fQ/ZwOGdNOOPVA5w9j4
KPHSMwMD5lZkoaZfasp2vt27KY5SOoVVvRZ2DKkFJ3Jai4I3cUPZYypga2nre1ag
tgzX4vOjcw2r40Eda6ezWl1h4mca/xJJBX7xH2+hn1JY+Y1in37g50CqMIjKh96z
Uugkk6UZytL1XcF55PMhIUgDf6pDtRT5UOW9xOKOkUt8FVWTJ7ei3HaWyV5kDmVq
b5eQ42+OH7y6sWNnoKczFd8fStvh6J/avoJurBEvcOQhMcjaIaB48G19+KjDg73E
NjrVcgG20P2rltBvV2d0J1TKskZHkaP7XjIeWfkwjGZhee3FL7ssS/g=
=fRCO
-----END PGP SIGNATURE-----
Merge tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"Once again, cpufreq is the most active development area, mostly
because of the new feature additions and documentation updates in the
amd-pstate driver, but there are also changes in the cpufreq core
related to boost support and other assorted updates elsewhere.
Next up are power capping changes due to the major cleanup of the
Intel RAPL driver.
On the cpuidle front, a new C-states table for Intel Panther Lake is
added to the intel_idle driver, the stopped tick handling in the menu
and teo governors is updated, and there are a couple of cleanups.
Apart from the above, support for Tegra114 is added to devfreq and
there are assorted cleanups of that code, there are also two updates
of the operating performance points (OPP) library, two minor updates
related to hibernation, and cpupower utility man pages updates and
cleanups.
Specifics:
- Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)
- Update cpufreq-dt-platdev blocklist (Faruque Ansari)
- Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
Rosen Penev)
- Add MAINTAINERS entry for CPPC driver (Viresh Kumar)
- Add support for new features: CPPC performance priority, Dynamic
EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham
Shenoy, Mario Limonciello)
- Fix sysfs files being present when HW missing and broken/outdated
documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)
- Pass the policy to cpufreq_driver->adjust_perf() to avoid using
cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate
which leads to a scheduling-while-atomic bug (K Prateek Nayak)
- Clean up dead code in Kconfig for cpufreq (Julian Braha)
- Remove max_freq_req update for pre-existing cpufreq policy and add
a boost_freq_req QoS request to save the boost constraint instead
of overwriting the last scaling_max_freq constraint (Pierre
Gondois)
- Embed cpufreq QoS freq_req objects in cpufreq policy so they all
are allocated in one go along with the policy to simplify lifetime
rules and avoid error handling issues (Viresh Kumar)
- Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
scaling driver (Henry Tseng)
- Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
of cpumask_weight() because the former is more efficient (Yury
Norov)
- Use sysfs_emit() in sysfs show functions for cpufreq governor
attributes (Thorsten Blum)
- Update intel_pstate to stop returning an error when "off" is
written to its status sysfs attribute while the driver is already
off (Fabio De Francesco)
- Include current frequency in the debug message printed by
__cpufreq_driver_target() (Pengjie Zhang)
- Refine stopped tick handling in the menu cpuidle governor and
rearrange stopped tick handling in the teo cpuidle governor (Rafael
Wysocki)
- Add Panther Lake C-states table to the intel_idle driver (Artem
Bityutskiy)
- Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)
- Simplify cpuidle_register_device() with guard() (Huisong Li)
- Use performance level if available to distinguish between rates in
OPP debugfs (Manivannan Sadhasivam)
- Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)
- Return -ENODATA if the snapshot image is not loaded (Alberto
Garcia)
- Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
Biggers)
- Clean up and rearrange the intel_rapl power capping driver to make
the respective interface drivers (TPMI, MSR, and MMOI) hold their
own settings and primitives and consolidate PL4 and PMU support
flags into rapl_defaults (Kuppuswamy Sathyanarayanan)
- Correct kernel-doc function parameter names in the power capping
core code (Randy Dunlap)
- Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)
- Use _visible attribute to replace create/remove_sysfs_files() in
devfreq (Pengjie Zhang)
- Add Tegra114 support to activity monitor device in tegra30-devfreq
as a preparation to upcoming EMC controller support (Svyatoslav
Ryhel)
- Fix mistakes in cpupower man pages, add the boost and epp options
to the cpupower-frequency-info man page, and add the perf-bias
option to the cpupower-info man page (Roberto Ricci)
- Remove unnecessary extern declarations from getopt.h in arguments
parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)"
* tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
cpupower: remove extern declarations in cmd functions
cpuidle: Simplify cpuidle_register_device() with guard()
PM / devfreq: tegra30-devfreq: add support for Tegra114
PM / devfreq: use _visible attribute to replace create/remove_sysfs_files()
PM / devfreq: Remove unneeded casting for HZ_PER_KHZ
MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
cpufreq/amd-pstate-ut: Add a unit test for raw EPP
cpufreq/amd-pstate: Add support for raw EPP writes
cpufreq/amd-pstate: Add support for platform profile class
cpufreq/amd-pstate: add kernel command line to override dynamic epp
cpufreq/amd-pstate: Add dynamic energy performance preference
Documentation: amd-pstate: fix dead links in the reference section
cpufreq/amd-pstate: Cache the max frequency in cpudata
Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
...
On AMD Ryzen Embedded V1780B (Family 17h, Zen 1), the BIOS does not
provide ACPI _CPC objects and the CPU does not support MSR-based CPPC
(X86_FEATURE_CPPC). The _PSS table only lists nominal P-states
(P0 = 3350 MHz), so when get_max_boost_ratio() fails at
cppc_get_perf_caps(), cpuinfo_max_freq reports only the base frequency
instead of the rated boost frequency (3600 MHz).
dmesg:
ACPI CPPC: No CPC descriptor for CPU:0
acpi_cpufreq: CPU0: Unable to get performance capabilities (-19)
cppc-cpufreq already has a DMI fallback (cppc_get_dmi_max_khz()) that
reads the processor max speed from SMBIOS Type 4. Export it and reuse
it in acpi-cpufreq as a last-resort source for the boost frequency.
A sanity check ensures the DMI value is above the _PSS P0 frequency
and within 2x of it; values outside that range are ignored and the
existing arch_set_max_freq_ratio() path is taken instead. The 2x
upper bound is based on a survey of the AMD Ryzen Embedded V1000
series, where the highest boost-to-base ratio is 1.8x (V1404I:
2.0 GHz base / 3.6 GHz boost).
The DMI lookup and sanity check are wrapped in a helper,
acpi_cpufreq_resolve_max_freq(), which falls through to
arch_set_max_freq_ratio() if the DMI value is absent or
out of range.
Tested on AMD Ryzen Embedded V1780B with v7.0-rc4:
Before: cpuinfo_max_freq = 3350000 (base only)
After: cpuinfo_max_freq = 3600000 (includes boost)
Link: https://www.amd.com/en/products/embedded/ryzen/ryzen-v1000-series.html#specifications
Signed-off-by: Henry Tseng <henrytseng@qnap.com>
Link: https://patch.msgid.link/20260324090948.1667340-1-henrytseng@qnap.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Callers of cpc_read() ignore its return value, which can lead
to using uninitialized or stale values when the read fails.
Fix this by consistently checking cpc_read() return values in
cppc_get_perf_caps(), cppc_get_perf_ctrs(), and cppc_get_perf().
Link: https://lore.kernel.org/lkml/48bdf87e-39f1-402f-a7dc-1a0e1e7a819d@nvidia.com/
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Link: https://patch.msgid.link/20260318095005.2437960-1-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 8505bfb4e4 ("ACPI: CPPC: Move reference performance
to capabilities") introduced a logical error when retrieving
the reference performance.
On platforms lacking the reference performance register, the fallback
logic leaves the local 'ref' variable uninitialized (0). This causes
the subsequent sanity check to incorrectly return -EFAULT, breaking
amd_pstate initialization.
Fix this by assigning 'ref = nom' in the fallback path.
Fixes: 8505bfb4e4 ("ACPI: CPPC: Move reference performance to capabilities")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20260310003026.GA2639793@ax162/
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20260311071334.1494960-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, the `Reference Performance` register is read every time
the CPU frequency is sampled in `cppc_get_perf_ctrs()`. This function
is on the hot path of the cppc_cpufreq driver.
Reference Performance indicates the performance level that corresponds
to the Reference Counter incrementing and is not expected to change
dynamically during runtime (unlike the Delivered and Reference counters).
Reading this register in the hot path incurs unnecessary overhead,
particularly on platforms where CPC registers are located in the PCC
(Platform Communication Channel) subspace. This patch moves
`reference_perf` from the dynamic feedback counters structure
(`cppc_perf_fb_ctrs`) to the static capabilities structure
(`cppc_perf_caps`).
Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
[ rjw: Changelog adjustment ]
Link: https://patch.msgid.link/20260213100935.19111-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add sysfs interface to read/write the Performance Limited register.
The Performance Limited register indicates to the OS that an
unpredictable event (like thermal throttling) has limited processor
performance. It contains two sticky bits set by the platform:
- Bit 0 (Desired_Excursion): Set when delivered performance is
constrained below desired performance. Not used when Autonomous
Selection is enabled.
- Bit 1 (Minimum_Excursion): Set when delivered performance is
constrained below minimum performance.
These bits remain set until OSPM explicitly clears them. The write
operation accepts a bitmask of bits to clear:
- Write 0x1 to clear bit 0
- Write 0x2 to clear bit 1
- Write 0x3 to clear both bits
This enables users to detect if platform throttling impacted a workload.
Users clear the register before execution, run the workload, then check
afterward - if set, hardware throttling occurred during that time window.
The interface is exposed as:
/sys/devices/system/cpu/cpuX/cpufreq/perf_limited
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-7-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Extend cppc_set_epp_perf() to write both auto_sel and energy_perf
registers when they are in FFH or SystemMemory address space.
This keeps the behavior consistent with PCC case where both registers
are already updated together, but was missing for FFH/SystemMemory.
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-4-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a warning during CPPC processor probe if the Desired Performance
register is not supported when it should be.
As per 8.4.6.1.2.3 section of ACPI 6.6 specification,
"The Desired Performance Register is optional only when OSPM indicates
support for CPPC2 in the platform-wide _OSC capabilities and the
Autonomous Selection Enable field is encoded as an Integer with a
value of 1."
In other words:
- In CPPC v1, DESIRED_PERF is mandatory
- In CPPC v2, it becomes optional only when AUTO_SEL_ENABLE is supported
This helps detect firmware configuration issues early during boot.
Link: https://lore.kernel.org/lkml/9fa21599-004a-4af8-acc2-190fd0404e35@nvidia.com/
Suggested-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-3-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add cppc_get_perf() function to read values of performance control
registers including desired_perf, min_perf, max_perf, energy_perf,
and auto_sel.
This provides a read interface to complement the existing
cppc_set_perf() write interface for performance control registers.
Note that auto_sel is read by cppc_get_perf() but not written by
cppc_set_perf() to avoid unintended mode changes during performance
updates. It can be updated with existing dedicated cppc_set_auto_sel()
API.
Use cppc_get_perf() in cppc_cpufreq_get_cpu_data() to initialize
perf_ctrls with current hardware register values during cpufreq
policy initialization.
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-2-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This converts some of the visually simpler cases that have been split
over multiple lines. I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.
Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script. I probably had made it a bit _too_ trivial.
So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.
The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
- Add an unused power resource handling quirk for THUNDEROBOT ZERO (Zhai
Can)
- Fix remaining for_each_possible_cpu() in the ACPI CPPC library to use
online CPUs (Sean V Kelley)
- Drop redundant checks from the ACPI notify handler and the driver
remove callback in the ACPI battery driver (Rafael Wysocki)
- Move the creation of the wakeup source during the ACPI button driver
probe to an earlier point to avoid missing a wakeup event due to a
race and clean up system wakeup handling and remove callback in that
driver (Rafael Wysocki)
- Drop unnecessary driver_data pointer clearing from the ACPI EC and
SMBUS HC drivers and make the ACPI backlight (video) driver clear the
device's driver_data pointer on remove (Rafael Wysocki)
- Force enabling of PWM2 on the Yogabook YB1-X90 tablets (Yauhen
Kharuzhy)
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmmWFKUSHHJqd0Byand5
c29ja2kubmV0AAoJEO5fvZ0v1OO17FoH+wV8F4NRk13Si+DC99uh37N0IC4BdEd+
9rly7D77i+Egaj+yH/fIHXl1ul5ArGkHEhs7reMToSNfRFzSYXTSLGUaIa0tMYJ+
p1IV6vKtSShyl58FxOfMeYBgMlI5Iuc4ZMxzKiZzmBo55vmLaOrMp2i2hYkDi4wl
qZIU9DGSTyTUefqVRUiMg3pqIpLSg6CzXIn0Pv7Mj+qd/BTAFkXiMvIjk0Jl8LLR
Sf84CMkgzTzXA+rO98AQXHNNrC3R45Xgec9yEsSROgOR+c6ZijIYXj3HJNer9tP7
bMdy6yTw86Wy9KJedGx94WmD/xLGQLIWRIAkki2vfJpCBJ07FEJLYeE=
=Aw9u
-----END PGP SIGNATURE-----
Merge tag 'acpi-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI support updates from Rafael J. Wysocki:
"These are mostly fixes and cleanups on top of the ACPI support updates
merged recently, including two new quirks, an ACPI CPPC library fix,
and fixes and cleanups of a few core ACPI device drivers:
- Add an unused power resource handling quirk for THUNDEROBOT ZERO
(Zhai Can)
- Fix remaining for_each_possible_cpu() in the ACPI CPPC library to
use online CPUs (Sean V Kelley)
- Drop redundant checks from the ACPI notify handler and the driver
remove callback in the ACPI battery driver (Rafael Wysocki)
- Move the creation of the wakeup source during the ACPI button
driver probe to an earlier point to avoid missing a wakeup event
due to a race and clean up system wakeup handling and remove
callback in that driver (Rafael Wysocki)
- Drop unnecessary driver_data pointer clearing from the ACPI EC and
SMBUS HC drivers and make the ACPI backlight (video) driver clear
the device's driver_data pointer on remove (Rafael Wysocki)
- Force enabling of PWM2 on the Yogabook YB1-X90 tablets (Yauhen
Kharuzhy)"
* tag 'acpi-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: Add unused power resource quirk for THUNDEROBOT ZERO
ACPI: driver: Drop driver_data pointer clearing from two drivers
ACPI: video: Clear driver_data pointer on remove
ACPI: button: Tweak acpi_button_remove()
ACPI: button: Tweak system wakeup handling
ACPI: battery: Drop redundant checks from acpi_battery_remove()
ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs
ACPI: x86: Force enabling of PWM2 on the Yogabook YB1-X90
ACPI: button: Call device_init_wakeup() earlier during probe
ACPI: battery: Drop redundant check from acpi_battery_notify()
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However, send_pcc_cmd() and acpi_get_psd_map() still iterate over all
possible CPUs. In acpi_get_psd_map(), encountering an offline CPU
returns -EFAULT, causing cppc_cpufreq initialization to fail.
This breaks systems booted with "nosmt" or "nosmt=force".
Fix by using for_each_online_cpu() in both functions.
Fixes: 80b8286aee ("ACPI / CPPC: support for batching CPPC requests")
Signed-off-by: Sean V Kelley <skelley@nvidia.com>
Link: https://patch.msgid.link/20260211212254.30190-1-skelley@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Remove the unused omap-cpufreq driver (Andreas Kemnade)
- Optimize error handling code in cpufreq_boost_trigger_state() and
make cpufreq_boost_trigger_state() return -EOPNOTSUPP if no policy
supports boost (Lifeng Zheng)
- Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
Dhruva Gole, and Konrad Dybcio)
- Minor improvements to the cpufreq and cpumask rust implementation
(Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen)
- Add support for AM62L3 SoC to the ti-cpufreq driver (Dhruva Gole)
- Update arch_freq_scale in the CPPC cpufreq driver's frequency
invariance engine (FIE) in scheduler ticks if the related CPPC
registers are not in PCC (Jie Zhan)
- Assorted minor cleanups and improvements in ARM cpufreq drivers (Juan
Martinez, Felix Gu, Luca Weiss, and Sergey Shtylyov)
- Add generic helpers for sysfs show/store to cppc_cpufreq (Sumit
Gupta)
- Make the scaling_setspeed cpufreq sysfs attribute return the actual
requested frequency to avoid confusion (Pengjie Zhang)
- Simplify the idle CPU time granularity test in the ondemand cpufreq
governor (Frederic Weisbecker)
- Enable asym capacity in intel_pstate only when CPU SMT is not
possible (Yaxiong Tian)
- Update the description of rate_limit_us default value in cpufreq
documentation (Yaxiong Tian)
- Add a command line option to adjust the C-states table in the
intel_idle driver, remove the 'preferred_cstates' module parameter
from it, add C-states validation to it and clean it up (Artem
Bityutskiy)
- Make the menu cpuidle governor always check the time till the closest
timer event when the scheduler tick has been stopped to prevent it
from mistakenly selecting the deepest available idle state (Rafael
Wysocki)
- Update the teo cpuidle governor to avoid making suboptimal decisions
in certain corner cases and generally improve idle state selection
accuracy (Rafael Wysocki)
- Remove an unlikely() annotation on the early-return condition in
menu_select() that leads to branch misprediction 100% of the time
on systems with only 1 idle state enabled, like ARM64 servers (Breno
Leitao)
- Add Christian Loehle to MAINTAINERS as a cpuidle reviewer (Christian
Loehle)
- Stop flagging the PM runtime workqueue as freezable to avoid system
suspend and resume deadlocks in subsystems that assume asynchronous
runtime PM to work during system-wide PM transitions (Rafael Wysocki)
- Drop redundant NULL pointer checks before acomp_request_free() from
the hibernation code handling image saving (Rafael Wysocki)
- Update wakeup_sources_walk_start() to handle empty lists of wakeup
sources as appropriate (Samuel Wu)
- Make dev_pm_clear_wake_irq() check the power.wakeirq value under
power.lock to avoid race conditions (Gui-Dong Han)
- Avoid bit field races related to power.work_in_progress in the core
device suspend code (Xuewen Yan)
- Make several drivers discard pm_runtime_put() return value in
preparation for converting that function to a void one (Rafael
Wysocki)
- Add PL4 support for Ice Lake to the Intel RAPL power capping
driver (Daniel Tang)
- Replace sprintf() with sysfs_emit() in power capping sysfs show
functions (Sumeet Pawnikar)
- Make dev_pm_opp_get_level() return value match the documentation
after a previous update of the latter (Aleks Todorov)
- Use scoped for each OF child loop in the OPP code (Krzysztof
Kozlowski)
- Fix a bug in an example code snippet and correct typos in the energy
model management documentation (Patrick Little)
- Fix miscellaneous problems in cpupower (Kaushlendra Kumar):
* idle_monitor: Fix incorrect value logged after stop
* Fix inverted APERF capability check
* Use strcspn() to strip trailing newline
* Reset errno before strtoull()
* Show C0 in idle-info dump
- Improve cpupower installation procedure by making the systemd step
optional and allowing users to disable the installation of systemd's
unit file (João Marcos Costa)
-----BEGIN PGP SIGNATURE-----
iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmmDr5ASHHJqd0Byand5
c29ja2kubmV0AAoJEO5fvZ0v1OO1Q8oH/0KRqdidHzesIQl6gd5WSS/sWdxODRUt
R9dEGQQ6LXCY0z05RAq29HZQf618fYuRFX4PSrtyCvrcRJK7MJKuzK55MRq0MC3c
c/2pL1PdpHexjLXUP9pcoxrYjetsr7SnD6Y0M3JfOPg1E/bG8sp1DlnE8cdqrL0W
lrdB2cEGewT2SVkNhCIQ2n6bwfQwmLlfQl1vXTM8BA7xCjoslePUJlRphAFVAt/J
5fQxSOH0eSxK5PYQFUDM2D2J3uMAN0pFb6eIjwVYYqjABqV//BPl99Rv2W3ElJq7
K/SICRWlvzyINCgF15QAUtQHWdINxSb0GzovECVxODHOv0N4mKHdpNU=
=QlVe
-----END PGP SIGNATURE-----
Merge tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"By the number of commits, cpufreq is the leading party (again) and the
most visible change there is the removal of the omap-cpufreq driver
that has not been used for a long time (good riddance). There are also
quite a few changes in the cppc_cpufreq driver, mostly related to
fixing its frequency invariance engine in the case when the CPPC
registers used by it are not in PCC. In addition to that, support for
AM62L3 is added to the ti-cpufreq driver and the cpufreq-dt-platdev
list is updated for some platforms. The remaining cpufreq changes are
assorted fixes and cleanups.
Next up is cpuidle and the changes there are dominated by intel_idle
driver updates, mostly related to the new command line facility
allowing users to adjust the list of C-states used by the driver.
There are also a few updates of cpuidle governors, including two menu
governor fixes and some refinements of the teo governor, and a
MAINTAINERS update adding Christian Loehle as a cpuidle reviewer.
[Thanks for stepping up Christian!]
The most significant update related to system suspend and hibernation
is the one to stop freezing the PM runtime workqueue during system PM
transitions which allows some deadlocks to be avoided. There is also a
fix for possible concurrent bit field updates in the core device
suspend code and a few other minor fixes.
Apart from the above, several drivers are updated to discard the
return value of pm_runtime_put() which is going to be converted to a
void function as soon as everybody stops using its return value, PL4
support for Ice Lake is added to the Intel RAPL power capping driver,
and there are assorted cleanups, documentation fixes, and some
cpupower utility improvements.
Specifics:
- Remove the unused omap-cpufreq driver (Andreas Kemnade)
- Optimize error handling code in cpufreq_boost_trigger_state() and
make cpufreq_boost_trigger_state() return -EOPNOTSUPP if no policy
supports boost (Lifeng Zheng)
- Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
Dhruva Gole, and Konrad Dybcio)
- Minor improvements to the cpufreq and cpumask rust implementation
(Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen)
- Add support for AM62L3 SoC to the ti-cpufreq driver (Dhruva Gole)
- Update arch_freq_scale in the CPPC cpufreq driver's frequency
invariance engine (FIE) in scheduler ticks if the related CPPC
registers are not in PCC (Jie Zhan)
- Assorted minor cleanups and improvements in ARM cpufreq drivers
(Juan Martinez, Felix Gu, Luca Weiss, and Sergey Shtylyov)
- Add generic helpers for sysfs show/store to cppc_cpufreq (Sumit
Gupta)
- Make the scaling_setspeed cpufreq sysfs attribute return the actual
requested frequency to avoid confusion (Pengjie Zhang)
- Simplify the idle CPU time granularity test in the ondemand cpufreq
governor (Frederic Weisbecker)
- Enable asym capacity in intel_pstate only when CPU SMT is not
possible (Yaxiong Tian)
- Update the description of rate_limit_us default value in cpufreq
documentation (Yaxiong Tian)
- Add a command line option to adjust the C-states table in the
intel_idle driver, remove the 'preferred_cstates' module parameter
from it, add C-states validation to it and clean it up (Artem
Bityutskiy)
- Make the menu cpuidle governor always check the time till the
closest timer event when the scheduler tick has been stopped to
prevent it from mistakenly selecting the deepest available idle
state (Rafael Wysocki)
- Update the teo cpuidle governor to avoid making suboptimal
decisions in certain corner cases and generally improve idle state
selection accuracy (Rafael Wysocki)
- Remove an unlikely() annotation on the early-return condition in
menu_select() that leads to branch misprediction 100% of the time
on systems with only 1 idle state enabled, like ARM64 servers
(Breno Leitao)
- Add Christian Loehle to MAINTAINERS as a cpuidle reviewer
(Christian Loehle)
- Stop flagging the PM runtime workqueue as freezable to avoid system
suspend and resume deadlocks in subsystems that assume asynchronous
runtime PM to work during system-wide PM transitions (Rafael
Wysocki)
- Drop redundant NULL pointer checks before acomp_request_free() from
the hibernation code handling image saving (Rafael Wysocki)
- Update wakeup_sources_walk_start() to handle empty lists of wakeup
sources as appropriate (Samuel Wu)
- Make dev_pm_clear_wake_irq() check the power.wakeirq value under
power.lock to avoid race conditions (Gui-Dong Han)
- Avoid bit field races related to power.work_in_progress in the core
device suspend code (Xuewen Yan)
- Make several drivers discard pm_runtime_put() return value in
preparation for converting that function to a void one (Rafael
Wysocki)
- Add PL4 support for Ice Lake to the Intel RAPL power capping driver
(Daniel Tang)
- Replace sprintf() with sysfs_emit() in power capping sysfs show
functions (Sumeet Pawnikar)
- Make dev_pm_opp_get_level() return value match the documentation
after a previous update of the latter (Aleks Todorov)
- Use scoped for each OF child loop in the OPP code (Krzysztof
Kozlowski)
- Fix a bug in an example code snippet and correct typos in the
energy model management documentation (Patrick Little)
- Fix miscellaneous problems in cpupower (Kaushlendra Kumar):
* idle_monitor: Fix incorrect value logged after stop
* Fix inverted APERF capability check
* Use strcspn() to strip trailing newline
* Reset errno before strtoull()
* Show C0 in idle-info dump
- Improve cpupower installation procedure by making the systemd step
optional and allowing users to disable the installation of
systemd's unit file (João Marcos Costa)"
* tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits)
PM: sleep: core: Avoid bit field races related to work_in_progress
PM: sleep: wakeirq: harden dev_pm_clear_wake_irq() against races
cpufreq: Documentation: Update description of rate_limit_us default value
cpufreq: intel_pstate: Enable asym capacity only when CPU SMT is not possible
PM: wakeup: Handle empty list in wakeup_sources_walk_start()
PM: EM: Documentation: Fix bug in example code snippet
Documentation: Fix typos in energy model documentation
cpuidle: governors: teo: Refine intercepts-based idle state lookup
cpuidle: governors: teo: Adjust the classification of wakeup events
cpufreq: ondemand: Simplify idle cputime granularity test
cpufreq: userspace: make scaling_setspeed return the actual requested frequency
PM: hibernate: Drop NULL pointer checks before acomp_request_free()
cpufreq: CPPC: Add generic helpers for sysfs show/store
cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id()
cpufreq: ti-cpufreq: add support for AM62L3 SoC
cpufreq: dt-platdev: Add ti,am62l3 to blocklist
cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy
cpufreq: scmi: correct SCMI explanation
cpufreq: dt-platdev: Block the driver from probing on more QC platforms
rust: cpumask: rename methods of Cpumask for clarity and consistency
...
- Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
Dhruva Gole, and Konrad Dybcio).
- Minor improvements to the cpufreq / cpumask rust implementation
(Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen).
- Add support for AM62L3 SoC to ti-cpufreq driver (Dhruva Gole).
- Update FIE arch_freq_scale in ticks for non-PCC regs (Jie Zhan).
- Other minor cleanups / improvements (Felix Gu, Juan Martinez, Luca
Weiss, and Sergey Shtylyov).
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEx73Crsp7f6M6scA70rkcPK6BEhwFAml4YIcACgkQ0rkcPK6B
Ehx6hQ//Y8GoICqNQe6IrVQ6b9eJLB/YOP7vPyZuwc0iqT9YWXrMou5xlnmNW/IY
zj0Wo3l3fidp6eDOo7f21mXALF9yt8kElKq411Oqcg4WVWyAXc9p6ODpWZp/G2/h
JcmusAkwFPul0XE0QJmlJ54KqtsyjoSWQHtrPzOO54mJEhOL4dWQwqhWP046ed7T
FVkNRLb7ysY3+weNuAg45SbVJ3FT/a7f8XbdGd5DAz96efbqTyFt+znhfsd3Xti+
sF75Mq1AEN0Vnfb8ZP4MZUCe7zeVdOVfLFqXXiW/qJOdbRgoD6k2uAOIt2NAcYU/
sbv6UjaW0NE0oTvKbJ8CLE4IIJudjRgVNjyyGlKHdjBVgMQQk9vr7DIedGLghink
VABcyerIqhPFGkBkY0IXkLSmhNtKWoLN9w7sMeCwhE34l63Bnie3Sg9JLikRxaXK
6BAm3+8BiG30tg4WL1LX8UyssnMlUGvvOD9TCP4jOfLQeAk8wWgQ1D+CwWCB5o5j
jtDwPTOCIN9UQT46lYS+kkqzwf4YTFVdA4c23Tod70gjrtm7Z1a7UzYNxoTGcGS3
VtrgVnlgPh3/I/95Qpsgoy5D1oeIz9znFoVv6ETPBINy4A4rAsYMA4DEASfM7tIY
pNhbSTcbtDbp6Eo79hkh5J2ZGoJyTSthrX+irqOz+IQFp8fP9s4=
=Zz3B
-----END PGP SIGNATURE-----
Merge tag 'cpufreq-arm-updates-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull CPUFreq Arm updates for 7.0 from Viresh Kumar:
"- Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
Dhruva Gole, and Konrad Dybcio).
- Minor improvements to the cpufreq / cpumask rust implementation
(Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen).
- Add support for AM62L3 SoC to ti-cpufreq driver (Dhruva Gole).
- Update FIE arch_freq_scale in ticks for non-PCC regs (Jie Zhan).
- Other minor cleanups / improvements (Felix Gu, Juan Martinez, Luca
Weiss, and Sergey Shtylyov)."
* tag 'cpufreq-arm-updates-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id()
cpufreq: ti-cpufreq: add support for AM62L3 SoC
cpufreq: dt-platdev: Add ti,am62l3 to blocklist
cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy
cpufreq: scmi: correct SCMI explanation
cpufreq: dt-platdev: Block the driver from probing on more QC platforms
rust: cpumask: rename methods of Cpumask for clarity and consistency
cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs
cpufreq: CPPC: Factor out cppc_fie_kworker_init()
ACPI: CPPC: Factor out and export per-cpu cppc_perf_ctrs_in_pcc_cpu()
rust: cpufreq: replace `kernel::c_str!` with C-Strings
cpufreq: Add Tegra186 and Tegra194 to cpufreq-dt-platdev blocklist
dt-bindings: cpufreq: qcom-hw: document Milos CPUFREQ Hardware
rust: cpufreq: add __rust_helper to helpers
rust: cpufreq: always inline functions using build_assert with arguments
Factor out cppc_perf_ctrs_in_pcc_cpu() for checking whether per-cpu CPC
regs are defined in PCC channels, and export it out for further use.
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The current implementation overlooks the 'guaranteed_perf'
register in this check.
If the Guaranteed Performance register is located in the PCC
subspace, the function currently attempts to read it without
acquiring the lock and without sending the CMD_READ doorbell
to the firmware. This can result in reading stale data.
Fixes: 29523f0953 ("ACPI / CPPC: Add support for guaranteed performance")
Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
Cc: 4.20+ <stable@vger.kernel.org> # 4.20+
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20251210132227.1988380-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPU via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function cppc_perf_ctrs_in_pcc() checks if the CPPC
perf-ctrs are in a PCC region for all the present CPUs, which breaks
when the kernel is booted with "nosmt=force".
Hence, limit the check only to the online CPUs.
Fixes: ae2df912d1 ("ACPI: CPPC: Disable FIE if registers in PCC regions")
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-5-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function cppc_allow_fast_switch() checks for the validity
of the _CPC object for all the present CPUs. This breaks when the
kernel is booted with "nosmt=force".
Check fast_switch capability only on online CPUs
Fixes: 15eece6c5b ("ACPI: CPPC: Fix NULL pointer dereference when nosmp is used")
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-4-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function acpi_cpc_valid() checks for the validity of the
_CPC object for all the present CPUs. This breaks when the kernel is
booted with "nosmt=force".
Hence check the validity of the _CPC objects of only the online CPUs.
Fixes: 2aeca6bd02 ("ACPI: CPPC: Check present CPUs for determining _CPC is valid")
Reported-by: Christopher Harris <chris.harris79@gmail.com>
Closes: https://lore.kernel.org/lkml/CAM+eXpdDT7KjLV0AxEwOLkSJ2QtrsvGvjA2cCHvt1d0k2_C4Cw@mail.gmail.com/
Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Tested-by: Chrisopher Harris <chris.harris79@gmail.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-3-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Instead of using CPUFREQ_ETERNAL for signaling an error condition
in cppc_get_transition_latency(), change the return value type of
that function to int and make it return a proper negative error
code on failures.
No intentional functional impact.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Qais Yousef <qyousef@layalina.io>
With nosmp in cmdline, other CPUs are not brought up, leaving
their cpc_desc_ptr NULL. CPU0's iteration via for_each_possible_cpu()
dereferences these NULL pointers, causing panic.
Panic backtrace:
[ 0.401123] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b8
...
[ 0.403255] [<ffffffff809a5818>] cppc_allow_fast_switch+0x6a/0xd4
...
Kernel panic - not syncing: Attempted to kill init!
Fixes: 3cc30dd00a ("cpufreq: CPPC: Enable fast_switch")
Reported-by: Xu Lu <luxu.kernel@bytedance.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://patch.msgid.link/20250604023036.99553-1-cuiyunhui@bytedance.com
[ rjw: New subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Modify cppc_get_auto_sel_caps() to cppc_get_auto_sel(). Using a
cppc_perf_caps to carry the value is unnecessary.
Add a check to ensure the pointer 'enable' is not null.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-8-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Refactor register value get and set ABIs by using cppc_get_reg_val(),
cppc_set_reg_val() and CPPC_REG_VAL_READ().
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-7-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add cppc_set_reg_val() as a generic function for setting CPPC register
values, with this features:
1. Check register. If a register is writeable, it must be a buffer and
can not be null.
2. Extract the operations if register is in PCC out as
cppc_set_reg_val_in_pcc().
This function can be used to reduce some existing code duplication.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-6-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Extract the operations if register is in pcc out from cppc_get_reg_val()
as cppc_get_reg_val_in_pcc().
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-5-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rename cppc_get_perf() to cppc_get_reg_val() as a generic function to
read CPPC registers.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-4-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Optimize cppc_get_perf() with three changes:
1. Change the error kind to "no such device" when pcc_ss_id < 0, as
other register value getting functions.
2. Add a check to ensure the pointer 'perf' is no null.
3. Add a check to verify if the register is supported to be read before
using it. The logic is:
(1) If the register is of the integer type, check whether the
register is optional and its value is 0. If yes, the register
is not supported.
(2) If the register is of other types, a null one is not supported.
4. Return the result of cpc_read() instead of 0.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-3-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In ACPI 6.5, s8.4.6.1 _CPC (Continuous Performance Control), whether
each of the per-cpu cpc_regs[] is mandatory or optional is defined.
Since the CPC_SUPPORTED() check is only for optional _CPC fields,
another macro to check if the field is optional is needed.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-2-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The PCC driver now handles mapping and unmapping of shared memory
areas as part of pcc_mbox_{request,free}_channel(). Without these before,
this ACPI CPPC driver did handling of those mappings like several
other PCC mailbox client drivers.
There were redundant operations, leading to unnecessary code. Maintaining
the consistency across these driver was harder due to scattered handling
of shmem.
Just use the mapped shmem and remove all redundant operations from this
driver.
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/20250411113144.1151094-2-sudeep.holla@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge updates of the ACPI battery and EC drivers, an ACPI Platform
Firmware Runtime (PFR) telemetry driver update and an ACPI OS support
layer change for 6.13-rc1:
- Use DEFINE_SIMPLE_DEV_PM_OPS in the ACPI battery driver, make it use
devm_ for initializing mutexes and allocating driver data, and make
it check the register_pm_notifier() return value (Thomas Weißschuh,
Andy Shevchenko).
- Make the ACPI EC driver support compile-time conditional and allow
ACPI to be built without CONFIG_HAS_IOPORT (Arnd Bergmann).
- Remove a redundant error check from the pfr_telemetry driver (Colin
Ian King).
* acpi-battery:
ACPI: battery: Check for error code from devm_mutex_init() call
ACPI: battery: use DEFINE_SIMPLE_DEV_PM_OPS
ACPI: battery: initialize mutexes through devm_ APIs
ACPI: battery: allocate driver data through devm_ APIs
ACPI: battery: check result of register_pm_notifier()
* acpi-ec:
ACPI: EC: make EC support compile-time conditional
* acpi-pfr:
ACPI: pfr_telemetry: remove redundant error check on ret
* acpi-osl:
ACPI: allow building without CONFIG_HAS_IOPORT
Since commit 60949b7b80 ("ACPI: CPPC: Fix MASK_VAL() usage"), _CPC
registers cannot be changed from 1 to 0.
It turns out that there is an extra OR after MASK_VAL_WRITE(), which
has already ORed prev_val with the register mask.
Remove the extra OR to fix the problem.
Fixes: 60949b7b80 ("ACPI: CPPC: Fix MASK_VAL() usage")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20241113103309.761031-1-zhenglifeng1@huawei.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
arch_init_invariance_cppc() is called at the end of
acpi_cppc_processor_probe() in order to configure frequency invariance
based upon the values from _CPC.
This however doesn't work on AMD CPPC shared memory designs that have
AMD preferred cores enabled because _CPC needs to be analyzed from all
cores to judge if preferred cores are enabled.
This issue manifests to users as a warning since commit 21fb59ab4b
("ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn"):
```
Could not retrieve highest performance (-19)
```
However the warning isn't the cause of this, it was actually
commit 279f838a61 ("x86/amd: Detect preferred cores in
amd_get_boost_ratio_numerator()") which exposed the issue.
To fix this problem, change arch_init_invariance_cppc() into a new weak
symbol that is called at the end of acpi_processor_driver_init().
Each architecture that supports it can declare the symbol to override
the weak one.
Define it for x86, in arch/x86/kernel/acpi/cppc.c, and for all of the
architectures using the generic arch_topology.c code.
Fixes: 279f838a61 ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()")
Reported-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219431
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20241104222855.3959267-1-superm1@kernel.org
[ rjw: Changelog edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CONFIG_HAS_IOPORT will soon become optional and cause a build time
failure when it is disabled but a driver calls inb()/outb(). At the
moment, all architectures that can support ACPI have port I/O, but this
is not necessarily the case in the future on non-x86 architectures.
The result is a set of errors like:
drivers/acpi/osl.c: In function 'acpi_os_read_port':
include/asm-generic/io.h:542:14: error: call to '_inb' declared with attribute error: inb()) requires CONFIG_HAS_IOPORT
Nothing should actually call these functions in this configuration,
and if it does, the result would be undefined behavior today, possibly
a NULL pointer dereference.
Change the low-level functions to return a proper error code when
HAS_IOPORT is disabled.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20241030123701.1538919-2-arnd@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The following BUG was triggered:
=============================
[ BUG: Invalid wait context ]
6.12.0-rc2-XXX #406 Not tainted
-----------------------------
kworker/1:1/62 is trying to lock:
ffffff8801593030 (&cpc_ptr->rmw_lock){+.+.}-{3:3}, at: cpc_write+0xcc/0x370
other info that might help us debug this:
context-{5:5}
2 locks held by kworker/1:1/62:
#0: ffffff897ef5ec98 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2c/0x50
#1: ffffff880154e238 (&sg_policy->update_lock){....}-{2:2}, at: sugov_update_shared+0x3c/0x280
stack backtrace:
CPU: 1 UID: 0 PID: 62 Comm: kworker/1:1 Not tainted 6.12.0-rc2-g9654bd3e8806 #406
Workqueue: 0x0 (events)
Call trace:
dump_backtrace+0xa4/0x130
show_stack+0x20/0x38
dump_stack_lvl+0x90/0xd0
dump_stack+0x18/0x28
__lock_acquire+0x480/0x1ad8
lock_acquire+0x114/0x310
_raw_spin_lock+0x50/0x70
cpc_write+0xcc/0x370
cppc_set_perf+0xa0/0x3a8
cppc_cpufreq_fast_switch+0x40/0xc0
cpufreq_driver_fast_switch+0x4c/0x218
sugov_update_shared+0x234/0x280
update_load_avg+0x6ec/0x7b8
dequeue_entities+0x108/0x830
dequeue_task_fair+0x58/0x408
__schedule+0x4f0/0x1070
schedule+0x54/0x130
worker_thread+0xc0/0x2e8
kthread+0x130/0x148
ret_from_fork+0x10/0x20
sugov_update_shared() locks a raw_spinlock while cpc_write() locks a
spinlock.
To have a correct wait-type order, update rmw_lock to a raw spinlock and
ensure that interrupts will be disabled on the CPU holding it.
Fixes: 60949b7b80 ("ACPI: CPPC: Fix MASK_VAL() usage")
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://patch.msgid.link/20241028125657.1271512-1-pierre.gondois@arm.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When the nominal_freq recorded by the kernel is equal to the lowest_freq,
and the frequency adjustment operation is triggered externally, there is
a logic error in cppc_perf_to_khz()/cppc_khz_to_perf(), resulting in perf
and khz conversion errors.
Fix this by adding a branch processing logic when nominal_freq is equal
to lowest_freq.
Fixes: ec1c7ad476 ("cpufreq: CPPC: Fix performance/frequency conversion")
Signed-off-by: liwei <liwei728@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20241024022952.2627694-1-liwei728@huawei.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.
auto-generated by the following:
for i in `git grep -l -w asm/unaligned.h`; do
sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
Some Asus AMD systems are reported to not be able to change EPP values
because the BIOS doesn't advertise support for the CPPC MSR and the PCC
region is not configured.
However the ACPI 6.2 specification allows CPC registers to be declared
in FFH:
```
Starting with ACPI Specification 6.2, all _CPC registers can be in
PCC, System Memory, System IO, or Functional Fixed Hardware address
spaces. OSPM support for this more flexible register space scheme
is indicated by the “Flexible Address Space for CPPC Registers” _OSC
bit.
```
If this _OSC has been set allow using FFH to configure EPP.
Reported-by: al0uette@outlook.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218686
Suggested-by: al0uette@outlook.com
Tested-by: vderp@icloud.com
Tested-by: al0uette@outlook.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20240910031524.106387-1-superm1@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
MASK_VAL() was added as a way to handle bit_offset and bit_width for
registers located in system memory address space. However, while suited
for reading, it does not work for writing and result in corrupted
registers when writing values with bit_offset > 0. Moreover, when a
register is collocated with another one at the same address but with a
different mask, the current code results in the other registers being
overwritten with 0s. The write procedure for SYSTEM_MEMORY registers
should actually read the value, mask it, update it and write it with the
updated value. Moreover, since registers can be located in the same
word, we must take care of locking the access before doing it. We should
potentially use a global lock since we don't know in if register
addresses aren't shared with another _CPC package but better not
encourage vendors to do so. Assume that registers can use the same word
inside a _CPC package and thus, use a per _CPC package lock.
Fixes: 2f4a4d63a1 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Link: https://patch.msgid.link/20240826101648.95654-1-cleger@rivosinc.com
[ rjw: Dropped redundant semicolon ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Expose the CPPC guaranteed performance as reported by the platform through
GuaranteedPerformanceRegister.
The current value is already read in cppc_get_perf_caps() and stored in
struct cppc_perf_caps (to be used by the intel_pstate driver), so only the
attribute itself needs to be defined.
Signed-off-by: Petr Tesařík <ptesarik@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge cpufreq updates for 6.10:
- Rework the handling of disabled turbo in the intel_pstate driver and
make it update the maximum CPU frequency consistently regardless of
the reason on top of a number of cleanups (Rafael Wysocki).
- Add missing checks for NULL .exit() cpufreq driver callback to the
cpufreq core (Viresh Kumar).
- Prevent pulicy->max from going above the frequency QoS maximum value
when cpufreq_frequency_table_verify() is used (Xuewen Yan).
- Prevent a negative CPU number or frequency value from being printed
if they are really large (Joshua Yeong).
- Update MAINTAINERS entry for amd-pstate to add two new submaintainers
and a designated reviewer (Huang Rui).
- Clean up the amd-pstate driver and update its documentation (Gautham
Shenoy).
- Fix the highest frequency issue in the amd-pstate driver which limits
performance (Perry Yuan).
- Enable CPPC v2 for certain processors in the family 17H, as requested
by TR40 processor users who expect improved performance and lower
system temperature (Perry Yuan).
- Change latency and delay values to be read from platform firmware
firstly for more accurate timing (Perry Yuan).
- A new quirk is introduced for supporting amd-pstate on legacy
processors which either lack CPPC capability, or only only have CPPC
v2 capability (Perry Yuan).
- Sun50i: Add support for opp_supported_hw, H616 platform and general
cleanups (Andre Przywara, Martin Botka, Brandon Cheo Fusi, Dan
Carpenter, Viresh Kumar).
- CPPC: Fix possible null pointer dereference (Aleksandr Mishin).
- Eliminate uses of of_node_put() (Javier Carrasco, and Shivani Gupta).
- brcmstb-avs: ISO C90 forbids mixed declarations (Portia Stephens).
- mediatek: Add support for MT7988A (Sam Shih).
- cpufreq-qcom-hw: Add SM4450 compatibles in DT bindings (Tengfei Fan).
- Fix struct cpudata::epp_cached kernel-doc in the intel_pstate cpufreq
driver (Jeff Johnson).
* pm-cpufreq: (46 commits)
cpufreq: amd-pstate: fix the highest frequency issue which limits performance
cpufreq: intel_pstate: fix struct cpudata::epp_cached kernel-doc
cpufreq: Fix up printing large CPU numbers and frequency values
MAINTAINERS: cpufreq: amd-pstate: Add co-maintainers and reviewer
cpufreq: amd-pstate: remove unused variable lowest_nonlinear_freq
cpufreq: amd-pstate: fix code format problems
cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing
cppc_acpi: print error message if CPPC is unsupported
cpufreq: amd-pstate: get transition delay and latency value from ACPI tables
cpufreq: amd-pstate: Bail out if min/max/nominal_freq is 0
cpufreq: amd-pstate: Remove amd_get_{min,max,nominal,lowest_nonlinear}_freq()
cpufreq: amd-pstate: Unify computation of {max,min,nominal,lowest_nonlinear}_freq
cpufreq: amd-pstate: Document the units for freq variables in amd_cpudata
cpufreq: amd-pstate: Document *_limit_* fields in struct amd_cpudata
dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM4450 compatibles
cpufreq: sun50i: fix error returns in dt_has_supported_hw()
cpufreq: brcmstb-avs-cpufreq: ISO C90 forbids mixed declarations
cpufreq: dt-platdev: eliminate uses of of_node_put()
cpufreq: dt: eliminate uses of of_node_put()
cpufreq: ti: Implement scope-based cleanup in ti_cpufreq_match_node()
...
The amd-pstate driver can fail when _CPC objects are not supported by
the CPU. However, the current error message is ambiguous (see below) and
there is no clear way for attributing the failure of the amd-pstate
driver to the lack of CPPC support.
[ 0.477523] amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled
Fix this by adding an debug message to notify the user if the amd-pstate
driver failed to load due to CPPC not be supported by the CPU
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
commit 2f4a4d63a1 ("ACPI: CPPC: Use access_width over bit_width for system
memory accesses") modified cpc_read()/cpc_write() to use access_width to
read CPC registers.
However, for PCC registers the access width field in the ACPI register
macro specifies the PCC subspace ID. For non-zero PCC subspace ID it is
incorrectly treated as access width. This causes errors when reading
from PCC registers in the CPPC driver.
For PCC registers, base the size of read/write on the bit width field.
The debug message in cpc_read()/cpc_write() is updated to print relevant
information for the address space type used to read the register.
Fixes: 2f4a4d63a1 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Vanshidhar Konda <vanshikonda@os.amperecomputing.com>
Tested-by: Jarred White <jarredwhite@linux.microsoft.com>
Reviewed-by: Jarred White <jarredwhite@linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Cc: 5.15+ <stable@vger.kernel.org> # 5.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 2f4a4d63a1 ("ACPI: CPPC: Use access_width over bit_width for
system memory accesses") neglected to properly wrap the bit_offset shift
when it comes to applying the mask. This may cause incorrect values to be
read and may cause the cpufreq module not be loaded.
[ 11.059751] cpu_capacity: CPU0 missing/invalid highest performance.
[ 11.066005] cpu_capacity: partial information: fallback to 1024 for all CPUs
Also, corrected the bitmask generation in GENMASK (extra bit being added).
Fixes: 2f4a4d63a1 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Jarred White <jarredwhite@linux.microsoft.com>
Cc: 5.15+ <stable@vger.kernel.org> # 5.15+
Reviewed-by: Vanshidhar Konda <vanshikonda@os.amperecomputing.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>