Commit Graph

147 Commits

Author SHA1 Message Date
Jinjie Ruan
75141a770f ACPI: CPPC: Fix related_cpus inconsistency during CPU hotplug
When concurrently bringing up and down two SMT threads of a physical
core, many warning call traces occur as below:

The issue timeline is as follows:

 1. When the system starts,
    cpufreq: CPU: 220, policy->related_cpus: 220-221, policy->cpus: 220-221

 2. Offline CPU 220 and CPU 221.

 3. Online CPU 220
    - CPU 221 is now offline, as acpi_get_psd_map() use
      for_each_online_cpu(), so the cpu_data->shared_cpu_map,
      policy->cpus, and related_cpus has only CPU 220.

    cpufreq: CPU: 220, policy->related_cpus: 220, policy->cpus: 220

 4. Offline CPU 220

 5. Online CPU 221, the below call trace occurs:
    - Since CPU 220 and CPU 221 share one policy, and
      policy->related_cpus = 220 after step 3, so CPU 221
      is not in policy->related_cpus but
      per_cpu(cpufreq_cpu_data, cpu221) is not NULL.

After reverting commit 56eb0c0ed3 ("ACPI: CPPC: Fix remaining
for_each_possible_cpu() to use online CPUs"), the issue disappeared.

The _PSD (P-State Dependency) defines the hardware-level dependency of
frequency control across CPU cores. Since this relationship is a physical
attribute of the hardware topology, it remains constant regardless of the
online or offline status of the CPUs.

Using for_each_online_cpu() in acpi_get_psd_map() is problematic. If a
CPU is offline, it will be excluded from the shared_cpu_map.
Consequently, if that CPU is brought online later, the kernel will fail
to recognize it as part of any shared frequency domain.

Switch back to for_each_possible_cpu() to ensure that all cores defined
in the ACPI tables are correctly mapped into their respective performance
domains from the start. This aligns with the logic of policy->related_cpus,
which must encompass all potentially available cores in the domain to
prevent logic gaps during CPU hotplug operations.

To resolve the original issue regarding the "nosmt" or "nosmt=force"
boot parameter, as send_pcc_cmd() function already does if (!desc)
continue, so reverting that loop back to for_each_possible_cpu() is ok,
only need to change the match_cpc_ptr NULL case in acpi_get_psd_map() to
continue as Sean suggested.

How to reproduce, on arm64 machine with SMT support which use acpi cppc
cpufreq driver:

	bash test.sh 220 & bash test.sh 221 &

	The test.sh is as below:
		while true
			do
			echo 0 > /sys/devices/system/cpu/cpu${1}/online
			sleep 0.5
			cat /sys/devices/system/cpu/cpu${1}/cpufreq/related_cpus
			echo 1 >  /sys/devices/system/cpu/cpu${1}/online
			cat /sys/devices/system/cpu/cpu${1}/cpufreq/related_cpus
		done

	CPU: 221 PID: 1119 Comm: cpuhp/221 Kdump: loaded Not tainted 6.6.0debug+ #5
	Hardware name: To be filled by O.E.M. S920X20/BC83AMDA01-7270Z, BIOS 20.39 09/04/2024
	pstate: a1400009 (NzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
	pc : cpufreq_online+0x8ac/0xa90
	lr : cpuhp_cpufreq_online+0x18/0x30
	sp : ffff80008739bce0
	x29: ffff80008739bce0 x28: 0000000000000000 x27: ffff28400ca32200
	x26: 0000000000000000 x25: 0000000000000003 x24: ffffd483503ff000
	x23: ffffd483504051a0 x22: ffffd48350024a00 x21: 00000000000000dd
	x20: 000000000000001d x19: ffff28400ca32000 x18: 0000000000000000
	x17: 0000000000000020 x16: ffffd4834e6a3fc8 x15: 0000000000000020
	x14: 0000000000000008 x13: 0000000000000001 x12: 00000000ffffffff
	x11: 0000000000000040 x10: ffffd48350430728 x9 : ffffd4834f087c78
	x8 : 0000000000000001 x7 : ffff2840092bdf00 x6 : ffffd483504264f0
	x5 : ffffd48350405000 x4 : ffff283f7f95cc60 x3 : 0000000000000000
	x2 : ffff53bc2f94b000 x1 : 00000000000000dd x0 : 0000000000000000
	Call trace:
	 cpufreq_online+0x8ac/0xa90
	 cpuhp_cpufreq_online+0x18/0x30
	 cpuhp_invoke_callback+0x128/0x580
	 cpuhp_thread_fun+0x110/0x1b0
	 smpboot_thread_fn+0x140/0x190
	 kthread+0xec/0x100
	 ret_from_fork+0x10/0x20
	---[ end trace 0000000000000000 ]---

Cc: All applicable <stable@vger.kernel.org>
Fixes: 56eb0c0ed3 ("ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs")
Co-developed-by: Sean Kelley <skelley@nvidia.com>
Signed-off-by: Sean Kelley <skelley@nvidia.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260417040112.3727756-1-ruanjinjie@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-27 21:50:26 +02:00
Linus Torvalds
d7c8087a9c Power management updates for 7.1-rc1
- Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)
 
  - Update cpufreq-dt-platdev blocklist (Faruque Ansari)
 
  - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
    Rosen Penev)
 
  - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)
 
  - Add support for new features: CPPC performance priority, Dynamic EPP,
    Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy,
    Mario Limonciello)
 
  - Fix sysfs files being present when HW missing and broken/outdated
    documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)
 
  - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
    cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which
    leads to a scheduling-while-atomic bug (K Prateek Nayak)
 
  - Clean up dead code in Kconfig for cpufreq (Julian Braha)
 
  - Remove max_freq_req update for pre-existing cpufreq policy and add a
    boost_freq_req QoS request to save the boost constraint instead of
    overwriting the last scaling_max_freq constraint (Pierre Gondois)
 
  - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
    are allocated in one go along with the policy to simplify lifetime
    rules and avoid error handling issues (Viresh Kumar)
 
  - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
    scaling driver (Henry Tseng)
 
  - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
    of cpumask_weight() because the former is more efficient (Yury Norov)
 
  - Use sysfs_emit() in sysfs show functions for cpufreq governor
    attributes (Thorsten Blum)
 
  - Update intel_pstate to stop returning an error when "off" is written
    to its status sysfs attribute while the driver is already off (Fabio
    De Francesco)
 
  - Include current frequency in the debug message printed by
    __cpufreq_driver_target() (Pengjie Zhang)
 
  - Refine stopped tick handling in the menu cpuidle governor and
    rearrange stopped tick handling in the teo cpuidle governor (Rafael
    Wysocki)
 
  - Add Panther Lake C-states table to the intel_idle driver (Artem
    Bityutskiy)
 
  - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)
 
  - Simplify cpuidle_register_device() with guard() (Huisong Li)
 
  - Use performance level if available to distinguish between rates in
    OPP debugfs (Manivannan Sadhasivam)
 
  - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)
 
  - Return -ENODATA if the snapshot image is not loaded (Alberto Garcia)
 
  - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
    Biggers)
 
  - Clean up and rearrange the intel_rapl power capping driver to make
    the respective interface drivers (TPMI, MSR, and MMOI) hold their
    own settings and primitives and consolidate PL4 and PMU support
    flags into rapl_defaults (Kuppuswamy Sathyanarayanan)
 
  - Correct kernel-doc function parameter names in the power capping core
    code (Randy Dunlap)
 
  - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)
 
  - Use _visible attribute to replace create/remove_sysfs_files() in
    devfreq (Pengjie Zhang)
 
  - Add Tegra114 support to activity monitor device in tegra30-devfreq as
    a preparation to upcoming EMC controller support (Svyatoslav Ryhel)
 
  - Fix mistakes in cpupower man pages, add the boost and epp options to
    the cpupower-frequency-info man page, and add the perf-bias option to
    the cpupower-info man page (Roberto Ricci)
 
  - Remove unnecessary extern declarations from getopt.h in arguments
    parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
    cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmnY9TISHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1G9gH/j5mEqfPpiwX6fQ/ZwOGdNOOPVA5w9j4
 KPHSMwMD5lZkoaZfasp2vt27KY5SOoVVvRZ2DKkFJ3Jai4I3cUPZYypga2nre1ag
 tgzX4vOjcw2r40Eda6ezWl1h4mca/xJJBX7xH2+hn1JY+Y1in37g50CqMIjKh96z
 Uugkk6UZytL1XcF55PMhIUgDf6pDtRT5UOW9xOKOkUt8FVWTJ7ei3HaWyV5kDmVq
 b5eQ42+OH7y6sWNnoKczFd8fStvh6J/avoJurBEvcOQhMcjaIaB48G19+KjDg73E
 NjrVcgG20P2rltBvV2d0J1TKskZHkaP7XjIeWfkwjGZhee3FL7ssS/g=
 =fRCO
 -----END PGP SIGNATURE-----

Merge tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "Once again, cpufreq is the most active development area, mostly
  because of the new feature additions and documentation updates in the
  amd-pstate driver, but there are also changes in the cpufreq core
  related to boost support and other assorted updates elsewhere.

  Next up are power capping changes due to the major cleanup of the
  Intel RAPL driver.

  On the cpuidle front, a new C-states table for Intel Panther Lake is
  added to the intel_idle driver, the stopped tick handling in the menu
  and teo governors is updated, and there are a couple of cleanups.

  Apart from the above, support for Tegra114 is added to devfreq and
  there are assorted cleanups of that code, there are also two updates
  of the operating performance points (OPP) library, two minor updates
  related to hibernation, and cpupower utility man pages updates and
  cleanups.

  Specifics:

   - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)

   - Update cpufreq-dt-platdev blocklist (Faruque Ansari)

   - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
     Rosen Penev)

   - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)

   - Add support for new features: CPPC performance priority, Dynamic
     EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham
     Shenoy, Mario Limonciello)

   - Fix sysfs files being present when HW missing and broken/outdated
     documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)

   - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
     cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate
     which leads to a scheduling-while-atomic bug (K Prateek Nayak)

   - Clean up dead code in Kconfig for cpufreq (Julian Braha)

   - Remove max_freq_req update for pre-existing cpufreq policy and add
     a boost_freq_req QoS request to save the boost constraint instead
     of overwriting the last scaling_max_freq constraint (Pierre
     Gondois)

   - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
     are allocated in one go along with the policy to simplify lifetime
     rules and avoid error handling issues (Viresh Kumar)

   - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
     scaling driver (Henry Tseng)

   - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
     of cpumask_weight() because the former is more efficient (Yury
     Norov)

   - Use sysfs_emit() in sysfs show functions for cpufreq governor
     attributes (Thorsten Blum)

   - Update intel_pstate to stop returning an error when "off" is
     written to its status sysfs attribute while the driver is already
     off (Fabio De Francesco)

   - Include current frequency in the debug message printed by
     __cpufreq_driver_target() (Pengjie Zhang)

   - Refine stopped tick handling in the menu cpuidle governor and
     rearrange stopped tick handling in the teo cpuidle governor (Rafael
     Wysocki)

   - Add Panther Lake C-states table to the intel_idle driver (Artem
     Bityutskiy)

   - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)

   - Simplify cpuidle_register_device() with guard() (Huisong Li)

   - Use performance level if available to distinguish between rates in
     OPP debugfs (Manivannan Sadhasivam)

   - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)

   - Return -ENODATA if the snapshot image is not loaded (Alberto
     Garcia)

   - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
     Biggers)

   - Clean up and rearrange the intel_rapl power capping driver to make
     the respective interface drivers (TPMI, MSR, and MMOI) hold their
     own settings and primitives and consolidate PL4 and PMU support
     flags into rapl_defaults (Kuppuswamy Sathyanarayanan)

   - Correct kernel-doc function parameter names in the power capping
     core code (Randy Dunlap)

   - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)

   - Use _visible attribute to replace create/remove_sysfs_files() in
     devfreq (Pengjie Zhang)

   - Add Tegra114 support to activity monitor device in tegra30-devfreq
     as a preparation to upcoming EMC controller support (Svyatoslav
     Ryhel)

   - Fix mistakes in cpupower man pages, add the boost and epp options
     to the cpupower-frequency-info man page, and add the perf-bias
     option to the cpupower-info man page (Roberto Ricci)

   - Remove unnecessary extern declarations from getopt.h in arguments
     parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
     cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)"

* tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
  cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
  cpupower: remove extern declarations in cmd functions
  cpuidle: Simplify cpuidle_register_device() with guard()
  PM / devfreq: tegra30-devfreq: add support for Tegra114
  PM / devfreq: use _visible attribute to replace create/remove_sysfs_files()
  PM / devfreq: Remove unneeded casting for HZ_PER_KHZ
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  ...
2026-04-13 19:47:52 -07:00
Henry Tseng
16fb8d8a0e cpufreq: acpi-cpufreq: use DMI max speed when CPPC is unavailable
On AMD Ryzen Embedded V1780B (Family 17h, Zen 1), the BIOS does not
provide ACPI _CPC objects and the CPU does not support MSR-based CPPC
(X86_FEATURE_CPPC).  The _PSS table only lists nominal P-states
(P0 = 3350 MHz), so when get_max_boost_ratio() fails at
cppc_get_perf_caps(), cpuinfo_max_freq reports only the base frequency
instead of the rated boost frequency (3600 MHz).

  dmesg:
    ACPI CPPC: No CPC descriptor for CPU:0
    acpi_cpufreq: CPU0: Unable to get performance capabilities (-19)

cppc-cpufreq already has a DMI fallback (cppc_get_dmi_max_khz()) that
reads the processor max speed from SMBIOS Type 4.  Export it and reuse
it in acpi-cpufreq as a last-resort source for the boost frequency.

A sanity check ensures the DMI value is above the _PSS P0 frequency
and within 2x of it; values outside that range are ignored and the
existing arch_set_max_freq_ratio() path is taken instead.  The 2x
upper bound is based on a survey of the AMD Ryzen Embedded V1000
series, where the highest boost-to-base ratio is 1.8x (V1404I:
2.0 GHz base / 3.6 GHz boost).

The DMI lookup and sanity check are wrapped in a helper,
acpi_cpufreq_resolve_max_freq(), which falls through to
arch_set_max_freq_ratio() if the DMI value is absent or
out of range.

Tested on AMD Ryzen Embedded V1780B with v7.0-rc4:

  Before: cpuinfo_max_freq = 3350000 (base only)
  After:  cpuinfo_max_freq = 3600000 (includes boost)

Link: https://www.amd.com/en/products/embedded/ryzen/ryzen-v1000-series.html#specifications
Signed-off-by: Henry Tseng <henrytseng@qnap.com>
Link: https://patch.msgid.link/20260324090948.1667340-1-henrytseng@qnap.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-25 14:29:05 +01:00
Sumit Gupta
0cc2497722 ACPI: CPPC: Check cpc_read() return values consistently
Callers of cpc_read() ignore its return value, which can lead
to using uninitialized or stale values when the read fails.

Fix this by consistently checking cpc_read() return values in
cppc_get_perf_caps(), cppc_get_perf_ctrs(), and cppc_get_perf().

Link: https://lore.kernel.org/lkml/48bdf87e-39f1-402f-a7dc-1a0e1e7a819d@nvidia.com/
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Link: https://patch.msgid.link/20260318095005.2437960-1-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-19 14:06:27 +01:00
Pengjie Zhang
be473f0591 ACPI: CPPC: Fix uninitialized ref variable in cppc_get_perf_caps()
Commit 8505bfb4e4 ("ACPI: CPPC: Move reference performance
to capabilities") introduced a logical error when retrieving
the reference performance.

On platforms lacking the reference performance register, the fallback
logic leaves the local 'ref' variable uninitialized (0). This causes
the subsequent sanity check to incorrectly return -EFAULT, breaking
amd_pstate initialization.

Fix this by assigning 'ref = nom' in the fallback path.

Fixes: 8505bfb4e4 ("ACPI: CPPC: Move reference performance to capabilities")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20260310003026.GA2639793@ax162/
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20260311071334.1494960-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-11 12:34:02 +01:00
Pengjie Zhang
8505bfb4e4 ACPI: CPPC: Move reference performance to capabilities
Currently, the `Reference Performance` register is read every time
the CPU frequency is sampled in `cppc_get_perf_ctrs()`. This function
is on the hot path of the cppc_cpufreq driver.

Reference Performance indicates the performance level that corresponds
to the Reference Counter incrementing and is not expected to change
dynamically during runtime (unlike the Delivered and Reference counters).

Reading this register in the hot path incurs unnecessary overhead,
particularly on platforms where CPC registers are located in the PCC
(Platform Communication Channel) subspace. This patch moves
`reference_perf` from the dynamic feedback counters structure
(`cppc_perf_fb_ctrs`) to the static capabilities structure
(`cppc_perf_caps`).

Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
[ rjw: Changelog adjustment ]
Link: https://patch.msgid.link/20260213100935.19111-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-05 20:00:19 +01:00
Sumit Gupta
13c45a2663 ACPI: CPPC: add APIs and sysfs interface for perf_limited
Add sysfs interface to read/write the Performance Limited register.

The Performance Limited register indicates to the OS that an
unpredictable event (like thermal throttling) has limited processor
performance. It contains two sticky bits set by the platform:
  - Bit 0 (Desired_Excursion): Set when delivered performance is
    constrained below desired performance. Not used when Autonomous
    Selection is enabled.
  - Bit 1 (Minimum_Excursion): Set when delivered performance is
    constrained below minimum performance.

These bits remain set until OSPM explicitly clears them. The write
operation accepts a bitmask of bits to clear:
  - Write 0x1 to clear bit 0
  - Write 0x2 to clear bit 1
  - Write 0x3 to clear both bits

This enables users to detect if platform throttling impacted a workload.
Users clear the register before execution, run the workload, then check
afterward - if set, hardware throttling occurred during that time window.

The interface is exposed as:
  /sys/devices/system/cpu/cpuX/cpufreq/perf_limited

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-7-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-27 20:50:42 +01:00
Sumit Gupta
38428a6800 ACPI: CPPC: Extend cppc_set_epp_perf() for FFH/SystemMemory
Extend cppc_set_epp_perf() to write both auto_sel and energy_perf
registers when they are in FFH or SystemMemory address space.

This keeps the behavior consistent with PCC case where both registers
are already updated together, but was missing for FFH/SystemMemory.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-4-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-27 20:50:41 +01:00
Sumit Gupta
b3e45fb2db ACPI: CPPC: Warn on missing mandatory DESIRED_PERF register
Add a warning during CPPC processor probe if the Desired Performance
register is not supported when it should be.

As per 8.4.6.1.2.3 section of ACPI 6.6 specification,
"The Desired Performance Register is optional only when OSPM indicates
support for CPPC2 in the platform-wide _OSC capabilities and the
Autonomous Selection Enable field is encoded as an Integer with a
value of 1."

In other words:
- In CPPC v1, DESIRED_PERF is mandatory
- In CPPC v2, it becomes optional only when AUTO_SEL_ENABLE is supported

This helps detect firmware configuration issues early during boot.

Link: https://lore.kernel.org/lkml/9fa21599-004a-4af8-acc2-190fd0404e35@nvidia.com/
Suggested-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-3-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-27 20:50:41 +01:00
Sumit Gupta
658fa7b1c4 ACPI: CPPC: Add cppc_get_perf() API to read performance controls
Add cppc_get_perf() function to read values of performance control
registers including desired_perf, min_perf, max_perf, energy_perf,
and auto_sel.

This provides a read interface to complement the existing
cppc_set_perf() write interface for performance control registers.

Note that auto_sel is read by cppc_get_perf() but not written by
cppc_set_perf() to avoid unintended mode changes during performance
updates. It can be updated with existing dedicated cppc_set_auto_sel()
API.

Use cppc_get_perf() in cppc_cpufreq_get_cpu_data() to initialize
perf_ctrls with current hardware register values during cpufreq
policy initialization.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-2-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-27 20:50:41 +01:00
Linus Torvalds
32a92f8c89 Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 20:03:00 -08:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Linus Torvalds
9a199794fd More ACPI support updates for 7.0-rc1
- Add an unused power resource handling quirk for THUNDEROBOT ZERO (Zhai
    Can)
 
  - Fix remaining for_each_possible_cpu() in the ACPI CPPC library to use
    online CPUs (Sean V Kelley)
 
  - Drop redundant checks from the ACPI notify handler and the driver
    remove callback in the ACPI battery driver (Rafael Wysocki)
 
  - Move the creation of the wakeup source during the ACPI button driver
    probe to an earlier point to avoid missing a wakeup event due to a
    race and clean up system wakeup handling and remove callback in that
    driver (Rafael Wysocki)
 
  - Drop unnecessary driver_data pointer clearing from the ACPI EC and
    SMBUS HC drivers and make the ACPI backlight (video) driver clear the
    device's driver_data pointer on remove (Rafael Wysocki)
 
  - Force enabling of PWM2 on the Yogabook YB1-X90 tablets (Yauhen
    Kharuzhy)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmmWFKUSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO17FoH+wV8F4NRk13Si+DC99uh37N0IC4BdEd+
 9rly7D77i+Egaj+yH/fIHXl1ul5ArGkHEhs7reMToSNfRFzSYXTSLGUaIa0tMYJ+
 p1IV6vKtSShyl58FxOfMeYBgMlI5Iuc4ZMxzKiZzmBo55vmLaOrMp2i2hYkDi4wl
 qZIU9DGSTyTUefqVRUiMg3pqIpLSg6CzXIn0Pv7Mj+qd/BTAFkXiMvIjk0Jl8LLR
 Sf84CMkgzTzXA+rO98AQXHNNrC3R45Xgec9yEsSROgOR+c6ZijIYXj3HJNer9tP7
 bMdy6yTw86Wy9KJedGx94WmD/xLGQLIWRIAkki2vfJpCBJ07FEJLYeE=
 =Aw9u
 -----END PGP SIGNATURE-----

Merge tag 'acpi-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more ACPI support updates from Rafael J. Wysocki:
 "These are mostly fixes and cleanups on top of the ACPI support updates
  merged recently, including two new quirks, an ACPI CPPC library fix,
  and fixes and cleanups of a few core ACPI device drivers:

   - Add an unused power resource handling quirk for THUNDEROBOT ZERO
     (Zhai Can)

   - Fix remaining for_each_possible_cpu() in the ACPI CPPC library to
     use online CPUs (Sean V Kelley)

   - Drop redundant checks from the ACPI notify handler and the driver
     remove callback in the ACPI battery driver (Rafael Wysocki)

   - Move the creation of the wakeup source during the ACPI button
     driver probe to an earlier point to avoid missing a wakeup event
     due to a race and clean up system wakeup handling and remove
     callback in that driver (Rafael Wysocki)

   - Drop unnecessary driver_data pointer clearing from the ACPI EC and
     SMBUS HC drivers and make the ACPI backlight (video) driver clear
     the device's driver_data pointer on remove (Rafael Wysocki)

   - Force enabling of PWM2 on the Yogabook YB1-X90 tablets (Yauhen
     Kharuzhy)"

* tag 'acpi-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: PM: Add unused power resource quirk for THUNDEROBOT ZERO
  ACPI: driver: Drop driver_data pointer clearing from two drivers
  ACPI: video: Clear driver_data pointer on remove
  ACPI: button: Tweak acpi_button_remove()
  ACPI: button: Tweak system wakeup handling
  ACPI: battery: Drop redundant checks from acpi_battery_remove()
  ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs
  ACPI: x86: Force enabling of PWM2 on the Yogabook YB1-X90
  ACPI: button: Call device_init_wakeup() earlier during probe
  ACPI: battery: Drop redundant check from acpi_battery_notify()
2026-02-18 14:28:57 -08:00
Sean V Kelley
56eb0c0ed3 ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().

However, send_pcc_cmd() and acpi_get_psd_map() still iterate over all
possible CPUs. In acpi_get_psd_map(), encountering an offline CPU
returns -EFAULT, causing cppc_cpufreq initialization to fail.

This breaks systems booted with "nosmt" or "nosmt=force".

Fix by using for_each_online_cpu() in both functions.

Fixes: 80b8286aee ("ACPI / CPPC: support for batching CPPC requests")
Signed-off-by: Sean V Kelley <skelley@nvidia.com>
Link: https://patch.msgid.link/20260211212254.30190-1-skelley@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-12 12:43:01 +01:00
Linus Torvalds
9b1b3dcd28 Power management updates for 6.20-rc1/7.0-rc1
- Remove the unused omap-cpufreq driver (Andreas Kemnade)
 
  - Optimize error handling code in cpufreq_boost_trigger_state() and
    make cpufreq_boost_trigger_state() return -EOPNOTSUPP if no policy
    supports boost (Lifeng Zheng)
 
  - Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
    Dhruva Gole, and Konrad Dybcio)
 
  - Minor improvements to the cpufreq and cpumask rust implementation
    (Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen)
 
  - Add support for AM62L3 SoC to the ti-cpufreq driver (Dhruva Gole)
 
  - Update arch_freq_scale in the CPPC cpufreq driver's frequency
    invariance engine (FIE) in scheduler ticks if the related CPPC
    registers are not in PCC (Jie Zhan)
 
  - Assorted minor cleanups and improvements in ARM cpufreq drivers (Juan
    Martinez, Felix Gu, Luca Weiss, and Sergey Shtylyov)
 
  - Add generic helpers for sysfs show/store to cppc_cpufreq (Sumit
    Gupta)
 
  - Make the scaling_setspeed cpufreq sysfs attribute return the actual
    requested frequency to avoid confusion (Pengjie Zhang)
 
  - Simplify the idle CPU time granularity test in the ondemand cpufreq
    governor (Frederic Weisbecker)
 
  - Enable asym capacity in intel_pstate only when CPU SMT is not
    possible (Yaxiong Tian)
 
  - Update the description of rate_limit_us default value in cpufreq
    documentation (Yaxiong Tian)
 
  - Add a command line option to adjust the C-states table in the
    intel_idle driver, remove the 'preferred_cstates' module parameter
    from it, add C-states validation to it and clean it up (Artem
    Bityutskiy)
 
  - Make the menu cpuidle governor always check the time till the closest
    timer event when the scheduler tick has been stopped to prevent it
    from mistakenly selecting the deepest available idle state (Rafael
    Wysocki)
 
  - Update the teo cpuidle governor to avoid making suboptimal decisions
    in certain corner cases and generally improve idle state selection
    accuracy (Rafael Wysocki)
 
  - Remove an unlikely() annotation on the early-return condition in
    menu_select() that leads to branch misprediction 100% of the time
    on systems with only 1 idle state enabled, like ARM64 servers (Breno
    Leitao)
 
  - Add Christian Loehle to MAINTAINERS as a cpuidle reviewer (Christian
    Loehle)
 
  - Stop flagging the PM runtime workqueue as freezable to avoid system
    suspend and resume deadlocks in subsystems that assume asynchronous
    runtime PM to work during system-wide PM transitions (Rafael Wysocki)
 
  - Drop redundant NULL pointer checks before acomp_request_free() from
    the hibernation code handling image saving (Rafael Wysocki)
 
  - Update wakeup_sources_walk_start() to handle empty lists of wakeup
    sources as appropriate (Samuel Wu)
 
  - Make dev_pm_clear_wake_irq() check the power.wakeirq value under
    power.lock to avoid race conditions (Gui-Dong Han)
 
  - Avoid bit field races related to power.work_in_progress in the core
    device suspend code (Xuewen Yan)
 
  - Make several drivers discard pm_runtime_put() return value in
    preparation for converting that function to a void one (Rafael
    Wysocki)
 
  - Add PL4 support for Ice Lake to the Intel RAPL power capping
    driver (Daniel Tang)
 
  - Replace sprintf() with sysfs_emit() in power capping sysfs show
    functions (Sumeet Pawnikar)
 
  - Make dev_pm_opp_get_level() return value match the documentation
    after a previous update of the latter (Aleks Todorov)
 
  - Use scoped for each OF child loop in the OPP code (Krzysztof
    Kozlowski)
 
  - Fix a bug in an example code snippet and correct typos in the energy
    model management documentation (Patrick Little)
 
  - Fix miscellaneous problems in cpupower (Kaushlendra Kumar):
 
    * idle_monitor: Fix incorrect value logged after stop
    * Fix inverted APERF capability check
    * Use strcspn() to strip trailing newline
    * Reset errno before strtoull()
    * Show C0 in idle-info dump
 
  - Improve cpupower installation procedure by making the systemd step
    optional and allowing users to disable the installation of systemd's
    unit file (João Marcos Costa)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmmDr5ASHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1Q8oH/0KRqdidHzesIQl6gd5WSS/sWdxODRUt
 R9dEGQQ6LXCY0z05RAq29HZQf618fYuRFX4PSrtyCvrcRJK7MJKuzK55MRq0MC3c
 c/2pL1PdpHexjLXUP9pcoxrYjetsr7SnD6Y0M3JfOPg1E/bG8sp1DlnE8cdqrL0W
 lrdB2cEGewT2SVkNhCIQ2n6bwfQwmLlfQl1vXTM8BA7xCjoslePUJlRphAFVAt/J
 5fQxSOH0eSxK5PYQFUDM2D2J3uMAN0pFb6eIjwVYYqjABqV//BPl99Rv2W3ElJq7
 K/SICRWlvzyINCgF15QAUtQHWdINxSb0GzovECVxODHOv0N4mKHdpNU=
 =QlVe
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "By the number of commits, cpufreq is the leading party (again) and the
  most visible change there is the removal of the omap-cpufreq driver
  that has not been used for a long time (good riddance). There are also
  quite a few changes in the cppc_cpufreq driver, mostly related to
  fixing its frequency invariance engine in the case when the CPPC
  registers used by it are not in PCC. In addition to that, support for
  AM62L3 is added to the ti-cpufreq driver and the cpufreq-dt-platdev
  list is updated for some platforms. The remaining cpufreq changes are
  assorted fixes and cleanups.

  Next up is cpuidle and the changes there are dominated by intel_idle
  driver updates, mostly related to the new command line facility
  allowing users to adjust the list of C-states used by the driver.
  There are also a few updates of cpuidle governors, including two menu
  governor fixes and some refinements of the teo governor, and a
  MAINTAINERS update adding Christian Loehle as a cpuidle reviewer.
  [Thanks for stepping up Christian!]

  The most significant update related to system suspend and hibernation
  is the one to stop freezing the PM runtime workqueue during system PM
  transitions which allows some deadlocks to be avoided. There is also a
  fix for possible concurrent bit field updates in the core device
  suspend code and a few other minor fixes.

  Apart from the above, several drivers are updated to discard the
  return value of pm_runtime_put() which is going to be converted to a
  void function as soon as everybody stops using its return value, PL4
  support for Ice Lake is added to the Intel RAPL power capping driver,
  and there are assorted cleanups, documentation fixes, and some
  cpupower utility improvements.

  Specifics:

   - Remove the unused omap-cpufreq driver (Andreas Kemnade)

   - Optimize error handling code in cpufreq_boost_trigger_state() and
     make cpufreq_boost_trigger_state() return -EOPNOTSUPP if no policy
     supports boost (Lifeng Zheng)

   - Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
     Dhruva Gole, and Konrad Dybcio)

   - Minor improvements to the cpufreq and cpumask rust implementation
     (Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen)

   - Add support for AM62L3 SoC to the ti-cpufreq driver (Dhruva Gole)

   - Update arch_freq_scale in the CPPC cpufreq driver's frequency
     invariance engine (FIE) in scheduler ticks if the related CPPC
     registers are not in PCC (Jie Zhan)

   - Assorted minor cleanups and improvements in ARM cpufreq drivers
     (Juan Martinez, Felix Gu, Luca Weiss, and Sergey Shtylyov)

   - Add generic helpers for sysfs show/store to cppc_cpufreq (Sumit
     Gupta)

   - Make the scaling_setspeed cpufreq sysfs attribute return the actual
     requested frequency to avoid confusion (Pengjie Zhang)

   - Simplify the idle CPU time granularity test in the ondemand cpufreq
     governor (Frederic Weisbecker)

   - Enable asym capacity in intel_pstate only when CPU SMT is not
     possible (Yaxiong Tian)

   - Update the description of rate_limit_us default value in cpufreq
     documentation (Yaxiong Tian)

   - Add a command line option to adjust the C-states table in the
     intel_idle driver, remove the 'preferred_cstates' module parameter
     from it, add C-states validation to it and clean it up (Artem
     Bityutskiy)

   - Make the menu cpuidle governor always check the time till the
     closest timer event when the scheduler tick has been stopped to
     prevent it from mistakenly selecting the deepest available idle
     state (Rafael Wysocki)

   - Update the teo cpuidle governor to avoid making suboptimal
     decisions in certain corner cases and generally improve idle state
     selection accuracy (Rafael Wysocki)

   - Remove an unlikely() annotation on the early-return condition in
     menu_select() that leads to branch misprediction 100% of the time
     on systems with only 1 idle state enabled, like ARM64 servers
     (Breno Leitao)

   - Add Christian Loehle to MAINTAINERS as a cpuidle reviewer
     (Christian Loehle)

   - Stop flagging the PM runtime workqueue as freezable to avoid system
     suspend and resume deadlocks in subsystems that assume asynchronous
     runtime PM to work during system-wide PM transitions (Rafael
     Wysocki)

   - Drop redundant NULL pointer checks before acomp_request_free() from
     the hibernation code handling image saving (Rafael Wysocki)

   - Update wakeup_sources_walk_start() to handle empty lists of wakeup
     sources as appropriate (Samuel Wu)

   - Make dev_pm_clear_wake_irq() check the power.wakeirq value under
     power.lock to avoid race conditions (Gui-Dong Han)

   - Avoid bit field races related to power.work_in_progress in the core
     device suspend code (Xuewen Yan)

   - Make several drivers discard pm_runtime_put() return value in
     preparation for converting that function to a void one (Rafael
     Wysocki)

   - Add PL4 support for Ice Lake to the Intel RAPL power capping driver
     (Daniel Tang)

   - Replace sprintf() with sysfs_emit() in power capping sysfs show
     functions (Sumeet Pawnikar)

   - Make dev_pm_opp_get_level() return value match the documentation
     after a previous update of the latter (Aleks Todorov)

   - Use scoped for each OF child loop in the OPP code (Krzysztof
     Kozlowski)

   - Fix a bug in an example code snippet and correct typos in the
     energy model management documentation (Patrick Little)

   - Fix miscellaneous problems in cpupower (Kaushlendra Kumar):
      * idle_monitor: Fix incorrect value logged after stop
      * Fix inverted APERF capability check
      * Use strcspn() to strip trailing newline
      * Reset errno before strtoull()
      * Show C0 in idle-info dump

   - Improve cpupower installation procedure by making the systemd step
     optional and allowing users to disable the installation of
     systemd's unit file (João Marcos Costa)"

* tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits)
  PM: sleep: core: Avoid bit field races related to work_in_progress
  PM: sleep: wakeirq: harden dev_pm_clear_wake_irq() against races
  cpufreq: Documentation: Update description of rate_limit_us default value
  cpufreq: intel_pstate: Enable asym capacity only when CPU SMT is not possible
  PM: wakeup: Handle empty list in wakeup_sources_walk_start()
  PM: EM: Documentation: Fix bug in example code snippet
  Documentation: Fix typos in energy model documentation
  cpuidle: governors: teo: Refine intercepts-based idle state lookup
  cpuidle: governors: teo: Adjust the classification of wakeup events
  cpufreq: ondemand: Simplify idle cputime granularity test
  cpufreq: userspace: make scaling_setspeed return the actual requested frequency
  PM: hibernate: Drop NULL pointer checks before acomp_request_free()
  cpufreq: CPPC: Add generic helpers for sysfs show/store
  cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id()
  cpufreq: ti-cpufreq: add support for AM62L3 SoC
  cpufreq: dt-platdev: Add ti,am62l3 to blocklist
  cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy
  cpufreq: scmi: correct SCMI explanation
  cpufreq: dt-platdev: Block the driver from probing on more QC platforms
  rust: cpumask: rename methods of Cpumask for clarity and consistency
  ...
2026-02-09 19:00:42 -08:00
Sumit Gupta
83e2908c1d ACPI: CPPC: Rename EPP constants for clarity
Update EPP (Energy Performance Preference) constants for more clarity:

 - Add CPPC_EPP_PERFORMANCE_PREF (0x00) for performance preference.

 - Rename CPPC_ENERGY_PERF_MAX to CPPC_EPP_ENERGY_EFFICIENCY_PREF (0xFF)
   for energy efficiency.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260120145623.2959636-4-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-27 21:21:05 +01:00
Rafael J. Wysocki
b753c3204d CPUFreq arm updates for 7.0
- Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
   Dhruva Gole, and Konrad Dybcio).
 
 - Minor improvements to the cpufreq / cpumask rust implementation
   (Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen).
 
 - Add support for AM62L3 SoC to ti-cpufreq driver (Dhruva Gole).
 
 - Update FIE arch_freq_scale in ticks for non-PCC regs (Jie Zhan).
 
 - Other minor cleanups / improvements (Felix Gu, Juan Martinez, Luca
   Weiss, and Sergey Shtylyov).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEx73Crsp7f6M6scA70rkcPK6BEhwFAml4YIcACgkQ0rkcPK6B
 Ehx6hQ//Y8GoICqNQe6IrVQ6b9eJLB/YOP7vPyZuwc0iqT9YWXrMou5xlnmNW/IY
 zj0Wo3l3fidp6eDOo7f21mXALF9yt8kElKq411Oqcg4WVWyAXc9p6ODpWZp/G2/h
 JcmusAkwFPul0XE0QJmlJ54KqtsyjoSWQHtrPzOO54mJEhOL4dWQwqhWP046ed7T
 FVkNRLb7ysY3+weNuAg45SbVJ3FT/a7f8XbdGd5DAz96efbqTyFt+znhfsd3Xti+
 sF75Mq1AEN0Vnfb8ZP4MZUCe7zeVdOVfLFqXXiW/qJOdbRgoD6k2uAOIt2NAcYU/
 sbv6UjaW0NE0oTvKbJ8CLE4IIJudjRgVNjyyGlKHdjBVgMQQk9vr7DIedGLghink
 VABcyerIqhPFGkBkY0IXkLSmhNtKWoLN9w7sMeCwhE34l63Bnie3Sg9JLikRxaXK
 6BAm3+8BiG30tg4WL1LX8UyssnMlUGvvOD9TCP4jOfLQeAk8wWgQ1D+CwWCB5o5j
 jtDwPTOCIN9UQT46lYS+kkqzwf4YTFVdA4c23Tod70gjrtm7Z1a7UzYNxoTGcGS3
 VtrgVnlgPh3/I/95Qpsgoy5D1oeIz9znFoVv6ETPBINy4A4rAsYMA4DEASfM7tIY
 pNhbSTcbtDbp6Eo79hkh5J2ZGoJyTSthrX+irqOz+IQFp8fP9s4=
 =Zz3B
 -----END PGP SIGNATURE-----

Merge tag 'cpufreq-arm-updates-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Pull CPUFreq Arm updates for 7.0 from Viresh Kumar:

"- Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
   Dhruva Gole, and Konrad Dybcio).

 - Minor improvements to the cpufreq / cpumask rust implementation
   (Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen).

 - Add support for AM62L3 SoC to ti-cpufreq driver (Dhruva Gole).

 - Update FIE arch_freq_scale in ticks for non-PCC regs (Jie Zhan).

 - Other minor cleanups / improvements (Felix Gu, Juan Martinez, Luca
   Weiss, and Sergey Shtylyov)."

* tag 'cpufreq-arm-updates-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id()
  cpufreq: ti-cpufreq: add support for AM62L3 SoC
  cpufreq: dt-platdev: Add ti,am62l3 to blocklist
  cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy
  cpufreq: scmi: correct SCMI explanation
  cpufreq: dt-platdev: Block the driver from probing on more QC platforms
  rust: cpumask: rename methods of Cpumask for clarity and consistency
  cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs
  cpufreq: CPPC: Factor out cppc_fie_kworker_init()
  ACPI: CPPC: Factor out and export per-cpu cppc_perf_ctrs_in_pcc_cpu()
  rust: cpufreq: replace `kernel::c_str!` with C-Strings
  cpufreq: Add Tegra186 and Tegra194 to cpufreq-dt-platdev blocklist
  dt-bindings: cpufreq: qcom-hw: document Milos CPUFREQ Hardware
  rust: cpufreq: add __rust_helper to helpers
  rust: cpufreq: always inline functions using build_assert with arguments
2026-01-27 14:46:28 +01:00
Jie Zhan
f9cadb3d56 ACPI: CPPC: Factor out and export per-cpu cppc_perf_ctrs_in_pcc_cpu()
Factor out cppc_perf_ctrs_in_pcc_cpu() for checking whether per-cpu CPC
regs are defined in PCC channels, and export it out for further use.

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2026-01-27 11:21:23 +05:30
Pengjie Zhang
6ea3a44cef ACPI: CPPC: Fix missing PCC check for guaranteed_perf
The current implementation overlooks the 'guaranteed_perf'
register in this check.

If the Guaranteed Performance register is located in the PCC
subspace, the function currently attempts to read it without
acquiring the lock and without sending the CMD_READ doorbell
to the firmware. This can result in reading stale data.

Fixes: 29523f0953 ("ACPI / CPPC: Add support for guaranteed performance")
Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
Cc: 4.20+ <stable@vger.kernel.org> # 4.20+
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20251210132227.1988380-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-12-15 12:56:46 +01:00
Gautham R. Shenoy
0fce758706 ACPI: CPPC: Limit perf ctrs in PCC check only to online CPUs
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPU via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().

However the function cppc_perf_ctrs_in_pcc() checks if the CPPC
perf-ctrs are in a PCC region for all the present CPUs, which breaks
when the kernel is booted with "nosmt=force".

Hence, limit the check only to the online CPUs.

Fixes: ae2df912d1 ("ACPI: CPPC: Disable FIE if registers in PCC regions")
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-5-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-07 18:37:42 +01:00
Gautham R. Shenoy
8821c8e80a ACPI: CPPC: Perform fast check switch only for online CPUs
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().

However the function cppc_allow_fast_switch() checks for the validity
of the _CPC object for all the present CPUs. This breaks when the
kernel is booted with "nosmt=force".

Check fast_switch capability only on online CPUs

Fixes: 15eece6c5b ("ACPI: CPPC: Fix NULL pointer dereference when nosmp is used")
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-4-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-07 18:37:42 +01:00
Gautham R. Shenoy
6dd3b8a709 ACPI: CPPC: Check _CPC validity for only the online CPUs
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().

However the function acpi_cpc_valid() checks for the validity of the
_CPC object for all the present CPUs. This breaks when the kernel is
booted with "nosmt=force".

Hence check the validity of the _CPC objects of only the online CPUs.

Fixes: 2aeca6bd02 ("ACPI: CPPC: Check present CPUs for determining _CPC is valid")
Reported-by: Christopher Harris <chris.harris79@gmail.com>
Closes: https://lore.kernel.org/lkml/CAM+eXpdDT7KjLV0AxEwOLkSJ2QtrsvGvjA2cCHvt1d0k2_C4Cw@mail.gmail.com/
Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Tested-by: Chrisopher Harris <chris.harris79@gmail.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-3-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-07 18:37:42 +01:00
Chu Guangqing
1642fabff1 ACPI: CPPC: Fix typo in a comment
Fix spelling from "pachage" to "package".

Signed-off-by: Chu Guangqing <chuguangqing@inspur.com>
[ rjw: Changelog and subject edits ]
Link: https://patch.msgid.link/20251031055240.2791-1-chuguangqing@inspur.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-03 17:25:42 +01:00
Rafael J. Wysocki
c28a280bd4 ACPI: CPPC: Do not use CPUFREQ_ETERNAL as an error value
Instead of using CPUFREQ_ETERNAL for signaling an error condition
in cppc_get_transition_latency(), change the return value type of
that function to int and make it return a proper negative error
code on failures.

No intentional functional impact.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Qais Yousef <qyousef@layalina.io>
2025-10-01 13:57:13 +02:00
Yunhui Cui
15eece6c5b ACPI: CPPC: Fix NULL pointer dereference when nosmp is used
With nosmp in cmdline, other CPUs are not brought up, leaving
their cpc_desc_ptr NULL. CPU0's iteration via for_each_possible_cpu()
dereferences these NULL pointers, causing panic.

Panic backtrace:

[    0.401123] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b8
...
[    0.403255] [<ffffffff809a5818>] cppc_allow_fast_switch+0x6a/0xd4
...
Kernel panic - not syncing: Attempted to kill init!

Fixes: 3cc30dd00a ("cpufreq: CPPC: Enable fast_switch")
Reported-by: Xu Lu <luxu.kernel@bytedance.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://patch.msgid.link/20250604023036.99553-1-cuiyunhui@bytedance.com
[ rjw: New subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-10 20:52:50 +02:00
Lifeng Zheng
f35e5b3ccf ACPI: CPPC: Add three functions related to autonomous selection
cppc_set_epp() - write energy performance preference register value,
based on ACPI 6.5, s8.4.6.1.7

cppc_get_auto_act_window() - read autonomous activity window register
value, based on ACPI 6.5, s8.4.6.1.6

cppc_set_auto_act_window() - write autonomous activity window register
value, based on ACPI 6.5, s8.4.6.1.6

Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-9-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-30 22:01:31 +02:00
Lifeng Zheng
2605e4ab66 ACPI: CPPC: Modify cppc_get_auto_sel_caps() to cppc_get_auto_sel()
Modify cppc_get_auto_sel_caps() to cppc_get_auto_sel(). Using a
cppc_perf_caps to carry the value is unnecessary.

Add a check to ensure the pointer 'enable' is not null.

Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-8-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-30 22:01:31 +02:00
Lifeng Zheng
ab482f1bac ACPI: CPPC: Refactor register value get and set ABIs
Refactor register value get and set ABIs by using cppc_get_reg_val(),
cppc_set_reg_val() and CPPC_REG_VAL_READ().

Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-7-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-30 22:01:31 +02:00
Lifeng Zheng
e05c75072c ACPI: CPPC: Add cppc_set_reg_val()
Add cppc_set_reg_val() as a generic function for setting CPPC register
values, with this features:

 1. Check register. If a register is writeable, it must be a buffer and
    can not be null.

 2. Extract the operations if register is in PCC out as
    cppc_set_reg_val_in_pcc().

This function can be used to reduce some existing code duplication.

Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-6-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-30 22:01:30 +02:00
Lifeng Zheng
b5ef45e6a1 ACPI: CPPC: Extract cppc_get_reg_val_in_pcc()
Extract the operations if register is in pcc out from cppc_get_reg_val()
as cppc_get_reg_val_in_pcc().

Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-5-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-30 22:01:30 +02:00
Lifeng Zheng
714d103ce8 ACPI: CPPC: Rename cppc_get_perf() to cppc_get_reg_val()
Rename cppc_get_perf() to cppc_get_reg_val() as a generic function to
read CPPC registers.

Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-4-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-30 22:01:30 +02:00
Lifeng Zheng
45f3763a21 ACPI: CPPC: Optimize cppc_get_perf()
Optimize cppc_get_perf() with three changes:

 1. Change the error kind to "no such device" when pcc_ss_id < 0, as
    other register value getting functions.

 2. Add a check to ensure the pointer 'perf' is no null.

 3. Add a check to verify if the register is supported to be read before
    using it. The logic is:

    (1) If the register is of the integer type, check whether the
        register is optional and its value is 0. If yes, the register
        is not supported.

    (2) If the register is of other types, a null one is not supported.

 4. Return the result of cpc_read() instead of 0.

Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-3-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-30 22:01:30 +02:00
Lifeng Zheng
e3d7935a6c ACPI: CPPC: Add IS_OPTIONAL_CPC_REG macro to judge if a cpc_reg is optional
In ACPI 6.5, s8.4.6.1 _CPC (Continuous Performance Control), whether
each of the per-cpu cpc_regs[] is mandatory or optional is defined.

Since the CPC_SUPPORTED() check is only for optional _CPC fields,
another macro to check if the field is optional is needed.

Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-2-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-30 22:01:30 +02:00
Sudeep Holla
3a3ce10e7a ACPI: CPPC: Simplify PCC shared memory region handling
The PCC driver now handles mapping and unmapping of shared memory
areas as part of pcc_mbox_{request,free}_channel(). Without these before,
this ACPI CPPC driver did handling of those mappings like several
other PCC mailbox client drivers.

There were redundant operations, leading to unnecessary code. Maintaining
the consistency across these driver was harder due to scattered handling
of shmem.

Just use the mapped shmem and remove all redundant operations from this
driver.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/20250411113144.1151094-2-sudeep.holla@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-04-30 21:51:23 +02:00
Rafael J. Wysocki
1c58e3a528 Merge branches 'acpi-battery', 'acpi-ec', 'acpi-pfr' and 'acpi-osl'
Merge updates of the ACPI battery and EC drivers, an ACPI Platform
Firmware Runtime (PFR) telemetry driver update and an ACPI OS support
layer change for 6.13-rc1:

 - Use DEFINE_SIMPLE_DEV_PM_OPS in the ACPI battery driver, make it use
   devm_ for initializing mutexes and allocating driver data, and make
   it check the register_pm_notifier() return value (Thomas Weißschuh,
   Andy Shevchenko).

 - Make the ACPI EC driver support compile-time conditional and allow
   ACPI to be built without CONFIG_HAS_IOPORT (Arnd Bergmann).

 - Remove a redundant error check from the pfr_telemetry driver (Colin
   Ian King).

* acpi-battery:
  ACPI: battery: Check for error code from devm_mutex_init() call
  ACPI: battery: use DEFINE_SIMPLE_DEV_PM_OPS
  ACPI: battery: initialize mutexes through devm_ APIs
  ACPI: battery: allocate driver data through devm_ APIs
  ACPI: battery: check result of register_pm_notifier()

* acpi-ec:
  ACPI: EC: make EC support compile-time conditional

* acpi-pfr:
  ACPI: pfr_telemetry: remove redundant error check on ret

* acpi-osl:
  ACPI: allow building without CONFIG_HAS_IOPORT
2024-11-15 20:45:14 +01:00
Lifeng Zheng
2388b266c9 ACPI: CPPC: Fix _CPC register setting issue
Since commit 60949b7b80 ("ACPI: CPPC: Fix MASK_VAL() usage"), _CPC
registers cannot be changed from 1 to 0.

It turns out that there is an extra OR after MASK_VAL_WRITE(), which
has already ORed prev_val with the register mask.

Remove the extra OR to fix the problem.

Fixes: 60949b7b80 ("ACPI: CPPC: Fix MASK_VAL() usage")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20241113103309.761031-1-zhenglifeng1@huawei.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-11-13 12:45:23 +01:00
Mario Limonciello
b79276dcac ACPI: processor: Move arch_init_invariance_cppc() call later
arch_init_invariance_cppc() is called at the end of
acpi_cppc_processor_probe() in order to configure frequency invariance
based upon the values from _CPC.

This however doesn't work on AMD CPPC shared memory designs that have
AMD preferred cores enabled because _CPC needs to be analyzed from all
cores to judge if preferred cores are enabled.

This issue manifests to users as a warning since commit 21fb59ab4b
("ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn"):
```
Could not retrieve highest performance (-19)
```

However the warning isn't the cause of this, it was actually
commit 279f838a61 ("x86/amd: Detect preferred cores in
amd_get_boost_ratio_numerator()") which exposed the issue.

To fix this problem, change arch_init_invariance_cppc() into a new weak
symbol that is called at the end of acpi_processor_driver_init().
Each architecture that supports it can declare the symbol to override
the weak one.

Define it for x86, in arch/x86/kernel/acpi/cppc.c, and for all of the
architectures using the generic arch_topology.c code.

Fixes: 279f838a61 ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()")
Reported-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219431
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20241104222855.3959267-1-superm1@kernel.org
[ rjw: Changelog edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-11-06 21:31:36 +01:00
Arnd Bergmann
4435a12501 ACPI: allow building without CONFIG_HAS_IOPORT
CONFIG_HAS_IOPORT will soon become optional and cause a build time
failure when it is disabled but a driver calls inb()/outb(). At the
moment, all architectures that can support ACPI have port I/O, but this
is not necessarily the case in the future on non-x86 architectures.
The result is a set of errors like:

drivers/acpi/osl.c: In function 'acpi_os_read_port':
include/asm-generic/io.h:542:14: error: call to '_inb' declared with attribute error: inb()) requires CONFIG_HAS_IOPORT

Nothing should actually call these functions in this configuration,
and if it does, the result would be undefined behavior today, possibly
a NULL pointer dereference.

Change the low-level functions to return a proper error code when
HAS_IOPORT is disabled.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20241030123701.1538919-2-arnd@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-11-05 21:41:10 +01:00
Pierre Gondois
1c10941e34 ACPI: CPPC: Make rmw_lock a raw_spin_lock
The following BUG was triggered:

=============================
[ BUG: Invalid wait context ]
6.12.0-rc2-XXX #406 Not tainted
-----------------------------
kworker/1:1/62 is trying to lock:
ffffff8801593030 (&cpc_ptr->rmw_lock){+.+.}-{3:3}, at: cpc_write+0xcc/0x370
other info that might help us debug this:
context-{5:5}
2 locks held by kworker/1:1/62:
  #0: ffffff897ef5ec98 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2c/0x50
  #1: ffffff880154e238 (&sg_policy->update_lock){....}-{2:2}, at: sugov_update_shared+0x3c/0x280
stack backtrace:
CPU: 1 UID: 0 PID: 62 Comm: kworker/1:1 Not tainted 6.12.0-rc2-g9654bd3e8806 #406
Workqueue:  0x0 (events)
Call trace:
  dump_backtrace+0xa4/0x130
  show_stack+0x20/0x38
  dump_stack_lvl+0x90/0xd0
  dump_stack+0x18/0x28
  __lock_acquire+0x480/0x1ad8
  lock_acquire+0x114/0x310
  _raw_spin_lock+0x50/0x70
  cpc_write+0xcc/0x370
  cppc_set_perf+0xa0/0x3a8
  cppc_cpufreq_fast_switch+0x40/0xc0
  cpufreq_driver_fast_switch+0x4c/0x218
  sugov_update_shared+0x234/0x280
  update_load_avg+0x6ec/0x7b8
  dequeue_entities+0x108/0x830
  dequeue_task_fair+0x58/0x408
  __schedule+0x4f0/0x1070
  schedule+0x54/0x130
  worker_thread+0xc0/0x2e8
  kthread+0x130/0x148
  ret_from_fork+0x10/0x20

sugov_update_shared() locks a raw_spinlock while cpc_write() locks a
spinlock.

To have a correct wait-type order, update rmw_lock to a raw spinlock and
ensure that interrupts will be disabled on the CPU holding it.

Fixes: 60949b7b80 ("ACPI: CPPC: Fix MASK_VAL() usage")
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://patch.msgid.link/20241028125657.1271512-1-pierre.gondois@arm.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-29 12:56:19 +01:00
liwei
d93df29bda cpufreq: CPPC: fix perf_to_khz/khz_to_perf conversion exception
When the nominal_freq recorded by the kernel is equal to the lowest_freq,
and the frequency adjustment operation is triggered externally, there is
a logic error in cppc_perf_to_khz()/cppc_khz_to_perf(), resulting in perf
and khz conversion errors.

Fix this by adding a branch processing logic when nominal_freq is equal
to lowest_freq.

Fixes: ec1c7ad476 ("cpufreq: CPPC: Fix performance/frequency conversion")
Signed-off-by: liwei <liwei728@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20241024022952.2627694-1-liwei728@huawei.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-24 17:35:13 +02:00
Al Viro
5f60d5f6bb move asm/unaligned.h to linux/unaligned.h
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-10-02 17:23:23 -04:00
Mario Limonciello
aaf21ac939 ACPI: CPPC: Add support for setting EPP register in FFH
Some Asus AMD systems are reported to not be able to change EPP values
because the BIOS doesn't advertise support for the CPPC MSR and the PCC
region is not configured.

However the ACPI 6.2 specification allows CPC registers to be declared
in FFH:
```
Starting with ACPI Specification 6.2, all _CPC registers can be in
PCC, System Memory, System IO, or Functional Fixed Hardware address
spaces. OSPM support for this more flexible register space scheme
is indicated by the “Flexible Address Space for CPPC Registers” _OSC
bit.
```

If this _OSC has been set allow using FFH to configure EPP.

Reported-by: al0uette@outlook.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218686
Suggested-by: al0uette@outlook.com
Tested-by: vderp@icloud.com
Tested-by: al0uette@outlook.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20240910031524.106387-1-superm1@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-09-10 20:25:28 +02:00
Clément Léger
60949b7b80 ACPI: CPPC: Fix MASK_VAL() usage
MASK_VAL() was added as a way to handle bit_offset and bit_width for
registers located in system memory address space. However, while suited
for reading, it does not work for writing and result in corrupted
registers when writing values with bit_offset > 0. Moreover, when a
register is collocated with another one at the same address but with a
different mask, the current code results in the other registers being
overwritten with 0s. The write procedure for SYSTEM_MEMORY registers
should actually read the value, mask it, update it and write it with the
updated value. Moreover, since registers can be located in the same
word, we must take care of locking the access before doing it. We should
potentially use a global lock since we don't know in if register
addresses aren't shared with another _CPC package but better not
encourage vendors to do so. Assume that registers can use the same word
inside a _CPC package and thus, use a per _CPC package lock.

Fixes: 2f4a4d63a1 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Link: https://patch.msgid.link/20240826101648.95654-1-cleger@rivosinc.com
[ rjw: Dropped redundant semicolon ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-09-02 14:24:40 +02:00
Prabhakar Pujeri
86932cd8cc ACPI: CPPC: Replace ternary operator with umax()
Replace ternary operator with umax() in cppc_find_dmi_mhz().

Signed-off-by: Prabhakar Pujeri <prabhakar.pujeri@gmail.com>
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20240626130941.1527127-2-prabhakar.pujeri@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-02 21:01:08 +02:00
Petr Tesařík
8c6294ccb5 ACPI: CPPC: add sysfs entry for guaranteed performance
Expose the CPPC guaranteed performance as reported by the platform through
GuaranteedPerformanceRegister.

The current value is already read in cppc_get_perf_caps() and stored in
struct cppc_perf_caps (to be used by the intel_pstate driver), so only the
attribute itself needs to be defined.

Signed-off-by: Petr Tesařík <ptesarik@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-13 21:15:56 +02:00
Rafael J. Wysocki
14449382b5 Merge branch 'pm-cpufreq'
Merge cpufreq updates for 6.10:

 - Rework the handling of disabled turbo in the intel_pstate driver and
   make it update the maximum CPU frequency consistently regardless of
   the reason on top of a number of cleanups (Rafael Wysocki).

 - Add missing checks for NULL .exit() cpufreq driver callback to the
   cpufreq core (Viresh Kumar).

 - Prevent pulicy->max from going above the frequency QoS maximum value
   when cpufreq_frequency_table_verify() is used (Xuewen Yan).

 - Prevent a negative CPU number or frequency value from being printed
   if they are really large (Joshua Yeong).

 - Update MAINTAINERS entry for amd-pstate to add two new submaintainers
   and a designated reviewer (Huang Rui).

 - Clean up the amd-pstate driver and update its documentation (Gautham
   Shenoy).

 - Fix the highest frequency issue in the amd-pstate driver which limits
   performance (Perry Yuan).

 - Enable CPPC v2 for certain processors in the family 17H, as requested
   by TR40 processor users who expect improved performance and lower
   system temperature (Perry Yuan).

 - Change latency and delay values to be read from platform firmware
   firstly for more accurate timing (Perry Yuan).

 - A new quirk is introduced for supporting amd-pstate on legacy
   processors which either lack CPPC capability, or only only have CPPC
   v2 capability (Perry Yuan).

 - Sun50i: Add support for opp_supported_hw, H616 platform and general
   cleanups (Andre Przywara, Martin Botka, Brandon Cheo Fusi, Dan
   Carpenter, Viresh Kumar).

 - CPPC: Fix possible null pointer dereference (Aleksandr Mishin).

 - Eliminate uses of of_node_put() (Javier Carrasco, and Shivani Gupta).

 - brcmstb-avs: ISO C90 forbids mixed declarations (Portia Stephens).

 - mediatek: Add support for MT7988A (Sam Shih).

 - cpufreq-qcom-hw: Add SM4450 compatibles in DT bindings (Tengfei Fan).

 - Fix struct cpudata::epp_cached kernel-doc in the intel_pstate cpufreq
   driver (Jeff Johnson).

* pm-cpufreq: (46 commits)
  cpufreq: amd-pstate: fix the highest frequency issue which limits performance
  cpufreq: intel_pstate: fix struct cpudata::epp_cached kernel-doc
  cpufreq: Fix up printing large CPU numbers and frequency values
  MAINTAINERS: cpufreq: amd-pstate: Add co-maintainers and reviewer
  cpufreq: amd-pstate: remove unused variable lowest_nonlinear_freq
  cpufreq: amd-pstate: fix code format problems
  cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing
  cppc_acpi: print error message if CPPC is unsupported
  cpufreq: amd-pstate: get transition delay and latency value from ACPI tables
  cpufreq: amd-pstate: Bail out if min/max/nominal_freq is 0
  cpufreq: amd-pstate: Remove amd_get_{min,max,nominal,lowest_nonlinear}_freq()
  cpufreq: amd-pstate: Unify computation of {max,min,nominal,lowest_nonlinear}_freq
  cpufreq: amd-pstate: Document the units for freq variables in amd_cpudata
  cpufreq: amd-pstate: Document *_limit_* fields in struct amd_cpudata
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM4450 compatibles
  cpufreq: sun50i: fix error returns in dt_has_supported_hw()
  cpufreq: brcmstb-avs-cpufreq: ISO C90 forbids mixed declarations
  cpufreq: dt-platdev: eliminate uses of of_node_put()
  cpufreq: dt: eliminate uses of of_node_put()
  cpufreq: ti: Implement scope-based cleanup in ti_cpufreq_match_node()
  ...
2024-05-13 20:13:48 +02:00
Perry Yuan
5f8f9bc4d7 cppc_acpi: print error message if CPPC is unsupported
The amd-pstate driver can fail when _CPC objects are not supported by
the CPU. However, the current error message is ambiguous (see below) and
there is no clear way for attributing the failure of the amd-pstate
driver to the lack of CPPC support.

[    0.477523] amd_pstate: the _CPC object is not present in SBIOS or ACPI disabled

Fix this by adding an debug message to notify the user if the amd-pstate
driver failed to load due to CPPC not be supported by the CPU

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Tested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-26 19:35:38 +02:00
Vanshidhar Konda
f489c94802 ACPI: CPPC: Fix access width used for PCC registers
commit 2f4a4d63a1 ("ACPI: CPPC: Use access_width over bit_width for system
memory accesses") modified cpc_read()/cpc_write() to use access_width to
read CPC registers.

However, for PCC registers the access width field in the ACPI register
macro specifies the PCC subspace ID.  For non-zero PCC subspace ID it is
incorrectly treated as access width. This causes errors when reading
from PCC registers in the CPPC driver.

For PCC registers, base the size of read/write on the bit width field.
The debug message in cpc_read()/cpc_write() is updated to print relevant
information for the address space type used to read the register.

Fixes: 2f4a4d63a1 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Vanshidhar Konda <vanshikonda@os.amperecomputing.com>
Tested-by: Jarred White <jarredwhite@linux.microsoft.com>
Reviewed-by: Jarred White <jarredwhite@linux.microsoft.com>
Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Cc: 5.15+ <stable@vger.kernel.org> # 5.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-22 18:58:46 +02:00
Jarred White
05d92ee782 ACPI: CPPC: Fix bit_offset shift in MASK_VAL() macro
Commit 2f4a4d63a1 ("ACPI: CPPC: Use access_width over bit_width for
system memory accesses") neglected to properly wrap the bit_offset shift
when it comes to applying the mask. This may cause incorrect values to be
read and may cause the cpufreq module not be loaded.

[   11.059751] cpu_capacity: CPU0 missing/invalid highest performance.
[   11.066005] cpu_capacity: partial information: fallback to 1024 for all CPUs

Also, corrected the bitmask generation in GENMASK (extra bit being added).

Fixes: 2f4a4d63a1 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Jarred White <jarredwhite@linux.microsoft.com>
Cc: 5.15+ <stable@vger.kernel.org> # 5.15+
Reviewed-by: Vanshidhar Konda <vanshikonda@os.amperecomputing.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-22 18:42:16 +02:00