Commit Graph

4189 Commits

Author SHA1 Message Date
Linus Torvalds
e2d10998e4 Devicetree updates for v7.1:
DT core:
 - Cleanup of the reserved memory code to keep CMA specifics in CMA code
 
 - Add and convert several users to new of_machine_get_match() helper
 
 - Validate nul termination in string properties
 
 - Update dtc to upstream v1.7.2-69-g53373d135579
 
 - Limit matching reserved memory devices to /reserved-memory nodes
 
 - Fix some UAF in unittests
 
 - Remove Baikal SoC bus driver
 
 - Fix false DT_SPLIT_BINDING_PATCH checkpatch warning
 
 - Allow fw_devlink device-tree on x86
 
 - Fix kerneldoc return description for of_property_count_elems_of_size()
 
 DT bindings:
 - Add fsl,imx25-aips, fsl,imx25-tcq, qcom,eliza-pdc,
   qcom,eliza-spmi-pmic-arb, qcom,hawi-imem, qcom,milos-imem,
   qcom,hawi-pdc, and lg,sw49410 bindings
 
 - Convert arm,vexpress-scc to DT schema
 
 - Deprecate Qualcomm generic CPU compatibles. Add Apple M3 CPU cores.
 
 - Move some dual-link display panels to the dual-link schema
 
 - Drop mux controller node name constraints
 
 - Remove Baikal SoC bus bindings
 
 - Fix a false warning in the thermal trip node binding

Merge tag 'devicetree-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:
 "DT core:

   - Cleanup of the reserved memory code to keep CMA specifics in CMA
     code

   - Add and convert several users to new of_machine_get_match() helper

   - Validate nul termination in string properties

   - Update dtc to upstream v1.7.2-69-g53373d135579

   - Limit matching reserved memory devices to /reserved-memory nodes

   - Fix some UAF in unittests

   - Remove Baikal SoC bus driver

   - Fix false DT_SPLIT_BINDING_PATCH checkpatch warning

   - Allow fw_devlink device-tree on x86

   - Fix kerneldoc return description for of_property_count_elems_of_size()

  DT bindings:

   - Add fsl,imx25-aips, fsl,imx25-tcq, qcom,eliza-pdc,
     qcom,eliza-spmi-pmic-arb, qcom,hawi-imem, qcom,milos-imem,
     qcom,hawi-pdc, and lg,sw49410 bindings

   - Convert arm,vexpress-scc to DT schema

   - Deprecate Qualcomm generic CPU compatibles. Add Apple M3 CPU cores.

   - Move some dual-link display panels to the dual-link schema

   - Drop mux controller node name constraints

   - Remove Baikal SoC bus bindings

   - Fix a false warning in the thermal trip node binding"
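
As a rough illustration of the "validate nul termination in string properties" item: a flat DT property value is a byte blob with an explicit length, so it is only safe to treat as a C string when a NUL byte occurs inside that length. The helper name prop_is_valid_string() below is hypothetical and is not the kernel's function; it only sketches the kind of check involved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the "len"-byte property value at "val" can be used as a
 * C string: it must be non-empty and contain a NUL byte within "len" bytes.
 * Hypothetical helper for illustration only.
 */
static bool prop_is_valid_string(const void *val, size_t len)
{
	if (!val || len == 0)
		return false;

	/* memchr() never reads past "len", unlike strlen(). */
	return memchr(val, '\0', len) != NULL;
}
```

A caller would run this check before handing the value to strlen(), strcmp(), or a "%s" format.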

* tag 'devicetree-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (39 commits)
  dt-bindings: display: panel: panel-simple: Add lg,sw49410 compatible
  dt-bindings: display: ti,am65x-dss: Fix AM62L DSS reg and clock constraints
  dt-bindings: display: simple: Move Innolux G156HCE-L01 panel to dual-link
  dt-bindings: display: simple: Move AUO 21.5" FHD to dual-link
  dt-bindings: thermal: Fix false warning with 'phandle' in trips nodes
  of: unittest: fix use-after-free in testdrv_probe()
  of: unittest: fix use-after-free in of_unittest_changeset()
  dt-bindings: qcom,pdc: document the Hawi Power Domain Controller
  dt-bindings: ARM: arm,vexpress-scc: convert to DT schema
  drivers/of: fdt: validate flat DT string properties before string use
  drivers/of: fdt: validate stdout-path properties before parsing them
  dt-bindings: sram: Document qcom,hawi-imem compatible
  dt-bindings: sram: Allow multiple-word prefixes to sram subnode
  dt-bindings: sram: Document qcom,milos-imem
  scripts/dtc: Update to upstream version v1.7.2-69-g53373d135579
  of: property: Allow fw_devlink device-tree on x86
  dt-bindings: arm: cpus: Add Apple M3 CPU core compatibles
  dt-bindings: display: lt8912b: Drop redundant endpoint properties
  dt-bindings: opp-v2: Fix example 3 CPU reg value
  dt-bindings: connector: add pd-disable dependency
  ...
2026-04-17 14:09:02 -07:00
Linus Torvalds
cb30bf881c tracing updates for v7.1:
- Fix printf format warning for bprintf
 
   sunrpc uses a trace_printk() that triggers a printf warning during the
   compile. Move the __printf() attribute so that the warning goes away when
   debugging is not enabled.
 
 - Remove redundant check for EVENT_FILE_FL_FREED in event_filter_write()
 
   The FREED flag is checked in the call to event_file_file() and then
   checked again right afterward, which is unneeded.
 
 - Clean up event_file_file() and event_file_data() helpers
 
   These helper functions played a different role in the past, but now with
   eventfs, the READ_ONCE() isn't needed. Simplify the code a bit and also
   add a warning to event_file_data() if the file or its data is not present.
 
 - Remove updating file->private_data in tracing open
 
   All access to the file private data is handled by the helper functions,
   which do not use file->private_data. Stop updating it on open.
 
 - Show ENUM names in function arguments via BTF in function tracing
 
   When showing the function arguments when func-args option is set for
   function tracing, if one of the arguments is found to be an enum, show the
   name of the enum instead of its number.
 
 - Add new trace_call__##name() API for tracepoints
 
   Tracepoints are enabled via static_branch() blocks, where when not
   enabled, there's only a nop that is in the code where the execution will
   just skip over it. When tracing is enabled, the nop is converted to a
   direct jump to the tracepoint code. Sometimes more calculations are
   required to be performed to update the parameters of the tracepoint. In
   this case, trace_##name##_enabled() is called which is a static_branch()
   that gets enabled only when the tracepoint is enabled. This allows the
   extra calculations to also be skipped by the nop:
 
   if (trace_foo_enabled()) {
       x = bar();
       trace_foo(x);
   }
 
   Where the x=bar() is only performed when foo is enabled. The problem with
   this approach is that there are now two static_branch() calls. One for
   checking if the tracepoint is enabled, and then again to know if the
   tracepoint should be called. The second one is redundant.
 
   Introduce trace_call__foo() that will call the foo() tracepoint directly
   without doing a static_branch():
 
   if (trace_foo_enabled()) {
       x = bar();
       trace_call__foo(x);
   }
 
 - Update various locations to use the new trace_call__##name() API
 
 - Move snapshot code out of trace.c
 
   Cleaning up trace.c to not be a "dump all", move the snapshot code out of
   it and into a new trace_snapshot.c file.
 
 - Clean up some "%*.s" to "%*s"
 
 - Allow boot kernel command line options to be called multiple times
 
   Have options like:
 
     ftrace_filter=foo ftrace_filter=bar ftrace_filter=zoo
 
   Equal to:
 
     ftrace_filter=foo,bar,zoo
 
 - Fix ipi_raise event CPU field to be a CPU field
 
   The ipi_raise target_cpus field is defined as a __bitmask(). There is now a
   __cpumask() field definition. Update the field to use that.
 
 - Have hist_field_name() use a snprintf() and not a series of strcat()
 
   It's safer to use snprintf() than a series of strcat().
 
 - Fix tracepoint regfunc balancing
 
   A tracepoint can define a "reg" and "unreg" function that gets called
   before the tracepoint is enabled, and after it is disabled respectively.
   But on error, after the "reg" func is called and the tracepoint is not
   enabled, the "unreg" function is not called to tear down what the "reg"
   function performed.
 
 - Fix output that shows what histograms are enabled
 
   Event variables are displayed incorrectly in the histogram output.
 
   Instead of "sched.sched_wakeup.$var", it is showing
   "$sched.sched_wakeup.var" where the '$' is in the incorrect location.
 
 - Some other simple cleanups.
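
The guarded tracepoint pattern from the trace_call__##name() item can be modelled in userspace as a rough sketch. A plain bool stands in for the static key (the kernel patches a nop into a jump instead), and foo_key_enabled, key_checks, last_traced, and bar() are hypothetical names used only to make the behaviour observable. Only the trace_foo_enabled() / trace_foo() / trace_call__foo() naming comes from the text above.

```c
#include <stdbool.h>

/* Stand-in for the tracepoint's static key (assumption of this sketch). */
static bool foo_key_enabled;

/* Bookkeeping so the difference between the two patterns can be observed. */
static int key_checks;
static int last_traced = -1;

/* Models the static_branch() test and counts how often it runs. */
static bool trace_foo_enabled(void)
{
	key_checks++;
	return foo_key_enabled;
}

/* Models calling the tracepoint body directly, with no enabled check. */
static void trace_call__foo(int x)
{
	last_traced = x;
}

/* Models the regular tracepoint: it tests the key again before calling. */
static void trace_foo(int x)
{
	if (trace_foo_enabled())
		trace_call__foo(x);
}

/* Hypothetical extra calculation that is only needed for tracing. */
static int bar(void)
{
	return 42;
}

/* Old pattern: the key is tested twice when tracing is on. */
static void guarded_old(void)
{
	if (trace_foo_enabled()) {
		int x = bar();

		trace_foo(x);
	}
}

/* New pattern: the key is tested once. */
static void guarded_new(void)
{
	if (trace_foo_enabled()) {
		int x = bar();

		trace_call__foo(x);
	}
}
```

With the key enabled, guarded_old() performs two key checks per call and guarded_new() performs one; with it disabled, both perform a single check and skip bar() entirely.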

Merge tag 'trace-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Fix printf format warning for bprintf

   sunrpc uses a trace_printk() that triggers a printf warning during
   the compile. Move the __printf() attribute so that the warning goes
   away when debugging is not enabled

 - Remove redundant check for EVENT_FILE_FL_FREED in
   event_filter_write()

   The FREED flag is checked in the call to event_file_file() and then
   checked again right afterward, which is unneeded

 - Clean up event_file_file() and event_file_data() helpers

   These helper functions played a different role in the past, but now
   with eventfs, the READ_ONCE() isn't needed. Simplify the code a bit
   and also add a warning to event_file_data() if the file or its data
   is not present

 - Remove updating file->private_data in tracing open

   All access to the file private data is handled by the helper
   functions, which do not use file->private_data. Stop updating it on
   open

 - Show ENUM names in function arguments via BTF in function tracing

   When showing the function arguments when func-args option is set for
   function tracing, if one of the arguments is found to be an enum,
   show the name of the enum instead of its number

 - Add new trace_call__##name() API for tracepoints

   Tracepoints are enabled via static_branch() blocks, where when not
   enabled, there's only a nop that is in the code where the execution
   will just skip over it. When tracing is enabled, the nop is converted
   to a direct jump to the tracepoint code. Sometimes more calculations
   are required to be performed to update the parameters of the
   tracepoint. In this case, trace_##name##_enabled() is called which is
   a static_branch() that gets enabled only when the tracepoint is
   enabled. This allows the extra calculations to also be skipped by the
   nop:

	if (trace_foo_enabled()) {
		x = bar();
		trace_foo(x);
	}

   Where the x=bar() is only performed when foo is enabled. The problem
   with this approach is that there are now two static_branch() calls. One
   for checking if the tracepoint is enabled, and then again to know if
   the tracepoint should be called. The second one is redundant

   Introduce trace_call__foo() that will call the foo() tracepoint
   directly without doing a static_branch():

	if (trace_foo_enabled()) {
		x = bar();
		trace_call__foo(x);
	}

 - Update various locations to use the new trace_call__##name() API

 - Move snapshot code out of trace.c

   Cleaning up trace.c to not be a "dump all", move the snapshot code
   out of it and into a new trace_snapshot.c file

 - Clean up some "%*.s" to "%*s"

 - Allow boot kernel command line options to be called multiple times

   Have options like:

	ftrace_filter=foo ftrace_filter=bar ftrace_filter=zoo

   Equal to:

	ftrace_filter=foo,bar,zoo

 - Fix ipi_raise event CPU field to be a CPU field

   The ipi_raise target_cpus field is defined as a __bitmask(). There is
   now a __cpumask() field definition. Update the field to use that

 - Have hist_field_name() use a snprintf() and not a series of strcat()

   It's safer to use snprintf() than a series of strcat()

 - Fix tracepoint regfunc balancing

   A tracepoint can define a "reg" and "unreg" function that gets called
   before the tracepoint is enabled, and after it is disabled
   respectively. But on error, after the "reg" func is called and the
   tracepoint is not enabled, the "unreg" function is not called to tear
   down what the "reg" function performed

 - Fix output that shows what histograms are enabled

   Event variables are displayed incorrectly in the histogram output

   Instead of "sched.sched_wakeup.$var", it is showing
   "$sched.sched_wakeup.var" where the '$' is in the incorrect location

 - Some other simple cleanups
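
One way to picture the repeated boot parameter behaviour (ftrace_filter=foo ftrace_filter=bar becoming foo,bar) is a handler that appends each new value to a buffer with a comma separator instead of overwriting it. The sketch below also uses a bounded snprintf() rather than chained strcat() calls, in the spirit of the hist_field_name() item. The helper name append_boot_param() and the buffer handling are assumptions for illustration and are not the kernel's implementation.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Append "val" to the comma-separated list held in "buf" (capacity "size").
 * "buf" must already be NUL-terminated within "size" bytes.
 * Returns 0 on success, or -1 if the result would not fit, in which case
 * "buf" keeps its previous contents.
 * Hypothetical helper for illustration only.
 */
static int append_boot_param(char *buf, size_t size, const char *val)
{
	size_t used = strlen(buf);
	int n;

	/* Insert a comma only when the list already has an entry. */
	n = snprintf(buf + used, size - used, "%s%s", used ? "," : "", val);
	if (n < 0 || (size_t)n >= size - used) {
		buf[used] = '\0';	/* undo a truncated append */
		return -1;
	}
	return 0;
}
```

Calling the helper three times with "foo", "bar", and "zoo" on an empty, sufficiently large buffer leaves "foo,bar,zoo" in it, matching the equivalence described above.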

* tag 'trace-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (24 commits)
  selftests/ftrace: Add test case for fully-qualified variable references
  tracing: Fix fully-qualified variable reference printing in histograms
  tracepoint: balance regfunc() on func_add() failure in tracepoint_add_func()
  tracing: Rebuild full_name on each hist_field_name() call
  tracing: Report ipi_raise target CPUs as cpumask
  tracing: Remove duplicate latency_fsnotify() stub
  tracing: Preserve repeated trace_trigger boot parameters
  tracing: Append repeated boot-time tracing parameters
  tracing: Remove spurious default precision from show_event_trigger/filter formats
  cpufreq: Use trace_call__##name() at guarded tracepoint call sites
  tracing: Remove tracing_alloc_snapshot() when snapshot isn't defined
  tracing: Move snapshot code out of trace.c and into trace_snapshot.c
  mm: damon: Use trace_call__##name() at guarded tracepoint call sites
  btrfs: Use trace_call__##name() at guarded tracepoint call sites
  spi: Use trace_call__##name() at guarded tracepoint call sites
  i2c: Use trace_call__##name() at guarded tracepoint call sites
  kernel: Use trace_call__##name() at guarded tracepoint call sites
  tracepoint: Add trace_call__##name() API
  tracing: trace_mmap.h: fix a kernel-doc warning
  tracing: Pretty-print enum parameters in function arguments
  ...
2026-04-17 09:43:12 -07:00
Linus Torvalds
d7c8087a9c Power management updates for 7.1-rc1
- Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)
 
  - Update cpufreq-dt-platdev blocklist (Faruque Ansari)
 
  - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
    Rosen Penev)
 
  - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)
 
  - Add support for new features: CPPC performance priority, Dynamic EPP,
    Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy,
    Mario Limonciello)
 
  - Fix sysfs files being present when HW missing and broken/outdated
    documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)
 
  - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
    cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which
    leads to a scheduling-while-atomic bug (K Prateek Nayak)
 
  - Clean up dead code in Kconfig for cpufreq (Julian Braha)
 
  - Remove max_freq_req update for pre-existing cpufreq policy and add a
    boost_freq_req QoS request to save the boost constraint instead of
    overwriting the last scaling_max_freq constraint (Pierre Gondois)
 
  - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
    are allocated in one go along with the policy to simplify lifetime
    rules and avoid error handling issues (Viresh Kumar)
 
  - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
    scaling driver (Henry Tseng)
 
  - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
    of cpumask_weight() because the former is more efficient (Yury Norov)
 
  - Use sysfs_emit() in sysfs show functions for cpufreq governor
    attributes (Thorsten Blum)
 
  - Update intel_pstate to stop returning an error when "off" is written
    to its status sysfs attribute while the driver is already off (Fabio
    De Francesco)
 
  - Include current frequency in the debug message printed by
    __cpufreq_driver_target() (Pengjie Zhang)
 
  - Refine stopped tick handling in the menu cpuidle governor and
    rearrange stopped tick handling in the teo cpuidle governor (Rafael
    Wysocki)
 
  - Add Panther Lake C-states table to the intel_idle driver (Artem
    Bityutskiy)
 
  - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)
 
  - Simplify cpuidle_register_device() with guard() (Huisong Li)
 
  - Use performance level if available to distinguish between rates in
    OPP debugfs (Manivannan Sadhasivam)
 
  - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)
 
  - Return -ENODATA if the snapshot image is not loaded (Alberto Garcia)
 
  - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
    Biggers)
 
  - Clean up and rearrange the intel_rapl power capping driver to make
    the respective interface drivers (TPMI, MSR, and MMIO) hold their
    own settings and primitives and consolidate PL4 and PMU support
    flags into rapl_defaults (Kuppuswamy Sathyanarayanan)
 
  - Correct kernel-doc function parameter names in the power capping core
    code (Randy Dunlap)
 
  - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)
 
  - Use _visible attribute to replace create/remove_sysfs_files() in
    devfreq (Pengjie Zhang)
 
  - Add Tegra114 support to activity monitor device in tegra30-devfreq as
    a preparation to upcoming EMC controller support (Svyatoslav Ryhel)
 
  - Fix mistakes in cpupower man pages, add the boost and epp options to
    the cpupower-frequency-info man page, and add the perf-bias option to
    the cpupower-info man page (Roberto Ricci)
 
  - Remove unnecessary extern declarations from getopt.h in arguments
    parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
    cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)
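
The "simplify with guard()" item refers to scope-based locking, which in the kernel is built on the compiler's cleanup attribute. The following is a userspace sketch of that idea under stated assumptions: struct fake_lock, fake_guard(), and register_device() are hypothetical stand-ins, not the kernel's guard() macro or cpuidle_register_device(), and the code relies on the GCC/Clang __attribute__((cleanup)) extension.

```c
/* Hypothetical lock type that only records its state. */
struct fake_lock {
	int held;
	int unlocks;
};

static void fake_lock_acquire(struct fake_lock *lock)
{
	lock->held = 1;
}

/* Cleanup handler: receives a pointer to the guarded variable. */
static void fake_lock_release(struct fake_lock **lockp)
{
	(*lockp)->held = 0;
	(*lockp)->unlocks++;
}

/*
 * Take "lock" now and release it automatically when the enclosing scope
 * ends. This sketch allows only one guard per scope because the variable
 * name is fixed.
 */
#define fake_guard(lock)						\
	struct fake_lock *guard_ptr					\
		__attribute__((cleanup(fake_lock_release))) = (lock);	\
	fake_lock_acquire(guard_ptr)

/* Every return path drops the lock without an explicit unlock call. */
static int register_device(struct fake_lock *lock, int dev_ok)
{
	fake_guard(lock);

	if (!dev_ok)
		return -1;	/* lock released automatically here */

	return 0;		/* and here */
}
```

The benefit is that error paths no longer need a goto or a manual unlock before each return.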

Merge tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "Once again, cpufreq is the most active development area, mostly
  because of the new feature additions and documentation updates in the
  amd-pstate driver, but there are also changes in the cpufreq core
  related to boost support and other assorted updates elsewhere.

  Next up are power capping changes due to the major cleanup of the
  Intel RAPL driver.

  On the cpuidle front, a new C-states table for Intel Panther Lake is
  added to the intel_idle driver, the stopped tick handling in the menu
  and teo governors is updated, and there are a couple of cleanups.

  Apart from the above, support for Tegra114 is added to devfreq and
  there are assorted cleanups of that code, there are also two updates
  of the operating performance points (OPP) library, two minor updates
  related to hibernation, and cpupower utility man pages updates and
  cleanups.

  Specifics:

   - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)

   - Update cpufreq-dt-platdev blocklist (Faruque Ansari)

   - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
     Rosen Penev)

   - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)

   - Add support for new features: CPPC performance priority, Dynamic
     EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham
     Shenoy, Mario Limonciello)

   - Fix sysfs files being present when HW missing and broken/outdated
     documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy)

   - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
     cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate
     which leads to a scheduling-while-atomic bug (K Prateek Nayak)

   - Clean up dead code in Kconfig for cpufreq (Julian Braha)

   - Remove max_freq_req update for pre-existing cpufreq policy and add
     a boost_freq_req QoS request to save the boost constraint instead
     of overwriting the last scaling_max_freq constraint (Pierre
     Gondois)

   - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
     are allocated in one go along with the policy to simplify lifetime
     rules and avoid error handling issues (Viresh Kumar)

   - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
     scaling driver (Henry Tseng)

   - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
     of cpumask_weight() because the former is more efficient (Yury
     Norov)

   - Use sysfs_emit() in sysfs show functions for cpufreq governor
     attributes (Thorsten Blum)

   - Update intel_pstate to stop returning an error when "off" is
     written to its status sysfs attribute while the driver is already
     off (Fabio De Francesco)

   - Include current frequency in the debug message printed by
     __cpufreq_driver_target() (Pengjie Zhang)

   - Refine stopped tick handling in the menu cpuidle governor and
     rearrange stopped tick handling in the teo cpuidle governor (Rafael
     Wysocki)

   - Add Panther Lake C-states table to the intel_idle driver (Artem
     Bityutskiy)

   - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)

   - Simplify cpuidle_register_device() with guard() (Huisong Li)

   - Use performance level if available to distinguish between rates in
     OPP debugfs (Manivannan Sadhasivam)

   - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)

   - Return -ENODATA if the snapshot image is not loaded (Alberto
     Garcia)

   - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
     Biggers)

   - Clean up and rearrange the intel_rapl power capping driver to make
     the respective interface drivers (TPMI, MSR, and MMIO) hold their
     own settings and primitives and consolidate PL4 and PMU support
     flags into rapl_defaults (Kuppuswamy Sathyanarayanan)

   - Correct kernel-doc function parameter names in the power capping
     core code (Randy Dunlap)

   - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)

   - Use _visible attribute to replace create/remove_sysfs_files() in
     devfreq (Pengjie Zhang)

   - Add Tegra114 support to activity monitor device in tegra30-devfreq
     as a preparation to upcoming EMC controller support (Svyatoslav
     Ryhel)

   - Fix mistakes in cpupower man pages, add the boost and epp options
     to the cpupower-frequency-info man page, and add the perf-bias
     option to the cpupower-info man page (Roberto Ricci)

   - Remove unnecessary extern declarations from getopt.h in arguments
     parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
     cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)"
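
To see why an "nth set bit" lookup can be cheaper than a full weight count when the only question is "is more than one CPU set?", here is a small sketch on a 64-bit mask. mask_weight() and mask_nth() are hypothetical userspace analogues, written on the assumption that cpumask_nth() counts from zero and returns an out-of-range value when no such bit exists; they are not the kernel's implementations.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_BITS 64	/* hypothetical mask width for this sketch */

/* Count every set bit: always scans the whole mask. */
static unsigned int mask_weight(uint64_t mask)
{
	unsigned int count = 0;

	for (unsigned int bit = 0; bit < NR_BITS; bit++)
		if (mask & (UINT64_C(1) << bit))
			count++;
	return count;
}

/* Index of the n-th set bit (n counts from 0), or NR_BITS if none. */
static unsigned int mask_nth(unsigned int n, uint64_t mask)
{
	for (unsigned int bit = 0; bit < NR_BITS; bit++) {
		if (!(mask & (UINT64_C(1) << bit)))
			continue;
		if (n-- == 0)
			return bit;	/* stop as soon as the answer is known */
	}
	return NR_BITS;
}

/* "Shared" means more than one CPU in the mask. */
static bool is_shared_by_weight(uint64_t cpus)
{
	return mask_weight(cpus) > 1;
}

/* Same answer, but the scan can stop at the second set bit. */
static bool is_shared_by_nth(uint64_t cpus)
{
	return mask_nth(1, cpus) < NR_BITS;
}
```

Both predicates agree for every mask; the second one simply does not need to look past the second set bit.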

* tag 'pm-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
  cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
  cpupower: remove extern declarations in cmd functions
  cpuidle: Simplify cpuidle_register_device() with guard()
  PM / devfreq: tegra30-devfreq: add support for Tegra114
  PM / devfreq: use _visible attribute to replace create/remove_sysfs_files()
  PM / devfreq: Remove unneeded casting for HZ_PER_KHZ
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  ...
2026-04-13 19:47:52 -07:00
Rafael J. Wysocki
bfb0315a2d Merge branches 'acpi-processor' and 'acpi-cppc'
Merge ACPI processor driver updates and ACPI CPPC library updates for
7.1-rc1:

 - Address multiple assorted issues and clean up the code in the ACPI
   processor idle driver (Huisong Li)

 - Replace strlcat() in the ACPI processor idle driver with a better
   alternative (Andy Shevchenko)

 - Rearrange and clean up acpi_processor_errata_piix4() (Rafael Wysocki)

 - Move reference performance to capabilities and fix an uninitialized
   variable in the ACPI CPPC library (Pengjie Zhang)

 - Add support for the Performance Limited Register to the ACPI CPPC
   library (Sumit Gupta)

 - Add cppc_get_perf() API to read performance controls, extend
   cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC
   library warn on missing mandatory DESIRED_PERF register (Sumit Gupta)

 - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in target
   callbacks to allow it to control performance bounds via standard
   scaling_min_freq and scaling_max_freq sysfs attributes and add sysfs
   documentation for the Performance Limited Register to it (Sumit Gupta)

* acpi-processor:
  ACPI: processor: idle: Reset cpuidle on C-state list changes
  cpuidle: Extract and export no-lock variants of cpuidle_unregister_device()
  ACPI: processor: idle: Fix NULL pointer dereference in hotplug path
  ACPI: processor: idle: Reset power_setup_done flag on initialization failure
  ACPI: processor: Rearrange and clean up acpi_processor_errata_piix4()
  ACPI: processor: idle: Replace strlcat() with better alternative
  ACPI: processor: idle: Remove redundant static variable and rename cstate check function
  ACPI: processor: idle: Move max_cstate update out of the loop
  ACPI: processor: idle: Remove redundant cstate check in acpi_processor_power_init
  ACPI: processor: idle: Add missing bounds check in flatten_lpi_states()

* acpi-cppc:
  ACPI: CPPC: Check cpc_read() return values consistently
  ACPI: CPPC: Fix uninitialized ref variable in cppc_get_perf_caps()
  ACPI: CPPC: Move reference performance to capabilities
  cpufreq: CPPC: Add sysfs documentation for perf_limited
  ACPI: CPPC: add APIs and sysfs interface for perf_limited
  cpufreq: cppc: Update MIN_PERF/MAX_PERF in target callbacks
  cpufreq: CPPC: Update cached perf_ctrls on sysfs write
  ACPI: CPPC: Extend cppc_set_epp_perf() for FFH/SystemMemory
  ACPI: CPPC: Warn on missing mandatory DESIRED_PERF register
  ACPI: CPPC: Add cppc_get_perf() API to read performance controls
2026-04-09 21:26:06 +02:00
Mario Limonciello
6793439775 cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
The dynamic EPP feature uses power_supply_reg_notifier() and
power_supply_unreg_notifier() but doesn't declare a dependency on
POWER_SUPPLY, causing linker errors when POWER_SUPPLY is not enabled.

Add POWER_SUPPLY to the selects.

Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Fixes: e30ca6dd53 ("cpufreq/amd-pstate: Add dynamic energy performance preference")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604040742.ySEdkuAa-lkp@intel.com/
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20260407194949.310114-1-mario.limonciello@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-07 21:57:04 +02:00
Rafael J. Wysocki
d7d36c3a09 CPUFreq arm updates for 7.1
- Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa).
 
 - Update cpufreq-dt-platdev blocklist (Faruque Ansari).
 
 - Minor updates to driver and dt-bindings for Tegra (Thierry Reding and Rosen Penev).
 
 - Add MAINTAINERS entry for CPPC driver (Viresh Kumar).

Merge tag 'cpufreq-arm-updates-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm

Pull CPUFreq Arm updates for 7.1 from Viresh Kumar:

"- Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa).

 - Update cpufreq-dt-platdev blocklist (Faruque Ansari).

 - Minor updates to driver and dt-bindings for Tegra (Thierry Reding and
   Rosen Penev).

 - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)."

* tag 'cpufreq-arm-updates-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
  cpufreq: tegra194: remove COMPILE_TEST
  cpufreq: Add QCS8300 to cpufreq-dt-platdev blocklist
  cpufreq: Add MAINTAINERS entry for CPPC driver
  cpufreq: tegra194: Rename Tegra239 to Tegra238
  dt-bindings: arm: nvidia: Document the Tegra238 CCPLEX cluster
  dt-bindings: cpufreq: qcom-hw: document Eliza cpufreq hardware
2026-04-06 16:25:16 +02:00
Rafael J. Wysocki
5cdfedf68e amd-pstate new content for 7.1 (2026-04-02)
Add support for new features:
  * CPPC performance priority
  * Dynamic EPP
  * Raw EPP
  * New unit tests for new features
 Fixes for:
  * PREEMPT_RT
  * sysfs files being present when HW missing
  * Broken/outdated documentation

Merge tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Pull amd-pstate new content for 7.1 (2026-04-02) from Mario Limonciello:

"Add support for new features:
  * CPPC performance priority
  * Dynamic EPP
  * Raw EPP
  * New unit tests for new features
 Fixes for:
  * PREEMPT_RT
  * sysfs files being present when HW missing
  * Broken/outdated documentation"

* tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: (22 commits)
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  amd-pstate-ut: Add module parameter to select testcases
  amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
  amd-pstate: Add sysfs support for floor_freq and floor_count
  amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
  x86/cpufeatures: Add AMD CPPC Performance Priority feature.
  amd-pstate: Make certain freq_attrs conditionally visible
  ...
2026-04-04 20:55:56 +02:00
Rafael J. Wysocki
35ed8fa05f Merge back earlier cpufreq material for 7.1 2026-04-04 14:58:58 +02:00
K Prateek Nayak
c03791085a cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
cpufreq_cpu_get() can sleep on PREEMPT_RT in presence of concurrent
writer(s), however amd-pstate depends on fetching the cpudata via the
policy's driver data which necessitates grabbing the reference.

Since the schedutil governor can call "cpufreq_driver->adjust_perf()"
during sched_tick/enqueue/dequeue with rq_lock held and IRQs disabled,
fetching the policy object using the cpufreq_cpu_get() helper in the
scheduler fast-path leads to "BUG: scheduling while atomic" on
PREEMPT_RT [1].

Pass the cached cpufreq policy object in sg_policy to adjust_perf()
instead of just the CPU. The CPU can be inferred using "policy->cpu".

The lifetime of cpufreq_policy object outlasts that of the governor and
the cpufreq driver (allocated when the CPU is onlined and only reclaimed
when the CPU is offlined / the CPU device is removed) which makes it
safe to be referenced throughout the governor's lifetime.

Closes: https://lore.kernel.org/all/20250731092316.3191-1-spasswolf@web.de/ [1]

Fixes: 1d215f0319 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Reported-by: Bert Karwatzki <spasswolf@web.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Gary Guo <gary@garyguo.net> # Rust
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260316081849.19368-3-kprateek.nayak@amd.com
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:30:24 -05:00
K Prateek Nayak
86d71f1d76 cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
All callers of amd_pstate_update() already have a reference to the
cpufreq_policy object.

Pass the entire policy object and grab the cpudata using
"policy->driver_data" instead of passing the cpudata and unnecessarily
grabbing another read-side reference to the cpufreq policy object when
it is already available in the caller.

No functional changes intended.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20260316081849.19368-2-kprateek.nayak@amd.com
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:30:24 -05:00
Mario Limonciello (AMD)
7e173bc310 cpufreq/amd-pstate-ut: Add a unit test for raw EPP
Ensure that all supported raw EPP values work properly.

Export the driver helpers used by the test module so the test can drive
raw EPP writes and temporarily disable dynamic EPP while it runs.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:30:19 -05:00
Mario Limonciello (AMD)
6927f21852 cpufreq/amd-pstate: Add support for raw EPP writes
The energy performance preference field of the CPPC request MSR
supports values from 0 to 255, but the strings only offer 4 values.

The other values are useful for tuning the performance of some
workloads.

Add support for writing the raw energy performance preference value
to the sysfs file.  If the last value written was an integer then
an integer will be returned.  If the last value written was a string
then a string will be returned.
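
The read-back rule above can be modelled as a small sketch. This is not the driver code: the class name, the method names, the initial state, and the numeric values behind the four strings are assumptions made for illustration.

```python
# Sketch only: models the store/show behaviour described in the commit message.
# The string-to-value table below is an assumption, not taken from the source.
EPP_STRINGS = {
    "performance": 0x00,
    "balance_performance": 0x80,
    "balance_power": 0xBF,
    "power": 0xFF,
}


class EppAttr:
    """Hypothetical model of the energy_performance_preference sysfs file."""

    def __init__(self):
        self.value = EPP_STRINGS["balance_performance"]
        self.raw = False  # True when the last write was an integer

    def store(self, text):
        text = text.strip()
        if text in EPP_STRINGS:
            self.value, self.raw = EPP_STRINGS[text], False
            return
        value = int(text, 0)  # raises ValueError for unknown strings
        if not 0 <= value <= 255:
            raise ValueError("EPP must be in 0..255")
        self.value, self.raw = value, True

    def show(self):
        # Integers are returned only if the last write was an integer.
        if self.raw:
            return str(self.value)
        for name, value in EPP_STRINGS.items():
            if value == self.value:
                return name
        return str(self.value)
```

Writing "128" and then reading returns "128" in this model even if that value also corresponds to a named string, because the form of the last write decides the form of the read.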

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:29:25 -05:00
Mario Limonciello (AMD)
798c47593c cpufreq/amd-pstate: Add support for platform profile class
The platform profile core allows multiple drivers and devices to
register platform profile support.

When the legacy platform profile interface is used all drivers will
adjust the platform profile as well.

Add support for registering every CPU with the platform profile handler
when dynamic EPP is enabled.

The end result will be that changing the platform profile will modify
EPP accordingly.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:29:15 -05:00
Mario Limonciello (AMD)
da8afb1c66 cpufreq/amd-pstate: add kernel command line to override dynamic epp
Add `amd_dynamic_epp=enable` and `amd_dynamic_epp=disable` to override
the kernel configuration option `CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP`
locally.
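
A minimal sketch of the override, assuming the command line is scanned left to right and the last matching argument wins (that ordering rule and the function name are assumptions, not taken from the commit):

```python
# Sketch only: config_default stands in for CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP.
def dynamic_epp_enabled(config_default, cmdline):
    enabled = config_default
    for arg in cmdline.split():
        if arg == "amd_dynamic_epp=enable":
            enabled = True
        elif arg == "amd_dynamic_epp=disable":
            enabled = False
    return enabled
```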

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:29:11 -05:00
Mario Limonciello (AMD)
e30ca6dd53 cpufreq/amd-pstate: Add dynamic energy performance preference
Dynamic energy performance preference changes the EPP profile based on
whether the machine is running on AC or DC power.

A notification chain from the power supply core is used to adjust EPP
values on plug in or plug out events.

When enabled, the driver exposes a sysfs toggle for dynamic EPP and
blocks manual writes to energy_performance_preference while it "owns"
the EPP updates.

For non-server systems:
    * the default EPP for AC mode is `performance`.
    * the default EPP for DC mode is `balance_performance`.

For server systems dynamic EPP is mostly a no-op.
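
The non-server behaviour described above can be sketched as follows. The class, the method names, and the exception used to reject manual writes are hypothetical; only the AC/DC defaults come from the commit message.

```python
# Sketch only: defaults for non-server systems as stated in the text above.
def dynamic_epp_default(on_ac_power):
    return "performance" if on_ac_power else "balance_performance"


class DynamicEpp:
    """Hypothetical model of a CPU whose EPP may be owned by dynamic EPP."""

    def __init__(self, enabled, on_ac_power):
        self.enabled = enabled
        self.epp = dynamic_epp_default(on_ac_power)

    def power_supply_event(self, on_ac_power):
        # Stands in for the power supply notifier on plug-in / plug-out.
        if self.enabled:
            self.epp = dynamic_epp_default(on_ac_power)

    def manual_write(self, epp):
        # Manual writes are blocked while dynamic EPP owns the updates.
        if self.enabled:
            raise PermissionError("dynamic EPP is enabled")
        self.epp = epp
```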

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:29:02 -05:00
Mario Limonciello (AMD)
8cdc494013 cpufreq/amd-pstate: Cache the max frequency in cpudata
The maximum frequency is fixed and never changes, so recomputing it
from the perf values on every use is unnecessary.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://lore.kernel.org/r/20260326193620.649441-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:54 -05:00
Gautham R. Shenoy
3b90e5a417 amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
amd-pstate driver has per-attribute visibility functions to
dynamically control which sysfs freq_attrs are exposed based on the
platform capabilities and the current amd_pstate mode. However, there
is no test coverage to validate that the driver's live attribute list
matches the expected visibility for each mode.

Add amd_pstate_ut_check_freq_attrs() to the amd-pstate unit test
module. For each enabled mode (passive, active, guided), the test
independently derives the expected visibility of each attribute:
  - Core attributes (max_freq, lowest_nonlinear_freq, highest_perf)
    are always expected.
  - Prefcore attributes (prefcore_ranking, hw_prefcore) are expected
    only when cpudata->hw_prefcore indicates platform support.
  - EPP attributes (energy_performance_preference,
    energy_performance_available_preferences) are expected only in
    active mode.
  - Floor frequency attributes (floor_freq, floor_count) are expected
    only when X86_FEATURE_CPPC_PERF_PRIO is present.

Compare these independent expectations against the live driver's attr
array, catching bugs such as attributes leaking into wrong modes or
visibility functions checking incorrect conditions.
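
The expectation rules listed above can be written down as a short sketch. The function names and the short attribute names are illustrative; the real test works on the driver's attr array rather than on strings.

```python
# Sketch only: derives the expected attribute set from the rules in the text.
def expected_attrs(mode, hw_prefcore, perf_prio):
    attrs = ["max_freq", "lowest_nonlinear_freq", "highest_perf"]  # always expected
    if hw_prefcore:
        attrs += ["prefcore_ranking", "hw_prefcore"]
    if mode == "active":
        attrs += ["energy_performance_preference",
                  "energy_performance_available_preferences"]
    if perf_prio:
        attrs += ["floor_freq", "floor_count"]
    return attrs


def check_freq_attrs(live_attrs, mode, hw_prefcore, perf_prio):
    """Return (missing, leaked) attribute names relative to the expectation."""
    expected = set(expected_attrs(mode, hw_prefcore, perf_prio))
    live = set(live_attrs)
    return sorted(expected - live), sorted(live - expected)
```

An EPP attribute that shows up in passive mode would be reported in the second list, which is the "leaking into wrong modes" case the commit mentions.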

Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:40 -05:00
Gautham R. Shenoy
c6a2b750de amd-pstate-ut: Add module parameter to select testcases
Currently when amd-pstate-ut test module is loaded, it runs all the
tests from amd_pstate_ut_cases[] array.

Add a module parameter named "test_list" that accepts a
comma-delimited list of test names, allowing users to run a
selected subset of tests. When the parameter is omitted or empty, all
tests are run as before.
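
A rough sketch of the selection logic, assuming unknown names are simply ignored (the commit does not say how they are handled, and the test names in the usage below are hypothetical):

```python
# Sketch only: filters a list of test names by a comma-delimited parameter.
def select_tests(all_tests, test_list=""):
    wanted = [name.strip() for name in test_list.split(",") if name.strip()]
    if not wanted:
        return list(all_tests)  # omitted or empty: run everything
    return [name for name in all_tests if name in wanted]
```

For example, `select_tests(["a", "b", "c"], "c,a")` keeps `a` and `c` in their original order.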

Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:35 -05:00
Gautham R. Shenoy
30c63f7234 amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
Introduce a new tracepoint trace_amd_pstate_cppc_req2() to track
updates to MSR_AMD_CPPC_REQ2.

Invoke this while changing the Floor Perf.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:31 -05:00
Gautham R. Shenoy
b9f103d096 amd-pstate: Add sysfs support for floor_freq and floor_count
When Floor Performance feature is supported by the platform, expose
two sysfs files:

   * amd_pstate_floor_freq to allow userspace to request the floor
     frequency for each CPU.

   * amd_pstate_floor_count which advertises the number of distinct
     levels of floor frequencies supported on this platform.

Reset the floor_perf to bios_floor_perf in the suspend, offline, and
exit paths, and restore the value to the cached user-request
floor_freq on the resume and online paths mirroring how bios_min_perf
is handled for MSR_AMD_CPPC_REQ.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:28 -05:00
Gautham R. Shenoy
97838281f5 amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
Some future AMD processors have a feature named "CPPC Performance
Priority" which lets userspace specify different floor performance
levels for different CPUs. The platform firmware takes these different
floor performance levels into consideration while throttling the CPUs
under power/thermal constraints. The presence of this feature is
indicated by bit 16 of the EDX register for CPUID leaf
0x80000007. More details can be found in the AMD publication titled
"AMD64 Collaborative Processor Performance Control (CPPC) Performance
Priority", Revision 1.10.

The number of distinct floor performance levels supported on the
platform will be advertised through the bits 32:39 of the
MSR_AMD_CPPC_CAP1. Bits 0:7 of a new MSR MSR_AMD_CPPC_REQ2
(0xc00102b5) will be used to specify the desired floor performance
level for that CPU.

Add support for the aforementioned MSR_AMD_CPPC_REQ2, and macros for
parsing and updating the relevant bits from MSR_AMD_CPPC_CAP1 and
MSR_AMD_CPPC_REQ2.

On boot if the default value of the MSR_AMD_CPPC_REQ2[7:0] (Floor
Perf) is lower than CPPC.lowest_perf, and thus invalid, initialize it
to MSR_AMD_CPPC_CAP1.nominal_perf which is a sane default value.
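
The bit layout and the boot-time fallback described above can be sketched as follows. The helper names are hypothetical and the code operates on plain integers; it does not read or write any MSR.

```python
# Sketch only: field positions and the MSR address are taken from the text above.
MSR_AMD_CPPC_REQ2 = 0xC00102B5


def floor_count(cap1):
    """Number of floor performance levels: bits 32:39 of MSR_AMD_CPPC_CAP1."""
    return (cap1 >> 32) & 0xFF


def floor_perf(req2):
    """Floor Perf: bits 0:7 of MSR_AMD_CPPC_REQ2."""
    return req2 & 0xFF


def set_floor_perf(req2, perf):
    """Return req2 with bits 0:7 replaced by perf."""
    if not 0 <= perf <= 0xFF:
        raise ValueError("floor perf must fit in 8 bits")
    return (req2 & ~0xFF) | perf


def init_floor_perf(boot_floor_perf, lowest_perf, nominal_perf):
    """Fall back to nominal_perf when the boot value is below lowest_perf."""
    if boot_floor_perf < lowest_perf:
        return nominal_perf
    return boot_floor_perf
```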

Save the boot-time floor_perf during amd_pstate_init_floor_perf(). In
a subsequent patch it will be restored in the suspend, offline, and
exit paths, mirroring how bios_min_perf is handled for
MSR_AMD_CPPC_REQ.

Link: https://docs.amd.com/v/u/en-US/69206_1.10_AMD64_CPPC_PUB
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:24 -05:00
Gautham R. Shenoy
e67a5b6541 amd-pstate: Make certain freq_attrs conditionally visible
Certain amd_pstate freq_attrs such as amd_pstate_hw_prefcore and
amd_pstate_prefcore_ranking are enabled even when preferred core is
not supported on the platform.

Similarly, freq_attrs that are common to the amd-pstate and the
amd-pstate-epp drivers (e.g. amd_pstate_max_freq,
amd_pstate_lowest_nonlinear_freq, etc.) are duplicated in two
different freq_attr structs.

Unify all the attributes in a single place and associate each of them
with a visibility function that determines whether the attribute
should be visible based on the underlying platform support and the
current amd_pstate mode.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:16 -05:00
Gautham R. Shenoy
fcc25a291f amd-pstate: Update cppc_req_cached in fast_switch case
The function msr_update_perf() does not cache the new value that is
written to MSR_AMD_CPPC_REQ into the variable cpudata->cppc_req_cached
when the update is happening from the fast path.

Fix that by caching the value every time the MSR_AMD_CPPC_REQ gets
updated.

This issue was discovered by Claude Opus 4.6 with the aid of Chris
Mason's AI review-prompts
(https://github.com/masoncl/review-prompts/tree/main/kernel).

Assisted-by: Claude:claude-opus-4.6 review-prompts/linux
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Fixes: fff3957969 ("cpufreq/amd-pstate: Always write EPP value when updating perf")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:12 -05:00
Gautham R. Shenoy
beda3b3635 amd-pstate: Fix memory leak in amd_pstate_epp_cpu_init()
On failure to set the epp, the function amd_pstate_epp_cpu_init()
returns with an error code without freeing the cpudata object that was
allocated at the beginning of the function.

Ensure that the cpudata object is freed before returning from the
function.

This memory leak was discovered by Claude Opus 4.6 with the aid of
Chris Mason's AI review-prompts
(https://github.com/masoncl/review-prompts/tree/main/kernel).

Assisted-by: Claude:claude-opus-4.6 review-prompts/linux
Fixes: f9a378ff64 ("cpufreq/amd-pstate: Set different default EPP policy for Epyc and Ryzen")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2026-04-02 11:28:05 -05:00
Guangshuo Li
6dcf9d0064 cpufreq: governor: fix double free in cpufreq_dbs_governor_init() error path
When kobject_init_and_add() fails, cpufreq_dbs_governor_init() calls
kobject_put(&dbs_data->attr_set.kobj).

The kobject release callback cpufreq_dbs_data_release() calls
gov->exit(dbs_data) and kfree(dbs_data), but the current error path
then calls gov->exit(dbs_data) and kfree(dbs_data) again, causing a
double free.

Keep the direct kfree(dbs_data) for the gov->init() failure path, but
after kobject_init_and_add() has been called, let kobject_put() handle
the cleanup through cpufreq_dbs_data_release().

Fixes: 4ebe36c94a ("cpufreq: Fix kobject memleak")
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: All applicable <stable@vger.kernel.org>
Link: https://patch.msgid.link/20260401024535.1395801-1-lgs201920130244@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-01 16:08:15 +02:00
Julian Braha
2e00c2dcc5 cpufreq: clean up dead code in Kconfig
There is already an 'if CPU_FREQ' condition wrapping these config
options, making the 'depends on' statement for each a duplicate
dependency (dead code).

Leave the outer 'if CPU_FREQ...endif' and remove the individual
'depends on' statement from each option.

This dead code was found by kconfirm, a static analysis tool for
Kconfig.

Signed-off-by: Julian Braha <julianbraha@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20260331074242.39986-1-julianbraha@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-01 15:58:01 +02:00
Viresh Kumar
9266b4da05 cpufreq: Allocate QoS freq_req objects with policy
A recent change exposed a bug in the error path: if
freq_qos_add_request(boost_freq_req) fails, min_freq_req may remain a
valid pointer even though it was never successfully added. During policy
teardown, this leads to an unconditional call to
freq_qos_remove_request(), triggering a WARN.

The current design allocates all three freq_req objects together, making
the lifetime rules unclear and error handling fragile.

Simplify this by allocating the QoS freq_req objects at policy
allocation time. The policy itself is dynamically allocated, and two of
the three requests are always needed anyway. This ensures consistent
lifetime management and eliminates the inconsistent state in failure
paths.

Reported-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Fixes: 6e39ba4e5a ("cpufreq: Add boost_freq_req QoS request")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Tested-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Link: https://patch.msgid.link/a293f29d841b86c51f34699c6e717e01858d8ada.1774933424.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-01 15:53:47 +02:00
Pierre Gondois
6e39ba4e5a cpufreq: Add boost_freq_req QoS request
The Power Management Quality of Service (PM QoS) framework allows
constraints from multiple entities to be aggregated. It is currently
used to manage the min/max frequency of a given policy.

Frequency constraints can come for instance from:
 - Thermal framework: acpi_thermal_cpufreq_init()
 - Firmware: _PPC objects: acpi_processor_ppc_init()
 - User: by setting policyX/scaling_[min|max]_freq
The minimum of the max frequency constraints is used to compute
the resulting maximum allowed frequency.

When enabling boost frequencies, the same frequency request object
(policy->max_freq_req) that handles requests from users is used.
As a result, when setting:
 - scaling_max_freq
 - boost
The last sysfs file used overwrites the request from the other
sysfs file.

To avoid this, create a per-policy boost_freq_req to save the boost
constraints instead of overwriting the last scaling_max_freq
constraint.
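
The difference between a shared and a separate request can be illustrated with a toy model of the aggregation rule (the minimum over all max-frequency requests). It covers only the last step of a sequence in which the user limits the maximum to 800000 kHz and boost is then disabled on a policy whose non-boost maximum is 1000000 kHz. The dictionaries are illustrative, not kernel data structures.

```python
# Sketch only: effective maximum = minimum over all active max requests.
def resolve_max(requests):
    return min(requests.values())


# Shared request: disabling boost overwrites the user's 800000 kHz value
# with the non-boost maximum.
shared = {"max_freq_req": 800000}
shared["max_freq_req"] = 1000000

# Separate requests: the boost constraint is stored on its own, so the
# user's value survives and still bounds the result.
split = {"max_freq_req": 800000, "boost_freq_req": 1000000}

print(resolve_max(shared), resolve_max(split))
```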

policy_set_boost() calls the cpufreq set_boost callback.
Update the newly added boost_freq_req request from there:
 - whenever boost is toggled
 - to cover all possible paths

In the existing .set_boost() callbacks:
 - Don't update policy->max as this is done through the qos notifier
   cpufreq_notifier_max() which calls cpufreq_set_policy().
 - Remove freq_qos_update_request() calls as the qos request is now
   done in policy_set_boost() and updates the new boost_freq_req

$ ## Init state
scaling_max_freq:1000000
cpuinfo_max_freq:1000000

$ echo 700000 > scaling_max_freq
scaling_max_freq:700000
cpuinfo_max_freq:1000000

$ echo 1 > ../boost
scaling_max_freq:1200000
cpuinfo_max_freq:1200000

$ echo 800000 > scaling_max_freq
scaling_max_freq:800000
cpuinfo_max_freq:1200000

$ ## Final step:
$ ## Without the patches:
$ echo 0 > ../boost
scaling_max_freq:1000000
cpuinfo_max_freq:1000000

$ ## With the patches:
$ echo 0 > ../boost
scaling_max_freq:800000
cpuinfo_max_freq:1000000

Note:
cpufreq_frequency_table_cpuinfo() updates policy->min
and max from:
A.
cpufreq_boost_set_sw()
\-cpufreq_frequency_table_cpuinfo()
B.
cpufreq_policy_online()
\-cpufreq_table_validate_and_sort()
  \-cpufreq_frequency_table_cpuinfo()
Keep these updates as some drivers expect policy->min and
max to be set through B.

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260326204404.1401849-3-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-30 21:56:52 +02:00
Pierre Gondois
04aa9d0726 cpufreq: Remove max_freq_req update for pre-existing policy
The policy->max_freq_req QoS constraint represents the maximal allowed
frequency that can be requested. It is set by:
 - writing to policyX/scaling_max_freq sysfs file
 - toggling the cpufreq/boost sysfs file

Upon calling freq_qos_update_request(), a successful update
of the max_freq_req value triggers cpufreq_notifier_max(),
followed by cpufreq_set_policy() which updates the requested
frequency for the policy.
If the new max_freq_req value is not different from the
original value, no frequency update is triggered.

In a specific sequence of toggling:
 - cpufreq/boost sysfs file
 - CPU hot-plugging
a CPU could end up with boost enabled but running at the
maximal non-boost frequency, cpufreq_notifier_max() not being
triggered. The following fixed that:
commit 1608f02305 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")

The following:
commit dd016f379e ("cpufreq: Introduce a more generic way to
set default per-policy boost flag")
also fixed the issue by correctly setting the max_freq_req
constraint of a policy that is re-activated. This makes the
first fix unnecessary.

As the original issue is fixed by another method,
this patch reverts:
commit 1608f02305 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260326204404.1401849-2-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-30 21:56:52 +02:00
Vineeth Pillai (Google)
ad8363ebf8 cpufreq: Use trace_call__##name() at guarded tracepoint call sites
Replace trace_foo() with the new trace_call__foo() at sites already
guarded by trace_foo_enabled(), avoiding a redundant
static_branch_unlikely() re-evaluation inside the tracepoint.
trace_call__foo() calls the tracepoint callbacks directly without
utilizing the static branch again.

Cc: Huang Rui <ray.huang@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Perry Yuan <perry.yuan@amd.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://patch.msgid.link/20260323160052.17528-7-vineeth@bitbyteword.org
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org>
Assisted-by: Claude:claude-sonnet-4-6
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> # cpufreq core
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2026-03-28 13:37:06 -04:00
Rafael J. Wysocki
65dea11925 Merge back earlier cpufreq material for 7.1 2026-03-27 11:57:31 +01:00
Henry Tseng
16fb8d8a0e cpufreq: acpi-cpufreq: use DMI max speed when CPPC is unavailable
On AMD Ryzen Embedded V1780B (Family 17h, Zen 1), the BIOS does not
provide ACPI _CPC objects and the CPU does not support MSR-based CPPC
(X86_FEATURE_CPPC).  The _PSS table only lists nominal P-states
(P0 = 3350 MHz), so when get_max_boost_ratio() fails at
cppc_get_perf_caps(), cpuinfo_max_freq reports only the base frequency
instead of the rated boost frequency (3600 MHz).

  dmesg:
    ACPI CPPC: No CPC descriptor for CPU:0
    acpi_cpufreq: CPU0: Unable to get performance capabilities (-19)

cppc-cpufreq already has a DMI fallback (cppc_get_dmi_max_khz()) that
reads the processor max speed from SMBIOS Type 4.  Export it and reuse
it in acpi-cpufreq as a last-resort source for the boost frequency.

A sanity check ensures the DMI value is above the _PSS P0 frequency
and within 2x of it; values outside that range are ignored and the
existing arch_set_max_freq_ratio() path is taken instead.  The 2x
upper bound is based on a survey of the AMD Ryzen Embedded V1000
series, where the highest boost-to-base ratio is 1.8x (V1404I:
2.0 GHz base / 3.6 GHz boost).

The DMI lookup and sanity check are wrapped in a helper,
acpi_cpufreq_resolve_max_freq(), which falls through to
arch_set_max_freq_ratio() if the DMI value is absent or
out of range.
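
The sanity check can be sketched as below. The function name is hypothetical, values are in kHz, and treating exactly 2x as acceptable is an assumption; a `None` result stands for "fall through to the existing path".

```python
# Sketch only: accept the DMI max speed when it lies above P0 and within 2x of it.
def resolve_max_freq_khz(pss_p0_khz, dmi_max_khz):
    if dmi_max_khz is None:
        return None  # DMI value absent
    if pss_p0_khz < dmi_max_khz <= 2 * pss_p0_khz:
        return dmi_max_khz
    return None  # out of range: caller keeps the existing behaviour
```

With the V1780B numbers from the commit, `resolve_max_freq_khz(3350000, 3600000)` yields the 3600000 kHz boost frequency.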

Tested on AMD Ryzen Embedded V1780B with v7.0-rc4:

  Before: cpuinfo_max_freq = 3350000 (base only)
  After:  cpuinfo_max_freq = 3600000 (includes boost)

Link: https://www.amd.com/en/products/embedded/ryzen/ryzen-v1000-series.html#specifications
Signed-off-by: Henry Tseng <henrytseng@qnap.com>
Link: https://patch.msgid.link/20260324090948.1667340-1-henrytseng@qnap.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-25 14:29:05 +01:00
Viresh Kumar
6a28fb8cb2 cpufreq: conservative: Reset requested_freq on limits change
A recently reported issue highlighted that the cached requested_freq
is not guaranteed to stay in sync with policy->cur. If the platform
changes the actual CPU frequency after the governor sets one (e.g.
due to platform-specific frequency scaling) and a re-sync occurs
later, policy->cur may diverge from requested_freq.

This can lead to incorrect behavior in the conservative governor.
For example, the governor may assume the CPU is already running at
the maximum frequency and skip further increases even though there
is still headroom.

Avoid this by resetting the cached requested_freq to policy->cur on
detecting a change in policy limits.

Reported-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Tested-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://lore.kernel.org/all/20260210115458.3493646-1-zhenglifeng1@huawei.com/
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Cc: All applicable <stable@vger.kernel.org>
Link: https://patch.msgid.link/d846a141a98ac0482f20560fcd7525c0f0ec2f30.1773999467.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-23 13:32:57 +01:00
Viresh Kumar
8f13c0c6cb cpufreq: Don't skip cpufreq_frequency_table_cpuinfo()
Commit 6db0f533d3 ("cpufreq: preserve freq_table_sorted
across suspend/hibernate") unintentionally stopped
cpufreq_frequency_table_cpuinfo() from being called
for old policies that are re-initialized.

This leads to potentially invalid values of policy->max and
policy->cpuinfo_max_freq.

Fix the issue by reverting the original commit and adding the condition
for just the sorting function.

Fixes: 6db0f533d3 ("cpufreq: preserve freq_table_sorted across suspend/hibernate")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 6.19+ <stable@vger.kernel.org> # 6.19+
Link: https://patch.msgid.link/65ba5c45749267c82e8a87af3dc788b37a0b3f48.1773998611.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-23 13:32:57 +01:00
Rosen Penev
8655a4e35c cpufreq: tegra194: remove COMPILE_TEST
The driver needs architecture-specific headers to build. The problem is
exposed when COMPILE_TEST is added to TEGRA_BPMP.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2026-03-20 11:03:33 +05:30
Faruque Ansari
d6f8e0e06d cpufreq: Add QCS8300 to cpufreq-dt-platdev blocklist
The Qualcomm QCS8300 platform uses the qcom-cpufreq-hw
driver, so add it to the cpufreq-dt-platdev driver's blocklist.

Signed-off-by: Faruque Ansari <faruque.ansari@oss.qualcomm.com>
Reviewed-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2026-03-17 14:21:32 +05:30
Geert Uytterhoeven
951318c465 cpufreq: ti-cpufreq: Convert to of_machine_get_match()
Use the of_machine_get_match() helper instead of open-coding the same
operation.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/bba0631aea78b6db7d453a9f9e98ea16b7e2c269.1772468323.git.geert+renesas@glider.be
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-03-13 17:00:04 -05:00
Geert Uytterhoeven
8cd94ead51 cpufreq: qcom-nvmem: Convert to of_machine_get_match()
Use the of_machine_get_match() helper instead of open-coding the same
operation.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/886a603a7a1de6c8cb14ee0783ee0bceea4d914a.1772468323.git.geert+renesas@glider.be
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-03-13 17:00:04 -05:00
Geert Uytterhoeven
1838e0924e cpufreq: airoha: Convert to of_machine_get_match()
Use the of_machine_get_match() helper instead of open-coding the same
operation.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/cc76137755d93af982bf255095adafc7d523692c.1772468323.git.geert+renesas@glider.be
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-03-13 17:00:04 -05:00
Thorsten Blum
022eec206a cpufreq: governor: Use sysfs_emit() in sysfs show functions
Replace sprintf() with sysfs_emit() in sysfs show functions.
sysfs_emit() is preferred for formatting sysfs output because it
provides safer bounds checking.  No functional changes.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject tweak ]
Link: https://patch.msgid.link/20260223210351.344388-2-thorsten.blum@linux.dev
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-06 19:33:10 +01:00
Fabio M. De Francesco
6acae3c833 cpufreq: intel_pstate: Allow repeated intel_pstate disable
Repeated intel_pstate disables currently return an error, adding unnecessary
complexity to userspace scripts which must first read the current state and
conditionally write 'off'.

Make repeated intel_pstate disables a no-op.

Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20260219181600.16388-1-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-06 19:18:28 +01:00
Pengjie Zhang
8505bfb4e4 ACPI: CPPC: Move reference performance to capabilities
Currently, the `Reference Performance` register is read every time
the CPU frequency is sampled in `cppc_get_perf_ctrs()`. This function
is on the hot path of the cppc_cpufreq driver.

Reference Performance indicates the performance level that corresponds
to the Reference Counter incrementing and is not expected to change
dynamically during runtime (unlike the Delivered and Reference counters).

Reading this register in the hot path incurs unnecessary overhead,
particularly on platforms where CPC registers are located in the PCC
(Platform Communication Channel) subspace. This patch moves
`reference_perf` from the dynamic feedback counters structure
(`cppc_perf_fb_ctrs`) to the static capabilities structure
(`cppc_perf_caps`).

Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
[ rjw: Changelog adjustment ]
Link: https://patch.msgid.link/20260213100935.19111-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-05 20:00:19 +01:00
Pengjie Zhang
54de61a3f6 cpufreq: Add debug print for current frequency in __cpufreq_driver_target()
Include policy->cur in the debug message to explicitly show the frequency
transition (from current to target).

Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260129121813.3874516-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-05 15:45:36 +01:00
Thierry Reding
2d14bf98e6 cpufreq: tegra194: Rename Tegra239 to Tegra238
This chip identifies as Tegra238, so update the device tree compatible
string and data structures associated with it to use the correct name.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2026-03-03 09:46:04 +05:30
Sumit Gupta
13c45a2663 ACPI: CPPC: add APIs and sysfs interface for perf_limited
Add a sysfs interface to read/write the Performance Limited register.

The Performance Limited register indicates to the OS that an
unpredictable event (like thermal throttling) has limited processor
performance. It contains two sticky bits set by the platform:
  - Bit 0 (Desired_Excursion): Set when delivered performance is
    constrained below desired performance. Not used when Autonomous
    Selection is enabled.
  - Bit 1 (Minimum_Excursion): Set when delivered performance is
    constrained below minimum performance.

These bits remain set until OSPM explicitly clears them. The write
operation accepts a bitmask of bits to clear:
  - Write 0x1 to clear bit 0
  - Write 0x2 to clear bit 1
  - Write 0x3 to clear both bits

This enables users to detect if platform throttling impacted a workload.
Users clear the register before execution, run the workload, then check
afterward - if set, hardware throttling occurred during that time window.

The interface is exposed as:
  /sys/devices/system/cpu/cpuX/cpufreq/perf_limited
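
The write semantics described above can be modelled with a short sketch. The register is represented by a plain variable, the `SKETCH_*` macros and `sketch_clear_perf_limited()` are hypothetical names, and rejecting an empty or out-of-range mask with -EINVAL is an assumption rather than something this changelog specifies.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_DESIRED_EXCURSION	(1u << 0)	/* bit 0 */
#define SKETCH_MINIMUM_EXCURSION	(1u << 1)	/* bit 1 */
#define SKETCH_PERF_LIMITED_MASK \
	(SKETCH_DESIRED_EXCURSION | SKETCH_MINIMUM_EXCURSION)

/*
 * Clear the sticky bits selected by clear_mask in *reg.
 * Assumption: masks that are empty or name unknown bits are rejected.
 */
static int sketch_clear_perf_limited(uint32_t *reg, uint32_t clear_mask)
{
	if (clear_mask == 0 || (clear_mask & ~SKETCH_PERF_LIMITED_MASK))
		return -EINVAL;

	*reg &= ~clear_mask;
	return 0;
}
```

In the workflow described above, a user would clear both bits (0x3) before the workload and read the value back afterward; any bit set at that point indicates an excursion during the run.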

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-7-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-27 20:50:42 +01:00
Sumit Gupta
ea3db45ae4 cpufreq: cppc: Update MIN_PERF/MAX_PERF in target callbacks
Update MIN_PERF and MAX_PERF registers from policy->min and policy->max
in the .target() and .fast_switch() callbacks. This allows controlling
performance bounds via standard scaling_min_freq and scaling_max_freq
sysfs interfaces.

Similar to intel_cpufreq which updates HWP min/max limits in .target(),
cppc_cpufreq now programs MIN_PERF/MAX_PERF along with DESIRED_PERF.
Since MIN_PERF/MAX_PERF can be updated even when auto_sel is disabled,
they are updated unconditionally.

Also program MIN_PERF/MAX_PERF in store_auto_select() when enabling
autonomous selection so the platform uses correct bounds immediately.

Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Link: https://patch.msgid.link/20260206142658.72583-6-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-27 20:50:42 +01:00
Sumit Gupta
24ad4c6c13 cpufreq: CPPC: Update cached perf_ctrls on sysfs write
Update the cached perf_ctrls values when writing via sysfs to keep
them in sync with hardware registers:
- store_auto_select(): update perf_ctrls.auto_sel
- store_energy_performance_preference_val(): update perf_ctrls.energy_perf

This ensures consistent cached values after sysfs writes, which
complements the cppc_get_perf() initialization during policy setup.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-5-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-27 20:50:42 +01:00
Sumit Gupta
658fa7b1c4 ACPI: CPPC: Add cppc_get_perf() API to read performance controls
Add cppc_get_perf() function to read values of performance control
registers including desired_perf, min_perf, max_perf, energy_perf,
and auto_sel.

This provides a read interface to complement the existing
cppc_set_perf() write interface for performance control registers.

Note that auto_sel is read by cppc_get_perf() but not written by
cppc_set_perf() to avoid unintended mode changes during performance
updates. It can be updated with the existing dedicated
cppc_set_auto_sel() API.

Use cppc_get_perf() in cppc_cpufreq_get_cpu_data() to initialize
perf_ctrls with current hardware register values during cpufreq
policy initialization.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260206142658.72583-2-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-27 20:50:41 +01:00
Srinivas Pandruvada
6b050482ec cpufreq: intel_pstate: Fix crash during turbo disable
When the system is booted with kernel command line argument "nosmt" or
"maxcpus" to limit the number of CPUs, disabling turbo via:

 echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

results in a crash:

 PF: supervisor read access in kernel mode
 PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] SMP PTI
 ...
 RIP: 0010:store_no_turbo+0x100/0x1f0
 ...

This occurs because for_each_possible_cpu() returns CPUs even if they
are not online. For those CPUs, all_cpu_data[] will be NULL. Since
commit 973207ae3d ("cpufreq: intel_pstate: Rearrange max frequency
updates handling code"), all_cpu_data[] is dereferenced even for CPUs
which are not online, causing the NULL pointer dereference.

To fix that, pass the CPU number to intel_pstate_update_max_freq() and
use all_cpu_data[] only for CPUs that have a valid cpufreq policy.
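
The failure mode and the guard can be shown with a simplified sketch. It uses a NULL check on a per-CPU pointer array as a stand-in for "this CPU has a valid cpufreq policy"; the actual fix keys off the policy, and all `sketch_*` names here are hypothetical.

```c
#include <stddef.h>

#define SKETCH_NR_CPUS 4	/* stand-in for the set of possible CPUs */

struct sketch_cpudata {
	int max_freq;
};

/* Entries stay NULL for CPUs that never came online (e.g. "nosmt"). */
static struct sketch_cpudata *sketch_all_cpu_data[SKETCH_NR_CPUS];

/* Update max_freq on every CPU that has data; return how many were updated. */
static int sketch_update_max_freq_all(int new_max)
{
	int updated = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		struct sketch_cpudata *data = sketch_all_cpu_data[cpu];

		if (!data)
			continue;	/* skip instead of dereferencing NULL */

		data->max_freq = new_max;
		updated++;
	}
	return updated;
}
```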

Fixes: 973207ae3d ("cpufreq: intel_pstate: Rearrange max frequency updates handling code")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221068
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 6.16+ <stable@vger.kernel.org> # 6.16+
Link: https://patch.msgid.link/20260225001752.890164-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-25 14:39:19 +01:00
David Arcari
ab39cc4cb8 cpufreq: intel_pstate: Fix NULL pointer dereference in update_cpu_qos_request()
The update_cpu_qos_request() function attempts to initialize the 'freq'
variable by dereferencing 'cpudata' before verifying if the 'policy'
is valid.

This issue occurs on systems booted with the "nosmt" parameter, where
all_cpu_data[cpu] is NULL for the SMT sibling threads. As a result,
any call to update_qos_requests() will result in a NULL pointer
dereference as the code will attempt to access pstate.turbo_freq using
the NULL cpudata pointer.

Also, pstate.turbo_freq may be updated by intel_pstate_get_hwp_cap()
after initializing the 'freq' variable, so it is better to defer the
'freq' assignment until intel_pstate_get_hwp_cap() has been called.

Fix this by deferring the 'freq' assignment until after the policy and
driver_data have been validated.
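
The ordering issue can be sketched in isolation as follows. This is an illustration only: the structures are reduced to the fields needed, `sketch_refresh_caps()` is a hypothetical placeholder for the capability refresh done by intel_pstate_get_hwp_cap(), and returning -1 on failure is an assumption of the sketch.

```c
#include <stddef.h>

struct sketch_cpudata {
	int turbo_freq;
};

struct sketch_policy {
	struct sketch_cpudata *driver_data;
};

/* Hypothetical placeholder for a capability refresh that may change turbo_freq. */
static void sketch_refresh_caps(struct sketch_cpudata *cpu)
{
	cpu->turbo_freq += 100;
}

/* Return the frequency to use for the QoS request, or -1 if unavailable. */
static int sketch_qos_freq(struct sketch_policy *policy)
{
	struct sketch_cpudata *cpu;
	int freq;

	if (!policy)
		return -1;	/* validate before touching anything */

	cpu = policy->driver_data;
	if (!cpu)
		return -1;	/* no per-CPU data, e.g. an offline SMT sibling */

	sketch_refresh_caps(cpu);
	freq = cpu->turbo_freq;	/* assigned only after validation and refresh */

	return freq;
}
```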

Fixes: ae1bdd23b9 ("cpufreq: intel_pstate: Adjust frequency percentage computations")
Reported-by: Jirka Hladky <jhladky@redhat.com>
Closes: https://lore.kernel.org/all/CAE4VaGDfiPvz3AzrwrwM4kWB3SCkMci25nPO8W1JmTBd=xHzZg@mail.gmail.com/
Signed-off-by: David Arcari <darcari@redhat.com>
Cc: 6.18+ <stable@vger.kernel.org> # 6.18+
[ rjw: Added one paragraph to the changelog ]
Link: https://patch.msgid.link/20260224122106.228116-1-darcari@redhat.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-24 15:38:16 +01:00