mirror of
https://github.com/torvalds/linux.git
synced 2026-05-18 11:38:01 +02:00
94de1dfd47
17522 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
119b1e61a7 |
RISC-V Patches for the 6.16 Merge Window, Part 1
* Support for the FWFT SBI extension, which is part of SBI 3.0 and a dependency for many new SBI and ISA extensions. * Support for getrandom() in the VDSO. * Support for mseal. * Optimized routines for raid6 syndrome and recovery calculations. * kexec_file() supports loading Image-formatted kernel binaries. * Improvements to the instruction patching framework to allow for atomic instruction patching, along with rules as to how systems need to behave in order to function correctly. * Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha, some SiFive vendor extensions. * Various fixes and cleanups, including: misaligned access handling, perf symbol mangling, module loading, PUD THPs, and improved uaccess routines. -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmhDLP8ZHHBhbG1lcmRh YmJlbHRAZ29vZ2xlLmNvbQAKCRAuExnzX7sYiZhFD/4+Zikkld812VjFb9dTF+Wj n/x9h86zDwAEFgf2BMIpUQhHru6vtdkO2l/Ky6mQblTPMWLafF4eK85yCsf84sQ0 +RX4sOMLZ0+qvqxKX+aOFe9JXOWB0QIQuPvgBfDDOV4UTm60sglIxwqOpKcsBEHs 2nplXXjiv0ckaMFLos8xlwu1uy4A/jMfT3Y9FDcABxYCqBoKOZ1frcL9ezJZbHbv BoOKLDH8ZypFxIG/eQ511lIXXtrnLas0l4jHWjrfsWu6pmXTgJasKtbGuH3LoLnM G/4qvHufR6lpVUOIL5L0V6PpsmYwDi/ciFIFlc8NH2oOZil3qiVaGSEbJIkWGFu9 8lWTXQWnbinZbfg2oYbWp8GlwI70vKomtDyYNyB9q9Cq9jyiTChMklRNODr4764j ZiEnzc/l4KyvaxUg8RLKCT595lKECiUDnMytbIbunJu05HBqRCoGpBtMVzlQsyUd ybkRt3BA7eOR8/xFA7ZZQeJofmiu2yxkBs5ggMo8UnSragw27hmv/OA0mWMXEuaD aaWc4ZKpKqf7qLchLHOvEl5ORUhsisyIJgZwOqdme5rQoWorVtr51faA4AKwFAN4 vcKgc5qJjK8vnpW+rl3LNJF9LtH+h4TgmUI853vUlukPoH2oqRkeKVGSkxG0iAze eQy2VjP1fJz6ciRtJZn9aw== =cZGy -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the FWFT SBI extension, which is part of SBI 3.0 and a dependency for many new SBI and ISA extensions - Support for getrandom() in the VDSO - Support for mseal - Optimized routines for raid6 syndrome and recovery calculations - kexec_file() 
supports loading Image-formatted kernel binaries - Improvements to the instruction patching framework to allow for atomic instruction patching, along with rules as to how systems need to behave in order to function correctly - Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha, some SiFive vendor extensions - Various fixes and cleanups, including: misaligned access handling, perf symbol mangling, module loading, PUD THPs, and improved uaccess routines * tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits) riscv: uaccess: Only restore the CSR_STATUS SUM bit RISC-V: vDSO: Wire up getrandom() vDSO implementation riscv: enable mseal sysmap for RV64 raid6: Add RISC-V SIMD syndrome and recovery calculations riscv: mm: Add support for Svinval extension RISC-V: Documentation: Add enough title underlines to CMODX riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE MAINTAINERS: Update Atish's email address riscv: uaccess: do not do misaligned accesses in get/put_user() riscv: process: use unsigned int instead of unsigned long for put_user() riscv: make unsafe user copy routines use existing assembly routines riscv: hwprobe: export Zabha extension riscv: Make regs_irqs_disabled() more clear perf symbols: Ignore mapping symbols on riscv RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND riscv: module: Optimize PLT/GOT entry counting riscv: Add support for PUD THP riscv: xchg: Prefetch the destination word for sc.w riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop riscv: Add support for Zicbop ... |
||
|
|
2670a39b1e
|
Merge tag 'riscv-mw2-6.16-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into for-next
riscv patches for 6.16-rc1, part 2 * Performance improvements - Add support for vdso getrandom - Implement raid6 calculations using vectors - Introduce svinval tlb invalidation * Cleanup - A bunch of deduplication of the macros we use for manipulating instructions * Misc - Introduce a kunit test for kprobes - Add support for mseal as riscv fits the requirements (thanks to Lorenzo for making sure of that :)) [Palmer: There was a rebase between part 1 and part 2, so I've had to do some more git surgery here... at least two rounds of surgery...] * alex-pr-2: (866 commits) RISC-V: vDSO: Wire up getrandom() vDSO implementation riscv: enable mseal sysmap for RV64 raid6: Add RISC-V SIMD syndrome and recovery calculations riscv: mm: Add support for Svinval extension riscv: Add kprobes KUnit test riscv: kprobes: Remove duplication of RV_EXTRACT_ITYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_UTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_RD_REG riscv: kprobes: Remove duplication of RVC_EXTRACT_BTYPE_IMM riscv: kprobes: Remove duplication of RVC_EXTRACT_C2_RS1_REG riscv: kproves: Remove duplication of RVC_EXTRACT_JTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_BTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_RS1_REG riscv: kprobes: Remove duplication of RV_EXTRACT_JTYPE_IMM riscv: kprobes: Move branch_funct3 to insn.h riscv: kprobes: Move branch_rs2_idx to insn.h Linux 6.15-rc6 Input: xpad - fix xpad_device sorting Input: xpad - add support for several more controllers Input: xpad - fix Share button on Xbox One controllers ... |
||
|
|
4d6319289e
|
perf symbols: Ignore mapping symbols on riscv
RISC-V ELF files use mapping symbols with the special names $x and $d to identify regions of RISC-V code or code with different ISAs[1]. These symbols don't identify functions, so they confuse the perf output. The patch filters out these symbols at load time, similar to "4886f2ca perf symbols: Ignore mapping symbols on aarch64". [1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc#mapping-symbol Signed-off-by: Haibo Xu <haibo1.xu@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250409025202.201046-1-haibo1.xu@intel.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com> |
||
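The filtering described in the commit above can be illustrated with a small sketch. It is not the perf implementation (which is C); the function names are hypothetical, and it assumes a mapping symbol is any name beginning with `$x` or `$d`, optionally followed by more characters.

```python
def is_riscv_mapping_symbol(name: str) -> bool:
    """Return True for names that look like RISC-V mapping symbols.

    Assumption: a mapping symbol is "$x" or "$d", optionally followed
    by further characters (for example an ISA string or a suffix).
    """
    return name.startswith(("$x", "$d"))


def filter_symbols(names):
    """Drop mapping symbols so only function-like names remain."""
    return [n for n in names if not is_riscv_mapping_symbol(n)]


if __name__ == "__main__":
    print(filter_symbols(["main", "$x", "$d", "$xrv64gc", "memcpy"]))
```

Filtering at load time, as the commit does, means later stages such as report and annotate never see these names.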
|
|
0939bd2fcf |
perf tools improvements and fixes for Linux v6.16:
perf report/top/annotate TUI:
- Accept the left arrow key as a Zoom out if done on the first column.
- Show the source code toggle status in the title, to help spot bugs with
the various disassemblers (capstone, llvm, objdump).
- Provide feedback on unhandled hotkeys.
Build:
- Better inform when certain features are not available with warnings in the
build process and in 'perf version --build-options' or 'perf -vv'.
perf record:
- Improve the --off-cpu code by synthesizing events for switch-out -> switch-in
intervals using a BPF program. This can be fine tuned using a --off-cpu-thresh
knob.
perf report:
- Add 'tgid' sort key.
perf mem/c2c:
- Add 'op', 'cache', 'snoop', 'dtlb' output fields.
- Add support for 'ldlat' on AMD IBS (Instruction Based Sampling).
perf ftrace:
- Use process/session specific trace settings instead of messing with
the global ftrace knobs.
perf trace:
- Implement syscall summary in BPF.
- Support --summary-mode=cgroup.
- Always print return value for syscalls returning a pid.
- The rseq and set_robust_list don't return a pid, just -errno.
perf lock contention:
- Symbolize zone->lock using BTF.
- Add -J/--inject-delay option to estimate impact on application performance by
optimization of kernel locking behavior.
perf stat:
- Improve hybrid support for the NMI watchdog warning.
Symbol resolution:
- Handle 'u' and 'l' symbols in /proc/kallsyms, resolving some Rust symbols.
- Improve Rust demangler.
Hardware tracing:
Intel PT:
- Fix PEBS-via-PT data_src.
- Do not default to recording all switch events.
- Fix pattern matching with python3 on the SQL viewer script.
arm64:
- Fixups for the hip08 hha PMU.
Vendor events:
- Update Intel events/metrics files for alderlake, alderlaken, arrowlake,
bonnell, broadwell, broadwellde, broadwellx, cascadelakex, clearwaterforest,
elkhartlake, emeraldrapids, grandridge, graniterapids, haswell, haswellx,
icelake, icelakex, ivybridge, ivytown, jaketown, lunarlake, meteorlake,
nehalemep, nehalemex, rocketlake, sandybridge, sapphirerapids, sierraforest,
skylake, skylakex, snowridgex, tigerlake, westmereep-dp, westmereep-sp,
westmereep-sx.
python support:
- Add support for event counts in the python binding, add a counting.py example.
perf list:
- Display the PMU name associated with a perf metric in JSON.
perf test:
- Hybrid improvements for metric value validation test.
- Fix LBR test by ignoring idle task.
- Add AMD IBS sw filter and 'ldlat' tests.
- Add 'perf trace --summary-mode=cgroup' test.
- Add tests for the various language symbol demanglers.
Miscellaneous:
- Allow specifying the cpu an event will be tied to using '-e event/cpu=N/'.
- Sync various headers with the kernel sources.
- Add annotations to use clang's -Wthread-safety and fix some problems
it detected.
- Make dump_stack() use perf's symbol resolution to provide better backtraces.
- Intel TPEBS support cleanups and fixes. TPEBS stands for Timed PEBS
(Precision Event-Based Sampling), which adds timing info: the retirement
latency of instructions.
- Various memory allocation (some detected by ASAN) and reference counting
fixes.
- Add an 8-byte aligned PERF_RECORD_COMPRESSED2 to replace PERF_RECORD_COMPRESSED.
- Skip unsupported event types in perf.data files, don't stop when finding one.
- Improve lookups using hashmaps and binary searches.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCaD9ViwAKCRCyPKLppCJ+
JzOfAQDXlukhPQyuJ4j1ie0x1QO4jalloMbG1Bkp3hn6yjxafAD9Ha5wr+dwnAj4
FfxOVqua29r8Htn4aGahXZ0nnlVp9Ac=
=bwgD
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
"perf report/top/annotate TUI:
- Accept the left arrow key as a Zoom out if done on the first column
- Show the source code toggle status in the title, to help spot bugs
with the various disassemblers (capstone, llvm, objdump)
- Provide feedback on unhandled hotkeys
Build:
- Better inform when certain features are not available with warnings
in the build process and in 'perf version --build-options' or 'perf -vv'
perf record:
- Improve the --off-cpu code by synthesizing events for switch-out ->
switch-in intervals using a BPF program. This can be fine tuned
using a --off-cpu-thresh knob
perf report:
- Add 'tgid' sort key
perf mem/c2c:
- Add 'op', 'cache', 'snoop', 'dtlb' output fields
- Add support for 'ldlat' on AMD IBS (Instruction Based Sampling)
perf ftrace:
- Use process/session specific trace settings instead of messing with
the global ftrace knobs
perf trace:
- Implement syscall summary in BPF
- Support --summary-mode=cgroup
- Always print return value for syscalls returning a pid
- The rseq and set_robust_list don't return a pid, just -errno
perf lock contention:
- Symbolize zone->lock using BTF
- Add -J/--inject-delay option to estimate impact on application
performance by optimization of kernel locking behavior
perf stat:
- Improve hybrid support for the NMI watchdog warning
Symbol resolution:
- Handle 'u' and 'l' symbols in /proc/kallsyms, resolving some Rust
symbols
- Improve Rust demangler
Hardware tracing:
Intel PT:
- Fix PEBS-via-PT data_src
- Do not default to recording all switch events
- Fix pattern matching with python3 on the SQL viewer script
arm64:
- Fixups for the hip08 hha PMU
Vendor events:
- Update Intel events/metrics files for alderlake, alderlaken,
arrowlake, bonnell, broadwell, broadwellde, broadwellx,
cascadelakex, clearwaterforest, elkhartlake, emeraldrapids,
grandridge, graniterapids, haswell, haswellx, icelake, icelakex,
ivybridge, ivytown, jaketown, lunarlake, meteorlake, nehalemep,
nehalemex, rocketlake, sandybridge, sapphirerapids, sierraforest,
skylake, skylakex, snowridgex, tigerlake, westmereep-dp,
westmereep-sp, westmereep-sx
python support:
- Add support for event counts in the python binding, add a
counting.py example
perf list:
- Display the PMU name associated with a perf metric in JSON
perf test:
- Hybrid improvements for metric value validation test
- Fix LBR test by ignoring idle task
- Add AMD IBS sw filter and 'ldlat' tests
- Add 'perf trace --summary-mode=cgroup' test
- Add tests for the various language symbol demanglers
Miscellaneous:
- Allow specifying the cpu an event will be tied to using '-e
event/cpu=N/'
- Sync various headers with the kernel sources
- Add annotations to use clang's -Wthread-safety and fix some
problems it detected
- Make dump_stack() use perf's symbol resolution to provide better
backtraces
- Intel TPEBS support cleanups and fixes. TPEBS stands for Timed PEBS
(Precision Event-Based Sampling), which adds timing info: the
retirement latency of instructions
- Various memory allocation (some detected by ASAN) and reference
counting fixes
- Add an 8-byte aligned PERF_RECORD_COMPRESSED2 to replace
PERF_RECORD_COMPRESSED
- Skip unsupported event types in perf.data files, don't stop when
finding one
- Improve lookups using hashmaps and binary searches"
* tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (206 commits)
perf callchain: Always populate the addr_location map when adding IP
perf lock contention: Reject more than 10ms delays for safety
perf trace: Set errpid to false for rseq and set_robust_list
perf symbol: Move demangling code out of symbol-elf.c
perf trace: Always print return value for syscalls returning a pid
perf script: Print PERF_AUX_FLAG_COLLISION flag
perf mem: Show absolute percent in mem_stat output
perf mem: Display sort order only if it's available
perf mem: Describe overhead calculation in brief
perf record: Fix incorrect --user-regs comments
Revert "perf thread: Ensure comm_lock held for comm_list"
perf test trace_summary: Skip --bpf-summary tests if no libbpf
perf test intel-pt: Skip jitdump test if no libelf
perf intel-tpebs: Avoid race when evlist is being deleted
perf test demangle-java: Don't segv if demangling fails
perf symbol: Fix use-after-free in filename__read_build_id
perf pmu: Avoid segv for missing name/alias_name in wildcarding
perf machine: Factor creating a "live" machine out of dwarf-unwind
perf test: Add AMD IBS sw filter test
perf mem: Count L2 HITM for c2c statistic
...
|
||
|
|
a913ef6fd8 |
perf callchain: Always populate the addr_location map when adding IP
Dropping symbols also meant the callchain map wasn't populated, but
the callchain map is needed to find the DSO.
Plumb the symbols option better, falling back to thread__find_map()
rather than thread__find_symbol() when symbols are disabled.
Fixes:
|
||
|
|
0df14c1f1e |
perf lock contention: Reject more than 10ms delays for safety
Delaying kernel operations can be dangerous, and in the future the kernel
may kill (non-sleepable) BPF programs that run for too long.
Limit the max delay to 10ms and update the documentation accordingly.
$ sudo ./perf lock con -abl -J 100000us@cgroup_mutex true
lock delay is too long: 100000us (> 10ms)
Usage: perf lock contention [<options>]
-J, --inject-delay <TIME@FUNC>
Inject delays to specific locks
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250515181042.555189-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
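The 10ms limit from the commit above can be sketched as a small parser for the `TIME@FUNC` argument. This is only an illustration: the function name is hypothetical, and the accepted unit suffixes (`ns`, `us`, `ms`, defaulting to nanoseconds) are an assumption rather than something this log states. The error text mirrors the message shown in the commit.

```python
import re

MAX_DELAY_NS = 10 * 1000 * 1000  # 10ms expressed in nanoseconds
_UNIT_NS = {"ns": 1, "us": 1_000, "ms": 1_000_000}  # assumed suffixes


def parse_inject_delay(spec):
    """Parse 'TIME@FUNC' and reject delays longer than 10ms.

    Returns (delay_in_ns, func_name). Hypothetical helper, not perf's API.
    """
    m = re.fullmatch(r"(\d+)(ns|us|ms)?@(\w+)", spec)
    if m is None:
        raise ValueError(f"invalid delay spec: {spec}")
    value, unit, func = m.groups()
    unit = unit or "ns"
    delay_ns = int(value) * _UNIT_NS[unit]
    if delay_ns > MAX_DELAY_NS:
        raise ValueError(f"lock delay is too long: {value}{unit} (> 10ms)")
    return delay_ns, func


if __name__ == "__main__":
    print(parse_inject_delay("100us@cgroup_mutex"))
```

Checking the limit while parsing the option means an oversized delay is rejected before any BPF program is loaded.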
|
|
8c56bfe53b |
perf trace: Set errpid to false for rseq and set_robust_list
The 'rseq' and 'set_robust_list' syscalls don't return a pid, so set errpid for both to false. Fixes: |
||
|
|
4d9b5146f0 |
perf symbol: Move demangling code out of symbol-elf.c
symbol-elf.c is used when building with libelf, symbol-minimal is used otherwise. There is no reason the demangling code with no dependencies on libelf is part of symbol-elf.c so move to symbol.c. This allows demangling tests to pass with NO_LIBELF=1. Structurally, while moving the functions rename demangle_sym() to dso__demangle_sym() which is already a function exposed in symbol.h and the only purpose of which in symbol-elf.c was to call demangle_sym(). Change the calls to demangle_sym() in symbol-elf.c to calls to dso__demangle_sym(). Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528210858.499898-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
c7a48ea9b9 |
perf trace: Always print return value for syscalls returning a pid
The syscalls consistently observed with a missing return value were
set_robust_list and rseq. This is because perf cannot find their child process.
This change ensures that the return value is always printed.
Before:
0.256 ( 0.001 ms): set_robust_list(head: 0x7f09c77dba20, len: 24) =
0.259 ( 0.001 ms): rseq(rseq: 0x7f09c77dc0e0, rseq_len: 32, sig: 1392848979) =
After:
0.270 ( 0.002 ms): set_robust_list(head: 0x7f0bb14a6a20, len: 24) = 0
0.273 ( 0.002 ms): rseq(rseq: 0x7f0bb14a70e0, rseq_len: 32, sig: 1392848979) = 0
Committer notes:
As discussed in the thread in the Link: tag below, these two don't
return a pid, but for syscalls returning one, we need to print the
result and if we manage to find the children in 'perf trace' data
structures, then print its name as well.
Fixes:
|
||
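The behavior described in the committer notes above can be sketched as follows. The function and parameter names are hypothetical, and the `(comm)` suffix format is an assumption for illustration; only the rule itself (always print the value, append the child's name when it is known) comes from the commit text.

```python
def format_syscall_ret(ret, returns_pid, comm_by_pid):
    """Format the '= value' tail of a traced syscall line.

    The return value is always printed. If the syscall returns a pid
    and the child is known, its command name is appended as well.
    """
    text = f"= {ret}"
    if returns_pid and ret > 0:
        comm = comm_by_pid.get(ret)  # None when the child was not found
        if comm is not None:
            text += f" ({comm})"
    return text


if __name__ == "__main__":
    print(format_syscall_ret(0, False, {}))
    print(format_syscall_ret(1234, True, {1234: "bash"}))
```

With this shape, a failed lookup degrades to printing the bare number instead of printing nothing.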
|
|
e8718f9866 |
perf script: Print PERF_AUX_FLAG_COLLISION flag
Print out the collision flag for AUX trace data. This is helpful for inspecting sample collisions. After: 0x217b60@/data_nvme1n1/niayan01/upstream/perf.data [0x40]: event: 11 . . ... raw event: size 64 bytes . 0000: 0b 00 00 00 00 00 40 00 d2 ef 3f 00 00 00 00 00 ......@...?..... . 0010: ff 0f 00 00 00 00 00 00 08 00 00 00 00 00 00 00 ................ . 0020: 1c 01 00 00 1c 01 00 00 10 bf 38 d6 11 01 00 00 ..........8..... . 0030: 03 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00 ................ 3 1176120114960 0x217b60 [0x40]: PERF_RECORD_AUX offset: 0x3fefd2 size: 0xfff flags: 0x8 [C] The added character '[C]' indicates the collision. Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250528153519.188644-1-leo.yan@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
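To make the raw dump in the commit above easier to read, here is a sketch that decodes the first 32 bytes of that event. It assumes a little-endian layout of a 4-byte type, 2-byte misc, 2-byte size, then three 64-bit fields (offset, size, flags), and it assumes the flag bits 0x01/0x02/0x04/0x08 map to the letters T/O/P/C; only the `[C]` for 0x8 is confirmed by the commit text.

```python
import struct

# Assumed bit-to-letter mapping; only 0x08 -> "C" is shown in the commit.
AUX_FLAG_LETTERS = [(0x01, "T"), (0x02, "O"), (0x04, "P"), (0x08, "C")]


def decode_aux_record(raw):
    """Decode the header and AUX fields from a raw PERF_RECORD_AUX dump."""
    rec_type, misc, size, offset, aux_size, flags = struct.unpack_from(
        "<IHHQQQ", raw
    )
    letters = "".join(ch for bit, ch in AUX_FLAG_LETTERS if flags & bit)
    return {
        "type": rec_type,
        "misc": misc,
        "size": size,
        "offset": offset,
        "aux_size": aux_size,
        "flags": flags,
        "letters": letters,
    }


if __name__ == "__main__":
    raw = bytes.fromhex(
        "0b 00 00 00 00 00 40 00 d2 ef 3f 00 00 00 00 00"
        "ff 0f 00 00 00 00 00 00 08 00 00 00 00 00 00 00"
    )
    rec = decode_aux_record(raw)
    print(f"offset: {rec['offset']:#x} size: {rec['aux_size']:#x} "
          f"flags: {rec['flags']:#x} [{rec['letters']}]")
```

The decoded values line up with the `PERF_RECORD_AUX offset: 0x3fefd2 size: 0xfff flags: 0x8 [C]` line quoted in the commit message.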
|
|
0dad79cf81 |
perf mem: Show absolute percent in mem_stat output
Currently the output sums up to 100% for each entry. But it can be
confusing when it's displayed with 'overhead'.
Before:
$ perf mem report -F overhead,sample,cache,comm
...
# -------------- Cache --------------
# Overhead Samples L1 L2 L3 L1-buf Other Command
# ........ ............ ................................... ...............
#
25.38% 517 34.6% 0.0% 15.8% 23.3% 26.2% swapper
9.03% 239 35.4% 0.8% 9.1% 22.1% 32.6% chrome
8.61% 233 45.3% 1.2% 8.9% 22.7% 21.9% Chrome_ChildIOT
7.81% 189 33.6% 0.4% 5.5% 35.9% 24.6% Isolated Web Co
3.73% 103 40.4% 0.3% 2.7% 39.4% 17.2% gnome-shell
Let's convert it to use absolute percent value so that it can add up to
the overhead for that entry.
After:
# -------------- Cache --------------
# Overhead Samples L1 L2 L3 L1-buf Other Command
# ........ ............ ................................... ...............
#
25.38% 517 8.8% 0.0% 4.0% 5.9% 6.7% swapper
9.03% 239 3.2% 0.1% 0.8% 2.0% 2.9% chrome
8.61% 233 3.9% 0.1% 0.8% 2.0% 1.9% Chrome_ChildIOT
7.81% 189 2.6% 0.0% 0.4% 2.8% 1.9% Isolated Web Co
3.73% 103 1.5% 0.0% 0.1% 1.5% 0.6% gnome-shell
This aligns well with the existing 'mem' sort key.
$ perf mem report -s comm,mem -H
...
#
# Overhead Samples Command / Memory access
# ......................... ..........................................
#
25.38% 517 swapper
8.78% 150 L1 hit
6.66% 72 RAM hit
5.92% 137 LFB/MAB hit
4.02% 157 L3 hit
0.00% 1 L3 miss
9.03% 239 chrome
3.19% 117 L1 hit
2.94% 35 RAM hit
1.99% 48 LFB/MAB hit
0.82% 32 L3 hit
0.08% 5 L2 hit
0.00% 2 L3 miss
We can add an option or a config to change the setting later.
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250523222157.1259998-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
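The conversion in the commit above amounts to scaling each relative cache percentage by the entry's overhead. A minimal sketch, using the 'swapper' row from the Before table as input (the function name is hypothetical):

```python
def to_absolute_percent(overhead_pct, relative_pcts):
    """Scale per-entry relative percentages by the entry's overhead.

    If the relative values add up to 100%, the absolute values add up
    to the overhead of that entry.
    """
    return {k: overhead_pct * v / 100.0 for k, v in relative_pcts.items()}


if __name__ == "__main__":
    # 'swapper' row from the Before table: overhead 25.38%
    before = {"L1": 34.6, "L2": 0.0, "L3": 15.8, "L1-buf": 23.3, "Other": 26.2}
    after = to_absolute_percent(25.38, before)
    for name, value in after.items():
        print(f"{name:7s} {value:5.1f}%")
```

Because the Before values are themselves rounded to one decimal, a result recomputed this way can differ from the After table in the last digit.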
|
|
7a6710d015 |
perf mem: Display sort order only if it's available
IOW it's not used when -F option is used alone. Let's make it conditional to skip printing incorrect information. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20250523222157.1259998-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
00a23c000e |
perf mem: Describe overhead calculation in brief
Unlike perf-report, which uses sample period for overhead calculation, perf-mem overhead is calculated using sample weight. Describe the perf-mem overhead calculation method in its man page. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20250523222157.1259998-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
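The difference between the two overhead calculations mentioned above can be shown with a toy example. The sample records and field names here are made up for illustration; the sketch only demonstrates that weighting by sample weight instead of sample period can change the percentages.

```python
def overhead_by_comm(samples, field):
    """Return per-command overhead in percent, weighting by `field`.

    `field` is "period" (perf-report style) or "weight" (perf-mem style).
    """
    total = sum(s[field] for s in samples)
    if total == 0:
        return {}
    sums = {}
    for s in samples:
        sums[s["comm"]] = sums.get(s["comm"], 0) + s[field]
    return {comm: 100.0 * v / total for comm, v in sums.items()}


if __name__ == "__main__":
    # Hypothetical samples: equal periods, very different weights.
    samples = [
        {"comm": "a", "period": 1, "weight": 30},
        {"comm": "b", "period": 1, "weight": 10},
    ]
    print(overhead_by_comm(samples, "period"))
    print(overhead_by_comm(samples, "weight"))
```

In this example both commands have the same share by period, while by weight the first command dominates.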
|
|
a4a859eb67 |
perf record: Fix incorrect --user-regs comments
The comment for the "--user-regs" option is incorrect; fix it.
"on interrupt," -> "in user space,"
Fixes:
|
||
|
|
24bcc31fc7 |
Revert "perf thread: Ensure comm_lock held for comm_list"
This reverts commit
|
||
|
|
6dd7a0fde9 |
perf test trace_summary: Skip --bpf-summary tests if no libbpf
If perf is built without libbpf (e.g. NO_LIBBPF=1) then the --bpf-summary perf trace tests will fail. Skip the tests as this is expected behavior. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
8755f940a0 |
perf test intel-pt: Skip jitdump test if no libelf
jitdump support is only present if building with libelf. Skip the intel-pt jitdump test if perf isn't compiled with libelf support. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
040a008d0e |
perf intel-tpebs: Avoid race when evlist is being deleted
Reading through the evsel->evlist may segfault if a sample arrives
when the evlist is being deleted.
Detect this case and ignore samples arriving when the evlist is being
deleted.
Fixes:
|
||
|
|
07f2b1287c |
perf test demangle-java: Don't segv if demangling fails
The buffer returned by dso__demangle_sym() may be NULL, don't segv in strcmp if this happens. Currently this happens for NO_LIBELF=1 builds. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
fef8f648bb |
perf symbol: Fix use-after-free in filename__read_build_id
The same buf is used for the program headers and for reading notes.
Because the notes memory may be reallocated, this corrupts the memory
pointed to by the phdr. Using the same buffer is in any case a logic
error. Rather than deal with the duplicated code, introduce an elf32
boolean and a union for either the elf32 or elf64 headers that are in
use. Let the program headers have their own memory and grow the buffer
for notes as necessary.
Before this change, `perf list -j` compiled with asan would crash with:
```
==4176189==ERROR: AddressSanitizer: heap-use-after-free on address 0x5160000070b8 at pc 0x555d3b15075b bp 0x7ffebb5a8090 sp 0x7ffebb5a8088
READ of size 8 at 0x5160000070b8 thread T0
#0 0x555d3b15075a in filename__read_build_id tools/perf/util/symbol-minimal.c:212:25
#1 0x555d3ae43aff in filename__sprintf_build_id tools/perf/util/build-id.c:110:8
...
0x5160000070b8 is located 312 bytes inside of 560-byte region [0x516000006f80,0x5160000071b0)
freed by thread T0 here:
#0 0x555d3ab21840 in realloc (perf+0x264840) (BuildId: 12dff2f6629f738e5012abdf0e90055518e70b5e)
#1 0x555d3b1506e7 in filename__read_build_id tools/perf/util/symbol-minimal.c:206:11
...
previously allocated by thread T0 here:
#0 0x555d3ab21423 in malloc (perf+0x264423) (BuildId: 12dff2f6629f738e5012abdf0e90055518e70b5e)
#1 0x555d3b1503a2 in filename__read_build_id tools/perf/util/symbol-minimal.c:182:9
...
```
Note: this bug is long standing and not introduced by the other asan
fix in commit
|
||
|
|
2a2a7f5e7d |
perf pmu: Avoid segv for missing name/alias_name in wildcarding
The pmu name or alias_name fields may be NULL and should be skipped if
so. This is done in all loops of perf_pmu___name_match except the
final wildcard loop which was an oversight.
Fixes:
|
||
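The fix described above follows a common pattern: skip missing names before matching. A minimal sketch of that pattern, with a hypothetical function name and `fnmatch`-style wildcards standing in for whatever matching perf actually does:

```python
from fnmatch import fnmatchcase


def pmu_name_matches(names, pattern):
    """Return True if any present name matches the wildcard pattern.

    `names` holds a PMU's name and alias name; either may be None and
    must be skipped rather than passed to the matcher.
    """
    for name in names:
        if name is None:
            continue  # missing name or alias_name: nothing to compare
        if fnmatchcase(name, pattern):
            return True
    return False


if __name__ == "__main__":
    print(pmu_name_matches(["uncore_imc_0", None], "uncore_imc*"))
    print(pmu_name_matches([None, None], "cpu*"))
```

Putting the check at the top of the loop body covers every matching step that follows it.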
|
|
4c04654455 |
perf machine: Factor creating a "live" machine out of dwarf-unwind
Factor out for use in places other than the dwarf unwinding tests for libunwind. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250313052952.871958-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
0e71bcdcf1 |
perf test: Add AMD IBS sw filter test
The kernel v6.14 added 'swfilt' to support privilege filtering in software so that IBS can be used by regular users. Add a test case in x86 to verify the behavior. $ sudo perf test -vv 'IBS software filter' 113: AMD IBS software filtering: --- start --- test child forked, pid 178826 check availability of IBS swfilt run perf record with modifier and swfilt [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.000 MB /dev/null ] [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.000 MB /dev/null ] [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.000 MB /dev/null ] [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 0.000 MB /dev/null ] check number of samples with swfilt [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.037 MB - ] [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.041 MB - ] ---- end(0) ---- 113: AMD IBS software filtering : Ok Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> # On a 9950x3d Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250524002754.1266681-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
fa9b3578ed |
perf mem: Count L2 HITM for c2c statistic
L2 HITM is not counted in c2c statistic decoding. Count it for lcl_hitm like how we handle L2 Peer snoop. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: CaiJingtao <caijingtao@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Yushan Wang <wangyushan12@huawei.com> Cc: Zeng Tao <prime.zeng@hisilicon.com> Cc: xueshan2@huawei.com Link: https://lore.kernel.org/r/20250425033845.57671-4-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
846b62b343 |
perf arm-spe: Add support for SPE Data Source packet on HiSilicon HIP12
Add data source encoding for HiSilicon HIP12 and the corresponding mapping to perf's memory data source. This will help to synthesize the data and support upper layer tools like perf-mem and perf-c2c.
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: CaiJingtao <caijingtao@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Junhao He <hejunhao3@huawei.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Yushan Wang <wangyushan12@huawei.com>
Cc: Zeng Tao <prime.zeng@hisilicon.com>
Cc: xueshan2@huawei.com
Link: https://lore.kernel.org/r/20250425033845.57671-3-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
785cdec46e |
Core x86 updates for v6.16:
Boot code changes:
- A large series of changes to reorganize the x86 boot code into a better isolated
and easier to maintain base of PIC early startup code in arch/x86/boot/startup/,
by Ard Biesheuvel.
Motivation & background:
| Since commit
|
|
|
||
|
|
94ec70880f |
Merge branch 'locking/futex' into locking/core, to pick up pending futex changes
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
628e124404 |
perf tests switch-tracking: Fix timestamp comparison
The test might fail on the Arm64 platform with the error:
# perf test -vvv "Track with sched_switch"
Missing sched_switch events
#
The issue is caused by incorrect handling of timestamp comparisons. The
comparison result, a signed 64-bit value, was being directly cast to an
int, leading to incorrect sorting for sched events.
The case does not fail every time; usually I can trigger the failure
after running it 20 to 30 times:
# while true; do perf test "Track with sched_switch"; done
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : FAILED!
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
106: Track with sched_switch : FAILED!
106: Track with sched_switch : Ok
106: Track with sched_switch : Ok
I used a cross compiler to build the perf tool on my host machine and
tested on a Debian / Juno board. Generally, I think this issue is not
very specific to GCC versions, as both internal CI and my local env can
reproduce the issue.
My Host Build compiler:
# aarch64-linux-gnu-gcc --version
aarch64-linux-gnu-gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Juno Board:
# lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 12 (bookworm)
Release: 12
Codename: bookworm
Fix this by explicitly returning 0, 1, or -1 based on whether the result
is zero, positive, or negative.
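A minimal sketch of the difference, assuming a hypothetical `struct sketch_event` with a 64-bit `tstamp` field (these names are illustrative and are not taken from the perf sources). The first comparator shows the truncating pattern described above; the second returns only -1, 0 or 1:

```c
#include <stdint.h>

/* Hypothetical event record; only the timestamp matters for ordering. */
struct sketch_event {
	uint64_t tstamp;
};

/*
 * Problematic pattern: the signed 64-bit difference is cast to int.
 * On common compilers the cast keeps only the low 32 bits, so a large
 * difference can come out as zero or with the wrong sign.
 */
static int cmp_truncating(const void *a, const void *b)
{
	const struct sketch_event *ea = a, *eb = b;
	int64_t diff = (int64_t)(ea->tstamp - eb->tstamp);

	return (int)diff;
}

/* Fixed pattern: compare explicitly and return -1, 0 or 1. */
static int cmp_three_way(const void *a, const void *b)
{
	const struct sketch_event *ea = a, *eb = b;

	if (ea->tstamp < eb->tstamp)
		return -1;
	if (ea->tstamp > eb->tstamp)
		return 1;
	return 0;
}
```

Both functions have the signature expected by qsort(), so either could be passed as its comparison callback; only the second orders timestamps that differ by 2^32 or more reliably.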
Fixes:
|
||
|
|
0ffca606e9 |
perf pmu intel: Adjust cpumasks for sub-NUMA clusters on graniterapids
On graniterapids the cache home agent (CHA) and memory controller
(IMC) PMUs all have their cpumask set to per-socket information. In
order for per NUMA node aggregation to work correctly the PMUs cpumask
needs to be set to CPUs for the relevant sub-NUMA grouping.
For example, on a 2 socket graniterapids machine with sub NUMA
clustering of 3, for uncore_cha and uncore_imc PMUs the cpumask is
"0,120" leading to aggregation only on NUMA nodes 0 and 3:
```
$ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a sleep 1
Performance counter stats for 'system wide':
N0 1 277,835,681,344 UNC_CHA_CLOCKTICKS
N0 1 19,242,894,228 UNC_M_CLOCKTICKS
N3 1 277,803,448,124 UNC_CHA_CLOCKTICKS
N3 1 19,240,741,498 UNC_M_CLOCKTICKS
1.002113847 seconds time elapsed
```
By updating the PMUs' cpumasks to "0,120", "40,160" and "80,200", the
correct 6 NUMA node aggregations are achieved:
```
$ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a sleep 1
Performance counter stats for 'system wide':
N0 1 92,748,667,796 UNC_CHA_CLOCKTICKS
N0 0 6,424,021,142 UNC_M_CLOCKTICKS
N1 0 92,753,504,424 UNC_CHA_CLOCKTICKS
N1 1 6,424,308,338 UNC_M_CLOCKTICKS
N2 0 92,751,170,084 UNC_CHA_CLOCKTICKS
N2 0 6,424,227,402 UNC_M_CLOCKTICKS
N3 1 92,745,944,144 UNC_CHA_CLOCKTICKS
N3 0 6,423,752,086 UNC_M_CLOCKTICKS
N4 0 92,725,793,788 UNC_CHA_CLOCKTICKS
N4 1 6,422,393,266 UNC_M_CLOCKTICKS
N5 0 92,717,504,388 UNC_CHA_CLOCKTICKS
N5 0 6,421,842,618 UNC_M_CLOCKTICKS
1.003406645 seconds time elapsed
```
In general, having the perf tool adjust cpumasks isn't desirable as
ideally the PMU driver would be advertising the correct cpumask.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250515181417.491401-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
9e893dab82 |
perf tests trace_summary.sh: Run in exclusive mode
This test succeeds only when running alone, probably because some tests add the vfs_getname probe that gets used by 'perf trace' and alter how it does syscall arg pathname resolution.
This should be removed or made a fallback to the preferred BPF mode of getting syscall parameters, but till then, run this in exclusive mode.
For reference, here are some of the tests that run close to this one:
127: perf record offcpu profiling tests : Ok
128: perf all PMU test : Ok
129: perf stat --bpf-counters test : Ok
130: Check Arm CoreSight trace data recording and synthesized samples: Skip
131: Check Arm CoreSight disassembly script completes without errors : Skip
132: Check Arm SPE trace data recording and synthesized samples : Skip
133: Test data symbol : Ok
134: Miscellaneous Intel PT testing : Skip
135: test Intel TPEBS counting mode : Skip
136: perf script task-analyzer tests : Ok
137: Check open filename arg using perf trace + vfs_getname : Ok
138: perf trace summary : Ok
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/aC-hHTgArwlF_zu9@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
dd8633bd09 |
perf test: Add cgroup summary test case for 'perf trace'
$ sudo ./perf test -vv 112 112: perf trace summary: --- start --- test child forked, pid 1018940 testing: perf trace -s -- true testing: perf trace -S -- true testing: perf trace -s --summary-mode=thread -- true testing: perf trace -S --summary-mode=total -- true testing: perf trace -as --summary-mode=thread --no-bpf-summary -- true testing: perf trace -as --summary-mode=total --no-bpf-summary -- true testing: perf trace -as --summary-mode=thread --bpf-summary -- true testing: perf trace -as --summary-mode=total --bpf-summary -- true testing: perf trace -aS --summary-mode=total --bpf-summary -- true testing: perf trace -as --summary-mode=cgroup --bpf-summary -- true testing: perf trace -aS --summary-mode=cgroup --bpf-summary -- true ---- end(0) ---- 112: perf trace summary : Ok Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250522142551.1062417-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
59df607bf8 |
perf python: Add counting.py as example for counting perf events
Add counting.py - a python version of counting.c to demonstrate measuring and reading of counts for given perf events. Committer testing: Build perf and make the generated python binding somewhere you can point to to avoid using the one in the distro python3-perf (fedora, may be different in other distros): $ make -k O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin Copy /tmp/build/perf-tools-next/python/perf.cpython-313-x86_64-linux-gnu.so to somewhere outside this toolbox container and then use it with root: # export PYTHONPATH=/root/python/ # ls -la /root/python/ total 10640 drwxr-xr-x. 1 root root 72 May 21 11:40 . dr-xr-x---. 1 root root 574 May 21 11:40 .. -rwxr-xr-x. 1 acme acme 10894360 May 21 11:40 perf.cpython-313-x86_64-linux-gnu.so # tools/perf/python/counting.py | head -5 For evsel(software/cpu-clock/) val: 2930946 enable: 2932479 run: 2932479 For evsel(software/cpu-clock/) val: 2924975 enable: 2926267 run: 2926267 For evsel(software/cpu-clock/) val: 2921017 enable: 2922430 run: 2922430 For evsel(software/cpu-clock/) val: 2914966 enable: 2916549 run: 2916549 For evsel(software/cpu-clock/) val: 2910027 enable: 2911589 run: 2911589 # Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> [ make the API take a CPU and thread then compute from these the appropriate indices. 
] Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/linux-perf-users/CAP-5=fWb-=hCYmpg7U5N9C94EucQGTOS7YwR2-fo4ptOexzxyg@mail.gmail.com/ Link: https://lore.kernel.org/r/20250519195148.1708988-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
aa68483740 |
perf python: Add evlist close support
Add support for the evlist close function. Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250519195148.1708988-7-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
739621f657 |
perf python: Add evsel read method
Add the evsel read method to enable python to read counter data for the given evsel. Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/linux-perf-users/20250512055748.479786-1-gautam@linux.ibm.com/ Link: https://lore.kernel.org/r/20250519195148.1708988-6-irogers@google.com [ make the API take a CPU and thread then compute from these the appropriate indices. ] Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
3b4991dcb4 |
perf python: Add support for 'struct perf_counts_values' to return counter data
Add support for the perf_counts_values struct to enable the python
bindings to read and return the counter data.
Committer notes:
Use T_ULONG instead of Py_T_ULONG, as all the other PyMemberDef arrays,
fixing the build with older python3 versions.
Use { .name = NULL, } to finish the new PyMemberDef
pyrf_counts_values_members array, again as the other arrays to please
some clang versions, ditto for PyGetSetDef.
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250519195148.1708988-5-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
0589aff473 |
perf python: Add evsel cpus and threads functions
Allow access to cpus and thread_map structs associated with an evsel. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250519195148.1708988-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
21fb366b2f |
perf test amd: Skip amd-ibs-period test on kernel < v6.15
A bunch of IBS kernel fixes went into v6.15-rc1 [1]. The amd-ibs-period test will fail without those kernel patches. Skip the test on systems running a kernel older than v6.15 to distinguish genuine new failures from known failures due to an old kernel.
Since all the related IBS fixes went into -rc1 itself, the ">= 6.15" check will work for any custom compiled v6.15-* kernel as well.
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Closes: https://lore.kernel.org/r/aCfuGXUnNIbnYo_r@x1
Link: https://lore.kernel.org/r/20250115054438.1021-1-ravi.bangoria@amd.com [1]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
8f454c9581 |
perf thread: Ensure comm_lock held for comm_list
Add thread safety annotations for comm_list and add locking for two instances where the list is accessed without the lock held (in contradiction to ____thread__set_comm()). Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Fei Lang <langfei@huawei.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/r/20250519224645.1810891-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
6fe064491b |
perf rwsem: Add clang's -Wthread-safety annotations
Add annotations used by clang's -Wthread-safety. Fix dsos compilation errors caused by a lack of annotations.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Fei Lang <langfei@huawei.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/r/20250519224645.1810891-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
ab2c742d75 |
perf dso: Minor refactor to allow clang's Wthread-safety analysis
The pattern:
```
if (x) {
lock(...)
}
block1;
if (x) {
unlock(...)
}
```
defeats clang's -Wthread-safety analysis where it complains of locks
held on one path and not another.
Add helper functions for "block1" then restructure as:
```
if (x) {
lock(...);
block1();
unlock(...);
} else {
block1();
}
```
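As a concrete, compilable illustration of the restructured form, here is a small sketch. It assumes a hypothetical stand-in lock (`struct sketch_lock`, a flag rather than a real mutex) and a hypothetical helper `sketch_block1()`; none of these names come from the perf sources:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a real mutex: just records whether it is held. */
struct sketch_lock {
	bool held;
};

static struct sketch_lock sketch_lock;
static int sketch_counter;

static void sketch_lock_acquire(struct sketch_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void sketch_lock_release(struct sketch_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* "block1" factored into a helper so that both branches can call it. */
static void sketch_block1(void)
{
	sketch_counter++;
}

/* Acquire and release sit in the same branch, so every path is balanced. */
static void sketch_update(bool need_lock)
{
	if (need_lock) {
		sketch_lock_acquire(&sketch_lock);
		sketch_block1();
		sketch_lock_release(&sketch_lock);
	} else {
		sketch_block1();
	}
}
```

The cost of this shape is a duplicated call to the helper; the benefit is that each path through the function either takes and drops the lock or never touches it, which is the property a per-path lock analysis checks.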
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Fei Lang <langfei@huawei.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/r/20250519224645.1810891-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
4140e2b31b |
tools headers: Synchronize prctl.h ABI header
The prctl.h ABI header was slightly updated during the development of the interface. In particular the "immutable" parameter became a bit in the option argument. Synchronize prctl.h ABI header again and make use of the definition in the testsuite and "perf bench futex". Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: André Almeida <andrealmeid@igalia.com> Link: https://lore.kernel.org/r/20250517151455.1065363-5-bigeasy@linutronix.de |
||
|
|
ba5f102eec |
perf ftrace: Use process/session specific trace settings
Executing the 'perf ftrace' commands 'ftrace', 'profile' and 'latency' leaves tracing disabled, as can be seen in this output:
# echo 1 > /sys/kernel/debug/tracing/tracing_on
# cat /sys/kernel/debug/tracing/tracing_on
1
# perf ftrace trace --graph-opts depth=5 sleep 0.1 > /dev/null
# cat /sys/kernel/debug/tracing/tracing_on
0
#
The 'tracing_on' file is not restored to its value before the command. To fix that this patch uses the .../tracing/instances/XXX subdirectory feature. Each 'perf ftrace' invocation creates its own session/process specific subdirectory and does not change the global state in the .../tracing directory itself.
Use rmdir(../tracing/instances/dir) to stop process/session specific tracing and delete all process/session specific settings.
Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20250520093726.2009696-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
b705ca3d24 |
tools include UAPI: Sync linux/vhost.h with the kernel sources
To get the changes in:
|
||
|
|
735a3ac370 |
perf test probe_vfs_getname: Add regex for searching probe line
Since commit |
||
|
|
8cdf00b843 |
perf record: Fix an asan runtime error in util/maps.c
If I build perf with asan and run Zstd test: $ make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g -fno-omit-frame-pointer -fsanitize=undefined" $ /tmp/perf/perf test "Zstd perf.data compression/decompression" -vv 83: Zstd perf.data compression/decompression: ... util/maps.c:1046:5: runtime error: null pointer passed as argument 2, which is declared to never be null ... The issue was caused by `bsearch`. The patch adds a check to ensure argument 2 and 3 are not NULL and 0. Testing with the commands above confirms that the runtime error is resolved. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303183646.327510-2-ctshao@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
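The guard described in the util/maps.c fix above can be sketched as follows. This is an illustration under assumptions, not the code from the patch: `sketch_find()` and `sketch_cmp_int()` are hypothetical names, and the array holds plain ints rather than maps. The point is only that bsearch() is never handed a NULL base or a zero element count:

```c
#include <stddef.h>
#include <stdlib.h>

/* Three-way comparison of the search key against one array element. */
static int sketch_cmp_int(const void *key, const void *elem)
{
	int k = *(const int *)key;
	int e = *(const int *)elem;

	return (k > e) - (k < e);
}

/*
 * Return a pointer to the matching element, or NULL if there is none.
 * The early return avoids calling bsearch() with a NULL array.
 */
static const int *sketch_find(const int *base, size_t nr, int key)
{
	if (base == NULL || nr == 0)
		return NULL;

	return bsearch(&key, base, nr, sizeof(*base), sketch_cmp_int);
}
```

An empty lookup then behaves the same as a miss, so callers need no special case.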
||
|
|
208c0e1683 |
perf record: Add 8-byte aligned event type PERF_RECORD_COMPRESSED2
The original PERF_RECORD_COMPRESSED is not 8-byte aligned, which can cause
asan runtime errors:
# Build with asan
$ make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g -fno-omit-frame-pointer -fsanitize=undefined"
# Test success with many asan runtime errors:
$ /tmp/perf/perf test "Zstd perf.data compression/decompression" -vv
83: Zstd perf.data compression/decompression:
...
util/session.c:1959:13: runtime error: member access within misaligned address 0x7f69e3f99653 for type 'union perf_event', which requires 8 byte alignment
0x7f69e3f99653: note: pointer points here
d0 3a 50 69 44 00 00 00 00 00 08 00 bb 07 00 00 00 00 00 00 44 00 00 00 00 00 00 00 ff 07 00 00
^
util/session.c:2163:22: runtime error: member access within misaligned address 0x7f69e3f99653 for type 'union perf_event', which requires 8 byte alignment
0x7f69e3f99653: note: pointer points here
d0 3a 50 69 44 00 00 00 00 00 08 00 bb 07 00 00 00 00 00 00 44 00 00 00 00 00 00 00 ff 07 00 00
^
...
Since there is no way to align compressed data in zstd compression, this
patch adds a new event type `PERF_RECORD_COMPRESSED2`, which adds a field
`data_size` to specify the actual compressed data size.
The `header.size` contains the total record size, including the padding
at the end to make it 8-byte aligned.
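The size arithmetic can be sketched as below. The struct layouts are assumptions modelled on the description above (a header followed by a `data_size` field and then the payload), and the `sketch_` names are hypothetical; the real definitions live in the perf sources:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header layout: 8 bytes in total. */
struct sketch_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;		/* total record size, padding included */
};

/* Hypothetical record layout: the compressed payload follows data_size. */
struct sketch_compressed2 {
	struct sketch_header header;
	uint64_t data_size;	/* compressed payload bytes, padding excluded */
};

/* Round n up to the next multiple of 8. */
static size_t sketch_align8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/* Value to store in header.size for a payload of data_size bytes. */
static size_t sketch_record_size(size_t data_size)
{
	return sketch_align8(sizeof(struct sketch_compressed2) + data_size);
}
```

With these assumed layouts, a reader would consume `data_size` bytes of payload and then skip to `header.size` to find the next record, which starts on an 8-byte boundary as long as the current one does.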
Tested with `Zstd perf.data compression/decompression`
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250303183646.327510-1-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
bcfab08db7 |
perf intel-tpebs: Filter non-workload samples
If perf is running with a benchmark then we want the retirement latency samples associated with the benchmark rather than from the system as a whole. Use the workload's PID to filter out samples that aren't from the workload or its children. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250430200108.243234-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
1c5721ca89 |
perf test: Allow tolerance for leader sampling test
There is a known issue that leader sampling is inconsistent, since throttling only affects the leader, not the slave. The details are in [1]. To maintain test coverage, this patch sets a tolerance rate of 80% to accommodate the throttled samples and prevent test failures due to throttling.
[1] lore.kernel.org/20250328182752.769662-1-ctshao@google.com
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Co-developed-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20250430140611.599078-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
cb422594d6 |
perf test: Add stat uniquifying test
The `stat+uniquify.sh` test retrieves all uniquified `clockticks` events
from `perf list -v clockticks` and checks whether `perf stat -e clockticks -A`
contains all of them.
Committer testing:
root@x1:~# grep -m1 "model name" /proc/cpuinfo
model name : 13th Gen Intel(R) Core(TM) i7-1365U
root@x1:~# perf list clockticks
List of pre-defined events (to be used in -e or -M):
uncore_clock/clockticks/ [Kernel PMU event]
uncore memory:
unc_m_clockticks
[Number of clocks. Unit: uncore_imc]
root@x1:~#
root@x1:~# perf test uniquifying
92: perf stat events uniquifying : Ok
root@x1:~# perf test -vv uniquifying
92: perf stat events uniquifying:
--- start ---
test child forked, pid 1552628
stat event uniquifying test
---- end(0) ----
92: perf stat events uniquifying : Ok
root@x1:~#
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250513215401.2315949-4-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
137359b789 |
perf parse-events: Use wildcard processing to set an event to merge into
The merge stat code fails for uncore events if they are repeated twice, for example `perf stat -e clockticks,clockticks -I 1000` as the counts of the second set of uncore events will be merged into the first counter. Reimplement the logic to have a first_wildcard_match so that merged later events correctly merge into the first wildcard event that they will be aggregated into. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Chun-Tse Shao <ctshao@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250513215401.2315949-3-ctshao@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |