Commit Graph

534 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
d3e01be6da perf symbols: Make variable receiving result strrchr() const
Fixing:

  util/symbol.c: In function ‘symbol__config_symfs’:
  util/symbol.c:2499:20: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
   2499 |         layout_str = strrchr(dir, ',');
        |

With recent gcc/glibc.
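
A minimal sketch of the pattern behind this fix, assuming a hypothetical helper layout_suffix() that is not the actual perf code: when the input string is const, the variable receiving the strrchr() result is declared const as well, so no qualifier is discarded whether or not the C library provides a const-preserving strrchr().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: return the text after the
 * last ',' in dir, or NULL when dir contains no ','.
 */
static const char *layout_suffix(const char *dir)
{
	/* const matches the pointed-to data, so nothing is discarded. */
	const char *layout_str = strrchr(dir, ',');

	return layout_str ? layout_str + 1 : NULL;
}
```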

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2026-04-08 19:21:04 -07:00
Thomas Richter
83674a7829 perf addr2line: Remove global variable addr2line_timeout_ms
Remove global variable addr2line_timeout_ms and add it as a member
to symbol_conf structure.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
[namhyung: move the initialization to util/symbol.c]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2026-04-08 10:28:49 -07:00
Ian Rogers
83c338369a libperf cpumap: Make index and nr types unsigned
The index into the cpumap array and the number of entries within the
array can never be negative, so let's make them unsigned. This is
prompted by reports that gcc 13 with -O6 is giving
alloc-size-larger-than errors. The change makes the cpumap changes and
then updates the declarations of index variables throughout perf and
libperf to be unsigned. The two things are hard to separate because
compiler warnings about mixing signed and unsigned types break the
build.
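
As a rough illustration of the idea, and not the libperf definitions, the sketch below uses a hypothetical struct cpu_map_sketch whose entry count and loop index are both unsigned, so the allocation size can never be derived from a negative count and the loop comparison does not mix signedness.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a CPU map; not the libperf definition. */
struct cpu_map_sketch {
	unsigned int nr;	/* number of entries, never negative */
	int cpus[];		/* CPU numbers */
};

static struct cpu_map_sketch *cpu_map_sketch__alloc(unsigned int nr)
{
	/* nr is unsigned, so the size cannot come from a negative count. */
	struct cpu_map_sketch *map = malloc(sizeof(*map) + nr * sizeof(int));

	if (map)
		map->nr = nr;
	return map;
}

static long cpu_map_sketch__sum(const struct cpu_map_sketch *map)
{
	long sum = 0;

	/* The index has the same unsigned type as nr: no mixed comparison. */
	for (unsigned int idx = 0; idx < map->nr; idx++)
		sum += map->cpus[idx];
	return sum;
}
```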

Reported-by: Chingbin Li <liqb365@163.com>
Closes: https://lore.kernel.org/lkml/20260212025127.841090-1-liqb365@163.com/
Tested-by: Chingbin Li <liqb365@163.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2026-04-01 14:50:53 -07:00
Chen Ni
ebbc5ce26e perf tools: Remove duplicate include of debug.h
Remove duplicate inclusion of debug.h in symbol.c to clean up redundant
code.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2026-03-18 11:51:13 -07:00
Changbin Du
f182573e06 perf tools: Add layout support for --symfs option
Add support for parsing an optional layout parameter in the --symfs
command line option. The format is:

  --symfs <directory[,layout]>

Where layout can be:
  - 'hierarchy': matches full path (default)
  - 'flat': only matches base name

When debugging symbol files from a copy of the filesystem (e.g., from a
container or remote machine), the debug files are often stored in a
flat directory structure with only filenames, not the full original
paths. In this case, using 'flat' layout allows perf to find debug
symbols by matching only the filename rather than the full path.

For example, given a binary path like:
  /build/output/lib/foo.so

With 'perf report --symfs /debug/files,flat', perf will look for:
  /debug/files/foo.so

Instead of:
  /debug/files/build/output/lib/foo.so

This is particularly useful when:
- Extracting debug files from containers with different directory layouts
- Working with build systems that flatten directory structures

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2026-03-10 23:13:30 -07:00
Gary Guo
623ba6ea45 perf symbol: Remove Rust symbol workarounds
Due to an off-by-one error introduced in commit 73bbb94466
("kallsyms: support "big" kernel symbols"), long symbols (which are
currently only produced by Rust) can have their symbol type
wrongly parsed by kernel/kallsyms.c.

This has been fixed in commit f3f9f42232 ("kallsyms: Fix wrong
"big" kernel symbol type read from procfs"), and these symbols are now
reported correctly.

Drop the workaround in perf symbol that filters out these symbol types.

Specifically, '1' and 'l' can never be generated by nm -- 'u' does
indicate GNU unique, however such symbols are only generated by G++ for
C++ templates, and are never generated by LLVM (LLVM generates weak
symbols in such cases instead).

'N' can appear if symbols exist inside debug sections, and 'n' may
appear for symbols inside note sections, however these sections do not
typically have symbols (and they're explicitly filtered out by kallsyms).

Therefore, the previous occurrence of these symbol types must be due to
the off-by-one error and can be safely removed.
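
A minimal sketch of what a type filter looks like once the workarounds are gone. The function name and the exact set of accepted letters (text, weak, data, bss) are assumptions of this example, not a quote of the perf source.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Hypothetical filter, for illustration: accept only the nm-style type
 * letters for text (T), weak (W), data (D) and bss (B) symbols, in either
 * case. The accepted set is an assumption of this sketch.
 */
static bool symbol_type_accepted(char type)
{
	switch (toupper((unsigned char)type)) {
	case 'T':
	case 'W':
	case 'D':
	case 'B':
		return true;
	default:
		/* '1', 'l', 'u', 'N' and 'n' get no special casing. */
		return false;
	}
}
```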

Signed-off-by: Gary Guo <gary@garyguo.net>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Benno Lossin <lossin@kernel.org>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-05 12:44:09 -03:00
Linus Torvalds
9e906a9dea [GIT PULL] perf tools changes for v6.19
Merge tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
 "Perf event/metric description:

  Unify all event and metric descriptions in JSON format. Now event
  parsing and handling is greatly simplified by that.

  From a user's point of view, perf list will provide richer information
  about hardware events, like the following.

    $ perf list hw

    List of pre-defined events (to be used in -e or -M):

    legacy hardware:
      branch-instructions
           [Retired branch instructions [This event is an alias of branches]. Unit: cpu]
      branch-misses
           [Mispredicted branch instructions. Unit: cpu]
      branches
           [Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu]
      bus-cycles
           [Bus cycles,which can be different from total cycles. Unit: cpu]
      cache-misses
           [Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the
            PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu]
      cache-references
           [Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include
            prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu]
      cpu-cycles
           [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu]
      cycles
           [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu]
      instructions
           [Retired instructions. Be careful,these can be affected by various issues,most notably hardware interrupt counts. Unit: cpu]
      ref-cycles
           [Total cycles; not affected by CPU frequency scaling. Unit: cpu]

  But the most notable changes are in perf stat. On the right side,
  the default metrics are better named and aligned. :)

    $ perf stat -- perf test -w noploop

     Performance counter stats for 'perf test -w noploop':

                    11      context-switches                 #     10.8 cs/sec  cs_per_second
                     0      cpu-migrations                   #      0.0 migrations/sec  migrations_per_second
                 3,612      page-faults                      #   3532.5 faults/sec  page_faults_per_second
              1,022.51 msec task-clock                       #      1.0 CPUs  CPUs_utilized
               110,466      branch-misses                    #      0.0 %  branch_miss_rate         (88.66%)
         6,934,452,104      branches                         #   6781.8 M/sec  branch_frequency     (88.66%)
         4,657,032,590      cpu-cycles                       #      4.6 GHz  cycles_frequency       (88.65%)
        27,755,874,218      instructions                     #      6.0 instructions  insn_per_cycle  (89.03%)
                            TopdownL1                        #      0.3 %  tma_backend_bound
                                                             #      9.3 %  tma_bad_speculation      (89.05%)
                                                             #      9.7 %  tma_frontend_bound       (77.86%)
                                                             #     80.7 %  tma_retiring             (88.81%)

           1.025318171 seconds time elapsed

           1.013248000 seconds user
           0.012014000 seconds sys

  Deferred unwinding support:

  With the kernel support (commit c69993ecdd4d: "perf: Support deferred
  user unwind"), perf can use deferred callchains for userspace stack
  traces with frame pointers, like below:

    $ perf record --call-graph fp,defer ...

  This will be transparent to users when it comes to other commands like
  perf report and perf script. They will merge the deferred callchains
  to the previous samples as if they were collected together.

  ARM SPE updates

   - Extensive enhancements to support various kinds of memory
     operations including GCS, MTE allocation tags, memcpy/memset,
     register access, and SIMD operations.

   - Add inverted data source filter (inv_data_src_filter) support to
     exclude certain data sources.

   - Improve documentation.

  Vendor event updates:

   - Intel: Updated event files for Sierra Forest, Panther Lake, Meteor
     Lake, Lunar Lake, Granite Rapids, and others.

   - Arm64: Added metrics for i.MX94 DDR PMU and Cortex-A720AE
     definitions.

   - RISC-V: Added JSON support for T-HEAD C920V2.

  Misc:

   - Improve pointer tracking in data type profiling. It'd give better
     output when the variable is using container_of() to convert type.

   - Annotation support for perf c2c report in TUI. Press 'a' key to
     enter annotation view from cacheline browser window. This will show
     which instruction is causing the cacheline contention.

   - Lots of fixes and test coverage improvements!"

* tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (214 commits)
  libperf: Use 'extern' in LIBPERF_API visibility macro
  perf stat: Improve handling of termination by signal
  perf tests stat: Add test for error for an offline CPU
  perf stat: When no events, don't report an error if there is none
  perf tests stat: Add "--null" coverage
  perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask
  libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map
  perf stat: Allow no events to open if this is a "--null" run
  perf test kvm: Add some basic perf kvm test coverage
  perf tests evlist: Add basic evlist test
  perf tests script dlfilter: Add a dlfilter test
  perf tests kallsyms: Add basic kallsyms test
  perf tests timechart: Add a perf timechart test
  perf tests top: Add basic perf top coverage test
  perf tests buildid: Add purge and remove testing
  perf tests c2c: Add a basic c2c
  perf c2c: Clean up some defensive gets and make asan clean
  perf jitdump: Fix missed dso__put
  perf mem-events: Don't leak online CPU map
  perf hist: In init, ensure mem_info is put on error paths
  ...
2025-12-07 07:07:02 -08:00
Ian Rogers
b4e44399eb perf symbol: Add missed dso__put
Add missing dso__put for the dso created in maps__split_kallsyms.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-03 11:07:23 -08:00
Namhyung Kim
4fba95fc38 perf tools: Use machine->root_dir to find /proc/kallsyms
This lets test functions find the kallsyms correctly.  perf can find
the machine from the kernel maps and use its root_dir.  This is helpful
for setting up a fake /proc directory for testing.
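
A minimal sketch of the idea, assuming a hypothetical helper kallsyms_path() rather than the actual perf function: the kallsyms location is derived from a root directory, so a test can point it at a fake tree.

```c
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only: build the kallsyms path
 * below root_dir. A NULL root_dir is treated as the host root.
 */
static int kallsyms_path(char *buf, size_t size, const char *root_dir)
{
	return snprintf(buf, size, "%s/proc/kallsyms",
			root_dir ? root_dir : "");
}
```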

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02 21:59:14 -08:00
Namhyung Kim
295d8a03ca perf tools: Fallback to initial kernel map properly
maps__split_kallsyms() assumes a new kernel map when it finds a
symbol without a module after any module and the initial kernel map
already has some symbols.  It expects modules to lie outside the kernel
map, so modules should not have symbols in the kernel map.

For example, the following memory map shows symbols and maps.  Any
symbols in the module 1 area will go to module 1.  The main kernel
map starts at 0xffffffffbc200000.  But if a symbol with a module
appears among the symbols in that area, the symbols after
0xffffffffbd008000 will generate new kernel maps like [kernel].1.

   kernel address   |                     |
                    |                     |
 0xffffffffc0000000 |---------------------|
                    |     (symbols)       |
                    |        ...          |   <---  [kernel].N
 0xffffffffbc400000 |---------------------|
                    |     (symbols)       |
                    |      module 2       |   <---  bad?
 0xffffffffbc380000 |---------------------|
                    |        ...          |
                    |     (symbols)       |
                    |  [kernel.kallsyms]  |   <---  initial map
 0xffffffffbc200000 |---------------------|
                    |                     |
                    |                     |
 0xffffffffabcde000 |---------------------|
                    |     (symbols)       |
                    |      module 1       |
 0xffffffffabcd0000 |---------------------|

This is very fragile when the module has a symbol that falls into the
main kernel map for some reason.  My system has a livepatch module with
such symbols.  And it created a lot of new kernel maps after those
symbols.  But the symbol may have broken addresses and the later symbols
can still be found in the initial kernel map.

Let's check the symbol address in the initial map and use it if found.
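
The check can be sketched as a simple range test. The types and names below are hypothetical, and the end address of the initial map used in the usage note is an assumption made for the example; the real code works on perf's map objects.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open address range [start, end) of a map. */
struct addr_range {
	uint64_t start;
	uint64_t end;
};

enum map_choice {
	USE_INITIAL_MAP,	/* symbol falls inside the initial kernel map */
	NEEDS_OTHER_MAP,	/* symbol lies outside of it */
};

static bool addr_range__contains(const struct addr_range *r, uint64_t addr)
{
	return addr >= r->start && addr < r->end;
}

/* Prefer the initial kernel map whenever it covers the symbol address. */
static enum map_choice pick_map(const struct addr_range *initial,
				uint64_t sym_addr)
{
	return addr_range__contains(initial, sym_addr) ? USE_INITIAL_MAP
							: NEEDS_OTHER_MAP;
}
```

Assuming the initial map spans 0xffffffffbc200000 to 0xffffffffc0000000, a symbol at 0xffffffffbd008000 stays in the initial map even if a module symbol was seen before it.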

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02 21:59:14 -08:00
Namhyung Kim
ad0b9c4865 perf tools: Fix split kallsyms DSO counting
It's counted twice as it's increased after calling maps__insert().  I
guess we want to increase it only after it's added properly.

Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 2e538c4a18 ("perf tools: Improve kernel/modules symbol lookup")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02 21:59:14 -08:00
Namhyung Kim
7da4d60db3 perf tools: Mark split kallsyms DSOs as loaded
maps__split_kallsyms() splits symbols into module DSOs if they come
from a module.  It also handles some unusual kernel symbols after modules
by creating new kernel maps like "[kernel].0".

But those are pseudo DSOs that hold the unexpected symbols.  They should
not be considered unloaded kernel DSOs.  Otherwise dso__load()
for them will end up calling dso__load_kallsyms() and then
maps__split_kallsyms() again and again.

Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 2e538c4a18 ("perf tools: Improve kernel/modules symbol lookup")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-12-02 21:59:14 -08:00
James Clark
834ebb5678 perf tools: Don't read build-ids from non-regular files
Simplify the build ID reading code by removing the non-blocking option.
Having to pass the correct option to this function was fragile and a
mistake would result in a hang, see the linked fix. Furthermore,
compressed files are always opened blocking anyway, ignoring the
non-blocking option.

We also don't expect to read build IDs from non-regular files. The only
hits to this function that are non-regular are devices that won't be elf
files with build IDs, for example "/dev/dri/renderD129".

Now instead of opening these as non-blocking and failing to read, we
skip them. Even if something like a pipe or character device did have a
build ID, I don't think it would have worked because you need to call
read() in a loop, check for -EAGAIN and handle timeouts to make
non-blocking reads work.
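
A minimal sketch of the kind of check that lets a reader skip non-regular files, using only stat() and S_ISREG(). The helper name path_is_regular_file() is hypothetical and this is not the perf implementation.

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: true only when path can be stat()ed and refers to
 * a regular file. Devices, pipes and directories are rejected.
 */
static bool path_is_regular_file(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;
	return S_ISREG(st.st_mode);
}
```

A caller would test the path first and skip the build ID read entirely when the function returns false, as would happen for a device node such as "/dev/dri/renderD129".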

Link: https://lore.kernel.org/linux-perf-users/20251022-james-perf-fix-dso-block-v1-1-c4faab150546@linaro.org/
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-26 10:13:38 -08:00
Arnaldo Carvalho de Melo
7f17ef0d47 perf symbols: Handle '1' symbols in /proc/kallsyms
I started seeing this in recent Fedora 42 kernels:

  root@x1:~# uname -a
  Linux x1 6.17.4-200.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Oct 19 18:47:49 UTC 2025 x86_64 GNU/Linux
  root@x1:~#

  root@x1:~# perf test 1
    1: vmlinux symtab matches kallsyms     : FAILED!
  root@x1:~#

Related to:

  root@x1:~# grep ' 1 ' /proc/kallsyms
  ffffffffb098bc00 1 __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  ffffffffb098bc10 1 _RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  root@x1:~#

That is found in:

  root@x1:~# pahole --running_kernel_vmlinux
  /usr/lib/debug/lib/modules/6.17.4-200.fc42.x86_64/vmlinux
  root@x1:~#

  root@x1:~# readelf -sW /usr/lib/debug/lib/modules/6.17.4-200.fc42.x86_64/vmlinux | grep __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  150649: ffffffff81f8bc00    16 FUNC    LOCAL  DEFAULT    1 __pfx__RNCINvNtNtNtCsfwaGRd4cjqE_4core4iter8adapters3map12map_try_foldjNtCskFudTml27HW_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  root@x1:~#

But it was being filtered out when reading /proc/kallsyms because the
'1' symbol type was not being handled.  Handle it; there are just two
such symbols at this point.

Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Benno Lossin <lossin@kernel.org>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-03 14:54:31 -03:00
Ian Rogers
95931d9a59 perf libbfd: Move libbfd functionality to its own file
Move symbolization and srcline libbfd dependencies to a separate
libbfd.c. This mirrors moving llvm and capstone code. While this code
is deprecated as it is part of BUILD_NONDISTRO license incompatible
code, moving the code to its own file minimizes disruption in the main
files.

disasm_bpf.c is also moved to libbfd.c, except for
symbol__disassemble_bpf_image, which is currently more of a placeholder
function than something that provides disassembly support.

demangle-cxx.cpp code isn't migrated as it is very limited.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02 15:39:44 -03:00
Arnaldo Carvalho de Melo
945f500361 perf symbols: Handle 'N' symbols in /proc/kallsyms
I started seeing this in recent Fedora 42 kernels:

  # uname -a
  Linux number 6.16.3-200.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Aug 23 17:02:17 UTC 2025 x86_64 GNU/Linux
  #
  # perf test vmlinux
    1: vmlinux symtab matches kallsyms                  : FAILED!
  #

Rust is enabled and these were the symbols causing the above failure,
i.e. found in vmlinux but not in /proc/kallsyms:

  $ grep -w N /proc/kallsyms
  0000000000000000 N __pfx__RNCINvNtNtNtCsbDUBuN8AbD4_4core4iter8adapters3map12map_try_foldjNtCs6vVzKs5jPr6_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  0000000000000000 N _RNCINvNtNtNtCsbDUBuN8AbD4_4core4iter8adapters3map12map_try_foldjNtCs6vVzKs5jPr6_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  $

So accept those 'N' symbols as well.

About them, from 'man nm':

           "N" The symbol is a debugging symbol.

           "n" The symbol is in a non-data, non-code, non-debug read-only section.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-09-09 10:42:05 -03:00
Ian Rogers
2c369d91d0 perf symbol: Add blocking argument to filename__read_build_id
When synthesizing build IDs for build ID mmap2 events, they will also
be added for data mmaps if -d/--data is specified. The files opened for
their build IDs may block on the open, causing perf to hang during
synthesis. There is some robustness in existing calls to
filename__read_build_id by checking the file path is to a regular
file, which unfortunately fails for symlinks. Rather than adding more
is_regular_file calls, switch filename__read_build_id to take a
"block" argument and specify O_NONBLOCK when this is false. The
existing is_regular_file checking callers and the event synthesis
callers are made to pass false, thereby avoiding the hang.
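
The effect of the "block" argument can be sketched as a choice of open() flags. The helper name build_id_open_flags() is hypothetical; only O_RDONLY and O_NONBLOCK are taken from POSIX.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper: compute open() flags for reading a build ID.
 * When block is false, O_NONBLOCK is added so the open cannot hang.
 */
static int build_id_open_flags(bool block)
{
	int flags = O_RDONLY;

	if (!block)
		flags |= O_NONBLOCK;
	return flags;
}
```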

Fixes: 53b00ff358 ("perf record: Make --buildid-mmap the default")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250823000024.724394-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-08-25 15:07:18 -07:00
Ian Rogers
eee4b66105 perf build-id: Ensure struct build_id is empty before use
If a build ID is read then not all code paths may ensure it is empty
before use. Initialize the build_id to be zero-ed unless there is
clear initialization such as a call to build_id__init.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-25 10:37:55 -07:00
Ian Rogers
fccaaf6fbb perf build-id: Change sprintf functions to snprintf
Pass in a size argument rather than implying all build id strings must
be SBUILD_ID_SIZE.
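
A minimal sketch of the size-aware style, assuming a hypothetical formatter build_id_hex() that is not the perf function: the caller passes the buffer size, and the formatter never writes past it.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical formatter: write data as lowercase hex into bf, which holds
 * bf_size bytes. Stops early when another byte would not fit together
 * with the terminating NUL. Returns the number of characters written.
 */
static size_t build_id_hex(const unsigned char *data, size_t len,
			   char *bf, size_t bf_size)
{
	size_t off = 0;

	if (bf_size == 0)
		return 0;
	bf[0] = '\0';
	/* Each byte needs two characters plus room for the NUL. */
	for (size_t i = 0; i < len && off + 2 < bf_size; i++)
		off += (size_t)snprintf(bf + off, bf_size - off, "%02x", data[i]);
	return off;
}
```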

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-4-irogers@google.com
[ fixed some build errors ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-25 10:37:13 -07:00
Ian Rogers
63a088e999 perf dso: Add missed dso__put to dso__load_kcore
The kcore loading creates a set of list nodes that have reference
counted references to maps of the kcore. The list node freeing in the
success path wasn't releasing the maps; add the missing puts. It is
unclear why this leak was being missed by leak sanitizer.
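
The leak pattern can be sketched with a toy reference count. All types and names below are hypothetical stand-ins, not perf's map or list code: each list node holds a reference, so freeing a node must also drop that reference.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for a map. */
struct map_sketch {
	int refcnt;
};

static struct map_sketch *map_sketch__get(struct map_sketch *m)
{
	m->refcnt++;
	return m;
}

static void map_sketch__put(struct map_sketch *m)
{
	if (--m->refcnt == 0)
		free(m);
}

/* Hypothetical list node that owns one reference to a map. */
struct map_node {
	struct map_node *next;
	struct map_sketch *map;
};

static void map_list__free(struct map_node *head)
{
	while (head) {
		struct map_node *next = head->next;

		/* Drop the node's reference; omitting this leaks the map. */
		map_sketch__put(head->map);
		free(head);
		head = next;
	}
}
```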

Fixes: 8372020996 ("perf map: Move map list node into symbol")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624190326.2038704-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-02 19:05:26 -07:00
Namhyung Kim
ef0f7c235e perf build: Fix a build error on REFCNT_CHECKING=1
Recently -fno-strict-aliasing was added to sync with the kernel behavior.
But it caused an error due to potential uninitialized access like below:

  In file included from util/symbol.c:27:
  In function ‘dso__set_symbol_names_len’,
      inlined from ‘dso__sort_by_name’ at util/symbol.c:638:4:
  util/dso.h:654:46: error: ‘len’ may be used uninitialized [-Werror=maybe-uninitialized]
    654 |         RC_CHK_ACCESS(dso)->symbol_names_len = len;
        |                                              ^
  util/symbol.c: In function ‘dso__sort_by_name’:
  util/symbol.c:634:24: note: ‘len’ was declared here
    634 |                 size_t len;
        |                        ^~~

Let's just initialize it with 0.

Fixes: 55a18d2f3f ("perf build: enable -fno-strict-aliasing")
Closes: https://lore.kernel.org/r/aF7JC8zkG5-_-nY_@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-27 11:45:48 -07:00
Ian Rogers
4d9b5146f0 perf symbol: Move demangling code out of symbol-elf.c
symbol-elf.c is used when building with libelf, symbol-minimal is used
otherwise.

There is no reason for the demangling code, which has no dependencies on
libelf, to be part of symbol-elf.c, so move it to symbol.c.

This allows demangling tests to pass with NO_LIBELF=1.

Structurally, while moving the functions, rename demangle_sym() to
dso__demangle_sym(), which is already a function exposed in symbol.h and
whose only purpose in symbol-elf.c was to call demangle_sym().

Change the calls to demangle_sym() in symbol-elf.c to calls to
dso__demangle_sym().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250528210858.499898-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 19:02:58 -03:00
Arnaldo Carvalho de Melo
4d728bb93b perf symbols: Handle 'u' and 'l' symbols in /proc/kallsyms
I started seeing this in recent Fedora 42 kernels:

  # uname -a
  Linux number 6.14.3-300.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Apr 20 16:08:39 UTC 2025 x86_64 GNU/Linux
  #

  # perf test vmlinux
    1: vmlinux symtab matches kallsyms                                 : FAILED!
  #

Where we have Rust enabled:

  # grep CONFIG_RUST /boot/config-6.14.3-300.fc42.x86_64
  CONFIG_RUSTC_VERSION=108600
  CONFIG_RUST_IS_AVAILABLE=y
  CONFIG_RUSTC_LLVM_VERSION=200101
  CONFIG_RUSTC_HAS_COERCE_POINTEE=y
  CONFIG_RUST=y
  CONFIG_RUSTC_VERSION_TEXT="rustc 1.86.0 (05f9846f8 2025-03-31) (Fedora 1.86.0-1.fc42)"
  CONFIG_RUST_FW_LOADER_ABSTRACTIONS=y
  CONFIG_RUST_PHYLIB_ABSTRACTIONS=y
  # CONFIG_RUST_DEBUG_ASSERTIONS is not set
  CONFIG_RUST_OVERFLOW_CHECKS=y
  # CONFIG_RUST_BUILD_ASSERT_ALLOW is not set
  #

Looking at the reason for the failure:

  # perf test -v vmlinux |& grep ^ERR
  ERR : 0xffffffff99efc7d0: __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms
  ERR : 0xffffffff99efc7e0: _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms
  #

But:

  # grep -w u /proc/kallsyms
  ffffffff99efc7d0 u __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  ffffffff99efc7e0 u _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  #

The test checks that "vmlinux symtab matches kallsyms", so it finds those two
symbols in vmlinux:

  # pahole --running_kernel_vmlinux
  /usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux
  #

  # readelf -sW /usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux | grep -Ew '(__pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_|_RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_)'
 81844: ffffffff81efc7e0   524 FUNC    LOCAL  DEFAULT    1 _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
144259: ffffffff81efc7d0    16 FUNC    LOCAL  DEFAULT    1 __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  #

It is there.

From the nm documentation we can see that:

           "U" The symbol is undefined.

           "u" The symbol is a unique global symbol.  This is a GNU extension to the
	       standard set of ELF symbol bindings.  For such a symbol the dynamic
	       linker will make sure that in the entire process there is just one
	       symbol with this name and type in use.

So let's consider 'u' symbols in /proc/kallsyms when loading it to cover this case.

Fedora:40 shows this as an 'l' symbol, so consider that as well.
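
As a rough illustration of why case matters here, the sketch below
filters kallsyms type characters. The helper name and the baseline set
of accepted types (T, W, D, B in either case) are assumptions made for
this example, not a copy of the code in util/symbol.c:

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether a /proc/kallsyms type character
 * should be loaded.  The T/W/D/B baseline is an assumption.
 */
static bool kallsyms_type_is_wanted(char type)
{
	/*
	 * 'U' means undefined while 'u' means unique global, so these
	 * lowercase types must be tested before any case folding.
	 */
	if (type == 'u' || type == 'l')
		return true;

	switch (toupper((unsigned char)type)) {
	case 'T':
	case 'W':
	case 'D':
	case 'B':
		return true;
	default:
		return false;
	}
}
```

Folding the character first and then comparing against 'U' would
accept undefined symbols too, which is why the lowercase check comes
before the fold.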

With this patch 'perf test 1' is happy again:

  # perf test vmlinux
    1: vmlinux symtab matches kallsyms                                 : Ok
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/aBE_n0PGl3g6h-cS@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:44 -03:00
Stephen Brennan
b10f74308e perf symbol: Support .gnu_debugdata for symbols
Fedora introduced a "MiniDebuginfo" feature, in which an LZMA-compressed
ELF file is placed inside a section named ".gnu_debugdata". This file
contains nothing but a symbol table, which can be used to supplement the
.dynsym section which only contains required symbols for runtime.

It is supported by GDB for stack traces, but it should be useful for
tracing as well. Implement support for loading symbols from
.gnu_debugdata.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250307232206.2102440-4-stephen.s.brennan@oracle.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:37:06 -07:00
Dmitry Vyukov
61b6b31c2f perf report: Add parallelism filter
Add parallelism filter that can be used to look at specific parallelism
levels only. The format is the same as cpu lists. For example:

Only single-threaded samples: --parallelism=1
Low parallelism only: --parallelism=1-4
High parallelism only: --parallelism=64-128
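
A minimal sketch of how such a cpu-list-style value could be turned
into a per-level filter. The struct, the function name, the 1024 upper
bound and the rule that levels start at 1 are all assumptions for this
example; perf has its own list-parsing helpers, which are not used
here:

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LEVEL 1024	/* hypothetical upper bound for this sketch */

struct level_filter {
	bool allowed[MAX_LEVEL + 1];	/* index is the parallelism level */
};

/* Parse "1", "1-4" or "1-4,64-128"; return 0 or -EINVAL. */
static int level_filter__parse(struct level_filter *f, const char *spec)
{
	const char *p = spec;

	memset(f, 0, sizeof(*f));
	if (*p == '\0')
		return -EINVAL;

	while (*p != '\0') {
		char *end;
		unsigned long lo, hi;

		if (!isdigit((unsigned char)*p))
			return -EINVAL;
		lo = hi = strtoul(p, &end, 10);
		if (*end == '-') {
			if (!isdigit((unsigned char)end[1]))
				return -EINVAL;
			hi = strtoul(end + 1, &end, 10);
		}
		/* Assumption: valid levels are 1..MAX_LEVEL, ranges ascend. */
		if (lo < 1 || hi < lo || hi > MAX_LEVEL)
			return -EINVAL;
		for (unsigned long i = lo; i <= hi; i++)
			f->allowed[i] = true;

		if (*end == ',' && end[1] != '\0')
			end++;		/* another range follows */
		else if (*end != '\0')
			return -EINVAL;	/* junk or trailing comma */
		p = end;
	}
	return 0;
}
```

On error the filter may be partially filled, so a caller should treat
its contents as undefined unless 0 is returned.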

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/e61348985ff0a6a14b07c39e880edbd60a8f8635.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 14:04:32 -08:00
Namhyung Kim
8c2eafbbfd perf symbol: Prefer non-label symbols with same address
When more than one symbol is at the same address, perf needs to choose
the best one.  choose_best_symbol() didn't check the type of the
symbols.  It's possible to have labels inside other symbols, and in
that case it is better to pick the actual symbol over the label.
To minimize the possible impact on other symbols, I only check NOTYPE
symbols specifically.

  $ readelf -sW vmlinux | grep -e __do_softirq -e __softirqentry_text_start
  105089: ffffffff82000000   814 FUNC    GLOBAL DEFAULT    1 __do_softirq
  111954: ffffffff82000000     0 NOTYPE  GLOBAL DEFAULT    1 __softirqentry_text_start

Commit 77b004f4c5 tried to do the same by not giving a size to label
symbols, but it seems there are some label-only symbols in asm
code.  Let's restore the original code and choose the right symbol using
the type of the symbols.
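
The type-based preference can be sketched as below.  The enum, struct
and function names are made up for this example, and only the NOTYPE
rule is shown; the real choose_best_symbol() applies further criteria
that are not reproduced here:

```c
#include <stdbool.h>

/* Simplified stand-ins for the ELF symbol types (assumed names). */
enum sym_type { SYM_NOTYPE, SYM_FUNC, SYM_OBJECT };

struct sym {
	const char *name;
	unsigned long long start;
	enum sym_type type;
};

/*
 * Of two symbols at the same address, prefer the one with a real
 * type over a NOTYPE label.  Otherwise keep the first one.
 */
static const struct sym *prefer_typed(const struct sym *a,
				      const struct sym *b)
{
	if (a->type == SYM_NOTYPE && b->type != SYM_NOTYPE)
		return b;
	return a;
}
```

With the readelf output above, __do_softirq (FUNC) wins over
__softirqentry_text_start (NOTYPE) regardless of argument order.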

Fixes: 77b004f4c5 ("perf symbol: Do not fixup end address of labels")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: http://lore.kernel.org/lkml/Z3b-DqBMnNb4ucEm@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Namhyung Kim
77b679453d Linux 6.12-rc3
Merge tag 'v6.12-rc3' into perf-tools-next

To get the fixes in the current perf-tools tree.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14 10:45:28 -07:00
Namhyung Kim
77b004f4c5 perf symbol: Do not fixup end address of labels
When perf loads symbols from an ELF file, it also loads label symbols,
which have size 0.  Sometimes a label has the same address as another
symbol and might shadow the original symbol because perf fixes up the
size of the label.

For example, on my system __do_softirq is shadowed and only
__softirqentry_text_start is accepted.  But it should accept __do_softirq.

  $ readelf -sW vmlinux | grep -e __do_softirq -e __softirqentry_text_start
  105089: ffffffff82000000   814 FUNC    GLOBAL DEFAULT    1 __do_softirq
  111954: ffffffff82000000     0 NOTYPE  GLOBAL DEFAULT    1 __softirqentry_text_start

  $ perf annotate --stdio __do_softirq
  Error:
  The perf.data data has no samples!

  $ perf annotate --stdio __softirqentry_text_start | head
   Percent |	Source code & Disassembly of vmlinux for cycles (26 samples, percent: local period)
  ---------------------------------------------------------------------------------------------------
           : 0                0xffffffff82000000 <__softirqentry_text_start>:
      0.00 :   ffffffff82000000:        nopl    (%rax,%rax)
     30.77 :   ffffffff82000005:        pushq   %rbp
      3.85 :   ffffffff82000006:        movq    %rsp, %rbp
      0.00 :   ffffffff82000009:        pushq   %r15
      3.85 :   ffffffff8200000b:        pushq   %r14
      3.85 :   ffffffff8200000d:        pushq   %r13
      0.00 :   ffffffff8200000f:        pushq   %r12

We can ignore NOTYPE symbols in the symbols__fixup_end() so that it can
pick the __do_softirq() in choose_best_symbol().  This should be fine
since most symbols have either STT_FUNC or STT_OBJECT.

Link: https://lore.kernel.org/r/20240912224208.3360116-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-25 22:37:25 -07:00
Namhyung Kim
5363c30678 perf symbol: Set binary_type of dso when loading
For the kernel dso, perf sets the binary type of the dso when loading
the symbol table.  But it doesn't seem to do that for user DSOs; it
only sets the symtab type.  It's not clear why we want to maintain
the two separately, but the binary type info is used before getting
the disassembly.

Let's use the symtab type as the binary type too if it's not set.  I
think it's ok to set the binary type when it finds a symsrc whether or
not it has actual symbols.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Alexander Monakov <amonakov@ispras.ru>
Link: https://lore.kernel.org/r/20240426215139.1271039-1-namhyung@kernel.org
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc:  <linux-perf-users@vger.kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-22 23:46:18 +02:00
Ian Rogers
e25ebda78e perf cap: Tidy up and improve capability testing
Remove dependence on libcap. libcap is only used to query whether a
capability is supported, which is just 1 capget system call.

If the capget system call fails, fall back on root permission
checking. Previously if libcap fails then the permission is assumed
not present which may be pessimistic/wrong.

Add a used_root out argument to perf_cap__capable to say whether the
fall back root check was used. This allows the correct error message,
"root" vs "users with the CAP_PERFMON or CAP_SYS_ADMIN capability", to
be selected.

Tidy uses of perf_cap__capable so that tests aren't repeated if capget
isn't supported.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240806220614.831914-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-20 17:53:12 -03:00
Ian Rogers
1553419c3c perf dso: Fix address sanitizer build
Various files had been missed from having accessor functions added for
the sake of dso reference count checking. Add the function calls and
missing dso accessor functions.

Fixes: ee756ef749 ("perf dso: Add reference count checking and accessor functions")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240704011745.1021288-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-12 09:38:41 -07:00
Namhyung Kim
e988a5b53e perf symbol: Simplify kernel module checking
dso__load() checks whether the dso is a kernel module by looking at the
symtab type.  Actually the dso has an 'is_kmod' field to check that
easily, and dso__set_module_info() sets both the symtab type and the
is_kmod bit.  So checking the is_kmod bit should give the same result.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240621170528.608772-3-namhyung@kernel.org
2024-06-25 11:06:20 -07:00
Athira Rajeev
b0979f008f tools/perf: Fix the string match for "/tmp/perf-$PID.map" files in dso__load
Perf test for perf probe of function from different CU fails
as below:

	./perf test -vv "test perf probe of function from different CU"
	116: test perf probe of function from different CU:
	--- start ---
	test child forked, pid 2679
	Failed to find symbol foo in /tmp/perf-uprobe-different-cu-sh.Msa7iy89bx/testfile
	  Error: Failed to add events.
	--- Cleaning up ---
	"foo" does not hit any event.
	  Error: Failed to delete events.
	---- end(-1) ----
	116: test perf probe of function from different CU                   : FAILED!

The test does the following to probe function "foo":

	# gcc -g -Og -flto -c /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.c
	-o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.o
	# gcc -g -Og -c /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.c
	-o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.o
	# gcc -g -Og -o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile
	/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.o
	/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.o

	# ./perf probe -x /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile foo
	Failed to find symbol foo in /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile
	   Error: Failed to add events.

Perf probe fails to find symbol foo in the executable placed in
/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7

Simple reproducer:

 # mktemp -d /tmp/perf-checkXXXXXXXXXX
   /tmp/perf-checkcWpuLRQI8j

 # gcc -g -o test test.c
 # cp test /tmp/perf-checkcWpuLRQI8j/
 # nm /tmp/perf-checkcWpuLRQI8j/test | grep foo
   00000000100006bc T foo

 # ./perf probe -x /tmp/perf-checkcWpuLRQI8j/test foo
   Failed to find symbol foo in /tmp/perf-checkcWpuLRQI8j/test
      Error: Failed to add events.

But it works with other paths like /tmp/perf/test. It fails only for
paths that start with "/tmp/perf-".

Debugging further, commit 80d496be89 ("perf report: Add support
for profiling JIT generated code") added support for profiling JIT
generated code. That patch handles dsos of the form
"/tmp/perf-$PID.map".

The check used "if (strncmp(self->name, "/tmp/perf-", 10) == 0)"
to match "/tmp/perf-$PID.map". With that check, any dso whose path
starts with "/tmp/perf-" is treated specially (not only JIT-created
map files). Fix this by changing the string pattern to check for
"/tmp/perf-%d.map". Add a helper function is_perf_pid_map_name() to
do this check. In "struct dso", dso->long_name holds the long name of
the dso file. Since the /tmp/perf-$PID.map check uses the complete
name, use dso__long_name() for the string name.
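
For illustration, a stricter full-string variant of such a check could
look like the sketch below. The function name is hypothetical and this
is not the helper added by this patch; it additionally uses %n to
require that nothing follows ".map":

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical check: true only for exactly "/tmp/perf-<digits>.map". */
static bool looks_like_perf_pid_map(const char *name)
{
	static const char prefix[] = "/tmp/perf-";
	int pid, consumed = 0;

	if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
		return false;
	/* %d would also accept a sign or whitespace, so require a digit. */
	if (!isdigit((unsigned char)name[sizeof(prefix) - 1]))
		return false;
	/* %n is only stored if the literal ".map" matched as well. */
	if (sscanf(name, "/tmp/perf-%d.map%n", &pid, &consumed) != 1)
		return false;
	return consumed > 0 && name[consumed] == '\0';
}
```

A path such as /tmp/perf-checkcWpuLRQI8j/test is rejected by the digit
test, while /tmp/perf-1234.map is accepted.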

With the fix,
	# ./perf test "test perf probe of function from different CU"
	117: test perf probe of function from different CU                   : Ok

Fixes: 56cbeacf14 ("perf probe: Add test for regression introduced by switch to die_get_decl_file()")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240623064850.83720-1-atrajeev@linux.vnet.ibm.com
2024-06-25 11:00:43 -07:00
James Clark
25626e19ae perf symbols: Fix ownership of string in dso__load_vmlinux()
The linked commit updated dso__load_vmlinux() to call
dso__set_long_name() before loading the symbols. Loading the symbols may
not succeed but dso__set_long_name() takes ownership of the string. The
two callers of this function free the string themselves on failure
cases, resulting in the following error:

  $ perf record -- ls
  $ perf report

  free(): double free detected in tcache 2

Fix it by always taking ownership of the string, even on failure. This
means the string is either freed at the very first early exit condition,
or later when the dso is deleted or the long name is replaced. Now no
special return value is needed to signify that the caller needs to
free the string.

Fixes: e59fea47f8 ("perf symbols: Fix DSO kernel load and symbol process to correctly map DSO to its long_name, type and adjust_symbols")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240507141210.195939-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-09 18:48:46 -03:00
James Clark
f30232b20f perf symbols: Update kcore map before merging in remaining symbols
When loading kcore, the main vmlinux map is updated in the same loop
that merges the remaining maps. If a map that overlaps is merged in
before kcore, the list can become unsortable when the main map addresses
are updated. This will later trigger the check_invariants() assert:

  $ perf record
  $ perf report

  util/maps.c:96: check_invariants: Assertion `map__end(prev) <=
    map__start(map) || map__start(prev) == map__start(map)' failed.
  Aborted

Fix it by moving the main map update prior to the loop so that
maps__merge_in() can split it if necessary.

Fixes: 659ad3492b ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240507141210.195939-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-09 18:48:32 -03:00
James Clark
9fe410a7ef perf symbols: Remove map from list before updating addresses
Make the order of operations remove, update, add. Updating addresses
before the map is removed causes the ordering check to fail when the map
is removed. This can be reproduced when running Perf on an Arm system
with a static kernel and Perf uses kcore rather than other sources:

  $ perf record -- ls
  $ perf report

  util/maps.c:96: check_invariants: Assertion `map__end(prev) <=
    map__start(map) || map__start(prev) == map__start(map)' failed

Fixes: 659ad3492b ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240507141210.195939-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-09 18:48:00 -03:00
Ian Rogers
ad3003a65a perf mem-info: Move mem-info out of mem-events and symbol
Move mem-info to its own header rather than having it split between
mem-events and symbol.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-07 18:06:44 -03:00
Ian Rogers
ee756ef749 perf dso: Add reference count checking and accessor functions
Add reference count checking to struct dso, this can help with
implementing correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.

The majority of the change is mechanical in nature and not easy to
split up.

Committer testing:

'perf test' up to this patch shows no regressions.

But:

  util/symbol.c: In function ‘dso__load_bfd_symbols’:
  util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
   1683 |         dso__set_adjust_symbols(dso);
        |         ^~~~~~~~~~~~~~~~~~~~~~~
  In file included from util/symbol.c:21:
  util/dso.h:268:20: note: declared here
    268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
        |                    ^~~~~~~~~~~~~~~~~~~~~~~
  make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
    MKDIR   /tmp/tmp.ZWHbQftdN6/tests/workloads/
  make[6]: *** Waiting for unfinished jobs....

This was updated:

  -       symbols__fixup_end(&dso->symbols, false);
  -       symbols__fixup_duplicate(&dso->symbols);
  -       dso->adjust_symbols = 1;
  +       symbols__fixup_end(dso__symbols(dso), false);
  +       symbols__fixup_duplicate(dso__symbols(dso));
  +       dso__set_adjust_symbols(dso);

But not build tested with BUILD_NONDISTRO and libbfd devel files installed
(binutils-devel on fedora).

Add the missing argument:

   	symbols__fixup_end(dso__symbols(dso), false);
   	symbols__fixup_duplicate(dso__symbols(dso));
  -	dso__set_adjust_symbols(dso);
  +	dso__set_adjust_symbols(dso, true);

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Chengen Du <chengen.du@canonical.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-06 15:28:49 -03:00
Namhyung Kim
bacefe0c7b perf tools: Fixup module symbol end address properly
I got a strange error on ARM: processing a FINISHED_ROUND record
failed.  It turned out that it was failing in symbol__alloc_hist()
because the symbol size was too big.

It failed when a sample was captured in a specific BPF program.  I
added some debug code and found that the end address of the symbol
came from the next module, which is placed far away.

  ffff800008795778-ffff80000879d6d8: bpf_prog_1bac53b8aac4bc58_netcg_sock    [bpf]
  ffff80000879d6d8-ffff80000ad656b4: bpf_prog_76867454b5944e15_netcg_getsockopt      [bpf]
  ffff80000ad656b4-ffffd69b7af74048: bpf_prog_1d50286d2eb1be85_hn_egress     [bpf]   <---------- here
  ffffd69b7af74048-ffffd69b7af74048: $x.5    [sha3_generic]
  ffffd69b7af74048-ffffd69b7af740b8: crypto_sha3_init        [sha3_generic]
  ffffd69b7af740b8-ffffd69b7af741e0: crypto_sha3_update      [sha3_generic]

The logic in symbols__fixup_end() just uses curr->start to update
prev->end.  But in this case that doesn't work, as the two addresses
are too far apart.

I think ARM has a different kernel memory layout for modules and BPF
than x86.  Actually there's already logic to handle the kernel/module
boundary.  Let's do the same for symbols in different modules.
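
The idea can be sketched as follows.  The struct, the names, the 4096
page size and the "end at the next page boundary" rule are assumptions
for this example and may differ from what symbols__fixup_end() does:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size */

struct ksym {
	unsigned long long start, end;
	const char *module;		/* NULL for core kernel symbols */
};

static bool same_owner(const struct ksym *a, const struct ksym *b)
{
	if (!a->module || !b->module)
		return a->module == b->module;
	return strcmp(a->module, b->module) == 0;
}

/* syms[] must be sorted by start address. */
static void fixup_ends(struct ksym *syms, size_t nr)
{
	for (size_t i = 1; i < nr; i++) {
		struct ksym *prev = &syms[i - 1];
		const struct ksym *curr = &syms[i];

		if (prev->end != prev->start)
			continue;	/* size already known */

		if (same_owner(prev, curr)) {
			prev->end = curr->start;
		} else {
			/* Do not stretch across the gap to another module. */
			unsigned long long limit =
				(prev->start / SKETCH_PAGE_SIZE + 1) * SKETCH_PAGE_SIZE;

			prev->end = limit < curr->start ? limit : curr->start;
		}
	}
}
```

With the addresses listed above, the last [bpf] symbol would end at
ffff80000ad66000 instead of reaching the first [sha3_generic] symbol.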

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Leo Yan <leo.yan@linux.dev>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240212233322.1855161-1-namhyung@kernel.org
2024-02-16 16:07:28 -08:00
Ian Rogers
ff0bd79980 perf maps: Hide maps internals
Move the struct into the C file. Add maps__equal to work around
exposing the struct for reference count checking. Add accessors for
the unwind_libunwind_ops. Move maps_list_node to its only use in
symbol.c.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Artem Savkov <asavkov@redhat.com>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240210031746.4057262-6-irogers@google.com
2024-02-12 12:35:41 -08:00
Ian Rogers
107ef66cb0 perf maps: Get map before returning in maps__find_by_name
Finding a map is done under a lock, but returning the map without
taking a reference means it can be removed without notice, causing
use-after-free. Grab a reference to the map within the lock
region and return this. Fix up locations that need a map__put()
following this. Also fix some reference counted pointer comparisons.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Artem Savkov <asavkov@redhat.com>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240210031746.4057262-4-irogers@google.com
2024-02-12 12:35:33 -08:00
Ian Rogers
42fd623b58 perf maps: Get map before returning in maps__find
Finding a map is done under a lock, but returning the map without
taking a reference means it can be removed without notice, causing
use-after-free. Grab a reference to the map within the lock
region and return this. Fix up locations that need a map__put()
following this.
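
A minimal sketch of the pattern, using a pthread mutex and a C11
atomic counter.  All types and function names here are simplified
stand-ins invented for the example, not the perf maps API:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct map {
	unsigned long long start, end;	/* covers [start, end) */
	atomic_int refcnt;
};

struct maps {
	pthread_mutex_t lock;
	struct map **entries;
	size_t nr;
};

static struct map *map_get(struct map *map)
{
	if (map)
		atomic_fetch_add(&map->refcnt, 1);
	return map;
}

static void map_put(struct map *map)
{
	/* Free when the last reference is dropped. */
	if (map && atomic_fetch_sub(&map->refcnt, 1) == 1)
		free(map);
}

/* Returns a new reference, or NULL.  The caller must map_put() it. */
static struct map *maps_find(struct maps *maps, unsigned long long addr)
{
	struct map *result = NULL;

	pthread_mutex_lock(&maps->lock);
	for (size_t i = 0; i < maps->nr; i++) {
		struct map *m = maps->entries[i];

		if (addr >= m->start && addr < m->end) {
			/* Take the reference while still holding the lock. */
			result = map_get(m);
			break;
		}
	}
	pthread_mutex_unlock(&maps->lock);
	return result;
}
```

Because the reference is taken before the lock is released, a
concurrent removal can drop the container's reference without freeing
the map that the caller is still using.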

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Artem Savkov <asavkov@redhat.com>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240210031746.4057262-3-irogers@google.com
2024-02-12 12:35:26 -08:00
Ian Rogers
8d5847a617 perf maps: Add remove maps function to remove a map based on callback
Removing maps wasn't being done under the write lock. Similar to
maps__for_each_map(), iterate the entries but in this case remove the
entry based on the result of the callback. If an entry was removed
then maps_by_name() also needs updating, so add missed flush.

In dso__load_kcore(), the test of map to save would always be false with
REFCNT_CHECKING because of a missing RC_CHK_ACCESS/RC_CHK_EQUAL.
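
A rough sketch of callback-driven removal under a single lock hold. The
container layout, the mutex standing in for the write lock, the
by_name_valid flag and the map_starts_below callback are all hypothetical;
the function name is taken from the commit subject, and its signature here
is an assumption.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified container; not perf's actual struct maps. */
struct map {
	unsigned long start;
};

struct maps {
	pthread_mutex_t lock;	/* stands in for the write side of the maps lock */
	struct map *entries[8];
	size_t nr;
	bool by_name_valid;	/* stands in for the cached sorted-by-name view */
};

/* Remove every entry for which cb() returns true, all under one lock hold. */
static void maps__remove_maps(struct maps *maps,
			      bool (*cb)(struct map *map, void *data), void *data)
{
	size_t kept = 0;
	bool removed = false;

	pthread_mutex_lock(&maps->lock);
	for (size_t i = 0; i < maps->nr; i++) {
		if (cb(maps->entries[i], data))
			removed = true;		/* drop: do not copy forward */
		else
			maps->entries[kept++] = maps->entries[i];
	}
	maps->nr = kept;
	if (removed)
		maps->by_name_valid = false;	/* flush the by-name cache */
	pthread_mutex_unlock(&maps->lock);
}

/* Example callback: remove maps starting below a threshold address. */
static bool map_starts_below(struct map *map, void *data)
{
	return map->start < *(unsigned long *)data;
}
```

Keeping the decision in a callback lets the container own the locking and
the cache invalidation, so callers cannot forget either step.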

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231207011722.1220634-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-20 14:51:44 -03:00
Ian Rogers
111350c67d perf symbol: Use function to add missing maps lock
Switch do_validate_kcore_modules from loop macro maps__for_each_entry to
maps__for_each_map function that takes a callback. The function holds
the maps lock, which should be held during iteration.
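
As an illustration of why a function with a callback helps here, the sketch
below holds the lock for the whole iteration, which a bare loop macro cannot
enforce. The struct layouts, the mutex and the check_map_range callback are
hypothetical; the signature of maps__for_each_map is an assumption.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified types for illustration only. */
struct map {
	unsigned long start, end;
};

struct maps {
	pthread_mutex_t lock;
	struct map *entries[8];
	size_t nr;
};

/* Call cb on each map while holding the lock; a non-zero return stops early. */
static int maps__for_each_map(struct maps *maps,
			      int (*cb)(struct map *map, void *data), void *data)
{
	int ret = 0;

	pthread_mutex_lock(&maps->lock);
	for (size_t i = 0; i < maps->nr && ret == 0; i++)
		ret = cb(maps->entries[i], data);
	pthread_mutex_unlock(&maps->lock);
	return ret;
}

/* Example callback: count visited maps, fail on an empty or inverted range. */
static int check_map_range(struct map *map, void *data)
{
	size_t *checked = data;

	(*checked)++;
	return map->start < map->end ? 0 : -1;
}
```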

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231207011722.1220634-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-18 21:35:09 -03:00
Ian Rogers
0f6ab6a3fb perf maps: Move symbol maps functions to maps.c
Move the find and certain other symbol maps__* functions to maps.c for
better abstraction.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231127220902.1315692-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-06 13:01:46 -03:00
Ian Rogers
9fa688ea34 perf map: Simplify map_ip/unmap_ip and make 'struct map' smaller
When mapping an IP it is either an identity mapping or a DSO relative
mapping, so a single bit is required in the struct to identify
this.

The current code uses function pointers, adding 2 pointers per map and
also pushing the size of a map beyond 1 cache line.

Switch to using a byte to identify the mapping type (as well as priv and
erange_warned), to avoid any masking.

Change struct map's layout to avoid holes.

Before:
```
struct map {
        u64                        start;                /*     0     8 */
        u64                        end;                  /*     8     8 */
        _Bool                      erange_warned:1;      /*    16: 0  1 */
        _Bool                      priv:1;               /*    16: 1  1 */

        /* XXX 6 bits hole, try to pack */
        /* XXX 3 bytes hole, try to pack */

        u32                        prot;                 /*    20     4 */
        u64                        pgoff;                /*    24     8 */
        u64                        reloc;                /*    32     8 */
        u64                        (*map_ip)(const struct map  *, u64); /*    40     8 */
        u64                        (*unmap_ip)(const struct map  *, u64); /*    48     8 */
        struct dso *               dso;                  /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        refcount_t                 refcnt;               /*    64     4 */
        u32                        flags;                /*    68     4 */

        /* size: 72, cachelines: 2, members: 12 */
        /* sum members: 68, holes: 1, sum holes: 3 */
        /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
        /* last cacheline: 8 bytes */
};
```

After:
```
struct map {
        u64                        start;                /*     0     8 */
        u64                        end;                  /*     8     8 */
        u64                        pgoff;                /*    16     8 */
        u64                        reloc;                /*    24     8 */
        struct dso *               dso;                  /*    32     8 */
        refcount_t                 refcnt;               /*    40     4 */
        u32                        prot;                 /*    44     4 */
        u32                        flags;                /*    48     4 */
        enum mapping_type          mapping_type:8;       /*    52: 0  4 */

        /* Bitfield combined with next fields */

        _Bool                      erange_warned;        /*    53     1 */
        _Bool                      priv;                 /*    54     1 */

        /* size: 56, cachelines: 1, members: 11 */
        /* padding: 1 */
        /* last cacheline: 56 bytes */
};
```
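
A sketch of how a mapping type can replace the two function pointers,
assuming a reduced struct with only the fields the translation needs. The
enumerator names and the exact arithmetic are assumptions made for
illustration; the enum bit-field mirrors the layout shown above and relies
on GCC/Clang accepting enum bit-fields.

```c
#include <stdint.h>

enum mapping_type {
	MAPPING_TYPE__DSO,	/* ip is translated relative to the DSO */
	MAPPING_TYPE__IDENTITY,	/* ip is returned unchanged */
};

/* Reduced struct: only what the translation needs. */
struct map {
	uint64_t start;
	uint64_t pgoff;
	enum mapping_type mapping_type : 8;
};

/* Memory address to DSO-relative address. */
static uint64_t map__map_ip(const struct map *map, uint64_t ip)
{
	if (map->mapping_type == MAPPING_TYPE__DSO)
		return ip - map->start + map->pgoff;
	return ip;
}

/* DSO-relative address back to memory address. */
static uint64_t map__unmap_ip(const struct map *map, uint64_t rip)
{
	if (map->mapping_type == MAPPING_TYPE__DSO)
		return rip + map->start - map->pgoff;
	return rip;
}
```

With only two possible behaviours, a branch on a one-byte field costs less
space than two 8-byte pointers per map, which is what brings the struct
from 72 bytes down to 56.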

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231127220902.1315692-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-06 13:01:42 -03:00
Ian Rogers
56e144fe98 perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit
Fix a leak where mem_info__put wouldn't release the maps/map as used by
perf mem. Add exit functions and use them elsewhere that the maps and map
are released.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25 13:39:58 -07:00
Ian Rogers
78c32f4cb1 libperf rc_check: Add RC_CHK_EQUAL
Comparing pointers under reference count checking is tricky to do without
risking a SEGV. Add a convenience macro to simplify this and use it.
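
To illustrate the problem, the sketch below assumes that under reference
count checking every reference is a distinct wrapper pointing at the same
underlying object, so comparing the handles directly gives the wrong answer
and dereferencing a NULL handle crashes. The struct names and the orig
member are assumptions for this sketch; without checking enabled, the macro
could reduce to a plain pointer comparison.

```c
#include <stddef.h>

/* The real object. */
struct original_map {
	unsigned long start;
};

/* Checked handle: each reference is a separate wrapper around the object. */
struct map {
	struct original_map *orig;
};

/* Equal if same handle, or both non-NULL and wrapping the same object. */
#define RC_CHK_EQUAL(a, b) \
	((a) == (b) || ((a) && (b) && (a)->orig == (b)->orig))
```

The NULL checks come before either handle is dereferenced, which is the
part that is easy to get wrong when the comparison is open-coded.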

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-25 13:37:22 -07:00
Arnaldo Carvalho de Melo
29a2fd7c72 perf symbols: Add 'intel_idle_ibrs' to the list of idle symbols
This is a longstanding to do list entry: we need a way to see that a
sample took place while in idle state, as the current way to do it is
to infer that by the name of the functions that in such state have
more samples, IOW: a hack.

Maybe we can flip a bit in samples that take place inside the
enter/exit idle section in do_idle()?

But till then, add one more :-\

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/ZR66Qgbcltt+zG7F@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-12 10:01:55 -07:00
Athira Rajeev
26a5262d30 tools/perf: Add text_end to "struct dso" to save .text section size
Update "struct dso" to include a new member "text_end".
This new field represents the offset of the end of the text
section for a dso. For ELF, this value is derived as:
sh_size (size of section in bytes) + sh_offset (section file
offset) of the ELF section header for text.

For bfd, this value is derived as:
1. For PE file,
section->size + ( section->vma - dso->text_offset)
2. Other cases:
section->filepos (file position) + section->size (size of
section)

To resolve the address from a sample, perf looks at the
DSO maps. For addresses from a kernel module, some
addresses were found to be unresolved. This was
observed while running perf test for "Object code reading".
Though the ip falls between the start address of the loaded
module (perf map->start) and the end address (perf map->end),
it was unresolved.

Example:

    Reading object code for memory address: 0xc008000007f0142c
    File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
    On file address is: 0x1114cc
    Objdump command is: objdump -z -d --start-address=0x11142c --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
    objdump read too few bytes: 128
    test child finished with -1

Here, module is loaded at:
    # cat /proc/modules | grep xfs
    xfs 2228224 3 - Live 0xc008000007d00000

From objdump for xfs module, text section is:
    text 0010f7bc  0000000000000000 0000000000000000 000000a0 2**4

Here the offset for 0xc008000007f0142c, i.e. 0x112074, falls outside the
.text section, which is up to 0x10f7bc.
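
The ELF derivation can be checked against the numbers in this message. The
sketch below plugs in the .text size (0x10f7bc) and file offset (0xa0) from
the objdump line and tests the offset 0x112074 against the result. The
helper names are hypothetical, and the comparison only illustrates the idea,
not how perf consumes the field.

```c
#include <stdbool.h>
#include <stdint.h>

/* ELF case from the message: text_end = sh_offset + sh_size. */
static uint64_t elf_text_end(uint64_t sh_offset, uint64_t sh_size)
{
	return sh_offset + sh_size;
}

/* True when a file offset lies at or past the end of .text. */
static bool offset_past_text(uint64_t file_offset, uint64_t text_end)
{
	return file_offset >= text_end;
}
```

With these inputs, elf_text_end(0xa0, 0x10f7bc) yields 0x10f85c, and
0x112074 lies past it, consistent with the address not belonging to the
text section of the file.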

In this case, for the module, the address 0xc008000007e11fd4 points
to stub instructions. This address range represents the module stubs,
which are allocated on module load and hence are not part of the DSO offset.

To identify such addresses, which fall outside the text
section but within the module end, add the new field "text_end" to
"struct dso".

Reported-by: Disha Goel <disgoel@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230928075213.84392-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-10-04 22:28:07 -07:00