linux

mirror of https://github.com/torvalds/linux.git synced 2026-05-30 01:53:29 +02:00

Author	SHA1	Message	Date
Namhyung Kim	552636b931	perf trace: Add beautifier script for fsmount flags And move the existing one to fsmount_attr.sh to be more precise. Now the fsmount_flags[] is generated from the mount.h like below. The ilog2() + 1 is an existing pattern to handle bit flags. $ cat tools/perf/trace/beauty/generated/fsmount_arrays.c static const char *fsmount_flags[] = { [ilog2(0x00000001) + 1] = "CLOEXEC", [ilog2(0x00000002) + 1] = "NAMESPACE", }; It was found by Sashiko during the review. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-05-14 14:48:32 -07:00
Arnaldo Carvalho de Melo	fbfb858552	perf tools: Use calloc() where applicable Use calloc() instead of zalloc(nr_entries * sizeof_entry), since that is what calloc() does. Where linux/zalloc.h is no longer needed, remove it; where it is needed but was only being included indirectly, add it. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-08 19:21:05 -07:00
Michael Petlan	11e8d234d4	perf trace: Fix potential u64 underflow in duration calculation Although it happens very rarely, in case of out-of-order events (i.e. due to CPU migration when a syscall is executed), the calculation of event duration might underflow and thus a bogus value is printed: 2.804 ( 0.001 ms): :49553/49553 rt_sigaction(sig: QUIT, act: 0x7fff403ed6e0, oact: 0x7fff403ed780, sigsetsize: 8) = 0 2.807 ( 0.001 ms): :49553/49553 rt_sigaction(sig: CHLD, act: 0x7fff403ed6e0, oact: 0x7fff403ed780, sigsetsize: 8) = 0 2.815 (18446744073709.438 ms): :49553/49553 execve(filename: 0xbb173a30, argv: 0x55aabb171930, envp: 0x55aabb171120) = 0 2.815 ( 0.534 ms): pwd/49553 ... [continued]: execve()) = 0 Check for possible underflow first and in case of a bogus value, do not print it. 2.804 ( 0.001 ms): :49553/49553 rt_sigaction(sig: QUIT, act: 0x7fff403ed6e0, oact: 0x7fff403ed780, sigsetsize: 8) = 0 2.807 ( 0.001 ms): :49553/49553 rt_sigaction(sig: CHLD, act: 0x7fff403ed6e0, oact: 0x7fff403ed780, sigsetsize: 8) = 0 2.815 ( ): :49553/49553 execve(filename: 0xbb173a30, argv: 0x55aabb171930, envp: 0x55aabb171120) = 0 2.815 ( 0.534 ms): pwd/49553 ... [continued]: execve()) = 0 Signed-off-by: Michael Petlan <mpetlan@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-02 18:43:35 -07:00
Namhyung Kim	4cbceeca56	perf trace: Skip unnecessary synthesis for summary-only mode It needs to synthesize task info for the comm name. The mmap information is only needed for callchain symbolization, which is not used by the summary mode. Also, the total and cgroup summary modes don't require the task info. Let's skip the processing if possible. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-04-02 12:51:09 -07:00
Ian Rogers	c006753c3a	perf callchain: Refactor callchain option parsing record_opts__parse_callchain is shared by builtin-record and builtin-trace, it is declared in callchain.h. Move the declaration to callchain.c for consistency with the header. In other cases make the option callback a small static stub that then calls into callchain.c. Make the no argument '-g' callchain option just a short-cut for '--call-graph fp' so that there is consistency in how the arguments are handled. This requires the const char* string to be strdup-ed in __parse_callchain_report_opt. For consistency also make parse_callchain_record use strdup and remove some unnecessary casts. Also, be more explicit about the '-g' behavior if there is a .perfconfig file setting. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-03-19 14:42:46 -07:00
Ian Rogers	5cd621dead	perf bpf_map: Remove unused code bpf_map__fprintf is unused so delete it, the header file declaring it and the now unused static helper functions. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-03-13 14:26:28 -07:00
Ian Rogers	d05073adda	perf trace: Avoid an ERR_PTR in syscall_stats hashmap__new may return an ERR_PTR and previously this would be assigned to syscall_stats meaning all use of syscall_stats needs to test for NULL (uninitialized) or an ERR_PTR. Given the only reason hashmap__new can fail is ENOMEM, just use NULL to indicate the allocation failure and avoid the code having to test for NULL and IS_ERR. Fixes: `96f202eab8` (perf trace: Fix IS_ERR() vs NULL check bug) Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-03-02 17:13:19 -08:00
wangguangju	96f202eab8	perf trace: Fix IS_ERR() vs NULL check bug The alloc_syscall_stats() function always returns an error pointer (ERR_PTR) on failure. So replace NULL check with IS_ERR() check after calling delete_syscall_stats() function. Fixes: `ef2da619b1` ("perf trace: Convert syscall_stats to hashmap") Signed-off-by: wangguangju <wangguangju@hygon.cn> Reviewed-by: Howard Chu <howardchu95@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2026-02-26 10:48:14 -08:00
Ian Rogers	4e66527f88	perf thread: Add optional e_flags output argument to thread__e_machine The e_flags are needed to accurately compute complete perf register information for CSKY. Add the ability to read and have this value associated with a thread. This change doesn't wire up the use of the e_flags except in disasm where use already exists but just wasn't set up yet. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Guo Ren <guoren@kernel.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sergei Trofimovich <slyich@gmail.com> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Tianyou Li <tianyou.li@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-01-26 18:21:20 -03:00
Arnaldo Carvalho de Melo	2c850606a4	perf trace: Deal with compiler const checks The strchr() function these days returns const/non-const based on the arg it receives, and sometimes we need to use casts when we're dealing with variables that are used in code that needs to safely change the returned value and sometimes not (as it points to really const areas). Tweak one such case. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-01-20 20:49:59 -03:00
Ian Rogers	bac74dcbd4	perf tools: Switch printf("...%s", strerror(errno)) to printf("...%m") strerror() has thread safety issues, strerror_r() requires stack allocated buffers. Code in perf has already been using the "%m" formatting flag that is a widely supported glibc extension to print the current errno's description. Expand the usage of this formatting flag and remove usage of strerror()/strerror_r(). Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Yunseong Kim <ysk@kzalloc.com> Cc: Zhongqiu Han <quic_zhonhan@quicinc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-01-14 17:22:50 -03:00
Arnaldo Carvalho de Melo	c85eff00cf	perf trace: Don't change const char strings We got away with this so far, but now with Fedora 44 complaining about the return value of strchr et al., let's use strdup for good measure. Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20251211221756.96294-5-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-12-17 09:30:37 -03:00
Namhyung Kim	267c2e633a	perf trace: Skip internal syscall arguments Recent changes in the linux-next kernel will add a new field for syscalls to have contents in the userspace like below. # cat /sys/kernel/tracing/events/syscalls/sys_enter_write/format name: sys_enter_write ID: 758 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:int __syscall_nr; offset:8; size:4; signed:1; field:unsigned int fd; offset:16; size:8; signed:0; field:const char * buf; offset:24; size:8; signed:0; field:size_t count; offset:32; size:8; signed:0; field:__data_loc char[] __buf_val; offset:40; size:4; signed:0; print fmt: "fd: 0x%08lx, buf: 0x%08lx (%s), count: 0x%08lx", ((unsigned long)(REC->fd)), ((unsigned long)(REC->buf)), __print_dynamic_array(__buf_val, 1), ((unsigned long)(REC->count)) We have a different way to handle those arguments, and this change confuses perf trace and makes some tests fail. Fix it by skipping the new fields that have "__data_loc char[]" type. Maybe we can switch to this instead of the BPF augmentation later. Reviewed-by: Howard Chu <howardchu95@gmail.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Howard Chu <howardchu95@gmail.com> Reported-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-29 12:23:37 -08:00
Ian Rogers	ad1a008bf0	perf trace: Don't synthesize mmaps unless callchains are enabled Synthesizing mmaps in perf trace is unnecessary unless call chains are being generated. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-19 11:32:30 +09:00
Fushuai Wang	b0f4ade163	perf trace: Fix IS_ERR() vs NULL check bug The alloc_syscall_stats() function always returns an error pointer (ERR_PTR) on failure. So replace NULL check with IS_ERR() check after calling alloc_syscall_stats() function. Fixes: `fc00897c8a` ("perf trace: Add --summary-mode option") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Fushuai Wang <wangfushuai@baidu.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-09-19 12:42:07 -03:00
Namhyung Kim	ece3c7754f	perf trace: Add --max-summary option The --max-summary option is to limit the number of output lines for syscall summary stats. The max applies to each entries like thread and cgroups. For total summary, it will just print up to the given number. For example, $ sudo perf trace -as --max-summary 3 sleep 0.1 ThreadPoolServi (1011651), 114 events, 14.8% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ epoll_wait 38 0 95.589 0.000 2.515 11.153 28.98% futex 9 0 0.040 0.002 0.004 0.014 28.63% read 10 0 0.037 0.003 0.004 0.005 4.67% sleep (1050529), 250 events, 32.4% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ clock_nanosleep 1 0 100.156 100.156 100.156 100.156 0.00% execve 4 3 1.020 0.005 0.255 0.989 95.93% openat 36 17 0.416 0.003 0.012 0.029 10.58% ... And this is for per-cgroup summary using BPF. 
$ sudo perf trace -as --max-summary 3 --summary-mode=cgroup --bpf-summary sleep 0.1 cgroup /user.slice/user-657345.slice/user@657345.service/session.slice/org.gnome.Shell@x11.service, 12 events syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ recvmsg 8 7 0.016 0.001 0.002 0.006 39.73% ppoll 1 0 0.014 0.014 0.014 0.014 0.00% write 2 0 0.010 0.002 0.005 0.008 61.02% cgroup /user.slice/user-657345.slice/session-4.scope, 73 events syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ epoll_wait 8 0 13.461 0.010 1.683 12.235 89.66% ioctl 20 0 0.204 0.001 0.010 0.113 54.01% writev 11 0 0.164 0.004 0.015 0.042 20.34% Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-09-19 12:14:29 -03:00
Ian Rogers	003a86bce0	perf trace: Avoid global perf_env with evsel__env There is no session in perf trace unless in replay mode, so in host mode no session can be associated with the evlist. If the evsel__env call fails, resort to the host_env that's part of the trace. Remove errno_to_name, as it becomes a one-line function that is called only once when the argument is turned into a perf_env; just call perf_env__arch_strerrno directly. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-19-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-25 10:37:58 -07:00
Ian Rogers	e481066388	perf machine: Explicitly pass in host perf_env When creating a machine for the host explicitly pass in a scoped perf_env. This removes a use of the global perf_env. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-17-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-25 10:37:57 -07:00
Ian Rogers	c3e5b9ec96	perf session: Add accessor for session->header.env The perf_env from the header in the session is frequently accessed, add an accessor function rather than access directly. Cache the value to avoid repeated calls. No behavioral change. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-25 10:37:56 -07:00
Namhyung Kim	f6109fb6f5	perf trace: Split BPF skel code to util/bpf_trace_augment.c And make builtin-trace.c less conditional. Dummy functions will be called when BUILD_BPF_SKEL=0 is used. This makes the builtin-trace.c slightly smaller and simpler by removing the skeleton and its helpers. The conditional guard of trace__init_syscalls_bpf_prog_array_maps() is changed from the HAVE_BPF_SKEL to HAVE_LIBBPF_SUPPORT as it doesn't have a skeleton in the code directly. And a dummy function is added so that it can be called unconditionally. The function will succeed only if both conditions are true. Do not include trace_augment.h from the BPF code and move the definition of TRACE_AUG_MAX_BUF to the BPF directly. Reviewed-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250623225721.21553-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-06-26 10:31:05 -07:00
Ian Rogers	eda9e47fae	perf trace: Add missed freeing of ordered events and thread Caught by leak sanitizer running "perf trace BTF general tests". Make the ordered_events initialization unconditional and early so that trace__exit cleanup is simple - ordered_events__init doesn't allocate and just sets up 4 values and inits 3 list heads. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250617223356.2752099-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-06-24 10:27:51 -07:00
Namhyung Kim	317fa41b47	perf trace: Show zero value in STRARRAY The STRARRAY macro is to print values in a pre-defined array. But sometimes it hides the value because it's 0. The value of 0 can have a meaning in this case so set 'show_zero' field. For example, it can show CREATE_MAP cmd in the bpf syscall. Acked-by: Howard Chu <howardchu95@gmail.com> Link: https://lore.kernel.org/r/20250502204056.973977-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-06-24 10:27:50 -07:00
Ian Rogers	b4c658d4d6	perf target: Remove uid from target Gathering threads with a uid by scanning /proc is inherently racy leading to perf_event_open failures that quit perf. All users of the functionality now use BPF filters, so remove uid and uid_str from target. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-06-09 11:18:18 -07:00
Ian Rogers	bf1976dd28	perf trace: Switch user option to use BPF filter Finding user processes by scanning /proc is inherently racy and results in perf_event_open failures. Use a BPF filter to drop samples where the uid doesn't match. Ensure adding the BPF filter forces system-wide. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250604174545.2853620-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-06-09 11:18:18 -07:00
Anubhav Shelat	8c56bfe53b	perf trace: Set errpid to false for rseq and set_robust_list The 'rseq' and 'set_robust_list' syscalls don't return a pid, so set errpid for both to false. Fixes: `0c1019e346` ("perf trace: Mark the 'rseq' arg in the rseq syscall as coming from user space") Fixes: `1de5b5dcb8` ("perf trace: Mark the 'head' arg in the set_robust_list syscall as coming from user space") Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250529143334.1469669-2-ashelat@redhat.com [ Remove explicit .errpid = false, omitting its initialization zeroes it, as noted by Namhyung ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-05-29 22:09:37 -03:00
Anubhav Shelat	c7a48ea9b9	perf trace: Always print return value for syscalls returning a pid The syscalls that were consistently observed were set_robust_list and rseq. This is because perf cannot find their child process. This change ensures that the return value is always printed. Before: 0.256 ( 0.001 ms): set_robust_list(head: 0x7f09c77dba20, len: 24) = 0.259 ( 0.001 ms): rseq(rseq: 0x7f09c77dc0e0, rseq_len: 32, sig: 1392848979) = After: 0.270 ( 0.002 ms): set_robust_list(head: 0x7f0bb14a6a20, len: 24) = 0 0.273 ( 0.002 ms): rseq(rseq: 0x7f0bb14a70e0, rseq_len: 32, sig: 1392848979) = 0 Committer notes: As discussed in the thread in the Link: tag below, these two don't return a pid, but for syscalls returning one, we need to print the result and if we manage to find the children in 'perf trace' data structures, then print its name as well. Fixes: `11c8e39f51` ("perf trace: Infrastructure to show COMM strings for syscalls returning PIDs") Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250403160411.159238-2-ashelat@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-05-28 15:39:29 -03:00
Namhyung Kim	ef60b8f572	perf trace: Support --summary-mode=cgroup Add a new summary mode to collect stats for each cgroup. $ sudo ./perf trace -as --bpf-summary --summary-mode=cgroup -- sleep 1 Summary of events: cgroup /user.slice/user-657345.slice/user@657345.service/session.slice/org.gnome.Shell@x11.service, 535 events syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ ppoll 15 0 373.600 0.004 24.907 197.491 55.26% poll 15 0 1.325 0.001 0.088 0.369 38.76% close 66 0 0.567 0.007 0.009 0.026 3.55% write 150 0 0.471 0.001 0.003 0.010 3.29% recvmsg 94 83 0.290 0.000 0.003 0.037 16.39% ioctl 26 0 0.237 0.001 0.009 0.096 50.13% timerfd_create 66 0 0.236 0.003 0.004 0.024 8.92% timerfd_settime 70 0 0.160 0.001 0.002 0.012 7.66% writev 10 0 0.118 0.001 0.012 0.019 18.17% read 9 0 0.021 0.001 0.002 0.004 14.07% getpid 14 0 0.019 0.000 0.001 0.004 20.28% cgroup /system.slice/polkit.service, 94 events syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ ppoll 22 0 19.811 0.000 0.900 9.273 63.88% write 30 0 0.040 0.001 0.001 0.003 12.09% recvmsg 12 0 0.018 0.001 0.002 0.006 28.15% read 18 0 0.013 0.000 0.001 0.003 21.99% poll 12 0 0.006 0.000 0.001 0.001 4.48% cgroup /user.slice/user-657345.slice/user@657345.service/app.slice/app-org.gnome.Terminal.slice/gnome-terminal-server.service, 21 events syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ ppoll 4 0 17.476 0.003 4.369 13.298 69.65% recvmsg 15 12 0.068 0.002 0.005 0.014 26.53% writev 1 0 0.033 0.033 0.033 0.033 0.00% poll 1 0 0.005 0.005 0.005 0.005 0.00% ... It works only for --bpf-summary for now. 
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250501225337.928470-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-05-13 18:20:46 -03:00
Namhyung Kim	30d20fb1f8	perf trace: Fix leaks of 'struct thread' in set_filter_loop_pids() I've found some leaks from 'perf trace -a'. It seems there are more leaks but this is what I can find for now. Fixes: `082ab9a18e` ("perf trace: Filter out 'sshd' in the tracer ancestry in syswide tracing") Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250403054213.7021-1-namhyung@kernel.org [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-05-08 17:11:50 -03:00
Namhyung Kim	bb3de7fa98	perf trace: Fix leaks of 'struct thread' in fprintf_sys_enter() I've found some leaks from 'perf trace -a'. It seems there are more leaks but this is what I can find for now. Fixes: `70351029b5` ("perf thread: Add support for reading the e_machine type for a thread") Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250403054213.7021-1-namhyung@kernel.org [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-05-08 17:10:09 -03:00
Ian Rogers	8830091383	perf trace: Free the files.max entry in files->table The files.max is the maximum valid fd in the files array and so freeing the values needs to be inclusive of the max value. Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250401202715.3493567-1-irogers@google.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-05-08 11:49:40 -03:00
Namhyung Kim	1bec43f523	perf trace: Implement syscall summary in BPF When -s/--summary option is used, it doesn't need (augmented) arguments of syscalls. Let's skip the augmentation and load another small BPF program to collect the statistics in the kernel instead of copying the data to the ring-buffer to calculate the stats in userspace. This will be much more light-weight than the existing approach and remove any lost events. Let's add a new option --bpf-summary to control this behavior. I cannot make it default because there's no way to get e_machine in the BPF which is needed for detecting different ABIs like 32-bit compat mode. No functional changes intended except for no more LOST events. :) $ sudo ./perf trace -as --summary-mode=total --bpf-summary sleep 1 Summary of events: total, 6194 events syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ epoll_wait 561 0 4530.843 0.000 8.076 520.941 18.75% futex 693 45 4317.231 0.000 6.230 500.077 21.98% poll 300 0 1040.109 0.000 3.467 120.928 17.02% clock_nanosleep 1 0 1000.172 1000.172 1000.172 1000.172 0.00% ppoll 360 0 872.386 0.001 2.423 253.275 41.91% epoll_pwait 14 0 384.349 0.001 27.453 380.002 98.79% pselect6 14 0 108.130 7.198 7.724 8.206 0.85% nanosleep 39 0 43.378 0.069 1.112 10.084 44.23% ... 
Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250326044001.3503432-1-namhyung@kernel.org [ Added fixup sent from Namhyung in response to my report to make it also dependent on CONFIG_TRACE ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-04-28 16:53:11 -03:00
Thomas Richter	216d567610	perf trace: Fix wrong size to bpf_map__update_elem call In linux-next commit `c760174401` ("perf cpumap: Reduce cpu size from int to int16_t") causes the perf tests 100 126 to fail on s390: Output before: # ./perf test 100 100: perf trace BTF general tests : FAILED! # The root cause is the change from int to int16_t for the cpu maps. The size of the CPU key value pair changes from four bytes to two bytes. However a two byte key size is not supported for bpf_map__update_elem(). Note: validate_map_op() in libbpf.c emits warning libbpf: map '__augmented_syscalls__': \ unexpected key size 2 provided, expected 4 when key size is set to int16_t. Therefore change the variable size back to 4 bytes for invocation of bpf_map__update_elem(). Output after: # ./perf test 100 100: perf trace BTF general tests : Ok # Fixes: `c760174401` ("perf cpumap: Reduce cpu size from int to int16_t") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Howard Chu <howardchu95@gmail.com> Cc: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250324152756.3879571-1-tmricht@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-24 13:55:26 -07:00
Ian Rogers	7b172b92c1	perf trace: Fix evlist memory leak Leak sanitizer was reporting a memory leak in the "perf record and replay" test. Add evlist__delete to trace__exit, also ensure trace__exit is called after trace__record. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-15-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:58:35 -07:00
Ian Rogers	874fa827df	perf trace: Fix BTF memory leak Add missing btf__free in trace__exit. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-14-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:58:31 -07:00
Ian Rogers	ccc60dce3e	perf trace: Make syscall table stable Namhyung fixed the syscall table being reallocated and moving by reloading the system call pointer after a move: https://lore.kernel.org/lkml/Z9YHCzINiu4uBQ8B@google.com/ This could be brittle so this patch changes the syscall table to be an array of pointers of "struct syscall" that don't move. Remove unnecessary copies and searches with this change. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-13-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:58:27 -07:00
Ian Rogers	70351029b5	perf thread: Add support for reading the e_machine type for a thread First try to read the e_machine from the dsos associated with the thread's maps. If live use the executable from /proc/pid/exe and read the e_machine from the ELF header. On failure use EM_HOST. Change builtin-trace syscall functions to pass e_machine from the thread rather than EM_HOST, so that in later patches when syscalltbl can use the e_machine the system calls are specific to the architecture. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:58:05 -07:00
Ian Rogers	5c2938fe78	perf syscalltbl: Remove struct syscalltbl The syscalltbl held entries of system call name and number pairs, generated from a native syscalltbl at start up. As there are gaps in the system call number there is a notion of index into the table. Going forward we want the system call table to be identifiable by a machine type, for example, i386 vs x86-64. Change the interface to the syscalltbl so (1) a (currently unused machine type of EM_HOST) is passed (2) the index to syscall number and system call name mapping is computed at build time. Two tables are used for this, an array of system call number to name, an array of system call numbers sorted by the system call name. The sorted array doesn't store strings in part to save memory and relocations. The index notion is carried forward and is an index into the sorted array of system call numbers, the data structures are opaque (held only in syscalltbl.c), and so the number of indices for a machine type is exposed as a new API. The arrays are computed in the syscalltbl.sh script and so no start-up time computation and storage is necessary. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Howard Chu <howardchu95@gmail.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:57:57 -07:00
Ian Rogers	3d94b8441c	perf trace: Reorganize syscalls Identify struct syscall information in the syscalls table by a machine type and syscall number, not just system call number. Having the machine type means that 32-bit system calls can be differentiated from 64-bit ones on a machine capable of both. Having a table for all machine types and all system call numbers would be too large, so maintain a sorted array of system calls as they are encountered. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Howard Chu <howardchu95@gmail.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20250319050741.269828-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-03-20 22:57:53 -07:00
Athira Rajeev	2337b7251d	perf trace: Add missing perf_tool__init() Perf trace on perf.data fails as below: ./perf trace record -- sleep 1 ./perf trace -i perf.data perf: Segmentation fault Segmentation fault (core dumped) Backtrace pointed to : ?? () perf_session.process_user_event () reader.read_event () perf_session.process_events () cmd_trace () run_builtin () handle_internal_command () main () Further debug pointed that, segmentation fault happens when trying to access id_index. Code snippet: case PERF_RECORD_ID_INDEX: err = tool->id_index(session, event); Since 'commit `15d4a6f41d` ("perf tool: Remove perf_tool__fill_defaults()")', perf_tool__fill_defaults is removed. All tools are initialized using perf_tool__init() prior to use. But in builtin-trace, perf_tool__init is not used and hence the defaults are not initialized. Use perf_tool__init() in perf trace to handle the initialization. Reported-by: Tejas Manhas <Tejas.Manhas1@ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Link: https://lore.kernel.org/r/20250225113157.28836-1-atrajeev@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-02-27 08:46:45 -08:00
Ian Rogers	dc6d2bc2d8	perf sample: Make user_regs and intr_regs optional The struct dump_regs contains 512 bytes of cache_regs, meaning the two values in perf_sample contribute 1088 bytes of its total 1384 bytes size. Initializing this much memory has a cost reported by Tavian Barnes <tavianator@tavianator.com> as about 2.5% when running `perf script --itrace=i0`: https://lore.kernel.org/lkml/d841b97b3ad2ca8bcab07e4293375fb7c32dfce7.1736618095.git.tavianator@tavianator.com/ Adrian Hunter <adrian.hunter@intel.com> replied that the zero initialization was necessary and couldn't simply be removed. This patch aims to strike a middle ground of still zeroing the perf_sample, but removing 79% of its size by make user_regs and intr_regs optional pointers to zalloc-ed memory. To support the allocation accessors are created for user_regs and intr_regs. To support correct cleanup perf_sample__init and perf_sample__exit functions are created and added throughout the code base. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250113194345.1537821-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-02-12 20:06:11 -08:00
Namhyung Kim	fc00897c8a	perf trace: Add --summary-mode option The --summary-mode option will select how to show the syscall summary at the end. By default, it'll show the summary for each thread and it's the same as if --summary-mode=thread is passed. The other option is to show total summary, which is --summary-mode=total. I'd like to have this instead of a separate option like --total-summary because we may want to add a new summary mode (by cgroup) later. $ sudo ./perf trace -as --summary-mode=total sleep 1 Summary of events: total, 21580 events syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ epoll_wait 1305 0 14716.712 0.000 11.277 551.529 8.87% futex 1256 89 13331.197 0.000 10.614 733.722 15.49% poll 669 0 6806.618 0.000 10.174 459.316 11.77% ppoll 220 0 3968.797 0.000 18.040 516.775 25.35% clock_nanosleep 1 0 1000.027 1000.027 1000.027 1000.027 0.00% epoll_pwait 21 0 592.783 0.000 28.228 522.293 88.29% nanosleep 16 0 60.515 0.000 3.782 10.123 33.33% ioctl 510 0 4.284 0.001 0.008 0.182 8.84% recvmsg 1434 775 3.497 0.001 0.002 0.174 6.37% write 1393 0 2.854 0.001 0.002 0.017 1.79% read 1063 100 2.236 0.000 0.002 0.083 5.11% ... Reviewed-by: Howard Chu <howardchu95@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250205205443.1986408-5-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-02-12 19:44:16 -08:00
Namhyung Kim	ef2da619b1	perf trace: Convert syscall_stats to hashmap It was using a RBtree-based int-list as a hash and a custom resort logic for that. As we have hashmap, let's convert to it and add a custom sort function for the hashmap entries using an array. It should be faster and more light-weighted. It's also to prepare supporting system-wide syscall stats. No functional changes intended. Reviewed-by: Howard Chu <howardchu95@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250205205443.1986408-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-02-12 19:44:15 -08:00
Namhyung Kim	c7f821b876	perf trace: Allocate syscall stats only if summary is on The syscall stats are used only when summary is requested. Let's avoid unnecessary operations. While at it, let's pass 'trace' pointer directly instead of passing 'output' file pointer and 'summary' option in the 'trace' separately. Reviewed-by: Howard Chu <howardchu95@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250205205443.1986408-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-02-12 19:44:10 -08:00
Namhyung Kim	9e676a024f	Linux 6.14-rc1 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmegAi4eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG+cMH/jFx5lmvzVObuStc OdqfdMJVF238cX3iovDF6hLMDCuSgYY9CX5FYmd7pGtxGuUEecSLxin+WbJcxfin WBHzgPP+hmcjqpU0yCd3azITi8BHJeFCgT86OM/1Rsv82M4T/xWxBIET79izQJ0E 5L9KzlmPMLTLbLPVa+wookXfoJOycWRDCN6p/jxTLzeM/szqDlokAsSf19iodkl/ 59Gnk5oEYneqyt4FdTgxWcq1fteTlzZJgC6heN5XIjZuSN1ME11N4QO0xu+ld3UA nzbpnNwCRIl50yO5+pvYpkoRrHDwxjJ7an9sliWAHxDt/etVngTaSsl8uGht/9QK +4Vi48I= =TI43 -----END PGP SIGNATURE----- Merge tag 'v6.14-rc1' into perf-tools-next To get the various fixes in the current master. Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-02-05 14:57:18 -08:00
Howard Chu	c7b87ce0dd	perf trace: Fix runtime error of index out of bounds libtraceevent parses and returns an array of argument fields, sometimes larger than RAW_SYSCALL_ARGS_NUM (6) because it includes "__syscall_nr", idx will traverse to index 6 (7th element) whereas sc->fmt->arg holds 6 elements max, creating an out-of-bounds access. This runtime error is found by UBsan. The error message: $ sudo UBSAN_OPTIONS=print_stacktrace=1 ./perf trace -a --max-events=1 builtin-trace.c:1966:35: runtime error: index 6 out of bounds for type 'syscall_arg_fmt [6]' #0 0x5c04956be5fe in syscall__alloc_arg_fmts /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:1966 #1 0x5c04956c0510 in trace__read_syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2110 #2 0x5c04956c372b in trace__syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2436 #3 0x5c04956d2f39 in trace__init_syscalls_bpf_prog_array_maps /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:3897 #4 0x5c04956d6d25 in trace__run /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:4335 #5 0x5c04956e112e in cmd_trace /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:5502 #6 0x5c04956eda7d in run_builtin /home/howard/hw/linux-perf/tools/perf/perf.c:351 #7 0x5c04956ee0a8 in handle_internal_command /home/howard/hw/linux-perf/tools/perf/perf.c:404 #8 0x5c04956ee37f in run_argv /home/howard/hw/linux-perf/tools/perf/perf.c:448 #9 0x5c04956ee8e9 in main /home/howard/hw/linux-perf/tools/perf/perf.c:556 #10 0x79eb3622a3b7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #11 0x79eb3622a47a in __libc_start_main_impl ../csu/libc-start.c:360 #12 0x5c04955422d4 in _start (/home/howard/hw/linux-perf/tools/perf/perf+0x4e02d4) (BuildId: 5b6cab2d59e96a4341741765ad6914a4d784dbc6) 0.000 ( 0.014 ms): Chrome_ChildIO/117244 write(fd: 238, buf: !, count: 1) = 1 Fixes: `5e58fcfaf4` ("perf trace: Allow allocating sc->arg_fmt even without the syscall tracepoint") Signed-off-by: Howard Chu <howardchu95@gmail.com> Link: https://lore.kernel.org/r/20250122025519.361873-1-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-01-28 09:27:27 -08:00
Benjamin Peterson	0aefb3df8b	perf trace: Fix return value of trace__fprintf_tp_fields This function formerly returned twice the number of bytes printed. Signed-off-by: Benjamin Peterson <benjamin@engflow.com> Reviewed-by: Howard Chu <howardchu95@gmail.com> Link: https://lore.kernel.org/r/20250123-void-fprintf_tp_fields-v2-1-6038f8224987@engflow.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-01-24 13:21:49 -08:00
Namhyung Kim	4f90ed0ae3	perf trace: Fix unaligned access for augmented args Some version of compilers reported unaligned accesses in perf trace when undefined-behavior sanitizer is on. I found that it uses raw data in the sample directly and assuming it's properly aligned. Unlike other sample fields, the raw data is not 8-byte aligned because there's a size field (u32) before the actual data. So I added a static buffer in syscall__augmented_args() and return it instead. This is not ideal but should work well as perf trace is single-threaded. A better approach would be aligning the raw data by adding a 4-byte data before the augmented args but I'm afraid it'd break the backward compatibility. Committer testing: To build with the undefined behaviour sanitizer: $ make CC=clang EXTRA_CFLAGS=-fsanitize=undefined -C tools/perf Checking if the resulting binary is instrumented: root@number:~# nm ~/bin/perf | grep ubsan | wc -l 113 root@number:~# nm ~/bin/perf | grep ubsan | tail -5 000000000043d5b0 t _ZN7__ubsanL19UBsanOnDeadlySignalEiPvS0_ 000000000043ce50 T _ZNK7__ubsan5Value12getSIntValueEv 000000000043cf40 T _ZNK7__ubsan5Value12getUIntValueEv 000000000043d140 T _ZNK7__ubsan5Value13getFloatValueEv 000000000043cfd0 T _ZNK7__ubsan5Value19getPositiveIntValueEv root@number:~# Now running something that will access timespec, as reported in the Closes URL: root@number:~# perf trace --max-events=1 -e nano sleep 1.1 trace/beauty/timespec.c:10:64: runtime error: member access within misaligned address 0x7fc583cfb2a4 for type 'struct augmented_arg', which requires 8 byte alignment 0x7fc583cfb2a4: note: pointer points here 99 99 11 00 10 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 01 e1 f5 05 00 00 00 00 00 00 00 00 ^ SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior trace/beauty/timespec.c:10:64 <SNIP> As Namhyung said we need to make the raw_data to be 64-bit aligned, probably we need to add a PERF_SAMPLE_ALIGNED_RAW with a 64-bit raw_size instead of the current u32 done at kernel/events/core.c, perf_output_sample(), that perf_output_put(handle, raw->size) where raw->size is an u32 and then the raw_data is always 64-bit unaligned... After the patch: root@number:~# perf trace -e nano sleep 1.1 0.000 (1100.064 ms): sleep/1984224 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 100000001 }, rmtp: 0x7fff5b3fe970) = 0 root@number:~# Closes: https://lore.kernel.org/r/Z2STgyD1p456Qqhg@google.com Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250102201248.790841-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-01-10 10:59:42 -03:00
Charlie Jenkins	3cc550f5bb	perf tools: Remove dependency on libaudit All architectures now support HAVE_SYSCALL_TABLE_SUPPORT, so the flag is no longer needed. With the removal of the flag, the related GENERIC_SYSCALL_TABLE can also be removed. libaudit was only used as a fallback for when HAVE_SYSCALL_TABLE_SUPPORT was not defined, so libaudit is also no longer needed for any architecture. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Günther Noack <gnoack@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-16-7543b5293098@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-01-10 10:59:42 -03:00
Ian Rogers	16ecb4316f	perf env: Move arch errno function to only use in env Move arch_syscalls__strerrno_function out of builtin-trace.c to env.c so that there isn't a util to builtin function call. This allows the python.c stub to be removed. Also, remove declaration/prototype from env.h and make static to reduce scope. The include is moved inside ifdefs to avoid, "defined but unused warnings". Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-15-irogers@google.com perf: perf python: Correctly throw IndexError Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-12-18 16:24:33 -03:00
Ian Rogers	c46d634a03	perf evsel: Add/use accessor for tp_format Add an accessor function for tp_format. Rather than search+replace uses try to use a variable and reuse it. Add additional NULL checks when accessing/using the value. Make sure the PTR_ERR is nulled out on error path in evsel__newtp_idx. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Zixian Cai <fzczx123@gmail.com> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20241118225345.889810-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-12-09 17:52:42 -03:00
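
The "perf syscalltbl: Remove struct syscalltbl" entry (5c2938fe78) in the log above describes two build-time arrays: one mapping system call number to name, and one holding system call numbers sorted by name. The following is a minimal sketch of how such a pair could be queried, not the actual perf code. Every identifier here (`demo_num_to_name`, `demo_sorted_names`, `demo_syscall_name`, `demo_syscall_id`) and the table contents are hypothetical; the real tables are generated by syscalltbl.sh and kept private to syscalltbl.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical number -> name table; index 4 is a deliberate gap (NULL). */
static const char *const demo_num_to_name[] = {
	[0] = "read",
	[1] = "write",
	[2] = "open",
	[3] = "close",
	[5] = "fstat",
};
#define DEMO_NUM_TO_NAME_SIZE \
	(sizeof(demo_num_to_name) / sizeof(demo_num_to_name[0]))

/*
 * Hypothetical "sorted by name" table. It stores only syscall numbers,
 * ordered by name: close, fstat, open, read, write. No strings are
 * duplicated here, which is the memory saving the commit message mentions.
 */
static const int demo_sorted_names[] = { 3, 5, 2, 0, 1 };
#define DEMO_SORTED_SIZE \
	(sizeof(demo_sorted_names) / sizeof(demo_sorted_names[0]))

/* Number -> name: a bounds-checked array lookup; gaps yield NULL. */
static const char *demo_syscall_name(int id)
{
	if (id < 0 || (size_t)id >= DEMO_NUM_TO_NAME_SIZE)
		return NULL;
	return demo_num_to_name[id];
}

/* Name -> number: binary search over the sorted array of numbers. */
static int demo_syscall_id(const char *name)
{
	size_t lo = 0, hi = DEMO_SORTED_SIZE;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int id = demo_sorted_names[mid];
		int cmp = strcmp(name, demo_num_to_name[id]);

		if (cmp == 0)
			return id;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1; /* not found */
}
```

In this sketch the "index" notion from the commit message corresponds to a position in `demo_sorted_names`, so iterating indices 0 to `DEMO_SORTED_SIZE - 1` visits the system calls in name order while skipping gaps in the number space.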

845 Commits
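
The "perf trace: Fix unaligned access for augmented args" entry (4f90ed0ae3) explains that the raw sample data follows a u32 size field, so it is not 8-byte aligned, and that the fix copies it into a static buffer. A minimal sketch of that general technique follows, assuming a hypothetical struct `demo_augmented_arg` and helper `demo_aligned_view`; it is not the code from syscall__augmented_args().

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a payload struct that needs 8-byte alignment. */
struct demo_augmented_arg {
	uint64_t size;
	uint64_t value;
};

/*
 * Copy a possibly misaligned payload into a static, naturally aligned
 * buffer and return a pointer to the copy. memcpy() has no alignment
 * requirement on its source, so this avoids dereferencing a misaligned
 * struct pointer. As the commit message notes for the real fix, a static
 * buffer is only safe while the caller is single-threaded, and the result
 * is overwritten by the next call.
 */
static const struct demo_augmented_arg *
demo_aligned_view(const void *raw, size_t len)
{
	static struct demo_augmented_arg buf;

	if (len < sizeof(buf))
		return NULL; /* payload too short to hold the struct */

	memcpy(&buf, raw, sizeof(buf));
	return &buf;
}
```

A caller would pass the address just past the 4-byte size field together with the remaining length, then read fields through the returned pointer instead of casting the raw address directly.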