Steinar H. Gunderson [Sat, 3 Aug 2024 15:20:07 +0000 (17:20 +0200)]
perf annotate: Split out read_symbol()
The Capstone disassembler code has a useful code snippet to read the
bytes for a given code symbol into memory. Split it out into its own
function, so that the LLVM disassembler can use it in the next patch.
Signed-off-by: Steinar H. Gunderson <sesse@google.com>
Cc: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20240803152008.2818485-2-sesse@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Steinar H. Gunderson [Sat, 3 Aug 2024 15:20:06 +0000 (17:20 +0200)]
perf report: Support LLVM for addr2line()
In addition to the existing support for libbfd and calling out to
an external addr2line command, add support for using libllvm directly.
This is both faster than libbfd, and can be enabled in distro builds
(the LLVM license has an explicit provision for GPLv2 compatibility).
Thus, it is set as the primary choice if available.
As an example, running 'perf report' on a medium-size profile with
DWARF-based backtraces took 58 seconds with LLVM, 78 seconds with
libbfd, 153 seconds with external llvm-addr2line, and I got tired and
aborted the test after waiting for 55 minutes with external bfd
addr2line (which is the default for perf as compiled by distributions
today).
Evidently, for this case, the bfd addr2line process needs 18 seconds (on
a 5.2 GHz Zen 3) to load the .debug ELF in question, hits the 1-second
timeout and gets killed during initialization, getting restarted anew
every time. Having an in-process addr2line makes this much more robust.
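The restart loop described above can be modelled in a few lines. This is only a sketch of the failure mode, not perf's code: the function name is hypothetical, and the 18-second load time and 1-second timeout are the figures quoted above.

```python
def external_helper_answers(n_queries, load_s, timeout_s):
    """Count queries answered by a helper that is killed and restarted
    whenever it does not respond within timeout_s (hypothetical model)."""
    answered = 0
    loaded = False
    for _ in range(n_queries):
        if not loaded:
            if load_s > timeout_s:
                # Killed during initialization; the next query starts over.
                continue
            loaded = True
        answered += 1
    return answered


stuck = external_helper_answers(100, load_s=18, timeout_s=1)
fine = external_helper_answers(100, load_s=0.5, timeout_s=1)
```

With a load time above the timeout the helper never gets to answer, which is why an in-process implementation that loads the debug info once is more robust.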
As future extensions, libllvm can be used in many other places where
we currently use libbfd or other libraries:
- Symbol enumeration (in particular, for PE binaries).
- Demangling (including non-Itanium demangling, e.g. Microsoft
or Rust).
- Disassembling (perf annotate).
However, these are much less pressing; most people don't profile PE
binaries, and perf has non-bfd paths for ELF. The same with demangling;
the default _cxa_demangle path works fine for most users, and while bfd
objdump can be slow on large binaries, it is possible to use
--objdump=llvm-objdump to get the speed benefits. (It appears
LLVM-based demangling is very simple, should we want that.)
Tested with LLVM 14, 15, 16, 18 and 19. For some reason, LLVM 12 was not
correctly detected using feature_check, and thus was not tested.
Committer notes:
Added the name and a __maybe_unused to address:
1 13.50 almalinux:8 : FAIL gcc version 8.5.0
20210514 (Red Hat 8.5.0-22) (GCC)
util/srcline.c: In function 'dso__free_a2l':
util/srcline.c:184:20: error: parameter name omitted
void dso__free_a2l(struct dso *)
^~~~~~~~~~~~
make[3]: *** [/git/perf-6.11.0-rc3/tools/build/Makefile.build:158: util] Error 2
Signed-off-by: Steinar H. Gunderson <sesse@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20240803152008.2818485-1-sesse@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 30 Aug 2024 22:53:47 +0000 (19:53 -0300)]
perf tools: Build x86 32-bit syscall table from arch/x86/entry/syscalls/syscall_32.tbl
This removes one more use of the audit libs and addresses a problem
reported with a recent change, where a function isn't available when
using the audit libs method. That method should really go away, and this
is one step in that direction.
The script used to generate the 64-bit syscall table was already
parametrized to generate for both 64-bit and 32-bit, so just use it and
wire the generated table to the syscalltbl.c routines.
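As a rough illustration of what such a generated table holds, the Python sketch below parses lines in the "number abi name entry-point" layout of syscall_32.tbl into a number-to-name mapping. It is not the generator script from the tree; the function name and the three sample rows are assumptions made for the example.

```python
def parse_syscall_tbl(text, abis=("i386",)):
    """Map syscall number -> name for the selected ABIs (illustrative)."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # malformed row: number, abi and name are required
        nr, abi, name = fields[0], fields[1], fields[2]
        if abi in abis:
            table[int(nr)] = name
    return table


# Sample rows written for this example, in the .tbl column layout.
SAMPLE = """\
# <number> <abi> <name> <entry point>
0 i386 restart_syscall sys_restart_syscall
1 i386 exit sys_exit
2 i386 fork sys_fork
"""

table = parse_syscall_tbl(SAMPLE)
```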
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Jiri Slaby <jirislaby@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/6fe63fa3-6c63-4b75-ac09-884d26f6fb95@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Yang Jihong [Mon, 19 Aug 2024 02:47:20 +0000 (10:47 +0800)]
perf sched timehist: Fixed timestamp error when unable to confirm event sched_in time
If the sched_in event for the current task is not recorded, the sched_in
timestamp is set to the end_time of the time window of interest, so an
incorrect timestamp is shown. In this case, ignore the event.
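The rule can be sketched as a small window-clamping function. This is a Python model of the behaviour described above, not the code in builtin-sched.c; the function name is hypothetical, and a value of 0 is assumed to mean "not recorded" for tprev and "unset" for the window bounds.

```python
def clamp_to_window(t, tprev, start, end):
    """Return the (t, tprev) pair to report, or None to skip the event.

    t is the sample time, tprev the task's last sched_in time.
    """
    if start and start > t:
        return None          # sample is before the window
    if tprev and start > tprev:
        tprev = start        # sched_in before the window: clamp to its start
    if end:
        if not tprev or tprev > end:
            return None      # sched_in unknown or after the window: ignore
        if t > end:
            t = end          # close out stats at the end of the window
    return t, tprev


skipped = clamp_to_window(2090450.763231, 0, 100, 200)
inside = clamp_to_window(150, 120, 100, 200)
clamped = clamp_to_window(250, 150, 100, 200)
```

Without the "not tprev" test, the first call would fall through and report the window end, 200, as its timestamp.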
Test scenario:
perf[1229608] does not record the first sched_in event, run time and sch delay are both 0
# perf sched timehist
Samples of sched_switch event do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
2090450.763231 [0000] perf[1229608] 0.000 0.000 0.000
2090450.763235 [0000] migration/0[15] 0.000 0.001 0.003
2090450.763263 [0001] perf[1229608] 0.000 0.000 0.000
2090450.763268 [0001] migration/1[21] 0.000 0.001 0.004
2090450.763302 [0002] perf[1229608] 0.000 0.000 0.000
2090450.763309 [0002] migration/2[27] 0.000 0.001 0.007
2090450.763338 [0003] perf[1229608] 0.000 0.000 0.000
2090450.763343 [0003] migration/3[33] 0.000 0.001 0.004
Before:
When an arbitrary time window of interest is specified, the timestamp is set to an incorrect value:
# perf sched timehist --time 100,200
Samples of sched_switch event do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
200.000000 [0000] perf[1229608] 0.000 0.000 0.000
200.000000 [0001] perf[1229608] 0.000 0.000 0.000
200.000000 [0002] perf[1229608] 0.000 0.000 0.000
200.000000 [0003] perf[1229608] 0.000 0.000 0.000
200.000000 [0004] perf[1229608] 0.000 0.000 0.000
200.000000 [0005] perf[1229608] 0.000 0.000 0.000
200.000000 [0006] perf[1229608] 0.000 0.000 0.000
200.000000 [0007] perf[1229608] 0.000 0.000 0.000
After:
# perf sched timehist --time 100,200
Samples of sched_switch event do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
Fixes: 853b74071110bed3 ("perf sched timehist: Add option to specify time window of interest")
Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsa@cumulusnetworks.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819024720.2405244-1-yangjihong@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 28 Aug 2024 05:29:53 +0000 (22:29 -0700)]
perf lock contention: Fix spinlock and rwlock accounting
The spinlock and rwlock use a single-element per-cpu array to track
current locks for performance reasons. But this means the key is always
available, so the lock stats in the array cannot simply be accounted
because some of the entries are invalid.
In fact, the contention_end() BPF program invalidates the entry by
setting the 'lock' value to 0 instead of deleting the entry as it does
for the hashmap. So account_end_timestamp() should skip entries with a
lock value of 0.
Otherwise, it'd have spurious high contention on an idle machine:
$ sudo perf lock con -ab -Y spinlock sleep 3
contended total wait max wait avg wait type caller
8 4.72 s 1.84 s 590.46 ms spinlock rcu_core+0xc7
8 1.87 s 1.87 s 233.48 ms spinlock process_one_work+0x1b5
2 1.87 s 1.87 s 933.92 ms spinlock worker_thread+0x1a2
3 1.81 s 1.81 s 603.93 ms spinlock tmigr_update_events+0x13c
2 1.72 s 1.72 s 861.98 ms spinlock tick_do_update_jiffies64+0x25
6 42.48 us 13.02 us 7.08 us spinlock futex_q_lock+0x2a
1 13.03 us 13.03 us 13.03 us spinlock futex_wake+0xce
1 11.61 us 11.61 us 11.61 us spinlock rcu_core+0xc7
I don't believe it has contention on a spinlock longer than 1 second.
After this change, it only reports some small contentions.
$ sudo perf lock con -ab -Y spinlock sleep 3
contended total wait max wait avg wait type caller
4 133.51 us 43.29 us 33.38 us spinlock tick_do_update_jiffies64+0x25
4 69.06 us 31.82 us 17.27 us spinlock process_one_work+0x1b5
2 50.66 us 25.77 us 25.33 us spinlock rcu_core+0xc7
1 28.45 us 28.45 us 28.45 us spinlock rcu_core+0xc7
1 24.77 us 24.77 us 24.77 us spinlock tmigr_update_events+0x13c
1 23.34 us 23.34 us 23.34 us spinlock raw_spin_rq_lock_nested+0x15
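The skip can be sketched as follows. This is a Python model of the idea, not the tool's C code: the entry layout (a dict with "lock" and "timestamp") and the sample values are assumptions made for the illustration.

```python
def account_end_timestamp(percpu_entries, end_ns):
    """Return {lock: wait_ns} for slots still waiting at end_ns.

    Each entry models the single-element per-cpu slot: the slot always
    exists, and lock == 0 marks it as invalidated by contention_end().
    """
    waits = {}
    for entry in percpu_entries:
        if entry["lock"] == 0:
            continue  # stale slot: that contention has already ended
        lock = entry["lock"]
        waits[lock] = waits.get(lock, 0) + (end_ns - entry["timestamp"])
    return waits


entries = [
    {"lock": 0, "timestamp": 1_000},                   # invalidated long ago
    {"lock": 0xFFFF0001, "timestamp": 2_999_000_000},  # still contended
]
result = account_end_timestamp(entries, 3_000_000_000)
```

Accounting the first slot as well would add almost the whole 3-second run as wait time, which is the kind of spurious multi-second entry shown in the first output above.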
Fixes: b5711042a1c8cc88 ("perf lock contention: Use per-cpu array map for spinlocks")
Reported-by: Xi Wang <xii@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: bpf@vger.kernel.org
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240828052953.1445862-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 30 Aug 2024 06:51:50 +0000 (23:51 -0700)]
perf lock contention: Do not fail EEXIST for update
When it updates the lock stat for the first time, it needs to create an
element in the BPF hash map.
But if there's a concurrent thread waiting for the same lock (like for
rwsem or rwlock), it might race with the thread and possibly fail to
update with -EEXIST.
In that case, it can look up the map again and put the data there
instead of failing.
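The create-then-retry pattern can be sketched like this. It is a Python model with a dict standing in for the BPF hash map; the class, the function and the racing_insert hook (used only to simulate the concurrent thread) are hypothetical.

```python
import errno


class HashMap:
    """Stand-in for a hash map with create-only insert semantics."""

    def __init__(self):
        self._d = {}

    def insert_noexist(self, key, value):
        if key in self._d:
            return -errno.EEXIST
        self._d[key] = value
        return 0

    def lookup(self, key):
        return self._d.get(key)


def update_stat(stats, key, duration, racing_insert=None):
    data = stats.lookup(key)
    if data is None:
        if racing_insert:
            racing_insert()  # simulate another waiter creating the entry first
        err = stats.insert_noexist(key, {"count": 1, "total": duration})
        if err == 0:
            return 0
        if err != -errno.EEXIST:
            return err
        data = stats.lookup(key)  # lost the race: use the existing entry
        if data is None:
            return -errno.ENOENT
    data["count"] += 1
    data["total"] += duration
    return 0


stats = HashMap()
rc = update_stat(stats, "lockA", 5,
                 racing_insert=lambda: stats.insert_noexist(
                     "lockA", {"count": 1, "total": 7}))
```

Both waits end up in the same entry instead of the second one being dropped.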
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20240830065150.1758962-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 30 Aug 2024 06:51:49 +0000 (23:51 -0700)]
perf lock contention: Simplify spinlock check
The LCB_F_SPIN bit is used for spinlock, rwlock and optimistic spinning
in mutex. In get_tstamp_elem() it needs to check spinlock and rwlock
only. As mutex sets the LCB_F_MUTEX, it can check those two bits and
reduce the number of operations.
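The two-bit test can be sketched as below. The numeric flag values are assumptions chosen for the illustration and are not taken from this commit; only the shape of the check matters.

```python
# Assumed bit positions, for illustration only.
LCB_F_SPIN = 1 << 0
LCB_F_READ = 1 << 1
LCB_F_WRITE = 1 << 2
LCB_F_MUTEX = 1 << 5


def is_spinlock_or_rwlock(flags):
    """True when SPIN is set and MUTEX is not, tested in one comparison."""
    return (flags & (LCB_F_SPIN | LCB_F_MUTEX)) == LCB_F_SPIN


spinlock = is_spinlock_or_rwlock(LCB_F_SPIN)
rwlock_read = is_spinlock_or_rwlock(LCB_F_SPIN | LCB_F_READ)
mutex_spinning = is_spinlock_or_rwlock(LCB_F_MUTEX | LCB_F_SPIN)
```

Masking with both bits and comparing against LCB_F_SPIN accepts spinlocks and rwlocks, and rejects a mutex that is optimistically spinning.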
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20240830065150.1758962-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 30 Aug 2024 06:51:48 +0000 (23:51 -0700)]
perf lock contention: Handle error in a single place
There is some duplicated code doing the same job. Let's add a label and
goto there to handle errors in a single place.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20240830065150.1758962-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 29 Aug 2024 15:01:54 +0000 (08:01 -0700)]
perf test: Additional pipe tests with pipe output written to a file
Additional pipe tests where piped files are written to disk. This
means that spotting a file name of "-" isn't a sufficient "is pipe?"
test.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 29 Aug 2024 15:01:53 +0000 (08:01 -0700)]
perf header: Remove repipe option
The repipe option is no longer used by `perf inject`: repipe_fd is
always -1 and repipe is always false. Remove the options and the
associated code, simplifying with the known constant values of the
removed variables.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 29 Aug 2024 15:01:52 +0000 (08:01 -0700)]
perf inject: Overhaul handling of pipe files
Previously inject->is_pipe was set if the input or output were a
pipe. Determining the input was a pipe had to be done prior to
starting the session and opening the file. This was done by comparing
the input file name with '-' but it fails if the pipe file is written
to disk.
Opening a pipe file from disk will correctly set perf_data.is_pipe, but
this is too late for 'perf inject' and results in a broken file. A
workaround is 'cat pipe_perf|perf inject -i - ...'.
This change removes inject->is_pipe and changes the dependent
conditions to use the is_pipe flag on the input
(inject->session->data) and output files (inject->output). This
ensures the is_pipe condition reflects things like the header being
read.
The change removes the use of perf file header repiping, that is
writing the file header out while reading it in. The case of input
pipe and output file cannot repipe as the attributes for the file are
unknown. To resolve this, write the file header when writing to disk
and as the attributes may be unknown, write them after the data.
Update the session's repipe variable to be trace_event_repipe as those are
the only events now impacted by it. Update __perf_session__new as the
repipe_fd no longer needs passing. Fully removing repipe from session
header reading will be done in a later change.
Committer testing:
root@number:~# perf record -e syscalls:sys_enter_*sleep/max-stack=4/ -o - sleep 0.01 | perf report -i -
# To display the perf.data header info, please use --header/--header-only options.
#
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.050 MB - ]
#
# Total Lost Samples: 0
#
# Samples: 1 of event 'syscalls:sys_enter_clock_nanosleep'
# Event count (approx.): 1
#
# Overhead Command Shared Object Symbol
# ........ ....... ............. ...............................
#
100.00% sleep libc.so.6 [.] clock_nanosleep@GLIBC_2.2.5
|
---__libc_start_main@@GLIBC_2.34
__libc_start_call_main
0x562fc2560a9f
clock_nanosleep@GLIBC_2.2.5
#
# (Tip: Create an archive with symtabs to analyse on other machine: perf archive)
#
root@number:~# perf record -e syscalls:sys_enter_*sleep/max-stack=4/ -o - sleep 0.01 > pipe.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.050 MB - ]
root@number:~# perf report --stdio -i pipe.data
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 1 of event 'syscalls:sys_enter_clock_nanosleep'
# Event count (approx.): 1
#
# Overhead Command Shared Object Symbol
# ........ ....... ............. ...............................
#
100.00% sleep libc.so.6 [.] clock_nanosleep@GLIBC_2.2.5
|
---__libc_start_main@@GLIBC_2.34
__libc_start_call_main
0x55f775975a9f
clock_nanosleep@GLIBC_2.2.5
#
# (Tip: To set sampling period of individual events use perf record -e cpu/cpu-cycles,period=100001/,cpu/branches,period=10001/ ...)
#
root@number:~#
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 29 Aug 2024 15:01:51 +0000 (08:01 -0700)]
perf header: Allow attributes to be written after data
With a file, to write data an offset needs to be known. Typically data
follows the event attributes in a file.
However, if processing a pipe the number of event attributes may not be
known.
It is convenient in that case to write the attributes after the data.
Expand perf_session__do_write_header() to allow this when the data
offset and size are known.
This approach may be useful for more than just taking a pipe file to
write into a data file: `perf inject --itrace` will reserve an
additional 8kb for attributes, which would be unnecessary if the
attributes were written after the data.
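The two orderings can be compared with a small offset calculation. This is a simplified Python model, not perf_session__do_write_header(): it treats the file as a header followed by only two sections, and the function name and byte sizes are made up for the example.

```python
def section_layout(header_size, attrs_size, data_size, attrs_after_data):
    """Return (offset, size) pairs for the attrs and data sections."""
    if attrs_after_data:
        # Data starts right after the header, so its offset is known
        # before the number of attributes is.
        data_off = header_size
        attrs_off = data_off + data_size
    else:
        # Usual order: attributes first, so their size must be known
        # (or space reserved) before any data is written.
        attrs_off = header_size
        data_off = attrs_off + attrs_size
    return {"attrs": (attrs_off, attrs_size), "data": (data_off, data_size)}


usual = section_layout(104, 272, 1000, attrs_after_data=False)
late = section_layout(104, 272, 1000, attrs_after_data=True)
```

In the second layout the data offset does not depend on the attribute size at all, which is what makes the pipe-to-file case workable.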
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 29 Aug 2024 15:01:50 +0000 (08:01 -0700)]
perf header: Fail read if header sections overlap
Buggy perf.data files can have the attributes and data
overlapping.
For example, when processing pipe data the attributes aren't known and
so file offset header calculations can consider them not present.
Later this can cause the attributes to overwrite the data. This can be
seen in:
$ perf record -o - true > a.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.059 MB - ]
$ perf inject -i a.data -o b.data
$ perf report --stats -i b.data
0x68 [0]: failed to process type: 510379 [Invalid argument]
Error:
failed to process sample
$
This change makes reading the corrupt file fail:
$ perf report --stats -i b.data
Perf file header corrupt: Attributes and data overlap
incompatible file format (rerun with -v to learn more)
$
Which is more informative.
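A check of this kind amounts to an interval-overlap test on the two sections. The Python sketch below shows the generic test on half-open byte ranges; it is not the code added to the header reader, and the function name and sample offsets are hypothetical.

```python
def sections_overlap(a_off, a_size, b_off, b_size):
    """True when [a_off, a_off + a_size) and [b_off, b_off + b_size) intersect."""
    return a_off < b_off + b_size and b_off < a_off + a_size


# Attributes directly followed by data: adjacent, so no overlap.
ok = sections_overlap(104, 272, 376, 1000)
# Attributes written at an offset that lies inside the data section.
corrupt = sections_overlap(104, 272, 104, 1000)
```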
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 29 Aug 2024 15:01:49 +0000 (08:01 -0700)]
perf header: Add kerneldoc to 'struct perf_file_header'
Some of the values are a little strange so add documentation to
resolve ambiguity.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 29 Aug 2024 15:01:48 +0000 (08:01 -0700)]
perf session: Document 'struct perf_session' and constify its 'auxtrace' member
perf_session is a central data structure to the tool so let's comment
it. The auxtrace callbacks are never modified in session so constify.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240829150154.37929-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Mon, 22 Jul 2024 10:11:49 +0000 (11:11 +0100)]
perf: cs-etm: Print queue number in raw trace dump
Now that we have overlapping trace IDs it's also useful to know what the
queue number is to be able to distinguish the source of the trace so
print it inline. Hide it behind the -v option because it might not be
obvious to users what the queue number is.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240722101202.26915-8-james.clark@linaro.org
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Mon, 22 Jul 2024 10:11:48 +0000 (11:11 +0100)]
perf: cs-etm: Support version 0.1 of HW_ID packets
v0.1 HW_ID packets have a new field that describes which sink each CPU
writes to. Use the sink ID to link trace ID maps to each other so that
mappings are shared wherever the sink is shared.
Also update the error message to show that overlapping IDs aren't an
error in per-thread mode, just not supported. In the future we can
use the CPU ID from the AUX records, or watch for changing sink IDs on
HW_ID packets to use the correct decoders.
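The sharing rule can be sketched with plain dictionaries. This is a Python model under the assumption that there is one queue per CPU; the function name, the sink IDs and the metadata strings are made up for the example.

```python
def build_trace_id_maps(cpu_to_sink):
    """Give every CPU queue a trace ID map, shared between CPUs whose
    HW_ID packets name the same sink."""
    maps_by_sink = {}
    queue_maps = {}
    for cpu, sink in cpu_to_sink.items():
        # setdefault returns the existing map for a sink already seen,
        # so queues on one sink literally share the same object.
        queue_maps[cpu] = maps_by_sink.setdefault(sink, {})
    return queue_maps


queues = build_trace_id_maps({0: 10, 1: 10, 2: 20})
queues[0][0x10] = "cpu0 metadata"   # mapping learned via CPU 0
queues[2][0x10] = "cpu2 metadata"   # same trace ID on a different sink
```

CPU 1 sees the mapping added through CPU 0 because they write to the same sink, while CPU 2 can reuse trace ID 0x10 without a conflict.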
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240722101202.26915-7-james.clark@linaro.org
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Mon, 22 Jul 2024 10:11:47 +0000 (11:11 +0100)]
perf: cs-etm: Only save valid trace IDs into files
This isn't a bug because Perf always masks with
CORESIGHT_TRACE_ID_VAL_MASK before using these values, but to avoid it
looking like it could be, make an effort to not save bad values.
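Masking before saving can be sketched as below. The mask value 0x7F and the use of bit 31 as a flag are assumptions made for the illustration, not values taken from this commit, and the function name is hypothetical.

```python
CORESIGHT_TRACE_ID_VAL_MASK = 0x7F  # assumed: low 7 bits hold the trace ID
FLAG_BIT = 1 << 31                  # assumed: a non-ID flag bit in the raw value


def trace_id_to_save(raw_value):
    """Keep only the trace ID bits so no flag bits end up in the file."""
    return raw_value & CORESIGHT_TRACE_ID_VAL_MASK


saved = trace_id_to_save(FLAG_BIT | 0x10)
```

Readers that mask again get the same value either way; the saved field just no longer looks like it contains an out-of-range ID.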
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240722101202.26915-6-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Mon, 22 Jul 2024 10:11:46 +0000 (11:11 +0100)]
perf: cs-etm: Create decoders based on the trace ID mappings
Now that each queue has a unique set of trace ID mappings, use this
list to create the decoders. In unformatted mode just add a single
mapping so only one decoder is made.
Previously each queue would have a decoder created for each traced CPU
on the system but this won't work anymore because CPUs can have
overlapping trace IDs.
This also means that the CORESIGHT_TRACE_ID_UNUSED_FLAG isn't needed
any more. If mappings aren't added then decoders aren't created, rather
than needing a flag to suppress creation.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240722101202.26915-5-james.clark@linaro.org
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Mon, 22 Jul 2024 10:11:45 +0000 (11:11 +0100)]
perf: cs-etm: Move traceid_list to each queue
The global list won't work for per-sink trace ID allocations, so put a
list in each queue where the IDs will be unique to that queue.
To keep the same behavior as before, for version 0 of the HW_ID packets,
copy all the HW_ID mappings into all queues.
This change doesn't affect the decoders, only trace ID lookups on the
Perf side. The decoders are still created with global mappings which
will be fixed in a later commit.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240722101202.26915-4-james.clark@linaro.org
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Mon, 22 Jul 2024 10:11:44 +0000 (11:11 +0100)]
perf: cs-etm: Allocate queues for all CPUs
Make cs_etm__setup_queue() set up a queue even if it's empty, and
pre-allocate queues based on the max CPU that was recorded. In per-CPU
mode aux queues are indexed by CPU ID even if not all CPUs are
recorded; sparse queue arrays aren't used.
This will allow HW_IDs to be saved even if no aux data was received in
that queue without having to call cs_etm__setup_queue() from two
different places.
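The pre-allocation can be sketched as follows. This is a Python model of the indexing scheme only; the function name and the queue fields are hypothetical.

```python
def allocate_queues(cpus_with_aux_data):
    """Create one queue per CPU ID up to the highest recorded CPU, so that
    queues[cpu] is valid even for CPUs that produced no aux data."""
    nr_queues = max(cpus_with_aux_data, default=-1) + 1
    return [
        {"cpu": cpu, "has_aux_data": cpu in cpus_with_aux_data}
        for cpu in range(nr_queues)
    ]


queues = allocate_queues({0, 3})
```

CPUs 1 and 2 get empty queues here, so information such as a HW_ID mapping can still be stored against them by index.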
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240722101202.26915-3-james.clark@linaro.org
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Mon, 22 Jul 2024 10:11:43 +0000 (11:11 +0100)]
perf cs-etm: Create decoders after both AUX and HW_ID search passes
Both of these passes gather information about how to create the
decoders. AUX records determine formatted/unformatted, and the HW_IDs
determine the traceID/metadata mappings.
Therefore it makes sense to cache the information and wait until both
passes are over before creating the decoders, rather than creating them
at the first HW_ID found.
This will allow a simplification of the creation process where
cs_etm_queue->traceid_list will exclusively be used to create the decoders,
rather than the current two methods depending on whether the trace is
formatted or not.
Previously the sample CPU from the AUX record was used to initialize
the decoder CPU, but actually sample CPU == AUX queue index in per-CPU
mode, so saving the sample CPU isn't required.
Similarly formatted/unformatted was used upfront to create the decoders,
but now it's cached until later.
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240722101202.26915-2-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 29 Aug 2024 14:46:40 +0000 (11:46 -0300)]
Revert "tools build: Remove leftover libcap tests that prevents fast path feature detection from working"
Ian pointed out that the libcap feature test is also used by bpftool, so
we can't remove it just because perf stopped using it. Revert the
removal of the feature test.
Since both perf and bpftool use the fast path feature detection
(tools/build/feature/test-all.c), probably the best thing is to keep
libcap-devel installed when building perf even though it is not used there.
This reverts commit 47b3b6435e4bfb61ae8ffc63a11bd3c310f69acf.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 28 Aug 2024 22:06:47 +0000 (19:06 -0300)]
tools build: Remove leftover libcap tests that prevents fast path feature detection from working
I noticed that the fast path feature detection was failing:
$ cat /tmp/build/perf-tools-next/feature/test-all.make.output
/usr/bin/ld: cannot find -lcap: No such file or directory
collect2: error: ld returned 1 exit status
$
The patch removing the dependency (Fixes tag below) didn't remove the
detection of libcap, and as the fast path feature detection (test-all.c)
had -lcap in its Makefile list of libraries to link, it was failing
when libcap-devel is not available. Fix it by removing those leftover
files.
Fixes: e25ebda78e230283 ("perf cap: Tidy up and improve capability testing")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Zs-gjOGFWtAvIZit@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 26 Aug 2024 22:10:45 +0000 (15:10 -0700)]
perf test: Add 'perf record cgroup' filtering test
$ sudo ./perf test filtering -vv
96: perf record sample filtering (by BPF) tests:
--- start ---
test child forked, pid 2966908
Checking BPF-filter privilege
Basic bpf-filter test
Basic bpf-filter test [Success]
Failing bpf-filter test
Failing bpf-filter test [Success]
Group bpf-filter test
Group bpf-filter test [Success]
Multiple bpf-filter test
Multiple bpf-filter test [Success]
Cgroup bpf-filter test
Cgroup bpf-filter test [Success]
---- end(0) ----
96: perf record sample filtering (by BPF) tests : Ok
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240826221045.1202305-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 26 Aug 2024 22:10:44 +0000 (15:10 -0700)]
perf bpf-filter: Support filtering on cgroups
The new cgroup filter takes either the '==' or the '!=' operator and a
pathname for the target cgroup.
$ perf record -a --all-cgroups -e cycles --filter 'cgroup == /abc/def' -- sleep 1
Users must pass the --all-cgroups option on the command line to enable
cgroup filtering. Technically the option isn't needed, as the filter
can get the current task's cgroup info directly from BPF. But I want
to follow the convention for the other sample info.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240826221045.1202305-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 26 Aug 2024 22:10:43 +0000 (15:10 -0700)]
perf bpf-filter: Add build dependency to header files
The flex and bison files need to be recompiled when one of these header
files is changed.
* util/bpf-filter.h
* util/bpf_skel/sample-filter.h
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240826221045.1202305-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 26 Aug 2024 22:10:42 +0000 (15:10 -0700)]
perf report: Fix segfault when 'sym' sort key is not used
The fields in the hist_entry are filled on demand, which means they only
have meaningful values when the relevant sort keys are used.
So if neither the 'dso' nor the 'sym' sort key is used, the map/symbols
in the hist entry can be garbage, and they must not be accessed
unconditionally.
I got a segfault when I wanted to see cgroup profiles.
$ sudo perf record -a --all-cgroups --synth=cgroup true
$ sudo perf report -s cgroup
Program received signal SIGSEGV, Segmentation fault.
0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
48 return RC_CHK_ACCESS(map)->dso;
(gdb) bt
#0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
#1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344
#2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385
#3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true)
at util/hist.c:644
#4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761
#5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779
#6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015
#7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0)
at util/hist.c:1260
#8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0,
machine=0x5555560388e8) at builtin-report.c:334
#9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128,
sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232
#10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128,
sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271
#11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0,
file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354
#12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132
#13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245
#14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324
#15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342
#16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60)
at util/session.c:780
#17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688,
file_path=0x555556038ff0 "perf.data") at util/session.c:1406
As you can see, entry->ms.map was NULL even though he->ms.map has a
value. This is because the 'sym' sort key is not given, so the code
cannot assume that he->ms.sym and entry->ms.sym are the same. I only
check the 'sym' sort key here as it implies 'dso' behavior (so the maps
are the same).
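A minimal sketch of this kind of guard, using hypothetical demo_* types
rather than the real hist_entry and map structures: the map pointers are
only compared when the caller says the 'sym' sort key is in use, and are
never dereferenced otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the perf definitions. */
struct demo_map {
	int id;
};

struct demo_entry {
	struct demo_map *map;	/* only meaningful when sorting by symbol */
};

/*
 * Returns true when the two entries are known to share a map. Without the
 * 'sym' sort key the map fields may be unset, so they are not dereferenced.
 */
static bool demo_same_map(const struct demo_entry *he,
			  const struct demo_entry *entry, bool sort_has_sym)
{
	assert(he && entry);
	if (!sort_has_sym)
		return false;
	if (!he->map || !entry->map)
		return false;
	return he->map->id == entry->map->id;
}
```

The sort-key check comes first, so a NULL or stale map is never touched
in the 'perf report -s cgroup' style of invocation shown above.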
Fixes: ac01c8c4246546fd ("perf hist: Update hist symbol when updating maps")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Matt Fleming <matt@readmodwrite.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240826221045.1202305-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Fri, 9 Aug 2024 09:54:22 +0000 (10:54 +0100)]
perf test trace_btf_enum: Fix shellcheck warning
Shellcheck versions < v0.7.2 can't follow this path so add the helper to
fix the following warning:
In tests/shell/trace_btf_enum.sh line 13:
. "$(dirname $0)"/lib/probe.sh
^--------------------------^ SC1090: Can't follow non-constant source.
Use a directive to specify location.
Fixes: d66763fed30f0bd8 ("perf test trace_btf_enum: Add regression test for the BTF augmentation of enums in 'perf trace'")
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240809095426.3065163-1-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Leo Yan [Tue, 6 Aug 2024 20:41:23 +0000 (21:41 +0100)]
perf auxtrace: Remove unused 'pmu' pointer from struct auxtrace_record
The 'pmu' pointer in the auxtrace_record structure is no longer used now
that multiple AUX events are supported, so remove it.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240806204130.720977-3-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Leo Yan [Tue, 6 Aug 2024 20:41:22 +0000 (21:41 +0100)]
perf auxtrace: Use evsel__is_aux_event() for checking AUX event
Use evsel__is_aux_event() to decide if an event is an AUX event. This is
a refactoring that replaces the comparison of the PMU type.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240806204130.720977-2-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Lucas Stach [Mon, 1 Jul 2024 17:57:35 +0000 (19:57 +0200)]
perf vendor events arm64: Move Yitian 710 DDR PMU into T-Head directory
The Yitian 710 is not a Freescale/NXP design and thus should
be located in a separate T-Head vendor directory.
Reviewed-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuai Xue <xueshuai@linux.alibaba.com>
Cc: Will Deacon <will@kernel.org>
Cc: kernel@pengutronix.de
Cc: linux-arm-kernel@lists.infradead.org
Cc: patchwork-lst@pengutronix.de
Link: https://lore.kernel.org/r/20240701175735.485655-1-l.stach@pengutronix.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kajol Jain [Tue, 27 Aug 2024 05:32:06 +0000 (11:02 +0530)]
perf vendor events: Move PM_BR_MPRED_CMPL event for power10 platform
Move PM_BR_MPRED_CMPL event from cache.json to frontend.json file
for power10 platform
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20240827053206.538814-3-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kajol Jain [Tue, 27 Aug 2024 05:32:05 +0000 (11:02 +0530)]
perf vendor events power10: Move the JSON/events
Move some of the JSON/events from others.json to more appropriate JSON
files for power10 platform.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20240827053206.538814-2-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kajol Jain [Tue, 27 Aug 2024 05:32:04 +0000 (11:02 +0530)]
perf vendor events power10: Update JSON/events
Update JSON/events for power10 platform with additional events.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20240827053206.538814-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 22 Aug 2024 18:10:27 +0000 (15:10 -0300)]
perf trace: Pass the richer 'struct syscall_arg' pointer to trace__btf_scnprintf()
We'll need it later in the current patch series, and we can still get
the syscall_arg_fmt from syscall_arg->fmt.
Based-on-a-patch-by: Howard Chu <howardchu95@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Zsd8vqCrTh5h69rp@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Howard Chu [Thu, 15 Aug 2024 01:36:17 +0000 (09:36 +0800)]
perf trace: Fix perf trace -p <PID>
'perf trace -p <PID>' works on a syscall that is unaugmented, but doesn't
work on a syscall that's augmented (when it calls perf_event_output() in
BPF).
Let's take open() as an example. open() is augmented in perf trace.
Before:
$ perf trace -e open -p 3792392
? ( ): ... [continued]: open()) = -1 ENOENT (No such file or directory)
? ( ): ... [continued]: open()) = -1 ENOENT (No such file or directory)
We can see there's no output.
After:
$ perf trace -e open -p 3792392
0.000 ( 0.123 ms): a.out/3792392 open(filename: "DINGZHEN", flags: WRONLY) = -1 ENOENT (No such file or directory)
1000.398 ( 0.116 ms): a.out/3792392 open(filename: "DINGZHEN", flags: WRONLY) = -1 ENOENT (No such file or directory)
Reason:
bpf_perf_event_output() will fail when you specify a pid in 'perf trace' (EOPNOTSUPP).
When using 'perf trace -p 114', before perf_event_open(), we'll have PID
= 114, and CPU = -1.
This is bad for bpf-output event, because the ring buffer won't accept
output from BPF's perf_event_output(), making it fail. I'm still trying
to find out why.
If we open bpf-output for every cpu, instead of setting it to -1, like
this:
PID = <PID>, CPU = 0
PID = <PID>, CPU = 1
PID = <PID>, CPU = 2
PID = <PID>, CPU = 3
Everything works.
You can test it with this script (open.c):
  #include <unistd.h>
  #include <sys/syscall.h>

  int main()
  {
          int i1 = 1, i2 = 2, i3 = 3, i4 = 4;
          char s1[] = "DINGZHEN", s2[] = "XUEBAO";

          while (1) {
                  syscall(SYS_open, s1, i1, i2);
                  sleep(1);
          }

          return 0;
  }
save, compile:
make open
perf trace:
perf trace -e open <path-to-the-executable>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240815013626.935097-2-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Howard Chu [Thu, 15 Aug 2024 01:36:17 +0000 (09:36 +0800)]
perf evlist: Introduce method to find if there is a bpf-output event
We'll use it in the next patch, to decide how to set up the ring
buffer.
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240815013626.935097-2-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Tue, 27 Aug 2024 21:27:57 +0000 (14:27 -0700)]
perf report: Name events in stats for pipe mode
In stats mode, PERF_RECORD_EVENT_UPDATE isn't being handled, meaning the
evsels aren't named when handling pipe mode output.
Before:
$ perf record -e inst_retired.any -a -o - sleep 0.1|perf report --stats -i -
...
Aggregated stats:
TOTAL events: 23358
COMM events: 2608 (11.2%)
EXIT events: 1 ( 0.0%)
FORK events: 2607 (11.2%)
SAMPLE events: 174 ( 0.7%)
MMAP2 events: 17936 (76.8%)
ATTR events: 2 ( 0.0%)
FINISHED_ROUND events: 2 ( 0.0%)
ID_INDEX events: 1 ( 0.0%)
THREAD_MAP events: 1 ( 0.0%)
CPU_MAP events: 1 ( 0.0%)
EVENT_UPDATE events: 3 ( 0.0%)
TIME_CONV events: 1 ( 0.0%)
FEATURE events: 20 ( 0.1%)
FINISHED_INIT events: 1 ( 0.0%)
raw 0xc0 stats:
SAMPLE events: 174
After:
$ perf record -e inst_retired.any -a -o - sleep 0.1|perf report --stats -i -
...
Aggregated stats:
TOTAL events: 23742
COMM events: 2620 (11.0%)
EXIT events: 2 ( 0.0%)
FORK events: 2619 (11.0%)
SAMPLE events: 165 ( 0.7%)
MMAP2 events: 18304 (77.1%)
ATTR events: 2 ( 0.0%)
FINISHED_ROUND events: 2 ( 0.0%)
ID_INDEX events: 1 ( 0.0%)
THREAD_MAP events: 1 ( 0.0%)
CPU_MAP events: 1 ( 0.0%)
EVENT_UPDATE events: 3 ( 0.0%)
TIME_CONV events: 1 ( 0.0%)
FEATURE events: 20 ( 0.1%)
FINISHED_INIT events: 1 ( 0.0%)
inst_retired.any stats:
SAMPLE events: 165
This makes the pipe output match the regular output.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240827212757.1469340-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Michael Petlan [Tue, 2 Jul 2024 11:08:50 +0000 (13:08 +0200)]
perf testsuite: Install perf-report tests in the 'make install-tests -C tools/perf' target
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-13-vmolnaro@redhat.com
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Tue, 2 Jul 2024 11:08:49 +0000 (13:08 +0200)]
perf testsuite report: Add test case for perf report
Add a new 'perf report' test case that acts as an entry element in 'perf
test list'.
Runs multiple subtests from directory "base_report", which can be
expanded without further editing.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-12-vmolnaro@redhat.com
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Tue, 2 Jul 2024 11:08:48 +0000 (13:08 +0200)]
perf testsuite report: Add test for perf-report basic functionality
Test basic execution and some options of perf-report subcommand, like
show-nr-samples, header, showcpuutilization, pid and symbol filtering.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-11-vmolnaro@redhat.com
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Tue, 2 Jul 2024 11:08:47 +0000 (13:08 +0200)]
perf testsuite: Add common output checking helper
As a form of validation, it is common practice to check whether the
output of a command contains expected patterns or matches a certain
regular expression.
This output checking helper is designed to allow checking stderr output
of perf commands for unexpected messages, while ignoring messages that
are known to be harmless, e.g.:
"Lowering default frequency rate to \d+\."
"\d+ out of order events recorded."
etc.
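The idea can be sketched in C with POSIX regular expressions. This is
only an illustration and not the helper added by this patch: the demo_*
names are hypothetical, and since POSIX extended regular expressions have
no \d class, the two patterns above are written with [0-9] instead.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Patterns for messages treated as harmless (POSIX ERE, so [0-9] not \d). */
static const char *const demo_ignored[] = {
	"Lowering default frequency rate to [0-9]+\\.",
	"[0-9]+ out of order events recorded\\.",
};

/* Returns true if the line matches one of the ignored patterns. */
static bool demo_line_is_harmless(const char *line)
{
	size_t i;

	assert(line);
	for (i = 0; i < sizeof(demo_ignored) / sizeof(demo_ignored[0]); i++) {
		regex_t re;
		int rc;

		/* Skip a pattern that fails to compile. */
		if (regcomp(&re, demo_ignored[i], REG_EXTENDED | REG_NOSUB))
			continue;
		rc = regexec(&re, line, 0, NULL, 0);
		regfree(&re);
		if (rc == 0)
			return true;
	}
	return false;
}

/* Counts the non-empty lines that are not known to be harmless. */
static int demo_count_unexpected(const char *const *lines, size_t n)
{
	int unexpected = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (lines[i][0] != '\0' && !demo_line_is_harmless(lines[i]))
			unexpected++;
	return unexpected;
}
```

A test would then pass only if demo_count_unexpected() returns zero for
the captured stderr lines.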
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-10-vmolnaro@redhat.com
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Tue, 2 Jul 2024 11:08:46 +0000 (13:08 +0200)]
perf testsuite probe: Add test for line semantics
The perf-probe command uses specific semantics to describe probes.
Test whether patterns that are known to be valid and patterns that are
known to be invalid are handled appropriately.
This test is run as a part of perftool-testsuite_probe test case.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-9-vmolnaro@redhat.com
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Tue, 2 Jul 2024 11:08:45 +0000 (13:08 +0200)]
perf testsuite probe: Add test for invalid options
Test if various incompatible options are correctly handled (rejected).
It is run as a part of perftool-testsuite_probe test case.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-8-vmolnaro@redhat.com
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Tue, 2 Jul 2024 11:08:44 +0000 (13:08 +0200)]
perf testsuite probe: Add test for basic perf-probe options
Test basic behavior of perf-probe subcommand. It is run as a part of
perftool-testsuite_probe test case.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-7-vmolnaro@redhat.com
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Tue, 2 Jul 2024 11:08:43 +0000 (13:08 +0200)]
perf testsuite probe: Add test for blacklisted kprobes handling
Test perf probe interface. Blacklisted functions should be rejected
when there is an attempt to set a kprobe to them.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-6-vmolnaro@redhat.com
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Tue, 2 Jul 2024 11:08:42 +0000 (13:08 +0200)]
perf testsuite: Fix shellcheck warnings
Shellcheck is becoming a standard when building perf to prevent
any unnecessary mistakes. Fix shellcheck warnings in perf testsuite.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-5-vmolnaro@redhat.com
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Tue, 2 Jul 2024 11:08:41 +0000 (13:08 +0200)]
perf testsuite: Merge settings files for shell tests
Merge perf testsuite setting files into common settings to reduce
duplicates and prevent errors.
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-4-vmolnaro@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Michael Petlan [Tue, 2 Jul 2024 11:08:40 +0000 (13:08 +0200)]
perf tests shell: Skip base_* dirs in test script search
The test scripts in base_* directories currently have their own drivers
that run them. Before this patch, the shell test-suite generator causes
them to run twice. Fix that by skipping them in the generator.
A cleaner solution (for the future) will be to use the directory structure
idea (introduced by Carsten Haitzler in 7391db645938 ("perf test:
Refactor shell tests allowing subdirs")) to generate test entries with
subtests, like:
$ perf test list
[...]
97: perf probe shell tests
97:1: perf probe basic functionality
97:2: perf probe tests with arguments
97:3: perf probe invalid options handling
[...]
There are already a lot of shell test scripts and many more are about to
come, so there is a need for some hierarchy.
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240702110849.31904-3-vmolnaro@redhat.com
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Tue, 27 Aug 2024 18:57:33 +0000 (15:57 -0300)]
perf test vfs_getname: Look for alternative line where to collect the pathname
The getname_flags() routine changed recently and thus the place where we
were getting the pathname is not probeable anymore, albeit still
present, so use the next line for that, before:
root@number:/home/acme/git/perf-tools-next# perf test vfs_getname
91: Add vfs_getname probe to get syscall args filenames : FAILED!
93: Use vfs_getname probe to get syscall args filenames : FAILED!
126: Check open filename arg using perf trace + vfs_getname : FAILED!
root@number:/home/acme/git/perf-tools-next#
Now tests 91 and 126 are passing; some more investigation is needed for
test 93, which continues to fail.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 20 Aug 2024 15:45:04 +0000 (08:45 -0700)]
perf test: Update sample filtering tests with multiple events
Add a 'Multiple bpf-filter' test for two or more events with filters.
It uses task-clock and page-faults events with different filter
expressions and checks the perf script output.
$ sudo ./perf test filtering -vv
96: perf record sample filtering (by BPF) tests:
--- start ---
test child forked, pid 2804025
Checking BPF-filter privilege
Basic bpf-filter test
Basic bpf-filter test [Success]
Failing bpf-filter test
Error: task-clock event does not have PERF_SAMPLE_CPU
Failing bpf-filter test [Success]
Group bpf-filter test
Error: task-clock event does not have PERF_SAMPLE_CPU
Error: task-clock event does not have PERF_SAMPLE_CODE_PAGE_SIZE
Group bpf-filter test [Success]
Multiple bpf-filter test
Multiple bpf-filter test [Success]
---- end(0) ----
96: perf record sample filtering (by BPF) tests : Ok
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240820154504.128923-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 20 Aug 2024 15:45:03 +0000 (08:45 -0700)]
perf tools: Print lost samples due to BPF filter
Print the actual dropped sample count in the event stat.
$ sudo perf record -o- -e cycles --filter 'period < 10000' \
-e instructions --filter 'ip > 0x8000000000000000' perf test -w noploop | \
perf report --stat -i-
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.058 MB - ]
Aggregated stats:
TOTAL events: 469
MMAP events: 268 (57.1%)
COMM events: 2 ( 0.4%)
EXIT events: 1 ( 0.2%)
SAMPLE events: 16 ( 3.4%)
MMAP2 events: 22 ( 4.7%)
LOST_SAMPLES events: 2 ( 0.4%)
KSYMBOL events: 89 (19.0%)
BPF_EVENT events: 39 ( 8.3%)
ATTR events: 2 ( 0.4%)
FINISHED_ROUND events: 1 ( 0.2%)
ID_INDEX events: 1 ( 0.2%)
THREAD_MAP events: 1 ( 0.2%)
CPU_MAP events: 1 ( 0.2%)
EVENT_UPDATE events: 2 ( 0.4%)
TIME_CONV events: 1 ( 0.2%)
FEATURE events: 20 ( 4.3%)
FINISHED_INIT events: 1 ( 0.2%)
cycles stats:
SAMPLE events: 2
LOST_SAMPLES (BPF) events: 4010
instructions stats:
SAMPLE events: 14
LOST_SAMPLES (BPF) events: 3990
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240820154504.128923-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 20 Aug 2024 15:45:02 +0000 (08:45 -0700)]
perf bpf-filter: Support multiple events properly
So far it used tgid as a key to get the filter expressions in the
pinned filters map for regular users, but that won't work well if the
user has more than one filter at the same time. Let's add the event id
to the key of the filter hash map so that it can identify the right
filter expression in the BPF program.
As the event can be inherited by child tasks, it should use the primary
id which belongs to the parent (original) event. Since evsel opens the
event for multiple CPUs and tasks, it needs to maintain a separate hash
map for the event id.
In user space, it keeps a list of the multiple evsels and releases
the entries in both hash maps when it closes the event.
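The keying idea can be sketched in plain C as below. This is only an
illustration under assumptions: struct filter_key, filter_add() and
filter_lookup() are hypothetical names, and a small linear table stands
in for the real BPF hash map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key: the event id tells apart filters owned by one tgid. */
struct filter_key {
	uint64_t event_id;	/* primary (parent) event id */
	uint32_t tgid;		/* owner process */
	uint32_t reserved;	/* explicit padding, always zero */
};

struct filter_entry {
	struct filter_key key;
	char expr[64];		/* filter expression kept as text for illustration */
	int used;
};

#define MAX_FILTERS 16

static struct filter_entry filter_table[MAX_FILTERS];

/* Store a filter; returns 0 on success, -1 if it does not fit. */
static int filter_add(uint32_t tgid, uint64_t event_id, const char *expr)
{
	assert(expr != NULL);
	if (strlen(expr) >= sizeof(filter_table[0].expr))
		return -1;

	for (size_t i = 0; i < MAX_FILTERS; i++) {
		struct filter_entry *e = &filter_table[i];

		if (e->used)
			continue;
		e->key = (struct filter_key){ .event_id = event_id, .tgid = tgid };
		strcpy(e->expr, expr);
		e->used = 1;
		return 0;
	}
	return -1;
}

/* Both tgid and event id must match, so two events never share a filter. */
static const char *filter_lookup(uint32_t tgid, uint64_t event_id)
{
	for (size_t i = 0; i < MAX_FILTERS; i++) {
		const struct filter_entry *e = &filter_table[i];

		if (e->used && e->key.tgid == tgid && e->key.event_id == event_id)
			return e->expr;
	}
	return NULL;
}
```

With tgid alone as the key, the second filter_add() for the same process
would have had to overwrite or collide with the first one.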
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240820154504.128923-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Tue, 20 Aug 2024 18:32:02 +0000 (11:32 -0700)]
perf hist: Don't set hpp_fmt_value for members in --no-group
Perf crashes as below when applying --no-group
# perf record -e "{cache-misses,branches}" -b sleep 1
# perf report --stdio --no-group
free(): invalid next size (fast)
Aborted (core dumped)
#
In __hpp__fmt(), only one hpp_fmt_value is allocated for the current
event when --no-group is applied.
However, the current implementation tries to assign the hists from all
members to the hpp_fmt_value, which writes past the allocated memory.
Fixes: 8f6071a3dce40e69 ("perf hist: Simplify __hpp_fmt() using hpp_fmt_data")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240820183202.3174323-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Tue, 13 Aug 2024 21:36:49 +0000 (14:36 -0700)]
perf test: Support external tests for separate objdir
Extend the searching for the test files so that it works when running
perf from a separate objdir, and also when the perf executable is
symlinked.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240813213651.1057362-2-ak@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 22 Aug 2024 17:13:49 +0000 (14:13 -0300)]
perf python: Disable -Wno-cast-function-type-mismatch if present on clang
The -Wcast-function-type-mismatch option was introduced in clang 19 and
is enabled by default. Since we use -Werror, and the python bindings do
casts that are valid but trip this warning, disable it if present.
Closes: https://lore.kernel.org/all/CA+icZUXoJ6BS3GMhJHV3aZWyb5Cz2haFneX0C5pUMUUhG-UVKQ@mail.gmail.com
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org # To allow building with the upcoming clang 19
Link: https://lore.kernel.org/lkml/CA+icZUVtHn8X1Tb_Y__c-WswsO0K8U9uy3r2MzKXwTA5THtL7w@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 22 Aug 2024 17:13:49 +0000 (14:13 -0300)]
perf python: Allow checking for the existence of warning options in clang
We'll need to check if a warning option introduced in clang 19 is
available in the clang version being used, so cover the error message
emitted when testing for a -W option.
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/CA+icZUVtHn8X1Tb_Y__c-WswsO0K8U9uy3r2MzKXwTA5THtL7w@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 21 Aug 2024 23:26:28 +0000 (16:26 -0700)]
perf annotate-data: Copy back variable types after move
In some cases, compilers don't set the location expression in DWARF
precisely. For instance, a compiler may assign a variable to a register
after copying it from a different register. The code should then use the
new register for the new type, but it still uses the old register. This
makes it hard to track the type information properly.
This is an example I found in __tcp_transmit_skb(). The first argument
(sk) of this function is a pointer to sock and there's a variable (tp)
for tcp_sock.
static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
int clone_it, gfp_t gfp_mask, u32 rcv_nxt)
{
...
struct tcp_sock *tp;
BUG_ON(!skb || !tcp_skb_pcount(skb));
tp = tcp_sk(sk);
prior_wstamp = tp->tcp_wstamp_ns;
tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
...
So it basically calls tcp_sk(sk) to get the tcp_sock pointer from sk.
But it turned out to be the same value because tcp_sock embeds sock as
the first member. The sk is located in reg5 (RDI) and tp is in reg3
(RBX). The offset of tcp_wstamp_ns is 0x748 and tcp_clock_cache is
0x750. So you need to use RBX (reg3) to access the fields in the
tcp_sock. But the code used RDI (reg5) as it has the same value.
$ pahole --hex -C tcp_sock vmlinux | grep -e 748 -e 750
u64 tcp_wstamp_ns; /* 0x748 0x8 */
u64 tcp_clock_cache; /* 0x750 0x8 */
And this is the disassembly of the part of the function.
<__tcp_transmit_skb>:
...
44: mov %rdi, %rbx
47: mov 0x748(%rdi), %rsi
4e: mov 0x750(%rdi), %rax
55: cmp %rax, %rsi
Because the compiler attached the debug info to RBX, perf only knows
that RDI is a pointer to sock, and accessing those two fields resulted
in an error due to the offset being beyond the type size.
-----------------------------------------------------------
find data type for 0x748(reg5) at __tcp_transmit_skb+0x63
CU for net/ipv4/tcp_output.c (die:0x817f543)
frame base: cfa=0 fbreg=6
scope: [1/1] (die:81aac3e)
bb: [0 - 30]
var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df)
var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6)
var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6)
var [5] reg1 type='int' size=0x4 (die:0x818059e)
var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360)
var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c) <<<--- the first argument ('sk' at %RDI)
mov [19] reg8 -> -0xa8(stack) type='unsigned int' size=0x4 (die:0x8180ed6)
mov [20] stack canary -> reg0
mov [29] reg0 -> -0x30(stack) stack canary
bb: [36 - 3e]
mov [36] reg4 -> reg15 type='struct sk_buff*' size=0x8 (die:0x8181360)
bb: [44 - 63]
mov [44] reg5 -> reg3 type='struct sock*' size=0x8 (die:0x8181a0c) <<<--- calling tcp_sk()
var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead) <<<--- new variable ('tp' at %RBX)
var [4e] reg4 type='unsigned long long' size=0x8 (die:0x8180edd)
mov [58] reg4 -> -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd)
chk [63] reg5 offset=0x748 ok=1 kind=1 (struct sock*) : offset bigger than size <<<--- access with old variable
final result: offset bigger than size
While it's a fault in the compiler, we can work around this issue by
using the type of the new variable when it's copied directly. So I've
added a copied_from field in the register state to track those direct
register-to-register copies. If the new register then gets a new type
while the old register still has the same type, it'll update (copy
back) the type of the old register.
For example, if we can update the type of reg5 at __tcp_transmit_skb+0x47,
we can find the target type of the instruction at 0x63 like below:
-----------------------------------------------------------
find data type for 0x748(reg5) at __tcp_transmit_skb+0x63
...
bb: [44 - 63]
mov [44] reg5 -> reg3 type='struct sock*' size=0x8 (die:0x8181a0c)
var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead)
var [47] copyback reg5 type='struct tcp_sock*' size=0x8 (die:0x819eead) <<<--- here
mov [47] 0x748(reg5) -> reg4 type='unsigned long long' size=0x8 (die:0x8180edd)
mov [4e] 0x750(reg5) -> reg0 type='unsigned long long' size=0x8 (die:0x8180edd)
mov [58] reg4 -> -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd)
chk [63] reg5 offset=0x748 ok=1 kind=1 (struct tcp_sock*) : Good! <<<--- new type
found by insn track: 0x748(reg5) type-offset=0x748
final result: type='struct tcp_sock' size=0xa98 (die:0x819eeb2)
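The copy-back step can be sketched roughly as follows. Only the
copied_from field name comes from the text above; struct reg_state,
reg_move() and reg_set_var() are hypothetical simplifications and do
not mirror the actual perf code.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_REGS	16
#define NO_REG	(-1)

enum type_id { TYPE_NONE, TYPE_SOCK_PTR, TYPE_TCP_SOCK_PTR };

struct reg_state {
	enum type_id type;
	bool ok;		/* type is known */
	int copied_from;	/* source of a direct reg-to-reg copy, or NO_REG */
};

static void regs_init(struct reg_state *regs)
{
	for (int i = 0; i < NR_REGS; i++)
		regs[i] = (struct reg_state){ .type = TYPE_NONE, .copied_from = NO_REG };
}

/* mov %src, %dst: dst inherits the type and remembers where it came from. */
static void reg_move(struct reg_state *regs, int src, int dst)
{
	assert(src >= 0 && src < NR_REGS && dst >= 0 && dst < NR_REGS);
	regs[dst] = regs[src];
	regs[dst].copied_from = src;
}

/* DWARF says a variable of the given type now lives in reg. */
static void reg_set_var(struct reg_state *regs, int reg, enum type_id type)
{
	int src = regs[reg].copied_from;
	enum type_id old = regs[reg].type;

	regs[reg].type = type;
	regs[reg].ok = true;

	/* Copy back only if the source still holds the type it had at the copy. */
	if (src != NO_REG && regs[src].ok && regs[src].type == old)
		regs[src].type = type;
}
```

In the trace above this corresponds to 'mov [44] reg5 -> reg3' followed
by 'var [47] reg3', after which reg5 is also seen as 'struct tcp_sock*'.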
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821232628.353177-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 21 Aug 2024 23:26:27 +0000 (16:26 -0700)]
perf annotate-data: Update stack slot for the store
When checking for the matching variable at the target instruction, it
might not have any information if it's the first write to a stack slot.
In this case the instruction could be spilling a register value onto
the stack, so the type info is in the source operand.
But currently it's hard to get the operand from the checking function.
Let's process the instruction and retry to get the type info from the
stack if there's no information already.
This is an example from __tcp_transmit_skb(). The instructions are:
<__tcp_transmit_skb>:
0: nopl 0x0(%rax, %rax, 1)
5: push %rbp
6: mov %rsp, %rbp
9: push %r15
b: push %r14
d: push %r13
f: push %r12
11: push %rbx
12: sub $0x98, %rsp
19: mov %r8d, -0xa8(%rbp)
...
It cannot find any variable at -0xa8(%rbp) at this point.
-----------------------------------------------------------
find data type for -0xa8(reg6) at __tcp_transmit_skb+0x19
CU for net/ipv4/tcp_output.c (die:0x817f543)
frame base: cfa=0 fbreg=6
scope: [1/1] (die:81aac3e)
bb: [0 - 19]
var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df)
var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6)
var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6)
var [5] reg1 type='int' size=0x4 (die:0x818059e)
var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360)
var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c)
chk [19] reg6 offset=-0xa8 ok=0 kind=0 fbreg : no type information
no type information
And it was able to find the type after processing the 'mov' instruction.
-----------------------------------------------------------
find data type for -0xa8(reg6) at __tcp_transmit_skb+0x19
CU for net/ipv4/tcp_output.c (die:0x817f543)
frame base: cfa=0 fbreg=6
scope: [1/1] (die:81aac3e)
bb: [0 - 19]
var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df)
var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6)
var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6)
var [5] reg1 type='int' size=0x4 (die:0x818059e)
var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360)
var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c)
chk [19] reg6 offset=-0xa8 ok=0 kind=0 fbreg : retry <<<--- here
mov [19] reg8 -> -0xa8(stack) type='unsigned int' size=0x4 (die:0x8180ed6)
chk [19] reg6 offset=-0xa8 ok=0 kind=0 fbreg : Good!
found by insn track: -0xa8(reg6) type-offset=0
final result: type='unsigned int' size=0x4 (die:0x8180ed6)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821232628.353177-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 21 Aug 2024 23:26:26 +0000 (16:26 -0700)]
perf annotate-data: Update debug messages
In check_matching_type(), it'd be easier to display the typename in
question if it's available.
For example, check out the line that starts with 'chk'.
-----------------------------------------------------------
find data type for 0x10(reg0) at cpuacct_charge+0x13
CU for kernel/sched/build_utility.c (die:0x137ee0b)
frame base: cfa=1 fbreg=7
scope: [3/3] (die:13d9632)
bb: [c - 13]
var [c] reg5 type='struct task_struct*' size=0x8 (die:0x1381230)
mov [c] 0xdf8(reg5) -> reg0 type='struct css_set*' size=0x8 (die:0x1385c56)
chk [13] reg0 offset=0x10 ok=1 kind=1 (struct css_set*) : Good! <<<--- here
found by insn track: 0x10(reg0) type-offset=0x10
final result: type='struct css_set' size=0x250 (die:0x1385b0e)
Another example:
-----------------------------------------------------------
find data type for 0x8(reg0) at menu_select+0x279
CU for drivers/cpuidle/governors/menu.c (die:0x7b0fe79)
frame base: cfa=1 fbreg=7
scope: [2/2] (die:7b11010)
bb: [273 - 277]
bb: [279 - 279]
chk [279] reg0 offset=0x8 ok=0 kind=0 cfa : no type information
scope: [1/2] (die:7b10cbc)
bb: [0 - 64]
...
mov [26a] imm=0xffffffff -> reg15
bb: [273 - 277]
bb: [279 - 279]
chk [279] reg0 offset=0x8 ok=1 kind=1 (long long unsigned int) : no/void pointer <<<--- here
final result: no/void pointer
Also change some places to print negative offsets properly.
Before:
-----------------------------------------------------------
find data type for 0xffffff40(reg6) at __tcp_transmit_skb+0x58
After:
-----------------------------------------------------------
find data type for -0xc0(reg6) at __tcp_transmit_skb+0x58
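A minimal sketch of printing an offset as signed hex, assuming the
offset is held in a 32-bit signed integer. format_offset() is a
hypothetical helper, not the function changed by this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format an offset as signed hex: -192 becomes "-0xc0", 0x748 stays "0x748". */
static int format_offset(char *buf, size_t len, int32_t offset)
{
	assert(buf != NULL && len > 0);

	if (offset < 0) {
		/* Negate in 64 bits so that INT32_MIN does not overflow. */
		return snprintf(buf, len, "-0x%llx",
				(unsigned long long)(-(int64_t)offset));
	}
	return snprintf(buf, len, "0x%llx", (unsigned long long)offset);
}
```

The bit pattern 0xffffff40 read as a signed 32-bit value is -192, which
is where the -0xc0 in the 'After' output comes from.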
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821232628.353177-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 21 Aug 2024 23:26:25 +0000 (16:26 -0700)]
perf dwarf-aux: Handle bitfield members from pointer access
The __die_find_member_offset_cb() failed to handle bitfield members
which don't have DW_AT_data_member_location. Like when adding member
types in __add_member_cb(), it should fall back to checking the bit
offset when it resolves the member type for an offset.
Fixes: 437683a9941891c1 ("perf dwarf-aux: Handle type transfer for memory access")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821232628.353177-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 19 Aug 2024 23:36:03 +0000 (16:36 -0700)]
perf annotate-data: Add 'typecln' sort key
Sometimes it's useful to organize member fields by cache-line boundary.
The 'typecln' sort key is short for type-cacheline and shows samples
for each cacheline. The cacheline size is fixed to 64 for now, but it
can read the actual size once it saves the value from sysfs.
For example, you may want to know which cachelines in a target type are
hot or cold. The following shows members in the cfs_rq's first cache line.
$ perf report -s type,typecln,typeoff -H
...
- 2.67% struct cfs_rq
+ 1.23% struct cfs_rq: cache-line 2
+ 0.57% struct cfs_rq: cache-line 4
+ 0.46% struct cfs_rq: cache-line 6
- 0.41% struct cfs_rq: cache-line 0
0.39% struct cfs_rq +0x14 (h_nr_running)
0.02% struct cfs_rq +0x38 (tasks_timeline.rb_leftmost)
...
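The grouping amounts to dividing the member offset by the cacheline
size. A small sketch, assuming the fixed size of 64 mentioned above;
type_cacheline() and format_typecln() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TYPE_CACHELINE_SIZE 64	/* fixed for now, per the text above */

/* Index of the cacheline that holds the byte at 'offset' within the type. */
static int type_cacheline(int offset)
{
	return offset / TYPE_CACHELINE_SIZE;
}

/* Build a label in the same shape as the report output above. */
static int format_typecln(char *buf, size_t len, const char *type_name, int offset)
{
	assert(buf != NULL && type_name != NULL);
	return snprintf(buf, len, "%s: cache-line %d", type_name,
			type_cacheline(offset));
}
```

Both +0x14 (h_nr_running) and +0x38 (tasks_timeline.rb_leftmost) are
below 0x40, so they end up under cache-line 0.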
Committer testing:
# root@number:~# perf report -s type,typecln,typeoff -H --stdio
# Total Lost Samples: 0
#
# Samples: 5K of event 'cpu_atom/mem-loads,ldlat=5/P'
# Event count (approx.): 312251
#
# Overhead Data Type / Data Type Cacheline / Data Type Offset
# .............. ..................................................
#
<SNIP>
0.07% struct sigaction
0.05% struct sigaction: cache-line 1
0.02% struct sigaction +0x58 (sa_mask)
0.02% struct sigaction +0x78 (sa_mask)
0.03% struct sigaction: cache-line 0
0.02% struct sigaction +0x38 (sa_mask)
0.01% struct sigaction +0x8 (sa_mask)
<SNIP>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819233603.54941-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 19 Aug 2024 23:36:02 +0000 (16:36 -0700)]
perf annotate-data: Show offset and size in hex
It'd be better to have them in hex to check cacheline alignment.
Percent offset size field
100.00 0 0x1c0 struct cfs_rq {
0.00 0 0x10 struct load_weight load {
0.00 0 0x8 long unsigned int weight;
0.00 0x8 0x4 u32 inv_weight;
};
0.00 0x10 0x4 unsigned int nr_running;
14.56 0x14 0x4 unsigned int h_nr_running;
0.00 0x18 0x4 unsigned int idle_nr_running;
0.00 0x1c 0x4 unsigned int idle_h_nr_running;
...
Committer notes:
Justification from Namhyung when asked about why it would be "better":
Cache line sizes are power of 2 so it'd be natural to use hex and
check whether an offset is in the same boundary. Also 'perf annotate'
shows instruction offsets in hex.
>
> Maybe this should be selectable?
I can add an option and/or a config if you want.
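To illustrate why hex helps here, this sketch checks cacheline
boundaries with a mask, assuming a 64-byte line. The helper names are
made up for this example and are not part of perf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHELINE_SIZE 64u	/* assumed power of 2 */

/* Two offsets share a cacheline when they agree above the low 6 bits. */
static bool same_cacheline(uint32_t a, uint32_t b)
{
	return (a & ~(CACHELINE_SIZE - 1)) == (b & ~(CACHELINE_SIZE - 1));
}

/* A member straddles a boundary when its first and last bytes differ in line. */
static bool member_straddles(uint32_t offset, uint32_t size)
{
	assert(size == 0 || offset <= UINT32_MAX - (size - 1));
	return size > 0 && !same_cacheline(offset, offset + size - 1);
}
```

In hex the line index is visible directly: offsets 0x00-0x3f are in the
first line, 0x40-0x7f in the second, and so on.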
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819233603.54941-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Yang Ruibin [Wed, 21 Aug 2024 10:14:56 +0000 (06:14 -0400)]
perf bpf: Remove redundant check that map is NULL
The check that map is NULL is already done in bpf_map__fd(map), which
returns an errno in that case, so no further check is needed.
In addition, even if the check for map is run, it returns a pointer,
which is not consistent with the error number returned by bpf_map__fd(map).
Signed-off-by: Yang Ruibin <11162571@vivo.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: opensource.kernel@vivo.com
Link: https://lore.kernel.org/r/20240821101500.4568-1-11162571@vivo.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 21 Aug 2024 06:54:08 +0000 (23:54 -0700)]
perf annotate-data: Fix percpu pointer check
In check_matching_type(), it checks the type state of the register in
the wrong order. When it's a percpu pointer, it should check the type
for the pointer, but it checked the CFA bit first and concluded that
there was no type in the stack slot. This resulted in no type info.
-----------------------------------------------------------
find data type for 0x28(reg1) at hrtimer_reprogram+0x88
CU for kernel/time/hrtimer.c (die:0x18f219f)
frame base: cfa=1 fbreg=7
...
add [72] percpu 0x24500 -> reg1 pointer type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46)
bb: [7a - 7e]
bb: [80 - 86] (here)
bb: [88 - 88] vvv
chk [88] reg1 offset=0x28 ok=1 kind=4 cfa : no type information
no type information
Here, the instruction at 0x72 found that reg1 has a (percpu) pointer and
got the correct type. But when it checks the final result, it wrongly
treated it as a stack variable because it checks the cfa bit first.
After changing the order of state check:
-----------------------------------------------------------
find data type for 0x28(reg1) at hrtimer_reprogram+0x88
CU for kernel/time/hrtimer.c (die:0x18f219f)
frame base: cfa=1 fbreg=7
... (here)
vvvvvvvvvv
chk [88] reg1 offset=0x28 ok=1 kind=4 percpu ptr : Good!
found by insn track: 0x28(reg1) type-offset=0x28
final type: type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821065408.285548-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 21 Aug 2024 06:54:07 +0000 (23:54 -0700)]
perf annotate-data: Prefer struct/union over base type
Sometimes a compound type can have a single field, so its size is the
same as the base type. But the compound type is still preferred, as a
struct or union could carry more information than the base type.
Also put a slight priority on the typedef for the same reason.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821065408.285548-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 21 Aug 2024 06:54:06 +0000 (23:54 -0700)]
perf annotate-data: Fix missing constant copy
I found that it failed to copy the immediate constant when it moves the
register value. This could result in a wrong type inference since the
address for the per-cpu variable would always be 0.
Fixes: eb9190afaed6afd5 ("perf annotate-data: Handle ADD instructions")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821065408.285548-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Tue, 6 Aug 2024 22:06:14 +0000 (15:06 -0700)]
perf cap: Tidy up and improve capability testing
Remove dependence on libcap. libcap is only used to query whether a
capability is supported, which is just 1 capget system call.
If the capget system call fails, fall back on root permission
checking. Previously, if libcap failed, the permission was assumed to
be absent, which may be pessimistic/wrong.
Add a used_root out argument to perf_cap__capable to say whether the
fall back root check was used. This allows the correct error message,
"root" vs "users with the CAP_PERFMON or CAP_SYS_ADMIN capability", to
be selected.
Tidy uses of perf_cap__capable so that tests aren't repeated if capget
isn't supported.
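A rough sketch of the approach, assuming Linux and the UAPI
<linux/capability.h> header. sketch_cap_capable() is a hypothetical
stand-in and not the actual perf_cap__capable() implementation; it only
shows the single capget call plus the root fallback.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/capability.h>

#ifndef CAP_PERFMON
#define CAP_PERFMON 38	/* older UAPI headers lack this definition */
#endif

/*
 * Return whether the calling thread has 'cap' in its effective set.
 * If capget itself fails, fall back on a root check and report that
 * through 'used_root' so the caller can pick the right error message.
 */
static bool sketch_cap_capable(int cap, bool *used_root)
{
	struct __user_cap_header_struct header = {
		.version = _LINUX_CAPABILITY_VERSION_3,
		.pid = 0,	/* 0 means the calling thread */
	};
	struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3] = { { 0 } };

	assert(used_root != NULL);
	*used_root = false;

	if (cap < 0 || cap >= 32 * _LINUX_CAPABILITY_U32S_3)
		return false;

	if (syscall(SYS_capget, &header, data) == -1) {
		*used_root = true;
		return geteuid() == 0;
	}
	/* Each data element carries 32 capability bits. */
	return (data[cap / 32].effective >> (cap % 32)) & 1;
}
```

A caller would typically test CAP_PERFMON first and then CAP_SYS_ADMIN,
and mention "root" in its error message only when used_root was set.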
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240806220614.831914-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Thu, 15 Aug 2024 22:38:23 +0000 (15:38 -0700)]
perf annotate-data: Set bitfield member offset and size properly
The bitfield members might not have DW_AT_data_member_location. Let's
use DW_AT_data_bit_offset to set the member offset correctly. Also use
DW_AT_bit_size for the name like in a C program.
Before:
Annotate type: 'struct sk_buff' (1 samples)
Percent Offset Size Field
- 100.00 0 232 struct sk_buff {
+ 0.00 0 24 union ;
+ 0.00 24 8 union ;
+ 0.00 32 8 union ;
0.00 40 48 char[] cb;
+ 0.00 88 16 union ;
0.00 104 8 long unsigned int _nfct;
100.00 112 4 unsigned int len;
0.00 116 4 unsigned int data_len;
0.00 120 2 __u16 mac_len;
0.00 122 2 __u16 hdr_len;
0.00 124 2 __u16 queue_mapping;
0.00 126 0 __u8[] __cloned_offset;
0.00 0 1 __u8 cloned;
0.00 0 1 __u8 nohdr;
0.00 0 1 __u8 fclone;
0.00 0 1 __u8 peeked;
0.00 0 1 __u8 head_frag;
0.00 0 1 __u8 pfmemalloc;
0.00 0 1 __u8 pp_recycle;
0.00 127 1 __u8 active_extensions;
+ 0.00 128 60 union ;
0.00 188 4 sk_buff_data_t tail;
0.00 192 4 sk_buff_data_t end;
0.00 200 8 unsigned char* head;
After:
Annotate type: 'struct sk_buff' (1 samples)
Percent Offset Size Field
- 100.00 0 232 struct sk_buff {
+ 0.00 0 24 union ;
+ 0.00 24 8 union ;
+ 0.00 32 8 union ;
0.00 40 48 char[] cb;
+ 0.00 88 16 union ;
0.00 104 8 long unsigned int _nfct;
100.00 112 4 unsigned int len;
0.00 116 4 unsigned int data_len;
0.00 120 2 __u16 mac_len;
0.00 122 2 __u16 hdr_len;
0.00 124 2 __u16 queue_mapping;
0.00 126 0 __u8[] __cloned_offset;
0.00 126 1 __u8 cloned:1;
0.00 126 1 __u8 nohdr:1;
0.00 126 1 __u8 fclone:2;
0.00 126 1 __u8 peeked:1;
0.00 126 1 __u8 head_frag:1;
0.00 126 1 __u8 pfmemalloc:1;
0.00 126 1 __u8 pp_recycle:1;
0.00 127 1 __u8 active_extensions;
+ 0.00 128 60 union ;
0.00 188 4 sk_buff_data_t tail;
0.00 192 4 sk_buff_data_t end;
0.00 200 8 unsigned char* head;
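The offset and name handling can be sketched as below, assuming the two
attribute values were already read from DWARF. struct member_sketch and
set_bitfield_member() are hypothetical, and the bit offsets used with
them here are made-up values chosen to land in byte 126.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct member_sketch {
	int offset;	/* byte offset within the parent type */
	char name[64];	/* display name, e.g. "cloned:1" */
};

/*
 * data_bit_offset stands for the DW_AT_data_bit_offset value and
 * bit_size for the DW_AT_bit_size value of the member.
 */
static void set_bitfield_member(struct member_sketch *m, const char *var_name,
				int data_bit_offset, int bit_size)
{
	assert(m != NULL && var_name != NULL);

	/* The byte that holds the first bit of the field. */
	m->offset = data_bit_offset / 8;
	/* Append the width like a C declaration would: name:bits. */
	snprintf(m->name, sizeof(m->name), "%s:%d", var_name, bit_size);
}
```

For instance, a 1-bit field at bit offset 1008 would be shown at byte
offset 126 with the name 'cloned:1'.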
Committer notes:
Collect some data:
root@number:~# perf mem record -a --ldlat 5 -- ping -s 8193 -f 192.168.86.1
Memory events are enabled on a subset of CPUs: 16-27
PING 192.168.86.1 (192.168.86.1) 8193(8221) bytes of data.
.^C
--- 192.168.86.1 ping statistics ---
13881 packets transmitted, 13880 received, 0.00720409% packet loss, time 8664ms
rtt min/avg/max/mdev = 0.510/0.599/7.768/0.115 ms, ipg/ewma 0.624/0.593 ms
[ perf record: Woken up 8 times to write data ]
[ perf record: Captured and wrote 14.877 MB perf.data (46785 samples) ]
root@number:~#
root@number:~# perf evlist
cpu_atom/mem-loads,ldlat=5/P
cpu_atom/mem-stores/P
dummy:u
root@number:~# perf evlist -v
cpu_atom/mem-loads,ldlat=5/P: type: 10 (cpu_atom), size: 136, config: 0x5d0 (mem-loads), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, precise_ip: 3, sample_id_all: 1, { bp_addr, config1 }: 0x7
cpu_atom/mem-stores/P: type: 10 (cpu_atom), size: 136, config: 0x6d0 (mem-stores), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, precise_ip: 3, sample_id_all: 1
dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
root@number:~#
Ok, now let's see what changes from before this patch to after it:
root@number:~# perf annotate --data-type > /tmp/before
Apply the patch, build:
root@number:~# perf annotate --data-type > /tmp/after
The first hunk of the diff is for a glib data structure, in userspace;
look at those bitfields:
root@number:~# diff -u10 /tmp/before /tmp/after | head -20
--- /tmp/before 2024-08-20 17:29:58.306765780 -0300
+++ /tmp/after 2024-08-20 17:33:13.210582596 -0300
@@ -163,22 +163,22 @@
Annotate type: 'GHashTable' in /usr/lib64/libglib-2.0.so.0.8000.3 (1 samples):
============================================================================
Percent offset size field
100.00 0 96 GHashTable {
0.00 0 8 gsize size;
0.00 8 4 gint mod;
100.00 12 4 guint mask;
0.00 16 4 guint nnodes;
0.00 20 4 guint noccupied;
- 0.00 0 4 guint have_big_keys;
- 0.00 0 4 guint have_big_values;
+ 0.00 24 1 guint have_big_keys:1;
+ 0.00 24 1 guint have_big_values:1;
0.00 32 8 gpointer keys;
0.00 40 8 guint* hashes;
0.00 48 8 gpointer values;
root@number:~#
As advertised :-)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240815223823.2402285-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 19 Aug 2024 19:46:29 +0000 (16:46 -0300)]
perf daemon: Fix the build on more 32-bit architectures
The previous attempt fixed the build on debian:experimental-x-mipsel,
but when building on a larger set of containers I noticed it broke the
build on some other 32-bit architectures such as:
42 7.87 ubuntu:18.04-x-arm : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
builtin-daemon.c: In function 'cmd_session_list':
builtin-daemon.c:692:16: error: format '%llu' expects argument of type 'long long unsigned int', but argument 4 has type 'long int' [-Werror=format=]
fprintf(out, "%c%" PRIu64,
^~~~~
builtin-daemon.c:694:13:
csv_sep, (curr - daemon->start) / 60);
~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from builtin-daemon.c:3:0:
/usr/arm-linux-gnueabihf/include/inttypes.h:105:34: note: format string is defined here
# define PRIu64 __PRI64_PREFIX "u"
So let's cast that time_t (32-bit/64-bit) to uint64_t to make sure it
builds everywhere.
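As a minimal sketch of the cast described above (the helper name and signature are hypothetical, not the builtin-daemon.c code), the elapsed time is converted to uint64_t before it reaches the PRIu64 conversion:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper (not the perf source): print the elapsed minutes
 * between two timestamps, preceded by a separator character.
 *
 * time_t is 32-bit on some ABIs and 64-bit on others, so passing it
 * straight to a PRIu64 conversion trips -Werror=format= on the former.
 * Casting to uint64_t makes the argument match the format everywhere.
 */
int demo_format_elapsed_minutes(char *buf, size_t len, char sep,
				time_t start, time_t curr)
{
	assert(curr >= start);	/* the cast assumes a non-negative difference */

	return snprintf(buf, len, "%c%" PRIu64, sep,
			(uint64_t)((curr - start) / 60));
}
```

The same format string then compiles without warnings whether time_t is 'long int' or 'long long int'.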
Fixes: 4bbe6002931954bb ("perf daemon: Fix the build on 32-bit architectures")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZsPmldtJ0D9Cua9_@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Sun, 18 Aug 2024 21:29:48 +0000 (14:29 -0700)]
perf test: Add cgroup sampling test
Add it to the record.sh shell test to verify that it tracks cgroup
information correctly. It records with the --all-cgroups option and
checks that the output has PERF_RECORD_CGROUP and that the names are
not "unknown".
$ sudo ./perf test -vv 95
95: perf record tests:
--- start ---
test child forked, pid 2871922
169c90-169cd0 g test_loop
perf does have symbol 'test_loop'
Basic --per-thread mode test
Basic --per-thread mode test [Success]
Register capture test
Register capture test [Success]
Basic --system-wide mode test
Basic --system-wide mode test [Success]
Basic target workload test
Basic target workload test [Success]
Branch counter test
branch counter feature not supported on all core PMUs (/sys/bus/event_source/devices/cpu) [Skipped]
Cgroup sampling test
Cgroup sampling test [Success]
---- end(0) ----
95: perf record tests : Ok
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240818212948.2873156-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Sun, 18 Aug 2024 21:29:47 +0000 (14:29 -0700)]
perf record: Fix sample cgroup & namespace tracking
The recent change in 'struct perf_tool' constification broke the cgroup
and/or namespace tracking by resetting tool fields. It should set the
values after perf_tool__init().
Fixes: cecb1cf154b301c6 ("perf record: Use perf_tool__init()")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240818212948.2873156-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 17 Aug 2024 06:44:38 +0000 (23:44 -0700)]
perf inject: Combine mmap and mmap2 handling
The handling of mmap and mmap2 events is near identical. Add a common
helper function and call that by the two event handling functions.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 17 Aug 2024 06:44:37 +0000 (23:44 -0700)]
perf inject: Combine different mmap and mmap2 functions
There are repipe, build ID and JIT dump variants of the mmap and mmap2
repipe functions. The organization doesn't allow JIT dump to work with
build ID injection and the structure is less than clear. Combine the
functions and select the different behaviors with if statements.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 17 Aug 2024 06:44:36 +0000 (23:44 -0700)]
perf inject: Combine build_ids and build_id_all into enum
It is clearer to have a single enum that determines how build ids are
injected; it also allows for future extension.
Set the header build ID feature whether lazy or all are generated,
previously only the lazy case would set it.
Allow parsing of known build IDs for either the lazy or all cases.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 17 Aug 2024 06:44:35 +0000 (23:44 -0700)]
perf test: Expand pipe/inject test
Test recording of call-graphs and injecting --build-all. Add/expand
trap handler.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 17 Aug 2024 06:44:34 +0000 (23:44 -0700)]
perf evsel: Constify evsel__id_hdr_size() argument
Allows evsel__id_hdr_size() to be used when the evsel is const.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 17 Aug 2024 06:44:33 +0000 (23:44 -0700)]
perf dso: Constify dso_id
The passed dso_id is copied and so is never an out argument. Remove
its mutability.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 17 Aug 2024 06:44:32 +0000 (23:44 -0700)]
perf jit: Constify filename argument
Make it clearer the argument is just being used as a string.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 17 Aug 2024 06:44:31 +0000 (23:44 -0700)]
perf map: API clean up
map__init() is only used internally so make it static. Assume memory is
zero initialized, which will better support adding fields to struct
map in the future and was already the case for map__new2.
To reduce complexity, change set_priv and set_erange_warned to not take
a value to assign as they always assign true.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Sat, 17 Aug 2024 06:44:30 +0000 (23:44 -0700)]
perf synthetic-events: Avoid unnecessary memset
Make sure the memset of a synthesized event only zeros the necessary
tracing data part of the event, as a full event can be over 4kb in
size.
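A rough sketch of the idea, using hypothetical demo_* types rather than perf's real event union: when the event is a union whose largest member is several kilobytes, only the member that is actually written needs to be cleared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts for illustration; not perf's event definitions. */
struct demo_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct demo_tracing_data {
	struct demo_header header;
	uint32_t size;
};

union demo_event {
	struct demo_header header;
	struct demo_tracing_data tracing_data;
	char raw[4096 + 64];	/* a full event can be over 4 KiB */
};

#define DEMO_TYPE_TRACING_DATA 66	/* arbitrary value for this sketch */

void demo_init_tracing_data(union demo_event *ev, uint32_t data_size)
{
	assert(ev != NULL);

	/* Zero only the small member that is used, not sizeof(*ev). */
	memset(ev, 0, sizeof(ev->tracing_data));
	ev->header.type = DEMO_TYPE_TRACING_DATA;
	ev->header.size = (uint16_t)sizeof(ev->tracing_data);
	ev->tracing_data.size = data_size;
}
```

Bytes past sizeof(ev->tracing_data) are left untouched, which is fine as long as header.size tells the consumer how much of the event is valid.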
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Link: https://lore.kernel.org/r/20240817064442.2152089-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Xu Yang [Mon, 19 Aug 2024 02:34:03 +0000 (10:34 +0800)]
perf python: Fix the build on 32-bit arm by including missing "util/sample.h"
The 32-bit arm build system will complain:
tools/perf/util/python.c:75:28: error: field ‘sample’ has incomplete type
75 | struct perf_sample sample;
However, the arm64 build does not complain about this.
The root cause is that arm64 defines "HAVE_KVM_STAT_SUPPORT := 1" in
tools/perf/arch/arm64/Makefile, but the arm arch does not. On arm64
this makes kvm-stat.h include other header files, in particular
"util/sample.h", which util/python.c then picks up indirectly.
Include "util/sample.h" directly in "util/python.c" to avoid this build
issue on the arm platform.
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: imx@lists.linux.dev
Link: https://lore.kernel.org/r/20240819023403.201324-1-xu.yang_2@nxp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 16 Aug 2024 23:58:39 +0000 (16:58 -0700)]
perf annotate-data: Update type stat at the end of find_data_type_die()
After trying all possibilities with DWARF and instruction tracking.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 16 Aug 2024 23:58:38 +0000 (16:58 -0700)]
perf annotate-data: Check variables in every scope
Sometimes it matches a variable in the inner scope but it fails because
the actual access can be on a different type. Let's try variables in
every scope and choose the best one using is_better_type().
I have an example with update_blocked_averages(), at first it found a
variable (__mptr) but it's a void pointer. So it moved on to the upper
scope and found another variable (cfs_rq).
$ perf --debug type-profile annotate --data-type --stdio
...
-----------------------------------------------------------
find data type for 0x140(reg14) at update_blocked_averages+0x2db
CU for kernel/sched/fair.c (die:0x12dd892)
frame base: cfa=1 fbreg=7
found "__mptr" (die: 0x13022f1) in scope=4/4 (die: 0x13022e8) failed: no/void pointer
variable location: base=reg14, offset=0x140
type='void*' size=0x8 (die:0x12dd8f9)
found "cfs_rq" (die: 0x1301721) in scope=3/4 (die: 0x130171c) type_offset=0x140
variable location: reg14
type='struct cfs_rq' size=0x1c0 (die:0x12e37e5)
final type: type='struct cfs_rq' size=0x1c0 (die:0x12e37e5)
IIUC the scope is like below:
1: update_blocked_averages
2: __update_blocked_fair
3: for_each_leaf_cfs_rq_safe
4: list_entry -> (container_of)
The container_of is implemented like:
#define container_of(ptr, type, member) ({ \
void *__mptr = (void *)(ptr); \
static_assert(__same_type(*(ptr), ((type *)0)->member) || \
__same_type(*(ptr), void), \
"pointer type mismatch in container_of()"); \
((type *)(__mptr - offsetof(type, member))); })
That's why we see the __mptr variable first but it failed since it has
no type information.
Then for_each_leaf_cfs_rq_safe() is defined as
#define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) \
list_for_each_entry_safe(cfs_rq, pos, &rq->leaf_cfs_rq_list, \
leaf_cfs_rq_list)
Note that the access was 0x140(r14). And the cfs_rq has
leaf_cfs_rq_list at the 0x140. So it converts the list_head pointer to
a pointer to struct cfs_rq here.
$ pahole --hex -C cfs_rq vmlinux | grep 140
struct cfs_rq struct list_head leaf_cfs_rq_list; /* 0x140 0x10 */
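The pointer conversion above can be reproduced in a small userspace sketch. The struct below is a hypothetical stand-in that only mimics the 0x140 offset, and the macro is a simplified container_of without the kernel's static_assert type check:

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for struct cfs_rq: only the offset matters here. */
struct demo_cfs_rq {
	char pad[0x140];
	struct list_head leaf_list;	/* lands at offset 0x140 */
};

/* Simplified container_of: step back from the member to its container. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_cfs_rq *demo_leaf_entry(struct list_head *node)
{
	assert(node != NULL);
	return demo_container_of(node, struct demo_cfs_rq, leaf_list);
}
```

An access through a register holding the container pointer with displacement 0x140 therefore touches leaf_list, which is why the struct type in the outer scope is the better match than the void pointer in the innermost one.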
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 16 Aug 2024 23:58:37 +0000 (16:58 -0700)]
perf annotate-data: Add is_better_type() helper
Sometimes more than one variable is located in the same register or
stack slot. Or it can overwrite existing information with others. I
found this is not helpful in some cases so it needs to update the type
information from the variable only if it's better.
But it's hard to know which one is better, so we need heuristics. :)
As it deals with memory accesses, the location should have a pointer or
something similar (like array or reference). So if it had an integer
type and a variable is a pointer, we can take the variable's type to
resolve the target of the access.
If it has a pointer type and a variable with the same location has a
different pointer type, it'll take one with bigger target type. This
can be useful when the target type embeds a smaller type (like list
header or RB-tree node) at the beginning so their location is same.
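The two rules above could be sketched as follows. This is only an illustration with hypothetical demo_* types; the real helper works on DWARF DIEs and may apply further checks:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type description for this sketch. */
enum demo_kind { DEMO_KIND_INT, DEMO_KIND_POINTER, DEMO_KIND_ARRAY };

struct demo_type {
	enum demo_kind kind;
	size_t target_size;	/* size of the pointed-to type, if any */
};

/* Pointers and arrays are treated the same way. */
bool demo_is_pointer_like(const struct demo_type *t)
{
	return t->kind == DEMO_KIND_POINTER || t->kind == DEMO_KIND_ARRAY;
}

/* Return true if 'cand' should replace 'old' for a memory access. */
bool demo_is_better_type(const struct demo_type *old,
			 const struct demo_type *cand)
{
	assert(old != NULL && cand != NULL);

	/* A non-pointer candidate cannot describe the access target. */
	if (!demo_is_pointer_like(cand))
		return false;
	/* Rule 1: a pointer beats an integer. */
	if (!demo_is_pointer_like(old))
		return true;
	/* Rule 2: between pointers, prefer the bigger target type. */
	return cand->target_size > old->target_size;
}
```

With these rules a pointer to an embedding struct wins over a pointer to the list header or RB-tree node placed at its start.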
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 16 Aug 2024 23:58:36 +0000 (16:58 -0700)]
perf annotate-data: Add is_pointer_type() helper
It treats pointers and arrays in the same way. Let's add the helper and
use it when it checks if it needs a pointer.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 16 Aug 2024 23:58:35 +0000 (16:58 -0700)]
perf annotate-data: Change return type of find_data_type_block()
So that it can return enum variable_match_type to be propagated to the
find_data_type_die(). Also update the debug message to show the result
of the check_matching_type().
chk [dd] reg0 offset=0 ok=1 kind=1 : Good!
or
chk [177] reg4 offset=0x138 ok=0 kind=0 cfa : no type information
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 16 Aug 2024 23:58:34 +0000 (16:58 -0700)]
perf annotate-data: Add variable_state_str()
So that it can show a proper debug message in the right place. The
check_variable() is used in other places which don't want to print the
message.
$ perf --debug type-profile annotate --data-type
Before:
-----------------------------------------------------------
find data type for 0x140(reg14) at update_blocked_averages+0x2db
CU for kernel/sched/fair.c (die:0x12dd892)
frame base: cfa=1 fbreg=7
no pointer or no type <<<--- removed
check variable "__mptr" failed (die: 0x13022f1)
variable location: base=reg14, offset=0x140
type='void*' size=0x8 (die:0x12dd8f9)
After:
-----------------------------------------------------------
find data type for 0x140(reg14) at update_blocked_averages+0x2db
CU for kernel/sched/fair.c (die:0x12dd892)
frame base: cfa=1 fbreg=7
found "__mptr" (die: 0x13022f1) in scope=4/4 (die: 0x13022e8) failed: no/void pointer <<<--- here
variable location: base=reg14, offset=0x140
type='void*' size=0x8 (die:0x12dd8f9)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 16 Aug 2024 23:58:33 +0000 (16:58 -0700)]
perf annotate-data: Add 'enum type_match_result'
And let check_variable() return the enum value so that callers can know
what was the problem. This will be used by the later patch to update
the statistics correctly and print the error message in a right place.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 16 Aug 2024 23:58:32 +0000 (16:58 -0700)]
perf annotate-data: Fix off-by-one in location range check
The location list will have entries with half-open addressing like
[start, end) which means it doesn't include the end address. So it
should skip entries at the end address and match to the next entry.
An example location list looks like this (from readelf -wo):
00237876 ffffffff8110d32b (base address)
0023787f v000000000000000 v000000000000002 views at 00237868 for:
ffffffff8110d32b ffffffff8110d4eb (DW_OP_reg3 (rbx)) <<<--- 1
00237885 v000000000000002 v000000000000000 views at 0023786a for:
ffffffff8110d4eb ffffffff8110d50b (DW_OP_reg14 (r14)) <<<--- 2
0023788c v000000000000000 v000000000000001 views at 0023786c for:
ffffffff8110d50b ffffffff8110d7c4 (DW_OP_reg3 (rbx))
00237893 v000000000000000 v000000000000000 views at 0023786e for:
ffffffff8110d806 ffffffff8110d854 (DW_OP_reg3 (rbx))
0023789a v000000000000000 v000000000000000 views at 00237870 for:
ffffffff8110d876 ffffffff8110d88e (DW_OP_reg3 (rbx))
The first entry at 0023787f has [8110d32b, 8110d4eb) (omitting the
ffffffff at the beginning), and the second one has [8110d4eb, 8110d50b).
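A minimal sketch of the half-open match, with a hypothetical entry struct instead of the libdw location API, shows why an address equal to the end of entry 1 has to fall through to entry 2:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one location list entry covering [start, end). */
struct demo_loc_entry {
	uint64_t start;
	uint64_t end;	/* exclusive */
	int reg;
};

/* Return the index of the entry containing addr, or -1 if none does. */
int demo_find_loc_entry(const struct demo_loc_entry *entries, size_t nr,
			uint64_t addr)
{
	assert(entries != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		/* Writing 'addr <= end' here would be the off-by-one. */
		if (entries[i].start <= addr && addr < entries[i].end)
			return (int)i;
	}
	return -1;
}
```

For the list above, looking up ffffffff8110d4eb returns the second entry (r14) rather than the first (rbx).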
Fixes: 2bc3cf575a162a2c ("perf annotate-data: Improve debug message with location info")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 16 Aug 2024 23:58:31 +0000 (16:58 -0700)]
perf dwarf-aux: Check allowed location expressions when collecting variables
__die_collect_vars_cb() did not call check_allowed_ops(), so it could
incorrectly accept variables with complex location expressions.
For example, I found a variable with this expression:
015d8df8 ffffffff81aacfb3 (base address)
015d8e01 v000000000000004 v000000000000000 views at 015d8df2 for:
ffffffff81aacfb3 ffffffff81aacfd2 (DW_OP_fbreg: -176; DW_OP_deref;
DW_OP_plus_uconst: 332; DW_OP_deref_size: 4;
DW_OP_lit1; DW_OP_shra; DW_OP_const1u: 64;
DW_OP_minus; DW_OP_stack_value)
015d8e14 v000000000000000 v000000000000000 views at 015d8df4 for:
ffffffff81aacfd2 ffffffff81aacfd7 (DW_OP_reg3 (rbx))
015d8e19 v000000000000000 v000000000000000 views at 015d8df6 for:
ffffffff81aacfd7 ffffffff81aad020 (DW_OP_fbreg: -176; DW_OP_deref;
DW_OP_plus_uconst: 332; DW_OP_deref_size: 4;
DW_OP_lit1; DW_OP_shra; DW_OP_const1u: 64;
DW_OP_minus; DW_OP_stack_value)
015d8e2c <End of list>
It looks like '((int *)(-176(%rbp) + 332) >> 1) - 64' but the current
code thought it's just -176(%rbp) and processed the variable incorrectly.
It should reject such a complex expression if check_allowed_ops()
doesn't like it. :)
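The kind of filtering meant here can be sketched with an allow-list. The operator tags below are illustrative (not libdw's DW_OP_* constants), and which operators count as harmless is an assumption of this sketch, not a statement about what check_allowed_ops() accepts:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative operator tags; these are not the DW_OP_* values. */
enum demo_op {
	DEMO_OP_REG,
	DEMO_OP_FBREG,
	DEMO_OP_DEREF,
	DEMO_OP_DEREF_SIZE,
	DEMO_OP_PLUS_UCONST,
	DEMO_OP_LIT,
	DEMO_OP_SHRA,
	DEMO_OP_CONST,
	DEMO_OP_MINUS,
	DEMO_OP_STACK_VALUE,
};

/* Accept an expression only if it is a plain register/frame location. */
bool demo_ops_allowed(const enum demo_op *ops, size_t nr)
{
	assert(ops != NULL && nr > 0);

	/* The first operator names the base location. */
	if (ops[0] != DEMO_OP_REG && ops[0] != DEMO_OP_FBREG)
		return false;

	/* Assumed allow-list: operators that do no arithmetic on the value. */
	for (size_t i = 1; i < nr; i++) {
		switch (ops[i]) {
		case DEMO_OP_DEREF:
		case DEMO_OP_DEREF_SIZE:
		case DEMO_OP_STACK_VALUE:
			continue;
		default:
			return false;	/* arithmetic changes the value */
		}
	}
	return true;
}
```

Under this assumption the nine-operator expression above is rejected at its plus_uconst step, while the single DW_OP_reg3 entry is accepted.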
Fixes: 932dcc2c39aedf54 ("perf dwarf-aux: Add die_collect_vars()")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240816235840.2754937-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 16 Aug 2024 22:43:16 +0000 (19:43 -0300)]
Merge remote-tracking branch 'torvalds/master' into perf-tools-next
To pick up the latest perf-tools merge for 6.11, i.e. to have the
current perf tools branch that is getting into 6.11 with the
perf-tools-next that is geared towards 6.12.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Yicong Yang [Fri, 2 Aug 2024 06:58:00 +0000 (14:58 +0800)]
perf stat: Display iostat headers correctly
Currently we only print metric headers for the metric leader in
aggregation mode. As a result the `perf iostat` header is not shown,
since it is aggregated globally but doesn't have metric events:
root@ubuntu204:/home/yang/linux/tools/perf# ./perf stat --iostat --timeout 1000
Performance counter stats for 'system wide':
port
0000:00 0 0 0 0
0000:80 0 0 0 0
[...]
Fix this by excluding the iostat in the check of printing metric
headers. Then we can see the headers:
root@ubuntu204:/home/yang/linux/tools/perf# ./perf stat --iostat --timeout 1000
Performance counter stats for 'system wide':
port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound Write(MB)
0000:00 0 0 0 0
0000:80 0 0 0 0
[...]
Fixes: 193a9e30207f5477 ("perf stat: Don't display metric header for non-leader uncore events")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Junhao He <hejunhao3@huawei.com>
Cc: linuxarm@huawei.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Zeng Tao <prime.zeng@hisilicon.com>
Link: https://lore.kernel.org/r/20240802065800.48774-1-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Yang Jihong [Tue, 6 Aug 2024 02:35:33 +0000 (10:35 +0800)]
perf sched timehist: Fix missing free of session in perf_sched__timehist()
When perf_time__parse_str() fails in perf_sched__timehist(), the
session that was previously created needs to be freed. Fix it.
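The usual shape of such a fix is a single cleanup label that every failure path after the allocation goes through. A self-contained sketch, with hypothetical demo_* helpers standing in for the perf session and time-parsing functions:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical session object; the counter only exists to spot leaks. */
struct demo_session {
	int unused;
};

int demo_live_sessions;

struct demo_session *demo_session_new(void)
{
	struct demo_session *s = calloc(1, sizeof(*s));

	if (s)
		demo_live_sessions++;
	return s;
}

void demo_session_delete(struct demo_session *s)
{
	if (!s)
		return;
	free(s);
	demo_live_sessions--;
}

/* Stand-in for a parser that can fail: rejects NULL or empty strings. */
int demo_parse_time(const char *str)
{
	return (str && *str) ? 0 : -1;
}

int demo_timehist(const char *time_str)
{
	struct demo_session *session = demo_session_new();
	int err = -1;

	if (!session)
		return -1;

	/* Jump to the cleanup label instead of returning directly. */
	if (demo_parse_time(time_str) != 0)
		goto out;

	assert(demo_live_sessions > 0);	/* session is valid while work runs */
	err = 0;
out:
	demo_session_delete(session);
	return err;
}
```

A direct 'return -1' after the failed parse would skip demo_session_delete() and leak the session.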
Fixes: 853b74071110bed3 ("perf sched timehist: Add option to specify time window of interest")
Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsa@cumulusnetworks.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240806023533.1316348-1-yangjihong@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Fri, 16 Aug 2024 21:03:31 +0000 (14:03 -0700)]
Merge tag 'block-6.11-20240824' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- Fix corruption issues with s390/dasd (Eric, Stefan)
- Fix a lock being taken with the non-IRQ-safe locking variant (Li)
- MD pull request with a single data corruption fix for raid1 (Yu)
* tag 'block-6.11-20240824' of git://git.kernel.dk/linux:
block: Fix lockdep warning in blk_mq_mark_tag_wait
md/raid1: Fix data corruption for degraded array with slow disk
s390/dasd: fix error recovery leading to data corruption on ESE devices
s390/dasd: Remove DMA alignment
Linus Torvalds [Fri, 16 Aug 2024 21:00:05 +0000 (14:00 -0700)]
Merge tag 'io_uring-6.11-20240824' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Fix a comment in the uapi header using the wrong member name (Caleb)
- Fix KCSAN warning for a debug check in sqpoll (me)
- Two more NAPI tweaks (Olivier)
* tag 'io_uring-6.11-20240824' of git://git.kernel.dk/linux:
io_uring: fix user_data field name in comment
io_uring/sqpoll: annotate debug task == current with data_race()
io_uring/napi: remove duplicate io_napi_entry timeout assignation
io_uring/napi: check napi_enabled in io_napi_add() before proceeding
Linus Torvalds [Fri, 16 Aug 2024 20:50:33 +0000 (13:50 -0700)]
Merge tag 'devicetree-fixes-for-6.11-2' of git://git./linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fix a possible (but unlikely) out-of-bounds read in interrupts
parsing code
- Add AT25 EEPROM "fujitsu,mb85rs256" compatible
- Update Konrad Dybcio's email
* tag 'devicetree-fixes-for-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of/irq: Prevent device address out-of-bounds read in interrupt map walk
dt-bindings: eeprom: at25: add fujitsu,mb85rs256 compatible
dt-bindings: Batch-update Konrad Dybcio's email
Linus Torvalds [Fri, 16 Aug 2024 18:49:07 +0000 (11:49 -0700)]
Merge tag 'thermal-6.11-rc4' of git://git./linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"Fix a Bang-bang thermal governor issue causing it to fail to reset the
state of cooling devices if they are 'on' to start with, but the
thermal zone temperature is always below the corresponding trip point
(Rafael Wysocki)"
* tag 'thermal-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: gov_bang_bang: Use governor_data to reduce overhead
thermal: gov_bang_bang: Add .manage() callback
thermal: gov_bang_bang: Split bang_bang_control()
thermal: gov_bang_bang: Call __thermal_cdev_update() directly
Linus Torvalds [Fri, 16 Aug 2024 18:43:54 +0000 (11:43 -0700)]
Merge tag 'acpi-6.11-rc4' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Fix an issue related to the ACPI EC device handling that causes the
_REG control method to be evaluated for EC operation regions that are
not expected to be used.
This confuses the platform firmware and provokes various types of
misbehavior on some systems (Rafael Wysocki)"
* tag 'acpi-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: EC: Evaluate _REG outside the EC scope more carefully
ACPICA: Add a depth argument to acpi_execute_reg_methods()
Revert "ACPI: EC: Evaluate orphan _REG under EC device"