Ian Rogers [Sat, 1 Feb 2025 07:43:20 +0000 (23:43 -0800)]
perf stat: Changes to event name uniquification
The existing logic would disable uniquification on an evlist or enable
it per evsel, this is unfortunate as uniquification is most needed
when events have the same name and so the whole evlist must be
considered. Change the initial disable uniquify on an evlist
processing to also set a needs_uniquify flag, for cases like the
matching event names. This must be done as an initial pass as
uniquification of an event name will change the behavior of the
check. Keep the per counter uniquification but now only uniquify event
names when the needs_uniquify flag is set.
Before this change a hwmon like temp1 wouldn't be uniquified and
afterwards it will (ie the PMU is added to the temp1 event's name).
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Sat, 1 Feb 2025 07:43:19 +0000 (23:43 -0800)]
perf stat: Don't merge counters purely on name
Counter merging was added in commit
942c5593393d ("perf stat: Add
perf_stat_merge_counters()"), however, it merges events with the same
name on different PMUs regardless of whether the different PMUs are
actually of the same type (ie they differ only in the suffix on the
PMU). For hwmon events there may be a temp1 event on every PMU, but
the PMU names are all unique and don't differ just by a suffix. The
merging is over eager and will merge all the hwmon counters together
meaning an aggregated and very large temp1 value is shown. The same
would be true for say cache events and memory controller events where
the PMUs differ but the event names are the same.
Fix the problem by correctly saying two PMUs alias when they differ
only by suffix.
Note, there is an overlap with evsel's merged_stat with aggregation
and the evsel's metric_leader where aggregation happens for metrics.
Fixes:
942c5593393d ("perf stat: Add perf_stat_merge_counters()")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Sat, 1 Feb 2025 07:43:18 +0000 (23:43 -0800)]
perf pmu: Rename name matching for no suffix or wildcard variants
Wildcard PMU naming will match a name like pmu_1 to a PMU name like
pmu_10 but not to a PMU name like pmu_2 as the suffix forms part of
the match. No suffix matching will match pmu_10 to either pmu_1 or
pmu_2. Add or rename matching functions on PMU to make it clearer what
kind of matching is being performed.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Sat, 1 Feb 2025 07:43:17 +0000 (23:43 -0800)]
perf pmus: Restructure pmu_read_sysfs to scan fewer PMUs
Rather than scanning core or all PMUs, allow pmu_read_sysfs to read
some combination of core, other, hwmon and tool PMUs. The PMUs that
should be read and are already read are held as bitmaps. It is known
that a "hwmon_" prefix is necessary for a hwmon PMU's name, similarly
with "tool", so only scan those PMUs in situations the PMU name or the
PMU's type number make sense to.
The number of openat system calls reduces from 276 to 98 for a hwmon
event. The number of openats for regular perf events isn't changed.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Sat, 1 Feb 2025 07:43:16 +0000 (23:43 -0800)]
perf evsel: Reduce scanning core PMUs in is_hybrid
evsel__is_hybrid returns true if there are multiple core PMUs and the
evsel is for a core PMU. Determining the number of core PMUs can
require loading/scanning PMUs. There's no point doing the scanning if
evsel for the is_hybrid test isn't core so reorder the tests to reduce
PMU scanning.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Thomas Richter [Fri, 31 Jan 2025 11:24:00 +0000 (12:24 +0100)]
perf test: Fix Hwmon PMU test endianess issue
perf test 11 hwmon fails on s390 with this error
# ./perf test -Fv 11
--- start ---
---- end ----
11.1: Basic parsing test : Ok
--- start ---
Testing 'temp_test_hwmon_event1'
Using CPUID IBM,3931,704,A01,3.7,002f
temp_test_hwmon_event1 -> hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/
FAILED tests/hwmon_pmu.c:189 Unexpected config for
'temp_test_hwmon_event1',
292470092988416 != 655361
---- end ----
11.2: Parsing without PMU name : FAILED!
--- start ---
Testing 'hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/'
FAILED tests/hwmon_pmu.c:189 Unexpected config for
'hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/',
292470092988416 != 655361
---- end ----
11.3: Parsing with PMU name : FAILED!
#
The root cause is in member test_event::config which is initialized
to 0xA0001 or 655361. During event parsing a long list event parsing
functions are called and end up with this gdb call stack:
#0 hwmon_pmu__config_term (hwm=0x168dfd0, attr=0x3ffffff5ee8,
term=0x168db60, err=0x3ffffff81c8) at util/hwmon_pmu.c:623
#1 hwmon_pmu__config_terms (pmu=0x168dfd0, attr=0x3ffffff5ee8,
terms=0x3ffffff5ea8, err=0x3ffffff81c8) at util/hwmon_pmu.c:662
#2 0x00000000012f870c in perf_pmu__config_terms (pmu=0x168dfd0,
attr=0x3ffffff5ee8, terms=0x3ffffff5ea8, zero=false,
apply_hardcoded=false, err=0x3ffffff81c8) at util/pmu.c:1519
#3 0x00000000012f88a4 in perf_pmu__config (pmu=0x168dfd0, attr=0x3ffffff5ee8,
head_terms=0x3ffffff5ea8, apply_hardcoded=false, err=0x3ffffff81c8)
at util/pmu.c:1545
#4 0x00000000012680c4 in parse_events_add_pmu (parse_state=0x3ffffff7fb8,
list=0x168dc00, pmu=0x168dfd0, const_parsed_terms=0x3ffffff6090,
auto_merge_stats=true, alternate_hw_config=10)
at util/parse-events.c:1508
#5 0x00000000012684c6 in parse_events_multi_pmu_add (parse_state=0x3ffffff7fb8,
event_name=0x168ec10 "temp_test_hwmon_event1", hw_config=10,
const_parsed_terms=0x0, listp=0x3ffffff6230, loc_=0x3ffffff70e0)
at util/parse-events.c:1592
#6 0x00000000012f0e4e in parse_events_parse (_parse_state=0x3ffffff7fb8,
scanner=0x16878c0) at util/parse-events.y:293
#7 0x00000000012695a0 in parse_events__scanner (str=0x3ffffff81d8
"temp_test_hwmon_event1", input=0x0, parse_state=0x3ffffff7fb8)
at util/parse-events.c:1867
#8 0x000000000126a1e8 in __parse_events (evlist=0x168b580,
str=0x3ffffff81d8 "temp_test_hwmon_event1", pmu_filter=0x0,
err=0x3ffffff81c8, fake_pmu=false, warn_if_reordered=true,
fake_tp=false) at util/parse-events.c:2136
#9 0x00000000011e36aa in parse_events (evlist=0x168b580,
str=0x3ffffff81d8 "temp_test_hwmon_event1", err=0x3ffffff81c8)
at /root/linux/tools/perf/util/parse-events.h:41
#10 0x00000000011e3e64 in do_test (i=0, with_pmu=false, with_alias=false)
at tests/hwmon_pmu.c:164
#11 0x00000000011e422c in test__hwmon_pmu (with_pmu=false)
at tests/hwmon_pmu.c:219
#12 0x00000000011e431c in test__hwmon_pmu_without_pmu (test=0x1610368
<suite.hwmon_pmu>, subtest=1) at tests/hwmon_pmu.c:23
where the attr::config is set to value
292470092988416 or 0x10a0000000000
in line 625 of file ./util/hwmon_pmu.c:
attr->config = key.type_and_num;
However member key::type_and_num is defined as union and bit field:
union hwmon_pmu_event_key {
long type_and_num;
struct {
int num :16;
enum hwmon_type type :8;
};
};
s390 is big endian and Intel is little endian architecture.
The events for the hwmon dummy pmu have num = 1 or num = 2 and
type is set to HWMON_TYPE_TEMP (which is 10).
On s390 this assignes member key::type_and_num the value of
0x10a0000000000 (which is
292470092988416) as shown in above
trace output.
Fix this and export the structure/union hwmon_pmu_event_key
so the test shares the same implementation as the event parsing
functions for union and bit fields. This should avoid
endianess issues on all platforms.
Output after:
# ./perf test -F 11
11.1: Basic parsing test : Ok
11.2: Parsing without PMU name : Ok
11.3: Parsing with PMU name : Ok
#
Fixes:
531ee0fd4836 ("perf test: Add hwmon "PMU" test")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250131112400.568975-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Thomas Richter [Fri, 31 Jan 2025 10:27:56 +0000 (11:27 +0100)]
perf test: Use cycles event in perf record test for leader_sampling
On s390 the event instructions can not be used for recording.
This event is only supported by perf stat.
Change the event from instructions to cycles in subtest
test_leader_sampling.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Suggested-by: James Clark <james.clark@linaro.org>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250131102756.4185235-3-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Thomas Richter [Fri, 31 Jan 2025 10:27:55 +0000 (11:27 +0100)]
perf test: Fix perf record test for precise_max
On s390 the event instructions can not be used for recording.
This event is only supported by perf stat.
Test that each event cycles and instructions supports sampling.
If the event can not be sampled, skip it.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Suggested-by: James Clark <james.clark@linaro.org>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250131102756.4185235-2-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Anubhav Shelat [Fri, 31 Jan 2025 14:57:05 +0000 (09:57 -0500)]
perf script: force stdin for flamegraph in live mode
Currently, running "perf script flamegraph -a -F 99 sleep 1" should
produce flamegraph.html containing the flamegraph. Howevever, it gives a
segmentation fault.
This is caused because the flamegraph.py script is
supposed to take as input the output of "perf record", which should be
in stdin. This would require passing "-i -" to flamegraph.py. However,
the "flamegraph-report" script causes "perf script" command to take the
"-i -" option instead of flamegraph.py, which causes no problem for
"perf script", but causes a seg fault since flamegraph.py has no input
file. To fix this I added the "-i -" option directly to the
flamegraph-report script to ensure flamegraph.py gets input from stdin.
Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
Tested-by: Michael Petlan <mpetlan@redhat.com>
Link: https://lore.kernel.org/r/20250131145704.3164542-2-ashelat@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Thu, 30 Jan 2025 17:01:35 +0000 (09:01 -0800)]
perf test: Extra verbosity and hypervisor skip for tpebs test
When not running as root and with higher perf event paranoia values
the perf record forked by TPEBS can fail to attach to the process. Skip
the test in these scenarios.
Intel TPEBS test skips on non-Intel CPUs. On Intel CPUs under a
hypervisor the cache-misses event may not be present or precise. Skip
the test under this condition.
Refactor the output code to be placed in a file so that on a signal
the file can be dumped. This was necessary to catch the issue above as
the failing perf record command would fail without output.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250130170135.5817-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
James Clark [Wed, 29 Jan 2025 15:44:05 +0000 (15:44 +0000)]
perf: Always feature test reallocarray
This is also used in util/comm.c now, so instead of selectively doing
the feature test, always do it. If it's ever used anywhere else it's
less likely to cause another build failure.
This doesn't remove the need to manually include libc_compat.h, and
missing that will still cause an error for glibc < 2.26. There isn't a
way to fix that without poisoning reallocarray like libbpf did, but that
has other downsides like making memory debugging tools less useful. So
for Perf keep it like this and we'll have to fix up any missed includes.
Fixes the following build error:
util/comm.c:152:31: error: implicit declaration of function
'reallocarray' [-Wimplicit-function-declaration]
152 | tmp = reallocarray(comm_strs->strs,
| ^~~~~~~~~~~~
Fixes:
13ca628716c6 ("perf comm: Add reference count checking to 'struct comm_str'")
Reported-by: Ali Utku Selen <ali.utku.selen@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250129154405.777533-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Thu, 9 Jan 2025 22:21:07 +0000 (14:21 -0800)]
perf stat: Fix find_stat for mixed legacy/non-legacy events
Legacy events typically don't have a PMU when added leading to
mismatched legacy/non-legacy cases in find_stat. Use evsel__find_pmu
to make sure the evsel PMU is looked up. Update the evsel__find_pmu
code to look for the PMU using the extended config type or, for legacy
hardware/hw_cache events on non-hybrid systems, just use the core PMU.
Before:
```
$ perf stat -e cycles,cpu/instructions/ -a sleep 1
Performance counter stats for 'system wide':
215,309,764 cycles
44,326,491 cpu/instructions/
1.
002555314 seconds time elapsed
```
After:
```
$ perf stat -e cycles,cpu/instructions/ -a sleep 1
Performance counter stats for 'system wide':
990,676,332 cycles
1,235,762,487 cpu/instructions/ # 1.25 insn per cycle
1.
002667198 seconds time elapsed
```
Fixes:
3612ca8e2935 ("perf stat: Fix the hard-coded metrics calculation on the hybrid")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250109222109.567031-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Thu, 9 Jan 2025 22:21:06 +0000 (14:21 -0800)]
perf evsel: Add pmu_name helper
Add helper to get the name of the evsel's PMU. This handles the case
where there's no sysfs PMU via parse_events event_type helper.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250109222109.567031-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
James Clark [Wed, 22 Jan 2025 16:35:03 +0000 (16:35 +0000)]
perf vendor events arm64: Add V3 events/metrics
Using the scripts at:
https://gitlab.arm.com/telemetry-solution/telemetry-solution/
Generate perf json for neoverse-v3 using the following command:
```
$ telemetry-solution/tools/perf_json_generator/generate.py \
tools/perf/ --telemetry-files \
telemetry-solution/data/pmu/cpu/neoverse/neoverse-v3.json
```
Signed-off-by: Ian Rogers <irogers@google.com>
[Re-generate after updating script]
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250122163504.2061472-3-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
James Clark [Wed, 22 Jan 2025 16:35:02 +0000 (16:35 +0000)]
perf vendor events arm64: Add N3 events/metrics
Using the scripts at:
https://gitlab.arm.com/telemetry-solution/telemetry-solution/
Generate perf json for neoverse-n3 using the following command:
```
$ telemetry-solution/tools/perf_json_generator/generate.py \
tools/perf/ --telemetry-files \
telemetry-solution/data/pmu/cpu/neoverse/neoverse-n3.json
```
Signed-off-by: Ian Rogers <irogers@google.com>
[Re-generate after updating script]
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250122163504.2061472-2-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Benjamin Peterson [Thu, 23 Jan 2025 23:59:35 +0000 (15:59 -0800)]
perf trace: Fix return value of trace__fprintf_tp_fields
This function formerly returned twice the number of bytes printed.
Signed-off-by: Benjamin Peterson <benjamin@engflow.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Link: https://lore.kernel.org/r/20250123-void-fprintf_tp_fields-v2-1-6038f8224987@engflow.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Athira Rajeev [Fri, 10 Jan 2025 09:46:20 +0000 (15:16 +0530)]
perf test: Update event_groups test to use instructions
In some of the powerpc platforms, event group testcase fails as below:
# perf test -v 'Event groups'
69: Event groups :
--- start ---
test child forked, pid 9765
Using CPUID 0x00820200
Using hv_24x7 for uncore pmu event
0x0 0x0, 0x0 0x0, 0x0 0x0: Fail
0x0 0x0, 0x0 0x0, 0x1 0x3: Pass
The testcase creates various combinations of hw, sw and uncore
PMU events and verify group creation succeeds or fails as expected.
This tests one of the limitation in perf where it doesn't allow
creating a group of events from different hw PMUs.
The testcase starts a leader event and opens two sibling events.
The combination the fails is three hardware events in a group.
"0x0 0x0, 0x0 0x0, 0x0 0x0: Fail"
Type zero and config zero which translates to PERF_TYPE_HARDWARE
and PERF_COUNT_HW_CPU_CYCLE. There is event constraint in powerpc
that events using same counter cannot be programmed in a group.
Here there is one alternative event for cycles, hence one leader
and only one sibling event can go in as a group.
if all three events (leader and two sibling events), are hardware
events, use instructions as one of the sibling event. Since
PERF_COUNT_HW_INSTRUCTIONS is a generic hardware event and present
in all architectures, use this as third event.
Reported-by: Tejas Manhas <Tejas.Manhas1@ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20250110094620.94976-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Kuan-Wei Chiu [Thu, 16 Jan 2025 11:08:42 +0000 (19:08 +0800)]
perf bench: Fix undefined behavior in cmpworker()
The comparison function cmpworker() violates the C standard's
requirements for qsort() comparison functions, which mandate symmetry
and transitivity:
Symmetry: If x < y, then y > x.
Transitivity: If x < y and y < z, then x < z.
In its current implementation, cmpworker() incorrectly returns 0 when
w1->tid < w2->tid, which breaks both symmetry and transitivity. This
violation causes undefined behavior, potentially leading to issues such
as memory corruption in glibc [1].
Fix the issue by returning -1 when w1->tid < w2->tid, ensuring
compliance with the C standard and preventing undefined behavior.
Link: https://www.qualys.com/2024/01/30/qsort.txt
Fixes:
121dd9ea0116 ("perf bench: Add epoll parallel epoll_wait benchmark")
Cc: stable@vger.kernel.org
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250116110842.4087530-1-visitorckw@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 17 Jan 2025 18:18:48 +0000 (10:18 -0800)]
perf annotate: Prefer passing evsel to evsel->core.idx
An evsel idx may not be stable due to sorting, evlist removal,
etc. Try to reduce it being part of APIs by explicitly passing the
evsel in annotate code. Internally the code just reads evsel->core.idx
so behavior is unchanged.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Chen Ni <nichen@iscas.ac.cn>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20250117181848.690474-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Chun-Tse Shao [Thu, 16 Jan 2025 23:58:16 +0000 (15:58 -0800)]
perf lock: Rename fields in lock_type_table
`lock_type_table` contains `name` and `str` which can be confusing.
Rename them to `flags_name` and `lock_name` and add descriptions to
enhance understanding.
Tested by building perf for x86.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Cc: nick.forrington@arm.com
Link: https://lore.kernel.org/r/20250116235838.2769691-3-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Chun-Tse Shao [Thu, 16 Jan 2025 23:58:15 +0000 (15:58 -0800)]
perf lock: Add percpu-rwsem for type filter
percpu-rwsem was missing in man page. And for backward compatibility,
replace `pcpu-sem` with `percpu-rwsem` before parsing lock name.
Tested `./perf lock con -ab -Y pcpu-sem` and `./perf lock con -ab -Y
percpu-rwsem`
Fixes:
4f701063bfa2 ("perf lock contention: Show lock type with address")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Cc: nick.forrington@arm.com
Link: https://lore.kernel.org/r/20250116235838.2769691-2-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Chun-Tse Shao [Thu, 16 Jan 2025 23:58:14 +0000 (15:58 -0800)]
perf lock: Fix parse_lock_type which only retrieve one lock flag
`parse_lock_type` can only add the first lock flag in `lock_type_table`
given input `str`. For example, for `Y rwlock`, it only adds `rwlock:R`
into this perf session. Another example is for `-Y mutex`, it only adds
the mutex without `LCB_F_SPIN` flag. The patch fixes this issue, makes
sure both `rwlock:R` and `rwlock:W` will be added with `-Y rwlock`, and
so on.
Testing:
$ ./perf lock con -ab -Y mutex,rwlock -- perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed
1000000 pipe operations between two processes
Total time: 9.313 [sec]
9.313976 usecs/op
107365 ops/sec
contended total wait max wait avg wait type caller
176 1.65 ms 19.43 us 9.38 us mutex pipe_read+0x57
34 180.14 us 10.93 us 5.30 us mutex pipe_write+0x50
7 77.48 us 16.09 us 11.07 us mutex do_epoll_wait+0x24d
7 74.70 us 13.50 us 10.67 us mutex do_epoll_wait+0x24d
3 35.97 us 14.44 us 11.99 us rwlock:W ep_done_scan+0x2d
3 35.00 us 12.23 us 11.66 us rwlock:W do_epoll_wait+0x255
2 15.88 us 11.96 us 7.94 us rwlock:W do_epoll_wait+0x47c
1 15.23 us 15.23 us 15.23 us rwlock:W do_epoll_wait+0x4d0
1 14.26 us 14.26 us 14.26 us rwlock:W ep_done_scan+0x2d
2 14.00 us 7.99 us 7.00 us mutex pipe_read+0x282
1 12.29 us 12.29 us 12.29 us rwlock:R ep_poll_callback+0x35
1 12.02 us 12.02 us 12.02 us rwlock:W do_epoll_ctl+0xb65
1 10.25 us 10.25 us 10.25 us rwlock:R ep_poll_callback+0x35
1 7.86 us 7.86 us 7.86 us mutex do_epoll_ctl+0x6c1
1 5.04 us 5.04 us 5.04 us mutex do_epoll_ctl+0x3d4
[namhyung: Add a comment and rename to 'mutex:spin' for consistency
Fixes:
d783ea8f62c4 ("perf lock contention: Simplify parse_lock_type()")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Cc: nick.forrington@arm.com
Link: https://lore.kernel.org/r/20250116235838.2769691-1-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Athira Rajeev [Fri, 10 Jan 2025 09:37:30 +0000 (15:07 +0530)]
perf lock: Fix return code for functions in __cmd_contention
perf lock contention returns zero exit value even if the lock contention
BPF setup failed.
# ./perf lock con -b true
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find '.BTF' ELF section in /lib/modules/6.13.0-rc3+/build/vmlinux
libbpf: failed to find valid kernel BTF
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find '.BTF' ELF section in /lib/modules/6.13.0-rc3+/build/vmlinux
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -ESRCH
libbpf: failed to load object 'lock_contention_bpf'
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -ESRCH
Failed to load lock-contention BPF skeleton
lock contention BPF setup failed
# echo $?
0
Fix this by saving the return code for lock_contention_prepare
so that command exits with proper return code. Similarly set the
return code properly for two other functions in builtin-lock, namely
setup_output_field() and select_key().
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250110093730.93610-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Dmitry Vyukov [Wed, 8 Jan 2025 06:59:34 +0000 (07:59 +0100)]
perf hist: Fix width calculation in hpp__fmt()
hpp__width_fn() round up width to length of the field name,
hpp__fmt() should do it too. Otherwise, the numbers may
end up unaligned if the field name is long.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250108065949.235718-1-dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Dmitry Vyukov [Wed, 15 Jan 2025 08:00:42 +0000 (09:00 +0100)]
perf hist: Fix bogus profiles when filters are enabled
When a filtered column is not present in the sort order, profiles become
arbitrary broken. Filtered and non-filtered entries are collapsed
together, and the filtered-by field ends up with a random value (either
from a filtered or non-filtered entry). If we end up with filtered
entry/value, then the whole collapsed entry will be filtered out and will
be missing in the profile. If we end up with non-filtered entry/value,
then the overhead value will be wrongly larger (include some subset
of filtered out samples).
This leads to very confusing profiles. The problem is hard to notice,
and if noticed hard to understand. If the filter is for a single value,
then it can be fixed by adding the corresponding field to the sort order
(provided user understood the problem). But if the filter is for multiple
values, it's impossible to fix b/c there is no concept of binary sorting
based on filter predicate (we want to group all non-filtered values in
one bucket, and all filtered values in another).
Examples of affected commands:
perf report --tid=123
perf report --sort overhead,symbol --comm=foo,bar
Fix this by considering filtered status as the highest priority
sort/collapse predicate.
As a side effect this effectively adds a new feature of showing profile
where several lines are combined based on arbitrary filtering predicate.
For example, showing symbols from binaries foo and bar combined together,
but not from other binaries; or showing combined overhead of several
particular threads.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/359dc444ce94d20e59d3a9e360c36fbeac833a04.1736927981.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Dmitry Vyukov [Wed, 15 Jan 2025 08:00:41 +0000 (09:00 +0100)]
perf hist: Deduplicate cmp/sort/collapse code
Application of cmp/sort/collapse fmt callbacks is duplicated 6 times.
Factor it into a common helper function. NFC.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/84c4b55614e24a344f86ae0db62e8fa8f251f874.1736927981.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 10 Jan 2025 04:57:36 +0000 (20:57 -0800)]
perf test: Improve verbose documentation
Add a little more detail on the output expectations for each verbose
level.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250110045736.598281-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 10 Jan 2025 04:57:35 +0000 (20:57 -0800)]
perf test: Add a runs-per-test flag
To detect flakes it is useful to run tests more than once. Add a
runs-per-test flag that will run each test multiple times. Example
output:
```
$ perf test -r 3 lbr -v
122: perf record LBR tests : Ok
122: perf record LBR tests : Ok
122: perf record LBR tests : Ok
```
Update the documentation for the runs-per-test option.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250110045736.598281-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 10 Jan 2025 04:57:34 +0000 (20:57 -0800)]
perf test: Fix parallel/sequential option documentation
The parallel option was removed in commit
94d1a913bdc4 ("perf test:
Make parallel testing the default"). Update the sequential
documentation to reflect it isn't the default except for "exclusive"
tests.
Fixes:
94d1a913bdc4 ("perf test: Make parallel testing the default")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250110045736.598281-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 10 Jan 2025 04:57:33 +0000 (20:57 -0800)]
perf test: Send list output to stdout rather than stderr
Follow the workload listing in using stdout rather than
stderr. Correct the numbering of sub-tests to be 1.1 rather than 1:1.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250110045736.598281-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 10 Jan 2025 04:57:32 +0000 (20:57 -0800)]
perf test: Rename functions and variables for better clarity
The relationship between subtests and test cases is somewhat
confusing, so let's do away with the notion of sub-tests and switch to
just working with some number of test cases. Add a
test_suite__for_each_test_case as in many cases, except the special
one test case situation, the iteration can just be on all test
cases. Switch variable names to be more intention revealing of what
their value is.
This work was motivated by discussion with Kan where it was noted the
code is becoming overly indented:
https://lore.kernel.org/lkml/
20241109160219.49976-1-irogers@google.com/
Unifying more of the sub-test/no-sub-tests avoids one level of
indentation in a number of places.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250110045736.598281-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Charlie Jenkins [Tue, 14 Jan 2025 19:35:44 +0000 (11:35 -0800)]
perf tools: Expose quiet/verbose variables in Makefile.perf
The variables to make builds silent/verbose live inside
tools/build/Makefile.build. Move those variables to the top-level
Makefile.perf to be generally available.
Committer testing:
See the SYSCALL lines, now they are consistent with the other
operations in other lines:
SYSTBL /tmp/build/perf-tools-next/arch/x86/include/generated/asm/syscalls_32.h
SYSTBL /tmp/build/perf-tools-next/arch/x86/include/generated/asm/syscalls_64.h
GEN /tmp/build/perf-tools-next/common-cmds.h
GEN /tmp/build/perf-tools-next/arch/arm64/include/generated/asm/sysreg-defs.h
PERF_VERSION = 6.13.rc2.g3d94bb6ed1d0
GEN perf-archive
MKDIR /tmp/build/perf-tools-next/jvmti/
MKDIR /tmp/build/perf-tools-next/jvmti/
MKDIR /tmp/build/perf-tools-next/jvmti/
MKDIR /tmp/build/perf-tools-next/jvmti/
GEN perf-iostat
CC /tmp/build/perf-tools-next/jvmti/libjvmti.o
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20250114-perf_make_test-v1-1-decc1c517b11@rivosinc.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Arnaldo Carvalho de Melo [Tue, 14 Jan 2025 18:05:56 +0000 (15:05 -0300)]
perf config: Add a function to set one variable in .perfconfig
To allow for setting a variable from some other tool, like with the
"wallclock" patchset needs to allow the user to opt-in to having
that key in the sort order for 'perf report'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/lkml/Z4akewi7UPXpagce@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Mon, 13 Jan 2025 18:25:57 +0000 (19:25 +0100)]
perf test perftool_testsuite: Return correct value for skipping
In 'perf test', a return value 2 represents that the test case was
skipped. Fix this value for perftool_testsuite test cases to
differentiate between skip and pass values.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250113182605.130719-3-vmolnaro@redhat.com
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Veronika Molnarova [Mon, 13 Jan 2025 18:25:56 +0000 (19:25 +0100)]
perf test perftool_testsuite: Add missing description
Properly name the test cases of perftool_testsuite instead of the
license being taken as the name for 'perf test'.
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250113182605.130719-2-vmolnaro@redhat.com
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Leo Yan [Mon, 2 Dec 2024 11:19:58 +0000 (11:19 +0000)]
perf test record+probe_libc_inet_pton: Make test resilient
The test failed back and forth due to the call chain being heavily
impacted by the libc, which varies across different architectures and
distros.
The libc contains the symbols for "gaih_inet" and "getaddrinfo" in some
cases, but not always. Moreover, these symbols can be either normal
symbols or dynamic symbols, making it difficult to decide the call chain
entries due to the symbols are inconsistent.
To fix the issue, this commit identifies three call chain entries are
always present. These entries are matched by iterating through the
lines in the "perf script" result. The recording attribute max-stack is
set to 4 for the possible maximum call chain depth.
After:
# perf test -vF pton
--- start ---
Pattern: ping[][0-9 \.:]+probe_libc:inet_pton: \([[:xdigit:]]+\)
Matching: ping 285058 [025]
1219802.466939: probe_libc:inet_pton: (
ffffa14b7cf0)
Pattern: .*inet_pton\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib/aarch64-linux-gnu/libc-2.31.so|inlined\)$
Matching: ping 285058 [025]
1219802.466939: probe_libc:inet_pton: (
ffffa14b7cf0)
Matching:
ffffa14b7cf0 __GI___inet_pton+0x0 (/usr/lib/aarch64-linux-gnu/libc-2.31.so)
Pattern: .*(\+0x[[:xdigit:]]+|\[unknown\])[[:space:]]\(.*/bin/ping.*\)$
Matching: ping 285058 [025]
1219802.466939: probe_libc:inet_pton: (
ffffa14b7cf0)
Matching:
ffffa14b7cf0 __GI___inet_pton+0x0 (/usr/lib/aarch64-linux-gnu/libc-2.31.so)
Matching:
ffffa1488040 getaddrinfo+0xe8 (/usr/lib/aarch64-linux-gnu/libc-2.31.so)
Matching:
aaaab8672da4 [unknown] (/usr/bin/ping)
---- end ----
82: probe libc's inet_pton & backtrace it with ping : Ok
Closes: https://lore.kernel.org/linux-perf-users/
1728978807-81116-1-git-send-email-renyu.zj@linux.alibaba.com/
Closes: https://lore.kernel.org/linux-perf-users/Z0X3AYUWkAgfPpWj@x1/T/#m57327e135b156047e37d214a0d453af6ae1e02be
Reported-by: Guilherme Amadio <amadio@gentoo.org>
Reported-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241202111958.553403-1-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Wed, 11 Dec 2024 06:08:31 +0000 (22:08 -0800)]
perf inject: Fix use without initialization of local variables
Local variables were missing initialization and command line
processing didn't provide default values.
Fixes:
64eed019f3fce124 ("perf inject: Lazy build-id mmap2 event insertion")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241211060831.806539-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Wed, 11 Dec 2024 08:55:24 +0000 (08:55 +0000)]
perf probe: Rename err label
Rename err to out to avoid confusion because buf is still supposed to be
freed in non error cases.
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241211085525.519458-3-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 12 Dec 2024 17:33:54 +0000 (09:33 -0800)]
perf test stat: Avoid hybrid assumption when virtualized
The cycles event will fallback to task-clock in the hybrid test when
running virtualized. Change the test to not fail for this.
Fixes:
65d11821910bd910 ("perf test: Add a test for default perf stat command")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241212173354.9860-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Athira Rajeev [Mon, 23 Dec 2024 13:58:13 +0000 (19:28 +0530)]
perf record: Fix segfault with --off-cpu when debuginfo is not enabled
When kernel is built without debuginfo, running 'perf record' with
--off-cpu results in segfault as below:
./perf record --off-cpu -e dummy sleep 1
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find '.BTF' ELF section in /lib/modules/6.13.0-rc3+/build/vmlinux
libbpf: failed to find valid kernel BTF
Segmentation fault (core dumped)
The backtrace pointed to:
#0 0x00000000100fb17c in btf.type_cnt ()
#1 0x00000000100fc1a8 in btf_find_by_name_kind ()
#2 0x00000000100fc38c in btf.find_by_name_kind ()
#3 0x00000000102ee3ac in off_cpu_prepare ()
#4 0x000000001002f78c in cmd_record ()
#5 0x00000000100aee78 in run_builtin ()
#6 0x00000000100af3e4 in handle_internal_command ()
#7 0x000000001001004c in main ()
Code sequence is:
static void check_sched_switch_args(void)
{
struct btf *btf = btf__load_vmlinux_btf();
const struct btf_type *t1, *t2, *t3;
u32 type_id;
type_id = btf__find_by_name_kind(btf, "btf_trace_sched_switch",
BTF_KIND_TYPEDEF);
btf__load_vmlinux_btf() fails when CONFIG_DEBUG_INFO_BTF is not enabled.
Here bpf__find_by_name_kind() calls btf__type_cnt() with NULL btf value
and results in segfault.
To fix this, add a check to see if btf is not NULL before invoking
bpf__find_by_name_kind().
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://lore.kernel.org/r/20241223135813.8175-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Athira Rajeev [Fri, 10 Jan 2025 09:43:24 +0000 (15:13 +0530)]
perf tests base_probe: Fix check for the count of existing probes in test_adding_kernel
perftool-testsuite_probe fails in test_adding_kernel as below:
Regexp not found: "probe:inode_permission_11"
-- [ FAIL ] -- perf_probe :: test_adding_kernel :: force-adding probes ::
second probe adding (with force) (output regexp parsing)
event syntax error: 'probe:inode_permission_11'
\___ unknown tracepoint
Error: File /sys/kernel/tracing//events/probe/inode_permission_11
not found.
Hint: Perhaps this kernel misses some CONFIG_ setting to
enable this feature?.
The test does the following:
1) Adds a probe point first using:
$CMD_PERF probe --add $TEST_PROBE
2) Then tries to add same probe again without —force and expects it to
fail. Next tries to add same probe again with —force. In this case,
perf probe succeeds and adds the probe with a suffix number. Example:
./perf probe --add inode_permission
Added new event:
probe:inode_permission (on inode_permission)
./perf probe --add inode_permission --force
Added new event:
probe:inode_permission_1 (on inode_permission)
./perf probe --add inode_permission --force
Added new event:
probe:inode_permission_2 (on inode_permission)
Each time, suffix is added to existing probe name.
To get the suffix number, test cases uses:
NO_OF_PROBES=`$CMD_PERF probe -l | wc -l`
This will work if there is no other probe existing in the system. If
there are any other probes other than kernel probes or inode_permission,
( example: any probe), "perf probe -l" will include count for other
probes too.
Example, in the system where this failed, already some probes were
default added. So count became 10
./perf probe -l | wc -l
10
So to be specific for "inode_permission", restrict the probe count check
to that probe point alone using:
NO_OF_PROBES=`$CMD_PERF probe -l $TEST_PROBE| wc -l`
Similarly while removing the probe using "probe --del *", (removing all
probes), check uses:
../common/check_all_lines_matched.pl "Removed event: probe:$TEST_PROBE"
But if there are other probes in the system, the log will contain
reference to other existing probe too. Hence change usage of
check_all_lines_matched.pl to check_all_patterns_found.pl This will make
sure expecting string comes in the result
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250110094324.94604-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Michel Lind [Tue, 26 Nov 2024 23:41:59 +0000 (17:41 -0600)]
perf MANIFEST: Add license files
The standalone tarballs should include the license files - both the
COPYING declaration as well as the text of GPLv2.
Signed-off-by: Michel Lind <michel@michel-slm.name>
Link: https://lore.kernel.org/r/Z0Zcx0WRqtlUYpgw@hyperscale.parallels
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Fri, 13 Dec 2024 23:13:12 +0000 (17:13 -0600)]
perf test brstack: Speed up running test by using tr -s instead of xargs
The brstack test runs quite slowly in software models. Part of the reason
is "xargs -n1" is quite inefficient in replacing spaces with newlines.
While that's not noticeable on normal machines, it is on software models.
Use "tr -s ' ' '\n'" instead which can do the same transformation, but is
much faster. For comparison on an M1 Macbook Pro:
$ time seq -s ' ' 10000 | xargs -n1 > /dev/null
real 0m2.729s
user 0m2.009s
sys 0m0.914s
$ time seq -s ' ' 10000 | tr -s ' ' '\n' | grep '.' > /dev/null
real 0m0.002s
user 0m0.001s
sys 0m0.001s
The "grep '.'" is also needed to remove any remaining blank lines.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241213231312.2640687-2-robh@kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
[robh: Drop changing loop iterations on arm64. Squash blank line fix and redo commit msg]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Fri, 10 Jan 2025 19:22:51 +0000 (11:22 -0800)]
perf tools mips: Fix mips syscall generation
The mips syscall generation was still based on the old method. Delete
the Makefile since it is no longer needed with the new method of
generation.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes:
619ffe669496a288 ("perf tools mips: Use generic syscall scripts")
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250110-perf_fix_mips-v1-1-4e661c3b710a@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Wed, 8 Jan 2025 14:29:00 +0000 (14:29 +0000)]
perf tests arm_spe: Add test for discard mode
Add a test that checks that there were no AUX or AUXTRACE events
recorded when discard mode is used.
Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Graham Woodward <graham.woodward@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108142904.401139-6-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Wed, 8 Jan 2025 14:28:59 +0000 (14:28 +0000)]
perf tools arm-spe: Don't allocate buffer or tracking event in discard mode
The buffer will never be written to so don't bother allocating it. The
tracking event is also not required.
Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Graham Woodward <graham.woodward@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108142904.401139-5-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Wed, 8 Jan 2025 14:28:58 +0000 (14:28 +0000)]
perf tools arm-spe: Pull out functions for aux buffer and tracking setup
These won't be used in the next commit in discard mode, so put them in
their own functions. No functional changes intended.
Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Graham Woodward <graham.woodward@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108142904.401139-4-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiachen Zhang [Thu, 9 Jan 2025 15:22:19 +0000 (23:22 +0800)]
perf report: Fix misleading help message about --demangle
The wrong help message may mislead users. This commit fixes it.
Fixes:
328ccdace8855289 ("perf report: Add --no-demangle option")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Jiachen Zhang <me@jcix.top>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250109152220.1869581-1-me@jcix.top
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 8 Jan 2025 21:00:15 +0000 (13:00 -0800)]
perf ftrace: Fix display for range of the first bucket
When min_latency is not given, it prints 0 - 0. It should be 0 - 1.
Before:
$ sudo ./perf ftrace latency -a -T do_futex sleep 1
# DURATION | COUNT | GRAPH |
0 - 0 us | 321 | ########### |
...
After:
$ sudo ./perf ftrace latency -a -T do_futex sleep 1
# DURATION | COUNT | GRAPH |
0 - 1 us | 699 | ############ |
...
Fixes:
08b875b6bf608589 ("perf ftrace latency: Introduce --min-latency to narrow down into a latency range")
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gabriele Monaco <gmonaco@redhat.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250108210015.1188531-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 8 Jan 2025 21:00:14 +0000 (13:00 -0800)]
perf ftrace: Check min/max latency only with bucket range
It's an optional feature and remains 0 when bucket range is not given.
And it makes the histogram goes to the last entry always because any
latency (num) is greater than or equal to 0.
Before:
$ sudo ./perf ftrace latency -a -T do_futex sleep 1
# DURATION | COUNT | GRAPH |
0 - 0 us | 0 | |
1 - 2 us | 0 | |
2 - 4 us | 0 | |
4 - 8 us | 0 | |
8 - 16 us | 0 | |
16 - 32 us | 0 | |
32 - 64 us | 0 | |
64 - 128 us | 0 | |
128 - 256 us | 0 | |
256 - 512 us | 0 | |
512 - 1024 us | 0 | |
1 - 2 ms | 0 | |
2 - 4 ms | 0 | |
4 - 8 ms | 0 | |
8 - 16 ms | 0 | |
16 - 32 ms | 0 | |
32 - 64 ms | 0 | |
64 - 128 ms | 0 | |
128 - 256 ms | 0 | |
256 - 512 ms | 0 | |
512 - 1024 ms | 0 | |
1 - ... s | 1353 | ############################################## |
After:
$ sudo ./perf ftrace latency -a -T do_futex sleep 1
# DURATION | COUNT | GRAPH |
0 - 0 us | 321 | ########### |
1 - 2 us | 132 | #### |
2 - 4 us | 202 | ####### |
4 - 8 us | 188 | ###### |
8 - 16 us | 16 | |
16 - 32 us | 12 | |
32 - 64 us | 30 | # |
64 - 128 us | 98 | ### |
128 - 256 us | 53 | # |
256 - 512 us | 57 | ## |
512 - 1024 us | 9 | |
1 - 2 ms | 9 | |
2 - 4 ms | 1 | |
4 - 8 ms | 98 | ### |
8 - 16 ms | 5 | |
16 - 32 ms | 7 | |
32 - 64 ms | 32 | # |
64 - 128 ms | 10 | |
128 - 256 ms | 10 | |
256 - 512 ms | 2 | |
512 - 1024 ms | 0 | |
1 - ... s | 0 | |
Fixes:
690a052a6d85c530 ("perf ftrace latency: Add --max-latency option")
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gabriele Monaco <gmonaco@redhat.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250108210015.1188531-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Tue, 26 Nov 2024 20:43:18 +0000 (17:43 -0300)]
perf MANIFEST: Add arch/*/include/uapi/asm/bpf_perf_event.h to the perf tarball
Needed to build tools/lib/bpf/ on various arches other than x86_64,
notably arm64 when using the perf tarballs generated by:
$ make help | grep perf-
perf-tar-src-pkg - Build the perf source tarball with no compression
perf-targz-src-pkg - Build the perf source tarball with gzip compression
perf-tarbz2-src-pkg - Build the perf source tarball with bz2 compression
perf-tarxz-src-pkg - Build the perf source tarball with xz compression
perf-tarzst-src-pkg - Build the perf source tarball with zst compression
$
Building with BPF support was opt-in in perf for a long time, and
testing it via the tarball main kernel Makefile targets in an
architecture other than x86_64 was an odd case.
I had noticed this at some point earlier this year while cross building
perf to some arches, including arm64, but it fell thru the cracks, see
the Link tag below.
Fix it now by adding those arch/*/include/uapi/asm/bpf_perf_event.h
files to the MANIFEST file used in building the perf source tarball.
Tested with:
perfbuilder@number:~$ time dm debian:experimental-x-arm64
1 21.60 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 14.1.0-5) 14.1.0 flex 2.6.4
BUILD_TARBALL_HEAD=
d31a974f6edc576f84c35be9526fec549a3b3520
$
$ git log --oneline -1
d31a974f6edc576f84c35be9526fec549a3b3520
d31a974f6edc576f (HEAD -> perf-tools-next) perf MANIFEST: Add arch/*/include/uapi/asm/bpf_perf_event.h to the perf tarball
$
That was previously failing:
perfbuilder@number:~$ grep debian:experimental-x-arm64 dm.log.old/summary
19 4.80 debian:experimental-x-arm64 : FAIL gcc version 14.1.0 (Debian 14.1.0-5)
$
perfbuilder@number:~$ grep -B6 'Error 1' dm.log.old/debian:experimental-x-arm64
In file included from /git/perf-6.12.0-rc6/tools/include/uapi/linux/bpf_perf_event.h:11,
from libbpf.c:36:
/git/perf-6.12.0-rc6/tools/include/uapi/asm/bpf_perf_event.h:2:10: fatal error: ../../arch/arm64/include/uapi/asm/bpf_perf_event.h: No such file or directory
2 | #include "../../arch/arm64/include/uapi/asm/bpf_perf_event.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[4]: *** [/git/perf-6.12.0-rc6/tools/build/Makefile.build:105: /tmp/build/perf/libbpf/staticobjs/libbpf.o] Error 1
perfbuilder@number:~$
Closes: https://lore.kernel.org/all/Z0UNRCRYKunbDYxP@hyperscale.parallels
Fixes:
9eea8fafe33eb708 ("libbpf: fix __arg_ctx type enforcement for perf_event programs")
Reported-by: Michel Lind <michel@michel-slm.name>
Tested-by: Michel Lind <michel@michel-slm.name>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link:
317c11923cf676437456e44a7f408d4ce589a9c0.camel@michel-slm.name
Link: https://lore.kernel.org/bpf/ZfyEgoG3JFiOs2Fs@x1/
Link: https://lore.kernel.org/r/Z0Yy5u42Q1hWoEzz@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Yoshihiro Furudera [Tue, 17 Dec 2024 06:57:51 +0000 (06:57 +0000)]
perf vendor events arm64: Add FUJITSU-MONAKA PMU event
Add PMU events for FUJITSU-MONAKA.
And, also updated common-and-microarch.json and recommended.json.
FUJITSU-MONAKA Specification URL:
https://github.com/fujitsu/FUJITSU-MONAKA
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Akio Kakuno <fj3333bs@aa.jp.fujitsu.com>
Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Xu Yang <xu.yang_2@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20241217065751.1448755-1-fj5100bi@fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 18 Dec 2024 22:04:53 +0000 (14:04 -0800)]
perf tools: Fixup end address of modules
In machine__create_module(), it reads /proc/modules to get a list of
modules in the system. The file shows the start address (of text) and
the size of the module so it uses the info to reconstruct system memory
maps for symbol resolution.
But module memory consists of multiple segments and they can be
scaterred. Currently perf tools assume they are contiguous and see some
overlaps. This can confuse the tool when it finds a map containing a
given address.
As we mostly care about the function symbols in the text segment, it can
fixup the size or end address of modules when there's an overlap. We
can use maps__fixup_end() which updates the end address using the start
address of the next map.
Ideally it should be able to track other segments (like data/rodata),
but that would require some changes in /proc/modules IMHO.
Reported-by: Blake Jones <blakejones@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20241218220453.203069-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Thu, 2 Jan 2025 20:32:51 +0000 (12:32 -0800)]
perf symbol: Prefer non-label symbols with same address
When there are more than one symbols at the same address, it needs to
choose which one is better. In choose_best_symbol() it didn't check the
type of symbols. It's possible to have labels in other symbols and in
that case, it would be better to pick the actual symbol over the labels.
To minimize the possible impact on other symbols, I only check NOTYPE
symbols specifically.
$ readelf -sW vmlinux | grep -e __do_softirq -e __softirqentry_text_start
105089:
ffffffff82000000 814 FUNC GLOBAL DEFAULT 1 __do_softirq
111954:
ffffffff82000000 0 NOTYPE GLOBAL DEFAULT 1 __softirqentry_text_start
The commit
77b004f4c5c3c90b tried to do the same by not giving the size
to the label symbols but it seems there's some label-only symbols in asm
code. Let's restore the original code and choose the right symbol using
type of the symbols.
Fixes:
77b004f4c5c3c90b ("perf symbol: Do not fixup end address of labels")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: http://lore.kernel.org/lkml/Z3b-DqBMnNb4ucEm@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Tue, 19 Nov 2024 03:17:54 +0000 (19:17 -0800)]
perf symbol-elf: Avoid a weak cxx_demangle_sym function
cxx_demangle_sym is weak in case demangle-cxx.c replaces the
definition in symbol-elf.c. When demangle-cxx.c is built
HAVE_CXA_DEMANGLE_SUPPORT is defined, as such the define can be used
to avoid a weak symbol.
As weak symbols are outside of the C standard their use can lead to
strange behaviors, in particular with LTO, as well as causing issues to
be hidden at link time.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241119031754.1021858-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Thu, 2 Jan 2025 20:12:47 +0000 (12:12 -0800)]
perf trace: Fix unaligned access for augmented args
Some version of compilers reported unaligned accesses in perf trace when
undefined-behavior sanitizer is on. I found that it uses raw data in
the sample directly and assuming it's properly aligned.
Unlike other sample fields, the raw data is not 8-byte aligned because
there's a size field (u32) before the actual data. So I added a static
buffer in syscall__augmented_args() and return it instead. This is not
ideal but should work well as perf trace is single-threaded.
A better approach would be aligning the raw data by adding a 4-byte data
before the augmented args but I'm afraid it'd break the backward
compatibility.
Committer testing:
To build with the undefined behaviour sanitizer:
$ make CC=clang EXTRA_CFLAGS=-fsanitize=undefined -C tools/perf
Checking if the resulting binary is instrumented:
root@number:~# nm ~/bin/perf | grep ubsan | wc -l
113
root@number:~# nm ~/bin/perf | grep ubsan | tail -5
000000000043d5b0 t _ZN7__ubsanL19UBsanOnDeadlySignalEiPvS0_
000000000043ce50 T _ZNK7__ubsan5Value12getSIntValueEv
000000000043cf40 T _ZNK7__ubsan5Value12getUIntValueEv
000000000043d140 T _ZNK7__ubsan5Value13getFloatValueEv
000000000043cfd0 T _ZNK7__ubsan5Value19getPositiveIntValueEv
root@number:~#
Now running something that will access timespec, as reported in the
Closes URL:
root@number:~# perf trace --max-events=1 -e *nano* sleep 1.1
trace/beauty/timespec.c:10:64: runtime error: member access within misaligned address 0x7fc583cfb2a4 for type 'struct augmented_arg', which requires 8 byte alignment
0x7fc583cfb2a4: note: pointer points here
99 99 11 00 10 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 01 e1 f5 05 00 00 00 00 00 00 00 00
^
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior trace/beauty/timespec.c:10:64
<SNIP>
As Namhyung said we need to make the raw_data to be 64-bit aligned,
probably we need to add a PERF_SAMPLE_ALIGNED_RAW with a 64-bit raw_size
instead of the current u32 done at kernel/events/core.c,
perf_output_sample(), that perf_output_put(handle, raw->size) where
raw->size is an u32 and then the raw_data is always 64-bit unaligned...
After the patch:
root@number:~# perf trace -e *nano* sleep 1.1
0.000 (1100.064 ms): sleep/
1984224 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec:
100000001 }, rmtp: 0x7fff5b3fe970) = 0
root@number:~#
Closes: https://lore.kernel.org/r/Z2STgyD1p456Qqhg@google.com
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250102201248.790841-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Tue, 7 Jan 2025 16:59:30 +0000 (16:59 +0000)]
perf test: Mark remaining probe tests as exclusive
Probes are global and other probe tests are already exclusive. These
two tests can throw warnings when run at the same time so mark them as
exclusive too:
$ perf test -vvv 81 79
79: perftool-testsuite_probe:
--- start ---
test child forked, pid 46419
../common/init.sh: line 137: /sys/kernel/debug/tracing/uprobe_events: Device or resource busy
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Link: https://lore.kernel.org/r/20250107165933.292225-1-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:31 +0000 (18:36 -0800)]
perf tools: Remove dependency on libaudit
All architectures now support HAVE_SYSCALL_TABLE_SUPPORT, so the flag is
no longer needed. With the removal of the flag, the related
GENERIC_SYSCALL_TABLE can also be removed.
libaudit was only used as a fallback for when HAVE_SYSCALL_TABLE_SUPPORT
was not defined, so libaudit is also no longer needed for any
architecture.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-16-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:30 +0000 (18:36 -0800)]
perf tools s390: Use generic syscall table scripts
Use the generic scripts to generate headers from the syscall table
instead of the custom ones for s390.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-15-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:29 +0000 (18:36 -0800)]
perf tools powerpc: Use generic syscall table scripts
Use the generic scripts to generate headers from the syscall table
instead of the custom ones for powerpc.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-14-7543b5293098@rivosinc.com
Link: https://lore.kernel.org/lkml/20250110100505.78d81450@canb.auug.org.au
[ Stephen Rothwell noticed on linux-next that the powerpc build for perf was broken and ...]
Link: https://lore.kernel.org/lkml/20250109-perf_powerpc_spu-v1-1-c097fc43737e@rivosinc.com
[ ... Charlie fixed it up and asked for it to be squashed to avoid breaking bisection. ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:28 +0000 (18:36 -0800)]
perf tools mips: Use generic syscall scripts
Use the generic scripts to generate headers from the syscall table for
mips.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-13-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:27 +0000 (18:36 -0800)]
perf tools loongarch: Use syscall table
loongarch uses a syscall table, use that in perf instead of using unistd.h.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-12-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:26 +0000 (18:36 -0800)]
perf tools arm64: Use syscall table
arm64 uses a syscall table, use that in perf instead of using unistd.h.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-11-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:25 +0000 (18:36 -0800)]
perf tools parisc: Support syscall header
parisc uses a syscall table, use that in perf instead of requiring
libaudit.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-10-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:24 +0000 (18:36 -0800)]
perf tools alpha: Support syscall header
alpha uses a syscall table, use that in perf instead of requiring
libaudit.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-9-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:23 +0000 (18:36 -0800)]
perf tools x86: Use generic syscall scripts
Use the generic scripts to generate headers from the syscall table for
both 32- and 64-bit x86.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-8-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:22 +0000 (18:36 -0800)]
perf tools xtensa: Support syscall header
xtensa uses a syscall table, use that in perf instead of requiring
libaudit.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-7-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:21 +0000 (18:36 -0800)]
perf tools sparc: Support syscall headers
sparc uses a syscall table, use that in perf instead of requiring
libaudit.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-6-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:20 +0000 (18:36 -0800)]
perf tools sh: Support syscall headers
sh uses a syscall table, use that in perf instead of requiring libaudit.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-5-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:19 +0000 (18:36 -0800)]
perf tools arm: Support syscall headers
arm uses a syscall table, use that in perf instead of requiring
libaudit.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-4-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:18 +0000 (18:36 -0800)]
perf tools csky: Support generic syscall headers
csky uses the generic syscall table, use that in perf instead of
requiring libaudit.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Guo Ren <guoren@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-3-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:17 +0000 (18:36 -0800)]
perf tools arc: Support generic syscall headers
Arc uses the generic syscall table, use that in perf instead of
requiring libaudit.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-2-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 9 Jan 2025 02:36:16 +0000 (18:36 -0800)]
perf tools: Create generic syscall table support
Currently each architecture in perf independently generates syscall
headers.
Adapt the work that has gone into unifying syscall header
implementations in the kernel to work with perf tools.
Introduce this framework with riscv at first. riscv previously relied on
libaudit, but with this change, perf tools for riscv no longer needs
this external dependency.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-1-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Wed, 8 Jan 2025 05:15:11 +0000 (21:15 -0800)]
perf test cpumap: Avoid use-after-free following merge
Previously cpu maps in the test weren't modified by calls to the cpu map
API, however, perf_cpu_map__merge was modified so the left hand argument
was updated.
In the test this meant the maps copy of the "two" map was put/deleted in
the merge meaning when accessed via maps, the pointer was stale and to
the put/deleted memory.
To fix this add an extra layer of indirection to the maps array, so the
updated value of two is accessed.
Fixes:
a9d2217556f7745e ("libperf cpumap: Refactor perf_cpu_map__merge()")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250108051511.1720369-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Dmitry Vyukov [Wed, 8 Jan 2025 07:02:28 +0000 (08:02 +0100)]
perf llvm-add2line: Remove unused symbol_conf.h include
Remove unused symbol_conf.h include.
First, it's just unused. Second, it's problematic since this is a C++
file, and most perf headers don't compile as C++. So if any other
includes are added to symbol_conf.h, it may break the build.
Signed-off-by: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250108070248.237943-1-dvyukov@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Mon, 6 Jan 2025 16:42:58 +0000 (16:42 +0000)]
perf test trace_btf_general: Fix shellcheck warning
Shellcheck versions < v0.7.2 can't follow this path so add the helper to
fix the following warning:
tests/shell/trace_btf_general.sh line 8:
. "$(dirname $0)"/lib/probe.sh
^--------------------------^ SC1090: Can't follow non-constant source.
Use a directive to specify location.
Fixes:
0255338d69754a02 ("perf trace: Add tests for BTF general augmentation")
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250106164300.734202-1-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 6 Dec 2024 20:48:28 +0000 (17:48 -0300)]
perf namespaces: Fixup the nsinfo__in_pidns() return type, its bool
When adding support for refconunt checking a cut'n'paste made this
function, that is just an accessor to a bool member of 'struct nsinfo',
return a pid_t, when that member is a boolean, fix it.
Fixes:
bcaf0a97858de7ab ("perf namespaces: Add functions to access nsinfo")
Reported-by: Francesco Nigro <fnigro@redhat.com>
Reported-by: Ilan Green <igreen@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20241206204828.507527-6-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 6 Dec 2024 20:48:27 +0000 (17:48 -0300)]
perf jitdump: Fixup in_pidns member when java agent and 'perf record' are not in the same pidns
When running 'perf record' outside a container and the java agent inside
a container the jit_repipe_code_load() and friends will emit
PERF_RECORD_MMAP2 entries for the jitdump records and will check if we
need to fixup the pid/tid:
nspid = jr->load.pid;
pid = jr_entry_pid(jd, jr);
tid = jr_entry_tid(jd, jr);
The jr_entry_pid() function looks if we're in the same pidns:
static pid_t jr_entry_pid(struct jit_buf_desc *jd, union jr_entry *jr)
{
if (jd->nsi && nsinfo__in_pidns(jd->nsi))
return nsinfo__tgid(jd->nsi);
return jr->load.pid;
}
But since the thread, populated from perf.data records, try to figure
out if in the same pidns by actually trying, on the system where 'perf
inject' is running to open a procfs file (a bug that remains to be
fixed), assuming that if it is not possible that is because that thread
terminated and thus we can't get its namespace info and tolerates
nsinfo__init() failing, noting only that that namespace can't be
entered, so don't even try.
But we can kinda get at least that info (thread->nsinfo->in_pidns) from
the data in the perf.data file, namely the pid and tid in the
PERF_RECORD_MMAP2 for the jit-<PID>.dump file generated from the java
agent, if the PERF_RECORD_MMAP2->pid is the same as what is in the
jitdump file, then we're in the same namespace, otherwise we need to use
the PERF_RECORD_MMAP2->pid.
This all has to be revamped for this jitdump + running perf from
outside, as the meaning of in_pidns is being abused, the initialization
of nsinfo->pid with the value coming from the PERF_RECORD_MMAP2 data is
wrong as it is the pid _outside_ the container since perf was running
there.
The hack in this patch at least produces the expected result in this
scenario by following the assumptions in the current codebase for
finding maps and for generating the PERF_RECORD_MMAP2 for the ELF files
synthesized from the jitdump records in jit_repipe_code_load(), etc.s
Reported-by: Francesco Nigro <fnigro@redhat.com>
Reported-by: Ilan Green <igreen@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20241206204828.507527-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 6 Dec 2024 20:48:26 +0000 (17:48 -0300)]
perf namespaces: Introduce nsinfo__set_in_pidns()
When we're processing a perf.data file we will, for every thread in that
file do a machine__findnew_thread(machine, pid, tid) that when that pid
is seen for the first time will create a 'struct thread' representing
it.
That in turn will call nsinfo__new() -> nsinfo__init() and there it will
assume we're running live, which is wrong and will need to be addressed
in a followup patch.
The nsinfo__new() assumes that if we can't access that thread it has
already finished and will ignore the -1 return from nsinfo__init(), just
taking notes to avoid trying to enter in that namespace, since it isn't
there anymore, a race.
When doing this from 'perf inject', tho, we can fill in parts of that
nsinfo from what we get from the PERF_RECORD_MMAP2 (pid, tid) and in the
jitdump file name, that has the form of jit-<PID>.dump.
So if the pid in the jitdump file name is not the one in the
PERF_RECORD_MMAP2, we can assume that its the pid of the process
_inside_ the namespace, and that perf was runing outside that namespace.
This will be done in the following patch.
Reported-by: Francesco Nigro <fnigro@redhat.com>
Reported-by: Ilan Green <igreen@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20241206204828.507527-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 6 Dec 2024 20:48:25 +0000 (17:48 -0300)]
perf jitdump: Accept jitdump mmaps emitted from inside containers
When the java agent is running inside a container it will emit mmaps
with the format:
⬢ [acme@toolbox a]$ perf report -D | grep PERF_RECORD_MMAP | grep \.dump
0 0x15c400 [0x90]: PERF_RECORD_MMAP2
3308868/
3308868: [0x7fb8de6cb000(0x1000) @ 0 08:14
3222905945 0]: r-xp /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jit-1.dump
⬢ [acme@toolbox a]$
Since perf is running from outside the container it sees the pid
3308868
in PERF_RECORD_MMAP2, while the agent saw the pid of the profiled app
inside the container, 1.
The previous validation was:
if (pid && pid2 != nsinfo__nstgid(nsi))
pid2 at this point is '1' (/jit-1.dump), so it considers this as a
malformed jitdump mmap and refuses to process it.
The test ends up as:
if (
3308868 && 1 !=
3308868)
which is true and the jitdump is not processed.
Since 1 in the container namespace is really
3308868 in the namespace
that perf is running, consider this a valid mmap.
We need to make perf realize this and behave accordingly, for now
checking instead:
if (pid && pid2 && pid != nsinfo__nstgid(nsi))
Translating to:
if (
3308868 && 1 &&
3308868 !=
3308868)
Will make the jitdump mmap to be considered valid and processed.
The jitdump is described in:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/jitdump-specification.txt
Now we end up with the expected flurry of MMAPs, one per jitted function
transformed into a little ELF file that should then be processable by
the other perf features, like code annotation:
[acme@toolbox a]$ echo $JITDUMPDIR
/tmp/.debug/jit
[acme@toolbox a]$
First use 'perf inject':
⬢ [acme@toolbox a]$ time perf inject -i perf.data -o acme-perf-injected.data -j
Then look at the PERF_RECORD_MMAP events in the result file, that went
thru the JIT map file:
⬢ [acme@toolbox a]$ ls -la /tmp/*.map
-rw-r--r--. 1 acme acme
2989559 Nov 27 16:11 /tmp/perf-
3308868.map
[acme@toolbox a]$
It is a symbol table:
⬢ [acme@toolbox a]$ head /tmp/*.map
0x00007fb8bda5c1a0 0x00000000000000d0 java.lang.String java.lang.module.ModuleDescriptor.name()
0x00007fb8bda5c4a0 0x0000000000000178 int java.lang.StringLatin1.hashCode(byte[])
0x00007fb8bda5c9a0 0x00000000000000d0 java.lang.String org.springframework.boot.context.config.ConfigDataLocation.getValue()
0x00007fb8bda5cca0 0x00000000000000d0 java.lang.module.ModuleDescriptor java.lang.module.ModuleReference.descriptor()
0x00007fb8bda5cfa0 0x00000000000000d0 java.lang.Object java.util.KeyValueHolder.getKey()
0x00007fb8bda5d2a0 0x00000000000000d0 java.lang.Object java.util.KeyValueHolder.getValue()
0x00007fb8bda5d5a0 0x0000000000000218 boolean jdk.internal.misc.Unsafe.compareAndSetReference(java.lang.Object, long, java.lang.Object, java.lang.Object)
0x00007fb8bda5d9a0 0x00000000000001f0 boolean jdk.internal.misc.Unsafe.compareAndSetLong(java.lang.Object, long, long, long)
0x00007fb8bda5dda0 0x00000000000001f8 void java.lang.System.arraycopy(java.lang.Object, int, java.lang.Object, int, int)
0x00007fb8bda5e1a0 0x00000000000001e8 int java.lang.Object.hashCode()
⬢ [acme@toolbox a]$
As specified in:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/jit-interface.txt
This was collected from inside the container, so came as
/tmp/perf-1.map.
To make perf, running outside the container to use it we need to copy it
to /tmp/perf-
3308868.map.
This is another logic that has to be added to perf to work on this
scenario of running outside the container but processing things created
by the hava agent running inside the container.
With all this in place we get to:
⬢ [acme@toolbox a]$ perf report -D -i acme-perf-injected.data | \
grep PERF_RECORD_MMAP > /tmp/acme-perf-injected.data.mmaps ; \
wc -l /tmp/acme-perf-injected.data.mmaps
44182 /tmp/acme-perf-injected.data.mmaps
⬢ [acme@toolbox a]$ tail /tmp/acme-perf-injected.data.mmaps
1030266786574466 0x7bc9e0 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0ceb1c0(0x8d0) @ 0x80 00:2c 238715 1]: --xs /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43989.so
1030266795288774 0x7bca78 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0cecc00(0x7e8) @ 0x80 00:2c 238716 1]: --xs /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43990.so
1030266895967339 0x7bcb10 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0cee500(0x3328) @ 0x80 00:2c 238717 1]: --xs /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43991.so
1030266915748306 0x7bcba8 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0aae0a0(0x138) @ 0x80 00:2c 238718 1]: --xs /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43992.so
1030267185851220 0x7bcc40 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0cf61e0(0x3b50) @ 0x80 00:2c 238719 1]: --xs /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43993.so
1030267231364524 0x7bccd8 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0cfea80(0x14a0) @ 0x80 00:2c 238720 1]: --xs /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43994.so
1030267425498831 0x7bcd70 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c054b4a0(0x338) @ 0x80 00:2c 238721 1]: --xs /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43995.so
1030267506147888 0x7bce08 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0a995c0(0x1e8) @ 0x80 00:2c 238722 1]: --xs /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43996.so
1030268112586116 0x7bcea0 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0d02520(0x258) @ 0x80 00:2c 238723 1]: --xs /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43997.so
1030269435398150 0x7bcf38 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0d02dc0(0x278) @ 0x80 00:2c 238724 1]: --xs /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43998.so
⬢ [acme@toolbox a]$
And if we look at those tiny ELF files generated by the jitdump code
used by 'perf inject' we see:
⬢ [acme@toolbox a]$ file /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43989.so
/tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43989.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=
790591db95a77d644657dfe5058658b200000000, with debug_info, not stripped
⬢ [acme@toolbox a]$ file /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43990.so
/tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43990.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=
762f932acbee53a22638bf4c2b86780200000000, with debug_info, not stripped
⬢ [acme@toolbox a]$
⬢ [acme@toolbox a]$ ls -la /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43989.so /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43990.so
-rw-r--r--. 1 acme acme 9432 Nov 29 10:56 /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43989.so
-rw-r--r--. 1 acme acme 7504 Nov 29 10:56 /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43990.so
⬢ [acme@toolbox a]$
And:
⬢ [acme@toolbox a]$ objdump -dS /tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43990.so | head -20
/tmp/.debug/jit/java-jit-
20241126.XXTxEIOn/jitted-1-43990.so: file format elf64-x86-64
Disassembly of section .text:
0000000000000080 <Lredacted/REDACTED/REDACTED/logging/RedactedRedacted;Redacted(Lredacted/REDACTED/REDACTED/redactedRedacted/Redacted;)V>:
80: 44 8b 56 08 mov 0x8(%rsi),%r10d
84: 49 c1 e2 03 shl $0x3,%r10
88: 49 3b c2 cmp %r10,%rax
8b: 0f 85 6f 15 83 fc jne
fffffffffc831600 <Lredacted/REDACTED/REDACTED/redacted/RedactedRedactedRedacted;Redacted(Lredacted/Redacted/Redacted/redactedRedacted/Redacted;)V+0xfffffffffc831580>
91: 66 66 90 data16 xchg %ax,%ax
94: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
9b: 00
9c: 66 66 66 90 data16 data16 xchg %ax,%ax
a0: 89 84 24 00 c0 fe ff mov %eax,-0x14000(%rsp)
a7: 55 push %rbp
a8: 48 8b ec mov %rsp,%rbp
ab: 48 83 ec 40 sub $0x40,%rsp
af: 48 89 34 24 mov %rsi,(%rsp)
⬢ [acme@toolbox a]$
The thing now being investigated is why we can't annotate anything here,
maybe that JITDUMPDIR is getting in the way:
⬢ [acme@toolbox a]$ perf annotate --stdio2 -i acme-perf-injected.data 'java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int)'
Error:
Couldn't annotate java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int):
Internal error: Invalid -1 error code
⬢ [acme@toolbox a]$
In the tests I performed while merging this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=
6d518ac7be6223811ab947897273b1bbef846180
It works, but then there was no JITDUMPDIR involved:
/home/acme/.debug/jit/java-jit-
20241127.XXF1SRgN/jitted-
3912413-4191.so
⬢ [acme@toolbox perf-tools-next]$ perf report --call-graph none --no-child -i perf-injected.data | grep jitted- | head
1.36% java jitted-
3912413-54.so [.] Interpreter
0.30% C1 CompilerThre jitted-
3912413-1.so [.] flush_icache_stub
0.18% java jitted-
3912413-4184.so [.] org.apache.fop.fo.properties.PropertyMaker.get(int, org.apache.fop.fo.PropertyList, boolean, boolean)
0.18% java jitted-
3912413-4177.so [.] org.apache.fop.layoutmgr.inline.TextLayoutManager.getNextKnuthElements(org.apache.fop.layoutmgr.LayoutContext, int)
0.13% java jitted-
3912413-3845.so [.] java.text.DecimalFormat.subformatNumber(java.lang.StringBuffer, java.text.Format$FieldDelegate, boolean, boolean, int, int, int, int)
0.11% java jitted-
3912413-4191.so [.] org.apache.fop.fo.FObj.addChildNode(org.apache.fop.fo.FONode)
0.09% java jitted-
3912413-2418.so [.] org.apache.fop.fo.XMLWhiteSpaceHandler.handleWhiteSpace()
0.08% Reference Handl jitted-
3912413-54.so [.] Interpreter
0.08% java jitted-
3912413-3326.so [.] org.apache.xmlgraphics.fonts.Glyphs.stringToGlyph(java.lang.String)
0.08% java jitted-
3912413-3953.so [.] org.apache.fop.layoutmgr.BreakingAlgorithm.considerLegalBreak(org.apache.fop.layoutmgr.KnuthElement, int)
⬢ [acme@toolbox perf-tools-next]$
And then:
⬢ [acme@toolbox perf-tools-next]$ perf annotate --stdio2 -i perf-injected.data 'org.apache.fop.layoutmgr.inline.TextLayoutManager.getNextKnuthElements(org.apache.fop.layoutmgr.LayoutContext, int)' | head -20
Samples: 8 of event 'cpu_atom/cycles/Pu', 4000 Hz, Event count (approx.):
8112794, [percent: local period]
org.apache.fop.layoutmgr.inline.TextLayoutManager.getNextKnuthElements(org.apache.fop.layoutmgr.LayoutContext, int)() /home/acme/.debug/jit/java-jit-
20241127.XXF1SRgN/jitted-
3912413-4177.so
Percent 0x80 <org.apache.fop.layoutmgr.inline.TextLayoutManager.getNextKnuthElements(org.apache.fop.layoutmgr.LayoutContext, int)>:
nop
movl 0x8(%rsi),%r10d
cmpl 0x8(%rax),%r10d
→ jne 0
movl %eax,-0x14000(%rsp)
pushq %rbp
subq $0xb0,%rsp
nop
cmpl $0x3,0x20(%r15)
↓ jne 7037
2e: movl %ecx,0x28(%rsp)
movq %rdx,%rbp
movl 0x64(%rdx),%ebx
cmpb $0x0,0x38(%r15)
↓ jne 3a44
movq %rsi,0x30(%rsp)
48: movq 0x30(%rsp),%r10
⬢ [acme@toolbox perf-tools-next]$
No source code nor line numbers, that I saw in another build of perf for
RHEL9, for the same workload described in the cset above (a publicly
available java benchmark), so something to investigate on perf upstream
running on fedora, maybe some quirk with the jdk used when building perf
for RHEL 9 and for Fedora 40.
A related patch that should have make this all work is:
"perf inject jit: Add namespaces support"
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=
67dec926931448d688efb5fe34f7b5a22470fc0a
But we still need to polish this some more, maybe there are differences
in the agent used in NodeJS with --perf-prof and the jvmti one we're
using.
Hopefully describing all the steps while we investigate this case will
help us improve perf support for profiling JITed environments running in
containers while profiling from inside and outside it.
Reported-by: Francesco Nigro <fnigro@redhat.com>
Reported-by: Ilan Green <igreen@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20241206204828.507527-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Christophe Leroy [Wed, 8 Jan 2025 09:15:24 +0000 (10:15 +0100)]
perf machine: Don't ignore _etext when not a text symbol
Depending on how vmlinux.lds is written, _etext might be the very first
data symbol instead of the very last text symbol.
Don't require it to be a text symbol, accept any symbol type.
Comitter notes:
See the first Link for further discussion, but it all boils down to
this:
---
# grep -e _stext -e _etext -e _edata /proc/kallsyms
c0000000 T _stext
c08b8000 D _etext
So there is no _edata and _etext is not text
$ ppc-linux-objdump -x vmlinux | grep -e _stext -e _etext -e _edata
c0000000 g .head.text
00000000 _stext
c08b8000 g .rodata
00000000 _etext
c1378000 g .sbss
00000000 _edata
---
Fixes:
ed9adb2035b5be58 ("perf machine: Read also the end of the kernel")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/b3ee1994d95257cb7f2de037c5030ba7d1bed404.1736327613.git.christophe.leroy@csgroup.eu
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Christophe Leroy [Wed, 8 Jan 2025 09:54:20 +0000 (10:54 +0100)]
perf maps: Fix display of kernel symbols
Since commit
659ad3492b913c90 ("perf maps: Switch from rbtree to lazily
sorted array for addresses"), perf doesn't display anymore kernel
symbols on powerpc, allthough it still detects them as kernel addresses.
# Overhead Command Shared Object Symbol
# ........ .......... ............. ......................................
#
80.49% Coeur main [unknown] [k] 0xc005f0f8
3.91% Coeur main gau [.] engine_loop.constprop.0.isra.0
1.72% Coeur main [unknown] [k] 0xc005f11c
1.09% Coeur main [unknown] [k] 0xc01f82c8
0.44% Coeur main libc.so.6 [.] epoll_wait
0.38% Coeur main [unknown] [k] 0xc0011718
0.36% Coeur main [unknown] [k] 0xc01f45c0
This is because function maps__find_next_entry() now returns current
entry instead of next entry, leading to kernel map end address getting
mis-configured with its own start address instead of the start address
of the following map.
Fix it by really taking the next entry, also make sure that entry
follows current one by making sure entries are sorted.
Fixes:
659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/2ea4501209d5363bac71a6757fe91c0747558a42.1736329923.git.christophe.leroy@csgroup.eu
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 7 Jan 2025 22:43:52 +0000 (14:43 -0800)]
perf test: Update ftrace test to use --graph-opts
I found it failed on machines with limited memory because 16M byte
per-cpu buffer is too big. The reason it added the option is not to
miss tracing data. Thus we can limit the data size by reducing the
function call depth instead of increasing the buffer size to handle the
whole data.
As it used the same option in the test_ftrace_trace() and it was able
to find the sleep function, it should work with the profile subcommand.
Get rid of other grep commands which might be affected by the depth
change.
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20250107224352.1128669-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 7 Jan 2025 22:43:51 +0000 (14:43 -0800)]
perf ftrace profile: Add --graph-opts option
Like trace subcommand, it should be able to pass some options to control
the tracing behavior for the function graph tracer.
But some options are limited in order to maintain the internal behavior.
For example, it can limit the function call depth like below:
# perf ftrace profile --graph-opts depth=5 -- myprog
Committer testing:
root@number:~# perf ftrace profile --graph-opts thresh=1000 -- sleep 1
# Total (us) Avg (us) Max (us) Count Function
1001419.301 500709.650
1000032.000 2 x64_sys_call
1000032.000
1000032.000
1000032.000 1 __x64_sys_clock_nanosleep
1000032.000
1000032.000
1000032.000 1 common_nsleep
1000031.000
1000031.000
1000031.000 1 do_nanosleep
1000031.000
1000031.000
1000031.000 1 hrtimer_nanosleep
1000024.000
1000024.000
1000024.000 1 schedule
1387.208 1387.208 1387.208 1 __x64_sys_execve
1386.691 1386.691 1386.691 1 do_execveat_common.isra.0
1334.170 1334.170 1334.170 1 bprm_execve
1258.413 1258.413 1258.413 1 load_elf_binary
1123.068 1123.068 1123.068 1 begin_new_exec
1113.550 1113.550 1113.550 1 mmput
1109.237 1109.237 1109.237 1 exit_mmap
root@number:~# perf ftrace profile --graph-opts thresh=1200 -- sleep 1
# Total (us) Avg (us) Max (us) Count Function
1001448.204 500724.102
1000018.000 2 x64_sys_call
1000017.000
1000017.000
1000017.000 1 __x64_sys_clock_nanosleep
1000017.000
1000017.000
1000017.000 1 common_nsleep
1000017.000
1000017.000
1000017.000 1 hrtimer_nanosleep
1000016.000
1000016.000
1000016.000 1 do_nanosleep
1000012.000
1000012.000
1000012.000 1 schedule
1430.112 1430.112 1430.112 1 __x64_sys_execve
1429.581 1429.581 1429.581 1 do_execveat_common.isra.0
1376.289 1376.289 1376.289 1 bprm_execve
1301.743 1301.743 1301.743 1 load_elf_binary
root@number:~#
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250107224352.1128669-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Tue, 7 Jan 2025 22:43:50 +0000 (14:43 -0800)]
perf ftrace: Display latency statistics at the end
Sometimes users also want to see average latency as well as histogram.
Display latency statistics like avg, max, min at the end.
$ sudo ./perf ftrace latency -ab -T synchronize_rcu -- ...
# DURATION | COUNT | GRAPH |
0 - 1 us | 0 | |
1 - 2 us | 0 | |
2 - 4 us | 0 | |
4 - 8 us | 0 | |
8 - 16 us | 0 | |
16 - 32 us | 0 | |
32 - 64 us | 0 | |
64 - 128 us | 0 | |
128 - 256 us | 0 | |
256 - 512 us | 0 | |
512 - 1024 us | 0 | |
1 - 2 ms | 0 | |
2 - 4 ms | 0 | |
4 - 8 ms | 0 | |
8 - 16 ms | 1 | ##### |
16 - 32 ms | 7 | ######################################## |
32 - 64 ms | 0 | |
64 - 128 ms | 0 | |
128 - 256 ms | 0 | |
256 - 512 ms | 0 | |
512 - 1024 ms | 0 | |
1 - ... s | 0 | |
# statistics (in usec)
total time: 171832
avg time: 21479
max time: 30906
min time: 15869
count: 8
Committer testing:
root@number:~# perf ftrace latency -nab --bucket-range 100 --max-latency 512 -T switch_mm_irqs_off sleep 1
# DURATION | COUNT | GRAPH |
0 - 100 ns | 314 | ## |
100 - 200 ns | 1843 | ############# |
200 - 300 ns | 1390 | ########## |
300 - 400 ns | 844 | ###### |
400 - 500 ns | 480 | ### |
500 - 512 ns | 315 | ## |
512 - ... ns | 16 | |
# statistics (in nsec)
total time:
2448936
avg time: 387
max time: 3285
min time: 82
count: 6328
root@number:~#
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250107224352.1128669-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Wed, 6 Nov 2024 00:30:05 +0000 (00:30 +0000)]
perf evsel: Improve the evsel__open_strerror() for EBUSY
The existing EBUSY strerror message is:
The sys_perf_event_open() syscall returned with 16 (Device or resource busy) for event (intel_bts//).
"dmesg | grep -i perf" may provide additional information.
The dmesg won't be useful. What is more useful is knowing what
processes are potentially using the PMU, which some procfs scanning can
reveal. When parallel testing tests/shell/stat_all_pmu.sh this yields:
Testing intel_bts//
Error:
The PMU intel_bts counters are busy and in use by another process.
Possible processes:
2585882 perf list
2585902 perf list -j -o /tmp/__perf_test.list_output.json.KF9MY
2585904 perf list
2585911 perf record -e task-clock --filter period > 1 -o /dev/null --quiet true
2585912 perf list
2585915 perf list
2586042 /tmp/perf/perf record -asdg -e cpu-clock -o /tmp/perftool-testsuite_report.dIF/perf_report/perf.data -- sleep 2
2589078 perf record -g -e task-clock:u -o - perf test -w noploop
2589148 /tmp/perf/perf record --control=fifo:control,ack -e cpu-clock -m 1 sleep 10
2589379 perf --buildid-dir /tmp/perf.debug.Umx record --buildid-all -o /tmp/perf.data.YBm /tmp/perf.ex.MD5.ZQW
2589568 perf record -o /tmp/__perf_test.program.mtcZH/perf.data --branch-filter any,save_type,u -- perf test -w brstack
2589649 perf record --per-thread -o /tmp/__perf_test.perf.data.5d3dc perf test -w thloop
2589898 perf record -o /tmp/perf-test-script.BX2b27Dcnj/pp-perf.data --sample-cpu uname
Which gets a little closer to finding the issue.
Committer testing:
root@number:~#
root@number:~# grep -m1 "model name" /proc/cpuinfo
model name : Intel(R) Core(TM) i7-14700K
root@number:~#
Before:
root@number:~# perf stat -e intel_bts// &
[1] 197954
root@number:~# perf test "perf all PMU test"
124: perf all PMU test : FAILED!
root@number:~# perf test -v "perf all PMU test" |& tail
Testing i915/vecs0-busy/
Testing i915/vecs0-sema/
Testing i915/vecs0-wait/
Testing intel_bts//
Unexpected signal in main
Error:
The sys_perf_event_open() syscall returned with 16 (Device or resource busy) for event (intel_bts//).
"dmesg | grep -i perf" may provide additional information.
---- end(-1) ----
124: perf all PMU test : FAILED!
root@number:~#
After:
root@number:~# perf stat -e intel_bts// &
[1] 200195
root@number:~# perf test "perf all PMU test"
123: perf all PMU test : FAILED!
root@number:~# perf test -v "perf all PMU test" |& tail
Testing i915/vecs0-wait/
Testing intel_bts//
Unexpected signal in main
Error:
The PMU intel_bts counters are busy and in use by another process.
Possible processes:
200195 perf stat -e intel_bts//
2319766 /root/bin/perf top --stdio
---- end(-1) ----
123: perf all PMU test : FAILED!
root@number:~#
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Ze Gao <zegao2021@gmail.com>
Change-Id: Ie1ed8688286c44e8f44a35e98fed8be3e2a344df
Link: https://lore.kernel.org/r/20241106003007.2112584-1-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Randy Dunlap [Mon, 13 May 2024 19:24:39 +0000 (12:24 -0700)]
perf Documentation: Clarify sysfs event names characters
Specify that perf event names in sysfs must not contain mixed lower and
upper case characters and that they may contain numbers, ".", "_",
or "-" as well.
Fixes:
785623ee855e893d ("perf Document: Sysfs event names must be lower or upper case")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20240513192439.18473-1-rdunlap@infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 6 Jan 2025 19:48:17 +0000 (16:48 -0300)]
perf tests shell task_analyzer: Run this test exclusively
When running in the now default parallel mode this test has been
frequently failing, while when running exclusively, on a quiet system,
it passes.
Since its expectations were established when serial testing was the
norm, mark it as exclusive to get this kind of resunt:
root@x1:~# perf test 106
106: perf script task-analyzer tests : Ok
root@x1:~# set -o vi
root@x1:~# perf stat --null --repeat 10 perf test 106
106: perf script task-analyzer tests : Ok
106: perf script task-analyzer tests : Ok
106: perf script task-analyzer tests : Ok
106: perf script task-analyzer tests : Ok
106: perf script task-analyzer tests : Ok
106: perf script task-analyzer tests : Ok
106: perf script task-analyzer tests : Ok
106: perf script task-analyzer tests : Ok
106: perf script task-analyzer tests : Ok
106: perf script task-analyzer tests : Ok
Performance counter stats for 'perf test 106' (10 runs):
4.8872 +- 0.0179 seconds time elapsed ( +- 0.37% )
root@x1:~#
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Charlie Jenkins [Thu, 19 Dec 2024 21:44:25 +0000 (13:44 -0800)]
perf tests code-reading: Handle change in objdump output from binutils >= 2.41 on riscv
After binutils commit
e43d876 which was first included in binutils 2.41,
riscv no longer supports dumping in the middle of instructions.
Increase the objdump window by 2-bytes to ensure that any instruction
that sits on the boundary of the specified stop-address is not cut in
half.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20241219-perf_fix_riscv_obj_reading-v3-1-a7d644dcfa50@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 2 Jan 2025 19:50:39 +0000 (16:50 -0300)]
perf top: Don't complain about lack of vmlinux when not resolving some kernel samples
Recently we got a case where a kernel sample wasn't being resolved due
to a bug that was not setting the end address on kernel functions
implemented in assembly (see Link: tag), and then those were not being
found by machine__resolve() -> map__find_symbol().
So we ended up with:
# perf top --stdio
PerfTop: 0 irqs/s kernel: 0% exact: 0% lost: 0/0 drop: 0/0 [cycles/P]
-----------------------------------------------------------------------
Warning:
A vmlinux file was not found.
Kernel samples will not be resolved.
^Z
[1]+ Stopped perf top --stdio
#
But then resolving all other kernel symbols.
So just fixup the logic to only print that warning when there are no
symbols in the kernel map.
Fixes:
d88205db9caa0e9d ("perf dso: Add dso__has_symbols() method")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/lkml/Z3buKhcCsZi3_aGb@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Tue, 12 Nov 2024 16:00:45 +0000 (16:00 +0000)]
perf stat: Document and clarify outstate members
Not all of these are "state" so separate them into two sections. Rename
and document to make all clearer.
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241112160048.951213-6-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Tue, 12 Nov 2024 16:00:44 +0000 (16:00 +0000)]
perf stat: Document and simplify interval timestamps
Rename 'prefix' to 'timestamp' because that's all it does, except in
iostat mode where it's slightly overloaded, but still includes a
timestamp. This reveals a problem with iostat and JSON mode so document
this.
Make it more explicit that these are printed in interval mode by
changing 'if (prefix)' to 'if (interval)' which reveals an unnecessary
'else if (... && !interval)' which can be removed.
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241112160048.951213-5-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Tue, 12 Nov 2024 16:00:43 +0000 (16:00 +0000)]
perf stat: Remove empty new_line_metric function
Despite the name new_line_metric doesn't make a new line, it actually
does nothing. Change it to NULL to avoid confusion.
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241112160048.951213-4-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Tue, 12 Nov 2024 16:00:42 +0000 (16:00 +0000)]
perf stat: Also hide metric-units from JSON when event didn't run
We decided to hide NULL metric-units rather than showing it as "(null)"
when a dependent event for a metric doesn't exist. But on hybrid systems
if the process doesn't hit a PMU you get an empty string metric unit
instead. To make it consistent change all empty strings to NULL.
Note that metric-threshold is already hidden in this case without this
change.
Where a process only runs on cpu_core and never hits cpu_atom:
Before:
$ perf stat -j -- true
...
{"counter-value" : "<not counted>", "unit" : "", "event" : "cpu_atom/branch-misses/", "event-runtime" : 0, "pcnt-running" : 0.00, "metric-value" : "0.000000", "metric-unit" : ""}
{"counter-value" : "6326.000000", "unit" : "", "event" : "cpu_core/branch-misses/", "event-runtime" : 293786, "pcnt-running" : 100.00, "metric-value" : "3.553394", "metric-unit" : "of all branches", "metric-threshold" : "good"}
...
After:
...
{"counter-value" : "<not counted>", "unit" : "", "event" : "cpu_atom/branch-misses/", "event-runtime" : 0, "pcnt-running" : 0.00}
{"counter-value" : "5778.000000", "unit" : "", "event" : "cpu_core/branch-misses/", "event-runtime" : 282240, "pcnt-running" : 100.00, "metric-value" : "3.226797", "metric-unit" : "of all branches", "metric-threshold" : "good"}
...
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241112160048.951213-3-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Tue, 12 Nov 2024 16:00:41 +0000 (16:00 +0000)]
perf stat: Fix trailing comma when there is no metric unit
Now that printing metric-value and metric-unit is optional,
print_running_json() shouldn't add the comma in case it becomes
trailing.
Replace all manual JSON comma stuff with a json_out() function that uses
the existing os->first tracking and auto inserts a comma if it's needed.
Update the test to handle that two of the fields can be missing.
This fixes the following test failure on Cortex A57 where the branch
misses metric is missing a required event:
$ perf test -vvv "json output"
106: perf stat JSON output linter:
--- start ---
test child forked, pid 665682
Checking json output: no args Test failed for input:
{"counter-value" : "3112.000000", "unit" : "",
"event" : "armv8_pmuv3_1/branch-misses/",
"event-runtime" :
20699340, "pcnt-running" : 100.00, }
...
json.decoder.JSONDecodeError: Expecting property name enclosed in
double quotes: line 12 column 144 (char 2109)
---- end(-1) ----
106: perf stat JSON output linter : FAILED!
Fixes:
e1cc918b6cfd1206 ("perf stat: Drop metric-unit if unit is NULL")
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241112160048.951213-2-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Howard Chu [Sun, 15 Dec 2024 19:07:11 +0000 (11:07 -0800)]
perf docs: Add documentation for --force-btf option
The --force-btf option is intended for debugging purposes and is
currently undocumented. Add documentation for it.
Committer notes:
We need a follow up patch expanding on what can be done via BTF and what
isn't possible and thus needs further work to convert kernel C source
code into tables that can then be associated with syscall integer args
and struct members, as discussed in:
https://lore.kernel.org/all/
20241215190712.787847-3-howardchu95@gmail.com/T/#mcfbba653200775c59c730705229a49b34a153db7
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20241215190712.787847-3-howardchu95@gmail.com
Link: https://lore.kernel.org/all/20241215190712.787847-3-howardchu95@gmail.com/T/#mcfbba653200775c59c730705229a49b34a153db7
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Howard Chu [Sun, 15 Dec 2024 19:07:10 +0000 (11:07 -0800)]
perf trace: Add tests for BTF general augmentation
Currently, we only have 'perf trace' augmentation tests for enum
arguments. This patch adds tests for more general syscall arguments,
such as struct pointers, strings, and buffers.
These tests utilize the 'perf config' system to configure 'the perf trace'
output, as suggested by Arnaldo Carvalho de Melo <acme@kernel.org>.
Committer testing:
root@number:~# perf test "BTF general"
109: perf trace BTF general tests : Ok
root@number:~# perf test -v "BTF general"
109: perf trace BTF general tests : Ok
root@number:~# perf test -vv "BTF general"
109: perf trace BTF general tests:
--- start ---
test child forked, pid
1410451
Checking if vmlinux BTF exists
Testing perf trace's string augmentation
Testing perf trace's buffer augmentation
Testing perf trace's struct augmentation
---- end(0) ----
109: perf trace BTF general tests : Ok
root@number:~#
It still fails sometimes, for instance when tested with:
root@number:~# perf stat --null -r 10 perf test "BTF general"
109: perf trace BTF general tests : Ok
109: perf trace BTF general tests : Ok
109: perf trace BTF general tests : Ok
109: perf trace BTF general tests : Ok
109: perf trace BTF general tests : FAILED!
109: perf trace BTF general tests : Ok
109: perf trace BTF general tests : Ok
109: perf trace BTF general tests : FAILED!
109: perf trace BTF general tests : Ok
109: perf trace BTF general tests : Ok
Performance counter stats for 'perf test BTF general' (10 runs):
2.148 +- 0.293 seconds time elapsed ( +- 13.63% )
root@number:~#
But we can go on from here and fix things up with followup patches.
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20241215190712.787847-2-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Dr. David Alan Gilbert [Sun, 22 Dec 2024 21:58:31 +0000 (21:58 +0000)]
perf path: Remove unused is_executable_file()
is_executable_file() has been unused since 2022's commit
7391db6459388d47 ("perf test: Refactor shell tests allowing subdirs")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241222215831.283248-1-linux@treblig.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 14 Nov 2024 23:07:12 +0000 (15:07 -0800)]
perf values: Use evsel rather than evsel->idx
An evsel idx may not be stable due to sorting, evlist removal,
etc. Avoid use of the idx where the evsel itself can be used to avoid
these problems. This removed 1 values array and duplicated evsel name
strings.
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Chen Ni <nichen@iscas.ac.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241114230713.330701-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 14 Nov 2024 23:07:11 +0000 (15:07 -0800)]
perf stream: Use evsel rather than evsel->idx
An evsel idx may not be stable due to sorting, evlist removal,
etc. Avoid use of the idx where the evsel itself can be used to avoid
these problems.
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Chen Ni <nichen@iscas.ac.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241114230713.330701-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>