linux-2.6-block.git
9 months agoperf env: Introduce perf_env__arch_strerrno()
Arnaldo Carvalho de Melo [Fri, 1 Dec 2023 17:52:00 +0000 (14:52 -0300)]
perf env: Introduce perf_env__arch_strerrno()

That will cache the arch specific function translating error numbers to
strings.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20231201203046.486596-2-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf beauty: Don't use 'find ... -printf' as it isn't available in busybox
Arnaldo Carvalho de Melo [Thu, 30 Nov 2023 21:46:36 +0000 (18:46 -0300)]
perf beauty: Don't use 'find ... -printf' as it isn't available in busybox

Namhyung reported:

  I'm seeing a build error on my Alpine linux image which uses busybox +
  musl libc:

    In file included from trace/beauty/arch_errno_names.c:1,
                     from builtin-trace.c:899:
    /build/trace/beauty/generated/arch_errno_name_array.c: In function 'arch_syscalls__strerrno':
    /build/trace/beauty/generated/arch_errno_name_array.c:142:49: error: unused parameter 'arch' [-Werror=unused-parameter]
      142 | const char *arch_syscalls__strerrno(const char *arch, int err)

  It looks like busybox find command doesn't have -printf option

    find: unrecognized: -printf
    , Yesterday 9:16 PM
    ,
    BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary.

    Usage: find [-HL] [PATH]... [OPTIONS] [ACTIONS]

    Search for files and perform actions on them.
    First failed action stops processing of current file.
    Defaults: PATH is current directory, action is '-print'

So just remove it and pipe find's entry to a basename loop to produce
the same result.

Then use an alternative loop that relies on the shell to avoid needless
forks and execs.

The discussion about it generated the impetus to stop doing strcmps to
find the right table at each errno to string translation but instead do
this just once and then use a function pointer to the right arch
specific table.

Suggested-by: David Laight <David.Laight@ACULAB.COM>
Reported-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf docs: Fix man page formatting for 'perf lock'
Nick Forrington [Thu, 2 Nov 2023 16:11:16 +0000 (16:11 +0000)]
perf docs: Fix man page formatting for 'perf lock'

This makes "CONTENTION" a top level section (rather than a subsection of
"INFO").

Fixes: 79079f21f50a501f ("perf lock: Add -k and -F options to 'contention' subcommand")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Nick Forrington <nick.forrington@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231102161117.49533-1-nick.forrington@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agotools api fs: Avoid reading whole file for a 1 byte bool
Ian Rogers [Mon, 27 Nov 2023 22:08:17 +0000 (14:08 -0800)]
tools api fs: Avoid reading whole file for a 1 byte bool

sysfs__read_bool() used the first byte from a fully read file into a
string. It then looked at the first byte's value. Avoid doing this and
just read the first byte.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231127220902.1315692-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agotools api fs: Switch filename__read_str to use io.h
Ian Rogers [Mon, 27 Nov 2023 22:08:16 +0000 (14:08 -0800)]
tools api fs: Switch filename__read_str to use io.h

filename__read_str() has its own string reading code that allocates
memory before reading into it. The memory allocated is sized at BUFSIZ
that is 8kb. Most strings are short and so most of this 8kb is wasted.

Refactor io__getline(), as io__getdelim(), so that the newline character
can be configurable and ignored in the case of filename__read_str().

Code like build_caches_for_cpu() in perf's header.c will read many strings
and hold them in a data structure, in this case multiple strings per
cache level per CPU.

Using io.h's io__getline() avoids the wasted memory as strings are
temporarily read into a buffer on the stack before being copied to a
buffer that grows 128 bytes at a time and is never sized larger than the
string.

For a 16 hyperthread system the memory consumption of "perf record
true" is reduced by 180kb, primarily through saving memory when
reading the cache information.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231127220902.1315692-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agolibperf: Lazily allocate/size mmap event copy
Ian Rogers [Mon, 27 Nov 2023 22:08:14 +0000 (14:08 -0800)]
libperf: Lazily allocate/size mmap event copy

The event copy in the mmap is used to have storage to read an event. Not
all users of mmaps read the events, such as perf record. The amount of
buffer was also statically set to PERF_SAMPLE_MAX_SIZE rather than the
amount necessary from the header's event size.

Switch to a model where the event_copy is reallocated if too small to
the event's size. This adds the potential for the event to move, so if a
copy of the event pointer were stored it could be broken. All the
current users do:

  while(event = perf_mmap__read_event()) { ... }

and so they would be broken due to the event being overwritten if they
had stored the pointer. Manual inspection and address sanitizer testing
also shows the event pointer not being stored.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231127220902.1315692-3-irogers@google.com
[ Replace two lines with equivalent zfree(&map->event_copy) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agolibapi: Add missing linux/types.h header to get the __u64 type on io.h
Arnaldo Carvalho de Melo [Thu, 30 Nov 2023 17:11:45 +0000 (14:11 -0300)]
libapi: Add missing linux/types.h header to get the __u64 type on io.h

There are functions using __u64, so we need to have the linux/types.h
header otherwise we'll break when its not included before api/io.h.

Fixes: e95770af4c4a280f ("tools api: Add a lightweight buffered reading api")
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZWjDPL+IzPPsuC3X@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf test record+probe_libc_inet_pton: Fix call chain match on powerpc
Likhitha Korrapati [Sun, 26 Nov 2023 07:09:14 +0000 (02:09 -0500)]
perf test record+probe_libc_inet_pton: Fix call chain match on powerpc

The perf test "probe libc's inet_pton & backtrace it with ping" fails on
powerpc as below:

  # perf test -v "probe libc's inet_pton & backtrace it with
  ping"
   85: probe libc's inet_pton & backtrace it with ping                 :
  --- start ---
  test child forked, pid 96028
  ping 96056 [002] 127271.101961: probe_libc:inet_pton: (7fffa1779a60)
  7fffa1779a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
  7fffa172a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
  FAIL: expected backtrace entry
  "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$"
  got "7fffa172a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)"
  test child finished with -1
  ---- end ----
  probe libc's inet_pton & backtrace it with ping: FAILED!

This test installs a probe on libc's inet_pton function, which will use
uprobes and then uses perf trace on a ping to localhost. It gets 3
levels deep backtrace and checks whether it is what we expected or not.

The test started failing from RHEL 9.4 where as it works in previous
distro version (RHEL 9.2). Test expects gaih_inet function to be part of
backtrace. But in the glibc version (2.34-86) which is part of distro
where it fails, this function is missing and hence the test is failing.

From nm and ping command output we can confirm that gaih_inet function
is not present in the expected backtrace for glibc version glibc-2.34-86

  [root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet
  00000000001273e0 t gaih_inet_serv
  00000000001cd8d8 r gaih_inet_typeproto

  [root@xxx perf]# perf script -i /tmp/perf.data.6E8
  ping  104048 [000] 128582.508976: probe_libc:inet_pton: (7fff83779a60)
              7fff83779a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
              7fff8372a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
                 11dc73534 [unknown] (/usr/bin/ping)
              7fff8362a8c4 __libc_start_call_main+0x84 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)

  FAIL: expected backtrace entry
  "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$"
  got "7fff9d52a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)"

With version glibc-2.34-60 gaih_inet function is present as part of the
expected backtrace. So we cannot just remove the gaih_inet function from
the backtrace.

  [root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet
  0000000000130490 t gaih_inet.constprop.0
  000000000012e830 t gaih_inet_serv
  00000000001d45e4 r gaih_inet_typeproto

  [root@xxx perf]# ./perf script -i /tmp/perf.data.b6S
  ping   67906 [000] 22699.591699: probe_libc:inet_pton_3: (7fffbdd808207fffbdd80820 __GI___inet_pton+0x0
  (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 7fffbdd31160 gaih_inet.constprop.0+0xcd0
  (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 7fffbdd31c7c getaddrinfo+0x14c
  (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 1140d3558 [unknown] (/usr/bin/ping)

This patch solves this issue by doing a conditional skip. If there is a
gaih_inet function present in the libc then it will be added to the
expected backtrace else the function will be skipped from being added
to the expected backtrace.

Output with the patch

  [root@xxx perf]# ./perf test -v "probe libc's inet_pton & backtrace it
  with ping"
   83: probe libc's inet_pton & backtrace it with ping                 :
  --- start ---
  test child forked, pid 102662
  ping 102692 [000] 127935.549973: probe_libc:inet_pton: (7fff93379a60)
  7fff93379a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
  7fff9332a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)
  11ef03534 [unknown] (/usr/bin/ping)
  test child finished with 0
  ---- end ----
  probe libc's inet_pton & backtrace it with ping: Ok

Reported-by: Disha Goel <disgoel@linux.ibm.com>
Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Likhitha Korrapati <likhitha@linux.ibm.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20231126070914.175332-1-likhitha@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tests sigtrap: Skip if running on a kernel with sleepable spinlocks
Arnaldo Carvalho de Melo [Wed, 29 Nov 2023 15:47:18 +0000 (12:47 -0300)]
perf tests sigtrap: Skip if running on a kernel with sleepable spinlocks

There are issues as reported that need some more investigation on the
RT kernel front, till that is addressed, skip this test.

This test is already skipped for multiple hardware architectures where
the tested kernel feature is not supported.

Acked-by: Marco Elver <elver@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/e368f2c848d77fbc8d259f44e2055fe469c219cf.camel@gmx.de/
Link: https://lore.kernel.org/r/20231129154718.326330-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf test sigtrap: Generalize the BTF routine to reuse it in this test
Arnaldo Carvalho de Melo [Wed, 29 Nov 2023 15:47:17 +0000 (12:47 -0300)]
perf test sigtrap: Generalize the BTF routine to reuse it in this test

Move the part that loads the BTF info to a "btf__available()" that will
lazy load the BTF info so that if we need it for some other test, which
we will in the following cset, we can reuse it.

At some point this will move from this specific 'perf test' entry to be
used in other parts of perf, do it when needed.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20231129154718.326330-2-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf mmap: Lazily initialize zstd streams to save memory when not using it
Ian Rogers [Thu, 2 Nov 2023 17:56:46 +0000 (10:56 -0700)]
perf mmap: Lazily initialize zstd streams to save memory when not using it

Zstd streams create dictionaries that can require significant RAM,
especially when there is one per-CPU. Tools like 'perf record' won't use
the streams without the -z option, and so the creation of the streams
is pure overhead. Switch to creating the streams on first use.

Committer notes:

ssize_t comes from sys/types.h, size_t from stddef.h. This worked on
glibc as stdlib.h includes both, but not on musl libc. So do what 'man
size_t' says and include sys/types.h and stddef.h instead of stdlib.h

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231102175735.2272696-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf dwarf-aux: Add die_find_variable_by_addr()
Namhyung Kim [Thu, 9 Nov 2023 23:59:51 +0000 (15:59 -0800)]
perf dwarf-aux: Add die_find_variable_by_addr()

The die_find_variable_by_addr() is to find a variables in the given DIE
using given (PC-relative) address.  Global variables will have a
location expression with DW_OP_addr which has an address so can simply
compare it with the address.

  <1><143a7>: Abbrev Number: 2 (DW_TAG_variable)
      <143a8>   DW_AT_name        : loops_per_jiffy
      <143ac>   DW_AT_type        : <0x1cca>
      <143b0>   DW_AT_external    : 1
      <143b0>   DW_AT_decl_file   : 193
      <143b1>   DW_AT_decl_line   : 213
      <143b2>   DW_AT_location    : 9 byte block: 3 b0 46 41 82 ff ff ff ff
                                     (DW_OP_addr: ffffffff824146b0)

Note that the type-offset should be calculated from the base address of
the global variable.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-33-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tools: Add --debug-file option to redirect debug output
Yang Jihong [Tue, 31 Oct 2023 10:55:23 +0000 (10:55 +0000)]
perf tools: Add --debug-file option to redirect debug output

Currently, debug messages is output to stderr, add --debug-file option to
support redirection to a specified file.

Some test scenarios:

  # perf --list-opts
  --help --version --exec-path --html-path --paginate --no-pager --debugfs-dir --buildid-dir --list-cmds --list-opts --debug --debug-file

  # perf --debug-file
  No path given for --debug-file.

   Usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

  # perf --debug-file /sys/perf.log record -v true
  Open debug file '/sys/perf.log' failed: Permission denied

   Usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]

  # perf --debug-file /tmp/perf.log record -v true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.013 MB perf.data (26 samples) ]
  # cat /tmp/perf.log
  DEBUGINFOD_URLS=
  Using CPUID GenuineIntel-6-3E-4
  nr_cblocks: 0
  affinity: SYS
  mmap flush: 1
  comp level: 0
  mmap size 528384B
  Control descriptor is not initialized
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
  symbol:unmap_start file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:unmap_complete file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:map_start file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:map_complete file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:reloc_start file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:reloc_complete file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:init_start file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:init_complete file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:lll_lock_wait_private file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:lll_lock_wait file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:setjmp file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:longjmp file:(null) line:0 offset:0 return:0 lazy:(null)
  symbol:longjmp_target file:(null) line:0 offset:0 return:0 lazy:(null)
  failed to write feature HYBRID_TOPOLOGY

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231031105523.1472558-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf annotate: Check if operand has multiple regs
Namhyung Kim [Thu, 12 Oct 2023 03:50:25 +0000 (20:50 -0700)]
perf annotate: Check if operand has multiple regs

It needs to check all possible information in an instruction.  Let's add
a field indicating if the operand has multiple registers.  I'll be used
to search type information like in an array access on x86 like:

  mov    0x10(%rax,%rbx,8), %rcx
             -------------
                 here

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231012035111.676789-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf test: Use existing config value for objdump path
James Clark [Mon, 13 Nov 2023 10:23:24 +0000 (10:23 +0000)]
perf test: Use existing config value for objdump path

There is already an existing config value for changing the objdump path,
so instead of having two values that do the same thing, make 'perf test'
use annotate.objdump as well.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/ZU5Cx4LTrB5q0sIG@kernel.org
Link: https://lore.kernel.org/r/20231113102327.695386-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf vendor events riscv: add T-HEAD C9xx JSON file
Inochi Amaoto [Wed, 22 Nov 2023 12:41:06 +0000 (20:41 +0800)]
perf vendor events riscv: add T-HEAD C9xx JSON file

Add JSON file of T-HEAD C9xx series events.

The event idx (raw value) is summary as following:

event id range   | support cpu
 0x01 - 0x2a     |  c906,c910,c920

The event ids are based on the public document of T-HEAD and cover the
c900 series.

These events are the max that c900 series support.  Since T-HEAD let
manufacturers decide whether events are usable, the final support of the
perf events is determined by the pmu node of the soc dtb.

Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Tested-by: Guo Ren <guoren@kernel.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chen Wang <unicorn_wang@outlook.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jisheng Zhang <jszhang@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Fu <wefu@redhat.com>
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/IA1PR20MB495325FCF603BAA841E29281BBBAA@IA1PR20MB4953.namprd20.prod.outlook.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf vendor events: Add skx, clx, icx and spr upi bandwidth metric
Ian Rogers [Thu, 9 Nov 2023 23:27:32 +0000 (15:27 -0800)]
perf vendor events: Add skx, clx, icx and spr upi bandwidth metric

Add upi_data_receive_bw metric for skylakex, cascadelakex, icelakex
and sapphirerapids. The metric was added to perfmon metrics in:
https://github.com/intel/perfmon/pull/119

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231109232732.2973015-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tests: Skip data symbol test if buf1 symbol is missing
Adrian Hunter [Thu, 23 Nov 2023 07:58:48 +0000 (09:58 +0200)]
perf tests: Skip data symbol test if buf1 symbol is missing

perf data symbol test depends on finding symbol buf1 in perf, and fails if
perf has been stripped and no debug object is available. In that case, skip
the test instead.

Example:

 Before:

  $ strip tools/perf/perf
  $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf`
  $ tools/perf/perf test -v 'data symbol'
  113: Test data symbol                                                :
  --- start ---
  test child forked, pid 125646
  Recording workload...
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 0.577 MB /tmp/__perf_test.perf.data.Jhbdp (7794 samples) ]
  Cleaning up files...
  test child finished with -1
  ---- end ----
  Test data symbol: FAILED!

 After:

  $ tools/perf/perf test -v 'data symbol'
  113: Test data symbol                                                :
  --- start ---
  test child forked, pid 125747
  perf does not have symbol 'buf1'
  perf is missing symbols - skipping test
  test child finished with -2
  ---- end ----
  Test data symbol: Skip

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tests: Make data symbol test wait for perf to start
Adrian Hunter [Thu, 23 Nov 2023 07:58:47 +0000 (09:58 +0200)]
perf tests: Make data symbol test wait for perf to start

The perf data symbol test waits 1 second for perf to run and collect data,
which may be too little if perf takes a long time to start up, which has
been noticed on systems with many CPUs. Use existing wait_for_perf_to_start
helper to wait for perf to start.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tests: Skip branch stack sampling test if brstack_bench symbol is missing
Adrian Hunter [Thu, 23 Nov 2023 07:58:46 +0000 (09:58 +0200)]
perf tests: Skip branch stack sampling test if brstack_bench symbol is missing

The test "Check branch stack sampling" depends on finding symbol
brstack_bench (and several others) in perf, and fails if perf has been
stripped and no debug object is available. In that case, skip the test
instead.

Example:

 Before:

  $ strip tools/perf/perf
  $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf`
  $ tools/perf/perf test -v 'branch stack sampling'
  112: Check branch stack sampling                                     :
  --- start ---
  test child forked, pid 123741
  Testing user branch stack sampling
  + grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL/.*$ /tmp/__perf_test.program.5Dz1U/perf.script
  + cleanup
  + rm -rf /tmp/__perf_test.program.5Dz1U
  test child finished with -1
  ---- end ----
  Check branch stack sampling: FAILED!

 After:

  $ tools/perf/perf test -v 'branch stack sampling'
  112: Check branch stack sampling                                     :
  --- start ---
  test child forked, pid 125157
  perf does not have symbol 'brstack_bench'
  perf is missing symbols - skipping test
  test child finished with -2
  ---- end ----
  Check branch stack sampling: Skip

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tests: Skip Arm64 callgraphs test if leafloop symbol is missing
Adrian Hunter [Thu, 23 Nov 2023 07:58:45 +0000 (09:58 +0200)]
perf tests: Skip Arm64 callgraphs test if leafloop symbol is missing

The test "Check Arm64 callgraphs are complete in fp mode" depends on
finding symbol leafloop in perf, and fails if perf has been stripped and no
debug object is available. In that case, skip the test instead.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tests: Skip record test if test_loop symbol is missing
Adrian Hunter [Thu, 23 Nov 2023 07:58:44 +0000 (09:58 +0200)]
perf tests: Skip record test if test_loop symbol is missing

perf record test depends on finding symbol test_loop in perf, and fails if
perf has been stripped and no debug object is available. In that case, skip
the test instead.

Example:

 Note, building with perl support adds option -Wl,-E which causes the
 linker to add all (global) symbols to the dynamic symbol table. So the
 test_loop symbol, being global, does not get stripped unless NO_LIBPERL=1

 Before:

  $ make NO_LIBPERL=1 -C tools/perf >/dev/null 2>&1
  $ strip tools/perf/perf
  $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf`
  $ tools/perf/perf test -v 'record tests'
   91: perf record tests                                               :
  --- start ---
  test child forked, pid 118750
  Basic --per-thread mode test
  Per-thread record [Failed missing output]
  Register capture test
  Register capture test [Success]
  Basic --system-wide mode test
  System-wide record [Skipped not supported]
  Basic target workload test
  Workload record [Failed missing output]
  test child finished with -1
  ---- end ----
  perf record tests: FAILED!

 After:

  $ tools/perf/perf test -v 'record tests'
   91: perf record tests                                               :
  --- start ---
  test child forked, pid 120025
  perf does not have symbol 'test_loop'
  perf is missing symbols - skipping test
  test child finished with -2
  ---- end ----
  perf record tests: Skip

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tests: Skip pipe test if noploop symbol is missing
Adrian Hunter [Thu, 23 Nov 2023 07:58:43 +0000 (09:58 +0200)]
perf tests: Skip pipe test if noploop symbol is missing

perf pipe recording and injection test depends on finding symbol noploop in
perf, and fails if perf has been stripped and no debug object is available.
In that case, skip the test instead.

Example:

 Before:

  $ strip tools/perf/perf
  $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf`
  $ tools/perf/perf test -v pipe
   86: perf pipe recording and injection test                          :
  --- start ---
  test child forked, pid 47734
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
       47741    47741       -1 |perf
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
  cannot find noploop function in pipe #1
  test child finished with -1
  ---- end ----
  perf pipe recording and injection test: FAILED!

After:

  $ tools/perf/perf test -v pipe
   86: perf pipe recording and injection test                          :
  --- start ---
  test child forked, pid 48996
  perf does not have symbol 'noploop'
  perf is missing symbols - skipping test
  test child finished with -2
  ---- end ----
  perf pipe recording and injection test: Skip

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tests lib: Add perf_has_symbol.sh
Adrian Hunter [Thu, 23 Nov 2023 07:58:42 +0000 (09:58 +0200)]
perf tests lib: Add perf_has_symbol.sh

Some shell tests depend on finding symbols for perf itself, and fail if
perf has been stripped and no debug object is available. Add helper
functions to check if perf has a needed symbol. This is preparation for
amending the tests themselves to be skipped if a needed symbol is not
found.

The functions make use of the "Symbols" test which reads and checks symbols
from a dso, perf itself by default. Note the "Symbols" test will find
symbols using the same method as other perf tests, including, for example,
looking in the buildid cache.

An alternative would be to prevent the needed symbols from being stripped,
which seems to work with gcc's externally_visible attribute, but that
attribute is not supported by clang.

Another alternative would be to use option -Wl,-E (which is already used
when perf is built with perl support) which causes the linker to add all
(global) symbols to the dynamic symbol table. Then the required symbols
need only be made global in scope to avoid being strippable. However that
goes beyond what is needed.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf header: Fix segfault on build_mem_topology() error path
Adrian Hunter [Thu, 23 Nov 2023 07:58:41 +0000 (09:58 +0200)]
perf header: Fix segfault on build_mem_topology() error path

Do not increase the node count unless a node has been successfully read,
because it can lead to a segfault if an error occurs.

For example, if perf exceeds the open file limit in memory_node__read(),
which, on a test system, could be made to happen by setting the file limit
to exactly 32:

 Before:

  $ ulimit -n 32
  $ perf mem record --all-user -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  failed: can't open memory sysfs data
  perf: Segmentation fault
  Obtained 14 stack frames.
  perf(sighandler_dump_stack+0x48) [0x55f4b1f59558]
  /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f4ba1c42520]
  /lib/x86_64-linux-gnu/libc.so.6(free+0x1e) [0x7f4ba1ca53fe]
  perf(+0x178ff4) [0x55f4b1f48ff4]
  perf(+0x179a70) [0x55f4b1f49a70]
  perf(+0x17ef5d) [0x55f4b1f4ef5d]
  perf(+0x85c0b) [0x55f4b1e55c0b]
  perf(cmd_record+0xe1d) [0x55f4b1e5920d]
  perf(cmd_mem+0xc96) [0x55f4b1e80e56]
  perf(+0x130460) [0x55f4b1f00460]
  perf(main+0x689) [0x55f4b1e427d9]
  /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f4ba1c29d90]
  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f4ba1c29e40]
  perf(_start+0x25) [0x55f4b1e42a25]
  Segmentation fault (core dumped)
  $

After:

  $ ulimit -n 32
  $ perf mem record --all-user -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  failed: can't open memory sysfs data
  [ perf record: Captured and wrote 0.005 MB perf.data (11 samples) ]
  $

Fixes: f8e502b9d1b3b197 ("perf header: Ensure bitmaps are freed")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123075848.9652-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf report: Remove warning on missing raw data for s390
Thomas Richter [Wed, 22 Nov 2023 09:27:03 +0000 (10:27 +0100)]
perf report: Remove warning on missing raw data for s390

Command

   # ./perf report -i /tmp/111 -D > /dev/null

emits an error message when a sample for event CRYPTO_ALL in the
perf.data file does not contain any raw data. This is ok.  Do not
trigger this warning when the sample in the perf.data files does not
contain any raw data at all.  Check for availability of raw data for all
events and return if none is available.

Output before:

  # ./perf report -i /tmp/111 -D > /dev/null
  Invalid CRYPTO_ALL raw data encountered
  Invalid CRYPTO_ALL raw data encountered
  Invalid CRYPTO_ALL raw data encountered
  #

Output after:

  # ./perf report -i /tmp/111 -D > /dev/null
  #

Fixes: b539deafbadb2fc6 ("perf report: Add s390 raw data interpretation for PAI counters")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20231122092703.3163191-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tools: Add perf binary dependent rule for shellcheck log in Makefile.perf
Athira Rajeev [Thu, 23 Nov 2023 16:02:32 +0000 (21:32 +0530)]
perf tools: Add perf binary dependent rule for shellcheck log in Makefile.perf

Add rule in new Makefile "tests/Makefile.tests" for running shellcheck
on shell test scripts. This automates below shellcheck into the build.

$ for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done

Condition for shellcheck is added in Makefile.perf to avoid build
breakage in the absence of shellcheck binary. Update Makefile.perf to
contain new rule for "SHELLCHECK_TEST" which is for making shellcheck
test as a dependency on perf binary.

Added "tests/Makefile.tests" to run shellcheck on shellscripts in
tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every time
during make, shellcheck will be run only on modified files during
subsequent invocations. By this, if any newly added shell scripts or
fixes in existing scripts breaks coding/formatting style, it will get
captured during the perf build.

Example build failure by modifying probe_vfs_getname.sh in tests/shell:

In tests/shell/probe_vfs_getname.sh line 8:
. $(dirname $0)/lib/probe.sh
  ^-----------^ SC2046 (warning): Quote this to prevent word splitting.

For more information:
  https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1
make[2]: *** [Makefile.perf:686: SHELLCHECK_TEST] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile.perf:244: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

Here, like other files which gets created during compilation (ex:
.builtin-bench.o.cmd or .perf.o.cmd ), create .shellcheck_log also as a
hidden file.  Example: tests/shell/.probe_vfs_getname.sh.shellcheck_log
shellcheck is re-run if any of the script gets modified based on its
dependency of this log file.

After this, for testing, changed "tests/shell/trace+probe_vfs_getname.sh" to
break shellcheck format. In the next make run, it is also captured:

In tests/shell/probe_vfs_getname.sh line 8:
. $(dirname $0)/lib/probe.sh
  ^-----------^ SC2046 (warning): Quote this to prevent word splitting.

For more information:
  https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1
make[3]: *** Waiting for unfinished jobs....

In tests/shell/trace+probe_vfs_getname.sh line 14:
. $(dirname $0)/lib/probe.sh
  ^-----------^ SC2046 (warning): Quote this to prevent word splitting.

For more information:
  https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
make[3]: *** [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.trace+probe_vfs_getname.sh.shellcheck_log] Error 1
make[2]: *** [Makefile.perf:686: SHELLCHECK_TEST] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile.perf:244: sub-make] Error 2
make: *** [Makefile:70: all] Error 2

Failure log can be found in the stdout of make itself.

This is reported at build time. To be able to go ahead with the build or
disable shellcheck even though it is known that some test is broken, add
a "NO_SHELLCHECK" option. Example:

  make NO_SHELLCHECK=1

  INSTALL libsubcmd_headers
  INSTALL libsymbol_headers
  INSTALL libapi_headers
  INSTALL libperf_headers
  INSTALL libbpf_headers
  LINK    perf

Note:

This is tested on RHEL and also SLES. Use below check:
"$(shell which shellcheck 2> /dev/null)" to look for presence
of shellcheck binary. The approach "shell command -v" is not
used here. In some of the distros(RHEL), command is available
as executable file (/usr/bin/command). But in some distros(SLES),
it is a shell builtin and not available as executable file.

Committer testing:

  $ type shellcheck
  shellcheck is hashed (/usr/bin/shellcheck)
  $ rpm -qf /usr/bin/shellcheck
  ShellCheck-0.9.0-2.fc38.x86_64
  $
  $ alias m
  $ git diff
  diff --git a/tools/perf/tests/shell/probe_vfs_getname.sh b/tools/perf/tests/shell/probe_vfs_getname.sh
  index 554e12e83c55fd56..dbc14634678e2bf6 100755
  --- a/tools/perf/tests/shell/probe_vfs_getname.sh
  +++ b/tools/perf/tests/shell/probe_vfs_getname.sh
  @@ -5,7 +5,7 @@
   # Arnaldo Carvalho de Melo <acme@kernel.org>, 2017

   # shellcheck source=lib/probe.sh
  -. "$(dirname $0)"/lib/probe.sh
  +. $(dirname $0)/lib/probe.sh

   skip_if_no_perf_probe || exit 2

  alias m='rm -rf ~/libexec/perf-core/ ; make -k CORESIGHT=1 O=/tmp/build/$(basename $PWD) -C tools/perf install-bin && perf test python'
  $ m
  make: Entering directory '/home/acme/git/perf-tools-next/tools/perf'
    BUILD:   Doing 'make -j32' parallel build
<SNIP>
    INSTALL libbpf_headers

  In tests/shell/probe_vfs_getname.sh line 8:
  . $(dirname $0)/lib/probe.sh
    ^-----------^ SC2046 (warning): Quote this to prevent word splitting.

  For more information:
    https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt...
  make[3]: *** [/home/acme/git/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1
  make[2]: *** [Makefile.perf:686: SHELLCHECK_TEST] Error 2
  make[2]: *** Waiting for unfinished jobs....
  make[1]: *** [Makefile.perf:244: sub-make] Error 2
  make: *** [Makefile:113: install-bin] Error 2
  make: Leaving directory '/home/acme/git/perf-tools-next/tools/perf'
  $

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20231123160232.94253-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf vendor events riscv: Add StarFive Dubhe-90 JSON file
Ji Sheng Teoh [Wed, 22 Nov 2023 03:09:08 +0000 (11:09 +0800)]
perf vendor events riscv: Add StarFive Dubhe-90 JSON file

Similar to StarFive's Dubhe-80, Dubhe-90 supports raw event id 0x00 -
0x22. Reuse Dubhe-80 firmware and common json file.  The raw events are
enabled through PMU node of DT binding.  Besides raw event, add standard
RISC-V firmware events to support monitoring of firmware event.

Example of PMU DT node:
pmu {
compatible = "riscv,pmu";
riscv,raw-event-to-mhpmcounters =
/* Event ID 1-31 */
<0x00 0x00 0xFFFFFFFF 0xFFFFFFE0 0x00007FF8>,
/* Event ID 32-33 */
<0x00 0x20 0xFFFFFFFF 0xFFFFFFFE 0x00007FF8>,
/* Event ID 34 */
<0x00 0x22 0xFFFFFFFF 0xFFFFFF22 0x00007FF8>;
};

'perf stat' output:

  [root@user]# perf stat -a \
   -e access_mmu_stlb \
   -e miss_mmu_stlb \
   -e access_mmu_pte_c \
   -e rob_flush \
   -e btb_prediction_miss \
   -e itlb_miss \
   -e sync_del_fetch_g \
   -e icache_miss \
   -e bpu_br_retire \
   -e bpu_br_miss \
   -e ret_ins_retire \
   -e ret_ins_miss \
   -- openssl speed rsa2048
  Doing 2048 bits private rsa's for 10s: 39 2048 bits private RSA's in
  10.03s
  Doing 2048 bits public rsa's for 10s: 1469 2048 bits public RSA's in
  9.47s
  version: 3.0.10
  built on: Tue Aug  1 13:47:24 2023 UTC
  options: bn(64,64)
  CPUINFO: N/A
                    sign    verify    sign/s verify/s
  rsa 2048 bits 0.257179s 0.006447s      3.9    155.1

   Performance counter stats for 'system wide':

             3112882      access_mmu_stlb
               10550      miss_mmu_stlb
               18251      access_mmu_pte_c
              274765      rob_flush
            22470560      btb_prediction_miss
             3035839      itlb_miss
           643549060      sync_del_fetch_g
              133013      icache_miss
            62982796      bpu_br_retire
              287548      bpu_br_miss
             8935910      ret_ins_retire
                8308      ret_ins_miss

        20.656182600 seconds time elapsed

Reviewed-by: Ley Foon Tan <leyfoon.tan@starfivetech.com>
Signed-off-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nikita Shubin <n.shubin@yadro.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20231122030908.2981502-1-jisheng.teoh@starfivetech.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf tests coresight: Remove unused variables
zhujun2 [Wed, 15 Nov 2023 06:42:55 +0000 (22:42 -0800)]
perf tests coresight: Remove unused variables

These variables are never referenced in the code, just remove them.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: zhujun2 <zhujun2@cmss.chinamobile.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20231115064255.11057-1-zhujun2@cmss.chinamobile.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf lock: Fix a memory leak on an error path
zhaimingbing [Fri, 24 Nov 2023 09:26:57 +0000 (17:26 +0800)]
perf lock: Fix a memory leak on an error path

if a strdup-ed string is NULL,the allocated memory needs freeing.

Signed-off-by: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Acked-by: Ingo Molnar <mingo@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20231124092657.10392-1-zhaimingbing@cmss.chinamobile.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf parse-events: Make legacy events lower priority than sysfs/JSON
Ian Rogers [Thu, 23 Nov 2023 04:29:22 +0000 (20:29 -0800)]
perf parse-events: Make legacy events lower priority than sysfs/JSON

The perf tool has previously made legacy events the priority so with
or without a PMU the legacy event would be opened:

  $ perf stat -e cpu-cycles,cpu/cpu-cycles/ true
  Using CPUID GenuineIntel-6-8D-1
  intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
  Attempting to add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors
  After aliases, add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    type                             0 (PERF_TYPE_HARDWARE)
    size                             136
    config                           0 (PERF_COUNT_HW_CPU_CYCLES)
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 833967  cpu -1  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  perf_event_attr:
    type                             0 (PERF_TYPE_HARDWARE)
    size                             136
    config                           0 (PERF_COUNT_HW_CPU_CYCLES)
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  ...

Fixes to make hybrid/BIG.little PMUs behave correctly, ie as core PMUs
capable of opening legacy events on each, removing hard coded "cpu_core"
and "cpu_atom" Intel PMU names, etc. caused a behavioral difference on
Apple/ARM due to latent issues in the PMU driver reported in:
https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/

As part of that report Mark Rutland <mark.rutland@arm.com> requested
that legacy events not be higher in priority when a PMU is specified
reversing what has until this change been perf's default behavior. With
this change the above becomes:

  $ perf stat -e cpu-cycles,cpu/cpu-cycles/ true
  Using CPUID GenuineIntel-6-8D-1
  Attempt to add: cpu/cpu-cycles=0/
  ..after resolving event: cpu/event=0x3c/
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    type                             0 (PERF_TYPE_HARDWARE)
    size                             136
    config                           0 (PERF_COUNT_HW_CPU_CYCLES)
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 827628  cpu -1  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  perf_event_attr:
    type                             4 (PERF_TYPE_RAW)
    size                             136
    config                           0x3c
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  ...

So the second event has become a raw event as
/sys/devices/cpu/events/cpu-cycles exists.

A fix was necessary to config_term_pmu in parse-events.c as check_alias
expansion needs to happen after config_term_pmu, and config_term_pmu may
need calling a second time because of this.

config_term_pmu is updated to not use the legacy event when the PMU has
such a named event (either from JSON or sysfs).

The bulk of this change is updating all of the parse-events test
expectations so that if a sysfs/JSON event exists for a PMU the test
doesn't fail - a further sign, if it were needed, that the legacy event
priority was a known and tested behavior of the perf tool.

Reported-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Hector Martin <marcan@marcan.st>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231123042922.834425-1-irogers@google.com
[ Initialize the 'alias_rewrote_terms' variable to false to address a clang warning ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf cs-etm: Enable itrace option 'T'
Leo Yan [Sat, 14 Oct 2023 07:45:13 +0000 (15:45 +0800)]
perf cs-etm: Enable itrace option 'T'

Prior to Armv8.4, the feature FEAT_TRF is not supported by Arm CPUs.
Consequently, the sysfs node 'ts_source' will not be set as 1 by the
CoreSight ETM driver.  On the other hand, the perf tool relies on the
'ts_source' node to determine whether the kernel timestamp is traced.
Since the 'ts_source' is not set for Arm CPUs prior to Armv8.4,
platforms in this case cannot utilize the traced timestamp as the kernel
time.

This patch enables the 'T' itrace option, which forcibly utilizes the
traced timestamp as the kernel time.  If users are aware that their
working platform's Arm CoreSight shares the same counter with the kernel
time, they can specify 'T' option to decode the traced timestamp as the
kernel time.

An usage example is:

  # perf record -e cs_etm// -- test_program
  # perf script --itrace=i10ibT
  # perf report --itrace=i10ibT

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20231014074513.1668000-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf auxtrace: Add 'T' itrace option for timestamp trace
Leo Yan [Sat, 14 Oct 2023 07:45:12 +0000 (15:45 +0800)]
perf auxtrace: Add 'T' itrace option for timestamp trace

An AUX trace can contain timestamp, but in some situations, the hardware
trace module (e.g. Arm CoreSight) cannot decide the traced timestamp is
the same source with CPU's time, thus the decoder can not use the
timestamp trace for samples.

This patch introduces 'T' itrace option. If users know the platforms
they are working on have the same time counter with CPUs, users can
use this new option to tell a decoder for using timestamp trace as
kernel time.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20231014074513.1668000-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf cs-etm: Bump minimum OpenCSD version to ensure a bugfix is present
James Clark [Fri, 1 Sep 2023 13:37:15 +0000 (14:37 +0100)]
perf cs-etm: Bump minimum OpenCSD version to ensure a bugfix is present

Since commit d927ef5004ef ("perf cs-etm: Add exception level consistency
check"), the exception that was added to Perf will be triggered unless
the following bugfix from OpenCSD is present:

 - _Version 1.2.1_:
  - __Bugfix__:
    ETM4x / ETE - output of context elements to client can in some
    circumstances be delayed until after subsequent atoms have been
    processed leading to incorrect memory decode access via the client
    callbacks. Fixed to flush context elements immediately they are
    committed.

Rather than remove the assert and silently fail, just increase the
minimum version requirement to avoid hard to debug issues and
regressions.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230901133716.677499-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf test: Basic branch counter support
Kan Liang [Tue, 7 Nov 2023 18:40:20 +0000 (10:40 -0800)]
perf test: Basic branch counter support

Add a basic test for the branch counter feature.

The test verifies that
- The new filter can be successfully applied on the supported platforms.
- The counter value can be outputted via the perf report -D

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tinghao Zhang <tinghao.zhang@intel.com>
Link: https://lore.kernel.org/r/20231107184020.1497571-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf script perl: Fail check on dynamic allocation
zhaimingbing [Mon, 20 Nov 2023 11:23:56 +0000 (19:23 +0800)]
perf script perl: Fail check on dynamic allocation

Return ENOMEM when dynamic allocation failed.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20231120112356.8652-1-zhaimingbing@cmss.chinamobile.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf script python: Fail check on dynamic allocation
Paran Lee [Mon, 20 Nov 2023 22:32:19 +0000 (07:32 +0900)]
perf script python: Fail check on dynamic allocation

Add PyList_New() Fail check in get_field_numeric_entry()
function and dynamic allocation checking for
set_regs_in_dict(), python_start_script().

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: MichelleJin <shjy180909@gmail.com>
Signed-off-by: Paran Lee <p4ranlee@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Austin Kim <austindh.kim@gmail.com>
Cc: Honggyu Kim <honggyu.kp@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20231120223218.9036-1-p4ranlee@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 months agoperf test: Remove atomics from test_loop to avoid test failures
Nick Forrington [Thu, 2 Nov 2023 16:22:24 +0000 (16:22 +0000)]
perf test: Remove atomics from test_loop to avoid test failures

The current use of atomics can lead to test failures, as tests (such as
tests/shell/record.sh) search for samples with "test_loop" as the
top-most stack frame, but find frames related to the atomic operation
(e.g. __aarch64_ldadd4_relax).

This change simply removes the "count" variable, as it is not necessary.

Fixes: 1962ab6f6e0b39e4 ("perf test workload thloop: Make count increments atomic")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Nick Forrington <nick.forrington@arm.com>
Acked-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231102162225.50028-1-nick.forrington@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf tools: Address python 3.6 DeprecationWarning for string scapes
Benjamin Gray [Tue, 12 Sep 2023 06:07:59 +0000 (16:07 +1000)]
perf tools: Address python 3.6 DeprecationWarning for string scapes

Python 3.6 introduced a DeprecationWarning for invalid escape sequences.
This is upgraded to a SyntaxWarning in Python 3.12, and will eventually
be a syntax error.

Fix these now to get ahead of it before it's an error.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mykola Lysenko <mykolal@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Todd E Brandt <todd.e.brandt@linux.intel.com>
Cc: Tom Rix <trix@redhat.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230912060801.95533-6-bgray@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf vendor events riscv: Add StarFive Dubhe-80 JSON file
Ji Sheng Teoh [Fri, 3 Nov 2023 08:24:41 +0000 (16:24 +0800)]
perf vendor events riscv: Add StarFive Dubhe-80 JSON file

StarFive's Dubhe-80 supports raw event id 0x00 - 0x22.  The raw events
are enabled through PMU node of DT binding.  Besides raw event, add
standard RISC-V firmware events to support monitoring of firmware event.

Example of PMU DT node:

  pmu {
   compatible = "riscv,pmu";
   riscv,raw-event-to-mhpmcounters =
   /* Event ID 1-31 */
   <0x00 0x00 0xFFFFFFFF 0xFFFFFFE0 0x00007FF8>,
   /* Event ID 32-33 */
   <0x00 0x20 0xFFFFFFFF 0xFFFFFFFE 0x00007FF8>,
   /* Event ID 34 */
   <0x00 0x22 0xFFFFFFFF 0xFFFFFF22 0x00007FF8>;
  };

Example of 'perf stat' output:

  [root@user]# perf stat -a \
   -e access_mmu_stlb \
   -e miss_mmu_stlb \
   -e access_mmu_pte_c \
   -e rob_flush \
   -e btb_prediction_miss \
   -e itlb_miss \
   -e sync_del_fetch_g \
   -e icache_miss \
   -e bpu_br_retire \
   -e bpu_br_miss \
   -e ret_ins_retire \
   -e ret_ins_miss \
   -- openssl speed rsa2048

  Doing 2048 bits private rsa's for 10s: 39 2048 bits private RSA's in
  10.14s
  Doing 2048 bits public rsa's for 10s: 1563 2048 bits public RSA's in
  10.00s
  version: 3.0.11
  built on: Tue Sep 19 13:02:31 2023 UTC
  options: bn(64,64)
  CPUINFO: N/A
                    sign    verify    sign/s verify/s
  rsa 2048 bits 0.260000s 0.006398s      3.8    156.3

   Performance counter stats for 'system wide':

             1338350      access_mmu_stlb
             1154025      miss_mmu_stlb
             1162691      access_mmu_pte_c
               34067      rob_flush
            11212384      btb_prediction_miss
             1256242      itlb_miss
           652523491      sync_del_fetch_g
              384465      icache_miss
            64635789      bpu_br_retire
              323440      bpu_br_miss
             8785143      ret_ins_retire
               31236      ret_ins_miss

        20.760822480 seconds time elapsed

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ley Foon Tan <leyfoon.tan@starfivetech.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nikita Shubin <n.shubin@yadro.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20231103082441.1389842-1-jisheng.teoh@starfivetech.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf report: Add s390 raw data interpretation for PAI counters
Thomas Richter [Fri, 10 Nov 2023 11:09:08 +0000 (12:09 +0100)]
perf report: Add s390 raw data interpretation for PAI counters

Commit 39d62336f5c126ad ("s390/pai: add support for cryptography
counters") added support for Processor Activity Instrumentation Facility
(PAI) counters.  These counters values are added as raw data with the
perf sample during 'perf record'.

Now add support to display these counters in the 'perf report' command.

The counter number, its assigned name and value is now printed in
addition to the hexadecimal output.

Output before:

  # perf report -D

  6 514766399626050 0x7b058 [0x48]: PERF_RECORD_SAMPLE(IP, 0x1):
  303977/303977: 0 period: 1 addr: 0
  ... thread: paitest:303977
  ...... dso: <not found>

  0x7b0a0@/root/perf.data.paicrypto [0x48]: event: 9
  .
  . ... raw event: size 72 bytes
  . 0000:  00 00 00 09 00 01 00 48 00 00 00 00 00 00 00 00  .......H........
  . 0010:  00 04 a3 69 00 04 a3 69 00 01 d4 2d 76 de a0 bb  ...i...i...-v...
  . 0020:  00 00 00 00 00 01 5c 53 00 00 00 06 00 00 00 00  ......\S........
  . 0030:  00 00 00 00 00 00 00 01 00 00 00 0c 00 07 00 00  ................
  . 0040:  00 00 00 53 96 af 00 00                          ...S....

Output after:

  # perf report -D

  6 514766399626050 0x7b058 [0x48]: PERF_RECORD_SAMPLE(IP, 0x1):
  303977/303977: 0 period: 1 addr: 0
  ... thread: paitest:303977
  ...... dso: <not found>

  0x7b0a0@/root/perf.data.paicrypto [0x48]: event: 9
  .
  . ... raw event: size 72 bytes
  . 0000:  00 00 00 09 00 01 00 48 00 00 00 00 00 00 00 00  .......H........
  . 0010:  00 04 a3 69 00 04 a3 69 00 01 d4 2d 76 de a0 bb  ...i...i...-v...
  . 0020:  00 00 00 00 00 01 5c 53 00 00 00 06 00 00 00 00  ......\S........
  . 0030:  00 00 00 00 00 00 00 01 00 00 00 0c 00 07 00 00  ................
  . 0040:  00 00 00 53 96 af 00 00                          ...S....

        Counter:007 km_aes_128 Value:0x00000000005396af     <--- new

Committer notes:

Had to add ignore pragmas for that __packed function:

  +#pragma GCC diagnostic ignored "-Wpacked"
  +#pragma GCC diagnostic ignored "-Wattributes"

Otherwise this doesn't build in things like debian experimentao cross
building to mips64, etc.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20231110110908.2312308-1-tmricht@linux.ibm.com
[ Corrected non-existent commit referred to the right one: 39d62336f5c126ad ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf probe: Convert to check dwarf_getcfi feature
Namhyung Kim [Thu, 9 Nov 2023 23:59:28 +0000 (15:59 -0800)]
perf probe: Convert to check dwarf_getcfi feature

Now it has a feature check for the dwarf_getcfi(), use it and convert
the code to check HAVE_DWARF_CFI_SUPPORT definition.

Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf build: Add feature check for dwarf_getcfi()
Namhyung Kim [Thu, 9 Nov 2023 23:59:27 +0000 (15:59 -0800)]
perf build: Add feature check for dwarf_getcfi()

The dwarf_getcfi() is available on libdw 0.142+.  Instead of just
checking the version number, it'd be nice to have a config item to check
the feature at build time.

Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf dwarf-aux: Add die_find_variable_by_reg() helper
Namhyung Kim [Thu, 9 Nov 2023 23:59:26 +0000 (15:59 -0800)]
perf dwarf-aux: Add die_find_variable_by_reg() helper

The die_find_variable_by_reg() will search for a variable or a parameter
sub-DIE in the given scope DIE where the location matches to the given
register.

For the simplest and most common case, memory access usually happens
with a base register and an offset to the field so the register holds a
pointer in a variable or function parameter.  Then we can find one if it
has a location expression at the (instruction) address.  This function
only handles such a simple case for now.

In this case, the expression has a DW_OP_regN operation where N < 32.
If the register index (N) is greater than or equal to 32, DW_OP_regx
operation with an operand which saves the value for the N would be used.
It rejects expressions with more operations.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf dwarf-aux: Add die_get_scopes() alternative to dwarf_getscopes()
Namhyung Kim [Thu, 9 Nov 2023 23:59:25 +0000 (15:59 -0800)]
perf dwarf-aux: Add die_get_scopes() alternative to dwarf_getscopes()

The die_get_scopes() returns the number of enclosing DIEs for the given
address and it fills an array of DIEs like dwarf_getscopes().  But it
doesn't follow the abstract origin of inlined functions as we want
information of the concrete instance.  This is needed to check the
location of parameters and local variables properly.  Users can check
the origin separately if needed.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf dwarf-aux: Move #else block of #ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT code to...
Namhyung Kim [Thu, 9 Nov 2023 23:59:24 +0000 (15:59 -0800)]
perf dwarf-aux: Move #else block of #ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT code to the header file

It's a usual convention that the conditional code is handled in a header
file.  As I'm planning to add some more of them, let's move the current
code to the header first.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf dwarf-aux: Fix die_get_typename() for void *
Namhyung Kim [Thu, 9 Nov 2023 23:59:23 +0000 (15:59 -0800)]
perf dwarf-aux: Fix die_get_typename() for void *

The die_get_typename() is to return a C-like type name from DWARF debug
entry and it follows data type if the target entry is a pointer type.

But I found that void pointers don't have the type attribute to follow
and then the function returns an error for that case.  This results in a
broken type string for void pointer types.

For example, the following type entries are pointer types.

 <1><48c>: Abbrev Number: 4 (DW_TAG_pointer_type)
    <48d>   DW_AT_byte_size   : 8
    <48d>   DW_AT_type        : <0x481>
 <1><491>: Abbrev Number: 211 (DW_TAG_pointer_type)
    <493>   DW_AT_byte_size   : 8
 <1><494>: Abbrev Number: 4 (DW_TAG_pointer_type)
    <495>   DW_AT_byte_size   : 8
    <495>   DW_AT_type        : <0x49e>

The first one at offset 48c and the third one at offset 494 have type
information.  Then they are pointer types for the referenced types.  But
the second one at offset 491 doesn't have the type attribute.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf tools: Add util/debuginfo.[ch] files
Namhyung Kim [Thu, 9 Nov 2023 23:59:22 +0000 (15:59 -0800)]
perf tools: Add util/debuginfo.[ch] files

Split debuginfo data structure and related functions into a separate
file so that it can be used by other components than the probe-finder.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf annotate: Move raw_comment and raw_func_start fields out of 'struct ins_operands'
Namhyung Kim [Thu, 9 Nov 2023 23:59:21 +0000 (15:59 -0800)]
perf annotate: Move raw_comment and raw_func_start fields out of 'struct ins_operands'

Thoese two fields are used only for the jump_ops, so move them into the
union to save some bytes.  Also add jump__delete() callback not to free
the fields as they didn't allocate new strings.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: WANG Rui <wangrui@loongson.cn>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf annotate: Pass "-l" option to objdump conditionally
Namhyung Kim [Thu, 9 Nov 2023 23:59:20 +0000 (15:59 -0800)]
perf annotate: Pass "-l" option to objdump conditionally

The "-l" option is to print line numbers in the objdump output.  perf
annotate TUI only can show the line numbers later but it causes big slow
downs for the kernel binary.

Similarly, showing source code also takes a long time and it already has
an option to control it.

  $ time objdump ... -d -S -C vmlinux > /dev/null
  real 0m3.474s
  user 0m3.047s
  sys 0m0.428s

  $ time objdump ... -d -l -C vmlinux > /dev/null
  real 0m1.796s
  user 0m1.459s
  sys 0m0.338s

  $ time objdump ... -d -C vmlinux > /dev/null
  real 0m0.051s
  user 0m0.036s
  sys 0m0.016s

As it's not needed for data type profiling, let's make it conditional so
that it can skip the unnecessary work.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf header: Additional note on AMD IBS for max_precise pmu cap
Arnaldo Carvalho de Melo [Tue, 7 Nov 2023 08:33:31 +0000 (14:03 +0530)]
perf header: Additional note on AMD IBS for max_precise pmu cap

x86 core PMU exposes supported maximum precision level via max_precise
PMU capability. Although, AMD core PMU does not support precise mode,
certain core PMU events with precise_ip > 0 are allowed and forwarded to
IBS OP PMU.

Display a note about this in the 'perf report' header output and
document the details in the perf-list man page.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231107083331.901-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agotools: Disable __packed attribute compiler warning due to -Werror=attributes
Arnaldo Carvalho de Melo [Thu, 9 Nov 2023 19:34:09 +0000 (16:34 -0300)]
tools: Disable __packed attribute compiler warning due to -Werror=attributes

Noticed on several perf tools cross build test containers:

  [perfbuilder@five ~]$ grep FAIL ~/dm.log/summary
    19    10.18 debian:experimental-x-mips    : FAIL gcc version 12.3.0 (Debian 12.3.0-6)
    20    11.21 debian:experimental-x-mips64  : FAIL gcc version 12.3.0 (Debian 12.3.0-6)
    21    11.30 debian:experimental-x-mipsel  : FAIL gcc version 12.3.0 (Debian 12.3.0-6)
    37    12.07 ubuntu:18.04-x-arm            : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
    42    11.91 ubuntu:18.04-x-riscv64        : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    44    13.17 ubuntu:18.04-x-sh4            : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
    45    12.09 ubuntu:18.04-x-sparc64        : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
  [perfbuilder@five ~]$

  In file included from util/intel-pt-decoder/intel-pt-pkt-decoder.c:10:
  /tmp/perf-6.6.0-rc1/tools/include/asm-generic/unaligned.h: In function 'get_unaligned_le16':
  /tmp/perf-6.6.0-rc1/tools/include/asm-generic/unaligned.h:13:29: error: packed attribute causes inefficient alignment for 'x' [-Werror=attributes]
     13 |         const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr);      \
        |                             ^
  /tmp/perf-6.6.0-rc1/tools/include/asm-generic/unaligned.h:27:28: note: in expansion of macro '__get_unaligned_t'
     27 |         return le16_to_cpu(__get_unaligned_t(__le16, p));
        |                            ^~~~~~~~~~~~~~~~~

This comes from the kernel, where the -Wattributes and -Wpacked isn't
used, -Wpacked is already disabled, do it for the attributes as well.

Fixes: a91c987254651443 ("perf tools: Add get_unaligned_leNN()")
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/7c5b626c-1de9-4c12-a781-e44985b4a797@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf bpf: Don't synthesize BPF events when disabled
Ian Rogers [Thu, 2 Nov 2023 17:56:54 +0000 (10:56 -0700)]
perf bpf: Don't synthesize BPF events when disabled

If BPF sideband events are disabled on the command line, don't
synthesize BPF events too.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Song Liu <song@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231102175735.2272696-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf test: Add support for setting objdump binary via perf config
James Clark [Mon, 6 Nov 2023 15:10:49 +0000 (15:10 +0000)]
perf test: Add support for setting objdump binary via perf config

Add a 'perf config' variable that does the same thing as "perf test
--objdump <x>".

Also update the man page.

Committer testing:

  # perf config test.objdump
  # perf test "object code reading"
   26: Object code reading                                             : Ok
  # perf config test.objdump=blah
  # perf config test.objdump
  test.objdump=blah
  # perf test "object code reading"
   26: Object code reading                                             : FAILED!
  # perf test -v "object code reading"
   26: Object code reading                                             :
  --- start ---
  test child forked, pid 600599
  Looking at the vmlinux_path (8 entries long)
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
  Parsing event 'cycles'
  Using CPUID AuthenticAMD-25-21-0
  mmap size 528384B
  Reading object code for memory address: 0x4d9a02
  File is: /home/acme/bin/perf
  On file address is: 0xd9a02
  Objdump command is: blah -z -d --start-address=0x4d9a02 --stop-address=0x4d9a82 /home/acme/bin/perf
  objdump read too few bytes: 128
  Bytes read differ from those read by objdump
  buf1 (dso):
  0x48 0x85 0xff 0x74 0x29 0xe8 0x94 0xdf 0x07 0x00 0x8b 0x73 0x1c 0x48 0x8b 0x43
  0x08 0xeb 0xa5 0x0f 0x1f 0x00 0x48 0x8b 0x45 0xe8 0x64 0x48 0x2b 0x04 0x25 0x28
  0x00 0x00 0x00 0x75 0x0f 0x48 0x8b 0x5d 0xf8 0xc9 0xc3 0x0f 0x1f 0x00 0x48 0x8b
  0x43 0x08 0xeb 0x84 0xe8 0xc5 0x3e 0xf3 0xff 0x0f 0x1f 0x44 0x00 0x00 0x55 0x48
  0x89 0xe5 0x41 0x56 0x41 0x55 0x49 0x89 0xd5 0x41 0x54 0x49 0x89 0xfc 0x53 0x48
  0x89 0xf3 0x48 0x83 0xec 0x30 0x48 0x8b 0x7e 0x20 0x64 0x48 0x8b 0x04 0x25 0x28
  0x00 0x00 0x00 0x48 0x89 0x45 0xd8 0x31 0xc0 0x48 0x89 0x75 0xb0 0x48 0xc7 0x45
  0xb8 0x00 0x00 0x00 0x00 0x48 0xc7 0x45 0xc0 0x00 0x00 0x00 0x00 0xe8 0xad 0xfa

  buf2 (objdump):
  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

  test child finished with -1
  ---- end ----
  Object code reading: FAILED!
  # perf config test.objdump=/usr/bin/objdump
  # perf config test.objdump
  test.objdump=/usr/bin/objdump
  # perf test "object code reading"
   26: Object code reading                                             : Ok
  #

Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20231106151051.129440-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf test: Add option to change objdump binary
James Clark [Mon, 6 Nov 2023 15:10:48 +0000 (15:10 +0000)]
perf test: Add option to change objdump binary

All of the other Perf subcommands that use objdump have an option to
specify the binary, so add the same option to 'perf test'.

This is useful if you have built the kernel with a different toolchain
to the system one, where the system objdump may fail to disassemble
vmlinux.

Now this can be fixed with something like this:

  $ perf test --objdump llvm-objdump "object code reading"

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20231106151051.129440-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf tests offcpu: Adjust test case perf record offcpu profiling tests for s390
Thomas Richter [Mon, 6 Nov 2023 09:16:27 +0000 (10:16 +0100)]
perf tests offcpu: Adjust test case perf record offcpu profiling tests for s390

On s390 using linux-next the test case:

    87: perf record offcpu profiling tests

fails. The root cause is this command

  # ./perf  record --off-cpu -e dummy -- ./perf bench sched messaging -l 10
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

     Total time: 0.231 [sec]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.077 MB perf.data (401 samples) ]
  #

It does not generate 800+ sample entries, on s390 usually around
40[1-9], sometimes a few more, but never more than 450. The higher the
number of CPUs the lower the number of samples.

Looking at function chain:

  bench_sched_messaging()
  +--> group()

the senders and receiver threads are created. The senders and receivers
call function ready() which writes one bytes and wait for a reply using
poll system() call.

As context switches are counted, the function ready() will trigger a
context switch when no input data is available after the write system
call. The write system call does not trigger context switches when the
data size is small. And writing 1000 bytes (10 iterations with
100 bytes) is not much and certainly won't block.

The 400+ context switch on s390 occur when the some receiver/sender
threads call ready() and wait for the response from function
bench_sched_messaging() being kicked off.

Lower the number of expected context switches to 400 to succeed on s390.

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Co-developed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20231106091627.2022530-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf tools: Add the python_ext_build directory to .gitignore
Yang Jihong [Mon, 30 Oct 2023 11:14:38 +0000 (11:14 +0000)]
perf tools: Add the python_ext_build directory to .gitignore

`python_ext_build` is the build directory for python.so, ignore it for
cleaner git status.

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231030111438.1357962-2-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf tests attr: Fix spelling mistake "whic" to "which"
zhaimingbing [Mon, 30 Oct 2023 07:58:25 +0000 (15:58 +0800)]
perf tests attr: Fix spelling mistake "whic" to "which"

There is a spelling mistake, Please fix it.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20231030075825.3701-1-zhaimingbing@cmss.chinamobile.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf annotate: Move offsets array from 'struct annotation' to 'struct annotated_source'
Namhyung Kim [Fri, 3 Nov 2023 19:19:07 +0000 (12:19 -0700)]
perf annotate: Move offsets array from 'struct annotation' to 'struct annotated_source'

The offsets array keeps pointers to 'struct annotation_line' entries
which are available in the 'struct annotated_source'.  Let's move it to
there.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231103191907.54531-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf annotate: Move some source code related fields from 'struct annotation' to ...
Namhyung Kim [Fri, 3 Nov 2023 19:19:06 +0000 (12:19 -0700)]
perf annotate: Move some source code related fields from 'struct annotation' to 'struct annotated_source'

Some fields in the 'struct annotation' are only used with 'struct
annotated_source' so better to be moved there in order to reduce memory
consumption for other symbols.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231103191907.54531-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf annotate: Move max_coverage from 'struct annotation' to 'struct annotated_branch'
Namhyung Kim [Fri, 3 Nov 2023 19:19:05 +0000 (12:19 -0700)]
perf annotate: Move max_coverage from 'struct annotation' to 'struct annotated_branch'

The max_coverage field is only used when branch stack info is available
so it'd be natural to move to 'struct annotated_branch'.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231103191907.54531-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf annotate: Split branch stack cycles info from 'struct annotation'
Namhyung Kim [Fri, 3 Nov 2023 19:19:04 +0000 (12:19 -0700)]
perf annotate: Split branch stack cycles info from 'struct annotation'

The cycles info is only meaningful when sample has branch stacks.  To
save the memory for normal cases, move those fields to a new 'struct
annotated_branch' and dynamically allocate it when needed.  Also move
cycles_hist from annotated_source as it's related here.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231103191907.54531-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf annotate: Split branch stack cycles information out of 'struct annotation_line'
Namhyung Kim [Fri, 3 Nov 2023 19:19:03 +0000 (12:19 -0700)]
perf annotate: Split branch stack cycles information out of 'struct annotation_line'

The cycles info is used only when branch stack is provided.  Separate
them from 'struct annotation_line' into a separate struct and lazy
allocate them to save some memory.

Committer notes:

Make annotation__compute_ipc() check if the lazy allocation works,
bailing out if so, its callers already do error checking and
propagation.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231103191907.54531-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf test: Simplify "object code reading" test
Namhyung Kim [Fri, 3 Nov 2023 19:55:41 +0000 (12:55 -0700)]
perf test: Simplify "object code reading" test

It tries cycles (or cpu-clock on s390) event with exclude_kernel bit to
open.  But other arch on a VM can fail with the hardware event and need
to fallback to the software event in the same way.

So let's get rid of the cpuid check and use generic fallback mechanism
using an array of event candidates.  Now event in the odd index excludes
the kernel so use that for the return value.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: James Clark <james.clark@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20231103195541.67788-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf machine thread: Remove exited threads by default
Ian Rogers [Thu, 2 Nov 2023 17:56:47 +0000 (10:56 -0700)]
perf machine thread: Remove exited threads by default

'struct thread' values hold onto references to mmaps, DSOs, etc. When a
thread exits it is necessary to clean all of this memory up by removing
the thread from the machine's threads. Some tools require this doesn't
happen, such as auxtrace events, 'perf report' if offcpu events exist or
if a task list is being generated, so add a 'struct symbol_conf' member
to make the behavior optional. When an exited thread is left in the
machine's threads, mark it as exited.

This change relates to commit 40826c45eb0b8856 ("perf thread: Remove
notion of dead threads") . Dead threads were removed as they had a
reference count of 0 and were difficult to reason about with the
reference count checker. Here a thread is removed from threads when it
exits, unless via symbol_conf the exited thread isn't remove and is
marked as exited. Reference counting behaves as it normally does.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231102175735.2272696-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf record: Lazy load kernel symbols
Ian Rogers [Thu, 2 Nov 2023 17:56:44 +0000 (10:56 -0700)]
perf record: Lazy load kernel symbols

Commit 5b7ba82a75915e73 ("perf symbols: Load kernel maps before using")
changed it so that loading a kernel DSO would cause the symbols for the
DSO to be eagerly loaded.

For 'perf record' this is overhead as the symbols won't be used. Add a
field to 'struct symbol_conf' to control the behavior and disable it for
'perf record' and 'perf inject'.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Dong <lidong@vivo.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20231102175735.2272696-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf tools: Fix spelling mistake "parametrized" -> "parameterized"
Colin Ian King [Tue, 3 Oct 2023 07:49:11 +0000 (08:49 +0100)]
perf tools: Fix spelling mistake "parametrized" -> "parameterized"

There are spelling mistakes in comments and a pr_debug message. Fix them.

Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20231003074911.220216-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf tools: Add branch counter knob
Kan Liang [Wed, 25 Oct 2023 20:16:26 +0000 (13:16 -0700)]
perf tools: Add branch counter knob

Add a new branch filter, "counter", for the branch counter option. It is
used to mark the events which should be logged in the branch. If it is
applied with the -j option, the counters of all the events should be
logged in the branch. If the legacy kernel doesn't support the new
branch sample type, switching off the branch counter filter.

The stored counter values in each branch are displayed right after the
regular branch stack information via perf report -D.

Usage examples:

  # perf record -e "{branch-instructions,branch-misses}:S" -j any,counter

Only the first event, branch-instructions, collect the LBR. Both
branch-instructions and branch-misses are marked as logged events.  The
occurrences information of them can be found in the branch stack
extension space of each branch.

  # perf record -e "{cpu/branch-instructions,branch_type=any/,cpu/branch-misses,branch_type=counter/}"

Only the first event, branch-instructions, collect the LBR. Only the
branch-misses event is marked as a logged event.

Committer notes:

I noticed 'perf test "Sample parsing"' failing, reported to the list and
Kan provided a patch that checks if the evsel has a leader and that
evsel->evlist is set, the comment in the source code further explains
it.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tinghao Zhang <tinghao.zhang@intel.com>
Link: https://lore.kernel.org/r/20231025201626.3000228-8-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf header: Support num and width of branch counters
Kan Liang [Wed, 25 Oct 2023 20:16:25 +0000 (13:16 -0700)]
perf header: Support num and width of branch counters

To support the branch counters feature, the information of the maximum
number of supported counters and the width of the counters is exposed in
the sysfs caps folder. The perf tool can use the information to parse
the logged counters in each branch.

Store the information in the perf_env for later usage.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tinghao Zhang <tinghao.zhang@intel.com>
Link: https://lore.kernel.org/r/20231025201626.3000228-7-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agotools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel
Kan Liang [Wed, 25 Oct 2023 20:16:24 +0000 (13:16 -0700)]
tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel

Sync the new sample type for the branch counters feature.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tinghao Zhang <tinghao.zhang@intel.com>
Link: https://lore.kernel.org/r/20231025201626.3000228-6-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf build: Warn about missing libelf before warning about missing libbpf
Arnaldo Carvalho de Melo [Wed, 11 Oct 2023 21:14:52 +0000 (18:14 -0300)]
perf build: Warn about missing libelf before warning about missing libbpf

As libelf is a requirement for libbpf if it is not available, as in some
container build tests where NO_LIBELF=1 is used, then better warn about
the most basic library first.

Ditto for libz, check its availability before libbpf too.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZUEehyDk0FkPnvMR@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf tests make: Remove the last egrep call, use 'grep -E' instead
Arnaldo Carvalho de Melo [Fri, 6 Oct 2023 21:11:05 +0000 (18:11 -0300)]
perf tests make: Remove the last egrep call, use 'grep -E' instead

One last case, caught while testing with amazonlinux:2, centos:stream,
etc:

   4     7.28 amazonlinux:2                 : FAIL egrep: warning: egrep is obsolescent; using grep -E
gcc version 7.3.1 20180712 (Red Hat 7.3.1-17) (GCC)
   8    13.87 centos:stream                 : FAIL egrep: warning: egrep is obsolescent; using grep -E

Reviewed-by: Guilherme Amadio <amadio@gentoo.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZUEdtblE8qDAQkBK@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoperf beauty socket/prctl_option: Cope with extended regexp complaint by grep
Arnaldo Carvalho de Melo [Fri, 6 Oct 2023 19:14:54 +0000 (16:14 -0300)]
perf beauty socket/prctl_option: Cope with extended regexp complaint by grep

Noticed on fedora 38, the extended regexp that so far was ok for both
grep and sed now gets complaints by grep, that says '/' doesn't need to
be escaped with '\'.

So stop using '/' in sed, use '%' instead and remove the \ before / in
the common extended regexp.

Link: https://x.com/SMT_Solvers/status/1710380010098344192?s=20
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZUEddFPTJHVLhH%2F6@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
10 months agoMerge tag 'perf-tools-fixes-for-v6.6-2-2023-10-20' into perf-tools-next
Namhyung Kim [Mon, 30 Oct 2023 20:39:28 +0000 (13:39 -0700)]
Merge tag 'perf-tools-fixes-for-v6.6-2-2023-10-20' into perf-tools-next

To get the latest fixes in the perf tools including perf stat output,
dlfilter and LLVM feature detection.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Update tsx_cycles_per_elision metrics
Ian Rogers [Thu, 26 Oct 2023 00:31:49 +0000 (17:31 -0700)]
perf vendor events intel: Update tsx_cycles_per_elision metrics

Update tsx_cycles_per_elision as per:
https://github.com/intel/perfmon/pull/116

Prefer the el-start event rather than cycles-t for detecting whether
the metric will work as HLE may be disabled. Remove the metric from
sapphirerapids that has no el-start event.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20231026003149.3287633-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Update bonnell version number to v5
Ian Rogers [Thu, 26 Oct 2023 00:31:48 +0000 (17:31 -0700)]
perf vendor events intel: Update bonnell version number to v5

Spelling fixes were already incorporated in the Linux perf tree,
update the version number to reflect this.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20231026003149.3287633-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Update westmereex events to v4
Ian Rogers [Thu, 26 Oct 2023 00:31:47 +0000 (17:31 -0700)]
perf vendor events intel: Update westmereex events to v4

Update westmereex events from v3 to v4 fixing a spelling issue.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20231026003149.3287633-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Update meteorlake events to v1.06
Ian Rogers [Thu, 26 Oct 2023 00:31:46 +0000 (17:31 -0700)]
perf vendor events intel: Update meteorlake events to v1.06

Update meteorlake from v1.04 to v1.06 adding the changes from:
https://github.com/intel/perfmon/commit/bc84df043091ec7c98c0629f3d074d9d7a108194
https://github.com/intel/perfmon/commit/405d3ee987d756b5b5d9a64d8a8fa77559822ecf

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20231026003149.3287633-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Update knightslanding events to v16
Ian Rogers [Thu, 26 Oct 2023 00:31:45 +0000 (17:31 -0700)]
perf vendor events intel: Update knightslanding events to v16

Update knightslanding from v10 to v16 adding the changes from:
https://github.com/intel/perfmon/commit/6c1f169f6ed63ee1fd75ebb303d0fd06d71196f5
https://github.com/intel/perfmon/commit/b22ca587ec8b5ac20471ea2f14924f63e63afe9d
https://github.com/intel/perfmon/commit/e685286f083ee81cb7dafd0cd8546c79ee433187

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20231026003149.3287633-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Add typo fix for ivybridge FP
Ian Rogers [Thu, 26 Oct 2023 00:31:44 +0000 (17:31 -0700)]
perf vendor events intel: Add typo fix for ivybridge FP

Add a missed space.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20231026003149.3287633-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Update a spelling in haswell/haswellx
Ian Rogers [Thu, 26 Oct 2023 00:31:43 +0000 (17:31 -0700)]
perf vendor events intel: Update a spelling in haswell/haswellx

The spelling of "in-flight" was switched to "inflight".

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20231026003149.3287633-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Update emeraldrapids to v1.01
Ian Rogers [Thu, 26 Oct 2023 00:31:42 +0000 (17:31 -0700)]
perf vendor events intel: Update emeraldrapids to v1.01

Update emeraldrapids to v1.01 from v1.00 adding the changes from:
https://github.com/intel/perfmon/commit/3993b600e032a9fd443ffd828aab73de7cb167e5

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20231026003149.3287633-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Update alderlake/alderlake events to v1.23
Ian Rogers [Thu, 26 Oct 2023 00:31:41 +0000 (17:31 -0700)]
perf vendor events intel: Update alderlake/alderlake events to v1.23

Update alderlake and alderlaken events from v1.21 to v1.23 adding the
changes from:
https://github.com/intel/perfmon/commit/8df4db9433a2aab59dbbac1a70281032d1af7734
https://github.com/intel/perfmon/commit/846bd247c6e04acc572ca56c992e9e65852bbe63

The tsx_cycles_per_elision metric is updated from PR:
https://github.com/intel/perfmon/pull/116

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20231026003149.3287633-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf build: Disable BPF skeletons if clang version is < 12.0.1
Arnaldo Carvalho de Melo [Fri, 27 Oct 2023 14:18:47 +0000 (11:18 -0300)]
perf build: Disable BPF skeletons if clang version is < 12.0.1

While building on a wide range of distros and clang versions it was
noticed that at least version 12.0.1 (noticed on Alpine 3.15 with
"Alpine clang version 12.0.1") is needed to not fail with BTF generation
errors such as:

Debian:10

  Debian clang version 11.0.1-2~deb10u1:

    CLANG   /tmp/build/perf/util/bpf_skel/.tmp/sample_filter.bpf.o
  <SNIP>
    GENSKEL /tmp/build/perf/util/bpf_skel/sample_filter.skel.h
  libbpf: failed to find BTF for extern 'bpf_cast_to_kern_ctx' [21] section: -2
  Error: failed to open BPF object file: No such file or directory
  make[2]: *** [Makefile.perf:1121: /tmp/build/perf/util/bpf_skel/sample_filter.skel.h] Error 254
  make[2]: *** Deleting file '/tmp/build/perf/util/bpf_skel/sample_filter.skel.h'

Amazon Linux 2:

  clang version 11.1.0 (Amazon Linux 2 11.1.0-1.amzn2.0.2)

    GENSKEL /tmp/build/perf/util/bpf_skel/sample_filter.skel.h
  libbpf: elf: skipping unrecognized data section(18) .eh_frame
  libbpf: elf: skipping relo section(19) .rel.eh_frame for section(18) .eh_frame
  libbpf: failed to find BTF for extern 'bpf_cast_to_kern_ctx' [21] section: -2
  Error: failed to open BPF object file: No such file or directory
  make[2]: *** [/tmp/build/perf/util/bpf_skel/sample_filter.skel.h] Error 254
  make[2]: *** Deleting file `/tmp/build/perf/util/bpf_skel/sample_filter.skel.h'

Ubuntu 20.04:

  clang version 10.0.0-4ubuntu1

    CLANG   /tmp/build/perf/util/bpf_skel/.tmp/augmented_raw_syscalls.bpf.o
    GENSKEL /tmp/build/perf/util/bpf_skel/bench_uprobe.skel.h
    GENSKEL /tmp/build/perf/util/bpf_skel/bperf_leader.skel.h
  libbpf: sec '.reluprobe': corrupted symbol #27 pointing to invalid section #65522 for relo #0
    GENSKEL /tmp/build/perf/util/bpf_skel/bperf_follower.skel.h
  Error: failed to open BPF object file: BPF object format invalid
  make[2]: *** [Makefile.perf:1121: /tmp/build/perf/util/bpf_skel/bench_uprobe.skel.h] Error 95
  make[2]: *** Deleting file '/tmp/build/perf/util/bpf_skel/bench_uprobe.skel.h'

So check if the version is at least 12.0.1 otherwise disable building
BPF skels and provide a message about it, continuing the build.

The message, when running on amazonlinux:2:

  Makefile.config:698: Warning: Disabled BPF skeletons as reliable BTF generation needs at least clang version 12.0.1

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/ZTvGx/Ou6BVnYBqi@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf callchain: Fix spelling mistake "statisitcs" -> "statistics"
Colin Ian King [Fri, 27 Oct 2023 08:46:33 +0000 (09:46 +0100)]
perf callchain: Fix spelling mistake "statisitcs" -> "statistics"

There are a couple of spelling mistakes in perror messages. Fix them.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20231027084633.1167530-1-colin.i.king@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf report: Fix spelling mistake "heirachy" -> "hierarchy"
Colin Ian King [Fri, 27 Oct 2023 08:40:11 +0000 (09:40 +0100)]
perf report: Fix spelling mistake "heirachy" -> "hierarchy"

There is a spelling mistake in a ui error message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20231027084011.1167091-1-colin.i.king@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf python: Fix binding linkage due to rename and move of evsel__increase_rlimit()
Arnaldo Carvalho de Melo [Fri, 27 Oct 2023 13:33:30 +0000 (10:33 -0300)]
perf python: Fix binding linkage due to rename and move of evsel__increase_rlimit()

The changes in ("perf evsel: Rename evsel__increase_rlimit to
rlimit__increase_nofile") ended up breaking the python binding that now
references the rlimit__increase_nofile function, add the util/rlimit.o
to the tools/perf/util/python-ext-sources to cure that.

This was detected by the 'perf test python' regression test:

  $ perf test python
   14: 'import perf' in python        : FAILED!

  $ perf test -v python
  Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
   14: 'import perf' in python                                         :
  --- start ---
  test child forked, pid 2912462
  python usage test: "echo "import sys ; sys.path.insert(0, '/tmp/build/perf-tools-next/python'); import perf" | '/usr/bin/python3' "
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ImportError: /tmp/build/perf-tools-next/python/perf.cpython-311-x86_64-linux-gnu.so: undefined symbol: rlimit__increase_nofile
  test child finished with -1
  ---- end ----
  'import perf' in python: FAILED!
  $

Fixes: e093a222d7cba1eb ("perf evsel: Rename evsel__increase_rlimit to rlimit__increase_nofile")
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/lkml/ZTrCS5Z3PZAmfPdV@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf tests: test_arm_coresight: Simplify source iteration
James Clark [Mon, 23 Oct 2023 13:15:49 +0000 (14:15 +0100)]
perf tests: test_arm_coresight: Simplify source iteration

There are two reasons to do this, firstly there is a shellcheck warning
in cs_etm_dev_name(), which can be completely deleted. And secondly the
current iteration method doesn't support systems with both ETE and ETM
because it picks one or the other. There isn't a known system with this
configuration, but it could happen in the future.

Iterating over all the sources for each CPU can be done by going through
/sys/bus/event_source/devices/cs_etm/cpu* and following the symlink back
to the Coresight device in /sys/bus/coresight/devices. This will work
whether the device is ETE, ETM or any future name, and is much simpler
and doesn't require any hard coded version numbers

Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: tianruidong@linux.alibaba.com
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: atrajeev@linux.vnet.ibm.com
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20231023131550.487760-1-james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Add tigerlake two metrics
Ian Rogers [Tue, 26 Sep 2023 20:59:48 +0000 (13:59 -0700)]
perf vendor events intel: Add tigerlake two metrics

Add tma_info_system_socket_clks and uncore_freq metrics.

The associated converter script fix is in:
https://github.com/intel/perfmon/pull/112

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Link: https://lore.kernel.org/r/20230926205948.1399594-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Add broadwellde two metrics
Ian Rogers [Tue, 26 Sep 2023 20:59:47 +0000 (13:59 -0700)]
perf vendor events intel: Add broadwellde two metrics

Add tma_info_system_socket_clks and uncore_freq metrics that require a
broadwellx style uncore event for UNC_CLOCK.

The associated converter script fix is in:
https://github.com/intel/perfmon/pull/112

Fixes: 7d124303d620 ("perf vendor events intel: Update broadwell variant events/metrics")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Link: https://lore.kernel.org/r/20230926205948.1399594-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric
Ian Rogers [Tue, 26 Sep 2023 03:10:34 +0000 (20:10 -0700)]
perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric

Broadwell-de has a consumer core and server uncore. The uncore_arb PMU
isn't present and the broadwellx style cbox PMU should be used
instead. Fix the tma_info_system_dram_bw_use metric to use the server
metric rather than client.

The associated converter script fix is in:
https://github.com/intel/perfmon/pull/111

Fixes: 7d124303d620 ("perf vendor events intel: Update broadwell variant events/metrics")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Link: https://lore.kernel.org/r/20230926031034.1201145-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit
Ian Rogers [Tue, 24 Oct 2023 22:23:14 +0000 (15:23 -0700)]
perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit

Fix leak where mem_info__put wouldn't release the maps/map as used by
perf mem. Add exit functions and use elsewhere that the maps and map
are released.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf callchain: Minor layout changes to callchain_list
Ian Rogers [Tue, 24 Oct 2023 22:23:13 +0000 (15:23 -0700)]
perf callchain: Minor layout changes to callchain_list

Avoid 6 byte hole for padding. Place more frequently used fields
first in an attempt to use just 1 cacheline in the common case.

Before:
```
struct callchain_list {
        u64                        ip;                   /*     0     8 */
        struct map_symbol          ms;                   /*     8    24 */
        struct {
                _Bool              unfolded;             /*    32     1 */
                _Bool              has_children;         /*    33     1 */
        };                                               /*    32     2 */

        /* XXX 6 bytes hole, try to pack */

        u64                        branch_count;         /*    40     8 */
        u64                        from_count;           /*    48     8 */
        u64                        predicted_count;      /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u64                        abort_count;          /*    64     8 */
        u64                        cycles_count;         /*    72     8 */
        u64                        iter_count;           /*    80     8 */
        u64                        iter_cycles;          /*    88     8 */
        struct branch_type_stat *  brtype_stat;          /*    96     8 */
        const char  *              srcline;              /*   104     8 */
        struct list_head           list;                 /*   112    16 */

        /* size: 128, cachelines: 2, members: 13 */
        /* sum members: 122, holes: 1, sum holes: 6 */
};
```

After:
```
struct callchain_list {
        struct list_head           list;                 /*     0    16 */
        u64                        ip;                   /*    16     8 */
        struct map_symbol          ms;                   /*    24    24 */
        const char  *              srcline;              /*    48     8 */
        u64                        branch_count;         /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u64                        from_count;           /*    64     8 */
        u64                        cycles_count;         /*    72     8 */
        u64                        iter_count;           /*    80     8 */
        u64                        iter_cycles;          /*    88     8 */
        struct branch_type_stat *  brtype_stat;          /*    96     8 */
        u64                        predicted_count;      /*   104     8 */
        u64                        abort_count;          /*   112     8 */
        struct {
                _Bool              unfolded;             /*   120     1 */
                _Bool              has_children;         /*   121     1 */
        };                                               /*   120     2 */

        /* size: 128, cachelines: 2, members: 13 */
        /* padding: 6 */
};
```

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf callchain: Make brtype_stat in callchain_list optional
Ian Rogers [Tue, 24 Oct 2023 22:23:12 +0000 (15:23 -0700)]
perf callchain: Make brtype_stat in callchain_list optional

struct callchain_list is 352bytes in size, 232 of which are
brtype_stat. brtype_stat is only used for certain callchain_list
items so make it optional, allocating when necessary. So that
printing doesn't need to deal with an optional brtype_stat, pass
an empty/zero version.

Before:
```
struct callchain_list {
        u64                        ip;                   /*     0     8 */
        struct map_symbol          ms;                   /*     8    24 */
        struct {
                _Bool              unfolded;             /*    32     1 */
                _Bool              has_children;         /*    33     1 */
        };                                               /*    32     2 */

        /* XXX 6 bytes hole, try to pack */

        u64                        branch_count;         /*    40     8 */
        u64                        from_count;           /*    48     8 */
        u64                        predicted_count;      /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u64                        abort_count;          /*    64     8 */
        u64                        cycles_count;         /*    72     8 */
        u64                        iter_count;           /*    80     8 */
        u64                        iter_cycles;          /*    88     8 */
        struct branch_type_stat    brtype_stat;          /*    96   232 */
        /* --- cacheline 5 boundary (320 bytes) was 8 bytes ago --- */
        const char  *              srcline;              /*   328     8 */
        struct list_head           list;                 /*   336    16 */

        /* size: 352, cachelines: 6, members: 13 */
        /* sum members: 346, holes: 1, sum holes: 6 */
        /* last cacheline: 32 bytes */
};
```

After:
```
struct callchain_list {
        u64                        ip;                   /*     0     8 */
        struct map_symbol          ms;                   /*     8    24 */
        struct {
                _Bool              unfolded;             /*    32     1 */
                _Bool              has_children;         /*    33     1 */
        };                                               /*    32     2 */

        /* XXX 6 bytes hole, try to pack */

        u64                        branch_count;         /*    40     8 */
        u64                        from_count;           /*    48     8 */
        u64                        predicted_count;      /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u64                        abort_count;          /*    64     8 */
        u64                        cycles_count;         /*    72     8 */
        u64                        iter_count;           /*    80     8 */
        u64                        iter_cycles;          /*    88     8 */
        struct branch_type_stat *  brtype_stat;          /*    96     8 */
        const char  *              srcline;              /*   104     8 */
        struct list_head           list;                 /*   112    16 */

        /* size: 128, cachelines: 2, members: 13 */
        /* sum members: 122, holes: 1, sum holes: 6 */
};
```

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf callchain: Make display use of branch_type_stat const
Ian Rogers [Tue, 24 Oct 2023 22:23:11 +0000 (15:23 -0700)]
perf callchain: Make display use of branch_type_stat const

Display code doesn't modify the branch_type_stat so switch uses to
const. This is done to aid refactoring struct callchain_list where
current the branch_type_stat is embedded even if not used.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf offcpu: Add missed btf_free
Ian Rogers [Tue, 24 Oct 2023 22:23:10 +0000 (15:23 -0700)]
perf offcpu: Add missed btf_free

Caught by address/leak sanitizer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf threads: Remove unused dead thread list
Ian Rogers [Tue, 24 Oct 2023 22:23:09 +0000 (15:23 -0700)]
perf threads: Remove unused dead thread list

Commit 40826c45eb0b ("perf thread: Remove notion of dead threads")
removed dead threads but the list head wasn't removed. Remove it here.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agoperf hist: Add missing puts to hist__account_cycles
Ian Rogers [Tue, 24 Oct 2023 22:23:08 +0000 (15:23 -0700)]
perf hist: Add missing puts to hist__account_cycles

Caught using reference count checking on perf top with
"--call-graph=lbr". After this no memory leaks were detected.

Fixes: 57849998e2cd ("perf report: Add processing for cycle histograms")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agolibperf rc_check: Add RC_CHK_EQUAL
Ian Rogers [Tue, 24 Oct 2023 22:23:07 +0000 (15:23 -0700)]
libperf rc_check: Add RC_CHK_EQUAL

Comparing pointers with reference count checking is tricky to avoid a
SEGV. Add a convenience macro to simplify and use.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 months agolibperf rc_check: Make implicit enabling work for GCC
Ian Rogers [Tue, 24 Oct 2023 22:23:06 +0000 (15:23 -0700)]
libperf rc_check: Make implicit enabling work for GCC

Make the implicit REFCOUNT_CHECKING robust to when building with GCC.

Fixes: 9be6ab181b7b ("libperf rc_check: Enable implicitly with sanitizers")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>