perf trace: Implement syscall summary in BPF
authorNamhyung Kim <namhyung@kernel.org>
Wed, 26 Mar 2025 04:40:00 +0000 (21:40 -0700)
committerArnaldo Carvalho de Melo <acme@redhat.com>
Mon, 28 Apr 2025 19:53:11 +0000 (16:53 -0300)
commit1bec43f5239dddd60565553cf2b6a98f416500ae
treee316437fef9b9dd42fd7c60e72ac40676315775d
parent85447f68a1e3e6a237e3052a60a70a1593604e26
perf trace: Implement syscall summary in BPF

When -s/--summary option is used, it doesn't need (augmented) arguments
of syscalls.  Let's skip the augmentation and load another small BPF
program to collect the statistics in the kernel instead of copying the
data to the ring-buffer to calculate the stats in userspace.  This will
be much more light-weight than the existing approach and remove any lost
events.

Let's add a new option --bpf-summary to control this behavior.  I cannot
make it default because there's no way to get e_machine in the BPF which
is needed for detecting different ABIs like 32-bit compat mode.

No functional changes intended except for no more LOST events. :)

  $ sudo ./perf trace -as --summary-mode=total --bpf-summary sleep 1

   Summary of events:

   total, 6194 events

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     epoll_wait           561      0  4530.843     0.000     8.076   520.941     18.75%
     futex                693     45  4317.231     0.000     6.230   500.077     21.98%
     poll                 300      0  1040.109     0.000     3.467   120.928     17.02%
     clock_nanosleep        1      0  1000.172  1000.172  1000.172  1000.172      0.00%
     ppoll                360      0   872.386     0.001     2.423   253.275     41.91%
     epoll_pwait           14      0   384.349     0.001    27.453   380.002     98.79%
     pselect6              14      0   108.130     7.198     7.724     8.206      0.85%
     nanosleep             39      0    43.378     0.069     1.112    10.084     44.23%
     ...

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250326044001.3503432-1-namhyung@kernel.org
[ Added fixup sent from Namhyung in response to my report to make it also dependent on CONFIG_TRACE ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools/perf/Documentation/perf-trace.txt
tools/perf/Makefile.perf
tools/perf/builtin-trace.c
tools/perf/util/Build
tools/perf/util/bpf-trace-summary.c [new file with mode: 0644]
tools/perf/util/bpf_skel/syscall_summary.bpf.c [new file with mode: 0644]
tools/perf/util/bpf_skel/syscall_summary.h [new file with mode: 0644]
tools/perf/util/trace.h [new file with mode: 0644]