.. SPDX-License-Identifier: GPL-2.0

.. _perf_index:

====
Perf
====

Perf Event Attributes
=====================

:Author: Andrew Murray <andrew.murray@arm.com>
:Date: 2019-03-06

exclude_user
------------

This attribute excludes userspace.

Userspace always runs at EL0, so this attribute excludes EL0.


exclude_kernel
--------------

This attribute excludes the kernel.

The host kernel runs at EL2 with VHE and at EL1 without it. Guest kernels
always run at EL1.

For the host, this attribute excludes EL1 and, on a VHE system, also EL2.

For the guest, this attribute excludes EL1. Note that EL2 is never counted
within a guest.


exclude_hv
----------

This attribute excludes the hypervisor.

For a VHE host, this attribute is ignored because we consider the host kernel
to be the hypervisor.

For a non-VHE host, this attribute excludes EL2 because we consider the
hypervisor to be any code that runs at EL2, which is predominantly used for
guest/host transitions.

For the guest, this attribute has no effect. Note that EL2 is never counted
within a guest.
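
The rules for exclude_user, exclude_kernel and exclude_hv can be summarised as
a small model. The sketch below is only a restatement of the prose above, not
the driver's implementation: the function name, the bitmask values and the
parameters are hypothetical, and exclude_host/exclude_guest are deliberately
left out.

.. code-block:: c

  #include <assert.h>
  #include <stdbool.h>

  /* Exception levels as a bitmask (illustrative values, not hardware fields). */
  enum { EL0 = 1 << 0, EL1 = 1 << 1, EL2 = 1 << 2 };

  /*
   * Return the exception levels that remain countable after applying
   * exclude_user, exclude_kernel and exclude_hv as described above.
   */
  static unsigned int countable_els(bool in_guest, bool vhe, bool exclude_user,
                                    bool exclude_kernel, bool exclude_hv)
  {
      unsigned int els = EL0 | EL1 | EL2;

      if (exclude_user)
          els &= ~EL0;
      if (exclude_kernel)
          els &= ~EL1;

      if (in_guest) {
          els &= ~EL2;        /* EL2 is never counted within a guest */
      } else if (vhe) {
          if (exclude_kernel)
              els &= ~EL2;    /* the VHE host kernel runs at EL2 */
          /* exclude_hv is ignored on a VHE host */
      } else if (exclude_hv) {
          els &= ~EL2;        /* non-VHE: hypervisor code runs at EL2 */
      }
      return els;
  }

  /* A few cases checked against the text above. */
  static void countable_els_examples(void)
  {
      /* VHE host, exclude_kernel: both EL1 and EL2 are dropped. */
      assert(countable_els(false, true, false, true, false) == EL0);
      /* Non-VHE host, exclude_hv: only EL2 is dropped. */
      assert(countable_els(false, false, false, false, true) == (EL0 | EL1));
      /* Guest, nothing excluded: EL2 is still never counted. */
      assert(countable_els(true, false, false, false, false) == (EL0 | EL1));
  }

Calling ``countable_els_examples()`` from a test program checks the model
against the three cases spelled out in its comments.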


exclude_host / exclude_guest
----------------------------

These attributes exclude the KVM host and guest, respectively.

The KVM host may run at EL0 (userspace), EL1 (non-VHE kernel) and EL2 (VHE
kernel or non-VHE hypervisor).

The KVM guest may run at EL0 (userspace) and EL1 (kernel).

Because the host and guests use overlapping exception levels, we cannot rely
exclusively on the PMU's hardware exception level filtering. We must also
enable/disable counting on entry to and exit from the guest. This is
performed differently on VHE and non-VHE systems.

For non-VHE systems, we exclude EL2 for exclude_host. Upon entering and
exiting the guest, we disable/enable the event as appropriate based on the
exclude_host and exclude_guest attributes.

For VHE systems, we exclude EL1 for exclude_guest and exclude both EL0 and
EL2 for exclude_host. Upon entering and exiting the guest, we modify the
event to include/exclude EL0 as appropriate based on the exclude_host and
exclude_guest attributes.

The statements above also apply when these attributes are used within a
non-VHE guest; however, note that EL2 is never counted within a guest.
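
The static part of this filtering can be written down as a lookup. The sketch
below only transcribes the two "we exclude ..." sentences above. It does not
model the changes made at guest entry and exit, and the function name and
bitmask values are hypothetical rather than taken from the KVM or PMU driver
code.

.. code-block:: c

  #include <assert.h>
  #include <stdbool.h>

  /* Exception levels as a bitmask (illustrative values, not hardware fields). */
  enum { EL0 = 1 << 0, EL1 = 1 << 1, EL2 = 1 << 2 };

  /*
   * Exception levels excluded up front for exclude_host/exclude_guest.
   * The per-entry/exit adjustments described above are not modelled.
   */
  static unsigned int kvm_excluded_els(bool vhe, bool exclude_host,
                                       bool exclude_guest)
  {
      unsigned int excluded = 0;

      if (vhe) {
          if (exclude_guest)
              excluded |= EL1;        /* guest kernel level */
          if (exclude_host)
              excluded |= EL0 | EL2;  /* EL0 is revisited at entry/exit */
      } else if (exclude_host) {
          excluded |= EL2;            /* non-VHE hypervisor level */
      }
      return excluded;
  }

  /* Cases read directly from the two paragraphs above. */
  static void kvm_excluded_els_examples(void)
  {
      assert(kvm_excluded_els(false, true, false) == EL2);
      assert(kvm_excluded_els(true, true, false) == (EL0 | EL2));
      assert(kvm_excluded_els(true, false, true) == EL1);
  }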


Accuracy
--------

On non-VHE hosts, we enable/disable counters on the entry/exit of the
host/guest transition at EL2. However, there is a period of time between
enabling/disabling the counters and entering/exiting the guest. When counting
guest events, we can prevent counters from counting host events on the
boundaries of guest entry/exit by filtering out EL2 for exclude_host.
However, when using !exclude_hv, there is a small blackout window at guest
entry/exit where host events are not captured.

On VHE systems, there are no blackout windows.

Perf Userspace PMU Hardware Counter Access
==========================================

Overview
--------
The perf userspace tool relies on the PMU to monitor events. It offers an
abstraction layer over the hardware counters since the underlying
implementation is CPU-dependent.
Arm64 allows userspace tools to directly access the registers storing the
hardware counters' values.

This specifically targets self-monitoring tasks, reducing overhead by
accessing the registers directly without having to go through the kernel.

How-to
------
The focus is on the Armv8 PMUv3 support, which ensures that access to the PMU
registers is enabled and that userspace has access to the information needed
to use them.

In order to have access to the hardware counters, the global sysctl
kernel/perf_user_access must first be enabled:

.. code-block:: sh

  echo 1 > /proc/sys/kernel/perf_user_access

The event must be opened using the perf tool interface with the config1:1
attr bit set. The sys_perf_event_open syscall returns an fd which can
subsequently be used with the mmap syscall to retrieve a page of memory
containing information about the event. The PMU driver uses this page to
expose the hardware counter's index and other necessary data to the user.
Using this index, the user can access the PMU registers with the ``mrs``
instruction. Access to the PMU registers is only valid while the sequence
lock is unchanged. In particular, the PMSELR_EL0 register is zeroed each time
the sequence lock is changed.
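
A minimal sketch of this sequence, assuming the Linux UAPI header
``linux/perf_event.h`` and a GCC-compatible compiler. The names
``make_user_access_attr``, ``read_user_counter``, ``read_hw_counter_fn`` and
``fake_counter`` are hypothetical. The callback stands in for the arm64
``mrs`` read so that the retry loop can be exercised on any machine, and the
generic cycles event and exclude_kernel setting are illustrative choices only.

.. code-block:: c

  #include <assert.h>
  #include <stdint.h>
  #include <string.h>
  #include <linux/perf_event.h>

  /* Hypothetical callback: return the raw value of hardware counter idx. */
  typedef uint64_t (*read_hw_counter_fn)(uint32_t idx);

  /* Stand-in for the mrs-based read, used for demonstration only. */
  static uint64_t fake_counter(uint32_t idx)
  {
      return 1000 + idx;
  }

  /* Build an attr that requests userspace counter access (config1 bit 1). */
  static struct perf_event_attr make_user_access_attr(void)
  {
      struct perf_event_attr attr;

      memset(&attr, 0, sizeof(attr));
      attr.size = sizeof(attr);
      attr.type = PERF_TYPE_HARDWARE;          /* illustrative event choice */
      attr.config = PERF_COUNT_HW_CPU_CYCLES;
      attr.exclude_kernel = 1;                 /* illustrative: EL0 only */
      attr.config1 = 1ULL << 1;                /* config1:1 = user access */
      return attr;
  }

  /*
   * Read the event count from the mmap'ed user page. The read is retried
   * until the sequence lock is unchanged across it. Returns -1 when direct
   * access is unavailable and the caller should fall back to read().
   */
  static int read_user_counter(const volatile struct perf_event_mmap_page *pc,
                               read_hw_counter_fn read_hw, uint64_t *count)
  {
      uint32_t seq, idx;
      uint64_t val;

      assert(pc && read_hw && count);
      do {
          seq = pc->lock;
          __sync_synchronize();       /* GCC/Clang builtin barrier */
          idx = pc->index;
          if (!pc->cap_user_rdpmc || idx == 0)
              return -1;
          /* index is 1-based; 0 means no counter is assigned */
          val = (uint64_t)pc->offset + read_hw(idx - 1);
          __sync_synchronize();
      } while (pc->lock != seq);

      *count = val;
      return 0;
  }

In a real program the attr would be passed to the perf_event_open syscall, the
returned fd would be mapped with mmap, and the callback would read the counter
selected by the index. The raw value should additionally be limited to the
width reported in the ``pmc_width`` field of the user page.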

The userspace access is supported in libperf using the perf_evsel__mmap()
and perf_evsel__read() functions. See `tools/lib/perf/tests/test-evsel.c`_ for
an example.

About heterogeneous systems
---------------------------
On heterogeneous systems such as big.LITTLE, userspace PMU counter access can
only be enabled when the tasks are pinned to a homogeneous subset of cores and
the corresponding PMU instance is opened by specifying the 'type' attribute.
The use of generic event types is not supported in this case.

Have a look at `tools/perf/arch/arm64/tests/user-events.c`_ for an example. It
can be run using the perf tool to check that the access to the registers works
correctly from userspace:

.. code-block:: sh

  perf test -v user
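
One way to obtain the 'type' value for a specific PMU instance is to read it
from sysfs, where each PMU exposes a ``type`` file under
``/sys/bus/event_source/devices/``. The helper below is a hypothetical sketch.
The instance directory name differs between platforms, so the caller supplies
the full path.

.. code-block:: c

  #include <assert.h>
  #include <stdio.h>

  /*
   * Read the numeric PMU type from a sysfs "type" file.
   * Returns the type, or -1 if the file cannot be opened or parsed.
   */
  static int read_pmu_type(const char *type_path)
  {
      FILE *f;
      int type = -1;

      assert(type_path != NULL);
      f = fopen(type_path, "r");
      if (!f)
          return -1;
      if (fscanf(f, "%d", &type) != 1)
          type = -1;
      fclose(f);
      return type;
  }

The returned value would be placed in the ``type`` field of the
perf_event_attr, with ``config`` set to an event number understood by that
PMU instead of a generic event type.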

About chained events and counter sizes
--------------------------------------
The user can request either a 32-bit (config1:0 == 0) or 64-bit (config1:0 == 1)
counter along with userspace access. The sys_perf_event_open syscall will fail
if a 64-bit counter is requested and the hardware doesn't support 64-bit
counters. Chained events are not supported in conjunction with userspace counter
access. If a 32-bit counter is requested on hardware with 64-bit counters, then
userspace must treat the upper 32 bits read from the counter as UNKNOWN. The
'pmc_width' field in the user page will indicate the valid width of the counter
and should be used to mask the upper bits as needed.
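
The masking step can be sketched as follows. ``mask_counter`` is a
hypothetical helper name; it assumes the raw value has already been read from
the counter and that ``pmc_width`` comes from the user page.

.. code-block:: c

  #include <assert.h>
  #include <stdint.h>

  /* Keep only the low pmc_width bits of a raw counter value. */
  static uint64_t mask_counter(uint64_t raw, unsigned int pmc_width)
  {
      assert(pmc_width >= 1 && pmc_width <= 64);
      if (pmc_width == 64)
          return raw;     /* avoid an undefined 64-bit shift */
      return raw & ((UINT64_C(1) << pmc_width) - 1);
  }

For example, with a 32-bit counter the upper half of a raw 64-bit read is
discarded, while a 64-bit counter value is returned unchanged.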

.. Links
.. _tools/perf/arch/arm64/tests/user-events.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/tests/user-events.c
.. _tools/lib/perf/tests/test-evsel.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/lib/perf/tests/test-evsel.c