Intel hybrid support
--------------------
Support for Intel hybrid events within perf tools.

Some Intel platforms, such as AlderLake, are hybrid platforms that
consist of atom cpus and core cpus. Each cpu type has a dedicated event
list. Some events are available only on core cpus, some only on atom
cpus, and some on both.

The kernel exports two new cpu pmus via sysfs:
/sys/devices/cpu_core
/sys/devices/cpu_atom

A 'cpus' file is created under each directory. For example,

cat /sys/devices/cpu_core/cpus
0-15

cat /sys/devices/cpu_atom/cpus
16-23

This indicates that cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus.
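
The contents of a 'cpus' file can be expanded into individual cpu numbers
with a few lines of script. The following is a minimal sketch, not part of
perf; the helper name parse_cpu_list is hypothetical, and it assumes the
file holds comma-separated numbers and ranges such as "0-15" or "0-3,8".

```python
def parse_cpu_list(text):
    """Expand a cpu list such as "0-15" or "0-3,8" into a sorted list of ints."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue
        if "-" in chunk:
            # A range "first-last" is inclusive on both ends.
            first, last = chunk.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(chunk))
    return sorted(cpus)


# Strings taken from the example output above.
core_cpus = parse_cpu_list("0-15\n")
atom_cpus = parse_cpu_list("16-23\n")
print(len(core_cpus), len(atom_cpus))  # prints: 16 8
```

On a real system the strings would be read from
/sys/devices/cpu_core/cpus and /sys/devices/cpu_atom/cpus.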

As before, use perf-list to list the symbolic events.

perf list

inst_retired.any
 [Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
inst_retired.any
 [Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core]

'Unit: xxx' is added to the brief description to indicate which pmu
the event belongs to. The same event name can be supported on
different pmus.

Enable hybrid event with a specific pmu

To enable a core-only or atom-only event, the following syntax is supported:

 cpu_core/<event name>/
or
 cpu_atom/<event name>/

For example, count the 'cycles' event on core cpus.

 perf stat -e cpu_core/cycles/

Create two events for one hardware event automatically

When an event is created without a specified pmu and the event is
available on both atom and core, two events are created automatically:
one for atom and one for core. Most hardware events and cache events are
available on both cpu_core and cpu_atom.

Hardware events have pre-defined configs (e.g. 0 for cycles). But on a
hybrid platform, the kernel needs to know which pmu the event is for
(atom or core). The original perf event type PERF_TYPE_HARDWARE
can't carry pmu information, so this type is now extended to be a PMU
aware type. The PMU type ID is stored at attr.config[63:32].

The PMU type ID is retrieved from sysfs.
/sys/devices/cpu_atom/type
/sys/devices/cpu_core/type
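
Reading the two 'type' files can be sketched as follows. This is an
illustration only: the function name read_pmu_types and its parameters are
hypothetical, and it assumes each 'type' file contains a single decimal
integer. A pmu whose file cannot be read (for example on a non-hybrid
machine) is skipped.

```python
import os


def read_pmu_types(sysfs_devices="/sys/devices", pmus=("cpu_core", "cpu_atom")):
    """Return {pmu name: type ID} for every pmu whose 'type' file is readable."""
    types = {}
    for pmu in pmus:
        path = os.path.join(sysfs_devices, pmu, "type")
        try:
            with open(path) as f:
                types[pmu] = int(f.read().strip())
        except OSError:
            # No such pmu on this machine (or not readable): leave it out.
            continue
    return types


# On a non-hybrid machine this prints an empty dict.
print(read_pmu_types())
```

The sysfs_devices parameter exists only so the function can be pointed at a
different directory, for example when testing.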

The new attr.config layout for PERF_TYPE_HARDWARE:

PERF_TYPE_HARDWARE:                 0xEEEEEEEE000000AA
                                    AA: hardware event ID
                                    EEEEEEEE: PMU type ID
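
As an illustration of this layout, the config value can be composed with a
shift and an OR. This is a sketch under the layout given above; the function
and constant names are made up for the example and are not perf or kernel
identifiers.

```python
PMU_TYPE_SHIFT = 32        # PMU type ID lives in config[63:32]
HW_EVENT_ID_MASK = 0xFF    # AA: hardware event ID in the low byte


def hw_config(pmu_type, event_id):
    """Build an attr.config value for a PMU aware PERF_TYPE_HARDWARE event."""
    if not 0 <= pmu_type <= 0xFFFFFFFF:
        raise ValueError("PMU type ID must fit in 32 bits")
    if not 0 <= event_id <= HW_EVENT_ID_MASK:
        raise ValueError("hardware event ID must fit in one byte")
    return (pmu_type << PMU_TYPE_SHIFT) | event_id


# 'cycles' has the pre-defined config 0; the PMU type ID 4 is an example value.
print(hex(hw_config(4, 0)))  # prints: 0x400000000
```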

Cache events are similar. The type PERF_TYPE_HW_CACHE is extended to be
a PMU aware type. The PMU type ID is stored at attr.config[63:32].

The new attr.config layout for PERF_TYPE_HW_CACHE:

PERF_TYPE_HW_CACHE:                 0xEEEEEEEE00DDCCBB
                                    BB: hardware cache ID
                                    CC: hardware cache op ID
                                    DD: hardware cache op result ID
                                    EEEEEEEE: PMU type ID
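
The cache layout can be packed and unpacked the same way. The sketch below
follows the byte positions listed above; the helper names are hypothetical
and the numeric IDs in the example are arbitrary values chosen to make each
field visible, not real event encodings.

```python
def hw_cache_config(pmu_type, cache_id, op_id, result_id):
    """Pack the fields as 0xEEEEEEEE00DDCCBB."""
    for value in (cache_id, op_id, result_id):
        if not 0 <= value <= 0xFF:
            raise ValueError("cache, op and result IDs must each fit in one byte")
    if not 0 <= pmu_type <= 0xFFFFFFFF:
        raise ValueError("PMU type ID must fit in 32 bits")
    return (pmu_type << 32) | (result_id << 16) | (op_id << 8) | cache_id


def split_hw_cache_config(config):
    """Return (pmu_type, cache_id, op_id, result_id) from a packed config."""
    return (
        config >> 32,            # EEEEEEEE: PMU type ID
        config & 0xFF,           # BB: hardware cache ID
        (config >> 8) & 0xFF,    # CC: hardware cache op ID
        (config >> 16) & 0xFF,   # DD: hardware cache op result ID
    )


config = hw_cache_config(pmu_type=4, cache_id=2, op_id=1, result_id=1)
print(hex(config))  # prints: 0x400010102
```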

When enabling a hardware event without a specified pmu, such as
perf stat -e cycles -a (system-wide in this example), two events
are created automatically.

  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------

and

  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------

type 0 is PERF_TYPE_HARDWARE.
0x4 in 0x400000000 indicates it's the cpu_core pmu.
0x8 in 0x800000000 indicates it's the cpu_atom pmu (the atom pmu type id is
not fixed; it is assigned dynamically).

The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus),
and creates 'cycles' (0x800000000) on cpu16-cpu23 (atom cpus).
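
Going the other way, the two config values above can be split back into a
PMU type ID and an event ID. This is a sketch for illustration; the mapping
of type IDs to pmu names is hard-coded from this example (4 and 8) and would
differ on a system where the atom pmu gets another ID.

```python
# Type IDs as they appear in the example above; not fixed values.
EXAMPLE_PMU_TYPES = {4: "cpu_core", 8: "cpu_atom"}


def split_hw_config(config):
    """Return (pmu_type, event_id) from a PMU aware PERF_TYPE_HARDWARE config."""
    return config >> 32, config & 0xFFFFFFFF


for config in (0x400000000, 0x800000000):
    pmu_type, event_id = split_hw_config(config)
    name = EXAMPLE_PMU_TYPES.get(pmu_type, "unknown")
    print(hex(config), name, event_id)
```

For both values the event ID is 0, which is the pre-defined config for
cycles.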

The perf-stat result displays two events:

 Performance counter stats for 'system wide':

         6,744,979      cpu_core/cycles/
         1,965,552      cpu_atom/cycles/

The first 'cycles' is the core event, the second 'cycles' is the atom event.

Thread mode example:

perf-stat reports the scaled counts for hybrid events, with a percentage
displayed. The percentage is the event's running time divided by its
enabled time.
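
The relation between the raw count, the scaled count and the percentage can
be sketched as below. It assumes the usual extrapolation, in which the raw
count is multiplied by enabled time over running time; the function name and
the sample numbers are made up for the example and are not taken from a real
run.

```python
def scale_count(raw_count, time_enabled, time_running):
    """Return (scaled_count, percentage) for one event.

    percentage = 100 * time_running / time_enabled
    scaled     = raw_count * time_enabled / time_running
    """
    if time_enabled == 0 or time_running == 0:
        # The event never ran, so there is nothing to extrapolate from.
        return 0, 0.0
    scaled = round(raw_count * time_enabled / time_running)
    percentage = 100.0 * time_running / time_enabled
    return scaled, percentage


# An event that counted 1000 while running for 50 of 200 enabled time units.
print(scale_count(1000, 200, 50))  # prints: (4000, 25.0)
```

A small percentage therefore means the displayed count is extrapolated from
a short running window.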

For example, 'triad_loop' runs on cpu16 (an atom cpu), and the scaled
value for core cycles is 233,066,666 with a percentage of 0.43%.

perf stat -e cycles \-- taskset -c 16 ./triad_loop

As before, two events are created.

------------------------------------------------------------
perf_event_attr:
  size                             120
  config                           0x400000000
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------

and

------------------------------------------------------------
perf_event_attr:
  size                             120
  config                           0x800000000
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------

 Performance counter stats for 'taskset -c 16 ./triad_loop':

       233,066,666      cpu_core/cycles/          (0.43%)
       604,097,080      cpu_atom/cycles/          (99.57%)

perf-record:

If no '-e' is specified in perf record on a hybrid platform,
it creates two default 'cycles' events and adds them to the event list:
one for core and one for atom.

perf-stat:

If no '-e' is specified in perf stat on a hybrid platform,
then besides the software events, the following events are created and
added to the event list in order.

cpu_core/cycles/,
cpu_atom/cycles/,
cpu_core/instructions/,
cpu_atom/instructions/,
cpu_core/branches/,
cpu_atom/branches/,
cpu_core/branch-misses/,
cpu_atom/branch-misses/
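
The ordering of this list (each event on cpu_core first, then on cpu_atom)
can be reproduced with a nested loop. This only illustrates the ordering
shown above and is not how perf builds the list internally; the function
name is hypothetical.

```python
def default_hybrid_events(
    pmus=("cpu_core", "cpu_atom"),
    events=("cycles", "instructions", "branches", "branch-misses"),
):
    """Return 'pmu/event/' strings, iterating events first and pmus second."""
    return [f"{pmu}/{event}/" for event in events for pmu in pmus]


print(",\n".join(default_hybrid_events()))
```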

Both perf-stat and perf-record also support enabling a
hybrid event with a specific pmu.

e.g.
perf stat -e cpu_core/cycles/
perf stat -e cpu_atom/cycles/
perf stat -e cpu_core/r1a/
perf stat -e cpu_atom/L1-icache-loads/
perf stat -e cpu_core/cycles/,cpu_atom/instructions/
perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}'

But '{cpu_core/cycles/,cpu_atom/instructions/}' returns a
warning and disables grouping, because the pmus in the group do
not match (cpu_core vs. cpu_atom).
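
Whether a group mixes pmus can be checked before invoking perf. The sketch
below is a standalone illustration and not perf's own parser; the function
names are hypothetical, and it only recognizes events written in the
'pmu/.../' form used in this document.

```python
import re

# Matches the pmu name in front of each 'pmu/.../' event in a spec string.
PMU_EVENT_RE = re.compile(r"(\w+)/[^/]*/")


def group_pmus(group_spec):
    """Return the pmu names used in a group such as '{cpu_core/cycles/,...}'."""
    return PMU_EVENT_RE.findall(group_spec)


def group_pmus_match(group_spec):
    """True if every 'pmu/.../' event in the group names the same pmu."""
    return len(set(group_pmus(group_spec))) <= 1


print(group_pmus_match("{cpu_core/cycles/,cpu_core/instructions/}"))  # prints: True
print(group_pmus_match("{cpu_core/cycles/,cpu_atom/instructions/}"))  # prints: False
```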