Commit | Line | Data |
---|---|---|
fe13225f PT |
1 | ====================================== |
2 | Coresight - HW Assisted Tracing on ARM | |
3 | ====================================== | |
872234d3 | 4 | |
fe13225f PT |
5 | :Author: Mathieu Poirier <mathieu.poirier@linaro.org> |
6 | :Date: September 11th, 2014 | |
872234d3 MP |
7 | |
8 | Introduction | |
9 | ------------ | |
10 | ||
11 | Coresight is an umbrella of technologies allowing for the debugging of ARM-based | |
12 | SoCs. It includes solutions for JTAG and HW assisted tracing. This | |
13 | document is concerned with the latter. | |
14 | ||
15 | HW assisted tracing is becoming increasingly useful when dealing with systems | |
16 | that have many SoCs and other components like GPU and DMA engines. ARM has | |
17 | developed a HW assisted tracing solution by means of different components, each | |
b29d5c1f | 18 | being added to a design at synthesis time to cater to specific tracing needs. |
f8b66fe5 | 19 | Components are generally categorised as sources, links and sinks and are |
872234d3 MP |
20 | (usually) discovered using the AMBA bus. |
21 | ||
22 | "Sources" generate a compressed stream representing the processor instruction | |
23 | path based on tracing scenarios as configured by users. From there the stream | |
24 | flows through the coresight system (via the ATB bus) using links that connect | |
25 | the emanating source to one or more sinks. Sinks serve as endpoints to the coresight | |
26 | implementation, either storing the compressed stream in a memory buffer or | |
27 | creating an interface to the outside world where data can be transferred to a | |
28 | host without fear of filling up the onboard coresight memory buffer. | |
29 | ||
fe13225f | 30 | A typical coresight system would look like this:: |
872234d3 MP |
31 | |
32 | ***************************************************************** | |
33 | **************************** AMBA AXI ****************************===|| | |
34 | ***************************************************************** || | |
35 | ^ ^ | || | |
36 | | | * ** | |
37 | 0000000 ::::: 0000000 ::::: ::::: @@@@@@@ |||||||||||| | |
38 | 0 CPU 0<-->: C : 0 CPU 0<-->: C : : C : @ STM @ || System || | |
39 | |->0000000 : T : |->0000000 : T : : T :<--->@@@@@ || Memory || | |
40 | | #######<-->: I : | #######<-->: I : : I : @@@<-| |||||||||||| | |
41 | | # ETM # ::::: | # PTM # ::::: ::::: @ | | |
42 | | ##### ^ ^ | ##### ^ ! ^ ! . | ||||||||| | |
43 | | |->### | ! | |->### | ! | ! . | || DAP || | |
44 | | | # | ! | | # | ! | ! . | ||||||||| | |
45 | | | . | ! | | . | ! | ! . | | | | |
46 | | | . | ! | | . | ! | ! . | | * | |
47 | | | . | ! | | . | ! | ! . | | SWD/ | |
48 | | | . | ! | | . | ! | ! . | | JTAG | |
49 | *****************************************************************<-| | |
7af8792b | 50 | *************************** AMBA Debug APB ************************ |
872234d3 MP |
51 | ***************************************************************** |
52 | | . ! . ! ! . | | |
53 | | . * . * * . | | |
54 | ***************************************************************** | |
55 | ******************** Cross Trigger Matrix (CTM) ******************* | |
56 | ***************************************************************** | |
57 | | . ^ . . | | |
58 | | * ! * * | | |
59 | ***************************************************************** | |
60 | ****************** AMBA Advanced Trace Bus (ATB) ****************** | |
61 | ***************************************************************** | |
62 | | ! =============== | | |
63 | | * ===== F =====<---------| | |
64 | | ::::::::: ==== U ==== | |
65 | |-->:: CTI ::<!! === N === | |
66 | | ::::::::: ! == N == | |
67 | | ^ * == E == | |
68 | | ! &&&&&&&&& IIIIIII == L == | |
69 | |------>&& ETB &&<......II I ======= | |
70 | | ! &&&&&&&&& II I . | |
71 | | ! I I . | |
72 | | ! I REP I<.......... | |
73 | | ! I I | |
74 | | !!>&&&&&&&&& II I *Source: ARM ltd. | |
75 | |------>& TPIU &<......II I DAP = Debug Access Port | |
76 | &&&&&&&&& IIIIIII ETM = Embedded Trace Macrocell | |
77 | ; PTM = Program Trace Macrocell | |
78 | ; CTI = Cross Trigger Interface | |
79 | * ETB = Embedded Trace Buffer | |
80 | To trace port TPIU= Trace Port Interface Unit | |
81 | SWD = Serial Wire Debug | |
82 | ||
7af8792b | 83 | While on-target configuration of the components is done via the APB bus, |
872234d3 MP |
84 | all trace data are carried out-of-band on the ATB bus. The CTM provides |
85 | a way to aggregate and distribute signals between CoreSight components. | |
86 | ||
87 | The coresight framework provides a central point to represent, configure and | |
88 | manage coresight devices on a platform. This first implementation centers on | |
89 | the basic tracing functionality, enabling components such as ETM/PTM, funnel, | |
90 | replicator, TMC, TPIU and ETB. Future work will enable more | |
91 | intricate IP blocks such as STM and CTI. | |
92 | ||
93 | ||
94 | Acronyms and Classification | |
95 | --------------------------- | |
96 | ||
97 | Acronyms: | |
98 | ||
fe13225f PT |
99 | PTM: |
100 | Program Trace Macrocell | |
101 | ETM: | |
102 | Embedded Trace Macrocell | |
103 | STM: | |
104 | System Trace Macrocell | |
105 | ETB: | |
106 | Embedded Trace Buffer | |
107 | ITM: | |
108 | Instrumentation Trace Macrocell | |
109 | TPIU: | |
110 | Trace Port Interface Unit | |
111 | TMC-ETR: | |
112 | Trace Memory Controller, configured as Embedded Trace Router | |
113 | TMC-ETF: | |
114 | Trace Memory Controller, configured as Embedded Trace FIFO | |
115 | CTI: | |
116 | Cross Trigger Interface | |
872234d3 MP |
117 | |
118 | Classification: | |
119 | ||
120 | Source: | |
121 | ETMv3.x, ETMv4, PTMv1.0, PTMv1.1, STM, STM500, ITM | |
122 | Link: | |
123 | Funnel, replicator (intelligent or not), TMC-ETF | |
124 | Sinks: | |
125 | ETBv1.0, ETBv1.1, TPIU, TMC-ETF, TMC-ETR | |
126 | Misc: | |
127 | CTI | |
128 | ||
129 | ||
130 | Device Tree Bindings | |
fe13225f | 131 | -------------------- |
872234d3 | 132 | |
e49c0b14 | 133 | See ``Documentation/devicetree/bindings/arm/arm,coresight-*.yaml`` for details. |
872234d3 MP |
134 | |
135 | As of this writing drivers for ITM, STMs and CTIs are not provided but are | |
136 | expected to be added as the solution matures. | |
137 | ||
138 | ||
139 | Framework and implementation | |
140 | ---------------------------- | |
141 | ||
142 | The coresight framework provides a central point to represent, configure and | |
143 | manage coresight devices on a platform. Any coresight compliant device can | |
144 | register with the framework as long as it uses the right APIs: | |
145 | ||
fe13225f PT |
146 | .. c:function:: struct coresight_device *coresight_register(struct coresight_desc *desc); |
147 | .. c:function:: void coresight_unregister(struct coresight_device *csdev); | |
872234d3 | 148 | |
fe13225f PT |
149 | The registration function takes a ``struct coresight_desc *desc`` and |
150 | registers the device with the core framework. The unregister function takes | |
151 | a reference to a ``struct coresight_device *csdev`` obtained at registration time. | |
872234d3 MP |
152 | |
153 | If everything goes well during the registration process the new devices will | |
fe13225f | 154 | show up under /sys/bus/coresight/devices, as shown here for a TC2 platform:: |
872234d3 | 155 | |
fe13225f PT |
156 | root:~# ls /sys/bus/coresight/devices/ |
157 | replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm | |
158 | 20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm | |
159 | root:~# | |
872234d3 | 160 | |
fe13225f | 161 | The registration function takes a ``struct coresight_desc``, which looks like this:: |
872234d3 | 162 | |
fe13225f PT |
163 | struct coresight_desc { |
164 | enum coresight_dev_type type; | |
165 | struct coresight_dev_subtype subtype; | |
166 | const struct coresight_ops *ops; | |
167 | struct coresight_platform_data *pdata; | |
168 | struct device *dev; | |
169 | const struct attribute_group **groups; | |
170 | }; | |
872234d3 MP |
171 | |
172 | ||
173 | The "coresight_dev_type" identifies what the device is, i.e., source, link or | |
174 | sink while the "coresight_dev_subtype" will characterise that type further. | |
175 | ||
fe13225f | 176 | The ``struct coresight_ops`` is mandatory and will tell the framework how to |
872234d3 | 177 | perform base operations related to the components, each component having |
fe13225f PT |
178 | a different set of requirements. For that ``struct coresight_ops_sink``, |
179 | ``struct coresight_ops_link`` and ``struct coresight_ops_source`` have been | |
872234d3 MP |
180 | provided. |
181 | ||
fe13225f PT |
182 | The next field ``struct coresight_platform_data *pdata`` is acquired by calling |
183 | ``of_get_coresight_platform_data()``, as part of the driver's _probe routine and | |
184 | ``struct device *dev`` gets the device reference embedded in the ``amba_device``:: | |
872234d3 | 185 | |
fe13225f PT |
186 | static int etm_probe(struct amba_device *adev, const struct amba_id *id) |
187 | { | |
188 | ... | |
189 | ... | |
190 | drvdata->dev = &adev->dev; | |
191 | ... | |
192 | } | |
872234d3 MP |
193 | |
194 | Each class of device (source, link, or sink) has generic operations | |
fe13225f PT |
195 | that can be performed on them (see ``struct coresight_ops``). The ``**groups`` |
196 | is a list of sysfs entries pertaining to operations | |
872234d3 MP |
197 | specific to that component only. "Implementation defined" customisations are |
198 | expected to be accessed and controlled using those entries. | |
199 | ||
cd84d63a | 200 | Device Naming scheme |
fe13225f PT |
201 | -------------------- |
202 | ||
cd84d63a SP |
203 | The devices that appear on the "coresight" bus were named the same as their |
204 | parent devices, i.e., the real devices that appear on the AMBA bus or the platform bus. | |
205 | Thus the names were based on the Linux Open Firmware layer naming convention, | |
206 | which follows the base physical address of the device followed by the device | |
fe13225f | 207 | type. e.g:: |
cd84d63a | 208 | |
fe13225f PT |
209 | root:~# ls /sys/bus/coresight/devices/ |
210 | 20010000.etf 20040000.funnel 20100000.stm 22040000.etm | |
211 | 22140000.etm 230c0000.funnel 23240000.etm 20030000.tpiu | |
212 | 20070000.etr 20120000.replicator 220c0000.funnel | |
213 | 23040000.etm 23140000.etm 23340000.etm | |
cd84d63a SP |
214 | |
215 | However, with the introduction of ACPI support, the names of the real | |
216 | devices are a bit cryptic and non-obvious. Thus, a new naming scheme was | |
217 | introduced to use more generic names based on the type of the device. The | |
fe13225f | 218 | following rules apply:: |
cd84d63a SP |
219 | |
220 | 1) Devices that are bound to CPUs, are named based on the CPU logical | |
221 | number. | |
222 | ||
223 | e.g, ETM bound to CPU0 is named "etm0" | |
224 | ||
225 | 2) All other devices follow a pattern, "<device_type_prefix>N", where : | |
226 | ||
227 | <device_type_prefix> - A prefix specific to the type of the device | |
228 | N - a sequential number assigned based on the order | |
229 | of probing. | |
230 | ||
231 | e.g, tmc_etf0, tmc_etr0, funnel0, funnel1 | |
232 | ||
fe13225f | 233 | Thus, with the new scheme the devices could appear as :: |
cd84d63a | 234 | |
fe13225f PT |
235 | root:~# ls /sys/bus/coresight/devices/ |
236 | etm0 etm1 etm2 etm3 etm4 etm5 funnel0 | |
237 | funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0 | |
cd84d63a SP |
238 | |
239 | Some of the examples below use the old naming scheme and some use the | |
240 | newer scheme, so either form you see on your system is expected. Always | |
241 | use the names exactly as they appear on your system under the specified | |
242 | locations. | |
243 | ||
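Scripts that have to cope with both naming schemes can derive the device type from the name itself. The following is a minimal sketch in POSIX shell, not part of the kernel tree; ``cs_device_type`` is a hypothetical helper name. It assumes old-style names have the form ``<address>.<type>`` and new-style names have the form ``<device_type_prefix>N``, as described above. Note that the two schemes do not always yield the same string for the same hardware (for example ``etf`` versus ``tmc_etf``).

```shell
# Hypothetical helper: print the type portion of a coresight device name.
cs_device_type() {
    name=$1
    case "$name" in
        # Old scheme, e.g. "20010000.etf": keep what follows the last dot.
        *.*) printf '%s\n' "${name##*.}" ;;
        # New scheme, e.g. "tmc_etf0" or "etm3": strip the trailing number.
        *)   printf '%s\n' "$name" | sed 's/[0-9]*$//' ;;
    esac
}

# Example use (names taken from the listings above):
#   cs_device_type 20040000.funnel
#   cs_device_type funnel1
```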
5153e57b ML |
244 | Topology Representation |
245 | ----------------------- | |
246 | ||
247 | Each CoreSight component has a ``connections`` directory which will contain | |
248 | links to other CoreSight components. This allows the user to explore the trace | |
249 | topology and for larger systems, determine the most appropriate sink for a | |
250 | given source. The connection information can also be used to establish | |
251 | which CTI devices are connected to a given component. This directory contains a | |
252 | ``nr_links`` attribute detailing the number of links in the directory. | |
253 | ||
254 | For an ETM source, in this case ``etm0`` on a Juno platform, a typical | |
255 | arrangement will be:: | |
256 | ||
257 | linaro-developer:~# ls -l /sys/bus/coresight/devices/etm0/connections | |
258 | <file details> cti_cpu0 -> ../../../23020000.cti/cti_cpu0 | |
259 | <file details> nr_links | |
260 | <file details> out:0 -> ../../../230c0000.funnel/funnel2 | |
261 | ||
262 | Following the out port to ``funnel2``:: | |
263 | ||
264 | linaro-developer:~# ls -l /sys/bus/coresight/devices/funnel2/connections | |
265 | <file details> in:0 -> ../../../23040000.etm/etm0 | |
266 | <file details> in:1 -> ../../../23140000.etm/etm3 | |
267 | <file details> in:2 -> ../../../23240000.etm/etm4 | |
268 | <file details> in:3 -> ../../../23340000.etm/etm5 | |
269 | <file details> nr_links | |
270 | <file details> out:0 -> ../../../20040000.funnel/funnel0 | |
271 | ||
272 | And again to ``funnel0``:: | |
273 | ||
274 | linaro-developer:~# ls -l /sys/bus/coresight/devices/funnel0/connections | |
275 | <file details> in:0 -> ../../../220c0000.funnel/funnel1 | |
276 | <file details> in:1 -> ../../../230c0000.funnel/funnel2 | |
277 | <file details> nr_links | |
278 | <file details> out:0 -> ../../../20010000.etf/tmc_etf0 | |
279 | ||
280 | Finding the first sink ``tmc_etf0``. This can be used to collect data | |
281 | as a sink, or as a link to propagate further along the chain:: | |
282 | ||
283 | linaro-developer:~# ls -l /sys/bus/coresight/devices/tmc_etf0/connections | |
284 | <file details> cti_sys0 -> ../../../20020000.cti/cti_sys0 | |
285 | <file details> in:0 -> ../../../20040000.funnel/funnel0 | |
286 | <file details> nr_links | |
287 | <file details> out:0 -> ../../../20150000.funnel/funnel4 | |
288 | ||
289 | via ``funnel4``:: | |
290 | ||
291 | linaro-developer:~# ls -l /sys/bus/coresight/devices/funnel4/connections | |
292 | <file details> in:0 -> ../../../20010000.etf/tmc_etf0 | |
293 | <file details> in:1 -> ../../../20140000.etf/tmc_etf1 | |
294 | <file details> nr_links | |
295 | <file details> out:0 -> ../../../20120000.replicator/replicator0 | |
296 | ||
297 | and a ``replicator0``:: | |
298 | ||
299 | linaro-developer:~# ls -l /sys/bus/coresight/devices/replicator0/connections | |
300 | <file details> in:0 -> ../../../20150000.funnel/funnel4 | |
301 | <file details> nr_links | |
302 | <file details> out:0 -> ../../../20030000.tpiu/tpiu0 | |
303 | <file details> out:1 -> ../../../20070000.etr/tmc_etr0 | |
304 | ||
305 | Arriving at the final sink in the chain, ``tmc_etr0``:: | |
306 | ||
307 | linaro-developer:~# ls -l /sys/bus/coresight/devices/tmc_etr0/connections | |
308 | <file details> cti_sys0 -> ../../../20020000.cti/cti_sys0 | |
309 | <file details> in:0 -> ../../../20120000.replicator/replicator0 | |
310 | <file details> nr_links | |
311 | ||
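The manual walk above can be automated. The following is a minimal sketch in POSIX shell, assuming the ``connections`` layout shown in the listings (an ``out:0`` symlink whose target ends in the name of the next device) and the availability of the ``readlink`` utility. ``cs_follow_out0`` is a hypothetical helper name; it only follows the first output port, so on a replicator it reports ``out:0`` and ignores ``out:1``.

```shell
# Hypothetical helper: print the chain of devices reached by repeatedly
# following "out:0", starting from the device named in $1.
# $2 optionally overrides the devices directory (useful for testing).
cs_follow_out0() {
    dev=$1
    root=${2:-/sys/bus/coresight/devices}
    hops=0
    # Bound the walk so a malformed topology cannot loop forever.
    while [ "$hops" -lt 32 ]; do
        printf '%s\n' "$dev"
        link="$root/$dev/connections/out:0"
        # No out:0 link means this device is the end of the chain.
        [ -L "$link" ] || break
        target=$(readlink "$link") || break
        # The last path component of the link target is the next device.
        dev=${target##*/}
        hops=$((hops + 1))
    done
}

# Example use:
#   cs_follow_out0 etm0
```

Following only ``out:0`` keeps the sketch short; a fuller tool would enumerate every ``out:*`` entry and print a tree.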
312 | As described below, when using sysfs it is sufficient to enable a sink and | |
313 | a source for successful trace. The framework will correctly enable all | |
314 | intermediate links as required. | |
315 | ||
316 | Note: ``cti_sys0`` appears in two of the connections lists above. | |
317 | CTIs can connect to multiple devices and are arranged in a star topology | |
e480336c MCC |
318 | via the CTM. See (Documentation/trace/coresight/coresight-ect.rst) |
319 | [#fourth]_ for further details. | |
5153e57b ML |
320 | Looking at this device we see 4 connections:: |
321 | ||
322 | linaro-developer:~# ls -l /sys/bus/coresight/devices/cti_sys0/connections | |
323 | <file details> nr_links | |
324 | <file details> stm0 -> ../../../20100000.stm/stm0 | |
325 | <file details> tmc_etf0 -> ../../../20010000.etf/tmc_etf0 | |
326 | <file details> tmc_etr0 -> ../../../20070000.etr/tmc_etr0 | |
327 | <file details> tpiu0 -> ../../../20030000.tpiu/tpiu0 | |
328 | ||
329 | ||
237483aa PP |
330 | How to use the tracer modules |
331 | ----------------------------- | |
872234d3 | 332 | |
fe13225f PT |
333 | There are two ways to use the Coresight framework: |
334 | ||
335 | 1. using the perf cmd line tools. | |
336 | 2. interacting directly with the Coresight devices using the sysFS interface. | |
337 | ||
338 | Preference is given to the former as using the sysFS interface | |
f29816b4 MP |
339 | requires a deep understanding of the Coresight HW. The following sections |
340 | provide details on using both methods. | |
341 | ||
bcc5834f JC |
342 | Using the sysFS interface |
343 | ~~~~~~~~~~~~~~~~~~~~~~~~~ | |
f29816b4 MP |
344 | |
345 | Before trace collection can start, a coresight sink needs to be identified. | |
872234d3 MP |
346 | There is no limit on the number of sinks (nor sources) that can be enabled at |
347 | any given moment. As a generic operation, all devices pertaining to the sink | |
fe13225f PT |
348 | class will have an "enable_sink" entry in sysfs:: |
349 | ||
350 | root:/sys/bus/coresight/devices# ls | |
351 | replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm | |
352 | 20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm | |
353 | root:/sys/bus/coresight/devices# ls 20010000.etb | |
354 | enable_sink status trigger_cntr | |
355 | root:/sys/bus/coresight/devices# echo 1 > 20010000.etb/enable_sink | |
356 | root:/sys/bus/coresight/devices# cat 20010000.etb/enable_sink | |
357 | 1 | |
358 | root:/sys/bus/coresight/devices# | |
872234d3 MP |
359 | |
360 | At boot time the current etm3x driver will configure the first address | |
361 | comparator with "_stext" and "_etext", essentially tracing any instruction | |
362 | that falls within that range. As such "enabling" a source will immediately | |
fe13225f PT |
363 | trigger a trace capture:: |
364 | ||
365 | root:/sys/bus/coresight/devices# echo 1 > 2201c000.ptm/enable_source | |
366 | root:/sys/bus/coresight/devices# cat 2201c000.ptm/enable_source | |
367 | 1 | |
368 | root:/sys/bus/coresight/devices# cat 20010000.etb/status | |
369 | Depth: 0x2000 | |
370 | Status: 0x1 | |
371 | RAM read ptr: 0x0 | |
372 | RAM wrt ptr: 0x19d3 <----- The write pointer is moving | |
373 | Trigger cnt: 0x0 | |
374 | Control: 0x1 | |
375 | Flush status: 0x0 | |
376 | Flush ctrl: 0x2001 | |
377 | root:/sys/bus/coresight/devices# | |
378 | ||
379 | Trace collection is stopped the same way:: | |
380 | ||
381 | root:/sys/bus/coresight/devices# echo 0 > 2201c000.ptm/enable_source | |
382 | root:/sys/bus/coresight/devices# | |
383 | ||
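The enable and disable steps shown above can be wrapped in a small script. The following is a minimal sketch in POSIX shell that uses only the ``enable_sink`` and ``enable_source`` entries from the sessions above; ``cs_trace_window`` and its arguments are hypothetical names. The sink is deliberately left enabled so that its buffer can be read afterwards.

```shell
# Hypothetical helper: trace for a fixed number of seconds.
#   $1 devices directory, e.g. /sys/bus/coresight/devices
#   $2 sink name,         e.g. 20010000.etb
#   $3 source name,       e.g. 2201c000.ptm
#   $4 capture window in seconds
cs_trace_window() {
    root=$1
    sink=$2
    src=$3
    seconds=$4
    # The sink must be enabled before the source starts emitting trace.
    echo 1 > "$root/$sink/enable_sink" || return 1
    echo 1 > "$root/$src/enable_source" || return 1
    sleep "$seconds"
    # Stop trace generation; the sink stays enabled.
    echo 0 > "$root/$src/enable_source"
}

# Example use:
#   cs_trace_window /sys/bus/coresight/devices 20010000.etb 2201c000.ptm 1
```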
384 | The content of the ETB buffer can be harvested directly from /dev:: | |
385 | ||
386 | root:/sys/bus/coresight/devices# dd if=/dev/20010000.etb \ | |
387 | of=~/cstrace.bin | |
388 | 64+0 records in | |
389 | 64+0 records out | |
390 | 32768 bytes (33 kB) copied, 0.00125258 s, 26.2 MB/s | |
391 | root:/sys/bus/coresight/devices# | |
872234d3 MP |
392 | |
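The sizes in the session above can be cross-checked with a little arithmetic. The following is a minimal sketch in POSIX shell, assuming the ``Depth`` field of the ``status`` output counts 32-bit words. That is an assumption, but it is consistent with the numbers shown: a depth of 0x2000 words is 8192 * 4 = 32768 bytes, which matches the 64 records of 512 bytes reported by ``dd``. ``etb_depth_bytes`` is a hypothetical helper name.

```shell
# Hypothetical helper: read ETB "status" text on stdin and print the
# buffer size in bytes, assuming Depth is expressed in 32-bit words.
etb_depth_bytes() {
    # Extract the hexadecimal value that follows "Depth:".
    depth=$(sed -n 's/^Depth:[[:space:]]*\(0x[0-9a-fA-F]*\).*/\1/p' | head -n 1)
    [ -n "$depth" ] || return 1
    # Each word is 4 bytes wide.
    printf '%s\n' "$(($depth * 4))"
}

# Example use:
#   etb_depth_bytes < /sys/bus/coresight/devices/20010000.etb/status
```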
393 | The file cstrace.bin can be decompressed using "ptm2human", DS-5 or Trace32. | |
394 | ||
395 | Following is a DS-5 output of an experimental loop that increments a variable up | |
396 | to a certain value. The example is simple and yet provides a glimpse of the | |
397 | wealth of possibilities that coresight provides. | |
fe13225f PT |
398 | :: |
399 | ||
400 | Info Tracing enabled | |
401 | Instruction 106378866 0x8026B53C E52DE004 false PUSH {lr} | |
402 | Instruction 0 0x8026B540 E24DD00C false SUB sp,sp,#0xc | |
403 | Instruction 0 0x8026B544 E3A03000 false MOV r3,#0 | |
404 | Instruction 0 0x8026B548 E58D3004 false STR r3,[sp,#4] | |
405 | Instruction 0 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
406 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
407 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
408 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
409 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
410 | Timestamp Timestamp: 17106715833 | |
411 | Instruction 319 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
412 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
413 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
414 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
415 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
416 | Instruction 9 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
417 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
418 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
419 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
420 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
421 | Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
422 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
423 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
424 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
425 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
426 | Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
427 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
428 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
429 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
430 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
431 | Instruction 10 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
432 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
433 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
434 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
435 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
436 | Instruction 6 0x8026B560 EE1D3F30 false MRC p15,#0x0,r3,c13,c0,#1 | |
437 | Instruction 0 0x8026B564 E1A0100D false MOV r1,sp | |
438 | Instruction 0 0x8026B568 E3C12D7F false BIC r2,r1,#0x1fc0 | |
439 | Instruction 0 0x8026B56C E3C2203F false BIC r2,r2,#0x3f | |
440 | Instruction 0 0x8026B570 E59D1004 false LDR r1,[sp,#4] | |
441 | Instruction 0 0x8026B574 E59F0010 false LDR r0,[pc,#16] ; [0x8026B58C] = 0x80550368 | |
442 | Instruction 0 0x8026B578 E592200C false LDR r2,[r2,#0xc] | |
443 | Instruction 0 0x8026B57C E59221D0 false LDR r2,[r2,#0x1d0] | |
444 | Instruction 0 0x8026B580 EB07A4CF true BL {pc}+0x1e9344 ; 0x804548c4 | |
445 | Info Tracing enabled | |
446 | Instruction 13570831 0x8026B584 E28DD00C false ADD sp,sp,#0xc | |
447 | Instruction 0 0x8026B588 E8BD8000 true LDM sp!,{pc} | |
448 | Timestamp Timestamp: 17107041535 | |
237483aa | 449 | |
bcc5834f JC |
450 | Using perf framework |
451 | ~~~~~~~~~~~~~~~~~~~~ | |
f29816b4 MP |
452 | |
453 | Coresight tracers are represented using the Perf framework's Performance | |
454 | Monitoring Unit (PMU) abstraction. As such the perf framework takes charge of | |
455 | controlling when tracing gets enabled based on when the process of interest is | |
456 | scheduled. When configured in a system, Coresight PMUs will be listed when | |
457 | queried by the perf command line tool:: | |
458 | ||
459 | linaro@linaro-nano:~$ ./perf list pmu | |
460 | ||
461 | List of pre-defined events (to be used in -e): | |
462 | ||
463 | cs_etm// [Kernel PMU event] | |
464 | ||
465 | linaro@linaro-nano:~$ | |
466 | ||
467 | Regardless of the number of tracers available in a system (usually equal to the | |
468 | number of processor cores), the "cs_etm" PMU will be listed only once. | |
469 | ||
470 | A Coresight PMU works the same way as any other PMU, i.e. the name of the PMU is | |
471 | listed along with configuration options within forward slashes '/'. Since a | |
472 | Coresight system will typically have more than one sink, the name of the sink to | |
cd84d63a | 473 | work with needs to be specified as an event option. |
fe13225f PT |
474 | On newer kernels the available sinks are listed in sysFS under |
475 | ($SYSFS)/bus/event_source/devices/cs_etm/sinks/:: | |
cd84d63a SP |
476 | |
477 | root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls | |
478 | tmc_etf0 tmc_etr0 tpiu0 | |
479 | ||
480 | On older kernels, this may need to be found from the list of coresight devices, | |
fe13225f | 481 | available under ($SYSFS)/bus/coresight/devices/:: |
cd84d63a SP |
482 | |
483 | root:~# ls /sys/bus/coresight/devices/ | |
484 | etm0 etm1 etm2 etm3 etm4 etm5 funnel0 | |
485 | funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0 | |
cd84d63a | 486 | root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program |
f29816b4 | 487 | |
cd84d63a SP |
488 | As mentioned above in section "Device Naming scheme", the names of the devices could |
489 | look different from what is used in the example above. One must use the device names | |
490 | as they appear in sysFS. | |
f29816b4 MP |
491 | |
492 | The syntax within the forward slashes '/' is important. The '@' character | |
493 | tells the parser that a sink is about to be specified and that this is the sink | |
494 | to use for the trace session. | |
495 | ||
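The event syntax described above can be assembled by a script that first checks the sink name. The following is a minimal sketch in POSIX shell, assuming a kernel recent enough to expose the ``sinks`` directory mentioned earlier; ``cs_etm_event`` is a hypothetical helper name, and only the ``cs_etm/@<sink>/<modifier>`` form used in this document is produced.

```shell
# Hypothetical helper: print a perf event string selecting a sink.
#   $1 sink name, e.g. tmc_etr0
#   $2 optional event modifier, e.g. u
#   $3 optional override of the sinks directory (useful for testing)
cs_etm_event() {
    sink=$1
    modifier=${2:-}
    sinks_dir=${3:-/sys/bus/event_source/devices/cs_etm/sinks}
    # Refuse names that the kernel does not list as a sink.
    if [ ! -e "$sinks_dir/$sink" ]; then
        echo "unknown sink: $sink" >&2
        return 1
    fi
    # '@' marks the sink; the modifier follows the closing slash.
    printf 'cs_etm/@%s/%s\n' "$sink" "$modifier"
}

# Example use:
#   perf record -e "$(cs_etm_event tmc_etr0 u)" --per-thread program
```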
496 | More information on the above and other examples on how to use Coresight with | |
497 | the perf tools can be found in the "HOWTO.md" file of the openCSD GitHub | |
fe13225f | 498 | repository [#third]_. |
f29816b4 | 499 | |
bcc5834f JC |
500 | Advanced perf framework usage |
501 | ----------------------------- | |
502 | ||
503 | AutoFDO analysis using the perf tools | |
504 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
6673016f RW |
505 | |
506 | perf can be used to record and analyze trace of programs. | |
507 | ||
508 | Execution can be recorded using 'perf record' with the cs_etm event, | |
fe13225f | 509 | specifying the name of the sink to record to, e.g:: |
6673016f | 510 | |
cd84d63a | 511 | perf record -e cs_etm/@tmc_etr0/u --per-thread |
6673016f RW |
512 | |
513 | The 'perf report' and 'perf script' commands can be used to analyze execution, | |
514 | synthesizing instruction and branch events from the instruction trace. | |
515 | 'perf inject' can be used to replace the trace data with the synthesized events. | |
516 | The --itrace option controls the type and frequency of synthesized events | |
517 | (see perf documentation). | |
518 | ||
519 | Note that only 64-bit programs are currently supported - further work is | |
520 | required to support instruction decode of 32-bit Arm programs. | |
521 | ||
bcc5834f JC |
522 | Tracing PID |
523 | ~~~~~~~~~~~ | |
06c18e28 LY |
524 | |
525 | The kernel can be built to write the PID value into the PE ContextID registers. | |
526 | For a kernel running at EL1, the PID is stored in CONTEXTIDR_EL1. A PE may | |
527 | implement Arm Virtualization Host Extensions (VHE), with which the kernel can | |
528 | run at EL2 as a virtualisation host; in this case, the PID value is stored in | |
529 | CONTEXTIDR_EL2. | |
530 | ||
531 | perf provides PMU formats that program the ETM to insert these values into the | |
532 | trace data; the PMU formats are defined as below: | |
533 | ||
534 | "contextid1": Available on both EL1 kernel and EL2 kernel. When the | |
535 | kernel is running at EL1, "contextid1" enables the PID | |
536 | tracing; when the kernel is running at EL2, this enables | |
537 | tracing the PID of guest applications. | |
538 | ||
539 | "contextid2": Only usable when the kernel is running at EL2. When | |
540 | selected, enables PID tracing on EL2 kernel. | |
541 | ||
542 | "contextid": Will be an alias for the option that enables PID | |
543 | tracing. I.e, | |
544 | contextid == contextid1, on EL1 kernel. | |
545 | contextid == contextid2, on EL2 kernel. | |
546 | ||
547 | perf will always enable PID tracing at the relevant EL. This is accomplished by | |
548 | automatically enabling the "contextid" config, but for EL2 it is possible to make | |
549 | specific adjustments using configs "contextid1" and "contextid2". For example, if a user | |
550 | wants to trace PIDs for both host and guest, the two configs "contextid1" and | |
551 | "contextid2" can be set at the same time: | |
552 | ||
553 | perf record -e cs_etm/contextid1,contextid2/u -- vm | |
554 | ||
6673016f RW |
555 | |
556 | Generating coverage files for Feedback Directed Optimization: AutoFDO | |
bcc5834f | 557 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
6673016f RW |
558 | |
559 | 'perf inject' accepts the --itrace option in which case tracing data is | |
560 | removed and replaced with the synthesized events. e.g. | |
fe13225f | 561 | :: |
6673016f RW |
562 | |
563 | perf inject --itrace --strip -i perf.data -o perf.data.new | |
564 | ||
565 | Below is an example of using ARM ETM for autoFDO. It requires autofdo | |
566 | (https://github.com/google/autofdo) and gcc version 5. The bubble | |
567 | sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial). | |
fe13225f | 568 | :: |
6673016f RW |
569 | |
570 | $ gcc-5 -O3 sort.c -o sort | |
571 | $ taskset -c 2 ./sort | |
572 | Bubble sorting array of 30000 elements | |
573 | 5910 ms | |
574 | ||
cd84d63a | 575 | $ perf record -e cs_etm/@tmc_etr0/u --per-thread taskset -c 2 ./sort |
6673016f RW |
576 | Bubble sorting array of 30000 elements |
577 | 12543 ms | |
578 | [ perf record: Woken up 35 times to write data ] | |
579 | [ perf record: Captured and wrote 69.640 MB perf.data ] | |
580 | ||
581 | $ perf inject -i perf.data -o inj.data --itrace=il64 --strip | |
582 | $ create_gcov --binary=./sort --profile=inj.data --gcov=sort.gcov -gcov_version=1 | |
583 | $ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo | |
584 | $ taskset -c 2 ./sort_autofdo | |
585 | Bubble sorting array of 30000 elements | |
586 | 5806 ms | |
87bf4d68 | 587 | |
32ee00d8 JC |
588 | Config option formats |
589 | ~~~~~~~~~~~~~~~~~~~~~ | |
590 | ||
591 | The following strings can be provided between // on the perf command line to enable various options. | |
592 | They are also listed in the folder /sys/bus/event_source/devices/cs_etm/format/ | |
593 | ||
594 | .. list-table:: | |
595 | :header-rows: 1 | |
596 | ||
597 | * - Option | |
598 | - Description | |
599 | * - branch_broadcast | |
600 | - Session local version of the system wide setting: | |
601 | :ref:`ETM_MODE_BB <coresight-branch-broadcast>` | |
602 | * - contextid | |
603 | - See `Tracing PID`_ | |
604 | * - contextid1 | |
605 | - See `Tracing PID`_ | |
606 | * - contextid2 | |
607 | - See `Tracing PID`_ | |
608 | * - configid | |
609 | - Selection for a custom configuration. This is an implementation detail and not used directly, | |
610 | see :ref:`trace/coresight/coresight-config:Using Configurations in perf` | |
611 | * - preset | |
612 | - Override for parameters in a custom configuration, see | |
613 | :ref:`trace/coresight/coresight-config:Using Configurations in perf` | |
614 | * - sinkid | |
615 | - Hashed version of the string to select a sink, automatically set when using the @ notation. | |
616 | This is an internal implementation detail and is not used directly, see `Using perf | |
617 | framework`_. | |
618 | * - cycacc | |
619 | - Session local version of the system wide setting: :ref:`ETMv4_MODE_CYCACC | |
620 | <coresight-cycle-accurate>` | |
621 | * - retstack | |
622 | - Session local version of the system wide setting: :ref:`ETM_MODE_RETURNSTACK | |
623 | <coresight-return-stack>` | |
624 | * - timestamp | |
625 | - Session local version of the system wide setting: :ref:`ETMv4_MODE_TIMESTAMP | |
626 | <coresight-timestamp>` | |
87bf4d68 MP |
627 | |
628 | How to use the STM module | |
629 | ------------------------- | |
630 | ||
631 | Using the System Trace Macrocell module is the same as using the tracers. The only | |
632 | difference is that clients drive the trace capture, rather | |
633 | than the program flow through the code. | |
634 | ||
635 | As with any other CoreSight component, specifics about the STM tracer can be | |
fe13225f | 636 | found in sysfs with more information on each entry being found in [#first]_:: |
87bf4d68 | 637 | |
fe13225f PT |
638 | root@genericarmv8:~# ls /sys/bus/coresight/devices/stm0 |
639 | enable_source hwevent_select port_enable subsystem uevent | |
640 | hwevent_enable mgmt port_select traceid | |
641 | root@genericarmv8:~# | |
87bf4d68 MP |
642 | |
643 | Like any other source a sink needs to be identified and the STM enabled before | |
fe13225f | 644 | being used:: |
87bf4d68 | 645 | |
fe13225f PT |
646 | root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink |
647 | root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/stm0/enable_source | |
87bf4d68 MP |
648 | |
649 | From there user space applications can request and use channels using the devfs | |
fe13225f PT |
650 | interface provided for that purpose by the generic STM API:: |
651 | ||
652 | root@genericarmv8:~# ls -l /dev/stm0 | |
653 | crw------- 1 root root 10, 61 Jan 3 18:11 /dev/stm0 | |
654 | root@genericarmv8:~# | |
655 | ||
e480336c MCC |
656 | Details on how to use the generic STM API can be found here: |
657 | - Documentation/trace/stm.rst [#second]_. | |
87bf4d68 | 658 | |
82e0c782 ML |
659 | The CTI & CTM Modules |
660 | --------------------- | |
661 | ||
662 | The CTI (Cross Trigger Interface) provides a set of trigger signals between | |
663 | individual CTIs and components, and can propagate these between all CTIs via | |
664 | channels on the CTM (Cross Trigger Matrix). | |
665 | ||
666 | A separate documentation file is provided to explain the use of these devices. | |
e480336c | 667 | (Documentation/trace/coresight/coresight-ect.rst) [#fourth]_. |
82e0c782 | 668 | |
f71cd93d ML |
669 | CoreSight System Configuration |
670 | ------------------------------ | |
671 | ||
672 | CoreSight components can be complex devices with many programming options. | |
673 | Furthermore, components can be programmed to interact with each other across the | |
674 | complete system. | |
675 | ||
676 | A CoreSight System Configuration manager is provided to allow these complex programming | |
677 | configurations to be selected and used easily from perf and sysfs. | |
678 | ||
679 | See the separate document for further information. | |
680 | (Documentation/trace/coresight/coresight-config.rst) [#fifth]_. | |
681 | ||
82e0c782 | 682 | |
fe13225f | 683 | .. [#first] Documentation/ABI/testing/sysfs-bus-coresight-devices-stm |
87bf4d68 | 684 | |
fe13225f | 685 | .. [#second] Documentation/trace/stm.rst |
87bf4d68 | 686 | |
fe13225f | 687 | .. [#third] https://github.com/Linaro/perf-opencsd |
82e0c782 ML |
688 | |
689 | .. [#fourth] Documentation/trace/coresight/coresight-ect.rst | |
f71cd93d ML |
690 | |
691 | .. [#fifth] Documentation/trace/coresight/coresight-config.rst |