1 | Coresight - HW Assisted Tracing on ARM |
2 | ====================================== | |
3 | ||
4 | Author: Mathieu Poirier <mathieu.poirier@linaro.org> | |
5 | Date: September 11th, 2014 | |
6 | ||
7 | Introduction | |
8 | ------------ | |
9 | ||
Coresight is an umbrella of technologies allowing for the debugging of ARM
based SoCs. It includes solutions for JTAG and HW assisted tracing. This
document is concerned with the latter.

HW assisted tracing is becoming increasingly useful when dealing with systems
that have many SoCs and other components like GPU and DMA engines. ARM has
developed a HW assisted tracing solution by means of different components, each
being added to a design at synthesis time to cater to specific tracing needs.
Components are generally categorised as sources, links and sinks and are
(usually) discovered using the AMBA bus.
20 | ||
"Sources" generate a compressed stream representing the processor instruction
path based on tracing scenarios as configured by users. From there the stream
flows through the coresight system (via the ATB bus) using links that connect
the emanating source to one or more sinks. Sinks serve as endpoints to the
coresight implementation, either storing the compressed stream in a memory
buffer or creating an interface to the outside world where data can be
transferred to a host without fear of filling up the onboard coresight memory
buffer.
28 | ||
A typical coresight system would look like this:
30 | ||
    [Block diagram: its column alignment did not survive, so only the
    blocks it names are listed here. Two CPUs (one paired with an ETM, the
    other with a PTM), their CTIs, an STM with its own CTI, System Memory
    and the DAP (SWD/JTAG) are drawn around the AMBA AXI and AMBA Debug APB
    buses. Below them are the Cross Trigger Matrix (CTM) and the AMBA
    Advanced Trace Bus (ATB). The ATB feeds a funnel, followed by a
    replicator (REP) that drives the ETB and the TPIU; the TPIU leads to
    the trace port. Source: ARM ltd.]

    DAP  = Debug Access Port
    ETM  = Embedded Trace Macrocell
    PTM  = Program Trace Macrocell
    CTI  = Cross Trigger Interface
    ETB  = Embedded Trace Buffer
    TPIU = Trace Port Interface Unit
    SWD  = Serial Wire Debug
81 | ||
While on-target configuration of the components is done via the APB bus,
all trace data are carried out-of-band on the ATB bus. The CTM provides
a way to aggregate and distribute signals between CoreSight components.
85 | ||
The coresight framework provides a central point to represent, configure and
manage coresight devices on a platform. This first implementation centers on
the basic tracing functionality, enabling components such as ETM/PTM, funnel,
replicator, TMC, TPIU and ETB. Future work will enable more
intricate IP blocks such as STM and CTI.
91 | ||
92 | ||
93 | Acronyms and Classification | |
94 | --------------------------- | |
95 | ||
Acronyms:

PTM:     Program Trace Macrocell
ETM:     Embedded Trace Macrocell
STM:     System Trace Macrocell
ETB:     Embedded Trace Buffer
ITM:     Instrumentation Trace Macrocell
TPIU:    Trace Port Interface Unit
TMC-ETR: Trace Memory Controller, configured as Embedded Trace Router
TMC-ETF: Trace Memory Controller, configured as Embedded Trace FIFO
CTI:     Cross Trigger Interface

Classification:

Source:
   ETMv3.x, ETMv4, PTMv1.0, PTMv1.1, STM, STM500, ITM
Link:
   Funnel, replicator (intelligent or not), TMC-ETR
Sinks:
   ETBv1.0, ETBv1.1, TPIU, TMC-ETF
Misc:
   CTI
118 | ||
119 | ||
Device Tree Bindings
--------------------
122 | ||
123 | See Documentation/devicetree/bindings/arm/coresight.txt for details. | |
124 | ||
125 | As of this writing drivers for ITM, STMs and CTIs are not provided but are | |
126 | expected to be added as the solution matures. | |
127 | ||
128 | ||
129 | Framework and implementation | |
130 | ---------------------------- | |
131 | ||
The coresight framework provides a central point to represent, configure and
manage coresight devices on a platform. Any coresight compliant device can
register with the framework as long as it uses the right APIs:

struct coresight_device *coresight_register(struct coresight_desc *desc);
void coresight_unregister(struct coresight_device *csdev);

The registration function takes a "struct coresight_desc *desc" and
registers the device with the core framework. The unregister function takes
a reference to a "struct coresight_device *csdev", obtained at registration time.

If everything goes well during the registration process the new devices will
show up under /sys/bus/coresight/devices, as shown here for a TC2 platform:
145 | ||
146 | root:~# ls /sys/bus/coresight/devices/ | |
147 | replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm | |
148 | 20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm | |
149 | root:~# | |
150 | ||
The registration function takes a "struct coresight_desc", which looks like this:

struct coresight_desc {
        enum coresight_dev_type type;
        struct coresight_dev_subtype subtype;
        const struct coresight_ops *ops;
        struct coresight_platform_data *pdata;
        struct device *dev;
        const struct attribute_group **groups;
};
161 | ||
162 | ||
The "coresight_dev_type" identifies what the device is, i.e., source, link or
sink, while the "coresight_dev_subtype" characterises that type further.

The "struct coresight_ops" is mandatory and tells the framework how to
perform base operations related to the components, each component having
a different set of requirements. For that, "struct coresight_ops_sink",
"struct coresight_ops_link" and "struct coresight_ops_source" have been
provided.
171 | ||
172 | The next field, "struct coresight_platform_data *pdata" is acquired by calling | |
173 | "of_get_coresight_platform_data()", as part of the driver's _probe routine and | |
174 | "struct device *dev" gets the device reference embedded in the "amba_device": | |
175 | ||
    static int etm_probe(struct amba_device *adev, const struct amba_id *id)
    {
     ...
     ...
     drvdata->dev = &adev->dev;
     ...
    }
183 | ||
Each class of device (source, link, or sink) has generic operations
that can be performed on it (see "struct coresight_ops"). The
"**groups" field is a list of sysfs entries pertaining to operations
specific to that component only. "Implementation defined" customisations are
expected to be accessed and controlled using those entries.
189 | ||

Device Naming scheme
--------------------
The devices that appear on the "coresight" bus were named the same as their
parent devices, i.e., the real devices that appear on the AMBA bus or the
platform bus. Thus the names were based on the Linux Open Firmware layer
naming convention, which uses the base physical address of the device
followed by the device type, e.g.:
198 | ||
199 | root:~# ls /sys/bus/coresight/devices/ | |
200 | 20010000.etf 20040000.funnel 20100000.stm 22040000.etm | |
201 | 22140000.etm 230c0000.funnel 23240000.etm 20030000.tpiu | |
202 | 20070000.etr 20120000.replicator 220c0000.funnel | |
203 | 23040000.etm 23140000.etm 23340000.etm | |
204 | ||
205 | However, with the introduction of ACPI support, the names of the real | |
206 | devices are a bit cryptic and non-obvious. Thus, a new naming scheme was | |
207 | introduced to use more generic names based on the type of the device. The | |
208 | following rules apply: | |
209 | ||
  1) Devices that are bound to CPUs are named based on the CPU logical
     number.

     e.g., the ETM bound to CPU0 is named "etm0"

  2) All other devices follow the pattern "<device_type_prefix>N", where:

     <device_type_prefix> - A prefix specific to the type of the device
     N                    - A sequential number assigned based on the order
                            of probing.

     e.g., tmc_etf0, tmc_etr0, funnel0, funnel1

Thus, with the new scheme the devices could appear as:
224 | ||
225 | root:~# ls /sys/bus/coresight/devices/ | |
226 | etm0 etm1 etm2 etm3 etm4 etm5 funnel0 | |
227 | funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0 | |
228 | ||
Some of the examples below use the old naming scheme and some use the
newer one, so that whichever names you see on your system, they are not
unexpected. Always use the names as they appear on your system under
the specified locations.
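
The two naming schemes described above can be told apart from the name alone.
The following is a minimal sketch, not part of the coresight tooling: the
function name cs_naming_scheme and its "legacy"/"generic"/"unknown" labels
are made up for this illustration. It assumes that legacy names have the form
<hex address>.<type> and that generic names are a lowercase prefix followed
by a number.

```shell
#!/bin/sh
# Hypothetical helper: guess which naming scheme a coresight device name uses.
cs_naming_scheme() {
    case "$1" in
        *.*)
            # Legacy form: <base physical address in hex>.<device type>
            addr=${1%%.*}
            case "$addr" in
                ""|*[!0-9a-fA-F]*) echo unknown ;;
                *) echo legacy ;;
            esac
            ;;
        [a-z]*[0-9])
            # Generic form: <device_type_prefix>N, e.g. etm0 or tmc_etr0
            echo generic
            ;;
        *)
            echo unknown
            ;;
    esac
}
```

For instance, "cs_naming_scheme 20010000.etf" reports legacy and
"cs_naming_scheme tmc_etr0" reports generic. A name such as the bare
"replicator" seen on the TC2 listing fits neither pattern and is reported as
unknown.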
233 | ||
How to use the tracer modules
-----------------------------

There are two ways to use the Coresight framework: 1) using the perf cmd line
tools and 2) interacting directly with the Coresight devices using the sysFS
interface. Preference is given to the former as using the sysFS interface
requires a deep understanding of the Coresight HW. The following sections
provide details on using both methods.

1) Using the sysFS interface:

Before trace collection can start, a coresight sink needs to be identified.
There is no limit on the number of sinks (or sources) that can be enabled at
any given moment. As a generic operation, all devices pertaining to the sink
class will have an "enable_sink" entry in sysfs:
249 | ||
250 | root:/sys/bus/coresight/devices# ls | |
251 | replicator 20030000.tpiu 2201c000.ptm 2203c000.etm 2203e000.etm | |
252 | 20010000.etb 20040000.funnel 2201d000.ptm 2203d000.etm | |
253 | root:/sys/bus/coresight/devices# ls 20010000.etb | |
254 | enable_sink status trigger_cntr | |
255 | root:/sys/bus/coresight/devices# echo 1 > 20010000.etb/enable_sink | |
256 | root:/sys/bus/coresight/devices# cat 20010000.etb/enable_sink | |
257 | 1 | |
258 | root:/sys/bus/coresight/devices# | |
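
Since sinks expose "enable_sink" and sources expose "enable_source", the
role of each device can be guessed from its sysfs entries. The sketch below
is only an illustration: cs_role, cs_list_roles and the CS_SYSFS override
are hypothetical names, and it assumes that a device with neither entry is
something else (for instance a link).

```shell
#!/bin/sh
# Overridable so the helpers can be tried against a scratch directory.
CS_SYSFS="${CS_SYSFS:-/sys/bus/coresight/devices}"

# Print "sink", "source" or "other" for one device name.
cs_role() {
    dev="$CS_SYSFS/$1"
    if [ -e "$dev/enable_sink" ]; then
        echo sink
    elif [ -e "$dev/enable_source" ]; then
        echo source
    else
        echo other
    fi
}

# Print one "<name><TAB><role>" line per device found under CS_SYSFS.
cs_list_roles() {
    for path in "$CS_SYSFS"/*; do
        [ -d "$path" ] || continue
        name=${path##*/}
        printf '%s\t%s\n' "$name" "$(cs_role "$name")"
    done
}
```

Running cs_list_roles before enabling anything gives a quick view of which
names are candidates for the "enable_sink" and "enable_source" steps.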
259 | ||
260 | At boot time the current etm3x driver will configure the first address | |
261 | comparator with "_stext" and "_etext", essentially tracing any instruction | |
262 | that falls within that range. As such "enabling" a source will immediately | |
263 | trigger a trace capture: | |
264 | ||
265 | root:/sys/bus/coresight/devices# echo 1 > 2201c000.ptm/enable_source | |
266 | root:/sys/bus/coresight/devices# cat 2201c000.ptm/enable_source | |
267 | 1 | |
268 | root:/sys/bus/coresight/devices# cat 20010000.etb/status | |
269 | Depth: 0x2000 | |
270 | Status: 0x1 | |
271 | RAM read ptr: 0x0 | |
272 | RAM wrt ptr: 0x19d3 <----- The write pointer is moving | |
273 | Trigger cnt: 0x0 | |
274 | Control: 0x1 | |
275 | Flush status: 0x0 | |
276 | Flush ctrl: 0x2001 | |
277 | root:/sys/bus/coresight/devices# | |
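
The "write pointer is moving" check can be scripted by comparing two reads
of the status file. This is a sketch under assumptions: the helper names are
hypothetical, and it relies on the status text having a "RAM wrt ptr:" line
as in the output above, which may not hold for other drivers or kernel
versions.

```shell
#!/bin/sh
# Print the "RAM wrt ptr" value from ETB status text read on stdin.
etb_wrt_ptr() {
    awk -F: '/^RAM wrt ptr/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Succeed if the write pointer differs between two status snapshots.
# Equal values do not prove the trace is idle: the buffer may have wrapped.
etb_is_capturing() {
    first=$(printf '%s\n' "$1" | etb_wrt_ptr)
    second=$(printf '%s\n' "$2" | etb_wrt_ptr)
    [ -n "$first" ] && [ -n "$second" ] && [ "$first" != "$second" ]
}
```

A caller would read 20010000.etb/status twice, a moment apart, and pass both
texts to etb_is_capturing.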
278 | ||
279 | Trace collection is stopped the same way: | |
280 | ||
281 | root:/sys/bus/coresight/devices# echo 0 > 2201c000.ptm/enable_source | |
282 | root:/sys/bus/coresight/devices# | |
283 | ||
284 | The content of the ETB buffer can be harvested directly from /dev: | |
285 | ||
286 | root:/sys/bus/coresight/devices# dd if=/dev/20010000.etb \ | |
287 | of=~/cstrace.bin | |
288 | ||
289 | 64+0 records in | |
290 | 64+0 records out | |
291 | 32768 bytes (33 kB) copied, 0.00125258 s, 26.2 MB/s | |
292 | root:/sys/bus/coresight/devices# | |
293 | ||
294 | The file cstrace.bin can be decompressed using "ptm2human", DS-5 or Trace32. | |
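
The sysFS steps above (select a sink, enable a source, stop the source, read
the buffer from /dev) can be strung together as follows. This is a sketch,
not a supported tool: cs_capture, CS_SYSFS and CS_DEV are hypothetical names,
and the capture duration is simply a sleep.

```shell
#!/bin/sh
# Overridable so the sequence can be dry-run against a scratch directory.
CS_SYSFS="${CS_SYSFS:-/sys/bus/coresight/devices}"
CS_DEV="${CS_DEV:-/dev}"

# cs_capture SINK SOURCE OUTFILE [SECONDS]
cs_capture() {
    sink="$1"; source_dev="$2"; out="$3"; secs="${4:-1}"
    # A sink must be selected before a source is enabled.
    echo 1 > "$CS_SYSFS/$sink/enable_sink" || return 1
    echo 1 > "$CS_SYSFS/$source_dev/enable_source" || return 1
    sleep "$secs"
    # Stop the source, then harvest the sink's buffer.
    echo 0 > "$CS_SYSFS/$source_dev/enable_source" || return 1
    dd if="$CS_DEV/$sink" of="$out" 2>/dev/null
}
```

With the TC2 names used above, this would be invoked as
"cs_capture 20010000.etb 2201c000.ptm ~/cstrace.bin".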
295 | ||
296 | Following is a DS-5 output of an experimental loop that increments a variable up | |
297 | to a certain value. The example is simple and yet provides a glimpse of the | |
298 | wealth of possibilities that coresight provides. | |
299 | ||
300 | Info Tracing enabled | |
301 | Instruction 106378866 0x8026B53C E52DE004 false PUSH {lr} | |
302 | Instruction 0 0x8026B540 E24DD00C false SUB sp,sp,#0xc | |
303 | Instruction 0 0x8026B544 E3A03000 false MOV r3,#0 | |
304 | Instruction 0 0x8026B548 E58D3004 false STR r3,[sp,#4] | |
305 | Instruction 0 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
306 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
307 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
308 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
309 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
310 | Timestamp Timestamp: 17106715833 | |
311 | Instruction 319 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
312 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
313 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
314 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
315 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
316 | Instruction 9 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
317 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
318 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
319 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
320 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
321 | Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
322 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
323 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
324 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
325 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
326 | Instruction 7 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
327 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
328 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
329 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
330 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
331 | Instruction 10 0x8026B54C E59D3004 false LDR r3,[sp,#4] | |
332 | Instruction 0 0x8026B550 E3530004 false CMP r3,#4 | |
333 | Instruction 0 0x8026B554 E2833001 false ADD r3,r3,#1 | |
334 | Instruction 0 0x8026B558 E58D3004 false STR r3,[sp,#4] | |
335 | Instruction 0 0x8026B55C DAFFFFFA true BLE {pc}-0x10 ; 0x8026b54c | |
336 | Instruction 6 0x8026B560 EE1D3F30 false MRC p15,#0x0,r3,c13,c0,#1 | |
337 | Instruction 0 0x8026B564 E1A0100D false MOV r1,sp | |
338 | Instruction 0 0x8026B568 E3C12D7F false BIC r2,r1,#0x1fc0 | |
339 | Instruction 0 0x8026B56C E3C2203F false BIC r2,r2,#0x3f | |
340 | Instruction 0 0x8026B570 E59D1004 false LDR r1,[sp,#4] | |
341 | Instruction 0 0x8026B574 E59F0010 false LDR r0,[pc,#16] ; [0x8026B58C] = 0x80550368 | |
342 | Instruction 0 0x8026B578 E592200C false LDR r2,[r2,#0xc] | |
343 | Instruction 0 0x8026B57C E59221D0 false LDR r2,[r2,#0x1d0] | |
344 | Instruction 0 0x8026B580 EB07A4CF true BL {pc}+0x1e9344 ; 0x804548c4 | |
345 | Info Tracing enabled | |
346 | Instruction 13570831 0x8026B584 E28DD00C false ADD sp,sp,#0xc | |
347 | Instruction 0 0x8026B588 E8BD8000 true LDM sp!,{pc} | |
348 | Timestamp Timestamp: 17107041535 | |

2) Using the perf framework:
351 | ||
352 | Coresight tracers are represented using the Perf framework's Performance | |
353 | Monitoring Unit (PMU) abstraction. As such the perf framework takes charge of | |
354 | controlling when tracing gets enabled based on when the process of interest is | |
355 | scheduled. When configured in a system, Coresight PMUs will be listed when | |
356 | queried by the perf command line tool: | |
357 | ||
358 | linaro@linaro-nano:~$ ./perf list pmu | |
359 | ||
360 | List of pre-defined events (to be used in -e): | |
361 | ||
362 | cs_etm// [Kernel PMU event] | |
363 | ||
364 | linaro@linaro-nano:~$ | |
365 | ||
Regardless of the number of tracers available in a system (usually equal to the
number of processor cores), the "cs_etm" PMU will be listed only once.

A Coresight PMU works the same way as any other PMU, i.e., the name of the PMU is
listed along with configuration options within forward slashes '/'. Since a
Coresight system will typically have more than one sink, the name of the sink to
work with needs to be specified as an event option.
On newer kernels the available sinks are listed in sysFS under
($SYSFS)/bus/event_source/devices/cs_etm/sinks/:
375 | ||
376 | root@localhost:/sys/bus/event_source/devices/cs_etm/sinks# ls | |
377 | tmc_etf0 tmc_etr0 tpiu0 | |
378 | ||
379 | On older kernels, this may need to be found from the list of coresight devices, | |
380 | available under ($SYSFS)/bus/coresight/devices/: | |
381 | ||
382 | root:~# ls /sys/bus/coresight/devices/ | |
383 | etm0 etm1 etm2 etm3 etm4 etm5 funnel0 | |
384 | funnel1 funnel2 replicator0 stm0 tmc_etf0 tmc_etr0 tpiu0 | |

    root@linaro-nano:~# perf record -e cs_etm/@tmc_etr0/u --per-thread program

As mentioned above in section "Device Naming scheme", the names of the devices
could look different from what is used in the example above. One must use the
device names as they appear under sysFS.

The syntax within the forward slashes '/' is important. The '@' character
tells the parser that a sink is about to be specified and that this is the sink
to use for the trace session.

More information on the above and other examples on how to use Coresight with
the perf tools can be found in the "HOWTO.md" file of the openCSD GitHub
repository [3].
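
Because the sink name has to match what the running system exposes, it can
help to build the event string from sysFS rather than typing it. The sketch
below is illustrative only: cs_etm_event and the SYSFS override are
hypothetical names. It checks the cs_etm/sinks directory first and falls back
to looking for an "enable_sink" entry under the coresight bus on older
kernels.

```shell
#!/bin/sh
# Overridable so the lookup can be tried against a scratch directory.
SYSFS="${SYSFS:-/sys}"

# Print "cs_etm/@<sink>/u" if <sink> is a known sink, else fail.
cs_etm_event() {
    sink="$1"
    if [ -e "$SYSFS/bus/event_source/devices/cs_etm/sinks/$sink" ] ||
       [ -e "$SYSFS/bus/coresight/devices/$sink/enable_sink" ]; then
        printf 'cs_etm/@%s/u\n' "$sink"
    else
        echo "unknown sink: $sink" >&2
        return 1
    fi
}
```

The result can be passed straight to perf, for example
perf record -e "$(cs_etm_event tmc_etr0)" --per-thread program.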
399 | ||
2.1) AutoFDO analysis using the perf tools:

perf can be used to record and analyze trace of programs.

Execution can be recorded using 'perf record' with the cs_etm event,
specifying the name of the sink to record to, e.g.:

    perf record -e cs_etm/@tmc_etr0/u --per-thread
408 | |
409 | The 'perf report' and 'perf script' commands can be used to analyze execution, | |
410 | synthesizing instruction and branch events from the instruction trace. | |
411 | 'perf inject' can be used to replace the trace data with the synthesized events. | |
412 | The --itrace option controls the type and frequency of synthesized events | |
413 | (see perf documentation). | |
414 | ||
415 | Note that only 64-bit programs are currently supported - further work is | |
416 | required to support instruction decode of 32-bit Arm programs. | |
417 | ||
418 | ||
419 | Generating coverage files for Feedback Directed Optimization: AutoFDO | |
420 | --------------------------------------------------------------------- | |
421 | ||
422 | 'perf inject' accepts the --itrace option in which case tracing data is | |
423 | removed and replaced with the synthesized events. e.g. | |
424 | ||
425 | perf inject --itrace --strip -i perf.data -o perf.data.new | |
426 | ||
427 | Below is an example of using ARM ETM for autoFDO. It requires autofdo | |
428 | (https://github.com/google/autofdo) and gcc version 5. The bubble | |
429 | sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial). | |
430 | ||
    $ gcc-5 -O3 sort.c -o sort
    $ taskset -c 2 ./sort
    Bubble sorting array of 30000 elements
    5910 ms

    $ perf record -e cs_etm/@tmc_etr0/u --per-thread taskset -c 2 ./sort
    Bubble sorting array of 30000 elements
    12543 ms
    [ perf record: Woken up 35 times to write data ]
    [ perf record: Captured and wrote 69.640 MB perf.data ]

    $ perf inject -i perf.data -o inj.data --itrace=il64 --strip
    $ create_gcov --binary=./sort --profile=inj.data --gcov=sort.gcov -gcov_version=1
    $ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
    $ taskset -c 2 ./sort_autofdo
    Bubble sorting array of 30000 elements
    5806 ms
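
To put the three timings above in perspective, the arithmetic can be done
with a couple of small helpers. The function names are hypothetical and the
figures are simply the ones printed by this single run, so they should be
read as an illustration rather than a benchmark.

```shell
#!/bin/sh
# Percentage of run time saved going from BASE ms to NEW ms.
speedup_pct() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

# Ratio of the traced run time to the untraced run time.
slowdown_ratio() {
    awk -v base="$1" -v traced="$2" 'BEGIN { printf "%.2f\n", traced / base }'
}

speedup_pct 5910 5806       # AutoFDO build vs plain -O3 build
slowdown_ratio 5910 12543   # run under perf record vs untraced run
```

For this run the AutoFDO build is about 1.8% faster than the plain -O3 build,
while recording the trace made the program take about 2.12 times as long.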
448 | |
449 | ||
450 | How to use the STM module | |
451 | ------------------------- | |
452 | ||
453 | Using the System Trace Macrocell module is the same as the tracers - the only | |
454 | difference is that clients are driving the trace capture rather | |
455 | than the program flow through the code. | |
456 | ||
457 | As with any other CoreSight component, specifics about the STM tracer can be | |
458 | found in sysfs with more information on each entry being found in [1]: | |
459 | ||
    root@genericarmv8:~# ls /sys/bus/coresight/devices/stm0
    enable_source hwevent_select port_enable subsystem uevent
    hwevent_enable mgmt port_select traceid
    root@genericarmv8:~#
464 | ||
465 | Like any other source a sink needs to be identified and the STM enabled before | |
466 | being used: | |
467 | ||
    root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
    root@genericarmv8:~# echo 1 > /sys/bus/coresight/devices/stm0/enable_source
470 | |
471 | From there user space applications can request and use channels using the devfs | |
472 | interface provided for that purpose by the generic STM API: | |
473 | ||
    root@genericarmv8:~# ls -l /dev/stm0
    crw------- 1 root root 10, 61 Jan 3 18:11 /dev/stm0
    root@genericarmv8:~#
477 | ||
478 | Details on how to use the generic STM API can be found here [2]. | |
479 | ||
[1]. Documentation/ABI/testing/sysfs-bus-coresight-devices-stm
[2]. Documentation/trace/stm.rst
[3]. https://github.com/Linaro/perf-opencsd