Commit | Line | Data |
---|---|---|
b8df4a36 TZ |
1 | Event Histograms |
2 | ||
3 | Documentation written by Tom Zanussi | |
4 | ||
5 | 1. Introduction | |
6 | =============== | |
7 | ||
8 | Histogram triggers are special event triggers that can be used to | |
9 | aggregate trace event data into histograms. For information on | |
10 | trace events and event triggers, see Documentation/trace/events.txt. | |
11 | ||
12 | ||
13 | 2. Histogram Trigger Command | |
14 | ============================ | |
15 | ||
16 | A histogram trigger command is an event trigger command that | |
17 | aggregates event hits into a hash table keyed on one or more trace | |
18 | event format fields (or stacktrace) and a set of running totals | |
19 | derived from one or more trace event format fields and/or event | |
20 | counts (hitcount). | |
21 | ||
22 | The format of a hist trigger is as follows: | |
23 | ||
24 | hist:keys=<field1[,field2,...]>[:values=<field1[,field2,...]>] | |
25 | [:sort=<field1[,field2,...]>][:size=#entries][:pause][:continue] | |
26 | [:clear][:name=histname1] [if <filter>] | |
27 | ||
28 | When a matching event is hit, an entry is added to a hash table | |
29 | using the key(s) and value(s) named. Keys and values correspond to | |
30 | fields in the event's format description. Values must correspond to | |
31 | numeric fields - on an event hit, the value(s) will be added to a | |
32 | sum kept for that field. The special string 'hitcount' can be used | |
33 | in place of an explicit value field - this is simply a count of | |
34 | event hits. If 'values' isn't specified, an implicit 'hitcount' | |
35 | value will be automatically created and used as the only value. | |
36 | Keys can be any field, or the special string 'stacktrace', which | |
37 | will use the event's kernel stacktrace as the key. The keywords | |
38 | 'keys' or 'key' can be used to specify keys, and the keywords | |
39 | 'values', 'vals', or 'val' can be used to specify values. Compound | |
40 | keys consisting of up to two fields can be specified by the 'keys' | |
41 | keyword. Hashing a compound key produces a unique entry in the | |
42 | table for each unique combination of component keys, and can be | |
43 | useful for providing more fine-grained summaries of event data. | |
44 | Additionally, sort keys consisting of up to two fields can be | |
45 | specified by the 'sort' keyword. If more than one field is | |
46 | specified, the result will be a 'sort within a sort': the first key | |
47 | is taken to be the primary sort key and the second the secondary | |
48 | key. If a hist trigger is given a name using the 'name' parameter, | |
49 | its histogram data will be shared with other triggers of the same | |
50 | name, and trigger hits will update this common data. Only triggers | |
51 | with 'compatible' fields can be combined in this way; triggers are | |
52 | 'compatible' if the fields named in the trigger share the same | |
53 | number and type of fields and those fields also have the same names. | |
54 | Note that any two events always share the compatible 'hitcount' and | |
55 | 'stacktrace' fields and can therefore be combined using those | |
56 | fields, however pointless that may be. | |
57 | ||
58 | 'hist' triggers add a 'hist' file to each event's subdirectory. | |
59 | Reading the 'hist' file for the event will dump the hash table in | |
60 | its entirety to stdout. If there are multiple hist triggers | |
61 | attached to an event, there will be a table for each trigger in the | |
62 | output. The table displayed for a named trigger will be the same as | |
63 | any other instance having the same name. Each printed hash table | |
64 | entry is a simple list of the keys and values comprising the entry; | |
65 | keys are printed first and are delineated by curly braces, and are | |
66 | followed by the set of value fields for the entry. By default, | |
67 | numeric fields are displayed as base-10 integers. This can be | |
68 | modified by appending any of the following modifiers to the field | |
69 | name: | |
70 | ||
71 | .hex display a number as a hex value | |
72 | .sym display an address as a symbol | |
73 | .sym-offset display an address as a symbol and offset | |
74 | .syscall display a syscall id as a system call name | |
75 | .execname display a common_pid as a program name | |
442c9484 | 76 | .log2 display log2 value rather than raw number |
860f9f6b | 77 | .usecs display a common_timestamp in microseconds |
b8df4a36 TZ |
78 | |
79 | Note that in general the semantics of a given field aren't | |
80 | interpreted when applying a modifier to it, but there are some | |
81 | restrictions to be aware of in this regard: | |
82 | ||
83 | - only the 'hex' modifier can be used for values (because values | |
84 | are essentially sums, and the other modifiers don't make sense | |
85 | in that context). | |
86 | - the 'execname' modifier can only be used on a 'common_pid'. The | |
87 | reason for this is that the execname is simply the 'comm' value | |
88 | saved for the 'current' process when an event was triggered, | |
89 | which is the same as the common_pid value saved by the event | |
90 | tracing code. Trying to apply that comm value to other pid | |
91 | values wouldn't be correct, and typically events that care save | |
92 | pid-specific comm fields in the event itself. | |
93 | ||
94 | A typical usage scenario would be the following to enable a hist | |
95 | trigger, read its current contents, and then turn it off: | |
96 | ||
97 | # echo 'hist:keys=skbaddr.hex:vals=len' > \ | |
98 | /sys/kernel/debug/tracing/events/net/netif_rx/trigger | |
99 | ||
100 | # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist | |
101 | ||
102 | # echo '!hist:keys=skbaddr.hex:vals=len' > \ | |
103 | /sys/kernel/debug/tracing/events/net/netif_rx/trigger | |
104 | ||
105 | The trigger file itself can be read to show the details of the | |
106 | currently attached hist trigger. This information is also displayed | |
107 | at the top of the 'hist' file when read. | |
108 | ||
109 | By default, the size of the hash table is 2048 entries. The 'size' | |
110 | parameter can be used to specify more or fewer than that. The units | |
111 | are in terms of hashtable entries - if a run uses more entries than | |
112 | specified, the results will show the number of 'drops', the number | |
113 | of hits that were ignored. The size should be a power of 2 between | |
114 | 128 and 131072 (any non- power-of-2 number specified will be rounded | |
115 | up). | |
116 | ||
117 | The 'sort' parameter can be used to specify a value field to sort | |
118 | on. The default if unspecified is 'hitcount' and the default sort | |
119 | order is 'ascending'. To sort in the opposite direction, append | |
120 | .descending' to the sort key. | |
121 | ||
122 | The 'pause' parameter can be used to pause an existing hist trigger | |
123 | or to start a hist trigger but not log any events until told to do | |
124 | so. 'continue' or 'cont' can be used to start or restart a paused | |
125 | hist trigger. | |
126 | ||
127 | The 'clear' parameter will clear the contents of a running hist | |
128 | trigger and leave its current paused/active state. | |
129 | ||
130 | Note that the 'pause', 'cont', and 'clear' parameters should be | |
131 | applied using 'append' shell operator ('>>') if applied to an | |
132 | existing trigger, rather than via the '>' operator, which will cause | |
133 | the trigger to be removed through truncation. | |
134 | ||
135 | - enable_hist/disable_hist | |
136 | ||
137 | The enable_hist and disable_hist triggers can be used to have one | |
138 | event conditionally start and stop another event's already-attached | |
139 | hist trigger. Any number of enable_hist and disable_hist triggers | |
140 | can be attached to a given event, allowing that event to kick off | |
141 | and stop aggregations on a host of other events. | |
142 | ||
143 | The format is very similar to the enable/disable_event triggers: | |
144 | ||
145 | enable_hist:<system>:<event>[:count] | |
146 | disable_hist:<system>:<event>[:count] | |
147 | ||
148 | Instead of enabling or disabling the tracing of the target event | |
149 | into the trace buffer as the enable/disable_event triggers do, the | |
150 | enable/disable_hist triggers enable or disable the aggregation of | |
151 | the target event into a hash table. | |
152 | ||
153 | A typical usage scenario for the enable_hist/disable_hist triggers | |
154 | would be to first set up a paused hist trigger on some event, | |
155 | followed by an enable_hist/disable_hist pair that turns the hist | |
156 | aggregation on and off when conditions of interest are hit: | |
157 | ||
158 | # echo 'hist:keys=skbaddr.hex:vals=len:pause' > \ | |
159 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger | |
160 | ||
161 | # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \ | |
162 | /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger | |
163 | ||
164 | # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \ | |
165 | /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger | |
166 | ||
167 | The above sets up an initially paused hist trigger which is unpaused | |
168 | and starts aggregating events when a given program is executed, and | |
169 | which stops aggregating when the process exits and the hist trigger | |
170 | is paused again. | |
171 | ||
172 | The examples below provide a more concrete illustration of the | |
173 | concepts and typical usage patterns discussed above. | |
174 | ||
8b7622bf TZ |
175 | 'special' event fields |
176 | ------------------------ | |
177 | ||
178 | There are a number of 'special event fields' available for use as | |
179 | keys or values in a hist trigger. These look like and behave as if | |
180 | they were actual event fields, but aren't really part of the event's | |
181 | field definition or format file. They are however available for any | |
182 | event, and can be used anywhere an actual event field could be. | |
183 | They are: | |
184 | ||
185 | common_timestamp u64 - timestamp (from ring buffer) associated | |
186 | with the event, in nanoseconds. May be | |
187 | modified by .usecs to have timestamps | |
188 | interpreted as microseconds. | |
189 | cpu int - the cpu on which the event occurred. | |
b8df4a36 | 190 | |
f404da6e TZ |
191 | Extended error information |
192 | -------------------------- | |
193 | ||
194 | For some error conditions encountered when invoking a hist trigger | |
195 | command, extended error information is available via the | |
196 | corresponding event's 'hist' file. Reading the hist file after an | |
197 | error will display more detailed information about what went wrong, | |
198 | if information is available. This extended error information will | |
199 | be available until the next hist trigger command for that event. | |
200 | ||
201 | If available for a given error condition, the extended error | |
202 | information and usage takes the following form: | |
203 | ||
204 | # echo xxx > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger | |
205 | echo: write error: Invalid argument | |
206 | ||
207 | # cat /sys/kernel/debug/tracing/events/sched/sched_wakeup/hist | |
208 | ERROR: Couldn't yyy: zzz | |
209 | Last command: xxx | |
210 | ||
b8df4a36 TZ |
211 | 6.2 'hist' trigger examples |
212 | --------------------------- | |
213 | ||
214 | The first set of examples creates aggregations using the kmalloc | |
215 | event. The fields that can be used for the hist trigger are listed | |
216 | in the kmalloc event's format file: | |
217 | ||
218 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/format | |
219 | name: kmalloc | |
220 | ID: 374 | |
221 | format: | |
222 | field:unsigned short common_type; offset:0; size:2; signed:0; | |
223 | field:unsigned char common_flags; offset:2; size:1; signed:0; | |
224 | field:unsigned char common_preempt_count; offset:3; size:1; signed:0; | |
225 | field:int common_pid; offset:4; size:4; signed:1; | |
226 | ||
227 | field:unsigned long call_site; offset:8; size:8; signed:0; | |
228 | field:const void * ptr; offset:16; size:8; signed:0; | |
229 | field:size_t bytes_req; offset:24; size:8; signed:0; | |
230 | field:size_t bytes_alloc; offset:32; size:8; signed:0; | |
231 | field:gfp_t gfp_flags; offset:40; size:4; signed:0; | |
232 | ||
233 | We'll start by creating a hist trigger that generates a simple table | |
234 | that lists the total number of bytes requested for each function in | |
235 | the kernel that made one or more calls to kmalloc: | |
236 | ||
237 | # echo 'hist:key=call_site:val=bytes_req' > \ | |
238 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger | |
239 | ||
240 | This tells the tracing system to create a 'hist' trigger using the | |
241 | call_site field of the kmalloc event as the key for the table, which | |
242 | just means that each unique call_site address will have an entry | |
243 | created for it in the table. The 'val=bytes_req' parameter tells | |
244 | the hist trigger that for each unique entry (call_site) in the | |
245 | table, it should keep a running total of the number of bytes | |
246 | requested by that call_site. | |
247 | ||
248 | We'll let it run for awhile and then dump the contents of the 'hist' | |
249 | file in the kmalloc event's subdirectory (for readability, a number | |
250 | of entries have been omitted): | |
251 | ||
252 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist | |
253 | # trigger info: hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active] | |
254 | ||
255 | { call_site: 18446744072106379007 } hitcount: 1 bytes_req: 176 | |
256 | { call_site: 18446744071579557049 } hitcount: 1 bytes_req: 1024 | |
257 | { call_site: 18446744071580608289 } hitcount: 1 bytes_req: 16384 | |
258 | { call_site: 18446744071581827654 } hitcount: 1 bytes_req: 24 | |
259 | { call_site: 18446744071580700980 } hitcount: 1 bytes_req: 8 | |
260 | { call_site: 18446744071579359876 } hitcount: 1 bytes_req: 152 | |
261 | { call_site: 18446744071580795365 } hitcount: 3 bytes_req: 144 | |
262 | { call_site: 18446744071581303129 } hitcount: 3 bytes_req: 144 | |
263 | { call_site: 18446744071580713234 } hitcount: 4 bytes_req: 2560 | |
264 | { call_site: 18446744071580933750 } hitcount: 4 bytes_req: 736 | |
265 | . | |
266 | . | |
267 | . | |
268 | { call_site: 18446744072106047046 } hitcount: 69 bytes_req: 5576 | |
269 | { call_site: 18446744071582116407 } hitcount: 73 bytes_req: 2336 | |
270 | { call_site: 18446744072106054684 } hitcount: 136 bytes_req: 140504 | |
271 | { call_site: 18446744072106224230 } hitcount: 136 bytes_req: 19584 | |
272 | { call_site: 18446744072106078074 } hitcount: 153 bytes_req: 2448 | |
273 | { call_site: 18446744072106062406 } hitcount: 153 bytes_req: 36720 | |
274 | { call_site: 18446744071582507929 } hitcount: 153 bytes_req: 37088 | |
275 | { call_site: 18446744072102520590 } hitcount: 273 bytes_req: 10920 | |
276 | { call_site: 18446744071582143559 } hitcount: 358 bytes_req: 716 | |
277 | { call_site: 18446744072106465852 } hitcount: 417 bytes_req: 56712 | |
278 | { call_site: 18446744072102523378 } hitcount: 485 bytes_req: 27160 | |
279 | { call_site: 18446744072099568646 } hitcount: 1676 bytes_req: 33520 | |
280 | ||
281 | Totals: | |
282 | Hits: 4610 | |
283 | Entries: 45 | |
284 | Dropped: 0 | |
285 | ||
286 | The output displays a line for each entry, beginning with the key | |
287 | specified in the trigger, followed by the value(s) also specified in | |
288 | the trigger. At the beginning of the output is a line that displays | |
289 | the trigger info, which can also be displayed by reading the | |
290 | 'trigger' file: | |
291 | ||
292 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger | |
293 | hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active] | |
294 | ||
295 | At the end of the output are a few lines that display the overall | |
296 | totals for the run. The 'Hits' field shows the total number of | |
297 | times the event trigger was hit, the 'Entries' field shows the total | |
298 | number of used entries in the hash table, and the 'Dropped' field | |
299 | shows the number of hits that were dropped because the number of | |
300 | used entries for the run exceeded the maximum number of entries | |
301 | allowed for the table (normally 0, but if not a hint that you may | |
302 | want to increase the size of the table using the 'size' parameter). | |
303 | ||
304 | Notice in the above output that there's an extra field, 'hitcount', | |
305 | which wasn't specified in the trigger. Also notice that in the | |
306 | trigger info output, there's a parameter, 'sort=hitcount', which | |
307 | wasn't specified in the trigger either. The reason for that is that | |
308 | every trigger implicitly keeps a count of the total number of hits | |
309 | attributed to a given entry, called the 'hitcount'. That hitcount | |
310 | information is explicitly displayed in the output, and in the | |
311 | absence of a user-specified sort parameter, is used as the default | |
312 | sort field. | |
313 | ||
314 | The value 'hitcount' can be used in place of an explicit value in | |
315 | the 'values' parameter if you don't really need to have any | |
316 | particular field summed and are mainly interested in hit | |
317 | frequencies. | |
318 | ||
319 | To turn the hist trigger off, simply call up the trigger in the | |
320 | command history and re-execute it with a '!' prepended: | |
321 | ||
322 | # echo '!hist:key=call_site:val=bytes_req' > \ | |
323 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger | |
324 | ||
325 | Finally, notice that the call_site as displayed in the output above | |
326 | isn't really very useful. It's an address, but normally addresses | |
327 | are displayed in hex. To have a numeric field displayed as a hex | |
328 | value, simply append '.hex' to the field name in the trigger: | |
329 | ||
330 | # echo 'hist:key=call_site.hex:val=bytes_req' > \ | |
331 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger | |
332 | ||
333 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist | |
334 | # trigger info: hist:keys=call_site.hex:vals=bytes_req:sort=hitcount:size=2048 [active] | |
335 | ||
336 | { call_site: ffffffffa026b291 } hitcount: 1 bytes_req: 433 | |
337 | { call_site: ffffffffa07186ff } hitcount: 1 bytes_req: 176 | |
338 | { call_site: ffffffff811ae721 } hitcount: 1 bytes_req: 16384 | |
339 | { call_site: ffffffff811c5134 } hitcount: 1 bytes_req: 8 | |
340 | { call_site: ffffffffa04a9ebb } hitcount: 1 bytes_req: 511 | |
341 | { call_site: ffffffff8122e0a6 } hitcount: 1 bytes_req: 12 | |
342 | { call_site: ffffffff8107da84 } hitcount: 1 bytes_req: 152 | |
343 | { call_site: ffffffff812d8246 } hitcount: 1 bytes_req: 24 | |
344 | { call_site: ffffffff811dc1e5 } hitcount: 3 bytes_req: 144 | |
345 | { call_site: ffffffffa02515e8 } hitcount: 3 bytes_req: 648 | |
346 | { call_site: ffffffff81258159 } hitcount: 3 bytes_req: 144 | |
347 | { call_site: ffffffff811c80f4 } hitcount: 4 bytes_req: 544 | |
348 | . | |
349 | . | |
350 | . | |
351 | { call_site: ffffffffa06c7646 } hitcount: 106 bytes_req: 8024 | |
352 | { call_site: ffffffffa06cb246 } hitcount: 132 bytes_req: 31680 | |
353 | { call_site: ffffffffa06cef7a } hitcount: 132 bytes_req: 2112 | |
354 | { call_site: ffffffff8137e399 } hitcount: 132 bytes_req: 23232 | |
355 | { call_site: ffffffffa06c941c } hitcount: 185 bytes_req: 171360 | |
356 | { call_site: ffffffffa06f2a66 } hitcount: 185 bytes_req: 26640 | |
357 | { call_site: ffffffffa036a70e } hitcount: 265 bytes_req: 10600 | |
358 | { call_site: ffffffff81325447 } hitcount: 292 bytes_req: 584 | |
359 | { call_site: ffffffffa072da3c } hitcount: 446 bytes_req: 60656 | |
360 | { call_site: ffffffffa036b1f2 } hitcount: 526 bytes_req: 29456 | |
361 | { call_site: ffffffffa0099c06 } hitcount: 1780 bytes_req: 35600 | |
362 | ||
363 | Totals: | |
364 | Hits: 4775 | |
365 | Entries: 46 | |
366 | Dropped: 0 | |
367 | ||
368 | Even that's only marginally more useful - while hex values do look | |
369 | more like addresses, what users are typically more interested in | |
370 | when looking at text addresses are the corresponding symbols | |
371 | instead. To have an address displayed as symbolic value instead, | |
372 | simply append '.sym' or '.sym-offset' to the field name in the | |
373 | trigger: | |
374 | ||
375 | # echo 'hist:key=call_site.sym:val=bytes_req' > \ | |
376 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger | |
377 | ||
378 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist | |
379 | # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=hitcount:size=2048 [active] | |
380 | ||
381 | { call_site: [ffffffff810adcb9] syslog_print_all } hitcount: 1 bytes_req: 1024 | |
382 | { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8 | |
383 | { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7 | |
384 | { call_site: [ffffffff8154acbe] usb_alloc_urb } hitcount: 1 bytes_req: 192 | |
385 | { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7 | |
386 | { call_site: [ffffffff811e3a25] __seq_open_private } hitcount: 1 bytes_req: 40 | |
387 | { call_site: [ffffffff8109524a] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 | |
388 | { call_site: [ffffffff811febd5] fsnotify_alloc_group } hitcount: 2 bytes_req: 528 | |
389 | { call_site: [ffffffff81440f58] __tty_buffer_request_room } hitcount: 2 bytes_req: 2624 | |
390 | { call_site: [ffffffff81200ba6] inotify_new_group } hitcount: 2 bytes_req: 96 | |
391 | { call_site: [ffffffffa05e19af] ieee80211_start_tx_ba_session [mac80211] } hitcount: 2 bytes_req: 464 | |
392 | { call_site: [ffffffff81672406] tcp_get_metrics } hitcount: 2 bytes_req: 304 | |
393 | { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 | |
394 | { call_site: [ffffffff81089b05] sched_create_group } hitcount: 2 bytes_req: 1424 | |
395 | . | |
396 | . | |
397 | . | |
398 | { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 1185 bytes_req: 123240 | |
399 | { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl [drm] } hitcount: 1185 bytes_req: 104280 | |
400 | { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 1402 bytes_req: 190672 | |
401 | { call_site: [ffffffff812891ca] ext4_find_extent } hitcount: 1518 bytes_req: 146208 | |
402 | { call_site: [ffffffffa029070e] drm_vma_node_allow [drm] } hitcount: 1746 bytes_req: 69840 | |
403 | { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 2021 bytes_req: 792312 | |
404 | { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 2592 bytes_req: 145152 | |
405 | { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 2629 bytes_req: 378576 | |
406 | { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2629 bytes_req: 3783248 | |
407 | { call_site: [ffffffff81325607] apparmor_file_alloc_security } hitcount: 5192 bytes_req: 10384 | |
408 | { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 5529 bytes_req: 110584 | |
409 | { call_site: [ffffffff8131ebf7] aa_alloc_task_context } hitcount: 21943 bytes_req: 702176 | |
410 | { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 55759 bytes_req: 5074265 | |
411 | ||
412 | Totals: | |
413 | Hits: 109928 | |
414 | Entries: 71 | |
415 | Dropped: 0 | |
416 | ||
417 | Because the default sort key above is 'hitcount', the above shows a | |
418 | the list of call_sites by increasing hitcount, so that at the bottom | |
419 | we see the functions that made the most kmalloc calls during the | |
420 | run. If instead we we wanted to see the top kmalloc callers in | |
421 | terms of the number of bytes requested rather than the number of | |
422 | calls, and we wanted the top caller to appear at the top, we can use | |
423 | the 'sort' parameter, along with the 'descending' modifier: | |
424 | ||
425 | # echo 'hist:key=call_site.sym:val=bytes_req:sort=bytes_req.descending' > \ | |
426 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger | |
427 | ||
428 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist | |
429 | # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=bytes_req.descending:size=2048 [active] | |
430 | ||
431 | { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2186 bytes_req: 3397464 | |
432 | { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 1790 bytes_req: 712176 | |
433 | { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 8132 bytes_req: 513135 | |
434 | { call_site: [ffffffff811e2a1b] seq_buf_alloc } hitcount: 106 bytes_req: 440128 | |
435 | { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 2186 bytes_req: 314784 | |
436 | { call_site: [ffffffff812891ca] ext4_find_extent } hitcount: 2174 bytes_req: 208992 | |
437 | { call_site: [ffffffff811ae8e1] __kmalloc } hitcount: 8 bytes_req: 131072 | |
438 | { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 859 bytes_req: 116824 | |
439 | { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 1834 bytes_req: 102704 | |
440 | { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 972 bytes_req: 101088 | |
441 | { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl [drm] } hitcount: 972 bytes_req: 85536 | |
442 | { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 3333 bytes_req: 66664 | |
443 | { call_site: [ffffffff8137e559] sg_kmalloc } hitcount: 209 bytes_req: 61632 | |
444 | . | |
445 | . | |
446 | . | |
447 | { call_site: [ffffffff81095225] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 | |
448 | { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 | |
449 | { call_site: [ffffffff812d8406] copy_semundo } hitcount: 2 bytes_req: 48 | |
450 | { call_site: [ffffffff81200ba6] inotify_new_group } hitcount: 1 bytes_req: 48 | |
451 | { call_site: [ffffffffa027121a] drm_getmagic [drm] } hitcount: 1 bytes_req: 48 | |
452 | { call_site: [ffffffff811e3a25] __seq_open_private } hitcount: 1 bytes_req: 40 | |
453 | { call_site: [ffffffff811c52f4] bprm_change_interp } hitcount: 2 bytes_req: 16 | |
454 | { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8 | |
455 | { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7 | |
456 | { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7 | |
457 | ||
458 | Totals: | |
459 | Hits: 32133 | |
460 | Entries: 81 | |
461 | Dropped: 0 | |
462 | ||
463 | To display the offset and size information in addition to the symbol | |
464 | name, just use 'sym-offset' instead: | |
465 | ||
466 | # echo 'hist:key=call_site.sym-offset:val=bytes_req:sort=bytes_req.descending' > \ | |
467 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger | |
468 | ||
469 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist | |
470 | # trigger info: hist:keys=call_site.sym-offset:vals=bytes_req:sort=bytes_req.descending:size=2048 [active] | |
471 | ||
472 | { call_site: [ffffffffa046041c] i915_gem_execbuffer2+0x6c/0x2c0 [i915] } hitcount: 4569 bytes_req: 3163720 | |
473 | { call_site: [ffffffffa0489a66] intel_ring_begin+0xc6/0x1f0 [i915] } hitcount: 4569 bytes_req: 657936 | |
474 | { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23+0x694/0x1020 [i915] } hitcount: 1519 bytes_req: 472936 | |
475 | { call_site: [ffffffffa045e646] i915_gem_do_execbuffer.isra.23+0x516/0x1020 [i915] } hitcount: 3050 bytes_req: 211832 | |
476 | { call_site: [ffffffff811e2a1b] seq_buf_alloc+0x1b/0x50 } hitcount: 34 bytes_req: 148384 | |
477 | { call_site: [ffffffffa04a580c] intel_crtc_page_flip+0xbc/0x870 [i915] } hitcount: 1385 bytes_req: 144040 | |
478 | { call_site: [ffffffff811ae8e1] __kmalloc+0x191/0x1b0 } hitcount: 8 bytes_req: 131072 | |
479 | { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl+0x282/0x360 [drm] } hitcount: 1385 bytes_req: 121880 | |
480 | { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc+0x32/0x100 [drm] } hitcount: 1848 bytes_req: 103488 | |
481 | { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state+0x2c/0xa0 [i915] } hitcount: 461 bytes_req: 62696 | |
482 | { call_site: [ffffffffa029070e] drm_vma_node_allow+0x2e/0xd0 [drm] } hitcount: 1541 bytes_req: 61640 | |
483 | { call_site: [ffffffff815f8d7b] sk_prot_alloc+0xcb/0x1b0 } hitcount: 57 bytes_req: 57456 | |
484 | . | |
485 | . | |
486 | . | |
487 | { call_site: [ffffffff8109524a] alloc_fair_sched_group+0x5a/0x1a0 } hitcount: 2 bytes_req: 128 | |
488 | { call_site: [ffffffffa027b921] drm_vm_open_locked+0x31/0xa0 [drm] } hitcount: 3 bytes_req: 96 | |
489 | { call_site: [ffffffff8122e266] proc_self_follow_link+0x76/0xb0 } hitcount: 8 bytes_req: 96 | |
490 | { call_site: [ffffffff81213e80] load_elf_binary+0x240/0x1650 } hitcount: 3 bytes_req: 84 | |
491 | { call_site: [ffffffff8154bc62] usb_control_msg+0x42/0x110 } hitcount: 1 bytes_req: 8 | |
492 | { call_site: [ffffffffa00bf6fe] hidraw_send_report+0x7e/0x1a0 [hid] } hitcount: 1 bytes_req: 7 | |
493 | { call_site: [ffffffffa00bf1ca] hidraw_report_event+0x8a/0x120 [hid] } hitcount: 1 bytes_req: 7 | |
494 | ||
495 | Totals: | |
496 | Hits: 26098 | |
497 | Entries: 64 | |
498 | Dropped: 0 | |
499 | ||
500 | We can also add multiple fields to the 'values' parameter. For | |
501 | example, we might want to see the total number of bytes allocated | |
502 | alongside bytes requested, and display the result sorted by bytes | |
503 | allocated in a descending order: | |
504 | ||
505 | # echo 'hist:keys=call_site.sym:values=bytes_req,bytes_alloc:sort=bytes_alloc.descending' > \ | |
506 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger | |
507 | ||
508 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist | |
509 | # trigger info: hist:keys=call_site.sym:vals=bytes_req,bytes_alloc:sort=bytes_alloc.descending:size=2048 [active] | |
510 | ||
511 | { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 7403 bytes_req: 4084360 bytes_alloc: 5958016 | |
512 | { call_site: [ffffffff811e2a1b] seq_buf_alloc } hitcount: 541 bytes_req: 2213968 bytes_alloc: 2228224 | |
513 | { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 7404 bytes_req: 1066176 bytes_alloc: 1421568 | |
514 | { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 1565 bytes_req: 557368 bytes_alloc: 1037760 | |
515 | { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 9557 bytes_req: 595778 bytes_alloc: 695744 | |
516 | { call_site: [ffffffffa045e646] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 5839 bytes_req: 430680 bytes_alloc: 470400 | |
517 | { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 2388 bytes_req: 324768 bytes_alloc: 458496 | |
518 | { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 3911 bytes_req: 219016 bytes_alloc: 250304 | |
519 | { call_site: [ffffffff815f8d7b] sk_prot_alloc } hitcount: 235 bytes_req: 236880 bytes_alloc: 240640 | |
520 | { call_site: [ffffffff8137e559] sg_kmalloc } hitcount: 557 bytes_req: 169024 bytes_alloc: 221760 | |
521 | { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 9378 bytes_req: 187548 bytes_alloc: 206312 | |
522 | { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 1519 bytes_req: 157976 bytes_alloc: 194432 | |
523 | . | |
524 | . | |
525 | . | |
526 | { call_site: [ffffffff8109bd3b] sched_autogroup_create_attach } hitcount: 2 bytes_req: 144 bytes_alloc: 192 | |
527 | { call_site: [ffffffff81097ee8] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 | |
528 | { call_site: [ffffffff8109524a] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 | |
529 | { call_site: [ffffffff81095225] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 | |
530 | { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 | |
531 | { call_site: [ffffffff81213e80] load_elf_binary } hitcount: 3 bytes_req: 84 bytes_alloc: 96 | |
532 | { call_site: [ffffffff81079a2e] kthread_create_on_node } hitcount: 1 bytes_req: 56 bytes_alloc: 64 | |
533 | { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7 bytes_alloc: 8 | |
534 | { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8 bytes_alloc: 8 | |
535 | { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7 bytes_alloc: 8 | |
536 | ||
537 | Totals: | |
538 | Hits: 66598 | |
539 | Entries: 65 | |
540 | Dropped: 0 | |
541 | ||
542 | Finally, to finish off our kmalloc example, instead of simply having | |
543 | the hist trigger display symbolic call_sites, we can have the hist | |
544 | trigger additionally display the complete set of kernel stack traces | |
545 | that led to each call_site. To do that, we simply use the special | |
546 | value 'stacktrace' for the key parameter: | |
547 | ||
548 | # echo 'hist:keys=stacktrace:values=bytes_req,bytes_alloc:sort=bytes_alloc' > \ | |
549 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger | |
550 | ||
551 | The above trigger will use the kernel stack trace in effect when an | |
552 | event is triggered as the key for the hash table. This allows the | |
553 | enumeration of every kernel callpath that led up to a particular | |
554 | event, along with a running total of any of the event fields for | |
555 | that event. Here we tally bytes requested and bytes allocated for | |
556 | every callpath in the system that led up to a kmalloc (in this case | |
557 | every callpath to a kmalloc for a kernel compile): | |
558 | ||
559 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist | |
560 | # trigger info: hist:keys=stacktrace:vals=bytes_req,bytes_alloc:sort=bytes_alloc:size=2048 [active] | |
561 | ||
562 | { stacktrace: | |
563 | __kmalloc_track_caller+0x10b/0x1a0 | |
564 | kmemdup+0x20/0x50 | |
565 | hidraw_report_event+0x8a/0x120 [hid] | |
566 | hid_report_raw_event+0x3ea/0x440 [hid] | |
567 | hid_input_report+0x112/0x190 [hid] | |
568 | hid_irq_in+0xc2/0x260 [usbhid] | |
569 | __usb_hcd_giveback_urb+0x72/0x120 | |
570 | usb_giveback_urb_bh+0x9e/0xe0 | |
571 | tasklet_hi_action+0xf8/0x100 | |
572 | __do_softirq+0x114/0x2c0 | |
573 | irq_exit+0xa5/0xb0 | |
574 | do_IRQ+0x5a/0xf0 | |
575 | ret_from_intr+0x0/0x30 | |
576 | cpuidle_enter+0x17/0x20 | |
577 | cpu_startup_entry+0x315/0x3e0 | |
578 | rest_init+0x7c/0x80 | |
579 | } hitcount: 3 bytes_req: 21 bytes_alloc: 24 | |
580 | { stacktrace: | |
581 | __kmalloc_track_caller+0x10b/0x1a0 | |
582 | kmemdup+0x20/0x50 | |
583 | hidraw_report_event+0x8a/0x120 [hid] | |
584 | hid_report_raw_event+0x3ea/0x440 [hid] | |
585 | hid_input_report+0x112/0x190 [hid] | |
586 | hid_irq_in+0xc2/0x260 [usbhid] | |
587 | __usb_hcd_giveback_urb+0x72/0x120 | |
588 | usb_giveback_urb_bh+0x9e/0xe0 | |
589 | tasklet_hi_action+0xf8/0x100 | |
590 | __do_softirq+0x114/0x2c0 | |
591 | irq_exit+0xa5/0xb0 | |
592 | do_IRQ+0x5a/0xf0 | |
593 | ret_from_intr+0x0/0x30 | |
594 | } hitcount: 3 bytes_req: 21 bytes_alloc: 24 | |
595 | { stacktrace: | |
596 | kmem_cache_alloc_trace+0xeb/0x150 | |
597 | aa_alloc_task_context+0x27/0x40 | |
598 | apparmor_cred_prepare+0x1f/0x50 | |
599 | security_prepare_creds+0x16/0x20 | |
600 | prepare_creds+0xdf/0x1a0 | |
601 | SyS_capset+0xb5/0x200 | |
602 | system_call_fastpath+0x12/0x6a | |
603 | } hitcount: 1 bytes_req: 32 bytes_alloc: 32 | |
604 | . | |
605 | . | |
606 | . | |
607 | { stacktrace: | |
608 | __kmalloc+0x11b/0x1b0 | |
609 | i915_gem_execbuffer2+0x6c/0x2c0 [i915] | |
610 | drm_ioctl+0x349/0x670 [drm] | |
611 | do_vfs_ioctl+0x2f0/0x4f0 | |
612 | SyS_ioctl+0x81/0xa0 | |
613 | system_call_fastpath+0x12/0x6a | |
614 | } hitcount: 17726 bytes_req: 13944120 bytes_alloc: 19593808 | |
615 | { stacktrace: | |
616 | __kmalloc+0x11b/0x1b0 | |
617 | load_elf_phdrs+0x76/0xa0 | |
618 | load_elf_binary+0x102/0x1650 | |
619 | search_binary_handler+0x97/0x1d0 | |
620 | do_execveat_common.isra.34+0x551/0x6e0 | |
621 | SyS_execve+0x3a/0x50 | |
622 | return_from_execve+0x0/0x23 | |
623 | } hitcount: 33348 bytes_req: 17152128 bytes_alloc: 20226048 | |
624 | { stacktrace: | |
625 | kmem_cache_alloc_trace+0xeb/0x150 | |
626 | apparmor_file_alloc_security+0x27/0x40 | |
627 | security_file_alloc+0x16/0x20 | |
628 | get_empty_filp+0x93/0x1c0 | |
629 | path_openat+0x31/0x5f0 | |
630 | do_filp_open+0x3a/0x90 | |
631 | do_sys_open+0x128/0x220 | |
632 | SyS_open+0x1e/0x20 | |
633 | system_call_fastpath+0x12/0x6a | |
634 | } hitcount: 4766422 bytes_req: 9532844 bytes_alloc: 38131376 | |
635 | { stacktrace: | |
636 | __kmalloc+0x11b/0x1b0 | |
637 | seq_buf_alloc+0x1b/0x50 | |
638 | seq_read+0x2cc/0x370 | |
639 | proc_reg_read+0x3d/0x80 | |
640 | __vfs_read+0x28/0xe0 | |
641 | vfs_read+0x86/0x140 | |
642 | SyS_read+0x46/0xb0 | |
643 | system_call_fastpath+0x12/0x6a | |
644 | } hitcount: 19133 bytes_req: 78368768 bytes_alloc: 78368768 | |
645 | ||
646 | Totals: | |
647 | Hits: 6085872 | |
648 | Entries: 253 | |
649 | Dropped: 0 | |
650 | ||
651 | If you key a hist trigger on common_pid, in order for example to | |
652 | gather and display sorted totals for each process, you can use the | |
653 | special .execname modifier to display the executable names for the | |
654 | processes in the table rather than raw pids. The example below | |
655 | keeps a per-process sum of total bytes read: | |
656 | ||
657 | # echo 'hist:key=common_pid.execname:val=count:sort=count.descending' > \ | |
658 | /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger | |
659 | ||
660 | # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/hist | |
661 | # trigger info: hist:keys=common_pid.execname:vals=count:sort=count.descending:size=2048 [active] | |
662 | ||
663 | { common_pid: gnome-terminal [ 3196] } hitcount: 280 count: 1093512 | |
664 | { common_pid: Xorg [ 1309] } hitcount: 525 count: 256640 | |
665 | { common_pid: compiz [ 2889] } hitcount: 59 count: 254400 | |
666 | { common_pid: bash [ 8710] } hitcount: 3 count: 66369 | |
667 | { common_pid: dbus-daemon-lau [ 8703] } hitcount: 49 count: 47739 | |
668 | { common_pid: irqbalance [ 1252] } hitcount: 27 count: 27648 | |
669 | { common_pid: 01ifupdown [ 8705] } hitcount: 3 count: 17216 | |
670 | { common_pid: dbus-daemon [ 772] } hitcount: 10 count: 12396 | |
671 | { common_pid: Socket Thread [ 8342] } hitcount: 11 count: 11264 | |
672 | { common_pid: nm-dhcp-client. [ 8701] } hitcount: 6 count: 7424 | |
673 | { common_pid: gmain [ 1315] } hitcount: 18 count: 6336 | |
674 | . | |
675 | . | |
676 | . | |
677 | { common_pid: postgres [ 1892] } hitcount: 2 count: 32 | |
678 | { common_pid: postgres [ 1891] } hitcount: 2 count: 32 | |
679 | { common_pid: gmain [ 8704] } hitcount: 2 count: 32 | |
680 | { common_pid: upstart-dbus-br [ 2740] } hitcount: 21 count: 21 | |
681 | { common_pid: nm-dispatcher.a [ 8696] } hitcount: 1 count: 16 | |
682 | { common_pid: indicator-datet [ 2904] } hitcount: 1 count: 16 | |
683 | { common_pid: gdbus [ 2998] } hitcount: 1 count: 16 | |
684 | { common_pid: rtkit-daemon [ 2052] } hitcount: 1 count: 8 | |
685 | { common_pid: init [ 1] } hitcount: 2 count: 2 | |
686 | ||
687 | Totals: | |
688 | Hits: 2116 | |
689 | Entries: 51 | |
690 | Dropped: 0 | |
691 | ||
692 | Similarly, if you key a hist trigger on syscall id, for example to | |
693 | gather and display a list of systemwide syscall hits, you can use | |
694 | the special .syscall modifier to display the syscall names rather | |
695 | than raw ids. The example below keeps a running total of syscall | |
696 | counts for the system during the run: | |
697 | ||
698 | # echo 'hist:key=id.syscall:val=hitcount' > \ | |
699 | /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger | |
700 | ||
701 | # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist | |
702 | # trigger info: hist:keys=id.syscall:vals=hitcount:sort=hitcount:size=2048 [active] | |
703 | ||
704 | { id: sys_fsync [ 74] } hitcount: 1 | |
705 | { id: sys_newuname [ 63] } hitcount: 1 | |
706 | { id: sys_prctl [157] } hitcount: 1 | |
707 | { id: sys_statfs [137] } hitcount: 1 | |
708 | { id: sys_symlink [ 88] } hitcount: 1 | |
709 | { id: sys_sendmmsg [307] } hitcount: 1 | |
710 | { id: sys_semctl [ 66] } hitcount: 1 | |
711 | { id: sys_readlink [ 89] } hitcount: 3 | |
712 | { id: sys_bind [ 49] } hitcount: 3 | |
713 | { id: sys_getsockname [ 51] } hitcount: 3 | |
714 | { id: sys_unlink [ 87] } hitcount: 3 | |
715 | { id: sys_rename [ 82] } hitcount: 4 | |
716 | { id: unknown_syscall [ 58] } hitcount: 4 | |
717 | { id: sys_connect [ 42] } hitcount: 4 | |
718 | { id: sys_getpid [ 39] } hitcount: 4 | |
719 | . | |
720 | . | |
721 | . | |
722 | { id: sys_rt_sigprocmask [ 14] } hitcount: 952 | |
723 | { id: sys_futex [202] } hitcount: 1534 | |
724 | { id: sys_write [ 1] } hitcount: 2689 | |
725 | { id: sys_setitimer [ 38] } hitcount: 2797 | |
726 | { id: sys_read [ 0] } hitcount: 3202 | |
727 | { id: sys_select [ 23] } hitcount: 3773 | |
728 | { id: sys_writev [ 20] } hitcount: 4531 | |
729 | { id: sys_poll [ 7] } hitcount: 8314 | |
730 | { id: sys_recvmsg [ 47] } hitcount: 13738 | |
731 | { id: sys_ioctl [ 16] } hitcount: 21843 | |
732 | ||
733 | Totals: | |
734 | Hits: 67612 | |
735 | Entries: 72 | |
736 | Dropped: 0 | |
737 | ||
738 | The syscall counts above provide a rough overall picture of system | |
739 | call activity on the system; we can see for example that the most | |
740 | popular system call on this system was the 'sys_ioctl' system call. | |
741 | ||
742 | We can use 'compound' keys to refine that number and provide some | |
743 | further insight as to which processes exactly contribute to the | |
744 | overall ioctl count. | |
745 | ||
746 | The command below keeps a hitcount for every unique combination of | |
747 | system call id and pid - the end result is essentially a table | |
748 | that keeps a per-pid sum of system call hits. The results are | |
749 | sorted using the system call id as the primary key, and the | |
750 | hitcount sum as the secondary key: | |
751 | ||
752 | # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount' > \ | |
753 | /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger | |
754 | ||
755 | # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist | |
756 | # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 [active] | |
757 | ||
758 | { id: sys_read [ 0], common_pid: rtkit-daemon [ 1877] } hitcount: 1 | |
759 | { id: sys_read [ 0], common_pid: gdbus [ 2976] } hitcount: 1 | |
760 | { id: sys_read [ 0], common_pid: console-kit-dae [ 3400] } hitcount: 1 | |
761 | { id: sys_read [ 0], common_pid: postgres [ 1865] } hitcount: 1 | |
762 | { id: sys_read [ 0], common_pid: deja-dup-monito [ 3543] } hitcount: 2 | |
763 | { id: sys_read [ 0], common_pid: NetworkManager [ 890] } hitcount: 2 | |
764 | { id: sys_read [ 0], common_pid: evolution-calen [ 3048] } hitcount: 2 | |
765 | { id: sys_read [ 0], common_pid: postgres [ 1864] } hitcount: 2 | |
766 | { id: sys_read [ 0], common_pid: nm-applet [ 3022] } hitcount: 2 | |
767 | { id: sys_read [ 0], common_pid: whoopsie [ 1212] } hitcount: 2 | |
768 | . | |
769 | . | |
770 | . | |
771 | { id: sys_ioctl [ 16], common_pid: bash [ 8479] } hitcount: 1 | |
772 | { id: sys_ioctl [ 16], common_pid: bash [ 3472] } hitcount: 12 | |
773 | { id: sys_ioctl [ 16], common_pid: gnome-terminal [ 3199] } hitcount: 16 | |
774 | { id: sys_ioctl [ 16], common_pid: Xorg [ 1267] } hitcount: 1808 | |
775 | { id: sys_ioctl [ 16], common_pid: compiz [ 2994] } hitcount: 5580 | |
776 | . | |
777 | . | |
778 | . | |
779 | { id: sys_waitid [247], common_pid: upstart-dbus-br [ 2690] } hitcount: 3 | |
780 | { id: sys_waitid [247], common_pid: upstart-dbus-br [ 2688] } hitcount: 16 | |
781 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 975] } hitcount: 2 | |
782 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 3204] } hitcount: 4 | |
783 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 2888] } hitcount: 4 | |
784 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 3003] } hitcount: 4 | |
785 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 2873] } hitcount: 4 | |
786 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 3196] } hitcount: 6 | |
787 | { id: sys_openat [257], common_pid: java [ 2623] } hitcount: 2 | |
788 | { id: sys_eventfd2 [290], common_pid: ibus-ui-gtk3 [ 2760] } hitcount: 4 | |
789 | { id: sys_eventfd2 [290], common_pid: compiz [ 2994] } hitcount: 6 | |
790 | ||
791 | Totals: | |
792 | Hits: 31536 | |
793 | Entries: 323 | |
794 | Dropped: 0 | |
795 | ||
796 | The above list does give us a breakdown of the ioctl syscall by | |
797 | pid, but it also gives us quite a bit more than that, which we | |
798 | don't really care about at the moment. Since we know the syscall | |
799 | id for sys_ioctl (16, displayed next to the sys_ioctl name), we | |
800 | can use that to filter out all the other syscalls: | |
801 | ||
802 | # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount if id == 16' > \ | |
803 | /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger | |
804 | ||
805 | # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist | |
806 | # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 if id == 16 [active] | |
807 | ||
808 | { id: sys_ioctl [ 16], common_pid: gmain [ 2769] } hitcount: 1 | |
809 | { id: sys_ioctl [ 16], common_pid: evolution-addre [ 8571] } hitcount: 1 | |
810 | { id: sys_ioctl [ 16], common_pid: gmain [ 3003] } hitcount: 1 | |
811 | { id: sys_ioctl [ 16], common_pid: gmain [ 2781] } hitcount: 1 | |
812 | { id: sys_ioctl [ 16], common_pid: gmain [ 2829] } hitcount: 1 | |
813 | { id: sys_ioctl [ 16], common_pid: bash [ 8726] } hitcount: 1 | |
814 | { id: sys_ioctl [ 16], common_pid: bash [ 8508] } hitcount: 1 | |
815 | { id: sys_ioctl [ 16], common_pid: gmain [ 2970] } hitcount: 1 | |
816 | { id: sys_ioctl [ 16], common_pid: gmain [ 2768] } hitcount: 1 | |
817 | . | |
818 | . | |
819 | . | |
820 | { id: sys_ioctl [ 16], common_pid: pool [ 8559] } hitcount: 45 | |
821 | { id: sys_ioctl [ 16], common_pid: pool [ 8555] } hitcount: 48 | |
822 | { id: sys_ioctl [ 16], common_pid: pool [ 8551] } hitcount: 48 | |
823 | { id: sys_ioctl [ 16], common_pid: avahi-daemon [ 896] } hitcount: 66 | |
824 | { id: sys_ioctl [ 16], common_pid: Xorg [ 1267] } hitcount: 26674 | |
825 | { id: sys_ioctl [ 16], common_pid: compiz [ 2994] } hitcount: 73443 | |
826 | ||
827 | Totals: | |
828 | Hits: 101162 | |
829 | Entries: 103 | |
830 | Dropped: 0 | |
831 | ||
832 | The above output shows that 'compiz' and 'Xorg' are far and away | |
833 | the heaviest ioctl callers (which might lead to questions about | |
834 | whether they really need to be making all those calls and to | |
835 | possible avenues for further investigation.) | |
836 | ||
837 | The compound key examples used a key and a sum value (hitcount) to | |
838 | sort the output, but we can just as easily use two keys instead. | |
839 | Here's an example where we use a compound key composed of the the | |
840 | common_pid and size event fields. Sorting with pid as the primary | |
841 | key and 'size' as the secondary key allows us to display an | |
842 | ordered summary of the recvfrom sizes, with counts, received by | |
843 | each process: | |
844 | ||
845 | # echo 'hist:key=common_pid.execname,size:val=hitcount:sort=common_pid,size' > \ | |
846 | /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/trigger | |
847 | ||
848 | # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/hist | |
849 | # trigger info: hist:keys=common_pid.execname,size:vals=hitcount:sort=common_pid.execname,size:size=2048 [active] | |
850 | ||
851 | { common_pid: smbd [ 784], size: 4 } hitcount: 1 | |
852 | { common_pid: dnsmasq [ 1412], size: 4096 } hitcount: 672 | |
853 | { common_pid: postgres [ 1796], size: 1000 } hitcount: 6 | |
854 | { common_pid: postgres [ 1867], size: 1000 } hitcount: 10 | |
855 | { common_pid: bamfdaemon [ 2787], size: 28 } hitcount: 2 | |
856 | { common_pid: bamfdaemon [ 2787], size: 14360 } hitcount: 1 | |
857 | { common_pid: compiz [ 2994], size: 8 } hitcount: 1 | |
858 | { common_pid: compiz [ 2994], size: 20 } hitcount: 11 | |
859 | { common_pid: gnome-terminal [ 3199], size: 4 } hitcount: 2 | |
860 | { common_pid: firefox [ 8817], size: 4 } hitcount: 1 | |
861 | { common_pid: firefox [ 8817], size: 8 } hitcount: 5 | |
862 | { common_pid: firefox [ 8817], size: 588 } hitcount: 2 | |
863 | { common_pid: firefox [ 8817], size: 628 } hitcount: 1 | |
864 | { common_pid: firefox [ 8817], size: 6944 } hitcount: 1 | |
865 | { common_pid: firefox [ 8817], size: 408880 } hitcount: 2 | |
866 | { common_pid: firefox [ 8822], size: 8 } hitcount: 2 | |
867 | { common_pid: firefox [ 8822], size: 160 } hitcount: 2 | |
868 | { common_pid: firefox [ 8822], size: 320 } hitcount: 2 | |
869 | { common_pid: firefox [ 8822], size: 352 } hitcount: 1 | |
870 | . | |
871 | . | |
872 | . | |
873 | { common_pid: pool [ 8923], size: 1960 } hitcount: 10 | |
874 | { common_pid: pool [ 8923], size: 2048 } hitcount: 10 | |
875 | { common_pid: pool [ 8924], size: 1960 } hitcount: 10 | |
876 | { common_pid: pool [ 8924], size: 2048 } hitcount: 10 | |
877 | { common_pid: pool [ 8928], size: 1964 } hitcount: 4 | |
878 | { common_pid: pool [ 8928], size: 1965 } hitcount: 2 | |
879 | { common_pid: pool [ 8928], size: 2048 } hitcount: 6 | |
880 | { common_pid: pool [ 8929], size: 1982 } hitcount: 1 | |
881 | { common_pid: pool [ 8929], size: 2048 } hitcount: 1 | |
882 | ||
883 | Totals: | |
884 | Hits: 2016 | |
885 | Entries: 224 | |
886 | Dropped: 0 | |
887 | ||
888 | The above example also illustrates the fact that although a compound | |
889 | key is treated as a single entity for hashing purposes, the sub-keys | |
890 | it's composed of can be accessed independently. | |
891 | ||
892 | The next example uses a string field as the hash key and | |
893 | demonstrates how you can manually pause and continue a hist trigger. | |
894 | In this example, we'll aggregate fork counts and don't expect a | |
895 | large number of entries in the hash table, so we'll drop it to a | |
896 | much smaller number, say 256: | |
897 | ||
898 | # echo 'hist:key=child_comm:val=hitcount:size=256' > \ | |
899 | /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger | |
900 | ||
901 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist | |
902 | # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active] | |
903 | ||
904 | { child_comm: dconf worker } hitcount: 1 | |
905 | { child_comm: ibus-daemon } hitcount: 1 | |
906 | { child_comm: whoopsie } hitcount: 1 | |
907 | { child_comm: smbd } hitcount: 1 | |
908 | { child_comm: gdbus } hitcount: 1 | |
909 | { child_comm: kthreadd } hitcount: 1 | |
910 | { child_comm: dconf worker } hitcount: 1 | |
911 | { child_comm: evolution-alarm } hitcount: 2 | |
912 | { child_comm: Socket Thread } hitcount: 2 | |
913 | { child_comm: postgres } hitcount: 2 | |
914 | { child_comm: bash } hitcount: 3 | |
915 | { child_comm: compiz } hitcount: 3 | |
916 | { child_comm: evolution-sourc } hitcount: 4 | |
917 | { child_comm: dhclient } hitcount: 4 | |
918 | { child_comm: pool } hitcount: 5 | |
919 | { child_comm: nm-dispatcher.a } hitcount: 8 | |
920 | { child_comm: firefox } hitcount: 8 | |
921 | { child_comm: dbus-daemon } hitcount: 8 | |
922 | { child_comm: glib-pacrunner } hitcount: 10 | |
923 | { child_comm: evolution } hitcount: 23 | |
924 | ||
925 | Totals: | |
926 | Hits: 89 | |
927 | Entries: 20 | |
928 | Dropped: 0 | |
929 | ||
930 | If we want to pause the hist trigger, we can simply append :pause to | |
931 | the command that started the trigger. Notice that the trigger info | |
932 | displays as [paused]: | |
933 | ||
934 | # echo 'hist:key=child_comm:val=hitcount:size=256:pause' >> \ | |
935 | /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger | |
936 | ||
937 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist | |
938 | # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [paused] | |
939 | ||
940 | { child_comm: dconf worker } hitcount: 1 | |
941 | { child_comm: kthreadd } hitcount: 1 | |
942 | { child_comm: dconf worker } hitcount: 1 | |
943 | { child_comm: gdbus } hitcount: 1 | |
944 | { child_comm: ibus-daemon } hitcount: 1 | |
945 | { child_comm: Socket Thread } hitcount: 2 | |
946 | { child_comm: evolution-alarm } hitcount: 2 | |
947 | { child_comm: smbd } hitcount: 2 | |
948 | { child_comm: bash } hitcount: 3 | |
949 | { child_comm: whoopsie } hitcount: 3 | |
950 | { child_comm: compiz } hitcount: 3 | |
951 | { child_comm: evolution-sourc } hitcount: 4 | |
952 | { child_comm: pool } hitcount: 5 | |
953 | { child_comm: postgres } hitcount: 6 | |
954 | { child_comm: firefox } hitcount: 8 | |
955 | { child_comm: dhclient } hitcount: 10 | |
956 | { child_comm: emacs } hitcount: 12 | |
957 | { child_comm: dbus-daemon } hitcount: 20 | |
958 | { child_comm: nm-dispatcher.a } hitcount: 20 | |
959 | { child_comm: evolution } hitcount: 35 | |
960 | { child_comm: glib-pacrunner } hitcount: 59 | |
961 | ||
962 | Totals: | |
963 | Hits: 199 | |
964 | Entries: 21 | |
965 | Dropped: 0 | |
966 | ||
967 | To manually continue having the trigger aggregate events, append | |
968 | :cont instead. Notice that the trigger info displays as [active] | |
969 | again, and the data has changed: | |
970 | ||
971 | # echo 'hist:key=child_comm:val=hitcount:size=256:cont' >> \ | |
972 | /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger | |
973 | ||
974 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist | |
975 | # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active] | |
976 | ||
977 | { child_comm: dconf worker } hitcount: 1 | |
978 | { child_comm: dconf worker } hitcount: 1 | |
979 | { child_comm: kthreadd } hitcount: 1 | |
980 | { child_comm: gdbus } hitcount: 1 | |
981 | { child_comm: ibus-daemon } hitcount: 1 | |
982 | { child_comm: Socket Thread } hitcount: 2 | |
983 | { child_comm: evolution-alarm } hitcount: 2 | |
984 | { child_comm: smbd } hitcount: 2 | |
985 | { child_comm: whoopsie } hitcount: 3 | |
986 | { child_comm: compiz } hitcount: 3 | |
987 | { child_comm: evolution-sourc } hitcount: 4 | |
988 | { child_comm: bash } hitcount: 5 | |
989 | { child_comm: pool } hitcount: 5 | |
990 | { child_comm: postgres } hitcount: 6 | |
991 | { child_comm: firefox } hitcount: 8 | |
992 | { child_comm: dhclient } hitcount: 11 | |
993 | { child_comm: emacs } hitcount: 12 | |
994 | { child_comm: dbus-daemon } hitcount: 22 | |
995 | { child_comm: nm-dispatcher.a } hitcount: 22 | |
996 | { child_comm: evolution } hitcount: 35 | |
997 | { child_comm: glib-pacrunner } hitcount: 59 | |
998 | ||
999 | Totals: | |
1000 | Hits: 206 | |
1001 | Entries: 21 | |
1002 | Dropped: 0 | |
1003 | ||
1004 | The previous example showed how to start and stop a hist trigger by | |
1005 | appending 'pause' and 'continue' to the hist trigger command. A | |
1006 | hist trigger can also be started in a paused state by initially | |
1007 | starting the trigger with ':pause' appended. This allows you to | |
1008 | start the trigger only when you're ready to start collecting data | |
1009 | and not before. For example, you could start the trigger in a | |
1010 | paused state, then unpause it and do something you want to measure, | |
1011 | then pause the trigger again when done. | |
1012 | ||
1013 | Of course, doing this manually can be difficult and error-prone, but | |
1014 | it is possible to automatically start and stop a hist trigger based | |
1015 | on some condition, via the enable_hist and disable_hist triggers. | |
1016 | ||
1017 | For example, suppose we wanted to take a look at the relative | |
1018 | weights in terms of skb length for each callpath that leads to a | |
1019 | netif_receieve_skb event when downloading a decent-sized file using | |
1020 | wget. | |
1021 | ||
1022 | First we set up an initially paused stacktrace trigger on the | |
1023 | netif_receive_skb event: | |
1024 | ||
1025 | # echo 'hist:key=stacktrace:vals=len:pause' > \ | |
1026 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger | |
1027 | ||
1028 | Next, we set up an 'enable_hist' trigger on the sched_process_exec | |
1029 | event, with an 'if filename==/usr/bin/wget' filter. The effect of | |
1030 | this new trigger is that it will 'unpause' the hist trigger we just | |
1031 | set up on netif_receive_skb if and only if it sees a | |
1032 | sched_process_exec event with a filename of '/usr/bin/wget'. When | |
1033 | that happens, all netif_receive_skb events are aggregated into a | |
1034 | hash table keyed on stacktrace: | |
1035 | ||
1036 | # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \ | |
1037 | /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger | |
1038 | ||
1039 | The aggregation continues until the netif_receive_skb is paused | |
1040 | again, which is what the following disable_hist event does by | |
1041 | creating a similar setup on the sched_process_exit event, using the | |
1042 | filter 'comm==wget': | |
1043 | ||
1044 | # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \ | |
1045 | /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger | |
1046 | ||
1047 | Whenever a process exits and the comm field of the disable_hist | |
1048 | trigger filter matches 'comm==wget', the netif_receive_skb hist | |
1049 | trigger is disabled. | |
1050 | ||
1051 | The overall effect is that netif_receive_skb events are aggregated | |
1052 | into the hash table for only the duration of the wget. Executing a | |
1053 | wget command and then listing the 'hist' file will display the | |
1054 | output generated by the wget command: | |
1055 | ||
1056 | $ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz | |
1057 | ||
1058 | # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist | |
1059 | # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused] | |
1060 | ||
1061 | { stacktrace: | |
1062 | __netif_receive_skb_core+0x46d/0x990 | |
1063 | __netif_receive_skb+0x18/0x60 | |
1064 | netif_receive_skb_internal+0x23/0x90 | |
1065 | napi_gro_receive+0xc8/0x100 | |
1066 | ieee80211_deliver_skb+0xd6/0x270 [mac80211] | |
1067 | ieee80211_rx_handlers+0xccf/0x22f0 [mac80211] | |
1068 | ieee80211_prepare_and_rx_handle+0x4e7/0xc40 [mac80211] | |
1069 | ieee80211_rx+0x31d/0x900 [mac80211] | |
1070 | iwlagn_rx_reply_rx+0x3db/0x6f0 [iwldvm] | |
1071 | iwl_rx_dispatch+0x8e/0xf0 [iwldvm] | |
1072 | iwl_pcie_irq_handler+0xe3c/0x12f0 [iwlwifi] | |
1073 | irq_thread_fn+0x20/0x50 | |
1074 | irq_thread+0x11f/0x150 | |
1075 | kthread+0xd2/0xf0 | |
1076 | ret_from_fork+0x42/0x70 | |
1077 | } hitcount: 85 len: 28884 | |
1078 | { stacktrace: | |
1079 | __netif_receive_skb_core+0x46d/0x990 | |
1080 | __netif_receive_skb+0x18/0x60 | |
1081 | netif_receive_skb_internal+0x23/0x90 | |
1082 | napi_gro_complete+0xa4/0xe0 | |
1083 | dev_gro_receive+0x23a/0x360 | |
1084 | napi_gro_receive+0x30/0x100 | |
1085 | ieee80211_deliver_skb+0xd6/0x270 [mac80211] | |
1086 | ieee80211_rx_handlers+0xccf/0x22f0 [mac80211] | |
1087 | ieee80211_prepare_and_rx_handle+0x4e7/0xc40 [mac80211] | |
1088 | ieee80211_rx+0x31d/0x900 [mac80211] | |
1089 | iwlagn_rx_reply_rx+0x3db/0x6f0 [iwldvm] | |
1090 | iwl_rx_dispatch+0x8e/0xf0 [iwldvm] | |
1091 | iwl_pcie_irq_handler+0xe3c/0x12f0 [iwlwifi] | |
1092 | irq_thread_fn+0x20/0x50 | |
1093 | irq_thread+0x11f/0x150 | |
1094 | kthread+0xd2/0xf0 | |
1095 | } hitcount: 98 len: 664329 | |
1096 | { stacktrace: | |
1097 | __netif_receive_skb_core+0x46d/0x990 | |
1098 | __netif_receive_skb+0x18/0x60 | |
1099 | process_backlog+0xa8/0x150 | |
1100 | net_rx_action+0x15d/0x340 | |
1101 | __do_softirq+0x114/0x2c0 | |
1102 | do_softirq_own_stack+0x1c/0x30 | |
1103 | do_softirq+0x65/0x70 | |
1104 | __local_bh_enable_ip+0xb5/0xc0 | |
1105 | ip_finish_output+0x1f4/0x840 | |
1106 | ip_output+0x6b/0xc0 | |
1107 | ip_local_out_sk+0x31/0x40 | |
1108 | ip_send_skb+0x1a/0x50 | |
1109 | udp_send_skb+0x173/0x2a0 | |
1110 | udp_sendmsg+0x2bf/0x9f0 | |
1111 | inet_sendmsg+0x64/0xa0 | |
1112 | sock_sendmsg+0x3d/0x50 | |
1113 | } hitcount: 115 len: 13030 | |
1114 | { stacktrace: | |
1115 | __netif_receive_skb_core+0x46d/0x990 | |
1116 | __netif_receive_skb+0x18/0x60 | |
1117 | netif_receive_skb_internal+0x23/0x90 | |
1118 | napi_gro_complete+0xa4/0xe0 | |
1119 | napi_gro_flush+0x6d/0x90 | |
1120 | iwl_pcie_irq_handler+0x92a/0x12f0 [iwlwifi] | |
1121 | irq_thread_fn+0x20/0x50 | |
1122 | irq_thread+0x11f/0x150 | |
1123 | kthread+0xd2/0xf0 | |
1124 | ret_from_fork+0x42/0x70 | |
1125 | } hitcount: 934 len: 5512212 | |
1126 | ||
1127 | Totals: | |
1128 | Hits: 1232 | |
1129 | Entries: 4 | |
1130 | Dropped: 0 | |
1131 | ||
1132 | The above shows all the netif_receive_skb callpaths and their total | |
1133 | lengths for the duration of the wget command. | |
1134 | ||
1135 | The 'clear' hist trigger param can be used to clear the hash table. | |
1136 | Suppose we wanted to try another run of the previous example but | |
1137 | this time also wanted to see the complete list of events that went | |
1138 | into the histogram. In order to avoid having to set everything up | |
1139 | again, we can just clear the histogram first: | |
1140 | ||
1141 | # echo 'hist:key=stacktrace:vals=len:clear' >> \ | |
1142 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger | |
1143 | ||
1144 | Just to verify that it is in fact cleared, here's what we now see in | |
1145 | the hist file: | |
1146 | ||
1147 | # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist | |
1148 | # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused] | |
1149 | ||
1150 | Totals: | |
1151 | Hits: 0 | |
1152 | Entries: 0 | |
1153 | Dropped: 0 | |
1154 | ||
1155 | Since we want to see the detailed list of every netif_receive_skb | |
1156 | event occurring during the new run, which are in fact the same | |
1157 | events being aggregated into the hash table, we add some additional | |
1158 | 'enable_event' events to the triggering sched_process_exec and | |
1159 | sched_process_exit events as such: | |
1160 | ||
1161 | # echo 'enable_event:net:netif_receive_skb if filename==/usr/bin/wget' > \ | |
1162 | /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger | |
1163 | ||
1164 | # echo 'disable_event:net:netif_receive_skb if comm==wget' > \ | |
1165 | /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger | |
1166 | ||
1167 | If you read the trigger files for the sched_process_exec and | |
1168 | sched_process_exit triggers, you should see two triggers for each: | |
1169 | one enabling/disabling the hist aggregation and the other | |
1170 | enabling/disabling the logging of events: | |
1171 | ||
1172 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger | |
1173 | enable_event:net:netif_receive_skb:unlimited if filename==/usr/bin/wget | |
1174 | enable_hist:net:netif_receive_skb:unlimited if filename==/usr/bin/wget | |
1175 | ||
1176 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger | |
1177 | enable_event:net:netif_receive_skb:unlimited if comm==wget | |
1178 | disable_hist:net:netif_receive_skb:unlimited if comm==wget | |
1179 | ||
1180 | In other words, whenever either of the sched_process_exec or | |
1181 | sched_process_exit events is hit and matches 'wget', it enables or | |
1182 | disables both the histogram and the event log, and what you end up | |
1183 | with is a hash table and set of events just covering the specified | |
1184 | duration. Run the wget command again: | |
1185 | ||
1186 | $ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz | |
1187 | ||
1188 | Displaying the 'hist' file should show something similar to what you | |
1189 | saw in the last run, but this time you should also see the | |
1190 | individual events in the trace file: | |
1191 | ||
1192 | # cat /sys/kernel/debug/tracing/trace | |
1193 | ||
1194 | # tracer: nop | |
1195 | # | |
1196 | # entries-in-buffer/entries-written: 183/1426 #P:4 | |
1197 | # | |
1198 | # _-----=> irqs-off | |
1199 | # / _----=> need-resched | |
1200 | # | / _---=> hardirq/softirq | |
1201 | # || / _--=> preempt-depth | |
1202 | # ||| / delay | |
1203 | # TASK-PID CPU# |||| TIMESTAMP FUNCTION | |
1204 | # | | | |||| | | | |
1205 | wget-15108 [000] ..s1 31769.606929: netif_receive_skb: dev=lo skbaddr=ffff88009c353100 len=60 | |
1206 | wget-15108 [000] ..s1 31769.606999: netif_receive_skb: dev=lo skbaddr=ffff88009c353200 len=60 | |
1207 | dnsmasq-1382 [000] ..s1 31769.677652: netif_receive_skb: dev=lo skbaddr=ffff88009c352b00 len=130 | |
1208 | dnsmasq-1382 [000] ..s1 31769.685917: netif_receive_skb: dev=lo skbaddr=ffff88009c352200 len=138 | |
1209 | ##### CPU 2 buffer started #### | |
1210 | irq/29-iwlwifi-559 [002] ..s. 31772.031529: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433d00 len=2948 | |
1211 | irq/29-iwlwifi-559 [002] ..s. 31772.031572: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d432200 len=1500 | |
1212 | irq/29-iwlwifi-559 [002] ..s. 31772.032196: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433100 len=2948 | |
1213 | irq/29-iwlwifi-559 [002] ..s. 31772.032761: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433000 len=2948 | |
1214 | irq/29-iwlwifi-559 [002] ..s. 31772.033220: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d432e00 len=1500 | |
1215 | . | |
1216 | . | |
1217 | . | |
1218 | ||
1219 | The following example demonstrates how multiple hist triggers can be | |
1220 | attached to a given event. This capability can be useful for | |
1221 | creating a set of different summaries derived from the same set of | |
1222 | events, or for comparing the effects of different filters, among | |
1223 | other things. | |
1224 | ||
1225 | # echo 'hist:keys=skbaddr.hex:vals=len if len < 0' >> \ | |
1226 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger | |
1227 | # echo 'hist:keys=skbaddr.hex:vals=len if len > 4096' >> \ | |
1228 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger | |
1229 | # echo 'hist:keys=skbaddr.hex:vals=len if len == 256' >> \ | |
1230 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger | |
1231 | # echo 'hist:keys=skbaddr.hex:vals=len' >> \ | |
1232 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger | |
1233 | # echo 'hist:keys=len:vals=common_preempt_count' >> \ | |
1234 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger | |
1235 | ||
1236 | The above set of commands create four triggers differing only in | |
1237 | their filters, along with a completely different though fairly | |
1238 | nonsensical trigger. Note that in order to append multiple hist | |
1239 | triggers to the same file, you should use the '>>' operator to | |
1240 | append them ('>' will also add the new hist trigger, but will remove | |
1241 | any existing hist triggers beforehand). | |
1242 | ||
1243 | Displaying the contents of the 'hist' file for the event shows the | |
1244 | contents of all five histograms: | |
1245 | ||
1246 | # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist | |
1247 | ||
1248 | # event histogram | |
1249 | # | |
1250 | # trigger info: hist:keys=len:vals=hitcount,common_preempt_count:sort=hitcount:size=2048 [active] | |
1251 | # | |
1252 | ||
1253 | { len: 176 } hitcount: 1 common_preempt_count: 0 | |
1254 | { len: 223 } hitcount: 1 common_preempt_count: 0 | |
1255 | { len: 4854 } hitcount: 1 common_preempt_count: 0 | |
1256 | { len: 395 } hitcount: 1 common_preempt_count: 0 | |
1257 | { len: 177 } hitcount: 1 common_preempt_count: 0 | |
1258 | { len: 446 } hitcount: 1 common_preempt_count: 0 | |
1259 | { len: 1601 } hitcount: 1 common_preempt_count: 0 | |
1260 | . | |
1261 | . | |
1262 | . | |
1263 | { len: 1280 } hitcount: 66 common_preempt_count: 0 | |
1264 | { len: 116 } hitcount: 81 common_preempt_count: 40 | |
1265 | { len: 708 } hitcount: 112 common_preempt_count: 0 | |
1266 | { len: 46 } hitcount: 221 common_preempt_count: 0 | |
1267 | { len: 1264 } hitcount: 458 common_preempt_count: 0 | |
1268 | ||
1269 | Totals: | |
1270 | Hits: 1428 | |
1271 | Entries: 147 | |
1272 | Dropped: 0 | |
1273 | ||
1274 | ||
1275 | # event histogram | |
1276 | # | |
1277 | # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active] | |
1278 | # | |
1279 | ||
1280 | { skbaddr: ffff8800baee5e00 } hitcount: 1 len: 130 | |
1281 | { skbaddr: ffff88005f3d5600 } hitcount: 1 len: 1280 | |
1282 | { skbaddr: ffff88005f3d4900 } hitcount: 1 len: 1280 | |
1283 | { skbaddr: ffff88009fed6300 } hitcount: 1 len: 115 | |
1284 | { skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 115 | |
1285 | { skbaddr: ffff88008cdb1900 } hitcount: 1 len: 46 | |
1286 | { skbaddr: ffff880064b5ef00 } hitcount: 1 len: 118 | |
1287 | { skbaddr: ffff880044e3c700 } hitcount: 1 len: 60 | |
1288 | { skbaddr: ffff880100065900 } hitcount: 1 len: 46 | |
1289 | { skbaddr: ffff8800d46bd500 } hitcount: 1 len: 116 | |
1290 | { skbaddr: ffff88005f3d5f00 } hitcount: 1 len: 1280 | |
1291 | { skbaddr: ffff880100064700 } hitcount: 1 len: 365 | |
1292 | { skbaddr: ffff8800badb6f00 } hitcount: 1 len: 60 | |
1293 | . | |
1294 | . | |
1295 | . | |
1296 | { skbaddr: ffff88009fe0be00 } hitcount: 27 len: 24677 | |
1297 | { skbaddr: ffff88009fe0a400 } hitcount: 27 len: 23052 | |
1298 | { skbaddr: ffff88009fe0b700 } hitcount: 31 len: 25589 | |
1299 | { skbaddr: ffff88009fe0b600 } hitcount: 32 len: 27326 | |
1300 | { skbaddr: ffff88006a462800 } hitcount: 68 len: 71678 | |
1301 | { skbaddr: ffff88006a463700 } hitcount: 70 len: 72678 | |
1302 | { skbaddr: ffff88006a462b00 } hitcount: 71 len: 77589 | |
1303 | { skbaddr: ffff88006a463600 } hitcount: 73 len: 71307 | |
1304 | { skbaddr: ffff88006a462200 } hitcount: 81 len: 81032 | |
1305 | ||
1306 | Totals: | |
1307 | Hits: 1451 | |
1308 | Entries: 318 | |
1309 | Dropped: 0 | |
1310 | ||
1311 | ||
1312 | # event histogram | |
1313 | # | |
1314 | # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len == 256 [active] | |
1315 | # | |
1316 | ||
1317 | ||
1318 | Totals: | |
1319 | Hits: 0 | |
1320 | Entries: 0 | |
1321 | Dropped: 0 | |
1322 | ||
1323 | ||
1324 | # event histogram | |
1325 | # | |
1326 | # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len > 4096 [active] | |
1327 | # | |
1328 | ||
1329 | { skbaddr: ffff88009fd2c300 } hitcount: 1 len: 7212 | |
1330 | { skbaddr: ffff8800d2bcce00 } hitcount: 1 len: 7212 | |
1331 | { skbaddr: ffff8800d2bcd700 } hitcount: 1 len: 7212 | |
1332 | { skbaddr: ffff8800d2bcda00 } hitcount: 1 len: 21492 | |
1333 | { skbaddr: ffff8800ae2e2d00 } hitcount: 1 len: 7212 | |
1334 | { skbaddr: ffff8800d2bcdb00 } hitcount: 1 len: 7212 | |
1335 | { skbaddr: ffff88006a4df500 } hitcount: 1 len: 4854 | |
1336 | { skbaddr: ffff88008ce47b00 } hitcount: 1 len: 18636 | |
1337 | { skbaddr: ffff8800ae2e2200 } hitcount: 1 len: 12924 | |
1338 | { skbaddr: ffff88005f3e1000 } hitcount: 1 len: 4356 | |
1339 | { skbaddr: ffff8800d2bcdc00 } hitcount: 2 len: 24420 | |
1340 | { skbaddr: ffff8800d2bcc200 } hitcount: 2 len: 12996 | |
1341 | ||
1342 | Totals: | |
1343 | Hits: 14 | |
1344 | Entries: 12 | |
1345 | Dropped: 0 | |
1346 | ||
1347 | ||
1348 | # event histogram | |
1349 | # | |
1350 | # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len < 0 [active] | |
1351 | # | |
1352 | ||
1353 | ||
1354 | Totals: | |
1355 | Hits: 0 | |
1356 | Entries: 0 | |
1357 | Dropped: 0 | |
1358 | ||
1359 | Named triggers can be used to have triggers share a common set of | |
1360 | histogram data. This capability is mostly useful for combining the | |
1361 | output of events generated by tracepoints contained inside inline | |
1362 | functions, but names can be used in a hist trigger on any event. | |
1363 | For example, these two triggers when hit will update the same 'len' | |
1364 | field in the shared 'foo' histogram data: | |
1365 | ||
1366 | # echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \ | |
1367 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger | |
1368 | # echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \ | |
1369 | /sys/kernel/debug/tracing/events/net/netif_rx/trigger | |
1370 | ||
1371 | You can see that they're updating common histogram data by reading | |
1372 | each event's hist files at the same time: | |
1373 | ||
1374 | # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist; | |
1375 | cat /sys/kernel/debug/tracing/events/net/netif_rx/hist | |
1376 | ||
1377 | # event histogram | |
1378 | # | |
1379 | # trigger info: hist:name=foo:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active] | |
1380 | # | |
1381 | ||
1382 | { skbaddr: ffff88000ad53500 } hitcount: 1 len: 46 | |
1383 | { skbaddr: ffff8800af5a1500 } hitcount: 1 len: 76 | |
1384 | { skbaddr: ffff8800d62a1900 } hitcount: 1 len: 46 | |
1385 | { skbaddr: ffff8800d2bccb00 } hitcount: 1 len: 468 | |
1386 | { skbaddr: ffff8800d3c69900 } hitcount: 1 len: 46 | |
1387 | { skbaddr: ffff88009ff09100 } hitcount: 1 len: 52 | |
1388 | { skbaddr: ffff88010f13ab00 } hitcount: 1 len: 168 | |
1389 | { skbaddr: ffff88006a54f400 } hitcount: 1 len: 46 | |
1390 | { skbaddr: ffff8800d2bcc500 } hitcount: 1 len: 260 | |
1391 | { skbaddr: ffff880064505000 } hitcount: 1 len: 46 | |
1392 | { skbaddr: ffff8800baf24e00 } hitcount: 1 len: 32 | |
1393 | { skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 46 | |
1394 | { skbaddr: ffff8800d3edff00 } hitcount: 1 len: 44 | |
1395 | { skbaddr: ffff88009fe0b400 } hitcount: 1 len: 168 | |
1396 | { skbaddr: ffff8800a1c55a00 } hitcount: 1 len: 40 | |
1397 | { skbaddr: ffff8800d2bcd100 } hitcount: 1 len: 40 | |
1398 | { skbaddr: ffff880064505f00 } hitcount: 1 len: 174 | |
1399 | { skbaddr: ffff8800a8bff200 } hitcount: 1 len: 160 | |
1400 | { skbaddr: ffff880044e3cc00 } hitcount: 1 len: 76 | |
1401 | { skbaddr: ffff8800a8bfe700 } hitcount: 1 len: 46 | |
1402 | { skbaddr: ffff8800d2bcdc00 } hitcount: 1 len: 32 | |
1403 | { skbaddr: ffff8800a1f64800 } hitcount: 1 len: 46 | |
1404 | { skbaddr: ffff8800d2bcde00 } hitcount: 1 len: 988 | |
1405 | { skbaddr: ffff88006a5dea00 } hitcount: 1 len: 46 | |
1406 | { skbaddr: ffff88002e37a200 } hitcount: 1 len: 44 | |
1407 | { skbaddr: ffff8800a1f32c00 } hitcount: 2 len: 676 | |
1408 | { skbaddr: ffff88000ad52600 } hitcount: 2 len: 107 | |
1409 | { skbaddr: ffff8800a1f91e00 } hitcount: 2 len: 92 | |
1410 | { skbaddr: ffff8800af5a0200 } hitcount: 2 len: 142 | |
1411 | { skbaddr: ffff8800d2bcc600 } hitcount: 2 len: 220 | |
1412 | { skbaddr: ffff8800ba36f500 } hitcount: 2 len: 92 | |
1413 | { skbaddr: ffff8800d021f800 } hitcount: 2 len: 92 | |
1414 | { skbaddr: ffff8800a1f33600 } hitcount: 2 len: 675 | |
1415 | { skbaddr: ffff8800a8bfff00 } hitcount: 3 len: 138 | |
1416 | { skbaddr: ffff8800d62a1300 } hitcount: 3 len: 138 | |
1417 | { skbaddr: ffff88002e37a100 } hitcount: 4 len: 184 | |
1418 | { skbaddr: ffff880064504400 } hitcount: 4 len: 184 | |
1419 | { skbaddr: ffff8800a8bfec00 } hitcount: 4 len: 184 | |
1420 | { skbaddr: ffff88000ad53700 } hitcount: 5 len: 230 | |
1421 | { skbaddr: ffff8800d2bcdb00 } hitcount: 5 len: 196 | |
1422 | { skbaddr: ffff8800a1f90000 } hitcount: 6 len: 276 | |
1423 | { skbaddr: ffff88006a54f900 } hitcount: 6 len: 276 | |
1424 | ||
1425 | Totals: | |
1426 | Hits: 81 | |
1427 | Entries: 42 | |
1428 | Dropped: 0 | |
1429 | # event histogram | |
1430 | # | |
1431 | # trigger info: hist:name=foo:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active] | |
1432 | # | |
1433 | ||
1434 | { skbaddr: ffff88000ad53500 } hitcount: 1 len: 46 | |
1435 | { skbaddr: ffff8800af5a1500 } hitcount: 1 len: 76 | |
1436 | { skbaddr: ffff8800d62a1900 } hitcount: 1 len: 46 | |
1437 | { skbaddr: ffff8800d2bccb00 } hitcount: 1 len: 468 | |
1438 | { skbaddr: ffff8800d3c69900 } hitcount: 1 len: 46 | |
1439 | { skbaddr: ffff88009ff09100 } hitcount: 1 len: 52 | |
1440 | { skbaddr: ffff88010f13ab00 } hitcount: 1 len: 168 | |
1441 | { skbaddr: ffff88006a54f400 } hitcount: 1 len: 46 | |
1442 | { skbaddr: ffff8800d2bcc500 } hitcount: 1 len: 260 | |
1443 | { skbaddr: ffff880064505000 } hitcount: 1 len: 46 | |
1444 | { skbaddr: ffff8800baf24e00 } hitcount: 1 len: 32 | |
1445 | { skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 46 | |
1446 | { skbaddr: ffff8800d3edff00 } hitcount: 1 len: 44 | |
1447 | { skbaddr: ffff88009fe0b400 } hitcount: 1 len: 168 | |
1448 | { skbaddr: ffff8800a1c55a00 } hitcount: 1 len: 40 | |
1449 | { skbaddr: ffff8800d2bcd100 } hitcount: 1 len: 40 | |
1450 | { skbaddr: ffff880064505f00 } hitcount: 1 len: 174 | |
1451 | { skbaddr: ffff8800a8bff200 } hitcount: 1 len: 160 | |
1452 | { skbaddr: ffff880044e3cc00 } hitcount: 1 len: 76 | |
1453 | { skbaddr: ffff8800a8bfe700 } hitcount: 1 len: 46 | |
1454 | { skbaddr: ffff8800d2bcdc00 } hitcount: 1 len: 32 | |
1455 | { skbaddr: ffff8800a1f64800 } hitcount: 1 len: 46 | |
1456 | { skbaddr: ffff8800d2bcde00 } hitcount: 1 len: 988 | |
1457 | { skbaddr: ffff88006a5dea00 } hitcount: 1 len: 46 | |
1458 | { skbaddr: ffff88002e37a200 } hitcount: 1 len: 44 | |
1459 | { skbaddr: ffff8800a1f32c00 } hitcount: 2 len: 676 | |
1460 | { skbaddr: ffff88000ad52600 } hitcount: 2 len: 107 | |
1461 | { skbaddr: ffff8800a1f91e00 } hitcount: 2 len: 92 | |
1462 | { skbaddr: ffff8800af5a0200 } hitcount: 2 len: 142 | |
1463 | { skbaddr: ffff8800d2bcc600 } hitcount: 2 len: 220 | |
1464 | { skbaddr: ffff8800ba36f500 } hitcount: 2 len: 92 | |
1465 | { skbaddr: ffff8800d021f800 } hitcount: 2 len: 92 | |
1466 | { skbaddr: ffff8800a1f33600 } hitcount: 2 len: 675 | |
1467 | { skbaddr: ffff8800a8bfff00 } hitcount: 3 len: 138 | |
1468 | { skbaddr: ffff8800d62a1300 } hitcount: 3 len: 138 | |
1469 | { skbaddr: ffff88002e37a100 } hitcount: 4 len: 184 | |
1470 | { skbaddr: ffff880064504400 } hitcount: 4 len: 184 | |
1471 | { skbaddr: ffff8800a8bfec00 } hitcount: 4 len: 184 | |
1472 | { skbaddr: ffff88000ad53700 } hitcount: 5 len: 230 | |
1473 | { skbaddr: ffff8800d2bcdb00 } hitcount: 5 len: 196 | |
1474 | { skbaddr: ffff8800a1f90000 } hitcount: 6 len: 276 | |
1475 | { skbaddr: ffff88006a54f900 } hitcount: 6 len: 276 | |
1476 | ||
1477 | Totals: | |
1478 | Hits: 81 | |
1479 | Entries: 42 | |
1480 | Dropped: 0 | |
1481 | ||
1482 | And here's an example that shows how to combine histogram data from | |
1483 | any two events even if they don't share any 'compatible' fields | |
1484 | other than 'hitcount' and 'stacktrace'. These commands create a | |
1485 | couple of triggers named 'bar' using those fields: | |
1486 | ||
1487 | # echo 'hist:name=bar:key=stacktrace:val=hitcount' > \ | |
1488 | /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger | |
1489 | # echo 'hist:name=bar:key=stacktrace:val=hitcount' > \ | |
1490 | /sys/kernel/debug/tracing/events/net/netif_rx/trigger | |
1491 | ||
1492 | And displaying the output of either shows some interesting if | |
1493 | somewhat confusing output: | |
1494 | ||
1495 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist | |
1496 | # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist | |
1497 | ||
1498 | # event histogram | |
1499 | # | |
1500 | # trigger info: hist:name=bar:keys=stacktrace:vals=hitcount:sort=hitcount:size=2048 [active] | |
1501 | # | |
1502 | ||
1503 | { stacktrace: | |
1504 | _do_fork+0x18e/0x330 | |
1505 | kernel_thread+0x29/0x30 | |
1506 | kthreadd+0x154/0x1b0 | |
1507 | ret_from_fork+0x3f/0x70 | |
1508 | } hitcount: 1 | |
1509 | { stacktrace: | |
1510 | netif_rx_internal+0xb2/0xd0 | |
1511 | netif_rx_ni+0x20/0x70 | |
1512 | dev_loopback_xmit+0xaa/0xd0 | |
1513 | ip_mc_output+0x126/0x240 | |
1514 | ip_local_out_sk+0x31/0x40 | |
1515 | igmp_send_report+0x1e9/0x230 | |
1516 | igmp_timer_expire+0xe9/0x120 | |
1517 | call_timer_fn+0x39/0xf0 | |
1518 | run_timer_softirq+0x1e1/0x290 | |
1519 | __do_softirq+0xfd/0x290 | |
1520 | irq_exit+0x98/0xb0 | |
1521 | smp_apic_timer_interrupt+0x4a/0x60 | |
1522 | apic_timer_interrupt+0x6d/0x80 | |
1523 | cpuidle_enter+0x17/0x20 | |
1524 | call_cpuidle+0x3b/0x60 | |
1525 | cpu_startup_entry+0x22d/0x310 | |
1526 | } hitcount: 1 | |
1527 | { stacktrace: | |
1528 | netif_rx_internal+0xb2/0xd0 | |
1529 | netif_rx_ni+0x20/0x70 | |
1530 | dev_loopback_xmit+0xaa/0xd0 | |
1531 | ip_mc_output+0x17f/0x240 | |
1532 | ip_local_out_sk+0x31/0x40 | |
1533 | ip_send_skb+0x1a/0x50 | |
1534 | udp_send_skb+0x13e/0x270 | |
1535 | udp_sendmsg+0x2bf/0x980 | |
1536 | inet_sendmsg+0x67/0xa0 | |
1537 | sock_sendmsg+0x38/0x50 | |
1538 | SYSC_sendto+0xef/0x170 | |
1539 | SyS_sendto+0xe/0x10 | |
1540 | entry_SYSCALL_64_fastpath+0x12/0x6a | |
1541 | } hitcount: 2 | |
1542 | { stacktrace: | |
1543 | netif_rx_internal+0xb2/0xd0 | |
1544 | netif_rx+0x1c/0x60 | |
1545 | loopback_xmit+0x6c/0xb0 | |
1546 | dev_hard_start_xmit+0x219/0x3a0 | |
1547 | __dev_queue_xmit+0x415/0x4f0 | |
1548 | dev_queue_xmit_sk+0x13/0x20 | |
1549 | ip_finish_output2+0x237/0x340 | |
1550 | ip_finish_output+0x113/0x1d0 | |
1551 | ip_output+0x66/0xc0 | |
1552 | ip_local_out_sk+0x31/0x40 | |
1553 | ip_send_skb+0x1a/0x50 | |
1554 | udp_send_skb+0x16d/0x270 | |
1555 | udp_sendmsg+0x2bf/0x980 | |
1556 | inet_sendmsg+0x67/0xa0 | |
1557 | sock_sendmsg+0x38/0x50 | |
1558 | ___sys_sendmsg+0x14e/0x270 | |
1559 | } hitcount: 76 | |
1560 | { stacktrace: | |
1561 | netif_rx_internal+0xb2/0xd0 | |
1562 | netif_rx+0x1c/0x60 | |
1563 | loopback_xmit+0x6c/0xb0 | |
1564 | dev_hard_start_xmit+0x219/0x3a0 | |
1565 | __dev_queue_xmit+0x415/0x4f0 | |
1566 | dev_queue_xmit_sk+0x13/0x20 | |
1567 | ip_finish_output2+0x237/0x340 | |
1568 | ip_finish_output+0x113/0x1d0 | |
1569 | ip_output+0x66/0xc0 | |
1570 | ip_local_out_sk+0x31/0x40 | |
1571 | ip_send_skb+0x1a/0x50 | |
1572 | udp_send_skb+0x16d/0x270 | |
1573 | udp_sendmsg+0x2bf/0x980 | |
1574 | inet_sendmsg+0x67/0xa0 | |
1575 | sock_sendmsg+0x38/0x50 | |
1576 | ___sys_sendmsg+0x269/0x270 | |
1577 | } hitcount: 77 | |
1578 | { stacktrace: | |
1579 | netif_rx_internal+0xb2/0xd0 | |
1580 | netif_rx+0x1c/0x60 | |
1581 | loopback_xmit+0x6c/0xb0 | |
1582 | dev_hard_start_xmit+0x219/0x3a0 | |
1583 | __dev_queue_xmit+0x415/0x4f0 | |
1584 | dev_queue_xmit_sk+0x13/0x20 | |
1585 | ip_finish_output2+0x237/0x340 | |
1586 | ip_finish_output+0x113/0x1d0 | |
1587 | ip_output+0x66/0xc0 | |
1588 | ip_local_out_sk+0x31/0x40 | |
1589 | ip_send_skb+0x1a/0x50 | |
1590 | udp_send_skb+0x16d/0x270 | |
1591 | udp_sendmsg+0x2bf/0x980 | |
1592 | inet_sendmsg+0x67/0xa0 | |
1593 | sock_sendmsg+0x38/0x50 | |
1594 | SYSC_sendto+0xef/0x170 | |
1595 | } hitcount: 88 | |
1596 | { stacktrace: | |
1597 | _do_fork+0x18e/0x330 | |
1598 | SyS_clone+0x19/0x20 | |
1599 | entry_SYSCALL_64_fastpath+0x12/0x6a | |
1600 | } hitcount: 244 | |
1601 | ||
1602 | Totals: | |
1603 | Hits: 489 | |
1604 | Entries: 7 | |
1605 | Dropped: 0 | |
033cbcee TZ |
1606 | |
1607 | ||
1608 | 2.2 Inter-event hist triggers | |
1609 | ----------------------------- | |
1610 | ||
1611 | Inter-event hist triggers are hist triggers that combine values from | |
1612 | one or more other events and create a histogram using that data. Data | |
1613 | from an inter-event histogram can in turn become the source for | |
1614 | further combined histograms, thus providing a chain of related | |
1615 | histograms, which is important for some applications. | |
1616 | ||
1617 | The most important example of an inter-event quantity that can be used | |
1618 | in this manner is latency, which is simply a difference in timestamps | |
1619 | between two events. Although latency is the most important | |
1620 | inter-event quantity, note that because the support is completely | |
1621 | general across the trace event subsystem, any event field can be used | |
1622 | in an inter-event quantity. | |
1623 | ||
1624 | An example of a histogram that combines data from other histograms | |
1625 | into a useful chain would be a 'wakeupswitch latency' histogram that | |
1626 | combines a 'wakeup latency' histogram and a 'switch latency' | |
1627 | histogram. | |
1628 | ||
1629 | Normally, a hist trigger specification consists of a (possibly | |
1630 | compound) key along with one or more numeric values, which are | |
1631 | continually updated sums associated with that key. A histogram | |
1632 | specification in this case consists of individual key and value | |
1633 | specifications that refer to trace event fields associated with a | |
1634 | single event type. | |
1635 | ||
1636 | The inter-event hist trigger extension allows fields from multiple | |
1637 | events to be referenced and combined into a multi-event histogram | |
1638 | specification. In support of this overall goal, a few enabling | |
1639 | features have been added to the hist trigger support: | |
1640 | ||
1641 | - In order to compute an inter-event quantity, a value from one | |
1642 | event needs to saved and then referenced from another event. This | |
1643 | requires the introduction of support for histogram 'variables'. | |
1644 | ||
1645 | - The computation of inter-event quantities and their combination | |
1646 | require some minimal amount of support for applying simple | |
1647 | expressions to variables (+ and -). | |
1648 | ||
1649 | - A histogram consisting of inter-event quantities isn't logically a | |
1650 | histogram on either event (so having the 'hist' file for either | |
1651 | event host the histogram output doesn't really make sense). To | |
1652 | address the idea that the histogram is associated with a | |
1653 | combination of events, support is added allowing the creation of | |
1654 | 'synthetic' events that are events derived from other events. | |
1655 | These synthetic events are full-fledged events just like any other | |
1656 | and can be used as such, as for instance to create the | |
1657 | 'combination' histograms mentioned previously. | |
1658 | ||
1659 | - A set of 'actions' can be associated with histogram entries - | |
1660 | these can be used to generate the previously mentioned synthetic | |
1661 | events, but can also be used for other purposes, such as for | |
1662 | example saving context when a 'max' latency has been hit. | |
1663 | ||
1664 | - Trace events don't have a 'timestamp' associated with them, but | |
1665 | there is an implicit timestamp saved along with an event in the | |
1666 | underlying ftrace ring buffer. This timestamp is now exposed as a | |
1667 | a synthetic field named 'common_timestamp' which can be used in | |
1668 | histograms as if it were any other event field; it isn't an actual | |
1669 | field in the trace format but rather is a synthesized value that | |
1670 | nonetheless can be used as if it were an actual field. By default | |
1671 | it is in units of nanoseconds; appending '.usecs' to a | |
1672 | common_timestamp field changes the units to microseconds. | |
1673 | ||
1674 | These features are decribed in more detail in the following sections. | |
1675 | ||
1676 | 2.2.1 Histogram Variables | |
1677 | ------------------------- | |
1678 | ||
1679 | Variables are simply named locations used for saving and retrieving | |
1680 | values between matching events. A 'matching' event is defined as an | |
1681 | event that has a matching key - if a variable is saved for a histogram | |
1682 | entry corresponding to that key, any subsequent event with a matching | |
1683 | key can access that variable. | |
1684 | ||
1685 | A variable's value is normally available to any subsequent event until | |
1686 | it is set to something else by a subsequent event. The one exception | |
1687 | to that rule is that any variable used in an expression is essentially | |
1688 | 'read-once' - once it's used by an expression in a subsequent event, | |
1689 | it's reset to its 'unset' state, which means it can't be used again | |
1690 | unless it's set again. This ensures not only that an event doesn't | |
1691 | use an uninitialized variable in a calculation, but that that variable | |
1692 | is used only once and not for any unrelated subsequent match. | |
1693 | ||
1694 | The basic syntax for saving a variable is to simply prefix a unique | |
1695 | variable name not corresponding to any keyword along with an '=' sign | |
1696 | to any event field. | |
1697 | ||
1698 | Either keys or values can be saved and retrieved in this way. This | |
1699 | creates a variable named 'ts0' for a histogram entry with the key | |
1700 | 'next_pid': | |
1701 | ||
1702 | # echo 'hist:keys=next_pid:vals=$ts0:ts0=common_timestamp ... >> \ | |
1703 | event/trigger | |
1704 | ||
1705 | The ts0 variable can be accessed by any subsequent event having the | |
1706 | same pid as 'next_pid'. | |
1707 | ||
1708 | Variable references are formed by prepending the variable name with | |
1709 | the '$' sign. Thus for example, the ts0 variable above would be | |
1710 | referenced as '$ts0' in expressions. | |
1711 | ||
1712 | Because 'vals=' is used, the common_timestamp variable value above | |
1713 | will also be summed as a normal histogram value would (though for a | |
1714 | timestamp it makes little sense). | |
1715 | ||
1716 | The below shows that a key value can also be saved in the same way: | |
1717 | ||
1718 | # echo 'hist:timer_pid=common_pid:key=timer_pid ...' >> event/trigger | |
1719 | ||
1720 | If a variable isn't a key variable or prefixed with 'vals=', the | |
1721 | associated event field will be saved in a variable but won't be summed | |
1722 | as a value: | |
1723 | ||
1724 | # echo 'hist:keys=next_pid:ts1=common_timestamp ... >> event/trigger | |
1725 | ||
1726 | Multiple variables can be assigned at the same time. The below would | |
1727 | result in both ts0 and b being created as variables, with both | |
1728 | common_timestamp and field1 additionally being summed as values: | |
1729 | ||
1730 | # echo 'hist:keys=pid:vals=$ts0,$b:ts0=common_timestamp,b=field1 ... >> \ | |
1731 | event/trigger | |
1732 | ||
1733 | Note that variable assignments can appear either preceding or | |
1734 | following their use. The command below behaves identically to the | |
1735 | command above: | |
1736 | ||
1737 | # echo 'hist:keys=pid:ts0=common_timestamp,b=field1:vals=$ts0,$b ... >> \ | |
1738 | event/trigger | |
1739 | ||
1740 | Any number of variables not bound to a 'vals=' prefix can also be | |
1741 | assigned by simply separating them with colons. Below is the same | |
1742 | thing but without the values being summed in the histogram: | |
1743 | ||
1744 | # echo 'hist:keys=pid:ts0=common_timestamp:b=field1 ... >> event/trigger | |
1745 | ||
1746 | Variables set as above can be referenced and used in expressions on | |
1747 | another event. | |
1748 | ||
1749 | For example, here's how a latency can be calculated: | |
1750 | ||
1751 | # echo 'hist:keys=pid,prio:ts0=common_timestamp ... >> event1/trigger | |
1752 | # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp-$ts0 ... >> event2/trigger | |
1753 | ||
1754 | In the first line above, the event's timetamp is saved into the | |
1755 | variable ts0. In the next line, ts0 is subtracted from the second | |
1756 | event's timestamp to produce the latency, which is then assigned into | |
1757 | yet another variable, 'wakeup_lat'. The hist trigger below in turn | |
1758 | makes use of the wakeup_lat variable to compute a combined latency | |
1759 | using the same key and variable from yet another event: | |
1760 | ||
1761 | # echo 'hist:key=pid:wakeupswitch_lat=$wakeup_lat+$switchtime_lat ... >> event3/trigger | |
1762 | ||
1763 | 2.2.2 Synthetic Events | |
1764 | ---------------------- | |
1765 | ||
1766 | Synthetic events are user-defined events generated from hist trigger | |
1767 | variables or fields associated with one or more other events. Their | |
1768 | purpose is to provide a mechanism for displaying data spanning | |
1769 | multiple events consistent with the existing and already familiar | |
1770 | usage for normal events. | |
1771 | ||
1772 | To define a synthetic event, the user writes a simple specification | |
1773 | consisting of the name of the new event along with one or more | |
1774 | variables and their types, which can be any valid field type, | |
1775 | separated by semicolons, to the tracing/synthetic_events file. | |
1776 | ||
1777 | For instance, the following creates a new event named 'wakeup_latency' | |
1778 | with 3 fields: lat, pid, and prio. Each of those fields is simply a | |
1779 | variable reference to a variable on another event: | |
1780 | ||
1781 | # echo 'wakeup_latency \ | |
1782 | u64 lat; \ | |
1783 | pid_t pid; \ | |
1784 | int prio' >> \ | |
1785 | /sys/kernel/debug/tracing/synthetic_events | |
1786 | ||
1787 | Reading the tracing/synthetic_events file lists all the currently | |
1788 | defined synthetic events, in this case the event defined above: | |
1789 | ||
1790 | # cat /sys/kernel/debug/tracing/synthetic_events | |
1791 | wakeup_latency u64 lat; pid_t pid; int prio | |
1792 | ||
1793 | An existing synthetic event definition can be removed by prepending | |
1794 | the command that defined it with a '!': | |
1795 | ||
1796 | # echo '!wakeup_latency u64 lat pid_t pid int prio' >> \ | |
1797 | /sys/kernel/debug/tracing/synthetic_events | |
1798 | ||
1799 | At this point, there isn't yet an actual 'wakeup_latency' event | |
1800 | instantiated in the event subsytem - for this to happen, a 'hist | |
1801 | trigger action' needs to be instantiated and bound to actual fields | |
1802 | and variables defined on other events (see Section 6.3.3 below). | |
1803 | ||
1804 | Once that is done, an event instance is created, and a histogram can | |
1805 | be defined using it: | |
1806 | ||
1807 | # echo 'hist:keys=pid,prio,lat.log2:sort=pid,lat' >> \ | |
1808 | /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger | |
1809 | ||
1810 | The new event is created under the tracing/events/synthetic/ directory | |
1811 | and looks and behaves just like any other event: | |
1812 | ||
1813 | # ls /sys/kernel/debug/tracing/events/synthetic/wakeup_latency | |
1814 | enable filter format hist id trigger | |
1815 | ||
1816 | Like any other event, once a histogram is enabled for the event, the | |
1817 | output can be displayed by reading the event's 'hist' file. | |
1818 | ||
1819 | 2.2.3 Hist trigger 'actions' | |
1820 | ---------------------------- | |
1821 | ||
1822 | A hist trigger 'action' is a function that's executed whenever a | |
1823 | histogram entry is added or updated. | |
1824 | ||
1825 | The default 'action' if no special function is explicity specified is | |
1826 | as it always has been, to simply update the set of values associated | |
1827 | with an entry. Some applications, however, may want to perform | |
1828 | additional actions at that point, such as generate another event, or | |
1829 | compare and save a maximum. | |
1830 | ||
1831 | The following additional actions are available. To specify an action | |
1832 | for a given event, simply specify the action between colons in the | |
1833 | hist trigger specification. | |
1834 | ||
1835 | - onmatch(matching.event).<synthetic_event_name>(param list) | |
1836 | ||
1837 | The 'onmatch(matching.event).<synthetic_event_name>(params)' hist | |
1838 | trigger action is invoked whenever an event matches and the | |
1839 | histogram entry would be added or updated. It causes the named | |
1840 | synthetic event to be generated with the values given in the | |
1841 | 'param list'. The result is the generation of a synthetic event | |
1842 | that consists of the values contained in those variables at the | |
1843 | time the invoking event was hit. | |
1844 | ||
1845 | The 'param list' consists of one or more parameters which may be | |
1846 | either variables or fields defined on either the 'matching.event' | |
1847 | or the target event. The variables or fields specified in the | |
1848 | param list may be either fully-qualified or unqualified. If a | |
1849 | variable is specified as unqualified, it must be unique between | |
1850 | the two events. A field name used as a param can be unqualified | |
1851 | if it refers to the target event, but must be fully qualified if | |
1852 | it refers to the matching event. A fully-qualified name is of the | |
1853 | form 'system.event_name.$var_name' or 'system.event_name.field'. | |
1854 | ||
1855 | The 'matching.event' specification is simply the fully qualified | |
1856 | event name of the event that matches the target event for the | |
1857 | onmatch() functionality, in the form 'system.event_name'. | |
1858 | ||
1859 | Finally, the number and type of variables/fields in the 'param | |
1860 | list' must match the number and types of the fields in the | |
1861 | synthetic event being generated. | |
1862 | ||
1863 | As an example the below defines a simple synthetic event and uses | |
1864 | a variable defined on the sched_wakeup_new event as a parameter | |
1865 | when invoking the synthetic event. Here we define the synthetic | |
1866 | event: | |
1867 | ||
1868 | # echo 'wakeup_new_test pid_t pid' >> \ | |
1869 | /sys/kernel/debug/tracing/synthetic_events | |
1870 | ||
1871 | # cat /sys/kernel/debug/tracing/synthetic_events | |
1872 | wakeup_new_test pid_t pid | |
1873 | ||
1874 | The following hist trigger both defines the missing testpid | |
1875 | variable and specifies an onmatch() action that generates a | |
1876 | wakeup_new_test synthetic event whenever a sched_wakeup_new event | |
1877 | occurs, which because of the 'if comm == "cyclictest"' filter only | |
1878 | happens when the executable is cyclictest: | |
1879 | ||
1880 | # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\ | |
1881 | wakeup_new_test($testpid) if comm=="cyclictest"' >> \ | |
1882 | /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger | |
1883 | ||
1884 | Creating and displaying a histogram based on those events is now | |
1885 | just a matter of using the fields and new synthetic event in the | |
1886 | tracing/events/synthetic directory, as usual: | |
1887 | ||
1888 | # echo 'hist:keys=pid:sort=pid' >> \ | |
1889 | /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/trigger | |
1890 | ||
1891 | Running 'cyclictest' should cause wakeup_new events to generate | |
1892 | wakeup_new_test synthetic events which should result in histogram | |
1893 | output in the wakeup_new_test event's hist file: | |
1894 | ||
1895 | # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/hist | |
1896 | ||
1897 | A more typical usage would be to use two events to calculate a | |
1898 | latency. The following example uses a set of hist triggers to | |
1899 | produce a 'wakeup_latency' histogram: | |
1900 | ||
1901 | First, we define a 'wakeup_latency' synthetic event: | |
1902 | ||
1903 | # echo 'wakeup_latency u64 lat; pid_t pid; int prio' >> \ | |
1904 | /sys/kernel/debug/tracing/synthetic_events | |
1905 | ||
1906 | Next, we specify that whenever we see a sched_waking event for a | |
1907 | cyclictest thread, save the timestamp in a 'ts0' variable: | |
1908 | ||
1909 | # echo 'hist:keys=$saved_pid:saved_pid=pid:ts0=common_timestamp.usecs \ | |
1910 | if comm=="cyclictest"' >> \ | |
1911 | /sys/kernel/debug/tracing/events/sched/sched_waking/trigger | |
1912 | ||
1913 | Then, when the corresponding thread is actually scheduled onto the | |
1914 | CPU by a sched_switch event, calculate the latency and use that | |
1915 | along with another variable and an event field to generate a | |
1916 | wakeup_latency synthetic event: | |
1917 | ||
1918 | # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:\ | |
1919 | onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,\ | |
1920 | $saved_pid,next_prio) if next_comm=="cyclictest"' >> \ | |
1921 | /sys/kernel/debug/tracing/events/sched/sched_switch/trigger | |
1922 | ||
1923 | We also need to create a histogram on the wakeup_latency synthetic | |
1924 | event in order to aggregate the generated synthetic event data: | |
1925 | ||
1926 | # echo 'hist:keys=pid,prio,lat:sort=pid,lat' >> \ | |
1927 | /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger | |
1928 | ||
1929 | Finally, once we've run cyclictest to actually generate some | |
1930 | events, we can see the output by looking at the wakeup_latency | |
1931 | synthetic event's hist file: | |
1932 | ||
1933 | # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/hist | |
1934 | ||
1935 | - onmax(var).save(field,.. .) | |
1936 | ||
1937 | The 'onmax(var).save(field,...)' hist trigger action is invoked | |
1938 | whenever the value of 'var' associated with a histogram entry | |
1939 | exceeds the current maximum contained in that variable. | |
1940 | ||
1941 | The end result is that the trace event fields specified as the | |
1942 | onmax.save() params will be saved if 'var' exceeds the current | |
1943 | maximum for that hist trigger entry. This allows context from the | |
1944 | event that exhibited the new maximum to be saved for later | |
1945 | reference. When the histogram is displayed, additional fields | |
1946 | displaying the saved values will be printed. | |
1947 | ||
1948 | As an example the below defines a couple of hist triggers, one for | |
1949 | sched_waking and another for sched_switch, keyed on pid. Whenever | |
1950 | a sched_waking occurs, the timestamp is saved in the entry | |
1951 | corresponding to the current pid, and when the scheduler switches | |
1952 | back to that pid, the timestamp difference is calculated. If the | |
1953 | resulting latency, stored in wakeup_lat, exceeds the current | |
1954 | maximum latency, the values specified in the save() fields are | |
1955 | recoreded: | |
1956 | ||
1957 | # echo 'hist:keys=pid:ts0=common_timestamp.usecs \ | |
1958 | if comm=="cyclictest"' >> \ | |
1959 | /sys/kernel/debug/tracing/events/sched/sched_waking/trigger | |
1960 | ||
1961 | # echo 'hist:keys=next_pid:\ | |
1962 | wakeup_lat=common_timestamp.usecs-$ts0:\ | |
1963 | onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) \ | |
1964 | if next_comm=="cyclictest"' >> \ | |
1965 | /sys/kernel/debug/tracing/events/sched/sched_switch/trigger | |
1966 | ||
1967 | When the histogram is displayed, the max value and the saved | |
1968 | values corresponding to the max are displayed following the rest | |
1969 | of the fields: | |
1970 | ||
1971 | # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist | |
1972 | { next_pid: 2255 } hitcount: 239 | |
1973 | common_timestamp-ts0: 0 | |
1974 | max: 27 | |
1975 | next_comm: cyclictest | |
1976 | prev_pid: 0 prev_prio: 120 prev_comm: swapper/1 | |
1977 | ||
1978 | { next_pid: 2256 } hitcount: 2355 | |
1979 | common_timestamp-ts0: 0 | |
1980 | max: 49 next_comm: cyclictest | |
1981 | prev_pid: 0 prev_prio: 120 prev_comm: swapper/0 | |
1982 | ||
1983 | Totals: | |
1984 | Hits: 12970 | |
1985 | Entries: 2 | |
1986 | Dropped: 0 |