Commit | Line | Data |
---|---|---|
d5a5e4ec TB |
1 | p m - g r a p h |
2 | ||
3 | pm-graph: suspend/resume/boot timing analysis tools | |
4 | Version: 5.4 | |
5 | Author: Todd Brandt <todd.e.brandt@intel.com> | |
6 | Home Page: https://01.org/pm-graph | |
7 | ||
8 | Report bugs/issues at bugzilla.kernel.org Tools/pm-graph | |
9 | - https://bugzilla.kernel.org/buglist.cgi?component=pm-graph&product=Tools | |
10 | ||
11 | Full documentation available online & in man pages | |
12 | - Getting Started: | |
13 | https://01.org/pm-graph/documentation/getting-started | |
14 | ||
15 | - Config File Format: | |
16 | https://01.org/pm-graph/documentation/3-config-file-format | |
17 | ||
18 | - upstream version in git: | |
19 | https://github.com/intel/pm-graph/ | |
20 | ||
21 | Table of Contents | |
22 | - Overview | |
23 | - Setup | |
24 | - Usage | |
25 | - Basic Usage | |
26 | - Dev Mode Usage | |
27 | - Proc Mode Usage | |
28 | - Configuration Files | |
29 | - Usage Examples | |
30 | - Config File Options | |
31 | - Custom Timeline Entries | |
32 | - Adding/Editing Timeline Functions | |
33 | - Adding/Editing Dev Timeline Source Functions | |
34 | - Verifying your Custom Functions | |
35 | - Testing on consumer linux Operating Systems | |
36 | - Android | |
37 | ||
38 | ------------------------------------------------------------------ | |
39 | | OVERVIEW | | |
40 | ------------------------------------------------------------------ | |
41 | ||
42 | This tool suite is designed to assist kernel and OS developers in optimizing | |
43 | their linux stack's suspend/resume & boot time. Using a kernel image built | |
44 | with a few extra options enabled, the tools will execute a suspend or boot, | |
45 | and will capture dmesg and ftrace data. This data is transformed into a set of | |
46 | timelines and a callgraph to give a quick and detailed view of which devices | |
47 | and kernel processes are taking the most time in suspend/resume & boot. | |
48 | ||
49 | ------------------------------------------------------------------ | |
50 | | SETUP | | |
51 | ------------------------------------------------------------------ | |
52 | ||
53 | These packages are required to execute the scripts | |
54 | - python | |
55 | - python-requests | |
56 | ||
57 | Ubuntu: | |
58 | sudo apt-get install python python-requests | |
59 | ||
60 | Fedora: | |
61 | sudo dnf install python python-requests | |
62 | ||
63 | The tools can most easily be installed via git clone and make install | |
64 | ||
65 | $> git clone http://github.com/intel/pm-graph.git | |
66 | $> cd pm-graph | |
67 | $> sudo make install | |
68 | $> man sleepgraph ; man bootgraph | |
69 | ||
70 | Setup involves some minor kernel configuration | |
71 | ||
72 | The following kernel build options are required for all kernels: | |
73 | CONFIG_DEVMEM=y | |
74 | CONFIG_PM_DEBUG=y | |
75 | CONFIG_PM_SLEEP_DEBUG=y | |
76 | CONFIG_FTRACE=y | |
77 | CONFIG_FUNCTION_TRACER=y | |
78 | CONFIG_FUNCTION_GRAPH_TRACER=y | |
79 | CONFIG_KPROBES=y | |
80 | CONFIG_KPROBES_ON_FTRACE=y | |
81 | ||
82 | In kernel 3.15.0, two patches were upstreamed which enable the | |
83 | v3.0 behavior. These patches allow the tool to read all the | |
84 | data from trace events instead of from dmesg. You can enable | |
85 | this behavior on earlier kernels with these patches: | |
86 | ||
87 | (kernel/pre-3.15/enable_trace_events_suspend_resume.patch) | |
88 | (kernel/pre-3.15/enable_trace_events_device_pm_callback.patch) | |
89 | ||
90 | If you're using a kernel older than 3.15.0, the following | |
91 | additional kernel parameters are required: | |
92 | (e.g. in file /etc/default/grub) | |
93 | GRUB_CMDLINE_LINUX_DEFAULT="... initcall_debug log_buf_len=32M ..." | |
94 | ||
95 | If you're using a kernel older than 3.11-rc2, the following simple | |
96 | patch must be applied to enable ftrace data: | |
97 | in file: kernel/power/suspend.c | |
98 | in function: int suspend_devices_and_enter(suspend_state_t state) | |
99 | remove call to "ftrace_stop();" | |
100 | remove call to "ftrace_start();" | |
101 | ||
102 | There is a patch which does this for kernel v3.8.0: | |
103 | (kernel/pre-3.11-rc2/enable_ftrace_in_suspendresume.patch) | |
104 | ||
105 | ||
106 | ||
107 | ------------------------------------------------------------------ | |
108 | | USAGE | | |
109 | ------------------------------------------------------------------ | |
110 | ||
111 | Basic Usage | |
112 | ___________ | |
113 | ||
114 | 1) First configure a kernel using the instructions from the previous sections. | |
115 | Then build, install, and boot with it. | |
116 | 2) Open up a terminal window and execute the mode list command: | |
117 | ||
118 | %> sudo ./sleepgraph.py -modes | |
119 | ['freeze', 'mem', 'disk'] | |
120 | ||
121 | Execute a test using one of the available power modes, e.g. mem (S3): | |
122 | ||
123 | %> sudo ./sleepgraph.py -m mem -rtcwake 15 | |
124 | ||
125 | or with a config file | |
126 | ||
127 | %> sudo ./sleepgraph.py -config config/suspend.cfg | |
128 | ||
129 | When the system comes back you'll see the script finishing up and | |
130 | creating the output files in the test subdir. It generates output | |
131 | files in subdirectory: suspend-mmddyy-HHMMSS. The ftrace file can | |
132 | be used to regenerate the html timeline with different options | |
133 | ||
134 | HTML output: <hostname>_<mode>.html | |
135 | raw dmesg output: <hostname>_<mode>_dmesg.txt | |
136 | raw ftrace output: <hostname>_<mode>_ftrace.txt | |
137 | ||
138 | View the html in firefox or chrome. | |
139 | ||
140 | ||
141 | Dev Mode Usage | |
142 | ______________ | |
143 | ||
144 | Developer mode adds information on low level source calls to the timeline. | |
145 | The tool sets kprobes on all delay and mutex calls to see which devices | |
146 | are waiting for something and when. It also sets a suite of kprobes on | |
147 | subsystem dependent calls to better fill out the timeline. | |
148 | ||
149 | The tool will also expose kernel threads that don't normally show up in the | |
150 | timeline. This is useful in discovering dependent threads to get a better | |
151 | idea of what each device is waiting for. For instance, the scsi_eh thread, | |
152 | a.k.a. scsi resume error handler, is what each SATA disk device waits for | |
153 | before it can continue resume. | |
154 | ||
155 | The timeline will be much larger if run with dev mode, so it can be useful | |
156 | to set the -mindev option to clip out any device blocks that are too small | |
157 | to see easily. The following command will give a nice dev mode run: | |
158 | ||
159 | %> sudo ./sleepgraph.py -m mem -rtcwake 15 -mindev 1 -dev | |
160 | ||
161 | or with a config file | |
162 | ||
163 | %> sudo ./sleepgraph.py -config config/suspend-dev.cfg | |
164 | ||
165 | ||
166 | Proc Mode Usage | |
167 | _______________ | |
168 | ||
169 | Proc mode adds user process info to the timeline. This is done in a manner | |
170 | similar to the bootchart utility, which graphs init processes and their | |
171 | execution as the system boots. This tool option does the same thing but for | |
172 | the period before and after suspend/resume. | |
173 | ||
174 | In order to see any process info, there needs to be some delay before or | |
175 | after resume since processes are frozen in suspend_prepare and thawed in | |
176 | resume_complete. The predelay and postdelay args allow you to do this. It | |
177 | can also be useful to run in x2 mode with an x2 delay, this way you can | |
178 | see process activity before and after resume, and in between two | |
179 | successive suspend/resumes. | |
180 | ||
181 | The command can be run like this: | |
182 | ||
183 | %> sudo ./sleepgraph.py -m mem -rtcwake 15 -x2 -x2delay 1000 -predelay 1000 -postdelay 1000 -proc | |
184 | ||
185 | or with a config file | |
186 | ||
187 | %> sudo ./sleepgraph.py -config config/suspend-proc.cfg | |
188 | ||
189 | ||
190 | ------------------------------------------------------------------ | |
191 | | CONFIGURATION FILES | | |
192 | ------------------------------------------------------------------ | |
193 | ||
194 | Since 4.0 we've moved to using config files in lieu of command line options. | |
195 | The config folder contains a collection of typical use cases. | |
196 | There are corresponding configs for other power modes: | |
197 | ||
198 | Simple suspend/resume with basic timeline (mem/freeze/standby) | |
199 | config/suspend.cfg | |
200 | config/freeze.cfg | |
201 | config/standby.cfg | |
202 | ||
203 | Dev mode suspend/resume with dev timeline (mem/freeze/standby) | |
204 | config/suspend-dev.cfg | |
205 | config/freeze-dev.cfg | |
206 | config/standby-dev.cfg | |
207 | ||
208 | Simple suspend/resume with timeline and callgraph (mem/freeze/standby) | |
209 | config/suspend-callgraph.cfg | |
210 | config/freeze-callgraph.cfg | |
211 | config/standby-callgraph.cfg | |
212 | ||
213 | Sample proc mode x2 run using mem suspend | |
214 | config/suspend-x2-proc.cfg | |
215 | ||
216 | Sample for editing timeline funcs (moves internal functions into config) | |
217 | config/custom-timeline-functions.cfg | |
218 | ||
219 | Sample debug config for serio subsystem | |
220 | config/debug-serio-suspend.cfg | |
221 | ||
222 | ||
223 | Usage Examples | |
224 | ______________ | |
225 | ||
226 | Run a simple mem suspend: | |
227 | %> sudo ./sleepgraph.py -config config/suspend.cfg | |
228 | ||
229 | Run a mem suspend with callgraph data: | |
230 | %> sudo ./sleepgraph.py -config config/suspend-callgraph.cfg | |
231 | ||
232 | Run a mem suspend with dev mode detail: | |
233 | %> sudo ./sleepgraph.py -config config/suspend-dev.cfg | |
234 | ||
235 | ||
236 | Config File Options | |
237 | ___________________ | |
238 | ||
239 | [Settings] | |
240 | ||
241 | # Verbosity: print verbose messages (def: false) | |
242 | verbose: false | |
243 | ||
244 | # Suspend Mode: e.g. standby, mem, freeze, disk (def: mem) | |
245 | mode: mem | |
246 | ||
247 | # Output Directory Format: {hostname}, {date}, {time} give current values | |
248 | output-dir: suspend-{hostname}-{date}-{time} | |
249 | ||
250 | # Automatic Wakeup: use rtcwake to wakeup after X seconds (def: infinity) | |
251 | rtcwake: 15 | |
252 | ||
253 | # Add Logs: add the dmesg and ftrace log to the html output (def: false) | |
254 | addlogs: false | |
255 | ||
256 | # Sus/Res Gap: insert a gap between sus & res in the timeline (def: false) | |
257 | srgap: false | |
258 | ||
259 | # Custom Command: Command to execute in lieu of suspend (def: "") | |
260 | command: echo mem > /sys/power/state | |
261 | ||
262 | # Proc mode: graph user processes and cpu usage in the timeline (def: false) | |
263 | proc: false | |
264 | ||
265 | # Dev mode: graph source functions in the timeline (def: false) | |
266 | dev: false | |
267 | ||
268 | # Suspend/Resume x2: run 2 suspend/resumes back to back (def: false) | |
269 | x2: false | |
270 | ||
271 | # x2 Suspend Delay: time delay between the two test runs in ms (def: 0 ms) | |
272 | x2delay: 0 | |
273 | ||
274 | # Pre Suspend Delay: nclude an N ms delay before (1st) suspend (def: 0 ms) | |
275 | predelay: 0 | |
276 | ||
277 | # Post Resume Delay: include an N ms delay after (last) resume (def: 0 ms) | |
278 | postdelay: 0 | |
279 | ||
280 | # Min Device Length: graph only dev callbacks longer than min (def: 0.001 ms) | |
281 | mindev: 0.001 | |
282 | ||
283 | # Callgraph: gather ftrace callgraph data on all timeline events (def: false) | |
284 | callgraph: false | |
285 | ||
286 | # Expand Callgraph: pre-expand the callgraph treeviews in html (def: false) | |
287 | expandcg: false | |
288 | ||
289 | # Min Callgraph Length: show callgraphs only if longer than min (def: 1 ms) | |
290 | mincg: 1 | |
291 | ||
292 | # Timestamp Precision: number of sig digits in timestamps (0:S, [3:ms], 6:us) | |
293 | timeprec: 3 | |
294 | ||
295 | # Device Filter: show only devs whose name/driver includes one of these strings | |
296 | devicefilter: _cpu_up,_cpu_down,i915,usb | |
297 | ||
298 | # Override default timeline entries: | |
299 | # Do not use the internal default functions for timeline entries (def: false) | |
300 | # Set this to true if you intend to only use the ones defined in the config | |
301 | override-timeline-functions: true | |
302 | ||
303 | # Override default dev timeline entries: | |
304 | # Do not use the internal default functions for dev timeline entries (def: false) | |
305 | # Set this to true if you intend to only use the ones defined in the config | |
306 | override-dev-timeline-functions: true | |
307 | ||
308 | # Call Loop Max Gap (dev mode only) | |
309 | # merge loops of the same call if each is less than maxgap apart (def: 100us) | |
310 | callloop-maxgap: 0.0001 | |
311 | ||
312 | # Call Loop Max Length (dev mode only) | |
313 | # merge loops of the same call if each is less than maxlen in length (def: 5ms) | |
314 | callloop-maxlen: 0.005 | |
315 | ||
316 | ------------------------------------------------------------------ | |
317 | | CUSTOM TIMELINE ENTRIES | | |
318 | ------------------------------------------------------------------ | |
319 | ||
320 | Adding or Editing Timeline Functions | |
321 | ____________________________________ | |
322 | ||
323 | The tool uses an array of function names to fill out empty spaces in the | |
324 | timeline where device callbacks don't appear. For instance, in suspend_prepare | |
325 | the tool adds the sys_sync and freeze_processes calls as virtual device blocks | |
326 | in the timeline to show you where the time is going. These calls should fill | |
327 | the timeline with contiguous data so that most kernel execution is covered. | |
328 | ||
329 | It is possible to add new function calls to the timeline by adding them to | |
330 | the config. It's also possible to copy the internal timeline functions into | |
331 | the config so that you can override and edit them. Place them in the | |
332 | timeline_functions_ARCH section with the name of your architecture appended. | |
333 | i.e. for x86_64: [timeline_functions_x86_64] | |
334 | ||
335 | Use the override-timeline-functions option if you only want to use your | |
336 | custom calls, or leave it false to append them to the internal ones. | |
337 | ||
338 | This section includes a list of functions (set using kprobes) which use both | |
339 | symbol data and function arg data. The args are pulled directly from the | |
340 | stack using this architecture's registers and stack formatting. Each entry | |
341 | can include up to four pieces of info: The function name, a format string, | |
342 | an argument list, and a color. But only a function name is required. | |
343 | ||
344 | For a full example config, see config/custom-timeline-functions.cfg. It pulls | |
345 | all the internal timeline functions into the config and allows you to edit | |
346 | them. | |
347 | ||
348 | Entry format: | |
349 | ||
350 | function: format{fn_arg1}_{fn_arg2} fn_arg1 fn_arg2 ... [color=purple] | |
351 | ||
352 | Required Arguments: | |
353 | ||
354 | function: The symbol name for the function you want probed, this is the | |
355 | minimum required for an entry, it will show up as the function | |
356 | name with no arguments. | |
357 | ||
358 | example: _cpu_up: | |
359 | ||
360 | Optional Arguments: | |
361 | ||
362 | format: The format to display the data on the timeline in. Use braces to | |
363 | enclose the arg names. | |
364 | ||
365 | example: CPU_ON[{cpu}] | |
366 | ||
367 | color: The color of the entry block in the timeline. The default color is | |
368 | transparent, so the entry shares the phase color. The color is an | |
369 | html color string, either a word, or an RGB. | |
370 | ||
371 | example: [color=#CC00CC] | |
372 | ||
373 | arglist: A list of arguments from registers/stack addresses. See URL: | |
374 | https://www.kernel.org/doc/Documentation/trace/kprobetrace.txt | |
375 | ||
376 | example: cpu=%di:s32 | |
377 | ||
378 | Here is a full example entry. It displays cpu resume calls in the timeline | |
379 | in orange. They will appear as CPU_ON[0], CPU_ON[1], etc. | |
380 | ||
381 | [timeline_functions_x86_64] | |
382 | _cpu_up: CPU_ON[{cpu}] cpu=%di:s32 [color=orange] | |
383 | ||
384 | ||
385 | Adding or Editing Dev Mode Timeline Source Functions | |
386 | ____________________________________________________ | |
387 | ||
388 | In dev mode, the tool uses an array of function names to monitor source | |
389 | execution within the timeline entries. | |
390 | ||
391 | The function calls are displayed inside the main device/call blocks in the | |
392 | timeline. However, if a function call is not within a main timeline event, | |
393 | it will spawn an entirely new event named after the caller's kernel thread. | |
394 | These asynchronous kernel threads will populate in a separate section | |
395 | beneath the main device/call section. | |
396 | ||
397 | The tool has a set of hard coded calls which focus on the most common use | |
398 | cases: msleep, udelay, schedule_timeout, mutex_lock_slowpath, etc. These are | |
399 | the functions that add a hardcoded time delay to the suspend/resume path. | |
400 | The tool also includes some common functions native to important | |
401 | subsystems: ata, i915, and ACPI, etc. | |
402 | ||
403 | It is possible to add new function calls to the dev timeline by adding them | |
404 | to the config. It's also possible to copy the internal dev timeline | |
405 | functions into the config so that you can override and edit them. Place them | |
406 | in the dev_timeline_functions_ARCH section with the name of your architecture | |
407 | appended. i.e. for x86_64: [dev_timeline_functions_x86_64] | |
408 | ||
409 | Use the override-dev-timeline-functions option if you only want to use your | |
410 | custom calls, or leave it false to append them to the internal ones. | |
411 | ||
412 | The format is the same as the timeline_functions_x86_64 section. It's a | |
413 | list of functions (set using kprobes) which use both symbol data and function | |
414 | arg data. The args are pulled directly from the stack using this | |
415 | architecture's registers and stack formatting. Each entry can include up | |
416 | to four pieces of info: The function name, a format string, an argument list, | |
417 | and a color. But only the function name is required. | |
418 | ||
419 | For a full example config, see config/custom-timeline-functions.cfg. It pulls | |
420 | all the internal dev timeline functions into the config and allows you to edit | |
421 | them. | |
422 | ||
423 | Here is a full example entry. It displays the ATA port reset calls as | |
424 | ataN_port_reset in the timeline. This is where most of the SATA disk resume | |
425 | time goes, so it can be helpful to see the low level call. | |
426 | ||
427 | [dev_timeline_functions_x86_64] | |
428 | ata_eh_recover: ata{port}_port_reset port=+36(%di):s32 [color=#CC00CC] | |
429 | ||
430 | ||
431 | Verifying your custom functions | |
432 | _______________________________ | |
433 | ||
434 | Once you have a set of functions (kprobes) defined, it can be useful to | |
435 | perform a quick check to see if you formatted them correctly and if the system | |
436 | actually supports them. To do this, run the tool with your config file | |
437 | and the -status option. The tool will go through all the kprobes (both | |
438 | custom and internal if you haven't overridden them) and actually attempts | |
439 | to set them in ftrace. It will then print out success or fail for you. | |
440 | ||
441 | Note that kprobes which don't actually exist in the kernel won't stop the | |
442 | tool, they just wont show up. | |
443 | ||
444 | For example: | |
445 | ||
446 | sudo ./sleepgraph.py -config config/custom-timeline-functions.cfg -status | |
447 | Checking this system (myhostname)... | |
448 | have root access: YES | |
449 | is sysfs mounted: YES | |
450 | is "mem" a valid power mode: YES | |
451 | is ftrace supported: YES | |
452 | are kprobes supported: YES | |
453 | timeline data source: FTRACE (all trace events found) | |
454 | is rtcwake supported: YES | |
455 | verifying timeline kprobes work: | |
456 | _cpu_down: YES | |
457 | _cpu_up: YES | |
458 | acpi_pm_finish: YES | |
459 | acpi_pm_prepare: YES | |
460 | freeze_kernel_threads: YES | |
461 | freeze_processes: YES | |
462 | sys_sync: YES | |
463 | thaw_processes: YES | |
464 | verifying dev kprobes work: | |
465 | __const_udelay: YES | |
466 | __mutex_lock_slowpath: YES | |
467 | acpi_os_stall: YES | |
468 | acpi_ps_parse_aml: YES | |
469 | intel_opregion_init: NO | |
470 | intel_opregion_register: NO | |
471 | intel_opregion_setup: NO | |
472 | msleep: YES | |
473 | schedule_timeout: YES | |
474 | schedule_timeout_uninterruptible: YES | |
475 | usleep_range: YES | |
476 | ||
477 | ||
478 | ------------------------------------------------------------------ | |
479 | | TESTING ON CONSUMER LINUX OPERATING SYSTEMS | | |
480 | ------------------------------------------------------------------ | |
481 | ||
482 | Android | |
483 | _______ | |
484 | ||
485 | The easiest way to execute on an android device is to run the android.sh | |
486 | script on the device, then pull the ftrace log back to the host and run | |
487 | sleepgraph.py on it. | |
488 | ||
489 | Here are the steps: | |
490 | ||
491 | [download and install the tool on the device] | |
492 | ||
493 | host%> wget https://raw.githubusercontent.com/intel/pm-graph/master/tools/android.sh | |
494 | host%> adb connect 192.168.1.6 | |
495 | host%> adb root | |
496 | # push the script to a writeable location | |
497 | host%> adb push android.sh /sdcard/ | |
498 | ||
499 | [check whether the tool will run on your device] | |
500 | ||
501 | host%> adb shell | |
502 | dev%> cd /sdcard | |
503 | dev%> sh android.sh status | |
504 | host : asus_t100 | |
505 | kernel : 3.14.0-i386-dirty | |
506 | modes : freeze mem | |
507 | rtcwake : supported | |
508 | ftrace : supported | |
509 | trace events { | |
510 | suspend_resume: found | |
511 | device_pm_callback_end: found | |
512 | device_pm_callback_start: found | |
513 | } | |
514 | # the above is what you see on a system that's properly patched | |
515 | ||
516 | [execute the suspend] | |
517 | ||
518 | # NOTE: The suspend will only work if the screen isn't timed out, | |
519 | # so you have to press some keys first to wake it up b4 suspend) | |
520 | dev%> sh android.sh suspend mem | |
521 | ------------------------------------ | |
522 | Suspend/Resume timing test initiated | |
523 | ------------------------------------ | |
524 | hostname : asus_t100 | |
525 | kernel : 3.14.0-i386-dirty | |
526 | mode : mem | |
527 | ftrace out : /mnt/shell/emulated/0/ftrace.txt | |
528 | dmesg out : /mnt/shell/emulated/0/dmesg.txt | |
529 | log file : /mnt/shell/emulated/0/log.txt | |
530 | ------------------------------------ | |
531 | INITIALIZING FTRACE........DONE | |
532 | STARTING FTRACE | |
533 | SUSPEND START @ 21:24:02 (rtcwake in 10 seconds) | |
534 | <adb connection will now terminate> | |
535 | ||
536 | [retrieve the data from the device] | |
537 | ||
538 | # I find that you have to actually kill the adb process and | |
539 | # reconnect sometimes in order for the connection to work post-suspend | |
540 | host%> adb connect 192.168.1.6 | |
541 | # (required) get the ftrace data, this is the most important piece | |
542 | host%> adb pull /sdcard/ftrace.txt | |
543 | # (optional) get the dmesg data, this is for debugging | |
544 | host%> adb pull /sdcard/dmesg.txt | |
545 | # (optional) get the log, which just lists some test times for comparison | |
546 | host%> adb pull /sdcard/log.txt | |
547 | ||
548 | [create an output html file using sleepgraph.py] | |
549 | ||
550 | host%> sleepgraph.py -ftrace ftrace.txt | |
551 | ||
552 | You should now have an output.html with the android data, enjoy! |