Commit | Line | Data |
---|---|---|
151f4e2b | 1 | ================================================== |
62052ab1 | 2 | Runtime Power Management Framework for I/O Devices |
151f4e2b | 3 | ================================================== |
5e928f77 | 4 | |
9659cc06 | 5 | (C) 2009-2011 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. |
151f4e2b | 6 | |
7490e442 | 7 | (C) 2010 Alan Stern <stern@rowland.harvard.edu> |
151f4e2b | 8 | |
f71495f3 | 9 | (C) 2014 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com> |
5e928f77 RW |
10 | |
11 | 1. Introduction | |
151f4e2b | 12 | =============== |
5e928f77 | 13 | |
62052ab1 | 14 | Support for runtime power management (runtime PM) of I/O devices is provided |
5e928f77 RW |
15 | at the power management core (PM core) level by means of: |
16 | ||
17 | * The power management workqueue pm_wq in which bus types and device drivers can | |
18 | put their PM-related work items. It is strongly recommended that pm_wq be | |
62052ab1 | 19 | used for queuing all work items related to runtime PM, because this allows |
5e928f77 RW |
20 | them to be synchronized with system-wide power transitions (suspend to RAM, |
21 | hibernation and resume from system sleep states). pm_wq is declared in | |
22 | include/linux/pm_runtime.h and defined in kernel/power/main.c. | |
23 | ||
62052ab1 | 24 | * A number of runtime PM fields in the 'power' member of 'struct device' (which |
5e928f77 | 25 | is of the type 'struct dev_pm_info', defined in include/linux/pm.h) that can |
62052ab1 | 26 | be used for synchronizing runtime PM operations with one another. |
5e928f77 | 27 | |
62052ab1 | 28 | * Three device runtime PM callbacks in 'struct dev_pm_ops' (defined in |
5e928f77 RW |
29 | include/linux/pm.h). |
30 | ||
31 | * A set of helper functions defined in drivers/base/power/runtime.c that can be | |
62052ab1 | 32 | used for carrying out runtime PM operations in such a way that the |
5e928f77 RW |
33 | synchronization between them is taken care of by the PM core. Bus types and |
34 | device drivers are encouraged to use these functions. | |
35 | ||
62052ab1 | 36 | The runtime PM callbacks present in 'struct dev_pm_ops', the device runtime PM |
5e928f77 | 37 | fields of 'struct dev_pm_info' and the core helper functions provided for |
62052ab1 | 38 | runtime PM are described below. |
5e928f77 | 39 | |
62052ab1 | 40 | 2. Device Runtime PM Callbacks |
151f4e2b | 41 | ============================== |
5e928f77 | 42 | |
151f4e2b | 43 | There are three device runtime PM callbacks defined in 'struct dev_pm_ops':: |
5e928f77 | 44 | |
151f4e2b | 45 | struct dev_pm_ops { |
5e928f77 RW |
46 | ... |
47 | int (*runtime_suspend)(struct device *dev); | |
48 | int (*runtime_resume)(struct device *dev); | |
e1b1903e | 49 | int (*runtime_idle)(struct device *dev); |
5e928f77 | 50 | ... |
151f4e2b | 51 | }; |
5e928f77 | 52 | |
2fb242ad | 53 | The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks |
5841eb64 RW |
54 | are executed by the PM core for the device's subsystem, which may be one of |
55 | the following: | |
56 | ||
57 | 1. PM domain of the device, if the device's PM domain object, dev->pm_domain, | |
58 | is present. | |
59 | ||
60 | 2. Device type of the device, if both dev->type and dev->type->pm are present. | |
61 | ||
62 | 3. Device class of the device, if both dev->class and dev->class->pm are | |
63 | present. | |
64 | ||
65 | 4. Bus type of the device, if both dev->bus and dev->bus->pm are present. | |
66 | ||
35cd133c RW |
67 | If the subsystem chosen by applying the above rules doesn't provide the relevant |
68 | callback, the PM core will invoke the corresponding driver callback stored in | |
69 | dev->driver->pm directly (if present). | |
70 | ||
5841eb64 RW |
71 | The PM core always checks which callback to use in the order given above, so the |
72 | priority order of callbacks from high to low is: PM domain, device type, class | |
73 | and bus type. Moreover, the high-priority one will always take precedence over | |
74 | a low-priority one. The PM domain, bus type, device type and class callbacks | |
75 | are referred to as subsystem-level callbacks in what follows. | |
a6ab7aa9 | 76 | |
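The lookup order described above can be sketched in plain C. This is a simplified user-space model rather than the PM core's actual code: the structures are cut-down stand-ins for the kernel ones, and pick_runtime_suspend() and the two sample callbacks are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct device;
typedef int (*rpm_cb_t)(struct device *dev);

/* Cut-down stand-ins for the kernel structures (illustration only). */
struct dev_pm_ops { rpm_cb_t runtime_suspend; };
struct dev_pm_domain { struct dev_pm_ops ops; };
struct device_type { const struct dev_pm_ops *pm; };
struct class { const struct dev_pm_ops *pm; };
struct bus_type { const struct dev_pm_ops *pm; };
struct device_driver { const struct dev_pm_ops *pm; };

struct device {
	struct dev_pm_domain *pm_domain;
	const struct device_type *type;
	const struct class *class;
	const struct bus_type *bus;
	const struct device_driver *driver;
};

/* Hypothetical sample callbacks, used to tell the sources apart. */
int bus_suspend(struct device *dev) { (void)dev; return 0; }
int driver_suspend(struct device *dev) { (void)dev; return 0; }

/* Hypothetical helper: choose the ->runtime_suspend() callback to run. */
rpm_cb_t pick_runtime_suspend(const struct device *dev)
{
	const struct dev_pm_ops *ops = NULL;
	rpm_cb_t cb = NULL;

	assert(dev != NULL);

	/* Priority order: PM domain, device type, class, bus type. */
	if (dev->pm_domain)
		ops = &dev->pm_domain->ops;
	else if (dev->type && dev->type->pm)
		ops = dev->type->pm;
	else if (dev->class && dev->class->pm)
		ops = dev->class->pm;
	else if (dev->bus && dev->bus->pm)
		ops = dev->bus->pm;

	if (ops)
		cb = ops->runtime_suspend;

	/* The chosen subsystem has no callback: fall back to the driver. */
	if (!cb && dev->driver && dev->driver->pm)
		cb = dev->driver->pm->runtime_suspend;

	return cb;
}
```

In this model, a device with a PM domain whose ->runtime_suspend() is unset does not fall through to the bus type callback; the driver callback is used instead, which matches the rule that only the first matching subsystem is consulted.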
c7b61de5 | 77 | By default, the callbacks are always invoked in process context with interrupts |
35cd133c RW |
78 | enabled. However, the pm_runtime_irq_safe() helper function can be used to tell |
79 | the PM core that it is safe to run the ->runtime_suspend(), ->runtime_resume() | |
80 | and ->runtime_idle() callbacks for the given device in atomic context with | |
81 | interrupts disabled. This implies that the callback routines in question must | |
82 | not block or sleep, but it also means that the synchronous helper functions | |
83 | listed at the end of Section 4 may be used for that device within an interrupt | |
84 | handler or generally in an atomic context. | |
85 | ||
86 | The subsystem-level suspend callback, if present, is **entirely responsible** | |
87 | for handling the suspend of the device as appropriate, which may, but need not, | |
88 | include executing the device driver's own ->runtime_suspend() callback (from the | |
5e928f77 | 89 | PM core's point of view it is not necessary to implement a ->runtime_suspend() |
a6ab7aa9 RW |
90 | callback in a device driver as long as the subsystem-level suspend callback |
91 | knows what to do to handle the device). | |
5e928f77 | 92 | |
35cd133c RW |
93 | * Once the subsystem-level suspend callback (or the driver suspend callback, |
94 | if invoked directly) has completed successfully for the given device, the PM | |
95 | core regards the device as suspended, which need not mean that it has been | |
96 | put into a low power state. It is supposed to mean, however, that the | |
97 | device will not process data and will not communicate with the CPU(s) and | |
98 | RAM until the appropriate resume callback is executed for it. The runtime | |
99 | PM status of a device after successful execution of the suspend callback is | |
100 | 'suspended'. | |
101 | ||
102 | * If the suspend callback returns -EBUSY or -EAGAIN, the device's runtime PM | |
103 | status remains 'active', which means that the device _must_ be fully | |
104 | operational afterwards. | |
105 | ||
106 | * If the suspend callback returns an error code different from -EBUSY and | |
107 | -EAGAIN, the PM core regards this as a fatal error and will refuse to run | |
108 | the helper functions described in Section 4 for the device until its status | |
35bfa99e | 109 | is directly set to either 'active' or 'suspended' (the PM core provides |
35cd133c RW |
110 | special helper functions for this purpose). |
111 | ||
112 | In particular, if the driver requires remote wakeup capability (i.e. a hardware | |
a6ab7aa9 | 113 | mechanism allowing the device to request a change of its power state, such as |
de3ef1eb | 114 | PCI PME) for proper functioning and device_can_wakeup() returns 'false' for the |
a6ab7aa9 | 115 | device, then ->runtime_suspend() should return -EBUSY. On the other hand, if |
de3ef1eb | 116 | device_can_wakeup() returns 'true' for the device and the device is put into a |
35cd133c RW |
117 | low-power state during the execution of the suspend callback, it is expected |
118 | that remote wakeup will be enabled for the device. Generally, remote wakeup | |
119 | should be enabled for all input devices put into low-power states at run time. | |
120 | ||
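The way the suspend callback's return value maps onto the runtime PM status, as described above, can be modelled in a few lines of user-space C. This is only a sketch: struct pm_model, model_suspend() and model_set_active() are hypothetical names, and the -EINVAL returned while an error is recorded is an assumption made for the model, since the text only says that the helpers refuse to run.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

/* Hypothetical two-field model of the state tracked by the PM core. */
struct pm_model {
	enum rpm_status runtime_status;
	int runtime_error;
};

/*
 * Model of a synchronous suspend attempt; cb_ret stands for the value
 * returned by the suspend callback.
 */
int model_suspend(struct pm_model *pm, int cb_ret)
{
	assert(pm != NULL);

	if (pm->runtime_error)
		return -EINVAL;	/* assumption: refused until status is reset */
	if (pm->runtime_status == RPM_SUSPENDED)
		return 1;	/* already suspended, nothing to do */

	if (cb_ret == 0) {
		pm->runtime_status = RPM_SUSPENDED;
		return 0;
	}

	/* -EBUSY and -EAGAIN are not fatal: the device stays 'active'. */
	if (cb_ret != -EBUSY && cb_ret != -EAGAIN)
		pm->runtime_error = cb_ret;

	return cb_ret;
}

/* Model of setting the status directly, which also clears the error. */
void model_set_active(struct pm_model *pm)
{
	assert(pm != NULL);
	pm->runtime_error = 0;
	pm->runtime_status = RPM_ACTIVE;
}
```

For example, a callback result of -EIO leaves the model with a recorded error, so later suspend attempts are refused until model_set_active() is called, whereas -EBUSY leaves the device 'active' and free to retry.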
151f4e2b | 121 | The subsystem-level resume callback, if present, is **entirely responsible** for |
35cd133c RW |
122 | handling the resume of the device as appropriate, which may, but need not, |
123 | include executing the device driver's own ->runtime_resume() callback (from the | |
124 | PM core's point of view it is not necessary to implement a ->runtime_resume() | |
125 | callback in a device driver as long as the subsystem-level resume callback knows | |
126 | what to do to handle the device). | |
127 | ||
128 | * Once the subsystem-level resume callback (or the driver resume callback, if | |
129 | invoked directly) has completed successfully, the PM core regards the device | |
130 | as fully operational, which means that the device _must_ be able to complete | |
131 | I/O operations as needed. The runtime PM status of the device is then | |
132 | 'active'. | |
133 | ||
134 | * If the resume callback returns an error code, the PM core regards this as a | |
135 | fatal error and will refuse to run the helper functions described in Section | |
136 | 4 for the device until its status is directly set to either 'active' or | |
137 | 'suspended' (by means of special helper functions provided by the PM core | |
138 | for this purpose). | |
139 | ||
140 | The idle callback (a subsystem-level one, if present, or the driver one) is | |
141 | executed by the PM core whenever the device appears to be idle, which is | |
142 | indicated to the PM core by two counters, the device's usage counter and the | |
143 | counter of 'active' children of the device. | |
5e928f77 RW |
144 | |
145 | * If any of these counters is decreased using a helper function provided by | |
146 | the PM core and it turns out to be equal to zero, the other counter is | |
147 | checked. If that counter also is equal to zero, the PM core executes the | |
35cd133c | 148 | idle callback with the device as its argument. |
5e928f77 | 149 | |
35cd133c RW |
150 | The action performed by the idle callback is totally dependent on the subsystem |
151 | (or driver) in question, but the expected and recommended action is to check | |
a6ab7aa9 RW |
152 | if the device can be suspended (i.e. if all of the conditions necessary for |
153 | suspending the device are satisfied) and to queue up a suspend request for the | |
43d51af4 | 154 | device in that case. If there is no idle callback, or if the callback returns |
d66e6db2 UH |
155 | 0, then the PM core will attempt to carry out a runtime suspend of the device, |
156 | also respecting devices configured for autosuspend. In essence this means a | |
b7d46644 | 157 | call to __pm_runtime_autosuspend() (do note that drivers need to update the |
d66e6db2 UH |
158 | device last busy mark, pm_runtime_mark_last_busy(), to control the delay under |
159 | this circumstance). To prevent this (for example, if the callback routine has | |
160 | started a delayed suspend), the routine must return a non-zero value. Negative | |
161 | error return codes are ignored by the PM core. | |
5e928f77 RW |
162 | |
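The effect of the idle callback's return value, as described above, can be illustrated with a small user-space model. The names struct model_dev, core_follows_up_with_autosuspend() and the two sample callbacks are hypothetical and exist only for this sketch; the real decision is made inside the PM core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev;
typedef int (*idle_cb_t)(struct model_dev *dev);

/* Hypothetical device model holding only the idle callback. */
struct model_dev {
	idle_cb_t runtime_idle;
};

/* Sample callback: nothing prevents a suspend, so return 0. */
int idle_allows_suspend(struct model_dev *dev)
{
	(void)dev;
	return 0;
}

/* Sample callback: it has started its own delayed suspend, so return non-zero. */
int idle_started_own_suspend(struct model_dev *dev)
{
	(void)dev;
	return 1;
}

/*
 * Returns true if, after the idle notification, the core would go on to
 * attempt an autosuspend of the device.
 */
bool core_follows_up_with_autosuspend(struct model_dev *dev)
{
	assert(dev != NULL);

	/* No idle callback at all: the core proceeds on its own. */
	if (dev->runtime_idle == NULL)
		return true;

	/* Any non-zero value, positive or negative, stops the follow-up. */
	return dev->runtime_idle(dev) == 0;
}
```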
163 | The helper functions provided by the PM core, described in Section 4, guarantee | |
35cd133c RW |
164 | that the following constraints are met with respect to runtime PM callbacks for |
165 | one device: | |
5e928f77 RW |
166 | |
167 | (1) The callbacks are mutually exclusive (e.g. it is forbidden to execute | |
168 | ->runtime_suspend() in parallel with ->runtime_resume() or with another | |
169 | instance of ->runtime_suspend() for the same device) with the exception that | |
170 | ->runtime_suspend() or ->runtime_resume() can be executed in parallel with | |
171 | ->runtime_idle() (although ->runtime_idle() will not be started while any | |
172 | of the other callbacks is being executed for the same device). | |
173 | ||
174 | (2) ->runtime_idle() and ->runtime_suspend() can only be executed for 'active' | |
175 | devices (i.e. the PM core will only execute ->runtime_idle() or | |
62052ab1 | 176 | ->runtime_suspend() for the devices the runtime PM status of which is |
5e928f77 RW |
177 | 'active'). |
178 | ||
179 | (3) ->runtime_idle() and ->runtime_suspend() can only be executed for a device | |
180 | the usage counter of which is equal to zero _and_ either the counter of | |
181 | 'active' children of which is equal to zero, or the 'power.ignore_children' | |
182 | flag of which is set. | |
183 | ||
184 | (4) ->runtime_resume() can only be executed for 'suspended' devices (i.e. the | |
62052ab1 | 185 | PM core will only execute ->runtime_resume() for the devices the runtime |
5e928f77 RW |
186 | PM status of which is 'suspended'). |
187 | ||
188 | Additionally, the helper functions provided by the PM core obey the following | |
189 | rules: | |
190 | ||
191 | * If ->runtime_suspend() is about to be executed or there's a pending request | |
192 | to execute it, ->runtime_idle() will not be executed for the same device. | |
193 | ||
194 | * A request to execute or to schedule the execution of ->runtime_suspend() | |
195 | will cancel any pending requests to execute ->runtime_idle() for the same | |
196 | device. | |
197 | ||
198 | * If ->runtime_resume() is about to be executed or there's a pending request | |
199 | to execute it, the other callbacks will not be executed for the same device. | |
200 | ||
201 | * A request to execute ->runtime_resume() will cancel any pending or | |
15bcb91d AS |
202 | scheduled requests to execute the other callbacks for the same device, |
203 | except for scheduled autosuspends. | |
5e928f77 | 204 | |
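Constraints (2) and (3) above amount to a single predicate over the device's status and counters. The following user-space sketch spells it out; struct pm_counters and idle_or_suspend_allowed() are hypothetical names, and the fields merely mirror the 'power' members that the text refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

/* Hypothetical snapshot of the fields the constraints depend on. */
struct pm_counters {
	enum rpm_status runtime_status;
	int usage_count;	/* the device's usage counter */
	int child_count;	/* number of 'active' children */
	bool ignore_children;	/* models power.ignore_children */
};

/* True if ->runtime_idle() or ->runtime_suspend() may be executed. */
bool idle_or_suspend_allowed(const struct pm_counters *pm)
{
	assert(pm != NULL);

	/* Constraint (2): only 'active' devices qualify. */
	if (pm->runtime_status != RPM_ACTIVE)
		return false;

	/* Constraint (3): the usage counter must be zero ... */
	if (pm->usage_count != 0)
		return false;

	/* ... and there are no active children, or they are ignored. */
	return pm->child_count == 0 || pm->ignore_children;
}
```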
62052ab1 | 205 | 3. Runtime PM Device Fields |
151f4e2b | 206 | =========================== |
5e928f77 | 207 | |
62052ab1 | 208 | The following device runtime PM fields are present in 'struct dev_pm_info', as |
5e928f77 RW |
209 | defined in include/linux/pm.h: |
210 | ||
151f4e2b | 211 | `struct timer_list suspend_timer;` |
15bcb91d | 212 | - timer used for scheduling (delayed) suspend and autosuspend requests |
5e928f77 | 213 | |
151f4e2b | 214 | `unsigned long timer_expires;` |
5e928f77 RW |
215 | - timer expiration time, in jiffies (if this is different from zero, the |
216 | timer is running and will expire at that time, otherwise the timer is not | |
217 | running) | |
218 | ||
151f4e2b | 219 | `struct work_struct work;` |
5e928f77 RW |
220 | - work structure used for queuing up requests (i.e. work items in pm_wq) |
221 | ||
151f4e2b | 222 | `wait_queue_head_t wait_queue;` |
5e928f77 RW |
223 | - wait queue used if any of the helper functions needs to wait for another |
224 | one to complete | |
225 | ||
151f4e2b | 226 | `spinlock_t lock;` |
35bfa99e | 227 | - lock used for synchronization |
5e928f77 | 228 | |
151f4e2b | 229 | `atomic_t usage_count;` |
5e928f77 RW |
230 | - the usage counter of the device |
231 | ||
151f4e2b | 232 | `atomic_t child_count;` |
5e928f77 RW |
233 | - the count of 'active' children of the device |
234 | ||
151f4e2b | 235 | `unsigned int ignore_children;` |
5e928f77 RW |
236 | - if set, the value of child_count is ignored (but still updated) |
237 | ||
151f4e2b | 238 | `unsigned int disable_depth;` |
1f999d14 | 239 | - used for disabling the helper functions (they work normally if this is |
62052ab1 | 240 | equal to zero); the initial value of it is 1 (i.e. runtime PM is |
5e928f77 RW |
241 | initially disabled for all devices) |
242 | ||
151f4e2b | 243 | `int runtime_error;` |
5e928f77 | 244 | - if set, there was a fatal error (one of the callbacks returned an error code |
1f999d14 | 245 | as described in Section 2), so the helper functions will not work until |
5e928f77 RW |
246 | this flag is cleared; this is the error code returned by the failing |
247 | callback | |
248 | ||
151f4e2b | 249 | `unsigned int idle_notification;` |
5e928f77 RW |
250 | - if set, ->runtime_idle() is being executed |
251 | ||
151f4e2b | 252 | `unsigned int request_pending;` |
5e928f77 RW |
253 | - if set, there's a pending request (i.e. a work item queued up into pm_wq) |
254 | ||
151f4e2b | 255 | `enum rpm_request request;` |
5e928f77 RW |
256 | - type of request that's pending (valid if request_pending is set) |
257 | ||
151f4e2b | 258 | `unsigned int deferred_resume;` |
5e928f77 RW |
259 | - set if ->runtime_resume() is about to be run while ->runtime_suspend() is |
260 | being executed for that device and it is not practical to wait for the | |
261 | suspend to complete; means "start a resume as soon as you've suspended" | |
262 | ||
151f4e2b | 263 | `enum rpm_status runtime_status;` |
62052ab1 | 264 | - the runtime PM status of the device; this field's initial value is |
5e928f77 RW |
265 | RPM_SUSPENDED, which means that each device is initially regarded by the |
266 | PM core as 'suspended', regardless of its real hardware status | |
267 | ||
c24efa67 RW |
268 | `enum rpm_status last_status;` |
269 | - the last runtime PM status of the device captured before disabling runtime | |
270 | PM for it (invalid initially and when disable_depth is 0) | |
271 | ||
151f4e2b | 272 | `unsigned int runtime_auto;` |
87d1b3e6 RW |
273 | - if set, indicates that the user space has allowed the device driver to |
274 | power manage the device at run time via the /sys/devices/.../power/control | |
1992b66d BH |
275 | interface; it may only be modified with the help of the |
276 | pm_runtime_allow() and pm_runtime_forbid() helper functions | |
87d1b3e6 | 277 | |
151f4e2b | 278 | `unsigned int no_callbacks;` |
62052ab1 | 279 | - indicates that the device does not use the runtime PM callbacks (see |
7490e442 AS |
280 | Section 8); it may be modified only by the pm_runtime_no_callbacks() |
281 | helper function | |
282 | ||
151f4e2b | 283 | `unsigned int irq_safe;` |
c7b61de5 AS |
284 | - indicates that the ->runtime_suspend() and ->runtime_resume() callbacks |
285 | will be invoked with the spinlock held and interrupts disabled | |
286 | ||
151f4e2b | 287 | `unsigned int use_autosuspend;` |
15bcb91d AS |
288 | - indicates that the device's driver supports delayed autosuspend (see |
289 | Section 9); it may be modified only by the | |
290 | pm_runtime{_dont}_use_autosuspend() helper functions | |
291 | ||
151f4e2b | 292 | `unsigned int timer_autosuspends;` |
15bcb91d AS |
293 | - indicates that the PM core should attempt to carry out an autosuspend |
294 | when the timer expires rather than a normal suspend | |
295 | ||
151f4e2b | 296 | `int autosuspend_delay;` |
15bcb91d AS |
297 | - the delay time (in milliseconds) to be used for autosuspend |
298 | ||
151f4e2b | 299 | `unsigned long last_busy;` |
15bcb91d AS |
300 | - the time (in jiffies) when the pm_runtime_mark_last_busy() helper |
301 | function was last called for this device; used in calculating inactivity | |
302 | periods for autosuspend | |
303 | ||
5e928f77 RW |
304 | All of the above fields are members of the 'power' member of 'struct device'. |
305 | ||
62052ab1 | 306 | 4. Runtime PM Device Helper Functions |
151f4e2b | 307 | ===================================== |
5e928f77 | 308 | |
62052ab1 | 309 | The following runtime PM helper functions are defined in |
5e928f77 RW |
310 | drivers/base/power/runtime.c and include/linux/pm_runtime.h: |
311 | ||
151f4e2b | 312 | `void pm_runtime_init(struct device *dev);` |
62052ab1 | 313 | - initialize the device runtime PM fields in 'struct dev_pm_info' |
5e928f77 | 314 | |
151f4e2b | 315 | `void pm_runtime_remove(struct device *dev);` |
62052ab1 | 316 | - make sure that the runtime PM of the device will be disabled after |
5e928f77 RW |
317 | removing the device from the device hierarchy |
318 | ||
151f4e2b | 319 | `int pm_runtime_idle(struct device *dev);` |
43d51af4 AS |
320 | - execute the subsystem-level idle callback for the device; returns an |
321 | error code on failure, where -EINPROGRESS means that ->runtime_idle() is | |
322 | already being executed; if there is no callback or the callback returns 0 | |
d66e6db2 | 323 | then run pm_runtime_autosuspend(dev) and return its result |
5e928f77 | 324 | |
151f4e2b | 325 | `int pm_runtime_suspend(struct device *dev);` |
a6ab7aa9 | 326 | - execute the subsystem-level suspend callback for the device; returns 0 on |
62052ab1 | 327 | success, 1 if the device's runtime PM status was already 'suspended', or |
5e928f77 | 328 | error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt |
632e270e RW |
329 | to suspend the device again in future and -EACCES means that |
330 | 'power.disable_depth' is different from 0 | |
5e928f77 | 331 | |
151f4e2b | 332 | `int pm_runtime_autosuspend(struct device *dev);` |
15bcb91d | 333 | - same as pm_runtime_suspend() except that the autosuspend delay is taken |
151f4e2b | 334 | into account; if pm_runtime_autosuspend_expiration() says the delay has |
15bcb91d AS |
335 | not yet expired then an autosuspend is scheduled for the appropriate time |
336 | and 0 is returned | |
337 | ||
151f4e2b | 338 | `int pm_runtime_resume(struct device *dev);` |
de8164fb | 339 | - execute the subsystem-level resume callback for the device; returns 0 on |
c24efa67 RW |
340 | success, 1 if the device's runtime PM status is already 'active' (also if |
341 | 'power.disable_depth' is nonzero, but the status was 'active' when it was | |
342 | changing from 0 to 1) or error code on failure, where -EAGAIN means it may | |
343 | be safe to attempt to resume the device again in future, but | |
344 | 'power.runtime_error' should be checked additionally, and -EACCES means | |
345 | that the callback could not be run, because 'power.disable_depth' was | |
632e270e | 346 | different from 0 |
5e928f77 | 347 | |
2c412337 AS |
348 | `int pm_runtime_resume_and_get(struct device *dev);` |
349 | - run pm_runtime_resume(dev) and if successful, increment the device's | |
350 | usage counter; return the result of pm_runtime_resume() | |
351 | ||
151f4e2b | 352 | `int pm_request_idle(struct device *dev);` |
a6ab7aa9 RW |
353 | - submit a request to execute the subsystem-level idle callback for the |
354 | device (the request is represented by a work item in pm_wq); returns 0 on | |
355 | success or error code if the request has not been queued up | |
5e928f77 | 356 | |
151f4e2b | 357 | `int pm_request_autosuspend(struct device *dev);` |
15bcb91d AS |
358 | - schedule the execution of the subsystem-level suspend callback for the |
359 | device when the autosuspend delay has expired; if the delay has already | |
360 | expired then the work item is queued up immediately | |
361 | ||
151f4e2b | 362 | `int pm_schedule_suspend(struct device *dev, unsigned int delay);` |
a6ab7aa9 RW |
363 | - schedule the execution of the subsystem-level suspend callback for the |
364 | device in future, where 'delay' is the time to wait before queuing up a | |
365 | suspend work item in pm_wq, in milliseconds (if 'delay' is zero, the work | |
366 | item is queued up immediately); returns 0 on success, 1 if the device's | |
62052ab1 | 367 | runtime PM status was already 'suspended', or error code if the request |
5e928f77 RW |
368 | hasn't been scheduled (or queued up if 'delay' is 0); if the execution of |
369 | ->runtime_suspend() is already scheduled and not yet expired, the new | |
370 | value of 'delay' will be used as the time to wait | |
371 | ||
151f4e2b | 372 | `int pm_request_resume(struct device *dev);` |
a6ab7aa9 RW |
373 | - submit a request to execute the subsystem-level resume callback for the |
374 | device (the request is represented by a work item in pm_wq); returns 0 on | |
62052ab1 | 375 | success, 1 if the device's runtime PM status was already 'active', or |
5e928f77 RW |
376 | error code if the request hasn't been queued up |
377 | ||
151f4e2b | 378 | `void pm_runtime_get_noresume(struct device *dev);` |
5e928f77 RW |
379 | - increment the device's usage counter |
380 | ||
151f4e2b | 381 | `int pm_runtime_get(struct device *dev);` |
5e928f77 RW |
382 | - increment the device's usage counter, run pm_request_resume(dev) and |
383 | return its result | |
384 | ||
151f4e2b | 385 | `int pm_runtime_get_sync(struct device *dev);` |
5e928f77 | 386 | - increment the device's usage counter, run pm_runtime_resume(dev) and |
c58e7ed2 KK |
387 | return its result; |
388 | note that it does not drop the device's usage counter on errors, so | |
389 | consider using pm_runtime_resume_and_get() instead of it, especially | |
390 | if its return value is checked by the caller, as this is likely to | |
391 | result in cleaner code. | |
5e928f77 | 392 | |
151f4e2b | 393 | `int pm_runtime_get_if_in_use(struct device *dev);` |
a436b6a1 RW |
394 | - return -EINVAL if 'power.disable_depth' is nonzero; otherwise, if the |
395 | runtime PM status is RPM_ACTIVE and the runtime PM usage counter is | |
396 | nonzero, increment the counter and return 1; otherwise return 0 without | |
397 | changing the counter | |
398 | ||
c0ef3df8 | 399 | `int pm_runtime_get_if_active(struct device *dev);` |
c111566b | 400 | - return -EINVAL if 'power.disable_depth' is nonzero; otherwise, if the |
c0ef3df8 | 401 | runtime PM status is RPM_ACTIVE, increment the counter and |
c111566b SA |
402 | return 1; otherwise return 0 without changing the counter |
403 | ||
151f4e2b | 404 | `void pm_runtime_put_noidle(struct device *dev);` |
5e928f77 RW |
405 | - decrement the device's usage counter |
406 | ||
151f4e2b | 407 | `int pm_runtime_put(struct device *dev);` |
15bcb91d AS |
408 | - decrement the device's usage counter; if the result is 0 then run |
409 | pm_request_idle(dev) and return its result | |
410 | ||
151f4e2b | 411 | `int pm_runtime_put_autosuspend(struct device *dev);` |
b7d46644 SA |
412 | - for now, does the same as __pm_runtime_put_autosuspend(); in the |
413 | future it will also call pm_runtime_mark_last_busy(). DO NOT USE! | |
414 | ||
415 | `int __pm_runtime_put_autosuspend(struct device *dev);` | |
15bcb91d AS |
416 | - decrement the device's usage counter; if the result is 0 then run |
417 | pm_request_autosuspend(dev) and return its result | |
5e928f77 | 418 | |
151f4e2b | 419 | `int pm_runtime_put_sync(struct device *dev);` |
15bcb91d AS |
420 | - decrement the device's usage counter; if the result is 0 then run |
421 | pm_runtime_idle(dev) and return its result | |
422 | ||
151f4e2b | 423 | `int pm_runtime_put_sync_suspend(struct device *dev);` |
c7b61de5 AS |
424 | - decrement the device's usage counter; if the result is 0 then run |
425 | pm_runtime_suspend(dev) and return its result | |
426 | ||
151f4e2b | 427 | `int pm_runtime_put_sync_autosuspend(struct device *dev);` |
15bcb91d AS |
428 | - decrement the device's usage counter; if the result is 0 then run |
429 | pm_runtime_autosuspend(dev) and return its result | |
5e928f77 | 430 | |
151f4e2b | 431 | `void pm_runtime_enable(struct device *dev);` |
e358bad7 | 432 | - decrement the device's 'power.disable_depth' field; if that field is equal |
62052ab1 | 433 | to zero, the runtime PM helper functions can execute subsystem-level |
e358bad7 | 434 | callbacks described in Section 2 for the device |
5e928f77 | 435 | |
151f4e2b | 436 | `int pm_runtime_disable(struct device *dev);` |
e358bad7 RW |
437 | - increment the device's 'power.disable_depth' field (if the value of that |
438 | field was previously zero, this prevents subsystem-level runtime PM | |
91e63cc0 GU |
439 | callbacks from being run for the device), make sure that all of the |
440 | pending runtime PM operations on the device are either completed or | |
441 | canceled; returns 1 if there was a resume request pending and it was | |
442 | necessary to execute the subsystem-level resume callback for the device | |
443 | to satisfy that request, otherwise 0 is returned | |
5e928f77 | 444 | |
151f4e2b | 445 | `int pm_runtime_barrier(struct device *dev);` |
e358bad7 RW |
446 | - check if there's a resume request pending for the device and resume it |
447 | (synchronously) in that case, cancel any other pending runtime PM requests | |
448 | regarding it and wait for all runtime PM operations on it in progress to | |
449 | complete; returns 1 if there was a resume request pending and it was | |
450 | necessary to execute the subsystem-level resume callback for the device to | |
451 | satisfy that request, otherwise 0 is returned | |
452 | ||
151f4e2b | 453 | `void pm_suspend_ignore_children(struct device *dev, bool enable);` |
5e928f77 RW |
454 | - set/unset the power.ignore_children flag of the device |
455 | ||
151f4e2b | 456 | `int pm_runtime_set_active(struct device *dev);` |
62052ab1 | 457 | - clear the device's 'power.runtime_error' flag, set the device's runtime |
5e928f77 RW |
458 | PM status to 'active' and update its parent's counter of 'active' |
459 | children as appropriate (it is only valid to use this function if | |
460 | 'power.runtime_error' is set or 'power.disable_depth' is greater than | |
461 | zero); it will fail and return error code if the device has a parent | |
462 | which is not active and the 'power.ignore_children' flag of which is unset | |
463 | ||
151f4e2b | 464 | `void pm_runtime_set_suspended(struct device *dev);` |
62052ab1 | 465 | - clear the device's 'power.runtime_error' flag, set the device's runtime |
5e928f77 RW |
466 | PM status to 'suspended' and update its parent's counter of 'active' |
467 | children as appropriate (it is only valid to use this function if | |
468 | 'power.runtime_error' is set or 'power.disable_depth' is greater than | |
f8817f61 | 469 | zero) |
5e928f77 | 470 | |
151f4e2b | 471 | `bool pm_runtime_active(struct device *dev);` |
fbadc58d SL |
472 | - return true if the device's runtime PM status is 'active' or its |
473 | 'power.disable_depth' field is not equal to zero, or false otherwise | |
474 | ||
151f4e2b | 475 | `bool pm_runtime_suspended(struct device *dev);` |
f08f5a0a RW |
476 | - return true if the device's runtime PM status is 'suspended' and its |
477 | 'power.disable_depth' field is equal to zero, or false otherwise | |
d690b2cd | 478 | |
151f4e2b | 479 | `bool pm_runtime_status_suspended(struct device *dev);` |
f3393b62 KH |
480 | - return true if the device's runtime PM status is 'suspended' |
481 | ||
151f4e2b | 482 | `void pm_runtime_allow(struct device *dev);` |
87d1b3e6 RW |
483 | - set the power.runtime_auto flag for the device and decrease its usage |
484 | counter (used by the /sys/devices/.../power/control interface to | |
485 | effectively allow the device to be power managed at run time) | |
486 | ||
151f4e2b | 487 | `void pm_runtime_forbid(struct device *dev);` |
87d1b3e6 RW |
488 | - unset the power.runtime_auto flag for the device and increase its usage |
489 | counter (used by the /sys/devices/.../power/control interface to | |
490 | effectively prevent the device from being power managed at run time) | |
491 | ||
151f4e2b | 492 | `void pm_runtime_no_callbacks(struct device *dev);` |
62052ab1 | 493 | - set the power.no_callbacks flag for the device and remove the runtime |
7490e442 AS |
494 | PM attributes from /sys/devices/.../power (or prevent them from being |
495 | added when the device is registered) | |
496 | ||
151f4e2b | 497 | `void pm_runtime_irq_safe(struct device *dev);` |
c7b61de5 | 498 | - set the power.irq_safe flag for the device, causing the runtime-PM |
64584eb9 | 499 | callbacks to be invoked with interrupts off |
c7b61de5 | 500 | |
151f4e2b | 501 | `bool pm_runtime_is_irq_safe(struct device *dev);` |
3fb1581e KK |
502 | - return true if power.irq_safe flag was set for the device, causing |
503 | the runtime-PM callbacks to be invoked with interrupts off | |
504 | ||
151f4e2b | 505 | `void pm_runtime_mark_last_busy(struct device *dev);` |
15bcb91d AS |
506 | - set the power.last_busy field to the current time |
507 | ||
151f4e2b | 508 | `void pm_runtime_use_autosuspend(struct device *dev);` |
bafdcde7 JH |
509 | - set the power.use_autosuspend flag, enabling autosuspend delays; call |
510 | pm_runtime_get_sync if the flag was previously cleared and | |
511 | power.autosuspend_delay is negative | |
15bcb91d | 512 | |
151f4e2b | 513 | `void pm_runtime_dont_use_autosuspend(struct device *dev);` |
bafdcde7 JH |
514 | - clear the power.use_autosuspend flag, disabling autosuspend delays; |
515 | decrement the device's usage counter if the flag was previously set and | |
516 | power.autosuspend_delay is negative; call pm_runtime_idle | |
15bcb91d | 517 | |
151f4e2b | 518 | `void pm_runtime_set_autosuspend_delay(struct device *dev, int delay);` |
15bcb91d | 519 | - set the power.autosuspend_delay value to 'delay' (expressed in |
62052ab1 | 520 | milliseconds); if 'delay' is negative then runtime suspends are |
bafdcde7 JH |
521 | prevented; if power.use_autosuspend is set, pm_runtime_get_sync may be |
522 | called or the device's usage counter may be decremented and | |
523 | pm_runtime_idle called depending on whether power.autosuspend_delay is | |
524 | changed to or from a negative value; if power.use_autosuspend is clear, | |
525 | pm_runtime_idle is called | |
15bcb91d | 526 | |
151f4e2b | 527 | `unsigned long pm_runtime_autosuspend_expiration(struct device *dev);` |
15bcb91d AS |
528 | - calculate the time when the current autosuspend delay period will expire, |
529 | based on power.last_busy and power.autosuspend_delay; if the delay time | |
530 | is 1000 ms or larger then the expiration time is rounded up to the | |
531 | nearest second; returns 0 if the delay period has already expired or | |
532 | power.use_autosuspend isn't set, otherwise returns the expiration time | |
533 | in jiffies | |
534 | ||
5e928f77 RW |
535 | It is safe to execute the following helper functions from interrupt context: |
536 | ||
151f4e2b MCC |
537 | - pm_request_idle() |
538 | - pm_request_autosuspend() | |
539 | - pm_schedule_suspend() | |
540 | - pm_request_resume() | |
541 | - pm_runtime_get_noresume() | |
542 | - pm_runtime_get() | |
543 | - pm_runtime_put_noidle() | |
544 | - pm_runtime_put() | |
545 | - pm_runtime_put_autosuspend() | |
b7d46644 | 546 | - __pm_runtime_put_autosuspend() |
151f4e2b MCC |
547 | - pm_runtime_enable() |
548 | - pm_suspend_ignore_children() | |
549 | - pm_runtime_set_active() | |
550 | - pm_runtime_set_suspended() | |
551 | - pm_runtime_suspended() | |
552 | - pm_runtime_mark_last_busy() | |
553 | - pm_runtime_autosuspend_expiration() | |
5e928f77 | 554 | |
c7b61de5 AS |
555 | If pm_runtime_irq_safe() has been called for a device then the following helper |
556 | functions may also be used in interrupt context: | |
557 | ||
151f4e2b MCC |
558 | - pm_runtime_idle() |
559 | - pm_runtime_suspend() | |
560 | - pm_runtime_autosuspend() | |
561 | - pm_runtime_resume() | |
562 | - pm_runtime_get_sync() | |
563 | - pm_runtime_put_sync() | |
564 | - pm_runtime_put_sync_suspend() | |
565 | - pm_runtime_put_sync_autosuspend() | |
c7b61de5 | 566 | |
62052ab1 | 567 | 5. Runtime PM Initialization, Device Probing and Removal |
151f4e2b | 568 | ======================================================== |
5e928f77 | 569 | |
62052ab1 | 570 | Initially, runtime PM is disabled for all devices, which means that the |
1f999d14 | 571 | majority of the runtime PM helper functions described in Section 4 will return |
5e928f77 RW |
572 | -EAGAIN until pm_runtime_enable() is called for the device. |
573 | ||
62052ab1 | 574 | In addition to that, the initial runtime PM status of all devices is |
5e928f77 RW |
575 | 'suspended', but it need not reflect the actual physical state of the device. |
576 | Thus, if the device is initially active (i.e. it is able to process I/O), its | |
62052ab1 | 577 | runtime PM status must be changed to 'active', with the help of |
5e928f77 RW |
578 | pm_runtime_set_active(), before pm_runtime_enable() is called for the device. |
579 | ||
62052ab1 | 580 | However, if the device has a parent and the parent's runtime PM is enabled, |
5e928f77 RW |
581 | calling pm_runtime_set_active() for the device will affect the parent, unless |
582 | the parent's 'power.ignore_children' flag is set. Namely, in that case the | |
583 | parent won't be able to suspend at run time, using the PM core's helper | |
584 | functions, as long as the child's status is 'active', even if the child's | |
62052ab1 | 585 | runtime PM is still disabled (i.e. pm_runtime_enable() hasn't been called for |
5e928f77 RW |
586 | the child yet or pm_runtime_disable() has been called for it). For this reason, |
587 | once pm_runtime_set_active() has been called for the device, pm_runtime_enable() | |
62052ab1 | 588 | should be called for it too as soon as reasonably possible or its runtime PM |
5e928f77 RW |
589 | status should be changed back to 'suspended' with the help of |
590 | pm_runtime_set_suspended(). | |
591 | ||
62052ab1 | 592 | If the default initial runtime PM status of the device (i.e. 'suspended') |
5e928f77 RW |
593 | reflects the actual state of the device, its bus type's or its driver's |
594 | ->probe() callback will likely need to wake it up using one of the PM core's | |
595 | helper functions described in Section 4. In that case, pm_runtime_resume() | |
62052ab1 | 596 | should be used. Of course, for this purpose the device's runtime PM has to be |
5e928f77 RW |
597 | enabled earlier by calling pm_runtime_enable(). |
598 | ||
f6a2fbb9 | 599 | Note that if the device may execute pm_runtime calls during the probe (such as |
30966309 | 600 | if it is registered with a subsystem that may call back in), then a |
f6a2fbb9 BD |
601 | pm_runtime_get_sync() call paired with a pm_runtime_put() call is | |
602 | appropriate to ensure that the device is not put back to sleep during the | |
603 | probe. This can happen with systems such as the network device layer. | |
604 | ||
ea309944 | 605 | It may be desirable to suspend the device once ->probe() has finished. |
35bfa99e | 606 | Therefore the driver core uses the asynchronous pm_request_idle() to submit a |
ea309944 | 607 | request to execute the subsystem-level idle callback for the device at that |
30966309 | 608 | time. A driver that makes use of the runtime autosuspend feature may want to |
ea309944 | 609 | update the last busy mark before returning from ->probe(). |
f5da24db RW |
610 | |
611 | Moreover, the driver core prevents runtime PM callbacks from racing with the bus | |
30966309 | 612 | notifier callback in __device_release_driver(), which is necessary because the |
f5da24db RW |
613 | notifier is used by some subsystems to carry out operations affecting the |
614 | runtime PM functionality. It does so by calling pm_runtime_get_sync() before | |
615 | driver_sysfs_remove() and the BUS_NOTIFY_UNBIND_DRIVER notifications. This | |
616 | resumes the device if it's in the suspended state and prevents it from | |
617 | being suspended again while those routines are being executed. | |
618 | ||
619 | To allow bus types and drivers to put devices into the suspended state by | |
620 | calling pm_runtime_suspend() from their ->remove() routines, the driver core | |
621 | executes pm_runtime_put_sync() after running the BUS_NOTIFY_UNBIND_DRIVER | |
622 | notifications in __device_release_driver(). This requires bus types and | |
623 | drivers to make their ->remove() callbacks avoid races with runtime PM directly, | |
30966309 | 624 | but it also allows more flexibility in the handling of devices during the |
f5da24db | 625 | removal of their drivers. |
f1212ae1 | 626 | |
8fd2910e KK |
627 | In their ->remove() callbacks, drivers should undo the runtime PM changes made |
628 | in ->probe(). Usually this means calling pm_runtime_disable(), | |
629 | pm_runtime_dont_use_autosuspend(), etc. | |
630 | ||
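The probe-time and remove-time ordering described in this section can be
sketched as follows. This is only an illustration, not code from the kernel:
the struct device stand-in and the stub helper bodies are assumptions made so
that the fragment compiles outside the kernel (in a real driver the helpers
come from include/linux/pm_runtime.h), and foo_probe(), foo_remove(),
foo_note() and foo_log are hypothetical names. The device is assumed to be
powered up when ->probe() runs.

```c
#include <string.h>

/* Stand-in for the kernel's struct device (assumption, illustration only). */
struct device {
	int usage_count;
	int enabled;
	int active;
};

/* Hypothetical call log so that the ordering can be inspected. */
static char foo_log[256];

static void foo_note(const char *name)
{
	strcat(foo_log, name);
	strcat(foo_log, ";");
}

/* Stubs modelling only the bookkeeping that the text describes. */
static int pm_runtime_set_active(struct device *dev)
{
	dev->active = 1;
	foo_note("set_active");
	return 0;
}

static void pm_runtime_enable(struct device *dev)
{
	dev->enabled = 1;
	foo_note("enable");
}

static void pm_runtime_disable(struct device *dev)
{
	dev->enabled = 0;
	foo_note("disable");
}

static int pm_runtime_get_sync(struct device *dev)
{
	dev->usage_count++;
	dev->active = 1;
	foo_note("get_sync");
	return 0;
}

static int pm_runtime_put(struct device *dev)
{
	dev->usage_count--;
	foo_note("put");
	return 0;
}

static int foo_probe(struct device *dev)
{
	/* The hardware is already powered: fix up the status, then enable. */
	pm_runtime_set_active(dev);
	pm_runtime_enable(dev);

	/* Hold a reference while the subsystem may call back in. */
	pm_runtime_get_sync(dev);
	/* ... register with the subsystem, set up the hardware ... */
	pm_runtime_put(dev);
	return 0;
}

static void foo_remove(struct device *dev)
{
	/* Undo the runtime PM changes made in foo_probe(). */
	pm_runtime_disable(dev);
}
```

A driver that also uses autosuspend would additionally call
pm_runtime_dont_use_autosuspend() in its remove path, as noted above.
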
87d1b3e6 RW |
631 | User space can effectively prevent the driver of the device from power managing |
632 | it at run time by changing the value of its /sys/devices/.../power/control | |
633 | attribute to "on", which causes pm_runtime_forbid() to be called. In principle, | |
634 | this mechanism may also be used by the driver to effectively turn off the | |
62052ab1 RW |
635 | runtime power management of the device until the user space turns it on. |
636 | Namely, during the initialization the driver can make sure that the runtime PM | |
87d1b3e6 RW |
637 | status of the device is 'active' and call pm_runtime_forbid(). It should be |
638 | noted, however, that if the user space has already intentionally changed the | |
639 | value of /sys/devices/.../power/control to "auto" to allow the driver to power | |
640 | manage the device at run time, the driver may confuse it by using | |
641 | pm_runtime_forbid() this way. | |
642 | ||
62052ab1 | 643 | 6. Runtime PM and System Sleep |
151f4e2b | 644 | ============================== |
f1212ae1 | 645 | |
62052ab1 | 646 | Runtime PM and system sleep (i.e., system suspend and hibernation, also known |
f1212ae1 AS |
647 | as suspend-to-RAM and suspend-to-disk) interact with each other in a couple of |
648 | ways. If a device is active when a system sleep starts, everything is | |
649 | straightforward. But what should happen if the device is already suspended? | |
650 | ||
62052ab1 RW |
651 | The device may have different wake-up settings for runtime PM and system sleep. |
652 | For example, remote wake-up may be enabled for runtime suspend but disallowed | |
f1212ae1 AS |
653 | for system sleep (device_may_wakeup(dev) returns 'false'). When this happens, |
654 | the subsystem-level system suspend callback is responsible for changing the | |
655 | device's wake-up setting (it may leave that to the device driver's system | |
656 | suspend routine). It may be necessary to resume the device and suspend it again | |
657 | in order to do so. The same is true if the driver uses different power levels | |
62052ab1 | 658 | or other settings for runtime suspend and system sleep. |
f1212ae1 | 659 | |
455716e9 RW |
660 | During system resume, the simplest approach is to bring all devices back to full |
661 | power, even if they had been suspended before the system suspend began. There | |
662 | are several reasons for this, including: | |
f1212ae1 AS |
663 | |
664 | * The device might need to switch power levels, wake-up settings, etc. | |
665 | ||
666 | * Remote wake-up events might have been lost by the firmware. | |
667 | ||
668 | * The device's children may need the device to be at full power in order | |
669 | to resume themselves. | |
670 | ||
671 | * The driver's idea of the device state may not agree with the device's | |
672 | physical state. This can happen during resume from hibernation. | |
673 | ||
674 | * The device might need to be reset. | |
675 | ||
676 | * Even though the device was suspended, if its usage counter was > 0 then most | |
62052ab1 | 677 | likely it would need a runtime resume in the near future anyway. |
f1212ae1 | 678 | |
455716e9 | 679 | If the device had been suspended before the system suspend began and it's |
62052ab1 | 680 | brought back to full power during resume, then its runtime PM status will have |
455716e9 RW |
681 | to be updated to reflect the actual post-system sleep status. The way to do |
682 | this is: | |
f1212ae1 | 683 | |
151f4e2b MCC |
684 | - pm_runtime_disable(dev); |
685 | - pm_runtime_set_active(dev); | |
686 | - pm_runtime_enable(dev); | |
f1212ae1 | 687 | |
62052ab1 | 688 | The PM core always increments the runtime usage counter before calling the |
1e2ef05b | 689 | ->suspend() callback and decrements it after calling the ->resume() callback. |
62052ab1 | 690 | Hence disabling runtime PM temporarily like this will not cause any runtime |
1e2ef05b RW |
691 | suspend attempts to be permanently lost. If the usage count goes to zero |
692 | following the return of the ->resume() callback, the ->runtime_idle() callback | |
693 | will be invoked as usual. | |
694 | ||
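The three-step status update above can be sketched as a system resume
callback. The fragment below is a userspace model rather than kernel code:
the struct device stand-in, its two fields and the stub helper bodies are
assumptions, and foo_resume() is a hypothetical name. The stub for
pm_runtime_set_active() models the rule that the status may only be changed
while runtime PM is disabled, which is why the call is bracketed by
pm_runtime_disable() and pm_runtime_enable().

```c
#include <errno.h>

/* Stand-in for the kernel's struct device (assumption, illustration only). */
struct device {
	int disable_depth;	/* greater than zero means runtime PM is disabled */
	int active;		/* 1 = 'active', 0 = 'suspended' */
};

static void pm_runtime_disable(struct device *dev)
{
	dev->disable_depth++;
}

static void pm_runtime_enable(struct device *dev)
{
	if (dev->disable_depth > 0)
		dev->disable_depth--;
}

/* Modelled rule: refuse to change the status while runtime PM is enabled. */
static int pm_runtime_set_active(struct device *dev)
{
	if (dev->disable_depth == 0)
		return -EAGAIN;
	dev->active = 1;
	return 0;
}

/* Hypothetical system resume callback of a driver. */
static int foo_resume(struct device *dev)
{
	int ret;

	/* ... bring the hardware back to full power ... */

	pm_runtime_disable(dev);
	ret = pm_runtime_set_active(dev);
	pm_runtime_enable(dev);
	return ret;
}
```
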
455716e9 RW |
695 | On some systems, however, system sleep is not entered through a global firmware |
696 | or hardware operation. Instead, all hardware components are put into low-power | |
697 | states directly by the kernel in a coordinated way. Then, the system sleep | |
698 | state effectively follows from the states the hardware components end up in | |
699 | and the system is woken up from that state by a hardware interrupt or a similar | |
700 | mechanism entirely under the kernel's control. As a result, the kernel never | |
701 | gives control away and the states of all devices during resume are precisely | |
702 | known to it. If that is the case and none of the situations listed above takes | |
703 | place (in particular, if the system is not waking up from hibernation), it may | |
704 | be more efficient to leave the devices that had been suspended before the system | |
705 | suspend began in the suspended state. | |
706 | ||
f71495f3 RW |
707 | To this end, the PM core provides a mechanism allowing some coordination between |
708 | different levels of device hierarchy. Namely, if a system suspend .prepare() | |
709 | callback returns a positive number for a device, that indicates to the PM core | |
710 | that the device appears to be runtime-suspended and its state is fine, so it | |
711 | may be left in runtime suspend provided that all of its descendants are also | |
712 | left in runtime suspend. If that happens, the PM core will not execute any | |
713 | system suspend and resume callbacks for any of those devices, except for the | |
30966309 | 714 | .complete() callback, which is then entirely responsible for handling the device |
f71495f3 | 715 | as appropriate. This only applies to system suspend transitions that are not |
66ccc64f | 716 | related to hibernation (see Documentation/driver-api/pm/devices.rst for more |
f71495f3 RW |
717 | information). |
718 | ||
1e2ef05b RW |
719 | The PM core does its best to reduce the probability of race conditions between |
720 | the runtime PM and system suspend/resume (and hibernation) callbacks by carrying | |
721 | out the following operations: | |
722 | ||
4ec6a9cc RW |
723 | * During system suspend pm_runtime_get_noresume() is called for every device |
724 | right before executing the subsystem-level .prepare() callback for it and | |
725 | pm_runtime_barrier() is called for every device right before executing the | |
726 | subsystem-level .suspend() callback for it. In addition to that the PM core | |
30966309 | 727 | calls __pm_runtime_disable() with 'false' as the second argument for every |
4ec6a9cc RW |
728 | device right before executing the subsystem-level .suspend_late() callback |
729 | for it. | |
730 | ||
731 | * During system resume pm_runtime_enable() and pm_runtime_put() are called for | |
732 | every device right after executing the subsystem-level .resume_early() | |
733 | callback and right after executing the subsystem-level .complete() callback | |
9f6d8f6a | 734 | for it, respectively. |
1e2ef05b | 735 | |
d690b2cd | 736 | 7. Generic subsystem callbacks |
e6509568 | 737 | ============================== |
d690b2cd RW |
738 | |
739 | Subsystems may wish to conserve code space by using the set of generic power | |
740 | management callbacks provided by the PM core, defined in | |
741 | drivers/base/power/generic_ops.c: | |
742 | ||
151f4e2b | 743 | `int pm_generic_runtime_suspend(struct device *dev);` |
d690b2cd | 744 | - invoke the ->runtime_suspend() callback provided by the driver of this |
39c29f3d | 745 | device and return its result, or return 0 if not defined |
d690b2cd | 746 | |
151f4e2b | 747 | `int pm_generic_runtime_resume(struct device *dev);` |
d690b2cd | 748 | - invoke the ->runtime_resume() callback provided by the driver of this |
39c29f3d | 749 | device and return its result, or return 0 if not defined |
d690b2cd | 750 | |
151f4e2b | 751 | `int pm_generic_suspend(struct device *dev);` |
d690b2cd RW |
752 | - if the device has not been suspended at run time, invoke the ->suspend() |
753 | callback provided by its driver and return its result, or return 0 if not | |
754 | defined | |
755 | ||
151f4e2b | 756 | `int pm_generic_suspend_noirq(struct device *dev);` |
e5291928 RW |
757 | - if pm_runtime_suspended(dev) returns "false", invoke the ->suspend_noirq() |
758 | callback provided by the device's driver and return its result, or return | |
759 | 0 if not defined | |
760 | ||
151f4e2b | 761 | `int pm_generic_resume(struct device *dev);` |
d690b2cd RW |
762 | - invoke the ->resume() callback provided by the driver of this device and, |
763 | if successful, change the device's runtime PM status to 'active' | |
764 | ||
151f4e2b | 765 | `int pm_generic_resume_noirq(struct device *dev);` |
e5291928 RW |
766 | - invoke the ->resume_noirq() callback provided by the driver of this device |
767 | ||
151f4e2b | 768 | `int pm_generic_freeze(struct device *dev);` |
d690b2cd RW |
769 | - if the device has not been suspended at run time, invoke the ->freeze() |
770 | callback provided by its driver and return its result, or return 0 if not | |
771 | defined | |
772 | ||
151f4e2b | 773 | `int pm_generic_freeze_noirq(struct device *dev);` |
e5291928 RW |
774 | - if pm_runtime_suspended(dev) returns "false", invoke the ->freeze_noirq() |
775 | callback provided by the device's driver and return its result, or return | |
776 | 0 if not defined | |
777 | ||
151f4e2b | 778 | `int pm_generic_thaw(struct device *dev);` |
d690b2cd RW |
779 | - if the device has not been suspended at run time, invoke the ->thaw() |
780 | callback provided by its driver and return its result, or return 0 if not | |
781 | defined | |
782 | ||
151f4e2b | 783 | `int pm_generic_thaw_noirq(struct device *dev);` |
e5291928 RW |
784 | - if pm_runtime_suspended(dev) returns "false", invoke the ->thaw_noirq() |
785 | callback provided by the device's driver and return its result, or return | |
786 | 0 if not defined | |
787 | ||
151f4e2b | 788 | `int pm_generic_poweroff(struct device *dev);` |
d690b2cd RW |
789 | - if the device has not been suspended at run time, invoke the ->poweroff() |
790 | callback provided by its driver and return its result, or return 0 if not | |
791 | defined | |
792 | ||
151f4e2b | 793 | `int pm_generic_poweroff_noirq(struct device *dev);` |
e5291928 RW |
794 | - if pm_runtime_suspended(dev) returns "false", run the ->poweroff_noirq() |
795 | callback provided by the device's driver and return its result, or return | |
796 | 0 if not defined | |
797 | ||
151f4e2b | 798 | `int pm_generic_restore(struct device *dev);` |
d690b2cd RW |
799 | - invoke the ->restore() callback provided by the driver of this device and, |
800 | if successful, change the device's runtime PM status to 'active' | |
801 | ||
151f4e2b | 802 | `int pm_generic_restore_noirq(struct device *dev);` |
e5291928 RW |
803 | - invoke the ->restore_noirq() callback provided by the device's driver |
804 | ||
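The "invoke the driver's callback and return its result, or return 0 if not
defined" behaviour that several of the helpers above share can be modelled in
a few lines. The sketch below is a userspace model, not the kernel source:
the structures are cut-down stand-ins holding only the fields needed here,
and model_generic_runtime_suspend() and foo_runtime_suspend() are
hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

struct device;

/* Cut-down stand-ins for the kernel structures (assumptions). */
struct dev_pm_ops {
	int (*runtime_suspend)(struct device *dev);
};

struct device_driver {
	const struct dev_pm_ops *pm;
};

struct device {
	struct device_driver *driver;
};

/*
 * Call the driver's ->runtime_suspend() if one is defined and pass its
 * result through; otherwise report success.
 */
static int model_generic_runtime_suspend(struct device *dev)
{
	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;

	if (pm && pm->runtime_suspend)
		return pm->runtime_suspend(dev);
	return 0;
}

/* Hypothetical driver callback that refuses to suspend. */
static int foo_runtime_suspend(struct device *dev)
{
	(void)dev;
	return -EBUSY;
}
```

With this model, a device without a driver, or with a driver that leaves
->runtime_suspend() unset, yields 0, while a driver callback's error code is
returned unchanged.
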
30966309 | 805 | These functions are the defaults used by the PM core if a subsystem doesn't |
fd6fe826 | 806 | provide its own callbacks for ->runtime_idle(), ->runtime_suspend(), |
e5291928 RW |
807 | ->runtime_resume(), ->suspend(), ->suspend_noirq(), ->resume(), |
808 | ->resume_noirq(), ->freeze(), ->freeze_noirq(), ->thaw(), ->thaw_noirq(), | |
fd6fe826 GU |
809 | ->poweroff(), ->poweroff_noirq(), ->restore(), ->restore_noirq() in the |
810 | subsystem-level dev_pm_ops structure. | |
d690b2cd RW |
811 | |
812 | Device drivers that wish to use the same function as a system suspend, freeze, | |
62052ab1 RW |
813 | poweroff and runtime suspend callback, and similarly for system resume, thaw, |
814 | restore, and runtime resume, can achieve this with the help of the | |
d690b2cd RW |
815 | UNIVERSAL_DEV_PM_OPS macro defined in include/linux/pm.h (possibly setting its |
816 | last argument to NULL). | |
7490e442 AS |
817 | |
818 | 8. "No-Callback" Devices | |
151f4e2b | 819 | ======================== |
7490e442 AS |
820 | |
821 | Some "devices" are only logical sub-devices of their parent and cannot be | |
822 | power-managed on their own. (The prototype example is a USB interface. Entire | |
823 | USB devices can go into low-power mode or send wake-up requests, but neither is | |
824 | possible for individual interfaces.) The drivers for these devices have no | |
62052ab1 | 825 | need of runtime PM callbacks; if the callbacks did exist, ->runtime_suspend() |
7490e442 AS |
826 | and ->runtime_resume() would always return 0 without doing anything else and |
827 | ->runtime_idle() would always call pm_runtime_suspend(). | |
828 | ||
829 | Subsystems can tell the PM core about these devices by calling | |
830 | pm_runtime_no_callbacks(). This should be done after the device structure is | |
831 | initialized and before it is registered (although after device registration is | |
832 | also okay). The routine will set the device's power.no_callbacks flag and | |
62052ab1 | 833 | prevent the non-debugging runtime PM sysfs attributes from being created. |
7490e442 AS |
834 | |
835 | When power.no_callbacks is set, the PM core will not invoke the | |
836 | ->runtime_idle(), ->runtime_suspend(), or ->runtime_resume() callbacks. | |
837 | Instead it will assume that suspends and resumes always succeed and that idle | |
838 | devices should be suspended. | |
839 | ||
840 | As a consequence, the PM core will never directly inform the device's subsystem | |
62052ab1 | 841 | or driver about runtime power changes. Instead, the driver for the device's |
7490e442 AS |
842 | parent must take responsibility for telling the device's driver when the |
843 | parent's power state changes. | |
15bcb91d | 844 | |
4ec4f059 UH |
845 | Note that in some cases it may not be desirable for subsystems/drivers to call |
846 | pm_runtime_no_callbacks() for their devices. This could be because a subset of | |
847 | the runtime PM callbacks needs to be implemented, because a platform-dependent | |
848 | PM domain could get attached to the device, or because the device is power | |
849 | managed through a supplier device link. For these reasons, and to avoid boilerplate code | |
850 | in subsystems/drivers, the PM core allows runtime PM callbacks to be | |
851 | unassigned. More precisely, if a callback pointer is NULL, the PM core will act | |
852 | as though there was a callback and it returned 0. | |
853 | ||
15bcb91d | 854 | 9. Autosuspend, or automatically-delayed suspends |
151f4e2b | 855 | ================================================= |
15bcb91d AS |
856 | |
857 | Changing a device's power state isn't free; it requires both time and energy. | |
858 | A device should be put in a low-power state only when there's some reason to | |
859 | think it will remain in that state for a substantial time. A common heuristic | |
860 | says that a device which hasn't been used for a while is liable to remain | |
861 | unused; following this advice, drivers should not allow devices to be suspended | |
62052ab1 | 862 | at runtime until they have been inactive for some minimum period. Even when |
15bcb91d AS |
863 | the heuristic ends up being non-optimal, it will still prevent devices from |
864 | "bouncing" too rapidly between low-power and full-power states. | |
865 | ||
866 | The term "autosuspend" is an historical remnant. It doesn't mean that the | |
867 | device is automatically suspended (the subsystem or driver still has to call | |
62052ab1 | 868 | the appropriate PM routines); rather it means that runtime suspends will |
15bcb91d AS |
869 | automatically be delayed until the desired period of inactivity has elapsed. |
870 | ||
871 | Inactivity is determined based on the power.last_busy field. Drivers should | |
872 | call pm_runtime_mark_last_busy() to update this field after carrying out I/O, | |
b7d46644 SA |
873 | typically just before calling __pm_runtime_put_autosuspend(). The desired |
874 | length of the inactivity period is a matter of policy. Subsystems can set this | |
875 | length initially by calling pm_runtime_set_autosuspend_delay(), but after device | |
15bcb91d AS |
876 | registration the length should be controlled by user space, using the |
877 | /sys/devices/.../power/autosuspend_delay_ms attribute. | |
878 | ||
879 | In order to use autosuspend, subsystems or drivers must call | |
880 | pm_runtime_use_autosuspend() (preferably before registering the device), and | |
151f4e2b MCC |
881 | thereafter they should use the various `*_autosuspend()` helper functions |
882 | instead of the non-autosuspend counterparts:: | |
15bcb91d AS |
883 | |
884 | Instead of: pm_runtime_suspend use: pm_runtime_autosuspend; | |
885 | Instead of: pm_schedule_suspend use: pm_request_autosuspend; | |
b7d46644 | 886 | Instead of: pm_runtime_put use: __pm_runtime_put_autosuspend; |
15bcb91d AS |
887 | Instead of: pm_runtime_put_sync use: pm_runtime_put_sync_autosuspend. |
888 | ||
889 | Drivers may also continue to use the non-autosuspend helper functions; they | |
72ec2e17 JH |
890 | will behave normally, which means sometimes taking the autosuspend delay into |
891 | account (see pm_runtime_idle). | |
15bcb91d | 892 | |
886486b7 AS |
893 | Under some circumstances a driver or subsystem may want to prevent a device |
894 | from autosuspending immediately, even though the usage counter is zero and the | |
895 | autosuspend delay time has expired. If the ->runtime_suspend() callback | |
896 | returns -EAGAIN or -EBUSY, and if the next autosuspend delay expiration time is | |
897 | in the future (as it normally would be if the callback invoked | |
898 | pm_runtime_mark_last_busy()), the PM core will automatically reschedule the | |
899 | autosuspend. The ->runtime_suspend() callback can't do this rescheduling | |
900 | itself because no suspend requests of any kind are accepted while the device is | |
901 | suspending (i.e., while the callback is running). | |
902 | ||
15bcb91d AS |
903 | The implementation is well suited for asynchronous use in interrupt contexts. |
904 | However such use inevitably involves races, because the PM core can't | |
905 | synchronize ->runtime_suspend() callbacks with the arrival of I/O requests. | |
906 | This synchronization must be handled by the driver, using its private lock. | |
151f4e2b | 907 | Here is a schematic pseudo-code example:: |
15bcb91d AS |
908 | |
909 | foo_read_or_write(struct foo_priv *foo, void *data) | |
910 | { | |
911 | lock(&foo->private_lock); | |
912 | add_request_to_io_queue(foo, data); | |
913 | if (foo->num_pending_requests++ == 0) | |
914 | pm_runtime_get(&foo->dev); | |
915 | if (!foo->is_suspended) | |
916 | foo_process_next_request(foo); | |
917 | unlock(&foo->private_lock); | |
918 | } | |
919 | ||
920 | foo_io_completion(struct foo_priv *foo, void *req) | |
921 | { | |
922 | lock(&foo->private_lock); | |
923 | if (--foo->num_pending_requests == 0) { | |
924 | pm_runtime_mark_last_busy(&foo->dev); | |
b7d46644 | 925 | __pm_runtime_put_autosuspend(&foo->dev); |
15bcb91d AS |
926 | } else { |
927 | foo_process_next_request(foo); | |
928 | } | |
929 | unlock(&foo->private_lock); | |
930 | /* Send req result back to the user ... */ | |
931 | } | |
932 | ||
933 | int foo_runtime_suspend(struct device *dev) | |
934 | { | |
935 | struct foo_priv *foo = container_of(dev, ...); | |
936 | int ret = 0; | |
937 | ||
938 | lock(&foo->private_lock); | |
939 | if (foo->num_pending_requests > 0) { | |
940 | ret = -EBUSY; | |
941 | } else { | |
942 | /* ... suspend the device ... */ | |
943 | foo->is_suspended = 1; | |
944 | } | |
945 | unlock(&foo->private_lock); | |
946 | return ret; | |
947 | } | |
948 | ||
949 | int foo_runtime_resume(struct device *dev) | |
950 | { | |
951 | struct foo_priv *foo = container_of(dev, ...); | |
952 | ||
953 | lock(&foo->private_lock); | |
954 | /* ... resume the device ... */ | |
955 | foo->is_suspended = 0; | |
956 | pm_runtime_mark_last_busy(&foo->dev); | |
957 | if (foo->num_pending_requests > 0) | |
fe982450 | 958 | foo_process_next_request(foo); |
15bcb91d AS |
959 | unlock(&foo->private_lock); |
960 | return 0; | |
961 | } | |
962 | ||
963 | The important point is that after foo_io_completion() asks for an autosuspend, | |
964 | the foo_runtime_suspend() callback may race with foo_read_or_write(). | |
965 | Therefore foo_runtime_suspend() has to check whether there are any pending I/O | |
966 | requests (while holding the private lock) before allowing the suspend to | |
967 | proceed. | |
968 | ||
969 | In addition, the power.autosuspend_delay field can be changed by user space at | |
970 | any time. If a driver cares about this, it can call | |
971 | pm_runtime_autosuspend_expiration() from within the ->runtime_suspend() | |
972 | callback while holding its private lock. If the function returns a nonzero | |
973 | value then the delay has not yet expired and the callback should return | |
974 | -EAGAIN. |
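
A ->runtime_suspend() callback that performs both checks described in this
section (pending I/O, and an autosuspend delay that user space may have
extended) might be structured as follows. This is a userspace sketch under
stated assumptions: struct device is a stand-in whose fake_expiration field
exists only to drive the stub of pm_runtime_autosuspend_expiration(), and
foo_lock()/foo_unlock() stand in for the driver's private lock (a spinlock or
mutex in a real driver). All foo_* names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for struct device; fake_expiration feeds the stub below. */
struct device {
	unsigned long fake_expiration;
};

struct foo_priv {
	struct device dev;
	int locked;			/* stand-in for the private lock */
	int num_pending_requests;
	int is_suspended;
};

static void foo_lock(struct foo_priv *foo)   { foo->locked = 1; }
static void foo_unlock(struct foo_priv *foo) { foo->locked = 0; }

/* Stub: nonzero means the autosuspend delay has not expired yet. */
static unsigned long pm_runtime_autosuspend_expiration(struct device *dev)
{
	return dev->fake_expiration;
}

static int foo_runtime_suspend(struct device *dev)
{
	struct foo_priv *foo = container_of(dev, struct foo_priv, dev);
	int ret = 0;

	foo_lock(foo);
	if (foo->num_pending_requests > 0) {
		/* New I/O arrived after the autosuspend was requested. */
		ret = -EBUSY;
	} else if (pm_runtime_autosuspend_expiration(dev) != 0) {
		/* The delay was extended; let the PM core reschedule. */
		ret = -EAGAIN;
	} else {
		/* ... suspend the device ... */
		foo->is_suspended = 1;
	}
	foo_unlock(foo);
	return ret;
}
```
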