Commit | Line | Data |
---|---|---|
1da177e4 LT |
1 | Review Checklist for RCU Patches |
2 | ||
3 | ||
4 | This document contains a checklist for producing and reviewing patches | |
5 | that make use of RCU. Violating any of the rules listed below will | |
6 | result in the same sorts of problems that leaving out a locking primitive | |
7 | would cause. This list is based on experiences reviewing such patches | |
8 | over a rather long period of time, but improvements are always welcome! | |
9 | ||
10 | 0. Is RCU being applied to a read-mostly situation? If the data | |
4c54005c PM |
11 | structure is updated more than about 10% of the time, then you |
12 | should strongly consider some other approach, unless detailed | |
13 | performance measurements show that RCU is nonetheless the right | |
14 | tool for the job. Yes, RCU does reduce read-side overhead by | |
15 | increasing write-side overhead, which is exactly why normal uses | |
16 | of RCU will do much more reading than updating. | |
1da177e4 | 17 | |
32300751 PM |
18 | Another exception is where performance is not an issue, and RCU |
19 | provides a simpler implementation. An example of this situation | |
20 | is the dynamic NMI code in the Linux 2.6 kernel, at least on | |
21 | architectures where NMIs are rare. | |
22 | ||
23 | Yet another exception is where the low real-time latency of RCU's | |
24 | read-side primitives is critically important. | |
1da177e4 | 25 | |
4de5f89e PM |
26 | One final exception is where RCU readers are used to prevent |
27 | the ABA problem (https://en.wikipedia.org/wiki/ABA_problem) | |
28 | for lockless updates. This does result in the mildly | |
29 | counter-intuitive situation where rcu_read_lock() and | |
30 | rcu_read_unlock() are used to protect updates; however, this | |
31 | approach provides the same potential simplifications that garbage | |
32 | collectors do. | |
33 | ||
1da177e4 LT |
34 | 1. Does the update code have proper mutual exclusion? |
35 | ||
36 | RCU does allow -readers- to run (almost) naked, but -writers- must | |
37 | still use some sort of mutual exclusion, such as: | |
38 | ||
39 | a. locking, | |
40 | b. atomic operations, or | |
41 | c. restricting updates to a single task. | |
42 | ||
43 | If you choose #b, be prepared to describe how you have handled | |
44 | memory barriers on weakly ordered machines (pretty much all of | |
4c54005c PM |
45 | them -- even x86 allows later loads to be reordered to precede |
46 | earlier stores), and be prepared to explain why this added | |
47 | complexity is worthwhile. If you choose #c, be prepared to | |
48 | explain how this single task does not become a major bottleneck on | |
49 | large multiprocessor machines (for example, if the task is updating | |
50 | information relating to itself that other tasks can read, there | |
4de5f89e PM |
51 | by definition can be no bottleneck). Note that the definition |
52 | of "large" has changed significantly: Eight CPUs was "large" | |
53 | in the year 2000, but a hundred CPUs was unremarkable in 2017. | |
1da177e4 LT |
54 | |
55 | 2. Do the RCU read-side critical sections make proper use of | |
56 | rcu_read_lock() and friends? These primitives are needed | |
32300751 PM |
57 | to prevent grace periods from ending prematurely, which |
58 | could result in data being unceremoniously freed out from | |
59 | under your read-side code, which can greatly increase the | |
60 | actuarial risk of your kernel. | |
1da177e4 | 61 | |
dd81eca8 | 62 | As a rough rule of thumb, any dereference of an RCU-protected |
4c54005c PM |
63 | pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(), |
64 | rcu_read_lock_sched(), or by the appropriate update-side lock. | |
65 | Disabling of preemption can serve as rcu_read_lock_sched(), but | |
090c1685 | 66 | is less readable and prevents lockdep from detecting locking issues. |
dd81eca8 | 67 | |
4de5f89e PM |
68 | Letting RCU-protected pointers "leak" out of an RCU read-side |
69 | critical section is every bit as bad as letting them leak out | |
70 | from under a lock. Unless, of course, you have arranged some | |
71 | other means of protection, such as a lock or a reference count | |
72 | -before- letting them out of the RCU read-side critical section. | |
73 | ||
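The "other means of protection" mentioned in item 2 can be sketched in userspace C11. This is only an illustration, not kernel code: toy_rcu_read_lock(), toy_rcu_read_unlock(), struct obj, obj_tryget(), obj_put(), and obj_lookup_get() are all hypothetical names, and the toy lock/unlock pair merely tracks nesting depth for a single-threaded demonstration. The point is the ordering: the reference is taken -before- the pointer leaves the read-side critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel primitives. */
static int toy_read_depth;              /* nesting depth, single-threaded demo only */
static void toy_rcu_read_lock(void)   { toy_read_depth++; }
static void toy_rcu_read_unlock(void) { assert(toy_read_depth > 0); toy_read_depth--; }

struct obj {
	atomic_int refcnt;              /* 0 means the object is being torn down */
	int value;
};

static _Atomic(struct obj *) gp;        /* toy RCU-protected pointer */

/* Take a reference only if the count has not already dropped to zero. */
static bool obj_tryget(struct obj *p)
{
	int old = atomic_load(&p->refcnt);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&p->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

static void obj_put(struct obj *p)
{
	atomic_fetch_sub(&p->refcnt, 1);
}

/* Return the object with a reference held, or NULL if none is available. */
static struct obj *obj_lookup_get(void)
{
	struct obj *p;

	toy_rcu_read_lock();
	p = atomic_load_explicit(&gp, memory_order_acquire);
	if (p && !obj_tryget(p))
		p = NULL;               /* lost the race with teardown */
	toy_rcu_read_unlock();
	return p;                       /* the reference, not RCU, now protects p */
}
```

In kernel code, the analogous pattern would typically pair rcu_dereference() with a "get unless zero" style reference-count operation, but the details depend on the data structure in question.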
1da177e4 LT |
74 | 3. Does the update code tolerate concurrent accesses? |
75 | ||
76 | The whole point of RCU is to permit readers to run without | |
77 | any locks or atomic operations. This means that readers will | |
78 | be running while updates are in progress. There are a number | |
79 | of ways to handle this concurrency, depending on the situation: | |
80 | ||
32300751 | 81 | a. Use the RCU variants of the list and hlist update |
4c54005c PM |
82 | primitives to add, remove, and replace elements on |
83 | an RCU-protected list. Alternatively, use the other | |
84 | RCU-protected data structures that have been added to | |
85 | the Linux kernel. | |
32300751 PM |
86 | |
87 | This is almost always the best approach. | |
88 | ||
89 | b. Proceed as in (a) above, but also maintain per-element | |
90 | locks (that are acquired by both readers and writers) | |
91 | that guard per-element state. Of course, fields that | |
4c54005c PM |
92 | the readers refrain from accessing can be guarded by |
93 | some other lock acquired only by updaters, if desired. | |
32300751 PM |
94 | |
95 | This works quite well, also. | |
96 | ||
4de5f89e | 97 | c. Make updates appear atomic to readers. For example, |
4c54005c PM |
98 | pointer updates to properly aligned fields will |
99 | appear atomic, as will individual atomic primitives. | |
4de5f89e | 100 | Sequences of operations performed under a lock will -not- |
4c54005c PM |
101 | appear to be atomic to RCU readers, nor will sequences |
102 | of multiple atomic primitives. | |
1da177e4 | 103 | |
32300751 | 104 | This can work, but is starting to get a bit tricky. |
1da177e4 | 105 | |
32300751 | 106 | d. Carefully order the updates and the reads so that |
1da177e4 LT |
107 | readers see valid data at all phases of the update. |
108 | This is often more difficult than it sounds, especially | |
109 | given modern CPUs' tendency to reorder memory references. | |
110 | One must usually liberally sprinkle memory barriers | |
111 | (smp_wmb(), smp_rmb(), smp_mb()) through the code, | |
112 | making it difficult to understand and to test. | |
113 | ||
114 | It is usually better to group the changing data into | |
115 | a separate structure, so that the change may be made | |
116 | to appear atomic by updating a pointer to reference | |
117 | a new structure containing updated values. | |
118 | ||
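The closing advice of item 3d (group the changing data into a separate structure and switch a pointer) can be sketched in userspace C11 as follows. Everything here is an assumption made for illustration: struct config, cur_config, config_span(), and config_update() are hypothetical, the C11 acquire load and release-ordered exchange merely stand in for rcu_dereference() and rcu_assign_pointer(), and the updater is assumed to be serialized by some lock or single task as item 1 requires.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct config {                         /* fields that must change together */
	int low;
	int high;
};

static _Atomic(struct config *) cur_config;

/* Reader: a single pointer load yields a mutually consistent low/high pair. */
static int config_span(void)
{
	struct config *c = atomic_load_explicit(&cur_config, memory_order_acquire);

	return c ? c->high - c->low : 0;
}

/*
 * Updater: build a complete new structure, then switch the pointer.
 * Returns 0 on success and -1 on allocation failure.  On success, *oldp
 * is the previous structure, which the caller must not free until all
 * pre-existing readers are done with it.
 */
static int config_update(int low, int high, struct config **oldp)
{
	struct config *newc = malloc(sizeof(*newc));

	assert(oldp != NULL);
	if (!newc)
		return -1;
	newc->low = low;
	newc->high = high;
	/* The release ordering makes the initialization visible before the pointer. */
	*oldp = atomic_exchange_explicit(&cur_config, newc, memory_order_acq_rel);
	return 0;
}
```

Readers never observe a half-updated pair because they see either the old structure or the new one, never a mixture. In the kernel, the old structure would be handed to something like synchronize_rcu() followed by kfree(), or to kfree_rcu().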
119 | 4. Weakly ordered CPUs pose special challenges. Almost all CPUs | |
4c54005c PM |
120 | are weakly ordered -- even x86 CPUs allow later loads to be |
121 | reordered to precede earlier stores. RCU code must take all of | |
122 | the following measures to prevent memory-corruption problems: | |
1da177e4 LT |
123 | |
124 | a. Readers must maintain proper ordering of their memory | |
125 | accesses. The rcu_dereference() primitive ensures that | |
126 | the CPU picks up the pointer before it picks up the data | |
127 | that the pointer points to. This really is necessary | |
128 | on Alpha CPUs. If you don't believe me, see: | |
129 | ||
130 | http://www.openvms.compaq.com/wizard/wiz_2637.html | |
131 | ||
132 | The rcu_dereference() primitive is also an excellent | |
b4c5bf35 PM |
133 | documentation aid, letting the person reading the |
134 | code know exactly which pointers are protected by RCU. | |
4c54005c PM |
135 | Please note that compilers can also reorder code, and |
136 | they are becoming increasingly aggressive about doing | |
b4c5bf35 PM |
137 | just that. The rcu_dereference() primitive therefore also |
138 | prevents destructive compiler optimizations. However, | |
139 | with a bit of devious creativity, it is possible to | |
140 | mishandle the return value from rcu_dereference(). | |
141 | Please see rcu_dereference.txt in this directory for | |
142 | more information. | |
4c54005c PM |
143 | |
144 | The rcu_dereference() primitive is used by the | |
145 | various "_rcu()" list-traversal primitives, such | |
146 | as the list_for_each_entry_rcu(). Note that it is | |
147 | perfectly legal (if redundant) for update-side code to | |
148 | use rcu_dereference() and the "_rcu()" list-traversal | |
149 | primitives. This is particularly useful in code that | |
c598a070 PM |
150 | is common to readers and updaters. However, lockdep |
151 | will complain if you invoke rcu_dereference() outside | |
152 | of an RCU read-side critical section. See lockdep.txt | |
153 | to learn what to do about this. | |
154 | ||
155 | Of course, neither rcu_dereference() nor the "_rcu()" | |
156 | list-traversal primitives can substitute for a good | |
157 | concurrency design coordinating among multiple updaters. | |
1da177e4 | 158 | |
a83f1fe2 PM |
159 | b. If the list macros are being used, the list_add_tail_rcu() |
160 | and list_add_rcu() primitives must be used in order | |
161 | to prevent weakly ordered machines from misordering | |
162 | structure initialization and pointer planting. | |
1da177e4 | 163 | Similarly, if the hlist macros are being used, the |
a83f1fe2 | 164 | hlist_add_head_rcu() primitive is required. |
1da177e4 | 165 | |
a83f1fe2 PM |
166 | c. If the list macros are being used, the list_del_rcu() |
167 | primitive must be used to keep list_del()'s pointer | |
168 | poisoning from inflicting toxic effects on concurrent | |
169 | readers. Similarly, if the hlist macros are being used, | |
170 | the hlist_del_rcu() primitive is required. | |
171 | ||
4c54005c PM |
172 | The list_replace_rcu() and hlist_replace_rcu() primitives |
173 | may be used to replace an old structure with a new one | |
174 | in their respective types of RCU-protected lists. | |
175 | ||
176 | d. Rules similar to (4b) and (4c) apply to the "hlist_nulls" | |
177 | type of RCU-protected linked lists. | |
a83f1fe2 | 178 | |
4c54005c | 179 | e. Updates must ensure that initialization of a given |
1da177e4 LT |
180 | structure happens before pointers to that structure are |
181 | publicized. Use the rcu_assign_pointer() primitive | |
182 | when publicizing a pointer to a structure that can | |
183 | be traversed by an RCU read-side critical section. | |
184 | ||
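The ordering rules in item 4 (initialize before publishing, and fetch the pointer before the data it references) can be sketched with a userspace singly linked list. This is a rough analogy under stated assumptions, not the kernel implementation: struct node, list_head, toy_list_add_head(), and toy_list_contains() are hypothetical, updaters are assumed to be serialized externally, and C11 release stores and acquire loads are used as conservative stand-ins for rcu_assign_pointer() and rcu_dereference().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	int key;
	_Atomic(struct node *) next;
};

static _Atomic(struct node *) list_head;

/* Updater (serialized externally): fully initialize, then publish. */
static void toy_list_add_head(struct node *n, int key)
{
	assert(n != NULL);
	n->key = key;
	atomic_store_explicit(&n->next,
			      atomic_load_explicit(&list_head, memory_order_relaxed),
			      memory_order_relaxed);
	/* Release store: the initialization above is ordered before publication. */
	atomic_store_explicit(&list_head, n, memory_order_release);
}

/* Reader: each pointer is loaded before the node it points to is examined. */
static int toy_list_contains(int key)
{
	struct node *n;

	for (n = atomic_load_explicit(&list_head, memory_order_acquire); n;
	     n = atomic_load_explicit(&n->next, memory_order_acquire))
		if (n->key == key)
			return 1;
	return 0;
}
```

If the final store in toy_list_add_head() were a plain assignment, a weakly ordered CPU or an aggressive compiler could make the new node reachable before its key and next fields were initialized, which is the misordering that list_add_rcu() and friends exist to prevent.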
4fea6ef0 PM |
185 | 5. If call_rcu() or call_srcu() is used, the callback function will |
186 | be called from softirq context. In particular, it cannot block. | |
1da177e4 | 187 | |
4fea6ef0 PM |
188 | 6. Since synchronize_rcu() can block, it cannot be called |
189 | from any sort of irq context. The same rule applies | |
190 | for synchronize_srcu(), synchronize_rcu_expedited(), and | |
191 | synchronize_srcu_expedited(). | |
4c54005c PM |
192 | |
193 | The expedited forms of these primitives have the same semantics | |
4de5f89e PM |
194 | as the non-expedited forms, but expediting is both expensive and |
195 | (with the exception of synchronize_srcu_expedited()) unfriendly | |
196 | to real-time workloads. Use of the expedited primitives should | |
197 | be restricted to rare configuration-change operations that would | |
198 | not normally be undertaken while a real-time workload is running. | |
199 | However, real-time workloads can use the rcupdate.rcu_normal kernel | |
200 | boot parameter to completely disable expedited grace periods, | |
201 | though this might have performance implications. | |
4c54005c | 202 | |
236fefaf PM |
203 | In particular, if you find yourself invoking one of the expedited |
204 | primitives repeatedly in a loop, please do everyone a favor: | |
205 | Restructure your code so that it batches the updates, allowing | |
206 | a single non-expedited primitive to cover the entire batch. | |
207 | This will very likely be faster than the loop containing the | |
208 | expedited primitive, and will be much easier on the rest | |
209 | of the system, especially on any real-time workloads running | |
210 | there. | |
211 | ||
4fea6ef0 PM |
212 | 7. As of v4.20, a given kernel implements only one RCU flavor, |
213 | which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y. | |
214 | If the updater uses call_rcu() or synchronize_rcu(), | |
215 | then the corresponding readers may use rcu_read_lock() and | |
216 | rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(), | |
217 | or any pair of primitives that disables and re-enables preemption, | |
218 | for example, rcu_read_lock_sched() and rcu_read_unlock_sched(). | |
219 | If the updater uses synchronize_srcu() or call_srcu(), | |
220 | then the corresponding readers must use srcu_read_lock() and | |
74d874e7 PM |
221 | srcu_read_unlock(), and with the same srcu_struct. The rules for |
222 | the expedited primitives are the same as for their non-expedited | |
223 | counterparts. Mixing things up will result in confusion and | |
4fea6ef0 PM |
224 | broken kernels, and has even resulted in an exploitable security |
225 | issue. | |
1da177e4 LT |
226 | |
227 | One exception to this rule: rcu_read_lock() and rcu_read_unlock() | |
228 | may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh() | |
229 | in cases where local bottom halves are already known to be | |
230 | disabled, for example, in irq or softirq context. Commenting | |
231 | such cases is a must, of course! And the jury is still out on | |
232 | whether the increased speed is worth it. | |
233 | ||
32300751 | 234 | 8. Although synchronize_rcu() is slower than is call_rcu(), it |
3f944adb PM |
235 | usually results in simpler code. So, unless update performance is |
236 | critically important, the updaters cannot block, or the latency of | |
237 | synchronize_rcu() is visible from userspace, synchronize_rcu() | |
238 | should be used in preference to call_rcu(). Furthermore, | |
239 | kfree_rcu() usually results in even simpler code than does | |
240 | synchronize_rcu() without synchronize_rcu()'s multi-millisecond | |
241 | latency. So please take advantage of kfree_rcu()'s "fire and | |
242 | forget" memory-freeing capabilities where it applies. | |
165d6c78 PM |
243 | |
244 | An especially important property of the synchronize_rcu() | |
245 | primitive is that it automatically self-limits: if grace periods | |
246 | are delayed for whatever reason, then the synchronize_rcu() | |
247 | primitive will correspondingly delay updates. In contrast, | |
248 | code using call_rcu() should explicitly limit update rate in | |
249 | cases where grace periods are delayed, as failing to do so can | |
250 | result in excessive realtime latencies or even OOM conditions. | |
251 | ||
252 | Ways of gaining this self-limiting property when using call_rcu() | |
253 | include: | |
254 | ||
255 | a. Keeping a count of the number of data-structure elements | |
5cc6517a PM |
256 | used by the RCU-protected data structure, including |
257 | those waiting for a grace period to elapse. Enforce a | |
258 | limit on this number, stalling updates as needed to allow | |
259 | previously deferred frees to complete. Alternatively, | |
260 | limit only the number awaiting deferred free rather than | |
261 | the total number of elements. | |
262 | ||
263 | One way to stall the updates is to acquire the update-side | |
264 | mutex. (Don't try this with a spinlock -- other CPUs | |
265 | spinning on the lock could prevent the grace period | |
266 | from ever ending.) Another way to stall the updates | |
267 | is for the updates to use a wrapper function around | |
268 | the memory allocator, so that this wrapper function | |
269 | simulates OOM when there is too much memory awaiting an | |
270 | RCU grace period. There are of course many other | |
271 | variations on this theme. | |
165d6c78 PM |
272 | |
273 | b. Limiting update rate. For example, if updates occur only | |
6e676696 PM |
274 | once per hour, then no explicit rate limiting is |
275 | required, unless your system is already badly broken. | |
276 | Older versions of the dcache subsystem take this approach, | |
277 | guarding updates with a global lock, limiting their rate. | |
165d6c78 PM |
278 | |
279 | c. Trusted update -- if updates can only be done manually by | |
280 | superuser or some other trusted user, then it might not | |
281 | be necessary to automatically limit them. The theory | |
282 | here is that superuser already has lots of ways to crash | |
283 | the machine. | |
284 | ||
bc2072c9 | 285 | d. Periodically invoke synchronize_rcu(), permitting a limited |
165d6c78 | 286 | number of updates per grace period. |
1da177e4 | 287 | |
4fea6ef0 | 288 | The same cautions apply to call_srcu() and kfree_rcu(). |
4c54005c | 289 | |
6e676696 PM |
290 | Note that although these primitives do take action to avoid memory |
291 | exhaustion when any given CPU has too many callbacks, a determined | |
292 | user could still exhaust memory. This is especially the case | |
293 | if a system with a large number of CPUs has been configured to | |
294 | offload all of its RCU callbacks onto a single CPU, or if the | |
295 | system has relatively little free memory. | |
296 | ||
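Approach (a) from item 8, limiting the number of elements awaiting a deferred free, might look roughly like the following userspace sketch. All names (MAX_PENDING, struct deferred, defer_free(), grace_period_elapsed()) and the limit of 4 are hypothetical, the "grace period" is simulated by an explicit call, and the code is single-threaded for illustration; real code would maintain the count under the update-side lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 4                   /* hypothetical limit on deferred frees */

struct deferred {
	struct deferred *next;
};

static struct deferred *pending_list;   /* elements awaiting a grace period */
static size_t pending_count;

/*
 * Queue an element for deferred free.  Returns false when the limit has
 * been reached, in which case the updater should stall (or fail the
 * update) until some previously deferred frees have completed.
 */
static bool defer_free(struct deferred *d)
{
	assert(d != NULL);
	if (pending_count >= MAX_PENDING)
		return false;
	d->next = pending_list;
	pending_list = d;
	pending_count++;
	return true;
}

/*
 * Simulated end of a grace period: unlink everything that was waiting.
 * Real code would free each element here.  Returns the number released.
 */
static size_t grace_period_elapsed(void)
{
	size_t n = 0;

	while (pending_list) {
		pending_list = pending_list->next;
		pending_count--;
		n++;
	}
	return n;
}
```

The false return from defer_free() is where an updater would block on the update-side mutex or where an allocator wrapper would simulate OOM, as described in item 8a.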
1da177e4 | 297 | 9. All RCU list-traversal primitives, which include |
bb08f76d PM |
298 | rcu_dereference(), list_for_each_entry_rcu(), and |
299 | list_for_each_safe_rcu(), must be either within an RCU read-side | |
300 | critical section or must be protected by appropriate update-side | |
301 | locks. RCU read-side critical sections are delimited by | |
302 | rcu_read_lock() and rcu_read_unlock(), or by similar primitives | |
303 | such as rcu_read_lock_bh() and rcu_read_unlock_bh(), in which | |
304 | case the matching rcu_dereference() primitive must be used in | |
305 | order to keep lockdep happy, in this case, rcu_dereference_bh(). | |
1da177e4 | 306 | |
32300751 PM |
307 | The reason that it is permissible to use RCU list-traversal |
308 | primitives when the update-side lock is held is that doing so | |
309 | can be quite helpful in reducing code bloat when common code is | |
50aec002 PM |
310 | shared between readers and updaters. Additional primitives |
311 | are provided for this case, as discussed in lockdep.txt. | |
1da177e4 LT |
312 | |
313 | 10. Conversely, if you are in an RCU read-side critical section, | |
32300751 PM |
314 | and you don't hold the appropriate update-side lock, you -must- |
315 | use the "_rcu()" variants of the list macros. Failing to do so | |
4c54005c PM |
316 | will break Alpha, cause aggressive compilers to generate bad code, |
317 | and confuse people trying to read your code. | |
a83f1fe2 | 318 | |
e060a03a | 319 | 11. Any lock acquired by an RCU callback must be acquired elsewhere |
240ebbf8 | 320 | with softirq disabled, e.g., via spin_lock_irqsave(), |
884b429a | 321 | spin_lock_bh(), etc. Failing to disable softirq on a given |
4c54005c PM |
322 | acquisition of that lock will result in deadlock as soon as |
323 | the RCU softirq handler happens to run your RCU callback while | |
324 | interrupting that acquisition's critical section. | |
621934ee | 325 | |
e060a03a | 326 | 12. RCU callbacks can be and are executed in parallel. In many cases, |
ef48bd24 PM |
327 | the callback code is simply a wrapper around kfree(), so that this |
328 | is not an issue (or, more accurately, to the extent that it is | |
329 | an issue, the memory-allocator locking handles it). However, | |
330 | if the callbacks do manipulate a shared data structure, they | |
331 | must use whatever locking or other synchronization is required | |
332 | to safely access and/or modify that data structure. | |
333 | ||
884b429a PM |
334 | Do not assume that RCU callbacks will be executed on the same |
335 | CPU that executed the corresponding call_rcu() or call_srcu(). | |
336 | For example, if a given CPU goes offline while having an RCU | |
337 | callback pending, then that RCU callback will execute on some | |
338 | surviving CPU. (If this were not the case, a self-spawning RCU | |
339 | callback would prevent the victim CPU from ever going offline.) | |
340 | Furthermore, CPUs designated by rcu_nocbs= might well -always- | |
341 | have their RCU callbacks executed on some other CPUs; in fact, | |
342 | for some real-time workloads, this is the whole point of using | |
343 | the rcu_nocbs= kernel boot parameter. | |
32300751 | 344 | |
e060a03a | 345 | 13. Unlike other forms of RCU, it -is- permissible to block in an |
4de5f89e PM |
346 | SRCU read-side critical section (demarked by srcu_read_lock() |
347 | and srcu_read_unlock()), hence the "SRCU": "sleepable RCU". | |
348 | Please note that if you don't need to sleep in read-side critical | |
349 | sections, you should be using RCU rather than SRCU, because RCU | |
350 | is almost always faster and easier to use than is SRCU. | |
351 | ||
352 | Also unlike other forms of RCU, explicit initialization and | |
353 | cleanup are required either at build time via DEFINE_SRCU() | |
354 | or DEFINE_STATIC_SRCU() or at runtime via init_srcu_struct() | |
355 | and cleanup_srcu_struct(). These last two are passed a | |
356 | "struct srcu_struct" that defines the scope of a given | |
357 | SRCU domain. Once initialized, the srcu_struct is passed | |
358 | to srcu_read_lock(), srcu_read_unlock(), synchronize_srcu(), | |
359 | synchronize_srcu_expedited(), and call_srcu(). A given | |
360 | synchronize_srcu() waits only for SRCU read-side critical | |
4c54005c PM |
361 | sections governed by srcu_read_lock() and srcu_read_unlock() |
362 | calls that have been passed the same srcu_struct. This property | |
363 | is what makes sleeping read-side critical sections tolerable -- | |
364 | a given subsystem delays only its own updates, not those of other | |
365 | subsystems using SRCU. Therefore, SRCU is less prone to OOM the | |
366 | system than RCU would be if RCU's read-side critical sections | |
367 | were permitted to sleep. | |
621934ee PM |
368 | |
369 | The ability to sleep in read-side critical sections does not | |
370 | come for free. First, corresponding srcu_read_lock() and | |
371 | srcu_read_unlock() calls must be passed the same srcu_struct. | |
372 | Second, grace-period-detection overhead is amortized only | |
373 | over those updates sharing a given srcu_struct, rather than | |
374 | being globally amortized as they are for other forms of RCU. | |
375 | Therefore, SRCU should be used in preference to rw_semaphore | |
376 | only in extremely read-intensive situations, or in situations | |
377 | requiring SRCU's read-side deadlock immunity or low read-side | |
4de5f89e PM |
378 | realtime latency. You should also consider percpu_rw_semaphore |
379 | when you need lightweight readers. | |
621934ee | 380 | |
4de5f89e PM |
381 | SRCU's expedited primitive (synchronize_srcu_expedited()) |
382 | never sends IPIs to other CPUs, so it is easier on | |
4fea6ef0 | 383 | real-time workloads than is synchronize_rcu_expedited(). |
4de5f89e | 384 | |
884b429a PM |
385 | Note that rcu_assign_pointer() relates to SRCU just as it does to |
386 | other forms of RCU, but instead of rcu_dereference() you should | |
387 | use srcu_dereference() in order to avoid lockdep splats. | |
0612ea00 | 388 | |
e060a03a | 389 | 14. The whole point of call_rcu(), synchronize_rcu(), and friends |
0612ea00 PM |
390 | is to wait until all pre-existing readers have finished before |
391 | carrying out some otherwise-destructive operation. It is | |
392 | therefore critically important to -first- remove any path | |
393 | that readers can follow that could be affected by the | |
394 | destructive operation, and -only- -then- invoke call_rcu(), | |
395 | synchronize_rcu(), or friends. | |
396 | ||
4c54005c PM |
397 | Because these primitives only wait for pre-existing readers, it |
398 | is the caller's responsibility to guarantee that any subsequent | |
399 | readers will execute safely. | |
240ebbf8 | 400 | |
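The ordering required by item 14 (remove the path first, wait second, destroy last) can be sketched in userspace C11. This is a toy under loud assumptions: struct item, shared, toy_read_begin(), toy_read_end(), toy_synchronize(), and retire_item() are hypothetical, and the "grace period" is a crude spin on a global reader count, which is far more expensive and far less scalable than anything in the kernel. It is meant only to show the sequence of steps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct item {
	int data;
};

static _Atomic(struct item *) shared;   /* the path that readers follow */
static atomic_int active_readers;       /* toy reader tracking */

/* Toy read-side entry: announce the reader, then fetch the pointer. */
static struct item *toy_read_begin(void)
{
	atomic_fetch_add(&active_readers, 1);
	return atomic_load(&shared);
}

static void toy_read_end(void)
{
	int prev = atomic_fetch_sub(&active_readers, 1);

	assert(prev > 0);
	(void)prev;
}

/* Toy grace period: spin until no reader is active. */
static void toy_synchronize(void)
{
	while (atomic_load(&active_readers) != 0)
		;
}

/* Correct order: unpublish, wait, and only then destroy. */
static void retire_item(void)
{
	struct item *old = atomic_exchange(&shared, NULL); /* 1. remove the path */

	toy_synchronize();                                 /* 2. wait for readers */
	free(old);                                         /* 3. now destroy */
}
```

Swapping steps 1 and 2 would allow a reader that starts after the wait to pick up the pointer just before it is freed. Readers that start after step 1 see NULL and must be prepared to handle it, which is the caller's responsibility noted in item 14.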
e060a03a | 401 | 15. The various RCU read-side primitives do -not- necessarily contain |
4c54005c PM |
402 | memory barriers. You should therefore plan for the CPU |
403 | and the compiler to freely reorder code into and out of RCU | |
404 | read-side critical sections. It is the responsibility of the | |
405 | RCU update-side primitives to deal with this. | |
84483ea4 | 406 | |
884b429a PM |
407 | For SRCU readers, you can use smp_mb__after_srcu_read_unlock() |
408 | immediately after an srcu_read_unlock() to get a full barrier. | |
409 | ||
e060a03a | 410 | 16. Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the |
41a2901e PM |
411 | __rcu sparse checks to validate your RCU code. These can help |
412 | find problems as follows: | |
84483ea4 | 413 | |
41a2901e | 414 | CONFIG_PROVE_LOCKING: check that accesses to RCU-protected data |
84483ea4 PM |
415 | structures are carried out under the proper RCU |
416 | read-side critical section, while holding the right | |
417 | combination of locks, or whatever other conditions | |
418 | are appropriate. | |
419 | ||
420 | CONFIG_DEBUG_OBJECTS_RCU_HEAD: check that you don't pass the | |
421 | same object to call_rcu() (or friends) before an RCU | |
422 | grace period has elapsed since the last time that you | |
423 | passed that same object to call_rcu() (or friends). | |
424 | ||
425 | __rcu sparse checks: tag the pointer to the RCU-protected data | |
426 | structure with __rcu, and sparse will warn you if you | |
427 | access that pointer without the services of one of the | |
428 | variants of rcu_dereference(). | |
429 | ||
430 | These debugging aids can help you find problems that are | |
431 | otherwise extremely difficult to spot. | |
4de5f89e | 432 | |
884b429a PM |
433 | 17. If you register a callback using call_rcu() or call_srcu(), and |
434 | pass in a function defined within a loadable module, then it is | |
435 | necessary to wait for all pending callbacks to be invoked after | |
436 | the last invocation and before unloading that module. Note that | |
437 | it is absolutely -not- sufficient to wait for a grace period! | |
438 | The current (say) synchronize_rcu() implementation is -not- | |
4fea6ef0 | 439 | guaranteed to wait for callbacks registered on other CPUs. |
884b429a PM |
440 | Or even on the current CPU if that CPU recently went offline |
441 | and came back online. | |
4de5f89e PM |
442 | |
443 | You instead need to use one of the barrier functions: | |
444 | ||
445 | o call_rcu() -> rcu_barrier() | |
4de5f89e PM |
446 | o call_srcu() -> srcu_barrier() |
447 | ||
448 | However, these barrier functions are absolutely -not- guaranteed | |
449 | to wait for a grace period. In fact, if there are no call_rcu() | |
450 | callbacks waiting anywhere in the system, rcu_barrier() is within | |
451 | its rights to return immediately. | |
452 | ||
453 | So if you need to wait for both an RCU grace period and for | |
454 | all pre-existing call_rcu() callbacks, you will need to execute | |
455 | both rcu_barrier() and synchronize_rcu(), if necessary, using | |
456 | something like workqueues to execute them concurrently. | |
457 | ||
458 | See rcubarrier.txt for more information. |