Commit | Line | Data |
---|---|---|
2a5b0c84 | 1 | .. _up_doc: |
1da177e4 | 2 | |
2a5b0c84 JC |
3 | RCU on Uniprocessor Systems |
4 | =========================== | |
1da177e4 LT |
5 | |
6 | A common misconception is that, on UP systems, the call_rcu() primitive | |
240ebbf8 | 7 | may immediately invoke its function. The basis of this misconception |
1da177e4 LT |
8 | is that since there is only one CPU, it should not be necessary to |
9 | wait for anything else to get done, since there are no other CPUs for | |
2a5b0c84 | 10 | anything else to be happening on. Although this approach will *sort of* |
1da177e4 | 11 | work a surprising amount of the time, it is a very bad idea in general. |
240ebbf8 PM |
12 | This document presents three examples that demonstrate exactly how bad |
13 | an idea this is. | |
1da177e4 | 14 | |
1da177e4 | 15 | Example 1: softirq Suicide |
2a5b0c84 | 16 | -------------------------- |
1da177e4 LT |
17 | |
18 | Suppose that an RCU-based algorithm scans a linked list containing | |
19 | elements A, B, and C in process context, and can delete elements from | |
20 | this same list in softirq context. Suppose that the process-context scan | |
21 | is referencing element B when it is interrupted by softirq processing, | |
22 | which deletes element B, and then invokes call_rcu() to free element B | |
23 | after a grace period. | |
24 | ||
25 | Now, if call_rcu() were to directly invoke its arguments, then upon return | |
26 | from softirq, the list scan would find itself referencing a newly freed | |
27 | element B. This situation can greatly decrease the life expectancy of | |
28 | your kernel. | |
29 | ||
dd81eca8 PM |
30 | This same problem can occur if call_rcu() is invoked from a hardware |
31 | interrupt handler. | |
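
The scenario above can be sketched as a toy userspace model. This is not kernel code: `toy_call_rcu()`, `toy_grace_period_end()`, the `invoke_now` flag, and the `freed` field (which stands in for actually freeing memory) are all hypothetical names invented for illustration, and the softirq is modeled as an ordinary function call made while the reader still holds its reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every toy_* name is hypothetical, not a kernel API. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

struct elem {
	struct toy_rcu_head rcu;	/* first member, so the cast below is valid */
	int value;
	bool freed;			/* stands in for kfree(); memory is never really freed */
};

static struct toy_rcu_head *pending;	/* callbacks waiting for a grace period */

static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head),
			 bool invoke_now)
{
	if (invoke_now) {		/* the tempting but broken UP shortcut */
		func(head);
		return;
	}
	head->func = func;		/* correct behavior: defer the callback */
	head->next = pending;
	pending = head;
}

/* Run the deferred callbacks once all pre-existing readers are done. */
static void toy_grace_period_end(void)
{
	while (pending) {
		struct toy_rcu_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

static void elem_reclaim(struct toy_rcu_head *head)
{
	struct elem *e = (struct elem *)head;

	e->freed = true;
}

/* Returns true if the reader found element B freed while still using it. */
bool reader_sees_freed_b(bool invoke_now)
{
	struct elem b = { .value = 2 };
	struct elem *reader_ref = &b;	/* process-context scan references B */
	bool broken;

	/* "softirq" interrupts the scan: B is removed and passed to call_rcu. */
	toy_call_rcu(&b.rcu, elem_reclaim, invoke_now);

	/* Return from "softirq": the scan resumes and looks at B again. */
	broken = reader_ref->freed;

	/* The scan is finished, so the grace period may now end. */
	toy_grace_period_end();
	assert(b.freed);
	return broken;
}
```

In this model, `reader_sees_freed_b(true)` reports that the scan resumed on an already reclaimed element, while `reader_sees_freed_b(false)` reports that element B stayed intact until the scan was done.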
32 | ||
1da177e4 | 33 | Example 2: Function-Call Fatality |
2a5b0c84 | 34 | --------------------------------- |
1da177e4 LT |
35 | |
36 | Of course, one could avert the suicide described in the preceding example | |
37 | by having call_rcu() directly invoke its arguments only if it was called | |
38 | from process context. However, this can fail in a similar manner. | |
39 | ||
40 | Suppose that an RCU-based algorithm again scans a linked list containing | |
41 | elements A, B, and C in process context, but that it invokes a function | |
42 | on each element as it is scanned. Suppose further that this function | |
43 | deletes element B from the list, then passes it to call_rcu() for deferred | |
44 | freeing. This may be a bit unconventional, but it is perfectly legal | |
45 | RCU usage, since call_rcu() must wait for a grace period to elapse. | |
46 | Therefore, in this case, allowing call_rcu() to immediately invoke | |
47 | its arguments would cause it to fail to make the fundamental guarantee | |
48 | underlying RCU, namely that call_rcu() defers invoking its arguments until | |
49 | all RCU read-side critical sections currently executing have completed. | |
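
A minimal userspace sketch of this second scenario follows. It is only a model under stated assumptions: `sim_call_rcu()`, `sim_run_deferred()`, `visit()`, and the `freed` flag (a stand-in for real freeing) are hypothetical names, and everything runs in what would be process context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: the sim_* names are hypothetical, not kernel APIs. */
struct node {
	struct node *next;
	char name;
	bool freed;		/* stands in for kfree() */
	struct node *cb_next;	/* link in the deferred-callback queue */
};

static struct node *deferred;	/* nodes awaiting the end of a grace period */
static bool immediate_mode;	/* true models the broken "invoke right away" */

static void node_reclaim(struct node *n)
{
	n->freed = true;
}

static void sim_call_rcu(struct node *n)
{
	if (immediate_mode) {
		node_reclaim(n);	/* no grace period at all */
		return;
	}
	n->cb_next = deferred;		/* defer until the scan has finished */
	deferred = n;
}

static void sim_run_deferred(void)
{
	while (deferred) {
		struct node *n = deferred;

		deferred = n->cb_next;
		node_reclaim(n);
	}
}

/* Function invoked on each element by the scan: it deletes element B. */
static void visit(struct node *prev, struct node *n)
{
	if (n->name == 'B') {
		prev->next = n->next;	/* unlink B, leaving B->next for readers */
		sim_call_rcu(n);
	}
}

/* Returns true if the scan had to step through an already reclaimed node. */
bool scan_touches_freed_node(bool immediate)
{
	struct node c = { .name = 'C' };
	struct node b = { .next = &c, .name = 'B' };
	struct node a = { .next = &b, .name = 'A' };
	struct node *prev = NULL;
	bool touched = false;

	immediate_mode = immediate;
	deferred = NULL;
	for (struct node *n = &a; n; prev = n, n = n->next) {
		visit(prev, n);
		if (n->freed)		/* the scan is about to read n->next */
			touched = true;
	}
	sim_run_deferred();		/* scan done: the grace period can end */
	assert(b.freed);
	return touched;
}
```

With `scan_touches_freed_node(true)` the scan advances through a reclaimed element B, whereas with `scan_touches_freed_node(false)` B is reclaimed only after the scan completes.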
50 | ||
2a5b0c84 JC |
51 | Quick Quiz #1: |
52 | Why is it *not* legal to invoke synchronize_rcu() in this case? | |
dd81eca8 | 53 | |
2a5b0c84 | 54 | :ref:`Answers to Quick Quiz <answer_quick_quiz_up>` |
dd81eca8 PM |
55 | |
56 | Example 3: Death by Deadlock | |
2a5b0c84 | 57 | ---------------------------- |
dd81eca8 PM |
58 | |
59 | Suppose that call_rcu() is invoked while holding a lock, and that the | |
60 | callback function must acquire this same lock. In this case, if | |
61 | call_rcu() were to directly invoke the callback, the result would | |
62 | be self-deadlock. | |
63 | ||
64 | In some cases, it would be possible to restructure the code so that | |
65 | the call_rcu() is delayed until after the lock is released. However, | |
66 | there are cases where this can be quite ugly: | |
67 | ||
68 | 1. If a number of items need to be passed to call_rcu() within | |
69 | the same critical section, then the code would need to create | |
70 | a list of them, then traverse the list once the lock was | |
71 | released. | |
72 | ||
73 | 2. In some cases, the lock will be held across some kernel API, | |
74 | so that delaying the call_rcu() until the lock is released | |
75 | requires that the data item be passed up via a common API. | |
76 | It is far better to guarantee that callbacks are invoked | |
77 | with no locks held than to have to modify such APIs to allow | |
78 | arbitrary data items to be passed back up through them. | |
79 | ||
80 | If call_rcu() directly invokes the callback, painful locking restrictions | |
81 | or API changes would be required. | |
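
The self-deadlock can be illustrated with a small userspace model. All names here (`model_lock`, `model_trylock()`, `model_call_rcu()`, and the one-slot `pending_cb` queue) are hypothetical; a failed trylock stands in for the point where a real non-recursive spinlock would spin forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: the model_* names are hypothetical, not kernel APIs. */
struct model_lock {
	bool held;
};

/* Returns false where a real non-recursive spinlock would spin forever. */
static bool model_trylock(struct model_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void model_unlock(struct model_lock *l)
{
	l->held = false;
}

static struct model_lock mylock;
static void (*pending_cb)(void);	/* one-slot deferred-callback queue */
static bool deadlock_detected;

/* Callback that must acquire the lock its caller may still be holding. */
static void cb_needs_lock(void)
{
	if (!model_trylock(&mylock)) {
		deadlock_detected = true;	/* a real kernel would hang here */
		return;
	}
	model_unlock(&mylock);
}

static void model_call_rcu(void (*func)(void), bool invoke_now)
{
	if (invoke_now)
		func();			/* runs with the caller's locks held */
	else
		pending_cb = func;	/* runs later, with no locks held */
}

/* Returns true if the callback tried to take a lock that was already held. */
bool callback_self_deadlocks(bool invoke_now)
{
	bool got;

	deadlock_detected = false;
	pending_cb = NULL;

	got = model_trylock(&mylock);
	assert(got);
	model_call_rcu(cb_needs_lock, invoke_now);	/* called under mylock */
	model_unlock(&mylock);

	/* Later, from a known environment in which no locks are held. */
	if (pending_cb) {
		pending_cb();
		pending_cb = NULL;
	}
	return deadlock_detected;
}
```

Here `callback_self_deadlocks(true)` reports the deadlock, and `callback_self_deadlocks(false)` does not, because the deferred callback runs only after the lock has been released.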
82 | ||
2a5b0c84 JC |
83 | Quick Quiz #2: |
84 | What locking restriction must RCU callbacks respect? | |
1da177e4 | 85 | |
2a5b0c84 | 86 | :ref:`Answers to Quick Quiz <answer_quick_quiz_up>` |
1da177e4 LT |
87 | |
88 | Summary | |
2a5b0c84 | 89 | ------- |
1da177e4 | 90 | |
240ebbf8 PM |
91 | Permitting call_rcu() to immediately invoke its arguments breaks RCU, |
92 | even on a UP system. So do not do it! Even on a UP system, the RCU | |
2a5b0c84 | 93 | infrastructure *must* respect grace periods, and *must* invoke callbacks |
240ebbf8 PM |
94 | from a known environment in which no locks are held. |
95 | ||
2a5b0c84 JC |
96 | Note that it *is* safe for synchronize_rcu() to return immediately on |
97 | UP systems, including PREEMPT SMP builds running on UP systems. | |
240ebbf8 | 98 | |
2a5b0c84 JC |
99 | Quick Quiz #3: |
100 | Why can't synchronize_rcu() return immediately on UP systems running | |
101 | preemptable RCU? | |
dd81eca8 | 102 | |
2a5b0c84 | 103 | .. _answer_quick_quiz_up: |
dd81eca8 PM |
104 | |
105 | Answer to Quick Quiz #1: | |
2a5b0c84 | 106 | Why is it *not* legal to invoke synchronize_rcu() in this case? |
dd81eca8 PM |
107 | |
108 | Because the calling function is scanning an RCU-protected linked | |
109 | list, and is therefore within an RCU read-side critical section. | |
110 | Therefore, the called function has been invoked within an RCU | |
111 | read-side critical section, and is not permitted to block. | |
112 | ||
113 | Answer to Quick Quiz #2: | |
114 | What locking restriction must RCU callbacks respect? | |
115 | ||
acb6258a JC |
116 | Any lock that is acquired within an RCU callback must be acquired |
117 | elsewhere using an _bh variant of the spinlock primitive. | |
118 | For example, if "mylock" is acquired by an RCU callback, then | |
119 | a process-context acquisition of this lock must use something | |
120 | like spin_lock_bh() to acquire the lock. Please note that | |
121 | it is also OK to use _irq variants of spinlocks, for example, | |
122 | spin_lock_irqsave(). | |
dd81eca8 PM |
123 | |
124 | If the process-context code were to simply use spin_lock(), | |
125 | then, since RCU callbacks can be invoked from softirq context, | |
126 | the callback might be called from a softirq that interrupted | |
127 | the process-context critical section. This would result in | |
128 | self-deadlock. | |
129 | ||
130 | This restriction might seem gratuitous, since very few RCU | |
131 | callbacks acquire locks directly. However, a great many RCU | |
2a5b0c84 | 132 | callbacks do acquire locks *indirectly*, for example, via |
dd81eca8 | 133 | the kfree() primitive. |
240ebbf8 PM |
134 | |
135 | Answer to Quick Quiz #3: | |
136 | Why can't synchronize_rcu() return immediately on UP systems | |
137 | running preemptable RCU? | |
138 | ||
139 | Because some other task might have been preempted in the middle | |
140 | of an RCU read-side critical section. If synchronize_rcu() | |
141 | simply immediately returned, it would prematurely signal the | |
142 | end of the grace period, which would come as a nasty shock to | |
143 | that other task when it started running again. |