Commit | Line | Data |
---|---|---|
0d24f65e AD |
1 | ====================================== |
2 | Sequence counters and sequential locks | |
3 | ====================================== | |
4 | ||
5 | Introduction | |
6 | ============ | |
7 | ||
8 | Sequence counters are a reader-writer consistency mechanism with | |
9 | lockless readers (read-only retry loops), and no writer starvation. They | |
10 | are used for data that's rarely written to (e.g. system time), where the | |
11 | reader wants a consistent set of information and is willing to retry if | |
12 | that information changes. | |
13 | ||
14 | A data set is consistent when the sequence count at the beginning of the | |
15 | read side critical section is even and the same sequence count value is | |
16 | read again at the end of the critical section. The data in the set must | |
17 | be copied out inside the read side critical section. If the sequence | |
18 | count has changed between the start and the end of the critical section, | |
19 | the reader must retry. | |
20 | ||
21 | Writers increment the sequence count at the start and the end of their | |
22 | critical section. After starting the critical section the sequence count | |
23 | is odd and indicates to the readers that an update is in progress. At | |
24 | the end of the write side critical section the sequence count becomes | |
25 | even again which lets readers make progress. | |
26 | ||
27 | A sequence counter write side critical section must never be preempted | |
28 | or interrupted by read side sections. Otherwise the reader will spin for | |
29 | the entire scheduler tick due to the odd sequence count value and the | |
30 | interrupted writer. If that reader belongs to a real-time scheduling | |
31 | class, it can spin forever and the kernel will livelock. | |
32 | ||
33 | This mechanism cannot be used if the protected data contains pointers, | |
34 | as the writer can invalidate a pointer that the reader is following. | |
35 | ||
36 | ||
37 | .. _seqcount_t: | |
38 | ||
39 | Sequence counters (``seqcount_t``) | |
40 | ================================== | |
41 | ||
42 | This is the the raw counting mechanism, which does not protect against | |
43 | multiple writers. Write side critical sections must thus be serialized | |
44 | by an external lock. | |
45 | ||
46 | If the write serialization primitive is not implicitly disabling | |
47 | preemption, preemption must be explicitly disabled before entering the | |
48 | write side section. If the read section can be invoked from hardirq or | |
49 | softirq contexts, interrupts or bottom halves must also be respectively | |
50 | disabled before entering the write section. | |
51 | ||
52 | If it's desired to automatically handle the sequence counter | |
53 | requirements of writer serialization and non-preemptibility, use | |
54 | :ref:`seqlock_t` instead. | |
55 | ||
56 | Initialization:: | |
57 | ||
58 | /* dynamic */ | |
59 | seqcount_t foo_seqcount; | |
60 | seqcount_init(&foo_seqcount); | |
61 | ||
62 | /* static */ | |
63 | static seqcount_t foo_seqcount = SEQCNT_ZERO(foo_seqcount); | |
64 | ||
65 | /* C99 struct init */ | |
66 | struct { | |
67 | .seq = SEQCNT_ZERO(foo.seq), | |
68 | } foo; | |
69 | ||
70 | Write path:: | |
71 | ||
72 | /* Serialized context with disabled preemption */ | |
73 | ||
74 | write_seqcount_begin(&foo_seqcount); | |
75 | ||
76 | /* ... [[write-side critical section]] ... */ | |
77 | ||
78 | write_seqcount_end(&foo_seqcount); | |
79 | ||
80 | Read path:: | |
81 | ||
82 | do { | |
83 | seq = read_seqcount_begin(&foo_seqcount); | |
84 | ||
85 | /* ... [[read-side critical section]] ... */ | |
86 | ||
87 | } while (read_seqcount_retry(&foo_seqcount, seq)); | |
88 | ||
89 | ||
55f3560d AD |
90 | .. _seqcount_locktype_t: |
91 | ||
cf486472 | 92 | Sequence counters with associated locks (``seqcount_LOCKNAME_t``) |
55f3560d AD |
93 | ----------------------------------------------------------------- |
94 | ||
95 | As discussed at :ref:`seqcount_t`, sequence count write side critical | |
96 | sections must be serialized and non-preemptible. This variant of | |
97 | sequence counters associate the lock used for writer serialization at | |
98 | initialization time, which enables lockdep to validate that the write | |
99 | side critical sections are properly serialized. | |
100 | ||
101 | This lock association is a NOOP if lockdep is disabled and has neither | |
102 | storage nor runtime overhead. If lockdep is enabled, the lock pointer is | |
103 | stored in struct seqcount and lockdep's "lock is held" assertions are | |
104 | injected at the beginning of the write side critical section to validate | |
105 | that it is properly protected. | |
106 | ||
107 | For lock types which do not implicitly disable preemption, preemption | |
108 | protection is enforced in the write side function. | |
109 | ||
110 | The following sequence counters with associated locks are defined: | |
111 | ||
112 | - ``seqcount_spinlock_t`` | |
113 | - ``seqcount_raw_spinlock_t`` | |
114 | - ``seqcount_rwlock_t`` | |
115 | - ``seqcount_mutex_t`` | |
116 | - ``seqcount_ww_mutex_t`` | |
117 | ||
cf486472 AD |
118 | The sequence counter read and write APIs can take either a plain |
119 | seqcount_t or any of the seqcount_LOCKNAME_t variants above. | |
55f3560d | 120 | |
cf486472 | 121 | Initialization (replace "LOCKNAME" with one of the supported locks):: |
55f3560d AD |
122 | |
123 | /* dynamic */ | |
cf486472 AD |
124 | seqcount_LOCKNAME_t foo_seqcount; |
125 | seqcount_LOCKNAME_init(&foo_seqcount, &lock); | |
55f3560d AD |
126 | |
127 | /* static */ | |
cf486472 AD |
128 | static seqcount_LOCKNAME_t foo_seqcount = |
129 | SEQCNT_LOCKNAME_ZERO(foo_seqcount, &lock); | |
55f3560d AD |
130 | |
131 | /* C99 struct init */ | |
132 | struct { | |
cf486472 | 133 | .seq = SEQCNT_LOCKNAME_ZERO(foo.seq, &lock), |
55f3560d AD |
134 | } foo; |
135 | ||
136 | Write path: same as in :ref:`seqcount_t`, while running from a context | |
cf486472 | 137 | with the associated write serialization lock acquired. |
55f3560d AD |
138 | |
139 | Read path: same as in :ref:`seqcount_t`. | |
140 | ||
80793c34 AD |
141 | |
142 | .. _seqcount_latch_t: | |
143 | ||
144 | Latch sequence counters (``seqcount_latch_t``) | |
145 | ---------------------------------------------- | |
146 | ||
147 | Latch sequence counters are a multiversion concurrency control mechanism | |
148 | where the embedded seqcount_t counter even/odd value is used to switch | |
149 | between two copies of protected data. This allows the sequence counter | |
150 | read path to safely interrupt its own write side critical section. | |
151 | ||
152 | Use seqcount_latch_t when the write side sections cannot be protected | |
153 | from interruption by readers. This is typically the case when the read | |
154 | side can be invoked from NMI handlers. | |
155 | ||
156 | Check `raw_write_seqcount_latch()` for more information. | |
157 | ||
158 | ||
0d24f65e AD |
159 | .. _seqlock_t: |
160 | ||
161 | Sequential locks (``seqlock_t``) | |
162 | ================================ | |
163 | ||
164 | This contains the :ref:`seqcount_t` mechanism earlier discussed, plus an | |
165 | embedded spinlock for writer serialization and non-preemptibility. | |
166 | ||
167 | If the read side section can be invoked from hardirq or softirq context, | |
168 | use the write side function variants which disable interrupts or bottom | |
169 | halves respectively. | |
170 | ||
171 | Initialization:: | |
172 | ||
173 | /* dynamic */ | |
174 | seqlock_t foo_seqlock; | |
175 | seqlock_init(&foo_seqlock); | |
176 | ||
177 | /* static */ | |
178 | static DEFINE_SEQLOCK(foo_seqlock); | |
179 | ||
180 | /* C99 struct init */ | |
181 | struct { | |
182 | .seql = __SEQLOCK_UNLOCKED(foo.seql) | |
183 | } foo; | |
184 | ||
185 | Write path:: | |
186 | ||
187 | write_seqlock(&foo_seqlock); | |
188 | ||
189 | /* ... [[write-side critical section]] ... */ | |
190 | ||
191 | write_sequnlock(&foo_seqlock); | |
192 | ||
193 | Read path, three categories: | |
194 | ||
195 | 1. Normal Sequence readers which never block a writer but they must | |
196 | retry if a writer is in progress by detecting change in the sequence | |
197 | number. Writers do not wait for a sequence reader:: | |
198 | ||
199 | do { | |
200 | seq = read_seqbegin(&foo_seqlock); | |
201 | ||
202 | /* ... [[read-side critical section]] ... */ | |
203 | ||
204 | } while (read_seqretry(&foo_seqlock, seq)); | |
205 | ||
206 | 2. Locking readers which will wait if a writer or another locking reader | |
207 | is in progress. A locking reader in progress will also block a writer | |
208 | from entering its critical section. This read lock is | |
209 | exclusive. Unlike rwlock_t, only one locking reader can acquire it:: | |
210 | ||
211 | read_seqlock_excl(&foo_seqlock); | |
212 | ||
213 | /* ... [[read-side critical section]] ... */ | |
214 | ||
215 | read_sequnlock_excl(&foo_seqlock); | |
216 | ||
217 | 3. Conditional lockless reader (as in 1), or locking reader (as in 2), | |
218 | according to a passed marker. This is used to avoid lockless readers | |
219 | starvation (too much retry loops) in case of a sharp spike in write | |
220 | activity. First, a lockless read is tried (even marker passed). If | |
221 | that trial fails (odd sequence counter is returned, which is used as | |
222 | the next iteration marker), the lockless read is transformed to a | |
223 | full locking read and no retry loop is necessary:: | |
224 | ||
225 | /* marker; even initialization */ | |
226 | int seq = 0; | |
227 | do { | |
228 | read_seqbegin_or_lock(&foo_seqlock, &seq); | |
229 | ||
230 | /* ... [[read-side critical section]] ... */ | |
231 | ||
232 | } while (need_seqretry(&foo_seqlock, seq)); | |
233 | done_seqretry(&foo_seqlock, seq); | |
234 | ||
235 | ||
236 | API documentation | |
237 | ================= | |
238 | ||
239 | .. kernel-doc:: include/linux/seqlock.h |