Commit | Line | Data |
---|---|---|
f0ba4377 MCC |
1 | ============ |
2 | dm-integrity | |
3 | ============ | |
4 | ||
7eada909 MP |
5 | The dm-integrity target emulates a block device that has additional |
6 | per-sector tags that can be used for storing integrity information. | |
7 | ||
8 | A general problem with storing integrity tags with every sector is that | |
9 | writing the sector and the integrity tag must be atomic - i.e. in case of a | |
10 | crash, either both the sector and the integrity tag are written or neither is. | |
11 | ||
12 | To guarantee write atomicity, the dm-integrity target uses a journal: it | |
13 | writes sector data and integrity tags into the journal, commits the journal | |
14 | and then copies the data and integrity tags to their respective locations. | |
15 | ||
16 | The dm-integrity target can be used with the dm-crypt target - in this | |
17 | situation the dm-crypt target creates the integrity data and passes them | |
18 | to the dm-integrity target via bio_integrity_payload attached to the bio. | |
19 | In this mode, the dm-crypt and dm-integrity targets provide authenticated | |
20 | disk encryption - if the attacker modifies the encrypted device, an I/O | |
21 | error is returned instead of random data. | |
22 | ||
23 | The dm-integrity target can also be used as a standalone target. In this | |
24 | mode it calculates and verifies the integrity tags internally, so it can | |
25 | be used to detect silent data corruption on the disk or in the I/O | |
26 | path. | |
27 | ||
468dfca3 MP |
28 | There's an alternate mode of operation where dm-integrity uses a bitmap |
29 | instead of a journal. If a bit in the bitmap is 1, the corresponding | |
30 | region's data and integrity tags are not synchronized - if the machine | |
31 | crashes, the unsynchronized regions will be recalculated. The bitmap mode | |
32 | is faster than the journal mode, because we don't have to write the data | |
33 | twice, but it is also less reliable, because if data corruption happens | |
34 | when the machine crashes, it may not be detected. | |
7eada909 MP |
35 | |
36 | When loading the target for the first time, the kernel driver will format | |
37 | the device. But it will only format the device if the superblock contains | |
38 | zeroes. If the superblock is neither valid nor zeroed, the dm-integrity | |
39 | target can't be loaded. | |
40 | ||
41 | To use the target for the first time: | |
f0ba4377 | 42 | |
7eada909 MP |
43 | 1. overwrite the superblock with zeroes |
44 | 2. load the dm-integrity target with a size of one sector; the kernel driver | |
f0ba4377 | 45 | will format the device |
7eada909 MP |
46 | 3. unload the dm-integrity target |
47 | 4. read the "provided_data_sectors" value from the superblock | |
4e578ba6 | 48 | 5. load the dm-integrity target with the target size |
f0ba4377 | 49 | "provided_data_sectors" |
7eada909 | 50 | 6. if you want to use dm-integrity with dm-crypt, load the dm-crypt target |
f0ba4377 | 51 | with the size "provided_data_sectors" |
7eada909 MP |
52 | |
53 | ||
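
Step 4 above can be sketched in Python as follows. This is a minimal sketch, not the way any particular tool does it. It ASSUMES that the superblock starts right after the reserved sectors, and that its first fields are packed little-endian in the order given in the layout section, with the widths shown in the comment. Those widths are an assumption and should be checked against the kernel source for your version. The function name is hypothetical.

```python
import struct

SECTOR_SIZE = 512

# ASSUMED on-disk prefix of the superblock (little-endian, no padding):
#   magic[8], version (u8), log2(interleave sectors) (u8),
#   integrity tag size (u16), journal sections (u32),
#   provided data sectors (u64)
SB_PREFIX = struct.Struct("<8sBBHIQ")


def read_provided_data_sectors(device_path, reserved_sectors=0):
    """Return the "provided_data_sectors" field of a formatted device."""
    with open(device_path, "rb") as dev:
        # Skip the reserved area; the superblock is assumed to follow it.
        dev.seek(reserved_sectors * SECTOR_SIZE)
        raw = dev.read(SB_PREFIX.size)
    if len(raw) < SB_PREFIX.size:
        raise ValueError("device is too small to hold a superblock")
    fields = SB_PREFIX.unpack(raw)
    return fields[5]  # sixth field: provided data sectors
```

The returned value is the target size to use in steps 5 and 6.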
54 | Target arguments: | |
55 | ||
56 | 1. the underlying block device | |
57 | ||
58 | 2. the number of reserved sectors at the beginning of the device - the | |
f0ba4377 | 59 | dm-integrity target won't read or write these sectors |
7eada909 MP |
60 | |
61 | 3. the size of the integrity tag (if "-" is used, the size is taken from | |
f0ba4377 | 62 | the internal-hash algorithm) |
7eada909 MP |
63 | |
64 | 4. mode: | |
f0ba4377 MCC |
65 | |
66 | D - direct writes (without journal) | |
67 | in this mode, journaling is | |
7eada909 MP |
68 | not used and data sectors and integrity tags are written |
69 | separately. In case of a crash, it is possible that the data | |
70 | and integrity tag don't match. | |
f0ba4377 MCC |
71 | J - journaled writes |
72 | data and integrity tags are written to the | |
7eada909 MP |
73 | journal and atomicity is guaranteed. In case of a crash, |
74 | either both data and tag are written or neither is. The | |
75 | journaled mode halves write throughput because the | |
76 | data has to be written twice. | |
468dfca3 MP |
77 | B - bitmap mode - data and metadata are written without any |
78 | synchronization, the driver maintains a bitmap of dirty | |
79 | regions where data and metadata don't match. This mode can | |
80 | only be used with internal hash. | |
c2bcb2b7 MP |
81 | R - recovery mode - in this mode, the journal is not replayed, |
82 | checksums are not checked and writes to the device are not | |
83 | allowed. This mode is useful for data recovery if the | |
84 | device cannot be activated in any of the other standard | |
85 | modes. | |
7eada909 MP |
86 | |
87 | 5. the number of additional arguments | |
88 | ||
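
The five target arguments above, followed by any additional arguments, form the target-specific part of a device-mapper table line. The sketch below assembles such a line in Python. The leading "start length integrity" prefix follows the general device-mapper table convention; the helper name and the device path in the usage example are hypothetical.

```python
VALID_MODES = ("D", "J", "B", "R")


def integrity_table_line(length_sectors, device, reserved_sectors=0,
                         tag_size="-", mode="J", extra_args=()):
    """Build a table line: start, length, target name, then the arguments."""
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of D, J, B, R")
    extra = list(extra_args)
    fields = [
        0,                 # start sector of the mapping
        length_sectors,    # size of the mapping in 512-byte sectors
        "integrity",       # target name
        device,            # 1. the underlying block device
        reserved_sectors,  # 2. reserved sectors at the beginning
        tag_size,          # 3. tag size, or "-" to take it from the hash
        mode,              # 4. mode
        len(extra),        # 5. the number of additional arguments
        *extra,
    ]
    return " ".join(str(f) for f in fields)


# One-sector load used when formatting a device for the first time:
line = integrity_table_line(1, "/dev/sdX", extra_args=["internal_hash:crc32c"])
```

Here `line` is "0 1 integrity /dev/sdX 0 - J 1 internal_hash:crc32c".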
89 | Additional arguments: | |
90 | ||
56b67a4f | 91 | journal_sectors:number |
7eada909 MP |
92 | The size of the journal. This argument is used only when formatting the |
93 | device. If the device is already formatted, the value from the | |
94 | superblock is used. | |
95 | ||
56b67a4f | 96 | interleave_sectors:number |
7eada909 MP |
97 | The number of interleaved sectors. This value is rounded down to |
98 | a power of two. If the device is already formatted, the value from | |
99 | the superblock is used. | |
100 | ||
88ad5d1e | 101 | meta_device:device |
4e578ba6 | 102 | Don't interleave the data and metadata on the device. Use a |
88ad5d1e MP |
103 | separate device for metadata. |
104 | ||
56b67a4f | 105 | buffer_sectors:number |
7eada909 MP |
106 | The number of sectors in one buffer. The value is rounded down to |
107 | a power of two. | |
108 | ||
109 | The tag area is accessed using buffers; the buffer size is | |
110 | configurable. A larger buffer size means that each I/O will | |
111 | be larger, but fewer I/Os may be issued. | |
112 | ||
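
Both interleave_sectors and buffer_sectors are rounded down to a power of two. The effect of that rounding can be illustrated with a short Python sketch. The function name is hypothetical and this is not the kernel's code, only the same arithmetic.

```python
def round_down_to_power_of_two(n):
    """Return the largest power of two that is less than or equal to n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # bit_length() is the position of the highest set bit, counted from 1.
    return 1 << (n.bit_length() - 1)
```

For example, a requested value of 100 sectors would become 64, while 128 stays 128.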
56b67a4f | 113 | journal_watermark:number |
7eada909 MP |
114 | The journal watermark in percent. When the size of the journal |
115 | exceeds this watermark, the thread that flushes the journal will | |
116 | be started. | |
117 | ||
56b67a4f | 118 | commit_time:number |
7eada909 | 119 | Commit time in milliseconds. When this time passes, the journal is |
751d5b27 | 120 | written. The journal is also written immediately if a FLUSH |
7eada909 MP |
121 | request is received. |
122 | ||
56b67a4f | 123 | internal_hash:algorithm(:key) (the key is optional) |
7eada909 MP |
124 | Use internal hash or crc. |
125 | When this argument is used, the dm-integrity target won't accept | |
126 | integrity tags from the upper target, but it will automatically | |
127 | generate and verify the integrity tags. | |
128 | ||
129 | If you use a crc algorithm (such as crc32), the integrity target | |
130 | will protect the data against accidental corruption. | |
131 | You can also use an hmac algorithm (for example | |
132 | "hmac(sha256):0123456789abcdef"); in this mode it will provide | |
133 | cryptographic authentication of the data without encryption. | |
134 | ||
135 | When this argument is not used, the integrity tags are accepted | |
136 | from an upper layer target, such as dm-crypt. The upper layer | |
137 | target should check the validity of the integrity tags. | |
138 | ||
a3fcf725 MP |
139 | recalculate |
140 | Recalculate the integrity tags automatically. It is only valid | |
141 | when using internal hash. | |
142 | ||
56b67a4f | 143 | journal_crypt:algorithm(:key) (the key is optional) |
7eada909 MP |
144 | Encrypt the journal using the given algorithm to make sure that an |
145 | attacker can't read the journal. You can use a block cipher here | |
146 | (such as "cbc(aes)") or a stream cipher (for example "chacha20", | |
7fc979f8 | 147 | "salsa20" or "ctr(aes)"). |
7eada909 MP |
148 | |
149 | The journal contains a history of the last writes to the block device; | |
751d5b27 | 150 | an attacker reading the journal could see the last sector numbers |
7eada909 MP |
151 | that were written. From the sector numbers, the attacker can infer |
152 | the size of files that were written. To protect against this | |
153 | situation, you can encrypt the journal. | |
154 | ||
56b67a4f | 155 | journal_mac:algorithm(:key) (the key is optional) |
7eada909 MP |
156 | Protect sector numbers in the journal from accidental or malicious |
157 | modification. To protect against accidental modification, use a | |
158 | crc algorithm; to protect against malicious modification, use an | |
159 | hmac algorithm with a key. | |
160 | ||
161 | This option is not needed when using internal-hash because in this | |
162 | mode, the integrity of journal entries is checked when replaying | |
163 | the journal. Thus, a modified sector number would be detected at | |
164 | this stage. | |
165 | ||
9d609f85 MP |
166 | block_size:number |
167 | The size of a data block in bytes. The larger the block size the | |
168 | less overhead there is for per-block integrity metadata. | |
169 | Supported values are 512, 1024, 2048 and 4096 bytes. If not | |
170 | specified the default block size is 512 bytes. | |
7eada909 | 171 | |
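
The effect of block_size on metadata overhead can be estimated with simple arithmetic: one tag is stored per data block, so the tag bytes per data byte shrink as the block grows. The Python sketch below counts only the tags themselves; it ignores the journal, the superblock and padding, and the function name is hypothetical.

```python
SUPPORTED_BLOCK_SIZES = (512, 1024, 2048, 4096)


def tag_overhead_percent(tag_size_bytes, block_size_bytes=512):
    """Tag bytes stored per 100 bytes of data, for one tag per block."""
    if block_size_bytes not in SUPPORTED_BLOCK_SIZES:
        raise ValueError("block size must be 512, 1024, 2048 or 4096")
    return 100.0 * tag_size_bytes / block_size_bytes
```

With a 4-byte tag, this gives about 0.78% for 512-byte blocks and about 0.098% for 4096-byte blocks.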
468dfca3 MP |
172 | sectors_per_bit:number |
173 | In the bitmap mode, this parameter specifies the number of | |
174 | 512-byte sectors that corresponds to one bitmap bit. | |
175 | ||
176 | bitmap_flush_interval:number | |
177 | The bitmap flush interval in milliseconds. The metadata buffers | |
178 | are synchronized when this interval expires. | |
179 | ||
d537858a MP |
180 | fix_padding |
181 | Use a smaller padding of the tag area that is more | |
182 | space-efficient. If this option is not present, large padding is | |
183 | used - that is for compatibility with older kernels. | |
184 | ||
0a2bd55c MB |
185 | allow_discards |
186 | Allow block discard requests (a.k.a. TRIM) for the integrity device. | |
187 | Discards are only allowed to devices using internal hash. | |
188 | ||
189 | The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and | |
190 | allow_discards can be changed when reloading the target (load an inactive | |
191 | table and swap the tables with suspend and resume). The other arguments | |
192 | should not be changed when reloading the target because the layout of disk | |
193 | data depends on them and the reloaded target would be non-functional. | |
7eada909 MP |
194 | |
195 | ||
40e9c5ac MP |
196 | Status line: |
197 | ||
198 | 1. the number of integrity mismatches | |
199 | 2. provided data sectors - that is the number of sectors that the user | |
200 | could use | |
201 | 3. the current recalculating position (or '-' if we didn't recalculate) | |
202 | ||
203 | ||
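
The three status fields above can be parsed with a few lines of Python. This sketch ASSUMES that the fields are whitespace-separated and appear in the order listed, and that the caller passes only the target-specific part of the status line (without the start, length and target name). The function name and dictionary keys are hypothetical.

```python
def parse_integrity_status(status_fields):
    """Parse "<mismatches> <provided data sectors> <recalc position or ->"."""
    parts = status_fields.split()
    if len(parts) < 3:
        raise ValueError("expected at least three status fields")
    mismatches, provided, recalc = parts[:3]
    return {
        "mismatches": int(mismatches),
        "provided_data_sectors": int(provided),
        # '-' means that no recalculation took place.
        "recalc_position": None if recalc == "-" else int(recalc),
    }
```

A non-zero "mismatches" value is the field to watch when using the target to detect silent data corruption.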
7eada909 | 204 | The layout of the formatted block device: |
f0ba4377 MCC |
205 | |
206 | * reserved sectors | |
207 | (they are not used by this target, they can be used for | |
208 | storing LUKS metadata or for other purposes), the size of the reserved | |
209 | area is specified in the target arguments | |
210 | ||
7eada909 MP |
211 | * superblock (4kiB) |
212 | * magic string - identifies that the device was formatted | |
213 | * version | |
214 | * log2(interleave sectors) | |
215 | * integrity tag size | |
216 | * the number of journal sections | |
217 | * provided data sectors - the number of sectors that this target | |
218 | provides (i.e. the size of the device minus the size of all | |
219 | metadata and padding). The user of this target should not send | |
220 | bios that access data beyond the "provided data sectors" limit. | |
88ad5d1e | 221 | * flags |
f0ba4377 MCC |
222 | SB_FLAG_HAVE_JOURNAL_MAC |
223 | - a flag is set if journal_mac is used | |
224 | SB_FLAG_RECALCULATING | |
225 | - recalculating is in progress | |
226 | SB_FLAG_DIRTY_BITMAP | |
227 | - journal area contains the bitmap of dirty | |
228 | blocks | |
88ad5d1e MP |
229 | * log2(sectors per block) |
230 | * a position where recalculating finished | |
7eada909 MP |
231 | * journal |
232 | The journal is divided into sections, each section contains: | |
f0ba4377 | 233 | |
7eada909 | 234 | * metadata area (4kiB), it contains journal entries |
f0ba4377 MCC |
235 | |
236 | - every journal entry contains: | |
237 | ||
7eada909 MP |
238 | * logical sector (specifies where the data and tag should |
239 | be written) | |
240 | * last 8 bytes of data | |
241 | * integrity tag (the size is specified in the superblock) | |
f0ba4377 MCC |
242 | |
243 | - every metadata sector ends with | |
244 | ||
7eada909 MP |
245 | mac (8 bytes). The macs in all 8 metadata sectors together form a |
246 | 64-byte value. It is used to store the hmac of the sector | |
247 | numbers in the journal section, to protect against the | |
248 | possibility that an attacker tampers with sector | |
249 | numbers in the journal. | |
250 | * commit id | |
f0ba4377 | 251 | |
7eada909 MP |
252 | * data area (the size is variable; it depends on how many journal |
253 | entries fit into the metadata area) | |
f0ba4377 MCC |
254 | |
255 | - every sector in the data area contains: | |
256 | ||
7eada909 MP |
257 | * data (504 bytes of data, the last 8 bytes are stored in |
258 | the journal entry) | |
259 | * commit id | |
f0ba4377 | 260 | |
7eada909 MP |
261 | To test if the whole journal section was written correctly, every |
262 | 512-byte sector of the journal ends with an 8-byte commit id. If the | |
263 | commit id matches on all sectors in a journal section, then it is | |
264 | assumed that the section was written correctly. If the commit id | |
265 | doesn't match, the section was written partially and it should not | |
266 | be replayed. | |
f0ba4377 MCC |
267 | |
268 | * one or more runs of interleaved tags and data. | |
269 | Each run contains: | |
270 | ||
7eada909 MP |
271 | * tag area - it contains integrity tags. There is one tag for each |
272 | sector in the data area | |
273 | * data area - it contains data sectors. The number of data sectors | |
274 | in one run must be a power of two. log2 of this value is stored | |
275 | in the superblock. |
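
The commit id rule for journal sections, and the way a data sector is split between the data area and the journal entry, can be modelled in Python as follows. This is a simplified sketch and not the kernel's replay code. It ASSUMES that the caller already knows the expected commit id for each sector; how those ids are derived is outside the scope of the sketch. All function names are hypothetical.

```python
SECTOR_SIZE = 512
COMMIT_ID_SIZE = 8


def commit_ids(section):
    """Return the trailing 8 bytes of every 512-byte sector in a section."""
    if len(section) % SECTOR_SIZE:
        raise ValueError("section length must be a multiple of 512 bytes")
    return [section[off + SECTOR_SIZE - COMMIT_ID_SIZE:off + SECTOR_SIZE]
            for off in range(0, len(section), SECTOR_SIZE)]


def section_fully_written(section, expected_ids):
    """True only if the commit id matches on all sectors of the section."""
    found = commit_ids(section)
    return len(found) == len(expected_ids) and all(
        a == b for a, b in zip(found, expected_ids))


def rebuild_data_sector(journal_data_sector, last_8_bytes):
    """Join the 504 data bytes with the 8 bytes kept in the journal entry."""
    payload = journal_data_sector[:SECTOR_SIZE - COMMIT_ID_SIZE]
    return payload + last_8_bytes
```

A section for which `section_fully_written` returns False corresponds to a partially written section, which should not be replayed.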