============
dm-integrity
============

The dm-integrity target emulates a block device that has additional
per-sector tags that can be used for storing integrity information.

A general problem with storing integrity tags with every sector is that
writing the sector and the integrity tag must be atomic - i.e. in case of
a crash, either both the sector and the integrity tag are written or
neither of them is.

To guarantee write atomicity, the dm-integrity target uses a journal: it
writes sector data and integrity tags into a journal, commits the journal
and then copies the data and integrity tags to their respective locations.

The dm-integrity target can be used with the dm-crypt target - in this
situation the dm-crypt target creates the integrity data and passes them
to the dm-integrity target via bio_integrity_payload attached to the bio.
In this mode, the dm-crypt and dm-integrity targets provide authenticated
disk encryption - if the attacker modifies the encrypted device, an I/O
error is returned instead of random data.

The dm-integrity target can also be used as a standalone target. In this
mode, it calculates and verifies the integrity tag internally and can be
used to detect silent data corruption on the disk or in the I/O path.

There's an alternate mode of operation where dm-integrity uses a bitmap
instead of a journal. If a bit in the bitmap is 1, the corresponding
region's data and integrity tags are not synchronized - if the machine
crashes, the unsynchronized regions will be recalculated. The bitmap mode
is faster than the journal mode, because we don't have to write the data
twice, but it is also less reliable, because if data corruption happens
when the machine crashes, it may not be detected.

When loading the target for the first time, the kernel driver will format
the device. But it will only format the device if the superblock contains
zeroes. If the superblock is neither valid nor zeroed, the dm-integrity
target can't be loaded.

To use the target for the first time:

1. overwrite the superblock with zeroes
2. load the dm-integrity target with one-sector size, the kernel driver
   will format the device
3. unload the dm-integrity target
4. read the "provided_data_sectors" value from the superblock
5. load the dm-integrity target with the target size
   "provided_data_sectors"
6. if you want to use dm-integrity with dm-crypt, load the dm-crypt target
   with the size "provided_data_sectors"
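
Step 4 above can be sketched in Python as follows. This is only an
illustration: the field order follows the superblock description in this
document, but the field widths, the little-endian byte order and the magic
value ``integrt`` are assumptions of the sketch, as are the helper names.
It also assumes that the superblock directly follows the reserved sectors
and that no separate metadata device is used.

```python
import struct

SECTOR_SIZE = 512

# Assumed encoding of the leading superblock fields, in the order listed in
# this document: magic, version, log2(interleave sectors), integrity tag
# size, number of journal sections, provided data sectors. The widths and
# the little-endian byte order are assumptions of this sketch.
SB_PREFIX = struct.Struct("<8sBBHIQ")
SB_MAGIC = b"integrt\0"  # assumed magic string


def parse_provided_data_sectors(raw: bytes) -> int:
    """Extract provided_data_sectors from raw superblock bytes."""
    magic, _version, _log2_il, _tag_size, _sections, provided = (
        SB_PREFIX.unpack_from(raw)
    )
    if magic != SB_MAGIC:
        raise ValueError("no dm-integrity superblock found")
    return provided


def read_provided_data_sectors(path: str, reserved_sectors: int = 0) -> int:
    """Read the superblock that follows the reserved sectors on `path`."""
    with open(path, "rb") as dev:
        dev.seek(reserved_sectors * SECTOR_SIZE)
        return parse_provided_data_sectors(dev.read(4096))
```

The value returned would then be used as the target size in steps 5 and 6.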

Target arguments:

1. the underlying block device

2. the number of reserved sectors at the beginning of the device - the
   dm-integrity target won't read or write these sectors

3. the size of the integrity tag (if "-" is used, the size is taken from
   the internal-hash algorithm)

4. mode:

        D - direct writes (without journal)
                in this mode, journaling is
                not used and data sectors and integrity tags are written
                separately. In case of a crash, it is possible that the
                data and integrity tag don't match.
        J - journaled writes
                data and integrity tags are written to the
                journal and atomicity is guaranteed. In case of a crash,
                either both data and tag or none of them are written. The
                journaled mode halves write throughput because the
                data have to be written twice.
        B - bitmap mode - data and metadata are written without any
                synchronization, the driver maintains a bitmap of dirty
                regions where data and metadata don't match. This mode can
                only be used with internal hash.
        R - recovery mode - in this mode, the journal is not replayed,
                checksums are not checked and writes to the device are not
                allowed. This mode is useful for data recovery if the
                device cannot be activated in any of the other standard
                modes.

5. the number of additional arguments
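
As an illustration of how these arguments line up, the following Python
sketch assembles a table line. The helper name and the device path in the
usage note are hypothetical. The leading ``0 <length> integrity`` part is
the generic device-mapper prefix (start sector, length, target name) and
is not specific to this target.

```python
VALID_MODES = ("D", "J", "B", "R")


def integrity_table(length, device, reserved, tag_size, mode, extra=()):
    """Build a device-mapper table line for the integrity target.

    length   -- size of the mapped device in 512-byte sectors
    tag_size -- integrity tag size in bytes, or None to pass "-"
    extra    -- additional arguments such as "internal_hash:crc32"
    """
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of D, J, B, R")
    tag = "-" if tag_size is None else str(tag_size)
    extra = list(extra)
    # Argument 5 is the count of the additional arguments that follow.
    parts = ["0", str(length), "integrity", device, str(reserved), tag,
             mode, str(len(extra))] + extra
    return " ".join(parts)
```

For example, ``integrity_table(1, "/dev/sdb", 0, None, "J",
["internal_hash:crc32"])`` yields
``0 1 integrity /dev/sdb 0 - J 1 internal_hash:crc32``.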

Additional arguments:

journal_sectors:number
        The size of the journal. This argument is used only when
        formatting the device. If the device is already formatted, the
        value from the superblock is used.

interleave_sectors:number
        The number of interleaved sectors. This value is rounded down to
        a power of two. If the device is already formatted, the value from
        the superblock is used.

meta_device:device
        Don't interleave the data and metadata on the device. Use a
        separate device for metadata.

buffer_sectors:number
        The number of sectors in one buffer. The value is rounded down to
        a power of two.

        The tag area is accessed using buffers, the buffer size is
        configurable. A larger buffer size means that the I/O size will
        be larger, but fewer I/Os may be issued.
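
The rounding applied to interleave_sectors and buffer_sectors can be
illustrated with a short Python sketch. The function name is hypothetical
and the sketch only shows the arithmetic, not the driver's own code.

```python
def round_down_pow2(n: int) -> int:
    """Return the largest power of two that is less than or equal to n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # bit_length() is the position of the highest set bit, counted from 1.
    return 1 << (n.bit_length() - 1)
```

With this rule a requested value of 100 sectors becomes 64, while 128
stays 128.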

journal_watermark:number
        The journal watermark as a percentage. When the size of the
        journal exceeds this watermark, the thread that flushes the
        journal will be started.

commit_time:number
        Commit time in milliseconds. When this time passes, the journal is
        written. The journal is also written immediately if a FLUSH
        request is received.

internal_hash:algorithm(:key)   (the key is optional)
        Use internal hash or crc.
        When this argument is used, the dm-integrity target won't accept
        integrity tags from the upper target, but it will automatically
        generate and verify the integrity tags.

        You can use a crc algorithm (such as crc32), then the integrity
        target will protect the data against accidental corruption.
        You can also use a hmac algorithm (for example
        "hmac(sha256):0123456789abcdef"), in this mode it will provide
        cryptographic authentication of the data without encryption.

        When this argument is not used, the integrity tags are accepted
        from an upper layer target, such as dm-crypt. The upper layer
        target should check the validity of the integrity tags.
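
The difference between a crc tag and a keyed hmac tag can be illustrated
in Python. This is a conceptual sketch, not the kernel's computation:
mixing the sector number into the tag, the byte order, and reading the key
as a hexadecimal string are all assumptions made here, and zlib's CRC-32
is not guaranteed to match the kernel's crc32 output.

```python
import hashlib
import hmac
import struct
import zlib


def crc32_tag(sector: int, data: bytes) -> bytes:
    """4-byte tag that guards against accidental corruption only."""
    payload = struct.pack("<Q", sector) + data  # assumed input layout
    return struct.pack("<I", zlib.crc32(payload))


def hmac_sha256_tag(key: bytes, sector: int, data: bytes) -> bytes:
    """32-byte keyed tag; it cannot be forged without the key."""
    payload = struct.pack("<Q", sector) + data  # assumed input layout
    return hmac.new(key, payload, hashlib.sha256).digest()


def tags_match(stored: bytes, computed: bytes) -> bool:
    """Compare two tags in constant time."""
    return hmac.compare_digest(stored, computed)
```

Anyone who can modify the device can recompute a crc tag, which is why
only the keyed variant provides authentication.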

recalculate
        Recalculate the integrity tags automatically. It is only valid
        when using internal hash.

journal_crypt:algorithm(:key)   (the key is optional)
        Encrypt the journal using the given algorithm to make sure that
        the attacker can't read the journal. You can use a block cipher
        here (such as "cbc(aes)") or a stream cipher (for example
        "chacha20", "salsa20" or "ctr(aes)").

        The journal contains a history of the last writes to the block
        device, so an attacker reading the journal could see the last
        sector numbers that were written. From the sector numbers, the
        attacker can infer the size of files that were written. To protect
        against this situation, you can encrypt the journal.

journal_mac:algorithm(:key)     (the key is optional)
        Protect sector numbers in the journal from accidental or malicious
        modification. To protect against accidental modification, use a
        crc algorithm; to protect against malicious modification, use a
        hmac algorithm with a key.

        This option is not needed when using internal-hash because in this
        mode, the integrity of journal entries is checked when replaying
        the journal. Thus, a modified sector number would be detected at
        this stage.

block_size:number
        The size of a data block in bytes. The larger the block size the
        less overhead there is for per-block integrity metadata.
        Supported values are 512, 1024, 2048 and 4096 bytes. If not
        specified the default block size is 512 bytes.
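
The effect of block_size on metadata overhead is simple arithmetic, shown
here as a Python sketch. The function name is hypothetical, and the sketch
counts only the tag bytes, ignoring padding and the journal.

```python
SUPPORTED_BLOCK_SIZES = (512, 1024, 2048, 4096)


def tag_overhead_percent(tag_size: int, block_size: int = 512) -> float:
    """Tag bytes stored per 100 bytes of data for one tag per block."""
    if block_size not in SUPPORTED_BLOCK_SIZES:
        raise ValueError("unsupported block size")
    return 100.0 * tag_size / block_size
```

With a 4-byte tag, the overhead is 0.78125 percent for 512-byte blocks and
0.09765625 percent for 4096-byte blocks.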

sectors_per_bit:number
        In the bitmap mode, this parameter specifies the number of
        512-byte sectors that correspond to one bitmap bit.
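
The relation between sectors_per_bit and the bitmap can be sketched in
Python. The function names are hypothetical; the sketch assumes that bit
N covers sectors N * sectors_per_bit up to (N + 1) * sectors_per_bit - 1
and ignores any rounding the driver may apply to the parameter.

```python
def bitmap_bits(provided_data_sectors: int, sectors_per_bit: int) -> int:
    """Number of bitmap bits needed to cover the device (rounded up)."""
    return -(-provided_data_sectors // sectors_per_bit)


def bitmap_bit_for_sector(sector: int, sectors_per_bit: int) -> int:
    """Index of the bitmap bit that covers a 512-byte sector."""
    return sector // sectors_per_bit
```

A larger sectors_per_bit gives a smaller bitmap, at the cost of larger
regions to recalculate after a crash.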

bitmap_flush_interval:number
        The bitmap flush interval in milliseconds. The metadata buffers
        are synchronized when this interval expires.

fix_padding
        Use a smaller padding of the tag area that is more
        space-efficient. If this option is not present, large padding is
        used - that is for compatibility with older kernels.

allow_discards
        Allow block discard requests (a.k.a. TRIM) for the integrity
        device. Discards are only allowed to devices using internal hash.

The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
allow_discards can be changed when reloading the target (load an inactive
table and swap the tables with suspend and resume). The other arguments
should not be changed when reloading the target because the layout of disk
data depends on them and the reloaded target would be non-functional.


Status line:

1. the number of integrity mismatches
2. provided data sectors - that is the number of sectors that the user
   could use
3. the current recalculating position (or '-' if we didn't recalculate)
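
A minimal Python sketch of parsing these three fields is shown below. The
type and function names are hypothetical. The input is assumed to be only
the target-specific part of the status output, without the generic start,
length and target name that precede it.

```python
from typing import NamedTuple, Optional


class IntegrityStatus(NamedTuple):
    mismatches: int
    provided_data_sectors: int
    recalc_position: Optional[int]  # None when the status shows '-'


def parse_status(fields: str) -> IntegrityStatus:
    """Parse the three target-specific status fields."""
    mismatches, provided, recalc = fields.split()[:3]
    position = None if recalc == "-" else int(recalc)
    return IntegrityStatus(int(mismatches), int(provided), position)
```

A non-zero first field means that at least one integrity mismatch was
detected.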


The layout of the formatted block device:

* reserved sectors
        (they are not used by this target, they can be used for
        storing LUKS metadata or for other purposes), the size of the
        reserved area is specified in the target arguments

* superblock (4kiB)
        * magic string - identifies that the device was formatted
        * version
        * log2(interleave sectors)
        * integrity tag size
        * the number of journal sections
        * provided data sectors - the number of sectors that this target
          provides (i.e. the size of the device minus the size of all
          metadata and padding). The user of this target should not send
          bios that access data beyond the "provided data sectors" limit.
        * flags
            SB_FLAG_HAVE_JOURNAL_MAC
                - a flag is set if journal_mac is used
            SB_FLAG_RECALCULATING
                - recalculating is in progress
            SB_FLAG_DIRTY_BITMAP
                - journal area contains the bitmap of dirty
                  blocks
        * log2(sectors per block)
        * a position where recalculating finished
MP
231* journal
232 The journal is divided into sections, each section contains:
f0ba4377 233
7eada909 234 * metadata area (4kiB), it contains journal entries
f0ba4377
MCC
235
236 - every journal entry contains:
237
7eada909
MP
238 * logical sector (specifies where the data and tag should
239 be written)
240 * last 8 bytes of data
241 * integrity tag (the size is specified in the superblock)
f0ba4377
MCC
242
243 - every metadata sector ends with
244
7eada909
MP
245 * mac (8-bytes), all the macs in 8 metadata sectors form a
246 64-byte value. It is used to store hmac of sector
247 numbers in the journal section, to protect against a
248 possibility that the attacker tampers with sector
249 numbers in the journal.
250 * commit id
f0ba4377 251
7eada909
MP
252 * data area (the size is variable; it depends on how many journal
253 entries fit into the metadata area)
f0ba4377
MCC
254
255 - every sector in the data area contains:
256
7eada909
MP
257 * data (504 bytes of data, the last 8 bytes are stored in
258 the journal entry)
259 * commit id
f0ba4377 260
7eada909
MP
261 To test if the whole journal section was written correctly, every
262 512-byte sector of the journal ends with 8-byte commit id. If the
263 commit id matches on all sectors in a journal section, then it is
264 assumed that the section was written correctly. If the commit id
265 doesn't match, the section was written partially and it should not
266 be replayed.
f0ba4377
MCC
267
268* one or more runs of interleaved tags and data.
269 Each run contains:
270
7eada909
MP
271 * tag area - it contains integrity tags. There is one tag for each
272 sector in the data area
273 * data area - it contains data sectors. The number of data sectors
274 in one run must be a power of two. log2 of this value is stored
275 in the superblock.