.. SPDX-License-Identifier: GPL-2.0

================================================
ZoneFS - Zone filesystem for Zoned block devices
================================================

Introduction
============

zonefs is a very simple file system exposing each zone of a zoned block device
as a file. Unlike a regular POSIX-compliant file system with native zoned block
device support (e.g. f2fs), zonefs does not hide the sequential write
constraint of zoned block devices from the user. Files representing sequential
write zones of the device must be written sequentially starting from the end
of the file (append only writes).

As such, zonefs is in essence closer to a raw block device access interface
than to a full-featured POSIX file system. The goal of zonefs is to simplify
the implementation of zoned block device support in applications by replacing
raw block device file accesses with a richer file API, avoiding relying on
direct block device file ioctls which may be more obscure to developers. One
example of this approach is the implementation of LSM (log-structured merge)
tree structures (such as used in RocksDB and LevelDB) on zoned block devices
by allowing SSTables to be stored in a zone file similarly to a regular file
system rather than as a range of sectors of the entire disk. The introduction
of the higher level construct "one file is one zone" can help reduce the
number of changes needed in the application and ease the introduction of
support for different application programming languages.

Zoned block devices
-------------------

Zoned storage devices belong to a class of storage devices with an address
space that is divided into zones. A zone is a group of consecutive LBAs and all
zones are contiguous (there are no LBA gaps). Zones may have different types.

* Conventional zones: there are no access constraints to LBAs belonging to
  conventional zones. Any read or write access can be executed, similarly to a
  regular block device.
* Sequential zones: these zones accept random reads but must be written
  sequentially. Each sequential zone has a write pointer maintained by the
  device that keeps track of the mandatory start LBA position of the next write
  to the device. As a result of this write constraint, LBAs in a sequential zone
  cannot be overwritten. Sequential zones must first be erased using a special
  command (zone reset) before rewriting.

Zoned storage devices can be implemented using various recording and media
technologies. The most common form of zoned storage today uses the SCSI Zoned
Block Commands (ZBC) and Zoned ATA Commands (ZAC) interfaces on Shingled
Magnetic Recording (SMR) HDDs.

Solid State Disks (SSD) storage devices can also implement a zoned interface
to, for instance, reduce internal write amplification due to garbage collection.
The NVMe Zoned NameSpace (ZNS) is a technical proposal of the NVMe standard
committee aiming at adding a zoned storage interface to the NVMe protocol.

Zonefs Overview
===============

Zonefs exposes the zones of a zoned block device as files. The files
representing zones are grouped by zone type, which are themselves represented
by sub-directories. This file structure is built entirely using zone information
provided by the device and so does not require any complex on-disk metadata
structure.

On-disk metadata
----------------

zonefs on-disk metadata is reduced to an immutable super block which
persistently stores a magic number and optional feature flags and values. On
mount, zonefs uses blkdev_report_zones() to obtain the device zone configuration
and populates the mount point with a static file tree solely based on this
information. File sizes come from the device zone type and write pointer
position managed by the device itself.

The super block is always written on disk at sector 0. The first zone of the
device storing the super block is never exposed as a zone file by zonefs. If
the zone containing the super block is a sequential zone, the mkzonefs format
tool always "finishes" the zone, that is, it transitions the zone to a full
state to make it read-only, preventing any data write.

Zone type sub-directories
-------------------------

Files representing zones of the same type are grouped together under the same
sub-directory automatically created on mount.

For conventional zones, the sub-directory "cnv" is used. This directory is
however created if and only if the device has usable conventional zones. If
the device only has a single conventional zone at sector 0, the zone will not
be exposed as a file as it will be used to store the zonefs super block. For
such devices, the "cnv" sub-directory will not be created.

For sequential write zones, the sub-directory "seq" is used.

These two directories are the only directories that exist in zonefs. Users
cannot create other directories and cannot rename or delete the "cnv" and
"seq" sub-directories.

The size of each directory, as reported by the st_size field of struct stat
obtained with the stat() or fstat() system calls, indicates the number of files
existing under the directory.

Zone files
----------

Zone files are named using the number of the zone they represent within the set
of zones of a particular type. That is, both the "cnv" and "seq" directories
contain files named "0", "1", "2", ... The file numbers also represent
increasing zone start sector on the device.
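
Because zone files are named with plain decimal numbers, a lexical sort places
"10" before "2". The following is a minimal sketch of listing a zone directory
in zone order. The helper name list_zone_files is hypothetical and is not part
of zonefs or zonefs-tools.

```shell
# list_zone_files DIR: print the zone file names found in DIR (e.g. /mnt/seq),
# one per line, in increasing numeric order. Since file numbers follow
# increasing zone start sectors, this is also the on-device zone order.
# Hypothetical helper, not a zonefs tool.
list_zone_files() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue      # skip anything that is not a regular file
        basename "$f"
    done | sort -n
}
```

For instance, ``list_zone_files /mnt/seq | tail -n 1`` would print the name of
the last sequential zone file of a file system mounted on /mnt.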

Read and write operations on zone files are not allowed beyond the file
maximum size, that is, beyond the zone capacity. Any access exceeding the zone
capacity fails with the -EFBIG error.

Creating, deleting, renaming or modifying any attribute of files and
sub-directories is not allowed.

The number of blocks of a file as reported by stat() and fstat() indicates the
capacity of the zone file, or in other words, the maximum file size.

Conventional zone files
-----------------------

The size of conventional zone files is fixed to the size of the zone they
represent. Conventional zone files cannot be truncated.

These files can be randomly read and written using any type of I/O operation:
buffered I/Os, direct I/Os, memory mapped I/Os (mmap), etc. There are no I/O
constraints for these files beyond the file size limit mentioned above.

Sequential zone files
---------------------

The size of sequential zone files grouped in the "seq" sub-directory represents
the file's zone write pointer position relative to the zone start sector.

Sequential zone files can only be written sequentially, starting from the file
end, that is, write operations can only be append writes. Zonefs makes no
attempt at accepting random writes and will fail any write request that has a
start offset not corresponding to the end of the file, or to the end of the last
write issued and still in-flight (for asynchronous I/O operations).

Since dirty page writeback by the page cache does not guarantee a sequential
write pattern, zonefs prevents buffered writes and writeable shared mappings
on sequential files. Only direct I/O writes are accepted for these files.
zonefs relies on the sequential delivery of write I/O requests to the device
implemented by the block layer elevator. An elevator implementing the sequential
write feature for zoned block devices (ELEVATOR_F_ZBD_SEQ_WRITE elevator
feature) must be used. This type of elevator (e.g. mq-deadline) is set by
default for zoned block devices on device initialization.
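
The append-only and direct I/O rules above can be sketched with dd as follows.
This is an illustration only, assuming GNU dd and stat. The helper names
append_seek_blocks and append_to_zone_file are hypothetical, and the block size
passed in is assumed to be a multiple of the device physical sector size.

```shell
# append_seek_blocks SIZE BS: print the dd "seek=" value (in units of BS bytes)
# that makes a write start exactly at the current end of a file of SIZE bytes.
# Fails if SIZE is not a multiple of BS, since dd could then not position the
# write at the end of the file. Hypothetical helper, not a zonefs tool.
append_seek_blocks() {
    size=$1
    bs=$2
    [ "$bs" -gt 0 ] || return 1
    [ $(( size % bs )) -eq 0 ] || return 1
    echo $(( size / bs ))
}

# append_to_zone_file SRC ZONEFILE BS: append one BS-byte block read from SRC
# to ZONEFILE using direct I/O, starting at the current file size.
append_to_zone_file() {
    src=$1
    zfile=$2
    bs=$3
    seek=$(append_seek_blocks "$(stat -c %s "$zfile")" "$bs") || return 1
    dd if="$src" of="$zfile" bs="$bs" count=1 seek="$seek" \
       conv=notrunc oflag=direct
}
```

With a file system mounted on /mnt, ``append_to_zone_file /dev/zero /mnt/seq/0
4096`` would be expected to grow the file by 4096 bytes on each call until the
zone capacity is reached.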

There are no restrictions on the type of I/O used for read operations in
sequential zone files. Buffered I/Os, direct I/Os and shared read mappings are
all accepted.

Truncating sequential zone files is allowed only down to 0, in which case, the
zone is reset to rewind the file zone write pointer position to the start of
the zone, or up to the zone capacity, in which case the file's zone is
transitioned to the FULL state (finish zone operation).
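
The truncation rule can be summarized as a small check. This is only a sketch
of the rule as stated above, not of the zonefs implementation, and the helper
name truncate_effect is hypothetical.

```shell
# truncate_effect NEW_SIZE ZONE_CAPACITY: print the zone operation that
# truncating a sequential zone file to NEW_SIZE bytes corresponds to:
# "reset" for size 0, "finish" for the zone capacity. Any other size is not
# allowed, in which case nothing is printed and the function fails.
# Hypothetical helper, not a zonefs tool.
truncate_effect() {
    new_size=$1
    capacity=$2
    if [ "$new_size" -eq 0 ]; then
        echo reset
    elif [ "$new_size" -eq "$capacity" ]; then
        echo finish
    else
        return 1
    fi
}
```

An application could call such a check before running ``truncate -s`` on a
sequential zone file to avoid issuing a request that zonefs will reject.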

Format options
--------------

Several optional features of zonefs can be enabled at format time.

* Conventional zone aggregation: ranges of contiguous conventional zones can be
  aggregated into a single larger file instead of the default one file per zone.
* File ownership: The owner UID and GID of zone files are by default 0 (root)
  but can be changed to any valid UID/GID.
* File access permissions: the default 640 access permissions can be changed.

IO error handling
-----------------

Zoned block devices may fail I/O requests for reasons similar to regular block
devices, e.g. due to bad sectors. However, in addition to such known I/O
failure patterns, the standards governing zoned block devices behavior define
additional conditions that result in I/O errors.

* A zone may transition to the read-only condition (BLK_ZONE_COND_READONLY):
  While the data already written in the zone is still readable, the zone can
  no longer be written. No user action on the zone (zone management command or
  read/write access) can change the zone condition back to a normal read/write
  state. While the reasons for the device to transition a zone to read-only
  state are not defined by the standards, a typical cause for such transition
  would be a defective write head on an HDD (all zones under this head are
  changed to read-only).

* A zone may transition to the offline condition (BLK_ZONE_COND_OFFLINE):
  An offline zone cannot be read nor written. No user action can transition an
  offline zone back to an operational good state. Similarly to zone read-only
  transitions, the reasons for a drive to transition a zone to the offline
  condition are undefined. A typical cause would be a defective read-write head
  on an HDD causing all zones on the platter under the broken head to be
  inaccessible.

* Unaligned write errors: These errors result from the host issuing write
  requests with a start sector that does not correspond to a zone write pointer
  position when the write request is executed by the device. Even though zonefs
  enforces sequential file write for sequential zones, unaligned write errors
  may still happen in the case of a partial failure of a very large direct I/O
  operation split into multiple BIOs/requests or asynchronous I/O operations.
  If one of the write requests within the set of sequential write requests
  issued to the device fails, all write requests queued after it will
  become unaligned and fail.

* Delayed write errors: similarly to regular block devices, if the device side
  write cache is enabled, write errors may occur in ranges of previously
  completed writes when the device write cache is flushed, e.g. on fsync().
  Similarly to the previous immediate unaligned write error case, delayed write
  errors can propagate through a stream of cached sequential data for a zone
  causing all data to be dropped after the sector that caused the error.

All I/O errors detected by zonefs are notified to the user with an error code
return for the system call that triggered or detected the error. The recovery
actions taken by zonefs in response to I/O errors depend on the I/O type (read
vs write) and on the reason for the error (bad sector, unaligned writes or zone
condition change).

* For read I/O errors, zonefs does not execute any particular recovery action,
  but only if the file zone is still in a good condition and there is no
  inconsistency between the file inode size and its zone write pointer position.
  If a problem is detected, I/O error recovery is executed (see the table
  below).

* For write I/O errors, zonefs I/O error recovery is always executed.

* A zone condition change to read-only or offline also always triggers zonefs
  I/O error recovery.

Zonefs minimal I/O error recovery may change a file size and file access
permissions.

* File size changes:
  Immediate or delayed write errors in a sequential zone file may cause the file
  inode size to be inconsistent with the amount of data successfully written in
  the file zone. For instance, the partial failure of a multi-BIO large write
  operation will cause the zone write pointer to advance partially, even though
  the entire write operation will be reported as failed to the user. In such a
  case, the file inode size must be advanced to reflect the zone write pointer
  change and eventually allow the user to restart writing at the end of the
  file.
  A file size may also be reduced to reflect a delayed write error detected on
  fsync(): in this case, the amount of data effectively written in the zone may
  be less than originally indicated by the file inode size. After such I/O
  error, zonefs always fixes the file inode size to reflect the amount of data
  persistently stored in the file zone.

* Access permission changes:
  A zone condition change to read-only is indicated with a change in the file
  access permissions to render the file read-only. This disables changes to the
  file attributes and data modification. For offline zones, all permissions
  (read and write) to the file are disabled.

Further action taken by zonefs I/O error recovery can be controlled by the user
with the "errors=xxx" mount option. The table below summarizes the result of
zonefs I/O error processing depending on the mount option and on the zone
conditions::

    +--------------+-----------+-----------------------------------------+
    |              |           |            Post error state             |
    | "errors=xxx" |  device   |                 access permissions      |
    |    mount     |   zone    | file         file          device zone  |
    |    option    | condition | size     read    write    read    write |
    +--------------+-----------+-----------------------------------------+
    |              |   good    | fixed     yes     no      yes     yes   |
    | remount-ro   | read-only | as is     yes     no      yes     no    |
    | (default)    |  offline  |   0       no      no      no      no    |
    +--------------+-----------+-----------------------------------------+
    |              |   good    | fixed     yes     no      yes     yes   |
    | zone-ro      | read-only | as is     yes     no      yes     no    |
    |              |  offline  |   0       no      no      no      no    |
    +--------------+-----------+-----------------------------------------+
    |              |   good    |   0       no      no      yes     yes   |
    | zone-offline | read-only |   0       no      no      yes     no    |
    |              |  offline  |   0       no      no      no      no    |
    +--------------+-----------+-----------------------------------------+
    |              |   good    | fixed     yes     yes     yes     yes   |
    | repair       | read-only | as is     yes     no      yes     no    |
    |              |  offline  |   0       no      no      no      no    |
    +--------------+-----------+-----------------------------------------+

Further notes:

* The "errors=remount-ro" mount option is the default behavior of zonefs I/O
  error processing if no errors mount option is specified.
* With the "errors=remount-ro" mount option, the change of the file access
  permissions to read-only applies to all files. The file system is remounted
  read-only.
* Access permission and file size changes due to the device transitioning zones
  to the offline condition are permanent. Remounting or reformatting the device
  with mkfs.zonefs (mkzonefs) will not change back offline zone files to a good
  state.
* File access permission changes to read-only due to the device transitioning
  zones to the read-only condition are permanent. Remounting or reformatting
  the device will not re-enable file write access.
* File access permission changes implied by the remount-ro, zone-ro and
  zone-offline mount options are temporary for zones in a good condition.
  Unmounting and remounting the file system will restore the previous default
  (format time values) access rights to the files affected.
* The repair mount option triggers only the minimal set of I/O error recovery
  actions, that is, file size fixes for zones in a good condition. Zones
  indicated as being read-only or offline by the device still imply changes to
  the zone file access permissions as noted in the table above.
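
For scripts that need to reason about the post error state, the table above
can be transcribed as a lookup function. This is a direct restatement of the
table rows and nothing more. The function name post_error_state and its output
format are hypothetical.

```shell
# post_error_state ERRORS_OPT ZONE_COND: print one row of the I/O error table
# as five fields:
#   file_size file_read file_write device_zone_read device_zone_write
# ERRORS_OPT is the value of the "errors=" mount option and ZONE_COND is one
# of good, read-only or offline. Fails for values not listed in the table.
# Hypothetical helper, not a zonefs tool.
post_error_state() {
    opt=$1
    cond=$2
    case "$opt" in
        remount-ro|zone-ro|zone-offline|repair) ;;
        *) return 1 ;;
    esac
    case "$opt:$cond" in
        *:offline)              echo "0 no no no no" ;;
        zone-offline:good)      echo "0 no no yes yes" ;;
        zone-offline:read-only) echo "0 no no yes no" ;;
        repair:good)            echo "fixed yes yes yes yes" ;;
        *:good)                 echo "fixed yes no yes yes" ;;
        *:read-only)            echo "as-is yes no yes no" ;;
        *) return 1 ;;
    esac
}
```

For example, ``post_error_state zone-offline good`` prints ``0 no no yes yes``:
the file size is set to 0 and file accesses are disabled even though the
device zone itself remains readable and writable.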

Mount options
-------------

zonefs defines several mount options:

* errors=<behavior>
* explicit-open

"errors=<behavior>" option
~~~~~~~~~~~~~~~~~~~~~~~~~~

The "errors=<behavior>" mount option allows the user to specify zonefs
behavior in response to I/O errors, inode size inconsistencies or zone
condition changes. The defined behaviors are as follows:

* remount-ro (default)
* zone-ro
* zone-offline
* repair

The run-time I/O error actions defined for each behavior are detailed in the
previous section. Mount time I/O errors will cause the mount operation to fail.
The handling of read-only zones also differs between mount-time and run-time.
If a read-only zone is found at mount time, the zone is always treated in the
same manner as offline zones, that is, all accesses are disabled and the zone
file size set to 0. This is necessary as the write pointer of read-only zones
is defined as invalid by the ZBC and ZAC standards, making it impossible to
discover the amount of data that has been written to the zone. In the case of a
read-only zone discovered at run-time, as indicated in the previous section,
the size of the zone file is left unchanged from its last updated value.

"explicit-open" option
~~~~~~~~~~~~~~~~~~~~~~

A zoned block device (e.g. an NVMe Zoned Namespace device) may have limits on
the number of zones that can be active, that is, zones that are in the
implicit open, explicit open or closed conditions. This potential limitation
translates into a risk for applications to see write IO errors due to this
limit being exceeded if the zone of a file is not already active when a write
request is issued by the user.

To avoid these potential errors, the "explicit-open" mount option forces zones
to be made active using an open zone command when a file is opened for writing
for the first time. If the zone open command succeeds, the application is then
guaranteed that write requests can be processed. Conversely, the
"explicit-open" mount option will result in a zone close command being issued
to the device on the last close() of a zone file if the zone is not full nor
empty.
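
The two mount options can be combined on the mount command line. The sketch
below only builds and prints the command so that it can be inspected before
being run as root. The helper name zonefs_mount_cmd is hypothetical, and
/dev/sdX and /mnt are placeholders as in the examples of this document.

```shell
# zonefs_mount_cmd DEV MNT ERRORS [explicit-open]: print (do not run) a mount
# command line using the mount options described above. ERRORS must be one of
# the behaviors defined for the "errors=" option.
# Hypothetical helper, not a zonefs tool.
zonefs_mount_cmd() {
    dev=$1
    mnt=$2
    errors=$3
    extra=${4:-}
    case "$errors" in
        remount-ro|zone-ro|zone-offline|repair) ;;
        *) echo "unknown errors behavior: $errors" >&2; return 1 ;;
    esac
    opts="errors=$errors"
    if [ "$extra" = "explicit-open" ]; then
        opts="$opts,explicit-open"
    fi
    echo "mount -t zonefs -o $opts $dev $mnt"
}
```

For example, ``zonefs_mount_cmd /dev/sdX /mnt zone-ro explicit-open`` prints
``mount -t zonefs -o errors=zone-ro,explicit-open /dev/sdX /mnt``.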

Runtime sysfs attributes
------------------------

zonefs defines several sysfs attributes for mounted devices. All attributes
are user readable and can be found in the directory /sys/fs/zonefs/<dev>/,
where <dev> is the name of the mounted zoned block device.

The attributes defined are as follows.

* **max_wro_seq_files**: This attribute reports the maximum number of
  sequential zone files that can be open for writing. This number corresponds
  to the maximum number of explicitly or implicitly open zones that the device
  supports. A value of 0 means that the device has no limit and that any zone
  (any file) can be open for writing and written at any time, regardless of the
  state of other zones. When the *explicit-open* mount option is used, zonefs
  will fail any open() system call requesting to open a sequential zone file for
  writing when the number of sequential zone files already open for writing has
  reached the *max_wro_seq_files* limit.
* **nr_wro_seq_files**: This attribute reports the current number of sequential
  zone files open for writing. When the *explicit-open* mount option is used,
  this number can never exceed *max_wro_seq_files*. If the *explicit-open*
  mount option is not used, the reported number can be greater than
  *max_wro_seq_files*. In such a case, it is the responsibility of the
  application to not write simultaneously to more than *max_wro_seq_files*
  sequential zone files. Failure to do so can result in write errors.
* **max_active_seq_files**: This attribute reports the maximum number of
  sequential zone files that are in an active state, that is, sequential zone
  files that are partially written (not empty nor full) or that have a zone that
  is explicitly open (which happens only if the *explicit-open* mount option is
  used). This number is always equal to the maximum number of active zones that
  the device supports. A value of 0 means that the mounted device has no limit
  on the number of sequential zone files that can be active.
* **nr_active_seq_files**: This attribute reports the current number of
  sequential zone files that are active. If *max_active_seq_files* is not 0,
  then the value of *nr_active_seq_files* can never exceed the value of
  *max_active_seq_files*, regardless of the use of the *explicit-open* mount
  option.
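
An application that does not use the *explicit-open* mount option may want to
consult these attributes before opening another file for writing. The following
is a sketch of such a check. The function name can_open_for_write is
hypothetical, and the result is only advisory since the counters can change
between the check and the subsequent open().

```shell
# can_open_for_write SYSDIR: succeed if, according to the attributes found in
# SYSDIR, one more sequential zone file can be opened for writing. On a real
# system SYSDIR would be /sys/fs/zonefs/<dev>. Returns 2 if an attribute
# cannot be read. Hypothetical helper, not a zonefs tool.
can_open_for_write() {
    sysdir=$1
    max=$(cat "$sysdir/max_wro_seq_files") || return 2
    nr=$(cat "$sysdir/nr_wro_seq_files") || return 2
    # A maximum of 0 means that the device imposes no limit.
    [ "$max" -eq 0 ] || [ "$nr" -lt "$max" ]
}
```

The same comparison applies to the *max_active_seq_files* and
*nr_active_seq_files* pair for the active zone limit.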

Zonefs User Space Tools
=======================

The mkzonefs tool is used to format zoned block devices for use with zonefs.
This tool is available on Github at:

    https://github.com/damien-lemoal/zonefs-tools

zonefs-tools also includes a test suite which can be run against any zoned
block device, including a null_blk block device created in zoned mode.

Examples
--------

The following formats a 15TB host-managed SMR HDD with 256 MB zones
with the conventional zones aggregation feature enabled::

    # mkzonefs -o aggr_cnv /dev/sdX
    # mount -t zonefs /dev/sdX /mnt
    # ls -l /mnt/
    total 0
    dr-xr-xr-x 2 root root     1 Nov 25 13:23 cnv
    dr-xr-xr-x 2 root root 55356 Nov 25 13:23 seq

The size of each zone file sub-directory indicates the number of files
existing for each type of zones. In this example, there is only one
conventional zone file (all conventional zones are aggregated under a single
file)::

    # ls -l /mnt/cnv
    total 137101312
    -rw-r----- 1 root root 140391743488 Nov 25 13:23 0

This aggregated conventional zone file can be used as a regular file::

    # mkfs.ext4 /mnt/cnv/0
    # mount -o loop /mnt/cnv/0 /data

The "seq" sub-directory grouping files for sequential write zones has in this
example 55356 zones::

    # ls -lv /mnt/seq
    total 14511243264
    -rw-r----- 1 root root 0 Nov 25 13:23 0
    -rw-r----- 1 root root 0 Nov 25 13:23 1
    -rw-r----- 1 root root 0 Nov 25 13:23 2
    ...
    -rw-r----- 1 root root 0 Nov 25 13:23 55354
    -rw-r----- 1 root root 0 Nov 25 13:23 55355
441 | |
442 | For sequential write zone files, the file size changes as data is appended at | |
9a610812 | 443 | the end of the file, similarly to any regular file system:: |
fcb9c24b | 444 | |
9a610812 MCC |
445 | # dd if=/dev/zero of=/mnt/seq/0 bs=4096 count=1 conv=notrunc oflag=direct |
446 | 1+0 records in | |
447 | 1+0 records out | |
448 | 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.00044121 s, 9.3 MB/s | |
fcb9c24b | 449 | |
9a610812 MCC |
450 | # ls -l /mnt/seq/0 |
451 | -rw-r----- 1 root root 4096 Nov 25 13:23 /mnt/seq/0 | |
fcb9c24b DLM |
452 | |
453 | The written file can be truncated to the zone size, preventing any further | |
9a610812 | 454 | write operation:: |
fcb9c24b | 455 | |
9a610812 MCC |
456 | # truncate -s 268435456 /mnt/seq/0 |
457 | # ls -l /mnt/seq/0 | |
458 | -rw-r----- 1 root root 268435456 Nov 25 13:49 /mnt/seq/0 | |
fcb9c24b DLM |
459 | |
460 | Truncation to 0 size allows freeing the file zone storage space and restart | |
9a610812 | 461 | append-writes to the file:: |
fcb9c24b | 462 | |
9a610812 MCC |
463 | # truncate -s 0 /mnt/seq/0 |
464 | # ls -l /mnt/seq/0 | |
465 | -rw-r----- 1 root root 0 Nov 25 13:49 /mnt/seq/0 | |

Since files are statically mapped to zones on the disk, the number of blocks
of a file as reported by stat() and fstat() indicates the capacity of the file
zone::

    # stat /mnt/seq/0
      File: /mnt/seq/0
      Size: 0           Blocks: 524288     IO Block: 4096   regular empty file
    Device: 870h/2160d  Inode: 50431       Links: 1
    Access: (0640/-rw-r-----)  Uid: (    0/    root)   Gid: (    0/    root)
    Access: 2019-11-25 13:23:57.048971997 +0900
    Modify: 2019-11-25 13:52:25.553805765 +0900
    Change: 2019-11-25 13:52:25.553805765 +0900
     Birth: -

The number of blocks of the file ("Blocks") in units of 512B blocks gives the
maximum file size of 524288 * 512 B = 256 MB, corresponding to the device zone
capacity in this example. Of note is that the "IO Block" field always
indicates the minimum I/O size for writes and corresponds to the device
physical sector size.
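
The capacity computation above can be scripted. The sketch below assumes GNU
stat, whose ``%b`` format reports the block count, and assumes that this count
is in 512-byte units as in the stat output shown above. The helper names
blocks_to_bytes and zone_file_capacity are hypothetical.

```shell
# blocks_to_bytes BLOCKS: convert a "Blocks" count, in 512-byte units as in
# the stat output above, to bytes.
blocks_to_bytes() {
    echo $(( $1 * 512 ))
}

# zone_file_capacity FILE: print the maximum size of a zone file in bytes,
# derived from its block count. Assumes 512-byte units for the value reported
# by "stat -c %b". Hypothetical helper, not a zonefs tool.
zone_file_capacity() {
    blocks=$(stat -c %b "$1") || return 1
    blocks_to_bytes "$blocks"
}
```

With the values of this example, ``blocks_to_bytes 524288`` prints 268435456,
which is the size used with ``truncate -s`` earlier to transition the zone to
the full state.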