Commit | Line | Data |
---|---|---|
71bfa161 JA |
1 | Table of contents |
2 | ----------------- | |
3 | ||
4 | 1. Overview | |
5 | 2. How fio works | |
6 | 3. Running fio | |
7 | 4. Job file format | |
8 | 5. Detailed list of parameters | |
9 | 6. Normal output | |
10 | 7. Terse output | |
25c8b9d7 | 11 | 8. Trace file format |
71bfa161 JA |
12 | |
13 | 1.0 Overview and history | |
14 | ------------------------ | |
15 | fio was originally written to save me the hassle of writing special test | |
16 | case programs when I wanted to test a specific workload, either for | |
17 | performance reasons or to find/reproduce a bug. The process of writing | |
18 | such a test app can be tiresome, especially if you have to do it often. | |
19 | Hence I needed a tool that would be able to simulate a given io workload | |
20 | without resorting to writing a tailored test case again and again. | |
21 | ||
22 | A test workload is difficult to define, though. There can be any number | |
23 | of processes or threads involved, and they can each be using their own | |
24 | way of generating io. You could have someone dirtying large amounts of | |
25 | memory in a memory mapped file, or maybe several threads issuing | |
26 | reads using asynchronous io. fio needed to be flexible enough to | |
27 | simulate both of these cases, and many more. | |
28 | ||
29 | 2.0 How fio works | |
30 | ----------------- | |
31 | The first step in getting fio to simulate a desired io workload is | |
32 | writing a job file describing that specific setup. A job file may contain | |
33 | any number of threads and/or files - the typical contents of the job file | |
34 | is a global section defining shared parameters, and one or more job | |
35 | sections describing the jobs involved. When run, fio parses this file | |
36 | and sets everything up as described. If we break down a job from top to | |
37 | bottom, it contains the following basic parameters: | |
38 | ||
39 | IO type Defines the io pattern issued to the file(s). | |
40 | We may only be reading sequentially from this | |
41 | file(s), or we may be writing randomly. Or even | |
42 | mixing reads and writes, sequentially or randomly. | |
43 | ||
44 | Block size In how large chunks are we issuing io? This may be | |
45 | a single value, or it may describe a range of | |
46 | block sizes. | |
47 | ||
48 | IO size How much data are we going to be reading/writing? | |
49 | ||
50 | IO engine How do we issue io? We could be memory mapping the | |
51 | file, we could be using regular read/write, we | |
d0ff85df | 52 | could be using splice, async io, syslet, or even |
71bfa161 JA |
53 | SG (SCSI generic sg). |
54 | ||
6c219763 | 55 | IO depth If the io engine is async, how large a queuing |
71bfa161 JA |
56 | depth do we want to maintain? |
57 | ||
58 | IO type Should we be doing buffered io, or direct/raw io? | |
59 | ||
60 | Num files How many files are we spreading the workload over? | |
61 | ||
62 | Num threads How many threads or processes should we spread | |
63 | this workload over? | |
64 | ||
65 | The above are the basic parameters defined for a workload. In addition, | |
66 | there's a multitude of parameters that modify other aspects of how this | |
67 | job behaves. | |
68 | ||
69 | ||
70 | 3.0 Running fio | |
71 | --------------- | |
72 | See the README file for command line parameters; there are only a few | |
73 | of them. | |
74 | ||
75 | Running fio is normally the easiest part - you just give it the job file | |
76 | (or job files) as parameters: | |
77 | ||
78 | $ fio job_file | |
79 | ||
80 | and it will start doing what the job_file tells it to do. You can give | |
81 | more than one job file on the command line; fio will serialize the running | |
82 | of those files. Internally that is the same as using the 'stonewall' | |
83 | parameter described in the parameter section. | |
84 | ||
b4692828 JA |
85 | If the job file contains only one job, you may as well just give the |
86 | parameters on the command line. The command line parameters are identical | |
87 | to the job parameters, with a few extra that control global parameters | |
88 | (see README). For example, for the job file parameter iodepth=2, the | |
c2b1e753 JA |
89 | mirror command line option would be --iodepth 2 or --iodepth=2. You can |
90 | also use the command line for giving more than one job entry. For each | |
91 | --name option that fio sees, it will start a new job with that name. | |
92 | Command line entries following a --name entry will apply to that job, | |
93 | until there are no more entries or a new --name entry is seen. This is | |
94 | similar to the job file options, where each option applies to the current | |
95 | job until a new [] job entry is seen. | |
b4692828 | 96 | |
71bfa161 JA |
97 | fio does not need to run as root, except if the files or devices specified |
98 | in the job section require that. Some other options may also be restricted, | |
6c219763 | 99 | such as memory locking, io scheduler switching, and decreasing the nice value. |
71bfa161 JA |
100 | |
101 | ||
102 | 4.0 Job file format | |
103 | ------------------- | |
104 | As previously described, fio accepts one or more job files describing | |
105 | what it is supposed to do. The job file format is the classic ini file, | |
106 | where the names enclosed in [] brackets define the job name. You are free | |
107 | to use any ascii name you want, except 'global' which has special meaning. | |
108 | A global section sets defaults for the jobs described in that file. A job | |
109 | may override a global section parameter, and a job file may even have | |
110 | several global sections if so desired. A job is only affected by a global | |
65db0851 JA |
111 | section residing above it. If the first character in a line is a ';' or a |
112 | '#', the entire line is discarded as a comment. | |
71bfa161 | 113 | |
3c54bc46 | 114 | So let's look at a really simple job file that defines two processes, each |
b22989b9 | 115 | randomly reading from a 128MB file. |
71bfa161 JA |
116 | |
117 | ; -- start job file -- | |
118 | [global] | |
119 | rw=randread | |
120 | size=128m | |
121 | ||
122 | [job1] | |
123 | ||
124 | [job2] | |
125 | ||
126 | ; -- end job file -- | |
127 | ||
128 | As you can see, the job file sections themselves are empty as all the | |
129 | described parameters are shared. As no filename= option is given, fio | |
c2b1e753 JA |
130 | makes up a filename for each of the jobs as it sees fit. On the command |
131 | line, this job would look as follows: | |
132 | ||
133 | $ fio --name=global --rw=randread --size=128m --name=job1 --name=job2 | |
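The global-section rules described above can be sketched as follows. This is a minimal illustration only; the job names are made up, and each job sets or inherits only options that appear in this document.

```ini
; -- start job file --
[global]
rw=randread
size=128m

[reader-a]
; inherits rw=randread and size=128m from the global section above

[reader-b]
; overrides the global size for this job only
size=64m

[global]
; a second global section, it only affects jobs defined below it
rw=randwrite

[writer]
# picks up rw=randwrite from the global section directly above
size=64m
; -- end job file --
```

The two reader jobs never see rw=randwrite, since that global section resides below them.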
134 | ||
71bfa161 | 135 | |
3c54bc46 | 136 | Let's look at an example that has a number of processes writing randomly |
71bfa161 JA |
137 | to files. |
138 | ||
139 | ; -- start job file -- | |
140 | [random-writers] | |
141 | ioengine=libaio | |
142 | iodepth=4 | |
143 | rw=randwrite | |
144 | bs=32k | |
145 | direct=0 | |
146 | size=64m | |
147 | numjobs=4 | |
148 | ||
149 | ; -- end job file -- | |
150 | ||
151 | Here we have no global section, as we only have one job defined anyway. | |
152 | We want to use async io here, with a depth of 4 for each file. We also | |
b22989b9 | 153 | increase the block size used to 32KB and set numjobs to 4 to |
71bfa161 | 154 | fork 4 identical jobs. The result is 4 processes each randomly writing |
b22989b9 | 155 | to their own 64MB file. Instead of using the above job file, you could |
b4692828 JA |
156 | have given the parameters on the command line. For this case, you would |
157 | specify: | |
158 | ||
159 | $ fio --name=random-writers --ioengine=libaio --iodepth=4 --rw=randwrite --bs=32k --direct=0 --size=64m --numjobs=4 | |
71bfa161 | 160 | |
74929ac2 JA |
161 | 4.1 Environment variables |
162 | ------------------------- | |
163 | ||
3c54bc46 AC |
164 | fio also supports environment variable expansion in job files. Any |
165 | substring of the form "${VARNAME}" as part of an option value (in other | |
166 | words, on the right of the `='), will be expanded to the value of the | |
167 | environment variable called VARNAME. If no such environment variable | |
168 | is defined, or VARNAME is the empty string, the empty string will be | |
169 | substituted. | |
170 | ||
171 | As an example, let's look at a sample fio invocation and job file: | |
172 | ||
173 | $ SIZE=64m NUMJOBS=4 fio jobfile.fio | |
174 | ||
175 | ; -- start job file -- | |
176 | [random-writers] | |
177 | rw=randwrite | |
178 | size=${SIZE} | |
179 | numjobs=${NUMJOBS} | |
180 | ; -- end job file -- | |
181 | ||
182 | This will expand to the following equivalent job file at runtime: | |
183 | ||
184 | ; -- start job file -- | |
185 | [random-writers] | |
186 | rw=randwrite | |
187 | size=64m | |
188 | numjobs=4 | |
189 | ; -- end job file -- | |
190 | ||
71bfa161 JA |
191 | fio ships with a few example job files; you can also look there for |
192 | inspiration. | |
193 | ||
74929ac2 JA |
194 | 4.2 Reserved keywords |
195 | --------------------- | |
196 | ||
197 | Additionally, fio has a set of reserved keywords that will be replaced | |
198 | internally with the appropriate value. Those keywords are: | |
199 | ||
200 | $pagesize The architecture page size of the running system | |
201 | $mb_memory Megabytes of total memory in the system | |
202 | $ncpus Number of online available CPUs | |
203 | ||
204 | These can be used on the command line or in the job file, and will be | |
205 | automatically substituted with the current system values when the job | |
892a6ffc JA |
206 | is run. Simple math is also supported on these keywords, so you can |
207 | perform actions like: | |
208 | ||
209 | size=8*$mb_memory | |
210 | ||
211 | and get that properly expanded to 8 times the size of memory in the | |
212 | machine. | |
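As a small sketch of how the reserved keywords might be combined in one job (the job name is hypothetical, and the values that result depend entirely on the machine the job runs on):

```ini
; -- start job file --
[keyword-demo]
rw=randread
; one block per memory page
bs=$pagesize
; one job per online CPU
numjobs=$ncpus
; simple math on a keyword, as in the example above
size=8*$mb_memory
; -- end job file --
```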
74929ac2 | 213 | |
71bfa161 JA |
214 | |
215 | 5.0 Detailed list of parameters | |
216 | ------------------------------- | |
217 | ||
218 | This section describes in detail each parameter associated with a job. | |
219 | Some parameters take an option of a given type, such as an integer or | |
220 | a string. The following types are used: | |
221 | ||
222 | str String. This is a sequence of alpha characters. | |
b09da8fa | 223 | time Integer with possible time suffix. In seconds unless otherwise |
e417fd66 JA |
224 | specified, use eg 10m for 10 minutes. Accepts s/m/h for seconds, |
225 | minutes, and hours. | |
b09da8fa JA |
226 | int SI integer. A whole number value, which may contain a suffix |
227 | describing the base of the number. Accepted suffixes are k/m/g/t/p, | |
228 | meaning kilo, mega, giga, tera, and peta. The suffix is not case | |
57fc29fa JA |
229 | sensitive, and you may also include trailing 'b' (eg 'kb' is the same |
230 | as 'k'). So if you want to specify 4096, you could either write | |
b09da8fa | 231 | out '4096' or just give 4k. The suffixes signify base 2 values, so |
57fc29fa JA |
232 | 1024 is 1k and 1024k is 1m and so on, unless the suffix is explicitly |
233 | set to a base 10 value using 'kib', 'mib', 'gib', etc. If that is the | |
234 | case, then 1000 is used as the multiplier. This can be handy for | |
235 | disks, since manufacturers generally use base 10 values when listing | |
236 | the capacity of a drive. If the option accepts an upper and lower | |
237 | range, use a colon ':' or minus '-' to separate such values. May also | |
238 | include a prefix to indicate the number's base. If 0x is used, the number | |
239 | is assumed to be hexadecimal. See irange. | |
71bfa161 JA |
240 | bool Boolean. Usually parsed as an integer, however only defined for |
241 | true and false (1 and 0). | |
b09da8fa | 242 | irange Integer range with suffix. Allows value range to be given, such |
bf9a3edb | 243 | as 1024-4096. A colon may also be used as the separator, eg |
0c9baf91 JA |
244 | 1k:4k. If the option allows two sets of ranges, they can be |
245 | specified with a ',' or '/' delimiter: 1k-4k/8k-32k. Also see | |
f7fa2653 | 246 | int. |
83349190 | 247 | float_list A list of floating numbers, separated by a ':' character. |
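To make the value types above concrete, here is a minimal sketch of a job that uses one option of each common type. The job name is made up; the options themselves are all described in this document.

```ini
; -- start job file --
[type-demo]
; str
rw=randrw
; int with a suffix, 64m is 64 * 1024 * 1024
size=64m
; int with a 0x prefix, hexadecimal 1000 is 4096
offset=0x1000
; irange, lower and upper bound separated by a colon
bsrange=1k:4k
; time, 10m is 10 minutes
runtime=10m
; bool
direct=0
; -- end job file --
```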
71bfa161 JA |
248 | |
249 | With the above in mind, here follows the complete list of fio job | |
250 | parameters. | |
251 | ||
252 | name=str ASCII name of the job. This may be used to override the | |
253 | name printed by fio for this job. Otherwise the job | |
c2b1e753 | 254 | name is used. On the command line this parameter has the |
6c219763 | 255 | special purpose of also signaling the start of a new |
c2b1e753 | 256 | job. |
71bfa161 | 257 | |
61697c37 JA |
258 | description=str Text description of the job. Doesn't do anything except |
259 | dump this text description when this job is run. It's | |
260 | not parsed. | |
261 | ||
3776041e | 262 | directory=str Prefix filenames with this directory. Used to place files |
71bfa161 JA |
263 | in a different location than "./". |
264 | ||
265 | filename=str Fio normally makes up a filename based on the job name, | |
266 | thread number, and file number. If you want to share | |
267 | files between threads in a job or several jobs, specify | |
ed92ac0c | 268 | a filename for each of them to override the default. If |
414c2a3e | 269 | the ioengine used is 'net', the filename is the host, port, |
0fd666bf | 270 | and protocol to use in the format of =host,port,protocol. |
414c2a3e JA |
271 | See ioengine=net for more. If the ioengine is file based, you |
272 | can specify a number of files by separating the names with a | |
273 | ':' colon. So if you wanted a job to open /dev/sda and /dev/sdb | |
274 | as the two working files, you would use | |
03e20d68 | 275 | filename=/dev/sda:/dev/sdb. On Windows, disk devices are accessed |
ecc314ba BC |
276 | as \\.\PhysicalDrive0 for the first device, \\.\PhysicalDrive1 |
277 | for the second etc. If the wanted filename does need to | |
278 | include a colon, then escape that with a '\' character. | |
279 | For instance, if the filename is "/dev/dsk/foo@3,0:c", | |
280 | then you would use filename="/dev/dsk/foo@3,0\:c". | |
03e20d68 BC |
281 | '-' is a reserved name, meaning stdin or stdout. Which of the |
282 | two depends on the read/write direction set. | |
71bfa161 | 283 | |
bbf6b540 JA |
284 | opendir=str Tell fio to recursively add any file it can find in this |
285 | directory and down the file system tree. | |
286 | ||
3776041e | 287 | lockfile=str Fio defaults to not locking any files before it does |
4d4e80f2 JA |
288 | IO to them. If a file or file descriptor is shared, fio |
289 | can serialize IO to that file to make the end result | |
290 | consistent. This is useful for emulating real workloads that | |
291 | share files. The lock modes are: | |
292 | ||
293 | none No locking. The default. | |
294 | exclusive Only one thread/process may do IO, | |
295 | excluding all others. | |
296 | readwrite Read-write locking on the file. Many | |
297 | readers may access the file at the | |
298 | same time, but writes get exclusive | |
299 | access. | |
300 | ||
301 | The option may be post-fixed with a lock batch number. If | |
302 | set, then each thread/process may do that amount of IOs to | |
bf9a3edb | 303 | the file before giving up the lock. Since lock acquisition is |
4d4e80f2 | 304 | expensive, batching the lock/unlocks will speed up IO. |
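A minimal sketch of two jobs sharing one file with read-write locking could look like this. The path and job names are assumptions for illustration, and the lock batch postfix is left out on purpose.

```ini
; -- start job file --
[global]
; hypothetical path, shared by both jobs below
filename=/tmp/fio-shared.dat
size=64m
lockfile=readwrite

[shared-reader]
rw=randread

[shared-writer]
rw=randwrite
; -- end job file --
```

With lockfile=readwrite, readers may overlap with each other, while each write gets exclusive access to the file.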
29c1349f | 305 | |
d3aad8f2 | 306 | readwrite=str |
71bfa161 JA |
307 | rw=str Type of io pattern. Accepted values are: |
308 | ||
309 | read Sequential reads | |
310 | write Sequential writes | |
311 | randwrite Random writes | |
312 | randread Random reads | |
313 | rw Sequential mixed reads and writes | |
314 | randrw Random mixed reads and writes | |
315 | ||
316 | For the mixed io types, the default is to split them 50/50. | |
317 | For certain types of io the result may still be skewed a bit, | |
211097b2 | 318 | since the speed may be different. It is possible to specify |
38dad62d JA |
319 | a number of IO's to do before getting a new offset; this is |
320 | done by appending a ':<nr>' to the end of the string given. | |
321 | For a random read, it would look like 'rw=randread:8' for | |
059b0802 JA |
322 | passing in an offset modifier with a value of 8. If the |
323 | postfix is used with a sequential IO pattern, then the value | |
324 | specified will be added to the generated offset for each IO. | |
325 | For instance, using rw=write:4k will skip 4k for every | |
326 | write. It turns sequential IO into sequential IO with holes. | |
327 | See the 'rw_sequencer' option. | |
38dad62d JA |
328 | |
329 | rw_sequencer=str If an offset modifier is given by appending a number to | |
330 | the rw=<str> line, then this option controls how that | |
331 | number modifies the IO offset being generated. Accepted | |
332 | values are: | |
333 | ||
334 | sequential Generate sequential offset | |
335 | identical Generate the same offset | |
336 | ||
337 | 'sequential' is only useful for random IO, where fio would | |
338 | normally generate a new random offset for every IO. If you | |
339 | append eg 8 to randread, you would get a new random offset for | |
211097b2 JA |
340 | every 8 IO's. The result would be a seek for only every 8 |
341 | IO's, instead of for every IO. Use rw=randread:8 to specify | |
38dad62d JA |
342 | that. As sequential IO is already sequential, setting |
343 | 'sequential' for that would not result in any differences. | |
344 | 'identical' behaves in a similar fashion, except it sends | |
345 | the same offset 8 times before generating a new | |
346 | offset. | |
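The offset modifier and rw_sequencer can be sketched with two small jobs. The job names are made up; the values mirror the ones used in the text above.

```ini
; -- start job file --
[seek-every-8]
; a new random offset for every 8 IOs, sequential in between
rw=randread:8
rw_sequencer=sequential
size=128m

[write-with-holes]
; 4k is added to the offset after each write, leaving holes
rw=write:4k
bs=4k
size=128m
; -- end job file --
```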
71bfa161 | 347 | |
90fef2d1 JA |
348 | kb_base=int The base unit for a kilobyte. The de facto base is 2^10, 1024. |
349 | Storage manufacturers like to use 10^3 or 1000 as a base | |
350 | ten unit instead, for obvious reasons. Allowed values are | |
351 | 1024 or 1000, with 1024 being the default. | |
352 | ||
ee738499 JA |
353 | randrepeat=bool For random IO workloads, seed the generator in a predictable |
354 | way so that results are repeatable across repetitions. | |
355 | ||
2615cc4b JA |
356 | use_os_rand=bool Fio can either use the random generator supplied by the OS |
357 | to generate random offsets, or it can use its own internal | |
358 | generator (based on Tausworthe). Default is to use the | |
359 | internal generator, which is often of better quality and | |
360 | faster. | |
361 | ||
a596f047 EG |
362 | fallocate=str Whether pre-allocation is performed when laying down files. |
363 | Accepted values are: | |
364 | ||
365 | none Do not pre-allocate space | |
366 | posix Pre-allocate via posix_fallocate() | |
367 | keep Pre-allocate via fallocate() with | |
368 | FALLOC_FL_KEEP_SIZE set | |
369 | 0 Backward-compatible alias for 'none' | |
370 | 1 Backward-compatible alias for 'posix' | |
371 | ||
372 | May not be available on all supported platforms. 'keep' is only | |
373 | available on Linux. If using ZFS on Solaris, this must be set to | |
374 | 'none' because ZFS doesn't support it. Default: 'posix'. | |
7bc8c2cf | 375 | |
d2f3ac35 JA |
376 | fadvise_hint=bool By default, fio will use fadvise() to advise the kernel |
377 | on what IO patterns it is likely to issue. Sometimes you | |
378 | want to test specific IO patterns without telling the | |
379 | kernel about it, in which case you can disable this option. | |
380 | If set, fio will use POSIX_FADV_SEQUENTIAL for sequential | |
381 | IO and POSIX_FADV_RANDOM for random IO. | |
382 | ||
f7fa2653 | 383 | size=int The total size of file io for this job. Fio will run until |
7616cafe JA |
384 | this many bytes have been transferred, unless runtime is |
385 | limited by other options (such as 'runtime', for instance). | |
3776041e | 386 | Unless specific nrfiles and filesize options are given, |
7616cafe | 387 | fio will divide this size between the available files |
d6667268 JA |
388 | specified by the job. If not set, fio will use the full |
389 | size of the given files or devices. If the files | |
7bb59102 JA |
390 | do not exist, size must be given. It is also possible to |
391 | give size as a percentage between 1 and 100. If size=20% | |
392 | is given, fio will use 20% of the full size of the given | |
393 | files or devices. | |
71bfa161 | 394 | |
f7fa2653 | 395 | filesize=int Individual file sizes. May be a range, in which case fio |
9c60ce64 JA |
396 | will select sizes for files at random within the given range |
397 | and limited to 'size' in total (if that is given). If not | |
398 | given, each created file is the same size. | |
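As a rough sketch of how size, nrfiles and filesize interact (job names are hypothetical, and the per-file sizes in the comments are what the description above suggests rather than verified output):

```ini
; -- start job file --
[even-split]
rw=write
nrfiles=4
; 64m divided between 4 files, so about 16m per file
size=64m

[random-sizes]
rw=write
nrfiles=4
; each file gets a random size within this range,
; limited to 'size' in total
filesize=4m-16m
size=64m
; -- end job file --
```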
399 | ||
74586c1e JA |
400 | fill_device=bool |
401 | fill_fs=bool Sets size to something really large and waits for ENOSPC (no | |
aa31f1f1 | 402 | space left on device) as the terminating condition. Only makes |
3ce9dcaf | 403 | sense with sequential write. For a read workload, the mount |
4f12432e JA |
404 | point will be filled first then IO started on the result. This |
405 | option doesn't make sense if operating on a raw device node, | |
406 | since the size of that is already known by the file system. | |
407 | Additionally, writing beyond end-of-device will not return | |
408 | ENOSPC there. | |
aa31f1f1 | 409 | |
f7fa2653 JA |
410 | blocksize=int |
411 | bs=int The block size used for the io units. Defaults to 4k. Values | |
412 | can be given for both read and writes. If a single int is | |
413 | given, it will apply to both. If a second int is specified | |
f90eff5a JA |
414 | after a comma, it will apply to writes only. In other words, |
415 | the format is either bs=read_and_write or bs=read,write. | |
416 | bs=4k,8k will thus use 4k blocks for reads, and 8k blocks | |
787f7e95 JA |
417 | for writes. If you only wish to set the write size, you |
418 | can do so by passing an empty read size - bs=,8k will set | |
419 | 8k for writes and leave the read default value. | |
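The read,write form of bs= can be sketched like this (job names are made up for illustration):

```ini
; -- start job file --
[split-bs]
rw=randrw
size=64m
; 4k blocks for reads, 8k blocks for writes
bs=4k,8k

[write-bs-only]
rw=randrw
size=64m
; empty read size: reads keep the default, writes use 8k
bs=,8k
; -- end job file --
```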
a00735e6 | 420 | |
2b7a01d0 JA |
421 | blockalign=int |
422 | ba=int At what boundary to align random IO offsets. Defaults to | |
423 | the same as 'blocksize', the minimum blocksize given. | |
424 | Minimum alignment is typically 512b for using direct IO, | |
425 | though it usually depends on the hardware block size. This | |
426 | option is mutually exclusive with using a random map for | |
427 | files, so it will turn off that option. | |
428 | ||
d3aad8f2 | 429 | blocksize_range=irange |
71bfa161 JA |
430 | bsrange=irange Instead of giving a single block size, specify a range |
431 | and fio will mix the issued io block sizes. The issued | |
432 | io unit will always be a multiple of the minimum value | |
f90eff5a JA |
433 | given (also see bs_unaligned). Applies to both reads and |
434 | writes, however a second range can be given after a comma. | |
435 | See bs=. | |
a00735e6 | 436 | |
564ca972 JA |
437 | bssplit=str Sometimes you want even finer grained control of the |
438 | block sizes issued, not just an even split between them. | |
439 | This option allows you to weight various block sizes, | |
440 | so that you are able to define a specific amount of | |
441 | block sizes issued. The format for this option is: | |
442 | ||
443 | bssplit=blocksize/percentage:blocksize/percentage | |
444 | ||
445 | for as many block sizes as needed. So if you want to define | |
446 | a workload that has 50% 64k blocks, 10% 4k blocks, and | |
447 | 40% 32k blocks, you would write: | |
448 | ||
449 | bssplit=4k/10:64k/50:32k/40 | |
450 | ||
451 | Ordering does not matter. If the percentage is left blank, | |
452 | fio will fill in the remaining values evenly. So a bssplit | |
453 | option like this one: | |
454 | ||
455 | bssplit=4k/50:1k/:32k/ | |
456 | ||
457 | would have 50% 4k ios, and 25% 1k and 32k ios. The percentages | |
458 | always add up to 100; if bssplit is given a range that adds | |
459 | up to more, it will error out. | |
460 | ||
720e84ad JA |
461 | bssplit also supports giving separate splits to reads and |
462 | writes. The format is identical to what bs= accepts. You | |
463 | have to separate the read and write parts with a comma. So | |
464 | if you want a workload that has 50% 2k reads and 50% 4k reads, | |
465 | while having 90% 4k writes and 10% 8k writes, you would | |
466 | specify: | |
467 | ||
468 | bssplit=2k/50:4k/50,4k/90:8k/10 | |
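As one more sketch of the blank-percentage rule described above (hypothetical job name): 60% is given explicitly, so the remaining 40% should be spread evenly over the two entries left blank.

```ini
; -- start job file --
[split-demo]
rw=randread
size=64m
; 60% 4k ios, then 20% 8k ios and 20% 16k ios
bssplit=4k/60:8k/:16k/
; -- end job file --
```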
469 | ||
d3aad8f2 | 470 | blocksize_unaligned |
690adba3 JA |
471 | bs_unaligned If this option is given, any byte size value within bsrange |
472 | may be used as a block size. This typically won't work with | |
473 | direct IO, as that normally requires sector alignment. | |
71bfa161 | 474 | |
e9459e5a JA |
475 | zero_buffers If this option is given, fio will init the IO buffers to |
476 | all zeroes. The default is to fill them with random data. | |
477 | ||
5973cafb JA |
478 | refill_buffers If this option is given, fio will refill the IO buffers |
479 | on every submit. The default is to only fill it at init | |
480 | time and reuse that data. Only makes sense if zero_buffers | |
41ccd845 JA |
481 | isn't specified, naturally. If data verification is enabled, |
482 | refill_buffers is also automatically enabled. | |
5973cafb | 483 | |
fd68418e JA |
484 | scramble_buffers=bool If refill_buffers is too costly and the target is |
485 | using data deduplication, then setting this option will | |
486 | slightly modify the IO buffer contents to defeat normal | |
487 | de-dupe attempts. This is not enough to defeat more clever | |
488 | block compression attempts, but it will stop naive dedupe of | |
489 | blocks. Default: true. | |
490 | ||
71bfa161 JA |
491 | nrfiles=int Number of files to use for this job. Defaults to 1. |
492 | ||
390b1537 JA |
493 | openfiles=int Number of files to keep open at the same time. Defaults to |
494 | the same as nrfiles, can be set smaller to limit the number | |
495 | of simultaneous opens. | |
496 | ||
5af1c6f3 JA |
497 | file_service_type=str Defines how fio decides which file from a job to |
498 | service next. The following types are defined: | |
499 | ||
500 | random Just choose a file at random. | |
501 | ||
502 | roundrobin Round robin over open files. This | |
503 | is the default. | |
504 | ||
a086c257 JA |
505 | sequential Finish one file before moving on to |
506 | the next. Multiple files can still be | |
507 | open depending on 'openfiles'. | |
508 | ||
1907dbc6 JA |
509 | The string can have a number appended, indicating how |
510 | often to switch to a new file. So if option random:4 is | |
511 | given, fio will switch to a new random file after 4 ios | |
512 | have been issued. | |
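The file selection options above might be combined as in this minimal sketch (the job name and the specific counts are assumptions chosen for illustration):

```ini
; -- start job file --
[file-picker]
rw=randread
size=64m
; spread the job over 8 files, but keep at most 4 open at once
nrfiles=8
openfiles=4
; switch to a new random file after every 4 ios
file_service_type=random:4
; -- end job file --
```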
513 | ||
71bfa161 JA |
514 | ioengine=str Defines how the job issues io to the file. The following |
515 | types are defined: | |
516 | ||
517 | sync Basic read(2) or write(2) io. lseek(2) is | |
518 | used to position the io location. | |
519 | ||
a31041ea | 520 | psync Basic pread(2) or pwrite(2) io. |
521 | ||
e05af9e5 | 522 | vsync Basic readv(2) or writev(2) IO. |
1d2af02a | 523 | |
15d182aa JA |
524 | libaio Linux native asynchronous io. Note that Linux |
525 | may only support queued behaviour with | |
526 | non-buffered IO (set direct=1 or buffered=0). | |
c44b1ff5 JA |
527 | This engine also has a sub-option, |
528 | userspace_reap. To set it, use | |
529 | ioengine=libaio:userspace_reap. Normally, with | |
530 | the libaio engine in use, fio will use the | |
531 | io_getevents system call to reap newly returned | |
532 | events. With this flag turned on, the AIO ring | |
533 | will be read directly from user-space to reap | |
534 | events. The reaping mode is only enabled when | |
535 | polling for a minimum of 0 events (eg when | |
536 | iodepth_batch_complete=0). | |
71bfa161 JA |
537 | |
538 | posixaio glibc posix asynchronous io. | |
539 | ||
417f0068 JA |
540 | solarisaio Solaris native asynchronous io. |
541 | ||
03e20d68 BC |
542 | windowsaio Windows native asynchronous io. |
543 | ||
71bfa161 JA |
544 | mmap File is memory mapped and data copied |
545 | to/from using memcpy(3). | |
546 | ||
547 | splice splice(2) is used to transfer the data and | |
548 | vmsplice(2) to transfer data from user | |
549 | space to the kernel. | |
550 | ||
d0ff85df JA |
551 | syslet-rw Use the syslet system calls to make |
552 | regular read/write async. | |
553 | ||
71bfa161 | 554 | sg SCSI generic sg v3 io. May either be |
6c219763 | 555 | synchronous using the SG_IO ioctl, or if |
71bfa161 JA |
556 | the target is an sg character device |
557 | we use read(2) and write(2) for asynchronous | |
558 | io. | |
559 | ||
a94ea28b JA |
560 | null Doesn't transfer any data, just pretends |
561 | to. This is mainly used to exercise fio | |
562 | itself and for debugging/testing purposes. | |
563 | ||
ed92ac0c JA |
564 | net Transfer over the network to given host:port. |
565 | 'filename' must be set appropriately to | |
414c2a3e | 566 | filename=host/port/protocol regardless of send |
ed92ac0c | 567 | or receive; if the latter, only the port |
414c2a3e JA |
568 | argument is used. 'host' may be an IP address |
569 | or hostname, port is the port number to be used, | |
570 | and protocol may be 'udp' or 'tcp'. If no | |
571 | protocol is given, TCP is used. | |
ed92ac0c | 572 | |
9cce02e8 JA |
573 | netsplice Like net, but uses splice/vmsplice to |
574 | map data and send/receive. | |
575 | ||
53aec0a4 | 576 | cpuio Doesn't transfer any data, but burns CPU |
ba0fbe10 JA |
577 | cycles according to the cpuload= and |
578 | cpucycle= options. Setting cpuload=85 | |
579 | will cause that job to do nothing but burn | |
36ecec83 GP |
580 | 85% of the CPU. In case of SMP machines, |
581 | use numjobs=<no_of_cpu> to get desired CPU | |
582 | usage, as the cpuload only loads a single | |
583 | CPU at the desired rate. | |
ba0fbe10 | 584 | |
e9a1806f JA |
585 | guasi The GUASI IO engine is the Generic Userspace |
586 | Asynchronous Syscall Interface approach | |
587 | to async IO. See | |
588 | ||
589 | http://www.xmailserver.org/guasi-lib.html | |
590 | ||
591 | for more info on GUASI. | |
592 | ||
21b8aee8 | 593 | rdma The RDMA I/O engine supports both RDMA |
eb52fa3f BVA |
594 | memory semantics (RDMA_WRITE/RDMA_READ) and |
595 | channel semantics (Send/Recv) for the | |
596 | InfiniBand, RoCE and iWARP protocols. | |
21b8aee8 | 597 | |
8a7bd877 JA |
598 | external Prefix to specify loading an external |
599 | IO engine object file. Append the engine | |
600 | filename, eg ioengine=external:/tmp/foo.o | |
601 | to load ioengine foo.o in /tmp. | |
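Selecting an engine is a one-line change in the job. As a small sketch with made-up job names, the first job below exercises fio itself through the null engine, and the second issues plain pread(2) calls; stonewall is used so that the second job waits for the first.

```ini
; -- start job file --
[null-test]
; no data is transferred, useful for testing fio itself
ioengine=null
rw=randread
bs=4k
size=1g

[psync-read]
; wait for the job above to finish before starting
stonewall
ioengine=psync
rw=read
bs=4k
size=64m
; -- end job file --
```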
602 | ||
71bfa161 JA |
603 | iodepth=int This defines how many io units to keep in flight against |
604 | the file. The default is 1 for each file defined in this | |
605 | job, can be overridden with a larger value for higher | |
ee72ca09 JA |
606 | concurrency. Note that increasing iodepth beyond 1 will not |
607 | affect synchronous ioengines (except for small degrees when | |
9b836561 | 608 | verify_async is in use). Even async engines may impose OS |
ee72ca09 JA |
609 | restrictions causing the desired depth not to be achieved. |
610 | This may happen on Linux when using libaio and not setting | |
611 | direct=1, since buffered IO is not async on that OS. Keep an | |
612 | eye on the IO depth distribution in the fio output to verify | |
613 | that the achieved depth is as expected. Default: 1. | |
71bfa161 | 614 | |
4950421a | 615 | iodepth_batch_submit=int |
cb5ab512 | 616 | iodepth_batch=int This defines how many pieces of IO to submit at once. |
89e820f6 JA |
617 | It defaults to 1 which means that we submit each IO |
618 | as soon as it is available, but can be raised to submit | |
619 | bigger batches of IO at a time. | |
cb5ab512 | 620 | |
4950421a JA |
621 | iodepth_batch_complete=int This defines how many pieces of IO to retrieve |
622 | at once. It defaults to 1 which means that we'll ask | |
623 | for a minimum of 1 IO in the retrieval process from | |
624 | the kernel. The IO retrieval will go on until we | |
625 | hit the limit set by iodepth_low. If this variable is | |
626 | set to 0, then fio will always check for completed | |
627 | events before queuing more IO. This helps reduce | |
628 | IO latency, at the cost of more retrieval system calls. | |
629 | ||
e916b390 JA |
630 | iodepth_low=int The low water mark indicating when to start filling |
631 | the queue again. Defaults to the same as iodepth, meaning | |
632 | that fio will attempt to keep the queue full at all times. | |
633 | If iodepth is set to eg 16 and iodepth_low is set to 4, then | |
634 | after fio has filled the queue of 16 requests, it will let | |
635 | the depth drain down to 4 before starting to fill it again. | |
636 | ||
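The queue depth options above could be combined roughly as follows. This is
only a sketch: the job name, path and the batch values of 8 and 4 are
arbitrary assumptions, picked to mirror the 16/4 example in the text.

```ini
; Hypothetical job: name, path, sizes and batch values are placeholders.
[batched-queue]
ioengine=libaio
direct=1
rw=randwrite
bs=4k
size=1g
filename=/path/to/testfile
; Fill the queue up to 16 IOs in flight.
iodepth=16
; Let the queue drain down to 4 before refilling it.
iodepth_low=4
; Submit up to 8 IOs per submission.
iodepth_batch=8
; Ask for at least 4 completions per retrieval.
iodepth_batch_complete=4
```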
71bfa161 | 637 | direct=bool If value is true, use non-buffered io. This is usually |
9b836561 | 638 | O_DIRECT. Note that ZFS on Solaris doesn't support direct io. |
76a43db4 JA |
639 | |
640 | buffered=bool If value is true, use buffered io. This is the opposite | |
641 | of the 'direct' option. Defaults to true. | |
71bfa161 | 642 | |
f7fa2653 | 643 | offset=int Start io at the given offset in the file. The data before |
71bfa161 JA |
644 | the given offset will not be touched. This effectively |
645 | caps the file size at real_size - offset. | |
646 | ||
647 | fsync=int If writing to a file, issue a sync of the dirty data | |
648 | for every number of blocks given. For example, if you give | |
649 | 32 as a parameter, fio will sync the file for every 32 | |
650 | writes issued. If fio is using non-buffered io, we may | |
651 | not sync the file. The exception is the sg io engine, which | |
6c219763 | 652 | synchronizes the disk cache anyway. |
71bfa161 | 653 | |
e76b1da4 | 654 | fdatasync=int Like fsync= but uses fdatasync() to only sync data and not |
5f9099ea | 655 | metadata blocks. |
e72fa4d4 JA |
656 | FreeBSD has no fdatasync(), so this falls back to |
657 | using fsync(). | |
5f9099ea | 658 | |
e76b1da4 JA |
659 | sync_file_range=str:val Use sync_file_range() for every 'val' number of |
660 | write operations. Fio will track range of writes that | |
661 | have happened since the last sync_file_range() call. 'str' | |
662 | can currently be one or more of: | |
663 | ||
664 | wait_before SYNC_FILE_RANGE_WAIT_BEFORE | |
665 | write SYNC_FILE_RANGE_WRITE | |
666 | wait_after SYNC_FILE_RANGE_WAIT_AFTER | |
667 | ||
668 | So if you do sync_file_range=wait_before,write:8, fio would | |
669 | use SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE for | |
670 | every 8 writes. Also see the sync_file_range(2) man page. | |
671 | This option is Linux specific. | |
672 | ||
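The sync_file_range example from the text might appear in a job file like
this. The job name, path, block size and file size are illustrative
assumptions only.

```ini
; Hypothetical job: name, path and sizes are placeholders.
[sfr-write]
rw=write
bs=64k
size=256m
filename=/path/to/testfile
; Every 8 writes, call sync_file_range() with
; SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE.
sync_file_range=wait_before,write:8
```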
5036fc1e JA |
673 | overwrite=bool If true, writes to a file will always overwrite existing |
674 | data. If the file doesn't already exist, it will be | |
675 | created before the write phase begins. If the file exists | |
676 | and is large enough for the specified write phase, nothing | |
677 | will be done. | |
71bfa161 JA |
678 | |
679 | end_fsync=bool If true, fsync file contents when the job exits. | |
680 | ||
ebb1415f JA |
681 | fsync_on_close=bool If true, fio will fsync() a dirty file on close. |
682 | This differs from end_fsync in that it will happen on every | |
683 | file close, not just at the end of the job. | |
684 | ||
71bfa161 JA |
685 | rwmixread=int How large a percentage of the mix should be reads. |
686 | ||
687 | rwmixwrite=int How large a percentage of the mix should be writes. If both | |
688 | rwmixread and rwmixwrite are given and the values do not add |
689 | up to 100%, the latter of the two will be used to override | |
c35dd7a6 JA |
690 | the first. This may interfere with a given rate setting, |
691 | if fio is asked to limit reads or writes to a certain rate. | |
692 | If that is the case, then the distribution may be skewed. | |
71bfa161 | 693 | |
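A mixed workload using the options above could look like the following
sketch. Only rwmixread is given here, so the two options cannot conflict;
the job name, path and sizes are placeholders.

```ini
; Hypothetical job: name, path and sizes are placeholders.
[mixed-7030]
rw=randrw
; 70% of the mix should be reads, leaving 30% writes.
rwmixread=70
bs=4k
size=512m
filename=/path/to/testfile
```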
bb8895e0 JA |
694 | norandommap Normally fio will cover every block of the file when doing |
695 | random IO. If this option is given, fio will just get a | |
696 | new random offset without looking at past io history. This | |
697 | means that some blocks may not be read or written, and that | |
698 | some blocks may be read/written more than once. This option | |
8347239a JA |
699 | is mutually exclusive with verify= if and only if multiple |
700 | blocksizes (via bsrange=) are used, since fio only tracks | |
701 | complete rewrites of blocks. | |
bb8895e0 | 702 | |
0408c206 JA |
703 | softrandommap=bool See norandommap. If fio runs with the random block map |
704 | enabled and it fails to allocate the map, if this option is | |
705 | set it will continue without a random block map. As coverage | |
706 | will not be as complete as with random maps, this option is | |
2b386d25 JA |
707 | disabled by default. |
708 | ||
71bfa161 JA |
709 | nice=int Run the job with the given nice value. See man nice(2). |
710 | ||
711 | prio=int Set the io priority value of this job. Linux limits us to | |
712 | a value between 0 and 7, with 0 being the highest. |
713 | See man ionice(1). | |
714 | ||
715 | prioclass=int Set the io priority class. See man ionice(1). | |
716 | ||
717 | thinktime=int Stall the job x microseconds after an io has completed before | |
718 | issuing the next. May be used to simulate processing being | |
48097d5c JA |
719 | done by an application. See thinktime_blocks and |
720 | thinktime_spin. | |
721 | ||
722 | thinktime_spin=int | |
723 | Only valid if thinktime is set - pretend to spend CPU time | |
724 | doing something with the data received, before falling back | |
725 | to sleeping for the rest of the period specified by | |
726 | thinktime. | |
9c1f7434 JA |
727 | |
728 | thinktime_blocks | |
729 | Only valid if thinktime is set - control how many blocks | |
730 | to issue, before waiting 'thinktime' usecs. If not set, | |
731 | defaults to 1 which will make fio wait 'thinktime' usecs | |
732 | after every block. | |
71bfa161 | 733 | |
581e7141 | 734 | rate=int Cap the bandwidth used by this job. The number is in bytes/sec, |
b09da8fa | 735 | the normal suffix rules apply. You can use rate=500k to limit |
581e7141 JA |
736 | reads and writes to 500k each, or you can specify read and |
737 | writes separately. Using rate=1m,500k would limit reads to | |
738 | 1MB/sec and writes to 500KB/sec. Capping only reads or | |
739 | writes can be done with rate=,500k or rate=500k,. The former | |
740 | will only limit writes (to 500KB/sec), the latter will only | |
741 | limit reads. | |
71bfa161 JA |
742 | |
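The separate read and write caps described above can be sketched as a job
file. The job name, path and sizes are assumptions for illustration.

```ini
; Hypothetical job: name, path and sizes are placeholders.
[capped-mixed]
rw=rw
bs=64k
size=1g
filename=/path/to/testfile
; Limit reads to 1MB/sec and writes to 500KB/sec.
rate=1m,500k
```

Using rate=,500k instead would leave reads uncapped and limit only writes.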
743 | ratemin=int Tell fio to do whatever it can to maintain at least this | |
4e991c23 | 744 | bandwidth. Failing to meet this requirement will cause |
581e7141 JA |
745 | the job to exit. The same format as rate is used for |
746 | read vs write separation. | |
4e991c23 JA |
747 | |
748 | rate_iops=int Cap the bandwidth to this number of IOPS. Basically the same | |
749 | as rate, just specified independently of bandwidth. If the | |
750 | job is given a block size range instead of a fixed value, | |
581e7141 JA |
751 | the smallest block size is used as the metric. The same format |
752 | as rate is used for read vs write separation. | |
4e991c23 JA |
753 | |
754 | rate_iops_min=int If fio doesn't meet this rate of IO, it will cause | |
581e7141 JA |
755 | the job to exit. The same format as rate is used for read vs |
756 | write separation. | |
71bfa161 JA |
757 | |
758 | ratecycle=int Average bandwidth for 'rate' and 'ratemin' over this number | |
6c219763 | 759 | of milliseconds. |
71bfa161 JA |
760 | |
761 | cpumask=int Set the CPU affinity of this job. The parameter given is a | |
a08bc17f JA |
762 | bitmask of allowed CPUs the job may run on. So if you want |
763 | the allowed CPUs to be 1 and 5, you would pass the decimal | |
764 | value of (1 << 1 | 1 << 5), or 34. See man | |
7dbb6eba | 765 | sched_setaffinity(2). This may not work on all supported |
b0ea08ce JA |
766 | operating systems or kernel versions. This option doesn't |
767 | work well for a higher CPU count than what you can store in | |
768 | an integer mask, so it can only control cpus 1-32. For | |
769 | boxes with larger CPU counts, use cpus_allowed. | |
71bfa161 | 770 | |
d2e268b0 JA |
771 | cpus_allowed=str Controls the same options as cpumask, but it allows a text |
772 | setting of the permitted CPUs instead. So to use CPUs 1 and | |
62a7273d JA |
773 | 5, you would specify cpus_allowed=1,5. This option also |
774 | allows a range of CPUs. Say you wanted a binding to CPUs | |
775 | 1, 5, and 8-15, you would set cpus_allowed=1,5,8-15. | |
d2e268b0 | 776 | |
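The two affinity options above should express the same binding in the
sketch below: 34 is (1 << 1 | 1 << 5), that is 2 + 32. Job names, path and
sizes are placeholders.

```ini
; Hypothetical jobs: names, path and sizes are placeholders.
[global]
rw=randread
bs=4k
size=256m
filename=/path/to/testfile

[pinned-by-mask]
; Bitmask form: bits 1 and 5 set, decimal 34.
cpumask=34

[pinned-by-list]
; Text form of the same set of CPUs.
cpus_allowed=1,5
```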
e417fd66 | 777 | startdelay=time Start this job the specified number of seconds after fio |
71bfa161 JA |
778 | has started. Only useful if the job file contains several |
779 | jobs, and you want to delay starting some jobs to a certain | |
780 | time. | |
781 | ||
e417fd66 | 782 | runtime=time Tell fio to terminate processing after the specified number |
71bfa161 JA |
783 | of seconds. It can be quite hard to determine for how long |
784 | a specified job will run, so this parameter is handy to | |
785 | cap the total runtime to a given time. | |
786 | ||
cf4464ca | 787 | time_based If set, fio will run for the duration of the runtime |
bf9a3edb | 788 | specified even if the file(s) are completely read or |
cf4464ca JA |
789 | written. It will simply loop over the same workload |
790 | as many times as the runtime allows. | |
791 | ||
e417fd66 | 792 | ramp_time=time If set, fio will run the specified workload for this amount |
721938ae JA |
793 | of time before logging any performance numbers. Useful for |
794 | letting performance settle before logging results, thus | |
b29ee5b3 JA |
795 | minimizing the runtime required for stable results. Note |
796 | that the ramp_time is considered lead in time for a job, | |
797 | thus it will increase the total runtime if a special timeout | |
798 | or runtime is specified. | |
721938ae | 799 | |
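The timing options above could be combined as in this sketch. The 60 and
10 second values, the job name and the path are arbitrary assumptions.

```ini
; Hypothetical job: name, path, sizes and durations are placeholders.
[steady-state]
rw=randread
bs=4k
size=512m
filename=/path/to/testfile
; Stop after 60 seconds of measured run time.
runtime=60
; Keep looping over the file if it is fully read before runtime is up.
time_based
; Run for 10 seconds before logging any performance numbers.
; As noted above, this lead-in time adds to the total run time.
ramp_time=10
```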
71bfa161 JA |
800 | invalidate=bool Invalidate the buffer/page cache parts for this file prior |
801 | to starting io. Defaults to true. | |
802 | ||
803 | sync=bool Use sync io for buffered writes. For the majority of the | |
804 | io engines, this means using O_SYNC. | |
805 | ||
d3aad8f2 | 806 | iomem=str |
71bfa161 JA |
807 | mem=str Fio can use various types of memory as the io unit buffer. |
808 | The allowed values are: | |
809 | ||
810 | malloc Use memory from malloc(3) as the buffers. | |
811 | ||
812 | shm Use shared memory as the buffers. Allocated | |
813 | through shmget(2). | |
814 | ||
74b025b0 JA |
815 | shmhuge Same as shm, but use huge pages as backing. |
816 | ||
313cb206 JA |
817 | mmap Use mmap to allocate buffers. May either be |
818 | anonymous memory, or can be file backed if | |
819 | a filename is given after the option. The | |
820 | format is mem=mmap:/path/to/file. | |
71bfa161 | 821 | |
d0bdaf49 JA |
822 | mmaphuge Use a memory mapped huge file as the buffer |
823 | backing. Append filename after mmaphuge, ala | |
824 | mem=mmaphuge:/hugetlbfs/file | |
825 | ||
71bfa161 | 826 | The area allocated is a function of the maximum allowed |
5394ae5f JA |
827 | bs size for the job, multiplied by the io depth given. Note |
828 | that for shmhuge and mmaphuge to work, the system must have | |
829 | free huge pages allocated. This can normally be checked | |
830 | and set by reading/writing /proc/sys/vm/nr_hugepages on a | |
b22989b9 | 831 | Linux system. Fio assumes a huge page is 4MB in size. So |
5394ae5f JA |
832 | to calculate the number of huge pages you need for a given |
833 | job file, add up the io depth of all jobs (normally one unless | |
834 | iodepth= is used) and multiply by the maximum bs set. Then | |
835 | divide that number by the huge page size. You can see the | |
836 | size of the huge pages in /proc/meminfo. If no huge pages | |
837 | are allocated by having a non-zero number in nr_hugepages, | |
56bb17f2 | 838 | using mmaphuge or shmhuge will fail. Also see hugepage-size. |
5394ae5f JA |
839 | |
840 | mmaphuge also needs to have hugetlbfs mounted and the file | |
841 | location should point there. So if it's mounted in /huge, | |
842 | you would use mem=mmaphuge:/huge/somefile. | |
71bfa161 | 843 | |
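The huge page calculation described above can be worked through for a
single job. This sketch assumes the 4MB huge page size that the text says
fio uses by default; the job name and path are placeholders.

```ini
; Hypothetical job: name, path and sizes are placeholders.
; Buffer space needed: iodepth 16 x bs 1MB = 16MB.
; Huge pages needed:   16MB / 4MB per page = 4 pages,
; so nr_hugepages must be at least 4 for this job.
[hugebuf-read]
ioengine=libaio
direct=1
iodepth=16
rw=read
bs=1m
size=1g
filename=/path/to/testfile
mem=shmhuge
```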
d529ee19 JA |
844 | iomem_align=int This indicates the memory alignment of the IO memory buffers. |
845 | Note that the given alignment is applied to the first IO unit |
846 | buffer; if using iodepth, the alignment of the following buffers |
847 | is given by the bs used. In other words, if using a bs that is |
848 | a multiple of the page size of the system, all buffers will |
849 | be aligned to this value. If using a bs that is not page | |
850 | aligned, the alignment of subsequent IO memory buffers is the | |
851 | sum of the iomem_align and bs used. | |
852 | ||
f7fa2653 | 853 | hugepage-size=int |
56bb17f2 | 854 | Defines the size of a huge page. Must at least be equal |
b22989b9 | 855 | to the system setting, see /proc/meminfo. Defaults to 4MB. |
c51074e7 JA |
856 | Should probably always be a multiple of megabytes, so using |
857 | hugepage-size=Xm is the preferred way to set this to avoid | |
858 | setting a non-pow-2 bad value. | |
56bb17f2 | 859 | |
71bfa161 JA |
860 | exitall When one job finishes, terminate the rest. The default is |
861 | to wait for each job to finish; sometimes that is not the |
862 | desired action. | |
863 | ||
864 | bwavgtime=int Average the calculated bandwidth over the given time. Value | |
6c219763 | 865 | is specified in milliseconds. |
71bfa161 | 866 | |
c8eeb9df JA |
867 | iopsavgtime=int Average the calculated IOPS over the given time. Value |
868 | is specified in milliseconds. | |
869 | ||
71bfa161 JA |
870 | create_serialize=bool If true, serialize the file creation for the jobs. |
871 | This may be handy to avoid interleaving of data | |
872 | files, which may greatly depend on the filesystem | |
873 | used and even the number of processors in the system. | |
874 | ||
875 | create_fsync=bool fsync the data file after creation. This is the | |
876 | default. | |
877 | ||
814452bd JA |
878 | create_on_open=bool Don't pre-setup the files for IO, just create and open() a file |
879 | when it's time to do IO to that file. | |
880 | ||
afad68f7 | 881 | pre_read=bool If this is given, files will be pre-read into memory before |
34f1c044 JA |
882 | starting the given IO operation. This will also clear |
883 | the 'invalidate' flag, since it is pointless to pre-read | |
9c0d2241 JA |
884 | and then drop the cache. This will only work for IO engines |
885 | that are seekable, since they allow you to read the same data | |
886 | multiple times. Thus it will not work on eg network or splice | |
887 | IO. | |
afad68f7 | 888 | |
e545a6ce | 889 | unlink=bool Unlink the job files when done. Not the default, as repeated |
bf9a3edb JA |
890 | runs of that job would then waste time recreating the file |
891 | set again and again. | |
71bfa161 JA |
892 | |
893 | loops=int Run the specified number of iterations of this job. Used | |
894 | to repeat the same workload a given number of times. Defaults | |
895 | to 1. | |
896 | ||
68e1f29a | 897 | do_verify=bool Run the verify phase after a write phase. Only makes sense if |
e84c73a8 SL |
898 | verify is set. Defaults to 1. |
899 | ||
71bfa161 JA |
900 | verify=str If writing to a file, fio can verify the file contents |
901 | after each iteration of the job. The allowed values are: | |
902 | ||
903 | md5 Use an md5 sum of the data area and store | |
904 | it in the header of each block. | |
905 | ||
17dc34df JA |
906 | crc64 Use an experimental crc64 sum of the data |
907 | area and store it in the header of each | |
908 | block. | |
909 | ||
bac39e0e JA |
910 | crc32c Use a crc32c sum of the data area and store |
911 | it in the header of each block. | |
912 | ||
3845591f | 913 | crc32c-intel Use hardware assisted crc32c calculation |
0539d758 JA |
914 | provided on SSE4.2 enabled processors. Falls |
915 | back to regular software crc32c, if not | |
916 | supported by the system. | |
3845591f | 917 | |
71bfa161 JA |
918 | crc32 Use a crc32 sum of the data area and store |
919 | it in the header of each block. | |
920 | ||
969f7ed3 JA |
921 | crc16 Use a crc16 sum of the data area and store |
922 | it in the header of each block. | |
923 | ||
17dc34df JA |
924 | crc7 Use a crc7 sum of the data area and store |
925 | it in the header of each block. | |
926 | ||
cd14cc10 JA |
927 | sha512 Use sha512 as the checksum function. |
928 | ||
929 | sha256 Use sha256 as the checksum function. | |
930 | ||
7c353ceb JA |
931 | sha1 Use optimized sha1 as the checksum function. |
932 | ||
7437ee87 SL |
933 | meta Write extra information about each io |
934 | (timestamp, block number etc.). The block | |
996093bb | 935 | number is verified. See also verify_pattern. |
7437ee87 | 936 | |
36690c9b JA |
937 | null Only pretend to verify. Useful for testing |
938 | internals with ioengine=null, not for much | |
939 | else. | |
940 | ||
6c219763 | 941 | This option can be used for repeated burn-in tests of a |
71bfa161 | 942 | system to make sure that the written data is also |
b892dc08 JA |
943 | correctly read back. If the data direction given is |
944 | a read or random read, fio will assume that it should | |
945 | verify a previously written file. If the data direction | |
946 | includes any form of write, the verify will be of the | |
947 | newly written data. | |
71bfa161 | 948 | |
160b966d JA |
949 | verifysort=bool If set, fio will sort written verify blocks when it deems |
950 | it faster to read them back in a sorted manner. This is | |
951 | often the case when overwriting an existing file, since | |
952 | the blocks are already laid out in the file system. You | |
953 | can ignore this option unless doing huge amounts of really | |
954 | fast IO where the red-black tree sorting CPU time becomes | |
955 | significant. | |
3f9f4e26 | 956 | |
f7fa2653 | 957 | verify_offset=int Swap the verification header with data somewhere else |
546a9142 SL |
958 | in the block before writing. It's swapped back before |
959 | verifying. | |
960 | ||
f7fa2653 | 961 | verify_interval=int Write the verification header at a finer granularity |
3f9f4e26 SL |
962 | than the blocksize. It will be written for chunks the |
963 | size of verify_interval. blocksize should divide this |
964 | evenly. | |
90059d65 | 965 | |
0e92f873 | 966 | verify_pattern=str If set, fio will fill the io buffers with this |
e28218f3 SL |
967 | pattern. Fio defaults to filling with totally random |
968 | bytes, but sometimes it's interesting to fill with a known | |
969 | pattern for io verification purposes. Depending on the | |
970 | width of the pattern, fio will fill 1/2/3/4 bytes of the | |
0e92f873 RR |
971 | buffer at a time (it can be either a decimal or a hex number). |
972 | The verify_pattern if larger than a 32-bit quantity has to | |
996093bb JA |
973 | be a hex number that starts with either "0x" or "0X". Use |
974 | with verify=meta. | |
e28218f3 | 975 | |
68e1f29a | 976 | verify_fatal=bool Normally fio will keep checking the entire contents |
a12a3b4d JA |
977 | before quitting on a block verification failure. If this |
978 | option is set, fio will exit the job on the first observed | |
979 | failure. | |
e8462bd8 | 980 | |
b463e936 JA |
981 | verify_dump=bool If set, dump the contents of both the original data |
982 | block and the data block we read off disk to files. This | |
983 | allows later analysis to inspect just what kind of data | |
984 | corruption occurred. On by default. | |
985 | ||
e8462bd8 JA |
986 | verify_async=int Fio will normally verify IO inline from the submitting |
987 | thread. This option takes an integer describing how many | |
988 | async offload threads to create for IO verification instead, | |
989 | causing fio to offload the duty of verifying IO contents | |
c85c324c JA |
990 | to one or more separate threads. If using this offload |
991 | option, even sync IO engines can benefit from using an | |
992 | iodepth setting higher than 1, as it allows them to have | |
993 | IO in flight while verifies are running. | |
e8462bd8 JA |
994 | |
995 | verify_async_cpus=str Tell fio to set the given CPU affinity on the | |
996 | async IO verification threads. See cpus_allowed for the | |
997 | format used. | |
6f87418f JA |
998 | |
999 | verify_backlog=int Fio will normally verify the written contents of a | |
1000 | job that utilizes verify once that job has completed. In | |
1001 | other words, everything is written then everything is read | |
1002 | back and verified. You may want to verify continually | |
1003 | instead for a variety of reasons. Fio stores the meta data | |
1004 | associated with an IO block in memory, so for large | |
1005 | verify workloads, quite a bit of memory would be used up | |
1006 | holding this meta data. If this option is enabled, fio | |
f42195a3 JA |
1007 | will write only N blocks before verifying these blocks. |
1008 | In other words, after every N blocks written, fio | |
6f87418f JA |
1009 | will verify the previously written blocks before continuing |
1010 | to write new ones. |
1011 | ||
1012 | verify_backlog_batch=int Control how many blocks fio will verify | |
1013 | if verify_backlog is set. If not set, will default to | |
1014 | the value of verify_backlog (meaning the entire queue | |
f42195a3 JA |
1015 | is read back and verified). If verify_backlog_batch is |
1016 | less than verify_backlog then not all blocks will be verified, | |
1017 | if verify_backlog_batch is larger than verify_backlog, some | |
1018 | blocks will be verified more than once. | |
160b966d | 1019 | |
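A write job with continual verification, as described above, might be
sketched like this. The backlog of 1024 blocks, the job name and the path
are illustrative assumptions.

```ini
; Hypothetical job: name, path, sizes and backlog value are placeholders.
[write-verify]
rw=write
bs=4k
size=1g
filename=/path/to/testfile
verify=crc32c
; Verify after every 1024 written blocks instead of at the end of the
; job, which bounds the verify meta data fio has to hold in memory.
verify_backlog=1024
; verify_backlog_batch is left unset, so it defaults to the backlog size.
```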
d392365e JA |
1020 | stonewall |
1021 | wait_for_previous Wait for preceding jobs in the job file to exit, before |
71bfa161 | 1022 | starting this one. Can be used to insert serialization |
b3d62a75 JA |
1023 | points in the job file. A stone wall also implies starting |
1024 | a new reporting group. | |
1025 | ||
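A serialization point as described above could be sketched with two jobs,
where the second waits for the first to exit. Job names, path and sizes are
placeholders.

```ini
; Hypothetical jobs: names, path and sizes are placeholders.
[global]
bs=64k
size=256m
filename=/path/to/testfile

[fill]
rw=write

[readback]
; Do not start until the fill job has exited. This also starts a new
; reporting group.
stonewall
rw=read
```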
1026 | new_group Start a new reporting group. If this option isn't given, | |
1027 | jobs in a file will be part of the same reporting group | |
bf9a3edb | 1028 | unless separated by a stone wall (or if it's a group |
b3d62a75 | 1029 | by itself, with the numjobs option). |
71bfa161 JA |
1030 | |
1031 | numjobs=int Create the specified number of clones of this job. May be | |
1032 | used to setup a larger number of threads/processes doing | |
fa28c85a JA |
1033 | the same thing. We regard that grouping of jobs as a |
1034 | specific group. | |
1035 | ||
1036 | group_reporting If 'numjobs' is set, it may be interesting to display | |
1037 | statistics for the group as a whole instead of for each | |
1038 | individual job. This is especially true if 'numjobs' is |
1039 | large; looking at individual thread/process output quickly |
1040 | becomes unwieldy. If 'group_reporting' is specified, fio | |
1041 | will show the final report per-group instead of per-job. | |
71bfa161 JA |
1042 | |
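The numjobs and group_reporting options above are often used together, as
in this sketch. The count of 8, the job name and the directory are
assumptions for illustration.

```ini
; Hypothetical job: name, directory and sizes are placeholders.
[workers]
rw=randread
bs=4k
size=128m
directory=/path/to/testdir
; Create 8 clones of this job.
numjobs=8
; Report one set of statistics for the group instead of 8 separate ones.
group_reporting
```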
1043 | thread fio defaults to forking jobs, however if this option is | |
1044 | given, fio will use pthread_create(3) to create threads | |
1045 | instead. | |
1046 | ||
f7fa2653 | 1047 | zonesize=int Divide a file into zones of the specified size. See zoneskip. |
71bfa161 | 1048 | |
f7fa2653 | 1049 | zoneskip=int Skip the specified number of bytes when zonesize data has |
71bfa161 JA |
1050 | been read. The two zone options can be used to only do |
1051 | io on zones of a file. | |
1052 | ||
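The two zone options above might be combined as follows. With these
assumed values, fio should read 16MB, skip 48MB, and repeat, so roughly a
quarter of the file is touched. Job name, path and sizes are placeholders.

```ini
; Hypothetical job: name, path and sizes are placeholders.
[zoned-read]
rw=read
bs=64k
size=1g
filename=/path/to/testfile
; Do IO in zones of 16MB ...
zonesize=16m
; ... and skip 48MB once a zone's worth of data has been read.
zoneskip=48m
```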
076efc7c | 1053 | write_iolog=str Write the issued io patterns to the specified file. See |
5b42a488 SH |
1054 | read_iolog. Specify a separate file for each job, otherwise |
1055 | the iologs will be interspersed and the file may be corrupt. | |
71bfa161 | 1056 | |
076efc7c | 1057 | read_iolog=str Open an iolog with the specified file name and replay the |
71bfa161 | 1058 | io patterns it contains. This can be used to store a |
6df8adaa JA |
1059 | workload and replay it sometime later. The iolog given |
1060 | may also be a blktrace binary file, which allows fio | |
1061 | to replay a workload captured by blktrace. See blktrace | |
1062 | for how to capture such logging data. For blktrace replay, | |
1063 | the file needs to be turned into a blkparse binary data | |
ea3e51c3 | 1064 | file first (blkparse <device> -o /dev/null -d file_for_fio.bin). |
64bbb865 DN |
1065 | |
1066 | replay_no_stall=int When replaying I/O with read_iolog the default behavior | |
62776229 JA |
1067 | is to attempt to respect the time stamps within the log and |
1068 | replay them with the appropriate delay between IOs. By |
1069 | setting this variable fio will not respect the timestamps and | |
1070 | attempt to replay them as fast as possible while still | |
1071 | respecting ordering. The result is the same I/O pattern to a | |
1072 | given device, but different timings. | |
71bfa161 | 1073 | |
d1c46c04 DN |
1074 | replay_redirect=str While replaying I/O patterns using read_iolog the |
1075 | default behavior is to replay the IOs onto the major/minor |
1076 | device that each IO was recorded from. This is sometimes |
1077 | undesirable because on a different machine those major/minor |
1078 | numbers can map to a different device. Changing hardware on |
1079 | the same system can also result in a different major/minor |
1080 | mapping. Replay_redirect causes all IOs to be replayed onto |
1081 | the single specified device regardless of the device it was |
1082 | recorded from. i.e. replay_redirect=/dev/sdc would cause all |
1083 | IO in the blktrace to be replayed onto /dev/sdc. This means |
1084 | multiple devices will be replayed onto a single device, if the trace |
1085 | contains multiple devices. If you want multiple devices to be |
1086 | replayed concurrently to multiple redirected devices you must |
1087 | blkparse your trace into separate traces and replay them with |
1088 | independent fio invocations. Unfortunately this also breaks |
1089 | the strict time ordering between multiple device accesses. | |
1090 | ||
e3cedca7 | 1091 | write_bw_log=str If given, write a bandwidth log of the jobs in this job |
71bfa161 | 1092 | file. Can be used to store data of the bandwidth of the |
e0da9bc2 JA |
1093 | jobs in their lifetime. The included fio_generate_plots |
1094 | script uses gnuplot to turn these text files into nice | |
e3cedca7 JA |
1095 | graphs. See write_lat_log for behaviour of given |
1096 | filename. For this option, the postfix is _bw.log. | |
71bfa161 | 1097 | |
e3cedca7 | 1098 | write_lat_log=str Same as write_bw_log, except that this option stores io |
02af0988 JA |
1099 | submission, completion, and total latencies instead. If no |
1100 | filename is given with this option, the default filename of | |
1101 | "jobname_type.log" is used. Even if the filename is given, | |
1102 | fio will still append the type of log. So if one specifies | |
e3cedca7 JA |
1103 | |
1104 | write_lat_log=foo | |
1105 | ||
02af0988 JA |
1106 | The actual log names will be foo_slat.log, foo_clat.log, |
1107 | and foo_lat.log. This helps fio_generate_plots find the logs |
1108 | automatically. | |
71bfa161 | 1109 | |
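Logging for a single job could be requested roughly like this. The prefix
"foo" follows the example in the text; the job name, path and sizes are
placeholders.

```ini
; Hypothetical job: name, path and sizes are placeholders.
[logged-job]
rw=randread
bs=4k
size=256m
filename=/path/to/testfile
; Bandwidth log; fio appends the _bw.log postfix to the given name.
write_bw_log=foo
; Latency logs; fio appends the log type, giving separate files for
; submission, completion and total latency (e.g. foo_clat.log).
write_lat_log=foo
```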
c8eeb9df JA |
1110 | write_iops_log=str If given, write an IOPS log of the jobs in this job |
1111 | file. See write_bw_log. | |
1112 | ||
f7fa2653 | 1113 | lockmem=int Pin down the specified amount of memory with mlock(2). Can |
71bfa161 JA |
1114 | potentially be used instead of removing memory or booting |
1115 | with less memory to simulate a smaller amount of memory. | |
1116 | ||
1117 | exec_prerun=str Before running this job, issue the command specified | |
1118 | through system(3). | |
1119 | ||
1120 | exec_postrun=str After the job completes, issue the command specified | |
1121 | through system(3). |
1122 | ||
1123 | ioscheduler=str Attempt to switch the device hosting the file to the specified | |
1124 | io scheduler before running. | |
1125 | ||
1126 | cpuload=int If the job is a CPU cycle eater, attempt to use the specified | |
1127 | percentage of CPU cycles. | |
1128 | ||
1129 | cpuchunks=int If the job is a CPU cycle eater, split the load into | |
26eca2db | 1130 | cycles of the given time. In microseconds. |
71bfa161 | 1131 | |
0a839f30 JA |
1132 | disk_util=bool Generate disk utilization statistics, if the platform |
1133 | supports it. Defaults to on. | |
1134 | ||
02af0988 | 1135 | disable_lat=bool Disable measurements of total latency numbers. Useful |
9520ebb9 JA |
1136 | only for cutting back the number of calls to gettimeofday, |
1137 | as that does impact performance at really high IOPS rates. | |
1138 | Note that to really get rid of a large amount of these | |
1139 | calls, this option must be used with disable_slat and | |
1140 | disable_bw as well. | |
1141 | ||
02af0988 JA |
1142 | disable_clat=bool Disable measurements of completion latency numbers. See |
1143 | disable_lat. | |
1144 | ||
9520ebb9 | 1145 | disable_slat=bool Disable measurements of submission latency numbers. See |
02af0988 | 1146 | disable_lat. |
9520ebb9 JA |
1147 | |
1148 | disable_bw=bool Disable measurements of throughput/bandwidth numbers. See | |
02af0988 | 1149 | disable_lat. |
9520ebb9 | 1150 | |
83349190 YH |
1151 | clat_percentiles=bool Enable the reporting of percentiles of |
1152 | completion latencies. | |
1153 | ||
1154 | percentile_list=float_list Overwrite the default list of percentiles | |
1155 | for completion latencies. Each number is a floating | |
1156 | number in the range (0,100], and the maximum length of | |
1157 | the list is 20. Use ':' to separate the numbers, and | |
1158 | list the numbers in ascending order. For example, | |
1159 | --percentile_list=99.5:99.9 will cause fio to report | |
1160 | the values of completion latency below which 99.5% and | |
1161 | 99.9% of the observed latencies fell, respectively. | |
1162 | ||
993bf48b JA |
1163 | gtod_reduce=bool Enable all of the gettimeofday() reducing options |
1164 | (disable_clat, disable_slat, disable_bw) plus reduce | |
1165 | precision of the timeout somewhat to really shrink | |
1166 | the gettimeofday() call count. With this option enabled, | |
1167 | we only do about 0.4% of the gtod() calls we would have | |
1168 | done if all time keeping was enabled. | |
1169 | ||
be4ecfdf JA |
1170 | gtod_cpu=int Sometimes it's cheaper to dedicate a single thread of |
1171 | execution to just getting the current time. Fio (and | |
1172 | databases, for instance) are very intensive on gettimeofday() | |
1173 | calls. With this option, you can set one CPU aside for | |
1174 | doing nothing but logging current time to a shared memory | |
1175 | location. Then the other threads/processes that run IO | |
1176 | workloads need only copy that segment, instead of entering | |
1177 | the kernel with a gettimeofday() call. The CPU set aside | |
1178 | for doing these time calls will be excluded from other | |
1179 | uses. Fio will manually clear it from the CPU mask of other | |
1180 | jobs. | |
a696fa2a | 1181 | |
f2bba182 RR |
1182 | continue_on_error=bool Normally fio will exit the job on the first observed |
1183 | failure. If this option is set, fio will continue the job when | |
1184 | there is a 'non-fatal error' (EIO or EILSEQ) until the runtime | |
1185 | is exceeded or the I/O size specified is completed. If this | |
1186 | option is used, there are two more stats that are appended, | |
1187 | the total error count and the first error. The error field | |
1188 | given in the stats is the first error that was hit during the | |
1189 | run. | |
be4ecfdf | 1190 | |
6adb38a1 JA |
1191 | cgroup=str Add job to this control group. If it doesn't exist, it will |
1192 | be created. The system must have a mounted cgroup blkio | |
1193 | mount point for this to work. If your system doesn't have it | |
1194 | mounted, you can do so with: | |
a696fa2a JA |
1195 | |
1196 | # mount -t cgroup -o blkio none /cgroup | |
1197 | ||
a696fa2a JA |
1198 | cgroup_weight=int Set the weight of the cgroup to this value. See |
1199 | the documentation that comes with the kernel, allowed values | |
1200 | are in the range of 100..1000. | |
71bfa161 | 1201 | |
7de87099 VG |
1202 | cgroup_nodelete=bool Normally fio will delete the cgroups it has created after |
1203 | the job completion. To override this behavior and to leave | |
1204 | cgroups around after the job completion, set cgroup_nodelete=1. | |
1205 | This can be useful if one wants to inspect various cgroup | |
1206 | files after job completion. Default: false | |
1207 | ||
e0b0d892 JA |
1208 | uid=int Instead of running as the invoking user, set the user ID to |
1209 | this value before the thread/process does any work. | |
1210 | ||
1211 | gid=int Set group ID, see uid. | |
1212 | ||
71bfa161 JA |
1213 | 6.0 Interpreting the output |
1214 | --------------------------- | |
1215 | ||
1216 | fio spits out a lot of output. While running, fio will display the | |
1217 | status of the jobs created. An example of that would be: | |
1218 | ||
73c8b082 | 1219 | Threads: 1: [_r] [24.8% done] [ 13509/ 8334 kb/s] [eta 00h:01m:31s] |
71bfa161 JA |
1220 | |
1221 | The characters inside the square brackets denote the current status of | |
1222 | each thread. The possible values (in typical life cycle order) are: | |
1223 | ||
1224 | Idle Run | |
1225 | ---- --- | |
1226 | P Thread setup, but not started. | |
1227 | C Thread created. | |
1228 | I Thread initialized, waiting. | |
b0f65863 | 1229 | p Thread running pre-reading file(s). |
71bfa161 JA |
1230 | R Running, doing sequential reads. |
1231 | r Running, doing random reads. | |
1232 | W Running, doing sequential writes. | |
1233 | w Running, doing random writes. | |
1234 | M Running, doing mixed sequential reads/writes. | |
1235 | m Running, doing mixed random reads/writes. | |
1236 | F Running, currently waiting for fsync() | |
fc6bd43c | 1237 | V Running, doing verification of written data. |
71bfa161 JA |
1238 | E Thread exited, not reaped by main thread yet. |
1239 | _ Thread reaped. | |
1240 | ||
1241 | The other values are fairly self-explanatory - number of threads | |
c9f60304 JA |
1242 | currently running and doing io, rate of io since last check (read speed |
1243 | listed first, then write speed), and the estimated completion percentage | |
1244 | and time for the running group. It's impossible to estimate runtime of | |
1245 | the following groups (if any). | |
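
As a small illustration, the status characters listed above can be turned
into readable text with a lookup table. This is only a sketch: the function
name decode_status is made up for this example and is not part of fio, and
the input is the bracketed status string from the sample line ("_r").

```python
# Status characters and meanings copied from the table above.
STATUS = {
    "P": "Thread setup, but not started",
    "C": "Thread created",
    "I": "Thread initialized, waiting",
    "p": "Thread running pre-reading file(s)",
    "R": "Running, doing sequential reads",
    "r": "Running, doing random reads",
    "W": "Running, doing sequential writes",
    "w": "Running, doing random writes",
    "M": "Running, doing mixed sequential reads/writes",
    "m": "Running, doing mixed random reads/writes",
    "F": "Running, currently waiting for fsync()",
    "V": "Running, doing verification of written data",
    "E": "Thread exited, not reaped by main thread yet",
    "_": "Thread reaped",
}


def decode_status(chars):
    """Return one description per thread for a status string such as '_r'.

    decode_status is a hypothetical helper name, not a fio interface.
    """
    return [STATUS.get(c, "unknown status %r" % c) for c in chars]


for index, text in enumerate(decode_status("_r")):
    print(index, text)
```

Characters that are not in the table are reported as unknown rather than
raising an error, since other fio versions may print additional states.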
71bfa161 JA |
1246 | |
1247 | When fio is done (or interrupted by ctrl-c), it will show the data for | |
1248 | each thread, group of threads, and disks in that order. For each data | |
1249 | direction, the output looks like: | |
1250 | ||
1251 | Client1 (g=0): err= 0: | |
35649e58 | 1252 | write: io= 32MB, bw= 666KB/s, iops=89 , runt= 50320msec |
6104ddb6 JA |
1253 | slat (msec): min= 0, max= 136, avg= 0.03, stdev= 1.92 |
1254 | clat (msec): min= 0, max= 631, avg=48.50, stdev=86.82 | |
b22989b9 | 1255 | bw (KB/s) : min= 0, max= 1196, per=51.00%, avg=664.02, stdev=681.68 |
e7823a94 | 1256 | cpu : usr=1.49%, sys=0.25%, ctx=7969, majf=0, minf=17 |
71619dc2 | 1257 | IO depths : 1=0.1%, 2=0.3%, 4=0.5%, 8=99.0%, 16=0.0%, 32=0.0%, >32=0.0% |
838bc709 JA |
1258 | submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% |
1259 | complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% | |
30061b97 | 1260 | issued r/w: total=0/32768, short=0/0 |
8abdce66 JA |
1261 | lat (msec): 2=1.6%, 4=0.0%, 10=3.2%, 20=12.8%, 50=38.4%, 100=24.8%, |
1262 | lat (msec): 250=15.2%, 500=0.0%, 750=0.0%, 1000=0.0%, >=2048=0.0% | |
71bfa161 JA |
1263 | |
1264 | The client number is printed, along with the group id and error of that | |
1265 | thread. Below are the io statistics, here for writes. In the order listed, | |
1266 | they denote: | |
1267 | ||
1268 | io= Number of megabytes io performed | |
1269 | bw= Average bandwidth rate | |
35649e58 | 1270 | iops= Average IOs performed per second |
71bfa161 | 1271 | runt= The runtime of that thread |
72fbda2a | 1272 | slat= Submission latency (avg being the average, stdev being the |
71bfa161 JA |
1273 | standard deviation). This is the time it took to submit |
1274 | the io. For sync io, the slat is really the completion | |
8a35c71e | 1275 | latency, since queue/complete is one operation there. This |
bf9a3edb | 1276 | value can be in milliseconds or microseconds, fio will choose |
8a35c71e | 1277 | the most appropriate base and print that. In the example |
bf9a3edb | 1278 | above, milliseconds is the best scale. |
71bfa161 JA |
1279 | clat= Completion latency. Same names as slat, this denotes the |
1280 | time from submission to completion of the io pieces. For | |
1281 | sync io, clat will usually be equal (or very close) to 0, | |
1282 | as the time from submit to complete is basically just | |
1283 | CPU time (io has already been done, see slat explanation). | |
1284 | bw= Bandwidth. Same names as the xlat stats, but also includes | |
1285 | an approximate percentage of total aggregate bandwidth | |
1286 | this thread received in this group. This last value is | |
1287 | only really useful if the threads in this group are on the | |
1288 | same disk, since they are then competing for disk access. | |
1289 | cpu= CPU usage. User and system time, along with the number | |
e7823a94 JA |
1290 | of context switches this thread went through, and | |
1291 | finally the number of major and minor page | |
1292 | faults. | |
71619dc2 JA |
1293 | IO depths= The distribution of io depths over the job life time. The |
1294 | numbers are divided into powers of 2, so for example the | |
1295 | 16= entry covers depths from that value up to, but not | |
1296 | including, the next entry. In other words, it covers the | |
1297 | range from 16 to 31. | |
838bc709 JA |
1298 | IO submit= How many pieces of IO were submitted in a single submit | |
1299 | call. Each entry denotes that amount and below, until | |
1300 | the previous entry - e.g., 8=100% means that we submitted | |
1301 | anywhere between 5 and 8 ios per submit call. | |
1302 | IO complete= Like the above submit number, but for completions instead. | |
30061b97 JA |
1303 | IO issued= The number of read/write requests issued, and how many |
1304 | of them were short. | |
ec118304 JA |
1305 | IO latencies= The distribution of IO completion latencies. This is the |
1306 | time from when IO leaves fio to when it gets completed. | |
1307 | The numbers follow the same pattern as the IO depths, | |
1308 | meaning that 2=1.6% means that 1.6% of the IO completed | |
8abdce66 JA |
1309 | within 2 msecs, 20=12.8% means that 12.8% of the IO |
1310 | took more than 10 msecs, but less than (or equal to) 20 msecs. | |
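
The bucket rule for IO latencies can be sketched as follows. The bounds are
taken from the millisecond list in the terse format description (2 through
2000, then an open-ended last bucket), which is an assumption about how the
normal output is bucketed. Treating each bound as inclusive follows the
"less than (or equal to) 20 msecs" wording above. The name latency_bucket is
made up for this example.

```python
from bisect import bisect_left

# Upper bounds in msec, assumed from the terse "IO latencies milliseconds"
# list. Anything above the last bound lands in the open-ended bucket.
BOUNDS_MSEC = [2, 4, 10, 20, 50, 100, 250, 500, 750, 1000, 2000]


def latency_bucket(msec):
    """Return the label of the bucket a completion latency falls into.

    A latency belongs to the first bucket whose bound is >= the latency,
    so 15 msec is counted under "20" (more than 10, at most 20).
    """
    i = bisect_left(BOUNDS_MSEC, msec)
    if i == len(BOUNDS_MSEC):
        return ">=2000"
    return str(BOUNDS_MSEC[i])


for sample in (1, 10, 15, 20, 5000):
    print(sample, "->", latency_bucket(sample))
```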
71bfa161 JA |
1311 | |
1312 | After each client has been listed, the group statistics are printed. They | |
1313 | will look like this: | |
1314 | ||
1315 | Run status group 0 (all jobs): | |
b22989b9 JA |
1316 | READ: io=64MB, aggrb=22178, minb=11355, maxb=11814, mint=2840msec, maxt=2955msec |
1317 | WRITE: io=64MB, aggrb=1302, minb=666, maxb=669, mint=50093msec, maxt=50320msec | |
71bfa161 JA |
1318 | |
1319 | For each data direction, it prints: | |
1320 | ||
1321 | io= Number of megabytes io performed. | |
1322 | aggrb= Aggregate bandwidth of threads in this group. | |
1323 | minb= The minimum average bandwidth a thread saw. | |
1324 | maxb= The maximum average bandwidth a thread saw. | |
1325 | mint= The smallest runtime of the threads in that group. | |
1326 | maxt= The longest runtime of the threads in that group. | |
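
The sample numbers above are consistent with aggrb being the total io of the
group divided by the longest runtime (maxt). That relationship is inferred
from the sample, not stated in this document, so treat the sketch below as a
sanity check under that assumption. It also assumes that 64MB means 64*1024
KB and that the result is truncated to a whole number.

```python
def aggregate_bandwidth_kb(io_kb, maxt_msec):
    """Total io in KB divided by the longest runtime, in KB/s (truncated).

    Assumed formula, chosen because it reproduces the sample output.
    """
    return io_kb * 1000 // maxt_msec


io_kb = 64 * 1024  # io=64MB from the sample group statistics

print("READ  aggrb:", aggregate_bandwidth_kb(io_kb, 2955))
print("WRITE aggrb:", aggregate_bandwidth_kb(io_kb, 50320))
```

With the sample values this gives 22178 for reads and 1302 for writes, which
matches the aggrb= fields shown above.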
1327 | ||
1328 | And finally, the disk statistics are printed. They will look like this: | |
1329 | ||
1330 | Disk stats (read/write): | |
1331 | sda: ios=16398/16511, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00% | |
1332 | ||
1333 | Each value is printed for both reads and writes, with reads first. The | |
1334 | numbers denote: | |
1335 | ||
1336 | ios= Number of ios performed by all groups. | |
1337 | merge= Number of merges in the io scheduler. | |
1338 | ticks= Number of ticks we kept the disk busy. | |
1339 | in_queue= Total time spent in the disk queue. | |
1340 | util= The disk utilization. A value of 100% means we kept the disk | |
1341 | busy constantly, 50% would be a disk idling half of the time. | |
1342 | ||
1343 | ||
1344 | 7.0 Terse output | |
1345 | ---------------- | |
1346 | ||
1347 | For scripted usage where you typically want to generate tables or graphs | |
6af019c9 | 1348 | of the results, fio can output the results in a semicolon separated format. |
71bfa161 JA |
1349 | The format is one long line of values, such as: |
1350 | ||
562c2d2f DN |
1351 | 2;card0;0;0;7139336;121836;60004;1;10109;27.932460;116.933948;220;126861;3495.446807;1085.368601;226;126864;3523.635629;1089.012448;24063;99944;50.275485%;59818.274627;5540.657370;7155060;122104;60004;1;8338;29.086342;117.839068;388;128077;5032.488518;1234.785715;391;128085;5061.839412;1236.909129;23436;100928;50.287926%;59964.832030;5644.844189;14.595833%;19.394167%;123706;0;7313;0.1%;0.1%;0.1%;0.1%;0.1%;0.1%;100.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.01%;0.02%;0.05%;0.16%;6.04%;40.40%;52.68%;0.64%;0.01%;0.00%;0.01%;0.00%;0.00%;0.00%;0.00%;0.00% |
1352 | A description of this job goes here. | |
1353 | ||
1354 | The job description (if provided) follows on a second line. | |
71bfa161 | 1355 | |
525c2bfa JA |
1356 | To enable terse output, use the --minimal command line option. The first |
1357 | value is the version of the terse output format. If the output has to | |
1358 | be changed for some reason, this number will be incremented by 1 to | |
1359 | signify that change. | |
6820cb3b | 1360 | |
71bfa161 JA |
1361 | Split up, the format is as follows: |
1362 | ||
525c2bfa | 1363 | version, jobname, groupid, error |
71bfa161 | 1364 | READ status: |
85549ba0 | 1365 | Total IO (KB), bandwidth (KB/sec), runtime (msec) |
71bfa161 JA |
1366 | Submission latency: min, max, mean, deviation |
1367 | Completion latency: min, max, mean, deviation | |
525c2bfa | 1368 | Total latency: min, max, mean, deviation |
6c219763 | 1369 | Bw: min, max, aggregate percentage of total, mean, deviation |
71bfa161 | 1370 | WRITE status: |
85549ba0 | 1371 | Total IO (KB), bandwidth (KB/sec), runtime (msec) |
71bfa161 JA |
1372 | Submission latency: min, max, mean, deviation |
1373 | Completion latency: min, max, mean, deviation | |
525c2bfa | 1374 | Total latency: min, max, mean, deviation |
6c219763 | 1375 | Bw: min, max, aggregate percentage of total, mean, deviation |
046ee302 | 1376 | CPU usage: user, system, context switches, major faults, minor faults |
2270890c | 1377 | IO depths: <=1, 2, 4, 8, 16, 32, >=64 |
562c2d2f DN |
1378 | IO latencies microseconds: <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000 |
1379 | IO latencies milliseconds: <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000, 2000, >=2000 | |
1380 | Additional Info (dependent on continue_on_error, default off): total # errors, first error code | |
1381 | ||
f42195a3 | 1382 | Additional Info (dependent on description being set): Text description |
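
For scripted use, the field list above can be turned into a small parser.
The following is a minimal sketch for terse version 2 only. The field names
(read_io_kb, lat_ms_50 and so on) and the function name parse_terse are
invented for this example. It also assumes that the job name contains no
semicolon and that only the first line of the output (not the description
line) is passed in.

```python
LAT = ["min", "max", "mean", "dev"]

# One data direction (READ or WRITE): 3 + 4 + 4 + 4 + 5 = 20 values.
PER_DIRECTION = (
    ["io_kb", "bw_kbs", "runtime_ms"]
    + ["slat_" + s for s in LAT]
    + ["clat_" + s for s in LAT]
    + ["lat_" + s for s in LAT]
    + ["bw_min", "bw_max", "bw_agg_pct", "bw_mean", "bw_dev"]
)

US_BUCKETS = ["2", "4", "10", "20", "50", "100", "250", "500", "750", "1000"]
MS_BUCKETS = US_BUCKETS + ["2000", "over_2000"]

# Hypothetical field names, in the order given by the format description.
FIELDS = (
    ["version", "jobname", "groupid", "error"]
    + ["read_" + f for f in PER_DIRECTION]
    + ["write_" + f for f in PER_DIRECTION]
    + ["cpu_user", "cpu_sys", "cpu_ctx", "cpu_majf", "cpu_minf"]
    + ["depth_" + d for d in ["1", "2", "4", "8", "16", "32", "64"]]
    + ["lat_us_" + b for b in US_BUCKETS]
    + ["lat_ms_" + b for b in MS_BUCKETS]
)


def parse_terse(line):
    """Split one terse v2 line into a dict keyed by FIELDS.

    Values stay strings. Trailing values (error counts when
    continue_on_error is set) are kept under the "extra" key.
    """
    values = line.rstrip("\n").split(";")
    if len(values) < len(FIELDS):
        raise ValueError(
            "expected at least %d fields, got %d" % (len(FIELDS), len(values))
        )
    if values[0] != "2":
        raise ValueError("unsupported terse version %r" % values[0])
    record = dict(zip(FIELDS, values))
    record["extra"] = values[len(FIELDS):]
    return record


# Synthetic line with the right number of fields, for demonstration only.
sample = ";".join(["2", "card0", "0", "0"] + ["0"] * (len(FIELDS) - 4))
record = parse_terse(sample)
print(record["jobname"], len(FIELDS))
```

The layout yields 78 fields, the same count as the example line shown
earlier in this section. Checking the version field first is deliberate: the
format is expected to change whenever that number is incremented.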
25c8b9d7 PD |
1383 | |
1384 | ||
1385 | 8.0 Trace file format | |
1386 | --------------------- | |
1387 | There are two trace file formats that you can encounter. The older (v1) format | |
1388 | is unsupported since version 1.20-rc3 (March 2008). It will still be described | |
1389 | below in case you get an old trace and want to understand it. | |
1390 | ||
1391 | In any case the trace is a simple text file with a single action per line. | |
1392 | ||
1393 | ||
1394 | 8.1 Trace file format v1 | |
1395 | ------------------------ | |
1396 | Each line represents a single io action in the following format: | |
1397 | ||
1398 | rw, offset, length | |
1399 | ||
1400 | where rw=0/1 for read/write, and the offset and length entries being in bytes. | |
1401 | ||
1402 | This format is not supported in Fio versions >= 1.20-rc3. | |
1403 | ||
1404 | ||
1405 | 8.2 Trace file format v2 | |
1406 | ------------------------ | |
1407 | The second version of the trace file format was added in Fio version 1.17. | |
1408 | It allows access to more than one file per trace and has a bigger set of | |
1409 | possible file actions. | |
1410 | ||
1411 | The first line of the trace file has to be: | |
1412 | ||
1413 | fio version 2 iolog | |
1414 | ||
1415 | Following this can be lines in two different formats, which are described below. | |
1416 | ||
1417 | The file management format: | |
1418 | ||
1419 | filename action | |
1420 | ||
1421 | The filename is given as an absolute path. The action can be one of these: | |
1422 | ||
1423 | add Add the given filename to the trace | |
1424 | open Open the file with the given filename. The filename has to have | |
1425 | been added with the add action before. | |
1426 | close Close the file with the given filename. The file has to have been | |
1427 | opened before. | |
1428 | ||
1429 | ||
1430 | The file io action format: | |
1431 | ||
1432 | filename action offset length | |
1433 | ||
1434 | The filename is given as an absolute path, and has to have been added and opened | |
1435 | before it can be used with this format. The offset and length are given in | |
1436 | bytes. The action can be one of these: | |
1437 | ||
1438 | wait Wait for 'offset' microseconds. Everything below 100 is discarded. | |
1439 | read Read 'length' bytes beginning from 'offset' | |
1440 | write Write 'length' bytes beginning from 'offset' | |
1441 | sync fsync() the file | |
1442 | datasync fdatasync() the file | |
1443 | trim Trim the given file from the given 'offset' for 'length' bytes
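
The v2 rules above (header line first, add before open, open before any io
action) can be checked with a short script. This is a sketch under some
assumptions: fields are separated by whitespace, filenames contain no
spaces, and every io action, including wait, names a file that is currently
open. The function name parse_iolog_v2 and the path /tmp/data are made up
for this example.

```python
FILE_ACTIONS = {"add", "open", "close"}
IO_ACTIONS = {"wait", "read", "write", "sync", "datasync", "trim"}


def parse_iolog_v2(text):
    """Parse a v2 trace into (filename, action, offset, length) tuples.

    File management entries carry None for offset and length. Blank
    lines are skipped; entry numbers in errors count non-blank lines.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "fio version 2 iolog":
        raise ValueError("first line must be 'fio version 2 iolog'")

    added, opened, entries = set(), set(), []
    for number, line in enumerate(lines[1:], start=2):
        parts = line.split()
        if len(parts) == 2 and parts[1] in FILE_ACTIONS:
            name, action = parts
            if action == "add":
                added.add(name)
            elif action == "open":
                if name not in added:
                    raise ValueError("entry %d: open before add" % number)
                opened.add(name)
            else:  # close
                if name not in opened:
                    raise ValueError("entry %d: close before open" % number)
                opened.discard(name)
            entries.append((name, action, None, None))
        elif len(parts) == 4 and parts[1] in IO_ACTIONS:
            name, action = parts[0], parts[1]
            if name not in opened:
                raise ValueError("entry %d: file is not open" % number)
            entries.append((name, action, int(parts[2]), int(parts[3])))
        else:
            raise ValueError("entry %d: unrecognized line %r" % (number, line))
    return entries


# Hypothetical trace: write 4096 bytes at offset 0, then read them back.
trace = """fio version 2 iolog
/tmp/data add
/tmp/data open
/tmp/data write 0 4096
/tmp/data read 0 4096
/tmp/data close
"""

for entry in parse_iolog_v2(trace):
    print(entry)
```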