.TH fio 1 "September 2007" "User Manual"
fio \- flexible I/O tester
\fBfio\fR [\fIoptions\fR] [\fIjobfile\fR]...
\fBfio\fR is a tool that will spawn a number of threads or processes doing a
particular type of I/O action as specified by the user.
The typical use of fio is to write a job file matching the I/O load
one wants to simulate.
.BI \-\-output \fR=\fPfilename
Write output to \fIfilename\fR.
.BI \-\-timeout \fR=\fPtimeout
Limit run time to \fItimeout\fR seconds.
Generate per-job latency logs.
Generate per-job bandwidth logs.
Print statistics in a terse, semicolon-delimited format.
.BI \-\-showcmd \fR=\fPjobfile
Convert \fIjobfile\fR to a set of command-line options.
Enable read-only safety checks.
.BI \-\-eta \fR=\fPwhen
Specifies when the real-time ETA estimate should be printed. \fIwhen\fR may
be one of `always', `never' or `auto'.
.BI \-\-section \fR=\fPsec
Only run section \fIsec\fR from job file.
.BI \-\-cmdhelp \fR=\fPcommand
Print help information for \fIcommand\fR. May be `all' for all commands.
.BI \-\-debug \fR=\fPtype
Enable verbose tracing of various fio actions. May be `all' for all types
or individual types separated by a comma (e.g. \-\-debug=io,file). `help' will
list all available tracing options.
Display usage information and exit.
Display version information and exit.
Job files are in `ini' format. They consist of one or more
job definitions, which begin with a job name in square brackets and
extend to the next job name. The job name can be any ASCII string
except `global', which has a special meaning. Following the job name is
a sequence of zero or more parameters, one per line, that define the
behavior of the job. Any line starting with a `;' or `#' character is
considered a comment and ignored.
If \fIjobfile\fR is specified as `-', the job file will be read from
standard input.
The global section contains default parameters for jobs specified in the
job file. A job is only affected by global sections residing above it,
and there may be any number of global sections. Specific job definitions
may override any parameter set in global sections.
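.P
As a minimal sketch of this layout, the following job file sets defaults in a
global section and overrides one of them in the second job. The job names and
all values are illustrative assumptions, and \fBrandread\fR is assumed to be the
pattern name for random reads.
.RS
.nf
; defaults inherited by every job defined below
[global]
rw=randread
size=128M
bs=4k

; uses the global defaults unchanged
[job1]

; overrides the block size set in the global section
[job2]
bs=64k
.fi
.RE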
Some parameters may take arguments of a specific type. The types used are:
String (\fIstr\fR): a sequence of alphanumeric characters.
Integer (\fIint\fR): a whole number, possibly negative. If prefixed with `0x', the value
is assumed to be base 16 (hexadecimal).
SI integer (\fIsiint\fR): a whole number, possibly containing a suffix denoting the base unit
of the value. Accepted suffixes are `k', `M' and `G', denoting kilo (1024),
mega (1024*1024) and giga (1024*1024*1024) respectively.
Boolean (\fIbool\fR): a true or false value. `0' denotes false, `1' denotes true.
Integer range (\fIirange\fR): a range of integers specified in the format
\fIlower\fR:\fIupper\fR or \fIlower\fR\-\fIupper\fR. \fIlower\fR and
\fIupper\fR may contain a suffix as described above. If an option allows two
sets of ranges, they are separated with a `,' or `/' character. For example:
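.P
The following fragment of a job definition sketches one value of each type.
The choice of parameters and all values are illustrative assumptions. The last
line gives two ranges, the first for reads and the second for writes.
.RS
.nf
; string
description=typecheck
; integer
iodepth=4
; SI integer: 1G = 1024*1024*1024 = 1073741824 bytes
size=1G
; boolean
randrepeat=1
; two integer ranges separated by a comma
bsrange=1k-4k,2k-8k
.fi
.RE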
May be used to override the job name. On the command line, this parameter
has the special purpose of signalling the start of a new job.
.BI description \fR=\fPstr
Human-readable description of the job. It is printed when the job is run, but
otherwise has no special purpose.
.BI directory \fR=\fPstr
Prefix filenames with this directory. Used to place files in a location other
than the current directory.
.BI filename \fR=\fPstr
\fBfio\fR normally makes up a file name based on the job name, thread number, and file
number. If you want to share files between threads in a job or several jobs,
specify a \fIfilename\fR for each of them to override the default. If the I/O
engine used is `net', \fIfilename\fR is the host and port to connect to in the
format \fIhost\fR/\fIport\fR. If the I/O engine is file-based, you can specify
a number of files by separating the names with a `:' character. `\-' is a
reserved name, meaning stdin or stdout, depending on the read/write direction.
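.P
A sketch of the file-sharing case described above, assuming a file-based I/O
engine. The job names and the paths /tmp/fio.a and /tmp/fio.b are hypothetical,
and \fBread\fR and \fBwrite\fR are assumed to be the names of the sequential
patterns.
.RS
.nf
; both jobs operate on the same two files instead of per-job defaults
[global]
filename=/tmp/fio.a:/tmp/fio.b
size=64M

[reader]
rw=read

[writer]
rw=write
.fi
.RE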
.BI opendir \fR=\fPstr
Recursively open any files below directory \fIstr\fR.
.BI readwrite \fR=\fPstr "\fR,\fP rw" \fR=\fPstr
Type of I/O pattern. Accepted values are:
Mixed sequential reads and writes.
Mixed random reads and writes.
For mixed I/O, the default split is 50/50. For random I/O, the number of I/Os
to perform before getting a new offset can be specified by appending
`:\fIint\fR' to the pattern type. The default is 1.
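.P
For instance, a job along the following lines would perform eight I/Os at each
random offset before choosing a new one. The job name and values are
illustrative assumptions, and \fBrandrw\fR is assumed to be the pattern name for
mixed random reads and writes.
.RS
.nf
[mixed]
; mixed random reads and writes, 8 I/Os per offset
rw=randrw:8
size=256M
.fi
.RE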
.BI randrepeat \fR=\fPbool
Seed the random number generator in a predictable way so results are repeatable
across runs. Default: true.
.BI fadvise_hint \fR=\fPbool
If true, use \fIposix_fadvise\fR\|(2) to advise the kernel what I/O patterns
are likely to be issued. Set to false to disable the hints. Default: true.
.BI size \fR=\fPsiint
Total size of I/O for this job. \fBfio\fR will run until this many bytes have
been transferred, unless limited by other options (\fBruntime\fR, for instance).
Unless \fBnrfiles\fR and \fBfilesize\fR options are given, this amount will be
divided between the available files for the job.
.BI filesize \fR=\fPirange
Individual file sizes. May be a range, in which case \fBfio\fR will select sizes
for files at random within the given range, limited to \fBsize\fR in total (if
that is given). If \fBfilesize\fR is not specified, each created file is the
same size.
.BI blocksize \fR=\fPsiint[,siint] "\fR,\fB bs" \fR=\fPsiint[,siint]
Block size for I/O units. Default: 4k. Values for reads and writes can be
specified separately in the format \fIread\fR,\fIwrite\fR, either of
which may be empty to leave that value at its default.
.BI blocksize_range \fR=\fPirange[,irange] "\fR,\fB bsrange" \fR=\fPirange[,irange]
Specify a range of I/O block sizes. The issued I/O unit will always be a
multiple of the minimum size, unless \fBblocksize_unaligned\fR is set. Applies
to both reads and writes if only one range is given, but can be specified
separately with a comma separating the values. Example: bsrange=1k-4k,2k-8k.
See also \fBblocksize\fR.
.BI bssplit \fR=\fPstr
This option allows even finer-grained control of the block sizes issued,
not just even splits between them. With this option, you can weight various
block sizes for exact control of the issued I/O for a job that has mixed
block sizes. The format of the option is bssplit=blocksize/percentage,
optionally adding as many definitions as needed separated by a colon.
Example: bssplit=4k/10:64k/50:32k/40 would issue 50% 64k blocks, 10% 4k
blocks and 40% 32k blocks.
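.P
Placed in a job definition, the split from the example above might look like
the following sketch. The job name and the other values are assumptions, and
\fBrandwrite\fR is assumed to be the pattern name for random writes. The
percentages add up to 100 (10 + 50 + 40).
.RS
.nf
[split]
rw=randwrite
size=512M
; 10% 4k blocks, 50% 64k blocks, 40% 32k blocks
bssplit=4k/10:64k/50:32k/40
.fi
.RE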
.B blocksize_unaligned\fR,\fP bs_unaligned
If set, any size in \fBblocksize_range\fR may be used. This typically won't
work with direct I/O, as that normally requires sector alignment.
Initialise buffers with all zeros. Default: fill buffers with random data.
.BI nrfiles \fR=\fPint
Number of files to use for this job. Default: 1.
.BI openfiles \fR=\fPint
Number of files to keep open at the same time. Default: \fBnrfiles\fR.
.BI file_service_type \fR=\fPstr
Defines how files to service are selected. The following types are defined:
Choose a file at random.
Round robin over open files (default).
The number of I/Os to issue before switching to a new file can be specified by
appending `:\fIint\fR' to the service type.
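.P
A sketch combining the file-count and file-selection parameters. The job name
and values are illustrative, and \fBrandom\fR is assumed to be the name of the
random service type.
.RS
.nf
[manyfiles]
rw=randread
; 64M divided between 8 files
size=64M
nrfiles=8
; keep at most 4 of the 8 files open at once
openfiles=4
; pick files at random, issuing 16 I/Os before switching files
file_service_type=random:16
.fi
.RE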
.BI ioengine \fR=\fPstr
Defines how the job issues I/O. The following types are defined:
Basic \fIread\fR\|(2) or \fIwrite\fR\|(2) I/O. \fIlseek\fR\|(2) is used to
position the I/O location.
Basic \fIpread\fR\|(2) or \fIpwrite\fR\|(2) I/O.
Basic \fIreadv\fR\|(2) or \fIwritev\fR\|(2) I/O. Will emulate queuing by
coalescing adjacent I/Os into a single submission.
Linux native asynchronous I/O.
glibc POSIX asynchronous I/O using \fIaio_read\fR\|(3) and \fIaio_write\fR\|(3).
File is memory mapped with \fImmap\fR\|(2) and data copied using
\fIsplice\fR\|(2) is used to transfer the data and \fIvmsplice\fR\|(2) to
transfer data from user-space to the kernel.
Use the syslet system calls to make regular read/write asynchronous.
SCSI generic sg v3 I/O. May be either synchronous using the SG_IO ioctl, or if
the target is an sg character device, we use \fIread\fR\|(2) and
\fIwrite\fR\|(2) for asynchronous I/O.
Doesn't transfer any data, just pretends to. Mainly used to exercise \fBfio\fR
itself and for debugging and testing purposes.
Transfer over the network. \fBfilename\fR must be set appropriately to
`\fIhost\fR/\fIport\fR' regardless of data direction. If receiving, only the
\fIport\fR argument is used.
Like \fBnet\fR, but uses \fIsplice\fR\|(2) and \fIvmsplice\fR\|(2) to map data
Doesn't transfer any data, but burns CPU cycles according to \fBcpuload\fR and
\fBcpuchunks\fR parameters.
The GUASI I/O engine is the Generic Userspace Asynchronous Syscall Interface
approach to asynchronous I/O.
See <http://www.xmailserver.org/guasi\-lib.html>.
Loads an external I/O engine object file. Append the engine filename as
.BI iodepth \fR=\fPint
Number of I/O units to keep in flight against the file. Default: 1.
.BI iodepth_batch \fR=\fPint
Number of I/Os to submit at once. Default: \fBiodepth\fR.
.BI iodepth_low \fR=\fPint
Low watermark indicating when to start filling the queue again. Default:
.BI direct \fR=\fPbool
If true, use non-buffered I/O (usually O_DIRECT). Default: false.
.BI buffered \fR=\fPbool
If true, use buffered I/O. This is the opposite of the \fBdirect\fR parameter.
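.P
A sketch of a queued, non-buffered job using the parameters above. The job name
and values are illustrative, and \fBlibaio\fR is assumed to be the name of the
Linux native asynchronous engine.
.RS
.nf
[queued]
ioengine=libaio
; non-buffered I/O
direct=1
rw=randread
bs=4k
size=1G
; keep up to 16 I/O units in flight
iodepth=16
; submit 4 I/Os at a time
iodepth_batch=4
; start refilling once the queue drains to 8
iodepth_low=8
.fi
.RE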
.BI offset \fR=\fPsiint
Offset in the file to start I/O. Data before the offset will not be touched.
How many I/Os to perform before issuing an \fBfsync\fR\|(2) of dirty data. If
0, don't sync. Default: 0.
.BI overwrite \fR=\fPbool
If writing, set up the file first and do overwrites. Default: false.
.BI end_fsync \fR=\fPbool
Sync file contents when job exits. Default: false.
.BI fsync_on_close \fR=\fPbool
If true, sync file contents on close. This differs from \fBend_fsync\fR in that
it will happen on every close, not just at the end of the job. Default: false.
.BI rwmixcycle \fR=\fPint
How many milliseconds before switching between reads and writes for a mixed
workload. Default: 500ms.
.BI rwmixread \fR=\fPint
Percentage of a mixed workload that should be reads. Default: 50.
.BI rwmixwrite \fR=\fPint
Percentage of a mixed workload that should be writes. If \fBrwmixread\fR and
\fBrwmixwrite\fR are given and do not sum to 100%, the latter of the two
overrides the first. Default: 50.
Normally \fBfio\fR will cover every block of the file when doing random I/O. If
this parameter is given, a new offset will be chosen without looking at past
I/O history. This parameter is mutually exclusive with \fBverify\fR.
Run job with given nice value. See \fInice\fR\|(2).
Set I/O priority value of this job between 0 (highest) and 7 (lowest). See
.BI prioclass \fR=\fPint
Set I/O priority class. See \fIionice\fR\|(1).
.BI thinktime \fR=\fPint
Stall job for given number of microseconds between issuing I/Os.
.BI thinktime_spin \fR=\fPint
Pretend to spend CPU time for given number of microseconds, sleeping the rest
of the time specified by \fBthinktime\fR. Only valid if \fBthinktime\fR is set.
.BI thinktime_blocks \fR=\fPint
Number of blocks to issue before waiting \fBthinktime\fR microseconds.
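.P
The three think-time parameters combine as in the following sketch (job name
and values are illustrative assumptions): after every 4 blocks the job stalls
for 1000 microseconds, of which 200 are spent spinning and the remaining
1000 \- 200 = 800 sleeping.
.RS
.nf
[paced]
rw=read
bs=4k
size=64M
; stall after every 4 blocks
thinktime_blocks=4
; total stall of 1000 microseconds
thinktime=1000
; 200 microseconds of the stall are spent burning CPU
thinktime_spin=200
.fi
.RE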
Cap bandwidth used by this job to this number of KiB/s.
.BI ratemin \fR=\fPint
Tell \fBfio\fR to do whatever it can to maintain at least the given bandwidth.
Failing to meet this requirement will cause the job to exit.
.BI rate_iops \fR=\fPint
Cap the bandwidth to this number of IOPS. If \fBblocksize\fR is a range, the
smallest block size is used as the metric.
.BI rate_iops_min \fR=\fPint
If this rate of I/O is not met, the job will exit.
.BI ratecycle \fR=\fPint
Average bandwidth for \fBrate\fR and \fBratemin\fR over this number of
milliseconds. Default: 1000ms.
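.P
A sketch of an IOPS-limited job; the job name and values are illustrative
assumptions. With a fixed 4k block size, a cap of 200 IOPS corresponds to
200 * 4 = 800 KiB/s.
.RS
.nf
[throttled]
rw=randwrite
bs=4k
size=256M
; cap at 200 IOPS
rate_iops=200
; exit the job if 50 IOPS cannot be met
rate_iops_min=50
.fi
.RE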
.BI cpumask \fR=\fPint
Set CPU affinity for this job. \fIint\fR is a bitmask of allowed CPUs the job
may run on. See \fBsched_setaffinity\fR\|(2).
.BI cpus_allowed \fR=\fPstr
Same as \fBcpumask\fR, but allows a comma-delimited list of CPU numbers.
.BI startdelay \fR=\fPint
Delay start of job for the specified number of seconds.
.BI runtime \fR=\fPint
Terminate processing after the specified number of seconds.
If given, run for the specified \fBruntime\fR duration even if the files are
completely read or written. The same workload will be repeated as many times
as \fBruntime\fR allows.
.BI invalidate \fR=\fPbool
Invalidate buffer-cache for the file prior to starting I/O. Default: true.
Use synchronous I/O for buffered writes. For the majority of I/O engines,
this means using O_SYNC. Default: false.
.BI iomem \fR=\fPstr "\fR,\fP mem" \fR=\fPstr
Allocation method for I/O unit buffer. Allowed values are:
Allocate memory with \fImalloc\fR\|(3).
Use shared memory buffers allocated through \fIshmget\fR\|(2).
Same as \fBshm\fR, but use huge pages as backing.
Use \fImmap\fR\|(2) for allocation. Uses anonymous memory unless a filename
is given after the option in the format `:\fIfile\fR'.
Same as \fBmmap\fR, but use huge files as backing.
The amount of memory allocated is the maximum allowed \fBblocksize\fR for the
job multiplied by \fBiodepth\fR. For \fBshmhuge\fR or \fBmmaphuge\fR to work,
the system must have free huge pages allocated. \fBmmaphuge\fR also needs to
have hugetlbfs mounted, and \fIfile\fR must point there.
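.P
As a worked example of the buffer-size rule, the sketch below uses 64k blocks
at a queue depth of 16, so 64k * 16 = 1024k = 1M of buffer memory is needed.
The job name, the values and the path /mnt/hugetlbfs/fio\-buf are hypothetical;
the path is assumed to lie on a mounted hugetlbfs, and \fBlibaio\fR is assumed
to be the name of the Linux native asynchronous engine.
.RS
.nf
[hugebuf]
ioengine=libaio
direct=1
rw=read
size=1G
bs=64k
iodepth=16
; buffers backed by a file on hugetlbfs
mem=mmaphuge:/mnt/hugetlbfs/fio-buf
.fi
.RE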
.BI hugepage\-size \fR=\fPsiint
Defines the size of a huge page. Must be at least equal to the system setting.
Should be a multiple of 1MiB. Default: 4MiB.
Terminate all jobs when one finishes. Default: wait for each job to finish.
.BI bwavgtime \fR=\fPint
Average bandwidth calculations over the given time in milliseconds. Default:
.BI create_serialize \fR=\fPbool
If true, serialize file creation for the jobs. Default: true.
.BI create_fsync \fR=\fPbool
Call \fIfsync\fR\|(2) on the data file after creation. Default: true.
.BI unlink \fR=\fPbool
Unlink job files when done. Default: false.
Specifies the number of iterations (runs of the same workload) of this job.
.BI do_verify \fR=\fPbool
Run the verify phase after a write phase. Only valid if \fBverify\fR is set.
.BI verify \fR=\fPstr
Method of verifying file contents after each iteration of the job. Allowed
values are:
.B md5 crc16 crc32 crc64 crc7 sha256 sha512
Store appropriate checksum in the header of each block.
Write extra information about each I/O (timestamp, block number, etc.). The
block number is verified.
Fill I/O buffers with a specific pattern that is used to verify. The pattern is
specified by appending `:\fIint\fR' to the parameter. \fIint\fR cannot be larger
Pretend to verify. Used for testing internals.
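.P
A sketch of a write job whose contents are checked with one of the checksum
methods listed above. The job name and values are illustrative assumptions.
.RS
.nf
[verified]
rw=write
bs=4k
size=64M
; store a crc32 checksum in the header of each block
verify=crc32
; run the verify phase after the write phase
do_verify=1
.fi
.RE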
.BI verify_sort \fR=\fPbool
If true, written verify blocks are sorted if \fBfio\fR deems it to be faster to
read them back in a sorted manner. Default: true.
.BI verify_offset \fR=\fPsiint
Swap the verification header with data somewhere else in the block before
writing. It is swapped back before verifying.
.BI verify_interval \fR=\fPsiint
Write the verification header for this number of bytes, which should divide
\fBblocksize\fR. Default: \fBblocksize\fR.
.BI verify_fatal \fR=\fPbool
If true, exit the job on the first observed verification failure. Default:
Wait for preceding jobs in the job file to exit before starting this one.
\fBstonewall\fR implies \fBnew_group\fR.
Start a new reporting group. If not given, all jobs in a file will be part
of the same reporting group, unless separated by a stonewall.
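.P
A sketch of two jobs run one after the other. The job names, values and the
path /tmp/fio.data are hypothetical, and \fBstonewall\fR is assumed to be
written as a bare keyword on its own line.
.RS
.nf
[fill]
rw=write
size=128M
filename=/tmp/fio.data

[readback]
; wait for the fill job to exit; this also starts a new reporting group
stonewall
rw=read
size=128M
filename=/tmp/fio.data
.fi
.RE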
.BI numjobs \fR=\fPint
Number of clones (processes/threads performing the same workload) of this job.
If set, display per-group reports instead of per-job when \fBnumjobs\fR is
specified.
Use threads created with \fBpthread_create\fR\|(3) instead of processes created
with \fBfork\fR\|(2).
.BI zonesize \fR=\fPsiint
Divide file into zones of the specified size in bytes. See \fBzoneskip\fR.
.BI zoneskip \fR=\fPsiint
Skip the specified number of bytes when \fBzonesize\fR bytes of data have been
.BI write_iolog \fR=\fPstr
Write the issued I/O patterns to the specified file.
.BI read_iolog \fR=\fPstr
Replay the I/O patterns contained in the specified file, which may be generated
by \fBwrite_iolog\fR or be a \fBblktrace\fR binary file.
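.P
A sketch of recording a pattern and replaying it later, shown as two separate
job files. The job names, values and the path /tmp/fio.iolog are hypothetical.
.RS
.nf
; first job file: record the issued I/O pattern
[record]
rw=randread
size=64M
write_iolog=/tmp/fio.iolog
.fi
.RE
.RS
.nf
; second job file: replay the recorded pattern
[replay]
read_iolog=/tmp/fio.iolog
.fi
.RE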
If given, write bandwidth logs of the jobs in this file.
Same as \fBwrite_bw_log\fR, but writes I/O completion latencies.
.BI lockmem \fR=\fPsiint
Pin the specified amount of memory with \fBmlock\fR\|(2). Can be used to
simulate a smaller amount of memory.
.BI exec_prerun \fR=\fPstr
Before running the job, execute the specified command with \fBsystem\fR\|(3).
.BI exec_postrun \fR=\fPstr
Same as \fBexec_prerun\fR, but the command is executed after the job completes.
.BI ioscheduler \fR=\fPstr
Attempt to switch the device hosting the file to the specified I/O scheduler.
.BI cpuload \fR=\fPint
If the job is a CPU cycle-eater, attempt to use the specified percentage of
CPU cycles.
.BI cpuchunks \fR=\fPint
If the job is a CPU cycle-eater, split the load into cycles of the
given time in milliseconds.
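.P
A sketch of a CPU cycle-eater job using the two parameters above. The job name
and values are illustrative, and \fBcpuio\fR is assumed to be the name of the
engine that burns CPU cycles instead of transferring data.
.RS
.nf
[burn]
ioengine=cpuio
; aim to use 50% of the CPU cycles
cpuload=50
; split the load into cycles of 100 milliseconds
cpuchunks=100
; stop after 30 seconds
runtime=30
.fi
.RE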
.BI disk_util \fR=\fPbool
Generate disk utilization statistics if the platform supports it. Default: true.
While running, \fBfio\fR will display the status of the created jobs. For
example:
.nf
Threads: 1: [_r] [24.8% done] [ 13509/ 8334 kb/s] [eta 00h:01m:31s]
.fi
The characters in the first set of brackets denote the current status of each
thread. The possible values are:
Setup but not started.
Initialized, waiting.
Running, doing sequential reads.
Running, doing random reads.
Running, doing sequential writes.
Running, doing random writes.
Running, doing mixed sequential reads/writes.
Running, doing mixed random reads/writes.
Running, currently waiting for \fBfsync\fR\|(2).
Running, verifying written data.
Exited, not reaped by main thread.
Exited, thread reaped.
The second set of brackets shows the estimated completion percentage of
the current group. The third set shows the read and write I/O rate,
respectively. Finally, the estimated run time of the job is displayed.
When \fBfio\fR completes (or is interrupted by Ctrl-C), it will show data
for each thread, each group of threads, and each disk, in that order.
Per-thread statistics first show the thread's client number, group-id, and
error code. The remaining figures are as follows:
Number of megabytes of I/O performed.
Average data rate (bandwidth).
Submission latency minimum, maximum, average and standard deviation. This is
the time it took to submit the I/O.
Completion latency minimum, maximum, average and standard deviation. This
is the time between submission and completion.
Bandwidth minimum, maximum, percentage of aggregate bandwidth received, average
and standard deviation.
CPU usage statistics. Includes user and system time, number of context switches
this thread went through and number of major and minor page faults.
Distribution of I/O depths. Each depth includes everything less than or equal
to it, but greater than the previous depth.
Number of read/write requests issued, and number of short read/write requests.
Distribution of I/O completion latencies. The numbers follow the same pattern
as the distribution of I/O depths.
The group statistics show:
Number of megabytes of I/O performed.
Aggregate bandwidth of threads in the group.
Minimum average bandwidth a thread saw.
Maximum average bandwidth a thread saw.
Shortest runtime of threads in the group.
Longest runtime of threads in the group.
Finally, disk statistics are printed with reads first:
Number of I/Os performed by all groups.
Number of merges in the I/O scheduler.
Number of ticks we kept the disk busy.
Total time spent in the disk queue.
If the \fB\-\-minimal\fR option is given, the results will be printed in a
semicolon-delimited format suitable for scripted use. The fields are:
.B jobname, groupid, error
.B KiB I/O, bandwidth \fR(KiB/s)\fP, runtime \fR(ms)\fP
.B min, max, mean, standard deviation
.B min, max, mean, standard deviation
.B min, max, aggregate percentage of total, mean, standard deviation
.B KiB I/O, bandwidth \fR(KiB/s)\fP, runtime \fR(ms)\fP
.B min, max, mean, standard deviation
.B min, max, mean, standard deviation
.B min, max, aggregate percentage of total, mean, standard deviation
.B user, system, context switches, major page faults, minor page faults
IO depth distribution:
.B <=1, 2, 4, 8, 16, 32, >=64
IO latency distribution (ms):
.B <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000, >=2000
\fBfio\fR was written by Jens Axboe <jens.axboe@oracle.com>.
This man page was written by Aaron Carroll <aaronc@cse.unsw.edu.au> based
on documentation by Jens Axboe.
Report bugs to the \fBfio\fR mailing list <fio-devel@kernel.dk>.
For further documentation see \fBHOWTO\fR and \fBREADME\fR.
Sample jobfiles are available in the \fBexamples\fR directory.