.TH fio 1 "August 2017" "User Manual"
.SH NAME
fio \- flexible I/O tester
.SH SYNOPSIS
.B fio
[\fIoptions\fR] [\fIjobfile\fR]...
.SH DESCRIPTION
.B fio
is a tool that will spawn a number of threads or processes doing a
particular type of I/O action as specified by the user.
The typical use of fio is to write a job file matching the I/O load
one wants to simulate.
.SH OPTIONS
.TP
.BI \-\-debug \fR=\fPtype
Enable verbose tracing \fItype\fR of various fio actions. May be `all' for all \fItype\fRs
or individual types separated by a comma (e.g. `\-\-debug=file,mem' will enable
file and memory debugging). `help' will list all available tracing options.
.TP
.BI \-\-parse\-only
Parse options only, don't start any I/O.
.TP
.BI \-\-output \fR=\fPfilename
Write output to \fIfilename\fR.
.TP
.BI \-\-output\-format \fR=\fPformat
Set the reporting \fIformat\fR to `normal', `terse', `json', or
`json+'. Multiple formats can be selected, separated by a comma. `terse'
is a CSV based format. `json+' is like `json', except it adds a full
dump of the latency buckets.
.TP
.BI \-\-runtime \fR=\fPruntime
Limit run time to \fIruntime\fR seconds.
.TP
.BI \-\-bandwidth\-log
Generate aggregate bandwidth logs.
.TP
.BI \-\-minimal
Print statistics in a terse, semicolon\-delimited format.
.TP
.BI \-\-append\-terse
Print statistics in selected mode AND terse, semicolon\-delimited format.
\fBDeprecated\fR, use \fB\-\-output\-format\fR instead to select multiple formats.
.TP
.BI \-\-terse\-version \fR=\fPversion
Set terse \fIversion\fR output format (default `3', or `2', `4', `5').
.TP
.BI \-\-version
Print version information and exit.
.TP
.BI \-\-help
Print a summary of the command line options and exit.
.TP
.BI \-\-cpuclock\-test
Perform test and validation of internal CPU clock.
.TP
.BI \-\-crctest \fR=\fP[test]
Test the speed of the built\-in checksumming functions. If no argument is given,
all of them are tested. Alternatively, a comma separated list can be passed, in which
case the given ones are tested.
.TP
.BI \-\-cmdhelp \fR=\fPcommand
Print help information for \fIcommand\fR. May be `all' for all commands.
.TP
.BI \-\-enghelp \fR=\fP[ioengine[,command]]
List all commands defined by \fIioengine\fR, or print help for \fIcommand\fR
defined by \fIioengine\fR. If no \fIioengine\fR is given, list all
available ioengines.
.TP
.BI \-\-showcmd \fR=\fPjobfile
Convert \fIjobfile\fR to a set of command\-line options.
.TP
.BI \-\-readonly
Turn on safety read\-only checks, preventing writes. The \fB\-\-readonly\fR
option is an extra safety guard to prevent users from accidentally starting
a write workload when that is not desired. Fio will only write if
`rw=write/randwrite/rw/randrw' is given. This extra safety net can be used
as an extra precaution as \fB\-\-readonly\fR will also enable a write check in
the I/O engine core to prevent writes due to unknown user space bug(s).
.TP
.BI \-\-eta \fR=\fPwhen
Specifies when real\-time ETA estimate should be printed. \fIwhen\fR may
be `always', `never' or `auto'.
.TP
.BI \-\-eta\-newline \fR=\fPtime
Force a new line for every \fItime\fR period passed. When the unit is omitted,
the value is interpreted in seconds.
.TP
.BI \-\-status\-interval \fR=\fPtime
Force full status dump every \fItime\fR period passed. When the unit is omitted,
the value is interpreted in seconds.
.TP
.BI \-\-section \fR=\fPname
Only run specified section \fIname\fR in job file. Multiple sections can be specified.
The \fB\-\-section\fR option allows one to combine related jobs into one file.
E.g. one job file could define light, moderate, and heavy sections. Tell
fio to run only the "heavy" section by giving `\-\-section=heavy'
command line option. One can also specify the "write" operations in one
section and "verify" operation in another section. The \fB\-\-section\fR option
only applies to job sections. The reserved *global* section is always
parsed and used.
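.RS
.P
As a rough sketch of the layout described above (the file name `jobs.fio' and
the section names and values are illustrative assumptions, not fio defaults), a
job file with several sections might look like this:
.P
.nf
; jobs.fio (hypothetical file name)
[global]
rw=randread
size=128m

[light]
numjobs=1

[heavy]
numjobs=8
.fi
.P
Running `fio \-\-section=heavy jobs.fio' would then be expected to run only
the "heavy" job, with the *global* defaults still applied.
.RE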
.TP
.BI \-\-alloc\-size \fR=\fPkb
Set the internal smalloc pool size to \fIkb\fR in KiB. The
\fB\-\-alloc\-size\fR switch allows one to use a larger pool size for smalloc.
If running large jobs with randommap enabled, fio can run out of memory.
Smalloc is an internal allocator for shared structures from a fixed size
memory pool and can grow to 16 pools. The pool size defaults to 16MiB.
NOTE: While running, `.fio_smalloc.*' backing store files are visible
in `/tmp'.
.TP
.BI \-\-warnings\-fatal
All fio parser warnings are fatal, causing fio to exit with an error.
.TP
.BI \-\-max\-jobs \fR=\fPnr
Set the maximum number of threads/processes to support to \fInr\fR.
.TP
.BI \-\-server \fR=\fPargs
Start a backend server, with \fIargs\fR specifying what to listen to.
See \fBCLIENT/SERVER\fR section.
.TP
.BI \-\-daemonize \fR=\fPpidfile
Background a fio server, writing the pid to the given \fIpidfile\fR file.
.TP
.BI \-\-client \fR=\fPhostname
Instead of running the jobs locally, send and run them on the given \fIhostname\fR
or set of \fIhostname\fRs. See \fBCLIENT/SERVER\fR section.
.TP
.BI \-\-remote\-config \fR=\fPfile
Tell fio server to load this local \fIfile\fR.
.TP
.BI \-\-idle\-prof \fR=\fPoption
Report CPU idleness. \fIoption\fR is one of the following:
.RS
.RS
.TP
.B calibrate
Run unit work calibration only and exit.
.TP
.B system
Show aggregate system idleness and unit work.
.TP
.B percpu
As \fBsystem\fR but also show per CPU idleness.
.RE
.RE
.TP
.BI \-\-inflate\-log \fR=\fPlog
Inflate and output compressed \fIlog\fR.
.TP
.BI \-\-trigger\-file \fR=\fPfile
Execute trigger command when \fIfile\fR exists.
.TP
.BI \-\-trigger\-timeout \fR=\fPtime
Execute trigger at this \fItime\fR.
.TP
.BI \-\-trigger \fR=\fPcommand
Set this \fIcommand\fR as local trigger.
.TP
.BI \-\-trigger\-remote \fR=\fPcommand
Set this \fIcommand\fR as remote trigger.
.TP
.BI \-\-aux\-path \fR=\fPpath
Use this \fIpath\fR for fio state generated files.
.SH "JOB FILE FORMAT"
Any parameters following the options will be assumed to be job files, unless
they match a job file parameter. Multiple job files can be listed and each job
file will be regarded as a separate group. Fio will \fBstonewall\fR execution
between each group.
.P
Fio accepts one or more job files describing what it is
supposed to do. The job file format is the classic ini file, where the names
enclosed in [] brackets define the job name. You are free to use any ASCII name
you want, except *global* which has special meaning. Following the job name is
a sequence of zero or more parameters, one per line, that define the behavior of
the job. If the first character in a line is a ';' or a '#', the entire line is
discarded as a comment.
.P
A *global* section sets defaults for the jobs described in that file. A job may
override a *global* section parameter, and a job file may even have several
*global* sections if so desired. A job is only affected by a *global* section
residing above it.
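.P
A minimal sketch of these rules, with made\-up job names and values chosen
only for illustration:
.RS
.P
.nf
; lines starting with ';' or '#' are comments
[global]
rw=read
size=64m

# inherits rw=read and size=64m from the global section above
[job1]

# overrides the global rw setting for this job only
[job2]
rw=randwrite
.fi
.RE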
.P
The \fB\-\-cmdhelp\fR option also lists all options. If used with a \fIcommand\fR
argument, \fB\-\-cmdhelp\fR will detail the given \fIcommand\fR.
.P
See the `examples/' directory for inspiration on how to write job files. Note
the copyright and license requirements currently apply to
`examples/' files.
.SH "JOB FILE PARAMETERS"
Some parameters take an option of a given type, such as an integer or a
string. Anywhere a numeric value is required, an arithmetic expression may be
used, provided it is surrounded by parentheses. Supported operators are:
.RS
.P
.B addition (+)
.P
.B subtraction (\-)
.P
.B multiplication (*)
.P
.B division (/)
.P
.B modulus (%)
.P
.B exponentiation (^)
.RE
.P
For time values in expressions, units are microseconds by default. This is
different than for time values not in expressions (not enclosed in
parentheses).
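.P
For illustration, a hypothetical job using expressions might look as follows.
The job name and values are arbitrary, and the \fBruntime\fR line assumes the
microsecond default described above, so it should amount to 30 seconds:
.RS
.P
.nf
[expr_example]
rw=read
; 4 * 1024 = 4096 bytes
bs=(4*1024)
; 64 * 1024 * 1024 bytes = 64 MiB
size=(64*1024*1024)
; 30 * 1000000 microseconds
runtime=(30*1000000)
.fi
.RE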
.SH "PARAMETER TYPES"
The following parameter types are used.
.TP
.I str
String. A sequence of alphanumeric characters.
.TP
.I time
Integer with possible time suffix. Without a unit, the value is interpreted as
seconds unless otherwise specified. Accepts a suffix of 'd' for days, 'h' for
hours, 'm' for minutes, 's' for seconds, 'ms' (or 'msec') for milliseconds and 'us'
(or 'usec') for microseconds. For example, use 10m for 10 minutes.
.TP
.I int
Integer. A whole number value, which may contain an integer prefix
and an integer suffix.
.RS
.RS
.P
[*integer prefix*] **number** [*integer suffix*]
.RE
.P
The optional *integer prefix* specifies the number's base. The default
is decimal. *0x* specifies hexadecimal.
.P
The optional *integer suffix* specifies the number's units, and includes an
optional unit prefix and an optional unit. For quantities of data, the
default unit is bytes. For quantities of time, the default unit is seconds
unless otherwise specified.
.P
With `kb_base=1000', fio follows international standards for unit
prefixes. To specify power\-of\-10 decimal values defined in the
International System of Units (SI):
.RS
.P
.PD 0
K means kilo (K) or 1000
.P
M means mega (M) or 1000**2
.P
G means giga (G) or 1000**3
.P
T means tera (T) or 1000**4
.P
P means peta (P) or 1000**5
.PD
.RE
.P
To specify power\-of\-2 binary values defined in IEC 80000\-13:
.RS
.P
.PD 0
Ki means kibi (Ki) or 1024
.P
Mi means mebi (Mi) or 1024**2
.P
Gi means gibi (Gi) or 1024**3
.P
Ti means tebi (Ti) or 1024**4
.P
Pi means pebi (Pi) or 1024**5
.PD
.RE
.P
With `kb_base=1024' (the default), the unit prefixes are opposite
from those specified in the SI and IEC 80000\-13 standards to provide
compatibility with old scripts. For example, 4k means 4096.
.P
For quantities of data, an optional unit of 'B' may be included
(e.g., 'kB' is the same as 'k').
.P
The *integer suffix* is not case sensitive (e.g., m/mi mean mebi/mega,
not milli). 'b' and 'B' both mean byte, not bit.
.P
Examples with `kb_base=1000':
.RS
.P
.PD 0
4 KiB: 4096, 4096b, 4096B, 4ki, 4kib, 4kiB, 4Ki, 4KiB
.P
1 MiB: 1048576, 1mi, 1024ki
.P
1 MB: 1000000, 1m, 1000k
.P
1 TiB: 1099511627776, 1ti, 1024gi, 1048576mi
.P
1 TB: 1000000000000, 1t, 1000g, 1000000m
.PD
.RE
.P
Examples with `kb_base=1024' (default):
.RS
.P
.PD 0
4 KiB: 4096, 4096b, 4096B, 4k, 4kb, 4kB, 4K, 4KB
.P
1 MiB: 1048576, 1m, 1024k
.P
1 MB: 1000000, 1mi, 1000ki
.P
1 TiB: 1099511627776, 1t, 1024g, 1048576m
.P
1 TB: 1000000000000, 1ti, 1000gi, 1000000mi
.PD
.RE
.P
To specify times (units are not case sensitive):
.RS
.P
.PD 0
D means days
.P
H means hours
.P
M means minutes
.P
s or sec means seconds (default)
.P
ms or msec means milliseconds
.P
us or usec means microseconds
.PD
.RE
.P
If the option accepts an upper and lower range, use a colon ':' or
minus '\-' to separate such values. See \fIirange\fR parameter type.
If the lower value specified happens to be larger than the upper value,
the two values are swapped.
.RE
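.RS
.P
A short sketch of how these prefixes and suffixes could appear in a job file
under the default `kb_base=1024'. The job name and values are arbitrary
examples:
.P
.nf
[suffix_example]
rw=read
; 4k is 4096 bytes in the default compatibility mode
bs=4k
; 0x prefix selects hexadecimal: 0x100000 is 1048576 bytes
size=0x100000
; time value with a unit suffix: 10 minutes
runtime=10m
.fi
.RE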
.TP
.I bool
Boolean. Usually parsed as an integer, however only defined for
true and false (1 and 0).
.TP
.I irange
Integer range with suffix. Allows value range to be given, such as
1024\-4096. A colon may also be used as the separator, e.g. 1k:4k. If the
option allows two sets of ranges, they can be specified with a ',' or '/'
delimiter: 1k\-4k/8k\-32k. Also see \fIint\fR parameter type.
.TP
.I float_list
A list of floating point numbers, separated by a ':' character.
.SH "JOB PARAMETERS"
With the above in mind, here follows the complete list of fio job parameters.
.SS "Units"
.TP
.BI kb_base \fR=\fPint
Select the interpretation of unit prefixes in input parameters.
.RS
.RS
.TP
.B 1000
Inputs comply with IEC 80000\-13 and the International
System of Units (SI). Use:
.RS
.P
.PD 0
\- power\-of\-2 values with IEC prefixes (e.g., KiB)
.P
\- power\-of\-10 values with SI prefixes (e.g., kB)
.PD
.RE
.TP
.B 1024
Compatibility mode (default). To avoid breaking old scripts:
.P
.RS
.PD 0
\- power\-of\-2 values with SI prefixes
.P
\- power\-of\-10 values with IEC prefixes
.PD
.RE
.RE
.P
See \fBbs\fR for more details on input parameters.
.P
Outputs always use correct prefixes. Most outputs include both
side\-by\-side, like:
.P
.RS
bw=2383.3kB/s (2327.4KiB/s)
.RE
.P
If only one value is reported, then kb_base selects the one to use:
.P
.RS
.PD 0
1000 \-\- SI prefixes
.P
1024 \-\- IEC prefixes
.PD
.RE
.RE
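.RS
.P
As an illustrative sketch (job name and sizes are arbitrary), quantities
written for `kb_base=1000' would follow the standard prefix meanings:
.P
.nf
[si_units]
kb_base=1000
rw=read
; IEC prefix, power of 2: 4 * 1024 = 4096 bytes
bs=4ki
; SI prefix, power of 10: 100 * 1000**2 = 100000000 bytes
size=100m
.fi
.RE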
.TP
.BI unit_base \fR=\fPint
Base unit for reporting. Allowed values are:
.RS
.RS
.TP
.B 0
Use auto\-detection (default).
.TP
.B 8
Byte based.
.TP
.B 1
Bit based.
.RE
.RE
.SS "Job description"
.TP
.BI name \fR=\fPstr
ASCII name of the job. This may be used to override the name printed by fio
for this job. Otherwise the job name is used. On the command line this
parameter has the special purpose of also signaling the start of a new job.
.TP
.BI description \fR=\fPstr
Text description of the job. Doesn't do anything except dump this text
description when this job is run. It's not parsed.
.TP
.BI loops \fR=\fPint
Run the specified number of iterations of this job. Used to repeat the same
workload a given number of times. Defaults to 1.
.TP
.BI numjobs \fR=\fPint
Create the specified number of clones of this job. Each clone of the job
is spawned as an independent thread or process. May be used to set up a
larger number of threads/processes doing the same thing. Each thread is
reported separately; to see statistics for all clones as a whole, use
\fBgroup_reporting\fR in conjunction with \fBnew_group\fR.
See \fB\-\-max\-jobs\fR. Default: 1.
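.RS
.P
A minimal sketch (job name and values are illustrative) that creates four
clones and asks for combined statistics via \fBgroup_reporting\fR:
.P
.nf
[clones]
rw=randread
size=256m
numjobs=4
group_reporting
.fi
.RE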
.SS "Time related parameters"
.TP
.BI runtime \fR=\fPtime
Tell fio to terminate processing after the specified period of time. It
can be quite hard to determine for how long a specified job will run, so
this parameter is handy to cap the total runtime to a given time. When
the unit is omitted, the value is interpreted in seconds.
.TP
.BI time_based
If set, fio will run for the duration of the \fBruntime\fR specified
even if the file(s) are completely read or written. It will simply loop over
the same workload as many times as the \fBruntime\fR allows.
.TP
.BI startdelay \fR=\fPirange(int)
Delay the start of job for the specified amount of time. Can be a single
value or a range. When given as a range, each thread will choose a value
randomly from within the range. Value is in seconds if a unit is omitted.
.TP
.BI ramp_time \fR=\fPtime
If set, fio will run the specified workload for this amount of time before
logging any performance numbers. Useful for letting performance settle
before logging results, thus minimizing the runtime required for stable
results. Note that the \fBramp_time\fR is considered lead\-in time for a job,
thus it will increase the total runtime if a special timeout or
\fBruntime\fR is specified. When the unit is omitted, the value is
given in seconds.
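.RS
.P
The time related options above can be combined. In this sketch (job name and
values are arbitrary), the job would be expected to run for 10 seconds without
logging performance numbers, then continue for the 60 second \fBruntime\fR,
looping over its 64m file as often as needed:
.P
.nf
[timed]
rw=randread
size=64m
runtime=60
time_based
ramp_time=10
.fi
.RE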
.TP
.BI clocksource \fR=\fPstr
Use the given clocksource as the base of timing. The supported options are:
.RS
.RS
.TP
.B gettimeofday
\fBgettimeofday\fR\|(2)
.TP
.B clock_gettime
\fBclock_gettime\fR\|(2)
.TP
.B cpu
Internal CPU clock source
.RE
.P
\fBcpu\fR is the preferred clocksource if it is reliable, as it is very fast (and
fio is heavy on time calls). Fio will automatically use this clocksource if
it's supported and considered reliable on the system it is running on,
unless another clocksource is specifically set. For x86/x86\-64 CPUs, this
means supporting TSC Invariant.
.RE
.TP
.BI gtod_reduce \fR=\fPbool
Enable all of the \fBgettimeofday\fR\|(2) reducing options
(\fBdisable_clat\fR, \fBdisable_slat\fR, \fBdisable_bw_measurement\fR) plus
reduce precision of the timeout somewhat to really shrink the
\fBgettimeofday\fR\|(2) call count. With this option enabled, we only do
about 0.4% of the \fBgettimeofday\fR\|(2) calls we would have done if all
time keeping was enabled.
.TP
.BI gtod_cpu \fR=\fPint
Sometimes it's cheaper to dedicate a single thread of execution to just
getting the current time. Fio (and databases, for instance) are very
intensive on \fBgettimeofday\fR\|(2) calls. With this option, you can set
one CPU aside for doing nothing but logging current time to a shared memory
location. Then the other threads/processes that run I/O workloads need only
copy that segment, instead of entering the kernel with a
\fBgettimeofday\fR\|(2) call. The CPU set aside for doing these time
calls will be excluded from other uses. Fio will manually clear it from the
CPU mask of other jobs.
.SS "Target file/device"
.TP
.BI directory \fR=\fPstr
Prefix \fBfilename\fRs with this directory. Used to place files in a different
location than `./'. You can specify a number of directories by
separating the names with a ':' character. These directories will be
distributed equally among the job clones created by \fBnumjobs\fR as
long as they are using generated filenames. If specific \fBfilename\fR(s) are
set, fio will use the first listed directory, thereby matching the
\fBfilename\fR semantic, which generates a file for each clone if not specified
but lets all clones use the same file if set.
.RS
.P
See the \fBfilename\fR option for information on how to escape ':' and '\\'
characters within the directory path itself.
.RE
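.RS
.P
A sketch of spreading clones over several directories. The mount points
`/mnt/a' and `/mnt/b' are hypothetical and the other values are arbitrary:
.P
.nf
[spread]
directory=/mnt/a:/mnt/b
rw=write
size=1g
numjobs=2
.fi
.P
Because no \fBfilename\fR is set, each clone uses a generated file name, so
the two directories would be shared out between the two clones.
.RE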
.TP
.BI filename \fR=\fPstr
Fio normally makes up a \fBfilename\fR based on the job name, thread number, and
file number (see \fBfilename_format\fR). If you want to share files
between threads in a job or several
jobs with fixed file paths, specify a \fBfilename\fR for each of them to override
the default. If the ioengine is file based, you can specify a number of files
by separating the names with a ':' character. So if you wanted a job to open
`/dev/sda' and `/dev/sdb' as the two working files, you would use
`filename=/dev/sda:/dev/sdb'. This also means that whenever this option is
specified, \fBnrfiles\fR is ignored. The size of regular files specified
by this option will be \fBsize\fR divided by number of files unless an
explicit size is specified by \fBfilesize\fR.
.RS
.P
Each colon and backslash in the wanted path must be escaped with a '\\'
character. For instance, if the path is `/dev/dsk/foo@3,0:c' then you
would use `filename=/dev/dsk/foo@3,0\\:c' and if the path is
`F:\\filename' then you would use `filename=F\\:\\\\filename'.
.P
On Windows, disk devices are accessed as `\\\\\\\\.\\\\PhysicalDrive0' for
the first device, `\\\\\\\\.\\\\PhysicalDrive1' for the second etc.
Note: Windows and FreeBSD prevent write access to areas
of the disk containing in\-use data (e.g. filesystems).
.P
The filename `\-' is a reserved name, meaning *stdin* or *stdout*. Which
of the two depends on the read/write direction set.
.RE
.TP
.BI filename_format \fR=\fPstr
If sharing multiple files between jobs, it is usually necessary to have fio
generate the exact names that you want. By default, fio will name a file
based on the default file format specification of
`jobname.jobnumber.filenumber'. With this option, that can be
customized. Fio will recognize and replace the following keywords in this
string:
.RS
.RS
.TP
.B $jobname
The name of the worker thread or process.
.TP
.B $jobnum
The incremental number of the worker thread or process.
.TP
.B $filenum
The incremental number of the file for that worker thread or process.
.RE
.P
To have dependent jobs share a set of files, this option can be set to have
fio generate filenames that are shared between the two. For instance, if
`testfiles.$filenum' is specified, file number 4 for any job will be
named `testfiles.4'. The default of `$jobname.$jobnum.$filenum'
will be used if no other format specifier is given.
.RE
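.RS
.P
A sketch of two dependent jobs sharing one set of files. The job names, sizes,
and the use of \fBstonewall\fR to run the jobs one after the other are
illustrative assumptions:
.P
.nf
[global]
filename_format=testfiles.$filenum
nrfiles=4
size=64m

[writer]
rw=write

[reader]
stonewall
rw=read
.fi
.P
Both jobs should then resolve a given file number to the same
`testfiles.<number>' name.
.RE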
.TP
.BI unique_filename \fR=\fPbool
To avoid collisions between networked clients, fio defaults to prefixing any
generated filenames (with a directory specified) with the source of the
client connecting. To disable this behavior, set this option to 0.
.TP
.BI opendir \fR=\fPstr
Recursively open any files below directory \fIstr\fR.
.TP
.BI lockfile \fR=\fPstr
Fio defaults to not locking any files before it does I/O to them. If a file
or file descriptor is shared, fio can serialize I/O to that file to make the
end result consistent. This is useful for emulating real workloads that share
files. The lock modes are:
.RS
.RS
.TP
.B none
No locking. The default.
.TP
.B exclusive
Only one thread or process may do I/O at a time, excluding all others.
.TP
.B readwrite
Read\-write locking on the file. Many readers may
access the file at the same time, but writes get exclusive access.
.RE
.RE
.TP
.BI nrfiles \fR=\fPint
Number of files to use for this job. Defaults to 1. The size of files
will be \fBsize\fR divided by this unless explicit size is specified by
\fBfilesize\fR. Files are created for each thread separately, and each
file will have a file number within its name by default, as explained in
\fBfilename\fR section.
.TP
.BI openfiles \fR=\fPint
Number of files to keep open at the same time. Defaults to the same as
\fBnrfiles\fR, can be set smaller to limit the number of simultaneous
opens.
.TP
.BI file_service_type \fR=\fPstr
Defines how fio decides which file from a job to service next. The following
types are defined:
.RS
.RS
.TP
.B random
Choose a file at random.
.TP
.B roundrobin
Round robin over opened files. This is the default.
.TP
.B sequential
Finish one file before moving on to the next. Multiple files can
still be open depending on \fBopenfiles\fR.
.TP
.B zipf
Use a Zipf distribution to decide what file to access.
.TP
.B pareto
Use a Pareto distribution to decide what file to access.
.TP
.B normal
Use a Gaussian (normal) distribution to decide what file to access.
.TP
.B gauss
Alias for normal.
.RE
.P
For \fBrandom\fR, \fBroundrobin\fR, and \fBsequential\fR, a postfix can be appended to
tell fio how many I/Os to issue before switching to a new file. For example,
specifying `file_service_type=random:8' would cause fio to issue
8 I/Os before selecting a new file at random. For the non\-uniform
distributions, a floating point postfix can be given to influence how the
distribution is skewed. See \fBrandom_distribution\fR for a description
of how that would work.
.RE
.TP
.BI ioscheduler \fR=\fPstr
Attempt to switch the device hosting the file to the specified I/O scheduler
before running.
.TP
.BI create_serialize \fR=\fPbool
If true, serialize the file creation for the jobs. This may be handy to
avoid interleaving of data files, which may greatly depend on the filesystem
used and even the number of processors in the system. Default: true.
.TP
.BI create_fsync \fR=\fPbool
\fBfsync\fR\|(2) the data file after creation. This is the default.
.TP
.BI create_on_open \fR=\fPbool
If true, don't pre\-create files but allow the job's open() to create a file
when it's time to do I/O. Default: false \-\- pre\-create all necessary files
when the job starts.
.TP
.BI create_only \fR=\fPbool
If true, fio will only run the setup phase of the job. If files need to be
laid out or updated on disk, only that will be done \-\- the actual job contents
are not executed. Default: false.
.TP
.BI allow_file_create \fR=\fPbool
If true, fio is permitted to create files as part of its workload. If this
option is false, then fio will error out if
the files it needs to use don't already exist. Default: true.
.TP
.BI allow_mounted_write \fR=\fPbool
If this isn't set, fio will abort jobs that are destructive (e.g. that write)
to what appears to be a mounted device or partition. This helps catch
inadvertently destructive tests, where the user has not realized that the test
will destroy data on the mounted file system. Note that some platforms don't allow
writing against a mounted device regardless of this option. Default: false.
.TP
.BI pre_read \fR=\fPbool
If this is given, files will be pre\-read into memory before starting the
given I/O operation. This will also clear the \fBinvalidate\fR flag,
since it is pointless to pre\-read and then drop the cache. This will only
work for I/O engines that are seek\-able, since they allow you to read the
same data multiple times. Thus it will not work on non\-seekable I/O engines
(e.g. network, splice). Default: false.
.TP
.BI unlink \fR=\fPbool
Unlink the job files when done. Not the default, as repeated runs of that
job would then waste time recreating the file set again and again. Default:
false.
.TP
.BI unlink_each_loop \fR=\fPbool
Unlink job files after each iteration or loop. Default: false.
.TP
.BI zonesize \fR=\fPint
Divide a file into zones of the specified size. See \fBzoneskip\fR.
.TP
.BI zonerange \fR=\fPint
Give size of an I/O zone. See \fBzoneskip\fR.
.TP
.BI zoneskip \fR=\fPint
Skip the specified number of bytes when \fBzonesize\fR data has been
read. The two zone options can be used to only do I/O on zones of a file.
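.RS
.P
As a sketch of the zone options (job name and sizes are arbitrary), the
following job would be expected to read 16m, skip the next 48m, read another
16m, and so on through the file:
.P
.nf
[zoned]
rw=read
size=1g
zonesize=16m
zoneskip=48m
.fi
.RE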
.SS "I/O type"
.TP
.BI direct \fR=\fPbool
If value is true, use non\-buffered I/O. This is usually O_DIRECT. Note that
ZFS on Solaris doesn't support direct I/O. On Windows the synchronous
ioengines don't support direct I/O. Default: false.
.TP
.BI atomic \fR=\fPbool
If value is true, attempt to use atomic direct I/O. Atomic writes are
guaranteed to be stable once acknowledged by the operating system. Only
Linux supports O_ATOMIC right now.
.TP
.BI buffered \fR=\fPbool
If value is true, use buffered I/O. This is the opposite of the
\fBdirect\fR option. Defaults to true.
.TP
.BI readwrite \fR=\fPstr "\fR,\fP rw" \fR=\fPstr
Type of I/O pattern. Accepted values are:
.RS
.RS
.TP
.B read
Sequential reads.
.TP
.B write
Sequential writes.
.TP
.B trim
Sequential trims (Linux block devices only).
.TP
.B randread
Random reads.
.TP
.B randwrite
Random writes.
.TP
.B randtrim
Random trims (Linux block devices only).
.TP
.B rw,readwrite
Sequential mixed reads and writes.
.TP
.B randrw
Random mixed reads and writes.
.TP
.B trimwrite
Sequential trim+write sequences. Blocks will be trimmed first,
then the same blocks will be written to.
.RE
.P
Fio defaults to read if the option is not specified. For the mixed I/O
types, the default is to split them 50/50. For certain types of I/O the
result may still be skewed a bit, since the speed may be different.
.P
It is possible to specify the number of I/Os to do before getting a new
offset by appending `:<nr>' to the end of the string given. For a
random read, it would look like `rw=randread:8' for passing in an offset
modifier with a value of 8. If the suffix is used with a sequential I/O
pattern, then the `<nr>' value specified will be added to the generated
offset for each I/O turning sequential I/O into sequential I/O with holes.
For instance, using `rw=write:4k' will skip 4k for every write. Also see
the \fBrw_sequencer\fR option.
.RE
782.TP
38dad62d 783.BI rw_sequencer \fR=\fPstr
523bad63
TK
784If an offset modifier is given by appending a number to the `rw=\fIstr\fR'
785line, then this option controls how that number modifies the I/O offset
786being generated. Accepted values are:
38dad62d
JA
787.RS
788.RS
789.TP
790.B sequential
523bad63 791Generate sequential offset.
38dad62d
JA
792.TP
793.B identical
523bad63 794Generate the same offset.
38dad62d
JA
795.RE
796.P
523bad63
TK
797\fBsequential\fR is only useful for random I/O, where fio would normally
798generate a new random offset for every I/O. If you append e.g. 8 to randread,
799you would get a new random offset for every 8 I/Os. The result would be a
800seek for only every 8 I/Os, instead of for every I/O. Use `rw=randread:8'
801to specify that. As sequential I/O is already sequential, setting
802\fBsequential\fR for that would not result in any differences. \fBidentical\fR
803behaves in a similar fashion, except it sends the same offset 8 times
804before generating a new offset.
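.P
As a minimal sketch of the two modes, the following job file defines one job
per mode. The job names, the `filename' path, and the `size' value are
illustrative assumptions, not defaults:
.RS
.P
.nf
; new random offset every 8 I/Os, sequential in between
[seq_runs]
filename=/tmp/fio.test
size=64m
rw=randread:8
rw_sequencer=sequential

; same random offset issued 8 times before moving on
[same_offset]
filename=/tmp/fio.test
size=64m
rw=randread:8
rw_sequencer=identical
.fi
.RE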
38dad62d 805.RE
90fef2d1 806.TP
771e58be
JA
807.BI unified_rw_reporting \fR=\fPbool
808Fio normally reports statistics on a per data direction basis, meaning that
523bad63
TK
809reads, writes, and trims are accounted and reported separately. If this
810option is set, fio sums the results and reports them as "mixed" instead.
771e58be 811.TP
d60e92d1 812.BI randrepeat \fR=\fPbool
523bad63
TK
813Seed the random number generator used for random I/O patterns in a
814predictable way so the pattern is repeatable across runs. Default: true.
56e2a5fc
CE
815.TP
816.BI allrandrepeat \fR=\fPbool
817Seed all random number generators in a predictable way so results are
523bad63 818repeatable across runs. Default: false.
d60e92d1 819.TP
04778baf
JA
820.BI randseed \fR=\fPint
821Seed the random number generators based on this seed value, to be able to
822control what sequence of output is being generated. If not set, the random
823sequence depends on the \fBrandrepeat\fR setting.
824.TP
a596f047 825.BI fallocate \fR=\fPstr
523bad63
TK
826Whether pre\-allocation is performed when laying down files.
827Accepted values are:
a596f047
EG
828.RS
829.RS
830.TP
831.B none
523bad63 832Do not pre\-allocate space.
a596f047 833.TP
2c3e17be 834.B native
523bad63
TK
835Use a platform's native pre\-allocation call but fall back to
836\fBnone\fR behavior if it fails/is not implemented.
2c3e17be 837.TP
a596f047 838.B posix
523bad63 839Pre\-allocate via \fBposix_fallocate\fR\|(3).
a596f047
EG
840.TP
841.B keep
523bad63
TK
842Pre\-allocate via \fBfallocate\fR\|(2) with
843FALLOC_FL_KEEP_SIZE set.
a596f047
EG
844.TP
845.B 0
523bad63 846Backward\-compatible alias for \fBnone\fR.
a596f047
EG
847.TP
848.B 1
523bad63 849Backward\-compatible alias for \fBposix\fR.
a596f047
EG
850.RE
851.P
523bad63
TK
852May not be available on all supported platforms. \fBkeep\fR is only available
853on Linux. If using ZFS on Solaris this cannot be set to \fBposix\fR
854because ZFS doesn't support pre\-allocation. Default: \fBnative\fR if any
855pre\-allocation methods are available, \fBnone\fR if not.
a596f047 856.RE
7bc8c2cf 857.TP
ecb2083d 858.BI fadvise_hint \fR=\fPstr
cf145d90 859Use \fBposix_fadvise\fR\|(2) to advise the kernel what I/O patterns
ecb2083d
JA
860are likely to be issued. Accepted values are:
861.RS
862.RS
863.TP
864.B 0
865Backwards compatible hint for "no hint".
866.TP
867.B 1
868Backwards compatible hint for "advise with fio workload type". This
523bad63 869uses FADV_RANDOM for a random workload, and FADV_SEQUENTIAL
ecb2083d
JA
870for a sequential workload.
871.TP
872.B sequential
523bad63 873Advise using FADV_SEQUENTIAL.
ecb2083d
JA
874.TP
875.B random
523bad63 876Advise using FADV_RANDOM.
ecb2083d
JA
877.RE
878.RE
d60e92d1 879.TP
8f4b9f24 880.BI write_hint \fR=\fPstr
523bad63
TK
881Use \fBfcntl\fR\|(2) to advise the kernel what life time to expect
882from a write. Only supported on Linux, as of version 4.13. Accepted
8f4b9f24
JA
883values are:
884.RS
885.RS
886.TP
887.B none
888No particular life time associated with this file.
889.TP
890.B short
891Data written to this file has a short life time.
892.TP
893.B medium
894Data written to this file has a medium life time.
895.TP
896.B long
897Data written to this file has a long life time.
898.TP
899.B extreme
900Data written to this file has a very long life time.
901.RE
523bad63
TK
902.P
903The values are all relative to each other, and no absolute meaning
904should be associated with them.
8f4b9f24 905.RE
37659335 906.TP
523bad63
TK
907.BI offset \fR=\fPint
908Start I/O at the provided offset in the file, given as either a fixed size in
909bytes or a percentage. If a percentage is given, the next \fBblockalign\fR\-ed
910offset will be used. Data before the given offset will not be touched. This
911effectively caps the file size at `real_size \- offset'. Can be combined with
912\fBsize\fR to constrain the start and end range of the I/O workload.
913A percentage can be specified by a number between 1 and 100 followed by '%',
914for example, `offset=20%' to specify 20%.
6d500c2e 915.TP
523bad63
TK
916.BI offset_increment \fR=\fPint
917If this is provided, then the real offset becomes `\fBoffset\fR + \fBoffset_increment\fR
918* thread_number', where the thread number is a counter that starts at 0 and
919is incremented for each sub\-job (i.e. when \fBnumjobs\fR option is
920specified). This option is useful if there are several jobs which are
921intended to operate on a file in parallel disjoint segments, with even
922spacing between the starting points.
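.RS
.P
For example, assuming a pre\-existing target of at least 4GiB (the job name,
path, and sizes here are illustrative), four sub\-jobs can each write their
own 1GiB segment:
.RS
.P
.nf
[parallel_segments]
filename=/tmp/fio.test
rw=write
numjobs=4
offset=0
offset_increment=1g
size=1g
.fi
.RE
.P
Following `offset + offset_increment * thread_number', sub\-job 0 starts at
offset 0, sub\-job 1 at 1GiB, sub\-job 2 at 2GiB, and sub\-job 3 at 3GiB.
.RE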
6d500c2e 923.TP
523bad63
TK
924.BI number_ios \fR=\fPint
925Fio will normally perform I/Os until it has exhausted the size of the region
926set by \fBsize\fR, or if it exhausts the allocated time (or hits an error
927condition). With this setting, the range/size can be set independently of
928the number of I/Os to perform. When fio reaches this number, it will exit
929normally and report status. Note that this does not extend the amount of I/O
930that will be done, it will only stop fio if this condition is met before
931other end\-of\-job criteria.
d60e92d1 932.TP
523bad63
TK
933.BI fsync \fR=\fPint
934If writing to a file, issue an \fBfsync\fR\|(2) (or its equivalent) of
935the dirty data for every number of blocks given. For example, if you give 32
936as a parameter, fio will sync the file after every 32 writes issued. If fio is
937using non\-buffered I/O, we may not sync the file. The exception is the sg
938I/O engine, which synchronizes the disk cache anyway. Defaults to 0, which
939means fio does not periodically issue and wait for a sync to complete. Also
940see \fBend_fsync\fR and \fBfsync_on_close\fR.
6d500c2e 941.TP
523bad63
TK
942.BI fdatasync \fR=\fPint
943Like \fBfsync\fR but uses \fBfdatasync\fR\|(2) to only sync data and
944not metadata blocks. In Windows, FreeBSD, and DragonFlyBSD there is no
945\fBfdatasync\fR\|(2) so this falls back to using \fBfsync\fR\|(2).
946Defaults to 0, which means fio does not periodically issue and wait for a
947data\-only sync to complete.
d60e92d1 948.TP
523bad63
TK
949.BI write_barrier \fR=\fPint
950Make every N\-th write a barrier write.
901bb994 951.TP
523bad63
TK
952.BI sync_file_range \fR=\fPstr:int
953Use \fBsync_file_range\fR\|(2) for every \fIint\fR number of write
954operations. Fio will track range of writes that have happened since the last
955\fBsync_file_range\fR\|(2) call. \fIstr\fR can currently be one or more of:
956.RS
957.RS
fd68418e 958.TP
523bad63
TK
959.B wait_before
960SYNC_FILE_RANGE_WAIT_BEFORE
c5751c62 961.TP
523bad63
TK
962.B write
963SYNC_FILE_RANGE_WRITE
c5751c62 964.TP
523bad63
TK
965.B wait_after
966SYNC_FILE_RANGE_WAIT_AFTER
2fa5a241 967.RE
523bad63
TK
968.P
969So if you do `sync_file_range=wait_before,write:8', fio would use
970`SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE' for every 8
971writes. Also see the \fBsync_file_range\fR\|(2) man page. This option is
972Linux specific.
2fa5a241 973.RE
ce35b1ec 974.TP
523bad63
TK
975.BI overwrite \fR=\fPbool
976If true, writes to a file will always overwrite existing data. If the file
977doesn't already exist, it will be created before the write phase begins. If
978the file exists and is large enough for the specified write phase, nothing
979will be done. Default: false.
5c94b008 980.TP
523bad63
TK
981.BI end_fsync \fR=\fPbool
982If true, \fBfsync\fR\|(2) file contents when a write stage has completed.
983Default: false.
d60e92d1 984.TP
523bad63
TK
985.BI fsync_on_close \fR=\fPbool
986If true, fio will \fBfsync\fR\|(2) a dirty file on close. This differs
987from \fBend_fsync\fR in that it will happen on every file close, not
988just at the end of the job. Default: false.
d60e92d1 989.TP
523bad63
TK
990.BI rwmixread \fR=\fPint
991Percentage of a mixed workload that should be reads. Default: 50.
992.TP
993.BI rwmixwrite \fR=\fPint
994Percentage of a mixed workload that should be writes. If both
995\fBrwmixread\fR and \fBrwmixwrite\fR are given and the values do not
996add up to 100%, the latter of the two will be used to override the
997first. This may interfere with a given rate setting, if fio is asked to
998limit reads or writes to a certain rate. If that is the case, then the
999distribution may be skewed. Default: 50.
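.RS
.P
A sketch of a 70/30 read/write random mix (the job name, path, and size are
assumptions chosen for illustration):
.RS
.P
.nf
[mixed_70_30]
filename=/tmp/fio.test
size=64m
rw=randrw
rwmixread=70
.fi
.RE
.P
Only \fBrwmixread\fR is set here, so the remaining 30% of I/Os are writes.
.RE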
1000.TP
1001.BI random_distribution \fR=\fPstr:float[,str:float][,str:float]
1002By default, fio will use a completely uniform random distribution when asked
1003to perform random I/O. Sometimes it is useful to skew the distribution in
1004specific ways, ensuring that some parts of the data are hotter than others.
1005fio includes the following distribution models:
d60e92d1
AC
1006.RS
1007.RS
1008.TP
1009.B random
523bad63 1010Uniform random distribution
8c07860d
JA
1011.TP
1012.B zipf
523bad63 1013Zipf distribution
8c07860d
JA
1014.TP
1015.B pareto
523bad63 1016Pareto distribution
8c07860d 1017.TP
dd3503d3 1018.B normal
523bad63 1019Normal (Gaussian) distribution
dd3503d3 1020.TP
523bad63
TK
1021.B zoned
1022Zoned random distribution
d60e92d1
AC
1023.RE
1024.P
523bad63
TK
1025When using a \fBzipf\fR or \fBpareto\fR distribution, an input value is also
1026needed to define the access pattern. For \fBzipf\fR, this is the `Zipf theta'.
1027For \fBpareto\fR, it's the `Pareto power'. Fio includes a test
1028program, \fBfio\-genzipf\fR, that can be used to visualize what the given input
1029values will yield in terms of hit rates. If you wanted to use \fBzipf\fR with
1030a `theta' of 1.2, you would use `random_distribution=zipf:1.2' as the
1031option. If a non\-uniform model is used, fio will disable use of the random
1032map. For the \fBnormal\fR distribution, a normal (Gaussian) deviation is
1033supplied as a value between 0 and 100.
1034.P
1035For a \fBzoned\fR distribution, fio supports specifying percentages of I/O
1036access that should fall within what range of the file or device. For
1037example, given criteria of:
d60e92d1 1038.RS
523bad63
TK
1039.P
1040.PD 0
104160% of accesses should be to the first 10%
1042.P
104330% of accesses should be to the next 20%
1044.P
10458% of accesses should be to the next 30%
1046.P
10472% of accesses should be to the next 40%
1048.PD
1049.RE
1050.P
1051we can define that through zoning of the random accesses. For the above
1052example, the user would do:
1053.RS
1054.P
1055random_distribution=zoned:60/10:30/20:8/30:2/40
1056.RE
1057.P
1058similarly to how \fBbssplit\fR works for setting ranges and percentages
1059of block sizes. Like \fBbssplit\fR, it's possible to specify separate
1060zones for reads, writes, and trims. If just one set is given, it'll apply to
1061all of them.
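.P
Put into a job file, the example above might look like the following sketch
(the job name, path, and size are assumptions). Both the access percentages
(60 + 30 + 8 + 2) and the range percentages (10 + 20 + 30 + 40) sum to 100:
.RS
.P
.nf
[zoned_hotspot]
filename=/tmp/fio.test
size=1g
rw=randread
random_distribution=zoned:60/10:30/20:8/30:2/40
.fi
.RE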
1062.RE
1063.TP
1064.BI percentage_random \fR=\fPint[,int][,int]
1065For a random workload, set how big a percentage should be random. This
1066defaults to 100%, in which case the workload is fully random. It can be set
1067anywhere from 0 to 100. Setting it to 0 would make the workload fully
1068sequential. Any setting in between will result in a random mix of sequential
1069and random I/O, at the given percentages. Comma\-separated values may be
1070specified for reads, writes, and trims as described in \fBblocksize\fR.
1071.TP
1072.BI norandommap
1073Normally fio will cover every block of the file when doing random I/O. If
1074this option is given, fio will just get a new random offset without looking
1075at past I/O history. This means that some blocks may not be read or written,
1076and that some blocks may be read/written more than once. If this option is
1077used with \fBverify\fR and multiple blocksizes (via \fBbsrange\fR),
1078only intact blocks are verified, i.e., partially\-overwritten blocks are
1079ignored.
1080.TP
1081.BI softrandommap \fR=\fPbool
1082See \fBnorandommap\fR. If fio runs with the random block map enabled and
1083it fails to allocate the map, if this option is set it will continue without
1084a random block map. As coverage will not be as complete as with random maps,
1085this option is disabled by default.
1086.TP
1087.BI random_generator \fR=\fPstr
1088Fio supports the following engines for generating I/O offsets for random I/O:
1089.RS
1090.RS
1091.TP
1092.B tausworthe
1093Strong 2^88 cycle random number generator.
1094.TP
1095.B lfsr
1096Linear feedback shift register generator.
1097.TP
1098.B tausworthe64
1099Strong 64\-bit 2^258 cycle random number generator.
1100.RE
1101.P
1102\fBtausworthe\fR is a strong random number generator, but it requires tracking
1103on the side if we want to ensure that blocks are only read or written
1104once. \fBlfsr\fR guarantees that we never generate the same offset twice, and
1105it's also less computationally expensive. It's not a true random generator,
1106however, though for I/O purposes it's typically good enough. \fBlfsr\fR only
1107works with single block sizes, not with workloads that use multiple block
1108sizes. If used with such a workload, fio may read or write some blocks
1109multiple times. The default value is \fBtausworthe\fR, unless the required
1110space exceeds 2^32 blocks. If it does, then \fBtausworthe64\fR is
1111selected automatically.
1112.RE
1113.SS "Block size"
1114.TP
1115.BI blocksize \fR=\fPint[,int][,int] "\fR,\fB bs" \fR=\fPint[,int][,int]
1116The block size in bytes used for I/O units. Default: 4096. A single value
1117applies to reads, writes, and trims. Comma\-separated values may be
1118specified for reads, writes, and trims. A value not terminated in a comma
1119applies to subsequent types. Examples:
1120.RS
1121.RS
1122.P
1123.PD 0
1124bs=256k means 256k for reads, writes and trims.
1125.P
1126bs=8k,32k means 8k for reads, 32k for writes and trims.
1127.P
1128bs=8k,32k, means 8k for reads, 32k for writes, and default for trims.
1129.P
1130bs=,8k means default for reads, 8k for writes and trims.
1131.P
1132bs=,8k, means default for reads, 8k for writes, and default for trims.
1133.PD
1134.RE
1135.RE
1136.TP
1137.BI blocksize_range \fR=\fPirange[,irange][,irange] "\fR,\fB bsrange" \fR=\fPirange[,irange][,irange]
1138A range of block sizes in bytes for I/O units. The issued I/O unit will
1139always be a multiple of the minimum size, unless
1140\fBblocksize_unaligned\fR is set.
1141Comma\-separated ranges may be specified for reads, writes, and trims as
1142described in \fBblocksize\fR. Example:
1143.RS
1144.RS
1145.P
1146bsrange=1k\-4k,2k\-8k
1147.RE
1148.RE
1149.TP
1150.BI bssplit \fR=\fPstr[,str][,str]
1151Sometimes you want even finer grained control of the block sizes issued, not
1152just an even split between them. This option allows you to weight various
1153block sizes, so that you are able to define a specific amount of block sizes
1154issued. The format for this option is:
1155.RS
1156.RS
1157.P
1158bssplit=blocksize/percentage:blocksize/percentage
1159.RE
1160.P
1161for as many block sizes as needed. So if you want to define a workload that
1162has 50% 64k blocks, 10% 4k blocks, and 40% 32k blocks, you would write:
1163.RS
1164.P
1165bssplit=4k/10:64k/50:32k/40
1166.RE
1167.P
1168Ordering does not matter. If the percentage is left blank, fio will fill in
1169the remaining values evenly. So a bssplit option like this one:
1170.RS
1171.P
1172bssplit=4k/50:1k/:32k/
1173.RE
1174.P
1175would have 50% 4k ios, and 25% 1k and 32k ios. The percentages always add up
1176to 100; if bssplit is given a range that adds up to more, it will error out.
1177.P
1178Comma\-separated values may be specified for reads, writes, and trims as
1179described in \fBblocksize\fR.
1180.P
1181If you want a workload that has 50% 2k reads and 50% 4k reads, while having
118290% 4k writes and 10% 8k writes, you would specify:
1183.RS
1184.P
1185bssplit=2k/50:4k/50,4k/90:8k/10
1186.RE
1187.RE
1188.TP
1189.BI blocksize_unaligned "\fR,\fB bs_unaligned"
1190If set, fio will issue I/O units with any size within
1191\fBblocksize_range\fR, not just multiples of the minimum size. This
1192typically won't work with direct I/O, as that normally requires sector
1193alignment.
1194.TP
1195.BI bs_is_seq_rand \fR=\fPbool
1196If this option is set, fio will use the normal read,write blocksize settings
1197as sequential,random blocksize settings instead. Any random read or write
1198will use the WRITE blocksize settings, and any sequential read or write will
1199use the READ blocksize settings.
1200.TP
1201.BI blockalign \fR=\fPint[,int][,int] "\fR,\fB ba" \fR=\fPint[,int][,int]
1202Boundary to which fio will align random I/O units. Default:
1203\fBblocksize\fR. Minimum alignment is typically 512b for using direct
1204I/O, though it usually depends on the hardware block size. This option is
1205mutually exclusive with using a random map for files, so it will turn off
1206that option. Comma\-separated values may be specified for reads, writes, and
1207trims as described in \fBblocksize\fR.
1208.SS "Buffers and memory"
1209.TP
1210.BI zero_buffers
1211Initialize buffers with all zeros. Default: fill buffers with random data.
1212.TP
1213.BI refill_buffers
1214If this option is given, fio will refill the I/O buffers on every
1215submit. The default is to only fill it at init time and reuse that
1216data. Only makes sense if zero_buffers isn't specified, naturally. If data
1217verification is enabled, \fBrefill_buffers\fR is also automatically enabled.
1218.TP
1219.BI scramble_buffers \fR=\fPbool
1220If \fBrefill_buffers\fR is too costly and the target is using data
1221deduplication, then setting this option will slightly modify the I/O buffer
1222contents to defeat normal de\-dupe attempts. This is not enough to defeat
1223more clever block compression attempts, but it will stop naive dedupe of
1224blocks. Default: true.
1225.TP
1226.BI buffer_compress_percentage \fR=\fPint
1227If this is set, then fio will attempt to provide I/O buffer content (on
1228WRITEs) that compresses to the specified level. Fio does this by providing a
1229mix of random data and a fixed pattern. The fixed pattern is either zeros,
1230or the pattern specified by \fBbuffer_pattern\fR. If the pattern option
1231is used, it might skew the compression ratio slightly. Note that this is per
1232block size unit, for file/disk wide compression level that matches this
1233setting, you'll also want to set \fBrefill_buffers\fR.
1234.TP
1235.BI buffer_compress_chunk \fR=\fPint
1236See \fBbuffer_compress_percentage\fR. This setting allows fio to manage
1237how big the ranges of random data and zeroed data are. Without this set, fio
1238will provide \fBbuffer_compress_percentage\fR of blocksize random data,
1239followed by the remaining zeroed. With this set to some chunk size smaller
1240than the block size, fio can alternate random and zeroed data throughout the
1241I/O buffer.
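.RS
.P
As an illustrative sketch (the job name, path, and sizes are assumptions), the
following asks for write buffers that are roughly 50% compressible, with
random and zeroed data alternating in 512\-byte chunks inside each 4k block:
.RS
.P
.nf
[compressible_writes]
filename=/tmp/fio.test
size=64m
rw=write
bs=4k
buffer_compress_percentage=50
buffer_compress_chunk=512
refill_buffers
.fi
.RE
.RE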
1242.TP
1243.BI buffer_pattern \fR=\fPstr
1244If set, fio will fill the I/O buffers with this pattern or with the contents
1245of a file. If not set, the contents of I/O buffers are defined by the other
1246options related to buffer contents. The setting can be any pattern of bytes,
1247and can be prefixed with 0x for hex values. It may also be a string, where
1248the string must then be wrapped with "". Or it may also be a filename,
1249where the filename must be wrapped with '' in which case the file is
1250opened and read. Note that not all the file contents will be read if that
1251would cause the buffers to overflow. So, for example:
1252.RS
1253.RS
1254.P
1255.PD 0
1256buffer_pattern='filename'
1257.P
1258or:
1259.P
1260buffer_pattern="abcd"
1261.P
1262or:
1263.P
1264buffer_pattern=\-12
1265.P
1266or:
1267.P
1268buffer_pattern=0xdeadface
1269.PD
1270.RE
1271.P
1272Also you can combine everything together in any order:
1273.RS
1274.P
1275buffer_pattern=0xdeadface"abcd"\-12'filename'
1276.RE
1277.RE
1278.TP
1279.BI dedupe_percentage \fR=\fPint
1280If set, fio will generate this percentage of identical buffers when
1281writing. These buffers will be naturally dedupable. The contents of the
1282buffers depend on what other buffer compression settings have been set. It's
1283possible to have the individual buffers either fully compressible, or not at
1284all. This option only controls the distribution of unique buffers.
1285.TP
1286.BI invalidate \fR=\fPbool
1287Invalidate the buffer/page cache parts of the files to be used prior to
1288starting I/O if the platform and file type support it. Defaults to true.
1289This will be ignored if \fBpre_read\fR is also specified for the
1290same job.
1291.TP
1292.BI sync \fR=\fPbool
1293Use synchronous I/O for buffered writes. For the majority of I/O engines,
1294this means using O_SYNC. Default: false.
1295.TP
1296.BI iomem \fR=\fPstr "\fR,\fP mem" \fR=\fPstr
1297Fio can use various types of memory as the I/O unit buffer. The allowed
1298values are:
1299.RS
1300.RS
1301.TP
1302.B malloc
1303Use memory from \fBmalloc\fR\|(3) as the buffers. Default memory type.
1304.TP
1305.B shm
1306Use shared memory as the buffers. Allocated through \fBshmget\fR\|(2).
1307.TP
1308.B shmhuge
1309Same as \fBshm\fR, but use huge pages as backing.
1310.TP
1311.B mmap
1312Use \fBmmap\fR\|(2) to allocate buffers. May either be anonymous memory, or can
1313be file backed if a filename is given after the option. The format
1314is `mem=mmap:/path/to/file'.
1315.TP
1316.B mmaphuge
1317Use a memory mapped huge file as the buffer backing. Append filename
1318after mmaphuge, a la `mem=mmaphuge:/hugetlbfs/file'.
1319.TP
1320.B mmapshared
1321Same as \fBmmap\fR, but use a MAP_SHARED mapping.
1322.TP
1323.B cudamalloc
1324Use GPU memory as the buffers for GPUDirect RDMA benchmark.
1325The \fBioengine\fR must be \fBrdma\fR.
1326.RE
1327.P
1328The area allocated is a function of the maximum allowed bs size for the job,
1329multiplied by the I/O depth given. Note that for \fBshmhuge\fR and
1330\fBmmaphuge\fR to work, the system must have free huge pages allocated. This
1331can normally be checked and set by reading/writing
1332`/proc/sys/vm/nr_hugepages' on a Linux system. Fio assumes a huge page
1333is 4MiB in size. So to calculate the number of huge pages you need for a
1334given job file, add up the I/O depth of all jobs (normally one unless
1335\fBiodepth\fR is used) and multiply by the maximum bs set. Then divide
1336that number by the huge page size. You can see the size of the huge pages in
1337`/proc/meminfo'. If no huge pages are allocated by having a non\-zero
1338number in `nr_hugepages', using \fBmmaphuge\fR or \fBshmhuge\fR will fail. Also
1339see \fBhugepage\-size\fR.
1340.P
1341\fBmmaphuge\fR also needs to have hugetlbfs mounted and the file location
1342should point there. So if it's mounted in `/huge', you would use
1343`mem=mmaphuge:/huge/somefile'.
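.P
As a worked example of that calculation (all values are illustrative): two
jobs, each with `iodepth=16' and `bs=1m', need (16 + 16) * 1MiB = 32MiB of
buffer memory, which at the assumed 4MiB huge page size comes to 32 / 4 = 8
huge pages. A job file using such buffers could be sketched as follows, where
the job names, path, and size are assumptions:
.RS
.P
.nf
; needs at least 8 free huge pages, see /proc/sys/vm/nr_hugepages
[global]
ioengine=libaio
direct=1
iodepth=16
bs=1m
mem=shmhuge
filename=/tmp/fio.test
size=1g

[job1]
rw=randread

[job2]
rw=randread
.fi
.RE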
1344.RE
1345.TP
1346.BI iomem_align \fR=\fPint "\fR,\fP mem_align" \fR=\fPint
1347This indicates the memory alignment of the I/O memory buffers. Note that
1348the given alignment is applied to the first I/O unit buffer; if using
1349\fBiodepth\fR, the alignment of the following buffers is given by the
1350\fBbs\fR used. In other words, if using a \fBbs\fR that is a
1351multiple of the page size in the system, all buffers will be aligned to
1352this value. If using a \fBbs\fR that is not page aligned, the alignment
1353of subsequent I/O memory buffers is the sum of the \fBiomem_align\fR and
1354\fBbs\fR used.
1355.TP
1356.BI hugepage\-size \fR=\fPint
1357Defines the size of a huge page. Must at least be equal to the system
1358setting, see `/proc/meminfo'. Defaults to 4MiB. Should probably
1359always be a multiple of megabytes, so using `hugepage\-size=Xm' is the
1360preferred way to set this to avoid setting a non\-pow\-2 bad value.
1361.TP
1362.BI lockmem \fR=\fPint
1363Pin the specified amount of memory with \fBmlock\fR\|(2). Can be used to
1364simulate a smaller amount of memory. The amount specified is per worker.
1365.SS "I/O size"
1366.TP
1367.BI size \fR=\fPint
1368The total size of file I/O for each thread of this job. Fio will run until
1369this many bytes have been transferred, unless runtime is limited by other options
1370(such as \fBruntime\fR, for instance, or increased/decreased by \fBio_size\fR).
1371Fio will divide this size between the available files determined by options
1372such as \fBnrfiles\fR, \fBfilename\fR, unless \fBfilesize\fR is
1373specified by the job. If the result of division happens to be 0, the size is
1374set to the physical size of the given files or devices if they exist.
1375If this option is not specified, fio will use the full size of the given
1376files or devices. If the files do not exist, size must be given. It is also
1377possible to give size as a percentage between 1 and 100. If `size=20%' is
1378given, fio will use 20% of the full size of the given files or devices.
1379Can be combined with \fBoffset\fR to constrain the start and end range
1380that I/O will be done within.
1381.TP
1382.BI io_size \fR=\fPint "\fR,\fB io_limit" \fR=\fPint
1383Normally fio operates within the region set by \fBsize\fR, which means
1384that the \fBsize\fR option sets both the region and size of I/O to be
1385performed. Sometimes that is not what you want. With this option, it is
1386possible to define just the amount of I/O that fio should do. For instance,
1387if \fBsize\fR is set to 20GiB and \fBio_size\fR is set to 5GiB, fio
1388will perform I/O within the first 20GiB but exit when 5GiB have been
1389done. The opposite is also possible \-\- if \fBsize\fR is set to 20GiB,
1390and \fBio_size\fR is set to 40GiB, then fio will do 40GiB of I/O within
1391the 0..20GiB region.
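.RS
.P
The first case above could be sketched as the following job, which issues
random writes anywhere within a 20GiB region but stops once 5GiB of I/O has
been done (the job name and path are assumptions):
.RS
.P
.nf
[bounded_io]
filename=/tmp/fio.test
rw=randwrite
size=20g
io_size=5g
.fi
.RE
.RE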
1392.TP
1393.BI filesize \fR=\fPirange(int)
1394Individual file sizes. May be a range, in which case fio will select sizes
1395for files at random within the given range and limited to \fBsize\fR in
1396total (if that is given). If not given, each created file is the same size.
1397This option overrides \fBsize\fR in terms of file size, which means
1398this value is used as a fixed size or possible range of each file.
1399.TP
1400.BI file_append \fR=\fPbool
1401Perform I/O after the end of the file. Normally fio will operate within the
1402size of a file. If this option is set, then fio will append to the file
1403instead. This has identical behavior to setting \fBoffset\fR to the size
1404of a file. This option is ignored on non\-regular files.
1405.TP
1406.BI fill_device \fR=\fPbool "\fR,\fB fill_fs" \fR=\fPbool
1407Sets size to something really large and waits for ENOSPC (no space left on
1408device) as the terminating condition. Only makes sense with sequential
1409write. For a read workload, the mount point will be filled first then I/O
1410started on the result. This option doesn't make sense if operating on a raw
1411device node, since the size of that is already known by the file system.
1412Additionally, writing beyond end\-of\-device will not return ENOSPC there.
1413.SS "I/O engine"
1414.TP
1415.BI ioengine \fR=\fPstr
1416Defines how the job issues I/O to the file. The following types are defined:
1417.RS
1418.RS
1419.TP
1420.B sync
1421Basic \fBread\fR\|(2) or \fBwrite\fR\|(2)
1422I/O. \fBlseek\fR\|(2) is used to position the I/O location.
1423See \fBfsync\fR and \fBfdatasync\fR for syncing write I/Os.
1424.TP
1425.B psync
1426Basic \fBpread\fR\|(2) or \fBpwrite\fR\|(2) I/O. Default on
1427all supported operating systems except for Windows.
1428.TP
1429.B vsync
1430Basic \fBreadv\fR\|(2) or \fBwritev\fR\|(2) I/O. Will emulate
1431queuing by coalescing adjacent I/Os into a single submission.
1432.TP
1433.B pvsync
1434Basic \fBpreadv\fR\|(2) or \fBpwritev\fR\|(2) I/O.
a46c5e01 1435.TP
2cafffbe
JA
1436.B pvsync2
1437Basic \fBpreadv2\fR\|(2) or \fBpwritev2\fR\|(2) I/O.
1438.TP
d60e92d1 1439.B libaio
523bad63
TK
1440Linux native asynchronous I/O. Note that Linux may only support
1441queued behavior with non\-buffered I/O (set `direct=1' or
1442`buffered=0').
1443This engine defines engine specific options.
d60e92d1
AC
1444.TP
1445.B posixaio
523bad63
TK
1446POSIX asynchronous I/O using \fBaio_read\fR\|(3) and
1447\fBaio_write\fR\|(3).
03e20d68
BC
1448.TP
1449.B solarisaio
1450Solaris native asynchronous I/O.
1451.TP
1452.B windowsaio
38f8c318 1453Windows native asynchronous I/O. Default on Windows.
d60e92d1
AC
1454.TP
1455.B mmap
523bad63
TK
1456File is memory mapped with \fBmmap\fR\|(2) and data copied
1457to/from using \fBmemcpy\fR\|(3).
d60e92d1
AC
1458.TP
1459.B splice
523bad63
TK
1460\fBsplice\fR\|(2) is used to transfer the data and
1461\fBvmsplice\fR\|(2) to transfer data from user space to the
1462kernel.
d60e92d1 1463.TP
d60e92d1 1464.B sg
523bad63
TK
1465SCSI generic sg v3 I/O. May either be synchronous using the SG_IO
1466ioctl, or if the target is an sg character device we use
1467\fBread\fR\|(2) and \fBwrite\fR\|(2) for asynchronous
1468I/O. Requires \fBfilename\fR option to specify either block or
1469character devices.
d60e92d1
AC
1470.TP
1471.B null
523bad63
TK
1472Doesn't transfer any data, just pretends to. This is mainly used to
1473exercise fio itself and for debugging/testing purposes.
d60e92d1
AC
1474.TP
1475.B net
523bad63
TK
1476Transfer over the network to given `host:port'. Depending on the
1477\fBprotocol\fR used, the \fBhostname\fR, \fBport\fR,
1478\fBlisten\fR and \fBfilename\fR options are used to specify
1479what sort of connection to make, while the \fBprotocol\fR option
1480determines which protocol will be used. This engine defines engine
1481specific options.
d60e92d1
AC
1482.TP
1483.B netsplice
523bad63
TK
1484Like \fBnet\fR, but uses \fBsplice\fR\|(2) and
1485\fBvmsplice\fR\|(2) to map data and send/receive.
1486This engine defines engine specific options.
d60e92d1 1487.TP
53aec0a4 1488.B cpuio
523bad63
TK
1489Doesn't transfer any data, but burns CPU cycles according to the
1490\fBcpuload\fR and \fBcpuchunks\fR options. Setting
1491\fBcpuload\fR\=85 will cause that job to do nothing but burn 85%
1492of the CPU. In case of SMP machines, use `numjobs=<nr_of_cpu>'
1493to get desired CPU usage, as the cpuload only loads a
1494single CPU at the desired rate. A job never finishes unless there is
1495at least one non\-cpuio job.
d60e92d1
AC
1496.TP
1497.B guasi
523bad63
TK
1498The GUASI I/O engine is the Generic Userspace Asynchronous Syscall
1499Interface approach to async I/O. See \fIhttp://www.xmailserver.org/guasi\-lib.html\fR
1500for more info on GUASI.
d60e92d1 1501.TP
21b8aee8 1502.B rdma
523bad63
TK
1503The RDMA I/O engine supports both RDMA memory semantics
1504(RDMA_WRITE/RDMA_READ) and channel semantics (Send/Recv) for the
1505InfiniBand, RoCE and iWARP protocols.
d54fce84
DM
1506.TP
1507.B falloc
523bad63
TK
1508I/O engine that does regular fallocate to simulate data transfer as
1509fio ioengine.
1510.RS
1511.P
1512.PD 0
1513DDIR_READ does fallocate(,mode = FALLOC_FL_KEEP_SIZE,).
1514.P
1515DDIR_WRITE does fallocate(,mode = 0).
1516.P
1517DDIR_TRIM does fallocate(,mode = FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE).
1518.PD
1519.RE
1520.TP
1521.B ftruncate
1522I/O engine that sends \fBftruncate\fR\|(2) operations in response
1523to write (DDIR_WRITE) events. Each ftruncate issued sets the file's
1524size to the current block offset. \fBblocksize\fR is ignored.
d54fce84
DM
1525.TP
1526.B e4defrag
523bad63
TK
1527I/O engine that does regular EXT4_IOC_MOVE_EXT ioctls to simulate
1528defragment activity in request to DDIR_WRITE event.
0d978694
DAG
1529.TP
1530.B rbd
523bad63
TK
1531I/O engine supporting direct access to Ceph Rados Block Devices
1532(RBD) via librbd without the need to use the kernel rbd driver. This
1533ioengine defines engine specific options.
a7c386f4 1534.TP
1535.B gfapi
523bad63
TK
1536Uses the GlusterFS libgfapi sync interface for direct access to
1537GlusterFS volumes without having to go through FUSE. This ioengine
1538defines engine specific options.
cc47f094 1539.TP
1540.B gfapi_async
523bad63
TK
1541Uses the GlusterFS libgfapi async interface for direct access to
1542GlusterFS volumes without having to go through FUSE. This ioengine
1543defines engine specific options.
1b10477b 1544.TP
b74e419e 1545.B libhdfs
523bad63
TK
1546Read and write through Hadoop (HDFS). The \fBfilename\fR option
1547is used to specify host,port of the hdfs name\-node to connect. This
1548engine interprets offsets a little differently. In HDFS, files once
1549created cannot be modified so random writes are not possible. To
1550imitate this the libhdfs engine expects a bunch of small files to be
1551created over HDFS and will randomly pick a file from them
1552based on the offset generated by fio backend (see the example
1553job file to create such files, use `rw=write' option). Please
1554note, it may be necessary to set environment variables to work
1555with HDFS/libhdfs properly. Each job uses its own connection to
1556HDFS.
65fa28ca
DE
1557.TP
1558.B mtd
523bad63
TK
1559Read, write and erase an MTD character device (e.g.,
1560`/dev/mtd0'). Discards are treated as erases. Depending on the
1561underlying device type, the I/O may have to go in a certain pattern,
1562e.g., on NAND, writing sequentially to erase blocks and discarding
1563before overwriting. The \fBtrimwrite\fR mode works well for this
65fa28ca 1564constraint.
5c4ef02e
JA
1565.TP
1566.B pmemblk
523bad63
TK
1567Read and write using filesystem DAX to a file on a filesystem
1568mounted with DAX on a persistent memory device through the NVML
1569libpmemblk library.
104ee4de 1570.TP
523bad63
TK
1571.B dev\-dax
1572Read and write using device DAX to a persistent memory device (e.g.,
1573/dev/dax0.0) through the NVML libpmem library.
d60e92d1 1574.TP
523bad63
TK
1575.B external
1576Prefix to specify loading an external I/O engine object file. Append
1577the engine filename, e.g. `ioengine=external:/tmp/foo.o' to load
1578ioengine `foo.o' in `/tmp'.
1579.SS "I/O engine specific parameters"
1580In addition, there are some parameters which are only valid when a specific
1581\fBioengine\fR is in use. These are used identically to normal parameters,
1582with the caveat that when used on the command line, they must come after the
1583\fBioengine\fR that defines them is selected.
d60e92d1 1584.TP
523bad63
TK
1585.BI (libaio)userspace_reap
1586Normally, with the libaio engine in use, fio will use the
1587\fBio_getevents\fR\|(3) system call to reap newly returned events. With
1588this flag turned on, the AIO ring will be read directly from user\-space to
1589reap events. The reaping mode is only enabled when polling for a minimum of
15900 events (e.g. when `iodepth_batch_complete=0').
3ce9dcaf 1591.TP
523bad63
TK
1592.BI (pvsync2)hipri
1593Set RWF_HIPRI on I/O, indicating to the kernel that it's of higher priority
1594than normal.
82407585 1595.TP
523bad63
TK
1596.BI (pvsync2)hipri_percentage
1597When hipri is set this determines the probability of a pvsync2 I/O being high
1598priority. The default is 100%.
d60e92d1 1599.TP
523bad63
TK
1600.BI (cpuio)cpuload \fR=\fPint
1601Attempt to use the specified percentage of CPU cycles. This is a mandatory
1602option when using cpuio I/O engine.
997b5680 1603.TP
523bad63
TK
1604.BI (cpuio)cpuchunks \fR=\fPint
1605Split the load into cycles of the given time. In microseconds.
1ad01bd1 1606.TP
523bad63
TK
1607.BI (cpuio)exit_on_io_done \fR=\fPbool
1608Detect when I/O threads are done, then exit.
d60e92d1 1609.TP
523bad63
TK
1610.BI (libhdfs)namenode \fR=\fPstr
1611The hostname or IP address of a HDFS cluster namenode to contact.
d01612f3 1612.TP
523bad63
TK
1613.BI (libhdfs)port
1614The listening port of the HDFS cluster namenode.
d60e92d1 1615.TP
523bad63
TK
1616.BI (netsplice,net)port
1617The TCP or UDP port to bind to or connect to. If this is used with
1618\fBnumjobs\fR to spawn multiple instances of the same job type, then
1619this will be the starting port number since fio will use a range of
1620ports.
d60e92d1 1621.TP
523bad63
TK
1622.BI (netsplice,net)hostname \fR=\fPstr
1623The hostname or IP address to use for TCP or UDP based I/O. If the job is
1624a TCP listener or UDP reader, the hostname is not used and must be omitted
1625unless it is a valid UDP multicast address.
591e9e06 1626.TP
523bad63
TK
1627.BI (netsplice,net)interface \fR=\fPstr
1628The IP address of the network interface used to send or receive UDP
1629multicast.
ddf24e42 1630.TP
523bad63
TK
1631.BI (netsplice,net)ttl \fR=\fPint
1632Time\-to\-live value for outgoing UDP multicast packets. Default: 1.
d60e92d1 1633.TP
523bad63
TK
1634.BI (netsplice,net)nodelay \fR=\fPbool
1635Set TCP_NODELAY on TCP connections.
fa769d44 1636.TP
523bad63
TK
1637.BI (netsplice,net)protocol \fR=\fPstr "\fR,\fP proto" \fR=\fPstr
1638The network protocol to use. Accepted values are:
1639.RS
e76b1da4
JA
1640.RS
1641.TP
523bad63
TK
1642.B tcp
1643Transmission control protocol.
e76b1da4 1644.TP
523bad63
TK
1645.B tcpv6
1646Transmission control protocol V6.
e76b1da4 1647.TP
523bad63
TK
1648.B udp
1649User datagram protocol.
1650.TP
1651.B udpv6
1652User datagram protocol V6.
e76b1da4 1653.TP
523bad63
TK
1654.B unix
1655UNIX domain socket.
e76b1da4
JA
1656.RE
1657.P
523bad63
TK
1658When the protocol is TCP or UDP, the port must also be given, as well as the
1659hostname if the job is a TCP listener or UDP reader. For unix sockets, the
1660normal \fBfilename\fR option should be used and the port is invalid.
1661.RE
1662.TP
1663.BI (netsplice,net)listen
1664For TCP network connections, tell fio to listen for incoming connections
1665rather than initiating an outgoing connection. The \fBhostname\fR must
1666be omitted if this option is used.
1667.TP
1668.BI (netsplice,net)pingpong
1669Normally a network writer will just continue writing data, and a network
1670reader will just consume packets. If `pingpong=1' is set, a writer will
1671send its normal payload to the reader, then wait for the reader to send the
1672same payload back. This allows fio to measure network latencies. The
1673submission and completion latencies then measure local time spent sending or
1674receiving, and the completion latency measures how long it took for the
1675other end to receive and send back. For UDP multicast traffic
1676`pingpong=1' should only be set for a single reader when multiple readers
1677are listening to the same address.
1678.TP
1679.BI (netsplice,net)window_size \fR=\fPint
1680Set the desired socket buffer size for the connection.
e76b1da4 1681.TP
523bad63
TK
1682.BI (netsplice,net)mss \fR=\fPint
1683Set the TCP maximum segment size (TCP_MAXSEG).
d60e92d1 1684.TP
523bad63
TK
1685.BI (e4defrag)donorname \fR=\fPstr
1686File will be used as a block donor (swap extents between files).
d60e92d1 1687.TP
523bad63
TK
1688.BI (e4defrag)inplace \fR=\fPint
1689Configure donor file blocks allocation strategy:
1690.RS
1691.RS
d60e92d1 1692.TP
523bad63
TK
1693.B 0
1694Default. Preallocate donor's file on init.
d60e92d1 1695.TP
523bad63
TK
1696.B 1
1697Allocate space immediately inside defragment event, and free right
1698after event.
1699.RE
1700.RE
d60e92d1 1701.TP
523bad63
TK
1702.BI (rbd)clustername \fR=\fPstr
1703Specifies the name of the Ceph cluster.
92d42d69 1704.TP
523bad63
TK
1705.BI (rbd)rbdname \fR=\fPstr
1706Specifies the name of the RBD.
92d42d69 1707.TP
523bad63
TK
1708.BI (rbd)pool \fR=\fPstr
1709Specifies the name of the Ceph pool containing RBD.
92d42d69 1710.TP
523bad63
TK
1711.BI (rbd)clientname \fR=\fPstr
1712Specifies the username (without the 'client.' prefix) used to access the
1713Ceph cluster. If the \fBclustername\fR is specified, the \fBclientname\fR shall be
1714the full *type.id* string. If no type. prefix is given, fio will add 'client.'
1715by default.
92d42d69 1716.TP
523bad63
TK
1717.BI (mtd)skip_bad \fR=\fPbool
1718Skip operations against known bad blocks.
8116fd24 1719.TP
523bad63
TK
1720.BI (libhdfs)hdfsdirectory
1721libhdfs will create chunks in this HDFS directory.
e0a04ac1 1722.TP
523bad63
TK
1723.BI (libhdfs)chunk_size
1724The size of the chunk to use for each file.
1725.SS "I/O depth"
1726.TP
1727.BI iodepth \fR=\fPint
1728Number of I/O units to keep in flight against the file. Note that
1729increasing \fBiodepth\fR beyond 1 will not affect synchronous ioengines (except
1730for small degrees when \fBverify_async\fR is in use). Even async
1731engines may impose OS restrictions causing the desired depth not to be
1732achieved. This may happen on Linux when using libaio and not setting
1733`direct=1', since buffered I/O is not async on that OS. Keep an
1734eye on the I/O depth distribution in the fio output to verify that the
1735achieved depth is as expected. Default: 1.
1736.TP
1737.BI iodepth_batch_submit \fR=\fPint "\fR,\fP iodepth_batch" \fR=\fPint
1738This defines how many pieces of I/O to submit at once. It defaults to 1
1739which means that we submit each I/O as soon as it is available, but can be
1740raised to submit bigger batches of I/O at a time. If it is set to 0 the
1741\fBiodepth\fR value will be used.
1742.TP
1743.BI iodepth_batch_complete_min \fR=\fPint "\fR,\fP iodepth_batch_complete" \fR=\fPint
1744This defines how many pieces of I/O to retrieve at once. It defaults to 1
1745which means that we'll ask for a minimum of 1 I/O in the retrieval process
1746from the kernel. The I/O retrieval will go on until we hit the limit set by
1747\fBiodepth_low\fR. If this variable is set to 0, then fio will always
1748check for completed events before queuing more I/O. This helps reduce I/O
1749latency, at the cost of more retrieval system calls.
1750.TP
1751.BI iodepth_batch_complete_max \fR=\fPint
1752This defines maximum pieces of I/O to retrieve at once. This variable should
1753be used along with \fBiodepth_batch_complete_min\fR=\fIint\fR variable,
1754specifying the range of min and max amount of I/O which should be
1755retrieved. By default it is equal to \fBiodepth_batch_complete_min\fR
1756value. Example #1:
e0a04ac1 1757.RS
e0a04ac1 1758.RS
e0a04ac1 1759.P
523bad63
TK
1760.PD 0
1761iodepth_batch_complete_min=1
e0a04ac1 1762.P
523bad63
TK
1763iodepth_batch_complete_max=<iodepth>
1764.PD
e0a04ac1
JA
1765.RE
1766.P
523bad63
TK
1767which means that we will retrieve at least 1 I/O and up to the whole
1768submitted queue depth. If no I/O has completed yet, we will wait.
1769Example #2:
e8b1961d 1770.RS
523bad63
TK
1771.P
1772.PD 0
1773iodepth_batch_complete_min=0
1774.P
1775iodepth_batch_complete_max=<iodepth>
1776.PD
e8b1961d
JA
1777.RE
1778.P
523bad63
TK
1779which means that we can retrieve up to the whole submitted queue depth, but
1780if no I/O has completed yet, we will NOT wait and will immediately exit
1781the system call. In this example we simply do polling.
1782.RE
e8b1961d 1783.TP
523bad63
TK
1784.BI iodepth_low \fR=\fPint
1785The low water mark indicating when to start filling the queue
1786again. Defaults to the same as \fBiodepth\fR, meaning that fio will
1787attempt to keep the queue full at all times. If \fBiodepth\fR is set to
1788e.g. 16 and \fBiodepth_low\fR is set to 4, then after fio has filled the queue of
178916 requests, it will let the depth drain down to 4 before starting to fill
1790it again.
d60e92d1 1791.TP
523bad63
TK
1792.BI serialize_overlap \fR=\fPbool
1793Serialize in-flight I/Os that might otherwise cause or suffer from data races.
1794When two or more I/Os are submitted simultaneously, there is no guarantee that
1795the I/Os will be processed or completed in the submitted order. Further, if
1796two or more of those I/Os are writes, any overlapping region between them can
1797become indeterminate/undefined on certain storage. These issues can cause
1798verification to fail erratically when at least one of the racing I/Os is
1799changing data and the overlapping region has a non-zero size. Setting
1800\fBserialize_overlap\fR tells fio to avoid provoking this behavior by explicitly
1801serializing in-flight I/Os that have a non-zero overlap. Note that setting
1802this option can reduce both performance and the \fBiodepth\fR achieved.
1803Additionally this option does not work when \fBio_submit_mode\fR is set to
1804offload. Default: false.
d60e92d1 1805.TP
523bad63
TK
1806.BI io_submit_mode \fR=\fPstr
1807This option controls how fio submits the I/O to the I/O engine. The default
1808is `inline', which means that the fio job threads submit and reap I/O
1809directly. If set to `offload', the job threads will offload I/O submission
1810to a dedicated pool of I/O threads. This requires some coordination and thus
1811has a bit of extra overhead, especially for lower queue depth I/O where it
1812can increase latencies. The benefit is that fio can manage submission rates
1813independently of the device completion rates. This avoids skewed latency
1814reporting if I/O gets backed up on the device side (the coordinated omission
1815problem).
1816.SS "I/O rate"
d60e92d1 1817.TP
523bad63
TK
1818.BI thinktime \fR=\fPtime
1819Stall the job for the specified period of time after an I/O has completed before issuing the
1820next. May be used to simulate processing being done by an application.
1821When the unit is omitted, the value is interpreted in microseconds. See
1822\fBthinktime_blocks\fR and \fBthinktime_spin\fR.
d60e92d1 1823.TP
523bad63
TK
1824.BI thinktime_spin \fR=\fPtime
1825Only valid if \fBthinktime\fR is set \- pretend to spend CPU time doing
1826something with the data received, before falling back to sleeping for the
1827rest of the period specified by \fBthinktime\fR. When the unit is
1828omitted, the value is interpreted in microseconds.
d60e92d1
AC
1829.TP
1830.BI thinktime_blocks \fR=\fPint
523bad63
TK
1831Only valid if \fBthinktime\fR is set \- control how many blocks to issue,
1832before waiting \fBthinktime\fR usecs. If not set, defaults to 1 which will make
1833fio wait \fBthinktime\fR usecs after every block. This effectively makes any
1834queue depth setting redundant, since no more than 1 I/O will be queued
1835before we have to complete it and do our \fBthinktime\fR. In other words, this
1836setting effectively caps the queue depth if the latter is larger.
d60e92d1 1837.TP
6d500c2e 1838.BI rate \fR=\fPint[,int][,int]
523bad63
TK
1839Cap the bandwidth used by this job. The number is in bytes/sec, the normal
1840suffix rules apply. Comma\-separated values may be specified for reads,
1841writes, and trims as described in \fBblocksize\fR.
1842.RS
1843.P
1844For example, using `rate=1m,500k' would limit reads to 1MiB/sec and writes to
1845500KiB/sec. Capping only reads or writes can be done with `rate=,500k' or
1846`rate=500k,' where the former will only limit writes (to 500KiB/sec) and the
1847latter will only limit reads.
1848.RE
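The comma-separated form above can be illustrated with a minimal Python sketch. It is not fio's parser: the helper names are hypothetical, only 1024-based k/m/g suffixes are handled, and a field that is absent is simply reported as None (fio's own handling of omitted trailing values follows the rules described for \fBblocksize\fR, which this sketch does not model).

```python
# Hypothetical helpers for illustration; not part of fio.
# Assumption: only 1024-based k/m/g suffixes are recognised here.
SUFFIXES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a value such as '500k' to bytes; an empty field means no cap."""
    text = text.strip().lower()
    if not text:
        return None
    if text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def parse_rate(value):
    """Split 'read,write,trim' fields; absent fields are left as None."""
    fields = value.split(",")
    if len(fields) > 3:
        raise ValueError("at most three comma-separated values are expected")
    fields += [""] * (3 - len(fields))
    return dict(zip(("read", "write", "trim"), map(parse_size, fields)))


print(parse_rate("1m,500k"))  # reads capped at 1 MiB/s, writes at 500 KiB/s
print(parse_rate(",500k"))    # only writes capped
print(parse_rate("500k,"))    # only reads capped
```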
d60e92d1 1849.TP
6d500c2e 1850.BI rate_min \fR=\fPint[,int][,int]
523bad63
TK
1851Tell fio to do whatever it can to maintain at least this bandwidth. Failing
1852to meet this requirement will cause the job to exit. Comma\-separated values
1853may be specified for reads, writes, and trims as described in
1854\fBblocksize\fR.
d60e92d1 1855.TP
6d500c2e 1856.BI rate_iops \fR=\fPint[,int][,int]
523bad63
TK
1857Cap the bandwidth to this number of IOPS. Basically the same as
1858\fBrate\fR, just specified independently of bandwidth. If the job is
1859given a block size range instead of a fixed value, the smallest block size
1860is used as the metric. Comma\-separated values may be specified for reads,
1861writes, and trims as described in \fBblocksize\fR.
d60e92d1 1862.TP
6d500c2e 1863.BI rate_iops_min \fR=\fPint[,int][,int]
523bad63
TK
1864If fio doesn't meet this rate of I/O, it will cause the job to exit.
1865Comma\-separated values may be specified for reads, writes, and trims as
1866described in \fBblocksize\fR.
d60e92d1 1867.TP
6de65959 1868.BI rate_process \fR=\fPstr
523bad63
TK
1869This option controls how fio manages rated I/O submissions. The default is
1870`linear', which submits I/O in a linear fashion with fixed delays between
1871I/Os that get adjusted based on I/O completion rates. If this is set to
1872`poisson', fio will submit I/O based on a more real world random request
6de65959 1873flow, known as the Poisson process
523bad63 1874(\fIhttps://en.wikipedia.org/wiki/Poisson_point_process\fR). The lambda will be
5d02b083 187510^6 / IOPS for the given workload.
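As a rough illustration of what a Poisson request flow looks like, the Python sketch below draws exponentially distributed gaps between submissions. It assumes that the lambda above is the mean gap in microseconds (10^6 / IOPS); the function name and the seed are hypothetical, and this is not fio's implementation.

```python
import random


def poisson_delays_usec(iops, count, seed=0):
    """Draw 'count' inter-arrival gaps (in microseconds) for a Poisson flow.

    Assumption: the mean gap is 10**6 / iops microseconds.
    """
    mean_usec = 10 ** 6 / iops
    rng = random.Random(seed)
    # expovariate() takes the rate, which is the reciprocal of the mean gap.
    return [rng.expovariate(1.0 / mean_usec) for _ in range(count)]


delays = poisson_delays_usec(iops=1000, count=100000)
print("mean gap in usec:", sum(delays) / len(delays))
```

Individual gaps vary widely, but over many I/Os the average gap approaches 10^6 / IOPS, so the long-run rate still matches the requested IOPS.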
523bad63 1876.SS "I/O latency"
ff6bb260 1877.TP
523bad63 1878.BI latency_target \fR=\fPtime
3e260a46 1879If set, fio will attempt to find the max performance point that the given
523bad63
TK
1880workload will run at while maintaining a latency below this target. When
1881the unit is omitted, the value is interpreted in microseconds. See
1882\fBlatency_window\fR and \fBlatency_percentile\fR.
3e260a46 1883.TP
523bad63 1884.BI latency_window \fR=\fPtime
3e260a46 1885Used with \fBlatency_target\fR to specify the sample window that the job
523bad63
TK
1886is run at varying queue depths to test the performance. When the unit is
1887omitted, the value is interpreted in microseconds.
3e260a46
JA
1888.TP
1889.BI latency_percentile \fR=\fPfloat
523bad63
TK
1890The percentage of I/Os that must fall within the criteria specified by
1891\fBlatency_target\fR and \fBlatency_window\fR. If not set, this
1892defaults to 100.0, meaning that all I/Os must be equal to or below the value
1893set by \fBlatency_target\fR.
1894.TP
1895.BI max_latency \fR=\fPtime
1896If set, fio will exit the job with an ETIMEDOUT error if it exceeds this
1897maximum latency. When the unit is omitted, the value is interpreted in
1898microseconds.
1899.TP
1900.BI rate_cycle \fR=\fPint
1901Average bandwidth for \fBrate\fR and \fBrate_min\fR over this number
1902of milliseconds. Defaults to 1000.
1903.SS "I/O replay"
1904.TP
1905.BI write_iolog \fR=\fPstr
1906Write the issued I/O patterns to the specified file. See
1907\fBread_iolog\fR. Specify a separate file for each job, otherwise the
1908iologs will be interspersed and the file may be corrupt.
1909.TP
1910.BI read_iolog \fR=\fPstr
1911Open an iolog with the specified filename and replay the I/O patterns it
1912contains. This can be used to store a workload and replay it sometime
1913later. The iolog given may also be a blktrace binary file, which allows fio
1914to replay a workload captured by blktrace. See
1915\fBblktrace\fR\|(8) for how to capture such logging data. For blktrace
1916replay, the file needs to be turned into a blkparse binary data file first
1917(`blkparse <device> \-o /dev/null \-d file_for_fio.bin').
3e260a46 1918.TP
523bad63
TK
1919.BI replay_no_stall \fR=\fPbool
1920When replaying I/O with \fBread_iolog\fR the default behavior is to
1921attempt to respect the timestamps within the log and replay them with the
1922appropriate delay between IOPS. By setting this variable fio will not
1923respect the timestamps and attempt to replay them as fast as possible while
1924still respecting ordering. The result is the same I/O pattern to a given
1925device, but different timings.
1926.TP
1927.BI replay_redirect \fR=\fPstr
1928While replaying I/O patterns using \fBread_iolog\fR the default behavior
1929is to replay the IOPS onto the major/minor device that each IOP was recorded
1930from. This is sometimes undesirable because on a different machine those
1931major/minor numbers can map to a different device. Changing hardware on the
1932same system can also result in a different major/minor mapping.
1933\fBreplay_redirect\fR causes all I/Os to be replayed onto the single specified
1934device regardless of the device it was recorded
1935from. For example, `replay_redirect=/dev/sdc' would cause all I/O
1936in the blktrace or iolog to be replayed onto `/dev/sdc'. This means
1937multiple devices will be replayed onto a single device, if the trace
1938contains multiple devices. If you want multiple devices to be replayed
1939concurrently to multiple redirected devices you must blkparse your trace
1940into separate traces and replay them with independent fio invocations.
1941Unfortunately this also breaks the strict time ordering between multiple
1942device accesses.
1943.TP
1944.BI replay_align \fR=\fPint
1945Force alignment of I/O offsets and lengths in a trace to this power of 2
1946value.
1947.TP
1948.BI replay_scale \fR=\fPint
1949Scale sector offsets down by this factor when replaying traces.
1950.SS "Threads, processes and job synchronization"
1951.TP
1952.BI thread
1953Fio defaults to creating jobs by using fork, however if this option is
1954given, fio will create jobs by using POSIX Threads' function
1955\fBpthread_create\fR\|(3) to create threads instead.
1956.TP
1957.BI wait_for \fR=\fPstr
1958If set, the current job won't be started until all workers of the specified
1959waitee job are done.
1960.\" ignore blank line here from HOWTO as it looks normal without it
1961\fBwait_for\fR operates on the job name basis, so there are a few
1962limitations. First, the waitee must be defined prior to the waiter job
1963(meaning no forward references). Second, if a job is being referenced as a
1964waitee, it must have a unique name (no duplicate waitees).
1965.TP
1966.BI nice \fR=\fPint
1967Run the job with the given nice value. See man \fBnice\fR\|(2).
1968.\" ignore blank line here from HOWTO as it looks normal without it
1969On Windows, values less than \-15 set the process class to "High"; \-1 through
1970\-15 set "Above Normal"; 1 through 15 "Below Normal"; and above 15 "Idle"
1971priority class.
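The Windows mapping just described can be written out as a small Python sketch. The function name is hypothetical, and treating a nice value of 0 as "Normal" is an assumption, since the text above does not say what 0 maps to.

```python
def windows_priority_class(nice):
    """Map a nice value to the Windows priority class described above."""
    if nice < -15:
        return "High"
    if nice < 0:
        return "Above Normal"
    if nice == 0:
        return "Normal"  # assumption: 0 is not covered by the text above
    if nice <= 15:
        return "Below Normal"
    return "Idle"


for value in (-20, -15, -1, 1, 15, 19):
    print(value, windows_priority_class(value))
```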
1972.TP
1973.BI prio \fR=\fPint
1974Set the I/O priority value of this job. Linux limits us to a positive value
1975between 0 and 7, with 0 being the highest. See man
1976\fBionice\fR\|(1). Refer to an appropriate manpage for other operating
1977systems since meaning of priority may differ.
1978.TP
1979.BI prioclass \fR=\fPint
1980Set the I/O priority class. See man \fBionice\fR\|(1).
15501535 1981.TP
d60e92d1 1982.BI cpumask \fR=\fPint
523bad63
TK
1983Set the CPU affinity of this job. The parameter given is a bit mask of
1984allowed CPUs the job may run on. So if you want the allowed CPUs to be 1
1985and 5, you would pass the decimal value of (1 << 1 | 1 << 5), or 34. See man
1986\fBsched_setaffinity\fR\|(2). This may not work on all supported
1987operating systems or kernel versions. This option doesn't work well for a
1988higher CPU count than what you can store in an integer mask, so it can only
1989control cpus 1\-32. For boxes with larger CPU counts, use
1990\fBcpus_allowed\fR.
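The mask arithmetic above can be checked with a short Python sketch. The helper names are hypothetical and only illustrate how the decimal value is built from CPU numbers.

```python
def cpumask_for(cpus):
    """Build the decimal bit mask that has one bit set per allowed CPU."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def cpus_in(mask):
    """Inverse helper: list the CPU numbers whose bits are set in the mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


print(cpumask_for([1, 5]))  # value to pass as cpumask for CPUs 1 and 5
print(cpus_in(34))
```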
d60e92d1
AC
1991.TP
1992.BI cpus_allowed \fR=\fPstr
523bad63
TK
1993Controls the same options as \fBcpumask\fR, but accepts a textual
1994specification of the permitted CPUs instead. So to use CPUs 1 and 5 you
1995would specify `cpus_allowed=1,5'. This option also allows a range of CPUs
1996to be specified \-\- say you wanted a binding to CPUs 1, 5, and 8 to 15, you
1997would set `cpus_allowed=1,5,8\-15'.
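A minimal Python sketch of how such a textual specification expands into individual CPU numbers is shown here. The function name is hypothetical and this is not fio's parser; it handles only plain numbers and A-B ranges.

```python
def expand_cpus(spec):
    """Expand a list such as '1,5,8-15' into a sorted list of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        first, dash, last = part.partition("-")
        if dash:
            # Ranges are inclusive at both ends.
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(first))
    return sorted(cpus)


print(expand_cpus("1,5,8-15"))
```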
d60e92d1 1998.TP
c2acfbac 1999.BI cpus_allowed_policy \fR=\fPstr
523bad63
TK
2000Set the policy of how fio distributes the CPUs specified by
2001\fBcpus_allowed\fR or \fBcpumask\fR. Two policies are supported:
c2acfbac
JA
2002.RS
2003.RS
2004.TP
2005.B shared
2006All jobs will share the CPU set specified.
2007.TP
2008.B split
2009Each job will get a unique CPU from the CPU set.
2010.RE
2011.P
523bad63
TK
2012\fBshared\fR is the default behavior, if the option isn't specified. If
2013\fBsplit\fR is specified, then fio will assign one cpu per job. If not
2014enough CPUs are given for the jobs listed, then fio will roundrobin the CPUs
2015in the set.
c2acfbac 2016.RE
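The round-robin behavior of the split policy can be pictured with the following Python sketch. The function name is hypothetical, and the order in which CPUs are handed out is an assumption; the text above only states that CPUs are reused round-robin when there are more jobs than CPUs.

```python
def split_cpus(cpus, num_jobs):
    """Give each job one CPU from the set, wrapping around when it runs out."""
    return {job: cpus[job % len(cpus)] for job in range(num_jobs)}


# Three jobs sharing a two-CPU set: the third job wraps back to the first CPU.
print(split_cpus([1, 5], 3))
```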
c2acfbac 2017.TP
d0b937ed 2018.BI numa_cpu_nodes \fR=\fPstr
cecbfd47 2019Set this job running on specified NUMA nodes' CPUs. The arguments allow
523bad63
TK
2020comma delimited list of cpu numbers, A\-B ranges, or `all'. Note, to enable
2021NUMA options support, fio must be built on a system with libnuma\-dev(el)
2022installed.
d0b937ed
YR
2023.TP
2024.BI numa_mem_policy \fR=\fPstr
523bad63
TK
2025Set this job's memory policy and corresponding NUMA nodes. Format of the
2026arguments:
39c7a2ca
VF
2027.RS
2028.RS
523bad63
TK
2029.P
2030<mode>[:<nodelist>]
39c7a2ca 2031.RE
523bad63
TK
2032.P
2033`mode' is one of the following memory policies: `default', `prefer',
2034`bind', `interleave' or `local'. For `default' and `local' memory
2035policies, no node needs to be specified. For `prefer', only one node is
2036allowed. For `bind' and `interleave' the `nodelist' may be as
2037follows: a comma delimited list of numbers, A\-B ranges, or `all'.
39c7a2ca
VF
2038.RE
2039.TP
523bad63
TK
2040.BI cgroup \fR=\fPstr
2041Add job to this control group. If it doesn't exist, it will be created. The
2042system must have a mounted cgroup blkio mount point for this to work. If
2043your system doesn't have it mounted, you can do so with:
d60e92d1
AC
2044.RS
2045.RS
d60e92d1 2046.P
523bad63
TK
2047# mount \-t cgroup \-o blkio none /cgroup
2048.RE
d60e92d1
AC
2049.RE
2050.TP
523bad63
TK
2051.BI cgroup_weight \fR=\fPint
2052Set the weight of the cgroup to this value. See the documentation that comes
2053with the kernel, allowed values are in the range of 100..1000.
d60e92d1 2054.TP
523bad63
TK
2055.BI cgroup_nodelete \fR=\fPbool
2056Normally fio will delete the cgroups it has created after the job
2057completion. To override this behavior and to leave cgroups around after the
2058job completion, set `cgroup_nodelete=1'. This can be useful if one wants
2059to inspect various cgroup files after job completion. Default: false.
c8eeb9df 2060.TP
523bad63
TK
2061.BI flow_id \fR=\fPint
2062The ID of the flow. If not specified, it defaults to being a global
2063flow. See \fBflow\fR.
d60e92d1 2064.TP
523bad63
TK
2065.BI flow \fR=\fPint
2066Weight in token\-based flow control. If this value is used, then there is
2067a 'flow counter' which is used to regulate the proportion of activity between
2068two or more jobs. Fio attempts to keep this flow counter near zero. The
2069\fBflow\fR parameter stands for how much should be added or subtracted to the
2070flow counter on each iteration of the main I/O loop. That is, if one job has
2071`flow=8' and another job has `flow=\-1', then there will be a roughly 1:8
2072ratio in how much one runs vs the other.
d60e92d1 2073.TP
523bad63
TK
2074.BI flow_watermark \fR=\fPint
2075The maximum value that the absolute value of the flow counter is allowed to
2076reach before the job must wait for a lower value of the counter.
6b7f6851 2077.TP
523bad63
TK
2078.BI flow_sleep \fR=\fPint
2079The period of time, in microseconds, to wait after the flow watermark has
2080been exceeded before retrying operations.
25460cf6 2081.TP
523bad63
TK
2082.BI stonewall "\fR,\fB wait_for_previous"
2083Wait for preceding jobs in the job file to exit, before starting this
2084one. Can be used to insert serialization points in the job file. A stone
2085wall also implies starting a new reporting group, see
2086\fBgroup_reporting\fR.
2378826d 2087.TP
523bad63
TK
2088.BI exitall
2089By default, fio will continue running all other jobs when one job finishes
2090but sometimes this is not the desired action. Setting \fBexitall\fR will
2091instead make fio terminate all other jobs when one job finishes.
e81ecca3 2092.TP
523bad63
TK
2093.BI exec_prerun \fR=\fPstr
2094Before running this job, issue the command specified through
2095\fBsystem\fR\|(3). Output is redirected to a file called `jobname.prerun.txt'.
e9f48479 2096.TP
523bad63
TK
2097.BI exec_postrun \fR=\fPstr
2098After the job completes, issue the command specified through
2099\fBsystem\fR\|(3). Output is redirected to a file called `jobname.postrun.txt'.
d60e92d1 2100.TP
523bad63
TK
2101.BI uid \fR=\fPint
2102Instead of running as the invoking user, set the user ID to this value
2103before the thread/process does any work.
39c1c323 2104.TP
523bad63
TK
2105.BI gid \fR=\fPint
2106Set group ID, see \fBuid\fR.
2107.SS "Verification"
d60e92d1 2108.TP
589e88b7 2109.BI verify_only
523bad63 2110Do not perform specified workload, only verify data still matches previous
5e4c7118 2111invocation of this workload. This option allows one to check data multiple
523bad63
TK
2112times at a later date without overwriting it. This option makes sense only
2113for workloads that write data, and does not support workloads with the
5e4c7118
JA
2114\fBtime_based\fR option set.
2115.TP
d60e92d1 2116.BI do_verify \fR=\fPbool
523bad63
TK
2117Run the verify phase after a write phase. Only valid if \fBverify\fR is
2118set. Default: true.
d60e92d1
AC
2119.TP
2120.BI verify \fR=\fPstr
523bad63
TK
2121If writing to a file, fio can verify the file contents after each iteration
2122of the job. Each verification method also implies verification of special
2123header, which is written to the beginning of each block. This header also
2124includes meta information, like offset of the block, block number, timestamp
2125when block was written, etc. \fBverify\fR can be combined with
2126\fBverify_pattern\fR option. The allowed values are:
d60e92d1
AC
2127.RS
2128.RS
2129.TP
523bad63
TK
2130.B md5
2131Use an md5 sum of the data area and store it in the header of
2132each block.
2133.TP
2134.B crc64
2135Use an experimental crc64 sum of the data area and store it in the
2136header of each block.
2137.TP
2138.B crc32c
2139Use a crc32c sum of the data area and store it in the header of
2140each block. This will automatically use hardware acceleration
2141(e.g. SSE4.2 on an x86 or CRC crypto extensions on ARM64) but will
2142fall back to software crc32c if none is found. Generally the
2143fastest checksum fio supports when hardware accelerated.
2144.TP
2145.B crc32c\-intel
2146Synonym for crc32c.
2147.TP
2148.B crc32
2149Use a crc32 sum of the data area and store it in the header of each
2150block.
2151.TP
2152.B crc16
2153Use a crc16 sum of the data area and store it in the header of each
2154block.
2155.TP
2156.B crc7
2157Use a crc7 sum of the data area and store it in the header of each
2158block.
2159.TP
2160.B xxhash
2161Use xxhash as the checksum function. Generally the fastest software
2162checksum that fio supports.
2163.TP
2164.B sha512
2165Use sha512 as the checksum function.
2166.TP
2167.B sha256
2168Use sha256 as the checksum function.
2169.TP
2170.B sha1
2171Use optimized sha1 as the checksum function.
2172.TP
2173.B sha3\-224
2174Use optimized sha3\-224 as the checksum function.
2175.TP
2176.B sha3\-256
2177Use optimized sha3\-256 as the checksum function.
2178.TP
2179.B sha3\-384
2180Use optimized sha3\-384 as the checksum function.
2181.TP
2182.B sha3\-512
2183Use optimized sha3\-512 as the checksum function.
d60e92d1
AC
2184.TP
2185.B meta
523bad63
TK
2186This option is deprecated, since now meta information is included in
2187generic verification header and meta verification happens by
2188default. For detailed information see the description of the
2189\fBverify\fR setting. This option is kept only for
2190compatibility with old configurations. Do not use it.
d60e92d1 2191.TP
59245381 2192.B pattern
523bad63
TK
2193Verify a strict pattern. Normally fio includes a header with some
2194basic information and checksumming, but if this option is set, only
2195the specific pattern set with \fBverify_pattern\fR is verified.
59245381 2196.TP
d60e92d1 2197.B null
523bad63
TK
2198Only pretend to verify. Useful for testing internals with
2199`ioengine=null', not for much else.
d60e92d1 2200.RE
523bad63
TK
2201.P
2202This option can be used for repeated burn\-in tests of a system to make sure
2203that the written data is also correctly read back. If the data direction
2204given is a read or random read, fio will assume that it should verify a
2205previously written file. If the data direction includes any form of write,
2206the verify will be of the newly written data.
d60e92d1
AC
2207.RE
2208.TP
5c9323fb 2209.BI verifysort \fR=\fPbool
523bad63
TK
2210If true, fio will sort written verify blocks when it deems it faster to read
2211them back in a sorted manner. This is often the case when overwriting an
2212existing file, since the blocks are already laid out in the file system. You
2213can ignore this option unless doing huge amounts of really fast I/O where
2214the red\-black tree sorting CPU time becomes significant. Default: true.
d60e92d1 2215.TP
fa769d44 2216.BI verifysort_nr \fR=\fPint
523bad63 2217Pre\-load and sort verify blocks for a read workload.
fa769d44 2218.TP
f7fa2653 2219.BI verify_offset \fR=\fPint
d60e92d1 2220Swap the verification header with data somewhere else in the block before
523bad63 2221writing. It is swapped back before verifying.
d60e92d1 2222.TP
f7fa2653 2223.BI verify_interval \fR=\fPint
523bad63
TK
2224Write the verification header at a finer granularity than the
2225\fBblocksize\fR. It will be written for chunks the size of
2226\fBverify_interval\fR. \fBblocksize\fR should be an even multiple of this value.
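.RS
.P
For illustration only (the job name, path, and sizes are assumptions), the
following sketch writes 64 KiB blocks with a verification header every 4 KiB,
so each block would carry 64 / 4 = 16 headers:
.RS
.P
.nf
[interval_demo]
filename=/tmp/fio_interval.dat
size=64m
rw=write
bs=64k
verify=crc32c
verify_interval=4k
.fi
.RE
.RE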
d60e92d1 2227.TP
996093bb 2228.BI verify_pattern \fR=\fPstr
523bad63
TK
2229If set, fio will fill the I/O buffers with this pattern. Fio defaults to
2230filling with totally random bytes, but sometimes it's interesting to fill
2231with a known pattern for I/O verification purposes. Depending on the width
2232of the pattern, fio will fill 1/2/3/4 bytes of the buffer at a time (the
2233pattern can be either a decimal or a hex number). A \fBverify_pattern\fR larger
2234than a 32\-bit quantity has to be a hex number that starts with either "0x" or
2235"0X". Use with \fBverify\fR. \fBverify_pattern\fR also supports the %o
2236format, which means that the offset of each block will be written and then
2237verified back, e.g.:
2fa5a241
RP
2238.RS
2239.RS
523bad63
TK
2240.P
2241verify_pattern=%o
2fa5a241 2242.RE
523bad63 2243.P
2fa5a241 2244Or use combination of everything:
2fa5a241 2245.RS
523bad63
TK
2246.P
2247verify_pattern=0xff%o"abcd"\-12
2fa5a241
RP
2248.RE
2249.RE
996093bb 2250.TP
d60e92d1 2251.BI verify_fatal \fR=\fPbool
523bad63
TK
2252Normally fio will keep checking the entire contents before quitting on a
2253block verification failure. If this option is set, fio will exit the job on
2254the first observed failure. Default: false.
d60e92d1 2255.TP
b463e936 2256.BI verify_dump \fR=\fPbool
523bad63
TK
2257If set, dump the contents of both the original data block and the data block
2258we read off disk to files. This allows later analysis to inspect just what
2259kind of data corruption occurred. Off by default.
b463e936 2260.TP
e8462bd8 2261.BI verify_async \fR=\fPint
523bad63
TK
2262Fio will normally verify I/O inline from the submitting thread. This option
2263takes an integer describing how many async offload threads to create for I/O
2264verification instead, causing fio to offload the duty of verifying I/O
2265contents to one or more separate threads. If using this offload option, even
2266sync I/O engines can benefit from using an \fBiodepth\fR setting higher
2267than 1, as it allows them to have I/O in flight while verifies are running.
2268Defaults to 0 async threads, i.e. verification is not asynchronous.
e8462bd8
JA
2269.TP
2270.BI verify_async_cpus \fR=\fPstr
523bad63
TK
2271Tell fio to set the given CPU affinity on the async I/O verification
2272threads. See \fBcpus_allowed\fR for the format used.
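.RS
.P
A hedged sketch of offloaded verification follows. The job name, path, sizes,
thread count, and CPU list are assumptions chosen for illustration, and CPU
affinity may not be available on every platform:
.RS
.P
.nf
[async_verify_demo]
filename=/tmp/fio_async.dat
size=256m
rw=write
bs=4k
iodepth=4
verify=crc32c
verify_async=2
verify_async_cpus=0,1
.fi
.RE
.RE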
e8462bd8 2273.TP
6f87418f
JA
2274.BI verify_backlog \fR=\fPint
2275Fio will normally verify the written contents of a job that utilizes verify
2276once that job has completed. In other words, everything is written then
2277everything is read back and verified. You may want to verify continually
523bad63
TK
2278instead for a variety of reasons. Fio stores the meta data associated with
2279an I/O block in memory, so for large verify workloads, quite a bit of memory
2280would be used up holding this meta data. If this option is enabled, fio will
2281write only N blocks before verifying these blocks.
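.RS
.P
As a sketch under assumed values (job name, path, and sizes are illustrative),
the following job would write 1024 blocks, verify them, and repeat, instead of
deferring all verification to the end of the job:
.RS
.P
.nf
[backlog_demo]
filename=/tmp/fio_backlog.dat
size=256m
rw=randwrite
bs=4k
verify=crc32c
verify_backlog=1024
.fi
.RE
.RE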
6f87418f
JA
2282.TP
2283.BI verify_backlog_batch \fR=\fPint
523bad63
TK
2284Control how many blocks fio will verify if \fBverify_backlog\fR is
2285set. If not set, will default to the value of \fBverify_backlog\fR
2286(meaning the entire queue is read back and verified). If
2287\fBverify_backlog_batch\fR is less than \fBverify_backlog\fR, not all
2288blocks will be verified; if \fBverify_backlog_batch\fR is larger than
2289\fBverify_backlog\fR, some blocks will be verified more than once.
2290.TP
2291.BI verify_state_save \fR=\fPbool
2292When a job exits during the write phase of a verify workload, save its
2293current state. This allows fio to replay up until that point, if the verify
2294state is loaded for the verify read phase. The format of the filename is,
2295roughly:
2296.RS
2297.RS
2298.P
2299<type>\-<jobname>\-<jobindex>\-verify.state.
2300.RE
2301.P
2302<type> is "local" for a local run, "sock" for a client/server socket
2303connection, and "ip" (192.168.0.1, for instance) for a networked
2304client/server connection. Defaults to true.
2305.RE
2306.TP
2307.BI verify_state_load \fR=\fPbool
2308If a verify termination trigger was used, fio stores the current write state
2309of each thread. This can be used at verification time so that fio knows how
2310far it should verify. Without this information, fio will run a full
2311verification pass, according to the settings in the job file used.
2312Default: false.
6f87418f 2313.TP
fa769d44
SW
2314.BI trim_percentage \fR=\fPint
2315Number of verify blocks to discard/trim.
2316.TP
2317.BI trim_verify_zero \fR=\fPbool
523bad63 2318Verify that trim/discarded blocks are returned as zeros.
fa769d44
SW
2319.TP
2320.BI trim_backlog \fR=\fPint
523bad63 2321Trim after this number of blocks are written.
fa769d44
SW
2322.TP
2323.BI trim_backlog_batch \fR=\fPint
523bad63 2324Trim this number of I/O blocks.
fa769d44
SW
2325.TP
2326.BI experimental_verify \fR=\fPbool
2327Enable experimental verification.
523bad63 2328.SS "Steady state"
fa769d44 2329.TP
523bad63
TK
2330.BI steadystate \fR=\fPstr:float "\fR,\fP ss" \fR=\fPstr:float
2331Define the criterion and limit for assessing steady state performance. The
2332first parameter designates the criterion whereas the second parameter sets
2333the threshold. When the criterion falls below the threshold for the
2334specified duration, the job will stop. For example, `iops_slope:0.1%' will
2335direct fio to terminate the job when the least squares regression slope
2336falls below 0.1% of the mean IOPS. If \fBgroup_reporting\fR is enabled
2337this will apply to all jobs in the group. Below is the list of available
2338steady state assessment criteria. All assessments are carried out using only
2339data from the rolling collection window. Threshold limits can be expressed
2340as a fixed value or as a percentage of the mean in the collection window.
2341.RS
2342.RS
d60e92d1 2343.TP
523bad63
TK
2344.B iops
2345Collect IOPS data. Stop the job if all individual IOPS measurements
2346are within the specified limit of the mean IOPS (e.g., `iops:2'
2347means that all individual IOPS values must be within 2 of the mean,
2348whereas `iops:0.2%' means that all individual IOPS values must be
2349within 0.2% of the mean IOPS to terminate the job).
d60e92d1 2350.TP
523bad63
TK
2351.B iops_slope
2352Collect IOPS data and calculate the least squares regression
2353slope. Stop the job if the slope falls below the specified limit.
d60e92d1 2354.TP
523bad63
TK
2355.B bw
2356Collect bandwidth data. Stop the job if all individual bandwidth
2357measurements are within the specified limit of the mean bandwidth.
64bbb865 2358.TP
523bad63
TK
2359.B bw_slope
2360Collect bandwidth data and calculate the least squares regression
2361slope. Stop the job if the slope falls below the specified limit.
2362.RE
2363.RE
d1c46c04 2364.TP
523bad63
TK
2365.BI steadystate_duration \fR=\fPtime "\fR,\fP ss_dur" \fR=\fPtime
2366A rolling window of this duration will be used to judge whether steady state
2367has been reached. Data will be collected once per second. The default is 0
2368which disables steady state detection. When the unit is omitted, the
2369value is interpreted in seconds.
0c63576e 2370.TP
523bad63
TK
2371.BI steadystate_ramp_time \fR=\fPtime "\fR,\fP ss_ramp" \fR=\fPtime
2372Allow the job to run for the specified duration before beginning data
2373collection for checking the steady state job termination criterion. The
2374default is 0. When the unit is omitted, the value is interpreted in seconds.
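.RS
.P
The three steady state options are typically combined. In the sketch below
(job name, path, sizes, and all durations are illustrative assumptions), fio
would skip the first 60 seconds, then stop the job once the IOPS slope over a
300 second rolling window falls below 0.1% of the mean IOPS, or after one hour
at the latest:
.RS
.P
.nf
[ss_demo]
filename=/tmp/fio_ss.dat
size=1g
rw=randwrite
bs=4k
time_based
runtime=3600
steadystate=iops_slope:0.1%
steadystate_duration=300
steadystate_ramp_time=60
.fi
.RE
.RE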
2375.SS "Measurements and reporting"
0c63576e 2376.TP
3a5db920
JA
2377.BI per_job_logs \fR=\fPbool
2378If set, this generates bw/clat/iops logs with private, per\-job filenames. If
523bad63
TK
2379not set, jobs with identical names will share the log filename. Default:
2380true.
2381.TP
2382.BI group_reporting
2383It may sometimes be interesting to display statistics for groups of jobs as
2384a whole instead of for each individual job. This is especially true if
2385\fBnumjobs\fR is used; looking at individual thread/process output
2386quickly becomes unwieldy. To see the final report per\-group instead of
2387per\-job, use \fBgroup_reporting\fR. Jobs in a file will be part of the
2388same reporting group, unless separated by a \fBstonewall\fR, or by
2389using \fBnew_group\fR.
2390.TP
2391.BI new_group
2392Start a new reporting group. See: \fBgroup_reporting\fR. If not given,
2393all jobs in a file will be part of the same reporting group, unless
2394separated by a \fBstonewall\fR.
2395.TP
2396.BI stats \fR=\fPbool
2397By default, fio collects and shows final output results for all jobs
2398that run. If this option is set to 0, then fio will ignore the job in
2399the final stat output.
3a5db920 2400.TP
836bad52 2401.BI write_bw_log \fR=\fPstr
523bad63
TK
2402If given, write a bandwidth log for this job. Can be used to store data of
2403the bandwidth of the jobs in their lifetime. The included
2404\fBfio_generate_plots\fR script uses gnuplot to turn these
2405text files into nice graphs. See \fBwrite_lat_log\fR for behavior of
2406given filename. For this option, the postfix is `_bw.x.log', where `x'
2407is the index of the job (1..N, where N is the number of jobs). If
2408\fBper_job_logs\fR is false, then the filename will not include the job
2409index. See \fBLOG FILE FORMATS\fR section.
d60e92d1 2410.TP
836bad52 2411.BI write_lat_log \fR=\fPstr
523bad63
TK
2412Same as \fBwrite_bw_log\fR, except that this option stores I/O
2413submission, completion, and total latencies instead. If no filename is given
2414with this option, the default filename of `jobname_type.log' is
2415used. Even if the filename is given, fio will still append the type of
2416log. So if one specifies:
2417.RS
2418.RS
2419.P
2420write_lat_log=foo
2421.RE
2422.P
2423The actual log names will be `foo_slat.x.log', `foo_clat.x.log',
2424and `foo_lat.x.log', where `x' is the index of the job (1..N, where N
2425is the number of jobs). This helps \fBfio_generate_plots\fR find the
2426logs automatically. If \fBper_job_logs\fR is false, then the filename
2427will not include the job index. See \fBLOG FILE FORMATS\fR section.
2428.RE
901bb994 2429.TP
1e613c9c 2430.BI write_hist_log \fR=\fPstr
523bad63
TK
2431Same as \fBwrite_lat_log\fR, but writes I/O completion latency
2432histograms. If no filename is given with this option, the default filename
2433of `jobname_clat_hist.x.log' is used, where `x' is the index of the
2434job (1..N, where N is the number of jobs). Even if the filename is given,
2435fio will still append the type of log. If \fBper_job_logs\fR is false,
2436then the filename will not include the job index. See \fBLOG FILE FORMATS\fR section.
1e613c9c 2437.TP
c8eeb9df 2438.BI write_iops_log \fR=\fPstr
523bad63
TK
2439Same as \fBwrite_bw_log\fR, but writes IOPS. If no filename is given
2440with this option, the default filename of `jobname_type.x.log' is
2441used, where `x' is the index of the job (1..N, where N is the number of
2442jobs). Even if the filename is given, fio will still append the type of
2443log. If \fBper_job_logs\fR is false, then the filename will not include
2444the job index. See \fBLOG FILE FORMATS\fR section.
c8eeb9df 2445.TP
b8bc8cba
JA
2446.BI log_avg_msec \fR=\fPint
2447By default, fio will log an entry in the iops, latency, or bw log for every
523bad63 2448I/O that completes. When writing to the disk log, that can quickly grow to a
b8bc8cba 2449very large size. Setting this option makes fio average each log entry
e6989e10 2450over the specified period of time, reducing the resolution of the log. See
523bad63
TK
2451\fBlog_max_value\fR as well. Defaults to 0, logging all entries.
2452Also see \fBLOG FILE FORMATS\fR section.
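.RS
.P
As an illustrative sketch (job name, path, sizes, and the log prefix `foo' are
assumptions), the following job would write a bandwidth log named
`foo_bw.1.log' with entries averaged over one\-second windows rather than one
entry per completed I/O:
.RS
.P
.nf
[bwlog_demo]
filename=/tmp/fio_bwlog.dat
size=256m
rw=write
bs=64k
write_bw_log=foo
log_avg_msec=1000
.fi
.RE
.RE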
b8bc8cba 2453.TP
1e613c9c 2454.BI log_hist_msec \fR=\fPint
523bad63
TK
2455Same as \fBlog_avg_msec\fR, but logs entries for completion latency
2456histograms. Computing latency percentiles from averages of intervals using
2457\fBlog_avg_msec\fR is inaccurate. Setting this option makes fio log
2458histogram entries over the specified period of time, reducing log sizes for
2459high IOPS devices while retaining percentile accuracy. See
2460\fBlog_hist_coarseness\fR as well. Defaults to 0, meaning histogram
2461logging is disabled.
1e613c9c
KC
2462.TP
2463.BI log_hist_coarseness \fR=\fPint
523bad63
TK
2464Integer ranging from 0 to 6, defining the coarseness of the resolution of
2465the histogram logs enabled with \fBlog_hist_msec\fR. For each increment
2466in coarseness, fio outputs half as many bins. Defaults to 0, for which
2467histogram logs contain 1216 latency bins. See \fBLOG FILE FORMATS\fR section.
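.RS
.P
Working the halving rule through from the default of 1216 bins gives the
following bin counts. This is arithmetic derived from the description above,
not output captured from fio:
.RS
.P
.nf
log_hist_coarseness   latency bins
0                     1216
1                      608
2                      304
3                      152
4                       76
5                       38
6                       19
.fi
.RE
.RE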
2468.TP
2469.BI log_max_value \fR=\fPbool
2470If \fBlog_avg_msec\fR is set, fio logs the average over that window. If
2471you instead want to log the maximum value, set this option to 1. Defaults to
24720, meaning that averaged values are logged.
1e613c9c 2473.TP
ae588852 2474.BI log_offset \fR=\fPbool
523bad63
TK
2475If this is set, the iolog options will include the byte offset for the I/O
2476entry as well as the other data values. Defaults to 0 meaning that
2477offsets are not present in logs. Also see \fBLOG FILE FORMATS\fR section.
ae588852 2478.TP
aee2ab67 2479.BI log_compression \fR=\fPint
523bad63
TK
2480If this is set, fio will compress the I/O logs as it goes, to keep the
2481memory footprint lower. When a log reaches the specified size, that chunk is
2482removed and compressed in the background. Given that I/O logs are fairly
2483highly compressible, this yields a nice memory savings for longer runs. The
2484downside is that the compression will consume some background CPU cycles, so
2485it may impact the run. This, however, is also true if the logging ends up
2486consuming most of the system memory. So pick your poison. The I/O logs are
2487saved normally at the end of a run, by decompressing the chunks and storing
2488them in the specified log file. This feature depends on the availability of
2489zlib.
aee2ab67 2490.TP
c08f9fe2 2491.BI log_compression_cpus \fR=\fPstr
523bad63
TK
2492Define the set of CPUs that are allowed to handle online log compression for
2493the I/O jobs. This can provide better isolation between performance
c08f9fe2
JA
2494sensitive jobs, and background compression work.
2495.TP
b26317c9 2496.BI log_store_compressed \fR=\fPbool
c08f9fe2 2497If set, fio will store the log files in a compressed format. They can be
523bad63
TK
2498decompressed with fio, using the \fB\-\-inflate\-log\fR command line
2499parameter. The files will be stored with a `.fz' suffix.
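.RS
.P
For example, assuming a job run with `write_bw_log=foo' and
`log_store_compressed=1' left behind a file named `foo_bw.1.log.fz' (the file
name and the redirection of the inflated log from standard output are
assumptions for illustration), it could be decompressed along these lines:
.RS
.P
.nf
$ fio \-\-inflate\-log=foo_bw.1.log.fz > foo_bw.1.log
.fi
.RE
.RE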
b26317c9 2500.TP
3aea75b1
KC
2501.BI log_unix_epoch \fR=\fPbool
2502If set, fio will log Unix timestamps to the log files produced by enabling
523bad63 2503write_type_log for each log type, instead of the default zero\-based
3aea75b1
KC
2504timestamps.
2505.TP
66347cfa 2506.BI block_error_percentiles \fR=\fPbool
523bad63
TK
2507If set, record errors in trim block\-sized units from writes and trims and
2508output a histogram of how many trims it took to get to errors, and what kind
2509of error was encountered.
d60e92d1 2510.TP
523bad63
TK
2511.BI bwavgtime \fR=\fPint
2512Average the calculated bandwidth over the given time. Value is specified in
2513milliseconds. If the job also does bandwidth logging through
2514\fBwrite_bw_log\fR, then the minimum of this option and
2515\fBlog_avg_msec\fR will be used. Default: 500ms.
d60e92d1 2516.TP
523bad63
TK
2517.BI iopsavgtime \fR=\fPint
2518Average the calculated IOPS over the given time. Value is specified in
2519milliseconds. If the job also does IOPS logging through
2520\fBwrite_iops_log\fR, then the minimum of this option and
2521\fBlog_avg_msec\fR will be used. Default: 500ms.
d60e92d1 2522.TP
d60e92d1 2523.BI disk_util \fR=\fPbool
523bad63
TK
2524Generate disk utilization statistics, if the platform supports it.
2525Default: true.
fa769d44 2526.TP
523bad63
TK
2527.BI disable_lat \fR=\fPbool
2528Disable measurements of total latency numbers. Useful only for cutting back
2529the number of calls to \fBgettimeofday\fR\|(2), as that does impact
2530performance at really high IOPS rates. Note that to really get rid of a
2531large amount of these calls, this option must be used with
2532\fBdisable_slat\fR and \fBdisable_bw_measurement\fR as well.
9e684a49 2533.TP
523bad63
TK
2534.BI disable_clat \fR=\fPbool
2535Disable measurements of completion latency numbers. See
2536\fBdisable_lat\fR.
9e684a49 2537.TP
523bad63
TK
2538.BI disable_slat \fR=\fPbool
2539Disable measurements of submission latency numbers. See
2540\fBdisable_lat\fR.
9e684a49 2541.TP
523bad63
TK
2542.BI disable_bw_measurement \fR=\fPbool "\fR,\fP disable_bw" \fR=\fPbool
2543Disable measurements of throughput/bandwidth numbers. See
2544\fBdisable_lat\fR.
9e684a49 2545.TP
83349190
YH
2546.BI clat_percentiles \fR=\fPbool
2547Enable the reporting of percentiles of completion latencies.
2548.TP
2549.BI percentile_list \fR=\fPfloat_list
66347cfa 2550Overwrite the default list of percentiles for completion latencies and the
523bad63
TK
2551block error histogram. Each number is a floating number in the range
2552(0,100], and the maximum length of the list is 20. Use ':' to separate the
2553numbers, and list the numbers in ascending order. For example,
2554`\-\-percentile_list=99.5:99.9' will cause fio to report the values of
2555completion latency below which 99.5% and 99.9% of the observed latencies
2556fell, respectively.
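.RS
.P
In a job file the same list is given without the leading dashes. A minimal
sketch, with the job name, path, sizes, and chosen percentiles all being
illustrative assumptions:
.RS
.P
.nf
[pct_demo]
filename=/tmp/fio_pct.dat
size=256m
rw=randread
bs=4k
clat_percentiles=1
percentile_list=50:90:99:99.9
.fi
.RE
.RE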
2557.SS "Error handling"
e4585935 2558.TP
523bad63
TK
2559.BI exitall_on_error
2560When one job finishes in error, terminate the rest. The default is to wait
2561for each job to finish.
e4585935 2562.TP
523bad63
TK
2563.BI continue_on_error \fR=\fPstr
2564Normally fio will exit the job on the first observed failure. If this option
2565is set, fio will continue the job when there is a 'non\-fatal error' (EIO or
2566EILSEQ) until the runtime is exceeded or the I/O size specified is
2567completed. If this option is used, there are two more stats that are
2568appended, the total error count and the first error. The error field given
2569in the stats is the first error that was hit during the run.
2570The allowed values are:
2571.RS
2572.RS
046395d7 2573.TP
523bad63
TK
2574.B none
2575Exit on any I/O or verify errors.
de890a1e 2576.TP
523bad63
TK
2577.B read
2578Continue on read errors, exit on all others.
2cafffbe 2579.TP
523bad63
TK
2580.B write
2581Continue on write errors, exit on all others.
a0679ce5 2582.TP
523bad63
TK
2583.B io
2584Continue on any I/O error, exit on all others.
de890a1e 2585.TP
523bad63
TK
2586.B verify
2587Continue on verify errors, exit on all others.
de890a1e 2588.TP
523bad63
TK
2589.B all
2590Continue on all errors.
b93b6a2e 2591.TP
523bad63
TK
2592.B 0
2593Backward\-compatible alias for 'none'.
d3a623de 2594.TP
523bad63
TK
2595.B 1
2596Backward\-compatible alias for 'all'.
2597.RE
2598.RE
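.RS
.P
As a sketch of one of the values listed above (job name, path, and sizes are
illustrative assumptions), the following job would keep running past verify
errors while still exiting on I/O errors:
.RS
.P
.nf
[keep_going_demo]
filename=/tmp/fio_verify.dat
size=64m
rw=write
bs=4k
verify=crc32c
continue_on_error=verify
.fi
.RE
.RE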
1d360ffb 2599.TP
523bad63
TK
2600.BI ignore_error \fR=\fPstr
2601Sometimes you want to ignore some errors during a test. In that case you can
2602specify an error list for each error type, instead of only being able to
2603ignore the default 'non\-fatal error' using \fBcontinue_on_error\fR. The format is
2604`ignore_error=READ_ERR_LIST,WRITE_ERR_LIST,VERIFY_ERR_LIST', where the errors for
2605a given error type are separated with ':'. An error may be a symbol ('ENOSPC', 'ENOMEM')
2606or an integer. Example:
de890a1e
SL
2607.RS
2608.RS
523bad63
TK
2609.P
2610ignore_error=EAGAIN,ENOSPC:122
2611.RE
2612.P
2613This option will ignore EAGAIN from READ, and ENOSPC and 122(EDQUOT) from
2614WRITE. This option works by overriding \fBcontinue_on_error\fR with
2615the list of errors for each error type if any.
2616.RE
de890a1e 2617.TP
523bad63
TK
2618.BI error_dump \fR=\fPbool
2619If set, dump every error even if it is non\-fatal. If disabled, only
2620fatal errors will be dumped. Default: true.
2621.SS "Running predefined workloads"
2622Fio includes predefined profiles that mimic the I/O workloads generated by
2623other tools.
49ccb8c1 2624.TP
523bad63
TK
2625.BI profile \fR=\fPstr
2626The predefined workload to run. Current profiles are:
2627.RS
2628.RS
de890a1e 2629.TP
523bad63
TK
2630.B tiobench
2631Threaded I/O bench (tiotest/tiobench) like workload.
49ccb8c1 2632.TP
523bad63
TK
2633.B act
2634Aerospike Certification Tool (ACT) like workload.
2635.RE
de890a1e
SL
2636.RE
2637.P
523bad63
TK
2638To view a profile's additional options use \fB\-\-cmdhelp\fR after specifying
2639the profile. For example:
2640.RS
2641.TP
2642$ fio \-\-profile=act \-\-cmdhelp
de890a1e 2643.RE
523bad63 2644.SS "Act profile options"
de890a1e 2645.TP
523bad63
TK
2646.BI device\-names \fR=\fPstr
2647Devices to use.
d54fce84 2648.TP
523bad63
TK
2649.BI load \fR=\fPint
2650ACT load multiplier. Default: 1.
7aeb1e94 2651.TP
523bad63
TK
2652.BI test\-duration\fR=\fPtime
2653How long the entire test takes to run. When the unit is omitted, the value
2654is given in seconds. Default: 24h.
1008602c 2655.TP
523bad63
TK
2656.BI threads\-per\-queue\fR=\fPint
2657Number of read I/O threads per device. Default: 8.
e5f34d95 2658.TP
523bad63
TK
2659.BI read\-req\-num\-512\-blocks\fR=\fPint
2660Number of 512B blocks to read at a time. Default: 3.
d54fce84 2661.TP
523bad63
TK
2662.BI large\-block\-op\-kbytes\fR=\fPint
2663Size of large block ops in KiB (writes). Default: 131072.
d54fce84 2664.TP
523bad63
TK
2665.BI prep
2666Set to run ACT prep phase.
2667.SS "Tiobench profile options"
6d500c2e 2668.TP
523bad63
TK
2669.BI size\fR=\fPstr
2670Size in MiB.
0d978694 2671.TP
523bad63
TK
2672.BI block\fR=\fPint
2673Block size in bytes. Default: 4096.
0d978694 2674.TP
523bad63
TK
2675.BI numruns\fR=\fPint
2676Number of runs.
0d978694 2677.TP
523bad63
TK
2678.BI dir\fR=\fPstr
2679Test directory.
65fa28ca 2680.TP
523bad63
TK
2681.BI threads\fR=\fPint
2682Number of threads.
d60e92d1 2683.SH OUTPUT
40943b9a
TK
2684Fio spits out a lot of output. While running, fio will display the status of the
2685jobs created. An example of that would be:
d60e92d1 2686.P
40943b9a
TK
2687.nf
2688 Jobs: 1 (f=1): [_(1),M(1)][24.8%][r=20.5MiB/s,w=23.5MiB/s][r=82,w=94 IOPS][eta 01m:31s]
2689.fi
d1429b5c 2690.P
40943b9a
TK
2691The characters inside the first set of square brackets denote the current status of
2692each thread. The first character is the first job defined in the job file, and so
2693forth. The possible values (in typical life cycle order) are:
d60e92d1
AC
2694.RS
2695.TP
40943b9a 2696.PD 0
d60e92d1 2697.B P
40943b9a 2698Thread setup, but not started.
d60e92d1
AC
2699.TP
2700.B C
2701Thread created.
2702.TP
2703.B I
40943b9a
TK
2704Thread initialized, waiting or generating necessary data.
2705.TP
2706.B p
2707Thread running pre\-reading file(s).
2708.TP
2709.B /
2710Thread is in ramp period.
d60e92d1
AC
2711.TP
2712.B R
2713Running, doing sequential reads.
2714.TP
2715.B r
2716Running, doing random reads.
2717.TP
2718.B W
2719Running, doing sequential writes.
2720.TP
2721.B w
2722Running, doing random writes.
2723.TP
2724.B M
2725Running, doing mixed sequential reads/writes.
2726.TP
2727.B m
2728Running, doing mixed random reads/writes.
2729.TP
40943b9a
TK
2730.B D
2731Running, doing sequential trims.
2732.TP
2733.B d
2734Running, doing random trims.
2735.TP
d60e92d1
AC
2736.B F
2737Running, currently waiting for \fBfsync\fR\|(2).
2738.TP
2739.B V
40943b9a
TK
2740Running, doing verification of written data.
2741.TP
2742.B f
2743Thread finishing.
d60e92d1
AC
2744.TP
2745.B E
40943b9a 2746Thread exited, not reaped by main thread yet.
d60e92d1
AC
2747.TP
2748.B \-
40943b9a
TK
2749Thread reaped.
2750.TP
2751.B X
2752Thread reaped, exited with an error.
2753.TP
2754.B K
2755Thread reaped, exited due to signal.
d1429b5c 2756.PD
40943b9a
TK
2757.RE
2758.P
2759Fio will condense the thread string so as not to take up more space on the command
2760line than needed. For instance, if you have 10 readers and 10 writers running,
2761the output would look like this:
2762.P
2763.nf
2764 Jobs: 20 (f=20): [R(10),W(10)][4.0%][r=20.5MiB/s,w=23.5MiB/s][r=82,w=94 IOPS][eta 57m:36s]
2765.fi
d60e92d1 2766.P
40943b9a
TK
2767Note that the status string is displayed in order, so it's possible to tell which of
2768the jobs are currently doing what. In the example above this means that jobs 1\-\-10
2769are readers and 11\-\-20 are writers.
d60e92d1 2770.P
40943b9a
TK
2771The other values are fairly self explanatory \-\- number of threads currently
2772running and doing I/O, the number of currently open files (f=), the estimated
2773completion percentage, the rate of I/O since last check (read speed listed first,
2774then write speed and optionally trim speed) in terms of bandwidth and IOPS,
2775and time to completion for the current running group. It's impossible to estimate
2776runtime of the following groups (if any).
d60e92d1 2777.P
40943b9a
TK
2778When fio is done (or interrupted by Ctrl\-C), it will show the data for
2779each thread, group of threads, and disks in that order. For each overall thread (or
2780group) the output looks like:
2781.P
2782.nf
2783 Client1: (groupid=0, jobs=1): err= 0: pid=16109: Sat Jun 24 12:07:54 2017
2784 write: IOPS=88, BW=623KiB/s (638kB/s)(30.4MiB/50032msec)
2785 slat (nsec): min=500, max=145500, avg=8318.00, stdev=4781.50
2786 clat (usec): min=170, max=78367, avg=4019.02, stdev=8293.31
2787 lat (usec): min=174, max=78375, avg=4027.34, stdev=8291.79
2788 clat percentiles (usec):
2789 | 1.00th=[ 302], 5.00th=[ 326], 10.00th=[ 343], 20.00th=[ 363],
2790 | 30.00th=[ 392], 40.00th=[ 404], 50.00th=[ 416], 60.00th=[ 445],
2791 | 70.00th=[ 816], 80.00th=[ 6718], 90.00th=[12911], 95.00th=[21627],
2792 | 99.00th=[43779], 99.50th=[51643], 99.90th=[68682], 99.95th=[72877],
2793 | 99.99th=[78119]
2794 bw ( KiB/s): min= 532, max= 686, per=0.10%, avg=622.87, stdev=24.82, samples= 100
2795 iops : min= 76, max= 98, avg=88.98, stdev= 3.54, samples= 100
d3b9694d
VF
2796 lat (usec) : 250=0.04%, 500=64.11%, 750=4.81%, 1000=2.79%
2797 lat (msec) : 2=4.16%, 4=1.84%, 10=4.90%, 20=11.33%, 50=5.37%
2798 lat (msec) : 100=0.65%
40943b9a
TK
2799 cpu : usr=0.27%, sys=0.18%, ctx=12072, majf=0, minf=21
2800 IO depths : 1=85.0%, 2=13.1%, 4=1.8%, 8=0.1%, 16=0.0%, 32=0.0%, >=64=0.0%
2801 submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
2802 complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
2803 issued rwt: total=0,4450,0, short=0,0,0, dropped=0,0,0
2804 latency : target=0, window=0, percentile=100.00%, depth=8
2805.fi
2806.P
2807The job name (or first job's name when using \fBgroup_reporting\fR) is printed,
2808along with the group id, count of jobs being aggregated, last error id seen (which
2809is 0 when there are no errors), pid/tid of that thread and the time the job/group
2810completed. Below are the I/O statistics for each data direction performed (showing
2811writes in the example above). In the order listed, they denote:
d60e92d1 2812.RS
d60e92d1 2813.TP
40943b9a
TK
2814.B read/write/trim
2815The string before the colon shows the I/O direction the statistics
2816are for. \fIIOPS\fR is the average I/Os performed per second. \fIBW\fR
2817is the average bandwidth rate shown as: value in power of 2 format
2818(value in power of 10 format). The last two values show: (total
2819I/O performed in power of 2 format / \fIruntime\fR of that thread).
d60e92d1
AC
2820.TP
2821.B slat
40943b9a
TK
2822Submission latency (\fImin\fR being the minimum, \fImax\fR being the
2823maximum, \fIavg\fR being the average, \fIstdev\fR being the standard
2824deviation). This is the time it took to submit the I/O. For
2825sync I/O this row is not displayed as the slat is really the
2826completion latency (since queue/complete is one operation there).
2827This value can be in nanoseconds, microseconds or milliseconds \-\-\-
2828fio will choose the most appropriate base and print that (in the
2829example above nanoseconds was the best scale). Note: in \fB\-\-minimal\fR mode
2830latencies are always expressed in microseconds.
d60e92d1
AC
2831.TP
2832.B clat
40943b9a
TK
2833Completion latency. Same names as slat, this denotes the time from
2834submission to completion of the I/O pieces. For sync I/O, clat will
2835usually be equal (or very close) to 0, as the time from submit to
2836complete is basically just CPU time (I/O has already been done, see slat
2837explanation).
d60e92d1 2838.TP
d3b9694d
VF
2839.B lat
2840Total latency. Same names as slat and clat, this denotes the time from
2841when fio created the I/O unit to completion of the I/O operation.
2842.TP
d60e92d1 2843.B bw
40943b9a
TK
2844Bandwidth statistics based on samples. Same names as the latency stats,
2845but also includes the number of samples taken (\fIsamples\fR) and an
2846approximate percentage of total aggregate bandwidth this thread
2847received in its group (\fIper\fR). This last value is only really
2848useful if the threads in this group are on the same disk, since they
2849are then competing for disk access.
2850.TP
2851.B iops
2852IOPS statistics based on samples. Same names as \fBbw\fR.
d60e92d1 2853.TP
d3b9694d
VF
2854.B lat (nsec/usec/msec)
2855The distribution of I/O completion latencies. This is the time from when
2856I/O leaves fio to when it gets completed. Unlike the separate
2857read/write/trim sections above, the data here and in the remaining
2858sections apply to all I/Os for the reporting group. 250=0.04% means that
28590.04% of the I/Os completed in under 250us. 500=64.11% means that 64.11%
2860of the I/Os required 250 to 499us for completion.
2861.TP
d60e92d1 2862.B cpu
40943b9a
TK
2863CPU usage: user and system time, the number of context switches this
2864thread went through, and finally the number of major and minor page
2865faults. The CPU utilization
2866numbers are averages for the jobs in that reporting group, while the
2867context and fault counters are summed.
d60e92d1
AC
2868.TP
2869.B IO depths
40943b9a
TK
2870The distribution of I/O depths over the job lifetime. The numbers are
2871divided into powers of 2 and each entry covers depths from that value
2872up to those that are lower than the next entry \-\- e.g., 16= covers
2873depths from 16 to 31. Note that the range covered by a depth
2874distribution entry can be different to the range covered by the
2875equivalent \fBsubmit\fR/\fBcomplete\fR distribution entry.
2876.TP
2877.B IO submit
2878How many pieces of I/O were submitted in a single submit call. Each
2879entry denotes that amount and below, until the previous entry \-\- e.g.,
288016=100% means that we submitted anywhere between 9 and 16 I/Os per submit
2881call. Note that the range covered by a \fBsubmit\fR distribution entry can
2882be different to the range covered by the equivalent depth distribution
2883entry.
2884.TP
2885.B IO complete
2886Like the above \fBsubmit\fR number, but for completions instead.
2887.TP
2888.B IO issued rwt
2889The number of \fBread/write/trim\fR requests issued, and how many of them were
2890short or dropped.
d60e92d1 2891.TP
d3b9694d
VF
2892.B IO latency
2893These values are for \fBlatency_target\fR and related options. When
2894these options are engaged, this section describes the I/O depth required
2895to meet the specified latency target.
d60e92d1 2896.RE
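.P
The two bucket conventions described above for \fBIO depths\fR and
\fBIO submit\fR can be sketched as follows. This is only an illustration of the
ranges stated in the text: the function names are hypothetical, the list of
submit entries is an assumed example rather than something fio guarantees, and
open\-ended entries such as `>=64' are not handled.
.P
.nf
```python
def depth_bucket_range(entry):
    """Depths covered by an IO depths entry: the entry up to just below the next power of 2."""
    return (entry, 2 * entry - 1)


def submit_bucket_range(entry, entries):
    """I/Os per call covered by a submit entry: above the previous entry, up to this one."""
    ordered = sorted(entries)
    index = ordered.index(entry)
    previous = ordered[index - 1] if index > 0 else 0
    return (previous + 1, entry)


# The entry list below is an assumed example, not taken from real fio output.
print(depth_bucket_range(16))                        # (16, 31)
print(submit_bucket_range(16, [4, 8, 16, 32, 64]))   # (9, 16)
```
.fi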
.P
After each client has been listed, the group statistics are printed. They
will look like this:
.P
.nf
 Run status group 0 (all jobs):
 READ: bw=20.9MiB/s (21.9MB/s), 10.4MiB/s\-10.8MiB/s (10.9MB/s\-11.3MB/s), io=64.0MiB (67.1MB), run=2973\-3069msec
 WRITE: bw=1231KiB/s (1261kB/s), 616KiB/s\-621KiB/s (630kB/s\-636kB/s), io=64.0MiB (67.1MB), run=52747\-53223msec
.fi
.P
For each data direction it prints:
.RS
.TP
.B bw
Aggregate bandwidth of threads in this group followed by the
minimum and maximum bandwidth of all the threads in this group.
Values outside the parentheses are in power\-of\-2 format and those
within are the equivalent values in power\-of\-10 format.
.TP
.B io
Aggregate I/O performed by all threads in this group. The
format is the same as \fBbw\fR.
.TP
.B run
The shortest and longest runtimes of the threads in this group.
.RE
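.P
The relationship between the power\-of\-2 and power\-of\-10 values in the
sample above can be checked with a little arithmetic. The helper names below
are hypothetical; the input values are taken from the sample output.
.P
.nf
```python
def mib_to_mb(mib):
    """Convert MiB (2**20 bytes) to MB (10**6 bytes)."""
    return mib * 1024 * 1024 / 1_000_000


def kib_to_kb(kib):
    """Convert KiB (2**10 bytes) to kB (10**3 bytes)."""
    return kib * 1024 / 1000


print(round(mib_to_mb(20.9), 1))   # 21.9, as in bw=20.9MiB/s (21.9MB/s)
print(round(mib_to_mb(64.0), 1))   # 67.1, as in io=64.0MiB (67.1MB)
print(round(kib_to_kb(1231)))      # 1261, as in bw=1231KiB/s (1261kB/s)
```
.fi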
.P
And finally, the disk statistics are printed. This is Linux\-specific.
They will look like this:
.P
.nf
 Disk stats (read/write):
 sda: ios=16398/16511, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00%
.fi
.P
Each value is printed for both reads and writes, with reads first. The
numbers denote:
.RS
.TP
.B ios
Number of I/Os performed by all groups.
.TP
.B merge
Number of merges performed by the I/O scheduler.
.TP
.B ticks
Number of ticks we kept the disk busy.
.TP
.B in_queue
Total time spent in the disk queue.
.TP
.B util
The disk utilization. A value of 100% means we kept the disk
busy constantly; 50% would mean the disk was idle half of the time.
.RE
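.P
For scripted post\-processing, a disk statistics line like the sample above can
be split into its fields. The following is a minimal sketch that assumes the
line has exactly the layout shown in the sample; the function name is
hypothetical.
.P
.nf
```python
def parse_disk_stats(line):
    """Split a disk stats line into a dict; slash-separated values become read/write pairs."""
    name, _, rest = line.strip().partition(":")
    stats = {"disk": name.strip()}
    for item in rest.split(","):
        key, _, value = item.strip().partition("=")
        if "/" in value:
            read_value, write_value = value.split("/")
            stats[key] = {"read": int(read_value), "write": int(write_value)}
        elif value.endswith("%"):
            stats[key] = float(value.rstrip("%"))
        else:
            stats[key] = int(value)
    return stats


sample = "sda: ios=16398/16511, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00%"
stats = parse_disk_stats(sample)
print(stats["ios"]["read"], stats["in_queue"], stats["util"])
```
.fi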
.P
It is also possible to get fio to dump the current output while it is running,
without terminating the job. To do that, send fio the USR1 signal. You can
also get regularly timed dumps by using the \fB\-\-status\-interval\fR
parameter, or by creating a file in `/tmp' named
`fio\-dump\-status'. If fio sees this file, it will unlink it and dump the
current output status.
.SH TERSE OUTPUT
For scripted usage where you typically want to generate tables or graphs of the
results, fio can output the results in a semicolon\-separated format. The format
is one long line of values, such as:
.P
.nf
 2;card0;0;0;7139336;121836;60004;1;10109;27.932460;116.933948;220;126861;3495.446807;1085.368601;226;126864;3523.635629;1089.012448;24063;99944;50.275485%;59818.274627;5540.657370;7155060;122104;60004;1;8338;29.086342;117.839068;388;128077;5032.488518;1234.785715;391;128085;5061.839412;1236.909129;23436;100928;50.287926%;59964.832030;5644.844189;14.595833%;19.394167%;123706;0;7313;0.1%;0.1%;0.1%;0.1%;0.1%;0.1%;100.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.01%;0.02%;0.05%;0.16%;6.04%;40.40%;52.68%;0.64%;0.01%;0.00%;0.01%;0.00%;0.00%;0.00%;0.00%;0.00%
 A description of this job goes here.
.fi
.P
The job description (if provided) follows on a second line.
.P
To enable terse output, use the \fB\-\-minimal\fR or
`\-\-output\-format=terse' command line options. The
first value is the version of the terse output format. If the output has to be
changed for some reason, this number will be incremented by 1 to signify that
change.
.P
Split up, the format is as follows (comments in brackets denote when a
field was introduced or whether it's specific to some terse version):
.P
.nf
 terse version, fio version [v3], jobname, groupid, error
.fi
.RS
.P
.B
READ status:
.RE
.P
.nf
 Total IO (KiB), bandwidth (KiB/sec), IOPS, runtime (msec)
 Submission latency: min, max, mean, stdev (usec)
 Completion latency: min, max, mean, stdev (usec)
 Completion latency percentiles: 20 fields (see below)
 Total latency: min, max, mean, stdev (usec)
 Bw (KiB/s): min, max, aggregate percentage of total, mean, stdev, number of samples [v5]
 IOPS [v5]: min, max, mean, stdev, number of samples
.fi
.RS
.P
.B
WRITE status:
.RE
.P
.nf
 Total IO (KiB), bandwidth (KiB/sec), IOPS, runtime (msec)
 Submission latency: min, max, mean, stdev (usec)
 Completion latency: min, max, mean, stdev (usec)
 Completion latency percentiles: 20 fields (see below)
 Total latency: min, max, mean, stdev (usec)
 Bw (KiB/s): min, max, aggregate percentage of total, mean, stdev, number of samples [v5]
 IOPS [v5]: min, max, mean, stdev, number of samples
.fi
.RS
.P
.B
TRIM status [all but version 3]:
.RE
.P
.nf
 Fields are similar to \fBREAD/WRITE\fR status.
.fi
.RS
.P
.B
CPU usage:
.RE
.P
.nf
 user, system, context switches, major faults, minor faults
.fi
.RS
.P
.B
I/O depths:
.RE
.P
.nf
 <=1, 2, 4, 8, 16, 32, >=64
.fi
.RS
.P
.B
I/O latencies microseconds:
.RE
.P
.nf
 <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000
.fi
.RS
.P
.B
I/O latencies milliseconds:
.RE
.P
.nf
 <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000, 2000, >=2000
.fi
.RS
.P
.B
Disk utilization [v3]:
.RE
.P
.nf
 disk name, read ios, write ios, read merges, write merges, read ticks, write ticks, time spent in queue, disk utilization percentage
.fi
.RS
.P
.B
Additional Info (dependent on continue_on_error, default off):
.RE
.P
.nf
 total # errors, first error code
.fi
.RS
.P
.B
Additional Info (dependent on description being set):
.RE
.P
.nf
 Text description
.fi
.P
Completion latency percentiles can be a grouping of up to 20 sets, so for the
terse output fio writes all of them. Each field will look like this:
.P
.nf
 1.00%=6112
.fi
.P
where the value before `=' is the percentile and the value after it is the
latency, in usec, associated with that percentile.
.P
For \fBDisk utilization\fR, all disks used by fio are shown. So for each disk there
will be a disk utilization section.
.P
Below is a single line containing short names for each of the fields in the
minimal output v3, separated by semicolons:
.P
.nf
 terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth;read_iops;read_runtime_ms;read_slat_min;read_slat_max;read_slat_mean;read_slat_dev;read_clat_min;read_clat_max;read_clat_mean;read_clat_dev;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_tlat_min;read_lat_max;read_lat_mean;read_lat_dev;read_bw_min;read_bw_max;read_bw_agg_pct;read_bw_mean;read_bw_dev;write_kb;write_bandwidth;write_iops;write_runtime_ms;write_slat_min;write_slat_max;write_slat_mean;write_slat_dev;write_clat_min;write_clat_max;write_clat_mean;write_clat_dev;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_tlat_min;write_lat_max;write_lat_mean;write_lat_dev;write_bw_min;write_bw_max;write_bw_agg_pct;write_bw_mean;write_bw_dev;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;write_ticks;disk_queue_time;disk_util
.fi
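.P
As a rough sketch of how a script might consume terse output, the snippet below
maps the leading fields of a v3 line onto the first nine short names listed
above. The function name and the sample line are made up for illustration; a
real line carries many more fields, which this sketch simply ignores.
.P
.nf
```python
# First nine short names from the v3 field list above.
LEADING_FIELDS = [
    "terse_version_3", "fio_version", "jobname", "groupid", "error",
    "read_kb", "read_bandwidth", "read_iops", "read_runtime_ms",
]


def parse_terse_v3_prefix(line):
    """Return the leading v3 fields of a terse line as a dict of strings."""
    values = line.strip().split(";")
    if values[0] != "3":
        raise ValueError("expected terse version 3, got " + values[0])
    return dict(zip(LEADING_FIELDS, values))


# Hypothetical, shortened sample line (not real fio output).
sample = "3;fio-3.0;job1;0;0;65536;21845;5461;3000"
fields = parse_terse_v3_prefix(sample)
print(fields["jobname"], fields["read_iops"])
```
.fi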
.SH JSON+ OUTPUT
The \fBjson+\fR output format is identical to the \fBjson\fR output format except that it
adds a full dump of the completion latency bins. Each \fBbins\fR object contains a
set of (key, value) pairs where keys are latency durations and values count how
many I/Os had completion latencies of the corresponding duration. For example,
consider:
.RS
.P
"bins" : { "87552" : 1, "89600" : 1, "94720" : 1, "96768" : 1, "97792" : 1, "99840" : 1, "100864" : 2, "103936" : 6, "104960" : 534, "105984" : 5995, "107008" : 7529, ... }
.RE
.P
This data indicates that one I/O required 87,552ns to complete, two I/Os required
100,864ns to complete, and 7529 I/Os required 107,008ns to complete.
.P
Also included with fio is a Python script \fBfio_jsonplus_clat2csv\fR that takes
json+ output and generates CSV\-formatted latency data suitable for plotting.
.P
The latency durations actually represent the midpoints of latency intervals.
For details refer to `stat.h' in the fio source.
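.P
A \fBbins\fR object can be summarized with a few lines of code. The sketch
below uses only the bins shown in the example above (the elided entries are
left out) and treats each key as the latency of every I/O in that bin, which is
an approximation since the keys are interval midpoints. The function name is
hypothetical.
.P
.nf
```python
# Bins copied from the example above; the elided entries are not included.
bins = {
    "87552": 1, "89600": 1, "94720": 1, "96768": 1, "97792": 1, "99840": 1,
    "100864": 2, "103936": 6, "104960": 534, "105984": 5995, "107008": 7529,
}


def summarize_bins(bins):
    """Return (number of I/Os, approximate mean completion latency in ns)."""
    total = sum(bins.values())
    weighted = sum(int(latency_ns) * count for latency_ns, count in bins.items())
    return total, weighted / total


total, mean_ns = summarize_bins(bins)
print(total)   # 14072
```
.fi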
.SH TRACE FILE FORMAT
There are two trace file formats that you can encounter. The older (v1) format is
unsupported since version 1.20\-rc3 (March 2008). It will still be described
below in case you get an old trace and want to understand it.
.P
In any case the trace is a simple text file with a single action per line.
.TP
.B Trace file format v1
Each line represents a single I/O action in the following format:
.RS
.RS
.P
rw, offset, length
.RE
.P
where `rw' is 0 for a read and 1 for a write, and the `offset' and `length' entries are in bytes.
.P
This format is not supported in fio versions >= 1.20\-rc3.
.RE
.TP
.B Trace file format v2
The second version of the trace file format was added in fio version 1.17. It
allows access to more than one file per trace and has a bigger set of possible
file actions.
.RS
.P
The first line of the trace file has to be:
.RS
.P
"fio version 2 iolog"
.RE
.P
Following this can be lines in two different formats, which are described below.
.P
.B
The file management format:
.RS
filename action
.P
The `filename' is given as an absolute path. The `action' can be one of these:
.RS
.TP
.B add
Add the given `filename' to the trace.
.TP
.B open
Open the file with the given `filename'. The `filename' has to have
been added with the \fBadd\fR action before.
.TP
.B close
Close the file with the given `filename'. The file has to have been
\fBopen\fRed before.
.RE
.RE
.P
.B
The file I/O action format:
.RS
filename action offset length
.P
The `filename' is given as an absolute path, and has to have been \fBadd\fRed and
\fBopen\fRed before it can be used with this format. The `offset' and `length' are
given in bytes. The `action' can be one of these:
.RS
.TP
.B wait
Wait for `offset' microseconds. Everything below 100 is discarded.
The time is relative to the previous `wait' statement.
.TP
.B read
Read `length' bytes beginning from `offset'.
.TP
.B write
Write `length' bytes beginning from `offset'.
.TP
.B sync
\fBfsync\fR\|(2) the file.
.TP
.B datasync
\fBfdatasync\fR\|(2) the file.
.TP
.B trim
Trim the given file from the given `offset' for `length' bytes.
.RE
.RE
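.P
Putting the two line formats together, a small v2 trace for a single file
could be generated as sketched below. The helper name and the file path are
hypothetical, and only the `read', `write' and `trim' actions are covered.
.P
.nf
```python
def build_iolog_v2(filename, ios):
    """Build the lines of a v2 trace for one file from (action, offset, length) tuples."""
    lines = ["fio version 2 iolog", filename + " add", filename + " open"]
    for action, offset, length in ios:
        if action not in ("read", "write", "trim"):
            raise ValueError("unsupported action: " + action)
        lines.append("{} {} {} {}".format(filename, action, offset, length))
    lines.append(filename + " close")
    return lines


# Hypothetical file path: write 4096 bytes at offset 0, then read them back.
trace = build_iolog_v2("/tmp/testfile", [("write", 0, 4096), ("read", 0, 4096)])
for line in trace:
    print(line)
```
.fi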
.SH CPU IDLENESS PROFILING
In some cases, we want to understand CPU overhead in a test. For example, we
may want to test whether a patch reduces CPU usage.
Fio implements a balloon approach to create a thread per CPU that runs at idle
priority, meaning that it only runs when nobody else needs the CPU.
By measuring the amount of work completed by the thread, idleness of each CPU
can be derived accordingly.
.P
A unit of work is defined as touching a full page of unsigned characters. The mean and
standard deviation of the time to complete a unit of work are reported in the "unit work"
section. Options can be chosen to report detailed per\-CPU idleness or overall
system idleness by aggregating per\-CPU stats.
.SH VERIFICATION AND TRIGGERS
Fio is usually run in one of two ways when data verification is done. The first
is a normal write job of some sort with verify enabled. When the write phase has
completed, fio switches to reads and verifies everything it wrote. The second
model is running just the write phase, and then later on running the same job
(but with reads instead of writes) to repeat the same I/O patterns and verify
the contents. Both of these methods depend on the write phase being completed,
as fio otherwise has no idea how much data was written.
.P
With verification triggers, fio supports dumping the current write state to
local files. Then a subsequent read verify workload can load this state and know
exactly where to stop. This is useful for testing cases where power is cut to a
server in a managed fashion, for instance.
.P
A verification trigger consists of two things:
.RS
.P
1) Storing the write state of each job.
.P
2) Executing a trigger command.
.RE
.P
The write state is relatively small, on the order of hundreds of bytes to a few
kilobytes. It contains information on the number of completions done, the last X
completions, etc.
.P
A trigger is invoked either through creation ('touch') of a specified file in
the system, or through a timeout setting. If fio is run with
`\-\-trigger\-file=/tmp/trigger\-file', then it will continually
check for the existence of `/tmp/trigger\-file'. When it sees this file, it
will fire off the trigger (thus saving state, and executing the trigger
command).
.P
For client/server runs, there's both a local and remote trigger. If fio is
running as a server backend, it will send the job states back to the client for
safe storage, then execute the remote trigger, if specified. If a local trigger
is specified, the server will still send back the write state, but the client
will then execute the trigger.
.RE
.P
.B Verification trigger example
.RS
Let's say we want to run a powercut test on the remote Linux machine 'server'.
Our write workload is in `write\-test.fio'. We want to cut power to 'server' at
some point during the run, and we'll run this test from the safety of our local
machine, 'localbox'. On the server, we'll start the fio backend normally:
.RS
.P
server# fio \-\-server
.RE
.P
and on the client, we'll fire off the workload:
.RS
.P
localbox$ fio \-\-client=server \-\-trigger\-file=/tmp/my\-trigger \-\-trigger\-remote="bash \-c \(aqecho b > /proc/sysrq\-trigger\(aq"
.RE
.P
We set `/tmp/my\-trigger' as the trigger file, and we tell fio to execute:
.RS
.P
echo b > /proc/sysrq\-trigger
.RE
.P
on the server once it has received the trigger and sent us the write state. This
will work, but it's not really cutting power to the server, it's merely
abruptly rebooting it. If we have a remote way of cutting power to the server
through IPMI or similar, we could do that through a local trigger command
instead. Let's assume we have a script that performs an IPMI reboot of a given hostname,
ipmi\-reboot. On localbox, we could then have run fio with a local trigger
instead:
.RS
.P
localbox$ fio \-\-client=server \-\-trigger\-file=/tmp/my\-trigger \-\-trigger="ipmi\-reboot server"
.RE
.P
For this case, fio would wait for the server to send us the write state, then
execute `ipmi\-reboot server' when that happened.
.RE
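.P
Firing a file\-based trigger from a test harness amounts to creating the
trigger file on the machine where fio is watching for it. The sketch below
shows that step only; the function name is hypothetical, and the demonstration
uses a temporary directory rather than a real trigger path such as
`/tmp/my\-trigger'.
.P
.nf
```python
import os
import tempfile
from pathlib import Path


def fire_trigger(trigger_file):
    """Create the trigger file (like the touch command) and report whether it exists."""
    path = Path(trigger_file)
    path.touch()
    return path.exists()


# Demonstrate with a throwaway directory instead of a real trigger path.
with tempfile.TemporaryDirectory() as workdir:
    created = fire_trigger(os.path.join(workdir, "my-trigger"))
    print(created)
```
.fi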
.P
.B Loading verify state
.RS
To load stored write state, a read verification job file must contain the
\fBverify_state_load\fR option. If that is set, fio will load the previously
stored state. For a local fio run this is done by loading the files directly,
and on a client/server run, the server backend will ask the client to send the
files over and load them from there.
.RE
.SH LOG FILE FORMATS
Fio supports a variety of log file formats, for logging latencies, bandwidth,
and IOPS. The logs share a common format, which looks like this:
.RS
.P
time (msec), value, data direction, block size (bytes), offset (bytes)
.RE
.P
`Time' for the log entry is always in milliseconds. The `value' logged depends
on the type of log and will be one of the following:
.RS
.TP
.B Latency log
Value is latency in nsecs
.TP
.B Bandwidth log
Value is in KiB/sec
.TP
.B IOPS log
Value is IOPS
.RE
.P
`Data direction' is one of the following:
.RS
.TP
.B 0
I/O is a READ
.TP
.B 1
I/O is a WRITE
.TP
.B 2
I/O is a TRIM
.RE
.P
The entry's `block size' is always in bytes. The `offset' is the offset, in bytes,
from the start of the file, for that particular I/O. The logging of the offset can be
toggled with \fBlog_offset\fR.
.P
Fio defaults to logging every individual I/O. When IOPS are logged for individual
I/Os the `value' entry will always be 1. If windowed logging is enabled through
\fBlog_avg_msec\fR, fio logs the average values over the specified period of time.
If windowed logging is enabled and \fBlog_max_value\fR is set, then fio logs
maximum values in that window instead of averages. Since `data direction', `block size'
and `offset' are per\-I/O values, if windowed logging is enabled they
aren't applicable and will be 0.
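.P
A log entry in the common format above can be decoded as sketched below. The
function name and the sample entry are hypothetical; the sketch assumes
comma\-separated integer fields and accepts entries both with and without the
trailing offset.
.P
.nf
```python
DIRECTIONS = {0: "read", 1: "write", 2: "trim"}


def parse_log_line(line):
    """Decode one log entry; offset is None when the entry has no offset field."""
    fields = [int(field.strip()) for field in line.split(",")]
    if len(fields) not in (4, 5):
        raise ValueError("expected 4 or 5 fields, got {}".format(len(fields)))
    time_ms, value, direction, block_size = fields[:4]
    return {
        "time_ms": time_ms,
        "value": value,
        "direction": DIRECTIONS[direction],
        "block_size": block_size,
        "offset": fields[4] if len(fields) == 5 else None,
    }


# Made-up entry: at 1000 msec, value 250, a WRITE of 4096 bytes at offset 8192.
entry = parse_log_line("1000, 250, 1, 4096, 8192")
print(entry["direction"], entry["block_size"], entry["offset"])
```
.fi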
.SH CLIENT / SERVER
Normally fio is invoked as a stand\-alone application on the machine where the
I/O workload should be generated. However, the backend and frontend of fio can
be run separately, i.e., the fio server can generate an I/O workload on the "Device
Under Test" while being controlled by a client on another machine.
.P
Start the server on the machine which has access to the storage DUT:
.RS
.P
$ fio \-\-server=args
.RE
.P
where `args' defines what fio listens to. The arguments are of the form
`type:hostname,port' or `type:IP,port', where the `type:' prefix and the
`,port' suffix may be omitted (see the examples below). `type' is either
`ip' (or ip4) for TCP/IP
v4, `ip6' for TCP/IP v6, or `sock' for a local unix domain socket.
`hostname' is either a hostname or IP address, and `port' is the port to listen
on (only valid for TCP/IP, not a local socket). Some examples:
.RS
.TP
1) \fBfio \-\-server\fR
Start a fio server, listening on all interfaces on the default port (8765).
.TP
2) \fBfio \-\-server=ip:hostname,4444\fR
Start a fio server, listening on IP belonging to hostname and on port 4444.
.TP
3) \fBfio \-\-server=ip6:::1,4444\fR
Start a fio server, listening on IPv6 localhost ::1 and on port 4444.
.TP
4) \fBfio \-\-server=,4444\fR
Start a fio server, listening on all interfaces on port 4444.
.TP
5) \fBfio \-\-server=1.2.3.4\fR
Start a fio server, listening on IP 1.2.3.4 on the default port.
.TP
6) \fBfio \-\-server=sock:/tmp/fio.sock\fR
Start a fio server, listening on the local socket `/tmp/fio.sock'.
.RE
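.P
To make the argument syntax in the examples above concrete, the following
sketch splits a \fB\-\-server\fR argument into its parts. It is an illustration
written from those examples only, not fio's actual parser; the function name is
hypothetical and unusual inputs may be handled differently by fio.
.P
.nf
```python
DEFAULT_PORT = 8765
KNOWN_TYPES = ("ip", "ip4", "ip6", "sock")


def parse_server_arg(arg):
    """Split a server argument into type, host (or socket path) and port."""
    kind = None
    prefix, sep, rest = arg.partition(":")
    if sep and prefix in KNOWN_TYPES:
        kind, arg = prefix, rest
    if kind == "sock":
        return {"type": kind, "path": arg, "port": None}
    host, sep, port_text = arg.rpartition(",")
    if sep:
        port = int(port_text)
    else:
        host, port = arg, DEFAULT_PORT
    # An empty host stands for listening on all interfaces.
    return {"type": kind, "host": host, "port": port}


for example in ("ip:hostname,4444", "ip6:::1,4444", ",4444", "1.2.3.4", "sock:/tmp/fio.sock"):
    print(example, parse_server_arg(example))
```
.fi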
.P
Once a server is running, a "client" can connect to the fio server with:
.RS
.P
$ fio <local\-args> \-\-client=<server> <remote\-args> <job file(s)>
.RE
.P
where `local\-args' are arguments for the client where it is running, `server'
is the connect string, and `remote\-args' and `job file(s)' are sent to the
server. The `server' string follows the same format as it does on the server
side, to allow IP/hostname/socket and port strings.
.P
Fio can connect to multiple servers this way:
.RS
.P
$ fio \-\-client=<server1> <job file(s)> \-\-client=<server2> <job file(s)>
.RE
.P
If the job file is located on the fio server, then you can tell the server to
load a local file as well. This is done by using \fB\-\-remote\-config\fR:
.RS
.P
$ fio \-\-client=server \-\-remote\-config /path/to/file.fio
.RE
.P
Then fio will open this local (to the server) job file instead of being passed
one from the client.
.P
If you have many servers (example: 100 VMs/containers), you can input a pathname
of a file containing host IPs/names as the parameter value for the
\fB\-\-client\fR option. For example, here is a `host.list'
file containing 2 hostnames:
.RS
.P
.PD 0
host1.your.dns.domain
.P
host2.your.dns.domain
.PD
.RE
.P
The fio command would then be:
.RS
.P
$ fio \-\-client=host.list <job file(s)>
.RE
.P
In this mode, you cannot input server\-specific parameters or job files \-\- all
servers receive the same job file.
.P
In order to let `fio \-\-client' runs use a shared filesystem from multiple
hosts, `fio \-\-client' now prepends the IP address of the server to the
filename. For example, if fio is using the directory `/mnt/nfs/fio' and is
writing filename `fileio.tmp', with a \fB\-\-client\fR `hostfile'
containing two hostnames `h1' and `h2' with IP addresses 192.168.10.120 and
192.168.10.121, then fio will create two files:
.RS
.P
.PD 0
/mnt/nfs/fio/192.168.10.120.fileio.tmp
.P
/mnt/nfs/fio/192.168.10.121.fileio.tmp
.PD
.RE
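.P
When post\-processing such a run, the per\-server file names can be
reconstructed from the naming rule described above. This is a small sketch of
that rule as stated in the text; the function name is hypothetical.
.P
.nf
```python
import posixpath


def client_filename(directory, server_ip, filename):
    """Return the path used for a server: its IP address prepended to the file name."""
    return posixpath.join(directory, server_ip + "." + filename)


paths = [
    client_filename("/mnt/nfs/fio", ip, "fileio.tmp")
    for ip in ("192.168.10.120", "192.168.10.121")
]
for path in paths:
    print(path)
```
.fi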
.SH AUTHORS
.B fio
was written by Jens Axboe <jens.axboe@oracle.com>,
now Jens Axboe <axboe@fb.com>.
.br
This man page was written by Aaron Carroll <aaronc@cse.unsw.edu.au> based
on documentation by Jens Axboe.
.br
This man page was rewritten by Tomohiro Kusumi <tkusumi@tuxera.com> based
on documentation by Jens Axboe.
.SH "REPORTING BUGS"
Report bugs to the \fBfio\fR mailing list <fio@vger.kernel.org>.
.br
See \fBREPORTING\-BUGS\fR.
.P
\fBREPORTING\-BUGS\fR: \fIhttp://git.kernel.dk/cgit/fio/plain/REPORTING\-BUGS\fR
.SH "SEE ALSO"
For further documentation see \fBHOWTO\fR and \fBREADME\fR.
.br
Sample jobfiles are available in the `examples/' directory.
.br
These are typically located under `/usr/share/doc/fio'.
.P
\fBHOWTO\fR: \fIhttp://git.kernel.dk/cgit/fio/plain/HOWTO\fR
.br
\fBREADME\fR: \fIhttp://git.kernel.dk/cgit/fio/plain/README\fR