.. option:: --debug=type
- Enable verbose tracing of various fio actions. May be ``all`` for all types
- or individual types separated by a comma (e.g. ``--debug=file,mem`` will
- enable file and memory debugging). Currently, additional logging is
- available for:
+ Enable verbose tracing of various fio actions. May be ``all`` for all types
+ or individual types separated by a comma (e.g. ``--debug=file,mem`` will
+ enable file and memory debugging). Currently, additional logging is
+ available for:
- *process*
+ *process*
Dump info related to processes.
- *file*
+ *file*
Dump info related to file actions.
- *io*
+ *io*
Dump info related to I/O queuing.
- *mem*
+ *mem*
Dump info related to memory allocations.
- *blktrace*
+ *blktrace*
Dump info related to blktrace setup.
- *verify*
+ *verify*
Dump info related to I/O verification.
- *all*
+ *all*
Enable all debug options.
- *random*
+ *random*
Dump info related to random offset generation.
- *parse*
+ *parse*
Dump info related to option matching and parsing.
- *diskutil*
+ *diskutil*
Dump info related to disk utilization updates.
- *job:x*
+ *job:x*
Dump info only related to job number x.
- *mutex*
+ *mutex*
Dump info only related to mutex up/down ops.
- *profile*
+ *profile*
Dump info related to profile extensions.
- *time*
+ *time*
Dump info related to internal time keeping.
- *net*
+ *net*
Dump info related to networking connections.
- *rate*
+ *rate*
Dump info related to I/O rate switching.
- *compress*
+ *compress*
Dump info related to log compress/decompress.
- *?* or *help*
+ *?* or *help*
Show available debug options.
.. option:: --parse-only
- Parse options only, don\'t start any I/O.
+ Parse options only, don't start any I/O.
.. option:: --output=filename
Write output to file `filename`.
+.. option:: --output-format=type
+
+ Set the reporting format to `normal`, `terse`, `json`, or `json+`. Multiple
+ formats can be selected, separated by a comma. `terse` is a CSV based
+ format. `json+` is like `json`, except it adds a full dump of the latency
+ buckets.
+
+.. option:: --runtime
+ Limit run time to runtime seconds.
+
.. option:: --bandwidth-log
Generate aggregate bandwidth logs.
.. option:: --append-terse
- Print statistics in selected mode AND terse, semicolon-delimited format.
- **deprecated**, use :option:`--output-format` instead to select multiple
- formats.
-
-.. option:: --output-format=type
-
- Set the reporting format to `normal`, `terse`, `json`, or `json+`. Multiple
- formats can be selected, separated by a comma. `terse` is a CSV based
- format. `json+` is like `json`, except it adds a full dump of the latency
- buckets.
+ Print statistics in selected mode AND terse, semicolon-delimited format.
+ **Deprecated**, use :option:`--output-format` instead to select multiple
+ formats.
.. option:: --terse-version=type
.. option:: --version
- Print version info and exit.
+ Print version information and exit.
.. option:: --help
.. option:: --crctest=[test]
- Test the speed of the built-in checksumming functions. If no argument is
- given all of them are tested. Alternatively, a comma separated list can be passed, in
- which case the given ones are tested.
+ Test the speed of the built-in checksumming functions. If no argument is
+ given, all of them are tested. Alternatively, a comma separated list can
+ be passed, in which case the given ones are tested.
.. option:: --cmdhelp=command
.. option:: --enghelp=[ioengine[,command]]
- List all commands defined by :option:`ioengine`, or print help for `command`
- defined by :option:`ioengine`. If no :option:`ioengine` is given, list all
- available ioengines.
+ List all commands defined by :option:`ioengine`, or print help for `command`
+ defined by :option:`ioengine`. If no :option:`ioengine` is given, list all
+ available ioengines.
.. option:: --showcmd=jobfile
- Turn a job file into command line options.
+ Convert `jobfile` to a set of command-line options.
.. option:: --readonly
- Turn on safety read-only checks, preventing writes. The ``--readonly``
- option is an extra safety guard to prevent users from accidentally starting
- a write workload when that is not desired. Fio will only write if
- `rw=write/randwrite/rw/randrw` is given. This extra safety net can be used
- as an extra precaution as ``--readonly`` will also enable a write check in
- the I/O engine core to prevent writes due to unknown user space bug(s).
+ Turn on safety read-only checks, preventing writes. The ``--readonly``
+ option is an extra safety guard to prevent users from accidentally starting
+ a write workload when that is not desired. Fio will only write if
+ `rw=write/randwrite/rw/randrw` is given. This extra safety net can be used
+ as an extra precaution as ``--readonly`` will also enable a write check in
+ the I/O engine core to prevent writes due to unknown user space bug(s).
.. option:: --eta=when
- When real-time ETA estimate should be printed. May be `always`, `never` or
- `auto`.
+ Specifies when the real-time ETA estimate should be printed. `when` may be
+ `always`, `never` or `auto`.
.. option:: --eta-newline=time
.. option:: --section=name
- Only run specified section in job file. Multiple sections can be specified.
- The ``--section`` option allows one to combine related jobs into one file.
- E.g. one job file could define light, moderate, and heavy sections. Tell
- fio to run only the "heavy" section by giving ``--section=heavy``
- command line option. One can also specify the "write" operations in one
- section and "verify" operation in another section. The ``--section`` option
- only applies to job sections. The reserved *global* section is always
- parsed and used.
+ Only run specified section `name` in job file. Multiple sections can be specified.
+ The ``--section`` option allows one to combine related jobs into one file.
+ E.g. one job file could define light, moderate, and heavy sections. Tell
+ fio to run only the "heavy" section by giving ``--section=heavy``
+ command line option. One can also specify the "write" operations in one
+ section and "verify" operation in another section. The ``--section`` option
+ only applies to job sections. The reserved *global* section is always
+ parsed and used.
.. option:: --alloc-size=kb
- Set the internal smalloc pool to this size in KiB. The
- ``--alloc-size`` switch allows one to use a larger pool size for smalloc.
- If running large jobs with randommap enabled, fio can run out of memory.
- Smalloc is an internal allocator for shared structures from a fixed size
- memory pool and can grow to 16 pools. The pool size defaults to 16MiB.
+ Set the internal smalloc pool size to `kb` in KiB. The
+ ``--alloc-size`` switch allows one to use a larger pool size for smalloc.
+ If running large jobs with randommap enabled, fio can run out of memory.
+ Smalloc is an internal allocator for shared structures from a fixed size
+ memory pool and can grow to 16 pools. The pool size defaults to 16MiB.
- NOTE: While running :file:`.fio_smalloc.*` backing store files are visible
- in :file:`/tmp`.
+ NOTE: While running :file:`.fio_smalloc.*` backing store files are visible
+ in :file:`/tmp`.
.. option:: --warnings-fatal
- All fio parser warnings are fatal, causing fio to exit with an
- error.
+ All fio parser warnings are fatal, causing fio to exit with an
+ error.
.. option:: --max-jobs=nr
- Maximum number of threads/processes to support.
+ Set the maximum number of threads/processes to support.
.. option:: --server=args
- Start a backend server, with `args` specifying what to listen to.
- See `Client/Server`_ section.
+ Start a backend server, with `args` specifying what to listen to.
+ See `Client/Server`_ section.
.. option:: --daemonize=pidfile
- Background a fio server, writing the pid to the given `pidfile` file.
+ Background a fio server, writing the pid to the given `pidfile` file.
.. option:: --client=hostname
- Instead of running the jobs locally, send and run them on the given host or
- set of hosts. See `Client/Server`_ section.
+ Instead of running the jobs locally, send and run them on the given host or
+ set of hosts. See `Client/Server`_ section.
.. option:: --remote-config=file
.. option:: --idle-prof=option
- Report CPU idleness. *option* is one of the following:
+ Report CPU idleness. `option` is one of the following:
**calibrate**
Run unit work calibration only and exit.
run. Simple math is also supported on these keywords, so you can perform actions
like::
- size=8*$mb_memory
+ size=8*$mb_memory
and get that properly expanded to 8 times the size of memory in the machine.
~~~~~~~~~~~~~~~
**str**
- String. This is a sequence of alpha characters.
+ String: A sequence of alphanumeric characters.
**time**
Integer with possible time suffix. Without a unit value is interpreted as
Integer. A whole number value, which may contain an integer prefix
and an integer suffix:
- [*integer prefix*] **number** [*integer suffix*]
+ [*integer prefix*] **number** [*integer suffix*]
The optional *integer prefix* specifies the number's base. The default
is decimal. *0x* specifies hexadecimal.
To specify power-of-2 binary values defined in IEC 80000-13:
- * *k* -- means kibi (Ki) or 1024
+ * *K* -- means kibi (Ki) or 1024
* *M* -- means mebi (Mi) or 1024**2
* *G* -- means gibi (Gi) or 1024**3
* *T* -- means tebi (Ti) or 1024**4
compatibility with old scripts. For example, 4k means 4096.
For quantities of data, an optional unit of 'B' may be included
- (e.g., 'kB' is the same as 'k').
+ (e.g., 'kB' is the same as 'k').
The *integer suffix* is not case sensitive (e.g., m/mi mean mebi/mega,
not milli). 'b' and 'B' both mean byte, not bit.
**float_list**
A list of floating point numbers, separated by a ':' character.
+With the above in mind, here follows the complete list of fio job parameters.
+
Units
~~~~~
Bit based.
-With the above in mind, here follows the complete list of fio job parameters.
-
-
Job description
~~~~~~~~~~~~~~~
**pareto**
Use a *Pareto* distribution to decide what file to access.
- **gauss**
+ **normal**
Use a *Gaussian* (normal) distribution to decide what file to
access.
+ **gauss**
+ Alias for normal.
+
For *random*, *roundrobin*, and *sequential*, a postfix can be appended to
tell fio how many I/Os to issue before switching to a new file. For example,
specifying ``file_service_type=random:8`` would cause fio to issue
``sequential`` is only useful for random I/O, where fio would normally
generate a new random offset for every I/O. If you append e.g. 8 to randread,
- you would get a new random offset for every 8 I/O's. The result would be a
- seek for only every 8 I/O's, instead of for every I/O. Use ``rw=randread:8``
+ you would get a new random offset for every 8 I/Os. The result would be a
+ seek for only every 8 I/Os, instead of for every I/O. Use ``rw=randread:8``
to specify that. As sequential I/O is already sequential, setting
``sequential`` for that would not result in any differences. ``identical``
behaves in a similar fashion, except it sends the same offset 8 number of
**none**
Do not pre-allocate space.
+ **native**
+ Use a platform's native pre-allocation call but fall back to
+ **none** behavior if it fails/is not implemented.
+
**posix**
Pre-allocate via :manpage:`posix_fallocate(3)`.
Backward-compatible alias for **posix**.
May not be available on all supported platforms. **keep** is only available
- on Linux. If using ZFS on Solaris this must be set to **none** because ZFS
- doesn't support it. Default: **posix**.
+ on Linux. If using ZFS on Solaris this cannot be set to **posix**
+ because ZFS doesn't support pre-allocation. Default: **native** if any
+ pre-allocation methods are available, **none** if not.
.. option:: fadvise_hint=str
**random**
Advise using **FADV_RANDOM**.
-.. option:: fadvise_stream=int
+.. option:: write_hint=str
+
+ Use :manpage:`fcntl(2)` to advise the kernel what life time to expect
+ from a write. Only supported on Linux, as of version 4.13. Accepted
+ values are:
+
+ **none**
+ No particular life time associated with this file.
+
+ **short**
+ Data written to this file has a short life time.
+
+ **medium**
+ Data written to this file has a medium life time.
+
+ **long**
+ Data written to this file has a long life time.
- Use :manpage:`posix_fadvise(2)` to advise the kernel what stream ID the
- writes issued belong to. Only supported on Linux. Note, this option may
- change going forward.
+ **extreme**
+ Data written to this file has a very long life time.
+
+ The values are all relative to each other, and no absolute meaning
+ should be associated with them.
.. option:: offset=int
* 60% of accesses should be to the first 10%
* 30% of accesses should be to the next 20%
- * 8% of accesses should be to to the next 30%
+ * 8% of accesses should be to the next 30%
* 2% of accesses should be to the next 40%
we can define that through zoning of the random accesses. For the above
typically won't work with direct I/O, as that normally requires sector
alignment.
-.. option:: bs_is_seq_rand
+.. option:: bs_is_seq_rand=bool
If this option is set, fio will use the normal read,write blocksize settings
as sequential,random blocksize settings instead. Any random read or write
**cudamalloc**
Use GPU memory as the buffers for GPUDirect RDMA benchmark.
+ The ioengine must be rdma.
The area allocated is a function of the maximum allowed bs size for the job,
multiplied by the I/O depth given. Note that for **shmhuge** and
Set RWF_HIPRI on I/O, indicating to the kernel that it's of higher priority
than normal.
+.. option:: hipri_percentage : [pvsync2]
+
+ When hipri is set, this determines the probability of a pvsync2 I/O being
+ high priority. The default is 100%.
+
.. option:: cpuload=int : [cpuio]
Attempt to use the specified percentage of CPU cycles. This is a mandatory
[libhdfs]
- the listening port of the HFDS cluster namenode.
+ The listening port of the HDFS cluster namenode.
.. option:: interface=str : [netsplice] [net]
hostname if the job is a TCP listener or UDP reader. For unix sockets, the
normal filename option should be used and the port is invalid.
-.. option:: listen : [net]
+.. option:: listen : [netsplice] [net]
For TCP network connections, tell fio to listen for incoming connections
rather than initiating an outgoing connection. The :option:`hostname` must
be omitted if this option is used.
-.. option:: pingpong : [net]
+.. option:: pingpong : [netsplice] [net]
Normally a network writer will just continue writing data, and a network
reader will just consume packages. If ``pingpong=1`` is set, a writer will
``pingpong=1`` should only be set for a single reader when multiple readers
are listening to the same address.
-.. option:: window_size : [net]
+.. option:: window_size : [netsplice] [net]
Set the desired socket buffer size for the connection.
-.. option:: mss : [net]
+.. option:: mss : [netsplice] [net]
Set the TCP maximum segment size (TCP_MAXSEG).
**0**
Default. Preallocate donor's file on init.
**1**
- Allocate space immediately inside defragment event, and free right
+ Allocate space immediately inside defragment event, and free right
after event.
.. option:: clustername=str : [rbd]
.. option:: chunk_size : [libhdfs]
- the size of the chunk to use for each file.
+ The size of the chunk to use for each file.
I/O depth
16 requests, it will let the depth drain down to 4 before starting to fill
it again.
+.. option:: serialize_overlap=bool
+
+ Serialize in-flight I/Os that might otherwise cause or suffer from data races.
+ When two or more I/Os are submitted simultaneously, there is no guarantee that
+ the I/Os will be processed or completed in the submitted order. Further, if
+ two or more of those I/Os are writes, any overlapping region between them can
+ become indeterminate/undefined on certain storage. These issues can cause
+ verification to fail erratically when at least one of the racing I/Os is
+ changing data and the overlapping region has a non-zero size. Setting
+ ``serialize_overlap`` tells fio to avoid provoking this behavior by explicitly
+ serializing in-flight I/Os that have a non-zero overlap. Note that setting
+ this option can reduce both performance and the :option:`iodepth` achieved.
+ Additionally, this option does not work when :option:`io_submit_mode` is set to
+ offload. Default: false.
+
.. option:: io_submit_mode=str
This option controls how fio submits the I/O to the I/O engine. The default
replay, the file needs to be turned into a blkparse binary data file first
(``blkparse <device> -o /dev/null -d file_for_fio.bin``).
-.. option:: replay_no_stall=int
+.. option:: replay_no_stall=bool
When replaying I/O with :option:`read_iolog` the default behavior is to
attempt to respect the timestamps within the log and replay them with the
.. option:: verifysort_nr=int
- Pre-load and sort verify blocks for a read workload.
+ Pre-load and sort verify blocks for a read workload.
.. option:: verify_offset=int
.. option:: trim_backlog=int
- Verify that trim/discarded blocks are returned as zeros.
+ Trim after this number of blocks are written.
.. option:: trim_backlog_batch=int
Enable experimental verification.
-
Steady state
~~~~~~~~~~~~
all jobs in a file will be part of the same reporting group, unless
separated by a :option:`stonewall`.
-.. option:: stats
+.. option:: stats=bool
By default, fio collects and shows final output results for all jobs
that run. If this option is set to 0, then fio will ignore it in
you instead want to log the maximum value, set this option to 1. Defaults to
0, meaning that averaged values are logged.
-.. option:: log_offset=int
+.. option:: log_offset=bool
If this is set, the iolog options will include the byte offset for the I/O
- entry as well as the other data values.
+ entry as well as the other data values. Defaults to 0 meaning that
+ offsets are not present in logs. Also see `Log File Formats`_.
.. option:: log_compression=int
**ios**
Number of I/Os performed by all groups.
**merge**
- Number of merges I/O the I/O scheduler.
+ Number of merges performed by the I/O scheduler.
**ticks**
Number of ticks we kept the disk busy.
**in_queue**
change.
Split up, the format is as follows (comments in brackets denote when a
-field was introduced or whether its specific to some terse version):
+field was introduced or whether it's specific to some terse version):
::
Below is a single line containing short names for each of the fields in the
minimal output v3, separated by semicolons::
-terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth;read_iops;read_runtime_ms;read_slat_min;read_slat_max;read_slat_mean;read_slat_dev;read_clat_min;read_clat_max;read_clat_mean;read_clat_dev;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_tlat_min;read_lat_max;read_lat_mean;read_lat_dev;read_bw_min;read_bw_max;read_bw_agg_pct;read_bw_mean;read_bw_dev;write_kb;write_bandwidth;write_iops;write_runtime_ms;write_slat_min;write_slat_max;write_slat_mean;write_slat_dev;write_clat_min;write_clat_max;write_clat_mean;write_clat_dev;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_tlat_min;write_lat_max;write_lat_mean;write_lat_dev;write_bw_min;write_bw_max;write_bw_agg_pct;write_bw_mean;write_bw_dev;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;write_ticks;disk_queue_time;disk_util
+ terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth;read_iops;read_runtime_ms;read_slat_min;read_slat_max;read_slat_mean;read_slat_dev;read_clat_min;read_clat_max;read_clat_mean;read_clat_dev;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_tlat_min;read_lat_max;read_lat_mean;read_lat_dev;read_bw_min;read_bw_max;read_bw_agg_pct;read_bw_mean;read_bw_dev;write_kb;write_bandwidth;write_iops;write_runtime_ms;write_slat_min;write_slat_max;write_slat_mean;write_slat_dev;write_clat_min;write_clat_max;write_clat_mean;write_clat_dev;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_tlat_min;write_lat_max;write_lat_mean;write_lat_dev;write_bw_min;write_bw_max;write_bw_agg_pct;write_bw_mean;write_bw_dev;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;write_ticks;disk_queue_time;disk_util
+
+
+JSON+ output
+------------
+
+The `json+` output format is identical to the `json` output format except that it
+adds a full dump of the completion latency bins. Each `bins` object contains a
+set of (key, value) pairs where keys are latency durations and values count how
+many I/Os had completion latencies of the corresponding duration. For example,
+consider::
+
+ "bins" : { "87552" : 1, "89600" : 1, "94720" : 1, "96768" : 1, "97792" : 1, "99840" : 1, "100864" : 2, "103936" : 6, "104960" : 534, "105984" : 5995, "107008" : 7529, ... }
+
+This data indicates that one I/O required 87,552ns to complete, two I/Os required
+100,864ns to complete, and 7529 I/Os required 107,008ns to complete.
+
+Also included with fio is a Python script `fio_jsonplus_clat2csv` that takes
+json+ output and generates CSV-formatted latency data suitable for plotting.
+
+The latency durations actually represent the midpoints of latency intervals.
+For details, refer to :file:`stat.h`.
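
As a rough illustration of how a `bins` object might be consumed, the following
Python sketch totals the I/O count and computes a count-weighted mean latency.
The ``sample`` string is a hypothetical excerpt reusing three of the values
quoted above, and ``summarize_bins`` is an illustrative helper, not part of fio.
Because the keys are interval midpoints, the mean is only an approximation.

```python
import json

# Hypothetical excerpt of a "bins" object; real json+ output nests this
# deeper inside each job's latency data.
sample = '{"bins": {"87552": 1, "100864": 2, "107008": 7529}}'


def summarize_bins(bins):
    """Return (total I/O count, count-weighted mean latency in ns)."""
    total = sum(bins.values())
    # Keys are strings holding the latency midpoint in nanoseconds.
    weighted = sum(int(latency) * count for latency, count in bins.items())
    mean = weighted / total if total else 0.0
    return total, mean


bins = json.loads(sample)["bins"]
total, mean = summarize_bins(bins)
print(f"{total} I/Os, approximate mean completion latency {mean:.0f} ns")
```

For plotting or CSV export, the bundled `fio_jsonplus_clat2csv` script is
likely the better starting point.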
Trace file format
Fio supports a variety of log file formats, for logging latencies, bandwidth,
and IOPS. The logs share a common format, which looks like this:
- *time* (`msec`), *value*, *data direction*, *offset*
+ *time* (`msec`), *value*, *data direction*, *block size* (`bytes`),
+ *offset* (`bytes`)
-Time for the log entry is always in milliseconds. The *value* logged depends
+*Time* for the log entry is always in milliseconds. The *value* logged depends
on the type of log, it will be one of the following:
**Latency log**
**2**
I/O is a TRIM
-The *offset* is the offset, in bytes, from the start of the file, for that
-particular I/O. The logging of the offset can be toggled with
-:option:`log_offset`.
+The entry's *block size* is always in bytes. The *offset* is the offset, in bytes,
+from the start of the file, for that particular I/O. The logging of the offset can be
+toggled with :option:`log_offset`.
Fio defaults to logging every individual I/O. When IOPS are logged for individual
-I/Os the value entry will always be 1. If windowed logging is enabled through
+I/Os the *value* entry will always be 1. If windowed logging is enabled through
:option:`log_avg_msec`, fio logs the average values over the specified period of time.
If windowed logging is enabled and :option:`log_max_value` is set, then fio logs
-maximum values in that window instead of averages. Since 'data direction' and
-'offset' are per-I/O values, they aren't applicable if windowed logging is enabled.
+maximum values in that window instead of averages. Since *data direction*, *block
+size* and *offset* are per-I/O values, if windowed logging is enabled they
+aren't applicable and will be 0.
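
The common log format described above can be read with a few lines of Python.
This is a minimal sketch that assumes comma-separated integer fields, with the
*offset* column present only when :option:`log_offset` is enabled. The
``LogEntry`` type, the ``parse_log_line`` helper, and the sample line are
hypothetical names and data chosen for illustration. The direction mapping
assumes 0 is a read and 1 is a write, alongside 2 for a TRIM.

```python
from typing import NamedTuple, Optional

# Assumed mapping of the data direction field.
DIRECTIONS = {0: "read", 1: "write", 2: "trim"}


class LogEntry(NamedTuple):
    time_ms: int
    value: int
    direction: str
    block_size: int
    offset: Optional[int]  # None when offsets are not logged


def parse_log_line(line: str) -> LogEntry:
    """Parse one 'time, value, direction, block size[, offset]' entry."""
    fields = [int(field) for field in line.split(",")]
    if len(fields) not in (4, 5):
        raise ValueError(f"expected 4 or 5 fields, got {len(fields)}")
    time_ms, value, ddir, block_size = fields[:4]
    offset = fields[4] if len(fields) == 5 else None
    return LogEntry(time_ms, value, DIRECTIONS.get(ddir, "unknown"),
                    block_size, offset)


# Hypothetical sample entry, not taken from a real run.
print(parse_log_line("1000, 250, 1, 4096, 8192"))
```

With windowed logging enabled, the direction, block size and offset fields
would all parse as 0, so the decoded direction is not meaningful in that case.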
-Client/server
+Client/Server
-------------
Normally fio is invoked as a stand-alone application on the machine where the