.. option:: runtime=time
- Tell fio to terminate processing after the specified period of time. It
- can be quite hard to determine for how long a specified job will run, so
- this parameter is handy to cap the total runtime to a given time. When
- the unit is omitted, the value is interpreted in seconds.
+ Limit runtime. The test will run until it completes the configured I/O
+ workload or until it has run for the specified amount of time, whichever
+ occurs first. It can be quite hard to determine for how long a specified
+ job will run, so this parameter is handy to cap the total runtime to a
+ given time. When the unit is omitted, the value is interpreted in
+ seconds.
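+
+ As a minimal sketch of the "whichever occurs first" rule, the job below
+ stops after 60 seconds even if the 10 GiB workload has not completed.
+ The job name, file name and sizes are illustrative assumptions::
+
+     [capped-randread]
+     ; hypothetical test file
+     filename=/tmp/fio-runtime-test
+     rw=randread
+     bs=4k
+     size=10g
+     ; stop after 60 seconds or once 10 GiB of I/O is done, whichever is first
+     runtime=60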
.. option:: time_based
.. option:: opendir=str
- Recursively open any files below directory `str`.
+ Recursively open any files below directory `str`. This accepts only a
+ single directory and, unlike related options, colons appearing in the
+ path must not be escaped.
.. option:: lockfile=str
.. option:: max_open_zones=int
- A zone of a zoned block device is in the open state when it is partially
- written (i.e. not all sectors of the zone have been written). Zoned
- block devices may have a limit on the total number of zones that can
- be simultaneously in the open state, that is, the number of zones that
- can be written to simultaneously. The :option:`max_open_zones` parameter
- limits the number of zones to which write commands are issued by all fio
- jobs, that is, limits the number of zones that will be in the open
- state. This parameter is relevant only if the :option:`zonemode` =zbd is
- used. The default value is always equal to maximum number of open zones
- of the target zoned block device and a value higher than this limit
- cannot be specified by users unless the option
- :option:`ignore_zone_limits` is specified. When
- :option:`ignore_zone_limits` is specified or the target device has no
- limit on the number of zones that can be in an open state,
- :option:`max_open_zones` can specify 0 to disable any limit on the
- number of zones that can be simultaneously written to by all jobs.
+ When a zone of a zoned block device is partially written (i.e. not all
+ sectors of the zone have been written), the zone is in one of three
+ conditions: 'implicit open', 'explicit open' or 'closed'. Zoned block
+ devices may have a limit called 'max_open_zones' (same name as the
+ parameter) on the total number of zones that can simultaneously be in
+ the 'implicit open' or 'explicit open' conditions. Zoned block devices
+ may have another limit called 'max_active_zones', on the total number of
+ zones that can simultaneously be in any of the three conditions. The
+ :option:`max_open_zones` parameter limits the number of zones to which
+ write commands are issued by all fio jobs, that is, it limits the number
+ of zones that will be in these conditions. When the device has the
+ max_open_zones limit and does not have the max_active_zones limit, the
+ :option:`max_open_zones` parameter limits the number of zones in the two
+ open conditions up to the limit. In this case, fio includes zones in the
+ two open conditions in the write target zones at fio start. When the
+ device has both the max_open_zones and the max_active_zones limits, the
+ :option:`max_open_zones` parameter limits the number of zones in the
+ three conditions up to the limit. In this case, fio includes zones in
+ the three conditions in the write target zones at fio start.
+
+ This parameter is relevant only if :option:`zonemode=zbd <zonemode>` is used.
+ The default value is always equal to the max_open_zones limit of the
+ target zoned block device and a value higher than this limit cannot be
+ specified by users unless the option :option:`ignore_zone_limits` is
+ specified. When :option:`ignore_zone_limits` is specified or the target
+ device does not have the max_open_zones limit, :option:`max_open_zones`
+ can specify 0 to disable any limit on the number of zones that can be
+ simultaneously written to by all jobs.
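+
+ A minimal job-file sketch of capping the number of zones written
+ concurrently. The device path and the limit of 4 are illustrative
+ assumptions and must be adapted to the actual zoned block device::
+
+     [zbd-seq-write]
+     ; hypothetical zoned block device
+     filename=/dev/nvme0n2
+     zonemode=zbd
+     ; direct I/O is used here since writes to zoned devices need it
+     direct=1
+     rw=write
+     bs=128k
+     ; issue writes to at most 4 zones at a time across all jobs
+     max_open_zones=4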
.. option:: job_max_open_zones=int
OpenBSD and ZFS on Solaris don't support direct I/O. On Windows the synchronous
ioengines don't support direct I/O. Default: false.
-.. option:: atomic=bool
-
- If value is true, attempt to use atomic direct I/O. Atomic writes are
- guaranteed to be stable once acknowledged by the operating system. Only
- Linux supports O_ATOMIC right now.
-
.. option:: buffered=bool
If value is true, use buffered I/O. This is the opposite of the
.. option:: randrepeat=bool
- Seed the random number generator used for random I/O patterns in a
- predictable way so the pattern is repeatable across runs. Default: true.
+ Seed all random number generators in a predictable way so the pattern
+ is repeatable across runs. Default: true.
.. option:: allrandrepeat=bool
- Seed all random number generators in a predictable way so results are
- repeatable across runs. Default: false.
+ Alias for :option:`randrepeat`. Default: true.
.. option:: randseed=int
**random**
Advise using **FADV_RANDOM**.
+ **noreuse**
+ Advise using **FADV_NOREUSE**. This may be a no-op on older Linux
+ kernels. Since Linux 6.3, it provides a hint to the LRU algorithm.
+ See the :manpage:`posix_fadvise(2)` man page.
+
.. option:: write_hint=str
Use :manpage:`fcntl(2)` to advise the kernel what life time to expect
before overwriting. The `trimwrite` mode works well for this
constraint.
- **pmemblk**
- Read and write using filesystem DAX to a file on a filesystem
- mounted with DAX on a persistent memory device through the PMDK
- libpmemblk library.
-
**dev-dax**
Read and write using device DAX to a persistent memory device (e.g.,
/dev/dax0.0) through the PMDK libpmem library.
reads and writes. See :manpage:`ionice(1)`. See also the
:option:`prioclass` option.
+.. option:: cmdprio_hint=int[,int] : [io_uring] [libaio]
+
+ Set the I/O priority hint to use for I/Os that must be issued with
+ a priority when :option:`cmdprio_percentage` or
+ :option:`cmdprio_bssplit` is set. If not specified when
+ :option:`cmdprio_percentage` or :option:`cmdprio_bssplit` is set,
+ this defaults to 0 (no hint). A single value applies to reads and
+ writes. Comma-separated values may be specified for reads and writes.
+ See also the :option:`priohint` option.
+
.. option:: cmdprio=int[,int] : [io_uring] [libaio]
Set the I/O priority value to use for I/Os that must be issued with
cmdprio_bssplit=blocksize/percentage:blocksize/percentage
- In this case, each entry will use the priority class and priority
- level defined by the options :option:`cmdprio_class` and
- :option:`cmdprio` respectively.
+ In this case, each entry will use the priority class, priority level
+ and priority hint defined by the options :option:`cmdprio_class`,
+ :option:`cmdprio` and :option:`cmdprio_hint` respectively.
The second accepted format for this option is:
accepted format does not restrict all entries to have the same priority
class and priority level.
- For both formats, only the read and write data directions are supported,
+ The third accepted format for this option is:
+
+ cmdprio_bssplit=blocksize/percentage/class/level/hint:...
+
+ This is an extension of the second accepted format that also allows
+ specifying a priority hint.
+
+ For all formats, only the read and write data directions are supported,
values for trim IOs are ignored. This option is mutually exclusive with
the :option:`cmdprio_percentage` option.
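+
+ A minimal sketch of the third format. The device path, block sizes,
+ percentages and priority values below are illustrative assumptions; in
+ particular, the meaning of a hint value depends on the device::
+
+     [prio-hint-split]
+     ; hypothetical block device
+     filename=/dev/sdX
+     ioengine=io_uring
+     direct=1
+     rw=randread
+     ; issue 4k and 64k I/Os in equal proportion
+     bssplit=4k/50:64k/50
+     ; 30% of 4k I/Os and all 64k I/Os use class 2, level 0, with
+     ; hints 1 and 2 respectively
+     cmdprio_bssplit=4k/30/2/0/1:64k/100/2/0/2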
For direct I/O, requests will only succeed if cache invalidation isn't required,
file blocks are fully allocated and the disk request could be issued immediately.
+.. option:: fdp=bool : [io_uring_cmd] [xnvme]
+
+ Enable Flexible Data Placement mode for write commands.
+
+.. option:: fdp_pli_select=str : [io_uring_cmd] [xnvme]
+
+ Defines how fio decides which placement ID to use next. The following
+ types are defined:
+
+ **random**
+ Choose a placement ID at random (uniform).
+
+ **roundrobin**
+ Round robin over available placement IDs. This is the
+ default.
+
+ The available placement ID indices are defined by the option
+ :option:`fdp_pli`.
+
+.. option:: fdp_pli=str : [io_uring_cmd] [xnvme]
+
+ Select which Placement ID Index/Indices this job is allowed to use for
+ writes. By default, the job will cycle through all available Placement
+ IDs, so use this to isolate these identifiers to specific jobs. If you
+ want fio to use placement identifiers only at indices 0, 2 and 5, specify
+ ``fdp_pli=0,2,5``.
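+
+ A minimal job-file sketch combining the three FDP options above. The
+ device path is a hypothetical NVMe character device, and ``cmd_type=nvme``
+ is an io_uring_cmd setting not described in this section::
+
+     [fdp-write]
+     ioengine=io_uring_cmd
+     cmd_type=nvme
+     ; hypothetical NVMe generic character device
+     filename=/dev/ng0n1
+     rw=randwrite
+     bs=4k
+     fdp=1
+     ; pick a placement ID at random instead of the default round robin
+     fdp_pli_select=random
+     ; restrict this job to the placement IDs at indices 0, 2 and 5
+     fdp_pli=0,2,5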
+
+.. option:: md_per_io_size=int : [io_uring_cmd]
+
+ Size in bytes of the separate per-I/O metadata buffer. Default: 0.
+
.. option:: cpuload=int : [cpuio]
Attempt to use the specified percentage of CPU cycles. This is a mandatory
performance. The default is to enable it only if
:option:`libblkio_wait_mode=eventfd <libblkio_wait_mode>`.
+.. option:: no_completion_thread : [windowsaio]
+
+ Avoid using a separate thread for completion polling.
+
I/O depth
~~~~~~~~~
fio will ignore the thinktime and continue doing IO at the specified
rate, instead of entering a catch-up mode after thinktime is done.
+.. option:: rate_cycle=int
+
+ Average bandwidth for :option:`rate` and :option:`rate_min` over this number
+ of milliseconds. Defaults to 1000.
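+
+ As a minimal sketch with illustrative values, the job below averages
+ the bandwidth used for rate limiting over two seconds instead of the
+ default one second::
+
+     [rate-limited]
+     ; hypothetical test file
+     filename=/tmp/fio-rate-test
+     size=1g
+     rw=write
+     bs=64k
+     ; cap bandwidth at 10 MiB/s and require at least 5 MiB/s
+     rate=10m
+     rate_min=5m
+     ; average over 2000 milliseconds
+     rate_cycle=2000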
+
I/O latency
~~~~~~~~~~~
microseconds. Comma-separated values may be specified for reads, writes,
and trims as described in :option:`blocksize`.
-.. option:: rate_cycle=int
-
- Average bandwidth for :option:`rate` and :option:`rate_min` over this number
- of milliseconds. Defaults to 1000.
-
I/O replay
~~~~~~~~~~
priority setting, see I/O engine specific :option:`cmdprio_percentage`
and :option:`cmdprio_class` options.
+.. option:: priohint=int
+
+ Set the I/O priority hint. This is only applicable to platforms that
+ support I/O priority classes and to devices with features controlled
+ through priority hints, e.g. block devices supporting command duration
+ limits, or CDL. CDL is a way to indicate the desired maximum latency
+ of I/Os so that the device can optimize its internal command scheduling
+ according to the latency limits indicated by the user.
+
+ For per-I/O priority hint setting, see the I/O engine specific
+ :option:`cmdprio_hint` option.
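+
+ A minimal sketch of a job-wide priority hint. The device path and the
+ values are illustrative assumptions, since the meaning of a hint depends
+ on the device::
+
+     [job-wide-hint]
+     ; hypothetical block device
+     filename=/dev/sdX
+     direct=1
+     rw=randread
+     ; priority class 2 with hint 1 for every I/O of this job
+     prioclass=2
+     priohint=1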
+
.. option:: cpus_allowed=str
Controls the same options as :option:`cpumask`, but accepts a textual
verification pass, according to the settings in the job file used. Default
false.
+.. option:: experimental_verify=bool
+
+ Enable experimental verification. Standard verify records I/O metadata
+ for later use during the verification phase. Experimental verify
+ instead resets the file after the write phase and then replays I/Os for
+ the verification phase.
+
.. option:: trim_percentage=int
Number of verify blocks to discard/trim.
Trim this number of I/O blocks.
-.. option:: experimental_verify=bool
-
- Enable experimental verification. Standard verify records I/O metadata
- for later use during the verification phase. Experimental verify
- instead resets the file after the write phase and then replays I/Os for
- the verification phase.
-
Steady state
~~~~~~~~~~~~
.. option:: steadystate_duration=time, ss_dur=time
- A rolling window of this duration will be used to judge whether steady state
- has been reached. Data will be collected once per second. The default is 0
- which disables steady state detection. When the unit is omitted, the
- value is interpreted in seconds.
+ A rolling window of this duration will be used to judge whether steady
+ state has been reached. Data will be collected every
+ :option:`ss_interval`. The default is 0 which disables steady state
+ detection. When the unit is omitted, the value is interpreted in
+ seconds.
.. option:: steadystate_ramp_time=time, ss_ramp=time
collection for checking the steady state job termination criterion. The
default is 0. When the unit is omitted, the value is interpreted in seconds.
+.. option:: steadystate_check_interval=time, ss_interval=time
+
+ The values during the rolling window will be collected with a period of
+ this value. If :option:`ss_interval` is 30s and :option:`ss_dur` is
+ 300s, 10 measurements will be taken. The default is 1s, but steady state
+ detection might not converge with such a short interval, especially for
+ slower devices, so set this accordingly. When the unit is omitted, the
+ value is interpreted in seconds.
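+
+ A minimal sketch matching the example above: a 300 second window sampled
+ every 30 seconds yields 300 / 30 = 10 measurements. The ``steadystate``
+ criterion, file name and remaining values are illustrative assumptions::
+
+     [steady-randwrite]
+     ; hypothetical test file
+     filename=/tmp/fio-ss-test
+     size=1g
+     rw=randwrite
+     bs=4k
+     time_based
+     runtime=3600
+     ; stop once IOPS stay within 10% of the mean over the window
+     steadystate=iops:10%
+     ss_dur=300
+     ss_interval=30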
+
Measurements and reporting
~~~~~~~~~~~~~~~~~~~~~~~~~~
It is the sum of submission and completion latency.
**bw**
- Bandwidth statistics based on samples. Same names as the xlat stats,
- but also includes the number of samples taken (**samples**) and an
- approximate percentage of total aggregate bandwidth this thread
- received in its group (**per**). This last value is only really
- useful if the threads in this group are on the same disk, since they
- are then competing for disk access.
+ Bandwidth statistics based on measurements from discrete
+ intervals. Fio continuously monitors bytes transferred and I/O
+ operations completed. By default fio calculates bandwidth in
+ each half-second interval (see :option:`bwavgtime`) and reports
+ descriptive statistics for the measurements here. Same names as
+ the xlat stats, but also includes the number of samples taken
+ (**samples**) and an approximate percentage of total aggregate
+ bandwidth this thread received in its group (**per**). This
+ last value is only really useful if the threads in this group
+ are on the same disk, since they are then competing for disk
+ access.
**iops**
- IOPS statistics based on samples. Same names as bw.
+ IOPS statistics based on measurements from discrete intervals.
+ For details see the description for bw above. See
+ :option:`iopsavgtime` to control the duration of the intervals.
+ Same values reported here as for bw except for percentage.
**lat (nsec/usec/msec)**
The distribution of I/O completion latencies. This is the time from when
And finally, the disk statistics are printed. This is Linux specific. They will look like this::
Disk stats (read/write):
- sda: ios=16398/16511, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00%
+ sda: ios=16398/16511, sectors=32321/65472, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00%
Each value is printed for both reads and writes, with reads first. The
numbers denote:
**ios**
Number of I/Os performed by all groups.
+**sectors**
+ Amount of data transferred in units of 512 bytes for all groups.
**merge**
Number of merges performed by the I/O scheduler.
**ticks**