Merge branch 'master' of https://github.com/donny372/fio into master

diff --git a/fio.1 b/fio.1

index 6283fc1d92931fd860915c92735d0f9136c3ac7e..1c90e4a55eafa49e666a9e6581edeb0e76741ae3 100644
--- a/fio.1
+++ b/fio.1
@@ -738,12 +738,13 @@ Accepted values are:
  .RS
  .TP
  .B none
-The \fBzonerange\fR, \fBzonesize\fR and \fBzoneskip\fR parameters are ignored.
+The \fBzonerange\fR, \fBzonesize\fR, \fBzonecapacity\fR and \fBzoneskip\fR
+parameters are ignored.
  .TP
  .B strided
  I/O happens in a single zone until \fBzonesize\fR bytes have been transferred.
  After that number of bytes has been transferred processing of the next zone
-starts.
+starts. The \fBzonecapacity\fR parameter is ignored.
  .TP
  .B zbd
  Zoned block device mode. I/O happens sequentially in each zone, even if random
@@ -771,6 +772,14 @@ zoned block device, the specified \fBzonesize\fR must be 0 or equal to the
  device zone size. For a regular block device or file, the specified
  \fBzonesize\fR must be at least 512B.
  .TP
+.BI zonecapacity \fR=\fPint
+For \fBzonemode\fR=zbd, this defines the capacity of a single zone, which is
+the accessible area starting from the zone start address. This parameter only
+applies when using \fBzonemode\fR=zbd in combination with regular block devices.
+If not specified it defaults to the zone size. If the target device is a zoned
+block device, the zone capacity is obtained from the device information and this
+option is ignored.
+.TP
  .BI zoneskip \fR=\fPint
  For \fBzonemode\fR=strided, the number of bytes to skip after \fBzonesize\fR
  bytes of data have been transferred.
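
As a rough illustration of the \fBzonecapacity\fR text added in this hunk, the Python sketch below computes the accessible byte range of each emulated zone on a regular block device. The helper name `zone_layout` and its arguments are hypothetical and not part of fio. It only mirrors the documented rule that the capacity starts at the zone start address and defaults to the zone size when unspecified. Rejecting a capacity larger than the zone size and ignoring a trailing partial zone are assumptions of this sketch.

```python
def zone_layout(device_size, zone_size, zone_capacity=None):
    """Return (zone_start, accessible_end) byte offsets for each full zone.

    Hypothetical helper for illustration only; it is not fio code.
    """
    # Per the man page text, the capacity defaults to the zone size.
    if zone_capacity is None:
        zone_capacity = zone_size
    # Assumption: a capacity outside (0, zone_size] is treated as invalid.
    if zone_size <= 0 or not 0 < zone_capacity <= zone_size:
        raise ValueError("zone_capacity must be in (0, zone_size]")
    zones = []
    # Assumption: a trailing partial zone is skipped.
    for start in range(0, device_size - zone_size + 1, zone_size):
        # The accessible area begins at the zone start address.
        zones.append((start, start + zone_capacity))
    return zones
```

With a zone size of 1024 bytes and a capacity of 768 bytes, the last 256 bytes of every zone fall outside the accessible area.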
@@ -804,7 +813,11 @@ so. Default: false.
  When running a random write test across an entire drive many more zones will be
  open than in a typical application workload. Hence this command line option
  that allows to limit the number of open zones. The number of open zones is
-defined as the number of zones to which write commands are issued.
+defined as the number of zones to which write commands are issued by all
+threads/processes.
+.TP
+.BI job_max_open_zones \fR=\fPint
+Limit on the number of simultaneously opened zones per single thread/process.
  .TP
  .BI zone_reset_threshold \fR=\fPfloat
  A number between zero and one that indicates the ratio of logical blocks with
@@ -1629,6 +1642,12 @@ I/O. Requires \fBfilename\fR option to specify either block or
  character devices. This engine supports trim operations. The
  sg engine includes engine specific options.
  .TP
+.B libzbc
+Synchronous I/O engine for SMR hard-disks using the \fBlibzbc\fR
+library. The target can be either an sg character device or
+a block device file. This engine supports the zonemode=zbd zone
+operations.
+.TP
  .B null
  Doesn't transfer any data, just pretends to. This is mainly used to
  exercise fio itself and for debugging/testing purposes.
@@ -1795,12 +1814,13 @@ In addition, there are some parameters which are only valid when a specific
  with the caveat that when used on the command line, they must come after the
  \fBioengine\fR that defines them is selected.
  .TP
-.BI (io_uring)hipri
-If this option is set, fio will attempt to use polled IO completions. Normal IO
-completions generate interrupts to signal the completion of IO, polled
-completions do not. Hence they are require active reaping by the application.
-The benefits are more efficient IO for high IOPS scenarios, and lower latencies
-for low queue depth IO.
+.BI (io_uring, libaio)cmdprio_percentage \fR=\fPint
+Set the percentage of I/O that will be issued with higher priority by setting
+the priority bit. Non-read I/O is likely unaffected by \fBcmdprio_percentage\fR.
+This option cannot be used with the \fBprio\fR or \fBprioclass\fR options. For
+this option to set the priority bit properly, NCQ priority must be supported and
+enabled, and the \fBdirect\fR=1 option must be used. fio must also be run as the
+root user.
  .TP
  .BI (io_uring)fixedbufs
  If fio is asked to do direct IO, then Linux will map pages for each IO call, and
@@ -1808,6 +1828,13 @@ release them when IO is done. If this option is set, the pages are pre-mapped
  before IO is started. This eliminates the need to map and release for each IO.
  This is more efficient, and reduces the IO latency as well.
  .TP
+.BI (io_uring)hipri
+If this option is set, fio will attempt to use polled IO completions. Normal IO
+completions generate interrupts to signal the completion of IO, polled
+completions do not. Hence they require active reaping by the application.
+The benefits are more efficient IO for high IOPS scenarios, and lower latencies
+for low queue depth IO.
+.TP
  .BI (io_uring)registerfiles
  With this option, fio registers the set of files being used with the kernel.
  This avoids the overhead of managing file counts in the kernel, making the
@@ -1839,6 +1866,22 @@ than normal.
  When hipri is set this determines the probability of a pvsync2 I/O being high
  priority. The default is 100%.
  .TP
+.BI (pvsync2,libaio,io_uring)nowait
+By default if a request cannot be executed immediately (e.g. resource starvation,
+waiting on locks) it is queued and the initiating process will be blocked until
+the required resource becomes free.
+This option sets the RWF_NOWAIT flag (supported from the 4.14 Linux kernel) and
+the call will return instantly with EAGAIN or a partial result rather than waiting.
+
+It is useful to also use \fBignore_error\fR=EAGAIN when using this option.
+Note: glibc 2.27 and 2.28 have a bug in the preadv2 and pwritev2 syscall
+wrappers: they return EOPNOTSUPP instead of EAGAIN.
+
+For cached I/O, using this option usually means a request operates only with
+cached data. Currently the RWF_NOWAIT flag is not supported for cached writes.
+For direct I/O, requests will only succeed if cache invalidation isn't required,
+file blocks are fully allocated and the disk request could be issued immediately.
+.TP
  .BI (cpuio)cpuload \fR=\fPint
  Attempt to use the specified percentage of CPU cycles. This is a mandatory
  option when using cpuio I/O engine.
@@ -2025,6 +2068,10 @@ on the client site it will be used in the rdma_resolve_add()
  function. This can be useful when multiple paths exist between the
  client and the server or in certain loopback configurations.
  .TP
+.BI (filestat)stat_type \fR=\fPstr
+Specify stat system call type to measure lookup/getattr performance.
+Default is \fBstat\fR for \fBstat\fR\|(2).
+.TP
  .BI (sg)readfua \fR=\fPbool
  With readfua option set to 1, read operations include the force
  unit access (fua) flag. Default: 0.
@@ -2258,6 +2305,11 @@ The percentage of I/Os that must fall within the criteria specified by
  defaults to 100.0, meaning that all I/Os must be equal or below to the value
  set by \fBlatency_target\fR.
  .TP
+.BI latency_run \fR=\fPbool
+Used with \fBlatency_target\fR. If false (default), fio will find the highest
+queue depth that meets \fBlatency_target\fR and exit. If true, fio will continue
+running and try to meet \fBlatency_target\fR by adjusting queue depth.
+.TP
  .BI max_latency \fR=\fPtime
  If set, fio will exit the job with an ETIMEDOUT error if it exceeds this
  maximum latency. When the unit is omitted, the value is interpreted in
@@ -2284,7 +2336,9 @@ replay, the file needs to be turned into a blkparse binary data file first
  You can specify a number of files by separating the names with a ':' character.
  See the \fBfilename\fR option for information on how to escape ':'
  characters within the file names. These files will be sequentially assigned to
-job clones created by \fBnumjobs\fR.
+job clones created by \fBnumjobs\fR. '-' is a reserved name, meaning read from
+stdin. Note that if \fBfilename\fR is also set to '-' (which likewise means
+stdin), this option cannot be set to '-'.
  .TP
  .BI read_iolog_chunked \fR=\fPbool
  Determines how iolog is read. If false (default) entire \fBread_iolog\fR will
@@ -2386,10 +2440,14 @@ priority class.
  Set the I/O priority value of this job. Linux limits us to a positive value
  between 0 and 7, with 0 being the highest. See man
  \fBionice\fR\|(1). Refer to an appropriate manpage for other operating
-systems since meaning of priority may differ.
+systems since meaning of priority may differ. For per-command priority
+setting, see I/O engine specific `cmdprio_percentage` and `hipri_percentage`
+options.
  .TP
  .BI prioclass \fR=\fPint
-Set the I/O priority class. See man \fBionice\fR\|(1).
+Set the I/O priority class. See man \fBionice\fR\|(1). For per-command
+priority setting, see I/O engine specific `cmdprio_percentage` and `hipri_percentage`
+options.
  .TP
  .BI cpus_allowed \fR=\fPstr
  Controls the same options as \fBcpumask\fR, but accepts a textual
@@ -2511,7 +2569,8 @@ been exceeded before retrying operations.
  Wait for preceding jobs in the job file to exit, before starting this
  one. Can be used to insert serialization points in the job file. A stone
  wall also implies starting a new reporting group, see
-\fBgroup_reporting\fR.
+\fBgroup_reporting\fR. Optionally you can use `stonewall=0` to disable or
+`stonewall=1` to enable it.
  .TP
  .BI exitall
  By default, fio will continue running all other jobs when one job finishes.
@@ -2519,15 +2578,27 @@ Sometimes this is not the desired action. Setting \fBexitall\fR will instead
  make fio terminate all jobs in the same group, as soon as one job of that
  group finishes.
  .TP
-.BI exit_what
+.BI exit_what \fR=\fPstr
  By default, fio will continue running all other jobs when one job finishes.
-Sometimes this is not the desired action. Setting \fBexit_all\fR will instead
+Sometimes this is not the desired action. Setting \fBexitall\fR will instead
  make fio terminate all jobs in the same group. The option \fBexit_what\fR
-allows to control which jobs get terminated when \fBexitall\fR is enabled. The
-default is \fBgroup\fR and does not change the behaviour of \fBexitall\fR. The
-setting \fBall\fR terminates all jobs. The setting \fBstonewall\fR terminates
-all currently running jobs across all groups and continues execution with the
-next stonewalled group.
+allows you to control which jobs get terminated when \fBexitall\fR is enabled.
+The default value is \fBgroup\fR.
+The allowed values are:
+.RS
+.RS
+.TP
+.B all
+terminates all jobs.
+.TP
+.B group
+is the default and does not change the behaviour of \fBexitall\fR.
+.TP
+.B stonewall
+terminates all currently running jobs across all groups and continues
+execution with the next stonewalled group.
+.RE
+.RE
  .TP
  .BI exec_prerun \fR=\fPstr
  Before running this job, issue the command specified through
@@ -2986,23 +3057,24 @@ Disable measurements of submission latency numbers. See
  Disable measurements of throughput/bandwidth numbers. See
  \fBdisable_lat\fR.
  .TP
+.BI slat_percentiles \fR=\fPbool
+Report submission latency percentiles. Submission latency is not recorded
+for synchronous ioengines.
+.TP
  .BI clat_percentiles \fR=\fPbool
-Enable the reporting of percentiles of completion latencies. This option is
-mutually exclusive with \fBlat_percentiles\fR.
+Report completion latency percentiles.
  .TP
  .BI lat_percentiles \fR=\fPbool
-Enable the reporting of percentiles of I/O latencies. This is similar to
-\fBclat_percentiles\fR, except that this includes the submission latency.
-This option is mutually exclusive with \fBclat_percentiles\fR.
+Report total latency percentiles. Total latency is the sum of submission
+latency and completion latency.
  .TP
  .BI percentile_list \fR=\fPfloat_list
-Overwrite the default list of percentiles for completion latencies and the
-block error histogram. Each number is a floating number in the range
+Overwrite the default list of percentiles for latencies and the
+block error histogram. Each number is a floating point number in the range
  (0,100], and the maximum length of the list is 20. Use ':' to separate the
-numbers, and list the numbers in ascending order. For example,
-`\-\-percentile_list=99.5:99.9' will cause fio to report the values of
-completion latency below which 99.5% and 99.9% of the observed latencies
-fell, respectively.
+numbers. For example, `\-\-percentile_list=99.5:99.9' will cause fio to
+report the latency durations below which 99.5% and 99.9% of the observed
+latencies fell, respectively.
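
The constraints stated for \fBpercentile_list\fR in this hunk can be sketched in Python as below. The function name `parse_percentile_list` is hypothetical and this is not how fio itself parses the option. It only checks the documented rules: ':' as the separator, each value in the range (0,100], and at most 20 entries.

```python
def parse_percentile_list(text, max_len=20):
    """Parse a ':'-separated percentile list such as '99.5:99.9'.

    Hypothetical helper for illustration only; it is not fio code.
    """
    # float() raises ValueError on anything that is not a number.
    values = [float(item) for item in text.split(":")]
    if len(values) > max_len:
        raise ValueError("at most %d percentiles are allowed" % max_len)
    for value in values:
        # The documented range is (0,100]: zero excluded, 100 included.
        if not 0.0 < value <= 100.0:
            raise ValueError("percentile %r is outside (0,100]" % value)
    return values
```

For the man page's own example, `parse_percentile_list("99.5:99.9")` returns the two values 99.5 and 99.9.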
  .TP
  .BI significant_figures \fR=\fPint
  If using \fB\-\-output\-format\fR of `normal', set the significant figures
@@ -3815,7 +3887,8 @@ Fio supports a variety of log file formats, for logging latencies, bandwidth,
  and IOPS. The logs share a common format, which looks like this:
  .RS
  .P
-time (msec), value, data direction, block size (bytes), offset (bytes)
+time (msec), value, data direction, block size (bytes), offset (bytes),
+command priority
  .RE
  .P
  `Time' for the log entry is always in milliseconds. The `value' logged depends
@@ -3849,6 +3922,9 @@ The entry's `block size' is always in bytes. The `offset' is the position in byt
  from the start of the file for that particular I/O. The logging of the offset can be
  toggled with \fBlog_offset\fR.
  .P
+`Command priority' is 0 for normal priority and 1 for high priority. This is
+controlled by the ioengine specific \fBcmdprio_percentage\fR option.
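
To make the extended log format concrete, here is a minimal Python sketch that splits one log line into the six fields listed in this hunk. The names `LogEntry` and `parse_log_line` are hypothetical, and the sample line in the usage note is invented for illustration rather than taken from real fio output. The sketch assumes every line carries all six comma-separated integer fields, which presumes offset logging is enabled through \fBlog_offset\fR.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    """One fio log record; field names are this sketch's own choice."""
    time_msec: int
    value: int
    direction: int
    block_size: int
    offset: int
    high_priority: bool


def parse_log_line(line):
    """Parse 'time, value, direction, block size, offset, priority'."""
    fields = [field.strip() for field in line.split(",")]
    # Assumption: all six fields are present on every line.
    if len(fields) != 6:
        raise ValueError("expected 6 fields, got %d" % len(fields))
    time_msec, value, direction, block_size, offset, prio = (
        int(field) for field in fields
    )
    # Command priority is 0 for normal and 1 for high priority.
    return LogEntry(time_msec, value, direction, block_size, offset, prio == 1)
```

For an invented line such as `1000, 2048, 1, 4096, 8192, 1`, the returned entry has `offset` equal to 8192 and `high_priority` set to true.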
+.P
  Fio defaults to logging every individual I/O but when windowed logging is set
  through \fBlog_avg_msec\fR, either the average (by default) or the maximum
  (\fBlog_max_value\fR is set) `value' seen over the specified period of time