engines/libblkio: Add option libblkio_force_enable_completion_eventfd

diff --git a/fio.1 b/fio.1

index ce9bf3ef4d104ac3c2be4a0a3056a29051975209..da5483037bad06c224c99dc8b095b7709540d2fd 100644
--- a/fio.1
+++ b/fio.1
@@ -67,8 +67,8 @@ List all commands defined by \fIioengine\fR, or print help for \fIcommand\fR
  defined by \fIioengine\fR. If no \fIioengine\fR is given, list all
  available ioengines.
  .TP
-.BI \-\-showcmd \fR=\fPjobfile
-Convert \fIjobfile\fR to a set of command\-line options.
+.BI \-\-showcmd
+Convert given \fIjobfile\fRs to a set of command\-line options.
  .TP
  .BI \-\-readonly
  Turn on safety read\-only checks, preventing writes and trims. The \fB\-\-readonly\fR
@@ -292,7 +292,7 @@ For Zone Block Device Mode:
  .RS
  .P
  .PD 0
-z means Zone 
+z means Zone
  .P
  .PD
  .RE
@@ -569,7 +569,7 @@ by this option will be \fBsize\fR divided by number of files unless an
  explicit size is specified by \fBfilesize\fR.
  .RS
  .P
-Each colon in the wanted path must be escaped with a '\\'
+Each colon in the wanted path must be escaped with a '\e'
  character. For instance, if the path is `/dev/dsk/foo@3,0:c' then you
  would use `filename=/dev/dsk/foo@3,0\\:c' and if the path is
  `F:\\filename' then you would use `filename=F\\:\\filename'.
@@ -830,7 +830,7 @@ so. Default: false.
  .BI max_open_zones \fR=\fPint
  When running a random write test across an entire drive many more zones will be
  open than in a typical application workload. Hence this command line option
-that allows to limit the number of open zones. The number of open zones is
+that allows one to limit the number of open zones. The number of open zones is
  defined as the number of zones to which write commands are issued by all
  threads/processes.
  .TP
@@ -900,7 +900,15 @@ Random mixed reads and writes.
  .TP
  .B trimwrite
  Sequential trim+write sequences. Blocks will be trimmed first,
-then the same blocks will be written to.
+then the same blocks will be written to. So if `io_size=64K' is specified,
+Fio will trim a total of 64K bytes and also write 64K bytes on the same
+trimmed blocks. This behaviour will be consistent with `number_ios' or
+other Fio options limiting the total bytes or number of I/O's.
+.TP
+.B randtrimwrite
+Like
+.B trimwrite ,
+but uses random offsets rather than sequential writes.
  .RE
  .P
  Fio defaults to read if the option is not specified. For the mixed I/O
@@ -1083,7 +1091,7 @@ provided. Data before the given offset will not be touched. This
  effectively caps the file size at `real_size \- offset'. Can be combined with
  \fBsize\fR to constrain the start and end range of the I/O workload.
  A percentage can be specified by a number between 1 and 100 followed by '%',
-for example, `offset=20%' to specify 20%. In ZBD mode, value can be set as 
+for example, `offset=20%' to specify 20%. In ZBD mode, value can be set as
  number of zones using 'z'.
  .TP
  .BI offset_align \fR=\fPint
@@ -1099,7 +1107,7 @@ specified). This option is useful if there are several jobs which are
  intended to operate on a file in parallel disjoint segments, with even
  spacing between the starting points. Percentages can be used for this option.
  If a percentage is given, the generated offset will be aligned to the minimum
-\fBblocksize\fR or to the value of \fBoffset_align\fR if provided.In ZBD mode, value 
+\fBblocksize\fR or to the value of \fBoffset_align\fR if provided. In ZBD mode, value
  can be set as number of zones using 'z'.
  .TP
  .BI number_ios \fR=\fPint
@@ -1216,7 +1224,7 @@ map. For the \fBnormal\fR distribution, a normal (Gaussian) deviation is
  supplied as a value between 0 and 100.
  .P
  The second, optional float is allowed for \fBpareto\fR, \fBzipf\fR and \fBnormal\fR
-distributions. It allows to set base of distribution in non-default place, giving
+distributions. It allows one to set base of distribution in non-default place, giving
  more control over most probable outcome. This value is in range [0-1] which maps linearly to
  range of possible random values.
  Defaults are: random for \fBpareto\fR and \fBzipf\fR, and 0.5 for \fBnormal\fR.
@@ -1668,8 +1676,11 @@ simulate a smaller amount of memory. The amount specified is per worker.
  .TP
  .BI size \fR=\fPint[%|z]
  The total size of file I/O for each thread of this job. Fio will run until
-this many bytes has been transferred, unless runtime is limited by other options
-(such as \fBruntime\fR, for instance, or increased/decreased by \fBio_size\fR).
+this many bytes have been transferred, unless runtime is altered by other means
+such as (1) \fBruntime\fR, (2) \fBio_size\fR, (3) \fBnumber_ios\fR, (4)
+gaps/holes while doing I/O's such as `rw=read:16K', or (5) sequential I/O
+reaching end of the file which is possible when \fBpercentage_random\fR is
+less than 100.
  Fio will divide this size between the available files determined by options
  such as \fBnrfiles\fR, \fBfilename\fR, unless \fBfilesize\fR is
  specified by the job. If the result of division happens to be 0, the size is
@@ -1678,7 +1689,7 @@ If this option is not specified, fio will use the full size of the given
  files or devices. If the files do not exist, size must be given. It is also
  possible to give size as a percentage between 1 and 100. If `size=20%' is
  given, fio will use 20% of the full size of the given files or devices. In ZBD mode,
-size can be given in units of number of zones using 'z'. Can be combined with \fBoffset\fR to 
+size can be given in units of number of zones using 'z'. Can be combined with \fBoffset\fR to
  constrain the start and end range that I/O will be done within.
  .TP
  .BI io_size \fR=\fPint[%|z] "\fR,\fB io_limit" \fR=\fPint[%|z]
@@ -1697,7 +1708,7 @@ also be set as number of zones using 'z'.
  .BI filesize \fR=\fPirange(int)
  Individual file sizes. May be a range, in which case fio will select sizes
  for files at random within the given range. If not given, each created file
-is the same size. This option overrides \fBsize\fR in terms of file size, 
+is the same size. This option overrides \fBsize\fR in terms of file size,
  i.e. \fBsize\fR becomes merely the default for \fBio_size\fR (and
  has no effect at all if \fBio_size\fR is set explicitly).
  .TP
@@ -1981,6 +1992,12 @@ I/O engine using the xNVMe C API, for NVMe devices. The xnvme engine provides
  flexibility to access GNU/Linux Kernel NVMe driver via libaio, IOCTLs, io_uring,
  the SPDK NVMe driver, or your own custom NVMe driver. The xnvme engine includes
  engine specific options. (See \fIhttps://xnvme.io/\fR).
+.TP
+.B libblkio
+Use the libblkio library (\fIhttps://gitlab.com/libblkio/libblkio\fR). The
+specific driver to use must be set using \fBlibblkio_driver\fR. If
+\fBmem\fR/\fBiomem\fR is not specified, memory allocation is delegated to
+libblkio (and so is guaranteed to work with the selected driver).
  .SS "I/O engine specific parameters"
  In addition, there are some parameters which are only valid when a specific
  \fBioengine\fR is in use. These are used identically to normal parameters,
@@ -2055,7 +2072,7 @@ release them when IO is done. If this option is set, the pages are pre-mapped
  before IO is started. This eliminates the need to map and release for each IO.
  This is more efficient, and reduces the IO latency as well.
  .TP
-.BI (io_uring,io_uring_cmd)nonvectored
+.BI (io_uring,io_uring_cmd)nonvectored \fR=\fPint
  With this option, fio will use non-vectored read/write commands, where address
  must contain the address directly. Default is -1.
  .TP
@@ -2082,9 +2099,11 @@ sqthread_poll option.
  Normally fio will submit IO by issuing a system call to notify the kernel of
  available items in the SQ ring. If this option is set, the act of submitting IO
  will be done by a polling thread in the kernel. This frees up cycles for fio, at
-the cost of using more CPU in the system.
+the cost of using more CPU in the system. As submission is just the time it
+takes to fill in the sqe entries and any syscall required to wake up the idle
+kernel thread, fio will not report submission latencies.
  .TP
-.BI (io_uring,io_uring_cmd)sqthread_poll_cpu
+.BI (io_uring,io_uring_cmd)sqthread_poll_cpu \fR=\fPint
  When `sqthread_poll` is set, this option provides a way to define which CPU
  should be used for the polling thread.
  .TP
@@ -2107,7 +2126,7 @@ than normal.
  When hipri is set this determines the probability of a pvsync2 I/O being high
  priority. The default is 100%.
  .TP
-.BI (pvsync2,libaio,io_uring)nowait
+.BI (pvsync2,libaio,io_uring,io_uring_cmd)nowait \fR=\fPbool
  By default if a request cannot be executed immediately (e.g. resource starvation,
  waiting on locks) it is queued and the initiating process will be blocked until
  the required resource becomes free.
@@ -2308,6 +2327,15 @@ The S3 secret key.
  .BI (http)http_s3_keyid \fR=\fPstr
  The S3 key/access id.
  .TP
+.BI (http)http_s3_sse_customer_key \fR=\fPstr
+The encryption customer key in SSE server side.
+.TP
+.BI (http)http_s3_sse_customer_algorithm \fR=\fPstr
+The encryption customer algorithm in SSE server side. Default is \fBAES256\fR.
+.TP
+.BI (http)http_s3_storage_class \fR=\fPstr
+Which storage class to access. User-customizable settings. Default is \fBSTANDARD\fR.
+.TP
  .BI (http)http_swift_auth_token \fR=\fPstr
  The Swift auth token. See the example configuration file on how to
  retrieve this.
@@ -2479,11 +2507,11 @@ Specify the label or UUID of the DAOS pool to connect to.
  Specify the label or UUID of the DAOS container to open.
  .TP
  .BI (dfs)chunk_size
-Specificy a different chunk size (in bytes) for the dfs file.
+Specify a different chunk size (in bytes) for the dfs file.
  Use DAOS container's chunk size by default.
  .TP
  .BI (dfs)object_class
-Specificy a different object class for the dfs file.
+Specify a different object class for the dfs file.
  Use DAOS container's object class by default.
  .TP
  .BI (nfs)nfs_url
@@ -2521,22 +2549,29 @@ Select the xnvme async command interface. This can take these values.
  .RS
  .TP
  .B emu
-This is default and used to emulate asynchronous I/O
+This is default and used to emulate asynchronous I/O by using a single thread to
+create a queue pair on top of a synchronous I/O interface using the NVMe driver
+IOCTL.
  .TP
  .BI thrpool
-Use thread pool for Asynchronous I/O
+Emulate an asynchronous I/O interface with a pool of userspace threads on top
+of a synchronous I/O interface using the NVMe driver IOCTL. By default four
+threads are used.
  .TP
  .BI io_uring
-Use Linux io_uring/liburing for Asynchronous I/O
+Linux native asynchronous I/O interface which supports both direct and buffered
+I/O.
  .TP
  .BI libaio
  Use Linux aio for Asynchronous I/O
  .TP
  .BI posix
-Use POSIX aio for Asynchronous I/O
+Use the POSIX asynchronous I/O interface to perform one or more I/O operations
+asynchronously.
  .TP
  .BI nil
-Use nil-io; For introspective perf. evaluation
+Do not transfer any data; just pretend to. This is mainly used for
+introspective performance evaluation.
  .RE
  .RE
  .TP
@@ -2546,10 +2581,14 @@ Select the xnvme synchronous command interface. This can take these values.
  .RS
  .TP
  .B nvme
-This is default and uses Linux NVMe Driver ioctl() for synchronous I/O
+This is default and uses Linux NVMe Driver ioctl() for synchronous I/O.
  .TP
  .BI psync
-Use pread()/write() for synchronous I/O
+This supports regular as well as vectored pread() and pwrite() commands.
+.TP
+.BI block
+This is the same as psync except that it also supports zone management
+commands using Linux block layer IOCTLs.
  .RE
  .RE
  .TP
@@ -2559,21 +2598,70 @@ Select the xnvme admin command interface. This can take these values.
  .RS
  .TP
  .B nvme
-This is default and uses Linux NVMe Driver ioctl() for admin commands
+This is default and uses Linux NVMe Driver ioctl() for admin commands.
  .TP
  .BI block
-Use Linux Block Layer ioctl() and sysfs for admin commands
-.TP
-.BI file_as_ns
-Use file-stat as to construct NVMe idfy responses
+Use Linux Block Layer ioctl() and sysfs for admin commands.
  .RE
  .RE
  .TP
  .BI (xnvme)xnvme_dev_nsid\fR=\fPint
-xnvme namespace identifier, for userspace NVMe driver.
+xnvme namespace identifier for userspace NVMe driver such as SPDK.
  .TP
  .BI (xnvme)xnvme_iovec
  If this option is set, xnvme will use vectored read/write commands.
+.TP
+.BI (libblkio)libblkio_driver \fR=\fPstr
+The libblkio driver to use. Different drivers access devices through different
+underlying interfaces. Available drivers depend on the libblkio version in use
+and are listed at \fIhttps://libblkio.gitlab.io/libblkio/blkio.html#drivers\fR
+.TP
+.BI (libblkio)libblkio_pre_connect_props \fR=\fPstr
+A colon-separated list of libblkio properties to be set after creating but
+before connecting the libblkio instance. Each property must have the format
+\fB<name>=<value>\fR. Colons can be escaped as \fB\\:\fR. These are set after
+the engine sets any other properties, so those can be overridden. Available
+properties depend on the libblkio version in use and are listed at
+\fIhttps://libblkio.gitlab.io/libblkio/blkio.html#properties\fR
+.TP
+.BI (libblkio)libblkio_pre_start_props \fR=\fPstr
+A colon-separated list of libblkio properties to be set after connecting but
+before starting the libblkio instance. Each property must have the format
+\fB<name>=<value>\fR. Colons can be escaped as \fB\\:\fR. These are set after
+the engine sets any other properties, so those can be overridden. Available
+properties depend on the libblkio version in use and are listed at
+\fIhttps://libblkio.gitlab.io/libblkio/blkio.html#properties\fR
+.TP
+.BI (libblkio)hipri
+Use poll queues. This is incompatible with \fBlibblkio_wait_mode=eventfd\fR and
+\fBlibblkio_force_enable_completion_eventfd\fR.
+.TP
+.BI (libblkio)libblkio_vectored
+Submit vectored read and write requests.
+.TP
+.BI (libblkio)libblkio_write_zeroes_on_trim
+Submit trims as "write zeroes" requests instead of discard requests.
+.TP
+.BI (libblkio)libblkio_wait_mode \fR=\fPstr
+How to wait for completions:
+.RS
+.RS
+.TP
+.B block \fR(default)
+Use a blocking call to \fBblkioq_do_io()\fR.
+.TP
+.B eventfd
+Use a blocking call to \fBread()\fR on the completion eventfd.
+.TP
+.B loop
+Use a busy loop with a non-blocking call to \fBblkioq_do_io()\fR.
+.RE
+.RE
+.TP
+.BI (libblkio)libblkio_force_enable_completion_eventfd
+Enable the queue's completion eventfd even when unused. This may impact
+performance. The default is to enable it only if
+\fBlibblkio_wait_mode=eventfd\fR.
  .SS "I/O depth"
  .TP
  .BI iodepth \fR=\fPint
@@ -3297,7 +3385,9 @@ Verify that trim/discarded blocks are returned as zeros.
  Trim this number of I/O blocks.
  .TP
  .BI experimental_verify \fR=\fPbool
-Enable experimental verification.
+Enable experimental verification. Standard verify records I/O metadata for
+later use during the verification phase. Experimental verify instead resets the
+file after the write phase and then replays I/Os for the verification phase.
  .SS "Steady state"
  .TP
  .BI steadystate \fR=\fPstr:float "\fR,\fP ss" \fR=\fPstr:float
@@ -3589,6 +3679,16 @@ EILSEQ) until the runtime is exceeded or the I/O size specified is
  completed. If this option is used, there are two more stats that are
  appended, the total error count and the first error. The error field given
  in the stats is the first error that was hit during the run.
+.RS
+.P
+Note: a write error from the device may go unnoticed by fio when using buffered
+IO, as the write() (or similar) system call merely dirties the kernel pages,
+unless `sync' or `direct' is used. Device IO errors occur when the dirty data is
+actually written out to disk. If fully sync writes aren't desirable, `fsync' or
+`fdatasync' can be used as well. This is specific to writes, as reads are always
+synchronous.
+.RS
+.P
  The allowed values are:
  .RS
  .RS
@@ -4176,7 +4276,7 @@ This format is not supported in fio versions >= 1.20\-rc3.
  .TP
  .B Trace file format v2
  The second version of the trace file format was added in fio version 1.17. It
-allows to access more then one file per trace and has a bigger set of possible
+allows one to access more than one file per trace and has a bigger set of possible
  file actions.
  .RS
  .P