Async verify HOWTO/man update

diff --git a/fio.1 b/fio.1
index 38341b3191448f67f81adc23c589542b8fd75aee..fa54763dbc43fca5876f287fac908384d82b3d89 100644
--- a/fio.1
+++ b/fio.1
@@ -116,6 +116,30 @@ a number of files by separating the names with a `:' character. `\-' is a
  reserved name, meaning stdin or stdout, depending on the read/write direction
  set.
  .TP
+.BI lockfile \fR=\fPstr
+Fio defaults to not locking any files before it does IO to them. If a file or
+file descriptor is shared, fio can serialize IO to that file to make the end
+result consistent. This is useful for emulating real workloads that share files.
+The lock modes are:
+.RS
+.RS
+.TP
+.B none
+No locking. This is the default.
+.TP
+.B exclusive
+Only one thread or process may do IO at a time, excluding all others.
+.TP
+.B readwrite
+Read-write locking on the file. Many readers may access the file at the same
+time, but writes get exclusive access.
+.RE
+.P
+The option may be post-fixed with a lock batch number. If set, then each
+thread/process may do that amount of IOs to the file before giving up the lock.
+Since lock acquisition is expensive, batching the lock/unlocks will speed up IO.
+.RE
+.P
  .BI opendir \fR=\fPstr
  Recursively open any files below directory \fIstr\fR.
  .TP
@@ -162,6 +186,12 @@ been transfered, unless limited by other options (\fBruntime\fR, for instance).
  Unless \fBnr_files\fR and \fBfilesize\fR options are given, this amount will be
  divided between the available files for the job.
  .TP
+.BI fill_device \fR=\fPbool
+Sets size to something really large and waits for ENOSPC (no space left on
+device) as the terminating condition. Only makes sense with sequential write.
+For a read workload, the mount point will be filled first then IO started on
+the result.
+.TP
  .BI filesize \fR=\fPirange
  Individual file sizes. May be a range, in which case \fBfio\fR will select sizes
  for files at random within the given range, limited to \fBsize\fR in total (if
@@ -187,15 +217,18 @@ block sizes for exact control of the issued IO for a job that has mixed
  block sizes. The format of the option is bssplit=blocksize/percentage,
  optionally adding as many definitions as needed seperated by a colon.
  Example: bssplit=4k/10:64k/50:32k/40 would issue 50% 64k blocks, 10% 4k
-blocks and 40% 32k blocks.
+blocks and 40% 32k blocks. \fBbssplit\fR also supports giving separate
+splits to reads and writes. The format is identical to what the
+\fBbs\fR option accepts; the read and write parts are separated with a
+comma.
  .TP
  .B blocksize_unaligned\fR,\fP bs_unaligned
  If set, any size in \fBblocksize_range\fR may be used.  This typically won't
  work with direct I/O, as that normally requires sector alignment.
  .TP
  .BI blockalign \fR=\fPint[,int] "\fR,\fB ba" \fR=\fPint[,int]
-At what boundary to align random IO offsets. Defaults to the same as
-'blocksize' the minimum blocksize given.  Minimum alignment is typically 512b
+At what boundary to align random IO offsets. Defaults to the same as 'blocksize'
+the minimum blocksize given.  Minimum alignment is typically 512b
  for using direct IO, though it usually depends on the hardware block size.
  This option is mutually exclusive with using a random map for files, so it
  will turn off that option.
@@ -306,6 +339,14 @@ Number of I/O units to keep in flight against the file.  Default: 1.
  .BI iodepth_batch \fR=\fPint
  Number of I/Os to submit at once.  Default: \fBiodepth\fR.
  .TP
+.BI iodepth_batch_complete \fR=\fPint
+This defines how many pieces of IO to retrieve at once. It defaults to 1, which
+means that we'll ask for a minimum of 1 IO in the retrieval process from the
+kernel. The IO retrieval will go on until we hit the limit set by
+\fBiodepth_low\fR. If this variable is set to 0, then fio will always check for
+completed events before queuing more IO. This helps reduce IO latency, at the
+cost of more retrieval system calls.
+.TP
  .BI iodepth_low \fR=\fPint
  Low watermark indicating when to start filling the queue again.  Default:
  \fBiodepth\fR. 
@@ -324,6 +365,10 @@ Offset in the file to start I/O. Data before the offset will not be touched.
  How many I/Os to perform before issuing an \fBfsync\fR\|(2) of dirty data.  If
  0, don't sync.  Default: 0.
  .TP
+.BI fdatasync \fR=\fPint
+Like \fBfsync\fR, but uses \fBfdatasync\fR\|(2) instead to only sync the
+data parts of the file. Default: 0.
+.TP
  .BI overwrite \fR=\fPbool
  If writing, setup the file first and do overwrites.  Default: false.
  .TP
@@ -343,14 +388,22 @@ Percentage of a mixed workload that should be reads. Default: 50.
  .TP
  .BI rwmixwrite \fR=\fPint
  Percentage of a mixed workload that should be writes.  If \fBrwmixread\fR and
-\fBwrmixwrite\fR are given and do not sum to 100%, the latter of the two
-overrides the first.  Default: 50.
+\fBrwmixwrite\fR are given and do not sum to 100%, the latter of the two
+overrides the first. This may interfere with a given rate setting, if fio is
+asked to limit reads or writes to a certain rate. If that is the case, then
+the distribution may be skewed. Default: 50.
  .TP
  .B norandommap
  Normally \fBfio\fR will cover every block of the file when doing random I/O. If
  this parameter is given, a new offset will be chosen without looking at past
  I/O history.  This parameter is mutually exclusive with \fBverify\fR.
  .TP
+.B softrandommap
+See \fBnorandommap\fR. If fio runs with the random block map enabled and
+fails to allocate the map, setting this option lets it continue without a
+random block map. As coverage will not be as complete as with random maps, this
+option is disabled by default.
+.TP
  .BI nice \fR=\fPint
  Run job with given nice value.  See \fInice\fR\|(2).
  .TP
@@ -373,18 +426,27 @@ Number of blocks to issue before waiting \fBthinktime\fR microseconds.
  Default: 1.
  .TP
  .BI rate \fR=\fPint
-Cap bandwidth used by this job to this number of KiB/s.
+Cap bandwidth used by this job. The number is in bytes/sec, the normal postfix
+rules apply. You can use \fBrate\fR=500k to limit reads and writes to 500k each,
+or you can specify read and writes separately. Using \fBrate\fR=1m,500k would
+limit reads to 1MB/sec and writes to 500KB/sec. Capping only reads or writes
+can be done with \fBrate\fR=,500k or \fBrate\fR=500k,. The former will only
+limit writes (to 500KB/sec), the latter will only limit reads.
  .TP
  .BI ratemin \fR=\fPint
  Tell \fBfio\fR to do whatever it can to maintain at least the given bandwidth.
-Failing to meet this requirement will cause the job to exit.
+Failing to meet this requirement will cause the job to exit. The same format
+as \fBrate\fR is used for read vs write separation.
  .TP
  .BI rate_iops \fR=\fPint
-Cap the bandwidth to this number of IOPS.  If \fBblocksize\fR is a range, the
-smallest block size is used as the metric.
+Cap the bandwidth to this number of IOPS. Basically the same as rate, just
+specified independently of bandwidth. The same format as \fBrate\fR is used for
+read vs write separation. If \fBblocksize\fR is a range, the smallest block
+size is used as the metric.
  .TP
  .BI rate_iops_min \fR=\fPint
-If this rate of I/O is not met, the job will exit.
+If this rate of I/O is not met, the job will exit. The same format as \fBrate\fR
+is used for read vs write separation.
  .TP
  .BI ratecycle \fR=\fPint
  Average bandwidth for \fBrate\fR and \fBratemin\fR over this number of
@@ -412,8 +474,8 @@ as \fBruntime\fR allows.
  If set, fio will run the specified workload for this amount of time before
  logging any performance numbers. Useful for letting performance settle before
  logging results, thus minimizing the runtime required for stable results. Note
-that the ramp_time is considered lead in time for a job, thus it will increase
-the total runtime if a special timeout or runtime is specified.
+that the \fBramp_time\fR is considered lead in time for a job, thus it will
+increase the total runtime if a special timeout or runtime is specified.
  .TP
  .BI invalidate \fR=\fPbool
  Invalidate buffer-cache for the file prior to starting I/O.  Default: true.
@@ -450,6 +512,15 @@ the system must have free huge pages allocated.  \fBmmaphuge\fR also needs to
  have hugetlbfs mounted, and \fIfile\fR must point there.
  .RE
  .TP
+.BI iomem_align \fR=\fPint
+This indicates the memory alignment of the IO memory buffers. Note that the
+given alignment is applied to the first IO unit buffer; if using \fBiodepth\fR,
+the alignment of the following buffers is given by the \fBbs\fR used. In
+other words, if using a \fBbs\fR that is a multiple of the page size of the
+system, all buffers will be aligned to this value. If using a \fBbs\fR that
+is not page aligned, the alignment of subsequent IO memory buffers is the
+sum of the \fBiomem_align\fR and \fBbs\fR used.
+.TP
  .BI hugepage\-size \fR=\fPint
  Defines the size of a huge page.  Must be at least equal to the system setting.
  Should be a multiple of 1MiB. Default: 4MiB.
@@ -470,6 +541,13 @@ If true, serialize file creation for the jobs.  Default: true.
  .BI create_on_open \fR=\fPbool
  If true, the files are not created until they are opened for IO by the job.
  .TP
+.BI pre_read \fR=\fPbool
+If this is given, files will be pre-read into memory before starting the given
+IO operation. This will also clear the \fBinvalidate\fR flag, since it is
+pointless to pre-read and then drop the cache. This will only work for IO
+engines that are seekable, since they allow you to read the same data
+multiple times. Thus it will not work on e.g. network or splice IO.
+.TP
  .BI unlink \fR=\fPbool
  Unlink job files when done.  Default: false.
  .TP
@@ -520,6 +598,18 @@ Write the verification header for this number of bytes, which should divide
  If true, exit the job on the first observed verification failure.  Default:
  false.
  .TP
+.BI verify_async \fR=\fPint
+Fio will normally verify IO inline from the submitting thread. This option
+takes an integer describing how many async offload threads to create for IO
+verification instead, causing fio to offload the duty of verifying IO contents
+to one or more separate threads.  If using this offload option, even sync IO
+engines can benefit from using an \fBiodepth\fR setting higher than 1, as it
+allows them to have IO in flight while verifies are running.
+.TP
+.BI verify_async_cpus \fR=\fPstr
+Tell fio to set the given CPU affinity on the async IO verification threads.
+See \fBcpus_allowed\fP for the format used.
+.TP
  .B stonewall
  Wait for preceeding jobs in the job file to exit before starting this one.
  \fBstonewall\fR implies \fBnew_group\fR.
@@ -617,6 +707,14 @@ threads/processes that run IO workloads need only copy that segment, instead of
  entering the kernel with a gettimeofday() call. The CPU set aside for doing
  these time calls will be excluded from other uses. Fio will manually clear it
  from the CPU mask of other jobs.
+.TP
+.BI continue_on_error \fR=\fPbool
+Normally fio will exit the job on the first observed failure. If this option is
+set, fio will continue the job when there is a 'non-fatal error'
+(\fBEIO\fR or \fBEILSEQ\fR) until the runtime is exceeded or the I/O size
+specified is completed. If this option is used, there are two more stats that
+are appended, the total error count and the first error. The error field given
+in the stats is the first error that was hit during the run.
  .SH OUTPUT
  While running, \fBfio\fR will display the status of the created jobs.  For
  example:
@@ -831,7 +929,7 @@ was written by Jens Axboe <jens.axboe@oracle.com>.
  This man page was written by Aaron Carroll <aaronc@cse.unsw.edu.au> based
  on documentation by Jens Axboe.
  .SH "REPORTING BUGS"
-Report bugs to the \fBfio\fR mailing list <fio-devel@kernel.dk>.
+Report bugs to the \fBfio\fR mailing list <fio@vger.kernel.org>.
  See \fBREADME\fR.
  .SH "SEE ALSO"
  For further documentation see \fBHOWTO\fR and \fBREADME\fR.
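The options documented in this patch might be combined in a job file along the following lines. This is only a sketch: the job names, file names, sizes and CPU numbers are made up for illustration, and the exact behaviour depends on the fio version in use. The first job offloads verification to two threads as described under verify_async; the second uses the rate=1m,500k form from the rate text.

```ini
; Hypothetical job file; all names, sizes and CPU numbers are illustrative.
[global]
size=64m
bs=4k

; Sync engine with iodepth > 1, so IO can stay in flight while the
; offload threads verify (see verify_async in the patch).
[async-verify]
filename=verify.dat
rw=randwrite
ioengine=sync
iodepth=4
verify=md5
verify_async=2
; Assumes CPUs 0 and 1 exist; same format as cpus_allowed.
verify_async_cpus=0,1

; Separate read and write caps: reads to 1MB/sec, writes to 500KB/sec.
[rate-split]
filename=rate.dat
rw=randrw
rate=1m,500k
```

With no stonewall between them, both jobs would start together. Saved as, say, example.fio, the file would be run with `fio example.fio`.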