Documentation update

diff --git a/fio.1 b/fio.1

index 35056a8de58a35c15bbfc299d28f7d7b302ceb9c..190da4fa66f38af46a92f63a7ba0fc6c56c1b912 100644
--- a/fio.1
+++ b/fio.1
@@ -37,9 +37,17 @@ Enable read-only safety checks.
  Specifies when real-time ETA estimate should be printed.  \fIwhen\fR may
  be one of `always', `never' or `auto'.
  .TP
+.BI \-\-section \fR=\fPsec
+Only run section \fIsec\fR from job file.
+.TP
  .BI \-\-cmdhelp \fR=\fPcommand
  Print help information for \fIcommand\fR.  May be `all' for all commands.
  .TP
+.BI \-\-debug \fR=\fPtype
+Enable verbose tracing of various fio actions. May be `all' for all types
+or individual types separated by a comma (e.g. \-\-debug=io,file). `help' will
+list all available tracing options.
+.TP
  .B \-\-help
  Display usage information and exit.
  .TP
@@ -53,7 +61,9 @@ except `global', which has a special meaning.  Following the job name is
  a sequence of zero or more parameters, one per line, that define the
  behavior of the job.  Any line starting with a `;' or `#' character is
  considered a comment and ignored.
-job files.
+.P
+If \fIjobfile\fR is specified as `-', the job file will be read from
+standard input.
  .SS "Global Section"
  The global section contains default parameters for jobs specified in the
  job file.  A job is only affected by global sections residing above it,
@@ -67,13 +77,15 @@ Some parameters may take arguments of a specific type.  The types used are:
  String: a sequence of alphanumeric characters.
  .TP
  .I int
-Integer: a whole number, possibly negative.  If prefixed with `0x', the value
-is assumed to be base 16 (hexadecimal).
-.TP
-.I siint
  SI integer: a whole number, possibly containing a suffix denoting the base unit
-of the value.  Accepted suffixes are `k', 'M' and 'G', denoting kilo (1024),
-mega (1024*1024) and giga (1024*1024*1024) respectively.
+of the value.  Accepted suffixes are `k', 'M', 'G', 'T', and 'P', denoting
+kilo (1024), mega (1024^2), giga (1024^3), tera (1024^4), and peta (1024^5)
+respectively. The suffix is not case sensitive. If prefixed with '0x', the
+value is assumed to be base 16 (hexadecimal). A suffix may include a
+trailing 'b', for instance 'kb' is identical to 'k'. You can specify a base 10
+value by using 'KiB', 'MiB', 'GiB', etc. This is useful for disk drives, whose
+sizes are often given in base 10. Specifying '30GiB' will get you
+30*1000^3 bytes.
  .TP
  .I bool
  Boolean: a true or false value. `0' denotes false, `1' denotes true.
@@ -87,7 +99,7 @@ sets of ranges, they are separated with a `,' or `/' character. For example:
  .SS "Parameter List"
  .TP
  .BI name \fR=\fPstr
-May be used to override the job name.  On the command line, this paramter
+May be used to override the job name.  On the command line, this parameter
  has the special purpose of signalling the start of a new job.
  .TP
  .BI description \fR=\fPstr
@@ -109,6 +121,30 @@ a number of files by separating the names with a `:' character. `\-' is a
  reserved name, meaning stdin or stdout, depending on the read/write direction
  set.
  .TP
+.BI lockfile \fR=\fPstr
+Fio defaults to not locking any files before it does IO to them. If a file or
+file descriptor is shared, fio can serialize IO to that file to make the end
+result consistent. This is useful for emulating real workloads that share files.
+The lock modes are:
+.RS
+.RS
+.TP
+.B none
+No locking. This is the default.
+.TP
+.B exclusive
+Only one thread or process may do IO at a time, excluding all others.
+.TP
+.B readwrite
+Read-write locking on the file. Many readers may access the file at the same
+time, but writes get exclusive access.
+.RE
+.P
+The option may be post-fixed with a lock batch number. If set, then each
+thread/process may do that amount of IOs to the file before giving up the lock.
+Since lock acquisition is expensive, batching the lock/unlocks will speed up IO.
+.RE
+.TP
  .BI opendir \fR=\fPstr
  Recursively open any files below directory \fIstr\fR.
  .TP
@@ -141,19 +177,37 @@ to perform before getting a new offset can be specified by appending
  `:\fIint\fR' to the pattern type.  The default is 1.
  .RE
  .TP
+.BI kb_base \fR=\fPint
+The base unit for a kilobyte. The de facto base is 2^10, 1024.  Storage
+manufacturers like to use 10^3 or 1000 as a base ten unit instead, for obvious
+reasons. Allowed values are 1024 or 1000, with 1024 being the default.
+.TP
  .BI randrepeat \fR=\fPbool
  Seed the random number generator in a predictable way so results are repeatable
  across runs.  Default: true.
  .TP
+.BI fallocate \fR=\fPbool
+By default, fio will use fallocate() to advise the system of the size of the
+file we are going to write. This can be turned off with fallocate=0. May not
+be available on all supported platforms.
+.TP
  .BI fadvise_hint \fR=\fPbool
  Disable use of \fIposix_fadvise\fR\|(2) to advise the kernel what I/O patterns
  are likely to be issued. Default: true.
  .TP
-.BI size \fR=\fPsiint
+.BI size \fR=\fPint
  Total size of I/O for this job.  \fBfio\fR will run until this many bytes have
  been transfered, unless limited by other options (\fBruntime\fR, for instance).
  Unless \fBnr_files\fR and \fBfilesize\fR options are given, this amount will be
-divided between the available files for the job.
+divided between the available files for the job. If not set, fio will use the
+full size of the given files or devices. If the files do not exist, size
+must be given.
+.TP
+.BI fill_device \fR=\fPbool
+Sets size to something really large and waits for ENOSPC (no space left on
+device) as the terminating condition. Only makes sense with sequential write.
+For a read workload, the mount point will be filled first then IO started on
+the result.
  .TP
  .BI filesize \fR=\fPirange
  Individual file sizes. May be a range, in which case \fBfio\fR will select sizes
@@ -161,23 +215,50 @@ for files at random within the given range, limited to \fBsize\fR in total (if
  that is given). If \fBfilesize\fR is not specified, each created file is the
  same size.
  .TP
-.BI blocksize \fR=\fPsiint "\fR,\fB bs" \fR=\fPsiint
+.BI blocksize \fR=\fPint[,int] "\fR,\fB bs" \fR=\fPint[,int]
  Block size for I/O units.  Default: 4k.  Values for reads and writes can be
-specified seperately in the format \fIread\fR,\fIwrite\fR, either of
+specified separately in the format \fIread\fR,\fIwrite\fR, either of
  which may be empty to leave that value at its default.
  .TP
-.BI blocksize_range \fR=\fPirange "\fR,\fB bsrange" \fR=\fPirange
+.BI blocksize_range \fR=\fPirange[,irange] "\fR,\fB bsrange" \fR=\fPirange[,irange]
  Specify a range of I/O block sizes.  The issued I/O unit will always be a
  multiple of the minimum size, unless \fBblocksize_unaligned\fR is set.  Applies
-to both reads and writes, but can be specified seperately (see \fBblocksize\fR).
+to both reads and writes if only one range is given, but can be specified
+separately with a comma separating the values. Example: bsrange=1k-4k,2k-8k.
+Also see \fBblocksize\fR.
+.TP
+.BI bssplit \fR=\fPstr
+This option allows even finer grained control of the block sizes issued,
+not just even splits between them. With this option, you can weight various
+block sizes for exact control of the issued IO for a job that has mixed
+block sizes. The format of the option is bssplit=blocksize/percentage,
+optionally adding as many definitions as needed separated by a colon.
+Example: bssplit=4k/10:64k/50:32k/40 would issue 50% 64k blocks, 10% 4k
+blocks and 40% 32k blocks. \fBbssplit\fR also supports giving separate
+splits to reads and writes. The format is identical to what the
+\fBbs\fR option accepts, the read and write parts are separated with a
+comma.
  .TP
  .B blocksize_unaligned\fR,\fP bs_unaligned
  If set, any size in \fBblocksize_range\fR may be used.  This typically won't
  work with direct I/O, as that normally requires sector alignment.
  .TP
+.BI blockalign \fR=\fPint[,int] "\fR,\fB ba" \fR=\fPint[,int]
+At what boundary to align random IO offsets. Defaults to the minimum
+\fBblocksize\fR given.  Minimum alignment is typically 512b
+for using direct IO, though it usually depends on the hardware block size.
+This option is mutually exclusive with using a random map for files, so it
+will turn off that option.
+.TP
  .B zero_buffers
  Initialise buffers with all zeros. Default: fill buffers with random data.
  .TP
+.B refill_buffers
+If this option is given, fio will refill the IO buffers on every submit. The
+default is to only fill it at init time and reuse that data. Only makes sense
+if zero_buffers isn't specified, naturally. If data verification is enabled,
+refill_buffers is also automatically enabled.
+.TP
  .BI nrfiles \fR=\fPint
  Number of files to use for this job.  Default: 1.
  .TP
@@ -194,6 +275,8 @@ Choose a file at random
  .TP
  .B roundrobin
  Round robin over open files (default).
+.B sequential
+Do each file in the set sequentially.
  .RE
  .P
  The number of I/Os to issue before switching a new file can be specified by
@@ -212,6 +295,10 @@ position the I/O location.
  .B psync
  Basic \fIpread\fR\|(2) or \fIpwrite\fR\|(2) I/O.
  .TP
+.B vsync
+Basic \fIreadv\fR\|(2) or \fIwritev\fR\|(2) I/O. Will emulate queuing by
+coalescing adjacent IOs into a single submission.
+.TP
  .B libaio
  Linux native asynchronous I/O.
  .TP
@@ -269,6 +356,14 @@ Number of I/O units to keep in flight against the file.  Default: 1.
  .BI iodepth_batch \fR=\fPint
  Number of I/Os to submit at once.  Default: \fBiodepth\fR.
  .TP
+.BI iodepth_batch_complete \fR=\fPint
+This defines how many pieces of IO to retrieve at once. It defaults to 1 which
+means that we'll ask for a minimum of 1 IO in the retrieval process from the
+kernel. The IO retrieval will go on until we hit the limit set by
+\fBiodepth_low\fR. If this variable is set to 0, then fio will always check for
+completed events before queuing more IO. This helps reduce IO latency, at the
+cost of more retrieval system calls.
+.TP
  .BI iodepth_low \fR=\fPint
  Low watermark indicating when to start filling the queue again.  Default:
  \fBiodepth\fR. 
@@ -280,13 +375,38 @@ If true, use non-buffered I/O (usually O_DIRECT).  Default: false.
  If true, use buffered I/O.  This is the opposite of the \fBdirect\fR parameter.
  Default: true.
  .TP
-.BI offset \fR=\fPsiint
+.BI offset \fR=\fPint
  Offset in the file to start I/O. Data before the offset will not be touched.
  .TP
  .BI fsync \fR=\fPint
  How many I/Os to perform before issuing an \fBfsync\fR\|(2) of dirty data.  If
  0, don't sync.  Default: 0.
  .TP
+.BI fdatasync \fR=\fPint
+Like \fBfsync\fR, but uses \fBfdatasync\fR\|(2) instead to only sync the
+data parts of the file. Default: 0.
+.TP
+.BI sync_file_range \fR=\fPstr:int
+Use sync_file_range() for every \fIint\fR number of write operations. Fio will
+track the range of writes that have happened since the last sync_file_range() call.
+\fIstr\fR can currently be one or more of:
+.RS
+.TP
+.B wait_before
+SYNC_FILE_RANGE_WAIT_BEFORE
+.TP
+.B write
+SYNC_FILE_RANGE_WRITE
+.TP
+.B wait_after
+SYNC_FILE_RANGE_WAIT_AFTER
+.TP
+.RE
+.P
+So if you do sync_file_range=wait_before,write:8, fio would use
+\fBSYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE\fP for every 8 writes.
+Also see the sync_file_range(2) man page.  This option is Linux specific.
+.TP
  .BI overwrite \fR=\fPbool
  If writing, setup the file first and do overwrites.  Default: false.
  .TP
@@ -306,14 +426,22 @@ Percentage of a mixed workload that should be reads. Default: 50.
  .TP
  .BI rwmixwrite \fR=\fPint
  Percentage of a mixed workload that should be writes.  If \fBrwmixread\fR and
-\fBwrmixwrite\fR are given and do not sum to 100%, the latter of the two
-overrides the first.  Default: 50.
+\fBrwmixwrite\fR are given and do not sum to 100%, the latter of the two
+overrides the first. This may interfere with a given rate setting, if fio is
+asked to limit reads or writes to a certain rate. If that is the case, then
+the distribution may be skewed. Default: 50.
  .TP
  .B norandommap
  Normally \fBfio\fR will cover every block of the file when doing random I/O. If
  this parameter is given, a new offset will be chosen without looking at past
  I/O history.  This parameter is mutually exclusive with \fBverify\fR.
  .TP
+.B softrandommap
+See \fBnorandommap\fR. If fio runs with the random block map enabled and it
+fails to allocate the map, if this option is set it will continue without a
+random block map. As coverage will not be as complete as with random maps, this
+option is disabled by default.
+.TP
  .BI nice \fR=\fPint
  Run job with given nice value.  See \fInice\fR\|(2).
  .TP
@@ -336,18 +464,27 @@ Number of blocks to issue before waiting \fBthinktime\fR microseconds.
  Default: 1.
  .TP
  .BI rate \fR=\fPint
-Cap bandwidth used by this job to this number of KiB/s.
+Cap bandwidth used by this job. The number is in bytes/sec, the normal postfix
+rules apply. You can use \fBrate\fR=500k to limit reads and writes to 500k each,
+or you can specify read and writes separately. Using \fBrate\fR=1m,500k would
+limit reads to 1MB/sec and writes to 500KB/sec. Capping only reads or writes
+can be done with \fBrate\fR=,500k or \fBrate\fR=500k,. The former will only
+limit writes (to 500KB/sec), the latter will only limit reads.
  .TP
  .BI ratemin \fR=\fPint
  Tell \fBfio\fR to do whatever it can to maintain at least the given bandwidth.
-Failing to meet this requirement will cause the job to exit.
+Failing to meet this requirement will cause the job to exit. The same format
+as \fBrate\fR is used for read vs write separation.
  .TP
  .BI rate_iops \fR=\fPint
-Cap the bandwidth to this number of IOPS.  If \fBblocksize\fR is a range, the
-smallest block size is used as the metric.
+Cap the bandwidth to this number of IOPS. Basically the same as rate, just
+specified independently of bandwidth. The same format as \fBrate\fR is used for
+read vs write separation. If \fBblocksize\fR is a range, the smallest block
+size is used as the metric.
  .TP
  .BI rate_iops_min \fR=\fPint
-If this rate of I/O is not met, the job will exit.
+If this rate of I/O is not met, the job will exit. The same format as \fBrate\fR
+is used for read vs write separation.
  .TP
  .BI ratecycle \fR=\fPint
  Average bandwidth for \fBrate\fR and \fBratemin\fR over this number of
@@ -371,6 +508,13 @@ If given, run for the specified \fBruntime\fR duration even if the files are
  completely read or written. The same workload will be repeated as many times
  as \fBruntime\fR allows.
  .TP
+.BI ramp_time \fR=\fPint
+If set, fio will run the specified workload for this amount of time before
+logging any performance numbers. Useful for letting performance settle before
+logging results, thus minimizing the runtime required for stable results. Note
+that the \fBramp_time\fR is considered lead in time for a job, thus it will
+increase the total runtime if a special timeout or runtime is specified.
+.TP
  .BI invalidate \fR=\fPbool
  Invalidate buffer-cache for the file prior to starting I/O.  Default: true.
  .TP
@@ -403,12 +547,25 @@ Same as \fBmmap\fR, but use huge files as backing.
  The amount of memory allocated is the maximum allowed \fBblocksize\fR for the
  job multiplied by \fBiodepth\fR.  For \fBshmhuge\fR or \fBmmaphuge\fR to work,
  the system must have free huge pages allocated.  \fBmmaphuge\fR also needs to
-have hugetlbfs mounted, and \fIfile\fR must point there.
+have hugetlbfs mounted, and \fIfile\fR must point there. At least on Linux,
+huge pages must be manually allocated. See \fB/proc/sys/vm/nr_hugepages\fR
+and the documentation for that. Normally you just need to echo an appropriate
+number, e.g. echoing 8 will ensure that the OS has 8 huge pages ready for
+use.
  .RE
  .TP
-.BI hugepage\-size \fR=\fPsiint
+.BI iomem_align \fR=\fPint
+This indicates the memory alignment of the IO memory buffers. Note that the
+given alignment is applied to the first IO unit buffer; if using \fBiodepth\fR
+the alignment of the following buffers is given by the \fBbs\fR used. In
+other words, if using a \fBbs\fR that is a multiple of the page size in the
+system, all buffers will be aligned to this value. If using a \fBbs\fR that
+is not page aligned, the alignment of subsequent IO memory buffers is the
+sum of the \fBiomem_align\fR and \fBbs\fR used.
+.TP
+.BI hugepage\-size \fR=\fPint
  Defines the size of a huge page.  Must be at least equal to the system setting.
-Should be a multiple of 1MiB. Default: 4MiB.
+Should be a multiple of 1MB. Default: 4MB.
  .TP
  .B exitall
  Terminate all jobs when one finishes.  Default: wait for each job to finish.
@@ -423,6 +580,16 @@ If true, serialize file creation for the jobs.  Default: true.
  .BI create_fsync \fR=\fPbool
  \fIfsync\fR\|(2) data file after creation.  Default: true.
  .TP
+.BI create_on_open \fR=\fPbool
+If true, the files are not created until they are opened for IO by the job.
+.TP
+.BI pre_read \fR=\fPbool
+If this is given, files will be pre-read into memory before starting the given
+IO operation. This will also clear the \fBinvalidate\fR flag, since it is
+pointless to pre-read and then drop the cache. This will only work for IO
+engines that are seekable, since they allow you to read the same data
+multiple times. Thus it will not work on e.g. network or splice IO.
+.TP
  .BI unlink \fR=\fPbool
  Unlink job files when done.  Default: false.
  .TP
@@ -440,39 +607,77 @@ values are:
  .RS
  .RS
  .TP
-.B md5 crc16 crc32 crc64 crc7 sha256 sha512
-Store appropriate checksum in the header of each block.
+.B md5 crc16 crc32 crc32c crc32c-intel crc64 crc7 sha256 sha512 sha1
+Store appropriate checksum in the header of each block. crc32c-intel is
+hardware accelerated via SSE4.2 and falls back to regular crc32c if
+not supported by the system.
  .TP
  .B meta
  Write extra information about each I/O (timestamp, block number, etc.). The
-block number is verified.
-.TP
-.B pattern
-Fill I/O buffers with a specific pattern that is used to verify.  The pattern is
-specified by appending `:\fIint\fR' to the parameter. \fIint\fR cannot be larger
-than 32-bits. 
+block number is verified. See \fBverify_pattern\fR as well.
  .TP
  .B null
  Pretend to verify.  Used for testing internals.
  .RE
+
+This option can be used for repeated burn-in tests of a system to make sure
+that the written data is also correctly read back. If the data direction given
+is a read or random read, fio will assume that it should verify a previously
+written file. If the data direction includes any form of write, the verify will
+be of the newly written data.
  .RE
  .TP
  .BI verify_sort \fR=\fPbool
  If true, written verify blocks are sorted if \fBfio\fR deems it to be faster to
  read them back in a sorted manner.  Default: true.
  .TP
-.BI verify_offset \fR=\fPsiint
+.BI verify_offset \fR=\fPint
  Swap the verification header with data somewhere else in the block before
  writing.  It is swapped back before verifying.
  .TP
-.BI verify_interval \fR=\fPsiint
+.BI verify_interval \fR=\fPint
  Write the verification header for this number of bytes, which should divide
  \fBblocksize\fR.  Default: \fBblocksize\fR.
  .TP
+.BI verify_pattern \fR=\fPstr
+If set, fio will fill the io buffers with this pattern. Fio defaults to filling
+with totally random bytes, but sometimes it's interesting to fill with a known
+pattern for io verification purposes. Depending on the width of the pattern,
+fio will fill 1/2/3/4 bytes of the buffer at a time (it can be either a
+decimal or a hex number). If the verify_pattern is larger than a 32-bit
+quantity, it has to be a hex number that starts with either "0x" or "0X". Use with
+\fBverify\fP=meta.
+.TP
  .BI verify_fatal \fR=\fPbool
  If true, exit the job on the first observed verification failure.  Default:
  false.
  .TP
+.BI verify_async \fR=\fPint
+Fio will normally verify IO inline from the submitting thread. This option
+takes an integer describing how many async offload threads to create for IO
+verification instead, causing fio to offload the duty of verifying IO contents
+to one or more separate threads.  If using this offload option, even sync IO
+engines can benefit from using an \fBiodepth\fR setting higher than 1, as it
+allows them to have IO in flight while verifies are running.
+.TP
+.BI verify_async_cpus \fR=\fPstr
+Tell fio to set the given CPU affinity on the async IO verification threads.
+See \fBcpus_allowed\fP for the format used.
+.TP
+.BI verify_backlog \fR=\fPint
+Fio will normally verify the written contents of a job that utilizes verify
+once that job has completed. In other words, everything is written then
+everything is read back and verified. You may want to verify continually
+instead for a variety of reasons. Fio stores the meta data associated with an
+IO block in memory, so for large verify workloads, quite a bit of memory would
+be used up holding this meta data. If this option is enabled, fio will verify
+the previously written blocks before continuing to write new ones.
+.TP
+.BI verify_backlog_batch \fR=\fPint
+Control how many blocks fio will verify if verify_backlog is set. If not set,
+will default to the value of \fBverify_backlog\fR (meaning the entire queue is
+read back and verified).
+.TP
  .B stonewall
  Wait for preceeding jobs in the job file to exit before starting this one.
  \fBstonewall\fR implies \fBnew_group\fR.
@@ -493,10 +698,10 @@ specified.
  Use threads created with \fBpthread_create\fR\|(3) instead of processes created
  with \fBfork\fR\|(2).
  .TP
-.BI zonesize \fR=\fPsiint
+.BI zonesize \fR=\fPint
  Divide file into zones of the specified size in bytes.  See \fBzoneskip\fR.
  .TP
-.BI zoneskip \fR=\fPsiint
+.BI zoneskip \fR=\fPint
  Skip the specified number of bytes when \fBzonesize\fR bytes of data have been
  read.
  .TP
@@ -507,13 +712,34 @@ Write the issued I/O patterns to the specified file.
  Replay the I/O patterns contained in the specified file generated by
  \fBwrite_iolog\fR, or may be a \fBblktrace\fR binary file.
  .TP
-.B write_bw_log
-If given, write bandwidth logs of the jobs in this file.
+.BI write_bw_log \fR=\fPstr
+If given, write a bandwidth log of the jobs in this job file. Can be used to
+store data of the bandwidth of the jobs in their lifetime. The included
+fio_generate_plots script uses gnuplot to turn these text files into nice
+graphs. See \fBwrite_lat_log\fR for behaviour of given filename. For this
+option, the postfix is _bw.log.
  .TP
  .B write_lat_log
-Same as \fBwrite_bw_log\fR, but writes I/O completion latencies.
+Same as \fBwrite_bw_log\fR, but writes I/O completion latencies.  If no
+filename is given with this option, the default filename of "jobname_type.log"
+is used. Even if the filename is given, fio will still append the type of log.
+.TP
+.BI disable_lat \fR=\fPbool
+Disable measurements of total latency numbers. Useful only for cutting
+back the number of calls to gettimeofday, as that does impact performance at
+really high IOPS rates.  Note that to really get rid of a large amount of these
+calls, this option must be used with disable_slat and disable_bw as well.
+.TP
+.BI disable_clat \fR=\fPbool
+Disable measurements of completion latency numbers. See \fBdisable_lat\fR.
+.TP
+.BI disable_slat \fR=\fPbool
+Disable measurements of submission latency numbers. See \fBdisable_lat\fR.
+.TP
+.BI disable_bw_measurement \fR=\fPbool
+Disable measurements of throughput/bandwidth numbers. See \fBdisable_lat\fR.
  .TP
-.BI lockmem \fR=\fPsiint
+.BI lockmem \fR=\fPint
  Pin the specified amount of memory with \fBmlock\fR\|(2).  Can be used to
  simulate a smaller amount of memory.
  .TP
@@ -536,6 +762,46 @@ given time in milliseconds.
  .TP
  .BI disk_util \fR=\fPbool
  Generate disk utilization statistics if the platform supports it. Default: true.
+.TP
+.BI gtod_reduce \fR=\fPbool
+Enable all of the gettimeofday() reducing options (disable_clat, disable_slat,
+disable_bw) plus reduce precision of the timeout somewhat to really shrink the
+gettimeofday() call count. With this option enabled, we only do about 0.4% of
+the gtod() calls we would have done if all time keeping was enabled.
+.TP
+.BI gtod_cpu \fR=\fPint
+Sometimes it's cheaper to dedicate a single thread of execution to just getting
+the current time. Fio (and databases, for instance) are very intensive on
+gettimeofday() calls. With this option, you can set one CPU aside for doing
+nothing but logging current time to a shared memory location. Then the other
+threads/processes that run IO workloads need only copy that segment, instead of
+entering the kernel with a gettimeofday() call. The CPU set aside for doing
+these time calls will be excluded from other uses. Fio will manually clear it
+from the CPU mask of other jobs.
+.TP
+.BI cgroup \fR=\fPstr
+Add job to this control group. If it doesn't exist, it will be created.
+The system must have a mounted cgroup blkio mount point for this to work. If
+your system doesn't have it mounted, you can do so with:
+
+# mount -t cgroup -o blkio none /cgroup
+.TP
+.BI cgroup_weight \fR=\fPint
+Set the weight of the cgroup to this value. See the documentation that comes
+with the kernel, allowed values are in the range of 100..1000.
+.TP
+.BI cgroup_nodelete \fR=\fPbool
+Normally fio will delete the cgroups it has created after the job completion.
+To override this behavior and to leave cgroups around after the job completion,
+set cgroup_nodelete=1. This can be useful if one wants to inspect various
+cgroup files after job completion. Default: false.
+.TP
+.BI uid \fR=\fPint
+Instead of running as the invoking user, set the user ID to this value before
+the thread/process does any work.
+.TP
+.BI gid \fR=\fPint
+Set group ID, see \fBuid\fR.
  .SH OUTPUT
  While running, \fBfio\fR will display the status of the created jobs.  For
  example:
@@ -692,7 +958,7 @@ semicolon-delimited format suitable for scripted use.  The fields are:
  .P
  Read status:
  .RS
-.B KiB I/O, bandwidth \fR(KiB/s)\fP, runtime \fR(ms)\fP
+.B KB I/O, bandwidth \fR(KB/s)\fP, runtime \fR(ms)\fP
  .P
  Submission latency:
  .RS
@@ -710,7 +976,7 @@ Bandwidth:
  .P
  Write status:
  .RS
-.B KiB I/O, bandwidth \fR(KiB/s)\fP, runtime \fR(ms)\fP
+.B KB I/O, bandwidth \fR(KB/s)\fP, runtime \fR(ms)\fP
  .P
  Submission latency:
  .RS
@@ -728,7 +994,7 @@ Bandwidth:
  .P
  CPU usage:
  .RS
-.B user, system, context switches
+.B user, system, context switches, major page faults, minor page faults
  .RE
  .P
  IO depth distribution:
@@ -745,12 +1011,13 @@ IO latency distribution (ms):
  .RE
  .SH AUTHORS
  .B fio
-was written by Jens Axboe <jens.axboe@oracle.com>.
+was written by Jens Axboe <jens.axboe@oracle.com>,
+now Jens Axboe <jaxboe@fusionio.com>.
  .br
  This man page was written by Aaron Carroll <aaronc@cse.unsw.edu.au> based
  on documentation by Jens Axboe.
  .SH "REPORTING BUGS"
-Report bugs to the \fBfio\fR mailing list <fio-devel@kernel.dk>.
+Report bugs to the \fBfio\fR mailing list <fio@vger.kernel.org>.
  See \fBREADME\fR.
  .SH "SEE ALSO"
  For further documentation see \fBHOWTO\fR and \fBREADME\fR.
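
Reviewer note: the suffix rules that this patch adds to the `int` type description are easy to misread, so here is a minimal Python sketch of them as worded in the patched text. It is not fio's parser. The name `parse_int` and the table `_POWERS` are hypothetical, and three behaviours are assumptions the patch does not spell out: negative numbers are rejected, a hexadecimal value takes no suffix, and a bare `b` suffix (as in `512b`) multiplies by one. Following the patch text, `KiB`-style suffixes are treated as base 10 and plain `k`/`kb` as base 1024.

```python
import re

# Exponent for each single-letter suffix named in the patched man page text.
_POWERS = {"k": 1, "m": 2, "g": 3, "t": 4, "p": 5}


def parse_int(text):
    """Convert a value such as '4k', '30GiB' or '0x10' to an integer.

    Illustrative only: this mirrors the wording of the patched 'int'
    description, not fio's actual implementation.
    """
    s = text.strip().lower()  # the suffix is not case sensitive

    # A '0x' prefix means base 16. Assumption: no suffix on hex values.
    if s.startswith("0x"):
        return int(s, 16)

    m = re.fullmatch(r"(\d+)([kmgtp]?)(ib|b)?", s)
    if m is None:
        raise ValueError("unrecognised value: %r" % text)

    number, letter, tail = m.groups()
    value = int(number)

    if not letter:
        # Assumption: a bare 'b' means bytes; 'ib' alone is meaningless.
        if tail == "ib":
            raise ValueError("unrecognised value: %r" % text)
        return value

    # Per the patch text: 'KiB', 'MiB', ... select base 10; 'k', 'kb' base 1024.
    base = 1000 if tail == "ib" else 1024
    return value * base ** _POWERS[letter]
```

With these rules, `parse_int("4kb")` and `parse_int("4K")` give the same result, while `parse_int("30GiB")` yields 30*1000^3 as the patch describes.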