Merge branch 'fio-jsonplus-patches' of https://github.com/vincentkfu/fio

diff --git a/fio.1 b/fio.1
index 629ab01f46080693711d1a6fcdb8a38890ea1ac6..ab978ab3aeb2914ac9d48e2f6cbb5ca2a728132c 100644
--- a/fio.1
+++ b/fio.1
@@ -212,94 +212,154 @@ parentheses).
  The following parameter types are used.
  .TP
  .I str
-String: a sequence of alphanumeric characters.
+String. A sequence of alphanumeric characters.
+.TP
+.I time
+Integer with possible time suffix. Without a unit, the value is interpreted as
+seconds unless otherwise specified. Accepts a suffix of 'd' for days, 'h' for
+hours, 'm' for minutes, 's' for seconds, 'ms' (or 'msec') for milliseconds and 'us'
+(or 'usec') for microseconds. For example, use 10m for 10 minutes.
  .TP
  .I int
  Integer. A whole number value, which may contain an integer prefix
  and an integer suffix.
-
-[integer prefix]number[integer suffix]
-
-The optional integer prefix specifies the number's base. The default
-is decimal. 0x specifies hexadecimal.
-
-The optional integer suffix specifies the number's units, and includes
-an optional unit prefix and an optional unit.  For quantities
-of data, the default unit is bytes. For quantities of time,
-the default unit is seconds.
-
-With \fBkb_base=1000\fR, fio follows international standards for unit prefixes.
-To specify power-of-10 decimal values defined in the International
-System of Units (SI):
-.nf
+.RS
+.RS
+.P
+[\fIinteger prefix\fR] \fBnumber\fR [\fIinteger suffix\fR]
+.RE
+.P
+The optional \fIinteger prefix\fR specifies the number's base. The default
+is decimal. \fI0x\fR specifies hexadecimal.
+.P
+The optional \fIinteger suffix\fR specifies the number's units, and includes an
+optional unit prefix and an optional unit. For quantities of data, the
+default unit is bytes. For quantities of time, the default unit is seconds
+unless otherwise specified.
+.P
+With `kb_base=1000', fio follows international standards for unit
+prefixes. To specify power-of-10 decimal values defined in the
+International System of Units (SI):
+.RS
+.P
  ki means kilo (K) or 1000
+.RE
+.RS
  mi means mega (M) or 1000**2
+.RE
+.RS
  gi means giga (G) or 1000**3
+.RE
+.RS
  ti means tera (T) or 1000**4
+.RE
+.RS
  pi means peta (P) or 1000**5
-.fi
-
+.RE
+.P
  To specify power-of-2 binary values defined in IEC 80000-13:
-.nf
+.RS
+.P
  k means kibi (Ki) or 1024
+.RE
+.RS
  m means mebi (Mi) or 1024**2
+.RE
+.RS
  g means gibi (Gi) or 1024**3
+.RE
+.RS
  t means tebi (Ti) or 1024**4
+.RE
+.RS
  p means pebi (Pi) or 1024**5
-.fi
-
-With \fBkb_base=1024\fR (the default), the unit prefixes are opposite from
-those specified in the SI and IEC 80000-13 standards to provide
-compatibility with old scripts.  For example, 4k means 4096.
-
-.nf
-Examples with \fBkb_base=1000\fR:
+.RE
+.P
+With `kb_base=1024' (the default), the unit prefixes are opposite
+from those specified in the SI and IEC 80000-13 standards to provide
+compatibility with old scripts. For example, 4k means 4096.
+.P
+For quantities of data, an optional unit of 'B' may be included
+(e.g., 'kB' is the same as 'k').
+.P
+The \fIinteger suffix\fR is not case sensitive (e.g., m/mi mean mebi/mega,
+not milli). 'b' and 'B' both mean byte, not bit.
+.P
+Examples with `kb_base=1000':
+.RS
+.P
  4 KiB: 4096, 4096b, 4096B, 4k, 4kb, 4kB, 4K, 4KB
+.RE
+.RS
  1 MiB: 1048576, 1m, 1024k
+.RE
+.RS
  1 MB: 1000000, 1mi, 1000ki
+.RE
+.RS
  1 TiB: 1099511627776, 1t, 1024g, 1048576m
+.RE
+.RS
  1 TB: 1000000000000, 1ti, 1000gi, 1000000mi
-.fi
-
-.nf
-Examples with \fBkb_base=1024\fR (default):
+.RE
+.P
+Examples with `kb_base=1024' (default):
+.RS
+.P
  4 KiB: 4096, 4096b, 4096B, 4k, 4kb, 4kB, 4K, 4KB
+.RE
+.RS
  1 MiB: 1048576, 1m, 1024k
+.RE
+.RS
  1 MB: 1000000, 1mi, 1000ki
+.RE
+.RS
  1 TiB: 1099511627776, 1t, 1024g, 1048576m
+.RE
+.RS
  1 TB: 1000000000000, 1ti, 1000gi, 1000000mi
-.fi
-
-For quantities of data, an optional unit of 'B' may be included
-(e.g.,  'kb' is the same as 'k').
-
-The integer suffix is not case sensitive (e.g., m/mi mean mebi/mega,
-not milli). 'b' and 'B' both mean byte, not bit.
-
+.RE
+.P
  To specify times (units are not case sensitive):
-.nf
+.RS
+.P
  D means days
+.RE
+.RS
  H means hours
+.RE
+.RS
  M means minutes
+.RE
+.RS
  s or sec means seconds (default)
+.RE
+.RS
  ms or msec means milliseconds
+.RE
+.RS
  us or usec means microseconds
-.fi
-
+.RE
+.P
+If the option accepts an upper and lower range, use a colon ':' or
+minus '-' to separate such values. See the \fIirange\fR parameter type.
+If the lower value specified happens to be larger than the upper value
+the two values are swapped.
+.RE
  .TP
  .I bool
-Boolean: a true or false value. `0' denotes false, `1' denotes true.
+Boolean. Usually parsed as an integer, but only defined for
+true and false (1 and 0).
  .TP
  .I irange
-Integer range: a range of integers specified in the format
-\fIlower\fR:\fIupper\fR or \fIlower\fR\-\fIupper\fR. \fIlower\fR and
-\fIupper\fR may contain a suffix as described above.  If an option allows two
-sets of ranges, they are separated with a `,' or `/' character. For example:
-`8\-8k/8M\-4G'.
+Integer range with suffix. Allows a value range to be given, such as
+1024-4096. A colon may also be used as the separator, e.g. 1k:4k. If the
+option allows two sets of ranges, they can be specified with a ',' or '/'
+delimiter: 1k-4k/8k-32k. Also see the \fIint\fR parameter type.
  .TP
  .I float_list
-List of floating numbers: A list of floating numbers, separated by
-a ':' character.
+A list of floating point numbers, separated by a ':' character.
  .SH "JOB DESCRIPTION"
  With the above in mind, here follows the complete list of fio job parameters.
  .TP
@@ -439,7 +499,7 @@ the same blocks will be written to.
  Fio defaults to read if the option is not specified.
  For mixed I/O, the default split is 50/50. For certain types of io the result
  may still be skewed a bit, since the speed may be different. It is possible to
-specify a number of IO's to do before getting a new offset, this is done by
+specify a number of IOs to do before getting a new offset; this is done by
  appending a `:\fI<nr>\fR to the end of the string given. For a random read, it
  would look like \fBrw=randread:8\fR for passing in an offset modifier with a
  value of 8. If the postfix is used with a sequential IO pattern, then the value
@@ -464,8 +524,8 @@ Generate the same offset
  .P
  \fBsequential\fR is only useful for random IO, where fio would normally
  generate a new random offset for every IO. If you append eg 8 to randread, you
-would get a new random offset for every 8 IO's. The result would be a seek for
-only every 8 IO's, instead of for every IO. Use \fBrw=randread:8\fR to specify
+would get a new random offset for every 8 IOs. The result would be a seek for
+only every 8 IOs, instead of for every IO. Use \fBrw=randread:8\fR to specify
  that. As sequential IO is already sequential, setting \fBsequential\fR for that
  would not result in any differences.  \fBidentical\fR behaves in a similar
  fashion, except it sends the same offset 8 number of times before generating a
@@ -550,10 +610,30 @@ Advise using \fBFADV_RANDOM\fR
  .RE
  .RE
  .TP
-.BI fadvise_stream \fR=\fPint
-Use \fBposix_fadvise\fR\|(2) to advise the kernel what stream ID the
-writes issued belong to. Only supported on Linux. Note, this option
-may change going forward.
+.BI write_hint \fR=\fPstr
+Use \fBfcntl\fR\|(2) to advise the kernel what life time to expect from a write.
+Only supported on Linux, as of kernel version 4.13. The values are all relative to
+each other, and no absolute meaning should be associated with them. Accepted
+values are:
+.RS
+.RS
+.TP
+.B none
+No particular life time associated with this file.
+.TP
+.B short
+Data written to this file has a short life time.
+.TP
+.B medium
+Data written to this file has a medium life time.
+.TP
+.B long
+Data written to this file has a long life time.
+.TP
+.B extreme
+Data written to this file has a very long life time.
+.RE
+.RE
  .TP
  .BI size \fR=\fPint
  Total size of I/O for this job.  \fBfio\fR will run until this many bytes have
@@ -1037,7 +1117,7 @@ SYNC_FILE_RANGE_WAIT_BEFORE
  SYNC_FILE_RANGE_WRITE
  .TP
  .B wait_after
-SYNC_FILE_RANGE_WRITE
+SYNC_FILE_RANGE_WAIT_AFTER
  .TP
  .RE
  .P
@@ -1109,7 +1189,7 @@ given a criteria of:
  30% of accesses should be to the next 20%
  .RE
  .RS
-8% of accesses should be to to the next 30%
+8% of accesses should be to the next 30%
  .RE
  .RS
  2% of accesses should be to the next 40%
@@ -1430,7 +1510,7 @@ Should be a multiple of 1MiB. Default: 4MiB.
  .B exitall
  Terminate all jobs when one finishes.  Default: wait for each job to finish.
  .TP
-.B exitall_on_error \fR=\fPbool
+.B exitall_on_error
  Terminate all jobs if one job finishes in error.  Default: wait for each job
  to finish.
  .TP
@@ -1486,7 +1566,7 @@ Unlink job files after each iteration or loop.  Default: false.
  Specifies the number of iterations (runs of the same workload) of this job.
  Default: 1.
  .TP
-.BI verify_only \fR=\fPbool
+.BI verify_only
  Do not perform the specified workload, only verify data still matches previous
  invocation of this workload. This option allows one to check data multiple
  times at a later date without overwriting it. This option makes sense only for
@@ -1676,7 +1756,7 @@ corrupt.
  Replay the I/O patterns contained in the specified file generated by
  \fBwrite_iolog\fR, or may be a \fBblktrace\fR binary file.
  .TP
-.BI replay_no_stall \fR=\fPint
+.BI replay_no_stall \fR=\fPbool
  While replaying I/O patterns using \fBread_iolog\fR the default behavior
  attempts to respect timing information between I/Os.  Enabling
  \fBreplay_no_stall\fR causes I/Os to be replayed as fast as possible while
@@ -1760,7 +1840,8 @@ logs contain 1216 latency bins. See the \fBLOG FILE FORMATS\fR section.
  .TP
  .BI log_offset \fR=\fPbool
  If this is set, the iolog options will include the byte offset for the IO
-entry as well as the other data values.
+entry as well as the other data values. Defaults to 0, meaning that offsets are
+not present in logs. See the \fBLOG FILE FORMATS\fR section.
  .TP
  .BI log_compression \fR=\fPint
  If this is set, fio will compress the IO logs as it goes, to keep the memory
@@ -1983,6 +2064,10 @@ iodepth_batch_complete=0).
  Set RWF_HIPRI on IO, indicating to the kernel that it's of
  higher priority than normal.
  .TP
+.BI (pvsync2)hipri_percentage
+When hipri is set, this determines the probability of a pvsync2 IO being high
+priority. The default is 100%.
+.TP
  .BI (net,netsplice)hostname \fR=\fPstr
  The host name or IP address to use for TCP or UDP based IO.
  If the job is a TCP listener or UDP reader, the hostname is not
@@ -2035,7 +2120,7 @@ For TCP network connections, tell fio to listen for incoming
  connections rather than initiating an outgoing connection. The
  hostname must be omitted if this option is used.
  .TP
-.BI (net, pingpong) \fR=\fPbool
+.BI (net,netsplice)pingpong
  Normally a network writer will just continue writing data, and a network reader
  will just consume packets. If pingpong=1 is set, a writer will send its normal
  payload to the reader, then wait for the reader to send the same payload back.
@@ -2045,16 +2130,16 @@ completion latency measures how long it took for the other end to receive and
  send back. For UDP multicast traffic pingpong=1 should only be set for a single
  reader when multiple readers are listening to the same address.
  .TP
-.BI (net, window_size) \fR=\fPint
+.BI (net,netsplice)window_size \fR=\fPint
  Set the desired socket buffer size for the connection.
  .TP
-.BI (net, mss) \fR=\fPint
+.BI (net,netsplice)mss \fR=\fPint
  Set the TCP maximum segment size (TCP_MAXSEG).
  .TP
-.BI (e4defrag,donorname) \fR=\fPstr
+.BI (e4defrag)donorname \fR=\fPstr
  File will be used as a block donor (swap extents between files)
  .TP
-.BI (e4defrag,inplace) \fR=\fPint
+.BI (e4defrag)inplace \fR=\fPint
  Configure donor file block allocation strategy
  .RS
  .BI 0(default) :
@@ -2078,7 +2163,7 @@ Specifies the username (without the 'client.' prefix) used to access the Ceph
  cluster. If the clustername is specified, the clientname shall be the full
  type.id string. If no type. prefix is given, fio will add 'client.' by default.
  .TP
-.BI (mtd)skipbad \fR=\fPbool
+.BI (mtd)skip_bad \fR=\fPbool
  Skip operations against known bad blocks.
  .SH OUTPUT
  While running, \fBfio\fR will display the status of the created jobs.  For
@@ -2217,7 +2302,7 @@ Finally, disk statistics are printed with reads first:
  Number of I/Os performed by all groups.
  .TP
  .B merge
-Number of merges in the I/O scheduler.
+Number of merges performed by the I/O scheduler.
  .TP
  .B ticks
  Number of ticks we kept the disk busy.
@@ -2354,6 +2439,27 @@ the minimal output v3, separated by semicolons:
  terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth;read_iops;read_runtime_ms;read_slat_min;read_slat_max;read_slat_mean;read_slat_dev;read_clat_max;read_clat_min;read_clat_mean;read_clat_dev;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_lat_min;read_lat_max;read_lat_mean;read_lat_dev;read_bw_min;read_bw_max;read_bw_agg_pct;read_bw_mean;read_bw_dev;write_kb;write_bandwidth;write_iops;write_runtime_ms;write_slat_min;write_slat_max;write_slat_mean;write_slat_dev;write_clat_max;write_clat_min;write_clat_mean;write_clat_dev;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_lat_min;write_lat_max;write_lat_mean;write_lat_dev;write_bw_min;write_bw_max;write_bw_agg_pct;write_bw_mean;write_bw_dev;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;disk_write_ticks;disk_queue_time;disk_util
  .fi
  .RE
+.SH JSON+ OUTPUT
+The \fBjson+\fR output format is identical to the \fBjson\fR output format except that it
+adds a full dump of the completion latency bins. Each \fBbins\fR object contains a
+set of (key, value) pairs where keys are latency durations and values count how
+many I/Os had completion latencies of the corresponding duration. For example,
+consider:
+
+.RS
+"bins" : { "87552" : 1, "89600" : 1, "94720" : 1, "96768" : 1, "97792" : 1, "99840" : 1, "100864" : 2, "103936" : 6, "104960" : 534, "105984" : 5995, "107008" : 7529, ... }
+.RE
+
+This data indicates that one I/O required 87,552ns to complete, two I/Os required
+100,864ns to complete, and 7529 I/Os required 107,008ns to complete.
+
+Also included with fio is a Python script \fBfio_jsonplus_clat2csv\fR that takes
+json+ output and generates CSV-formatted latency data suitable for plotting.
+
+The latency durations actually represent the midpoints of latency intervals.
+For details refer to stat.h.
+
+
  .SH TRACE FILE FORMAT
  There are two trace file formats that you can encounter. The older (v1) format
  is unsupported since version 1.20-rc3 (March 2008). It will still be described
@@ -2547,7 +2653,7 @@ the files over and load them from there.
  Fio supports a variety of log file formats, for logging latencies, bandwidth,
  and IOPS. The logs share a common format, which looks like this:
  
-.B time (msec), value, data direction, offset
+.B time (msec), value, data direction, block size (bytes), offset (bytes)
  
  Time for the log entry is always in milliseconds. The value logged depends
  on the type of log, it will be one of the following:
@@ -2582,15 +2688,16 @@ IO is a TRIM
  .PD
  .P
  
-The \fIoffset\fR is the offset, in bytes, from the start of the file, for that
-particular IO. The logging of the offset can be toggled with \fBlog_offset\fR.
+The entry's \fIblock size\fR is always in bytes. The \fIoffset\fR is the offset, in
+bytes, from the start of the file, for that particular IO. The logging of the
+offset can be toggled with \fBlog_offset\fR.
  
  If windowed logging is enabled through \fBlog_avg_msec\fR, then fio doesn't log
  individual IOs. Instead it logs the average values over the specified
-period of time. Since \fIdata direction\fR and \fIoffset\fR are per-IO values,
-they aren't applicable if windowed logging is enabled. If windowed logging
-is enabled and \fBlog_max_value\fR is set, then fio logs maximum values in
-that window instead of averages.
+period of time. Since \fIdata direction\fR, \fIblock size\fR and \fIoffset\fR
+are per-IO values, if windowed logging is enabled they aren't applicable and
+will be 0. If windowed logging is enabled and \fBlog_max_value\fR is set, then
+fio logs maximum values in that window instead of averages.
  
  For histogram logging the logs look like this:
  
@@ -2699,7 +2806,10 @@ This man page was written by Aaron Carroll <aaronc@cse.unsw.edu.au> based
  on documentation by Jens Axboe.
  .SH "REPORTING BUGS"
  Report bugs to the \fBfio\fR mailing list <fio@vger.kernel.org>.
-See \fBREADME\fR.
+.br
+See \fBREPORTING-BUGS\fR.
+
+\fBREPORTING-BUGS\fR: http://git.kernel.dk/cgit/fio/plain/REPORTING-BUGS
  .SH "SEE ALSO"
  For further documentation see \fBHOWTO\fR and \fBREADME\fR.
  .br