X-Git-Url: https://git.kernel.dk/?p=fio.git;a=blobdiff_plain;f=HOWTO;h=6c69a0ecf7e8fac3c20d9867f9016a74ed1278a8;hp=0b80a62368a88efa464704544bd8fb79696e0119;hb=d4a507c17533f05bcf6d6eeb8d00f3dad1a020a1;hpb=b034c0dd2cdb27d3523b300c1b4b93a1c5b84b3c

diff --git a/HOWTO b/HOWTO
index 0b80a623..6c69a0ec 100644
--- a/HOWTO
+++ b/HOWTO
@@ -98,7 +98,7 @@ Command line options
 
 .. option:: --parse-only
 
-	Parse options only, don\'t start any I/O.
+	Parse options only, don't start any I/O.
 
 .. option:: --output=filename
 
@@ -505,19 +505,19 @@ Parameter types
 	prefixes.  To specify power-of-10 decimal values defined in the
 	International System of Units (SI):
 
-		* *Ki* -- means kilo (K) or 1000
-		* *Mi* -- means mega (M) or 1000**2
-		* *Gi* -- means giga (G) or 1000**3
-		* *Ti* -- means tera (T) or 1000**4
-		* *Pi* -- means peta (P) or 1000**5
+		* *ki* -- means kilo (K) or 1000
+		* *mi* -- means mega (M) or 1000**2
+		* *gi* -- means giga (G) or 1000**3
+		* *ti* -- means tera (T) or 1000**4
+		* *pi* -- means peta (P) or 1000**5
 
 	To specify power-of-2 binary values defined in IEC 80000-13:
 
 		* *k* -- means kibi (Ki) or 1024
-		* *M* -- means mebi (Mi) or 1024**2
-		* *G* -- means gibi (Gi) or 1024**3
-		* *T* -- means tebi (Ti) or 1024**4
-		* *P* -- means pebi (Pi) or 1024**5
+		* *m* -- means mebi (Mi) or 1024**2
+		* *g* -- means gibi (Gi) or 1024**3
+		* *t* -- means tebi (Ti) or 1024**4
+		* *p* -- means pebi (Pi) or 1024**5
 
 	With :option:`kb_base`\=1024 (the default), the unit prefixes are opposite
 	from those specified in the SI and IEC 80000-13 standards to provide
@@ -576,6 +576,8 @@ Parameter types
 **float_list**
 	A list of floating point numbers, separated by a ':' character.
 
+With the above in mind, here follows the complete list of fio job parameters.
+
 
 Units
 ~~~~~
@@ -622,9 +624,6 @@ Units
 		Bit based.
 
 
-With the above in mind, here follows the complete list of fio job parameters.
-
-
 Job description
 ~~~~~~~~~~~~~~~
 
@@ -1015,8 +1014,8 @@ I/O type
 
 	``sequential`` is only useful for random I/O, where fio would normally
 	generate a new random offset for every I/O. If you append e.g. 8 to randread,
-	you would get a new random offset for every 8 I/O's. The result would be a
-	seek for only every 8 I/O's, instead of for every I/O. Use ``rw=randread:8``
+	you would get a new random offset for every 8 I/Os. The result would be a
+	seek for only every 8 I/Os, instead of for every I/O. Use ``rw=randread:8``
 	to specify that. As sequential I/O is already sequential, setting
 	``sequential`` for that would not result in any differences.  ``identical``
 	behaves in a similar fashion, except it sends the same offset 8 number of
@@ -1093,11 +1092,29 @@ I/O type
 		**random**
 			Advise using **FADV_RANDOM**.
 
-.. option:: fadvise_stream=int
+.. option:: write_hint=str
+
+	Use :manpage:`fcntl(2)` to advise the kernel what lifetime to expect
+	from a write. Only supported on Linux, as of kernel version 4.13. Accepted
+	values are:
+
+		**none**
+			No particular lifetime associated with this file.
+
+		**short**
+			Data written to this file has a short lifetime.
+
+		**medium**
+			Data written to this file has a medium lifetime.
+
+		**long**
+			Data written to this file has a long lifetime.
 
-	Use :manpage:`posix_fadvise(2)` to advise the kernel what stream ID the
-	writes issued belong to. Only supported on Linux. Note, this option may
-	change going forward.
+		**extreme**
+			Data written to this file has a very long lifetime.
+
+	The values are all relative to each other, and no absolute meaning
+	should be associated with them.
 
 .. option:: offset=int
 
@@ -1237,7 +1254,7 @@ I/O type
 
 	* 60% of accesses should be to the first 10%
 	* 30% of accesses should be to the next 20%
-	* 8% of accesses should be to to the next 30%
+	* 8% of accesses should be to the next 30%
 	* 2% of accesses should be to the next 40%
 
 	we can define that through zoning of the random accesses. For the above
@@ -1375,7 +1392,7 @@ Block size
 	typically won't work with direct I/O, as that normally requires sector
 	alignment.
 
-.. option:: bs_is_seq_rand
+.. option:: bs_is_seq_rand=bool
 
 	If this option is set, fio will use the normal read,write blocksize settings
 	as sequential,random blocksize settings instead. Any random read or write
@@ -1512,6 +1529,7 @@ Buffers and memory
 
 		**cudamalloc**
 			Use GPU memory as the buffers for GPUDirect RDMA benchmark.
+			The ioengine must be rdma.
 
 	The area allocated is a function of the maximum allowed bs size for the job,
 	multiplied by the I/O depth given. Note that for **shmhuge** and
@@ -1801,6 +1819,11 @@ caveat that when used on the command line, they must come after the
 	Set RWF_HIPRI on I/O, indicating to the kernel that it's of higher priority
 	than normal.
 
+.. option:: hipri_percentage : [pvsync2]
+
+	When hipri is set, this determines the probability of a pvsync2 I/O being
+	high priority. The default is 100%.
+
 .. option:: cpuload=int : [cpuio]
 
 	Attempt to use the specified percentage of CPU cycles. This is a mandatory
@@ -1835,7 +1858,7 @@ caveat that when used on the command line, they must come after the
 
    [libhdfs]
 
-		the listening port of the HFDS cluster namenode.
+		The listening port of the HDFS cluster namenode.
 
 .. option:: interface=str : [netsplice] [net]
 
@@ -1871,13 +1894,13 @@ caveat that when used on the command line, they must come after the
 	hostname if the job is a TCP listener or UDP reader. For unix sockets, the
 	normal filename option should be used and the port is invalid.
 
-.. option:: listen : [net]
+.. option:: listen : [netsplice] [net]
 
 	For TCP network connections, tell fio to listen for incoming connections
 	rather than initiating an outgoing connection. The :option:`hostname` must
 	be omitted if this option is used.
 
-.. option:: pingpong : [net]
+.. option:: pingpong : [netsplice] [net]
 
 	Normally a network writer will just continue writing data, and a network
 	reader will just consume packages. If ``pingpong=1`` is set, a writer will
@@ -1889,11 +1912,11 @@ caveat that when used on the command line, they must come after the
 	``pingpong=1`` should only be set for a single reader when multiple readers
 	are listening to the same address.
 
-.. option:: window_size : [net]
+.. option:: window_size : [netsplice] [net]
 
 	Set the desired socket buffer size for the connection.
 
-.. option:: mss : [net]
+.. option:: mss : [netsplice] [net]
 
 	Set the TCP maximum segment size (TCP_MAXSEG).
 
@@ -1908,7 +1931,7 @@ caveat that when used on the command line, they must come after the
 	**0**
 		Default. Preallocate donor's file on init.
 	**1**
-		Allocate space immediately inside defragment event,	and free right
+		Allocate space immediately inside defragment event, and free right
 		after event.
 
 .. option:: clustername=str : [rbd]
@@ -1940,7 +1963,7 @@ caveat that when used on the command line, they must come after the
 
 .. option:: chunk_size : [libhdfs]
 
-	the size of the chunk to use for each file.
+	The size of the chunk to use for each file.
 
 
 I/O depth
@@ -2143,7 +2166,7 @@ I/O replay
 	replay, the file needs to be turned into a blkparse binary data file first
 	(``blkparse <device> -o /dev/null -d file_for_fio.bin``).
 
-.. option:: replay_no_stall=int
+.. option:: replay_no_stall=bool
 
 	When replaying I/O with :option:`read_iolog` the default behavior is to
 	attempt to respect the timestamps within the log and replay them with the
@@ -2464,7 +2487,7 @@ Verification
 
 .. option:: verifysort_nr=int
 
-   Pre-load and sort verify blocks for a read workload.
+	Pre-load and sort verify blocks for a read workload.
 
 .. option:: verify_offset=int
 
@@ -2572,7 +2595,7 @@ Verification
 
 .. option:: trim_backlog=int
 
-	Verify that trim/discarded blocks are returned as zeros.
+	Trim after this number of blocks are written.
 
 .. option:: trim_backlog_batch=int
 
@@ -2657,7 +2680,7 @@ Measurements and reporting
 	all jobs in a file will be part of the same reporting group, unless
 	separated by a :option:`stonewall`.
 
-.. option:: stats
+.. option:: stats=bool
 
 	By default, fio collects and shows final output results for all jobs
 	that run. If this option is set to 0, then fio will ignore it in
@@ -2740,10 +2763,11 @@ Measurements and reporting
 	you instead want to log the maximum value, set this option to 1. Defaults to
 	0, meaning that averaged values are logged.
 
-.. option:: log_offset=int
+.. option:: log_offset=bool
 
 	If this is set, the iolog options will include the byte offset for the I/O
-	entry as well as the other data values.
+	entry as well as the other data values. Defaults to 0, meaning that
+	offsets are not present in logs. Also see `Log File Formats`_.
 
 .. option:: log_compression=int
 
@@ -3224,7 +3248,7 @@ numbers denote:
 **ios**
 		Number of I/Os performed by all groups.
 **merge**
-		Number of merges I/O the I/O scheduler.
+		Number of merges performed by the I/O scheduler.
 **ticks**
 		Number of ticks we kept the disk busy.
 **in_queue**
@@ -3260,7 +3284,7 @@ changed for some reason, this number will be incremented by 1 to signify that
 change.
 
 Split up, the format is as follows (comments in brackets denote when a
-field was introduced or whether its specific to some terse version):
+field was introduced or whether it's specific to some terse version):
 
     ::
 
@@ -3339,6 +3363,27 @@ minimal output v3, separated by semicolons::
 	terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth;read_iops;read_runtime_ms;read_slat_min;read_slat_max;read_slat_mean;read_slat_dev;read_clat_min;read_clat_max;read_clat_mean;read_clat_dev;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_tlat_min;read_lat_max;read_lat_mean;read_lat_dev;read_bw_min;read_bw_max;read_bw_agg_pct;read_bw_mean;read_bw_dev;write_kb;write_bandwidth;write_iops;write_runtime_ms;write_slat_min;write_slat_max;write_slat_mean;write_slat_dev;write_clat_min;write_clat_max;write_clat_mean;write_clat_dev;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_tlat_min;write_lat_max;write_lat_mean;write_lat_dev;write_bw_min;write_bw_max;write_bw_agg_pct;write_bw_mean;write_bw_dev;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;write_ticks;disk_queue_time;disk_util
 
 
+JSON+ output
+------------
+
+The `json+` output format is identical to the `json` output format except that it
+adds a full dump of the completion latency bins. Each `bins` object contains a
+set of (key, value) pairs where keys are latency durations and values count how
+many I/Os had completion latencies of the corresponding duration. For example,
+consider:
+
+	"bins" : { "87552" : 1, "89600" : 1, "94720" : 1, "96768" : 1, "97792" : 1, "99840" : 1, "100864" : 2, "103936" : 6, "104960" : 534, "105984" : 5995, "107008" : 7529, ... }
+
+This data indicates that one I/O required 87,552ns to complete, two I/Os required
+100,864ns to complete, and 7,529 I/Os required 107,008ns to complete.
+
+Also included with fio is a Python script `fio_jsonplus_clat2csv` that takes
+json+ output and generates CSV-formatted latency data suitable for plotting.
+
+The latency durations actually represent the midpoints of latency intervals.
+For details refer to stat.h.
+
+
 Trace file format
 -----------------
 
@@ -3513,9 +3558,10 @@ Log File Formats
 Fio supports a variety of log file formats, for logging latencies, bandwidth,
 and IOPS. The logs share a common format, which looks like this:
 
-    *time* (`msec`), *value*, *data direction*, *offset*
+    *time* (`msec`), *value*, *data direction*, *block size* (`bytes`),
+    *offset* (`bytes`)
 
-Time for the log entry is always in milliseconds. The *value* logged depends
+*Time* for the log entry is always in milliseconds. The *value* logged depends
 on the type of log, it will be one of the following:
 
     **Latency log**
@@ -3534,16 +3580,17 @@ on the type of log, it will be one of the following:
 	**2**
 		I/O is a TRIM
 
-The *offset* is the offset, in bytes, from the start of the file, for that
-particular I/O. The logging of the offset can be toggled with
-:option:`log_offset`.
+The entry's *block size* is always in bytes. The *offset* is the offset, in bytes,
+from the start of the file, for that particular I/O. The logging of the offset can be
+toggled with :option:`log_offset`.
 
 Fio defaults to logging every individual I/O.  When IOPS are logged for individual
-I/Os the value entry will always be 1.  If windowed logging is enabled through
+I/Os the *value* entry will always be 1. If windowed logging is enabled through
 :option:`log_avg_msec`, fio logs the average values over the specified period of time.
 If windowed logging is enabled and :option:`log_max_value` is set, then fio logs
-maximum values in that window instead of averages.  Since 'data direction' and
-'offset' are per-I/O values, they aren't applicable if windowed logging is enabled.
+maximum values in that window instead of averages. Since *data direction*, *block
+size* and *offset* are per-I/O values, they aren't applicable when windowed
+logging is enabled and will be 0.
 
 Client/Server
 -------------