Fix typo in bssplit documentation

diff --git a/fio.1 b/fio.1
index e488b01bd146ffad96c99c103aeba1763cd251fd..733c7406b33f804bf520ed9face5c9eba8378055 100644
--- a/fio.1
+++ b/fio.1
@@ -454,7 +454,7 @@ See \fB\-\-max\-jobs\fR. Default: 1.
  Tell fio to terminate processing after the specified period of time. It
  can be quite hard to determine for how long a specified job will run, so
  this parameter is handy to cap the total runtime to a given time. When
-the unit is omitted, the value is intepreted in seconds.
+the unit is omitted, the value is interpreted in seconds.
  .TP
  .BI time_based
  If set, fio will run for the duration of the \fBruntime\fR specified
@@ -1227,7 +1227,7 @@ If you want a workload that has 50% 2k reads and 50% 4k reads, while having
  90% 4k writes and 10% 8k writes, you would specify:
  .RS
  .P
-bssplit=2k/50:4k/50,4k/90,8k/10
+bssplit=2k/50:4k/50,4k/90:8k/10
  .RE
  .P
  Fio supports defining up to 64 different weights for each data direction.
@@ -1523,7 +1523,7 @@ SCSI generic sg v3 I/O. May either be synchronous using the SG_IO
  ioctl, or if the target is an sg character device we use
  \fBread\fR\|(2) and \fBwrite\fR\|(2) for asynchronous
  I/O. Requires \fBfilename\fR option to specify either block or
-character devices.
+character devices. The sg engine includes engine specific options.
  .TP
  .B null
  Doesn't transfer any data, just pretends to. This is mainly used to
@@ -1552,7 +1552,7 @@ single CPU at the desired rate. A job never finishes unless there is
  at least one non\-cpuio job.
  .TP
  .B guasi
-The GUASI I/O engine is the Generic Userspace Asyncronous Syscall
+The GUASI I/O engine is the Generic Userspace Asynchronous Syscall
  Interface approach to async I/O. See \fIhttp://www.xmailserver.org/guasi\-lib.html\fR
  for more info on GUASI.
  .TP
@@ -1628,12 +1628,12 @@ constraint.
  .TP
  .B pmemblk
  Read and write using filesystem DAX to a file on a filesystem
-mounted with DAX on a persistent memory device through the NVML
+mounted with DAX on a persistent memory device through the PMDK
  libpmemblk library.
  .TP
  .B dev\-dax
  Read and write using device DAX to a persistent memory device (e.g.,
-/dev/dax0.0) through the NVML libpmem library.
+/dev/dax0.0) through the PMDK libpmem library.
  .TP
  .B external
  Prefix to specify loading an external I/O engine object file. Append
@@ -1649,7 +1649,7 @@ done other than creating the file.
  .TP
  .B libpmem
  Read and write using mmap I/O to a file on a filesystem
-mounted with DAX on a persistent memory device through the NVML
+mounted with DAX on a persistent memory device through the PMDK
  libpmem library.
  .SS "I/O engine specific parameters"
  In addition, there are some parameters which are only valid when a specific
@@ -1820,6 +1820,41 @@ server side this will be passed into the rdma_bind_addr() function and
  on the client site it will be used in the rdma_resolve_add()
  function. This can be useful when multiple paths exist between the
  client and the server or in certain loopback configurations.
+.TP
+.BI (sg)readfua \fR=\fPbool
+With readfua option set to 1, read operations include the force
+unit access (fua) flag. Default: 0.
+.TP
+.BI (sg)writefua \fR=\fPbool
+With writefua option set to 1, write operations include the force
+unit access (fua) flag. Default: 0.
+.TP
+.BI (sg)sg_write_mode \fR=\fPstr
+Specify the type of write commands to issue. This option can take three
+values:
+.RS
+.RS
+.TP
+.B write (default)
+Write opcodes are issued as usual.
+.TP
+.B verify
+Issue WRITE AND VERIFY commands. The BYTCHK bit is set to 0. This
+directs the device to carry out a medium verification with no data
+comparison. The writefua option is ignored with this selection.
+.TP
+.B same
+Issue WRITE SAME commands. This transfers a single block to the device
+and writes this same block of data to a contiguous sequence of LBAs
+beginning at the specified offset. fio's block size parameter
+specifies the amount of data written with each command. However, the
+amount of data actually transferred to the device is equal to the
+device's block (sector) size. For a device with 512 byte sectors,
+blocksize=8k will write 16 sectors with each command. fio will still
+generate 8k of data for each command but only the first 512 bytes will
+be used and transferred to the device. The writefua option is ignored
+with this selection.
+
  .SS "I/O depth"
  .TP
  .BI iodepth \fR=\fPint
@@ -2028,6 +2063,12 @@ respect the timestamps and attempt to replay them as fast as possible while
  still respecting ordering. The result is the same I/O pattern to a given
  device, but different timings.
  .TP
+.BI replay_time_scale \fR=\fPint
+When replaying I/O with \fBread_iolog\fR, fio will honor the original timing
+in the trace. With this option, it's possible to scale the time. The value is
+a percentage: if set to 50, fio runs at 50% of the original I/O rate in the
+trace; if set to 200, it runs at twice the original I/O rate. Default: 100.
+.TP
  .BI replay_redirect \fR=\fPstr
  While replaying I/O patterns using \fBread_iolog\fR the default behavior
  is to replay the IOPS onto the major/minor device that each IOP was recorded
@@ -2053,6 +2094,12 @@ value.
  Scale sector offsets down by this factor when replaying traces.
  .SS "Threads, processes and job synchronization"
  .TP
+.BI replay_skip \fR=\fPstr
+Sometimes it's useful to skip certain I/O types in a replay trace, for
+instance to eliminate the writes in the trace, or to avoid replaying the
+trims/discards if you are redirecting to a device that doesn't support them.
+This option takes a comma\-separated list of read, write, trim, sync.
+.TP
  .BI thread
  Fio defaults to creating jobs by using fork, however if this option is
  given, fio will create jobs by using POSIX Threads' function
@@ -2083,22 +2130,28 @@ systems since meaning of priority may differ.
  .BI prioclass \fR=\fPint
  Set the I/O priority class. See man \fBionice\fR\|(1).
  .TP
-.BI cpumask \fR=\fPint
-Set the CPU affinity of this job. The parameter given is a bit mask of
-allowed CPUs the job may run on. So if you want the allowed CPUs to be 1
-and 5, you would pass the decimal value of (1 << 1 | 1 << 5), or 34. See man
-\fBsched_setaffinity\fR\|(2). This may not work on all supported
-operating systems or kernel versions. This option doesn't work well for a
-higher CPU count than what you can store in an integer mask, so it can only
-control cpus 1\-32. For boxes with larger CPU counts, use
-\fBcpus_allowed\fR.
-.TP
  .BI cpus_allowed \fR=\fPstr
  Controls the same options as \fBcpumask\fR, but accepts a textual
-specification of the permitted CPUs instead. So to use CPUs 1 and 5 you
-would specify `cpus_allowed=1,5'. This option also allows a range of CPUs
-to be specified \-\- say you wanted a binding to CPUs 1, 5, and 8 to 15, you
-would set `cpus_allowed=1,5,8\-15'.
+specification of the permitted CPUs instead and CPUs are indexed from 0. So
+to use CPUs 0 and 5 you would specify `cpus_allowed=0,5'. This option also
+allows a range of CPUs to be specified \-\- say you wanted a binding to CPUs
+0, 5, and 8 to 15, you would set `cpus_allowed=0,5,8\-15'.
+.RS
+.P
+On Windows, when `cpus_allowed' is unset only CPUs from fio's current
+processor group will be used and affinity settings are inherited from the
+system. An fio build configured to target Windows 7 makes options that set
+CPUs processor group aware and values will set both the processor group
+and a CPU from within that group. For example, on a system where processor
+group 0 has 40 CPUs and processor group 1 has 32 CPUs, `cpus_allowed'
+values between 0 and 39 will bind CPUs from processor group 0 and
+`cpus_allowed' values between 40 and 71 will bind CPUs from processor
+group 1. When using `cpus_allowed_policy=shared' all CPUs specified by a
+single `cpus_allowed' option must be from the same processor group. For
+Windows fio builds not built for Windows 7, CPUs will only be selected from
+(and be relative to) whatever processor group fio happens to be running in
+and CPUs from other processor groups cannot be used.
+.RE
  .TP
  .BI cpus_allowed_policy \fR=\fPstr
  Set the policy of how fio distributes the CPUs specified by
@@ -2119,6 +2172,16 @@ enough CPUs are given for the jobs listed, then fio will roundrobin the CPUs
  in the set.
  .RE
  .TP
+.BI cpumask \fR=\fPint
+Set the CPU affinity of this job. The parameter given is a bit mask of
+allowed CPUs the job may run on. So if you want the allowed CPUs to be 1
+and 5, you would pass the decimal value of (1 << 1 | 1 << 5), or 34. See man
+\fBsched_setaffinity\fR\|(2). This may not work on all supported
+operating systems or kernel versions. This option doesn't work well for a
+higher CPU count than what you can store in an integer mask, so it can only
+control cpus 1\-32. For boxes with larger CPU counts, use
+\fBcpus_allowed\fR.
+.TP
  .BI numa_cpu_nodes \fR=\fPstr
  Set this job running on specified NUMA nodes' CPUs. The arguments allow
  comma delimited list of cpu numbers, A\-B ranges, or `all'. Note, to enable
@@ -2134,7 +2197,7 @@ arguments:
  <mode>[:<nodelist>]
  .RE
  .P
-`mode' is one of the following memory poicies: `default', `prefer',
+`mode' is one of the following memory policies: `default', `prefer',
  `bind', `interleave' or `local'. For `default' and `local' memory
  policies, no node needs to be specified. For `prefer', only one node is
  allowed. For `bind' and `interleave' the `nodelist' may be as
@@ -2244,7 +2307,7 @@ Use a crc32c sum of the data area and store it in the header of
  each block. This will automatically use hardware acceleration
  (e.g. SSE4.2 on an x86 or CRC crypto extensions on ARM64) but will
  fall back to software crc32c if none is found. Generally the
-fatest checksum fio supports when hardware accelerated.
+fastest checksum fio supports when hardware accelerated.
  .TP
  .B crc32c\-intel
  Synonym for crc32c.
@@ -2310,16 +2373,6 @@ previously written file. If the data direction includes any form of write,
  the verify will be of the newly written data.
  .RE
  .TP
-.BI verifysort \fR=\fPbool
-If true, fio will sort written verify blocks when it deems it faster to read
-them back in a sorted manner. This is often the case when overwriting an
-existing file, since the blocks are already laid out in the file system. You
-can ignore this option unless doing huge amounts of really fast I/O where
-the red\-black tree sorting CPU time becomes significant. Default: true.
-.TP
-.BI verifysort_nr \fR=\fPint
-Pre\-load and sort verify blocks for a read workload.
-.TP
  .BI verify_offset \fR=\fPint
  Swap the verification header with data somewhere else in the block before
  writing. It is swapped back before verifying.
@@ -2595,7 +2648,8 @@ zlib.
  .BI log_compression_cpus \fR=\fPstr
  Define the set of CPUs that are allowed to handle online log compression for
  the I/O jobs. This can provide better isolation between performance
-sensitive jobs, and background compression work.
+sensitive jobs, and background compression work. See \fBcpus_allowed\fR for
+the format used.
  .TP
  .BI log_store_compressed \fR=\fPbool
  If set, fio will store the log files in a compressed format. They can be
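
The bssplit correction in this patch is easier to see when the value is split on its separators. The following is a minimal Python sketch, not fio's own parser: `parse_bssplit` and `cpumask_for` are hypothetical helper names, every entry is assumed to carry an explicit percentage, and block sizes are kept as plain strings. It treats `,` as the separator between data directions and `:` as the separator between weighted block sizes within one direction, which is the structure the corrected example relies on. The second helper reproduces the `cpumask` arithmetic quoted in the man page text, (1 << 1 | 1 << 5) = 34.

```python
def parse_bssplit(value):
    """Split a bssplit value into one list of (blocksize, percent) per direction.

    Assumption: every entry has the form "size/percent"; entries with an
    omitted percentage are not handled in this sketch.
    """
    directions = []
    for section in value.split(","):      # one section per data direction
        weights = []
        for entry in section.split(":"):  # one entry per weighted block size
            size, _, percent = entry.partition("/")
            weights.append((size, int(percent)))
        directions.append(weights)
    return directions


def cpumask_for(cpus):
    """Build the integer bit mask described for cpumask from a list of CPU numbers."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    # Corrected example: two directions, each with two weights.
    print(parse_bssplit("2k/50:4k/50,4k/90:8k/10"))
    # Old example with the typo: the stray comma yields three sections.
    print(parse_bssplit("2k/50:4k/50,4k/90,8k/10"))
    # CPUs 1 and 5, as in the cpumask description.
    print(cpumask_for([1, 5]))
```

With the old text, the second section contains only `4k/90` and `8k/10` ends up in a third section of its own, so the example no longer describes a 90%/10% split of the writes.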