engines: add engine for file delete

diff --git a/HOWTO b/HOWTO

index e0403b0803f04cb04ef7a14832dd39b3803c34d8..e6078c5f1e16e1143d4057b9a1e03bad21954d1f 100644
--- a/HOWTO
+++ b/HOWTO
@@ -809,6 +809,8 @@ Target file/device
  
                 **$jobname**
                                 The name of the worker thread or process.
+               **$clientuid**
+                               IP of the fio process when using client/server mode.
                 **$jobnum**
                                 The incremental number of the worker thread or process.
                 **$filenum**
@@ -1144,11 +1146,31 @@ I/O type
         behaves in a similar fashion, except it sends the same offset 8 number of
         times before generating a new offset.
  
-.. option:: unified_rw_reporting=bool
+.. option:: unified_rw_reporting=str
  
         Fio normally reports statistics on a per data direction basis, meaning that
-       reads, writes, and trims are accounted and reported separately. If this
-       option is set fio sums the results and report them as "mixed" instead.
+       reads, writes, and trims are accounted and reported separately. This option
+       determines whether fio reports the results normally, summed together, or as
+       both options.
+       Accepted values are:
+
+               **none**
+                       Normal statistics reporting.
+
+               **mixed**
+                       Statistics are summed per data direction and reported together.
+
+               **both**
+                       Statistics are reported normally, followed by the mixed statistics.
+
+               **0**
+                       Backward-compatible alias for **none**.
+
+               **1**
+                       Backward-compatible alias for **mixed**.
+               
+               **2**
+                       Alias for **both**.
  
  .. option:: randrepeat=bool
  
@@ -1361,7 +1383,7 @@ I/O type
         limit reads or writes to a certain rate.  If that is the case, then the
         distribution may be skewed. Default: 50.
  
-.. option:: random_distribution=str:float[,str:float][,str:float]
+.. option:: random_distribution=str:float[:float][,str:float][,str:float]
  
         By default, fio will use a completely uniform random distribution when asked
         to perform random I/O. Sometimes it is useful to skew the distribution in
@@ -1396,6 +1418,14 @@ I/O type
         map. For the **normal** distribution, a normal (Gaussian) deviation is
         supplied as a value between 0 and 100.
  
+       The second, optional float is allowed for **pareto**, **zipf** and **normal** distributions.
+       It lets you place the base of the distribution at a non-default position, giving more
+       control over the most probable outcome. This value is in the range [0-1], which maps
+       linearly to the range of possible random values.
+       Defaults are: random for **pareto** and **zipf**, and 0.5 for **normal**.
+       If you wanted to use **zipf** with a `theta` of 1.2 centered on 1/4 of the allowed value range,
+       you would use ``random_distribution=zipf:1.2:0.25``.
+
         For a **zoned** distribution, fio supports specifying percentages of I/O
         access that should fall within what range of the file or device. For
         example, given a criteria of:
@@ -1677,10 +1707,28 @@ Buffers and memory
         This will be ignored if :option:`pre_read` is also specified for the
         same job.
  
-.. option:: sync=bool
+.. option:: sync=str
+
+       Whether, and what type, of synchronous I/O to use for writes.  The allowed
+       values are:
+
+               **none**
+                       Do not use synchronous IO, the default.
+
+               **0**
+                       Same as **none**.
+
+               **sync**
+                       Use synchronous file IO. For the majority of I/O engines,
+                       this means using O_SYNC.
+
+               **1**
+                       Same as **sync**.
+
+               **dsync**
+                       Use synchronous data IO. For the majority of I/O engines,
+                       this means using O_DSYNC.
  
-       Use synchronous I/O for buffered writes. For the majority of I/O engines,
-       this means using O_SYNC. Default: false.
  
  .. option:: iomem=str, mem=str
  
@@ -1894,20 +1942,14 @@ I/O engine
  
                 **cpuio**
                         Doesn't transfer any data, but burns CPU cycles according to the
-                       :option:`cpuload` and :option:`cpuchunks` options. Setting
-                       :option:`cpuload`\=85 will cause that job to do nothing but burn 85%
+                       :option:`cpuload`, :option:`cpuchunks` and :option:`cpumode` options.
+                       Setting :option:`cpuload`\=85 will cause that job to do nothing but burn 85%
                         of the CPU. In case of SMP machines, use :option:`numjobs`\=<nr_of_cpu>
                         to get desired CPU usage, as the cpuload only loads a
                         single CPU at the desired rate. A job never finishes unless there is
                         at least one non-cpuio job.
-
-               **guasi**
-                       The GUASI I/O engine is the Generic Userspace Asynchronous Syscall
-                       Interface approach to async I/O. See
-
-                       http://www.xmailserver.org/guasi-lib.html
-
-                       for more info on GUASI.
+                       Setting :option:`cpumode`\=qsort replaces the default noop instruction loop
+                       with a qsort algorithm to consume more energy.
  
                 **rdma**
                         The RDMA I/O engine supports both RDMA memory semantics
@@ -2013,6 +2055,11 @@ I/O engine
                         and 'nrfiles', so that files will be created.
                         This engine is to measure file lookup and meta data access.
  
+               **filedelete**
+                       Simply delete the files with unlink() and do no I/O to them. You need to set 'filesize'
+                       and 'nrfiles', so that the files will be created.
+                       This engine is for measuring file deletion.
+
                 **libpmem**
                         Read and write using mmap I/O to a file on a filesystem
                         mounted with DAX on a persistent memory device through the PMDK
@@ -2038,6 +2085,17 @@ I/O engine
                 **nbd**
                         Read and write a Network Block Device (NBD).
  
+               **libcufile**
+                       I/O engine supporting libcufile synchronous access to nvidia-fs and a
+                       GPUDirect Storage-supported filesystem. This engine performs
+                       I/O without transferring buffers between user-space and the kernel,
+                       unless :option:`verify` is set or :option:`cuda_io` is `posix`.
+                       :option:`iomem` must not be `cudamalloc`. This ioengine defines
+                       engine specific options.
+               **dfs**
+                       I/O engine supporting asynchronous read and write operations to the
+                       DAOS File System (DFS) via libdfs.
+
  I/O engine specific parameters
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  
@@ -2159,7 +2217,7 @@ with the caveat that when used on the command line, they must come after the
                 this will be the starting port number since fio will use a range of
                 ports.
  
-   [rdma]
+   [rdma], [librpma_*]
  
                 The port to use for RDMA-CM communication. This should be the same value
                 on the client and the server side.
@@ -2170,6 +2228,15 @@ with the caveat that when used on the command line, they must come after the
         is a TCP listener or UDP reader, the hostname is not used and must be omitted
         unless it is a valid UDP multicast address.
  
+.. option:: serverip=str : [librpma_*]
+
+       The IP address to be used for RDMA-CM based I/O.
+
+.. option:: direct_write_to_pmem=bool : [librpma_*]
+
+       Set to 1 only when Direct Write to PMem from the remote host is possible.
+       Otherwise, set to 0.
+
  .. option:: interface=str : [netsplice] [net]
  
         The IP address of the network interface used to send or receive UDP
@@ -2266,6 +2333,12 @@ with the caveat that when used on the command line, they must come after the
          Poll store instead of waiting for completion. Usually this provides better
          throughput at cost of higher(up to 100%) CPU utilization.
  
+.. option:: touch_objects=bool : [rados]
+
+        During initialization, touch (create if they do not exist) all objects (files).
+        Touching all objects affects ceph caches and likely impacts test results.
+        Enabled by default.
+
  .. option:: skip_bad=bool : [mtd]
  
         Skip operations against known bad blocks.
@@ -2331,6 +2404,18 @@ with the caveat that when used on the command line, they must come after the
                 transferred to the device. The writefua option is ignored with this
                 selection.
  
+.. option:: hipri : [sg]
+
+       If this option is set, fio will attempt to use polled IO completions.
+       This will have a similar effect as (io_uring)hipri. Only SCSI READ and
+       WRITE commands will have the SGV4_FLAG_HIPRI set (not UNMAP (trim) nor
+       VERIFY). Older versions of the Linux sg driver that do not support
+       hipri will simply ignore this flag and do normal IO. The Linux SCSI
+       Low Level Driver (LLD) that "owns" the device also needs to support
+       hipri (also known as iopoll and mq_poll). The MegaRAID driver is an
+       example of a SCSI LLD. Default: clear (0) which does normal
+       (interrupt-based) IO.
+
  .. option:: http_host=str : [http]
  
         Hostname to connect to. For S3, this could be the bucket hostname.
@@ -2388,6 +2473,46 @@ with the caveat that when used on the command line, they must come after the
         nbd+unix:///?socket=/tmp/socket
         nbds://tlshost/exportname
  
+.. option:: gpu_dev_ids=str : [libcufile]
+
+       Specify the GPU IDs to use with CUDA. This is a colon-separated list of
+       integers. GPUs are assigned to workers round-robin. Default is 0.
+
+.. option:: cuda_io=str : [libcufile]
+
+       Specify the type of I/O to use with CUDA. Default is **cufile**.
+
+       **cufile**
+               Use libcufile and nvidia-fs. This option performs I/O directly
+               between a GPUDirect Storage filesystem and GPU buffers,
+               avoiding use of a bounce buffer. If :option:`verify` is set,
+               cudaMemcpy is used to copy verification data between RAM and GPU.
+               Verification data is copied from RAM to GPU before a write
+               and from GPU to RAM after a read. :option:`direct` must be 1.
+       **posix**
+               Use POSIX to perform I/O with a RAM buffer, and use cudaMemcpy
+               to transfer data between RAM and the GPUs. Data is copied from
+               GPU to RAM before a write and copied from RAM to GPU after a
+               read. :option:`verify` does not affect use of cudaMemcpy.
+
+.. option:: pool=str : [dfs]
+
+       Specify the UUID of the DAOS pool to connect to.
+
+.. option:: cont=str : [dfs]
+
+       Specify the UUID of the DAOS container to open.
+
+.. option:: chunk_size=int : [dfs]
+
+       Specify a different chunk size (in bytes) for the dfs file.
+       Use DAOS container's chunk size by default.
+
+.. option:: object_class=str : [dfs]
+
+       Specify a different object class for the dfs file.
+       Use DAOS container's object class by default.
+
  I/O depth
  ~~~~~~~~~
  
@@ -2482,7 +2607,8 @@ I/O depth
         can increase latencies. The benefit is that fio can manage submission rates
         independently of the device completion rates. This avoids skewed latency
         reporting if I/O gets backed up on the device side (the coordinated omission
-       problem).
+       problem). Note that this option cannot reliably be used with async IO
+       engines.
  
  
  I/O rate
@@ -2511,6 +2637,13 @@ I/O rate
         before we have to complete it and do our :option:`thinktime`. In other words, this
         setting effectively caps the queue depth if the latter is larger.
  
+.. option:: thinktime_blocks_type=str
+
+       Only valid if :option:`thinktime` is set. Controls how :option:`thinktime_blocks`
+       triggers. The default is `complete`, which triggers thinktime when fio completes
+       :option:`thinktime_blocks` blocks. If this is set to `issue`, then the trigger happens
+       at the issue side.
+
  .. option:: rate=int[,int][,int]
  
         Cap the bandwidth used by this job. The number is in bytes/sec, the normal
@@ -2591,11 +2724,12 @@ I/O latency
         true, fio will continue running and try to meet :option:`latency_target`
         by adjusting queue depth.
  
-.. option:: max_latency=time
+.. option:: max_latency=time[,time][,time]
  
         If set, fio will exit the job with an ETIMEDOUT error if it exceeds this
         maximum latency. When the unit is omitted, the value is interpreted in
-       microseconds.
+       microseconds. Comma-separated values may be specified for reads, writes,
+       and trims as described in :option:`blocksize`.
  
  .. option:: rate_cycle=int
  
@@ -2861,15 +2995,10 @@ Threads, processes and job synchronization
         ``flow=8`` and another job has ``flow=-1``, then there will be a roughly 1:8
         ratio in how much one runs vs the other.
  
-.. option:: flow_watermark=int
-
-       The maximum value that the absolute value of the flow counter is allowed to
-       reach before the job must wait for a lower value of the counter.
-
  .. option:: flow_sleep=int
  
-       The period of time, in microseconds, to wait after the flow watermark has
-       been exceeded before retrying operations.
+       The period of time, in microseconds, to wait after the flow counter
+       has exceeded its proportion before retrying operations.
  
  .. option:: stonewall, wait_for_previous
  
@@ -3927,7 +4056,7 @@ will be a disk utilization section.
  Below is a single line containing short names for each of the fields in the
  minimal output v3, separated by semicolons::
  
-        terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth;read_iops;read_runtime_ms;read_slat_min;read_slat_max;read_slat_mean;read_slat_dev;read_clat_min;read_clat_max;read_clat_mean;read_clat_dev;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_tlat_min;read_lat_max;read_lat_mean;read_lat_dev;read_bw_min;read_bw_max;read_bw_agg_pct;read_bw_mean;read_bw_dev;write_kb;write_bandwidth;write_iops;write_runtime_ms;write_slat_min;write_slat_max;write_slat_mean;write_slat_dev;write_clat_min;write_clat_max;write_clat_mean;write_clat_dev;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_tlat_min;write_lat_max;write_lat_mean;write_lat_dev;write_bw_min;write_bw_max;write_bw_agg_pct;write_bw_mean;write_bw_dev;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;write_ticks;disk_queue_time;disk_util
+        terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth_kb;read_iops;read_runtime_ms;read_slat_min_us;read_slat_max_us;read_slat_mean_us;read_slat_dev_us;read_clat_min_us;read_clat_max_us;read_clat_mean_us;read_clat_dev_us;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_tlat_min_us;read_lat_max_us;read_lat_mean_us;read_lat_dev_us;read_bw_min_kb;read_bw_max_kb;read_bw_agg_pct;read_bw_mean_kb;read_bw_dev_kb;write_kb;write_bandwidth_kb;write_iops;write_runtime_ms;write_slat_min_us;write_slat_max_us;write_slat_mean_us;write_slat_dev_us;write_clat_min_us;write_clat_max_us;write_clat_mean_us;write_clat_dev_us;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_tlat_min_us;write_lat_max_us;write_lat_mean_us;write_lat_dev_us;write_bw_min_kb;write_bw_max_kb;write_bw_agg_pct;write_bw_mean_kb;write_bw_dev_kb;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;write_ticks;disk_queue_time;disk_util
  
  In client/server mode terse output differs from what appears when jobs are run
  locally. Disk utilization data is omitted from the standard terse output and