Engines should not touch nr_open_files anymore

diff --git a/HOWTO b/HOWTO

index a9ee7ab12be207af57300370476e263af0e740a0..ea943ee04544408a60776a67807dec5737d741a8 100644
--- a/HOWTO
+++ b/HOWTO
@@ -49,7 +49,7 @@ bottom, it contains the following basic parameters:
  
         IO engine       How do we issue io? We could be memory mapping the
                         file, we could be using regular read/write, we
-                       could be using splice, async io, or even
+                       could be using splice, async io, syslet, or even
                         SG (SCSI generic sg).
  
         IO depth        If the io engine is async, how large a queuing
@@ -108,8 +108,8 @@ to use any ascii name you want, except 'global' which has special meaning.
  A global section sets defaults for the jobs described in that file. A job
  may override a global section parameter, and a job file may even have
  several global sections if so desired. A job is only affected by a global
-section residing above it. If the first character in a line is a ';', the
-entire line is discarded as a comment.
+section residing above it. If the first character in a line is a ';' or a
+'#', the entire line is discarded as a comment.
  
  So lets look at a really simple job file that define to threads, each
  randomly reading from a 128MiB file.
@@ -179,7 +179,10 @@ siint      SI integer. A whole number value, which may contain a postfix
  bool   Boolean. Usually parsed as an integer, however only defined for
         true and false (1 and 0).
  irange Integer range with postfix. Allows value range to be given, such
-       as 1024-4096. Also see siint.
+       as 1024-4096. A colon may also be used as the separator, eg
+       1k:4k. If the option allows two sets of ranges, they can be
+       specified with a ',' or '/' delimiter: 1k-4k/8k-32k. Also see
+       siint.
  
  With the above in mind, here follows the complete list of fio job
  parameters.
@@ -190,13 +193,19 @@ name=str  ASCII name of the job. This may be used to override the
                 special purpose of also signaling the start of a new
                 job.
  
+description=str        Text description of the job. Doesn't do anything except
+               dump this text description when this job is run. It's
+               not parsed.
+
  directory=str  Prefix filenames with this directory. Used to places files
                 in a different location than "./".
  
  filename=str   Fio normally makes up a filename based on the job name,
                 thread number, and file number. If you want to share
                 files between threads in a job or several jobs, specify
-               a filename for each of them to override the default.
+               a filename for each of them to override the default. If
+               the ioengine used is 'net', the filename is the host and
+               port to connect to, in the format filename=host:port.
  
  rw=str         Type of io pattern. Accepted values are:
  
@@ -244,6 +253,14 @@ bs_unaligned       If this option is given, any byte size value within bsrange
  
  nrfiles=int    Number of files to use for this job. Defaults to 1.
  
+file_service_type=str  Defines how fio decides which file from a job to
+               service next. The following types are defined:
+
+                       random  Just choose a file at random.
+
+                       roundrobin  Round robin over open files. This
+                               is the default.
+
  ioengine=str   Defines how the job issues io to the file. The following
                 types are defined:
  
@@ -261,6 +278,9 @@ ioengine=str        Defines how the job issues io to the file. The following
                                 vmsplice(2) to transfer data from user
                                 space to the kernel.
  
+                       syslet-rw Use the syslet system calls to make
+                               regular read/write async.
+
                         sg      SCSI generic sg v3 io. May either be
                                 synchronous using the SG_IO ioctl, or if
                                 the target is an sg character device
@@ -271,11 +291,33 @@ ioengine=str      Defines how the job issues io to the file. The following
                                 to. This is mainly used to exercise fio
                                 itself and for debugging/testing purposes.
  
+                       net     Transfer over the network to given host:port.
+                               'filename' must be set appropriately to
+                               filename=host:port regardless of send
+                               or receive; if receiving, only the port
+                               argument is used.
+
+                       external Prefix to specify loading an external
+                               IO engine object file. Append the engine
+                               filename, eg ioengine=external:/tmp/foo.o
+                               to load ioengine foo.o in /tmp.
+
  iodepth=int    This defines how many io units to keep in flight against
                 the file. The default is 1 for each file defined in this
                 job, can be overridden with a larger value for higher
                 concurrency.
  
+iodepth_batch=int This defines how many pieces of IO to submit at once.
+               It defaults to the same as iodepth, but can be set lower
+               if one so desires.
+
+iodepth_low=int        The low water mark indicating when to start filling
+               the queue again. Defaults to the same as iodepth, meaning
+               that fio will attempt to keep the queue full at all times.
+               If iodepth is set to eg 16 and iodepth_low is set to 4, then
+               after fio has filled the queue of 16 requests, it will let
+               the depth drain down to 4 before starting to fill it again.
+
  direct=bool    If value is true, use non-buffered io. This is usually
                 O_DIRECT.
  
@@ -325,7 +367,14 @@ prioclass=int      Set the io priority class. See man ionice(1).
  
  thinktime=int  Stall the job x microseconds after an io has completed before
                 issuing the next. May be used to simulate processing being
-               done by an application. See thinktime_blocks.
+               done by an application. See thinktime_blocks and
+               thinktime_spin.
+
+thinktime_spin=int
+               Only valid if thinktime is set - pretend to spend CPU time
+               doing something with the data received, before falling back
+               to sleeping for the rest of the period specified by
+               thinktime.
  
  thinktime_blocks
                 Only valid if thinktime is set - control how many blocks
@@ -420,8 +469,9 @@ create_serialize=bool       If true, serialize the file creating for the jobs.
  create_fsync=bool      fsync the data file after creation. This is the
                         default.
  
-unlink=bool    Unlink the job files when done. fio defaults to doing this,
-               if it created the file itself.
+unlink=bool    Unlink the job files when done. Not the default, as repeated
+               runs of that job would then waste time recreating the fileset
+               again and again.
  
  loops=int      Run the specified number of iterations of this job. Used
                 to repeat the same workload a given number of times. Defaults
@@ -446,7 +496,15 @@ stonewall  Wait for preceeding jobs in the job file to exit, before
  
  numjobs=int    Create the specified number of clones of this job. May be
                 used to setup a larger number of threads/processes doing
-               the same thing.
+               the same thing. We regard that grouping of jobs as a
+               specific group.
+
+group_reporting        If 'numjobs' is set, it may be interesting to display
+               statistics for the group as a whole instead of for each
+               individual job. This is especially true if 'numjobs' is
+               large, as looking at individual thread/process output quickly
+               becomes unwieldy. If 'group_reporting' is specified, fio
+               will show the final report per-group instead of per-job.
  
  thread         fio defaults to forking jobs, however if this option is
                 given, fio will use pthread_create(3) to create threads
@@ -500,7 +558,7 @@ cpuchunks=int       If the job is a CPU cycle eater, split the load into
  fio spits out a lot of output. While running, fio will display the
  status of the jobs created. An example of that would be:
  
-Threads running: 1: [_r] [24.79% done] [ 13509/  8334 kb/s] [eta 00h:01m:31s]
+Threads: 1: [_r] [24.8% done] [ 13509/  8334 kb/s] [eta 00h:01m:31s]
  
  The characters inside the square brackets denote the current status of
  each thread. The possible values (in typical life cycle order) are:
@@ -532,10 +590,13 @@ direction, the output looks like:
  
  Client1 (g=0): err= 0:
    write: io=    32MiB, bw=   666KiB/s, runt= 50320msec
-    slat (msec): min=    0, max=  136, avg= 0.03, dev= 1.92
-    clat (msec): min=    0, max=  631, avg=48.50, dev=86.82
-    bw (KiB/s) : min=    0, max= 1196, per=51.00%, avg=664.02, dev=681.68
+    slat (msec): min=    0, max=  136, avg= 0.03, stdev= 1.92
+    clat (msec): min=    0, max=  631, avg=48.50, stdev=86.82
+    bw (KiB/s) : min=    0, max= 1196, per=51.00%, avg=664.02, stdev=681.68
    cpu        : usr=1.49%, sys=0.25%, ctx=7969
+  IO depths    : 1=0.1%, 2=0.3%, 4=0.5%, 8=99.0%, 16=0.0%, 32=0.0%, >32=0.0%
+     lat (msec): 2=1.6%, 4=0.0%, 10=3.2%, 20=12.8%, 50=38.4%, 100=24.8%,
+     lat (msec): 250=15.2%, 500=0.0%, 750=0.0%, 1000=0.0%, >=2048=0.0%
  
  The client number is printed, along with the group id and error of that
  thread. Below is the io statistics, here for writes. In the order listed,
@@ -560,6 +621,17 @@ runt=              The runtime of that thread
                 same disk, since they are then competing for disk access.
  cpu=           CPU usage. User and system time, along with the number
                 of context switches this thread went through.
+IO depths=     The distribution of io depths over the job life time. The
+               numbers are divided into powers of 2, so for example the
+               16= entry includes depths from that value up to, but not
+               including, the next entry. In other words, it covers the
+               range from 16 to 31.
+IO latencies=  The distribution of IO completion latencies. This is the
+               time from when IO leaves fio until it is completed.
+               The numbers follow the same pattern as the IO depths,
+               meaning that 2=1.6% means that 1.6% of the IO completed
+               within 2 msecs, 20=12.8% means that 12.8% of the IO
+               took more than 10 msecs, but less than (or equal to) 20 msecs.
  
  After each client has been listed, the group statistics are printed. They
  will look like this:
@@ -597,10 +669,11 @@ util=             The disk utilization. A value of 100% means we kept the disk
  ----------------
  
  For scripted usage where you typically want to generate tables or graphs
-of the results, fio can output the results in a comma separated format.
+of the results, fio can output the results in a semicolon separated format.
  The format is one long line of values, such as:
  
-client1,0,0,936,331,2894,0,0,0.000000,0.000000,1,170,22.115385,34.290410,16,714,84.252874%,366.500000,566.417819,3496,1237,2894,0,0,0.000000,0.000000,0,246,6.671625,21.436952,0,2534,55.465300%,1406.600000,2008.044216,0.000000%,0.431928%,1109
+client1;0;0;1906777;1090804;1790;0;0;0.000000;0.000000;0;0;0.000000;0.000000;929380;1152890;25.510151%;1078276.333333;128948.113404;0;0;0;0;0;0.000000;0.000000;0;0;0.000000;0.000000;0;0;0.000000%;0.000000;0.000000;100.000000%;0.000000%;324;100.0%;0.0%;0.0%;0.0%;0.0%;0.0%;0.0%;100.0%;0.0%;0.0%;0.0%;0.0%;0.0%
+;0.0%;0.0%;0.0%;0.0%;0.0%
  
  Split up, the format is as follows:
  
@@ -616,4 +689,7 @@ Split up, the format is as follows:
                 Completion latency: min, max, mean, deviation
                 Bw: min, max, aggregate percentage of total, mean, deviation
         CPU usage: user, system, context switches
+       IO depths: <=1, 2, 4, 8, 16, 32, >=64
+       IO latencies: <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000, >=2000
+       Text description
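
The bucketing described under "IO depths=" and "IO latencies=" in this patch
can be sketched in Python as follows. This is an illustration only and is not
taken from fio: the function names are hypothetical, and the bucket edges are
copied from the terse-format lists added above. A depth bucket N covers N up
to, but not including, the next power of 2 (so 16 covers 16 to 31), and a
latency bucket N covers values above the previous edge up to and including N.
The label returned for latencies above 1000 msec is an assumption.

```python
import bisect

# Upper edges (inclusive) of the latency buckets, in msec, as listed in the
# terse output description. Anything above the last edge goes into a final
# catch-all bucket.
LATENCY_BOUNDS_MSEC = [2, 4, 10, 20, 50, 100, 250, 500, 750, 1000]


def depth_bucket(depth):
    """Return the label of the IO depth bucket that `depth` falls into.

    Buckets are powers of 2: 1, 2, 4, 8, 16, 32, and '>=64' for the rest.
    """
    if depth < 1:
        raise ValueError("queue depth must be at least 1")
    if depth >= 64:
        return ">=64"
    # Largest power of 2 that is <= depth, e.g. 16..31 -> 16.
    return str(1 << (depth.bit_length() - 1))


def latency_bucket(msec):
    """Return the label of the latency bucket that `msec` falls into."""
    if msec < 0:
        raise ValueError("latency cannot be negative")
    # Index of the first edge that is >= msec, so 10 < msec <= 20 -> '20'.
    idx = bisect.bisect_left(LATENCY_BOUNDS_MSEC, msec)
    if idx == len(LATENCY_BOUNDS_MSEC):
        return ">1000"  # hypothetical label for the catch-all bucket
    return str(LATENCY_BOUNDS_MSEC[idx])


if __name__ == "__main__":
    for d in (1, 3, 16, 31, 32, 100):
        print("depth", d, "->", depth_bucket(d))
    for lat in (1, 2, 15, 20, 1500):
        print("latency", lat, "msec ->", latency_bucket(lat))
```

Counting how many samples land in each label and dividing by the total gives
percentages in the same shape as the "IO depths" and "lat (msec)" lines of
the sample output.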