HOWTO

   1 How fio works
   2 -------------
   3
The first step in getting fio to simulate a desired I/O workload is writing a
job file describing that specific setup. A job file may contain any number of
threads and/or files. A typical job file contains a *global* section defining
shared parameters, and one or more job sections describing the jobs involved.
When run, fio parses this file and sets everything up as described. If we break
down a job from top to bottom, it contains the following basic parameters:
  11
  12 `I/O type`_
  13
                Defines the I/O pattern issued to the file(s).  We may only be reading
                sequentially from the file(s), or we may be writing randomly, or even
                mixing reads and writes, sequentially or randomly.
                Should we be doing buffered I/O, or direct/raw I/O?
  18
  19 `Block size`_
  20
                How large are the chunks in which we issue I/O? This may be a single
                value, or it may describe a range of block sizes.
  23
  24 `I/O size`_
  25
                How much data are we going to be reading/writing?
  27
  28 `I/O engine`_
  29
  30                 How do we issue I/O? We could be memory mapping the file, we could be
  31                 using regular read/write, we could be using splice, async I/O, or even
  32                 SG (SCSI generic sg).
  33
  34 `I/O depth`_
  35
  36                 If the I/O engine is async, how large a queuing depth do we want to
  37                 maintain?
  38
  39
  40 `Target file/device`_
  41
                How many files are we spreading the workload over?
  43
  44 `Threads, processes and job synchronization`_
  45
                How many threads or processes should we spread this workload over?
  47
The above are the basic parameters defined for a workload. In addition, there is
a multitude of parameters that modify other aspects of how this job behaves.
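
As a rough sketch of how the basic parameters above map onto job file options,
consider the following job. The job name, the file name, and all values are
placeholders chosen for illustration, not recommendations:

.. code-block:: ini

    [example-job]
    ; I/O type: random reads using buffered I/O
    rw=randread
    direct=0
    ; Block size
    bs=4k
    ; I/O size
    size=256m
    ; I/O engine and I/O depth
    ioengine=libaio
    iodepth=16
    ; Target file (hypothetical name, relative to the working directory)
    filename=fio-example.dat
    ; Number of processes
    numjobs=2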
  50
  51
  52 Command line options
  53 --------------------
  54
  55 .. option:: --debug=type
  56
  57     Enable verbose tracing of various fio actions.  May be ``all`` for all types
  58     or individual types separated by a comma (e.g. ``--debug=file,mem`` will
  59     enable file and memory debugging).  Currently, additional logging is
  60     available for:
  61
  62     *process*
  63                         Dump info related to processes.
  64     *file*
  65                         Dump info related to file actions.
  66     *io*
  67                         Dump info related to I/O queuing.
  68     *mem*
  69                         Dump info related to memory allocations.
  70     *blktrace*
  71                         Dump info related to blktrace setup.
  72     *verify*
  73                         Dump info related to I/O verification.
  74     *all*
  75                         Enable all debug options.
  76     *random*
  77                         Dump info related to random offset generation.
  78     *parse*
  79                         Dump info related to option matching and parsing.
  80     *diskutil*
  81                         Dump info related to disk utilization updates.
  82     *job:x*
  83                         Dump info only related to job number x.
  84     *mutex*
  85                         Dump info only related to mutex up/down ops.
  86     *profile*
  87                         Dump info related to profile extensions.
  88     *time*
  89                         Dump info related to internal time keeping.
  90     *net*
  91                         Dump info related to networking connections.
  92     *rate*
  93                         Dump info related to I/O rate switching.
  94     *compress*
  95                         Dump info related to log compress/decompress.
  96     *?* or *help*
  97                         Show available debug options.
  98
  99 .. option:: --parse-only
 100
    Parse options only, don't start any I/O.
 102
 103 .. option:: --output=filename
 104
 105         Write output to file `filename`.
 106
 107 .. option:: --bandwidth-log
 108
 109         Generate aggregate bandwidth logs.
 110
 111 .. option:: --minimal
 112
 113         Print statistics in a terse, semicolon-delimited format.
 114
 115 .. option:: --append-terse
 116
 117     Print statistics in selected mode AND terse, semicolon-delimited format.
 118     **deprecated**, use :option:`--output-format` instead to select multiple
 119     formats.
 120
 121 .. option:: --output-format=type
 122
 123         Set the reporting format to `normal`, `terse`, `json`, or `json+`.  Multiple
 124         formats can be selected, separated by a comma.  `terse` is a CSV based
 125         format.  `json+` is like `json`, except it adds a full dump of the latency
 126         buckets.
 127
 128 .. option:: --terse-version=type
 129
 130         Set terse version output format (default 3, or 2 or 4 or 5).
 131
 132 .. option:: --version
 133
 134         Print version info and exit.
 135
 136 .. option:: --help
 137
 138         Print this page.
 139
 140 .. option:: --cpuclock-test
 141
 142         Perform test and validation of internal CPU clock.
 143
 144 .. option:: --crctest=test
 145
    Test the speed of the built-in checksumming functions. If no argument is
    given, all of them are tested. Alternatively, a comma-separated list can be
    passed, in which case only the listed ones are tested.
 149
 150 .. option:: --cmdhelp=command
 151
 152         Print help information for `command`. May be ``all`` for all commands.
 153
 154 .. option:: --enghelp=[ioengine[,command]]
 155
 156     List all commands defined by :option:`ioengine`, or print help for `command`
 157     defined by :option:`ioengine`.  If no :option:`ioengine` is given, list all
 158     available ioengines.
 159
 160 .. option:: --showcmd=jobfile
 161
 162         Turn a job file into command line options.
 163
 164 .. option:: --readonly
 165
 166     Turn on safety read-only checks, preventing writes.  The ``--readonly``
 167     option is an extra safety guard to prevent users from accidentally starting
 168     a write workload when that is not desired.  Fio will only write if
 169     `rw=write/randwrite/rw/randrw` is given.  This extra safety net can be used
 170     as an extra precaution as ``--readonly`` will also enable a write check in
 171     the I/O engine core to prevent writes due to unknown user space bug(s).
 172
 173 .. option:: --eta=when
 174
        Specify when the real-time ETA estimate should be printed.  May be
        `always`, `never` or `auto`.
 177
 178 .. option:: --eta-newline=time
 179
 180         Force a new line for every `time` period passed.  When the unit is omitted,
 181         the value is interpreted in seconds.
 182
 183 .. option:: --status-interval=time
 184
 185         Force full status dump every `time` period passed.  When the unit is
 186         omitted, the value is interpreted in seconds.
 187
 188 .. option:: --section=name
 189
 190     Only run specified section in job file.  Multiple sections can be specified.
 191     The ``--section`` option allows one to combine related jobs into one file.
 192     E.g. one job file could define light, moderate, and heavy sections. Tell
 193     fio to run only the "heavy" section by giving ``--section=heavy``
 194     command line option.  One can also specify the "write" operations in one
 195     section and "verify" operation in another section.  The ``--section`` option
 196     only applies to job sections.  The reserved *global* section is always
 197     parsed and used.
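
    As an illustration, a job file along the following lines (the file name
    :file:`jobs.fio`, the section names, and all values are hypothetical) could
    be run with ``fio --section=heavy jobs.fio`` to execute only the last job,
    with the *global* section still applied:

    .. code-block:: ini

        ; jobs.fio: hypothetical example with three job sections
        [global]
        rw=randread
        size=128m

        [light]
        bs=4k

        [moderate]
        bs=64k

        [heavy]
        bs=1m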
 198
 199 .. option:: --alloc-size=kb
 200
    Set the internal smalloc pool to this size in kb (default 1024).  The
 202     ``--alloc-size`` switch allows one to use a larger pool size for smalloc.
 203     If running large jobs with randommap enabled, fio can run out of memory.
 204     Smalloc is an internal allocator for shared structures from a fixed size
 205     memory pool. The pool size defaults to 16M and can grow to 8 pools.
 206
 207     NOTE: While running :file:`.fio_smalloc.*` backing store files are visible
 208     in :file:`/tmp`.
 209
 210 .. option:: --warnings-fatal
 211
 212     All fio parser warnings are fatal, causing fio to exit with an
 213     error.
 214
 215 .. option:: --max-jobs=nr
 216
 217         Maximum number of threads/processes to support.
 218
 219 .. option:: --server=args
 220
 221     Start a backend server, with `args` specifying what to listen to.
 222     See `Client/Server`_ section.
 223
 224 .. option:: --daemonize=pidfile
 225
 226     Background a fio server, writing the pid to the given `pidfile` file.
 227
 228 .. option:: --client=hostname
 229
 230     Instead of running the jobs locally, send and run them on the given host or
 231     set of hosts.  See `Client/Server`_ section.
 232
 233 .. option:: --remote-config=file
 234
 235         Tell fio server to load this local file.
 236
 237 .. option:: --idle-prof=option
 238
        Report CPU idleness on a system or per-CPU basis
        (``--idle-prof=system`` or ``--idle-prof=percpu``), or
        run unit work calibration only (``--idle-prof=calibrate``).
 242
 243 .. option:: --inflate-log=log
 244
 245         Inflate and output compressed log.
 246
 247 .. option:: --trigger-file=file
 248
 249         Execute trigger cmd when file exists.
 250
 251 .. option:: --trigger-timeout=t
 252
 253         Execute trigger at this time.
 254
 255 .. option:: --trigger=cmd
 256
 257         Set this command as local trigger.
 258
 259 .. option:: --trigger-remote=cmd
 260
 261         Set this command as remote trigger.
 262
 263 .. option:: --aux-path=path
 264
 265         Use this path for fio state generated files.
 266
 267 Any parameters following the options will be assumed to be job files, unless
 268 they match a job file parameter. Multiple job files can be listed and each job
 269 file will be regarded as a separate group. Fio will :option:`stonewall`
 270 execution between each group.
 271
 272
 273 Job file format
 274 ---------------
 275
 276 As previously described, fio accepts one or more job files describing what it is
 277 supposed to do. The job file format is the classic ini file, where the names
 278 enclosed in [] brackets define the job name. You are free to use any ASCII name
 279 you want, except *global* which has special meaning.  Following the job name is
 280 a sequence of zero or more parameters, one per line, that define the behavior of
 281 the job. If the first character in a line is a ';' or a '#', the entire line is
 282 discarded as a comment.
 283
 284 A *global* section sets defaults for the jobs described in that file. A job may
 285 override a *global* section parameter, and a job file may even have several
 286 *global* sections if so desired. A job is only affected by a *global* section
 287 residing above it.
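
The following sketch illustrates these rules; the job names and values are
made up for this example. ``job1`` takes every setting from the first *global*
section, ``job2`` overrides the block size, and only ``job3`` is affected by the
second *global* section, because only ``job3`` resides below it:

.. code-block:: ini

    [global]
    rw=randread
    size=128m
    bs=4k

    [job1]

    [job2]
    bs=64k

    [global]
    bs=1m

    [job3]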
 288
 289 The :option:`--cmdhelp` option also lists all options. If used with an `option`
 290 argument, :option:`--cmdhelp` will detail the given `option`.
 291
 292 See the `examples/` directory for inspiration on how to write job files.  Note
 293 the copyright and license requirements currently apply to `examples/` files.
 294
 295 So let's look at a really simple job file that defines two processes, each
 296 randomly reading from a 128MiB file:
 297
.. code-block:: ini

    ; -- start job file --
    [global]
    rw=randread
    size=128m

    [job1]

    [job2]

    ; -- end job file --
 310
 311 As you can see, the job file sections themselves are empty as all the described
 312 parameters are shared. As no :option:`filename` option is given, fio makes up a
 313 `filename` for each of the jobs as it sees fit. On the command line, this job
 314 would look as follows::
 315
    $ fio --name=global --rw=randread --size=128m --name=job1 --name=job2
 317
 318
 319 Let's look at an example that has a number of processes writing randomly to
 320 files:
 321
.. code-block:: ini

    ; -- start job file --
    [random-writers]
    ioengine=libaio
    iodepth=4
    rw=randwrite
    bs=32k
    direct=0
    size=64m
    numjobs=4
    ; -- end job file --
 334
Here we have no *global* section, as we only have one job defined anyway.  We
want to use async I/O here, with a depth of 4 for each file. We also increase
the block size to 32KiB and set numjobs to 4 to fork 4 identical
jobs. The result is 4 processes each randomly writing to their own 64MiB
file. Instead of using the above job file, you could have given the parameters
on the command line. For this case, you would specify::
 341
    $ fio --name=random-writers --ioengine=libaio --iodepth=4 --rw=randwrite --bs=32k --direct=0 --size=64m --numjobs=4
 343
 344 When fio is utilized as a basis of any reasonably large test suite, it might be
 345 desirable to share a set of standardized settings across multiple job files.
 346 Instead of copy/pasting such settings, any section may pull in an external
 347 :file:`filename.fio` file with *include filename* directive, as in the following
 348 example::
 349
    ; -- start job file including.fio --
    [global]
    filename=/tmp/test
    filesize=1m
    include glob-include.fio

    [test]
    rw=randread
    bs=4k
    time_based=1
    runtime=10
    include test-include.fio
    ; -- end job file including.fio --
 363
.. code-block:: ini

    ; -- start job file glob-include.fio --
    thread=1
    group_reporting=1
    ; -- end job file glob-include.fio --
 370
.. code-block:: ini

    ; -- start job file test-include.fio --
    ioengine=libaio
    iodepth=4
    ; -- end job file test-include.fio --
 377
 378 Settings pulled into a section apply to that section only (except *global*
 379 section). Include directives may be nested in that any included file may contain
 380 further include directive(s). Include files may not contain [] sections.
 381
 382
 383 Environment variables
 384 ~~~~~~~~~~~~~~~~~~~~~
 385
 386 Fio also supports environment variable expansion in job files. Any sub-string of
 387 the form ``${VARNAME}`` as part of an option value (in other words, on the right
 388 of the '='), will be expanded to the value of the environment variable called
 389 `VARNAME`.  If no such environment variable is defined, or `VARNAME` is the
 390 empty string, the empty string will be substituted.
 391
 392 As an example, let's look at a sample fio invocation and job file::
 393
    $ SIZE=64m NUMJOBS=4 fio jobfile.fio
 395
.. code-block:: ini

    ; -- start job file --
    [random-writers]
    rw=randwrite
    size=${SIZE}
    numjobs=${NUMJOBS}
    ; -- end job file --
 404
 405 This will expand to the following equivalent job file at runtime:
 406
.. code-block:: ini

    ; -- start job file --
    [random-writers]
    rw=randwrite
    size=64m
    numjobs=4
    ; -- end job file --
 415
Fio ships with a few example job files; you can also look there for inspiration.
 417
 418 Reserved keywords
 419 ~~~~~~~~~~~~~~~~~
 420
 421 Additionally, fio has a set of reserved keywords that will be replaced
 422 internally with the appropriate value. Those keywords are:
 423
 424 **$pagesize**
 425
 426         The architecture page size of the running system.
 427
 428 **$mb_memory**
 429
 430         Megabytes of total memory in the system.
 431
 432 **$ncpus**
 433
 434         Number of online available CPUs.
 435
 436 These can be used on the command line or in the job file, and will be
 437 automatically substituted with the current system values when the job is
 438 run. Simple math is also supported on these keywords, so you can perform actions
 439 like::
 440
        size=8*$mb_memory
 442
 443 and get that properly expanded to 8 times the size of memory in the machine.
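
For instance, a job section might use these keywords as in the sketch below.
The job name and the choice of options are assumptions made for illustration:

.. code-block:: ini

    [keyword-demo]
    rw=randread
    size=128m
    ; issue I/O in units of the system page size
    bs=$pagesize
    ; simple math on a keyword: two clones per online CPU
    numjobs=2*$ncpus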
 444
 445
 446 Job file parameters
 447 -------------------
 448
This section describes in detail each parameter associated with a job.  Some
 450 parameters take an option of a given type, such as an integer or a
 451 string. Anywhere a numeric value is required, an arithmetic expression may be
 452 used, provided it is surrounded by parentheses. Supported operators are:
 453
 454         - addition (+)
 455         - subtraction (-)
 456         - multiplication (*)
 457         - division (/)
 458         - modulus (%)
 459         - exponentiation (^)
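
As a small sketch of the syntax (the job name and values are illustrative
assumptions), the following job uses parenthesized expressions for two data
size options:

.. code-block:: ini

    [expr-demo]
    rw=read
    ; 4 * 1024 = 4096 bytes
    bs=(4*1024)
    ; 64 * 1024 * 1024 bytes, i.e. 64MiB
    size=(64*1024*1024)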
 460
For time values in expressions, units are microseconds by default. This differs
from time values that are not in expressions (not enclosed in parentheses).
The following types are used:
 464
 465
 466 Parameter types
 467 ~~~~~~~~~~~~~~~
 468
 469 **str**
 470     String. This is a sequence of alpha characters.
 471
 472 **time**
        Integer with possible time suffix.  Without a unit, the value is interpreted
        as seconds unless otherwise specified.  Accepts a suffix of 'd' for days, 'h'
        for hours, 'm' for minutes, 's' for seconds, 'ms' (or 'msec') for milliseconds
        and 'us' (or 'usec') for microseconds.  For example, use 10m for 10 minutes.
 477
 478 .. _int:
 479
 480 **int**
 481         Integer. A whole number value, which may contain an integer prefix
 482         and an integer suffix:
 483
 484         [*integer prefix*] **number** [*integer suffix*]
 485
 486         The optional *integer prefix* specifies the number's base. The default
 487         is decimal. *0x* specifies hexadecimal.
 488
 489         The optional *integer suffix* specifies the number's units, and includes an
 490         optional unit prefix and an optional unit.  For quantities of data, the
 491         default unit is bytes. For quantities of time, the default unit is seconds
 492         unless otherwise specified.
 493
 494         With :option:`kb_base`\=1000, fio follows international standards for unit
 495         prefixes.  To specify power-of-10 decimal values defined in the
 496         International System of Units (SI):
 497
                * *K* -- means kilo (K) or 1000
                * *M* -- means mega (M) or 1000**2
                * *G* -- means giga (G) or 1000**3
                * *T* -- means tera (T) or 1000**4
                * *P* -- means peta (P) or 1000**5

        To specify power-of-2 binary values defined in IEC 80000-13:

                * *Ki* -- means kibi (Ki) or 1024
                * *Mi* -- means mebi (Mi) or 1024**2
                * *Gi* -- means gibi (Gi) or 1024**3
                * *Ti* -- means tebi (Ti) or 1024**4
                * *Pi* -- means pebi (Pi) or 1024**5
 511
 512         With :option:`kb_base`\=1024 (the default), the unit prefixes are opposite
 513         from those specified in the SI and IEC 80000-13 standards to provide
 514         compatibility with old scripts.  For example, 4k means 4096.
 515
 516         For quantities of data, an optional unit of 'B' may be included
 517         (e.g.,  'kB' is the same as 'k').
 518
 519         The *integer suffix* is not case sensitive (e.g., m/mi mean mebi/mega,
 520         not milli). 'b' and 'B' both mean byte, not bit.
 521
 522         Examples with :option:`kb_base`\=1000:
 523
 524                 * *4 KiB*: 4096, 4096b, 4096B, 4ki, 4kib, 4kiB, 4Ki, 4KiB
 525                 * *1 MiB*: 1048576, 1mi, 1024ki
 526                 * *1 MB*: 1000000, 1m, 1000k
 527                 * *1 TiB*: 1099511627776, 1ti, 1024gi, 1048576mi
                * *1 TB*: 1000000000000, 1t, 1000g, 1000000m
 529
 530         Examples with :option:`kb_base`\=1024 (default):
 531
 532                 * *4 KiB*: 4096, 4096b, 4096B, 4k, 4kb, 4kB, 4K, 4KB
 533                 * *1 MiB*: 1048576, 1m, 1024k
 534                 * *1 MB*: 1000000, 1mi, 1000ki
 535                 * *1 TiB*: 1099511627776, 1t, 1024g, 1048576m
                * *1 TB*: 1000000000000, 1ti, 1000gi, 1000000mi
 537
 538         To specify times (units are not case sensitive):
 539
 540                 * *D* -- means days
 541                 * *H* -- means hours
 542                 * *M* -- means minutes
                * *s* -- or *sec* means seconds (default)
 544                 * *ms* -- or *msec* means milliseconds
 545                 * *us* -- or *usec* means microseconds
 546
 547         If the option accepts an upper and lower range, use a colon ':' or
 548         minus '-' to separate such values. See :ref:`irange <irange>`.
 549         If the lower value specified happens to be larger than the upper value
 550         the two values are swapped.
 551
 552 .. _bool:
 553
 554 **bool**
 555         Boolean. Usually parsed as an integer, however only defined for
 556         true and false (1 and 0).
 557
 558 .. _irange:
 559
 560 **irange**
 561         Integer range with suffix. Allows value range to be given, such as
 562         1024-4096. A colon may also be used as the separator, e.g. 1k:4k. If the
 563         option allows two sets of ranges, they can be specified with a ',' or '/'
 564         delimiter: 1k-4k/8k-32k. Also see :ref:`int <int>`.
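
        As a hedged illustration, the job below uses the range syntax with
        :option:`startdelay` and with ``bsrange``, a block size range option
        that is not covered in this part of the document. The job name and all
        values are arbitrary:

        .. code-block:: ini

            [range-demo]
            rw=randrw
            size=64m
            ; each thread picks a start delay between 1 and 5 seconds
            startdelay=1-5
            ; reads use 1k-4k blocks, writes use 8k-32k blocks
            bsrange=1k-4k,8k-32k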
 565
 566 **float_list**
 567         A list of floating point numbers, separated by a ':' character.
 568
 569
 570 Units
 571 ~~~~~
 572
 573 .. option:: kb_base=int
 574
 575         Select the interpretation of unit prefixes in input parameters.
 576
 577                 **1000**
 578                         Inputs comply with IEC 80000-13 and the International
 579                         System of Units (SI). Use:
 580
 581                                 - power-of-2 values with IEC prefixes (e.g., KiB)
 582                                 - power-of-10 values with SI prefixes (e.g., kB)
 583
 584                 **1024**
 585                         Compatibility mode (default).  To avoid breaking old scripts:
 586
 587                                 - power-of-2 values with SI prefixes
 588                                 - power-of-10 values with IEC prefixes
 589
 590         See :option:`bs` for more details on input parameters.
 591
 592         Outputs always use correct prefixes.  Most outputs include both
 593         side-by-side, like::
 594
                bw=2383.3kB/s (2327.4KiB/s)
 596
 597         If only one value is reported, then kb_base selects the one to use:
 598
 599                 **1000** -- SI prefixes
 600
 601                 **1024** -- IEC prefixes
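
        As a minimal sketch of the difference (the job name and values are chosen
        for illustration), the job below selects ``kb_base=1000``, so IEC prefixes
        give power-of-2 values and SI prefixes give power-of-10 values:

        .. code-block:: ini

            [units-demo]
            kb_base=1000
            rw=read
            ; 4ki is 4 * 1024 = 4096 bytes here, whereas 4k would be 4000 bytes
            bs=4ki
            ; 1g is 1000**3 bytes here, whereas 1gi would be 1024**3 bytes
            size=1g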
 602
 603 .. option:: unit_base=int
 604
 605         Base unit for reporting.  Allowed values are:
 606
 607         **0**
 608                 Use auto-detection (default).
 609         **8**
 610                 Byte based.
 611         **1**
 612                 Bit based.
 613
 614
 615 With the above in mind, here follows the complete list of fio job parameters.
 616
 617
 618 Job description
 619 ~~~~~~~~~~~~~~~
 620
 621 .. option:: name=str
 622
 623         ASCII name of the job. This may be used to override the name printed by fio
 624         for this job. Otherwise the job name is used. On the command line this
 625         parameter has the special purpose of also signaling the start of a new job.
 626
 627 .. option:: description=str
 628
 629         Text description of the job. Doesn't do anything except dump this text
 630         description when this job is run. It's not parsed.
 631
 632 .. option:: loops=int
 633
 634         Run the specified number of iterations of this job. Used to repeat the same
 635         workload a given number of times. Defaults to 1.
 636
 637 .. option:: numjobs=int
 638
        Create the specified number of clones of this job. Each clone of the job
        is spawned as an independent thread or process. May be used to set up a
 641         larger number of threads/processes doing the same thing. Each thread is
 642         reported separately; to see statistics for all clones as a whole, use
 643         :option:`group_reporting` in conjunction with :option:`new_group`.
 644         See :option:`--max-jobs`.  Default: 1.
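
        For example, the following sketch (job name and values are illustrative)
        starts four clones and reports them as a single aggregate:

        .. code-block:: ini

            [clones]
            rw=randread
            size=64m
            numjobs=4
            group_reporting=1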
 645
 646
 647 Time related parameters
 648 ~~~~~~~~~~~~~~~~~~~~~~~
 649
 650 .. option:: runtime=time
 651
 652         Tell fio to terminate processing after the specified period of time.  It
 653         can be quite hard to determine for how long a specified job will run, so
 654         this parameter is handy to cap the total runtime to a given time.  When
        the unit is omitted, the value is interpreted in seconds.
 656
 657 .. option:: time_based
 658
 659         If set, fio will run for the duration of the :option:`runtime` specified
 660         even if the file(s) are completely read or written. It will simply loop over
 661         the same workload as many times as the :option:`runtime` allows.
 662
 663 .. option:: startdelay=irange(time)
 664
 665         Delay the start of job for the specified amount of time.  Can be a single
 666         value or a range.  When given as a range, each thread will choose a value
 667         randomly from within the range.  Value is in seconds if a unit is omitted.
 668
 669 .. option:: ramp_time=time
 670
 671         If set, fio will run the specified workload for this amount of time before
 672         logging any performance numbers. Useful for letting performance settle
 673         before logging results, thus minimizing the runtime required for stable
        results. Note that ``ramp_time`` is considered lead-in time for a job,
        so it will increase the total runtime if a special timeout or
        :option:`runtime` is specified.  When the unit is omitted, the value is
        given in seconds.
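
        The time related options are often combined. In the following sketch
        (job name and values are illustrative assumptions), the job waits
        5 seconds before starting, runs for 10 seconds without logging
        performance numbers, and then runs for a further 60 seconds, looping
        over the 128MiB file if it is exhausted earlier:

        .. code-block:: ini

            [timed-read]
            rw=randread
            size=128m
            startdelay=5
            ramp_time=10
            runtime=60
            time_based=1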
 678
 679 .. option:: clocksource=str
 680
 681         Use the given clocksource as the base of timing. The supported options are:
 682
 683                 **gettimeofday**
 684                         :manpage:`gettimeofday(2)`
 685
 686                 **clock_gettime**
 687                         :manpage:`clock_gettime(2)`
 688
 689                 **cpu**
 690                         Internal CPU clock source
 691
        cpu is the preferred clocksource if it is reliable, as it is very fast (and
        fio is heavy on time calls). Fio will automatically use this clocksource if
        it's supported and considered reliable on the system it is running on,
        unless another clocksource is specifically set. For x86/x86-64 CPUs, this
        means the CPU must support an invariant TSC.
 697
 698 .. option:: gtod_reduce=bool
 699
 700         Enable all of the :manpage:`gettimeofday(2)` reducing options
 701         (:option:`disable_clat`, :option:`disable_slat`, :option:`disable_bw_measurement`) plus
 702         reduce precision of the timeout somewhat to really shrink the
 703         :manpage:`gettimeofday(2)` call count. With this option enabled, we only do
 704         about 0.4% of the :manpage:`gettimeofday(2)` calls we would have done if all
 705         time keeping was enabled.
 706
 707 .. option:: gtod_cpu=int
 708
 709         Sometimes it's cheaper to dedicate a single thread of execution to just
 710         getting the current time. Fio (and databases, for instance) are very
 711         intensive on :manpage:`gettimeofday(2)` calls. With this option, you can set
 712         one CPU aside for doing nothing but logging current time to a shared memory
 713         location. Then the other threads/processes that run I/O workloads need only
 714         copy that segment, instead of entering the kernel with a
 715         :manpage:`gettimeofday(2)` call. The CPU set aside for doing these time
 716         calls will be excluded from other uses. Fio will manually clear it from the
 717         CPU mask of other jobs.
 718
 719
 720 Target file/device
 721 ~~~~~~~~~~~~~~~~~~
 722
 723 .. option:: directory=str
 724
        Prefix filenames with this directory. Used to place files in a different
        location than :file:`./`.  You can specify a number of directories by
        separating the names with a ':' character. These directories will be
        distributed equally among the job clones created with :option:`numjobs`, as
        long as they are using generated filenames. If specific `filename(s)` are
        set, fio will use the first listed directory, thereby matching the
        `filename` semantic, which generates a file for each clone if not specified
        but lets all clones use the same file if set.
 733
 734         See the :option:`filename` option for escaping certain characters.
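
        A sketch with two hypothetical mount points follows; the job name, paths
        and values are assumptions. Because no `filename` is given, the four
        clones use generated filenames and are spread over the two directories:

        .. code-block:: ini

            [spread]
            rw=write
            size=64m
            numjobs=4
            directory=/mnt/disk1:/mnt/disk2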
 735
 736 .. option:: filename=str
 737
 738         Fio normally makes up a `filename` based on the job name, thread number, and
 739         file number. If you want to share files between threads in a job or several
 740         jobs with fixed file paths, specify a `filename` for each of them to override
 741         the default. If the ioengine is file based, you can specify a number of files
 742         by separating the names with a ':' colon. So if you wanted a job to open
 743         :file:`/dev/sda` and :file:`/dev/sdb` as the two working files, you would use
 744         ``filename=/dev/sda:/dev/sdb``. This also means that whenever this option is
 745         specified, :option:`nrfiles` is ignored. The size of regular files specified
 746         by this option will be :option:`size` divided by number of files unless
 747         explicit size is specified by :option:`filesize`.
 748
 749         On Windows, disk devices are accessed as :file:`\\\\.\\PhysicalDrive0` for
 750         the first device, :file:`\\\\.\\PhysicalDrive1` for the second etc.
 751         Note: Windows and FreeBSD prevent write access to areas
 752         of the disk containing in-use data (e.g. filesystems).  If the wanted
 753         `filename` does need to include a colon, then escape that with a ``\``
 754         character. For instance, if the `filename` is :file:`/dev/dsk/foo@3,0:c`,
 755         then you would use ``filename="/dev/dsk/foo@3,0\:c"``.  The
 756         :file:`-` is a reserved name, meaning stdin or stdout.  Which of the two
 757         depends on the read/write direction set.
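
        As a sketch, the two-device case mentioned above could be written as the
        job below. The job name is made up, and a read workload is used here so
        that the devices are not modified:

        .. code-block:: ini

            [two-devices]
            rw=randread
            filename=/dev/sda:/dev/sdb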
 758
 759 .. option:: filename_format=str
 760
 761         If sharing multiple files between jobs, it is usually necessary to have fio
 762         generate the exact names that you want. By default, fio will name a file
 763         based on the default file format specification of
 764         :file:`jobname.jobnumber.filenumber`. With this option, that can be
 765         customized. Fio will recognize and replace the following keywords in this
 766         string:
 767
 768                 **$jobname**
 769                                 The name of the worker thread or process.
 770                 **$jobnum**
 771                                 The incremental number of the worker thread or process.
 772                 **$filenum**
 773                                 The incremental number of the file for that worker thread or
 774                                 process.
 775
 776         To have dependent jobs share a set of files, this option can be set to have
 777         fio generate filenames that are shared between the two. For instance, if
 778         :file:`testfiles.$filenum` is specified, file number 4 for any job will be
 779         named :file:`testfiles.4`. The default of :file:`$jobname.$jobnum.$filenum`
 780         will be used if no other format specifier is given.
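
        A sketch of the shared-files case, with hypothetical job names and sizes:
        a writer lays out four files, and a reader that waits for the writer
        (using :option:`stonewall`) then reads the same files, since both jobs
        generate the names :file:`testfiles.0` through :file:`testfiles.3`:

        .. code-block:: ini

            [global]
            nrfiles=4
            size=64m
            filename_format=testfiles.$filenum

            [writer]
            rw=write

            [reader]
            ; wait for the preceding job to finish before starting
            stonewall
            rw=read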
 781
 782 .. option:: unique_filename=bool
 783
 784         To avoid collisions between networked clients, fio defaults to prefixing any
 785         generated filenames (with a directory specified) with the source of the
 786         client connecting. To disable this behavior, set this option to 0.
 787
 788 .. option:: opendir=str
 789
 790         Recursively open any files below directory `str`.
 791
 792 .. option:: lockfile=str
 793
 794         Fio defaults to not locking any files before it does I/O to them. If a file
 795         or file descriptor is shared, fio can serialize I/O to that file to make the
 796         end result consistent. This is useful for emulating real workloads that share
 797         files. The lock modes are:
 798
 799                 **none**
 800                         No locking. The default.
 801                 **exclusive**
 802                         Only one thread or process may do I/O at a time, excluding all
 803                         others.
 804                 **readwrite**
 805                         Read-write locking on the file. Many readers may
 806                         access the file at the same time, but writes get exclusive access.
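
             A minimal sketch of read-write locking on a shared file might look like
             this. The file path, sizes, and job names are assumptions chosen for
             illustration::

                     [global]
                     ; hypothetical shared data file
                     filename=/tmp/fio-shared.dat
                     size=128m
                     lockfile=readwrite

                     [readers]
                     rw=randread
                     numjobs=4

                     [writer]
                     rw=randwrite

             Here the four reader jobs may access the file concurrently, while the
             writer gets exclusive access for its writes.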
 807
 808 .. option:: nrfiles=int
 809
 810         Number of files to use for this job. Defaults to 1. The size of files
 811         will be :option:`size` divided by this unless explicit size is specified by
 812         :option:`filesize`. Files are created for each thread separately, and each
 813         file will have a file number within its name by default, as explained in
 814         the :option:`filename` section.
 815
 816
 817 .. option:: openfiles=int
 818
 819         Number of files to keep open at the same time. Defaults to the same as
 820         :option:`nrfiles`; can be set smaller to limit the number of simultaneous
 821         opens.
 822
 823 .. option:: file_service_type=str
 824
 825         Defines how fio decides which file from a job to service next. The following
 826         types are defined:
 827
 828                 **random**
 829                         Choose a file at random.
 830
 831                 **roundrobin**
 832                         Round robin over opened files. This is the default.
 833
 834                 **sequential**
 835                         Finish one file before moving on to the next. Multiple files can
 836                         still be open depending on :option:`openfiles`.
 837
 838                 **zipf**
 839                         Use a *Zipf* distribution to decide what file to access.
 840
 841                 **pareto**
 842                         Use a *Pareto* distribution to decide what file to access.
 843
 844                 **gauss**
 845                         Use a *Gaussian* (normal) distribution to decide what file to
 846                         access.
 847
 848         For *random*, *roundrobin*, and *sequential*, a postfix can be appended to
 849         tell fio how many I/Os to issue before switching to a new file. For example,
 850         specifying ``file_service_type=random:8`` would cause fio to issue
 851         8 I/Os before selecting a new file at random. For the non-uniform
 852         distributions, a floating point postfix can be given to influence how the
 853         distribution is skewed. See :option:`random_distribution` for a description
 854         of how that would work.
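
             For illustration, a job spreading random reads over several files could
             be sketched as below. The directory, file counts, and sizes are
             hypothetical::

                     [multi-file]
                     ; hypothetical scratch directory
                     directory=/tmp/fio-test
                     ; 16 files of size/nrfiles = 16m each
                     nrfiles=16
                     size=256m
                     ; keep at most 4 files open at once
                     openfiles=4
                     rw=randread
                     ; issue 8 I/Os to a file, then pick another at random
                     file_service_type=random:8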
 855
 856 .. option:: ioscheduler=str
 857
 858         Attempt to switch the device hosting the file to the specified I/O scheduler
 859         before running.
 860
 861 .. option:: create_serialize=bool
 862
 863         If true, serialize the file creation for the jobs.  This may be handy to
 864         avoid interleaving of data files, which may greatly depend on the filesystem
 865         used and even the number of processors in the system.  Default: true.
 866
 867 .. option:: create_fsync=bool
 868
 869         fsync the data file after creation. This is the default.
 870
 871 .. option:: create_on_open=bool
 872
 873         Don't pre-setup the files for I/O; just create them with open() when it's
 874         time to do I/O to that file.  Default: false.
 875
 876 .. option:: create_only=bool
 877
 878         If true, fio will only run the setup phase of the job.  If files need to be
 879         laid out or updated on disk, only that will be done -- the actual job contents
 880         are not executed.  Default: false.
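
             As a sketch, a job that only lays out its files and then stops could look
             like this. The directory, job name, and sizes are assumptions::

                     [layout-only]
                     ; hypothetical target directory
                     directory=/tmp/fio-test
                     nrfiles=8
                     size=1g
                     rw=write
                     ; run the setup phase only, do not execute the workload
                     create_only=1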
 881
 882 .. option:: allow_file_create=bool
 883
 884         If true, fio is permitted to create files as part of its workload. This is
 885         the default behavior. If this option is false, then fio will error out if
 886         the files it needs to use don't already exist. Default: true.
 887
 888 .. option:: allow_mounted_write=bool
 889
 890         If this isn't set, fio will abort jobs that are destructive (e.g. that write)
 891         to what appears to be a mounted device or partition. This should help catch
 892         creating inadvertently destructive tests, not realizing that the test will
 893         destroy data on the mounted file system. Note that some platforms don't allow
 894         writing against a mounted device regardless of this option. Default: false.
 895
 896 .. option:: pre_read=bool
 897
 898         If this is given, files will be pre-read into memory before starting the
 899         given I/O operation. This will also clear the :option:`invalidate` flag,
 900         since it is pointless to pre-read and then drop the cache. This will only
 901         work for I/O engines that are seekable, since they allow you to read the
 902         same data multiple times. Thus it will not work on non-seekable I/O engines
 903         (e.g. network, splice). Default: false.
 904
 905 .. option:: unlink=bool
 906
 907         Unlink the job files when done. Not the default, as repeated runs of that
 908         job would then waste time recreating the file set again and again. Default:
 909         false.
 910
 911 .. option:: unlink_each_loop=bool
 912
 913         Unlink job files after each iteration or loop.  Default: false.
 914
 915 .. option:: zonesize=int
 916
 917         Divide a file into zones of the specified size. See :option:`zoneskip`.
 918
 919 .. option:: zonerange=int
 920
 921         Give size of an I/O zone.  See :option:`zoneskip`.
 922
 923 .. option:: zoneskip=int
 924
 925         Skip the specified number of bytes when :option:`zonesize` data has been
 926         read. The two zone options can be used to only do I/O on zones of a file.
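
             A minimal sketch of zoned sequential reads, assuming a hypothetical 1 GiB
             test file, might be::

                     [zoned-read]
                     ; hypothetical test file
                     filename=/tmp/fio-zones.dat
                     size=1g
                     rw=read
                     bs=64k
                     ; read 16m, then skip the next 48m, and repeat
                     zonesize=16m
                     zoneskip=48m

             With these values roughly one quarter of the file would be touched.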
 927
 928
 929 I/O type
 930 ~~~~~~~~
 931
 932 .. option:: direct=bool
 933
 934         If value is true, use non-buffered I/O. This is usually O_DIRECT. Note that
 935         ZFS on Solaris doesn't support direct I/O.  On Windows the synchronous
 936         ioengines don't support direct I/O.  Default: false.
 937
 938 .. option:: atomic=bool
 939
 940         If value is true, attempt to use atomic direct I/O. Atomic writes are
 941         guaranteed to be stable once acknowledged by the operating system. Only
 942         Linux supports O_ATOMIC right now.
 943
 944 .. option:: buffered=bool
 945
 946         If value is true, use buffered I/O. This is the opposite of the
 947         :option:`direct` option. Defaults to true.
 948
 949 .. option:: readwrite=str, rw=str
 950
 951         Type of I/O pattern. Accepted values are:
 952
 953                 **read**
 954                                 Sequential reads.
 955                 **write**
 956                                 Sequential writes.
 957                 **trim**
 958                                 Sequential trims (Linux block devices only).
 959                 **randwrite**
 960                                 Random writes.
 961                 **randread**
 962                                 Random reads.
 963                 **randtrim**
 964                                 Random trims (Linux block devices only).
 965                 **rw,readwrite**
 966                                 Sequential mixed reads and writes.
 967                 **randrw**
 968                                 Random mixed reads and writes.
 969                 **trimwrite**
 970                                 Sequential trim+write sequences. Blocks will be trimmed first,
 971                                 then the same blocks will be written to.
 972
 973         Fio defaults to read if the option is not specified.  For the mixed I/O
 974         types, the default is to split them 50/50.  For certain types of I/O the
 975         result may still be skewed a bit, since the speed may be different. It is
 976         possible to specify a number of I/Os to do before getting a new offset;
 977         this is done by appending a ``:<nr>`` to the end of the string given.  For a
 978         random read, it would look like ``rw=randread:8`` for passing in an offset
 979         modifier with a value of 8. If the suffix is used with a sequential I/O
 980         pattern, then the value specified will be added to the generated offset for
 981         each I/O.  For instance, using ``rw=write:4k`` will skip 4k for every
 982         write. It turns sequential I/O into sequential I/O with holes.  See the
 983         :option:`rw_sequencer` option.
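
             The two uses of the suffix can be sketched in one job file. The file
             path, sizes, and job names below are hypothetical::

                     [global]
                     ; hypothetical test file
                     filename=/tmp/fio-rw.dat
                     size=256m
                     bs=4k

                     ; random reads, picking a new random offset every 8 I/Os
                     [rand-batches]
                     rw=randread:8

                     ; sequential writes that skip 4k after every write
                     [seq-holes]
                     rw=write:4k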
 984
 985 .. option:: rw_sequencer=str
 986
 987         If an offset modifier is given by appending a number to the ``rw=<str>``
 988         line, then this option controls how that number modifies the I/O offset
 989         being generated. Accepted values are:
 990
 991                 **sequential**
 992                         Generate sequential offset.
 993                 **identical**
 994                         Generate the same offset.
 995
 996         ``sequential`` is only useful for random I/O, where fio would normally
 997         generate a new random offset for every I/O. If you append e.g. 8 to randread,
 998         you would get a new random offset for every 8 I/Os. The result would be a
 999         seek for only every 8 I/Os, instead of for every I/O. Use ``rw=randread:8``
1000         to specify that. As sequential I/O is already sequential, setting
1001         ``sequential`` for that would not result in any differences.  ``identical``
1002         behaves in a similar fashion, except it sends the same offset 8
1003         times before generating a new offset.
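
             As a hedged sketch, the **identical** mode could be combined with an
             offset modifier like this (file path and sizes are assumptions)::

                     [repeat-offsets]
                     ; hypothetical test file
                     filename=/tmp/fio-seq.dat
                     size=256m
                     bs=4k
                     ; the :8 suffix is the offset modifier
                     rw=randread:8
                     ; reuse each generated offset 8 times before picking a new one
                     rw_sequencer=identical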
1004
1005 .. option:: unified_rw_reporting=bool
1006
1007         Fio normally reports statistics on a per data direction basis, meaning that
1008         reads, writes, and trims are accounted and reported separately. If this
1009         option is set, fio sums the results and reports them as "mixed" instead.
1010
1011 .. option:: randrepeat=bool
1012
1013         Seed the random number generator used for random I/O patterns in a
1014         predictable way so the pattern is repeatable across runs. Default: true.
1015
1016 .. option:: allrandrepeat=bool
1017
1018         Seed all random number generators in a predictable way so results are
1019         repeatable across runs.  Default: false.
1020
1021 .. option:: randseed=int
1022
1023         Seed the random number generators based on this seed value, to be able to
1024         control what sequence of output is being generated.  If not set, the random
1025         sequence depends on the :option:`randrepeat` setting.
1026
1027 .. option:: fallocate=str
1028
1029         Whether pre-allocation is performed when laying down files.
1030         Accepted values are:
1031
1032                 **none**
1033                         Do not pre-allocate space.
1034
1035                 **posix**
1036                         Pre-allocate via :manpage:`posix_fallocate(3)`.
1037
1038                 **keep**
1039                         Pre-allocate via :manpage:`fallocate(2)` with
1040                         FALLOC_FL_KEEP_SIZE set.
1041
1042                 **0**
1043                         Backward-compatible alias for **none**.
1044
1045                 **1**
1046                         Backward-compatible alias for **posix**.
1047
1048         May not be available on all supported platforms. **keep** is only available
1049         on Linux. If using ZFS on Solaris this must be set to **none** because ZFS
1050         doesn't support it. Default: **posix**.
1051
1052 .. option:: fadvise_hint=str
1053
1054         Use :manpage:`posix_fadvise(2)` to advise the kernel on what I/O patterns
1055         are likely to be issued.  Accepted values are:
1056
1057                 **0**
1058                         Backwards-compatible hint for "no hint".
1059
1060                 **1**
1061                         Backwards-compatible hint for "advise with fio workload type". This
1062                         uses **FADV_RANDOM** for a random workload, and **FADV_SEQUENTIAL**
1063                         for a sequential workload.
1064
1065                 **sequential**
1066                         Advise using **FADV_SEQUENTIAL**.
1067
1068                 **random**
1069                         Advise using **FADV_RANDOM**.
1070
1071 .. option:: fadvise_stream=int
1072
1073         Use :manpage:`posix_fadvise(2)` to advise the kernel what stream ID the
1074         writes issued belong to. Only supported on Linux. Note, this option may
1075         change going forward.
1076
1077 .. option:: offset=int
1078
1079         Start I/O at the provided offset in the file, given as either a fixed size or
1080         a percentage. If a percentage is given, the next ``blockalign``-ed offset
1081         will be used. Data before the given offset will not be touched. This
1082         effectively caps the file size at `real_size - offset`. Can be combined with
1083         :option:`size` to constrain the start and end range of the I/O workload.
1084
1085 .. option:: offset_increment=int
1086
1087         If this is provided, then the real offset becomes `offset + offset_increment
1088         * thread_number`, where the thread number is a counter that starts at 0 and
1089         is incremented for each sub-job (i.e. when :option:`numjobs` option is
1090         specified). This option is useful if there are several jobs which are
1091         intended to operate on a file in parallel disjoint segments, with even
1092         spacing between the starting points.
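
             A sketch of four sub-jobs reading disjoint 1 GiB segments follows. It
             assumes a hypothetical file that already exists and is at least 4 GiB in
             size::

                     [parallel-segments]
                     ; hypothetical pre-existing file of at least 4g
                     filename=/tmp/fio-big.dat
                     rw=read
                     numjobs=4
                     offset=0
                     ; sub-job N starts at 0 + 1g * N, i.e. 0, 1g, 2g, 3g
                     offset_increment=1g
                     ; each sub-job covers 1g from its starting point
                     size=1g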
1093
1094 .. option:: number_ios=int
1095
1096         Fio will normally perform I/Os until it has exhausted the size of the region
1097         set by :option:`size`, or if it exhausts the allocated time (or hits an error
1098         condition). With this setting, the range/size can be set independently of
1099         the number of I/Os to perform. When fio reaches this number, it will exit
1100         normally and report status. Note that this does not extend the amount of I/O
1101         that will be done, it will only stop fio if this condition is met before
1102         other end-of-job criteria.
1103
1104 .. option:: fsync=int
1105
1106         If writing to a file, issue a sync of the dirty data for every number of
1107         blocks given. For example, if you give 32 as a parameter, fio will sync the
1108         file for every 32 writes issued. If fio is using non-buffered I/O, we may
1109         not sync the file. The exception is the sg I/O engine, which synchronizes
1110         the disk cache anyway. Defaults to 0, which means no sync every certain
1111         number of writes.
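
             For example, a buffered sequential write that syncs after every 32 writes
             could be sketched as below (file path and sizes are hypothetical)::

                     [sync-every-32]
                     ; hypothetical test file
                     filename=/tmp/fio-sync.dat
                     size=256m
                     rw=write
                     bs=4k
                     ; sync dirty data after every 32 writes
                     fsync=32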
1112
1113 .. option:: fdatasync=int
1114
1115         Like :option:`fsync` but uses :manpage:`fdatasync(2)` to only sync data and
1116         not metadata blocks.  On Windows, FreeBSD, and DragonFlyBSD there is no
1117         :manpage:`fdatasync(2)`, so this falls back to using :manpage:`fsync(2)`.
1118         Defaults to 0, which means no sync data every certain number of writes.
1119
1120 .. option:: write_barrier=int
1121
1122         Make every `N-th` write a barrier write.
1123
1124 .. option:: sync_file_range=str:val
1125
1126         Use :manpage:`sync_file_range(2)` for every `val` number of write
1127         operations. Fio will track range of writes that have happened since the last
1128         :manpage:`sync_file_range(2)` call. `str` can currently be one or more of:
1129
1130                 **wait_before**
1131                         SYNC_FILE_RANGE_WAIT_BEFORE
1132                 **write**
1133                         SYNC_FILE_RANGE_WRITE
1134                 **wait_after**
1135                         SYNC_FILE_RANGE_WAIT_AFTER
1136
1137         So if you do ``sync_file_range=wait_before,write:8``, fio would use
1138         ``SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE`` for every 8
1139         writes. Also see the :manpage:`sync_file_range(2)` man page.  This option is
1140         Linux specific.
1141
1142 .. option:: overwrite=bool
1143
1144         If true, writes to a file will always overwrite existing data. If the file
1145         doesn't already exist, it will be created before the write phase begins. If
1146         the file exists and is large enough for the specified write phase, nothing
1147         will be done. Default: false.
1148
1149 .. option:: end_fsync=bool
1150
1151         If true, :manpage:`fsync(2)` file contents when a write stage has completed.
1152         Default: false.
1153
1154 .. option:: fsync_on_close=bool
1155
1156         If true, fio will :manpage:`fsync(2)` a dirty file on close.  This differs
1157         from :option:`end_fsync` in that it will happen on every file close, not
1158         just at the end of the job.  Default: false.
1159
1160 .. option:: rwmixread=int
1161
1162         Percentage of a mixed workload that should be reads. Default: 50.
1163
1164 .. option:: rwmixwrite=int
1165
1166         Percentage of a mixed workload that should be writes. If both
1167         :option:`rwmixread` and :option:`rwmixwrite` are given and the values do not
1168         add up to 100%, the latter of the two will be used to override the
1169         first. This may interfere with a given rate setting, if fio is asked to
1170         limit reads or writes to a certain rate.  If that is the case, then the
1171         distribution may be skewed. Default: 50.
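
             As an illustration, a random workload with a 70/30 read/write split could
             be written as follows (file path and sizes are assumptions)::

                     [mixed-70-30]
                     ; hypothetical test file
                     filename=/tmp/fio-mix.dat
                     size=256m
                     bs=4k
                     rw=randrw
                     ; 70% reads, leaving 30% writes
                     rwmixread=70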
1172
1173 .. option:: random_distribution=str:float[,str:float][,str:float]
1174
1175         By default, fio will use a completely uniform random distribution when asked
1176         to perform random I/O. Sometimes it is useful to skew the distribution in
1177         specific ways, ensuring that some parts of the data are hotter than others.
1178         fio includes the following distribution models:
1179
1180                 **random**
1181                                 Uniform random distribution
1182
1183                 **zipf**
1184                                 Zipf distribution
1185
1186                 **pareto**
1187                                 Pareto distribution
1188
1189                 **gauss**
1190                                 Normal (Gaussian) distribution
1191
1192                 **zoned**
1193                                 Zoned random distribution
1194
1195         When using a **zipf** or **pareto** distribution, an input value is also
1196         needed to define the access pattern. For **zipf**, this is the `zipf
1197         theta`. For **pareto**, it's the `Pareto power`. Fio includes a test
1198         program, :command:`genzipf`, that can be used to visualize what the given input
1199         values will yield in terms of hit rates.  If you wanted to use **zipf** with
1200         a `theta` of 1.2, you would use ``random_distribution=zipf:1.2`` as the
1201         option. If a non-uniform model is used, fio will disable use of the random
1202         map. For the **gauss** distribution, a normal deviation is supplied as a
1203         value between 0 and 100.
1204
1205         For a **zoned** distribution, fio supports specifying percentages of I/O
1206         access that should fall within what range of the file or device. For
1207         example, given criteria of:
1208
1209         * 60% of accesses should be to the first 10%
1210         * 30% of accesses should be to the next 20%
1211         * 8% of accesses should be to the next 30%
1212         * 2% of accesses should be to the next 40%
1213
1214         we can define that through zoning of the random accesses. For the above
1215         example, the user would do::
1216
1217                 random_distribution=zoned:60/10:30/20:8/30:2/40
1218
1219         similarly to how :option:`bssplit` works for setting ranges and percentages
1220         of block sizes. Like :option:`bssplit`, it's possible to specify separate
1221         zones for reads, writes, and trims. If just one set is given, it'll apply to
1222         all of them.
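
             A non-uniform model is selected the same way inside a job file. The
             sketch below uses the **zipf** model with the theta value from the text;
             the file path and sizes are hypothetical::

                     [hot-spots]
                     ; hypothetical test file
                     filename=/tmp/fio-dist.dat
                     size=1g
                     bs=4k
                     rw=randread
                     ; Zipf distribution with theta 1.2; the random map is disabled
                     random_distribution=zipf:1.2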
1223
1224 .. option:: percentage_random=int[,int][,int]
1225
1226         For a random workload, set how big a percentage should be random. This
1227         defaults to 100%, in which case the workload is fully random. It can be set
1228         anywhere from 0 to 100.  Setting it to 0 would make the workload fully
1229         sequential. Any setting in between will result in a random mix of sequential
1230         and random I/O, at the given percentages.  Comma-separated values may be
1231         specified for reads, writes, and trims as described in :option:`blocksize`.
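
             As a sketch, separate percentages for reads and writes might be given
             like this (file path and sizes are assumptions)::

                     [partly-random]
                     ; hypothetical test file
                     filename=/tmp/fio-pct.dat
                     size=256m
                     bs=4k
                     rw=randrw
                     ; reads are 80% random, writes (and trims) are 20% random
                     percentage_random=80,20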
1232
1233 .. option:: norandommap
1234
1235         Normally fio will cover every block of the file when doing random I/O. If
1236         this option is given, fio will just get a new random offset without looking
1237         at past I/O history. This means that some blocks may not be read or written,
1238         and that some blocks may be read/written more than once. If this option is
1239         used with :option:`verify` and multiple blocksizes (via :option:`bsrange`),
1240         only intact blocks are verified, i.e., partially-overwritten blocks are
1241         ignored.
1242
1243 .. option:: softrandommap=bool
1244
1245         See :option:`norandommap`. If fio runs with the random block map enabled and
1246         it fails to allocate the map, if this option is set it will continue without
1247         a random block map. As coverage will not be as complete as with random maps,
1248         this option is disabled by default.
1249
1250 .. option:: random_generator=str
1251
1252         Fio supports the following engines for generating
1253         I/O offsets for random I/O:
1254
1255                 **tausworthe**
1256                         Strong 2^88 cycle random number generator
1257                 **lfsr**
1258                         Linear feedback shift register generator
1259                 **tausworthe64**
1260                         Strong 64-bit 2^258 cycle random number generator
1261
1262         **tausworthe** is a strong random number generator, but it requires tracking
1263         on the side if we want to ensure that blocks are only read or written
1264         once. **LFSR** guarantees that we never generate the same offset twice, and
1265         it's also less computationally expensive. It's not a true random generator,
1266         however, though for I/O purposes it's typically good enough. **LFSR** only
1267         works with single block sizes, not with workloads that use multiple block
1268         sizes. If used with such a workload, fio may read or write some blocks
1269         multiple times. The default value is **tausworthe**, unless the required
1270         space exceeds 2^32 blocks. If it does, then **tausworthe64** is
1271         selected automatically.
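
             A minimal sketch of selecting the LFSR generator, using a single block
             size as the text requires, could be (file path and sizes are
             hypothetical)::

                     [lfsr-writes]
                     ; hypothetical test file
                     filename=/tmp/fio-lfsr.dat
                     size=256m
                     ; one fixed block size, since lfsr does not suit mixed sizes
                     bs=4k
                     rw=randwrite
                     random_generator=lfsr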
1272
1273
1274 Block size
1275 ~~~~~~~~~~
1276
1277 .. option:: blocksize=int[,int][,int], bs=int[,int][,int]
1278
1279         The block size in bytes used for I/O units. Default: 4096.  A single value
1280         applies to reads, writes, and trims.  Comma-separated values may be
1281         specified for reads, writes, and trims.  A value not terminated in a comma
1282         applies to subsequent types.
1283
1284         Examples:
1285
1286                 **bs=256k**
1287                         means 256k for reads, writes and trims.
1288
1289                 **bs=8k,32k**
1290                         means 8k for reads, 32k for writes and trims.
1291
1292                 **bs=8k,32k,**
1293                         means 8k for reads, 32k for writes, and default for trims.
1294
1295                 **bs=,8k**
1296                         means default for reads, 8k for writes and trims.
1297
1298                 **bs=,8k,**
1299                         means default for reads, 8k for writes, and default for trims.
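
             In a job file, the comma-separated form could be used like this. The
             file path and sizes are assumptions for illustration::

                     [split-bs]
                     ; hypothetical test file
                     filename=/tmp/fio-bs.dat
                     size=256m
                     rw=randrw
                     ; 8k for reads, 32k for writes and trims
                     bs=8k,32k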
1300
1301 .. option:: blocksize_range=irange[,irange][,irange], bsrange=irange[,irange][,irange]
1302
1303         A range of block sizes in bytes for I/O units.  The issued I/O unit will
1304         always be a multiple of the minimum size, unless
1305         :option:`blocksize_unaligned` is set.
1306
1307         Comma-separated ranges may be specified for reads, writes, and trims as
1308         described in :option:`blocksize`.
1309
1310         Example: ``bsrange=1k-4k,2k-8k``.
1311
1312 .. option:: bssplit=str[,str][,str]
1313
1314         Sometimes you want even finer grained control of the block sizes issued, not
1315         just an even split between them.  This option allows you to weight various
1316         block sizes, so that you are able to define a specific amount of block sizes
1317         issued. The format for this option is::
1318
1319                 bssplit=blocksize/percentage:blocksize/percentage
1320
1321         for as many block sizes as needed. So if you want to define a workload that
1322         has 50% 64k blocks, 10% 4k blocks, and 40% 32k blocks, you would write::
1323
1324                 bssplit=4k/10:64k/50:32k/40
1325
1326         Ordering does not matter. If the percentage is left blank, fio will fill in
1327         the remaining values evenly. So a bssplit option like this one::
1328
1329                 bssplit=4k/50:1k/:32k/
1330
1331         would have 50% 4k I/Os, and 25% each of 1k and 32k I/Os. The percentages always
1332         add up to 100; if bssplit is given a range that adds up to more, it will error out.
1333
1334         Comma-separated values may be specified for reads, writes, and trims as
1335         described in :option:`blocksize`.
1336
1337         If you want a workload that has 50% 2k reads and 50% 4k reads, while having
1338         90% 4k writes and 10% 8k writes, you would specify::
1339
1340                 bssplit=2k/50:4k/50,4k/90:8k/10
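
             As a further sketch with different, purely illustrative values, a
             complete job using separate read and write splits might look like this
             (file path and sizes are hypothetical)::

                     [weighted-bs]
                     ; hypothetical test file
                     filename=/tmp/fio-split.dat
                     size=256m
                     rw=randrw
                     ; reads: 70% 4k and 30% 64k; writes: 50% 8k and 50% 16k
                     bssplit=4k/70:64k/30,8k/50:16k/50

             Within each direction, entries are separated by ``:`` and the
             percentages add up to 100; the comma separates the read set from the
             write set.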
1341
1342 .. option:: blocksize_unaligned, bs_unaligned
1343
1344         If set, fio will issue I/O units with any size within
1345         :option:`blocksize_range`, not just multiples of the minimum size.  This
1346         typically won't work with direct I/O, as that normally requires sector
1347         alignment.
1348
1349 .. option:: bs_is_seq_rand
1350
1351         If this option is set, fio will use the normal read,write blocksize settings
1352         as sequential,random blocksize settings instead. Any random read or write
1353         will use the WRITE blocksize settings, and any sequential read or write will
1354         use the READ blocksize settings.
1355
1356 .. option:: blockalign=int[,int][,int], ba=int[,int][,int]
1357
1358         Boundary to which fio will align random I/O units.  Default:
1359         :option:`blocksize`.  Minimum alignment is typically 512b for using direct
1360         I/O, though it usually depends on the hardware block size. This option is
1361         mutually exclusive with using a random map for files, so it will turn off
1362         that option.  Comma-separated values may be specified for reads, writes, and
1363         trims as described in :option:`blocksize`.
1364
1365
1366 Buffers and memory
1367 ~~~~~~~~~~~~~~~~~~
1368
1369 .. option:: zero_buffers
1370
1371         Initialize buffers with all zeros. Default: fill buffers with random data.
1372
1373 .. option:: refill_buffers
1374
1375         If this option is given, fio will refill the I/O buffers on every
1376         submit. The default is to only fill it at init time and reuse that
1377         data. Only makes sense if :option:`zero_buffers` isn't specified, naturally. If data
1378         verification is enabled, `refill_buffers` is also automatically enabled.
1379
1380 .. option:: scramble_buffers=bool
1381
1382         If :option:`refill_buffers` is too costly and the target is using data
1383         deduplication, then setting this option will slightly modify the I/O buffer
1384         contents to defeat normal de-dupe attempts. This is not enough to defeat
1385         more clever block compression attempts, but it will stop naive dedupe of
1386         blocks. Default: true.
1387
1388 .. option:: buffer_compress_percentage=int
1389
1390         If this is set, then fio will attempt to provide I/O buffer content (on
1391         WRITEs) that compress to the specified level. Fio does this by providing a
1392         mix of random data and a fixed pattern. The fixed pattern is either zeroes,
1393         or the pattern specified by :option:`buffer_pattern`. If the pattern option
1394         is used, it might skew the compression ratio slightly. Note that this is per
1395         block size unit; for a file/disk wide compression level that matches this
1396         setting, you'll also want to set :option:`refill_buffers`.
1397
1398 .. option:: buffer_compress_chunk=int
1399
1400         See :option:`buffer_compress_percentage`. This setting allows fio to manage
1401         how big the ranges of random data and zeroed data are. Without this set, fio
1402         will provide :option:`buffer_compress_percentage` of blocksize random data,
1403         followed by the remaining zeroed. With this set to some chunk size smaller
1404         than the block size, fio can alternate random and zeroed data throughout the
1405         I/O buffer.
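
             The compression options could be combined as in the sketch below. The
             file path, sizes, and the chosen percentage and chunk size are
             assumptions::

                     [compressible-writes]
                     ; hypothetical test file
                     filename=/tmp/fio-compress.dat
                     size=256m
                     rw=write
                     bs=64k
                     ; aim for buffers that compress to the 50% level
                     buffer_compress_percentage=50
                     ; alternate random and zeroed data in 4k chunks within each block
                     buffer_compress_chunk=4k
                     ; refill on every submit for a file-wide compression level
                     refill_buffers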
1406
1407 .. option:: buffer_pattern=str
1408
1409         If set, fio will fill the I/O buffers with this pattern or with the contents
1410         of a file. If not set, the contents of I/O buffers are defined by the other
1411         options related to buffer contents. The setting can be any pattern of bytes,
1412         and can be prefixed with 0x for hex values. It may also be a string, where
1413         the string must then be wrapped with ``""``. Or it may also be a filename,
1414         where the filename must be wrapped with ``''`` in which case the file is
1415         opened and read. Note that not all the file contents will be read if that
1416         would cause the buffers to overflow. So, for example::
1417
1418                 buffer_pattern='filename'
1419
1420         or::
1421
1422                 buffer_pattern="abcd"
1423
1424         or::
1425
1426                 buffer_pattern=-12
1427
1428         or::
1429
1430                 buffer_pattern=0xdeadface
1431
1432         Also you can combine everything together in any order::
1433
1434                 buffer_pattern=0xdeadface"abcd"-12'filename'
1435
1436 .. option:: dedupe_percentage=int
1437
1438         If set, fio will generate this percentage of identical buffers when
1439         writing. These buffers will be naturally dedupable. The contents of the
1440         buffers depend on what other buffer compression settings have been set. It's
1441         possible to have the individual buffers either fully compressible, or not at
1442         all. This option only controls the distribution of unique buffers.
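
             For illustration, a write job producing about 30% identical buffers
             could be sketched as follows (file path and sizes are hypothetical)::

                     [dedupable-writes]
                     ; hypothetical test file
                     filename=/tmp/fio-dedupe.dat
                     size=256m
                     rw=write
                     bs=4k
                     ; generate 30% identical (dedupable) buffers
                     dedupe_percentage=30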
1443
1444 .. option:: invalidate=bool
1445
1446         Invalidate the buffer/page cache parts for this file prior to starting
1447         I/O if the platform and file type support it. Defaults to true.
1448         This will be ignored if :option:`pre_read` is also specified for the
1449         same job.
1450
1451 .. option:: sync=bool
1452
1453         Use synchronous I/O for buffered writes. For the majority of I/O engines,
1454         this means using O_SYNC. Default: false.
1455
1456 .. option:: iomem=str, mem=str
1457
1458         Fio can use various types of memory as the I/O unit buffer.  The allowed
1459         values are:
1460
1461                 **malloc**
1462                         Use memory from :manpage:`malloc(3)` as the buffers.  Default memory
1463                         type.
1464
1465                 **shm**
1466                         Use shared memory as the buffers. Allocated through
1467                         :manpage:`shmget(2)`.
1468
1469                 **shmhuge**
1470                         Same as shm, but use huge pages as backing.
1471
1472                 **mmap**
1473                         Use mmap to allocate buffers. May either be anonymous memory, or can
1474                         be file backed if a filename is given after the option. The format
1475                         is `mem=mmap:/path/to/file`.
1476
1477                 **mmaphuge**
1478                         Use a memory mapped huge file as the buffer backing. Append the
1479                         filename after mmaphuge, e.g. `mem=mmaphuge:/hugetlbfs/file`.
1480
1481                 **mmapshared**
1482                         Same as mmap, but use a MAP_SHARED mapping.
1483
1484                 **cudamalloc**
1485                         Use GPU memory as the buffers for GPUDirect RDMA benchmark.
1486
1487         The area allocated is a function of the maximum allowed bs size for the job,
1488         multiplied by the I/O depth given. Note that for **shmhuge** and
1489         **mmaphuge** to work, the system must have free huge pages allocated. This
1490         can normally be checked and set by reading/writing
1491         :file:`/proc/sys/vm/nr_hugepages` on a Linux system. Fio assumes a huge page
1492         is 4MiB in size. So to calculate the number of huge pages you need for a
1493         given job file, add up the I/O depth of all jobs (normally one unless
1494         :option:`iodepth` is used) and multiply by the maximum bs set. Then divide
1495         that number by the huge page size. You can see the size of the huge pages in
1496         :file:`/proc/meminfo`. If no huge pages have been allocated (that is,
1497         `nr_hugepages` is zero), using **mmaphuge** or **shmhuge** will fail. Also
1498         see :option:`hugepage-size`.
1499
1500         **mmaphuge** also needs to have hugetlbfs mounted and the file location
1501         should point there. So if it's mounted in :file:`/huge`, you would use
1502         `mem=mmaphuge:/huge/somefile`.
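
             A worked sizing sketch, assuming the default 4MiB huge page size and a
             hugetlbfs mount at :file:`/huge` (the job name, file names and sizes are
             illustrative). A single job with ``iodepth=32`` and ``bs=1m`` needs
             32 * 1MiB = 32MiB of buffer space, which is 32MiB / 4MiB = 8 huge pages::

                     [hugepage-buffers]
                     ioengine=libaio
                     direct=1
                     rw=randread
                     # hypothetical data file
                     filename=/data/fio-test.dat
                     size=4g
                     bs=1m
                     iodepth=32
                     # 32 buffers * 1MiB = 32MiB, i.e. 8 huge pages of 4MiB
                     mem=mmaphuge:/huge/fio-buffers
                     hugepage-size=4m

             If the system huge page size differs from 4MiB, the page count changes
             accordingly.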
1503
1504 .. option:: iomem_align=int
1505
1506         This indicates the memory alignment of the I/O memory buffers.  Note that
1507         the given alignment is applied to the first I/O unit buffer. If using
1508         :option:`iodepth`, the alignment of the following buffers is given by the
1509         :option:`bs` used. In other words, if using a :option:`bs` that is a
1510         multiple of the page size in the system, all buffers will be aligned to
1511         this value. If using a :option:`bs` that is not page aligned, the alignment
1512         of subsequent I/O memory buffers is the sum of the :option:`iomem_align` and
1513         :option:`bs` used.
1514
1515 .. option:: hugepage-size=int
1516
1517         Defines the size of a huge page. Must at least be equal to the system
1518         setting, see :file:`/proc/meminfo`. Defaults to 4MiB.  Should probably
1519         always be a multiple of megabytes, so using ``hugepage-size=Xm`` is the
1520         preferred way to set this, as it avoids a value that is not a power of 2.
1521
1522 .. option:: lockmem=int
1523
1524         Pin the specified amount of memory with :manpage:`mlock(2)`. Can be used to
1525         simulate a smaller amount of memory.  The amount specified is per worker.
1526
1527
1528 I/O size
1529 ~~~~~~~~
1530
1531 .. option:: size=int
1532
1533         The total size of file I/O for each thread of this job. Fio will run until
1534         this many bytes have been transferred, unless the run is limited by other
1535         options (such as :option:`runtime`) or the amount is changed by :option:`io_size`.
1536         Fio will divide this size between the available files determined by options
1537         such as :option:`nrfiles`, :option:`filename`, unless :option:`filesize` is
1538         specified by the job. If the result of division happens to be 0, the size is
1539         set to the physical size of the given files or devices if they exist.
1540         If this option is not specified, fio will use the full size of the given
1541         files or devices.  If the files do not exist, size must be given. It is also
1542         possible to give size as a percentage between 1 and 100. If ``size=20%`` is
1543         given, fio will use 20% of the full size of the given files or devices.
1544         Can be combined with :option:`offset` to constrain the start and end range
1545         that I/O will be done within.
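
             A small sketch of combining a percentage size with :option:`offset`. The
             device path is a placeholder and the workload is read-only; the exact
             span covered should be checked against the fio output::

                     [slice]
                     # hypothetical block device
                     filename=/dev/sdX
                     rw=randread
                     bs=4k
                     # start 10GiB into the device
                     offset=10g
                     # use an amount equal to 20% of the full device size
                     size=20%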
1546
1547 .. option:: io_size=int, io_limit=int
1548
1549         Normally fio operates within the region set by :option:`size`, which means
1550         that the :option:`size` option sets both the region and size of I/O to be
1551         performed. Sometimes that is not what you want. With this option, it is
1552         possible to define just the amount of I/O that fio should do. For instance,
1553         if :option:`size` is set to 20GiB and :option:`io_size` is set to 5GiB, fio
1554         will perform I/O within the first 20GiB but exit when 5GiB have been
1555         done. The opposite is also possible -- if :option:`size` is set to 20GiB,
1556         and :option:`io_size` is set to 40GiB, then fio will do 40GiB of I/O within
1557         the 0..20GiB region.
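
             The two cases above could be written as follows. The job names and the
             file path are illustrative assumptions::

                     [global]
                     # hypothetical target file
                     filename=/data/fio-region.dat
                     rw=randwrite
                     bs=4k
                     # I/O stays within the first 20GiB
                     size=20g

                     [stop-early]
                     # exit after 5GiB of I/O within that region
                     io_size=5g

                     [overwrite-twice]
                     # wait for the previous job, then do 40GiB of I/O in the same region
                     stonewall
                     io_size=40g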
1558
1559 .. option:: filesize=int
1560
1561         Individual file sizes. May be a range, in which case fio will select sizes
1562         for files at random within the given range and limited to :option:`size` in
1563         total (if that is given). If not given, each created file is the same size.
1564         This option overrides :option:`size` in terms of file size, which means
1565         this value is used as a fixed size or possible range of each file.
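
             A sketch of a job that creates files of varying sizes. It assumes the
             range is written as ``low-high``, as for other range options; the job
             name and directory are hypothetical::

                     [many-files]
                     # hypothetical directory, which must already exist
                     directory=/data/fio-files
                     nrfiles=16
                     # each file gets a random size between 1MiB and 8MiB
                     filesize=1m-8m
                     rw=write
                     bs=64k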
1566
1567 .. option:: file_append=bool
1568
1569         Perform I/O after the end of the file. Normally fio will operate within the
1570         size of a file. If this option is set, then fio will append to the file
1571         instead. This has identical behavior to setting :option:`offset` to the size
1572         of a file.  This option is ignored on non-regular files.
1573
1574 .. option:: fill_device=bool, fill_fs=bool
1575
1576         Sets size to something really large and waits for ENOSPC (no space left on
1577         device) as the terminating condition. Only makes sense with sequential
1578         write. For a read workload, the mount point will be filled first then I/O
1579         started on the result. This option doesn't make sense if operating on a raw
1580         device node, since the size of that is already known by the file system.
1581         Additionally, writing beyond end-of-device will not return ENOSPC there.
1582
1583
1584 I/O engine
1585 ~~~~~~~~~~
1586
1587 .. option:: ioengine=str
1588
1589         Defines how the job issues I/O to the file. The following types are defined:
1590
1591                 **sync**
1592                         Basic :manpage:`read(2)` or :manpage:`write(2)`
1593                         I/O. :manpage:`lseek(2)` is used to position the I/O location.
1594                         See :option:`fsync` and :option:`fdatasync` for syncing write I/Os.
1595
1596                 **psync**
1597                         Basic :manpage:`pread(2)` or :manpage:`pwrite(2)` I/O.  Default on
1598                         all supported operating systems except for Windows.
1599
1600                 **vsync**
1601                         Basic :manpage:`readv(2)` or :manpage:`writev(2)` I/O.  Will emulate
1602                         queuing by coalescing adjacent I/Os into a single submission.
1603
1604                 **pvsync**
1605                         Basic :manpage:`preadv(2)` or :manpage:`pwritev(2)` I/O.
1606
1607                 **pvsync2**
1608                         Basic :manpage:`preadv2(2)` or :manpage:`pwritev2(2)` I/O.
1609
1610                 **libaio**
1611                         Linux native asynchronous I/O. Note that Linux may only support
1612                         queued behaviour with non-buffered I/O (set ``direct=1`` or
1613                         ``buffered=0``).
1614                         This engine defines engine specific options.
1615
1616                 **posixaio**
1617                         POSIX asynchronous I/O using :manpage:`aio_read(3)` and
1618                         :manpage:`aio_write(3)`.
1619
1620                 **solarisaio**
1621                         Solaris native asynchronous I/O.
1622
1623                 **windowsaio**
1624                         Windows native asynchronous I/O.  Default on Windows.
1625
1626                 **mmap**
1627                         File is memory mapped with :manpage:`mmap(2)` and data copied
1628                         to/from using :manpage:`memcpy(3)`.
1629
1630                 **splice**
1631                         :manpage:`splice(2)` is used to transfer the data and
1632                         :manpage:`vmsplice(2)` to transfer data from user space to the
1633                         kernel.
1634
1635                 **sg**
1636                         SCSI generic sg v3 I/O. May either be synchronous using the SG_IO
1637                         ioctl, or if the target is an sg character device we use
1638                         :manpage:`read(2)` and :manpage:`write(2)` for asynchronous
1639                         I/O. Requires filename option to specify either block or character
1640                         devices.
1641
1642                 **null**
1643                         Doesn't transfer any data, just pretends to.  This is mainly used to
1644                         exercise fio itself and for debugging/testing purposes.
1645
1646                 **net**
1647                         Transfer over the network to given ``host:port``.  Depending on the
1648                         :option:`protocol` used, the :option:`hostname`, :option:`port`,
1649                         :option:`listen` and :option:`filename` options are used to specify
1650                         what sort of connection to make, while the :option:`protocol` option
1651                         determines which protocol will be used.  This engine defines engine
1652                         specific options.
1653
1654                 **netsplice**
1655                         Like **net**, but uses :manpage:`splice(2)` and
1656                         :manpage:`vmsplice(2)` to map data and send/receive.
1657                         This engine defines engine specific options.
1658
1659                 **cpuio**
1660                         Doesn't transfer any data, but burns CPU cycles according to the
1661                         :option:`cpuload` and :option:`cpuchunks` options. Setting
1662                         :option:`cpuload`\=85 will cause that job to do nothing but burn 85%
1663                         of the CPU. In case of SMP machines, use :option:`numjobs`
1664                         =<no_of_cpu> to get desired CPU usage, as the cpuload only loads a
1665                         single CPU at the desired rate. A job never finishes unless there is
1666                         at least one non-cpuio job.
1667
1668                 **guasi**
1669                         The GUASI I/O engine is the Generic Userspace Asynchronous Syscall
1670                         Interface approach to async I/O. See
1671
1672                         http://www.xmailserver.org/guasi-lib.html
1673
1674                         for more info on GUASI.
1675
1676                 **rdma**
1677                         The RDMA I/O engine supports both RDMA memory semantics
1678                         (RDMA_WRITE/RDMA_READ) and channel semantics (Send/Recv) for the
1679                         InfiniBand, RoCE and iWARP protocols.
1680
1681                 **falloc**
1682                         I/O engine that does regular fallocate to simulate data transfer as
1683                         fio ioengine.
1684
1685                         DDIR_READ
1686                                 does fallocate(,mode = FALLOC_FL_KEEP_SIZE,).
1687
1688                         DDIR_WRITE
1689                                 does fallocate(,mode = 0).
1690
1691                         DDIR_TRIM
1692                                 does fallocate(,mode = FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE).
1693
1694                 **ftruncate**
1695                         I/O engine that sends :manpage:`ftruncate(2)` operations in response
1696                         to write (DDIR_WRITE) events. Each ftruncate issued sets the file's
1697                         size to the current block offset. Block size is ignored.
1698
1699                 **e4defrag**
1700                         I/O engine that does regular EXT4_IOC_MOVE_EXT ioctls to simulate
1701                         defragment activity in response to DDIR_WRITE events.
1702
1703                 **rbd**
1704                         I/O engine supporting direct access to Ceph Rados Block Devices
1705                         (RBD) via librbd without the need to use the kernel rbd driver. This
1706                         ioengine defines engine specific options.
1707
1708                 **gfapi**
1709                         Uses the Glusterfs libgfapi sync interface for direct access to
1710                         Glusterfs volumes without having to go through FUSE.  This ioengine
1711                         defines engine specific options.
1712
1713                 **gfapi_async**
1714                         Uses the Glusterfs libgfapi async interface for direct access to
1715                         Glusterfs volumes without having to go through FUSE. This ioengine
1716                         defines engine specific options.
1717
1718                 **libhdfs**
1719                         Read and write through Hadoop (HDFS).  The :option:`filename` option
1720                         is used to specify the host,port of the HDFS name-node to connect to.
1721                         This engine interprets offsets a little differently.  In HDFS, files
1722                         once created cannot be modified, so random writes are not possible. To
1723                         imitate this, the libhdfs engine expects a bunch of small files to be
1724                         created over HDFS, and the engine will randomly pick a file out of those
1725                         files based on the offset generated by the fio backend (see the example
1726                         job file to create such files, use the ``rw=write`` option). Please
1727                         note, you might want to set necessary environment variables to work
1728                         with hdfs/libhdfs properly.  Each job uses its own connection to
1729                         HDFS.
1730
1731                 **mtd**
1732                         Read, write and erase an MTD character device (e.g.,
1733                         :file:`/dev/mtd0`). Discards are treated as erases. Depending on the
1734                         underlying device type, the I/O may have to go in a certain pattern,
1735                         e.g., on NAND, writing sequentially to erase blocks and discarding
1736                         before overwriting. The writetrim mode works well for this
1737                         constraint.
1738
1739                 **pmemblk**
1740                         Read and write using filesystem DAX to a file on a filesystem
1741                         mounted with DAX on a persistent memory device through the NVML
1742                         libpmemblk library.
1743
1744                 **dev-dax**
1745                         Read and write using device DAX to a persistent memory device (e.g.,
1746                         /dev/dax0.0) through the NVML libpmem library.
1747
1748                 **external**
1749                         Prefix to specify loading an external I/O engine object file. Append
1750                         the engine filename, e.g. ``ioengine=external:/tmp/foo.o`` to load
1751                         ioengine :file:`foo.o` in :file:`/tmp`.
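
             As a minimal sketch of selecting an engine, the job below uses **libaio**
             with unbuffered I/O so that requests can actually be queued. The job name,
             file path and sizes are illustrative assumptions::

                     [aio-randread]
                     ioengine=libaio
                     # Linux may only queue non-buffered I/O, so bypass the page cache
                     direct=1
                     iodepth=16
                     rw=randread
                     bs=4k
                     # hypothetical data file
                     filename=/data/fio-aio.dat
                     size=1g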
1752
1753
1754 I/O engine specific parameters
1755 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1756
1757 In addition, there are some parameters which are only valid when a specific
1758 ioengine is in use. These are used identically to normal parameters, with the
1759 caveat that when used on the command line, they must come after the
1760 :option:`ioengine` that defines them is selected.
1761
1762 .. option:: userspace_reap : [libaio]
1763
1764         Normally, with the libaio engine in use, fio will use the
1765         :manpage:`io_getevents(2)` system call to reap newly returned events.  With
1766         this flag turned on, the AIO ring will be read directly from user-space to
1767         reap events. The reaping mode is only enabled when polling for a minimum of
1768         0 events (e.g. when :option:`iodepth_batch_complete` `=0`).
1769
1770 .. option:: hipri : [pvsync2]
1771
1772         Set RWF_HIPRI on I/O, indicating to the kernel that it's of higher priority
1773         than normal.
1774
1775 .. option:: cpuload=int : [cpuio]
1776
1777         Attempt to use the specified percentage of CPU cycles. This is a mandatory
1778         option when using cpuio I/O engine.
1779
1780 .. option:: cpuchunks=int : [cpuio]
1781
1782         Split the load into cycles of the given time. In microseconds.
1783
1784 .. option:: exit_on_io_done=bool : [cpuio]
1785
1786         Detect when I/O threads are done, then exit.
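
             A sketch combining the three cpuio options with a regular I/O job. The
             job names, file path and the CPU count of 4 are assumptions. Note that
             the engine specific options follow :option:`ioengine`::

                     [io-load]
                     ioengine=psync
                     rw=randread
                     bs=4k
                     # hypothetical data file
                     filename=/data/fio-cpu.dat
                     size=1g

                     [cpu-burner]
                     ioengine=cpuio
                     # burn roughly 85% of one CPU, in cycles of 50000 usec
                     cpuload=85
                     cpuchunks=50000
                     # one burner per CPU on a hypothetical 4-CPU machine
                     numjobs=4
                     # stop burning once the I/O job has finished
                     exit_on_io_done=1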
1787
1788 .. option:: hostname=str : [netsplice] [net]
1789
1790         The host name or IP address to use for TCP or UDP based I/O.  If the job is
1791         a TCP listener or UDP reader, the host name is not used and must be omitted
1792         unless it is a valid UDP multicast address.
1793
1794 .. option:: namenode=str : [libhdfs]
1795
1796         The host name or IP address of a HDFS cluster namenode to contact.
1797
1798 .. option:: port=int
1799
1800    [netsplice], [net]
1801
1802                 The TCP or UDP port to bind to or connect to. If this is used with
1803                 :option:`numjobs` to spawn multiple instances of the same job type, then
1804                 this will be the starting port number since fio will use a range of
1805                 ports.
1806
1807    [libhdfs]
1808
1809                 The listening port of the HDFS cluster namenode.
1810
1811 .. option:: interface=str : [netsplice] [net]
1812
1813         The IP address of the network interface used to send or receive UDP
1814         multicast.
1815
1816 .. option:: ttl=int : [netsplice] [net]
1817
1818         Time-to-live value for outgoing UDP multicast packets. Default: 1.
1819
1820 .. option:: nodelay=bool : [netsplice] [net]
1821
1822         Set TCP_NODELAY on TCP connections.
1823
1824 .. option:: protocol=str : [netsplice] [net]
1825
1826 .. option:: proto=str : [netsplice] [net]
1827
1828         The network protocol to use. Accepted values are:
1829
1830         **tcp**
1831                 Transmission control protocol.
1832         **tcpv6**
1833                 Transmission control protocol V6.
1834         **udp**
1835                 User datagram protocol.
1836         **udpv6**
1837                 User datagram protocol V6.
1838         **unix**
1839                 UNIX domain socket.
1840
1841         When the protocol is TCP or UDP, the port must also be given, as well as the
1842         hostname unless the job is a TCP listener or UDP reader. For unix sockets, the
1843         normal filename option should be used and the port is invalid.
1844
1845 .. option:: listen : [net]
1846
1847         For TCP network connections, tell fio to listen for incoming connections
1848         rather than initiating an outgoing connection. The :option:`hostname` must
1849         be omitted if this option is used.
1850
1851 .. option:: pingpong : [net]
1852
1853         Normally a network writer will just continue writing data, and a network
1854         reader will just consume packets. If ``pingpong=1`` is set, a writer will
1855         send its normal payload to the reader, then wait for the reader to send the
1856         same payload back. This allows fio to measure network latencies. The
1857         submission and completion latencies then measure local time spent sending or
1858         receiving, and the completion latency measures how long it took for the
1859         other end to receive and send back.  For UDP multicast traffic
1860         ``pingpong=1`` should only be set for a single reader when multiple readers
1861         are listening to the same address.
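
             A sketch of a TCP sender and receiver on one machine with ``pingpong=1``
             set, so that completion latency reflects the round trip. The port number,
             job names and sizes are illustrative assumptions::

                     [global]
                     ioengine=net
                     protocol=tcp
                     # hypothetical port
                     port=8888
                     bs=4k
                     size=100m
                     pingpong=1

                     [receiver]
                     # a listener must not set hostname
                     listen
                     rw=read

                     [sender]
                     hostname=localhost
                     # give the receiver time to start listening
                     startdelay=1
                     rw=write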
1862
1863 .. option:: window_size : [net]
1864
1865         Set the desired socket buffer size for the connection.
1866
1867 .. option:: mss : [net]
1868
1869         Set the TCP maximum segment size (TCP_MAXSEG).
1870
1871 .. option:: donorname=str : [e4defrag]
1872
1873         File will be used as a block donor (swap extents between files).
1874
1875 .. option:: inplace=int : [e4defrag]
1876
1877         Configure donor file blocks allocation strategy:
1878
1879         **0**
1880                 Default. Preallocate the donor's file on init.
1881         **1**
1882                 Allocate space immediately inside the defragment event, and free it
1883                 right after the event.
1884
1885 .. option:: clustername=str : [rbd]
1886
1887         Specifies the name of the Ceph cluster.
1888
1889 .. option:: rbdname=str : [rbd]
1890
1891         Specifies the name of the RBD.
1892
1893 .. option:: pool=str : [rbd]
1894
1895         Specifies the name of the Ceph pool containing the RBD.
1896
1897 .. option:: clientname=str : [rbd]
1898
1899         Specifies the username (without the 'client.' prefix) used to access the
1900         Ceph cluster. If the *clustername* is specified, the *clientname* shall be
1901         the full *type.id* string. If no type. prefix is given, fio will add
1902         'client.' by default.
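
             A sketch of an rbd job using the options above. The user, pool and image
             names are hypothetical, and the image is assumed to exist already::

                     [global]
                     ioengine=rbd
                     # hypothetical user, pool and image names
                     clientname=admin
                     pool=rbd
                     rbdname=fio_test
                     rw=randwrite
                     bs=4k

                     [rbd-iodepth32]
                     iodepth=32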
1903
1904 .. option:: skip_bad=bool : [mtd]
1905
1906         Skip operations against known bad blocks.
1907
1908 .. option:: hdfsdirectory : [libhdfs]
1909
1910         libhdfs will create chunks in this HDFS directory.
1911
1912 .. option:: chunk_size : [libhdfs]
1913
1914         The size of the chunk to use for each file.
1915
1916
1917 I/O depth
1918 ~~~~~~~~~
1919
1920 .. option:: iodepth=int
1921
1922         Number of I/O units to keep in flight against the file.  Note that
1923         increasing *iodepth* beyond 1 will not affect synchronous ioengines (except
1924         for small degrees when :option:`verify_async` is in use).  Even async
1925         engines may impose OS restrictions causing the desired depth not to be
1926         achieved.  This may happen on Linux when using libaio and not setting
1927         :option:`direct`\=1, since buffered I/O is not async on that OS.  Keep an
1928         eye on the I/O depth distribution in the fio output to verify that the
1929         achieved depth is as expected. Default: 1.
1930
1931 .. option:: iodepth_batch_submit=int, iodepth_batch=int
1932
1933         This defines how many pieces of I/O to submit at once.  It defaults to 1
1934         which means that we submit each I/O as soon as it is available, but can be
1935         raised to submit bigger batches of I/O at a time. If it is set to 0 the
1936         :option:`iodepth` value will be used.
1937
1938 .. option:: iodepth_batch_complete_min=int, iodepth_batch_complete=int
1939
1940         This defines how many pieces of I/O to retrieve at once. It defaults to 1
1941         which means that we'll ask for a minimum of 1 I/O in the retrieval process
1942         from the kernel. The I/O retrieval will go on until we hit the limit set by
1943         :option:`iodepth_low`. If this variable is set to 0, then fio will always
1944         check for completed events before queuing more I/O. This helps reduce I/O
1945         latency, at the cost of more retrieval system calls.
1946
1947 .. option:: iodepth_batch_complete_max=int
1948
1949         This defines the maximum number of I/Os to retrieve at once. This variable should
1950         be used along with :option:`iodepth_batch_complete_min`\=int variable,
1951         specifying the range of min and max amount of I/O which should be
1952         retrieved. By default it is equal to :option:`iodepth_batch_complete_min`
1953         value.
1954
1955         Example #1::
1956
1957                 iodepth_batch_complete_min=1
1958                 iodepth_batch_complete_max=<iodepth>
1959
1960         which means that we will retrieve at least 1 I/O and up to the whole
1961         submitted queue depth. If no I/O has completed yet, we will wait.
1962
1963         Example #2::
1964
1965                 iodepth_batch_complete_min=0
1966                 iodepth_batch_complete_max=<iodepth>
1967
1968         which means that we can retrieve up to the whole submitted queue depth, but
1969         if no I/O has completed yet, we will NOT wait and immediately exit
1970         the system call. In this example we simply do polling.
1971
1972 .. option:: iodepth_low=int
1973
1974         The low water mark indicating when to start filling the queue
1975         again. Defaults to the same as :option:`iodepth`, meaning that fio will
1976         attempt to keep the queue full at all times.  If :option:`iodepth` is set to
1977         e.g. 16 and *iodepth_low* is set to 4, then after fio has filled the queue of
1978         16 requests, it will let the depth drain down to 4 before starting to fill
1979         it again.
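
             A sketch that puts the depth and batching options together. The job name,
             file path and the chosen numbers are illustrative assumptions::

                     [batched]
                     ioengine=libaio
                     direct=1
                     rw=randread
                     bs=4k
                     # hypothetical data file
                     filename=/data/fio-batch.dat
                     size=1g
                     iodepth=16
                     # submit up to 8 I/Os per submission
                     iodepth_batch_submit=8
                     # wait for at least 4 completions, retrieve up to 16 at once
                     iodepth_batch_complete_min=4
                     iodepth_batch_complete_max=16
                     # after filling the queue, let it drain to 4 before refilling
                     iodepth_low=4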
1980
1981 .. option:: io_submit_mode=str
1982
1983         This option controls how fio submits the I/O to the I/O engine. The default
1984         is `inline`, which means that the fio job threads submit and reap I/O
1985         directly. If set to `offload`, the job threads will offload I/O submission
1986         to a dedicated pool of I/O threads. This requires some coordination and thus
1987         has a bit of extra overhead, especially for lower queue depth I/O where it
1988         can increase latencies. The benefit is that fio can manage submission rates
1989         independently of the device completion rates. This avoids skewed latency
1990         reporting if I/O gets back up on the device side (the coordinated omission
1991         problem).
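
             A sketch of offloaded submission paired with a fixed submission rate,
             which is where the mode is most useful. The job name, file path and rate
             are illustrative assumptions::

                     [offloaded]
                     ioengine=libaio
                     direct=1
                     rw=randread
                     bs=4k
                     # hypothetical data file
                     filename=/data/fio-offload.dat
                     size=1g
                     iodepth=4
                     # submit at a fixed rate, independently of completions
                     rate_iops=1000
                     io_submit_mode=offload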
1992
1993
1994 I/O rate
1995 ~~~~~~~~
1996
1997 .. option:: thinktime=time
1998
1999         Stall the job for the specified period of time after an I/O has completed before issuing the
2000         next. May be used to simulate processing being done by an application.
2001         When the unit is omitted, the value is interpreted in microseconds.  See
2002         :option:`thinktime_blocks` and :option:`thinktime_spin`.
2003
2004 .. option:: thinktime_spin=time
2005
2006         Only valid if :option:`thinktime` is set - pretend to spend CPU time doing
2007         something with the data received, before falling back to sleeping for the
2008         rest of the period specified by :option:`thinktime`.  When the unit is
2009         omitted, the value is interpreted in microseconds.
2010
2011 .. option:: thinktime_blocks=int
2012
2013         Only valid if :option:`thinktime` is set - control how many blocks to issue,
2014         before waiting `thinktime` usecs. If not set, defaults to 1 which will make
2015         fio wait `thinktime` usecs after every block. This effectively makes any
2016         queue depth setting redundant, since no more than 1 I/O will be queued
2017         before we have to complete it and do our thinktime. In other words, this
2018         setting effectively caps the queue depth if the latter is larger.
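
             A sketch of the three thinktime options used together, with all times
             given in microseconds. The job name, file path and values are
             illustrative assumptions::

                     [thinker]
                     rw=read
                     bs=64k
                     # hypothetical data file
                     filename=/data/fio-think.dat
                     size=1g
                     # issue 8 blocks, then stall for 1000 usec
                     thinktime=1000
                     thinktime_blocks=8
                     # busy-loop for the first 200 usec of each stall, sleep for the rest
                     thinktime_spin=200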
2019
2020 .. option:: rate=int[,int][,int]
2021
2022         Cap the bandwidth used by this job. The number is in bytes/sec, the normal
2023         suffix rules apply.  Comma-separated values may be specified for reads,
2024         writes, and trims as described in :option:`blocksize`.
2025
2026 .. option:: rate_min=int[,int][,int]
2027
2028         Tell fio to do whatever it can to maintain at least this bandwidth. Failing
2029         to meet this requirement will cause the job to exit.  Comma-separated values
2030         may be specified for reads, writes, and trims as described in
2031         :option:`blocksize`.
2032
2033 .. option:: rate_iops=int[,int][,int]
2034
2035         Cap the bandwidth to this number of IOPS. Basically the same as
2036         :option:`rate`, just specified independently of bandwidth. If the job is
2037         given a block size range instead of a fixed value, the smallest block size
2038         is used as the metric.  Comma-separated values may be specified for reads,
2039         writes, and trims as described in :option:`blocksize`.
2040
2041 .. option:: rate_iops_min=int[,int][,int]
2042
2043         If fio doesn't meet this rate of I/O, it will cause the job to exit.
2044         Comma-separated values may be specified for reads, writes, and trims as
2045         described in :option:`blocksize`.
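
             A sketch of separate read and write limits using comma-separated values.
             The job name, file path and numbers are illustrative, and the suffixes
             are assumed to be interpreted as powers of 1024::

                     [capped]
                     rw=rw
                     bs=4k
                     # hypothetical data file
                     filename=/data/fio-rate.dat
                     size=1g
                     # cap reads at 10MiB/s and writes at 5MiB/s
                     rate=10m,5m
                     # exit if reads fall below 1MiB/s or writes below 512KiB/s
                     rate_min=1m,512k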
2046
2047 .. option:: rate_process=str
2048
2049         This option controls how fio manages rated I/O submissions. The default is
2050         `linear`, which submits I/O in a linear fashion with fixed delays between
2051         I/Os that get adjusted based on I/O completion rates. If this is set to
2052         `poisson`, fio will submit I/O based on a more real world random request
2053         flow, known as the Poisson process
2054         (https://en.wikipedia.org/wiki/Poisson_point_process). The lambda will be
2055         10^6 / IOPS for the given workload.
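
             As a worked example, with ``rate_iops=2000`` the lambda is
             10^6 / 2000 = 500. The sketch below uses that rate; the job name and
             file path are hypothetical::

                     [poisson-arrivals]
                     ioengine=libaio
                     direct=1
                     rw=randread
                     bs=4k
                     # hypothetical data file
                     filename=/data/fio-poisson.dat
                     size=1g
                     iodepth=32
                     # lambda = 10^6 / 2000 = 500
                     rate_iops=2000
                     rate_process=poisson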
2056
2057
2058 I/O latency
2059 ~~~~~~~~~~~
2060
2061 .. option:: latency_target=time
2062
2063         If set, fio will attempt to find the max performance point that the given
2064         workload will run at while maintaining a latency below this target.  When
2065         the unit is omitted, the value is interpreted in microseconds.  See
2066         :option:`latency_window` and :option:`latency_percentile`.
2067
2068 .. option:: latency_window=time
2069
2070         Used with :option:`latency_target` to specify the sample window that the job
2071         is run at varying queue depths to test the performance.  When the unit is
2072         omitted, the value is interpreted in microseconds.
2073
2074 .. option:: latency_percentile=float
2075
2076         The percentage of I/Os that must fall within the criteria specified by
2077         :option:`latency_target` and :option:`latency_window`. If not set, this
2078         defaults to 100.0, meaning that all I/Os must be equal to or below the value
2079         set by :option:`latency_target`.
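
             A sketch of a latency profiling job using the three options above, with
             times given in microseconds. The device path is a placeholder and the
             chosen values are illustrative assumptions::

                     [global]
                     ioengine=libaio
                     direct=1
                     rw=randread
                     bs=4k
                     # upper bound for the queue depth
                     iodepth=128
                     # target latency of 500 msec
                     latency_target=500000
                     # evaluate over a 5 second window
                     latency_window=5000000
                     # 99.9% of I/Os must meet the target
                     latency_percentile=99.9

                     [device]
                     # hypothetical block device
                     filename=/dev/sdX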
2080
2081 .. option:: max_latency=time
2082
2083         If set, fio will exit the job with an ETIMEDOUT error if it exceeds this
2084         maximum latency. When the unit is omitted, the value is interpreted in
2085         microseconds.
2086
2087 .. option:: rate_cycle=int
2088
2089         Average bandwidth for :option:`rate` and :option:`rate_min` over this number
2090         of milliseconds. Defaults to 1000.
2091
2092
2093 I/O replay
2094 ~~~~~~~~~~
2095
2096 .. option:: write_iolog=str
2097
2098         Write the issued I/O patterns to the specified file. See
2099         :option:`read_iolog`.  Specify a separate file for each job, otherwise the
2100         iologs will be interspersed and the file may be corrupt.
2101
2102 .. option:: read_iolog=str
2103
2104         Open an iolog with the specified file name and replay the I/O patterns it
2105         contains. This can be used to store a workload and replay it sometime
2106         later. The iolog given may also be a blktrace binary file, which allows fio
2107         to replay a workload captured by :command:`blktrace`. See
2108         :manpage:`blktrace(8)` for how to capture such logging data. For blktrace
2109         replay, the file needs to be turned into a blkparse binary data file first
2110         (``blkparse <device> -o /dev/null -d file_for_fio.bin``).
2111
2112 .. option:: replay_no_stall=int
2113
2114         When replaying I/O with :option:`read_iolog` the default behavior is to
2115         attempt to respect the timestamps within the log and replay the I/Os with
2116         the appropriate delay between them. If this option is set, fio ignores the
2117         timestamps and replays the I/Os as fast as possible while still respecting
2118         ordering. The result is the same I/O pattern to a given device, but
2119         different timings.
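
        A rough sketch of a record-then-replay cycle, split across two
        hypothetical job files (the file names, job names and sizes are
        assumptions). The first job records its I/O pattern::

                ; record.fio
                [record]
                filename=/var/tmp/fio-replay.dat
                size=64m
                rw=randwrite
                bs=4k
                write_iolog=record.iolog

        The second job replays that pattern as fast as ordering allows, assuming
        the file named in the log still exists::

                ; replay.fio
                [replay]
                read_iolog=record.iolog
                replay_no_stall=1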
2120
2121 .. option:: replay_redirect=str
2122
2123         While replaying I/O patterns using :option:`read_iolog` the default behavior
2124         is to replay the I/Os onto the major/minor device that each I/O was recorded
2125         from.  This is sometimes undesirable because on a different machine those
2126         major/minor numbers can map to a different device.  Changing hardware on the
2127         same system can also result in a different major/minor mapping.
2128         ``replay_redirect`` causes all I/Os to be replayed onto the single specified
2129         device regardless of the device they were recorded from. For example,
2130         :option:`replay_redirect`\= :file:`/dev/sdc` would cause all I/O
2131         in the blktrace or iolog to be replayed onto :file:`/dev/sdc`.  This means
2132         multiple devices will be replayed onto a single device, if the trace
2133         contains multiple devices. If you want multiple devices to be replayed
2134         concurrently to multiple redirected devices you must blkparse your trace
2135         into separate traces and replay them with independent fio invocations.
2136         Unfortunately this also breaks the strict time ordering between multiple
2137         device accesses.
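
        For illustration, a hypothetical replay job that funnels a blkparse binary
        trace onto one device might look like the following. The trace file name
        is taken from the ``blkparse`` example under :option:`read_iolog`, and
        :file:`/dev/sdc` is a placeholder; replaying writes onto a real device
        overwrites its contents::

                [replay-redirected]
                read_iolog=file_for_fio.bin
                replay_redirect=/dev/sdc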
2138
2139 .. option:: replay_align=int
2140
2141         Force alignment of I/O offsets and lengths in a trace to this power of 2
2142         value.
2143
2144 .. option:: replay_scale=int
2145
2146         Scale sector offsets down by this factor when replaying traces.
2147
2148
2149 Threads, processes and job synchronization
2150 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2151
2152 .. option:: thread
2153
2154         Fio defaults to forking jobs, however if this option is given, fio will use
2155         POSIX Threads function :manpage:`pthread_create(3)` to create threads instead
2156         of forking processes.
2157
2158 .. option:: wait_for=str
2159
2160         Specifies the name of the already defined job to wait for. Only a single
2161         waitee name may be specified. If set, the job won't be started until all
2162         workers of the waitee job are done.
2163
2164         ``wait_for`` operates on a job-name basis, so there are a few
2165         limitations. First, the waitee must be defined prior to the waiter job
2166         (meaning no forward references). Second, if a job is being referenced as a
2167         waitee, it must have a unique name (no duplicate waitees).
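
        A small sketch of these ordering rules, with hypothetical job names: the
        waitee ``prepare`` is defined first and has a unique name, so ``consume``
        can refer to it and will not start until ``prepare`` is done::

                [global]
                filename=/var/tmp/fio-waitfor.dat
                size=64m

                [prepare]
                rw=write

                [consume]
                rw=read
                wait_for=prepare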
2168
2169 .. option:: nice=int
2170
2171         Run the job with the given nice value. See man :manpage:`nice(2)`.
2172
2173         On Windows, values less than -15 set the process class to "High"; -1 through
2174         -15 set "Above Normal"; 1 through 15 "Below Normal"; and above 15 "Idle"
2175         priority class.
2176
2177 .. option:: prio=int
2178
2179         Set the I/O priority value of this job. Linux limits us to a value
2180         between 0 and 7, with 0 being the highest.  See man
2181         :manpage:`ionice(1)`. Refer to an appropriate manpage for other operating
2182         systems since meaning of priority may differ.
2183
2184 .. option:: prioclass=int
2185
2186         Set the I/O priority class. See man :manpage:`ionice(1)`.
2187
2188 .. option:: cpumask=int
2189
2190         Set the CPU affinity of this job. The parameter given is a bitmask of
2191         allowed CPUs the job may run on. So if you want the allowed CPUs to be 1
2192         and 5, you would pass the decimal value of (1 << 1 | 1 << 5), or 34. See man
2193         :manpage:`sched_setaffinity(2)`. This may not work on all supported
2194         operating systems or kernel versions. This option doesn't work well for a
2195         higher CPU count than what you can store in an integer mask, so it can only
2196         control CPUs 1-32. For boxes with larger CPU counts, use
2197         :option:`cpus_allowed`.
2198
2199 .. option:: cpus_allowed=str
2200
2201         Controls the same options as :option:`cpumask`, but it allows a text setting
2202         of the permitted CPUs instead. So to use CPUs 1 and 5, you would specify
2203         ``cpus_allowed=1,5``. This option also allows a range of CPUs. Say you
2204         wanted a binding to CPUs 1, 5, and 8-15, you would set
2205         ``cpus_allowed=1,5,8-15``.
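
        As a sketch, the two hypothetical jobs below should end up with the same
        affinity, CPUs 1 and 5, once through the :option:`cpumask` bitmask and
        once through the list syntax (job names and workload settings are
        placeholders)::

                [global]
                size=64m
                rw=randread

                ; (1 << 1) | (1 << 5) = 2 + 32 = 34
                [pinned-by-mask]
                cpumask=34

                [pinned-by-list]
                cpus_allowed=1,5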
2206
2207 .. option:: cpus_allowed_policy=str
2208
2209         Set the policy of how fio distributes the CPUs specified by
2210         :option:`cpus_allowed` or :option:`cpumask`. Two policies are supported:
2211
2212                 **shared**
2213                         All jobs will share the CPU set specified.
2214                 **split**
2215                         Each job will get a unique CPU from the CPU set.
2216
2217         **shared** is the default behaviour if the option isn't specified. If
2218         **split** is specified, then fio will assign one CPU per job. If not
2219         enough CPUs are given for the jobs listed, then fio will round-robin the CPUs
2220         in the set.
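
        As a sketch, the hypothetical job below starts four workers and, with the
        **split** policy, each should get its own CPU from the set 0-3. The job
        name and workload settings are assumptions::

                [workers]
                size=64m
                rw=randread
                numjobs=4
                cpus_allowed=0-3
                cpus_allowed_policy=split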
2221
2222 .. option:: numa_cpu_nodes=str
2223
2224         Run this job on the CPUs of the specified NUMA nodes. The argument accepts a
2225         comma-delimited list of node numbers, A-B ranges, or `all`. Note, to enable
2226         numa options support, fio must be built on a system with libnuma-dev(el)
2227         installed.
2228
2229 .. option:: numa_mem_policy=str
2230
2231         Set this job's memory policy and corresponding NUMA nodes. Format of the
2232         arguments::
2233
2234                 <mode>[:<nodelist>]
2235
2236         ``mode`` is one of the following memory policies: ``default``, ``prefer``,
2237         ``bind``, ``interleave``, ``local``. For the ``default`` and ``local`` memory
2238         policies, no node needs to be specified.  For ``prefer``, only one node is
2239         allowed.  For ``bind`` and ``interleave``, a comma-delimited list of
2240         numbers, A-B ranges, or `all` is allowed.
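
        For example, assuming a machine with at least two NUMA nodes and a fio
        build with libnuma support, a hypothetical job could run on the CPUs of
        node 0 while interleaving its memory across nodes 0 and 1::

                [numa-demo]
                size=64m
                rw=randread
                numa_cpu_nodes=0
                numa_mem_policy=interleave:0-1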
2241
2242 .. option:: cgroup=str
2243
2244         Add job to this control group. If it doesn't exist, it will be created. The
2245         system must have a mounted cgroup blkio mount point for this to work. If
2246         your system doesn't have it mounted, you can do so with::
2247
2248                 # mount -t cgroup -o blkio none /cgroup
2249
2250 .. option:: cgroup_weight=int
2251
2252         Set the weight of the cgroup to this value. Allowed values are in the range
2253         of 100..1000; see the documentation that comes with the kernel.
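
        A sketch with two hypothetical jobs placed in separate cgroups at opposite
        ends of the allowed weight range. The cgroup names, file name and sizes
        are made up, and the blkio controller is assumed to be mounted as
        described under :option:`cgroup`::

                [global]
                filename=/var/tmp/fio-cgroup.dat
                size=64m
                rw=randread
                direct=1

                [light]
                cgroup=fio_light
                cgroup_weight=100

                [heavy]
                cgroup=fio_heavy
                cgroup_weight=1000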
2254
2255 .. option:: cgroup_nodelete=bool
2256
2257         Normally fio will delete the cgroups it has created after the job
2258         completion. To override this behavior and to leave cgroups around after the
2259         job completion, set ``cgroup_nodelete=1``.  This can be useful if one wants
2260         to inspect various cgroup files after job completion. Default: false.
2261
2262 .. option:: flow_id=int
2263
2264         The ID of the flow. If not specified, it defaults to being a global
2265         flow. See :option:`flow`.
2266
2267 .. option:: flow=int
2268
2269         Weight in token-based flow control. If this value is used, then there is a
2270         'flow counter' which is used to regulate the proportion of activity between
2271         two or more jobs. Fio attempts to keep this flow counter near zero. The
2272         ``flow`` parameter stands for how much should be added or subtracted to the
2273         flow counter on each iteration of the main I/O loop. That is, if one job has
2274         ``flow=8`` and another job has ``flow=-1``, then there will be a roughly 1:8
2275         ratio in how much one runs vs the other.
2276
2277 .. option:: flow_watermark=int
2278
2279         The maximum value that the absolute value of the flow counter is allowed to
2280         reach before the job must wait for a lower value of the counter.
2281
2282 .. option:: flow_sleep=int
2283
2284         The period of time, in microseconds, to wait after the flow watermark has
2285         been exceeded before retrying operations.
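
        A minimal sketch of the flow options working together, reusing the
        weights from the :option:`flow` description. The job names, file name,
        watermark and sleep values are illustrative assumptions::

                [global]
                filename=/var/tmp/fio-flow.dat
                size=64m
                time_based
                runtime=30
                flow_watermark=100
                ; microseconds to wait once the watermark is exceeded
                flow_sleep=1000

                [writer]
                rw=write
                flow=8

                [reader]
                rw=randread
                flow=-1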
2286
2287 .. option:: stonewall, wait_for_previous
2288
2289         Wait for preceding jobs in the job file to exit, before starting this
2290         one. Can be used to insert serialization points in the job file. A stone
2291         wall also implies starting a new reporting group, see
2292         :option:`group_reporting`.
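
        For instance, in the hypothetical job file below the ``read-back`` job
        does not start until ``fill`` has exited, and its results are reported in
        a new group. The job names, file name and size are assumptions::

                [global]
                filename=/var/tmp/fio-stonewall.dat
                size=64m

                [fill]
                rw=write

                [read-back]
                stonewall
                rw=read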
2293
2294 .. option:: exitall
2295
2296         When one job finishes, terminate the rest. The default is to wait for each
2297         job to finish, but sometimes that is not the desired action.
2298
2299 .. option:: exec_prerun=str
2300
2301         Before running this job, issue the command specified through
2302         :manpage:`system(3)`. Output is redirected to a file called
2303         :file:`jobname.prerun.txt`.
2304
2305 .. option:: exec_postrun=str
2306
2307         After the job completes, issue the command specified through
2308         :manpage:`system(3)`. Output is redirected to a file called
2309         :file:`jobname.postrun.txt`.
2310
2311 .. option:: uid=int
2312
2313         Instead of running as the invoking user, set the user ID to this value
2314         before the thread/process does any work.
2315
2316 .. option:: gid=int
2317
2318         Set group ID, see :option:`uid`.
2319
2320
2321 Verification
2322 ~~~~~~~~~~~~
2323
2324 .. option:: verify_only
2325
2326         Do not perform the specified workload; only verify that data still matches a
2327         previous invocation of this workload. This option allows one to check data
2328         multiple times at a later date without overwriting it. This option makes
2329         sense only for workloads that write data, and does not support workloads
2330         with the :option:`time_based` option set.
2331
2332 .. option:: do_verify=bool
2333
2334         Run the verify phase after a write phase. Only valid if :option:`verify` is
2335         set. Default: true.
2336
2337 .. option:: verify=str
2338
2339         If writing to a file, fio can verify the file contents after each iteration
2340         of the job. Each verification method also implies verification of a special
2341         header, which is written to the beginning of each block. This header also
2342         includes meta information, like offset of the block, block number, timestamp
2343         when block was written, etc.  :option:`verify` can be combined with
2344         :option:`verify_pattern` option.  The allowed values are:
2345
2346                 **md5**
2347                         Use an md5 sum of the data area and store it in the header of
2348                         each block.
2349
2350                 **crc64**
2351                         Use an experimental crc64 sum of the data area and store it in the
2352                         header of each block.
2353
2354                 **crc32c**
2355                         Use a crc32c sum of the data area and store it in the header of each
2356                         block.
2357
2358                 **crc32c-intel**
2359                         Use hardware assisted crc32c calculation provided on SSE4.2 enabled
2360                         processors. Falls back to regular software crc32c, if not supported
2361                         by the system.
2362
2363                 **crc32**
2364                         Use a crc32 sum of the data area and store it in the header of each
2365                         block.
2366
2367                 **crc16**
2368                         Use a crc16 sum of the data area and store it in the header of each
2369                         block.
2370
2371                 **crc7**
2372                         Use a crc7 sum of the data area and store it in the header of each
2373                         block.
2374
2375                 **xxhash**
2376                         Use xxhash as the checksum function. Generally the fastest software
2377                         checksum that fio supports.
2378
2379                 **sha512**
2380                         Use sha512 as the checksum function.
2381
2382                 **sha256**
2383                         Use sha256 as the checksum function.
2384
2385                 **sha1**
2386                         Use optimized sha1 as the checksum function.
2387
2388                 **sha3-224**
2389                         Use optimized sha3-224 as the checksum function.
2390
2391                 **sha3-256**
2392                         Use optimized sha3-256 as the checksum function.
2393
2394                 **sha3-384**
2395                         Use optimized sha3-384 as the checksum function.
2396
2397                 **sha3-512**
2398                         Use optimized sha3-512 as the checksum function.
2399
2400                 **meta**
2401                         This option is deprecated, since meta information is now included in
2402                         the generic verification header and meta verification happens by
2403                         default. For detailed information see the description of the
2404                         :option:`verify` setting. This option is kept only for
2405                         compatibility with old configurations. Do not use it.
2406
2407                 **pattern**
2408                         Verify a strict pattern. Normally fio includes a header with some
2409                         basic information and checksumming, but if this option is set, only
2410                         the specific pattern set with :option:`verify_pattern` is verified.
2411
2412                 **null**
2413                         Only pretend to verify. Useful for testing internals with
2414                         :option:`ioengine`\=null, not for much else.
2415
2416         This option can be used for repeated burn-in tests of a system to make sure
2417         that the written data is also correctly read back. If the data direction
2418         given is a read or random read, fio will assume that it should verify a
2419         previously written file. If the data direction includes any form of write,
2420         the verify will be of the newly written data.
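
        As a sketch of the two cases just described, the first hypothetical job
        writes a file and then verifies the newly written data using crc32c
        checksums (the file name and sizes are assumptions)::

                ; write-verify.fio
                [write-and-verify]
                filename=/var/tmp/fio-verify.dat
                size=64m
                bs=4k
                rw=write
                verify=crc32c

        A second job file, run later with the same layout settings, only reads
        and so verifies the previously written file::

                ; read-verify.fio
                [verify-existing]
                filename=/var/tmp/fio-verify.dat
                size=64m
                bs=4k
                rw=read
                verify=crc32c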
2421
2422 .. option:: verifysort=bool
2423
2424         If true, fio will sort written verify blocks when it deems it faster to read
2425         them back in a sorted manner. This is often the case when overwriting an
2426         existing file, since the blocks are already laid out in the file system. You
2427         can ignore this option unless doing huge amounts of really fast I/O where
2428         the red-black tree sorting CPU time becomes significant. Default: true.
2429
2430 .. option:: verifysort_nr=int
2431
2432         Pre-load and sort verify blocks for a read workload.
2433
2434 .. option:: verify_offset=int
2435
2436         Swap the verification header with data somewhere else in the block before
2437         writing. It is swapped back before verifying.
2438
2439 .. option:: verify_interval=int
2440
2441         Write the verification header at a finer granularity than the
2442         :option:`blocksize`. It will be written for chunks the size of
2443         ``verify_interval``. :option:`blocksize` should be a multiple of this value.
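
        For example, with the hypothetical settings below each 4096-byte block
        would carry eight verification headers, one per 512-byte chunk
        (4096 / 512 = 8). The job name, file name and size are assumptions::

                [fine-grained-verify]
                filename=/var/tmp/fio-interval.dat
                size=64m
                rw=write
                bs=4k
                verify=crc32c
                verify_interval=512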
2444
2445 .. option:: verify_pattern=str
2446
2447         If set, fio will fill the I/O buffers with this pattern. Fio defaults to
2448         filling with totally random bytes, but sometimes it's interesting to fill
2449         with a known pattern for I/O verification purposes. Depending on the width
2450         of the pattern, fio will fill 1/2/3/4 bytes of the buffer at a time (it can
2451         be either a decimal or a hex number).  The ``verify_pattern``, if larger than
2452         a 32-bit quantity, has to be a hex number that starts with either "0x" or
2453         "0X". Use with :option:`verify`. Also, ``verify_pattern`` supports the %o
2454         format, which means that for each block its offset will be written and then
2455         verified back, e.g.::
2456
2457                 verify_pattern=%o
2458
2459         Or use combination of everything::
2460
2461                 verify_pattern=0xff%o"abcd"-12
2462
2463 .. option:: verify_fatal=bool
2464
2465         Normally fio will keep checking the entire contents before quitting on a
2466         block verification failure. If this option is set, fio will exit the job on
2467         the first observed failure. Default: false.
2468
2469 .. option:: verify_dump=bool
2470
2471         If set, dump the contents of both the original data block and the data block
2472         we read off disk to files. This allows later analysis to inspect just what
2473         kind of data corruption occurred. Off by default.
2474
2475 .. option:: verify_async=int
2476
2477         Fio will normally verify I/O inline from the submitting thread. This option
2478         takes an integer describing how many async offload threads to create for I/O
2479         verification instead, causing fio to offload the duty of verifying I/O
2480         contents to one or more separate threads. If using this offload option, even
2481         sync I/O engines can benefit from using an :option:`iodepth` setting higher
2482         than 1, as it allows them to have I/O in flight while verifies are running.
2483         Defaults to 0 async threads, i.e. verification is not asynchronous.
2484
2485 .. option:: verify_async_cpus=str
2486
2487         Tell fio to set the given CPU affinity on the async I/O verification
2488         threads. See :option:`cpus_allowed` for the format used.
2489
2490 .. option:: verify_backlog=int
2491
2492         Fio will normally verify the written contents of a job that utilizes verify
2493         once that job has completed. In other words, everything is written then
2494         everything is read back and verified. You may want to verify continually
2495         instead for a variety of reasons. Fio stores the meta data associated with
2496         an I/O block in memory, so for large verify workloads, quite a bit of memory
2497         would be used up holding this meta data. If this option is enabled, fio will
2498         write only N blocks before verifying these blocks.
2499
2500 .. option:: verify_backlog_batch=int
2501
2502         Control how many blocks fio will verify if :option:`verify_backlog` is
2503         set. If not set, will default to the value of :option:`verify_backlog`
2504         (meaning the entire queue is read back and verified).  If
2505         ``verify_backlog_batch`` is less than :option:`verify_backlog` then not all
2506         blocks will be verified; if ``verify_backlog_batch`` is larger than
2507         :option:`verify_backlog`, some blocks will be verified more than once.
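
        A sketch of continual verification under assumed values: the hypothetical
        job below verifies after every 1024 written blocks, and because
        ``verify_backlog_batch`` is left unset the whole backlog is read back each
        time::

                [rolling-verify]
                filename=/var/tmp/fio-backlog.dat
                size=256m
                rw=randwrite
                bs=4k
                verify=crc32c
                verify_backlog=1024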
2508
2509 .. option:: verify_state_save=bool
2510
2511         When a job exits during the write phase of a verify workload, save its
2512         current state. This allows fio to replay up until that point, if the verify
2513         state is loaded for the verify read phase. The format of the filename is,
2514         roughly::
2515
2516                 <type>-<jobname>-<jobindex>-verify.state
2517
2518         ``<type>`` is "local" for a local run, "sock" for a client/server socket
2519         connection, and "ip" (192.168.0.1, for instance) for a networked
2520         client/server connection. Defaults to true.
2521
2522 .. option:: verify_state_load=bool
2523
2524         If a verify termination trigger was used, fio stores the current write state
2525         of each thread. This can be used at verification time so that fio knows how
2526         far it should verify.  Without this information, fio will run a full
2527         verification pass, according to the settings in the job file used.  Default:
2528         false.
2529
2530 .. option:: trim_percentage=int
2531
2532         Percentage of verify blocks to discard/trim.
2533
2534 .. option:: trim_verify_zero=bool
2535
2536         Verify that trim/discarded blocks are returned as zeroes.
2537
2538 .. option:: trim_backlog=int
2539
2540         Trim after this number of blocks are written.
2541
2542 .. option:: trim_backlog_batch=int
2543
2544         Trim this number of I/O blocks.
2545
2546 .. option:: experimental_verify=bool
2547
2548         Enable experimental verification.
2549
2550
2551 Steady state
2552 ~~~~~~~~~~~~
2553
2554 .. option:: steadystate=str:float, ss=str:float
2555
2556         Define the criterion and limit for assessing steady state performance. The
2557         first parameter designates the criterion whereas the second parameter sets
2558         the threshold. When the criterion falls below the threshold for the
2559         specified duration, the job will stop. For example, `iops_slope:0.1%` will
2560         direct fio to terminate the job when the least squares regression slope
2561         falls below 0.1% of the mean IOPS. If :option:`group_reporting` is enabled
2562         this will apply to all jobs in the group. Below is the list of available
2563         steady state assessment criteria. All assessments are carried out using only
2564         data from the rolling collection window. Threshold limits can be expressed
2565         as a fixed value or as a percentage of the mean in the collection window.
2566
2567                 **iops**
2568                         Collect IOPS data. Stop the job if all individual IOPS measurements
2569                         are within the specified limit of the mean IOPS (e.g., ``iops:2``
2570                         means that all individual IOPS values must be within 2 of the mean,
2571                         whereas ``iops:0.2%`` means that all individual IOPS values must be
2572                         within 0.2% of the mean IOPS to terminate the job).
2573
2574                 **iops_slope**
2575                         Collect IOPS data and calculate the least squares regression
2576                         slope. Stop the job if the slope falls below the specified limit.
2577
2578                 **bw**
2579                         Collect bandwidth data. Stop the job if all individual bandwidth
2580                         measurements are within the specified limit of the mean bandwidth.
2581
2582                 **bw_slope**
2583                         Collect bandwidth data and calculate the least squares regression
2584                         slope. Stop the job if the slope falls below the specified limit.
2585
2586 .. option:: steadystate_duration=time, ss_dur=time
2587
2588         A rolling window of this duration will be used to judge whether steady state
2589         has been reached. Data will be collected once per second. The default is 0
2590         which disables steady state detection.  When the unit is omitted, the
2591         value is interpreted in seconds.
2592
2593 .. option:: steadystate_ramp_time=time, ss_ramp=time
2594
2595         Allow the job to run for the specified duration before beginning data
2596         collection for checking the steady state job termination criterion. The
2597         default is 0.  When the unit is omitted, the value is interpreted in seconds.
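
        Putting the three steady state options together, the hypothetical job
        below waits 30 seconds before collecting data, then stops once the IOPS
        regression slope over a rolling 60-second window falls below 0.1% of the
        mean IOPS, or when the 30-minute runtime expires, whichever comes first.
        The job name, file name and sizes are assumptions::

                [ss-demo]
                filename=/var/tmp/fio-steady.dat
                size=1g
                rw=randwrite
                bs=4k
                time_based
                runtime=30m
                steadystate=iops_slope:0.1%
                steadystate_duration=60
                steadystate_ramp_time=30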
2598
2599
2600 Measurements and reporting
2601 ~~~~~~~~~~~~~~~~~~~~~~~~~~
2602
2603 .. option:: per_job_logs=bool
2604
2605         If set, this generates bw/clat/iops logs with per-job private filenames. If
2606         not set, jobs with identical names will share the log filename. Default:
2607         true.
2608
2609 .. option:: group_reporting
2610
2611         It may sometimes be interesting to display statistics for groups of jobs as
2612         a whole instead of for each individual job.  This is especially true if
2613         :option:`numjobs` is used; looking at individual thread/process output
2614         quickly becomes unwieldy.  To see the final report per-group instead of
2615         per-job, use :option:`group_reporting`. Jobs in a file will be part of the
2616         same reporting group, unless separated by a :option:`stonewall`, or by
2617         using :option:`new_group`.
2618
2619 .. option:: new_group
2620
2621         Start a new reporting group. See: :option:`group_reporting`.  If not given,
2622         all jobs in a file will be part of the same reporting group, unless
2623         separated by a :option:`stonewall`.
2624
2625 .. option:: stats
2626
2627         By default, fio collects and shows final output results for all jobs
2628         that run. If this option is set to 0, then fio will ignore this job in
2629         the final stat output.
2630
2631 .. option:: write_bw_log=str
2632
2633         If given, write a bandwidth log for this job. Can be used to store data of
2634         the bandwidth of the jobs in their lifetime. The included
2635         :command:`fio_generate_plots` script uses :command:`gnuplot` to turn these
2636         text files into nice graphs. See :option:`write_lat_log` for behaviour of
2637         given filename. For this option, the postfix is :file:`_bw.x.log`, where `x`
2638         is the index of the job (`1..N`, where `N` is the number of jobs). If
2639         :option:`per_job_logs` is false, then the filename will not include the job
2640         index.  See `Log File Formats`_.
2641
2642 .. option:: write_lat_log=str
2643
2644         Same as :option:`write_bw_log`, except that this option stores I/O
2645         submission, completion, and total latencies instead. If no filename is given
2646         with this option, the default filename of :file:`jobname_type.log` is
2647         used. Even if the filename is given, fio will still append the type of
2648         log. So if one specifies::
2649
2650                 write_lat_log=foo
2651
2652         The actual log names will be :file:`foo_slat.x.log`, :file:`foo_clat.x.log`,
2653         and :file:`foo_lat.x.log`, where `x` is the index of the job (1..N, where N
2654         is the number of jobs). This helps :command:`fio_generate_plots` find the
2655         logs automatically. If :option:`per_job_logs` is false, then the filename
2656         will not include the job index.  See `Log File Formats`_.
2657
2658 .. option:: write_hist_log=str
2659
2660         Same as :option:`write_lat_log`, but writes I/O completion latency
2661         histograms. If no filename is given with this option, the default filename
2662         of :file:`jobname_clat_hist.x.log` is used, where `x` is the index of the
2663         job (1..N, where `N` is the number of jobs). Even if the filename is given,
2664         fio will still append the type of log.  If :option:`per_job_logs` is false,
2665         then the filename will not include the job index. See `Log File Formats`_.
2666
2667 .. option:: write_iops_log=str
2668
2669         Same as :option:`write_bw_log`, but writes IOPS. If no filename is given
2670         with this option, the default filename of :file:`jobname_type.x.log` is
2671         used, where `x` is the index of the job (1..N, where `N` is the number of
2672         jobs). Even if the filename is given, fio will still append the type of
2673         log. If :option:`per_job_logs` is false, then the filename will not include
2674         the job index. See `Log File Formats`_.
2675
2676 .. option:: log_avg_msec=int
2677
2678         By default, fio will log an entry in the iops, latency, or bw log for every
2679         I/O that completes. When written to disk, such a log can quickly grow to a
2680         very large size. Setting this option makes fio average each log entry
2681         over the specified period of time, reducing the resolution of the log.  See
2682         :option:`log_max_value` as well. Defaults to 0, logging all entries.
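
        For illustration, the hypothetical job below enables bandwidth, latency
        and IOPS logs with a shared ``demo`` prefix and averages entries over
        one-second windows. Going by the naming rules given for the ``write_*_log``
        options, a single job would be expected to produce
        :file:`demo_bw.1.log`, :file:`demo_slat.1.log`, :file:`demo_clat.1.log`,
        :file:`demo_lat.1.log` and :file:`demo_iops.1.log`::

                [logged]
                filename=/var/tmp/fio-logs.dat
                size=64m
                rw=randread
                bs=4k
                write_bw_log=demo
                write_lat_log=demo
                write_iops_log=demo
                ; milliseconds per averaged log entry
                log_avg_msec=1000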
2683
2684 .. option:: log_hist_msec=int
2685
2686         Same as :option:`log_avg_msec`, but logs entries for completion latency
2687         histograms. Computing latency percentiles from averages of intervals using
2688         :option:`log_avg_msec` is inaccurate. Setting this option makes fio log
2689         histogram entries over the specified period of time, reducing log sizes for
2690         high IOPS devices while retaining percentile accuracy.  See
2691         :option:`log_hist_coarseness` as well. Defaults to 0, meaning histogram
2692         logging is disabled.
2693
2694 .. option:: log_hist_coarseness=int
2695
2696         Integer ranging from 0 to 6, defining the coarseness of the resolution of
2697         the histogram logs enabled with :option:`log_hist_msec`. For each increment
2698         in coarseness, fio outputs half as many bins. Defaults to 0, for which
2699         histogram logs contain 1216 latency bins. See `Log File Formats`_.
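
        As a worked example of the halving rule: coarseness 0 gives 1216 bins, 1
        gives 608, 2 gives 304, and so on down to 19 bins at coarseness 6. A
        hypothetical job logging one histogram per second at coarseness 2 (the
        job name, file name and sizes are assumptions) could look like::

                [hist-demo]
                filename=/var/tmp/fio-hist.dat
                size=64m
                rw=randread
                bs=4k
                write_hist_log=demo
                log_hist_msec=1000
                log_hist_coarseness=2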
2700
2701 .. option:: log_max_value=bool
2702
2703         If :option:`log_avg_msec` is set, fio logs the average over that window. If
2704         you instead want to log the maximum value, set this option to 1. Defaults to
2705         0, meaning that averaged values are logged.
2706
2707 .. option:: log_offset=int
2708
2709         If this is set, the iolog options will include the byte offset for the I/O
2710         entry as well as the other data values.
2711
2712 .. option:: log_compression=int
2713
2714         If this is set, fio will compress the I/O logs as it goes, to keep the
2715         memory footprint lower. When a log reaches the specified size, that chunk is
2716         removed and compressed in the background. Given that I/O logs are fairly
2717         highly compressible, this yields nice memory savings for longer runs. The
2718         downside is that the compression will consume some background CPU cycles, so
2719         it may impact the run. This, however, is also true if the logging ends up
2720         consuming most of the system memory.  So pick your poison. The I/O logs are
2721         saved normally at the end of a run, by decompressing the chunks and storing
2722         them in the specified log file. This feature depends on the availability of
2723         zlib.
2724
2725 .. option:: log_compression_cpus=str
2726
2727         Define the set of CPUs that are allowed to handle online log compression for
2728         the I/O jobs. This can provide better isolation between performance
2729         sensitive jobs, and background compression work.
2730
2731 .. option:: log_store_compressed=bool
2732
2733         If set, fio will store the log files in a compressed format. They can be
2734         decompressed with fio, using the :option:`--inflate-log` command line
2735         parameter. The files will be stored with a :file:`.fz` suffix.
2736
2737 .. option:: log_unix_epoch=bool
2738
2739         If set, fio will log Unix timestamps to the log files produced by enabling
2740         write_type_log for each log type, instead of the default zero-based
2741         timestamps.
2742
2743 .. option:: block_error_percentiles=bool
2744
2745         If set, record errors in trim block-sized units from writes and trims and
2746         output a histogram of how many trims it took to get to errors, and what kind
2747         of error was encountered.
2748
2749 .. option:: bwavgtime=int
2750
2751         Average the calculated bandwidth over the given time. Value is specified in
2752         milliseconds. If the job also does bandwidth logging through
2753         :option:`write_bw_log`, then the minimum of this option and
2754         :option:`log_avg_msec` will be used.  Default: 500ms.
2755
2756 .. option:: iopsavgtime=int
2757
2758         Average the calculated IOPS over the given time. Value is specified in
2759         milliseconds. If the job also does IOPS logging through
2760         :option:`write_iops_log`, then the minimum of this option and
2761         :option:`log_avg_msec` will be used.  Default: 500ms.
2762
2763 .. option:: disk_util=bool
2764
2765         Generate disk utilization statistics, if the platform supports it.
2766         Default: true.
2767
2768 .. option:: disable_lat=bool
2769
2770         Disable measurements of total latency numbers. Useful only for cutting back
2771         the number of calls to :manpage:`gettimeofday(2)`, as that does impact
2772         performance at really high IOPS rates.  Note that to really get rid of a
2773         large amount of these calls, this option must be used with
2774         :option:`disable_slat` and :option:`disable_bw_measurement` as well.
2775
2776 .. option:: disable_clat=bool
2777
2778         Disable measurements of completion latency numbers. See
2779         :option:`disable_lat`.
2780
2781 .. option:: disable_slat=bool
2782
2783         Disable measurements of submission latency numbers. See
2784         :option:`disable_lat`.
2785
2786 .. option:: disable_bw_measurement=bool, disable_bw=bool
2787
2788         Disable measurements of throughput/bandwidth numbers. See
2789         :option:`disable_lat`.
2790
2791 .. option:: clat_percentiles=bool
2792
2793         Enable the reporting of percentiles of completion latencies.
2794
2795 .. option:: percentile_list=float_list
2796
2797         Overwrite the default list of percentiles for completion latencies and the
2798         block error histogram.  Each number is a floating-point number in the range
2799         (0,100], and the maximum length of the list is 20. Use ``:`` to separate the
2800         numbers, and list the numbers in ascending order. For example,
2801         ``--percentile_list=99.5:99.9`` will cause fio to report the values of
2802         completion latency below which 99.5% and 99.9% of the observed latencies
2803         fell, respectively.
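
The constraints on the list can be checked before handing it to fio. The
following is a minimal Python sketch, not fio's own parser. The function name
``parse_percentile_list`` is hypothetical and only mirrors the rules stated
above (colon-separated values, each in (0,100], at most 20 entries, ascending
order):

```python
from typing import List


def parse_percentile_list(value: str) -> List[float]:
    """Parse a colon-separated percentile list and check the documented rules.

    Hypothetical helper for illustration; fio performs its own validation.
    """
    percentiles = [float(item) for item in value.split(":")]
    if len(percentiles) > 20:
        raise ValueError("at most 20 percentiles are allowed")
    for percentile in percentiles:
        # The documented range is (0,100]: 0 is excluded, 100 is included.
        if not 0 < percentile <= 100:
            raise ValueError(f"percentile {percentile} is outside (0, 100]")
    if percentiles != sorted(percentiles):
        raise ValueError("percentiles must be listed in ascending order")
    return percentiles


print(parse_percentile_list("99.5:99.9"))  # prints [99.5, 99.9]
```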
2804
2805
2806 Error handling
2807 ~~~~~~~~~~~~~~
2808
2809 .. option:: exitall_on_error
2810
2811         When one job finishes in error, terminate the rest. The default is to wait
2812         for each job to finish.
2813
2814 .. option:: continue_on_error=str
2815
2816         Normally fio will exit the job on the first observed failure. If this option
2817         is set, fio will continue the job when there is a 'non-fatal error' (EIO or
2818         EILSEQ) until the runtime is exceeded or the I/O size specified is
2819         completed. If this option is used, there are two more stats that are
2820         appended, the total error count and the first error. The error field given
2821         in the stats is the first error that was hit during the run.
2822
2823         The allowed values are:
2824
2825                 **none**
2826                         Exit on any I/O or verify errors.
2827
2828                 **read**
2829                         Continue on read errors, exit on all others.
2830
2831                 **write**
2832                         Continue on write errors, exit on all others.
2833
2834                 **io**
2835                         Continue on any I/O error, exit on all others.
2836
2837                 **verify**
2838                         Continue on verify errors, exit on all others.
2839
2840                 **all**
2841                         Continue on all errors.
2842
2843                 **0**
2844                         Backward-compatible alias for 'none'.
2845
2846                 **1**
2847                         Backward-compatible alias for 'all'.
2848
2849 .. option:: ignore_error=str
2850
2851         Sometimes you want to ignore some errors during a test. In that case you
2852         can specify an error list for each error type, instead of only being able
2853         to ignore the default 'non-fatal error' using :option:`continue_on_error`.
2854         The format is ``ignore_error=READ_ERR_LIST,WRITE_ERR_LIST,VERIFY_ERR_LIST``,
2855         where the errors for a given error type are separated with ':'. An error may
2856         be a symbol ('ENOSPC', 'ENOMEM') or an integer.  Example::
2857
2858                 ignore_error=EAGAIN,ENOSPC:122
2859
2860         This option will ignore EAGAIN from READ, and ENOSPC and 122(EDQUOT) from
2861         WRITE. This option works by overriding :option:`continue_on_error` with
2862         the list of errors for each error type if any.
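
To make the layout of the value concrete, here is a small Python sketch that
splits an ``ignore_error`` string into per-type lists. It is an illustration
only and not how fio parses the option. The function name
``parse_ignore_error`` is hypothetical, and symbolic names are resolved through
Python's ``errno`` module, whose numeric values depend on the platform:

```python
import errno


def parse_ignore_error(value):
    """Split an ignore_error value into read/write/verify error-code lists.

    Hypothetical helper: lists are separated by ',' (read, write, verify in
    that order) and the errors inside one list are separated by ':'.
    """
    kinds = ("read", "write", "verify")
    result = {kind: [] for kind in kinds}
    for kind, group in zip(kinds, value.split(",")):
        for token in group.split(":"):
            if not token:
                continue  # an empty list for this error type
            # A token is either a decimal error number or a symbolic name.
            code = int(token) if token.isdigit() else getattr(errno, token)
            result[kind].append(code)
    return result


# The example value from above: EAGAIN for reads, ENOSPC and 122 for writes.
print(parse_ignore_error("EAGAIN,ENOSPC:122"))
```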
2863
2864 .. option:: error_dump=bool
2865
2866         If set, dump every error even if it is non-fatal. If disabled, only fatal
2867         errors will be dumped. Default: true.
2868
2869 Running predefined workloads
2870 ----------------------------
2871
2872 Fio includes predefined profiles that mimic the I/O workloads generated by
2873 other tools.
2874
2875 .. option:: profile=str
2876
2877         The predefined workload to run.  Current profiles are:
2878
2879                 **tiobench**
2880                         Threaded I/O bench (tiotest/tiobench) like workload.
2881
2882                 **act**
2883                         Aerospike Certification Tool (ACT) like workload.
2884
2885 To view a profile's additional options use :option:`--cmdhelp` after specifying
2886 the profile.  For example::
2887
2888         $ fio --profile=act --cmdhelp
2889
2890 Act profile options
2891 ~~~~~~~~~~~~~~~~~~~
2892
2893 .. option:: device-names=str
2894         :noindex:
2895
2896         Devices to use.
2897
2898 .. option:: load=int
2899         :noindex:
2900
2901         ACT load multiplier.  Default: 1.
2902
2903 .. option:: test-duration=time
2904         :noindex:
2905
2906         How long the entire test takes to run.  When the unit is omitted, the value
2907         is given in seconds.  Default: 24h.
2908
2909 .. option:: threads-per-queue=int
2910         :noindex:
2911
2912         Number of read IO threads per device.  Default: 8.
2913
2914 .. option:: read-req-num-512-blocks=int
2915         :noindex:
2916
2917         Number of 512B blocks to read at a time.  Default: 3.
2918
2919 .. option:: large-block-op-kbytes=int
2920         :noindex:
2921
2922         Size of large block ops in KiB (writes).  Default: 131072.
2923
2924 .. option:: prep
2925         :noindex:
2926
2927         Set to run ACT prep phase.
2928
2929 Tiobench profile options
2930 ~~~~~~~~~~~~~~~~~~~~~~~~
2931
2932 .. option:: size=str
2933         :noindex:
2934
2935         Size in MiB
2936
2937 .. option:: block=int
2938         :noindex:
2939
2940         Block size in bytes.  Default: 4096.
2941
2942 .. option:: numruns=int
2943         :noindex:
2944
2945         Number of runs.
2946
2947 .. option:: dir=str
2948         :noindex:
2949
2950         Test directory.
2951
2952 .. option:: threads=int
2953         :noindex:
2954
2955         Number of threads.
2956
2957 Interpreting the output
2958 -----------------------
2959
2960 Fio spits out a lot of output. While running, fio will display the status of the
2961 jobs created. An example of that would be::
2962
2963     Jobs: 1 (f=1): [_(1),M(1)][24.8%][r=20.5MiB/s,w=23.5MiB/s][r=82,w=94 IOPS][eta 01m:31s]
2964
2965 The characters inside the square brackets denote the current status of each
2966 thread. The possible values (in typical life cycle order) are:
2967
2968 +------+-----+-----------------------------------------------------------+
2969 | Idle | Run |                                                           |
2970 +======+=====+===========================================================+
2971 | P    |     | Thread setup, but not started.                            |
2972 +------+-----+-----------------------------------------------------------+
2973 | C    |     | Thread created.                                           |
2974 +------+-----+-----------------------------------------------------------+
2975 | I    |     | Thread initialized, waiting or generating necessary data. |
2976 +------+-----+-----------------------------------------------------------+
2977 |      |  p  | Thread running pre-reading file(s).                       |
2978 +------+-----+-----------------------------------------------------------+
2979 |      |  R  | Running, doing sequential reads.                          |
2980 +------+-----+-----------------------------------------------------------+
2981 |      |  r  | Running, doing random reads.                              |
2982 +------+-----+-----------------------------------------------------------+
2983 |      |  W  | Running, doing sequential writes.                         |
2984 +------+-----+-----------------------------------------------------------+
2985 |      |  w  | Running, doing random writes.                             |
2986 +------+-----+-----------------------------------------------------------+
2987 |      |  M  | Running, doing mixed sequential reads/writes.             |
2988 +------+-----+-----------------------------------------------------------+
2989 |      |  m  | Running, doing mixed random reads/writes.                 |
2990 +------+-----+-----------------------------------------------------------+
2991 |      |  F  | Running, currently waiting for :manpage:`fsync(2)`        |
2992 +------+-----+-----------------------------------------------------------+
2993 |      |  V  | Running, doing verification of written data.              |
2994 +------+-----+-----------------------------------------------------------+
2995 | E    |     | Thread exited, not reaped by main thread yet.             |
2996 +------+-----+-----------------------------------------------------------+
2997 | _    |     | Thread reaped.                                            |
2998 +------+-----+-----------------------------------------------------------+
2999 | X    |     | Thread reaped, exited with an error.                      |
3000 +------+-----+-----------------------------------------------------------+
3001 | K    |     | Thread reaped, exited due to signal.                      |
3002 +------+-----+-----------------------------------------------------------+
3003
3004 Fio will condense the thread string so as not to take up more space on the
3005 command line than is needed. For instance, if you have 10 readers and 10 writers
3006 running, the output would look like this::
3007
3008     Jobs: 20 (f=20): [R(10),W(10)][4.0%][r=20.5MiB/s,w=23.5MiB/s][r=82,w=94 IOPS][eta 57m:36s]
3009
3010 Fio will still maintain the ordering, though. So the above means that jobs 1..10
3011 are readers, and 11..20 are writers.
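
Because the ordering is preserved, the condensed string can be expanded back
into one status character per job. The Python sketch below assumes every run is
written as ``X(n)``, as in the examples above, and treats anything else as
literal status characters. The function name ``expand_thread_string`` is
hypothetical:

```python
import re


def expand_thread_string(condensed):
    """Expand a condensed status string such as 'R(10),W(10)'.

    Returns one status character per job, in job order.
    """
    states = []
    for part in condensed.split(","):
        match = re.fullmatch(r"(.)\((\d+)\)", part)
        if match:
            # 'R(10)' stands for ten consecutive jobs in state 'R'.
            states.append(match.group(1) * int(match.group(2)))
        else:
            # Assumption: a part without a count is already expanded.
            states.append(part)
    return "".join(states)


expanded = expand_thread_string("R(10),W(10)")
print(expanded)      # prints RRRRRRRRRRWWWWWWWWWW
print(expanded[10])  # status of job 11, prints W
```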
3012
3013 The other values are fairly self-explanatory -- number of threads currently
3014 running and doing I/O, the number of currently open files (f=), the rate of I/O
3015 since last check (read speed listed first, then write speed and optionally trim
3016 speed), and the estimated completion percentage and time for the current
3017 running group. It's impossible to estimate runtime of the following groups (if
3018 any). Note that the string is displayed in order, so it's possible to tell which
3019 of the jobs are currently doing what. The first character is the first job
3020 defined in the job file, and so forth.
3021
3022 When fio is done (or interrupted by :kbd:`ctrl-c`), it will show the data for
3023 each thread, group of threads, and disks in that order. For each data direction,
3024 the output looks like::
3025
3026     Client1 (g=0): err= 0:
3027       write: io=    32MiB, bw=   666KiB/s, iops=89 , runt= 50320msec
3028         slat (msec): min=    0, max=  136, avg= 0.03, stdev= 1.92
3029         clat (msec): min=    0, max=  631, avg=48.50, stdev=86.82
3030         bw (KiB/s) : min=    0, max= 1196, per=51.00%, avg=664.02, stdev=681.68
3031       cpu        : usr=1.49%, sys=0.25%, ctx=7969, majf=0, minf=17
3032       IO depths    : 1=0.1%, 2=0.3%, 4=0.5%, 8=99.0%, 16=0.0%, 32=0.0%, >32=0.0%
3033          submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
3034          complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
3035          issued r/w: total=0/32768, short=0/0
3036          lat (msec): 2=1.6%, 4=0.0%, 10=3.2%, 20=12.8%, 50=38.4%, 100=24.8%,
3037          lat (msec): 250=15.2%, 500=0.0%, 750=0.0%, 1000=0.0%, >=2048=0.0%
3038
3039 The client number is printed, along with the group id and error of that
3040 thread. Below are the I/O statistics, here for writes. In the order listed, they
3041 denote:
3042
3043 **io**
3044                 Number of megabytes of I/O performed.
3045
3046 **bw**
3047                 Average bandwidth rate.
3048
3049 **iops**
3050                 Average I/Os performed per second.
3051
3052 **runt**
3053                 The runtime of that thread.
3054
3055 **slat**
3056                 Submission latency (avg being the average, stdev being the standard
3057                 deviation). This is the time it took to submit the I/O. For sync I/O,
3058                 the slat is really the completion latency, since queue/complete is one
3059                 operation there. This value can be in milliseconds or microseconds, fio
3060                 will choose the most appropriate base and print that. In the example
3061                 above, milliseconds is the best scale. Note: in :option:`--minimal` mode
3062                 latencies are always expressed in microseconds.
3063
3064 **clat**
3065                 Completion latency. Same names as slat, this denotes the time from
3066                 submission to completion of the I/O pieces. For sync I/O, clat will
3067                 usually be equal (or very close) to 0, as the time from submit to
3068                 complete is basically just CPU time (I/O has already been done, see slat
3069                 explanation).
3070
3071 **bw**
3072                 Bandwidth. Same names as the xlat stats, but also includes an
3073                 approximate percentage of total aggregate bandwidth this thread received
3074                 in this group. This last value is only really useful if the threads in
3075                 this group are on the same disk, since they are then competing for disk
3076                 access.
3077
3078 **cpu**
3079                 CPU usage. User and system time, along with the number of context
3080                 switches this thread went through, and finally the number of major
3081                 and minor page faults. The CPU utilization numbers are averages for
3082                 the jobs in that reporting group, while the context and fault
3083                 counters are summed.
3084
3085 **IO depths**
3086                 The distribution of I/O depths over the job life time. The numbers are
3087                 divided into powers of 2, and each entry covers depths from that value
3088                 up to, but not including, the next entry. For example, the 16= entry
3089                 covers depths from 16 to 31.
3090
3091 **IO submit**
3092                 How many pieces of I/O were submitted in a single submit call. Each
3093                 entry denotes that amount and below, until the previous entry -- e.g.,
3094                 8=100% means that we submitted anywhere between 5 and 8 I/Os per
3095                 submit call.
3096
3097 **IO complete**
3098                 Like the above submit number, but for completions instead.
3099
3100 **IO issued**
3101                 The number of read/write requests issued, and how many of them were
3102                 short.
3103
3104 **IO latencies**
3105                 The distribution of I/O completion latencies. This is the time from when
3106                 I/O leaves fio to when it gets completed.  Each entry covers latencies up
3107                 to that value but above the previous entry: 2=1.6% means that 1.6% of the
3108                 I/O completed within 2 msecs, 20=12.8% means that 12.8% of the I/O took
3109                 more than 10 msecs, but less than (or equal to) 20 msecs.
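
The latency bucketing can be illustrated with a short Python sketch. It is not
fio's implementation. The bucket bounds are taken from the millisecond latency
list of the terse format, and the function name ``latency_bucket`` is
hypothetical:

```python
import bisect

# Upper bounds (in msec) of the latency buckets; anything larger than the
# last bound falls into the final '>=2000' bucket.
BOUNDS_MSEC = [2, 4, 10, 20, 50, 100, 250, 500, 750, 1000, 2000]


def latency_bucket(latency_msec):
    """Return the label of the bucket a completion latency is counted in."""
    # bisect_left finds the first bound that is >= the latency, i.e. the
    # bucket covering "more than the previous bound, up to this bound".
    index = bisect.bisect_left(BOUNDS_MSEC, latency_msec)
    if index == len(BOUNDS_MSEC):
        return ">=2000"
    return str(BOUNDS_MSEC[index])


print(latency_bucket(15))  # more than 10, at most 20: prints 20
print(latency_bucket(20))  # the upper bound is inclusive: prints 20
```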
3110
3111 After each client has been listed, the group statistics are printed. They
3112 will look like this::
3113
3114     Run status group 0 (all jobs):
3115        READ: io=64MB, aggrb=22178, minb=11355, maxb=11814, mint=2840msec, maxt=2955msec
3116       WRITE: io=64MB, aggrb=1302, minb=666, maxb=669, mint=50093msec, maxt=50320msec
3117
3118 For each data direction, it prints:
3119
3120 **io**
3121                 Number of megabytes of I/O performed.
3122 **aggrb**
3123                 Aggregate bandwidth of threads in this group.
3124 **minb**
3125                 The minimum average bandwidth a thread saw.
3126 **maxb**
3127                 The maximum average bandwidth a thread saw.
3128 **mint**
3129                 The smallest runtime of the threads in that group.
3130 **maxt**
3131                 The longest runtime of the threads in that group.
3132
3133 And finally, the disk statistics are printed. They will look like this::
3134
3135   Disk stats (read/write):
3136     sda: ios=16398/16511, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00%
3137
3138 Each value is printed for both reads and writes, with reads first. The
3139 numbers denote:
3140
3141 **ios**
3142                 Number of I/Os performed by all groups.
3143 **merge**
3144                 Number of merges performed by the I/O scheduler.
3145 **ticks**
3146                 Number of ticks we kept the disk busy.
3147 **in_queue**
3148                 Total time spent in the disk queue.
3149 **util**
3150                 The disk utilization. A value of 100% means we kept the disk
3151                 busy constantly, 50% would be a disk idling half of the time.
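
For scripts that scrape the normal output, a disk statistics line can be split
into its fields as sketched below in Python. This assumes the exact
``key=value`` layout shown in the example above; the function name
``parse_disk_stats`` is hypothetical, and the terse output is usually the
better choice for machine parsing:

```python
def parse_disk_stats(line):
    """Parse one disk statistics line into (disk name, field dictionary).

    Values printed as 'read/write' become a tuple of two integers, the
    utilization becomes a float percentage, everything else an integer.
    """
    name, _, rest = line.strip().partition(":")
    stats = {}
    for field in rest.split(","):
        key, _, value = field.strip().partition("=")
        if "/" in value:
            read, write = value.split("/")
            stats[key] = (int(read), int(write))
        elif value.endswith("%"):
            stats[key] = float(value.rstrip("%"))
        else:
            stats[key] = int(value)
    return name, stats


sample = ("sda: ios=16398/16511, merge=30/162, ticks=6853/819634, "
          "in_queue=826487, util=100.00%")
disk, fields = parse_disk_stats(sample)
print(disk, fields["ios"], fields["util"])  # prints sda (16398, 16511) 100.0
```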
3152
3153 It is also possible to get fio to dump the current output while it is running,
3154 without terminating the job. To do that, send fio the **USR1** signal.  You can
3155 also get regularly timed dumps by using the :option:`--status-interval`
3156 parameter, or by creating a file in :file:`/tmp` named
3157 :file:`fio-dump-status`. If fio sees this file, it will unlink it and dump the
3158 current output status.
3159
3160
3161 Terse output
3162 ------------
3163
3164 For scripted usage where you typically want to generate tables or graphs of the
3165 results, fio can output the results in a semicolon separated format.  The format
3166 is one long line of values, such as::
3167
3168     2;card0;0;0;7139336;121836;60004;1;10109;27.932460;116.933948;220;126861;3495.446807;1085.368601;226;126864;3523.635629;1089.012448;24063;99944;50.275485%;59818.274627;5540.657370;7155060;122104;60004;1;8338;29.086342;117.839068;388;128077;5032.488518;1234.785715;391;128085;5061.839412;1236.909129;23436;100928;50.287926%;59964.832030;5644.844189;14.595833%;19.394167%;123706;0;7313;0.1%;0.1%;0.1%;0.1%;0.1%;0.1%;100.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.01%;0.02%;0.05%;0.16%;6.04%;40.40%;52.68%;0.64%;0.01%;0.00%;0.01%;0.00%;0.00%;0.00%;0.00%;0.00%
3169     A description of this job goes here.
3170
3171 The job description (if provided) follows on a second line.
3172
3173 To enable terse output, use the :option:`--minimal` command line option. The
3174 first value is the version of the terse output format. If the output has to be
3175 changed for some reason, this number will be incremented by 1 to signify that
3176 change.
3177
3178 Split up, the format is as follows (comments in brackets denote when a
3179 field was introduced or whether it's specific to some terse version):
3180
3181     ::
3182
3183         terse version, fio version [v3], jobname, groupid, error
3184
3185     READ status::
3186
3187         Total IO (KiB), bandwidth (KiB/sec), IOPS, runtime (msec)
3188         Submission latency: min, max, mean, stdev (usec)
3189         Completion latency: min, max, mean, stdev (usec)
3190         Completion latency percentiles: 20 fields (see below)
3191         Total latency: min, max, mean, stdev (usec)
3192         Bw (KiB/s): min, max, aggregate percentage of total, mean, stdev, number of samples [v5]
3193         IOPS [v5]: min, max, mean, stdev, number of samples
3194
3195     WRITE status:
3196
3197     ::
3198
3199         Total IO (KiB), bandwidth (KiB/sec), IOPS, runtime (msec)
3200         Submission latency: min, max, mean, stdev (usec)
3201         Completion latency: min, max, mean, stdev (usec)
3202         Completion latency percentiles: 20 fields (see below)
3203         Total latency: min, max, mean, stdev (usec)
3204         Bw (KiB/s): min, max, aggregate percentage of total, mean, stdev, number of samples [v5]
3205         IOPS [v5]: min, max, mean, stdev, number of samples
3206
3207     TRIM status [all but version 3]:
3208
3209         Fields are similar to READ/WRITE status.
3210
3211     CPU usage::
3212
3213         user, system, context switches, major faults, minor faults
3214
3215     I/O depths::
3216
3217         <=1, 2, 4, 8, 16, 32, >=64
3218
3219     I/O latencies microseconds::
3220
3221         <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000
3222
3223     I/O latencies milliseconds::
3224
3225         <=2, 4, 10, 20, 50, 100, 250, 500, 750, 1000, 2000, >=2000
3226
3227     Disk utilization [v3]::
3228
3229         Disk name, Read ios, write ios,
3230         Read merges, write merges,
3231         Read ticks, write ticks,
3232         Time spent in queue, disk utilization percentage
3233
3234     Additional Info (dependent on continue_on_error, default off)::
3235
3236         total # errors, first error code
3237
3238     Additional Info (dependent on description being set)::
3239
3240         Text description
3241
3242 Completion latency percentiles can be a grouping of up to 20 sets, so for the
3243 terse output fio writes all of them. Each field will look like this::
3244
3245         1.00%=6112
3246
3247 which is the Xth percentile, and the `usec` latency associated with it.
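
A single percentile field can be taken apart as follows. This is a minimal
Python sketch based only on the ``1.00%=6112`` layout shown above; the function
name ``parse_percentile_field`` is hypothetical:

```python
def parse_percentile_field(field):
    """Split a terse percentile field into (percentile, latency in usec)."""
    percentile, _, latency = field.partition("=")
    return float(percentile.rstrip("%")), int(latency)


print(parse_percentile_field("1.00%=6112"))  # prints (1.0, 6112)
```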
3248
3249 For disk utilization, all disks used by fio are shown. So for each disk there
3250 will be a disk utilization section.
3251
3252 Below is a single line containing short names for each of the fields in the
3253 minimal output v3, separated by semicolons:
3254
3255 terse_version_3;fio_version;jobname;groupid;error;read_kb;read_bandwidth;read_iops;read_runtime_ms;read_slat_min;read_slat_max;read_slat_mean;read_slat_dev;read_clat_min;read_clat_max;read_clat_mean;read_clat_dev;read_clat_pct01;read_clat_pct02;read_clat_pct03;read_clat_pct04;read_clat_pct05;read_clat_pct06;read_clat_pct07;read_clat_pct08;read_clat_pct09;read_clat_pct10;read_clat_pct11;read_clat_pct12;read_clat_pct13;read_clat_pct14;read_clat_pct15;read_clat_pct16;read_clat_pct17;read_clat_pct18;read_clat_pct19;read_clat_pct20;read_tlat_min;read_lat_max;read_lat_mean;read_lat_dev;read_bw_min;read_bw_max;read_bw_agg_pct;read_bw_mean;read_bw_dev;write_kb;write_bandwidth;write_iops;write_runtime_ms;write_slat_min;write_slat_max;write_slat_mean;write_slat_dev;write_clat_min;write_clat_max;write_clat_mean;write_clat_dev;write_clat_pct01;write_clat_pct02;write_clat_pct03;write_clat_pct04;write_clat_pct05;write_clat_pct06;write_clat_pct07;write_clat_pct08;write_clat_pct09;write_clat_pct10;write_clat_pct11;write_clat_pct12;write_clat_pct13;write_clat_pct14;write_clat_pct15;write_clat_pct16;write_clat_pct17;write_clat_pct18;write_clat_pct19;write_clat_pct20;write_tlat_min;write_lat_max;write_lat_mean;write_lat_dev;write_bw_min;write_bw_max;write_bw_agg_pct;write_bw_mean;write_bw_dev;cpu_user;cpu_sys;cpu_csw;cpu_mjf;cpu_minf;iodepth_1;iodepth_2;iodepth_4;iodepth_8;iodepth_16;iodepth_32;iodepth_64;lat_2us;lat_4us;lat_10us;lat_20us;lat_50us;lat_100us;lat_250us;lat_500us;lat_750us;lat_1000us;lat_2ms;lat_4ms;lat_10ms;lat_20ms;lat_50ms;lat_100ms;lat_250ms;lat_500ms;lat_750ms;lat_1000ms;lat_2000ms;lat_over_2000ms;disk_name;disk_read_iops;disk_write_iops;disk_read_merges;disk_write_merges;disk_read_ticks;write_ticks;disk_queue_time;disk_util
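
One way to use the header line is to pair it with a line of terse output. The
Python sketch below assumes both strings use the same terse version. The
function name ``label_terse_fields`` is hypothetical. Values beyond the end of
the header (for example the sections of additional disks, or the error
information) are returned separately rather than guessed at:

```python
def label_terse_fields(header, line):
    """Pair semicolon-separated field names with the values of a terse line.

    Returns (labeled, extra): a dict of name -> value for the fields the
    header covers, and a list of any remaining, unlabeled values.
    """
    names = header.strip().split(";")
    values = line.strip().split(";")
    labeled = dict(zip(names, values))
    extra = values[len(names):]
    return labeled, extra
```

All values are kept as strings; converting them is left to the caller, since
some fields carry a ``%`` suffix and others are plain numbers.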
3256
3257
3258 Trace file format
3259 -----------------
3260
3261 There are two trace file formats that you can encounter. The older (v1) format is
3262 unsupported since version 1.20-rc3 (March 2008). It will still be described
3263 below in case you get an old trace and want to understand it.
3264
3265 In any case the trace is a simple text file with a single action per line.
3266
3267
3268 Trace file format v1
3269 ~~~~~~~~~~~~~~~~~~~~
3270
3271 Each line represents a single I/O action in the following format::
3272
3273         rw, offset, length
3274
3275 where `rw=0/1` for read/write, and the offset and length entries are in bytes.
3276
3277 This format is not supported in fio versions >= 1.20-rc3.
3278
3279
3280 Trace file format v2
3281 ~~~~~~~~~~~~~~~~~~~~
3282
3283 The second version of the trace file format was added in fio version 1.17.  It
3284 allows access to more than one file per trace and has a bigger set of possible
3285 file actions.
3286
3287 The first line of the trace file has to be::
3288
3289     fio version 2 iolog
3290
3291 Following this can be lines in two different formats, which are described below.
3292
3293 The file management format::
3294
3295     filename action
3296
3297 The filename is given as an absolute path. The action can be one of these:
3298
3299 **add**
3300                 Add the given filename to the trace.
3301 **open**
3302                 Open the file with the given filename. The filename has to have
3303                 been added with the **add** action before.
3304 **close**
3305                 Close the file with the given filename. The file has to have been
3306                 opened before.
3307
3308
3309 The file I/O action format::
3310
3311     filename action offset length
3312
3313 The `filename` is given as an absolute path, and has to have been added and
3314 opened before it can be used with this format. The `offset` and `length` are
3315 given in bytes. The `action` can be one of these:
3316
3317 **wait**
3318            Wait for `offset` microseconds. Everything below 100 is discarded.
3319            The time is relative to the previous `wait` statement.
3320 **read**
3321            Read `length` bytes beginning from `offset`.
3322 **write**
3323            Write `length` bytes beginning from `offset`.
3324 **sync**
3325            :manpage:`fsync(2)` the file.
3326 **datasync**
3327            :manpage:`fdatasync(2)` the file.
3328 **trim**
3329            Trim the given file from the given `offset` for `length` bytes.
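
Putting the two line formats together, a v2 trace can be generated with a few
lines of Python. This is a sketch under stated assumptions: the function name
``build_iolog_v2`` and the path :file:`/tmp/testfile` are hypothetical, and
only a single file with actions that take an offset and a length is handled:

```python
def build_iolog_v2(filename, actions):
    """Build the text of a version 2 trace for a single file.

    `actions` is a list of (action, offset, length) tuples, for example
    ("write", 0, 4096). The file is added and opened before the first
    action and closed after the last one.
    """
    lines = ["fio version 2 iolog", f"{filename} add", f"{filename} open"]
    for action, offset, length in actions:
        lines.append(f"{filename} {action} {offset} {length}")
    lines.append(f"{filename} close")
    return "\n".join(lines) + "\n"


trace = build_iolog_v2("/tmp/testfile", [("write", 0, 4096), ("read", 0, 4096)])
print(trace, end="")
```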
3330
3331 CPU idleness profiling
3332 ----------------------
3333
3334 In some cases, we want to understand CPU overhead in a test. For example, we
3335 test patches specifically to see whether they reduce CPU usage.
3336 Fio implements a balloon approach to create a thread per CPU that runs at idle
3337 priority, meaning that it only runs when nobody else needs the CPU.
3338 By measuring the amount of work completed by the thread, idleness of each CPU
3339 can be derived accordingly.
3340
3341 A unit of work is defined as touching a full page of unsigned characters. The
3342 mean and standard deviation of the time to complete a unit of work are reported
3343 in the "unit work" section. Options can be chosen to report detailed per-CPU
3344 idleness or overall system idleness by aggregating per-CPU stats.
3345
3346
3347 Verification and triggers
3348 -------------------------
3349
3350 Fio is usually run in one of two ways, when data verification is done. The first
3351 is a normal write job of some sort with verify enabled. When the write phase has
3352 completed, fio switches to reads and verifies everything it wrote. The second
3353 model is running just the write phase, and then later on running the same job
3354 (but with reads instead of writes) to repeat the same I/O patterns and verify
3355 the contents. Both of these methods depend on the write phase being completed,
3356 as fio otherwise has no idea how much data was written.
3357
3358 With verification triggers, fio supports dumping the current write state to
3359 local files. Then a subsequent read verify workload can load this state and know
3360 exactly where to stop. This is useful for testing cases where power is cut to a
3361 server in a managed fashion, for instance.
3362
3363 A verification trigger consists of two things:
3364
3365 1) Storing the write state of each job.
3366 2) Executing a trigger command.
3367
3368 The write state is relatively small, on the order of hundreds of bytes to single
3369 kilobytes. It contains information on the number of completions done, the last X
3370 completions, etc.
3371
3372 A trigger is invoked either through creation ('touch') of a specified file in
3373 the system, or through a timeout setting. If fio is run with
3374 :option:`--trigger-file`\= :file:`/tmp/trigger-file`, then it will continually
3375 check for the existence of :file:`/tmp/trigger-file`. When it sees this file, it
3376 will fire off the trigger (thus saving state, and executing the trigger
3377 command).
3378
3379 For client/server runs, there's both a local and remote trigger. If fio is
3380 running as a server backend, it will send the job states back to the client for
3381 safe storage, then execute the remote trigger, if specified. If a local trigger
3382 is specified, the server will still send back the write state, but the client
3383 will then execute the trigger.
3384
3385 Verification trigger example
3386 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3387
3388 Let's say we want to run a powercut test on the remote machine 'server'.  Our
3389 write workload is in :file:`write-test.fio`. We want to cut power to 'server' at
3390 some point during the run, and we'll run this test from the safety of our local
3391 machine, 'localbox'. On the server, we'll start the fio backend normally::
3392
3393         server# fio --server
3394
3395 and on the client, we'll fire off the workload::
3396
3397         localbox$ fio --client=server --trigger-file=/tmp/my-trigger --trigger-remote="bash -c \"echo b > /proc/sysrq-trigger\""
3398
3399 We set :file:`/tmp/my-trigger` as the trigger file, and we tell fio to execute::
3400
3401         echo b > /proc/sysrq-trigger
3402
3403 on the server once it has received the trigger and sent us the write state. This
3404 will work, but it's not **really** cutting power to the server, it's merely
3405 abruptly rebooting it. If we have a remote way of cutting power to the server
3406 through IPMI or similar, we could do that through a local trigger command
3407 instead. Let's assume we have a script that does IPMI reboot of a given hostname,
3408 ipmi-reboot. On localbox, we could then have run fio with a local trigger
3409 instead::
3410
3411         localbox$ fio --client=server --trigger-file=/tmp/my-trigger --trigger="ipmi-reboot server"
3412
3413 For this case, fio would wait for the server to send us the write state, then
3414 execute ``ipmi-reboot server`` when that happened.
3415
3416 Loading verify state
3417 ~~~~~~~~~~~~~~~~~~~~
3418
3419 To load stored write state, a read verification job file must contain the
3420 :option:`verify_state_load` option. If that is set, fio will load the previously
3421 stored state. For a local fio run this is done by loading the files directly,
3422 and on a client/server run, the server backend will ask the client to send the
3423 files over and load them from there.
3424
3425
3426 Log File Formats
3427 ----------------
3428
3429 Fio supports a variety of log file formats, for logging latencies, bandwidth,
3430 and IOPS. The logs share a common format, which looks like this:
3431
3432     *time* (`msec`), *value*, *data direction*, *offset*
3433
3434 Time for the log entry is always in milliseconds. The *value* logged depends
3435 on the type of log and is one of the following:
3436
3437     **Latency log**
3438                 Value is latency in usecs
3439     **Bandwidth log**
3440                 Value is in KiB/sec
3441     **IOPS log**
3442                 Value is IOPS
3443
3444 *Data direction* is one of the following:
3445
3446         **0**
3447                 I/O is a READ
3448         **1**
3449                 I/O is a WRITE
3450         **2**
3451                 I/O is a TRIM
3452
3453 The *offset* is the offset, in bytes, from the start of the file, for that
3454 particular I/O. The logging of the offset can be toggled with
3455 :option:`log_offset`.
3456
3457 If windowed logging is enabled through :option:`log_avg_msec` then fio doesn't
3458 log individual I/Os. Instead, it logs the average values over the specified period
3459 of time. Since 'data direction' and 'offset' are per-I/O values, they aren't
3460 applicable if windowed logging is enabled. If windowed logging is enabled and
3461 :option:`log_max_value` is set, then fio logs maximum values in that window
3462 instead of averages.
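
As a minimal sketch of how such a log could be consumed, the following Python
snippet parses one line of the common format described above. It assumes the
four-field layout given in this section (other fio versions may emit additional
columns), and the names ``LogEntry`` and ``parse_log_line`` as well as the sample
line are made up for illustration; they are not part of fio.

```python
from dataclasses import dataclass
from typing import Optional

# Data direction codes as listed in this section.
DIRECTIONS = {0: "READ", 1: "WRITE", 2: "TRIM"}


@dataclass
class LogEntry:
    time_msec: int          # time of the log entry, in milliseconds
    value: int              # latency, bandwidth or IOPS, depending on the log type
    direction: str          # READ, WRITE or TRIM
    offset: Optional[int]   # byte offset, or None if offset logging is off


def parse_log_line(line: str) -> LogEntry:
    """Parse one 'time, value, direction[, offset]' line (hypothetical helper)."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 fields, got {len(fields)}: {line!r}")
    time_msec, value, ddir = (int(field) for field in fields[:3])
    # The offset column is only present when offset logging is enabled.
    offset = int(fields[3]) if len(fields) > 3 else None
    direction = DIRECTIONS.get(ddir, f"UNKNOWN({ddir})")
    return LogEntry(time_msec, value, direction, offset)


# Made-up sample line: a WRITE at offset 8192, logged 1500 msec into the run.
entry = parse_log_line("1500, 212, 1, 8192")
```

Note that with windowed logging the direction and offset columns carry no
per-I/O meaning, so a consumer of averaged logs should only rely on the time
and value fields.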
3463
3464
3465 Client/server
3466 -------------
3467
3468 Normally fio is invoked as a stand-alone application on the machine where the
3469 I/O workload should be generated. However, the frontend and backend of fio can
3470 be run separately, i.e. the fio server can generate an I/O workload on the "Device
3471 Under Test" while being controlled from another machine.
3472
3473 Start the server on the machine which has access to the storage DUT::
3474
3475         fio --server=args
3476
3477 where args defines what fio listens on. The arguments are of the form
3478 ``type:hostname,port`` or ``type:IP,port``. *type* is either ``ip`` (or ``ip4``) for TCP/IP
3479 v4, ``ip6`` for TCP/IP v6, or ``sock`` for a local unix domain socket.
3480 *hostname* is either a hostname or IP address, and *port* is the port to listen
3481 on (only valid for TCP/IP, not a local socket). Some examples:
3482
3483 1) ``fio --server``
3484
3485    Start a fio server, listening on all interfaces on the default port (8765).
3486
3487 2) ``fio --server=ip:hostname,4444``
3488
3489    Start a fio server, listening on IP belonging to hostname and on port 4444.
3490
3491 3) ``fio --server=ip6:::1,4444``
3492
3493    Start a fio server, listening on IPv6 localhost ::1 and on port 4444.
3494
3495 4) ``fio --server=,4444``
3496
3497    Start a fio server, listening on all interfaces on port 4444.
3498
3499 5) ``fio --server=1.2.3.4``
3500
3501    Start a fio server, listening on IP 1.2.3.4 on the default port.
3502
3503 6) ``fio --server=sock:/tmp/fio.sock``
3504
3505    Start a fio server, listening on the local socket /tmp/fio.sock.
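
To make the argument format in the examples above concrete, here is a rough
Python sketch that splits a ``--server`` argument into type, host and port. It
is not fio's own parser; ``ListenSpec`` and ``parse_server_arg`` are
hypothetical names, and treating a missing type prefix as ``ip`` is an
assumption made for this illustration only.

```python
from typing import NamedTuple, Optional

DEFAULT_PORT = 8765  # default port mentioned in example 1 above
KNOWN_TYPES = {"ip", "ip4", "ip6", "sock"}


class ListenSpec(NamedTuple):
    kind: str             # "ip", "ip4", "ip6" or "sock"
    host: Optional[str]   # hostname, IP, or socket path; None means all interfaces
    port: Optional[int]   # None for local sockets


def parse_server_arg(arg: str = "") -> ListenSpec:
    """Split a --server argument into its parts (illustrative only)."""
    kind, rest = "ip", arg  # assumption: no prefix means plain TCP/IP
    prefix, sep, remainder = arg.partition(":")
    if sep and prefix in KNOWN_TYPES:
        kind, rest = prefix, remainder
    if kind == "sock":
        # Local sockets take a path and have no port.
        return ListenSpec("sock", rest, None)
    # The port, if any, follows the last comma; IPv6 addresses contain colons,
    # which is why the port separator is a comma.
    host, sep, port = rest.rpartition(",")
    if not sep:
        host, port = rest, ""
    return ListenSpec(kind, host or None, int(port) if port else DEFAULT_PORT)


spec = parse_server_arg("ip6:::1,4444")
```

With this sketch, ``parse_server_arg("ip6:::1,4444")`` yields type ``ip6``, host
``::1`` and port 4444, matching example 3.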
3506
3507 Once a server is running, a "client" can connect to the fio server with::
3508
3509         fio <local-args> --client=<server> <remote-args> <job file(s)>
3510
3511 where `local-args` are arguments for the client where it is running, `server`
3512 is the connect string, and `remote-args` and `job file(s)` are sent to the
3513 server. The `server` string follows the same format as it does on the server
3514 side, to allow IP/hostname/socket and port strings.
3515
3516 Fio can connect to multiple servers this way::
3517
3518     fio --client=<server1> <job file(s)> --client=<server2> <job file(s)>
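
When driving several servers from a script, the ordering shown above (local
arguments first, then each ``--client`` followed by the job files meant for that
server) can be assembled programmatically. The following is a small sketch
under that assumption; ``build_client_argv`` is a hypothetical helper, and the
server and job file names are placeholders.

```python
import shlex
from typing import Iterable, List, Sequence, Tuple


def build_client_argv(
    targets: Iterable[Tuple[str, Sequence[str]]],
    local_args: Sequence[str] = (),
) -> List[str]:
    """Build a fio client command line: local args, then per-server job files."""
    argv = ["fio", *local_args]
    for server, job_files in targets:
        # Each --client option is followed by the job files sent to that server.
        argv.append(f"--client={server}")
        argv.extend(job_files)
    return argv


# Placeholder server names and job files, for illustration only.
argv = build_client_argv([("server1", ["a.fio"]), ("server2", ["b.fio"])])
command = shlex.join(argv)
```

The resulting list can be passed to ``subprocess.run`` directly, which avoids
shell quoting issues; ``shlex.join`` is used here only to obtain a printable
command string.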
3519
3520 If the job file is located on the fio server, then you can tell the server to
3521 load a local file as well. This is done by using :option:`--remote-config` ::
3522
3523    fio --client=server --remote-config /path/to/file.fio
3524
3525 Then fio will open this local (to the server) job file instead of being passed
3526 one from the client.
3527
3528 If you have many servers (example: 100 VMs/containers), you can pass the pathname
3529 of a file containing host IPs/names as the parameter value for the
3530 :option:`--client` option.  For example, here is a :file:`host.list`
3531 file containing 2 hostnames::
3532
3533         host1.your.dns.domain
3534         host2.your.dns.domain
3535
3536 The fio command would then be::
3537
3538     fio --client=host.list <job file(s)>
3539
3540 In this mode, you cannot specify server-specific parameters or job files -- all
3541 servers receive the same job file.
3542
3543 In order to let ``fio --client`` runs use a shared filesystem from multiple
3544 hosts, ``fio --client`` now prepends the IP address of the server to the
3545 filename.  For example, if fio is using the directory :file:`/mnt/nfs/fio` and is
3546 writing filename :file:`fileio.tmp`, with a :option:`--client` `hostfile`
3547 containing two hostnames ``h1`` and ``h2`` with IP addresses 192.168.10.120 and
3548 192.168.10.121, then fio will create two files::
3549
3550         /mnt/nfs/fio/192.168.10.120.fileio.tmp
3551         /mnt/nfs/fio/192.168.10.121.fileio.tmp
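
The naming scheme in this example can be sketched in a few lines of Python,
which may help when post-processing the per-server files. This only mirrors
the example above and is not fio's implementation; ``per_server_filename`` is a
hypothetical helper name.

```python
import posixpath


def per_server_filename(directory: str, server_ip: str, filename: str) -> str:
    """Return the path with the server IP prepended to the file name."""
    return posixpath.join(directory, f"{server_ip}.{filename}")


# The two servers from the example above.
paths = [
    per_server_filename("/mnt/nfs/fio", ip, "fileio.tmp")
    for ip in ("192.168.10.120", "192.168.10.121")
]
```

Because each server writes to a distinct file name, the hosts do not overwrite
each other's data on the shared filesystem.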