fio resides in a git repo, the canonical place is:
-git://brick.kernel.dk/data/git/fio.git
+git://git.kernel.dk/fio.git
+
+If you are inside a corporate firewall, git:// may not always work for
+you. In that case you can use the http protocol; the path is the same:
+
+http://git.kernel.dk/fio.git
Snapshots are frequently generated and they include the git meta data as
well. You can download them here:
http://brick.kernel.dk/snaps/
-Pascal Bleser <guru@unixtech.be> has fio RPMs in his repository, you
-can find them here:
+Binary packages
+---------------
+
+Debian:
+Starting with Debian "Squeeze", fio packages are part of the official
+Debian repository. http://packages.debian.org/search?keywords=fio
+
+Ubuntu:
+Starting with Ubuntu 10.04 LTS (aka "Lucid Lynx"), fio packages are part
+of the Ubuntu "universe" repository.
+http://packages.ubuntu.com/search?keywords=fio
+
+SUSE:
+Pascal Bleser <guru@unixtech.be> has fio RPMs in his repository for SUSE
+variants, you can find them here:
http://linux01.gwdg.de/~pbleser/rpm-navigation.php?cat=System/fio
+Red Hat, CentOS & Co:
+Dag Wieërs has RPMs for Red Hat related distros, find them here:
+http://dag.wieers.com/rpm/packages/fio/
+
+Mandriva:
+Mandriva has integrated fio into their package repository, so installing
+on that distro should be as easy as typing 'urpmi fio'.
+
+Solaris:
+Packages for Solaris are available from OpenCSW. Install their pkgutil
+tool (http://www.opencsw.org/get-it/pkgutil/) and then install fio via
+'pkgutil -i fio'.
+
+Windows:
+Bruce Cran <bruce@cran.org.uk> has fio packages for Windows at
+http://www.bluestop.org/fio .
+
+
+Mailing list
+------------
+
+There's a mailing list associated with fio. It's meant for general
+discussion, bug reporting, questions, and development - basically anything
+that has to do with fio. A mail detailing recent commits is
+automatically sent to the list at most daily. The list address is
+fio@vger.kernel.org, subscribe by sending an email to
+majordomo@vger.kernel.org with
+
+subscribe fio
+
+in the body of the email. Archives can be found here:
+
+http://www.spinics.net/lists/fio/
+
+and archives for the old list can be found here:
+
+http://maillist.kernel.dk/fio-devel/
+
Building
--------
-Just type 'make' and 'make install'. If on FreeBSD, for now you have to
-specify the FreeBSD Makefile with -f, eg:
+Just type 'make' and 'make install'.
+
+Note that GNU make is required. On BSD it's available from devel/gmake;
+on Solaris it's in the SUNWgmake package. On platforms where GNU make
+isn't the default, type 'gmake' instead of 'make'.
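
As a minimal sketch of the note above, a build script could pick whichever make binary is present. The variable name MAKE_CMD is only illustrative, and the check assumes that a gmake binary found on the PATH is GNU make:

```shell
# Prefer gmake where it exists (BSD, Solaris); otherwise fall back to
# plain make, which is assumed to be GNU make (as on most Linux systems).
if command -v gmake >/dev/null 2>&1; then
    MAKE_CMD=gmake
else
    MAKE_CMD=make
fi

# Only print the commands; run them by hand from the fio source directory.
echo "Build with: $MAKE_CMD && $MAKE_CMD install"
```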
+
+If your compile fails with an error like this:
+
+ CC gettime.o
+In file included from fio.h:23,
+ from gettime.c:8:
+os/os.h:15:20: error: libaio.h: No such file or directory
+In file included from gettime.c:8:
+fio.h:119: error: field 'iocb' has incomplete type
+make: *** [gettime.o] Error 1
-$ make -f Makefile.Freebsd && make -f Makefile.FreeBSD install
+Check that you have the libaio development package installed. On RPM
+based distros, it's typically called libaio-devel.
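
A quick check for the header before building might look like the sketch below. The /usr/include path is an assumption about a typical Linux layout; the header may live elsewhere on your system:

```shell
# Look for the libaio header in the conventional location (assumed path).
header=/usr/include/libaio.h
if [ -f "$header" ]; then
    libaio_status=found
else
    libaio_status=missing
fi
echo "libaio.h: $libaio_status"
```

If the header is reported missing, install the libaio development package and run the build again.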
-Likewise with OpenSolaris, use the Makefile.solaris to compile there.
-This might change in the future if I opt for an autoconf type setup.
+
+Windows
+-------
+
+On Windows, MinGW (http://www.mingw.org/) is required in order to
+build fio. To create an MSI installer package, install WiX 3.6 from
+http://wix.sourceforge.net/releases/ and run dobuild.cmd from the
+os/windows directory.
Command line
------------
$ fio
- -t <sec> Runtime in seconds
- -l Generate per-job latency logs
- -w Generate per-job bandwidth logs
- -o <file> Log output to file
- -m Minimal (terse) output
- -h Print help info
- -v Print version information and exit
-
-Any parameters following the options will be assumed to be job files.
-You can add as many as you want, each job file will be regarded as a
-separate group and fio will stonewall it's execution.
+ --debug Enable some debugging options (see below)
+ --output Write output to file
+ --runtime Runtime in seconds
+ --latency-log Generate per-job latency logs
+ --bandwidth-log Generate per-job bandwidth logs
+ --minimal Minimal (terse) output
+ --version Print version info and exit
+ --terse-version=type Terse version output format (default 3, or 2 or 4).
+ --help Print this page
+ --cmdhelp=cmd Print command help, "all" for all of them
+ --enghelp=engine Print ioengine help, or list available ioengines
+ --enghelp=engine,cmd Print help for an ioengine cmd
+ --showcmd Turn a job file into command line options
+ --readonly Turn on safety read-only checks, preventing
+ writes
+ --eta=when When ETA estimate should be printed
+ May be "always", "never" or "auto"
+ --section=name Only run specified section in job file.
+ Multiple sections can be specified.
+ --alloc-size=kb Set smalloc pool to this size in kb (def 1024)
+ --warnings-fatal Fio parser warnings are fatal
+ --max-jobs Maximum number of threads/processes to support
+ --server=args Start backend server. See Client/Server section.
+ --client=host Connect to specified backend.
+
+
+Any parameters following the options will be assumed to be job files,
+unless they match a job file parameter. You can add as many as you want,
+each job file will be regarded as a separate group and fio will stonewall
+its execution.
+
+The --readonly switch is an extra safety guard to prevent accidentally
+turning on a write setting when that is not desired. Fio will only write
+if rw=write/randwrite/rw/randrw is given, but this extra safety net can
+be used as an extra precaution. It will also enable a write check in the
+io engine core to prevent an accidental write due to a fio bug.
+
+The debug switch allows adding options that trigger certain logging
+options in fio. Currently the options are:
+
+ process Dump info related to processes
+ file Dump info related to file actions
+ io Dump info related to IO queuing
+ mem Dump info related to memory allocations
+ blktrace Dump info related to blktrace setup
+ verify Dump info related to IO verification
+ all Enable all debug options
+ random Dump info related to random offset generation
+ parse Dump info related to option matching and parsing
+ diskutil Dump info related to disk utilization updates
+ job:x Dump info only related to job number x
+ mutex Dump info only related to mutex up/down ops
+ profile Dump info related to profile extensions
+ time Dump info related to internal time keeping
+ ? or help Show available debug options.
+
+You can specify as many as you want, eg --debug=file,mem will enable
+file and memory debugging.
+
+The section switch is meant to make it easier to ship a bigger job file
+instead of several smaller ones. Say you define a job file with light,
+moderate, and heavy parts. Then you can ask fio to run the given part
+only by giving it a --section=heavy command line option. The section
+option only applies to job sections; the reserved 'global' section is
+always parsed and taken into account.
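
The light/moderate/heavy layout described above could be sketched as follows. The section names come from the text; the option values and the file name parts.fio are made up for illustration. The script only writes the job file and prints the command, it does not start fio:

```shell
# Write a job file with three job sections to a temporary directory.
workdir=$(mktemp -d)
jobfile="$workdir/parts.fio"
cat > "$jobfile" <<'EOF'
[global]
rw=randread
direct=1

[light]
size=16m

[moderate]
size=128m

[heavy]
size=1g
numjobs=4
EOF

# Print the command that would run only the 'heavy' section.
# The [global] options still apply to it.
echo "fio --section=heavy $jobfile"
```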
+
+Fio has an internal allocator for shared memory called smalloc. It
+allocates shared structures from this pool. The pool defaults to 1024k
+in size, and can grow to 128 pools. If running large jobs with randommap
+enabled it can run out of memory, in which case the --alloc-size switch
+is handy for starting with a larger pool size. The backing store is
+files in /tmp. Fio cleans up after itself, but while it is running you
+may see .fio_smalloc.* files in /tmp.
Job file
--------
-Only a few options can be controlled with command line parameters,
-generally it's a lot easier to just write a simple job file to describe
-the workload. The job file format is in the ini style format, as it's
-easy to read and write for the user.
+See the HOWTO file for a more detailed description of parameters and what
+they mean. This file contains the terse version. You can describe big and
+complex setups with the command line, but generally it's a lot easier to
+just write a simple job file to describe the workload. The job file format
+is in the ini style format, as that is easy to read and write for the user.
The job file parameters are:
name=x Use 'x' as the identifier for this job.
+ description=x 'x' is a text description of the job.
directory=x Use 'x' as the top level directory for storing files
+ filename=x Force the use of 'x' as the filename for all files
+ in this thread. If not given, fio will make up
+ a suitable filename based on the thread and file
+ number.
rw=x 'x' may be: read, randread, write, randwrite,
rw (read-write mix), randrw (read-write random mix)
rwmixcycle=x Base cycle for switching between read and write
across runs, if 'x' is 1.
size=x Set file size to x bytes (x string can include k/m/g)
ioengine=x 'x' may be: aio/libaio/linuxaio for Linux aio,
- posixaio for POSIX aio, sync for regular read/write io,
- mmap for mmap'ed io, splice for using splice/vmsplice,
- or sgio for direct SG_IO io. The latter only works on
- Linux on SCSI (or SCSI-like devices, such as
- usb-storage or sata/libata driven) devices.
+ posixaio for POSIX aio, solarisaio for Solaris
+ native async IO, windowsaio for Windows native async IO,
+ sync for regular read/write io,
+ psync for regular pread/pwrite io, vsync for regular
+ readv/writev (with queuing emulation), mmap for mmap'ed
+ io, syslet-rw for syslet driven read/write, splice for
+ using splice/vmsplice, sg for direct SG_IO io, net
+ for network io, or cpuio for a cycle burner load. sg
+ only works on Linux on SCSI (or SCSI-like devices, such
+ as usb-storage or sata/libata driven) devices. Fio also
+ has a null io engine, which is mainly used for testing
+ fio itself.
+
iodepth=x For async io, allow 'x' ios in flight
overwrite=x If 'x', layout a write file first.
nrfiles=x Spread io load over 'x' number of files per job,
also include k/m postfix.
direct=x 1 for direct IO, 0 for buffered IO
thinktime=x "Think" x usec after each io
- rate=x Throttle rate to x KiB/sec
- ratemin=x Quit if rate of x KiB/sec can't be met
+ rate=x Throttle rate to x KB/sec
+ ratemin=x Quit if rate of x KB/sec can't be met
ratecycle=x ratemin averaged over x msecs
cpumask=x Only allow job to run on CPUs defined by mask.
- fsync=x If writing, fsync after every x blocks have been written
+ cpus_allowed=x Like 'cpumask', but allow text setting of CPU affinity.
+ fsync=x If writing with buffered IO, fsync after every
+ 'x' blocks have been written.
+ end_fsync=x If 'x', run fsync() after end-of-job.
startdelay=x Start this thread x seconds after startup
- timeout=x Terminate x seconds after startup. Can include a
+ runtime=x Terminate x seconds after startup. Can include a
normal time suffix if not given in seconds, such as
'm' for minutes, 'h' for hours, and 'd' for days.
offset=x Start io at offset x (x string can include k/m/g)
invalidate=x Invalidate page cache for file prior to doing io
- sync=x Use sync writes if x and writing
+ sync=x Use sync writes if x and writing buffered IO.
mem=x If x == malloc, use malloc for buffers. If x == shm,
- use shm for buffers. If x == mmap, use anon mmap.
+ use shared memory for buffers. If x == mmap, use
+ anonymous mmap.
exitall When one thread quits, terminate the others
bwavgtime=x Average bandwidth stats over an x msec window.
create_serialize=x If 'x', serialize file creation.
create_fsync=x If 'x', run fsync() after file creation.
- end_fsync=x If 'x', run fsync() after end-of-job.
+ unlink If set, unlink files when done.
loops=x Run the job 'x' number of times.
verify=x If 'x' == md5, use md5 for verifies. If 'x' == crc32,
use crc32 for verifies. md5 is 'safer', but crc32 is
a lot faster. Only makes sense for writing to a file.
+ For other types of checksumming, see HOWTO.
stonewall Wait for preceeding jobs to end before running.
numjobs=x Create 'x' similar entries for this job
thread Use pthreads instead of forked jobs
can be used to gauge hard drive speed over the entire
platter, without reading everything. Both x/y can
include k/m/g suffix.
- iolog=x Open and read io pattern from file 'x'. The file must
- contain one io action per line in the following format:
- rw, offset, length
- where with rw=0/1 for read/write, and the offset
- and length entries being in bytes.
+ read_iolog=x Open and read io pattern from file 'x'. The file format
+ is described in the HOWTO.
write_iolog=x Write an iolog to file 'x' in the same format as iolog.
The iolog options are exclusive, if both given the
- read iolog will be performed.
+ read iolog will be performed. Specify a separate file
+ for each job, otherwise the iologs will be interspersed
+ and the file may be corrupt.
+ write_bw_log Write a bandwidth log.
+ write_lat_log Write a latency log.
lockmem=x Lock down x amount of memory on the machine, to
simulate a machine with less memory available. x can
include k/m/g suffix.
ioscheduler=x Use ioscheduler 'x' for this job.
cpuload=x For a CPU io thread, percentage of CPU time to attempt
to burn.
- cpuchunks=x Split burn cycles into pieces of x.
+ cpuchunks=x Split burn cycles into pieces of x usecs.
+
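
To show how a few of the parameters above fit together, here is a small sketch of a job file. The job names and all values are arbitrary examples, not recommendations, and libaio is assumed to be available (Linux). The script writes the file and prints the command to run it:

```shell
# Create an example job file in a temporary location.
jobfile=$(mktemp)
cat > "$jobfile" <<'EOF'
; Options in [global] are inherited by every job below.
[global]
ioengine=libaio
iodepth=4
direct=0
size=64m
runtime=60

[writers]
rw=randwrite
numjobs=4
end_fsync=1

[reader]
rw=randread
EOF

# Print the command; run it by hand once the values suit your system.
echo "Run with: fio $jobfile"
```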
-Examples using a job file
--------------------------
+Client/server
+-------------
-Example 1) Two random readers
+Normally you would run fio as a stand-alone application on the machine
+where the IO workload should be generated. However, it is also possible to
+run the frontend and backend of fio separately. This makes it possible to
+have a fio server running on the machine(s) where the IO workload should
+be running, while controlling it from another machine.
-Lets say we want to simulate two threads reading randomly from a file
-each. They will be doing IO in 4KiB chunks, using raw (O_DIRECT) IO.
-Since they share most parameters, we'll put those in the [global]
-section. Job 1 will use a 128MiB file, job 2 will use a 256MiB file.
+To start the server, you would do:
-; ---snip---
+fio --server=args
-[global]
-ioengine=sync ; regular read/write(2), the default
-rw=randread
-bs=4k
-direct=1
+on that machine, where args defines what fio listens to. The arguments
+are of the form 'type:hostname or IP,port'. 'type' is either 'ip' (or ip4)
+for TCP/IP v4, 'ip6' for TCP/IP v6, or 'sock' for a local unix domain socket.
+'hostname' is either a hostname or IP address, and 'port' is the port to
+listen to (only valid for TCP/IP, not a local socket). Some examples:
-[file1]
-size=128m
-
-[file2]
-size=256m
-
-; ---snip---
-
-Generally the [] bracketed name specifies a file name, but the "global"
-keyword is reserved for setting options that are inherited by each
-subsequent job description. It's possible to have several [global]
-sections in the job file, each one adds options that are inherited by
-jobs defined below it. The name can also point to a block device, such
-as /dev/sda. To run the above job file, simply do:
-
-$ fio jobfile
-
-Example 2) Many random writers
-
-Say we want to exercise the IO subsystem some more. We'll define 64
-threads doing random buffered writes. We'll let each thread use async io
-with a depth of 4 ios in flight. A job file would then look like this:
-
-; ---snip---
-
-[global]
-ioengine=libaio
-iodepth=4
-rw=randwrite
-bs=32k
-direct=0
-size=64m
-
-[files]
-numjobs=64
-
-; ---snip---
-
-This will create files.[0-63] and perform the random writes to them.
-
-There are endless ways to define jobs, the examples/ directory contains
-a few more examples.
-
-
-Interpreting the output
------------------------
-
-fio spits out a lot of output. While running, fio will display the
-status of the jobs created. An example of that would be:
-
-Threads running: 1: [_r] [24.79% done] [eta 00h:01m:31s]
-
-The characters inside the square brackets denote the current status of
-each thread. The possible values (in typical life cycle order) are:
-
-Idle Run
----- ---
-P Thread setup, but not started.
-C Thread created.
-I Thread initialized, waiting.
- R Running, doing sequential reads.
- r Running, doing random reads.
- W Running, doing sequential writes.
- w Running, doing random writes.
- M Running, doing mixed sequential reads/writes.
- m Running, doing mixed random reads/writes.
- F Running, currently waiting for fsync()
-V Running, doing verification of written data.
-E Thread exited, not reaped by main thread yet.
-_ Thread reaped.
-
-The other values are fairly self explanatory - number of threads
-currently running and doing io, and the estimated completion percentage
-and time for the running group. It's impossible to estimate runtime
-of the following groups (if any).
-
-When fio is done (or interrupted by ctrl-c), it will show the data for
-each thread, group of threads, and disks in that order. For each data
-direction, the output looks like:
-
-Client1 (g=0): err= 0:
- write: io= 32MiB, bw= 666KiB/s, runt= 50320msec
- slat (msec): min= 0, max= 136, avg= 0.03, dev= 1.92
- clat (msec): min= 0, max= 631, avg=48.50, dev=86.82
- bw (KiB/s) : min= 0, max= 1196, per=51.00%, avg=664.02, dev=681.68
- cpu : usr=1.49%, sys=0.25%, ctx=7969
-
-The client number is printed, along with the group id and error of that
-thread. Below is the io statistics, here for writes. In the order listed,
-they denote:
-
-io= Number of megabytes io performed
-bw= Average bandwidth rate
-runt= The runtime of that thread
- slat= Submission latency (avg being the average, dev being the
- standard deviation). This is the time it took to submit
- the io. For sync io, the slat is really the completion
- latency, since queue/complete is one operation there.
- clat= Completion latency. Same names as slat, this denotes the
- time from submission to completion of the io pieces. For
- sync io, clat will usually be equal (or very close) to 0,
- as the time from submit to complete is basically just
- CPU time (io has already been done, see slat explanation).
- bw= Bandwidth. Same names as the xlat stats, but also includes
- an approximate percentage of total aggregate bandwidth
- this thread received in this group. This last value is
- only really useful if the threads in this group are on the
- same disk, since they are then competing for disk access.
-cpu= CPU usage. User and system time, along with the number
- of context switches this thread went through.
-
-After each client has been listed, the group statistics are printed. They
-will look like this:
-
-Run status group 0 (all jobs):
- READ: io=64MiB, aggrb=22178, minb=11355, maxb=11814, mint=2840msec, maxt=2955msec
- WRITE: io=64MiB, aggrb=1302, minb=666, maxb=669, mint=50093msec, maxt=50320msec
-
-For each data direction, it prints:
-
-io= Number of megabytes io performed.
-aggrb= Aggregate bandwidth of threads in this group.
-minb= The minimum average bandwidth a thread saw.
-maxb= The maximum average bandwidth a thread saw.
-mint= The smallest runtime of the threads in that group.
-maxt= The longest runtime of the threads in that group.
-
-And finally, the disk statistics are printed. They will look like this:
-
-Disk stats (read/write):
- sda: ios=16398/16511, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00%
-
-Each value is printed for both reads and writes, with reads first. The
-numbers denote:
-
-ios= Number of ios performed by all groups.
-merge= Number of merges io the io scheduler.
-ticks= Number of ticks we kept the disk busy.
-io_queue= Total time spent in the disk queue.
-util= The disk utilization. A value of 100% means we kept the disk
- busy constantly, 50% would be a disk idling half of the time.
-
-
-Terse output
-------------
+1) fio --server
+
+ Start a fio server, listening on all interfaces on the default port (8765).
+
+2) fio --server=ip:hostname,4444
+
+ Start a fio server, listening on IP belonging to hostname and on port 4444.
+
+3) fio --server=ip6:::1,4444
+
+ Start a fio server, listening on IPv6 localhost ::1 and on port 4444.
+
+4) fio --server=,4444
+
+ Start a fio server, listening on all interfaces on port 4444.
+
+5) fio --server=1.2.3.4
+
+ Start a fio server, listening on IP 1.2.3.4 on the default port.
+
+6) fio --server=sock:/tmp/fio.sock
+
+ Start a fio server, listening on the local socket /tmp/fio.sock.
+
+When a server is running, you can connect to it from a client. The client
+is run with:
+
+fio --local-args --client=server --remote-args <job file(s)>
+
+where --local-args are arguments that are local to the client where it is
+running, 'server' is the connect string, and --remote-args and <job file(s)>
+are sent to the server. The 'server' string follows the same format as it
+does on the server side, to allow IP/hostname/socket and port strings.
+You can connect to multiple servers as well; to do that you could run:
+
+fio --client=server1 <job file(s)> --client=server2 <job file(s)>
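
Driving several servers with the same job file can be sketched as a small loop that assembles the client command line. The server names and the job file name are hypothetical placeholders, and the script only prints the command:

```shell
# Hypothetical server names and job file, for illustration only.
servers="server1 server2"
jobfile="job.fio"

# Append one --client/job-file pair per server.
cmd="fio"
for s in $servers; do
    cmd="$cmd --client=$s $jobfile"
done

echo "$cmd"
```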
+
+
+Platforms
+---------
+
+Fio works on (at least) Linux, Solaris, AIX, HP-UX, OSX, NetBSD, Windows
+and FreeBSD. Some features and/or options may only be available on some of
+the platforms, typically because those features only apply to that platform
+(like the solarisaio engine, or the splice engine on Linux).
+
+Some features are not available on FreeBSD/Solaris even though they could
+be implemented; I'd be happy to take patches for that. Examples are disk
+utilization statistics and (I think) huge page support, both of which
+FreeBSD/Solaris do support.
+
+Fio uses pthread mutexes for signalling and locking and FreeBSD does not
+support process shared pthread mutexes. As a result, only threads are
+supported on FreeBSD. This could be fixed with sysv ipc locking or
+other locking alternatives.
+
+Other *BSD platforms are untested, but fio should work there almost out
+of the box. Since I don't do test runs or even compiles on those platforms,
+your mileage may vary. Sending me patches for other platforms is greatly
+appreciated. There's a lot of value in having the same test/benchmark tool
+available on all platforms.
+
+Note that POSIX aio is not enabled by default on AIX. If you get messages like:
+
+ Symbol resolution failed for /usr/lib/libc.a(posix_aio.o) because:
+ Symbol _posix_kaio_rdwr (number 2) is not exported from dependent module /unix.
+
+you need to enable POSIX aio. Run the following commands as root:
+
+ # lsdev -C -l posix_aio0
+ posix_aio0 Defined Posix Asynchronous I/O
+ # cfgmgr -l posix_aio0
+ # lsdev -C -l posix_aio0
+ posix_aio0 Available Posix Asynchronous I/O
+
+POSIX aio should work now. To make the change permanent:
-For scripted usage where you typically want to generate tables or graphs
-of the results, fio can output the results in a comma seperated format.
-The format is one long line of values, such as:
-
-client1,0,0,936,331,2894,0,0,0.000000,0.000000,1,170,22.115385,34.290410,16,714,84.252874%,366.500000,566.417819,3496,1237,2894,0,0,0.000000,0.000000,0,246,6.671625,21.436952,0,2534,55.465300%,1406.600000,2008.044216,0.000000%,0.431928%,1109
-
-Split up, the format is as follows:
-
- jobname, groupid, error
- READ status:
- KiB IO, bandwidth (KiB/sec), runtime (msec)
- Submission latency: min, max, mean, deviation
- Completion latency: min, max, mean, deviation
- Bw: min, max, aggreate percentage of total, mean, deviation
- WRITE status:
- KiB IO, bandwidth (KiB/sec), runtime (msec)
- Submission latency: min, max, mean, deviation
- Completion latency: min, max, mean, deviation
- Bw: min, max, aggreate percentage of total, mean, deviation
- CPU usage: user, system, context switches
+ # chdev -l posix_aio0 -P -a autoconfig='available'
+ posix_aio0 changed
Author