X-Git-Url: https://git.kernel.dk/?p=fio.git;a=blobdiff_plain;f=README;h=4c7b542a8ecf4688b139aaf488225eeac26228db;hp=4ed6078695f2991b3f900475b898cb544f53e608;hb=f41862f7e3f61f6f133dd7477c4aa5385d612f62;hpb=906c8d75eef9247c02d1f1f6771b6fa2338329fa diff --git a/README b/README index 4ed60786..4c7b542a 100644 --- a/README +++ b/README @@ -14,61 +14,220 @@ Source fio resides in a git repo, the canonical place is: -git://brick.kernel.dk/data/git/fio.git +git://git.kernel.dk/fio.git + +If you are inside a corporate firewall, git:// may not always work for +you. In that case you can use the http protocol, path is the same: + +http://git.kernel.dk/fio.git Snapshots are frequently generated and they include the git meta data as well. You can download them here: http://brick.kernel.dk/snaps/ -Pascal Bleser has fio RPMs in his repository, you -can find them here: +Binary packages +--------------- + +Debian: +Starting with Debian "Squeeze", fio packages are part of the official +Debian repository. http://packages.debian.org/search?keywords=fio + +Ubuntu: +Starting with Ubuntu 10.04 LTS (aka "Lucid Lynx"), fio packages are part +of the Ubuntu "universe" repository. +http://packages.ubuntu.com/search?keywords=fio + +SUSE: +Pascal Bleser has fio RPMs in his repository for SUSE +variants, you can find them here: http://linux01.gwdg.de/~pbleser/rpm-navigation.php?cat=System/fio +Red Hat, CentOS & Co: +Dag Wieërs has RPMs for Red Hat related distros, find them here: +http://dag.wieers.com/rpm/packages/fio/ + +Mandriva: +Mandriva has integrated fio into their package repository, so installing +on that distro should be as easy as typing 'urpmi fio'. + +Solaris: +Packages for Solaris are available from OpenCSW. Install their pkgutil +tool (http://www.opencsw.org/get-it/pkgutil/) and then install fio via +'pkgutil -i fio'. + +Windows: +Bruce Cran has fio packages for Windows at +http://www.bluestop.org/fio . + + +Mailing list +------------ + +There's a mailing list associated with fio. It's meant for general +discussion, bug reporting, questions, and development - basically anything +that has to do with fio. An automated mail detailing recent commits is +automatically sent to the list at most daily. The list address is +fio@vger.kernel.org, subscribe by sending an email to +majordomo@vger.kernel.org with + +subscribe fio + +in the body of the email. Archives can be found here: + +http://www.spinics.net/lists/fio/ + +and archives for the old list can be found here: + +http://maillist.kernel.dk/fio-devel/ + Building -------- -Just type 'make' and 'make install'. If on FreeBSD, for now you have to -specify the FreeBSD Makefile with -f, eg: +Just type 'make' and 'make install'. + +Note that GNU make is required. On BSD it's available from devel/gmake; +on Solaris it's in the SUNWgmake package. On platforms where GNU make +isn't the default, type 'gmake' instead of 'make'. + +If your compile fails with an error like this: + + CC gettime.o +In file included from fio.h:23, + from gettime.c:8: +os/os.h:15:20: error: libaio.h: No such file or directory +In file included from gettime.c:8: +fio.h:119: error: field 'iocb' has incomplete type +make: *** [gettime.o] Error 1 -$ make -f Makefile.Freebsd && make -f Makefile.FreeBSD install +Check that you have the libaio development package installed. On RPM +based distros, it's typically called libaio-devel. -Likewise with OpenSolaris, use the Makefile.solaris to compile there. -This might change in the future if I opt for an autoconf type setup. + +Windows +------- + +On Windows Cygwin (http://www.cygwin.com/) is required in order to +build fio. To create an MSI installer package install WiX 3.7 from +http://wixtoolset.org and run dobuild.cmd from the +os/windows directory. + +How to compile FIO on 64-bit Windows: + + 1. Install Cygwin (http://www.cygwin.com/setup.exe). Install 'make' and all + packages starting with 'mingw64-i686' and 'mingw64-x86_64'. + 2. Download ftp://sourceware.org/pub/pthreads-win32/prebuilt-dll-2-9-1-release/dll/x64/pthreadGC2.dll + and copy to the fio source directory. + 3. Open the Cygwin Terminal. + 4. Go to the fio directory (source files). + 5. Run 'make clean'. + 6. Run 'make'. Command line ------------ $ fio - -t Runtime in seconds - -l Generate per-job latency logs - -w Generate per-job bandwidth logs - -f Read for job descriptions - -o Log output to file - -m Minimal (terse) output - -h Print help info - -v Print version information and exit - -Any parameters following the options will be assumed to be job files. -You can add as many as you want, each job file will be regarded as a -separate group and fio will stonewall it's execution. + --debug Enable some debugging options (see below) + --output Write output to file + --runtime Runtime in seconds + --latency-log Generate per-job latency logs + --bandwidth-log Generate per-job bandwidth logs + --minimal Minimal (terse) output + --output-format=type Output format (terse,json,normal) + --terse-version=type Terse version output format (default 3, or 2 or 4). + --version Print version info and exit + --help Print this page + --cpuclock-test Perform test/validation of CPU clock + --cmdhelp=cmd Print command help, "all" for all of them + --enghelp=engine Print ioengine help, or list available ioengines + --enghelp=engine,cmd Print help for an ioengine cmd + --showcmd Turn a job file into command line options + --readonly Turn on safety read-only checks, preventing + writes + --eta=when When ETA estimate should be printed + May be "always", "never" or "auto" + --section=name Only run specified section in job file. + Multiple sections can be specified. + --alloc-size=kb Set smalloc pool to this size in kb (def 1024) + --warnings-fatal Fio parser warnings are fatal + --max-jobs Maximum number of threads/processes to support + --server=args Start backend server. See Client/Server section. + --client=host Connect to specified backend. + --idle-prof=option Report cpu idleness on a system or percpu basis + (option=system,percpu) or run unit work + calibration only (option=calibrate). + + +Any parameters following the options will be assumed to be job files, +unless they match a job file parameter. You can add as many as you want, +each job file will be regarded as a separate group and fio will stonewall +its execution. + +The --readonly switch is an extra safety guard to prevent accidentally +turning on a write setting when that is not desired. Fio will only write +if rw=write/randwrite/rw/randrw is given, but this extra safety net can +be used as an extra precaution. It will also enable a write check in the +io engine core to prevent an accidental write due to a fio bug. + +The debug switch allows adding options that trigger certain logging +options in fio. Currently the options are: + + process Dump info related to processes + file Dump info related to file actions + io Dump info related to IO queuing + mem Dump info related to memory allocations + blktrace Dump info related to blktrace setup + verify Dump info related to IO verification + all Enable all debug options + random Dump info related to random offset generation + parse Dump info related to option matching and parsing + diskutil Dump info related to disk utilization updates + job:x Dump info only related to job number x + mutex Dump info only related to mutex up/down ops + profile Dump info related to profile extensions + time Dump info related to internal time keeping + ? or help Show available debug options. + +You can specify as many as you want, eg --debug=file,mem will enable +file and memory debugging. + +The section switch is meant to make it easier to ship a bigger job file +instead of several smaller ones. Say you define a job file with light, +moderate, and heavy parts. Then you can ask fio to run the given part +only by giving it a --section=heavy command line option. The section +option only applies to job sections, the reserved 'global' section is +always parsed and taken into account. + +Fio has an internal allocator for shared memory called smalloc. It +allocates shared structures from this pool. The pool defaults to 1024k +in size, and can grow to 128 pools. If running large jobs with randommap +enabled it can run out of memory, in which case the --alloc-size switch +is handy for starting with a larger pool size. The backing store is +files in /tmp. Fio cleans up after itself, while it is running you +may see .fio_smalloc.* files in /tmp. Job file -------- -Only a few options can be controlled with command line parameters, -generally it's a lot easier to just write a simple job file to describe -the workload. The job file format is in the ini style format, as it's -easy to read and write for the user. +See the HOWTO file for a more detailed description of parameters and what +they mean. This file contains the terse version. You can describe big and +complex setups with the command line, but generally it's a lot easier to +just write a simple job file to describe the workload. The job file format +is in the ini style format, as that is easy to read and write for the user. The job file parameters are: name=x Use 'x' as the identifier for this job. + description=x 'x' is a text description of the job. directory=x Use 'x' as the top level directory for storing files + filename=x Force the use of 'x' as the filename for all files + in this thread. If not given, fio will make up + a suitable filename based on the thread and file + number. rw=x 'x' may be: read, randread, write, randwrite, rw (read-write mix), randrw (read-write random mix) rwmixcycle=x Base cycle for switching between read and write @@ -82,13 +241,24 @@ The job file parameters are: across runs, if 'x' is 1. size=x Set file size to x bytes (x string can include k/m/g) ioengine=x 'x' may be: aio/libaio/linuxaio for Linux aio, - posixaio for POSIX aio, sync for regular read/write io, - mmap for mmap'ed io, splice for using splice/vmsplice, - or sgio for direct SG_IO io. The latter only works on - Linux on SCSI (or SCSI-like devices, such as - usb-storage or sata/libata driven) devices. + posixaio for POSIX aio, solarisaio for Solaris + native async IO, windowsaio for Windows native async IO, + sync for regular read/write io, + psync for regular pread/pwrite io, vsync for regular + readv/writev (with queuing emulation) mmap for mmap'ed + io, syslet-rw for syslet driven read/write, splice for + using splice/vmsplice, sg for direct SG_IO io, net + for network io, rdma for RDMA io, or cpuio for a + cycler burner load. sg only works on Linux on + SCSI (or SCSI-like devices, such as usb-storage or + sata/libata driven) devices. Fio also has a null + io engine, which is mainly used for testing + fio itself. + iodepth=x For async io, allow 'x' ios in flight overwrite=x If 'x', layout a write file first. + nrfiles=x Spread io load over 'x' number of files per job, + if possible. prio=x Run io at prio X, 0-7 is the kernel allowed range prioclass=x Run io at prio class X bs=x Use 'x' for thread blocksize. May include k/m postfix. @@ -96,29 +266,39 @@ The job file parameters are: also include k/m postfix. direct=x 1 for direct IO, 0 for buffered IO thinktime=x "Think" x usec after each io - rate=x Throttle rate to x KiB/sec - ratemin=x Quit if rate of x KiB/sec can't be met + rate=x Throttle rate to x KB/sec + ratemin=x Quit if rate of x KB/sec can't be met ratecycle=x ratemin averaged over x msecs cpumask=x Only allow job to run on CPUs defined by mask. - fsync=x If writing, fsync after every x blocks have been written + cpus_allowed=x Like 'cpumask', but allow text setting of CPU affinity. + numa_cpu_nodes=x,y-z Allow job to run on specified NUMA nodes' CPU. + numa_mem_policy=m:x,y-z Setup numa memory allocation policy. + 'm' stands for policy, such as local, interleave, + bind, prefer, local. 'x, y-z' are numa node(s) for + memory allocation according to policy. + fsync=x If writing with buffered IO, fsync after every + 'x' blocks have been written. + end_fsync=x If 'x', run fsync() after end-of-job. startdelay=x Start this thread x seconds after startup - timeout=x Terminate x seconds after startup. Can include a + runtime=x Terminate x seconds after startup. Can include a normal time suffix if not given in seconds, such as 'm' for minutes, 'h' for hours, and 'd' for days. offset=x Start io at offset x (x string can include k/m/g) invalidate=x Invalidate page cache for file prior to doing io - sync=x Use sync writes if x and writing + sync=x Use sync writes if x and writing buffered IO. mem=x If x == malloc, use malloc for buffers. If x == shm, - use shm for buffers. If x == mmap, use anon mmap. + use shared memory for buffers. If x == mmap, use + anonymous mmap. exitall When one thread quits, terminate the others bwavgtime=x Average bandwidth stats over an x msec window. create_serialize=x If 'x', serialize file creation. create_fsync=x If 'x', run fsync() after file creation. - end_fsync=x If 'x', run fsync() after end-of-job. + unlink If set, unlink files when done. loops=x Run the job 'x' number of times. verify=x If 'x' == md5, use md5 for verifies. If 'x' == crc32, use crc32 for verifies. md5 is 'safer', but crc32 is a lot faster. Only makes sense for writing to a file. + For other types of checksumming, see HOWTO. stonewall Wait for preceeding jobs to end before running. numjobs=x Create 'x' similar entries for this job thread Use pthreads instead of forked jobs @@ -128,14 +308,15 @@ The job file parameters are: can be used to gauge hard drive speed over the entire platter, without reading everything. Both x/y can include k/m/g suffix. - iolog=x Open and read io pattern from file 'x'. The file must - contain one io action per line in the following format: - rw, offset, length - where with rw=0/1 for read/write, and the offset - and length entries being in bytes. + read_iolog=x Open and read io pattern from file 'x'. The file format + is described in the HOWTO. write_iolog=x Write an iolog to file 'x' in the same format as iolog. The iolog options are exclusive, if both given the - read iolog will be performed. + read iolog will be performed. Specify a separate file + for each job, otherwise the iologs will be interspersed + and the file may be corrupt. + write_bw_log Write a bandwidth log. + write_lat_log Write a latency log. lockmem=x Lock down x amount of memory on the machine, to simulate a machine with less memory available. x can include k/m/g suffix. @@ -143,202 +324,120 @@ The job file parameters are: exec_prerun=x Run 'x' before job io is begun. exec_postrun=x Run 'x' after job io has finished. ioscheduler=x Use ioscheduler 'x' for this job. + cpuload=x For a CPU io thread, percentage of CPU time to attempt + to burn. + cpuchunks=x Split burn cycles into pieces of x usecs. -Examples using a job file -------------------------- -Example 1) Two random readers +Client/server +------------ -Lets say we want to simulate two threads reading randomly from a file -each. They will be doing IO in 4KiB chunks, using raw (O_DIRECT) IO. -Since they share most parameters, we'll put those in the [global] -section. Job 1 will use a 128MiB file, job 2 will use a 256MiB file. +Normally you would run fio as a stand-alone application on the machine +where the IO workload should be generated. However, it is also possible to +run the frontend and backend of fio separately. This makes it possible to +have a fio server running on the machine(s) where the IO workload should +be running, while controlling it from another machine. -; ---snip--- +To start the server, you would do: -[global] -ioengine=sync ; regular read/write(2), the default -rw=randread -bs=4k -direct=1 +fio --server=args -[file1] -size=128m +on that machine, where args defines what fio listens to. The arguments +are of the form 'type,hostname or IP,port'. 'type' is either 'ip' (or ip4) +for TCP/IP v4, 'ip6' for TCP/IP v6, or 'sock' for a local unix domain socket. +'hostname' is either a hostname or IP address, and 'port' is the port to +listen to (only valid for TCP/IP, not a local socket). Some examples: -[file2] -size=256m - -; ---snip--- - -Generally the [] bracketed name specifies a file name, but the "global" -keyword is reserved for setting options that are inherited by each -subsequent job description. It's possible to have several [global] -sections in the job file, each one adds options that are inherited by -jobs defined below it. The name can also point to a block device, such -as /dev/sda. To run the above job file, simply do: - -$ fio jobfile - -Example 2) Many random writers - -Say we want to exercise the IO subsystem some more. We'll define 64 -threads doing random buffered writes. We'll let each thread use async io -with a depth of 4 ios in flight. A job file would then look like this: - -; ---snip--- - -[global] -ioengine=libaio -iodepth=4 -rw=randwrite -bs=32k -direct=0 -size=64m - -[files] -numjobs=64 - -; ---snip--- - -This will create files.[0-63] and perform the random writes to them. - -There are endless ways to define jobs, the examples/ directory contains -a few more examples. +1) fio --server + Start a fio server, listening on all interfaces on the default port (8765). + +2) fio --server=ip:hostname,4444 + + Start a fio server, listening on IP belonging to hostname and on port 4444. + +3) fio --server=ip6:::1,4444 + + Start a fio server, listening on IPv6 localhost ::1 and on port 4444. + +4) fio --server=,4444 + + Start a fio server, listening on all interfaces on port 4444. + +5) fio --server=1.2.3.4 + + Start a fio server, listening on IP 1.2.3.4 on the default port. + +6) fio --server=sock:/tmp/fio.sock + + Start a fio server, listening on the local socket /tmp/fio.sock. + +When a server is running, you can connect to it from a client. The client +is run with: + +fio --local-args --client=server --remote-args + +where --local-args are arguments that are local to the client where it is +running, 'server' is the connect string, and --remote-args and +are sent to the server. The 'server' string follows the same format as it +does on the server side, to allow IP/hostname/socket and port strings. +You can connect to multiple clients as well, to do that you could run: + +fio --client=server2 --client=server2 + + +Platforms +--------- + +Fio works on (at least) Linux, Solaris, AIX, HP-UX, OSX, NetBSD, Windows +and FreeBSD. Some features and/or options may only be available on some of +the platforms, typically because those features only apply to that platform +(like the solarisaio engine, or the splice engine on Linux). + +Some features are not available on FreeBSD/Solaris even if they could be +implemented, I'd be happy to take patches for that. An example of that is +disk utility statistics and (I think) huge page support, support for that +does exist in FreeBSD/Solaris. + +Fio uses pthread mutexes for signalling and locking and FreeBSD does not +support process shared pthread mutexes. As a result, only threads are +supported on FreeBSD. This could be fixed with sysv ipc locking or +other locking alternatives. + +Other *BSD platforms are untested, but fio should work there almost out +of the box. Since I don't do test runs or even compiles on those platforms, +your mileage may vary. Sending me patches for other platforms is greatly +appreciated. There's a lot of value in having the same test/benchmark tool +available on all platforms. + +Note that POSIX aio is not enabled by default on AIX. If you get messages like: + + Symbol resolution failed for /usr/lib/libc.a(posix_aio.o) because: + Symbol _posix_kaio_rdwr (number 2) is not exported from dependent module /unix. -Interpreting the output ------------------------ - -fio spits out a lot of output. While running, fio will display the -status of the jobs created. An example of that would be: - -Threads running: 1: [_r] [24.79% done] [eta 00h:01m:31s] - -The characters inside the square brackets denote the current status of -each thread. The possible values (in typical life cycle order) are: - -Idle Run ----- --- -P Thread setup, but not started. -C Thread created. -I Thread initialized, waiting. - R Running, doing sequential reads. - r Running, doing random reads. - W Running, doing sequential writes. - w Running, doing random writes. - M Running, doing mixed sequential reads/writes. - m Running, doing mixed random reads/writes. - F Running, currently waiting for fsync() -V Running, doing verification of written data. -E Thread exited, not reaped by main thread yet. -_ Thread reaped. - -The other values are fairly self explanatory - number of threads -currently running and doing io, and the estimated completion percentage -and time for the running group. It's impossible to estimate runtime -of the following groups (if any). - -When fio is done (or interrupted by ctrl-c), it will show the data for -each thread, group of threads, and disks in that order. For each data -direction, the output looks like: - -Client1 (g=0): err= 0: - write: io= 32MiB, bw= 666KiB/s, runt= 50320msec - slat (msec): min= 0, max= 136, avg= 0.03, dev= 1.92 - clat (msec): min= 0, max= 631, avg=48.50, dev=86.82 - bw (KiB/s) : min= 0, max= 1196, per=51.00%, avg=664.02, dev=681.68 - cpu : usr=1.49%, sys=0.25%, ctx=7969 - -The client number is printed, along with the group id and error of that -thread. Below is the io statistics, here for writes. In the order listed, -they denote: - -io= Number of megabytes io performed -bw= Average bandwidth rate -runt= The runtime of that thread - slat= Submission latency (avg being the average, dev being the - standard deviation). This is the time it took to submit - the io. For sync io, the slat is really the completion - latency, since queue/complete is one operation there. - clat= Completion latency. Same names as slat, this denotes the - time from submission to completion of the io pieces. For - sync io, clat will usually be equal (or very close) to 0, - as the time from submit to complete is basically just - CPU time (io has already been done, see slat explanation). - bw= Bandwidth. Same names as the xlat stats, but also includes - an approximate percentage of total aggregate bandwidth - this thread received in this group. This last value is - only really useful if the threads in this group are on the - same disk, since they are then competing for disk access. -cpu= CPU usage. User and system time, along with the number - of context switches this thread went through. - -After each client has been listed, the group statistics are printed. They -will look like this: - -Run status group 0 (all jobs): - READ: io=64MiB, aggrb=22178, minb=11355, maxb=11814, mint=2840msec, maxt=2955msec - WRITE: io=64MiB, aggrb=1302, minb=666, maxb=669, mint=50093msec, maxt=50320msec - -For each data direction, it prints: - -io= Number of megabytes io performed. -aggrb= Aggregate bandwidth of threads in this group. -minb= The minimum average bandwidth a thread saw. -maxb= The maximum average bandwidth a thread saw. -mint= The smallest runtime of the threads in that group. -maxt= The longest runtime of the threads in that group. - -And finally, the disk statistics are printed. They will look like this: - -Disk stats (read/write): - sda: ios=16398/16511, merge=30/162, ticks=6853/819634, in_queue=826487, util=100.00% - -Each value is printed for both reads and writes, with reads first. The -numbers denote: - -ios= Number of ios performed by all groups. -merge= Number of merges io the io scheduler. -ticks= Number of ticks we kept the disk busy. -io_queue= Total time spent in the disk queue. -util= The disk utilization. A value of 100% means we kept the disk - busy constantly, 50% would be a disk idling half of the time. - - -Terse output ------------- - -For scripted usage where you typically want to generate tables or graphs -of the results, fio can output the results in a comma seperated format. -The format is one long line of values, such as: +you need to enable POSIX aio. Run the following commands as root: -client1,0,0,936,331,2894,0,0,0.000000,0.000000,1,170,22.115385,34.290410,16,714,84.252874%,366.500000,566.417819,3496,1237,2894,0,0,0.000000,0.000000,0,246,6.671625,21.436952,0,2534,55.465300%,1406.600000,2008.044216,0.000000%,0.431928%,1109 + # lsdev -C -l posix_aio0 + posix_aio0 Defined Posix Asynchronous I/O + # cfgmgr -l posix_aio0 + # lsdev -C -l posix_aio0 + posix_aio0 Available Posix Asynchronous I/O -Split up, the format is as follows: +POSIX aio should work now. To make the change permanent: - jobname, groupid, error - READ status: - KiB IO, bandwidth (KiB/sec), runtime (msec) - Submission latency: min, max, mean, deviation - Completion latency: min, max, mean, deviation - Bw: min, max, aggreate percentage of total, mean, deviation - WRITE status: - KiB IO, bandwidth (KiB/sec), runtime (msec) - Submission latency: min, max, mean, deviation - Completion latency: min, max, mean, deviation - Bw: min, max, aggreate percentage of total, mean, deviation - CPU usage: user, system, context switches + # chdev -l posix_aio0 -P -a autoconfig='available' + posix_aio0 changed Author ------ -Fio was written by Jens Axboe to enable flexible testing +Fio was written by Jens Axboe to enable flexible testing of the Linux IO subsystem and schedulers. He got tired of writing specific test applications to simulate a given workload, and found that the existing io benchmark/test tools out there weren't flexible enough to do what he wanted. -Jens Axboe 20060609 +Jens Axboe 20060905