6. Normal output
7. Terse output
8. Trace file format
+9. CPU idleness profiling
1.0 Overview and history
------------------------
can specify a number of files by separating the names with a
':' colon. So if you wanted a job to open /dev/sda and /dev/sdb
as the two working files, you would use
- filename=/dev/sda:/dev/sdb. On Windows, disk devices are accessed
- as \\.\PhysicalDrive0 for the first device, \\.\PhysicalDrive1
- for the second etc.
- Note: Windows and FreeBSD prevent write access to areas of the disk
- containing in-use data (e.g. filesystems).
- If the wanted filename does need to include a colon, then escape that
- with a '\' character.
- For instance, if the filename is "/dev/dsk/foo@3,0:c",
- then you would use filename="/dev/dsk/foo@3,0\:c".
- '-' is a reserved name, meaning stdin or stdout. Which of the
- two depends on the read/write direction set.
+ filename=/dev/sda:/dev/sdb. On Windows, disk devices are
+ accessed as \\.\PhysicalDrive0 for the first device,
+ \\.\PhysicalDrive1 for the second etc. Note: Windows and
+ FreeBSD prevent write access to areas of the disk containing
+ in-use data (e.g. filesystems).
+ If the wanted filename does need to include a colon, then
+ escape that with a '\' character. For instance, if the filename
+ is "/dev/dsk/foo@3,0:c", then you would use
+ filename="/dev/dsk/foo@3,0\:c". '-' is a reserved name, meaning
+ stdin or stdout. Which of the two depends on the read/write
+ direction set.
+
+filename_format=str
+ If sharing multiple files between jobs, it is usually necessary
+ to have fio generate the exact names that you want. By default,
+ fio will name a file based on the default file format
+ specification of jobname.jobnumber.filenumber. With this
+ option, that can be customized. Fio will recognize and replace
+ the following keywords in this string:
+
+ $jobname
+ The name of the worker thread or process.
+
+ $jobnum
+ The incremental number of the worker thread or
+ process.
+
+ $filenum
+ The incremental number of the file for that worker
+ thread or process.
+
+ To have dependent jobs share a set of files, this option can
+ be set to have fio generate filenames that are shared between
+ the two. For instance, if testfiles.$filenum is specified,
+ file number 4 for any job will be named testfiles.4. The
+ default of $jobname.$jobnum.$filenum will be used if
+ no other format specifier is given.
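As an illustration of the filename_format text above, the following job file is a minimal sketch of two jobs sharing one set of files. The job names, directory, and sizes are made up for this example, and the directory is assumed to exist already.

```ini
; Hypothetical job file: job names, directory and sizes are illustrative only.
[global]
directory=/tmp/fio-test
size=64m
nrfiles=4
; Every job resolves file N to testfiles.N, so both jobs use the same files.
filename_format=testfiles.$filenum

[writer]
rw=write

[reader]
; Wait for the preceding job to finish before starting this one.
stonewall
rw=read
```

Without filename_format, each job would fall back to the default of $jobname.$jobnum.$filenum and get its own private files.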
opendir=str Tell fio to recursively add any file it can find in this
directory and down the file system tree.
same time, but writes get exclusive
access.
- The option may be post-fixed with a lock batch number. If
- set, then each thread/process may do that amount of IOs to
- the file before giving up the lock. Since lock acquisition is
- expensive, batching the lock/unlocks will speed up IO.
-
readwrite=str
rw=str Type of io pattern. Accepted values are:
ten unit instead, for obvious reasons. Allow values are
1024 or 1000, with 1024 being the default.
+unified_rw_reporting=bool Fio normally reports statistics on a per
+ data direction basis, meaning that read, write, and trim are
+ accounted and reported separately. If this option is set,
+ fio will sum the results and report them as "mixed"
+ instead.
+
randrepeat=bool For random IO workloads, seed the generator in a predictable
way so that results are repeatable across repetitions.
fill_device=bool
fill_fs=bool Sets size to something really large and waits for ENOSPC (no
space left on device) as the terminating condition. Only makes
- sense with sequential write. For a read workload, the mount
+ sense with sequential write. For a read workload, the mount
point will be filled first then IO started on the result. This
option doesn't make sense if operating on a raw device node,
since the size of that is already known by the file system.
and is large enough for the specified write phase, nothing
will be done.
-end_fsync=bool If true, fsync file contents when the job exits.
+end_fsync=bool If true, fsync file contents when a write stage has completed.
fsync_on_close=bool If true, fio will fsync() a dirty file on close.
This differs from end_fsync in that it will happen on every
block sizes, not with workloads that use multiple block
sizes. If used with such a workload, fio may read or write
some blocks multiple times.
-
+
nice=int Run the job with the given nice value. See man nice(2).
prio=int Set the io priority value of this job. Linux limits us to
numa_cpu_nodes=str Set this job running on spcified NUMA nodes' CPUs. The
arguments allow comma delimited list of cpu numbers,
A-B ranges, or 'all'. Note, to enable numa options support,
- export the following environment variables,
- export EXTFLAGS+=" -DFIO_HAVE_LIBNUMA "
- export EXTLIBS+=" -lnuma "
+ fio must be built on a system with libnuma-dev(el) installed.
numa_mem_policy=str Set this job's memory policy and corresponding NUMA
nodes. Format of the argements:
ioscheduler=str Attempt to switch the device hosting the file to the specified
io scheduler before running.
-cpuload=int If the job is a CPU cycle eater, attempt to use the specified
- percentage of CPU cycles.
-
-cpuchunks=int If the job is a CPU cycle eater, split the load into
- cycles of the given time. In microseconds.
-
disk_util=bool Generate disk utilization statistics, if the platform
supports it. Defaults to on.
the values of completion latency below which 99.5% and
99.9% of the observed latencies fell, respectively.
+clocksource=str Use the given clocksource as the base of timing. The
+ supported options are:
+
+ gettimeofday gettimeofday(2)
+
+ clock_gettime clock_gettime(2)
+
+ cpu Internal CPU clock source
+
+ cpu is the preferred clocksource if it is reliable, as it
+ is very fast (and fio is heavy on time calls). Fio will
+ automatically use this clocksource if it's supported and
+ considered reliable on the system it is running on, unless
+ another clocksource is specifically set. For x86/x86-64 CPUs,
+ this means supporting TSC Invariant.
+
gtod_reduce=bool Enable all of the gettimeofday() reducing options
(disable_clat, disable_slat, disable_bw) plus reduce
precision of the timeout somewhat to really shrink
enabled when polling for a minimum of 0 events (eg when
iodepth_batch_complete=0).
+[cpu] cpuload=int Attempt to use the specified percentage of CPU cycles.
+
+[cpu] cpuchunks=int Split the load into cycles of the given time. In
+ microseconds.
+
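A minimal sketch of a CPU-burning job using the two options above. It assumes the engine behind the [cpu] tag is selected with ioengine=cpuio; the job name, runtime, and values are illustrative only.

```ini
; Hypothetical job file: job name, runtime and values are illustrative only.
[global]
; Assumption: the CPU cycle eater engine is named cpuio.
ioengine=cpuio
time_based
runtime=10

[burn50percent]
; Try to keep the CPU about half busy...
cpuload=50
; ...in cycles of 10000 microseconds each.
cpuchunks=10000
```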
[netsplice] hostname=str
[net] hostname=str The host name or IP address to use for TCP or UDP based IO.
If the job is a TCP listener or UDP reader, the hostname is not
[netsplice] port=int
[net] port=int The TCP or UDP port to bind to or connect to.
+[netsplice] nodelay=bool
+[net] nodelay=bool Set TCP_NODELAY on TCP connections.
+
[netsplice] protocol=str
[netsplice] proto=str
[net] protocol=str
[net] listen For TCP network connections, tell fio to listen for incoming
connections rather than initiating an outgoing connection. The
hostname must be omitted if this option is used.
+[net] pingpong Normally a network writer will just continue writing data, and
+ a network reader will just consume packets. If pingpong=1
+ is set, a writer will send its normal payload to the reader,
+ then wait for the reader to send the same payload back. This
+ allows fio to measure network latencies. The submission
+ and completion latencies then measure local time spent
+ sending or receiving, and the completion latency measures
+ how long it took for the other end to receive and send back.
+
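The pingpong behaviour described above could be exercised with a job file along these lines. This is a sketch only: the port, block size, transfer size, and job names are assumptions chosen for the example, and both ends run on the same host.

```ini
; Hypothetical job file: port, block size and transfer size are illustrative.
[global]
ioengine=net
protocol=tcp
port=8888
bs=4k
size=100m
; Have the reader send each payload back to the writer.
pingpong=1

[receiver]
; Listen for the incoming connection; no hostname is given here.
listen
rw=read

[sender]
hostname=localhost
; Give the receiver a moment to start listening.
startdelay=1
rw=write
```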
[e4defrag] donorname=str
File will be used as a block donor(swap extents between files)
[e4defrag] inplace=int
sync fsync() the file
datasync fdatasync() the file
trim trim the given file from the given 'offset' for 'length' bytes
+
+
+9.0 CPU idleness profiling
+
+In some cases, we want to understand CPU overhead in a test. For example,
+we may test patches specifically to see whether they reduce CPU usage.
+Fio implements a balloon approach: it creates one thread per CPU that runs at
+idle priority, meaning that it only runs when nobody else needs the CPU.
+By measuring the amount of work completed by each thread, the idleness of each
+CPU can be derived.
+
+A unit of work is defined as touching a full page of unsigned characters. The
+mean and standard deviation of the time to complete a unit of work are reported
+in the "unit work" section. Options can be chosen to report detailed per-CPU
+idleness or overall system idleness by aggregating per-CPU stats.
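
As a sketch of how the options mentioned above might be used: idle profiling is assumed here to be selected on the fio command line with --idle-prof rather than in the job file. The job file name, target path, and sizes are hypothetical.

```ini
; Hypothetical job file (idle-job.fio); name, path and sizes are illustrative.
; Assumed invocations, with idle profiling selected on the command line:
;   fio --idle-prof=system idle-job.fio    (overall system idleness)
;   fio --idle-prof=percpu idle-job.fio    (detailed per-CPU idleness)
;   fio --idle-prof=calibrate              (unit work calibration only)
[randread]
filename=/tmp/fio-idle-test
size=128m
rw=randread
bs=4k
time_based
runtime=30
```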