X-Git-Url: https://git.kernel.dk/?a=blobdiff_plain;f=fio.1;h=45ec8d43dcbf8172318f91d12c33005037386bd9;hb=4ef1562a013513fd0a0048cca4048f28d308a90f;hp=f15194ff78c464913c6cf7cc1e110f5817b0d2de;hpb=d4e74fda98b60dc175b4114492fcc7c21c617ddd;p=fio.git

diff --git a/fio.1 b/fio.1
index f15194ff..45ec8d43 100644
--- a/fio.1
+++ b/fio.1
@@ -1462,9 +1462,31 @@ starting I/O if the platform and file type support it. Defaults to true.
 This will be ignored if \fBpre_read\fR is also specified for the same job.
 .TP
-.BI sync \fR=\fPbool
-Use synchronous I/O for buffered writes. For the majority of I/O engines,
-this means using O_SYNC. Default: false.
+.BI sync \fR=\fPstr
+Whether to use synchronous I/O for writes, and if so, of what type. The
+allowed values are:
+.RS
+.RS
+.TP
+.B none
+Do not use synchronous I/O. This is the default.
+.TP
+.B 0
+Same as \fBnone\fR.
+.TP
+.B sync
+Use synchronous file I/O. For the majority of I/O engines,
+this means using O_SYNC.
+.TP
+.B 1
+Same as \fBsync\fR.
+.TP
+.B dsync
+Use synchronous data I/O. For the majority of I/O engines,
+this means using O_DSYNC.
+.PD
+.RE
+.RE
 .TP
 .BI iomem \fR=\fPstr "\fR,\fP mem" \fR=\fPstr
 Fio can use various types of memory as the I/O unit buffer. The allowed
@@ -1561,7 +1583,8 @@ if \fBsize\fR is set to 20GiB and \fBio_size\fR is set to 5GiB, fio
 will perform I/O within the first 20GiB but exit when 5GiB have been
 done. The opposite is also possible \-\- if \fBsize\fR is set to 20GiB,
 and \fBio_size\fR is set to 40GiB, then fio will do 40GiB of I/O within
-the 0..20GiB region.
+the 0..20GiB region. The value can also be given as a percentage:
+\fBio_size\fR=N%. In this case the amount of I/O performed is N% of \fBsize\fR.
 .TP
 .BI filesize \fR=\fPirange(int)
 Individual file sizes. May be a range, in which case fio will select sizes
@@ -1674,11 +1697,6 @@ to get desired CPU usage, as the cpuload only loads a single CPU at the
 desired rate. A job never finishes unless there is at least one
 non-cpuio job.
 .TP
-.B guasi
-The GUASI I/O engine is the Generic Userspace Asynchronous Syscall
-Interface approach to async I/O. See \fIhttp://www.xmailserver.org/guasi-lib.html\fR
-for more info on GUASI.
-.TP
 .B rdma
 The RDMA I/O engine supports both RDMA memory semantics
 (RDMA_WRITE/RDMA_READ) and channel semantics (Send/Recv) for the
@@ -1808,6 +1826,13 @@ Read and write iscsi lun with libiscsi.
 .TP
 .B nbd
 Synchronous read and write a Network Block Device (NBD).
+.TP
+.B libcufile
+I/O engine supporting libcufile synchronous access to nvidia-fs and a
+GPUDirect Storage-supported filesystem. This engine performs
+I/O without transferring buffers between user-space and the kernel,
+unless \fBverify\fR is set or \fBcuda_io\fR is \fBposix\fR. \fBiomem\fR must
+not be \fBcudamalloc\fR. This ioengine defines engine-specific options.
 .SS "I/O engine specific parameters"
 In addition, there are some parameters which are only valid when a specific
 \fBioengine\fR is in use. These are used identically to normal parameters,
@@ -2121,7 +2146,36 @@ Example URIs:
 \fInbd+unix:///?socket=/tmp/socket\fR
 .TP
 \fInbds://tlshost/exportname\fR
-
+.RE
+.RE
+.TP
+.BI (libcufile)gpu_dev_ids\fR=\fPstr
+Specify the GPU IDs to use with CUDA. This is a colon-separated list of
+integers. GPUs are assigned to workers round-robin. Default is 0.
+.TP
+.BI (libcufile)cuda_io\fR=\fPstr
+Specify the type of I/O to use with CUDA. This option
+takes the following values:
+.RS
+.RS
+.TP
+.B cufile (default)
+Use libcufile and nvidia-fs. This option performs I/O directly
+between a GPUDirect Storage filesystem and GPU buffers,
+avoiding use of a bounce buffer. If \fBverify\fR is set,
+cudaMemcpy is used to copy verification data between RAM and GPU(s).
+Verification data is copied from RAM to GPU before a write
+and from GPU to RAM after a read.
+\fBdirect\fR must be 1.
+.TP
+.B posix
+Use POSIX to perform I/O with a RAM buffer, and use
+cudaMemcpy to transfer data between RAM and the GPU(s).
+Data is copied from GPU to RAM before a write and copied
+from RAM to GPU after a read. \fBverify\fR does not affect
+the use of cudaMemcpy.
+.RE
+.RE
 .SS "I/O depth"
 .TP
 .BI iodepth \fR=\fPint
@@ -2219,7 +2273,7 @@ has a bit of extra overhead, especially for lower queue depth I/O where it can
 increase latencies. The benefit is that fio can manage submission rates
 independently of the device completion rates. This avoids skewed latency
 reporting if I/O gets backed up on the device side (the coordinated omission
-problem). Note that this option cannot reliably be used with async I/O engines.
+problem). Note that this option cannot reliably be used with async I/O engines.
 .SS "I/O rate"
 .TP
 .BI thinktime \fR=\fPtime
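
The option changes documented in this diff (the \fBsync\fR string values, percentage \fBio_size\fR, and the libcufile engine parameters) can be exercised from an ordinary fio job file. The sketch below is illustrative only: the filenames, sizes, and GPU IDs are placeholders chosen for the example, not values taken from this change.

```ini
; Hypothetical job file exercising the options documented above.
; Filenames, sizes, and GPU IDs are placeholders.

; sync=dsync opens the file with O_DSYNC on most I/O engines.
; io_size=25% caps the amount transferred at 25% of size (here 5 GiB),
; while I/O offsets stay within the first 20 GiB.
[dsync-writes]
rw=write
bs=4k
filename=/tmp/fio-test.dat
sync=dsync
size=20g
io_size=25%

; libcufile with cuda_io=cufile requires direct=1 and a file on a
; GPUDirect Storage-supported filesystem; gpu_dev_ids is a
; colon-separated list of GPU IDs assigned to workers round-robin.
[gds-read]
ioengine=libcufile
filename=/mnt/gds/testfile
rw=read
bs=1m
direct=1
cuda_io=cufile
gpu_dev_ids=0:1
```

Note that per the text above, setting \fBverify\fR on the libcufile job would reintroduce cudaMemcpy copies between RAM and the GPU, and \fBiomem\fR=\fBcudamalloc\fR is not permitted with this engine.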