This will be ignored if :option:`pre_read` is also specified for the
same job.
-.. option:: sync=bool
+.. option:: sync=str
+
+ Whether, and what type, of synchronous I/O to use for writes. The allowed
+ values are:
+
+ **none**
+ Do not use synchronous IO, the default.
+
+ **0**
+ Same as **none**.
+
+ **sync**
+ Use synchronous file IO. For the majority of I/O engines,
+ this means using O_SYNC.
+
+ **1**
+ Same as **sync**.
+
+ **dsync**
+ Use synchronous data IO. For the majority of I/O engines,
+ this means using O_DSYNC.
- Use synchronous I/O for buffered writes. For the majority of I/O engines,
- this means using O_SYNC. Default: false.
.. option:: iomem=str, mem=str
single CPU at the desired rate. A job never finishes unless there is
at least one non-cpuio job.
- **guasi**
- The GUASI I/O engine is the Generic Userspace Asynchronous Syscall
- Interface approach to async I/O. See
-
- http://www.xmailserver.org/guasi-lib.html
-
- for more info on GUASI.
-
**rdma**
The RDMA I/O engine supports both RDMA memory semantics
(RDMA_WRITE/RDMA_READ) and channel semantics (Send/Recv) for the
**nbd**
Read and write a Network Block Device (NBD).
+ **libcufile**
+ I/O engine supporting libcufile synchronous access to nvidia-fs and a
+ GPUDirect Storage-supported filesystem. This engine performs
+ I/O without transferring buffers between user-space and the kernel,
+ unless :option:`verify` is set or :option:`cuda_io` is `posix`.
+ :option:`iomem` must not be `cudamalloc`. This ioengine defines
+ engine specific options.
+
I/O engine specific parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
nbd+unix:///?socket=/tmp/socket
nbds://tlshost/exportname
+.. option:: gpu_dev_ids=str : [libcufile]
+
+ Specify the GPU IDs to use with CUDA. This is a colon-separated list of
+ int. GPUs are assigned to workers roundrobin. Default is 0.
+
+.. option:: cuda_io=str : [libcufile]
+
+ Specify the type of I/O to use with CUDA. Default is **cufile**.
+
+ **cufile**
+ Use libcufile and nvidia-fs. This option performs I/O directly
+ between a GPUDirect Storage filesystem and GPU buffers,
+ avoiding use of a bounce buffer. If :option:`verify` is set,
+ cudaMemcpy is used to copy verificaton data between RAM and GPU.
+ Verification data is copied from RAM to GPU before a write
+ and from GPU to RAM after a read. :option:`direct` must be 1.
+ **posix**
+ Use POSIX to perform I/O with a RAM buffer, and use cudaMemcpy
+ to transfer data between RAM and the GPUs. Data is copied from
+ GPU to RAM before a write and copied from RAM to GPU after a
+ read. :option:`verify` does not affect use of cudaMemcpy.
+
I/O depth
~~~~~~~~~
can increase latencies. The benefit is that fio can manage submission rates
independently of the device completion rates. This avoids skewed latency
reporting if I/O gets backed up on the device side (the coordinated omission
- problem).
+ problem). Note that this option cannot reliably be used with async IO
+ engines.
I/O rate
``flow=8`` and another job has ``flow=-1``, then there will be a roughly 1:8
ratio in how much one runs vs the other.
-.. option:: flow_watermark=int
-
- The maximum value that the absolute value of the flow counter is allowed to
- reach before the job must wait for a lower value of the counter.
-
.. option:: flow_sleep=int
- The period of time, in microseconds, to wait after the flow watermark has
- been exceeded before retrying operations.
+ The period of time, in microseconds, to wait after the flow counter
+ has exceeded its proportion before retrying operations.
.. option:: stonewall, wait_for_previous