X-Git-Url: https://git.kernel.dk/?a=blobdiff_plain;f=HOWTO;h=7e46cee0eceac9f8d85c5b6191b3335dd72c2e14;hb=dee9b29bef5bc344815d7a53dda6bb21426f2bfa;hp=e0403b0803f04cb04ef7a14832dd39b3803c34d8;hpb=f0ed01ed095cf1ca7c1945a5a0267e8f73b7b4a9;p=fio.git

diff --git a/HOWTO b/HOWTO
index e0403b08..7e46cee0 100644
--- a/HOWTO
+++ b/HOWTO
@@ -1677,10 +1677,28 @@ Buffers and memory
 	This will be ignored if :option:`pre_read` is also specified for the same
 	job.
 
-.. option:: sync=bool
+.. option:: sync=str
+
+	Whether, and what type of, synchronous I/O to use for writes.  The allowed
+	values are:
+
+		**none**
+			Do not use synchronous IO, the default.
+
+		**0**
+			Same as **none**.
+
+		**sync**
+			Use synchronous file IO.  For the majority of I/O engines,
+			this means using O_SYNC.
+
+		**1**
+			Same as **sync**.
+
+		**dsync**
+			Use synchronous data IO.  For the majority of I/O engines,
+			this means using O_DSYNC.
 
-	Use synchronous I/O for buffered writes. For the majority of I/O engines,
-	this means using O_SYNC. Default: false.
 
 .. option:: iomem=str, mem=str
 
@@ -1901,14 +1919,6 @@ I/O engine
 		single CPU at the desired rate. A job never finishes unless there is
 		at least one non-cpuio job.
 
-	**guasi**
-		The GUASI I/O engine is the Generic Userspace Asynchronous Syscall
-		Interface approach to async I/O. See
-
-		http://www.xmailserver.org/guasi-lib.html
-
-		for more info on GUASI.
-
 	**rdma**
 		The RDMA I/O engine supports both RDMA memory semantics
 		(RDMA_WRITE/RDMA_READ) and channel semantics (Send/Recv) for the
@@ -2038,6 +2048,14 @@ I/O engine
 	**nbd**
 		Read and write a Network Block Device (NBD).
 
+	**libcufile**
+		I/O engine supporting libcufile synchronous access to nvidia-fs and a
+		GPUDirect Storage-supported filesystem. This engine performs
+		I/O without transferring buffers between user-space and the kernel,
+		unless :option:`verify` is set or :option:`cuda_io` is `posix`.
+		:option:`iomem` must not be `cudamalloc`. This ioengine defines
+		engine-specific options.
+
 I/O engine specific parameters
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2388,6 +2406,28 @@ with the caveat that when used on the command line, they must come after the
 		nbd+unix:///?socket=/tmp/socket
 		nbds://tlshost/exportname
 
+.. option:: gpu_dev_ids=str : [libcufile]
+
+	Specify the GPU IDs to use with CUDA. This is a colon-separated list of
+	ints. GPUs are assigned to workers round-robin. Default is 0.
+
+.. option:: cuda_io=str : [libcufile]
+
+	Specify the type of I/O to use with CUDA. Default is **cufile**.
+
+	**cufile**
+		Use libcufile and nvidia-fs. This option performs I/O directly
+		between a GPUDirect Storage filesystem and GPU buffers,
+		avoiding use of a bounce buffer. If :option:`verify` is set,
+		cudaMemcpy is used to copy verification data between RAM and GPU.
+		Verification data is copied from RAM to GPU before a write
+		and from GPU to RAM after a read. :option:`direct` must be 1.
+	**posix**
+		Use POSIX to perform I/O with a RAM buffer, and use cudaMemcpy
+		to transfer data between RAM and the GPUs. Data is copied from
+		GPU to RAM before a write and copied from RAM to GPU after a
+		read. :option:`verify` does not affect use of cudaMemcpy.
+
 I/O depth
 ~~~~~~~~~
 
@@ -2482,7 +2522,8 @@ I/O depth
 	can increase latencies. The benefit is that fio can manage submission rates
 	independently of the device completion rates. This avoids skewed latency
 	reporting if I/O gets backed up on the device side (the coordinated omission
-	problem).
+	problem). Note that this option cannot reliably be used with async IO
+	engines.
 
 
 I/O rate
@@ -2861,15 +2902,10 @@ Threads, processes and job synchronization
 	``flow=8`` and another job has ``flow=-1``, then there will be a roughly
 	1:8 ratio in how much one runs vs the other.
 
-.. option:: flow_watermark=int
-
-	The maximum value that the absolute value of the flow counter is allowed to
-	reach before the job must wait for a lower value of the counter.
-
 .. option:: flow_sleep=int
 
-	The period of time, in microseconds, to wait after the flow watermark has
-	been exceeded before retrying operations.
+	The period of time, in microseconds, to wait after the flow counter
+	has exceeded its proportion before retrying operations.
 
 .. option:: stonewall, wait_for_previous
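The new ``sync=str`` values in this patch can be exercised with a small job file. A sketch for illustration only (the job name, file path, and sizes are arbitrary assumptions, not taken from the patch):

```ini
; dsync-writes: hypothetical job exercising the new sync=dsync value.
; On most I/O engines this opens the file with O_DSYNC, so each write
; completes only once the data (though not all metadata) is stable.
[dsync-writes]
ioengine=psync
rw=write
bs=4k
size=16m
filename=/tmp/fio-sync-test
sync=dsync
```

On the command line the equivalent is ``--sync=dsync``; per the patch, ``sync=1`` remains an alias for ``sync=sync``, so existing boolean-style job files keep working.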
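The libcufile engine options documented in this patch combine into a job file along these lines. A hedged sketch: the mount point, GPU IDs, and sizes are assumptions for illustration, and a GPUDirect Storage-capable filesystem is required:

```ini
; gds-read: hypothetical libcufile job.
; cuda_io=cufile moves data directly between storage and GPU buffers,
; so direct=1 is required, as the cuda_io documentation states.
; gpu_dev_ids is a colon-separated list; workers get GPUs round-robin.
[gds-read]
ioengine=libcufile
cuda_io=cufile
gpu_dev_ids=0:1
directory=/mnt/gds
direct=1
rw=read
bs=1m
size=1g
```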