trim: add support for multiple ranges

The NVMe specification allows multiple ranges for dataset management
commands. Currently the block ioctl only allows a single range for trim;
however, multiple ranges can be specified using the nvme character device.

Add an option num_range to send multiple ranges per trim request, which
only works if the data direction is solely trim, i.e. trim or randtrim.
Add FIO_MULTI_RANGE_TRIM as the ioengine flag to restrict the usage of
this new option.

For multi-range trim requests this modifies the way IO buffers are used.
The buffer length will depend on the number of trim ranges, and the actual
buffer will contain the start and length of each range entry.

This increases fio server version (FIO_SERVER_VER) to 103.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20240215151812.138370-2-ankit.kumar@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
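To illustrate why the buffer length scales with the number of ranges, here is a minimal sketch that packs range entries into one contiguous buffer. It assumes the 16-byte Dataset Management range layout from the NVMe specification (32-bit context attributes, 32-bit length in logical blocks, 64-bit starting LBA, little-endian). The commit does not say whether fio's own IO buffer uses this exact layout, and `pack_dsm_ranges` is a hypothetical helper, not a fio function.

```python
import struct

# Assumed layout of one NVMe Dataset Management range entry:
# context attributes (u32), length in logical blocks (u32), starting LBA (u64).
DSM_RANGE = struct.Struct("<IIQ")


def pack_dsm_ranges(ranges):
    """Pack (start_lba, num_blocks) pairs into one contiguous buffer."""
    buf = bytearray(DSM_RANGE.size * len(ranges))
    for i, (start_lba, num_blocks) in enumerate(ranges):
        # Context attributes are left at 0 in this sketch.
        DSM_RANGE.pack_into(buf, i * DSM_RANGE.size, 0, num_blocks, start_lba)
    return bytes(buf)


# Three ranges produce a payload three entries long.
payload = pack_dsm_ranges([(0, 8), (1024, 16), (4096, 8)])
```

Under this assumed layout the payload grows by 16 bytes per range, so a request's buffer size is a function of the range count rather than of the block size.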
engines:xnvme: add support for end to end data protection

This patch enables support for protection information in the xnvme
ioengine.

This adds 4 new ioengine specific options:
* pi_act - Protection information action. Default: 1
* pi_chk - Can be set to GUARD, APPTAG or REFTAG
* apptag - Sets apptag field of command dword 15
* apptag_mask - Sets apptag_mask field of command dword 15

For the sake of consistency these options are the same as the ones used
by io_uring_cmd ioengine and SPDK's external ioengine.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20240213153315.134202-4-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
engines/xnvme: add support for metadata

This enables support for separate metadata buffers with the xnvme
ioengine. This is done by providing the xnvme specific option
md_per_io_size, which for the sake of consistency is the same option
used by io_uring_cmd engine and SPDK's external ioengine.

Bump up the required xnvme support to v0.7.4.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20240213153315.134202-3-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
docs: explain duplicate logging timestamps

When a fio job ends, it cleans up by flushing any accumulated latency log
data for jobs with log_avg_msec enabled. This means that the final logging
interval may be different from what was specified by log_avg_msec. In some
cases there may even be duplicate timestamps. Add an explanation for this
phenomenon to the documentation.

During job cleanup it's possible to simply suppress the final log entry if
log_avg_msec has not passed since the previous log entry was recorded, but
this throws away data that some users may depend on. For instance, a 55s
job with log_avg_msec=10000 would have no log entry for the final 5s if we
suppressed the final log entry.

Users concerned about final log entries with duplicate timestamps should
just ignore the second entry since it is likely based on only a handful of
I/Os.

Duplicate log entry example:

$ sudo ./fio --name=test --iodepth=2 --ioengine=libaio --time_based --runtime=5s --log_avg_msec=1000 --write_lat_log=test --filename=/dev/vda --direct=1
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=2
fio-3.36-61-g9cfa-dirty
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=250MiB/s][r=64.0k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=1490: Mon Feb 12 12:19:13 2024
  read: IOPS=63.6k, BW=248MiB/s (260MB/s)(1242MiB/5001msec)
    slat (nsec): min=691, max=37070, avg=1272.54, stdev=674.26
    clat (usec): min=4, max=1731, avg=29.83, stdev= 7.03
     lat (usec): min=15, max=1734, avg=31.10, stdev= 7.19
    clat percentiles (usec):
     | 1.00th=[ 23], 5.00th=[ 25], 10.00th=[ 26], 20.00th=[ 27],
     | 30.00th=[ 28], 40.00th=[ 29], 50.00th=[ 30], 60.00th=[ 31],
     | 70.00th=[ 32], 80.00th=[ 33], 90.00th=[ 35], 95.00th=[ 37],
     | 99.00th=[ 41], 99.50th=[ 43], 99.90th=[ 58], 99.95th=[ 74],
     | 99.99th=[ 104]
   bw ( KiB/s): min=244464, max=258112, per=100.00%, avg=254410.67, stdev=4788.90, samples=9
   iops : min=61116, max=64528, avg=63602.67, stdev=1197.23, samples=9
  lat (usec) : 10=0.01%, 20=0.06%, 50=99.76%, 100=0.16%, 250=0.01%
  lat (usec) : 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec) : 2=0.01%
  cpu : usr=11.46%, sys=18.20%, ctx=159414, majf=0, minf=49
  IO depths : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=317842,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
   READ: bw=248MiB/s (260MB/s), 248MiB/s-248MiB/s (260MB/s-260MB/s), io=1242MiB (1302MB), run=5001-5001msec

Disk stats (read/write):
  vda: ios=311248/0, sectors=2489984/0, merge=0/0, ticks=7615/0, in_queue=7615, util=98.10%

$ cat test_lat.1.log
1000, 31907, 0, 0, 0
2000, 30705, 0, 0, 0
3000, 30738, 0, 0, 0
4000, 31196, 0, 0, 0
5000, 30997, 0, 0, 0
5000, 31559, 0, 0, 0

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
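The advice to ignore the second entry with a repeated timestamp can be applied when post-processing a log. The following is a minimal sketch, not part of fio: `drop_repeated_timestamps` is a hypothetical helper that assumes the comma-separated layout shown in the example log, with the timestamp in the first column.

```python
def drop_repeated_timestamps(lines):
    """Keep a log entry only if its timestamp differs from the previous one."""
    kept = []
    previous = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        timestamp = int(line.split(",")[0])
        if timestamp == previous:
            continue  # trailing entry flushed at job cleanup; skip it
        previous = timestamp
        kept.append(line)
    return kept


# Entries copied from the test_lat.1.log example in the commit message.
sample = [
    "1000, 31907, 0, 0, 0",
    "2000, 30705, 0, 0, 0",
    "3000, 30738, 0, 0, 0",
    "4000, 31196, 0, 0, 0",
    "5000, 30997, 0, 0, 0",
    "5000, 31559, 0, 0, 0",
]
filtered = drop_repeated_timestamps(sample)
```

Comparing only against the previous timestamp keeps the filter streaming-friendly, which is enough here because timestamps in the log are non-decreasing.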
Add support for VSOCK to engines/net.c

* configure: add option to enable/disable vsock support
* engines/net.c: add vsock support
  The VSOCK address family facilitates communication between virtual
  machines and the host they are running on.
  The addressing is formed by 2 integers: <CID, port>
  - CID: Context ID, the ID assigned to the VM.
    CIDs 0, 1 and 2 are reserved:
      0 - hypervisor CID (rarely used)
      1 - local communication (loopback)
      2 - host CID (the guest can always reach the host using CID=2)
  - port: 32-bit port number used to reach a specific process
* examples: add 3 simple job files for vsock (one sender, one receiver
  and one that uses the vsock loopback interface, similar to
  examples/netio.fio)
* fio.1: add vsock to supported protocols together with the required
  parameters
* HOWTO.rst: add vsock to supported protocols together with the required
  parameters

Signed-off-by: Marco Pinna <marco.pinn95@gmail.com>
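The `<CID, port>` addressing scheme described above can be summarized in a few lines. This is an illustrative sketch only: `describe_vsock_address` and the role labels are hypothetical, the CID bound is an assumption, and nothing here touches a real socket.

```python
# Reserved context IDs as listed in the commit message.
RESERVED_CIDS = {
    0: "hypervisor (rarely used)",
    1: "local communication (loopback)",
    2: "host",
}


def describe_vsock_address(cid, port):
    """Return a short description of a <CID, port> VSOCK address."""
    if not 0 <= port < 2**32:
        raise ValueError("port must fit in 32 bits")
    if not 0 <= cid < 2**32:
        # Assumption of this sketch: CIDs are treated as 32-bit values.
        raise ValueError("cid must fit in 32 bits")
    role = RESERVED_CIDS.get(cid, "guest VM")
    return f"<{cid}, {port}>: {role}"


# A guest that wants to reach the host always uses CID 2.
host_target = describe_vsock_address(2, 8765)
```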
fio: Introduce new constant thinkcycles option

The thinkcycles parameter allows setting a number of cycles to spin
between requests, to model real-world applications more realistically.

The thinktime parameter family can be used to model an application
processing the data, in order to model real-world applications more
closely. Unfortunately, think time is currently specified as a constant
time and is therefore affected by CPU frequency settings or task
migration to a CPU with a different capacity. The new thinkcycles
parameter closes that gap and allows specifying a constant number of
cycles instead, such that CPU capacity is taken into account.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
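The difference between a constant think time and a constant cycle count can be seen with simple arithmetic. The sketch below is idealized: it assumes each counted cycle takes exactly one CPU clock period, which the commit does not state, and `spin_time_us` is a hypothetical helper.

```python
def spin_time_us(cycles, cpu_hz):
    """Wall-clock time in microseconds needed to spin for `cycles` cycles."""
    return cycles / cpu_hz * 1e6


# The same cycle budget takes twice as long when the CPU clock is halved,
# whereas a fixed think time would stay the same regardless of frequency.
fast = spin_time_us(1_000_000, 2.0e9)  # about 500 us at 2 GHz
slow = spin_time_us(1_000_000, 1.0e9)  # about 1000 us at 1 GHz
```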
Update docs to clarify how to pass job options in client mode

When run in client mode, fio does not pass any job options specified on
the command line to the fio server. Instead, all job options must be
specified via local or remote job files. Update the docs to indicate this
to avoid end-user confusion.

Fixes #1629

Signed-off-by: Nick Neumann nick@pcpartpicker.com
Make log_unix_epoch an official alias of log_alternate_epoch

log_alternate_epoch was introduced along with log_alternate_epoch_clock_id
and generalized the idea of log_unix_epoch. Both options had the same
effect, so make log_unix_epoch an official alias of log_alternate_epoch
instead of maintaining both redundant options.

Signed-off-by: Nick Neumann nick@pcpartpicker.com
Record job start time to fix time pain points

Add a new key in the json per-job output, job_start, that records the job
start time obtained via a call to clock_gettime using the clock_id
specified by the new job_start_clock_id option. This allows times of fio
jobs and log entries to be compared/ordered against each other and
against other system events recorded against the same clock_id.

Add a note to the documentation for group_reporting explaining that, when
group_reporting is enabled, there are several per-job values for which
only the first job's value is recorded in the json output format.

Fixes #1544

Signed-off-by: Nick Neumann nick@pcpartpicker.com
engines:io_uring: uring_cmd add support for protection info

This patch enables support for protection information in the nvme command
backend of the io_uring_cmd ioengine. The patch only supports the
protection information action bit set to 1, for read and write
operations.

This adds 4 new ioengine specific options:
* pi_act - Protection information action. Default: 1
* pi_chk - Can be set to GUARD, APPTAG or REFTAG
* apptag - Sets apptag field of command dword 15
* apptag_mask - Sets apptag_mask field of command dword 15

For the sake of consistency these options are the same as the ones used
by SPDK's external ioengine.

For pi_act=1, if the namespace is formatted with metadata size equal to
protection information size, the nvme controller inserts and removes
protection information for write and read commands respectively. Added a
check so that fio doesn't send metadata in such cases.

Storage tag support is not present, so return an error for that.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-5-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
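As a rough illustration of how the four options could map onto command dwords, the sketch below packs them into integers. The commit only states that apptag and apptag_mask live in command dword 15. The bit positions used here (PRACT at bit 29 and the guard, application tag and reference tag checks at bits 28, 27 and 26 of dword 12, with the mask in the upper 16 bits of dword 15) are assumptions taken from the NVMe read/write command layout, and `build_pi_dwords` is a hypothetical helper, not fio code.

```python
# Assumed bit positions in command dword 12 (NVMe read/write commands).
PRACT = 1 << 29
PRCHK_BITS = {
    "GUARD": 1 << 28,
    "APPTAG": 1 << 27,
    "REFTAG": 1 << 26,
}


def build_pi_dwords(pi_act, pi_chk, apptag, apptag_mask):
    """Return (cdw12_pi_bits, cdw15) for the given protection options."""
    cdw12 = PRACT if pi_act else 0
    for name in pi_chk:
        cdw12 |= PRCHK_BITS[name]  # KeyError on an unknown check name
    # Assumed layout: mask in bits 31:16, application tag in bits 15:0.
    cdw15 = ((apptag_mask & 0xFFFF) << 16) | (apptag & 0xFFFF)
    return cdw12, cdw15


cdw12, cdw15 = build_pi_dwords(1, ["GUARD", "REFTAG"], 0x1234, 0xFFFF)
```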
engines:io_uring: enable support for separate metadata buffer

This patch enables support for a separate metadata buffer with the
io_uring_cmd ioengine. As we are unaware of the metadata size during
buffer allocation, we provide an option md_per_io_size. This option must
be used to specify the metadata buffer size for a single IO if the
namespace is formatted with a separate metadata buffer.

For the sake of consistency this is the same option as used by SPDK's
external ioengine.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-4-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
cmdprio: Add support for per I/O priority hint

Introduce the new option cmdprio_hint to allow specifying I/O priority
hints per IO with the io_uring and libaio IO engines.

A third acceptable format for the cmdprio_bssplit option is also
introduced to allow specifying an I/O hint in addition to a priority
class and level.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230721110510.44772-6-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
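A priority class, level and hint are typically combined into a single I/O priority value. The sketch below shows one way to do that, assuming the layout used by recent Linux kernels (3-bit level in the low bits, 10-bit hint above it, class starting at bit 13). The commit does not spell out this encoding, and `ioprio_value` here is a hypothetical Python helper rather than fio's or the kernel's implementation.

```python
# Assumed field widths, following the Linux ioprio layout.
IOPRIO_LEVEL_BITS = 3
IOPRIO_HINT_BITS = 10
IOPRIO_CLASS_SHIFT = IOPRIO_LEVEL_BITS + IOPRIO_HINT_BITS  # 13


def ioprio_value(prio_class, level, hint=0):
    """Combine class, level and hint into one I/O priority value."""
    if not 0 <= level < (1 << IOPRIO_LEVEL_BITS):
        raise ValueError("level out of range")
    if not 0 <= hint < (1 << IOPRIO_HINT_BITS):
        raise ValueError("hint out of range")
    if not 0 <= prio_class < 8:
        raise ValueError("class out of range")
    return (prio_class << IOPRIO_CLASS_SHIFT) | (hint << IOPRIO_LEVEL_BITS) | level


# Class 2 (best-effort in Linux), level 4, without and with hint 1.
plain = ioprio_value(2, 4)
hinted = ioprio_value(2, 4, hint=1)
```

Because the hint occupies bits that were previously unused between the level and the class, values built without a hint stay unchanged under this layout.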
options: add priohint option

Introduce the new option priohint to allow users to specify an I/O
priority hint applying to all IOs issued by a job.

This increases fio server version (FIO_SERVER_VER) to 101.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230721110510.44772-5-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>