fio.git
Vincent Fu [Fri, 10 Feb 2023 18:57:54 +0000 (13:57 -0500)]
iolog: handle trim commands when reading iologs

Add code to process trim commands when we are reading an iolog.

Fixes: https://github.com/axboe/fio/issues/769
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
osalyk [Fri, 17 Feb 2023 07:56:51 +0000 (02:56 -0500)]
pmemblk: remove pmemblk engine

No further support or maintenance of the libpmemblk library is planned.
https://pmem.io/blog/2022/11/update-on-pmdk-and-our-long-term-support-strategy/
https://github.com/pmem/pmdk/pull/5538

Signed-off-by: osalyk <oksana.salyk@intel.com>
Vincent Fu [Wed, 15 Feb 2023 17:49:31 +0000 (12:49 -0500)]
Merge branch 'Read_Stats_Not_Reported_For_Timed_Backlog_Verifies' of github.com:horshack-dpreview/fio

* 'Read_Stats_Not_Reported_For_Timed_Backlog_Verifies' of github.com:horshack-dpreview/fio:
  Read stats for backlog verifies not reported for time-expired workloads

Vincent Fu [Tue, 14 Feb 2023 15:47:50 +0000 (10:47 -0500)]
examples: update nbd.fio fiograph diagram

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Richard W.M. Jones [Mon, 13 Feb 2023 13:23:27 +0000 (13:23 +0000)]
examples: Small updates to nbd.fio

Improve the documentation, describing how to use nbdkit with a local
file.  Move the suggested test file to /var/tmp since /tmp might be a
tmpfs.  Use indenting to make it easier to read.

Use ${uri} instead of ${unixsocket} since nbdkit 1.14 was released
nearly 4 years ago.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Thu, 9 Feb 2023 07:09:07 +0000 (16:09 +0900)]
t/zbd: add test cases for zone_reset_threshold option

The zone_reset_threshold option works for multiple jobs only when the
jobs have the same write range. Add three test cases to confirm that the
option works for multiple jobs as expected. The first test case checks
that different write ranges are reported as an error. The second test
case checks that multiple write jobs work when they have the same write
range. The third test case checks that a read job and a write job work
when they have different IO ranges.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Thu, 9 Feb 2023 07:09:06 +0000 (16:09 +0900)]
zbd: initialize valid data bytes accounting at file setup

The valid data bytes accounting field is initialized at file reset,
after each job has started. Each job locks zones to check the write
pointer positions of its write target zones. This can cause zone lock
contention with writes by other jobs.

To avoid the zone lock contention, move the initialization from file
reset to file setup, before job start. This allows the write pointers
and the accounting field to be accessed without locks. Remove the lock
and unlock code, which is no longer required. Ensure the locks are not
required by checking the run-state in struct thread_data. Also rename
the function zbd_set_vdb() to zbd_verify_and_set_vdb() to be consistent
with other functions called in zbd_setup_files().

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Thu, 9 Feb 2023 07:09:05 +0000 (16:09 +0900)]
zbd: check write ranges for zone_reset_threshold option

The valid data bytes accounting is used for zone_reset_threshold option.
This accounting usage has two issues. The first issue is unexpected
zone reset due to different IO ranges. The valid data bytes accounting
is done for all IO ranges per device, and shared by all jobs. On the
other hand, the zone_reset_threshold option is defined as the ratio to
each job's IO range. When a job refers to the accounting value, it
includes writes to IO ranges out of the job's IO range. Then zone reset
is triggered earlier than expected.

The second issue is accounting value initialization. The initialization
of the accounting field is repeated for each job, so the value
initialized by the first job is overwritten by other jobs. This works as
expected for a single job or for multiple jobs with the same write
range. However, when multiple jobs have different write ranges, the
overwritten value is wrong except for the last job.

To ensure that the accounting works as expected for the option, check
that the write ranges of all jobs are the same. If jobs have different
write ranges, report it as an error. Initialize the accounting field
only once, for the first job. Since all jobs have the same write range,
one-time initialization is enough. Update the man page to clarify this
limitation of the option.
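
The check described above can be sketched roughly as follows. This is an
illustrative Python model, not the C code in zbd.c; the Job tuple layout and
the function name common_write_range() are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical job description:
# (job name, does the job write?, offset in bytes, size in bytes)
Job = Tuple[str, bool, int, int]


def common_write_range(jobs: Iterable[Job]) -> Optional[Tuple[int, int]]:
    """Return the single (offset, size) range shared by all write jobs.

    Read-only jobs are ignored. Raises ValueError when write jobs use
    different ranges, and returns None when no job writes at all.
    """
    ranges = {(offset, size) for _, writes, offset, size in jobs if writes}
    if len(ranges) > 1:
        raise ValueError(f"write ranges differ across jobs: {sorted(ranges)}")
    return next(iter(ranges), None)
```

Because only write jobs are compared, a read job may use a different IO range
than a write job without triggering the error.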

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Thu, 9 Feb 2023 07:09:04 +0000 (16:09 +0900)]
zbd: account valid data bytes only for zone_reset_threshold option

The valid data bytes accounting is used only for zone_reset_threshold
option. Avoid the accounting when the option is not specified.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Thu, 9 Feb 2023 07:09:03 +0000 (16:09 +0900)]
doc: fix unit of zone_reset_threshold and relation to other option

The zone_reset_threshold option uses the 'sectors with data' accounting,
so it was described as having 'logical block' as its unit. However, the
accounting was implemented with 'byte' as its unit. Fix the description
of the option.

Also, the zone_reset_threshold option works together with the
zone_reset_frequency option. Describe this relation also.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Thu, 9 Feb 2023 07:09:02 +0000 (16:09 +0900)]
zbd: rename the accounting 'sectors with data' to 'valid data bytes'

The 'sectors with data' accounting was designed to have 'sector' as its
unit, so related variables have the word 'sector' in their names. Related
code comments also use the words 'sector' or 'logical blocks'. However,
it was actually implemented with 'byte' as its unit. Rename the related
variables and comments to indicate the byte unit. Also replace the
abbreviation swd with vdb.

Fixes: a7c2b6fc2959 ("Add support for resetting zones periodically")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Thu, 9 Feb 2023 07:09:01 +0000 (16:09 +0900)]
zbd: remove CHECK_SWD feature

The 'sectors with data' accounting had been used for CHECK_SWD debug
feature. It compared expected written data size and actually written
data size for zonemode=zbd. However, this feature has been disabled for
a while and not actively used. Also, the sector with data accounting has
two issues. The first issue is wrong accounting for multiple jobs with
different write ranges. The second issue is job start up failure due to
zone lock contention.

Avoid using the accounting by removing the CHECK_SWD feature and related
code. Also rename the function zbd_process_swd() to zbd_set_swd() to
clarify that it no longer works for CHECK_SWD.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Thu, 9 Feb 2023 07:09:00 +0000 (16:09 +0900)]
zbd: refer file->last_start[] instead of sectors with data accounting

To decide the first IO direction of randrw workload, the function
zbd_adjust_ddir() refers to the zbd_info->sectors_with_data value which
indicates the number of bytes written to the zoned block devices being
accessed. However, this accounting has two issues. The first issue is
wrong accounting for multiple jobs with different write ranges. The
second issue is job start up failure due to zone lock contention.

Avoid using zbd_info->sectors_with_data and simply refer to
file->last_start[DDIR_WRITE] instead. It is initialized to -1ULL for each
job. After the job completes any write operation, it holds a valid
offset. If it holds a valid offset, written data is expected and the
first IO direction can be read.
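
A minimal sketch of that decision, written in Python for illustration only.
The function name is hypothetical, and falling back to a write when no write
has happened yet is an assumption drawn from the description above, not a
copy of zbd_adjust_ddir().

```python
# -1ULL in C is the all-ones unsigned 64-bit value.
INVALID_OFFSET = 2**64 - 1


def first_randrw_direction(last_write_start: int) -> str:
    """Pick the first IO direction for a randrw job.

    last_write_start models file->last_start[DDIR_WRITE]: it stays at
    INVALID_OFFSET until the job has completed at least one write.
    """
    if last_write_start != INVALID_OFFSET:
        return "read"   # data was written earlier, so a read can succeed
    return "write"      # assumed fallback: nothing written yet
```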

Also remove zbd_info->sectors_with_data, which is no longer used. Keep
the field zbd_info->wp_sectors_with_data since it is still used for
zones with write pointers.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Horshack [Fri, 10 Feb 2023 02:47:38 +0000 (21:47 -0500)]
Read stats for backlog verifies not reported for time-expired workloads

When verify_backlog is used on a write-only workload with a runtime= value
and the runtime expires before the workload has written its full dataset,
the read stats for the backlog verifies are not reported, resulting in a
stat result showing only the workload writes (i.e., the "read:" results
section is completely missing from fio's stats output).

The logic in thread_main() fails to call update_runtime() for DDIR_READ
because the existing call to update_runtime() for DDIR_READ on write-only
workloads is currently only done after do_verify() is complete, which won't
be called in this scenario because td->terminate is true due to the
expiration of the runtime.

Link: https://github.com/axboe/fio/issues/1515
Signed-off-by: Adam Horshack (horshack@live.com)
Vincent Fu [Fri, 10 Feb 2023 16:49:46 +0000 (11:49 -0500)]
Merge branch 'msg-Modify_QD_Sync_Warning_For_offload' of https://github.com/horshack-dpreview/fio

* 'msg-Modify_QD_Sync_Warning_For_offload' of https://github.com/horshack-dpreview/fio:
  Suppress sync engine QD > 1 warning if io_submit_mode is offload

Horshack [Thu, 9 Feb 2023 16:31:55 +0000 (11:31 -0500)]
Suppress sync engine QD > 1 warning if io_submit_mode is offload

The user is warned if iodepth > 1 when using a synchronous I/O engine,
since the engine can only submit one I/O at a time. This warning is not
accounting for the case of the user enabling I/O submission offload
threads via io_submit_mode=offload. Modified the warning conditional to
suppress the warning when this is the case.
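
The modified conditional can be modeled roughly as below. This is a Python
sketch with a hypothetical function name; the real check lives in fio's C
option validation code and may test further conditions.

```python
def should_warn_sync_iodepth(engine_is_sync: bool, iodepth: int,
                             io_submit_mode: str = "inline") -> bool:
    """Return True when the 'iodepth > 1 with a sync engine' warning applies.

    With io_submit_mode=offload, submission is handed to offload threads,
    so the warning is suppressed.
    """
    return engine_is_sync and iodepth > 1 and io_submit_mode != "offload"
```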

Signed-off-by: Adam Horshack (horshack@live.com)
Jens Axboe [Thu, 9 Feb 2023 16:34:32 +0000 (09:34 -0700)]
Merge branch 'Offload_Segfault_Write_Log' of https://github.com/horshack-dpreview/fio

* 'Offload_Segfault_Write_Log' of https://github.com/horshack-dpreview/fio:
  SIGSEGV / Exit 139 when write_iolog used with io_submit_mode=offload

Horshack [Thu, 9 Feb 2023 16:03:12 +0000 (11:03 -0500)]
SIGSEGV / Exit 139 when write_iolog used with io_submit_mode=offload

Segmentation fault when log_io_u() attempts to write an entry to a
user-specified write_iolog file, if the I/O is issued from an offload
thread created by io_submit_mode=offload. Call path:

rate-submit.c::io_workqueue_fn() -> td_io_queue() -> log_io_u(td, io_u)

The log file handle in thread_data->iolog_f opened by init_iolog() is not
being copied to the offload thread's private copy of thread_data, causing a
NULL dereference when fprintf() is called to write to the log file.

Fix is to copy the main thread's td->iolog_f to the offload thread's td at
creation time. Seems a bit disjointed to be copying individual fields between
these two structures on an as-needed basis rather than having a mechanism to
replicate the entire structure, or at least replicating the I/O submission
specific fields by moving them into a nested structure that's copied wholesale
in io_workqueue_init_worker_fn() - that way future code changes to the I/O
submission path won't cause the same bug for fields needed by both the inline
and offload submission paths.
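
The nested-structure idea suggested above could look something like this.
It is a Python model for illustration only: SubmitState, ThreadData and
init_offload_worker() are hypothetical names, not fio types or functions.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional, TextIO


@dataclass
class SubmitState:
    """Fields needed by both the inline and offload submission paths."""
    iolog_f: Optional[TextIO] = None   # handle for the write_iolog file


@dataclass
class ThreadData:
    name: str
    submit: SubmitState = field(default_factory=SubmitState)


def init_offload_worker(parent: ThreadData) -> ThreadData:
    """Create the worker's private ThreadData.

    The whole SubmitState is copied at once, so a field added to it later
    reaches the worker without another per-field copy. The copy is shallow,
    which means the worker shares the parent's open log handle.
    """
    return ThreadData(name=parent.name + "-offload",
                      submit=copy.copy(parent.submit))
```

The trade-off is that every field placed in the nested structure must be safe
to share or copy shallowly between the two threads.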

Signed-off-by: Adam Horshack (horshack@live.com)
Vincent Fu [Tue, 7 Feb 2023 15:44:00 +0000 (10:44 -0500)]
ioengines: clarify FIO_RO_NEEDS_RW_OPEN flag

This flag is only checked in generic_open_file(). So ioengines with
their own open file routines that do not call generic_open_file() will be
unaffected.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Tue, 7 Feb 2023 14:37:29 +0000 (09:37 -0500)]
engines/libzbc: for read workloads always open devices with O_RDONLY flag

libzbc uses the SG_IO ioctl to send commands to devices (instead of
using write() to send commands to character devices). So we don't need
to open character devices with the O_RDWR flag.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Tue, 7 Feb 2023 14:33:55 +0000 (09:33 -0500)]
Revert "engines/libzbc: set FIO_RO_NEEDS_RW_OPEN engine flag"

This reverts commit 6d7f8d9a31f9ecdeab0eed8f23c63b9a94ec61f6.

The FIO_RO_NEEDS_RW_OPEN flag affects only generic_open_file() but the
libzbc ioengine has its own file open routine. So the flag has no effect
and its presence may be misleading.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Jens Axboe [Mon, 6 Feb 2023 19:36:37 +0000 (12:36 -0700)]
engines/libzbc: set FIO_RO_NEEDS_RW_OPEN engine flag

The libzbc engine also needs a writeable open, even for a read-only
workload.

Fixes: d72b10e3ca2f ("fio: add FIO_RO_NEEDS_RW_OPEN ioengine flag")
Reported-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 6 Feb 2023 16:53:16 +0000 (09:53 -0700)]
Merge branch 'perf-Avoid_Clock_Check_For_No_Rate_Check' of https://github.com/horshack-dpreview/fio

* 'perf-Avoid_Clock_Check_For_No_Rate_Check' of https://github.com/horshack-dpreview/fio:
  Improve IOPs 50% by avoiding clock sampling when rate options not used

Horshack [Mon, 6 Feb 2023 02:17:31 +0000 (21:17 -0500)]
Improve IOPs 50% by avoiding clock sampling when rate options not used

Profiling revealed thread_main() is spending 50% of its time in calls to
utime_since_now() from rate_ddir(). This call is only necessary if the user
specified a rate option for the job. A conditional was added to avoid the call
if !should_check_rate(). See this link for details and profiling data:
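
The shape of the change can be sketched as follows. This Python model only
illustrates guarding the clock read behind the rate check; the Job class and
its members are hypothetical stand-ins for fio's thread_data, rate_ddir() and
utime_since_now().

```python
import time


class Job:
    def __init__(self, rate_iops: int = 0):
        self.rate_iops = rate_iops        # 0 models "no rate option given"
        self.clock_samples = 0            # counts clock reads, for illustration
        self.start = time.monotonic()

    def should_check_rate(self) -> bool:
        return self.rate_iops > 0

    def usec_since_start(self) -> float:
        self.clock_samples += 1           # the cost the patch avoids
        return (time.monotonic() - self.start) * 1e6

    def account_io(self):
        """Called once per I/O; samples the clock only when rate limiting."""
        if not self.should_check_rate():
            return None
        return self.usec_since_start()
```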

https://github.com/axboe/fio/issues/1501#issuecomment-1418327049

Signed-off-by: Adam Horshack (horshack@live.com)
Vincent Fu [Fri, 3 Feb 2023 14:54:50 +0000 (09:54 -0500)]
fio: add FIO_RO_NEEDS_RW_OPEN ioengine flag

Some oddball cases like sg/bsg require devices to be opened for writing
in order to do read commands. So fio has been opening character devices
in rw mode for read workloads. However, nvme generic character devices
do not need (and may refuse) a writeable open for read workloads. So
instead of always opening character devices in rw mode, open devices in
rw mode for read workloads only if the ioengine has the
FIO_RO_NEEDS_RW_OPEN flag.
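
A simplified model of the resulting open-mode choice, in Python for
illustration. The bit value assigned to FIO_RO_NEEDS_RW_OPEN here is
hypothetical, the function name is made up, and the real
generic_open_file() handles more cases (for example write-only workloads).

```python
import os

FIO_RO_NEEDS_RW_OPEN = 1 << 0   # hypothetical bit value, for illustration


def open_mode_for(workload_writes: bool, engine_flags: int) -> int:
    """Choose the open(2) access mode for a device.

    Read-only workloads get O_RDONLY unless the ioengine declares that it
    needs a writeable open even for reads (as sg/bsg do).
    """
    if workload_writes:
        return os.O_RDWR
    if engine_flags & FIO_RO_NEEDS_RW_OPEN:
        return os.O_RDWR
    return os.O_RDONLY
```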

Link: https://lore.kernel.org/fio/20230203123421.126720-1-joshi.k@samsung.com/
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Fri, 3 Feb 2023 17:06:32 +0000 (12:06 -0500)]
Merge branch 'master' of https://github.com/horshack-dpreview/fio

* 'master' of https://github.com/horshack-dpreview/fio:
  Add -replay_skip support for fio-generated I/O logs

Horshack [Thu, 2 Feb 2023 16:37:01 +0000 (11:37 -0500)]
Add -replay_skip support for fio-generated I/O logs

-replay_skip is an existing option to specify classes of I/Os to skip
(read, write, etc..) when replaying I/Os via the -read_iolog option.
The code currently only implements -replay_skip for blktrace I/O logs.
This pull request adds -replay_skip support for logs generated by fio
via the -write_iolog feature.
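
The effect of skipping can be illustrated with a small Python filter. It
assumes log lines of the form "filename action offset length", which is a
simplification of the fio iolog format, and filter_iolog() is a hypothetical
helper rather than fio code.

```python
from typing import Iterable, List


def filter_iolog(lines: Iterable[str], replay_skip: str) -> List[str]:
    """Drop log entries whose action is listed in replay_skip.

    replay_skip is a comma-separated list such as "write,trim". Lines with
    fewer than two fields are kept unchanged.
    """
    skip = {s.strip() for s in replay_skip.split(",") if s.strip()}
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[1] in skip:
            continue
        kept.append(line)
    return kept
```

For example, replaying with "write,trim" skipped leaves only the read entries
and the file management lines in place.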

Signed-off-by: Adam Horshack (horshack@live.com)
Vincent Fu [Tue, 31 Jan 2023 15:44:54 +0000 (10:44 -0500)]
lib/pattern: fix formatting

The fix I committed was not formatted nicely. Make the code look nicer.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Tue, 31 Jan 2023 15:43:13 +0000 (10:43 -0500)]
test: add test for lib/pattern segfault issue

Add t/jobs/t0028-c6cade16.fio to test the ability for the routines in
lib/pattern.c to parse multi-part buffer patterns.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Mon, 30 Jan 2023 15:37:48 +0000 (10:37 -0500)]
lib/pattern: Fix seg fault when calculating pattern length

When --buffer_pattern or --verify_pattern has multiple elements
(0x110x22 or 0xdeadface"abcd"-12'filename') calculating the length
produces a segmentation fault in parse_and_fill_pattern() because it
increments out when out is passed to the parse_* routines it calls.

This patch uses the fix provided in the GitHub issue.

Fixes: https://github.com/axboe/fio/issues/1500
Fixes: 6c9397396eb83a6ce64a998795e7a50552e4337e "lib/pattern: Support
NULL output buffer in parse_and_fill_pattern()"

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Jens Axboe [Wed, 25 Jan 2023 15:01:30 +0000 (08:01 -0700)]
Enable crc32c acceleration for arm64 on OSX

Before:

jensaxboe@Jenss-MacBook-Pro fio % ./fio --crctest=crc32c
crc32c:   440.18 MiB/sec

After:

jensaxboe@Jenss-MacBook-Pro fio % ./fio --crctest=crc32c
crc32c: 23923.00 MiB/sec

We know we have it on osx on arm hardware, enabling it is pretty
straightforward with that assumption.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 25 Jan 2023 03:54:48 +0000 (20:54 -0700)]
Makefile: add -Wno-stringop-truncation for y.tab.o

This file is auto-generated, and it currently spews the following
warning for me:

In function ‘setup_to_parse_string’,
    inlined from ‘evaluate_arithmetic_expression’ at y.tab.c:1571:2:
y.tab.c:1559:9: warning: ‘strncpy’ specified bound depends on the length of the source argument [-Wstringop-truncation]
 1559 |         strncpy(lexer_input_buffer, string, len);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
y.tab.c:1556:19: note: length computed here
 1556 |         if (len > strlen(string))
      |                   ^~~~~~~~~~~~~~

when compiled with:

gcc (Debian 12.2.0-14) 12.2.0

Just set -Wno-stringop-truncation unconditionally in the Makefile for
this file, don't think there's any point in checking if this warning
has been enabled manually.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Vincent Fu [Thu, 19 Jan 2023 00:58:08 +0000 (19:58 -0500)]
tools/fiograph: accommodate job files not ending in .fio

For job files not ending in .fio, fiograph will overwrite the job file
with a graphviz script and then delete it if --keep is not specified.

Fix this by creating temporary files to contain the graphviz script and
image file. Then rename or delete the script and image file as directed
by the user specified options.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Thu, 19 Jan 2023 18:10:22 +0000 (13:10 -0500)]
examples: remove test.png

test.png doesn't correspond to any example job files. It seems to have
been inadvertently added to the repository.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Wed, 18 Jan 2023 16:14:14 +0000 (11:14 -0500)]
tools/fiograph: improve default config file search

When a config file is not explicitly specified the current default is to
search for fiograph.conf only in the current directory. Change this to
try to use fiograph.conf in the directory where fiograph.py is located
when fiograph.conf is not found in the current directory.
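
The search order can be sketched as below. This is a minimal illustration,
not the code in fiograph.py; find_default_config() and its parameters are
hypothetical.

```python
import os
from typing import Optional


def find_default_config(script_path: str, cwd: Optional[str] = None,
                        name: str = "fiograph.conf") -> Optional[str]:
    """Look for the config in the current directory, then next to the script.

    Returns the first existing path, or None when neither location has it.
    """
    cwd = os.getcwd() if cwd is None else cwd
    script_dir = os.path.dirname(os.path.abspath(script_path))
    for directory in (cwd, script_dir):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A script would typically pass its own __file__ as script_path.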

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Wed, 18 Jan 2023 15:52:31 +0000 (10:52 -0500)]
tools/fiograph: improve default output file name

Instead of removing all occurrences of '.fio' in the job filename, only
remove '.fio' when it is at the end of the job filename.
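
The difference between the two behaviors can be shown with a short Python
sketch. The helper name default_output_name() is hypothetical and is not
taken from fiograph.py.

```python
def default_output_name(job_file: str, suffix: str = ".fio") -> str:
    """Strip the suffix only when it terminates the job file name."""
    if job_file.endswith(suffix):
        return job_file[: -len(suffix)]
    return job_file


# Suffix-only removal keeps an embedded '.fio' intact.
print(default_output_name("my.fio.test.fio"))    # my.fio.test
# Removing every occurrence mangles the name.
print("my.fio.test.fio".replace(".fio", ""))     # my.test
```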

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Wed, 18 Jan 2023 15:43:18 +0000 (10:43 -0500)]
tools/fiograph: add link to file formats

To the fiograph help text add a link to a list of the supported image
file output formats.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Wed, 11 Jan 2023 20:22:40 +0000 (15:22 -0500)]
examples: add missing fiograph diagram for sg_write_same_ndob.fio

This fiograph diagram was missed in the earlier patch that added missing
fiograph diagrams for the example job files. Now each example job file
should have a fiograph diagram.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Wed, 11 Jan 2023 06:48:05 +0000 (12:18 +0530)]
doc: clarify the usage of rw_sequencer

Update the man page to clarify the usage of rw_sequencer=sequential.
Add a few examples explaining the offset generation for rw_sequencer.

Fixes: https://github.com/axboe/fio/issues/1223

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Thu, 22 Dec 2022 04:39:51 +0000 (10:09 +0530)]
engines/xnvme: add support for picking mem backend

Add option to the xnvme fio engine for picking a
memory backend. Update the fio document.

Signed-off-by: Mads Ynddal <m.ynddal@samsung.com>
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Thu, 22 Dec 2022 04:39:50 +0000 (10:09 +0530)]
engines/xnvme: add subnqn to fio-options

For fio to utilize a fabrics target with multiple subsystems, it needs a
way for the user to specify which subsystem to use. This is done by
providing 'subnqn' as fio-option.

Signed-off-by: Simon A. F. Lund <simon.lund@samsung.com>
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Thu, 22 Dec 2022 04:39:49 +0000 (10:09 +0530)]
engines/xnvme: user space vfio based backend

Add an option to use user-space VFIO-based backend,
implemented using libvfn.
Update xnvme engine options for missing backends.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Thu, 22 Dec 2022 04:39:48 +0000 (10:09 +0530)]
engines/xnvme: fixes for xnvme ioengine

1. fix error-handling in xnvme_fioe_queue()
2. fix resource-leak in xnvme_fioe_init()

Signed-off-by: Simon A. F. Lund <simon.lund@samsung.com>
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Tue, 13 Dec 2022 21:34:20 +0000 (21:34 +0000)]
t/run-fio-tests: relax acceptance criteria for t0008

t0008 runs a 50/50 seq read/write job on a 32M file. Since a random number
generator determines whether fio issues a read or a write command, the
read/write ratio will not be exactly 50/50.  This patch relaxes the acceptance
criteria to avoid false positives when we do not use the default random number
seeds.
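
A relaxed check of this kind might look like the Python sketch below. The
function name and the 10% tolerance are hypothetical and are not the values
used by t/run-fio-tests.py.

```python
def ratio_within_tolerance(read_bytes: int, write_bytes: int,
                           expected: float = 0.5,
                           tolerance: float = 0.1) -> bool:
    """Accept a read share that is close to, not exactly, the expected one."""
    total = read_bytes + write_bytes
    if total == 0:
        return False              # no I/O at all is treated as a failure
    return abs(read_bytes / total - expected) <= tolerance
```

With a 32M file, a split of 17M read and 15M written gives a read share of
about 0.53, which passes under this tolerance while an exact 50/50 comparison
would fail.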

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Fri, 16 Dec 2022 18:20:08 +0000 (13:20 -0500)]
examples: add missing fiograph diagrams

We have added multiple example job files recently without including
fiograph diagrams for them. This patch adds fiograph diagrams for
example job files where they were missing.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Fri, 16 Dec 2022 18:44:30 +0000 (13:44 -0500)]
tools/fiograph: update config file

Add 'specific_options' for some new ioengines. Also add a missing
ioengine option for the sg ioengine.

We should have some sort of script to fill in these ioengine options.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Thu, 15 Dec 2022 01:56:06 +0000 (10:56 +0900)]
example: add a zoned block device write example with GC by trim workload

Add an example job file which shows how to simulate garbage collection
of zoned block devices using a job with trim workload and flow options.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Shin'ichiro Kawasaki [Thu, 15 Dec 2022 01:56:05 +0000 (10:56 +0900)]
example: add a zoned block device write example with GC by zone resets

Add an example job file which shows how to simulate garbage collection
of zoned block devices using options zone_reset_threshold and
zone_reset_frequency.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Shin'ichiro Kawasaki [Thu, 15 Dec 2022 01:56:04 +0000 (10:56 +0900)]
HOWTO/man: improve descriptions of max open zones options

The options max_open_zones and job_max_open_zones control the number of
open zones for zonemode=zbd. However, descriptions in HOWTO and man
about these options are not clear enough since it is not well described
what an open zone is. Improve the description by adding explanation of
the open zone state. Also explain the default values for these options.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20221215015606.2767187-3-shinichiro.kawasaki@wdc.com
[axboe: apply wording suggestion from Damien]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Shin'ichiro Kawasaki [Thu, 15 Dec 2022 01:56:03 +0000 (10:56 +0900)]
man: fix troff warning

'make doc' reports a warning:

  troff: <standard input>:2554: warning: can't find font 'b'

To avoid it, add missing 'P' for troff built-in command '\f'.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 12 Dec 2022 23:57:52 +0000 (16:57 -0700)]
t/io_uring: adjust IORING_REGISTER_MAP_BUFFERS value

This isn't upstream yet, but current test patches have it at 26.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Alberto Faria [Thu, 1 Dec 2022 22:08:03 +0000 (22:08 +0000)]
engines/libblkio: Share a single blkio instance among threads in same process

fio groups all subjobs that set option 'thread' into a single process.
Have them all share a single `struct blkio` instance, with one `struct
blkioq` per thread/subjob. This allows benchmarking multi-queue setups.

Note that `struct blkio` instances cannot be shared across different
processes.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Alberto Faria [Thu, 1 Dec 2022 22:08:02 +0000 (22:08 +0000)]
engines/libblkio: Add options for some driver-specific properties

The properties are either common to several drivers or particularly
relevant for benchmarking, so this should help write cleaner workload
files.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Alberto Faria [Thu, 1 Dec 2022 22:08:01 +0000 (22:08 +0000)]
engines/libblkio: Add option libblkio_force_enable_completion_eventfd

When set, the queue's completion fd is enabled even when it isn't used,
i.e., even if option libblkio_wait_mode is _not_ set to "eventfd".

Depending on the libblkio driver, this can have an impact on
performance. This option allows evaluating that overhead.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Alberto Faria [Thu, 1 Dec 2022 22:08:00 +0000 (22:08 +0000)]
engines/libblkio: Add option libblkio_wait_mode

It allows configuring how the engine waits for request completions,
instead of always using a blocking blkioq_do_io() call.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Alberto Faria [Thu, 1 Dec 2022 22:07:59 +0000 (22:07 +0000)]
engines/libblkio: Add option libblkio_write_zeroes_on_trim

When set, trim IOs will be submitted as blkioq_write_zeroes() requests
instead of blkioq_discard() requests.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Alberto Faria [Thu, 1 Dec 2022 22:07:58 +0000 (22:07 +0000)]
engines/libblkio: Add option libblkio_vectored

When enabled, read and write requests are submitted as vectored requests
using blkioq_{readv,writev}(), instead of using blkioq_{read,write}().

Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Alberto Faria [Thu, 1 Dec 2022 22:07:57 +0000 (22:07 +0000)]
engines/libblkio: Add support for poll queues

Configure a poll queue instead of a "regular" queue when option hipri is
set.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
16 months agoengines/libblkio: Allow setting option mem/iomem
Alberto Faria [Thu, 1 Dec 2022 22:07:56 +0000 (22:07 +0000)]
engines/libblkio: Allow setting option mem/iomem

This allows users to customize data buffer memory using fio's existing
options. Users become responsible for ensuring that the allocated memory
satisfies all constraints imposed by the libblkio driver under use.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
16 months agoAdd engine flag FIO_SKIPPABLE_IOMEM_ALLOC
Alberto Faria [Thu, 1 Dec 2022 22:07:55 +0000 (22:07 +0000)]
Add engine flag FIO_SKIPPABLE_IOMEM_ALLOC

It makes it valid to set option mem/iomem even when the engine specifies
iomem_alloc and iomem_free callbacks, allowing users to optionally use
fio's customizable memory allocation logic instead of the engine's.

This is in preparation for giving libblkio engine users the choice
between controlling memory allocation or delegating it to the libblkio
library.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
16 months agoAdd a libblkio engine
Alberto Faria [Thu, 1 Dec 2022 22:07:54 +0000 (22:07 +0000)]
Add a libblkio engine

The libblkio library [1] provides a unified API for efficiently accessing
block devices using modern high-performance block I/O interfaces like
io_uring and vhost-user-blk. Using libblkio reduces the amount of code
needed for interfacing with storage devices and allows developers to
focus on their applications.

Add a libblkio engine that uses libblkio to perform I/O. This is useful
to benchmark the library itself, and also adds support for storage
interfaces and devices otherwise not supported by fio, such as
virtio-blk PCI, vhost-user, and vhost-vDPA devices.

See the libblkio documentation [2] or KVM Forum 2022 [3] presentation
for more information on the library itself.

[1] https://gitlab.com/libblkio/libblkio
[2] https://libblkio.gitlab.io/libblkio/index.html
[3] https://static.sched.com/hosted_files/kvmforum2022/8c/libblkio-kvm-forum-2022.pdf

Signed-off-by: Alberto Faria <afaria@redhat.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
16 months agodoc: update about size
Ankit Kumar [Thu, 1 Dec 2022 05:08:32 +0000 (10:38 +0530)]
doc: update about size

In a few cases with the fio option size, the number of bytes of data
transferred is actually less than what was specified. This can happen if
there are gaps or holes while doing I/O, or if we are running a mix of
sequential and random workloads.
Update the documentation for that.

Fixes: https://github.com/axboe/fio/issues/1486

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
16 months agobackend: respect return value of init_io_u_buffers
Shin'ichiro Kawasaki [Thu, 1 Dec 2022 02:44:25 +0000 (11:44 +0900)]
backend: respect return value of init_io_u_buffers

When a workload requires large I/O buffers, fio fails to allocate them
but does not report a meaningful error message. It just dereferences a
null pointer and fails with signal 11. This symptom is observed with
the command line below:

$ fio --name=job --filename=/tmp/fio --rw=write --bs=1g --size=1g \
      --iodepth=128 --ioengine=libaio

The I/O buffer allocation is done in the function init_io_u_buffers. The
allocation failure is not reported because the return value of the
function is ignored. Check the return value and report it to the higher
layer.
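
The pattern described above can be sketched as follows. This is not fio's
code: alloc_io_buffers() and setup_job() are hypothetical stand-ins for
init_io_u_buffers and its caller, and the overflow check is an assumption
added for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical allocator: returns 0 on success and 1 on failure.
 * *buf is left NULL whenever the allocation fails. */
static int alloc_io_buffers(void **buf, uint64_t bs, uint64_t iodepth)
{
	assert(buf != NULL);
	*buf = NULL;

	/* Refuse empty requests and totals that do not fit in size_t. */
	if (bs == 0 || iodepth == 0 || bs > SIZE_MAX / iodepth)
		return 1;

	*buf = malloc((size_t)(bs * iodepth));
	return *buf ? 0 : 1;
}

/* Hypothetical caller: propagate the failure instead of ignoring it. */
static int setup_job(void **buf, uint64_t bs, uint64_t iodepth)
{
	if (alloc_io_buffers(buf, bs, iodepth)) {
		fprintf(stderr, "failed to allocate %llu buffers of %llu bytes\n",
			(unsigned long long)iodepth, (unsigned long long)bs);
		return 1;
	}
	return 0;
}
```

With bs=1g and iodepth=128 as in the command line above, the request adds
up to 128 GiB, which many systems cannot satisfy.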

Fixes: 71e6e5a2fd5c ("iolog replay: Realloc io_u buffers to adapt to operation size.")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20221201024425.2340442-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
16 months agodocs: description for experimental_verify
Vincent Fu [Tue, 29 Nov 2022 22:09:41 +0000 (17:09 -0500)]
docs: description for experimental_verify

Explain how experimental_verify differs from standard verify.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
16 months agodocs: synchronize fio.1 and HOWTO changes
Vincent Fu [Tue, 29 Nov 2022 18:05:51 +0000 (13:05 -0500)]
docs: synchronize fio.1 and HOWTO changes

A couple of recent patches changed only one of fio.1 or HOWTO. This patch
synchronizes the changes between the two documents.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
16 months agoMerge branch 'lintian-manpage-fixes' of https://github.com/hoexter/fio
Jens Axboe [Mon, 28 Nov 2022 19:54:53 +0000 (12:54 -0700)]
Merge branch 'lintian-manpage-fixes' of https://github.com/hoexter/fio

* 'lintian-manpage-fixes' of https://github.com/hoexter/fio:
  Use correct backslash escape in man 1 fio
  Spelling: Fix allows to -> allows one to in man 1 fio

16 months agoUse correct backslash escape in man 1 fio
Sven Hoexter [Mon, 28 Nov 2022 19:06:36 +0000 (20:06 +0100)]
Use correct backslash escape in man 1 fio

The usage of '\\' to create a literal backslash enclosed in
apostrophe is sometimes wrongly rendered as an acute accent.

Reported by lintian, a Debian package linter.

Signed-off-by: Sven Hoexter <sven@stormbind.net>
16 months agoSpelling: Fix allows to -> allows one to in man 1 fio
Sven Hoexter [Mon, 28 Nov 2022 18:47:48 +0000 (19:47 +0100)]
Spelling: Fix allows to -> allows one to in man 1 fio

Reported by lintian, a Debian package linter.

Signed-off-by: Sven Hoexter <sven@stormbind.net>
17 months agodoc: update about sqthread_poll
Ankit Kumar [Wed, 23 Nov 2022 11:27:38 +0000 (16:57 +0530)]
doc: update about sqthread_poll

Update that when sqthread_poll is enabled fio will not report submission
latency.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agoengines:io_uring: fix clat calculation for sqthread poll
Ankit Kumar [Wed, 23 Nov 2022 11:27:37 +0000 (16:57 +0530)]
engines:io_uring: fix clat calculation for sqthread poll

When sqthread_poll is specified for the io_uring and io_uring_cmd I/O
engines, fio reports garbage values for completion latencies. This is
because the issue time was not recorded. Record the issue time to fix
this.
On the other hand, submission latency for sqthread poll is really just
the time it takes to fill in the SQ ring entries and any syscall
required to wake up the idle kernel thread. So there is really no need
to report it.

This fixes the issue: https://github.com/axboe/fio/issues/1484

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agoMerge branch 'patch-1' of https://github.com/chienfuchen32/fio
Jens Axboe [Wed, 23 Nov 2022 17:41:00 +0000 (10:41 -0700)]
Merge branch 'patch-1' of https://github.com/chienfuchen32/fio

* 'patch-1' of https://github.com/chienfuchen32/fio:
  update documentation typo

17 months agoupdate documentation typo
chienfuchen32 [Wed, 23 Nov 2022 14:33:29 +0000 (22:33 +0800)]
update documentation typo

Signed-off-by: chienfuchen32 <chienfuchen32@gmail.com>
17 months agotest: add large pattern test
Logan Gunthorpe [Fri, 18 Nov 2022 23:16:01 +0000 (16:16 -0700)]
test: add large pattern test

Add a test which writes a file with a 16KB buffer pattern, then
verifies the file with the same pattern.

The test writes a single 16KB block, so the resulting file should be
the same as the pattern file. Verify that this is the case after the
test is run.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agolib/pattern: Support binary pattern buffers on windows
Logan Gunthorpe [Fri, 18 Nov 2022 23:16:00 +0000 (16:16 -0700)]
lib/pattern: Support binary pattern buffers on windows

On Windows, binary files used as pattern buffers may be mangled or
truncated because the files are opened in text mode.

Fix this by passing O_BINARY on Windows when opening the file.
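
One common way to write such an open() call portably is sketched below.
It is an illustration only: open_pattern_file() is a hypothetical helper,
and defining O_BINARY as 0 on other platforms is an assumption of this
sketch, not necessarily how the patch does it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/* O_BINARY only exists on Windows toolchains. Define it as a no-op
 * elsewhere so the same open() call compiles on every platform. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: open a pattern file for raw byte reads.
 * Returns a file descriptor, or -1 on error. */
static int open_pattern_file(const char *path)
{
	assert(path != NULL);
	return open(path, O_RDONLY | O_BINARY);
}
```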

Suggested-by: Vincent Fu <vincentfu@gmail.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agooptions: Support arbitrarily long pattern buffers
Logan Gunthorpe [Fri, 18 Nov 2022 23:15:59 +0000 (16:15 -0700)]
options: Support arbitrarily long pattern buffers

Dynamically allocate the pattern buffer to remove the 512B length
restriction. To accomplish this, store a pointer instead of a fixed
block of memory for the buffers in the thread_options structure.
Then introduce and use the function parse_and_fill_pattern_alloc()
which will calculate the appropriate size of the buffer and allocate
it before filling it.

The buffers will be freed, along with a number of string buffers
in free_thread_options_to_cpu(). They will also be reallocated (if
necessary) when receiving them over the wire with
convert_thread_options_to_cpu().

This allows for specifying real-world compressible data (e.g. the
Calgary Corpus) for the buffer_pattern option.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agolib/pattern: Support short repeated read calls when loading from file
Logan Gunthorpe [Fri, 18 Nov 2022 23:15:58 +0000 (16:15 -0700)]
lib/pattern: Support short repeated read calls when loading from file

Once a pattern file can be much larger, the kernel may return a short
read while loading the file, in which case only part of the file would
be loaded.

Fix this by putting the read in a loop so the entire file will be
read.
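
A minimal sketch of such a read loop follows. read_full() is a
hypothetical helper name, and retrying on EINTR is an assumption of this
sketch rather than something the patch states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until len bytes have
 * arrived, end of file is reached, or an error occurs.
 * Returns the number of bytes read, or -1 on error. */
static ssize_t read_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;

	assert(buf != NULL || len == 0);

	while (done < len) {
		ssize_t ret = read(fd, p + done, len - done);

		if (ret < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (ret == 0)
			break;		/* end of file */
		done += (size_t)ret;	/* short read: loop again */
	}
	return (ssize_t)done;
}
```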

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agolib/pattern: Support NULL output buffer in parse_and_fill_pattern()
Logan Gunthorpe [Fri, 18 Nov 2022 23:15:57 +0000 (16:15 -0700)]
lib/pattern: Support NULL output buffer in parse_and_fill_pattern()

Support passing a NULL output buffer to parse_and_fill_pattern().
Each formatting function simply needs to avoid accessing the buffer
when it is NULL. This allows calculating the required size of the
buffer before one might be allocated.
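
The two-pass idea can be sketched as below. fill_repeat() and
fill_repeat_alloc() are hypothetical names for a toy formatter, not the
functions in lib/pattern; only the NULL-means-measure convention is taken
from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical formatter: emit `value` repeated `count` times.
 * When out is NULL nothing is written, but the return value still
 * reports how many bytes the output needs. */
static size_t fill_repeat(unsigned char value, size_t count,
			  unsigned char *out)
{
	size_t i;

	if (out) {
		for (i = 0; i < count; i++)
			out[i] = value;
	}
	return count;
}

/* Measure first, then allocate exactly that much and fill it.
 * Returns NULL if the allocation fails. */
static unsigned char *fill_repeat_alloc(unsigned char value, size_t count,
					size_t *len)
{
	unsigned char *buf;

	assert(len != NULL);
	*len = fill_repeat(value, count, NULL);	/* pass 1: size only */

	buf = malloc(*len ? *len : 1);
	if (buf)
		fill_repeat(value, count, buf);	/* pass 2: fill */
	return buf;
}
```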

This will be useful in a subsequent patch for dynamically allocating
the pattern buffers.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agocconv: Support pattern buffers of arbitrary size
Logan Gunthorpe [Fri, 18 Nov 2022 23:15:56 +0000 (16:15 -0700)]
cconv: Support pattern buffers of arbitrary size

Change the thread_options_pack structure to support pattern buffers
of arbitrary size by using a flexible array at the end of the
structure to store both the verify_pattern and the buffer_pattern
in that order.

In this way, only the actual bytes of each pattern will be sent over
the wire and patterns of an arbitrary size can be used with the packed
structure.

In order to determine the required size of the structure the function
thread_options_pack_size() is introduced which returns the total
number of bytes required for a given thread_options instance.
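
A much reduced sketch of this layout is shown below. struct opts_pack,
opts_pack_size() and opts_pack_new() are hypothetical names, the real
structure has many more fields, and the byte-order conversion needed on
the wire is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packed layout: two lengths, then the verify pattern
 * bytes followed directly by the buffer pattern bytes. */
struct opts_pack {
	uint32_t verify_pattern_bytes;
	uint32_t buffer_pattern_bytes;
	uint8_t patterns[];	/* flexible array member */
};

/* Total number of bytes needed for the given pattern lengths. */
static size_t opts_pack_size(uint32_t verify_len, uint32_t buffer_len)
{
	return sizeof(struct opts_pack) + (size_t)verify_len + buffer_len;
}

/* Allocate a structure of exactly the required size and copy both
 * patterns into the trailing array, verify pattern first. */
static struct opts_pack *opts_pack_new(const void *verify, uint32_t verify_len,
				       const void *buffer, uint32_t buffer_len)
{
	struct opts_pack *p;

	assert(verify != NULL && buffer != NULL);
	p = malloc(opts_pack_size(verify_len, buffer_len));
	if (!p)
		return NULL;

	p->verify_pattern_bytes = verify_len;
	p->buffer_pattern_bytes = buffer_len;
	memcpy(p->patterns, verify, verify_len);
	memcpy(p->patterns + verify_len, buffer, buffer_len);
	return p;
}
```

Only the bytes that are actually used get allocated and sent, so the
receiver has to check the two lengths against the size of the data it
received before touching the array.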

The two callsites of convert_thread_options_to_net() are then converted
to dynamically allocate a pdu of the appropriate size and the
two callsites of convert_thread_options_to_cpu() are modified to
take the size of the received data to prevent buffer overruns.

Also add specific testing of this feature in fio_test_cconv().

Since this changes the client/server protocol, the FIO_SERVER_VER
is bumped.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agot/zbd: add test case to check experimental_verify option
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:13:08 +0000 (11:13 +0900)]
t/zbd: add test case to check experimental_verify option

The option experimental_verify does not work with zonemode=zbd. Add a
test case to check that fio errors out when both options are
specified.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agot/zbd: remove experimental_verify option from test case #54
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:13:07 +0000 (11:13 +0900)]
t/zbd: remove experimental_verify option from test case #54

The option experimental_verify does not work with zonemode=zbd. Remove
it from the test case #54.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agot/zbd: add test case to check zone_reset_threshold/frequency with verify
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:13:06 +0000 (11:13 +0900)]
t/zbd: add test case to check zone_reset_threshold/frequency with verify

A recent commit fixed an assertion failure observed with the
zone_reset_threshold and zone_reset_frequency options together with
verify. Add a test case to confirm the fix. Run four types of workloads
and confirm that no error occurs.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agot/zbd: modify test case #34 for block size unaligned to zone size
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:13:05 +0000 (11:13 +0900)]
t/zbd: modify test case #34 for block size unaligned to zone size

Test case #34 confirmed that a block size unaligned to the zone size
is handled as an error. After a recent fix, fio is able to handle such
block sizes, so the check for the error is no longer required.

Instead of removing this unnecessary test case, change it to cover
verify with a complex workload. It runs a random write workload with
high queue depth and verify. Use two block sizes unaligned to the zone
size. This test workload is the same as test case #57 except for the
verify option and block sizes.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agot/zbd: fix test case #33 for block size unaligned to zone size
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:13:04 +0000 (11:13 +0900)]
t/zbd: fix test case #33 for block size unaligned to zone size

A recent fix of zonemode=zbd support for block sizes unaligned to the
zone size revealed that test case #33 has two issues.

The first issue is test preparation. Before the fix, writes went only
to the first zone due to a bug in zone selection which happens when
block size is not a divisor of zone size, so the status of the second
zone did not matter. After the fix, the write count to the zones may
vary if the second zone is almost full, since that zone is skipped as a
write target. Fix this by resetting the write target zones.

The second issue is the expected written data size. The test case checks
that the written data size is larger than the io_size option value. This
expectation was fine before the fix because data write was repeated in
do_io() and the limit was checked with io_issue_bytes_exceeded(),
which breaks the loop when written data is larger than io_size.
However, after the fix, the limit is checked with keep_running() in
thread_main(). According to the code and block comment in keep_running(),
a fio job terminates even when the written data size is smaller than
io_size if the gap is smaller than the maximum IO size. The expected
written data size is then the largest multiple of block size smaller
than or equal to io_size. This io_size check change resulted in the test
case failure. Avoid the failure by fixing the expected written data size
calculation.
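
The corrected expectation is plain integer arithmetic. The helper below
is a hypothetical illustration in C and is not taken from the test
script itself.

```c
#include <assert.h>
#include <stdint.h>

/* Largest multiple of bs that is smaller than or equal to io_size. */
static uint64_t expected_written(uint64_t io_size, uint64_t bs)
{
	assert(bs > 0);
	return io_size / bs * bs;	/* integer division rounds down */
}
```

For example, an io_size of 1000 bytes with a block size of 300 bytes
gives an expected written size of 900 bytes.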

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agozbd: prevent experimental verify with zonemode=zbd
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:13:03 +0000 (11:13 +0900)]
zbd: prevent experimental verify with zonemode=zbd

When the experimental_verify option is specified, fio does not record
normal I/O metadata to create verify read units. Instead, fio resets file
status after normal I/O and before verify start, then replays I/Os to
regenerate write units and adjusts them for verify reads. This I/O replay
does not work for zonemode=zbd since zone status needs to be restored at
verify start to the status before the normal I/O. However, such a status
restore moves write pointers and erases the data pattern for verify.

Check that the options experimental_verify and zonemode=zbd are not
specified together and error out in case they are both specified. Also
remove the helper function zbd_replay_write_order() which is called from
zbd_adjust_block(). This function adjusts the verify read unit to meet
zone write restrictions for experimental verify, but such adjustment
does not work because zone status is not restored before verify start.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agozbd: fix zone reset condition for verify
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:13:02 +0000 (11:13 +0900)]
zbd: fix zone reset condition for verify

When data verification is requested, zbd_file_reset() resets zones only
when td->runstate is not TD_VERIFYING so that the data to read back for
verify is not wiped out. However, even when verify data remains to be
read, td->runstate is not always TD_VERIFYING. When the verify_backlog
option is set, or when block size is not a divisor of zone size,
zbd_file_reset() can be called while td->runstate is TD_RUNNING. This
causes verify failures.

To avoid the failures, improve the check condition to reset zones in
zbd_file_reset(). On top of td->runstate, refer to the td->io_hist_len,
td->verify_batch and td->o.verify_backlog values to avoid zone reset.
This is the same check as in check_get_verify().

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agozbd, verify: verify before zone reset for zone_reset_threshold/frequency
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:13:01 +0000 (11:13 +0900)]
zbd, verify: verify before zone reset for zone_reset_threshold/frequency

When the zone_reset_threshold and zone_reset_frequency options are
specified with zonemode=zbd, fio resets zones that are not full. When
the verify option is specified on top of that, the reset of non-full
zones erases data for verify and causes verify failure. The current
implementation avoids this scenario with an assert.

To allow zone_reset_threshold/frequency options together with verify,
do verify before the zone reset. When zone reset is required for an open
zone with verify data, call get_next_verify() to get an io_u for verify
read and return it from zbd_adjust_block(). When io_u->file is set,
get_next_verify() assumes the io_u is requeued and does nothing. Unset
io_u->file to tell get_next_verify() that the io_u is not requeued.

Also modify verify_io_u() to skip the rand_seed check when the option
zone_reset_frequency is set. When the option is set, the random seed is
not reset for verify reads, in the same manner as with the
verify_backlog option, so this check is not valid.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agozbd: allow block size not divisor of zone size
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:13:00 +0000 (11:13 +0900)]
zbd: allow block size not divisor of zone size

The current implementation checks that block size is a divisor of zone
size when a verify workload is specified. After the recent fix for block
sizes unaligned to zone size, this check is no longer valid. Remove the
check.

The check had been valid because such a block size left an unwritten
area at each zone end and kept the zones in open/active status until the
verify read was done. This easily hit the max open/active zones limit.
After the fix, the zones with unwritten area are finished, so they do
not hit the limit.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agozbd: finish zones with remainder smaller than minimum write block size
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:12:59 +0000 (11:12 +0900)]
zbd: finish zones with remainder smaller than minimum write block size

When zonemode is zbd and block size is not a divisor of zone size, write
target zone selection does not work as expected. When the write is a
random write and the device has a max open zone limit, the random write
is repeated to the zones selected up to the max open zone limit. All
writes are repeated only to those zones. When the write is sequential,
writes go only to the first zone. The cause of such unexpected zone
selection is the current write target zone selection logic. It selects
write target zones within open zones. When block size is not a divisor
of zone size, the selected open zone has only a remainder of writable
blocks smaller than the block size. Fio resets such a zone after zone
selection and continues writing to it. This zone reset is required not
to exceed the limit of the max_open_zones option or the max_active_zone
limit of the zoned device, but it does not simulate the workload.

To avoid the zone reset and unexpected writes to the same zone, fix
write target zone handling of zones with a remainder smaller than the
write block size. Do not reset but finish such a zone so as not to
exceed the max_open_zones option and max_active_zone limit. Then choose
the zone next to the finished zone as the write target. To implement
this, add the helper function zbd_finish_zone().
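
The intended behaviour for a sequential write can be illustrated with a
toy model. simulate_seq_writes() below is hypothetical and ignores the
open zone limits and everything else fio tracks. It only shows that each
zone takes as many whole blocks as fit, is then finished rather than
reset, and that writing moves on to the next zone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: write bs-sized blocks sequentially across nr_zones zones
 * of zone_size bytes each. Returns the number of writes issued and
 * stores in *finished how many zones were finished with an unwritten
 * remainder smaller than bs. */
static uint64_t simulate_seq_writes(uint64_t zone_size, uint64_t bs,
				    unsigned int nr_zones,
				    unsigned int *finished)
{
	uint64_t writes = 0;
	unsigned int z;

	assert(bs > 0 && finished != NULL);
	*finished = 0;

	for (z = 0; z < nr_zones; z++) {
		uint64_t wp = 0;	/* write pointer, relative to zone start */

		while (zone_size - wp >= bs) {
			wp += bs;
			writes++;
		}
		if (wp < zone_size)
			(*finished)++;	/* remainder too small: finish, move on */
	}
	return writes;
}
```

With a zone size of 10 units, a block size of 3 units and 4 zones, the
model issues 12 writes and finishes all 4 zones with 1 unit left
unwritten in each.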

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agozbd: add zbd_zone_remainder() helper function
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:12:58 +0000 (11:12 +0900)]
zbd: add zbd_zone_remainder() helper function

Add the helper function zbd_zone_remainder(), which returns the number
of bytes that are still available for writing before the zone gets full.
Use this function to improve readability. It will also be used in the
following patch.
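
A sketch of what such a helper computes is shown below. struct zone and
its field names are hypothetical simplifications and do not match fio's
zone descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced zone descriptor. All fields are byte offsets. */
struct zone {
	uint64_t start;		/* first byte of the zone */
	uint64_t capacity;	/* writable bytes in the zone */
	uint64_t wp;		/* current write pointer */
};

/* Bytes still available for writing before the zone becomes full. */
static uint64_t zone_remainder(const struct zone *z)
{
	uint64_t end;

	assert(z != NULL);
	end = z->start + z->capacity;

	return z->wp >= end ? 0 : end - z->wp;
}
```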

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agoengines/libzbc: add libzbc_finish_zone() helper function
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:12:57 +0000 (11:12 +0900)]
engines/libzbc: add libzbc_finish_zone() helper function

To support the zone finish operation on ZBC drives through libzbc, add
a finish_zone() callback to struct ioengine_ops and implement it in the
libzbc IO engine. This feature keeps zone handling by zonemode=zbd the
same for the libzbc engine as for other engines.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agooslib: blkzoned: add blkzoned_finish_zone() helper function
Shin'ichiro Kawasaki [Mon, 14 Nov 2022 02:12:56 +0000 (11:12 +0900)]
oslib: blkzoned: add blkzoned_finish_zone() helper function

Add the helper function blkzoned_finish_zone() to support the zone
finish operation on zoned block devices through ioctl. This feature will
be used to change the status of zones which are not yet full but do not
have enough room to write the next block, so that such zones do not
exceed the max active zones limit. This function does zone finish only
when the kernel supports the ioctl BLKFINISHZONE. Otherwise, it does
nothing. This should be fine since a kernel without BLKFINISHZONE does
not report the max active zone limit through sysfs to user space.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
17 months agoMerge branch 'master' of https://github.com/bvanassche/fio
Vincent Fu [Mon, 14 Nov 2022 13:47:00 +0000 (08:47 -0500)]
Merge branch 'master' of https://github.com/bvanassche/fio

* 'master' of https://github.com/bvanassche/fio:
  os/os.h: Improve cpus_configured()
  configure: Fix the struct nvme_uring_cmd detection
  configure: Fix clock_gettime() detection

17 months agoos/os.h: Improve cpus_configured()
Bart Van Assche [Sun, 13 Nov 2022 23:10:10 +0000 (15:10 -0800)]
os/os.h: Improve cpus_configured()

Fix the following Coverity complaints:

1. negative_return: Calling sysconf, which might return a negative value.
2. return_negative_fn: Returning the return value of sysconf, which might be negative.
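
One way to address this class of complaint is to clamp the result before
returning it, as sketched below. cpus_configured_safe() is a hypothetical
name, and falling back to 1 is an assumption of this sketch rather than a
statement of what the patch does.

```c
#include <unistd.h>

/* Sketch: never hand a negative (or zero) sysconf() result to callers
 * that expect a usable CPU count. */
static long cpus_configured_safe(void)
{
	long n = sysconf(_SC_NPROCESSORS_CONF);

	return n < 1 ? 1 : n;
}
```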

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
17 months agoconfigure: Fix the struct nvme_uring_cmd detection
Bart Van Assche [Sun, 13 Nov 2022 23:11:54 +0000 (15:11 -0800)]
configure: Fix the struct nvme_uring_cmd detection

Prevent the struct nvme_uring_cmd detection from failing as follows:

error: unused variable 'cmd' [-Werror=unused-variable]

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
17 months agoconfigure: Fix clock_gettime() detection
Bart Van Assche [Sun, 13 Nov 2022 23:09:22 +0000 (15:09 -0800)]
configure: Fix clock_gettime() detection

Prevent the clock_gettime() and CLOCK_MONOTONIC tests from failing as follows:

error: argument 2 null where non-null expected [-Werror=nonnull]
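
A probe that avoids this warning passes a real struct timespec instead
of a null pointer. The function below is a hypothetical sketch in the
style of a configure test, not the text of the actual test program.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Pass the address of a real timespec so that -Werror=nonnull has
 * nothing to complain about. Returns 0 when the clock is usable. */
static int probe_clock_monotonic(void)
{
	struct timespec ts;

	return clock_gettime(CLOCK_MONOTONIC, &ts);
}
```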

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
17 months agoMerge branch 'master' of https://github.com/bvanassche/fio
Jens Axboe [Mon, 7 Nov 2022 23:20:04 +0000 (16:20 -0700)]
Merge branch 'master' of https://github.com/bvanassche/fio

* 'master' of https://github.com/bvanassche/fio:
  Android: Enable zoned block device support
  Windows: Fix the build

17 months agoAndroid: Enable zoned block device support
Bart Van Assche [Mon, 7 Nov 2022 19:14:34 +0000 (11:14 -0800)]
Android: Enable zoned block device support

Enable support for --zonemode=zbd on Android.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
17 months agoWindows: Fix the build
Bart Van Assche [Mon, 7 Nov 2022 20:37:16 +0000 (12:37 -0800)]
Windows: Fix the build

Fix the build errors about passing arguments without a prototype.

Fixes: 93bcfd20e37c ("Move Windows port to MinGW")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
17 months agoFio 3.33 fio-3.33
Jens Axboe [Sun, 6 Nov 2022 20:55:41 +0000 (13:55 -0700)]
Fio 3.33

Signed-off-by: Jens Axboe <axboe@kernel.dk>
17 months agotest: use homebrew to install sphinx instead of pip on macOS
Vincent Fu [Fri, 4 Nov 2022 17:15:10 +0000 (13:15 -0400)]
test: use homebrew to install sphinx instead of pip on macOS

With the current GitHub Actions macOS image, pip3 install sphinx does
not appear to place sphinx-doc in the path. This results in
documentation build failures.

Resolve this by using homebrew to install sphinx-doc and add it to the
search path.

https://www.sphinx-doc.org/en/master/usage/installation.html
https://github.com/vincentkfu/fio/actions/runs/3395703049/jobs/5645918799

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>