Vincent Fu [Mon, 6 Nov 2023 18:41:53 +0000 (13:41 -0500)]
client/server: enable per_job_logs option
On the client side log files were being overwritten when per_job_logs
was set to false because of the flags used when log files were opened.
Add per_job_logs to the on-wire protocol so that the client can adjust
the flags and open files in append mode when per_job_logs is set to
false.
Fixes: https://github.com/axboe/fio/issues/1032
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
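Note on the per_job_logs fix above: the overwrite comes down to the mode a shared log file is opened in. The following is a minimal Python sketch of that difference, not fio's C code; `open_log` and `write_shared_log` are hypothetical names used only for illustration.

```python
import os
import tempfile


def open_log(path, per_job_logs):
    """Hypothetical helper: truncate for per-job logs, append otherwise."""
    # Assumption: with per_job_logs false the same file is opened more
    # than once, so each open must append instead of truncating.
    mode = "w" if per_job_logs else "a"
    return open(path, mode)


def write_shared_log(lines, per_job_logs):
    """Write each line through a fresh open() and return the file's lines."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        for line in lines:
            with open_log(path, per_job_logs) as f:
                f.write(line + "\n")
        with open(path) as f:
            return f.read().splitlines()
    finally:
        os.remove(path)
```

In append mode both writers' entries survive; in truncate mode only the last writer's entry remains, which matches the symptom described in the commit message.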
Vincent Fu [Fri, 3 Nov 2023 15:21:22 +0000 (11:21 -0400)]
Merge branch 'thinkcycles-parameter' of https://github.com/cloehle/fio
* 'thinkcycles-parameter' of https://github.com/cloehle/fio:
fio: Introduce new constant thinkcycles option
Christian Loehle [Mon, 23 Oct 2023 09:42:26 +0000 (10:42 +0100)]
fio: Introduce new constant thinkcycles option
The thinkcycles parameter sets a number of cycles to spin between
requests, to model real-world applications more realistically.
The thinktime parameter family can be used to model the time an application
spends processing data. Unfortunately, think time is currently specified as a
constant duration and is therefore affected by CPU frequency settings or by
task migration to a CPU with different capacity.
The new thinkcycles parameter closes that gap and allows specifying a constant
number of cycles instead, such that CPU capacity is taken into account.
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Ankit Kumar [Thu, 2 Nov 2023 13:59:28 +0000 (19:29 +0530)]
engines/xnvme: fix fdp support for userspace drivers
The xNVMe backend supports FDP commands for userspace drivers
such as SPDK. Enable support in the xnvme ioengine.
Update the xnvme fdp example file accordingly.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20231102135928.195372-1-ankit.kumar@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 31 Oct 2023 15:27:15 +0000 (09:27 -0600)]
Merge branch 'pi-perf' of https://github.com/ankit-sam/fio
* 'pi-perf' of https://github.com/ankit-sam/fio:
crct10: use isa-l for crc if available
Ankit Kumar [Tue, 31 Oct 2023 18:45:40 +0000 (00:15 +0530)]
crct10: use isa-l for crc if available
isa-l provides fast implementations for various polynomials.
This is only used for end-to-end data protection, where it has
a significant impact on performance.
See: https://github.com/intel/isa-l
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Vincent Fu [Wed, 25 Oct 2023 21:53:40 +0000 (17:53 -0400)]
Merge branch 'englist' of https://github.com/vt-alt/fio
* 'englist' of https://github.com/vt-alt/fio:
nfs: Fix incorrect engine registering for '--enghelp' list
Vincent Fu [Wed, 25 Oct 2023 18:47:45 +0000 (18:47 +0000)]
engines/io_uring_cmd: allocate enough ranges for async trims
We round up the iodepth to the next highest power of 2. So io_u->index
can be greater than the iodepth specified by the user. Make sure we
allocate enough of the buffers used to store the ranges for async trim
commands when the iodepth specified by the user is not a power of 2.
Fixes: 4885a6eba420ce216e4102df3e42229e167d1b7b ("engines/io_uring_cmd: make trims async")
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
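Note on the io_uring_cmd trim fix above: the sizing problem follows from rounding the iodepth up to a power of 2. A short Python sketch of that arithmetic, assuming one range buffer per possible io_u index (`trim_range_buffers_needed` is a hypothetical name, not a fio function):

```python
def next_pow2(n):
    """Smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def trim_range_buffers_needed(iodepth):
    """Hypothetical sizing rule: one buffer per index up to the rounded depth."""
    # io_u->index can reach next_pow2(iodepth) - 1, so allocating only
    # `iodepth` buffers is too few when iodepth is not a power of 2.
    return next_pow2(iodepth)
```

For example, a user-specified iodepth of 6 is rounded to 8, so indexes 6 and 7 would fall outside an allocation of only 6 buffers.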
Vitaly Chikunov [Tue, 24 Oct 2023 02:29:40 +0000 (05:29 +0300)]
nfs: Fix incorrect engine registering for '--enghelp' list
The `ioengine` symbol from the internal `nfs` engine is incorrectly exported,
so it overrides the value used in the constructor callbacks of external
engines, which register each engine for listing with `--enghelp`.
Because adding the same entry to an flist twice is unsafe, this also corrupts
`engine_list`, causing an infinite loop or abnormal list termination
when the engine list is printed.
Issue: https://github.com/axboe/fio/issues/1655
Fixes: 9326926b ("NFS engine")
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Jens Axboe [Mon, 23 Oct 2023 14:32:46 +0000 (08:32 -0600)]
Merge branch 'spellingfixes-2023-10-23' of https://github.com/proact-de/fio
* 'spellingfixes-2023-10-23' of https://github.com/proact-de/fio:
Various spelling fixes.
Martin Steigerwald [Mon, 23 Oct 2023 14:14:50 +0000 (16:14 +0200)]
Various spelling fixes.
Most of them have been reported by Debian's Lintian tool.
Signed-off-by: Martin Steigerwald <martin.steigerwald@proact.de>
Jens Axboe [Mon, 23 Oct 2023 00:52:51 +0000 (18:52 -0600)]
Merge branch 'fix-riscv64-cpu-clock' of https://github.com/gilbsgilbs/fio
* 'fix-riscv64-cpu-clock' of https://github.com/gilbsgilbs/fio:
riscv64: get clock from `rdtime` instead of `rdcycle`
Gilbert Gilb's [Sun, 22 Oct 2023 17:06:45 +0000 (19:06 +0200)]
riscv64: get clock from `rdtime` instead of `rdcycle`
`rdcycle` pseudo-instruction accesses the "cycle CSR" which holds the
real count of CPU core clock cycles [1]. As this leaves room for
side-channel attacks, access to this register from userland might be
forbidden by the kernel, which results in a SIGILL [2].
Anyhow, it seems that the actual usage of the `get_cpu_clock` function
in fio is about getting a wall-clock rather than the actual CPU core
clock (for instance, x86 uses `rdtsc`), so this is technically a bug.
The "time CSR" is the proper register to track time on riscv64. Also,
the "time CSR" is more likely to be available from userspace and not
cause a crash.
[1] RISC-V ISA Section 10.1: https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf
[2] https://lore.kernel.org/all/YxIzgYP3MujXdqwj@aurel32.net/T/
Signed-off-by: N. Le Roux <gilbsgilbs@gmail.com>
Signed-off-by: Gilbert Gilb's <gilbsgilbert@gmail.com>
Jens Axboe [Fri, 20 Oct 2023 10:32:39 +0000 (04:32 -0600)]
Merge branch 'master' of https://github.com/michalbiesek/fio
* 'master' of https://github.com/michalbiesek/fio:
riscv64: add syscall helpers
Jens Axboe [Fri, 20 Oct 2023 10:30:43 +0000 (04:30 -0600)]
Fio 3.36
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Michal Biesek [Wed, 23 Aug 2023 15:26:22 +0000 (17:26 +0200)]
riscv64: add syscall helpers
Use syscall helpers to optimize io_uring_enter(2) calls, eliminating
the need for libc functions. These wrappers are adapted from liburing.
Signed-off-by: michalbiesek <michalbiesek@gmail.com>
Vincent Fu [Thu, 19 Oct 2023 12:37:23 +0000 (08:37 -0400)]
Merge branch 'master' of https://github.com/shailevi23/fio
* 'master' of https://github.com/shailevi23/fio:
helper_thread: fix pthread_sigmask typo.
configure: improve pthread_sigmask detection.
Jens Axboe [Thu, 19 Oct 2023 12:32:25 +0000 (06:32 -0600)]
Merge branch 'fix_issue_1642' of https://github.com/zqs-Oppenauer/fio
* 'fix_issue_1642' of https://github.com/zqs-Oppenauer/fio:
fix assert failed when timeout during call rate_ddir.
zhuqingsong.0909 [Thu, 19 Oct 2023 03:29:27 +0000 (11:29 +0800)]
fix assert failed when timeout during call rate_ddir.
Add DDIR_TIMEOUT to enum fio_ddir, and have rate_ddir return it when fio times out.
set_io_u_file will directly break out of the loop, and fill_io_u won't be called,
which causes assert to fail in rate_ddir, because td->rwmix_ddir is DDIR_INVAL.
Signed-off-by: QingSong Zhu zhuqingsong.0909@bytedance.com
Shai Levy [Mon, 16 Oct 2023 11:29:04 +0000 (14:29 +0300)]
helper_thread: fix pthread_sigmask typo.
Signed-off-by: Shai Levy <shailevy23@gmail.com>.
Shai Levy [Mon, 16 Oct 2023 11:26:20 +0000 (14:26 +0300)]
configure: improve pthread_sigmask detection.
On Windows systems, pthread_sigmask is defined as a no-op, which
triggers an unused-variable warning for sigmask.
By triggering the same warning in the configure script, we make
CONFIG_PTHREAD_SIGMASK undefined in the Windows msys2 build.
Signed-off-by: Shai Levy <shailevy23@gmail.com>.
Vincent Fu [Mon, 16 Oct 2023 14:03:36 +0000 (10:03 -0400)]
ci: explicitly install pygments and certifi on macos
The documentation build on macOS started failing because of errors with
the pygments and certifi modules. Homebrew is not automatically
installing pygments and python-certifi which are listed as packages that
sphinx-doc depends on because they are already present in the runner
image at the required versions (2.16.1 and 2023.7.22, respectively).
Explicitly installing the two packages bumps the versions to slightly
newer ones (2.16.1_1 and 2023.7.22_1, respectively). This appears to
resolve the documentation build problem.
https://formulae.brew.sh/formula/sphinx-doc
https://github.com/axboe/fio/actions/runs/6533001329/job/17739452911#step:13:155
https://github.com/vincentkfu/fio/actions/runs/6535039949/job/17743571376#step:13:148
https://github.com/vincentkfu/fio/actions/runs/6535229986/job/17744177918#step:6:10
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Fri, 6 Oct 2023 07:13:20 +0000 (16:13 +0900)]
t/zbd: avoid test case 45 failure
When zonemode=zbd option is not specified, random writes to zoned block
devices fail because writes to sequential write required zones shall
happen only at write pointers. Randomly chosen write addresses do not
match the write pointers, so the writes fail. On such failures, fio prints
out the message below and tells users how to avoid the failures:
"fio: first I/O failed. If .* is a zoned block device, consider --zonemode=zbd".
The test case 45 in t/zbd/test-zbd-support confirms the message is
printed when the first random write command to a sequential write
required zone fails. However, the random write can occasionally succeed,
since the randomly chosen write address can be the same as the write pointer
address. For example, on a zoned block device with a 1MB zone size and a 4KB
block size, the first random write lands on the write pointer with a
probability of 4KB/1MB = 1/256. This causes sporadic test case failures.
Avoid the failures with two changes. First, widen the random write
range from a single zone to all sequential write required zones to reduce
the failure ratio. Second, repeat the test if the message is not printed
because of an accidental write success. Because the per-run failure ratios
multiply across repetitions, the overall failure ratio becomes small enough
to be ignored.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20231006071320.425270-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
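Note on the test case 45 change above: the 1/256 figure and the effect of repeating the test can be checked with a few lines of arithmetic. This Python sketch assumes a uniformly chosen block-aligned offset, a single write pointer in the range, and independent attempts; the function names are illustrative and do not come from t/zbd.

```python
from fractions import Fraction


def accidental_success_ratio(block_size, write_range):
    """Chance that one random block-aligned write hits the write pointer."""
    return Fraction(block_size, write_range)


def ratio_after_retries(single_ratio, attempts):
    """Chance that every one of `attempts` independent tries succeeds by accident."""
    return single_ratio ** attempts
```

With a 4KB block and a 1MB range the single-run ratio is 1/256, and three consecutive accidental successes would occur with a ratio of 1/256**3.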
Vincent Fu [Fri, 6 Oct 2023 19:27:23 +0000 (15:27 -0400)]
Merge branch 'fix-stat-overflow' of https://github.com/stilor/fio
* 'fix-stat-overflow' of https://github.com/stilor/fio:
Handle 32-bit overflows in disk utilization stats
Change memcpy() calls to assignments
Alexey Neyman [Tue, 3 Oct 2023 22:49:02 +0000 (22:49 +0000)]
Handle 32-bit overflows in disk utilization stats
Linux prints [1] some of the values reported in block device's `stat`
sysfs file as 32-bit unsigned integers. fio interprets them as 64-bit
integers when reading that sysfs file and performs further arithmetic
on them in 64 bits. If the reported value overflows during a fio run,
a huge bogus value is reported in the "disk utilization" block instead.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/genhd.c#n962
Signed-off-by: Alexey Neyman <aneyman@google.com>
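Note on the disk utilization fix above: the bogus value appears when a counter that wraps at 32 bits is subtracted with 64-bit arithmetic. The Python sketch below contrasts the two calculations; it illustrates the arithmetic only and is not necessarily how the fio patch handles the overflow.

```python
U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF


def naive_delta(before, after):
    """Unsigned 64-bit subtraction that ignores a 32-bit wrap in the source."""
    return (after - before) & U64_MASK


def wrap_aware_delta(before, after):
    """Difference of two samples of a counter that wraps at 2**32."""
    return (after - before) & U32_MASK
```

If the counter moves from 0xFFFFFFF0 to 0x10 during a run, the wrap-aware difference is 32, while the naive 64-bit difference is a value close to 2**64.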
Alexey Neyman [Thu, 5 Oct 2023 18:43:17 +0000 (18:43 +0000)]
Change memcpy() calls to assignments
This is to avoid triggering a spurious warning caused by [1], which
is triggered by the next commit in chain (unrelated change in
update_io_tick_disk()).
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111696
Signed-off-by: Alexey Neyman <aneyman@google.com>
Vincent Fu [Mon, 2 Oct 2023 13:41:54 +0000 (06:41 -0700)]
iolog: don't truncate time values
We store iolog timestamps as 64-bit unsigned integers but when we print
timestamps in the logs we type cast them to unsigned longs. This is fine
on platforms with 64-bit longs but on platforms like Windows with 32-bit
longs this truncates the timestamps. Truncation causes problems
especially when options such as --log_unix_epoch are enabled because the
number of milliseconds since 01-01-1970 does not fit in a 32-bit
unsigned long.
Fix this by getting rid of the type cast and using PRIu64 as the format
specifier.
Fixes: https://github.com/axboe/fio/issues/1638
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
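Note on the iolog truncation fix above: a millisecond Unix timestamp needs more than 32 bits, which is why the cast loses information on platforms with 32-bit longs. This Python sketch models the cast with a mask; `as_unsigned_long_32` is an illustrative name, not a fio function.

```python
from datetime import datetime, timezone


def as_unsigned_long_32(value):
    """Model a cast to a 32-bit unsigned long: keep only the low 32 bits."""
    return value & 0xFFFFFFFF


def epoch_ms(when):
    """Milliseconds since 01-01-1970 for a timezone-aware datetime."""
    return int(when.timestamp() * 1000)
```

For the commit date of 2 Oct 2023 the millisecond count is about 1.7e12, far above the 32-bit limit of roughly 4.3e9, so the masked value no longer matches the original.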
Vincent Fu [Fri, 29 Sep 2023 12:58:10 +0000 (08:58 -0400)]
ci: switch macos runs from macos-12 to macos-13
macOS 13 was released in Oct 2022 and the GitHub Actions image debuted
Apr 2023. Let's switch to this new platform.
It seems to work fine based on two runs testing the nice fix in the
previous patch.
https://github.com/vincentkfu/fio/actions/runs/6352923622/job/17256648185
https://github.com/vincentkfu/fio/actions/runs/6353159388/job/17257286332
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Fri, 29 Sep 2023 14:04:50 +0000 (10:04 -0400)]
workqueue: handle nice better
nice returns the program's scheduling priority. This can be a negative
value even when there is no error condition. To check whether nice failed,
we have to check errno.
The most recent three macOS test failures appear to be due to this
problem.
https://github.com/axboe/fio/actions/runs/6347762639/job/17243247222
https://github.com/axboe/fio/actions/runs/6346019205/job/17239370410
https://github.com/axboe/fio/actions/runs/6241934089/job/16944981876
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
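Note on the workqueue nice fix above: because -1 is a legitimate return value, the caller has to clear errno before the call and inspect it afterwards. The sketch below shows that pattern from Python through ctypes on a POSIX system; `checked_nice` is a hypothetical wrapper and this is not the fio code.

```python
import ctypes
import os

# POSIX only: load the C library of the running process with errno tracking.
_libc = ctypes.CDLL(None, use_errno=True)


def checked_nice(increment):
    """Call nice() and detect failure through errno, not the return value."""
    ctypes.set_errno(0)                # clear errno before the call
    value = _libc.nice(increment)
    err = ctypes.get_errno()
    if value == -1 and err != 0:       # -1 alone does not mean failure
        raise OSError(err, os.strerror(err))
    return value
```

Calling `checked_nice(0)` leaves the priority unchanged and returns the current nice value, which may itself be negative for a process started with raised priority.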
Jens Axboe [Fri, 29 Sep 2023 06:05:10 +0000 (00:05 -0600)]
Merge branch 'fix_verify_block_offset' of https://github.com/ipylypiv/fio
* 'fix_verify_block_offset' of https://github.com/ipylypiv/fio:
verify: Fix the bad pattern block offset value
Igor Pylypiv [Thu, 28 Sep 2023 23:37:14 +0000 (16:37 -0700)]
verify: Fix the bad pattern block offset value
We offset buf by header_size for pattern verification. Add header_size
to the mismatched buf offset to get the correct block offset value.
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
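Note on the verify offset fix above: when the comparison starts header_size bytes into the block, a mismatch index found in the payload has to be shifted back by header_size to give a block offset. A Python sketch under the assumption that the pattern repeats from the start of the payload (`first_bad_block_offset` is a hypothetical name):

```python
def first_bad_block_offset(block, pattern, header_size):
    """Block offset of the first payload byte that differs from the
    repeating pattern, or None if the whole payload matches."""
    payload = block[header_size:]      # verification starts after the header
    for i, byte in enumerate(payload):
        if byte != pattern[i % len(pattern)]:
            return header_size + i     # translate back to a block offset
    return None
```

Without the `header_size +` term the reported position would be relative to the payload and would point header_size bytes too early in the block.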
Shin'ichiro Kawasaki [Wed, 13 Sep 2023 01:52:49 +0000 (10:52 +0900)]
t/zbd: set mq-deadline scheduler to device-mapper destination devices
When write workloads run on zoned block devices, mq-deadline scheduler is
required to ensure write operations are sequential. To fulfill this
requirement, the test script t/zbd/test-zbd-support sets mq-deadline to
the sysfs attribute "queue/scheduler". However, this preparation does
not work when the write target device is a bio based device-mapper
device. Because the device is bio based, the I/O scheduler has no effect
on it, and setting mq-deadline via the sysfs attribute does nothing. On top of
that, the sysfs attribute "queue/scheduler" is no longer available for
bio based device-mapper devices since Linux kernel version v6.5.
To ensure mq-deadline scheduler for bio based device-mapper devices,
improve the helper function set_io_scheduler. If the sysfs attribute
"queue/scheduler" is available, use it. Otherwise, check if the test
device is a zoned device-mapper (linear, flakey or crypt). If so, set
mq-deadline scheduler to destination devices of the device-mapper
device. To implement these, add some helper functions.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230913015249.2226799-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Wed, 20 Sep 2023 11:41:17 +0000 (07:41 -0400)]
Merge branch 'fio_client_server_doc_fix' of https://github.com/pcpartpicker/fio
* 'fio_client_server_doc_fix' of https://github.com/pcpartpicker/fio:
Update docs to clarify how to pass job options in client mode
aggieNick02 [Tue, 19 Sep 2023 23:14:54 +0000 (18:14 -0500)]
Update docs to clarify how to pass job options in client mode
When run in client mode, fio does not pass any job options specified on
the command line to the fio server. When run in client mode, all job
options must be specified via local or remote job files. Update the docs
to indicate this to avoid end-user confusion.
Fixes #1629
Signed-off-by: Nick Neumann nick@pcpartpicker.com
Vincent Fu [Thu, 14 Sep 2023 22:54:25 +0000 (18:54 -0400)]
verify: open state file in binary mode on Windows
Make sure we open the verify state file in binary mode to avoid any
possible conversion of NL to CR+NL. This is the same fix we did for
1fb215e991d260a128e35d761f6850e8d9e4c333.
Fixes: https://github.com/axboe/fio/issues/1631
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
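Note on the verify state file fix above: text mode on Windows rewrites every NL byte as CR+NL, which corrupts binary data. The Python sketch below reproduces the effect on any platform by forcing that translation with `newline="\r\n"`; the `roundtrip` helper is illustrative only.

```python
import os
import tempfile


def roundtrip(data, binary):
    """Write `data` in binary or translating text mode, return the raw bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        if binary:
            with open(path, "wb") as f:
                f.write(data)
        else:
            # newline="\r\n" forces the NL -> CR+NL translation that text
            # mode applies by default on Windows.
            with open(path, "w", newline="\r\n", encoding="latin-1") as f:
                f.write(data.decode("latin-1"))
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

A state blob containing the byte 0x0A comes back one byte longer from the text-mode path and unchanged from the binary path.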
Ankit Kumar [Mon, 11 Sep 2023 16:25:00 +0000 (21:55 +0530)]
engines:nvme: fill command fields as per pi check bits
Fill the application and reference tag field for read and write
command only when pi_chk has the relevant bit set.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ankit Kumar [Mon, 11 Sep 2023 16:24:59 +0000 (21:54 +0530)]
engines:io_uring_cmd: disallow verify for e2e pi with extended blocks
For extended logical block sizes we cannot use verify when end to end
data protection checks are enabled. The CRC field in PI section of
data buffer creates conflict during verify phase.
The verify check is also redundant as end to end data protection already
ensures data integrity. So disallow use of verify for this case.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Vincent Fu [Mon, 11 Sep 2023 15:31:27 +0000 (11:31 -0400)]
Merge branch 'pcpp_epoch_fixing_2' of https://github.com/PCPartPicker/fio
* 'pcpp_epoch_fixing_2' of https://github.com/PCPartPicker/fio:
Make log_unix_epoch an official alias of log_alternate_epoch
Record job start time to fix time pain points
aggieNick02 [Fri, 8 Sep 2023 20:34:09 +0000 (15:34 -0500)]
Make log_unix_epoch an official alias of log_alternate_epoch
log_alternate_epoch was introduced along with
log_alternate_epoch_clock_id, and generalized the idea of
log_unix_epoch. Both options had the same effect. So we make
log_unix_epoch an official alias of log_alternate_epoch, instead of
maintaining both redundant options.
Signed-off-by: Nick Neumann nick@pcpartpicker.com
aggieNick02 [Fri, 1 Sep 2023 15:50:34 +0000 (10:50 -0500)]
Record job start time to fix time pain points
Add a new key in the json per-job output, job_start, that records the
job start time obtained via a call to clock_gettime using the clock_id
specified by the new job_start_clock_id option. This allows times of fio
jobs and log entries to be compared/ordered against each other and
against other system events recorded against the same clock_id.
Add a note to the documentation for group_reporting about how there are
several per-job values for which only the first job's value is recorded
in the json output format when group_reporting is enabled.
Fixes #1544
Signed-off-by: Nick Neumann nick@pcpartpicker.com
Jens Axboe [Sat, 2 Sep 2023 13:35:49 +0000 (07:35 -0600)]
Merge branch 'pcpp_parse_nr_fix' of https://github.com/PCPartPicker/fio
* 'pcpp_parse_nr_fix' of https://github.com/PCPartPicker/fio:
Add basic error checking to parsing nr from rw=randrw:<nr>, etc
aggieNick02 [Fri, 1 Sep 2023 22:01:05 +0000 (17:01 -0500)]
Add basic error checking to parsing nr from rw=randrw:<nr>, etc
Previously this was parsed by just doing atoi(). This returns 0 or has
undefined behavior in error cases.
Silently getting a 0 for nr is not great. In fact, 0 (or less) should
likely not be allowed for nr; while the code handles it, the effective
result is that the randomness is gone - all I/O becomes sequential. It
makes sense to prohibit 0 as an nr value in the random case.
We leverage str_to_decimal to do our parsing instead of atoi. It isn't
perfect, but it is a lot more resilient than atoi, and used in other
similar places. We can then return an error when parsing fails, and also
return an error when the parsed numeric value is outside of the ranges
that can be stored in the unsigned int used for nr, along with when nr
is 0.
Fixes #1622
Signed-off-by: Nick Neumann nick@pcpartpicker.com
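Note on the nr parsing fix above: the checks described are a parse failure, a value too large for an unsigned int, and zero. The following Python sketch applies those three checks for the random case; it does not reproduce str_to_decimal, `parse_nr` is a hypothetical name, and the 32-bit limit is an assumption about the width of unsigned int.

```python
UINT_MAX = 2 ** 32 - 1  # assumption: unsigned int is 32 bits wide


def parse_nr(option):
    """Parse the <nr> suffix of a value such as "randrw:8".

    Returns None when there is no suffix; raises ValueError on bad input.
    """
    _, sep, suffix = option.partition(":")
    if not sep:
        return None
    if not (suffix.isascii() and suffix.isdigit()):
        raise ValueError(f"nr is not a number: {suffix!r}")
    nr = int(suffix)
    if nr == 0:
        raise ValueError("nr must be greater than 0")
    if nr > UINT_MAX:
        raise ValueError("nr does not fit in an unsigned int")
    return nr
```

Unlike atoi(), which silently yields 0 for input such as "randrw:abc", this version reports the error to the caller.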
Jens Axboe [Wed, 23 Aug 2023 14:21:39 +0000 (08:21 -0600)]
Merge branch 'master' of https://github.com/michalbiesek/fio
* 'master' of https://github.com/michalbiesek/fio:
Add RISC-V 64 support
Michal Biesek [Tue, 22 Aug 2023 23:03:02 +0000 (01:03 +0200)]
Add RISC-V 64 support
Signed-off-by: Michal Biesek <michalbiesek@gmail.com>
Ankit Kumar [Wed, 16 Aug 2023 09:46:16 +0000 (15:16 +0530)]
examples: add example and fiograph for protection information options
Add missing io_uring_cmd ioengine options to fiograph config.
Add two example job files for the protection information options.
These include one for DIF i.e. extended LBA data size, and the other
for DIX i.e. separate metadata buffer case.
Add the corresponding fiograph diagram for these.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230816094616.132240-1-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Jens Axboe [Tue, 15 Aug 2023 01:59:20 +0000 (19:59 -0600)]
engines/io_uring: fix leak of 'ld' in error path
Not really important as we're exiting anyway, but this silences some
of the static checkers that like to complain about this sort of
thing.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Vincent Fu [Fri, 28 Jul 2023 15:47:12 +0000 (15:47 +0000)]
t/nvmept_pi: test script for protection information
Carry out tests of the code supporting end-to-end data protection via
the io_uring_cmd ioengine's nvme command type.
The test script detects the available protection information formats
supported by the target device. Then for each of these configurations,
the script formats the device and runs a series of tests.
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Mon, 31 Jul 2023 17:02:55 +0000 (17:02 +0000)]
t/fiotestlib: use config variable to skip test at runtime
Check a test config variable to skip a test at runtime. This will be
used to skip a test when the test runner determines that it should not
be run.
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Mon, 14 Aug 2023 14:57:47 +0000 (20:27 +0530)]
engines:io_uring: generate and verify pi for 64b guard
Generate and verify protection information for 64 bit guard format, for
the nvme backend of io_uring_cmd ioengine. The support is there for
both the cases where metadata is transferred in separate buffer, or
transferred at the end of logical block creating an extended logical
block.
This support also takes into consideration when protection information
resides in last or first 16 bytes of metadata.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-11-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Mon, 14 Aug 2023 14:57:46 +0000 (20:27 +0530)]
engines:nvme: pull required 48 bit accessors from linux kernel
Pull the 48 bit helpers, required for supporting 48 bit reference tags.
Add GPL 2.0 license to nvme.c and nvme.h files.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-10-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
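Note on the 48 bit accessors above: a 48-bit reference tag occupies six bytes, so it cannot be read or written with a native integer type. The Python sketch below packs and unpacks such a value; the big-endian byte order is an assumption for illustration, and `put_be48`/`get_be48` are names chosen here rather than the helpers pulled into fio.

```python
def put_be48(value):
    """Pack an integer into 6 bytes, most significant byte first."""
    if not 0 <= value < 1 << 48:
        raise ValueError("value does not fit in 48 bits")
    return value.to_bytes(6, "big")


def get_be48(buf):
    """Unpack 6 bytes, most significant byte first, into an integer."""
    if len(buf) != 6:
        raise ValueError("expected exactly 6 bytes")
    return int.from_bytes(buf, "big")
```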
Ankit Kumar [Mon, 14 Aug 2023 14:57:45 +0000 (20:27 +0530)]
crc: pull required crc64 nvme apis from linux kernel
Pull the required nvme crc64 apis and table from the linux kernel. This
is required to generate and verify 64 bit guard tag for nvme backend
of io_uring_cmd ioengine.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-9-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Mon, 14 Aug 2023 14:57:44 +0000 (20:27 +0530)]
engines:io_uring: generate and verify pi for 16b guard
Generate and verify protection information for 16 bit guard format, for
the nvme backend of io_uring_cmd ioengine. The support is there for
both the cases where metadata is transferred in separate buffer, or
transferred at the end of logical block creating an extended logical
block.
This support also takes into consideration when protection information
resides in last or first 8 bytes of metadata.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-8-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
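Note on the guard format commits above: the protection information occupies 8 bytes for the 16 bit guard and 16 bytes for the 64 bit guard, at either the start or the end of the metadata. A Python sketch of the resulting offset calculation (`pi_offset` and its parameters are hypothetical names, not fio identifiers):

```python
PI_SIZE = {"16b": 8, "64b": 16}  # bytes of protection information per guard format


def pi_offset(metadata_size, guard, pi_first):
    """Byte offset of the protection information inside the metadata."""
    size = PI_SIZE[guard]
    if metadata_size < size:
        raise ValueError("metadata is smaller than the protection information")
    # PI sits either in the first bytes or in the last bytes of the metadata.
    return 0 if pi_first else metadata_size - size
```

With 64 bytes of metadata, the 16 bit guard format places the protection information at offset 0 or 56, and the 64 bit guard format at offset 0 or 48.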
Ankit Kumar [Mon, 14 Aug 2023 14:57:43 +0000 (20:27 +0530)]
crc: pull required crc16-t10 files from linux kernel
Pull the required crc16 t10 files from the linux kernel. This is
required to generate and verify guard tag for nvme backend of
io_uring_cmd ioengine.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-7-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Mon, 14 Aug 2023 14:57:42 +0000 (20:27 +0530)]
io_u: move engine data out of union
io_uring_cmd ioengine requires engine data to store nvme protection
information data.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-6-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Mon, 14 Aug 2023 14:57:41 +0000 (20:27 +0530)]
engines:io_uring: uring_cmd add support for protection info
This patch enables support for protection information to nvme command
backend of io_uring_cmd ioengine. The patch only supports protection
information action bit set to 1, for read and write operation.
This adds 4 new ioengine specific options
* pi_act - Protection information action. Default: 1
* pi_chk - Can be set to GUARD, APPTAG or REFTAG
* apptag - Sets apptag field of command dword 15
* apptag_mask - Sets apptag_mask field of command dword 15
For the sake of consistency these options are the same as the ones used
by SPDK's external ioengine.
For pi_act=1, if namespace is formatted with metadata size equal to
protection information size, the nvme controller inserts and removes
protection information for write and read command respectively.
Added a check so that fio doesn't send metadata for such cases.
Storage tag support is not present, so return an error for that.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-5-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Mon, 14 Aug 2023 14:57:40 +0000 (20:27 +0530)]
engines:io_uring: enable support for separate metadata buffer
This patch enables support for separate metadata buffer with
io_uring_cmd ioengine. As we are unaware of metadata size during buffer
allocation, we provide an option md_per_io_size. This option must be
used to specify metadata buffer size for single IO, if namespace is
formatted with a separate metadata buffer.
For the sake of consistency this is the same option as used by SPDK's
external ioengine.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-4-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Mon, 14 Aug 2023 14:57:39 +0000 (20:27 +0530)]
engines:io_uring: update arguments to fetch nvme data
This is a prep patch to keep number of arguments for fio_nvme_get_info
in check. The follow up patches will enable metadata, protection info.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-3-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Ankit Kumar [Mon, 14 Aug 2023 14:57:38 +0000 (20:27 +0530)]
engines:io_uring: add missing error during open file
This change ensures the error is propagated to upper layers to make fio
exit with a non-zero return code.
Add filename for errors when block size is not a multiple of logical
blocks.
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Link: https://lore.kernel.org/r/20230814145747.114725-2-ankit.kumar@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Thu, 3 Aug 2023 00:53:21 +0000 (20:53 -0400)]
eta: calculate aggregate bw statistics even when eta is disabled
The --bandwidth-log command-line option instructs fio to generate
aggregate bandwidth log files. These measurements are recorded by the
code generating the eta status line. When eta is disabled the aggregate
bandwidth log measurements are not calculated. Change the eta code to
record the measurements even when eta is not needed.
eta is disabled under these conditions
- explicitly with --eta=never
- STDOUT is not a TTY (shell redirection, nohup, etc)
- output format excludes normal output
Fixes: https://github.com/axboe/fio/issues/1599
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Vincent Fu [Wed, 2 Aug 2023 16:30:17 +0000 (12:30 -0400)]
t/fiotestlib: make recorded command prettier
Instead of recording fio test commands as a single very long line, put
each option on its own line to make the command easier for humans to
digest.
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
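Note on the fiotestlib change above: putting each option on its own line is a one-line join over the argument list. The sketch below is an illustration only; the backslash continuation and indentation are assumptions, and `format_command` is not taken from t/fiotestlib.

```python
def format_command(args):
    """Join a command's arguments so that each one lands on its own line."""
    # Assumption: a trailing backslash keeps the multi-line text usable
    # as a single shell command when pasted.
    return " \\\n    ".join(args)
```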
Vincent Fu [Wed, 2 Aug 2023 16:23:37 +0000 (12:23 -0400)]
t/nvmept: fix typo
Make the filenames for the nvmept artifacts start with nvmept instead
of readonly.
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Jens Axboe [Mon, 31 Jul 2023 21:03:37 +0000 (15:03 -0600)]
Merge branch 'master' of https://github.com/min22/fio
* 'master' of https://github.com/min22/fio:
iolog.c: fix inaccurate clat when replay trace
Jens Axboe [Mon, 31 Jul 2023 21:03:03 +0000 (15:03 -0600)]
Merge branch 'improment/constness' of https://github.com/dpronin/fio
* 'improment/constness' of https://github.com/dpronin/fio:
use 'const' where it is required
Denis Pronin [Sun, 30 Jul 2023 22:29:04 +0000 (01:29 +0300)]
use 'const' where it is required
protect variables and parameters from programmer's point of view with
'constness'
Signed-off-by: Denis Pronin <dannftk@yandex.ru>
Jens Axboe [Fri, 28 Jul 2023 17:32:22 +0000 (11:32 -0600)]
Revert "correctly free thread_data options at the topmost parent process"
This reverts commit 913028e97ceedcf2cf1ec6ec32228b3c50e7337c.
This commit is causing the static analyzers to freak out, and also
crashes on Windows. Revert it for now.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 28 Jul 2023 15:11:15 +0000 (09:11 -0600)]
Merge branch 'td-eo-double-free-fix' of https://github.com/dpronin/fio
* 'td-eo-double-free-fix' of https://github.com/dpronin/fio:
correctly free thread_data options at the topmost parent process
Jens Axboe [Fri, 28 Jul 2023 15:11:01 +0000 (09:11 -0600)]
Merge branch 'master' of https://github.com/dpronin/fio
* 'master' of https://github.com/dpronin/fio:
fix missing headers in multiple files
Jens Axboe [Fri, 28 Jul 2023 15:10:44 +0000 (09:10 -0600)]
Merge branch 'io_uring' of https://github.com/dpronin/fio
* 'io_uring' of https://github.com/dpronin/fio:
io_uring engine: 'atomic_load_relaxed' instead of 'atomic_load_acquire'
Denis Pronin [Fri, 28 Jul 2023 14:25:06 +0000 (17:25 +0300)]
io_uring engine: 'atomic_load_relaxed' instead of 'atomic_load_acquire'
The motivation is that there is no explicit READ dependency on these
atomic loads: in these places the load only needs to be performed
atomically, without any explicit barriers provided by the
memory model.
Signed-off-by: Denis Pronin <dannftk@yandex.ru>
Denis Pronin [Thu, 27 Jul 2023 22:26:22 +0000 (01:26 +0300)]
correctly free thread_data options at the topmost parent process
for non-threaded mode: since thread_data::eo is a pointer within shared
memory between the topmost fio parent process and its children let the
fio parent process set the pointer to NULL as just it frees its copy of
'eo' as memory previously allocated by means of 'malloc' meaning that
each child and the parent process itself must free it
for threaded mode we leave it as it has always been
also we do not need to check td->io_ops for being able to free td->eo in
fio_options_free()
Signed-off-by: Denis Pronin <dannftk@yandex.ru>
Denis Pronin [Fri, 28 Jul 2023 09:39:58 +0000 (12:39 +0300)]
fix missing headers in multiple files
some files were missing headers that they require
Signed-off-by: Denis Pronin <dannftk@yandex.ru>
Jens Axboe [Thu, 27 Jul 2023 19:48:26 +0000 (13:48 -0600)]
Merge branch 'helper_thread-fix-missing-stdbool-header' of https://github.com/dpronin/fio
* 'helper_thread-fix-missing-stdbool-header' of https://github.com/dpronin/fio:
helper_thread.h: forwardly declare structures fio_sem and sk_out
helper_thread.h: include missing stdbool.h because 'bool' type is used
Denis Pronin [Thu, 27 Jul 2023 19:08:45 +0000 (22:08 +0300)]
helper_thread.h: forwardly declare structures fio_sem and sk_out
helper_thread_create() function requires two structures to be declared
Signed-off-by: Denis Pronin <dannftk@yandex.ru>
Denis Pronin [Thu, 27 Jul 2023 19:06:59 +0000 (22:06 +0300)]
helper_thread.h: include missing stdbool.h because 'bool' type is used
Headers should be included in the files where they are actually used.
Signed-off-by: Denis Pronin <dannftk@yandex.ru>
Jens Axboe [Thu, 27 Jul 2023 19:11:01 +0000 (13:11 -0600)]
Merge branch 'diskutil-fix-missing-headers' of https://github.com/dpronin/fio
* 'diskutil-fix-missing-headers' of https://github.com/dpronin/fio:
diskutil.h: fix missing headers wanted by the header
Denis Pronin [Thu, 27 Jul 2023 18:49:31 +0000 (21:49 +0300)]
diskutil.h: fix missing headers wanted by the header
diskutil.h needs three more headers to define several types it uses,
rather than relying on headers that happen to be included before it.
Signed-off-by: Denis Pronin <dannftk@yandex.ru>
Kookoo Gu [Wed, 26 Jul 2023 04:48:35 +0000 (12:48 +0800)]
iolog.c: fix inaccurate clat when replay trace
When replaying timestamps with a high queue depth, fio only reaps
completed commands once the queue depth reaches the maximum iodepth, so
commands may have finished long before their completions are handled.
Fix this by using io_u_queued_complete() instead of just usec_sleep() in
iolog_delay().
Signed-off-by: Kookoo Gu <Zhimin.Gu@solidigm.com>
Jens Axboe [Fri, 21 Jul 2023 21:23:40 +0000 (15:23 -0600)]
Merge branch 'prio-hints'
* prio-hints:
stats: Add hint information to per priority level stats
cmdprio: Add support for per I/O priority hint
options: add priohint option
os-linux: add initial support for IO priority hints
cmdprio: Introduce generic option definitions
os-linux: Cleanup IO priority class and value macros
Damien Le Moal [Fri, 21 Jul 2023 11:05:10 +0000 (20:05 +0900)]
stats: Add hint information to per priority level stats
Modify the json and standard per-priority output stats to display the
hint value together with the priority class and level.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230721110510.44772-7-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 21 Jul 2023 11:05:09 +0000 (20:05 +0900)]
cmdprio: Add support for per I/O priority hint
Introduce the new option cmdprio_hint to allow specifying I/O priority
hints per IO with the io_uring and libaio IO engines. A third acceptable
format for the cmdprio_bssplit option is also introduced to allow
specifying an I/O hint in addition to a priority class and level.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230721110510.44772-6-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 21 Jul 2023 11:05:08 +0000 (20:05 +0900)]
options: add priohint option
Introduce the new option priohint to allow users to specify an I/O
priority hint applying to all IOs issued by a job. This increases fio
server version (FIO_SERVER_VER) to 101.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230721110510.44772-5-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 21 Jul 2023 11:05:07 +0000 (20:05 +0900)]
os-linux: add initial support for IO priority hints
Add initial support for Linux to allow specifying a hint for any
priority value. With this change, a priority value becomes the
combination of a priority class, a priority level and a hint.
The generic os.h ioprio manipulation macros, as well as the
os-dragonfly.h ioprio manipulation macros are modified to ignore this
hint.
For all other OSes that do not support priority classes, priority hints
are ignored and always equal to 0.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230721110510.44772-4-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
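Note: a minimal sketch of how a priority value can combine a class, a level and a hint. The bit widths below (3-bit level, 10-bit hint, class above them) are an assumption modeled on the layout used by recent Linux kernels, and the demo_* names are hypothetical, not fio's or the kernel's macros.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: [class:3][hint:10][level:3] in a 16-bit value. */
#define DEMO_LEVEL_BITS		3
#define DEMO_HINT_BITS		10
#define DEMO_HINT_SHIFT		DEMO_LEVEL_BITS
#define DEMO_CLASS_SHIFT	(DEMO_LEVEL_BITS + DEMO_HINT_BITS)

/* Pack class, level and hint into one priority value. */
uint16_t demo_prio_value(unsigned int cls, unsigned int level,
			 unsigned int hint)
{
	assert(cls < 8 && level < 8 && hint < 1024);
	return (uint16_t)((cls << DEMO_CLASS_SHIFT) |
			  (hint << DEMO_HINT_SHIFT) | level);
}

/* Extract each field again. */
unsigned int demo_prio_class(uint16_t v)
{
	return v >> DEMO_CLASS_SHIFT;
}

unsigned int demo_prio_hint(uint16_t v)
{
	return (v >> DEMO_HINT_SHIFT) & ((1u << DEMO_HINT_BITS) - 1);
}

unsigned int demo_prio_level(uint16_t v)
{
	return v & ((1u << DEMO_LEVEL_BITS) - 1);
}
```

On a platform without hint support, the equivalent of demo_prio_hint() would simply always yield 0, matching the behavior described above.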
Damien Le Moal [Fri, 21 Jul 2023 11:05:06 +0000 (20:05 +0900)]
cmdprio: Introduce generic option definitions
The definitions of the per-I/O priority options for the io_uring and
libaio I/O engines are almost identical, differing only by the option
group and option data structure used.
Introduce the CMDPRIO_OPTIONS macro in engines/cmdprio.h to generically
define these options in the io_uring and libaio engines to simplify the
code.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230721110510.44772-3-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 21 Jul 2023 11:05:05 +0000 (20:05 +0900)]
os-linux: Cleanup IO priority class and value macros
In os/os-linux.h, define the ioprio() macro using the already defined
IOPRIO_MAX_PRIO macro instead of hard coding the maximum priority value
again. Also move the definitions of the ioprio_class() and ioprio()
macros before the ioprio_value() function and use ioprio_class() inside
ioprio_value_is_class_rt() instead of re-coding the I/O priority class
extraction in that function.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230721110510.44772-2-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Shin'ichiro Kawasaki [Fri, 21 Jul 2023 04:44:44 +0000 (13:44 +0900)]
backend: clear IO_U_F_FLIGHT flag in zero byte read path
When a read io_u completes with a zero byte read, fio sets EIO as the
error and puts the io_u. However, it does not clear the IO_U_F_FLIGHT flag.
When fio runs with the --ignore_error=EIO option, the io_u with the flag
is reused for the next I/O and causes an assertion failure:
fio: ioengines.c:335: td_io_queue: Assertion `(io_u->flags & IO_U_F_FLIGHT) == 0' failed.
The failure is observed with blktests test case block/011 which runs fio
with the --ignore_error=EIO option [1].
[1] https://github.com/osandov/blktests/issues/29
Fix this by calling clear_io_u() instead of put_io_u() in the zero byte
read path. clear_io_u() clears the IO_U_F_FLIGHT flag then calls
put_io_u().
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230721044444.749537-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
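Note: the failure mode can be sketched with a mock I/O unit. Everything below is a hypothetical stand-in (struct demo_io_u, DEMO_F_FLIGHT and the demo_* functions); it only mirrors the flag handling described above and is not fio's implementation.

```c
#include <assert.h>

#define DEMO_F_FLIGHT	(1u << 0)	/* stand-in for IO_U_F_FLIGHT */

struct demo_io_u {
	unsigned int flags;
	int error;
};

/* Stand-in for put_io_u(): releases the unit but leaves flags alone. */
void demo_put(struct demo_io_u *u)
{
	(void)u;
}

/* Stand-in for clear_io_u(): drop the in-flight flag, then release. */
void demo_clear(struct demo_io_u *u)
{
	u->flags &= ~DEMO_F_FLIGHT;
	demo_put(u);
}

/* Stand-in for the queue path: a unit must not already be in flight. */
void demo_queue(struct demo_io_u *u)
{
	assert((u->flags & DEMO_F_FLIGHT) == 0);
	u->flags |= DEMO_F_FLIGHT;
}
```

Releasing a unit with demo_put() after demo_queue() and then queueing it again would trip the assertion; releasing it with demo_clear() makes the reuse safe.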
Dmitry Fomichev [Wed, 19 Jul 2023 10:57:56 +0000 (19:57 +0900)]
t/zbd: add max_active configs to run-tests-against-nullb
Introduce several new test device configurations to cover the cases where
max_active_zones is not zero, i.e. limited. Two groups of new
configurations are added, one with max_active_zones == max_open_zones
and the other with max_active_zones > max_open_zones.
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-14-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Dmitry Fomichev [Wed, 19 Jul 2023 10:57:55 +0000 (19:57 +0900)]
t/zbd: fix null_blk configuration in run-tests-against-nullb
Correctly set max_open in null_blk configfs.
Fix displayed number of conventional zones in section config banner.
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-13-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:54 +0000 (19:57 +0900)]
t/zbd: add missing prep_write for test cases with write workloads
Test cases 54 to 57 perform writes but lack the prep_write() call, which
resets the zones of a test target device that has a max_active_zones
limit. This results in failures due to open zones outside the I/O ranges
and max_active_zones limit errors. Add the missing prep_write() call to
avoid the failures.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-12-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:53 +0000 (19:57 +0900)]
t/zbd: fix fio failure check and SG node failure in test case 31
Test case 31 runs fio twice, but the failure of the first fio run was
not checked. This allowed the test case to pass even with a wrong
max_open_zones value. To fix this, check the exit code of the fio run.
Also, the first fio run fails when the test target devices are SG nodes,
since the libzbc I/O engine is not used. To fix this, call the ioengine()
helper function, which adjusts the I/O engine for each device.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-11-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:52 +0000 (19:57 +0900)]
t/zbd: get max_open_zones from sysfs
The helper bash function gets the max_open_zones limit of the test target
device using the sg_inq and libzbc tools. This works for SAS/SATA devices
but does not work for ZNS or null_blk devices. This results in running
test case 31 with a wrong max_open_zones value. Fix this by referring to
the max_open_zones sysfs attribute.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-10-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:51 +0000 (19:57 +0900)]
t/zbd: add test case to check max_active_zones limit error message
A recent fio change introduced a new error message to indicate the
max_active_zones limit error of zoned block devices. Add a test case to
check that the error message is reported.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-9-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:50 +0000 (19:57 +0900)]
t/zbd: add test case to check zones in closed condition
When a zoned block device has a max_active_zones limit, zones in open or
closed condition consume resources on the device. If the number of zones
in open or closed condition gets larger than the max_active_zones limit,
the device reports an error. Until the recent
fix ("zbd: write to closed zones on the devices with max_active_zones
limit"), fio handled only zones in open condition as write targets, so
fio was not able to avoid the error.
Add a test which confirms that the fix avoids the error by handling
zones in closed condition as write targets. This test case requires that
the device has a max_active_zones limit. Prepare as many zones in closed
condition as the max_active_zones limit. Do random writes and check that
no error occurs.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-8-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:49 +0000 (19:57 +0900)]
t/zbd: add max_active_zone variable
To test fio behavior on zoned block devices with max_active_zones limit,
add a global variable which holds the limit value. Also add helper
functions to check max_active_zones limit of the test target devices and
max_active_zones requirement of test cases.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-7-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:48 +0000 (19:57 +0900)]
t/zbd: add close_zone helper function
Add a helper function which sets the specified zone in closed condition.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-6-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:47 +0000 (19:57 +0900)]
docs: modify max_open_zones option description
A recent commit modified the max_open_zones option to improve handling
of zoned block devices with a max_active_zones limit. Modify the
description of the option to match the change.
For that purpose, explain the relation between the max_open_zones option
and the device side limits max_active_zones and max_open_zones. Also
mention the three zone conditions 'implicit open', 'explicit open' and
'closed', and replace the term 'zone state' with 'zone condition'.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-5-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:46 +0000 (19:57 +0900)]
zbd: print max_active_zones limit error message
When zoned block devices have a max_active_zones limit and write
operations exceed that limit, the Linux block sub-system reports
EOVERFLOW. However, the strerror() string for EOVERFLOW does not mention
max_active_zones, which confuses users.
To avoid the confusion, print an additional error message to indicate the
max_active_zones limit. For this purpose, add a hook function
zbd_log_err() and call it from __io_u_log_error().
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-4-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:45 +0000 (19:57 +0900)]
zbd: write to closed zones on the devices with max_active_zones limit
The current fio implementation does not handle zones in closed condition
as write target zones. When the device has a max_active_zones limit,
writes to other zones may cause errors by exceeding the limit, since the
zones in closed condition consume the device resources counted against
the max_active_zones limit.
To avoid the error, handle zones in closed condition as write targets
in the same manner as zones in open condition when the device has the
max_active_zones limit. At job start, check the condition of each zone
in the I/O ranges and, if a zone is in closed condition, pass it to
zbd_write_zones_get() in the same manner as zones in open condition.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-3-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
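Note: the accounting rule behind this fix can be sketched as below. The function name and signature are hypothetical and do not correspond to fio's internal API; the sketch only assumes, as stated in the related commits, that both open and closed zones count against max_active_zones and that a limit of zero means no limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check: may one more zone become active (open or closed)
 * without exceeding the device's max_active_zones limit? */
bool demo_can_activate_zone(unsigned int open_zones,
			    unsigned int closed_zones,
			    unsigned int max_active_zones)
{
	if (max_active_zones == 0)
		return true;	/* zero means the device has no limit */

	/* Closed zones still hold device resources, so count them too. */
	return open_zones + closed_zones < max_active_zones;
}

/* Usage: with a limit of 4, two open plus two closed zones leave no
 * room, even though only two zones are currently open. */
void demo_zone_example(void)
{
	assert(!demo_can_activate_zone(2, 2, 4));
	assert(demo_can_activate_zone(2, 1, 4));
}
```

Counting only open zones would report room for more writes in the first case, which is the situation that led to the device error.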
Shin'ichiro Kawasaki [Wed, 19 Jul 2023 10:57:44 +0000 (19:57 +0900)]
zbd: get max_active_zones limit value from zoned devices
As preparation for improving open zones accounting for devices with the
max_active_zones limit, get the limit from the devices. In the same manner
as max_open_zones, call the get_max_active_zones callback if the I/O
engine supports it. Add the new callback to the I/O engine API and bump up
FIO_IOOPS_VERSION. The io_uring and xnvme engines are expected to
support the callback later. When the callback is not available, refer to
the max_active_zones sysfs attribute for block devices. When the limit
value is not available, use zero, which means no limit. Keep the obtained
limit value in the struct zoned_block_device_info.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20230719105756.553146-2-shinichiro.kawasaki@wdc.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Jens Axboe [Sat, 15 Jul 2023 15:57:43 +0000 (09:57 -0600)]
Merge branch 'patch-3' of https://github.com/yangjueji/fio
* 'patch-3' of https://github.com/yangjueji/fio:
fix: io_uring sqpoll issue_time empty when kernel not yet read sq
Michael Kelley [Fri, 14 Jul 2023 17:06:01 +0000 (17:06 +0000)]
thinktime: Avoid calculating a negative time left to wait
When the thinktime_spin option specifies a value that is within
a few milliseconds of the thinktime value, in handle_thinktime()
it's possible in a VM environment for the duration of usec_spin()
to exceed the thinktime value. While doing usec_spin(), the vCPU
could get de-scheduled or the hypervisor could steal CPU time
from the vCPU. When the guest vCPU runs after being scheduled
again, it may read the clock and find that more time has elapsed
than intended. In such a case, the time left to wait could be
calculated as a negative value. Subsequent calculations then go
awry because the time left is cast as unsigned.
Fix this by detecting when the time left would go negative and
just set it to zero.
Fixes: 1a9bf8146 ("Add option to ignore thinktime for rated IO")
Fixes: https://github.com/axboe/fio/issues/1588
Link: https://lore.kernel.org/fio/1689354334-131024-1-git-send-email-mikelley@microsoft.com/T/#u
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
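Note: the clamping described in the thinktime fix can be sketched as follows. The helper name is hypothetical and this is not the code from handle_thinktime(); it only shows why an unsigned "time left" must be guarded before subtracting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: microseconds still to wait after some time has
 * already elapsed. Without the guard, total - elapsed would wrap
 * around to a huge unsigned value whenever elapsed > total. */
uint64_t demo_time_left_usec(uint64_t total_usec, uint64_t elapsed_usec)
{
	if (elapsed_usec >= total_usec)
		return 0;	/* overslept, e.g. the vCPU was de-scheduled */

	return total_usec - elapsed_usec;
}

/* Usage: a 1000 usec thinktime with 400 usec spent leaves 600 usec;
 * spending 1500 usec leaves nothing rather than a wrapped value. */
void demo_time_left_example(void)
{
	assert(demo_time_left_usec(1000, 400) == 600);
	assert(demo_time_left_usec(1000, 1500) == 0);
}
```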