fio.git
3 years ago travis: install python3 scipy for Linux and macOS tests
Vincent Fu [Thu, 28 May 2020 14:12:52 +0000 (10:12 -0400)]
travis: install python3 scipy for Linux and macOS tests

Since the test scripts triggered by TravisCI now all rely on python3,
make sure we always install scipy for python3.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago testing: change two test scripts to refer to python3
Vincent Fu [Thu, 28 May 2020 13:05:06 +0000 (09:05 -0400)]
testing: change two test scripts to refer to python3

Since python2 is no longer supported we should now use python3 in our
test scripts. Change the shebang lines for two test scripts to refer to
python3.

Note that t/sgunmap-test.py and t/sgunmap-perf.py still refer to
python2.  I no longer have the means to test those two scripts and am
leaving those unchanged.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago Merge branch 'testing' of https://github.com/vincentkfu/fio
Jens Axboe [Thu, 28 May 2020 17:07:31 +0000 (11:07 -0600)]
Merge branch 'testing' of https://github.com/vincentkfu/fio

* 'testing' of https://github.com/vincentkfu/fio:
  .travis: enable arm64 architecture builds
  t/run-fio-tests: pass-through arguments to test scripts
  appveyor: use on_finish section to upload artifacts

3 years ago .travis: enable arm64 architecture builds
Vincent Fu [Tue, 26 May 2020 20:55:58 +0000 (16:55 -0400)]
.travis: enable arm64 architecture builds

The travis-ci containers do not support the cmdprio_percentage option.
So skip latency_percentile.py tests using that option.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago t/run-fio-tests: pass-through arguments to test scripts
Vincent Fu [Tue, 26 May 2020 20:54:44 +0000 (16:54 -0400)]
t/run-fio-tests: pass-through arguments to test scripts

Add an option to pass-through arguments to specified test scripts. This
can be used to alter the behavior of tests on different platforms.
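
As a rough illustration of the pass-through idea, the sketch below parses a
repeatable option of the form TESTNUM:ARGUMENT and groups the extra arguments
per test. The option name, its format, and the parse_pass_through() helper are
assumptions made for this example; they are not necessarily what
t/run-fio-tests.py implements.

```python
import argparse


def parse_pass_through(argv):
    """Return {test_number: [extra_args]} from repeated -p TESTNUM:ARG options."""
    parser = argparse.ArgumentParser()
    # Hypothetical option: may be given several times, once per extra argument.
    parser.add_argument("-p", "--pass-through", action="append", default=[])
    args = parser.parse_args(argv)

    extra = {}
    for item in args.pass_through:
        # Split only on the first colon so the argument itself may contain colons.
        testnum, _, value = item.partition(":")
        extra.setdefault(int(testnum), []).append(value)
    return extra


# Extra arguments destined for a (hypothetical) test number 1007 only.
print(parse_pass_through(["-p", "1007:--skip", "-p", "1007:15"]))
```

A per-test mapping like this lets a CI configuration change one test's
behavior on a given platform without touching the others.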

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago appveyor: use on_finish section to upload artifacts
Vincent Fu [Tue, 26 May 2020 17:22:38 +0000 (13:22 -0400)]
appveyor: use on_finish section to upload artifacts

We cannot rely on the artifacts section to upload test artifacts because
when a test failure occurs, the entire build process stops and the
artifacts are not uploaded. Use the on_finish section instead to upload
test artifacts.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago zbd: Fix compilation error on BSD
Shin'ichiro Kawasaki [Thu, 28 May 2020 12:56:42 +0000 (21:56 +0900)]
zbd: Fix compilation error on BSD

Commit b76949618d55 ("fio: Generalize zonemode=zbd") enabled zbd.c
compilation on other operating systems than Linux. This caused a
compilation error on NetBSD as follows:

ld: zbd.o: in function `parse_zone_info':
fio/zbd.c:422: undefined reference to `pthread_mutexattr_setpshared'
ld: zbd.o: in function `init_zone_info':
fio/zbd.c:378: undefined reference to `pthread_mutexattr_setpshared'
gmake: *** [Makefile:483: fio] Error 1

The same error is expected on other BSD OSes.

Fix this by initializing the mutexes with the helper functions that pshared.c
provides. To initialize a mutex with the PTHREAD_MUTEX_RECURSIVE attribute
type, use mutex_init_pshared_with_type().

Reported-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Fixes: b76949618d55 ("fio: Generalize zonemode=zbd")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago pshared: Add mutex_init_pshared_with_type()
Shin'ichiro Kawasaki [Thu, 28 May 2020 12:56:41 +0000 (21:56 +0900)]
pshared: Add mutex_init_pshared_with_type()

To initialize a mutex to be shared across processes, the helper function
mutex_init_pshared() is available. However, it does not allow setting
mutex attribute types such as PTHREAD_MUTEX_RECURSIVE.

To allow setting mutex attribute types, introduce another helper function,
mutex_init_pshared_with_type(). It initializes a mutex for sharing across
processes and sets the attribute type specified as its argument.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago t/zbd: make the test script easier to terminate
Dmitry Fomichev [Mon, 25 May 2020 21:32:56 +0000 (06:32 +0900)]
t/zbd: make the test script easier to terminate

Very often, it takes more than one ^C to terminate test-zbd-support
script. Just a single ^C does end the test that is currently being
executed, but then the script proceeds to the next test. This commit
adds a simple signal handler to exit the test loop after receiving
a Ctrl-C.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago t/zbd: beautify test script output
Dmitry Fomichev [Mon, 25 May 2020 21:32:55 +0000 (06:32 +0900)]
t/zbd: beautify test script output

The test printout columns are better aligned now. Also, the test
result, PASS/FAIL, is now color-coded and that makes it easier
to spot failures.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago libzbc: fix whitespace errors
Dmitry Fomichev [Mon, 25 May 2020 21:32:54 +0000 (06:32 +0900)]
libzbc: fix whitespace errors

Make checkpatch happy... no functional change.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago libzbc: cleanup init code
Dmitry Fomichev [Mon, 25 May 2020 21:32:53 +0000 (06:32 +0900)]
libzbc: cleanup init code

Make sure every allocated data structure gets freed in case of
unsuccessful libzbc ioengine initialization.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago Merge branch 'parse-and-fill-pattern' of https://github.com/bvanassche/fio
Jens Axboe [Sun, 24 May 2020 18:03:56 +0000 (12:03 -0600)]
Merge branch 'parse-and-fill-pattern' of https://github.com/bvanassche/fio

* 'parse-and-fill-pattern' of https://github.com/bvanassche/fio:
  Do not read past the end of fmt_desc[]
  Declare a static variable 'const'
  Fix spelling in a source code comment

3 years ago Do not read past the end of fmt_desc[]
Bart Van Assche [Sun, 24 May 2020 03:39:47 +0000 (20:39 -0700)]
Do not read past the end of fmt_desc[]

Callers of parse_format() pass a size in bytes while the parse_format()
function itself expects a number of elements. Fix this by making the
fmt_desc[] array NULL-terminated. This patch fixes the following Coverity
complaint:

CID 300986 (#1 of 1): Out-of-bounds access (OVERRUN)
overrun-buffer-arg: Overrunning array fmt_desc of 1 24-byte elements by
passing it to a function which accesses it at element index 23 (byte
offset 575) using argument 24U.

Cc: Roman Pen <r.peniaev@gmail.com>
Fixes: 634bd210c17a ("lib/pattern: add set of functions to parse combined pattern input")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
3 years ago Declare a static variable 'const'
Bart Van Assche [Sun, 24 May 2020 03:30:17 +0000 (20:30 -0700)]
Declare a static variable 'const'

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
3 years ago Fix spelling in a source code comment
Bart Van Assche [Sun, 24 May 2020 03:28:21 +0000 (20:28 -0700)]
Fix spelling in a source code comment

Change two occurrences of 'descritor' into 'descriptor'

Cc: Roman Pen <r.peniaev@gmail.com>
Fixes: 634bd210c17a ("lib/pattern: add set of functions to parse combined pattern input")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
3 years ago Fio 3.20 fio-3.20
Jens Axboe [Sat, 23 May 2020 17:14:14 +0000 (11:14 -0600)]
Fio 3.20

Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago Merge branch 'master' of https://github.com/ffontaine/fio
Jens Axboe [Sat, 23 May 2020 17:13:25 +0000 (11:13 -0600)]
Merge branch 'master' of https://github.com/ffontaine/fio

* 'master' of https://github.com/ffontaine/fio:
  Makefile: fix build of io_uring on sh4

3 years ago Makefile: fix build of io_uring on sh4
Fabrice Fontaine [Sat, 23 May 2020 17:07:40 +0000 (19:07 +0200)]
Makefile: fix build of io_uring on sh4

SuperH compile currently fails with:

/usr/lfs/hdd_v1/rc-buildroot-test/scripts/instance-0/output-1/host/opt/ext-toolchain/bin/../lib/gcc/sh4-buildroot-linux-uclibc/8.3.0/../../../../sh4-buildroot-linux-uclibc/bin/ld: t/io_uring.o: in function `submitter_fn':
/usr/lfs/hdd_v1/rc-buildroot-test/scripts/instance-0/output-1/build/fio-3.19/t/io_uring.c:131: undefined reference to `arch_flags'
/usr/lfs/hdd_v1/rc-buildroot-test/scripts/instance-0/output-1/host/opt/ext-toolchain/bin/../lib/gcc/sh4-buildroot-linux-uclibc/8.3.0/../../../../sh4-buildroot-linux-uclibc/bin/ld: /usr/lfs/hdd_v1/rc-buildroot-test/scripts/instance-0/output-1/build/fio-3.19/t/io_uring.c:367: undefined reference to `arch_flags'
collect2: error: ld returned 1 exit status

Fix that by ensuring we have a stub arch.o with the necessary arch flags

Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
3 years ago zbd: make zbd_info->mutex non-recursive
Alexey Dobriyan [Thu, 21 May 2020 23:17:16 +0000 (02:17 +0300)]
zbd: make zbd_info->mutex non-recursive

There is no reason for it to be recursive. Recursiveness leaked
from struct fio_zone_info::mutex initialisation.

Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago zbd: introduce per job maximum open zones limit
Alexey Dobriyan [Thu, 21 May 2020 23:17:15 +0000 (02:17 +0300)]
zbd: introduce per job maximum open zones limit

It is not possible to maintain sustained per-thread iodepth in ZBD mode.
The way code is written, "max_open_zones" acts as a global limit, and
once one or a few threads open all "max_open_zones" zones, other threads
can't open anything and _exit_ prematurely.

This config is guaranteed to make equal number of zone resets/IO now:
each thread generates identical pattern and doesn't intersect with other
threads:

zonemode=zbd
zonesize=...
rw=write

numjobs=N
offset_increment=M*zonesize

[j]
size=M*zonesize

Patch introduces "job_max_open_zones" which is per-thread/process limit.
"max_open_zones" remains per file/device limit. Both limits are checked
for each open zone so one thread can't kick out others.
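
The interaction of the two limits can be sketched as a simple predicate. This
is a simplified model, not fio's code: the may_open_zone() helper and the
counters passed to it are hypothetical, and treating a limit of 0 as "no
limit" is an assumption of this sketch.

```python
def may_open_zone(job_open, dev_open, job_max_open_zones, max_open_zones):
    """Return True if a job may open one more zone.

    job_open: zones currently opened by this job (thread/process).
    dev_open: zones currently open on the file/device across all jobs.
    A limit of 0 means "unlimited" in this sketch (an assumption).
    """
    job_ok = job_max_open_zones == 0 or job_open < job_max_open_zones
    dev_ok = max_open_zones == 0 or dev_open < max_open_zones
    return job_ok and dev_ok


# Device limit of 8 zones, per-job limit of 2 zones.
print(may_open_zone(job_open=2, dev_open=2, job_max_open_zones=2, max_open_zones=8))
print(may_open_zone(job_open=0, dev_open=2, job_max_open_zones=2, max_open_zones=8))
```

In the first call the job has reached its own limit and is refused even though
the device still has room; in the second call a different job with no open
zones is allowed to proceed.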

Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago zbd: don't lock zones outside working area
Alexey Dobriyan [Thu, 21 May 2020 23:17:14 +0000 (02:17 +0300)]
zbd: don't lock zones outside working area

Currently threads lock each other's zones even if their working areas as
defined by [f->file_offset, f->file_offset + f->io_size) don't intersect.
This leads to unnecessary quiescing.

Patch clamps every zone to [->min_zone, ->max_zone) when doing search
for opened zone and more importantly adds an assert so that any
unnecessary zone locking becomes very visible.

Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago zbd: bump ZBD_MAX_OPEN_ZONES
Alexey Dobriyan [Thu, 21 May 2020 23:17:13 +0000 (02:17 +0300)]
zbd: bump ZBD_MAX_OPEN_ZONES

128 opened zones is not enough for us!

4096 opened zones is OK for 64×iodepth=64 stress testing.

Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago verify: decouple seed generation from buffer fill
Alexey Dobriyan [Thu, 21 May 2020 23:17:12 +0000 (02:17 +0300)]
verify: decouple seed generation from buffer fill

It is nicer this way and there will be more code in this area
with ZBD verification.

Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago Merge branch 'latency_run' of https://github.com/liu-song-6/fio
Jens Axboe [Thu, 21 May 2020 14:36:07 +0000 (08:36 -0600)]
Merge branch 'latency_run' of https://github.com/liu-song-6/fio

* 'latency_run' of https://github.com/liu-song-6/fio:
  Add option latency_run to continue enable latency_target

3 years ago Merge branch 'testing' of https://github.com/vincentkfu/fio
Jens Axboe [Wed, 20 May 2020 20:19:12 +0000 (14:19 -0600)]
Merge branch 'testing' of https://github.com/vincentkfu/fio

* 'testing' of https://github.com/vincentkfu/fio:
  t/zbd: improve error handling for test scripts
  testing: use max-jobs to speed up testing
  docs: update cmdprio_percentage with note about root user
  t/latency_percentiles: run cmdprio_percentage tests only if root
  t/run-fio-tests: better catch file errors
  t/jsonplus2csv_test: reduce file size

3 years ago t/zbd: improve error handling for test scripts
Vincent Fu [Tue, 19 May 2020 18:55:56 +0000 (14:55 -0400)]
t/zbd: improve error handling for test scripts

Use exit instead of return to abort the scripts if modprobe null_blk
fails. With return, the script continues to run after printing an error
message. Also abort if the null block device setup fails for the regular
null block device test script.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago testing: use max-jobs to speed up testing
Vincent Fu [Mon, 18 May 2020 18:14:12 +0000 (14:14 -0400)]
testing: use max-jobs to speed up testing

Allocating fio's default memory footprint takes a few moments. Following
https://www.spinics.net/lists/fio/msg08529.html, use the max-jobs option
to reduce fio's memory footprint. This reduces the runtime of the full
test suite by about 40s.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago docs: update cmdprio_percentage with note about root user
Vincent Fu [Mon, 18 May 2020 17:50:53 +0000 (13:50 -0400)]
docs: update cmdprio_percentage with note about root user

The io_uring/libaio cmdprio_percentage option can only be used when fio
is run from the root user account because IOs are submitted with IO
priority class IOPRIO_CLASS_RT. Note in the documentation that fio must
be run from the root account to use this option.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago Merge branch 'issue-989' of https://github.com/Nordix/fio
Jens Axboe [Wed, 20 May 2020 13:06:43 +0000 (07:06 -0600)]
Merge branch 'issue-989' of https://github.com/Nordix/fio

* 'issue-989' of https://github.com/Nordix/fio:
  Corrected scope of for-loop

3 years ago Corrected scope of for-loop
Lars Ekman [Wed, 20 May 2020 09:27:11 +0000 (11:27 +0200)]
Corrected scope of for-loop

Fixes #989 - fio2gnuplot dont generate clat (raw) plots

Signed-off-by: Lars Ekman <lars.g.ekman@est.tech>
3 years ago Merge branch '32-bit-fixes' of https://github.com/sitsofe/fio
Jens Axboe [Tue, 19 May 2020 22:14:19 +0000 (16:14 -0600)]
Merge branch '32-bit-fixes' of https://github.com/sitsofe/fio

* '32-bit-fixes' of https://github.com/sitsofe/fio:
  Fix 32-bit/LLP64 platform truncation issues

3 years ago Fix 32-bit/LLP64 platform truncation issues
Sitsofe Wheeler [Tue, 19 May 2020 21:41:49 +0000 (22:41 +0100)]
Fix 32-bit/LLP64 platform truncation issues

- After 140a6888 ("rate: Convert the rate and rate_min options to
  FIO_OPTS_ULL") landed 32-bit/LLP64 platforms need additional changes
  to cope with 64 bit I/O rate values
- The seed is 64 bit but was being truncated to 32 bits in
  td_fill_rand_seeds_internal() on 32-bit/LLP64 platforms

Prior to this commit when running an fio compiled with

CC=clang-9 ./configure \
  --extra-cflags="-fsanitize=undefined,implicit-integer-truncation "
  "-fno-builtin"

using this job

./fio --ioengine=null --bs=1M --rate=6G --rate_min=5G --name=test --size=100G

warnings like the following were produced

init.c:996:27: runtime error: implicit conversion from type 'uint64_t' (aka 'unsigned long long') of value 5942511153023025289 (64-bit, unsigned) to type 'unsigned int' changed the value to 2914779273 (32-bit, unsigned)
[..]
backend.c:212:25: runtime error: implicit conversion from type 'unsigned long long' of value 12886999040 (64-bit, unsigned) to type 'unsigned long' changed the value to 2097152 (32-bit, unsigned)

inside a 32-bit Ubuntu 18.04 docker container.
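
The values in the two warnings can be reproduced with plain integer
arithmetic: converting a 64-bit unsigned value to a 32-bit unsigned type keeps
only the low 32 bits. The sketch below uses Python to model that conversion;
the truncate_to_u32() helper is introduced here for illustration only.

```python
MASK32 = 0xFFFFFFFF  # keeps the low 32 bits, like a conversion to a 32-bit unsigned type


def truncate_to_u32(value):
    """Model an implicit conversion of an unsigned 64-bit value to 32 bits."""
    return value & MASK32


seed = 5942511153023025289  # 64-bit value from the init.c warning
rate = 12886999040          # 64-bit value from the backend.c warning

print(truncate_to_u32(seed))  # 2914779273
print(truncate_to_u32(rate))  # 2097152
```

Both results match the truncated values reported by the sanitizer.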

Fixes: https://github.com/axboe/fio/issues/716

3 years ago Add option latency_run to continue enable latency_target
Song Liu [Mon, 18 May 2020 05:39:49 +0000 (22:39 -0700)]
Add option latency_run to continue enable latency_target

Currently, a latency_target run will exit once fio finds the highest queue
depth that meets latency_target. Add option latency_run. If set, fio will
continue running and try to meet latency_target by adjusting queue depth.

Signed-off-by: Song Liu <songliubraving@fb.com>
3 years ago Merge branch 'stephen/rate-ull' of https://github.com/sbates130272/fio
Jens Axboe [Tue, 19 May 2020 18:31:49 +0000 (12:31 -0600)]
Merge branch 'stephen/rate-ull' of https://github.com/sbates130272/fio

* 'stephen/rate-ull' of https://github.com/sbates130272/fio:
  rate: Convert the rate and rate_min options to FIO_OPTS_ULL

3 years ago Allow more flexibility in zone start and span
Pierre Labat [Fri, 15 May 2020 16:22:13 +0000 (11:22 -0500)]
Allow more flexibility in zone start and span

Allow sequential read to start anywhere in a zone (option
offset), and have a span smaller than a zone (option size).

A use case is a Key Value Store reading a set of keys or values
starting somewhere in a zone.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Pierre Labat <plabat@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago t/latency_percentiles: run cmdprio_percentage tests only if root
Vincent Fu [Mon, 18 May 2020 17:41:52 +0000 (13:41 -0400)]
t/latency_percentiles: run cmdprio_percentage tests only if root

The libaio/io_uring cmdprio_percentage option only works when fio is run
from the root user account. Skip these tests (instead of failing) when
this test script is run from a regular user account.
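
A minimal sketch of the skip-instead-of-fail idea is shown below. The test
descriptions, the requires_root key, and both helper functions are
hypothetical and only illustrate the approach; the actual script may be
organized differently.

```python
import os


def running_as_root():
    """True when the effective user is root; False where geteuid() is unavailable."""
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0


def select_tests(tests, is_root):
    """Split tests into (to_run, skipped); root-only tests are skipped for regular users."""
    to_run, skipped = [], []
    for test in tests:
        if test["requires_root"] and not is_root:
            skipped.append(test["name"])
        else:
            to_run.append(test["name"])
    return to_run, skipped


# Hypothetical test list: only the cmdprio_percentage test needs root.
tests = [
    {"name": "basic percentiles", "requires_root": False},
    {"name": "cmdprio_percentage", "requires_root": True},
]
print(select_tests(tests, running_as_root()))
```

Reporting the second list as skipped keeps the overall result meaningful when
the script runs from a regular user account.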

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago t/run-fio-tests: better catch file errors
Vincent Fu [Mon, 18 May 2020 14:26:21 +0000 (10:26 -0400)]
t/run-fio-tests: better catch file errors

Handle file not found errors more gracefully. Make sure we always catch
exceptions when opening files so that we can fail gracefully when
problems occur.
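
The pattern described above might look roughly like the following. The
load_json() helper and its error messages are assumptions made for this
sketch, not code taken from t/run-fio-tests.py.

```python
import json


def load_json(path):
    """Return (data, error): parsed JSON on success, or None plus a message on failure."""
    try:
        with open(path, "r") as handle:
            return json.load(handle), None
    except OSError as exc:
        # Covers FileNotFoundError, PermissionError and other I/O failures.
        return None, "unable to open {}: {}".format(path, exc)
    except ValueError as exc:
        # json.JSONDecodeError is a subclass of ValueError.
        return None, "unable to parse {}: {}".format(path, exc)


data, error = load_json("/nonexistent/fio-output.json")
if error:
    print(error)
```

Returning an error message instead of letting the exception propagate allows
the caller to mark the test as failed and continue with the next one.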

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago t/jsonplus2csv_test: reduce file size
Vincent Fu [Mon, 18 May 2020 14:06:42 +0000 (10:06 -0400)]
t/jsonplus2csv_test: reduce file size

Reduce the file size to better accommodate testing on low-powered
devices. Also add a comment about the ioengine choice.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago t/zbd: Use max-jobs=16 option
Damien Le Moal [Fri, 8 May 2020 07:56:45 +0000 (16:56 +0900)]
t/zbd: Use max-jobs=16 option

Use the max-jobs option to reduce memory usage and speed up execution of
test-zbd-support.

With --max-jobs=16, twice the largest number of jobs used in all test
cases, the execution time of test-zbd-support against a zoned nullblk
device is lowered from 64s to 41s on a laptop.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago io_u: Optimize set_rw_ddir()
Damien Le Moal [Fri, 8 May 2020 07:56:44 +0000 (16:56 +0900)]
io_u: Optimize set_rw_ddir()

There is no need to execute zbd_adjust_ddir() for a job that is not
using zonemode=zbd. So move the job mode test out of zbd_adjust_ddir()
and conditionally execute this function by first testing the job mode
in set_rw_ddir().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago zbd: Rename zbd_init()
Damien Le Moal [Fri, 8 May 2020 07:56:43 +0000 (16:56 +0900)]
zbd: Rename zbd_init()

Clarify the execution context of zbd_init() by renaming this function
to zbd_setup_files() as it is called from the setup_files() function.
While at it, wrap the use of zbd_free_zone_info() into the inline
function zbd_close_file() to avoid an unnecessary function call when
closing files that are not zoned block device files of zonemode=zbd
jobs, that is, files that do not have zbd_info initialized.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago zbd: Optimize zbd_file_reset()
Damien Le Moal [Fri, 8 May 2020 07:56:42 +0000 (16:56 +0900)]
zbd: Optimize zbd_file_reset()

For a job that is not writing, a device's zones will not be reset by executing
zbd_file_reset(), so there is no need to scan all zones of the job
operating range. Avoid this overhead by returning early for jobs that
are not writing.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago zbd: Fix read with verify
Damien Le Moal [Fri, 8 May 2020 07:56:41 +0000 (16:56 +0900)]
zbd: Fix read with verify

For a read only workload with verify option enabled, executing
zbd_replay_write_order() will ignore target zones that are full and try
to open another zone. This either triggers an assert if max_open_zones
is unused, or results in verify failing. Fix this by executing
zbd_replay_write_order() only for writing workloads. This fix is also
consistent with the fact that zoned devices do not implicitly open
zones for read operations.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago zbd: Fix potential deadlock on read operations
Damien Le Moal [Fri, 8 May 2020 07:56:40 +0000 (16:56 +0900)]
zbd: Fix potential deadlock on read operations

For read-only workloads, zbd_find_zone() has a similar zone locking
behavior as for write IOs: zones to be read are locked when an IO is
prepared and unlocked when the IO completes. With an asynchronous IO
engine, this can create deadlocks if 2 threads are trying to read the
same 2 zones. For instance, thread A may already hold a lock on zone 1
and be waiting for a lock on zone 2 while thread B already holds a lock
on zone 2 and is waiting for a lock on zone 1.

The fix is similar to previous fixes for this potential deadlock,
namely, use zone_lock() instead of directly calling pthread_mutex_lock()
to ensure that a thread issues the IOs it already has prepared if it
encounters a locked zone, thereby ensuring forward progress.
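
The locking discipline described above can be sketched with a try-lock
followed by a flush. This is a Python model of the idea only: zone_mutex
stands in for a per-zone mutex, and flush_prepared_ios is a hypothetical
callback that issues the IOs the thread has already prepared (which, on
completion, release the zone locks those IOs hold).

```python
import threading


def zone_lock(zone_mutex, flush_prepared_ios):
    """Lock a zone without sitting on zone locks held by prepared IOs.

    If the zone is already locked, first issue the IOs this thread has
    prepared, then block until the zone lock becomes available.
    """
    if not zone_mutex.acquire(blocking=False):
        flush_prepared_ios()
        zone_mutex.acquire()


# Uncontended case: the lock is taken immediately and nothing is flushed.
flushed = []
mutex = threading.Lock()
zone_lock(mutex, lambda: flushed.append(True))
print(mutex.locked(), flushed)
```

Because a thread never blocks while it still holds locks for IOs it has not
issued, the circular wait between two threads cannot persist.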

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago iolog: Fix write_iolog_close()
Damien Le Moal [Fri, 8 May 2020 07:56:39 +0000 (16:56 +0900)]
iolog: Fix write_iolog_close()

If the init_iolog() call from backend.c thread_main() fails (e.g. wrong
file path given), td->iolog_f is not set but write_iolog_close() is
still called from thread_main() error processing. This causes a seg
fault and unclean termination of fio. Fix this by changing
write_iolog_close() to do nothing if td->iolog_f is NULL.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years ago Merge branch 'rados' of https://github.com/vincentkfu/fio
Jens Axboe [Thu, 14 May 2020 17:47:17 +0000 (11:47 -0600)]
Merge branch 'rados' of https://github.com/vincentkfu/fio

* 'rados' of https://github.com/vincentkfu/fio:
  engines/rados: fix build issue with thread_cond_t vs pthread_cond_t

3 years ago engines/rados: fix build issue with thread_cond_t vs pthread_cond_t
Vincent Fu [Thu, 14 May 2020 16:54:11 +0000 (12:54 -0400)]
engines/rados: fix build issue with thread_cond_t vs pthread_cond_t

The Travis-CI Linux build fails because the type for completed_more_io
was changed from pthread_cond_t to thread_cond_t:

https://travis-ci.org/github/axboe/fio/jobs/687073515

Change it back to pthread_cond_t.

Fixes: 1e30d8d005a568169c0749f5fc6fb2d5f09dcc97 ("engines/rados: Added
waiting for completion on cleanup.")
Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago Merge branch 'rados-cleanup-wait' of https://github.com/aclamk/fio
Jens Axboe [Thu, 14 May 2020 15:37:24 +0000 (09:37 -0600)]
Merge branch 'rados-cleanup-wait' of https://github.com/aclamk/fio

* 'rados-cleanup-wait' of https://github.com/aclamk/fio:
  engines/rados: Added waiting for completion on cleanup.

3 years ago engines/rados: Added waiting for completion on cleanup.
Adam Kupczyk [Sat, 9 May 2020 09:22:04 +0000 (05:22 -0400)]
engines/rados: Added waiting for completion on cleanup.

This change protects against problems when closing connection to ceph,
while some aio are in flight.

Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>
3 years ago Merge branch 'helper-thread-select' of https://github.com/vincentkfu/fio
Jens Axboe [Wed, 13 May 2020 14:10:32 +0000 (08:10 -0600)]
Merge branch 'helper-thread-select' of https://github.com/vincentkfu/fio

* 'helper-thread-select' of https://github.com/vincentkfu/fio:
  helper_thread: better handle select() return value

3 years ago helper_thread: better handle select() return value
Vincent Fu [Tue, 12 May 2020 16:50:25 +0000 (12:50 -0400)]
helper_thread: better handle select() return value

On Windows, the ETA is not updated after ramp_time expires. For example:

C:\fio-dev>fio\fio --name=test --runtime=5s --time_based --ramp_time=5 --size=1M --ioengine=null --thread
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=null, iodepth=1
fio-3.19-54-g9bc8
Starting 1 thread
Jobs: 1 (f=0): [/(1)][-.-%][eta 00m:05s]
test: (groupid=0, jobs=1): err= 0: pid=5344: Tue May 12 10:40:49 2020
  read: IOPS=2535k, BW=9903MiB/s (10.4GB/s)(48.4GiB/5001msec)
    clat (nsec): min=38, max=10680, avg=40.94, stdev= 4.20
     lat (nsec): min=107, max=10751, avg=110.78, stdev= 6.13
...

Notice that the last ETA update line indicates that there are still 5s
of runtime left even though the job has finished. This occurs because
the while loop in helper_thread_main() finishes soon after ramp_time
expires instead of continuing to run until the last job has completed.
The while loop ends because the return value for select() is stored in
ret. select() can return positive values in non-error conditions. The
while loop should not end when select() returns a positive value.
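
The effect can be modelled with a list of simulated select() return values
(-1 for an error, 0 for a timeout, a positive count when descriptors are
ready). The exact loop condition in helper_thread_main() is not shown in this
message, so the "any non-zero value ends the loop" behavior below is an
assumption, and run_loop() is a hypothetical helper.

```python
def run_loop(select_results, stop_on):
    """Count loop iterations over simulated select() return values."""
    iterations = 0
    for ret in select_results:
        iterations += 1
        if stop_on(ret):
            break
    return iterations


results = [0, 0, 1, 0, 0]  # a descriptor becomes ready on the third call

buggy = run_loop(results, stop_on=lambda ret: ret != 0)  # positive value ends the loop
fixed = run_loop(results, stop_on=lambda ret: ret < 0)   # only an error ends the loop
print(buggy, fixed)  # 3 5
```

In the buggy variant the loop stops at the first positive return value, which
corresponds to the ETA no longer being updated.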

Fixes: 700ad386aa88 ("helper_thread: Complain if select() fails")

3 years ago Merge branch 'btrace2fio' of https://github.com/liu-song-6/fio
Jens Axboe [Mon, 11 May 2020 18:09:31 +0000 (12:09 -0600)]
Merge branch 'btrace2fio' of https://github.com/liu-song-6/fio

* 'btrace2fio' of https://github.com/liu-song-6/fio:
  btrace2fio: create separate jobs for pid with both read/write and trim

3 years ago btrace2fio: create separate jobs for pid with both read/write and trim
Song Liu [Mon, 11 May 2020 17:27:07 +0000 (10:27 -0700)]
btrace2fio: create separate jobs for pid with both read/write and trim

Single fio job cannot do read/write and trim. Generate two separate jobs
for pid that does both read/write and trim: pidxxx and pidxxx_trim.

Signed-off-by: Song Liu <songliubraving@fb.com>
3 years ago rate: Convert the rate and rate_min options to FIO_OPTS_ULL
Stephen Bates [Fri, 8 May 2020 14:14:49 +0000 (08:14 -0600)]
rate: Convert the rate and rate_min options to FIO_OPTS_ULL

In many high-performance systems today it is possible to exceed 4GiB/s
throughput. Therefore convert the rate and rate_min options from
FIO_OPTS_INT to FIO_OPTS_ULL.

Fixes #716.

Signed-off-by: Stephen Bates <sbates@raithlin.com>
3 years ago Merge branch 'helper_thread_test' of https://github.com/vincentkfu/fio
Jens Axboe [Wed, 29 Apr 2020 15:05:11 +0000 (09:05 -0600)]
Merge branch 'helper_thread_test' of https://github.com/vincentkfu/fio

* 'helper_thread_test' of https://github.com/vincentkfu/fio:
  helper_thread: refactor status-interval and steadystate code
  helper_thread: fix inconsistent status intervals
  helper_thread: cleanups

3 years ago helper_thread: refactor status-interval and steadystate code
Vincent Fu [Wed, 29 Apr 2020 11:19:54 +0000 (05:19 -0600)]
helper_thread: refactor status-interval and steadystate code

The code patterns for the status-interval and steadystate tasks are the same.
So refactor the common code into a separate function. The disk util code is not
the same because the task has a return code.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago helper_thread: fix inconsistent status intervals
Vincent Fu [Tue, 28 Apr 2020 18:16:46 +0000 (12:16 -0600)]
helper_thread: fix inconsistent status intervals

The signal handler safety changes to the helper thread have resulted in
inconsistent status-interval intervals. Consider the following:

$ ./fio-canonical/fio --name=test --rw=randwrite --ioengine=libaio --direct=1 --runtime=180 --time_based --filename=/dev/fioa --output=write-canonical.out --minimal --status-interval=1
$ cut -d ';' -f 50 < write-canonical.out | awk 'NR>1{print $1-p} {p=$1}' | sort -n | tail
1002
1002
1002
1002
1002
1042
1046
1251
1252
1252

Several of the status-interval output lines are ~1250ms apart.
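
For readers less familiar with awk, the cut/awk/sort pipeline above computes
the difference between consecutive values of one semicolon-separated field and
sorts the results. A rough Python equivalent is sketched below; the
interval_deltas() helper is hypothetical and the sample lines are made up for
illustration.

```python
def interval_deltas(lines, field=50, sep=";"):
    """Differences between consecutive values of a 1-based field, sorted ascending."""
    values = [int(line.split(sep)[field - 1]) for line in lines if line.strip()]
    return sorted(b - a for a, b in zip(values, values[1:]))


# Made-up two-field lines; the value of interest is in field 2.
sample = ["x;1000", "x;2002", "x;3253", "x;4255"]
print(interval_deltas(sample, field=2))        # [1002, 1002, 1251]
print(interval_deltas(sample, field=2)[-10:])  # like "tail": the ten largest deltas
```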

This patch moves code for triggering the status-interval output from the main
fio process to the helper thread. The resulting intervals are much closer to
the desired 1000ms.

$ ./fio/fio --name=test --rw=randwrite --ioengine=libaio --direct=1 --runtime=180 --time_based --filename=/dev/fioa --minimal --status-interval=1 --output=write-test.out
$ cut -d ';' -f 50 < write-test.out | awk 'NR>1{print $1-p} {p=$1}' | sort -n | tail
1001
1001
1001
1001
1001
1001
1001
1001
1001
1001

Reported-by: <nate.rivers@wdc.com>
Fixes: 31eca641ad91 ("Fix a potential deadlock in helper_do_stat()")
Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
3 years ago helper_thread: cleanups
Vincent Fu [Tue, 28 Apr 2020 17:27:14 +0000 (11:27 -0600)]
helper_thread: cleanups

- instead of always using a timeout of DISK_UTIL_MSEC, use a possibly shorter
  period for the select() timeout
- drop the timespec_add_msec() call because the target is overwritten in short
  order by clock_gettime()

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
4 years ago Merge branch 'gcc1' of https://github.com/kusumi/fio
Jens Axboe [Tue, 21 Apr 2020 21:44:31 +0000 (15:44 -0600)]
Merge branch 'gcc1' of https://github.com/kusumi/fio

* 'gcc1' of https://github.com/kusumi/fio:
  json: Fix compile error on RHEL6

4 years ago json: Fix compile error on RHEL6
Tomohiro Kusumi [Tue, 21 Apr 2020 19:17:12 +0000 (04:17 +0900)]
json: Fix compile error on RHEL6

eb2f29b7fd("Make the JSON code easier to analyze") doesn't compile
on RHEL6 using gcc4.x.

Using "{.object = val,}," for a union field seems to fix the issue,
but just use "arg.object = val;" instead as this is guaranteed to
compile on supported platforms.

--
    CC gettime.o
In file included from stat.h:7,
                 from thread_options.h:7,
                 from fio.h:18,
                 from gettime.c:7:
json.h: In function 'json_object_add_value_object':
json.h:95: error: unknown field 'object' specified in initializer
json.h:95: warning: missing braces around initializer
json.h:95: warning: (near initialization for 'arg.<anonymous>')
json.h:95: warning: initialization makes integer from pointer without a cast
make: *** [gettime.o] Error 1

Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
4 years ago json: don't use named initializers for anonymous unions
Jens Axboe [Tue, 21 Apr 2020 03:20:03 +0000 (21:20 -0600)]
json: don't use named initializers for anonymous unions

Older compilers don't like it, and we can just make it work a bit
differently instead.

Fixes: https://github.com/axboe/fio/issues/966
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years ago zbd: Fix I/O direction adjustment step for random read/write
Shin'ichiro Kawasaki [Thu, 16 Apr 2020 11:30:36 +0000 (20:30 +0900)]
zbd: Fix I/O direction adjustment step for random read/write

Commit fb0259fb ("zbd: Ensure first I/O is write for random read/write to
sequential zones") introduced a step to change direction of io_u from
read to write when that is the first I/O of the random read/write
workload to zoned block devices. However, such direction adjustment
results in inconsistent I/O length when read block size and write block
size are different.

To avoid the inconsistency between I/O direction and I/O length,
adjust the I/O direction before the I/O length is set. Move the step
from zbd_adjust_block() to set_rw_ddir(). To minimize changes in
set_rw_ddir(), introduce zbd_adjust_ddir() helper function.
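
The ordering issue can be sketched as follows. This is a simplified
illustration under assumed names (struct io_req, adjust_ddir(),
prepare_io() and the zone_has_data flag are hypothetical, not the fio
functions): the direction is adjusted first, and only then is the length
looked up from the per-direction block size.

```c
enum ddir { DDIR_READ = 0, DDIR_WRITE = 1 };

/* Hypothetical, simplified I/O request. */
struct io_req {
	enum ddir ddir;
	unsigned int len;
};

/* Force a write while the target zone holds no data to read. */
static enum ddir adjust_ddir(enum ddir ddir, int zone_has_data)
{
	if (ddir == DDIR_READ && !zone_has_data)
		return DDIR_WRITE;
	return ddir;
}

static struct io_req prepare_io(enum ddir picked, int zone_has_data,
				const unsigned int bs[2])
{
	struct io_req req;

	/* Settle the direction first ... */
	req.ddir = adjust_ddir(picked, zone_has_data);
	/* ... so the length always matches the final direction. */
	req.len = bs[req.ddir];
	return req;
}
```

With a 4096 byte read block size and an 8192 byte write block size, a read
redirected to a write gets the write block size rather than the read one.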

Fixes: fb0259fb ("zbd: Ensure first I/O is write for random read/write to sequential zones")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoMerge branch 'patch-1' of https://github.com/aakarshg/fio
Jens Axboe [Thu, 16 Apr 2020 20:04:26 +0000 (14:04 -0600)]
Merge branch 'patch-1' of https://github.com/aakarshg/fio

* 'patch-1' of https://github.com/aakarshg/fio:
  Add fio-histo-log-pctiles to make file

4 years agoAdd fio-histo-log-pctiles to make file
Aakarsh Gopi [Thu, 16 Apr 2020 18:34:46 +0000 (14:34 -0400)]
Add fio-histo-log-pctiles to make file

This was missing earlier

4 years agoMerge branch 'appveyor-artifacts' of https://github.com/vincentkfu/fio
Jens Axboe [Wed, 15 Apr 2020 14:29:01 +0000 (08:29 -0600)]
Merge branch 'appveyor-artifacts' of https://github.com/vincentkfu/fio

* 'appveyor-artifacts' of https://github.com/vincentkfu/fio:
  appveyor: make test artifacts available for inspection

4 years agoappveyor: make test artifacts available for inspection
Vincent Fu [Tue, 14 Apr 2020 14:10:45 +0000 (10:10 -0400)]
appveyor: make test artifacts available for inspection

For debugging test failures, package test artifacts and make them
available for download. Exclude certain files to reduce size of the
compressed archive.

Suggested-by: Sitsofe Wheeler <sitsofe@gmail.com>
Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
4 years agozbd: fix sequential write pattern with verify= and max_open_zones=
Alexey Dobriyan [Mon, 13 Apr 2020 18:51:55 +0000 (21:51 +0300)]
zbd: fix sequential write pattern with verify= and max_open_zones=

Sequential write with max_open_zones=1 has interesting (read: buggy)
interaction with verify=.

If verify is off, then job runs correctly and IO is sequential,
and restarted from offset 0 and remains sequential.

If verify is on, then 1 full run is done and verified correctly.
At this point there is exactly 1 open zone which is the last zone.

Now IO restarts from offset 0 and pick_random_zone() picks opened zone
#0 which is the last zone because offset is 0. All IO is redirected
to the last zone, which is rewritten once triggering verify again.

IO pattern becomes: 1 full sequential rewrite followed by constant
sequential rewrites of the last zone.

[global]
filename=/dev/loop0
direct=1
zonemode=zbd
zonesize=1M
bs=512K
rw=write
verify=xxhash
[j]
max_open_zones=1
io_size=3G

Fix is to close every zone given that verification acts as a barrier
between jobs.

max_open_zones=2 can restart from half of the device, etc.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agozbd: Ensure first I/O is write for random read/write to sequential zones
Shin'ichiro Kawasaki [Mon, 13 Apr 2020 08:33:00 +0000 (17:33 +0900)]
zbd: Ensure first I/O is write for random read/write to sequential zones

If a read is chosen as the first random I/O to sequential write
required zones, fio stops because no data can be read from zones that
are empty. Force the first I/O to be a write to make sure that data
exists for the read operations that follow.

The unexpected fio stop was observed with test case #30 of
t/zbd/test-zbd-support. When the test case was run repeatedly, resetting
all zones with the -r option, it often passed with too short a run time.

Reviewed-by: Damien Le Moal <damien.lemoaal@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agot/zbd: Fix a bug in reset_zone() for all zones reset
Shin'ichiro Kawasaki [Mon, 13 Apr 2020 08:32:59 +0000 (17:32 +0900)]
t/zbd: Fix a bug in reset_zone() for all zones reset

The bash function reset_zone() is expected to reset all zones when -1 is
provided as its second argument. However, it fails to reset all zones
using blkzone command because of wrong and unnecessary options provided
to blkzone. Remove the option to fix it.

This failure was found by running test-zbd-support with the -r option.

Reviewed-by: Damien Le Moal <damien.lemoaal@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agot/zbd: Fix a bug in max_open_zones()
Shin'ichiro Kawasaki [Mon, 13 Apr 2020 08:32:58 +0000 (17:32 +0900)]
t/zbd: Fix a bug in max_open_zones()

When the sg_inq command is executed to check whether it can provide the
maximum number of open zones, the command's standard output was not
discarded, which caused unexpected script behavior. Fix it by discarding
the standard output.

Reviewed-by: Damien Le Moal <damien.lemoaal@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agozbd: fix zonemode=zbd with NDEBUG
Alexey Dobriyan [Fri, 10 Apr 2020 19:06:21 +0000 (22:06 +0300)]
zbd: fix zonemode=zbd with NDEBUG

assert() with NDEBUG doesn't evaluate its argument.
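
A minimal sketch of the pitfall, using hypothetical function names that do
not come from zbd.c: when the file is compiled with -DNDEBUG, the whole
expression inside assert() disappears, so any call with side effects placed
there is never made.

```c
#include <assert.h>
#include <stdbool.h>

static int setup_calls;

/* Stand-in for a function whose side effect must always happen. */
static bool do_setup(void)
{
	setup_calls++;
	return true;
}

/* Fragile: with -DNDEBUG the call to do_setup() is compiled out. */
static int setup_inside_assert(void)
{
	setup_calls = 0;
	assert(do_setup());
	return setup_calls;	/* 1 in a debug build, 0 with NDEBUG */
}

/* Robust: always make the call, and only assert on its result. */
static int setup_outside_assert(void)
{
	bool ok;

	setup_calls = 0;
	ok = do_setup();
	assert(ok);
	(void)ok;		/* silences "unused" warnings with NDEBUG */
	return setup_calls;	/* 1 in every build */
}
```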

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoMerge branch 'fix-cflags' of https://github.com/Hi-Angel/fio
Jens Axboe [Mon, 13 Apr 2020 14:04:09 +0000 (08:04 -0600)]
Merge branch 'fix-cflags' of https://github.com/Hi-Angel/fio

* 'fix-cflags' of https://github.com/Hi-Angel/fio:
  configure/Makefile: don't override user CFLAGS

4 years agoconfigure/Makefile: don't override user CFLAGS
Konstantin Kharlamov [Mon, 13 Apr 2020 11:57:19 +0000 (14:57 +0300)]
configure/Makefile: don't override user CFLAGS

It is common practice to build software by passing `CFLAGS="-foo"` at the
configure stage. That did not work with fio, though. This commit fixes two
problems:

* configure: this script was overriding user CFLAGS
* Makefile: this script was appending its own CFLAGS instead of
  prepending them. The problem with this one is that it sets the -O3
  option, but a user may want to disable optimization and so set the
  -O0 option. Appending our CFLAGS after the user's makes the user's
  CFLAGS ineffective.

Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
4 years agoMerge branch 'zbd-build' of https://github.com/vincentkfu/fio
Jens Axboe [Wed, 8 Apr 2020 14:46:35 +0000 (08:46 -0600)]
Merge branch 'zbd-build' of https://github.com/vincentkfu/fio

* 'zbd-build' of https://github.com/vincentkfu/fio:
  Revert ".travis.yml: remove pip line from xcode11.2 config"
  zbd: fix Windows build errors

4 years agoexamples: add libzbc ioengine example scripts
Damien Le Moal [Wed, 8 Apr 2020 06:53:09 +0000 (15:53 +0900)]
examples: add libzbc ioengine example scripts

Add two example script files (random write and sequential read)
illustrating the use of the libzbc ioengine with zonemode=zbd.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoexamples: add zonemode=zbd example scripts
Damien Le Moal [Wed, 8 Apr 2020 06:46:59 +0000 (15:46 +0900)]
examples: add zonemode=zbd example scripts

Add two example script files (random write and sequential read)
illustrating the use of zonemode=zbd with the psync and libaio
ioengines.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agozbd: Fix missing mutex unlock and warnings detected with coverity
Damien Le Moal [Wed, 8 Apr 2020 06:46:45 +0000 (15:46 +0900)]
zbd: Fix missing mutex unlock and warnings detected with coverity

With max_open_zones != 0, if no candidate zone for open is found by
zbd_convert_to_open_zone(), the file zbd_info mutex as well as the
current target zone mutex must both be unlocked before returning NULL.

While at it, also assert check for min_bs != 0 where min_bs is used for
divisions to avoid division by zero warnings from coverity.

Reported-by: Bart Van Assche <bvanassche@acm.org>
Fixes: 6463db6c1d3a ("fio: fix interaction between offset/size...")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoRevert ".travis.yml: remove pip line from xcode11.2 config"
Vincent Fu [Wed, 8 Apr 2020 11:22:18 +0000 (07:22 -0400)]
Revert ".travis.yml: remove pip line from xcode11.2 config"

This reverts commit 839e0223363e323a4acbdfaf785b03d5aa9f53ba.

Two weeks ago an update to the xcode11.2 image required the above patch
to get macOS testing working. Recently the xcode11.2 image was changed
back to its earlier state. So we now need to revert the above patch for
testing to work.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
4 years agozbd: fix Windows build errors
Vincent Fu [Wed, 8 Apr 2020 11:20:12 +0000 (07:20 -0400)]
zbd: fix Windows build errors

Adding the os.h include resolves the build problems.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
4 years agozbd: Fix build errors on Windows and MacOS
Damien Le Moal [Wed, 8 Apr 2020 01:54:26 +0000 (10:54 +0900)]
zbd: Fix build errors on Windows and MacOS

Including dirent.h is not needed, so remove it to avoid a compilation
error on Windows and MacOS. Also make sure that EREMOTEIO is defined as
some OSes do not have this error code.

Fixes: b76949618d55 ("fio: Generalize zonemode=zbd")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoMerge branch 'rdma-fixes' of https://github.com/dmonakhov/fio
Jens Axboe [Tue, 7 Apr 2020 22:05:23 +0000 (16:05 -0600)]
Merge branch 'rdma-fixes' of https://github.com/dmonakhov/fio

* 'rdma-fixes' of https://github.com/dmonakhov/fio:
  engine/rdmaio: fix io_u initialization
  engines: check options before dereference

4 years agot/zbd: Add support for libzbc IO engine tests
Dmitry Fomichev [Tue, 7 Apr 2020 01:59:00 +0000 (10:59 +0900)]
t/zbd: Add support for libzbc IO engine tests

Modify the test-zbd-support script to accept SG node device files for
tests with the libzbc IO engine. This IO engine can also be tested with
a block device file using the new -l option which forces all test cases
to have the option --ioengine=libzbc.

New helper functions are added to discover the capacity, logical block
size etc of devices specified using an SG node file.

To facilitate troubleshooting of problems, the option -z is also added
to automatically add the option --debug=zbd to all test cases.

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agofio: Introduce libzbc IO engine
Dmitry Fomichev [Tue, 7 Apr 2020 01:58:59 +0000 (10:58 +0900)]
fio: Introduce libzbc IO engine

Many storage users in the field are using Linux enterprise distributions
with somewhat old kernel versions 3.x that do not have zoned block
device/ZBC/ZAC support, or distributions with more recent kernel
versions that do not have zoned block device support enabled by
default, i.e. not supported by the distribution vendor.

Despite this, there are many examples of production applications using
SMR disks directly using SCSI passthrough commands.

SMR disk performance testing and qualification using fio in such
environments is possible using the sg IO engine, but writing scripts
is not easy as zonemode=zbd cannot be used due to the engine's lack of
support for ZBC operations (report zones, zone reset, etc).

Rather than modifying the sg IO engine, a simpler approach to provide
passthrough SMR support in fio is to use libzbc
(https://github.com/hgst/libzbc) to implement a ZBC compliant ioengine
supporting zonemode=zbd zone operations. With this, it becomes possible
to run fio more easily against SMR disks on systems without kernel
zoned block device support. This approach will also naturally enable
support for other ZBD disk varieties besides ZAC/ZBC SMR disks, namely
the upcoming Zone Domains/Zone Realms (ZD/ZR) drives, aka dynamic
hybrid SMR drives.

This new libzbc IO engine implements the three IO engine methods related
to zoned devices: get_zoned_model(), report_zones() and reset_wp(),
allowing the use of zonemode=zbd. Special open_file(), close_file() and
get_file_size() methods are provided and implemented using libzbc
functions. The queue() operation allows only synchronous read and write
operations using the libzbc functions zbc_pread() and zbc_pwrite().

Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoioengines: Add zoned block device operations
Damien Le Moal [Tue, 7 Apr 2020 01:58:58 +0000 (10:58 +0900)]
ioengines: Add zoned block device operations

Define three new IO engine operations: zoned model discovery, zone
information report and zone write pointer reset. These allow an
ioengine to provide a special implementation of these operations if the
system does not support them natively through system calls, or on Linux
to replace the default Linux blkzoned.h ioctl based generic
implementation in oslib/linux-blkzoned.c.

FIO internal and external ioengines using direct device access
(e.g. Linux SG) or OS specific IO engines can provide an implementation
of these methods to enable zoned block device zonemode=zbd workloads.

On Linux, the IO engine zone operations have precedence over the
default zone operation implementation in oslib/linux-blkzoned.c.
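
The precedence rule can be sketched as follows. This is an illustration
only: struct engine_ops, enum zoned_model and the function signatures are
hypothetical and much simpler than the real ioengine interface. An engine
callback, when present, is used; otherwise the OS default runs.

```c
#include <stddef.h>

enum zoned_model { ZONED_NONE, ZONED_HOST_AWARE, ZONED_HOST_MANAGED };

/* Hypothetical, reduced ops table: a NULL callback means "not provided". */
struct engine_ops {
	int (*get_zoned_model)(const char *path, enum zoned_model *model);
};

/* Stand-in for the OS default implementation. */
static int default_get_zoned_model(const char *path, enum zoned_model *model)
{
	(void)path;
	*model = ZONED_NONE;
	return 0;
}

/* Example engine callback that always reports a host-managed device. */
static int fake_engine_get_zoned_model(const char *path,
				       enum zoned_model *model)
{
	(void)path;
	*model = ZONED_HOST_MANAGED;
	return 0;
}

/* The engine operation, if defined, takes precedence over the default. */
static int get_zoned_model(const struct engine_ops *ops, const char *path,
			   enum zoned_model *model)
{
	if (ops && ops->get_zoned_model)
		return ops->get_zoned_model(path, model);
	return default_get_zoned_model(path, model);
}
```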

This patch also increments FIO_IOOPS_VERSION to 26 and adds a
skeleton implementation of the new ioengine operations in
engines/skeleton_external.c.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agofio: Generalize zonemode=zbd
Damien Le Moal [Tue, 7 Apr 2020 01:58:57 +0000 (10:58 +0900)]
fio: Generalize zonemode=zbd

Generalize the implementation of the zbd zonemode for non-Linux systems
and Linux systems without the blkzoned.h header file (that is, Linux
systems with a kernel predating v4.10 or kernels compiled without zoned
block device support).

The configuration option CONFIG_HAS_BLKZONED determines whether the
system supports zoned block devices. This option can be set for Linux
only for now. If it is set, the file oslib/linux-blkzoned.c is compiled
and the 3 functions it defines are used by the zbd.c code to determine a
block device's zoned model, get zone information and reset zones.
For systems that do not set the CONFIG_HAS_BLKZONED option,
zonemode=zbd will be usable with regular block devices, with the
zbd code emulating zones as is already done currently.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoengine/rdmaio: fix io_u initialization
Dmitry Monakhov [Tue, 7 Apr 2020 19:18:42 +0000 (22:18 +0300)]
engine/rdmaio: fix io_u initialization

Currently the rdmaio engine is fatally broken.
We fill the io_u buffers inside the engine->init() phase, but at this point
td->io_u_freelist is empty, so the initialization code does nothing and
io_u->engine_data is left uninitialized. This later results in a null
pointer dereference in fio_rdmaio_prep().

This patch moves io_u initialization to the post_init() callback.

4 years agoengines: check options before dereference
Dmitry Monakhov [Tue, 7 Apr 2020 17:33:46 +0000 (20:33 +0300)]
engines: check options before dereference

If a FIO_OPT_STR_STORE option is not provided, it is initialized to NULL, but
there are many places that assume it may be an empty string.
For example, the commands below end up with a null pointer dereference:
fio  --name=test --ioengine=e4engine --size=1M
fio  --name=test --ioengine=rdma --port=1234 --size=1M
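
The kind of check involved can be sketched as follows. The helper names are
hypothetical and not taken from the engines: a string option that was never
given is NULL, so it has to be tested before strlen() or any other
dereference.

```c
#include <stddef.h>
#include <string.h>

/* Return 0 if the option holds a usable string, -1 if unset or empty. */
static int check_str_option(const char *value)
{
	if (value == NULL || value[0] == '\0')
		return -1;
	return 0;
}

/* NULL-safe length: an unset option counts as length 0. */
static size_t option_len(const char *value)
{
	return value ? strlen(value) : 0;
}
```

An engine's setup code would call such a check and fail with an error
message instead of dereferencing the pointer.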

4 years agozbd: fixup ->zone_size_log2 if zone size is not power of 2
Alexey Dobriyan [Mon, 6 Apr 2020 19:56:10 +0000 (22:56 +0300)]
zbd: fixup ->zone_size_log2 if zone size is not power of 2

Code like this doesn't work if log2 is 0xffffffff.

        if (f->zbd_info->zone_size_log2 > 0)
                zone_idx = offset >> f->zbd_info->zone_size_log2;
        else
                zone_idx = offset / f->zbd_info->zone_size;

Other than that everything else works!
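
A minimal sketch of the idea, with hypothetical helper names that are not
the zbd.c functions: store 0 rather than an all-ones value when the zone
size is not a power of 2, so that the "> 0" test in code like the snippet
above selects the division path.

```c
#include <stdint.h>

/* log2 of zone_size if it is a power of 2, otherwise 0. */
static uint32_t zone_size_log2(uint64_t zone_size)
{
	uint32_t shift = 0;

	if (zone_size == 0 || (zone_size & (zone_size - 1)) != 0)
		return 0;
	while ((zone_size >> shift) != 1)
		shift++;
	return shift;
}

/* Zone index for a byte offset; shift when possible, divide otherwise. */
static uint64_t zone_index(uint64_t offset, uint64_t zone_size)
{
	uint32_t shift = zone_size_log2(zone_size);

	if (shift > 0)
		return offset >> shift;
	return offset / zone_size;
}
```

A zone size of 1 also yields 0 here, which is harmless because dividing by
1 gives the same result as shifting by 0.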

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agozbd: Fix potential zone lock deadlock
Damien Le Moal [Mon, 6 Apr 2020 10:51:32 +0000 (19:51 +0900)]
zbd: Fix potential zone lock deadlock

Commit b27aef6abfba ("zbd: use zone_lock to lock a zone"), which was meant
to fix potential deadlocks with zonemode=zbd zone locking, was incomplete.
Executing the zone lock stress test, t/zbd test case 48, still
sometimes leads to deadlocks (a large number of repeated executions is
sometimes needed).

The remaining deadlock pattern identified with the repeated execution
of this test is due to the concurrent execution of jobs doing random
async writes to zones. In such a case, any of the jobs may trigger an all
zone reset through the path get_next_rand_block() -> fio_file_reset()
while async writes are still inflight. The fix for this is to use the
zone_lock() function instead of directly calling pthread_mutex_lock()
to ensure that no async IO is inflight for a zone that is part of a
reset range.

Suggested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agofio: fix interaction between offset/size limited threads and "max_open_zones"
Alexey Dobriyan [Thu, 2 Apr 2020 19:21:02 +0000 (22:21 +0300)]
fio: fix interaction between offset/size limited threads and "max_open_zones"

If a thread bumps into the "max_open_zones" limit, it tries to close/reopen
some other zone before issuing IO. This scan is done over the full list of the
block device's opened zones. It means that a zone which doesn't belong to the
thread's working area can be altered, or IO can be retargeted at such a zone.

If IO is retargeted then it will be dropped by the "is_valid_offset()" check.

What happens with null block device testing is that one thread monopolises
IO and the other threads do basically nothing.

This config will reliably succeed now:

[global]
zonemode=zbd
zonesize=1M
rw=randwrite
...
thread
numjobs=2
offset_increment=128M

[j]
max_open_zones=2
size=2M

Starting 2 threads
zbd      7991  /dev/nullb0: zbd model string: host-managed
zbd      7991  Device /dev/nullb0 has 1024 zones of size 1024 KB
zbd      8009  /dev/nullb0: examining zones 0 .. 2
zbd      8010  /dev/nullb0: examining zones 128 .. 130
zbd      8009  /dev/nullb0: opening zone 0
zbd      8010  /dev/nullb0: opening zone 128
zbd      8009  /dev/nullb0: queued I/O (0, 4096) for zone 0
zbd      8009  zbd_convert_to_open_zone(/dev/nullb0): starting from zone 128 (offset 1552384, buflen 4096)

retargeted for other thread's zone (zone 0 => zone 128)

zbd      8010  /dev/nullb0: queued I/O (134217728, 4096) for zone 128
zbd      8009  zbd_convert_to_open_zone(/dev/nullb0): returning zone 128
zbd      8009  Dropped request with offset 134221824

and dropped

Note: quasi-randomness is kind of necessary to spread I/O. Imagine index 0
is picked all the time, zone living there will be reopened constantly and
get relatively little I/O.
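
The restriction can be sketched as follows. This is a simplified
illustration, not the zbd.c code: pick_open_zone() and its parameters are
hypothetical. The scan starts at a caller-supplied (for example
quasi-random) index and accepts only open zones inside the thread's own
range [zone_min, zone_max).

```c
#include <stdint.h>

/*
 * Return the index in open_zones[] of the first open zone that lies in
 * [zone_min, zone_max), scanning circularly from 'start'. Return -1 if
 * the thread has no open zone in its own range.
 */
static int pick_open_zone(const uint32_t *open_zones, int nr_open, int start,
			  uint32_t zone_min, uint32_t zone_max)
{
	for (int i = 0; i < nr_open; i++) {
		int idx = (start + i) % nr_open;
		uint32_t zone = open_zones[idx];

		/* Skip zones that belong to another thread's working area. */
		if (zone >= zone_min && zone < zone_max)
			return idx;
	}
	return -1;
}
```

With open zones 0 and 128 as in the log above, the thread working on zones
0 .. 2 never gets zone 128 back, whatever the starting index is.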

Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoMerge branch 'github-issue-947' of https://github.com/vincentkfu/fio
Jens Axboe [Tue, 31 Mar 2020 16:21:48 +0000 (10:21 -0600)]
Merge branch 'github-issue-947' of https://github.com/vincentkfu/fio

* 'github-issue-947' of https://github.com/vincentkfu/fio:
  stat: eliminate extra log samples

4 years agostat: eliminate extra log samples
Vincent Fu [Tue, 31 Mar 2020 11:26:16 +0000 (07:26 -0400)]
stat: eliminate extra log samples

b2a432bfbb6d inadvertently added extra log samples.

$ ./fio-canonical/fio --name=test --time_based --runtime=10s --write_lat_log=fio-07-b2a432 --log_avg_msec=1000 --size=1G --rw=rw
test: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.17-93-gb2a4
Starting 1 process
...
$ cat fio-07-b2a432_clat.1.log
1000, 5851, 0, 0, 0
1000, 2551, 1, 0, 0
1000, 5028, 1, 0, 0
2000, 4175, 0, 0, 0
2000, 3214, 1, 0, 0
2000, 60619, 0, 0, 0
...

There should only be two lines at each timestamp (one for reads, one for
writes), but the first two timestamps have three lines each.

The cause is an inadvertent change in stat.c:add_log_sample() of
__add_stat_to_log to _add_stat_to_log. Reverting to the two-underscore
version resolves this issue.

Fixes: https://github.com/axboe/fio/issues/947
Fixes: b2a432bfbb6d ("Per-command priority: Priority logging and libaio/io_uring cmdprio_percentage")
Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
4 years agoMerge branch 'jsonplus2csv' of https://github.com/vincentkfu/fio
Jens Axboe [Thu, 26 Mar 2020 15:03:40 +0000 (09:03 -0600)]
Merge branch 'jsonplus2csv' of https://github.com/vincentkfu/fio

* 'jsonplus2csv' of https://github.com/vincentkfu/fio:
  .travis.yml: remove pip line from xcode11.2 config
  t/jsonplus2csv_test.py: test script for tools/fio_jsonplus_clat2csv
  tools/fio_jsonplus2csv: accommodate multiple lat measurements

4 years ago.travis.yml: remove pip line from xcode11.2 config
Vincent Fu [Wed, 25 Mar 2020 17:48:39 +0000 (13:48 -0400)]
.travis.yml: remove pip line from xcode11.2 config

travis-ci changed the xcode11.2 image and 'pip' is no longer available.
So only run 'pip install scipy' for the default xcode image.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
4 years agot/jsonplus2csv_test.py: test script for tools/fio_jsonplus_clat2csv
Vincent Fu [Wed, 25 Mar 2020 16:53:54 +0000 (12:53 -0400)]
t/jsonplus2csv_test.py: test script for tools/fio_jsonplus_clat2csv

Add a script to run a basic jsonplus to CSV conversion and then validate
the conversion.

Also integrate this test script with t/run-fio-tests.py and install the
python package 'six' to support fio_jsonplus_clat2csv in the AppVeyor
build/testing environment.

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
4 years agotools/fio_jsonplus2csv: accommodate multiple lat measurements
Vincent Fu [Mon, 23 Mar 2020 22:14:40 +0000 (18:14 -0400)]
tools/fio_jsonplus2csv: accommodate multiple lat measurements

Add some intelligence to this script so that it works for any of
submission, completion, and total latency whenever they are present. The
CSV data format is changed to accommodate this.

While we're here also do the following:

- add a way to generate optional debug output
- add validate option that compares generated CSV data with the original
  json+ data
- fix style issues identified by pylint3
- update documentation

Signed-off-by: Vincent Fu <vincent.fu@wdc.com>
4 years agozbd: add test for stressing zone locking
Naohiro Aota [Fri, 28 Feb 2020 07:12:48 +0000 (16:12 +0900)]
zbd: add test for stressing zone locking

Add a test to stress the zone locking mechanism by having a large number of
threads with a small number of max_open_zones. Run a 30 second time-based
fio job under the timeout command. After 45 seconds, "timeout" sends SIGKILL
to the fio process. If a zone lock deadlocks, fio is killed by the timeout
command and this test fails. If not, fio runs to the end and this test
succeeds.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoio_u: ensure io_u_quiesce() to process all the IOs
Naohiro Aota [Fri, 28 Feb 2020 07:12:47 +0000 (16:12 +0900)]
io_u: ensure io_u_quiesce() to process all the IOs

Currently, when an IO has an error, io_u_quiesce() stops processing
in-flight IOs there and leaves the other IOs uncompleted. This is not the
desired behavior for io_u_quiesce(). Fix it by continuing even on
error.
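
The pattern can be sketched as follows. This is an illustration with
hypothetical types, not the io_u.c code: every pending IO is processed, and
the first error is remembered and returned only after the loop has
finished.

```c
#include <stddef.h>

/* Hypothetical in-flight IO: result < 0 means the IO failed. */
struct pending_io {
	int result;
	int completed;
};

/* Complete all pending IOs; return the first error seen, or 0. */
static int quiesce_all(struct pending_io *ios, size_t nr)
{
	int first_err = 0;

	for (size_t i = 0; i < nr; i++) {
		ios[i].completed = 1;
		/* Remember the error but keep going. */
		if (ios[i].result < 0 && first_err == 0)
			first_err = ios[i].result;
	}
	return first_err;
}
```

Returning from inside the loop on the first failure would leave the
remaining entries uncompleted, which is the behavior this commit removes.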

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agobackend: always clean up pending aios
Naohiro Aota [Fri, 28 Feb 2020 07:12:46 +0000 (16:12 +0900)]
backend: always clean up pending aios

cleanup_pending_aios() is called when a thread exits with an error, so all
the call sites of this function are under "if (td->error)". However, commit
d28174f0189c ("workqueue: ensure we see deferred error for IOs"), for some
reason, added "if (td->error) return" at the head of this function, making
this function practically a no-op. Revert this part to ensure that pending
aios are cleaned up.

Besides, cleanup_pending_aios() should not return even when
io_u_queued_complete() fails, because returning leaves in-flight aios behind.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>