fio.git
2 years agoMerge branch 'master' of https://github.com/bvanassche/fio
Jens Axboe [Thu, 24 Feb 2022 19:40:19 +0000 (12:40 -0700)]
Merge branch 'master' of https://github.com/bvanassche/fio

* 'master' of https://github.com/bvanassche/fio:
  Fix three compiler warnings

2 years agoFix three compiler warnings
Bart Van Assche [Thu, 24 Feb 2022 19:05:41 +0000 (11:05 -0800)]
Fix three compiler warnings

Fix three occurrences of the following clang compiler warning:

warning: suggest braces around initialization of subobject [-Wmissing-braces]

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
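
For context, clang emits -Wmissing-braces when a nested aggregate is initialized without braces of its own. The following is a minimal sketch of the pattern and its fix. The struct names are hypothetical and are not taken from the fio sources.

```c
/* Hypothetical types for illustration; not taken from the fio sources. */
struct inner {
	int a;
	int b;
};

struct outer {
	struct inner in;
	int c;
};

/*
 * This form is valid C but triggers -Wmissing-braces under clang:
 *     struct outer flat = { 1, 2, 3 };
 */

/* Fully braced form: the subobject 'in' gets its own braces. */
static const struct outer braced = { { 1, 2 }, 3 };

/* Small helper so the initializer can be checked. */
static int outer_sum(const struct outer *o)
{
	return o->in.a + o->in.b + o->c;
}
```

Both forms initialize the same members; the braced form only makes the nesting explicit.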
2 years agoio_uring: use syscall helpers for the hot path
Jens Axboe [Mon, 21 Feb 2022 16:43:48 +0000 (09:43 -0700)]
io_uring: use syscall helpers for the hot path

The only real hot system call here is the io_uring_enter(2) call,
as that'll happen during the IO submission/completion parts. The rest
are just setup function calls; we don't really care about those.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agox86-64: add system call definitions
Jens Axboe [Mon, 21 Feb 2022 16:43:15 +0000 (09:43 -0700)]
x86-64: add system call definitions

Avoid a libc function call, just define our own syscall wrappers for
this architecture. Lifted from liburing.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
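
To illustrate what "our own syscall wrappers" can look like, here is a rough sketch of a zero-argument raw system call on x86-64 Linux. The helper names raw_syscall0() and raw_getpid() are hypothetical, this is not the code lifted from liburing, and other platforms simply fall back to the libc syscall() wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* for the syscall() declaration on glibc */
#endif
#include <sys/syscall.h>   /* SYS_getpid */
#include <unistd.h>        /* syscall(), used only as a fallback */

/*
 * Hypothetical helper. On x86-64 Linux the system call number is passed
 * in rax, the result comes back in rax, and the 'syscall' instruction
 * clobbers rcx and r11.
 */
static inline long raw_syscall0(long nr)
{
#if defined(__x86_64__) && defined(__linux__)
	long ret;

	__asm__ volatile("syscall"
			 : "=a"(ret)
			 : "a"(nr)
			 : "rcx", "r11", "memory");
	return ret;
#else
	/* Assumption: elsewhere, just use the libc wrapper. */
	return syscall(nr);
#endif
}

/* Example use: getpid without going through the libc getpid() symbol. */
static inline long raw_getpid(void)
{
	return raw_syscall0(SYS_getpid);
}
```

Note that a raw system call returns a negative errno value on failure instead of setting errno, so callers of such a wrapper have to handle errors differently than with libc.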
2 years agoaarch64: add system call definitions
Jens Axboe [Mon, 21 Feb 2022 16:41:53 +0000 (09:41 -0700)]
aarch64: add system call definitions

Avoid a libc function call, just define our own syscall wrappers for
this architecture. Lifted from liburing.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoMerge branch 'genfio-tempfile' of https://github.com/scop/fio
Jens Axboe [Sun, 20 Feb 2022 19:39:11 +0000 (12:39 -0700)]
Merge branch 'genfio-tempfile' of https://github.com/scop/fio

* 'genfio-tempfile' of https://github.com/scop/fio:
  genfio: fix temporary file handling

2 years agoMerge branch 'spelling' of https://github.com/scop/fio
Jens Axboe [Sun, 20 Feb 2022 19:28:51 +0000 (12:28 -0700)]
Merge branch 'spelling' of https://github.com/scop/fio

* 'spelling' of https://github.com/scop/fio:
  Spelling and grammar fixes

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoMerge branch 'which-command-v-type-P' of https://github.com/scop/fio
Jens Axboe [Sun, 20 Feb 2022 19:26:52 +0000 (12:26 -0700)]
Merge branch 'which-command-v-type-P' of https://github.com/scop/fio

* 'which-command-v-type-P' of https://github.com/scop/fio:
  ci, t, tools: use `command` and `type` instead of `which`

2 years agoSpelling and grammar fixes
Ville Skyttä [Thu, 4 Nov 2021 07:39:32 +0000 (09:39 +0200)]
Spelling and grammar fixes

Signed-off-by: Ville Skyttä <ville.skytta@upcloud.com>
2 years agoci, t, tools: use `command` and `type` instead of `which`
Ville Skyttä [Thu, 4 Nov 2021 07:30:28 +0000 (09:30 +0200)]
ci, t, tools: use `command` and `type` instead of `which`

`which` is not POSIX, and cannot be assumed to be installed everywhere.

`command -v` is available in POSIX and its predecessors at least since
1994: https://pubs.opengroup.org/onlinepubs/7908799/
It can be used as a replacement for `which` in a number of occurrences
in fio.

For bash scripts, `type -P` is available as a builtin replacement for
`which` and its $PATH search semantics.

Signed-off-by: Ville Skyttä <ville.skytta@upcloud.com>
2 years agogenfio: fix temporary file handling
Ville Skyttä [Tue, 2 Nov 2021 21:41:00 +0000 (23:41 +0200)]
genfio: fix temporary file handling

As a side effect, the template temp file is no longer left behind, as
a unique filename is used for it on each run.

Use the same method of figuring out the temp dir as in
check_status_file().

Signed-off-by: Ville Skyttä <ville.skytta@upcloud.com>
2 years agoMerge branch 'rpma-update-RPMA-engines-with-new-librpma-completions-API' of https...
Jens Axboe [Fri, 18 Feb 2022 16:02:03 +0000 (09:02 -0700)]
Merge branch 'rpma-update-RPMA-engines-with-new-librpma-completions-API' of https://github.com/ldorau/fio

* 'rpma-update-RPMA-engines-with-new-librpma-completions-API' of https://github.com/ldorau/fio:
  rpma: update RPMA engines with new librpma completions API
  rpma: RPMA engines require librpma>=v0.11.0 with rpma_cq_get_wc()

2 years agorpma: update RPMA engines with new librpma completions API
Oksana Salyk [Fri, 4 Feb 2022 19:00:36 +0000 (14:00 -0500)]
rpma: update RPMA engines with new librpma completions API

The API of librpma has been changed between v0.10.0 and v0.12.0
and fio has to be updated.

Signed-off-by: Oksana Salyk <oksana.salyk@intel.com>
2 years agorpma: RPMA engines require librpma>=v0.11.0 with rpma_cq_get_wc()
Lukasz Dorau [Fri, 18 Feb 2022 13:57:18 +0000 (14:57 +0100)]
rpma: RPMA engines require librpma>=v0.11.0 with rpma_cq_get_wc()

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
2 years agoCorrect F_FULLSYNC -> F_FULLFSYNC
Jens Axboe [Thu, 17 Feb 2022 19:53:59 +0000 (12:53 -0700)]
Correct F_FULLSYNC -> F_FULLFSYNC

Apparently used a mix of the two, inconsistently.

Fixes: a04e0665cb5d ("Use fcntl(..., F_FULLSYNC) if available")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoUse fcntl(..., F_FULLSYNC) if available
Jens Axboe [Thu, 17 Feb 2022 19:08:41 +0000 (12:08 -0700)]
Use fcntl(..., F_FULLSYNC) if available

Some operating systems don't perform a data integrity flush when
fsync() is done, but provide fcntl(fd, F_FULLSYNC) to provide that kind
of guarantee.

To ensure that comparisons between operating systems are fair, use
fcntl() to do a proper sync if available.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
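
The idea can be sketched as a small helper that prefers the fcntl() flush when the platform offers it. The name full_sync() is hypothetical and this is not fio's actual code. The macro is spelled F_FULLFSYNC here, which is the spelling the follow-up correction in this log settles on.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush 'fd' with the strongest guarantee the
 * platform offers. Returns 0 on success, -1 on error (errno is set).
 */
static int full_sync(int fd)
{
#ifdef F_FULLFSYNC
	/* Platforms that define F_FULLFSYNC (for example macOS). */
	return fcntl(fd, F_FULLFSYNC) == -1 ? -1 : 0;
#else
	/* Everywhere else, fsync() is the data integrity flush. */
	return fsync(fd);
#endif
}
```

Choosing the call at compile time keeps the helper free of runtime checks on platforms where only fsync() exists.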
2 years agot/io_uring: align buffers correctly on non-4k page sizes
Jens Axboe [Thu, 17 Feb 2022 17:18:49 +0000 (10:18 -0700)]
t/io_uring: align buffers correctly on non-4k page sizes

Signed-off-by: Jens Axboe <axboe@kernel.dk>
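
The commit carries no description. One plausible reading is that buffer alignment should follow the page size reported at runtime rather than a hard-coded 4096. The sketch below shows that general technique only; alloc_page_aligned() is a hypothetical helper and is not the t/io_uring code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() */
#endif
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate 'size' bytes aligned to the page size
 * reported by the system. Returns NULL on failure; free with free().
 */
static void *alloc_page_aligned(size_t size)
{
	long page_size = sysconf(_SC_PAGESIZE);
	void *buf = NULL;

	if (page_size <= 0)
		return NULL;
	if (posix_memalign(&buf, (size_t)page_size, size) != 0)
		return NULL;
	return buf;
}
```

On a system with 16k or 64k pages the same call then yields 16k or 64k alignment without any code change.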
2 years agot/io_uring: allow non-power-of-2 queue depths
Jens Axboe [Thu, 17 Feb 2022 17:16:19 +0000 (10:16 -0700)]
t/io_uring: allow non-power-of-2 queue depths

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agodiskutil: include limits.h for PATH_MAX
Jens Axboe [Wed, 16 Feb 2022 00:11:06 +0000 (17:11 -0700)]
diskutil: include limits.h for PATH_MAX

On OmniOS, compilation fails because of a missing PATH_MAX definition:

$ gmake
    CC cconv.o
In file included from stat.h:6:0,
                 from thread_options.h:7,
                 from cconv.c:4:
diskutil.h:52:12: error: 'PATH_MAX' undeclared here (not in a function); did you mean 'INT8_MAX'?
  char path[PATH_MAX];
            ^~~~~~~~
            INT8_MAX
gmake: *** [Makefile:505: cconv.o] Error 1

Add limits.h to fix that.

Link: https://github.com/axboe/fio/issues/1344
Signed-off-by: Jens Axboe <axboe@kernel.dk>
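
A header that sizes an array with PATH_MAX has to include limits.h itself instead of relying on another header to pull it in. A minimal sketch follows; struct disk_entry is a hypothetical name, and the #ifndef fallback value is an assumption added for illustration that is not part of the fix described above.

```c
#include <limits.h>   /* PATH_MAX on most POSIX systems */

#ifndef PATH_MAX
#define PATH_MAX 4096 /* assumption: fallback for platforms that leave it undefined */
#endif

/* Hypothetical struct; the array bound is the point of the example. */
struct disk_entry {
	char path[PATH_MAX];
};
```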
2 years agoMerge branch 'check_min_rate_cleanup' of https://github.com/PCPartPicker/fio
Jens Axboe [Tue, 15 Feb 2022 21:15:58 +0000 (14:15 -0700)]
Merge branch 'check_min_rate_cleanup' of https://github.com/PCPartPicker/fio

* 'check_min_rate_cleanup' of https://github.com/PCPartPicker/fio:
  Cleanup __check_min_rate

2 years agoMerge branch 'rand_nr_bugfix' of https://github.com/PCPartPicker/fio
Jens Axboe [Tue, 15 Feb 2022 20:54:21 +0000 (13:54 -0700)]
Merge branch 'rand_nr_bugfix' of https://github.com/PCPartPicker/fio

* 'rand_nr_bugfix' of https://github.com/PCPartPicker/fio:
  Fix :<nr> suffix with random read/write causing 0 initial offset

2 years agoFix :<nr> suffix with random read/write causing 0 initial offset
aggieNick02 [Tue, 15 Feb 2022 17:59:34 +0000 (11:59 -0600)]
Fix :<nr> suffix with random read/write causing 0 initial offset

When using the :<nr> suffix with random reads or writes, the initial
offset would be set to 0 for the first nr-1 operations. This happened
because td->ddir_seq_nr was initialized to the specified option value,
when it needs to always be initialized to 1, so that the first call to
get_next_offset leads to choosing a new random offset for the first nr
operations.

Signed-off-by: Nick Neumann <nick@pcpartpicker.com>
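
The effect of the initial counter value can be reproduced with a small, simplified model. Everything below is hypothetical: struct seq_state, seq_init() and seq_next_offset() are illustration names, the "random" offsets are a deterministic stand-in, and fio's real get_next_offset is more involved. The model assumes nr >= 1.

```c
#include <stdint.h>

/* Simplified, hypothetical model of the counter described above. */
struct seq_state {
	unsigned int opt_nr;    /* the <nr> from the :<nr> suffix */
	unsigned int counter;   /* stands in for td->ddir_seq_nr */
	uint64_t last_offset;   /* offset reused until the counter expires */
	uint64_t next_random;   /* deterministic stand-in for the RNG */
};

static void seq_init(struct seq_state *s, unsigned int nr, int fixed)
{
	s->opt_nr = nr;
	/* Fixed behaviour starts at 1 so the first call picks a new offset. */
	s->counter = fixed ? 1 : nr;
	s->last_offset = 0;
	s->next_random = 4096;
}

static uint64_t seq_next_offset(struct seq_state *s)
{
	if (--s->counter == 0) {
		/* Counter expired: choose a new offset and rearm the counter. */
		s->last_offset = s->next_random;
		s->next_random += 4096;
		s->counter = s->opt_nr;
	}
	return s->last_offset;
}
```

With nr = 4 and the old initialization, the first three calls return offset 0 before a new offset is chosen. With the counter starting at 1, the first call already selects a new offset, which is then reused for the first four operations.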
2 years agoMerge branch 'fix_bytesrate_eta' of https://github.com/PCPartPicker/fio
Jens Axboe [Tue, 15 Feb 2022 19:22:31 +0000 (12:22 -0700)]
Merge branch 'fix_bytesrate_eta' of https://github.com/PCPartPicker/fio

* 'fix_bytesrate_eta' of https://github.com/PCPartPicker/fio:
  Fix ETA display when rate and/or rate_min are specified

2 years agoFix ETA display when rate and/or rate_min are specified
aggieNick02 [Mon, 14 Feb 2022 21:13:50 +0000 (15:13 -0600)]
Fix ETA display when rate and/or rate_min are specified

The ETA display code passed the wrong base to num2str (0 instead of
1). Additionally, je->sig_figs was never set and
defaulted to 0. Both of these caused the desired range in the ETA code
to always display 0-0 when rate or rate_min was specified.

Signed-off-by: Nick Neumann <nick@pcpartpicker.com>
2 years agoci: detect Windows installer build failures
Vincent Fu [Tue, 15 Feb 2022 13:30:30 +0000 (13:30 +0000)]
ci: detect Windows installer build failures

When the Windows installer build fails, the build script actually
continues running and does not detect the failure. Use ls to determine
if the MSI file exists in order to detect whether or not the installer
build succeeded.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20220215133027.931-1-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoCleanup __check_min_rate
aggieNick02 [Mon, 14 Feb 2022 03:42:27 +0000 (21:42 -0600)]
Cleanup __check_min_rate

This is a cleanup of __check_min_rate. In looking at stuff for previous fixes,
it seems like there are a lot of boolean checks of things that are always true
or always false. I'll explain my reasoning for each change; it is possible I'm
missing something somehow but I've run through it a few times.

Here's my logic:

1) td->rate_bytes and td->rate_blocks are 0 on first call to __check_min_rate,
and then are the previous iteration's value of td->this_io_bytes and
td->this_io_blocks on subsequent calls

2) bytes and iops are the current iteration's values of td->this_io_bytes and
td->this_io_blocks

3) The values of td->this_io_bytes and td->this_io_blocks are monotonic with
respect to each call of __check_min_rate

Therefore, bytes and iops are always greater than or equal to td->rate_bytes
and td->rate_blocks. This means the "if (bytes < td->rate_bytes[ddir]) {" on
line 176 can never happen.

Now, I want to say the same thing about line 197, but that line is weird/wrong
in another way. rate_iops is td->o.rate_iops, the specified desired iops rate
from the job. So I believe that is a bug - the specified desired iops rate
should not even be examined in this function, just like the same is true for
the desired bytes rate. I'm pretty sure what is meant is to compare iops to
td->rate_blocks just like bytes is compared to td->rate_bytes in line 176,
which would similarly always be false.

Now we can focus on the else clauses (lines 180-192 and lines 202-213). If
spent is 0, we should just be returning false early like in 169-170, so let's
move that case up with it. The "if (rate < ratemin || bytes <
td->rate_bytes[ddir]) {" and "if (rate < rate_iops_min || iops <
td->rate_blocks[ddir]) {" both have impossibilities as the second part of the
or clause. All we really want is to compare computed bytes rate to ratemin, and
computed iops rate to rate_iops_min.

With all of that, this function becomes a lot simpler. The rest of the cleanup
is renaming of variables to make what they are clearer, and some other simple
things (like initializing the variables directly instead of initializing to
zero and then doing +=). The renames are as follows:

- td->lastrate to td->last_rate_check_time, the last time a min rate check was
performed

- bytes to current_rate_check_bytes, the number of bytes transferred so far at
the time this call to __check_min_rate was made

- iops to current_rate_check_blocks, the number of blocks transferred so far at
the time this call to __check_min_rate was made

- rate to current_rate_bytes or current_rate_iops, depending on if it is used
as the current cycle's byte rate or block rate

- ratemin to option_rate_bytes_min, the user supplied desired minimum bytes
rate

- rate_iops eliminated - should not be used in this function

- rate_iops_min to option_rate_iops_min, the user supplied desired minimum
block rate

- td->rate_bytes to td->last_rate_check_bytes - the number of bytes transferred
the *last* time a minimum rate check was called *and* passed (not
shortcircuited because not enough time had elapsed for the cycle or settling)

- td->rate_blocks to td->last_rate_check_blocks - the number of blocks
transferred the *last* time a minimum rate check was called *and* passed (not
shortcircuited because not enough time had elapsed for the cycle or settling)

Signed-off-by: Nick Neumann <nick@pcpartpicker.com>
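
The simplified check that this reasoning leads to can be sketched for the bytes case as follows. This is a hedged model, not the fio function: struct rate_check and min_rate_violated() are hypothetical, the field names only borrow from the renames listed above, and the elapsed time is assumed to be given in milliseconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified state; names follow the renames listed above. */
struct rate_check {
	uint64_t last_rate_check_bytes;   /* bytes at the last passing check */
	uint64_t option_rate_bytes_min;   /* user-supplied minimum, bytes/sec */
};

/*
 * Returns true when the measured byte rate is below the minimum.
 * Relies on current_rate_check_bytes never decreasing between calls,
 * so the subtraction below cannot underflow.
 */
static bool min_rate_violated(struct rate_check *rc,
			      uint64_t current_rate_check_bytes,
			      uint64_t spent_msec)
{
	uint64_t current_rate_bytes;

	if (spent_msec == 0)
		return false;   /* no time elapsed, nothing to measure */

	current_rate_bytes = (current_rate_check_bytes -
			      rc->last_rate_check_bytes) * 1000 / spent_msec;
	if (current_rate_bytes < rc->option_rate_bytes_min)
		return true;

	/* Check passed: remember the byte count for the next cycle. */
	rc->last_rate_check_bytes = current_rate_check_bytes;
	return false;
}
```

The only comparison left is the computed rate against the configured minimum, which is the point of the cleanup; the iops case would follow the same shape with block counts.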
2 years agoMerge branch 'fio-docs-ci' of https://github.com/vincentkfu/fio
Jens Axboe [Fri, 11 Feb 2022 23:29:44 +0000 (16:29 -0700)]
Merge branch 'fio-docs-ci' of https://github.com/vincentkfu/fio

* 'fio-docs-ci' of https://github.com/vincentkfu/fio:
  windows: update the installer build for renamed files
  ci: install sphinx packages and add doc building to GitHub Actions
  HOWTO: combine two chunk_size listings into a single one
  HOWTO: combine separate hipri listings into a single one
  HOWTO: combine multiple pool option listings
  docs: rename HOWTO to HOWTO.rst
  docs: update Makefile in order to detect build failures
  docs: document cpumode option for the cpuio ioengine

2 years agowindows: update the installer build for renamed files
Vincent Fu [Fri, 11 Feb 2022 21:55:41 +0000 (16:55 -0500)]
windows: update the installer build for renamed files

Update the MSI build instructions to point to the new README.rst and
HOWTO.rst

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agoMerge branch 'fio_offload_fixes' of https://github.com/PCPartPicker/fio
Jens Axboe [Fri, 11 Feb 2022 21:25:33 +0000 (14:25 -0700)]
Merge branch 'fio_offload_fixes' of https://github.com/PCPartPicker/fio

* 'fio_offload_fixes' of https://github.com/PCPartPicker/fio:
  Fix issues (assert or uninit var, hang) with check_min_rate and offloading

2 years agoci: install sphinx packages and add doc building to GitHub Actions
Vincent Fu [Fri, 4 Feb 2022 21:19:04 +0000 (16:19 -0500)]
ci: install sphinx packages and add doc building to GitHub Actions

To better detect breakage in our documentation builds let's add them to
our GitHub Actions CI.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agoFix issues (assert or uninit var, hang) with check_min_rate and offloading
aggieNick02 [Fri, 11 Feb 2022 20:46:12 +0000 (14:46 -0600)]
Fix issues (assert or uninit var, hang) with check_min_rate and offloading

Using rate_min/rate_iops_min when io_submit_mode=offload option is set
leads to intermittent asserts and doesn't work. The variable comp_time
is never set in do_io in backend.c in the offload case, and comp_time is
then used in the calls to check_min_rate. The time computations in
check_min_rate either assert and terminate fio, or return meaningless
values, so any rate checking is not correct.

This first issue is fixed by adding a call to fio_gettime in the
offloading case. Once that is done though, there is still another
problem remaining. When the min rate is not achieved (with the
offloading option), fio detects it and tries to exit but fails. It ends
up in a state where ctrl-C will not cause an exit either. This happens
because cleanup_pending_aio(td) in the error case hangs in its second
call to io_u_queued_complete. Calling workqueue_flush in the error case
(when offloading) fixes the problem by making sure nothing is left in
the work queues and cleanup can proceed as it does in the non-offload
case.

Signed-off-by: Nick Neumann <nick@pcpartpicker.com>
2 years agoAdd aarch64 cpu clock support
Jens Axboe [Fri, 11 Feb 2022 13:58:12 +0000 (06:58 -0700)]
Add aarch64 cpu clock support

We can use cntvct_el0 to read the CPU clock.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
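
Reading that register takes a single instruction. The sketch below shows the general shape; read_cpu_clock() is a hypothetical name, and the clock_gettime() fallback for other architectures is an assumption added so the example builds everywhere, not something the commit describes.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for clock_gettime() in the fallback */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper returning a monotonically increasing tick count. */
static inline uint64_t read_cpu_clock(void)
{
#if defined(__aarch64__)
	uint64_t val;

	/* cntvct_el0 is the virtual counter register. */
	__asm__ volatile("mrs %0, cntvct_el0" : "=r"(val));
	return val;
#else
	/* Assumption: elsewhere, use a monotonic clock in nanoseconds. */
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}
```

The counter ticks at a fixed frequency rather than at the core clock rate, so converting ticks to time requires knowing that frequency.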
2 years agot/io_uring: avoid unused `nr_batch` warning
Jens Axboe [Fri, 11 Feb 2022 13:42:13 +0000 (06:42 -0700)]
t/io_uring: avoid unused `nr_batch` warning

If we have libaio support, but not an appropriate CPU clock, then the
build throws a warning on nr_batch being assigned but never used.

Mirror what was done on the io_uring init path and only define and
set `nr_batch` if we have CPU clock support.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agofio: really use LDFLAGS when linking dynamic engines
Eric Sandeen [Tue, 8 Feb 2022 16:00:39 +0000 (10:00 -0600)]
fio: really use LDFLAGS when linking dynamic engines

Fix stupid braino on my part.

Fixes: 2b3d4a6a924e ("fio: use LDFLAGS when linking dynamic engines")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Link: https://lore.kernel.org/r/1644336039-12774-1-git-send-email-sandeen@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoHOWTO: combine two chunk_size listings into a single one
Vincent Fu [Fri, 4 Feb 2022 21:05:37 +0000 (16:05 -0500)]
HOWTO: combine two chunk_size listings into a single one

This resolves the documentation build warning about multiple listings
for the chunk_size option.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agoHOWTO: combine separate hipri listings into a single one
Vincent Fu [Fri, 4 Feb 2022 20:59:37 +0000 (15:59 -0500)]
HOWTO: combine separate hipri listings into a single one

Resolve doc build warnings about multiple appearances of the hipri
option.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agoHOWTO: combine multiple pool option listings
Vincent Fu [Fri, 4 Feb 2022 20:34:10 +0000 (15:34 -0500)]
HOWTO: combine multiple pool option listings

Listing the pool option in multiple places makes it impossible to link
to it in the documentation. Combine the two pool option listings into
one to resolve the doc build warning.

Also clean up a few small formatting issues.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agodocs: rename HOWTO to HOWTO.rst
Vincent Fu [Fri, 4 Feb 2022 20:08:24 +0000 (15:08 -0500)]
docs: rename HOWTO to HOWTO.rst

Since the HOWTO uses the rst format, we should identify it that way. It
will display nicely on github.com.

Also update the documentation to refer to the new HOWTO.rst file.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agodocs: update Makefile in order to detect build failures
Vincent Fu [Fri, 28 Jan 2022 18:50:11 +0000 (18:50 +0000)]
docs: update Makefile in order to detect build failures

With the -W option sphinx-docs will yield a non-zero return code when it
encounters warnings.

With the --keep-going option sphinx-docs will continue running to the
end of the build even if it encounters errors.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agodocs: document cpumode option for the cpuio ioengine
Vincent Fu [Fri, 28 Jan 2022 18:45:29 +0000 (18:45 +0000)]
docs: document cpumode option for the cpuio ioengine

The cpumode option for the cpuio ioengine never had its own entry in the
documentation. Add an entry so that the documentation builds cleanly.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agot/io_uring: fix warnings for !ARCH_HAVE_CPU_CLOCK
Jens Axboe [Fri, 4 Feb 2022 16:02:49 +0000 (09:02 -0700)]
t/io_uring: fix warnings for !ARCH_HAVE_CPU_CLOCK

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: make free_clat_prio_stats() safe against NULL
Niklas Cassel [Fri, 4 Feb 2022 00:17:49 +0000 (00:17 +0000)]
stat: make free_clat_prio_stats() safe against NULL

The sfree() in free_clat_prio_stats() itself handles NULL, so the function
already handles a struct thread_stat without any per priority stats.
(Per priority stats are disabled on threads/thread_stats that we know will
never be able to contain more than a single priority.)

However, if malloc() in e.g. gen_mixed_ddir_stats_from_ts() or
__show_run_stats() failed to allocate memory, free_clat_prio_stats() will
be supplied a NULL pointer.

Fix free_clat_prio_stats() to handle a NULL pointer gracefully.

Fixes: 4ad856497c0b ("stat: add a new function to allocate a clat_prio_stat array")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20220204001741.34419-1-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
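
The pattern is an early return on a NULL argument before any member is touched. A minimal sketch under stated assumptions: struct stat_holder and free_prio_stats() are hypothetical stand-ins for fio's types and function, and plain free() is used in place of sfree().

```c
#include <stdlib.h>

/* Hypothetical stand-ins for fio's types; only the shape matters here. */
struct prio_stat {
	int dummy;
};

struct stat_holder {
	struct prio_stat *clat_prio[2];   /* one array per data direction */
	int nr_clat_prio[2];
};

static void free_prio_stats(struct stat_holder *ts)
{
	int i;

	/* Guard first: the caller's own allocation may have failed. */
	if (!ts)
		return;

	for (i = 0; i < 2; i++) {
		free(ts->clat_prio[i]);   /* free(NULL) is a no-op */
		ts->clat_prio[i] = NULL;
		ts->nr_clat_prio[i] = 0;
	}
}
```

Putting the check inside the function means error paths in callers can invoke it unconditionally.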
2 years agofio: use correct function declaration for set_epoch_time()
Jens Axboe [Thu, 3 Feb 2022 23:05:02 +0000 (16:05 -0700)]
fio: use correct function declaration for set_epoch_time()

Fixes: d5b3cfd4064d ("Support for alternate epochs in fio log files")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoMerge branch 'fio_pr_alternate_epoch' of https://github.com/PCPartPicker/fio
Jens Axboe [Thu, 3 Feb 2022 22:34:40 +0000 (15:34 -0700)]
Merge branch 'fio_pr_alternate_epoch' of https://github.com/PCPartPicker/fio

* 'fio_pr_alternate_epoch' of https://github.com/PCPartPicker/fio:
  Support for alternate epochs in fio log files

2 years agoMerge branch 'cifuzz-integration' of https://github.com/DavidKorczynski/fio
Jens Axboe [Thu, 3 Feb 2022 22:33:47 +0000 (15:33 -0700)]
Merge branch 'cifuzz-integration' of https://github.com/DavidKorczynski/fio

* 'cifuzz-integration' of https://github.com/DavidKorczynski/fio:
  ci/Github actions: add CIFuzz integration

2 years agoMerge branch 'freebsd-comment-update' of https://github.com/macdice/fio
Jens Axboe [Thu, 3 Feb 2022 22:33:21 +0000 (15:33 -0700)]
Merge branch 'freebsd-comment-update' of https://github.com/macdice/fio

* 'freebsd-comment-update' of https://github.com/macdice/fio:
  Update comments about availability of fdatasync().

2 years agot/latency_percentiles.py: add tests for the new cmdprio_bssplit format
Niklas Cassel [Thu, 3 Feb 2022 19:28:32 +0000 (19:28 +0000)]
t/latency_percentiles.py: add tests for the new cmdprio_bssplit format

Add two new test cases for the new cmdprio_bssplit format.

While at it, fixup some small typos in the existing code.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-19-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: remove unused high/low prio struct members
Niklas Cassel [Thu, 3 Feb 2022 19:28:32 +0000 (19:28 +0000)]
stat: remove unused high/low prio struct members

Now that all users have moved to the new clat_prio_stat arrays,
remove io_u_plat_high_prio, io_u_plat_low_prio, clat_high_prio_stat,
and clat_low_prio_stat, as they are no longer used.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-18-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agogfio: drop support for high/low priority latency results
Niklas Cassel [Thu, 3 Feb 2022 19:28:31 +0000 (19:28 +0000)]
gfio: drop support for high/low priority latency results

High/low priority latencies have been replaced by a per prio array.
This allows us to have latency results for more than just two priorities.

Unfortunately this currently means that we have to drop the support for
visualizing the high/low priority latencies.

If someone wants to know the per prio latency results, both the regular
output and the json output contain this information.

The GUI could be extended to support the new per priority format at a
later time, if anyone has a huge need for this feature.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-17-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: convert json output to a new per priority granularity format
Niklas Cassel [Thu, 3 Feb 2022 19:28:31 +0000 (19:28 +0000)]
stat: convert json output to a new per priority granularity format

The JSON output will no longer contain high_prio/low_prio entries, but will
instead include a new list "prios", which will include an object per
prioclass/priolevel combination. Each of these objects will either have a
"clat_ns" object or a "lat_ns" object, depending on which latency type was
being tracked.

This JSON structure should make it easy if the per priority stats were ever
extended to be able to track multiple latency types at the same time, as
each prioclass/priolevel object will then simply contain (e.g.) both a
"clat_ns" and a "lat_ns" object.

Convert the JSON output to this new per priority granularity format,
and convert the tests to work with the new JSON output.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-16-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: report clat stats on a per priority granularity
Niklas Cassel [Thu, 3 Feb 2022 19:28:30 +0000 (19:28 +0000)]
stat: report clat stats on a per priority granularity

Convert the stat code to report clat stats on a per priority granularity,
rather than simply supporting high/low priority.

This is made possible by using the new clat_prio_stat array (per ddir),
together with the clat_prio_stat index which is saved in each io_u.

The per priority samples are only printed when there are samples for more
than one priority in the clat_prio_stat array. If there are only samples
for one priority, that means that all I/Os were submitted using the same
priority, so no need to print.

For example, running the following fio command:
fio --name=test --filename=/dev/sdc --direct=1 --runtime=60 --rw=randread \
    --ioengine=io_uring --ioscheduler=mq-deadline --iodepth=32 --bs=32k \
    --prioclass=2 --prio=7 --cmdprio_bssplit=32k/20/3/0:32k/10/1/4

Now results in the following output:
test: (groupid=0, jobs=1): err= 0: pid=465655: Tue Feb  1 02:24:47 2022
  read: IOPS=146, BW=4695KiB/s (4808kB/s)(276MiB/60239msec)
    slat (usec): min=18, max=335, avg=62.87, stdev=22.59
    clat (msec): min=2, max=2135, avg=217.97, stdev=287.26
     lat (msec): min=2, max=2135, avg=218.03, stdev=287.26
    clat prio 2/7 (msec): min=3, max=606, avg=106.57, stdev=86.64
    clat prio 3/0 (msec): min=10, max=2135, avg=664.94, stdev=339.42
    clat prio 1/4 (msec): min=2, max=300, avg=52.29, stdev=42.52
    clat percentiles (msec):
     |  1.00th=[    8],  5.00th=[   14], 10.00th=[   19], 20.00th=[   33],
     | 30.00th=[   52], 40.00th=[   77], 50.00th=[  108], 60.00th=[  144],
     | 70.00th=[  192], 80.00th=[  300], 90.00th=[  684], 95.00th=[  911],
     | 99.00th=[ 1234], 99.50th=[ 1318], 99.90th=[ 1687], 99.95th=[ 1770],
     | 99.99th=[ 2140]
    clat prio 2/7 (69.25% of IOs) percentiles (msec):
     |  1.00th=[    7],  5.00th=[   13], 10.00th=[   17], 20.00th=[   28],
     | 30.00th=[   44], 40.00th=[   64], 50.00th=[   85], 60.00th=[  111],
     | 70.00th=[  140], 80.00th=[  174], 90.00th=[  226], 95.00th=[  279],
     | 99.00th=[  368], 99.50th=[  418], 99.90th=[  502], 99.95th=[  567],
     | 99.99th=[  609]
    clat prio 3/0 (20.91% of IOs) percentiles (msec):
     |  1.00th=[   44],  5.00th=[  138], 10.00th=[  205], 20.00th=[  347],
     | 30.00th=[  464], 40.00th=[  558], 50.00th=[  659], 60.00th=[  760],
     | 70.00th=[  860], 80.00th=[  961], 90.00th=[ 1099], 95.00th=[ 1217],
     | 99.00th=[ 1485], 99.50th=[ 1687], 99.90th=[ 1871], 99.95th=[ 2140],
     | 99.99th=[ 2140]
    clat prio 1/4 (9.84% of IOs) percentiles (msec):
     |  1.00th=[    7],  5.00th=[   10], 10.00th=[   13], 20.00th=[   18],
     | 30.00th=[   24], 40.00th=[   30], 50.00th=[   39], 60.00th=[   51],
     | 70.00th=[   63], 80.00th=[   84], 90.00th=[  114], 95.00th=[  136],
     | 99.00th=[  188], 99.50th=[  197], 99.90th=[  300], 99.95th=[  300],
     | 99.99th=[  300]
   bw (  KiB/s): min= 3456, max= 5888, per=100.00%, avg=4697.60, stdev=472.38, samples=120
   iops        : min=  108, max=  184, avg=146.80, stdev=14.76, samples=120
  lat (msec)   : 4=0.11%, 10=2.57%, 20=8.67%, 50=18.21%, 100=18.34%
  lat (msec)   : 250=28.87%, 500=9.41%, 750=5.22%, 1000=5.09%, 2000=3.50%
  lat (msec)   : >=2000=0.01%
  cpu          : usr=0.16%, sys=0.97%, ctx=17715, majf=0, minf=262
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=99.6%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=8839,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-15-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: disable per prio stats where not needed
Niklas Cassel [Thu, 3 Feb 2022 19:28:30 +0000 (19:28 +0000)]
stat: disable per prio stats where not needed

In order to avoid allocating a clat_prio_stat array for threadstats that we
know will never be able to contain more than a single priority, introduce a
new member disable_prio_stat in struct thread_stat.

The naming prefix is disable, since we want the default value to be 0
(enabled). This is because in the default case, we do want sum_thread_stats()
to generate a per prio stat array. Only in the case where we know that we
don't want per priority stats to be generated, should this member be set
to 1.

Server version is intentionally not incremented, as it will be incremented
in a later patch in the series. No need to bump it multiple times for the
same patch series.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-14-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: add helper for resetting the latency buckets
Niklas Cassel [Thu, 3 Feb 2022 19:28:29 +0000 (19:28 +0000)]
stat: add helper for resetting the latency buckets

Add a helper for resetting the latency buckets, and call it where
appropriate.

This makes the code easier to read, and puts the reset of the DDIR_SYNC
latency buckets together with the other statements for DDIR_SYNC.

A follow up patch will also make use of this new helper function.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-13-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: increment members counter after call to sum_thread_stats()
Niklas Cassel [Thu, 3 Feb 2022 19:28:29 +0000 (19:28 +0000)]
stat: increment members counter after call to sum_thread_stats()

Increment ts->members after the call to sum_thread_stats(), just like how
it's done in client.c and gclient.c.

There is no reason why stat.c should increment ts->members before calling
sum_thread_stats(). Change stat.c so that it is consistent with client.c
and gclient.c. This way, sum_thread_stats() could actually make use of
ts->members (if it wanted to), since it is now being updated consistently.

No logical change, as currently, ts->members is only used in
show_thread_status_normal(), which is always called after the call to
sum_thread_stats().

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-12-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: use enum fio_ddir consistently
Niklas Cassel [Thu, 3 Feb 2022 19:28:28 +0000 (19:28 +0000)]
stat: use enum fio_ddir consistently

Most functions in stat.c use enum fio_ddir dir both as a parameter
and as a local variable.

int ddir is used in very few places.

Convert the int ddir uses to enum fio_ddir dir so that the code is
consistent.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-11-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoexamples: add new cmdprio_bssplit format examples
Niklas Cassel [Thu, 3 Feb 2022 19:28:28 +0000 (19:28 +0000)]
examples: add new cmdprio_bssplit format examples

Add examples of the new cmdprio_bssplit format to cmdprio-bssplit.fio.

In this new format, a priority class and a priority level can be specified
in the cmdprio_bssplit entry itself.

Add the new cmdprio_bssplit format examples as new jobs, as the old format
is still supported.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-10-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agocmdprio: add support for a new cmdprio_bssplit entry format
Niklas Cassel [Thu, 3 Feb 2022 19:28:27 +0000 (19:28 +0000)]
cmdprio: add support for a new cmdprio_bssplit entry format

Add support for a new cmdprio_bssplit format, while keeping support for the
old format, by migrating to the split_parse_prio_ddir() parsing function.

In this new format, a priority class and priority level are defined inside
each entry itself. In comparison with the old format, the new format does
not restrict all entries to share the same priority class and priority
level.

Therefore, this new format is very useful if you need to submit I/Os with
multiple IO priority class + IO priority level combinations, e.g. when
testing or verifying an IO scheduler.

cmdprio will allocate a clat_prio_stat array that holds all unique
priorities (including the default priority). Finally, it will set the
clat_prio pointer in the struct thread_stat (td->ts.clat_prio) to the
newly allocated array.

We also add a clat_prio_stat index to io_u.h, that will inform which array
element (which priority value) this specific I/O was submitted with.
The clat_prio_stat index will be used by the stat.c code, to avoid a costly
search operation to find the correct array element to use, for each and
every add_sample().
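
The idea of resolving the array index once and carrying it with the I/O can be sketched as follows. This is an illustration under assumptions: toy_prio_table, toy_prio_index() and toy_add_sample() are hypothetical names, and the fixed-size table is a simplification of the allocated array described above.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PRIOS 16

/* Hypothetical per-priority completion latency slot. */
struct toy_clat_prio_stat {
	uint32_t ioprio;     /* priority value this slot accounts for */
	uint64_t nr_samples; /* number of samples recorded */
	uint64_t clat_sum;   /* sum of completion latencies, in nsec */
};

struct toy_prio_table {
	struct toy_clat_prio_stat slots[TOY_MAX_PRIOS];
	size_t nr;
};

/*
 * Setup time: return the index of ioprio in the table, adding a new slot
 * if the value has not been seen yet. Returns -1 if the table is full.
 */
static int toy_prio_index(struct toy_prio_table *t, uint32_t ioprio)
{
	size_t i;

	for (i = 0; i < t->nr; i++)
		if (t->slots[i].ioprio == ioprio)
			return (int)i;
	if (t->nr == TOY_MAX_PRIOS)
		return -1;
	t->slots[t->nr].ioprio = ioprio;
	t->slots[t->nr].nr_samples = 0;
	t->slots[t->nr].clat_sum = 0;
	return (int)t->nr++;
}

/* Hot path: the I/O carries its index, so no search is needed here. */
static void toy_add_sample(struct toy_prio_table *t, int idx, uint64_t nsec)
{
	t->slots[idx].nr_samples++;
	t->slots[idx].clat_sum += nsec;
}
```

The linear search runs only while the set of unique priorities is built; each sample afterwards costs a single array access.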

Note that while this patch will send down the correct I/O pattern to the
drive (potentially using multiple different priorities), it will not
display the cmdprio_{bssplit,percentage} stats correctly until a later
commit in the series (which changes stat.c to report clat stats on a per
priority granularity). This was done to ease reviewing.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-9-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agooptions: add a parsing function for an additional cmdprio_bssplit format
Niklas Cassel [Thu, 3 Feb 2022 19:28:27 +0000 (19:28 +0000)]
options: add a parsing function for an additional cmdprio_bssplit format

The cmdprio_bssplit ioengine option for io_uring/libaio is currently parsed
using split_parse_ddir(). While this function works fine for parsing the
existing cmdprio_bssplit entry format, it forces every cmdprio_bssplit
entry to use the priority defined by cmdprio and cmdprio_class. This means
that there will only ever be at most two different priority values used in
the job.

To enable us to use more than two different priority values, add a new
parsing function, split_parse_prio_ddir(), that will support parsing the
existing cmdprio_bssplit entry format (blocksize/percentage), and a new
cmdprio_bssplit entry format (blocksize/percentage/prioclass/priolevel).
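
A rough sketch of accepting both entry formats with one parser is shown below. It is not the code of split_parse_prio_ddir(): toy_parse_entry() and struct toy_split_prio are hypothetical, the block size is assumed to be a plain byte count without suffixes, and no range check is applied to the class or level.

```c
#include <stdio.h>

/* Hypothetical parsed form of one cmdprio_bssplit entry. */
struct toy_split_prio {
	unsigned long long bs;   /* block size in bytes */
	unsigned int perc;       /* percentage of I/Os */
	int has_prio;            /* 1 if class/level were given in the entry */
	unsigned int prio_class;
	unsigned int prio_level;
};

/*
 * Parse "bs/perc" (existing format) or "bs/perc/class/level" (new format).
 * Returns 0 on success and -1 on a malformed entry; *out is only written
 * on success.
 */
static int toy_parse_entry(const char *str, struct toy_split_prio *out)
{
	struct toy_split_prio e = { 0 };
	int end = 0;

	/* Try the four-field format first; %n records how much was consumed. */
	if (sscanf(str, "%llu/%u/%u/%u%n", &e.bs, &e.perc,
		   &e.prio_class, &e.prio_level, &end) == 4 &&
	    str[end] == '\0') {
		e.has_prio = 1;
	} else {
		/* Fall back to the two-field format. */
		end = 0;
		e.prio_class = 0;
		e.prio_level = 0;
		if (sscanf(str, "%llu/%u%n", &e.bs, &e.perc, &end) != 2 ||
		    str[end] != '\0')
			return -1;
	}
	if (e.perc > 100)
		return -1;
	*out = e;
	return 0;
}
```

An entry with exactly three fields matches neither format and is rejected, so a partially specified priority is never silently accepted.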

Since IO engines can be compiled as plugins, having the parse function in
options.c avoids potential problems with ioengines having different
versions of the same parsing function.

A follow up patch will change to the new parsing function.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-8-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoos: define min/max prio class and level for systems without ioprio
Niklas Cassel [Thu, 3 Feb 2022 19:28:26 +0000 (19:28 +0000)]
os: define min/max prio class and level for systems without ioprio

To avoid adding more ifdef FIO_HAVE_IOPRIO_CLASS/FIO_HAVE_IOPRIO checks to
the code, define IOPRIO_{MIN,MAX}_PRIO_CLASS and IOPRIO_{MIN,MAX}_PRIO as
zero for systems without support for ioprio.
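
The fallback pattern can be sketched like this. The guard macros come from the message above; the level macros are assumed to be named IOPRIO_MIN_PRIO and IOPRIO_MAX_PRIO, and the two toy_*_valid() helpers are hypothetical callers added only to show the effect.

```c
/*
 * Fallback definitions for platforms without ioprio support, so callers
 * can use the limits without their own ifdefs. Assumption: on platforms
 * that do define the guard macros, the limits come from the OS header.
 */
#ifndef FIO_HAVE_IOPRIO
#define IOPRIO_MIN_PRIO		0
#define IOPRIO_MAX_PRIO		0
#endif

#ifndef FIO_HAVE_IOPRIO_CLASS
#define IOPRIO_MIN_PRIO_CLASS	0
#define IOPRIO_MAX_PRIO_CLASS	0
#endif

/* Hypothetical range checks that compile with or without ioprio support. */
static int toy_prio_level_valid(int level)
{
	return level >= IOPRIO_MIN_PRIO && level <= IOPRIO_MAX_PRIO;
}

static int toy_prio_class_valid(int prio_class)
{
	return prio_class >= IOPRIO_MIN_PRIO_CLASS &&
	       prio_class <= IOPRIO_MAX_PRIO_CLASS;
}
```

On a platform without ioprio, both ranges collapse to the single value zero, so any non-zero class or level is rejected by the same code path.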

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-7-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: add a new function to allocate a clat_prio_stat array
Niklas Cassel [Thu, 3 Feb 2022 19:28:26 +0000 (19:28 +0000)]
stat: add a new function to allocate a clat_prio_stat array

To be able to report clat stats on a per priority granularity (instead of
only high/low priority), we need a new function which allocates a
clat_prio_stat array. This array will hold the clat stats for all the
different priorities that will be used by the struct td.

The clat_prio_stat array will eventually replace io_u_plat_high_prio,
io_u_plat_low_prio, clat_high_prio_stat, and clat_low_prio_stat.

A follow up patch will convert stat.c to use the new array.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-6-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoclient/server: convert ss_data to use an offset instead of fixed position
Niklas Cassel [Thu, 3 Feb 2022 19:28:25 +0000 (19:28 +0000)]
client/server: convert ss_data to use an offset instead of fixed position

Store the location of the ss_data in the payload itself, rather than
assuming that it is always located at a fixed location, directly after
the cmd_ts_pdu data.
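
As an illustration of locating an optional section through an offset stored in the payload, consider the sketch below. The layout is hypothetical (struct toy_pdu_hdr is not fio's wire format), and byte-order conversion is left out for brevity.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size header placed at the start of the payload. */
struct toy_pdu_hdr {
	uint64_t ss_off; /* byte offset of the optional data, 0 if absent */
	uint64_t ss_len; /* length of the optional data in bytes */
};

/*
 * Return a pointer to the optional data inside buf and store its length
 * in *len. Returns NULL if the data is absent or if the recorded offset
 * or length falls outside the payload.
 */
static const uint8_t *toy_find_ss_data(const uint8_t *buf, size_t buf_len,
				       uint64_t *len)
{
	struct toy_pdu_hdr hdr;

	if (buf_len < sizeof(hdr))
		return NULL;
	memcpy(&hdr, buf, sizeof(hdr));
	if (hdr.ss_off < sizeof(hdr) || hdr.ss_off > buf_len)
		return NULL;
	if (hdr.ss_len > buf_len - hdr.ss_off)
		return NULL;
	*len = hdr.ss_len;
	return buf + hdr.ss_off;
}
```

Because the receiver follows the recorded offset, another optional section can be placed before or after this one without changing how it is found.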

This is done as a cleanup patch in order to be able to handle
clat_prio_stats, which just like ss_data, may or may not be part of the
payload.

Server version is intentionally not incremented, as it will be incremented
in a later patch in the series. No need to bump it multiple times for the
same patch series.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-5-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: save the default ioprio in struct thread_stat
Niklas Cassel [Thu, 3 Feb 2022 19:28:25 +0000 (19:28 +0000)]
stat: save the default ioprio in struct thread_stat

To be able to report clat stats on a per priority granularity (instead of
only high/low priority), we need to be able to get the priority value that
was used for the stats in clat_stat.

When a thread is using a single priority (e.g. when option prio/prioclass
is used without any cmdprio options), all the clat stats for this thread
will be stored in clat_stat. The problem with this is sum_thread_stats()
does not know the priority value that corresponds to the stats stored in
clat_stat.

Since we cannot access td->ioprio from sum_thread_stats(), simply mirror
td->ioprio inside struct thread_stat.

This way, sum_thread_stats() will be able to reuse the global clat stats
in clat_stat, without the need to duplicate the data for per priority
stats, in the case where there is only a single priority in use.

Server version is intentionally not incremented, as it will be incremented
in a later patch in the series. No need to bump it multiple times for the
same patch series.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-4-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agobackend: do ioprio_set() before calling the ioengine init callback
Niklas Cassel [Thu, 3 Feb 2022 19:28:24 +0000 (19:28 +0000)]
backend: do ioprio_set() before calling the ioengine init callback

To be able to report clat stats on a per priority granularity (instead of
only high/low priority), we need to do ioprio_set(), and the matching
td->ioprio assignment, before calling the io engine init callback.

When a thread is using more than a single priority (e.g. option
cmdprio_percentage is used), fio_cmdprio_init() will need to allocate and
initialize an array that will hold the clat stats for all the different
priorities that will be used by the struct td.

For fio_cmdprio_init() to be able to initialize a per priority clat array
properly, we need to assign td->ioprio before calling td_io_init().

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-3-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoinit: verify option lat_percentiles consistency for all jobs in group
Niklas Cassel [Thu, 3 Feb 2022 19:28:24 +0000 (19:28 +0000)]
init: verify option lat_percentiles consistency for all jobs in group

lat_percentiles is used to control if the high/low latency statistics
(which are saved in ts->io_u_plat_high_prio/ts->io_u_plat_low_prio)
should collect and display total latencies instead of completion latencies.

When doing group reporting, stat.c:__show_run_stats() happily overwrites
the dst ts with the setting of each job, which means that the summing can
take total lat samples for some jobs, and clat samples for some jobs, while
adding samples into the same group result.

The output summary will claim that the results are of whatever type the
final job in the group is set to.

To make sure that this cannot happen, verify that the option
lat_percentiles is consistent for all jobs in group.
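
Such a consistency check could look roughly like the sketch below. struct toy_job and toy_lat_percentiles_consistent() are hypothetical; the real check operates on fio's own job structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal view of the options relevant to the check. */
struct toy_job {
	int groupid;
	bool lat_percentiles;
};

/*
 * Return true if every pair of jobs that share a group id agrees on
 * lat_percentiles, false as soon as one mismatch is found.
 */
static bool toy_lat_percentiles_consistent(const struct toy_job *jobs,
					   size_t nr)
{
	size_t i, j;

	for (i = 0; i < nr; i++) {
		for (j = i + 1; j < nr; j++) {
			if (jobs[i].groupid != jobs[j].groupid)
				continue;
			if (jobs[i].lat_percentiles != jobs[j].lat_percentiles)
				return false;
		}
	}
	return true;
}
```

Jobs in different groups may still differ, since their samples are never summed into the same result.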

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-2-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoserver: fix formatting issue
Jens Axboe [Thu, 3 Feb 2022 22:28:16 +0000 (15:28 -0700)]
server: fix formatting issue

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoMerge branch 'master' of https://github.com/blah325/fio
Jens Axboe [Thu, 3 Feb 2022 22:22:41 +0000 (15:22 -0700)]
Merge branch 'master' of https://github.com/blah325/fio

* 'master' of https://github.com/blah325/fio:
  Added a new windows only IO engine option “no_completion_thread”.
  Add Windows support for --server.
  Avoid client calls to recv() without prior poll()

2 years agoAdded a new windows only IO engine option “no_completion_thread”.
james rizzo [Thu, 16 Dec 2021 22:35:45 +0000 (15:35 -0700)]
Added a new windows only IO engine option “no_completion_thread”.

Without this option, Windows FIO creates a
completion polling thread for each worker thread. This also
requires an event queue for the completion thread to forward
completions to the worker thread. Polling directly improves
performance and better matches the linuxaio engine model.

Signed-off-by: james rizzo <james.rizzo@broadcom.com>
2 years agoMerge branch 'patch-1' of https://github.com/Nikratio/fio
Jens Axboe [Fri, 28 Jan 2022 21:50:51 +0000 (14:50 -0700)]
Merge branch 'patch-1' of https://github.com/Nikratio/fio

* 'patch-1' of https://github.com/Nikratio/fio:
  I/O size: fix description of filesize

2 years agoMakefile: build t/fio-dedupe only if zlib support is found
Vincent Fu [Fri, 28 Jan 2022 21:46:37 +0000 (21:46 +0000)]
Makefile: build t/fio-dedupe only if zlib support is found

df284fbdc23974c931865a8ddb7d171606b3c778 added zlib support as a
requirement for building fio-dedupe.

The current patch changes the Makefile to avoid trying to build
fio-dedupe when zlib support is not present.

Link: https://lore.kernel.org/fio/51b79acb-c314-143b-514c-a22ff9462829@gmail.com/T/#u
Reported-by: Professor Pro <annivation@gmail.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20220128214611.165312-1-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoMerge branch 'docs' of https://github.com/vincentkfu/fio
Jens Axboe [Fri, 28 Jan 2022 20:12:13 +0000 (13:12 -0700)]
Merge branch 'docs' of https://github.com/vincentkfu/fio

* 'docs' of https://github.com/vincentkfu/fio:
  docs: update fio docs to pull from README.rst
  docs: rename README to README.rst
  Revert "Update README to markdown format"

2 years agodocs: update fio docs to pull from README.rst
Vincent Fu [Fri, 28 Jan 2022 19:13:23 +0000 (19:13 +0000)]
docs: update fio docs to pull from README.rst

doc/fio_doc.rst and doc/fio_man.rst originally included fio/README.
Change both to include fio/README.rst instead.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agodocs: rename README to README.rst
Vincent Fu [Fri, 28 Jan 2022 19:30:40 +0000 (14:30 -0500)]
docs: rename README to README.rst

GitHub can display reStructuredText. So just add the appropriate
extension to the README.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agoRevert "Update README to markdown format"
Vincent Fu [Fri, 28 Jan 2022 19:29:26 +0000 (14:29 -0500)]
Revert "Update README to markdown format"

This reverts commit 82250ffc96497652b7f6f9b1b707ae1bee4d8f89.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
2 years agofio: use LDFLAGS when linking dynamic engines
Eric Sandeen [Wed, 26 Jan 2022 14:49:45 +0000 (08:49 -0600)]
fio: use LDFLAGS when linking dynamic engines

Without this, locally defined LDFLAGS won't be applied when
linking the dynamically loaded IO engines.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agot/io_uring: link with libaio when necessary
Eric Sandeen [Tue, 25 Jan 2022 18:57:39 +0000 (12:57 -0600)]
t/io_uring: link with libaio when necessary

When CONFIG_LIBAIO is enabled, we need t/io_uring to link with it.
(libaio_LIBS only affects the aio engine, AFAICT.)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoMerge branch 'rpma-add-support-for-File-System-DAX' of https://github.com/ldorau/fio
Jens Axboe [Wed, 26 Jan 2022 15:40:18 +0000 (08:40 -0700)]
Merge branch 'rpma-add-support-for-File-System-DAX' of https://github.com/ldorau/fio

* 'rpma-add-support-for-File-System-DAX' of https://github.com/ldorau/fio:
  rpma: add support for File System DAX
  rpma: RPMA engine requires librpma>=v0.10.0 with rpma_mr_advise()

2 years agorpma: add support for File System DAX
Wang, Long [Tue, 25 Jan 2022 09:18:14 +0000 (10:18 +0100)]
rpma: add support for File System DAX

File System DAX is handled in a different way than Device DAX:

1) In case of File System DAX, each thread uses a separate file
from this file system and no offset is needed. In case of Device DAX,
each thread uses a separate offset within the same Device DAX.

2) File System DAX requires rpma_mr_advise(3)(ibv_advise_mr(3))
to be called for the registered memory to avoid page faults
and degraded performance.

Ref: https://github.com/axboe/fio/issues/1238

Signed-off-by: Wang, Long <long1.wang@intel.com>
2 years agorpma: RPMA engine requires librpma>=v0.10.0 with rpma_mr_advise()
Lukasz Dorau [Mon, 24 Jan 2022 22:56:47 +0000 (23:56 +0100)]
rpma: RPMA engine requires librpma>=v0.10.0 with rpma_mr_advise()

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
2 years agoMerge branch 'master' of https://github.com/ben-ihelputech/fio
Jens Axboe [Fri, 21 Jan 2022 17:46:26 +0000 (10:46 -0700)]
Merge branch 'master' of https://github.com/ben-ihelputech/fio

* 'master' of https://github.com/ben-ihelputech/fio:
  Update README to markdown format

2 years agoUpdate README to markdown format
ben-ihelputech [Fri, 21 Jan 2022 15:01:13 +0000 (09:01 -0600)]
Update README to markdown format

- Updated README to README.md to make it look nicer when rendered on GitHub.

2 years agoiolog.c: Fix memory leak for blkparse case
Lukas Straub [Wed, 19 Jan 2022 21:14:40 +0000 (21:14 +0000)]
iolog.c: Fix memory leak for blkparse case

init_blkparse_read (load_blkparse previously) didn't free the
filename. Fix this by freeing it in the init_iolog function and
handling it for both init_iolog_read and init_blkparse_read.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Link: https://lore.kernel.org/r/e4acf183ab789b7284bfa96089ebe1256e15f98d.1642626314.git.lukasstraub2@web.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoblktrace.c: Make thread-safe by removing local static variables
Lukas Straub [Wed, 19 Jan 2022 21:14:36 +0000 (21:14 +0000)]
blktrace.c: Make thread-safe by removing local static variables

Local static variables are not thread-safe. Make the functions in
blktrace.c safe by replacing them.
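
The general pattern, shown on a made-up formatting helper (neither function below exists in blktrace.c), is to move the state from a local static into storage owned by the caller:

```c
#include <stddef.h>
#include <stdio.h>

/* Not thread-safe: every caller shares the same static buffer. */
static const char *toy_name_static(unsigned int major, unsigned int minor)
{
	static char buf[32];

	snprintf(buf, sizeof(buf), "%u:%u", major, minor);
	return buf;
}

/* Thread-safe: each caller passes in its own storage. */
static const char *toy_name_r(char *buf, size_t len,
			      unsigned int major, unsigned int minor)
{
	snprintf(buf, len, "%u:%u", major, minor);
	return buf;
}
```

Two threads calling toy_name_static() can overwrite each other's result, while toy_name_r() only touches memory the caller controls.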

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Link: https://lore.kernel.org/r/b805bb3f6acf6c5b4d8811872c62af939aac62a7.1642626314.git.lukasstraub2@web.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoblktrace.c: Don't sleep indefinitely if there is a wrong timestamp
Lukas Straub [Wed, 19 Jan 2022 21:14:33 +0000 (21:14 +0000)]
blktrace.c: Don't sleep indefinitely if there is a wrong timestamp

Each of my traces has a single entry with a wrong timestamp
that causes an underflow followed by an infinite sleep.

Fix this by checking for underflow.
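
A minimal sketch of such a check, assuming unsigned nanosecond timestamps; toy_replay_delay() is a hypothetical helper, not the function changed by this patch.

```c
#include <stdint.h>

/*
 * Return how long to wait before replaying an entry stamped this_ns when
 * the previous entry was stamped last_ns. An unsigned subtraction would
 * wrap to a huge value if this_ns < last_ns, so that case is treated as
 * "no delay" instead.
 */
static uint64_t toy_replay_delay(uint64_t last_ns, uint64_t this_ns)
{
	if (this_ns < last_ns)
		return 0;
	return this_ns - last_ns;
}
```

Without the comparison, a single out-of-order timestamp turns into a delay close to the maximum 64-bit value, which in practice never elapses.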

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Link: https://lore.kernel.org/r/a19b7ea899093c4c0ed98d2d9a310f2f0f01fddd.1642626314.git.lukasstraub2@web.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoblktrace.c: Don't hardcode direct-io
Lukas Straub [Wed, 19 Jan 2022 21:14:30 +0000 (21:14 +0000)]
blktrace.c: Don't hardcode direct-io

This is unexpected if one wants to test performance of a
standard filesystem (by pointing replay_redirect to a standard file)
with buffered io.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Link: https://lore.kernel.org/r/239cc0c47c346408607772fb423aa5745a3779dd.1642626314.git.lukasstraub2@web.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agolinux-dev-lookup.c: Put the check for replay_redirect in the beginning
Lukas Straub [Wed, 19 Jan 2022 21:14:26 +0000 (21:14 +0000)]
linux-dev-lookup.c: Put the check for replay_redirect in the beginning

The machine may not have any block device nodes (like my dev container)
which makes this function fail despite replay_redirect being set.

Move the check to the beginning to fix this.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Link: https://lore.kernel.org/r/0dd4b6407f7b7f5f15f1fcad409554ff339ffca1.1642626314.git.lukasstraub2@web.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoblktrace.c: Add support for read_iolog_chunked
Lukas Straub [Wed, 19 Jan 2022 21:14:23 +0000 (21:14 +0000)]
blktrace.c: Add support for read_iolog_chunked

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Link: https://lore.kernel.org/r/d43a8a2d5fd23d9756cdcf280cd2f3572585f264.1642626314.git.lukasstraub2@web.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoiolog.c: Make iolog_items_to_fetch public
Lukas Straub [Wed, 19 Jan 2022 21:14:20 +0000 (21:14 +0000)]
iolog.c: Make iolog_items_to_fetch public

This function will be needed in the next patch.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Link: https://lore.kernel.org/r/81c9fbb31bbf0c487dc0ebff5eb85ca764fb14ef.1642626314.git.lukasstraub2@web.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoblktrace.c: Use file stream interface instead of fifo
Lukas Straub [Wed, 19 Jan 2022 21:14:16 +0000 (21:14 +0000)]
blktrace.c: Use file stream interface instead of fifo

Like in iolog.c use the file stream interface for accessing
the iolog file.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Link: https://lore.kernel.org/r/5f52a20f95ebead7fa9ae8bce0acf8f0570219ca.1642626314.git.lukasstraub2@web.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agodocs: documentation for sg WRITE STREAM(16)
Vincent Fu [Mon, 15 Nov 2021 20:07:17 +0000 (20:07 +0000)]
docs: documentation for sg WRITE STREAM(16)

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20211115200807.117138-7-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agosg: allow fio to open and close streams for WRITE STREAM(16) commands
Vincent Fu [Mon, 15 Nov 2021 20:07:17 +0000 (20:07 +0000)]
sg: allow fio to open and close streams for WRITE STREAM(16) commands

If --stream_id=0 then fio will open a stream for WRITE STREAM(16) commands and
close the stream when the device file is closed.

Example:
./fio --name=test --filename=/dev/sdb --ioengine=sg --number_ios=1 --debug=file,io --sg_write_mode=write_stream --rw=randwrite
fio: set debug option file
fio: set debug option io
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sg, iodepth=1
fio-3.27
Starting 1 process
file     1072297 setup files
file     1072297 get file size for 0x7f0306fa5110/0//dev/sdb
file     1072307 trying file /dev/sdb 290
file     1072307 fd open /dev/sdb
file     1072307 file not found in hash /dev/sdb
file     1072307 sgio_stream_control: opened stream 1
file     1072307 get file /dev/sdb, ref=0
io       1072307 drop page cache /dev/sdb
file     1072307 goodf=1, badf=2, ff=2b1
file     1072307 get_next_file_rr: 0x7f0306fa5110
file     1072307 get_next_file: 0x7f0306fa5110 [/dev/sdb]
file     1072307 get file /dev/sdb, ref=1
io       1072307 fill: io_u 0xb55700: off=0x35ef554000,len=0x1000,ddir=1,file=/dev/sdb
io       1072307 prep: io_u 0xb55700: off=0x35ef554000,len=0x1000,ddir=1,file=/dev/sdb
io       1072307 prep: io_u 0xb55700: ret=0
io       1072307 queue: io_u 0xb55700: off=0x35ef554000,len=0x1000,ddir=1,file=/dev/sdb
io       1072307 complete: io_u 0xb55700: off=0x35ef554000,len=0x1000,ddir=1,file=/dev/sdb
file     1072307 put file /dev/sdb, ref=2
file     1072307 close files
file     1072307 put file /dev/sdb, ref=1
file     1072307 sgio_stream_control: closed stream 1
file     1072307 fd close /dev/sdb
io       1072307 close ioengine sg
io       1072307 free ioengine sg

test: (groupid=0, jobs=1): err= 0: pid=1072307: Mon Aug 16 14:25:45 2021
  write: IOPS=200, BW=800KiB/s (819kB/s)(4096B/5msec); 0 zone resets
    clat (nsec): min=93339, max=93339, avg=93339.00, stdev= 0.00
     lat (nsec): min=96201, max=96201, avg=96201.00, stdev= 0.00
    clat percentiles (nsec):
     |  1.00th=[93696],  5.00th=[93696], 10.00th=[93696], 20.00th=[93696],
     | 30.00th=[93696], 40.00th=[93696], 50.00th=[93696], 60.00th=[93696],
     | 70.00th=[93696], 80.00th=[93696], 90.00th=[93696], 95.00th=[93696],
     | 99.00th=[93696], 99.50th=[93696], 99.90th=[93696], 99.95th=[93696],
     | 99.99th=[93696]
  lat (usec)   : 100=100.00%
  cpu          : usr=100.00%, sys=0.00%, ctx=2, majf=0, minf=20
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=800KiB/s (819kB/s), 800KiB/s-800KiB/s (819kB/s-819kB/s), io=4096B (4096B), run=5-5msec

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20211115200807.117138-6-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agosg: add support for WRITE STREAM(16) commands
Vincent Fu [Mon, 15 Nov 2021 20:07:17 +0000 (20:07 +0000)]
sg: add support for WRITE STREAM(16) commands

Add the "write_stream" option to sg_write_mode to send WRITE STREAM(16)
commands. Use the new stream_id option to set the stream identifier.

Example:

sg_stream_ctl -o /dev/sdb
Assigned stream id: 1
./fio --name=test --filename=/dev/sdb --ioengine=sg --sg_write_mode=write_stream --stream_id=1 --rw=randwrite --time_based --runtime=10s
...
sg_stream_ctl -c --id=1 /dev/sdb

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20211115200807.117138-5-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agosg: improve sg_write_mode option names
Vincent Fu [Mon, 15 Nov 2021 20:07:17 +0000 (20:07 +0000)]
sg: improve sg_write_mode option names

There is a name collision for the sg_write_mode options for the WRITE AND
VERIFY and VERIFY commands. Deprecate the 'verify' option and use
'write_and_verify' instead. Do the same thing for 'same' and 'write_same' to
have a consistent naming scheme. The original option names are still supported
for backward compatibility but list them as deprecated.

Here are the new sg_write_mode options:

Option                  SCSI command
write                   WRITE (default)
write_and_verify        WRITE AND VERIFY
verify (deprecated)     WRITE AND VERIFY
write_same              WRITE SAME
same (deprecated)       WRITE SAME
write_same_ndob         WRITE SAME with NDOB flag set
verify_bytchk_00        VERIFY with BYTCHK set to 00
verify_bytchk_01        VERIFY with BYTCHK set to 01
verify_bytchk_11        VERIFY with BYTCHK set to 11
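
Keeping deprecated names as aliases of the new ones can be sketched with a small lookup table. This is only an illustration: the enum, struct and function names are hypothetical, only some of the modes are listed, and fio's real option tables are structured differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum toy_sg_cmd { TOY_WRITE, TOY_WRITE_AND_VERIFY, TOY_WRITE_SAME };

/* Hypothetical option entry: a name, the command it selects, and a flag. */
struct toy_mode {
	const char *name;
	enum toy_sg_cmd cmd;
	bool deprecated;
};

static const struct toy_mode toy_modes[] = {
	{ "write",            TOY_WRITE,            false },
	{ "write_and_verify", TOY_WRITE_AND_VERIFY, false },
	{ "verify",           TOY_WRITE_AND_VERIFY, true  },
	{ "write_same",       TOY_WRITE_SAME,       false },
	{ "same",             TOY_WRITE_SAME,       true  },
};

/* Return the entry matching name, or NULL if the name is unknown. */
static const struct toy_mode *toy_lookup_mode(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(toy_modes) / sizeof(toy_modes[0]); i++)
		if (!strcmp(toy_modes[i].name, name))
			return &toy_modes[i];
	return NULL;
}
```

Old job files keep working because the deprecated name resolves to the same command, and the flag lets help output mark it as deprecated.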

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20211115200807.117138-4-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agosg: add support for WRITE SAME(16) commands with NDOB flag set
Vincent Fu [Mon, 15 Nov 2021 20:07:17 +0000 (20:07 +0000)]
sg: add support for WRITE SAME(16) commands with NDOB flag set

Add the sg_write_mode option write_same_ndob to issue WRITE SAME(16) commands
with the no data output buffer flag set. This flag is not supported for WRITE
SAME(10). So all commands with this option will be WRITE SAME(16).

Also include an example job file.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20211115200807.117138-3-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agosg: add support for VERIFY command using write modes
Vincent Fu [Mon, 15 Nov 2021 20:07:17 +0000 (20:07 +0000)]
sg: add support for VERIFY command using write modes

fio does not have an explicit verify data direction and creating a new data
direction just for SCSI VERIFY commands probably is not worthwhile. The format
of SCSI VERIFY commands matches that of write operations since VERIFY commands
can include data transfer to the device. So it seems reasonable to have VERIFY
commands be accounted for as write operations by fio.

Use the sg_write_mode option to support SCSI VERIFY commands with different
BYTCHK values.

BYTCHK  Description
00      No data is transferred to the device; device data is checked
01      Device data is compared with data transferred to device
11      Same as 01 except that only one sector of data is transferred to the
        device and each sector specified in the verification extent is
        compared against this transferred data.

Also update documentation and add a couple of example job files.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20211115200807.117138-2-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: move unified=both mixed allocation and calculation to new helper
Niklas Cassel [Mon, 17 Jan 2022 15:50:54 +0000 (15:50 +0000)]
stat: move unified=both mixed allocation and calculation to new helper

When using unified_rw_reporting=both, we need to print both the
per ddir stats, as well as the mixed stats.

In order to print both, the regular printing functions are responsible
for printing the per ddir stats from the unmodified struct thread_stat,
and show_mixed_ddir_status(), show_mixed_ddir_status_terse()
or add_mixed_ddir_status_json() is responsible for calculating and
printing the mixed stats.

In order to keep the original struct thread_stat intact, these three
functions have to allocate a new local thread_stat, where the mixed ddir
result can be stored before printing.

Move the allocation and calculation of this new struct thread_stat to a
new helper function, so that the code is easier to follow.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20220117155045.311453-3-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: remove duplicated code in show_mixed_ddir_status()
Niklas Cassel [Mon, 17 Jan 2022 15:50:53 +0000 (15:50 +0000)]
stat: remove duplicated code in show_mixed_ddir_status()

When using unified_rw_reporting=mixed, show_ddir_status() is called,
and is solely responsible for printing the mixed stats.

When using unified_rw_reporting=both, show_ddir_status() is called
and prints the regular output, after that, show_mixed_ddir_status()
is called to print the mixed stats.

The way that show_mixed_ddir_status_terse() and
add_mixed_ddir_status_json() are implemented is to allocate a new local ts
that will hold the mixed result, and then simply call the regular non-mixed
print function show_ddir_status_terse()/add_ddir_status_json() with this
local ts.

show_mixed_ddir_status() also allocates a new local ts, but fails to
initialize the lat percentiles and the percentile_list in the new local ts.
Therefore, show_mixed_ddir_status() has duplicated all the code from
show_ddir_status(), except that it uses the lat percentiles and the
percentile_list from the original ts.

Simplify show_mixed_ddir_status(), to behave in the same way as
show_mixed_ddir_status_terse() and add_mixed_ddir_status_json().

In other words, initialize the lat percentiles and the percentile_list in
the new local ts, and replace all the duplicated code with a simple call to
the regular non-mixed print function (show_ddir_status()).

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20220117155045.311453-2-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoinit: do not create lat logs when not needed
Damien Le Moal [Mon, 17 Jan 2022 02:11:27 +0000 (11:11 +0900)]
init: do not create lat logs when not needed

When any of the options disable_lat, disable_slat, or disable_clat is
used, there is no need to create the lat log associated with the
disabled latency. In addition, when write_lat_log is also specified,
this change avoids the creation of empty latency log files.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20220117021127.9259-1-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agostat: remove unnecessary bool parameter to sum_thread_stats()
Niklas Cassel [Mon, 10 Jan 2022 09:01:39 +0000 (09:01 +0000)]
stat: remove unnecessary bool parameter to sum_thread_stats()

We can deduce if it is the first struct io_stat src being added to the
struct io_stat dst by checking if the current number of samples in dst
is zero.

Therefore, remove the bool parameter "first" to sum_stat().
Since sum_stat() was the only user of the bool parameter "first" to
the sum_thread_stats() function, we can remove it from sum_thread_stats()
as well.
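
The deduction can be sketched on a reduced structure. struct toy_io_stat and toy_sum_stat() are hypothetical simplifications; the real sum_stat() also merges other fields such as the mean.

```c
#include <stdint.h>

/* Hypothetical reduced form of an I/O statistic. */
struct toy_io_stat {
	uint64_t min_val;
	uint64_t max_val;
	uint64_t samples;
};

/* Merge src into dst without needing a separate "first" flag. */
static void toy_sum_stat(struct toy_io_stat *dst, const struct toy_io_stat *src)
{
	if (src->samples == 0)
		return;

	if (dst->samples == 0) {
		/* dst holds nothing yet, so this is the first contribution. */
		dst->min_val = src->min_val;
		dst->max_val = src->max_val;
	} else {
		if (src->min_val < dst->min_val)
			dst->min_val = src->min_val;
		if (src->max_val > dst->max_val)
			dst->max_val = src->max_val;
	}
	dst->samples += src->samples;
}
```

The sample count already tells the function whether dst is empty, which matters for the minimum: a zero-initialized dst would otherwise keep 0 as its minimum forever.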

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220110090133.69955-1-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoengines/io_uring: don't set CQSIZE clamp unconditionally
Jens Axboe [Mon, 10 Jan 2022 02:34:27 +0000 (19:34 -0700)]
engines/io_uring: don't set CQSIZE clamp unconditionally

For older kernels without IORING_SETUP_CQSIZE, we'll get EINVAL if we
set it. Just retry the ring setup if that happens.
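
The retry logic can be sketched independently of the kernel interface. Everything below is a stand-in: TOY_SETUP_CQSIZE represents IORING_SETUP_CQSIZE, the setup call is abstracted as a hypothetical callback, and toy_old_kernel_setup() is a mock that behaves like a kernel without the flag.

```c
#include <errno.h>

#define TOY_SETUP_CQSIZE (1U << 3) /* stand-in for IORING_SETUP_CQSIZE */

/* Hypothetical setup callback: returns an fd, or -1 with errno set. */
typedef int (*toy_setup_fn)(unsigned int entries, unsigned int flags);

/* Try with the CQ size flag first; on EINVAL, retry without it. */
static int toy_ring_setup(toy_setup_fn setup, unsigned int entries,
			  unsigned int flags)
{
	int fd = setup(entries, flags | TOY_SETUP_CQSIZE);

	if (fd < 0 && errno == EINVAL)
		fd = setup(entries, flags);
	return fd;
}

/* Mock of an older kernel that rejects the unknown flag. */
static int toy_old_kernel_setup(unsigned int entries, unsigned int flags)
{
	(void)entries;
	if (flags & TOY_SETUP_CQSIZE) {
		errno = EINVAL;
		return -1;
	}
	return 3; /* pretend file descriptor */
}
```

Newer kernels succeed on the first attempt, so the fallback only costs an extra call where the flag is unsupported.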

Link: https://github.com/axboe/fio/issues/1324
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoMerge branch 'github-actions-i686' of https://github.com/vincentkfu/fio
Jens Axboe [Thu, 23 Dec 2021 23:27:33 +0000 (16:27 -0700)]
Merge branch 'github-actions-i686' of https://github.com/vincentkfu/fio

* 'github-actions-i686' of https://github.com/vincentkfu/fio:
  t/io_uring: fix help defaults for aio and random_io
  t/io_uring: fix 32-bit build warnings
  Revert "ci: temporarily remove linux-i686-gcc build"
  ci: workaround for problem with i686 builds