Niklas Cassel [Fri, 12 Nov 2021 09:54:39 +0000 (09:54 +0000)]
docs: update cmdprio_percentage documentation
Commit 1437d6357429 ("libaio,io_uring: relax cmdprio_percentage
constraints") relaxed the cmdprio_percentage constraints such that
cmdprio_percentage and prioclass/prio could be used together.
However, it forgot to remove the mention of this constraint from
the docs. Update the docs to reflect the new behavior.
Fixes: 1437d6357429 ("libaio,io_uring: relax cmdprio_percentage constraints")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20211112095428.158300-2-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Niklas Cassel [Mon, 8 Nov 2021 13:12:09 +0000 (13:12 +0000)]
stat: create a init_thread_stat_min_vals() helper
Create a init_thread_stat_min_vals() helper so that we can remove
duplicated code.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20211108131143.80158-1-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 25 Oct 2021 18:38:35 +0000 (12:38 -0600)]
Merge branch 'evelu-peak' of https://github.com/ErwanAliasr1/fio
* 'evelu-peak' of https://github.com/ErwanAliasr1/fio:
t/one-core-peak: Don't report errors if missing NVME features
t/io_uring: Fixing typo in help message
t/one-core-peak: Reporting SElinux status
Erwan Velu [Sun, 17 Oct 2021 20:00:02 +0000 (22:00 +0200)]
t/one-core-peak: Don't report errors if missing NVME features
Some NVMe devices don't support some features, so an error message is
reported, as in the following example:
NVMe status: INVALID_FIELD: A reserved coded value or an unsupported value in a defined field(0x4002)
nvme2n1: Temp:26 C, Autonomous Power State Transition:, PowerState:0, Completion Queues:135, Submission Queues:135
This commit reports features only if they are available:
nvme2n1: Completion Queues:135, Submission Queues:135, PowerState:0, Temp:27 C
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Sun, 17 Oct 2021 19:44:53 +0000 (21:44 +0200)]
t/io_uring: Fixing typo in help message
Commit a71ad043a3f4a introduced DMA pre-mapping support but made a
typo in the help message.
This option is enabled via -D, not -R.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Sun, 17 Oct 2021 19:18:40 +0000 (21:18 +0200)]
t/one-core-peak: Reporting SElinux status
SELinux can influence the overall performance.
Let's report its state.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Jens Axboe [Fri, 22 Oct 2021 16:19:04 +0000 (10:19 -0600)]
Merge branch 'master' of https://github.com/bvanassche/fio
* 'master' of https://github.com/bvanassche/fio:
Android: Add io_uring support
Bart Van Assche [Thu, 21 Oct 2021 21:41:40 +0000 (14:41 -0700)]
Android: Add io_uring support
This patch has been tested on a recent Android phone. Compilation of this
patch has been verified as follows:
NDK=/usr/lib/android-ndk
export LIBS="-landroid"
export UNAME=Android
for ((i=23;i<=30;i++)); do
echo "==== i = $i ===="
export CC=$NDK/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android${i}-clang
[ -e "$CC" ] || continue
./configure && make -j$(nproc) fio || break
done
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Jens Axboe [Tue, 19 Oct 2021 22:09:21 +0000 (16:09 -0600)]
Merge branch 'patch-1' of https://github.com/sweettea/fio
* 'patch-1' of https://github.com/sweettea/fio:
t/fuzz: Clean up generated dependency makefiles
Sweet Tea Dorminy [Tue, 19 Oct 2021 20:31:27 +0000 (16:31 -0400)]
t/fuzz: Clean up generated dependency makefiles
Currently, the 'clean' target cleans up the t/ directory, but not its
subdirectories. As t/fuzz contains c files, though, dependency makefiles
are created there and should be cleaned up.
Signed-off-by: Sweet Tea Dorminy <sweettea@dorminy.me>
Jens Axboe [Tue, 19 Oct 2021 01:29:46 +0000 (19:29 -0600)]
Merge branch 'fixes_1290' of https://github.com/rthardin/fio
* 'fixes_1290' of https://github.com/rthardin/fio:
Use min_bs in rate_process=poisson
Ryan Hardin [Mon, 18 Oct 2021 20:43:22 +0000 (16:43 -0400)]
Use min_bs in rate_process=poisson
This fixes an issue where IOPS targets were not met
when the `bs` parameter was not given explicitly, such
as when using `bssplit`.
Fixes #1290
Signed-off-by: Ryan Hardin <ryan.hardin@nutanix.com>
Vincent Fu [Tue, 21 Sep 2021 21:27:11 +0000 (21:27 +0000)]
run-fio-tests: make test runs more resilient
Catch exceptions that occur during test setup/running/evaluation. This
makes it more likely that the entire test suite can run to completion
even if some tests fail in an unexpected fashion.
In particular I have seen failures in FioJobTest_t0014() when the test
is run on a bare metal machine. Without this patch these failures make
the entire script grind to a halt.
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20210921212639.61319-1-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
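A minimal Python sketch of the pattern described in the commit above: each test runs inside its own try/except so that an unexpected exception is recorded as a failure and the remaining tests still run. The names `run_all`, `ok` and `boom` are hypothetical and are not taken from run-fio-tests.

```python
import traceback


def run_all(tests):
    """Run every (name, callable) pair; a crash counts as a failed test."""
    results = {}
    for name, func in tests:
        try:
            results[name] = bool(func())
        except Exception:
            # Log the unexpected failure and move on to the next test.
            traceback.print_exc()
            results[name] = False
    return results


def ok():
    return True


def boom():
    raise RuntimeError("unexpected failure during test setup")
```

With this structure, `run_all([("a", ok), ("b", boom), ("c", ok)])` still reaches test "c" even though "b" raised.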
Shin'ichiro Kawasaki [Wed, 13 Oct 2021 06:09:03 +0000 (15:09 +0900)]
t/zbd: Add -w option to ensure no open zone before write tests
The commit b34eb155e4a6 ("t/zbd: Reset all zones before test when max
open zones is specified") introduced -o max_open_zones option to the
script t/zbd/test-zbd-support. It passes max_open_zones value to fio and
resets all zones of the test target device before each test case run
with write operation. This zone reset by the script ensures that no zone
out of the IO range is in open status and the write operation does not
exceed the max_open_zones limit.
On the other hand, since commit d2f442bc0bd5 ("ioengines: add
get_max_open_zones zoned block device operation"), fio automatically
fetches the max_open_zones value. So it is no longer required to pass
the max_open_zones value from the script to fio. To simplify the script
usage, introduce -w option which does not require max_open_zones value.
This option just resets zones before test cases with write operation.
Of note is that fio itself resets the zones exceeding max_open_zones
limit since the commit 954217b90191 ("zbd: Initialize open zones list
referring zone status at fio start"), but it just resets zones within
the fio IO range. Still zone reset by the test script is required for
zones out of IO range. Zone reset out of IO range by fio is not
implemented since it may cause unexpected data erasure.
Suggested-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20211013060903.166543-6-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Shin'ichiro Kawasaki [Wed, 13 Oct 2021 06:09:02 +0000 (15:09 +0900)]
t/zbd: Align block size to zone capacity
Test cases #5, #6, #15 and #37 write data and read it back (or
write with the verify option for read back). When the test target zones have
a zone capacity unaligned to the block size, read requests cannot cover
all of the written data, and the test cases fail.
To avoid the failures, check zone capacity of zones and get block size
which can align to the zone capacities. Then use the block size for the
test cases.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20211013060903.166543-5-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
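The log does not show how the test script derives the aligned block size. As one possible approach, the Python sketch below takes the greatest common divisor of all zone capacities and of a preferred block size, which yields the largest size not above the preferred one that divides every capacity. The function name and the `preferred_bs` parameter are assumptions made for illustration.

```python
from functools import reduce
from math import gcd


def aligned_block_size(zone_capacities, logical_block_size, preferred_bs):
    """Return a block size (bytes) that evenly divides every zone capacity."""
    common = reduce(gcd, zone_capacities)
    bs = gcd(common, preferred_bs)
    # The result must still be a whole number of logical blocks.
    if bs < logical_block_size or bs % logical_block_size != 0:
        raise ValueError("no usable block size aligns to the zone capacities")
    return bs
```

For capacities of 3.5 MiB and 4 MiB with a preferred size of 1 MiB, this picks 512 KiB, so every zone holds a whole number of blocks.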
Shin'ichiro Kawasaki [Wed, 13 Oct 2021 06:09:01 +0000 (15:09 +0900)]
t/zbd: Do not use too large block size in test case #4
Test case #4 specifies the zone size as the block size to read a zone. For
some devices the zone size is very large, on the order of GB, so a single pread64
system call cannot complete the request. This makes the test case fail.
To avoid the failure, keep the block size adequate. If zone size is too
large, use logical_block_size * 256 as the block size.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20211013060903.166543-4-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
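A small Python sketch of the fallback described above. The commit message does not say what counts as "too large", so the `limit` parameter is a hypothetical threshold. Only the logical_block_size * 256 fallback comes from the message.

```python
def read_block_size(zone_size, logical_block_size, limit):
    """Pick the block size for reading one zone (all values in bytes)."""
    if zone_size > limit:
        # Zone too large for a single read request: use a bounded size.
        return logical_block_size * 256
    return zone_size
```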
Shin'ichiro Kawasaki [Wed, 13 Oct 2021 06:09:00 +0000 (15:09 +0900)]
zbd: Fix type of local variable min_bs
In zbd.c, thread option min_bs[] is referred and stored in the local
variable min_bs. Elements of min_bs[] have type unsigned long long, but
the local variable min_bs has type uint32_t. When an element of min_bs[]
has value larger than UINT32_MAX, it overflows on assignment to min_bs.
To avoid the overflow, fix type of the local variable min_bs from
uint32_t to uint64_t. Use uint64_t rather than unsigned long long to be
more specific about data size and consistency in zbd.c. The variable is
passed to the helper function zbd_find_zone(), so fix the type of that
function's argument as well.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20211013060903.166543-3-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
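The overflow described above can be illustrated in Python by masking, which mimics what C does when an unsigned long long is assigned to a narrower unsigned type. The helper names are hypothetical, and the 8 GiB value is just an example larger than UINT32_MAX.

```python
UINT32_MAX = 0xFFFFFFFF
UINT64_MAX = 0xFFFFFFFFFFFFFFFF


def as_uint32(value):
    """Model assignment to a uint32_t: keep only the low 32 bits."""
    return value & UINT32_MAX


def as_uint64(value):
    """Model assignment to a uint64_t: keep only the low 64 bits."""
    return value & UINT64_MAX


min_bs = 8 * 1024**3  # 8 GiB, i.e. 2**33, which exceeds UINT32_MAX
```

Here `as_uint32(min_bs)` is 0 while `as_uint64(min_bs)` keeps the full value, which is why widening the local variable avoids the problem.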
Shin'ichiro Kawasaki [Wed, 13 Oct 2021 06:08:59 +0000 (15:08 +0900)]
zbd: Remove cast to unsigned long long for printf
Many of the variables in zbd.c have type uint64_t. They are cast to
unsigned long long and printed with printf %llu format to handle
uint64_t types difference among architectures. This requires many
lengthy casts to unsigned long long.
To simplify the code, remove the casts to unsigned long long. Some of
the casts are simply unnecessary. To remove other casts, replace the
printf format %llu with PRIu64 so that uint64_t type difference among
architectures is handled accordingly.
A successful fio build with this change was confirmed with a 32-bit ARM cross
compiler and a 64-bit x86 compiler.
Suggested-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20211013060903.166543-2-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Rebecca Cran [Sat, 16 Oct 2021 06:17:38 +0000 (00:17 -0600)]
engines/http.c: add fallthrough annotation to _curl_trace
To avoid the warning from clang "warning: unannotated fall-through
between switch labels [-Wimplicit-fallthrough]" swap the "fall through"
comment with the "fallthrough;" annotation from compiler.h.
Since the second "fall through" comment isn't really a new fall-through,
remove it.
Signed-off-by: Rebecca Cran <rebecca@bsdio.com>
Link: https://lore.kernel.org/r/20211016061738.76654-1-rebecca@bsdio.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pankaj Raghav [Fri, 15 Oct 2021 12:09:56 +0000 (14:09 +0200)]
t/io_uring: Fix the parameters calculation for multiple threads scenario
The this_done, this_call and this_reap parameters should each be the sum of
the corresponding field across all the submitters.
Currently, we are adding the done, calls and reaps values of the last used
submitter nthread times.
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
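A Python sketch contrasting the two calculations described above. The `Submitter` class and both function names are hypothetical stand-ins for the C structures in t/io_uring.

```python
from dataclasses import dataclass


@dataclass
class Submitter:
    done: int
    calls: int
    reaps: int


def totals_buggy(submitters):
    """Old behaviour: the last submitter's counters counted nthread times."""
    last = submitters[-1]
    n = len(submitters)
    return (last.done * n, last.calls * n, last.reaps * n)


def totals_fixed(submitters):
    """Intended behaviour: sum each counter over all submitters."""
    return (
        sum(s.done for s in submitters),
        sum(s.calls for s in submitters),
        sum(s.reaps for s in submitters),
    )
```

The two only agree when every submitter happens to report identical counters.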
Jens Axboe [Thu, 14 Oct 2021 21:01:30 +0000 (15:01 -0600)]
Merge branch 'evelu-typo' of https://github.com/ErwanAliasr1/fio
* 'evelu-typo' of https://github.com/ErwanAliasr1/fio:
t/io_uring: Fixing typo
Erwan Velu [Thu, 14 Oct 2021 20:38:36 +0000 (22:38 +0200)]
t/io_uring: Fixing typo
s/Maxiumum/Maximum/g
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Jens Axboe [Thu, 14 Oct 2021 13:56:56 +0000 (07:56 -0600)]
t/io_uring: include a maximum IOPS seen when exiting
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 13 Oct 2021 12:17:44 +0000 (06:17 -0600)]
t/io_uring: don't append 'K' to IOPS if we don't divide by 1000
Impressive two errors in that silly change.
Reported-by: Erwan Velu <erwanaliasr1@gmail.com>
Fixes: dc10c23ab9a7 ("t/io_uring: show IOPS in increments of 1000 IOPS if necessary")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 13 Oct 2021 00:41:14 +0000 (18:41 -0600)]
t/io_uring: update for new DMA map buffers API
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 12 Oct 2021 20:25:23 +0000 (14:25 -0600)]
t/io_uring: add test support for pre mapping DMA buffers
This is in no shape or form the final evolution or API of this, but
easier to stuff it in here for testing.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 12 Oct 2021 20:09:33 +0000 (14:09 -0600)]
t/io_uring: fix silly identical branch error
The previous change inadvertently added the / 1000 to both branches, it
should of course only be done on the first one.
Fixes: dc10c23ab9a7 ("t/io_uring: show IOPS in increments of 1000 IOPS if necessary")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 12 Oct 2021 19:50:54 +0000 (13:50 -0600)]
Merge branch 'evelu-onecore' of https://github.com/ErwanAliasr1/fio
* 'evelu-onecore' of https://github.com/ErwanAliasr1/fio:
t/one-core-peak: Improving check_sysblock_value error handling
Jens Axboe [Tue, 12 Oct 2021 19:48:45 +0000 (13:48 -0600)]
t/io_uring: show IOPS in increments of 1000 IOPS if necessary
It's a bit hard to read the millions of IOPS, so if we're above 100K
IOPS, scale by 1000 and add a K instead. This is easier to read:
IOPS=7235K, BW=3532MiB/s, IOS/call=31/31, inflight=(78 114)
IOPS=7218K, BW=3524MiB/s, IOS/call=32/32, inflight=(79 105)
Signed-off-by: Jens Axboe <axboe@kernel.dk>
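A Python sketch of the formatting rule described above: above 100K IOPS the value is divided by 1000 and suffixed with K, otherwise it is printed unchanged with no suffix. The function name is hypothetical, and integer division is an assumption about how the value is rounded.

```python
def format_iops(iops):
    """Format an IOPS count, scaling to thousands above 100K."""
    if iops > 100_000:
        return f"{iops // 1000}K"
    return str(iops)
```

Keeping the division and the suffix in the same branch ensures a K is only appended to a value that was actually scaled.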
Erwan Velu [Tue, 12 Oct 2021 19:39:58 +0000 (21:39 +0200)]
t/one-core-peak: Improving check_sysblock_value error handling
The current code was reporting the following output:
cat: /sys/block/nvme0n1/queue/wbt_lat_usec: Argument invalide
nvme0n1: /sys/block/nvme0n1/queue/wbt_lat_usec set to 0.
Warning: nvme0n1: Cannot set 0 on /sys/block/nvme0n1/queue/wbt_lat_usec
This is problematic for several reasons:
- cat reports an error at reading wbt_lat_usec
- a message says it set wbt_lat_usec to 0
- a warning reports it cannot set wbt_lat_usec to 0
This commit:
- prevents the first error from being printed
- only reports that wbt_lat_usec is set to 0 if that succeeded; otherwise it prints the Warning message.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Jens Axboe [Tue, 12 Oct 2021 19:20:56 +0000 (13:20 -0600)]
Merge branch 'windows-res' of https://github.com/bjpaupor/fio
* 'windows-res' of https://github.com/bjpaupor/fio:
Query Windows clock frequency and use reported max
Brandon Paupore [Tue, 12 Oct 2021 19:00:41 +0000 (14:00 -0500)]
Query Windows clock frequency and use reported max
Previously FIO used the Windows lower-bound clock frequency of 64 Hz for
its helper-thread. This caused IOPS/BW logs to have large drift between
timestamps when not using per-unit logging for those measurements.
Now query the current resolution and set to use the maximum for more
accurate timestamps. Note that the resolution is automatically restored
after FIO terminates.
Signed-off-by: Brandon Paupore <brandon.paupore@wdc.com>
Jens Axboe [Mon, 11 Oct 2021 15:49:21 +0000 (09:49 -0600)]
io_u: don't attempt to requeue for full residual
If we get zero bytes transferred, then don't attempt to re-set the
io_u and requeue the IO. That's a fatal condition for this IO.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Sat, 9 Oct 2021 18:56:11 +0000 (12:56 -0600)]
t/io_uring: fix latency stats for depth == 1
Two issues here:
- Stat increment accounting was off-by-one, causing no stats added
for depth == 1
- The stat batch count should be a minimum of 2, since it's really
a mask.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 7 Oct 2021 12:18:21 +0000 (06:18 -0600)]
Merge branch 'evelu-ocp' of https://github.com/ErwanAliasr1/fio
* 'evelu-ocp' of https://github.com/ErwanAliasr1/fio:
t/io_uring: Add -r option to control the runtime
t/one-core-peak: Reporting RETPOLINE & PAGE_TABLE_ISOLATION
t/one-core-peak: Reporting kernel cmdline
t/one-core-peak: Reporting BLK_WBT_MQ
t/one-core-peak: Reporting BLK_CGROUP
Erwan Velu [Wed, 6 Oct 2021 21:40:27 +0000 (23:40 +0200)]
t/io_uring: Add -r option to control the runtime
By default the test runs until someone presses Ctrl-C.
This commit adds an option to define the expected runtime.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Wed, 6 Oct 2021 21:42:29 +0000 (23:42 +0200)]
t/one-core-peak: Reporting RETPOLINE & PAGE_TABLE_ISOLATION
These settings can influence the max perf if enabled.
Let's report them.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Wed, 6 Oct 2021 21:25:49 +0000 (23:25 +0200)]
t/one-core-peak: Reporting kernel cmdline
The cmdline can contain many interesting options that were set and could
influence the final result.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Wed, 6 Oct 2021 21:02:15 +0000 (23:02 +0200)]
t/one-core-peak: Reporting BLK_WBT_MQ
If BLK_WBT_MQ is set, some ktime_get() calls can be seen in the io path.
Let's report the value of this setting and disable it if present.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Wed, 6 Oct 2021 20:19:30 +0000 (22:19 +0200)]
t/one-core-peak: Reporting BLK_CGROUP
When BLK_CGROUP is enabled, it induces some rdtsc calls which reduce the
overall performance.
Let's report if this option is enabled.
The tool was reporting BLK_CGROUP_IOCOST which wasn't the right one.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Jens Axboe [Tue, 5 Oct 2021 12:58:07 +0000 (06:58 -0600)]
t/io_uring: get rid of old debug printfs
We don't really care about the sq/cq ring pointers, that was something
I originally added as this test tool was the first one that I wrote to
bring up io_uring and help debug ring issues.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 5 Oct 2021 12:38:41 +0000 (06:38 -0600)]
t/io_uring: print submitter id with tid on startup
Makes it easier to match up multiple threads with the stats.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 4 Oct 2021 23:04:04 +0000 (17:04 -0600)]
t/io_uring: clean up aio wait loop
No functional changes, just makes it easier to read and gets rid of
an indentation.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 4 Oct 2021 22:35:15 +0000 (16:35 -0600)]
t/io_uring: check for valid clock_index and finish state for stats
If the clock_index isn't non-zero, it's not valid and we should disregard
the sample. Ditto if an exit signal has been sent, we're done at that
point and aren't interested in the last samples.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 4 Oct 2021 22:18:39 +0000 (16:18 -0600)]
t/io_uring: don't track IO latencies the first second of runtime
The most variation is usually seen at startup, so don't start tracking
latencies until we've done the first reporting run. Things should be
nice and stable at that point.
To make this cheaper on the fast path, clock_index is only valid if
it's non-zero. This makes checking for stats cheap in the reap path.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 4 Oct 2021 22:16:01 +0000 (16:16 -0600)]
t/io_uring: don't print partial IOPS etc output if exit signal was received
The run always terminates with what looks like a much slower cycle than
the previous seconds. That's not really the case, it's just that the
sleep() got interrupted by the signal and we slept less than we thought
we did, yet we still account it as a full second.
Just make it cleaner and break if finish is set.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 4 Oct 2021 18:42:01 +0000 (12:42 -0600)]
t/io_uring: add support for legacy AIO
Just as a comparison point, not really interesting otherwise. It doesn't
support any of the advanced features, just basic IO.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 4 Oct 2021 18:33:40 +0000 (12:33 -0600)]
t/io_uring: remove extra add_stat() call
If we're batching the stat updates, it's incorrect to add the individual
stat. It would have skewed the percentiles, and made -t1 run slower than it
otherwise would have.
Fixes: ab85494f8bf0 ("t/io_uring: batch stat updates")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 1 Oct 2021 19:55:52 +0000 (13:55 -0600)]
Merge branch 'evelu-fixes2' of https://github.com/ErwanAliasr1/fio
* 'evelu-fixes2' of https://github.com/ErwanAliasr1/fio:
t/one-core-peak: nvme-cli as optional tooling
t/one-core-peak: Report numa as off if missing
Erwan Velu [Fri, 1 Oct 2021 19:43:07 +0000 (21:43 +0200)]
t/one-core-peak: nvme-cli as optional tooling
Not all systems have nvme-cli installed.
If it is present, let's print additional low-level info.
If not, let's ignore it and continue.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Fri, 1 Oct 2021 19:37:29 +0000 (21:37 +0200)]
t/one-core-peak: Report numa as off if missing
Some systems don't have numa enabled.
If so, don't report an error but report numa as off.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Shin'ichiro Kawasaki [Fri, 1 Oct 2021 10:32:57 +0000 (19:32 +0900)]
Refer td->loops instead of td->o.loops to fix loop count issue
In the github issues #1093 and #1278, it was reported that the loops
option does not work as expected when do_verify=0 option is specified.
Per analysis by Sowmya Ravi, the cause was as follows:
1) keep_running() decrements td->o.loops at job repetition, then
td->o.loops has zero value when the last loop is executed.
2) clear_io_state() is called at the beginning of the thread_main loop
for each repetition for loops option.
3) clear_io_state() calls reset_io_counters() which resets
td->nr_done_files to zero when td->o.loops is non-zero.
4) For the last loop of loops option, clear_io_state() call does not
clear td->nr_done_files since td->o.loops is zero. This results in a
setup error in do_io().
To fix the issue, modify reset_io_counters() to refer td->loops instead
of td->o.loops. td->o.loops is not a good reference since it is updated
in keep_running(). td->loops is not updated during fio run, and safe to
refer.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20211001103257.4130231-3-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
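The four steps above can be modelled with a simplified Python sketch. It is not fio's code: `Thread`, `run` and the `use_fixed` flag are hypothetical, and the loop body only mimics the decrement in keep_running() and the conditional reset in reset_io_counters().

```python
class Thread:
    def __init__(self, loops):
        self.loops = loops      # models td->loops: constant during the run
        self.o_loops = loops    # models td->o.loops: decremented per repetition
        self.nr_done_files = 0


def reset_io_counters(td, use_fixed):
    """Reset nr_done_files only when the referenced loop count is non-zero."""
    ref = td.loops if use_fixed else td.o_loops
    if ref:
        td.nr_done_files = 0


def run(loops, use_fixed):
    """Return, per repetition, whether nr_done_files was left stale."""
    td = Thread(loops)
    stale = []
    while td.o_loops > 0:
        td.o_loops -= 1                      # keep_running() decrement
        reset_io_counters(td, use_fixed)     # via clear_io_state()
        stale.append(td.nr_done_files != 0)
        td.nr_done_files = 1                 # the repetition completes its file
    return stale
```

In this model `run(3, use_fixed=False)` leaves the counter stale only on the last repetition, while `run(3, use_fixed=True)` never does.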
Shin'ichiro Kawasaki [Fri, 1 Oct 2021 10:32:56 +0000 (19:32 +0900)]
Revert "Fix for loop count issue when do_verify=0 (#1093)"
This reverts commit 499cded5f435a0a7c379b606eb3e903d7f43c360.
The commit enabled clear_io_state() call in the loop of thread_main()
after completion of IOs, regardless of verify option. This sets zero to
td->nr_done_files even when the IOs are sequential workload with holes.
Such IOs depend on td->nr_done_files to judge job completion in
__get_next_file(). With zero value in td->nr_done_files, the sequential
IOs do not complete as expected, which results in failure of a test case.
Revert the commit to avoid the failure. Regarding the loop count issue
with do_verify=0 option, another fix patch follows.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20211001103257.4130231-2-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 1 Oct 2021 17:11:53 +0000 (11:11 -0600)]
t/io_uring: correct percentile ranking
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Shin'ichiro Kawasaki [Thu, 30 Sep 2021 00:02:36 +0000 (09:02 +0900)]
zbd: Fix unexpected job termination by open zone search failure
Test case #46 in t/zbd/test-zbd-support fails when it is repeated
hundreds of times on null_blk zoned devices. The test case uses libaio
IO engine to run 8 random write jobs on 4 sequential write required
zones. When all of the 4 zones get almost full but still open for
in-flight writes, the helper function zbd_convert_to_open_zone() fails
to get an opened zone for next write. This results in unexpected job
termination.
To avoid the unexpected job termination, retry the steps in
zbd_convert_to_open_zone(). Before retry, call io_u_quiesce() to ensure
that the in-flight writes get completed.
To prevent infinite loop by the retry, retry only when any IOs are
in-flight or in-flight IOs get completed. To check in-flight IO count of
all jobs, add a new helper function any_io_in_flight().
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Link: https://lore.kernel.org/r/20210930000236.4116945-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 30 Sep 2021 02:15:45 +0000 (20:15 -0600)]
t/io_uring: store TSC rate in local file
Doesn't change on a single machine, so let's just cache the value instead
of requiring it to be specified every time. If we specify the rate, the
local data is updated. If we don't specify it, we check the file, and use
the rate in there if it exists.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
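A Python sketch of the caching behaviour described above: a specified rate is written to the file, and an unspecified rate falls back to the file if it exists. The function name, the file format (one integer in Hz) and the path handling are assumptions. The commit message does not name the file.

```python
def resolve_tsc_rate(cache_path, specified=None):
    """Return the TSC rate in Hz, updating or consulting the cache file."""
    if specified:
        # A rate given on the command line refreshes the cached value.
        with open(cache_path, "w") as f:
            f.write(f"{specified}\n")
        return specified
    try:
        with open(cache_path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        # No usable cached value; the caller has to cope without a rate.
        return None
```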
Jens Axboe [Wed, 29 Sep 2021 17:38:58 +0000 (11:38 -0600)]
Merge branch 'patch-1' of https://github.com/ravisowmya/fio
* 'patch-1' of https://github.com/ravisowmya/fio:
Fix for loop count issue when do_verify=0 (#1093)
ravisowmya [Tue, 28 Sep 2021 19:09:38 +0000 (12:09 -0700)]
Fix for loop count issue when do_verify=0 (#1093)
'clear_io_state' is called twice and resets the nr_done_files.
'clear_io_state' resets the nr_done_files if loop>=1.
This API is called twice within thread_main, and the second call is
skipped if do_verify=0. We rely on the first call for setup management.
So, for the very last loop, we would have skipped resetting
'nr_done_files' because loops=0, resulting in an IO error
in do_io, and we exit without performing any IOs. The fix invokes
the second call to clear_io_state.
Signed-off-by: Sowmya Ravi <sowmyaravi.92@gmail.com>
Jens Axboe [Tue, 28 Sep 2021 19:28:18 +0000 (13:28 -0600)]
Merge branch 'sigbreak' of https://github.com/bjpaupor/fio
* 'sigbreak' of https://github.com/bjpaupor/fio:
add signal handlers for Windows SIGBREAK
Brandon Paupore [Tue, 28 Sep 2021 17:12:15 +0000 (12:12 -0500)]
add signal handlers for Windows SIGBREAK
Signed-off-by: Brandon Paupore <brandon.paupore@wdc.com>
Jens Axboe [Sun, 26 Sep 2021 22:32:32 +0000 (16:32 -0600)]
Merge branch 'onecore' of https://github.com/ByteHamster/fio
* 'onecore' of https://github.com/ByteHamster/fio:
Pick core for running t/one-core-peak.sh
Jens Axboe [Sun, 26 Sep 2021 22:32:05 +0000 (16:32 -0600)]
Merge branch 'evelu-fio' of https://github.com/ErwanAliasr1/fio
* 'evelu-fio' of https://github.com/ErwanAliasr1/fio:
one-core-peak: Reporting NVME features
t/one-core-peak: Reporting kernel config
one-core-peak.sh: Fixing bash
Erwan Velu [Sun, 26 Sep 2021 20:26:27 +0000 (22:26 +0200)]
one-core-peak: Reporting NVME features
This commit gets some low-level features of NVMe drives and reports them.
This includes temperature, Autonomous Power State Transition (apste), power state, and submission & completion queues.
A typical output looks like:
nvme0n1: MODEL=Samsung SSD 970 EVO Plus 2TB FW=2B2QEXM7 serial=S59CNM0R417706B PCI=0000:01:00.0@8.0 GT/s PCIe IRQ=62 NUMA=0 CPUS=0-31
nvme0n1: Temp:34 C, Autonomous Power State Transition: Enabled, PowerState:4, Completion Queues:32, Submission Queues:32
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Sun, 26 Sep 2021 19:43:39 +0000 (21:43 +0200)]
t/one-core-peak: Reporting kernel config
This patch adds reporting of some items of the kernel config.
A typical output looks like:
system: KERNEL: 5.15.0-rc2+
system: KERNEL: CONFIG_BLK_CGROUP_IOCOST=y
system: KERNEL: CONFIG_HZ=1000
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
ByteHamster [Wed, 22 Sep 2021 14:30:35 +0000 (16:30 +0200)]
Pick core for running t/one-core-peak.sh
Erwan Velu [Sun, 26 Sep 2021 19:03:36 +0000 (21:03 +0200)]
one-core-peak.sh: Fixing bash
This commit fixes some warnings around the bash syntax.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Jens Axboe [Sun, 26 Sep 2021 15:58:05 +0000 (09:58 -0600)]
Merge branch 'tsc' of https://github.com/ErwanAliasr1/fio
* 'tsc' of https://github.com/ErwanAliasr1/fio:
one-core-peak: Adding option to reporting latencies
one-core-peak: Avoid reporting Unknown memory speed
Erwan Velu [Sat, 25 Sep 2021 21:51:24 +0000 (23:51 +0200)]
one-core-peak: Adding option to reporting latencies
Since commit 932131c944b10f2a03f4028318c454c98eca489f,
it is now possible to report the io_uring benchmark latencies.
This patch detects the current TSC value and enables the latency feature if requested.
Signed-off-by: Erwan Velu <e.velu@criteo.com>
Erwan Velu [Sat, 25 Sep 2021 21:49:12 +0000 (23:49 +0200)]
one-core-peak: Avoid reporting Unknown memory speed
Some BIOSes report the configured memory speed as Unknown, making the report useless.
Add a match on a real speed to avoid this.
Before: system: MEMORY: Unknown
After: system: MEMORY: 3466 MT/s
Signed-off-by: Erwan Velu <e.velu@criteo.com>
Jens Axboe [Sat, 25 Sep 2021 20:56:14 +0000 (14:56 -0600)]
Merge branch 'evelu-uring' of https://github.com/ErwanAliasr1/fio
* 'evelu-uring' of https://github.com/ErwanAliasr1/fio:
t/io_uring.c: Adding \n on help
Erwan Velu [Sat, 25 Sep 2021 20:45:51 +0000 (22:45 +0200)]
t/io_uring.c: Adding \n on help
Without these \n, the new options were badly printed.
Signed-off-by: Erwan Velu <e.velu@criteo.com>
Jens Axboe [Sat, 25 Sep 2021 20:38:10 +0000 (14:38 -0600)]
t/io_uring: batch stat updates
Track the last clock_index, and batch increments if at all possible.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Sat, 25 Sep 2021 20:25:05 +0000 (14:25 -0600)]
t/io_uring: add support for latency tracking
This will display the latency percentiles for the run when done, per
submitter thread. It takes two arguments:
-t<x> Enable latency tracking if x is non-zero
-T<Y> Set TSC clock rate to Y Hz
The tsc rate can be programmatically deduced (fio does this), for now
pass it in. dmesg will generally tell you:
tsc: Refined TSC clocksource calibration: 3699.889 MHz
and you'd then do:
-t1 -T3699889000
for that. Here's an example, synchronous optane gen2 read:
[...]
IOPS=254118, BW=124MiB/s, IOS/call=0/0, inflight=(1)
IOPS=255024, BW=124MiB/s, IOS/call=0/0, inflight=(1)
IOPS=255100, BW=124MiB/s, IOS/call=0/0, inflight=(1)
IOPS=254791, BW=124MiB/s, IOS/call=0/0, inflight=(1)
^CExiting on signal 2
IOPS=100086, BW=48MiB/s, IOS/call=1/1, inflight=(1)
515102: Latency percentiles:
percentiles (nsec):
| 1.0000th=[ 3857], 5.0000th=[ 3857], 10.0000th=[ 3857],
| 20.0000th=[ 3857], 30.0000th=[ 3857], 40.0000th=[ 3892],
| 50.0000th=[ 3892], 60.0000th=[ 3892], 70.0000th=[ 3892],
| 80.0000th=[ 3892], 90.0000th=[ 3961], 95.0000th=[ 3961],
| 99.9000th=[ 8752], 99.5000th=[ 8752], 99.9000th=[ 8752],
| 99.9500th=[ 9064], 99.9900th=[ 9755]
Or a higher depth run:
IOPS=3549568, BW=1733MiB/s, IOS/call=32/32, inflight=(64)
IOPS=3547712, BW=1732MiB/s, IOS/call=32/31, inflight=(111)
IOPS=3549504, BW=1733MiB/s, IOS/call=32/31, inflight=(128)
^CExiting on signal 2
IOPS=1413600, BW=690MiB/s, IOS/call=32/32, inflight=(35)
515078: Latency percentiles:
percentiles (nsec):
| 1.0000th=[13630], 5.0000th=[14322], 10.0000th=[15291],
| 20.0000th=[16121], 30.0000th=[20065], 40.0000th=[21726],
| 50.0000th=[22279], 60.0000th=[26154], 70.0000th=[27814],
| 80.0000th=[28368], 90.0000th=[33903], 95.0000th=[34180],
| 99.9000th=[52862], 99.5000th=[52862], 99.9000th=[52862],
| 99.9500th=[56183], 99.9900th=[67807]
Note that latency tracking isn't cheap, even if we tried to do it in the
cheapest way possible. The peak workload shown here will run at ~3.7M
IOPS without tracking, and as shown about 3.55M with tracking enabled.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
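The manual step above (reading the MHz value from dmesg and passing it as Hz) can be sketched in Python. The function name is hypothetical. The regular expression only targets the dmesg line format quoted in the commit message, and Decimal is used so that the MHz to Hz conversion is exact.

```python
import re
from decimal import Decimal


def tsc_hz_from_dmesg(line):
    """Extract the refined TSC rate from a dmesg line and return it in Hz."""
    m = re.search(
        r"Refined TSC clocksource calibration:\s*([0-9.]+)\s*MHz", line
    )
    if m is None:
        return None
    return int(Decimal(m.group(1)) * 1_000_000)


line = "tsc: Refined TSC clocksource calibration: 3699.889 MHz"
print(f"-t1 -T{tsc_hz_from_dmesg(line)}")  # prints: -t1 -T3699889000
```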
Jens Axboe [Fri, 24 Sep 2021 21:17:44 +0000 (15:17 -0600)]
t/io_uring: don't print BW numbers for do_nop
They don't mean anything for nops, we're just interested in IOPS here.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 23 Sep 2021 15:15:16 +0000 (09:15 -0600)]
t/io_uring: ensure batch counts are smaller or equal to depth
If you use a batch submit or complete count that's larger than the
depth, then t/io_uring will stall. Make sure to sanitize the counts
so that any batch value is always <= total depth.
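The sanitizing step described above amounts to clamping each batch count to the queue depth. The following is a minimal sketch of that idea only; the function name and argument names are hypothetical and are not taken from t/io_uring itself.

```python
def sanitize_batch_counts(depth: int, batch_submit: int, batch_complete: int):
    """Clamp both batch counts so that neither exceeds the queue depth."""
    return min(batch_submit, depth), min(batch_complete, depth)


# A batch of 32 cannot be satisfied by a depth of 16, so both are clamped.
print(sanitize_batch_counts(depth=16, batch_submit=32, batch_complete=32))
```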
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 21 Sep 2021 00:29:40 +0000 (18:29 -0600)]
Merge branch 'one-core' of https://github.com/ErwanAliasr1/fio
* 'one-core' of https://github.com/ErwanAliasr1/fio:
t/one-core.sh: Adding script to run the one-core io benchmark
Erwan Velu [Thu, 16 Sep 2021 20:52:22 +0000 (22:52 +0200)]
t/one-core.sh: Adding script to run the one-core io benchmark
Shipped with fio, the t/io_uring test is used to measure the maximum IOPS a
single core can reach.
Jens has published the procedure he uses several times, but reproducing
this setup is error-prone. It's easy to miss a configuration step
and get a different result.
This script provides a common setup to reproduce these runs.
From the fio directory, execute it as follows:
[user@fio] t/one-core.sh /dev/nvme0n1 [other drives]
##################################################:
system: CPU: AMD EPYC 7502P 32-Core Processor
system: MEMORY: 2933 MT/s
system: KERNEL: 5.10.35-1.el8.x86_64
nvme0n1: MODEL=Samsung SSD 970 EVO Plus 2TB FW=2B2QEXM7 serial=S59CNM0R417706B PCI=0000:01:00.0@8.0 GT/s PCIe IRQ=64 NUMA=0 CPUS=0-23
nvme0n1: set none as io scheduler
nvme0n1: iostats set to 1.
nvme0n1: nomerge set to 0.
Warning: For better performance, you should enable nvme poll queues by setting nvme.poll_queues=32 on the kernel commande line
##################################################:
io_uring: Running taskset -c 0,12 t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -n4 /dev/nvme0n1
[...]
IOPS=731008, BWPS=356 MB IOS/call=32/31, inflight=(108 127 126 106)
This script will take care of the following items:
- nvme poll queues
- io scheduler
- iostats
- io_poll
- nomerge
- finding the logical cores running on the first physical core
- cpu frequency governor on performance
- cpu idle governor on menu
- calling t/io_uring with the proper parameters, using a 512-byte block size
- reporting the nvme & pci configuration
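One of the items above, finding the logical cores that share the first physical core, can be sketched as follows. This is an illustration under assumptions rather than what the script does: it assumes a Linux system exposing the sysfs file thread_siblings_list, assumes CPU 0 sits on the first physical core, and uses hypothetical helper names.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list:
    """Expand a kernel CPU list such as "0,12" or "0-3,8" into CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def siblings_of_cpu0() -> list:
    """Read the hardware threads sharing a core with CPU 0 (Linux only)."""
    path = Path("/sys/devices/system/cpu/cpu0/topology/thread_siblings_list")
    return parse_cpu_list(path.read_text())


# The value "0,12" matches the taskset argument in the sample output above.
print(parse_cpu_list("0,12"))
```

The resulting list could then be joined with commas and passed to taskset -c, as in the sample run.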
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Jens Axboe [Thu, 16 Sep 2021 17:41:06 +0000 (11:41 -0600)]
t/io_uring: fix bandwidth calculation
Fixes:
22fd35012cea ("t/io_uring: Reporting bandwidth")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 16 Sep 2021 17:29:57 +0000 (11:29 -0600)]
Merge branch 'bwps' of https://github.com/ErwanAliasr1/fio
* 'bwps' of https://github.com/ErwanAliasr1/fio:
t/io_uring: Reporting bandwidth
Erwan Velu [Thu, 16 Sep 2021 16:46:30 +0000 (18:46 +0200)]
t/io_uring: Reporting bandwidth
When performing tests at various block sizes, it's sometimes a bit
difficult to estimate whether we have reached the limit of the datapath.
This commit simply prints the resulting bandwidth, i.e. the IOPS
multiplied by the block size.
A typical output looks like:
[user@host] t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -n4 /dev/nvme0n1
...
IOPS=729856, BW=356 MiB/s, IOS/call=32/32, inflight=(105 119 108 109)
[user@host] t/io_uring -b4096 -d128 -c32 -s32 -p1 -F1 -B1 -n4 /dev/nvme0n1
...
IOPS=746368, BW=2915 MiB/s, IOS/call=32/31, inflight=(121 115 122 122)
In the 4K case, as expected for a PCI Gen3 product, we are clearly limited
by the bandwidth, while in the 512-byte case we hit latency issues.
BW is expressed in MiB/sec.
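The relationship between the two reported figures can be checked by hand. The sketch below multiplies IOPS by the block size and converts to MiB/s with integer truncation; the truncation is an assumption that happens to reproduce the sample output above, and the function name is illustrative rather than taken from t/io_uring.

```python
MIB = 1024 * 1024


def bandwidth_mib_per_s(iops: int, block_size: int) -> int:
    """Bandwidth in MiB/s, truncated to an integer."""
    return iops * block_size // MIB


print(bandwidth_mib_per_s(729856, 512))   # the -b512 run above
print(bandwidth_mib_per_s(746368, 4096))  # the -b4096 run above
```

These evaluate to 356 and 2915, matching the BW values shown in the two sample runs.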
Signed-off-by: Erwan Velu <e.velu@criteo.com>
Jens Axboe [Wed, 15 Sep 2021 12:51:01 +0000 (06:51 -0600)]
t/io_uring: add switch -O for O_DIRECT vs buffered
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 13 Sep 2021 20:09:01 +0000 (14:09 -0600)]
zbd: remove dead zone retrieval call
A previous commit failed to realize that not only was the assignment
useless, but that this also made the call to zbd_zone_nr() itself
useless. Remove it.
Fixes:
000ecb5fe36d ("zbd: Removing useless variable assignment")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 13 Sep 2021 19:18:26 +0000 (13:18 -0600)]
t/io_uring: add -N option for do_nop
Makes it easier than asking people to edit and compile.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 13 Sep 2021 19:14:29 +0000 (13:14 -0600)]
t/io_uring: don't require a file for do_nop runs
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 8 Sep 2021 21:40:47 +0000 (15:40 -0600)]
Merge branch 'ft' of https://github.com/ErwanAliasr1/fio
* 'ft' of https://github.com/ErwanAliasr1/fio:
log: Removing useless assignment
zbd: Removing useless variable assignment
lib/fls.h: Remove unused variable assignment
engines/sg: Removing useless variable assignment
stat: Avoid freeing null pointer
filesetup: Removing unused variable usage
engines/sg: Return error if generic_close_file fails
Erwan Velu [Wed, 8 Sep 2021 21:10:50 +0000 (23:10 +0200)]
log: Removing useless assignment
The last len assignment is never read, which makes it useless.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Wed, 8 Sep 2021 21:00:45 +0000 (23:00 +0200)]
zbd: Removing useless variable assignment
zone_idx_b is set but never read again.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Wed, 8 Sep 2021 20:52:10 +0000 (22:52 +0200)]
lib/fls.h: Remove unused variable assignment
x is modified just before the last set of r but x is never used again.
Let's remove this useless assignment.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Wed, 8 Sep 2021 20:43:39 +0000 (22:43 +0200)]
engines/sg: Removing useless variable assignment
ret is set to -1 but the break statement will not use this value.
So let's remove this useless assignment which could be confusing.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Wed, 8 Sep 2021 20:35:59 +0000 (22:35 +0200)]
stat: Avoid freeing null pointer
If ovals is NULL, the jump to out will free(ovals) and will trigger an error.
As the out label was only used for this condition, let's remove it and return immediately.
As out was also used as a variable name, this makes the function easier
to read and more robust.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Jens Axboe [Wed, 8 Sep 2021 20:31:04 +0000 (14:31 -0600)]
README: add link to new lore archive
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Erwan Velu [Wed, 8 Sep 2021 20:22:56 +0000 (22:22 +0200)]
filesetup: Removing unused variable usage
done is set to true but this is useless as break will
stop the while loop.
So let's remove this useless assignment.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Erwan Velu [Wed, 8 Sep 2021 20:18:53 +0000 (22:18 +0200)]
engines/sg: Return error if generic_close_file fails
The current code returned 1 if generic_close_file() failed.
The ret value already holds the real error, so let's return that one, as
per the generic_open_file() error handling.
Signed-off-by: Erwan Velu <erwanaliasr1@gmail.com>
Jens Axboe [Wed, 8 Sep 2021 20:12:16 +0000 (14:12 -0600)]
t/io_uring: ensure that nthreads is > 0
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Andrzej Jakowski [Wed, 8 Sep 2021 18:35:00 +0000 (11:35 -0700)]
t/io_uring: allow flexible IO threads assignment
This patch allows IO threads to be assigned flexibly to the set of files.
When you specify:
t/io_uring -n 5 /dev/dev1 /dev/dev2
the first file/device will get 3 IO threads and the second file/device
the remaining 2 IO threads. When there are more files than IO threads,
an IO thread may be assigned multiple files/devices.
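One distribution policy consistent with the example above (5 threads over 2 files giving 3 and 2) is sketched below. It is a hypothetical illustration, not the patch's actual code: the function name, the remainder-goes-to-earlier-files rule, and the round-robin fallback are all assumptions.

```python
def assign_threads(nthreads: int, files: list) -> list:
    """Return, for each IO thread, the list of files it should drive."""
    if nthreads <= 0 or not files:
        raise ValueError("need at least one thread and one file")
    per_thread = [[] for _ in range(nthreads)]
    if nthreads >= len(files):
        # Spread threads over files; earlier files absorb the remainder.
        base, extra = divmod(nthreads, len(files))
        thread = 0
        for index, name in enumerate(files):
            for _ in range(base + (1 if index < extra else 0)):
                per_thread[thread].append(name)
                thread += 1
    else:
        # More files than threads: deal files out round-robin.
        for index, name in enumerate(files):
            per_thread[index % nthreads].append(name)
    return per_thread


layout = assign_threads(5, ["/dev/dev1", "/dev/dev2"])
print(layout)
```

With 5 threads and 2 devices, three threads end up on the first device and two on the second; with 2 threads and 3 files, one thread drives two files.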
Signed-off-by: Andrzej Jakowski <andrzej.jakowski@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 8 Sep 2021 14:59:48 +0000 (08:59 -0600)]
Fio 3.28
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 8 Sep 2021 14:07:57 +0000 (08:07 -0600)]
t/io_uring: don't make setrlimit() failing fatal
We don't even need this on newer kernels, so just ignore it if it
fails. The worst that can happen is that buffer registration will
fail.
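The "try it, but carry on if it fails" pattern described above can be sketched as follows. This is an illustration only: the commit message does not say which limit is being set, so the use of RLIMIT_MEMLOCK here is an assumption, and the function name is hypothetical.

```python
import resource


def try_raise_memlock_limit() -> bool:
    """Try to lift RLIMIT_MEMLOCK; report failure instead of aborting."""
    unlimited = (resource.RLIM_INFINITY, resource.RLIM_INFINITY)
    try:
        resource.setrlimit(resource.RLIMIT_MEMLOCK, unlimited)
    except (ValueError, OSError):
        # Not fatal: at worst, a later buffer registration may fail.
        return False
    return True


print("memlock limit raised:", try_raise_memlock_limit())
```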
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Andrzej Jakowski [Wed, 8 Sep 2021 04:10:44 +0000 (21:10 -0700)]
t/io_uring: fixes in output
Provide description of available options in usage command
and fix alignment so they look pretty.
Also remove debug output.
Signed-off-by: Andrzej Jakowski <andrzej.jakowski@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Shin'ichiro Kawasaki [Mon, 6 Sep 2021 01:50:00 +0000 (10:50 +0900)]
options: Add thinktime_iotime option
The thinktime option allows stalling a job for a specified amount of
time. Using the thinktime_blocks option, periodic stalls can be added
every thinktime_blocks IOs. However, with this option, the periodic
stall may not be repeated at equal time intervals as the time to execute
thinktime_blocks IOs may vary.
To control the thinktime interval by time, introduce the option
thinktime_iotime. With this new option, the thinktime stall is repeated
after IOs are executed for thinktime_iotime. If this option is used
together with the thinktime_blocks option, the thinktime pause is
repeated after thinktime_iotime or after thinktime_blocks IOs, whichever
happens first.
To support the new option, add a new member thinktime_iotime in the
struct thread_options and the struct thread_options_pack. Avoid size
increase of the struct thread_options_pack by replacing a padding 'pad5'
with the new member. To keep thinktime related members close, move the
members near the position where the padding was placed. Make the same
changes to struct thread_options as well, for consistency.
To track the time and IO block count at the last stall, add
last_thinktime variable and last_thinktime_blocks variable to struct
thread_data. Also, introduce the helper function init_thinktime()
to group thinktime related preparations.
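The "whichever happens first" rule described above can be sketched as a small predicate. This is a simplified illustration, not fio's implementation: the function name is hypothetical, the time unit is arbitrary, and treating a value of 0 as "trigger disabled" is an assumption made for the sketch. The last_thinktime and last_thinktime_blocks arguments mirror the tracking variables named above.

```python
def should_stall(now, blocks_done, last_thinktime, last_thinktime_blocks,
                 thinktime_iotime=0, thinktime_blocks=0):
    """Return True when either configured trigger has been reached.

    A value of 0 disables the corresponding trigger in this sketch.
    """
    time_hit = (thinktime_iotime > 0
                and now - last_thinktime >= thinktime_iotime)
    blocks_hit = (thinktime_blocks > 0
                  and blocks_done - last_thinktime_blocks >= thinktime_blocks)
    return time_hit or blocks_hit


# Time trigger fires first: 10 time units elapsed, only 3 of 8 blocks done.
print(should_stall(now=10, blocks_done=3, last_thinktime=0,
                   last_thinktime_blocks=0,
                   thinktime_iotime=10, thinktime_blocks=8))
```

After a stall, the caller would record the current time and block count as the new last_thinktime and last_thinktime_blocks values.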
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 3 Sep 2021 15:20:27 +0000 (15:20 +0000)]
examples: add examples for cmdprio_* IO priority options
Add the example scripts cmdprio-percentage.fio and cmdprio-bssplit.fio
to illustrate the use of the cmdprio_percentage, cmdprio_class,
cmdprio and cmdprio_bssplit options. Also add the fiograph output
images for these example scripts.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>