Age | Commit message | Author |
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
* 'fio_reset_sqe' of https://github.com/anarazel/fio:
engines/io_uring: Fully clear out previous SQE contents.
|
|
Without this change SQEs can contain data set in previous
submissions. E.g. a WRITE following an fdatasync would still have
IORING_FSYNC_DATASYNC set in sync_flags, which shares storage with the
WRITE's rw_flags. That field was not reset, causing all writes to be
synchronous. Similarly, an fsync following a READ/WRITE would not
reset off/addr/len, which causes errors, because the kernel's
io_prep_fsync returns an error if e.g. addr is not 0.
While this could also be fixed by resetting only the unused fields in
the respective branches, it seems less failure prone to start with a
zeroed out sqe.
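A minimal sketch of the "start with a zeroed out sqe" approach. The struct below is a hypothetical, simplified stand-in (demo_sqe and every DEMO_ name are assumptions made for illustration), not the kernel's real SQE layout and not fio's actual code:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an SQE; not the kernel layout. */
struct demo_sqe {
	uint8_t opcode;
	uint64_t off;
	uint64_t addr;
	uint32_t len;
	union {
		uint32_t rw_flags;   /* used by read/write requests */
		uint32_t sync_flags; /* used by fsync requests, same storage */
	};
};

enum { DEMO_OP_WRITE = 1, DEMO_OP_FSYNC = 2 };
#define DEMO_FSYNC_DATASYNC 1U

/* Zero the whole slot first so nothing survives from its previous use. */
static void demo_prep_write(struct demo_sqe *sqe, uint64_t off,
			    uint64_t addr, uint32_t len)
{
	memset(sqe, 0, sizeof(*sqe));
	sqe->opcode = DEMO_OP_WRITE;
	sqe->off = off;
	sqe->addr = addr;
	sqe->len = len;
	/* rw_flags stays 0 unless the caller sets it explicitly */
}

static void demo_prep_fdatasync(struct demo_sqe *sqe)
{
	memset(sqe, 0, sizeof(*sqe));
	sqe->opcode = DEMO_OP_FSYNC;
	sqe->sync_flags = DEMO_FSYNC_DATASYNC;
	/* off/addr/len stay 0, as the fsync preparation expects */
}
```

Because each prep function clears the slot before filling it, neither ordering of the two requests can leak flags or offsets into the next one.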
|
|
* 'ioring_add_sync_file_range' of https://github.com/anarazel/fio:
engines/io_uring: Add support for sync_file_range.
|
|
* 'fix_iouring_eintr' of https://github.com/anarazel/fio:
engines/io_uring: Handle EINTR.
|
|
Previously sync_file_range() requests were just dropped on the floor.
Signed-off-by: Andres Freund <andres@anarazel.de>
|
|
Several paths in io_uring_enter can trigger EINTR, but it was not
handled, leading to fio failing with spurious error messages.
An easy way to trigger EINTR is to just strace a running fio using
the io_uring engine and detach again.
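The message does not say how the engine handles EINTR. One common pattern is to retry the interrupted call, sketched here under that assumption. demo_retry_on_eintr and demo_flaky_call are hypothetical names; the callback stands in for any wrapper that returns -1 and sets errno on failure:

```c
#include <errno.h>

/* Hypothetical callback type: returns -1 and sets errno on failure. */
typedef int (*demo_call_fn)(void *arg);

/* Call fn until it either succeeds or fails with something other than EINTR. */
static int demo_retry_on_eintr(demo_call_fn fn, void *arg)
{
	int ret;

	do {
		ret = fn(arg);
	} while (ret < 0 && errno == EINTR);

	return ret;
}

/* Test double: fails with EINTR until the counter reaches zero. */
static int demo_flaky_call(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		errno = EINTR;
		return -1;
	}
	return 0;
}
```

With a counter of 3, demo_retry_on_eintr(demo_flaky_call, &counter) calls the function four times and returns 0 instead of surfacing a spurious error.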
Signed-off-by: Andres Freund <andres@anarazel.de>
|
|
If used with a raw bdev, we crash when attempting to open a
registered file before the files have actually been registered.
If we're called before files are registered, just open the file
normally. This is done to query sizes etc, and we'll get the file
closed after that anyway. The job open/close will use the right
registered fd.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Commands like the following do not honor the value given by the offset
option:
./fio --name=test --rw=randread --runtime=10s --offset=90% --time_based --ioengine=null --size=1T --norandommap --randrepeat=0
./fio --name=test --size=8k --offset=4k
In the random case, eventually a random offset will be generated beyond
the 1T file size, leading to a failure.
In the sequential case, a 12k file will be created despite size
specifying the 8k end boundary.
This patch modifies setup_files() so that f->io_size incorporates the
offset for cases like those above.
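One reading of the arithmetic described above, as a sketch only: if size marks the end boundary and I/O starts at offset, the region available for I/O is the difference. demo_io_size is a hypothetical helper, not the actual setup_files() logic:

```c
#include <stdint.h>

/* Hypothetical helper: bytes available for I/O when `size` is the end
 * boundary and I/O starts at `offset`. Returns 0 if offset is at or
 * beyond the boundary. */
static uint64_t demo_io_size(uint64_t size, uint64_t offset)
{
	return offset < size ? size - offset : 0;
}
```

For the sequential example, demo_io_size(8192, 4096) gives 4096, so I/O stops at the 8k boundary rather than extending the file to 12k.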
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The alloc-size option actually directs fio to allocate additional shared
memory pools of the specified size, augmenting the default allocation.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
It was under libaio until now; let's move it to a distinct option group name.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This feature is exposed as a separate option, like fixedbufs, and
provides a way for fio to register a set of files with the kernel.
This improves IO efficiency.
It is also a requirement to be able to use sqthread_poll, as that
feature requires fixed files on the kernel side.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Change the calculation of free_blocks in add_pool() to use SMALLOC_BPI
instead of SMALLOC_BPB. These two constants are coincidentally the same
on Linux and Windows but SMALLOC_BPI is the correct one to use.
free_blocks is the number of available blocks of size SMALLOC_BPB. It is
the product of the number of unsigned integers in the bitmap
(bitmap_blocks) and the number of bits per unsigned integer
(SMALLOC_BPI).
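The calculation can be sketched as follows. The DEMO_ constants mirror the roles described above; the value 32 for the block size is an assumption made for illustration, and the function names are hypothetical:

```c
#include <limits.h>
#include <stddef.h>

/* Bytes covered by one bitmap bit (assumed value, for illustration). */
#define DEMO_BPB 32
/* Bits per bitmap word, i.e. per unsigned int. */
#define DEMO_BPI (sizeof(unsigned int) * CHAR_BIT)

/* Number of blocks tracked by a bitmap of `bitmap_blocks` words:
 * one block per bit, so words times bits per word. */
static size_t demo_free_blocks(size_t bitmap_blocks)
{
	return bitmap_blocks * DEMO_BPI;
}

/* Bytes managed by the pool: blocks times bytes per block. */
static size_t demo_pool_bytes(size_t bitmap_blocks)
{
	return demo_free_blocks(bitmap_blocks) * DEMO_BPB;
}
```

Multiplying by the bytes-per-block constant instead would only give the right block count on platforms where the two constants happen to be equal.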
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If one process is making smalloc calls and another process is making
sfree calls, pool->free_blocks and pool->next_non_full will not be
synchronized because the two processes each have independent, local
copies of the variables.
This patch allocates space for the array of struct pool instances from
shared storage so that separate processes will be modifying quantities
stored at the same locations.
This issue was discovered on the server side running a client/server job
with --status-interval=1. Such a job encountered an OOM error when only
~50 objects were allocated from the smalloc pool.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For a multijob workload, each job may specify a zonesize option for
access to a zoned block device or regular device with zonemode=zbd.
In such a case, make sure that the zone size value specified by each job
matches the device zone size.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For a job accessing a zoned block device, the zone size is automatically
initialized to the device zone size. However, since zone information for
a zoned block device is parsed only once, for the first job
initialization of a multi job workload, only the first job has its
zonesize option initialized, causing problems if the zoneskip option is
also used (assert exit).
Fix this by always initializing a job zonesize option using the job
file zbd information when verifying the job ZBD related sizes and
offsets.
Fixes: 4d37720ae029 ("zbd: Add support for zoneskip option")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If we don't enable zbd, then provide an empty stub for the setup.
This fixes a build breakage on anything but Linux.
Fixes: 4d37720ae029 ("zbd: Add support for zoneskip option")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
To speed up device tests (performance and/or quality validation) of very
large capacity block devices such as SMR disks, it is useful to allow
skipping some block ranges for sequential workloads. While
zonemode=strided implements such a feature, it does not allow controlling
read operations in partially written zones of zoned block devices (i.e.
preventing reads after a zone write pointer) and can result in IO errors
if executed on a zoned block device with zones already written.
To solve this problem, add support for the zoneskip option with
zonemode=zbd, allowing a sequential workload to skip zoneskip bytes once
a zone has been fully written or its data has been read. The zoneskip
option is ignored for random workloads.
For read workloads, zone skipping takes into account the read_beyond_wp
option to switch zone either when all valid data in the zone is read
(read_beyond_wp=0) or the entire zone has been read (read_beyond_wp=1).
Add test47 to t/zbd/test-zbd-support to test that zoneskip invalid
values are handled correctly.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Clarify the use of the zonerange and zonesize options for zonemode=zbd.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Strictly speaking, drive managed disks are not zoned block devices as
they do not provide zone information nor zone commands. So remove
mention of this type of disk in the zoned block device description.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use log_err() instead of log_info() for notifying invalid zonesize
values specified by the user.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fix the error message on zbd_create_zone_info() failures to a more
generic message rather than indicating a BLKREPORTZONE ioctl error.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For a job using a zoned device, the zonesize option must always specify
the device zone size. That is checked in the function parse_zone_info().
The zonesize checks in zbd_init() apply only to jobs running with
zonemode=zbd on a regular block device. So move these checks into
init_zone_info() which is used to emulate zone information for regular
block devices.
Fix t/zbd test #43 accordingly.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add the ability for the offset_increment option to understand
percentages. With this patch offset_increment=10% will, for example,
increase the start offset by 10% of the file size for additional jobs
created by numjobs.
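The arithmetic can be sketched as below, assuming each additional job is shifted by its index times the increment. demo_start_offset is a hypothetical helper and the 0-based job index is an assumption:

```c
#include <stdint.h>

/* Hypothetical helper: start offset of job `job_idx` (0-based among the
 * jobs created by numjobs) when the increment is a percentage of the
 * file size. */
static uint64_t demo_start_offset(uint64_t base_offset, uint64_t file_size,
				  unsigned int increment_pct,
				  unsigned int job_idx)
{
	/* floor(file_size * pct / 100), split to avoid overflowing on
	 * very large files */
	uint64_t increment = (file_size / 100) * increment_pct +
			     (file_size % 100) * increment_pct / 100;

	return base_offset + increment * job_idx;
}
```

With a file size of 1000 bytes and an increment of 10%, jobs 0 through 3 would start at offsets 0, 100, 200 and 300.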
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Document three debug options and fix a formatting error.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
As the libnbd API isn't permanently stable until we reach the 1.0
release (expected soon), some code changes are needed to cope with API
changes between 0.9.6 and 0.9.8. In this case we made changes to
completion handlers after feedback from reviewers. This fix for fio
incorporates all the changes needed and bumps the minimum version to
libnbd >= 0.9.8.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
It has two holes in it, and some weird mid-struct packing. Let's
clean it up.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
I think this is why the build fails for some, which is odd.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Make it easier to run the zoned block device tests.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Due to a previous patch it is no longer necessary to hide the type of
accesses to the 'rate' and 'iops' members in struct jobs_eta.
This patch reverts commit df0ca15ce2ff ("eta: Fix compiler warning").
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This patch verifies the correctness of the previous patch.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Instead of declaring the whole structure packed, only declare non-aligned
members packed. This patch is an alternative way to fix the following gcc 9
compiler warnings:
eta.c: In function 'calc_thread_status':
eta.c:510:7: error: taking address of packed member of 'struct jobs_eta' may result in an unaligned pointer value [-Werror=address-of-packed-member]
510 | je->rate);
| ~~^~~~~~
eta.c:522:66: error: taking address of packed member of 'struct jobs_eta' may result in an unaligned pointer value [-Werror=address-of-packed-member]
522 | calc_rate(unified_rw_rep, disp_time, io_bytes, disp_io_bytes, je->rate);
| ~~^~~~~~
eta.c:523:64: error: taking address of packed member of 'struct jobs_eta' may result in an unaligned pointer value [-Werror=address-of-packed-member]
523 | calc_iops(unified_rw_rep, disp_time, io_iops, disp_io_iops, je->iops);
|
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Using strncpy() to copy strings is suboptimal because strncpy writes a
bunch of additional unnecessary null bytes. Use snprintf() instead of
strncpy(). An additional advantage of snprintf() is that it guarantees
that the output string is '\0'-terminated.
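A short sketch of the snprintf() pattern described above. demo_copy_string is a hypothetical wrapper, not a helper from the patch:

```c
#include <stdio.h>

/* Copy src into dst, which has room for dst_size bytes (dst_size > 0).
 * The result is always '\0'-terminated. Returns 1 if the whole string
 * fit and 0 if it was truncated. */
static int demo_copy_string(char *dst, size_t dst_size, const char *src)
{
	/* snprintf returns the length the full output would have had */
	int n = snprintf(dst, dst_size, "%s", src);

	return n >= 0 && (size_t)n < dst_size;
}
```

Copying "abcdef" into a 4-byte buffer leaves "abc" plus the terminator and reports truncation, whereas strncpy() with the same size would leave the buffer unterminated.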
This patch is an improvement for commit 32e31c8c5f7b ("Fix string copy
compilation warnings").
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Give up if creation of the null_blk instance fails.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This patch fixes two sparse warnings.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When fio reports write bytes or read bytes, it rounds the number with
units MiB or KiB to fit the number within a limited number of digits.
This results in rounding errors in the reported bytes and sometimes
causes failures of test case #17 in test-zbd-support,
which computes an incorrect total of I/O bytes when both write bytes
and read bytes are rounded up.
To avoid the rounding error, increase the number of digits from the
default value of 4 to 10 to keep precision. For example, a number
previously reported as "256MiB" will be reported as "267911168B" with
this change.
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fixes a copy and paste error introduced in commit d643a1e29d
("engines: Add Network Block Device (NBD) support using libnbd.").
Thanks: Sitsofe Wheeler
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Two things wrong here:
1) We align buffers by default, so no need for splice to do anything
extra.
2) ->mem_align is not a true/false setting, it's the alignment itself.
Hence the current setting to 1 is just buggy.
Fixes: https://github.com/axboe/fio/issues/810
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The recent addition of the nbd engine overflowed what we support.
Fixes: d643a1e29d31 ("engines: Add Network Block Device (NBD) support using libnbd")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This commit adds a new engine for testing Network Block Devices
directly. It requires libnbd (https://github.com/libguestfs/libnbd).
To see how to test nbdkit or qemu-nbd, read examples/nbd.fio.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
firstfree() triggers a warning from the Windows compiler used by
AppVeyor because it doesn't return a value if the for loop iterates to
completion. This patch resolves the compiler warning.
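The general shape of such a fix, sketched with a hypothetical function (demo_first_free is an illustration, not the actual firstfree() body): give the fall-through path after the loop an explicit return value.

```c
#include <stddef.h>

/* Hypothetical example: index of the first bitmap word that still has a
 * clear bit, or nr_words if every word is full. */
static size_t demo_first_free(const unsigned int *bitmap, size_t nr_words)
{
	size_t i;

	for (i = 0; i < nr_words; i++)
		if (bitmap[i] != ~0U)
			return i;

	/* Explicit return for the case where the loop runs to completion;
	 * omitting it is what compilers warn about. */
	return nr_words;
}
```

Returning the word count as a "not found" sentinel is a choice made for this sketch; the real function may signal that case differently.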
AppVeyor Windows build log: https://ci.appveyor.com/project/axboe/fio/builds/26381726
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
* 'smalloc-gc' of https://github.com/vincentkfu/fio:
smalloc: fix garbage collection problem
t/stest: make the test more challenging
smalloc: print debug info on oom error
|
|
If a large request arrives when pool->next_non_full points to empty
space that is insufficient to satisfy the request, pool->next_non_full
will be inappropriately advanced when the free space is followed by
lines of fully allocated space. The free space originally pointed to by
pool->next_non_full will be unavailable unless a subsequent sfree() call
frees allocated space above it. Resolve this issue by advancing
pool->next_non_full only outside the search loop and only when it points
to fully allocated space.
|
|
Add large smalloc requests to the sfree phase of the test. This exposes
a smalloc garbage collection issue.
|
|
Provide more details about the request and the state of the memory pools
when smalloc encounters an oom situation.
|