Age | Commit message | Author
2019-09-19 | Fio 3.16 (tag: fio-3.16) | Jens Axboe
Signed-off-by: Jens Axboe <>
2019-09-19 | engines/io_uring: remove debug printf | Jens Axboe
Signed-off-by: Jens Axboe <>
2019-09-13Merge branch 'fio_reset_sqe' of Axboe
* 'fio_reset_sqe' of
  engines/io_uring: Fully clear out previous SQE contents.
2019-09-13 | engines/io_uring: Fully clear out previous SQE contents. | Andres Freund
Without this change, SQEs can contain data set in previous submissions. For example, a WRITE following an fdatasync would still have IORING_FSYNC_DATASYNC set in sync_flags, which shares storage with the WRITE's rw_flags; because that field was not reset, all writes became synchronous. Similarly, an fsync following a READ/WRITE would not reset off/addr/len, which causes errors because the kernel's io_prep_fsync returns an error if, e.g., addr is not 0. While this could also be fixed by resetting only the unused fields in the respective branches, it seems less failure-prone to start with a zeroed-out sqe.
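The stale-field problem described above can be sketched with a cut-down queue entry. This is only an illustration: `struct mini_sqe`, the opcode values, `MINI_FSYNC_DATASYNC` and the two `prep_*` helpers are hypothetical names, not the kernel's `struct io_uring_sqe` or fio's code. The point is that zeroing the whole entry before filling it resets every field the current opcode does not set, including flags that share storage in a union.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for a submission queue entry. */
struct mini_sqe {
	uint8_t  opcode;
	uint64_t off;
	uint64_t addr;
	uint32_t len;
	union {                         /* these flag fields share storage */
		uint32_t rw_flags;
		uint32_t fsync_flags;
	};
};

enum { OP_WRITE = 1, OP_FSYNC = 2 };    /* illustrative opcode values */
#define MINI_FSYNC_DATASYNC 1u          /* illustrative flag value */

/* Prepare a write. The memset clears anything left over from the
 * previous use of this entry, such as a datasync flag in the union. */
void prep_write(struct mini_sqe *sqe, const void *buf,
		uint32_t len, uint64_t off)
{
	memset(sqe, 0, sizeof(*sqe));
	sqe->opcode = OP_WRITE;
	sqe->addr = (uint64_t)(uintptr_t)buf;
	sqe->len = len;
	sqe->off = off;
}

/* Prepare an fsync. The memset guarantees off/addr/len are zero. */
void prep_fsync(struct mini_sqe *sqe, int datasync)
{
	memset(sqe, 0, sizeof(*sqe));
	sqe->opcode = OP_FSYNC;
	if (datasync)
		sqe->fsync_flags = MINI_FSYNC_DATASYNC;
}
```

Without the `memset()` calls, a `prep_write()` that follows `prep_fsync(sqe, 1)` would inherit a non-zero `rw_flags`, which is the failure mode the commit message describes.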
2019-09-12Merge branch 'ioring_add_sync_file_range' of Axboe
* 'ioring_add_sync_file_range' of
  engines/io_uring: Add support for sync_file_range.
2019-09-12Merge branch 'fix_iouring_eintr' of Axboe
* 'fix_iouring_eintr' of
  engines/io_uring: Handle EINTR.
2019-09-12 | engines/io_uring: Add support for sync_file_range. | Andres Freund
Previously sync_file_range() requests were silently dropped. Signed-off-by: Andres Freund <>
2019-09-12 | engines/io_uring: Handle EINTR. | Andres Freund
Several paths in io_uring_enter can trigger EINTR, but it was not handled, leading to fio failing with spurious error messages. An easy way to trigger EINTR is to just strace a running fio using the io_uring engine and detach again. Signed-off-by: Andres Freund <>
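A common way to handle EINTR is to retry the interrupted call. The sketch below shows that pattern in isolation; it is not fio's actual fix. `syscall_fn`, `retry_on_eintr()` and `flaky_call()` are hypothetical names, with the callback standing in for a blocking call such as io_uring_enter.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical callback type standing in for a blocking system call;
 * it returns -1 and sets errno on failure. */
typedef int (*syscall_fn)(void *arg);

/* Call fn until it succeeds or fails with an error other than EINTR. */
int retry_on_eintr(syscall_fn fn, void *arg)
{
	int ret;

	do {
		ret = fn(arg);
	} while (ret < 0 && errno == EINTR);

	return ret;
}

/* Test double: fails with EINTR *remaining times, then succeeds. */
int flaky_call(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		errno = EINTR;
		return -1;
	}
	return 0;
}
```

Calling `retry_on_eintr(flaky_call, &n)` with `n = 3` makes four calls in total and returns 0, so the caller never sees the spurious interruptions.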
2019-09-12 | engines/io_uring: fix crash with registerfiles=1 | Jens Axboe
If used with a raw bdev, we're crashing in attempting to open a registered file before we have actually registered them. If we're called before files are registered, just open the file normally. This is done to query sizes etc, and we'll get the file closed after that anyway. The job open/close will use the right registered fd. Signed-off-by: Jens Axboe <>
2019-09-11 | filesetup: honor the offset option | Vincent Fu
Commands like the following do not honor the value given by the offset option:
./fio --name=test --rw=randread --runtime=10s --offset=90% --time_based --ioengine=null --size=1T --norandommap --randrepeat=0
./fio --name=test --size=8k --offset=4k
In the random case, a random offset beyond the 1T file size will eventually be generated, leading to a failure. In the sequential case, a 12k file will be created even though size specifies an 8k end boundary. This patch modifies setup_files() so that f->io_size incorporates the offset for cases like those above. Signed-off-by: Jens Axboe <>
2019-09-11 | doc: clarify what --alloc-size does | Vincent Fu
The alloc-size option actually directs fio to allocate additional shared memory pools of the specified size, augmenting the default allocation. Signed-off-by: Jens Axboe <>
2019-09-05 | engines/io_uring: use its own option group | Jens Axboe
It was under libaio until now; let's move it to a distinct option group name. Signed-off-by: Jens Axboe <>
2019-09-05 | engines/io_uring: add support for registered files | Jens Axboe
This feature is exposed as a separate option, like fixedbufs, and provides a way for fio to register a set of files with the kernel. This improves IO efficiency. It is also a requirement to be able to use sqthread_poll, as that feature requires fixed files on the kernel side. Signed-off-by: Jens Axboe <>
2019-09-03 | smalloc: use SMALLOC_BPI instead of SMALLOC_BPB in add_pool() | Vincent Fu
Change the calculation of free_blocks in add_pool() to use SMALLOC_BPI instead of SMALLOC_BPB. These two constants are coincidentally the same on Linux and Windows but SMALLOC_BPI is the correct one to use. free_blocks is the number of available blocks of size SMALLOC_BPB. It is the product of the number of unsigned integers in the bitmap (bitmap_blocks) and the number of bits per unsigned integer (SMALLOC_BPI). Signed-off-by: Jens Axboe <>
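The arithmetic described above can be written out as a small sketch. The two macro values are assumptions made for illustration (a 32-byte block and a bitmap built from `unsigned int` words) and should be checked against smalloc.c; `pool_free_blocks()` and `pool_bytes()` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Values assumed for illustration; see smalloc.c for the real definitions. */
#define SMALLOC_BPB 32                          /* bytes covered by one bitmap bit */
#define SMALLOC_BPI (sizeof(unsigned int) * 8)  /* bits per bitmap word */

/* Number of allocatable blocks tracked by a bitmap of bitmap_blocks words:
 * one block per bit, SMALLOC_BPI bits per word. */
size_t pool_free_blocks(size_t bitmap_blocks)
{
	return bitmap_blocks * SMALLOC_BPI;
}

/* Bytes of pool memory that those blocks cover. */
size_t pool_bytes(size_t bitmap_blocks)
{
	return pool_free_blocks(bitmap_blocks) * SMALLOC_BPB;
}
```

Under these assumptions, both constants evaluate to 32 wherever `unsigned int` is 32 bits wide, which is why using the wrong one went unnoticed.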
2019-09-03 | smalloc: allocate struct pool array from shared memory | Vincent Fu
If one process is making smalloc calls and another process is making sfree calls, pool->free_blocks and pool->next_non_full will not be synchronized because the two processes each have independent, local copies of the variables. This patch allocates space for the array of struct pool instances from shared storage so that separate processes will be modifying quantities stored at the same locations. This issue was discovered on the server side running a client/server job with --status-interval=1. Such a job encountered an OOM error when only ~50 objects were allocated from the smalloc pool. Signed-off-by: Jens Axboe <>
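The difference between process-local and shared bookkeeping can be demonstrated with an anonymous shared mapping. This is a minimal sketch for Linux-like systems, not the patch itself: `struct pool_state`, `alloc_shared_state()` and `child_frees_blocks()` are hypothetical names, and the locking that real code would need around the counters is omitted.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS with strict -std= settings */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bookkeeping struct, loosely modeled on the fields named above. */
struct pool_state {
	unsigned int free_blocks;
	unsigned int next_non_full;
};

/* Allocate the bookkeeping in memory that stays shared across fork(). */
struct pool_state *alloc_shared_state(void)
{
	void *p = mmap(NULL, sizeof(struct pool_state), PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Fork a child that "frees" n blocks; return 0 if the child exited cleanly. */
int child_frees_blocks(struct pool_state *st, unsigned int n)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		st->free_blocks += n;   /* seen by the parent only if st is shared */
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

When `st` comes from `alloc_shared_state()`, the parent observes the child's update. When `st` points at ordinary stack or heap memory, the child modifies its own copy and the parent's counter is unchanged, which matches the desynchronization described in the commit message.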
2019-08-29 | zbd: Improve job zonesize initialization checks | Damien Le Moal
For a multi-job workload, each job may specify a zonesize option for access to a zoned block device or a regular device with zonemode=zbd. In such a case, make sure that the zone size value specified by each job matches the device zone size. Signed-off-by: Damien Le Moal <> Signed-off-by: Jens Axboe <>
2019-08-29 | zbd: Fix job zone size initialization | Damien Le Moal
For a job accessing a zoned block device, the zone size is automatically initialized to the device zone size. However, since zone information for a zoned block device is parsed only once, during the first job initialization of a multi-job workload, only the first job has its zonesize option initialized, causing problems if the zoneskip option is also used (assert exit). Fix this by always initializing a job's zonesize option using the job file zbd information when verifying the job's ZBD-related sizes and offsets. Fixes: 4d37720ae029 ("zbd: Add support for zoneskip option") Signed-off-by: Damien Le Moal <> Signed-off-by: Jens Axboe <>
2019-08-29 | zbd: provide empty setup_zbd_zone_mode() | Jens Axboe
If we don't enable zbd, then provide an empty stub for the setup. This fixes a build breakage on anything but Linux. Fixes: 4d37720ae029 ("zbd: Add support for zoneskip option") Signed-off-by: Jens Axboe <>
2019-08-29 | zbd: Add support for zoneskip option | Damien Le Moal
To speed up device tests (performance and/or quality validation) of very large capacity block devices such as SMR disks, it is useful to allow skipping some block ranges for sequential workloads. While zonemode=strided implements such a feature, it does not allow controlling read operations in partially written zones of zoned block devices (i.e. preventing reads after a zone write pointer) and can result in IO errors if executed on a zoned block device with zones already written. To solve this problem, add support for the zoneskip option with zonemode=zbd, allowing a sequential workload to skip zoneskip bytes once a zone has been fully written or its data has been read. The zoneskip option is ignored for random workloads. For read workloads, zone skipping takes into account the read_beyond_wp option to switch zones either when all valid data in the zone has been read (read_beyond_wp=0) or when the entire zone has been read (read_beyond_wp=1). Add test47 to t/zbd/test-zbd-support to test that invalid zoneskip values are handled correctly. Signed-off-by: Damien Le Moal <> Signed-off-by: Jens Axboe <>
2019-08-29 | man: Improve zonemode=zbd information | Damien Le Moal
Clarify the use of the zonerange and zonesize options for zonemode=zbd. Signed-off-by: Damien Le Moal <> Signed-off-by: Jens Axboe <>
2019-08-29 | man page: Fix read_beyond_wp description | Damien Le Moal
Strictly speaking, drive managed disks are not zoned block devices as they do not provide zone information nor zone commands. So remove mention of this type of disk in the zoned block device description. Signed-off-by: Damien Le Moal <> Signed-off-by: Jens Axboe <>
2019-08-29 | zbd: Fix error message | Damien Le Moal
Use log_err() instead of log_info() for notifying invalid zonesize values specified by the user. Signed-off-by: Damien Le Moal <> Signed-off-by: Jens Axboe <>
2019-08-29 | zbd: Fix initialization error message | Damien Le Moal
Fix the error message on zbd_create_zone_info() failures to a more generic message rather than indicating a BLKREPORTZONE ioctl error. Signed-off-by: Damien Le Moal <> Signed-off-by: Jens Axboe <>
2019-08-29 | zbd: Cleanup zbd_init() | Damien Le Moal
For a job using a zoned device, the zonesize option must always specify the device zone size. That is checked in the function parse_zone_info(). The zonesize checks in zbd_init() apply only to jobs running with zonemode=zbd on a regular block device. So move these checks into init_zone_info() which is used to emulate zone information for regular block devices. Fix t/zbd test #43 accordingly. Signed-off-by: Damien Le Moal <> Signed-off-by: Jens Axboe <>
2019-08-28 | options: allow offset_increment to understand percentages | Vincent Fu
Add the ability for the offset_increment option to understand percentages. With this patch offset_increment=10% will, for example, increase the start offset by 10% of the file size for additional jobs created by numjobs. Signed-off-by: Jens Axboe <>
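The effect of a percentage increment can be sketched as simple arithmetic. This assumes the increment is applied once per additional job, i.e. multiplied by a job index that starts at 0; `subjob_start_offset()` and its parameters are hypothetical names rather than fio's internals.

```c
#include <assert.h>
#include <stdint.h>

/* Start offset for one of the jobs created by numjobs:
 * base offset plus (percentage of file size) times the job index.
 * subjob_index is assumed to start at 0 for the first job. */
uint64_t subjob_start_offset(uint64_t base_offset, uint64_t file_size,
			     unsigned int increment_percent,
			     unsigned int subjob_index)
{
	uint64_t increment = file_size * increment_percent / 100;

	return base_offset + increment * subjob_index;
}
```

For a 1000-byte file with a 10% increment and no base offset, jobs 0, 1, 2 and 3 would start at byte 0, 100, 200 and 300 respectively.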
2019-08-28 | docs: small HOWTO fixes | Vincent Fu
Document three debug options and fix a formatting error. Signed-off-by: Jens Axboe <>
2019-08-15 | nbd: Update for libnbd 0.9.8 | Richard W.M. Jones
As the libnbd API isn't permanently stable until we reach the 1.0 release (expected soon), some code changes are needed to cope with API changes between 0.9.6 and 0.9.8. In this case we made changes to completion handlers after feedback from reviewers. This fix for fio incorporates all the changes needed and bumps the minimum version to libnbd >= 0.9.8. Signed-off-by: Richard W.M. Jones <> Signed-off-by: Jens Axboe <>
2019-08-15 | stat: ensure that struct jobs_eta packs nicely | Jens Axboe
It has two holes in it, and some weird mid-struct packing. Let's clean it up. Signed-off-by: Jens Axboe <>
2019-08-14 | eta: use struct jobs_eta_packed | Jens Axboe
I think this is why the build fails for some, which is odd. Signed-off-by: Jens Axboe <>
2019-08-14 | Makefile: Add 'fulltest' target | Bart Van Assche
Make it easier to run the zoned block device tests. Signed-off-by: Bart Van Assche <> Signed-off-by: Jens Axboe <>
2019-08-14 | Restore type checking in calc_thread_status() | Bart Van Assche
Due to a previous patch it is no longer necessary to hide the type of accesses to the 'rate' and 'iops' members in struct jobs_eta. This patch reverts commit df0ca15ce2ff ("eta: Fix compiler warning"). Cc: Damien Le Moal <> Signed-off-by: Bart Van Assche <> Signed-off-by: Jens Axboe <>
2019-08-14 | Verify the absence of holes in struct jobs_eta at compile time | Bart Van Assche
This patch verifies the correctness of the previous patch. Signed-off-by: Bart Van Assche <> Signed-off-by: Jens Axboe <>
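A compile-time check for padding can be sketched with C11 `static_assert`. The struct below is hypothetical and much smaller than struct jobs_eta; it only illustrates the technique of comparing the struct size and member offsets against the sum of the member sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct; members are ordered so that natural alignment
 * leaves no padding between or after them. */
struct eta_sample {
	uint32_t nr_running;
	uint32_t nr_pending;
	uint64_t elapsed_sec;
	uint64_t rate[2];
};

/* Fails to compile if the compiler inserted any padding. */
static_assert(sizeof(struct eta_sample) ==
	      sizeof(uint32_t) * 2 + sizeof(uint64_t) + sizeof(uint64_t) * 2,
	      "struct eta_sample contains holes");

/* Each member must start exactly where the previous one ends. */
static_assert(offsetof(struct eta_sample, elapsed_sec) == 8,
	      "hole before elapsed_sec");
static_assert(offsetof(struct eta_sample, rate) == 16,
	      "hole before rate");
```

Inserting, say, a single `uint8_t` member before `elapsed_sec` would make the build fail, so a layout regression is caught at compile time rather than when two builds exchange the struct.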
2019-08-14 | Refine packed annotations in stat.h | Bart Van Assche
Instead of declaring the whole structure packed, only declare non-aligned members packed. This patch is an alternative way to fix the following gcc 9 compiler warnings:
eta.c: In function 'calc_thread_status':
eta.c:510:7: error: taking address of packed member of 'struct jobs_eta' may result in an unaligned pointer value [-Werror=address-of-packed-member]
  510 | je->rate);
      | ~~^~~~~~
eta.c:522:66: error: taking address of packed member of 'struct jobs_eta' may result in an unaligned pointer value [-Werror=address-of-packed-member]
  522 | calc_rate(unified_rw_rep, disp_time, io_bytes, disp_io_bytes, je->rate);
      | ~~^~~~~~
eta.c:523:64: error: taking address of packed member of 'struct jobs_eta' may result in an unaligned pointer value [-Werror=address-of-packed-member]
  523 | calc_iops(unified_rw_rep, disp_time, io_iops, disp_io_iops, je->iops);
      |
Signed-off-by: Bart Van Assche <> Signed-off-by: Jens Axboe <>
2019-08-14 | Optimize the code that copies strings | Bart Van Assche
Using strncpy() to copy strings is suboptimal because strncpy writes a bunch of additional unnecessary null bytes. Use snprintf() instead of strncpy(). An additional advantage of snprintf() is that it guarantees that the output string is '\0'-terminated. This patch is an improvement for commit 32e31c8c5f7b ("Fix string copy compilation warnings"). Cc: Damien Le Moal <> Signed-off-by: Bart Van Assche <> Signed-off-by: Jens Axboe <>
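The snprintf()-based copy can be sketched as a small helper. `copy_string()` is a hypothetical name, not a function from the patch; it relies only on the standard guarantee that snprintf() NUL-terminates its output whenever the buffer size is non-zero.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst (capacity dst_size), always NUL-terminating when
 * dst_size > 0. Returns 1 if src did not fit (or on error), 0 otherwise. */
int copy_string(char *dst, size_t dst_size, const char *src)
{
	int n = snprintf(dst, dst_size, "%s", src);

	return n < 0 || (size_t)n >= dst_size;
}
```

With an 8-byte buffer, copying "io_uring_engine" stores "io_urin" plus the terminator and reports truncation. A `strncpy(dst, src, 8)` call on the same input would fill all 8 bytes and leave the buffer unterminated, while for a short source it would pad the rest of the buffer with null bytes.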
2019-08-14 | zbd: Improve robustness of unit tests | Bart Van Assche
Give up if creation of the null_blk instance fails. Signed-off-by: Bart Van Assche <> Signed-off-by: Jens Axboe <>
2019-08-14 | zbd: Declare local functions 'static' | Bart Van Assche
This patch fixes two sparse warnings. Signed-off-by: Bart Van Assche <> Signed-off-by: Jens Axboe <>
2019-08-08 | t/zbd: Fix I/O bytes rounding errors | Shin'ichiro Kawasaki
When fio reports write bytes or read bytes, it rounds the number, using units such as MiB or KiB, to fit it within a limited number of digits. This introduces rounding errors in the reported bytes and sometimes causes failures in test case #17 of test-zbd-support, which reports incorrect total I/O bytes when both write bytes and read bytes are rounded up. To avoid the rounding error, increase the number of digits from the default value of 4 to 10 to keep precision. For example, a number reported as "256MiB" will be reported as "267911168B" with this change. Signed-off-by: Shin'ichiro Kawasaki <> Signed-off-by: Jens Axboe <>
2019-08-05 | Add tests from t/ to the Windows installer | Rebecca Cran
Signed-off-by: Jens Axboe <>
2019-08-03 | nbd: Remove copy and paste error in example | Richard W.M. Jones
Fixes a copy and paste error introduced in commit d643a1e29d ("engines: Add Network Block Device (NBD) support using libnbd."). Thanks: Sitsofe Wheeler Signed-off-by: Richard W.M. Jones <> Signed-off-by: Jens Axboe <>
2019-08-03 | engines/splice: remove buggy ->mem_align set | Jens Axboe
Two things wrong here: 1) We align buffers by default, so no need for splice to do anything extra. 2) ->mem_align is not a true/false setting, it's the alignment itself. Hence the current setting to 1 is just buggy. Fixes: Reported-by: Sitsofe Wheeler <> Signed-off-by: Jens Axboe <>
2019-08-02 | nbd: Document the NBD-specific uri parameter | Richard W.M. Jones
Signed-off-by: Richard W.M. Jones <> Signed-off-by: Jens Axboe <>
2019-08-02 | parse: bump max value pairs supported from 24 to 32 | Jens Axboe
The recent addition of the nbd engine overflowed what we support. Fixes: d643a1e29d31 ("engines: Add Network Block Device (NBD) support using libnbd") Signed-off-by: Jens Axboe <>
2019-08-02 | engines: Add Network Block Device (NBD) support using libnbd. | Richard W.M. Jones
This commit adds a new engine for testing Network Block Devices directly. It requires libnbd ( To see how to test nbdkit or qemu-nbd read examples/nbd.fio. Signed-off-by: Richard W.M. Jones <> Signed-off-by: Jens Axboe <>
2019-07-31 | smalloc: cleanup firstfree() | Jens Axboe
Signed-off-by: Jens Axboe <>
2019-07-31 | smalloc: fix compiler warning on Windows | Vincent Fu
firstfree() triggers a warning from the Windows compiler used by AppVeyor because it doesn't return a value if the for loop iterates to completion. This patch resolves the compiler warning. AppVeyor Windows build log: Signed-off-by: Jens Axboe <>
2019-07-31 | Remove unused fio_assert() | Jens Axboe
Signed-off-by: Jens Axboe <>
2019-07-31Merge branch 'smalloc-gc' of Axboe
* 'smalloc-gc' of
  smalloc: fix garbage collection problem
  t/stest: make the test more challenging
  smalloc: print debug info on oom error
2019-07-31 | smalloc: fix garbage collection problem | Vincent Fu
If a large request arrives when pool->next_non_full points to empty space that is insufficient to satisfy the request, pool->next_non_full will be inappropriately advanced when the free space is followed by lines of fully allocated space. The free space originally pointed to by pool->next_non_full will be unavailable unless a subsequent sfree() call frees allocated space above it. Resolve this issue by advancing pool->next_non_full only outside the search loop and only when it points to fully allocated space.
2019-07-31 | t/stest: make the test more challenging | Vincent Fu
Add large smalloc requests to the sfree phase of the test. This exposes a smalloc garbage collection issue.
2019-07-31 | smalloc: print debug info on oom error | Vincent Fu
Provide more details about the request and the state of the memory pools when smalloc encounters an oom situation.