git.kernel.dk Git - fio.git/log

Merge branch 'fix-randtrimwrite' of https://github.com/minwooim/fio

* 'fix-randtrimwrite' of https://github.com/minwooim/fio:
io_u: fix offset calculation in randtrimwrite

io_u: fix offset calculation in randtrimwrite

For randtrimwrite, we should issue a trim + write pair, and both
operations should use the same offset.

This works well for cases without the `offset=` option, but not for
cases with it.  With the `offset=` option, it's necessary to subtract
`file_offset`, which is the value of the `offset=` option, when
calculating the offset of the write.

This is a bit confusing because `last_start` is an actual offset that
has already been issued through trim.  However, `last_start` already
includes `file_offset`.  Since we add back `file_offset` later on after
calling `get_next_block` in `get_next_offset`, `last_start` should be
adjusted.
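
The arithmetic can be illustrated with a small Python model. This is only
a sketch of the reasoning in this message, not the code in io_u.c: the two
functions below are hypothetical stand-ins that borrow the names
`get_next_block` and `get_next_offset`, and the offsets are made up.

```python
def get_next_block_randtrimwrite(last_start: int, file_offset: int) -> int:
    """Return the write offset relative to the start of the job's region.

    last_start is the absolute offset of the trim that was just issued,
    so it already includes file_offset (the value of offset=).
    """
    return last_start - file_offset


def get_next_offset(last_start: int, file_offset: int) -> int:
    """Add file_offset back, as the caller does after picking a block."""
    relative = get_next_block_randtrimwrite(last_start, file_offset)
    return relative + file_offset


# Hypothetical job with offset=1M whose trim landed 8 KiB into the region.
file_offset = 1 << 20
trim_offset = file_offset + 8192
write_offset = get_next_offset(trim_offset, file_offset)
```

Without the subtraction, `file_offset` would be counted twice and the write
would land `file_offset` bytes past the trim.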

Signed-off-by: Jungwon Lee <jjung1.lee@samsung.com>
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
[+ updated commit title]

windows: drop nanosleep and clock_gettime

Cygwin and msys2 now provide nanosleep and clock_gettime, so fio no
longer needs to implement them. The presence of our implementations was
triggering build failures:

https://github.com/axboe/fio/actions/runs/15828051168

Since fio no longer provides clock_gettime, stop unconditionally setting
clock_gettime and clock_monotonic to yes on Windows and start detecting
these features at build time. These two features are successfully
detected by our configure script:

https://github.com/vincentkfu/fio/actions/runs/15832278184

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

Merge branch 'fix-random-distribution-parsing-failure' of https://github.com/leonid-kozlov/fio

* 'fix-random-distribution-parsing-failure' of https://github.com/leonid-kozlov/fio:
parse: use minimum delimiter distance

Merge branch 'fix_real_file_size_when_pi_is_enabled' of https://github.com/SuhoSon/fio

* 'fix_real_file_size_when_pi_is_enabled' of https://github.com/SuhoSon/fio:
io_uring: ensure accurate real_file_size setup for full device access with PI enabled

io_uring: ensure accurate real_file_size setup for full device access with PI enabled

Fix real_file_size calculation when PI is enabled

When PI is enabled, the extended LBA (lba_ext) should be used to calculate
real_file_size instead of lba_size. This ensures FIO can access the entire
device area correctly.
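
As a rough illustration of why the choice of LBA size matters, here is a
Python sketch. It assumes that the extended LBA size is the data size plus
the per-block metadata size; the function name, parameters and numbers are
hypothetical and are not taken from the io_uring engine.

```python
def real_file_size(nlba: int, lba_size: int, metadata_size: int,
                   pi_enabled: bool) -> int:
    """Size in bytes that the job can address on the device."""
    # Assumption: with PI enabled each block carries its metadata inline,
    # so one extended LBA is lba_size + metadata_size bytes long.
    lba_ext = lba_size + metadata_size
    return nlba * (lba_ext if pi_enabled else lba_size)


# Hypothetical namespace: 1000 blocks of 4096 B data + 8 B metadata.
with_pi = real_file_size(1000, 4096, 8, pi_enabled=True)
without_pi = real_file_size(1000, 4096, 8, pi_enabled=False)
shortfall = with_pi - without_pi  # bytes left unreachable with lba_size
```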

Signed-off-by: Suho Son <suho.son@samsung.com>

parse: use minimum delimiter distance

Use minimal distance to delimiter to determine option length

The current implementation of opt_len() makes it impossible to
locate the option name in a random_distribution zones list
that combines ':' and ',' chars.
The opt_len() function should try to locate the option name
using all possible delimiters and return the minimal length
instead of returning the first one found.
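
The difference between the two strategies can be sketched in Python. This
is a model of the idea only, not the C code in parse.c; the delimiter
search order and the sample value are assumptions made for illustration.

```python
DELIMITERS = (",", ":")  # assumed search order, for illustration only


def opt_len_first_found(value: str) -> int:
    """Length up to the first delimiter *type* that occurs anywhere."""
    for delim in DELIMITERS:
        pos = value.find(delim)
        if pos != -1:
            return pos
    return len(value)


def opt_len_minimum(value: str) -> int:
    """Length up to whichever delimiter occurs earliest in the string."""
    positions = [value.find(d) for d in DELIMITERS if d in value]
    return min(positions, default=len(value))


# Hypothetical zones list that combines ':' and ','.
value = "zoned:60/10:40/90,zoned:50/50"
old_len = opt_len_first_found(value)  # stops at the ',' far into the value
new_len = opt_len_minimum(value)      # stops at the ':' right after "zoned"
```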

Fixes: https://github.com/axboe/fio/issues/1923

Signed-off-by: Leonid Kozlov <leonid.e.kozlov@gmail.com>

backend: clean up requeued io_u's

When an attempt to queue an io_u returns FIO_Q_BUSY, the io_u is added
to td->io_u_requeues. If the runtime timeout expires with
td->io_u_requeues not empty, the job will not close the relevant
file because its file->references will be non-zero since the requeued
io_u still holds a reference to the file.

This patch discards the contents of td->io_u_requeues during io_u
cleanup which leads to file closure when its last reference is
destroyed. This is relevant for resource-constrained environments.

Suggested-by: Jonghwi Jeong <jongh2.jeong@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

ioengines: clear in-flight bit for FIO_Q_BUSY syncs

If a sync operation is ever requeued after a previous queue attempt
returns FIO_Q_BUSY the assertion checking that the IO_U_F_FLIGHT bit is
not set will fail because this bit is not cleared when the FIO_Q_BUSY
return value is processed.

This patch makes sure that we clear IO_U_F_FLIGHT when the queue attempt
returns FIO_Q_BUSY for sync operations. The counters that are restored
are not defined for sync operations, so we cannot modify them.
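
A toy Python model of the failure mode, for illustration only. The bit
value, the class and the function are hypothetical and much simpler than
the ioengine code; the point is only that a requeued io_u trips the
assertion unless the bit is cleared on the FIO_Q_BUSY path.

```python
IO_U_F_FLIGHT = 1 << 1  # illustrative bit value


class IoU:
    def __init__(self) -> None:
        self.flags = 0


def queue_sync(io_u: IoU, engine_busy: bool, clear_on_busy: bool = True) -> str:
    """Model one queue attempt for a sync operation."""
    assert not (io_u.flags & IO_U_F_FLIGHT), "io_u already marked in flight"
    io_u.flags |= IO_U_F_FLIGHT
    if engine_busy:
        if clear_on_busy:
            # The fix: undo the in-flight marking before the io_u is requeued.
            io_u.flags &= ~IO_U_F_FLIGHT
        return "FIO_Q_BUSY"
    io_u.flags &= ~IO_U_F_FLIGHT  # the sync operation has finished
    return "FIO_Q_COMPLETED"
```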

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/verify: skip crc7 when running checksum tests

The crc7 checksum has a 1/128 chance of not detecting data corruption
when we mangle data written to the device. Skip these tests when testing
the checksum functions to avoid false test failures.
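
The 1/128 figure follows from the checksum width. A short Python sketch of
the arithmetic, assuming mangled data produces a uniformly distributed
checksum (an idealization); the function names are made up for this
example.

```python
def miss_probability(checksum_bits: int) -> float:
    """Chance that corrupted data still matches an n-bit checksum."""
    return 1 / (2 ** checksum_bits)


def false_failure_chance(checksum_bits: int, runs: int) -> float:
    """Chance that at least one of `runs` corruption tests goes undetected."""
    p = miss_probability(checksum_bits)
    return 1 - (1 - p) ** runs


crc7_miss = miss_probability(7)    # 2**7 = 128 possible checksum values
crc32_miss = miss_probability(32)  # negligible by comparison
```

With 100 such test runs the chance of at least one spurious failure is
already above 50% for crc7.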

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

Merge branch 'opt/io_uring-sq-full-check' of https://github.com/calebsander/fio

* 'opt/io_uring-sq-full-check' of https://github.com/calebsander/fio:
engines/io_uring: remove unnecessary SQ full check

engines/io_uring: remove unnecessary SQ full check

fio_ioring_queue() bails out if the SQ's tail + 1 == head. This will
always be false, since 0 <= tail - head <= entries. Probably it was
meant to check whether the SQ is full, i.e. tail == head + entries.
(The head index should be loaded with acquire ordering in that case.)
Checking for a full SQ isn't necessary anyways, as the prior check for
ld->queued == td->o.iodepth already ensures the SQ isn't full.

So remove the unnecessary and misleading tail + 1 == head check.
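
The claim can be checked with a small Python model of the ring indices.
This is a sketch that treats head and tail as free-running unsigned 32-bit
counters; it is not the engine code, and the function names are
hypothetical.

```python
MASK = 0xFFFFFFFF  # head and tail modeled as free-running 32-bit indices


def removed_check(head: int, tail: int) -> bool:
    """The condition that was removed: tail + 1 == head."""
    return ((tail + 1) & MASK) == head


def sq_full(head: int, tail: int, entries: int) -> bool:
    """What a full-ring check would look like: tail == head + entries."""
    return ((tail - head) & MASK) == entries


def removed_check_ever_true(entries: int, heads) -> bool:
    """Try every legal fill level (0..entries) for the given head values."""
    return any(
        removed_check(head, (head + used) & MASK)
        for head in heads
        for used in range(entries + 1)
    )
```

For tail + 1 == head to hold, tail - head would have to be 2^32 - 1, which
is outside the legal range 0 <= tail - head <= entries.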

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>

Add Zhaoxin support to enable tsc_reliable and arch_random features

Signed-off-by: Runa Guo-oc <RunaGuo-oc@zhaoxin.com>
Link: https://lore.kernel.org/r/20250522104032.17519-1-RunaGuo-oc@zhaoxin.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Fio 3.40

Signed-off-by: Jens Axboe <axboe@kernel.dk>

t/verify: add tests to exercise verify_pattern_interval

Add some more tests with oddball intervals and patterns to
validate the verify_pattern_interval option.

Link: https://lore.kernel.org/r/20250508185832.3702-12-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/verify: Windows --output work-around

For a reason I do not understand Cygwin chokes when fio is given an
option like the following:

--output==/home/vincent.fu/fio-vpi/verify-test-20250505-172034/pattern_%o_vpi_10_vi_1024/3001/verify3001.output

This is the resulting error:

pattern: %o, verify_pattern_interval: 129, verify_interval: 1024
Test 3000 FAILED: [Errno 2] No such file or directory: '/home/vincent.fu/fio-vpi/verify-test-20250505-172034/pattern_%o_vpi_129_vi_1024/3000/verify3000.output' 3000
Test 3001 FAILED: [Errno 2] No such file or directory: '/home/vincent.fu/fio-vpi/verify-test-20250505-172034/pattern_%o_vpi_129_vi_1024/3001/verify3001.output' 3001
0 test(s) passed, 2 failed, 0 skipped

It's not the length because other paths are longer. Even if I escape the
% the error still occurs.

Work around this by simply using the filename instead of the entire
path. The job runs with the test directory as the current working
directory, so it's fine to just use the filename only.
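
A minimal Python sketch of the idea, not the actual change to the test
scripts; the helper name and the sample path are hypothetical.

```python
import posixpath


def output_arg(output_path: str) -> str:
    """Build the --output option from the filename alone.

    This relies on the job running with the test directory as its current
    working directory, so the bare filename resolves to the same file.
    """
    return "--output=" + posixpath.basename(output_path)


# Hypothetical path containing the problematic %o directory component.
arg = output_arg("/home/user/run/pattern_%o_vpi_10_vi_1024/3001/verify3001.output")
```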

Link: https://lore.kernel.org/r/20250508185832.3702-11-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/verify: test cases for running pattern and pattern_hdr

Add test cases for the recently added verify_pattern_interval and
verify=pattern_hdr options to our pre-existing sets of tests.

In some cases verify failures do not produce non-zero return values
unless verify_fatal is set. Add verify_fatal=1 to affected test cases.

Link: https://lore.kernel.org/r/20250508185832.3702-10-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

verify: add verify_pattern_interval option

It can be useful to fill a device by writing each LBA with its offset
using a verify_pattern value that includes the %o format specifier.
However, it is slow to do this 512B at a time for a large device. This
patch adds the verify_pattern_interval option to enable fio to write
such a data pattern using larger transfer sizes. With this option set,
fio will update the offset every verify_pattern_interval bytes. For a
4096-byte block with verify_pattern=%o the first 512 bytes will be 0,
the second 512 bytes will be 512, the third 512 bytes will be 1024, etc
when verify_pattern_interval=512.

When verify_pattern does not include %o the verify_pattern_interval
option will still re-start the verify pattern every N bytes.
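
The 4096-byte example can be written out as a short Python sketch. It only
lists the value that %o would take at the start of each interval; that the
value is the block's own offset plus the position within the block is an
assumption here, and the function is hypothetical rather than fio code.

```python
def pattern_offsets(block_offset: int, block_size: int, interval: int) -> list:
    """Offset written at the start of each interval within one block."""
    return [block_offset + pos for pos in range(0, block_size, interval)]


# verify_pattern=%o, verify_pattern_interval=512, one 4096-byte block at 0.
first_block = pattern_offsets(0, 4096, 512)
# Assumed continuation for the following block at offset 4096.
second_block = pattern_offsets(4096, 4096, 512)
```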

Link: https://lore.kernel.org/r/20250508185832.3702-9-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

ci: don't skip verify tests when triggered manually

When our test suite is triggered manually it skips the verify test
script. Change this so that the verify test script is run when the tests
are manually kicked off. This is accomplished by skipping the verify
test suite only when a push or pull request triggers our tests.

With this change we don't have to wait overnight to see the results of
the verify tests.

Link: https://lore.kernel.org/r/20250508185832.3702-8-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

ci: for nightly verify tests use all checksum methods

By default t/verify.py runs tests with only a small selection of
checksum methods. Run the test script with an option so that our nightly
tests use all of the checksum methods.

Link: https://lore.kernel.org/r/20250508185832.3702-7-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/fiotestcommon: lengthen timeout for longer tests

For GitHub-hosted Windows runners t/verify.py runs for longer than 30
min when it tests all checksum methods. Make the timeout for SUCCESS_LONG
tests 1hr to accommodate these tests.

Link: https://lore.kernel.org/r/20250508185832.3702-6-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

verify: omit verify type mismatch error message for pattern verify

When we are carrying out pattern verification without a header we should
not print out an error message about a verify type mismatch because
there is no header on the media specifying the verify type.

Link: https://lore.kernel.org/r/20250508185832.3702-5-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

verify: make verify_pattern=%o thread safe

When verify_async is set, multiple threads create instances of
the verify_pattern in the same buffer. This is not a problem if the
pattern is a constant value. However, if the pattern depends on the
offset then pattern verification will produce false failures.

This patch changes verify_io_u_pattern() to allocate brand new
buffers to instantiate the verify_pattern when verify_pattern contains
the %o format specifier and verify_async is set. With each thread having
its own pattern buffer they will no longer interfere with each other.

Failing use case example:

root@localhost:~/fio-dev/fio-canonical# ./fio --name=test --ioengine=io_uring --iodepth=32 --filesize=256k --verify_pattern=%o --verify_async=2 --rw=write
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=32
fio-3.39-44-g19d9
Starting 1 process
fio: got pattern '10', wanted '70'. Bad bits 2
fio: bad pattern block offset 41
pattern: verify failed at file test.0.0 offset 200704, length 4096 (requested block: offset=200704, length=4096, flags=84)

test: (groupid=0, jobs=1): err= 0: pid=3972816: Tue Apr 22 16:59:55 2025
  read: IOPS=21.3k, BW=83.3MiB/s (87.4MB/s)(256KiB/3msec)
    slat (nsec): min=3989, max=35487, avg=5319.23, stdev=4033.94
    clat (usec): min=43, max=2008, avg=1058.31, stdev=513.19
     lat (usec): min=47, max=2012, avg=1063.63, stdev=512.36
    clat percentiles (usec):
     |  1.00th=[   44],  5.00th=[  178], 10.00th=[  347], 20.00th=[  570],
     | 30.00th=[  775], 40.00th=[  930], 50.00th=[ 1074], 60.00th=[ 1237],
     | 70.00th=[ 1385], 80.00th=[ 1532], 90.00th=[ 1729], 95.00th=[ 1811],
     | 99.00th=[ 2008], 99.50th=[ 2008], 99.90th=[ 2008], 99.95th=[ 2008],
     | 99.99th=[ 2008]
  write: IOPS=32.0k, BW=125MiB/s (131MB/s)(256KiB/2msec); 0 zone resets
    slat (usec): min=2, max=328, avg= 9.50, stdev=40.93
    clat (usec): min=250, max=585, avg=382.74, stdev=86.17
     lat (usec): min=262, max=588, avg=392.24, stdev=88.98
    clat percentiles (usec):
     |  1.00th=[  251],  5.00th=[  262], 10.00th=[  293], 20.00th=[  306],
     | 30.00th=[  330], 40.00th=[  334], 50.00th=[  359], 60.00th=[  404],
     | 70.00th=[  416], 80.00th=[  474], 90.00th=[  494], 95.00th=[  545],
     | 99.00th=[  586], 99.50th=[  586], 99.90th=[  586], 99.95th=[  586],
     | 99.99th=[  586]
  lat (usec)   : 50=0.78%, 100=0.78%, 250=2.34%, 500=50.78%, 750=9.38%
  lat (usec)   : 1000=8.59%
  lat (msec)   : 2=26.56%, 4=0.78%
  cpu          : usr=0.00%, sys=50.00%, ctx=54, majf=0, minf=15
  IO depths    : 1=1.6%, 2=3.1%, 4=6.2%, 8=12.5%, 16=25.0%, 32=51.6%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=97.1%, 8=0.0%, 16=0.0%, 32=2.9%, 64=0.0%, >=64=0.0%
     issued rwts: total=64,64,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=83.3MiB/s (87.4MB/s), 83.3MiB/s-83.3MiB/s (87.4MB/s-87.4MB/s), io=256KiB (262kB), run=3-3msec
  WRITE: bw=125MiB/s (131MB/s), 125MiB/s-125MiB/s (131MB/s-131MB/s), io=256KiB (262kB), run=2-2msec

Disk stats (read/write):
  sda: ios=0/0, sectors=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

Link: https://lore.kernel.org/r/20250508185832.3702-4-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

verify: fix verify_offset when used with pattern_hdr

When using verify=pattern_hdr with verify_offset, we cannot reuse the
data already in a buffer because some of the pattern bytes have been
swapped with the verify header. Trying to reuse the buffer contents just
results in the header being swapped back and forth between the
verify_offset location and the beginning of the verify_interval.

Fix this by avoiding reuse of buffer contents.

Failing test case example:
root@localhost:~/fio-dev/fio-canonical# ./fio --name=verify --filesize=8192 --verify_offset=1024 --verify_pattern=1 --rw=write
verify: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.39-38-gf18c
Starting 1 process
fio: got pattern 'ca', wanted '01'. Bad bits 5
fio: bad pattern block offset 1024
pattern: verify failed at file verify.0.0 offset 4096, length 4096 (requested block: offset=4096, length=4096, flags=88)
fio: pid=3624903, err=84/file:io_u.c:2258, func=io_u_sync_complete, error=Invalid or incomplete multibyte or wide character

verify: (groupid=0, jobs=1): err=84 (file:io_u.c:2258, func=io_u_sync_complete, error=Invalid or incomplete multibyte or wide character): pid=3624903: Fri Apr 11 22:56:01 2025
  read: IOPS=2000, BW=8000KiB/s (8192kB/s)(8192B/1msec)
    clat (nsec): min=2185, max=15807, avg=8996.00, stdev=9632.21
     lat (nsec): min=2280, max=16120, avg=9200.00, stdev=9786.36
    clat percentiles (nsec):
     |  1.00th=[ 2192],  5.00th=[ 2192], 10.00th=[ 2192], 20.00th=[ 2192],
     | 30.00th=[ 2192], 40.00th=[ 2192], 50.00th=[ 2192], 60.00th=[15808],
     | 70.00th=[15808], 80.00th=[15808], 90.00th=[15808], 95.00th=[15808],
     | 99.00th=[15808], 99.50th=[15808], 99.90th=[15808], 99.95th=[15808],
     | 99.99th=[15808]
  write: IOPS=2000, BW=8000KiB/s (8192kB/s)(8192B/1msec); 0 zone resets
    clat (nsec): min=7910, max=82940, avg=45425.00, stdev=53054.22
     lat (usec): min=9, max=100, avg=55.03, stdev=64.49
    clat percentiles (nsec):
     |  1.00th=[ 7904],  5.00th=[ 7904], 10.00th=[ 7904], 20.00th=[ 7904],
     | 30.00th=[ 7904], 40.00th=[ 7904], 50.00th=[ 7904], 60.00th=[82432],
     | 70.00th=[82432], 80.00th=[82432], 90.00th=[82432], 95.00th=[82432],
     | 99.00th=[82432], 99.50th=[82432], 99.90th=[82432], 99.95th=[82432],
     | 99.99th=[82432]
  lat (usec)   : 4=25.00%, 10=25.00%, 20=25.00%, 100=25.00%
  cpu          : usr=0.00%, sys=0.00%, ctx=0, majf=0, minf=19
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=20.0%, 4=80.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=2,2,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=8000KiB/s (8192kB/s), 8000KiB/s-8000KiB/s (8192kB/s-8192kB/s), io=8192B (8192B), run=1-1msec
  WRITE: bw=8000KiB/s (8192kB/s), 8000KiB/s-8000KiB/s (8192kB/s-8192kB/s), io=8192B (8192B), run=1-1msec

Disk stats (read/write):
  sda: ios=0/0, sectors=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

Link: https://lore.kernel.org/r/20250508185832.3702-3-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

verify: add verify mode for a pattern with header

Add a verify=pattern_hdr option. Previously this was only available when
verify_pattern was set and verify= was omitted. Add a means to
explicitly select this verification mode.

This is useful in the t/verify.py test script because it's troublesome
to have some jobs with a verify= option and omit this option in other
jobs when we want to test pattern verification with a header.

Link: https://lore.kernel.org/r/20250508185832.3702-2-vincent.fu@samsung.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

configure: Fix libnfs cflags and libs

libnfs version 16 requires the gnutls library. Without specifying at
least -lgnutls, builds fail:

LINK fio
/usr/bin/ld: /usr/lib/gcc/x86_64-redhat-linux/15/../../../../lib64/libnfs.so: undefined reference to `gnutls_certificate_set_x509_trust_dir'
/usr/bin/ld: /usr/lib/gcc/x86_64-redhat-linux/15/../../../../lib64/libnfs.so: undefined reference to `gnutls_transport_set_int2'
...

Modify the configure script to add cflags and library options for gnutls
to correctly build libnfs engine.

Also make sure that the CI installs the gnutls library header files.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250514070754.38281-1-dlemoal@kernel.org
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

fio_sem, diskutil: introduce fio_shared_sem and use it for diskutil lock

To report disk utilization statistics, fio allocates struct disk_util
objects for each disk of the I/O target files. These struct disk_util
objects are allocated via smalloc() from the fio shared memory, allowing
them to be allocated or freed by any fio process. The disk_util objects
are managed with the global list disk_list and are all freed by the fio
main process at the end of fio_backend() through
disk_util_prune_entries().

The struct disk_util contains the field struct fio_sem *lock which
points to the lock object. This object is allocated via mmap() with the
MAP_SHARED flag. When allocated by the parent process, this lock can be
shared across fio processes after forking child processes; however, the
parent process can neither access nor free it when the lock object is
allocated by a child process. This lock object is also freed by the fio
main process along with the disk_util objects.

The commit c492cb1a9b1c ("iolog: fix disk stats issue") modified
init_iolog() to call init_disk_util(), which allocates the fio_sem lock
object for the struct disk_util. This commit enabled the disk
utilization report feature for the files recorded in the I/O replay
files. However, since the added init_disk_util() call is executed in
the child fio job processes, the disk_util object and the fio_sem lock
objects are allocated by the child processes. While allocation by child
process is acceptable for the disk_util object, it causes segmentation
faults when the fio_sem lock objects are freed at the end of the fio
main process.

The segmentation fault can be recreated by running two jobs: one job
does regular I/O to a system disk file. The other job replays I/O to
another disk. It can be triggered using the following command and the
files:

  $ sudo fio recreate.fio

  recreate.fio
  ============
  [dev_a]
  filename=test_file
  rw=read
  size=4096
  [dev_b]
  read_iolog=recreate.iolog

  recreate.iolog
  ==============
  fio version 3 iolog
  0 /dev/nullb0 add
  1 /dev/nullb0 open
  2 /dev/nullb0 read 0 4096
  3 /dev/nullb0 close

This fault happens only when fio jobs are handled as processes. When
the thread=1 option is specified, fio jobs are threads and reside within
a single memory space, so the fault is not observed.

To prevent the segmentation fault, allocate the fio_sem lock object not
by mmap() but by smalloc(). This ensures the fio main process can free
the fio_sem lock objects along with the disk_util objects. To achieve
this, introduce two new helper functions, fio_shared_sem_init() and
fio_shared_sem_remove(). These functions behave exactly the same as the
existing functions fio_sem_init() and fio_sem_remove() except for the
memory allocation method.

Do not implement the new functions in the existing source file
fio_sem.c, because it implements fio_sem_init() and fio_sem_remove()
which are used for smalloc() implementation. If the two new functions
were implemented in fio_sem.c, it would create circular references and
cause build failures. Instead, add a new source file fio_shared_sem.c to
implement the new functions.

Fixes: c492cb1a9b1c ("iolog: fix disk stats issue")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20250513031837.74780-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

man: Fix recover_zbd_write_error option description

Add a missing ".TP" statement after the zone_reset_frequency option
description to get the correct description for the option
recover_zbd_write_error as a new paragraph.

Fixes: 650c4ad385cf ("zbd: add the recover_zbd_write_error option")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20250509020250.483865-1-dlemoal@kernel.org
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

windows: fix pread/pwrite

The pread and pwrite functions for Windows posix emulation never actually seek
to the requested offset. Fix this so that the psync ioengine works correctly on
Windows.
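
For illustration, a positioned read built from seek + read might look like
the Python sketch below. It is not the Windows emulation code in fio;
restoring the previous file position mirrors POSIX pread() semantics and is
an assumption about what a faithful emulation would do.

```python
import os
import tempfile


def emulated_pread(fd: int, length: int, offset: int) -> bytes:
    """Read `length` bytes at `offset` without moving the file position."""
    saved = os.lseek(fd, 0, os.SEEK_CUR)
    try:
        # The step the buggy emulation skipped: actually seek to the offset.
        os.lseek(fd, offset, os.SEEK_SET)
        return os.read(fd, length)
    finally:
        os.lseek(fd, saved, os.SEEK_SET)


def demo() -> bytes:
    """Write 32 distinct bytes to a temp file and read 4 of them at offset 8."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, bytes(range(32)))
        return emulated_pread(fd, 4, 8)
    finally:
        os.close(fd)
        os.remove(path)
```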

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

Merge branch 'continue_on_error_fix_up' of https://github.com/kawasaki/fio

* 'continue_on_error_fix_up' of https://github.com/kawasaki/fio:
oslib: blkzoned: add missing blkzoned_move_zone_wp() stub

HOWTO: fix bad whitespace

Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge branch 'patch-1' of https://github.com/avrittrohwer/fio

* 'patch-1' of https://github.com/avrittrohwer/fio:
Document expected filename format for s3 http engine.

Document expected filename format for s3 http engine.

This caused me some headache, let's add some details on how fio expects
the http_host and file to be formatted.

Signed-off-by: Avritt Rohwer avritt@google.com

oslib: blkzoned: add missing blkzoned_move_zone_wp() stub

Commit 4175f4dbec5d ("oslib: blkzoned: add blkzoned_move_zone_wp()
helper function") introduced the new function for Linux, but did not add
its stub function for OSes that lack the blkzoned feature. This caused
build failures on MacOS and Windows. Add the missing stub to fix it.

Fixes: 4175f4dbec5d ("oslib: blkzoned: add blkzoned_move_zone_wp() helper function")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

t/zbd: add run-tests-against-scsi_debug

The newly added test cases in t/zbd/test-zbd-support 72 and 73 require
an error injection feature. They can be run with either null_blk or
scsi_debug, which provide the error injection feature. To run the test
cases easily with scsi_debug, add another script
run-tests-against-scsi_debug. It simply prepares a zoned scsi_debug
device and runs the two test cases.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20250425052148.126788-9-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

t/zbd: add the test cases to confirm continue_on_error option

When the continue_on_error option is specified, it is expected that
write workloads do not stop even when bad blocks cause IO errors and
leave partially written data. Add test cases to confirm it with
zonemode=zbd and the new option recover_zbd_write_error.

To create the IO errors as expected, use null_blk and scsi_debug.
In particular, use null_blk and its parameters badblocks and
badblocks_once, which control the blocks that cause the IO error.
Introduce helper functions which confirm that the parameters for bad
blocks are available, and set up the bad blocks.

Using the helper functions, add four new test cases. The first two cases
confirm that fio recovers after an IO error with a partial write.
One test case covers the psync IO engine. The other test case covers
async IO with the libaio engine, a high queue depth and multiple jobs.
The last two test cases confirm the case where another IO error happens
during the recovery process from the IO error.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20250425052148.126788-8-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

t/zbd: set badblocks related parameters in run-tests-against-nullb

As a preparation to add test cases which check that the
continue_on_error option and the recover_zbd_write_error option work
when bad blocks cause IO errors, set additional null_blk parameters
badblocks_once and badblocks_partial_io. These parameters were added in
Linux kernel version 6.15-rc1 and allow a more realistic scenario of
write failures on zoned block devices. The former parameter makes the
specified badblocks recover after the first write, and the latter
parameter leaves partially written data on the device.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20250425052148.126788-7-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

zbd: add the recover_zbd_write_error option

When the continue_on_error option is specified, it is expected that the
workload continues to run when non-critical errors happen. However,
write workloads with the zonemode=zbd option cannot continue after
errors if the failed writes cause partial data writes on the target
device. Such a partial write creates a write pointer gap between the
device and fio, so the next write requests by fio fail due to unaligned
write command errors. This restriction results in undesirable test stops
during long runs on SMR drives which can recover defective sectors.

To allow the write workloads with zonemode=zbd to continue after write
failures with partial data writes, introduce the new option
recover_zbd_write_error. When this option is specified together with the
continue_on_error option, fio checks the write pointer positions of the
write target zones in the error handling step. It then fixes the write
pointer by moving it to the position to which the failed writes would
have moved it. Bump up FIO_SERVER_VER to note that the new option is added.

For that purpose, add a new function zbd_recover_write_error(). Call it
from zbd_queue_io() for sync IO engines, and from io_completed() for
async IO engines. Modify zbd_queue_io() to pass the pointer to the
status so that zbd_recover_write_error() can modify the status to ignore
the errors. Add three fields to struct fio_zone_info. The two new fields
writes_in_flight and max_write_error_offset track the status of
in-flight writes at the time of the write error, so that the write
pointer positions can be fixed after the in-flight writes have
completed. The field fixing_zone_wp records that a write pointer fix is
ongoing, which prohibits new writes from being issued to the zone.

When the failed write is synchronous, the write pointer fix is done by
writing the remaining data of the failed write. This keeps the verify
patterns written to the device, so verify works together with the
recover_zbd_write_error option. When the failed write is
asynchronous, other in-flight writes fail together. In this case, fio
waits for all in-flight writes to complete and then fixes the write
pointer. The verify data of the failed writes are then lost and verify
does not work. Check that the recover_zbd_write_error option is not
specified together with a verify workload and an asynchronous IO engine.
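
The synchronous recovery step can be sketched in Python as follows. This
is a simplified model of the description above, not zbd_recover_write_error()
itself; the function name and arguments are hypothetical.

```python
def remaining_write(offset: int, data: bytes, device_wp: int):
    """Return (offset, data) of the write that closes the write pointer gap.

    offset and data describe the failed write; device_wp is the write
    pointer reported by the device after the partial write.
    """
    expected_wp = offset + len(data)  # where the write would have moved it
    if not offset <= device_wp <= expected_wp:
        raise ValueError("write pointer is outside the failed write")
    # Re-issue only the tail that never reached the device, so the data
    # pattern on the media stays what the original write intended.
    return device_wp, data[device_wp - offset:]


# A failed 8-byte write at offset 0 that left the write pointer at 3.
fix_offset, fix_data = remaining_write(0, b"abcdefgh", 3)
```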

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250425052148.126788-6-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

zbd: introduce zbd_move_zone_wp()

As a preparation for continue_on_error option support for zonemode=zbd,
introduce the function zbd_move_zone_wp(). It moves write pointers by
calling blkzoned_move_zone_wp() or move_zone_wp() callback of IO
engines.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250425052148.126788-5-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

engines/libzbc: implement move_zone_wp callback

As a preparation for continue_on_error option support for zonemode=zbd,
implement move_zone_wp() callback for libzbc IO engine.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250425052148.126788-4-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ioengine: add move_zone_wp() callback

As a preparation for continue_on_error option support for zonemode=zbd,
introduce a new callback move_zone_wp() for the IO engines. It moves the
write pointer by writing data in the specified buffer. Also bump up
FIO_IOOPS_VERSION to note that the new callback is added.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250425052148.126788-3-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

oslib: blkzoned: add blkzoned_move_zone_wp() helper function

As a preparation for continue_on_error option support for zonemode=zbd,
introduce a new function blkzoned_move_zone_wp(). It moves the write
pointer by writing data. If a data buffer is provided, call the pwrite()
system call. If a data buffer is not provided, call fallocate() to write
zero data.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250425052148.126788-2-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge branch 'master' of https://github.com/blah325/fio

* 'master' of https://github.com/blah325/fio:
Fix hang on Windows when multiple --client args are present

Fix hang on Windows when multiple --client args are present

The Windows poll function does not clear revents field before it is
populated. As a result, subsequent calls to poll using the same
pollfd reference return with revents set even when there is nothing
available to read. This later results in a hang in recv().
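
A toy Python model of the stale revents problem, for illustration only.
The classes and functions are hypothetical and do not reproduce the
Windows poll emulation; they only show why the field must be cleared on
entry.

```python
from dataclasses import dataclass

POLLIN = 0x0001  # illustrative flag value


@dataclass
class PollFd:
    fd: int
    events: int
    revents: int = 0


def poll_without_clear(fds, readable) -> int:
    """Buggy model: revents left over from an earlier call survives."""
    ready = 0
    for p in fds:
        if p.fd in readable:
            p.revents |= p.events & POLLIN
        if p.revents:
            ready += 1
    return ready


def poll_with_clear(fds, readable) -> int:
    """Fixed model: reset revents before populating it."""
    for p in fds:
        p.revents = 0
    return poll_without_clear(fds, readable)
```

In the buggy model the second call reports the descriptor as readable even
though nothing is pending, which is the state that leads to a blocking
recv().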

Signed-off-by: James Rizzo <james.rizzo@broadcom.com>

t/zbd: add test for the case all write zones have small remainder

The previous commit fixed the unexpected write stop when all write
target zones have small remainder sectors to write. Add a test case to
confirm the fix.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20250414062721.87641-5-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

zbd: finish zone when all random write target zones have small remainder

When a random write target offset points to a zone that is not writable,
zbd_convert_to_write_zone() attempts to convert the write offset to a
different, writable zone. However, the conversion fails when all of the
following conditions are met:

1) the workload has the max_open_zones limit
2) every write target zone, up to the max_open_zones limit, has
remainder sectors smaller than the block size
3) the next random write request targets a zone not in the write target
zone list

In this case, zbd_convert_to_write_zone() cannot open another zone
without exceeding the max_open_zones constraint. Therefore, it does not
convert the write to a different zone and prints the debug message
"did not choose another write zone". This leads to an unexpected stop of
the random write workload.

To prevent the unexpected write stop, finish one of the write target
zones with small remainder sectors. Check if all write target zones have
small remainder, and store the result in the new local boolean variable
all_write_zones_have_small_remainder. When this condition is true,
choose one of the write target zones and finish it. Then return the zone
from zbd_convert_to_write_zone() enabling the write process to continue.
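
The check-and-finish step can be sketched as follows. This is an illustrative model only: Zone and finish_one_if_all_small() are hypothetical names, "finish" is modeled by setting the write pointer to the zone capacity, and only all_write_zones_have_small_remainder is taken from the commit message.

```python
import random
from dataclasses import dataclass


@dataclass(eq=False)  # compare zones by identity, not by field values
class Zone:
    capacity: int  # writable sectors in the zone
    wp: int        # sectors already written

    @property
    def remainder(self):
        return self.capacity - self.wp


def finish_one_if_all_small(write_zones, block_size, rng):
    """If every write target zone has less than block_size left, finish one.

    Returns the finished zone (removed from write_zones), or None when at
    least one zone can still take a full block.
    """
    if not write_zones:
        return None
    all_write_zones_have_small_remainder = all(
        z.remainder < block_size for z in write_zones)
    if not all_write_zones_have_small_remainder:
        return None
    zone = rng.choice(write_zones)
    zone.wp = zone.capacity    # "finish": mark the zone as full
    write_zones.remove(zone)   # frees one slot under max_open_zones
    return zone
```

Once one slot is freed, a zone outside the write target list can be opened without exceeding the limit, which is what lets the workload continue.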

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250414062721.87641-4-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

zbd: factor out zbd_pick_write_zone()

To prepare for the following fix, factor out a part of
zbd_convert_to_write_zone() to the new function zbd_pick_write_zone().
This function randomly chooses a zone in the array of write zones.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250414062721.87641-3-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

zbd: move zone finish operation to zbd_convert_to_write_zone()

Currently, when a write target zone has fewer remainder sectors than
the block size, fio finishes the zone to make the zone inactive (not
open), so that another zone can be open and used as a write target zone.
This zone finish operation is implemented in zbd_adjust_block().
However, this placement is not ideal because zbd_adjust_block() manages
not just write requests but also read and trim requests.

Since the zone finish operation is needed only for write requests,
implement it in zbd_convert_to_write_zone(). While at it, improve the
function comment.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250414062721.87641-2-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ci: add verify-trim.py test script

On GitHub Actions we cannot insert kernel modules, so skip this script
on tests that run with pull requests and after every push. Instead run
this test with our nightly tests that run in a QEMU environment.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/verify-trim.py: superficial test script for verify/trim

Fio can verify trim operations. This script adds some simple test cases
for this feature.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

verify/trim: make trim_backlog_batch work

In order to detect when we are at the beginning of a trim phase we check
io_hist_len and should check that the previous operation was not a
*trim* (instead of not a read). Without this change trim_backlog_batch
will have no effect because after one batch is done, fio will simply
start a new batch because io_hist_len is still a multiple of
trim_backlog and the last operation in a batch was a trim which is not a
read.

For check_get_verify checking against read is appropriate but for
check_get_trim we must check against a trim.

Also we need to decrement the trim_batch count for the first trim
operation we send through.
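
The effect of the two changes can be seen in a toy model. This is a simplified sketch, not fio's check_get_trim(): it assumes every write adds one trim candidate and that trims never change io_hist_len. simulate() and its local names are hypothetical, apart from io_hist_len, trim_backlog and trim_batch.

```python
def simulate(steps, trim_backlog, trim_batch, fixed):
    """Return a string of 'W' (write) and 'T' (trim) for a toy workload."""
    io_hist_len = 0    # writes logged so far; trims do not change it
    trim_entries = 0   # logged writes still available for trimming
    batch_left = 0     # trims remaining in the current batch
    last_op = None
    ops = []
    # The operation type that must NOT precede the start of a new batch.
    blocker = "trim" if fixed else "read"
    for _ in range(steps):
        do_trim = False
        if trim_entries:
            if batch_left:
                batch_left -= 1
                do_trim = True
            elif io_hist_len % trim_backlog == 0 and last_op != blocker:
                batch_left = trim_batch
                if fixed:
                    batch_left -= 1  # count the first trim of the batch
                do_trim = True
        if do_trim:
            trim_entries -= 1
            last_op = "trim"
            ops.append("T")
        else:
            io_hist_len += 1
            trim_entries += 1
            last_op = "write"
            ops.append("W")
    return "".join(ops)
```

In this model, with trim_backlog=4 and trim_batch=2, the fixed variant alternates four writes with two trims. The old check against a read keeps starting new batches until the trim candidates run out.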

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

verify/trim: stop issuing trims if we run out

If we have drained the list of trim operations but its original contents
were fewer than a full batch we should zero out the running batch count
to make sure that we issue another full set of trim_backlog write
operations before considering trims again. Otherwise we will immediately
trim after each subsequent write operation until we have met the batch
size requirement.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

init: error out when readonly is set for a trim/verify workload

Fio may issue trim commands for a verify/trim job. Abort and print an
error message if this type of job is run with the --readonly option.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

trim_verify: include a trim panel in the output

The trim bit in td_ddir is not set when trim_percentage/backlog is
enabled yet fio still issues trim operations. Detect these cases and
produce output describing trim operations if we issued any.

This is similar to the fix (615c794cbf851c994e94fffe8b8f565e64f137a5)
committed for verify_backlog.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

fio: allow trim operations for verify/trim workloads

Fio has the ability to verify trim operations by running a verify
workload and setting the trim_percentage, trim_backlog, and
trim_verify_zero options. Some of the written blocks will then be
trimmed and then read back to see if they are zeroed out.

This patch changes fio_ro_check to allow trim operations when fio is
running a verify/trim workload.

Fixes: 196ccc44 ("fio.h: also check trim operations in fio_ro_check")
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

ci: set kvm permissions for GHA QEMU jobs

The image used by GitHub-hosted runners recently changed the default kvm device
permissions recently rendering us no longer able to start guest VMs. The
error message:

Could not access KVM kernel module: Permission denied
qemu-system-x86_64: failed to initialize kvm: Permission denied

Working run: https://github.com/fiotestbot/fio/actions/runs/14186873066
Failed run: https://github.com/fiotestbot/fio/actions/runs/14211189491

Explicitly give the GitHub Actions runner user permission to access the
/dev/kvm device following the guide at

https://github.blog/changelog/2024-04-02-github-actions-hardware-accelerated-android-virtualization-now-available/

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

Merge branch 'iouring-spellingfix-2025-03-18' of https://github.com/proact-de/fio

* 'iouring-spellingfix-2025-03-18' of https://github.com/proact-de/fio:
Fix spelling error in IO uring engine.

Fix spelling error in IO uring engine.

Merge branch 'dfs' of https://github.com/henglgh/fio

* 'dfs' of https://github.com/henglgh/fio:
dfs: fix fail to load dfs engine

dfs: fix fail to load dfs engine

The dfs engine mistakenly resolved a symbol named 'dfs' in the
dlsym(dlhandle, engine_lib) call; this symbol points to a global
variable in the dfs.c file. Rename this variable to 'daosfs' so that
the lookup resolves the 'ioengine' symbol correctly.

Fixes: https://github.com/axboe/fio/issues/1874
Signed-off-by: fugen <fugen@cstor.cn>

ci: add nightly test for verify

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/run-fio-test: add t/verify.py

Add the verify test script to our test runner.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/fiotestcommon: add a success pattern for long tests

On Windows the verify test script runs for longer than 10 minutes. Add a
success pattern that accommodates this test.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/verify.py: Add verify test script

The script contains three sets of tests. The first set of tests
exercises fio's decision making about checking the verify header's
sequence number and random seed. The second set of tests is aimed at
making sure that the checksum functions can detect data mismatches. The
final set of tests exercises fio's verify-related options such as
verify_backlog and verify_interval.

This test script includes two checksum lists. The first list (default)
contains a subset of the checksum methods offered by fio, whereas the
second list contains the full set of checksum methods. The second, full
set can be run by specifying -c or --complete. Testing all of the
checksum methods can take a long time.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

t/fiotestlib: display stderr size when it is not empty but should be

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/fiotestlib: improve JSON decoding

Sometimes error/informational messages appear at the end of the JSON
data. Try to parse as JSON only the text between the first { and the
last }.
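
A minimal sketch of this approach is shown below. It is not the code in t/fiotestlib; decode_json_output() is a hypothetical helper name.

```python
import json


def decode_json_output(text):
    """Parse the JSON object embedded in text, ignoring surrounding noise."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in output")
    # Keep only the span from the first '{' to the last '}'.
    return json.loads(text[start:end + 1])
```

Messages before the first { or after the last } are dropped, so leading warnings and trailing informational lines no longer break decoding. A message that itself contains braces would still defeat this heuristic.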

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/fiotestcommon: do not require nvmecdev argument for Requirements

Enable Requirements checking for test suites that do not have an
nvmecdev argument. macOS does not support NUMA placement so we need to
skip some tests on that platform when the test suite does not have an
nvmecdev argument. This will be used in an upcoming patch for a set of
verify tests.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

verify: adjust fio_offset_overlap_risk to include randommap

Currently we use a list to log I/O history if:
* randommap is enabled and fio manages to allocate memory for it.
* there are no offset modifiers with any jobs.
In any other scenario we use an RB tree to handle offset overlaps,
which disables header seed checks for them.

This commit expands fio_offset_overlap_risk() such that it covers
file_randommap() cases.
For random workloads, these are the possible scenarios after this change:

|-----------------------------------------------------------------------|
|                 |         norandommap=0              |  norandommap=1 |
|-----------------|------------------------------------|----------------|
| softrandommap=0 |        list (No change)            |    RB tree     |
|                 | (fio was able to allocate memory)  |  (No change)   |
|-----------------|------------------------------------|----------------|
| softrandommap=1 |      RB tree (Now always)          |    RB tree     |
|                 |Even if fio was able to allocate mem|  (No change)   |
|-----------------------------------------------------------------------|

With randommap enabled and softrandommap=1 we now always use an RB tree,
even when fio is able to allocate memory for random map. In this case
verify header seed check will be disabled. If users want to check header
seed they can either disable softrandommap or explicitly enable
verify_header_seed.

Effectively this changes randommap from being a per-file property to
per-job property.
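
The resulting decision can be written down as a small function. This is only a sketch of the scenarios described in this commit message, not fio_offset_overlap_risk() itself: history_structure() and its parameters are hypothetical names, and the offset modifier condition is taken from the list of conditions above.

```python
def history_structure(norandommap, softrandommap, offset_modifier=False):
    """Return "list" or "rbtree" for the I/O history of a job.

    Any setting that allows the same offset to be written more than once
    is treated as an overlap risk and selects the RB tree.
    """
    overlap_risk = norandommap or softrandommap or offset_modifier
    return "rbtree" if overlap_risk else "list"
```

When this returns "rbtree", header seed checks are disabled unless verify_header_seed is explicitly enabled.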

This also fixes rand seed mismatch issues that have been observed when
multiple files are used, such as for the below mentioned configuration.

[global]
do_verify=1
verify=md5
direct=1
[multi_file]
rw=readwrite
directory=.
nrfiles=2
size=32K

Here is the truncated log with debug=verify flag, and an extra log when
the seed gets generated as well as the mismatch.

verify   368109 file ./multi_file.0.1 seed 46386204153304124 offset=0, length=4096
verify   368109 file ./multi_file.0.0 seed 9852480210356360750 offset=0, length=4096
verify   368109 file ./multi_file.0.1 seed 4726550845720924880 offset=4096, length=4096
verify: bad header rand_seed 9852480210356360750, wanted 46386204153304124 at file ./multi_file.0.0 offset 0, length 4096 (requested block: offset=0, length=4096)

Earlier the I/O entries were getting logged in an RB tree, as we were
relying on file_randommap(), which was false for sequential workloads.
In RB tree, files are prioritized first and then the offset. Thus during
the verify phase the I/O entries are removed from tree in order of file
and then offset which is not how it was originally written. With the new
checks, for sequential workload we now store the entries in the list
instead of RB tree.
Even for a sequential workload, if the user happens to specify
norandommap or softrandommap, then I/Os will be stored in an RB tree.
However in this case header seed checks will be disabled.

fixes #740
fixes #746
fixes #844
fixes #1538

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

verify: fix verify issue with offset modifiers

Offset modifiers such as rw=readwrite:8 or rw=write:4K can create
overlaps. For these cases use RB tree instead of list to log the I/O
entries.

Add a helper function fio_offset_overlap_risk() to decide whether to log
the I/O entry in an RB tree or a list.

Disable header seed verification if there are offset modifiers, unless
it's explicitly enabled.

fixes #1503

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

init: fixup verify_offset option

Verify offset should swap verification header within the verify interval.
If this is not the case return error. Update the doc. accordingly.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

backend: fix verify issue during readwrite

In readwrite mode if specified io_size > size, offsets can overlap.
This will result in verify errors. Add check to handle this case.

Fixes: d782b76f ("Don break too early in readwrite mode")

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

verify: disable write sequence checks with norandommap and iodepth > 1

With norandommap for async I/O engines specifying I/O depth > 1, it is
possible that two or more writes with the same offset are queued at once.
When fio tries to verify the block, it may find a numberio mismatch
because the writes did not land on the media in the order that they were
queued. Avoid these spurious failures by disabling sequence number
checking. Users will still be able to enable sequence number checking
if they explicitly set the verify_write_sequence option.

fio -name=verify -ioengine=libaio -rw=randwrite -verify=sha512 -direct=1 \
-iodepth=32 -filesize=16M -bs=512 -norandommap=1 -debug=io,verify

Below is the truncated log for the above command demonstrating the issue.
This includes extra log entries when write sequence number is saved and
retrieved.

set: io_u->numberio=28489, off=0x5f2400
queue: io_u 0x5b8039e30d40: off=0x5f2400,len=0x200,ddir=1,file=verify.0.0
set: io_u->numberio=28574, off=0x5f2400
iolog: overlap 6235136/512, 6235136/512
queue: io_u 0x5b8039e75500: off=0x5f2400,len=0x200,ddir=1,file=verify.0.0
complete: io_u 0x5b8039e75500: off=0x5f2400,len=0x200,ddir=1,file=verify.0.0
complete: io_u 0x5b8039e30d40: off=0x5f2400,len=0x200,ddir=1,file=verify.0.0

retrieve: io_u->numberio=28574, off=0x5f2400
queue: io_u 0x5b8039e1db40: off=0x5f2400,len=0x200,ddir=0,file=verify.0.0

bad header numberio 28489, wanted 28574 at file verify.0.0 offset 6235136, length 512 (requested block: offset=6235136, length=512)
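
The log above can be reproduced with a small model. This sketch assumes that the write completing last is the one that ends up on the media, which the log suggests but does not prove. find_mismatches() is a hypothetical helper, not fio code.

```python
def find_mismatches(issued, completed):
    """Compare what verify expects with what is on the media.

    issued:    list of (numberio, offset) in queue order
    completed: list of numberio in completion order
    Returns a list of (offset, found_numberio, wanted_numberio).
    """
    offset_of = {numberio: offset for numberio, offset in issued}
    # Verify expects the most recently queued write for each offset.
    wanted = {}
    for numberio, offset in issued:
        wanted[offset] = numberio
    # Assumption: the media keeps whichever write completed last.
    on_media = {}
    for numberio in completed:
        on_media[offset_of[numberio]] = numberio
    return [(off, on_media[off], want)
            for off, want in wanted.items()
            if on_media.get(off) != want]
```

Feeding in the two writes from the log (numberio 28489 and 28574 at offset 0x5f2400) with the completion order reversed yields the same "bad header numberio 28489, wanted 28574" pairing.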

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

verify: fix verify issues with norandommap

When norandommap is enabled, fio logs the I/O entries in an RB tree. This
is to account for offset overlaps and overwrites. Then during verify
phase, the I/O entries are picked from the top and in this case the
smallest offset is verified first and so on. This creates a mismatch
during the header verification as the seed generated at the time of read
differs from what was logged during write. Skip seed verification in
this scenario.
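
A toy model shows why replaying seeds in a different order fails. This is not fio's seed generation: it assumes one 64-bit seed is drawn per block from a generator that is re-created for the verify phase, and write_phase() and verify_phase() are hypothetical names.

```python
import random


def write_phase(write_order, randseed):
    """Draw one seed per written block and store it as the block header."""
    rng = random.Random(randseed)
    return {offset: rng.getrandbits(64) for offset in write_order}


def verify_phase(headers, verify_order, randseed):
    """Regenerate seeds in verify order and return the offsets that differ."""
    rng = random.Random(randseed)
    bad = []
    for offset in verify_order:
        wanted = rng.getrandbits(64)
        if headers[offset] != wanted:
            bad.append(offset)
    return bad
```

Verifying in the original write order finds no mismatch. Verifying in sorted offset order, as an RB tree walk would, pairs each block with a seed that was generated for a different block.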

fixes #1756

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

verify: header seed check for read only workloads

For read jobs, users should have the option to verify header seeds at a
later point of time. Currently for read jobs header seeds are not
generated.

Consider the below mentioned write followed by read workloads. Here fio
should allow header seed verification.

fio --name=test --filesize=16k --rw=randwrite --verify=md5
fio --name=test --filesize=16k --rw=randread --verify=md5 --verify_header_seed=1

However there are other scenarios where header seed verification will
fail. These include:
* randrepeat is set to false, leading to different seed across runs.
* randseed is different across write and read workloads.
* Read workload is changed from sequential to random or vice versa
   across runs.
* Read workloads run in the same invocation as write, i.e. a write job
   followed by a stonewall read job. Header seed verification will fail
   because random seeds vary between jobs. Refer to t/jobs/t0029.fio.

If verify_header_seed is explicitly enabled, fio will verify header seed
for the workload.

This reverts part of commit mentioned below
Fixes: def41e55 ("verify: decouple seed generation from buffer fill")

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

verify: disable header seed check for verify_only jobs

For the invoked verify_only job, header seed can match only if it
exactly matches the original write job. This means either randrepeat
should be true, or we must use the same randseed which was used with the
original write job. After write the verify_only workload shouldn't be
changed from sequential to random or vice versa.

Considering these constraints disable verify_header_seed for verify_only
jobs. Users will still be able to enable header seed checking if they
explicitly set the verify_header_seed option.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

verify: enable header seed check for 100% write jobs

There are 3 modes where verify can be performed: write, read and
readwrite. The existing readwrite condition prohibits header seed check
for write or read workloads. For write workloads, there shouldn't be any
extra limitation that triggers header seed mismatch which cannot be
triggered with readwrite workloads. Hence modify this condition to only
disable verify header seed checks for read workload.

The subsequent patches fix header seed mismatch issues.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

verify: disable header seed checking instead of overwriting it

The existing header seed is overwritten if zone reset frequency is set or
if verify backlog is enabled. Disable verify header seed check for these
scenarios, unless there is an explicit request to enable it.

Note: There is no fio behavior change intended by this patch.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

fio: add verify_header_seed option

Add a new option to disable the verify header seed check. The header
seed check is enabled by default.
There have been numerous issues observed with header seed mismatch. Hence
this allows the end user to disable this check and proceed with the checksum
verification. This is similar to option verify_write_sequence, which
allows the capability to disable write sequence number check.

Update the documentation accordingly.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

init: write sequence behavior change for verify_only mode

Change the behavior for verify_only mode to not disable
verify_write_sequence unless it's explicitly enabled.

Update the fio doc. accordingly.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

verify: add missing client/server support for verify_write_sequence

Ensure that we convert verify_write_sequence option for client/server.

Fixes: 2dd80ee4 ("fio: Support verify_write_sequence")

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

filesetup: remove unnecessary check

If read_iolog_file is set, the goto statement moves it beyond this
point. So remove this redundant check.

Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>

Merge branch 'update-docs-for-compare' of https://github.com/minwooim/fio

* 'update-docs-for-compare' of https://github.com/minwooim/fio:
docs: update docs for verify_mode=compare of io_uring_cmd

docs: update docs for verify_mode=compare of io_uring_cmd

Add the missing limitation of verify_mode=compare in the io_uring_cmd
ioengine. Data verification with the NVMe COMPARE command was
introduced in commit 6170d92a61da ("io_uring: Support Compare command
for verification"), and the documentation should have stated that the
COMPARE command is supported only for data pattern verification.

Two more options must accompany verify_mode=compare:
verify_mode=compare
verify=pattern
verify_pattern=<pattern>

Signed-off-by: Minwoo Im <minwoo.im@samsung.com>

Re-introduce RWF_DONTCACHE

This used to be called RWF_UNCACHED, and it never made it upstream. But
as of the 6.14 kernel, RWF_DONTCACHE exists, and provides the same
guarantees that the older RWF_UNCACHED did - it's applied to buffered
IO, and any page cache instantiated for this read or write will be
dropped on IO completion. Any data already in cache will remain in cache
and will not cause IO to be issued.

This adds support for the io_uring and sync IO engines.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge branch 'fix-DNDEBUG' of https://github.com/dandedrick/fio

* 'fix-DNDEBUG' of https://github.com/dandedrick/fio:
t/read-to-pipe-async: fix -DNDEBUG support

t/read-to-pipe-async: fix -DNDEBUG support

When NDEBUG is defined this was trying to call log_err but it didn't
include the header file or link against the relevant .o file. This will
now fully build with -DNDEBUG.

Signed-off-by: Dan Dedrick <dan.dedrick@gmail.com>

Fio 3.39

Signed-off-by: Jens Axboe <axboe@kernel.dk>

crc/sha512: fix missing finalize part of sha512 hash

The sha512 implementation was broken, make an attempt at actually
making it work...

Signed-off-by: Jens Axboe <axboe@kernel.dk>

libaio: Add vectored io support

This adds support for doing vectored I/O to libaio.
Instead of using pread/pwrite calls, this allows libaio to use
preadv/pwritev calls, which use iovecs.
option: libaio_vectored=1
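
For readers unfamiliar with vectored I/O, the sketch below shows the general idea with the plain pwritev/preadv system calls. It does not use libaio and is not the engine code. It assumes a platform where Python exposes os.pwritev and os.preadv (for example Linux), and vectored_roundtrip() is a hypothetical helper.

```python
import os
import tempfile


def vectored_roundtrip(chunks):
    """Write several buffers with one call, then read them back with one call."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        # One system call writes all buffers back to back at offset 0.
        written = os.pwritev(fd, chunks, 0)
        # One system call scatters the data into separate buffers.
        bufs = [bytearray(len(c)) for c in chunks]
        nread = os.preadv(fd, bufs, 0)
        return written, nread, [bytes(b) for b in bufs]
```

Each element of the buffer list corresponds to one iovec entry, so a single request can cover memory that is not contiguous.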

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/f0d66512e3df3d2142910e996c42389c21232d12.1739608655.git.ritesh.list@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge branch 't0036-0037' of https://github.com/kawasaki/fio

* 't0036-0037' of https://github.com/kawasaki/fio:
t/jobs/t0036,0037: add tests for verify_state_save/load options

t/jobs/t0036,0037: add tests for verify_state_save/load options

Add tests to exercise the options verify_state_save and
verify_state_load. The test case t0036 does it for psync I/O engine and
single file, and t0037 does it for libaio I/O engine and multiple files.
The test cases confirm no regression by the recent commits relevant to
the options.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

t/io_uring: fix passthrough fixed buffer support

A previous commit changed t/io_uring to register a single region for
all of the registered buffers, and while it updated non-passthrough IO
for that change, the passthrough path still sets a specific buffer
index. This makes passthrough with fixed buffers fail for any buffer
but the first one, as it's asking for a buffer that doesn't exist rather
than index the first one. That causes -EFAULT completions.

Ensure the buf_index is set to 0 for passthrough as well.

Fixes: 21f461f8c2b9 ("t/io_uring: register single buffer for whole IO region")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

t/verify-state.c: adjust to verify state format change

The previous commit modified the format of verify state files. To adjust
to the change, add support of the new field "max_no_comps_per_file" and
pass it to __thread_io_list_sz(). Also check the version number against
the new version 4.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20250213052510.1474423-4-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

verify: double number of writes to save completions

When asynchronous write workloads have the verify option and the
verify_state_save option enabled, fio saves the last write completions
in the verify state file. The number of writes saved in the verify state
file is equal to the I/O depth specified by the iodepth option. Let N
represent the iodepth. Subsequent verify workloads with the
verify_state_load option read the saved verify state file. As to the
last N verify I/Os, fio checks if they have corresponding write
completions saved by the write workloads. If not, fio does not perform
the verify I/Os since the verify data was not written. This approach
prevents the false-positive verify failures. Refer to the following two
commits for the detail:

ca09be4b1a8e ("Add support for verify triggers and verify state saving")
94a6e1bb4e7d ("Fix verify state for multiple files")

However, when the write workloads are asynchronous, the completion order
of writes can differ from their issue order. In such cases, a write
named "W_before_last_N", issued before the last N writes, may complete
slowly, fall within the last N completions, and be saved in the verify
state file. Conversely, one of the last issued N writes, named
"W_in_last_N", may complete early, and not be saved in the verify state
file. When the subsequent verify workload reads the verify state file
and runs, fio prepares the I/O for the "W_in_last_N" at some point. Fio
tries to find its offset in the verify state file, but that is not
found. Then fio stops the verify workload. This unexpected verify stop
confuses users.

To reduce the chance of the unexpected verify stop due to fluctuations
in the write completion order, increase the number of write completions
saved in the verify state file. Since this issue only occurs with
asynchronous writes, increase the number only for asynchronous
workloads. Add a new field "last_write_comp_depth" to struct thread_data
to store the number. To adjust the size of the verify state file, add a
new field "max_no_comps_per_file" to struct thread_io_list. This field
reflects the number of writes to be saved for each file and allows the
state file size to be calculated. These changes affect the verify state
file format, so bump the verify state header version from 3 to 4.
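
The effect of saving more completions can be illustrated with a small model. This sketch is not fio's verify state code: it identifies writes by their issue index, and missing_from_state() is a hypothetical helper.

```python
from collections import deque


def missing_from_state(issue_order, completion_order, iodepth, saved_depth):
    """Return the last-issued writes that are absent from the saved state.

    Only the final saved_depth completions are kept, as a stand-in for the
    completions written to the verify state file.
    """
    saved = set(deque(completion_order, maxlen=saved_depth))
    last_issued = issue_order[-iodepth:]
    return [w for w in last_issued if w not in saved]
```

Take an iodepth of 4 and ten writes where write 5 (a "W_before_last_N") completes late and write 7 (a "W_in_last_N") completes early. Keeping 4 completions loses write 7, whereas keeping 8 retains it.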

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20250213052510.1474423-3-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

verify: print message when verify_state_should_stop() returns false

When verify workloads with the verify_state_load option target an offset
that is not saved in the verify state file, verify_state_should_stop()
function returns false and it stops the verify workloads. This stop
happens without a message to users, making it difficult
for them to understand why they stopped. Print a message to inform users
why the verify workloads stopped.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20250213052510.1474423-2-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

t/run-fio-tests: add client/server test script

Add the client/server test script to the global test harness.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/client-server: basic client/server test script

Currently there are only two sets of test cases:

- check that fio correctly handles the global options dictionary in the
  JSON output with one or more servers with job files with and without
  global sections.
- check that the [s,c]lat_percentiles options work for the "All clients"
  summary data.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

ci: install kill binary for Debian platforms

kill is a bash built-in and the binary is not installed by default on
Debian. Install the binary because we need to use Python's subprocess
to run the kill command to stop the fio servers for the client/server
test script.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

t/fiotestlib: improve JSON decoding

Instead of skipping up to a fixed number of lines when trying to decode
JSON output, just skip everything until the first opening {.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>

client: separate global options from multiple servers

In a single client invocation, currently all global options from all
servers accumulate in a single global options dictionary. This is fine
if the invocation creates only a single server session, but when a
client connects to multiple servers, this is a mess because there is no
way to assign different global options to different jobs.

This patch instead creates a global options array when a client creates
multiple server sessions. Each array element contains the global options
for a different server session and is identified by the hostname and
port. When the client connects to only a single server, the global
options object remains a dictionary in order to maintain backward
compatibility.
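
The shape of the output can be sketched as follows. The key names "hostname", "port" and "options" are assumptions made for illustration and may not match fio's JSON output; build_global_options() is a hypothetical helper.

```python
def build_global_options(sessions):
    """Return a dict for one server session, or a list for several.

    sessions: list of dicts with (assumed) keys "hostname", "port", "options".
    """
    if len(sessions) == 1:
        # Single server: keep the plain dictionary for backward compatibility.
        return dict(sessions[0]["options"])
    # Multiple servers: one element per session, tagged with its origin.
    return [
        {"hostname": s["hostname"], "port": s["port"],
         "options": dict(s["options"])}
        for s in sessions
    ]
```

Consumers of the JSON output therefore need to handle both a dictionary and an array, depending on how many servers the client connected to.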

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>