verify: fix integer sizes in verify state file nofiles and depth are 32-bit integers. So we shouldn't use 64-bit conversion functions and casts. The current code actually works fine on little-endian platforms since the conversion is a noop but this is broken on big-endian platforms. Fixes: 94a6e1bb ("Fix verify state for multiple files") Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
verify: fix potential overflow before widen vc->hdr_num and len are both 32 bits wide and their product will be a 32-bit result. So any overflow will be lost. Cast hdr_num to unsigned long long so that nothing is lost if the product overflows a 32-bit integer. This fixes the following issue reported by Coverity. ** CID 486274: Integer handling issues (OVERFLOW_BEFORE_WIDEN) /verify.c: 347 in log_verify_failure() ________________________________________________________________________________________________________ *** CID 486274: Integer handling issues (OVERFLOW_BEFORE_WIDEN) /verify.c: 347 in log_verify_failure() 341 uint32_t len; 342 struct thread_data *td = vc->td; 343 344 offset = vc->io_u->verify_offset; 345 if (td->o.verify != VERIFY_PATTERN_NO_HDR) { 346 len = hdr->len; >>> CID 486274: Integer handling issues (OVERFLOW_BEFORE_WIDEN) >>> Potentially overflowing expression "vc->hdr_num * len" with type "unsigned int" (32 bits, unsigned) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type "unsigned long long" (64 bits, unsigned). 347 offset += vc->hdr_num * len; 348 } else { 349 len = vc->io_u->buflen; 350 } 351 352 log_err("%.8s: verify failed at file %s offset %llu, length %u" Fixes: 9c8b90ae ("fix wrong offset for VERIFY_PATTERN_NO_HDR") Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
verify: open state file in binary mode on Windows Make sure we open the verify state file in binary mode to avoid any possible conversion of NL to CR+NL. This is the same fix we did for 1fb215e991d260a128e35d761f6850e8d9e4c333. Fixes: https://github.com/axboe/fio/issues/1631 Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
fio: replace malloc+memset with calloc Clean up the code base by replacing malloc+memset with calloc. This patch was generated from the Coccinelle script below. The script below is inspired by similar scripts used elsewhere: https://lore.kernel.org/linux-btrfs/cover.1443546000.git.silvio.fricke@gmail.com/ https://github.com/coccinelle/coccinellery/blob/master/simple_kzalloc/simple_kzalloc1.cocci @@ expression x,y; statement s; type T; @@ -x = malloc(y * sizeof(T)); +x = calloc(y, sizeof(T)); ( if (!x) s | if (x == NULL) s | ) -memset(x, 0, y * sizeof(T)); @@ expression x,y,z; statement s; @@ -x = malloc(y * sizeof(z)); +x = calloc(y, sizeof(z)); ( if (!x) s | if (x == NULL) s | ) -memset(x, 0, y * sizeof(z)); @@ expression e,x; statement s; @@ -x = malloc(e); +x = calloc(1, e); ( if (!x) s | if (x == NULL) s | ) -memset(x, 0, e); Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Refactor for_each_td() to catch inappropriate td ptr reuse I recently introduced a bug caused by reusing a struct thread_data *td after the end of a for_each_td() loop construct. Link: https://github.com/axboe/fio/pull/1521#issuecomment-1448591102 To prevent others from making this same mistake, this commit refactors for_each_td() so that both the struct thread_data * and the loop index variable are placed inside their own scope for the loop. This will cause any reference to those variables outside the for_each_td() to produce an undeclared identifier error, provided the outer scope doesn't already reuse those same variable names for other code within the routine (which is fine because the scopes are separate). Because C/C++ doesn't let you declare two different variable types within the scope of a for() loop initializer, creating a scope for both struct thread_data * and the loop index required explicitly declaring a scope with a curly brace. This means for_each_td() includes an opening curly brace to create the scope, which means all uses of for_each_td() must now end with an invocation of a new macro named end_for_each() to emit an ending curly brace to match the scope brace created by for_each_td(): for_each_td(td) { while (td->runstate < TD_EXITED) sleep(1); } end_for_each(); The alternative is to end every for_each_td() construct with an inline curly brace, which is off-putting since the implementation of an extra opening curly brace is abstracted in for_each_td(): for_each_td(td) { while (td->runstate < TD_EXITED) sleep(1); }} Most fio logic only declares "struct thread_data *td" and "int i" for use in for_each_td(), which means those declarations will now cause -Wunused-variable warnings since they're not used outside the scope of the refactored for_each_td(). Those declarations have been removed. Implementing this change caught a latent bug in eta.c::calc_thread_status() that accesses the ending value of struct thread_data *td after the end of for_each_td(), now manifesting as a compile error, so working as designed :) Signed-off-by: Adam Horshack (horshack@live.com)
zbd, verify: verify before zone reset for zone_reset_threshold/frequency When zone_reset_threshold and zone_reset_frequency options are specified with zonemode=zbd, it resets zones not full. When verify option is specified on top of that, the zone reset of non-full zones erases data for verify and causes verify failure. Current implementation avoids this scenario by assert. To allow zone_reset_threshold/frequency options together with verify, do verify before the zone reset. When zone reset is required to an open zone with verify data, call get_next_verify() to get io_u for verify read and return it from zbd_adjust_block(). When io_u->file is set, get_next_verify() assumes the io_u is requeued and does nothing. Unset io_u->file to tell get_next_verify() is not requeued. Also modify verify_io_u() to skip rand_seed check when the option zone_reset_frequency is set. When the option is set, random seed is not reset for verify reads in same manner as verify_backlog option, then this check is not valid. Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
verify: fix numberio accounting of experimental verify As for non-experimental verify, numberio is compared between the numbers saved in metadata and written data header. As for experimental verify, the metadata is not available. Instead of numberio in metadata, it refers td->io_issues[] as the numberio value for the comparison. However, td->io_issues[] is used not only for verify reads but also for normal I/Os. It results in comparison with wrong numberio value and verification failure. Fix this issue by adding a new field td->verify_read_issues which counts up number of verify reads. Substitute td->verify_read_issues to io_u->numberio to refer it for the comparison in experimental verify path. Also move td->io_issues[] substitution to io_u->numberio out of populate_verify_io_u() to keep same behavior in non-experimental verify path. Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
verify: use origina offset for verification Requeued IO might have a different offset, since we increment it for retry conditions. Store the original offset for verification purposes. Just like we have io_u->xfer_buf/io_u->buf and io_u->xfer_buflen/io_u->buflen, we really should be treating io_u->offset the same. But that's a major change, so just add this special original offset so we know that verify has it. Currently we treat io_u->offset like we would have io_u->xfer_offset, so this just makes io_u->verify_offset what io_u->offset should be. Signed-off-by: Jens Axboe <axboe@kernel.dk>
fio: Use atomic_load_acquire() and atomic_store_release() where appropriate The write_barrier() in io_completed() and also the read barriers in verify.c are misplaced: the write barrier should occur before the flags update instead of after and the read barriers should occur after the flags read instead of before. Fix this by using atomic_{load_acquire, store_release}() instead of read and write barriers. Signed-off-by: Bart Van Assche <bvanassche@acm.org>
verify: decouple seed generation from buffer fill It is nicer this way and there will be more code in this area with ZBD verification. Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
verify: Fix test to not check for numberio when verify_only is true io->numberio can not be populated when verify_only is true, because do_dry_run() build and complete IOs immediately, so it can not replicate the numberio that was produced when the data was layered on the media. Without this fix, using write_random [write_stress] filename=${FILENAME} size=${FILESIZE} verify_only=${VERIFY_ONLY} readwrite=randwrite bs=4k ioengine=libaio iodepth=32 direct=1 do_verify=1 verify=crc32c 'VERIFY_ONLY=1 FILENAME=/dev/sda1 FILESIZE=1M fio write_random' passes, but 'VERIFY_ONLY=0 FILENAME=/dev/sda1 FILESIZE=1M fio write_random' fails: """verify_only option fails with verify: bad header numberio 1, wanted 0""". The fix addresses the problem by not checking numberio. Fixes #732 Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Optimize the code that copies strings Using strncpy() to copy strings is suboptimal because strncpy writes a bunch of additional unnecessary null bytes. Use snprintf() instead of strncpy(). An additional advantage of snprintf() is that it guarantees that the output string is '\0'-terminated. This patch is an improvement for commit 32e31c8c5f7b ("Fix string copy compilation warnings"). Cc: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
verify: add requested block information to failure trace If verify_interval is not equal to the block size, FIO would verify the data one segment by one segment according to the verify_interval. The offset and length in the verify failure trace are the offset and length of the corrupted segment, not the offset and block size of the IO. On block storage system, we need the offset and length of the IO to investigate data corruption issue. For me, I usually search the offset of the IO in the storage system trace to figure out what happened in the system. Fixes: https://github.com/axboe/fio/issues/709 Signed-off-by: Jens Axboe <axboe@kernel.dk>