filesetup: clear O_RDWR flag for verify_only write workloads If verify_only is set we don't need to open the file with the O_RDWR flag for write workloads. So we should clear this flag. This will help when the file is on a read-only file system. Fixes: https://github.com/axboe/fio/issues/1681 Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
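The access-mode decision described in this commit can be sketched as follows. This is a minimal illustration, not fio's actual code: the helper name pick_access_mode() and its two boolean parameters are hypothetical, and the fallback to O_RDONLY is an assumption about what clearing O_RDWR amounts to.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of fio): choose the access mode for open().
 * A write workload normally needs O_RDWR, but a verify_only run only reads
 * data back, so a read-only open is assumed to be sufficient.
 */
static int pick_access_mode(bool is_write_workload, bool verify_only)
{
	if (is_write_workload && !verify_only)
		return O_RDWR;

	/* Read workloads and verify_only write workloads open read-only. */
	return O_RDONLY;
}
```

With this logic a verify_only write job can open a file on a read-only file system, where an O_RDWR open would fail.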
filesetup: better handle non-uniform distributions When we have a random workload with multiple files, randrepeat=0, and a non-uniform random distribution, the offsets touched will follow the same sequence for all files: $ ./fio --name=test --nrfiles=2 --randrepeat=0 --filesize=16k \ --debug=io --rw=randread --norandommap \ --random_distribution=normal:50 | grep complete: io 23042 complete: io_u 0x55bd982f6000: off=0x2000,len=0x1000,ddir=0,file=test.0.0 io 23042 complete: io_u 0x55bd982f6000: off=0x2000,len=0x1000,ddir=0,file=test.0.1 io 23042 complete: io_u 0x55bd982f6000: off=0x3000,len=0x1000,ddir=0,file=test.0.0 io 23042 complete: io_u 0x55bd982f6000: off=0x3000,len=0x1000,ddir=0,file=test.0.1 io 23042 complete: io_u 0x55bd982f6000: off=0x2000,len=0x1000,ddir=0,file=test.0.0 io 23042 complete: io_u 0x55bd982f6000: off=0x2000,len=0x1000,ddir=0,file=test.0.1 io 23042 complete: io_u 0x55bd982f6000: off=0x3000,len=0x1000,ddir=0,file=test.0.0 io 23042 complete: io_u 0x55bd982f6000: off=0x3000,len=0x1000,ddir=0,file=test.0.1 Notice that the blocks touched for test.0.0 and test.0.1 follow the same sequence. This patch allows the sequence of offsets touched to differ between files by always involving the filename in the seed used for each file. The randrepeat setting will still be respected as it is involved in determining the value for td->rand_seeds[FIO_RAND_BLOCK_OFF]. 
With the patch applied the above invocation produces output like: io 23022 complete: io_u 0x55ed2cd2c000: off=0x2000,len=0x1000,ddir=0,file=test.0.0 io 23022 complete: io_u 0x55ed2cd2c000: off=0x2000,len=0x1000,ddir=0,file=test.0.1 io 23022 complete: io_u 0x55ed2cd2c000: off=0x3000,len=0x1000,ddir=0,file=test.0.0 io 23022 complete: io_u 0x55ed2cd2c000: off=0x2000,len=0x1000,ddir=0,file=test.0.1 io 23022 complete: io_u 0x55ed2cd2c000: off=0x2000,len=0x1000,ddir=0,file=test.0.0 io 23022 complete: io_u 0x55ed2cd2c000: off=0x3000,len=0x1000,ddir=0,file=test.0.1 io 23022 complete: io_u 0x55ed2cd2c000: off=0x1000,len=0x1000,ddir=0,file=test.0.0 io 23022 complete: io_u 0x55ed2cd2c000: off=0x2000,len=0x1000,ddir=0,file=test.0.1 Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
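The idea of involving the filename in each file's seed can be sketched as below. This is an illustration under stated assumptions: the FNV-1a hash and the XOR mixing step are stand-ins chosen for brevity, and per_file_seed() is a hypothetical name. The commit does not say which hash or mixing function fio uses.

```c
#include <stdint.h>

/* 64-bit FNV-1a string hash, used here only as a stand-in hash. */
static uint64_t fnv1a64(const char *s)
{
	uint64_t h = 14695981039346656037ULL;	/* FNV offset basis */

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 1099511628211ULL;		/* FNV prime */
	}
	return h;
}

/*
 * Hypothetical helper: derive a per-file seed from the job-wide seed
 * (which already reflects randrepeat) and the file name.
 */
static uint64_t per_file_seed(uint64_t base_seed, const char *file_name)
{
	return base_seed ^ fnv1a64(file_name);
}
```

Because the base seed is still an input, a repeatable base seed yields repeatable per-file seeds, while files with different names get different offset sequences.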
fio: replace malloc+memset with calloc Clean up the code base by replacing malloc+memset with calloc. This patch was generated from the Coccinelle script below. The script below is inspired by similar scripts used elsewhere: https://lore.kernel.org/linux-btrfs/cover.1443546000.git.silvio.fricke@gmail.com/ https://github.com/coccinelle/coccinellery/blob/master/simple_kzalloc/simple_kzalloc1.cocci @@ expression x,y; statement s; type T; @@ -x = malloc(y * sizeof(T)); +x = calloc(y, sizeof(T)); ( if (!x) s | if (x == NULL) s | ) -memset(x, 0, y * sizeof(T)); @@ expression x,y,z; statement s; @@ -x = malloc(y * sizeof(z)); +x = calloc(y, sizeof(z)); ( if (!x) s | if (x == NULL) s | ) -memset(x, 0, y * sizeof(z)); @@ expression e,x; statement s; @@ -x = malloc(e); +x = calloc(1, e); ( if (!x) s | if (x == NULL) s | ) -memset(x, 0, e); Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
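For readers unfamiliar with the pattern, the before and after forms matched by the first rule of the script look roughly like this. The struct and function names are made up for the illustration and do not come from the fio tree.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative type only; not a fio structure. */
struct item {
	int id;
	double weight;
};

/* Before: allocate, check, then zero in a separate step. */
static struct item *alloc_items_old(size_t n)
{
	struct item *p = malloc(n * sizeof(struct item));

	if (!p)
		return NULL;
	memset(p, 0, n * sizeof(struct item));
	return p;
}

/* After: calloc() allocates and zero-fills in a single call. */
static struct item *alloc_items_new(size_t n)
{
	return calloc(n, sizeof(struct item));
}
```

Both functions return zero-filled memory that the caller releases with free(). The calloc() form is shorter and leaves the size multiplication to the library.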
fio: add fdp support for io_uring_cmd nvme engine Add support for NVMe TP4146 Flexible Data Placement, allowing placement identifiers in write commands. The user can enable this with the new "fdp=1" parameter for fio's io_uring_cmd ioengine. By default, the fio jobs will cycle through all the namespace's available placement identifiers for write commands. The user can limit which placement identifiers can be used with an additional parameter, "fdp_pli=<list,>", which can be used to separate write intensive jobs from less intensive ones. Setting up your namespace for FDP is outside the scope of 'fio', so this assumes the namespace is already properly configured for the mode. Link: https://lore.kernel.org/fio/CAKi7+wfX-eaUD5pky5cJ824uCzsQ4sPYMZdp3AuCUZOA1TQrYw@mail.gmail.com/T/#m056018eb07229bed00d4e589f9760b2a2aa009fc Based-on-a-patch-by: Ankit Kumar <ankit.kumar@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> [Vincent: fold in sfree fix from Ankit] Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
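The default behaviour of cycling through the available placement identifiers can be pictured as a simple round-robin over a list. The sketch below assumes that model. The struct pli_cycle type and next_placement_id() are hypothetical names for illustration and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state: the usable placement identifiers and a cursor. */
struct pli_cycle {
	const uint16_t *ids;	/* all identifiers, or the fdp_pli subset */
	size_t count;		/* number of entries in ids, must be > 0 */
	size_t next;		/* index of the identifier to hand out next */
};

/* Return the identifier for the next write and advance the cursor. */
static uint16_t next_placement_id(struct pli_cycle *c)
{
	uint16_t id = c->ids[c->next];

	c->next = (c->next + 1) % c->count;
	return id;
}
```

Restricting a job with fdp_pli would, in this model, simply mean building ids from the user-supplied list instead of from every identifier the namespace reports.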
filesetup: don't skip flags for trim workloads Fio has not been setting O_DIRECT, O_SYNC, O_DSYNC, and O_CREAT for workloads that include trim commands. Stop doing this and actually set these flags when requested for workloads that include trim commands. Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
fio: add FIO_RO_NEEDS_RW_OPEN ioengine flag Some oddball cases like sg/bsg require devices to be opened for writing in order to do read commands. So fio has been opening character devices in rw mode for read workloads. However, nvme generic character devices do not need (and may refuse) a writeable open for read workloads. So instead of always opening character devices in rw mode, open devices in rw mode for read workloads only if the ioengine has the FIO_RO_NEEDS_RW_OPEN flag. Link: https://lore.kernel.org/fio/20230203123421.126720-1-joshi.k@samsung.com/ Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
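The per-engine decision can be sketched as a flag test. The flag name comes from the commit, but the bit value used here (EXAMPLE_RO_NEEDS_RW_OPEN) and the helper open_mode_for_read() are placeholders for illustration, since the commit message does not give the real definitions.

```c
#include <fcntl.h>

/* Placeholder bit for the sketch; fio's real flag value is not shown here. */
#define EXAMPLE_RO_NEEDS_RW_OPEN	(1u << 0)

/*
 * Hypothetical helper: pick the access mode for a read workload on a
 * character device, based on the ioengine's flags.
 */
static int open_mode_for_read(unsigned int engine_flags)
{
	/* Engines such as sg need a writable open even to issue reads. */
	if (engine_flags & EXAMPLE_RO_NEEDS_RW_OPEN)
		return O_RDWR;

	/* Everything else, e.g. nvme generic char devices, opens read-only. */
	return O_RDONLY;
}
```

Moving the decision from the device type to an engine flag keeps the sg/bsg behaviour while letting other engines use a read-only open.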
Revert "Fix multithread issues when operating on a single shared file" This reverts commit acbda87c34c743ff2d9e125d9539bcfbbf49eb75. This commit introduced a lot of unintended consequences for create_serialize=0. The aim of the commit can be accomplished with a combination of filesize and io_size. Fixes: https://github.com/axboe/fio/issues/1442 Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
filesetup: use correct random seed for non-uniform distributions The index in the random seed array for generating offsets is FIO_RAND_BLOCK_OFF. So this is the index that should be used to find the random seed when fio generates offsets following the Zipf, Pareto, and Gaussian distributions. The previous index 4 actually corresponds to FIO_RAND_MIX_OFF. This change means that the default sequences of non-uniform random offsets generated before and after this patch will differ. So users relying on the repeatability of I/O patterns will have new repeatable patterns after this change. Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Fix multithread issues when operating on a single shared file When nrfiles=1, numjobs>1 and create_serialize=0, multiple threads try to create the single shared file in parallel. If the file already exists but has an incorrect size, then multiple threads delete and create it at the same time. When all of this happens in parallel, there is a chance that the file can end up the incorrect size (the chance increases as numjobs increases). These changes handle the corner case described above by having a single thread create/extend the file prior to running all of the threads in parallel. By doing this step early, when setup_files() is called later, it should no longer need to create or extend the file, avoiding the race condition. The user still needs to set a fallocate option other than 'none' or the file will end up 0 bytes in size and the race condition will still occur. It would be simple to add a ftruncate() to the code to force this, but that would override the user's choice of fallocate options. Signed-off-by: Chris Weber <weberc@netapp.com>
filesetup: create zbd_info before jumping to done label For a thread that has zonemode == ZONE_MODE_ZBD set, the zbd code requires that each file (for that thread) has a valid f->zbd_info pointer. This intent was further clarified by commit 5ddf46d0b2df ("zbd: change some f->zbd_info conditionals to asserts"). The zbd info pointer is set by zbd_init_files(), either by creating a new zbd_info struct, or by increasing the refcount of an existing zbd_info. A zbd_info struct contains the in memory state of the zones, including e.g. each zone's wp and zone capacity. Normally, zbd_init_files() is always called, even for read only workloads. However, in the case where a read iolog was supplied, setup_files() currently jumps to the done label before zbd_init_files() has been called. Even for a read only workload, zbd_adjust_block() will do things such as checking if the read I/O is below the wp (unless td->o.read_beyond_wp is enabled). In order to be able to do this comparison, we need a valid zbd_info. There is no reason why the zbd code should treat a read only workload differently from a read iolog workload. (E.g. the wp for the zones might have changed since the read iolog was recorded.) If the user for some reason wants to disregard the wp check during a read iolog workload, the td->o.read_beyond_wp option can be used, just like in the regular read only workload case. Move the read iolog check and the matching "goto done" after the call to zbd_init_files(). This way, we treat a read iolog workload similarly to a regular read only workload, while avoiding an assertion failure in zbd_setup_files() (which is called after the done label). 
Reported-by: Shane Moore <shane.moore@wdc.com> Suggested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Tested-by: Shane Moore <shane.moore@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20211202094153.8381-1-Niklas.Cassel@wdc.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
zbd: support 'z' suffix for zone granularity Allow users to pass some options with zone granularity which is natural for ZBD workloads. This is nifty for writing quick tests and when firmware guys change zone sizes. Converted options are io_size= offset= offset_increment= size= zoneskip= Example: rw=write numjobs=2 offset=1z offset_increment=10z size=5z io_size=6z Thread 1 will write zones 1, 2, 3, 4, 5, 1. Thread 2 will write zones 11, 12, 13, 14, 15, 11. Note: zonemode=strided doesn't create ZBD zone structure but requires value recalculation. This is why 2 functions are split. Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
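The zone sequence in the example can be reproduced with a little arithmetic. The sketch below assumes sequential writes that wrap back to the job's first zone once size= zones have been written. The struct zone_job type and nth_zone_written() are hypothetical names, not fio code.

```c
#include <stdint.h>

/* Hypothetical job description; every field is a count of zones. */
struct zone_job {
	uint64_t offset_z;		/* offset=1z */
	uint64_t offset_increment_z;	/* offset_increment=10z */
	uint64_t size_z;		/* size=5z */
};

/*
 * Zone touched by the n-th zone-sized write (n starts at 0) of the job
 * with the given index (job_index starts at 0), assuming the write
 * position wraps to the job's first zone after size_z zones.
 */
static uint64_t nth_zone_written(const struct zone_job *j,
				 unsigned int job_index, uint64_t n)
{
	uint64_t start = j->offset_z +
			 (uint64_t)job_index * j->offset_increment_z;

	return start + (n % j->size_z);
}
```

With offset=1z, offset_increment=10z, size=5z and io_size=6z, evaluating n = 0..5 gives zones 1, 2, 3, 4, 5, 1 for the first job and 11, 12, 13, 14, 15, 11 for the second, matching the commit message.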
filesetup: add engine's io_ops to prepopulate file with data In some cases (e.g. engine marked as diskless) files are not laid out. If the first job is a read job, results are higher than expected (because the reads return zero pages). Each engine should provide a function to prepopulate the file with data to avoid this situation. Signed-off-by: Łukasz Stolarczuk <lukasz.stolarczuk@intel.com>
distibutions: Extend flexibility of non-uniform random distributions This change affects options random_distribution and file_service_type. For the pareto, zipf and gauss distributions a concept of `center` is implemented. It allows fixing in place the value that is most likely to be accessed. Example: fio --randseed=1 --ioengine=libaio --rw=randwrite --nrfiles=16 --bs=4k \ --size=256m --allow_file_create=1 --write_iolog=log.txt \ --file_service_type=gauss:10:0.1 --filename_format=object.\$filenum --name=x cat log.txt |grep write |cut -f 1 -d " " |sort |uniq -c | sort -n | \ sed "s/[.]/ /" | while read a b c; do echo $c $b $a; done |sort -n 0 object 13429 1 object 17928 2 object 14724 3 object 7845 4 object 2476 5 object 468 6 object 44 7 object 3 12 object 24 13 object 318 14 object 1795 15 object 6482 Signed-off-by: Adam Kupczyk <akupczyk@redhat.com>