stat: log out both average and max over the window Add the option log_window_value, an alias of log_max_value, which reports the average, the max, or both values. Retain backward compatibility by allowing the values 0 and 1 to select the average and the max respectively. Existing log formats are unchanged when only the average or only the max is reported. Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com> Link: https://lore.kernel.org/r/20240125110124.55137-2-ankit.kumar@samsung.com Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
client/server: enable per_job_logs option On the client side log files were being overwritten when per_job_logs was set to false because of the flags used when log files were opened. Add per_job_logs to the on-wire protocol so that the client can adjust the flags and open files in append mode when per_job_logs is set to false. Fixes: https://github.com/axboe/fio/issues/1032 Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
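The flag adjustment described above can be pictured with a small sketch. The helper names `sketch_log_mode` and `sketch_open_log` are hypothetical, and the use of a truncating mode when per_job_logs is true is an assumption; only the append behaviour for per_job_logs=false comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: pick the fopen() mode for a client-side log file.
 * Append when several jobs share one log file (per_job_logs is false);
 * truncating otherwise is an assumption made for this sketch.
 */
static const char *sketch_log_mode(bool per_job_logs)
{
	return per_job_logs ? "w" : "a";
}

/* Hypothetical helper: open the log file with the mode chosen above. */
static FILE *sketch_open_log(const char *path, bool per_job_logs)
{
	return fopen(path, sketch_log_mode(per_job_logs));
}
```

With append mode, each job that writes to the shared file adds to what earlier jobs wrote instead of replacing it.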
Record job start time to fix time pain points Add a new key in the json per-job output, job_start, that records the job start time obtained via a call to clock_gettime using the clock_id specified by the new job_start_clock_id option. This allows times of fio jobs and log entries to be compared/ordered against each other and against other system events recorded against the same clock_id. Add a note to the documentation for group_reporting about how there are several per-job values for which only the first job's value is recorded in the json output format when group_reporting is enabled. Fixes #1544 Signed-off-by: Nick Neumann <nick@pcpartpicker.com>
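As a rough illustration of the job_start idea, the sketch below reads a clock selected by its ID with clock_gettime() and converts the result to milliseconds. The helper name `sketch_job_start_msec` and the millisecond unit are assumptions for illustration; they are not taken from fio.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: read the clock identified by clock_id and return
 * the time in milliseconds, or 0 if the clock cannot be read.
 */
static uint64_t sketch_job_start_msec(clockid_t clock_id)
{
	struct timespec ts;

	if (clock_gettime(clock_id, &ts) != 0)
		return 0;

	return (uint64_t)ts.tv_sec * 1000ULL +
	       (uint64_t)ts.tv_nsec / 1000000ULL;
}
```

Two timestamps taken against the same clock ID, for example CLOCK_MONOTONIC, can be ordered against each other, which is the property the commit relies on.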
fio: replace malloc+memset with calloc

Clean up the code base by replacing malloc+memset with calloc. This patch was generated from the Coccinelle script below.

The script below is inspired by similar scripts used elsewhere:

https://lore.kernel.org/linux-btrfs/cover.1443546000.git.silvio.fricke@gmail.com/
https://github.com/coccinelle/coccinellery/blob/master/simple_kzalloc/simple_kzalloc1.cocci

    @@
    expression x,y;
    statement s;
    type T;
    @@

    -x = malloc(y * sizeof(T));
    +x = calloc(y, sizeof(T));
    (
    if (!x) s
    |
    if (x == NULL) s
    |
    )
    -memset(x, 0, y * sizeof(T));

    @@
    expression x,y,z;
    statement s;
    @@

    -x = malloc(y * sizeof(z));
    +x = calloc(y, sizeof(z));
    (
    if (!x) s
    |
    if (x == NULL) s
    |
    )
    -memset(x, 0, y * sizeof(z));

    @@
    expression e,x;
    statement s;
    @@

    -x = malloc(e);
    +x = calloc(1, e);
    (
    if (!x) s
    |
    if (x == NULL) s
    |
    )
    -memset(x, 0, e);

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
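For readers unfamiliar with the transformation, here is a minimal before/after sketch in plain C. The function names are hypothetical and do not appear in fio.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Before: allocate, check the result, then zero the memory by hand. */
static int *sketch_alloc_old(size_t n)
{
	int *p = malloc(n * sizeof(int));

	if (!p)
		return NULL;
	memset(p, 0, n * sizeof(int));
	return p;
}

/* After: calloc() returns memory that is already zeroed. */
static int *sketch_alloc_new(size_t n)
{
	return calloc(n, sizeof(int));
}
```

Both functions return a zero-filled array of n ints or NULL on failure; the calloc form is shorter and leaves the size multiplication to the allocator.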
cconv: Support pattern buffers of arbitrary size Change the thread_options_pack structure to support pattern buffers of arbitrary size by using a flexible array at the end of the structure to store both the verify_pattern and the buffer_pattern in that order. In this way, only the actual bytes of each pattern will be sent over the wire and patterns of an arbitrary size can be used with the packed structure. In order to determine the required size of the structure the function thread_options_pack_size() is introduced which returns the total number of bytes required for a given thread_options instance. The two callsites of convert_thread_options_to_net() are then converted to dynamically allocate a pdu of the appropriate size and the two callsites of convert_thread_options_to_cpu() are modified to take the size of the received data to prevent buffer overruns. Also add specific testing of this feature in fio_test_cconv(). Since this changes the client/server protocol, the FIO_SERVER_VER is bumped. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
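The layout described above can be sketched with a much smaller structure. Everything in this example (`struct sketch_pattern_pack` and its helpers) is hypothetical and only mirrors the idea: a flexible array holds the verify pattern followed by the buffer pattern, a size function computes the bytes needed, and the receiver checks the lengths against the bytes actually received. Byte-order conversion of the length fields is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packed layout: two lengths, then both patterns back to back. */
struct sketch_pattern_pack {
	uint32_t verify_len;
	uint32_t buffer_len;
	uint8_t patterns[];	/* verify pattern first, then buffer pattern */
};

/* Total bytes needed for the header plus both patterns. */
static size_t sketch_pack_size(uint32_t verify_len, uint32_t buffer_len)
{
	return sizeof(struct sketch_pattern_pack) +
	       (size_t)verify_len + (size_t)buffer_len;
}

/* Sender side: allocate exactly the required size and copy both patterns. */
static struct sketch_pattern_pack *sketch_pack_new(const void *verify,
						    uint32_t vlen,
						    const void *buffer,
						    uint32_t blen)
{
	struct sketch_pattern_pack *p = calloc(1, sketch_pack_size(vlen, blen));

	if (!p)
		return NULL;
	p->verify_len = vlen;
	p->buffer_len = blen;
	memcpy(p->patterns, verify, vlen);
	memcpy(p->patterns + vlen, buffer, blen);
	return p;
}

/* Receiver side: reject a pdu whose lengths exceed the bytes received. */
static int sketch_pack_valid(const struct sketch_pattern_pack *p,
			     size_t received)
{
	if (received < sizeof(*p))
		return 0;
	return sketch_pack_size(p->verify_len, p->buffer_len) <= received;
}
```

Passing the received size to the validation step is what guards against reading past the end of a short or corrupted pdu.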
client: only do le64_to_cpu() on io_sample_data member if iolog is histogram In the case of a histogram iolog, the union io_sample_data member is a pointer to struct io_u_plat_entry, while in the case of a normal iolog, it is a uint64_t. Thus the byteswap is only needed when it is a uint64_t. This has been done similarly in the server code. Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com> Signed-off-by: Tuan Hoang <tuan.hoang1@ibm.com>
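The body of this commit can be illustrated with a small sketch: a union holds either a 64-bit value or a pointer, and only the 64-bit value is converted from little-endian wire order. The type and function names are hypothetical stand-ins, and `sketch_le64_to_host` is a portable substitute for fio's own byte-order helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the sample union: a value or a pointer. */
union sketch_sample_data {
	uint64_t val;
	void *plat_entry;
};

/* Interpret the 8 bytes of 'le' as little-endian, whatever the host order. */
static uint64_t sketch_le64_to_host(uint64_t le)
{
	uint8_t b[8];
	uint64_t v = 0;

	memcpy(b, &le, sizeof(b));
	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

/* Swap only when the union holds a plain 64-bit value, not a pointer. */
static void sketch_fix_sample(union sketch_sample_data *d, int is_histogram)
{
	if (!is_histogram)
		d->val = sketch_le64_to_host(d->val);
}
```

Byte-swapping a pointer would corrupt it on a big-endian host, which is why the histogram case is skipped.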
stat: report clat stats on a per priority granularity

Convert the stat code to report clat stats on a per priority granularity, rather than simply supporting high/low priority.

This is made possible by using the new clat_prio_stat array (per ddir), together with the clat_prio_stat index which is saved in each io_u.

The per priority samples are only printed when there are samples for more than one priority in the clat_prio_stat array. If there are only samples for one priority, that means that all I/Os were submitted using the same priority, so there is no need to print them.

For example, running the following fio command:

fio --name=test --filename=/dev/sdc --direct=1 --runtime=60 --rw=randread \
    --ioengine=io_uring --ioscheduler=mq-deadline --iodepth=32 --bs=32k \
    --prioclass=2 --prio=7 --cmdprio_bssplit=32k/20/3/0:32k/10/1/4

Now results in the following output:

test: (groupid=0, jobs=1): err= 0: pid=465655: Tue Feb 1 02:24:47 2022
  read: IOPS=146, BW=4695KiB/s (4808kB/s)(276MiB/60239msec)
    slat (usec): min=18, max=335, avg=62.87, stdev=22.59
    clat (msec): min=2, max=2135, avg=217.97, stdev=287.26
     lat (msec): min=2, max=2135, avg=218.03, stdev=287.26
    clat prio 2/7 (msec): min=3, max=606, avg=106.57, stdev=86.64
    clat prio 3/0 (msec): min=10, max=2135, avg=664.94, stdev=339.42
    clat prio 1/4 (msec): min=2, max=300, avg=52.29, stdev=42.52
    clat percentiles (msec):
     | 1.00th=[ 8], 5.00th=[ 14], 10.00th=[ 19], 20.00th=[ 33],
     | 30.00th=[ 52], 40.00th=[ 77], 50.00th=[ 108], 60.00th=[ 144],
     | 70.00th=[ 192], 80.00th=[ 300], 90.00th=[ 684], 95.00th=[ 911],
     | 99.00th=[ 1234], 99.50th=[ 1318], 99.90th=[ 1687], 99.95th=[ 1770],
     | 99.99th=[ 2140]
    clat prio 2/7 (69.25% of IOs) percentiles (msec):
     | 1.00th=[ 7], 5.00th=[ 13], 10.00th=[ 17], 20.00th=[ 28],
     | 30.00th=[ 44], 40.00th=[ 64], 50.00th=[ 85], 60.00th=[ 111],
     | 70.00th=[ 140], 80.00th=[ 174], 90.00th=[ 226], 95.00th=[ 279],
     | 99.00th=[ 368], 99.50th=[ 418], 99.90th=[ 502], 99.95th=[ 567],
     | 99.99th=[ 609]
    clat prio 3/0 (20.91% of IOs) percentiles (msec):
     | 1.00th=[ 44], 5.00th=[ 138], 10.00th=[ 205], 20.00th=[ 347],
     | 30.00th=[ 464], 40.00th=[ 558], 50.00th=[ 659], 60.00th=[ 760],
     | 70.00th=[ 860], 80.00th=[ 961], 90.00th=[ 1099], 95.00th=[ 1217],
     | 99.00th=[ 1485], 99.50th=[ 1687], 99.90th=[ 1871], 99.95th=[ 2140],
     | 99.99th=[ 2140]
    clat prio 1/4 (9.84% of IOs) percentiles (msec):
     | 1.00th=[ 7], 5.00th=[ 10], 10.00th=[ 13], 20.00th=[ 18],
     | 30.00th=[ 24], 40.00th=[ 30], 50.00th=[ 39], 60.00th=[ 51],
     | 70.00th=[ 63], 80.00th=[ 84], 90.00th=[ 114], 95.00th=[ 136],
     | 99.00th=[ 188], 99.50th=[ 197], 99.90th=[ 300], 99.95th=[ 300],
     | 99.99th=[ 300]
   bw ( KiB/s): min= 3456, max= 5888, per=100.00%, avg=4697.60, stdev=472.38, samples=120
   iops : min= 108, max= 184, avg=146.80, stdev=14.76, samples=120
  lat (msec) : 4=0.11%, 10=2.57%, 20=8.67%, 50=18.21%, 100=18.34%
  lat (msec) : 250=28.87%, 500=9.41%, 750=5.22%, 1000=5.09%, 2000=3.50%
  lat (msec) : >=2000=0.01%
  cpu : usr=0.16%, sys=0.97%, ctx=17715, majf=0, minf=262
  IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=99.6%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=8839,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency : target=0, window=0, percentile=100.00%, depth=32

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Link: https://lore.kernel.org/r/20220203192814.18552-15-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
stat: disable per prio stats where not needed In order to avoid allocating a clat_prio_stat array for thread stats that we know will never be able to contain more than a single priority, introduce a new member disable_prio_stat in struct thread_stat. The naming prefix is disable, since we want the default value to be 0 (enabled). This is because in the default case, we do want sum_thread_stats() to generate a per prio stat array. Only in the case where we know that we don't want per priority stats to be generated should this member be set to 1. Server version is intentionally not incremented, as it will be incremented in a later patch in the series. No need to bump it multiple times for the same patch series. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Link: https://lore.kernel.org/r/20220203192814.18552-14-Niklas.Cassel@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
client/server: convert ss_data to use an offset instead of fixed position Store the location of the ss_data in the payload itself, rather than assuming that it is always located at a fixed location, directly after the cmd_ts_pdu data. This is done as a cleanup patch in order to be able to handle clat_prio_stats, which just like ss_data, may or may not be part of the payload. Server version is intentionally not incremented, as it will be incremented in a later patch in the series. No need to bump it multiple times for the same patch series. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Link: https://lore.kernel.org/r/20220203192814.18552-5-Niklas.Cassel@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
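A minimal sketch of the "offset stored in the payload" idea follows. The function name `sketch_payload_at` and the convention that an offset of 0 means "not present" are assumptions made for illustration; the commit message does not spell out the encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical lookup: return a pointer to optional data located 'offset'
 * bytes into the payload, or NULL when the data is absent (offset 0, an
 * assumed convention) or the offset points outside the payload.
 */
static const uint8_t *sketch_payload_at(const uint8_t *payload, size_t len,
					uint64_t offset)
{
	if (offset == 0 || offset >= len)
		return NULL;
	return payload + offset;
}
```

Because each optional block carries its own offset, a second optional block such as the per priority stats can be added without the receiver having to assume where the first one ends.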
stat: save the default ioprio in struct thread_stat To be able to report clat stats on a per priority granularity (instead of only high/low priority), we need to be able to get the priority value that was used for the stats in clat_stat. When a thread is using a single priority (e.g. option prio/prioclass is used (without any cmdprio options)), all the clat stats for this thread will be stored in clat_stat. The problem with this is sum_thread_stats() does not know the priority value that corresponds to the stats stored in clat_stat. Since we cannot access td->ioprio from sum_thread_stats(), simply mirror td->ioprio inside struct thread_stat. This way, sum_thread_stats() will be able to reuse the global clat stats in clat_stat, without the need to duplicate the data for per priority stats, in the case where there is only a single priority in use. Server version is intentionally not incremented, as it will be incremented in a later patch in the series. No need to bump it multiple times for the same patch series. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Link: https://lore.kernel.org/r/20220203192814.18552-4-Niklas.Cassel@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
stat: remove unnecessary bool parameter to sum_thread_stats() We can deduce if it is the first struct io_stat src being added to the struct io_stat dst by checking if the current amount of samples in dst is zero. Therefore, remove the bool parameter "first" to sum_stat(). Since sum_stat() was the only user of the bool parameter "first" to the sum_thread_stats() function, we can remove it from sum_thread_stats() as well. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Link: https://lore.kernel.org/r/20220110090133.69955-1-Niklas.Cassel@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
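The reasoning above, that "first" can be deduced from an empty destination, can be shown on a reduced structure. `struct sketch_stat` and `sketch_sum_stat` are hypothetical and track only min, max and the sample count; fio's struct io_stat carries more fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced statistics record. */
struct sketch_stat {
	uint64_t min_val;
	uint64_t max_val;
	uint64_t samples;
};

/*
 * Merge src into dst. No "first" flag is needed: a destination with zero
 * samples is by definition receiving its first contribution.
 */
static void sketch_sum_stat(struct sketch_stat *dst,
			    const struct sketch_stat *src)
{
	if (!src->samples)
		return;

	if (!dst->samples) {
		*dst = *src;
		return;
	}

	if (src->min_val < dst->min_val)
		dst->min_val = src->min_val;
	if (src->max_val > dst->max_val)
		dst->max_val = src->max_val;
	dst->samples += src->samples;
}
```

A zero-initialized destination therefore behaves correctly without the caller tracking whether it has been used before.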
fio: Introduce the log_prio option

Introduce the log_prio option to expand priority logging from a single bit of information (priority high vs low) to the full priority value used to execute IOs. When this option is set, the priority value is printed as a 16-bit hexadecimal value combining the I/O priority class and priority level as defined by the ioprio_value() helper.

Similarly to the log_offset option, this option does not result in actual I/O priority logging when log_avg_msec is set.

This patch also fixes a problem with the IO_U_F_PRIORITY flag, namely that this flag is used to indicate that the IO is being executed with a high priority on the device while at the same time indicating how to account for the IO completion latency (high_prio clat vs low_prio clat). With the introduction of the cmdprio_class and cmdprio options, these assumptions are not necessarily compatible anymore.

These problems are addressed as follows:

* The priority_bit field of struct iosample is replaced with the 16-bit priority field representing the full io_u->ioprio value. When log_prio is set, the priority field value is logged as is. When log_prio is not set, 1 is logged as the entry's priority field if the sample priority class is IOPRIO_CLASS_RT, and 0 otherwise.

* IO_U_F_PRIORITY is renamed to IO_U_F_HIGH_PRIO to indicate that a job IO has the highest priority within the job context and so must be accounted as such using high_prio clat.

While fio final statistics only show accounting of high vs low IO completion latency statistics, the log_prio option allows a user to perform more detailed statistical analysis of a workload using multiple different IO priorities.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
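To make the logged value concrete, the sketch below combines a priority class and level into one 16-bit value and applies the log_prio rule from the commit message. It assumes the Linux ioprio encoding, with the class stored above a 13-bit shift and real-time being class 1; the `sketch_` names are hypothetical and fio's own ioprio_value() is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed Linux-style encoding: class in the bits above a 13-bit level. */
#define SKETCH_IOPRIO_CLASS_SHIFT	13
#define SKETCH_IOPRIO_CLASS_RT		1

/* Combine priority class and level into one 16-bit value. */
static uint16_t sketch_ioprio_value(unsigned int cls, unsigned int level)
{
	return (uint16_t)((cls << SKETCH_IOPRIO_CLASS_SHIFT) | level);
}

/* Extract the priority class from a combined value. */
static unsigned int sketch_ioprio_class(uint16_t ioprio)
{
	return ioprio >> SKETCH_IOPRIO_CLASS_SHIFT;
}

/*
 * Value written to the log: the full priority when log_prio is set,
 * otherwise 1 for the real-time class and 0 for everything else.
 */
static unsigned int sketch_logged_prio(uint16_t ioprio, int log_prio)
{
	if (log_prio)
		return ioprio;
	return sketch_ioprio_class(ioprio) == SKETCH_IOPRIO_CLASS_RT;
}
```

Under this assumed encoding, class 1 with level 4 is logged as 0x2004 when log_prio is set and as 1 when it is not.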
Merge branch 'master' of https://github.com/bvanassche/fio * 'master' of https://github.com/bvanassche/fio: client: Make skipping option appending in handle_job_opt() more selective client: Fix two memory leaks in handle_job_opt() Make json_object_add_value_string() duplicate its 'value' argument
client: Fix another memory leak in an error path Duplicate the hostname after if (...) goto err instead of before that check. This was found by inspecting get_new_client() callers. It is not clear to me why Coverity did not complain about this function. Signed-off-by: Bart Van Assche <bvanassche@acm.org>
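The ordering fix described above can be sketched as follows. `struct sketch_client` and `sketch_new_client` are hypothetical, and the `valid` flag stands in for whatever check the real get_new_client() callers perform before the goto.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical client record owning a copy of the hostname. */
struct sketch_client {
	char *hostname;
};

/*
 * Duplicate the hostname only after the check that may jump to the error
 * path, so that path never has a stray copy to free.
 */
static struct sketch_client *sketch_new_client(const char *hostname, int valid)
{
	struct sketch_client *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	if (!valid)
		goto err;	/* nothing duplicated yet */

	c->hostname = strdup(hostname);
	if (!c->hostname)
		goto err;
	return c;
err:
	free(c);
	return NULL;
}
```

Had strdup() run before the validity check, the early goto would have leaked the duplicated string unless the error path freed it too.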
client: Make skipping option appending in handle_job_opt() more selective Instead of not appending an option to the option list if JSON output is disabled, only skip appending an option to the JSON option list. See also commit b127b679769c ("client: fix segfault for !json output"). Signed-off-by: Bart Van Assche <bvanassche@acm.org>
client: Fix two memory leaks in handle_job_opt() Do not leak p if pdu->global != 0. This improves on a previous attempt to fix handle_job_opt(). See also commit ebae36a28aee ("client: Fix memory leaks in handle_job_opt()"). Do not leak strdup(pdu->name) when calling json_object_add_value_string(), since that function (indirectly) duplicates its 'name' argument. This patch fixes the following Coverity complaint: CID 169311 (#1 of 1): Resource leak (RESOURCE_LEAK) 9. leaked_storage: Variable p going out of scope leaks the storage it points to. Signed-off-by: Bart Van Assche <bvanassche@acm.org>