Commit | Line | Data |
---|---|---|
1e613c9c KC |
1 | #!/usr/bin/env python2.7 |
2 | """ | |
3 | Utility for converting *_clat_hist* files generated by fio into latency statistics. | |
4 | ||
5 | Example usage: | |
6 | ||
7 | $ fiologparser_hist.py *_clat_hist* | |
8 | end-time, samples, min, avg, median, 90%, 95%, 99%, max | |
9 | 1000, 15, 192, 1678.107, 1788.859, 1856.076, 1880.040, 1899.208, 1888.000 | |
10 | 2000, 43, 152, 1642.368, 1714.099, 1816.659, 1845.552, 1888.131, 1888.000 | |
11 | 4000, 39, 1152, 1546.962, 1545.785, 1627.192, 1640.019, 1691.204, 1744 | |
12 | ... | |
13 | ||
14 | Notes: | |
15 | ||
16 | * end-times are calculated to be uniform increments of the --interval value given, | |
17 | regardless of when histogram samples are reported. Of note: | |
18 | ||
19 | * Intervals with no samples are omitted. In the example above this means | |
20 | "no statistics from 2 to 3 seconds" and "39 samples influenced the statistics | |
21 | of the interval from 3 to 4 seconds". | |
22 | ||
23 | * Intervals with a single sample will have the same value for all statistics | |
24 | ||
25 | * The number of samples is unweighted, corresponding to the total number of samples | |
26 | which have any effect whatsoever on the interval. | |
27 | ||
28 | * Min statistics are computed using value of the lower boundary of the first bin | |
29 | (in increasing bin order) with non-zero samples in it. Similarly for max, | |
30 | we take the upper boundary of the last bin with non-zero samples in it. | |
31 | This is semantically identical to taking the 0th and 100th percentiles with a | |
32 | 50% bin-width buffer (because percentiles are computed using mid-points of | |
33 | the bins). This enforces the following nice properties: | |
34 | ||
35 | * min <= 50th <= 90th <= 95th <= 99th <= max | |
36 | ||
37 | * min and max are strict lower and upper bounds on the actual | |
38 | min / max seen by fio (and reported in *_clat.* with averaging turned off). | |
39 | ||
40 | * Average statistics use a standard weighted arithmetic mean. | |
41 | ||
42 | * Percentile statistics are computed using the weighted percentile method as | |
43 | described here: https://en.wikipedia.org/wiki/Percentile#Weighted_percentile | |
44 | See weights() method for details on how weights are computed for individual | |
45 | samples. In process_interval() we further multiply by the height of each bin | |
46 | to get weighted histograms. | |
47 | ||
48 | * We convert files given on the command line, assumed to be fio histogram files, | |
65a4d15c | 49 | An individual histogram file can contain the |
1e613c9c KC |
50 | histograms for multiple different r/w directions (notably when --rw=randrw). This |
51 | is accounted for by tracking each r/w direction separately. In the statistics | |
52 | reported we ultimately merge *all* histograms (regardless of r/w direction). | |
53 | ||
54 | * The value of *_GROUP_NR in stat.h (and *_BITS) determines how many latency bins | |
55 | fio outputs when histogramming is enabled. Namely for the current default of | |
56 | GROUP_NR=19, we get 1,216 bins with a maximum latency of approximately 17 | |
57 | seconds. For certain applications this may not be sufficient. With GROUP_NR=24 | |
58 | we have 1,536 bins, giving us a maximum latency of 541 seconds (~ 9 minutes). If | |
59 | you expect your application to experience latencies greater than 17 seconds, | |
60 | you will need to recompile fio with a larger GROUP_NR, e.g. with: | |
61 | ||
62 | sed -i.bak 's/^#define FIO_IO_U_PLAT_GROUP_NR 19\n/#define FIO_IO_U_PLAT_GROUP_NR 24/g' stat.h | |
63 | make fio | |
64 | ||
65 | Quick reference table for the max latency corresponding to a sampling of | |
66 | values for GROUP_NR: | |
67 | ||
68 | GROUP_NR | # bins | max latency bin value | |
69 | 19 | 1216 | 16.9 sec | |
70 | 20 | 1280 | 33.8 sec | |
71 | 21 | 1344 | 67.6 sec | |
72 | 22 | 1408 | 2 min, 15 sec | |
73 | 23 | 1472 | 4 min, 32 sec | |
74 | 24 | 1536 | 9 min, 4 sec | |
75 | 25 | 1600 | 18 min, 8 sec | |
76 | 26 | 1664 | 36 min, 16 sec | |
77 | ||
78 | * At present this program automatically detects the number of histogram bins in | |
79 | the log files, and adjusts the bin latency values accordingly. In particular if | |
80 | you use the --log_hist_coarseness parameter of fio, you get output files with | |
81 | a number of bins according to the following table (note that the first | |
82 | row is identical to the table above): | |
83 | ||
84 | coarse \ GROUP_NR | |
85 | 19 20 21 22 23 24 25 26 | |
86 | ------------------------------------------------------- | |
87 | 0 [[ 1216, 1280, 1344, 1408, 1472, 1536, 1600, 1664], | |
88 | 1 [ 608, 640, 672, 704, 736, 768, 800, 832], | |
89 | 2 [ 304, 320, 336, 352, 368, 384, 400, 416], | |
90 | 3 [ 152, 160, 168, 176, 184, 192, 200, 208], | |
91 | 4 [ 76, 80, 84, 88, 92, 96, 100, 104], | |
92 | 5 [ 38, 40, 42, 44, 46, 48, 50, 52], | |
93 | 6 [ 19, 20, 21, 22, 23, 24, 25, 26], | |
94 | 7 [ N/A, 10, N/A, 11, N/A, 12, N/A, 13], | |
95 | 8 [ N/A, 5, N/A, N/A, N/A, 6, N/A, N/A]] | |
96 | ||
97 | For other values of GROUP_NR and coarseness, this table can be computed like this: | |
98 | ||
99 | bins = [1216,1280,1344,1408,1472,1536,1600,1664] | |
100 | max_coarse = 8 | |
101 | fncn = lambda z: list(map(lambda x: z/2**x if z % 2**x == 0 else nan, range(max_coarse + 1))) | |
102 | np.transpose(list(map(fncn, bins))) | |
103 | ||
104 | Also note that you can achieve the same downsampling / log file size reduction | |
105 | by pre-processing (before inputting into this script) with half_bins.py. | |
106 | ||
107 | * If you have not adjusted GROUP_NR for your (high latency) application, then you | |
108 | will see the percentiles computed by this tool max out at the max latency bin | |
109 | value as in the first table above, and in this plot (where GROUP_NR=19 and thus we see | |
110 | a max latency of ~16.7 seconds in the red line): | |
111 | ||
112 | https://www.cronburg.com/fio/max_latency_bin_value_bug.png | |
113 | ||
114 | * Motivation for, design decisions, and the implementation process are | |
115 | described in further detail here: | |
116 | ||
117 | https://www.cronburg.com/fio/cloud-latency-problem-measurement/ | |
118 | ||
119 | @author Karl Cronburg <karl.cronburg@gmail.com> | |
120 | """ | |
121 | import os | |
122 | import sys | |
123 | import pandas | |
124 | import numpy as np | |
125 | ||
126 | err = sys.stderr.write | |
127 | ||
128 | def weighted_percentile(percs, vs, ws): | |
129 | """ Use linear interpolation to calculate the weighted percentile. | |
130 | ||
131 | Value and weight arrays are first sorted by value. The cumulative | |
132 | distribution function (cdf) is then computed, after which np.interp | |
133 | finds the two values closest to our desired weighted percentile(s) | |
134 | and linearly interpolates them. | |
135 | ||
136 | percs :: List of percentiles we want to calculate | |
137 | vs :: Array of values we are computing the percentile of | |
138 | ws :: Array of weights for our corresponding values | |
139 | return :: Array of percentiles | |
140 | """ | |
141 | idx = np.argsort(vs) | |
142 | vs, ws = vs[idx], ws[idx] # weights and values sorted by value | |
143 | cdf = 100 * (ws.cumsum() - ws / 2.0) / ws.sum() | |
144 | return np.interp(percs, cdf, vs) # linear interpolation | |
145 | ||
146 | def weights(start_ts, end_ts, start, end): | |
147 | """ Calculate weights based on fraction of sample falling in the | |
148 | given interval [start,end]. Weights computed using vector / array | |
149 | computation instead of for-loops. | |
150 | ||
151 | Note that samples with zero time length are effectively ignored | |
152 | (we set their weight to zero). | |
153 | ||
154 | start_ts :: Array of start times for a set of samples | |
155 | end_ts :: Array of end times for a set of samples | |
156 | start :: int | |
157 | end :: int | |
158 | return :: Array of weights | |
159 | """ | |
160 | sbounds = np.maximum(start_ts, start).astype(float) | |
161 | ebounds = np.minimum(end_ts, end).astype(float) | |
162 | ws = (ebounds - sbounds) / (end_ts - start_ts) | |
163 | if np.any(np.isnan(ws)): | |
164 | err("WARNING: zero-length sample(s) detected. Log file corrupt" | |
165 | " / bad time values? Ignoring these samples.\n") | |
166 | ws[np.where(np.isnan(ws))] = 0.0; | |
167 | return ws | |
168 | ||
169 | def weighted_average(vs, ws): | |
170 | return np.sum(vs * ws) / np.sum(ws) | |
171 | ||
172 | columns = ["end-time", "samples", "min", "avg", "median", "90%", "95%", "99%", "max"] | |
173 | percs = [50, 90, 95, 99] | |
174 | ||
175 | def fmt_float_list(ctx, num=1): | |
176 | """ Return a comma separated list of float formatters to the required number | |
177 | of decimal places. For instance: | |
178 | ||
179 | fmt_float_list(ctx.decimals=4, num=3) == "%.4f, %.4f, %.4f" | |
180 | """ | |
181 | return ', '.join(["%%.%df" % ctx.decimals] * num) | |
182 | ||
183 | # Default values - see beginning of main() for how we detect number columns in | |
184 | # the input files: | |
185 | __HIST_COLUMNS = 1216 | |
186 | __NON_HIST_COLUMNS = 3 | |
187 | __TOTAL_COLUMNS = __HIST_COLUMNS + __NON_HIST_COLUMNS | |
188 | ||
65a4d15c KC |
189 | def read_chunk(rdr, sz): |
190 | """ Read the next chunk of size sz from the given reader. """ | |
1e613c9c KC |
191 | try: |
192 | """ StopIteration occurs when the pandas reader is empty, and AttributeError | |
193 | occurs if rdr is None due to the file being empty. """ | |
194 | new_arr = rdr.read().values | |
195 | except (StopIteration, AttributeError): | |
196 | return None | |
197 | ||
65a4d15c | 198 | """ Extract array of just the times, and histograms matrix without times column. """ |
1e613c9c KC |
199 | times, rws, szs = new_arr[:,0], new_arr[:,1], new_arr[:,2] |
200 | hists = new_arr[:,__NON_HIST_COLUMNS:] | |
1e613c9c | 201 | times = times.reshape((len(times),1)) |
65a4d15c | 202 | arr = np.append(times, hists, axis=1) |
1e613c9c | 203 | |
65a4d15c | 204 | return arr |
1e613c9c KC |
205 | |
206 | def get_min(fps, arrs): | |
207 | """ Find the file with the current first row with the smallest start time """ | |
65a4d15c | 208 | return min([fp for fp in fps if not arrs[fp] is None], key=lambda fp: arrs.get(fp)[0][0]) |
1e613c9c KC |
209 | |
210 | def histogram_generator(ctx, fps, sz): | |
211 | ||
1e613c9c KC |
212 | # Create a chunked pandas reader for each of the files: |
213 | rdrs = {} | |
214 | for fp in fps: | |
215 | try: | |
216 | rdrs[fp] = pandas.read_csv(fp, dtype=int, header=None, chunksize=sz) | |
217 | except ValueError as e: | |
218 | if e.message == 'No columns to parse from file': | |
219 | if not ctx.nowarn: sys.stderr.write("WARNING: Empty input file encountered.\n") | |
220 | rdrs[fp] = None | |
221 | else: | |
222 | raise(e) | |
223 | ||
65a4d15c KC |
224 | # Initial histograms from disk: |
225 | arrs = {fp: read_chunk(rdr, sz) for fp,rdr in rdrs.items()} | |
1e613c9c KC |
226 | while True: |
227 | ||
228 | try: | |
229 | """ ValueError occurs when nothing more to read """ | |
230 | fp = get_min(fps, arrs) | |
231 | except ValueError: | |
232 | return | |
65a4d15c | 233 | arr = arrs[fp] |
1e613c9c | 234 | yield np.insert(arr[0], 1, fps.index(fp)) |
65a4d15c | 235 | arrs[fp] = arr[1:] |
1e613c9c | 236 | |
65a4d15c KC |
237 | if arrs[fp].shape[0] == 0: |
238 | arrs[fp] = read_chunk(rdrs[fp], sz) | |
1e613c9c KC |
239 | |
240 | def _plat_idx_to_val(idx, edge=0.5, FIO_IO_U_PLAT_BITS=6, FIO_IO_U_PLAT_VAL=64): | |
241 | """ Taken from fio's stat.c for calculating the latency value of a bin | |
242 | from that bin's index. | |
243 | ||
244 | idx : the value of the index into the histogram bins | |
245 | edge : fractional value in the range [0,1]** indicating how far into | |
246 | the bin we wish to compute the latency value of. | |
247 | ||
248 | ** edge = 0.0 and 1.0 computes the lower and upper latency bounds | |
249 | respectively of the given bin index. """ | |
250 | ||
251 | # MSB <= (FIO_IO_U_PLAT_BITS-1), cannot be rounded off. Use | |
252 | # all bits of the sample as index | |
253 | if (idx < (FIO_IO_U_PLAT_VAL << 1)): | |
254 | return idx | |
255 | ||
256 | # Find the group and compute the minimum value of that group | |
257 | error_bits = (idx >> FIO_IO_U_PLAT_BITS) - 1 | |
258 | base = 1 << (error_bits + FIO_IO_U_PLAT_BITS) | |
259 | ||
260 | # Find its bucket number of the group | |
261 | k = idx % FIO_IO_U_PLAT_VAL | |
262 | ||
263 | # Return the mean (if edge=0.5) of the range of the bucket | |
264 | return base + ((k + edge) * (1 << error_bits)) | |
265 | ||
266 | def plat_idx_to_val_coarse(idx, coarseness, edge=0.5): | |
267 | """ Converts the given *coarse* index into a non-coarse index as used by fio | |
268 | in stat.h:plat_idx_to_val(), subsequently computing the appropriate | |
269 | latency value for that bin. | |
270 | """ | |
271 | ||
272 | # Multiply the index by the power of 2 coarseness to get the bin | |
273 | # bin index with a max of 1536 bins (FIO_IO_U_PLAT_GROUP_NR = 24 in stat.h) | |
274 | stride = 1 << coarseness | |
275 | idx = idx * stride | |
276 | lower = _plat_idx_to_val(idx, edge=0.0) | |
277 | upper = _plat_idx_to_val(idx + stride, edge=1.0) | |
278 | return lower + (upper - lower) * edge | |
279 | ||
280 | def print_all_stats(ctx, end, mn, ss_cnt, vs, ws, mx): | |
281 | ps = weighted_percentile(percs, vs, ws) | |
282 | ||
283 | avg = weighted_average(vs, ws) | |
284 | values = [mn, avg] + list(ps) + [mx] | |
285 | row = [end, ss_cnt] + map(lambda x: float(x) / ctx.divisor, values) | |
286 | fmt = "%d, %d, %d, " + fmt_float_list(ctx, 5) + ", %d" | |
287 | print (fmt % tuple(row)) | |
288 | ||
289 | def update_extreme(val, fncn, new_val): | |
290 | """ Calculate min / max in the presence of None values """ | |
291 | if val is None: return new_val | |
292 | else: return fncn(val, new_val) | |
293 | ||
294 | # See beginning of main() for how bin_vals are computed | |
295 | bin_vals = [] | |
296 | lower_bin_vals = [] # lower edge of each bin | |
297 | upper_bin_vals = [] # upper edge of each bin | |
298 | ||
299 | def process_interval(ctx, samples, iStart, iEnd): | |
300 | """ Construct the weighted histogram for the given interval by scanning | |
301 | through all the histograms and figuring out which of their bins have | |
302 | samples with latencies which overlap with the given interval | |
303 | [iStart,iEnd]. | |
304 | """ | |
305 | ||
306 | times, files, hists = samples[:,0], samples[:,1], samples[:,2:] | |
307 | iHist = np.zeros(__HIST_COLUMNS) | |
308 | ss_cnt = 0 # number of samples affecting this interval | |
309 | mn_bin_val, mx_bin_val = None, None | |
310 | ||
311 | for end_time,file,hist in zip(times,files,hists): | |
312 | ||
313 | # Only look at bins of the current histogram sample which | |
314 | # started before the end of the current time interval [start,end] | |
315 | start_times = (end_time - 0.5 * ctx.interval) - bin_vals / 1000.0 | |
316 | idx = np.where(start_times < iEnd) | |
317 | s_ts, l_bvs, u_bvs, hs = start_times[idx], lower_bin_vals[idx], upper_bin_vals[idx], hist[idx] | |
318 | ||
319 | # Increment current interval histogram by weighted values of future histogram: | |
320 | ws = hs * weights(s_ts, end_time, iStart, iEnd) | |
321 | iHist[idx] += ws | |
322 | ||
323 | # Update total number of samples affecting current interval histogram: | |
324 | ss_cnt += np.sum(hs) | |
325 | ||
326 | # Update min and max bin values seen if necessary: | |
327 | idx = np.where(hs != 0)[0] | |
328 | if idx.size > 0: | |
329 | mn_bin_val = update_extreme(mn_bin_val, min, l_bvs[max(0, idx[0] - 1)]) | |
330 | mx_bin_val = update_extreme(mx_bin_val, max, u_bvs[min(len(hs) - 1, idx[-1] + 1)]) | |
331 | ||
332 | if ss_cnt > 0: print_all_stats(ctx, iEnd, mn_bin_val, ss_cnt, bin_vals, iHist, mx_bin_val) | |
333 | ||
334 | def guess_max_from_bins(ctx, hist_cols): | |
335 | """ Try to guess the GROUP_NR from given # of histogram | |
336 | columns seen in an input file """ | |
337 | max_coarse = 8 | |
338 | if ctx.group_nr < 19 or ctx.group_nr > 26: | |
339 | bins = [ctx.group_nr * (1 << 6)] | |
340 | else: | |
341 | bins = [1216,1280,1344,1408,1472,1536,1600,1664] | |
342 | coarses = range(max_coarse + 1) | |
343 | fncn = lambda z: list(map(lambda x: z/2**x if z % 2**x == 0 else -10, coarses)) | |
344 | ||
345 | arr = np.transpose(list(map(fncn, bins))) | |
346 | idx = np.where(arr == hist_cols) | |
347 | if len(idx[1]) == 0: | |
348 | table = repr(arr.astype(int)).replace('-10', 'N/A').replace('array',' ') | |
349 | err("Unable to determine bin values from input clat_hist files. Namely \n" | |
350 | "the first line of file '%s' " % ctx.FILE[0] + "has %d \n" % (__TOTAL_COLUMNS,) + | |
351 | "columns of which we assume %d " % (hist_cols,) + "correspond to histogram bins. \n" | |
352 | "This number needs to be equal to one of the following numbers:\n\n" | |
353 | + table + "\n\n" | |
354 | "Possible reasons and corresponding solutions:\n" | |
355 | " - Input file(s) does not contain histograms.\n" | |
356 | " - You recompiled fio with a different GROUP_NR. If so please specify this\n" | |
357 | " new GROUP_NR on the command line with --group_nr\n") | |
358 | exit(1) | |
359 | return bins[idx[1][0]] | |
360 | ||
361 | def main(ctx): | |
362 | ||
363 | # Automatically detect how many columns are in the input files, | |
364 | # calculate the corresponding 'coarseness' parameter used to generate | |
365 | # those files, and calculate the appropriate bin latency values: | |
366 | with open(ctx.FILE[0], 'r') as fp: | |
367 | global bin_vals,lower_bin_vals,upper_bin_vals,__HIST_COLUMNS,__TOTAL_COLUMNS | |
368 | __TOTAL_COLUMNS = len(fp.readline().split(',')) | |
369 | __HIST_COLUMNS = __TOTAL_COLUMNS - __NON_HIST_COLUMNS | |
370 | ||
371 | max_cols = guess_max_from_bins(ctx, __HIST_COLUMNS) | |
372 | coarseness = int(np.log2(float(max_cols) / __HIST_COLUMNS)) | |
373 | bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness), np.arange(__HIST_COLUMNS)), dtype=float) | |
374 | lower_bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 0.0), np.arange(__HIST_COLUMNS)), dtype=float) | |
375 | upper_bin_vals = np.array(map(lambda x: plat_idx_to_val_coarse(x, coarseness, 1.0), np.arange(__HIST_COLUMNS)), dtype=float) | |
376 | ||
377 | fps = [open(f, 'r') for f in ctx.FILE] | |
378 | gen = histogram_generator(ctx, fps, ctx.buff_size) | |
379 | ||
380 | print(', '.join(columns)) | |
381 | ||
382 | try: | |
383 | start, end = 0, ctx.interval | |
384 | arr = np.empty(shape=(0,__TOTAL_COLUMNS - 1)) | |
385 | more_data = True | |
386 | while more_data or len(arr) > 0: | |
387 | ||
388 | # Read up to ctx.max_latency (default 20 seconds) of data from end of current interval. | |
389 | while len(arr) == 0 or arr[-1][0] < ctx.max_latency * 1000 + end: | |
390 | try: | |
391 | new_arr = next(gen) | |
392 | except StopIteration: | |
393 | more_data = False | |
394 | break | |
395 | arr = np.append(arr, new_arr.reshape((1,__TOTAL_COLUMNS - 1)), axis=0) | |
396 | arr = arr.astype(int) | |
397 | ||
398 | if arr.size > 0: | |
399 | process_interval(ctx, arr, start, end) | |
400 | ||
401 | # Update arr to throw away samples we no longer need - samples which | |
402 | # end before the start of the next interval, i.e. the end of the | |
403 | # current interval: | |
404 | idx = np.where(arr[:,0] > end) | |
405 | arr = arr[idx] | |
406 | ||
407 | start += ctx.interval | |
408 | end = start + ctx.interval | |
409 | finally: | |
410 | map(lambda f: f.close(), fps) | |
411 | ||
412 | ||
413 | if __name__ == '__main__': | |
414 | import argparse | |
415 | p = argparse.ArgumentParser() | |
416 | arg = p.add_argument | |
417 | arg("FILE", help='space separated list of latency log filenames', nargs='+') | |
418 | arg('--buff_size', | |
419 | default=10000, | |
420 | type=int, | |
421 | help='number of samples to buffer into numpy at a time') | |
422 | ||
423 | arg('--max_latency', | |
424 | default=20, | |
425 | type=float, | |
426 | help='number of seconds of data to process at a time') | |
427 | ||
428 | arg('-i', '--interval', | |
429 | default=1000, | |
430 | type=int, | |
431 | help='interval width (ms)') | |
432 | ||
433 | arg('-d', '--divisor', | |
434 | required=False, | |
435 | type=int, | |
436 | default=1, | |
437 | help='divide the results by this value.') | |
438 | ||
439 | arg('--decimals', | |
440 | default=3, | |
441 | type=int, | |
442 | help='number of decimal places to print floats to') | |
443 | ||
444 | arg('--nowarn', | |
445 | dest='nowarn', | |
446 | action='store_false', | |
447 | default=True, | |
448 | help='do not print any warning messages to stderr') | |
449 | ||
450 | arg('--group_nr', | |
451 | default=19, | |
452 | type=int, | |
453 | help='FIO_IO_U_PLAT_GROUP_NR as defined in stat.h') | |
454 | ||
455 | main(p.parse_args()) | |
456 |