Chris Mason [Mon, 22 Oct 2012 15:22:55 +0000 (11:22 -0400)]
iowatcher: Only hash IOs if there are completion or issue events
We use an IO hash table to keep track of the IOs in flight, and this is
used to calculate the latencies from when we issue the IO to when
we complete the IO.
But if there are no completion events, io is never removed from the hash
table. It grows very large and slows down the run.
Since we already scan all the events looking for outliers, this commit
checks for each major type of event during the scan. If there are
no completion and no issue events, we don't bother inserting things
into the hash table.
If there are no completion events, we clean up during the issue event.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Eric Sandeen [Thu, 18 Oct 2012 21:42:38 +0000 (15:42 -0600)]
iowatcher: support png2theora for videos
ffmpeg is not available on all distributions, so include Theora
as an option, via png2theora, if the output movie filename ends
in .ogg or .ogv
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Wed, 17 Oct 2012 19:45:49 +0000 (15:45 -0400)]
iowatcher: Fix path name handling when the trace files are in the current directory
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Yuanhan Liu [Fri, 12 Oct 2012 16:09:29 +0000 (10:09 -0600)]
iowatcher: Fix buffer overwrite issue
The current code allocates the path buffer with strdup, so the buffer is
only as large as blktrace_dest_dir. The code then appends the dump file
name to it, which writes past the end of the buffer and triggers an issue
like the following:
$ ./iowatcher -t trace.dump -o trace.svg
Unable to find trace file ./trace.dumpY
                                      ^
Refactoring join_path a bit to fix this issue.
Cc: Liu Bo <liub.liubo@gmail.com>
Signed-off-by: Yuanhan Liu <yliu.null@gmail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
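To illustrate the buffer overwrite fix above: a minimal sketch of a path join that sizes the buffer for both components. The function name join_path_sketch is hypothetical and this is not the actual iowatcher join_path; it only shows why a strdup'd copy of the directory is too small to append to.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the iowatcher code): return a newly allocated
 * "dir/file" string, or NULL if allocation fails. The caller frees it.
 */
char *join_path_sketch(const char *dir, const char *file)
{
    size_t len;
    char *path;

    assert(dir != NULL && file != NULL);
    /* room for dir, the '/' separator, file, and the trailing NUL */
    len = strlen(dir) + 1 + strlen(file) + 1;
    path = malloc(len);
    if (path == NULL)
        return NULL;
    snprintf(path, len, "%s/%s", dir, file);
    return path;
}
```

Allocating once for the combined length avoids appending into a buffer that strdup sized for the directory alone.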
Liu Bo [Fri, 28 Sep 2012 03:55:14 +0000 (21:55 -0600)]
iowatcher: add blktrace destination options
Add 'D' for blktrace destination options so that we can save trace
in the destination directory.
Signed-off-by: Liu Bo <liub.liubo@gmail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Robert Schiele [Mon, 8 Sep 2014 07:38:52 +0000 (09:38 +0200)]
signal condition variable at end of stop_tracers
stop_tracers modifies tp->is_done and thus must signal the condition
variable tracer_wait_unblock is waiting on to monitor tp->is_done.
Not doing so might cause the tool to deadlock if stop_tracers is
called while a tracer thread is in tracer_wait_unblock.
Signed-off-by: Robert Schiele <rschiele@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
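The rule behind the stop_tracers fix above can be sketched as follows: whoever changes the predicate a waiter sleeps on must also wake the waiter. The struct and function names are hypothetical stand-ins, not blktrace's own, and the sketch uses a broadcast on the assumption that several threads may be waiting.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the per-tracer state. */
struct tracer_sketch {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int is_done;
};

/* Waiter: sleep until is_done is set. */
void wait_unblock_sketch(struct tracer_sketch *tp)
{
    assert(tp != NULL);
    pthread_mutex_lock(&tp->mutex);
    while (!tp->is_done)
        pthread_cond_wait(&tp->cond, &tp->mutex);
    pthread_mutex_unlock(&tp->mutex);
}

/* Stopper: change the predicate, then wake every waiter. */
void stop_tracer_sketch(struct tracer_sketch *tp)
{
    assert(tp != NULL);
    pthread_mutex_lock(&tp->mutex);
    tp->is_done = 1;
    pthread_cond_broadcast(&tp->cond);
    pthread_mutex_unlock(&tp->mutex);
}
```

Without the wake-up in the stopper, a thread already blocked in the wait loop would never re-check is_done.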
Riku Voipio [Fri, 11 Apr 2014 11:51:19 +0000 (14:51 +0300)]
remove unused barrier.h
While looking for things that need porting for Aarch64,
barrier.h from blktrace was identified. However, a deeper
look shows that this file is not actually used anymore
in blktrace.
Remove unused file to avoid future confusion.
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Eiichi Tsukata [Tue, 3 Dec 2013 12:04:59 +0000 (21:04 +0900)]
blktrace bno_plot.py: output comprehensive message when gnuplot not found
Currently, bno_plot.py uses os.execvp which does not show enough information
when executed command is not found. For example, when gnuplot is not found
bno_plot.py shows the following messages:
Traceback (most recent call last):
File "/usr/local/bin/bno_plot.py", line 123, in <module>
os.execvp(cmd[0], cmd)
File "/usr/lib64/python2.7/os.py", line 344, in execvp
_execvpe(file, args)
File "/usr/lib64/python2.7/os.py", line 368, in _execvpe
func(file, *argrest)
OSError: [Errno 2] No such file or directory
Users can't tell what happened directly from the message.
Instead of os.execvp, this patch uses os.system which shows the following
message when gnuplot is not found:
sh: gnuplot: command not found
Signed-off-by: Eiichi Tsukata <devel@etsukata.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Nathan Zimmer [Mon, 15 Apr 2013 14:53:36 +0000 (09:53 -0500)]
blktrace blkreplay: convert to use a dynamic cpu_set_t
Some distros have changed CPU_SETSIZE in glibc to 4096 since that matches
the NR_CPUS in the linux kernel config file. Some distros have decided to
leave CPU_SETSIZE at 1024. This is a problem if you want to run that distro
on a very large machine.
CPU_SETSIZE is used by struct cpu_set_t. This means that to deal with more
than 1024 cpus you must use the dynamic cpu sets, which involves converting
from things like CPU_SET to CPU_SET_S.
Cc: Jens Axboe <axboe@kernel.dk>
Modified by Jens to fix the CPU_{SET,ZERO}_S pointer mixup.
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Nathan Zimmer [Mon, 15 Apr 2013 14:53:35 +0000 (09:53 -0500)]
blktrace: use number of configured cpus instead of online cpus
We want to run on all online processors. However, if there is a hole in the
online cpumask this won't happen. We need the number of configured processors
instead of the number online.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
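A minimal sketch of the idea in the commit above, assuming a libc that provides the widely available (but non-POSIX) _SC_NPROCESSORS_CONF and _SC_NPROCESSORS_ONLN names. The helper name ncpus_sketch is hypothetical.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Prefer the number of configured CPUs, so that a hole in the online mask
 * does not hide higher-numbered CPUs. Fall back to the online count, and
 * finally to 1, if sysconf cannot answer.
 */
long ncpus_sketch(void)
{
    long n = sysconf(_SC_NPROCESSORS_CONF);

    if (n < 1)
        n = sysconf(_SC_NPROCESSORS_ONLN);
    return n < 1 ? 1 : n;
}
```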
Nathan Zimmer [Mon, 15 Apr 2013 14:53:34 +0000 (09:53 -0500)]
btreplay: use sysconf to get the number of configured cpus
We should use the standard methods for getting the number of cpus in the
system when they are available. It is good practice to leave the old ways in
place for people stuck on older systems.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Nathan Zimmer [Mon, 15 Apr 2013 14:53:33 +0000 (09:53 -0500)]
btreplay: Machines are now large enough that holes need to be dealt with
The current method fails once we hit the first offlined cpu. This
will correct that case. However, this still underreports the number of cpus
if the last cpus are offlined.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Nathan Zimmer [Mon, 15 Apr 2013 14:53:32 +0000 (09:53 -0500)]
verify_blkparse: Change max_cpus to deal with systems larger than 512
verify_blkparse has trouble with systems larger than 512 cpus.
There is also an issue in the scanning code causing the cpu number to be
truncated to its first two digits, i.e. cpu 542 would be read as 54.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
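One way a truncation like the one described above can arise is a width-limited scanf conversion. The sketch below is hypothetical and not the verify_blkparse code; it only contrasts "%2d", which stops after two digits, with "%d", which reads the whole number.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical parser: read a CPU number from the start of a field.
 * Returns 0 on success and -1 if no number could be read.
 */
int parse_cpu_sketch(const char *field, int *cpu)
{
    assert(field != NULL && cpu != NULL);
    /* "%d" has no field width, so 542 is not cut down to 54 */
    return sscanf(field, "%d", cpu) == 1 ? 0 : -1;
}
```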
Ivan Dyukov [Tue, 19 Mar 2013 14:16:27 +0000 (08:16 -0600)]
More accurate calculation of the total read/write values
If a block device has many requests smaller than 1K,
blkparse ignores such requests because it counts each request
in whole KB.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
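The rounding effect described above can be sketched with two accumulators. Both functions are hypothetical illustrations and are not taken from blkparse: the first converts each request to KB before adding, the second sums bytes and converts once.

```c
#include <assert.h>
#include <stddef.h>

/* Per-request truncation: every request under 1024 bytes adds 0. */
unsigned long long total_kb_truncating(const unsigned *bytes, size_t n)
{
    unsigned long long kb = 0;

    assert(bytes != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        kb += bytes[i] >> 10;
    return kb;
}

/* Sum bytes first and convert to KB once at the end. */
unsigned long long total_kb_accumulating(const unsigned *bytes, size_t n)
{
    unsigned long long sum = 0;

    assert(bytes != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += bytes[i];
    return sum >> 10;
}
```

For four 512-byte requests the first function reports 0 KB while the second reports 2 KB.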
Jan Kara [Mon, 10 Sep 2012 08:09:48 +0000 (10:09 +0200)]
iowatcher: Per process IO graphs
Add support for displaying different processes with different color in
the IO graph and movie.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Fri, 21 Sep 2012 18:03:50 +0000 (14:03 -0400)]
iowatcher: Make sure we add the xtick labels if we're only plotting IO
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Tue, 11 Sep 2012 01:01:02 +0000 (21:01 -0400)]
iowatcher: Merge branch 'jan'
Jan Kara's updates for xzoom and yzoom
Conflicts:
main.c
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jan Kara [Thu, 6 Sep 2012 16:23:05 +0000 (18:23 +0200)]
iowatcher: Add option to set action which should be displayed in the IO graph
Sometimes this is useful to see how IO scheduler or storage itself
changes the IO submitted by the application.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jan Kara [Thu, 6 Sep 2012 10:45:34 +0000 (12:45 +0200)]
iowatcher: Improve xticks logic
Ticks on x axis used integral step and fixed number of ticks. That generates
wrong results e.g. for 13s long trace with 10 ticks... Allow the code to
somewhat alter the number of ticks and also use non-integral step.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jan Kara [Fri, 31 Aug 2012 09:37:49 +0000 (11:37 +0200)]
iowatcher: Add options to limit time and sector range
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jan Kara [Thu, 6 Sep 2012 08:14:59 +0000 (10:14 +0200)]
iowatcher: Ignore trace records beyond max_seconds
Currently we report error when we find a trace record beyond max_seconds.
When we allow user to set end of displayed period, records after the end
of period are no longer a bug so just ignore them.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jan Kara [Wed, 5 Sep 2012 21:01:08 +0000 (23:01 +0200)]
iowatcher: Add possibility to limit seconds from below
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jan Kara [Wed, 5 Sep 2012 20:08:24 +0000 (22:08 +0200)]
iowatcher: Rename seconds to max_seconds
Later we will add min_seconds to complement this.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jan Kara [Sat, 1 Sep 2012 21:31:46 +0000 (23:31 +0200)]
iowatcher: Add support for limiting IO graph offset from below
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jan Kara [Sat, 1 Sep 2012 21:37:20 +0000 (23:37 +0200)]
iowatcher: Fix filtering of outliers from below
There are lots of trace actions which do not carry a sector with them (e.g.
plug, unplug, ...). Thus sector is 0 for them and that results in trimming
of outliers from below never working. Fix the problem by accounting only
Queue events in the outlier statistics.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jan Kara [Sat, 1 Sep 2012 21:35:55 +0000 (23:35 +0200)]
iowatcher: Define mask of trace action and use it instead of opencoding the constant
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jan Kara [Thu, 30 Aug 2012 13:59:01 +0000 (15:59 +0200)]
iowatcher: Fix typo in option description
Short variant of --movie is -m, not -p.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Fri, 31 Aug 2012 00:42:30 +0000 (20:42 -0400)]
iowatcher: Check for null mpstat structs while generating plots
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Tue, 28 Aug 2012 06:15:11 +0000 (02:15 -0400)]
iowatcher: Add -c to split the graphs up into multiple columns
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Mon, 27 Aug 2012 22:27:59 +0000 (18:27 -0400)]
iowatcher: Fix divide by zero while calculating averages
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Mon, 27 Aug 2012 22:09:57 +0000 (18:09 -0400)]
iowatcher: Update the README and the --help output
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Mon, 27 Aug 2012 21:39:58 +0000 (17:39 -0400)]
iowatcher: Start support for multiple columns of plots
The movie mode is updated to put extra plots on
the side.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Mon, 27 Aug 2012 17:00:30 +0000 (13:00 -0400)]
iowatcher: Fix io line graphs at the edge of the X axis
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Mon, 27 Aug 2012 16:53:51 +0000 (12:53 -0400)]
iowatcher: Fix the line graphs for values near the edges of the graph
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Mon, 27 Aug 2012 16:22:28 +0000 (12:22 -0400)]
iowatcher: Fix mpstat file permissions
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Fri, 24 Aug 2012 18:31:29 +0000 (14:31 -0400)]
iowatcher: Add initial support for flash tracing
This is incomplete, but it will catch messages from
the flash driver to find the actual chip an IO
was sent to.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Tue, 21 Aug 2012 19:19:35 +0000 (15:19 -0400)]
iowatcher: Add a new movie mode that maps the IOs onto a platter.
The --movie option defaults to spindle mode now,
but you can choose --movie=rect or --movie=spindle
as well.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Tue, 21 Aug 2012 13:18:15 +0000 (09:18 -0400)]
iowatcher: Switch to ffmpeg for movie encoding. Chrome and vlc like these better.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Mon, 20 Aug 2012 20:15:55 +0000 (16:15 -0400)]
iowatcher: Add back missing plot title
Chris Mason [Mon, 20 Aug 2012 19:30:38 +0000 (15:30 -0400)]
iowatcher: Fix --help definition
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Mon, 20 Aug 2012 18:36:19 +0000 (14:36 -0400)]
iowatcher: Add mpstat.[ch] into git
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Fri, 17 Aug 2012 16:18:28 +0000 (12:18 -0400)]
iowatcher: Add movie support
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Thu, 16 Aug 2012 18:46:33 +0000 (14:46 -0400)]
iowatcher: Add mpstat graphing support
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Chris Mason [Wed, 15 Aug 2012 20:10:55 +0000 (16:10 -0400)]
iowatcher: Initial revision
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Jens Axboe [Mon, 27 Feb 2012 07:22:17 +0000 (08:22 +0100)]
blktrace 1.0.5
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Vasily Tarasov [Mon, 27 Feb 2012 07:21:11 +0000 (08:21 +0100)]
Too small arrays for file names
In our experiments blktrace/blkparse file names encode a lot of
information about the particular experiment. We noticed that for long
enough file names blkparse does not work.
The reason is that per_cpu_info->fname[] is of 128 bytes. As a result,
in setup_file() function only part of the file name gets to ->fname[].
Then stat() fails and we exit the function. Notice, that no error is
printed in this case.
In the following patch ->fname[] size is increased to POSIX defined
PATH_MAX.
Signed-off-by: Vasily Tarasov <tarasov@vasily.name>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 1 Feb 2012 12:17:57 +0000 (13:17 +0100)]
Fix compiler warnings
One was a real bug, assigned i_time twice instead of c_time (which was
left uninitialized).
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Eric Sandeen [Fri, 16 Dec 2011 19:36:56 +0000 (13:36 -0600)]
avoid string overflows
Several places using strcpy would benefit from strncpy
for safety.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
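A minimal sketch of the strcpy to strncpy change described above, with a hypothetical helper name. strncpy does not write a terminating NUL when the source fills the buffer, so the sketch terminates the destination explicitly.

```c
#include <assert.h>
#include <string.h>

/* Copy src into dst without writing past dst_size bytes. */
void copy_name_sketch(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

A source longer than the buffer is silently truncated, which callers may still want to detect.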
Eric Sandeen [Fri, 16 Dec 2011 19:27:33 +0000 (13:27 -0600)]
blktrace: remove unused variable
sp was being incremented w/o initialization, but thankfully
not used otherwise. Just remove it.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Eric Sandeen [Fri, 16 Dec 2011 19:25:16 +0000 (13:25 -0600)]
blkparse: initialize cpu_map
We malloc'd cpu_map, and then did:
cpu_map[CPU_IDX(cpu)] |= (1UL << CPU_BIT(cpu));
... not sure how that ever worked if cpu_map was not initialized!
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
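To illustrate the cpu_map issue above: memory returned by malloc has indeterminate contents, so OR-ing bits into it leaves stray bits set. The sketch below allocates the bitmap with calloc instead. The macro and function names are hypothetical; blkparse has its own CPU_IDX and CPU_BIT definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Illustrative definitions, not the blkparse macros. */
#define BITS_PER_WORD_SKETCH (sizeof(unsigned long) * CHAR_BIT)
#define CPU_IDX_SKETCH(cpu) ((cpu) / BITS_PER_WORD_SKETCH)
#define CPU_BIT_SKETCH(cpu) ((cpu) % BITS_PER_WORD_SKETCH)

/* Allocate a zeroed bitmap with room for ncpus bits. */
unsigned long *alloc_cpu_map_sketch(size_t ncpus)
{
    size_t words = (ncpus + BITS_PER_WORD_SKETCH - 1) / BITS_PER_WORD_SKETCH;

    return calloc(words ? words : 1, sizeof(unsigned long));
}

/* Mark one CPU as present. */
void set_cpu_sketch(unsigned long *map, size_t cpu)
{
    assert(map != NULL);
    map[CPU_IDX_SKETCH(cpu)] |= 1UL << CPU_BIT_SKETCH(cpu);
}
```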
Eric Sandeen [Fri, 16 Dec 2011 19:17:52 +0000 (13:17 -0600)]
btt: close devmap file after processing
Close the file used for btt's -M argument after
processing.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Eric Sandeen [Fri, 16 Dec 2011 19:15:54 +0000 (13:15 -0600)]
Fix several leaks on error paths
In several cases space is allocated for a filename but
not freed if open of that file fails.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Eric Sandeen [Fri, 16 Dec 2011 19:13:21 +0000 (13:13 -0600)]
Remove extraneous malloc in find_input routines
No point in malloc()ing space if we just immediately overwrite
the pointer via strdup. That'll leak some space.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Eric Sandeen [Fri, 16 Dec 2011 19:11:33 +0000 (13:11 -0600)]
Close stream in 'I' switch handling
The file containing the list of devices was never closed
after processing was complete.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Eric Sandeen [Fri, 16 Dec 2011 19:08:14 +0000 (13:08 -0600)]
Free pdu_buff on bad pdu path in process()
On this error path, pdu_buf was never freed.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Eric Sandeen [Fri, 16 Dec 2011 19:06:31 +0000 (13:06 -0600)]
Fix potential array overrun in act_to_str
The acts[] array is only N_ACTS elements, so we should not
ever set acts[N_ACTS]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Eric Sandeen [Fri, 16 Dec 2011 19:05:02 +0000 (13:05 -0600)]
Check setvbuf return value
Check for setvbuf failure.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Jens Axboe [Tue, 31 Jan 2012 09:53:21 +0000 (10:53 +0100)]
blktrace 1.0.4
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 31 Jan 2012 09:52:25 +0000 (10:52 +0100)]
Merge branch 'master' of ssh://brick.kernel.dk/data/git/blktrace
Mikulas Patocka [Tue, 31 Jan 2012 09:51:50 +0000 (10:51 +0100)]
Fix for realloc bug and wrong error logging
This patch fixes two bugs in blktrace.
1. realloc is called on a wrong memory address (glibc reports heap
corruption if the user sends the output to a pipe, for example "blktrace
/dev/sdc -o -").
2. errno 0 is actually reported if debugfs is not mounted
Mikulas
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 11 Aug 2011 10:49:08 +0000 (12:49 +0200)]
blktrace 1.0.3
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Namhyung Kim [Thu, 11 Aug 2011 10:48:07 +0000 (12:48 +0200)]
Add FLUSH/FUA support
Add FLUSH/FUA support to blktrace. As FLUSH precedes WRITE and/or
FUA follows WRITE, use the same 'F' flag for both cases and
distinguish them by their (relative) position. The end results
look like (other flags might be shown also):
- WRITE: W
- WRITE_FLUSH: FW
- WRITE_FUA: WF
- WRITE_FLUSH_FUA: FWF
Note that we reuse TC_BARRIER due to lack of bit space of act_mask.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Jens Axboe [Wed, 3 Aug 2011 13:06:18 +0000 (15:06 +0200)]
Merge branch 'master' of ssh://router.home.kernel.dk/data/git/blktrace
Jeff Moyer [Wed, 3 Aug 2011 13:05:55 +0000 (15:05 +0200)]
blkparse: fix up incorrect pc write completion count
I noticed in some traces that I was seeing summaries like the following:
Total (sde):
 Reads Queued:          76,      304KiB   Writes Queued:      16,384,    1,048MiB
 Read Dispatches:       76,      304KiB   Write Dispatches:    2,210,    1,048MiB
 Reads Requeued:         0                Writes Requeued:         0
 Reads Completed:       76,      304KiB   Writes Completed:    2,210,    1,048MiB
 Read Merges:            0,        0KiB   Write Merges:       14,174,  907,136KiB
 PC Reads Queued:        0,        0KiB   PC Writes Queued:        0,        0KiB
 PC Read Disp.:          4,        0KiB   PC Write Disp.:          0,        0KiB
 PC Reads Req.:          0                PC Writes Req.:          0
 PC Reads Compl.:        4                PC Writes Compl.:    2,210
 IO unplugs:         2,124                Timer unplugs:           0
Note how there were no PC Writes dispatched, but there were 2210
completed. It turns out to be a minor typo in the code. The attached
patch fixes the reporting for me.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Tao Ma [Thu, 26 May 2011 19:11:09 +0000 (21:11 +0200)]
blktrace: Use be32_to_cpu for blk_io_trace->cpu.
blk_io_trace->cpu is u32, so use be32_to_cpu instead.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Tao Ma [Thu, 26 May 2011 19:11:07 +0000 (21:11 +0200)]
blkparse: Avoid segfault for wrong cpu number.
Currently we only check the magic number to see whether
a blktrace is valid or not, but Bill Broadley hit a case
where the cpu info is wrong, with a value of 1725552676.
So in resize_cpu_info, we hit an overflow when calculating
size = new_count * sizeof(struct per_cpu_info);
and the program either segfaults or fails with an out of
memory error. Although this is more likely a kernel
problem, blkparse shouldn't segfault because of it.
So this patch checks whether the cpu stored in the
trace is the same as the file's; if not, it just warns
and skips it.
Cc: Jens Axboe <axboe@kernel.dk>
Reported-by: Bill Broadley <bill@broadley.org>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
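The patch above validates the cpu number itself. As a complementary illustration of the overflow it mentions, the hypothetical helper below computes count * elem_size and reports failure instead of wrapping around; it is not part of blkparse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store count * elem_size in *out and return 0, or return -1 if the
 * product does not fit in size_t.
 */
int checked_array_size_sketch(size_t count, size_t elem_size, size_t *out)
{
    assert(out != NULL);
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return -1;
    *out = count * elem_size;
    return 0;
}
```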
Jens Axboe [Wed, 16 Mar 2011 08:06:30 +0000 (09:06 +0100)]
blktrace 1.0.2
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Justin TerAvest [Wed, 16 Mar 2011 08:05:09 +0000 (09:05 +0100)]
blktrace: Document default values for -b and -n
To help users better deal with the log message
"You have dropped events, consider using a larger buffer size (-b)",
it's helpful to list the defaults for sub buffer management, without
flags.
Signed-off-by: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Tao Ma [Wed, 9 Feb 2011 09:22:39 +0000 (10:22 +0100)]
gitignore: add blkiomon to .gitignore.
Add blkiomon, btreplay/btrecord and btreplay/btreplay to
.gitignore so that they don't show up in "Untracked files".
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tao Ma [Wed, 9 Feb 2011 09:22:39 +0000 (10:22 +0100)]
blktrace: remove unused idx from devpath.
idx isn't used, so remove it.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tao Ma [Wed, 9 Feb 2011 09:22:39 +0000 (10:22 +0100)]
blktrace: break mlock in case of is_done.
In 38-rc2, there is a bug in mlock which makes the mlock
call in blktrace return an error (I have sent the corresponding
patch to lkml). So when we try to interrupt blktrace
with "ctrl+c", mlock loops forever and in the end I
have to use "kill -9" to kill it and then run "blktrace -k"
to stop the tracer. I don't think that is good.
Reproducing it is simple:
Use a 38-rc kernel, and run
blktrace /dev/sdx
then use "ctrl+c"; it doesn't exit.
So this patch adds a check for tp->is_done. If
is_done is set, break out of the mlock loop so that we don't
loop forever in mlock. In case of a real mlock error,
I let it retry 10 times, and it should succeed
within 10 tries in case of tp->is_done. If tp isn't set
or tp->is_done isn't set, it works like the original
design.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tao Ma [Fri, 14 Jan 2011 08:06:03 +0000 (09:06 +0100)]
blkiomon: Fix an output error
When we print statistics in blkiomon, we don't consider
the case where the device has no corresponding action. For example,
if there is no disk read during the interval, the output on my box is
like:
sizes read (bytes): num 0, min -1, max 0, sum 0, squ 0, avg nan, var nan
With the fix, now it looks like:
sizes read (bytes): num 0, min -1, max 0, sum 0, squ 0, avg 0.0, var 0.0
Cc: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
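The nan values in the blkiomon output above come from dividing by a sample count of zero. A minimal sketch of the guard, with a hypothetical function name rather than the blkiomon code:

```c
#include <assert.h>

/* Mean of sum over num samples, defined as 0.0 when there are no samples. */
double avg_sketch(double sum, unsigned long num)
{
    return num ? sum / (double)num : 0.0;
}
```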
Jens Axboe [Tue, 11 Jan 2011 07:36:21 +0000 (08:36 +0100)]
Merge branch 'master' of ssh://router.home.kernel.dk/data/git/blktrace
Tao Ma [Tue, 11 Jan 2011 07:35:56 +0000 (08:35 +0100)]
blkparse: Fix blktrace output pipe broken in the new kernel
With the newest kernel (say 2.6.37; some older ones should have a
similar problem), some cfq messages are added to the blktrace output,
which breaks the old blkparse.
See a simple example:
1. blktrace /dev/sdb -o -|blkparse -i -
2. Run the following command(/dev/sdb1 is mounted at /mnt/test_dir):
dd if=/mnt/test_dir/test of=/dev/null bs=4k count=1 iflag=direct
There are only 2 lines of output there:
8,16 0 1 0.000000000 13183 A R 114759 + 8 <- (8,17) 114696
8,16 0 2 0.000000491 13183 Q R 114759 + 8 [dd]
And even if we run a command line like:
for((i=0;i<100;i++))do dd if=/mnt/ocfs2/test of=/dev/null bs=4k count=1 iflag=direct;done
we are only given the same 2 lines of output,
while the real output should look like:
8,16 0 1 0.000000000 13319 A R 114759 + 8 <- (8,17) 114696
8,16 0 2 0.000000376 13319 Q R 114759 + 8 [dd]
8,16 0 0 0.000005931 0 m N cfq13319 alloced
8,16 0 3 0.000006259 13319 G R 114759 + 8 [dd]
8,16 0 4 0.000007143 13319 P N [dd]
8,16 0 5 0.000007817 13319 I R 114759 + 8 [dd]
8,16 0 0 0.000008491 0 m N cfq13319 insert_request
8,16 0 0 0.000009029 0 m N cfq13319 add_to_rr
...
The main reason is that in show_entries_rb we test the sequence number every
time, but for messages such as those from cfq the sequence number is always
0, which makes the old sequence check reject all the logs after it.
So only check/store the sequence number if the record isn't a message.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Alan D. Brunelle [Mon, 29 Nov 2010 15:34:30 +0000 (10:34 -0500)]
Fixed build warning for btreplay
btreplay.c:1332: warning: comparison between signed and unsigned integer
expressions
Edward Shishkin [Fri, 22 Oct 2010 18:52:29 +0000 (20:52 +0200)]
blktrace: btt documentation update
Fixup for RH bugzilla 595628.
Document btt options:
-m (--seeks-per-second);
-X (--easy-parse-avgs).
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Edward Shishkin [Fri, 30 Jul 2010 14:05:04 +0000 (16:05 +0200)]
blktrace: btreplay man pages update
Fixup for RH bugzilla 595623
Document btreplay option -x (--acc-factor)
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Edward Shishkin [Fri, 22 Oct 2010 18:51:22 +0000 (20:51 +0200)]
blktrace: blktrace documentation update
Fixup for RH bugzilla 595620.
Document undocumented blktrace options.
Update the man pages.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Edward Shishkin [Fri, 30 Jul 2010 14:04:36 +0000 (16:04 +0200)]
blktrace: blkparse documentation update
Fixup for RH bugzilla 595615.
Document blkparse options:
-A, --set-mask,
-a, --act-mask,
-D, --input-directory
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Edward Shishkin [Fri, 30 Jul 2010 14:04:24 +0000 (16:04 +0200)]
blktrace: blkiomon documentation update
Fixup for RH bugzilla 595419.
Document blkiomon option -d (--dump-lldd).
Add drv_data mask description to blktrace man page.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Edward Shishkin [Fri, 30 Jul 2010 14:04:13 +0000 (16:04 +0200)]
blktrace: btrecord man pages fixup
Fix up for RH bugzilla 595413
Mistake in the man page for btrecord:
documented option --input-base is unsupported,
supported option --max-bunch-time is undocumented.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Alan D. Brunelle [Thu, 16 Sep 2010 13:26:22 +0000 (09:26 -0400)]
blktrace: disallow -o when using multiple devices
Document that "-o" does not work when specifying multiple devices to
blktrace; also enforce this by stopping blktrace when one tries to
do this.
The technical reason why "-o" doesn't work with multiple devices is
because we use multiple threads of execution - one per device/CPU pair -
and each of them opens a file named "<prefix>.blktrace.<cpu>". With the
"-o" all of the "<prefix>" values are the same - so multiple threads
open the same file and try to do output. Not good. Without the "-o"
we get unique files named: "<device>.blktrace.<cpu>" - as the tuple
(<device>,<cpu>) is unique.
Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com>
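The naming scheme described above can be sketched as follows. The helper is hypothetical and only formats "<prefix>.blktrace.<cpu>"; with a single shared prefix, two devices traced on the same cpu would produce the same name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a per-CPU output file name into buf.
 * Returns 0 on success and -1 if the name does not fit.
 */
int trace_file_name_sketch(char *buf, size_t size, const char *prefix, int cpu)
{
    int n;

    assert(buf != NULL && prefix != NULL);
    n = snprintf(buf, size, "%s.blktrace.%d", prefix, cpu);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```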
Edward Shishkin [Tue, 20 Apr 2010 13:41:14 +0000 (15:41 +0200)]
blktrace: disable kill option - take 2
Fixup for 513950.
Problem:
'blktrace -d <device> -k' does not kill a running
background trace. Executing 'blktrace -d <device> -k'
for the second time results in "BLKTRACETEARDOWN:
Invalid argument" message and then each run of
blktrace on that machine prints the following output:
BLKTRACESETUP: No such file or directory.
The bug:
The option -k results in clobbering information
about running trace by kernel (blk_trace_remove),
while resources (files open in debugfs by the running
background blktrace) are not released.
Solution:
Update documentation:
Undocument the non-working "kill" option. Advise
to send SIGINT signal via kill(1) to the running
background blktrace for its correct termination.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Edward Shishkin [Tue, 15 Dec 2009 17:48:12 +0000 (18:48 +0100)]
blktrace: update blkiomon doc
Fixup for 499398.
Description of problem:
blkiomon does not understand the output of blktrace when
working with logical volume device (it is quiet, while
working with physical device it prints IO statistics as
expected).
BUG (or design feature?):
/dev/dm-* and /dev/md* don't see BLK_TC_COMPLETE actions:
	/* we need an older D trace and a younger C trace */
	if (t_old->bit.action & BLK_TC_ACT(BLK_TC_ISSUE) &&
	    t_young->bit.action & BLK_TC_ACT(BLK_TC_COMPLETE)) {
		/* matching D and C traces - update statistics */
		match++;
		blkiomon_account(&t_old->bit, &t_young->bit);
		blkiomon_free_trace(t_stored);
		return t;
	}
Possible solution:
Update documentation.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Edward Shishkin [Tue, 15 Dec 2009 17:47:59 +0000 (18:47 +0100)]
blktrace: add back conversion
Fixup for bz 502889.
Problem:
when executing with /dev/cciss/foo (long path names)
btreplay complains (No such file or directory).
Bug:
Missed back conversion of underscores to slashes.
Solution:
Convert underscores to slashes to restore device
names that have larger paths.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Edward Shishkin [Tue, 15 Dec 2009 17:47:53 +0000 (18:47 +0100)]
blktrace: print correct usage
Fixup for 498898:
Problem:
When somebody runs blktrace without parameters, it
shows the usage message. The usage message suggests
that version number "x.y.z" is a required parameter,
which is not true.
Solution:
Don't print version number when running
blktrace, blkparse, btt without parameters.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Edward Shishkin [Tue, 15 Dec 2009 17:47:47 +0000 (18:47 +0100)]
blktrace: avoid device duplication
Fixup for bz 501457.
Problem:
If the device list file contains the same device
as supplied on the command line, blktrace stops
immediately and further I/O tracing is impossible.
Bug: device duplication in the devpaths ends with
program termination (BLKTRACESETUP ioctl returns
error) while resources (open files in debugfs) are
not released.
Solution:
Make sure devices are not duplicated in devpaths
pool.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Mon, 19 Apr 2010 17:15:27 +0000 (19:15 +0200)]
Merge branch 'master' of ssh://router.home.kernel.dk/data/git/blktrace
Eric Sandeen [Mon, 19 Apr 2010 17:15:23 +0000 (19:15 +0200)]
blkparse: exit with error if no tracefiles found
If no tracefiles are found, exit with non-0 status
Resolves Red Hat Bugzilla #500118
Reported-by: Milos Malik <mmalik@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Mon, 22 Mar 2010 14:21:12 +0000 (10:21 -0400)]
Fixed incorrect sizeof instead of strlen in btt/rstats.c
Alan D. Brunelle [Mon, 22 Mar 2010 14:20:21 +0000 (10:20 -0400)]
Corrected memory leak in btt/p_live.c
Forgot to free record when updating rather than adding.
Eric Sandeen [Mon, 22 Feb 2010 18:56:52 +0000 (19:56 +0100)]
add libpthread to btreplay/Makefile LIBS
Fedora linking changes picked this up:
/usr/bin/ld: btreplay.o: undefined reference to symbol 'pthread_create@@GLIBC_2.2.5'
/usr/bin/ld: note: 'pthread_create@@GLIBC_2.2.5' is defined in DSO /lib64/libpthread.so.0 so try adding it to the linker command line
See also https://bugzilla.redhat.com/show_bug.cgi?id=564775
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Thu, 8 Oct 2009 18:12:12 +0000 (14:12 -0400)]
btt: Added in I/O activity per device and system-wide
It now keeps track of I/O activity on a per-device basis (as well as a
cumulative system-wide view). ``I/O activity'' is defined as
the time during which the device driver and device are actively working
on at least one I/O. Here's a sample output:
==================== I/O Active Period Information ====================
       DEV |     # Live      Avg. Act     Avg. !Act % Live
---------- | ---------- ------------- ------------- ------
(254,   0) |          0   0.000000000   0.000000000   0.00
(  8,  17) |          0   0.000000000   0.000000000   0.00
(  8,  16) |         29   0.909596815   0.094646263  90.87
(  8,  33) |          0   0.000000000   0.000000000   0.00
(  8,  32) |        168   0.097848226   0.068231948  59.06
---------- | ---------- ------------- ------------- ------
 Total Sys |         33   0.799808811   0.082334758  90.92
Also added a new btt -Z option that generates per-device and system-wide
I/O activity data that can be plotted.
Refer to the documentation updates (btt.1, btt.tex) for more information.
Alan D. Brunelle [Thu, 8 Oct 2009 12:39:02 +0000 (08:39 -0400)]
btt: better data file naming
More logical naming for .dat files created.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Jens Axboe [Tue, 1 Sep 2009 08:24:24 +0000 (10:24 +0200)]
Merge branch 'master' of ssh://router.home.kernel.dk/data/git/blktrace
Jens Axboe [Tue, 1 Sep 2009 08:24:01 +0000 (10:24 +0200)]
blkparse: allow stdout output with -d option (using '-' as the filename)
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Fri, 14 Aug 2009 17:01:08 +0000 (13:01 -0400)]
Added in running stats documentation
Alan D. Brunelle [Fri, 14 Aug 2009 17:00:27 +0000 (13:00 -0400)]
Added in running stats for btt
Create an overall system and per-device statistics file containing
MB-per-second and I/Os-per-second values. Format for each file is first
column contains an (integer) time stamp (seconds since start of run)
and a (double) value.
File names are:
sys_mbps_fp.dat - system-wide mbps (for all devices watched, of course)
sys_iops_fp.dat - I/Os per sec
Each device watched will have a file with the device preceding the
_mbps or _iops section of the above file names.
Jens Axboe [Mon, 11 May 2009 12:00:10 +0000 (14:00 +0200)]
Version 1.0.1
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Eric Sandeen [Thu, 7 May 2009 16:08:25 +0000 (11:08 -0500)]
blkrawverify: warn and return error if no traces are found
blkrawverify prints no errors and returns success if the
requested tracefiles aren't found:
# blkrawverify foobar
Verifying foobar
# echo $?
0
With this change it's a bit more informative:
# ./blkrawverify foobar
Verifying foobar
No tracefiles found for foobar
# echo $?
1
Resolves Red Hat Bugzilla #499581
Reported-by: Milos Malik <mmalik@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>