Edward Shishkin [Tue, 20 Apr 2010 13:41:14 +0000 (15:41 +0200)]
blktrace: disable kill option - take 2
Fixup for 513950.
Problem:
'blktrace -d <device> -k' does not kill a running
background trace. Executing 'blktrace -d <device> -k'
for the second time results in "BLKTRACETEARDOWN:
Invalid argument" message and then each run of
blktrace on that machine prints the following output:
BLKTRACESETUP: No such file or directory.
The bug:
The -k option makes the kernel discard its information
about the running trace (blk_trace_remove), while the
resources (files opened in debugfs by the running
background blktrace) are not released.
Solution:
Update documentation:
Undocument the non-working "kill" option. Advise
to send a SIGINT signal via kill(1) to the running
background blktrace for its correct termination.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Edward Shishkin [Tue, 15 Dec 2009 17:48:12 +0000 (18:48 +0100)]
blktrace: update blkiomon doc
Fixup for 499398.
Description of problem:
blkiomon does not understand the output of blktrace when
working with a logical volume device (it stays quiet, whereas
with a physical device it prints I/O statistics as
expected).
BUG (or design feature?):
/dev/dm-* and /dev/md* don't see BLK_TC_COMPLETE actions:
    /* we need an older D trace and a younger C trace */
    if (t_old->bit.action & BLK_TC_ACT(BLK_TC_ISSUE) &&
        t_young->bit.action & BLK_TC_ACT(BLK_TC_COMPLETE)) {
            /* matching D and C traces - update statistics */
            match++;
            blkiomon_account(&t_old->bit, &t_young->bit);
            blkiomon_free_trace(t_stored);
            return t;
    }
Possible solution:
Update documentation.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Edward Shishkin [Tue, 15 Dec 2009 17:47:59 +0000 (18:47 +0100)]
blktrace: add back conversion
Fixup for bz 502889.
Problem:
when executing with /dev/cciss/foo (long path names)
btreplay complains (No such file or directory).
Bug:
Missed back conversion of underscores to slashes.
Solution:
Convert underscores to slashes to restore device
names that have longer paths.
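The back conversion can be sketched roughly as below. This is an
illustration only, not the btreplay code: the function name
restore_dev_name is hypothetical, and the sketch assumes that every
underscore in the stored name originally stood for a slash.

```c
#include <stddef.h>

/* Hypothetical helper: turn "cciss_c0d1" back into "cciss/c0d1" in place.
 * Assumes every '_' in the stored name replaced a '/' in the device path. */
char *restore_dev_name(char *name)
{
	char *p;

	if (name == NULL)
		return NULL;
	for (p = name; *p != '\0'; p++) {
		if (*p == '_')
			*p = '/';
	}
	return name;
}
```

A device whose real name contains an underscore would be mangled by
this simple approach, which is why the assumption above matters.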
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Edward Shishkin [Tue, 15 Dec 2009 17:47:53 +0000 (18:47 +0100)]
blktrace: print correct usage
Fixup for 498898:
Problem:
When somebody runs blktrace without parameters, it
shows the usage message. The usage message suggests
that version number "x.y.z" is a required parameter,
which is not true.
Solution:
Don't print version number when running
blktrace, blkparse, or btt without parameters.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Edward Shishkin [Tue, 15 Dec 2009 17:47:47 +0000 (18:47 +0100)]
blktrace: avoid device duplication
Fixup for bz 501457.
Problem:
If the device list file contains the same device
as supplied on the command line, blktrace stops
immediately and further I/O tracing is impossible.
Bug: device duplication in devpaths ends with
program termination (the BLKTRACESETUP ioctl returns
an error) while resources (open files in debugfs) are
not released.
Solution:
Make sure devices are not duplicated in devpaths
pool.
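A minimal sketch of such a duplicate check follows. The dev_pool
structure, the add_devpath name and the MAX_DEVS bound are all
hypothetical and do not reflect how blktrace stores its device list.
The sketch compares path strings only, so two different paths to the
same device would still be treated as distinct.

```c
#include <string.h>

#define MAX_DEVS 64	/* arbitrary bound for this sketch */

struct dev_pool {
	const char *paths[MAX_DEVS];
	int ndevs;
};

/* Returns 1 if path was added, 0 if it was already present, -1 if full. */
int add_devpath(struct dev_pool *pool, const char *path)
{
	int i;

	for (i = 0; i < pool->ndevs; i++) {
		if (strcmp(pool->paths[i], path) == 0)
			return 0;	/* duplicate: keep a single entry */
	}
	if (pool->ndevs >= MAX_DEVS)
		return -1;
	pool->paths[pool->ndevs++] = path;
	return 1;
}
```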
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Mon, 19 Apr 2010 17:15:27 +0000 (19:15 +0200)]
Merge branch 'master' of ssh://router.home.kernel.dk/data/git/blktrace
Eric Sandeen [Mon, 19 Apr 2010 17:15:23 +0000 (19:15 +0200)]
blkparse: exit with error if no tracefiles found
If no tracefiles are found, exit with non-0 status
Resolves Red Hat Bugzilla #500118
Reported-by: Milos Malik <mmalik@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Mon, 22 Mar 2010 14:21:12 +0000 (10:21 -0400)]
Fixed incorrect sizeof instead of strlen in btt/rstats.c
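The commit does not show the offending line, so the following only
illustrates the general class of bug, assuming the string was reached
through a pointer. The function names are made up for the example.

```c
#include <string.h>

/* sizeof on a pointer yields the pointer size, not the string length. */
size_t wrong_len(const char *s)
{
	(void)s;
	return sizeof(s);	/* size of a pointer (commonly 4 or 8) */
}

size_t right_len(const char *s)
{
	return strlen(s);	/* number of characters before the NUL */
}
```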
Alan D. Brunelle [Mon, 22 Mar 2010 14:20:21 +0000 (10:20 -0400)]
Corrected memory leak in btt/p_live.c
Forgot to free record when updating rather than adding.
Eric Sandeen [Mon, 22 Feb 2010 18:56:52 +0000 (19:56 +0100)]
add libpthread to btreplay/Makefile LIBS
Fedora linking changes picked this up:
/usr/bin/ld: btreplay.o: undefined reference to symbol 'pthread_create@@GLIBC_2.2.5'
/usr/bin/ld: note: 'pthread_create@@GLIBC_2.2.5' is defined in DSO /lib64/libpthread.so.0 so try adding it to the linker command line
See also https://bugzilla.redhat.com/show_bug.cgi?id=564775
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Thu, 8 Oct 2009 18:12:12 +0000 (14:12 -0400)]
btt: Added in I/O activity per device and system-wide
It now keeps track of I/O activity on a per-device basis (as well as a
cumulative system-wide view). ``I/O activity'' is defined as
the time during which the device driver and device are actively working
on at least one I/O. Here's a sample output:
==================== I/O Active Period Information ====================
       DEV |     # Live      Avg. Act     Avg. !Act % Live
---------- | ---------- ------------- ------------- ------
(254,   0) |          0   0.000000000   0.000000000   0.00
(  8,  17) |          0   0.000000000   0.000000000   0.00
(  8,  16) |         29   0.909596815   0.094646263  90.87
(  8,  33) |          0   0.000000000   0.000000000   0.00
(  8,  32) |        168   0.097848226   0.068231948  59.06
---------- | ---------- ------------- ------------- ------
 Total Sys |         33   0.799808811   0.082334758  90.92
Also added a new btt -Z option that generates per-device and system-wide
I/O activity data that can be plotted.
Refer to the documentation updates (btt.1, btt.tex) for more information.
Alan D. Brunelle [Thu, 8 Oct 2009 12:39:02 +0000 (08:39 -0400)]
btt: better data file naming
More logical naming for .dat files created.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Jens Axboe [Tue, 1 Sep 2009 08:24:24 +0000 (10:24 +0200)]
Merge branch 'master' of ssh://router.home.kernel.dk/data/git/blktrace
Jens Axboe [Tue, 1 Sep 2009 08:24:01 +0000 (10:24 +0200)]
blkparse: allow stdout output with -d option (using '-' as the filename)
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Fri, 14 Aug 2009 17:01:08 +0000 (13:01 -0400)]
Added in running stats documentation
Alan D. Brunelle [Fri, 14 Aug 2009 17:00:27 +0000 (13:00 -0400)]
Added in running stats for btt
Create overall system and per-device statistics files containing
MB-per-second and I/Os-per-second values. Each file has two columns:
an (integer) time stamp (seconds since start of run) and a (double)
value.
File names are:
sys_mbps_fp.dat - system-wide mbps (for all devices watched, of course)
sys_iops_fp.dat - I/Os per sec
Each device watched will have a file with the device preceding the
_mbps or _iops section of the above file names.
Jens Axboe [Mon, 11 May 2009 12:00:10 +0000 (14:00 +0200)]
Version 1.0.1
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Eric Sandeen [Thu, 7 May 2009 16:08:25 +0000 (11:08 -0500)]
blkrawverify: warn and return error if no traces are found
blkrawverify prints no errors and returns success if the
requested tracefiles aren't found:
# blkrawverify foobar
Verifying foobar
# echo $?
0
With this change it's a bit more informative:
# ./blkrawverify foobar
Verifying foobar
No tracefiles found for foobar
# echo $?
1
Resolves Red Hat Bugzilla #499581
Reported-by: Milos Malik <mmalik@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Eric Sandeen [Mon, 4 May 2009 21:04:51 +0000 (16:04 -0500)]
blkiomon manpage and usage reference invalid "msg-queue-name" option
The blkiomon usage text and man page reference a
"msg-queue-name" option, but getopts is only looking
for "msg-queue" - fix the docs to match the code.
Reported-by: Milos Malik <mmalik@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Eric Sandeen [Mon, 4 May 2009 20:40:53 +0000 (15:40 -0500)]
fix up btrace options & manpage
The btrace script & man page didn't quite match for options,
and the btrace script was missing a few options in the getopts
specification (b&n). Also, there seems to be no such thing as
a "summarize" option anywhere, so remove it.
Reported-by: Milos Malik <mmalik@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Eric Sandeen [Mon, 4 May 2009 20:16:31 +0000 (15:16 -0500)]
more manpage fixups
Fix various typos & inconsistencies in man pages.
I think the manpages could use a general tidy-up, but this
mostly fixes things which I'd consider "errors" vs. style
issues.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Eric Sandeen [Thu, 30 Apr 2009 15:56:04 +0000 (10:56 -0500)]
fix max-pkts option inconsistencies
This is for RH bug 498426,
btrecord doesn't recognize --max-pkts and --max-packets cmd line options
Usage text and some documentation references "--max-pkts" and
the man page references "--max-packets" but the getopts code
is looking for --max_pkts.
I'm not sure if it's best to make the code match the majority
of the docs, or fix the docs to match the code? This patch does
the former, but I'm not picky if you want to go the other way :)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Thu, 30 Apr 2009 17:10:31 +0000 (13:10 -0400)]
Converted to using the correct remap entries
This follows the kernel changes to the blk_io_trace_remap structure to
better align the names of the structure elements with the real intent of
"from" and "to" (devices & sectors).
See the kernel patches @
http://lkml.org/lkml/2009/4/30/340
http://lkml.org/lkml/2009/4/30/341
(Note: since the ABI order didn't change, old user code will work with
the new kernel code & vice versa.)
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Mon, 11 May 2009 06:41:33 +0000 (08:41 +0200)]
blkiomon: fix unaligned accesses on ia64
Commit 7aa3ebcec011bfe9cc60d6476252c03376a37551 packed
the blkiomon_stat structure so that traces from one
arch could be analyzed on another (in truth only x86
is different, at least from x86_64/ia64/ppc/ppc64/s390/s390x).
Moving the __u32 device member instead of a new padding field should be
fine.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Mon, 20 Apr 2009 14:07:11 +0000 (16:07 +0200)]
fix off-by-one issues in blkiomon.h
Fix two off-by-one issues. Last bucket of histogram was omitted by mistake
when being converted to big-endian or when being merged with another bucket.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Mon, 20 Apr 2009 14:05:18 +0000 (16:05 +0200)]
fix include statement in stats.h
Some endianness conversion macros have been moved and the corresponding
header file is gone. Need to adapt an include statement in stats.h. The
compiler did not complain because blkiomon.c accidentally included blktrace.h
prior to stats.h. Introduced by blktrace rewrite which became version 2.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jeff Moyer [Fri, 17 Apr 2009 06:50:22 +0000 (08:50 +0200)]
handle race to mkdir at startup
I ran into a problem when specifying -D dirname-that-doesnt-yet-exist.
Blktrace would fail, spewing the following messages:
[root@megadeth blktrace]# ./blktrace -d /dev/cciss/c0d1 -D ./2.6.30-rc2-cfq-local
Destination dir ./2.6.30-rc2-cfq-local/ can't be made: 17/File exists
Destination dir ./2.6.30-rc2-cfq-local/ can't be made: 17/File exists
Destination dir ./2.6.30-rc2-cfq-local/ can't be made: 17/File exists
Destination dir ./2.6.30-rc2-cfq-local/ can't be made: 17/File exists
Destination dir ./2.6.30-rc2-cfq-local/ can't be made: 17/File exists
FAILED to start thread on CPU 0: 1/Operation not permitted
FAILED to start thread on CPU 4: 1/Operation not permitted
FAILED to start thread on CPU 5: 1/Operation not permitted
FAILED to start thread on CPU 6: 1/Operation not permitted
FAILED to start thread on CPU 7: 1/Operation not permitted
I tracked it down to the fact that there is no synchronization between
threads when trying to create the output directory. The fix is simple,
just allow the race to happen and detect it. It's not really worth
putting in any extra synchronization. It looks like no place else in
that startup path needs synchronization either.
This patch fixes the issue for me. I tested it by running the very
command that caused me headaches 100% of the time before. I also did a
chattr +i on the directory and verified that it would really fail in the
case where it couldn't create the directory.
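The "allow the race and detect it" approach can be sketched as follows.
This is not the patch itself; make_output_dir is a hypothetical name,
and the extra stat() check is an addition of the sketch to make sure
the existing path really is a directory.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Create dir; another thread winning the race (EEXIST) is not an error.
 * Returns 0 on success, -1 with errno set otherwise. */
int make_output_dir(const char *dir)
{
	struct stat st;

	if (mkdir(dir, 0755) == 0)
		return 0;
	if (errno != EEXIST)
		return -1;
	/* Something already exists at that path: make sure it is a directory. */
	if (stat(dir, &st) < 0)
		return -1;
	if (!S_ISDIR(st.st_mode)) {
		errno = ENOTDIR;
		return -1;
	}
	return 0;
}
```

Every thread can call this without any locking: whichever thread loses
the race simply sees EEXIST and carries on.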
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Mon, 6 Apr 2009 11:30:16 +0000 (07:30 -0400)]
Fixed plug/unplug logic in btt
Was not accounting for unplugged time due to timeout unplugs.
Alan D. Brunelle [Thu, 2 Apr 2009 16:09:08 +0000 (12:09 -0400)]
Working on fixing % time q plugged
Eric Sandeen [Thu, 26 Mar 2009 19:44:06 +0000 (20:44 +0100)]
fix trivial typo in manpage
Fix small typo as reported by Kent Baxley <kbaxley@redhat.com>
in Red Hat Bug 489941 - tiny typo in blktrace manpage
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <axboe@carl.(none)>
Carl Henrik Lunde [Fri, 6 Feb 2009 11:17:25 +0000 (12:17 +0100)]
Add NOTIFY to activity mask
Allow masking in messages by using "-a notify".
Signed-off-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tom Zanussi [Mon, 23 Mar 2009 18:41:00 +0000 (19:41 +0100)]
Blktrace failed to lock reader threads on the CPU used by the corresponding
writer. This resulted in stale data being consumed when blktrace accidentally
read at a position that was being written to at the same time. This issue
surfaced as "bad trace magic" warnings emitted by blktrace tools.
The problem occurred on an SMP System z machine. The patch fixes the issue.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@carl.(none)>
Alan D. Brunelle [Thu, 12 Mar 2009 14:29:40 +0000 (10:29 -0400)]
Generate matplotlib plots for btt generated data
btt_plot.py: Generate matplotlib plots for BTT generated data files
Files handled:
AQD - Average Queue Depth Running average of queue depths
BNOS - Block numbers accessed Markers for each block
Q2D - Queue to Issue latencies Running averages
D2C - Issue to Complete latencies Running averages
Q2C - Queue to Complete latencies Running averages
Usage:
btt_plot_aqd.py equivalent to: btt_plot.py -t aqd
btt_plot_bnos.py equivalent to: btt_plot.py -t bnos
btt_plot_q2d.py equivalent to: btt_plot.py -t q2d
btt_plot_d2c.py equivalent to: btt_plot.py -t d2c
btt_plot_q2c.py equivalent to: btt_plot.py -t q2c
Arguments:
[ -A | --generate-all ] Default: False
[ -L | --no-legend ] Default: Legend table produced
[ -o <file> | --output=<file> ] Default: <type>.png
[ -T <string> | --title=<string> ] Default: Based upon <type>
[ -v | --verbose ] Default: False
<data-files...>
The -A (--generate-all) argument is different: when this is specified,
an attempt is made to generate default plots for all 5 types (aqd, bnos,
q2d, d2c and q2c). It will find files with the appropriate suffix for
each type ('aqd.dat' for example). If such files are found, a plot for
that type will be made. The output file name will be the default for
each type. The -L (--no-legend) option will be obeyed for all plots,
but the -o (--output) and -T (--title) options will be ignored.
Jens Axboe [Wed, 18 Feb 2009 12:11:48 +0000 (13:11 +0100)]
Merge branch 'master' of ssh://axboe@router.home.kernel.dk/data/git/blktrace
Jens Axboe [Wed, 18 Feb 2009 12:08:08 +0000 (13:08 +0100)]
Update Jenkins hash to lookup3() variant
It both mixes and performs better than lookup2().
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Tue, 17 Feb 2009 13:48:40 +0000 (08:48 -0500)]
Fixed EAGAIN handling in blktrace.c
EAGAIN was causing header failures in network mode. Added in a usleep
and retried the recv().
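A rough sketch of the sleep-and-retry idea, assuming a helper that must
read a fixed-size header. The name recv_full and the 50 microsecond
back-off are invented for the example, and the EINTR handling is an
addition that the commit message does not mention.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Read exactly len bytes, sleeping briefly and retrying on EAGAIN.
 * Returns 0 on success, -1 on error or if the peer closes early. */
int recv_full(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = recv(fd, p, len, 0);

		if (n > 0) {
			p += n;
			len -= (size_t)n;
		} else if (n == 0) {
			return -1;	/* connection closed */
		} else if (errno == EAGAIN || errno == EINTR) {
			usleep(50);	/* arbitrary back-off for this sketch */
		} else {
			return -1;
		}
	}
	return 0;
}
```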
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Jens Axboe [Tue, 17 Feb 2009 12:39:40 +0000 (13:39 +0100)]
O_NOATIME isn't always present
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Fri, 13 Feb 2009 17:38:54 +0000 (12:38 -0500)]
btt: Added no remap option
Trying to run btt on pre-2.6.19 kernels has problems handling the
previous remap PDU - it did not include a proper device-from field.
(Probably should have bumped the blktrace version when we did that.)
This option just tosses those out as it just results in lots of crazy
stuff being handled.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Alan D. Brunelle [Fri, 13 Feb 2009 17:43:45 +0000 (12:43 -0500)]
btt general cleanup plus valgrind clean
Lots of general clean up of code, getting interfaces across different
files to be similar (all are now alloc/free), and made it valgrind clean.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Alan D. Brunelle [Thu, 12 Feb 2009 19:28:51 +0000 (14:28 -0500)]
btt: Missed fopen conversion to my_fopen
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Alan D. Brunelle [Thu, 12 Feb 2009 18:30:47 +0000 (13:30 -0500)]
Code review updates
Re-coding large functions, re-arranging some stuff.
Alan D. Brunelle [Thu, 12 Feb 2009 16:13:20 +0000 (11:13 -0500)]
Reworked blktrace master/thread interface
Allows parallel initializations.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Alan D. Brunelle [Thu, 12 Feb 2009 13:01:42 +0000 (08:01 -0500)]
Cleaned up devs that have no data
Working around an issue with older kernels (pre-2.6.19): remaps did not
include device-from, so the pad field is being used for a device which
never has any Q or Ds done to it (it's an invalid ID). This code removes
all such devices before output processing.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Alan D. Brunelle [Wed, 11 Feb 2009 21:16:12 +0000 (16:16 -0500)]
Moved starting of tracing after tracers are going
Hold off BLKTRACESTART until threads are ready to consume traces.
Alan D. Brunelle [Wed, 11 Feb 2009 18:40:09 +0000 (13:40 -0500)]
btt: fixed open in setup_ifile
Took my_open & my_fopen code from blktrace 2.0: needed to add in open
resource limit increasing stuff.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Alan D. Brunelle [Wed, 11 Feb 2009 18:23:21 +0000 (13:23 -0500)]
Synchronized trace gathering
Previously, each tracer thread would start gathering traces as soon as
it got going - which might slow down later thread start ups. This change
allows each thread to be ready to gather traces, and then the main
thread starts all the threads gathering at the same time.
Alan D. Brunelle [Wed, 11 Feb 2009 18:10:13 +0000 (13:10 -0500)]
Invoke gethostbyname once, handle errors better
Instead of invoking gethostbyname once per client, we only need to do it
once at initialization time. Plus: gethostbyname has a non-standard
errno reporting mechanism, handle this better.
Alan D. Brunelle [Wed, 11 Feb 2009 16:42:09 +0000 (11:42 -0500)]
Added accept as a system call needing resource increases
accept(2) opens a socket, and thus needs to handle EMFILE/ENFILE errors
like other system calls.
Alan D. Brunelle [Mon, 9 Feb 2009 20:11:49 +0000 (15:11 -0500)]
Rewrote blktrace to have a single thread per CPU
Massive changes: mostly around the notion of having much fewer threads
(instead of N(devs) X N(cpus) threads, we'll have just N(cpus)). This
is very important for larger systems (with lots of devices to
trace). A lot of the code was stolen from the original blktrace code,
major changes include:
o On the client side we only have a single thread per client CPU. Each
thread will then open all device files for that CPU, and use poll to
determine which file needs processing.
o For network client mode w/ sendfile, this means that a single socket
will carry all data to the remote network server. The network server
side will then distribute its reads off that one socket onto different
trace files.
o For network client mode w/out sendfile, we fall back to doing things
like piped mode: keep buffers of traces read in, and then the main
thread will issue these on sockets to the server. In this case, the main
thread will still have a single socket per CPU.
o For networked mode we added an OPEN concept on the client side: as
soon as the connection to the server is set up, a "header" is sent
signifying that this connection will handle a <cpu, device> tuple. For
each socket opened on the client side, it will send a header per device
being managed. The server side will utilize opens to set up
appropriate data structures to handle incoming data streams.
o For both the OPEN and CLOSE headers the server will acknowledge with
a short write back to the client. This allows the client & server sides
to gracefully close socket connections.
o I also re-did the resource limitation issue a bit differently: for
open calls (including socket) or for memory map/lock calls I have
provided a wrapper function that will try to increase specific limits as
needed. The previous method (attempting to do it at the beginning of the
run) fails for network server mode - you don't know at initialization
how many devices and CPUs will be handled.
o The standard output is slightly different in a few places, if this is
a problem w/ compatibility we can work to rectify that. The command line
argument handling is identical though.
o Using code stolen from Linux to manipulate doubly-linked lists. I've
found that this makes the code easier to read/write (but may be a bit of
overkill here...)
o The code passes valgrind quite well (at least for my tests so far).
The only nit has to do with inet_ntoa - but that is out of our control.
Thanks to Stefan Raspl <raspl@linux.vnet.ibm.com> for testing and
finding some issues and for providing suggestions.
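The first point above (one thread per CPU using poll to find the file
that needs processing) could look roughly like the sketch below. It is
not taken from blktrace.c; first_readable is a hypothetical helper and
a real tracer thread would handle every ready descriptor, not just the
first one.

```c
#include <poll.h>

/* Wait up to timeout_ms; return the index of the first descriptor with
 * data to read, or -1 if none became readable (or poll failed). */
int first_readable(struct pollfd *pfds, int nfds, int timeout_ms)
{
	int i;

	if (poll(pfds, (nfds_t)nfds, timeout_ms) <= 0)
		return -1;
	for (i = 0; i < nfds; i++) {
		if (pfds[i].revents & POLLIN)
			return i;
	}
	return -1;
}
```

The caller fills one struct pollfd per device file for its CPU, with
events set to POLLIN, and calls this in its main loop.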
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Alan D. Brunelle [Fri, 23 Jan 2009 14:48:21 +0000 (09:48 -0500)]
Fix btt to handle large numbers of output files
Simply bump resource limits if file opens fail, and retry.
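A minimal sketch of "bump and retry", assuming the failure is EMFILE or
ENFILE from open(2). The names bump_nofile_limit and open_retry and the
step of 16 descriptors are hypothetical; the actual btt code may raise
the limit differently.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>

/* Try to raise the soft RLIMIT_NOFILE; returns 0 if it was raised. */
int bump_nofile_limit(void)
{
	struct rlimit rl;
	rlim_t want;

	if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= rl.rlim_max)
		return -1;	/* already at the hard limit */
	want = rl.rlim_cur + 16;	/* arbitrary step for this sketch */
	if (want > rl.rlim_max)
		want = rl.rlim_max;
	rl.rlim_cur = want;
	return setrlimit(RLIMIT_NOFILE, &rl);
}

/* open(2) wrapper: on EMFILE/ENFILE, raise the limit and try again. */
int open_retry(const char *path, int flags)
{
	for (;;) {
		int fd = open(path, flags);

		if (fd >= 0)
			return fd;
		if (errno != EMFILE && errno != ENFILE)
			return -1;
		if (bump_nofile_limit() < 0)
			return -1;
	}
}
```

The loop ends either with a valid descriptor or once the soft limit
has reached the hard limit and cannot be raised any further.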
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Alan D. Brunelle [Fri, 23 Jan 2009 10:04:42 +0000 (11:04 +0100)]
Increased limits to allow for large system runs
On 16-way w/ 104 disks and a 32-way w/ 96 disks, I was getting:
$ sudo blktrace -b 1024 -n 8 -I ../files
./cciss_c1d6.blktrace.10: Too many open files
Failed to start worker threads
Due to the nature of our N(cpus) X N(devices) order of file opens, and
our N(cpus) X N(devices) X N(buffers) X (buffer size) amount of mmaps()
going on we're exceeding both the RLIMIT_NOFILE and RLIMIT_MEMLOCK
limits.
This patch raises limits for RLIMIT_NOFILE and RLIMIT_MEMLOCK to
"infinity", and allows blktrace to handle the large(ish) systems. (If
these settings fail, we "guestimate" about how much we really need.)
There is still an underlying blktrace and/or kernel problem: The
directory /sys/kernel/debug/block/<DSF> where <DSF> is the device that
encountered the limit is left behind (not cleaned up correctly). This
stops blktrace from running a second time (even on another device):
$ ls /sys/kernel/debug/block
cciss_c1d6
$ sudo blktrace /dev/sda
BLKTRACESETUP: No such file or directory
Failed to start trace on /dev/sda
and requires a reboot. (Looking into that next, as this patch - whilst
stopping the original problem from happening - does not address the
secondary problem. And there may be some other ways for the secondary
problem to still occur...)
I also fixed a warning concerning ftruncate's return value being
ignored.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Signed-off-by: Jens Axboe <axboe@carl.(none)>
Martin Peschke [Wed, 21 Jan 2009 13:58:35 +0000 (14:58 +0100)]
A couple of min-counters weren't initialised correctly (thrput_r,
thrput_w).
We have got a perfectly working init function for this purpose.
Removing partially duplicated code.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Wed, 21 Jan 2009 13:58:35 +0000 (14:58 +0100)]
The git commit 11914a53d2ec2974a565311af327b8983d8c820d added __BLK_TA_ABORT
to blktrace_api.h. A corresponding addition to the blktrace tools repository
has been missing, breaking the API. Blkparse complained:
"Bad fs action 40010011"
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Mon, 12 Jan 2009 17:29:53 +0000 (18:29 +0100)]
Added no messages option to blkparse.c
Added a new option (-M, --no-msgs) to blkparse: I have found that
the CFQ I/O scheduler sends a *tremendous* number of messages, which
bloat the .bin file generated when using the -d option. The file sizes
can shrink by >50% when using the -M option in those cases.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Wed, 12 Nov 2008 12:11:22 +0000 (07:11 -0500)]
gcc 4.3.2 has started to warn about:
gcc -Wall -W -O2 -g -I. -I.. -D_GNU_SOURCE -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -c -o output.o output.c
output.c: In function ‘output_section_hdr’:
output.c:57: warning: format not a string literal and no format
arguments
output.c: In function ‘__output_pip_avg’:
output.c:496: warning: format not a string literal and no format
arguments
output.c:496: warning: format not a string literal and no format
arguments
so this patch cleans this up.
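The usual way to silence this warning is to pass the string as an
argument to an explicit "%s" format. The sketch below shows the
pattern only; print_header is a made-up name and the real calls in
output.c are not reproduced here.

```c
#include <stdio.h>

/* Passing a non-literal string as the format triggers the gcc warning and
 * misbehaves if the string contains '%'. Use an explicit "%s" instead. */
int print_header(FILE *ofp, const char *hdr)
{
	/* was (conceptually): fprintf(ofp, hdr); */
	return fprintf(ofp, "%s", hdr);
}
```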
Alan D. Brunelle [Tue, 11 Nov 2008 18:51:33 +0000 (13:51 -0500)]
Merge branch 'add-P'
Alan D. Brunelle [Tue, 11 Nov 2008 18:50:23 +0000 (13:50 -0500)]
Merge branch 'fix-m'
Alan D. Brunelle [Tue, 11 Nov 2008 18:46:13 +0000 (13:46 -0500)]
Added -P to create a data file w/ Q, D and C per line
Easy parsing for graph creation
Alan D. Brunelle [Tue, 11 Nov 2008 18:42:55 +0000 (13:42 -0500)]
Merge branch 'fix-m' into add-P
Alan D. Brunelle [Tue, 11 Nov 2008 18:41:05 +0000 (13:41 -0500)]
Fixed 'M' displays on per-io output and added in I/O separator
Alan D. Brunelle [Tue, 11 Nov 2008 18:40:10 +0000 (13:40 -0500)]
Fixed segfault in aqd.c : need to check for NULL (not requested)
Alan D. Brunelle [Mon, 10 Nov 2008 15:35:44 +0000 (10:35 -0500)]
Added in -z to provide running waiting-for-issue latencies
Alan D. Brunelle [Mon, 10 Nov 2008 15:34:02 +0000 (10:34 -0500)]
Merge branch 'master' of ssh://alanbrunelle@git.kernel.dk/data/git/blktrace
Jens Axboe [Thu, 30 Oct 2008 14:06:01 +0000 (15:06 +0100)]
Set release version 1.0.0
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Thu, 30 Oct 2008 14:04:20 +0000 (15:04 +0100)]
Update rbtree to version with unified parent + color
This saves 4-8 bytes per node, which can be quite a bit of memory
with blktrace.
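The saving comes from storing the color in the otherwise unused low
bit of the parent pointer. The sketch below is modeled on the Linux
kernel rbtree; the field and helper names are assumptions and may not
match rbtree.h in this repository exactly.

```c
#include <stdint.h>

#define RB_RED   0
#define RB_BLACK 1

/* Nodes are at least 4-byte aligned, so the low bits of the parent
 * pointer are always zero and can carry the color. */
struct rb_node {
	uintptr_t parent_color;	/* parent pointer | color bit */
	struct rb_node *left;
	struct rb_node *right;
};

struct rb_node *rb_parent(const struct rb_node *n)
{
	return (struct rb_node *)(n->parent_color & ~(uintptr_t)3);
}

int rb_color(const struct rb_node *n)
{
	return (int)(n->parent_color & 1);
}

void rb_set_parent(struct rb_node *n, struct rb_node *parent)
{
	n->parent_color = (n->parent_color & 3) | (uintptr_t)parent;
}

void rb_set_color(struct rb_node *n, int color)
{
	n->parent_color = (n->parent_color & ~(uintptr_t)1) | (uintptr_t)(color & 1);
}
```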
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Wed, 29 Oct 2008 19:04:03 +0000 (15:04 -0400)]
Moved btrecord/btreplay to version 1.0.0
Martin Peschke [Tue, 28 Oct 2008 16:08:12 +0000 (17:08 +0100)]
blkiomon: add through-put statistics
Add accounting of per-request throughput in bytes per millisecond
both for read and write I/O.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Tue, 28 Oct 2008 16:08:10 +0000 (17:08 +0100)]
blkiomon: separate statistics for read and write requests
Split min/max/avg statistics for request sizes and dispatch-to-completion
latencies into separate statistics for read and write requests.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Tue, 28 Oct 2008 16:08:07 +0000 (17:08 +0100)]
blkiomon: fix some debug messages
Cleaning up error messages. Some perror()'s didn't make sense.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Tue, 28 Oct 2008 16:08:05 +0000 (17:08 +0100)]
blkiomon: fix trace debug output
Removed leftovers of trace tree and made debug code work by using trace
hash instead of trace tree.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Tue, 28 Oct 2008 16:08:02 +0000 (17:08 +0100)]
blkiomon: fix unit in histogram output
Fix unit of request sizes as printed in histogram (it's bytes not kilobytes).
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Tue, 28 Oct 2008 16:07:57 +0000 (17:07 +0100)]
blkiomon: fix cross-arch data analysis issue
This fixes cross-arch issues. Binary data gathered on System z could not
be read on x86.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Mon, 20 Oct 2008 16:14:21 +0000 (18:14 +0200)]
blkiomon: drv_data traces pass-through
This patch adds pass-through support for device driver specific traces
to blkiomon. This way we can aggregate block I/O statistics and device
driver specific statistics at the same time.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Fri, 17 Oct 2008 13:09:05 +0000 (15:09 +0200)]
blkparse: add hint for discarded drv_data traces
Display an informational message on blkparse exit to notify users that
additional data was available which would require to be dumped to binary
output.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Thu, 16 Oct 2008 16:03:45 +0000 (12:03 -0400)]
Added in -L option - output periodic latency information
Alan D. Brunelle [Thu, 16 Oct 2008 14:53:07 +0000 (10:53 -0400)]
Added in -Q / --active-queue-depth option
This will output a data file containing the time stamp and number of
I/Os issued to underlying drivers per device. It will give you an idea
as to how many I/Os are being actively worked per device at any time
during the run.
Stefan Raspl [Thu, 16 Oct 2008 06:14:17 +0000 (08:14 +0200)]
Add driver data support
Adds a new type of action 'drv_data' for blktrace to handle binary
driver-specific data. Since the data is binary, blkparse will only put it in
a binary file, not in the regular human-readable output.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Thu, 16 Oct 2008 06:05:53 +0000 (08:05 +0200)]
blktrace: accept -v (lower case) for version info as well
Christof Schmitt <christof.schmitt@de.ibm.com> points out that the
documentation uses -v but blktrace supports only -V, so change
blktrace to accept both cases.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Martin Peschke [Thu, 16 Oct 2008 05:48:25 +0000 (07:48 +0200)]
blkiomon: I/O monitor
blkiomon periodically generates per-device request size and request latency
statistics from blktrace data. It provides histograms as well as data that
can be used to calculate min, max, average and variance. For this purpose,
it consumes D and C traces read from stdin.
There are options for binary output and human-readable output to files and
stdout. Output to a message queue is supported as well.
#blktrace /dev/sdw -a issue -a complete -w 200 -o - | blkiomon -I 8 -h -
time: Tue Sep 30 17:39:25 2008
device: 65,96
requests: read 62, write 40, bidir: 0
sizes: num 102, min 4096, max 430080, sum 13312000, squ 3102442782720, avg 130509.8, var 13383296793.3
d2c: num 102, min 393, max 14261, sum 359441, squ 2830211755, avg 3523.9, var 15329081.8
sizes histogram (in kB):
         0:     0       1024:     0       2048:     0       4096:     6
      8192:     0      16384:    15      32768:     4      65536:    24
    131072:    11     262144:    30     524288:    12    1048576:     0
   2097152:     0    4194304:     0    8388608:     0  > 8388608:     0
d2c histogram (in usec):
         0:     0          8:     0         16:     0         32:     0
        64:     0        128:     0        256:     0        512:    13
      1024:    21       2048:    27       4096:    14       8192:     8
     16384:    19      32768:     0      65536:     0     131072:     0
    262144:     0     524288:     0    1048576:     0    2097152:     0
   4194304:     0    8388608:     0   16777216:     0   33554432:     0
> 33554432:     0
time: Tue Sep 30 17:39:33 2008
device: 65,96
requests: read 312, write 47, bidir: 0
sizes: num 359, min 4096, max 430080, sum 13197312, squ 1575816790016, avg 36761.3, var 3038067547.5
d2c: num 359, min 294, max 9211, sum 387134, squ 1262489694, avg 1078.4, var 2353807.5
sizes histogram (in kB):
         0:     0       1024:     0       2048:     0       4096:    32
      8192:    17      16384:   133      32768:    87      65536:    59
    131072:     9     262144:    18     524288:     4    1048576:     0
   2097152:     0    4194304:     0    8388608:     0  > 8388608:     0
d2c histogram (in usec):
         0:     0          8:     0         16:     0         32:     0
        64:     0        128:     0        256:     0        512:   129
      1024:   164       2048:    33       4096:    15       8192:    13
     16384:     5      32768:     0      65536:     0     131072:     0
    262144:     0     524288:     0    1048576:     0    2097152:     0
   4194304:     0    8388608:     0   16777216:     0   33554432:     0
> 33554432:     0
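As a worked check of how average and variance follow from the num, sum and squ fields, the sketch below recomputes them for the sizes line of the first interval above. The helper name is hypothetical, and treating var as the population variance (squ/num minus the squared average) is an assumption that happens to agree with the sample output; it is not copied from the blkiomon source:

```python
def summarize(num, total, squares):
    """Derive (average, variance) from a count, a sum and a sum of squares."""
    avg = total / num
    # Population variance: mean of the squares minus the square of the mean.
    var = squares / num - avg * avg
    return avg, var


# Values from the "sizes:" line of the first sample interval.
avg, var = summarize(num=102, total=13312000, squares=3102442782720)
print(f"avg {avg:.1f}, var {var:.1f}")  # prints "avg 130509.8, var 13383296793.3"
```

Because only three running totals are needed, a consumer of the binary output could merge several intervals by adding their num, sum and squ fields before deriving the statistics.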
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Fri, 10 Oct 2008 12:40:57 +0000 (08:40 -0400)]
Removed excessive amounts of seek modes (for random sets of I/Os)
When doing a random load, we'd get a LARGE number of single-seek buckets;
this patch just notes that fact without dumping all the data.
Alan D. Brunelle [Fri, 10 Oct 2008 12:40:21 +0000 (08:40 -0400)]
Merge branch 'master' of ssh://alanbrunelle@git.kernel.dk/data/git/blktrace
Nathan Scott [Fri, 26 Sep 2008 09:05:51 +0000 (11:05 +0200)]
spec file tweak
I found I needed this tweak to make the spec file usable on a RHEL5
system (I hit rpmbuild errors about unpackaged files without this).
Not sure if it's correct though; maybe someone with stronger rpm-fu
than I have could take a closer look?
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 26 Sep 2008 09:05:16 +0000 (11:05 +0200)]
Merge branch 'master' of ssh://axboe@router.home.kernel.dk/data/git/blktrace
Nathan Scott [Fri, 26 Sep 2008 09:05:13 +0000 (11:05 +0200)]
man page typo
This fixes up a formatting problem when displaying the blktrace
man pages - just a typo (bold mode not correctly terminated, so
it flows on until the next bold-mode-termination a few lines on).
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Tue, 23 Sep 2008 12:06:41 +0000 (08:06 -0400)]
Added in %done for btt
David Woodhouse [Fri, 15 Aug 2008 09:12:20 +0000 (11:12 +0200)]
Add documentation of 'D' discard operation
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
David Woodhouse [Fri, 15 Aug 2008 08:44:39 +0000 (10:44 +0200)]
blktrace: support discard requests
Add support for discard requests to blktrace userspace tools.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jeff Moyer [Tue, 1 Jul 2008 11:35:29 +0000 (13:35 +0200)]
spelling and grammar fixes for btreplay.tex
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Tue, 27 May 2008 14:39:14 +0000 (10:39 -0400)]
Put message notes from kernel into a separate file for easy tracking
Also made made_dev_hdr standard for all usages.
Alan D. Brunelle [Tue, 27 May 2008 12:34:13 +0000 (08:34 -0400)]
Added in new message updates to the documentation.
Alan D. Brunelle [Tue, 27 May 2008 12:19:57 +0000 (08:19 -0400)]
Added in handling of MESSAGE notes
Sample output:
  8,16   1   691118    17.417000000     0  C   R 2660776 + 8 [0]
  8,16   1   691119    17.417000000     0  D   R 2660792 + 8 [swapper]
  8,16   1   691120    17.417000000  4688  U   N [dd] 42
  8,16   1        0    17.418000000     0  m   N elv switch: noop
  8,16   1   691121    17.418000000  4638  C   R 2660784 + 8 [0]
  8,16   1   691122    17.418000000  4638  D   R 2660800 + 8 [bash]
  8,16   1   691123    17.418000000  4638  C   R 2660792 + 8 [0]
Thanks to Carl Henrik Lunde <chlunde@ping.uio.no> for adding in sequence
printing & time-stamp correction.
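For readers post-processing such output, here is a small sketch that splits one line into fields and picks out message notes (action "m"). It assumes the column order seen in the sample above (device, CPU, sequence, time stamp, PID, action, RWBS, remainder); the function and key names are hypothetical, and output produced with a custom format string would need a different parser:

```python
def parse_line(line):
    """Split one default-format output line into named fields."""
    fields = line.split(None, 7)
    dev, cpu, seq, timestamp, pid, action, rwbs = fields[:7]
    major, minor = dev.split(",")
    return {
        "major": int(major),
        "minor": int(minor),
        "cpu": int(cpu),
        "sequence": int(seq),
        "time": float(timestamp),
        "pid": int(pid),
        "action": action,
        "rwbs": rwbs,
        # Sector range and process name, or the text of a message note.
        "payload": fields[7].strip() if len(fields) > 7 else "",
    }


note = parse_line("8,16 1 0 17.418000000 0 m N elv switch: noop")
io = parse_line("8,16 1 691118 17.417000000 0 C R 2660776 + 8 [0]")
if note["action"] == "m":
    print("message:", note["payload"])
```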
Alan D. Brunelle [Wed, 21 May 2008 19:55:57 +0000 (15:55 -0400)]
Handled no difference in seek times
For some reason, with recent kernels (2.6.25.4, for example), we've lost a lot
of resolution in our blktrace times. This can result in lots of things
happening "simultaneously." This change at least tries to handle the
case where all the seeks happen at once.
Probably have other issues that need to be looked into...
Alan D. Brunelle [Wed, 21 May 2008 14:55:20 +0000 (10:55 -0400)]
Added in -m option, seeks-per-second
btt can now output data files containing seeks-per-second information.
Jens Axboe [Sun, 18 May 2008 18:55:25 +0000 (20:55 +0200)]
blkparse: cope with missing process notify event
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alan D. Brunelle [Mon, 12 May 2008 13:56:33 +0000 (09:56 -0400)]
Fixed percentage calculations for phases of an I/O
Alan D. Brunelle [Fri, 9 May 2008 17:46:47 +0000 (13:46 -0400)]
Added S2G times + fixed up -X output to include X2X
Including Q2Q, Q2G, S2G, G2I, Q2M, I2D, M2D, D2C, Q2C.
S2G is part of Q2G, and shows the number of times we had to sleep to
get a request.
Ignored 0-byte I/Os - coming from barrier I/Os...
Alan D. Brunelle [Thu, 8 May 2008 19:28:32 +0000 (15:28 -0400)]
Added -X option - generate easily parseable file
Writes a portion of the default output into a separate file that is more
easily parsed.
Luis Useche [Mon, 5 May 2008 18:53:13 +0000 (20:53 +0200)]
Add -x accelerator option
This patch adds new functionality to the btreplay tool: the -x option.
This parameter accelerates the replay by the specified factor, which
means that the stall time is divided by the given number.
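The scaling can be illustrated as follows. This is only a sketch of the arithmetic described above; the function name and the nanosecond unit are assumptions, not btreplay internals:

```python
def scaled_stall(stall_ns, factor):
    """Return the stall time after applying an acceleration factor."""
    if factor <= 0:
        raise ValueError("acceleration factor must be positive")
    return stall_ns / factor


# A 1 ms gap between recorded I/Os, replayed four times faster.
print(scaled_stall(1_000_000, 4))  # prints 250000.0
```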
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Luis Useche [Mon, 5 May 2008 18:53:10 +0000 (20:53 +0200)]
Fix problem with -w option
This patch fixes the problem that occurs when the -w option is used in
file mode (i.e., not FIFO mode). The fix moves the stopwatch_end check to
after the time has been updated with genesis. The same applies to the
stopwatch_start check.
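The ordering issue can be sketched like this: the event time is first made relative to genesis, and only then compared with the stopwatch window. The function name, the nanosecond units, and the inclusive bounds are assumptions for illustration and do not mirror the actual source:

```python
def within_stopwatch(raw_time_ns, genesis_ns, start_ns, end_ns):
    """Check an event against the stopwatch window after genesis adjustment."""
    # Update the time with genesis first ...
    relative_ns = raw_time_ns - genesis_ns
    # ... and only then test it against the window.
    return start_ns <= relative_ns <= end_ns


genesis = 5_000_000_000          # hypothetical time of the first event
window = (0, 2_000_000_000)      # hypothetical window: first two seconds
event = 6_000_000_000            # one second after genesis

print(within_stopwatch(event, genesis, *window))  # prints True
# Comparing the unadjusted time would wrongly place the event outside:
print(window[0] <= event <= window[1])            # prints False
```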
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Luis Useche [Mon, 5 May 2008 18:53:07 +0000 (20:53 +0200)]
eliminate check of empty -F format
This patch removes the check that rejects an empty -F format. I am
using an empty format to blank out the events that I do not want for
a certain act mask. Note that there is no real reason to have this
check.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>