Age  Commit message  Author
2009-05-11  fix max-pkts option inconsistencies  Eric Sandeen
This is for RH bug 498426: btrecord doesn't recognize the --max-pkts and --max-packets command-line options. The usage text and some documentation reference "--max-pkts" and the man page references "--max-packets", but the getopts code is looking for --max_pkts. I'm not sure if it's best to make the code match the majority of the docs, or fix the docs to match the code. This patch does the former, but I'm not picky if you want to go the other way :) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
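A minimal sketch of the direction this patch describes, assuming the option is parsed with a getopt_long() table: the long option name in the table is what has to agree with the usage text. The function name parse_max_pkts and the short option letter 'm' are illustrative assumptions, not taken from the btrecord source.

```c
#define _GNU_SOURCE
#include <getopt.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns the value given to --max-pkts,
 * or -1 if the option was not supplied.
 */
long parse_max_pkts(int argc, char **argv)
{
	/* The name here must match the docs: "max-pkts", not "max_pkts". */
	static const struct option opts[] = {
		{ "max-pkts", required_argument, NULL, 'm' },
		{ NULL, 0, NULL, 0 },
	};
	long max_pkts = -1;
	int c;

	optind = 1;	/* restart scanning so the helper can be called again */
	while ((c = getopt_long(argc, argv, "m:", opts, NULL)) != -1) {
		if (c == 'm')
			max_pkts = strtol(optarg, NULL, 10);
	}
	return max_pkts;
}
```

Accepting "--max-packets" as well would only need a second table entry that maps to the same return value.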
2009-05-11  Converted to using the correct remap entries  Alan D. Brunelle
This follows the kernel changes to the blk_io_trace_remap structure to better align the names of the structure elements with the real intent of "from" and "to" (devices & sectors). See the kernel patches @ http://lkml.org/lkml/2009/4/30/340 http://lkml.org/lkml/2009/4/30/341 (Note: since the ABI order didn't change, old user code will work with the new kernel code & vice versa.) Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-11  blkiomon: fix unaligned accesses on ia64  Martin Peschke
commit 7aa3ebcec011bfe9cc60d6476252c03376a37551 packed the blkiomon_stat structure so that traces from one arch could be analyzed on another (in truth only x86 is different, at least from x86_64/ia64/ppc/ppc64/s390/s390x) Moving the __u32 device member instead of a new padding field should be fine. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-21  fix off-by-one issues in blkiomon.h  Martin Peschke
Fix two off-by-one issues. The last bucket of the histogram was omitted by mistake when being converted to big-endian or when being merged with another bucket. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
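The kind of loop bound this fix concerns can be sketched as follows. This is an illustration under stated assumptions, not the blkiomon.h code: the struct, the bucket count NR_BUCKETS, the 32-bit bucket width, and both function names are hypothetical, and the exact form of the original loops is not shown in the commit message.

```c
#include <arpa/inet.h>
#include <stdint.h>

#define NR_BUCKETS 4	/* hypothetical bucket count */

struct hist {
	uint32_t bucket[NR_BUCKETS];
};

/* Convert every bucket to big-endian, including the last one. */
void hist_to_be(struct hist *h)
{
	int i;

	/* The bound is NR_BUCKETS; stopping at NR_BUCKETS - 1 skips a bucket. */
	for (i = 0; i < NR_BUCKETS; i++)
		h->bucket[i] = htonl(h->bucket[i]);
}

/* Add every bucket of src into dst, including the last one. */
void hist_merge(struct hist *dst, const struct hist *src)
{
	int i;

	for (i = 0; i < NR_BUCKETS; i++)
		dst->bucket[i] += src->bucket[i];
}
```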
2009-04-21  fix include statement in stats.h  Martin Peschke
Some endianness conversion macros have been moved and the corresponding header file is gone. Need to adapt an include statement in stats.h. The compiler did not complain because blkiomon.c accidentally included blktrace.h prior to stats.h. Introduced by the blktrace rewrite which became version 2. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-04-17  handle race to mkdir at startup  Jeff Moyer
I ran into a problem when specifying -D dirname-that-doesnt-yet-exist. Blktrace would fail, spewing the following messages:

    [root@megadeth blktrace]# ./blktrace -d /dev/cciss/c0d1 -D ./2.6.30-rc2-cfq-local
    Destination dir ./2.6.30-rc2-cfq-local/ can't be made: 17/File exists
    Destination dir ./2.6.30-rc2-cfq-local/ can't be made: 17/File exists
    Destination dir ./2.6.30-rc2-cfq-local/ can't be made: 17/File exists
    Destination dir ./2.6.30-rc2-cfq-local/ can't be made: 17/File exists
    Destination dir ./2.6.30-rc2-cfq-local/ can't be made: 17/File exists
    FAILED to start thread on CPU 0: 1/Operation not permitted
    FAILED to start thread on CPU 4: 1/Operation not permitted
    FAILED to start thread on CPU 5: 1/Operation not permitted
    FAILED to start thread on CPU 6: 1/Operation not permitted
    FAILED to start thread on CPU 7: 1/Operation not permitted

I tracked it down to the fact that there is no synchronization between threads when trying to create the output directory. The fix is simple, just allow the race to happen and detect it. It's not really worth putting in any extra synchronization. It looks like no place else in that startup path needs synchronization either.

This patch fixes the issue for me. I tested it by running the very command that caused me headaches 100% of the time before. I also did a chattr +i on the directory and verified that it would really fail in the case where it couldn't create the directory.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
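The "allow the race and detect it" approach can be sketched like this. It is a sketch only, assuming the directory is created with mkdir(2); the helper name make_dir_tolerant and the 0755 mode are illustrative and not taken from blktrace.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 0 if `path` is a directory once we are
 * done, whether this thread created it or another thread got there first.
 */
int make_dir_tolerant(const char *path)
{
	struct stat st;

	if (mkdir(path, 0755) == 0)
		return 0;		/* we created it */
	if (errno != EEXIST)
		return -1;		/* a real failure, e.g. EACCES or ENOENT */

	/* Lost the race (or the path pre-existed): accept only a directory. */
	if (stat(path, &st) < 0)
		return -1;
	if (!S_ISDIR(st.st_mode)) {
		errno = ENOTDIR;
		return -1;
	}
	return 0;
}
```

No lock is needed because mkdir(2) is atomic: exactly one caller succeeds and the others see EEXIST, which is then treated as success.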
2009-04-06  Fixed plug/unplug logic in btt  Alan D. Brunelle
Was not accounting for unplugged time due to timeout unplugs.
2009-04-02  Working on fixing % time q plugged  Alan D. Brunelle
2009-03-26  fix trivial typo in manpage  Eric Sandeen
Fix small typo as reported by Kent Baxley <kbaxley@redhat.com> in Red Hat Bug 489941 - tiny typo in blktrace manpage Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jens Axboe <axboe@carl.(none)>
2009-03-25  Add NOTIFY to activity mask  Carl Henrik Lunde
Allow masking in messages by using "-a notify". Signed-off-by: Carl Henrik Lunde <chlunde@ping.uio.no> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-23  Blktrace failed to lock reader threads on the cpu used by the corresponding  Tom Zanussi
writer. This resulted in stale data being consumed when blktrace accidentally read at a position that was being written to at the same time. This issue surfaced as "bad trace magic" warnings emitted by blktrace tools. The problem occurred on an SMP System z machine. The patch fixes the issue. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <axboe@carl.(none)>
2009-03-12  Generate matplotlib plots for btt generated data  Alan D. Brunelle
btt_plot.py: Generate matplotlib plots for BTT generated data files

Files handled:
    AQD  - Average Queue Depth          Running average of queue depths
    BNOS - Block numbers accessed       Markers for each block
    Q2D  - Queue to Issue latencies     Running averages
    D2C  - Issue to Complete latencies  Running averages
    Q2C  - Queue to Complete latencies  Running averages

Usage:
    btt_plot_aqd.py   equivalent to: btt_plot.py -t aqd
    btt_plot_bnos.py  equivalent to: btt_plot.py -t bnos
    btt_plot_q2d.py   equivalent to: btt_plot.py -t q2d
    btt_plot_d2c.py   equivalent to: btt_plot.py -t d2c
    btt_plot_q2c.py   equivalent to: btt_plot.py -t q2c

Arguments:
    [ -A | --generate-all ]              Default: False
    [ -L | --no-legend ]                 Default: Legend table produced
    [ -o <file> | --output=<file> ]      Default: <type>.png
    [ -T <string> | --title=<string> ]   Default: Based upon <type>
    [ -v | --verbose ]                   Default: False
    <data-files...>

The -A (--generate-all) argument is different: when this is specified, an attempt is made to generate default plots for all 5 types (aqd, bnos, q2d, d2c and q2c). It will find files with the appropriate suffix for each type ('aqd.dat' for example). If such files are found, a plot for that type will be made. The output file name will be the default for each type. The -L (--no-legend) option will be obeyed for all plots, but the -o (--output) and -T (--title) options will be ignored.
2009-02-18  Merge branch 'master' of ssh://axboe@router.home.kernel.dk/data/git/blktrace  Jens Axboe
2009-02-18  Update Jenkins hash to lookup3() variant  Jens Axboe
It both mixes and performs better than lookup2(). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-02-17  Fixed EAGAIN handling in blktrace.c  Alan D. Brunelle
EAGAIN was causing header failures in network mode. Added in a usleep and retried the recv(). Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
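A sketch of the retry pattern the message describes: sleep briefly and call recv() again when it reports EAGAIN. The function name recv_full, the 50 microsecond delay, and the extra EINTR handling are assumptions made for illustration; the commit message only states that a usleep was added and the recv() retried.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly `len` bytes from `fd`.
 * Returns 0 on success, -1 on error or if the peer closes early.
 */
int recv_full(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = recv(fd, p, len, 0);

		if (n > 0) {
			p += n;
			len -= (size_t)n;
			continue;
		}
		if (n == 0)
			return -1;	/* connection closed before a full header */
		if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR) {
			usleep(50);	/* assumed delay; back off, then retry */
			continue;
		}
		return -1;		/* unrecoverable error */
	}
	return 0;
}
```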
2009-02-17  O_NOATIME isn't always present  Jens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
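The commit carries no description beyond its title, so the following is only one common way to cope with a missing O_NOATIME, not necessarily what this patch does: define the flag to 0 when the headers do not provide it. The helper open_noatime and its EPERM fallback are illustrative assumptions.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>

/* On systems whose headers lack O_NOATIME, make it a no-op flag. */
#ifndef O_NOATIME
#define O_NOATIME 0
#endif

/* Hypothetical helper: open read-only, avoiding atime updates if possible. */
int open_noatime(const char *path)
{
	int fd = open(path, O_RDONLY | O_NOATIME);

	/*
	 * On Linux, O_NOATIME fails with EPERM when the caller neither
	 * owns the file nor is privileged; retry without the flag.
	 */
	if (fd < 0 && errno == EPERM)
		fd = open(path, O_RDONLY);
	return fd;
}
```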
2009-02-13  btt: Added no remap option  Alan D. Brunelle
Trying to run btt on pre-2.6.19 kernels has problems handling the previous remap PDU - it did not include a proper device-from field. (Probably should have bumped the blktrace version when we did that.) This option just tosses those out as it just results in lots of crazy stuff being handled. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
2009-02-13  btt general cleanup plus valgrind clean  Alan D. Brunelle
Lots of general clean up of code, getting interfaces across different files to be similar (all are no alloc/free), and made it valgrind clean. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
2009-02-12  btt: Missed fopen conversion to my_fopen  Alan D. Brunelle
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
2009-02-12  Code review updates  Alan D. Brunelle
Re-coding large functions, re-arranging some stuff.
2009-02-12  Reworked blktrace master/thread interface  Alan D. Brunelle
Allows parallel initializations. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
2009-02-12  Cleaned up devs that have no data  Alan D. Brunelle
Working around an issue with older kernels (pre-2.6.19): remaps did not include device-from, so the pad field is being used for a device which never has any Q or Ds done to it (it's an invalid ID). This code removes all such devices before output processing. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
2009-02-11  Moved starting of tracing after tracers are going  Alan D. Brunelle
Hold off BLKTRACESTART until threads are ready to consume traces.
2009-02-11  btt: fixed open in setup_ifile  Alan D. Brunelle
Took my_open & my_fopen code from blktrace 2.0: needed to add in open resource limit increasing stuff. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
2009-02-11  Synchronized trace gathering  Alan D. Brunelle
Previously, each tracer thread would start gathering traces as soon as it got going - which might slow down later thread start ups. This change allows each thread to be ready to gather traces, and then the main thread starts all the threads gathering at the same time.
2009-02-11  Invoke gethostbyname once, handle errors better  Alan D. Brunelle
Instead of invoking gethostbyname once per client, we only need to do it once at initialization time. Plus: gethostbyname has a non-standard errno reporting mechanism, handle this better.
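The non-standard reporting mentioned here is that gethostbyname() sets h_errno rather than errno, so perror() and strerror() give misleading text. A sketch of a one-time lookup that reports failures through hstrerror() follows; the function name resolve_once and the restriction to IPv4 are assumptions for illustration.

```c
#define _GNU_SOURCE
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, meant to be called once at initialization:
 * resolves `host` to an IPv4 address stored in `out`.
 * Returns 0 on success, -1 on failure.
 */
int resolve_once(const char *host, struct in_addr *out)
{
	struct hostent *he = gethostbyname(host);

	if (!he) {
		/* gethostbyname() reports through h_errno, not errno. */
		fprintf(stderr, "%s: %s\n", host, hstrerror(h_errno));
		return -1;
	}
	if (he->h_addrtype != AF_INET ||
	    (size_t)he->h_length != sizeof(*out) || !he->h_addr_list[0])
		return -1;

	memcpy(out, he->h_addr_list[0], sizeof(*out));
	return 0;
}
```

The caller keeps the resulting address and reuses it for every client connection instead of repeating the lookup.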
2009-02-11  Added accept as a system call needing resource increases  Alan D. Brunelle
accept(2) opens a socket, and thus needs to handle EMFILE/ENFILE errors like other system calls.
2009-02-09  Rewrote blktrace to have a single thread per CPU  Alan D. Brunelle
Massive changes: mostly around the notion of having far fewer threads (instead of N(devs) X N(cpus) threads, we'll have just N(cpus)). This is very important for larger systems (with lots of devices to trace). A lot of the code was stolen from the original blktrace code; major changes include:

o On the client side we only have a single thread per client CPU. Each thread will then open all device files for that CPU, and use poll to determine which file needs processing.

o For network client mode w/ sendfile, this means that a single socket will carry all data to the remote network server. The network server side will then distribute its reads off that one socket onto different trace files.

o For network client mode w/out sendfile, we fall back to doing things like piped mode: keep buffers of traces read in, and then the main thread will issue these on sockets to the server. In this case, the main thread will still have a single socket per CPU.

o For networked mode we added an OPEN concept on the client side: as soon as the connection to the server is set up, a "header" is sent signifying that this connection will handle a <cpu, device> tuple. For each socket opened on the client side, it will send a header per device being managed. The server side will utilize opens to set up appropriate data structures to handle incoming data streams.

o For both the OPEN and CLOSE headers the server will acknowledge with a short write back to the client. This allows the client & server sides to gracefully close socket connections.

o I also re-did the resource limitation issue a bit differently: for open calls (including socket) or for memory map/lock calls I have provided a wrapper function that will try to increase specific limits as needed. The previous method (attempting to do it at the beginning of the run) fails for network server mode - you don't know at initialization how many devices and CPUs will be handled.

o The standard output is slightly different in a few places; if this is a problem w/ compatibility we can work to rectify that. The command line argument handling is identical though.

o Using code stolen from Linux to manipulate doubly-linked lists. I've found that this makes the code easier to read/write (but may be a bit of overkill here...)

o The code passes valgrind quite well (at least for my tests so far). The only nit has to do with inet_ntoa - but that is out of our control.

Thanks to Stefan Raspl <raspl@linux.vnet.ibm.com> for testing and finding some issues and for providing suggestions.

Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
2009-01-23  Fix btt to handle large numbers of output files  Alan D. Brunelle
Simply bump resource limits if file opens fail, and retry. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
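The bump-and-retry idea can be sketched as below. This is an illustration under assumptions, not the btt code: the helper names bump_nofile and open_retry are hypothetical, and the sketch only raises the soft RLIMIT_NOFILE limit up to the hard limit, which needs no privileges.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>

/* Raise the soft open-file limit to the hard limit; -1 if nothing to gain. */
int bump_nofile(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
		return -1;
	if (rl.rlim_cur >= rl.rlim_max)
		return -1;	/* already at the hard limit */
	rl.rlim_cur = rl.rlim_max;
	return setrlimit(RLIMIT_NOFILE, &rl);
}

/*
 * Hypothetical wrapper: open an existing file, and if the process has
 * run out of descriptors, raise the limit once and try again.
 */
int open_retry(const char *path, int flags)
{
	for (;;) {
		int fd = open(path, flags);

		if (fd >= 0)
			return fd;
		if (errno != EMFILE && errno != ENFILE)
			return -1;	/* not a descriptor-limit problem */
		if (bump_nofile() < 0)
			return -1;	/* cannot raise the limit further */
	}
}
```

The loop terminates because bump_nofile() fails once the soft limit has reached the hard limit, so at most one retry happens per limit increase.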
2009-01-23  Increased limits to allow for large system runs  Alan D. Brunelle
On 16-way w/ 104 disks and a 32-way w/ 96 disks, I was getting: $ sudo blktrace -b 1024 -n 8 -I ../files ./cciss_c1d6.blktrace.10: Too many open files Failed to start worker threads Due to the nature of our N(cpus) X N(devices) order of file opens, and our N(cpus) X N(devices) X N(buffers) X (buffer size) amount of mmaps() going on we're exceeding both the RLIMIT_NOFILE and RLIMIT_MEMLOCK limits. This patch raises limits for RLIMIT_NOFILE and RLIMIT_MEMLOCK to "infinity", and allows blktrace to handle the large(ish) systems. (If these settings fail, we "guestimate" about how much we really need.) There is still an underlying blktrace and/or kernel problem: The directory /sys/kernel/debug/block/<DSF> where <DSF> is the device that encountered the limit is left behind (not cleaned up correctly). This stops blktrace from running a second time (even on another device): $ ls /sys/kernel/debug/block cciss_c1d6 $ sudo blktrace /dev/sda BLKTRACESETUP: No such file or directory Failed to start trace on /dev/sda and requires a reboot. (Looking into that next, as this patch - whilst stopping the original problem from happening - does not address the secondary problem. And there may be some other ways for the secondary problem to still occur...) I also fixed a warning concerning ftruncate's return value being ignored. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Signed-off-by: Jens Axboe <axboe@carl.(none)>
2009-01-21  A couple of min-counters weren't initialised correctly (thrput_r,  Martin Peschke
thrput_w). We have got a perfectly working init function for this purpose. Removing partially duplicated code. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-01-21  The git commit 11914a53d2ec2974a565311af327b8983d8c820d added __BLK_TA_ABORT  Martin Peschke
to blktrace_api.h. A corresponding addition to the blktrace tools repository has been missing, breaking the API. Blkparse complained: "Bad fs action 40010011" Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-01-12  Added no messages option to blkparse.c  Alan D. Brunelle
Added a new option (-M, --no-msgs) to blkparse: I have found that the CFQ I/O scheduler sends a *tremendous* amount of messages, which bloat the .bin file generated when using the -d option. The file sizes can shrink by >50% when using the -M option in those cases. Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-12  gcc 4.3.2 has started to warn about:  Alan D. Brunelle

    gcc -Wall -W -O2 -g -I. -I.. -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -c -o output.o output.c
    output.c: In function ‘output_section_hdr’:
    output.c:57: warning: format not a string literal and no format arguments
    output.c: In function ‘__output_pip_avg’:
    output.c:496: warning: format not a string literal and no format arguments
    output.c:496: warning: format not a string literal and no format arguments

so this patch cleans this up.
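This warning is raised when a non-literal string is passed as the format argument of a printf-family call with no further arguments. A common way to silence it, sketched here with a hypothetical print_header function rather than the actual output.c code, is to pass the string through an explicit "%s" format.

```c
#include <stdio.h>

/* Hypothetical helper: write a header string to `ofp` verbatim. */
void print_header(FILE *ofp, const char *hdr)
{
	/*
	 * Warned form: fprintf(ofp, hdr);
	 * Any '%' in hdr would be interpreted as a conversion.
	 * With an explicit format the string is printed as-is.
	 */
	fprintf(ofp, "%s", hdr);
}
```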
2008-11-11  Merge branch 'add-P'  Alan D. Brunelle
2008-11-11  Merge branch 'fix-m'  Alan D. Brunelle
2008-11-11  Added -P to create a data file w/ Q, D and C per line  Alan D. Brunelle
Easy parsing for graph creation
2008-11-11  Merge branch 'fix-m' into add-P  Alan D. Brunelle
2008-11-11  Fixed 'M' displays on per-io output and added in I/O separator  Alan D. Brunelle
2008-11-11  Fixed segfault in aqd.c : need to check for NULL (not requested)  Alan D. Brunelle
2008-11-10  Added in -z to provide running waiting-for-issue latencies  Alan D. Brunelle
2008-11-10  Merge branch 'master' of ssh://alanbrunelle@git.kernel.dk/data/git/blktrace  Alan D. Brunelle
2008-10-30  Set release version 1.0.0 (tag: blktrace-1.0.0)  Jens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-30  Update rbtree to version with unified parent + color  Jens Axboe
This saves 4-8 bytes per node, which can be quite a bit of memory with blktrace. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
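The saving comes from storing the node color in the low bits of the parent pointer instead of in a separate field, which works because nodes are aligned to at least 4 bytes. The sketch below shows the idea with hypothetical names (struct node, node_parent and so on); it is not the rbtree code used by blktrace.

```c
#include <stdint.h>

enum { NODE_RED = 0, NODE_BLACK = 1 };

struct node {
	uintptr_t parent_color;		/* parent pointer, color in bit 0 */
	struct node *left, *right;
};

/* Mask off the low bits to recover the parent pointer. */
static inline struct node *node_parent(const struct node *n)
{
	return (struct node *)(n->parent_color & ~(uintptr_t)3);
}

static inline int node_color(const struct node *n)
{
	return (int)(n->parent_color & 1);
}

/* Replace the parent while keeping the stored color bits. */
static inline void node_set_parent(struct node *n, struct node *p)
{
	n->parent_color = (n->parent_color & 3) | (uintptr_t)p;
}

/* Replace the color while keeping the stored parent pointer. */
static inline void node_set_color(struct node *n, int color)
{
	n->parent_color = (n->parent_color & ~(uintptr_t)1) | (uintptr_t)(color & 1);
}
```

Dropping a separate color field removes one int plus any padding from each node, which is consistent with the 4-8 bytes quoted in the commit message.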
2008-10-29  Moved btrecord/btreplay to version 1.0.0  Alan D. Brunelle
2008-10-28  blkiomon: add through-put statistics  Martin Peschke
Add accounting of per-request throughput in bytes per millisecond both for read and write I/O. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-28  blkiomon: separate statistics for read and write requests  Martin Peschke
Split min/max/avg statistics for request sizes and dispatch-to-completion latencies into separate statistics for read and write requests. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-28  blkiomon: fix some debug messages  Martin Peschke
Cleaning up error messages. Some perror()'s didn't make sense. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-28  blkiomon: fix trace debug output  Martin Peschke
Removed leftovers of trace tree and made debug code work by using trace hash instead of trace tree. Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-28  blkiomon: fix unit in histogram output  Martin Peschke
Fix unit of request sizes as printed in histogram (it's bytes not kilobytes). Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>