Age | Commit message (Collapse) | Author |
|
We should use the standard methods for getting the number of cpus in the
system when they are available. It is good practice to leave the old ways in
place for people stuck on older systems.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
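
A minimal sketch of the idea above, assuming the "standard method" is sysconf(). The names count_cpus() and count_cpus_legacy() are hypothetical, and the legacy path is only a placeholder for whatever older counting method is kept.

```c
#include <unistd.h>

/* Hypothetical stand-in for the older counting method that stays in
 * place for systems without the sysconf() name; it just assumes 1 CPU. */
static long count_cpus_legacy(void)
{
	return 1;
}

/* Hypothetical helper: prefer the standard interface when it exists. */
static long count_cpus(void)
{
#ifdef _SC_NPROCESSORS_CONF
	long n = sysconf(_SC_NPROCESSORS_CONF);

	if (n > 0)
		return n;
#endif
	return count_cpus_legacy();
}
```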
|
|
The current method fails once we hit the first offlined cpu. This
will correct that case. However, this still underreports the number of cpus if
the last cpus are offlined.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
verify_blkpars has trouble with systems larger than 512 cpus.
Also, there is an issue in the scanning code causing the cpu number to be
truncated to the first two digits, i.e. cpu 542 would be read as 54.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
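
The truncation can be avoided by parsing the whole numeric field rather than a fixed number of digits. The sketch below assumes the cpu number is the suffix after the last '.' of a per-cpu file name; parse_cpu_suffix() is a hypothetical helper, not the function touched by the patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the cpu number encoded after the last '.'
 * in a per-cpu trace file name, or -1 if the suffix is not a number. */
static int parse_cpu_suffix(const char *fname)
{
	const char *dot = strrchr(fname, '.');
	char *end;
	long cpu;

	if (!dot || dot[1] == '\0')
		return -1;

	errno = 0;
	cpu = strtol(dot + 1, &end, 10);	/* consumes all digits, not just two */
	if (errno || *end != '\0' || cpu < 0 || cpu > INT_MAX)
		return -1;
	return (int)cpu;
}
```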
|
|
If a block device has many requests with a size of less than 1K,
blkparse ignores such requests because it accounts for each request
in whole KB.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
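
To illustrate why sub-1K requests vanish: converting each request to KB before summing truncates it to zero. One way around it, sketched below with hypothetical function names (the actual patch may differ), is to accumulate bytes and convert only when reporting.

```c
#include <stddef.h>

/* Per-request conversion: each size is truncated to whole KiB before
 * summing, so requests smaller than 1 KiB contribute nothing. */
static unsigned long long total_kib_per_request(const unsigned int *bytes, size_t n)
{
	unsigned long long kib = 0;
	size_t i;

	for (i = 0; i < n; i++)
		kib += bytes[i] >> 10;
	return kib;
}

/* Byte accounting: sum the bytes first, convert once at report time. */
static unsigned long long total_kib_from_bytes(const unsigned int *bytes, size_t n)
{
	unsigned long long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += bytes[i];
	return total >> 10;
}
```

With four 512-byte requests, the first function reports 0 KiB and the second reports 2 KiB.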
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In our experiments blktrace/blkparse file names encode a lot of
information about the particular experiment. We noticed that for long
enough file names blkparse does not work.
The reason is that per_cpu_info->fname[] is only 128 bytes. As a result,
in the setup_file() function only part of the file name gets into ->fname[].
Then stat() fails and we exit the function. Notice that no error is
printed in this case.
In the following patch the ->fname[] size is increased to the POSIX-defined
PATH_MAX.
Signed-off-by: Vasily Tarasov <tarasov@vasily.name>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
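
Besides enlarging the buffer, the silent failure could be made visible by checking the snprintf() return value. This is only a sketch: build_trace_name() is a hypothetical helper, and the PATH_MAX fallback value is an assumption for platforms that do not define it.

```c
#include <limits.h>
#include <stdio.h>

#ifndef PATH_MAX
#define PATH_MAX 4096	/* assumed fallback for this sketch only */
#endif

/* Hypothetical helper: build "<base>.blktrace.<cpu>" in buf.
 * Returns 0 on success, -1 (with a message) if the name was truncated. */
static int build_trace_name(char *buf, size_t len, const char *base, int cpu)
{
	int n = snprintf(buf, len, "%s.blktrace.%d", base, cpu);

	if (n < 0 || (size_t)n >= len) {
		fprintf(stderr, "trace file name too long: %s\n", base);
		return -1;
	}
	return 0;
}
```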
|
|
One was a real bug: i_time was assigned twice instead of c_time (which was
left uninitialized).
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Several places using strcpy would benefit from strncpy
for safety.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
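
A bounded copy in the spirit of this change might look like the sketch below. copy_name() is a hypothetical helper; note that strncpy() does not terminate the destination when the source is too long, so the terminator is written explicitly.

```c
#include <string.h>

/* Hypothetical helper: copy src into dst without overrunning it and
 * always leave dst NUL-terminated. */
static void copy_name(char *dst, size_t dst_len, const char *src)
{
	if (dst_len == 0)
		return;
	strncpy(dst, src, dst_len - 1);
	dst[dst_len - 1] = '\0';
}
```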
|
|
sp was being incremented w/o initialization, but thankfully
not used otherwise. Just remove it.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
We malloc'd cpu_map, and then did:
cpu_map[CPU_IDX(cpu)] |= (1UL << CPU_BIT(cpu));
... not sure how that ever worked if cpu_map was not initialized!
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
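
Since the bits are OR-ed into the map, the map has to start out zeroed. A sketch using calloc() follows; the CPU_IDX/CPU_BIT definitions here are assumptions made for the example (the real macros are not shown in this log), and the helper names are hypothetical.

```c
#include <limits.h>
#include <stdlib.h>

/* Assumed definitions for this sketch. */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define CPU_IDX(cpu)	((cpu) / BITS_PER_LONG)
#define CPU_BIT(cpu)	((cpu) % BITS_PER_LONG)

/* Allocate a zero-initialized bitmap large enough for ncpus bits. */
static unsigned long *alloc_cpu_map(size_t ncpus)
{
	size_t words = (ncpus + BITS_PER_LONG - 1) / BITS_PER_LONG;

	return calloc(words ? words : 1, sizeof(unsigned long));
}

static void set_cpu(unsigned long *map, size_t cpu)
{
	map[CPU_IDX(cpu)] |= (1UL << CPU_BIT(cpu));
}

static int test_cpu(const unsigned long *map, size_t cpu)
{
	return (map[CPU_IDX(cpu)] >> CPU_BIT(cpu)) & 1UL;
}
```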
|
|
Close the file used for btt's -M argument after
processing.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
In several cases space is allocated for a filename but
not freed if open of that file fails.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
No point in malloc()ing space if we just immediately overwrite
the pointer via strdup. That'll leak some space.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
The file containing the list of devices was never closed
after processing was complete.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
On this error path, pdu_buf was never freed.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
The acts[] array has only N_ACTS elements, so we should never
set acts[N_ACTS].
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Check for setvbuf failure.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
|
|
This patch fixes two bugs in blktrace.
1. realloc is called on a wrong memory address (glibc reports heap
corruption if the user sends the output to a pipe, for example "blktrace
/dev/sdc -o -").
2. errno 0 is actually reported if debugfs is not mounted
Mikulas
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Add FLUSH/FUA support to blktrace. As FLUSH precedes WRITE and/or
FUA follows WRITE, use the same 'F' flag for both cases and
distinguish them by their (relative) position. The end results
look like (other flags might be shown also):
- WRITE: W
- WRITE_FLUSH: FW
- WRITE_FUA: WF
- WRITE_FLUSH_FUA: FWF
Note that we reuse TC_BARRIER due to lack of bit space of act_mask.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
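
The position-based encoding listed above can be sketched as follows. The EX_* flag bits and fill_rwbs() are hypothetical and exist only for this example; they are not blktrace's real mask values.

```c
#include <stddef.h>

/* Hypothetical flag bits for this sketch only. */
#define EX_WRITE	(1u << 0)
#define EX_FLUSH	(1u << 1)
#define EX_FUA		(1u << 2)

/* Fill rwbs (at least 4 bytes) with the flag string: an 'F' before 'W'
 * means FLUSH, an 'F' after 'W' means FUA. */
static void fill_rwbs(char *rwbs, unsigned int flags)
{
	size_t i = 0;

	if (flags & EX_FLUSH)
		rwbs[i++] = 'F';
	if (flags & EX_WRITE)
		rwbs[i++] = 'W';
	if (flags & EX_FUA)
		rwbs[i++] = 'F';
	rwbs[i] = '\0';
}
```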
|
|
|
|
I noticed in some traces that I was seeing summaries like the following:
Total (sde):
Reads Queued: 76, 304KiB Writes Queued: 16,384, 1,048MiB
Read Dispatches: 76, 304KiB Write Dispatches: 2,210, 1,048MiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 76, 304KiB Writes Completed: 2,210, 1,048MiB
Read Merges: 0, 0KiB Write Merges: 14,174, 907,136KiB
PC Reads Queued: 0, 0KiB PC Writes Queued: 0, 0KiB
PC Read Disp.: 4, 0KiB PC Write Disp.: 0, 0KiB
PC Reads Req.: 0 PC Writes Req.: 0
PC Reads Compl.: 4 PC Writes Compl.: 2,210
IO unplugs: 2,124 Timer unplugs: 0
Note how there were no PC Writes dispatched, but there were 2210
completed. It turns out to be a minor typo in the code. The attached
patch fixes the reporting for me.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
blk_io_trace->cpu is u32, so use be32_to_cpu instead.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
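
Why the width matters can be seen with portable byte-wise readers. These are illustrative helpers (read_be32() and read_be16() are hypothetical names), not the be32_to_cpu macro itself: decoding a 32-bit big-endian field with a 16-bit conversion looks at the wrong bytes.

```c
#include <stdint.h>

/* Decode a big-endian 32-bit value from raw bytes. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode a big-endian 16-bit value from raw bytes. */
static uint16_t read_be16(const unsigned char *p)
{
	return (uint16_t)(((unsigned int)p[0] << 8) | p[1]);
}
```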
|
|
Currently we only check the magic number to see whether
a blktrace is valid or not, but Bill Broadley hit
a case where the cpu info was wrong, with a cpu number
of 1725552676. So in resize_cpu_info, we hit an
overflow when calculating
size = new_count * sizeof(struct per_cpu_info);
and the program either segfaults or fails with an
out of memory error. Although this is more likely a kernel
problem, blkparse shouldn't segfault because of it.
So this patch just checks whether the cpu stored in the
trace is the same as that of the file; if not, it warns
and skips the trace.
Cc: Jens Axboe <axboe@kernel.dk>
Reported-by: Bill Broadley <bill@broadley.org>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
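
The validation described above amounts to a comparison before the cpu number is used for sizing anything. A sketch, with check_trace_cpu() as a hypothetical name and the warning text invented for the example:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: return 1 if the trace should be processed,
 * 0 if its cpu field disagrees with the per-cpu file it came from. */
static int check_trace_cpu(uint32_t trace_cpu, uint32_t file_cpu)
{
	if (trace_cpu != file_cpu) {
		fprintf(stderr, "skipping trace: cpu %u does not match file cpu %u\n",
			(unsigned int)trace_cpu, (unsigned int)file_cpu);
		return 0;
	}
	return 1;
}
```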
|
|
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
To help users better deal with the log message
"You have dropped events, consider using a larger buffer size (-b)",
it's helpful to list the defaults for sub buffer management, without
flags.
Signed-off-by: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Add blkiomon, btreplay/btrecord and btreplay/btreplay to
.gitignore so that they don't show up in "Untracked files".
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
idx isn't used, so remove it.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
In 38-rc2, there is a bug in mlock which makes the mlock
call in blktrace return an error (I have sent the corresponding
patch to the lkml). So when we try to interrupt blktrace
with "ctrl+c", the mlock retry loops forever, and in the end I
have to use "kill -9" to kill it and then run "blktrace -k"
to stop the tracer. I don't think that is good.
Reproducing it is simple:
Use a 38-rc kernel, and run
blktrace /dev/sdx
then use "ctrl+c"; it doesn't exit.
So this patch adds a check for tp->is_done. If
is_done is set, break out of the mlock loop so that we don't
spin forever in it. In case of a real mlock error,
I let it retry 10 times, and it should succeed
within 10 tries when tp->is_done is set. If tp isn't set
or tp->is_done isn't set, it works like the original
design.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
When we print statistics in blkiomon, we don't consider
the situation where the device has seen no corresponding action. For
example, if there is no disk read during the interval, the output on my box
looks like:
sizes read (bytes): num 0, min -1, max 0, sum 0, squ 0, avg nan, var nan
With the fix, now it looks like:
sizes read (bytes): num 0, min -1, max 0, sum 0, squ 0, avg 0.0, var 0.0
Cc: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
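
The nan values come from dividing by a zero sample count. A guard along the following lines avoids that; the struct and function names are hypothetical, and the variance is computed as the mean of squares minus the squared mean, which is an assumption about the formula.

```c
/* Hypothetical accumulator for this sketch. */
struct ex_minmax {
	unsigned long long num;		/* number of samples */
	unsigned long long sum;		/* sum of samples */
	unsigned long long sum_sq;	/* sum of squared samples */
};

/* Average, or 0.0 when there are no samples. */
static double ex_avg(const struct ex_minmax *mm)
{
	return mm->num ? (double)mm->sum / (double)mm->num : 0.0;
}

/* Variance, or 0.0 when there are no samples. */
static double ex_var(const struct ex_minmax *mm)
{
	double avg;

	if (!mm->num)
		return 0.0;
	avg = ex_avg(mm);
	return (double)mm->sum_sq / (double)mm->num - avg * avg;
}
```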
|
|
|
|
With the newest kernels (say 2.6.37; some older ones should have a
similar problem), some cfq messages are added to blktrace, which breaks
the old blkparse.
See a simple example:
1. blktrace /dev/sdb -o -|blkparse -i -
2. Run the following command (/dev/sdb1 is mounted at /mnt/test_dir):
dd if=/mnt/test_dir/test of=/dev/null bs=4k count=1 iflag=direct
There are only 2 lines of output there:
8,16 0 1 0.000000000 13183 A R 114759 + 8 <- (8,17) 114696
8,16 0 2 0.000000491 13183 Q R 114759 + 8 [dd]
And even if we run a command line like:
for((i=0;i<100;i++))do dd if=/mnt/ocfs2/test of=/dev/null bs=4k count=1 iflag=direct;done
We are only given the same 2 lines of output.
While the real output should look like:
8,16 0 1 0.000000000 13319 A R 114759 + 8 <- (8,17) 114696
8,16 0 2 0.000000376 13319 Q R 114759 + 8 [dd]
8,16 0 0 0.000005931 0 m N cfq13319 alloced
8,16 0 3 0.000006259 13319 G R 114759 + 8 [dd]
8,16 0 4 0.000007143 13319 P N [dd]
8,16 0 5 0.000007817 13319 I R 114759 + 8 [dd]
8,16 0 0 0.000008491 0 m N cfq13319 insert_request
8,16 0 0 0.000009029 0 m N cfq13319 add_to_rr
...
The main reason is that in show_entries_rb we test the sequence number every
time, but for messages such as those from cfq the sequence number is always
0, which makes the old sequence check reject all the logs after it.
So only check/store the sequence number if the trace isn't a message.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
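
The rule in the last sentence can be sketched as below. The struct, the function name and the exact ordering test (expecting last + 1) are assumptions for illustration; the real check in show_entries_rb is not reproduced here.

```c
/* Hypothetical, reduced view of a trace entry. */
struct ex_trace {
	unsigned int sequence;
	int is_message;		/* non-zero for message entries such as cfq's */
};

/* Return 1 if the entry may be shown. Messages always carry sequence 0,
 * so they are neither checked against nor stored in *last_seq. */
static int ex_in_order(const struct ex_trace *t, unsigned int *last_seq)
{
	if (t->is_message)
		return 1;
	if (t->sequence != *last_seq + 1)
		return 0;
	*last_seq = t->sequence;
	return 1;
}
```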
|
|
btreplay.c:1332: warning: comparison between signed and unsigned integer
expressions
|
|
Fixup for RH bugzilla 595628.
Document btt options:
-m (--seeks-per-second);
-X (--easy-parse-avgs).
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fixup for RH bugzilla 595623
Document btreplay option -x (--acc-factor)
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fixup for RH bugzilla 595620.
Document undocumented blktrace options.
Update the man pages.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fixup for RH bugzilla 595615.
Document blkparse options:
-A, --set-mask,
-a, --act-mask,
-D, --input-directory
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fixup for RH bugzilla 595419.
Document blkiomon option -d (--dump-lldd).
Add drv_data mask description to blktrace man page.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fix up for RH bugzilla 595413
Mistake in the man page for btrecord:
documented option --input-base is unsupported,
supported option --max-bunch-time is undocumented.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Document that "-o" does not work when specifying multiple devices to
blktrace; also enforce this by stopping blktrace when one tries to
do this.
The technical reason why "-o" doesn't work with multiple devices is
because we use multiple threads of execution - one per device/CPU pair -
and each of them opens a file named "<prefix>.blktrace.<cpu>". With the
"-o" all of the "<prefix>" values are the same - so multiple threads
open the same file and try to do output. Not good. Without the "-o"
we get unique files named: "<device>.blktrace.<cpu>" - as the tuple
(<device>,<cpu>) is unique.
Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com>
|
|
Fixup for 513950.
Problem:
'blktrace -d <device> -k' does not kill a running
background trace. Executing 'blktrace -d <device> -k'
for the second time results in "BLKTRACETEARDOWN:
Invalid argument" message and then each run of
blktrace on that machine prints the following output:
BLKTRACESETUP: No such file or directory.
The bug:
The option -k results in clobbering information
about running trace by kernel (blk_trace_remove),
while resources (files open in debugfs by the running
background blktrace) are not released.
Solution:
Update documentation:
Undocument the non-working "kill" option. Advise
to send a SIGINT signal via kill(1) to the running
background blktrace for its correct termination.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Fixup for 499398.
Description of problem:
blkiomon does not understand the output of blktrace when
working with a logical volume device (it stays quiet, while
with a physical device it prints IO statistics as
expected).
BUG (or design feature?):
/dev/dm-* and /dev/md* don't see BLK_TC_COMPLETE actions:
/* we need an older D trace and a younger C trace */
if (t_old->bit.action & BLK_TC_ACT(BLK_TC_ISSUE) &&
    t_young->bit.action & BLK_TC_ACT(BLK_TC_COMPLETE)) {
	/* matching D and C traces - update statistics */
	match++;
	blkiomon_account(&t_old->bit, &t_young->bit);
	blkiomon_free_trace(t_stored);
	return t;
}
Possible solution:
Update documentation.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Fixup for bz 502889.
Problem:
when executing with /dev/cciss/foo (long path names)
btreplay complains (No such file or directory).
Bug:
Missing back-conversion of underscores to slashes.
Solution:
Convert underscores to slashes to restore device
names that have larger paths.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
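
The back-conversion itself is a simple in-place replacement. A sketch with a hypothetical restore_dev_name() helper, assuming the stored name replaced every '/' with '_':

```c
#include <string.h>

/* Hypothetical helper: turn every '_' in name back into '/',
 * e.g. "cciss_c0d0" becomes "cciss/c0d0". */
static void restore_dev_name(char *name)
{
	char *p = name;

	while ((p = strchr(p, '_')) != NULL)
		*p++ = '/';
}
```

Note that such a conversion cannot tell a converted slash from an underscore that was part of the original device name.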
|
|
Fixup for 498898:
Problem:
When somebody runs blktrace without parameters, it
shows the usage message. The usage message suggests
that version number "x.y.z" is a required parameter,
which is not true.
Solution:
Don't print version number when running
blktrace, blkparse, btt without parameters.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Fixup for bz 501457.
Problem:
If the device list file contains the same device
as supplied on the command line, blktrace stops
immediately and further I/O tracing is impossible.
Bug: device duplication in the devpaths ends with
program termination (the BLKTRACESETUP ioctl returns an
error) while resources (open files in debugfs) are
not released.
Solution:
Make sure devices are not duplicated in devpaths
pool.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
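
Keeping the device pool free of duplicates can be sketched as a check before insertion. The pool structure, its fixed size and ex_add_devpath() are hypothetical; paths are compared as plain strings here, so two different names for the same device would not be caught.

```c
#include <string.h>

#define EX_MAX_DEVS 16	/* arbitrary limit for this sketch */

/* Hypothetical device path pool. */
struct ex_devpaths {
	const char *paths[EX_MAX_DEVS];
	size_t count;
};

/* Add path unless it is already present.
 * Returns 1 if added, 0 if it was a duplicate, -1 if the pool is full. */
static int ex_add_devpath(struct ex_devpaths *d, const char *path)
{
	size_t i;

	for (i = 0; i < d->count; i++)
		if (strcmp(d->paths[i], path) == 0)
			return 0;
	if (d->count == EX_MAX_DEVS)
		return -1;
	d->paths[d->count++] = path;
	return 1;
}
```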
|
|
|
|
If no tracefiles are found, exit with non-0 status
Resolves Red Hat Bugzilla #500118
Reported-by: Milos Malik <mmalik@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|