Age | Commit message | Author |
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If the BLKTRACESETUP ioctl fails, blktrace still marches on and sets up
the rest. This results in errors like the following:
blktrace /dev/sdf
BLKTRACESETUP(2) /dev/sdf failed: 5/Input/output error
Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file or directory
Thread 3 failed open /sys/kernel/debug/block/(null)/trace3: 2/No such file or directory
Thread 2 failed open /sys/kernel/debug/block/(null)/trace2: 2/No such file or directory
[...]
FAILED to start thread on CPU 0: 1/Operation not permitted
FAILED to start thread on CPU 1: 1/Operation not permitted
FAILED to start thread on CPU 2: 1/Operation not permitted
and blktrace continues to run, though it can't do anything in this
state.
If the ioctl setup fails, just abort.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When the CPU number space is sparse, we don't start threads for
non-existent CPUs. As a result, no output files are created for these
CPUs, which confuses tools like blkparse that expect CPU numbers to be
contiguous. Create fake empty files for non-existent CPUs so that other
tools don't have to bother.
Note that in network mode, the server will create all files in the range
0..max_cpus automatically.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
We would like to generate the output file name without having the
corresponding iop structure. Reorganize the function to allow that. Also
fix a couple of overflows that are possible when generating the file
name, since we are modifying the code anyway.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
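A minimal sketch of overflow-safe file name generation in the spirit of
the commit above. The function name make_trace_name() and the exact
"dir/dev.blktrace.cpu" layout are assumptions made for illustration, not
the code from the patch. The point is that snprintf() reports the length
it wanted to write, so truncation can be detected instead of silently
overrunning or shortening the name.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build "<dir>/<dev>.blktrace.<cpu>" into buf.
 * Returns 0 on success, -1 if the name does not fit (or on a format error).
 */
static int make_trace_name(char *buf, size_t len, const char *dir,
                           const char *dev, int cpu)
{
    int n = snprintf(buf, len, "%s/%s.blktrace.%d", dir, dev, cpu);

    /* n is the length that was needed, excluding the trailing NUL. */
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

A caller would pass a fixed-size buffer and treat -1 as a hard error
rather than opening a truncated path.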
|
|
On some machines CPU numbers do not form a contiguous interval. In such
cases blktrace will fail to start threads for missing CPUs and exit,
effectively rendering itself unusable.
Add support to blktrace for systems with sparse CPU numbers.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Some C libraries (notably uClibc) have the posix_spawn*() functions in
librt, so let's link iowatcher with -lrt.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
An earlier commit:
fb7f8667 blktrace: disable kill option - take 2
removed the "-k" option documentation, but left
it in the synopsis.
This is a bit unusual and unhelpful and probably
unintended; remove it from the synopsis as well.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The proper graph name is queue-depth, not queue_depth.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The trace label isn't properly separated by a space from the suffix
(Read / Write). Fix it.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
When the user specifies trace files directly via the -t option, it
doesn't make sense to prepend the blktrace destination directory to them
(it is especially confusing if you specify absolute path names with the
-t option, because this logic breaks the path names). So avoid that.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Currently btt keeps the original IO in its RB-tree even if it sees a new
IO that begins at the same sector. However, such an IO most likely
means that we have just lost the completion event for the IO that is
still in the tree. In that case, replacing the IO in the RB-tree makes
more sense, to avoid bogus IOs being reported as taking a huge amount of
time.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Currently queue depth and latency graphs are generated from ISSUE and
COMPLETE events. For traces which miss the ISSUE events (e.g. from
device mapper) use QUEUE events instead. The result won't be as great
but it still conveys some useful information.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
When parsing blktrace data, process notify events even outside the
specified interval. This way we can learn about time stamps, process
names etc.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
So far we used the maximum of the first trace as the upper end of the
range of the queue depth graph. Use the maximum over all traces, as is
done for other line graphs.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Use the maximum of the rolling average as the upper end of the range for
the line graph, to make better use of the available space in the plot.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Just replace the malloc/memset with a calloc().
Signed-off-by: Jens Axboe <axboe@fb.com>
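For illustration, the malloc/memset to calloc() replacement looks
roughly like the sketch below. The struct and function names are
hypothetical stand-ins, not the ones touched by the commit.

```c
#include <stdlib.h>

/* Hypothetical record type, used only for this sketch. */
struct sample {
    int cpu;
    unsigned long events;
};

/*
 * Before: p = malloc(n * sizeof(*p)); memset(p, 0, n * sizeof(*p));
 * After:  a single calloc(), which zeroes the memory and also guards
 *         against overflow in the n * size multiplication.
 */
static struct sample *alloc_samples(size_t n)
{
    return calloc(n, sizeof(struct sample));
}
```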
|
|
Using __DATE__ and __TIME__ will break reproducible builds. The
resulting binary will change with each rebuild even if the source and
toolchain are identical.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
is_reap_done() must also check whether SIGINT or SIGTERM has arrived, or
we hang forever with backtraces like these after Ctrl-C:
(gdb) thr a a bt
Thread 3 (Thread 0x7fbff8ff9700 (LWP 12607)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000402698 in replay_rec () at btreplay.c:1035
#2 0x00007fc001fe5454 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#3 0x00007fc001d1eecd in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000000000 in ?? ()
Thread 2 (Thread 0x7fbfea7fc700 (LWP 12611)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000402698 in replay_rec () at btreplay.c:1035
#2 0x00007fc001fe5454 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#3 0x00007fc001d1eecd in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x7fc00282e700 (LWP 12597)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000402303 in __wait_cv () at btreplay.c:413
#2 0x0000000000401ae8 in main () at btreplay.c:426
Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: <linux-btrace@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
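The idea behind the fix above can be sketched as follows. This is not
the btreplay code: the struct fields, the flag name signal_done and the
function reap_done() are assumptions chosen for the example. The
completion predicate returns true either when all submitted IO has been
reaped or when a termination signal has been seen, so waiters are not
left blocked after Ctrl-C.

```c
#include <signal.h>

/* Set from the SIGINT/SIGTERM handler; sig_atomic_t is safe to write there. */
static volatile sig_atomic_t signal_done;

/* Hypothetical per-thread state: sending finished, IOs still in flight. */
struct replay_thread {
    int send_done;
    int naios_out;
};

/* Handler to install for SIGINT and SIGTERM. */
static void on_signal(int sig)
{
    (void)sig;
    signal_done = 1;
}

/* Done when interrupted, or when everything sent has been reaped. */
static int reap_done(const struct replay_thread *t)
{
    return signal_done || (t->send_done && t->naios_out == 0);
}
```

In a real program the handler would also need to wake any thread
sleeping on a condition variable, which this sketch leaves out.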
|
|
getpid() returns the pid of the process, but a tid is what must be
provided here. If zero is passed, the calling thread is used, which is
exactly what is needed.
Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: <linux-btrace@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The size should be provided, not the number of CPUs.
Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: <linux-btrace@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
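To make the distinction between a CPU count and a mask size concrete,
here is a small sketch of the arithmetic. It assumes the mask is stored
as an array of unsigned long words, which is how glibc's
CPU_ALLOC_SIZE() rounds; the helper name cpu_mask_bytes() is made up for
this example and is not part of the patch.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Number of bytes needed for an affinity mask covering ncpus CPUs,
 * rounded up to whole unsigned long words. Affinity calls expect this
 * byte size, not the CPU count itself.
 */
static size_t cpu_mask_bytes(size_t ncpus)
{
    const size_t bits_per_word = sizeof(unsigned long) * CHAR_BIT;
    size_t words = (ncpus + bits_per_word - 1) / bits_per_word;

    return words * sizeof(unsigned long);
}
```

On a typical 64-bit Linux system, 4 CPUs need 8 bytes of mask, so
passing 4 as the size would describe a mask that is too short.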
|
|
Currently, blktrace uses _SC_NPROCESSORS_CONF to find out the number of
CPUs. This is a problem, because if you reduce the number of online
CPUs by passing the kernel parameter maxcpus, then blktrace fails to
start with the error:
FAILED to start thread on CPU 4: 22/Invalid argument
FAILED to start thread on CPU 5: 22/Invalid argument
...
The attached patch fixes it to use _SC_NPROCESSORS_ONLN.
Signed-off-by: Jens Axboe <axboe@fb.com>
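A minimal sketch of querying the online CPU count, as the commit above
describes. The wrapper name online_cpus() and the fallback to 1 when
sysconf() fails are choices made for this example, not taken from the
patch.

```c
#include <unistd.h>

/*
 * Number of CPUs currently online. _SC_NPROCESSORS_CONF would instead
 * count configured CPUs, which can be larger (for example with maxcpus=).
 */
static long online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);

    return n < 1 ? 1 : n;   /* assumed fallback: at least one CPU */
}
```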
|
|
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Avoid build failures when sys/types.h is not included indirectly
through other headers.
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
In get_ncpus, we default to using 4096 CPUs if _SC_NPROCESSORS_CONF isn't
enabled. If that is insufficient, sched_getaffinity will fail and we
retry after doubling the size of the cpu_set_t allocation. There's a typo
in there that means we don't actually double the size and will loop
forever allocating the same sized cpu_set_t instead.
Signed-off-by: Josef Cejka <jcejka@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
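The retry-with-doubling pattern described above can be sketched in
isolation. To keep the example self-contained, the real
sched_getaffinity() call is replaced by a callback, and fake_probe() is
a stand-in that succeeds once the buffer reaches 1024 bytes. All names
here are hypothetical. The key line is the loop step, which must
actually grow the size on every retry.

```c
#include <stddef.h>

/* Returns 0 if a buffer of nbytes is large enough, -1 otherwise. */
typedef int (*probe_fn)(size_t nbytes);

/* Stand-in for sched_getaffinity(): pretend 1024 bytes are required. */
static int fake_probe(size_t nbytes)
{
    return nbytes >= 1024 ? 0 : -1;
}

/*
 * Double the size until probe() accepts it. Returns the accepted size,
 * or 0 if limit is exceeded. Forgetting the "size *= 2" step would
 * retry the same size forever, which is the bug class fixed above.
 */
static size_t grow_until_ok(size_t start, size_t limit, probe_fn probe)
{
    for (size_t size = start; size != 0 && size <= limit; size *= 2) {
        if (probe(size) == 0)
            return size;
    }
    return 0;
}
```

The upper limit is an addition in this sketch so that a probe that never
succeeds cannot loop indefinitely.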
|
|
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Kill the errors about the unchecked return value of system().
Signed-off-by: Jens Axboe <axboe@fb.com>
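One way to check the result of system() is sketched below. The wrapper
name run_shell() is hypothetical and the commit may simply test the
return value inline. The sketch decodes the wait status so the caller
sees the command's exit code.

```c
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Run cmd through the shell. Returns the command's exit code, or -1 if
 * system() itself failed or the command did not exit normally.
 */
static int run_shell(const char *cmd)
{
    int status = system(cmd);

    if (status == -1)
        return -1;                   /* could not create the child */
    if (WIFEXITED(status))
        return WEXITSTATUS(status);  /* normal exit: report its code */
    return -1;                       /* killed by a signal, etc. */
}
```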
|
|
Andrew says:
Here are some trivial tweaks which I found were needed or desirable while
adding iowatcher to the blktrace packaging in Fedora. They improve the
integration of iowatcher into the tree and reduce duplication of docs.
|
|
Signed-off-by: Jens Axboe <axboe@fb.com>
Conflicts:
iowatcher/Makefile
|
|
Merge the requirements bits of iowatcher/README into README.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Signed-off-by: Chris Mason <clm@fb.com>
|
|
We were setting C=gcc instead of CC=gcc, and using -O0. Fix both.
Signed-off-by: Chris Mason <clm@fb.com>
|
|
This README is getting out-of-date and its contents are duplicated in
the iowatcher manpage which is up-to-date, so remove it to reduce
duplication of effort.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
iowatcher's manpage wasn't being installed with the other manpages so
add it to the doc directory.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Bump it up to a full 1.1 since we now include iowatcher.
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Signed-off-by: Chris Mason <clm@fb.com>
|
|
|
|
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Fix an unchecked strcpy and strcat in plot_io_movie():
$ ./iowatcher -t foo --movie -o foo.ogv -l $(printf 'x%.0s' {1..300})
[...]
*** buffer overflow detected ***: ./iowatcher terminated
There was also very similar code in plot_io() so a new function
plot_io_legend() was added to factor out the common string building code
and replace the buggy code with asprintf().
Also add a closedir() call to an error path in traces_list() to plug a
resource leak and make iowatcher Coverity-clean (ignoring some
false-positives).
Signed-off-by: Andrew Price <anprice@redhat.com>
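As an illustration of replacing fixed-buffer strcpy()/strcat() with an
allocated string, here is a sketch. The commit uses asprintf(); this
example swaps in the portable snprintf()-then-malloc() equivalent so
that it does not depend on GNU extensions. The function name
make_legend() and the "label suffix" format are assumptions, not the
actual plot_io_legend() code.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Build "<label> <suffix>" in a buffer sized to fit exactly.
 * Returns a malloc()ed string the caller must free(), or NULL on error.
 */
static char *make_legend(const char *label, const char *suffix)
{
    /* First pass: ask snprintf() how many characters are needed. */
    int n = snprintf(NULL, 0, "%s %s", label, suffix);
    if (n < 0)
        return NULL;

    char *buf = malloc((size_t)n + 1);
    if (!buf)
        return NULL;

    /* Second pass: format into the correctly sized buffer. */
    snprintf(buf, (size_t)n + 1, "%s %s", label, suffix);
    return buf;
}
```

Because the buffer is sized from the inputs, a 300-character label like
the one in the reproducer no longer overflows anything.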
|
|
Adding -Wmissing-prototypes showed some functions could be made static
and my 'findunused' script showed some functions weren't being called.
This patch was tested by building from scratch and running with various
combinations of options.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Bring the man page and usage string up-to-date with the new -p behaviour
and improve the formatting and content of the man page.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
For consistency and deduplication, use run_program in start_mpstat. Add
the ability to pass a path to run_program, which will be opened in the
spawned process and used as stdout, in order to capture mpstat output.
This fixes a tricky descriptor leak in start_mpstat which could have
caused a race condition if it was fixed with close().
Some output formatting tweaks have also been added and a bug from a
previous patch, where tracers were killed immediately when -p wasn't
specified, has been fixed.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Rework start_blktrace and use run_program to launch blktrace. Move the
argv-building into the function so that it's easier to work with and
clean it up a bit. Add a signal parameter to wait_program to optionally
kill the pid with a given signal before waiting for it.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Previously the --prog option required the program-to-be-run to be
specified as a single string. This meant that shell escaping would be
lost in translation and a sub-shell would be run. Rework --prog to not
take an argument and accept the arguments left after option processing
has ended as the argv for the program-to-be-run.
As we have the program as an argv, run_program2() can now be used to run
it, and now that run_program() is no longer used we can remove it and
remove the '2' from run_program2.
New usage example:
# iowatcher -p -t foo -d /dev/sda3 sleep 10
running blktrace blktrace -b 8192 -a queue -a complete -a issue -a notify -D . -d /dev/sda3 -o foo
running 'sleep' '10'
sleep exited with 0
...
Docs have been updated accordingly.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Until now run_program2() was a replacement for system() so it always
waited for the process to end before returning. To make this function
more useful move the waiting code into a separate function and add a
mechanism to expect a specific exit code.
Signed-off-by: Andrew Price <anprice@redhat.com>
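Splitting "start" from "wait" might look like the sketch below. The
names spawn_program() and wait_program() and their signatures are
assumptions for this example and do not reproduce iowatcher's functions.
It uses posix_spawnp() to start the child and waitpid() to compare the
exit code with the expected one.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Start argv[0] with the given arguments. Returns 0 and sets *pid on success. */
static int spawn_program(char *const argv[], pid_t *pid)
{
    /* posix_spawnp() returns 0 on success or an errno value on failure. */
    return posix_spawnp(pid, argv[0], NULL, NULL, argv, environ);
}

/* Wait for pid. Returns 0 only if it exited normally with expected_exit. */
static int wait_program(pid_t pid, int expected_exit)
{
    int status;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == expected_exit ? 0 : -1;
}
```

Keeping the two steps separate lets a caller start a long-running
tracer, do other work, and only later collect its exit status.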
|
|
(Caught by Coverity.) tf->gdd_writes and tf->gdd_reads are arrays of
pointers so update their allocations to use the correct element size.
Signed-off-by: Andrew Price <anprice@redhat.com>
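The element-size issue above is easy to reproduce in any code that
allocates an array of pointers. In the sketch below the struct name and
the helper alloc_gdd_array() are hypothetical. Using sizeof(*arr) ties
the element size to the array's own type, so it stays correct even if
the pointed-to struct changes.

```c
#include <stdlib.h>

struct graph_dot_data;   /* opaque here; only pointers to it are stored */

/*
 * Allocate n zeroed pointer slots. The element is a pointer, so the
 * size must be sizeof(struct graph_dot_data *), written as sizeof(*arr),
 * not the size of the struct itself.
 */
static struct graph_dot_data **alloc_gdd_array(size_t n)
{
    struct graph_dot_data **arr = calloc(n, sizeof(*arr));

    return arr;
}
```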
|
|
plot_io_movie() was calling create_movie_temp_dir() which unnecessarily
strdup()ed a string constant leaving plot_io_movie() to free it. Replace
the strdup() with a mutable char array and get rid of the free(). Merge
the few remaining lines which create the movie dir into plot_io_movie().
Also prune a duplicate declaration of start_mpstat() in tracers.h.
Signed-off-by: Andrew Price <anprice@redhat.com>
|