blktrace.git
3 months agodoc: btrace: fix wrong format on doc master
Fukui Daichi [Wed, 17 Jan 2024 08:36:51 +0000 (08:36 +0000)]
doc: btrace: fix wrong format on doc

The synopsis part of the btrace documentation gets highlighted with
wrong format. Let's fix the format. There is no change to the contents.

Signed-off-by: Fukui Daichi <a.dog.will.talk@akane.waseda.jp>
Link: https://lore.kernel.org/r/20240117083651.954-1-a.dog.will.talk@akane.waseda.jp
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoblkparse: fix incorrectly sized memset in check_cpu_map
Jeff Mahoney [Thu, 21 Oct 2021 14:16:20 +0000 (10:16 -0400)]
blkparse: fix incorrectly sized memset in check_cpu_map

The memset call in check_cpu_map always clears sizeof(unsigned long *)
regardless of what size was allocated.  Use calloc instead to allocate
the map so it's zeroed properly regardless of the size requested.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
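The size mismatch described in the check_cpu_map commit above can be reproduced from Python with ctypes. This is only an illustrative sketch, not the blkparse code: the 32-byte buffer and the 0xff fill pattern are assumptions standing in for a malloc()ed map that still holds stale data.

```python
import ctypes

MAP_BYTES = 32  # hypothetical size of the allocated CPU map

# Stand-in for a malloc()ed map that still contains stale data.
initial = [0xFF] * MAP_BYTES
cpu_map = (ctypes.c_ubyte * MAP_BYTES)(*initial)

# The buggy pattern: clear sizeof(pointer) bytes instead of the allocated size.
pointer_size = ctypes.sizeof(ctypes.c_void_p)
ctypes.memset(cpu_map, 0, pointer_size)

# Everything past the first pointer-sized chunk is still non-zero.
stale_bytes = sum(1 for b in cpu_map if b != 0)
print(f"cleared {pointer_size} of {MAP_BYTES} bytes, {stale_bytes} left stale")
```

Allocating with calloc avoids the problem because the zeroing covers exactly the number of bytes requested.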
2 years agoblkparse: skip check_cpu_map with pipe input
Jeff Mahoney [Thu, 21 Oct 2021 14:16:19 +0000 (10:16 -0400)]
blkparse: skip check_cpu_map with pipe input

When we're using pipe input, we don't track online CPUs and don't have a
cpu_map.  When we start to show entries, check_sequence will be invoked.
If the first entry isn't sequence 1 (perhaps it's been dropped?), we'll
proceed to check_cpu_map.  Since we haven't tracked online CPUs,
pdi->cpu_map_max will be 0 and we'll do a malloc(0).  Then we'll start
setting bits corresponding to CPU numbers in memory we don't own.  Since
there's nothing to check here, let's skip it on pipe input.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoblktrace: exit directly when nthreads_running != ncpus in run_tracers()
lijinlin [Mon, 28 Jun 2021 19:41:32 +0000 (13:41 -0600)]
blktrace: exit directly when nthreads_running != ncpus in run_tracers()

We found that blktrace gets stuck when a cgroup restricts the CPUs that
blktrace may use. The messages and stack are:
[root@localhost ~]# blktrace -w 10 -o- /dev/sda
FAILED to start thread on CPU 1: 22/Invalid argument
FAILED to start thread on CPU 2: 22/Invalid argument
[root@localhost ~]# cat /proc/1385110/stack
[<0>] __switch_to+0xe8/0x150
[<0>] futex_wait_queue_me+0xd4/0x158
[<0>] futex_wait+0xf4/0x230
[<0>] do_futex+0x470/0x900
[<0>] __arm64_sys_futex+0x13c/0x188
[<0>] el0_svc_common+0x80/0x200
[<0>] el0_svc_handler+0x78/0xe0
[<0>] el0_svc+0x10/0x260
[<0>] 0xffffffffffffffff

Blktrace fails to start the threads because they cannot be pinned to the
restricted CPUs. In this case, blktrace wouldn't schedule the alarm that
sets the variable 'done' to 1 after the defined time.
We debugged the code and found the call trace below:
main()
   ==>run_tracers()
      ==>wait_tracers()
         ==>process_trace_bufs()
            ==>wait_empty_entries()
               ==>t_pthread_cond_wait()
Blktrace was set to piped output, so the process is stuck in
wait_empty_entries(), waiting for the variable 'done' to be set to 1.

To fix the problem, set the variable 'done' to 1 in run_tracers() when
'nthreads_running' is not equal to 'ncpus'.

Signed-off-by: lijinlin <lijinlin3@huawei.com>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Lixiaokeng <lixiaokeng@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoblktrace 1.3.0 blktrace-1.3.0
Jens Axboe [Mon, 14 Jun 2021 14:55:52 +0000 (08:55 -0600)]
blktrace 1.3.0

Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblkparse: Print time when trace was started
Jan Kara [Wed, 13 Jan 2021 11:26:43 +0000 (12:26 +0100)]
blkparse: Print time when trace was started

For correlating blktrace data with other information, it is useful to
know when the trace has been captured. Since the absolute timestamp
is contained in the blktrace file, just output it.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210113112643.12893-1-jack@suse.cz
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblktrace: inclusive terminology
Eric Sandeen [Fri, 19 Feb 2021 16:36:08 +0000 (10:36 -0600)]
blktrace: inclusive terminology

Use more inclusive terminology in a couple places.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblkparse: Print PID information for TN_MESSAGE events
Jan Kara [Wed, 13 May 2020 16:04:02 +0000 (18:04 +0200)]
blkparse: Print PID information for TN_MESSAGE events

The kernel now provides PID information for TN_MESSAGE events. Print it.
Old kernels fill 0 there so the behavior is unaffected for them.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoiowatcher: Handle cgroup information
Jan Kara [Wed, 6 May 2020 13:39:33 +0000 (15:39 +0200)]
iowatcher: Handle cgroup information

Since Linux kernel commit 35fe6d763229 "block: use standard blktrace API
to output cgroup info for debug notes" the kernel can pass
__BLK_TA_CGROUP flag in the action field of generated events. Teach
iowatcher to ignore this information.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoiowatcher: Use blktrace_api.h
Jan Kara [Wed, 6 May 2020 13:39:32 +0000 (15:39 +0200)]
iowatcher: Use blktrace_api.h

Use blktrace_api.h header instead of redefining the constants once more
in blkparse.c.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblkparse: Handle cgroup information
Jan Kara [Wed, 6 May 2020 13:39:31 +0000 (15:39 +0200)]
blkparse: Handle cgroup information

Since Linux kernel commit 35fe6d763229 "block: use standard blktrace API
to output cgroup info for debug notes" the kernel can pass
__BLK_TA_CGROUP flag in the action field of generated events. blkparse
does not account for this and so it will get confused by such events and
either ignore them or misreport them. Teach blkparse how to properly
process events with __BLK_TA_CGROUP flag.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
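A rough sketch of what "properly process events with __BLK_TA_CGROUP flag" can look like: mask the flag out before comparing the action code. The numeric values below (a 16-bit category shift, a queue action of 1, and the cgroup flag at bit 8) are assumptions for illustration and should be checked against blktrace_api.h; decode_action is a hypothetical helper, not a blkparse function.

```python
# Assumed values; verify against blktrace_api.h before relying on them.
BLK_TC_SHIFT = 16        # upper bits of the action field carry the category
BLK_TA_QUEUE = 1         # example base action code
BLK_TA_CGROUP = 1 << 8   # flag marking an event that carries cgroup info


def decode_action(action):
    """Split a raw action field into (category, base_action, has_cgroup)."""
    category = action >> BLK_TC_SHIFT
    low = action & 0xFFFF
    has_cgroup = bool(low & BLK_TA_CGROUP)
    base_action = low & ~BLK_TA_CGROUP  # strip the flag before comparing
    return category, base_action, has_cgroup


plain = (0x10 << BLK_TC_SHIFT) | BLK_TA_QUEUE
tagged = plain | BLK_TA_CGROUP
print(decode_action(plain), decode_action(tagged))
```

Without the mask, a tagged queue event would compare as an unknown action code, which matches the "ignore them or misreport them" symptom described above.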
3 years agoblkparse: Fix up the sector and length of split completions
Andreas Gruenbacher [Mon, 13 Apr 2020 19:01:52 +0000 (21:01 +0200)]
blkparse: Fix up the sector and length of split completions

When a split io completes, the sector and length of the completion event refer
to the last part of the original request.  This is in conflict with the
blkparse manual page, makes the blkparse output difficult to read, and leads to
incorrect statistics.  Fix up the sector and length of split completion events
to match the original request.

To achieve that, slightly extend the existing event tracking infrastructure to
track all parts of a split request.  We could almost get by tracking only the
last part of a split, but that wouldn't quite work correctly for splits of
splits.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblkparse: Initialize and test for undefined request tracking timestamps
Andreas Gruenbacher [Mon, 13 Apr 2020 19:01:51 +0000 (21:01 +0200)]
blkparse: Initialize and test for undefined request tracking timestamps

Currently, event tracking timestamps aren't initialized at all even though some
places in the code assume that a value of 0 indicates 'undefined'.  However, 0
is the timestamp of the first event, so use -1ULL for 'undefined' instead.

In addition, make sure timestamps are only initialized once, and always check
if timestamps are defined before using them.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
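The reason 0 is a poor "undefined" marker can be shown with a small model. This is a sketch with hypothetical helper names; it only mirrors the idea of the commit above, using the 64-bit value of -1ULL as the sentinel.

```python
UNDEFINED = (1 << 64) - 1  # the 64-bit value of -1ULL in C


def elapsed_zero_sentinel(start, end):
    """Old convention: 0 means 'undefined', so a real timestamp of 0 is lost."""
    if start == 0 or end == 0:
        return None
    return end - start


def elapsed(start, end):
    """New convention: only the all-ones value means 'undefined'."""
    if start == UNDEFINED or end == UNDEFINED:
        return None
    return end - start


# The first event of a trace has timestamp 0.
print(elapsed_zero_sentinel(0, 2080), elapsed(0, 2080))
```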
3 years agoblkparse: Allow request tracking on non md/dm devices
Andreas Gruenbacher [Mon, 13 Apr 2020 19:01:50 +0000 (21:01 +0200)]
blkparse: Allow request tracking on non md/dm devices

Fix queue to completion tracking on devices other than md/dm: without this fix,
enabling tracking with the -t option on a non-md/dm device leads to "complete
not found" errors.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblkparse: Fix device in event tracking error messages
Andreas Gruenbacher [Mon, 13 Apr 2020 19:01:49 +0000 (21:01 +0200)]
blkparse: Fix device in event tracking error messages

For some reason, dev in struct per_dev_info isn't set in the log_track_
functions, and so the error messages report (0,0) as the device.  Fix by using
device in struct blk_io_trace instead.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agobtt_plot.py: Use `with open() as ...` context manager
Vincent Legoll [Fri, 20 Mar 2020 21:44:59 +0000 (22:44 +0100)]
btt_plot.py: Use `with open() as ...` context manager

to automatically handle close()

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
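As a minimal illustration of the context-manager pattern named in the btt_plot.py commit above (read_values and the demo file are hypothetical, not taken from the script):

```python
import os
import tempfile


def read_values(path):
    """Read one float per line; the file is closed when the with block exits."""
    with open(path) as fo:
        values = [float(line) for line in fo if line.strip()]
    return values, fo.closed


# Demo on a throwaway file.
fd, demo_path = tempfile.mkstemp(suffix=".dat")
with os.fdopen(fd, "w") as out:
    out.write("1.0\n2.5\n")
demo_values, was_closed = read_values(demo_path)
os.remove(demo_path)
```

The file is also closed when an exception propagates out of the with block, which an explicit close() call after the loop would not guarantee.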
4 years agobno_plot.py: Fix pylint: singleton-comparison
Vincent Legoll [Fri, 20 Mar 2020 21:44:58 +0000 (22:44 +0100)]
bno_plot.py: Fix pylint: singleton-comparison

Comparison to None should be 'expr is None'

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
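A short sketch of why pylint prefers the identity test in the singleton-comparison fix above. AlwaysEqual is a hypothetical class written only to make the difference visible.

```python
class AlwaysEqual:
    """Hypothetical object whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


def is_unset_by_equality(value):
    # The flagged pattern: '==' dispatches to value.__eq__.
    return value == None  # noqa: E711


def is_unset_by_identity(value):
    # The preferred pattern: identity cannot be overridden.
    return value is None


obj = AlwaysEqual()
print(is_unset_by_equality(obj), is_unset_by_identity(obj))
```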
4 years agobno_plot.py: Fix pylint: len-as-condition
Vincent Legoll [Fri, 20 Mar 2020 21:44:57 +0000 (22:44 +0100)]
bno_plot.py: Fix pylint: len-as-condition

Do not use `len(SEQUENCE)` to determine if a sequence is empty

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
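The len-as-condition message above refers to the following kind of rewrite. The function names are hypothetical; for built-in sequences such as lists and tuples both forms behave the same.

```python
def first_or_default_len(items, default=None):
    # Flagged form: explicit length check.
    if len(items) == 0:
        return default
    return items[0]


def first_or_default(items, default=None):
    # Preferred form: an empty built-in sequence is falsy.
    if not items:
        return default
    return items[0]
```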
4 years agobtt_plot.py: Fix pylint: no-else-return
Vincent Legoll [Fri, 20 Mar 2020 21:44:56 +0000 (22:44 +0100)]
btt_plot.py: Fix pylint: no-else-return

Unnecessary "elif" after "return"

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
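For the no-else-return fix above, a sketch of the before and after shapes, using a hypothetical classify function rather than code from btt_plot.py:

```python
def classify_with_elif(x):
    # Flagged form: the branches after 'return' do not need elif/else.
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    else:
        return "positive"


def classify(x):
    # Equivalent form without the redundant elif/else.
    if x < 0:
        return "negative"
    if x == 0:
        return "zero"
    return "positive"
```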
4 years agobtt_plot.py: Fix pylint: singleton-comparison
Vincent Legoll [Fri, 20 Mar 2020 21:44:55 +0000 (22:44 +0100)]
btt_plot.py: Fix pylint: singleton-comparison

Comparison to None should be 'expr is None'

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agobtt_plot.py: Fix pylint: len-as-condition
Vincent Legoll [Fri, 20 Mar 2020 21:44:54 +0000 (22:44 +0100)]
btt_plot.py: Fix pylint: len-as-condition

Do not use `len(SEQUENCE)` to determine if a sequence is empty

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agobtt_plot.py: Fix pylint: wrong-import-order
Vincent Legoll [Fri, 20 Mar 2020 21:44:53 +0000 (22:44 +0100)]
btt_plot.py: Fix pylint: wrong-import-order

C0411: standard import "import getopt, glob, os, sys" should be placed
before "import six"

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agobno_plot.py: Use `with open() as ...` context manager to automatically handle close()
Vincent Legoll [Fri, 20 Mar 2020 21:44:52 +0000 (22:44 +0100)]
bno_plot.py: Use `with open() as ...` context manager to automatically handle close()

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agobno_plot.py: Use shutil.rmtree() instead of os.system('/bin/rm')
Vincent Legoll [Fri, 20 Mar 2020 21:44:51 +0000 (22:44 +0100)]
bno_plot.py: Use shutil.rmtree() instead of os.system('/bin/rm')

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
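A minimal sketch of the shutil.rmtree() approach from the bno_plot.py commit above. The helper name and the scratch directory are hypothetical; the directory name contains a space on purpose, since a path like that needs quoting when it is pasted into a shell command string but not when it is passed to shutil.rmtree().

```python
import os
import shutil
import tempfile


def remove_scratch_dir(path):
    """Delete a directory tree without spawning a shell."""
    shutil.rmtree(path)


scratch = tempfile.mkdtemp(prefix="bno plot ")
with open(os.path.join(scratch, "data.dat"), "w") as fo:
    fo.write("0 0\n")
remove_scratch_dir(scratch)
still_there = os.path.exists(scratch)
```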
4 years agobtt_plot.py: Use sum() instead of open-coding it to compute list average
Vincent Legoll [Fri, 20 Mar 2020 21:44:50 +0000 (22:44 +0100)]
btt_plot.py: Use sum() instead of open-coding it to compute list average

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
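The sum() change above amounts to the following. Both functions are hypothetical stand-ins for the script's code, and both raise ZeroDivisionError on an empty list.

```python
def average_open_coded(values):
    # Open-coded accumulation loop.
    total = 0
    for v in values:
        total += v
    return total / float(len(values))


def average(values):
    # Same result using the built-in sum().
    return sum(values) / float(len(values))
```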
4 years agofix parallel build of btt and blkiomon
Gwendal Grignou [Thu, 16 Jan 2020 20:33:26 +0000 (12:33 -0800)]
fix parallel build of btt and blkiomon

rbtree.c is used by both binaries. It is possible that when make -C btt
is invoked rbtree.o does not exist yet, but is already scheduled by the
compilation of blkiomon. That could result in recompiling rbtree.o again
for btt/btt.
In that case, at install time, make will recompile blkiomon, which can
fail on Gentoo because the CC variable is not overridden by the ebuild
script at install time (see https://bugs.gentoo.org/705594).

Add a dependency on SUBDIRS to wait for all binaries in . to be compiled.
This guarantees that rbtree.o exists.

Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agodoc: tex: add absolute timestamp printing option
Hiroaki Mihara [Tue, 24 Sep 2019 22:33:11 +0000 (07:33 +0900)]
doc: tex: add absolute timestamp printing option

The functionality of printing out absolute timestamps has been
implemented in code but not documented in doc/blktrace.tex.

Signed-off-by: Hiroaki Mihara <hmihara@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoblkparse: fix absolute timestamp when reading from file
Hiroaki Mihara [Tue, 24 Sep 2019 22:32:50 +0000 (07:32 +0900)]
blkparse: fix absolute timestamp when reading from file

This patch fixes the wrong absolute timestamps when blkparse reads
data from files.

The blkparse command prints out wrong timestamps if all the following
conditions are met,

* The blkparse command reads data from files created by blktrace.
* "z" format option is set as OUTPUT DESCRIPTION.
  ex.) blkparse xxx.blktrace.0 -f "%z\n"
* start_timestamp(=blktrace command started) != genesis_time(=first
  I/O traced)

When blkparse reads data from pipe instead, it yields correct
timestamps.

The root cause of this issue comes from the fact that the time
difference between start_timestamp and genesis_time is not added when
blkparse reads data from files. When blkparse reads data from pipe,
the time-difference is added through find_genesis() function.

The following test cases show the contradictions in absolute
timestamps. Step 4 also shows that the issue is fixed by the
blkparse command with the suggested patch.

* Step 1: After invoking blktrace command, test I/O traffic was
  generated by dd command as follows,

  # date +%Y%m%d_%H%M%S_%N; dd if=/dev/sda3 of=/dev/null count=1 iflag=direct
     20190919_092726_077032490
     1+0 records in
     1+0 records out
     512 bytes copied, 0.00122329 s, 419 kB/s

  The timestamp was recorded just before executing dd command.  The
  test I/O would have been traced right after 09:27:26.077032490 .

* Step 2: The blkparse command reads data from "pipe".

  $ cat test.blktrace.* | blkparse - -f "%T.%t %z %C\n"
  0.000000000 09:27:22.427592 kworker/0:0
  0.000002080 09:27:22.427594 kworker/0:0
  .
  .
  3.652263118 09:27:26.079855 dd
  3.652265818 09:27:26.079857 dd
  3.652274742 09:27:26.079866 dd
  3.652277266 09:27:26.079869 dd

  The first I/O by dd command showed the relative timestamp as
  3.652263118 and the absolute timestamp as 09:27:26.079855, which is
  right after the timestamp shown in the Step 1.

* Step 3: The blkparse command reads from the trace "file" created in
  the Step 1.

  $ blkparse test -f "%T.%t %z %C\n"
  Input file test.blktrace.0 added
  Input file test.blktrace.1 added
  Input file test.blktrace.2 added
  Input file test.blktrace.3 added
  0.000000000 09:27:21.187304 kworker/0:0
  0.000002080 09:27:21.187306 kworker/0:0
  .
  .
  3.652263118 09:27:24.839567 dd
  3.652265818 09:27:24.839570 dd
  3.652274742 09:27:24.839578 dd
  3.652277266 09:27:24.839581 dd

  In the previous step (Step 2), the data was passed via pipe. In this
  case, the blkparse command reads data from the same file, instead.

  The first I/O by dd command showed the relative timestamp as
  3.652263118 and the absolute timestamp as 09:27:24.839567, which is
  a few seconds earlier than the absolute timestamp recorded in the
  Step 1. The order of events and the absolute timestamps contradict.

* Step 4: The blkparse command with the suggested patch
  (./blkparse_with_patch) reads data from the trace file created in
  the Step 1.

  $ ./blkparse_with_patch test -f "%T.%t %z %C\n"
  Input file test.blktrace.0 added
  Input file test.blktrace.1 added
  Input file test.blktrace.2 added
  Input file test.blktrace.3 added
  0.000000000 09:27:22.427592 kworker/0:0
  0.000002080 09:27:22.427594 kworker/0:0
  .
  .
  3.652263118 09:27:26.079855 dd
  3.652265818 09:27:26.079857 dd
  3.652274742 09:27:26.079866 dd
  3.652277266 09:27:26.079869 dd

  In this case, the absolute timestamps showed the same values as in
  the Step 2 (the case with pipe).
  The time gap between the genesis_time and the start_timestamp was
  corrected even though blkparse reads the data from files.

Signed-off-by: Hiroaki Mihara <hmihara@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
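The numbers in Steps 1 to 4 above can be cross-checked with a few lines of arithmetic. This sketch only uses timestamps quoted in the commit message; treating the pipe/file difference as the missing start_timestamp to genesis_time gap is the commit's explanation, not something the script proves.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S.%f"


def parse(stamp):
    return datetime.strptime(stamp, FMT)


# First traced event as printed in Step 2 (pipe) and Step 3 (file).
pipe_first = parse("09:27:22.427592")
file_first = parse("09:27:21.187304")
offset = pipe_first - file_first

# First dd event as printed in Step 3, shifted by the missing offset.
file_dd = parse("09:27:24.839567")
corrected_dd = (file_dd + offset).strftime(FMT)

# Wall-clock stamp taken just before dd in Step 1, truncated to microseconds.
date_stamp = parse("09:27:26.077032")

print(offset.total_seconds(), corrected_dd)
```

The uncorrected file timestamp falls before the date stamp from Step 1, while the corrected one falls just after it and matches the value printed in Steps 2 and 4.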
4 years agoblkparse: split off the timestamp correction code in to a separate function
Hiroaki Mihara [Tue, 24 Sep 2019 22:32:06 +0000 (07:32 +0900)]
blkparse: split off the timestamp correction code in to a separate function

find_genesis() function has code to correct abs_start_time, which is
later used to calculate the absolute timestamps of each traced
records.

Put this code in a separate function, so that it can be used later by
the blkparse code. No functional change.

Signed-off-by: Hiroaki Mihara <hmihara@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoblkparse: man: add absolute timestamp printing option
Hiroaki Mihara [Fri, 20 Sep 2019 05:03:36 +0000 (14:03 +0900)]
blkparse: man: add absolute timestamp printing option

The functionality of printing out absolute timestamps has been
implemented in code but not documented in man pages.

When comparing the timings of related events with block I/O traces,
the absolute timestamps play a key role.  I think that documenting
this option will be beneficial to blktrace users.

The related commit was done in 2006 as follows,

> commit 7bd4fd0a4fca645bb50a641afac1e460a4e32dfd
> Author: Olaf Kirch <okir@lst.de>
> Date:   Fri Dec 1 10:34:11 2006 +0100
>
>     [PATCH] Add timestamp support
>
>     Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
>

URL of the above patch,
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/blktrace.git/commit/?id=7bd4fd0a4fca645bb50a641afac1e460a4e32dfd

Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Hiroaki Mihara <hmihara@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agobtreplay: fix device IO remap functionality
Ignat Korchagin [Mon, 16 Sep 2019 16:30:23 +0000 (10:30 -0600)]
btreplay: fix device IO remap functionality

Commit dd093eb1c48e ("Fix warnings on newer gcc") moved string buffers holding
device names during map file parse stage to stack. However, only pointers to
them are being stored in the allocated "struct map_dev" structure. These
pointers are invalid outside of scope of this function and in a different
thread context. Also "release_map_devs" function still tries to "free" them
later as if they were allocated on the heap.

Move the buffers back to the heap by instructing "fscanf" to allocate them
while parsing the file.

Alternatively, we could redefine the "struct map_dev" to include the whole
buffers instead of just pointers to them and free them as part of releasing the
whole "struct map_dev".

Fixes: dd093eb1c48e ("Fix warnings on newer gcc")
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agoblkparse: add support sort program by io event
Weiping Zhang [Tue, 21 May 2019 13:29:13 +0000 (21:29 +0800)]
blkparse: add support sort program by io event

Display each program's data sorted by program name or by I/O event:
Queued, Read, Write, or Complete. When -S is specified, -s is ignored.
The capital letters Q, R, W, and C sort by KB; the lowercase letters
q, r, w, and c sort by number of I/Os.
N sorts programs by name, the same as -s.

If you want to sort programs by how much data they queued, you can use:

blkparse -i sda.blktrace. -q -S Q -o sda.parse

Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
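The key selection described for -S above can be pictured as a lookup from the option letter to a per-program counter. This is a sketch only: the record fields, the sample numbers, and the largest-first order for counters are assumptions, not blkparse internals.

```python
# Hypothetical per-program statistics.
programs = [
    {"name": "dd", "queued_kb": 512, "queued_ios": 1},
    {"name": "kworker/0:0", "queued_kb": 64, "queued_ios": 16},
    {"name": "fio", "queued_kb": 4096, "queued_ios": 8},
]

# Capital letter: KB; lowercase letter: number of I/Os; N: program name.
SORT_KEYS = {
    "N": lambda p: p["name"],
    "Q": lambda p: p["queued_kb"],
    "q": lambda p: p["queued_ios"],
}


def sort_programs(progs, key="N"):
    # Assumption: counters are listed largest first, names alphabetically.
    return sorted(progs, key=SORT_KEYS[key], reverse=(key != "N"))


by_kb = [p["name"] for p in sort_programs(programs, "Q")]
by_ios = [p["name"] for p in sort_programs(programs, "q")]
by_name = [p["name"] for p in sort_programs(programs, "N")]
```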
5 years agoiowatcher: spawn NPROCESSORS_ONLN for rsvg-convert-s
Jeff Moyer [Fri, 31 Aug 2018 20:01:34 +0000 (16:01 -0400)]
iowatcher: spawn NPROCESSORS_ONLN for rsvg-convert-s

iowatcher currently always spawns 8 rsvg-convert processes, no matter
how many CPUs a system has.  I did some limited testing of different
numbers of rsvg-convert processes.  Here are the results:

8 processes:
real 4m2.194s
user 23m36.665s
sys 0m38.523s

20 processes:
real 2m28.935s
user 24m51.817s
sys 0m49.227s

40 processes:
real 2m28.150s
user 24m56.994s
sys 0m49.621s

Note that this is the time it takes for a full run of iowatcher -- I
didn't separate out just the rsvg-convert portion.

Given the above results, it seems like a reasonable thing to spawn one
rsvg-convert process per cpu.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoiowatcher: don't add Q events to the io hash
Jeff Moyer [Thu, 30 Aug 2018 22:14:20 +0000 (18:14 -0400)]
iowatcher: don't add Q events to the io hash

Hi,

Bryan Gurney reported iowatcher taking a *really* long time to generate
a movie for 16GiB worth of trace data.  I took a look, and the io hash
was growing without bounds.  The reason was that the I/O pattern looks
like this:

259,0    5        8     0.000208501 31708  A   W 5435592 + 8 <- (259,1) 5433544
259,1    5        9     0.000209537 31708  Q   W 5435592 + 8 [kvdo30:bioQ0]
259,1    5       10     0.000209880 31708  G   W 5435592 + 8 [kvdo30:bioQ0]
259,0    5       11     0.000211064 31708  A   W 5435600 + 8 <- (259,1) 5433552
259,1    5       12     0.000211347 31708  Q   W 5435600 + 8 [kvdo30:bioQ0]
259,1    5       13     0.000212957 31708  M   W 5435600 + 8 [kvdo30:bioQ0]
259,0    5       14     0.000213379 31708  A   W 5435608 + 8 <- (259,1) 5433560
259,1    5       15     0.000213629 31708  Q   W 5435608 + 8 [kvdo30:bioQ0]
259,1    5       16     0.000213937 31708  M   W 5435608 + 8 [kvdo30:bioQ0]
...
259,1    5      107     0.000246274 31708  D   W 5435592 + 256 [kvdo30:bioQ0]

For each of those Q events, an entry was created in the io_hash.  Then,
upon I/O completion, only the first event (with the right starting
sector) was removed!  The runtime overhead of just iterating the hash
chains was enormous.

The solution is to simply ignore the Q events, so long as there are D
events in the trace.  If there are no D events, then go ahead and hash
the Q events as before.  I'm hoping that if we only have Q and C, that
they will actually be aligned.  If that's an incorrect assumption, we
could account merges in an rbtree.  I'll defer that work until someone
can show me blktrace data that needs it.

The comments should be self explanatory.  Review would be appreciated
as the code isn't well documented, and I don't know if I'm missing some
hidden assumption about the data.

Before applying this patch, iowatcher would take more than 12 hours to
complete.  After the patch:

real 9m44.476s
user 41m35.426s
sys 3m29.106s

'nuf said.

Cheers,
Jeff

Reviewed-by: Chris Mason <clm@fb.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
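The hash growth described in the iowatcher commit above can be modelled with a dictionary keyed on the starting sector. This is a simplified sketch, not iowatcher's code: the replay function is hypothetical, and the final C event is an assumption added to stand for the completion of the merged 256-sector request.

```python
def replay(events, skip_q_when_dispatched=True):
    """Return the entries still pending after replaying (action, sector, length)."""
    has_dispatch = any(action == "D" for action, _sector, _length in events)
    # Track D events when the trace has them; otherwise fall back to Q events.
    tracked = "D" if (skip_q_when_dispatched and has_dispatch) else "Q"
    pending = {}
    for action, sector, length in events:
        if action == tracked:
            pending[sector] = length
        elif action == "C":
            pending.pop(sector, None)  # only the matching start sector goes away
    return pending


events = [
    ("Q", 5435592, 8),
    ("Q", 5435600, 8),
    ("Q", 5435608, 8),
    ("D", 5435592, 256),
    ("C", 5435592, 256),  # assumed completion of the merged request
]

leaked = replay(events, skip_q_when_dispatched=False)
clean = replay(events)
print(len(leaked), len(clean))
```

With Q events hashed, the two merged requests are never removed; with only D events hashed, the completion clears the single entry.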
5 years agomake btt scripts python3-ready
Eric Sandeen [Wed, 28 Mar 2018 20:26:36 +0000 (15:26 -0500)]
make btt scripts python3-ready

Many distributions are moving to python3 by default.  Here's
an attempt to make the python scripts in blktrace python3-ready.

Most of this was done with automated tools.  I hand fixed some
space-vs tab issues, and cast an array index to integer.  It
passes rudimentary testing when run under python2.7 as well
as python3.

This doesn't do anything with the shebangs, it leaves them both
invoking whatever "env python" coughs up on the system.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agobtt: make device/devno use PATH_MAX to avoid overflow
Jens Axboe [Wed, 2 May 2018 16:24:17 +0000 (10:24 -0600)]
btt: make device/devno use PATH_MAX to avoid overflow

Herbo Zhang reports:

I found a bug in blktrace/btt/devmap.c. The code is just as follows:

https://git.kernel.org/pub/scm/linux/kernel/git/axboe/blktrace.git/tree/btt/devmap.c?id=8349ad2f2d19422a6241f94ea84d696b21de4757

struct devmap {
    struct list_head head;
    char device[32], devno[32];    // #1
};

LIST_HEAD(all_devmaps);

static int dev_map_add(char *line)
{
    struct devmap *dmp;

    if (strstr(line, "Device") != NULL)
        return 1;

    dmp = malloc(sizeof(struct devmap));
    if (sscanf(line, "%s %s", dmp->device, dmp->devno) != 2) {  //#2
        free(dmp);
        return 1;
    }

    list_add_tail(&dmp->head, &all_devmaps);
    return 0;
}

int dev_map_read(char *fname)
{
    char line[256];   // #3
    FILE *fp = my_fopen(fname, "r");

    if (!fp) {
        perror(fname);
        return 1;
    }

    while (fscanf(fp, "%255[a-zA-Z0-9 :.,/_-]\n", line) == 1) {
        if (dev_map_add(line))
            break;
    }

    fclose(fp);
    return 0;
}

The line buffer holds 256 bytes (#3), but dmp->device and dmp->devno
hold only 32 bytes each (#1). Because sscanf() reads them with an
unbounded "%s" (#2), a token longer than that overflows dmp->device or
dmp->devno.

We can trigger this bug as follows:

    $ python -c "print 'A'*256" > ./test
    $ btt -M ./test

    *** Error in `btt': free(): invalid next size (fast): 0x000055ad7349b250 ***
    ======= Backtrace: =========
    /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f7f158ce7e5]
    /lib/x86_64-linux-gnu/libc.so.6(+0x7fe0a)[0x7f7f158d6e0a]
    /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f7f158da98c]
    btt(+0x32e0)[0x55ad7306f2e0]
    btt(+0x2c5f)[0x55ad7306ec5f]
    btt(+0x251f)[0x55ad7306e51f]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f7f15877830]
    btt(+0x26b9)[0x55ad7306e6b9]
    ======= Memory map: ========
    55ad7306c000-55ad7307f000 r-xp 00000000 08:14 3698139
      /usr/bin/btt
    55ad7327e000-55ad7327f000 r--p 00012000 08:14 3698139
      /usr/bin/btt
    55ad7327f000-55ad73280000 rw-p 00013000 08:14 3698139
      /usr/bin/btt
    55ad73280000-55ad73285000 rw-p 00000000 00:00 0
    55ad7349a000-55ad734bb000 rw-p 00000000 00:00 0
      [heap]
    7f7f10000000-7f7f10021000 rw-p 00000000 00:00 0
    7f7f10021000-7f7f14000000 ---p 00000000 00:00 0
    7f7f15640000-7f7f15656000 r-xp 00000000 08:14 14942237
      /lib/x86_64-linux-gnu/libgcc_s.so.1
    7f7f15656000-7f7f15855000 ---p 00016000 08:14 14942237
      /lib/x86_64-linux-gnu/libgcc_s.so.1
    7f7f15855000-7f7f15856000 r--p 00015000 08:14 14942237
      /lib/x86_64-linux-gnu/libgcc_s.so.1
    7f7f15856000-7f7f15857000 rw-p 00016000 08:14 14942237
      /lib/x86_64-linux-gnu/libgcc_s.so.1
    7f7f15857000-7f7f15a16000 r-xp 00000000 08:14 14948477
      /lib/x86_64-linux-gnu/libc-2.23.so
    7f7f15a16000-7f7f15c16000 ---p 001bf000 08:14 14948477
      /lib/x86_64-linux-gnu/libc-2.23.so
    7f7f15c16000-7f7f15c1a000 r--p 001bf000 08:14 14948477
      /lib/x86_64-linux-gnu/libc-2.23.so
    7f7f15c1a000-7f7f15c1c000 rw-p 001c3000 08:14 14948477
      /lib/x86_64-linux-gnu/libc-2.23.so
    7f7f15c1c000-7f7f15c20000 rw-p 00000000 00:00 0
    7f7f15c20000-7f7f15c46000 r-xp 00000000 08:14 14948478
      /lib/x86_64-linux-gnu/ld-2.23.so
    7f7f15e16000-7f7f15e19000 rw-p 00000000 00:00 0
    7f7f15e42000-7f7f15e45000 rw-p 00000000 00:00 0
    7f7f15e45000-7f7f15e46000 r--p 00025000 08:14 14948478
      /lib/x86_64-linux-gnu/ld-2.23.so
    7f7f15e46000-7f7f15e47000 rw-p 00026000 08:14 14948478
      /lib/x86_64-linux-gnu/ld-2.23.so
    7f7f15e47000-7f7f15e48000 rw-p 00000000 00:00 0
    7ffdebe5c000-7ffdebe7d000 rw-p 00000000 00:00 0
      [stack]
    7ffdebebc000-7ffdebebe000 r--p 00000000 00:00 0
      [vvar]
    7ffdebebe000-7ffdebec0000 r-xp 00000000 00:00 0
      [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0
      [vsyscall]
    [1]    6272 abort      btt -M test

Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblkparse: add documetation for 'R' requeue request
Weiping Zhang [Sat, 7 Apr 2018 09:12:18 +0000 (17:12 +0800)]
blkparse: add documetation for 'R' requeue request

Signed-off-by: Weiping Zhang <zhangweiping@didichuxing.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblkparse: remove duplicated entry for flag M
Weiping Zhang [Sat, 7 Apr 2018 09:12:00 +0000 (17:12 +0800)]
blkparse: remove duplicated entry for flag M

Remove the duplicated entry for 'M' from the blkparse man page.

Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Weiping Zhang <zhangweiping@didichuxing.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblktrace: don't stop tracer if not setup trace successfully
weiping zhang [Mon, 15 Jan 2018 15:53:42 +0000 (23:53 +0800)]
blktrace: don't stop tracer if not setup trace successfully

If we run blktrace on the same device twice, the second instance fails
the BLKTRACESETUP ioctl and then calls __stop_tracer, which causes the
first blktrace to fail to access its debugfs entries. Add a check to
handle this case and avoid stopping the tracer unconditionally.

Signed-off-by: weiping zhang <zhangweiping@didichuxing.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agofix parallel build failures
Robin H. Johnson [Tue, 23 Jan 2018 22:57:55 +0000 (17:57 -0500)]
fix parallel build failures

When building in parallel, the btreplay/btrecord and btreplay/btreplay
targets cause make to kick off two jobs for `make -C btreplay` and they
sometimes end up clobbering each other.  We could fix this by making one
a dependency of the other, but it's a bit cleaner to refactor things to
be based on subdirs.  This way changes in subdirs also get noticed:
  $ touch btreplay/*.[ch]
  $ make
  <btreplay is now correctly updated>

Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agorespect LDFLAGS when linking programs
Robin H. Johnson [Tue, 23 Jan 2018 22:47:19 +0000 (17:47 -0500)]
respect LDFLAGS when linking programs

Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agobtt: Fix overlapping IO stats.
Gwendal Grignou [Fri, 18 Aug 2017 22:00:22 +0000 (15:00 -0700)]
btt: Fix overlapping IO stats.

Keep scanning the tree for overlapping IO otherwise Q2G and process
traces will be incorrect.

Let's assume we have 2 IOs:

A                                      A+a
|---------------------------------------|
       B                B+b
       |-----------------|

In the red/black tree we have:

                    o -> [A,A+a]
                   / \
                left right
                 /    \
           [...]o      o -> [B, B+b]

With the current code, we would not be able to find [B, B+b] in the tree:
B is greater than A, so we won't go left.
B+b is smaller than A+a, so we are not going right either.

When we have a [X, X+x] IO to look for:
We need to check for right when either:
 X+x >= A+a (for merged IO)
and
 X > A (for overlapping IO)

TEST=Check with a trace with overlapping IO: Q2C and Q2G are expected.

Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agobtt/devs: silence warning on sprintf overflow
Jens Axboe [Sun, 5 Nov 2017 15:54:41 +0000 (08:54 -0700)]
btt/devs: silence warning on sprintf overflow

Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agojhash: fix annoying gcc fall through warnings
Jens Axboe [Sun, 5 Nov 2017 15:52:21 +0000 (08:52 -0700)]
jhash: fix annoying gcc fall through warnings

Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoBlktrace 1.2.0 blktrace-1.2.0
Jens Axboe [Sun, 5 Nov 2017 04:13:06 +0000 (22:13 -0600)]
Blktrace 1.2.0

Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblktrace: abort if device ioctl setup fails
Jens Axboe [Sun, 5 Nov 2017 04:10:00 +0000 (22:10 -0600)]
blktrace: abort if device ioctl setup fails

If we fail doing the BLKTRACESETUP ioctl, blktrace still marches on
and sets up the rest. This results in errors like the below:

blktrace /dev/sdf
BLKTRACESETUP(2) /dev/sdf failed: 5/Input/output error
Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file or directory
Thread 3 failed open /sys/kernel/debug/block/(null)/trace3: 2/No such file or directory
Thread 2 failed open /sys/kernel/debug/block/(null)/trace2: 2/No such file or directory
[...]
FAILED to start thread on CPU 0: 1/Operation not permitted
FAILED to start thread on CPU 1: 1/Operation not permitted
FAILED to start thread on CPU 2: 1/Operation not permitted

and blktrace continues to run, though it can't do anything in this
state.

If the ioctl setup fails, just abort.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
7 years agoblktrace: Create empty output files for non-existent cpus
Jan Kara [Thu, 26 Jan 2017 10:23:55 +0000 (11:23 +0100)]
blktrace: Create empty output files for non-existent cpus

When CPU number space is sparse, we don't start threads for non-existent
CPUs. As a result, there are no output files created for these CPUs
which confuses tools like blkparse which expect that CPU numbers are
contiguous. Create fake empty files for non-existent CPUs so that other
tools don't have to bother.

Note that in network mode, the server will create all files in the range
0..max_cpus automatically.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblktrace: Reorganize creation of output file name
Jan Kara [Thu, 26 Jan 2017 10:23:54 +0000 (11:23 +0100)]
blktrace: Reorganize creation of output file name

We would like to generate output file name without having corresponding
iop structure. Reorganize the function to allow that. Also fix a couple of
possible overflows in file name generation while we are modifying the
code anyway.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblktrace: Add support for sparse CPU numbers
Jan Kara [Thu, 26 Jan 2017 10:23:53 +0000 (11:23 +0100)]
blktrace: Add support for sparse CPU numbers

On some machines CPU numbers do not form a contiguous interval. In such
cases blktrace will fail to start threads for missing CPUs and exit,
effectively rendering itself unusable.

Add support into blktrace to handle systems with sparse CPU numbers.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
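
One way to discover a sparse CPU numbering is to parse a range list such as "0-3,8-11", the format the kernel uses in sysfs files like /sys/devices/system/cpu/online. The sketch below assumes that input format; `parse_cpu_list()` is a hypothetical helper and is not taken from blktrace.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse a list such as "0-3,8-11" and set online[cpu] = 1 for each CPU.
 * Returns the number of CPUs marked, or -1 on malformed or out-of-range
 * input. online must have room for max_cpus entries.
 */
static int parse_cpu_list(const char *list, unsigned char *online,
                          int max_cpus)
{
    const char *p = list;
    int count = 0;

    memset(online, 0, (size_t)max_cpus);
    while (*p && *p != '\n') {
        char *end;
        long first, last, cpu;

        first = strtol(p, &end, 10);
        if (end == p)
            return -1;
        last = first;
        p = end;
        if (*p == '-') {            /* a range "first-last" */
            p++;
            last = strtol(p, &end, 10);
            if (end == p)
                return -1;
            p = end;
        }
        if (first < 0 || last < first || last >= max_cpus)
            return -1;
        for (cpu = first; cpu <= last; cpu++) {
            if (!online[cpu]) {
                online[cpu] = 1;
                count++;
            }
        }
        if (*p == ',')
            p++;
        else if (*p && *p != '\n')
            return -1;
    }
    return count;
}
```

A caller would then start one tracer thread per marked CPU instead of iterating over 0..ncpus-1.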
7 years agoiowatcher: link with -lrt
Thomas Petazzoni [Tue, 23 Aug 2016 14:41:21 +0000 (16:41 +0200)]
iowatcher: link with -lrt

Some C libraries (notably uClibc) have the posix_spawn*() functions in
librt, so let's link iowatcher with -lrt.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblktrace: remove -k from manpage synopsis
Eric Sandeen [Wed, 18 May 2016 16:15:07 +0000 (11:15 -0500)]
blktrace: remove -k from manpage synopsis

An earlier commit:

  fb7f8667 blktrace: disable kill option - take 2

removed the "-k" option documentation, but left
it in the synopsis.

This is a bit unusual and unhelpful and probably
unintended; remove it from the synopsis as well.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoFixup graph name in help text
Jan Kara [Thu, 5 May 2016 15:17:12 +0000 (17:17 +0200)]
Fixup graph name in help text

Proper graph name is queue-depth, not queue_depth.

Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoSeparate prefix in legend with space
Jan Kara [Thu, 5 May 2016 15:17:11 +0000 (17:17 +0200)]
Separate prefix in legend with space

The trace label isn't separated by a space from the suffix (Read /
Write). Fix it.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoDon't prepend blktrace destination dir if we didn't run blktrace
Jan Kara [Thu, 5 May 2016 15:17:10 +0000 (17:17 +0200)]
Don't prepend blktrace destination dir if we didn't run blktrace

When user specifies trace files directly via -t option, it doesn't make
sense to prepend blktrace destination directory to them (it is
especially confusing if you specify absolute path names with -t option
and this logic breaks the path names). So avoid that.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoZero sectors are strange
Jan Kara [Thu, 5 May 2016 15:17:09 +0000 (17:17 +0200)]
Zero sectors are strange

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agobtt: Replace overlapping IO
Jan Kara [Thu, 5 May 2016 15:17:08 +0000 (17:17 +0200)]
btt: Replace overlapping IO

Currently btt keeps the original IO in its RB-tree even if it sees new
IO that is beginning at the same sector. However such IO most likely
means that we have just lost the completion event for the IO that is
still in the tree. So in such a case replacing the IO in the RB-tree makes
more sense, to avoid bogus IOs being reported as taking a huge amount of
time.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoiowatcher: Use queue events if issue not available
Jan Kara [Thu, 5 May 2016 15:17:07 +0000 (17:17 +0200)]
iowatcher: Use queue events if issue not available

Currently queue depth and latency graphs are generated from ISSUE and
COMPLETE events. For traces which miss the ISSUE events (e.g. from
device mapper) use QUEUE events instead. The result won't be as great
but it still conveys some useful information.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoProcess notify events outside of given interval
Jan Kara [Thu, 5 May 2016 15:17:06 +0000 (17:17 +0200)]
Process notify events outside of given interval

When parsing blktrace data, process notify events even outside the
specified interval. This way we can learn about time stamps, process
names etc.

Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoUse maximum over all traces for queue depth
Jan Kara [Thu, 5 May 2016 15:17:05 +0000 (17:17 +0200)]
Use maximum over all traces for queue depth

So far we used the maximum of the first trace for the maximum range of the
queue depth graph. Use the maximum over all traces, as for other
line graphs.

Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoBetter max estimate for line graphs
Jan Kara [Thu, 5 May 2016 15:17:04 +0000 (17:17 +0200)]
Better max estimate for line graphs

Use the maximum of the rolling average as the upper range end for the line
graph to make better use of the available space in the plot.

Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
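
To illustrate why the rolling-average maximum gives a tighter axis than the raw maximum, here is a minimal sketch. The window size and the function name `rolling_avg_max()` are assumptions; the commit does not say how iowatcher computes its average.

```c
#include <stddef.h>

/*
 * Return the largest average over any `window` consecutive samples.
 * Returns 0.0 when there are fewer than `window` samples.
 */
static double rolling_avg_max(const double *v, size_t n, size_t window)
{
    double sum = 0.0, best = 0.0;
    size_t i;

    if (window == 0)
        return 0.0;
    for (i = 0; i < n; i++) {
        sum += v[i];
        if (i >= window)
            sum -= v[i - window];   /* drop the sample leaving the window */
        if (i + 1 >= window) {
            double avg = sum / (double)window;
            if (avg > best)
                best = avg;
        }
    }
    return best;
}
```

For the samples 1, 1, 10, 1, 1 and a window of 2, the raw maximum is 10 while the rolling-average maximum is 5.5, so a single spike no longer flattens the rest of the line.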
7 years agobtt/unplug_hist: fix bad memset
Jens Axboe [Tue, 3 May 2016 14:34:50 +0000 (08:34 -0600)]
btt/unplug_hist: fix bad memset

Just replace the malloc/memset with a calloc().

Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agobtreplay: remove timestamps
Olaf Hering [Fri, 20 Feb 2015 08:06:39 +0000 (09:06 +0100)]
btreplay: remove timestamps

Using __DATE__ and __TIME__ will break reproducible builds. The
resulting binary will change with each rebuild even if the source and
toolchain are identical.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agobtreplay: make Ctrl-C work
Roman Pen [Sat, 23 Apr 2016 11:44:10 +0000 (13:44 +0200)]
btreplay: make Ctrl-C work

is_reap_done() must also check that SIGINT or SIGTERM have come, or
we hang forever with such backtraces after Ctrl-C:

 (gdb) thr a a bt

 Thread 3 (Thread 0x7fbff8ff9700 (LWP 12607)):
 #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
 #1  0x0000000000402698 in replay_rec () at btreplay.c:1035
 #2  0x00007fc001fe5454 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
 #3  0x00007fc001d1eecd in ?? () from /lib/x86_64-linux-gnu/libc.so.6
 #4  0x0000000000000000 in ?? ()

 Thread 2 (Thread 0x7fbfea7fc700 (LWP 12611)):
 #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
 #1  0x0000000000402698 in replay_rec () at btreplay.c:1035
 #2  0x00007fc001fe5454 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
 #3  0x00007fc001d1eecd in ?? () from /lib/x86_64-linux-gnu/libc.so.6
 #4  0x0000000000000000 in ?? ()

 Thread 1 (Thread 0x7fc00282e700 (LWP 12597)):
 #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
 #1  0x0000000000402303 in __wait_cv () at btreplay.c:413
 #2  0x0000000000401ae8 in main () at btreplay.c:426

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: <linux-btrace@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
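
The idea of letting the "done" predicate also observe a stop signal can be sketched as follows. This is not btreplay's is_reap_done(); `reap_done()`, `install_stop_handlers()` and the two parameters are hypothetical, and a real condition-variable waiter would additionally need to be woken up after the signal arrives.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Set from the signal handler, read by the waiting code. */
static volatile sig_atomic_t stop_requested;

static void handle_stop(int sig)
{
    (void)sig;
    stop_requested = 1;
}

/* Install the same handler for SIGINT and SIGTERM. Returns 0 on success. */
static int install_stop_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_stop;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;
    return 0;
}

/* Done when a stop was requested, or when all input is consumed and idle. */
static int reap_done(size_t ios_in_flight, int input_finished)
{
    return stop_requested || (input_finished && ios_in_flight == 0);
}
```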
7 years agobtreplay: fix sched_{set|get}affinity
Roman Pen [Sat, 23 Apr 2016 11:44:09 +0000 (13:44 +0200)]
btreplay: fix sched_{set|get}affinity

getpid() returns the pid of the process, but a thread ID is needed here.
If zero is passed, the calling thread is used, which is exactly what
is needed.

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: <linux-btrace@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agobtreplay: fix memory corruption caused by CPU_ZERO_S
Roman Pen [Sat, 23 Apr 2016 11:44:08 +0000 (13:44 +0200)]
btreplay: fix memory corruption caused by CPU_ZERO_S

The set size in bytes should be passed, not the number of CPUs.

Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: <linux-btrace@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
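
For reference, the glibc dynamic CPU set macros take the allocation size in bytes, as returned by CPU_ALLOC_SIZE(), rather than a CPU count. A minimal sketch follows, with `alloc_cleared_cpu_set()` as a hypothetical helper name:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/*
 * Allocate a CPU set large enough for ncpus CPUs and clear it.
 * The byte size needed by the *_S macros is stored in *size_out.
 */
static cpu_set_t *alloc_cleared_cpu_set(int ncpus, size_t *size_out)
{
    cpu_set_t *set = CPU_ALLOC(ncpus);
    size_t size = CPU_ALLOC_SIZE(ncpus);

    if (!set)
        return NULL;
    CPU_ZERO_S(size, set);      /* size in bytes, not the CPU count */
    *size_out = size;
    return set;
}
```

Passing a CPU count instead of the byte size clears too much memory whenever the count exceeds the allocation size, for example 4096 against a 512-byte allocation, and too little otherwise.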
8 years agoblktrace: Use number of online CPUs
Abutalib Aghayev [Tue, 9 Feb 2016 15:17:50 +0000 (08:17 -0700)]
blktrace: Use number of online CPUs

Currently, blktrace uses _SC_NPROCESSORS_CONF to find out the number of
CPUs.  This is a problem, because if you reduce the number of online
CPUs by passing kernel parameter maxcpus, then blktrace fails to start
with the error:

FAILED to start thread on CPU 4: 22/Invalid argument
FAILED to start thread on CPU 5: 22/Invalid argument
...

The attached patch fixes it to use _SC_NPROCESSORS_ONLN.

Signed-off-by: Jens Axboe <axboe@fb.com>
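
The difference between the two sysconf() names can be shown with a short sketch. The helper names are hypothetical, and falling back to 1 on error is an assumption made for this example, not blktrace's behaviour.

```c
#include <unistd.h>

/* CPUs currently online; falls back to 1 if the value is unavailable. */
static long online_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);

    return n > 0 ? n : 1;
}

/* CPUs configured in the system, which may include offline ones. */
static long configured_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_CONF);

    return n > 0 ? n : 1;
}
```

When the kernel is booted with maxcpus, the configured count can exceed the online count, and starting one thread per configured CPU fails for the offline ones.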
8 years agoAdd the "-a discard" filter option to the blktrace.8 man page
John Groves [Fri, 8 Jan 2016 18:46:03 +0000 (11:46 -0700)]
Add the "-a discard" filter option to the blktrace.8 man page

Signed-off-by: Jens Axboe <axboe@fb.com>
8 years agoFix warnings on newer gcc
Jens Axboe [Tue, 15 Sep 2015 14:48:06 +0000 (08:48 -0600)]
Fix warnings on newer gcc

Signed-off-by: Jens Axboe <axboe@fb.com>
8 years agoinclude sys/types.h for dev_t definition
Khem Raj [Tue, 15 Sep 2015 00:05:21 +0000 (00:05 +0000)]
include sys/types.h for dev_t definition

Avoids the build failures when sys/types.h does not get included
indirectly through other headers.

Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
8 years agobtreplay: Fix typo in scaling up the dynamic cpu set size.
Josef Cejka [Thu, 20 Aug 2015 15:52:51 +0000 (11:52 -0400)]
btreplay: Fix typo in scaling up the dynamic cpu set size.

In get_ncpus, we default to using 4096 CPUs if _SC_NPROCESSORS_CONF isn't
enabled.  If that is insufficient, sched_getaffinity will fail and we
retry after doubling the size of the cpu_set_t allocation.  There's a typo
in there that means we don't actually double the size and will loop
forever allocating the same sized cpu_set_t instead.

Signed-off-by: Josef Cejka <jcejka@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
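
The retry-with-doubling loop described above might look like the sketch below. It is not btreplay's get_ncpus(); `count_affinity_cpus()`, its starting-guess parameter and the upper bound `MAX_CPU_GUESS` are assumptions added so the loop always terminates.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>

#define MAX_CPU_GUESS (1 << 20)    /* arbitrary safety cap for this sketch */

/*
 * Return the number of CPUs in the calling thread's affinity mask,
 * or -1 on failure. The set is grown until the kernel accepts its size.
 */
static int count_affinity_cpus(int start_guess)
{
    int nrcpus = start_guess > 0 ? start_guess : 1;

    while (nrcpus <= MAX_CPU_GUESS) {
        cpu_set_t *set = CPU_ALLOC(nrcpus);
        size_t size = CPU_ALLOC_SIZE(nrcpus);
        int saved_errno;

        if (!set)
            return -1;
        /* pid 0 means "the calling thread" */
        if (sched_getaffinity(0, size, set) == 0) {
            int count = CPU_COUNT_S(size, set);

            CPU_FREE(set);
            return count;
        }
        saved_errno = errno;
        CPU_FREE(set);
        if (saved_errno != EINVAL)
            return -1;
        nrcpus *= 2;            /* actually grow before retrying */
    }
    return -1;
}
```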
9 years agoRefer to sda instead of hda in man pages
Olaf Hering [Wed, 18 Feb 2015 12:03:55 +0000 (13:03 +0100)]
Refer to sda instead of hda in man pages

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
9 years agoiowatcher: wrap system() in a checker function
Jens Axboe [Thu, 25 Sep 2014 21:17:06 +0000 (15:17 -0600)]
iowatcher: wrap system() in a checker function

Kills the errors on unchecked return of system()

Signed-off-by: Jens Axboe <axboe@fb.com>
9 years agoMerge branch 'for-upstream' of https://github.com/andyprice/blktrace
Jens Axboe [Thu, 25 Sep 2014 21:11:24 +0000 (15:11 -0600)]
Merge branch 'for-upstream' of https://github.com/andyprice/blktrace

Andrew says:

Here are some trivial tweaks which I found were needed or desirable while
adding iowatcher to the blktrace packaging in Fedora. They improve the
integration of iowatcher into the tree and reduce duplication of docs.

9 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/mason/blktrace
Jens Axboe [Thu, 25 Sep 2014 21:11:06 +0000 (15:11 -0600)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/blktrace

Signed-off-by: Jens Axboe <axboe@fb.com>
Conflicts:
iowatcher/Makefile

9 years agoAdd iowatcher requirements to README
Andrew Price [Thu, 25 Sep 2014 20:00:31 +0000 (21:00 +0100)]
Add iowatcher requirements to README

Merge the requirements bits of iowatcher/README into README

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: check the return value from write()
Chris Mason [Thu, 25 Sep 2014 20:13:17 +0000 (16:13 -0400)]
iowatcher: check the return value from write()

Signed-off-by: Chris Mason <clm@fb.com>
9 years agoiowatcher: fixup the Makefile
Chris Mason [Thu, 25 Sep 2014 20:12:21 +0000 (16:12 -0400)]
iowatcher: fixup the Makefile

We were setting C=gcc instead of CC=gcc, and using -O0.  Fix both.

Signed-off-by: Chris Mason <clm@fb.com>
9 years agoiowatcher: Remove iowatcher/README
Andrew Price [Thu, 25 Sep 2014 19:49:47 +0000 (20:49 +0100)]
iowatcher: Remove iowatcher/README

This README is getting out-of-date and its contents are duplicated in
the iowatcher manpage which is up-to-date, so remove it to reduce
duplication of effort.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Move iowatcher.1 into doc directory
Andrew Price [Thu, 25 Sep 2014 19:47:57 +0000 (20:47 +0100)]
iowatcher: Move iowatcher.1 into doc directory

iowatcher's manpage wasn't being installed with the other manpages so
add it to the doc directory.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Add iowatcher to .gitignore
Andrew Price [Thu, 25 Sep 2014 19:45:44 +0000 (20:45 +0100)]
iowatcher: Add iowatcher to .gitignore

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoMakefile: ensure that iowatcher gets cleaned
Jens Axboe [Wed, 24 Sep 2014 20:49:04 +0000 (14:49 -0600)]
Makefile: ensure that iowatcher gets cleaned

Signed-off-by: Jens Axboe <axboe@fb.com>
9 years agoblktrace 1.1.0 blktrace-1.1.0
Jens Axboe [Wed, 24 Sep 2014 19:52:31 +0000 (13:52 -0600)]
blktrace 1.1.0

Bump it up to a full 1.1 since we now include iowatcher.

Signed-off-by: Jens Axboe <axboe@fb.com>
9 years agoiowatcher: add iowatcher to the main blktrace Makefile
Chris Mason [Wed, 24 Sep 2014 19:09:51 +0000 (12:09 -0700)]
iowatcher: add iowatcher to the main blktrace Makefile

Signed-off-by: Chris Mason <clm@fb.com>
9 years agoMerge the iowatcher repository
Chris Mason [Wed, 24 Sep 2014 19:03:35 +0000 (12:03 -0700)]
Merge the iowatcher repository

9 years agoiowatcher: Properly initialize trace.name in find_trace_file
Chris Mason [Wed, 24 Sep 2014 16:44:45 +0000 (09:44 -0700)]
iowatcher: Properly initialize trace.name in find_trace_file

Signed-off-by: Chris Mason <clm@fb.com>
9 years agoiowatcher: Fix up some strcpy and strcat usage
Andrew Price [Sun, 27 Apr 2014 05:16:01 +0000 (06:16 +0100)]
iowatcher: Fix up some strcpy and strcat usage

Fix an unchecked strcpy and strcat in plot_io_movie():

  $ ./iowatcher -t foo --movie -o foo.ogv -l $(printf 'x%.0s' {1..300})
  [...]
  *** buffer overflow detected ***: ./iowatcher terminated

There was also very similar code in plot_io() so a new function
plot_io_legend() was added to factor out the common string building code
and replace the buggy code with asprintf().

Also add a closedir() call to an error path in traces_list() to plug a
resource leak and make iowatcher Coverity-clean (ignoring some
false-positives).

Signed-off-by: Andrew Price <anprice@redhat.com>
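
The commit replaces fixed-buffer strcpy()/strcat() with asprintf(). The sketch below shows the same idea of sizing the buffer from its inputs, but swaps in standard malloc() plus snprintf() so that it does not rely on a GNU extension. `make_legend()` is a hypothetical name and is not the plot_io_legend() function from iowatcher.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "<label> <suffix>" (or just "<suffix>" when label is empty) in a
 * freshly allocated buffer sized from the inputs. The caller frees it.
 */
static char *make_legend(const char *label, const char *suffix)
{
    const char *sep;
    size_t len;
    char *buf;

    if (!label)
        label = "";
    sep = *label ? " " : "";
    len = strlen(label) + strlen(sep) + strlen(suffix) + 1;
    buf = malloc(len);
    if (!buf)
        return NULL;
    snprintf(buf, len, "%s%s%s", label, sep, suffix);
    return buf;
}
```

Because the allocation follows the label length, a 300-character label like the one in the reproducer no longer overruns anything.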
9 years agoiowatcher: Clean up some unused functions, make others static
Andrew Price [Sun, 27 Apr 2014 02:58:40 +0000 (03:58 +0100)]
iowatcher: Clean up some unused functions, make others static

Adding -Wmissing-prototypes showed some functions could be made static
and my 'findunused' script showed some functions weren't being called.

This patch was tested by building from scratch and running with various
combinations of options.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Update usage info and improve man page
Andrew Price [Sun, 27 Apr 2014 02:32:17 +0000 (03:32 +0100)]
iowatcher: Update usage info and improve man page

Bring the man page and usage string up-to-date with the new -p behaviour
and improve the formatting and content of the man page.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Convert start_mpstat to run_program
Andrew Price [Sun, 27 Apr 2014 00:55:57 +0000 (01:55 +0100)]
iowatcher: Convert start_mpstat to run_program

For consistency and deduplication, use run_program in start_mpstat.  Add
the ability to pass a path to run_program, which will be opened in the
spawned process and used as stdout, in order to capture mpstat output.
This fixes a tricky descriptor leak in start_mpstat which could have
caused a race condition if it was fixed with close().

Some output formatting tweaks have also been added and a bug from a
previous patch, where tracers were killed immediately when -p wasn't
specified, has been fixed.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Convert start_blktrace to run_program
Andrew Price [Sat, 26 Apr 2014 22:00:58 +0000 (23:00 +0100)]
iowatcher: Convert start_blktrace to run_program

Rework start_blktrace and use run_program to launch blktrace. Move the
argv-building into the function so that it's easier to work with and
clean it up a bit. Add a signal parameter to wait_program to optionally
kill the pid with a given signal before waiting for it.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Rework --prog to make arg processing safer
Andrew Price [Sat, 26 Apr 2014 17:22:53 +0000 (18:22 +0100)]
iowatcher: Rework --prog to make arg processing safer

Previously the --prog option required the program-to-be-run to be
specified as a single string. This meant that shell escaping would be
lost in translation and a sub-shell would be run. Rework --prog to not
take an argument and accept the arguments left after option processing
has ended as the argv for the program-to-be-run.

As we have the program as an argv, run_program2() can now be used to run
it, and now that run_program() is no longer used we can remove it and
remove the '2' from run_program2.

New usage example:

 # iowatcher -p -t foo -d /dev/sda3 sleep 10
 running blktrace blktrace -b 8192 -a queue -a complete -a issue -a notify -D . -d /dev/sda3 -o foo
 running 'sleep' '10'
 sleep exited with 0
 ...

Docs have been updated accordingly.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Separate program running from waiting
Andrew Price [Sat, 26 Apr 2014 14:31:19 +0000 (15:31 +0100)]
iowatcher: Separate program running from waiting

Until now run_program2() was a replacement for system() so it always
waited for the process to end before returning. To make this function
more useful move the waiting code into a separate function and add a
mechanism to expect a specific exit code.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Correct a couple of calloc calls
Andrew Price [Sat, 26 Apr 2014 02:49:22 +0000 (03:49 +0100)]
iowatcher: Correct a couple of calloc calls

(Caught by Coverity.) tf->gdd_writes and tf->gdd_reads are arrays of
pointers so update their allocations to use the correct element size.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Simplify temp movie directory creation
Andrew Price [Sat, 26 Apr 2014 01:56:17 +0000 (02:56 +0100)]
iowatcher: Simplify temp movie directory creation

plot_io_movie() was calling create_movie_temp_dir() which unnecessarily
strdup()ed a string constant leaving plot_io_movie() to free it. Replace
the strdup() with a mutable char array and get rid of the free(). Merge
the few remaining lines which create the movie dir into plot_io_movie().

Also prune a duplicate declaration of start_mpstat() in tracers.h

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Merge trace dumping functions into one
Andrew Price [Fri, 28 Mar 2014 01:47:27 +0000 (01:47 +0000)]
iowatcher: Merge trace dumping functions into one

Now that combine_blktrace_devs() takes a list of traces it's fairly
generic so we might as well merge blktrace_to_dump() into it. The latter
can be replaced with a call using a list with a single entry.

combine_blktrace_devs() is renamed dump_traces() because that's what it
does.

Also eradicate the big global char array 'line' that was being used in a
bunch of places along with some more unnecessary strdup()s.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Check program exit code properly
Andrew Price [Fri, 28 Mar 2014 01:42:47 +0000 (01:42 +0000)]
iowatcher: Check program exit code properly

The return value of posix_spawnp() was being checked but the exit status
of the child process was being ignored. This adds checks and error
reporting based on the status that waitpid returns.

Signed-off-by: Andrew Price <anprice@redhat.com>
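
Checking both the spawn result and the child's exit status could look like the following sketch. The function names echo those mentioned in the surrounding commits, but the signatures and return conventions here are assumptions, not iowatcher's code.

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Start argv[0] with the given arguments. Returns 0 or an errno value. */
static int run_program(char *const argv[], pid_t *pid)
{
    return posix_spawnp(pid, argv[0], NULL, NULL, argv, environ);
}

/*
 * Wait for pid and compare its exit code with `expected`.
 * Returns 0 on a match, -1 on a mismatch, a signal death, or a wait error.
 */
static int wait_program(pid_t pid, int expected)
{
    int status;
    pid_t r;

    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);
    if (r < 0)
        return -1;
    if (!WIFEXITED(status))
        return -1;              /* e.g. killed by a signal */
    return WEXITSTATUS(status) == expected ? 0 : -1;
}
```

A successful posix_spawnp() only says the child was started; whether the command itself succeeded is visible only in the status collected by waitpid().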
9 years agoiowatcher: Fix up directory trace processing
Andrew Price [Thu, 27 Mar 2014 21:59:38 +0000 (21:59 +0000)]
iowatcher: Fix up directory trace processing

Similar to the fix for spaces in file names in commit 5d845e3, this
patch fixes processing of directories with spaces in their names by
using posix_spawnp() to run the blkparse command instead of system(). In
doing so, combine_blktrace_devs() and match_trace() have been reworked
to use a list structure instead of doing a lot of strdup()ing and string
appending.

Also make sure that trailing slashes are removed from the directory name
before attempting to use it as the base of the .dump filename.

Update the -t entry in the manpage to mention directory behaviour, too.

Signed-off-by: Andrew Price <anprice@redhat.com>
9 years agoiowatcher: Handle REQUEUE events
Jan Kara [Thu, 4 Apr 2013 10:18:28 +0000 (06:18 -0400)]
iowatcher: Handle REQUEUE events

When a requeue event happens we have to decrease the number of in-flight
requests. Otherwise the count drifts away.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
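
The in-flight accounting can be pictured as a small state update per event. The enum and function below are hypothetical and only illustrate the bookkeeping; clamping at zero is an assumption of this sketch.

```c
/* Simplified event kinds for this sketch. */
enum io_event { EV_ISSUE, EV_COMPLETE, EV_REQUEUE };

/*
 * Return the new in-flight count after one event. A requeue takes the
 * request back off the device, so it is counted like a completion.
 */
static long update_in_flight(long in_flight, enum io_event ev)
{
    switch (ev) {
    case EV_ISSUE:
        return in_flight + 1;
    case EV_COMPLETE:
    case EV_REQUEUE:
        return in_flight > 0 ? in_flight - 1 : 0;
    }
    return in_flight;
}
```

For the sequence issue, requeue, issue, complete the count ends at 0; if requeue events were ignored it would end at 1 and keep growing with every requeue.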
9 years agoiowatcher: Make seconds unsigned
Jan Kara [Thu, 4 Apr 2013 10:18:27 +0000 (06:18 -0400)]
iowatcher: Make seconds unsigned

Compiler was giving some warnings about signed vs unsigned comparisons.
Although these were harmless, make seconds unsigned because they really
are.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
9 years agoiowatcher: Remove duplicate defines from blkparse.c
Jan Kara [Thu, 4 Apr 2013 10:18:26 +0000 (06:18 -0400)]
iowatcher: Remove duplicate defines from blkparse.c

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>