fio.git
9 months ago
Jens Axboe [Wed, 16 Jan 2019 05:06:05 +0000 (22:06 -0700)]
engines/io_uring: ensure sqe stores are ordered with SQ ring tail update

Signed-off-by: Jens Axboe <axboe@kernel.dk>
9 months ago
Jens Axboe [Wed, 16 Jan 2019 04:43:52 +0000 (21:43 -0700)]
t/io_uring: use fio provided memory barriers

Signed-off-by: Jens Axboe <axboe@kernel.dk>
9 months ago
Jens Axboe [Wed, 16 Jan 2019 04:43:11 +0000 (21:43 -0700)]
x86-64: correct read/write barriers

Signed-off-by: Jens Axboe <axboe@kernel.dk>
9 months ago
Jens Axboe [Tue, 15 Jan 2019 21:48:31 +0000 (14:48 -0700)]
t/io_uring: fixes

- Break out if we get a fatal error from reap_events()
- Ignore polled=1 if do_nop=1

Signed-off-by: Jens Axboe <axboe@kernel.dk>
9 months ago
Jens Axboe [Tue, 15 Jan 2019 20:52:14 +0000 (13:52 -0700)]
t/io_uring: terminate buf[] file depth string

Prevents garbage print for !s->nr_files (do_nop = 1).

Signed-off-by: Jens Axboe <axboe@kernel.dk>
9 months ago
Jens Axboe [Tue, 15 Jan 2019 17:58:17 +0000 (10:58 -0700)]
t/io_uring: wait if we're at queue limit

There was an off-by-one there; it's perfectly fine not to specify
events to wait for if the submission will take us to the queue
depth limit.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
9 months ago
Jens Axboe [Tue, 15 Jan 2019 13:12:54 +0000 (06:12 -0700)]
t/io_uring: print file depths

Signed-off-by: Jens Axboe <axboe@kernel.dk>
9 months ago
Jens Axboe [Tue, 15 Jan 2019 12:57:54 +0000 (05:57 -0700)]
t/io_uring: pick next file if we're over the limit

Signed-off-by: Jens Axboe <axboe@kernel.dk>
9 months ago
Jens Axboe [Mon, 14 Jan 2019 05:49:48 +0000 (22:49 -0700)]
t/io_uring: use the right check for when to wait

Signed-off-by: Jens Axboe <axboe@kernel.dk>
9 months ago
Jens Axboe [Sun, 13 Jan 2019 21:22:03 +0000 (14:22 -0700)]
t/io_uring: only call setrlimit() for fixedbufs

It's root only.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Sun, 13 Jan 2019 17:57:44 +0000 (10:57 -0700)]
io_uring: add 32-bit x86 support

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Sun, 13 Jan 2019 17:56:39 +0000 (10:56 -0700)]
t/io_uring: add option for register_files

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Sun, 13 Jan 2019 16:17:39 +0000 (09:17 -0700)]
io_uring: fix pointer cast warning on 32-bit

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Sun, 13 Jan 2019 16:15:32 +0000 (09:15 -0700)]
io_uring: ensure that the io_uring_register() structs are 32-bit safe

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Sun, 13 Jan 2019 15:56:11 +0000 (08:56 -0700)]
Move io_uring to os/linux/

It's not a generic OS header; reflect the fact that it's Linux-only
by moving it to a linux/ directory.

Also update io_uring_sqe to match current API.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Sun, 13 Jan 2019 05:14:54 +0000 (22:14 -0700)]
t/io_uring: add IORING_OP_NOP support

Doesn't do anything on the kernel side, just a round trip through
the SQ and CQ ring.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 11 Jan 2019 21:40:16 +0000 (14:40 -0700)]
t/io_uring: only set IORING_ENTER_GETEVENTS when actively reaping

Don't set it if we don't need to find an event (to_wait == 0).

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 11 Jan 2019 21:15:26 +0000 (14:15 -0700)]
engines/io_uring: remove unused ld->io_us array

Leftover from a previous API.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 11 Jan 2019 18:38:29 +0000 (11:38 -0700)]
t/io_uring: remember to set p->sq_thread_cpu

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 11 Jan 2019 17:33:28 +0000 (10:33 -0700)]
io_uring: update to newer API

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 11 Jan 2019 05:27:56 +0000 (22:27 -0700)]
t/io_uring: add support for registered files

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 11 Jan 2019 04:38:35 +0000 (21:38 -0700)]
t/io_uring: make submits/reaps per-second reflected with sq thread poll

If we use polling, the numbers currently read as 0. Make them -1 to
reflect that we're actually doing zero calls per IO.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 11 Jan 2019 04:37:15 +0000 (21:37 -0700)]
t/io_uring: enable SQ thread poll mode

With this, we can do IO without ever entering the kernel.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 11 Jan 2019 02:43:41 +0000 (19:43 -0700)]
t/io_uring: make more efficient for multiple files

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 11 Jan 2019 02:10:03 +0000 (19:10 -0700)]
t/io_uring: restore usage of IORING_SETUP_IOPOLL

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Thu, 10 Jan 2019 22:42:07 +0000 (15:42 -0700)]
io_uring: cleanup sq thread poll/cpu setup

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Thu, 10 Jan 2019 21:22:08 +0000 (14:22 -0700)]
Update io_uring API

- Fixed buffers are now available through io_uring_register()
- Various thread/wq options are now dead and automatic instead
- sqe->index is now sqe->buf_index
- Fixed buffers require flag, not separate opcode

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Thu, 10 Jan 2019 16:48:37 +0000 (09:48 -0700)]
io_uring: io_uring_setup(2) takes a 'nr_iovecs' field now

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Thu, 10 Jan 2019 16:45:58 +0000 (09:45 -0700)]
Makefile: make t/io_uring depend on os/io_uring.h

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Thu, 10 Jan 2019 16:39:14 +0000 (09:39 -0700)]
Update to newer io_uring API

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Wed, 9 Jan 2019 22:11:04 +0000 (15:11 -0700)]
engines/io_uring: always setup ld->iovecs[]

We need it now for the vectored commands, but only pass it in to
ring setup if we use fixedbufs.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Wed, 9 Jan 2019 21:53:56 +0000 (14:53 -0700)]
Update to newer io_uring API

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Dan Williams [Tue, 8 Jan 2019 19:34:19 +0000 (11:34 -0800)]
engines/devdax: Make detection of device-dax instances more robust

In preparation for the kernel switching device-dax instances from the
"/sys/class/dax" subsystem to "/sys/bus/dax" [1], teach the device-dax
instance detection to be subsystem-type agnostic.

Note that the subsystem switch will require an administrator or distro
opt-in. The opt-in will either be at kernel compile time, by disabling
the default compatibility driver in the kernel, or at runtime, with a
modprobe policy to override which kernel module services device-dax
devices. The daxctl utility [2] will ship a command to install the
modprobe policy and include a man page that lists the potential
regression risk to older fio and other userspace tools that are
hard-coded to "/sys/class/dax".

[1]: https://lwn.net/Articles/770128/
[2]: https://github.com/pmem/ndctl/tree/master/daxctl

Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Tue, 8 Jan 2019 17:26:47 +0000 (10:26 -0700)]
t/io_uring: ensure to use the right opcode for fixed buffers

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Tue, 8 Jan 2019 17:26:19 +0000 (10:26 -0700)]
engines/io_uring: ensure to use the right opcode for fixed buffers

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Tue, 8 Jan 2019 17:14:00 +0000 (10:14 -0700)]
configure: add __kernel_rwf_t check

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Tue, 8 Jan 2019 13:20:58 +0000 (06:20 -0700)]
io_uring: use kernel header directly

The kernel header has been designed such that it doesn't require a
special userland version. Use it directly.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Tue, 8 Jan 2019 12:43:38 +0000 (05:43 -0700)]
io_uring.h should include <linux/fs.h>

This ensures we have the __kernel_rwf_t definition.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Tue, 8 Jan 2019 04:46:30 +0000 (21:46 -0700)]
Rename aioring engine to io_uring

The new API is completely decoupled from the aio/libaio
interface. Rename it while adopting the new API.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Tue, 8 Jan 2019 04:35:15 +0000 (21:35 -0700)]
Rename t/aio-ring to t/io_uring

The new API is completely decoupled from the aio/libaio
interface. Rename it while adopting the new API.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Sat, 5 Jan 2019 14:42:30 +0000 (07:42 -0700)]
t/aio-ring: cleanup the code a bit

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Sat, 5 Jan 2019 14:37:02 +0000 (07:37 -0700)]
aioring: make sq/cqring_offsets a bit more future proof

And include 'dropped' as well.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Sat, 5 Jan 2019 05:22:54 +0000 (22:22 -0700)]
aioring: update to newer API

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 4 Jan 2019 21:02:25 +0000 (14:02 -0700)]
t/aio-ring: use syscall defines

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 4 Jan 2019 21:00:30 +0000 (14:00 -0700)]
engines/aioring: update for newer mmap based API

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 4 Jan 2019 20:27:46 +0000 (13:27 -0700)]
t/aio-ring: update to newer mmap() API

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Mon, 31 Dec 2018 00:19:40 +0000 (17:19 -0700)]
aioring: remove IOCB_FLAG_HIPRI

New API doesn't require the setting of this flag at runtime,
it's implied from the io context.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Sun, 30 Dec 2018 23:40:09 +0000 (16:40 -0700)]
aioring: update API

In both the engine and t/aio-ring, drop IORING_FLAG_SUBMIT as it's
been dropped on the kernel side. Renumber IORING_FLAG_GETEVENTS.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 21 Dec 2018 22:37:16 +0000 (15:37 -0700)]
t/aio-ring: print head/tail as unsigneds

Since we're wrapping now and using the full range, we can get
logging like:

IOPS=1094880, IOS/call=32/32, inflight=32 (head=-1509517216 tail=-1509517216), Cachehit=0.00%

Ensure we print as unsigned, as that's the right type.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 21 Dec 2018 22:09:45 +0000 (15:09 -0700)]
engines/aioring: fix harmless typo

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 21 Dec 2018 21:47:34 +0000 (14:47 -0700)]
t/aio-ring: update for continually rolling ring

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 21 Dec 2018 21:47:07 +0000 (14:47 -0700)]
engines/aioring: update for continually rolling ring

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Wed, 19 Dec 2018 19:55:10 +0000 (12:55 -0700)]
engines/aio-ring: initialization error handling

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Wed, 19 Dec 2018 19:51:51 +0000 (12:51 -0700)]
engines/aio-ring: cleanup read/write prep

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 14 Dec 2018 21:36:52 +0000 (14:36 -0700)]
Fix 'min' latency times being 0 with ramp_time

If the job includes a ramp_time setting, we end up with latencies
that look like this:

    slat (nsec): min=0, max=17585, avg=1896.34, stdev=733.35
    clat (nsec): min=0, max=1398.1k, avg=77851.76, stdev=25055.97
     lat (nsec): min=0, max=1406.1k, avg=79824.20, stdev=25066.57

with the 'min' being 0. This is because resetting the stats sets the
field to zero, and no new IO will be smaller than that...

Set the min value to the max value of the type when we reset stats.

Reported-by: Matthew Eaton <m.eaton82@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 14 Dec 2018 20:07:55 +0000 (13:07 -0700)]
engines/aioring: get rid of old error on sqwq and sqthread

They are not mutually exclusive for buffered aio.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 14 Dec 2018 17:54:01 +0000 (10:54 -0700)]
t/aio-ring: add cache hit statistics

Pretty nifty to run it on a drive that will eventually end up being
fully cached, and watch the hit rates climb:

sudo taskset -c 0 t/aio-ring /dev/sde3
polled=0, fixedbufs=1, buffered=1
  QD=32, sq_ring=33, cq_ring=66
submitter=4269
IOPS=477, IOS/call=1/0, inflight=32 (head=50 tail=50), Cachehit=0.00%
IOPS=447, IOS/call=1/1, inflight=32 (head=35 tail=35), Cachehit=0.00%
IOPS=419, IOS/call=1/1, inflight=32 (head=58 tail=58), Cachehit=0.00%
[...]
IOPS=483, IOS/call=1/1, inflight=32 (head=63 tail=63), Cachehit=5.80%
IOPS=452, IOS/call=1/1, inflight=32 (head=53 tail=53), Cachehit=4.65%
IOPS=459, IOS/call=1/1, inflight=32 (head=50 tail=50), Cachehit=5.45%
[...]
IOPS=484, IOS/call=1/1, inflight=32 (head=22 tail=22), Cachehit=11.16%
IOPS=494, IOS/call=1/1, inflight=32 (head=54 tail=54), Cachehit=11.34%
IOPS=508, IOS/call=1/1, inflight=32 (head=34 tail=34), Cachehit=12.99%
[...]
IOPS=606, IOS/call=1/1, inflight=32 (head=18 tail=18), Cachehit=26.07%
IOPS=573, IOS/call=1/1, inflight=32 (head=63 tail=63), Cachehit=26.70%
IOPS=561, IOS/call=1/1, inflight=32 (head=30 tail=30), Cachehit=23.53%
[...]
IOPS=916, IOS/call=1/1, inflight=32 (head=63 tail=63), Cachehit=59.06%
IOPS=882, IOS/call=1/1, inflight=32 (head=21 tail=32), Cachehit=61.79%
IOPS=984, IOS/call=1/1, inflight=32 (head=22 tail=22), Cachehit=63.87%
[...]
IOPS=1993, IOS/call=7/7, inflight=32 (head=58 tail=4), Cachehit=86.75%
IOPS=2260, IOS/call=5/5, inflight=32 (head=12 tail=16), Cachehit=87.15%
IOPS=1957, IOS/call=4/4, inflight=17 (head=7 tail=10), Cachehit=86.78%
[...]
IOPS=3606, IOS/call=7/7, inflight=32 (head=26 tail=35), Cachehit=93.47%
IOPS=3487, IOS/call=6/6, inflight=28 (head=23 tail=31), Cachehit=92.59%
IOPS=3379, IOS/call=7/7, inflight=26 (head=38 tail=40), Cachehit=92.66%
[...]
IOPS=4590, IOS/call=6/6, inflight=26 (head=38 tail=46), Cachehit=95.64%
IOPS=5464, IOS/call=7/7, inflight=28 (head=22 tail=24), Cachehit=95.94%
IOPS=4896, IOS/call=8/8, inflight=18 (head=44 tail=51), Cachehit=95.62%
[...]
IOPS=7736, IOS/call=8/8, inflight=24 (head=25 tail=29), Cachehit=97.35%
IOPS=6632, IOS/call=8/7, inflight=27 (head=54 tail=61), Cachehit=97.28%
IOPS=8488, IOS/call=8/8, inflight=22 (head=33 tail=39), Cachehit=97.33%
[...]
IOPS=10696, IOS/call=8/8, inflight=16 (head=63 tail=64), Cachehit=98.11%
IOPS=11874, IOS/call=7/7, inflight=17 (head=56 tail=56), Cachehit=98.31%
IOPS=11488, IOS/call=8/7, inflight=23 (head=54 tail=57), Cachehit=98.17%
[...]
IOPS=15472, IOS/call=8/8, inflight=17 (head=11 tail=17), Cachehit=98.58%
IOPS=18656, IOS/call=8/7, inflight=22 (head=50 tail=59), Cachehit=98.95%
IOPS=19408, IOS/call=8/8, inflight=18 (head=58 tail=63), Cachehit=99.01%
[...]
IOPS=54768, IOS/call=8/7, inflight=19 (head=63 tail=3), Cachehit=99.64%
IOPS=62888, IOS/call=8/7, inflight=21 (head=51 tail=53), Cachehit=99.73%
IOPS=71656, IOS/call=7/7, inflight=24 (head=28 tail=36), Cachehit=99.75%
[...]
IOPS=125320, IOS/call=8/8, inflight=22 (head=42 tail=46), Cachehit=99.85%
IOPS=201808, IOS/call=8/8, inflight=17 (head=27 tail=35), Cachehit=99.90%
IOPS=390325, IOS/call=7/7, inflight=22 (head=23 tail=27), Cachehit=99.94%
[...]
IOPS=834056, IOS/call=8/8, inflight=8 (head=23 tail=27), Cachehit=100.00%
IOPS=837520, IOS/call=8/8, inflight=8 (head=13 tail=17), Cachehit=100.00%
IOPS=833232, IOS/call=8/8, inflight=8 (head=51 tail=57), Cachehit=100.00%

It's also a nice visual demonstration of how high a cache hit rate has
to be on a rotational drive to make a substantial impact on performance.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 14 Dec 2018 15:32:01 +0000 (08:32 -0700)]
Add cache hit stats

With the aioring engine, we can get notified if a buffered read was
a cache hit or if it hit media. Add that to the output stats for
normal and json output.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 months ago
Jens Axboe [Fri, 14 Dec 2018 15:29:14 +0000 (08:29 -0700)]
client/server: convert nr_zone_resets on the wire

Fixes: fd5d733fa34 ("Collect and show zone reset statistics")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 21:23:39 +0000 (14:23 -0700)]
engines/aioring: update to newer API

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 20:52:35 +0000 (13:52 -0700)]
engines/aioring: enable IOCTX_FLAG_SQPOLL

With this flag set, we don't have to do any system calls for
polled IO.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 16:09:42 +0000 (09:09 -0700)]
io_u: ensure buflen is capped at maxbs

If we use bsranges and the maxbs isn't a natural multiple of the minbs,
then we can generate sizes that are larger than maxbs. Ensure that we
cap the buffer length generated at maxbs.

Sample workload and problem report:

fio --name=App2 --size=10m --rw=read --blocksize_range=3k-10k

App2: (g=0): rw=read, bs=(R) 3072B-10.0KiB, (W) 3072B-10.0KiB, (T) 3072B-10.0KiB, ioengine=psync, iodepth=1
fio-3.12-17-g0fcbc0
Starting 1 process
*** Error in `fio': double free or corruption (!prev): 0x0000555f92a80a60 ***
fio: pid=1468, got signal=6

App2: (groupid=0, jobs=1): err= 0: pid=1468: Wed Dec 12 19:09:07 2018
read: IOPS=8365, BW=52.9MiB/s (55.5MB/s)(9.00MiB/189msec)
clat (nsec): min=874, max=74912k, avg=116222.51, stdev=2186743.16
lat (nsec): min=912, max=74912k, avg=116373.83, stdev=2186743.70
clat percentiles (nsec):
| 1.00th=[ 964], 5.00th=[ 1128], 10.00th=[ 1368],
| 20.00th=[ 1672], 30.00th=[ 2008], 40.00th=[ 2288],
| 50.00th=[ 2704], 60.00th=[ 3088], 70.00th=[ 3536],
| 80.00th=[ 4768], 90.00th=[ 6304], 95.00th=[ 8160],
| 99.00th=[ 544768], 99.50th=[ 2113536], 99.90th=[30539776],
| 99.95th=[74973184], 99.99th=[74973184]
lat (nsec) : 1000=1.52%
lat (usec) : 2=28.34%, 4=44.85%, 10=21.70%, 20=0.51%, 50=0.32%
lat (usec) : 250=0.25%, 500=1.20%, 750=0.76%
lat (msec) : 4=0.25%, 20=0.13%, 50=0.13%, 100=0.06%
cpu : usr=3.72%, sys=3.72%, ctx=43, majf=0, minf=14
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=1581,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
READ: bw=52.9MiB/s (55.5MB/s), 52.9MiB/s-52.9MiB/s (55.5MB/s-55.5MB/s), io=9.00MiB (10.5MB), run=189-189msec

Disk stats (read/write):
sda: ios=24/0, merge=0/0, ticks=188/0, in_queue=260, util=55.70%

Fixes: https://github.com/axboe/fio/issues/726
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 13:33:37 +0000 (06:33 -0700)]
engines/aioring: various updates and fixes

- Add support for SQWQ and SQTHREAD. Buffered is now async!
- Kill unnecessary ifdefs
- Cleanup/fix error handling
- Handle fsync like a queued command
- Queue depth handling fixups

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 05:02:16 +0000 (22:02 -0700)]
engines/libaio: remove features deprecated from old interface

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 04:10:25 +0000 (21:10 -0700)]
aioring: remove qd > 1 restriction

Just add the extra ring entry we need in ->init().

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 03:31:52 +0000 (20:31 -0700)]
aioring: check for arch support AFTER including the headers

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 03:21:42 +0000 (20:21 -0700)]
aioring: hide it if archs don't define syscalls

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 03:05:40 +0000 (20:05 -0700)]
t/aio-ring: update for new API

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 02:48:15 +0000 (19:48 -0700)]
Add aioring engine

This is a new Linux aio engine, built on top of the new aio
interfaces. It supports polled aio, regular aio, and buffered
aio.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Thu, 13 Dec 2018 02:47:31 +0000 (19:47 -0700)]
ioengine: remove ancient alias for libaio

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Wed, 12 Dec 2018 16:49:40 +0000 (09:49 -0700)]
t/aio-ring: set nr_events after clear

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Wed, 12 Dec 2018 16:47:15 +0000 (09:47 -0700)]
t/aio-ring: update to new io_setup2(2)

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Wed, 12 Dec 2018 16:28:29 +0000 (09:28 -0700)]
t/aio-ring: update to newer API

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Mon, 10 Dec 2018 22:14:36 +0000 (15:14 -0700)]
t/aio-ring: updates/cleanups

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Mon, 10 Dec 2018 21:53:58 +0000 (14:53 -0700)]
Add aio-ring test app

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Tue, 4 Dec 2018 18:27:02 +0000 (11:27 -0700)]
engines/libaio: increase RLIMIT_MEMLOCK for user buffers

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Tue, 4 Dec 2018 18:17:29 +0000 (11:17 -0700)]
engines/libaio: update for newer io_setup2() system call

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Sat, 1 Dec 2018 17:17:26 +0000 (10:17 -0700)]
engines/libaio: set IOCB_HIPRI if we are polling

Forgot to set it for the non-user mapped case.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Fri, 30 Nov 2018 21:44:25 +0000 (14:44 -0700)]
stat: assign for first stat iteration, don't sum

Fixes: 70750d6a221f ("stat: only apply proper stat summing for event timestamps")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Fri, 30 Nov 2018 17:52:31 +0000 (10:52 -0700)]
stat: only apply proper stat summing for event timestamps

We generally sum two kinds of stats: per-IO-type numbers, and plain
performance samples (like iops and bw). Only apply proper stat summing
to the former; for the latter we just want to add them up.

This fixes a group reporting case where we have multiple jobs, and
the IOPS/BW output still shows per-job numbers. Before, we'd have:

  read: IOPS=345k, BW=1346MiB/s (1411MB/s)(6229MiB/4628msec)
[...]
   bw (  KiB/s): min=282816, max=377080, per=24.99%, avg=344438.00, stdev=35329.77, samples=36
   iops        : min=70704, max=94270, avg=86109.50, stdev=8832.44, samples=36

with bw/iops showing per-job numbers, after this the same looks like:

  read: IOPS=349k, BW=1365MiB/s (1431MB/s)(6719MiB/4922msec)
[...]
   bw (  MiB/s): min= 1302, max= 1420, per=99.86%, avg=1363.02, stdev=11.14, samples=36
   iops        : min=333433, max=363668, avg=348933.33, stdev=2850.64, samples=36

which is more in line with what a user would expect.

Fixes: https://github.com/axboe/fio/issues/519
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Jens Axboe [Fri, 30 Nov 2018 17:49:30 +0000 (10:49 -0700)]
engines/libaio: only initialize iocb members when we need to

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Vincent Fu [Thu, 29 Nov 2018 18:14:18 +0000 (13:14 -0500)]
gettime: use nsec in get_cycles_per_msec division

Since we now have ntime_since() we can use nsec in the division for
get_cycles_per_msec(). This makes the cycles_per_msec value that fio
ultimately uses more stable since fio will no longer be using values
truncated to usec.

While we're here also modify some debug prints to make it explicit that
fio ultimately uses a trimmed mean in its time calculations.

On platforms where only gettimeofday() is available this will be no
worse than the original code since we will revert back to using times
truncated to usec.

With the patch, the last three digits of the trimmed mean have a tight
range from 784-786. Without the patch, the corresponding value (labeled
avg) has a much wider range from 821-958. Notice also that the standard
error S is an order of magnitude smaller with the patch.

*** WITH PATCH ***
$ ./fio --debug=time --cpuclock-test | grep mean
time     315   min=3466785, max=3466796, mean=3466789.120000, S=0.057234, N=50
time     315   trimmed mean=3466789, N=29
$ ./fio --debug=time --cpuclock-test | grep mean
time     329   min=3466784, max=3466795, mean=3466789.480000, S=0.053641, N=50
time     329   trimmed mean=3466788, N=36
$ ./fio --debug=time --cpuclock-test | grep mean
time     343   min=3466786, max=3466794, mean=3466789.220000, S=0.036654, N=50
time     343   trimmed mean=3466788, N=42
$ ./fio --debug=time --cpuclock-test | grep mean
time     357   min=3466785, max=3466794, mean=3466789.080000, S=0.053580, N=50
time     357   trimmed mean=3466788, N=33
$ ./fio --debug=time --cpuclock-test | grep mean
time     371   min=3466785, max=3466794, mean=3466789.600000, S=0.043519, N=50
time     371   trimmed mean=3466789, N=36
$ ./fio --debug=time --cpuclock-test | grep mean
time     385   min=3466785, max=3466794, mean=3466789.280000, S=0.052534, N=50
time     385   trimmed mean=3466789, N=32
$ ./fio --debug=time --cpuclock-test | grep mean
time     407   min=3466786, max=3466796, mean=3466789.520000, S=0.042616, N=50
time     407   trimmed mean=3466789, N=41
$ ./fio --debug=time --cpuclock-test | grep mean
time     513   min=3466785, max=3466796, mean=3466789.220000, S=0.051316, N=50
time     513   trimmed mean=3466789, N=35

*** WITHOUT PATCH ***
$ ./fio-3.11 --debug=time --cpuclock-test | grep 'mean\|avg'
time     439   avg: 3466821
time     439   min=3466790, max=3467006, mean=3466838.060000, S=0.959748
$ ./fio-3.11 --debug=time --cpuclock-test | grep 'mean\|avg'
time     455   avg: 3466958
time     455   min=3466796, max=3466996, mean=3466945.880000, S=0.789178
$ ./fio-3.11 --debug=time --cpuclock-test | grep 'mean\|avg'
time     469   avg: 3466887
time     469   min=3466790, max=3466984, mean=3466887.500000, S=0.825069
$ ./fio-3.11 --debug=time --cpuclock-test | grep 'mean\|avg'
time     485   avg: 3466951
time     485   min=3466818, max=3466996, mean=3466939.420000, S=0.873334
$ ./fio-3.11 --debug=time --cpuclock-test | grep 'mean\|avg'
time     499   avg: 3466957
time     499   min=3466793, max=3467009, mean=3466950.140000, S=0.764626
$ ./fio-3.11 --debug=time --cpuclock-test | grep 'mean\|avg'
time     608   avg: 3466954
time     608   min=3466818, max=3466981, mean=3466937.040000, S=0.862232

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago
Bari Antebi [Thu, 22 Nov 2018 18:14:57 +0000 (20:14 +0200)]
rand: fix compressible data ratio per segment

I've noticed a bug in fio while testing it. I expected to receive
output similar to the expected output below for: "hexdump -n 4096
/dev/nvme0n1" (considering the configuration file I've used that may
be found below).

Expected output:

0000000 fdc6 a8a8 7190 0219 1fb8 9632 d9dd 1e64
/* random data */
00004c0 d8a3 13fe aeec 0fb6 5b14 162e 0000 0000
00004d0 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000

However, the output I've received contained data after address 4cc
(which should have been filled with zeroes until 1000 as far as I
understand), but as you can see 99a contains data:

0000000 fdc6 a8a8 7190 0219 1fb8 9632 d9dd 1e64
/* random data */
00004c0 d8a3 13fe aeec 0fb6 5b14 162e 0000 0000
00004d0 0000 0000 0000 0000 0000 0000 0000 0000
*
0000990 0000 0000 0000 0000 fdc6 a8a8 7190 0219

Config file:

[global]
group_reporting=1
filename=/dev/nvme0n1
name=task_nvme0n1
rw=write
bs=4096
numjobs=1
iodepth=32
buffer_compress_chunk=4096
buffer_compress_percentage=70
cpus_allowed=0-8
cpus_allowed_policy=split
direct=1
ioengine=libaio
loops=1
refill_buffers=0

[job empty]
size=4096

Fio should write (100 - compress_percentage) * segment of random data
followed by compress_percentage * segment of compressible data.

Currently, at each iteration fio fills (100 - compress_percentage) * segment
of random data, followed by (100 - compress_percentage) * segment of
compressible data, until the segment is filled.
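The intended per-segment fill can be sketched as follows. This is a
minimal illustration of the behavior described above, not fio's actual
buffer code; fill_segment() is a hypothetical name, and zero bytes stand
in for "compressible data":

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: each segment gets (100 - compress_percentage)% random bytes
 * followed by compress_percentage% of easily compressible (zero) bytes. */
static void fill_segment(unsigned char *buf, size_t seg_len,
                         unsigned int compress_percentage)
{
	size_t random_len = seg_len * (100 - compress_percentage) / 100;
	size_t i;

	for (i = 0; i < random_len; i++)
		buf[i] = (unsigned char) rand();
	/* the remainder of the segment compresses trivially */
	memset(buf + random_len, 0, seg_len - random_len);
}
```

With compress_percentage=70 and a 4096-byte segment, only the first
~30% of the segment is random; the tail stays zero, matching the
expected hexdump above.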

Fixes: 811ac503a619 ("Compression fixes")
Signed-off-by: Bari Antebi <bari@lightbitslabs.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  engines/libaio: use maximum buffer size for fixed user bufs
Jens Axboe [Wed, 28 Nov 2018 02:43:30 +0000 (19:43 -0700)]
engines/libaio: use maximum buffer size for fixed user bufs

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  engines/libaio: add preliminary support for pre-mapped IO buffers
Jens Axboe [Tue, 27 Nov 2018 23:01:44 +0000 (16:01 -0700)]
engines/libaio: add preliminary support for pre-mapped IO buffers

An experimental kernel feature that allows us to register IO buffers
when the io_context is set up, eliminating the need to perform
get_user_pages() + put_page() for each IO. This dramatically
increases performance and lowers latency.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  workqueue: update IO counters promptly after handling IO
Vincent Fu [Mon, 26 Nov 2018 16:15:02 +0000 (11:15 -0500)]
workqueue: update IO counters promptly after handling IO

Currently, IO submit worker threads only update parent IO counters when
the threads are idle or when the threads exit. When offload fio jobs are
assigned a restrictive CPU mask, this results in reporting and logging
problems. This patch updates parent IO counters promptly upon
completing each IO, which resolves the reporting and logging problems.

In the output below, notice the missing read data direction output in
the first, simple job and how it appropriately appears after the patch
is applied. In the second job with logging, notice the missing log
entries (unequal file sizes) in the first log and how entries are no
longer missing for the run with the patch applied.
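The change can be sketched roughly as follows; the names here are
illustrative stand-ins, not fio's actual workqueue code. The point is
that each worker folds its local count into the parent right after
completing an IO, rather than waiting until it goes idle or exits:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical parent accumulator shared by submit workers. */
struct parent {
	pthread_mutex_t lock;
	unsigned long ios;
};

/* Called by a worker after each completed IO: flush the local count
 * into the parent promptly, so reporting/logging always sees fresh
 * totals even when workers rarely go idle. */
static void worker_complete_io(struct parent *p, unsigned long *local)
{
	(*local)++;
	pthread_mutex_lock(&p->lock);
	p->ios += *local;
	*local = 0;
	pthread_mutex_unlock(&p->lock);
}
```

Under a restrictive CPU mask the workers stay busy the whole run, so an
idle-time-only flush (the old behavior) leaves the parent counters stale,
which is what produced the missing output and log entries above.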

*********************
*** WITHOUT PATCH ***
*********************
$ ./fio --name=test --io_submit_mode=offload --cpus_allowed=1 --filename=/dev/sda --size=10M
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.12-19-g41dd
Starting 1 process

test: (groupid=0, jobs=1): err= 0: pid=19746: Mon Nov 26 10:37:42 2018
  lat (nsec)   : 750=13.44%, 1000=42.42%
  lat (usec)   : 2=39.88%, 4=1.56%, 10=0.23%, 20=0.20%, 50=1.02%
  lat (usec)   : 100=0.55%, 250=0.27%, 500=0.35%, 750=0.04%
  lat (msec)   : 10=0.04%
  cpu          : usr=30.30%, sys=0.00%, ctx=5126, majf=0, minf=3
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=2560,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):

Disk stats (read/write):
  sda: ios=45/0, merge=0/0, ticks=44/0, in_queue=44, util=20.51%

*********************
***** WITH PATCH ****
*********************
$ ./fio --name=test --io_submit_mode=offload --cpus_allowed=1 --filename=/dev/sda --size=10M
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.12-8-gee63-dirty
Starting 1 process

test: (groupid=0, jobs=1): err= 0: pid=19754: Mon Nov 26 10:37:56 2018
  read: IOPS=77.6k, BW=303MiB/s (318MB/s)(10.0MiB/33msec)
    clat (nsec): min=588, max=5941.5k, avg=5705.48, stdev=120112.09
     lat (nsec): min=1745, max=5950.7k, avg=7302.87, stdev=120311.09
    clat percentiles (nsec):
     |  1.00th=[    628],  5.00th=[    668], 10.00th=[    700],
     | 20.00th=[    740], 30.00th=[    772], 40.00th=[    812],
     | 50.00th=[    852], 60.00th=[    908], 70.00th=[    972],
     | 80.00th=[   1064], 90.00th=[   1256], 95.00th=[   1592],
     | 99.00th=[  48896], 99.50th=[ 158720], 99.90th=[ 544768],
     | 99.95th=[ 651264], 99.99th=[5931008]
  lat (nsec)   : 750=23.28%, 1000=49.92%
  lat (usec)   : 2=22.77%, 4=1.05%, 10=0.16%, 20=0.51%, 50=1.33%
  lat (usec)   : 100=0.27%, 250=0.31%, 500=0.27%, 750=0.08%
  lat (msec)   : 10=0.04%
  cpu          : usr=6.25%, sys=25.00%, ctx=5127, majf=0, minf=3
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=2560,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=303MiB/s (318MB/s), 303MiB/s-303MiB/s (318MB/s-318MB/s), io=10.0MiB (10.5MB), run=33-33msec

Disk stats (read/write):
  sda: ios=45/0, merge=0/0, ticks=36/0, in_queue=36, util=20.92%

*********************
*** WITHOUT PATCH ***
*********************
$ ./fio-canonical/fio --name=test --direct=1 --filename=/dev/fioa --numjobs=4 --log_avg_msec=1000 --write_iops_log=test --time_based --runtime=10s --rw=read --bs=4k --cpus_allowed=1,3,2,7 --io_submit_mode=offload --group_reporting
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
...
fio-3.12-19-g41dd
Starting 4 processes
Jobs: 4 (f=4): [R(4)][100.0%][r=385MiB/s][r=98.7k IOPS][eta 00m:00s]
test: (groupid=0, jobs=4): err= 0: pid=22661: Mon Nov 26 08:51:19 2018
  read: IOPS=134k, BW=503MiB/s (527MB/s)(5029MiB/10001msec)
    clat (nsec): min=12481, max=73112, avg=20088.92, stdev=2744.19
     lat (nsec): min=15113, max=74716, avg=23846.94, stdev=3356.56
    clat percentiles (nsec):
     |  1.00th=[14016],  5.00th=[16320], 10.00th=[17024], 20.00th=[17792],
     | 30.00th=[18304], 40.00th=[19072], 50.00th=[20096], 60.00th=[20608],
     | 70.00th=[21376], 80.00th=[21888], 90.00th=[23424], 95.00th=[24704],
     | 99.00th=[28544], 99.50th=[29824], 99.90th=[33024], 99.95th=[34560],
     | 99.99th=[43264]
   bw (  KiB/s): min=126840, max=1711256, per=32.05%, avg=165065.08, stdev=210793.01, samples=60
   iops        : min=17773, max=213907, avg=39624.97, stdev=33794.49, samples=30
  lat (usec)   : 20=48.64%, 50=51.36%, 100=0.01%
  cpu          : usr=5.38%, sys=10.06%, ctx=1795113, majf=0, minf=63
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1341011,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=503MiB/s (527MB/s), 503MiB/s-503MiB/s (527MB/s-527MB/s), io=5029MiB (5274MB), run=10001-10001msec

Disk stats (read/write):
  fioa: ios=1341011/0, merge=0/0, ticks=16916/0, in_queue=18236, util=91.46%

$ ls -l test_iops.?.log
-rw-r--r-- 1 root root  55 Nov 26 08:51 test_iops.1.log
-rw-r--r-- 1 root root 162 Nov 26 08:51 test_iops.2.log
-rw-r--r-- 1 root root 162 Nov 26 08:51 test_iops.3.log
-rw-r--r-- 1 root root 162 Nov 26 08:51 test_iops.4.log

$ cat test_iops.1.log
1000, 17773, 0, 0
3000, 71404, 0, 0
9000, 213907, 0, 0

*********************
***** WITH PATCH ****
*********************
$ ./fio/fio --name=test --direct=1 --filename=/dev/fioa --numjobs=4 --log_avg_msec=1000 --write_iops_log=test --time_based --runtime=10s --rw=read --bs=4k --cpus_allowed=1,3,2,7 --io_submit_mode=offload --group_reporting
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
...
fio-3.12-19-g41dd-dirty
Starting 4 processes
Jobs: 4 (f=4): [R(4)][100.0%][r=528MiB/s][r=135k IOPS][eta 00m:00s]
test: (groupid=0, jobs=4): err= 0: pid=22685: Mon Nov 26 08:51:56 2018
  read: IOPS=131k, BW=514MiB/s (539MB/s)(5137MiB/10001msec)
    clat (usec): min=12, max=184, avg=20.55, stdev= 4.05
     lat (usec): min=15, max=187, avg=24.61, stdev= 4.80
    clat percentiles (nsec):
     |  1.00th=[14016],  5.00th=[16512], 10.00th=[17280], 20.00th=[17792],
     | 30.00th=[18560], 40.00th=[19072], 50.00th=[20096], 60.00th=[20608],
     | 70.00th=[21120], 80.00th=[22144], 90.00th=[24448], 95.00th=[27008],
     | 99.00th=[38144], 99.50th=[39680], 99.90th=[52480], 99.95th=[54528],
     | 99.99th=[56576]
   bw (  KiB/s): min=94640, max=145392, per=24.96%, avg=131304.57, stdev=11668.56, samples=76
   iops        : min=23691, max=36200, avg=32774.75, stdev=2963.63, samples=36
  lat (usec)   : 20=49.59%, 50=50.24%, 100=0.17%, 250=0.01%
  cpu          : usr=5.20%, sys=9.76%, ctx=1794339, majf=0, minf=63
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1315098,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=514MiB/s (539MB/s), 514MiB/s-514MiB/s (539MB/s-539MB/s), io=5137MiB (5387MB), run=10001-10001msec

Disk stats (read/write):
  fioa: ios=1315098/0, merge=0/0, ticks=17156/0, in_queue=20496, util=96.36%

$ ls -l test_iops.?.log
-rw-r--r-- 1 root root 162 Nov 26 08:51 test_iops.1.log
-rw-r--r-- 1 root root 162 Nov 26 08:51 test_iops.2.log
-rw-r--r-- 1 root root 162 Nov 26 08:51 test_iops.3.log
-rw-r--r-- 1 root root 162 Nov 26 08:51 test_iops.4.log

$ cat test_iops.1.log
1000, 27173, 0, 0
2000, 27186, 0, 0
3000, 34395, 0, 0
4000, 36200, 0, 0
5000, 36047, 0, 0
6000, 36043, 0, 0
7000, 36100, 0, 0
8000, 36121, 0, 0
9000, 36124, 0, 0

Fixes: https://www.spinics.net/lists/fio/msg07628.html
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  options: fix 'kb_base' being of the wrong type
Jens Axboe [Sun, 25 Nov 2018 16:56:06 +0000 (09:56 -0700)]
options: fix 'kb_base' being of the wrong type

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  options: fix 'unit_base' being of the wrong type
Jens Axboe [Sat, 24 Nov 2018 22:10:39 +0000 (15:10 -0700)]
options: fix 'unit_base' being of the wrong type

Fixes: https://github.com/axboe/fio/issues/717
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  engines/libaio: cleanup new vs old io_setup() system call path
Jens Axboe [Wed, 21 Nov 2018 18:33:22 +0000 (11:33 -0700)]
engines/libaio: cleanup new vs old io_setup() system call path

Just fall through to the old code if the new one fails.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  engines/libaio: use fio_memalign() helper for user iocbs
Jens Axboe [Wed, 21 Nov 2018 16:02:47 +0000 (09:02 -0700)]
engines/libaio: use fio_memalign() helper for user iocbs

Don't rely on posix_memalign() being there; that's why we have
this helper.

Also ensure that the memory is cleared. This is important, as
we are passing this to the kernel, and we can't rely on our
->prep() clearing everything all the time.
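The underlying technique (over-allocate, align the returned pointer, and
stash the offset just before it so the matching free can recover the
original block) can be sketched like this. It is an illustration of the
approach with hypothetical names, not fio's exact fio_memalign()
implementation:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Portable aligned + zeroed allocation without posix_memalign():
 * over-allocate, round the address up to the alignment, and record the
 * offset back to the original malloc() block just before the returned
 * pointer. Zeroing matters because the buffer goes to the kernel. */
static void *aligned_zalloc(size_t alignment, size_t size)
{
	void *base = malloc(size + alignment + sizeof(size_t));
	uintptr_t addr, aligned;

	if (!base)
		return NULL;
	addr = (uintptr_t) base + sizeof(size_t);
	aligned = (addr + alignment - 1) & ~(uintptr_t) (alignment - 1);
	((size_t *) aligned)[-1] = aligned - (uintptr_t) base;
	memset((void *) aligned, 0, size);
	return (void *) aligned;
}

static void aligned_free(void *ptr)
{
	if (ptr)
		free((char *) ptr - ((size_t *) ptr)[-1]);
}
```

The caller always frees through aligned_free(), which walks the stored
offset back to the pointer malloc() originally returned.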

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  engines/libaio: the IOCTX_FLAG_* flags changed
Jens Axboe [Wed, 21 Nov 2018 14:18:06 +0000 (07:18 -0700)]
engines/libaio: the IOCTX_FLAG_* flags changed

Update to current API.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  engines/libaio: fallback to old io_setup() system call
Jens Axboe [Wed, 21 Nov 2018 12:53:38 +0000 (05:53 -0700)]
engines/libaio: fallback to old io_setup() system call

We can't rely on the new one being there; if we fail calling
io_setup2(), fall back to io_setup() like before.
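The fallback pattern can be sketched generically; setup_new() and
setup_old() below are hypothetical stand-ins for io_setup2() and
io_setup(), since the experimental call's exact signature isn't pinned
down here:

```c
#include <errno.h>

/* Stand-in for the new system call on a kernel that lacks it. */
static int setup_new(void)
{
	return -ENOSYS;
}

/* Stand-in for the old, always-available system call. */
static int setup_old(void)
{
	return 0;
}

/* Try the new call first; if the kernel doesn't know it (ENOSYS),
 * silently fall through to the old path. */
static int do_setup(void)
{
	int ret = setup_new();

	if (ret == -ENOSYS)
		ret = setup_old();
	return ret;
}
```

Only ENOSYS triggers the fallback; a real failure from the new call
should still be reported rather than masked by retrying the old path.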

Fixes: a1b006fe1cd3 ("engines/libaio: fix new aio poll API")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  engines/libaio: add support for user mapped iocbs
Jens Axboe [Wed, 21 Nov 2018 02:47:01 +0000 (19:47 -0700)]
engines/libaio: add support for user mapped iocbs

For polled IO, we can support having the kernel map our iocbs,
instead of having to copy them for each IO submission.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  backend: initialize io engine before io_u buffers
Jens Axboe [Wed, 21 Nov 2018 02:42:19 +0000 (19:42 -0700)]
backend: initialize io engine before io_u buffers

Otherwise we call io_ops->io_u_init() before the IO scheduler is
set up, which is somewhat backwards.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  Kill "No I/O performed by ..." message
Jens Axboe [Tue, 20 Nov 2018 18:59:36 +0000 (11:59 -0700)]
Kill "No I/O performed by ..." message

We keep finding false triggers for this, and it's driving me
nuts. Kill it with fire.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  engines/libaio: fix new aio poll API
Jens Axboe [Tue, 20 Nov 2018 02:41:53 +0000 (19:41 -0700)]
engines/libaio: fix new aio poll API

It'll be final. Some day.

Fixes: ebec344dd336 ("engines/libaio: update to new io_setup2() system call")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  engines/libaio: update to new io_setup2() system call
Jens Axboe [Mon, 19 Nov 2018 23:11:56 +0000 (16:11 -0700)]
engines/libaio: update to new io_setup2() system call

We need that to enable polling.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  libaio: switch to newer libaio polled IO API
Jens Axboe [Fri, 16 Nov 2018 03:31:35 +0000 (20:31 -0700)]
libaio: switch to newer libaio polled IO API

No more new opcodes, just set IOCB_FLAG_HIPRI instead.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  io_u: fall through to unlock path in case of error
Jens Axboe [Fri, 16 Nov 2018 01:56:12 +0000 (18:56 -0700)]
io_u: fall through to unlock path in case of error

Doesn't really matter since we're exiting anyway, but let's
do this right.

Fixes: d28174f0189c ("workqueue: ensure we see deferred error for IOs")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 months ago  backend: silence "No I/O performed by..." if job ends in error
Jens Axboe [Thu, 15 Nov 2018 22:24:11 +0000 (15:24 -0700)]
backend: silence "No I/O performed by..." if job ends in error

If we have an error, we are logging it. There's no point in
spewing extra info on not having done any IO; that's only
really useful if we don't know WHY we didn't do any IO.

Signed-off-by: Jens Axboe <axboe@kernel.dk>