From: Jens Axboe
Date: Wed, 1 Oct 2014 02:28:45 +0000 (-0600)
Subject: engines/libaio: don't reap on EAGAIN and no pending events
X-Git-Tag: fio-2.1.13~5
X-Git-Url: https://git.kernel.dk/?p=fio.git;a=commitdiff_plain;h=a120ca7f793b41532b04e3915fcd6646fa37bb4f

engines/libaio: don't reap on EAGAIN and no pending events

Instead just loop on submit, since there are no events for us to
reap. This is usually a kernel bug, violating the principle of
forward progress guarantee. If we can't submit anything in 30
seconds, error out.

Signed-off-by: Jens Axboe
---

diff --git a/engines/libaio.c b/engines/libaio.c
index cd10aabf..ca7bfdef 100644
--- a/engines/libaio.c
+++ b/engines/libaio.c
@@ -234,7 +234,8 @@ static int fio_libaio_commit(struct thread_data *td)
 	struct libaio_data *ld = td->io_ops->data;
 	struct iocb **iocbs;
 	struct io_u **io_us;
-	int ret;
+	struct timeval tv;
+	int ret, wait_start = 0;
 
 	if (!ld->queued)
 		return 0;
@@ -262,10 +263,24 @@ static int fio_libaio_commit(struct thread_data *td)
 			/*
 			 * If we get EAGAIN, we should break out without
 			 * error and let the upper layer reap some
-			 * events for us.
+			 * events for us. If we have no queued IO, we
+			 * must loop here. If we loop for more than 30s,
+			 * just error out, something must be buggy in the
+			 * IO path.
 			 */
-			ret = 0;
-			break;
+			if (ld->queued) {
+				ret = 0;
+				break;
+			}
+			if (!wait_start) {
+				fio_gettime(&tv, NULL);
+				wait_start = 0;
+			} else if (mtime_since_now(&tv) > 30000) {
+				log_err("fio: aio appears to be stalled, giving up\n");
+				break;
+			}
+			usleep(1);
+			continue;
 		} else
 			break;
 	} while (ld->head != ld->tail);
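
For illustration only, here is a minimal standalone sketch of the retry-with-deadline pattern described in the commit message: keep retrying a submit that returns EAGAIN, and give up once too much time has passed since the first failure. It is not fio code. The names submit_with_timeout(), try_submit_fn, msec_since(), flaky_submit() and stalled_submit() are hypothetical, and clock_gettime()/nanosleep() are assumed stand-ins for fio's fio_gettime(), mtime_since_now() and usleep(). In this sketch the wait_start flag is set to 1 once the timer is armed, so the elapsed-time check is reached on later iterations.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical callback: returns >= 0 on success or a negative errno. */
typedef int (*try_submit_fn)(void *ctx);

/* Milliseconds elapsed since *start on the monotonic clock. */
static long msec_since(const struct timespec *start)
{
	struct timespec now;
	long long nsec;

	clock_gettime(CLOCK_MONOTONIC, &now);
	nsec = (long long)(now.tv_sec - start->tv_sec) * 1000000000LL +
	       (now.tv_nsec - start->tv_nsec);
	return (long)(nsec / 1000000LL);
}

/*
 * Retry try_submit() for as long as it returns -EAGAIN. Returns
 * -ETIMEDOUT once more than timeout_ms have passed since the first
 * -EAGAIN; any other result is passed through unchanged.
 */
static int submit_with_timeout(try_submit_fn try_submit, void *ctx,
			       long timeout_ms)
{
	const struct timespec pause = { 0, 1000 };	/* 1 usec */
	struct timespec start;
	int wait_start = 0;
	int ret;

	assert(try_submit != NULL);

	for (;;) {
		ret = try_submit(ctx);
		if (ret != -EAGAIN)
			return ret;

		if (!wait_start) {
			/* First -EAGAIN: arm the timer. */
			clock_gettime(CLOCK_MONOTONIC, &start);
			wait_start = 1;
		} else if (msec_since(&start) > timeout_ms) {
			fprintf(stderr, "submit appears to be stalled, giving up\n");
			return -ETIMEDOUT;
		}
		nanosleep(&pause, NULL);
	}
}

/* Hypothetical stub: fails with -EAGAIN until *remaining reaches zero. */
static int flaky_submit(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 1;
}

/* Hypothetical stub: never makes progress. */
static int stalled_submit(void *ctx)
{
	(void)ctx;
	return -EAGAIN;
}
```

Calling submit_with_timeout(flaky_submit, &n, 1000) with n = 3 retries three times and then returns the stub's success value, while submit_with_timeout(stalled_submit, NULL, 20) gives up after roughly 20 ms. Returning -ETIMEDOUT is a choice made for this sketch; the patch above instead logs the error and breaks out of the loop with whatever ret already holds.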