backend: check if we need to update rusage stats, if stat_mutex is busy
author: Jens Axboe <axboe@fb.com>
Fri, 26 Aug 2016 20:39:30 +0000 (14:39 -0600)
committer: Jens Axboe <axboe@fb.com>
Fri, 26 Aug 2016 20:39:30 +0000 (14:39 -0600)

Even with the fix to check if we need to update rusage happening
right before the stat_mutex lock, we can still deadlock. This
looks something like the below:

helper_thread                   job

lock(stat_mutex);
                                lock(stat_mutex);
down(td->rusage_sem);

And now both are effectively locked in an ABBA deadlock. The helper
thread is waiting for the job to update its rusage, but the job
is stuck waiting for the stat_mutex.

Fix this by doing a trylock on the stat_mutex. If it fails, check
whether rusage needs updating, sleep briefly, and retry.

Signed-off-by: Jens Axboe <axboe@fb.com>
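
The retry loop described in the message can be modelled outside of fio
with plain POSIX primitives. The sketch below is illustrative only, not
fio code: it assumes Linux (unnamed POSIX semaphores), uses a pthread
mutex in place of fio's stat_mutex, and every name in it (helper_thread,
job_thread, run_demo, the update_rusage flag) is a hypothetical stand-in.
The helper takes the mutex and then waits for the job to service an
rusage request; the job never sleeps inside a blocking lock, so it can
always service that request between trylock attempts.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical stand-ins for fio's stat_mutex and per-job rusage state. */
static pthread_mutex_t stat_mutex = PTHREAD_MUTEX_INITIALIZER;
static sem_t rusage_sem;
static atomic_int update_rusage;   /* set by the helper to request an update */
static atomic_int helper_has_lock; /* lets the job start while the mutex is held */
static int rusage_updates;         /* number of requests the job serviced */

/* Service a pending rusage request, if any, and wake the helper. */
static void check_update_rusage(void)
{
	if (atomic_exchange(&update_rusage, 0)) {
		rusage_updates++;
		sem_post(&rusage_sem);
	}
}

static void *helper_thread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&stat_mutex);
	atomic_store(&helper_has_lock, 1);
	usleep(5000);                    /* let the job start contending */
	atomic_store(&update_rusage, 1);
	sem_wait(&rusage_sem);           /* never returns if the job blocks in lock() */
	pthread_mutex_unlock(&stat_mutex);
	return NULL;
}

static void *job_thread(void *arg)
{
	(void)arg;
	while (!atomic_load(&helper_has_lock))
		usleep(100);
	for (;;) {
		check_update_rusage();
		if (pthread_mutex_trylock(&stat_mutex) == 0)
			break;               /* got the mutex */
		usleep(1000);            /* busy: back off, then check again */
	}
	/* stat_mutex is held here; runtime accounting would go in this spot. */
	pthread_mutex_unlock(&stat_mutex);
	return NULL;
}

/* Runs one helper and one job; returns how many rusage requests were serviced. */
int run_demo(void)
{
	pthread_t helper, job;
	int rc;

	atomic_store(&update_rusage, 0);
	atomic_store(&helper_has_lock, 0);
	rusage_updates = 0;

	rc = sem_init(&rusage_sem, 0, 0);
	assert(rc == 0);
	rc = pthread_create(&helper, NULL, helper_thread, NULL);
	assert(rc == 0);
	rc = pthread_create(&job, NULL, job_thread, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(helper, NULL);
	pthread_join(job, NULL);
	sem_destroy(&rusage_sem);
	return rusage_updates;
}
```

Built with -pthread and called from a main(), run_demo() should return
once both threads finish. Replacing the trylock loop in job_thread with
a single check followed by a blocking pthread_mutex_lock() reproduces
the hang shown in the diagram above.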
backend.c

index d98658606d772ed7d31b1355571ee2566b53a213..fb2a8551e396227fc01ed5849e95fbd0e15c38a8 100644
--- a/backend.c
+++ b/backend.c
@@ -1731,9 +1731,13 @@ static void *thread_main(void *data)
                 * the rusage_sem, which would never get upped because
                 * this thread is waiting for the stat mutex.
                 */
-               check_update_rusage(td);
+               do {
+                       check_update_rusage(td);
+                       if (!fio_mutex_down_trylock(stat_mutex))
+                               break;
+                       usleep(1000);
+               } while (1);
 
-               fio_mutex_down(stat_mutex);
                if (td_read(td) && td->io_bytes[DDIR_READ])
                        update_runtime(td, elapsed_us, DDIR_READ);
                if (td_write(td) && td->io_bytes[DDIR_WRITE])