io_uring: Avoid needless update of completion queue head pointer
author Anton Blanchard <anton@ozlabs.org>
Mon, 13 Jul 2020 01:34:48 +0000 (11:34 +1000)
committer Anton Blanchard <anton@ozlabs.org>
Mon, 13 Jul 2020 01:34:48 +0000 (11:34 +1000)
I'm seeing a slowdown in io_uring performance on a POWER9 box when
the userspace and kernel polling threads are on two cores that
share an L2 cache.

fio_ioring_cqring_reap() always stores to the completion queue head
pointer, even if nothing was reaped and the value hasn't changed.

Changing this to only update the head pointer when it changes results
in a 95% improvement in performance on this particular test.

Signed-off-by: Anton Blanchard <anton@ozlabs.org>
diff --git a/engines/io_uring.c b/engines/io_uring.c
index cd0810f47f57d2e4dfcb5fe5c237c6b5dde768fa..ecff0657ed51e60af19bce99a0be1d803c63dd82 100644
--- a/engines/io_uring.c
+++ b/engines/io_uring.c
@@ -307,7 +307,9 @@ static int fio_ioring_cqring_reap(struct thread_data *td, unsigned int events,
                head++;
        } while (reaped + events < max);
 
-       atomic_store_release(ring->head, head);
+       if (reaped)
+               atomic_store_release(ring->head, head);
+
        return reaped;
 }
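
The pattern in the hunk above can be tried in isolation with C11 atomics. This is a minimal sketch, not fio's code: `struct demo_cq`, `demo_reap()` and the `stores` counter are hypothetical names introduced for illustration, the loop is simplified from the original do/while, and `atomic_store_explicit()` with `memory_order_release` stands in for fio's `atomic_store_release()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a completion ring; this is not fio's struct. */
struct demo_cq {
	_Atomic uint32_t head;	/* consumer index, written by the reaper */
	_Atomic uint32_t tail;	/* producer index, written by the other side */
	unsigned int stores;	/* counts head stores, for illustration only */
};

/* Consume up to max entries; publish head only if something was consumed. */
static unsigned int demo_reap(struct demo_cq *cq, unsigned int max)
{
	uint32_t head;
	unsigned int reaped = 0;

	assert(cq);
	head = atomic_load_explicit(&cq->head, memory_order_relaxed);

	while (reaped < max &&
	       head != atomic_load_explicit(&cq->tail, memory_order_acquire)) {
		reaped++;
		head++;
	}

	/* Skip the store when nothing was reaped: the value is unchanged. */
	if (reaped) {
		atomic_store_explicit(&cq->head, head, memory_order_release);
		cq->stores++;
	}

	return reaped;
}
```

Calling `demo_reap()` on an empty ring returns 0 and leaves `stores` untouched, so a thread that polls repeatedly without finding completions never writes to the shared head.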