rate_submit: synchronize accesses to io_u_queue->nr
Accesses to io_u_queue->nr are not properly synchronized in offload
submission mode. put_io_u() locks td, but the parent td flags that
indicate the need to lock are not propagated to the child threads when
they are initialized.
The main thread accesses io_u_queue->nr via io_u_qpop() as it prepares
io_u's for handing off to the worker threads. The worker threads access
io_u_queue->nr via io_u_qpush() as they complete io_u's. When these
accesses are not protected by locks, io_u_qpop() can return NULL even
though a valid io_u pointer should be available. This occurs in offload
submission mode with iodepth > 1.
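
The pattern can be illustrated with a minimal standalone sketch. This is
not fio code: demo_queue, demo_qpop(), demo_qpush() and demo_run() are
hypothetical names, and a plain pthread mutex stands in for the td lock.
Two threads pop and push against a shared counter, as the main and worker
threads do with io_u_queue->nr. With the lock held around every access to
nr, a pop never returns NULL while entries remain.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_DEPTH 4		/* assumption: stands in for iodepth > 1 */
#define DEMO_ITERS 100000

/* Hypothetical stand-in for a free list with a shared element count. */
struct demo_queue {
	void *items[DEMO_DEPTH];
	unsigned int nr;	/* shared counter, analogous to io_u_queue->nr */
	pthread_mutex_t lock;	/* stands in for the td lock */
};

struct demo_worker {
	struct demo_queue *q;
	unsigned long null_pops;	/* pops that unexpectedly returned NULL */
};

/* Pop one entry; the read-modify-write of nr happens under the lock. */
static void *demo_qpop(struct demo_queue *q)
{
	void *item = NULL;

	pthread_mutex_lock(&q->lock);
	if (q->nr)
		item = q->items[--q->nr];
	pthread_mutex_unlock(&q->lock);
	return item;
}

/* Push one entry back; nr is again only touched under the lock. */
static void demo_qpush(struct demo_queue *q, void *item)
{
	pthread_mutex_lock(&q->lock);
	assert(q->nr < DEMO_DEPTH);
	q->items[q->nr++] = item;
	pthread_mutex_unlock(&q->lock);
}

/* Each thread holds at most one entry at a time, so a pop should succeed. */
static void *demo_thread(void *arg)
{
	struct demo_worker *w = arg;

	for (int i = 0; i < DEMO_ITERS; i++) {
		void *item = demo_qpop(w->q);

		if (!item)
			w->null_pops++;
		else
			demo_qpush(w->q, item);
	}
	return NULL;
}

/* Returns the number of NULL pops seen; stores the final count in *final_nr. */
static unsigned long demo_run(unsigned int *final_nr)
{
	static int payload[DEMO_DEPTH];
	struct demo_queue q = { .nr = 0 };
	struct demo_worker w[2] = { { &q, 0 }, { &q, 0 } };
	pthread_t tid[2];

	pthread_mutex_init(&q.lock, NULL);
	for (int i = 0; i < DEMO_DEPTH; i++)
		demo_qpush(&q, &payload[i]);

	for (int i = 0; i < 2; i++) {
		int rc = pthread_create(&tid[i], NULL, demo_thread, &w[i]);

		assert(rc == 0);
	}
	for (int i = 0; i < 2; i++)
		pthread_join(tid[i], NULL);

	*final_nr = q.nr;
	pthread_mutex_destroy(&q.lock);
	return w[0].null_pops + w[1].null_pops;
}
```

If the lock and unlock calls are removed from demo_qpop() and
demo_qpush(), the two threads race on nr: the count can become stale or
corrupted, which is how a pop can come back NULL although entries should
be available.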
Fixes: 26b3a18 ("Make td_io_u_lock/unlock() explicit")