workqueue: Fix race conditions in the workqueue mechanism
authorBart Van Assche <bvanassche@acm.org>
Sat, 4 Jul 2020 16:44:56 +0000 (09:44 -0700)
committerBart Van Assche <bvanassche@acm.org>
Sat, 4 Jul 2020 16:49:54 +0000 (09:49 -0700)
From https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_wait.html:
"The application shall ensure that these functions are called with mutex
locked by the calling thread; otherwise, an error (for
PTHREAD_MUTEX_ERRORCHECK and robust mutexes) or undefined behavior (for
other mutexes) results.

From https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_signal.html:
"however, if predictable scheduling behavior is required, then that mutex
shall be locked by the thread calling pthread_cond_broadcast() or
pthread_cond_signal()."

Hence always hold the associated mutex around pthread_cond_wait() and
pthread_signal() calls.

This patch fixes the hangs reported by Travis and Appveyor for test case
t0013.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
workqueue.c

index 254bf6b6505fd8e047021fc819b99e567d5722e4..9e6c41ff2f399172703b6e438061236670962df8 100644 (file)
@@ -170,26 +170,32 @@ static void *worker_thread(void *data)
                                workqueue_pre_sleep(sw);
                                pthread_mutex_lock(&sw->lock);
                        }
-
-                       /*
-                        * We dropped and reaquired the lock, check
-                        * state again.
-                        */
-                       if (!flist_empty(&sw->work_list))
-                               goto handle_work;
-
+               }
+               /*
+                * We may have dropped and reaquired the lock, check state
+                * again.
+                */
+               if (flist_empty(&sw->work_list)) {
                        if (sw->flags & SW_F_EXIT) {
                                break;
-                       } else if (!(sw->flags & SW_F_IDLE)) {
+                       }
+                       if (!(sw->flags & SW_F_IDLE)) {
                                sw->flags |= SW_F_IDLE;
                                wq->next_free_worker = sw->index;
+                               pthread_mutex_unlock(&sw->lock);
+                               pthread_mutex_lock(&wq->flush_lock);
                                if (wq->wake_idle)
                                        pthread_cond_signal(&wq->flush_cond);
+                               pthread_mutex_unlock(&wq->flush_lock);
+                               pthread_mutex_lock(&sw->lock);
+                       }
+               }
+               if (flist_empty(&sw->work_list)) {
+                       if (sw->flags & SW_F_EXIT) {
+                               break;
                        }
-
                        pthread_cond_wait(&sw->cond, &sw->lock);
                } else {
-handle_work:
                        flist_splice_init(&sw->work_list, &local_list);
                }
                pthread_mutex_unlock(&sw->lock);