nvme: flush namespace scanning work just before removing namespaces
authorSagi Grimberg <sagi@grimberg.me>
Wed, 21 Nov 2018 23:17:37 +0000 (15:17 -0800)
committerChristoph Hellwig <hch@lst.de>
Fri, 30 Nov 2018 16:23:23 +0000 (17:23 +0100)
nvme_stop_ctrl can be called also for reset flow and there is no need to
flush the scan_work as namespaces are not being removed. This can cause
deadlock in rdma, fc and loop drivers since nvme_stop_ctrl barriers
before controller teardown (and specifically I/O cancellation of the
scan_work itself) takes place, but the scan_work will be blocked anyways
so there is no need to flush it.

Instead, move scan_work flush to nvme_remove_namespaces() where it really
needs to flush.

Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed by: James Smart <jsmart2021@gmail.com>
Tested-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
drivers/nvme/host/core.c

index bb39b91253c2dd28556d69ba754fb7144a4c555b..3cf1b773158ece0ca99bcfe92f1c37e22afee25b 100644 (file)
@@ -3314,6 +3314,9 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
        struct nvme_ns *ns, *next;
        LIST_HEAD(ns_list);
 
+       /* prevent racing with ns scanning */
+       flush_work(&ctrl->scan_work);
+
        /*
         * The dead states indicates the controller was not gracefully
         * disconnected. In that case, we won't be able to flush any data while
@@ -3476,7 +3479,6 @@ void nvme_stop_ctrl(struct nvme_ctrl *ctrl)
        nvme_mpath_stop(ctrl);
        nvme_stop_keep_alive(ctrl);
        flush_work(&ctrl->async_event_work);
-       flush_work(&ctrl->scan_work);
        cancel_work_sync(&ctrl->fw_act_work);
        if (ctrl->ops->stop_ctrl)
                ctrl->ops->stop_ctrl(ctrl);