NVMe: Unbind driver on failure
author Keith Busch <keith.busch@intel.com>
Mon, 28 Mar 2016 22:03:21 +0000 (16:03 -0600)
committer Jens Axboe <axboe@fb.com>
Tue, 17 May 2016 23:14:21 +0000 (17:14 -0600)
Instead of removing the PCI device from the kernel's topology on
controller failure, this patch simply requests unbinding the device
from the driver. This avoids running PCI removal concurrently with the
hotplug event, which has been reported to be problematic when multiple
surprise events occur nearly simultaneously.

The other benefit is that we will have PCI config and memory space
available to poke around for debugging a failed controller, assuming
the device was not physically removed.
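
As a rough illustration of that kind of poking, the userspace sketch
below reads the vendor and device IDs from a device's sysfs config
file. It is not part of this patch; the helper names and the example
device address are hypothetical. It relies only on the standard PCI
config header layout (little-endian 16-bit vendor ID at offset 0 and
device ID at offset 2).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Decode vendor and device ID from the first four bytes of PCI config
 * space. Both fields are little-endian 16-bit values. */
static void pci_parse_ids(const unsigned char *cfg,
                          uint16_t *vendor, uint16_t *device)
{
    assert(cfg && vendor && device);
    *vendor = (uint16_t)(cfg[0] | (cfg[1] << 8));
    *device = (uint16_t)(cfg[2] | (cfg[3] << 8));
}

/* Read the IDs from a sysfs config file, for example
 * /sys/bus/pci/devices/0000:01:00.0/config (placeholder address).
 * Returns 0 on success, -1 if the file cannot be opened or is short. */
static int pci_read_ids(const char *config_path,
                        uint16_t *vendor, uint16_t *device)
{
    unsigned char cfg[4];
    size_t n;
    FILE *f = fopen(config_path, "rb");

    if (!f)
        return -1;
    n = fread(cfg, 1, sizeof(cfg), f);
    fclose(f);
    if (n != sizeof(cfg))
        return -1;
    pci_parse_ids(cfg, vendor, device);
    return 0;
}
```

A vendor ID that reads back as 0xffff usually means the device is no
longer responding on the bus, which is itself a useful data point when
debugging a failed controller.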

The downside shows up if the platform and/or kernel do not support any
type of surprise hot removal. The device will remain visible through
sysfs (and therefore lspci), and some manual work is necessary to
correct the logical topology. But if your platform and/or kernel don't
support surprise removal, you probably shouldn't be doing that anyway.
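
One possible form of that manual work is to write "1" to the device's
sysfs remove attribute, which asks the kernel to drop the device from
its topology. The sketch below is an illustration only and not part of
this patch; the helper names are hypothetical and the device address is
supplied by the caller.

```c
#include <stdio.h>
#include <string.h>

/* Build /sys/bus/pci/devices/<bdf>/<attr> into out.
 * Returns 0 on success, -1 on truncation or if bdf contains a '/'. */
static int pci_sysfs_path(char *out, size_t len,
                          const char *bdf, const char *attr)
{
    int n;

    if (strchr(bdf, '/'))
        return -1;
    n = snprintf(out, len, "/sys/bus/pci/devices/%s/%s", bdf, attr);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a short string to a sysfs attribute. Returns 0 on success. */
static int sysfs_write(const char *path, const char *val)
{
    int ok;
    FILE *f = fopen(path, "w");

    if (!f)
        return -1;
    ok = fputs(val, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Ask the kernel to remove a stale device, e.g. "0000:01:00.0"
 * (placeholder address). Requires root. */
static int pci_remove_device(const char *bdf)
{
    char path[128];

    if (pci_sysfs_path(path, sizeof(path), bdf, "remove") != 0)
        return -1;
    return sysfs_write(path, "1");
}
```

The same effect can be had from a shell by echoing 1 into that file;
the C form is shown only to make the steps explicit.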

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 88ed43d0799c7f4d1b6ecd54ac4b6da989ec7891..194e9014811be7c8e80ba11dabba99d7c41b93a9 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1857,7 +1857,7 @@ static void nvme_remove_dead_ctrl_work(struct work_struct *work)
 
        nvme_kill_queues(&dev->ctrl);
        if (pci_get_drvdata(pdev))
-               pci_stop_and_remove_bus_device_locked(pdev);
+               device_release_driver(&pdev->dev);
        nvme_put_ctrl(&dev->ctrl);
 }