IB/mlx4: Fix race condition between catas error reset and aliasguid flows
authorJack Morgenstein <jackm@dev.mellanox.co.il>
Wed, 6 Mar 2019 17:17:56 +0000 (19:17 +0200)
committerJason Gunthorpe <jgg@mellanox.com>
Mon, 18 Mar 2019 00:40:39 +0000 (21:40 -0300)
Code review revealed a race condition which could allow the catas error
flow to interrupt the alias guid query post mechanism at random points.
Thiis is fixed by doing cancel_delayed_work_sync() instead of
cancel_delayed_work() during the alias guid mechanism destroy flow.

Fixes: a0c64a17aba8 ("mlx4: Add alias_guid mechanism")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
drivers/infiniband/hw/mlx4/alias_GUID.c

index 782499abcd9868d63b5f789ee002a0594b00c4d6..2a0b59a4b6ebc3c34ff9ff3308af7e337d2001f3 100644 (file)
@@ -804,8 +804,8 @@ void mlx4_ib_destroy_alias_guid_service(struct mlx4_ib_dev *dev)
        unsigned long flags;
 
        for (i = 0 ; i < dev->num_ports; i++) {
-               cancel_delayed_work(&dev->sriov.alias_guid.ports_guid[i].alias_guid_work);
                det = &sriov->alias_guid.ports_guid[i];
+               cancel_delayed_work_sync(&det->alias_guid_work);
                spin_lock_irqsave(&sriov->alias_guid.ag_work_lock, flags);
                while (!list_empty(&det->cb_list)) {
                        cb_ctx = list_entry(det->cb_list.next,