RDMA/odp: Use mmu_interval_notifier_insert()
author Jason Gunthorpe <jgg@mellanox.com>
Tue, 12 Nov 2019 20:22:22 +0000 (16:22 -0400)
committer Jason Gunthorpe <jgg@mellanox.com>
Sat, 23 Nov 2019 23:56:44 +0000 (19:56 -0400)
commit f25a546e65292b36f15cca0912450c4944fae031
tree 045faece87dbe0b3294db953432c84221a620b50
parent 107e899874e95dcddc779142942bf285eba38bc5

Replace the internal interval tree based mmu notifier with the new common
mmu_interval_notifier_insert() API. This removes a lot of code and fixes a
deadlock that can be triggered in ODP:

 zap_page_range()
  mmu_notifier_invalidate_range_start()
   [..]
    ib_umem_notifier_invalidate_range_start()
       down_read(&per_mm->umem_rwsem)
  unmap_single_vma()
    [..]
      __split_huge_page_pmd()
        mmu_notifier_invalidate_range_start()
        [..]
           ib_umem_notifier_invalidate_range_start()
              down_read(&per_mm->umem_rwsem)   // DEADLOCK

        mmu_notifier_invalidate_range_end()
           up_read(&per_mm->umem_rwsem)
  mmu_notifier_invalidate_range_end()
     up_read(&per_mm->umem_rwsem)

The umem_rwsem is held across range_start/end because the ODP algorithm
for invalidate_range_end cannot tolerate changes to the interval tree.
However, because invalidation regions can nest, the second down_read() can
deadlock if a writer queues on the rwsem in between: the writer waits for
the first reader, and the second reader waits behind the writer. The new
core code provides an alternative scheme that solves this problem.
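
The lock ordering described above can be replayed in a small userspace
sketch. This is a toy, single-threaded model and not kernel code: the
model_rwsem type and every model_* function are hypothetical names
introduced only for illustration, and the model assumes a writer-fair
semaphore in which a queued writer blocks readers that arrive later.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of a writer-fair reader/writer semaphore.
 * Hypothetical: this is NOT the kernel's struct rw_semaphore.
 */
struct model_rwsem {
	int readers;         /* current read-side holders */
	bool writer_waiting; /* a writer is queued behind the readers */
};

/* Returns true if the read lock is granted, false if the caller would block. */
static bool model_down_read(struct model_rwsem *sem)
{
	if (sem->writer_waiting)
		return false; /* new readers queue behind the waiting writer */
	sem->readers++;
	return true;
}

static void model_up_read(struct model_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* Returns true if the write lock is granted; otherwise the writer queues. */
static bool model_down_write(struct model_rwsem *sem)
{
	if (sem->readers > 0) {
		sem->writer_waiting = true;
		return false;
	}
	return true;
}

/*
 * Replays the nested invalidation from the trace.
 * Returns true if the sequence ends in a deadlock.
 */
bool nested_invalidation_deadlocks(bool writer_arrives)
{
	struct model_rwsem sem = { 0, false };

	/* Outer invalidate_range_start(): first down_read(). */
	if (!model_down_read(&sem))
		return false; /* not reachable with a fresh semaphore */

	/* A competing writer shows up and queues behind the outer reader. */
	if (writer_arrives)
		model_down_write(&sem);

	/* Nested invalidate_range_start(): second down_read(). */
	if (!model_down_read(&sem)) {
		/*
		 * The nested reader waits for the writer, the writer waits
		 * for the outer reader, and the outer reader cannot reach
		 * its range_end until the nested call returns.
		 */
		return true;
	}

	model_up_read(&sem); /* nested invalidate_range_end() */
	model_up_read(&sem); /* outer invalidate_range_end() */
	return false;
}
```

In this model nested_invalidation_deadlocks(true) reports a deadlock and
nested_invalidation_deadlocks(false) does not, which matches the claim that
the nested down_read() is only a problem when a writer competes for the
semaphore.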

Fixes: ca748c39ea3f ("RDMA/umem: Get rid of per_mm->notifier_count")
Link: https://lore.kernel.org/r/20191112202231.3856-6-jgg@ziepe.ca
Tested-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
drivers/infiniband/core/device.c
drivers/infiniband/core/umem_odp.c
drivers/infiniband/hw/mlx5/mlx5_ib.h
drivers/infiniband/hw/mlx5/mr.c
drivers/infiniband/hw/mlx5/odp.c
include/rdma/ib_umem_odp.h
include/rdma/ib_verbs.h