RDMA/mlx5: Fix cache entry update on dereg error
author Michael Guralnik <michaelgur@nvidia.com>
Thu, 13 Mar 2025 14:29:49 +0000 (16:29 +0200)
committer Leon Romanovsky <leon@kernel.org>
Tue, 18 Mar 2025 10:30:52 +0000 (06:30 -0400)
Fix a double decrement of the 'in_use' counter when push_mkey_locked()
fails while deregistering an MR.
If cache_ent_find_and_store() fails to return an mkey to the cache, it
has already decremented the 'in_use' counter. Its caller,
mlx5_revoke_mr(), then decrements the counter again on its failure path,
so the counter drops by two for a single MR.

A wrong 'in_use' value is exposed through debugfs and can also cause the
cache to be resized incorrectly when users set the cache entry size
through the 'size' debugfs file.

To address this issue, decrement the 'in_use' counter in
mlx5_revoke_mr() after a successful call to cache_ent_find_and_store(),
instead of inside cache_ent_find_and_store() itself.
The other success and failure flows, which already decrement the
counter, remain unchanged.
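
The accounting described above can be sketched as a small user-space
model. This is only an illustration, not the driver code: the names
model_ent, model_mr, store_old(), store_new(), revoke_old(),
revoke_new() and in_use_after_revoke() are hypothetical, locking is
omitted, and the push_fails flag stands in for a push_mkey_locked()
failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cache entry; only the counter is modelled. */
struct model_ent {
	int in_use;
};

/* Hypothetical stand-in for an MR that may have come from the cache. */
struct model_mr {
	struct model_ent *cache_ent;
};

/* Old flow: the store helper decrements before it knows the push worked. */
static int store_old(struct model_mr *mr, bool push_fails)
{
	assert(mr != NULL);
	if (mr->cache_ent)
		mr->cache_ent->in_use--;
	return push_fails ? -1 : 0;
}

/* New flow: the store helper no longer touches the counter. */
static int store_new(struct model_mr *mr, bool push_fails)
{
	assert(mr != NULL);
	return push_fails ? -1 : 0;
}

static void revoke_old(struct model_mr *mr, bool push_fails)
{
	struct model_ent *ent = mr->cache_ent;

	if (!store_old(mr, push_fails))
		return;
	/* Failure path: the caller decrements a second time. */
	if (ent) {
		ent->in_use--;
		mr->cache_ent = NULL;
	}
}

static void revoke_new(struct model_mr *mr, bool push_fails)
{
	struct model_ent *ent = mr->cache_ent;
	bool from_cache = ent != NULL;

	if (!store_new(mr, push_fails)) {
		/* Success path: the only decrement for this MR. */
		if (from_cache)
			ent->in_use--;
		return;
	}
	/* Failure path: unchanged, still a single decrement. */
	if (ent) {
		ent->in_use--;
		mr->cache_ent = NULL;
	}
}

/* Revoke one MR taken from an entry whose in_use starts at 1. */
static int in_use_after_revoke(bool fixed, bool push_fails)
{
	struct model_ent ent = { .in_use = 1 };
	struct model_mr mr = { .cache_ent = &ent };

	if (fixed)
		revoke_new(&mr, push_fails);
	else
		revoke_old(&mr, push_fails);
	return ent.in_use;
}
```

In this model, in_use_after_revoke(false, true) leaves the counter at
-1, while the other three combinations leave it at 0, which is the
single decrement expected for one MR.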

Fixes: 8c1185fef68c ("RDMA/mlx5: Change check for cacheable mkeys")
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://patch.msgid.link/97e979dff636f232ff4c83ce709c17c727da1fdb.1741875692.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
drivers/infiniband/hw/mlx5/mr.c

index 1ffa4b3d0f760f2c148a48666e7d80801c067688..c88016630f9dee76b9d3b836cfc8c9da89ad89b8 100644
@@ -1967,7 +1967,6 @@ static int cache_ent_find_and_store(struct mlx5_ib_dev *dev,
 
        if (mr->mmkey.cache_ent) {
                spin_lock_irq(&mr->mmkey.cache_ent->mkeys_queue.lock);
-               mr->mmkey.cache_ent->in_use--;
                goto end;
        }
 
@@ -2033,6 +2032,7 @@ static int mlx5_revoke_mr(struct mlx5_ib_mr *mr)
        struct mlx5_ib_dev *dev = to_mdev(mr->ibmr.device);
        struct mlx5_cache_ent *ent = mr->mmkey.cache_ent;
        bool is_odp = is_odp_mr(mr);
+       bool from_cache = !!ent;
        int ret = 0;
 
        if (is_odp)
@@ -2042,6 +2042,8 @@ static int mlx5_revoke_mr(struct mlx5_ib_mr *mr)
                ent = mr->mmkey.cache_ent;
                /* upon storing to a clean temp entry - schedule its cleanup */
                spin_lock_irq(&ent->mkeys_queue.lock);
+               if (from_cache)
+                       ent->in_use--;
                if (ent->is_tmp && !ent->tmp_cleanup_scheduled) {
                        mod_delayed_work(ent->dev->cache.wq, &ent->dwork,
                                         msecs_to_jiffies(30 * 1000));