net/mlx4_en: avoid one cache line miss to ring doorbell
author Eric Dumazet <edumazet@google.com>
Fri, 1 Oct 2021 00:52:49 +0000 (17:52 -0700)
committer David S. Miller <davem@davemloft.net>
Mon, 4 Oct 2021 11:50:13 +0000 (12:50 +0100)
commit 9ac936276f8628ef5765bcc2ca7138a803de8aff
tree 0a1f9ee1782d8f9e51719971c9f168441a5cace8
parent 0693b27644f04852e46f7f034e3143992b658869

This patch caches the doorbell address directly in struct mlx4_en_tx_ring.

This removes the need to bring the whole struct mlx4_uar into CPU caches
in the fast path.
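
A minimal userspace sketch of the idea, not the driver code itself: all
sketch_* names and the SKETCH_SEND_DOORBELL offset are hypothetical
stand-ins, and a plain store replaces the big-endian MMIO write a real
driver would use. The "old" helper reaches the doorbell through the uar
pointer on every call, while the "new" helper uses an address computed once
at ring setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed doorbell offset inside the mapped page (illustrative only). */
#define SKETCH_SEND_DOORBELL 0x14

/* Hypothetical stand-in for the UAR object; may sit on a remote node. */
struct sketch_uar {
	unsigned long pfn;
	int index;
	uint8_t *map;		/* base of the mapped doorbell page */
};

struct sketch_bf {
	struct sketch_uar *uar;
};

/* Hypothetical stand-in for the TX ring; hot in the transmit path. */
struct sketch_tx_ring {
	struct sketch_bf bf;
	uint32_t doorbell_qpn;
	volatile uint32_t *doorbell_address;	/* cached once at setup */
};

/* Slow path: resolve the doorbell address once and store it in the ring. */
static void sketch_ring_setup(struct sketch_tx_ring *ring,
			      struct sketch_uar *uar, uint32_t qpn)
{
	assert(ring != NULL && uar != NULL && uar->map != NULL);
	ring->bf.uar = uar;
	ring->doorbell_qpn = qpn;
	ring->doorbell_address =
		(volatile uint32_t *)(uar->map + SKETCH_SEND_DOORBELL);
}

/* Before: every doorbell loads ring->bf.uar, then uar->map (extra line). */
static void sketch_ring_doorbell_old(struct sketch_tx_ring *ring)
{
	volatile uint32_t *db =
		(volatile uint32_t *)(ring->bf.uar->map + SKETCH_SEND_DOORBELL);

	*db = ring->doorbell_qpn;
}

/* After: only fields of the ring itself are read before the store. */
static void sketch_ring_doorbell_new(struct sketch_tx_ring *ring)
{
	*ring->doorbell_address = ring->doorbell_qpn;
}
```

Both helpers write the same value to the same location; the difference is
only which memory must be read first. In the sketch the cached pointer stays
valid as long as the mapping outlives the ring, which is assumed here.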

Note that mlx4_uar is not guaranteed to be on a local node,
because mlx4_bf_alloc() uses a single free list (priv->bf_list)
regardless of its node parameter.

This kind of change matters in the presence of light/moderate traffic.
Under high stress, this read-only line would be kept hot in caches.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/mellanox/mlx4/en_tx.c
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h