tcp: avoid premature drops in tcp_add_backlog()
authorEric Dumazet <edumazet@google.com>
Tue, 23 Apr 2024 12:56:20 +0000 (12:56 +0000)
committerJakub Kicinski <kuba@kernel.org>
Thu, 25 Apr 2024 19:15:02 +0000 (12:15 -0700)
commitec00ed472bdb7d0af840da68c8c11bff9f4d9caa
tree7b9bb64d8735b8f6f94c6b211959765da79f2f46
parente6be197f23c5cf15fb4a5acac213b77f80e8cf96
tcp: avoid premature drops in tcp_add_backlog()

While testing TCP performance with the latest trees,
I saw suspicious SOCKET_BACKLOG drops.

tcp_add_backlog() computes its limit with:

    limit = (u32)READ_ONCE(sk->sk_rcvbuf) +
            (u32)(READ_ONCE(sk->sk_sndbuf) >> 1);
    limit += 64 * 1024;

This does not take into account that sk->sk_backlog.len
is reset only at the very end of __release_sock().

Both sk->sk_backlog.len and sk->sk_rmem_alloc can reach
sk_rcvbuf under normal conditions.
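
As a rough illustration (the numbers here are hypothetical, not taken
from the report): with sk_rcvbuf and sk_sndbuf both autotuned to about
4 MB, the limit above is roughly 4 MB + 2 MB + 64 KB ~= 6 MB, while
sk->sk_backlog.len plus sk->sk_rmem_alloc can legitimately approach
2 * 4 MB = 8 MB before __release_sock() resets the backlog length,
so packets of a well-behaved flow get dropped.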

We should double the sk->sk_rcvbuf contribution in the formula
to absorb bubbles in the backlog, which happen more often
for very fast flows.
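
A sketch of what the doubled contribution could look like (an
illustration of the idea above, not necessarily the exact committed
hunk; a wider type may be needed so the doubling cannot overflow u32):

    limit = ((u64)READ_ONCE(sk->sk_rcvbuf)) << 1;
    limit += (u32)(READ_ONCE(sk->sk_sndbuf) >> 1);
    limit += 64 * 1024;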

This change still maintains decent protection against abuse.

Fixes: c377411f2494 ("net: sk_add_backlog() take rmem_alloc into account")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240423125620.3309458-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net/ipv4/tcp_ipv4.c