tcp: attach SYNACK messages to request sockets instead of listener
authorEric Dumazet <edumazet@google.com>
Fri, 2 Oct 2015 18:43:35 +0000 (11:43 -0700)
committerDavid S. Miller <davem@davemloft.net>
Sat, 3 Oct 2015 11:32:43 +0000 (04:32 -0700)
commitca6fb06518836ef9b65dc0aac02ff97704d52a05
tree3fbe433aac9dcf1cae49e6715ced18bd3505d811
parent1b33bc3e9e903f7293f7dfe80a875b2a5d0305aa
tcp: attach SYNACK messages to request sockets instead of listener

If a listen backlog is very big (to avoid syncookies), then
the listener sk->sk_wmem_alloc is the main source of false
sharing, as we need to touch it twice per SYNACK re-transmit
and TX completion.

(One SYN packet takes listener lock once, but up to 6 SYNACK
are generated)

By attaching the skb to the request socket, we remove this
source of contention.

Tested:

 listen(fd, 10485760); // single listener (no SO_REUSEPORT)
 16 RX/TX queue NIC
 Sustain a SYNFLOOD attack of ~320,000 SYN per second,
 Sending ~1,400,000 SYNACK per second.
 Perf profiles now show listener spinlock being next bottleneck.

    20.29%  [kernel]  [k] queued_spin_lock_slowpath
    10.06%  [kernel]  [k] __inet_lookup_established
     5.12%  [kernel]  [k] reqsk_timer_handler
     3.22%  [kernel]  [k] get_next_timer_interrupt
     3.00%  [kernel]  [k] tcp_make_synack
     2.77%  [kernel]  [k] ipt_do_table
     2.70%  [kernel]  [k] run_timer_softirq
     2.50%  [kernel]  [k] ip_finish_output
     2.04%  [kernel]  [k] cascade

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
include/net/tcp.h
net/ipv4/inet_connection_sock.c
net/ipv4/tcp_fastopen.c
net/ipv4/tcp_input.c
net/ipv4/tcp_ipv4.c
net/ipv4/tcp_output.c
net/ipv6/tcp_ipv6.c
net/sched/sch_fq.c