mm: memcontrol: use per-cpu stocks for socket memory uncharging
authorRoman Gushchin <guro@fb.com>
Fri, 8 Sep 2017 23:13:09 +0000 (16:13 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Sat, 9 Sep 2017 01:26:47 +0000 (18:26 -0700)
commit475d0487a2adb1718892856078861d430d81fc55
tree1c388577bc01fabcd6e477d9b8690f285c3a63f5
parent3a77d214807c32f900618cee0432feb4f7f99c65
mm: memcontrol: use per-cpu stocks for socket memory uncharging

We've noticed a quite noticeable performance overhead on some hosts with
significant network traffic when socket memory accounting is enabled.

Perf top shows that socket memory uncharging path is hot:
  2.13%  [kernel]                [k] page_counter_cancel
  1.14%  [kernel]                [k] __sk_mem_reduce_allocated
  1.14%  [kernel]                [k] _raw_spin_lock
  0.87%  [kernel]                [k] _raw_spin_lock_irqsave
  0.84%  [kernel]                [k] tcp_ack
  0.84%  [kernel]                [k] ixgbe_poll
  0.83%  < workload >
  0.82%  [kernel]                [k] enqueue_entity
  0.68%  [kernel]                [k] __fget
  0.68%  [kernel]                [k] tcp_delack_timer_handler
  0.67%  [kernel]                [k] __schedule
  0.60%  < workload >
  0.59%  [kernel]                [k] __inet6_lookup_established
  0.55%  [kernel]                [k] __switch_to
  0.55%  [kernel]                [k] menu_select
  0.54%  libc-2.20.so            [.] __memcpy_avx_unaligned

To address this issue, the existing per-cpu stock infrastructure can be
used.

refill_stock() can be called from mem_cgroup_uncharge_skmem() to move
charge to a per-cpu stock instead of calling atomic
page_counter_uncharge().

To prevent the uncontrolled growth of per-cpu stocks, refill_stock()
will explicitly drain the cached charge, if the cached value exceeds
CHARGE_BATCH.

This allows significantly optimize the load:
  1.21%  [kernel]                [k] _raw_spin_lock
  1.01%  [kernel]                [k] ixgbe_poll
  0.92%  [kernel]                [k] _raw_spin_lock_irqsave
  0.90%  [kernel]                [k] enqueue_entity
  0.86%  [kernel]                [k] tcp_ack
  0.85%  < workload >
  0.74%  perf-11120.map          [.] 0x000000000061bf24
  0.73%  [kernel]                [k] __schedule
  0.67%  [kernel]                [k] __fget
  0.63%  [kernel]                [k] __inet6_lookup_established
  0.62%  [kernel]                [k] menu_select
  0.59%  < workload >
  0.59%  [kernel]                [k] __switch_to
  0.57%  libc-2.20.so            [.] __memcpy_avx_unaligned

Link: http://lkml.kernel.org/r/20170829100150.4580-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/memcontrol.c