mm: ratelimit stat flush from workingset shrinker
author Shakeel Butt <shakeelb@google.com>
Thu, 28 Dec 2023 07:30:55 +0000 (07:30 +0000)
committer Andrew Morton <akpm@linux-foundation.org>
Fri, 5 Jan 2024 18:17:45 +0000 (10:17 -0800)
One of our workloads (Postgres 14 + sysbench OLTP) regressed on a newer
upstream kernel.  On further investigation, the cause appears to be the
always-synchronous rstat flush in count_shadow_nodes(), added by commit
f82e6bf9bb9b ("mm: memcg: use rstat for non-hierarchical stats").  This
function does not really need accurate stats: it was already only
approximating the number of shadow entries to keep for maintaining
refault information.  Since there is already a periodic rstat flush every
2 seconds, exact stats are not needed here.  Let's ratelimit the rstat
flush in this code path.
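
The idea of a ratelimited flush can be illustrated with a small userspace
model.  This is only a sketch, not the kernel implementation: the struct,
the function names, the millisecond clock, and the threshold of two flush
periods are all assumptions made for illustration.  The caller pays for a
flush only when the periodic flusher appears to be overdue; otherwise it
accepts slightly stale stats.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the 2 second periodic flush interval. */
#define FLUSH_PERIOD_MS 2000u

/* Hypothetical model of the flush state; not a kernel structure. */
struct stats_model {
	uint64_t last_flush_ms;	/* time of the most recent flush */
	unsigned int flushes;	/* number of expensive flushes performed */
};

/* Unconditional flush: always pays the full cost. */
static void flush_stats(struct stats_model *s, uint64_t now_ms)
{
	s->last_flush_ms = now_ms;
	s->flushes++;
}

/*
 * Ratelimited flush: flush only if more than two periods have passed
 * since the last flush (assumed threshold), i.e. the periodic flusher
 * looks late.  Returns true if a flush was performed.
 */
static bool flush_stats_ratelimited(struct stats_model *s, uint64_t now_ms)
{
	assert(now_ms >= s->last_flush_ms);
	if (now_ms - s->last_flush_ms > 2 * FLUSH_PERIOD_MS) {
		flush_stats(s, now_ms);
		return true;
	}
	return false;
}
```

In this model, a hot path that calls flush_stats_ratelimited() repeatedly
within the window performs no flush work at all, which is the property
the shrinker path relies on.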

Link: https://lkml.kernel.org/r/20231228073055.4046430-1-shakeelb@google.com
Fixes: f82e6bf9bb9b ("mm: memcg: use rstat for non-hierarchical stats")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/workingset.c

index 2a2a34234df98267255f43358cca9fefba31dda7..2260129743282d39d91c2c493936ec1301337105 100644
@@ -680,7 +680,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
                struct lruvec *lruvec;
                int i;
 
-               mem_cgroup_flush_stats(sc->memcg);
+               mem_cgroup_flush_stats_ratelimited(sc->memcg);
                lruvec = mem_cgroup_lruvec(sc->memcg, NODE_DATA(sc->nid));
                for (pages = 0, i = 0; i < NR_LRU_LISTS; i++)
                        pages += lruvec_page_state_local(lruvec,