mm/damon/core: warn and fix nr_accesses[_bp] corruption
author SeongJae Park <sj@kernel.org>
Tue, 13 May 2025 00:27:10 +0000 (17:27 -0700)
committer Andrew Morton <akpm@linux-foundation.org>
Thu, 22 May 2025 21:55:38 +0000 (14:55 -0700)
Patch series "mm/damon: minor fixups and improvements for code, tests, and
documents".

Yet another batch of miscellaneous DAMON changes.  Fix and improve minor
problems in code, tests and documents.

This patch (of 6):

Due to a bug such as double aggregation reset[1], ->nr_accesses and/or
->nr_accesses_bp of damon_region can be corrupted.  Such corruption can
make monitoring results inaccurate, so the root cause should be
investigated.  Meanwhile, the corruption itself is easy to fix, but
silently fixing it would hide the bug.

Fix the corruption as soon as it is found, but also WARN_ONCE() so that we
become aware of the bug while keeping the system running in a saner
state.
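
The check relies on the invariant that, at aggregation reset time,
->nr_accesses_bp equals ->nr_accesses expressed in basis points, i.e.
multiplied by 10000.  The following is only a userspace sketch of that
warn-and-fix pattern, not the kernel code: struct region_model and
model_warn_fix_nr_accesses_corruption() are hypothetical names, and a static
flag plus fprintf() stands in for WARN_ONCE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the two damon_region fields involved. */
struct region_model {
	unsigned int nr_accesses;
	unsigned int nr_accesses_bp;	/* nr_accesses in basis points */
};

/*
 * Userspace approximation of the helper added by this patch.
 * Returns true if a mismatch was found and repaired.  The static flag
 * imitates the print-once behaviour of WARN_ONCE(); it is an assumption
 * of this sketch, not the kernel macro.
 */
static bool model_warn_fix_nr_accesses_corruption(struct region_model *r)
{
	static bool warned;

	assert(r != NULL);
	if (r->nr_accesses_bp == r->nr_accesses * 10000)
		return false;	/* invariant holds, nothing to do */
	if (!warned) {
		warned = true;
		fprintf(stderr, "invalid nr_accesses_bp at reset: %u %u\n",
			r->nr_accesses_bp, r->nr_accesses);
	}
	/* Repair so the bad value is not propagated further. */
	r->nr_accesses_bp = r->nr_accesses * 10000;
	return true;
}
```

In this sketch the repair runs on every mismatch while the message is
printed only for the first one, which mirrors the intent of fixing always
but warning once.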

Link: https://lkml.kernel.org/r/20250513002715.40126-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250513002715.40126-2-sj@kernel.org
Link: https://lore.kernel.org/20250302214145.356806-1-sj@kernel.org [1]
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/damon/core.c

index 587fb9a4fef82e855220d8f998dcdf110ab1197f..0bb71e2ab713138ba35aef1164fc02cb80c88230 100644
@@ -1391,6 +1391,19 @@ int damos_walk(struct damon_ctx *ctx, struct damos_walk_control *control)
        return 0;
 }
 
+/*
+ * Warn and fix corrupted ->nr_accesses[_bp] for investigations and preventing
+ * the problem being propagated.
+ */
+static void damon_warn_fix_nr_accesses_corruption(struct damon_region *r)
+{
+       if (r->nr_accesses_bp == r->nr_accesses * 10000)
+               return;
+       WARN_ONCE(true, "invalid nr_accesses_bp at reset: %u %u\n",
+                       r->nr_accesses_bp, r->nr_accesses);
+       r->nr_accesses_bp = r->nr_accesses * 10000;
+}
+
 /*
  * Reset the aggregated monitoring results ('nr_accesses' of each region).
  */
@@ -1404,6 +1417,7 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
 
                damon_for_each_region(r, t) {
                        trace_damon_aggregated(ti, r, damon_nr_regions(t));
+                       damon_warn_fix_nr_accesses_corruption(r);
                        r->last_nr_accesses = r->nr_accesses;
                        r->nr_accesses = 0;
                }