tracing/timerlat: Fix duplicated kthread creation due to CPU online/offline
author Wei Li <liwei391@huawei.com>
Tue, 24 Sep 2024 09:45:11 +0000 (17:45 +0800)
committer Steven Rostedt (Google) <rostedt@goodmis.org>
Thu, 3 Oct 2024 20:43:22 +0000 (16:43 -0400)
osnoise_hotplug_workfn() is the asynchronous online callback for
"trace/osnoise:online". When a CPU goes online and offline repeatedly,
the callback may be congested and can then be invoked multiple times
after a single online event.

This leads to a kthread leak and timer corruption. Add a check
in start_kthread() to prevent this situation.
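
The two ideas in this fix can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: `slot[]` stands in for `per_cpu(per_cpu_osnoise_var, cpu).kthread`, C11 atomics stand in for `xchg_relaxed()`, and the names `start_worker()`, `stop_worker()`, `replay_duplicate_online()`, `struct worker` and `NR_CPUS_SKETCH` are hypothetical. It assumes callers are serialized, so the check in `start_worker()` is not itself a lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

struct worker { int cpu; };	/* stand-in for struct task_struct */

static struct worker pool[NR_CPUS_SKETCH];		/* backing storage */
static _Atomic(struct worker *) slot[NR_CPUS_SKETCH];	/* stand-in for the per-CPU kthread pointer */
static int created, stopped;				/* counters used only to observe behaviour */

/* Idempotent start: a repeated online callback finds the slot set and returns. */
static int start_worker(unsigned int cpu)
{
	if (atomic_load_explicit(&slot[cpu], memory_order_relaxed))
		return 0;	/* already running, do not create a second one */

	pool[cpu].cpu = (int)cpu;
	created++;
	atomic_store_explicit(&slot[cpu], &pool[cpu], memory_order_relaxed);
	return 0;
}

/* Stop: fetch and clear the slot in one step, then stop only what was fetched. */
static void stop_worker(unsigned int cpu)
{
	struct worker *w = atomic_exchange_explicit(&slot[cpu], NULL,
						    memory_order_relaxed);
	if (w)
		stopped++;
}

/* Replays the problem case: two online callbacks for one online event. */
static void replay_duplicate_online(void)
{
	start_worker(1);
	start_worker(1);	/* duplicate invocation is a no-op */
	assert(created == 1);

	stop_worker(1);
	stop_worker(1);		/* slot already cleared, nothing to stop */
	assert(stopped == 1);
}
```

Calling `replay_duplicate_online()` from a test harness exercises both paths: the second start does not create a second worker, and the second stop finds the slot already cleared by the exchange.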

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20240924094515.3561410-2-liwei391@huawei.com
Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations")
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
kernel/trace/trace_osnoise.c

index 1439064f65d60c27347c91a4796034d56d8e79b8..d1a539913a5f5cd193394a4ee04ee701a0a87d12 100644
@@ -2007,6 +2007,10 @@ static int start_kthread(unsigned int cpu)
        void *main = osnoise_main;
        char comm[24];
 
+       /* Do not start a new thread if it is already running */
+       if (per_cpu(per_cpu_osnoise_var, cpu).kthread)
+               return 0;
+
        if (timerlat_enabled()) {
                snprintf(comm, 24, "timerlat/%d", cpu);
                main = timerlat_main;
@@ -2061,11 +2065,10 @@ static int start_per_cpu_kthreads(void)
                if (cpumask_test_and_clear_cpu(cpu, &kthread_cpumask)) {
                        struct task_struct *kthread;
 
-                       kthread = per_cpu(per_cpu_osnoise_var, cpu).kthread;
+                       kthread = xchg_relaxed(&(per_cpu(per_cpu_osnoise_var, cpu).kthread), NULL);
                        if (!WARN_ON(!kthread))
                                kthread_stop(kthread);
                }
-               per_cpu(per_cpu_osnoise_var, cpu).kthread = NULL;
        }
 
        for_each_cpu(cpu, current_mask) {