x86/mm/tlb: Update mm_cpumask lazily
author    Rik van Riel <riel@surriel.com>
          Thu, 14 Nov 2024 15:26:16 +0000 (10:26 -0500)
committer Ingo Molnar <mingo@kernel.org>
          Tue, 19 Nov 2024 11:02:46 +0000 (12:02 +0100)
commit    209954cbc7d0ce1a190fc725d20ce303d74d2680
tree      bd9e8fe18c7d837b73433cb2bcdb545a395a85ff
parent    7e33001b8b9a78062679e0fdf5b0842a49063135

On busy multi-threaded workloads, there can be significant contention
on the mm_cpumask at context switch time.

Reduce that contention by updating mm_cpumask lazily, setting the CPU bit
at context switch time (if not already set), and clearing the CPU bit at
the first TLB flush sent to a CPU where the process isn't running.

When a flurry of TLB flushes for a process happens, only the first one
will be sent to CPUs where the process isn't running. The others will
be sent only to CPUs where the process is currently running.

On an AMD Milan system with 36 cores, there is a noticeable difference:
  $ hackbench --groups 20 --loops 10000

  Before: ~4.5s +/- 0.1s
  After:  ~4.2s +/- 0.1s

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Link: https://lore.kernel.org/r/20241114152723.1294686-2-riel@surriel.com
arch/x86/kernel/alternative.c
arch/x86/mm/tlb.c