locking/lockdep: Make class->ops a percpu counter and move it under CONFIG_DEBUG_LOCKDEP=y
author Waiman Long <longman@redhat.com>
Wed, 3 Oct 2018 17:07:18 +0000 (13:07 -0400)
committer Ingo Molnar <mingo@kernel.org>
Tue, 9 Oct 2018 07:56:33 +0000 (09:56 +0200)
commit 8ca2b56cd7da98fc8f8d787bb706b9d6c8674a3b
tree f73f5ff4065aab44ebc28dab5103393012a87c47
parent ce52a18db45842f5b992851a552bd7f6acb2241b
locking/lockdep: Make class->ops a percpu counter and move it under CONFIG_DEBUG_LOCKDEP=y

A sizable portion of the CPU cycles spent in __lock_acquire() is used
up by the atomic increment of the class->ops stat counter. By taking it
out of the lock_class structure and changing it to a per-cpu
per-lock-class counter, we can reduce the amount of cacheline contention
on the class structure when multiple CPUs are trying to acquire locks of
the same class simultaneously.
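
As a rough sketch of the idea (the helper name and per-cpu array below
are illustrative assumptions, not necessarily the exact identifiers used
by this patch), the shared-field increment on the acquire path becomes a
local increment of a per-cpu counter indexed by the lock class:

	/* One counter per CPU per lock class instead of one shared field. */
	DECLARE_PER_CPU(unsigned long [MAX_LOCKDEP_KEYS], lock_class_ops);

	static inline void debug_class_ops_inc(struct lock_class *class)
	{
		int idx = class - lock_classes;	/* slot in lock_classes[] */

		this_cpu_inc(lock_class_ops[idx]);
	}

Since each CPU only ever touches its own counter, the increment no
longer bounces the class structure's cacheline between CPUs.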

To limit the increase in memory consumption caused by the per-cpu nature
of that counter, it is now put back under the CONFIG_DEBUG_LOCKDEP
config option, so the extra memory is only consumed when
CONFIG_DEBUG_LOCKDEP is enabled. The lock_class structure, however,
shrinks by 16 bytes on 64-bit archs after the ops field removal and a
minor restructuring of the fields.
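
A minimal sketch of the config dependency and the read side, again with
assumed helper names: the per-cpu array only exists when
CONFIG_DEBUG_LOCKDEP is enabled, and the value reported through
lockdep_proc.c is the sum of the per-cpu counts.

	#ifdef CONFIG_DEBUG_LOCKDEP
	static inline unsigned long debug_class_ops_read(struct lock_class *class)
	{
		int idx = class - lock_classes;
		unsigned long ops = 0;
		int cpu;

		/* Fold the per-cpu counts into one total for reporting. */
		for_each_possible_cpu(cpu)
			ops += per_cpu(lock_class_ops[idx], cpu);
		return ops;
	}
	#else
	/* Without CONFIG_DEBUG_LOCKDEP the helpers compile away entirely. */
	static inline void debug_class_ops_inc(struct lock_class *class) { }
	static inline unsigned long debug_class_ops_read(struct lock_class *class)
	{
		return 0;
	}
	#endif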

This patch also fixes a bug in the increment code: the counter is of
type 'unsigned long', but atomic_inc(), which operates on a 32-bit
atomic_t, was used to increment it.
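
For illustration only (this mirrors the described bug rather than
quoting the removed code verbatim): incrementing an unsigned long
through atomic_inc() requires casting it to atomic_t, a 32-bit int, so
on 64-bit kernels the increment only operates on 32 of the counter's 64
bits.

	/*
	 * Buggy pattern: class->ops is an unsigned long, but atomic_inc()
	 * works on a 32-bit atomic_t, so the cast silently narrows the
	 * width of the update.
	 */
	atomic_inc((atomic_t *)&class->ops);

	/* Replacement: a plain per-cpu unsigned long needs no cast at all. */
	this_cpu_inc(lock_class_ops[class - lock_classes]);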

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/d66681f3-8781-9793-1dcf-2436a284550b@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
include/linux/lockdep.h
kernel/locking/lockdep.c
kernel/locking/lockdep_internals.h
kernel/locking/lockdep_proc.c