x86: improve bitop code generation with clang
author Linus Torvalds <torvalds@linux-foundation.org>
Tue, 9 Apr 2024 18:55:07 +0000 (11:55 -0700)
committer Linus Torvalds <torvalds@linux-foundation.org>
Wed, 22 May 2024 21:12:11 +0000 (14:12 -0700)
commit b9b60b3199b70fe3ce74ff493b1870ccd7554134
tree 232e0543c7015119bf752331a1642fa4b8f659de
parent 7453b9485114f7ffec4a99bccee469a4d4809894

This uses the new ASM_INPUT_RM macro to avoid the bad code generation
issue that clang has with more generic asm inputs.

This ends up avoiding generating code like this:

  mov    %r10,(%rsp)
  tzcnt  (%rsp),%rcx

which now becomes just

  tzcnt  %r10,%rcx

and in the process ends up also removing a few unnecessary stack frames
when the only use was that pointless "asm uses memory location off stack".
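
As a rough illustration of the idea (a userspace sketch, not the actual
bitops.h change): the input constraint is chosen per compiler, so that
clang is only offered a register while gcc may still pick register or
memory. SKETCH_INPUT_RM and sketch_ffs() are hypothetical names standing
in for the kernel's ASM_INPUT_RM and its bitop helpers, and the
__builtin_ctzl() fallback exists only so the sketch builds on non-x86-64
targets.

```c
/*
 * Assumption: a local stand-in for the kernel's ASM_INPUT_RM.
 * clang tends to spill an "rm" input to the stack and use the memory
 * form, so it is given "r" only; gcc keeps the more generic "rm".
 */
#ifdef __clang__
# define SKETCH_INPUT_RM "r"
#else
# define SKETCH_INPUT_RM "rm"
#endif

/*
 * Hypothetical helper: index of the lowest set bit.
 * The result is undefined for word == 0, so callers must check first.
 */
static inline unsigned long sketch_ffs(unsigned long word)
{
#if defined(__x86_64__)
	/* "rep; bsf" decodes as tzcnt on CPUs that support it, bsf otherwise */
	__asm__("rep; bsf %1,%0"
		: "=r" (word)
		: SKETCH_INPUT_RM (word));
	return word;
#else
	/* Portable fallback so the sketch also compiles off x86-64 */
	return (unsigned long)__builtin_ctzl(word);
#endif
}
```

Comparing the "clang -O2 -S" output for a caller of sketch_ffs() with the
constraint forced to "rm" versus "r" should show the store-then-load
pattern quoted above in the former case; the exact registers will differ.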

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arch/x86/include/asm/bitops.h