As recently pointed out by Thomas, if a register is forced for two
different register variables, among them one is used as "+" (both input
and output) and another is only used as input, Clang would treat the
conflicting input parameters as undefined behaviour and optimize away
the argument assignment.
Instead use "=r" (only output) for the output parameter and "r" (only
input) for the input parameter.
While the example from the GCC documentation uses "0" for the input
parameter, this is not necessary as confirmed by the GCC developers and "r"
matches what the other architectures' vDSO implementations are using.