crypto: x86/aes-xts - additional optimizations
author Eric Biggers <ebiggers@google.com>
Thu, 12 Dec 2024 21:28:45 +0000 (13:28 -0800)
committer Herbert Xu <herbert@gondor.apana.org.au>
Sat, 21 Dec 2024 14:46:24 +0000 (22:46 +0800)
commit 3cd46a78eeee8f1be545492a9de6dc37cd7d69d9
tree 9fb3f07132bb8b0745f3c44654e78f4aa21348c2
parent 68e95f5c6418ce1d0171fa756608a84170c56165

Reduce latency by taking advantage of the property vaesenclast(key, a) ^
b == vaesenclast(key ^ b, a), like I did in the AES-GCM code.
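
The identity holds because the last AES round ends with a plain XOR of the
round key, so any value XORed into the result afterwards can be folded into
the key beforehand. The sketch below checks this with a small software model
of the last round. It is an illustration only, not code from this patch: the
helper names (gf_mul, make_sbox, aesenclast, xor_bytes) are made up here, and
the model assumes the usual AES column-major state layout. In XTS terms, b
would be the tweak that is XORed into the output block.

```python
import random

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def make_sbox():
    """Build the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):      # x^254 is the inverse of x
                inv = gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):     # XOR with rotations by 1..4 bits
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = make_sbox()

def xor_bytes(x, y):
    return bytes(p ^ q for p, q in zip(x, y))

def aesenclast(key, state):
    """Model of the last AES round: ShiftRows, SubBytes, then XOR the key."""
    out = bytearray(16)
    for col in range(4):
        for row in range(4):
            src = row + 4 * ((col + row) % 4)   # ShiftRows source byte
            out[row + 4 * col] = SBOX[state[src]]
    return xor_bytes(out, key)

rng = random.Random(0)
key = bytes(rng.randrange(256) for _ in range(16))
a = bytes(rng.randrange(256) for _ in range(16))
b = bytes(rng.randrange(256) for _ in range(16))

# XORing b after the round equals XORing b into the round key first.
lhs = xor_bytes(aesenclast(key, a), b)
rhs = aesenclast(xor_bytes(key, b), a)
print(lhs == rhs)
```

The latency benefit comes from key ^ b not depending on a, so it can be
computed while the earlier rounds are still in flight instead of adding one
more XOR after the final round.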

Also replace a vpand and vpxor with a vpternlogd.
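
vpternlogd evaluates an arbitrary three-input bitwise function selected by an
8-bit truth table, which is why one instruction can stand in for an AND
followed by an XOR. The sketch below models that behaviour and derives the
immediate for a ^ (b & c). It is an illustration only: the helper names
(ternlog, imm8_for) are made up here, and it assumes the XORed operand is the
destination, which supplies the most significant bit of the table index. The
operand order actually used in the patch is not shown on this page.

```python
def ternlog(imm8, a, b, c, width=32):
    """Bitwise model of vpternlogd on one element: for each bit position,
    the three input bits form an index (a is the MSB, c is the LSB) that
    selects one bit of imm8."""
    result = 0
    for bit in range(width):
        idx = ((((a >> bit) & 1) << 2)
               | (((b >> bit) & 1) << 1)
               | ((c >> bit) & 1))
        result |= ((imm8 >> idx) & 1) << bit
    return result

def imm8_for(fn):
    """Build the truth-table immediate for a three-input bit function."""
    imm = 0
    for idx in range(8):
        a, b, c = (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
        imm |= (fn(a, b, c) & 1) << idx
    return imm

# Truth table for "AND two inputs, then XOR into the third".
IMM_XOR_AND = imm8_for(lambda a, b, c: a ^ (b & c))
print(hex(IMM_XOR_AND))

a, b, c = 0xDEADBEEF, 0x0F0F0F0F, 0x12345678
print(ternlog(IMM_XOR_AND, a, b, c) == (a ^ (b & c)))
```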

On AMD Zen 5 this improves performance by about 3%.  Intel performance
remains about the same, with a 0.1% improvement being seen on Icelake.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
arch/x86/crypto/aes-xts-avx-x86_64.S