crypto: iaa - Optimize rebalance_wq_table()
author Yury Norov <yury.norov@gmail.com>
Thu, 8 May 2025 19:59:50 +0000 (15:59 -0400)
committer Herbert Xu <herbert@gondor.apana.org.au>
Wed, 14 May 2025 09:45:22 +0000 (17:45 +0800)
commit 714ca27e9bf4608fcb1f627cd5599441f448771e
tree a6d186d3d7dc44b19fa81b4db15ef4b52f010d7d
parent 33cd93435cea665b24ca3f9b3d6af42afb3ba7bc
crypto: iaa - Optimize rebalance_wq_table()

The function opencodes for_each_cpu() by using a plain for-loop. The
loop calls cpumask_weight() inside the loop condition, so the weight is
recomputed on every iteration. Because cpumask_weight() is O(N), the
overall complexity of the function is O(node * node_cpus^2). Also,
cpumask_nth() internally calls hweight(), which, if not hardware
accelerated, is slower than cpumask_next() in for_each_cpu().

Switching to the dedicated for_each_cpu() lets rebalance_wq_table() drop
the cpumask_weight() call, together with some housekeeping code. This
makes the overall complexity O(node * node_cpus), or simply speaking
O(nr_cpu_ids).
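
For illustration only, the two loop shapes described above can be modelled
in user space. This is a rough sketch, not the driver code: the mask is
assumed to fit in a single 64-bit word, and every toy_* name is a
hypothetical stand-in for the kernel helper it imitates (cpumask_weight(),
cpumask_nth(), cpumask_next(), for_each_cpu()).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy cpumask: bit i set means CPU i is in the mask. */
typedef uint64_t toy_mask_t;
#define TOY_NR_CPUS 64

/* Count set bits by scanning the whole mask: O(N), like a software weight. */
static unsigned int toy_weight(toy_mask_t m)
{
	unsigned int w = 0;

	for (unsigned int i = 0; i < TOY_NR_CPUS; i++)
		w += (m >> i) & 1;
	return w;
}

/* Index of the n-th (0-based) set bit, or TOY_NR_CPUS if there is none. */
static unsigned int toy_nth(unsigned int n, toy_mask_t m)
{
	for (unsigned int i = 0; i < TOY_NR_CPUS; i++) {
		if (((m >> i) & 1) && n-- == 0)
			return i;
	}
	return TOY_NR_CPUS;
}

/* First set bit strictly after prev, or TOY_NR_CPUS; pass -1 to start. */
static int toy_next(int prev, toy_mask_t m)
{
	for (int i = prev + 1; i < TOY_NR_CPUS; i++) {
		if ((m >> i) & 1)
			return i;
	}
	return TOY_NR_CPUS;
}

#define toy_for_each_cpu(cpu, m)					\
	for ((cpu) = toy_next(-1, (m)); (cpu) < TOY_NR_CPUS;		\
	     (cpu) = toy_next((cpu), (m)))

/* Open-coded shape: the weight is recomputed in the loop condition on
 * every iteration, and toy_nth() rescans the mask from bit 0 each time. */
static unsigned int walk_open_coded(toy_mask_t m, int *out)
{
	unsigned int n = 0;

	for (unsigned int i = 0; i < toy_weight(m); i++)
		out[n++] = (int)toy_nth(i, m);
	return n;
}

/* Iterator shape: each step resumes the scan from the previous CPU. */
static unsigned int walk_for_each(toy_mask_t m, int *out)
{
	unsigned int n = 0;
	int cpu;

	toy_for_each_cpu(cpu, m)
		out[n++] = cpu;
	return n;
}

/* Both walks must visit the same CPUs in the same order. */
static void toy_selfcheck(toy_mask_t m)
{
	int a[TOY_NR_CPUS], b[TOY_NR_CPUS];
	unsigned int na = walk_open_coded(m, a);
	unsigned int nb = walk_for_each(m, b);

	assert(na == nb);
	for (unsigned int i = 0; i < na; i++)
		assert(a[i] == b[i]);
}
```

In this toy model both walks visit the same CPUs in the same order; only
the amount of rescanning per iteration differs, which is the cost the
patch removes.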

While there, fix opencoded for_each_possible_cpu() too.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
drivers/crypto/intel/iaa/iaa_crypto_main.c