qed: put cond_resched() in qed_grc_dump_ctx_data()
authorMichal Schmidt <mschmidt@redhat.com>
Mon, 30 Sep 2024 20:13:05 +0000 (22:13 +0200)
committerJakub Kicinski <kuba@kernel.org>
Fri, 4 Oct 2024 16:25:15 +0000 (09:25 -0700)
On a kernel with preemption none or voluntary, 'ethtool -d'
on a qede network device can cause a big latency spike.
The biggest part of it is the loop in qed_grc_dump_ctx_data.

The function is called only from the .get_size and .perform_dump
callbacks for the "grc" feature defined in qed_features_lookup[].
As far as I can see, they are used in:
 - qed's devlink healh reporter .dump op
 - qede's ethtool get_regs/get_regs_len/get_dump_data ops
 - qedf's qedf_get_grc_dump, called from:
   - qedf_sysfs_write_grcdump - "grcdump" sysfs attribute write
   - qedf_wq_grcdump - a workqueue

It is safe to sleep in all of them.
Let's insert a cond_resched() in the outer loop to let other tasks run.

Measured using this script:

  #!/bin/bash
  DEV=ens3f1
  echo wakeup_rt > /sys/kernel/tracing/current_tracer
  echo 0 > /sys/kernel/tracing/tracing_max_latency
  echo 1 > /sys/kernel/tracing/tracing_on
  echo "Setting the task CPU affinity"
  taskset -p 1 $$ > /dev/null
  echo "Starting the real-time task"
  chrt -f 50 bash -c 'while sleep 0.01; do :; done' &
  sleep 1
  echo "Running: ethtool -d $DEV"
  time ethtool -d $DEV > /dev/null
  kill %1
  echo 0 > /sys/kernel/tracing/tracing_on
  echo "Measured latency: $(</sys/kernel/tracing/tracing_max_latency) us"
  echo "To see the latency trace: less /sys/kernel/tracing/trace"

The patch lowers the latency from 180 ms to 53 ms on my test system with
voluntary preemption.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Link: https://patch.msgid.link/20240930201307.330692-3-mschmidt@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/ethernet/qlogic/qed/qed_debug.c

index f67be4b8ad43512f4d5b5ae3dabda82f8cad1dbb..464a72afb758946621aebe62252ed25ec7556fae 100644 (file)
@@ -2873,6 +2873,7 @@ static u32 qed_grc_dump_ctx_data(struct qed_hwfn *p_hwfn,
                                                          false,
                                                          SPLIT_TYPE_NONE, 0);
                }
+               cond_resched();
        }
 
        return offset;