virtio-net: don't re-enable refill work too early when NAPI is disabled
author Jakub Kicinski <kuba@kernel.org>
Wed, 30 Apr 2025 16:37:58 +0000 (09:37 -0700)
committer Jakub Kicinski <kuba@kernel.org>
Mon, 5 May 2025 20:53:53 +0000 (13:53 -0700)
Commit 4bc12818b363 ("virtio-net: disable delayed refill when pausing rx")
fixed a deadlock between reconfig paths and refill work trying to disable
the same NAPI instance. The refill work must not run in parallel with
reconfig: disabling an already-disabled NAPI instance stalls while holding
the instance lock, and the reconfig path needs that same lock to re-enable
the NAPI and thereby unblock the stalled thread.

There are two cases where we re-enable refill too early. One is in the
virtnet_set_queues() handler. We call it when installing XDP:

   virtnet_rx_pause_all(vi);
   ...
   virtnet_napi_tx_disable(..);
   ...
   virtnet_set_queues(..);
   ...
   virtnet_rx_resume_all(..);

We want the work to stay disabled until we call virtnet_rx_resume_all(),
but virtnet_set_queues() kicks it before the NAPIs are re-enabled.
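
As a rough illustration of the gating described above, here is a minimal
userspace sketch. It is not kernel code: struct model_dev and every model_*
function are hypothetical names made up for this example, the counter stands
in for calls to schedule_delayed_work(), and the locking around
->refill_enabled is left out because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these names are not kernel APIs. */
struct model_dev {
	bool up;              /* stands in for dev->flags & IFF_UP */
	bool refill_enabled;  /* stands in for vi->refill_enabled */
	int refill_scheduled; /* counts would-be schedule_delayed_work() calls */
};

static void model_init(struct model_dev *d, bool up)
{
	d->up = up;
	d->refill_enabled = true;
	d->refill_scheduled = 0;
}

/* Pausing rx turns delayed refill off. */
static void model_rx_pause_all(struct model_dev *d)
{
	d->refill_enabled = false;
}

/* Resuming rx turns delayed refill back on. */
static void model_rx_resume_all(struct model_dev *d)
{
	d->refill_enabled = true;
}

/* Kick the refill work only if the device is up AND refill is enabled. */
static bool model_set_queues(struct model_dev *d)
{
	if (d->up && d->refill_enabled) {
		d->refill_scheduled++;
		return true;
	}
	return false;
}

/* Walk the XDP-install sequence: pause, set queues, resume. */
static int model_demo(void)
{
	struct model_dev d;

	model_init(&d, true);
	model_rx_pause_all(&d);
	assert(!model_set_queues(&d)); /* paused: nothing is scheduled */
	model_rx_resume_all(&d);
	assert(model_set_queues(&d));  /* resumed: refill may be kicked */
	return d.refill_scheduled;
}
```

Calling model_demo() from a test harness exercises both states. Without the
refill_enabled check in model_set_queues(), the first call would schedule
work while rx is still paused, which is the situation this patch avoids.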

The other case is a simpler mis-ordering in __virtnet_rx_resume(), found
by code inspection.

Taking the spin lock in virtnet_set_queues() (requested during review)
may be unnecessary as we are under rtnl_lock and so are all paths writing
to ->refill_enabled.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Bui Quang Minh <minhquangbui99@gmail.com>
Fixes: 4bc12818b363 ("virtio-net: disable delayed refill when pausing rx")
Fixes: 413f0271f396 ("net: protect NAPI enablement with netdev_lock()")
Link: https://patch.msgid.link/20250430163758.3029367-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/virtio_net.c

index 848fab51dfa1b7690638414d019b837809d35480..b5549d542c022071249ea58ca255ae61eceeeece 100644
@@ -3383,12 +3383,15 @@ static void __virtnet_rx_resume(struct virtnet_info *vi,
                                bool refill)
 {
        bool running = netif_running(vi->dev);
+       bool schedule_refill = false;
 
        if (refill && !try_fill_recv(vi, rq, GFP_KERNEL))
-               schedule_delayed_work(&vi->refill, 0);
-
+               schedule_refill = true;
        if (running)
                virtnet_napi_enable(rq);
+
+       if (schedule_refill)
+               schedule_delayed_work(&vi->refill, 0);
 }
 
 static void virtnet_rx_resume_all(struct virtnet_info *vi)
@@ -3728,8 +3731,10 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs)
 succ:
        vi->curr_queue_pairs = queue_pairs;
        /* virtnet_open() will refill when device is going to up. */
-       if (dev->flags & IFF_UP)
+       spin_lock_bh(&vi->refill_lock);
+       if (dev->flags & IFF_UP && vi->refill_enabled)
                schedule_delayed_work(&vi->refill, 0);
+       spin_unlock_bh(&vi->refill_lock);
 
        return 0;
 }