net_sched: prio: fix a race in prio_tune()
author Eric Dumazet <edumazet@google.com>
Wed, 11 Jun 2025 11:15:11 +0000 (11:15 +0000)
committer Jakub Kicinski <kuba@kernel.org>
Thu, 12 Jun 2025 15:05:49 +0000 (08:05 -0700)
Gerrard Tai reported a race condition in PRIO that triggers when the
SFQ perturb timer fires at the wrong time.

The race is as follows:

CPU 0                                 CPU 1
[1]: lock root
[2]: qdisc_tree_flush_backlog()
[3]: unlock root
 |
 |                                    [5]: lock root
 |                                    [6]: rehash
 |                                    [7]: qdisc_tree_reduce_backlog()
 |
[4]: qdisc_put()

This can be abused to underflow a parent's qlen.
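
The interleaving above can be sketched in userspace as a toy model. This
is not kernel code: struct toy_qdisc and every toy_* function are
hypothetical stand-ins, and the real helpers track byte backlog as well
as qlen. The sketch only assumes that the flush helper adjusts ancestor
counters while leaving the packets in the child, so a later rehash that
drops a packet reports it to the parent a second time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct Qdisc: only the packet count matters here. */
struct toy_qdisc {
	unsigned int qlen;         /* packets this qdisc accounts for */
	struct toy_qdisc *parent;  /* NULL for the root */
};

/* Subtract n packets from every ancestor of q (models the tree walk). */
static void toy_tree_reduce(struct toy_qdisc *q, unsigned int n)
{
	for (struct toy_qdisc *p = q->parent; p != NULL; p = p->parent)
		p->qlen -= n;
}

/* Steps [1]-[3]: ancestors are adjusted, but the child keeps its packets. */
static void toy_tree_flush_backlog(struct toy_qdisc *q)
{
	toy_tree_reduce(q, q->qlen);
}

/* Steps [5]-[7]: the rehash drops packets and reports them upward. */
static void toy_rehash_drop(struct toy_qdisc *q, unsigned int dropped)
{
	assert(dropped <= q->qlen);
	q->qlen -= dropped;
	toy_tree_reduce(q, dropped);
}

/* Replays the race in program order and returns the parent's final qlen. */
static unsigned int toy_race_demo(void)
{
	struct toy_qdisc prio = { .qlen = 3, .parent = NULL };
	struct toy_qdisc sfq  = { .qlen = 3, .parent = &prio };

	toy_tree_flush_backlog(&sfq);  /* CPU 0: parent drops to 0, child still holds 3 */
	toy_rehash_drop(&sfq, 1);      /* CPU 1: parent is reduced again, from 0 */
	return prio.qlen;              /* unsigned wrap-around, i.e. an underflow */
}
```

Calling toy_race_demo() from a main() shows the parent counter wrapping
below zero, which is the underflow described above.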

Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog()
should fix the race, because all packets will be purged from the qdisc
before releasing the lock.
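
Why purging closes the window can be sketched with the same kind of toy
userspace model. Again, struct toy_qdisc and the toy_* functions are
hypothetical; the sketch assumes only what the text above states, namely
that qdisc_purge_queue() empties the child while the root lock is still
held, so a late rehash finds nothing left to drop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct Qdisc. */
struct toy_qdisc {
	unsigned int qlen;         /* packets this qdisc accounts for */
	struct toy_qdisc *parent;  /* NULL for the root */
};

/* Subtract n packets from every ancestor of q. */
static void toy_tree_reduce(struct toy_qdisc *q, unsigned int n)
{
	for (struct toy_qdisc *p = q->parent; p != NULL; p = p->parent)
		p->qlen -= n;
}

/* Purge-style helper: snapshot the count, empty the child, then report. */
static void toy_purge_queue(struct toy_qdisc *q)
{
	unsigned int qlen = q->qlen;

	q->qlen = 0;               /* stands in for resetting the child qdisc */
	toy_tree_reduce(q, qlen);
}

/* A late rehash can drop at most what the child still holds. */
static unsigned int toy_late_rehash(struct toy_qdisc *q, unsigned int want)
{
	unsigned int dropped = want < q->qlen ? want : q->qlen;

	q->qlen -= dropped;
	toy_tree_reduce(q, dropped);
	return dropped;
}

/* Replays the same interleaving with the purge helper. */
static unsigned int toy_purge_demo(void)
{
	struct toy_qdisc prio = { .qlen = 3, .parent = NULL };
	struct toy_qdisc sfq  = { .qlen = 3, .parent = &prio };

	toy_purge_queue(&sfq);                  /* CPU 0, under the root lock */
	assert(toy_late_rehash(&sfq, 1) == 0);  /* CPU 1: child is already empty */
	return prio.qlen;                       /* parent stays consistent */
}
```

In this model the parent's qlen ends at zero instead of wrapping,
because the child and its ancestors are brought to the same state in one
critical section.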

Fixes: 7b8e0b6e6599 ("net: sched: prio: delay destroying child qdiscs on change")
Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250611111515.1983366-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index cc30f7a32f1a786fa1c3937a1b9d5c96a52e56e7..9e2b9a490db23d858b27b7fc073b05a06535b05e 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -211,7 +211,7 @@ static int prio_tune(struct Qdisc *sch, struct nlattr *opt,
        memcpy(q->prio2band, qopt->priomap, TC_PRIO_MAX+1);
 
        for (i = q->bands; i < oldbands; i++)
-               qdisc_tree_flush_backlog(q->queues[i]);
+               qdisc_purge_queue(q->queues[i]);
 
        for (i = oldbands; i < q->bands; i++) {
                q->queues[i] = queues[i];