net: phylink: add lock for serializing concurrent pl->phydev writes with resolver
authorVladimir Oltean <vladimir.oltean@nxp.com>
Thu, 4 Sep 2025 12:52:37 +0000 (15:52 +0300)
committerJakub Kicinski <kuba@kernel.org>
Sat, 6 Sep 2025 00:49:43 +0000 (17:49 -0700)
commit0ba5b2f2c381dbec9ed9e4ab3ae5d3e667de0dc3
tree71a18bdb8f8bb2d84eb14a661e4fad3d3a164ea5
parent03e79de4608bdd48ad6eec272e196124cefaf798
net: phylink: add lock for serializing concurrent pl->phydev writes with resolver

Currently phylink_resolve() protects itself against concurrent
phylink_bringup_phy() or phylink_disconnect_phy() calls which modify
pl->phydev by relying on pl->state_mutex.

The problem is that in phylink_resolve(), pl->state_mutex is in a lock
inversion state with pl->phydev->lock. So pl->phydev->lock needs to be
acquired prior to pl->state_mutex. But that requires dereferencing
pl->phydev in the first place, and without pl->state_mutex, that is
racy.

Hence the reason for the extra lock. Currently it is redundant, but it
will serve a functional purpose once mutex_lock(&phy->lock) will be
moved outside of the mutex_lock(&pl->state_mutex) section.

Another alternative considered would have been to let phylink_resolve()
acquire the rtnl_mutex, which is also held when phylink_bringup_phy()
and phylink_disconnect_phy() are called. But since phylink_disconnect_phy()
runs under rtnl_lock(), it would deadlock with phylink_resolve() when
calling flush_work(&pl->resolve). Additionally, it would have been
undesirable because it would have unnecessarily blocked many other call
paths as well in the entire kernel, so the smaller-scoped lock was
preferred.

Link: https://lore.kernel.org/netdev/aLb6puGVzR29GpPx@shell.armlinux.org.uk/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20250904125238.193990-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/phy/phylink.c