ceph: make sure mdsc->mutex is nested in s->s_mutex to fix deadlock
author Xiubo Li <xiubli@redhat.com>
Wed, 20 May 2020 07:51:19 +0000 (03:51 -0400)
committer Ilya Dryomov <idryomov@gmail.com>
Mon, 1 Jun 2020 11:22:53 +0000 (13:22 +0200)
send_mds_reconnect takes the s_mutex while the mdsc->mutex is already
held. That inverts the locking order documented in mds_client.h. Drop
the mdsc->mutex, acquire the s_mutex and then reacquire the mdsc->mutex
to prevent a deadlock.

URL: https://tracker.ceph.com/issues/45609
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
fs/ceph/mds_client.c

index 6c283c52d401855fe57060764aeb7f5f1278f122..0e0ab01694dc1c395208396407e32b52ccc1dace 100644
@@ -3769,8 +3769,6 @@ fail:
  * recovering MDS might have.
  *
  * This is a relatively heavyweight operation, but it's rare.
- *
- * called with mdsc->mutex held.
  */
 static void send_mds_reconnect(struct ceph_mds_client *mdsc,
                               struct ceph_mds_session *session)
@@ -4024,7 +4022,9 @@ static void check_new_map(struct ceph_mds_client *mdsc,
                            oldstate != CEPH_MDS_STATE_STARTING)
                                pr_info("mds%d recovery completed\n", s->s_mds);
                        kick_requests(mdsc, i);
+                       mutex_unlock(&mdsc->mutex);
                        mutex_lock(&s->s_mutex);
+                       mutex_lock(&mdsc->mutex);
                        ceph_kick_flushing_caps(mdsc, s);
                        mutex_unlock(&s->s_mutex);
                        wake_up_session_caps(s, RECONNECT);
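
The drop-and-reacquire pattern in the hunk above can be sketched in userspace with POSIX mutexes. This is only an illustration, not the kernel code: struct client, struct session, the counters and kick_session() are hypothetical stand-ins for ceph_mds_client, ceph_mds_session and the body of the loop in check_new_map(). It assumes the ordering named in the subject line, with s_mutex as the outer lock and the client mutex nested inside it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for struct ceph_mds_client. */
struct client {
	pthread_mutex_t mutex;   /* inner lock, nests inside s_mutex */
	int kicked;
};

/* Hypothetical stand-in for struct ceph_mds_session. */
struct session {
	pthread_mutex_t s_mutex; /* outer lock, must be taken first */
	int flushed;
};

/*
 * Called with c->mutex held; returns with c->mutex still held and
 * s->s_mutex released. The inner lock is dropped before the outer
 * lock is taken, so the two are never acquired in inverted order.
 */
static void kick_session(struct client *c, struct session *s)
{
	c->kicked++;                      /* work done under c->mutex only */

	pthread_mutex_unlock(&c->mutex);  /* drop the inner lock */
	pthread_mutex_lock(&s->s_mutex);  /* take the outer lock first */
	pthread_mutex_lock(&c->mutex);    /* then retake the inner lock */

	s->flushed++;                     /* work that needs both locks */

	pthread_mutex_unlock(&s->s_mutex);
}
```

A caller would lock c->mutex, call kick_session(), and unlock c->mutex afterwards. Because the inner lock is released for a moment, any state read under it before the call may have changed and should be rechecked if the caller depends on it.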