md: don't unregister sync_thread with reconfig_mutex held
author Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Sat, 13 Feb 2021 00:49:59 +0000 (01:49 +0100)
committer Song Liu <song@kernel.org>
Mon, 23 May 2022 06:07:21 +0000 (23:07 -0700)
commit 8b48ec23cc51a4e7c8dbaef5f34ebe67e1a80934
tree 8de244d677ec5e71cea81588e8b3e388a410cf22
parent 537b9f2bf60f4bbd8ab89cea16aaab70f0c1560d

Unregistering sync_thread does not require reconfig_mutex, since it does
not reconfigure the array.

Holding reconfig_mutex while unregistering can also deadlock raid5, as
follows:

1. Process A tried to reap the sync thread with reconfig_mutex held after
   "idle" was echoed to sync_action.
2. The raid5 sync thread was blocked because there were too many active
   stripes.
3. SB_CHANGE_PENDING was set (because write IO came from the upper layer),
   which prevented the number of active stripes from decreasing.
4. SB_CHANGE_PENDING could not be cleared since md_check_recovery was not
   able to take reconfig_mutex.

More details in the link:
https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t

Also add a parameter to md_reap_sync_thread, since it can be called by
dm-raid, which doesn't hold reconfig_mutex.
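
The locking pattern described above can be sketched in userspace with
pthreads. This is an illustration only, not the kernel change: struct array,
sync_worker, reap_sync_thread and demo are hypothetical names, and the
dependency chain is simplified so that the worker itself needs the mutex
before it can exit (in the report it is md_check_recovery that needs it).
The flag tells the reaper whether the caller holds the mutex, so the mutex
is dropped only around the join.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an md array; not the kernel's struct mddev. */
struct array {
	pthread_mutex_t reconfig_mutex;
	pthread_t sync_thread;
	int sb_updates;		/* work that can only be done under the mutex */
};

/*
 * Worker that must take reconfig_mutex once before it can exit.
 * Assumption: this collapses the raid5 chain (blocked stripes, pending
 * superblock change) into a single dependency on the mutex.
 */
static void *sync_worker(void *arg)
{
	struct array *a = arg;

	pthread_mutex_lock(&a->reconfig_mutex);
	a->sb_updates++;
	pthread_mutex_unlock(&a->reconfig_mutex);
	return NULL;
}

/*
 * Wait for the worker to finish. If the caller holds reconfig_mutex,
 * release it around the join so the worker can make progress, then
 * take it again before returning. Joining with the mutex still held
 * would hang forever in this sketch.
 */
static int reap_sync_thread(struct array *a, bool reconfig_mutex_held)
{
	int err;

	if (reconfig_mutex_held)
		pthread_mutex_unlock(&a->reconfig_mutex);

	err = pthread_join(a->sync_thread, NULL);

	if (reconfig_mutex_held)
		pthread_mutex_lock(&a->reconfig_mutex);

	return err;
}

/* Run one reap, with or without the caller holding the mutex. */
int demo(bool caller_holds_mutex)
{
	struct array a = { .sb_updates = 0 };
	int err;

	pthread_mutex_init(&a.reconfig_mutex, NULL);

	if (caller_holds_mutex)
		pthread_mutex_lock(&a.reconfig_mutex);

	err = pthread_create(&a.sync_thread, NULL, sync_worker, &a);
	assert(err == 0);

	err = reap_sync_thread(&a, caller_holds_mutex);

	if (caller_holds_mutex)
		pthread_mutex_unlock(&a.reconfig_mutex);

	pthread_mutex_destroy(&a.reconfig_mutex);
	return err == 0 ? a.sb_updates : -1;
}
```

Calling demo(true) models a caller that holds the mutex, and demo(false)
models a caller such as dm-raid that does not. Both should complete with
the worker having done its update.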

Reported-and-tested-by: Donald Buczek <buczek@molgen.mpg.de>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Song Liu <song@kernel.org>
drivers/md/dm-raid.c
drivers/md/md.c
drivers/md/md.h