md: split MD_RECOVERY_NEEDED out of mddev_resume
author Yu Kuai <yukuai3@huawei.com>
Thu, 7 Dec 2023 02:07:24 +0000 (10:07 +0800)
committer Song Liu <song@kernel.org>
Thu, 7 Dec 2023 18:19:47 +0000 (10:19 -0800)
commit b39113349de60e9b0bc97c2e129181b193c45054
tree 16d09c2bb9b60da6a0afa203087edb9b62f81154
parent f52f5c71f3d4bb0992800139d2f35cf9f6f6e0ee

New mddev_resume() calls were added to synchronize IO with array
reconfiguration. However, the call added in md_start_sync() introduces a
performance regression:

1) someone sets MD_RECOVERY_NEEDED first;
2) daemon thread grabs reconfig_mutex, then clears MD_RECOVERY_NEEDED and
   queues a new sync work;
3) daemon thread releases reconfig_mutex;
4) in md_start_sync():
   a) check that there are spares that can be added/removed, then suspend
      the array;
   b) remove_and_add_spares() may not be called, or may be called without
      actually adding or removing any spares;
   c) resume the array, which sets MD_RECOVERY_NEEDED again!

Steps 2 to 4 then repeat in a loop, so mddev_suspend() is called very
often and, as a consequence, normal IO becomes quite slow.

Fix this problem by not setting MD_RECOVERY_NEEDED again in
md_start_sync(), which breaks the loop.

Fixes: bc08041b32ab ("md: suspend array in md_start_sync() if array need reconfiguration")
Suggested-by: Song Liu <song@kernel.org>
Reported-by: Janpieter Sollie <janpieter.sollie@edpnet.be>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218200
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20231207020724.2797445-1-yukuai1@huaweicloud.com
drivers/md/md.c