md: fix sync_action incorrect display during resync
author Zheng Qixing <zhengqixing@huawei.com>
Sat, 16 Aug 2025 00:25:34 +0000 (08:25 +0800)
committer Yu Kuai <yukuai3@huawei.com>
Sat, 16 Aug 2025 00:52:33 +0000 (08:52 +0800)
During a RAID resync, if a disk becomes faulty, the operation is
briefly interrupted. The MD_RECOVERY_RECOVER flag set as a result of
the disk failure then causes sync_action to incorrectly show "recover"
instead of "resync". The same issue affects reshape operations.

Reproduction steps:
  mdadm -Cv /dev/md1 -l1 -n4 -e1.2 /dev/sd{a..d} // -> resync happened
  mdadm -f /dev/md1 /dev/sda                     // -> resync interrupted
  cat /sys/block/md1/md/sync_action
  -> recover

Add progress checks in md_sync_action() for resync/recover/reshape
to ensure the interface correctly reports the actual operation type.
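
The priority order those checks impose can be illustrated with a small
userspace model. This is only a sketch, not the kernel code: every
model_* name and MODEL_* constant below is hypothetical, the device list
is a plain array instead of an RCU-protected list, and the per-device
test is a simplified assumption about what rdev_needs_recovery() checks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's sector_t and MaxSector. */
typedef uint64_t sector_t;
#define MODEL_MAX_SECTOR (~(sector_t)0)

/* Hypothetical mirror of the sync actions used in the patch. */
enum model_action { MODEL_IDLE, MODEL_RESYNC, MODEL_RESHAPE, MODEL_RECOVER };

/* Simplified member device: only the fields this sketch needs. */
struct model_rdev {
	bool faulty;
	bool in_sync;
	sector_t recovery_offset;
};

/* Simplified array state. */
struct model_mddev {
	sector_t resync_offset;    /* below MODEL_MAX_SECTOR: resync unfinished */
	sector_t reshape_position; /* MODEL_MAX_SECTOR: no reshape in progress */
	const struct model_rdev *rdevs;
	size_t nr_rdevs;
};

/* Assumption: a device needs recovery if it is usable but not fully rebuilt. */
static bool model_rdev_needs_recovery(const struct model_rdev *rdev,
				      sector_t sectors)
{
	return !rdev->faulty && !rdev->in_sync &&
	       rdev->recovery_offset < sectors;
}

/* Report the operation by on-disk progress: resync, then reshape, then recover. */
static enum model_action model_active_sync_action(const struct model_mddev *mddev)
{
	size_t i;

	if (mddev->resync_offset < MODEL_MAX_SECTOR)
		return MODEL_RESYNC;

	if (mddev->reshape_position != MODEL_MAX_SECTOR)
		return MODEL_RESHAPE;

	for (i = 0; i < mddev->nr_rdevs; i++) {
		if (model_rdev_needs_recovery(&mddev->rdevs[i], MODEL_MAX_SECTOR))
			return MODEL_RECOVER;
	}

	return MODEL_IDLE;
}
```

In this model, an array with an unfinished resync and one faulty member
reports MODEL_RESYNC, which corresponds to the "resync" output the
reproduction steps should produce once the fix is applied.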

Fixes: 4b10a3bc67c1 ("md: ensure resync is prioritized over recovery")
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
Link: https://lore.kernel.org/linux-raid/20250816002534.1754356-3-zhengqixing@huaweicloud.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
drivers/md/md.c

index abd327ade4bdf74ba24c1bd7760d750ffb212155..1baaf52c603c1c229331771749a69bec6101e5ef 100644
@@ -4848,9 +4848,33 @@ static bool rdev_needs_recovery(struct md_rdev *rdev, sector_t sectors)
               rdev->recovery_offset < sectors;
 }
 
+static enum sync_action md_get_active_sync_action(struct mddev *mddev)
+{
+       struct md_rdev *rdev;
+       bool is_recover = false;
+
+       if (mddev->resync_offset < MaxSector)
+               return ACTION_RESYNC;
+
+       if (mddev->reshape_position != MaxSector)
+               return ACTION_RESHAPE;
+
+       rcu_read_lock();
+       rdev_for_each_rcu(rdev, mddev) {
+               if (rdev_needs_recovery(rdev, MaxSector)) {
+                       is_recover = true;
+                       break;
+               }
+       }
+       rcu_read_unlock();
+
+       return is_recover ? ACTION_RECOVER : ACTION_IDLE;
+}
+
 enum sync_action md_sync_action(struct mddev *mddev)
 {
        unsigned long recovery = mddev->recovery;
+       enum sync_action active_action;
 
        /*
         * frozen has the highest priority, means running sync_thread will be
@@ -4874,8 +4898,17 @@ enum sync_action md_sync_action(struct mddev *mddev)
            !test_bit(MD_RECOVERY_NEEDED, &recovery))
                return ACTION_IDLE;
 
-       if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
-           mddev->reshape_position != MaxSector)
+       /*
+        * Check if any sync operation (resync/recover/reshape) is
+        * currently active. This ensures that only one sync operation
+        * can run at a time. Returns the type of active operation, or
+        * ACTION_IDLE if none are active.
+        */
+       active_action = md_get_active_sync_action(mddev);
+       if (active_action != ACTION_IDLE)
+               return active_action;
+
+       if (test_bit(MD_RECOVERY_RESHAPE, &recovery))
                return ACTION_RESHAPE;
 
        if (test_bit(MD_RECOVERY_RECOVER, &recovery))