btrfs: Add handling for disk split-brain scenario during fsid change
author Nikolay Borisov <nborisov@suse.com>
Tue, 30 Oct 2018 14:43:25 +0000 (16:43 +0200)
committer David Sterba <dsterba@suse.com>
Mon, 17 Dec 2018 13:51:38 +0000 (14:51 +0100)
Even though an fsid change without rewrite is a very quick operation, it's
still possible to experience a split-brain scenario if power loss occurs
at the most inconvenient time. This patch handles the case where power
failure occurs while the first transaction (the one setting the
CHANGING_FSID_V2 flag) is being persisted on disk. This can cause the
btrfs_fs_devices of this filesystem to be created by a device which:

 a) has the CHANGING_FSID_V2 flag set but its fsid value is intact

 b) or doesn't have the CHANGING_FSID_V2 flag set and its fsid value is
    intact

This situation is trivially handled by the current find_fsid code, since
in both cases the devices are going to be treated like ordinary devices.
Btrfs is always mounted using the superblock of the latest device (the
one with the highest generation number), meaning the superblock used for
the mount will have the CHANGING_FSID_V2 flag set. Ensure the flag is
cleared on mount. On the first transaction commit following mount, all
disks will have it cleared.
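
The recovery sequence described above can be modelled in a few lines of
userspace C. This is only a sketch under stated assumptions, not the kernel
code: struct demo_super, DEMO_SUPER_FLAG_CHANGING_FSID_V2 (with an arbitrary
bit value) and the demo_* helpers are hypothetical names made up for
illustration, and the "commit" simply copies the in-memory flags to every
modelled device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real flag; the bit value is illustrative. */
#define DEMO_SUPER_FLAG_CHANGING_FSID_V2 (1ULL << 0)

/* Hypothetical, heavily reduced model of a per-device superblock. */
struct demo_super {
	uint64_t generation;
	uint64_t flags;
};

/* Return the superblock with the highest generation (the "latest" device). */
static struct demo_super *demo_latest(struct demo_super *sb, size_t n)
{
	struct demo_super *latest = &sb[0];

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (sb[i].generation > latest->generation)
			latest = &sb[i];
	return latest;
}

/*
 * Clear the in-progress flag in the in-memory copy used for the mount.
 * Returns 1 if the flag was set (and has been cleared), 0 otherwise.
 */
static int demo_clear_changing_fsid(struct demo_super *mem)
{
	if (!(mem->flags & DEMO_SUPER_FLAG_CHANGING_FSID_V2))
		return 0;
	mem->flags &= ~DEMO_SUPER_FLAG_CHANGING_FSID_V2;
	return 1;
}

/* Model a transaction commit: every device receives the in-memory flags. */
static void demo_commit(const struct demo_super *mem,
			struct demo_super *disks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		disks[i].generation = mem->generation + 1;
		disks[i].flags = mem->flags;
	}
}
```

In this model the flag must be cleared before the copy that feeds the commit
is taken, which mirrors why the patch moves the memcpy into
super_for_commit below the new flag-clearing block.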

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
fs/btrfs/disk-io.c

index ed6d4c83c1e7f2d44593aee640bed95a26a09b1f..feb67dfd663d72f2d265a60a799fa4debf53cdf5 100644
@@ -2799,10 +2799,10 @@ int open_ctree(struct super_block *sb,
         * the whole block of INFO_SIZE
         */
        memcpy(fs_info->super_copy, bh->b_data, sizeof(*fs_info->super_copy));
-       memcpy(fs_info->super_for_commit, fs_info->super_copy,
-              sizeof(*fs_info->super_for_commit));
        brelse(bh);
 
+       disk_super = fs_info->super_copy;
+
        ASSERT(!memcmp(fs_info->fs_devices->fsid, fs_info->super_copy->fsid,
                       BTRFS_FSID_SIZE));
 
@@ -2812,6 +2812,16 @@ int open_ctree(struct super_block *sb,
                                BTRFS_FSID_SIZE));
        }
 
+       features = btrfs_super_flags(disk_super);
+       if (features & BTRFS_SUPER_FLAG_CHANGING_FSID_V2) {
+               features &= ~BTRFS_SUPER_FLAG_CHANGING_FSID_V2;
+               btrfs_set_super_flags(disk_super, features);
+               btrfs_info(fs_info,
+                       "found metadata UUID change in progress flag, clearing");
+       }
+
+       memcpy(fs_info->super_for_commit, fs_info->super_copy,
+              sizeof(*fs_info->super_for_commit));
 
        ret = btrfs_validate_mount_super(fs_info);
        if (ret) {
@@ -2820,7 +2830,6 @@ int open_ctree(struct super_block *sb,
                goto fail_alloc;
        }
 
-       disk_super = fs_info->super_copy;
        if (!btrfs_super_root(disk_super))
                goto fail_alloc;