    BTRFS converting RAID1 to a single drive

    Due to a hard drive starting to report errors, I needed to remove my data from it and replace it. It was part of a 2-drive BTRFS RAID1 file system made up of two 2TB drives. For those unfamiliar, RAID1 is "mirroring": both drives hold a full copy of all the data, which protects you from a single drive failure. My server has two sets of drives in RAID1 arrays, the old set of 2x2TB and a newer set of 2x6TB, for a total data capacity of 8TB, all protected with RAID1.

    Since the drive it was paired with is also showing signs of age, I plan on removing it soon as well. The replacement for both of these drives will be a single, larger drive. At 10TB, this new drive is considerably larger than both the old drives combined, so I will be moving away from RAID1 for the time being and relying on a backup strategy instead. This will reduce heat, power consumption, and the likelihood of a failure (more devices = an increased chance of failure).

    Using BTRFS, there are two ways to remove a drive from a RAID1 array: you can "fail" the drive, replace it, and rebuild the array, or you can remove the RAID completely, converting the array to "JBOD" (Just a Bunch Of Disks) and then move the data off of the failing drive. The first option requires mounting the RAID array in "degraded" mode - basically, you pretend the drive has actually died and replace it with a drive of similar size. This didn't fit my larger plan of upgrading the drive capacity, so I went for the second option.
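
    For reference, the first option would look roughly like this. This is a sketch only - the replacement device name and the devid here are hypothetical, and I did not go this route:

    Code:
    mount -o degraded /dev/sdb3 /mnt/pool2        # mount the array with the failed drive treated as missing
    btrfs replace start 2 /dev/sde3 /mnt/pool2    # rebuild devid 2 onto the new partition
    btrfs replace status /mnt/pool2               # check how far along the rebuild is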

    Since I am using BTRFS, this entire operation can occur with the data still accessible and while still using the system. This is not possible if you're using MDADM RAID and EXT4. I often talk about the virtues of BTRFS and this is a big one. My case is limited to four drives, but they are externally removable, and my motherboard supports "hot swapping," which means I can replace a drive while powered on. If all goes well, all of the steps below will occur without powering the system down and maybe without even rebooting. Since it's an Ubuntu 14.04 server, I may end up having to reboot once if it doesn't "see" the new drive. I will also be removing the second drive of this array because it's old and will not be needed once the new drive is in place.
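
    If the kernel doesn't notice the hot-swapped drive on its own, a SCSI bus rescan will usually bring it up without a reboot. A rough sketch, run as root - the host number varies per system, so check /sys/class/scsi_host/ for yours:

    Code:
    # ask the kernel to rescan a SATA/SCSI host for newly attached drives
    echo "- - -" > /sys/class/scsi_host/host0/scan
    # confirm the new drive appears
    lsblk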

    The steps involved:

    Phase 1 - Getting the bad drive out of the system.
    1) Make a backup of the data.
    2) Convert the data, metadata, and system data from RAID1 to JBOD.
    3) Move all the data off of the failing drive to the remaining drive.
    4) Physically remove the bad drive.

    Phase 2 - Getting the new drive into the system and removing the last old drive.
    5) Install the new drive.
    6) Format and test the new drive.
    7) Move all the data off of the old drive to the new drive.
    8) Remove the last old drive.

    I will document these phases and their steps in the following posts...
    Last edited by oshunluvr; Apr 23, 2018, 04:48 PM.


    #2
    Phase 1, Step 1: The Backup

    Since I use BTRFS, all the data on these drives is segregated into 11 subvolumes. This means I can create full, usable copies of the subvolumes with a simple BTRFS command set: send|receive. The total drive space consumed by these subvolumes is just over 1TB.
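
    If you want to see what subvolumes you're dealing with before starting, a quick listing looks like this (pool names as mounted on my system):

    Code:
    btrfs subvolume list /mnt/pool2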

    To backup a subvolume, you must make a read-only snapshot of your subvolume, then "send" it to another BTRFS file system where it is "received." You can also send it to a non-BTRFS backup location by using a file instead of a BTRFS filesystem to receive the data. The disadvantage to "sending" to a file is that the data is not accessible until it is "received," but the advantage is you can move or copy it to any location you like and restore it when needed. Note that you can only send|receive a read-only subvolume (snapshots are also subvolumes). This prevents data corruption during the copy because, remember, the file system is still usable during all of this. Nothing will be unmounted.
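
    As a sketch of the send-to-a-file variant, using the @Documents_ro snapshot created in the commands further down (the backup path here is just an example):

    Code:
    # write the read-only snapshot out to a plain file on any filesystem
    btrfs send -f /backup/@Documents_ro.send /mnt/pool2/@Documents_ro
    # later, restore it into a BTRFS filesystem to make it usable again
    btrfs receive -f /backup/@Documents_ro.send /mnt/pool1/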

    I am lucky in that my second drive array has enough excess capacity to hold all the subvolumes on this set of drives. I will simply send the subvolumes to the second array as they are. As you might expect, this requires root access so I will open a terminal, switch to a "root" session and proceed with the operation. Note my two arrays are "pool1" and "pool2" and I am removing the drives from pool2.

    Here are the commands:

    sudo -i
    btrfs subvolume snapshot -r /mnt/pool2/@Documents /mnt/pool2/@Documents_ro
    btrfs send /mnt/pool2/@Documents_ro | btrfs receive /mnt/pool1/

    That's it. The first command opens a root session using sudo, the second takes a read-only snapshot (the "-r" switch) of @Documents and saves it as @Documents_ro, and the third sends the read-only snapshot to pool1. Note in the third command there is no target name for the subvolume. That's because you cannot change the target name of the subvolume during this operation. The length of time it takes for the send|receive to finish will obviously depend on the amount of data you are copying and many other factors.

    Moving forward, if I wanted to keep and use the subvolumes on pool1, I would only need to re-snapshot the read-only subvolumes to read-write versions and re-mount the subvolumes where they are. But I'm planning on adding the new drive and moving all the data from pool1 to it, so I might as well leave them read-only for now.
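
    For completeness, turning one of those read-only copies back into a writable subvolume is just another snapshot, this time without the "-r" switch (using @Documents as the example):

    Code:
    btrfs subvolume snapshot /mnt/pool1/@Documents_ro /mnt/pool1/@Documents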

    I repeated the above commands for each subvolume over the course of a day. Now I'm ready for Step 2 of Phase 1.
    Last edited by oshunluvr; Apr 23, 2018, 07:17 AM.



      #3
      Phase 1, Step 2: Convert RAID1 to JBOD

      Using the same root session from above, I issued this command:

      btrfs balance start -f -sconvert=single -mconvert=single -dconvert=single /mnt/pool2


      Breaking down each part of this btrfs command:
      • "balance" is the command BTRFS uses to relocate data. You can balance data from a single disk onto a RAID or, as here, the other way around.
      • "start" is obvious, but you can also pause or cancel a balance operation and check its progress with "status" (see the commands listed after this breakdown). Balance operations can take a long time if you have a lot of data.
      • "-f" is the "force" switch. It is needed here because I'm converting the metadata and system data from RAID1 to "single". This is considered less safe than having multiple copies of these parts of the filesystem, so the "force" requirement helps ensure you are not doing this by mistake. With BTRFS, even if I do not want to keep my data files in a RAID1 array, I can still keep the system data and metadata on multiple disks for safety. Since I'm removing a drive completely, I do not want that this time.
      • "-sconvert", "-mconvert", and "-dconvert" with "=single" tell the system to convert each of these types of data to "single" (existing only on one disk). "s" means "system", "m" means "metadata", and "d" means "data" (your files).
      • "/mnt/pool2" is the mounted location of the RAID array I'm working on. Notice the command operates on the mount, not the device names.
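
      As mentioned above, a running balance can be checked on or interrupted from another terminal; the relevant commands are:

      Code:
      btrfs balance status /mnt/pool2    # show progress of a running balance
      btrfs balance pause /mnt/pool2     # pause it
      btrfs balance resume /mnt/pool2    # pick up where it left off
      btrfs balance cancel /mnt/pool2    # abandon it entirely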


      I did this operation overnight because I expected it to take about 6 hours to finish. Here's the data arrangement before the above command and after:

      Before:
      Code:
      # btrfs device usage /mnt/pool2
      /dev/sdb3, ID: 1
         Device size:             1.76TiB
         Data,RAID1:              1.02TiB
         Metadata,RAID1:          3.00GiB
         System,RAID1:           32.00MiB
         Unallocated:           761.08GiB
      
      
      /dev/sdd3, ID: 2
         Device size:             1.76TiB
         Data,RAID1:              1.02TiB
         Metadata,RAID1:          3.00GiB
         System,RAID1:           32.00MiB
         Unallocated:           761.08GiB
      and after:

      Code:
      # btrfs device usage /mnt/pool2
      /dev/sdb3, ID: 1
         Device size:             1.76TiB
         Data,single:           521.00GiB
         Metadata,single:         2.00GiB
         System,single:          32.00MiB
         Unallocated:             1.25TiB
      
      /dev/sdd3, ID: 2
         Device size:             1.76TiB
         Data,single:           522.00GiB
         Metadata,single:         1.00GiB
         Unallocated:             1.25TiB
      You can see the data now exists in only one copy and is fairly evenly distributed across both disks, as one might expect. Note that all the system data is on only one drive, but the metadata exists on the same drive as the data files it refers to. Metadata refers directly to your files, while system data contains information about the filesystem as a whole.



        #4
        Phase 1, Steps 3 and 4: Move the data and remove the failing drive

        With a single BTRFS command, I will now move the data off of the failing drive leaving a single drive BTRFS file system:

        btrfs device delete /dev/sdd3 /mnt/pool2

        and the results to compare to the previous post:

        Code:
        # btrfs device usage /mnt/pool2         
        /dev/sdb3, ID: 1
           Device size:             1.76TiB
           Data,single:             1.02TiB
           Metadata,single:         3.00GiB
           System,single:          32.00MiB
           Unallocated:           761.08GiB
        As you can see, all the data that was on the previous RAID1 is now on just this one disk, freeing me to remove the failing drive from the computer. I have not yet needed to reboot the server or even edit /etc/fstab, as the mounts haven't changed. I will need to do some editing to fstab before I actually pull the physical drive because there are 3 other partitions on the drive that are in use for other things.
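
        The fstab clean-up itself is just bookkeeping. Those other partitions and their mount points aren't shown here, so treat the following as a sketch:

        Code:
        lsblk /dev/sdd             # see every partition on the failing drive
        mount | grep /dev/sdd      # find which of them are currently mounted
        umount /dev/sdd1           # unmount each one (sdd1 is just an example)
        nano /etc/fstab            # then comment out or delete their fstab entries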

        This is where I'll stop for the time being, as I have not yet purchased a replacement drive. I will continue this when the new drive arrives and I have time to install it.
        Last edited by oshunluvr; Apr 23, 2018, 11:03 AM.



          #5
          Another of a long line of excellent posts about Btrfs! I'm bookmarking this one as well.

          I went the same route, except that I left the metadata as RAID1.

          Code:
           $ sudo btrfs device usage /
           [sudo] password for jerry: 
           /dev/sda1, ID: 1
              Device size:           691.19GiB
              Device slack:              0.00B
              Data,single:            64.00GiB
              Metadata,RAID1:          3.00GiB
              Unallocated:           624.19GiB
           
           /dev/sdc, ID: 2
              Device size:           698.64GiB
              Device slack:              0.00B
              Data,single:            65.00GiB
              Metadata,RAID1:          3.00GiB
              System,single:          32.00MiB
              Unallocated:           630.61GiB
           
           :~$ sudo -i
           :~# mount /dev/sda1 /mnt
           # /mnt becomes my pool designation because that is where
           # I mount my primary device.  I could have mounted
           # /dev/sdc to /mnt and got the same listing.
           
           :~# btrfs device usage /mnt
           /dev/sda1, ID: 1
              Device size:           691.19GiB
              Device slack:              0.00B
              Data,single:            64.00GiB
              Metadata,RAID1:          3.00GiB
              Unallocated:           624.19GiB
           
           /dev/sdc, ID: 2
              Device size:           698.64GiB
              Device slack:              0.00B
              Data,single:            65.00GiB
              Metadata,RAID1:          3.00GiB
              System,single:          32.00MiB
              Unallocated:           630.61GiB
          Also, even though the data and system of the pool are converted from RAID1 to SINGLE, both drives are listed as part of the pool because they still are in the pool, acting as a single drive. Converting to single doesn't take a drive out of the pool, it merely changes how data is stored on the drives. Deleting the drive takes it out of the pool, as you've shown.

          But perhaps I'm just emphasizing the obvious.
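
          For anyone wanting to keep metadata at RAID1 like I did, the balance command is the same as in post #3 but with the "-mconvert" part left out, roughly (mount point assumed to be wherever your pool is mounted):

          Code:
          btrfs balance start -f -sconvert=single -dconvert=single /mnt
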
          Last edited by GreyGeek; Apr 23, 2018, 06:46 PM. Reason: added ", acting as a single drive"


            #6
            Originally posted by GreyGeek View Post
            Also, even though the data and system of the pool are converted from RAID1 to SINGLE, both drives are listed as part of the pool because they still are in the pool, acting as a single drive. Converting to single doesn't take a drive out of the pool, it merely changes how data is stored on the drives. Deleting the drive takes it out of the pool, as you've shown.
            Good information and this reinforces the power and flexibility of BTRFS. To me, this illustrates why BTRFS is clearly the best file system available. Just like Linux in general, the user has control to mold their filesystem to their own needs and wishes.



              #7
              Phase 2, Steps 5 and 6:

              OK, new drive installed;

              Code:
              smith@server:~$ sudo btrfs fi sh /mnt/pool
              Label: none  uuid: 8c45e18c-4781-4461-86f8-721d8bc33c0c
                      Total devices 1 FS bytes used 384.00KiB
                      devid    1 size 9.10TiB used 2.02GiB path /dev/sdd
              
              smith@server:~$ sudo btrfs fi df /mnt/pool
              Data, single: total=8.00MiB, used=256.00KiB
              System, DUP: total=8.00MiB, used=16.00KiB
              Metadata, DUP: total=1.00GiB, used=112.00KiB
              GlobalReserve, single: total=16.00MiB, used=0.00B
              smith@server:~$ 
              
              Notice the 9.10TiB size: a TB-to-TiB conversion works out to roughly 9% less, so this 10TB drive actually has only 9.1TiB of space! Seems really dishonest, doesn't it?

              Anyway, you can also see the single device btrfs filesystem has "Metadata, DUP" with total=1.00GiB; since DUP stores two copies, that accounts for the 2.02GiB "used" shown in the "fi sh" output above.
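
              The formatting step isn't shown above; on a single rotational drive, the mkfs.btrfs defaults give exactly this DUP metadata layout, so it would have been something like (device name and mount point taken from the output above):

              Code:
              mkfs.btrfs /dev/sdd       # DUP metadata/system is the default on a single spinning drive
              mkdir -p /mnt/pool
              mount /dev/sdd /mnt/pool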

              I only ran a short "smart" test on the drive so far. Kind of neat to see "1" in the Power_on_hours and Power_cycle_count fields. A full "long" test will take 18+ hours!
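
              For anyone following along, the SMART tests come from the smartmontools package; roughly:

              Code:
              smartctl -t short /dev/sdd     # short self-test, finishes in a couple of minutes
              smartctl -t long /dev/sdd      # extended self-test, many hours on a drive this size
              smartctl -a /dev/sdd           # view attributes like Power_On_Hours when the test is done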

              When that is complete, I will migrate all 5.5T of data to the new drive. This will take some time!
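
              The migration will presumably follow the same send|receive pattern as the backups in post #2, one subvolume at a time; a sketch, sending one of the read-only copies already sitting on pool1:

              Code:
              btrfs send /mnt/pool1/@Documents_ro | btrfs receive /mnt/pool/
              # then re-snapshot it read-write on the new drive
              btrfs subvolume snapshot /mnt/pool/@Documents_ro /mnt/pool/@Documents
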
              Last edited by oshunluvr; Apr 27, 2018, 09:22 AM.

