BTRFS incremental backups - Awesome time and space saver

    I have a bunch of subvolumes on my server and about 5TB of data in them. Doing a full set of backups is time-consuming, to say the least! Luckily, BTRFS has an "incremental" backup feature. Basically, it allows you to "send" only the difference between the current snapshot and the previous one. Confused? Here's a better explanation:

    I have a subvolume called @movies for my Plex server. To make an initial backup using btrfs send|receive, you must make a read-only snapshot, then send|receive it to another btrfs file system. My @movies subvolume is in /mnt/pool and my backup drive is mounted at (oddly enough) /mnt/backup. I am keeping my backup snapshots in /mnt/pool/snapshots because a snapshot must reside on the same filesystem as the source subvolume - but not necessarily in the same folder. Putting them in their own folder keeps things tidier.

    Since these btrfs commands require root access, I use a root session to do these tasks (remember, I'm doing this for almost 2 dozen subvolumes!). Here's how I made my initial snapshot and backup:

    sudo -i
    btrfs subvolume snapshot -r /mnt/pool/@movies /mnt/pool/snapshots/@movies_backup

    I now have a read-only snapshot (the -r option in the snapshot command), so I can "send" it to the backup file system.

    btrfs send /mnt/pool/snapshots/@movies_backup | btrfs receive /mnt/backup
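With almost two dozen subvolumes, the two steps above could be wrapped in a small function and looped. This is only a sketch, not the script actually used here: the function name full_backup, the DRY_RUN switch and the @music subvolume are made up for illustration, and by default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: initial read-only snapshot plus full send for several subvolumes.
# DRY_RUN defaults to 1, so the commands are only printed.
# Set DRY_RUN=0 in a root session to actually execute them.
POOL=/mnt/pool
SNAPDIR=$POOL/snapshots
DEST=/mnt/backup

full_backup() {
    name=$1                          # subvolume name, e.g. @movies
    snap=$SNAPDIR/${name}_backup     # snapshot lives on the same filesystem
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "btrfs subvolume snapshot -r $POOL/$name $snap"
        echo "btrfs send $snap | btrfs receive $DEST"
    else
        # Only send if the read-only snapshot was created successfully.
        btrfs subvolume snapshot -r "$POOL/$name" "$snap" &&
            btrfs send "$snap" | btrfs receive "$DEST"
    fi
}

# @music is a hypothetical second subvolume, here only to show the loop.
for sv in @movies @music; do
    full_backup "$sv"
done
```

Running it as-is just lists what would be done, which makes it easy to check the paths before letting it loose on 5TB of data.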

    Note the "receive" command only wants the target folder. The received subvolume will have the same name as the sent one. Once this completes, I have a full copy of @movies in read-only format as @movies_backup. I did this last month. During the last few weeks I added 11 more movies to my collection and I wanted to back them up also. The full backup took several hours to complete because there are several hundred HD and 4K movies already in my collection. Luckily I can just back up the changes to the subvolume without doing a full copy.

    To do this I need another read-only snapshot:

    btrfs su sn -r /mnt/pool/@movies /mnt/pool/snapshots/@movies_backup1

    Note that btrfs commands can be abbreviated to the shortest unambiguous prefix, so su = subvolume and sn = snapshot. Now I have two snapshots in my folder: @movies_backup and @movies_backup1. A quick file count shows @movies_backup contains 212 files and @movies_backup1 contains 223.
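For anyone wanting to repeat the file count, one way to do it (an assumption, the post doesn't say how the count was made) is a small find/wc helper. The name count_files is made up.

```shell
# Count regular files under a directory, for example a snapshot.
# File names containing newlines would be miscounted; fine for a quick check.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Example with the paths from the post:
#   count_files /mnt/pool/snapshots/@movies_backup
#   count_files /mnt/pool/snapshots/@movies_backup1
```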

    Here's the really cool part about btrfs and this snapshot thing: these two snapshots and the source subvolume are sharing data. When I compare the contents of @movies_backup1 to @movies_backup, it looks like there are 212 copies of the same movies plus 11 unique ones, but in actuality only one copy of each duplicated file exists. For example, the movie 'The Magnificent Seven (1960).mp4' is listed in both backup subvolumes and the original subvolume, yet it only exists once on the drive. All three subvolumes have their own metadata (file catalog info) but share file data. Anyway, back to the operation:

    Now I can send only the difference between the two snapshots to the backup file system:

    btrfs send -p /mnt/pool/snapshots/@movies_backup /mnt/pool/snapshots/@movies_backup1 | btrfs receive /mnt/backup

    This sends only the difference between the two snapshots (an incremental send; the -p option names the parent snapshot to compare against) to /mnt/backup as @movies_backup1. Because I added only 11 files, this completed in a few minutes rather than the hours required for the first backup. Now /mnt/backup contains both subvolumes @movies_backup and @movies_backup1. Just like the source snapshots, @movies_backup contains 212 files and @movies_backup1 contains 223 - BUT THESE ALSO ARE SHARING THE DATA SPACE!
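If this gets repeated regularly, picking the parent by hand gets old. Here is a rough sketch of how the parent could be found automatically. It assumes snapshots are named with a date suffix such as @movies_2018-07-01, which is NOT the naming used in this post, and the function names latest_snapshot and incremental_cmd are made up. It only prints the send command rather than running it.

```shell
# Print the newest snapshot in directory $1 whose name starts with "$2_".
# Assumes a hypothetical YYYY-MM-DD suffix, so a plain sort orders the
# names from oldest to newest.
latest_snapshot() {
    for s in "$1/$2"_*; do
        if [ -d "$s" ]; then
            printf '%s\n' "$s"
        fi
    done | sort | tail -n 1
}

# Print the incremental send for parent $1, new snapshot $2, target $3.
incremental_cmd() {
    printf 'btrfs send -p %s %s | btrfs receive %s\n' "$1" "$2" "$3"
}

# Example (run before taking the new snapshot, so the newest one found
# is the parent):
#   parent=$(latest_snapshot /mnt/pool/snapshots @movies)
#   incremental_cmd "$parent" /mnt/pool/snapshots/@movies_2018-08-01 /mnt/backup
```

The parent snapshot has to exist on both the source and the backup file system for the incremental receive to work, so whatever is picked here must already have been sent once.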

    Now I can delete the first snapshot (@movies_backup) from both /mnt/pool/snapshots and /mnt/backup if I wish or leave them - because they are not actually using additional space!
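To see the sharing for yourself rather than take it on faith, btrfs filesystem du can summarize how much of each snapshot is exclusive and how much is shared. A minimal sketch, with a made-up wrapper name (show_sharing) that just checks the paths exist first:

```shell
# Summarize space usage for one or more snapshots.
# Refuses to run if any argument is not a directory.
show_sharing() {
    for snap in "$@"; do
        if [ ! -d "$snap" ]; then
            echo "not a directory: $snap" >&2
            return 1
        fi
    done
    # -s prints one summary line per path instead of one line per file.
    btrfs filesystem du -s "$@"
}

# Example with the paths from the post (root session):
#   show_sharing /mnt/pool/snapshots/@movies_backup /mnt/pool/snapshots/@movies_backup1
```

If the snapshots really share their data, the exclusive figure for the older one should be small compared with its total.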

    I must keep at least one previous backup snapshot in /mnt/pool/snapshots so I have something to compare the next one to, and the same goes for the receiving end - but since they take no extra space, I'm leaving them for now.
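If the old snapshots ever do need clearing out, something like the following could list the deletions while always keeping the newest snapshot as the next parent. Again this is a sketch under an assumed date-suffixed naming scheme (not the names used above), prune_cmds is a made-up name, and it only prints the commands so they can be reviewed first.

```shell
# Print a delete command for every snapshot of one subvolume except the
# newest, which is kept as the parent for the next incremental send.
# Assumes hypothetical names like @movies_2018-07-01 that sort by date.
prune_cmds() {
    dir=$1
    prefix=$2
    for s in "$dir/$prefix"_*; do
        if [ -d "$s" ]; then
            printf '%s\n' "$s"
        fi
    done | sort | sed '$d' | while IFS= read -r old; do
        # sed '$d' dropped the last (newest) name from the sorted list.
        printf 'btrfs subvolume delete -c %s\n' "$old"
    done
}

# Example: review the output for both ends before running anything.
#   prune_cmds /mnt/pool/snapshots @movies
#   prune_cmds /mnt/backup @movies
```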

    #2
    Sweet!

    And even while the @movies_backup subvolume is being sent to your backup HD, which took hours, you can continue to use your system without any significant loss of performance. Inadvertently delete a movie from the server? Use Dolphin (or mc) to drag and drop the movie out of the @movies_backup subvolume onto the server. That's right, you can browse those subvolumes just like they were folders, because in a sense they are.

    (File systems I won't be returning to: EXT4, EXT3, EXT2, ReiserFS or DOS)
    Last edited by GreyGeek; Jul 06, 2018, 08:32 PM.
    "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
    – John F. Kennedy, February 26, 1962.


      #3
      Especially in cases where you use any kind of automation (cron jobs, for example), you should probably include some sync commands before sending to make sure everything is written to disk, or you might get errors when sending.

      (This is according to https://btrfs.wiki.kernel.org/index....emental_Backup. I'm not 100% sure whether it's still relevant, but it can't hurt.)
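In a script, that could be as simple as a tiny wrapper that flushes pending writes before running whatever follows. A minimal sketch; the name sync_then is made up:

```shell
# Flush pending writes to disk, then run the given command,
# but only if sync itself succeeded.
sync_then() {
    sync && "$@"
}

# In a cron script, with the paths from the first post:
#   sync_then btrfs send -p /mnt/pool/snapshots/@movies_backup \
#       /mnt/pool/snapshots/@movies_backup1 | btrfs receive /mnt/backup
```

The pipe to btrfs receive stays outside the wrapper, so only the send side waits for the sync.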


        #4
        All the documentation I’ve read suggests using sync after creating or deleting a subvolume. The btrfs subvolume delete command takes either a “-c” or “-C” option to commit the deletion. btrfs subvolume also has a sync subcommand:

        “sync <path> [subvolid…]
        Wait until given subvolume(s) are completely removed from the filesystem after deletion. If no subvolume id is given, wait until all current deletion requests are completed, but do not wait for subvolumes deleted meanwhile. The status of subvolume ids is checked periodically.
        Options
        -s <N>
        sleep N seconds between checks (default: 1)”

        In some situations there is a definite pause after issuing just a sync command at the prompt, while at other times the prompt returns immediately.


          #5
          I always use the -c option when deleting a subvolume. I've never run the sync command, nor had any issues caused by not using it. That wiki page is very old. The issue that (reportedly) occurs is a "stale file handle" error unless the subvolume is physically on the disk (not still a cached operation). I've never seen this error. Most of the references to it are circa 2014, so maybe it's been solved.

          Still, I agree if you're automating a script it can't hurt.



            #6
            Originally posted by oshunluvr
            Still, I agree if you're automating a script it can't hurt.
            Scripts run commands quite a bit faster than a typical human can manually, which might explain why you haven't seen any errors (of course, the issues might also have been fixed... perhaps the btrfs tools now run sync before exiting and releasing the command line). It usually doesn't take too long for disk writes to settle, although it obviously depends on the size of the operation and how busy the system is.


              #7
              Originally posted by oshunluvr
              I always use the -c when deleting a subvolume. I've never done the sync command nor had any issues caused by not using it. ...
              “Options
              -c|--commit-after
              wait for transaction commit at the end of the operation
              -C|--commit-each
              wait for transaction commit after deleting each subvolume”
              Last edited by GreyGeek; Jul 08, 2018, 03:06 AM.