Saturday, March 09, 2013

protecting your data, locally and globally

I've spent a number of years trying various backup methods for my Linux box.  I think I finally have a pretty good one down.  The main idea is to set up my system so that backup and restore are easier.  This setup involves two components:
- a data source: my documents, videos, audio files and pictures are stored locally on a redundant hardware RAID5 set
- the separation of system and data partitions on different physical devices (hard drives)

Backup Strategy
My actual backup strategy comes in three parts:
- an archive solution: fsarchiver to back up the system and data partitions
- network backup: a network backup device such as a NAS or in my case, a Drobo
- global backup storage solution: unlimited CrashPlan account

This backup strategy has been working well, though not without hiccups along the way.  It protects my important data by providing redundancy at multiple levels.  At a high level, here is how this is implemented:
- disk redundancy via RAID5 set
- local redundancy via network-attached storage
- global redundancy via CrashPlan if my house is destroyed

More specifically:
- my Linux system drive, running Fedora 17, is one physical SATA drive
- my data drive is a hardware RAID5 unit using a 3Ware 9650SE with four physical SATA drives
- when I install a new OS, I use symbolic links in my user's home directory to point to my data, as explained below

The bottom line is that no single backup method should be your entire backup strategy.  If you have even two of these methods implemented, you're better off than most people.

System Setup
Symbolic links are the key to segmenting the system partitions from your data partitions.  Segmenting is important because it keeps your data on its own drive, apart from the operating system.  With this separation, it is much easier (from a Linux perspective) to upgrade and try different versions of Linux on your system drive, while your data drive stays essentially untouched and less prone to upgrade or experimentation tragedies.  Welcome to Linux!

Technically, here is how segmentation is implemented.  On my system drive, I mount my data partition.  In this example, I'm using /mnt as the mount point for my ext4 data partition:
[sodo@computer ~]$ cat /etc/fstab
/dev/mapper/vg_computer-lv_root /                    ext4    defaults        1 1
/dev/mapper/vg_computer-lv_home /home                ext4    defaults        1 2
/dev/mapper/vg_computer-lv_swap swap                 swap    defaults        0 0
/dev/mapper/vg_ogre-lv_root /mnt                     ext4    defaults        0 0
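
Since everything that follows depends on the data partition actually being mounted, a quick check can be handy before creating links or starting a backup.  This is only a sketch: is_mounted is a made-up helper name, and it assumes the kernel's mount table is readable at /proc/mounts (util-linux also ships a mountpoint command for the same purpose).

```shell
# Sketch: succeed if the given directory is listed as a mount point.
# is_mounted is a hypothetical helper; the optional second argument
# (a mount table file) exists only to make the function easy to try out.
is_mounted() {
    mounts_file="${2:-/proc/mounts}"
    # Field 2 of each mount table line is the mount point.
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "$mounts_file"
}

# Usage (pass the path without a trailing slash):
# is_mounted /mnt || echo "data partition is not mounted" >&2
```

Mount points containing spaces are escaped in /proc/mounts, so this simple comparison is only meant for plain paths like /mnt.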

I have my content folders on the data partition that I will symbolically link from my system drive:
[sodo@computer mnt]$ ls -ltr /mnt
total 36
drwxr-xr-x. 16 sodo sodo 4096 Dec 21 19:04 MusicLibrary
drwxrwxr-x.  9 sodo sodo 4096 Jan 12 12:07 videos
drwxr-xr-x.  8 sodo sodo 4096 Feb  1 10:57 doc
drwxrwxr-x.  8 sodo sodo 4096 Mar  2 09:26 pictures

I then create symbolic links from my user's home directory to the equivalent directories that I've set up on my data partition:
[sodo@computer ~]$ ls -l | grep '^l'
lrwxrwxrwx.  1 sodo sodo    8 Oct 23  2011 Documents -> /mnt/doc
lrwxrwxrwx.  1 sodo sodo   18 May 30  2011 Music -> /mnt/MusicLibrary/
lrwxrwxrwx.  1 sodo sodo   14 Sep  2  2011 Pictures -> /mnt/pictures/
lrwxrwxrwx.  1 sodo sodo   12 May 28  2011 videos -> /mnt/videos/

With the symbolic links in place, I've made the link from my system to my data.
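
For anyone recreating these links after a fresh install, here is a minimal sketch.  link_data_dirs is a hypothetical helper name, and the link/directory pairs simply mirror the listing above; adjust both to your own layout.

```shell
# Sketch: point home-directory names at folders on the data partition.
# link_data_dirs is a hypothetical helper, not a standard tool.
link_data_dirs() {
    data_root="$1"   # e.g. /mnt
    home_dir="$2"    # e.g. /home/sodo
    # Each entry is "link name:directory on the data partition".
    for pair in Documents:doc Music:MusicLibrary Pictures:pictures videos:videos; do
        link="$home_dir/${pair%%:*}"
        target="$data_root/${pair#*:}"
        if [ ! -d "$target" ]; then
            echo "skipping $link: $target not found (is the data partition mounted?)" >&2
            continue
        fi
        if [ -d "$link" ] && [ ! -L "$link" ]; then
            # A fresh install may have created a real directory here;
            # leave it alone rather than linking inside it.
            echo "skipping $link: a real directory is in the way" >&2
            continue
        fi
        # -s: symbolic link, -f: replace an existing link,
        # -n: treat an existing link to a directory as a file to replace
        ln -sfn "$target" "$link"
    done
}

# Usage:
# link_data_dirs /mnt "$HOME"
```

The function skips anything it is unsure about instead of deleting it, so a real ~/Documents created by the installer has to be moved out of the way by hand first.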

The Archive-Backup Process
I am going to use the terms "archive" and "backup" synonymously.  In overview, I'll show how I back up my data partition using fsarchiver and then copy those backups to both my local network and global storage solutions.  FSarchiver is a bit different from regular backup systems in that it archives entire filesystems only.  It is not a file-based backup method.  So when you save or restore a filesystem, you specify a filesystem to back up, and your only control over its contents is the "exclude" switch, which can exclude (but not include) directories or files (as of 5/2016).

Below, I show the archive process for the data partition.  Feel free to extrapolate the information herein to do the same for your system partition.

The Core Archive Process
1) check the used space on the source partition to be archived, as well as the available space on the destination/target for the backup.  From the example below:
a. source filesystem (the data partition): vg_ogre-lv_root
b. destination partition (for backup storage): vg_computer-lv_home
[sodo@computer ~]$ df -H
Filesystem                       Size  Used Avail Use% Mounted on
/dev/sda1                        508M   80M  403M  17% /boot
devtmpfs                         5.3G     0  5.3G   0% /dev
/dev/mapper/vg_computer-lv_root   53G   15G   36G  30% /
/dev/mapper/vg_computer-lv_home  2.0T   .2T  1.8T  67% /home
/dev/mapper/vg_ogre-lv_root      4.5T  1.4T  2.9T  32% /mnt

2) verify available space on the destination filesystem
The source partition is using 1.4TB on vg_ogre-lv_root and I have 1.8TB available on the destination for the backup, vg_computer-lv_home.  So, good to go.

3) if there is enough space on the destination filesystem, prepare to run fsarchiver by unmounting the data partition.
To keep the filesystem from being updated during the archive process, fsarchiver asks you to unmount the source filesystem (the one being archived) before making the backup.  Like so:
[sodo@computer ~]$ sudo umount /mnt/

The nice thing about the split system-data partition setup is that it is unnecessary to boot a Live CD in order to back up the data partition.  Normally, one has to boot from a Live CD in order to back up the system partition, since it cannot be unmounted while the system is running from it.

4) Once the filesystem is unmounted, run fsarchiver.  As one of the destinations for the backup is the cloud, use the -c option to encrypt with a password (-j8 runs eight compression threads and -o overwrites an existing archive file):
[sodo@computer ~]$ sudo fsarchiver -j8 -c [password] -o savefs ~/f17backup/backup_lv_root.fsa /dev/mapper/vg_ogre-lv_root

The archive of my 1.2TB of data took five hours on my eight-core, 1.6GHz Dell SC1430.
[sodo@computer ~]$ ll ~/backup/backup_lv_root.fsa 
-rw-r--r--. 1 root root 1186229328445 Mar  5 08:32 /home/sodo/backup/backup_lv_root.fsa
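
The four steps above can be sketched as a couple of small shell functions.  Treat this as an illustration under assumptions rather than my exact procedure: the function names and the DRY_RUN switch are invented for the example, the remount relies on the /etc/fstab entry shown earlier, and "-c -" is used so that fsarchiver prompts for the password instead of taking it on the command line.

```shell
# Sketch of steps 1-4. used_kb, avail_kb, enough_space,
# archive_data_partition and DRY_RUN are hypothetical names.

used_kb()  { df -Pk "$1" | awk 'NR==2 {print $3}'; }   # KB used on a mounted filesystem
avail_kb() { df -Pk "$1" | awk 'NR==2 {print $4}'; }   # KB free on a mounted filesystem

enough_space() {
    # $1 = KB used on the source, $2 = KB free on the destination.
    # The archive is compressed, so used space is a conservative estimate.
    [ "$1" -lt "$2" ]
}

archive_data_partition() {
    device="$1"; mount_point="$2"; archive="$3"
    run=""
    # With DRY_RUN=1 each command is printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then run="echo"; fi

    # Step 3: unmount so the filesystem cannot change during the archive.
    $run sudo umount "$mount_point" || return 1

    # Step 4: -j8 compression threads, -o overwrite an existing archive,
    # "-c -" prompt for the encryption password.
    $run sudo fsarchiver -j8 -o -c - savefs "$archive" "$device"
    rc=$?

    # Remount using the /etc/fstab entry, whether or not the archive succeeded.
    $run sudo mount "$mount_point"
    return "$rc"
}

# Usage (run the space check while /mnt is still mounted):
# enough_space "$(used_kb /mnt)" "$(avail_kb /home)" &&
#     archive_data_partition /dev/mapper/vg_ogre-lv_root /mnt ~/f17backup/backup_lv_root.fsa
```

Running it once with DRY_RUN=1 prints the three sudo commands without touching anything, which is a cheap way to confirm the device and paths before committing to a five-hour run.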

Copying the file to network-based storage
I have a 2.5TB CIFS share (a Windows share) created on my Drobo.  On my Linux box:
1) I mount the Drobo filesystem:
[sodo@computer ~]$ sudo mount -t cifs //drobo/Linux /mnt/drobo -o credentials=/home/sodo/smb.credentials

2) Copy the archive to it:
[sodo@computer ~]$ sudo cp -rp ~/f17backup/ /mnt/drobo/

Copying the file over my home network took about 16 hours.

3) Review the archive info (the -c switch supplies the archive password; passing "-" makes fsarchiver prompt for it):
[sodo@computer ~]$ fsarchiver -c - archinfo /mnt/drobo/f17backup/backup_vg_ogre-lv_root.fsa
Enter password: 
====================== archive information ======================
Archive type: filesystems
Filesystems count: 1
Archive id: 5131fa94
Archive file format: FsArCh_002
Archive created with: 0.6.15
Archive creation date: 2013-03-05_01-20-04
Archive label:
Minimum fsarchiver version: 0.6.4.0
Compression level: 3 (gzip level 6)
Encryption algorithm: blowfish

===================== filesystem information ====================
Filesystem id in archive: 0
Filesystem format: ext4
Filesystem label:
Filesystem uuid: 3ffe7328-8f96-4028-bd79-5f644a030fc2
Original device: /dev/mapper/vg_ogre-lv_root
Original filesystem size: 4.02 TB (4416677830656 bytes)
Space used in filesystem: 1.22 TB (1338969956352 bytes)
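
The archinfo output shows the archive's metadata.  To also confirm that the copy on the Drobo matches the local file byte for byte, one option is to compare checksums.  The sketch below assumes GNU coreutils' sha256sum is installed; copy_is_intact is a hypothetical helper name, and on a file this size each checksum pass will itself take a long time.

```shell
# Sketch: compare SHA-256 checksums of the local archive and its copy.
# copy_is_intact is a hypothetical helper, not part of fsarchiver.
copy_is_intact() {
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "cannot read $1 or $2" >&2
        return 2
    fi
    # Reading from stdin keeps the file name out of sha256sum's output.
    src_sum=$(sha256sum < "$1" | awk '{print $1}')
    dst_sum=$(sha256sum < "$2" | awk '{print $1}')
    [ "$src_sum" = "$dst_sum" ]
}

# Usage (paths follow the savefs and cp commands above):
# copy_is_intact ~/f17backup/backup_lv_root.fsa /mnt/drobo/f17backup/backup_lv_root.fsa \
#     && echo "copy verified"
```

The function returns 0 when the files match, 1 when they differ and 2 when either file cannot be read, so it can be dropped into a larger script.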


A Word About the Drobo
The Drobo has been one of the easiest storage and backup solutions that I've ever used.  It integrates seamlessly with Time Machine for my MacBook and I've created a Windows share on the device in order to copy my Linux archive.

Over the past few years, I've expanded the drives within it about three times now.  I went from four 500GB drives, to four 1TB drives and one 500GB drive, to my configuration as it is today, five 1TB drives.  The drive upgrades were easy, though time-consuming.  I removed each of the older drives one at a time and replaced them with the larger drives.  Each time a drive was upgraded, the Drobo would automatically and non-destructively rebuild its storage protection.  The Drobo's storage protection is called BeyondRAID, Drobo's own custom algorithm on top of RAID.

The integration with Mac is seamless; however, the Windows/CIFS file share can be a bit wonky, as the share has a tendency to become unavailable for whatever reason.  Shutting down and restarting the Drobo seems to fix the problem.

Cloud-Based Storage
A last layer of protection above and beyond the local and network copies of my data is to copy the encrypted archive to a cloud-based solution.  The purpose is to protect my data in case of a natural disaster that destroys all my local storage media.  With the increasing number of natural and man-made disasters happening these days, I've recently invested in a data protection plan with www.crashplan.com.  I got an unlimited plan to store my 1.2TB of data in the cloud.  Most of that data is audio, video and image files.

After tweaking the CrashPlan app to pump more data through my local and wide area network (from 1280KB and 2560KB, respectively, to about 6400KB for both), it took about two weeks to upload this amount of data to CrashPlan's cloud!
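
To put that upload time in perspective, here is a small back-of-the-envelope calculation.  It assumes the upload was the 1,186,229,328,445-byte archive from the listing earlier in the post and that "about two weeks" means exactly 14 days of continuous transfer, so the result is a rough average and nothing more.  avg_bytes_per_sec is a made-up helper name.

```shell
# Sketch: average transfer rate in bytes per second (integer division).
# avg_bytes_per_sec is a hypothetical helper.
avg_bytes_per_sec() {
    # $1 = bytes transferred, $2 = elapsed seconds
    echo $(( $1 / $2 ))
}

archive_bytes=1186229328445           # archive size from the ls listing
two_weeks=$(( 14 * 24 * 60 * 60 ))    # assumption: exactly 14 days, in seconds
avg_bytes_per_sec "$archive_bytes" "$two_weeks"   # prints 980679
```

That works out to a little under 1MB per second, or roughly 7.8 megabits per second.  Feeding the same function the 16 hours (57600 seconds) of the Drobo copy gives about 20.6MB per second, which shows how much slower the trip to the cloud is than the trip across the house.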

In sum, if you have a lot of data, all these procedures take time: backing up, copying over the local network and then copying to the cloud.  If you're using Linux, you probably have the stomach for all this.  In the end, though, you'll have a backup solution that is pretty solid and relatively easy to implement, unlike custom scripted solutions.

I'd love to hear any comments on how you back up your systems.

ciao!
TAG

References
http://crazedmuleproductions.blogspot.com/2010/02/fsarchiver-good-backup-for-ext4.html
http://www.drobo.com/how-it-works/beyond-raid.php
http://support.crashplan.com/doku.php/recipe/stop_and_start_engine

Here are some of my earlier articles on fsarchiver and a review of the Drobo.
