Diary and notebook of whatever tech problems are irritating me at the moment.


Simple off-site backup of an MD RAID 1 system

Standard backup tools like BackupPC are great for backing up moderate amounts of user data, but they can be impractical with huge data stores such as multi-terabyte RAID arrays, because they need a backup store that is larger than the source data. My simple solution is to clone the array onto another drive and store that drive off-site.

For this to work I had to split the data into smaller dynamic files (like documents) and larger static files (videos). The smaller files are backed up daily with BackupPC. The larger files are not backed up by BackupPC. Both are stored on a RAID 1 (mirror) array for redundancy in case of drive failure. On my server BackupPC uses a different, smaller RAID 1 array as its backup store. Since it is only backing up part of the data, that array doesn't have to be the same size as the main array. To back up the larger static files (and everything else), I simply add another drive to the main array, let it sync, then remove it and store it off-site.

Ideally this system would use hot-swap, but I don't have removable bays, so I have to power off the server each time. The rest of the procedure is relatively easy. With a RAID 1 array I have two drives (sda, sdb), and the added drive may show up as sdc. I say "may" because Ubuntu uses UUIDs for drive mappings and the actual device assignments can change between boots. I always check with:

cat /proc/mdstat

to verify which devices are being used. I also check the partition sizes of all drives using "fdisk -l" and make sure the new drive has partitions of the same size as the original RAID members. The partitions need to be of type fd ("Linux raid autodetect"), but no formatting with mkfs is necessary. Next I grow each RAID 1 MD device from 2 to 3 devices. For example:

mdadm -G -n 3 /dev/md0

This just tells the kernel that the array will now have three devices but does not assign another device to it. To allocate the device:

mdadm -a /dev/md0 /dev/sdc1

Resync should begin immediately. To monitor it, I just use "cat /proc/mdstat", but the kernel will also send status messages to the console. After the resync finishes, I disable the backup device by failing it:

mdadm -f /dev/md0 /dev/sdc1

This causes RAID degradation warnings to be emailed to root. Next I remove it:

mdadm -r /dev/md0 /dev/sdc1

Finally, I shrink the array back to two devices:

mdadm -G -n 2 /dev/md0

This works well for my simple server setup, and some scripting could obviously be used to automate it. It suits a 2-drive RAID 1 array, but it doesn't scale well to a larger number of drives or to other RAID types.
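
As a rough sketch of that automation, the steps above could be wrapped in a shell script along these lines. The function names (run, md_synced, clone_to_backup), the DRY_RUN switch and the 60-second polling interval are all assumptions made up for illustration, and the script has not been run against a real array. It uses the long forms of the same mdadm options (--grow, --raid-devices, --add, --fail, --remove) and decides that the resync is finished by reading /proc/mdstat. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: helper names, DRY_RUN and the poll interval are assumptions.

# run CMD...: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = 0 ] || return 0
    "$@"
}

# md_synced NAME [MDSTAT_FILE]
# Returns 0 if every member of the array is up and no rebuild is running,
# 1 if the array is degraded or rebuilding, 2 if the array is not listed.
md_synced() {
    array=$1
    mdstat=${2:-/proc/mdstat}
    # Keep only this array's stanza: its "mdX :" line up to the next blank line.
    stanza=$(awk -v a="$array" \
        '$1 == a && $2 == ":" {p = 1} p && NF == 0 {p = 0} p' "$mdstat")
    [ -n "$stanza" ] || return 2
    case $stanza in
        *recovery*|*resync*) return 1 ;;
    esac
    # A member map such as [UU_] means at least one slot is not in sync.
    if printf '%s\n' "$stanza" | grep -q '\[U*_[U_]*]'; then
        return 1
    fi
    return 0
}

# clone_to_backup NAME PARTITION, e.g. clone_to_backup md0 /dev/sdc1
clone_to_backup() {
    name=$1
    part=$2
    md=/dev/$name
    run mdadm --grow "$md" --raid-devices=3 || return 1
    run mdadm "$md" --add "$part" || return 1
    # Poll /proc/mdstat until the new member is fully synced (real runs only).
    if [ "${DRY_RUN:-1}" = 0 ]; then
        while :; do
            rc=0
            md_synced "$name" || rc=$?
            case $rc in
                0) break ;;
                2) echo "$name not found in /proc/mdstat" >&2; return 1 ;;
            esac
            sleep 60
        done
    fi
    run mdadm "$md" --fail "$part" || return 1
    run mdadm "$md" --remove "$part" || return 1
    run mdadm --grow "$md" --raid-devices=2
}
```

Calling "clone_to_backup md0 /dev/sdc1" as-is just lists the five mdadm commands in order. Setting DRY_RUN=0 (as root) would actually run them, so it is worth checking the device names against /proc/mdstat first.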