I decided to never let this happen again (since it's a huge time suck sitting around waiting for software to install) and began investigating free, yet solid and reliable backup solutions suitable for a home office. This post is an attempt to document the whole-disk backup and recovery solution that worked for me, using several freely available open-source tools.
NOTE: This backup and restore procedure has worked for me twice as of 6/11/09. See the FOLLOWUP in red at the bottom of this post for details.
For the sake of this HOWTO, I'm going to assume you are familiar with "Linux on a CD" distributions like SystemRescueCD or Knoppix. Nothing personal, but if you aren't familiar with Linux, the command line, or how to use a Linux on a CD distro then this HOWTO is probably going to feel a bit over your head. BTW, this HOWTO assumes the drive you want to back up is at /dev/sda. Your block device DSF (device special file) might be different.
This HOWTO is provided to you "as is", without warranty of any kind, express or implied. I am not responsible for data loss or hardware damage that occurs as a result of using these instructions. Use at your own risk.
1 - Boot into Linux on a CD
Pop in your favorite Linux on a CD distro and boot your PC accordingly. For the sake of this HOWTO I'm going to assume you're using SystemRescueCD. However, any decent Linux on a CD distribution should have all of the tools you'll need.
2 - Figure Out Where to Place the Backup
Before you do anything further, you should figure out where you are going to place your backups. Backups are usually quite big, so expect them to chew up a good 80-100 GB of storage in most cases. The better compression you use when making the backup, the less storage space you'll need.
In my case, I decided to put the backup on a large RAID-1 (mirror) volume I have in my home datacenter. The mirror is attached to another Linux box, so I need to mount the mirror volume via Samba (you could use NFS too if you want):
rescuecd#/> mkdir /mirror
rescuecd#/> /sbin/mount.cifs //192.168.1.109/mark /mirror -ouser=mark
Once my mirror is mounted via Samba, I can read and write data to /mirror which will directly pipe it to the box connected to my RAID-1 volume.
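Before kicking off a backup that runs for hours, it may be worth confirming that the destination really has room for it. Here is a minimal sketch, assuming a POSIX df; the helper name has_room and the 100 GB figure are my own illustration and not part of the original procedure:

```shell
# has_room DIR NEEDED_KB
# Succeeds when the file system holding DIR reports at least NEEDED_KB
# kilobytes available. (Hypothetical helper; the name is illustrative.)
has_room() {
    # df -P prints one header line plus one data line; -k reports 1K blocks.
    # Column 4 of the data line is the available space.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$2" ]
}

# Example: check for roughly 100 GB (104857600 KB) on the mounted mirror.
# has_room /mirror 104857600 && echo "enough space" || echo "not enough space"
```

If the check fails, it's cheaper to find out now than seven hours into the dd run.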
3 - Determine the Appropriate Block Size
For a quicker backup, it can help to nail down the optimal block size of the disk device you are going to back up. Assuming you are going to back up /dev/sda, here's how you can use the fdisk command to determine the best block size:
rescuecd#/> /sbin/fdisk -l /dev/sda | grep Units
Units = cylinders of 16065 * 512 = 8225280 bytes
Note the fdisk output says "cylinders of 16065 * 512". This means that there are 512 bytes per block (sector) on the disk. You can significantly improve the speed of the backup by setting the dd block size to 2 to 4 times that value. In this case, an optimal block size might be 1k (512*2) or 2k (512*4). BTW, getting greedy and using a block size of 5k (512*10) or something excessive won't help; eventually the system will bottleneck at the device itself and you won't be able to squeeze out any additional performance from the backup process.
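The arithmetic above can be scripted if you'd rather not eyeball it. This is only a sketch: the function names sector_size and candidate_block_sizes are made up for illustration, and the sample line is the fdisk output quoted in this step.

```shell
# sector_size: read fdisk's "Units" line on stdin and print the
# bytes-per-sector figure (the number between "*" and "=").
# (Hypothetical helper names; not part of fdisk.)
sector_size() {
    sed -n 's/^Units.*\* *\([0-9][0-9]*\) *=.*$/\1/p'
}

# candidate_block_sizes SECTOR_BYTES: print 2x and 4x the sector size.
candidate_block_sizes() {
    echo "$(( $1 * 2 )) $(( $1 * 4 ))"
}

units='Units = cylinders of 16065 * 512 = 8225280 bytes'
size=$(echo "$units" | sector_size)
echo "sector size: $size"                             # prints "sector size: 512"
echo "candidates: $(candidate_block_sizes "$size")"   # prints "candidates: 1024 2048"
```

On a live system you would feed it the real output instend of the sample string, e.g. /sbin/fdisk -l /dev/sda | grep Units | sector_size.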
4 - Back Up the Partition Layout
Before you do anything, it's always a good idea to back up the partition layout. When you create a whole disk backup, you don't have to worry about partitions. However, it can be handy to have this partition information (in case you need to mount a specific partition in the backup as a file using the exact offset). Use the sfdisk command to back up the partition layout:
rescuecd#/> sfdisk -d /dev/sda > /mirror/backup-sda.sf
Once you've backed up the partition layout, you can cat /mirror/backup-sda.sf to verify that you've correctly saved the partition mapping.
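The "exact offset" mentioned above is just the partition's start sector multiplied by the sector size. Here's a rough sketch of pulling it out of the saved dump. The helper name partition_offset is hypothetical, and it assumes 512-byte sectors (as fdisk reported in Step 3) and the usual "start=" field in sfdisk -d output:

```shell
# partition_offset DUMP_FILE PARTITION
# Print the byte offset of PARTITION inside a whole-disk image, using the
# "start=" sector recorded in an sfdisk -d dump.
# Assumption: 512-byte sectors. (Hypothetical helper; name is illustrative.)
partition_offset() {
    start=$(sed -n "s|^$2 *:.*start= *\([0-9][0-9]*\).*|\1|p" "$1")
    # Fail if the partition is not listed in the dump.
    [ -n "$start" ] || return 1
    echo $(( start * 512 ))
}

# Example (the image must be uncompressed for a loop mount to work):
# mount -o loop,ro,offset=$(partition_offset /mirror/backup-sda.sf /dev/sda2) \
#     /mirror/backup-sda.img /mnt
```

Note that /mirror/backup-sda.img in the example is a placeholder for a decompressed copy of the backup; the gzip'ed image itself can't be loop mounted.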
5 - Back Up the Master Boot Record (MBR)
Again, you don't need to explicitly do this since a whole disk backup includes the MBR, but it's a good idea to snag the master boot record just in case. To back up the MBR, you can use the dd command:
rescuecd#/> dd if=/dev/sda of=/mirror/backup-sda.mbr count=1 bs=512
If you want to prove to yourself that you've successfully saved the MBR, you can run file /mirror/backup-sda.mbr to confirm you got what you needed ...
rescuecd#/> file /mirror/backup-sda.mbr
backup-sda.mbr: x86 boot sector; partition 2: ID=0x83, active, starthead 1, \
startsector 63, 2104452 sectors; partition 3: ID=0x82, starthead 0, \
startsector 2104515, 4192965 sectors, code offset 0x48
Yep, the file command confirmed that we've successfully snagged the MBR of the disk. The MBR always sits in the first 512 bytes of any bootable disk.
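If the file command isn't handy, another way to sanity check the saved sector is to look for the 0x55 0xAA boot signature in its last two bytes. A minimal sketch, assuming a POSIX od; the helper name has_boot_signature is my own:

```shell
# has_boot_signature FILE
# Succeeds when bytes 510-511 of FILE are 0x55 0xAA, the boot signature
# that ends a valid MBR sector. (Hypothetical helper; name is illustrative.)
has_boot_signature() {
    # od: -An drops the address column, -tx1 prints single hex bytes,
    # -j510 skips to byte 510, -N2 reads two bytes.
    sig=$(od -An -tx1 -j510 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Example:
# has_boot_signature /mirror/backup-sda.mbr && echo "looks like an MBR"
```

This only checks the signature, so it's a quick smoke test rather than proof that the partition table inside is intact.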
6 - Run the Backup
Now that you've saved everything you need from the disk, it's time to make the backup. To create the backup, we'll use the dd command in conjunction with gzip (-9 for max compression). For dd, we'll use an optimal block size of 1024 (as determined in Step 3 of this HOWTO).
Warning, this will take a long time so it's probably best to let this run overnight. On my system at home, it took me 7+ hours to back up an entire 250 GB disk:
rescuecd#/> dd if=/dev/sda bs=1024 \
conv=noerror,sync | pv | \
gzip -c -9 > /mirror/backup-sda.gz
The conv=noerror,sync option asks dd to keep going even if there are any read errors with the disk and to pad every input block with NULs to match your input block size. Note that I'm using the pv command to monitor the speed and progress of data flowing between dd and gzip. The pv command will tell me how much data I've processed, how long the backup has been running, and the approx speed of my backup; essentially it displays a progress bar on the console.
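Once the backup finishes, it may be worth asking gzip to test the archive before you rely on it. This isn't part of the procedure I originally ran; it's a small sketch, and the helper name check_backup is hypothetical:

```shell
# check_backup FILE
# Succeeds only if FILE is a complete, readable gzip stream.
# (Hypothetical helper; the name is illustrative.)
check_backup() {
    # gzip -t decompresses the stream and checks its CRC without
    # writing any output; error messages are discarded here.
    gzip -t "$1" 2>/dev/null
}

# Example:
# check_backup /mirror/backup-sda.gz && echo "archive OK"
```

Expect this to take a while on a multi-gigabyte image, since the whole archive has to be read and decompressed.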
If you're saving the backup to a file system that doesn't handle large files (e.g., FAT32), you can use the split command to automatically split the gzip'ed backup into smaller chunks:
rescuecd#/> dd if=/dev/sda bs=1024 \
conv=noerror,sync | pv | \
gzip -c -9 | split -b 2048m - /mirror/backup-sda.gz.
Using the split command together with gzip will create /mirror/backup-sda.gz.aa, backup-sda.gz.ab, backup-sda.gz.ac, backup-sda.gz.ad, and so on. Each file will be as big as the -b argument of split. In this example, I'm splitting the backup into 2 GB (2048 MB) chunks.
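To restore from split chunks, the pieces have to be joined back into one gzip stream first. The following is a rehearsal on scratch files only (nothing here touches a real disk); the 64 KB stand-in image and the 16 KB chunk size are arbitrary values picked to keep the demo small:

```shell
# Rehearsal on scratch files only; nothing here touches a real disk.
work=$(mktemp -d)

# A small stand-in for the disk: 64 KB of random data.
dd if=/dev/urandom of="$work/disk.img" bs=1024 count=64 2>/dev/null

# Back up as in Step 6, but split into 16 KB chunks instead of 2048m.
dd if="$work/disk.img" bs=1024 conv=noerror,sync 2>/dev/null | \
    gzip -c -9 | split -b 16k - "$work/backup.gz."

# Restore: the shell expands backup.gz.* in sorted order (aa, ab, ac, ...),
# so cat rebuilds the original gzip stream before gunzip sees it.
cat "$work"/backup.gz.* | gunzip -c | dd of="$work/restored.img" bs=1024 2>/dev/null

# Confirm the restored copy is byte-for-byte identical to the original.
cmp "$work/disk.img" "$work/restored.img" && echo "restore matches original"
```

For the real thing, the same idea would look like cat /mirror/backup-sda.gz.* | gunzip -c | pv | dd of=/dev/sda bs=1024.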
7 - Re-compress with P7ZIP (if desired)
Gzip offers pretty decent compression, but if you want insanely awesome compression, you can use P7ZIP to compress your backups. After the dd to gzip backup is complete in Step 6 above, you can re-compress backup-sda.gz using P7ZIP if you'd like to save a little storage space. If so, here's how:
rescuecd#/> gunzip -c /mirror/backup-sda.gz | 7za a /mirror/backup-sda.7z -si
Again, be warned, this process will seem like it takes forever. However, using P7ZIP instead of Gzip saved me about 5 GB on the compressed backup. Using gzip -9 alone, I compressed a 250 GB backup image down to about 31 GB. With P7ZIP, the same backup was only 26 GB. P7ZIP is interesting because it sacrifices CPU cycles for compression, using a more exhaustive and complete compression algorithm. If you want more information on P7ZIP, check out Wikipedia's article on the 7z compression format.
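To put the gzip versus P7ZIP numbers in perspective, here is the savings worked out as a percentage. The helper name percent_saved is made up for this example:

```shell
# percent_saved BEFORE AFTER
# Print how much smaller AFTER is than BEFORE, as a whole-number percentage
# (integer division, so the result is rounded down).
# (Hypothetical helper; the name is illustrative.)
percent_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

percent_saved 31 26     # the gzip and P7ZIP sizes quoted above, in GB; prints 16
```

So re-compressing bought me roughly 16% on top of gzip -9. Whether that's worth the extra hours of CPU time depends on how tight your storage is.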
8 - Restore from Backup
Backups are useless unless you can actually restore your data. If you need to restore a P7ZIP compressed backup to /dev/sda, here's how:
rescuecd#/> 7za x /mirror/backup-sda.7z -so | dd of=/dev/sda bs=1024
If you decided to skip P7ZIP compression and need to restore a Gzip compressed backup to /dev/sda, here's how:
rescuecd#/> gunzip -c /mirror/backup-sda.gz | pv | \
dd of=/dev/sda bs=1024
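Since a restore overwrites the whole disk, it can be reassuring to confirm afterwards that what landed on the target matches the backup. This is a sketch rehearsed on scratch files standing in for /dev/sda; the helper name verify_restore is hypothetical and the step is not part of the procedure I originally ran:

```shell
# verify_restore BACKUP_GZ TARGET
# Succeeds when the decompressed backup is byte-for-byte identical to TARGET.
# (Hypothetical helper; the name is illustrative.)
verify_restore() {
    # cmp reads the decompressed stream on stdin ("-") and compares it to
    # TARGET; -s suppresses output, so only the exit status reports the result.
    gunzip -c "$1" | cmp -s - "$2"
}

# Rehearsal on scratch files; nothing here touches a real disk.
work=$(mktemp -d)
dd if=/dev/urandom of="$work/disk.img" bs=1024 count=64 2>/dev/null
dd if="$work/disk.img" bs=1024 conv=noerror,sync 2>/dev/null | \
    gzip -c -9 > "$work/backup.gz"
gunzip -c "$work/backup.gz" | dd of="$work/restored.img" bs=1024 2>/dev/null
verify_restore "$work/backup.gz" "$work/restored.img" && echo "restore verified"
```

Against a real disk the call would be verify_restore /mirror/backup-sda.gz /dev/sda. If dd had to pad the last block during the backup, the image could be slightly longer than the disk, so treat a mismatch as a reason to look closer rather than proof the restore failed.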
9 - Compression Tip (added 12/25/09)
For best performance, I strongly recommend zeroing out your disk before installing any OSes on the drive you plan to back up. For example, I recently upgraded my Vista Enterprise box to Windows 7 Professional. I backed up Vista using the instructions in this post, then used the "dd" command to zero out the disk before installing Windows 7:
rescuecd#/> dd if=/dev/zero of=/dev/sda bs=1024 conv=noerror,sync
This writes zeroes to the entire disk, essentially eliminating any random or stray data that's lingering at the end of the drive from a previous OS install. Then, once I install Windows 7 and back it up, the backup process will compress Windows 7, my data, and all installed applications. Eventually, it will hit the "rest of the disk", which is all zeroes. As a result, the backup run time is reduced, not to mention that the backup itself could be a fraction of the normal size. Most compression algorithms LOVE large streams of similar patterns; they're built for that. So if you give gzip, p7zip, or bzip2 some data followed by a huge stream of all zeroes, expect some insanely good compression. Using this technique, on my system, I compressed a 250GB disk with Windows 7 Professional installed down to only 12GB!
So, if you plan on re-installing your OS, then making a backup, you should always use /dev/zero to zero out your disk before doing anything.
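You can see the effect for yourself on a small scale without touching a disk. This sketch compresses 1 MB of zeroes and 1 MB of random data with gzip -9 and prints both sizes; the 1 MB figure is just a convenient demo size:

```shell
# Scratch directory for the demo; nothing here touches a real disk.
work=$(mktemp -d)

# 1 MB of zeroes versus 1 MB of random data, each compressed with gzip -9.
dd if=/dev/zero    bs=1024 count=1024 2>/dev/null | gzip -c -9 > "$work/zero.gz"
dd if=/dev/urandom bs=1024 count=1024 2>/dev/null | gzip -c -9 > "$work/random.gz"

# wc -c reports the compressed size in bytes (tr strips any padding spaces).
zero_bytes=$(wc -c < "$work/zero.gz" | tr -d ' ')
random_bytes=$(wc -c < "$work/random.gz" | tr -d ' ')
echo "zeroes: $zero_bytes bytes, random: $random_bytes bytes"
```

The zeroes should shrink to a tiny fraction of their original size while the random data barely shrinks at all, which is exactly why stray leftovers at the end of a drive bloat the backup.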
10 - Further Reading/Whatever
Some folks would say it's stupid not to use official backup tools like Partimage, Clonezilla, or Norton Ghost. To some extent, I agree with them. However ...
Why not use Partimage?
As of 5/26/09 Partimage claims NTFS support is experimental. So, if I want to back up a Windows box, I'm not going to leave my data with a tool that claims support for Windows file systems is experimental. I demand something more stable and reliable.
Why not use Clonezilla?
I've never used Clonezilla, but it was recently recommended to me by an established IT professional. Perhaps I'll have to look into it. Assuming it doesn't lock my data into some god-awful proprietary format, maybe it will be a better solution than a raw dd disk backup.
What about Norton Ghost?
No way, Jose. Why pay for a commercial backup solution when everything you need to archive your system is available for free? But I guess if you care about support and data warranties, then yes, a commercial solution might be right for you.
All this talk of Linux, what about HP-UX?
Basically, all of the same concepts apply. However, if you're running HP-UX you can use HP's DRD (Dynamic Root Disk) clone feature to create an exact copy of an active system disk.
FOLLOWUP: I lost another disk this afternoon, in the same system that I originally had problems with (the same system that triggered me to write this post in the first place). Anyways, I luckily had a gzip'ed backup of the system image I created only 5 days ago. Using the backup and restore procedure documented in this blog post, I successfully backed up and restored a system running Microsoft's Windows Vista Enterprise. This is the 2nd backup and restore process that has worked for me using the instructions documented in this post. If you doubt this HOWTO, you can take some comfort in the fact that my whole disk backup and restore procedure has worked for me twice as of 6/11/09.