Protect disk files and data to keep virtual machines humming.
Virtual machine backups encompass two data sources: the application data inside the VM and the disk files that make up the VM itself. You need to protect both to ensure that you can recover from a failure. Doing so calls for a smart mixture of backup types to satisfy your data protection and recovery objectives, along with some unique considerations and techniques, including snapshots.
Regardless of the hypervisor platform, backing up a VM while it runs and serves clients relies on a VM snapshot. When a VM is snapshotted, the hypervisor stops writing to its existing disk file and creates a new disk file to write changes to. If the machine is live, it also saves the contents of running memory to a separate file. This allows the backup software to copy the frozen disk file while the VM continues operating. Snapshots are also useful because they serve as VM copies that can be reused if the original backup effort fails. They can also be used to restore a VM to a known-good state if updates or changes to the VM cause a glitch.
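The snapshot mechanism described above can be sketched as a toy model. This is only an illustration of the idea, not any hypervisor's actual implementation; the class name, the block-dictionary representation, and all method names are hypothetical.

```python
class ToyVirtualDisk:
    """Toy block device: a base disk file plus an optional delta file."""

    def __init__(self, blocks):
        self.base = dict(blocks)  # block number -> bytes (the existing disk file)
        self.delta = None         # the new disk file, created by a snapshot

    def take_snapshot(self):
        # Stop writing to the base file; later writes go to a new delta file.
        self.delta = {}

    def write(self, block_no, data):
        target = self.base if self.delta is None else self.delta
        target[block_no] = data

    def read(self, block_no):
        # The running VM sees changed blocks first, then falls back to the base.
        if self.delta is not None and block_no in self.delta:
            return self.delta[block_no]
        return self.base[block_no]

    def backup_view(self):
        # Backup software copies the frozen base, unaffected by new writes.
        return dict(self.base)

    def delete_snapshot(self):
        # Merge the changes back into the base file and discard the delta.
        if self.delta is not None:
            self.base.update(self.delta)
            self.delta = None


disk = ToyVirtualDisk({0: b"boot", 1: b"data-v1"})
disk.take_snapshot()
disk.write(1, b"data-v2")           # the VM keeps running and writing
frozen_copy = disk.backup_view()    # what the backup job would copy
disk.delete_snapshot()              # clean up once the backup succeeds
```

In this sketch the backup copy still holds the pre-snapshot contents of block 1, while the VM reads the newer data throughout.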
While snapshots are useful, we can run into problems if we're not careful about managing them. For example, once an operation is successful, snapshots should be deleted because they gobble storage space. Yet time and again, I've seen administrators use snapshots as quasi-backups rather than as the temporary safety nets they're intended to be. If enough snapshots accumulate on a production machine, the VM will run out of space and likely fail. Where there's very little data change, it may be OK to leave some snapshots in production, but be careful.
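One way to keep snapshots from piling up is to review them regularly. The following is a minimal sketch that assumes you can export a snapshot inventory from your hypervisor's management tools into simple records; the `SnapshotRecord` fields, the function names, and the three-day age threshold are all assumptions made for illustration, not part of any product's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SnapshotRecord:
    # Hypothetical inventory record exported from your management tooling.
    vm_name: str
    snapshot_name: str
    created: datetime
    size_gb: float


def find_stale_snapshots(records, now, max_age=timedelta(days=3)):
    """Return snapshots older than max_age (an assumed policy threshold)."""
    return [r for r in records if now - r.created > max_age]


def snapshot_space_by_vm(records):
    """Total the storage consumed by snapshots for each VM."""
    totals = {}
    for r in records:
        totals[r.vm_name] = totals.get(r.vm_name, 0.0) + r.size_gb
    return totals


now = datetime(2024, 1, 10, 12, 0)
inventory = [
    SnapshotRecord("web01", "pre-patch", datetime(2024, 1, 2, 9, 0), 40.0),
    SnapshotRecord("web01", "pre-upgrade", datetime(2024, 1, 9, 9, 0), 5.0),
    SnapshotRecord("db01", "pre-migration", datetime(2023, 12, 1, 9, 0), 120.0),
]
stale = find_stale_snapshots(inventory, now)
space = snapshot_space_by_vm(inventory)
```

A report like this only flags candidates; whether a given snapshot is safe to delete remains a judgment call for the administrator.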
In addition, disk file backups do not take the place of guest-based backup software agents that run at the VM guest operating system level. These agents provide several advantages over disk file backups. The agents are selective: You have the option to take only the data that's changed or the data you want. Backing up the operating system over and over again doesn't do you any good if all you care about is the application data on the machine.
We recommend backing up the disk files of VMs once per week. Send these backups to a repository, such as a deduplicated SAN, that's also replicated to a secondary site. You can also back up VMs to a repository, such as an autoloader, whose media you can physically move off site. Then, take daily guest-OS-level backups of application files and data. Store the daily backups on a mixture of disk, tape, or replicating storage; good backup products can easily accommodate all three.
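The weekly-plus-daily rhythm recommended above can be expressed as a small scheduling rule. This sketch assumes the weekly disk file backup runs on Sundays, which is an arbitrary choice for illustration; the job labels and the function name are hypothetical as well.

```python
from datetime import date

# Assumption: run the weekly disk file backup on Sunday (weekday() == 6).
WEEKLY_IMAGE_DAY = 6


def backups_due(day):
    """Return the backup jobs that should run on the given calendar day."""
    jobs = ["guest-level application data backup"]  # taken every day
    if day.weekday() == WEEKLY_IMAGE_DAY:
        jobs.append("VM disk file backup")          # taken once per week
    return jobs


sunday_jobs = backups_due(date(2024, 1, 7))
monday_jobs = backups_due(date(2024, 1, 8))
```

In practice the schedule lives in your backup product's job configuration; the point is simply that the two backup types run on different cadences.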
A word about deduplication: This process can happen in several places. If your SAN supports deduplication, the dedupe software lives at the controller level and automatically deduplicates data as it passes through. You can also use a dedicated deduplication appliance. Finally, some backup agents that run on the machines being backed up can provide source deduplication so that only new data gets sent to the backup repository.
In a disaster recovery scenario, you can simply restore the disk files to a freshly provisioned virtual host cluster and spin up new VMs, bringing you right back to where you were at the time of the disk file backup. From there, you can bring the restored VMs up to date with a restore from the data-only backup.
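To make the deduplication idea mentioned earlier more concrete, here is a minimal sketch of hash-based deduplication over fixed-size chunks. It is an illustration under stated assumptions rather than how any particular SAN, appliance, or agent works: real products often use variable-size chunking and persistent indexes, and the function names, the in-memory `store` dictionary, and the tiny chunk size are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


def dedup_backup(data, store, chunk_size=CHUNK_SIZE):
    """Store only chunks not already present; return (recipe, new chunk count)."""
    recipe = []      # ordered list of chunk digests needed to rebuild the data
    new_chunks = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            store[digest] = chunk  # only previously unseen data is stored
            new_chunks += 1
        recipe.append(digest)
    return recipe, new_chunks


def restore(recipe, store):
    """Rebuild the original data from its recipe of chunk digests."""
    return b"".join(store[digest] for digest in recipe)


store = {}
first_recipe, first_new = dedup_backup(b"AAAABBBBAAAA", store)
second_recipe, second_new = dedup_backup(b"AAAABBBBCCCC", store)
```

In this run the first backup stores two unique chunks and the second stores only one, because the `AAAA` and `BBBB` chunks are already in the store.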