Commentary

Knowing That Your Recovery Will Work: Verification

In our last entry we talked about the importance of creating and managing to service level agreements (SLAs) to set recovery expectations correctly and to give some clarity and priority to the backup jobs that you manage. The second step is to verify that those critical jobs will actually work when you need them to.

The best way to verify that a recovery will work is to actually recover the data and start the application. Clearly, you can't do that for every application in the environment. Once again, SLAs add value here: knowing which applications are most critical and building a periodic test recovery into the SLA provides the ultimate confidence in your ability to recover data. The test can be a recovery of the system in the local environment, perhaps once per quarter, plus a recovery at a DR location twice a year.

In my experience managing a backup technical support center, the number one problem IT professionals had in the recovery process was a lack of experience in actually doing it. They did backups and dealt with backup problems every day. When they had to recover, it was most often a single file; they rarely had to perform a full application recovery or deal with the problems that might ensue. As the saying goes, practice makes perfect. The challenge for backup administrators is finding the time to practice something as hard as a system recovery when all you're doing is putting out fires. We find that when you create and manage to SLAs, and formalize data protection from a set of loosely related tasks into a process or workflow, you organize your day to the point that the time spent putting out fires greatly diminishes.

To make testing the recovery capabilities of your environment fit into that workflow, we recommend that you leverage image backup and server virtualization. Image backups essentially store each server as a self-contained unit that can be quickly restored or even launched in place without a recovery at all. Software that provides image backups and leverages them in virtualized server environments can launch the backed-up servers within the virtual environment. This makes verifying a backup job as simple as starting the virtual machine. Again, in environments with hundreds of servers, virtual or physical, you won't want to do this on every system every day, but with SLAs in place you could certainly do it more routinely than the quarterly and twice-yearly tests described above. This can even be applied to backups of non-virtualized servers: the non-virtualized server is backed up and then stored in a VM-ready state, allowing for similar testing.

There are some misconceptions about image-based backups: that they can't do incremental restores, that they can't do point-in-time restores, that they're slow, that they put more of your data at risk, and that they have no tape-out functionality. We'll address those issues in part three of this series.

Track us on Twitter: http://twitter.com/storageswiss


George Crump is lead analyst of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. Find Storage Switzerland's disclosure statement here.
