Backup Deduplication 2.0 - Power Savings
In our last entry we opened a discussion of what is needed as we move into the next era of backup deduplication, and we focused on integration with backup software. Another area of growing importance is lowering the power that disk backup deduplication hardware requires. Power is a pressing issue in the data center, and disk backup systems need to address those concerns.
When it comes to power consumption, most IT professionals think of spin-down drives: drives that can either slow down or power off depending on when they were last accessed. Deduplication vendors want you to think about it differently and focus instead on how many fewer drives deduplication uses than a standard disk backup approach. While that is a fair line of reasoning, at the end of the day spinning drives, no matter how optimized, use more power than disk backup's biggest competitor, tape.
The answer to one of disk backup's biggest weaknesses is to figure out how to integrate power-managed drives into disk deduplication systems. Using these drives can be troublesome in disk solutions that use deduplication. Deduplication makes heavy use of indexing to identify redundant data, performs frequent data integrity checks, and often uses garbage collection techniques to remove old data that no longer has active pointers. All of this constant access makes it difficult to spin down a drive for any significant amount of time.
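To make the access pattern concrete, here is a minimal sketch of a toy deduplication store. It is not how any particular product works: the class name ToyDedupStore, the fixed 4-byte chunk size, and the in-memory dictionaries are all assumptions chosen for illustration. It shows the three activities named above (index lookups on every write, integrity checks that re-read stored chunks, and garbage collection of chunks with no remaining pointers), each of which touches the stored data or its index.

```python
import hashlib


class ToyDedupStore:
    """Hypothetical in-memory chunk store used only to illustrate the idea."""

    def __init__(self):
        self.chunks = {}        # SHA-256 hex digest -> chunk bytes
        self.refcount = {}      # SHA-256 hex digest -> number of pointers to the chunk
        self.backups = {}       # backup name -> ordered list of chunk digests
        self.index_lookups = 0  # how many times the index was consulted

    def write_backup(self, name, data, chunk_size=4):
        """Split data into fixed-size chunks and store only chunks not seen before."""
        digests = []
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.index_lookups += 1  # every chunk written costs an index access
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                self.refcount[digest] = 0
            self.refcount[digest] += 1
            digests.append(digest)
        self.backups[name] = digests

    def expire_backup(self, name):
        """Drop a backup's pointers; the chunks stay until garbage collection runs."""
        for digest in self.backups.pop(name):
            self.refcount[digest] -= 1

    def verify(self):
        """Integrity check: re-read and re-hash every stored chunk."""
        return all(
            hashlib.sha256(chunk).hexdigest() == digest
            for digest, chunk in self.chunks.items()
        )

    def garbage_collect(self):
        """Remove chunks that no longer have active pointers; return how many were removed."""
        dead = [digest for digest, count in self.refcount.items() if count == 0]
        for digest in dead:
            del self.chunks[digest]
            del self.refcount[digest]
        return len(dead)
```

In this sketch, writing two backups that share most of their data stores the shared chunks once, and expiring the older backup leaves its unique chunks on disk until garbage_collect() scans the reference counts. Even with no new backups arriving, verify() and garbage_collect() keep reading the store, which is the behavior that works against long spin-down periods.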
There are ways to get some power management in backup deduplication systems. For example, you can add deduplication technology to a spin-down system, as we discussed in "Power Managed Dedupe". The deduplication software can be optimized to narrow its garbage collection and error checking windows so the system can stay in spin-down mode for the bulk of the non-backup window. Further, multiple systems could be used over time, with backups redirected to different units at different times, alternating by quarter for example. The downside to this approach, of course, is some increase in redundancy in the backup data set, since each unit would presumably deduplicate only against the data it holds, but it would improve power optimization. Over time, though, deduplication systems are going to have to learn to self-isolate old data.
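The quarterly alternation could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names active_unit and idle_units and the unit labels are hypothetical, and a real deployment would drive this from the backup software's own scheduling rather than a script.

```python
from datetime import date


def active_unit(day, units):
    """Return the unit that receives backups on `day`, rotating once per calendar quarter."""
    # Count quarters continuously across years so the rotation does not reset each January.
    quarter_index = day.year * 4 + (day.month - 1) // 3
    return units[quarter_index % len(units)]


def idle_units(day, units):
    """Return the units that receive no backups on `day` and are candidates for spin-down."""
    target = active_unit(day, units)
    return [unit for unit in units if unit != target]
```

With two units, each one takes backups for a quarter and then sits out the next, so the resting unit only needs to wake for restores and whatever maintenance windows it still requires.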
Backup software applications that can do their own deduplication may be able to handle this for you as well. By setting up different drive groups in a power-managed array, or even using different arrays, you could send deduplicated backup data to separate backup pools, which would give the system more time to power down the drives in pools that are not in use.
Clustered or scale-out disk backup systems are going to have to take all of this a step further, since each node is a potential power consumer. They will need to move data to older nodes and then power down, or at least idle, those nodes. Steps could be taken not only to power the drives down but also to lower fan and processor speeds, which could lead to a very efficient scale-out story. That is going to require either sophisticated node communication or an internal subdividing of the nodes to segregate the infrequently accessed data set.
Another option for power efficiency is to use backup virtualization, as we talked about in our recent article "Backup Virtualization Brings Flexibility to Disk Backup". With this technology, backups could be sent to a very small high-speed disk cache, quickly spooled to a disk deduplication system for medium-term storage, and finally spilled to tape as the data becomes old and infrequently accessed. This lets you use each backup device for what it is already best at instead of waiting for technology to fill in the gaps.
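The cache-to-deduplication-to-tape flow amounts to an age-based placement policy, which might be sketched as below. The function name choose_tier, the tier labels, and especially the one-day and 90-day thresholds are assumptions made for illustration; the right cutoffs depend on your retention and restore requirements.

```python
from datetime import timedelta


def choose_tier(age, cache_limit=timedelta(days=1), dedupe_limit=timedelta(days=90)):
    """Pick a storage tier for a backup based on its age.

    The default limits are illustrative assumptions, not recommendations.
    """
    if age < cache_limit:
        return "disk-cache"   # newest backups land on the small high-speed cache
    if age < dedupe_limit:
        return "dedupe-disk"  # medium-term storage on the deduplication system
    return "tape"             # old, infrequently accessed data spills to tape
```

For example, choose_tier(timedelta(days=30)) places a month-old backup on the deduplication system, while anything past the second limit goes to tape, where it draws no power at rest.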
George Crump is lead analyst of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. Find Storage Switzerland's disclosure statement here.