In my recent article on data deduplication on InformationWeek's sister site, Byte and Switch, the question of its impact on backup speed came up. As we talk to customers throughout the storage community about backup priorities, a surprising trend continues: shrinking the backup window has become less of a priority for disk-to-disk backup solutions. Why?
Speed of the backup target is really not the issue anymore, as a single LTO-4 tape drive can receive data at an amazing 120 MB/s. Even inline data deduplication devices, which are supposed to sacrifice speed for the advantages of inline processing, are now receiving data at more than 1 TB per hour. Most servers, infrastructures, and even the backup software itself can't keep up with the ingestion capabilities of the modern backup target.
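To put the two figures above on the same scale, here is a minimal sketch of the arithmetic. It assumes decimal units (1 TB = 1,000,000 MB), and the function names are illustrative only.

```python
# Assumption: decimal storage units, 1 TB = 1,000,000 MB.
MB_PER_TB = 1_000_000
SECONDS_PER_HOUR = 3600


def mb_per_s_to_tb_per_hour(rate_mb_s: float) -> float:
    """Convert a sustained rate in MB/s to TB per hour."""
    return rate_mb_s * SECONDS_PER_HOUR / MB_PER_TB


def tb_per_hour_to_mb_per_s(rate_tb_h: float) -> float:
    """Convert a sustained rate in TB per hour to MB/s."""
    return rate_tb_h * MB_PER_TB / SECONDS_PER_HOUR


def backup_window_hours(data_tb: float, rate_tb_h: float) -> float:
    """Hours needed to move data_tb at a sustained rate_tb_h."""
    return data_tb / rate_tb_h


if __name__ == "__main__":
    # One LTO-4 drive at 120 MB/s is roughly 0.43 TB per hour.
    print(f"{mb_per_s_to_tb_per_hour(120):.2f} TB/h")
    # A target ingesting 1 TB per hour needs roughly 278 MB/s from its sources.
    print(f"{tb_per_hour_to_mb_per_s(1):.0f} MB/s")
```

Seen this way, a target that ingests 1 TB per hour has to be fed at roughly 278 MB/s, which is the rate the servers, network, and backup software would need to sustain.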
For disk-to-disk backup, customers are putting the priority on how well they can store data long term, how they can improve recovery performance, and, in what seems to capture the most interest, how well they can replicate data to a disaster recovery site. In all of these cases, target-side data deduplication provides a solution. In my next article on Byte and Switch, we will discuss the pros and cons of doing the deduplication inline vs. post-processing.
For today's entry, though, there are two issues to discuss, but I only have space for one now, so I'll save the other for another day. What do centers with massive amounts of data, those that are most likely to actually move data faster than 1 TB an hour and that need to reduce the backup window, do?
About 40% of the users we work with have well over 100 TB of storage under management. Tape is staying. How do you integrate that into the process? In most cases, it's a separate move from the disk target back through the backup server. In smaller, sub-50 TB centers (it's amazing that 50 TB is small!), that's not a massive challenge. In large centers, I believe this is impractical and a different technology is needed: backup virtualization.
Backup virtualization creates a virtual pool of the various backup targets and presents a consolidated target to the backup server. The backup virtualization appliance performs the movement of data between the targets, not the backup application.
In sites where you have terabytes of data to move and need to do so quickly, consider backup virtualization. With these solutions in place you can buy a small but really fast disk cache, trickle that data to a relatively fast disk-based data deduplication appliance, leverage the deduplication appliance's ability to replicate that data to a DR site across a thinner WAN segment and, when the time is right, move that data to tape. This can all be done without having to set up complex jobs in the backup application.
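The flow described above (fast cache, then deduplication appliance, then tape, all behind one consolidated target) can be sketched roughly as follows. This is an illustrative model only, not any vendor's implementation: the class names, tier names, and capacities are all hypothetical, and it ignores the space savings that deduplication itself would produce.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """One physical backup target (hypothetical model)."""
    name: str
    capacity_tb: float
    backups: dict = field(default_factory=dict)  # backup id -> size in TB

    def used_tb(self) -> float:
        return sum(self.backups.values())

    def has_room_for(self, size_tb: float) -> bool:
        return self.used_tb() + size_tb <= self.capacity_tb


class BackupPool:
    """Presents several targets as one; the pool, not the backup
    application, moves data from tier to tier."""

    def __init__(self, tiers):
        self.tiers = tiers  # ordered fastest first: cache, dedupe, tape

    def write(self, backup_id: str, size_tb: float) -> None:
        """The backup server always writes to the fastest tier."""
        cache = self.tiers[0]
        if not cache.has_room_for(size_tb):
            raise RuntimeError("cache is full; migrate older backups first")
        cache.backups[backup_id] = size_tb

    def locate(self, backup_id: str) -> str:
        """Return the name of the tier currently holding the backup."""
        for tier in self.tiers:
            if backup_id in tier.backups:
                return tier.name
        raise KeyError(backup_id)

    def migrate(self, backup_id: str) -> str:
        """Move a backup one tier down and return the new tier's name."""
        for current, following in zip(self.tiers, self.tiers[1:]):
            if backup_id in current.backups:
                size_tb = current.backups[backup_id]
                if not following.has_room_for(size_tb):
                    raise RuntimeError(f"{following.name} is full")
                following.backups[backup_id] = current.backups.pop(backup_id)
                return following.name
        raise KeyError(f"{backup_id} not found or already on the last tier")


if __name__ == "__main__":
    pool = BackupPool([
        Target("cache", 2),
        Target("dedupe", 50),
        Target("tape", 500),
    ])
    pool.write("monday-full", 1.5)
    print(pool.locate("monday-full"))   # cache
    print(pool.migrate("monday-full"))  # dedupe
    print(pool.migrate("monday-full"))  # tape
```

The point of the sketch is that the backup application only ever calls `write`; every later move is a decision made inside the pool.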
In an upcoming entry I will talk about some ideas for reducing the backup window by thinning the amount of data used in the backup process.
George Crump is founder of Storage Switzerland, an analyst firm focused on the virtualization and storage marketplaces. It provides strategic consulting and analysis to storage users, suppliers, and integrators. An industry veteran of more than 25 years, Crump has held engineering and sales positions at various IT industry manufacturers and integrators. Prior to Storage Switzerland, he was CTO at one of the nation's largest integrators.