To close out our <a href="http://www.informationweek.com/blog/main/archives/2010/06/revisiting_the.html">series</a> about keeping data forever, we will examine the financial aspects of that strategy. What can be done to curtail costs, and how does it compare to the more traditional finite data retention model? In this entry we will look at the costs of a finite data retention policy.

George Crump, President, Storage Switzerland

July 2, 2010


First we have to agree that data retention has a cost, whether you are going to keep the data forever or only for a short time. All the tools that are available to curtail the cost of retaining data are available to both models: deduplication, compression, disk archive, MAID and, of course, tape. In the keep data forever model the use of those tools becomes potentially more critical, as does the indexing that we discussed in our second entry.

The finite data retention model has two major cost disadvantages. First, systems need to be put in place to manage the data as it moves through the retention process. Data has to be identified and moved to the retained storage target, typically a disk or tape archive. While the keep data forever strategy shares this need, the finite strategy requires an additional layer of complexity and organization, because each piece of data must also be identifiable for deletion at just the right time.

The cost to identify all the data that relates to a given policy can be astronomical. For example, say you decide to keep all Human Resources (HR) data for seven years and have another policy to keep all employee data for five years. It is fairly easy to identify data that is stored in the HR share, as well as emails sent to and from HR. But what about a document that HR sends to an employee, who then re-saves it in their own sub-directory? How is that document going to be identified so that it is kept two years longer than the other data in the employee's directory? While software does exist to manage that process, it is not free, and there is still the complexity of managing the process. And this still does not address the potential for the employee to email a copy of that document to a personal email address or copy it to a thumb drive.
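To make the HR example concrete, here is a minimal Python sketch of one way such tracking could work: byte-identical copies are matched by content hash, and every copy inherits the longest retention period that applies to any of them. The policy table, the `StoredFile` record, the paths and the `deletion_dates` function are all hypothetical, chosen only for illustration; this is not how any particular retention product works. Note that a copy re-saved in a different format would not match by hash, which is part of why real classification is so costly.

```python
from dataclasses import dataclass
from datetime import date
import hashlib

# Hypothetical policy table based on the example in the text.
RETENTION_YEARS = {"hr": 7, "employee": 5}


@dataclass(frozen=True)
class StoredFile:
    path: str
    category: str   # category implied by where the file lives
    content: bytes
    created: date


def add_years(d: date, years: int) -> date:
    """Shift a date forward by whole years (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def deletion_dates(files):
    """Map each path to the earliest date on which it may be deleted.

    Byte-identical copies share the latest expiry found among them.
    """
    longest = {}
    for f in files:
        digest = hashlib.sha256(f.content).hexdigest()
        expiry = add_years(f.created, RETENTION_YEARS[f.category])
        if digest not in longest or expiry > longest[digest]:
            longest[digest] = expiry
    return {
        f.path: longest[hashlib.sha256(f.content).hexdigest()] for f in files
    }


files = [
    StoredFile("/shares/hr/review.doc", "hr", b"annual review text", date(2010, 7, 2)),
    # The employee re-saves the HR document in their own directory.
    StoredFile("/home/alice/review.doc", "employee", b"annual review text", date(2010, 7, 9)),
    StoredFile("/home/alice/notes.txt", "employee", b"meeting notes", date(2010, 7, 9)),
]

deadlines = deletion_dates(files)
for path, when in deadlines.items():
    print(path, when)
```

In this sketch the re-saved copy in the employee's directory is held until July 2, 2017, matching the HR original, while the unrelated notes file in the same directory expires on July 9, 2015. Copies that leave the file system entirely, such as those sent to a personal email account, are invisible to any scheme like this.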

The possibility of data escaping through personal email or a thumb drive leads to the second major cost of a finite data retention policy. What if something is deleted before it should have been, is not deleted when it should be, or gets out of IT's control before the retention policy is invoked? Honestly, as long as your company is never sued or subjected to legal discovery, the cost is zero. If you have been sued, or could be, then you have to be able to produce data. If you can't, and your corporate data retention guidelines say you should be able to, then you have a costly problem on your hands. Potentially worse is the cost if someone else can then produce that data (via personal email, for example) and you can't.

A finite data retention policy that was supposed to protect the organization may end up condemning it. The expectations that these policies set are simply too high. If you don't keep data, and the data that relates to it, for exactly the required period of time, you lose, and it could cost the organization millions if it goes to court. A keep data forever strategy, as long as you can find that data, does not have the same issue.

Track us on Twitter: http://twitter.com/storageswiss


George Crump is lead analyst of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. Find Storage Switzerland's disclosure statement here.

About the Author(s)

George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. With 25 years of experience designing storage solutions for datacenters across the US, he has seen the birth of such technologies as RAID, NAS, and SAN. Prior to founding Storage Switzerland, he was CTO at one of the nation’s largest storage integrators, where he was in charge of technology testing, integration, and product selection. George is responsible for the storage blog on InformationWeek's website and is a regular contributor to publications such as Byte and Switch, SearchStorage, eWeek, SearchServerVirtualization, and SearchDataBackup.
