Dark Reading is part of the Informa Tech Division of Informa PLC


8/24/2009 11:44 AM
George Crump
Commentary

Getting To The Last Copy Of Data

One of the storage management challenges we see every day in customer data centers is that there are too many copies of data in circulation. Ironically, it's this fact that built much of the value and motivation behind data deduplication. It should not be this way. Why should you get to a last copy of data?

One of the downsides to inexpensive capacity is that storage practices don't have to be as strict. You can store hundreds of versions of the same or similar data and suffer limited hard-cost impact. Deduplication further enhances the affordability of capacity, making this practice even more forgivable from an expense standpoint.

Of course, the data is not just stored multiple times on the file server; versions of it also exist on laptops, thumb drives, tape media, replicated disk, and a host of other "just in case" storage locations. Ironically, especially as this data ages, having this many copies of the same piece of data makes it no easier to find and no faster to recover; it just means there are that many more places to look.

Ideally, a best practice would be that as data ages there are fewer copies of it, and the final copy moves to a known good location, potentially a disk archive solution that is replicated to a second disk archive at a disaster recovery site. This means that part of the policy will be to move inactive data to an archive much sooner. As we discuss in our article on Archiving Basics, a disk archive enables a much more aggressive migration policy because the recall of data happens with almost no noticeable performance impact on the user. In addition, the backup application will need to be set so that tape media ages out and is retired much sooner.
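As a rough sketch of what such an age-based migration policy might look like in practice, the snippet below walks a file tree and flags files untouched past a cutoff as archive candidates. The 90-day threshold and the function name are illustrative assumptions, not taken from any particular product:

```python
import os
import time

AGE_THRESHOLD_DAYS = 90  # hypothetical policy: untouched this long -> archive candidate

def archive_candidates(root, now=None):
    """Walk a directory tree and return paths whose last-modified time
    is older than the threshold, i.e. files that the migration policy
    would move to the disk archive."""
    now = now if now is not None else time.time()
    cutoff = now - AGE_THRESHOLD_DAYS * 86400
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                candidates.append(path)
    return candidates
```

A real archiving product would also consult access time, ownership, and file type, but the core decision is this simple age comparison.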

Archiving solutions like those from EMC, Nexsan, or Permabit can set the retention time of these files as they are being stored. For example, they can be set to make the data unmodifiable for seven years and then have it deleted after 10 years. The key is that once you decide you need information from a data set, or that you need to get rid of one, you know exactly where to go to find that data.
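The retention behavior described above can be sketched in a few lines. The seven- and ten-year windows mirror the example in the text, but the function and state names are purely illustrative, not any vendor's API:

```python
from datetime import timedelta

IMMUTABLE_FOR = timedelta(days=7 * 365)     # unmodifiable for ~7 years
DELETE_AFTER = timedelta(days=10 * 365)     # eligible for deletion after ~10 years

def retention_state(stored_at, now):
    """Classify an archived file's retention state from when it was stored."""
    age = now - stored_at
    if age < IMMUTABLE_FOR:
        return "immutable"    # cannot be modified or deleted yet
    if age < DELETE_AFTER:
        return "modifiable"   # retention hold expired, copy still kept
    return "expired"          # policy says this copy should now be deleted
```

The point of encoding the policy up front is exactly what the text argues: when the retention clock runs out, there is no question about where the last copy lives or when it goes away.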

Most of these archives can be indexed by solutions like those offered by Index Engines or Kazeon, which not only help find data on the archive itself but also help identify the data on primary storage that needs to be archived. In our next entry we will discuss how eDiscovery is evolving from a litigation-readiness application into a more mainstream application that helps storage managers achieve goals like the last copy of data.

At some point in a file's life you have a decision to make: keep it or delete it. If it is a "keep" decision, that means you think you will someday need that data again. If so, you don't want to hunt all over the data center for it; you want it in one place. Even more so, if it is a "delete" decision, you want to know you have removed all the copies and versions of that file. Both are best enabled by an IT discovery application and a final storage location like a disk archive.
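To know you have removed every copy, you first have to find them all. A minimal sketch of duplicate detection by content hash follows; this is a simplified stand-in for what the indexing and discovery tools mentioned above do at much larger scale:

```python
import hashlib
import os
from collections import defaultdict

def find_copies(roots):
    """Group files under the given roots by SHA-256 content hash, so every
    copy of the same content can be located. On a 'delete' decision, each
    group shows every path that must be removed."""
    by_hash = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                with open(path, "rb") as f:
                    digest = hashlib.sha256(f.read()).hexdigest()
                by_hash[digest].append(path)
    # keep only content that exists in more than one place
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```

Hashing whole files is fine for a sketch; production discovery tools index incrementally and reach across file servers, laptops, and archives rather than a single mount point.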

Track us on Twitter: http://twitter.com/storageswiss

Subscribe to our RSS feed.

George Crump is founder of Storage Switzerland, an analyst firm focused on the virtualization and storage marketplaces. It provides strategic consulting and analysis to storage users, suppliers, and integrators. An industry veteran of more than 25 years, Crump has held engineering and sales positions at various IT industry manufacturers and integrators. Prior to Storage Switzerland, he was CTO at one of the nation's largest integrators.
