
Commentary | George Crump | 8/11/2010 05:52 PM

Cleaning The Digital Dump

One of the challenges IT faces is getting rid of all the old, unused files that are clogging up primary storage. Primary storage can hold data that has not been modified, or even opened, in years. The challenge is how to deal with this digital dump, especially since most IT people don't have the authority to delete other people's files.

While there are many things we can do to minimize the junk on primary storage, deduplication and compression for example, this data should be removed from primary storage as soon as possible. Nor does this conflict with a keep-it-forever strategy: you may want to keep the data forever, but you don't want to keep it on your most expensive storage.

Beyond space optimization there is also the concept of automated tiering. Again, this capability serves a purpose, but for long-term storage it is typically still less expensive to move the data to a different platform. Your primary storage may also not be as safe a place to hold that data: longer-term storage platforms have built-in capabilities to verify data integrity years into the future. So at some point it makes sense to move this digital junk to a separate storage platform designed for the task.
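As a rough illustration of what that integrity verification involves, here is a minimal sketch, generic rather than any particular vendor's implementation, with hypothetical function names: record a SHA-256 checksum for each file when it lands on the archive platform, then periodically re-hash everything and flag anything that has silently changed or vanished.

import hashlib
import json
from pathlib import Path

def sha256_of(path):
    # Stream the file through SHA-256 so large files don't exhaust memory.
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(archive_root, manifest):
    # At ingest: record a checksum for every archived file.
    entries = {str(p): sha256_of(p)
               for p in archive_root.rglob("*") if p.is_file()}
    manifest.write_text(json.dumps(entries, indent=2))

def verify_manifest(manifest):
    # Later: re-hash each file and report corruption or loss.
    entries = json.loads(manifest.read_text())
    bad = []
    for name, expected in entries.items():
        p = Path(name)
        if not p.is_file() or sha256_of(p) != expected:
            bad.append(name)
    return bad

# Example: build_manifest(Path("/archive"), Path("manifest.json")),
# then on a schedule: verify_manifest(Path("manifest.json"))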

The first step is deciding how you are going to get the data there. This can be as simple as manually copying it to the secondary storage platform. That involves using some sort of tool to identify the data, and in most cases it means there is no transparent link to bring a file back if a user needs it. The alternative is some form of automated technology. As we discuss in our article "What is File Virtualization?", one of the best ways to accomplish this is with file virtualization. These technologies can transparently move data from one storage platform to another and then set up a transparent link to the file, in most cases without the use of stub files. They operate much like a DNS server: you don't need to know the IP address of every web site; you reference sites by name. With file virtualization you don't need to know where the file is, just its name, and the file virtualization appliance handles the rest.
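To make the difference concrete, here is a minimal sketch of the manual approach. It moves a file to the secondary platform and leaves a symbolic link behind as a crude stand-in for the transparent link; a real file virtualization appliance does this at the namespace layer, typically without stubs, and the function name here is purely illustrative.

import shutil
from pathlib import Path

def relocate(file, primary_root, archive_root):
    # Mirror the file's position under the archive root.
    target = archive_root / file.relative_to(primary_root)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Copy-and-delete works across filesystems.
    shutil.move(str(file), str(target))
    # Users still open the original path; the OS follows the link.
    file.symlink_to(target)

# Example:
# relocate(Path("/primary/reports/2005_q3.doc"),
#          Path("/primary"), Path("/archive"))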

The second step is deciding what platform to put this data on. Most often you are looking for something that is reliable, scalable, and, of course, cost-effective. Depending on your environment you may also want some level of power management available. To work with file virtualization these products will need to be able to present themselves as a NAS device, although some file virtualization vendors are working on directly supporting object-based storage devices. That would allow you to use a NAS for primary storage and an object storage system for long-term data retention.

The final step is deciding what to move. The obvious target is data that has not been accessed in years. The problem is that, if you are like many customers, more than 50% of your data has not been accessed in years, and freeing up that much storage may be too big an initial jump. What we recommend is deciding how much storage you need to free up, then using a measurement tool to determine what file-age cutoff will equal that amount, as in the sketch below. This lets you start slowly, on data that is not being accessed at all, while you get comfortable with the process.
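A measurement tool boils down to something like the following sketch (illustrative only; the function name is an assumption, and access times are only meaningful if the filesystem is not mounted with noatime): walk the file system, sort files oldest-accessed first, and accumulate sizes until you reach the amount of storage you want to free.

import time
from pathlib import Path

def age_cutoff_for_target(root, target_bytes):
    # Collect (last access time, size) for every file in the tree.
    files = []
    for path in root.rglob("*"):
        if path.is_file():
            st = path.stat()
            files.append((st.st_atime, st.st_size))
    files.sort()  # oldest access time first
    # Accumulate the oldest files until the target is reached.
    freed = 0
    cutoff = time.time()
    for atime, size in files:
        freed += size
        cutoff = atime
        if freed >= target_bytes:
            break
    # Everything not accessed in this many days sums to ~target_bytes.
    return (time.time() - cutoff) / 86400

# Example: find the age cutoff that frees roughly 500 GB:
# days = age_cutoff_for_target(Path("/primary"), 500 * 2**30)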

Track us on Twitter: http://twitter.com/storageswiss

Subscribe to our RSS feed.

George Crump is lead analyst of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. Find Storage Switzerland's disclosure statement here.
