
News

4/13/2009
07:29 PM
George Crump
Commentary

Primary Storage Optimization Compromises

Primary file system storage optimization, i.e. squeezing more data into the same space, continues to grow in popularity. The challenge is that the deduplication of primary storage is not without its rules. You can't dedupe this, you can dedupe that and you have to be cognizant of the performance impact on a deduplicated volume.

EMC has announced deduplication on its Celerra platform, and NetApp has had it for a while. Others have added it in a near-active fashion, compressing and deduplicating data after it becomes stagnant, and companies like Storwize have been providing it in the form of inline, real-time compression.

As storage virtualization and thin provisioning have proven, primary storage is better when you don't have to compromise. The problem with imposing conditions for use on primary storage is that things can get complicated and that complication can lead people to not use the technology. The more transparent and universally applicable a technology is, the greater its chances for success.

The challenge with some primary storage optimization is that it is largely dependent on the type of data you have and the workload that is accessing that data. Obviously, for deduplication to generate any benefit there has to be duplicate data, which is why, with its weekly fulls, backup is such an ideal application for deduplication. Primary storage, on the other hand, is not full of duplicate data.
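
To make that concrete, here is a back-of-the-envelope sketch in Python, not any vendor's implementation, that estimates a dedupe ratio by hashing fixed-size 4 KB blocks. A backup stream that repeats last week's full dedupes almost perfectly, while unique primary data barely dedupes at all; the block size and data sizes are illustrative assumptions.

import hashlib
import os

BLOCK_SIZE = 4096  # illustrative block size, not a vendor setting

def dedupe_ratio(data: bytes) -> float:
    # Ratio of logical blocks to unique blocks under fixed-size block dedupe.
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    unique = {hashlib.sha256(b).hexdigest() for b in blocks}
    return len(blocks) / len(unique)

first_full = os.urandom(BLOCK_SIZE * 1000)    # initial full backup
next_full = first_full                        # weekly full: largely identical
backup_stream = first_full + next_full
primary_data = os.urandom(BLOCK_SIZE * 2000)  # unique primary data

print(f"backup dedupe ratio:  {dedupe_ratio(backup_stream):.1f}:1")  # ~2:1
print(f"primary dedupe ratio: {dedupe_ratio(primary_data):.1f}:1")   # ~1:1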

In addition, primary storage deduplication is going to have issues with heavy write IO and with random read/write IO. In these situations, the performance impact of applying deduplication may be felt by users.

As a result, most vendors suggest limiting the deployment of the technology to home directories and to VMware images, where the likelihood of duplicate data is high and the workloads are more read-intensive.

Databases in particular are left out of the process; concerns arise around the amount of duplicate data that would actually be found in a database and the performance impact associated with the process. As we stated in our article on database storage optimization, Data Reducing Oracle, inline, real-time compression solutions may be a better fit here. Databases are very compressible, whether there is duplicate data or not, and in most cases real-time compression has no direct impact on performance.
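
The reason compression works where deduplication struggles is that database pages are full of repetitive structure even when no two blocks are exact duplicates. The rough Python sketch below, with zlib standing in for an inline compression engine and entirely made-up row data, shows the kind of reduction that structure alone can yield.

import random
import zlib

random.seed(0)
# Hypothetical rows: every key is unique, but the column layout and many
# field values repeat from row to row.
rows = [
    f"{key:010d}|ACTIVE|US-TX|2009-04-13|{random.random():.6f}\n"
    for key in range(50_000)
]
page = "".join(rows).encode()

compressed = zlib.compress(page, level=6)
print(f"raw:        {len(page):>9,} bytes")
print(f"compressed: {len(compressed):>9,} bytes "
      f"({len(page) / len(compressed):.1f}:1)")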

As data growth continues to accelerate, more data optimization will be required, and applying multiple techniques may be the only way to stem the tide. Compression may be applied universally and as a complement to deduplication, which should be applied to specific workloads; this deduplicated data should then be moved to an archive and out of primary storage altogether. Finally, as I stated in the last few entries, all of this has to be wrapped around tools that increase IT personnel efficiency in step with resource efficiencies.
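
One way to picture how the techniques layer is the hypothetical sketch below: deduplicate first so only unique blocks remain, then compress each unique block before it is moved off to the archive tier. The function name and block size are assumptions for illustration, not a description of any shipping product.

import hashlib
import zlib

BLOCK_SIZE = 4096  # illustrative block size

def optimize_for_archive(data: bytes) -> dict:
    # Return a store of unique, compressed blocks keyed by content hash.
    store = {}
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:                    # dedupe: skip blocks already seen
            store[digest] = zlib.compress(block)   # compress the unique copy
    return store

sample = (b"A" * BLOCK_SIZE) * 8 + b"unique tail data" * 100
archive = optimize_for_archive(sample)
print(f"logical blocks: {-(-len(sample) // BLOCK_SIZE)}, "
      f"unique compressed blocks stored: {len(archive)}")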

Track us on Twitter: http://twitter.com/storageswiss.

Subscribe to our RSS feed.

George Crump is founder of Storage Switzerland, an analyst firm focused on the virtualization and storage marketplaces. It provides strategic consulting and analysis to storage users, suppliers, and integrators. An industry veteran of more than 25 years, Crump has held engineering and sales positions at various IT industry manufacturers and integrators. Prior to Storage Switzerland, he was CTO at one of the nation's largest integrators.
