Dark Reading is part of the Informa Tech Division of Informa PLC



12/9/2009
10:05 AM
George Crump
Commentary

When Controllers Fail

What are the chances of a controller failing in a storage system? I don't know the exact statistic, but it's safe to assume that it's pretty low. When controllers do fail, the ramifications can be extreme, especially in the increasingly virtualized data center that counts on shared storage. Active-active controllers provide protection from controller failure, but the name is a bit of a misnomer. Both controllers are in use, but each is assigned to specific workloads.

Most controller-based storage architectures have at least two controllers for redundancy; higher-end systems may have more. As stated earlier, the problem is that these controllers do not typically share a workload. Each controller is assigned a specific set of disks or LUNs to manage. That controller is responsible for responding to I/O requests, performing the XOR calculations for RAID parity, and providing any of the data services that the storage system offers, such as snapshots, thin provisioning, or replication, for just those specific LUNs.

If a controller fails, access to the LUNs that were assigned to it is rerouted through the surviving controller. In the virtualized world, this could mean the instant movement of the I/O requests of dozens of virtual machines to another controller. Of course, this other controller already had a series of workloads of its own to support. The result is that in a dual-controller system the processing power available to your workloads just got cut in half. With RAID 6 and all the various data services that storage controllers provide today, they may already be burdened and may not have the excess capacity to support the extra load without noticeable losses in performance to the user.
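To make the failover arithmetic concrete, here is a minimal Python sketch of the ownership model described above. The LUN names, controller labels, and load percentages are invented for illustration; real arrays report utilization in their own ways.

```python
from dataclasses import dataclass


@dataclass
class Lun:
    name: str
    owner: str        # controller that services this LUN (hypothetical label)
    load_pct: float   # share of one controller's capacity this LUN consumes


def controller_load(luns, controller):
    """Sum the load of every LUN owned by the given controller."""
    return sum(lun.load_pct for lun in luns if lun.owner == controller)


def fail_over(luns, failed, survivor):
    """Reassign every LUN owned by the failed controller to the survivor."""
    return [
        Lun(lun.name, survivor if lun.owner == failed else lun.owner, lun.load_pct)
        for lun in luns
    ]


# Illustrative numbers only: controller A carries 65%, controller B carries 60%.
luns = [
    Lun("lun0", "A", 35.0),
    Lun("lun1", "A", 30.0),
    Lun("lun2", "B", 40.0),
    Lun("lun3", "B", 20.0),
]

before = controller_load(luns, "B")
after = controller_load(fail_over(luns, failed="A", survivor="B"), "B")
print(f"Controller B before failure: {before:.0f}%")
print(f"Controller B after failure:  {after:.0f}%")
```

In this made-up case the surviving controller is asked for 125% of its capacity, which is the kind of overload that shows up to users as a performance loss.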

Having four or more controllers does not really help this situation, as there is always going to be a load that moves fully to another controller. The exception could be if the storage system had the intelligence to move the LUNs to the least busy controller. The answer may be to have all the storage workloads already spread evenly across all the available controllers. For example, in a four-controller system, all four controllers would respond to the I/O requests for all the LUNs in the storage system. If a controller fails, only 25% of the workload needs to be reallocated, and it is split across the three remaining controllers. Assuming that most systems do not run at a sustained utilization above 75%, the failure should cause no noticeable performance loss to the applications.
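The 75% figure follows from simple arithmetic, sketched below in Python. The sketch assumes the workload is spread perfectly evenly and that controller capacity scales linearly, which real systems only approximate; the function names are my own.

```python
def utilization_after_failure(utilization_pct, controllers, failed=1):
    """Per-controller utilization once the failed controllers' share is
    spread evenly across the survivors."""
    survivors = controllers - failed
    if survivors <= 0:
        raise ValueError("at least one controller must survive")
    return utilization_pct * controllers / survivors


def max_safe_utilization(controllers, failed=1):
    """Highest pre-failure utilization that keeps survivors at or below 100%."""
    survivors = controllers - failed
    if survivors <= 0:
        raise ValueError("at least one controller must survive")
    return 100.0 * survivors / controllers


for n in (2, 4, 8):
    print(
        f"{n} controllers: safe up to {max_safe_utilization(n):.1f}%, "
        f"60% becomes {utilization_after_failure(60.0, n):.1f}% after one failure"
    )
```

With two controllers the ceiling is 50%, and with four it is 75%, which is why sharing the load across more controllers leaves more usable headroom.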

Delivering this type of capability will more than likely require a clustered or grid storage implementation, where the storage I/O workload is shared across all of the controllers or nodes in the system. Without that capability, storage managers should pay very close attention to their storage processor utilization. Anything above 50% on any of the controllers should be a cause for concern and possibly a reason for a hardware upgrade.
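A check against that 50% rule of thumb could look something like the following Python sketch. The controller names and utilization samples are hypothetical, and averaging a handful of samples is only one rough way to approximate "sustained" utilization.

```python
from statistics import mean

THRESHOLD_PCT = 50.0  # rule of thumb from the text for systems without shared workloads


def controllers_over_threshold(samples, threshold_pct=THRESHOLD_PCT):
    """Return {controller: average utilization} for every controller whose
    average utilization exceeds the threshold."""
    averages = {name: mean(values) for name, values in samples.items()}
    return {name: avg for name, avg in averages.items() if avg > threshold_pct}


# Hypothetical utilization samples (percent), e.g. collected over a business day.
samples = {
    "controller-a": [42.0, 48.0, 45.0],
    "controller-b": [58.0, 63.0, 59.0],
}

flagged = controllers_over_threshold(samples)
for name, avg in flagged.items():
    print(f"{name}: average {avg:.0f}% exceeds {THRESHOLD_PCT:.0f}%")
```

Here only the second controller is flagged, since its average sits at 60% while the first averages 45%.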

Track us on Twitter: http://twitter.com/storageswiss


George Crump is lead analyst of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. Find Storage Switzerland's disclosure statement here.

 
