Dark Reading is part of the Informa Tech Division of Informa PLC

5/4/2017
05:00 AM
Ashwin Krishnan
News Analysis-Security Now

First AWS, Now Microsoft Cloud; Who's Next?

Outages are inevitable, but how can we deal with them better?

So, there we have it. Within a window of a few months, the top two public cloud providers on the planet -- Amazon Web Services LLC and Microsoft Cloud -- have had bodily seizures that sent the rest of us (mere cells in their ecosystem) into crazy orbits. Enough of the drama; let's get to the facts. In this age of information deluge, it is fair to assume that readers may have forgotten the specifics, so let's recap.

The Amazon Simple Storage Service (S3) had an outage on Tuesday, February 28. An authorized S3 team member who was using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process. However, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended. And the rest, as they say, is history!

Now let's turn to the Microsoft episode. On Tuesday, March 21, Outlook, Hotmail, OneDrive, Skype and Xbox Live were all significantly impacted, with problems ranging from failed logins to degraded service. True to form, the Microsoft response was to downplay the impact and provide little detail (by contrast, Amazon provided a much more detailed post mortem). Microsoft's account was brief: a subset of Azure customers may have experienced intermittent login failures while authenticating with their Microsoft accounts. Engineers identified a recent deployment task as the potential root cause and rolled it back to mitigate the issue.

So, is this the death of public cloud? Nah. Far from it. And anyone who says otherwise should have their head examined. BUT, it should serve as a wake-up call to every IT, security and compliance professional across every industry. Why? Because this kind of "user error" or "deployment task snafu" can happen anywhere -- on-premises, on private cloud and on public cloud. And since every enterprise runs on one or more of the above, every enterprise is at risk. So enough of the fear mongering. What can you do about it? Glad you asked.

There are really three vectors of control: scope, privileges and governance model.

Scope is the number of "objects" that each admin (or script) is authorized to work on at any given time; think of it as the blast radius. Using the Microsoft Cloud example (I realize I am extrapolating, since Microsoft has not provided any details), this may be the number of containers a deployment task can operate on at any given time.

Privileges control what an administrator or task can do to each object. Continuing with the container example above, a privilege restriction could be that a container can be launched but not destroyed.

And finally, you need a governance model. This is the implementation of best practices and a well-defined policy for enforcing the two functions above -- scope and privileges -- in a self-driven fashion. In this example, the policy could be to ensure that the number of containers an admin can operate on remains under 100 (scope) and that any increase in that number automatically requires a pre-defined approval process (control). Further sophistication can be built in: the approver need not be human but could be a bot that checks the type of container and the load on the system and approves (or denies) the request. Bottom line: checks and balances.
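To make the three controls concrete, here is a minimal Python sketch of how such a policy check might look. Everything in it is an assumption made for illustration: the names (`Request`, `authorize`, `bot_approver`, the `deploy-task` actor), the "stateless" container type and the 0.5 load threshold are hypothetical and are not drawn from any cloud provider's tooling. Only the under-100 limit and the launch-but-not-destroy restriction come from the example above.

```python
from dataclasses import dataclass

# Scope: the article's example keeps an admin under 100 containers
# unless an approval step signs off on more.
MAX_SCOPE = 100

# Privileges: actions each actor may perform (hypothetical names).
# Here the deployment task may launch containers but not destroy them.
ALLOWED_ACTIONS = {"deploy-task": {"launch"}}


@dataclass(frozen=True)
class Request:
    actor: str            # who (or which script) is asking
    action: str           # e.g. "launch" or "destroy"
    container_type: str   # hypothetical label, e.g. "stateless"
    targets: tuple        # container IDs the request would touch


def bot_approver(request: Request, system_load: float) -> bool:
    """Stand-in approval bot: looks at container type and system load.

    The rule (stateless containers only, load below 0.5) is an assumed
    example, not a recommendation.
    """
    return request.container_type == "stateless" and system_load < 0.5


def authorize(request: Request, system_load: float, approver=bot_approver) -> str:
    # Privilege check: is this actor allowed to perform this action at all?
    if request.action not in ALLOWED_ACTIONS.get(request.actor, set()):
        return "denied: privilege"
    # Scope check: requests under the limit need no approval.
    if len(request.targets) < MAX_SCOPE:
        return "allowed"
    # Governance: anything at or above the limit goes through the approver.
    if approver(request, system_load):
        return "allowed: approved"
    return "denied: approval"


ids = tuple(f"c{i}" for i in range(150))
big = Request("deploy-task", "launch", "stateless", ids)
print(authorize(big, system_load=0.2))  # allowed: approved
print(authorize(big, system_load=0.9))  # denied: approval
```

The point of the sketch is the ordering: privileges are checked first, scope second, and only an over-limit request reaches the approver, which can be a human workflow or a bot swapped in through the `approver` parameter.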

So there you have it. The two largest public clouds have suffered embarrassing outages in the past few months. They will recover, get stronger and most likely have future outages as well. The question for the rest of us is what we learn from their experience and how we make our own environments better, whether in our own data centers or on private and public clouds. If we don't, we may not be lucky enough to fight another day.

— Ashwin Krishnan, SVP, Products & Strategy, HyTrust
