04:05 PM
Bill Kleyman

Your Cloud Was Breached. Now What?

You're not happy. You just experienced a breach. Here's how to keep calm and secure your cloud.

First of all, take a deep breath. If you stay vigilant during a cloud breach -- and have a proactive security model in place -- you’ll weather the storm. The first step is to be prepared.

The preparation
There are a lot of similarities between a physical breach and one that happens in the cloud, and some of the preparation mechanisms remain the same. The big difference comes in the toolset. Many cloud service providers (CSPs) offer very granular log aggregation, visibility into virtual networks, and even the ability to create cloud-ready audit trails. To be completely ready, here’s what you need to organize in advance:

  • Documentation, electronic or physical. In many cases, you will need to create a breach protocol to follow. CSPs will probably have their own breach protocols. However, the data and settings residing on your system may still be your responsibility to document.
  • Snapshotting services and physical removal tools. Even if your virtual machine lives in the cloud, at some point a physical server was compromised. You need to snapshot and isolate that server immediately. Remember, snapshots can be taken whether a VM is on or off. Your initial step should not be to alter the state of the VM; rather, it should be to document and snapshot the instance.
  • Virtual and physical machines. There is a good chance that you will need to transport data, snapshots, and other resources physically as well as virtually. In some cases, you’ll need to make arrangements to transfer the affected physical gear for post-breach analysis. Your CSP can help you take down impacted hardware for further testing.

The immediate response
There are three mandatory rules you must follow immediately, particularly if your workload is a VM or resides in the cloud. First, do not alter the condition of the VM or cloud instance: if it’s off, leave it off; if it’s on, leave it on. Second, avoid attempts to access files. Third, do not change settings.

The follow-up
These seven steps will take you from breach to remediation.

Step 1: Create snapshots of VMs, virtual appliances, and configurations. This can include screenshots, log dumps, or configuration collections. In some cases you may need to take written notes on what appears on the monitor, management screen, or any other output device. Active programs may require more extensive documentation of the virtual machine’s activity. At this point, it is imperative that you do not make any state changes to the cloud instance or VM, as doing so could significantly alter your research. The machine should remain active until you have completed the snapshotting process.
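
One way to keep the notes from Step 1 consistent is to append every observation to a timestamped log. The sketch below is illustrative only: the `EvidenceLog` class, its field names, and the instance ID `vm-example-01` are assumptions made for this example and do not correspond to any CSP's tooling.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EvidenceLog:
    """Append-only list of observations (hypothetical helper, not a CSP API)."""
    entries: list = field(default_factory=list)

    def record(self, instance_id: str, kind: str, note: str) -> dict:
        # Store the time in UTC so entries from different teams line up.
        entry = {
            "recorded_at": datetime.now(timezone.utc).isoformat(),
            "instance_id": instance_id,
            "kind": kind,  # e.g. "snapshot", "screenshot", "log_dump", "written_note"
            "note": note,
        }
        self.entries.append(entry)
        return entry

    def to_json_lines(self) -> str:
        # One JSON object per line is easy to archive and to compare later.
        return "\n".join(json.dumps(e, sort_keys=True) for e in self.entries)


log = EvidenceLog()
log.record("vm-example-01", "snapshot", "Snapshot taken while instance was running")
log.record("vm-example-01", "written_note", "Management console showed 3 active sessions")
print(log.to_json_lines())
```

The log only describes what was observed; it never touches the instance itself, which keeps it in line with the rule against state changes.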

Step 2: Protect perishable data, both physical and virtual. Is there a drive attached to a server? Is DAS being used? Perishable data should be immediately secured, documented, and/or snapshotted, and in some cases physically photographed. If an endpoint using a cloud service becomes compromised, make sure to collect its power supply as well, and ensure these devices remain plugged in, even when in storage.

Step 3: Properly take down the physical resource or virtual instance. There are ways to take down a physical machine -- and ways to take down a virtual instance. Both are critical processes during a breach. Regardless, prior to changing the state of the physical or virtual instance, document and snapshot everything! Because most VMs and cloud platforms utilize shared storage, you may have some extra work here. Massive breaches can force you to take your storage platform offline temporarily for snapshots and evidence gathering. Document the LUNs, connections, and even disk aggregates that were used for that VM; create a snapshot of the assigned virtual disk(s); and make sure to document all processes during your investigation.
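
The ordering in Step 3 (document, then snapshot, and only then take down) can be enforced in tooling rather than left to memory. The following is a minimal sketch under that assumption; `Stage` and `TakedownGuard` are hypothetical names invented here, and the guard tracks progress only, without calling any real cloud or hypervisor API.

```python
from enum import Enum


class Stage(Enum):
    DISCOVERED = 1
    DOCUMENTED = 2
    SNAPSHOTTED = 3
    TAKEN_DOWN = 4


class TakedownGuard:
    """Hypothetical helper that refuses to skip a stage."""

    def __init__(self, instance_id: str):
        self.instance_id = instance_id
        self.stage = Stage.DISCOVERED

    def advance(self, target: Stage) -> None:
        # Only the immediately following stage is allowed.
        if target.value != self.stage.value + 1:
            raise RuntimeError(
                f"{self.instance_id}: cannot go from {self.stage.name} to {target.name}"
            )
        self.stage = target


guard = TakedownGuard("vm-example-01")
try:
    guard.advance(Stage.TAKEN_DOWN)  # skips documentation and snapshot
except RuntimeError as err:
    print(err)

guard.advance(Stage.DOCUMENTED)
guard.advance(Stage.SNAPSHOTTED)
guard.advance(Stage.TAKEN_DOWN)
print(guard.stage.name)
```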

Step 4: Identify all incoming network lines, connections, virtual interfaces, and ports assigned to the VM or cloud instance. You will need to work with security, network, storage, and infrastructure teams to document and understand how all configurations impact the state of the breached cloud or VM instance. Collaboration during a breach is absolutely critical. Plus, your CSP should have dedicated teams to reference as well.
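
A shared, structured inventory can make the cross-team documentation in Step 4 easier to reconcile. This is a rough sketch, not a prescribed format: the `InterfaceRecord` fields, the team names, and the addresses (taken from the reserved documentation ranges) are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceRecord:
    """One documented interface on the breached instance (hypothetical schema)."""
    instance_id: str
    interface: str
    address: str
    inbound_ports: tuple
    owning_team: str


def ports_by_team(records):
    # Group the documented inbound ports by the team responsible for them.
    summary = {}
    for rec in records:
        summary.setdefault(rec.owning_team, set()).update(rec.inbound_ports)
    return {team: sorted(ports) for team, ports in summary.items()}


records = [
    InterfaceRecord("vm-example-01", "eth0", "192.0.2.10", (443, 22), "network"),
    InterfaceRecord("vm-example-01", "eth1", "198.51.100.7", (3260,), "storage"),
    InterfaceRecord("vm-example-01", "eth2", "203.0.113.5", (443,), "network"),
]

print(ports_by_team(records))
```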

Step 5: Collect and label all media used during the response process. Just because your breach happened in the cloud doesn’t mean you won’t have physical documentation. Massive breaches still involve digital photographs, paper trails, and governance documentation. You will have digital and physical media that will be collected from the breached instance. Fortunately, cloud management tools can help with log aggregation and VM state identification and can even help provide historical reporting.
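
For the digital media in Step 5, one common labeling approach is to pair each file with a sequential label and a cryptographic hash, so that later copies can be checked against the original. The sketch below uses Python's standard `hashlib`; the case ID format, the file names, and the `build_labels` helper are assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    # Read in chunks so large disk images do not have to fit in memory.
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_labels(evidence_dir: Path, case_id: str) -> list:
    # Sort by name so the numbering is reproducible.
    files = sorted(p for p in evidence_dir.iterdir() if p.is_file())
    labels = []
    for number, path in enumerate(files, start=1):
        labels.append({
            "label": f"{case_id}-{number:03d}",
            "file": path.name,
            "size_bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return labels


# Demonstration with throwaway files standing in for collected evidence.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "console.log").write_bytes(b"login from 192.0.2.10\n")
    (folder / "screen.txt").write_bytes(b"3 active sessions\n")
    labels = build_labels(folder, "CASE-0001")

for item in labels:
    print(item["label"], item["file"], item["sha256"][:12])
```

Hashing reads the files without modifying them, but it should still be run against copies or snapshots rather than the live instance.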

Step 6: Seal all collected devices, drives, and evidence in a secured area. Proper protocol will dictate that any and all evidence gathered must be locked down and secured. At this point in the process, you’ve taken your snapshots, pulled necessary physical components, and gathered as much data as possible. Now you absolutely need to lock it down for analysis and evaluation.
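
Physical locks cover the devices; for the records that describe who handled them, a hash-chained custody log is one possible way to make later tampering evident. This is a simplified sketch and an assumption on my part, not a substitute for your organization's or your CSP's formal chain-of-custody procedure; the function names and entry fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _hash_body(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def add_custody_entry(chain: list, item: str, holder: str, action: str) -> dict:
    # Each entry commits to the hash of the entry before it.
    previous_hash = chain[-1]["entry_hash"] if chain else GENESIS
    body = {"item": item, "holder": holder, "action": action,
            "previous_hash": previous_hash}
    entry = dict(body, entry_hash=_hash_body(body))
    chain.append(entry)
    return entry


def chain_is_intact(chain: list) -> bool:
    previous_hash = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["previous_hash"] != previous_hash:
            return False
        if entry["entry_hash"] != _hash_body(body):
            return False
        previous_hash = entry["entry_hash"]
    return True


custody = []
add_custody_entry(custody, "CASE-0001-001", "responder-a", "collected")
add_custody_entry(custody, "CASE-0001-001", "evidence-room", "sealed")
print(chain_is_intact(custody))

# Editing an earlier entry breaks verification.
tampered = [dict(e) for e in custody]
tampered[0]["holder"] = "unknown"
print(chain_is_intact(tampered))
```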

Step 7: Remediate and respond. You’re not happy -- you just experienced a breach. At a high level, you understand where the breach came from. So your final task is now to lock down ports, services, or other affected areas. But, if you followed my earlier advice, you’re also staying calm and looking at better ways to secure your cloud for the future.


Bill is an enthusiastic technologist with experience in datacenter design, management, and deployment. His architecture work includes large virtualization and cloud deployments as well as business network design and implementation. Bill enjoys writing, blogging, and educating ...

Bill Kleyman,
User Rank: Apprentice
3/14/2014 | 11:16:26 AM
Re: Leave the intruder alone for a little while longer?
@Charlie - I was just waiting for someone to give me a solid use-case. The advanced nature of today's modern infrastructure allows us to do great things with technology. Virtualization, cloud, and a distributed platform optimize data flow and application delivery.

However, all of this presents new types of targets. So, we have a few scenarios here...

There are a number of different types of cloud-based attacks that can and do happen. These include port attacks, DDoS, application-specific threats, database attacks and much more.

So the answer really depends on the attack and who it's against. Let's look at this example - According to a recent Arbor Networks report, DDoS attacks originally targeted Spamhaus on 16th March, 2013. Spamhaus engaged the services of CloudFlare, who were able to mitigate the initial attacks successfully. The attacks then escalated between 19th and 21st March, exhausting the capabilities of CloudFlare. The report goes on to say that the attacks also moved on to target next-hop addresses at IX's around the world (AMS-IX, DE-CIX, HK-IX, Equinix and LINX), causing congestion and a perceived Internet slow down in some geographies. ISPs around the world have worked to deploy filters to mitigate the impact of the attacks.

In this case, it was a scramble to halt this type of congestion and attack.

In other cases, very specific attacks may target a service or an application. During this kind of attack, a malicious piece of software or user continues to run and operate on the system. In these cases you still need to isolate the application or data point to identify and quantify the ramifications of the attack. If it's a VM, snapshotting it will allow you to see present-state metrics around the attack. Of course, governance and compliance play a big role as well.

Basically, there will be cases where a security professional will want to regain control, monitor, and remediate a potential attack. 
Bill Kleyman,
User Rank: Apprentice
3/14/2014 | 10:56:51 AM
Re: Thanks for great post.
I second that :) Much appreciated!
Charlie Babcock,
User Rank: Moderator
3/13/2014 | 12:34:24 PM
Leave the intruder alone for a little while longer?
Bill, your description of needing to be prepared to preserve the server and storage as is for forensic analysis is extremely interesting. Nice job of that. But tell me, doesn't that assume the damage caused by the breach is a fait accompli and over? What if an intruder or active malware is still at work? Do you have to allow it to continue as you go about snapshotting and recording? That would be hard to do.
Marilyn Cohodas,
User Rank: Strategist
3/13/2014 | 11:21:43 AM
Re: Thanks for great post.
Thanks for the compliment for Bill, Eddiemayan. What did you like about the post? Tell us what you learned, or what you will do differently after reading it.
Eddie Mayan,
User Rank: Apprentice
3/13/2014 | 8:03:26 AM
Thanks for great post.
Thanks for great post.