Cloud Monitoring: The New 'Alert Overload' Problem & How to Fix It

While cloud computing offers a variety of proven business benefits, from a security perspective, IT teams are often still wavering in uncharted territory – and cloud monitoring is one such area.

Joe Vadakkan, Global Cloud Security Leader, Optiv Security

January 8, 2020

"Alert overload" in cybersecurity is a well-understood phenomenon. You'd be hard-pressed to find an IT security professional who hasn't experienced the pains associated with trying to keep up with a cacophony of security tools and services, each of which generates a deluge of alerts warranting analysis and action. The security industry is working to solve this problem by using automation, artificial intelligence, machine learning and other technologies designed to cut down on the "noise." Unfortunately for IT security professionals, as they tackle this issue, another overload problem is emerging -- one that is even more onerous and dangerous: cloud monitoring.

The cloudy state of affairs
The public cloud is becoming the underlying fabric of enterprise IT organizations. According to Gartner, Inc.: "The worldwide public cloud services market is projected to grow 17.5 percent in 2019 to total $214.3 billion, up from $182.4 billion in 2018." While cloud computing offers a variety of proven business benefits, from a security perspective, IT teams are often still wavering in uncharted territory -- and cloud monitoring is one such area.

The intended purpose of cloud monitoring is to analyze cloud applications, services, assets and environments to quickly detect and remediate potential threats in the cloud. While the purpose is straightforward, carrying out cloud monitoring successfully can be much more complex. There are two main reasons for this:

  1. Cloud environments are in a state of continuous change, thanks to next-generation application development processes, such as DevOps and continuous delivery; multi-cloud and hybrid-cloud architectures; multiple data sources, including those from third parties; and the general flexibility and elasticity of cloud environments. This means new vulnerabilities and potential compliance violations are continuously created, making it impossible for IT security teams to keep up using traditional manual monitoring and remediation processes.

  2. There are too many tools issuing alerts, and not enough IT staff available to manage them. Not only are alerts being generated at a rapid pace due to the dynamic nature of cloud environments, but IT infrastructures today are overly complex, with too many vendor applications, services and tools issuing alerts. In fact, the average enterprise has 70 different security vendors in its infrastructure. In this state, organizations simply cannot hire enough people to monitor and remediate all of the issues arising in the turbulent cloud, especially with the industry's chronic shortage of cybersecurity and cloud skills.

In short, over-stretched IT teams are struggling to monitor, manage and secure dynamic hybrid- and multi-cloud environments from ever-evolving threats, vulnerabilities and compliance violations, leading to increased enterprise risk.

The fix: first, see clearly in the cloud
Organizations normally discover cloud issues in one of two ways: something goes wrong, or they take steps to attain full visibility into the cloud environment and detect issues before something goes wrong. Obviously, the second approach is far preferable to the first. The way to achieve full visibility is to use cloud monitoring tools, which can effectively track disk configurations, mislabeled tags, CIS benchmarks, regulatory modules, NIST frameworks, etc. Fortunately, many cloud providers already offer strong monitoring tools within their own platforms, such as Google Cloud Stackdriver and its Cloud Security Command Center, Microsoft Azure Security Center and AWS CloudWatch/CloudTrail, which integrate with AWS's Macie, GuardDuty and Inspector. IT teams can use these tools natively and get good visibility into the criticality of issues in a cloud environment (low, medium, high, etc.). If you start validating an environment with these tools, your organization will find out how vulnerable its cloud security posture really is.

Where this approach fails is in the next step. Once someone sees a high- or medium-criticality issue, it may get fixed on a one-time basis, but that is not enough. You need to fix the infrastructure code across the entire environment (via configuration management) and add guardrails so that the security posture continually evolves. A one-time point fix will not do that.
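To make the difference between a point fix and a guardrail concrete, here is a minimal Python sketch. It assumes a simplified, in-memory description of declared storage buckets; the names BucketSpec and guardrail_violations are hypothetical and do not correspond to any provider's API. The idea is that the same policy check runs against every declaration each time the infrastructure code changes, rather than being applied once to a single resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketSpec:
    """Hypothetical declaration of a storage bucket in infrastructure code."""
    name: str
    public_read: bool = False
    encrypted: bool = True


def guardrail_violations(specs):
    """Return (bucket name, reason) pairs for every declaration that breaks policy."""
    violations = []
    for spec in specs:
        if spec.public_read:
            violations.append((spec.name, "public read access is not allowed"))
        if not spec.encrypted:
            violations.append((spec.name, "encryption at rest is required"))
    return violations


# Illustrative declarations only; a real pipeline would load these from
# the organization's configuration management system.
declared = [
    BucketSpec("logs"),
    BucketSpec("marketing-assets", public_read=True),
    BucketSpec("exports", encrypted=False),
]

violations = guardrail_violations(declared)
for name, reason in violations:
    print(f"{name}: {reason}")
```

Run as part of a deployment pipeline, a check like this could block a change before it reaches the cloud, so the same misconfiguration does not have to be found and fixed again later.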

Auto remediation = automation and action
This post-visibility point is where auto-remediation can have a significant impact -- it allows IT teams to create functions within their environment with logic written around them. For example, you can use if/then statements such as "if there is a misconfiguration, then do XYZ task." Or, "if there is an S3 bucket that has a read attribute on it, then shut it off or encrypt every object that gets uploaded." This approach moves beyond cloud visibility and monitoring -- it puts the information through the DevOps lifecycle and applies automation to take appropriate action to make the environment secure.
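The if/then logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production implementation: the Bucket class, the two action functions and the RULES table are all hypothetical, and the actions only change in-memory state. In a real environment each action would call the cloud provider's API or trigger a configuration management run.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical in-memory view of a storage bucket's settings."""
    name: str
    public_read: bool
    default_encryption: bool


def close_public_read(bucket):
    # Assumption: a real action would call the provider's API here.
    bucket.public_read = False
    return f"{bucket.name}: public read access removed"


def enable_default_encryption(bucket):
    # Assumption: a real action would turn on encryption for new uploads.
    bucket.default_encryption = True
    return f"{bucket.name}: default encryption enabled"


# Each rule pairs an "if" condition with a "then" action.
RULES = [
    (lambda b: b.public_read, close_public_read),
    (lambda b: not b.default_encryption, enable_default_encryption),
]


def auto_remediate(buckets):
    """Apply every matching rule to every bucket and return a log of actions taken."""
    actions = []
    for bucket in buckets:
        for condition, action in RULES:
            if condition(bucket):
                actions.append(action(bucket))
    return actions


buckets = [
    Bucket("reports", public_read=True, default_encryption=False),
    Bucket("logs", public_read=False, default_encryption=True),
]
actions = auto_remediate(buckets)
for line in actions:
    print(line)
```

Keeping the rules in a table separate from the loop means a new if/then rule is a one-line addition, and running the function a second time on the same buckets takes no further action because the conditions no longer match.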

A broad shift in processes (and mindset)
It's important to note that organizations first need to evolve their business processes in order to use auto-remediation effectively. As enterprises move from on-premises environments to the cloud, they may still be using traditional (legacy) change-management processes. That is where things can really slow down. Old processes are often at odds with automation, so they need to be modernized. This is a struggle that many, if not most, enterprises are currently having, which explains why so few organizations are taking advantage of auto-remediation.

The problem is not the tools -- there are plenty of those -- it's the shift in mindset that is needed around how to automate workflows. It comes back to human behavior -- the muscle-memory of doing things a certain way over a long period of time. (Anyone who's used self-parking technology in a car knows how unnerving it can be to take their hands off the steering wheel for the first time.) Automation needs to be the basis of the "new process," and people need to understand that the old process is not a solution to anything -- it's actually the root cause of the cloud monitoring and remediation problem.

As usual, it's people, process and technology
When it comes to cloud monitoring, the old "people, process and technology" approach to organizational change still applies. In most cases today, the process is outdated, the technology is not being used properly, and people are in a no-win situation. By implementing technology that enables full visibility into the cloud and auto-remediation, adopting agile change-management processes that accommodate the new world of automation, and expanding the concept of "people" to include third-party specialists who take the burden off staff, organizations can stop the cloud-monitoring overload problem before it metastasizes into breaches and compliance violations.

— Joe Vadakkan is the global cloud security leader at Optiv Security. He also serves as the president of the Cloud Security Alliance, Southwest Chapter.

About the Author(s)

Joe Vadakkan

Global Cloud Security Leader, Optiv Security

Joe Vadakkan brings more than 18 years of global infrastructure architecture and security experience, focusing on all aspects of cyber and data security, to his role of global practice leader, cloud security, for Optiv. Vadakkan's expertise in information security and IT infrastructure spans public and private sector companies across diverse industries, including aerospace and defense, software development, finance and insurance, healthcare, transportation logistics, retail, government, and consulting. Prior to his role at Optiv, Vadakkan worked and consulted at various Fortune 500 organizations, building secure architecture solutions for hundreds of clients in public cloud, building large-scale ITO portfolio solutions for "big 10" IT service providers, designing secure public cloud for IaaS and SaaS consumption for a top-five public cloud service provider, and architecting and implementing private cloud for government and defense programs.
