Vulnerabilities / Threats

2/12/2018
01:00 PM
Dan Koloski
Dan Koloski
Commentary
Connect Directly
Twitter
RSS
E-Mail vvv
100%
0%

Better Security Analytics? Clean Up the Data First!

Even the best analytics algorithms using incomplete and unclean data won't yield useful results.

Our industry is losing the cybersecurity war. Not a week goes by in which we don't hear about a new data breach. Overwhelmed security operations center (SOC) personnel, who already were in short supply, are leaving the profession because of sheer exhaustion. The rapid rate of change brought on by DevOps and cloud computing has completely overwhelmed our traditional, rules-based perimeter defense. Sophisticated hacking syndicates and nation-states are coming at us with machines, and we're responding with humans.

The industry's current response to this has been to offer practitioners a dizzying array of shiny, new artificial intelligence (AI)-enabled analytics regimes, each of which claims to have better algorithms than everybody else. Nowhere has this been more pronounced than in standalone user and behavior analytics regimes, but it's undeniable that there has been a rush to add fancy new analytical features to existing, siloed security information and event management tools, intrusion-detection systems, threat feeds, network monitoring, cloud access security brokers, common vulnerabilities and exposures lists, configuration management databases, log tools, and more.

Here's the problem with that approach: even the best analytics algorithms operating against incomplete and unclean data aren't going to yield useful results.

Economists, behavioral scientists, mathematicians, and ethicists often refer to the concept of "imperfect information," in which parties involved in the decision-making process (be it a market, game-theory scenario, or ethical question) do not have equal access to all the information required to make a decision. The concept is important because it is both theoretically and empirically demonstrable that imperfect information leads to bad outcomes; for example: markets don't function as well, game theory doesn't accurately predict what will happen, or one party in a transaction takes advantage of another. The drive toward transparency in many areas of business and life is a direct reflection of the fact that imperfect information is undesirable. Even though truly perfect information may be unachievable, most transactional and behavioral scenarios certainly benefit from the availability of less-imperfect information (or in other words, closer-to-perfect information).

The environment already presents a huge amount of data to the SOC. We have security events, user activity, intrusion detection, threat intelligence, network activity, cloud access, known exploits and vulnerabilities, configuration and IT activity metrics, security and operational logs, identity, and many other sources of data. Each of these sources tends to both emanate from and land in separate data silos. Traditionally, we expected our human SOC operators to be able to work across all of these silos, process all of this data, and turn it into actionable information. That didn't work. SOCs were overwhelmed, exhausted people naturally missed things, we didn't have all the information we needed, and we landed pretty much where we are today.

Now we are expecting that bolted-on, AI-enabled regimes will solve all of our problems. It's true that machines don't get tired and can analyze more data at scale than humans. That's good. But machines can only analyze the data with which they are presented. That means if we apply AI to, say, our user activity data silo, but that data is separated from our configuration information silo, our topology-mapping silo, or our network monitoring silo (you get the idea…), we're back to the imperfect information problem. Fancy analytics against imperfect information still yields decisions you can't entirely trust. If you can't trust the decisions, how can you automate the remediation based on them?

Time for a Fresh Approach
The hard truth is that we need to rethink our data tier. A data tier that perpetuates unconnected silos of data and expects an AI-enabled analytic regime to somehow normalize across them will yield the same "analysis paralysis" that faces human operators: too much uncertainty and too many gray areas to draw a definitive conclusion (and therefore to take action). The common phrase for this is "garbage in, garbage out." The reason that truth is hard is because most SOCs have substantial investment in those siloed data tiers already, and there is natural inertia to consider replacing them.

A better data tier will allow the ingest and normalization of the full operational and security data set as a single data lake that can then be optimized for AI-enabled analysis. I say "operational and security data set" because they are closely related. For example, user activity drives optimization for performance (operations) and hardening (security). Configuration information is critical to resolving performance issues (operations) and vulnerabilities (security). Derived topology and dependency mapping is as equally useful for troubleshooting performance problems as it is for data-loss prevention and attack detection.

Better data tiers exist, but they aren't bolt-ons to existing silos; they are replacements for them. While that may be hard to swallow, we need to adapt to the new reality, and a bolt-on approach won't get us there. Armed with better and cleaner data, an AI-based analytics regime is more able to derive better conclusions, and those conclusions can be used to directly interface with automated remediation, yielding a highly automated cyber-defense regime that is more appropriate for today's threat environment.

Think radically. Your attackers are, I assure you.

Related Content:

 

Black Hat Asia returns to Singapore with hands-on technical Trainings, cutting-edge Briefings, Arsenal open-source tool demonstrations, top-tier solutions and service providers in the Business Hall. Click for information on the conference and to register.

Dan Koloski is a software industry expert with broad experience as both a technologist working on the IT side and as a management executive on the vendor side. Dan is a Vice President in Oracle's Systems Management and Security products group, which produces the Oracle ... View Full Bio
Comment  | 
Print  | 
More Insights
Comments
Newest First  |  Oldest First  |  Threaded View
AMoss
50%
50%
AMoss,
User Rank: Guru
3/11/2018 | 2:14:13 PM
data quality & normalization
Absolutely agree.  We should be spending as much time normalizing our operational/config/posture data as we are threat data.  
When Your Sandbox Fails
Kowsik Guruswamy, Chief Technology Officer at Menlo Security,  4/11/2019
Julian Assange Arrested in London
Dark Reading Staff 4/11/2019
8 'SOC-as-a-Service' Offerings
Steve Zurier, Freelance Writer,  4/12/2019
Register for Dark Reading Newsletters
White Papers
Video
Cartoon
Current Issue
5 Emerging Cyber Threats to Watch for in 2019
Online attackers are constantly developing new, innovative ways to break into the enterprise. This Dark Reading Tech Digest gives an in-depth look at five emerging attack trends and exploits your security team should look out for, along with helpful recommendations on how you can prevent your organization from falling victim.
Flash Poll
Twitter Feed
Dark Reading - Bug Report
Bug Report
Enterprise Vulnerabilities
From DHS/US-CERT's National Vulnerability Database
CVE-2019-1840
PUBLISHED: 2019-04-18
A vulnerability in the DHCPv6 input packet processor of Cisco Prime Network Registrar could allow an unauthenticated, remote attacker to restart the server and cause a denial of service (DoS) condition on the affected system. The vulnerability is due to incomplete user-supplied input validation when...
CVE-2019-1841
PUBLISHED: 2019-04-18
A vulnerability in the Software Image Management feature of Cisco DNA Center could allow an authenticated, remote attacker to access to internal services without additional authentication. The vulnerability is due to insufficient validation of user-supplied input. An attacker could exploit this vuln...
CVE-2019-1826
PUBLISHED: 2019-04-18
A vulnerability in the quality of service (QoS) feature of Cisco Aironet Series Access Points (APs) could allow an authenticated, adjacent attacker to cause a denial of service (DoS) condition on an affected device. The vulnerability is due to improper input validation on QoS fields within Wi-Fi fra...
CVE-2019-1829
PUBLISHED: 2019-04-18
A vulnerability in the CLI of Cisco Aironet Series Access Points (APs) could allow an authenticated, local attacker to gain access to the underlying Linux operating system (OS) without the proper authentication. The attacker would need valid administrator device credentials. The vulnerability is due...
CVE-2019-1830
PUBLISHED: 2019-04-18
A vulnerability in Locally Significant Certificate (LSC) management for the Cisco Wireless LAN Controller (WLC) could allow an authenticated, remote attacker to cause the device to unexpectedly restart, which causes a denial of service (DoS) condition. The attacker would need to have valid administr...