Dark Reading is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

2/12/2018
01:00 PM
Dan Koloski
Commentary

Better Security Analytics? Clean Up the Data First!

Even the best analytics algorithms using incomplete and unclean data won't yield useful results.

Our industry is losing the cybersecurity war. Not a week goes by in which we don't hear about a new data breach. Overwhelmed security operations center (SOC) personnel, who already were in short supply, are leaving the profession because of sheer exhaustion. The rapid rate of change brought on by DevOps and cloud computing has completely overwhelmed our traditional, rules-based perimeter defense. Sophisticated hacking syndicates and nation-states are coming at us with machines, and we're responding with humans.

The industry's current response to this has been to offer practitioners a dizzying array of shiny, new artificial intelligence (AI)-enabled analytics regimes, each of which claims to have better algorithms than everybody else. Nowhere has this been more pronounced than in standalone user and behavior analytics regimes, but it's undeniable that there has been a rush to add fancy new analytical features to existing, siloed security information and event management tools, intrusion-detection systems, threat feeds, network monitoring, cloud access security brokers, common vulnerabilities and exposures lists, configuration management databases, log tools, and more.

Here's the problem with that approach: even the best analytics algorithms operating against incomplete and unclean data aren't going to yield useful results.

Economists, behavioral scientists, mathematicians, and ethicists often refer to the concept of "imperfect information," in which parties involved in a decision-making process (be it a market, a game-theory scenario, or an ethical question) do not have equal access to all the information required to make a decision. The concept is important because it is both theoretically and empirically demonstrable that imperfect information leads to bad outcomes. For example, markets don't function as well, game theory doesn't accurately predict what will happen, or one party in a transaction takes advantage of another. The drive toward transparency in many areas of business and life is a direct reflection of the fact that imperfect information is undesirable. Even though truly perfect information may be unachievable, most transactional and behavioral scenarios certainly benefit from the availability of less-imperfect information (in other words, closer-to-perfect information).

The environment already presents a huge amount of data to the SOC. We have security events, user activity, intrusion detection, threat intelligence, network activity, cloud access, known exploits and vulnerabilities, configuration and IT activity metrics, security and operational logs, identity, and many other sources of data. Each of these sources tends to both emanate from and land in separate data silos. Traditionally, we expected our human SOC operators to be able to work across all of these silos, process all of this data, and turn it into actionable information. That didn't work. SOCs were overwhelmed, exhausted people naturally missed things, we didn't have all the information we needed, and we landed pretty much where we are today.

Now we are expecting that bolted-on, AI-enabled regimes will solve all of our problems. It's true that machines don't get tired and can analyze more data at scale than humans. That's good. But machines can only analyze the data with which they are presented. That means if we apply AI to, say, our user activity data silo, but that data is separated from our configuration information silo, our topology-mapping silo, or our network monitoring silo (you get the idea…), we're back to the imperfect information problem. Fancy analytics against imperfect information still yields decisions you can't entirely trust. If you can't trust the decisions, how can you automate the remediation based on them?

Time for a Fresh Approach
The hard truth is that we need to rethink our data tier. A data tier that perpetuates unconnected silos of data and expects an AI-enabled analytic regime to somehow normalize across them will yield the same "analysis paralysis" that faces human operators: too much uncertainty and too many gray areas to draw a definitive conclusion (and therefore to take action). The common phrase for this is "garbage in, garbage out." That truth is hard because most SOCs already have a substantial investment in those siloed data tiers, and there is natural inertia against replacing them.

A better data tier will allow the ingestion and normalization of the full operational and security data set into a single data lake that can then be optimized for AI-enabled analysis. I say "operational and security data set" because the two are closely related. For example, user activity drives optimization for performance (operations) and hardening (security). Configuration information is critical to resolving performance issues (operations) and vulnerabilities (security). Derived topology and dependency mapping is as useful for troubleshooting performance problems as it is for data-loss prevention and attack detection.
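To make the idea of normalization concrete, here is a minimal Python sketch, not a description of any particular product. It assumes two hypothetical silos (user activity and configuration changes) whose records use different field names, timestamp formats, and host naming. Every field name, record, and helper function below is invented for illustration. The sketch maps both sources onto one common schema and builds a single per-host timeline.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical raw records from two separate silos (all values invented).
USER_ACTIVITY = [
    {"ts": "2018-02-12T13:00:00Z", "user": "alice", "Host": "WEB-01", "action": "login"},
    {"ts": "2018-02-12T13:05:00Z", "user": "bob", "Host": "DB-02", "action": "login"},
]
CONFIG_CHANGES = [
    # Epoch seconds; 1518440100 is 2018-02-12T12:55:00Z.
    {"time": 1518440100, "hostname": "web-01.example.com", "change": "ssh_root_login=yes"},
]


def normalize_host(name):
    """Reduce a host name to a lowercase short name so silos agree on identity."""
    return name.lower().split(".")[0]


def normalize_user_activity(rec):
    """Map a user-activity record onto the common schema."""
    when = datetime.strptime(rec["ts"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {
        "time": when,
        "host": normalize_host(rec["Host"]),
        "source": "user_activity",
        "detail": f'{rec["user"]} {rec["action"]}',
    }


def normalize_config_change(rec):
    """Map a configuration-change record onto the common schema."""
    return {
        "time": datetime.fromtimestamp(rec["time"], tz=timezone.utc),
        "host": normalize_host(rec["hostname"]),
        "source": "config",
        "detail": rec["change"],
    }


def build_timeline(user_activity, config_changes):
    """Merge both silos into one time-ordered list of events per host."""
    events = [normalize_user_activity(r) for r in user_activity]
    events += [normalize_config_change(r) for r in config_changes]
    by_host = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        by_host[event["host"]].append(event)
    return dict(by_host)


if __name__ == "__main__":
    for host, events in build_timeline(USER_ACTIVITY, CONFIG_CHANGES).items():
        for e in events:
            print(host, e["time"].isoformat(), e["source"], e["detail"])
```

In this toy data, the login to web-01 looks routine when viewed in the user-activity silo alone. Only in the merged timeline does it appear five minutes after a risky configuration change on the same host, which is the kind of context an analytics layer cannot see across unconnected silos.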

Better data tiers exist, but they aren't bolt-ons to existing silos; they are replacements for them. While that may be hard to swallow, we need to adapt to the new reality, and a bolt-on approach won't get us there. Armed with better and cleaner data, an AI-based analytics regime is better able to reach sound conclusions, and those conclusions can feed directly into automated remediation, yielding a highly automated cyber-defense regime that is more appropriate for today's threat environment.

Think radically. Your attackers are, I assure you.

Dan Koloski is a software industry expert with broad experience as both a technologist working on the IT side and as a management executive on the vendor side. Dan is a Vice President in Oracle's Systems Management and Security products group, which produces the Oracle ...
Comments
AMoss,
User Rank: Guru
3/11/2018 | 2:14:13 PM
data quality & normalization
Absolutely agree. We should be spending as much time normalizing our operational/config/posture data as we are threat data.