
2/12/2018
01:00 PM
Dan Koloski
Commentary

Better Security Analytics? Clean Up the Data First!

Even the best analytics algorithms using incomplete and unclean data won't yield useful results.

Our industry is losing the cybersecurity war. Not a week goes by in which we don't hear about a new data breach. Overwhelmed security operations center (SOC) personnel, who were already in short supply, are leaving the profession out of sheer exhaustion. The rapid rate of change brought on by DevOps and cloud computing has completely overwhelmed our traditional, rules-based perimeter defense. Sophisticated hacking syndicates and nation-states are coming at us with machines, and we're responding with humans.

The industry's current response has been to offer practitioners a dizzying array of shiny, new artificial intelligence (AI)-enabled analytics regimes, each of which claims to have better algorithms than everybody else. Nowhere has this been more pronounced than in standalone user and behavior analytics regimes. But there has also been an undeniable rush to add fancy new analytical features to existing, siloed tools: security information and event management tools, intrusion-detection systems, threat feeds, network monitoring, cloud access security brokers, common vulnerabilities and exposures lists, configuration management databases, log tools, and more.

Here's the problem with that approach: even the best analytics algorithms operating against incomplete and unclean data aren't going to yield useful results.

Economists, behavioral scientists, mathematicians, and ethicists often refer to the concept of "imperfect information," in which parties involved in a decision-making process (be it a market, a game-theory scenario, or an ethical question) do not have equal access to all the information required to make a decision. The concept is important because it is both theoretically and empirically demonstrable that imperfect information leads to bad outcomes. For example, markets don't function as well, game theory doesn't accurately predict what will happen, or one party in a transaction takes advantage of another. The drive toward transparency in many areas of business and life is a direct reflection of the fact that imperfect information is undesirable. Even though truly perfect information may be unachievable, most transactional and behavioral scenarios certainly benefit from the availability of less-imperfect information (in other words, closer-to-perfect information).

The environment already presents a huge amount of data to the SOC. We have security events, user activity, intrusion detection, threat intelligence, network activity, cloud access, known exploits and vulnerabilities, configuration and IT activity metrics, security and operational logs, identity, and many other sources of data. Each of these sources tends to both emanate from and land in separate data silos. Traditionally, we expected our human SOC operators to be able to work across all of these silos, process all of this data, and turn it into actionable information. That didn't work. SOCs were overwhelmed, exhausted people naturally missed things, and we didn't have all the information we needed. That is how we landed pretty much where we are today.

Now we are expecting that bolted-on, AI-enabled regimes will solve all of our problems. It's true that machines don't get tired and can analyze more data at scale than humans. That's good. But machines can only analyze the data with which they are presented. That means if we apply AI to, say, our user activity data silo, but that data is separated from our configuration information silo, our topology-mapping silo, or our network monitoring silo (you get the idea…), we're back to the imperfect information problem. Fancy analytics against imperfect information still yields decisions you can't entirely trust. If you can't trust the decisions, how can you automate the remediation based on them?
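To make the imperfect-information point concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the two "silos," the field names (`user`, `host`, `patched`, `internet_facing`), and the toy scoring rule. It is not taken from any real product; it only shows how the same event can look different once a second silo is joined in.

```python
# Hypothetical data: two silos that a bolted-on analytic would see separately.
user_activity = [
    {"user": "alice", "host": "db01", "action": "bulk_export"},
    {"user": "bob", "host": "web02", "action": "bulk_export"},
]
host_config = {
    "db01": {"patched": False, "internet_facing": True},
    "web02": {"patched": True, "internet_facing": False},
}


def score_activity_only(event):
    """Toy risk score using only the user-activity silo."""
    return 1 if event["action"] == "bulk_export" else 0


def score_with_config(event, config):
    """Toy risk score after joining the event with configuration data."""
    score = score_activity_only(event)
    host = config.get(event["host"], {})
    # Missing configuration is treated as "no extra risk" in this toy rule.
    if not host.get("patched", True):
        score += 1
    if host.get("internet_facing", False):
        score += 1
    return score


siloed = [score_activity_only(e) for e in user_activity]
joined = [score_with_config(e, host_config) for e in user_activity]
print(siloed)  # the two events are indistinguishable
print(joined)  # the event on the unpatched, internet-facing host scores higher
```

With only the activity silo, both exports receive the same score; the difference appears only when configuration data is available to the same analysis.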

Time for a Fresh Approach
The hard truth is that we need to rethink our data tier. A data tier that perpetuates unconnected silos of data and expects an AI-enabled analytic regime to somehow normalize across them will yield the same "analysis paralysis" that faces human operators: too much uncertainty and too many gray areas to draw a definitive conclusion (and therefore to take action). The common phrase for this is "garbage in, garbage out." That truth is hard because most SOCs already have a substantial investment in those siloed data tiers, and there is natural inertia against replacing them.

A better data tier will allow the ingest and normalization of the full operational and security data set as a single data lake that can then be optimized for AI-enabled analysis. I say "operational and security data set" because the two are closely related. For example, user activity drives optimization for performance (operations) and hardening (security). Configuration information is critical to resolving performance issues (operations) and vulnerabilities (security). Derived topology and dependency mapping is as useful for troubleshooting performance problems as it is for data-loss prevention and attack detection.
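As a rough illustration of what "ingest and normalization" can mean in practice, the sketch below maps records from two differently shaped sources onto one common schema. The source names, field names, and the target schema (`time`, `user`, `ip`, `source`) are assumptions made up for this example, not a description of any specific data-lake product.

```python
from datetime import datetime, timezone

# Hypothetical raw records: same user and moment, different field names
# and timestamp formats.
auth_log = {"ts": "2018-02-12T13:00:00+00:00", "uname": "Alice", "src": "10.0.0.5"}
netflow = {"epoch": 1518440400, "user_id": "alice", "source_ip": "10.0.0.5"}


def normalize_auth(rec):
    """Map an auth-log record onto the common schema."""
    return {
        "time": datetime.fromisoformat(rec["ts"]).astimezone(timezone.utc),
        "user": rec["uname"].lower(),
        "ip": rec["src"],
        "source": "auth",
    }


def normalize_netflow(rec):
    """Map a network-flow record onto the common schema."""
    return {
        "time": datetime.fromtimestamp(rec["epoch"], tz=timezone.utc),
        "user": rec["user_id"].lower(),
        "ip": rec["source_ip"],
        "source": "netflow",
    }


# A single list stands in for the "data lake" in this sketch.
lake = [normalize_auth(auth_log), normalize_netflow(netflow)]

# Once normalized, records from different sources can be compared directly.
correlated = (
    lake[0]["time"] == lake[1]["time"]
    and lake[0]["user"] == lake[1]["user"]
    and lake[0]["ip"] == lake[1]["ip"]
)
print(correlated)
```

The design choice worth noting is that normalization happens once, at ingest, so any downstream analysis works against one schema rather than reconciling formats per query.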

Better data tiers exist, but they aren't bolt-ons to existing silos; they are replacements for them. While that may be hard to swallow, we need to adapt to the new reality, and a bolt-on approach won't get us there. Armed with better and cleaner data, an AI-based analytics regime is better able to reach conclusions we can trust, and those conclusions can feed directly into automated remediation, yielding a highly automated cyber-defense regime that is more appropriate for today's threat environment.

Think radically. Your attackers are, I assure you.


Dan Koloski is a software industry expert with broad experience as both a technologist working on the IT side and as a management executive on the vendor side. Dan is a Vice President in Oracle's Systems Management and Security products group, which produces the Oracle ...
Comments
AMoss,
User Rank: Guru
3/11/2018 | 2:14:13 PM
data quality & normalization
Absolutely agree.  We should be spending as much time normalizing our operational/config/posture data as we are threat data.  