Dark Reading is part of the Informa Tech Division of Informa PLC



6/18/2012
05:07 PM
Wendy Nather
Commentary

Logging Smarter, Not Just Harder

The problem is not just Big Data -- it's variable data. We attempt to find the answer in late-night commercials

In many circles, Big Data seems to mean "more data than our system can handle," in which case you might just have a lousy system. I've also seen it used to mean "data volumes that our product can handle and theirs can't." Whenever it's used in this way, it appears that size is the important factor here, so can we just call it "Moby Data" instead?

In any case, it presents a problem for security monitoring -- not just because of size, and not just because of variety, but because of variability. I was greatly interested in a blog post on Packet Pushers by the Socratically named Mrs. Y, on thin-slicing security data. She talks about the unknown unknowns, but it's not just about detecting those. She also points out that when piped through a complex decision-making process -- such as security monitoring -- massive amounts of varied data can result in information overload:

Maybe the application of Thin-slicing techniques applied to the right data could make a difference, because I think it’s obvious we can’t continue in this current direction.

How do we determine the "right" data? In security, we have multiple techniques for identifying, reducing, exploring, and detecting. The word "signature" has become such a dirty word that many who actually use it won’t admit to it. ("They’re not signatures! They’re rules!") But fundamentally speaking, we use different kinds of signatures when trying to classify events for the purposes of detecting and deciding. We’re either looking for "anything that is X," where X is defined, known badness (i.e., a blacklist), or "anything that is not Y," where Y is defined, known goodness (i.e., a whitelist).
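To make the two kinds of signature concrete, here is a minimal sketch in Python. The event fields, the IP addresses, and the process names are all hypothetical, chosen only for illustration; no real product's rule format is implied.

```python
# Hypothetical reference sets; the values are illustrative placeholders.
KNOWN_BAD_IPS = {"203.0.113.7", "198.51.100.23"}   # blacklist: "anything that is X"
KNOWN_GOOD_PROCESSES = {"sshd", "nginx", "cron"}   # whitelist: "anything that is not Y"


def matches_blacklist(event: dict) -> bool:
    """True when the event's source IP is defined, known badness."""
    return event.get("src_ip") in KNOWN_BAD_IPS


def violates_whitelist(event: dict) -> bool:
    """True when the event's process is not defined, known goodness."""
    return event.get("process") not in KNOWN_GOOD_PROCESSES


def classify(event: dict) -> str:
    """Apply the blacklist first, then the whitelist."""
    if matches_blacklist(event):
        return "alert: known bad"
    if violates_whitelist(event):
        return "alert: not known good"
    return "pass"
```

Either way, the check is a lookup against a fixed definition, which is why both count as signatures no matter what they are called.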

If you want to be less judgmental, you move to anomaly detection, which was first proposed for intrusion detection by Dorothy Denning in the '80s: You collect a whole bunch of data, categorized by type, such as user activities, network traffic, configuration states, and so on. Then you create a profile based on statistical analysis of each data category. Don’t fool yourself, though: It’s still a signature.*
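As a rough sketch of what such a profile can look like, the Python below builds a mean-and-standard-deviation profile for one data category and flags values that fall too far from it. This is only one simple statistical model, not a reproduction of Denning's; the sample data, the function names, and the threshold of three standard deviations are assumptions made for illustration.

```python
from statistics import mean, stdev


def build_profile(samples: list[float]) -> dict:
    """Summarize one data category (e.g., logins per hour) as mean and stdev."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def is_anomalous(value: float, profile: dict, threshold: float = 3.0) -> bool:
    """Flag values more than `threshold` standard deviations from the mean."""
    if profile["stdev"] == 0:
        # A perfectly constant history: any different value is an outlier.
        return value != profile["mean"]
    z_score = abs(value - profile["mean"]) / profile["stdev"]
    return z_score > threshold


# Hypothetical history of logins per hour for one user.
logins_per_hour = [4, 5, 6, 5, 4, 6, 5, 5]
profile = build_profile(logins_per_hour)
```

With this history, `is_anomalous(6, profile)` is false and `is_anomalous(40, profile)` is true. The profile is still a fixed description of "normal" that new events are compared against.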

Even after you’ve decided that you’ve collected enough data to perform decent statistical analysis, and have a system for detecting outliers (anomalies), you’ll still need to investigate them so that you can label them as either "new or additional goodness" (i.e., false positives) or "badness" (better call out the troops). That’s the challenge with all these types of detection: They assume there is a pattern so static that you can define it, and give it to something automated to monitor.

And real life isn’t always like that. Real attackers aren’t like that, either. Our systems and users change, and adversaries adapt, and it’s very hard to compensate for one while still catching the other.

Another option would be to classify data further, as more static or more dynamic -- as patterns or statistics that are expected not to change very much over time, such as an assigned IP address, and those that are expected to drift (user interaction patterns with an application that gets new features). The latter you’ll need to assess and tweak more often, as time goes by and the "normal" state of the data changes; it also helps to have reasonable heuristics in place that can work within a certain range of variation. Binary security decisions are what lead to a plague of false positives.
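One way to picture a heuristic that tolerates a range of variation is sketched below in Python: a baseline that drifts with each new observation, plus a graded verdict in place of a binary one. The class name, the smoothing factor, and both tolerance thresholds are hypothetical values picked for the example, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class DriftingBaseline:
    """A 'normal' value that is allowed to move as the data changes."""
    value: float
    alpha: float = 0.2  # assumed smoothing factor: weight given to new data

    def update(self, observation: float) -> None:
        # Exponential moving average: blend the old baseline with new data.
        self.value = (1 - self.alpha) * self.value + self.alpha * observation


def grade(observation: float, baseline: float,
          tolerance: float = 0.25, hard_limit: float = 1.0) -> str:
    """Return 'normal', 'review', or 'alert' from the relative deviation."""
    if baseline == 0:
        return "review"  # no meaningful ratio; let a person look
    deviation = abs(observation - baseline) / abs(baseline)
    if deviation <= tolerance:
        return "normal"
    if deviation <= hard_limit:
        return "review"
    return "alert"
```

Against a baseline of 100, an observation of 110 grades as "normal", 160 as "review", and 250 as "alert". The middle band is what keeps ordinary drift from turning into a false positive.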

Would we be better off with less data? I don’t know of anyone who wants to miss anything; security professionals tend to be data hoarders, and the events that looked innocuous last month suddenly become sinister when put together with new ones. Thin-slicing, or statistical sampling, may appear to make the volume problem more manageable, and it might work for static data profiles in a moby data store. But I think what we really need is tiered processing of security data, starting with the most static -- and therefore the most confident -- data decisions, and working with multiple analysis techniques until the most variable data floats to the top -- the kind that changes all the time, and always requires context and external information that a SIEM can’t have (it’s not a malicious DoS attack; your site got Huffposted).
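A minimal Python sketch of tiered processing follows, assuming just two automated tiers and a final hand-off to a human. The host table, the request-rate baseline, the event fields, and the verdict strings are all hypothetical; a real deployment would have more tiers and richer data.

```python
from typing import Callable, Optional

# Hypothetical reference data for the example.
ASSIGNED_IPS = {"db01": "10.0.0.5"}   # static: expected not to change
BASELINE_RPM = 200.0                  # dynamic: expected to drift over time

Tier = Callable[[dict], Optional[str]]


def static_tier(event: dict) -> Optional[str]:
    """Most static, most confident check: does the host use its assigned IP?"""
    expected = ASSIGNED_IPS.get(event["host"])
    if expected is not None and event["ip"] != expected:
        return "alert: unexpected IP for host"
    return None  # undecided; pass to the next tier


def statistical_tier(event: dict) -> Optional[str]:
    """Less static check: is traffic within twice the baseline?"""
    rpm = event.get("requests_per_min")
    if rpm is not None and rpm <= 2 * BASELINE_RPM:
        return "normal"
    return None


TIERS: list[Tier] = [static_tier, statistical_tier]


def triage(event: dict) -> str:
    """Run tiers in order; whatever no tier can decide floats to the top."""
    for tier in TIERS:
        verdict = tier(event)
        if verdict is not None:
            return verdict
    return "escalate: needs human context"
```

In this sketch a traffic spike of 5,000 requests per minute on a correctly addressed host is neither alerted nor cleared automatically. It is escalated, because only someone with outside context can tell an attack from a sudden burst of popularity.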

It’s not thin-slicing; it’s multislicing. Or slicing and dicing. It’s the Ginsu knife model of security monitoring.

* An activity profile characterizes the behavior of a given subject (or set of subjects) with respect to a given object (or set thereof), thereby serving as a signature or description of normal activity for its respective subject(s) and object(s). -- Denning

Wendy Nather is Research Director of the Enterprise Security Practice at the independent analyst firm 451 Research. You can find her on Twitter as @451wendy.

With over 30 years of IT experience, she has worked both in financial services and in the public sector, both in the US and in Europe.
