Dark Reading is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.



Connecting The Dots With Quality Analytics Data

Get creative about sourcing data, find ways to improve its quality, and then normalize it to mine its value

Security analytics practices are only as good as the data they base their analysis on. If data simply isn't mined, if it is of poor quality or accuracy, if it isn't in a useable format or if it isn't contextualized against complementary data or risk priorities, then the organization that holds it will be challenged to scratch value out of analytics.

"Your security analytics are only as accurate and useful as the data you put in," says Gidi Cohen, CEO of Skybox Security. "If the data has gaping holes, misses important network zones, or lacks input from security controls, then you will have gaping holes in your view and miss key dependencies between the myriad security tools and processes you use."

So what does a data-centric analysis process look like? It starts with recognizing that you have access to more relevant data than you think. Most organizations already hold all the information they need to understand their own environment for analytics purposes, says Kelly White, vice president and information security manager of a top 25 U.S. financial institution, who shared best practices on the condition that his employer not be named.


"If you just think about and internalize the amount of information your systems produce -- just by the fact that they're running on your network -- if you think about all of the security information that your users produce as they go about their daily work, it's not something that you have to go out and buy from somebody," White says. "You don't need to subscribe to a report. Really, everything you need to know yourself, you've got already."

Organizations that get creative with their sourcing of data tend to get more value out of analytics than those that simply lump security system log data together in a SIEM, or that treat threat intelligence from outside sources as interchangeable with security analytics.

Data sources that could help form more complete data sets include network footprint data, platform configuration information, log-in and identity management data, database server logs, and NetFlow data. White's organization even uses a Google appliance to index and search unstructured data stores such as SharePoint servers. The search surfaces relevant information, such as unstructured repositories of PII, and produces a map of data that would otherwise be a blind spot when assessing security risks.

Identifying potential internal sources of data is only the first step in ensuring that it can provide value to an analytics program. Organizations also must groom and prepare the data to make sure it is of reliable quality and in a useful format. This means doing a bit of quality assurance -- a sort of pre-security analytics, as Mike Lloyd, CTO of RedSeal Networks, calls it -- to make sure gaps are filled and sources are refined so their feeds are accurate enough to base operational assumptions on.

"If the data quality is bad, you have to do analysis on that first to decide what's wrong with the data, how bad a problem is it and what you can do about it to make it useable," Lloyd says, explaining that the more data sources you combine to get slightly different views of the same environment, the easier it is to do this. "When you combine data, you can criticize the data feed itself and not rush headlong into security analytics."
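One way to picture Lloyd's point is a coverage check across two feeds that describe the same network. The sketch below is illustrative only: the choice of feeds (a host list derived from NetFlow and a vulnerability scanner's asset list), the sample addresses, and the function name compare_feeds are all assumptions, not something the sources described.

```python
def compare_feeds(netflow_hosts, scanner_hosts):
    """Compare two views of the same network and report where they disagree."""
    netflow = set(netflow_hosts)
    scanner = set(scanner_hosts)
    all_hosts = netflow | scanner
    return {
        # Hosts seen talking on the network that the scanner never assessed.
        "unscanned": sorted(netflow - scanner),
        # Hosts the scanner knows about that produced no traffic records.
        "silent": sorted(scanner - netflow),
        # Share of all known hosts that appear in both feeds.
        "agreement": len(netflow & scanner) / len(all_hosts) if all_hosts else 1.0,
    }


# Hypothetical sample data for illustration.
report = compare_feeds(
    ["10.0.0.5", "10.0.0.7", "10.0.1.20"],
    ["10.0.0.5", "10.0.1.20", "10.0.2.9"],
)
print(report)
```

A low agreement figure says more about the feeds than about security posture, which is the kind of question worth settling before any risk analysis is run on top of them.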

And this kind of criticism of data feeds shouldn't just happen on the front end of the analytics process -- it should be an ongoing routine because, as Rajesh Goel, CTO of Brainlink International, points out, changes from infrastructure vendors could greatly affect data feeds.

"Vendor updates, patches and changes can change the meaning of the raw data generated and subsequent analytics. Some vendors communicate the changes clearly, others bury them in massive updates, and do NOT take into account that the events being generated have changed," he says. "It's important to confirm/validate that we're still getting the needed data and that the value of threats or events hasn't changed."
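The validation Goel describes could take the form of a routine comparison between a baseline window of events and the current window. This is a minimal sketch under stated assumptions: events are dictionaries with a "type" key, the event type names are invented, and the 50% drop threshold is an arbitrary placeholder rather than a recommended value.

```python
from collections import Counter


def feed_drift(baseline_events, current_events, drop_ratio=0.5):
    """Flag event types that vanished, appeared, or fell sharply in volume."""
    base = Counter(e["type"] for e in baseline_events)
    cur = Counter(e["type"] for e in current_events)
    return {
        # Event types the feed used to produce but no longer does.
        "missing": sorted(t for t in base if t not in cur),
        # Event types that did not exist before (possibly renamed by an update).
        "new": sorted(t for t in cur if t not in base),
        # Event types still present but below drop_ratio of their old volume.
        "dropped": sorted(
            t for t in base if t in cur and cur[t] < base[t] * drop_ratio
        ),
    }


# Hypothetical windows before and after a vendor update.
before = [{"type": "login_failed"}] * 4 + [{"type": "fw_deny"}] * 2 + [{"type": "av_alert"}]
after = [{"type": "login_failed"}] + [{"type": "fw_deny"}] * 2 + [{"type": "auth_failure"}]
drift = feed_drift(before, after)
print(drift)
```

A type that goes missing at the same moment a new one appears is a hint that a vendor renamed an event, which would quietly break any rule keyed to the old name.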

Even if the data itself is good, a particular piece of software or hardware may not emit it in a format that a security analytics team can use.

"The data required to perform accurate and thorough security big data analytics exists, however the challenge is in having to consume vast amounts of dissimilar and proprietary formats," says Jim Butterworth, CSO of HBGary.

This is why normalization may also play an important role in getting data ready for analytics prime time.

"In order for the data to be useful, it must be collected and normalized, so that all of the data is speaking the same language," says Cohen. "Once the data is normalized, your analytical tools can operate on that data in a common way, which reduces the amount of vendor-specific expertise needed."
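As a rough illustration of what "speaking the same language" can mean in practice, the sketch below maps records from two made-up vendors onto one common schema. Every field name here (ts, src, act, eventTime, sourceAddress, disposition) and both vendor formats are hypothetical; real products define their own.

```python
from datetime import datetime, timezone


def from_vendor_a(rec):
    """Hypothetical vendor A: epoch seconds and abbreviated keys."""
    return {
        "time": datetime.fromtimestamp(rec["ts"], tz=timezone.utc).isoformat(),
        "src_ip": rec["src"],
        "action": rec["act"].lower(),
    }


def from_vendor_b(rec):
    """Hypothetical vendor B: ISO 8601 timestamps and its own vocabulary."""
    ts = datetime.strptime(rec["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "time": ts.replace(tzinfo=timezone.utc).isoformat(),
        "src_ip": rec["sourceAddress"],
        # Translate vendor-specific wording into the shared action names.
        "action": {"BLOCKED": "deny", "PERMITTED": "allow"}[rec["disposition"]],
    }


a = from_vendor_a({"ts": 86400, "src": "10.0.0.5", "act": "DENY"})
b = from_vendor_b({
    "eventTime": "1970-01-02T00:00:00Z",
    "sourceAddress": "10.0.0.5",
    "disposition": "BLOCKED",
})
print(a == b)
```

Once both records share the same keys and values, a single query or rule can cover both devices without knowing which vendor produced which event.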

However, organizations shouldn't worship at the normalization altar to the point where it holds back nimble analysis.

"I would argue that you don't necessarily have to normalize everything. There's going to be a lot of unstructured data that doesn't necessarily have to be structured," says Michael Roytman, data scientist for Risk I/O. For example, he explains, an organization may take a piece of external data from a report like the DBIR that says its industry is 12% more likely to experience something like a SQL injection attack and add a 'fudge factor' that increases the weight of those vulnerabilities. "It's about looking at that data and figuring out a quick, easy and dirty way to apply that to your target asset."
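Roytman's fudge factor can be read as a simple multiplier applied to a vulnerability's existing score. The sketch below assumes the 12% figure translates directly into a factor of 1.12; the category labels, asset names, and base scores are invented for illustration, and the scoring scale is not tied to any particular standard.

```python
def adjusted_score(base_score, category, industry_factors):
    """Scale a base score by an industry factor; unknown categories stay unchanged."""
    return round(base_score * industry_factors.get(category, 1.0), 2)


# Assumption: "12% more likely" becomes a 1.12 multiplier.
factors = {"sql_injection": 1.12}

# Hypothetical findings: (asset, category, base score).
vulns = [
    ("web-01", "xss", 7.9),
    ("db-02", "sql_injection", 7.5),
]

# Rank findings by adjusted score, highest first.
ranked = sorted(
    ((asset, adjusted_score(score, cat, factors)) for asset, cat, score in vulns),
    key=lambda item: item[1],
    reverse=True,
)
print(ranked)
```

In this toy data the SQL injection finding starts below the other one and ends above it, which is the reprioritizing effect the quick-and-dirty weighting is meant to have.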

Ericka Chickowski specializes in coverage of information technology and business innovation. She has focused on information security for the better part of a decade and regularly writes about the security industry as a contributor to Dark Reading.

