Analytics | Commentary
8/29/2017 02:00 PM
Nik Whitfield
Security Analytics: Making the Leap from Data Lake to Meaningful Insight

Once you've got a lake full of data, it's essential that your analysis isn't left stranded on the shore.

Second of a two-part series.

Lots of technology and security teams, particularly in finance, are running data lake projects to advance their data analytics capabilities. Their goal is to extract meaningful, timely insights from data, so that security leaders, control managers, IT, and security operations can make effective decisions using information from their live environment.

During the four phases of a data lake project (build data lake; ingest data; do analysis; deliver insight), the hurdles to success are different. In the first two phases, it's easy for a data lake to become a data swamp. In the last two, it's easy to have a lake full of data that delivers a poor return on investment. Here are three steps to avoid that happening.

Step 1: Clean up messy analysis workflows from the get-go.
Security teams know the frustrations of ad hoc data analysis efforts only too well. For example, a risk question from an executive is given to an analyst, who collects data from whatever parts of the technology "frankenstack" he or she can get access to. This ends up in spreadsheets and visualization tools. Then, after days or weeks of battling with hard-to-link, unwieldy data sets, the results of best-effort analysis are sent off in a slide deck. If this doesn't answer the questions "So what?" and "What now?" the cycle repeats.

Automating ingestion from the many and varied security-relevant technologies in an enterprise environment into a data lake, and then simply replicating a process like the one above, means analysis efforts may start sooner. But unless the data is structured so it's easy to understand and interact with, delivering meaningful output gets no easier.

To avoid this, look at ways to optimize your data analysis workflow. Consider everything from how you ingest, store, and model data to how you scope questions with stakeholders and set their expectations about "speed to answer." Build a framework for iterating analysis, and be clinical about quickly proving or disproving the ROI a data set can contribute toward the insight a stakeholder has requested. Also think about how to build a knowledge base of which relationships in which data sets are valuable, and which aren't. It's important that analysts' experience isn't trapped in their heads, so teams can avoid repeating analysis that eats up time and money without delivering value.
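One lightweight way to keep that experience out of analysts' heads is a shared registry of which data-set joins have already been tried and whether they paid off. The sketch below is a minimal, hypothetical illustration (the class and data-set names are assumptions, not part of any real product): each finding records a pair of data sets, the key used to link them, and a verdict, so a teammate can check for a prior result before repeating a dead-end analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinFinding:
    """Records whether linking two data sets on a given key proved useful."""
    left: str
    right: str
    join_key: str
    valuable: bool
    notes: str = ""


class AnalysisKnowledgeBase:
    """Shared registry so dead-end joins aren't re-investigated."""

    def __init__(self):
        self._findings = []

    def record(self, finding: JoinFinding):
        self._findings.append(finding)

    def known_verdict(self, left, right, join_key):
        """Return a prior finding for this data-set pair and key, or None.
        Order of the pair doesn't matter."""
        for f in self._findings:
            if {f.left, f.right} == {left, right} and f.join_key == join_key:
                return f
        return None


kb = AnalysisKnowledgeBase()
kb.record(JoinFinding("vuln_scans", "cmdb", "hostname", True,
                      "Links findings to asset owners; normalize case first"))
kb.record(JoinFinding("proxy_logs", "hr_feed", "username", False,
                      "HR feed lags two weeks; joined records go stale"))

# A later analyst checks before redoing the work (pair order is irrelevant):
prior = kb.known_verdict("cmdb", "vuln_scans", "hostname")
```

Even a structure this simple forces the team to write down *why* a join was or wasn't valuable, which is the part that otherwise evaporates when an analyst moves on.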

Step 2: Buy time for the hard work that needs doing up front.
There's often a lot of complexity involved in correlating data sets from different technologies that secure an end-to-end business process. When data analytics teams first start working with data from the diverse security and IT solutions that are in place, they need to do a lot of learning to make sure the insight they present to decision makers is robust and has the necessary caveats.

This learning is on three levels: first, understanding the data; second, implementing the right analysis for the insight required; and third, finding the best way to communicate the insight to relevant stakeholders so they can make decisions.

The item with the biggest lead time is "understanding the data." Even if security teams interact regularly with a technology's user interface, getting to grips with the raw data it generates can be a hard task. Sometimes there's little documentation about how the data is structured, and it can be difficult to interpret and understand the relevance of information from just looking at the labels. As a result, data analytics teams usually need to spend a long time answering questions such as: What do the fields in each data source mean? How does technology configuration influence the data available? Across data sources, how does the format of information referring to the same thing differ? And what quirks can occur in the data that need to be accounted for?
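Those cross-source format differences are mundane but costly: the same machine might appear as an upper-case FQDN in one feed, a bare short name in another, and only an IP address in a third. The sketch below is a minimal illustration of the kind of normalization usually needed; the field names (`HostName`, `host`, `ip`, `IPAddress`) are hypothetical stand-ins for whatever your sources actually emit.

```python
import ipaddress


def normalize_host(record: dict) -> dict:
    """Normalize host-identity fields that differ across hypothetical
    sources: one emits 'HostName' as an upper-case FQDN, another a bare
    lower-case 'host', a third only an IP address."""
    raw = record.get("HostName") or record.get("host") or ""
    short = raw.strip().lower().split(".")[0]   # FQDN -> short host name
    ip = record.get("ip") or record.get("IPAddress")
    if ip:
        ip = str(ipaddress.ip_address(ip))      # validates and canonicalizes
    return {"host": short or None, "ip": ip}


# The same machine as two different sources might report it:
a = normalize_host({"HostName": "WEB01.CORP.EXAMPLE.COM"})
b = normalize_host({"host": "web01", "ip": "10.0.0.5"})
```

The point isn't this particular function; it's that every such quirk has to be discovered, documented, and encoded somewhere before any cross-source join can be trusted.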

It's critical to set expectations with budget holders and executives about how important this is and the time it will take. This isn't just because understanding and modeling data is a key enabler of delivering speed to insight in the long term. It's also because it's easy to create "data skeptics" when pressure for fast results leads to rushed analysis that delivers incorrect information.

Step 3: Plan to scale from the start for long-term success.
Providing data-driven risk and security insights for executives and operations teams usually means answering the following questions: What's our status? Is it good or bad? If it's bad, why? Do we act or gather more information? If we act, what are our best-cost actions?

Once data analytics teams start providing high-value insights that answer these questions — for example, "What are our best-cost options to achieve large reductions in risk due to vulnerability exposure?" — the next challenge is scaling the team's capability to answer more (and usually more difficult) questions.

At this point, the number of people on your data analytics team can become a bottleneck to servicing requests for insight. And when other departments like IT, audit, and compliance see the benefits from data-driven decisions, the volume of requests for insight can increase exponentially.

To avoid this, think about how to enable people who aren't data analytics experts to interact with data so they can "self-serve" insights. Who are your different stakeholders? What insights and metrics do they need to run their business process? How will they need to explore the dimensions of data relevant to answering their questions and diagnosing issues? Mapping a workflow of "these stakeholders need this insight from this data" helps answer these questions and will let you identify data sets that feed decisions for multiple stakeholders. In turn, this helps you focus efforts on understanding high-value data sets as early as possible.
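That stakeholder-to-insight-to-data mapping can start as something as plain as a table. The sketch below is a hypothetical example (the stakeholders, insights, and data-set names are invented for illustration): counting how many stakeholder insights depend on each data set gives a rough priority order for which data sets to understand and model first.

```python
from collections import Counter

# Hypothetical mapping: stakeholder -> insight needed -> data sets required.
NEEDS = {
    "CISO":       {"insight": "risk posture trend",     "data": {"vuln_scans", "cmdb"}},
    "IT ops":     {"insight": "patch backlog by owner", "data": {"vuln_scans", "cmdb", "ticketing"}},
    "Compliance": {"insight": "control coverage gaps",  "data": {"cmdb", "edr_agents"}},
}


def high_value_data_sets(needs: dict) -> list:
    """Rank data sets by how many stakeholder insights depend on them,
    so the team knows which ones to invest in understanding first."""
    counts = Counter(ds for need in needs.values() for ds in need["data"])
    return [ds for ds, _ in counts.most_common()]


ranked = high_value_data_sets(NEEDS)
# In this toy mapping, the asset inventory (cmdb) serves all three
# stakeholders, so it ranks first.
```

A data set that serves one stakeholder may still matter, but one that underpins three different decision workflows is where early modeling effort compounds fastest.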


Nik Whitfield is a noted computer scientist and cybersecurity technology entrepreneur. In 2014 he founded Panaseer, a cybersecurity software company that gives businesses visibility and insight into their cybersecurity weaknesses.