Partner Perspectives  Connecting marketers to our tech communities.
09:20 AM
Celeste Fralick

Knowledge Gap Series: The Myths Of Analytics

It may not be rocket science, but it is data science.

Do you have your eye on machine learning or a nice neural network to help your security team make decisions faster? Be aware that there are quite a few myths circulating about how these work; even the language used can be confusing. Many new terms -- and some familiar words -- have different meanings in the world of statistical analytics. For example, “variable” means something significantly different to a programmer than to a statistician. And the capabilities of a statistician are different from those of a data scientist.

Let’s start with building an analytical model. This does not happen quickly, because you need to capture enough data from your environment to give you a representative distribution. Roughly put, the distribution is the shape of the data (much like the classic bell curve from college), including the upper and lower limits, symmetry, presence of outliers, and other characteristics. There are dozens of statistical distributions, and the choice is critical because they form the foundation of the behavioral model. Another issue is cleaning the data prior to exploring potential models. How do you want to deal with outliers? What weights will you assign to the various components? Which ones are fully or partially dependent?
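As a rough illustration of what "the shape of the data" means in practice, the sketch below summarizes a small sample by its limits, center, skewness, and outliers. The sample values, the name describe_shape, and the 1.5 x IQR outlier rule are assumptions chosen for illustration; they are one common convention, not a recommendation from this article.

```python
import statistics


def describe_shape(values):
    """Summarize limits, center, skewness, and IQR-based outliers of a sample."""
    data = sorted(values)
    mean = statistics.fmean(data)
    stdev = statistics.pstdev(data)

    # Quartiles and the conventional 1.5 x IQR fences (an assumed rule of thumb).
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr

    # Population skewness: 0 for a symmetric sample, positive for a long right tail.
    if stdev > 0:
        skewness = sum((x - mean) ** 3 for x in data) / len(data) / stdev ** 3
    else:
        skewness = 0.0

    return {
        "min": data[0],
        "max": data[-1],
        "mean": mean,
        "median": median,
        "skewness": skewness,
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


# Hypothetical measurements, e.g. megabytes uploaded per user session.
uploads_mb = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 500]
shape = describe_shape(uploads_mb)
print(shape)
```

A single extreme value pulls the mean far from the median and produces a strongly positive skewness, which is exactly the kind of decision point the questions above raise: is that value an error to be cleaned, or the event you are trying to detect?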

Some machine-learning technologies will gather and analyze the data to try to determine an appropriate distribution for you, but you still need to be able to understand the decision. For example, many data sets do not fit the symmetry of a bell curve (formally called a normal distribution), and the distribution that fits probably has an unfamiliar name. Some of these tools only work with certain types of data sets, and all of them have underlying assumptions that you need to understand. You also need to understand some of the math, at least at a cursory level. Different tools may use different equations for a similar application, such as the correlation coefficients that show the degree of dependence between two sets of data, and the choice of equation matters most when the relationships are nonlinear.
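To make the point about correlation coefficients concrete, here is a minimal sketch comparing two widely used ones on the same data: Pearson's coefficient, which measures linear association, and Spearman's, which is Pearson's coefficient applied to ranks and so captures any monotonic relationship. The data and the function names are hypothetical, and the article does not say which coefficients any particular tool uses.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: strength of the linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: y always rises with x, but exponentially rather than linearly.
x = list(range(1, 11))
y = [2.0 ** v for v in x]

linear_r = pearson(x, y)
rank_r = spearman(x, y)
print(f"Pearson:  {linear_r:.3f}")
print(f"Spearman: {rank_r:.3f}")
```

On this data the rank-based coefficient reports a perfect relationship while the linear one reports a weaker one, so two tools could describe the same dependence differently without either being wrong.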

Say you have been through this exercise, some statisticians and data scientists have advised you, and you now have an analytical model for identifying data exfiltration, phishing attacks, or some other security event. What is the appropriate level of confidence in the results? No model is always right, and you need to know how well the model fits, what your statistical level of confidence is, and what to look for when an automated decision gets punted for human judgment. These models are ultimately built by humans, so you also need to make sure that you have an appropriate level of trust in the quality and ethics of your modeler.
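One way to put a number on "level of confidence" is to report a detection rate together with a confidence interval rather than as a bare percentage. The sketch below uses the Wilson score interval for a proportion; the validation counts are invented for illustration, and the choice of the Wilson interval and a 95% level (z = 1.96) is an assumption, not a method the article specifies.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials ** 2)) / denom
    return center - half_width, center + half_width


# Hypothetical validation results for a phishing detector.
true_positives = 90    # phishing messages the model flagged
false_negatives = 10   # phishing messages the model missed
false_positives = 30   # legitimate messages the model flagged

# Recall: share of real attacks caught. Precision: share of alerts that were real.
recall = true_positives / (true_positives + false_negatives)
precision = true_positives / (true_positives + false_positives)

recall_low, recall_high = wilson_interval(
    true_positives, true_positives + false_negatives
)

print(f"Recall:    {recall:.2f} (95% interval {recall_low:.3f} to {recall_high:.3f})")
print(f"Precision: {precision:.2f}")
```

With only 100 known attacks in the test set, the interval around the measured detection rate is several percentage points wide in each direction, which is worth knowing before quoting that rate to anyone outside the team.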

Statistics, analytics, and machine learning are powerful tools that will help resolve security problems faster, with fewer resources. They will empower the next wave of automated and even predictive defenses. However, this will take time, and we have to work our way up from reactive models, through proactive ones, before we get to predictive.

This journey is going to require some learning on your part, whether it is a review of your college stats classes or building an understanding of the terms and concepts, so that you can communicate clearly and effectively with the statisticians and data scientists who will be joining your team. You need to ensure that your data scientists have a strong working knowledge of statistics, as the title "data scientist" is loosely defined and may be overused. Finally, you will need to be able to translate these concepts and plans to members of the C-suite, who may be skeptical about the uses and abuses of statistics.

My intent is not to scare you off with the amount of work involved. When properly implemented, the security benefits of big data analytics are substantial.

The Intel Security Knowledge Gap series brings forward unique educational content to bridge the gap between what cybersecurity professionals know and what they need to know to be successful against the threat landscape of today and tomorrow.

Dr. Fralick is responsible for Intel Security's technical strategy related to analytics that integrates into the Intel Security corporate products. Dr. Fralick brings over 35 years of industry experience to the Analytics CTO position, 20 of those with Intel. ...