Guy Caspi

Introducing Deep Learning: Boosting Cybersecurity With An Artificial Brain

With nearly the same speed and precision that the human eye can identify a water bottle, the technology of deep learning is enabling the detection of malicious activity at the point of entry in real-time.

Editor’s Note: Last month, Dark Reading editors named Deep Instinct the most innovative startup in its first annual Best of Black Hat Innovation Awards program at Black Hat 2016 in Las Vegas. For more details on the competition and other results, read Best Of Black Hat Innovation Awards: And The Winners Are

It’s hot outside and you’re thirsty. As you reach for a water bottle, you don’t pause to analyze its material, size or shape in order to determine whether it’s a water bottle. Instead, you immediately reach for it, with complete confidence in its identification.

If I show the same water bottle to any traditional computer vision module, it will easily recognize it. If I partially obstruct the image with my fingers, then traditional computer vision modules will have difficulty recognizing it. But, if I apply an advanced form of artificial intelligence that is called deep learning, which is resistant to small changes and can generalize from partial data, it would be very easy for the computer vision module to correctly recognize the water bottle, even when most of the image is obstructed.

Deep learning, which is built on multi-layer (“deep”) neural networks, is “inspired” by the brain’s ability to learn to identify objects. Take vision as an example. Our brain can process raw data derived from our sensory inputs and learn the high-level features all on its own. Similarly, in deep learning, raw data is fed through the deep neural network, which learns to identify the object on which it is trained. Traditional machine learning, on the other hand, requires manual intervention in selecting which features to process through the machine learning modules. As a result, the process is slower and accuracy can be affected by human error. Deep learning's more sophisticated, self-learning capability results in higher accuracy and faster processing.

As with image recognition, in cybersecurity more than 99% of new threats and malware are actually very small mutations of previously existing ones. And even that 1% of supposedly brand-new malware consists of rather substantial mutations of existing malicious threats and concepts. But, despite this fact, cybersecurity solutions -- even the most advanced ones that use dynamic analysis and traditional machine learning -- have great difficulty in detecting a large portion of this new malware. The result is vulnerabilities that leave organizations exposed to data breaches, data theft, seizure for ransom, data corruption, and destruction. We can solve this problem by applying deep learning to cybersecurity.

The history of malware detection in a nutshell
Signature-based solutions are the oldest form of malware detection, which is why they are also called legacy solutions. To detect malware, the antivirus engine compares the contents of an unidentified piece of code to its database of known malware signatures. If the malware hasn’t been seen before, these methods rely on manually tuned heuristics to generate a handcrafted signature, which is then released as an update to clients. This process is time-consuming, and sometimes signatures are released months after the initial detection. As a result, this detection method can’t keep up with the million new malware variants that are created daily. This leaves organizations vulnerable to the new threats as well as threats that have already been detected but have yet to have a signature released.
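
To make the limitation concrete, here is a minimal sketch of exact-match signature lookup. It assumes, for illustration only, that a "signature" is a SHA-256 digest of the whole file; real engines also use byte patterns and heuristics, and the sample bytes and the names KNOWN_MALWARE, SIGNATURES and is_known_malware are hypothetical.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of previously seen malware.
KNOWN_MALWARE = [b"\x4d\x5a\x90\x00 toy malicious payload v1"]
SIGNATURES = {hashlib.sha256(sample).hexdigest() for sample in KNOWN_MALWARE}


def is_known_malware(content: bytes) -> bool:
    """Return True only if the exact content matches a stored signature."""
    return hashlib.sha256(content).hexdigest() in SIGNATURES


original = KNOWN_MALWARE[0]
mutated = original.replace(b"v1", b"v2")  # a one-byte mutation of the sample

print(is_known_malware(original))  # True
print(is_known_malware(mutated))   # False: no signature exists for the variant yet
```

A single changed byte is enough to miss the lookup, which is why every new variant needs its own signature update.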

Heuristic techniques identify malware based on the behavioral characteristics in the code, which has led to behavioral-based solutions. This malware detection technique analyzes the malware’s behavior at runtime, instead of considering the characteristics hardcoded in the malware code itself. The main limitation of this malware detection method is that it is able to discover malware only once the malicious actions have begun. As a result, prevention is delayed, sometimes available only once it’s too late.
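
The delay described above can be sketched in a few lines. This is a toy model, not any vendor's engine: the event names, the weights in SUSPICION and the THRESHOLD value are all assumptions chosen for illustration.

```python
# Hypothetical weights for suspicious runtime events (illustrative only).
SUSPICION = {
    "read_file": 0,
    "disable_backups": 3,
    "encrypt_file": 2,
    "contact_unknown_host": 2,
}
THRESHOLD = 5  # assumed score at which the monitor raises an alert


def monitor(events):
    """Return (index of the alerting event, events executed so far).

    If the threshold is never reached, the index is None.
    """
    score = 0
    executed = []
    for index, event in enumerate(events):
        executed.append(event)  # the action has already happened when we see it
        score += SUSPICION.get(event, 0)
        if score >= THRESHOLD:
            return index, executed
    return None, executed


trace = ["read_file", "disable_backups", "encrypt_file", "encrypt_file"]
alert_at, done = monitor(trace)
print(alert_at)  # 2
print(done)      # ['read_file', 'disable_backups', 'encrypt_file']
```

By the time the alert fires, backups have been disabled and one file has been encrypted: detection works, but only after the malicious actions have begun.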

Sandbox solutions are a development of the behavioral-based detection method. These solutions execute the malware in a virtual (sandbox) environment to determine whether the file is malicious or not, instead of detecting the behavioral fingerprint at runtime. Although this technique has been shown to be quite effective in its detection accuracy, that accuracy comes at the cost of real-time protection because of the time-consuming process involved. Additionally, newer types of malicious code that can evade sandbox detection by stalling their execution in a sandbox environment are posing new challenges to this type of malware detection and, consequently, to prevention capabilities.
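
The stalling trick can be illustrated with a simulated sandbox. This sketch uses a virtual clock rather than a real virtual machine, and the function run_in_sandbox, the action names and the 60-second budget are assumptions made for the example.

```python
def run_in_sandbox(sample, time_budget):
    """Observe a sample's actions for up to `time_budget` simulated seconds."""
    clock = 0
    observed = []
    for delay, action in sample:
        clock += delay
        if clock > time_budget:
            break  # the analysis window is over; later actions go unseen
        observed.append(action)
    return "malicious" if "encrypt_files" in observed else "benign"


# Each toy sample is a list of (seconds to wait, action) pairs.
eager = [(1, "read_config"), (2, "encrypt_files")]
stalling = [(1, "read_config"), (600, "encrypt_files")]

print(run_in_sandbox(eager, time_budget=60))     # malicious
print(run_in_sandbox(stalling, time_budget=60))  # benign
```

Lengthening the budget would catch the second sample, but only by making the verdict even slower, which is the trade-off against real-time protection.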

Malware detection using AI: machine learning & deep learning
Incorporating AI to enable more sophisticated detection is the latest step in the evolution of cybersecurity solutions. Malware detection methods based on traditional machine learning apply elaborate algorithms to classify a file’s behavior as malicious or legitimate, according to features that are engineered manually. However, this process is time-consuming and requires massive human resources to tell the technology which parameters, variables or features to focus on during file classification. Additionally, the rate of malware detection is still far from 100%.
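
As a rough sketch of what manual feature engineering looks like, the toy below has a human pick two features (byte entropy and a count of suspicious strings) and then trains a nearest-centroid classifier on them. The token list, the sample bytes and the choice of classifier are all assumptions for illustration, not a description of any real product.

```python
import math
from collections import Counter

# Hypothetical, analyst-chosen strings: this list is the "manual" part.
SUSPICIOUS_TOKENS = [b"CreateRemoteThread", b"VirtualAlloc", b"cmd.exe"]


def byte_entropy(data):
    """Shannon entropy of the byte distribution, in bits per byte."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def extract_features(data):
    """Hand-picked features; the model never sees anything else."""
    token_hits = sum(data.count(token) for token in SUSPICIOUS_TOKENS)
    return [byte_entropy(data), float(token_hits)]


def train(samples):
    """samples: list of (bytes, label). Returns one feature centroid per label."""
    by_label = {}
    for data, label in samples:
        by_label.setdefault(label, []).append(extract_features(data))
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in by_label.items()
    }


def classify(model, data):
    """Assign the label whose centroid is closest in feature space."""
    features = extract_features(data)
    return min(model, key=lambda label: math.dist(features, model[label]))


training_set = [
    (b"hello world, this is a plain text document", "legitimate"),
    (b"config: name=value; mode=safe", "legitimate"),
    (b"VirtualAlloc CreateRemoteThread cmd.exe /c payload", "malicious"),
    (b"cmd.exe VirtualAlloc inject", "malicious"),
]
model = train(training_set)

print(classify(model, b"VirtualAlloc + cmd.exe dropper variant"))  # malicious
print(classify(model, b"meeting notes for tuesday"))               # legitimate
```

The classifier is only as good as the features a person thought to include: a variant that avoids the listed strings looks legitimate until someone updates SUSPICIOUS_TOKENS.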

Deep learning is an advanced branch of machine learning, also known as “neural networks” because it is “inspired” by the way the human brain works. In our neocortex, the outer layer of our brain where high-level cognitive tasks are performed, we have several tens of billions of neurons. These neurons, which are largely general purpose and domain-agnostic, can learn from any type of data. This is the great revolution of deep learning: deep neural networks are the first family of algorithms within machine learning that do not require manual feature engineering. Instead, they learn on their own to identify the object on which they are trained by processing and learning the high-level features from raw data -- very much like the way our brain learns on its own from raw data derived from our sensory inputs.
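
For contrast with hand-picked features, the sketch below feeds raw bytes straight into a tiny neural network with one hidden layer and trains it by gradient descent. Everything here is a toy assumption: the 8-byte synthetic "files", the rule that malicious samples skew toward high byte values, the network size and the learning rate. It is far too small to count as deep, and it only shows that no features are specified by hand.

```python
import math
import random

random.seed(0)  # fixed seed so the toy run is repeatable
N_IN, N_HIDDEN = 8, 4


def make_file(malicious):
    """Synthetic 8-byte 'file'; malicious samples skew toward high byte values."""
    low, high = (128, 255) if malicious else (0, 127)
    return bytes(random.randint(low, high) for _ in range(N_IN))


data = [(make_file(label), label) for label in [0, 1] * 20]

# Randomly initialised weights: nothing here encodes what to look for.
w1 = [[random.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
b1 = [0.0] * N_HIDDEN
w2 = [random.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
b2 = [0.0]  # kept in a list so train() can update it in place


def forward(raw):
    """Return (scaled input, hidden activations, probability of 'malicious')."""
    x = [byte / 255 for byte in raw]  # raw bytes, only rescaled to [0, 1]
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    logit = sum(w * hi for w, hi in zip(w2, h)) + b2[0]
    return x, h, 1 / (1 + math.exp(-logit))


def mean_loss():
    """Average cross-entropy over the toy data set."""
    total = 0.0
    for raw, y in data:
        p = min(max(forward(raw)[2], 1e-12), 1 - 1e-12)  # guard against log(0)
        total -= math.log(p if y else 1 - p)
    return total / len(data)


def train(epochs=200, lr=0.1):
    """Plain stochastic gradient descent with backpropagation."""
    for _ in range(epochs):
        for raw, y in data:
            x, h, p = forward(raw)
            d_out = p - y  # gradient of the loss w.r.t. the output logit
            for j in range(N_HIDDEN):
                d_hidden = d_out * w2[j] * (1 - h[j] ** 2)  # uses w2 before update
                w2[j] -= lr * d_out * h[j]
                b1[j] -= lr * d_hidden
                for i in range(N_IN):
                    w1[j][i] -= lr * d_hidden * x[i]
            b2[0] -= lr * d_out


loss_before = mean_loss()
train()
loss_after = mean_loss()
print(loss_before, loss_after)  # the loss should fall as the network fits the data
```

The only input is the byte values themselves; which combinations of bytes matter is something the weights work out during training rather than something an analyst lists in advance.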

When applied to cybersecurity, the deep learning core engine is trained to learn, without any human intervention, whether a file is malicious or legitimate. Deep learning exhibits potentially groundbreaking results in detecting first-seen malware, compared with classical machine learning. In real-environment tests on publicly known databases of endpoint, mobile and APT malware, for example, a deep learning solution detected over 99.9% of both substantially and slightly modified malicious code. These results are consistent with improvements achieved by deep learning in other fields, such as computer vision, speech recognition and text understanding.

In the same way humans can immediately identify a water bottle in the real world, the technological advances of deep learning -- applied to cybersecurity -- can enable the precise detection of new malware threats and fill in the critical gaps that leave organizations exposed to attacks.

Guy Caspi is a leading mathematician and a data scientist global expert. He has 15 years of extensive experience in applying mathematics and machine learning in a technology elite unit of the Israel Defense Forces (IDF), financial institutions and intelligence organizations ...