Dark Reading is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Operational Security

12/11/2017
08:35 AM
Bogdan Botezatu
News Analysis-Security Now

Machine Learning for Ransomware Defense

Ransomware keeps getting more dangerous but defense is improving, too. Machine learning might be the key to actually keeping up with the level of attacks.

In the past several years, ransomware has inflicted financial losses estimated at billions of dollars -- and that's only what victims have reported to law enforcement. Described by security researchers as one of the most prolific and financially lucrative malware categories, ransomware has been ported to all major operating systems (OSs), including mobile ones.

While ransomware was originally developed to target computers running Windows, as of 2016 Apple's macOS and Linux have seen their own ransomware strains. In fact, the ransomware business has been so profitable that cybercriminals have even turned it into an as-a-service offering. Ransomware-as-a-service lets even less tech-savvy users rent ransomware and start infecting victims for their own financial gain.

Relying on techniques such as encryption, obfuscation and polymorphism, ransomware has caused serious concerns for the security industry, as traditional detection mechanisms are ill-equipped to detect each victim-specific ransomware sample. Consequently, machine learning algorithms designed to automatically and correctly tag ransomware samples based on their behavior or their similarity to known ransomware have become a necessity.

While machine learning plays a significant part in detecting ransomware, no single machine learning algorithm can spot all ransomware. That would require an ensemble of specifically trained machine learning algorithms working together, each designed to identify a specific ransomware family, a ransomware-disseminating website, or a packer (a tool that applies a technique commonly referred to as "executable compression" to compress the ransomware payload and make it difficult for security tools to analyze).
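The ensemble idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three detector functions are hand-written stand-ins for trained models, the sample fields (`behaviors`, `source_url_flagged`, `entropy`) are hypothetical, and "flag if any specialist fires" is only one of several possible ways to combine scores.

```python
from typing import Any, Callable, Dict, Tuple

Sample = Dict[str, Any]               # hypothetical record describing one file
Detector = Callable[[Sample], float]  # each specialist returns a score in [0, 1]


def family_detector(sample: Sample) -> float:
    """Toy stand-in for a model trained on one ransomware family."""
    known = {"deletes_shadow_copies", "mass_file_rename", "drops_ransom_note"}
    behaviors = set(sample.get("behaviors", []))
    return len(behaviors & known) / len(known)


def url_detector(sample: Sample) -> float:
    """Toy stand-in for a model that rates the site the file came from."""
    return 1.0 if sample.get("source_url_flagged") else 0.0


def packer_detector(sample: Sample) -> float:
    """Toy stand-in for a model that recognizes packed executables."""
    # Assumption: packed payloads tend to have byte entropy near 8 bits,
    # so entropy between 6 and 8 is mapped linearly onto [0, 1].
    entropy = float(sample.get("entropy", 0.0))
    return min(max((entropy - 6.0) / 2.0, 0.0), 1.0)


def ensemble_verdict(
    sample: Sample,
    detectors: Dict[str, Detector],
    threshold: float = 0.5,
) -> Tuple[bool, Dict[str, float]]:
    """Flag the sample if any specialist scores at or above the threshold."""
    scores = {name: detect(sample) for name, detect in detectors.items()}
    return max(scores.values()) >= threshold, scores


DETECTORS: Dict[str, Detector] = {
    "family": family_detector,
    "url": url_detector,
    "packer": packer_detector,
}

suspicious = {"behaviors": ["mass_file_rename", "drops_ransom_note"], "entropy": 5.0}
clean = {"behaviors": [], "entropy": 4.5}

flagged_suspicious, _ = ensemble_verdict(suspicious, DETECTORS)
flagged_clean, _ = ensemble_verdict(clean, DETECTORS)
```

Returning the per-detector scores alongside the verdict makes it possible to see which specialist raised the alarm, which matters when tuning each model separately.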

Ransomware doesn't discriminate
Ransomware indiscriminately selects targets, with the sole aim of locking access to critical files and demanding payment to restore access. However, it has recently undergone some transformations that have allowed it to infect victims even without any user interaction.

WannaCry was the first ransomware outbreak that leveraged a Windows vulnerability to spread automatically across networks and infect victims without any interaction on their part. Simply having a vulnerable PC with an Internet connection was enough to get infected, which is why hundreds of thousands of computers were rendered inoperative during its relatively short outbreak. GoldenEye was yet another example of a ransomware pandemic that affected several European countries, including Poland, Germany, Italy, Spain and France.

While in both cases the attackers made little money from ransoms, these incidents proved highly disruptive and demonstrated just how easily unpatched vulnerabilities can be exploited to deliver any type of threat, even one as pervasive as ransomware.

The Internet of Things is not immune either, but the business model is fundamentally different from that of the PC. Instead of locking you out of your files, for example, ransomware designed for IoT devices and smart homes could lock you out of your home. Even medical devices -- including implantable ones -- could be exploited and used to extort victims. For example, the Internet connection of Dick Cheney's pacemaker was allegedly disabled for fear that terrorists -- or even ransomware -- could threaten his life.

Machine learning steps in
Where traditional file-based detection technologies fail, machine learning algorithms can succeed. Neural networks and deep learning algorithms can detect unknown ransomware samples if they're properly trained and adjusted to produce a low number of false positives. Augmenting cloud-based detections with machine learning and genetic algorithms is also effective in combating the rampant growth of ransomware caused by its polymorphic behavior.

In a nutshell, it all starts with a large dataset of ransomware files and an even larger set of clean files. The algorithm is tasked with finding characteristics of each file in the training set and normalizing each one into a number that is usually called a "feature." As one characteristic may create more than one feature, only a subset of those features will be used to train a model for the sample set.
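As a rough illustration of turning file characteristics into normalized features, here is a minimal Python sketch. The three features chosen (byte entropy, share of printable bytes, scaled file size) and the helper names are assumptions made for this example, not the features any particular vendor uses. Note how a single characteristic, the byte content, yields more than one feature, and how only a chosen subset is passed on for training.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def extract_features(data: bytes) -> Dict[str, float]:
    """Map raw file bytes to features normalized into the range [0, 1]."""
    printable = sum(1 for b in data if 32 <= b < 127)
    return {
        # The byte content alone yields two features.
        "entropy": byte_entropy(data) / 8.0,
        "printable_ratio": printable / len(data) if data else 0.0,
        # Assumption: sizes are log-scaled and capped at 2**32 bytes.
        "log_size": min(math.log2(len(data) + 1) / 32.0, 1.0),
    }


def select_features(features: Dict[str, float], names: Sequence[str]) -> List[float]:
    """Keep only the chosen subset, in a fixed order, as a training vector."""
    return [features[name] for name in names]


uniform = bytes(range(256))  # every byte value once: maximum entropy
text = b"hello world"        # plain ASCII text: low entropy, fully printable

uniform_vector = select_features(extract_features(uniform), ["entropy", "printable_ratio"])
text_vector = select_features(extract_features(text), ["entropy", "printable_ratio"])
```

In practice the subset would be picked by a feature-selection step over the whole training set rather than by hand.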

When using neural networks to create models used for ransomware identification, all samples are usually mapped into a space described by tens of thousands of features. Instead of a three-dimensional space whose axes are three features necessary for a file to be considered ransomware, imagine an n-dimensional space with more than 40,000 features. That might sound extremely complicated, but the end result is actually a mathematical equation -- also known as a model -- that acts as a condition that, once satisfied, will tag a file as ransomware.
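To make "a mathematical equation that acts as a condition" concrete, the following Python sketch uses a logistic model as a deliberately simple stand-in for a neural network: a weighted sum of the features is squashed into a probability, and the file is tagged once that probability crosses a threshold. The three weights, the bias and the threshold are invented for illustration; a real model would learn thousands of such values from the training set.

```python
import math
from typing import Sequence


def ransomware_probability(
    features: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Evaluate the model equation: sigmoid(w . x + b)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def is_ransomware(
    features: Sequence[float],
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,
) -> bool:
    """The condition: tag the file once the probability reaches the threshold."""
    return ransomware_probability(features, weights, bias) >= threshold


# Hypothetical learned parameters for three normalized features.
WEIGHTS = [4.0, -3.0, 0.5]
BIAS = -1.0

suspicious_file = [1.0, 0.1, 0.5]   # weighted sum z = 2.95
ordinary_file = [0.5, 0.95, 0.3]    # weighted sum z = -1.70

tag_suspicious = is_ransomware(suspicious_file, WEIGHTS, BIAS)
tag_ordinary = is_ransomware(ordinary_file, WEIGHTS, BIAS)
```

The same shape scales to n features: only the length of the weight vector changes, not the form of the condition.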

A major benefit of using machine learning models to spot ransomware is that they broaden the range of ransomware files that can be detected -- if enough ransomware features are present in a previously unseen sample, the file is likely ransomware.

The second benefit is that machine learning models are extremely small, usually around 1 kilobyte, which makes them easy to deploy across the entire user base. The only downside of using machine learning models to detect ransomware is that they have to be extensively tested before deployment to avoid incorrectly tagging clean files as malicious.
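One way to picture the pre-deployment testing mentioned above is to score a held-out set of known-clean files and pick a detection threshold that keeps false positives under a target rate. The Python sketch below does this under stated assumptions: the scores, the candidate thresholds and the 10% target are made up, and a real validation set would hold far more files than the ten shown here.

```python
from typing import Iterable, Optional, Sequence


def false_positive_rate(clean_scores: Sequence[float], threshold: float) -> float:
    """Share of known-clean files the model would wrongly tag as ransomware."""
    if not clean_scores:
        raise ValueError("need at least one clean sample")
    wrongly_flagged = sum(1 for score in clean_scores if score >= threshold)
    return wrongly_flagged / len(clean_scores)


def pick_threshold(
    clean_scores: Sequence[float],
    candidates: Iterable[float],
    max_fpr: float,
) -> Optional[float]:
    """Return the lowest candidate threshold whose false positive rate is acceptable.

    The lowest acceptable threshold is preferred because it keeps detection
    as sensitive as the false positive budget allows.
    """
    for threshold in sorted(candidates):
        if false_positive_rate(clean_scores, threshold) <= max_fpr:
            return threshold
    return None  # no candidate meets the target; the model needs more work


# Hypothetical model scores for ten clean validation files.
CLEAN_SCORES = [0.05, 0.1, 0.2, 0.3, 0.35, 0.4, 0.55, 0.6, 0.7, 0.92]

chosen = pick_threshold(CLEAN_SCORES, [0.5, 0.75, 0.95], max_fpr=0.1)
```

A complete evaluation would also measure how many real ransomware samples are still caught at the chosen threshold.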

Some machine learning algorithms can even identify suspicious URLs that are either used to disseminate ransomware or act as command and control servers. Using Natural Language Processing (NLP) algorithms and various clustering methods to parse texts, they can potentially block new or never-before-seen links from being accessed by victims, preventing the actual ransomware payload from reaching the computer.
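A very small Python sketch of the text-similarity idea: split each URL into character trigrams and measure how closely a new link resembles links already known to be malicious. This is a simplification chosen for illustration (Jaccard similarity over trigrams, not a full NLP or clustering pipeline), and the URLs are invented placeholders under reserved example domains.

```python
from typing import Iterable, Set


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Break a URL into its set of overlapping character n-grams."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two n-gram sets: 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def nearest_similarity(url: str, known_bad: Iterable[str], n: int = 3) -> float:
    """Similarity between a new URL and its closest known-malicious URL."""
    grams = char_ngrams(url, n)
    return max((jaccard(grams, char_ngrams(bad, n)) for bad in known_bad), default=0.0)


# Hypothetical known-malicious link and two never-before-seen links.
KNOWN_BAD = ["http://pay-decrypt.example/unlock?id=1001"]

lookalike_score = nearest_similarity("http://pay-decrypt.example/unlock?id=2093", KNOWN_BAD)
unrelated_score = nearest_similarity("https://docs.example.org/manual/index.html", KNOWN_BAD)
```

A link that scores close to a known-malicious one could be blocked or sent for deeper analysis, even though that exact address has never been seen before.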

Machine learning algorithms for ransomware identification can be used as a proactive method for combating ransomware threats, regardless of whether those threats are designed for PCs, mobile devices or even IoT devices. The main benefit of machine learning is that it can be used as a tool to augment existing security layers, making them more proactive, more effective and better performing.

Ransomware is here to stay: So is defense
It's highly unlikely that ransomware will go away any time soon, especially since digitalization has brought increased interconnectivity between systems. With a proven and tested business model and financial gains in the billions of dollars, ransomware is likely the biggest mass-market threat to both end users and organizations.

However, machine learning algorithms can augment all security layers to detect and stop threats at pre-execution, on-execution and post-execution, making ransomware less of a threat and more of a nuisance.

Bogdan Botezatu is living his second childhood at Bitdefender as senior e-threat analyst.
