Dark Reading is part of the Informa Tech Division of Informa PLC

Operational Security

12/11/2017
08:35 AM
Bogdan Botezatu
News Analysis-Security Now

Machine Learning for Ransomware Defense

Ransomware keeps getting more dangerous but defense is improving, too. Machine learning might be the key to actually keeping up with the level of attacks.

In the past several years, ransomware has inflicted financial losses estimated at billions of dollars -- and that's only what victims have reported to law enforcement. Described by security researchers as one of the most prolific and financially lucrative malware categories, ransomware has been ported to all major operating systems (OSes), including mobile ones.

While ransomware was originally developed to target computers running Windows, as of 2016 Apple's macOS and Linux have seen their own ransomware strains. In fact, the ransomware business has been so prolific that cybercriminals have even turned it into an as-a-service offering. Ransomware-as-a-service lets even less tech-savvy users rent ransomware and start infecting victims for their own financial gain.

Relying on tools such as encryption, obfuscation and polymorphism, ransomware has caused serious concerns for the security industry, as traditional detection mechanisms are ill-equipped to detect each victim-specific ransomware sample. Consequently, machine learning algorithms designed to automatically and correctly tag ransomware samples based on their behavior or similarity with known ransomware have become a necessity.

While machine learning plays a significant part in detecting ransomware, no single machine learning algorithm can spot all ransomware. That would require an ensemble of specifically trained machine learning algorithms working together, each designed to identify either a specific ransomware family, a ransomware-disseminating website, or packers (a technique commonly referred to as "executable compression" and used to compress the ransomware payload to make it difficult for security tools to analyze it).
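The ensemble idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any vendor's engine works: the two detectors are hypothetical string-matching stand-ins for trained models, and `family_detector`, `packer_detector`, `DETECTORS` and `ensemble_verdict` are names invented for this example.

```python
from typing import Callable, Dict, Tuple

# A detector takes raw sample bytes and returns a score in [0, 1].
Detector = Callable[[bytes], float]


def family_detector(sample: bytes) -> float:
    # Hypothetical stand-in for a model trained on one ransomware family:
    # it only looks for a ransom-note phrase in the sample.
    return 1.0 if b"YOUR FILES ARE ENCRYPTED" in sample.upper() else 0.0


def packer_detector(sample: bytes) -> float:
    # Hypothetical stand-in for a packer model: it only looks for a toy marker.
    return 1.0 if b"UPX!" in sample else 0.0


# Each specialized detector is registered under a name.
DETECTORS: Dict[str, Detector] = {
    "family": family_detector,
    "packer": packer_detector,
}


def ensemble_verdict(
    sample: bytes,
    detectors: Dict[str, Detector],
    threshold: float = 0.5,
) -> Tuple[bool, Dict[str, float]]:
    """Run every detector and flag the sample if any score reaches the threshold."""
    scores = {name: detect(sample) for name, detect in detectors.items()}
    flagged = any(score >= threshold for score in scores.values())
    return flagged, scores
```

Flagging on any single detector favors detection over precision; a real ensemble would more likely weight or vote across its members to keep false positives down.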

Ransomware doesn't discriminate
Ransomware indiscriminately selects targets, with the sole aim of locking access to critical files and demanding payment to restore access. However, it has recently undergone some transformations that have allowed it to infect victims even without any user interaction.

WannaCry was the first ransomware outbreak to leverage a Windows vulnerability to spread automatically across networks and infect victims without any user interaction. Simply having a vulnerable PC with an Internet connection was enough to get infected, which is why hundreds of thousands of computers were rendered inoperable during its relatively short outbreak. GoldenEye was another ransomware pandemic, affecting several European countries, including Poland, Germany, Italy, Spain and France.

While in both cases the attackers made little money from ransoms, these incidents proved highly disruptive and demonstrated just how easily unpatched vulnerabilities can be exploited to deliver any type of threat, even one as pervasive as ransomware.

The Internet of Things is not immune either, but the business model is fundamentally different from that of the PC. Instead of locking you out of your files, for example, ransomware designed for IoT devices and smart homes could lock you out of your home. Even medical devices -- including implantable ones -- could be exploited and used to extort victims. For example, the wireless capability of Dick Cheney's pacemaker was reportedly disabled for fear that terrorists -- or even ransomware -- could threaten his life.

Machine learning steps in
Where traditional file-based detection technologies fail, machine learning algorithms can succeed. Neural networks and deep learning algorithms can detect unknown ransomware samples if they're properly trained and tuned to produce a low number of false positives. Augmenting cloud-based detections with machine learning and genetic algorithms is also effective in combating the rampant growth of ransomware caused by its polymorphic behavior.

In a nutshell, it all starts with a large dataset of ransomware files and an even larger set of clean files. The algorithm is tasked with finding characteristics of each file in the training set and normalizing each one into a number that is usually called a "feature." As one characteristic may create more than one feature, only a subset of those features will be used to train a model for the sample set.
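To make the notion of a "feature" concrete, here is a minimal sketch that turns a file's raw bytes into a few normalized numbers. The three characteristics (byte entropy, share of printable bytes, capped size) and the names `byte_entropy`, `extract_features` and `MAX_SIZE` are assumptions chosen for illustration; the article does not say which characteristics production models use.

```python
import math
from collections import Counter
from typing import List

MAX_SIZE = 10 * 1024 * 1024  # assumed cap (10 MB) used to normalize file size


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def extract_features(data: bytes) -> List[float]:
    """Map one file to a fixed-length vector of numbers, each scaled to [0, 1]."""
    if not data:
        return [0.0, 0.0, 0.0]
    entropy = byte_entropy(data) / 8.0                          # encrypted or packed data scores high
    printable = sum(1 for b in data if 32 <= b < 127) / len(data)  # share of printable ASCII bytes
    size = min(len(data), MAX_SIZE) / MAX_SIZE                   # capped, normalized size
    return [entropy, printable, size]
```

For example, `extract_features(open(path, "rb").read())` would yield one row of the training matrix for the file at `path`.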

When using neural networks to create models used for ransomware identification, all samples are usually mapped in a matrix comprising tens of thousands of features. Instead of a three-dimensional space in which three features determine whether a file is considered ransomware, imagine an n-dimensional space with more than 40,000 features. That might sound extremely complicated, but the end result is actually a mathematical equation -- also known as a model -- that acts as a condition that, once satisfied, will tag a file as ransomware.
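The idea of a model as "an equation that acts as a condition" can be illustrated with the simplest possible case: a weighted sum of features passed through a logistic function and compared with a threshold. This is a deliberately simplified stand-in for a trained neural network; the function names, the weights and the threshold below are invented for the example, and a real model would learn its weights from the training set.

```python
import math
from typing import Sequence


def score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic score in (0, 1): sigmoid of the weighted feature sum plus a bias."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def is_ransomware(
    features: Sequence[float],
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,
) -> bool:
    """The 'condition': tag the file when the score reaches the threshold."""
    return score(features, weights, bias) >= threshold


# Hand-picked, purely illustrative parameters for a three-feature model.
EXAMPLE_WEIGHTS = [4.0, -2.0, 1.0]
EXAMPLE_BIAS = -2.0
```

With 40,000 features instead of three, the equation has the same shape; only the lengths of the two vectors change.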

A major benefit of using machine learning models to spot ransomware is that they increase the number of ransomware files that can be detected: if enough ransomware features are present in an unknown sample, the file is likely ransomware.

The second benefit is that machine learning models are extremely small, usually around 1 kilobyte, which makes them easy to deploy across the entire user base. The only downside of using machine learning models to detect ransomware is that they have to be extensively tested before deployment to avoid incorrectly tagging clean files as malicious.
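The pre-deployment testing described above amounts to measuring how often a model flags files known to be clean. A minimal sketch follows, assuming a hypothetical `predict` callable that returns True for "malicious" and an illustrative acceptance threshold of 0.1 percent; neither comes from the article.

```python
from typing import Callable, Sequence, TypeVar

Sample = TypeVar("Sample")


def false_positive_rate(
    predict: Callable[[Sample], bool],
    clean_samples: Sequence[Sample],
) -> float:
    """Fraction of known-clean samples that the model wrongly tags as malicious."""
    if not clean_samples:
        raise ValueError("need at least one clean sample")
    false_positives = sum(1 for sample in clean_samples if predict(sample))
    return false_positives / len(clean_samples)


def ready_to_deploy(
    predict: Callable[[Sample], bool],
    clean_samples: Sequence[Sample],
    max_fpr: float = 0.001,  # assumed acceptance bar, not a figure from the article
) -> bool:
    """Gate deployment on the measured false positive rate."""
    return false_positive_rate(predict, clean_samples) <= max_fpr
```

The measurement is only as good as the clean set: it needs to be large and representative of what users actually run, which is why the article stresses extensive testing.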

Some machine learning algorithms can even identify suspicious URLs that are either used to disseminate ransomware or act as command and control servers. Using Natural Language Processing (NLP) algorithms and various clustering methods to parse texts, they can potentially block new or never-before-seen links from being accessed by victims, preventing the actual ransomware payload from reaching the computer.
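One simple text-processing approach to URLs is to break each link into character n-grams and compare it with links already known to be malicious. The sketch below uses Jaccard similarity as a stand-in for the NLP and clustering methods the article mentions; it is not the author's method, the function names are invented for this example, and the URLs use the reserved `.example` domain.

```python
from typing import Iterable, Set, Tuple


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Set of overlapping character n-grams of the lowercased text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def nearest_known_bad(url: str, known_bad: Iterable[str], n: int = 3) -> Tuple[str, float]:
    """Return the most similar known-bad URL and its similarity to `url`."""
    grams = char_ngrams(url, n)
    best_url, best_sim = "", 0.0
    for bad in known_bad:
        sim = jaccard(grams, char_ngrams(bad, n))
        if sim > best_sim:
            best_url, best_sim = bad, sim
    return best_url, best_sim
```

A never-before-seen link that scores close to a known distribution URL could be blocked or sent for further analysis; where to set the cut-off is a tuning decision that trades missed links against false alarms.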

Machine learning algorithms for ransomware identification can be used as a proactive method for combating ransomware threats, regardless of whether they're designed for PCs, mobile devices or even IoT devices. The main benefit of machine learning is that it can augment existing security layers, making them more proactive, more effective and faster.

Ransomware is here to stay: So is defense
It's highly unlikely that ransomware will go away any time soon, especially since digitalization has brought increased interconnectivity between systems. With a proven and tested business model and financial gains in the billions of dollars, ransomware is likely the biggest mass-market threat to both end users and organizations.

However, machine learning algorithms can augment all security layers to detect and block threats at pre-execution, on-execution and post-execution, making ransomware less of a threat and more of a nuisance.

-- Bogdan Botezatu is living his second childhood at Bitdefender as senior e-threat analyst.
