Dark Reading is part of the Informa Tech Division of Informa PLC


Saumitra Das

When Every Attack Is a Zero Day

Stopping malware the first time is an ideal that has remained tantalizingly out of reach. But automation, artificial intelligence, and deep learning are poised to change that.

The collective efforts of hackers have fundamentally changed the cyber defense game. Today, adversarial automation is being used to create and launch new attacks at such a rate and volume that every strain of malware must now be considered a zero day and every attack considered an advanced persistent threat.

That's not hyperbole. According to research by AV-Test, more than 121.6 million new malware samples were discovered in 2017. That is more than 333,000 new samples each day, more than 230 new samples each minute, nearly four new malware samples every second.
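The per-day, per-minute, and per-second rates follow directly from the annual figure. A quick back-of-the-envelope check, assuming a 365-day year (the variable names are illustrative only):

```python
# Back-of-the-envelope check of the rates quoted above.
SAMPLES_2017 = 121_600_000  # annual figure attributed to AV-Test in the article

per_day = SAMPLES_2017 / 365          # assumes a 365-day year
per_minute = per_day / (24 * 60)
per_second = per_minute / 60

print(f"{per_day:,.0f} new samples per day")      # 333,151
print(f"{per_minute:.0f} new samples per minute")  # 231
print(f"{per_second:.2f} new samples per second")  # 3.86
```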

When malicious, morphing malware is unleashed at that scale, traditional defenses are quickly overwhelmed. Signature-based detection only works for known threats. Sandboxing-based detection techniques can't keep up because there aren't enough time and resources to analyze and identify attack signatures when your enterprise is being bombarded with malware variants that have never been seen before.
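Why a signature only catches what has already been seen can be illustrated with the simplest kind of signature, an exact file hash. The sketch below is a toy under stated assumptions: the "sample" is harmless placeholder bytes, and `signature_db` and `is_known_threat` are hypothetical names. Real products also use richer signatures such as byte patterns, so this shows only the exact-match case.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Placeholder bytes standing in for a previously analyzed sample (not real malware).
known_sample = b"MZ\x90\x00 placeholder payload bytes"

# Hypothetical signature database: the set of hashes of known samples.
signature_db = {sha256_hex(known_sample)}

def is_known_threat(data: bytes) -> bool:
    """Exact-match lookup: True only if this precise byte sequence was seen before."""
    return sha256_hex(data) in signature_db

# A "variant" that differs from the known sample by a single appended byte.
variant = known_sample + b"\x00"

print(is_known_threat(known_sample))  # True
print(is_known_threat(variant))       # False
```

A one-byte change produces a different hash, so the variant passes the lookup until someone analyzes it and adds a new entry.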

Stopping malware attacks the first time is an ideal that has remained tantalizingly out of reach, and so success measured over time became the standard, a standard that has been obviated by the insidiously effective nature of malware. If an attack succeeds once but is stopped on the next 99 attempts, that's a 99% success rate. To achieve that, however, someone has to be "Patient Zero." Someone must take one for the team so that the intelligence gained from that first attack can be shared and used to prevent subsequent attacks. But when attacks are launched at a massive, global scale, and when there are more than 121 million new samples every year, there's never just one Patient Zero. And it's no fun if you happen to be among them.

Thanks to advancements in the development of automation, artificial intelligence and deep learning, there may be hope. (Editor's note: Blue Hexagon is one of several early innovators developing security products based on deep learning.)

Deep learning is a type of machine learning that uses artificial neural networks to make decisions. Artificial neural networks are not new, but recent advancements in processing have increased their capabilities. At the same time, the costs of the underlying technology have fallen, putting deep learning applications within the reach of many industries, including cybersecurity. In fact, deep learning is well suited to addressing many of the challenges that continue to stymie efforts to secure networks against hacking's daily onslaught.

Fundamentally, cybersecurity is about data and patterns, and there is a huge pool of threat data available through threat intelligence services and repositories that has been aggregated over the years and that can be used to inform deep learning-based defenses. By exposing neural nets to this vast threat data set, a deep learning system can learn to identify malicious traffic, even if the specific attack is brand new.
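The idea of learning a pattern rather than memorizing samples can be sketched with a single logistic neuron, the simplest building block of a neural network. This is not a deep network and not any vendor's method. The training data is synthetic and hand-made, and the one feature used (the share of non-printable bytes) is a toy stand-in, not a real malware indicator. All names are hypothetical.

```python
import math

def nonprintable_fraction(data: bytes) -> float:
    """Toy feature: share of bytes outside printable ASCII (32..126)."""
    if not data:
        return 0.0
    return sum(1 for b in data if b < 32 or b > 126) / len(data)

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic, hand-made training set (label 1 = "malicious", 0 = "benign").
training = [
    (b"hello world, quarterly report attached", 0),
    (b"GET /index.html HTTP/1.1", 0),
    (b"meeting notes for tuesday", 0),
    (bytes([0x90, 0x90, 0xEB, 0xFE, 0x00, 0x01, 0xCC, 0xCC]), 1),
    (bytes([0xFF, 0xD8, 0x00, 0x13, 0x37, 0x80, 0x81, 0x82]), 1),
    (bytes([0x01, 0x02, 0x03, 0xAA, 0xBB, 0xCC, 0xDD, 0xEE]), 1),
]
features = [(nonprintable_fraction(d), y) for d, y in training]

# Train one neuron (weight w, bias b) with batch gradient descent on log-loss.
w, b, lr = 0.0, 0.0, 0.5
for _ in range(1000):
    grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in features) / len(features)
    grad_b = sum((sigmoid(w * x + b) - y) for x, y in features) / len(features)
    w -= lr * grad_w
    b -= lr * grad_b

def predict(data: bytes) -> float:
    """Estimated probability that the data belongs to the 'malicious' class."""
    return sigmoid(w * nonprintable_fraction(data) + b)

# Byte sequences that never appeared in the training set.
unseen_variant = bytes([0x90, 0xEB, 0x41, 0x00, 0xCC, 0xFE, 0x7F, 0x02])
unseen_benign = b"invoice for march"

print(predict(unseen_variant) > 0.5)  # True: flagged without an exact match
print(predict(unseen_benign) > 0.5)   # False
```

Unlike an exact-match lookup, the trained model scores inputs it has never seen, which is the property the paragraph above describes. How well that holds on real traffic depends entirely on the model and the data it was trained on.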

This is not theoretical. Deep learning has been applied at network entry points — both on-premises and in the cloud — to inspect traffic in early, live customer deployments, where it has successfully detected and blocked polymorphic malware, including Emotet variants, on first encounter. The underlying architecture ensures that threat analysis, verdict, and prevention occur in seconds, keeping malware out of the network in real time.

It's early days yet, and while there has been no independent testing disclosed to date, the potential for deep learning to make a quantum leap is in evidence. In our lab and beta test environments, we have consistently achieved nearly 100% detection rates for all threats encountered, including both known samples and zero days, regardless of OS or application. We are also pursuing independent testing to verify these results.

This is important because hackers have developed techniques to evade and defeat traditional defenses such as sandboxes and signatures. These results suggest that the industry may have reached a point where stemming the tide of threat escalation is achievable, and that the traditional game of cybersecurity whack-a-mole (in which threat actors create and distribute new malware, security vendors identify the new strain and distribute its signature, and threat actors respond by creating still more strains) may be at an end.

When attackers realized they could use automation to generate and distribute malware variants faster than the industry could react, they embraced their new ability with enthusiasm. If deep learning gives our industry the means to return fire and blunt their attacks with overwhelming speed and intelligence, we should likewise embrace our newfound power.


Saumitra Das is the CTO and Co-Founder of Blue Hexagon. He has worked on machine learning and cybersecurity for 18 years. As an engineering leader at Qualcomm, he led teams of machine learning scientists and developers in the development of ML-based products shipped in ...

User Rank: Apprentice
4/27/2019 | 9:10:11 PM
Re: Rational post.... Have you seen how Fortinet deals with this challenge?
I would love to talk to you more about why the amount of virus/malware per week is not relevant when you do not have to create unique signatures for each variant. Fortinet has patented technology that allows a core signature to match multiple variations, whereas a typical A/V database has to contain a signature for every variant. That large number of 1.8 million shrinks considerably when you don't have to track each variation of the same family.
Saumitra Das,
User Rank: Author
4/26/2019 | 2:56:41 AM
Re: Rational post.... Have you seen how Fortinet deals with this challenge?
Yes, I have seen how several techniques have been used to deal with this, including more complicated hashing techniques and complex signatures applied on more involved static analysis involving emulation or unpacking (like CPRL). While these are very interesting ways to deal with these challenges, in my opinion they do not scale to the current threat landscape, where we see a high degree of automation and millions of threats every single day. As an example, despite advancements like CPRL, documentation online touts the following:
  • 1.8 Million new and updated AV definitions per week
  • Hourly updates of the AV signature database

Clearly, if signatures need hourly updates and millions of new definitions per week, the existing signatures are not able to generalize to the scale of attack creation in the threat landscape, despite the innovation in the nature of signatures. If they did generalize, one should not need to update signatures so often.

Additionally, sandboxing is proposed to handle the real "unknowns" which are not captured by the traditional "one signature, one variant" technique or CPRL. But that product has several caveats, like max file sizes and a conserve mode (to reduce file types analyzed when the sandbox is loaded). If CPRL could handle all the variants, I would assume the sandbox should have very few unknowns to deal with and not have these caveats and throughput concerns. Ideally, if signatures could generalize so well, one should not even need a sandbox appliance, since there would be so few true unknowns that a cloud sandbox would suffice.

My opinion is that while techniques like CPRL are a meaningful and necessary improvement over the "one signature, one variant" technique, the current threat landscape calls for a level of generalization to cover attacks that is at the same scale as that of the attackers. This is not just needed for new unknown variants but also to cover the existing known attacks. Fitting the known attack signatures into perimeter protection without degrading throughput is as much a problem as detecting new variants.
User Rank: Apprentice
4/25/2019 | 11:28:36 AM
Rational post.... Have you seen how Fortinet deals with this challenge?
Do a google search for "fortinet CPRL" - Compact Pattern Recognition Language