Keeping pace with the endless deluge of security vulnerabilities has become one of the truly Sisyphean tasks for enterprise IT and security teams. Every operating system, device, and application is a potential source of vulnerabilities. This can include the traditional laptops and servers that power an organization but also extends to virtual machines, cloud-based assets, Internet of Things, mobile devices, and the list goes on.
To make matters worse, the rate at which new vulnerabilities are being discovered has accelerated. A quick check of the National Vulnerability Database (NVD) shows that historically the industry would expect to see around 5,000 to 7,000 common vulnerabilities and exposures (CVEs) released each year. However, in 2017 that number spiked to 14,649, continued to climb to 16,515 in 2018, and shows no signs of slowing down. These numbers are likely underrepresenting the total number of vulnerabilities in the real world given that many platforms are not covered by CVE Numbering Authorities (typically, these are vendors or researchers that focus on specific products).
Weaponization Is the Key
However, not every vulnerability becomes weaponized (abused by an exploit or malware). In fact, most don't. Of the more than 120,000 total vulnerabilities tracked by the NVD, fewer than 24,000 have been weaponized. As a result, many organizations are turning to analytics and risk-based vulnerability management to prioritize those that are weaponized and have the highest impact.
Even this approach is somewhat reactive in that it relies on the attackers making the first move in the wild. New innovations are starting to change this equation. By applying data science and machine learning to vulnerabilities, researchers are increasingly able to predict which vulnerabilities will be weaponized even before threats are seen in the wild.
Let's take a look at how it works.
The Art and Science of Predicting Weaponization
Needless to say, the details underlying predictive models of weaponization can get quite complex. However, we can understand the basic logic behind them without diving into the specific algorithms and analysis.
First, it's important to curate the right data set. Simply having large amounts of data is not enough — we also need to have broad context around a vulnerability. For example, we will want to know a variety of details underlying a vulnerability such as the traits that contributed to its risk score, the underlying weakness that led to the vulnerability itself, how it would be abused in the context of an attack, the types of assets that it is likely to affect, and more.
For instance, Common Vulnerability Scoring System (CVSS) scores rely on a variety of base metrics to rate a vulnerability. Likewise, Common Weakness Enumeration (CWE) data provides insight into the underlying weakness of a vulnerability, and Common Platform Enumeration (CPE) gives similar insight into the platform. Each of these perspectives can be highly predictive in their own right but become far more powerful as we learn to correlate across them.
Exploitability and Impact
CVSS scores are built on a few base metrics that provide insight into the difficulty of exploiting a vulnerability and the impact that an exploit would have on a target. Obviously, from an attacker's perspective, the fewer constraints on an exploit and the higher the impact, the better. For example, a vulnerability that can be exploited remotely over a network is more valuable than one that requires the attacker to be on the local network or that requires user interaction in order to execute.
Likewise, low attack complexity can greatly increase the chances that a vulnerability will be weaponized. Low complexity typically means that the attack simply works without that attacker needing to perform additional steps such as collecting local information. Privileges also play an important role. Many exploits will simply maintain the privileges of the targeted user or application.
However, vulnerabilities that can escalate the attacker's privileges to an admin or system level become highly strategic for an attacker. Identifying low-complexity attacks that either have remote code execution or privilege escalation is often a good start to predicting if a vulnerability will be weaponized.
Feeding the AI Engine
With a data set established, we need analytical models to gain predictive insights. By looking at historical weaponization trends, we can train algorithms to look across diverse types of data and identify the combination of traits that best predicts which vulnerabilities will be weaponized by attackers in the wild. Just as importantly, this approach can predict the speed at which a given vulnerability is likely to be weaponized.
The end goal of this analysis is to allow security and IT teams to prioritize patching the few vulnerabilities that will actually become threats. Put another way, the goal is to find needles in the haystack before they even become needles.
Ultimately, predictive models should not be considered a perfect answer on their own. They can, however, help make vulnerability management a much more proactive discipline, where instead of constantly playing to catch up to attackers, defenders gain a first-mover advantage.