|Click here for more of Dark Reading's Black Hat articles.|
In reality, the exact number of vulnerabilities reported in different databases each year varies widely -- by as much as 75 percent in 2012. The fundamental problems in counting vulnerabilities, along with the issues of assigning a meaningful severity to each vulnerability, means that analyses based on the data should be treated with skepticism, argue two security professionals who plan to outline problems with vulnerability data at Black Hat in Las Vegas later this summer.
Researchers Brian Martin, content manager of the Open Source Vulnerability Database (OSVDB), and Steve Christey, principal information security engineer in the security and information operations division at The MITRE Corporation, say that the goal of their talk is to not only point out unreliable data, but also to help people pinpoint which reports are based on such shaky foundations.
"At the very least, it is important that people understand the limitations of the data that [is] being used and be able to read reports based on that data with a sufficient dose of skepticism," Christey says.
The impact of the uncertainty in vulnerability statistics goes beyond just the cybercliques of bug hunters, security researchers, and data scientists. Companies frequently rely on the severity assigned to vulnerabilities to triage patch deployments, Martin says.
"Companies are basing their decisions off of all of these stats, and those decisions are very sweeping, in the sense that it is affecting the budget, it's affecting the personnel, and their lives to a degree," he says.
A major source of confusion is the wide range of flaw counts. Recent reports from Sourcefire and Symantec, for example, were based on vulnerabilities tallied from the National Vulnerability Database and its collection of flaws that have a Common Vulnerability and Exposures (CVE) identifier. Thus, the two reports had very similar numbers: 5,281 and 5,291, respectively. On the other hand, the Open-Source Vulnerability Database (OSVDB) seeks out a large number of additional vulnerability reports and posts the highest bug counts -- 9,184 for 2012, 75 percent higher than that reported by Sourcefire. Other vendors that have their own sources of vulnerability data typically land between the two extremes. Hewlett-Packard's Zero-Day Initiative, which buys information on serious software security issues, claimed to have found 8,137.
[Reports like this one, which marked a 26 percent jump in vulnerabilities year-over-year, need to have better disclaimers about the data. See Lessons Learned From A Decade Of Vulnerabilities.]
And those numbers are hardly set in stone. Every database updates its tallies with new information on old vulnerabilities. By the end of 2013, each count will be higher than it is now.
"When deriving statistics from the CVE data set, it is important to document assumptions and maintain a consistent approach," Brian Gorenc, manager of the Zero Day Initiative at HP Security Research, said in an e-mail interview. "The key is that readers should be able to follow the author's rationale."
Adding to the problems, the most popular method of assigning a severity to each vulnerability has major issues of its own. Known as the Common Vulnerability Scoring System (CVSS), the metric is often treated as an absolute measure of a vulnerability's severity -- both by researchers and companies. Yet the system often scores vulnerabilities incorrectly, or allows researchers too much leeway in ranking the criticality of a flaw. Often, when a vendor has not given enough information on a flaw in its product, security researchers will cautiously give it the highest CVSS score, Martin says.
"The biggest gripe is that there are too many unknown which are left up to the scorers' discretion," he says.
While the criticism of reports based on the data should be taken to heart, and vulnerability counts not taken as absolute, security researchers working on analyzing the data should still find it valuable, says Stefan Frei, research director at NSS Labs and an author of one report that used available data. As long as the source of the data is kept consistent, he says, the overall trends should be valid.
"This is not physical science, where you can repeatedly measure something -- it's more like a social science," Frei says.
In the end, the authors of any report based on vulnerability data should add a discussion of the data and its weaknesses, OSVDB's Martin says.
"We don't yet have that rigor in the vulnerability information industry," he says. "Every one is going toward the 'gimme' stats."
Have a comment on this story? Please click "Add Your Comment" below. If you'd like to contact Dark Reading's editors directly, send us a message.