Researchers Find Bugs Using Single-Codebase Inconsistencies
A Northeastern University research team finds code defects -- and some vulnerabilities -- by detecting when programmers used different code snippets to perform the same functions.
May 3, 2021
Repeatable, consistent programming is considered a best practice in software development, and it becomes increasingly important as the size of a development team grows. Now, research from Northeastern University shows that detecting inconsistent programming — code snippets that implement the same functions in different ways — can also be used to find bugs and, potentially, vulnerabilities.
In a paper to be presented at the USENIX Security Symposium in August, a team of researchers from the university used machine learning to find bugs by first identifying code snippets that implement the same functionality and then comparing the code to detect inconsistencies. The project, dubbed "Functionally-similar yet Inconsistent Code Snippets" (FICS), found 22 new and unique bugs by analyzing five open source projects, including QEMU and OpenSSL.
The research is not meant to replace other forms of static analysis but to give developers another weapon in their arsenal to analyze their code and find potential errors, says Mansour Ahmadi, a former postdoctoral research associate at Northeastern University who now works as a security engineer at Amazon.
Other static analysis approaches must have previously encountered an issue, or be given a rule describing it, before they can recognize the pattern, he says.
"If there is a bug in the system with no previously found variant, [those approaches] will fail to find the bug," Ahmadi says. "In contrast, if there are correct implementations of the functionally similar code snippets to the buggy counterpart, FICS can detect that."
The research uses machine-learning techniques — not to find matches to known vulnerability patterns, as many other projects do — but to find functionally similar code that is implemented in different, or inconsistent, ways. Such bugs can be easily verified by developers and testers when presented with both implementations, the researchers stated in a prepublication paper.
"[F]rom basic bugs such as absent bounds checking to complex bugs such as use-after-free, as long as the codebase contains non-buggy code snippets that are functionally similar to a buggy code snippet, the buggy one can be detected as an inconsistent implementation of the functionality or logic," the researchers state. "This observation is more obvious in software projects of reasonable sizes, which usually contain many clusters of functionally-similar code snippets, often contributed by different developers."
The FICS system aims to find bugs and not vulnerabilities, but it is not uncommon that the issues found impact security, Ahmadi says. The list of bugs found by the researchers includes memory leaks, missing checks of values, and bad typecasting.
The researchers believe that some of the issues should be considered vulnerabilities, but the developers maintaining the project produced patches for the defects without much consideration for their exploitability.
"We have requested CVE for a couple of the bugs, without providing the exploits that we found. While we were acknowledged by the developers for our findings, the developers did not proceed to assign CVEs to them as they believe the bugs are not exploitable," Ahmadi says. "Overall, this is the drawback of all static analyzers as it is hard to prove if a bug is exploitable without providing a proof-of-concept."
The researchers used two types of unsupervised clustering, in which the machine-learning system organizes data with similar features into groupings. First, the researchers transformed code into functional constructs so that parts of a program's code could be clustered together based on their functionality. After that, the researchers compared code in the same clusters and used machine learning to group them by implementation. A code snippet that accounts for the majority of implementations in a specific functional cluster is considered the correct way of coding.
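The following sketch illustrates that two-step idea in miniature. It is not the FICS implementation, which operates on representations of compiled code rather than raw text; the bag-of-tokens features, cosine similarity, thresholds, and function names here are simplifying assumptions chosen to keep the example self-contained.

```python
# Minimal sketch of two-step clustering for inconsistency detection.
# All feature choices and thresholds below are illustrative assumptions.
from collections import Counter
from itertools import combinations
import math
import re


def tokens(snippet: str) -> Counter:
    """Bag-of-tokens feature vector: a coarse stand-in for 'functionality'."""
    return Counter(re.findall(r"[A-Za-z_]\w*|[^\sA-Za-z_]", snippet))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(items, similarity, threshold):
    """Connected-components clustering: items whose pairwise similarity
    exceeds the threshold end up in the same group (union-find)."""
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if similarity(items[i], items[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(items)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def find_inconsistencies(snippets):
    feats = [tokens(s) for s in snippets]
    # Step 1: coarse clustering into groups of functionally similar snippets.
    functional = cluster(feats, cosine, threshold=0.75)
    reports = []
    for group in functional:
        if len(group) < 3:
            continue  # need enough siblings for a meaningful majority
        sub = [feats[i] for i in group]
        # Step 2: stricter clustering of the concrete implementations.
        impls = cluster(sub, cosine, threshold=0.95)
        impls.sort(key=len, reverse=True)
        majority, outliers = impls[0], impls[1:]
        # The majority implementation is treated as the reference; the
        # outliers are reported for manual review.
        for outlier in outliers:
            reports.append({
                "consistent": [group[i] for i in majority],
                "inconsistent": [group[i] for i in outlier],
            })
    return reports
```

Treating the majority implementation as the reference is the same heuristic the article describes; as noted below, it can mislabel a snippet if the incorrect pattern happens to be the more common one.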
False positives are a problem. The researchers used filtering to reduce the total reported inconsistencies by a factor of 10, which still left 1,821 identified inconsistencies. Of those, 218 were considered valid cases. The high rate of false positives is an issue with all static analyzers, but in the case of FICS specifically it is not a showstopper because verification is fairly simple, Ahmadi says.
"The manual vetting effort is not as heavy as required to validate results from many other static analyzers," he says. "The ease of manual validation of FICS's reports is largely due to the presence of both the consistent and the inconsistent constructs and the highlighted differences."
The technique could be fooled into deciding the wrong code snippet is the correct one if the developer used the incorrect method more often than the correct one. Yet this error is rare; it occurred in only a single instance during the research, when two similar code snippets were incorrect and the single inconsistent code snippet was correct, Ahmadi says.
The research team also included Northeastern University PhD students Reza Mirzazade Farkhani and Ryan Williams, and Long Lu, an associate professor of computer science.