
Researchers Find Bugs Using Single-Codebase Inconsistencies

A Northeastern University research team finds code defects -- and some vulnerabilities -- by detecting when programmers used different code snippets to perform the same functions.

Repeatable, consistent programming is considered a best practice in software development, and it becomes increasingly important as the size of a development team grows. Now, research from Northeastern University shows that detecting inconsistent programming — code snippets that implement the same functions in different ways — can also be used to find bugs and, potentially, vulnerabilities. 

In a paper to be presented at the USENIX Security Symposium in August, a team of researchers from the university used machine learning to find bugs by first identifying code snippets that implement the same functionality and then comparing those snippets to detect inconsistencies. The project, dubbed "Functionally-similar yet Inconsistent Code Snippets" (FICS), found 22 new and unique bugs by analyzing five open source projects, including QEMU and OpenSSL.


The research is not meant to replace other forms of static analysis but to give developers another weapon in their arsenal to analyze their code and find potential errors, says Mansour Ahmadi, a former post-doc research associate at Northeastern University who now works as a security engineer at Amazon.

Other static analysis approaches must either have previously encountered an issue or be given a rule for detecting it in order to recognize the pattern, he says.

"If there is a bug in the system with no previously found variant, [those approaches] will fail to find the bug," Ahmadi says. "In contrast, if there are correct implementations of the functionally similar code snippets to the buggy counterpart, FICS can detect that."

The research uses machine-learning techniques not to match known vulnerability patterns, as many other projects do, but to find functionally similar code that is implemented in different, or inconsistent, ways. Such bugs can be easily verified by developers and testers when presented with both implementations, the researchers stated in a prepublication paper.

"[F]rom basic bugs such as absent bounds checking to complex bugs such as use-after-free, as long as the codebase contains non-buggy code snippets that are functionally similar to a buggy code snippet, the buggy one can be detected as an inconsistent implementation of the functionality or logic," the researchers state. "This observation is more obvious in software projects of reasonable sizes, which usually contain many clusters of functionally-similar code snippets, often contributed by different developers."

The FICS system aims to find bugs, not vulnerabilities specifically, but it is not uncommon for the issues found to impact security, Ahmadi says. The list of bugs found by the researchers includes memory leaks, missing checks of values, and bad typecasting.

The researchers believe that some of the issues should be considered vulnerabilities, but the developers maintaining the project produced patches for the defects without much consideration for their exploitability.

"We have requested CVE for a couple of the bugs, without providing the exploits that we found. While we were acknowledged by the developers for our findings, the developers did not proceed to assign CVEs to them as they believe the bugs are not exploitable," Ahmadi says. "Overall, this is the drawback of all static analyzers as it is hard to prove if a bug is exploitable without providing a proof-of-concept."

The researchers used two types of unsupervised clustering, in which the machine-learning system organizes data with similar features into groupings. First, the researchers transformed code into functional constructs so that parts of a program's code could be clustered together based on their functionality. After that, the researchers compared code in the same clusters and used machine learning to group the snippets by implementation. A code snippet that accounts for the majority of implementations in a functional cluster is considered the correct way of coding.
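The two-stage idea can be mocked up in a few lines (a hypothetical Python sketch using bag-of-tokens similarity and majority voting, not the FICS implementation, which works on compiler-level code representations):

```python
from collections import Counter


def tokens(snippet: str) -> frozenset:
    """Crude stand-in for a 'functional construct': the set of tokens used."""
    return frozenset(snippet.split())


def jaccard(a: frozenset, b: frozenset) -> float:
    """Similarity between two token sets."""
    return len(a & b) / len(a | b)


def cluster_by_function(snippets, threshold=0.6):
    """Stage 1: greedily group snippets whose token sets overlap heavily,
    approximating 'functionally similar' clusters."""
    clusters = []
    for s in snippets:
        for c in clusters:
            if jaccard(tokens(s), tokens(c[0])) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def flag_inconsistent(cluster):
    """Stage 2: within a functional cluster, report the minority
    implementation as a potential bug (majority vote)."""
    counts = Counter(tokens(s) for s in cluster)
    if len(counts) < 2:
        return []
    majority, _ = counts.most_common(1)[0]
    return [s for s in cluster if tokens(s) != majority]
```

Running `flag_inconsistent` over a cluster of three similar snippets, two of which perform a length check and one of which does not, flags the check-free snippet as the inconsistency to review.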

False positives are a problem. The researchers used filtering to reduce the total reported inconsistencies by a factor of 10, which still left 1,821 identified inconsistencies. Of those, 218 were considered valid cases. A high rate of false positives is an issue with all static analyzers, but in the case of FICS it is not a showstopper because verification is fairly simple, Ahmadi says.

"The manual vetting effort is not as heavy as required to validate results from many other static analyzers," he says. "The ease of manual validation of FICS's reports is largely due to the presence of both the consistent and the inconsistent constructs and the highlighted differences."

The technique could be fooled into deciding the wrong code snippet is the correct one if the developer used the incorrect method more often than the correct one. Yet, this error is rare and only occurred in a single instance during the research, when two similar code snippets were incorrect and the single inconsistent code snippet was correct, Ahmadi says.
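That failure mode follows directly from the majority-vote step and can be shown in a few lines (a hypothetical sketch reducing each implementation to a variant label): when buggy copies outnumber the correct one, the vote inverts and the correct snippet is the one flagged.

```python
from collections import Counter


def minority_variants(implementations):
    """Flag whichever implementation variant is in the minority --
    majority voting in the style FICS uses, reduced to labels."""
    counts = Counter(implementations)
    majority, _ = counts.most_common(1)[0]
    return [v for v in implementations if v != majority]


# Two buggy copies and one correct copy: the vote flags the correct one.
```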

The research team also included Northeastern University PhD students Reza Mirzazade Farkhani and Ryan Williams, and Long Lu, an associate professor of computer science.
