Dark Reading is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Application Security

02:05 PM

Researchers Scan for Supply-Side Threats in Open Source

A recent project to scan the main Python repository's 268,000 packages found only a few potentially malicious programs, but work earlier this year uncovered hundreds of instances of malware.

Open source repositories form the backbone of modern software development — nearly every software project includes at least one component — but security experts increasingly worry that attackers are focused on infecting systems by inserting malicious code into popular repositories.

A number of projects have kicked off this year to search for such Trojan horses. Last week, Stripe engineer Jordan Wright published the results of a home-brew research project that downloaded every Python component from the Python Package Index (PyPI) and looked for system calls that could indicate malicious intent. Overall, he found hundreds of packages that created network connections — most by including a common dependency — and a few packages that seemed risky. These included two that appeared to be test cases — one named "i-am-malicious" and another named "maliciouspackage" — and a third that used obfuscation to hide commands.

Related Content:

Attackers Aim at Software Supply Chain with Package Typosquatting

The Changing Face of Threat Intelligence

New on The Edge: An Inside Look at an Account Takeover

However, none of the scanned packages seemed outright malicious, Wright said in his analysis.

"Looking through the data, I didn't find any packages doing significantly harmful activity that didn't also have 'malicious' somewhere in the name, which was good," he said. "But it's always possible I missed something, or that [attackers installing malicious code] would happen in the future."

In fact, such attacks have already happened. Two years ago, for example, an attacker compromised a developer's account and published malicious versions of two components of the popular Javascript package ESLint to the Node Package Manager (NPM) service. While the package has millions of weekly downloads, the project group received a warning and unpublished the packages within two hours, limiting the impact.

The attack often takes another form: typosquatting, where attackers create Trojan horses that have names similar to common packages. In April, an attacker seeded the Ruby package repository, RubyGems, with more than 760 malicious packages with names similar to legitimate packages. Such attacks attempt to take advantage of mistyped install commands — relatively rare, perhaps, but devastating if they produce a compromise.

Last year, the Python core development team asked the community for ways of finding malicious code inserted into the modules and packages used by Python. For open source projects, these issues are particularly challenging, said Mike Myers, principal security engineer at Trail of Bits, a software security consultancy, in an answering comment.

"[T]he Google and Apple app stores have both invested heavily in runtime analysis sandboxes and static analysis approaches for detecting malice in their app stores," he said. "The difference there being, they can run their detections in secret, and adversaries can't develop an evasion in advance without disclosing it in a submission."

A team of researchers from the Georgia Institute of Technology carried out a similar analysis for three major repositories: Python's PyPI, the Node Package Manager (NPM), and RubyGems. Their system, dubbed MalOSS, combines metadata analysis, static code analysis, and dynamic runtime analysis to determine whether a package is behaving maliciously. The researchers found seven malicious packages in PyPI, 41 in NPM, and 291 in RubyGems, according to their paper published in February 2020.

Inspired by the Georgia Tech work, Wright aimed to look for signs that attackers inserted malicious code into packages by analyzing the system functions called during installation. Using the PyPI API, he downloaded 268,000 packages into a container, installed each, and watched for suspicious changes. The entire process cost about $120 in cloud fees, he said.

Wright plans to expand the effort to continuously monitor PyPI and add repositories for other platforms in the future.

"This found a few instances of potentially malicious behavior that you can find in the post, but the real power will be setting up continuous monitoring moving forward," he stated on Twitter.

Overall, Wright makes the case that each of the major repositories need to implement their own security and continuously monitor for malicious supply chain attacks in the future. Otherwise, installing packages from code in the repositories presents too great a risk, he said.

"I still don't like that it's possible to run arbitrary commands on a user's system just by them pip installing a package," Wright said. "I get that the majority of use cases are benign, but it opens up risk that must be considered. Hopefully by increasingly monitoring various package managers we can identify signs of malicious activity before it has a significant impact."

Veteran technology journalist of more than 20 years. Former research engineer. Written for more than two dozen publications, including CNET News.com, Dark Reading, MIT's Technology Review, Popular Science, and Wired News. Five awards for journalism, including Best Deadline ... View Full Bio

Recommended Reading:

Comment  | 
Print  | 
More Insights
Newest First  |  Oldest First  |  Threaded View
Look Beyond the 'Big 5' in Cyberattacks
Robert Lemos, Contributing Writer,  11/25/2020
Why Vulnerable Code Is Shipped Knowingly
Chris Eng, Chief Research Officer, Veracode,  11/30/2020
Register for Dark Reading Newsletters
White Papers
Cartoon Contest
Write a Caption, Win an Amazon Gift Card! Click Here
Latest Comment: I think the boss is bing watching '70s TV shows again!
Current Issue
2021 Top Enterprise IT Trends
We've identified the key trends that are poised to impact the IT landscape in 2021. Find out why they're important and how they will affect you today!
Flash Poll
Twitter Feed
Dark Reading - Bug Report
Bug Report
Enterprise Vulnerabilities
From DHS/US-CERT's National Vulnerability Database
PUBLISHED: 2020-12-02
Textpattern CMS 4.6.2 allows CSRF via the prefs subsystem.
PUBLISHED: 2020-12-02
Multiple cross-site scripting (XSS) vulnerabilities in Papermerge before 1.5.2 allow remote attackers to inject arbitrary web script or HTML via the rename, tag, upload, or create folder function. The payload can be in a folder, a tag, or a document's filename. If email consumption is configured in ...
PUBLISHED: 2020-12-02
CAPI (Cloud Controller) versions prior to 1.101.0 are vulnerable to a denial-of-service attack in which an unauthenticated malicious attacker can send specially-crafted YAML files to certain endpoints, causing the YAML parser to consume excessive CPU and RAM.
PUBLISHED: 2020-12-02
Editors/LogViewerController.cs in Umbraco through 8.9.1 allows a user to visit a logviewer endpoint even if they lack Applications.Settings access.
PUBLISHED: 2020-12-02
A security vulnerability has been identified in the HPE Edgeline Infrastructure Manager, also known as HPE Edgeline Infrastructure Management Software. The vulnerability could be remotely exploited to bypass remote authentication leading to execution of arbitrary commands, gaining privileged access,...