The finding underscores the growing risk of weaponizing publicly available AI models and the need for better security to combat the looming threat.

Researchers have discovered about 100 machine learning (ML) models that have been uploaded to the Hugging Face artificial intelligence (AI) platform and potentially enable attackers to inject malicious code onto user machines. The findings further underscore the growing threat that lurks when attackers poison publicly available AI models for nefarious activity.

The discovery of the malicious models by JFrog Security Research is part of ongoing research by the firm into how attackers can use ML models to compromise user environments, according to a blog post published this week.

Specifically, JFrog developed a scanning environment to scrutinize model files uploaded to Hugging Face — a widely used, public AI model repository — to detect and neutralize emerging threats, particularly from code execution.

In running this tool, the researchers discovered that models loaded onto the repository were harboring malicious payloads. In one example, the scanner flagged a PyTorch model uploaded into a repository by a user named baller423 — an account that has since been deleted — that enables attackers to insert arbitrary Python code into a key process. This potentially could lead to malicious behavior when the model is loaded onto a user's machine.

Hugging Face Payload Analysis

Researchers who embed payloads in uploaded AI models typically aim to demonstrate vulnerabilities or showcase proofs-of-concept without causing harm, but the payload uploaded by baller423 differed significantly, JFrog senior security researcher David Cohen wrote in the post.

It initiated a reverse shell connection to an actual IP address, 210.117.212.93, behavior that "is notably more intrusive and potentially malicious, as it establishes a direct connection to an external server, indicating a potential security threat rather than a mere demonstration of vulnerability," he wrote.

JFrog found that the IP address range belongs to Kreonet, which stands for "Korea Research Environment Open Network." Kreonet serves as a high-speed network in South Korea to support advanced research and educational endeavors; therefore, it's possible that AI researchers or practitioners may have been behind the model.

"However, a fundamental principle in security research is refraining from publishing real working exploits or malicious code," a principle that was breached when the malicious code attempted to connect back to a real IP address, Cohen noted.

Moreover, shortly after the model was removed, the researchers encountered further instances of the same payload with varying IP addresses, one of which remains active.

Further investigation into Hugging Face uncovered some 100 potentially malicious models, highlighting the wider impact of the overall security threat from malicious AI models, which demands constant vigilance and more proactive security, Cohen wrote.

How Malicious AI Models Work

Understanding how attackers can weaponize Hugging Face ML models requires understanding how a malicious PyTorch model like the one uploaded by baller423 works in the context of Python and AI development.

Code execution can happen when loading certain types of ML models — for example, a model that uses what's called the "pickle" format, a common format for serializing Python objects. That's because pickle files can also contain arbitrary code that is executed when the file is loaded, according to JFrog.
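The behavior described above can be reproduced with a deliberately harmless payload. The following is a minimal sketch using only Python's standard library; the class name HarmlessPayload is hypothetical, and the built-in print function stands in for whatever an attacker would actually run. It is not the payload JFrog found.

```python
import pickle


class HarmlessPayload:
    """Hypothetical stand-in for a tampered object inside a model file."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair in the file and calls it
        # when the file is loaded. A real attack would reference something
        # far more dangerous than print.
        return (print, ("this ran during unpickling",))


# Serializing records the instruction "call print(...)" in the byte stream.
blob = pickle.dumps(HarmlessPayload())

# Deserializing executes that call; no HarmlessPayload object comes back.
result = pickle.loads(blob)
```

Note that the loader never has to call a method on the object or even use the return value: the call happens as a side effect of loading the file.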

Loading PyTorch models with the Transformers library, a common approach among developers, involves using the torch.load() function, which deserializes the model from a file. Particularly when dealing with PyTorch models trained with Hugging Face's Transformers library, developers often employ this method to load the model along with its architecture, weights, and any associated configurations, according to JFrog.

The Transformers library, then, provides a comprehensive framework for natural language processing tasks, facilitating the creation and deployment of sophisticated models, Cohen observed.

"It appears that the malicious payload was injected into the PyTorch model file using the __reduce__ method of the pickle module," he wrote. "This method enables attackers to insert arbitrary Python code into the deserialization process, potentially leading to malicious behavior when the model is loaded."
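One general way to constrain what a __reduce__-style payload can reach is to restrict which imports the unpickler will resolve, a pattern described in the Python pickle documentation. The sketch below uses only the standard library; the names ALLOWED, AllowlistUnpickler, and safe_loads are hypothetical, and the single allowlist entry is an assumption chosen for illustration. It is not how JFrog or Hugging Face handle model files. Recent PyTorch releases expose a weights_only option on torch.load() that restricts unpickling in a similar spirit.

```python
import io
import pickle

# Assumption for illustration: only this one class may be imported on load.
ALLOWED = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses any import not explicitly allowlisted."""

    def find_class(self, module, name):
        # pickle calls find_class for every global the stream references,
        # including the callable returned by a __reduce__ method.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked import: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize data, raising UnpicklingError on unexpected imports."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

Plain containers and numbers load normally through safe_loads, while a stream that references print, os.system, or any other global outside ALLOWED raises pickle.UnpicklingError before the callable is ever invoked.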

While Hugging Face has a number of quality built-in security protections — including malware scanning, pickle scanning, and secrets scanning — it doesn't outright block or restrict pickle models from being downloaded. Instead, it just marks them as "unsafe," which means someone can still download and execute potentially harmful models.
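To give a sense of what scanning a pickle can involve, the sketch below walks a pickle's opcodes with the standard library's pickletools module and lists the imports it would perform, without loading it. This is a simplified heuristic under stated assumptions, not Hugging Face's or JFrog's scanner: the SUSPICIOUS set and the function names are hypothetical, and the string tracking used for STACK_GLOBAL can miss payloads that fetch names from the pickle memo.

```python
import pickle
import pickletools

# Assumption for illustration: top-level modules worth flagging.
SUSPICIOUS = {"os", "posix", "nt", "subprocess", "socket", "builtins", "sys", "runpy"}


def list_imports(data: bytes):
    """Return (module, name) pairs a pickle would import, without loading it."""
    found = []
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols carry "module name" as one space-separated arg.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Newer protocols push module and name as the two preceding strings.
            found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


def is_suspicious(data: bytes) -> bool:
    """Flag a pickle if any import comes from a module in SUSPICIOUS."""
    return any(module.split(".")[0] in SUSPICIOUS for module, _ in list_imports(data))
```

A file containing only plain data, such as a dictionary of floats, yields no imports at all, whereas a payload built with __reduce__ shows up as an import of the callable it intends to run. A flag here is only a signal for review; as the article notes, marking a file "unsafe" does not stop someone from downloading and loading it.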

Furthermore, pickle-based models are not the only ones susceptible to executing malicious code. For instance, the second-most prevalent model type on Hugging Face is TensorFlow Keras, which can also execute arbitrary code, though this method is harder for attackers to exploit, according to JFrog.

Mitigating Risk from Poisoned AI Models

This isn't the first time that researchers have found an AI security risk in Hugging Face, a platform where the ML community collaborates on models, data sets, and applications. Researchers at AI security startup Lasso Security previously said they were able to access the Bloom, Meta-Llama, and Pythia large language model (LLM) repositories using unsecured API access tokens they discovered on GitHub and the Hugging Face platform for LLM developers.

The access would have allowed an adversary to silently poison training data in these widely used LLMs, steal models and data sets, and potentially execute other malicious activities.

Indeed, the growing existence of publicly available and thus potentially malicious AI/ML models poses a major risk to the supply chain, particularly for attacks that specifically target demographics such as AI/ML engineers and pipeline machines, according to JFrog.

To mitigate this risk, AI developers should use new tools available to them, such as Huntr, a bug-bounty platform tailored specifically for AI vulnerabilities, to enhance the security posture of AI models and platforms, Cohen wrote.

"This collective effort is imperative in fortifying Hugging Face repositories and safeguarding the privacy and integrity of AI/ML engineers and organizations relying on these resources," he wrote.

About the Author(s)

Elizabeth Montalbano, Contributing Writer

Elizabeth Montalbano is a freelance writer, journalist, and therapeutic writing mentor with more than 25 years of professional experience. Her areas of expertise include technology, business, and culture. Elizabeth previously lived and worked as a full-time journalist in Phoenix, San Francisco, and New York City; she currently resides in a village on the southwest coast of Portugal. In her free time, she enjoys surfing, hiking with her dogs, traveling, playing music, yoga, and cooking.