Machine Learning Models: A Dangerous New Attack Vector

Threat actors can weaponize code within AI technology to gain initial network access, move laterally, deploy malware, steal data, or even poison an organization's supply chain.

Threat actors can hijack machine learning (ML) models that power artificial intelligence (AI) to deploy malware and move laterally across enterprise networks, researchers have found. These models, which often are publicly available, serve as a new launchpad for a range of attacks that also can poison an organization's supply chain — and enterprises need to prepare.

Researchers from HiddenLayer's SAI Team have developed a proof-of-concept (POC) attack that demonstrates how a threat actor can use ML models — the decision-making system at the core of almost every modern AI-powered solution — to infiltrate enterprise networks, they revealed in a blog post published Dec. 6. The research is attributed to HiddenLayer's Tom Bonner, senior director of adversarial threat research; Marta Janus, principal adversarial threat researcher; and Eoin Wickens, senior adversarial threat researcher.

A recent report from CompTIA found that more than 86% of CEOs surveyed said their respective companies were using ML as a mainstream technology in 2021. Indeed, solutions as broad and varied as self-driving cars, robots, medical equipment, missile-guidance systems, chatbots, digital assistants, facial-recognition systems, and online recommendation systems rely on ML to function.

Because of the complexity of deploying these models and the limited IT resources of most companies, organizations often use open source model-sharing repositories in their deployment of ML models, which is where the problem lies, the researchers said.

"Such repositories often lack comprehensive security controls, which ultimately passes the risk on to the end user — and attackers are counting on it," they wrote in the post.

Anyone who uses pretrained machine learning models obtained from untrusted sources or public model repositories is potentially at risk from the type of attack the researchers demonstrated, Marta Janus, principal adversarial ML researcher at HiddenLayer, tells Dark Reading.

"Moreover, companies and individuals that rely on trusted third-party models can also be exposed to supply chain attacks, in which the supplied model has been hijacked," she says.

An Advanced Attack Vector

Researchers demonstrated how such an attack would work in a POC focused on the PyTorch open source framework, and also showed how it could be broadened to target other popular ML libraries, such as TensorFlow, scikit-learn, and Keras.

Specifically, researchers embedded a ransomware executable into the model's weights and biases using a technique akin to steganography; that is, they replaced the least significant bits of each float in one of the model's neural layers, Janus says.
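The bit-replacement idea can be illustrated with a harmless sketch. It is not HiddenLayer's code: the choice of 8 bits per float, the sample weights, and the helper names float_to_bits, bits_to_float, embed, and extract are all assumptions made for illustration, and the "payload" is just a short text marker.

```python
import struct

def float_to_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(bits):
    """Inverse of float_to_bits."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def embed(weights, payload):
    """Hide one payload byte in the 8 lowest mantissa bits of each weight.

    Assumption: 8 bits per float32 value; the article does not say how
    many bits the researchers replaced.
    """
    if len(payload) > len(weights):
        raise ValueError("payload does not fit in this layer")
    # Round every weight to float32 first, as a model file would store it.
    out = [bits_to_float(float_to_bits(w)) for w in weights]
    for i, byte in enumerate(payload):
        out[i] = bits_to_float((float_to_bits(out[i]) & 0xFFFFFF00) | byte)
    return out

def extract(weights, length):
    """Read the hidden bytes back out of the low 8 bits of each weight."""
    return bytes(float_to_bits(w) & 0xFF for w in weights[:length])

# Hypothetical layer weights and a benign marker standing in for a payload.
weights = [0.5213, -1.25, 0.03125, 2.75, -0.0078, 0.9, 1.5, -3.2]
payload = b"demo"

stego = embed(weights, payload)
recovered = extract(stego, len(payload))
max_rel_change = max(abs(a - b) / abs(a) for a, b in zip(weights, stego))
```

Because a float32 value carries 23 mantissa bits, overwriting the lowest 8 shifts each weight by only a few parts in 100,000, which is consistent with the researchers' point that the model's accuracy is not visibly affected.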

Next, to decode the binary and execute it, the team exploited a flaw in the PyTorch/pickle serialization format that allows arbitrary Python modules to be loaded and their methods executed. They did this by injecting a small Python script at the beginning of one of the model's files, preceded by an instruction for executing the script, Janus says.
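The underlying pickle behavior is documented in Python's standard library and can be shown with a benign sketch. The class name Hijacked is hypothetical, and the built-in sorted stands in for whatever callable an attacker would choose; this is not the injected script from the POC.

```python
import pickle

class Hijacked:
    """Hypothetical object whose pickle form is 'call this function on load'."""

    def __reduce__(self):
        # pickle records the callable and its arguments, then invokes the
        # callable while unpickling. sorted() is a harmless stand-in here.
        return (sorted, ([3, 1, 2],))

blob = pickle.dumps(Hijacked())

# Loading does not rebuild a Hijacked instance. It runs sorted([3, 1, 2])
# and hands back whatever that call returns.
loaded = pickle.loads(blob)
```

The value that comes back is the return value of the function call, not the original object, which shows that simply loading a pickle-based file is enough to run code chosen by whoever wrote it.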

"The script itself rebuilds the payload from the tensor and injects it into memory, without dropping it to the disk," she says. "The hijacked model is still functional and its accuracy is not visibly affected by any of these modifications."

The resulting weaponized model evades detection by current antivirus and endpoint detection and response (EDR) solutions while suffering only a negligible loss in efficacy, the researchers said. Indeed, the most popular anti-malware solutions currently provide little or no support for scanning for ML-based threats, they said.

In the demo, researchers deployed a 64-bit sample of the Quantum ransomware on a Windows 10 system, but noted that any bespoke payload can be distributed in this way and tailored to target different operating systems, such as Windows, Linux, and Mac, as well as other architectures, such as x86/64.

The Risk for the Enterprise

For an attacker to take advantage of ML models to target organizations, they first must obtain a copy of the model they want to hijack, which, in the case of publicly available models, is as simple as downloading it from a website or extracting it from an application using it. 

"In one of the possible scenarios, an attacker could gain access to a public model repository (such as Hugging Face or TensorFlow Hub) and replace a legitimate benign model with its Trojanized version that will execute the embedded ransomware," Janus explains. "For as long as the breach remains undetected, everyone who downloads the trojanized model and loads it on a local machine will get ransomed."

An attacker could also use this method to hijack a service provider’s supply chain and distribute a Trojanized model to all service subscribers, she adds. "The hijacked model could provide a foothold for further lateral movement and enable the adversaries to exfiltrate sensitive data or deploy further malware," Janus says.

The business implications for an enterprise vary, but can be severe, the researchers said. They range from initial compromise of a network and subsequent lateral movement to deployment of ransomware, spyware, or other types of malware. Attackers can steal data and intellectual property, launch denial-of-service attacks, or even, as mentioned, compromise an entire supply chain.

Mitigations and Recommendations

The research is a warning for any organization using pretrained ML models downloaded from the Internet or provided by a third party to treat them "just like any untrusted software," Janus says. 

Such models should be scanned for malicious code (although few products currently offer this feature) and thoroughly evaluated in a secure environment before being executed on a physical machine or put into production, she says.
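As a rough idea of what such scanning could involve, the sketch below uses Python's standard pickletools module to list opcodes that import objects or call functions, without ever loading the data. The SUSPICIOUS set, the function name suspicious_opcodes, and the Demo class are assumptions for illustration. It also assumes the raw pickle stream has already been extracted from the model file, and it is no substitute for a real scanner.

```python
import pickle
import pickletools

# Opcodes that import a global or invoke a callable during unpickling.
# Assumption: this set is illustrative, not an exhaustive policy.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
              "NEWOBJ", "NEWOBJ_EX", "BUILD"}

def suspicious_opcodes(data):
    """Return names of risky opcodes found in a pickle stream.

    pickletools.genops only parses the stream; nothing is executed.
    """
    return {op.name for op, _arg, _pos in pickletools.genops(data)
            if op.name in SUSPICIOUS}

class Demo:
    """Hypothetical object that makes pickle call a function on load."""
    def __reduce__(self):
        return (sorted, ([3, 1, 2],))  # harmless stand-in callable

plain = pickle.dumps({"weights": [0.1, 0.2]})  # data only
tampered = pickle.dumps(Demo())                # contains a function call

plain_findings = suspicious_opcodes(plain)
tampered_findings = suspicious_opcodes(tampered)
```

Legitimate pickle-based checkpoints usually contain some of these opcodes too, since they import helper functions to rebuild their objects, so a practical tool would more likely compare the imported names against an allowlist than reject every hit.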

Moreover, anyone who produces machine learning models should use secure storage formats — for example, formats that don’t allow for code execution — and cryptographically sign all their models so they cannot be tampered with without breaking the signature. 
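Full cryptographic signing relies on asymmetric keys and is beyond a short example, but the integrity check beneath it can be sketched with the standard library alone. The snippet below is a simplified stand-in, not a signature scheme: it assumes the publisher shares a SHA-256 digest of the model through a trusted channel, and the names sha256_hex and verify_model are hypothetical.

```python
import hashlib
import hmac

def sha256_hex(data):
    """Hex-encoded SHA-256 digest of the model file's bytes."""
    return hashlib.sha256(data).hexdigest()

def verify_model(data, expected_hex):
    """True only if the bytes match the digest published for the model.

    Assumption: expected_hex arrived through a channel the attacker
    cannot alter. A real signature removes that requirement.
    """
    return hmac.compare_digest(sha256_hex(data), expected_hex)

# Placeholder bytes standing in for a model file.
original = b"placeholder model bytes"
pinned_digest = sha256_hex(original)

# Changing even a single byte alters the digest.
tampered = original[:-1] + b"\x00"

ok_original = verify_model(original, pinned_digest)
ok_tampered = verify_model(tampered, pinned_digest)
```

A consumer would run a check like this before handing the file to any loader, so that a swapped or modified model is rejected before its contents are ever deserialized.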

"Cryptographic signing can assure model integrity in the same way as it does for software," Janus says.

Overall, the researchers said that adopting a security posture built on understanding risk, addressing blind spots, and identifying areas for improvement across any ML models deployed in an enterprise can also help mitigate attacks from this vector.