Attackers could exploit a common AI experience, false recommendations, to spread malicious code via developers who use ChatGPT to create software.


Attackers can exploit ChatGPT's penchant for returning false information to spread malicious code packages, researchers have found. This poses a significant risk for the software supply chain, as it can allow malicious code and Trojans to slide into legitimate applications and code repositories such as npm, PyPI, and GitHub.

By leveraging so-called "AI package hallucinations," threat actors can create ChatGPT-recommended, yet malicious, code packages that a developer could inadvertently download when using the chatbot, building them into software that then is used widely, researchers from Vulcan Cyber's Voyager18 research team revealed in a blog post published today. 

In artificial intelligence, a hallucination is a plausible response by the AI that's insufficient, biased, or flat-out not true. Hallucinations arise because ChatGPT (and other large language models, or LLMs, that are the basis for generative AI platforms) answers questions posed to it based on the sources, links, blogs, and statistics available in the vast expanse of the Internet, which are not always the most solid training data.

Due to this extensive training and exposure to vast amounts of textual data, LLMs like ChatGPT can generate "plausible but fictional information, extrapolating beyond their training and potentially producing responses that seem plausible but are not necessarily accurate," lead researcher Bar Lanyado of Voyager18 wrote in the blog post, also telling Dark Reading, "it's a phenomenon that's been observed before and seems to be a result of the way large language models work."

He explained in the post that in the developer world, AIs will also generate questionable fixes to CVEs and offer links to coding libraries that don't exist, and the latter presents an opportunity for exploitation. In that attack scenario, attackers might ask ChatGPT for coding help with common tasks, and ChatGPT might recommend an unpublished or non-existent package. Attackers can then publish their own malicious version of the suggested package, the researchers said, and wait for ChatGPT to give legitimate developers the same recommendation.

How to Exploit an AI Hallucination

To prove their concept, the researchers created a scenario using ChatGPT 3.5 in which an attacker asked the platform a question to solve a coding problem and ChatGPT responded with multiple packages, some of which did not exist; that is, they were not published in a legitimate package repository.

"When the attacker finds a recommendation for an unpublished package, they can publish their own malicious package in its place," the researchers wrote. "The next time a user asks a similar question they may receive a recommendation from ChatGPT to use the now-existing malicious package."
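Seen from the defender's side, the first step of that scenario amounts to checking which recommended names are not actually published. The following is a minimal sketch of such a check, not the researchers' tooling. It assumes the packages are Python packages and uses PyPI's public JSON endpoint (`https://pypi.org/pypi/<name>/json`), which returns a 404 for unknown projects. The function names `exists_on_pypi` and `find_unpublished` are hypothetical.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List


def exists_on_pypi(name: str) -> bool:
    """Return True if PyPI's JSON endpoint knows a project with this name."""
    url = f"https://pypi.org/pypi/{urllib.parse.quote(name)}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # no such project is published
        raise  # any other HTTP error is left for the caller to handle


def find_unpublished(
    names: Iterable[str],
    exists: Callable[[str], bool] = exists_on_pypi,
) -> List[str]:
    """Return the recommended names that the registry lookup cannot find.

    `exists` is injectable so the check can run against another registry
    or an offline list; that parameter is an assumption of this sketch.
    """
    return [name for name in names if not exists(name)]
```

A name that this check reports as unpublished is a candidate for the attack described above. The reverse does not hold: a name that does exist may be exactly the package an attacker registered after seeing the same recommendation, so existence alone says nothing about safety.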

If ChatGPT is fabricating code packages, attackers can use these hallucinations to spread malicious ones without resorting to familiar techniques like typosquatting or masquerading. Instead, they create a "real" package that a developer might use if ChatGPT recommends it, the researchers said. In this way, malicious code can find its way into a legitimate application or a legitimate code repository, creating a major risk for the software supply chain.

"A developer who asks a generative AI like ChatGPT for help with their code could wind up installing a malicious library because the AI thought it was real and an attacker made it real," Lanyado says. "A clever attacker might even make a working library, as kind of a Trojan, which could wind up being used by multiple people before they realized it was malicious."

How to Spot Bad Code Libraries

It can be difficult to tell if a package is malicious if a threat actor effectively obfuscates their work, or uses additional techniques such as making a Trojan package that is actually functional, the researchers noted. However, there are ways to catch bad code before it gets baked into an application or published to a code repository.

To do this, developers need to validate the libraries they download and make sure they not only do what they say they do, but also "are not a clever Trojan masquerading as a legitimate package," Lanyado says.

"It's especially important when the recommendation comes from an AI rather than a colleague or people they trust in the community," he says.

There are many ways a developer can do this, such as checking the library's creation date, its number of downloads, and whether it has any comments or stars, and looking at any notes attached to the library, the researchers said. "If anything looks suspicious, think twice before you install it," Lanyado recommended in the post.
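Those checks can be expressed as a simple checklist over a package's metadata. The sketch below is illustrative only: the `PackageInfo` fields, the `warning_signs` function, and every threshold are hypothetical choices made for this example, not values given by the researchers, and the metadata is assumed to have been gathered by hand or from the registry beforehand.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class PackageInfo:
    """Metadata a developer might collect before installing a package."""
    name: str
    first_published: date
    downloads: int
    stars: int
    has_notes: bool  # README, description, or other attached notes


def warning_signs(
    pkg: PackageInfo,
    today: date,
    min_age_days: int = 90,      # hypothetical threshold
    min_downloads: int = 1000,   # hypothetical threshold
    min_stars: int = 10,         # hypothetical threshold
) -> List[str]:
    """List the reasons to look more closely at a package before installing."""
    signs = []
    age_days = (today - pkg.first_published).days
    if age_days < min_age_days:
        signs.append(f"first published only {age_days} days ago")
    if pkg.downloads < min_downloads:
        signs.append(f"only {pkg.downloads} downloads")
    if pkg.stars < min_stars:
        signs.append(f"only {pkg.stars} stars")
    if not pkg.has_notes:
        signs.append("no README or attached notes")
    return signs
```

An empty list is not proof of safety, since a patient attacker can inflate these numbers or ship a working Trojan. A non-empty list is simply a prompt to think twice before installing.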

ChatGPT: Risks and Rewards

This attack scenario is just the latest in a line of security risks that ChatGPT can present. The technology has caught on quickly since its release in November 2022, not only with users but also with threat actors keen to leverage it for cyberattacks and malicious campaigns.

In the first half of 2023 alone, there have been scammers mimicking ChatGPT to steal user business credentials; attackers stealing Google Chrome cookies through malicious ChatGPT extensions; and phishing threat actors using ChatGPT as a lure for malicious websites.

While some experts think the security risk of ChatGPT is potentially being overhyped, it certainly exists because of how quickly people have embraced generative AI platforms to support their professional activity and ease the burdens of day-to-day workloads, the researchers said.

"Unless you [are] living under a rock, you'll be well aware of the generative AI craze," with millions of people embracing ChatGPT at work, Lanyado wrote in the post.

Developers, too, are not immune to the charms of ChatGPT, turning away from online sources such as Stack Overflow and toward the AI platform for coding solutions, "creating a major opportunity for attackers," he wrote.

And as history has demonstrated, any new technology that quickly attracts a solid user base just as quickly draws bad actors aiming to exploit it for their own gain, with ChatGPT providing a real-time example of this scenario.

About the Author(s)

Elizabeth Montalbano, Contributing Writer

Elizabeth Montalbano is a freelance writer, journalist, and therapeutic writing mentor with more than 25 years of professional experience. Her areas of expertise include technology, business, and culture. Elizabeth previously lived and worked as a full-time journalist in Phoenix, San Francisco, and New York City; she currently resides in a village on the southwest coast of Portugal. In her free time, she enjoys surfing, hiking with her dogs, traveling, playing music, yoga, and cooking.
