Expert Insight: Dangers of Using Large Language Models Before They Are Baked

Today's LLMs pose too many trust and security risks.

Gary McGraw Ph.D., Co-Founder, Berryville Institute of Machine Learning

April 20, 2023

The word "ChatGPT" over a picture of a person's hands on a keyboard

Source: Skorzewiak via Alamy Stock Photo

The Berryville Institute of Machine Learning (BIML) recently attended Calypso AI’s Accelerate AI 2023 conference in Washington, DC. The meeting included practitioners from both government and industry, regulators, and academics. I participated in a very timely panel entitled "Emerging Risks and Opportunities of LLMs in the NATSEC and Beyond." That panel spurred this article.

None of the content in this column was created by an LLM.

Large language models (LLMs) are a kind of machine learning system that has taken the world by storm. ChatGPT (aka GPT-3.5), an LLM produced and fielded by OpenAI, is one of the most pervasive and popular of the many generative AI models for text. Other generative models include the image generators DALL-E, Midjourney, and Stable Diffusion, and the code generator Copilot.

LLMs and other generative tools present incredible opportunities for applications, and the rush to field such systems is in full gallop. LLMs have the potential to be lots of fun, solve hard problems in science, help with the thorny problems of knowledge management, create interesting content, and most importantly in my view, move the world a little closer to artificial general intelligence (AGI) and a theory of representation that produces massive cognitive science breakthroughs.

But using today's nascent LLMs — and any generative AI tools — comes with real risks.

Of Parrots and Pollution

LLMs are trained on massive corpuses of information (300 billion words, mostly scraped from the Internet) and use auto-association to predict the next word in a sentence. As such, most scientists agree that LLMs do not really "understand" anything or do any real thinking. Rather, they are "stochastic parrots," as some researchers call them. LLMs still do generate impressive streams of text, in context, and in an often useful and deeply surprising manner.
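To make the "predict the next word" idea concrete, here is a toy sketch. It is a simple bigram counter, not how a production LLM works (real systems use large neural networks over tokens), and the function names and sample corpus are made up for illustration. It only shows the parrot-like principle: output is sampled from patterns seen in the training text.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def next_word(counts, word, rng):
    """Sample a next word in proportion to how often it followed `word`."""
    followers = counts.get(word)
    if not followers:
        return None  # the model never saw anything after this word
    choices = list(followers.keys())
    weights = list(followers.values())
    return rng.choices(choices, weights=weights, k=1)[0]

def generate(counts, start, length, rng):
    """Chain next-word predictions into a stream of text."""
    out = [start]
    for _ in range(length - 1):
        word = next_word(counts, out[-1], rng)
        if word is None:
            break
        out.append(word)
    return " ".join(out)

# Hypothetical miniature "training corpus"
corpus = "the parrot repeats the phrase and the parrot repeats the noise"
model = train_bigram(corpus)
print(generate(model, "the", 8, random.Random(0)))
```

Every word pair in the output already appeared in the corpus; the toy model can recombine what it saw, but it has no notion of what a parrot is.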

Feedback loops are an important class of problem in all kinds of ML models (and one that BIML introduced several years ago as “[raw:8:looping]”). Here is what we said at the time:

Model confounded by subtle feedback loops. If data output from the model are later used as input back into the same model, what happens? Note that this is rumored to have happened to Google translate in the early days when translations of pages made by the machine were used to train the machine itself. Hilarity ensued. To this day, Google restricts some translated search results through its own policies.

Imagine what will happen when a majority of what can be easily scraped from the Internet is generated by AI models of dubious quality, and then these LLMs start to eat their own tails. Talk about information pollution.

Some AI researchers and information security professionals are already deeply worried about information pollution today, and a pipeline that spews such pollution without end sounds even worse.
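The feedback loop described above can be sketched with a toy simulation. This is an illustration under loud assumptions, not a model of any real LLM: "training" is just counting word frequencies, "publishing" is sampling a new corpus of the same size from those counts, and the starting corpus is invented. Each generation trains only on the previous generation's output.

```python
import random
from collections import Counter

def retrain_on_own_output(corpus, generations, rng):
    """Repeatedly 'train' on a corpus and replace it with sampled output.

    Returns the number of distinct words in the corpus at the start and
    after each generation.
    """
    distinct = [len(set(corpus))]
    for _ in range(generations):
        counts = Counter(corpus)              # "training": count frequencies
        words = list(counts.keys())
        weights = list(counts.values())
        # "publishing": the next corpus is made entirely of model output
        corpus = rng.choices(words, weights=weights, k=len(corpus))
        distinct.append(len(set(corpus)))
    return distinct

rng = random.Random(42)
# Hypothetical human-written corpus: a few common words and many rare ones
human_corpus = ["the"] * 50 + ["model"] * 20 + [f"rare{i}" for i in range(30)]
print(retrain_on_own_output(human_corpus, 10, rng))
```

Because a word that drops out of one generation can never reappear in the next, the count of distinct words can only stay flat or shrink. Rare material is what gets lost first, which is one way to picture a model eating its own tail.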

Mansplaining-as-a-Service

LLMs are very good B.S. artists. These models regularly express incorrect opinions and alternative facts with great confidence, and often make up information to justify their answers in a conversation (including pretend scientific references). Imagine what happens when average knowledge as scraped from the Internet — which includes lots of wrong things — is used to create new content.

The ultimate "reply guy" is not who we need writing news stories for us in the future. Facts actually do matter.

Mirroring Broken Society

LLMs are often trained on data that is biased in many ways. Sexist, racist, xenophobic, and misogynistic systems can and will be modeled from data sets originally collected when society was less progressive. Bias is a serious issue in LLMs, and one that operational Band-Aids are unlikely to fix.

Trusting Deepfakes

Deepfakes are a side effect of generative AI that has been discussed for some time. Fakes or spoofing are always a risk in security, but the capacity to make higher-quality fakes that are easy to believe is here. There are some obvious worst-case scenarios: moving markets, causing wars, inflaming cultural divides, and so on. Careful where that video came from.

Automating Everything

Generative models like LLMs have the capacity to replace lots of jobs, including low-level white collar jobs. Millions of people get plenty of satisfaction out of their jobs, but what if all of those jobs are willy-nilly replaced by ML systems that are cheaper to operate? We already see this with some writing content-creation jobs in local sports stories and marketing copy, for example. Should LLMs be writing our articles? Should they be practicing law? How about doing medical diagnosis?

Using a Tool for Evil

Generative AI can create some fun stuff. If you haven’t played with ChatGPT, you should give it a shot. I haven’t had this much fun with new technology since I got my first Apple ][+ in 1981, or since Java applets came out in 1995.

But the power of generative AI can be used for evil as well. For example, what if we put ML to the task of designing a novel virus with the propagation capability of COVID and the delayed onset parameters of rabies? Or how about the task of creating a plant with pollen that is also a neurotoxin? If I can think up these dastardly things in my kitchen, what happens when a real evil person starts brainstorming with ML?

Tools have no intrinsic morality or ethics. And for the record, the idea of "protecting prompts" is a pretend solution akin to protecting supercomputer access with command-line filtering.
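A small sketch shows why filtering of this kind is brittle. The blocklist and function below are hypothetical and far cruder than any fielded prompt guard, so treat this as an illustration of the general weakness of input filtering rather than a description of a real product.

```python
# Hypothetical blocklist; a real deployment would be longer but has the same flaw
BLOCKED_TERMS = {"virus", "neurotoxin"}

def naive_prompt_filter(prompt):
    """Return True if the prompt passes a simple keyword blocklist."""
    words = prompt.lower().split()
    return not any(word.strip(".,?!") in BLOCKED_TERMS for word in words)

print(naive_prompt_filter("Design a novel virus."))      # False: blocked
print(naive_prompt_filter("Design a novel v-i-r-u-s."))  # True: slips through
```

The filter checks the surface form of the input, while the model responds to its meaning. Any rephrasing that the model still understands but the filter does not anticipate gets through, which is the same weakness as command-line filtering.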

Holding Information in a Broken Cup

Data moats are a thing now. That’s because if an ML model (with the help of humans) can get to your data and learn how to profitably parrot what you do by training on your data, it will. This leads to multiple risks best solved by carefully protecting your data sets instead of cramming them into an ML model and fielding the model to the general public. In the gold rush world of ML, the gold is data.

Protecting data is harder than it sounds. Think how the government's current information classification system can't always protect classified information. Clearly, protecting sensitive and confidential information is already a huge challenge, so imagine what happens when we intentionally train up our ML systems on confidential information and set them out into the world.

Extraction and transfer attacks exist today, so be careful what you pour in your ML cup.

What Should We Do About LLM Risk?

So should we put some kind of magic moratorium on ML research for a few months as some technologists have self-servingly suggested? No, that's just silly. What we need to do is recognize these risks and face them directly.

The good news is that an entire industry is being built around securing ML systems, which I call machine learning security or MLsec. Check out what these new hot startups are building to control ML risk — but do so with a skeptical and well-informed eye.

About the Author(s)

Gary McGraw Ph.D.

Co-Founder, Berryville Institute of Machine Learning

Gary McGraw is co-founder of the Berryville Institute of Machine Learning where his work focuses on machine learning security. He is a globally recognized authority on software security and the author of eight best-selling books on this topic. His titles include Software Security, Exploiting Software, Building Secure Software, Java Security, Exploiting Online Games, and 6 other books; and he is editor of the Addison-Wesley Software Security series. Dr. McGraw has also written over 100 peer-reviewed scientific publications. Gary serves on the Advisory Boards of Calypso AI, Legit, Irius Risk, Maxmyinterest, Protopia AI, and Red Sift. He has also served as a Board member of Cigital and Codiscope (acquired by Synopsys) and as Advisor to CodeDX (acquired by Synopsys), Black Duck (acquired by Synopsys), Dasient (acquired by Twitter), Fortify Software (acquired by HP), and Invotas (acquired by FireEye). Gary produced the monthly Silver Bullet Security Podcast for IEEE Security & Privacy magazine for thirteen years. His dual PhD is in Cognitive Science and Computer Science from Indiana University where he serves on the Dean’s Advisory Council for the Luddy School of Informatics, Computing, and Engineering.
