Amid concerns that employees could be entering sensitive information into the ChatGPT artificial intelligence model, a data privacy vendor has launched a redaction tool aimed at reducing companies' risk of inadvertently exposing customer and employee data.
Private AI's new PrivateGPT platform integrates with OpenAI's high-profile chatbot, automatically redacting 50+ types of personally identifiable information (PII) in real time as users enter ChatGPT prompts.
PrivateGPT sits in the middle of the chat process, stripping out everything from health data and credit-card information to contact data, dates of birth, and Social Security numbers from user prompts, before sending them through to ChatGPT. When ChatGPT responds, PrivateGPT re-populates the PII within the answer, to make the experience more seamless for users, according to a statement this week from PrivateGPT creator Private AI.
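The redact-then-restore flow described above can be illustrated with a minimal Python sketch. This is not Private AI's implementation: the two regex patterns, the placeholder format, and the `redact`, `restore`, `private_chat`, and `echo_model` names are all assumptions made for illustration, and a production tool would rely on far more robust detection across many more PII types.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical patterns for illustration only; real tools cover 50+ PII types
# and do not depend on simple regular expressions alone.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt: str) -> Tuple[str, Dict[str, str]]:
    """Replace each detected PII value with a placeholder and record the mapping."""
    mapping: Dict[str, str] = {}

    for label, pattern in PII_PATTERNS.items():
        def substitute(match: "re.Match[str]", label: str = label) -> str:
            value = match.group(0)
            # Reuse the same placeholder if this exact value was already seen.
            for placeholder, original in mapping.items():
                if original == value:
                    return placeholder
            placeholder = f"[{label}_{len(mapping) + 1}]"
            mapping[placeholder] = value
            return placeholder

        prompt = pattern.sub(substitute, prompt)

    return prompt, mapping


def restore(text: str, mapping: Dict[str, str]) -> str:
    """Put the original PII values back wherever a placeholder appears."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


def private_chat(prompt: str, send_to_model: Callable[[str], str]) -> str:
    """Redact the prompt, forward it to the model, then re-populate the reply."""
    redacted, mapping = redact(prompt)
    reply = send_to_model(redacted)  # only the redacted text leaves the process
    return restore(reply, mapping)


def echo_model(prompt: str) -> str:
    """Stand-in for the chatbot call (an assumption; no real API is used here)."""
    return f"Draft reply regarding: {prompt}"


if __name__ == "__main__":
    print(private_chat(
        "Email jane.doe@example.com about SSN 123-45-6789.", echo_model
    ))
```

In this sketch the placeholder mapping never leaves the local process, so the third-party model sees only tokens such as `[EMAIL_1]`, while the user still reads a reply containing the original values.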
"Generative AI will only have a space within our organizations and societies if the right tools exist to make it safe to use," said Patricia Thaine, co-founder and CEO of Private AI, in a statement. "By sharing personal information with third-party organizations, [companies] lose control over how that data is stored and used, putting themselves at serious risk of compliance violations."
Privacy Risks & ChatGPT
Every time a user enters data into a ChatGPT prompt, the information is ingested into the service's LLM data set and used to train the next generation of the algorithm. The concern is that the information could be retrieved at a later date if proper data security isn't in place for the service.
"The aspect of AI consuming all input as source material for others queries presents a black box of uncertainty as to exactly how and where a company's data would end up and completely upends the tight data security at the heart of most all companies today," warns Roy Akerman, co-founder and CEO at Rezonate.
This risk of data exposure is not theoretical: In March, OpenAI acknowledged a bug that exposed users' chat histories, after screenshots of private chats started showing up on Reddit.
OpenAI has warned users to be selective when using ChatGPT: "We are not able to delete specific prompts from your history. Please don't share any sensitive information in your conversations," OpenAI's user guide notes.
Yet employees are still learning how to handle privacy when it comes to ChatGPT, even as the service sees a dizzying rate of adoption (it reached the milestone of 100 million users in record time, just two months after launch).
In a recent report, data security service Cyberhaven said it detected and blocked attempts by 4.2% of the 1.6 million workers at its client companies to input sensitive data into ChatGPT, including confidential information, client data, source code, and regulated information.
As a concrete example of the phenomenon, earlier in the month it came to light that Samsung engineers had made three significant leaks to ChatGPT: buggy source code from a semiconductor database, code for identifying defects in certain Samsung equipment, and the minutes of an internal meeting.
"The wide adoption of AI language models is becoming widely accepted as a means of accelerating delivery of code creation and analysis," says Akerman. "Yet data leakage is most often a by-product of that speed, efficiency, and quality. Developers worldwide are anxious to use these technologies, yet guidance from engineering management has yet to be put in place on the do's and don'ts of AI usage to ensure data privacy is respected and maintained."