Data Privacy in the Age of GenAI

Consumer data is still a prime target for threat actors, and organizational consumption of data must be aligned with protecting it. The proposed privacy rights act seeks to do some of this, but it still needs tweaking.

Nathan Vega, Vice President, Product Marketing & Strategy, Protegrity

May 31, 2024



The American Privacy Rights Act of 2024 (APRA) is the most comprehensive national privacy legislation yet proposed for Americans, a scope that has historically made federal approval difficult. We're looking at legislation that holds organizations accountable at a level we've not yet seen. Under APRA, covered companies will need: 

  • Annual CEO-signed certification of compliance

  • Mandated reporting lines for privacy and security officers (You can't have a figurehead chief privacy officer with no reports or budget.) 

  • To conduct biennial audits and Privacy Impact Assessments (PIAs) 

  • To publish the privacy policies for the past 10 years and deliver annual reports on consumer requests related to privacy 

There's a reason why the United States has not passed a comprehensive data privacy law in recent history: Companies largely monetize consumer data. Data is profitable, and restricting that cash flow would have economic ripple effects. However well-intentioned, APRA does warrant scrutiny. Notably, its Civil Rights and Algorithms section gives too little attention to transparency and ethics. 

The dynamic between "covered entity" and "service provider" is detailed in a way that places responsibility on entities like retailers to have well-defined processes, programs, and procedures to maintain compliance. The onus is not placed on service providers like white-label loyalty programs. This presents a challenge we've all experienced: Try to delete an embarrassing picture that's been posted to a third-party platform. It always seems to pop back up. 

Another example: APRA requires annual algorithm impact assessments if there is a "consequential risk of harm to defined groups or outcomes." The way to measure the impact and to define the consequential risk of harm is not well defined. If a member of a protected class is denied a loan from a provider using an algorithm, and they lose their car because they needed the loan, is that consequential harm? What if they were denied by one provider but got the loan from another provider? Could the first provider be liable for bias or disparate impact? 
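One rough, widely used yardstick for disparate impact (not something APRA itself defines) is the "four-fifths rule" from US employment law: flag concern when a protected group's selection rate falls below 80% of the reference group's. A minimal sketch, with purely illustrative numbers:

```python
# Sketch of the "four-fifths rule" for disparate impact: compare the
# approval rate of a protected group against that of a reference group.

def selection_rate(approved: int, applicants: int) -> float:
    """Fraction of applicants who were approved."""
    return approved / applicants

def disparate_impact_ratio(protected_rate: float, reference_rate: float) -> float:
    """Ratio of the protected group's selection rate to the reference group's."""
    return protected_rate / reference_rate

# Illustrative numbers: 40 of 100 protected-class applicants approved,
# versus 60 of 100 reference-group applicants.
ratio = disparate_impact_ratio(selection_rate(40, 100), selection_rate(60, 100))
flagged = ratio < 0.8  # below four-fifths suggests possible adverse impact
```

A simple ratio like this cannot prove or disprove bias on its own, which is exactly why "consequential risk of harm" needs a sharper legal definition.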

Undoubtedly, the United States needs widespread and comprehensive data privacy regulation. Every consumer is having their personally identifiable information (PII) and online activities gathered up by organizations such as social media giants — even if the consumer doesn't have an account. Those companies are not obligated to define user activity or data collected this way as sensitive data, or to notify individuals that their data is being collected. Their argument is that they don't actually sell your data. Instead, they sell access to data about you to third parties for targeting. 

There are more gray areas than answers here, especially since today's technology and enforcement mechanisms may struggle to keep pace with APRA's requirements. 

Too Much Trust in GenAI?

The proliferation of proprietary generative artificial intelligence (GenAI) models like ChatGPT opens a new can of worms when considering the data these models are built upon and how it impacts responses. 

We've seen some of the brightest in society fall victim to the false perception that GenAI will produce correct, evidence-based responses. In June 2023, lawyers in New York were sanctioned for submitting a brief drafted with ChatGPT after it emerged that the tool had fabricated the cases it cited. 

Adding to the complexity for businesses and individuals is the difficulty of detecting bias. Different models are likely to produce different results, which raises the transparency and ethics questions APRA leaves open: How can service providers ensure their GenAI models deliver fair and equitable results across customers?

For instance, if a customer walks into a physical bank looking for a loan but leaves having been denied, there are likely clear policies set in place that are communicated along with the rejection. 

In this scenario, there's a clear, policy-backed resolution, but if you replace the banker with a GenAI model, you can't see the data or policy upon which it's built. How do you know the GenAI wasn't built on faulty or biased data? 

To achieve the results APRA ultimately wants, we need a clearly defined and established policy that GenAI can then ingest and interpret equitably. We are likely many innovation cycles away from that reality, which is why human operators must own these policies and ensure the models comply with them. 
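What such a clearly defined policy might look like in practice is an open question. As a hypothetical sketch, a lending policy could be encoded as explicit, human-owned rules that every automated decision is audited against; all field names and thresholds below are invented for illustration:

```python
# Sketch: a machine-readable lending policy maintained by a human owner,
# against which a model's decisions can be audited. Unlike an opaque GenAI
# decision, every denial here comes with explicit, policy-backed reasons.

from dataclasses import dataclass

@dataclass
class Application:
    credit_score: int
    debt_to_income: float  # monthly debt obligations / monthly income

POLICY = {
    "min_credit_score": 620,   # illustrative threshold
    "max_debt_to_income": 0.43,
}

def policy_decision(app: Application) -> tuple[bool, list[str]]:
    """Return (approved, reasons); an empty reasons list means approval."""
    reasons = []
    if app.credit_score < POLICY["min_credit_score"]:
        reasons.append(f"credit score {app.credit_score} below minimum")
    if app.debt_to_income > POLICY["max_debt_to_income"]:
        reasons.append(f"debt-to-income {app.debt_to_income:.2f} above maximum")
    return (not reasons, reasons)
```

A GenAI assistant could explain or summarize such rules, but the rules themselves, and accountability for them, stay with people.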

A Problem Without a Solution — Yet 

While APRA is a great start to laying down the foundation for GenAI use, we're still far from artificial general intelligence, a technology that could function appropriately without human oversight and intervention. We still need human operators to use AI effectively, and we need to treat these tools as a complementary extension of what humans are already doing rather than a replacement. 

Some companies are still adopting a fast and furious approach to integrating AI into their processes, but to get anywhere they still need to place high-value customer data into these GenAI models. After all, high-value data gets high-value results with these tools. The challenge comes in securing sensitive data while leveraging its organizational benefits. 
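One common pattern for that challenge (sketched generically here, not as any vendor's API) is to tokenize sensitive values before they ever reach a GenAI model, so the model operates on opaque placeholders while the real data stays protected. The in-memory dictionaries below stand in for what would be a hardened vault in production:

```python
import secrets

# Sketch of tokenization applied before data reaches a GenAI model.
# A real deployment would use a hardened vault or HSM; these in-memory
# dictionaries are purely illustrative.

_vault: dict = {}    # token -> original value
_reverse: dict = {}  # original value -> token (keeps tokens stable)

def tokenize(value: str) -> str:
    """Replace a sensitive value with an opaque, reversible token."""
    if value in _reverse:
        return _reverse[value]
    token = "tok_" + secrets.token_hex(8)
    _vault[token] = value
    _reverse[value] = token
    return token

def detokenize(token: str) -> str:
    """Recover the original value; only trusted code paths should call this."""
    return _vault[token]

prompt = f"Summarize activity for customer {tokenize('jane.doe@example.com')}"
# The model only ever sees the token, never the raw email address.
```

Because the same value always maps to the same token, analytics on tokenized data can still find high-value patterns without exposing the underlying PII.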

Consumer personal data is still a prime target for threat actors, and organizational consumption of data must be aligned with protecting it from unauthorized access. APRA seeks to do some of that but may still need tweaking to ensure comprehensive coverage for Americans.

About the Author(s)

Nathan Vega

Vice President, Product Marketing & Strategy, Protegrity

Nathan Vega has spent his career defining, building, and delivering cybersecurity products to market. He is passionate about collaboration that builds and engages communities of practice inside and outside of InfoSec. Vega brings deep experience and expertise in data security and analytics, regularly providing thought leadership on data privacy, precision data protection, data sovereignty, compliance, and other critical industry issues. Before Protegrity, Vega worked at IBM, where he brought Watson to market as a tool set of Cloud APIs. He holds a bachelor of science in computer science and an MBA.

