Dark Reading is part of the Informa Tech Division of Informa PLC

06:45 PM

Shifting Privacy Landscape, Disruptive Technologies Will Test Businesses

A new machine learning tool aims to mine privacy policies on behalf of users.

Aiming to correct the privacy imbalance between consumers and businesses, a group of academics released a tool that uses automation and machine learning to mine privacy policies and deliver easy-to-use options for a consumer to limit a company's use of data.

The browser plug-in, called Opt-Out Easy, is the brainchild of a group of researchers from Carnegie Mellon University, the University of Michigan, Stanford University, and Penn State University, and represents the latest shift in the status quo in data collection. The group has analyzed a large number of privacy policies with machine learning algorithms to identify the actionable choices users can take using those policies.

The goal of the tool is to allow consumers to easily apply their own privacy wishes to any website they visit, says Norman Sadeh, a CyLab researcher and professor in Carnegie Mellon’s School of Computer Science.

"Privacy regulations are a great step forward because you need to offer people choices," he says. "On the other hand, what good are those choices to anyone if engaging with these policies is too burdensome? Right now we don't see a lot of people making privacy decisions because they don't know they can."

The tool represents the latest potential disruption to the data economy that businesses may have to contend with this year.

In the past three years, new regulations, such as the European Union's General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), have come into force, driving ever-larger fines for data breaches and privacy violations. In addition, new technologies, such as the Solid project at the Massachusetts Institute of Technology, offer a different approach to data sharing that empowers individuals over businesses.

These changes are already being noted by privacy-focused companies, says Caitlin Fennessy, research director at the International Association of Privacy Professionals. 

"Data is valuable and so companies are still going to want to collect and use it, but if it is not providing value to the company, then it is creating big risk," she says. "With the increase in hacks and breaches ... as well as the increased focus of regulators on enforcing substantive privacy protections, companies are becoming a lot more strategic about how they approach data collection and retention."

In many ways, companies are being dragged into the future. 

The legal framework that allows businesses to collect a broad range of data with purported consumer approval, so-called "notice and consent," has largely failed to provide any meaningful privacy protection. Companies regularly drown out meaningful language in legalese deep inside privacy policies written at a grade level that very few people can, or ever will, read. An analysis of 150 privacy policies by The New York Times, for example, found the vast majority required a college-level reading ability and at least 15 minutes to read.

The university researchers aim to even the playing field. Using machine learning, the group built a model to recognize the choices provided by privacy policies, including opting out of data collection and sharing of data. The approach has been used to identify the opt-out links in nearly 7,000 privacy policies.
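To make the idea of finding opt-out links concrete, here is a minimal sketch in Python. It is not the researchers' model: it swaps the machine learning classifier for a simple keyword match on link text, and the phrase list, class name, and function name are all hypothetical choices made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical cue phrases. The researchers' model learns such signals from
# data; this fixed list is only an illustrative stand-in.
OPT_OUT_CUES = ("opt out", "opt-out", "unsubscribe", "do not sell")


class OptOutLinkFinder(HTMLParser):
    """Collect (href, link text) pairs whose link text contains a cue phrase."""

    def __init__(self):
        super().__init__()
        self.links = []     # matching (href, text) pairs, in document order
        self._href = None   # href of the <a> element currently open, if any
        self._text = []     # text fragments seen inside that <a> element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace so matching is layout-independent.
            text = " ".join("".join(self._text).split())
            if any(cue in text.lower() for cue in OPT_OUT_CUES):
                self.links.append((self._href, text))
            self._href = None


def find_opt_out_links(policy_html):
    """Return candidate opt-out links found in a privacy policy's HTML."""
    finder = OptOutLinkFinder()
    finder.feed(policy_html)
    finder.close()
    return finder.links
```

A keyword match like this misses choices that are described in the surrounding sentence rather than in the link text, and it flags links that merely mention opting out, which is the kind of gap a trained model is meant to close.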

The approach could even be used to allow consumers to specify their desired level of sharing and use the machine learning system to find the right settings to achieve that, says CMU's Sadeh. While the tool does not have that capability yet, finding ways to tailor privacy preferences may be preferable to a one-size-fits-all approach. 

"Privacy is ultimately about ethical principles," he says. "Those principles include transparency, they include fairness, but they also include agency. I should be able to take control of what happens to my data."

Fennessy sees the tool as a way to give users more control of privacy without requiring companies to take action, perhaps the best of both worlds.

However, she stresses that widespread adoption of the tool will require companies to better manage the privacy preferences of every user. While automated tools for data security and privacy compliance are available, many companies have not yet adopted them, she says. 

"The more opt-out requests that companies see, the more likely that they will need an automated solution," she says. "Companies who are looking to the future are saying that they need to automate."

She also notes that the automation extends down to whichever companies are being used to process data or transactions. Just as supply chain issues have become a significant consideration for security, third-party suppliers of data processing services are a significant privacy issue as well.

"If you are passing private data on to processors, you will have to work with them to correctly handle the data as well as process correction and deletion requests," she says. "As the volume of transactions increases, handling those different communications will require automation, especially for vendors that are handling a whole bunch of clients."

Veteran technology journalist of more than 20 years. Former research engineer. Written for more than two dozen publications, including CNET News.com, Dark Reading, MIT's Technology Review, Popular Science, and Wired News. Five awards for journalism, including Best Deadline ...
