

Speech Stirs Clickstream Controversy

Online privacy proponents express concerns over ISP licensing of end-user Web surfing data

It wasn't a big part of his presentation. But when David Cancel -- CTO of clickstream analysis service Compete -- mentioned last week at the Open Data 2007 conference in New York that his company licenses clickstream data from ISPs, he stirred up a swarm of bees that are still buzzing on the Web.

Privacy proponents this week are registering concerns that ISPs are selling "anonymized" user clickstream data -- concerns that were spurred partly by reports on Cancel's presentation.

"There is no way that this [data] is sufficiently anonymized," said blogger Adam Fields in one response. "It is readily obvious from reading my clickstream who I am. URLs for many online services contain usernames... All it takes is one of those usernames to be tied to a real name, and your entire clickstream becomes un-anonymized, irreversibly and forever."
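Fields' point about usernames in URLs can be illustrated with a short sketch. The record layout (an anonymous ID paired with a URL), the URL path pattern, and the sample data below are all hypothetical; nothing here reflects the actual format of any ISP's or vendor's data.

```python
import re
from collections import defaultdict

# Hypothetical pattern: many services embed the account name in a profile path.
USERNAME_IN_URL = re.compile(r"/(?:user|users|profile|people)/([A-Za-z0-9_.-]+)")

def usernames_by_record(clickstream):
    """Map each anonymous record ID to the usernames visible in its URLs."""
    found = defaultdict(set)
    for record_id, url in clickstream:
        match = USERNAME_IN_URL.search(url)
        if match:
            found[record_id].add(match.group(1))
    return dict(found)

# Invented sample: two "anonymous" subscribers, one of whom visits a profile page.
sample = [
    (48213, "http://photos.example.com/user/jdoe/albums"),
    (48213, "http://news.example.com/story/123"),
    (90177, "http://news.example.com/story/456"),
]
leaks = usernames_by_record(sample)
```

In this invented sample, a single profile URL is enough to attach the name "jdoe" to every page recorded under ID 48213, including the news story that carries no identifier of its own.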

The clickstream privacy buzz is further fueled by research following AOL's blunder last year, in which the ISP released "anonymized" search data from about 650,000 subscribers. Researchers found that it was easy to trace the search data back to individual subscribers, exposing personal information and embarrassing Web surfing habits. (See Users Outraged by AOL Gaffe.) An AOL spokesman yesterday said his company is not among the ISPs that sell anonymized clickstream data.

Clickstream analysis experts say this week's controversy is largely unwarranted. They point out that ISPs have been licensing clickstream data for years, making it available in a format that contains no usernames or personally identifiable information.

"We contractually require all our data partners to make sure they never send us PII," Cancel said in an interview today. "Each record is identified by a random integer ID. We do not want any IP addresses, user/agents, etc., transmitted to us." The ISPs are responsible for anonymizing the data before it's sent to Compete, which aggregates the data to show trends in user browsing habits on any Website.
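The kind of scrubbing Cancel describes might look roughly like the sketch below. It is an illustration only, not Compete's or any ISP's actual process: the field names (`subscriber`, `ip`, `user_agent`, `url`, `timestamp`) and the `Anonymizer` class are assumptions made for the example.

```python
import secrets

class Anonymizer:
    """Replace subscriber identity with a random integer and drop direct identifiers."""

    def __init__(self):
        self._ids = {}      # subscriber -> random integer ID (kept by the ISP only)
        self._used = set()  # IDs already handed out, to avoid collisions

    def _random_id(self, subscriber):
        if subscriber not in self._ids:
            new_id = secrets.randbelow(2**63)
            while new_id in self._used:
                new_id = secrets.randbelow(2**63)
            self._used.add(new_id)
            self._ids[subscriber] = new_id
        return self._ids[subscriber]

    def scrub(self, raw):
        # Only the random ID, the URL, and the timestamp leave the ISP;
        # the IP address and user agent are deliberately not copied over.
        return {
            "id": self._random_id(raw["subscriber"]),
            "url": raw["url"],
            "timestamp": raw["timestamp"],
        }

anonymizer = Anonymizer()
raw_record = {
    "subscriber": "account-0001",  # hypothetical ISP account key
    "ip": "203.0.113.77",
    "user_agent": "ExampleBrowser/1.0",
    "url": "http://news.example.com/story/123",
    "timestamp": "2007-03-20T14:05:00Z",
}
clean_record = anonymizer.scrub(raw_record)
```

The URL passes through this sketch untouched, so any username embedded in it would still reach the recipient. That is the gap Fields describes.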

Compete is not the only clickstream analysis vendor to use information from ISPs. Hitwise, a Compete competitor, collects much of its data via software that reports clickstream data from ISP customers who "opt in" to the analysis. A detailed audit report from PricewaterhouseCoopers on the Hitwise Website assures users that Hitwise's data contains no PII and is collected only from users who know they are being tracked.

Other clickstream analysis services, such as Alexa and Nielsen/NetRatings, get most of their data from toolbars and user contributors who must consciously add software to their PCs in order to be monitored. But such services' results may not be as accurate as results from Compete, which collects data from ISPs, application service providers, and panels of willing users, Cancel says.

Still, many consumers -- and their service providers -- are becoming more aware of the privacy issues surrounding Web browsing analysis. Google last week said in a blog post that it plans to revamp its information collection process, scrubbing personal information from cookies and removing some parts of IP addresses after the data has been stored for 18 to 24 months.
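As a rough illustration of what removing part of an IP address can mean, the sketch below zeroes the low-order bits of an IPv4 address. Which bits Google actually drops is not stated in its announcement as reported here; keeping the first 24 bits is an assumption made for the example, and `truncate_ipv4` is a hypothetical helper.

```python
import ipaddress

def truncate_ipv4(address, keep_bits=24):
    """Zero everything after the first keep_bits bits of an IPv4 address."""
    # strict=False lets ip_network accept an address with host bits set
    # and mask them off, leaving only the network portion.
    network = ipaddress.ip_network(f"{address}/{keep_bits}", strict=False)
    return str(network.network_address)

truncated = truncate_ipv4("203.0.113.77")
```

With 24 bits kept, the result still identifies a block of 256 addresses, so truncation narrows what a log entry reveals without making it fully anonymous.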

There are many tools on the Web for monitoring end-user behavior, and consumers should be aware of them, Cancel said. "This should include third-party cookies, too, whether they come from an ad network or the latest Web 2.0 JavaScript widget," he said. "As far as I know, there is no [end-user license agreement] or equivalent for this type of application, even though they may be collecting data on you."

— Tim Wilson, Site Editor, Dark Reading

  • Compete Inc.
  • Google (Nasdaq: GOOG)
  • Hitwise Pty. Ltd.
  • Nielsen/NetRatings

Tim Wilson is Editor in Chief and co-founder of Dark Reading.com, UBM Tech's online community for information security professionals. He is responsible for managing the site, assigning and editing content, and writing breaking news stories. Wilson has been recognized as one ...
