Risk
5/27/2011
12:36 PM

35 Million Google Profiles Captured In Database

A security researcher collected information from millions of Google Profiles and saved it in a SQL database in about a month.

Caveat poster: A security researcher has assembled a single database containing 35 million people's Google Profiles information, including Twitter feeds, real names, and email addresses, among other data points.

Google bills Profiles as a way to "decide what the world sees when it searches for you."

But Matthijs R. Koot, a privacy and anonymity researcher at the University of Amsterdam, found that because Google Profiles are designed to be indexed by search engines, he was able to easily save the available information into a SQL database. Doing so required about a month's effort "to retrieve the data, convert it to SQL using spidermonkey and some custom Javascript code, and import it into a database," he said in a blog post.

The resulting database contains whatever people have added to their own Google Profile, which potentially includes their real name, aliases, Twitter conversations, work experience and educational background, and links to Picasa photos. In addition, Koot said that about 15 million profiles also have a username, which is the same as a person's Gmail address. Interestingly, Koot said that he was able to assemble the data "without Google throttling, blocking, CAPTCHAing" or encountering any other form of security protection.
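To make the missing "throttling" concrete, here is a minimal sketch of a per-client token-bucket rate limiter, one common way a site can slow down bulk retrieval. It is an illustration only, not a description of anything Google does or did; the class names, the client identifier, and the rate and burst values are all hypothetical.

```python
class TokenBucket:
    """One client's allowance: refills at `rate` tokens/second, up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full burst allowance
        self.last = now

    def allow(self, now):
        # Refill in proportion to the time elapsed since the last request.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: the caller could block or show a CAPTCHA


class Throttle:
    """Keeps one bucket per client identifier (hypothetical: an IP address or API key)."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, client_id, now):
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[client_id] = bucket
        return bucket.allow(now)


# Assumed policy: 1 request per second, bursts of up to 3.
throttle = Throttle(rate=1.0, capacity=3)
burst = [throttle.allow("crawler", now=0.0) for _ in range(5)]
later = throttle.allow("crawler", now=2.0)
other = throttle.allow("visitor", now=0.0)
print(burst, later, other)
```

In this sketch, a client that sends five requests at once gets the first three through and the rest refused, while an unrelated client is unaffected. A crawler fetching tens of millions of pages would hit such a limit almost immediately unless it slowed down or spread its requests across many addresses.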

The potential threat, or nuisance, posed by Google Profiles has to do with social engineering attacks and marketing firm practices. Namely, savvy attackers would have access to extensive amounts of personal information, which they could use to help make phishing or targeted attacks appear more realistic. Likewise, marketing firms have more information available for targeting potential customers. This threat, challenge, or--depending on your perspective--business opportunity isn't new. What is new, however, is the sheer amount of personal information that's easily available in one go.

According to a recent, global study, Internet users typically have an online expectation of privacy. But as Koot's project demonstrates, the reality can be different. Notably, third-party advertisers and affiliates can collect extensive amounts of personal information.

Koot said as much when explaining his rationale for this project. "My activities are directed at inciting, or poking up, debate about privacy--not to create distrust but to achieve realistic trust--and the meaning of 'informed consent.' Which, when signing up for online services like Google Profile, amounts to checking a box." Research such as Koot's is also valuable because it illustrates not just what's possible, but what--from a marketing, advertising, or social engineering perspective--has probably already been done.

Koot's work recalls a similar project conducted in July 2010 by Ron Bowes, a security researcher and developer at Tenable Network Security, only with Facebook. Notably, thanks to Facebook's directory, Bowes was able to build a script that harvested 171 million Facebook usernames, 100 million of which were unique, as well as the URL for each profile. (Gathering more names may also have been possible, with tweaks for non-Romance-language alphabets.) Bowes published the information he'd gathered as a torrent file.

"This is a scary privacy issue," he said in a blog post at the time. "I can find the name of pretty much every person on Facebook. Facebook helpfully informs you that '[a]nyone can opt out of appearing here by changing their search privacy settings'--but that doesn't help much anymore considering I already have them all (and you will too, when you download the torrent)."
