Late last week the security industry learned that a trove of public data had been exposed on an unsecured server, compromising 4 terabytes of information, or about 1.2 billion records.
The leak was discovered by security researcher Vinny Troia and first reported by Wired. All of the data exposed was publicly available: Profiles of hundreds of millions of people included home and mobile phone numbers, related social media profiles (Facebook, Twitter, LinkedIn, GitHub), and work histories seemingly pulled from LinkedIn profiles. Troia found nearly 50 million unique phone numbers and 622 million unique email addresses, all easily accessible online.
Dave Farrow, senior director of information security at Barracuda Networks, was dismayed when he first heard mention of a security incident last week. A data breach would mean his team would have to mobilize, he says, and at the time he knew little information about the incident. Farrow initially guessed the news would be another exposed Elasticsearch cluster or Amazon S3 bucket, a fairly common occurrence he describes as "a prominent and easy mistake to make."
When he learned more about the data leak, Farrow was relieved. The employees and customers he's tasked with protecting "weren't any more exposed that day than they were the day before," he says. While data enrichment makes him "a little bit uneasy," he adds, there wasn't much of a change in security posture for anyone whose data was exposed in the leak.
A security incident that generates conversation typically involves a breach or exposure of comparatively more sensitive data. But this unprotected server didn't store personally identifiable information like Social Security numbers, nor did it contain passwords or payment card data. So why did the exposure of publicly accessible data have people talking?
The amount and type of information exposed, and the way it was organized, could give cybercriminals the tools they need to assume other identities or launch spear-phishing attacks. As Wade Woolwine, Rapid7's principal threat intelligence researcher, puts it: "Data in aggregate is always worth something to someone … large sums of data are worth their weight in gold."
You Are for Sale
The server seemed to contain four separate data sets. Three were labeled to indicate they held data from San Francisco-based People Data Labs, which claims to sell contact, resume, social, and demographic data for more than 1.5 billion people. Its website advertises more than 1 billion personal email addresses, 420 million LinkedIn URLs, and 1 billion Facebook URLs and IDs.
The fourth data set was labeled "OXY" and is believed to contain information from data broker Oxydata, Troia told Wired. Oxydata's website claims to sell information on more than 380 million business professionals, including contact info, social profiles, industry, and education.
Data enrichment is a legal but controversial practice. "The industry exists for the purpose of influencing people and giving you access to people you want to influence," says Farrow, who says he has heard both sides of the argument. On one hand, employees often use this data to ensure they're not sending mailers to or cold-calling the wrong people. They could get the same information themselves on Facebook or LinkedIn; data aggregators speed up the process.
At the same time, it "feels like an intrusion on our privacy," he says. Cybercriminals can use this leaked data to influence victims to their advantage. A leak like this gives attackers access to organized and meaningful information, as opposed to a broad data dump. It forces those affected to think twice about who they trust — about whether a message is legitimate or malicious.
Further, there is a difference between this data leak and other security breaches in which credit card numbers or passwords are stolen. "In those types of breaches, there is a clear call to action," Farrow says. "The people whose data is leaked actually need to go out and do something." Stolen passwords can be changed, and stolen credit cards replaced. When people give up so much personal data to tech platforms, there isn't much they can do about how it's used.
Responsibility to Protect Data
There are steps that can be taken to protect this data. The question is, whose job is it?
"Any company that's holding data that might even be remotely valuable is potentially at risk of having it stolen," Woolwine says. "It gets progressively worse as sensitivity of the data goes up."
There are certain businesses in which the customer's security posture has a direct impact on the organization, and this is certainly true for data aggregators, Farrow adds. Sean Thorne, co-founder of People Data Labs, told Wired that customers are responsible for securing data on their servers. While the company does free security audits and consultations, it isn't accountable.
Woolwine suggests it may be time for the government to get involved in helping the industry move forward by implementing penalties for failing to secure information. "I don't think that without some kind of authoritative oversight, the smaller players are going to get their act together to secure that data," he explains.
In the meantime, security practitioners should chat with their business colleagues about the best practices involved with data handling. Most business professionals know sensitive data must be encrypted, he says, but the process for handling less-critical information – like that exposed in this leak – should also be evaluated. A scenario like this could prove damaging to a company's reputation if customers learn their data is mishandled or left exposed on the Web.
"An event like this should be a wake-up call to everybody that handles sensitive data to make sure they coordinate with their security teams to ensure the controls most teams are tasked with putting in place are actually applied so an accident like this doesn't happen," Farrow says. "There are all good reasons to protect it, and no good reasons to expose it."