Data Leak Week: Billions of Sensitive Files Exposed Online

A total of 2.7 billion email addresses, 1 billion email account passwords, and nearly 800,000 applications for copies of birth certificates were found on unsecured cloud buckets.

Revelations this week of separate data exposure incidents — a billion passwords displayed in plaintext as well as hundreds of thousands of US birth certificate applications — shared a common thread: unsecured cloud-based databases that left the sensitive information wide open for anyone to access online.

An epidemic in the past year or so of organizations inadvertently leaving their Amazon Web Services S3 storage buckets and ElasticSearch databases exposed and without proper security has added a new dimension to data breaches. Organizations literally aren't locking down their cloud servers, researchers are finding them en masse, and it's likely cybercriminals and nation-state actors are as well. Misconfigured online storage has led to a 50% increase in exposed files this year over 2018, according to data from Digital Shadows published in May.

"Cloud services are inexpensive ways to do things we've done expensively for years, so it makes sense why so many people are moving their resources to the cloud. The problem is that it's still far too easy to make mistakes that expose all your data to the Internet," says John Bambanek, vice president of security research and intelligence at ThreatStop.

Security researcher Bob Diachenko last week discovered a massive ElasticSearch database of more than 2.7 billion email addresses, 1 billion of which included passwords in plaintext. Most of the stolen email domains were from Internet providers in China, such as Tencent, Sina, Sohu, and NetEase, although there were some Yahoo, Gmail, and Russian email domains as well. The pilfered emails that came with the passwords were confirmed to be part of a previous massive breach from 2017, when a Dark Web vendor had them for sale.

The ElasticSearch server was hosted at a US-based colocation service, which on Dec. 9 took down the server after Diachenko reported it. It had sat wide open and searchable, with no password protection, for at least one week.

"In terms of numbers, this is perhaps the biggest thing I've seen" in exposed records, says Diachenko, cyber threat intelligence director at, who has unearthed multiple data exposures since 2018, including a database of 275 million personal records of Indian citizens this past May. "What's interesting about [this latest] particular exposure is that it was stored in a public cluster, and it seemed the data has been [updating] in real time."

Diachenko says he wasn't able to verify each email as valid and active, but he did cross-reference some with previously reported breaches he had found. He says it's unlikely many of the victims are aware of the breach. "The chances are high these email accounts are still vulnerable," he says, because users in that region often are not alerted to a breach and services to check email compromises can be blocked by China's Great Firewall. He teamed up with Comparitech to study the exposed data.

It's unclear just who was behind the database (cybercriminals or even security researchers), but either way, the configuration oversight was a blatant security misstep. ElasticSearch offers security options, Diachenko notes, but this case is yet another example of how many organizations ignore or overlook securing cloud storage.
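For teams that want to confirm their own cluster is not in the same state, the check can be as simple as sending one request with no credentials and seeing how the server answers. The sketch below is a minimal illustration, not a description of how Diachenko found this server: the function names are hypothetical, and it assumes a cluster with authentication enabled rejects anonymous requests with a 401 or 403 status, while an unprotected one answers with a 2xx status.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Interpret the HTTP status returned to a request sent without credentials."""
    if status in (401, 403):
        return "auth required"
    if 200 <= status < 300:
        return "open"
    return "inconclusive"


def probe(url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated GET and classify the response.

    Only run this against a server you own or are authorized to test.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return "unreachable"
```

Calling probe() with the cluster's HTTP address (a placeholder such as "http://localhost:9200/" for a local test node) returns one of the four labels. A result of "open" means anyone who can reach that address can read from it.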

One clue he found: The owners of the database had stored MD5, SHA1, and SHA256 hashes of each stolen email address, which Diachenko believes was for ease of search in the database. "My best shot is that somebody just bought it and was trying to start a searchable database for I don't know what reasons," he says. "And ElasticSearch was misconfigured and became publicly available."
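To illustrate why precomputed hashes make lookups easy, the following sketch derives the three digests for a single address using Python's standard hashlib module. It is an assumption-laden example, not a reconstruction of the exposed database: the function name is hypothetical, and the trimming and lowercasing step is a guess at how addresses might be normalized before hashing.

```python
import hashlib


def email_digests(email: str) -> dict:
    """Return hex digests of an email address under MD5, SHA1, and SHA256."""
    # Assumed normalization: strip whitespace and lowercase before hashing,
    # so that the same address always maps to the same digests.
    data = email.strip().lower().encode("utf-8")
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }
```

Each digest is a fixed-length hex string (32, 40, and 64 characters, respectively), so a record can be looked up by exact match on any of the three. Unsalted hashes of email addresses offer little privacy, since anyone who already knows an address can compute the same values.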

Another Badly Built Bucket
Meanwhile, researchers at Fidus Information Security, a UK-based penetration testing firm, separately discovered nearly 800,000 online applications for copies of US birth certificates on an exposed AWS S3 storage bucket belonging to a firm that provides a service for obtaining copies of birth and death certificates. The bucket had no password protection, so the data was open to anyone who found it.

Interestingly, the storage bucket's trove of 94,000 death certificate copy applications was not accessible, according to TechCrunch, which reported this week that it had verified the records for Fidus. 

The birth record applications, which date back to late 2017, include names, birthdates, addresses, email addresses, phone numbers, and other personal data, TechCrunch found.

Andrew Mabbitt, director of Fidus, says his firm found the data while working on an AWS S3 project. "The bucket was configured for complete world readable access — allowing anybody with the URL to obtain a full list of all files," Mabbitt says.
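One way a bucket ends up world-readable is an access control list (ACL) that grants READ to Amazon's predefined "AllUsers" group, which on a bucket permits listing its contents. The article does not say whether this bucket was exposed through an ACL or a bucket policy, so the sketch below covers only the ACL case. The function name is hypothetical, and the input is assumed to be a dictionary shaped like the ACL structure the AWS SDKs return (a "Grants" list whose entries carry a "Grantee" and a "Permission"). The code inspects that dictionary and makes no calls to AWS.

```python
# Predefined S3 group URIs: everyone on the Internet, and any AWS account holder.
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTH_USERS_URI = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"


def public_grants(acl: dict) -> list:
    """Return (group_uri, permission) pairs for grants made to broad groups."""
    findings = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        uri = grantee.get("URI")
        if grantee.get("Type") == "Group" and uri in (ALL_USERS_URI, AUTH_USERS_URI):
            findings.append((uri, grant.get("Permission")))
    return findings


# Illustrative ACL (made-up data): one owner grant plus a public READ grant.
example_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group", "URI": ALL_USERS_URI},
         "Permission": "READ"},
    ]
}
```

Here public_grants(example_acl) flags the second grant. An empty result means no ACL grant to those groups, though a bucket policy could still expose the data and would need a separate check.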

The server — and data — still remain exposed. "We contacted the company numerous times and got no response at all. We contacted the Amazon AWS security team, who thanked us for the report and said they would pass it on to the bucket owner," he says. "I assume this was done, but their email to the owner was ignored, too."

Misconfigured and exposed data sitting on the public Internet is ripe for fraud and identity theft. Attackers can use email addresses for targeted phishing or use personally identifiable information to hack bank or other valuable accounts as well.

Anurag Kahol, CTO of Bitglass, recommends organizations ensure they have full knowledge and visibility of customer data. He also advises that they employ real-time access control, encrypt data at rest, and be able to detect any misconfigured cloud security settings.

About the Author(s)

Kelly Jackson Higgins, Editor-in-Chief, Dark Reading

Kelly Jackson Higgins is the Editor-in-Chief of Dark Reading. She is an award-winning veteran technology and business journalist with more than two decades of experience in reporting and editing for various publications, including Network Computing, Secure Enterprise Magazine, Virginia Business magazine, and other major media properties. Jackson Higgins was recently selected as one of the Top 10 Cybersecurity Journalists in the US, and named as one of Folio's 2019 Top Women in Media. She began her career as a sports writer in the Washington, DC metropolitan area, and earned her BA at William & Mary. Follow her on Twitter @kjhiggins.
