Billions of records were found exposed this week due to unprotected databases owned by major corporations and third-party providers.

Kelly Sheridan, Former Senior Editor, Dark Reading

June 18, 2021

Unsecured cloud-based databases continue to threaten corporate and consumer data, as indicated by a series of reports this week involving incidents at Cognyte, CVS, and Wegmans.

First to make headlines this week was Cognyte, a cybersecurity analytics company that left some 5 billion records exposed online and accessible without authentication. The data was part of Cognyte's cyber-intelligence service, which alerts people to third-party data exposures and claims to have more than 1,000 government and enterprise customers across 100 countries.

"Ironically, the database used to cross-check that personal information with known breaches was itself exposed," security firm Comparitech wrote in a blog post on the discovery. Bob Diachenko, who leads the firm's security research team, found the data on May 29. If someone's information was in this database, they could be notified of an account compromise; if one of their passwords had been breached before, they would receive an alert to change it.

"The information included names, passwords, email addresses, and the original source of the leak," said researchers of the exposed data, noting that not all breaches from which the data was sourced included passwords; however, they couldn't determine an exact percentage that did. All of the data was stored on an Elasticsearch cluster.

This database was indexed by search engines on May 28; the day after, Diachenko found it and alerted Cognyte, which secured the data on June 2. It's unknown if any other third parties accessed the information during the window when it was exposed, or for how long it was exposed prior to being indexed, researchers reported in their June 14 blog post.

A few days later, security researcher Jeremiah Fowler and the WebsitePlanet research team disclosed their discovery of a non-password-protected database holding more than 1 billion records connected to CVS Health, a corporation that also owns CVS Pharmacy, CVS Caremark, and Aetna.

Researchers sent a responsible disclosure notice to CVS Health, which revoked public access the same day. It also confirmed this dataset was managed by a contractor or vendor that operated on CVS Health's behalf; however, details on the vendor were not disclosed.

The 204GB database contained aggregate and event data, including production records that exposed visitor ID, session ID, and device information — for example, whether site visitors used iPhone, iPad, or Android. Exposed files also gave "a clear understanding of configuration settings, where the data is stored, and a blueprint of how the logging service operates from the backend," Fowler said in a writeup of the findings.

Exposed records also disclosed individuals' search queries: "In this case these were search logs from everything that visitors searched for and contained references to both CVS Health and CVS.com," Fowler wrote.

In his research, he saw multiple records indicating that people searched for medications, COVID vaccines, and other CVS products. The records also contained email addresses, which CVS confirmed were not from customer account records but had been entered in the search bar by the individuals themselves. After reviewing the mobile CVS site, he said it's possible visitors believed they were logging in to their accounts but were instead entering their email addresses into the search bar.

He noted he was able to identify some people by searching Google for their publicly exposed email addresses. "Hypothetically, it could have been possible to match the Session ID with what they searched for or added to the shopping cart during that session and then try to identify the customer using the exposed emails," Fowler wrote. That said, the visitor ID and session ID alone did not contain identifiable data; they could identify a user only in combination with that person's email address.

While tracking activity on websites and e-commerce platforms may provide valuable insight, the data collected may also contain metadata or error logs that expose more-sensitive data. Fowler recommended that CVS block searches that match email address patterns or domain names from being executed or logged, which could help prevent unwanted data from being collected or stored.
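Fowler's writeup does not describe how such a filter would be built, so the following is only a minimal Python sketch of the idea, not CVS's actual approach. The function names, the placeholder string, and the deliberately loose email pattern are all assumptions made for illustration, and the sketch covers only email addresses, not the domain names he also mentioned.

```python
import re

# Assumption: a loose pattern for anything shaped like local@domain.tld.
# It is not a full RFC 5322 validator; it errs toward catching more.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def sanitize_query(query: str) -> str:
    """Replace email-like substrings with a placeholder before logging."""
    return EMAIL_PATTERN.sub("[redacted-email]", query)


def should_execute(query: str) -> bool:
    """Return False when the entire query is just an email address."""
    return EMAIL_PATTERN.fullmatch(query.strip()) is None
```

In this sketch, a search handler would call `should_execute` before running the query and pass the query through `sanitize_query` before writing it to any log, so an address typed into the search bar by mistake is never stored.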

Closing out the week, grocery chain Wegmans disclosed that two of its cloud databases, both used for business purposes and meant to be kept internal, were accidentally left open to outside access "due to a previously undiscovered configuration issue," officials said in a statement. The issue was confirmed around April 19 and corrected shortly after, they reported.

The databases contained customer information including names, addresses, phone numbers, birth dates, Shoppers Club numbers, and email addresses and passwords used to access Wegmans.com accounts. Wegmans confirmed all passwords were hashed and salted, so the actual password characters were not in the databases.
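Wegmans did not say which hashing scheme it used. As a general illustration of what "hashed and salted" means, here is a minimal Python sketch using PBKDF2 from the standard library; the algorithm choice, iteration count, salt length, and function names are assumptions for the example, not details from the company's statement.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumption: illustrative work factor; real deployments tune this value.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Only the salt and digest are stored, never the password itself. Because each account gets its own random salt, two users with the same password end up with different digests, which makes precomputed lookup tables far less useful to anyone who obtains the database.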

A Consistent and Dangerous Problem
The risk of unprotected databases isn't news to security teams. In fact, more and more of these occurrences have been making headlines in recent years. But why are they so common, even as organizations become aware of them?

"Cloud service providers provide a complex and highly configurable environment," says PJ Norris, senior systems engineer at Tripwire, and businesses need appropriately skilled staff to configure them securely. Those using multiple cloud providers, a growing trend, must have employees who understand that each major provider is configured differently. Cloud configuration assessments are another key step that isn't always undertaken, he adds, advising businesses to conduct regular audits and reviews of public-facing environments.
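What such an audit checks will vary by provider and tooling, and Norris does not prescribe a method. As a rough Python sketch of the core question, assume a hypothetical inventory in which each database record notes whether it is reachable from the internet and whether it requires authentication. The `DatabaseRecord` type, its fields, and the sample entries are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatabaseRecord:
    """Hypothetical inventory entry; field names are assumptions."""
    name: str
    publicly_reachable: bool
    auth_required: bool


def find_exposed(inventory: List[DatabaseRecord]) -> List[str]:
    """Return names of databases reachable from the internet without authentication."""
    return [
        db.name
        for db in inventory
        if db.publicly_reachable and not db.auth_required
    ]


# Made-up sample inventory for the example.
inventory = [
    DatabaseRecord("orders", publicly_reachable=False, auth_required=True),
    DatabaseRecord("search-logs", publicly_reachable=True, auth_required=False),
    DatabaseRecord("status-page", publicly_reachable=True, auth_required=True),
]

exposed = find_exposed(inventory)
```

In practice the inventory would be populated from each cloud provider's own configuration data rather than written by hand, and anything flagged would be sent to the team that owns the resource.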

These issues are often cases of simple misconfigurations that go undetected or aren't addressed fast enough, says Eric Kedrosky, CISO and research director at Sonrai Security. Most companies that move data to the cloud lack the visibility they need to know when it's at risk.

"There are often a lot of different teams involved in an organization's cloud, and there are different levels of security knowledge," he explains. When these issues are found, he says, they are often sent to the wrong places for remediation or not addressed quickly. Following the "shift left" methodology, these problems should be sent to the team that made the error.

