Big Texas Breach A Hard Lesson In Database Discovery

Many organizations have policies that prohibit dissemination of unprotected databases and files, but no way to enforce them

Dark Reading Staff, Dark Reading

April 13, 2011

The data breach revealed this week by the state of Texas is yet another in a long list of breaches caused by a failure to enforce policies that govern how data is accessed and duplicated from the database. Costly gaffes like these could be headed off by more regular discovery and auditing procedures within IT infrastructure, experts say, along with some deep thought on how to make it easier to use protected information directly from the database locations in which it is supposed to reside.

In the case of Texas, more than 3 million citizens were exposed after their information was left on a public FTP server for over a year -- a place that clearly wasn't meant to store unencrypted sensitive databases or information stores in any form, says Jason Reed, a consultant for security firm System Experts. He believes that Texas is simply a bellwether among many IT organizations, which, like the government agency, have policies in place that prohibit dissemination of unprotected databases and data files, but have no mechanism to enforce them.

"The issue -- and it's the same issue as with any number of companies -- is they have policies, but they're not going far enough in actually ensuring the implementation of controls to meet those policies," he says. "If they had gone back and actually looked at the data and looked at the servers, they would have realized that it contained sensitive information and hopefully would have protected [it] correctly."

Duplicate databases and stray data spread across the enterprise have been a longstanding problem for many IT departments.

"We're good at protecting the main database, but how many times have employees pulled subsets of that data out, and how is that data stored and handled?" says Rob Ayoub, global program director of network security for Frost & Sullivan.

In a recent Independent Oracle Users Group survey, half of polled organizations said they don't have a firm grasp on where all of the sensitive data resides across the enterprise. And approximately 37 percent of respondents admitted to using live data in nonproduction environments.

"It happens a lot in organizations that you'll find data where you'd least expect it," says Slavik Markovich, CTO for Sentrigo, which was recently acquired by McAfee. "Duplication happens all of the time."

He says that he recently helped a new customer search for duplicate billing database information and found more than 100 repeats in a single server instance. The danger with so many of these extra instances of data is that they're rarely as protected as the main data from which the information came, he says.

Most experts suggest that organizations rely on automated technology tools to help find information in places where it shouldn't reside.

"If you really don't have a kind of framework of scanning your servers or your databases consistently just trying to look for this information, you will definitely be blindsided," Markovich warns. "It has to be enforced and checked via technological means."

Markovich suggests at least a biweekly, if not a weekly, scan to look for rogue databases. Ideally, organizations should have a way to scan not just for whole databases, but also for data that has been exported.

"There are actually combinations of sets of tools you should have that are capable of finding databases, finding data out there, scanning within the database for sensitive information, and scanning on file systems for sensitive information, which is something DLP might do," he says. "All of that should work together."
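As a rough illustration of the file-system side of that scanning, the sketch below walks a directory tree and counts strings that look like Social Security numbers. It is a minimal example under stated assumptions, not a description of any vendor's product: the function name `scan_tree`, the `SSN_PATTERN` regular expression, and the `max_bytes` read limit are all hypothetical, and a real DLP tool would use far more robust detection.

```python
import os
import re

# Assumption: sensitive values look like U.S. Social Security numbers
# written as 123-45-6789. Real scanners check many more formats.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scan_tree(root, pattern=SSN_PATTERN, max_bytes=1_000_000):
    """Return {path: match_count} for files under root that contain matches."""
    findings = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Read at most max_bytes characters so that a huge file
                # does not stall the scan; undecodable bytes are ignored.
                with open(path, "r", errors="ignore") as handle:
                    text = handle.read(max_bytes)
            except OSError:
                continue  # unreadable file: skip it rather than abort
            hits = len(pattern.findall(text))
            if hits:
                findings[path] = hits
    return findings
```

Run on a schedule against shared drives or FTP roots, a check like this would at least flag exported files that warrant a closer look; it does not discover databases or inspect their contents.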

System Experts' Reed also believes organizations should be conducting full audits across the infrastructure, looking for pockets of rogue data at the very least on an annual basis -- and more often if there are major changes to the network architecture.

Perhaps even more important, he says, is for organizations to do a little soul-searching as to why exactly there is so much duplication floating around in the first place. Part of it, he says, is a bit of laziness in not doing due diligence and risk assessment when spreading data around for multiple projects, whether within the organization or to an outsourced provider.

"Companies are tending to produce data sets for use in other places that are far too rich for their intended purpose," he says, telling a story of one application service provider he worked with that was storing Social Security numbers from a data dump that a customer sent along to complete an e-mail campaign. There was no reason the numbers had to be included: The customer just didn't spend the time purging that sensitive information before sending it along.
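The kind of purge that customer skipped can be small. The following sketch copies a CSV export while dropping columns that the recipient does not need. The column names in `SENSITIVE_COLUMNS` and the function name `strip_sensitive` are assumptions made for illustration; the actual export format in Reed's story is not known.

```python
import csv
import io

# Assumption: hypothetical column names; adjust to the real export schema.
SENSITIVE_COLUMNS = {"ssn", "date_of_birth"}


def strip_sensitive(source, dest, sensitive=SENSITIVE_COLUMNS):
    """Copy CSV rows from source to dest, omitting sensitive columns.

    Returns the list of column names that were kept.
    """
    reader = csv.DictReader(source)
    fieldnames = reader.fieldnames or []
    kept = [name for name in fieldnames if name.lower() not in sensitive]
    # extrasaction="ignore" makes the writer silently drop any key
    # that is not listed in `kept`, including the sensitive ones.
    writer = csv.DictWriter(
        dest, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return kept


# Usage: an in-memory export containing an SSN column.
export = io.StringIO("email,ssn,name\nann@example.com,123-45-6789,Ann\n")
cleaned = io.StringIO()
strip_sensitive(export, cleaned)
```

A deny list like this one only removes the columns someone remembered to name. An allow list of the fields the project actually requires would be the stricter design.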

Organizations won't have to deal with this problem as extensively if they start thinking about ways to facilitate easier use of data directly from the main database, rather than having users look for ways to simplify things by exporting data into another form, he says.

"You have to have rich interfaces that allow various applications to access the data as it sits in its protected state," he says. "What's happening is people are looking for information to be provided in a different format. You'd be amazed at what we find sometimes because people thought access to the main database was too hard, and so they dumped it out and have it sitting on a flash drive."
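One common way to give applications access to data where it sits is a database view that exposes only the fields a given task needs. The sketch below uses Python's built-in `sqlite3` module purely for demonstration; the table, view, and function names are hypothetical. SQLite has no per-user permissions, so in a production database the assumption is that users would be granted access to the view and not to the underlying table.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a table and a restricted view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            id INTEGER PRIMARY KEY,
            email TEXT,
            ssn TEXT
        );
        INSERT INTO customers (email, ssn)
        VALUES ('ann@example.com', '123-45-6789');

        -- The view carries only what an e-mail campaign needs.
        CREATE VIEW campaign_contacts AS
        SELECT id, email FROM customers;
        """
    )
    return conn


def campaign_rows(conn):
    """Read contacts through the view; the ssn column is never returned."""
    return conn.execute("SELECT id, email FROM campaign_contacts").fetchall()
```

An application that queries `campaign_contacts` gets live data without anyone having to dump the full table to a file first.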
