In the case of Texas, more than 3 million citizens were exposed after their information was left on a public FTP server for over a year -- a place that clearly wasn't meant to store unencrypted sensitive databases or information stores in any form, says Jason Reed, a consultant for security firm System Experts. He believes that Texas is simply a bellwether among many IT organizations, which, like the government agency, have policies in place that prohibit dissemination of unprotected databases and data files but have no mechanism to enforce them.
"The issue -- and it's the same issue as with any number of companies -- is they have policies, but they're not going far enough in actually ensuring the implementation of controls to meet those policies," he says. "If they had gone back and actually looked at the data and looked at the servers, they would have realized that it contained sensitive information and hopefully would have protected [it] correctly."
Duplicate databases and stray data spread across the enterprise have been a longstanding problem for many IT departments.
"We're good at protecting the main database, but how many times have employees pulled subsets of that data out, and how is that data stored and handled?" says Rob Ayoub, global program director of network security for Frost & Sullivan.
In a recent Independent Oracle Users Group survey, half of polled organizations said they don't have a firm grasp on where all of the sensitive data resides across the enterprise. And approximately 37 percent of respondents admitted to using live data in nonproduction environments.
"It happens a lot in organizations that you'll find data where you'd least expect it," says Slavik Markovich, CTO for Sentrigo, which was recently acquired by McAfee. "Duplication happens all of the time."
He says that he recently helped a new customer search for duplicate billing database information and found more than 100 repeats in a single server instance. The danger with so many of these extra instances of data is that they're rarely as protected as the main data from which the information came, he says.
Most experts suggest that organizations rely on automated technology tools to help find information in places where it shouldn't reside.
"If you really don't have a kind of framework of scanning your servers or your databases consistently just trying to look for this information, you will definitely be blindsided," Markovich warns. "It has to be enforced and checked via technological means."
Markovich suggests at least a biweekly, if not a weekly, scan to look for rogue databases. Ideally, organizations should have a way to scan not just for whole databases, but also for data that has been exported.
"There are actually combinations of sets of tools you should have that are capable of finding databases, finding data out there, scanning within the database for sensitive information, and scanning on file systems for sensitive information, which is something DLP might do," he says. "All of that should work together."
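To make the file-system side of that scanning concrete, here is a minimal sketch in Python, not a description of any vendor's product. It assumes that "sensitive information" means U.S. Social Security numbers written as NNN-NN-NNNN; the pattern, the function names, and the directory walked are all illustrative assumptions, and a real DLP tool would check many more data types and file formats.

```python
import os
import re

# Assumption: sensitive data here means SSN-like strings (NNN-NN-NNNN).
# This pattern is illustrative only and will produce false positives.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def count_ssn_like(text):
    """Return how many SSN-like strings appear in the given text."""
    return len(SSN_PATTERN.findall(text))


def scan_tree(root):
    """Walk a directory tree and map each file path to its SSN-like hit count.

    Files that cannot be opened are skipped; files with no hits are omitted.
    """
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Undecodable bytes are ignored so binary files do not abort the scan.
                with open(path, "r", errors="ignore") as handle:
                    found = count_ssn_like(handle.read())
            except OSError:
                continue
            if found:
                hits[path] = found
    return hits
```

Run on a schedule against shared drives or FTP directories, a script like this would only flag candidate files for review; it does not replace the database discovery and in-database scanning Markovich also describes.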
System Experts' Reed also believes organizations should conduct full audits across the infrastructure, looking for pockets of rogue data at the very least on an annual basis -- and more often if there are major changes to the network architecture.
Perhaps even more important, he says, is for organizations to do a little soul-searching as to why exactly there is so much duplication floating around in the first place. Part of it, he says, is a bit of laziness in not doing due diligence and risk assessment when spreading data around for multiple projects, whether within the organization or to an outsourced provider.
"Companies are tending to produce data sets for use in other places that are far too rich for their intended purpose," he says, telling a story of one application service provider he worked with that was storing Social Security numbers from a data dump that a customer sent along to complete an e-mail campaign. There was no reason the numbers had to be included: The customer just didn't spend the time purging that sensitive information before sending it along.
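The purging step that was skipped in Reed's example can be small. The sketch below, in Python, drops sensitive columns from a CSV export before it leaves the organization. The column names in `SENSITIVE_COLUMNS` and the function name `strip_sensitive` are hypothetical, chosen for illustration; a real export would need its own list based on a review of the data.

```python
import csv
import io

# Assumption: hypothetical names of columns that should never leave the organization.
SENSITIVE_COLUMNS = {"ssn", "social_security_number"}


def strip_sensitive(csv_text, sensitive=SENSITIVE_COLUMNS):
    """Return a copy of the CSV text with sensitive columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep only the columns whose lowercase name is not on the sensitive list.
    keep = [col for col in (reader.fieldnames or []) if col.lower() not in sensitive]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=keep, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # extrasaction="ignore" silently drops the columns that were filtered out.
        writer.writerow(row)
    return out.getvalue()
```

For an e-mail campaign like the one in the story, the provider would then receive only the fields it needs, such as names and e-mail addresses.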
Organizations won't have to deal with this problem as extensively if they start thinking about ways to facilitate easier use of data directly from the main database, rather than having users look for ways to simplify things by exporting data into another form, he says.
"You have to have rich interfaces that allow various applications to access the data as it sits in its protected state," he says. "What's happening is people are looking for information to be provided in a different format. You'd be amazed at what we find sometimes because people thought access to the main database was too hard, and so they dumped it out and have it sitting on a flash drive."