"A lot of enterprises are concerned about securing their front door, but they don't pay as much attention to that patio door where some of their sensitive information might be accessible," says Todd Thiemann, senior director of product marketing at Vormetric. "There are these flows of information around the database that you also need to pay attention to."
According to the Verizon Business 2011 Payment Card Industry Compliance Report, losing track of regulated data is one of the top reasons organizations unnecessarily increase the scope of a PCI audit.
"Understanding data flows and stores is essential to establishing the scope of assessment," the report said. "A poor understanding of this usually results in an overly large scope, which, in turn, usually results in more expense and difficulty."
Once it has made its way out of the database, data can hide in a number of places. Some of the most likely include backup files, developer testing files, extract-transform-load (ETL) data for data warehousing functions, and even script files spit out by applications touching the database, as well as database configuration and control files.
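Finding stray cardholder data in those secondary locations is usually a pattern-scanning exercise. The sketch below is purely illustrative (not any vendor's product): it hunts for card-number-like strings in text and uses the Luhn checksum to weed out false positives before flagging a hit.

```python
import re

# Illustrative sketch: find card-number-like strings (13-16 digits,
# optionally separated by spaces or dashes) and keep only those that
# pass the Luhn checksum, which cuts most false positives.
PAN_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    digits = [int(d) for d in number][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_pans(text: str) -> list[str]:
    """Return Luhn-valid candidate card numbers found in the text."""
    hits = []
    for match in PAN_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if 13 <= len(digits) <= 16 and luhn_valid(digits):
            hits.append(digits)
    return hits
```

Run against backup dumps, ETL extracts, or developer test files, a scanner like this gives a quick first pass at whether regulated data has leaked outside the database proper; commercial discovery tools add file-format parsing and many more data patterns on top of the same idea.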
Some of the biggest gaps are within backup files or even test databases containing production data, warns Amichai Shulman, CTO of Imperva.
"Enterprises might secure the database in production, but they also need to pay equal attention to backups," he says.
[ What are the hidden costs of compliance? See The Compliance Officer's Dirty Little Secret. ]
He says evidence from the recent Yahoo Voices and LinkedIn data breaches suggests that what was taken in both incidents was likely old data stolen from somewhere other than those organizations' main production systems. According to Shulman, organizations that lock down their production data within the database can still be bitten if they forget these secondary sources.
"It is a problem at many organizations where many copies of the same data exist outside the database," he says. "Some of them are just being neglected, but some of them find their way into public Web-facing places where they either get indexed by search engines or hackers simply stumble upon them and are able to access the data."
Another oft-forgotten secondary source is the ETL data feeding the data warehouses commonly used by business intelligence applications to analyze critical patterns within the enterprise.
"That's a process where you suck off the transaction information, flatten it out, transform it to put into a warehouse, and then load it into that data warehouse," he says. "Well, that ETL data typically includes sensitive bits that also need to be secured."
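One common mitigation is to mask or tokenize sensitive fields during the "transform" step, so the warehouse never receives raw values. A minimal sketch, assuming hypothetical `mask_pan` and `tokenize` helpers (a real deployment would use a vaulted tokenization service or format-preserving encryption rather than a truncated hash):

```python
import hashlib

def mask_pan(pan: str) -> str:
    """Keep only the last four digits; replace the rest with '*'."""
    return "*" * (len(pan) - 4) + pan[-4:]

def tokenize(value: str, salt: bytes = b"demo-salt") -> str:
    """Replace a value with a salted SHA-256 token (illustrative only;
    a production system would manage salts and a token vault properly)."""
    return hashlib.sha256(salt + value.encode()).hexdigest()[:16]

def transform(row: dict) -> dict:
    """Transform step of a toy ETL pipeline: sanitize sensitive columns
    before the 'load' step ships the row to the warehouse."""
    out = dict(row)
    if "card_number" in out:
        out["card_number"] = mask_pan(out["card_number"])
    if "ssn" in out:
        out["ssn"] = tokenize(out["ssn"])
    return out
```

The point is architectural rather than cryptographic: if analysts only need purchase patterns, the warehouse can answer their queries with masked or tokenized identifiers, and the ETL staging files stop being a secondary trove of cleartext.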
Thiemann also warns that items like log, configuration, control, and script files could also be potential sources of sensitive information.
"There are some situations where you may not need to consider some of these sources, like script files, but some script files might contain a username and password, in which case you had better look at securing that information, because someone who gets hold of it might get access to your database," he warns.
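That kind of exposure can be checked for mechanically. A minimal sketch, assuming just two common credential patterns (hardcoded `password=` assignments and `user:pass@host` connection strings); real secret scanners ship far larger rule sets:

```python
import re

# Illustrative patterns for credentials embedded in script files.
CRED_PATTERNS = [
    re.compile(r"(?i)(password|passwd|pwd)\s*=\s*\S+"),      # PASSWORD=...
    re.compile(r"(?i)\b[a-z]+://[^:/\s]+:[^@\s]+@"),         # scheme://user:pass@host
]

def flag_credential_lines(script_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that look like embedded secrets."""
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        if any(p.search(line) for p in CRED_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits
```

Running a scan like this across cron jobs, deployment scripts, and application output is a cheap way to decide which of Thiemann's "maybe ignorable" sources actually hold keys to the database.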
Shulman believes, though, that configuration and control files probably shouldn't have sensitive information in the first place -- "unless of course you keep your passwords in those, in which case you're making a huge mistake," he says.
But more than confidentiality worries, Shulman says control and configuration files need to be protected for integrity's sake.
"I think that sometimes organizations don't realize the risk if those files are tampered with," he says. "Even small changes to a configuration file or control file of a database server would cause a denial-of-service, and more subtle changes would allow attackers to access all of the information within a database server or see all of the traffic by specific users, and so on."
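The standard control for the tampering risk Shulman describes is file-integrity monitoring. A minimal sketch using SHA-256 baselines (dedicated FIM tools also protect the baseline itself and watch for changes in real time, which this toy version does not):

```python
import hashlib
from pathlib import Path

def digest(path: Path) -> str:
    """SHA-256 digest of a file's current contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def baseline(paths: list[Path]) -> dict[str, str]:
    """Record a trusted digest for each monitored config/control file."""
    return {str(p): digest(p) for p in paths}

def changed_files(saved: dict[str, str]) -> list[str]:
    """Return paths whose contents no longer match the trusted baseline."""
    return [p for p, h in saved.items() if digest(Path(p)) != h]
```

Even an unsophisticated check like this surfaces the "small changes" Shulman warns about, provided the baseline is stored somewhere an attacker who can edit the config files cannot also edit.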