Do You Know Where Your Databases Are?

Database discovery an important first step to securing sensitive data stores
One of the most important first steps to any database security strategy is also, coincidentally, one of the most likely to be forgotten: enumerating the databases an organization manages. After all, unless an enterprise knows how many databases it has and which ones contain sensitive information, it is pretty difficult to prioritize them based on risk and implement appropriate controls. And, yet, many organizations are operating in the dark with regard to database discovery.

"Many companies struggle to locate and accurately maintain an inventory of all their data across databases," says Anu Yamunan, senior product manager at Imperva.

It's true, says Paul Borchardt, senior manager of Vigilant by Deloitte, who sees many organizations fail to maintain any kind of centralized inventory of databases or applications across the enterprise.

"This sounds so simple and logical, but an accurate asset inventory is frequently nonexistent or, if it exists, is fragmented and managed by disparate asset managers, such as DBAs and developers," he says. "Failing to identify the one database containing the PII of your clients because you didn't know about it will not please the regulators or the court of public opinion."

Part of the issue is one of scale. Many organizations operate hundreds of databases across their IT infrastructure, some more visible than others. According to the recent IOUG Enterprise Data Security Survey, 38 percent of organizations have more than 100 databases, with 18 percent managing more than 1,000 databases. Add to that the dynamic nature of databases and the applications they feed with data, and it becomes clearer why such a seemingly simple task remains on the IT to-do list.

"The main issue with databases is the complexity and constant change makes it virtually impossible for manual processes to keep up [with discovery]," says Kevin O'Malley, vice president of marketing and product strategy for MENTIS Software.

Additionally, other business and technology trends are amplifying the problem of finding and tracking databases across the board, Yamunan says.

"Virtualization is one of these," Yamunan says. "For example, an administrator can easily create a new virtual image of a database with sensitive information. This virtual image now contains a 'rogue' database that is not under IT security controls."

Similarly, backing up data stores to the cloud has created potential issues for discovering and adequately protecting databases. Snapshot features can create copies of a database that are difficult to track down, and those copies often are not encrypted. For example, at the time of writing, Amazon's Relational Database Service (RDS) offered no option to encrypt database snapshots.

"Additionally, Amazon has a redundant failover option that keeps an up-to-date hot replica of your database in case the primary fails," says Fred Thiele, co-founder of Laconic Security. "Again, if you have unencrypted data in your database, the unencrypted data is replicated to a different part of Amazon-land in plaintext."

Regardless of the complications, organizations should find ways to scan their infrastructure automatically for databases and to classify the data they find, so that databases and the information they contain are tracked centrally. O'Malley suggests full scans on a monthly or quarterly basis at minimum to ensure organizations are turning over all the rocks necessary to find sensitive data. Doing this regularly is important, because the contents of a database can shift over time and a seemingly innocuous set of data can become sensitive.
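As a rough illustration of what automated discovery can look like at its simplest, the Python sketch below checks a list of hosts for listeners on the default ports of several common database engines. It is a minimal sketch, not how any vendor quoted here does it: the function names (`is_port_open`, `discover_databases`) and the port table are assumptions made for this example, and databases running on non-default ports or behind firewalls would be missed.

```python
import socket

# Well-known default listener ports for common database engines.
# Assumption for this sketch: real deployments often use other ports,
# so an open default port is only a hint, not proof of a database.
DEFAULT_DB_PORTS = {
    1433: "Microsoft SQL Server",
    1521: "Oracle",
    3306: "MySQL",
    5432: "PostgreSQL",
    27017: "MongoDB",
}


def is_port_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def discover_databases(hosts, ports=DEFAULT_DB_PORTS, timeout=0.5):
    """Return (host, port, engine_guess) for every open port found."""
    found = []
    for host in hosts:
        for port, engine in ports.items():
            if is_port_open(host, port, timeout):
                found.append((host, port, engine))
    return found
```

Calling `discover_databases` with the hosts in a given network segment returns a list that can be compared against the central inventory; any entry that does not appear in the inventory is a candidate "rogue" database to investigate.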

"Organizations should layer on top of that the ability to identify and remediate infrastructure vulnerabilities and misconfigurations, and assess who has access to sensitive data on an ongoing basis," says Yamunan, explaining that doing so makes it easier to identify and remediate a sensitive database that's vulnerable or overly accessible. "In essence, these steps help organizations generate risk scores for the various data sets in the enterprise. For example, a database that is not kept up to date with the latest patches, containing credit card information and accessed by external users and applications, is a high-risk asset."
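The kind of risk scoring Yamunan describes can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the `DatabaseRecord` fields, the `SENSITIVE_TYPES` labels, the weights, and the thresholds are hypothetical and would need to be set by each organization's own risk model.

```python
from dataclasses import dataclass

# Hypothetical labels for data considered sensitive in this sketch.
SENSITIVE_TYPES = {"credit_card", "pii"}


@dataclass
class DatabaseRecord:
    """One inventory entry; the fields are illustrative assumptions."""
    name: str
    data_types: set
    patched: bool
    externally_accessible: bool


def risk_score(db):
    """Add up illustrative weights for the three factors in the quote."""
    score = 0
    if db.data_types & SENSITIVE_TYPES:
        score += 3  # holds sensitive data
    if not db.patched:
        score += 2  # missing the latest patches
    if db.externally_accessible:
        score += 2  # reachable by external users or applications
    return score


def risk_level(score):
    """Map a numeric score to a coarse label (thresholds are assumptions)."""
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


# The example from the quote: unpatched, holds card data, externally accessed.
payments = DatabaseRecord("payments", {"credit_card"}, False, True)
print(payments.name, risk_score(payments), risk_level(risk_score(payments)))
# prints: payments 7 high
```

Sorting the inventory by this score gives a simple, repeatable way to decide which databases to patch, lock down, or audit first.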
