"Most of the attention in the past has been on transactional database systems, but what about the data warehouses?" says Phil Neray, vice president of data security strategy for IBM. "Those need security, too, because you're still going to have sensitive data, and with big data you're going to have a lot sensitive data, and it's all going to be centralized in one big repository."
According to Neray and other security experts, all the same rules of database security should apply -- things like using the rule of least privilege, encrypting the most sensitive bits of data, monitoring access, and keeping systems patched. But the challenge with big data is that it is much harder to secure while it sits in the data warehouses tasked with holding it.
"Discovery is far harder, [and] it's correspondingly easier to leak sensitive information if you don't know what to protect," says Adrian Lane, analyst with Securosis. "Many of these databases are now nonrelational and don't have the same security controls as established relational platforms. For example, many relational platforms have 'labeling' to help segment what a user sees. In nonrelational databases, you need to tag data as it is input and filter the output based on usage rules associated with the tag. "
What's more, it's difficult to encrypt data in warehouses meant to power business intelligence tools because encryption of huge data stores "screws up the processing and analytics," Lane says. Technology to work around that conundrum is readily available, though.
"Masking will shield data from people who should not see the real values without effecting the reports. Tokenization is a simple replacement strategy for payment data and PII that does not affect the quality of the metrics," Lane says. "Labeling helps, as does file-level encryption if archives are being made. "
But beyond the technological issues, there lurks around big data a much bigger challenge: politics.
"Data warehouses are expensive and politically sensitive, which causes a tug-of-war between security and cost-effectiveness," says Jake Freivald, vice president of marketing for Information Builders, a business intelligence software firm. "To get the most value from the consolidated, cleansed information data warehouses hold, enterprises should share the information in them as widely as possible. But if that information is valuable to the enterprise, it's also valuable to competitors, saboteurs, and disgruntled employees."
As Freivald explains, even without foul play there's the possibility that someone with access to a data warehouse will accidentally share sensitive information. But without proper monitoring in place, it's the motivated malicious insider who can do the most damage.
"Every sales rep should know every detail about every customer in his territory, but you risk having your most successful rep walk out the door with powerful insights about your customers -- insights that you spent an incredible amount of money to develop," he explains, saying the most common security mistake businesses make with big data is not thinking through the threats. "In the business intelligence business, we tend to tout the benefits of information-sharing without considering what could go wrong. Enterprises need to consider carefully how they should balance the need for security with the need to share the data warehouse's information."