Dark Reading is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Perimeter

12/12/2012
02:50 PM
Adrian Lane
Adrian Lane
Commentary
50%
50%

Don't Get 'Ha-Duped' By Big Data Security Products

Some Big Data security recommendations

A couple of posts back I posed the question, "How do you secure big data environments?" I covered several avenues of attack against NoSQL environments that I considered low-hanging fruit from a security standpoint. To address those attacks, I offer the following recommendations to provide basic cluster security.

But before I make any recommendations to securing NoSQL, I have a couple of requirements for security solutions in big data environments. Many security vendors offer security products for big data, but these are the same products they offer for normal IT environments. Because these solutions were not architected for big data, they often impose limits on how they are deployed, how they scale, or limit big data cluster functionality in some way.

We don't want to be "Ha-Duped" into buying traditional security products that don't work with big data, so all of my recommendations must scale and must not limit the core capabilities of big data. Here is what I recommend:

Kerberos
The goal here is to validate nodes to ensure unwanted copies of data or unwanted queries are run to copy data. In traditional IT, this may seem far-fetched, but in cloud and virtual environments with thousands of nodes in a cluster, this is both easy and difficult to detect. Kerberos provides a means of authenticating nodes before they are allowed to participate on the cluster.

TLS
The goal is to keep communications private. Built-in Hadoop capabilities provide for secure client-to-Web application-layer communication, but not internode communication, nor M-R output before it is passed to the application. TLS provides a mechanism for secure communication between all nodes and services, and scales as nodes are added to the cluster.

File Layer Encryption
The idea is to protect the contents of the cluster from those who administer it. In the event an admin wants to inspect data files, either through a basic text editor or by taking a snapshot and examining the archive, encryption keeps data safe. File-layer encryption is transparent to the database, and scales up as new nodes are added to the cluster. Granted, the encryption keys must be kept secure for this feature to be effective.

Key Management
As an extension of file-layer encryption, key management is critical to successful encryption deployment. All too often when we review cloud and virtual security systems, we find that administrators keep encryption keys stored unprotected on the disk. Encryption keys must be readily available when a cluster restarts; otherwise, data is inaccessible, and leaving keys in the open was the only way they knew how to ensure safe system restarts. Most central key management systems provide for key negotiation on restart, and still keep keys safe.

Deployment Validation
The goal here it to ensure that all nodes in the cluster are properly patched, properly configured, and run the correct (i.e., not hacked) copies of software. It's easy to miss things when you have thousands of nodes running, so automated node validation makes basic node security much easier. Scripts to install patches and set and test configuration settings are an easy way to verify no nodes enter the cluster without proper setup and security. This can also be linked into Kerberos authentication and external application validation (security as a service) offerings.

Logging
Big data is excellent at managing and mining large amounts of data. And most big data environments have logging capabilities baked-in. There is not need to reinvent the wheel: Use these features and the cluster storage model to log events. You can even create security-related M-R scripts to mine the data. There are several commercial and open-source add-ons with additional logging features to help. Note that during this series of posts that I have omitted discussing application security -- specifically, security of the applications that use big data and code vulnerabilities with JSON, Java, and Ruby. I could discuss these as well, given that the security of the big data APIs is of concern. The problems are independent of big data, relevant to any application, and need to be dealt with by application developers with or without a NoSQL database. The subject is beyond the scope of the database, but I raise the issue so that you keep in mind that there are entire classes of attacks that I've not discussed.

Also, note that these are the recommendations I feel comfortable making at this time. There are many different research papers being published on big data security, and I hope to see some innovation in this area within the coming years. And I am starting to see security products architected specifically for NoSQL environments.

Adrian Lane is an analyst/CTO with Securosis LLC, an independent security consulting practice. Special to Dark Reading. Adrian Lane is a Security Strategist and brings over 25 years of industry experience to the Securosis team, much of it at the executive level. Adrian specializes in database security, data security, and secure software development. With experience at Ingres, Oracle, and ... View Full Bio

Comment  | 
Print  | 
More Insights
Comments
Newest First  |  Oldest First  |  Threaded View
COVID-19: Latest Security News & Commentary
Dark Reading Staff 9/17/2020
Cybersecurity Bounces Back, but Talent Still Absent
Simone Petrella, Chief Executive Officer, CyberVista,  9/16/2020
Meet the Computer Scientist Who Helped Push for Paper Ballots
Kelly Jackson Higgins, Executive Editor at Dark Reading,  9/16/2020
Register for Dark Reading Newsletters
White Papers
Video
Cartoon
Current Issue
Special Report: Computing's New Normal
This special report examines how IT security organizations have adapted to the "new normal" of computing and what the long-term effects will be. Read it and get a unique set of perspectives on issues ranging from new threats & vulnerabilities as a result of remote working to how enterprise security strategy will be affected long term.
Flash Poll
How IT Security Organizations are Attacking the Cybersecurity Problem
How IT Security Organizations are Attacking the Cybersecurity Problem
The COVID-19 pandemic turned the world -- and enterprise computing -- on end. Here's a look at how cybersecurity teams are retrenching their defense strategies, rebuilding their teams, and selecting new technologies to stop the oncoming rise of online attacks.
Twitter Feed
Dark Reading - Bug Report
Bug Report
Enterprise Vulnerabilities
From DHS/US-CERT's National Vulnerability Database
CVE-2020-8225
PUBLISHED: 2020-09-18
A cleartext storage of sensitive information in Nextcloud Desktop Client 2.6.4 gave away information about used proxies and their authentication credentials.
CVE-2020-8237
PUBLISHED: 2020-09-18
Prototype pollution in json-bigint npm package < 1.0.0 may lead to a denial-of-service (DoS) attack.
CVE-2020-8245
PUBLISHED: 2020-09-18
Improper Input Validation on Citrix ADC and Citrix Gateway 13.0 before 13.0-64.35, Citrix ADC and NetScaler Gateway 12.1 before 12.1-58.15, Citrix ADC 12.1-FIPS before 12.1-55.187, Citrix ADC and NetScaler Gateway 12.0, Citrix ADC and NetScaler Gateway 11.1 before 11.1-65.12, Citrix SD-WAN WANOP 11....
CVE-2020-8246
PUBLISHED: 2020-09-18
Citrix ADC and Citrix Gateway 13.0 before 13.0-64.35, Citrix ADC and NetScaler Gateway 12.1 before 12.1-58.15, Citrix ADC 12.1-FIPS before 12.1-55.187, Citrix ADC and NetScaler Gateway 12.0, Citrix ADC and NetScaler Gateway 11.1 before 11.1-65.12, Citrix SD-WAN WANOP 11.2 before 11.2.1a, Citrix SD-W...
CVE-2020-8247
PUBLISHED: 2020-09-18
Citrix ADC and Citrix Gateway 13.0 before 13.0-64.35, Citrix ADC and NetScaler Gateway 12.1 before 12.1-58.15, Citrix ADC 12.1-FIPS before 12.1-55.187, Citrix ADC and NetScaler Gateway 12.0, Citrix ADC and NetScaler Gateway 11.1 before 11.1-65.12, Citrix SD-WAN WANOP 11.2 before 11.2.1a, Citrix SD-W...