1/31/2013
11:25 AM
Adrian Lane

Big Data Security Discussion

Answers to common big-data security questions

Last week I participated in a "tweet jam" to discuss the security of big data clusters. That use of social media is pretty interesting, as it's an open forum for anyone with an interest. It's a great way to get some community involvement and buzz, but at the same time, 140 characters is insufficient to answer complex questions. You can portray an idea, or a facsimile of an idea, but you can't flesh out the nuances of a subject as complex as securing big data clusters.

Worse, I thought some of the other participants in the conversation provided tangential responses to the questions posed, possibly generating more confusion than clarity. So I wanted to provide a little more color and context to the questions posed about big data security during that chat session.

Q1: What is big data security, and is it different from traditional data security?
Big data security is about securing big data clusters and the data stored within the cluster. The types of technologies we use to respond to specific threats are not new; for example, we still use encryption to secure data stored on data nodes, and we still need to use authentication to identify users before they are granted access to the system.

However, big data clusters are a type of distributed application. The goal is to protect both the data stored in the cluster and the function of the cluster itself. This poses new challenges because of the way big data scales, the "velocity" of data that is input into the cluster, and the elastic, self-organizing nature of big data clusters. These core traits break many existing security tools. That is to say, most security tools don't automatically work with big data. Making most security tools work as currently implemented means limiting some fundamental trait of big data.
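To make the authentication point from this answer concrete, here is a minimal sketch of identifying a user before granting access to stored data. It is an illustration only, not how any particular big data distribution does it: the in-memory user store and the `register`, `authenticate`, and `read_block` names are hypothetical, and a real cluster would typically delegate this to an identity service.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory user store: username -> (salt, derived key).
# A real cluster would delegate this to an identity service.
_USERS = {}

_ITERATIONS = 100_000  # assumed work factor for this sketch


def _derive(password, salt):
    """Derive a key from the password so the plaintext is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, _ITERATIONS)


def register(username, password):
    """Store a salted, derived key for a new user."""
    salt = os.urandom(16)
    _USERS[username] = (salt, _derive(password, salt))


def authenticate(username, password):
    """Return True only if the user exists and the password matches."""
    record = _USERS.get(username)
    if record is None:
        return False
    salt, expected = record
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(_derive(password, salt), expected)


def read_block(username, password, block_id):
    """Hypothetical data access call that refuses unauthenticated users."""
    if not authenticate(username, password):
        raise PermissionError("authentication failed")
    return f"contents of {block_id}"
```

The point of the sketch is the ordering: the identity check happens before any data is returned, which is the same control we apply to traditional data stores.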

Q2: What about security systems as producers of big data?
The question is backward. It's not about security systems as producers of big data, but about big data clusters as consumers of security data. We have event data coming from every device, agent, and application deployed, and security information is embedded -- sometimes subtly so -- in the stream of normal activity. Big data is viewed as a solution because it can handle the volume of data and the rate at which it is being generated, and because it provides the analysis engine to comb through the data and find actionable security information.
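As a toy illustration of combing through event data for actionable security information, the sketch below counts failed logins per source and flags sources that cross a threshold. The event fields (`source`, `action`, `outcome`), the function name, and the threshold are all assumptions made for the example; a real deployment would run this kind of analysis across the cluster rather than over an in-memory list.

```python
from collections import Counter


def flag_suspicious_sources(events, threshold=3):
    """Return sources with at least `threshold` failed logins, sorted."""
    failures = Counter(
        event["source"]
        for event in events
        if event.get("action") == "login" and event.get("outcome") == "failure"
    )
    return sorted(src for src, count in failures.items() if count >= threshold)


# Hypothetical event stream: mostly normal activity, with a few failures mixed in.
events = [
    {"source": "10.0.0.5", "action": "login", "outcome": "success"},
    {"source": "10.0.0.7", "action": "login", "outcome": "failure"},
    {"source": "10.0.0.5", "action": "read", "outcome": "success"},
    {"source": "10.0.0.7", "action": "login", "outcome": "failure"},
    {"source": "10.0.0.9", "action": "login", "outcome": "failure"},
    {"source": "10.0.0.7", "action": "login", "outcome": "failure"},
]

suspicious = flag_suspicious_sources(events)
```

Here `suspicious` contains only `10.0.0.7`, the one source with three failed logins; the single failure from `10.0.0.9` stays below the threshold and is treated as normal activity.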

Q3: Most big data stacks have no security built in. What does this mean for securing big data?
The premise is only partially true. Every open source and commercial distribution I've seen offers some security. Usually that means logging, identity management, and application proxies used to secure client-cluster sessions. But that's only a partial tool set to combat threats to the cluster or to data privacy, and the implementations of these tools are usually poor. That means looking at which third-party options are available that don't break the cluster. Today, most deployments use a security model where the NoSQL database is secured behind a firewall, and partial security controls are applied to inbound traffic. Most do not leverage the cluster itself for security, relying instead on the existing network tools they have in place. They benefit from the familiarity of these tools and the cost savings of using existing products, but the cost is a partial security model that

Q4: How is the industry dealing with the social and ethical uses of consumer-generated data?
I think the jury is out on this, and it's too early to tell. I cited the case where Target determined a teen pregnancy before the parents did. That's one of the few times tangible social and ethical issues have been discussed. Right now, the companies I've spoken with feel "my business productivity trumps your privacy," so businesses are "full steam ahead," and the social and ethical issues will get sorted out later.

Adrian Lane is an analyst/CTO with Securosis LLC, an independent security consulting practice. Special to Dark Reading. Adrian Lane is a Security Strategist and brings over 25 years of industry experience to the Securosis team, much of it at the executive level. Adrian specializes in database security, data security, and secure software development. With experience at Ingres, Oracle, and ...
