9/24/2013
01:50 PM
Adrian Lane
Commentary

The Big Data Is The New Normal

Big data, not relational, is the new platform of choice

I get a lot of questions on big data. What is it? How are people using it? How do you secure it? How do I leverage it? I've been on the phone with three different journalists in the past couple of weeks talking about what security analytics with big data really means. Be it journalists, security professionals, IT, or management, big data is relatively new to the mainstream practitioner, so the questions are not particularly surprising.

What is surprising is that just about every new database installation or project I hear about sits atop a big data foundation. The projects focus on data, looking at new ways to mine data for interesting information. From retail-buying trends, to weather analysis, to security intelligence, these platforms are the direction the market is heading. And it's because you can Hadoop. Cassandra. Mongo. Whatever. And it's developer-driven -- not IT or DBA or security. Developers and information architects specify the data management engine during their design phase. They are in the driver's seat. They are the new "buying center" for database security products.

Since the bulk of the questions I get are now focused on big data, I am going to begin shifting coverage a bit to cover more big data topics and trends. And I'll spend some time addressing the questions I am getting about security and uses for big data. Yes, I will continue coverage of interesting relational security as I get questions or new trends develop, but because most of you are asking about big data, I'm going to rebalance coverage accordingly.

And to kick it off, today I want to address a specific, critical point: Big data is all about databases. But rather than a "relational" database, which has a small number of defining characteristics, these databases come in lots of different configurations, each assembled to address a specific use case. Calling this trend "big data" is even a disservice to the movement that is under way. The size of the data set is about the least interesting aspect of these platforms. It's time to stop thinking about big data as big data and start looking at these platforms as the next logical step in data management.

What we call "big data" is really a building-block approach to databases. Rather than the prepackaged relational systems we have grown accustomed to during the past two decades, we now assemble different pieces (data management, data storage, orchestration, etc.) together in order to fit specific requirements. These platforms, in dozens of different flavors, have more than proved their worth and no longer need to escape the shadow of relational platforms. It's time to simply think of big data as modular databases.

The key here is that these databases are fully customizable to meet different needs. Developers for the past decade have been starting with relational and then stripping it of unneeded parts and tweaking it to get it to work the way they want. Part of MySQL's appeal in the development community was the ability to change some parts to suit the use case, but it was still kludgy. With big data it's pretty much game on for pure customization. Storage model, data model, task management, data access, and orchestration are all variable. Want a different query engine? No problem, you can run SQL and non-SQL queries on the same data. It's just how you bundle it. Hadoop and Cassandra come with "stock" groupings of features, but most developers I speak with "roll-their-own" infrastructure to suit their use case.
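To make the "different query engine, same data" point concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular platform does it: the sample records, the field names, and both function names are assumptions invented for this example, and an in-memory SQLite table stands in for a SQL engine layered over shared storage.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample records; the fields are assumptions for illustration.
EVENTS = [
    {"username": "alice", "action": "login", "bytes": 120},
    {"username": "bob", "action": "upload", "bytes": 4096},
    {"username": "alice", "action": "upload", "bytes": 2048},
]


def bytes_per_user_sql(events):
    """SQL path: expose the records as a table and aggregate with GROUP BY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE events (username TEXT, action TEXT, bytes INTEGER)"
        )
        # Named placeholders let us bind each dict record directly.
        conn.executemany(
            "INSERT INTO events VALUES (:username, :action, :bytes)", events
        )
        rows = conn.execute(
            "SELECT username, SUM(bytes) FROM events GROUP BY username"
        ).fetchall()
        return dict(rows)
    finally:
        conn.close()


def bytes_per_user_mapreduce(events):
    """Non-SQL path: a map step emits (key, value) pairs, a reduce step sums them."""
    mapped = ((e["username"], e["bytes"]) for e in events)  # map
    totals = defaultdict(int)
    for username, size in mapped:  # reduce
        totals[username] += size
    return dict(totals)


if __name__ == "__main__":
    print(bytes_per_user_sql(EVENTS))
    print(bytes_per_user_mapreduce(EVENTS))
```

Both functions answer the same question from the same records. The only thing that changes is the access layer bundled on top, which is the sense in which the query engine is just another swappable block.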

But just as importantly, they work! This is not a fad. These platforms are not going away. It is not always easy to describe what these modular databases look like, as they are as variable as the applications that use them, but they have a set of common characteristics. And one of those characteristics, as of this writing, is the lack of security. I'll be going into a lot more detail in the coming weeks. Till then, call them modular databases or database 3.0 or whatever -- just understand that "NoSQL" and "Big Data" fail to capture what's going on.

Adrian Lane is an analyst/CTO with Securosis LLC, an independent security analyst firm. Special to Dark Reading. Adrian Lane is a Security Strategist and brings over 25 years of industry experience to the Securosis team, much of it at the executive level. Adrian specializes in database security, data security, and secure software development. With experience at Ingres, Oracle, and ...

Comments
TechGuy1313
User Rank: Apprentice
9/27/2013 | 8:45:38 PM
re: The Big Data Is The New Normal
Thanks, Adrian.

The term Big Data is bandied about so often that it is heading into buzzword Hall of Fame territory. I find that so many focus on the "Big" part of the phrase and don't consider the "4 Vs," if you will (Volume, Velocity, Variety, and Veracity), as a whole.

Wanted to share real quick a video (http://www.youtube.com/watch?v...) I saw that not only speaks to IT's role in Big Data initiatives but also delivers it via at least 8 sci-fi references. The video is based on research but talks more about ways to approach and plan for new large data projects.
Ulf Mattsson
User Rank: Moderator
9/26/2013 | 9:58:21 PM
re: The Big Data Is The New Normal
As the article mentions, there is currently a lack of "stock" security in big data platforms.

However, although inherent security is lacking, security vendors now provide many options to protect big data, from coarse-grained approaches, such as volume or file encryption, to fine-grained methods, such as masking and Vaultless Tokenization. Much in the same way that the platforms themselves can be assembled in different configurations, each security approach may have particular usefulness for a particular customized big data model. For example, in a storage model, you may only require coarse-grained security, while a data access model may require very fine-grained security with options to expose business intelligence.

Obviously, in cases such as the ones the article describes with "roll-their-own" infrastructure, it's necessary to consider that each "building block" may require a different security method, in order to ensure that the data is protected throughout the environment, and not just in one or two of the components. This can, and does, create issues where developers fail to consider (or are not realistically able to foresee) what may be coming down the road in this exploding new field. So for many, it is difficult to reconcile this building block database with comprehensive data security, and this most likely plays a large part in why security has thus far been lacking.

But despite the lack of "stock" security in big data environments, there is hope. Data security vendors have been keen to develop new solutions to meet these challenges, and continue to innovate along with this exciting and ever-expanding new platform. I'm sure Adrian will have plenty to say on this subject in the coming weeks.

Ulf Mattsson, CTO Protegrity