

9/20/2012
12:06 PM
Adrian Lane
Commentary

A Look At Encrypted Query Processing

Stupid encryption tricks, only without a funny YouTube video

Encrypting data is one of the most basic -- and most effective -- data security measures we have at our disposal. But when used with relational databases, encryption creates two major problems.

The first problem is that relational databases require you to define the data type prior to storage. VARCHAR() is a common database data type for storing application data, but it requires a pre-defined size. Encryption algorithms typically output binary data whose length differs from that of the original value and depends on the cipher, mode, and padding used. This creates a mismatch that requires redefining, and in most cases rebuilding, the database to accommodate encrypted data. The second and more serious issue is that you cannot perform queries or functions on encrypted data. You can't check date ranges or make comparisons inside the database when data is encrypted. And you can't effectively use indexes to sort and manage data either.
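
To make the sizing mismatch concrete, here is a rough Python sketch of how much column space a value might need once it is encrypted. It assumes AES in CBC mode with PKCS#7 padding, a 16-byte IV stored alongside the ciphertext, and Base64 encoding so the result fits in a character column. None of those choices come from this article, and `stored_length` is a made-up helper name. The length is predictable once those choices are fixed, but it rarely matches the size the column was originally given.

```python
import math

AES_BLOCK = 16  # AES block size in bytes


def stored_length(plaintext_len: int, iv_len: int = 16, base64: bool = True) -> int:
    """Estimate bytes needed to store one encrypted value (hypothetical helper).

    Assumes AES-CBC with PKCS#7 padding and the IV stored with the ciphertext.
    """
    # PKCS#7 always adds 1..16 bytes, so a whole extra block is added when
    # the plaintext is already block-aligned.
    padded = (plaintext_len // AES_BLOCK + 1) * AES_BLOCK
    raw = iv_len + padded
    if base64:
        # Base64 emits 4 characters per 3 input bytes, rounded up.
        return 4 * math.ceil(raw / 3)
    return raw


if __name__ == "__main__":
    # Columns: plaintext bytes, raw encrypted bytes, Base64 characters
    for n in (10, 16, 50):
        print(n, stored_length(n, base64=False), stored_length(n))
```

Under these assumptions a 10-character value no longer fits in a VARCHAR(10) column; it needs 32 raw bytes, or 44 characters once Base64-encoded.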

There are several ways encryption is employed today to address these issues, most commonly a) using a form of transparent encryption or b) encrypting at the application layer. With transparent encryption, data stored on disk is encrypted but processed inside the database in clear text. With encryption at the application layer, the app decrypts and processes data locally and uses the database purely as a place to store data.

But what if you don't trust the DBA? Or you just don't trust your cloud service provider? Worse, what if you think the database engine may be compromised by an attacker? I came across a post on Werner Vogels' blog, "Back-to-the-Future Weekend Reading - CryptDB," where he discusses a research paper on processing encrypted data within a relational database. The idea presented in the paper is "SQL-aware Encryption." The goal is to keep data protected even if the database server and app server have been compromised. The authors' approach is to provide encryption that still allows normal relational database functions to work.

What does this mean? It means comparisons of two encrypted values, like "=" or ">", would work on encrypted data. Database functions and most comparison operations would continue to work in the scheme being described. SQL queries of the most common types would continue to work as before, so you get full database functionality on encrypted data. That sounds ideal, right? Not so fast.

The concept the authors are trying to duplicate is homomorphic encryption. But there is no true homomorphic encryption available commercially today. What they are in fact doing is using "off-the-shelf" encryption algorithms like AES, only without initialization vectors or nonces to randomize the output of the block cipher. That means when you encrypt the word "SELECT" with a specific key, you get the same binary result every time.
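
A small Python sketch shows why dropping the randomization matters. It is not AES and not CryptDB's code: to stay within the standard library it uses HMAC-SHA256 as a keyed, deterministic stand-in for a block cipher run without an IV, and a random prefix as a stand-in for an IV or nonce. The key and function names are invented for illustration.

```python
import hashlib
import hmac
import os

KEY = b"demo-key-not-for-production"  # hypothetical key for illustration


def deterministic_token(value: str, key: bytes = KEY) -> bytes:
    """Stand-in for a cipher used without an IV: same input, same output."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()


def randomized_token(value: str, key: bytes = KEY) -> bytes:
    """Stand-in for IV/nonce use: a fresh random prefix changes every output."""
    nonce = os.urandom(16)
    tag = hmac.new(key, nonce + value.encode("utf-8"), hashlib.sha256).digest()
    return nonce + tag


if __name__ == "__main__":
    # Deterministic output lets the database evaluate "=" on protected values.
    print(deterministic_token("SELECT") == deterministic_token("SELECT"))
    # Randomized output hides repeats, so the same comparison no longer matches.
    print(randomized_token("SELECT") == randomized_token("SELECT"))
```

The property that makes equality checks possible inside the database is the same property that lets an observer see when two stored values are identical.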

And that makes it a lot easier to guess the encrypted values! Keep in mind that SQL queries have a common structure and a finite set of elements. It's fairly easy to pre-compute encrypted values for the words SELECT, FROM, WHERE, MAX, ORDER BY, GROUP BY, DISTINCT, etc. If all data stored under Bob's schema is encrypted with Bob's single key, text values can be guessed by their frequency of occurrence.
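
The frequency-of-occurrence guess can be sketched in a few lines of Python. This is an illustrative toy, not an attack on any specific product: the column contents are made up, HMAC-SHA256 again stands in for deterministic (IV-less) encryption, and the attacker is assumed to know only the rough popularity order of the plaintext values.

```python
import hashlib
import hmac
from collections import Counter


def deterministic_token(value: str, key: bytes) -> str:
    """Keyed, deterministic stand-in for IV-less encryption (not real encryption)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def guess_by_frequency(tokens: list[str], expected_order: list[str]) -> dict[str, str]:
    """Pair the most common token with the most common expected plaintext, and so on."""
    ranked = [tok for tok, _ in Counter(tokens).most_common()]
    return dict(zip(ranked, expected_order))


if __name__ == "__main__":
    secret_key = b"bobs-single-key"  # never seen by the attacker
    column = ["CA"] * 5 + ["NY"] * 3 + ["WY"] * 1  # made-up data
    encrypted_column = [deterministic_token(v, secret_key) for v in column]

    # The attacker works only from the tokens and a public popularity ranking.
    guesses = guess_by_frequency(encrypted_column, ["CA", "NY", "WY"])
    recovered = [guesses[t] for t in encrypted_column]
    print(recovered == column)
```

The key is never used on the attacker's side; counting repeated tokens is enough when the value distribution is skewed and roughly known.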

So what's going on here is that we are giving up a degree of the security encryption provides in exchange for database functionality, while still making it harder for an attacker to steal sensitive information should they compromise the database server, the application server, or both. The degree of security is inversely related to the level of utility: The more complex the query operation supported, the less secure the encryption variant. The data won't be sitting in clear text where a malicious party can steal it. However, if the host platform has been compromised, your data is still subject to several types of attack. Most likely, an attacker will conduct word-frequency attacks and guess the contents of the database -- with a reasonable degree of accuracy. It's more security, but a 'speed bump' rather than a barrier.

The lesson here is that there is no free lunch. If you want strong crypto to preserve the privacy and integrity of data for long periods of time, some of the variations described in CryptDB will not be a good option. CryptDB will -- as the paper posits -- raise the bar on data privacy while allowing the relational database platform to still function. There are several small commercial vendors that offer this type of technology today -- with the same basic methods and the same basic flaws. But if you have a database environment you suspect will be compromised, there are better technologies available. Use tokenization or masking to create non-sensitive random copies that also preserve data value and database operations. Those technologies completely remove the risk without the performance penalty or complexity.
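
As a rough illustration of the tokenization idea, the Python sketch below swaps each sensitive value for a random token and keeps the mapping in a separate vault. `TokenVault` is a hypothetical in-memory toy, not any vendor's design; a real product would persist and protect the mapping outside the database it is meant to shield.

```python
import secrets


class TokenVault:
    """Toy in-memory token vault (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._token_for: dict[str, str] = {}
        self._value_for: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so joins and equality checks still line up.
        if value in self._token_for:
            return self._token_for[value]
        token = secrets.token_hex(8)  # random, so it is not derived from the value
        while token in self._value_for:  # vanishingly unlikely collision
            token = secrets.token_hex(8)
        self._token_for[value] = token
        self._value_for[token] = value
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; only the vault can do this."""
        return self._value_for[token]


if __name__ == "__main__":
    vault = TokenVault()
    token = vault.tokenize("4111-1111-1111-1111")  # made-up card number
    print(token != "4111-1111-1111-1111")
    print(vault.detokenize(token) == "4111-1111-1111-1111")
```

Because the token is random rather than computed from the value with a key, there is nothing in the database copy to decrypt. Reusing one token per value is a design choice that keeps joins working; a vault that issued a fresh token for every row would hide repeats but break those operations.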

Adrian Lane is an analyst/CTO with Securosis LLC, an independent security consulting practice. Special to Dark Reading. Adrian Lane is a Security Strategist and brings over 25 years of industry experience to the Securosis team, much of it at the executive level. Adrian specializes in database security, data security, and secure software development. With experience at Ingres, Oracle, and ...

 
