Cloud

5/21/2015
10:30 AM
50%
50%

Data Encryption In The Cloud: Square Pegs In Round Holes

Conventional encryption is a surefire solution for protecting sensitive data -- except when it breaks cloud applications. "Format-preserving" encryption could change all that.

Faced with the security concerns of storing sensitive data in the cloud, many organizations’ first response is, “let’s encrypt everything.” Encryption can be necessary to satisfy data residency regulatory requirements and can help prevent disclosure of data for a blind subpoena. Even more important, for enterprises, encryption can provide peace of mind in the event of a data breach. The bad news is that traditional encryption techniques can also pose limitations to the functionality of cloud applications. I call this the “Square-Pegs-Round-Holes” problem.

The client-side encryption challenge: structured data
Existing cloud data APIs most often expect data to come in certain formats. As an example, credit cards are 16-digit strings. These are the round holes of provider APIs. Data encrypted under conventional schemes ends up random, unformatted bit strings; the square pegs of our analogy. The result is that using conventional encryption means we have encrypted data that simply can’t replace plaintext data in requests made to the cloud. As any toddler knows, square pegs just don’t fit into round holes.

But actually the situation is quite a bit worse than all that, as every type of sensitive data comes with its own format. Not only do credit card numbers have to be 16-digit strings, but salaries must be positive integer numbers, emails must be alphanumeric strings with an ‘@’ character, a domain name, and a TLD like ‘.com’, and so much more. So it’s not just that square pegs must fit into round holes, but also stars, triangles, pentagons, rhombuses, and so on.

Even if we solve this issue for one target format — say credit card numbers — this most often won’t help us figure out how to fit ciphertexts into any of the hundreds of other formats we might encounter. And from engineers at leading security companies, I hear about them regularly having to handle new formats when working with their customers.

Flipping the problem around
This is where a line of research I’ve been pursuing, together with a number of fantastic colleagues, comes in. Back in 2009, we flipped the problem around. Forget about square pegs, and change encryption to be a primitive that allows easy specification of the target format — the ciphertexts produced by encryption should come out already formatted in the right “shape.” We called this “format-preserving encryption” (FPE), a name first coined by Terence Spies of HP (formerly Voltage Security).

Indeed, Terence and others already had identified this problem and even had ideas about how one might build FPE. We worked out some schemes, did some theoretical analyses of their security, and eventually some work by my co-authors and Terence led to the widely used FFX mode that handles a pretty wide range of formats such as credit card numbers, small integers, and more. But it doesn’t handle all formats. For example, using FFX to encrypt email addresses in a way that passes the checks mentioned above would be complex and error-prone.

A big insight in that original 2009 paper was that, in principle, one could build FPE schemes that themselves expose a simple way of specifying expressive formats. In practice, a regular expression would be a convenient way of defining a format. For example, [a-z0-9]*@[a-z0-9]*.[com|edu|net] would express a format handling many kinds of email addresses.

But at that point we seemed to hit a stumbling block: the most natural way of building FPE schemes that accept a regular expression as format description is in general inefficient. Strikingly inefficient, in fact, being that it requires an algorithm for efficiently ranking the language defined by a regular expression, which is known to be very difficult. In a sequence of works we developed a number of improvements that make FPE work efficiently for regular expressions, as well as context-free grammars. The cryptographic strength of these schemes arises from their use of FFX (or equally secure variants), which is used within them.

We also built more general algorithms, realizing what we call “format-transforming encryption” (FTE). With FPE the plaintext is, itself, formatted as the ciphertext should be, but with FTE it could be that the input and output formats are different.

Making it easy for users
In the end, the resulting encryption algorithms are not only secure, but solve the key usability issues of making it easy to specify a “peg size.” Innovative security vendors offer the ability to specify regular expressions to allow fast prototyping of formats at customer sites, and to greatly decrease the cost of developing new encryption engines. In most encryption products for structured data, each different type of field needs its own encryption engine. This is time-consuming, complex, and error-prone. With FPE, the process of creating a new encryption engine is as simple as picking a regular expression, which describes the field. Thus, creating a new encryption engine is something that any developer can do seamlessly. This allows them to quickly adapt to the particulars of different cloud services.

FTE and FPE with easy-to-program “pegs” are of interest beyond their use by security companies, or even beyond encryption for cloud services. For example, we’ve shown how FTE schemes supporting regular expressions are useful in helping combat Internet censorship. The idea is to encrypt network packets so that the resulting ciphertexts match against the regular-expression-based filters used by censors to identify protocols they whitelist as benign (e.g., plaintext HTTP). In fact, an implementation of FTE for regular expressions by Kevin Dyer is deployed with Tor (a tool frequently used to avoid censorship).

It’s gratifying to see emerging security technologies bring these types of academic breakthroughs to the cloud security market. The intention is that with more functional encryption capabilities, companies will be able to enable cloud services for a wider range of use cases.

Thomas Ristenpart is on the Skyhigh Networks Cryptography Advisory Board and is a professor in the Department of Computer Sciences at the University of Wisconsin. His research spans a wide range of computer security topics, most recently focusing on new threats to, and ... View Full Bio
Comment  | 
Print  | 
More Insights
Comments
Newest First  |  Oldest First  |  Threaded View
High Stress Levels Impacting CISOs Physically, Mentally
Jai Vijayan, Freelance writer,  2/14/2019
Valentine's Emails Laced with Gandcrab Ransomware
Kelly Sheridan, Staff Editor, Dark Reading,  2/14/2019
Making the Case for a Cybersecurity Moon Shot
Adam Shostack, Consultant, Entrepreneur, Technologist, Game Designer,  2/19/2019
Register for Dark Reading Newsletters
White Papers
Video
Cartoon
Current Issue
5 Emerging Cyber Threats to Watch for in 2019
Online attackers are constantly developing new, innovative ways to break into the enterprise. This Dark Reading Tech Digest gives an in-depth look at five emerging attack trends and exploits your security team should look out for, along with helpful recommendations on how you can prevent your organization from falling victim.
Flash Poll
How Enterprises Are Attacking the Cybersecurity Problem
How Enterprises Are Attacking the Cybersecurity Problem
Data breach fears and the need to comply with regulations such as GDPR are two major drivers increased spending on security products and technologies. But other factors are contributing to the trend as well. Find out more about how enterprises are attacking the cybersecurity problem by reading our report today.
Twitter Feed
Dark Reading - Bug Report
Bug Report
Enterprise Vulnerabilities
From DHS/US-CERT's National Vulnerability Database
CVE-2019-8980
PUBLISHED: 2019-02-21
A memory leak in the kernel_read_file function in fs/exec.c in the Linux kernel through 4.20.11 allows attackers to cause a denial of service (memory consumption) by triggering vfs_read failures.
CVE-2019-8979
PUBLISHED: 2019-02-21
Koseven through 3.3.9, and Kohana through 3.3.6, has SQL Injection when the order_by() parameter can be controlled.
CVE-2013-7469
PUBLISHED: 2019-02-21
Seafile through 6.2.11 always uses the same Initialization Vector (IV) with Cipher Block Chaining (CBC) Mode to encrypt private data, making it easier to conduct chosen-plaintext attacks or dictionary attacks.
CVE-2018-20146
PUBLISHED: 2019-02-21
An issue was discovered in Liquidware ProfileUnity before 6.8.0 with Liquidware FlexApp before 6.8.0. A local user could obtain administrator rights, as demonstrated by use of PowerShell.
CVE-2019-5727
PUBLISHED: 2019-02-21
Splunk Web in Splunk Enterprise 6.5.x before 6.5.5, 6.4.x before 6.4.9, 6.3.x before 6.3.12, 6.2.x before 6.2.14, 6.1.x before 6.1.14, and 6.0.x before 6.0.15 and Splunk Light before 6.6.0 has Persistent XSS, aka SPL-138827.