Security teams are torn between the quest to encrypt everything and the technical feasibility of doing so. The advantage of encryption is that it obscures data, even after a breach, and satisfies privacy regulations. But it can also obstruct application performance, especially when applied to data in cloud services.
Concerns over government inspection of data, service provider breaches, and insufficient access controls all drive interest in encryption in the cloud. Many companies have internal policies or regulatory compliance standards that require data to be encrypted before it leaves their control, with keys managed by the company rather than the cloud provider. Security teams look for encryption schemes with the strongest possible data protection capabilities. Business and application owners want to preserve the functionality of the underlying cloud applications. So what's the "best" type of encryption?
The Functionality vs. Security Trade-off
A scheme's security is always at odds with functionality in the cloud. No encryption scheme offers full cloud application functionality and performance with unmatched crypto strength. When implementing the strongest security, critical features of SaaS applications may fail. For example, search, document preview, graphically rendered data, and logical operations may break when data is encrypted. In other words, it's possible to secure data to the point where it's no longer useful.
Teams charged with evaluating encryption in the cloud should take a three-step approach:
Let's examine the relative strengths and weaknesses of various encryption approaches.
Regular (Unstructured) Encryption
The primary goals of regular symmetric key encryption are data confidentiality, data integrity, and sender authenticity.
The strongest schemes hide all useful information about the data: the key, the message, any bit of the message, and any function of the message. Schemes can also provide data integrity and sender authenticity, meaning an attacker can't create a valid ciphertext or modify a legitimate ciphertext without the user noticing. Regular encryption should be used for any data that requires the highest security, even at the price of losing search and other functionality.
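The three goals named above (confidentiality, integrity, authenticity) can be illustrated with a minimal encrypt-then-MAC sketch. This is a toy built only from Python's standard library, using HMAC-SHA256 as a keystream generator; it is an assumption made for illustration, not a scheme the article prescribes, and a production system would use a vetted authenticated cipher such as AES-GCM from an audited library. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Expand key and nonce into `length` pseudorandom bytes (HMAC-SHA256 in counter mode)."""
    blocks = []
    for counter in range((length + 31) // 32):
        msg = nonce + counter.to_bytes(8, "big")
        blocks.append(hmac.new(key, msg, hashlib.sha256).digest())
    return b"".join(blocks)[:length]


def encrypt(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Return nonce || ciphertext || tag. Toy illustration only."""
    nonce = secrets.token_bytes(16)  # fresh per message, so equal plaintexts look different
    stream = _keystream(enc_key, nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def decrypt(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    """Verify the tag first, then decrypt; reject anything modified or forged."""
    if len(blob) < 48:
        raise ValueError("ciphertext too short")
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was modified or forged")
    stream = _keystream(enc_key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))
```

Because every message gets a random nonce, two encryptions of the same value differ, which is exactly why the server can no longer match or search on them.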
Selective Encryption
Selective encryption encrypts only the substrings of a larger piece of data that would otherwise violate compliance rules. This category of scheme might be used to encrypt sensitive data to ensure regulatory compliance while leaving other data unencrypted to preserve as much functionality as possible. This method is commonly used to encrypt data within collaborative content-sharing cloud applications, intranets, or extranets where personnel may be working jointly on a project.
Sensitive data fields such as a Social Security number can be encrypted with regular encryption. Assuming one's inspection and identification policy catches all references to the sensitive value, that value ends up fully protected. At the same time, end users may lose search functionality on this data.
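A minimal sketch of selective encryption, under stated assumptions: the detection policy is a single hypothetical regular expression for dash-separated Social Security numbers, and `toy_encrypt` is a stand-in for whatever regular encryption scheme is actually deployed (it provides no integrity protection and is for illustration only).

```python
import base64
import hashlib
import re
import secrets

# Hypothetical detection policy: U.S. Social Security numbers written as NNN-NN-NNNN.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def toy_encrypt(key: bytes, value: str) -> str:
    """Stand-in for a real cipher: XOR with a SHAKE-256 keystream under a random nonce."""
    data = value.encode("utf-8")
    nonce = secrets.token_bytes(16)
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    body = bytes(d ^ s for d, s in zip(data, stream))
    return "ENC[" + base64.b64encode(nonce + body).decode("ascii") + "]"


def protect_text(key: bytes, text: str) -> str:
    """Encrypt only the substrings the policy flags; leave everything else readable."""
    return SSN_PATTERN.sub(lambda match: toy_encrypt(key, match.group(0)), text)
```

Calling `protect_text(key, "Employee 123-45-6789 joined the Q3 project")` leaves the surrounding words searchable while the number is replaced by an opaque `ENC[...]` marker. The same number written without dashes would slip past this pattern, which is the policy-coverage caveat described above.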
Format-Preserving Encryption
Format-preserving encryption (FPE) retains the format of the original text. Using FPE, a company may take a credit card number and encrypt it so that the resulting ciphertext is also a 16-digit number, which is helpful when an application requires a specific format. Typical scenarios requiring format preservation involve protection of credit card numbers, Social Security numbers, and email addresses. With FPE, the application's field validation rules still function correctly while the underlying data remains encrypted.
The price is weaker security. FPE leaks equality: identical plaintexts produce identical ciphertexts, so patterns in the plaintexts show up in the ciphertexts. It also fails to provide data integrity and sender authenticity. Equality leakage allows some forms of statistical attacks, which take advantage of frequency information observed in large sets of ciphertexts to make guesses about plaintexts. So, if attackers know that the most frequent plaintext is "cat," they can look for the ciphertext that arises most frequently in the database and infer that its plaintext is "cat."
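To make both the format preservation and the equality leakage concrete, here is a toy sketch: a small Feistel network over 16-digit strings with HMAC-SHA256 as the round function. It is an assumption made for illustration and is not a standardized FPE mode such as NIST FF1; the function names and round count are hypothetical.

```python
import hashlib
import hmac
from collections import Counter

HALF = 10 ** 8  # each half of a 16-digit number holds 8 decimal digits
ROUNDS = 8


def _round_value(key: bytes, round_no: int, half: int) -> int:
    """Keyed pseudorandom function mapping one 8-digit half to an 8-digit value."""
    msg = bytes([round_no]) + half.to_bytes(4, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % HALF


def _split(number: str):
    if len(number) != 16 or not (number.isascii() and number.isdigit()):
        raise ValueError("expected exactly 16 decimal digits")
    return int(number[:8]), int(number[8:])


def fpe_encrypt(key: bytes, number: str) -> str:
    """Deterministically map a 16-digit string to another 16-digit string."""
    left, right = _split(number)
    for i in range(ROUNDS):
        left, right = right, (left + _round_value(key, i, right)) % HALF
    return f"{left:08d}{right:08d}"


def fpe_decrypt(key: bytes, number: str) -> str:
    """Run the Feistel rounds backwards to recover the original digits."""
    left, right = _split(number)
    for i in reversed(range(ROUNDS)):
        left, right = (right - _round_value(key, i, left)) % HALF, left
    return f"{left:08d}{right:08d}"


def most_frequent_ciphertext(ciphertexts):
    """What a frequency attacker looks at: the ciphertext that occurs most often."""
    return Counter(ciphertexts).most_common(1)[0][0]
```

Since the mapping is deterministic, a card number that appears three times in a table encrypts to the same 16 digits three times, and `most_frequent_ciphertext` picks it out without any knowledge of the key.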
Searchable Encryption
Regular encryption hides data so well that search becomes impossible. But searching on encrypted data is possible if one sacrifices some security. This category of encryption leaks the equality of keywords, enabling statistical attacks based on keyword frequency. Different types of searchable encryption result in different extents of leakage, exposing data to varying levels of risk.
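One simple way to picture keyword-equality leakage is a keyed "blind index," sketched below. This is only one possible construction, chosen as an assumption for illustration: each keyword is replaced by its HMAC-SHA256 value, so the server can match tokens without seeing words. The function names are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict
from typing import Dict, Set


def keyword_token(key: bytes, keyword: str) -> str:
    """Deterministic keyed token: the same keyword always yields the same token."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Client side: map each keyword token to the IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.split():
            index[keyword_token(key, word)].add(doc_id)
    return dict(index)


def search(index: Dict[str, Set[str]], key: bytes, keyword: str) -> Set[str]:
    """The client sends only the token; the server looks it up without seeing the word."""
    return index.get(keyword_token(key, keyword), set())
```

The server never stores the word "merger," but it can see that one token appears in more documents than any other, which is the frequency information a statistical attack starts from.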
Order-Preserving Encryption
Order-preserving encryption (OPE) is a searchable encryption method by which ciphertexts preserve the order of plaintexts. The ability to index, search, and sort encrypted data in external servers gives enterprises flexibility in their use of cloud services. Using OPE, an organization can protect numeric or alphanumeric fields while preserving functionality such as sorting and range queries.
Practitioners should realize that leaking order means other related information is leaked. A worst case for security arises when all possible plaintexts are encrypted: an attacker can sort the ciphertexts and know that the first ciphertext encrypts the first plaintext, the second encrypts the second plaintext, and so on. Even when smaller amounts of data are encrypted, some specific OPE algorithms have been shown to leak up to half of the plaintext. One should tread carefully when considering OPE to protect high-value data.
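The worst case described above can be sketched with a toy order-preserving mapping over a small domain. The construction below (a secret, strictly increasing table built from random gaps) is an assumption made for illustration; it uses Python's non-cryptographic `random` module and is not a published OPE algorithm.

```python
import random


def build_ope_table(seed: int, domain_size: int, max_gap: int = 1000) -> list:
    """Secret, strictly increasing table: plaintext i maps to table[i]."""
    rng = random.Random(seed)  # NOT cryptographically secure; illustration only
    table, current = [], 0
    for _ in range(domain_size):
        current += rng.randint(1, max_gap)  # a positive gap keeps the order intact
        table.append(current)
    return table


def ope_encrypt(table: list, plaintext: int) -> int:
    if not 0 <= plaintext < len(table):
        raise ValueError("plaintext outside the domain")
    return table[plaintext]


def sorting_attack(ciphertexts, domain) -> dict:
    """If every domain value is encrypted exactly once, rank alone reveals the plaintext."""
    return dict(zip(sorted(ciphertexts), sorted(domain)))
```

A server holding these ciphertexts can answer a range query such as "ages 30 to 40" by comparing numbers, which is the functionality OPE buys. The same ordering lets `sorting_attack` recover every plaintext when the whole domain (for example, all ages from 0 to 119) is present, without ever touching the secret table.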
Tokenization
Tokenization creates a token for each plaintext value, stores the data and its tokens locally, and then passes only the tokens to the cloud application. This approach preserves a great deal of application functionality, such as searching for keywords.
This method works well for satisfying compliance rules for data residency. The security drawbacks are similar to those of searchable encryption. The local store of data and corresponding tokens must itself be protected. Users must also have access to the tokenization database, potentially causing issues for remote or mobile users.
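A minimal sketch of the tokenization flow, assuming an in-memory vault; a real deployment would keep this mapping in a protected, persistent local store. The class and method names are hypothetical.

```python
import secrets


class TokenVault:
    """Local mapping between sensitive values and random tokens (toy, in-memory)."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        """Return the existing token for a value, or mint a new random one."""
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            while token in self._value_by_token:  # regenerate on the rare collision
                token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token: str) -> str:
        """Only someone with access to the vault can map a token back."""
        return self._value_by_token[token]
```

Because a value always maps to the same token, a user can search the cloud application by tokenizing the search term first. That same consistency means equal values are visible as equal tokens in the cloud, which is why the drawbacks resemble those of searchable encryption. Anyone without access to the vault, such as a remote user, can neither tokenize a query nor read results.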
Fully Homomorphic Encryption
In theory, fully homomorphic encryption (FHE) lets the client ask the server to evaluate any function over encrypted data, such as running a search or computing the average of all encrypted numbers in a database field, without the server learning anything about the data. While the theory is appealing, higher-level operations and real-world functionality are many years away. Even when FHE becomes feasible to use, linear search times are likely to be unacceptable for large databases.
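The "average of encrypted numbers" idea can be illustrated with a much weaker relative of FHE. The sketch below is a toy Paillier cryptosystem, which is only additively homomorphic: the server can add encrypted numbers but cannot evaluate arbitrary functions. The choice of Paillier, the tiny primes, and the function names are all assumptions made for illustration; real keys use primes of 1024 bits or more. It requires Python 3.8+ for the modular inverse via `pow`.

```python
import math
import secrets

# Toy key material: two small, well-known primes (far too small for real use).
P, Q = 104729, 1299709
N = P * Q
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)  # decryption constant when the generator is N + 1


def encrypt(m: int) -> int:
    """Client side: randomized encryption of an integer 0 <= m < N."""
    if not 0 <= m < N:
        raise ValueError("message out of range")
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Client side: requires the secret values LAMBDA and MU."""
    u = pow(c, LAMBDA, N_SQ)
    return ((u - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Server side: multiplying ciphertexts adds the hidden plaintexts."""
    return (c1 * c2) % N_SQ
```

In this sketch the server folds `add_encrypted` over a column of ciphertexts and returns one encrypted total; the client decrypts it and divides by the row count to get the average. Note that the server still has to touch every row, which is the linear cost mentioned above.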
Conclusion
Security teams need to communicate the trade-offs that come with technology decisions, and encryption is a prime example. While security suggests use of regular encryption for as much data as possible, functionality and legacy constraints may impede this. Newer approaches such as OPE and searchable encryption can potentially satisfy requirements when data can't be left in the clear. In the end, practitioners must weigh the trade-offs between security and functionality to arrive at the best implementation for their needs.