Over the last decade we meticulously taught ourselves how to collect, store, and process big data. Now, the next challenge is to get rid of this data.
The General Data Protection Regulation (GDPR), with its sweeping mandates for protecting personal data, was a wake-up call for businesses across the board that they needed to exercise greater control over many aspects of their data processing practices. The California Consumer Privacy Act followed suit, and upcoming privacy laws around the world will likely continue the trend.
Regulations around how data is used, data retention time frames, and data subjects' right to be forgotten all necessitate particular attention to data destruction. In the good old days, incinerating backup tapes or shredding a few hard drives would have solved the problem. Today, we have a bigger challenge on our hands.
We now work with complex, massively distributed computing environments. The resources we directly control are often spread across the globe, and the rest live in some external organization's opaque cloud. System components interact in complex (and sometimes unexpected) ways, forming both explicit and implicit data flows between them. The challenge is to track down where exactly data is before we can even start thinking about how to destroy it.
Cryptographic erasure roughly means encrypting the data first and, when it is time to delete it, destroying the encryption key instead of the data itself. Under the computational assumption that the underlying cryptographic primitives cannot be broken (and we can all agree that cryptography is the strongest link in a secure system), the data can never be decrypted again without the key. It is as good as deleted.
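The encrypt-then-discard-the-key idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the `encrypt` and `decrypt` helpers are hypothetical names, and the toy cipher built from SHAKE-256 and HMAC is only a standard-library stand-in for a vetted authenticated cipher such as AES-GCM.

```python
import hashlib
import hmac
import secrets

NONCE_LEN, TAG_LEN = 16, 32

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream derived with SHAKE-256. Assumption: stand-in for a vetted cipher.
    return hashlib.shake_256(key + nonce).digest(length)

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    # The tag lets decrypt() detect a wrong key or tampered data.
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted data")
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

# 1. Encrypt first: only ciphertext is ever written to storage.
record = b"name=Alice; email=alice@example.com"
key = secrets.token_bytes(32)
stored_blob = encrypt(key, record)
assert decrypt(key, stored_blob) == record

# 2. "Delete" by discarding the key; the ciphertext can stay where it is.
#    (Dropping a Python reference is only symbolic. A real system would
#    destroy the key inside its key store or hardware module.)
key = None

# 3. Any other key fails the integrity check, so the record is unrecoverable.
try:
    decrypt(secrets.token_bytes(32), stored_blob)
    recovered = True
except ValueError:
    recovered = False
```

Note that the stored ciphertext is never touched in step 2; the deletion consists entirely of losing the key.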
Many readers will be familiar with the term from the recent NIST and ISO guidelines that recommend it as a secure data destruction technique. Storage media vendors have also been promoting cryptographic erasure as a faster alternative to traditional data destruction mechanisms. For example, self-encrypting drives on the market can refresh the key stored in their onboard controller, instantaneously rendering the contents unreadable.
In reality, however, this idea dates all the way back to 1996, when it was first publicly proposed by Dan Boneh and Richard Lipton. In their paper titled "A Revocable Backup System," presented at the USENIX Security Symposium, the authors describe a tape backup scheme in which backed-up data is encrypted with a periodically refreshed key. Every time the key changes, old backups are lost without requiring any modifications to the tape itself, analogous to modern self-encrypting drives.
So, how does this apply to our times and solve the problem of tracking data in and across complex computing environments? All of the previous examples focus on the use of cryptographic erasure as an efficient way to destroy all content on a given physical storage medium. However, let's take a step back and get a better view of the general principle behind the idea.
Cryptographic Erasure: Two Useful Properties
First, unlike in the previous scenarios, we do not need to restrict ourselves to using a single key that encrypts an entire drive or data set. Instead, we can have as many unique keys as we need, encrypting data at the granularity that serves our purposes. For example, a cloud service provider may decide to assign a unique key to each of its customers, allowing it to selectively destroy a specific customer's data when necessary. Alternatively, the provider may choose to partition the data at a finer granularity: a unique key per user, per file, or even per database entry. The possibilities and business applications are immense.
Second, cryptographic erasure entirely bypasses the issue of tracking data flows. Whether the data resides in a remote data center, in someone else's cloud, or in a long-forgotten tape archive is irrelevant. The encrypted data is always bound to the encryption key, and it is sufficient to know where our keys are to be able to destroy all instances of our data.
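Both properties can be seen together in a small sketch, again under loud assumptions: `KeyStore`, the customer names, and the storage locations are hypothetical, and the toy SHAKE-256/HMAC cipher is a standard-library stand-in for a vetted authenticated cipher. Each customer gets its own key, ciphertext copies are scattered across several locations, and destroying one key makes every copy of that customer's data unreadable while leaving other customers untouched.

```python
import hashlib
import hmac
import secrets

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    # Toy authenticated cipher. Assumption: stand-in for something like AES-GCM.
    nonce = secrets.token_bytes(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted data")
    stream = hashlib.shake_256(key + nonce).digest(len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

class KeyStore:
    """Hypothetical in-memory key store holding one key per customer."""

    def __init__(self) -> None:
        self._keys: dict[str, bytes] = {}

    def create_key(self, customer_id: str) -> None:
        self._keys[customer_id] = secrets.token_bytes(32)

    def get_key(self, customer_id: str) -> bytes:
        return self._keys[customer_id]  # raises KeyError once destroyed

    def destroy_key(self, customer_id: str) -> None:
        self._keys.pop(customer_id, None)

keys = KeyStore()
keys.create_key("acme")
keys.create_key("globex")

acme_blob = encrypt(keys.get_key("acme"), b"acme customer records")
globex_blob = encrypt(keys.get_key("globex"), b"globex customer records")

# Ciphertext copies end up in places we may not even be tracking.
replicas = {
    "primary_datacenter": acme_blob,
    "third_party_cloud": acme_blob,
    "tape_archive": acme_blob,
}

def readable(customer_id: str, blob: bytes) -> bool:
    try:
        decrypt(keys.get_key(customer_id), blob)
        return True
    except (KeyError, ValueError):
        return False

# One key destruction covers every replica, wherever it lives.
keys.destroy_key("acme")
acme_readable = [readable("acme", blob) for blob in replicas.values()]
globex_readable = readable("globex", globex_blob)
```

The erasure step never consults `replicas`: knowing where the key lives is enough, which is the point of the second property. The same structure would work with a key per user, file, or database entry by changing what the store is indexed on.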
Unfortunately, there is no silver bullet in security, and this is no exception. A prerequisite for this scheme to work is that all sensitive data must be encrypted at all times. (Maybe that is a good thing!) This implies a computational overhead for cryptographic operations, but more importantly, the decision to incorporate cryptographic erasure into a system is probably best considered at early architectural design stages. Integration into legacy systems may be difficult and error-prone.
Furthermore, as with every cryptographic system, storage and distribution of keys become a prime concern, especially with very fine-grained data partitioning schemes that could require large numbers of keys. This would necessitate building an appropriate key management infrastructure, a task with which security professionals often have a love-hate relationship.
Cryptographic erasure is a powerful technique that can address emerging data destruction challenges, especially in the face of stringent privacy laws, where traditional approaches remain impractical. Security professionals should take advantage of this tool in their arsenal, understand its trade-offs, and recognize that cryptographic erasure can have advanced applications beyond wiping hard drives.