Two technologies, homomorphic encryption and federated learning, could allow companies and researchers to collaborate on data analyses and the creation of machine-learning models without actually exposing data to leakage risks, according to a pair of Intel researchers who spoke at the company's virtual labs event on Thursday.
Federated learning allows collaborators to start with a single common machine-learning model, train that model on their own in-house data, and then securely collect and combine the now-divergent models to create a more accurate iteration that incorporates data from all participants. Homomorphic encryption is a more general approach, the product of a specialized field of cryptography that focuses on protecting data while still allowing calculations, such as searches or the training of machine-learning algorithms, to be performed on the encrypted data. The approach protects privacy while preserving the usefulness of the information.
Intel has doubled down on both technologies, building support for them into its hardware using the Software Guard Extensions (SGX). The result will be less expensive applications of the two technologies, said Jason Martin, principal engineer with the secure intelligence team at Intel, in a separate interview with Dark Reading.
"Unprocessed data is data that's not useful," he said. "The primary tools that we have for making sense of the increasing volumes of data is machine learning and statistics technologies, but companies are worried about sharing that data because of security and privacy concerns."
Intel's research and plans for the technology were revealed by Martin and Rosario Cammarota, a principal engineer focused on computing on encrypted data at Intel, at Intel Labs Day.
Finding ways to share and analyze data securely has become a major research issue. This year, a multidisciplinary group of researchers from the Massachusetts Institute of Technology created a system that uses privacy-preserving encryption to allow companies to share information on breaches without revealing the actual data. While some companies, such as Duality and Enveil, have focused on specific security-focused uses of homomorphic encryption, Intel hopes to broaden the possibilities by including support in its chips.
"This is the time where a lot of the advances of which we are hearing needs to meet the applied science, and that is where our exploration comes into place," Cammarota said. "There needs to be more theoretical advances and standardization, which we are taking part in."
Technologies like federated learning and homomorphic encryption allow companies to collaborate without giving up control of their data, Cammarota said.
Federated learning solves two problems. The first is the data silo problem: information often cannot be shared because of privacy concerns, intellectual-property considerations, or regulatory regimes. The second, more practical issue is the size of the data set: bandwidth limitations can keep companies from directly sharing large data sets for training a machine-learning model in a central location.
Industries such as healthcare and financial services are looking at federated learning as a way to collaborate without violating privacy rules or releasing sensitive information. The University of Pennsylvania used federated learning to train machine-learning models to recognize brain tumors using a variety of siloed data sets. The federated training led to a model that performed 17% better, Martin said.
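To make the idea concrete, here is a minimal sketch of one common way to combine separately trained models: each silo trains on its own data, and an aggregator averages the resulting parameters, weighted by how many samples each silo holds (often called federated averaging). This is an illustration only, not the method used by Intel or the University of Pennsylvania, which the researchers did not detail. The function names (`local_train`, `aggregate`, `federated_training`, `make_silo`), the tiny linear model, and the synthetic data are all assumptions made for the example.

```python
import random

def local_train(weights, data, lr=0.1, epochs=20):
    """Gradient descent on one silo's private (x, y) pairs; raw data never leaves."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def aggregate(local_models, sizes):
    """Combine silo models into one global model, weighted by sample count."""
    total = sum(sizes)
    w = sum(m[0] * s for m, s in zip(local_models, sizes)) / total
    b = sum(m[1] * s for m, s in zip(local_models, sizes)) / total
    return (w, b)

def federated_training(silos, rounds=30):
    """Each round: send the global model out, train locally, average the results."""
    global_model = (0.0, 0.0)
    for _ in range(rounds):
        local_models = [local_train(global_model, data) for data in silos]
        global_model = aggregate(local_models, [len(d) for d in silos])
    return global_model

def make_silo(n, seed):
    """Hypothetical private data set: noisy samples of y = 3x + 1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        data.append((x, 3.0 * x + 1.0 + rng.gauss(0, 0.05)))
    return data

# Three "hospitals" with different amounts of data.
silos = [make_silo(40, 1), make_silo(60, 2), make_silo(25, 3)]
w, b = federated_training(silos)
print(f"global model: y = {w:.2f}x + {b:.2f}")
```

Only model parameters cross the boundary between a silo and the aggregator in this sketch. A production system would also need to protect those parameters in transit and at the aggregator, which is where the "securely collect" step comes in.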
"In federated learning, we split that computation up and distribute the computation out to the individual silos, so each hospital will have its own infrastructure," he said. "A portion of training is done there, and then the models are pushed out to the aggregation server, and the aggregator takes those models and combines them into an updated global model."
Homomorphic encryption allows data analysis to run directly on ciphertext, rather than first having to decrypt the data to use it. The technology promises to allow analysis without ever revealing the underlying data.
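A toy example can show what computing on ciphertext means. The sketch below uses the Paillier cryptosystem, which supports only addition (and multiplication by a plaintext constant) on encrypted values. It is a much narrower scheme than the fully homomorphic encryption the Intel researchers discussed, and it is chosen here only because it fits in a few lines. The key sizes are deliberately tiny and insecure, and the helper names (`encrypt`, `decrypt`, `add_encrypted`, `scale_encrypted`) are assumptions for illustration.

```python
import math
import secrets

# Toy key generation. These two Mersenne primes are far too small for real use.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q                      # public modulus
N_SQ = N * N                   # ciphertexts live modulo N squared
G = N + 1                      # standard simple choice of generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private: lcm(p-1, q-1)
MU = pow(LAM, -1, N)           # private: modular inverse (Python 3.8+)

def encrypt(m):
    """Encrypt integer m (0 <= m < N) with fresh randomness."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ

def decrypt(c):
    """Recover the plaintext using the private values LAM and MU."""
    u = pow(c, LAM, N_SQ)
    return ((u - 1) // N * MU) % N

def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N_SQ

def scale_encrypted(c, k):
    """Raising a ciphertext to the power k multiplies the plaintext by k."""
    return pow(c, k, N_SQ)

# Two parties encrypt their values; a third party sums them without decrypting.
enc_a = encrypt(52000)
enc_b = encrypt(61000)
total = decrypt(add_encrypted(enc_a, enc_b))
tripled = decrypt(scale_encrypted(enc_a, 3))
print(total, tripled)
```

The party that calls `add_encrypted` never sees 52000 or 61000; only the holder of the private key can read the result.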
Yet there are challenges. The size of data explodes when using homomorphic encryption, with the ciphertext some 100 to 1,000 times larger than the original data. The complexity of computation also grows enormously — by a factor of 10,000 to 1 million — making even simple functions expensive to implement in practice.
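To put those multipliers in perspective, the following back-of-the-envelope sketch applies the ranges quoted above to a hypothetical workload. The baseline figures (1 MB of data and 1 millisecond of unencrypted compute) and the function name `he_overhead` are assumptions chosen only to make the arithmetic easy to follow.

```python
def he_overhead(plain_bytes, plain_ms):
    """Apply the quoted size and compute multipliers to a plaintext baseline."""
    size_factors = (100, 1_000)             # ciphertext expansion range
    compute_factors = (10_000, 1_000_000)   # computation slowdown range
    return {
        "ciphertext_bytes": tuple(plain_bytes * f for f in size_factors),
        "encrypted_ms": tuple(plain_ms * f for f in compute_factors),
    }

# Hypothetical baseline: 1 MB of data, 1 ms of unencrypted processing.
est = he_overhead(plain_bytes=1_000_000, plain_ms=1)
print(est)
```

Under these assumptions, 1 MB of data grows to between 100 MB and 1 GB of ciphertext, and a 1-millisecond operation takes between 10 seconds and 1,000 seconds, or nearly 17 minutes.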
But the industry has tackled such challenges before, Cammarota pointed out. In 1960, a single transistor cost between $1 and $4 (or about $8 to $30 today). Over the past 60 years, that price has dropped by more than a factor of a billion.
If homomorphic encryption can be implemented as inexpensively, Cammarota expects similar widespread applications.
"As soon as transistor technology started scaling and the transistor became inexpensive, then completely unforeseen applications became possible," he said. "If homomorphic encryption becomes inexpensive, then we will see a lot more possibilities for the technology."