Confidential AI Protects Data and Models Across Clouds
Confidential AI integrates zero trust and confidential computing to guard data and models during inferencing, training, federated learning, and fine-tuning.
Artificial intelligence (AI) is transforming a variety of industries, including finance, manufacturing, advertising, and healthcare. IDC predicts global spending on AI will exceed $300 billion by 2026. Companies spend millions of dollars building AI models, which are considered priceless intellectual property, and the parameters and model weights are closely guarded secrets. Even knowing some of the parameters in a competitor's model is considered valuable intelligence.
The data sets used to train these models are also highly confidential and can create a competitive advantage. As a result, data and model owners are looking to protect these assets from theft or compliance violations. They need to ensure confidentiality and integrity.
This brings us to the new field of confidential AI. The goal of confidential AI is to ensure that model creation, training, preprocessing, and curation of the training data — and the execution of the model and data through its life cycle — are protected from compromise, tampering, and exposure while at rest, in transit, and in use. Protected from whom? From infrastructure providers, rogue system administrators, model owners, data owners, and other actors who could steal or alter critical elements of the model or data. Confidential AI emphasizes strong policy enforcement and zero-trust principles.
Use Cases for Confidential AI
Confidential AI requires a variety of technologies and capabilities, some new and some extensions of existing hardware and software. This includes confidential computing technologies, such as trusted execution environments (TEEs) to help keep data safe while in use — not just on the CPUs, but on other platform components, like GPUs — and attestation and policy services used to verify and provide proof of trust for CPU and GPU TEEs. It also includes services that ensure the right data sets are sourced, preprocessed, cleansed, and labeled. And finally, key management, key brokering, and distribution services ensure that models, data, prompts, and context are encrypted before being accessed inside a TEE or delivered for execution.
Let's look at four of the top confidential AI scenarios.
1. Confidential Inferencing
This is the most typical use case for confidential AI. A model is trained and deployed. Consumers or clients interact with the model to predict an outcome, generate output, derive insights, and more.
Model owners and developers want to protect their model IP from the infrastructure where the model is deployed — from cloud providers, service providers, and even their own admins. That requires the model and data to always be encrypted with keys controlled by their respective owners and subjected to an attestation service upon use. A key broker service, where the actual decryption keys are housed, must verify the attestation results before releasing the decryption keys over a secure channel to the TEEs. Then the models and data are decrypted inside the TEEs, before the inferencing happens.
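The attest-then-release flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real attestation protocol: the names KeyBroker, tee_attest, and measure are hypothetical, and a shared HMAC key stands in for the hardware root of trust (real TEEs sign their reports with vendor-rooted asymmetric keys that an attestation service verifies).

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass

# ASSUMPTION: a shared HMAC key stands in for the hardware root of trust.
# Real TEEs sign reports with vendor-rooted asymmetric keys.
HARDWARE_KEY = secrets.token_bytes(32)


@dataclass(frozen=True)
class AttestationReport:
    measurement: str   # hash of the workload loaded into the TEE
    nonce: bytes       # freshness challenge issued by the key broker
    signature: bytes   # binds measurement and nonce to the "hardware"


def measure(workload: bytes) -> str:
    """Hash of the workload; stands in for a TEE launch measurement."""
    return hashlib.sha256(workload).hexdigest()


def _sign(measurement: str, nonce: bytes) -> bytes:
    return hmac.new(HARDWARE_KEY, measurement.encode() + nonce, hashlib.sha256).digest()


def tee_attest(workload: bytes, nonce: bytes) -> AttestationReport:
    """Simulate the TEE producing a signed report for the broker's challenge."""
    m = measure(workload)
    return AttestationReport(m, nonce, _sign(m, nonce))


class KeyBroker:
    """Holds the owner's decryption key and releases it only after attestation."""

    def __init__(self, model_key: bytes, allowed_measurements: set):
        self._model_key = model_key
        self._allowed = set(allowed_measurements)
        self._pending = set()

    def challenge(self) -> bytes:
        nonce = secrets.token_bytes(16)
        self._pending.add(nonce)
        return nonce

    def release_key(self, report: AttestationReport) -> bytes:
        # Each nonce is accepted once, so a captured report cannot be replayed.
        if report.nonce not in self._pending:
            raise PermissionError("unknown or replayed nonce")
        self._pending.discard(report.nonce)
        expected = _sign(report.measurement, report.nonce)
        if not hmac.compare_digest(expected, report.signature):
            raise PermissionError("attestation signature is invalid")
        if report.measurement not in self._allowed:
            raise PermissionError("workload measurement is not on the allow list")
        return self._model_key


if __name__ == "__main__":
    app = b"inference-server-v1"
    broker = KeyBroker(secrets.token_bytes(32), {measure(app)})
    key = broker.release_key(tee_attest(app, broker.challenge()))
    print("key released to attested TEE:", len(key), "bytes")
```

In a real deployment the released key would travel over a secure channel that terminates inside the TEE, so the decrypted model and data never appear in host memory.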
Multiple variations of this use case are possible. For example, inference data could be encrypted, with real-time data streamed directly into the TEE. Or for generative AI, the prompts and context from the user would be visible only inside the TEE, while the models are operating on them. Last, the output of the inferencing may be summarized information that may or may not require encryption. The output could also be fed downstream to a visualization or monitoring environment.
2. Confidential Training
Before any models are available for inferencing, they must be created and then trained over significant amounts of data. For most scenarios, model training requires massive amounts of compute power, memory, and storage. A cloud infrastructure is well-suited for this, but it requires strong security guarantees for data at rest, in transit, and in use. The requirements presented for confidential inferencing also apply to confidential training, to provide evidence to the model builder and the data owner that the model (including the parameters, weights, checkpoint data, etc.) and the training data aren't visible outside the TEEs.
An often-stated requirement for confidential AI is, "I want to train the model in the cloud, but would like to deploy it to the edge with the same level of security. No one other than the model owner should see the model." The approaches presented for confidential training and confidential inferencing work in tandem to accomplish this. Once the training is done, the updated model is encrypted inside the TEE with the same key that was used to decrypt it before the training process: the key belonging to the model owner.
This encrypted model is then deployed, along with the AI inference application, into a TEE on the edge infrastructure. Realistically, it's downloaded from the cloud to the model owner, and then it is deployed with the AI inferencing application to the edge. It follows the same workflow as confidential inferencing: the model owner's key broker service delivers the decryption key to the edge TEEs after verifying their attestation reports.
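The cloud-to-edge round trip can be illustrated with a small seal/unseal sketch: the training TEE seals the updated model with the owner's key, and the edge TEE unseals it with that same key once the key broker has released it. Everything here is an assumption made for illustration. The functions seal and unseal are hypothetical, and the cipher is a toy encrypt-then-MAC construction built from the standard library that stands in for real authenticated encryption such as AES-GCM; it should not be used to protect actual models.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16
TAG_LEN = 32


def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive separate encryption and MAC keys from the owner's key."""
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # TOY keystream (SHA-256 in counter mode); a stand-in for a real cipher.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(owner_key: bytes, model: bytes) -> bytes:
    """Run inside the training TEE: encrypt the model, then add an integrity tag."""
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(_subkey(owner_key, b"enc"), nonce, len(model))
    ciphertext = bytes(a ^ b for a, b in zip(model, stream))
    tag = hmac.new(_subkey(owner_key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def unseal(owner_key: bytes, blob: bytes) -> bytes:
    """Run inside the edge TEE: verify integrity first, then decrypt."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("sealed model is too short")
    nonce, ciphertext, tag = blob[:NONCE_LEN], blob[NONCE_LEN:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(_subkey(owner_key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("sealed model failed its integrity check")
    stream = _keystream(_subkey(owner_key, b"enc"), nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


if __name__ == "__main__":
    owner_key = secrets.token_bytes(32)         # held by the model owner's key broker
    sealed = seal(owner_key, b"updated-model-weights")  # leaves the cloud TEE encrypted
    restored = unseal(owner_key, sealed)        # decrypted only inside the edge TEE
    print("round trip ok:", restored == b"updated-model-weights")
```

Because the integrity tag is checked before decryption, a model altered in transit or at rest on the edge host is rejected rather than loaded.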
3. Federated Learning
This technique provides an alternative to a centralized training architecture for cases where the data cannot be moved and aggregated from its sources because of security and privacy concerns, data residency requirements, size and volume challenges, and more. Instead, the model moves to the data, where it follows a precertified and accepted process for distributed training. The data is housed in the client's infrastructure, and the model moves to all the clients for training; a central governor/aggregator (housed by the model owner) collects the model changes from each of the clients, aggregates them, and generates a new updated model version.
The big concern for the model owner here is the potential compromise of the model IP at the client infrastructure where the model is getting trained. Similarly, the data owner often worries about visibility of the model gradient updates to the model builder/owner. Combining federated learning and confidential computing provides stronger security and privacy guarantees and enables a zero-trust architecture.
Doing this requires that machine learning models be securely deployed to various clients from the central governor. This means the model is closer to the data sets for training, the infrastructure is not trusted, and models are trained in TEEs to help ensure data privacy and protect IP. Next, an attestation service is layered on that verifies the trustworthiness of the TEEs in each client's infrastructure and confirms that the environments where the model is trained can be trusted. Finally, trained models are sent back to the aggregator or governor from the different clients. Model aggregation happens inside the TEEs, the model is updated, and the process repeats until the model is stable. The final model is then used for inference.
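One way to picture the train-and-aggregate loop is with sample-weighted averaging of client weights, commonly known as federated averaging. The article does not name a specific aggregation rule, so this choice is an assumption, as are the names ClientUpdate, local_train, aggregate, and run_round. The attested flag stands in for the result of a real TEE attestation check, and the model is a tiny linear fit (y = w*x + b) chosen only to keep the sketch self-contained.

```python
from dataclasses import dataclass


@dataclass
class ClientUpdate:
    client_id: str
    weights: list      # model weights after local training
    num_samples: int   # size of the client's private data set
    attested: bool     # ASSUMPTION: result of a TEE attestation check done elsewhere


def local_train(global_weights, data, lr=0.05, epochs=20):
    """Gradient descent on y = w*x + b; runs inside the client's TEE."""
    w, b = global_weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def aggregate(updates):
    """Sample-weighted average of attested updates; runs inside the governor's TEE."""
    trusted = [u for u in updates if u.attested]
    if not trusted:
        raise ValueError("no attested client updates to aggregate")
    total = sum(u.num_samples for u in trusted)
    dim = len(trusted[0].weights)
    return [sum(u.weights[i] * u.num_samples for u in trusted) / total for i in range(dim)]


def run_round(global_weights, clients):
    """clients maps client_id -> (private data, attested flag)."""
    updates = []
    for client_id, (data, attested) in clients.items():
        if not attested:
            continue  # never ship the model to a TEE that failed attestation
        new_weights = local_train(global_weights, data)
        updates.append(ClientUpdate(client_id, new_weights, len(data), attested))
    return aggregate(updates)


if __name__ == "__main__":
    # Each client's data stays local; both sets follow y = 2x + 1.
    clients = {
        "hospital-a": ([(0, 1), (1, 3), (2, 5)], True),
        "hospital-b": ([(3, 7), (4, 9)], True),
    }
    weights = [0.0, 0.0]
    for _ in range(50):
        weights = run_round(weights, clients)
    print("aggregated weights:", weights)
```

Only the weights leave each client, never the raw data, and the governor drops any update that does not come from an attested TEE.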
4. Confidential Tuning
An emerging scenario for AI is companies looking to take generic AI models and tune them using business domain-specific data, which is typically private to the organization. The primary rationale is to fine-tune and improve the precision of the model for a set of domain-specific tasks. For example, an IT support and service management company might want to take an existing LLM and train it with IT support and help desk-specific data, or a financial company might fine-tune a foundational LLM using proprietary financial data.
This fine-tuning most likely would require an external cloud infrastructure, given the huge demands on compute, memory, and storage. A confidential training architecture can help protect the organization's confidential and proprietary data, as well as the model that's tuned with that proprietary data.