Microsoft is working with Nvidia to build confidential GPUs in the cloud

Microsoft is collaborating with Nvidia to build confidential GPUs on the Azure cloud where data can be securely offloaded and processed in a trusted execution environment (TEE), ensuring privacy.

Many organizations, both big and small, use the cloud for data storage and AI workloads. Confidential computing is essential to protecting the privacy and security of sensitive data in that setting. It is a set of hardware and software controls that governs how data is shared and used, and that lets data owners validate these processes.

Intel and AMD CPUs already enable the creation of trusted execution environments (TEEs) to power confidential computing at the CPU level. TEEs ensure that data remains encrypted at rest, in transit, and even in use. They also offer remote attestation to validate the configuration of the hardware and grant data access only to the required algorithms. Microsoft's confidential computing solutions on Azure rely on the same principles.

However, existing solutions depend on TEEs built into CPUs, so Microsoft is now looking to extend this boundary to GPUs as well, so that data can be securely offloaded to more powerful hardware for computation. This matters even more for organizations' AI workloads, and Microsoft is collaborating with Nvidia on this front.

A graphic for Microsoft's vision for confidential computing with NVIDIA GPUs

Microsoft has noted that this is not a simple implementation, as it needs to protect GPUs from various attacks while ensuring that Azure host machines retain adequate control for administrative activities. Even at the hardware level, the implementation should not have a negative impact on thermals and performance and, ideally, should not require changes to the existing GPU microarchitecture either. The company's vision includes the following capabilities for confidential GPUs:

A new mode where all sensitive state on the GPU, including GPU memory, is isolated from the host

A hardware root-of-trust on the GPU chip that can generate verifiable attestations capturing all security sensitive state of the GPU, including all firmware and microcode

Extensions to the GPU driver to verify GPU attestations, set up a secure communication channel with the GPU, and transparently encrypt all communications between the CPU and GPU

Hardware support to transparently encrypt all GPU-GPU communications over NVLink

Support in the guest operating system and hypervisor to securely attach GPUs to a CPU TEE, even if the contents of the CPU TEE are encrypted
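To make the attestation step in the list above more concrete, here is a minimal Python sketch of a verifier that checks a GPU attestation report before releasing a data key. It is an illustration under stated assumptions, not Microsoft's or Nvidia's implementation: the report fields, the reference measurements, and the function names are all hypothetical, and an HMAC over a shared key stands in for the asymmetric signature a real hardware root-of-trust would produce.

```python
import hashlib
import hmac
import json
import secrets
from typing import Optional

# Hypothetical reference measurements the verifier trusts.
# Real values would come from the hardware vendor; these are placeholders.
EXPECTED_REPORT = {
    "firmware": hashlib.sha256(b"example-firmware-image").hexdigest(),
    "microcode": hashlib.sha256(b"example-microcode-image").hexdigest(),
    "confidential_mode": True,
}


def sign_report(device_key: bytes, report: dict, nonce: bytes) -> bytes:
    """Stand-in for the GPU root-of-trust signing a report.

    Assumption: HMAC-SHA256 with a shared key replaces the asymmetric
    signature that real attestation hardware would use.
    """
    payload = json.dumps(report, sort_keys=True).encode() + nonce
    return hmac.new(device_key, payload, hashlib.sha256).digest()


def verify_report(device_key: bytes, report: dict, nonce: bytes,
                  signature: bytes) -> bool:
    """Accept the report only if it is authentic, fresh, and matches
    the expected security-sensitive state."""
    expected_sig = sign_report(device_key, report, nonce)
    if not hmac.compare_digest(expected_sig, signature):
        return False  # forged report, or signed over a different nonce
    return report == EXPECTED_REPORT


def release_data_key(device_key: bytes, report: dict, nonce: bytes,
                     signature: bytes) -> Optional[bytes]:
    """Hand out a fresh data key only to a GPU that passed attestation."""
    if verify_report(device_key, report, nonce, signature):
        return secrets.token_bytes(32)
    return None


if __name__ == "__main__":
    device_key = secrets.token_bytes(32)
    nonce = secrets.token_bytes(16)  # fresh per request, prevents replay
    signature = sign_report(device_key, EXPECTED_REPORT, nonce)
    key = release_data_key(device_key, EXPECTED_REPORT, nonce, signature)
    print("key released" if key else "attestation failed")
```

The nonce is generated by the verifier for each request, so an old report cannot be replayed; a report with any unexpected firmware or microcode measurement is rejected even if its signature is valid.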

Microsoft has stated that it has already built confidential computing capabilities in Nvidia A100 Tensor Core GPUs on Azure. This has been done through a new feature called Ampere Protected Memory (APM). The implementation details are highly technical in nature and you can check them out here.

This solution is now available in private preview through Azure Confidential GPU VMs. Organizations can use VMs with up to four Nvidia A100 Tensor Core GPUs for their Azure workloads at this point. Microsoft's next steps include ensuring the broader adoption of these practices and working with Nvidia on its Hopper architecture for further enhancements to the existing implementation.