Microsoft announces Maia 200, its 2nd-gen AI accelerator for cost-efficient inference

Microsoft has announced Maia 200, a second-generation 3nm AI accelerator designed for inference, boasting significant performance advantages over rival chips.

Back in 2023, Microsoft first revealed that it was developing an AI accelerator chip under the name Maia. Later, at Build 2024, the company shared more details about Maia 100, its first in-house AI accelerator. At Hot Chips 2024, Microsoft released the specifications for Maia 100 and revealed additional details, including its power requirements.

Today, Microsoft announced Maia 200, its second-generation AI accelerator chip targeting inference workloads. While Maia 100 was built on TSMC’s 5nm node, Maia 200 is built on TSMC’s 3nm process and includes native FP8/FP4 tensor cores. It supports 216 GB of HBM3e memory with 7 TB/s of bandwidth, along with 272 MB of on-chip SRAM.
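To put those memory figures in perspective, here is a rough back-of-the-envelope sketch in Python. It is not from Microsoft: the function names are made up for illustration, it assumes decimal units (1 GB = 10^9 bytes), and it counts model weights only, ignoring KV cache, activations, and other overhead. It estimates how many parameters could fit in HBM at FP8 (1 byte each) and FP4 (half a byte each), and how long one full pass over the HBM would take at the quoted peak bandwidth.

```python
# Back-of-the-envelope figures derived from the quoted Maia 200 memory specs.
# Assumptions (not from Microsoft): decimal units, weights only, peak bandwidth.

HBM_CAPACITY_BYTES = 216e9     # 216 GB of HBM3e, as quoted in the article
HBM_BANDWIDTH_BYTES_S = 7e12   # 7 TB/s, as quoted in the article

# Storage cost per parameter for the two natively supported formats.
BYTES_PER_PARAM = {"FP8": 1.0, "FP4": 0.5}


def max_params(fmt: str) -> float:
    """Upper bound on parameters whose weights alone fit in HBM."""
    return HBM_CAPACITY_BYTES / BYTES_PER_PARAM[fmt]


def full_sweep_seconds() -> float:
    """Time to stream the entire HBM once at peak bandwidth."""
    return HBM_CAPACITY_BYTES / HBM_BANDWIDTH_BYTES_S


if __name__ == "__main__":
    for fmt in BYTES_PER_PARAM:
        print(f"{fmt}: up to {max_params(fmt) / 1e9:.0f}B parameters")
    print(f"Full HBM sweep: {full_sweep_seconds() * 1e3:.1f} ms")
```

Under these assumptions, a single chip could hold the weights of a roughly 216-billion-parameter model at FP8 or a 432-billion-parameter model at FP4, and reading all of its HBM once would take about 31 ms. Real-world numbers will be lower once runtime overhead is accounted for.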

In the official blog post announcing Maia 200, Microsoft claims that it is the most performant in-house silicon from any hyperscaler, a group that includes Amazon and Google. In a surprise move, Microsoft also published a comparison table showing Maia 200 against competing chips from Google and Amazon. According to that table, Maia 200 delivers nearly twice the FP8 performance of Amazon’s third-generation Trainium and around 10% higher FP8 performance than Google’s seventh-generation TPU.

[Image: Maia 200 performance comparison]

Microsoft also highlighted Maia 200’s efficiency, claiming 30% better performance per dollar than the latest-generation hardware currently deployed in Azure. Maia 200 is also designed for scale-up deployments, featuring an integrated on-die NIC with 2.8 TB/s of bidirectional bandwidth for communication across a cluster of 6,144 accelerators.

Maia 200 can serve a range of AI models, including OpenAI’s GPT-5.2 models, enabling Microsoft to deliver AI features across Microsoft 365 and other services. Microsoft’s Superintelligence team will also use Maia 200 for synthetic data generation and reinforcement learning to develop upcoming in-house models.

Unlike Maia 100, which was announced well ahead of deployment, Maia 200 is already deployed in Microsoft’s US Central datacenter region near Des Moines, Iowa, and in the US West 3 datacenter region near Phoenix, Arizona.

To help developers and startups optimize their tools and models for Maia 200, Microsoft is releasing a preview of the Maia SDK. The SDK includes PyTorch integration, a Triton compiler, optimized kernel libraries, and access to Maia’s low-level programming language.