Microsoft's AI model has outperformed humans in natural language understanding

Microsoft has invested heavily in artificial intelligence models for natural language understanding (NLU). To that end, the company has acquired startups working on natural language processing (NLP) and also holds an exclusive license to OpenAI's GPT-3 language model. Now, the Redmond tech giant has announced that its AI model has outperformed the human baseline on the SuperGLUE benchmark.

SuperGLUE is considered a difficult benchmark because it tests a variety of NLU tasks, such as answering questions about a given premise, natural language inference, and co-reference resolution, among many others. To tackle this benchmark, Microsoft updated its Decoding-enhanced BERT with Disentangled Attention (DeBERTa) model, scaling it up to a total of 48 Transformer layers and 1.5 billion parameters.

As a result, the single DeBERTa model now scores 89.9 on SuperGLUE, while the ensemble model with 3.2 billion parameters scores 90.3. Both of these scores are slightly higher than the human baseline of 89.8, which means that the model outperforms humans on this particular benchmark.

It is important to note that this is not the first model to surpass the human baseline. The "T5 + Meena" model developed by the Google Brain team scored 90.2 just a couple of days ago, on January 5. However, Microsoft's DeBERTa ensemble edged past that model on January 6 with its score of 90.3.
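To put the reported numbers side by side, here is a minimal sketch that computes each model's margin over the human baseline. The scores are the ones quoted in this article; the dictionary labels and the `margin_over_baseline` helper are illustrative names, not part of any official SuperGLUE tooling.

```python
# Scores as quoted in the article; labels are illustrative, not official names.
HUMAN_BASELINE = 89.8

scores = {
    "DeBERTa (single, 1.5B parameters)": 89.9,
    "DeBERTa (ensemble, 3.2B parameters)": 90.3,
    "T5 + Meena": 90.2,
}


def margin_over_baseline(score, baseline=HUMAN_BASELINE):
    """Return the gap to the baseline, rounded to one decimal place
    to avoid floating-point noise such as 0.10000000000000853."""
    return round(score - baseline, 1)


margins = {name: margin_over_baseline(s) for name, s in scores.items()}

# List the models from the largest margin to the smallest.
for name, margin in sorted(margins.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {margin:+.1f} points vs. human baseline")
```

The gaps are all within half a point, which is why the scores are described as only slightly higher than the baseline.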

Moving forward, Microsoft has noted that it is integrating DeBERTa into the Microsoft Turing natural language representation model (Turing NLRv4), which means that it will then be available to customers across Bing, Office, Dynamics, and Azure Cognitive Services. The company says that because its model uses fewer parameters than Google's solution, it is more energy-efficient and easier to maintain, since it is simpler to compress and deploy. It went on to say that:

"DeBERTa surpassing human performance on SuperGLUE marks an important milestone toward general AI. Despite its promising results on SuperGLUE, the model is by no means reaching the human-level intelligence of NLU. Humans are extremely good at leveraging the knowledge learned from different tasks to solve a new task with no or little task-specific demonstration. This is referred to as compositional generalization, the ability to generalize to novel compositions (new tasks) of familiar constituents (subtasks or basic problem-solving skills). Moving forward, it is worth exploring how to make DeBERTa incorporate compositional structures in a more explicit manner, which could allow combining neural and symbolic computation of natural language similar to what humans do."

Microsoft has released the model, its documentation, and its source code for public use on GitHub.
