Following the successful launch of its in-house models MAI-Voice-1 and MAI-Image-2, Microsoft last week introduced its third model, MAI-Transcribe-1. The company claims it is the most accurate transcription model in the world across 25 languages, with an average word error rate (WER) of just 3.9%.
Now, Microsoft’s Bing team has released the Harrier family of embedding models, which the company says outperform other open-source and proprietary embedding models. Embedding models convert data (usually text, but also images or audio) into dense numerical vectors, called embeddings, that capture the meaning of that data. On the multilingual MTEB-v2 benchmark, a Harrier model ranks No. 1, ahead of Google’s Gemini Embedding 2.
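To make the idea of "vectors that capture meaning" concrete, here is a minimal sketch of how embeddings are typically compared, using cosine similarity. The vectors are hand-written, hypothetical 4-dimensional values chosen for illustration; they are not Harrier output, and real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (assumption: made up by hand, not model output).
TOY_EMBEDDINGS = {
    "How do I reset my password?": [0.9, 0.1, 0.0, 0.2],
    "Steps to recover account access": [0.8, 0.2, 0.1, 0.3],
    "Best pizza toppings": [0.0, 0.9, 0.4, 0.1],
}


def rank_by_similarity(query, embeddings):
    """Return the other texts, ordered from most to least similar to `query`."""
    query_vec = embeddings[query]
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for text, vec in embeddings.items()
        if text != query
    ]
    return [text for _, text in sorted(scored, reverse=True)]


if __name__ == "__main__":
    for text in rank_by_similarity("How do I reset my password?", TOY_EMBEDDINGS):
        print(text)
```

In this toy setup the two account-related sentences point in nearly the same direction, so they rank as closest even though they share almost no words. That is the property retrieval systems rely on.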
Microsoft has released three variants of the Harrier model:
- Harrier-OSS-v1-27B
- Harrier-OSS-v1-0.6B
- Harrier-OSS-v1-270M
All three models support more than 100 languages, offer a 32k context window, and produce a fixed-size embedding for each input.
Apart from their state-of-the-art performance, the models are also open source, which Microsoft says will let developers improve the grounding quality of AI applications without licensing restrictions. Microsoft attributes Harrier’s performance to the following improvements, described in its own words:
- Large-scale contrastive pre-training and fine-tuning. By scaling the dataset size throughout both the contrastive pre-training and fine-tuning stages, we observed consistent improvements in performance.
- Synthetic data generation. Utilizing frontier models such as GPT-5, we generated multilingual text pairs at scale, employing a variety of synthesis strategies to enhance data diversity.
- Knowledge distillation. LLM-based re-rankers produced high-quality training signals and efficiently filtered noisy data. Our smaller models benefit from knowledge distillation, receiving guidance from larger teacher models during training.
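Microsoft does not say which loss function it used, but contrastive training of embedding models is commonly built on an InfoNCE-style objective with in-batch negatives. The sketch below assumes that setup: `documents[i]` is the matching text for `queries[i]`, every other document in the batch serves as a negative, and the vectors are already L2-normalized. The function name and the `temperature` default are illustrative choices, not Harrier's actual configuration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(queries, documents, temperature=0.05):
    """Average InfoNCE loss over a batch of (query, document) pairs.

    Assumes documents[i] is the positive for queries[i] and that all
    vectors are L2-normalized, so the dot product is a cosine similarity.
    """
    if not queries or len(queries) != len(documents):
        raise ValueError("need a non-empty batch of aligned pairs")
    total = 0.0
    for i, query in enumerate(queries):
        # Similarity of this query to every document in the batch.
        logits = [dot(query, doc) / temperature for doc in documents]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching document as the correct class.
        total += log_sum_exp - logits[i]
    return total / len(queries)


if __name__ == "__main__":
    queries = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:", info_nce_loss(queries, matched))
    print("swapped pairs:", info_nce_loss(queries, swapped))
```

The loss is small when each query is closest to its own document and large when it is closer to another document in the batch, which is the pressure that pulls related texts together in the embedding space.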
The company also said it is building on Harrier to develop a new grounding service that will deliver better retrieval quality, stronger semantic understanding, and more robust context selection at scale. Microsoft plans to use these advancements to improve the Bing user experience.