Microsoft releases Phi-4-reasoning-vision-15B, an AI model that chooses when to think

Image: Microsoft

Microsoft just released Phi-4-reasoning-vision-15B, a new open-weight multimodal AI model. Its biggest selling point is that it decides for itself when to think and when to skip the reasoning step, which is something we don’t see in many open-weight models.

Phi-4-reasoning-vision-15B is a 15-billion-parameter model designed for complex tasks like image captioning, UI element grounding, and advanced math. What’s particularly interesting is that Microsoft engineered it to decide on its own when to activate thinking mode and when to give an instant answer.

Most AI models require you to either enable or disable thinking, and they strictly follow that decision. Phi-4-reasoning-vision-15B choosing when to think on its own could prove to be remarkably efficient, but it could also be unpredictable. More real-world tests are needed to determine the appeal of this approach.

Microsoft was also very efficient when training this Phi-4 variant, as it was trained on just 200 billion tokens. For reference, most decent-sized AI models devour over a trillion tokens just to get up to speed. The development team was also selective about the content used to train Phi-4-reasoning-vision-15B. On paper, this should mean that the model gives better answers, simply because it was trained on higher-quality data. But that doesn’t have to be the case in practice, especially because Microsoft used GPT-4o to assist with training.
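To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It uses only the figures quoted above (15 billion parameters, 200 billion training tokens) and takes exactly one trillion tokens as a stand-in for the "over a trillion" comparison point, which is an assumption made for illustration, not a figure from Microsoft. The function names are hypothetical.

```python
# Figures quoted in the article.
PHI4_PARAMS = 15_000_000_000         # 15 billion parameters
PHI4_TOKENS = 200_000_000_000        # 200 billion training tokens

# Assumption: "over a trillion" is rounded down to exactly one trillion,
# so the ratio below is a lower bound on the real gap.
REFERENCE_TOKENS = 1_000_000_000_000


def tokens_per_parameter(tokens: int, params: int) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


def budget_ratio(reference_tokens: int, tokens: int) -> float:
    """How many times larger the reference token budget is."""
    return reference_tokens / tokens


if __name__ == "__main__":
    tpp = tokens_per_parameter(PHI4_TOKENS, PHI4_PARAMS)
    ratio = budget_ratio(REFERENCE_TOKENS, PHI4_TOKENS)
    print(f"Tokens per parameter: {tpp:.1f}")
    print(f"Reference budget is at least {ratio:.0f}x larger")
```

In other words, the model saw roughly 13 tokens per parameter, and a trillion-token training run would be at least five times larger.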

Microsoft presented benchmark results for Phi-4-reasoning-vision-15B against other open-source LMMs in its size class and slightly above it, and the results were mixed. While Microsoft’s model outperforms even bigger models in some tests, it falls behind in other categories. We should give props to Microsoft here for showcasing realistic benchmark results, instead of trying to inflate them in favor of its model. Still, benchmarks often give an inaccurate picture of a model’s capabilities, so real-world results might be different.

Here are the Phi-4-reasoning-vision-15B model benchmarks:

Image: Microsoft

Microsoft’s Phi-4 series of open-weight models is often underrated, as the open-source community is more focused on LLMs from Chinese companies, such as Qwen 3.5. And to be fair, Microsoft doesn’t put much effort into advertising it, as the company is more focused on providing infrastructure for third-party frontier models. However, Phi-4-reasoning-vision-15B could still be a solid pick, because it delivers decent results in a lean package.

You can check out more details about Microsoft Phi-4-reasoning-vision-15B on Microsoft’s blog.

Microsoft has already made the model publicly available. You can grab the open weights right now on Hugging Face and Microsoft Foundry.

Edit: The article has been updated to reflect the model’s official name.
