Microsoft reveals Phi-3-vision, a new multimodal AI small language model

In April, Microsoft first announced its new Phi-3 family of AI small language models (SLMs), which are designed to run on devices rather than on cloud servers. Today, as part of Build 2024, Microsoft announced yet another Phi-3 model, this one with a different focus.

The new model is called Phi-3-vision, and as its name suggests, this SLM is not limited to text prompts: it can also accept images alongside text and generate answers to questions about them. Microsoft says people who use this model can get answers to questions about a chart they have submitted, or ask open-ended questions about images that the Phi-3-vision model receives.

Phi-3-vision has 4.2 billion parameters, which makes it bigger than the 3.8 billion parameter Phi-3 Mini model but much smaller than the 7 billion parameter Phi-3 Small model and the 14 billion parameter Phi-3 Medium model. Microsoft says Phi-3-vision can answer questions about "general visual reasoning tasks as well as chart, graph and table reasoning."

The new Phi-3-vision model is currently available as a preview version, but there's no word on when it will become generally available. However, Phi-3 Mini, Phi-3 Small, and Phi-3 Medium are now all generally available through Microsoft's Azure AI models-as-a-service offering.

In related news from Build 2024, Microsoft announced that Azure AI Studio is now generally available. The company stated:

The pro-code platform empowers responsible generative AI development, including the development of copilots, to support complex apps and tasks like content generation, data analysis, project management, automation of routine tasks and more.

Microsoft added that Azure AI Studio includes support for both "code-first" functions and a "friendly user interface," so developers can choose how to use the tools for their own coding projects.

Microsoft also announced that OpenAI's latest large language model, GPT-4o, is now generally available in Azure AI Studio and as an API.