Gemini 2.5 Computer Use model outperforms leading alternatives on multiple AI benchmarks

During Google I/O earlier this year, Google revealed that it would be bringing computer use capabilities to the Gemini API. Today, Google announced Gemini 2.5 Computer Use, a new specialized model to power agents that can interact with user interfaces (UIs). Google claims that this new model outperforms other similar models on multiple web and mobile control benchmarks.

Here's how the Gemini API computer_use tool works:

  • Developers send the tool an input made up of the user request, a screenshot of the environment, and a history of recent actions.
  • Along with the input, developers can optionally exclude functions from the full list of supported UI actions or include additional custom functions.
  • The model will analyze the received inputs and generate a response, which will be one of the UI actions, such as clicking or typing.
  • If the model is unsure, it may even request end-user confirmation. For example, if the action is related to purchasing an item, user confirmation will be required.
  • The client-side code then executes the received action, such as clicking a button or displaying an end-user confirmation.
  • Once the action is completed, a new screenshot of the current GUI and the current URL are sent back to the Computer Use model as a function response, restarting the loop.
  • Until the main task objective is reached, the above steps are repeated.
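
The loop described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the Gemini API: the `Action` and `Observation` types, `run_agent_loop`, and the scripted stand-ins for the model and the browser are all hypothetical names invented for this example, and a real client would replace them with actual API calls and real UI automation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class Action:
    """A UI action proposed by the model (hypothetical shape)."""
    name: str                          # e.g. "click", "type_text", "done"
    args: dict = field(default_factory=dict)
    needs_confirmation: bool = False   # e.g. set for a purchase step


@dataclass
class Observation:
    """What the client sends back after each executed action."""
    screenshot: bytes
    url: str


def run_agent_loop(
    task: str,
    propose_action: Callable[[str, Observation, List[str], Sequence[str]], Action],
    execute_action: Callable[[Action], Observation],
    confirm: Callable[[Action], bool],
    first_observation: Observation,
    excluded: Sequence[str] = (),
    max_steps: int = 20,
) -> List[str]:
    """Repeat: ask the model for an action, execute it, send back the new state."""
    history: List[str] = []
    obs = first_observation
    for _ in range(max_steps):
        # The request, latest screenshot/URL, and action history go to the model.
        action = propose_action(task, obs, history, excluded)
        if action.name == "done":      # the model reports the task is finished
            break
        # Risky actions are gated behind end-user confirmation.
        if action.needs_confirmation and not confirm(action):
            history.append(f"declined:{action.name}")
            break
        # Client-side code performs the action and captures the new state.
        obs = execute_action(action)
        history.append(action.name)
    return history


# --- Stand-ins for the model and the browser (assumptions for the demo) ---
def scripted_model(task, obs, history, excluded):
    """Pretend model: replays a fixed script, one action per completed step."""
    script = [
        Action("click", {"x": 10, "y": 20}),
        Action("type_text", {"text": "hello"}),
        Action("click", {"x": 5, "y": 5}, needs_confirmation=True),
        Action("done"),
    ]
    return script[len(history)]


def fake_execute(action):
    """Pretend browser: returns a placeholder screenshot and URL."""
    return Observation(screenshot=b"png-bytes", url="https://example.com/step")


start = Observation(screenshot=b"", url="about:blank")
history = run_agent_loop("demo task", scripted_model, fake_execute,
                         lambda action: True, start)
print(history)
```

The `max_steps` cap is a design choice for the sketch so that a model that never signals completion cannot loop forever; the `excluded` list is simply passed through to the model stand-in, mirroring the option to exclude supported actions.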

While the Gemini 2.5 Computer Use model is optimized for web browsers, Google claims that it also performs well on mobile UI control tasks. Google specifically noted that the model is not yet optimized for desktop OS-level control. According to the benchmarks Google published, Gemini 2.5 Computer Use delivers state-of-the-art results on several key web and mobile control benchmarks.

The Gemini 2.5 Computer Use model is now available in public preview, and developers can access it via the Gemini API on Google AI Studio and Vertex AI.
