
Google has added Computer Use as a built-in tool in Gemini 3.5 Flash, giving developers a single model that can reason about a task and operate graphical interfaces across browsers, mobile devices, and desktop environments. The feature is available through the Gemini API and Google’s Gemini Enterprise Agent Platform, although it remains a preview feature for now.
Computer Use enables an AI agent to examine screenshots and return actions such as mouse clicks, scrolling, and keyboard input. A developer’s application must execute those actions, capture the resulting screen, and send it back to Gemini, creating a continuous loop until the task is completed.
Google says the integration can be used for activities including repetitive form filling, application testing, research across multiple websites, and longer enterprise workflows. Gemini 3.5 Flash can work with browser, mobile, and desktop environments, whereas Google’s earlier standalone Computer Use model was primarily positioned around browser interaction.
The main change is consolidation. Computer control was previously offered through the separate Gemini 2.5 Computer Use preview model. As Neowin reported when that model was introduced, it was designed to interpret a visual interface and generate actions without requiring a website-specific API.
Google later brought Computer Use to preview versions of Gemini 3 Pro and Gemini 3 Flash in January 2026. The latest release now incorporates the tool into the stable Gemini 3.5 Flash model rather than requiring developers to select a specialized model solely for interface automation.
Gemini 3.5 Flash itself was announced in May as Google’s latest fast model for coding and multi-step agent workflows. It supports a one-million-token input context window and up to 65,000 output tokens, along with adjustable thinking levels that let developers trade additional reasoning for lower latency and cost.
Google also added that Gemini 3.5 Flash received targeted adversarial training for computer-use scenarios. The company is also offering safeguards that can require user confirmation before sensitive or irreversible actions and automatically stop a workflow when suspected prompt injection is detected. Its developer documentation describes configurable protections for areas such as financial transactions and changes to sensitive records.
Google isn't the first to bring Computer Use to its platform. Anthropic has made computer control available through Claude, while OpenAI has continued improving computer-use performance in its recent models. Microsoft has also applied the concept to business workflows, including a Computer Use capability for the Researcher agent in Microsoft 365 Copilot.
0 Comments
Load the comments and join the conversation!
Read the comments, ask the editors questions, show respect and join the conversation.