Google announces Gemini 3 Flash for cost- and latency-sensitive AI applications

Last month, Google announced Gemini 3 Pro, its flagship frontier model, which delivered state-of-the-art results across several AI benchmarks. Gemini 3 Pro is priced at $2 per million input tokens and $12 per million output tokens, with even higher costs when processing prompts exceeding 200,000 tokens.

However, not all AI applications require Gemini 3 Pro–level intelligence. Moreover, the model’s cost and latency make it unsuitable for a wide range of use cases. To address this, Google today announced Gemini 3 Flash (preview), a lightweight version of its flagship model that delivers strong performance at a significantly lower price point.

Google positions Gemini 3 Flash as a low-latency model optimized for real-time and high-throughput inference, while offering multimodal and reasoning capabilities comparable to Gemini 3 Pro. The model supports text, vision, audio, and video inputs.

Gemini 3 Flash is priced at $0.30 per million input tokens and $2 per million output tokens. In cached mode, input tokens cost just $0.075 per million. Overall, Gemini 3 Flash is approximately 85% cheaper than its flagship counterpart while still delivering strong performance for its class. For example, Gemini 3 Flash scores 90.4% on GPQA Diamond and 33.7% on Humanity’s Last Exam, outperforming several significantly larger frontier models.
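To see where the roughly 85% figure comes from, the list prices quoted above can be plugged into a short calculation. This is a minimal sketch, not an official pricing tool: the dictionary keys are labels chosen for this example rather than API model identifiers, the token counts are hypothetical, and the calculation ignores both cached-input pricing and the higher Gemini 3 Pro rates for prompts over 200,000 tokens.

```python
# Prices in USD per million tokens, as quoted in this article.
# The keys are illustrative labels, not official API model identifiers.
PRICES = {
    "gemini-3-pro": {"input": 2.00, "output": 12.00},
    "gemini-3-flash": {"input": 0.30, "output": 2.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request at the quoted list prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def savings_pct(input_tokens: int, output_tokens: int) -> float:
    """Return how much cheaper Flash is than Pro for a workload, in percent."""
    pro = request_cost("gemini-3-pro", input_tokens, output_tokens)
    flash = request_cost("gemini-3-flash", input_tokens, output_tokens)
    return 100 * (1 - flash / pro)


if __name__ == "__main__":
    # Hypothetical workload: 10,000 input tokens and 1,000 output tokens.
    print(f"Pro:     ${request_cost('gemini-3-pro', 10_000, 1_000):.4f}")
    print(f"Flash:   ${request_cost('gemini-3-flash', 10_000, 1_000):.4f}")
    print(f"Savings: {savings_pct(10_000, 1_000):.1f}%")
```

At these prices the discount is 85% on input tokens and about 83% on output tokens, so the overall saving for a given workload depends on its mix of input and output.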

Here's how Gemini 3 Flash performs in key AI benchmarks:

Gemini 3 Flash is now available to developers through the Gemini API in Google AI Studio, Google Antigravity, Gemini CLI, and Android Studio, and to enterprises through Vertex AI.

Just last week, OpenAI responded to the Gemini 3 Pro model with the launch of the new GPT-5.2 series. The GPT-5.2 models perform marginally better than Gemini 3 Pro on most AI benchmarks and are priced at a similar level for developers. With the launch of the cost-effective Gemini 3 Flash, Google now offers an affordable alternative for developers. In response, OpenAI is likely to introduce a GPT-5.2 Mini model in the coming weeks, targeting a comparable price-to-performance ratio.
