2025-06-17 · Google

We're expanding our Gemini 2.5 family of models

pricing · models · infrastructure


Source: DeepMind
Date: 2025-06-17
URL: https://deepmind.google/blog/were-expanding-our-gemini-25-family-of-models/

Summary

Google moved Gemini 2.5 Flash and Pro to general availability (GA) as stable releases and introduced 2.5 Flash-Lite in preview, claiming a place on the “Pareto Frontier of cost and speed.” Flash-Lite outperforms 2.0 Flash-Lite on coding, math, science, reasoning, and multimodal benchmarks at lower latency, and is optimized for high-volume, latency-sensitive tasks. All models support a 1M-token context window and variable thinking budgets. Snap and SmartBear are named as production adopters.

Implications

GA stable releases are the enterprise permission structure. “Experimental” and “preview” gates block enterprise production use; GA removes that blocker. Gemini 2.5 Flash and Pro moving to stable is the moment the 2.5 family becomes deployable for enterprise software with SLA requirements.

Flash-Lite positioning for translation/classification at scale. High-volume, low-latency translation and classification is the same use case behind Flash-Lite’s “40,000 image captions per dollar” claim from 2.0. Google is consistent: the Lite tier is for programmatic bulk workloads, not conversational AI. That’s a defensible niche against GPT-4o mini.
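To make the “40,000 image captions per dollar” figure concrete, here is a minimal arithmetic sketch. It uses only the number quoted above and assumes the rate scales linearly with volume, which the source does not state. The function names are illustrative.

```python
from decimal import Decimal

# Figure quoted above: roughly 40,000 image captions per US dollar.
CAPTIONS_PER_DOLLAR = Decimal(40_000)


def cost_per_item(items_per_dollar: Decimal) -> Decimal:
    """Implied cost of a single item, in dollars."""
    return Decimal(1) / items_per_dollar


def batch_cost(n_items: int, items_per_dollar: Decimal) -> Decimal:
    """Implied cost of a batch, in dollars, assuming linear scaling (an assumption)."""
    return Decimal(n_items) / items_per_dollar


if __name__ == "__main__":
    print("per caption:", cost_per_item(CAPTIONS_PER_DOLLAR))
    print("per million captions:", batch_cost(1_000_000, CAPTIONS_PER_DOLLAR))
```

At that rate a single caption costs $0.000025 and a million captions cost about $25, which is why the Lite tier reads as a bulk-processing product rather than a chat product.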

Named production users are an ecosystem signal. Snap (social media, real-time content) and SmartBear (software quality) cover two different verticals: consumer and developer tooling. These aren’t pilot customers; they’re volume users running Flash in production.

Watch:

Flash-Lite preview → GA timeline and whether the benchmark claims hold in third-party evaluation
Thinking budget API patterns that emerge from production Flash usage — what do developers actually set?
Whether the Snap/SmartBear adoption announcements are followed by competitors in the same verticals
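
For the thinking-budget question above, one pattern a team might try is a fixed budget per task type. The sketch below builds a request body in the shape of the Gemini API's `generateContent` call, with the budget under `generationConfig.thinkingConfig.thinkingBudget`. The task names, the budget values, and the `build_request` helper are hypothetical placeholders, not recommendations. Accepted budget ranges differ by model, including whether 0 is allowed, so check the current API documentation before relying on any value here.

```python
import json

# Hypothetical per-task budgets, in thinking tokens. Placeholder values only.
TASK_BUDGETS = {
    "classification": 0,   # assumes the chosen model accepts 0 (thinking off)
    "translation": 0,
    "code_review": 2048,
}
DEFAULT_BUDGET = 1024  # hypothetical fallback for unlisted task types


def build_request(prompt: str, task: str) -> dict:
    """Build a generateContent-style request body with a per-task thinking budget."""
    budget = TASK_BUDGETS.get(task, DEFAULT_BUDGET)
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"thinkingConfig": {"thinkingBudget": budget}},
    }


if __name__ == "__main__":
    body = build_request("Label this ticket as bug or feature: ...", "classification")
    print(json.dumps(body, indent=2))
```

Logging the task type and budget alongside latency and output quality would show which budgets bulk workloads actually need.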
