2025-10-23 · Google

Try Deep Think in the Gemini app

Source: DeepMind Date: 2025-10-23 URL: https://deepmind.google/blog/try-deep-think-in-the-gemini-app/

Summary

Google launched Deep Think as a consumer-accessible toggle in the Gemini app for AI Ultra subscribers. It uses parallel thinking techniques and novel reinforcement learning to explore multiple solution paths simultaneously and combine them. The consumer release achieves bronze-level IMO performance; a separate competition model reached gold. Google also reports state-of-the-art results on LiveCodeBench V6 (competitive coding) and Humanity’s Last Exam. Safety note: content safety improved relative to Gemini 2.5 Pro, but with a “higher tendency to refuse benign requests.”
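The summary does not say how Deep Think combines its solution paths, so the following is only a toy illustration of the general "explore in parallel, then combine" idea. It swaps in a well-known stand-in technique, majority voting over independently produced candidates (self-consistency style), and should not be read as Google's method. The names `explore_path` and `parallel_think`, and the off-by-one error model, are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def explore_path(problem: tuple[int, int], path_id: int) -> int:
    """Hypothetical stand-in for one reasoning path (a model call in practice)."""
    a, b = problem
    # Toy error model (assumption): every fourth path is off by one.
    error = 1 if path_id % 4 == 0 else 0
    return a + b + error


def parallel_think(problem: tuple[int, int], n_paths: int = 8):
    """Run several paths concurrently and combine them by majority vote."""
    with ThreadPoolExecutor(max_workers=n_paths) as pool:
        candidates = list(pool.map(lambda i: explore_path(problem, i), range(n_paths)))
    # Combination step: keep the answer that most paths agree on.
    answer, votes = Counter(candidates).most_common(1)[0]
    return answer, votes, candidates


if __name__ == "__main__":
    answer, votes, candidates = parallel_think((17, 25))
    print(f"candidates={candidates} -> answer={answer} ({votes} votes)")
```

Majority voting is the simplest possible combiner. A production system could instead score, rank, or merge partial solutions, which this sketch does not attempt.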

Implications

Bronze-level IMO in a consumer product is a capability milestone. The gap between the gold-medal result (a competition-specific model) and the bronze-level result (consumer Deep Think) is notable, but the real product story is that bronze-level mathematical reasoning is now accessible to any AI Ultra subscriber. Most competing consumer tools today are far below bronze-level IMO performance.

The over-refusal problem is disclosed, which is unusual. Explicitly noting “higher tendency to refuse benign requests” in a product launch post is rare candor. It signals Google is aware of the safety-utility tradeoff and is communicating it rather than burying it. Watch whether user feedback on over-refusal drives model updates quickly.

Parallel thinking UX via toggle is the right consumer interface. A toggle is the correct abstraction for users: “think harder on this one.” For developers, the closest analogue is the thinking-budget setting in the Gemini API. A consistent interface from consumer app to API is how Google is standardizing the thinking-model interaction pattern.
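To make the toggle-to-API analogy concrete, here is a minimal sketch that turns a boolean "think harder" switch into a Gemini-style request body with a thinking budget. The `generationConfig.thinkingConfig.thinkingBudget` field follows the Gemini REST API; the two budget values and the `build_request` helper are illustrative assumptions, and nothing here implies that Deep Think itself is exposed through this field.

```python
import json

# Assumed values for illustration only; check the Gemini API docs for the
# range your target model actually supports.
DEFAULT_BUDGET = 1024    # hypothetical "normal" thinking budget, in tokens
EXTENDED_BUDGET = 32768  # hypothetical "think harder" budget, in tokens


def build_request(prompt: str, think_harder: bool) -> dict:
    """Build a generateContent-style request body from a boolean toggle."""
    budget = EXTENDED_BUDGET if think_harder else DEFAULT_BUDGET
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": budget},
        },
    }


if __name__ == "__main__":
    body = build_request("Prove that sqrt(2) is irrational.", think_harder=True)
    print(json.dumps(body, indent=2))
```

The design point is that the user-facing control stays binary while the numeric budget remains an implementation detail that can be retuned without changing the interface.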

Watch:

Whether Deep Think becomes a default-on feature or stays toggle-gated as capabilities improve
Over-refusal rate compared to base Gemini 2.5 Pro: does it stay elevated or improve in subsequent versions?
Adoption signal: do AI Ultra subscribers use the Deep Think toggle regularly or treat it as a novelty?
