2026-02-12 · Google

Gemini 3 Deep Think: Advancing science, research and engineering

models · research

Source: DeepMind · Date: 2026-02-12 · URL: https://deepmind.google/blog/gemini-3-deep-think-advancing-science-research-and-engineering/

Summary

Google released Gemini 3 Deep Think with benchmark results that together define a new frontier: 84.6% on ARC-AGI-2 (verified by the ARC Prize Foundation), 48.4% on Humanity’s Last Exam, a Codeforces Elo of 3455, a gold medal on IMO 2025, and gold on the written sections of the International Physics and Chemistry Olympiads. Early-access users include a Rutgers mathematician who used it to find flaws in peer-reviewed papers and Duke University’s Wang Lab, which is using it to optimize crystal fabrication. The model is available to AI Ultra subscribers and through an API early access program.

Implications

The frontier benchmark story has shifted. ARC-AGI-2 and Humanity’s Last Exam are both expressly designed to resist benchmark saturation; they were created because standard evals ran out of signal. Google posting 84.6% and 48.4% on them in the same release is the clearest statement yet that Gemini 3 Deep Think is the current frontier reasoning model, not just a competitive one.

Science-as-deployment, not just research. The Rutgers mathematician finding flaws in peer-reviewed papers and Duke’s crystal growth optimization are not demos. They are the product story: a reasoning model that does useful scientific work rather than only posting olympiad scores. That targets a different audience from the coding-integrators market (Cursor, Bolt).

Olympiad sweep is a credibility stake. IMO gold is one thing. Adding International Physics and Chemistry Olympiad gold in the same cycle makes the claim generalize beyond mathematics. DeepMind is saying the reasoning capability is domain-general, not math-specific.

Watch:

ARC-AGI-2 results vs. OpenAI’s o3-series and Anthropic’s upcoming extended thinking benchmarks: this is the new yardstick
AI Ultra subscription adoption trajectory: this is Gemini’s premium tier, so pricing matters
Real-world science use cases: which research institutions announce Gemini 3 Deep Think integration in 2026
