2025-10-24 · Google

Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad

protocols · models · infrastructure

Source: DeepMind
Date: 2025-10-24
URL: https://deepmind.google/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/

Summary

Gemini with Deep Think solved 5 of 6 IMO 2025 problems for 35 of 42 points, enough to meet the gold-medal threshold. It worked in natural language within the 4.5-hour competition window, and its solutions were officially graded by IMO coordinators. The jump from 2024’s silver (28 points, formal-language translation required, 2–3 days of compute) to 2025’s gold (natural language, competition time) represents both a capability leap and an infrastructure improvement. The key technique is “parallel thinking”, which explores multiple solution paths simultaneously and combines them, enhanced by new RL techniques trained on multi-step reasoning data.
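The source does not describe how “parallel thinking” is implemented, so the following is only a toy sketch of the general idea of exploring several solution paths at once and then combining them. The combination rule here is a plain majority vote, which is a stand-in and not Deep Think’s method, and every function name (`closed_form`, `brute_force`, `off_by_one`, `parallel_think`) is hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def closed_form(n):
    # Path 1: Gauss's formula for 1 + 2 + ... + n.
    return n * (n + 1) // 2


def brute_force(n):
    # Path 2: direct summation.
    return sum(range(1, n + 1))


def off_by_one(n):
    # Path 3: deliberately flawed (stops at n - 1), to show an error being outvoted.
    return sum(range(1, n))


def parallel_think(n, paths):
    # Explore every path concurrently (assumption: paths are independent),
    # then combine the candidate answers by majority vote.
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        candidates = list(pool.map(lambda path: path(n), paths))
    answer, votes = Counter(candidates).most_common(1)[0]
    return answer, votes, candidates


answer, votes, candidates = parallel_think(100, [closed_form, brute_force, off_by_one])
print(answer, votes, candidates)  # 5050 2 [5050, 5050, 4950]
```

Two of the three paths agree on 5050, so the flawed path is outvoted. A real system would presumably combine partial arguments rather than vote on final numbers, which is where this sketch stops being representative.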

Implications

The math reasoning thread, settled for now. Gold-medal IMO performance was a stated goalpost for AI mathematical reasoning. Google hit it first, in official competition conditions with official grading. OpenAI’s o-series and Anthropic’s extended thinking are in the same contest, but this result is the credentialed one: it was graded by IMO coordinators using the same criteria applied to student contestants.

Natural language end-to-end is the key advancement. Last year’s system needed formal-language translation (Lean) and days of compute. This year’s system works in natural language and finishes within the 4.5-hour competition window. That compression makes the capability practically deployable, not just a research proof of concept. It’s what bridges “AI does math” to “AI assists mathematical research in real workflows.”

Reasoning + RL compounding. “Parallel thinking” combined with RL on multi-step reasoning data is the same recipe Google is applying across domains (see Deep Loop Shaping at LIGO). The methodology is generalizing beyond math — watch for this combination applied to scientific discovery workflows in 2026.

Watch:

Whether Deep Think’s IMO capability translates to useful performance on open research problems (not olympiad-style exercises)
OpenAI and Anthropic IMO results for the same competition year — are they close, or is this a gap?
Integration of Deep Think math capabilities into Gemini API tiers for research customers
