2026-02-09 · Google

Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

Source: DeepMind Date: 2026-02-09 URL: https://deepmind.google/blog/accelerating-mathematical-and-scientific-discovery-with-gemini-deep-think/

Summary

Google DeepMind’s “Aletheia” system (Gemini Deep Think in agentic research mode) autonomously contributed to peer-reviewed mathematics research: it solved four open problems from Erdős’s 700-problem database, contributed to a paper on eigenweights in arithmetic geometry accepted to ICLR ’26, and resolved a decade-long conjecture in online submodular optimization. On IMO-ProofBench Advanced, the system reached ~90% accuracy; on FutureMath Basic (PhD-level), it scored 38%. The post explicitly claims “Level 2” results (publishable quality) but not Level 3–4 (major breakthroughs).

Implications

This is the most significant frontier research AI claim to date. The post describes peer-reviewed publication, open problem resolution, and a named system (Aletheia) doing “autonomous research” without human intervention on content that passes journal review. That is not benchmark performance; it is the actual scientific process. If reproducible, it changes the timeline for AI-assisted research.

The 38% on PhD-level exercises is the honest number. Set against the impressive Erdős results, the FutureMath Basic score shows that deep research capability is still narrow. The model solves specific problem classes well but isn’t a general research superintelligence. That calibration matters for how seriously to take the claims.

“Aletheia” as a product signal. Naming the system and describing it as “reliable autonomous research” suggests Google may be productizing this into an API offering for research institutions. Combined with the ANRF India partnership and the science application thread across other posts, this points to an emerging Google DeepMind research-tools business.

Watch:

ICLR ’26 reception of the accepted Aletheia paper: peer reviewer responses will reveal whether the autonomous research meets professional standards
Whether Erdős problem database solutions survive expert scrutiny
Availability timeline for an AI Co-scientist API: Google has named it; the question is when it becomes generally accessible
