New Gemini 3.1 Pro is Google's most advanced reasoning model

Feb 20, 2026
09:37 am

What's the story

Google has launched its latest AI model, Gemini 3.1 Pro. The new version comes with "advanced reasoning" capabilities for complex problem-solving, making it more useful for difficult tasks than its predecessor, Gemini 3 Pro. The model builds on the success of Gemini 3, which has outperformed several competitor models since its November release. It is now available in the Gemini app and NotebookLM for Google AI Pro and Ultra subscribers.

Performance boost

Gemini 3.1 Pro outperforms predecessor in reasoning tasks

Gemini 3.1 Pro has achieved an ARC-AGI-2 score of 77.1%, more than double the reasoning performance of its predecessor, Gemini 3 Pro. This "advanced reasoning" can be applied in practical scenarios where users need a clear visual explanation of a complex topic or want data synthesized into a single view. For developers, the model is also available via Google AI Studio, Antigravity, Vertex AI, Gemini Enterprise, Gemini CLI and Android Studio.

Advanced features

Model designed to tackle tough research challenges

The release of Gemini 3.1 Pro follows last week's major upgrade to Gemini 3 Deep Think, which introduced new capabilities in chemistry and physics, as well as math and coding. Google says this model is the "upgraded core intelligence that makes those breakthroughs possible." It was designed to tackle tough research challenges where problems often lack clear guardrails or a single correct solution, and where data is often messy or incomplete.

Benchmark results

How Gemini 3.1 Pro fared on other benchmarks

Gemini 3.1 Pro has also surpassed its predecessor on other benchmarks, scoring 44.4% on Humanity's Last Exam (HLE) alongside its 77.1% on the ARC-AGI-2 logic benchmark. The Deep Think update scored higher, at 48.4% and 84.6% respectively, but Gemini 3.1 Pro's results are still better than those of the previous model, which scored a high of just over 38%.
