Google continues its rapid evolution in the AI space with the release of a significant update to its flagship Gemini 2.5 Pro model. The new version—designed to address performance regressions from prior updates—is now available in Vertex AI and AI Studio, with a broader rollout expected soon through the Gemini app and web interface.
Targeting Regressions, Enhancing Creativity
Following criticism of the 2.5 Pro’s performance—particularly outside of coding tasks—since the 03-25 update, Google has acknowledged the feedback and taken corrective action. According to Logan Kilpatrick, the newly released 06-05 build of Gemini Pro "closes the gap on 03-25 regressions," restoring the model’s previously noted creativity and improving the formatting quality of its responses. This includes better use of headers, bullet points, and structured output—enhancements that Google claims have been well received in user testing.
Outpacing Competitors in Coding Benchmarks
Gemini 2.5 Pro’s latest iteration continues to strengthen its position as a top-tier coding assistant. The updated model achieved an impressive 82.2% score on the Aider Polyglot test, outperforming leading models from OpenAI, Anthropic, and DeepSeek. The prior I/O Edition (05-06) was already a strong performer in coding tasks, but this release further refines that capability.
A Step Toward Stability
With this update, Google introduces configurable thinking budgets for developers—a feature that gives more control over how the model allocates cognitive resources during complex queries. Kilpatrick notes that this version is expected to become the long-term stable release, marking the end of its "Preview" status in upcoming app and web deployments.
Benchmarking Performance: LMArena and WebDevArena
Google places significant emphasis on user-perceived quality, frequently leveraging platforms like LMArena and WebDevArena where users compare AI-generated outputs blind. Gemini 2.5 Pro’s updated version has bolstered its lead on both platforms, achieving a 24-point Elo gain on LMArena and a 35-point jump on WebDevArena. These improvements signal growing user preference for Google’s model in side-by-side evaluations.
A Model That Thinks Clearly—Even About Magenta
In internal evaluations, the new model also successfully handled a long-standing benchmark test. When asked, “Would the color be called ‘magenta’ if the town of Magenta didn’t exist?”—a query that often stumps other models—Gemini 2.5 Pro gave an unambiguous and correct response: “No.” This showcases its strengthened reasoning abilities.
Availability
Developers and early testers can explore the latest Gemini 2.5 Pro today in Vertex AI and AI Studio. A broader release across the Gemini mobile app and web interface is expected in the coming weeks.