Gemini 2.5 Pro takes the triple crown! The strongest AI coding model sweeps the leaderboards and crushes Claude 3.7

Google’s Gemini 2.5 Pro (I/O edition) has arrived and gone straight to the top of LMArena, taking first place in text, vision, and coding at once. It even crushes Claude 3.7 in programming, making it the strongest coding model on the planet.

The AI programming throne changed hands overnight.

Last night, Google released the newly upgraded Gemini 2.5 Pro Preview (I/O edition), which took three first places in one fell swoop and reached the top of LMArena.

It became the first SOTA model to sweep the text, vision, and WebDev Arena leaderboards, crushing Claude 3.7 Sonnet in coding performance.

Whether it’s code transformation, code editing, or building complex agent workflows, Gemini 2.5 Pro handles it with ease.

Just draw a sketch, and Gemini 2.5 Pro can turn it into a working drawing app.

With a single prompt, it turns a photo of nature into code that reproduces its distinctive patterns.

With one sentence, it builds a mini-game featuring your dog.

Hassabis proudly played it down: just a casual 147-point Elo gain, no big deal.
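For a sense of scale, a 147-point gap can be translated into an expected head-to-head win rate. The sketch below assumes the standard Elo formula (base 10, scale 400); whether LMArena’s ratings follow exactly this scale is an assumption here, and `expected_win_rate` is an illustrative name, not something from Google or LMArena.

```python
def expected_win_rate(rating_gap: float, scale: float = 400.0) -> float:
    """Expected score of the higher-rated side under the standard Elo model.

    rating_gap is (higher rating - lower rating); scale=400 is the
    conventional Elo constant and is an assumption for arena leaderboards.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


if __name__ == "__main__":
    # Win rate implied by the 147-point gain mentioned above.
    print(f"{expected_win_rate(147):.3f}")
```

Under these assumptions, a 147-point lead corresponds to winning roughly 70% of pairwise comparisons against the older version.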

Jeff Dean, Google’s chief scientist and head of Gemini, also posted several tweets introducing the updated Gemini 2.5 Pro and expressing confidence in its performance.

Developers can start building with the updated Gemini 2.5 Pro now through the Gemini API in Google AI Studio and Vertex AI. The new model is also live in the Gemini app, where it supports Canvas and other features.
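As a rough illustration of what calling the model through the Gemini API can look like, here is a minimal sketch that uses only the Python standard library against the public REST `generateContent` endpoint. The model ID `gemini-2.5-pro-preview-05-06`, the `GEMINI_API_KEY` environment variable, the prompt, and the helper names `build_request` and `generate` are assumptions made for this example; check Google AI Studio for the current model name before relying on it.

```python
import json
import os
import urllib.request

# Assumed values for illustration; neither appears in the article itself.
MODEL_ID = "gemini-2.5-pro-preview-05-06"
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt: str, api_key: str, model: str = MODEL_ID) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def generate(prompt: str, api_key: str) -> str:
    """Send the request and return the text of the first candidate."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        payload = json.load(resp)
    return payload["candidates"][0]["content"]["parts"][0]["text"]


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")  # hypothetical variable name
    if key:
        print(generate("Build a single-file HTML drawing app.", key))
    else:
        print("Set GEMINI_API_KEY to send a real request.")
```

Keeping request construction separate from sending makes the payload easy to inspect without network access; in practice most developers would probably use Google’s official SDK rather than raw HTTP.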

Anyone can write code and build interactive web applications with a single prompt.

01 The strongest programming model in history

Gemini 2.5 Pro Preview (I/O edition), released in early access, is a new upgrade to 2.5 Pro. The biggest gains are in programming, and it is especially good at creating engaging interactive web applications.

On vision benchmarks, Gemini 2.5 Pro Preview leads models such as GPT-4o and o3 by a wide margin.

On WebDev Arena, it became the first model to surpass Claude, and even the newly released GPT-4.1 trails it.

The benchmarks show that the new version of Gemini 2.5 Pro improves on the original across the board.

Google said on its official blog that it had planned to release this update at Google I/O, but given the enthusiasm for the model, it decided to ship early so that everyone could start building with it as soon as possible.

Beyond UI-focused development, the improvements extend to code transformation, code editing, and building complex agentic workflows.

The blog includes a small example: from a one-sentence prompt, the model expresses the pattern of the leaves in a picture as code.

The updated Gemini 2.5 Pro also delivers top-tier video understanding, scoring 84.8% on the VideoMME benchmark, and it can generate an interactive learning website from a single YouTube video.

Google’s blog also includes a video comparing it with the pre-update Gemini 2.5 Pro.

Michael Truell, CEO of the AI coding tool Cursor, said of the update: “We are very excited about the latest Gemini 2.5 Pro, which builds on its already powerful coding capabilities. We have internally observed a significant reduction in the failure rate of the new model when calling tools, and we believe this improvement will make users feel that 2.5 Pro performs better in Cursor than before.”

02 The king of “vibe coding” sets the internet abuzz

As soon as Gemini 2.5 Pro Preview was released, its popularity exploded. Developers have been building all kinds of fun demos with its powerful coding capabilities.

Google researcher JB Alayrac said the newly upgraded coding capabilities of Gemini 2.5 Pro are simply amazing.

Even more impressive, it combines its programming prowess with top-tier multimodal video understanding to turn YouTube videos directly into p5.js animations.

In another demo, from Google DeepMind researcher Ali Eslami, a 3D tour of the Art Institute of Chicago collection was “vibe coded” with Gemini 2.5 Pro.

Another researcher, Fei Xia, even called Gemini 2.5 Pro a model example of “vibe coding”.

It can easily write a smart garden planner that automatically generates an optimal layout from plant height, sunlight orientation, ideal planting spacing, and companion-plant pairings.

Dave Messer, head of AI product at Google, made a “listen and draw” game.

Tim Bettridg used Canvas to build a book recommendation app in one go: just take a photo of a bookshelf.

Patrick Loeber showed that it can also build a finance app with a more complex and polished interface.

Researcher Megan Ben Dor Ruthven used Gemini 2.5 Pro to create an interactive periodic table of the elements styled as game cards.

Developer Chetaslua asked Gemini 2.5 Pro to create a 3D demonstration website of how the Earth was formed.

In that test, o3’s output failed to compile and Claude 3.7’s crashed to a blue screen; only Gemini 2.5 Pro came through.

User Arthur Lee needed just one round of adjustments to generate a 3D solar system that looks beautiful and can be explored interactively.

Gemini 2.5 Pro can also perform real-time simulations.

In one demonstration, it dynamically simulates real-time changes in light and shadow and a day-night cycle, generates characters and names, and tracks heart rate and other physiological indicators in real time.

Another demo shows an app built in Gemini Canvas that explores the world with the Maps API.

In a physics simulation test, Gemini 2.5 Pro simulated water sloshing back and forth in a bucket, beating Claude 3.7 Sonnet and o3 in one fell swoop.

Together, these demos show off the powerful programming capabilities of Gemini 2.5 Pro.

AI commentator Andrew Curran said, “Gemini replacing Gemini is a signal that the top spot will still change hands, but the dragon has awakened.”

Resources:

https://techcrunch.com/2025/05/06/google-debuts-an-updated-gemini-2-5-pro-ai-model-ahead-of-i-o/

https://x.com/OfficialLoganK/status/1919770687167684808

https://x.com/GeminiApp/status/1919770661439865029

https://blog.google/products/gemini/gemini-2-5-pro-updates/
