Gemini Deep Think learns math, wins gold medal at International Math Olympiad


In the past, making LLMs better at math would involve reinforcement learning with final answers. Luong explained to Ars that models trained in this way can get to the correct answer, but they have “incomplete reasoning,” and part of the IMO grading is based on showing your work. To prepare Deep Think for the IMO, Google used new reinforcement learning techniques with higher-quality “long answer” solutions to mathematical problems, giving the model better grounding in how to handle every step on the way to an answer. “With this kind of training, you can actually get robust, long-form reasoning,” said Luong.

As you might expect, Deep Think takes longer to generate an output than the simpler versions you can access in the Gemini app. However, the AI followed the same rules as the flesh-and-blood participants, which was only possible because of its ability to ingest the problems as natural language. Gemini was provided with the problem descriptions and gave its answers within the 4.5-hour time limit of the competition.

Rigorous proofs

AI firms like DeepMind have taken an interest in the IMO over the past few years because it presents a unique challenge. While the competition is aimed at pre-university mathematicians, the questions require critical thinking and an understanding of multiple mathematical disciplines, including algebra, combinatorics, geometry, and number theory. Only the most advanced AI models have any hope of accurately answering these multi-layered problems.

The DeepMind team has pointed out some interesting aspects of Deep Think’s performance, which they say come from its advanced training. In the third problem, for example, many human competitors applied a graduate-level concept called Dirichlet’s Theorem, using mathematics outside the intended scope of the competition. However, Deep Think recognized that it was possible to solve the problem with simpler math. “Our model actually made a brilliant observation and used only elementary number theory to create a self-contained proof of the given problem,” said DeepMind researcher and Brown University professor Junehyuk Jung.


