Terence Tao endorses it! The AI Mathematical Olympiad is here, offering a $5 million prize for a large model that can win IMO gold

Article source: QbitAI

Image source: Generated by Unbounded AI

An International Mathematical Olympiad set up specially for AI is here -

and the prize pool is a whopping $10 million!

The competition bills itself as "the new Turing Test." How will AI and humans be compared?

By going head-to-head with humanity's smartest math prodigies and winning a gold medal to the same standard.

Don't underestimate this event: even the math luminary Terence Tao has shown up to recommend it on the official website:

This competition provides a benchmark for identifying AI’s problem-solving strategies, which is exactly what we need right now.

As soon as the news came out, netizens were quite excited.

As the IMO president put it: which model can compete with the world's brightest crop of young people?

As the saying goes, where there is a great reward, brave challengers will follow. An AI blazing its own trail here is genuinely exciting.

AI Enters the IMO, With Up to $5 Million on the Line

The abbreviation of this competition is AI-MO.

Its stated aim is to advance the mathematical reasoning of large language models and to encourage new AI models that can match the highest level of human mathematics, as represented by the IMO.

Why choose IMO as a benchmark?

IMO problems fall into four broad categories: algebra, geometry, number theory, and combinatorics. They require no advanced mathematics, but they demand the right way of thinking and genuine mathematical maturity.

Statistics show that IMO gold medalists are 50 times more likely to win a Fields Medal than the average Cambridge PhD graduate.

In addition, half of all Fields Medal winners competed in the IMO.

Modeled on that competition, the AI-dedicated AI-MO contest will open in early 2024.

The organizing committee requires that participating AI models process the problems in the same format as human contestants and produce human-readable final answers, which a panel of experts will then score against IMO standards.

The results will be announced at the 65th IMO in Bath, UK, in July 2024.

Any AI that reaches gold-medal level will receive the $5 million grand prize.

Models that fall short but "achieve key milestones" will share a set of progress prizes totaling another $5 million.

Notably, to be eligible for a prize, participants must comply with the AI-MO public sharing protocol; that is, the winning model must be open source.

The specific rules are still being worked out. The organizers are currently recruiting advisory-board members (especially mathematicians and AI/machine-learning experts) and a director to lead the competition; all positions are paid and fully remote. Which big names will join remains to be seen.

One caveat: AI-MO is not an official competition initiated by the IMO.

The actual sponsor is XTX Markets, a London-based non-bank financial firm that does machine-learning-driven quantitative trading.

Nor is this XTX Markets' first push to support mathematics:

last year, it also set up a scholarship with the University of Oxford to encourage female students to study mathematics.

As for the competition itself, netizens have started speculating: which AI model stands the best chance?

GPT-4 with the Wolfram plug-in was the first name floated, and also the first to have cold water poured on it.

Still, OpenAI, the company behind it, remains a favorite (even though big tech companies are not the competition's target audience).

Some pessimistic netizens flatly declared:

Nice idea, but nobody will manage it within five years.

Others argued the opposite:

Training such a model is not the hard part; acquiring and processing the data is, since these problems involve not only text but also many diagrams and symbols with intricate meanings.

Everything will be revealed in 2024.

It is worth mentioning that AI-MO is not the first attempt to pit AI against the IMO.

Back in 2019, researchers from OpenAI, Microsoft, Stanford University, Google, and other institutions launched a competition called the IMO Grand Challenge.

No One Has Met the Challenge Yet

The IMO Grand Challenge was likewise set up to find an AI that can win an IMO gold medal.

Let's look at the five rules this math contest sets for AI:

About the format. To ensure the rigor and verifiability of proofs, both the problem and the proof must be stated formally, i.e., machine-verifiably.

In other words, each IMO problem is translated into a statement in the language of the Lean theorem prover and fed to the AI, which must also write its proof in Lean.
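To give a flavor of that format, here is a toy Lean 4 statement with a machine-checkable proof. It only shows the shape of a formal statement plus its proof; it is nowhere near actual IMO difficulty:

```lean
-- Toy statement: addition of natural numbers is commutative.
-- A real IMO formalization is vastly longer; this merely illustrates
-- the formal statement-plus-proof style the challenge requires.
theorem toy_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Lean's kernel checks the proof term mechanically, which is exactly the "machine-verifiable" property the rules demand.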

About scoring. Each of the AI's proofs is judged simply correct or incorrect within 10 minutes, matching the time an IMO grader spends per solution. Unlike human contestants, the AI gets no partial credit.

About resources. Like humans, the AI must solve 3 problems per 4.5-hour day over 2 days; computing resources are unlimited.

About reproducibility. The AI must be open source, publicly released, and reproducible by the end of the first day of the IMO, and it may not connect to the Internet.

About the challenge itself. The ultimate goal is for the AI to win gold 🏅, just as humans do.

The competition was initiated by seven AI researchers and mathematicians:

Daniel Selsam of OpenAI, Leonardo de Moura of Microsoft Research, Kevin Buzzard of Imperial College London, Reid Barton of the University of Pittsburgh, Percy Liang of Stanford University, Sarah Loos of Google AI, and Freek Wiedijk of Radboud University.

Four years on, it has drawn a fair number of challengers.

However, although many AI and mathematics researchers have taken on this field, or small pieces of it, they remain far from the ultimate goal of IMO gold.

There have even been suggestions that the game needs an "easy mode":

For example, researcher Xi Wang tried several existing SMT solvers on real IMO problems, with mediocre results.

At the time, existing AI could prove some of the easier results, such as Napoleon's theorem: erect an equilateral triangle outward on each side of an arbitrary triangle, and the centers of those three triangles themselves form an equilateral triangle.
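Napoleon's theorem is simple enough to spot-check numerically. The sketch below (a quick illustration, not any of the solvers discussed here) encodes points as complex numbers and verifies that the three centers are pairwise equidistant:

```python
import cmath
import math
import random

def napoleon_centers(a, b, c):
    """Centers of the equilateral triangles erected on the sides of
    triangle (a, b, c), with vertices given as complex numbers.
    Rotating one side endpoint about the other by -60 degrees gives the
    apex of the erected triangle; its center is the centroid."""
    w = cmath.exp(-1j * math.pi / 3)

    def center(p, q):
        apex = p + (q - p) * w
        return (p + q + apex) / 3

    return center(a, b), center(b, c), center(c, a)

# Spot-check on a random triangle: the three centers should be
# pairwise equidistant, i.e. form an equilateral triangle.
random.seed(0)
a, b, c = (complex(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(3))
n1, n2, n3 = napoleon_centers(a, b, c)
d12, d23, d31 = abs(n1 - n2), abs(n2 - n3), abs(n3 - n1)
assert math.isclose(d12, d23, rel_tol=1e-9)
assert math.isclose(d23, d31, rel_tol=1e-9)
```

A numeric spot-check is of course far weaker than the formal, machine-verified proof the challenges demand; it only confirms the statement on sample inputs.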

However, on other real problems, such as the IMO 2019 geometry question, the existing solvers either failed outright or ran half an hour over time.

Likewise, OpenAI researchers Dan Selsam and Jesse Michael Han (then still at Microsoft) spent some time on AI for IMO geometry problems and wrote up a blog post.

The post explains how they conceived a geometry solver and walks through its design, including:

geometric representations, constraint solving, algorithm selection, solver architecture, and the challenges they hit along the way.

On geometric representations, for example: encoding geometry problems in a format computers can process, and conversely having the solver automatically render its internal representation as diagrams that humans can easily read.

It also covers how to choose an appropriate solution algorithm for different types of IMO geometry problems.
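To picture the "constraint solving" step in miniature, the toy sketch below chains known equal-distance facts with union-find and checks whether a goal follows. All names and the single deduction rule here are illustrative inventions, not taken from the actual solver:

```python
# Toy "constraint solving": facts of the form |PQ| = |RS| are merged
# into equivalence classes with union-find; a goal is "proved" if its
# two distances end up in the same class.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

# Hypotheses: |OA| = |OB| and |OB| = |OC|.  Goal: |OA| = |OC|.
uf = UnionFind()
uf.union(("O", "A"), ("O", "B"))
uf.union(("O", "B"), ("O", "C"))
proved = uf.find(("O", "A")) == uf.find(("O", "C"))
print(proved)  # True
```

A real geometry solver juggles many interacting constraint types (incidence, angles, circles), but the core idea of propagating known facts until the goal is reached is the same.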

Even so, the post offers no concrete solution, saying only at the end that "the solver has the potential to achieve the goal of winning IMO gold."

And the geometry problems these challengers targeted make up only a quarter of the IMO's problem types, alongside algebra, combinatorics, and number theory…

Four years after its launch, there is still no true AI "IMO all-rounder," but as the originator of the idea, the IMO Grand Challenge has made real waves in the industry.

Alex Gerko, XTX Markets' founder, admits that the IMO Grand Challenge is what prompted him to hold AI-MO:

It's time to inject some fresh excitement into "AI takes on the IMO"!

And indeed, the AI-MO prize money has caught the attention of the IMO Grand Challenge organizers and many challengers:

will money 💰 really drive some AI to crack hard math problems, beat scores of humans, and take an IMO gold medal?

Judging from the current strength, which AI company do you think is most likely to take the lead?

