Metaverse Media Group

OpenAI’s math gold hints that AI may soon tackle even longer and harder tasks

by The Decoder
21 July 2025

An unreleased AI model from OpenAI has reportedly solved five out of six problems from the International Mathematical Olympiad (IMO) under competition conditions. But the real story is not what it solved, but how it did it.

OpenAI says an experimental language model scored 35 out of 42 possible points in an internal IMO-style test, enough for a gold medal. Each of the six IMO problems is worth up to seven points, so five complete solutions yield 35. Three former IMO winners independently graded the model’s natural language proofs, which were evaluated just like submissions from human contestants. According to the company, the test mirrored real IMO rules: two four-and-a-half-hour sessions, no internet, and no external tools or code, just text.

OpenAI claims the model wasn’t specifically trained on IMO tasks. Instead, it was developed as a general-purpose reasoning model, drawing on recent advances in reinforcement learning and using substantial compute during inference. Researcher Alexander Wei emphasized in an X post that this was not a task-specific system, but one capable of autonomously generating complex, multi-page proofs. There are hints it might even be a multi-agent system.

Sustained reasoning without tools

What makes this achievement stand out is that the model reasoned consistently for hours at a time without any symbolic tools like code interpreters or mathematical software. That sets it apart from other high-performing systems such as DeepMind’s AlphaProof, which rely on hybrid neuro-symbolic approaches.

Until recently, it was widely believed that language models couldn’t sustain consistent mathematical reasoning over long sessions. As recently as June, mathematician Terence Tao said on the Lex Fridman Podcast that IMO-level problems were too difficult for AI to solve in real time. “You can’t hire enough humans to grade those,” Tao said, referring to the labor-intensive verification of long proofs in reinforcement learning training.

The result came as a surprise, even to prediction markets, which put the odds of an AI winning IMO gold before the end of 2025 at under 20 percent. (These forecasts used slightly stricter criteria.)

Both the markets and Tao seemed to assume that a reasoning model like o3 would need to be trained explicitly for IMO proofs, receiving expert feedback at every step. OpenAI, however, appears to have found a more general method for eliciting this behavior. Wei also highlighted that the model wasn’t tailored for the task, but instead was a generalist reasoning system.

OpenAI researcher Jerry Tworek says the reinforcement learning system used here also helped train ChatGPT Agent and the model that recently took second place in the heuristic contest of the AtCoder World Tour Finals, where it generated code non-stop for nearly ten hours.

Transparency questions

As usual, OpenAI’s claims have sparked criticism. Gary Marcus called the achievement impressive but raised a list of questions in an X post: How is the model architecturally different from its predecessors? What were the costs per problem? Was the model trained on raw text or preprocessed data? And how transferable are these results to other scientific domains? So far, OpenAI has kept all those details under wraps.

OpenAI has faced similar criticism before, notably for a lack of transparency around the ARC-AGI benchmark test. The ARC Prize Foundation found that the final o3 model performed worse than a previously tested preview version. It also came to light only after the fact that OpenAI had funded the supposedly independent FrontierMath benchmark, shortly after the company posted a record result there.

A scalable approach to reasoning?

In a recent essay, “How o3 and Grok 4 accidentally vindicated neurosymbolic AI,” Marcus argued that modern AI models are increasingly relying on symbolic tools like code interpreters to overcome the limits of pure language models.

OpenAI’s IMO system, on the other hand, worked entirely in text – no tools – which, if the results hold up, would be a notable exception. If the model’s ability to generalize is confirmed, it could call Marcus’s thesis into question, at least in part. Still, his main criticism remains: without methodological transparency, it’s hard to interpret these achievements.

OpenAI seems to have built a language model that can reason consistently for hours without any external tools. That would have been hard to imagine just a short time ago. The generalist reasoning approach appears to scale, at least for now. According to OpenAI, the next step is reasoning sessions that last several days.

Read the full article on The-Decoder.com