Metaverse Media Group
OpenAI’s math gold hints that AI may soon tackle even longer and harder tasks

by The Decoder
21 July 2025

An unreleased AI model from OpenAI has reportedly solved five out of six problems from the International Mathematical Olympiad (IMO) under competition conditions. But the real story is not what it solved, but how it did it.

OpenAI says an experimental language model scored 35 out of 42 possible points in an internal IMO-style test – enough for a gold medal. Three former IMO medalists independently graded the model’s natural language proofs, which were evaluated just like submissions from human contestants. According to the company, the test mirrored real IMO rules: two four-and-a-half-hour sessions, no internet, no external tools or code – just text.
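The score lines up with the problem count: the IMO has six problems worth seven points each, so five complete solutions add up to 35 of 42. A minimal sketch of that arithmetic follows; the function names are illustrative, and it assumes full marks on each solved problem and zero on the unsolved one (real grading also allows partial credit).

```python
# Illustrative arithmetic only. The constants reflect standard IMO scoring;
# the function names are hypothetical and not taken from OpenAI.
POINTS_PER_PROBLEM = 7
NUM_PROBLEMS = 6


def max_score() -> int:
    """Highest possible total across all problems."""
    return POINTS_PER_PROBLEM * NUM_PROBLEMS


def score_for_full_solutions(solved: int) -> int:
    """Total score, assuming every solved problem earns full marks."""
    if not 0 <= solved <= NUM_PROBLEMS:
        raise ValueError("solved must be between 0 and NUM_PROBLEMS")
    return solved * POINTS_PER_PROBLEM


print(max_score())                  # 42
print(score_for_full_solutions(5))  # 35
```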

OpenAI claims the model wasn’t specifically trained on IMO tasks. Instead, it was developed as a general-purpose reasoning model, drawing on recent advances in reinforcement learning and using substantial compute during inference. Researcher Alexander Wei emphasized in an X post that this was not a task-specific system, but one capable of autonomously generating complex, multi-page proofs. There are hints it might even be a multi-agent system.

Sustained reasoning without tools

What makes this achievement stand out is that the model reasoned consistently for hours at a time without any symbolic tools like code interpreters or mathematical software. That sets it apart from other high-performing systems such as DeepMind’s AlphaProof, which rely on hybrid neuro-symbolic approaches.

Until recently, it was widely believed that language models couldn’t sustain consistent mathematical reasoning over long sessions. As recently as June, mathematician Terence Tao said on the Lex Fridman Podcast that IMO-level problems were too difficult for AI to solve in real time. “You can’t hire enough humans to grade those,” Tao said, referring to the labor-intensive verification of long proofs in reinforcement learning training.

The result came as a surprise, even to prediction markets, which put the odds of an AI winning IMO gold before the end of 2025 at under 20 percent. (These forecasts used slightly stricter criteria.)

Both the markets and Tao seemed to assume that a reasoning model like o3 would need to be trained explicitly for IMO proofs, receiving expert feedback at every step. OpenAI, however, appears to have found a more general method for eliciting this behavior. Wei also highlighted that the model wasn’t tailored for the task, but instead was a generalist reasoning system.

OpenAI researcher Jerry Tworek says the reinforcement learning system used here also helped train ChatGPT Agent and the model that recently took second place in the Heuristic contest of the AtCoder World Tour Finals, where it generated code non-stop for nearly ten hours.

Transparency questions

As usual, OpenAI’s claims have sparked criticism. Gary Marcus called the achievement impressive but raised a list of questions in an X post: How is the model architecturally different from its predecessors? What were the costs per problem? Was the model trained on raw text or preprocessed data? And how transferable are these results to other scientific domains? So far, OpenAI has kept all those details under wraps.

OpenAI has faced similar criticism before, notably for a lack of transparency around the ARC-AGI benchmark test. The ARC Prize Foundation found that the final o3 model performed worse than a previously tested preview version. It also emerged only after the fact that OpenAI had funded the supposedly independent FrontierMath benchmark, shortly after the company posted a record result there.

A scalable approach to reasoning?

In a recent essay, “How o3 and Grok 4 accidentally vindicated neurosymbolic AI,” Marcus argued that modern AI models are increasingly relying on symbolic tools like code interpreters to overcome the limits of pure language models.

OpenAI’s IMO system, on the other hand, worked entirely in text – no tools – which, if the results hold up, would be a notable exception. If the model’s ability to generalize is confirmed, it could call Marcus’s thesis into question, at least in part. Still, his main criticism remains: without methodological transparency, it’s hard to interpret these achievements.

For now, OpenAI seems to have built a language model that can reason consistently for hours – without any external tools. That would have been hard to imagine just a short time ago. The generalist reasoning approach appears to scale, at least for now. According to OpenAI, the next step is reasoning sessions that last several days.

Read the full article on The-Decoder.com