Metaverse Media Group

OpenAI’s math breakthrough might also mean AI is getting better at knowing its own limits

by The Decoder
30 July 2025
A Stanford professor has spent the past year testing the same unsolved math problem on OpenAI’s models, unintentionally tracking their progress in self-assessment along the way. This article first appeared on THE DECODER.

“I’ve actually been emailing with the Stanford mathematics professor. He emailed me about a year ago before we announced o1 and said, ‘Hey, do you want to do a collaboration on solving hard math problems?’ Basically, I told him I think we just have to advance general reasoning capabilities, and eventually they’re going to be able to help you with your hard math problems. I think that’s actually the most promising route to getting there. He was a little skeptical, but with every model release, every reasoning model release, he emails me with a follow-up and asks, ‘Can it solve this problem now?’ I plug them in and send him the output, and he says, ‘Yeah, that’s wrong,’” recalls Noam Brown of OpenAI.

But after OpenAI’s recent breakthrough at the International Mathematical Olympiad, something important has changed: “He emailed me a follow-up this time with the same problem, asking, ‘Hey, can it solve it now?’ It still can’t solve it, but at least this time it recognizes that it can’t solve it, so I think that’s a big step.” The same behavior showed up at the IMO itself: instead of hallucinating, the model simply said “no answer” to this year’s hardest problem. As Brown puts it, “I think it was good to see the model doesn’t try to hallucinate or just make up some solution, but instead will say ‘no answer.’”

This accidental long-term study reveals a kind of progress that standard benchmarks have missed: the models might be getting a little better at recognizing their own limitations, rather than generating confident but wrong answers.

A Spanish research team takes a similar view of the much-discussed results from Apple’s reasoning study. There too, reasoning models like o3 stopped their output prematurely. While Apple’s researchers saw this as a simple failure, the Spanish team argues it’s evidence of a learned strategy: the models realize they’ve hit a wall and stop.

It will likely be some time before we’re fully protected from AI-generated bullshit. OpenAI plans to make its IMO model available to mathematicians for testing, but the core improvements behind this progress are not expected to appear in commercial models for several more months.

Read the full article on The-Decoder.com

© 2022 Metaverse Media Group – The Metaverse Mecca
