Nearly 29 percent of “Humanity’s Last Exam” chemistry/biology answers are wrong or misleading

by The Decoder
24 July 2025

It looks like humanity might flunk its own “final AI exam.” According to FutureHouse, about 29 percent of the biology and chemistry questions in the AI benchmark Humanity’s Last Exam (HLE) have answers that are incorrect or misleading when checked against the published literature. The error rate was uncovered through a combination of human review and AI-assisted analysis.

HLE was built to push language models to their limits with especially tough questions, but the analysis suggests that many of its items are themselves misleading or wrong. The experts who reviewed the questions spent only a few minutes on each, and a full accuracy check wasn’t required. In response, FutureHouse has released a smaller, vetted subset called “HLE Bio/Chem Gold” on Hugging Face.

Read the full article on The-Decoder.com