Metaverse Media Group

OpenAI’s new agent moves its 2017 vision for AI closer to reality

by The Decoder
22 July 2025
OpenAI has been working on a versatile AI agent for years. With the new ChatGPT agent, the company is relying on massive computing power, targeted reinforcement learning and a strong pretrained foundation – and is pursuing a vision that goes back to 2017. The article “OpenAI’s new agent moves its 2017 vision for AI closer to reality” first appeared on THE DECODER.

OpenAI’s 2017 research paper “World of Bits” ended with a clear-eyed assessment: “We showed that while standard supervised and reinforcement learning techniques can be applied to achieve adequate results across these environments, the gap between agents and humans remains large, and welcomes additional modeling advances.”

That paper outlined a long-term vision for the company, one that’s now inching closer to reality with the new ChatGPT agent. Casey Chu, a member of the development team, confirmed in a recent interview that this goal never faded: “This project has a very long lineage, dating back to around 2017. Our codename is ‘World of Bits 2’ for the computer use part.” The lineage stretches back even further – in 2016, OpenAI released a blog post about the related training environment Universe.

But the way OpenAI tries to close that “large gap” has fundamentally changed. The biggest shift is the starting point: instead of beginning from scratch, the new agent is built on top of a massive, unsupervised, pretrained foundation model. That baseline competence is now required for everything that follows. “Before we apply Reinforcement Learning, the model must be good enough to achieve a basic completion of tasks,” says Issa Fulford.

According to OpenAI, reinforcement learning is very data-efficient

OpenAI now relies on reinforcement learning (RL) for crucial fine-tuning, calling the process extremely data-efficient: “The scale of the data is minuscule compared to the scale of pre-training data. We are able to teach the model new capabilities by curating these much smaller, high-quality datasets,” Fulford explains. These datasets are made up of dynamic collections of difficult, targeted tasks. The team starts by defining what they want the agent to accomplish, then designs training scenarios accordingly. “We work backwards from the use cases we want to solve to train the model and build the product,” Fulford adds.

In hands-on training, the agent faces these tasks and has to figure out solutions without being told how. As Chu puts it, “We essentially give the model all these tools, lock it in a room, and it experiments. We don’t tell it when to use what tool, it figures that out by itself.” The mechanism driving this experimental learning is simple but effective: a reward system based on the outcome. Edward Sun explains: “As long as you can grade the task – judge whether the model’s performance on the result was good or not – you can reliably train the model to become even better at it.”
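
To make the idea of an outcome-based reward concrete, here is a deliberately tiny sketch. It is not OpenAI's training setup: the tools, the tasks and the epsilon-greedy update rule are all assumptions chosen for illustration. The only point it shares with the description above is that the agent is never told which tool to use and only the final result is graded.

```python
import random

# Hypothetical tools the agent may call; the names are illustrative only.
TOOLS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
    "subtract": lambda a, b: a - b,
}

# Hypothetical tasks: inputs plus the expected final result.
# Nothing here says which tool solves which task.
TASKS = {
    "total_of_two_invoices": ((120, 80), 200),
    "area_of_room": ((4, 5), 20),
    "change_due": ((50, 18), 32),
}


def grade(result, expected):
    """Outcome-only grader: 1.0 if the final answer is right, else 0.0."""
    return 1.0 if result == expected else 0.0


def train(episodes=2000, epsilon=0.2, seed=0):
    """Learn a task-to-tool mapping from graded outcomes (epsilon-greedy)."""
    rng = random.Random(seed)
    tool_names = sorted(TOOLS)
    task_names = sorted(TASKS)
    # value[task][tool] is the running average reward observed so far.
    value = {t: {n: 0.0 for n in tool_names} for t in task_names}
    count = {t: {n: 0 for n in tool_names} for t in task_names}
    for _ in range(episodes):
        task = rng.choice(task_names)
        args, expected = TASKS[task]
        if rng.random() < epsilon:
            tool = rng.choice(tool_names)  # explore: try any tool
        else:
            tool = max(tool_names, key=lambda n: value[task][n])  # exploit
        reward = grade(TOOLS[tool](*args), expected)
        count[task][tool] += 1
        value[task][tool] += (reward - value[task][tool]) / count[task][tool]
    # The learned policy picks the highest-valued tool for each task.
    return {t: max(tool_names, key=lambda n: value[t][n]) for t in task_names}


policy = train()
print(policy)
```

A real agent would use a learned model and far richer tasks, but the same condition applies: if the result can be graded, the grade alone is enough of a signal to improve tool choice.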

Massive scaling of computing power

This approach, where only the final result needs to be evaluated, is far more efficient than collecting thousands of human demonstrations for every mouse click and keystroke. It lets OpenAI train agents across hundreds of thousands of virtual machines at once, allowing them to independently discover the best solutions to complex problems.
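
As a rough sketch of why outcome-only grading parallelizes well, the snippet below runs several independent attempts at once and scores only each final answer. Everything in it is hypothetical: threads stand in for virtual machines, `run_episode` is a placeholder for a sandboxed attempt, and the expected answer is made up.

```python
from concurrent.futures import ThreadPoolExecutor


def run_episode(worker_id):
    """Placeholder for one sandboxed attempt; returns (worker_id, final_answer)."""
    # Hypothetical behaviour: even-numbered workers reach the right answer.
    answer = 42 if worker_id % 2 == 0 else 0
    return worker_id, answer


def grade(answer, expected=42):
    """Score only the final result, not the individual steps taken."""
    return 1.0 if answer == expected else 0.0


def collect_rewards(num_workers=8):
    """Run all episodes concurrently and grade each outcome."""
    # Threads stand in for separate virtual machines in this sketch.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(run_episode, range(num_workers)))
    return {worker_id: grade(answer) for worker_id, answer in results}


rewards = collect_rewards()
success_rate = sum(rewards.values()) / len(rewards)
print(rewards, success_rate)
```

Because no episode needs a human demonstration or step-by-step labels, adding more workers only requires more machines and a grader that can keep up.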

The “additional modeling advances” the 2017 paper called for came not from a new algorithm but from scaling up at every level. “Essentially, the scale of the training has changed,” Chu says. “I don’t know the exact multiplier, but it must be something like 100,000x in terms of compute.”

For now, OpenAI says the agent still shouldn’t be used for critical tasks.

Read the full article on The-Decoder.com
© 2022 Metaverse Media Group – The Metaverse Mecca
