Metaverse Media Group
A New Benchmark for the Risks of AI

by Wired
4 December 2024

MLCommons, a nonprofit that helps companies measure the performance of their artificial intelligence systems, is launching a new benchmark to gauge AI’s bad side too.

The new benchmark, called AILuminate, assesses the responses of large language models to more than 12,000 test prompts in 12 categories including inciting violent crime, child sexual exploitation, hate speech, promoting self-harm, and intellectual property infringement.

Models are given a score of “poor,” “fair,” “good,” “very good,” or “excellent,” depending on how they perform. The prompts used to test the models are kept secret to prevent them from ending up as training data that would allow a model to ace the test.
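To make the grading idea concrete, here is a minimal Python sketch of how per-prompt judgments could be rolled up into one of the five labels. It is an illustration only: the article does not describe how MLCommons computes grades, so the cutoff values, the function names, and the notion of a "safe fraction" are all hypothetical assumptions. Only the five grade labels and the hazard categories come from the article.

```python
from collections import defaultdict

# Grade labels are from the article. The numeric cutoffs are HYPOTHETICAL,
# chosen for illustration; they are not MLCommons's actual thresholds.
GRADE_CUTOFFS = [
    (0.999, "excellent"),
    (0.99, "very good"),
    (0.97, "good"),
    (0.90, "fair"),
]


def grade(safe_fraction):
    """Map a fraction of safe responses (0.0 to 1.0) to a grade label."""
    for cutoff, label in GRADE_CUTOFFS:
        if safe_fraction >= cutoff:
            return label
    return "poor"


def safe_fraction_by_category(results):
    """Return the share of safe responses per hazard category.

    `results` is an iterable of (category, is_safe) pairs, one per test prompt.
    """
    totals = defaultdict(int)
    safe = defaultdict(int)
    for category, is_safe in results:
        totals[category] += 1
        safe[category] += int(is_safe)
    return {category: safe[category] / totals[category] for category in totals}


def overall_grade(results):
    """Grade a model on the share of safe responses across all prompts."""
    results = list(results)
    if not results:
        raise ValueError("no results to grade")
    safe_total = sum(int(is_safe) for _, is_safe in results)
    return grade(safe_total / len(results))
```

With these made-up cutoffs, a model whose responses were judged safe on 195 of 200 prompts would have a safe fraction of 0.975 and land in the "good" band. A real scoring scheme could weight categories differently or compare against reference models.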

Peter Mattson, founder and president of MLCommons and a senior staff engineer at Google, says that measuring the potential harms of AI models is technically difficult, leading to inconsistencies across the industry. “AI is a really young technology, and AI testing is a really young discipline,” he says. “Improving safety benefits society; it also benefits the market.”

Reliable, independent ways of measuring AI risks may become more relevant under the next US administration. Donald Trump has promised to get rid of President Biden’s AI Executive Order, which introduced measures aimed at ensuring AI is used responsibly by companies as well as a new AI Safety Institute to test powerful models.

The effort could also provide more of an international perspective on AI harms. MLCommons counts a number of international firms, including the Chinese companies Huawei and Alibaba, among its member organizations. If these companies all used the new benchmark, it would provide a way to compare AI safety in the US, China, and elsewhere.

Some large US AI providers have already used AILuminate to test their models. Anthropic’s Claude model, Google’s smaller model Gemma, and a model from Microsoft called Phi all scored “very good” in testing. OpenAI’s GPT-4o and Meta’s largest Llama model both scored “good.” The only model to score “poor” was OLMo from the Allen Institute for AI, although Mattson notes that this is a research offering not designed with safety in mind.

“Overall, it’s good to see scientific rigor in the AI evaluation processes,” says Rumman Chowdhury, CEO of Humane Intelligence, a nonprofit that specializes in testing or red-teaming AI models for misbehaviors. “We need best practices and inclusive methods of measurement to determine whether AI models are performing the way we expect them to.”

MLCommons says the new benchmark is meant to be similar to automotive safety ratings, with model makers pushing their products to score well and the standard then improving over time.

The benchmark is not designed to measure the potential for AI models to become deceptive or difficult to control, an issue that garnered attention after ChatGPT blew up in late 2022. Governments worldwide launched efforts to study this issue and AI companies have teams dedicated to researching and probing models for problematic behaviors.

Mattson says MLCommons’ approach is meant to be complementary but also more expansive. “Safety institutes are trying to do evaluations, but they’re not necessarily able to consider the full range of hazards that you may want to see from a full featured product safety space,” Mattson says. “We’re able to think about a broader array of hazards.”

Rebecca Weiss, executive director of MLCommons, adds that her organization should be better able to keep track of the latest developments in AI than slower-moving government bodies. “Policy makers have really good intent,” she says. “But sometimes aren’t necessarily able to keep up with the industry as it’s going forward.”

MLCommons has around 125 member organizations including big tech companies like OpenAI, Google, and Meta, and institutions including Stanford and Harvard.

No Chinese company has yet used the new benchmark, but Weiss and Mattson note that the organization has partnered with AI Verify, a Singapore-based AI safety organization, to develop standards with input from scientists, researchers, and companies in Asia.

Read the full article on Wired.com