The Download: Rethinking AI benchmarks, and the ethics of AI agents

November 27, 2024

Every time a new AI model is released, it’s typically touted as acing a series of benchmarks. OpenAI’s GPT-4o, for example, was launched in May with a compilation of results that showed its performance topping every other AI company’s latest model in several tests.

The problem is that these benchmarks are poorly designed, the results are hard to replicate, and the metrics they use are frequently arbitrary, according to new research. That matters because AI models’ scores against these benchmarks determine the level of scrutiny they receive.

AI companies frequently cite benchmarks as testament to a new model’s success, and those benchmarks already form part of some governments’ plans for regulating AI. But right now, they might not be good enough to use that way, and researchers have some ideas for how they should be improved.

—Scott J Mulligan

We need to start wrestling with the ethics of AI agents

Generative AI models have become remarkably good at conversing with us, and at creating images, videos, and music for us, but they’re not all that good at doing things for us.

AI agents promise to change that. Last week researchers published a new paper explaining how they trained simulation agents to replicate 1,000 people’s personalities with stunning accuracy.

AI models that mimic you could go out and act on your behalf in the near future. If such tools become cheap and easy to build, they will raise lots of new ethical concerns, but two in particular stand out.

—James O’Donnell
