Anthropic plans to fund a new generation of more comprehensive AI benchmarks.

Anthropic is launching a program to fund the development of new types of benchmarks to evaluate the performance and impact of AI models, including generative models like Claude.

Anthropic’s program, unveiled Monday, will distribute payments to third-party organizations that can “effectively measure advanced capabilities of AI models,” the company said in a blog post. Those interested can submit applications to be evaluated on a rolling basis.

“Our investment in these assessments is intended to advance the entire field of AI safety and provide a valuable tool that benefits the entire ecosystem,” Anthropic wrote on its official blog. “Developing high-quality safety-related assessments remains challenging, and demand continues to outpace supply.”

As I’ve highlighted before, AI has a benchmarking problem. The most commonly cited benchmarks for AI today fall short of capturing how the average person actually uses the system under test. There are also questions about whether some benchmarks, especially those published before the dawn of modern generative AI, measure what they purport to measure, given their age.
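To make that gap concrete, here is a toy sketch (my own illustration, not anything from Anthropic’s program) of the kind of static multiple-choice harness many popular benchmarks amount to. Every name in it, including the trivial toy_model, is invented; the point is that a model can ace a fixed answer key while being useless for the open-ended work people actually bring to a chatbot.

```python
# Toy illustration (hypothetical, not Anthropic's tooling): a static
# multiple-choice benchmark with fixed gold answers. The "model" here is
# a stand-in for any text-generation system.

def toy_model(prompt: str) -> str:
    """A degenerate model that answers 'B' to every question."""
    return "B"

# A static benchmark: fixed questions, one gold answer each.
STATIC_BENCHMARK = [
    {"question": "2 + 2 = ?  (A) 3  (B) 4  (C) 5", "answer": "B"},
    {"question": "Capital of France? (A) Rome (B) Paris (C) Oslo", "answer": "B"},
]

def run_static_benchmark(model) -> float:
    """Score = fraction of items where the output exactly matches the gold answer."""
    correct = sum(
        model(item["question"]).strip() == item["answer"]
        for item in STATIC_BENCHMARK
    )
    return correct / len(STATIC_BENCHMARK)

if __name__ == "__main__":
    # The degenerate model scores 100% here, yet it would fail at the
    # drafting, coding, and summarizing tasks average users actually
    # perform -- exactly the mismatch described above.
    print(f"Static benchmark accuracy: {run_static_benchmark(toy_model):.0%}")
```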

The solution Anthropic proposes, which is easier said than done, is to create challenging benchmarks focused on AI security and societal impact through new tools, infrastructure, and methods.

The company specifically calls for tests that evaluate a model’s ability to perform tasks like conducting cyberattacks, “enhancing” weapons of mass destruction (such as nuclear weapons), and manipulating or deceiving people (such as through deepfakes or misinformation). For AI risks related to national security and defense, Anthropic says it is committed to developing a kind of “early warning system” to identify and assess risks, though the blog post does not specify what such a system would entail.

Anthropic also said the new program will support research into benchmarks and “end-to-end” tasks that probe AI’s potential for aiding scientific study, conversing in multiple languages, mitigating ingrained biases, and self-censoring toxicity.

To achieve all this, Anthropic envisions a new platform where subject-matter experts can develop their own evaluations, along with large-scale trials of models involving “thousands” of users. The company says it has hired a full-time coordinator for the program and may purchase or expand projects it believes have the potential to scale.

“We offer a variety of funding options to fit the needs and phases of each project,” Anthropic wrote in the post, though an Anthropic spokesperson declined to provide further details on those options. “Teams will have the opportunity to interact directly with domain experts from Anthropic’s Frontier Red Team, Fine-Tuning, Trust & Safety, and other relevant teams.”

Anthropic’s efforts to support new AI benchmarks are commendable, assuming it has the cash and manpower to back them up. But given the company’s commercial ambitions in the AI race, it may be difficult to trust it entirely.

In the blog post, Anthropic is fairly transparent about the fact that it wants the evaluations it funds to align with the AI safety classifications it has developed (with some input from third parties, such as the nonprofit AI research organization METR). That is within the company’s purview, but it may force program applicants to accept definitions of “safe” or “unsafe” AI that they don’t agree with.

Some in the AI community will likely also take issue with Anthropic’s references to “catastrophic” and “deceptive” AI risks, such as the dangers of nuclear weapons. Many experts say there is little evidence that AI as we know it will soon, or ever, gain world-ending capabilities or outsmart humans. These experts add that claims of imminent “superintelligence” only serve to distract from today’s pressing AI regulation issues, such as models’ tendency to hallucinate.

Anthropic wrote in the post that it hopes its program will serve as a “catalyst for progress toward a future where comprehensive AI assessments become the industry standard.” It’s a mission that many public, non-corporate efforts to create better AI benchmarks can relate to. But it’s not yet clear whether such efforts will ultimately be willing to join forces with AI vendors who are loyal to their shareholders.
