AI safety and development company Anthropic is launching a new funding program aimed at more robust evaluation of artificial intelligence. The goal is to create a fresh generation of AI benchmarks – essentially tests that gauge an AI model’s capabilities and potential risks.

Current benchmarks, according to Anthropic, might not be keeping pace with the rapid advancements in AI. These new benchmarks would be designed to comprehensively assess a wider range of functionalities in AI models, including generative models like Anthropic’s own Claude, known for its text-based interactions.

The demand for high-quality AI assessments is outgrowing the available options, prompting Anthropic to provide financial backing. The initiative seeks to raise the bar for AI safety assessments by funding evaluations that measure a broader spectrum of capabilities, including an AI’s ability to reason, handle complex situations, and avoid generating harmful outputs.

By fostering a more diverse ecosystem of assessments, Anthropic hopes to address the intricate challenges in AI research and development. They emphasize the importance of these evaluations in mitigating potential risks associated with the growing sophistication of AI.

This move highlights a growing recognition within the AI development community that current methods of evaluating AI models might be insufficient. Anthropic’s initiative is a step towards creating a more comprehensive picture of AI capabilities, paving the way for the development of safer and more reliable AI systems.
