Anthropic, an AI startup, launched its Claude 3 AI model on Monday. The company claims it sets new benchmarks for cognitive tasks, positioning it as the most intelligent AI yet.

Claude 3 comes in three versions, Haiku, Sonnet, and Opus, in order of increasing capability. Co-founder Daniela Amodei said the new models answer questions correctly twice as often as similar AI chatbots.

The top model, Claude 3 Opus, exhibits near-human comprehension on complex tasks, according to Anthropic. Opus surpassed GPT-4 on 10 AI benchmarks testing knowledge, coding, math, and more. The margins of victory are narrow in some instances: on the five-shot MMLU undergraduate knowledge test, Opus scored 86.8% versus GPT-4’s 86.4%. The gaps are more significant in others, such as the Multilingual Grade School Math (MGSM) benchmark, where Opus scored 90.7%, leaving GPT-4 trailing at 74.5%.

Moreover, Anthropic states that the model offers improved analysis, forecasting, content creation, multilingual abilities, and code generation, as well as vision capabilities for processing images and diagrams, as GPT-4V does.

However, early users report it sometimes struggles with complex reasoning and math despite excelling at factual questions and text extraction. Additionally, it shows biases favoring certain racial groups.

To address such limitations, Anthropic emphasizes Claude 3’s safety features, which are designed to prevent the generation of harmful or illegal content. The company pioneered Constitutional AI, which establishes ethical values the system must follow.

Although Claude 3 is currently the costliest language model, Anthropic plans to release more affordable versions soon. Overall, early reports and benchmarks suggest that Claude 3 significantly advances large language models.