Anthropic, an AI startup, launched its Claude 3 AI model on Monday. The company claims it sets new benchmarks for cognitive tasks, positioning it as the most intelligent AI yet.

Claude 3 comes in three versions, Haiku, Sonnet, and Opus, in order of increasing capability. Co-founder Daniela Amodei said the new models answer questions correctly twice as often as similar AI chatbots.

The top model, Claude 3 Opus, exhibits near-human comprehension for complex tasks, according to Anthropic. Opus surpassed GPT-4 on 10 AI benchmarks testing knowledge, coding, math, and more. The margins of victory are narrow in some instances: on the five-shot MMLU undergraduate knowledge test, Opus scored 86.8% versus GPT-4’s 86.4%. The gaps are more significant in others, such as the Multilingual Math (MGSM) benchmark, where Opus scored 90.7%, leaving GPT-4 trailing at 74.5%.

Moreover, Anthropic states that the model shows improvements in analysis, forecasting, content creation, multilingual abilities, code generation, and vision, with the ability to process images and diagrams as GPT-4V does.

However, early users report that the model sometimes struggles with complex reasoning and math, despite excelling at factual questions and text extraction. Additionally, it shows biases favoring certain racial groups.

To address these limitations, Anthropic emphasizes Claude 3’s safety features, which are designed to prevent the generation of harmful or illegal content. The company pioneered Constitutional AI, an approach that establishes ethical values the system must follow.

While Claude 3 is currently the costliest language model, Anthropic plans to release more affordable versions soon. Overall, early reports and benchmarks suggest it is a significant advance in large language models.