Anthropic's latest AI model writes code with unmatched accuracy

By Mudit Dube

Aug 06, 2025

11:42 am

What's the story

Anthropic, a leading artificial intelligence firm, has unveiled its latest model, Claude Opus 4.1. The new addition to the Claude family focuses on improving accuracy in software engineering and agentic tasks. The company claims that the updated model reaches 74.5% accuracy on software engineering tasks, higher than its earlier models. Its performance is also touted to be better than that of rival LLMs from OpenAI and Google.

Enhanced features

Model available for all paid users via Claude Code

Claude Opus 4.1 improves "in-depth research and data analysis skills, especially around detail tracking and agentic search," according to Anthropic.

The company has made it available to all paid Claude users through Claude Code, its API, Amazon Bedrock, and Google Cloud's Vertex AI platform.

Benchmark results

New model beats OpenAI's o3 and Google's Gemini on various benchmarks

The new model outperforms its predecessor on complex tasks, including software engineering. It also excels in multilingual capabilities and graduate-level reasoning prompts.

In fact, it beats competitors' reasoning models such as OpenAI o3 and Gemini 2.5 Pro on various benchmarks, including SWE-bench Verified, a benchmark that tests LLMs' ability to solve real-world software engineering tasks sourced from GitHub.