Anthropic's new Sonnet 5 makes AI agents cheaper to run

By Mudit Dube

Jul 01, 2026

09:27 am

What's the story

Anthropic has launched Claude Sonnet 5, its latest artificial intelligence (AI) model. The new release is a more powerful, more agentic version of the lab's midsize model. "It can make plans, use tools like browsers and terminals, and run autonomously at a level that, just a few months ago, required larger and more expensive models," said Anthropic in a blog post.

Model performance

Pricing and availability

Claude Sonnet 5 promises performance close to that of Opus 4.8 at a much lower cost. Starting Tuesday, it will be the default model on the free and Pro plans. Introductory pricing is $2 per million input tokens and $10 per million output tokens, in effect until August 31. That makes Sonnet 5 cheaper than Opus 4.8, OpenAI's GPT-5.5, and Google's Gemini 3.1 Pro.
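To put the per-token prices in concrete terms, here is a minimal sketch of the arithmetic. The two rates are the introductory prices quoted above; the function name and the sample workload are hypothetical and used only for illustration, and real bills may differ depending on how tokens are counted and on any discounts.

```python
# Introductory prices quoted in the article (USD per million tokens).
INPUT_PRICE_PER_MILLION = 2.00
OUTPUT_PRICE_PER_MILLION = 10.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at the quoted rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical agent run: 3 million tokens read, 500,000 tokens written.
print(f"${estimate_cost(3_000_000, 500_000):.2f}")  # prints $11.00
```

Because output tokens cost five times as much as input tokens at these rates, an agent that writes a lot will see its bill driven mainly by output.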

Improved performance

Performance improvements over Sonnet 4.6

The new model improves on its predecessor, Sonnet 4.6, which was released in February. It is proficient at reasoning, tool use, coding, and knowledge work. In one benchmark for agentic coding tasks, Sonnet 5 scored 63.2%, compared with 69.2% for Opus 4.8 and just 58.1% for Sonnet 4.6.

Task completion

Sonnet 5 excels at automation tasks

Testers have noted that Sonnet 5 completes complex tasks on which earlier versions would have faltered. Daniel Shepard, a senior engineer at Zapier, said, "We handed Claude Sonnet 5 a two-part job, update Salesforce account tiers, send a launch announcement to enterprise contacts, and it finished end to end." The example suggests the new model is better suited to day-to-day automation.

Enhanced security

Improved safety features

Sonnet 5 also comes with improved safety features, showing a lower rate of "undesirable behaviors" such as cooperation with misuse and deception. It is better at refusing malicious requests and at resisting hijack attempts in prompt-injection attacks. However, it still doesn't match the capabilities of Opus 4.8 and Claude Mythos Preview when it comes to misaligned behavior or performing dangerous cybersecurity tasks.