Anthropic launches new Claude AI models with advanced coding and reasoning capabilities that can operate autonomously for extended periods. These models represent a significant step toward creating virtual collaborators that maintain full context awareness while tackling complex software development projects. The update brings Claude Opus 4 and Sonnet 4 to market without price increases, while introducing enhanced coding abilities and improved performance on industry benchmarks.
The big picture: Anthropic’s newest Claude models focus specifically on software development capabilities, claiming to set “new standards for coding, advanced reasoning, and AI agents” with improved precision and problem-solving abilities.
- Opus 4 is positioned as “the world’s best coding model” with the ability to operate independently without human oversight for extended periods.
- When tested by shopping app Rakuten, Opus 4 demonstrated impressive autonomous operation, running independently for seven hours.
Key capabilities: The new models combine superior coding abilities with enhanced reasoning and web search integration.
- Both models can “deliver superior coding” with more precise responses to user instructions and deeper reasoning through complex problems.
- Claude Code, now widely available with this release, integrates AI assistance directly into developers’ existing tools with in-line edit suggestions.
Industry context: Major tech companies are rapidly integrating AI coding assistants into their development workflows.
- Microsoft reports that 30% of its code is already AI-generated, while Meta aims to reach 50% AI-written code by 2026.
- Anthropic maintains competitive pricing for developers accessing via API: $15/$75 per million tokens for Opus 4 and $3/$15 for Sonnet 4, positioning between OpenAI’s o3 model at $10/$40.
Behind the numbers: Anthropic backs its claims with benchmark performance that outpaces competitors.
- Claude 4 leads on key software engineering metrics, scoring 72.5% on SWE-bench and 43.2% on Terminal-bench, surpassing both OpenAI models and Google Gemini 2.5 Pro.
- To demonstrate improvements in accessible terms, Anthropic created a Twitch stream showing its models playing Pokémon Red autonomously, with Claude 4 models performing best thanks to new “memory files” capabilities.
Anthropic's Claude 4 Models Can Write Complex Code for You