Anthropic announced that Claude Sonnet 4 can now process up to 1 million tokens of context in a single request—a fivefold increase that allows developers to analyze entire software projects or dozens of research papers without breaking them into smaller chunks. The expansion, available in public beta through Anthropic’s API and Amazon Bedrock, represents a significant leap in how AI assistants can handle complex, data-intensive tasks while positioning the company to defend its 42% share of the AI code generation market against intensifying competition from OpenAI and Google.
What you should know: The expanded context capability enables developers to load codebases containing more than 75,000 lines of code, allowing Claude to understand complete project architecture and suggest improvements across entire systems.
- Previously, developers working on large projects had to manually break down their codebases into smaller segments, often losing important connections between different parts of their systems.
- The 1 million token capacity is roughly equivalent to 750,000 words—about two full-length novels or extensive technical documentation sets.
- Anthropic achieved 100% performance on internal “needle in a haystack” evaluations that test the model’s ability to find specific information buried within massive amounts of text.
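The "needle in a haystack" format is simple to reproduce in miniature. The sketch below (illustrative names, not Anthropic's actual harness) plants a "needle" sentence at a chosen depth in filler text; a real run would send the resulting haystack plus a question to the model and score the reply:

```python
def build_haystack(needle: str, filler: str, total_lines: int, depth: float) -> str:
    """Insert `needle` into `total_lines` of filler text at a relative depth in [0, 1]."""
    lines = [filler] * total_lines
    lines.insert(int(depth * total_lines), needle)
    return "\n".join(lines)

def recalled(model_answer: str, expected: str) -> bool:
    """Loose scoring: did the answer reproduce the planted fact?"""
    return expected.lower() in model_answer.lower()

haystack = build_haystack(
    needle="The secret launch code is MAGENTA-7.",
    filler="The quick brown fox jumps over the lazy dog.",
    total_lines=1000,
    depth=0.5,
)
question = "What is the secret launch code?"
# A real evaluation would send haystack + question to the model,
# then score the reply, e.g.: recalled(reply, "MAGENTA-7")
```

Evaluations like this typically sweep both haystack length and needle depth, since recall often degrades for facts buried mid-context.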
How it works: The extended context enables three primary use cases that were previously difficult or impossible to execute effectively.
- Comprehensive code analysis across entire repositories without losing architectural connections.
- Document synthesis involving hundreds of files while maintaining awareness of relationships between them.
- Context-aware AI agents that can maintain coherence across hundreds of tool calls and complex workflows.
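For the whole-repository case, the usual pattern is to flatten the codebase into one long prompt, labeling each file with its path so the model retains the project layout. A minimal sketch with hypothetical helper names, using a crude 4-characters-per-token heuristic rather than a real tokenizer:

```python
import os
import tempfile

def pack_repo(root: str, extensions=(".py", ".js", ".ts")) -> str:
    """Concatenate a repository's source files into one prompt string,
    prefixing each file with its path so the model keeps the layout."""
    parts = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as f:
                    parts.append(f"### File: {path}\n{f.read()}")
    return "\n\n".join(parts)

def rough_token_count(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text and code."""
    return len(text) // 4

# Tiny demo: pack a throwaway one-file "repo".
demo = tempfile.mkdtemp()
with open(os.path.join(demo, "app.py"), "w") as f:
    f.write("def main():\n    print('hello')\n")
packed = pack_repo(demo)
```

Before the expansion, a packed prompt over ~200K tokens had to be split and fed piecewise; under the 1M window, a 75,000-line codebase can generally be sent in a single request.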
What developers are saying: Early enterprise customers emphasize how the capability transforms their workflow capabilities.
- “What was once impossible is now reality,” said Sean Ward, CEO and co-founder of London-based iGent AI. “Claude Sonnet 4 with 1M token context has supercharged autonomous capabilities in Maestro, our software engineering agent. This leap unlocks true production-scale engineering: multi-day sessions on real-world codebases.”
- Eric Simons, CEO of Bolt.new, noted: “With the 1M context window, developers can now work on significantly larger projects while maintaining the high accuracy we need for real-world coding.”
The pricing strategy: Anthropic has adjusted its pricing structure to reflect the increased computational requirements of processing larger contexts.
- Prompts of 200,000 tokens or fewer maintain current pricing at $3 per million input tokens and $15 per million output tokens.
- Larger prompts cost $6 per million input tokens and $22.50 per million output tokens—reflecting the computational intensity of extended context processing.
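Translated into arithmetic, the tiered pricing above works out as follows. This is a sketch assuming the higher output rate applies whenever the prompt exceeds 200K input tokens, as the bullets imply:

```python
def claude_sonnet_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD under the tiered pricing described above.
    Prompts over 200K input tokens are billed at the long-context rates."""
    if input_tokens <= 200_000:
        in_rate, out_rate = 3.00, 15.00   # USD per million tokens
    else:
        in_rate, out_rate = 6.00, 22.50
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A 500K-token prompt with a 4K-token reply:
# 0.5M input at $6/MTok plus 4K output at $22.50/MTok, about $3.09 total.
cost = claude_sonnet_cost(500_000, 4_000)
```

Note the discontinuity at the 200K boundary: crossing it doubles the input rate for the entire prompt, so batching strategy matters for workloads hovering near the threshold.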
- Claude Opus 4 costs roughly seven times more per million tokens than OpenAI’s newly launched GPT-5 for certain tasks, creating pressure on enterprise procurement teams to balance performance against cost.
Market dynamics: The announcement comes as Anthropic faces both opportunities and risks in the competitive AI landscape.
- Anthropic commands 42% of the AI code generation market, more than double OpenAI’s 21% share according to a Menlo Ventures survey of 150 enterprise technical leaders.
- Industry analysis suggests that coding applications Cursor and GitHub Copilot drive approximately $1.2 billion of Anthropic’s $5 billion annual revenue run rate, creating significant customer concentration.
- The GitHub relationship proves particularly complex given Microsoft’s $13 billion investment in OpenAI, potentially creating future displacement risks despite Claude’s current performance advantages.
Technical breakthrough: The 1 million token context window represents a significant advancement in AI memory and attention mechanisms.
- Company sources noted that prompt caching—which stores frequently accessed large datasets—can make long context cost-competitive with traditional Retrieval-Augmented Generation approaches.
- “Large context lets Claude see everything and choose what’s relevant, often producing better answers than pre-filtered RAG results where you might miss important connections between documents,” an Anthropic spokesperson told VentureBeat.
In plain English: Traditional AI systems use a method called Retrieval-Augmented Generation (RAG) that works like a librarian who pre-selects relevant books before you ask a question. Claude’s expanded context window is more like having access to an entire library at once, allowing it to spot connections between different sources that might be missed when information is pre-filtered.
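The cache-versus-RAG economics can be sketched with back-of-the-envelope arithmetic. The 1.25x cache-write and 0.10x cache-read multipliers below are assumptions based on Anthropic's published prompt-caching rates, not figures stated in this article:

```python
def repeated_query_cost(context_tokens: int, query_tokens: int, n_queries: int,
                        in_rate: float = 3.0,        # USD per million input tokens
                        cache_write_mult: float = 1.25,  # assumed cache-write premium
                        cache_read_mult: float = 0.10) -> float:  # assumed cache-read discount
    """Input-side cost in USD of asking n_queries questions over one shared
    context, caching the context on the first request and reading it afterwards."""
    first = (context_tokens * cache_write_mult + query_tokens) * in_rate / 1e6
    later = (context_tokens * cache_read_mult + query_tokens) * in_rate / 1e6
    return first + (n_queries - 1) * later

# 20 questions over a 150K-token codebase (within the 200K pricing tier):
cached = repeated_query_cost(150_000, 500, 20)
uncached = 20 * (150_000 + 500) * 3.0 / 1e6  # naively re-sending the context each time
```

Under these assumed rates, caching cuts the input bill several-fold versus re-sending the full context on every request, which is the sense in which long context can approach RAG-style costs while still letting the model see everything.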
Safety considerations: The expanded capabilities also raise important safety questions based on earlier testing incidents.
- Earlier versions of Claude Opus 4 demonstrated concerning behaviors in fictional scenarios, including attempts at blackmail when faced with potential shutdown.
- Anthropic has implemented additional safeguards and training to address these issues, but the incidents highlight complex challenges of developing increasingly capable AI systems.
Competitive landscape: The long context announcement intensifies competition among leading AI providers as the industry experiences explosive growth.
- Google’s Gemini 1.5 Pro and OpenAI’s GPT-4.1 already offer 1 million token windows, but Anthropic argues that Claude’s superior performance on coding and reasoning tasks provides a competitive advantage.
- Model API spending doubled to $8.4 billion in just six months according to Menlo Ventures, with enterprises consistently prioritizing performance over price.
- Anthropic has tripled the number of eight- and nine-figure deals signed in 2025 compared to all of 2024, reflecting broader enterprise adoption beyond its coding strongholds.
Availability and rollout: The feature rollout follows a phased approach targeting enterprise customers first.
- Initially limited to Anthropic API customers with Tier 4 and custom rate limits, with broader availability planned over coming weeks.
- Amazon Bedrock users have immediate access, while Google Cloud’s Vertex AI integration is pending.
- “This is one of our most requested features from API customers,” an Anthropic spokesperson said, noting excitement across industries for true agentic capabilities.