Cerebras Systems is dramatically expanding its AI inference capacity, positioning itself to challenge Nvidia's dominance in AI infrastructure. By adding six new data centers across North America and Europe and securing partnerships with major tech platforms, Cerebras is betting on growing demand for high-speed AI inference as enterprises seek faster alternatives to traditional GPU-based services. The expansion marks a significant development in the evolving AI hardware landscape and could reshape how businesses access and deploy AI capabilities.
The big picture: Cerebras Systems announced a twentyfold increase in its AI inference capacity, adding six new data centers across North America and Europe to deliver more than 40 million tokens per second.
Strategic partnerships: Cerebras has secured integrations with two significant platforms that will expand its market reach.
Technical advantages: The company is positioning its Wafer-Scale Engine (WSE-3) processor as significantly faster than GPU-based alternatives for specific AI workloads.
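Speed claims like this are typically judged by measuring end-to-end token throughput from the client side. Below is a minimal sketch of how such a measurement might look, assuming the provider exposes an OpenAI-compatible streaming chat endpoint; the base URL, model name, and CEREBRAS_API_KEY environment variable are illustrative placeholders, not details from the article.

```python
import os
import time

from openai import OpenAI  # pip install openai

# Assumption: an OpenAI-compatible endpoint; the URL and model name
# below are hypothetical placeholders, not from the article.
client = OpenAI(
    base_url="https://api.cerebras.ai/v1",
    api_key=os.environ["CEREBRAS_API_KEY"],
)

start = time.perf_counter()
token_count = 0

# Stream the response so tokens can be counted as they arrive.
stream = client.chat.completions.create(
    model="llama-3.3-70b",  # placeholder model name
    messages=[
        {"role": "user",
         "content": "Summarize wafer-scale computing in one paragraph."}
    ],
    stream=True,
)

for chunk in stream:
    # Each streamed chunk carries at most one content delta; counting
    # content-bearing chunks gives a rough proxy for token count.
    if chunk.choices and chunk.choices[0].delta.content:
        token_count += 1

elapsed = time.perf_counter() - start
print(f"~{token_count / elapsed:.0f} tokens/sec over {elapsed:.2f}s")
```

Counting streamed chunks only approximates true token throughput, but it is a common quick benchmark when comparing inference providers on the kind of speed advantage described above.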
Behind the numbers: Cerebras is pursuing a dual strategy of superior speed and cost-effectiveness.
Why this matters: With 85% of its inference capacity located in the United States, Cerebras is advancing domestic AI infrastructure at a time when processing capabilities are becoming a critical resource for businesses adopting AI technologies.
What they’re saying: “This year, our goal is to truly satisfy all the demand and all the new demand we expect will come online as a result of new models like Llama 4 and new DeepSeek models,” said James Wang of Cerebras.