More setbacks for NVIDIA as Blackwell chips overheat in servers

Increasing adoption of artificial intelligence is creating surging demand for high-performance computing chips, leading to technical challenges as manufacturers push the boundaries of what’s possible.

Critical Development: Nvidia’s next-generation Blackwell GPUs are experiencing overheating issues in server configurations, potentially causing further delays to their planned release.

  • The server racks, designed to connect up to 72 GPUs simultaneously, are creating thermal management challenges that require ongoing redesign efforts
  • This setback could impact the scheduled openings of new data centers for major tech companies including Google, Microsoft, and Meta
  • A previous design flaw had already pushed back the launch from its initial Q2 2024 target

Technical Context: GPU performance and heat generation are intrinsically linked, creating unique challenges for high-density computing environments.

  • GPUs consume substantial energy during operation, with more powerful chips typically generating more heat
  • The cryptocurrency mining industry has faced similar challenges, sometimes employing immersion cooling techniques where hardware is submerged in liquid
  • Nvidia claims the Blackwell chips will be up to 30 times faster than the previous generation, which suggests significantly higher power requirements

Industry Impact: The delays could have cascading effects across the AI industry and its infrastructure.

  • Tech giants are already struggling to secure adequate power supplies for their AI data centers
  • Companies like Meta, Microsoft, and Google have begun exploring nuclear power options to meet growing energy demands
  • Nvidia’s stock has surged over 180% in the past year despite these challenges, while competitor AMD has recently initiated layoffs

Nvidia’s Response: The company maintains that the ongoing engineering changes are part of normal development processes.

  • A company spokesperson told Reuters that Nvidia is working closely with cloud service providers as part of its engineering process
  • The statement suggests Nvidia is actively working on new server designs to address the thermal management issues
  • The company has not provided updated timeline estimates for the Blackwell GPU release

Broader Energy Implications: The situation highlights growing concerns about AI’s expanding energy footprint and infrastructure requirements.

  • Experts predict possible power shortages for AI data centers as soon as next year
  • The rate of data center construction is outpacing the addition of new power sources to the grid
  • Traditional power purchase agreements may not adequately address the fundamental energy challenges facing the AI industry
Nvidia's Delayed Blackwell AI Chips Overheating in Servers
