Nvidia has unveiled the Rubin CPX GPU, a compute-focused graphics processor with 128GB of GDDR7 memory, designed specifically for enterprise AI inference workloads. The announcement positions Nvidia to address the growing demand for long-context AI applications in software development, research, and high-definition video generation, with shipments planned for late 2026.
What you should know: The Rubin CPX represents Nvidia’s first GPU to reach 128GB memory capacity, delivering up to 30 petaFLOPs of NVFP4 compute performance.
• The GPU integrates hardware attention acceleration that Nvidia claims is three times faster than the GB300 NVL72.
• Four NVENC and four NVDEC units are built in to accelerate video workflows.
• This is explicitly not a gaming GPU—it’s engineered purely for compute-intensive inference tasks.
The big picture: Nvidia is implementing a disaggregated inference strategy where different processors handle specific AI workload phases.
• Rubin CPX focuses on the compute-heavy context phase of AI processing.
• Other Rubin GPUs and Vera CPUs handle generation tasks.
• Nvidia’s Dynamo software manages low-latency cache transfers and routing across components behind the scenes.
In plain English: Think of this like a restaurant kitchen where different chefs specialize in different courses. Instead of one chef making the entire meal, Rubin CPX handles the heavy preparation work (understanding context), while other processors focus on the final presentation (generating responses). This specialization makes the entire process faster and more efficient.
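The two-phase split described above can be sketched in a few lines of Python. This is an illustrative toy, not Nvidia's actual software stack: the worker classes, method names, and the echo-style "generation" are all hypothetical, standing in for the context (prefill) phase Rubin CPX would handle and the generation (decode) phase other Rubin GPUs would handle.

```python
# Illustrative sketch of disaggregated inference (hypothetical names, not
# Nvidia's API): a context worker processes the full prompt once, then hands
# its key/value cache to a generation worker that produces output tokens.

from dataclasses import dataclass

@dataclass
class KVCache:
    """Stand-in for the attention key/value cache built during prefill."""
    tokens: list

class ContextWorker:
    """Compute-heavy phase: ingest the entire prompt (Rubin CPX's role)."""
    def prefill(self, prompt: str) -> KVCache:
        return KVCache(tokens=prompt.split())

class GenerationWorker:
    """Latency-sensitive phase: emit tokens one at a time (standard Rubin's role)."""
    def decode(self, cache: KVCache, max_tokens: int) -> str:
        # Toy "generation": echo the last words of the cached context.
        return " ".join(cache.tokens[-max_tokens:])

def route_request(prompt: str) -> str:
    cache = ContextWorker().prefill(prompt)      # phase 1: context processing
    return GenerationWorker().decode(cache, 3)   # phase 2: token generation

print(route_request("the quick brown fox jumps over the lazy dog"))
# → the lazy dog
```

In a real deployment the cache handoff between the two workers is the hard part; per the article, that is what Nvidia's Dynamo software manages.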
Massive deployment scale: The flagship Vera Rubin NVL144 CPX rack represents Nvidia’s largest deployment configuration.
• Each rack integrates 144 Rubin CPX GPUs, 144 standard Rubin GPUs, and 36 Vera CPUs.
• Combined performance delivers 8 exaFLOPs of NVFP4 compute, 100TB of high-speed memory, and 1.7PB/s of memory bandwidth.
• Connectivity comes through Quantum-X800 InfiniBand or Spectrum-X Ethernet with ConnectX-9 SuperNICs.
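A quick back-of-the-envelope check shows how the CPX GPUs alone contribute to the rack totals above. Only the per-GPU CPX figures (30 petaFLOPs NVFP4, 128GB GDDR7) come from the article; Nvidia has not disclosed how the remaining compute and memory split across the standard Rubin GPUs, so this sketch computes only the CPX share.

```python
# CPX-only contribution to the Vera Rubin NVL144 CPX rack (figures from the
# article); the balance comes from the 144 standard Rubin GPUs and 36 Vera CPUs.

CPX_GPUS_PER_RACK = 144
PFLOPS_PER_CPX = 30       # NVFP4 compute per Rubin CPX GPU
GDDR7_GB_PER_CPX = 128    # memory per Rubin CPX GPU

cpx_exaflops = CPX_GPUS_PER_RACK * PFLOPS_PER_CPX / 1000     # PFLOPs -> EF
cpx_memory_tb = CPX_GPUS_PER_RACK * GDDR7_GB_PER_CPX / 1000  # GB -> decimal TB

print(f"CPX compute: {cpx_exaflops:.2f} EF of the 8 EF rack total")
print(f"CPX memory:  {cpx_memory_tb:.1f} TB of the 100 TB rack total")
# → CPX compute: 4.32 EF of the 8 EF rack total
# → CPX memory:  18.4 TB of the 100 TB rack total
```

So the CPX side supplies roughly half the rack's NVFP4 compute but under a fifth of its memory, consistent with its role as a compute-dense context processor.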
Timeline and roadmap: Rubin CPX and NVL144 CPX racks are scheduled to ship in late 2026 following recent tape-out at TSMC.
• Rubin Ultra is expected in 2027 with higher density modules.
• Feynman architecture is slated for 2028, featuring HBM4E memory and faster networking.
• The roadmap extends the Rubin architecture with progressive performance improvements.
Why this matters: By concentrating specialized hardware on context processing tasks, Nvidia aims to improve AI inference throughput while reducing deployment costs for high-value enterprise applications that require processing large amounts of contextual information.