AI inference gets a boost on Google Cloud Run: Google Cloud has enhanced its serverless platform, Cloud Run, by adding support for NVIDIA L4 GPUs, significantly improving its capabilities for handling complex AI workloads.
Key features of the upgrade:
- The integration of NVIDIA L4 GPUs extends Cloud Run’s capabilities to support real-time AI inference
- This enhancement is particularly beneficial for deploying lightweight generative AI models and small language models
- The NVIDIA L4 GPU offers up to 120 times the video performance of CPUs and up to 2.7 times the generative AI performance of the previous GPU generation
Deployment process and flexibility:
- Developers can create container images with necessary dependencies, including NVIDIA GPU drivers and AI models
- NVIDIA GPUs are also available through other Google Cloud services, including Google Kubernetes Engine and Google Compute Engine
- This flexibility allows developers to choose their preferred level of abstraction for building and deploying AI-enabled applications
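As a rough illustration of what might go inside such a container image, here is a minimal Python sketch of an HTTP inference service. It is not taken from the announcement: `run_inference`, `resolve_port`, and `InferenceHandler` are hypothetical names, and the "model" is a stand-in that only upper-cases its input. The one platform detail it relies on is that Cloud Run tells the container which port to listen on through the `PORT` environment variable.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_inference(prompt: str) -> str:
    """Hypothetical placeholder; a real service would call a GPU-backed model here."""
    return prompt.upper()


def resolve_port(env=os.environ) -> int:
    # Cloud Run passes the listening port in PORT; fall back to 8080 for local runs.
    return int(env.get("PORT", "8080"))


class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"prompt": "hello"}.
        length = int(self.headers.get("Content-Length", "0"))
        payload = json.loads(self.rfile.read(length) or b"{}")
        result = {"output": run_inference(payload.get("prompt", ""))}
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main() -> None:
    # Bind to all interfaces so the platform can route requests to the container.
    HTTPServer(("0.0.0.0", resolve_port()), InferenceHandler).serve_forever()
```

The container's entrypoint would call `main()`. In a real deployment, `run_inference` would load and query the model packaged in the image; that part is deliberately left out here.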
Expanded capabilities beyond AI inference:
- The enhanced Cloud Run with NVIDIA L4 GPUs also enables other compute-intensive tasks
- These tasks include on-demand image recognition, video transcoding, streaming, and 3D rendering
NVIDIA-Google Cloud partnership:
- The partnership aims to provide advanced AI capabilities across various layers of the AI stack
- It includes the provision of Google Cloud A3 VMs powered by NVIDIA H100 GPUs
- NVIDIA DGX Cloud, a software and AI supercomputing solution, is available to customers directly through web browsers
- NVIDIA AI Enterprise is available on Google Cloud Marketplace, providing a secure, cloud-native platform for enterprise-ready AI applications
Real-world applications:
- L’Oréal is using this technology to power its real-time AI inference applications
- Writer, an AI writing platform, has seen substantial improvements in model inference performance while reducing hosting costs by 15%
Impact on cloud-based AI development:
- The addition of NVIDIA L4 GPU support to Google Cloud Run represents a major milestone in serverless AI inference
- By combining Cloud Run’s ease of use and scalability with NVIDIA GPUs’ powerful performance, Google Cloud is providing developers and businesses with essential tools for building, deploying, and scaling AI applications
Looking ahead: As AI continues to evolve and become more integral to business operations, the integration of powerful GPU capabilities into serverless platforms like Cloud Run is likely to accelerate the development and deployment of AI-powered applications across various industries.