NVIDIA’s NeMo Retriever Boosts Accuracy and Throughput for Enterprise LLMs

NVIDIA has introduced NeMo Retriever NIM microservices, which improve the accuracy and throughput of large language models (LLMs) for enterprises by connecting them to an organization's own proprietary data.

Key components of NeMo Retriever: The microservices consist of embedding and reranking models that work together to efficiently retrieve the most relevant data for generating accurate responses:

  • Embedding models transform diverse data into numerical vectors that capture meaning and nuance, and they are computationally efficient.
  • Reranking models score the retrieved data by its relevance to the query; they improve accuracy but are more computationally expensive.
  • NeMo Retriever combines both model types to ensure the most helpful and accurate results for enterprises.
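The retrieve-then-rerank pattern described above can be sketched in a few lines. This is a toy illustration, not NeMo Retriever's API: the bag-of-words "embedding" and the term-overlap "reranker" are stand-ins for the real learned models, chosen only to show how a fast first-stage retriever narrows the corpus before a slower, more precise second stage reorders the candidates.

```python
import re
import numpy as np

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())

def build_vocab(texts: list[str]) -> dict[str, int]:
    vocab: dict[str, int] = {}
    for t in texts:
        for w in tokenize(t):
            vocab.setdefault(w, len(vocab))
    return vocab

def embed(text: str, vocab: dict[str, int]) -> np.ndarray:
    # Toy bag-of-words embedding (stand-in for a learned embedding model).
    vec = np.zeros(len(vocab))
    for w in tokenize(text):
        if w in vocab:
            vec[vocab[w]] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query: str, docs: list[str], vocab: dict[str, int],
             top_k: int = 2) -> list[str]:
    # Stage 1: cheap vector similarity over the whole corpus.
    q = embed(query, vocab)
    ranked = sorted(docs, key=lambda d: float(q @ embed(d, vocab)), reverse=True)
    return ranked[:top_k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    # Stage 2: costlier scoring of the short candidate list
    # (stand-in for a cross-encoder reranking model).
    q_terms = set(tokenize(query))
    return sorted(candidates,
                  key=lambda d: len(q_terms & set(tokenize(d))),
                  reverse=True)

docs = [
    "NeMo Retriever combines embedding and reranking models.",
    "Reranking models score retrieved passages for relevance.",
    "The weather in Santa Clara is sunny today.",
]
query = "how does reranking improve relevance"
vocab = build_vocab(docs + [query])
candidates = retrieve(query, docs, vocab)   # fast, broad pass
best = rerank(query, candidates)[0]         # precise, narrow pass
```

The division of labor is the point: the embedding stage only has to be good enough to keep the right documents in the candidate set, while the reranker spends its extra compute on just a handful of passages.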

Integration with industry platforms: NeMo Retriever microservices are being integrated into various data platforms to enhance their AI capabilities:

  • Cohesity’s Gaia AI product will use NeMo Retriever to power insightful, generative AI applications through retrieval-augmented generation (RAG).
  • NetApp is collaborating with NVIDIA to connect NeMo Retriever to its intelligent data infrastructure, allowing customers to access proprietary insights securely.
  • DataStax has integrated NeMo Retriever into its Astra DB and Hyper-Converged platforms, enabling faster time-to-market for RAG capabilities.

Versatile use cases: NeMo Retriever powers a wide range of AI applications across industries:

  • Building intelligent chatbots that provide accurate, context-aware responses.
  • Analyzing vast amounts of data to identify security vulnerabilities.
  • Extracting insights from complex supply chain information.
  • Enhancing AI-enabled retail shopping advisors for personalized experiences.

Compatibility and flexibility: NeMo Retriever microservices can be used in conjunction with other NVIDIA NIM microservices and integrated with various models:

  • Riva NIM microservices can be used alongside NeMo Retriever to supercharge speech AI applications.
  • The microservices can be integrated with community models, NVIDIA models, or users’ custom models.
  • They can be deployed in the cloud, on-premises, or in hybrid environments, providing flexibility for developers.
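Because NIM microservices expose an OpenAI-compatible HTTP API, calling a deployed embedding service amounts to posting a small JSON body. The sketch below only builds that request body; the endpoint URL, the model name, and the `input_type` field (which NVIDIA's asymmetric retrieval embedders use to distinguish queries from passages) are assumptions that depend on your specific deployment.

```python
import json

# Assumed values for a local NIM container deployment; adjust for yours.
NIM_ENDPOINT = "http://localhost:8000/v1/embeddings"   # assumption
MODEL = "nvidia/nv-embedqa-e5-v5"                      # example model name, assumption

def build_embedding_request(texts: list[str], model: str = MODEL,
                            input_type: str = "query") -> dict:
    """Build the JSON body for an OpenAI-compatible /v1/embeddings call."""
    return {
        "model": model,
        "input": texts,
        # Marks whether inputs are queries or corpus passages; an NVIDIA
        # extension for retrieval embedders (assumption).
        "input_type": input_type,
    }

body = json.dumps(build_embedding_request(["What is NeMo Retriever?"]))
```

The same request body works whether the service runs in the cloud, on-premises, or in a hybrid setup; only `NIM_ENDPOINT` changes.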

Broader implications: NVIDIA’s NeMo Retriever NIM microservices demonstrate the company’s continued efforts to democratize AI and empower enterprises with cutting-edge tools for building accurate, data-driven applications. By leveraging proprietary data and state-of-the-art models, NeMo Retriever has the potential to accelerate the adoption of generative AI across industries, enabling businesses to unlock valuable insights and create more intelligent, responsive solutions. However, as with any powerful AI technology, it will be crucial for enterprises to consider the ethical implications and ensure responsible deployment to maximize benefits while mitigating potential risks.

