Cohere just gave the power of vision to its RAG search offering

Cohere enhances RAG search with multimodal capabilities: Cohere has upgraded its Embed 3 model to produce multimodal embeddings, allowing image-based retrieval-augmented generation (RAG) in enterprise search.

Key features of the new Embed 3 model:

  • Generates embeddings for both images and text
  • Uses a unified latent space shared by its text and image encoders, enabling mixed-modality searches
  • Supports over 100 languages
  • Accessible on Cohere’s platform and Amazon SageMaker
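
To illustrate why a unified latent space matters for search, here is a minimal sketch in Python. It does not call Cohere's API: the three-dimensional vectors, file names, and the `search` helper are all hypothetical stand-ins for real embeddings and a real vector store. The point is only that when text and image vectors live in the same space, one similarity ranking can cover both.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical index: toy vectors invented for illustration, not model output.
# Text and image entries sit side by side in one list because they are
# assumed to share one embedding space.
INDEX = [
    {"id": "q3_report.txt", "modality": "text", "vector": [0.9, 0.1, 0.0]},
    {"id": "revenue_chart.png", "modality": "image", "vector": [0.8, 0.2, 0.1]},
    {"id": "sofa_photo.jpg", "modality": "image", "vector": [0.0, 0.1, 0.9]},
    {"id": "sofa_description.txt", "modality": "text", "vector": [0.1, 0.0, 0.8]},
]

def search(query_vector, index, top_k=2):
    """Rank every item, regardless of modality, by similarity to the query."""
    scored = [(cosine(query_vector, item["vector"]), item) for item in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:top_k]]

# A made-up query vector that points toward the "financial" items.
results = search([1.0, 0.1, 0.0], INDEX)
for item in results:
    print(item["id"], item["modality"])
```

With these toy vectors, the two best matches are one text file and one image, returned by a single query against a single index. A production system would replace the hand-written vectors with embeddings from the model and the list with a vector database.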

Expanding enterprise data accessibility:

  • Enables businesses to search complex reports, product catalogs, and design files
  • Increases the volume of data accessible through RAG search
  • Allows incorporation of charts, graphs, product images, and design templates

Performance improvements and advantages:

  • Cohere claims Embed 3 is the “most generally capable multimodal embedding model on the market”
  • Prioritizes the meaning behind data without biasing results toward a specific modality
  • Eliminates the need for separate databases for images and text
  • Offers better mixed-modality search results than other models, according to Cohere

Industry context and competitive landscape:

  • Multimodal search is becoming increasingly familiar to consumers through platforms like Google and ChatGPT
  • Other companies, including Google and OpenAI, offer multimodal embedding options
  • Open-source models also facilitate embeddings for various modalities

Cohere’s market positioning:

  • Founded by researchers behind the Transformer model
  • Has faced challenges in gaining top-of-mind status in the enterprise space
  • Recently updated its APIs to make switching from competitor models easier
  • Aims to align with industry norms, as customers often toggle between models

Implications for enterprise users:

  • Enables more comprehensive search capabilities across various file formats
  • Potentially boosts workforce productivity through improved data accessibility
  • Aligns enterprise search capabilities with consumer expectations for multimodal search

Looking ahead: The race for superior multimodal embeddings: As enterprises increasingly recognize the benefits of multimodal search, competition among model developers is likely to intensify, focusing on speed, accuracy, and security to meet demanding enterprise requirements.
