Cohere enhances RAG search with multimodal capabilities: Cohere has upgraded its Embed 3 model to generate multimodal embeddings, enabling image-based retrieval-augmented generation (RAG) in enterprise search.
Key features of the new Embed 3 model:
- Generates embeddings for both images and text
- Uses a unified latent space shared by its image and text encoders, enabling mixed-modality searches
- Supports over 100 languages
- Accessible on Cohere’s platform and Amazon SageMaker
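To illustrate what a unified latent space makes possible, here is a minimal sketch of a mixed-modality search. It does not call Cohere's API: the three-dimensional vectors, file names, and the `search` helper are all hypothetical stand-ins for what an embedding model would return, chosen only to show that image and text entries can sit in one index and be ranked by the same similarity measure.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-made vectors standing in for model output.
# Images and text share one index because they share one vector space.
index = [
    {"id": "q3_revenue_chart.png", "modality": "image", "vector": [0.9, 0.1, 0.0]},
    {"id": "q3_report_summary.txt", "modality": "text", "vector": [0.8, 0.2, 0.1]},
    {"id": "sofa_product_photo.jpg", "modality": "image", "vector": [0.0, 0.1, 0.9]},
]

def search(query_vector, index, top_k=2):
    """Rank every entry, regardless of modality, by similarity to the query."""
    scored = [(cosine(query_vector, item["vector"]), item) for item in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:top_k]]

# Pretend embedding of the text query "third-quarter revenue" (assumed values).
query_vector = [1.0, 0.1, 0.0]
results = search(query_vector, index)

for item in results:
    print(item["id"], item["modality"])
```

With these toy values, a single text query surfaces both a chart image and a text summary from the same index. In a real deployment the vectors would come from the embedding model and the index would typically live in a vector database.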
Expanding enterprise data accessibility:
- Enables businesses to search complex reports, product catalogs, and design files
- Increases the volume of data accessible through RAG search
- Allows incorporation of charts, graphs, product images, and design templates
Performance improvements and advantages:
- Cohere claims Embed 3 is the “most generally capable multimodal embedding model on the market”
- Prioritizes the meaning behind the data without biasing results toward a specific modality
- Eliminates the need for separate databases for images and text
- Delivers better mixed-modality search results than other models, according to Cohere
Industry context and competitive landscape:
- Multimodal search is becoming increasingly familiar to consumers through platforms like Google and ChatGPT
- Other companies, including Google and OpenAI, offer multimodal embedding options
- Open-source models also facilitate embeddings for various modalities
Cohere’s market positioning:
- Founded by researchers behind the Transformer model
- Has faced challenges in gaining top-of-mind status in the enterprise space
- Recently updated APIs to allow easy switching from competitor models
- Aims to match an industry norm in which customers often switch between models
Implications for enterprise users:
- Enables more comprehensive search capabilities across various file formats
- Potentially boosts workforce productivity through improved data accessibility
- Aligns enterprise search capabilities with consumer expectations for multimodal search
Looking ahead: The race for superior multimodal embeddings: As enterprises increasingly recognize the benefits of multimodal search, competition among model developers is likely to intensify, focusing on speed, accuracy, and security to meet demanding enterprise requirements.
Cohere adds vision to its RAG search capabilities