Salesforce’s TACO is a new family of multimodal AI models

Salesforce has unveiled TACO, a new family of multimodal AI models that can process multiple types of data and perform complex reasoning tasks using a step-by-step approach.

Key Innovation: TACO advances multimodal AI by pairing chains-of-thought-and-action (CoTA) reasoning with the ability to work across images, text, and numerical calculations.

  • The system calls external tools such as optical character recognition (OCR), depth estimation, and calculators to handle different types of information
  • TACO can break down complex questions into smaller, manageable steps and execute them sequentially
  • The model demonstrates particular strength in tasks requiring both visual understanding and mathematical reasoning

Technical Implementation: Salesforce developed TACO through an extensive training process designed to enhance its problem-solving capabilities.

  • The model was trained using over 1 million synthetic CoTA traces
  • Training incorporated both model-based and programmatic generation methods
  • TACO performed 30-50% better than traditional direct-answer models
  • The system achieved up to 20% improvement over baseline models on the MMVet benchmark
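
To make "programmatic generation" of CoTA traces more concrete, here is a minimal sketch of how a template could be filled with random values to produce one synthetic trace. The article does not describe Salesforce's actual templates or trace format, so the `Step` and `Trace` classes, the action names `OCR` and `Calculate`, and the `make_price_trace` function are all hypothetical names chosen for illustration.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    thought: str       # reasoning text for this step
    action: str        # hypothetical tool name, e.g. "OCR" or "Calculate"
    argument: str      # input passed to the tool
    observation: str   # tool output recorded in the trace


@dataclass
class Trace:
    question: str
    steps: List[Step] = field(default_factory=list)
    answer: str = ""


def make_price_trace(rng: random.Random) -> Trace:
    """Fill a fixed template with random values to build one synthetic trace.

    This is an assumed template, not the one used to train TACO.
    """
    price = rng.randint(200, 500) / 100      # price per gallon, in dollars
    budget = rng.choice([20, 40, 50])        # dollars available
    gallons = round(budget / price, 2)

    trace = Trace(question=f"How many gallons can I buy with ${budget}?")
    trace.steps.append(
        Step("Read the price on the sign.", "OCR", "image-0", f"${price:.2f}")
    )
    trace.steps.append(
        Step("Divide the budget by the price.", "Calculate",
             f"{budget} / {price:.2f}", str(gallons))
    )
    trace.answer = f"{gallons} gallons"
    return trace


if __name__ == "__main__":
    example = make_price_trace(random.Random(0))
    print(example.question)
    for step in example.steps:
        print(f"  {step.action}({step.argument}) -> {step.observation}")
    print(example.answer)
```

Because the values are sampled and the answer is computed rather than written by a model, every trace produced this way is correct by construction, which is the usual appeal of programmatic generation alongside model-based generation.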

Practical Applications: TACO’s architecture enables it to tackle real-world problems that require multiple steps and different types of reasoning.

  • The model can handle practical questions like calculating gas purchases from photographed price signs
  • Future applications could include medical question answering and web navigation tasks
  • The framework is designed to be adaptable for training new models with different actions across various domains
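
The gas-purchase example can be sketched as a short chain of thought and action: one step reads the price from the photo, the next feeds that value to a calculator. The code below is only an illustration under stated assumptions. The `ocr` and `calculate` functions are stubs standing in for real tools, and the file name `price_sign.jpg`, the $3.50 price, and the $35 budget are made-up values, not figures from Salesforce.

```python
from typing import Callable, Dict, List, Tuple


def ocr(image_id: str) -> str:
    # Stub: a real system would run text recognition on the photo.
    fake_signs = {"price_sign.jpg": "3.50"}   # hypothetical sign content
    return fake_signs[image_id]


def calculate(expression: str) -> str:
    # Stub calculator supporting "a / b" and "a * b" only.
    left, op, right = expression.split()
    a, b = float(left), float(right)
    if op == "/":
        result = a / b
    elif op == "*":
        result = a * b
    else:
        raise ValueError(f"unsupported operator: {op}")
    return f"{result:.2f}"


# Registry of available actions; a new domain would register different tools.
TOOLS: Dict[str, Callable[[str], str]] = {"OCR": ocr, "Calculate": calculate}

# Each step is (thought, action name, function building the argument
# from the observations collected so far).
PlanStep = Tuple[str, str, Callable[[List[str]], str]]


def run_chain(plan: List[PlanStep]) -> List[str]:
    """Execute the steps in order, passing earlier observations forward."""
    observations: List[str] = []
    for thought, action, make_argument in plan:
        argument = make_argument(observations)
        observations.append(TOOLS[action](argument))
    return observations


plan: List[PlanStep] = [
    ("Read the price per gallon from the sign.", "OCR",
     lambda obs: "price_sign.jpg"),
    ("Divide the $35 budget by that price.", "Calculate",
     lambda obs: f"35 / {obs[0]}"),
]

if __name__ == "__main__":
    print(run_chain(plan))   # ['3.50', '10.00']
```

In an actual model the thoughts and action calls would be generated step by step rather than scripted in advance; the fixed `plan` here only shows how each observation feeds the next action.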

Looking Ahead: While TACO represents a significant step forward in multimodal AI capabilities, its true impact will likely depend on how effectively it can be integrated into practical applications and whether it can maintain consistent performance across diverse real-world scenarios.

Salesforce Introduces New Family of Multimodal Action Models Named TACO
