Salesforce Introduces New Family of Multimodal Action Models Named TACO

TACO leverages chains-of-thought-and-action to enhance AI's ability to handle multimodal reasoning and real-world challenges.

Highlights

  • The model utilises OCR, depth estimation, and calculators to handle diverse data types.
  • Salesforce trained TACO with over 1 million synthetic CoTA traces to optimise its capabilities.
  • Potential applications include web navigation and medical question answering.

Salesforce AI Research has introduced TACO, a family of multimodal large action models designed to improve performance on complex problems that require multi-step reasoning across various data types, such as images, text, and calculations. "We present TACO, a family of multi-modal large action models designed to improve performance on complex questions that require multiple capabilities and demand multi-step solutions," Salesforce said in a blog post on January 16, 2025.

Overcoming Limitations of Current AI Systems

According to the company, TACO tackles a significant limitation of current open-source multimodal models, which struggle to solve realistic, complex problems step by step. For instance, when asked "How much gas can I buy with $50?" about a photo of a gas station sign, TACO can locate the price information, extract the text using OCR, and perform the necessary calculation. This capability is powered by chains-of-thought-and-action (CoTA), in which the model generates both reasoning steps and actions to arrive at the correct answer.

"To answer such questions, TACO produces chains-of-thought-and-action (CoTA), executes intermediate steps by invoking external tools such as OCR, depth estimation and calculator, then integrates both the thoughts and action outputs to produce coherent responses," the company explained.
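
The loop described in that quote can be illustrated with a minimal Python sketch. This is not Salesforce's code or API: the names `Step`, `fake_ocr`, `calculator`, `scripted_policy` and `run_cota` are hypothetical, the OCR tool is a stub, the price on the sign is made up, and a hand-written policy stands in for the model that would normally generate each thought and action.

```python
import operator
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    thought: str   # the reasoning text for this step
    action: str    # name of the tool to call, or "terminate"
    argument: str  # input to the tool, or the final answer


def fake_ocr(image_name: str) -> str:
    # Stub standing in for a real OCR tool; the price is invented for the demo.
    signs = {"gas_station.jpg": "Regular 3.50 per gallon"}
    return signs[image_name]


def calculator(expression: str) -> str:
    # Evaluates "<number> <op> <number>" without using eval().
    left, op, right = expression.split()
    ops = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.truediv}
    return str(ops[op](float(left), float(right)))


TOOLS: Dict[str, Callable[[str], str]] = {"ocr": fake_ocr, "calculate": calculator}


def scripted_policy(observations: List[str]) -> Step:
    # Stand-in for the model: picks the next thought and action
    # based on the tool outputs seen so far.
    if len(observations) == 0:
        return Step("I need the price on the sign.", "ocr", "gas_station.jpg")
    if len(observations) == 1:
        price = re.search(r"\d+\.\d+", observations[0]).group()
        return Step("Divide the budget by the price.", "calculate", f"50 / {price}")
    gallons = float(observations[1])
    return Step("I have the quantity.", "terminate", f"About {gallons:.1f} gallons")


def run_cota(policy: Callable[[List[str]], Step],
             tools: Dict[str, Callable[[str], str]],
             max_steps: int = 5) -> str:
    # Alternate between a thought/action step and the tool's output
    # until the policy decides to terminate with an answer.
    observations: List[str] = []
    for _ in range(max_steps):
        step = policy(observations)
        if step.action == "terminate":
            return step.argument
        observations.append(tools[step.action](step.argument))
    raise RuntimeError("no answer within max_steps")


print(run_cota(scripted_policy, TOOLS))
```

In this sketch each tool output is fed back into the next decision, which is the part a direct-answer model skips.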

Training TACO

To train TACO, Salesforce said it created over 1 million synthetic CoTA traces through model-based and programmatic generation methods. These traces help the model learn to perform complex reasoning and execute external actions such as text recognition and mathematical operations.

Salesforce claims that TACO achieved 30 to 50 percent higher performance than models that give traditional direct answers. It also outperformed baseline models by up to 20 percent on the MMVet benchmark.

Future Applications

With this framework, Salesforce AI hopes to pave the way for new multimodal models that can be applied across various domains, such as medical question answering and web navigation.

"With our framework, future works can train new models with different actions for other applications such as web navigation or for other domains such as medical question answering," Salesforce said.

Reported By

Kirpa B is passionate about the latest advancements in Artificial Intelligence technologies and has a keen interest in telecom. In her free time, she enjoys gardening or diving into insightful articles on AI.
