Nvidia Unveils New AI Model Fugatto That Generates Audio from Text and Audio

Nvidia said Fugatto is a foundational generative transformer model that builds on prior work in areas such as speech modeling, audio vocoding and audio understanding.

Highlights

  • It allows users to manipulate sound output with just text input.
  • Music producers, ad agencies, language tools, and game developers can all benefit from its capabilities.
  • Fugatto was trained on Nvidia's H100 GPUs and built by a global team of researchers.

Nvidia has unveiled a new generative AI model that can create any combination of music, voices and sounds using text and audio as inputs. Called Fugatto (short for Foundational Generative Audio Transformer Opus 1), it generates or transforms any mix of music, voices and sounds described with prompts, using any combination of text and audio files. "While some AI models can compose a song or modify a voice, none have the dexterity of the new offering," said Nvidia in a blog post on Monday.

What Can Fugatto AI Model Do?

Nvidia describes this model as a "Swiss Army knife for sound," one that allows users to control the audio output using only text. Fugatto can create a music snippet based on a text prompt, add or remove instruments in an existing song, change the accent or emotion in a voice and even let people produce sounds never heard before, the company explained.

"We wanted to create a model that understands and generates sound like humans do," said Rafael Valle, a manager of applied audio research at Nvidia.

Key Features of Fugatto

Supporting numerous audio generation and transformation tasks, Fugatto is the first foundational generative AI model to showcase emergent properties (capabilities that arise from the interaction of its various trained abilities) and the ability to combine free-form instructions, Nvidia said.

"Fugatto is our first step toward a future where unsupervised multitask learning in audio synthesis and transformation emerges from data and model scale," Valle added.

Potential Use Cases for Fugatto AI

According to Nvidia, music producers could use Fugatto to quickly prototype or edit an idea for a song, trying out different styles, voices and instruments. They could also add effects and enhance the overall audio quality of an existing track.

An ad agency could apply Fugatto to quickly target an existing campaign for multiple regions or situations, applying different accents and emotions to voiceovers.

Additionally, Nvidia says language learning tools could be personalised to use any voice a speaker chooses. Imagine an online course spoken in the voice of any family member or friend.

Video game developers could use the AI model to modify prerecorded assets in their title to fit the changing action as users play the game. Or, they could create new assets easily from text instructions and optional audio inputs.

The Technology Behind Fugatto

Nvidia said Fugatto is a foundational generative transformer model that builds on prior work in areas such as speech modeling, audio vocoding and audio understanding. Fugatto was made by a diverse group of people from around the world, including India, Brazil, China, Jordan and South Korea. "Their collaboration made Fugatto's multi-accent and multilingual capabilities stronger," said the company.

The full version of Fugatto has 2.5 billion parameters and was trained on a bank of Nvidia DGX systems equipped with 32 Nvidia H100 Tensor Core GPUs.
