Nvidia Unveils AI Model That Can Modify Voices And Generate New Sounds

Fugatto joins other technologies, developed by startups like Runway and larger companies like Meta Platforms, that can generate audio or video from text prompts.

Nvidia, the world’s biggest supplier of chips and software for AI systems, on Monday showcased a new artificial intelligence model called Fugatto, which can generate music and audio, modify voices and create new sounds.

The technology is aimed at music, film and video game producers. Nvidia does not plan to release Fugatto to the public just yet.


According to the company’s blog post, Fugatto joins other technologies, developed by startups like Runway and larger companies like Meta Platforms, that can generate audio or video from text prompts.

Nvidia’s model stands out because it can take existing audio and modify it, such as changing a piano line into a human voice or altering the accent and mood of a spoken-word recording.

According to Bryan Catanzaro, Nvidia’s vice president of applied deep learning research, “Generative AI is going to bring new capabilities to music, video games, and ordinary people who want to create things.”

However, the relationship between the tech industry and Hollywood has become tense, especially after Scarlett Johansson accused OpenAI of imitating her voice without permission.

Nvidia’s new model was trained on open-source data and the company is still deciding whether and how to release it publicly. Catanzaro explained, “Any generative technology carries risks, as people might use it to generate things we’d prefer they don’t. We need to be careful, which is why we don’t have immediate plans to release this.”

The creators of generative AI models are still figuring out how to prevent misuse, such as generating misinformation or infringing on copyrights.

OpenAI and Meta have likewise not announced plans to release their audio and video generation models to the public.

Nvidia’s Fugatto technology is said to have the potential to revolutionise the music and entertainment industries, but the company is taking a cautious approach to ensure it is used responsibly.