Nvidia’s new AI generates music from text and audio inputs

The artificial intelligence industry continues to push boundaries in audio generation, with Nvidia announcing Fugatto, a new AI tool capable of creating unique sounds and music through text and audio inputs.

Key Innovation: Nvidia’s Fugatto represents a significant advancement in AI-generated audio, producing novel sound combinations and transformations that the company claims are entirely original.

  • The system can create unconventional audio combinations, such as a trumpet that meows or a saxophone that transitions from howling to barking before merging into electronic music
  • Fugatto can generate specific sound effects based on detailed descriptions, including complex audio scenarios like simulating the sound of a sentient machine awakening
  • The tool offers voice modification capabilities, allowing users to alter accents and emotional tones

Technical Capabilities: Fugatto demonstrates versatility in audio manipulation and creation through a comprehensive set of features.

  • Users can isolate vocals within existing songs and add new instruments
  • The system enables melody transformation by switching between different instruments and vocal styles
  • Trained on millions of audio samples, including BBC sound effects, the model uses specialized instructions to expand its task range and accuracy

Industry Context: While several major tech companies offer AI audio tools, Fugatto’s claimed ability to create entirely new sounds sets it apart in a competitive landscape.

  • Existing players in the AI audio space include Stability AI, OpenAI, Google DeepMind, ElevenLabs, and Adobe
  • The industry faces ongoing legal challenges, with some AI startups dealing with copyright lawsuits over music creation tools
  • Recent reports have highlighted that Nvidia, among other companies, trained AI models using YouTube video subtitles

Development Status: The path to public availability remains unclear, though technical details have been disclosed.

  • Nvidia has released a research paper detailing the datasets used in training
  • The company has not announced any timeline for public release or commercial availability
  • The technology required extensive training on millions of audio samples to achieve its current capabilities

Looking Ahead: While Fugatto’s ability to create novel sounds could open new creative possibilities for music and audio production, questions about commercial viability, copyright implications, and potential misuse will likely influence its eventual deployment and adoption in the broader market.

Nvidia claims a new AI audio generator can make sounds never heard before
