Nvidia’s new AI generates music from text and audio inputs

The artificial intelligence industry continues to push boundaries in audio generation, with Nvidia announcing Fugatto, a new AI tool capable of creating unique sounds and music through text and audio inputs.

Key Innovation: Nvidia’s Fugatto represents a significant advancement in AI-generated audio, producing novel sound combinations and transformations that Nvidia claims are entirely original.

  • The system can create unconventional audio combinations, such as a trumpet that meows or a saxophone that transitions from howling to barking before merging into electronic music
  • Fugatto can generate specific sound effects based on detailed descriptions, including complex audio scenarios like simulating the sound of a sentient machine awakening
  • The tool offers voice modification capabilities, allowing users to alter accents and emotional tones

Technical Capabilities: Beyond generating new sounds, Fugatto can edit and transform existing audio.

  • Users can isolate vocals within existing songs and add new instruments
  • The system enables melody transformation by switching between different instruments and vocal styles
  • Trained on millions of audio samples, including BBC sound effects, the model relies on specialized instructions to broaden the range of tasks it can handle and improve its accuracy

Industry Context: While several major tech companies offer AI audio tools, Fugatto’s claimed ability to create entirely new sounds sets it apart in a competitive landscape.

  • Existing players in the AI audio space include Stability AI, OpenAI, Google DeepMind, ElevenLabs, and Adobe
  • The industry faces ongoing legal challenges, with some AI startups dealing with copyright lawsuits over music creation tools
  • Recent reports have highlighted that Nvidia, among other companies, trained AI models using YouTube video subtitles

Development Status: The path to public availability remains unclear, though technical details have been disclosed.

  • Nvidia has released a research paper detailing the datasets used in training
  • The company has not announced any timeline for public release or commercial availability
  • The technology required extensive training on millions of audio samples to achieve its current capabilities

Looking Ahead: While Fugatto’s ability to create novel sounds could open new creative possibilities for music and audio production, questions about commercial viability, copyright implications, and potential misuse will likely influence its eventual deployment and adoption in the broader market.

Nvidia claims a new AI audio generator can make sounds never heard before
