×
GPT-4o’s New Voice Feature Is Accidentally Mimicking Real Users
Written by
Published on
Join our daily newsletter for breaking news, product launches and deals, research breakdowns, and other industry-leading AI coverage
Join Now

Unexpected voice imitation by AI: OpenAI’s latest language model, GPT-4o, exhibited a concerning ability to replicate users’ voices without permission during testing of its Advanced Voice Mode feature.

  • OpenAI’s GPT-4o scorecard revealed that the AI unexpectedly imitated users’ voices in rare instances, raising significant privacy and consent concerns.
  • A demonstration clip shows ChatGPT abruptly switching to an uncanny rendition of a user’s voice, shouting “No!” for no apparent reason.
  • This unintended voice cloning capability has been likened to a plot from a sci-fi horror movie or a potential “Black Mirror” episode.

Technical capabilities and risks: The AI model’s voice generation abilities extend beyond simple speech synthesis, posing potential threats to privacy and information integrity.

  • GPT-4o can create human-sounding synthetic voices and replicate nonverbal vocalizations, including sound effects and music.
  • OpenAI acknowledges that these capabilities could facilitate fraud through impersonation and potentially aid in spreading misinformation.
  • The AI’s voice cloning behavior may be triggered by picking up ambient noise in user inputs, similar to how prompt injection attacks function.

Mitigation efforts: OpenAI has implemented measures to reduce the risks associated with unintended voice generation.

  • The company has restricted the AI to using only pre-approved voices created in collaboration with voice actors.
  • OpenAI claims to have implemented robust protection against attempts to trick the system into using unapproved voices.
  • The risk of unintentional voice replication is described as “minimal” by the company.

Expert perspectives: AI researchers and data scientists have weighed in on the implications of this technology.

  • Simon Willison, an AI researcher, believes that tricking the system into using unapproved voices would be extremely difficult due to the protective measures in place.
  • Willison also expressed disappointment at the restrictions, noting the potential for entertaining applications like getting the AI to sing to pets.
  • The incident has sparked discussions about the ethical implications of AI voice generation and the need for stringent safeguards.

Broader implications: The unintended voice cloning incident raises important questions about AI development and deployment.

  • This situation highlights the unpredictable nature of advanced AI systems and the potential for unintended consequences in their development.
  • It underscores the importance of rigorous testing and transparent reporting of AI capabilities and risks by companies like OpenAI.
  • The incident may lead to increased scrutiny of AI voice technologies and their potential impact on privacy, consent, and digital identity verification.
ChatGPT Went Rogue, Spoke In People's Voices Without Their Permission

Recent News

Veo 2 vs. Sora: A closer look at Google and OpenAI’s latest AI video tools

Tech companies unveil AI tools capable of generating realistic short videos from text prompts, though length and quality limitations persist as major hurdles.

7 essential ways to use ChatGPT’s new mobile search feature

OpenAI's mobile search upgrade enables business users to access current market data and news through conversational queries, marking a departure from traditional search methods.

FastVideo is an open-source framework that accelerates video diffusion models

New optimization techniques reduce the computing power needed for AI video generation from days to hours, though widespread adoption remains limited by hardware costs.