New AI Tool Lets You Upload Silent Videos and Read Speakers’ Lips

AI-powered lip reading technology debuts: Symphonic Labs, an audio tech startup, has launched an online tool showcasing its AI’s lip reading capabilities, which could change how speech is understood when audio is noisy, missing, or impractical.

  • The company, based in San Francisco and Canada, specializes in “multimodal speech understanding” tools, with applications ranging from voice calls in noisy environments to whispering commands to voice assistants in public.
  • The startup’s new website, readtheirlips.com, allows users to upload short video clips of speakers and receive text transcriptions of what the AI determines is being said, even when no usable audio is present.
  • The tool requires clear visibility of the speaker’s face and lips to function effectively.
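
Because the model depends on a clear view of the mouth, it can help to check a clip locally before uploading it. The snippet below is a minimal, illustrative sketch of such a pre-check using OpenCV’s bundled face detector; it is not part of Symphonic Labs’ tooling, and the file name is a placeholder.

```python
# Sketch: pre-screen a clip locally for a clearly visible face before uploading.
# Illustrative only (not Symphonic Labs' tooling); uses OpenCV's bundled Haar
# cascade face detector as a rough visibility check.
import cv2

def face_visible_ratio(video_path: str, sample_every: int = 10) -> float:
    """Return the fraction of sampled frames containing at least one detected face."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    cap = cv2.VideoCapture(video_path)
    sampled, with_face, index = 0, 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % sample_every == 0:
            sampled += 1
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            if len(faces) > 0:
                with_face += 1
        index += 1
    cap.release()
    return with_face / sampled if sampled else 0.0

if __name__ == "__main__":
    # "speaker_clip.mp4" is a placeholder file name.
    ratio = face_visible_ratio("speaker_clip.mp4")
    print(f"Face detected in {ratio:.0%} of sampled frames")
```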

Real-world testing reveals promising results: Initial tests of Symphonic Labs’ lip reading AI on various video clips demonstrate a high level of accuracy, with some minor errors in transcription.

  • A 26-second Getty Images clip of U.S. VP Kamala Harris speaking at a Gun Violence Awareness Day event was used to evaluate the software’s performance.
  • The AI accurately transcribed most of the speech, with only minor errors such as “to try to comfort them” instead of “to try and comfort them.”
  • Some moderate errors were also observed, such as “will recall every day in gun violence” instead of “or what we call everyday gun violence” (a sketch after this list shows how such slips can be scored).
  • The software also showed potential in decoding silent film era clips, providing insights into what movie stars like Gloria Swanson might have been saying in vintage footage.
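
The article describes these slips informally; in speech recognition they are commonly quantified with word error rate (WER), the fraction of reference words that must be substituted, inserted, or deleted to match the hypothesis. The sketch below scores the “minor” error quoted above with a plain dynamic-programming WER; it illustrates the metric and is not a measurement Symphonic Labs has published.

```python
# Sketch: quantifying transcription slips like those above with word error rate (WER),
# a standard speech-recognition metric. Shown for illustration only.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# The minor error quoted above: one substituted word out of five (WER = 0.20).
print(word_error_rate("to try and comfort them", "to try to comfort them"))
```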

Broader applications and future developments: Symphonic Labs’ technology extends beyond the online demo, with potential applications in personal computing and accessibility.

  • The company’s macOS application, MAMO, integrates this technology to let users issue voice commands silently.
  • Chris Samra, an engineer with Symphonic Labs, stated that the startup’s goal is to build an interface that feels telepathic without the need for implants or bulky hardware.
  • The AI model serves dual purposes: enabling faster, silent communication and analyzing speech at long distances or in loud environments.
  • Recent updates to the software allow for the addition of personal context and vocabulary, improving its ability to work with individual users’ voices and interactions.
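
The company has not detailed how personal context is incorporated, but one common approach in speech systems is to bias or re-rank candidate transcriptions toward user-supplied terms. The sketch below illustrates that general idea only; the function, scores, and vocabulary are assumptions for illustration, not MAMO’s implementation.

```python
# Sketch: one common way personal vocabulary can help a speech decoder is to
# re-rank candidate transcriptions, boosting hypotheses containing user-supplied
# terms. Names, scores, and the boost weight below are illustrative assumptions.
from typing import List, Tuple, Set

def rerank_with_vocabulary(
    candidates: List[Tuple[str, float]],  # (transcription, model score), higher is better
    personal_vocab: Set[str],
    boost: float = 0.5,
) -> List[Tuple[str, float]]:
    rescored = []
    for text, score in candidates:
        hits = sum(1 for word in text.lower().split() if word in personal_vocab)
        rescored.append((text, score + boost * hits))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# Hypothetical usage: "mamo" is in the user's vocabulary, so the second hypothesis wins.
candidates = [("open memo and start dictation", 0.90),
              ("open mamo and start dictation", 0.85)]
print(rerank_with_vocabulary(candidates, personal_vocab={"mamo"}))
```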

Addressing public discomfort with voice assistants: The technology could potentially solve the issue of user discomfort when using voice assistants in public spaces.

  • A PwC survey found that 74% of U.S. consumers primarily use mobile voice assistants at home, with many feeling uncomfortable using them in public.
  • Symphonic Labs’ technology could allow users to interact with voice assistants silently, potentially increasing adoption in public settings.

Potential impact on accessibility and communication: The lip reading AI technology shows promise in improving accessibility for individuals with speech or hearing impairments.

  • Samra highlighted the potential for the technology to aid people with dysphonia, repetitive strain injury (RSI), and hearing difficulties.
  • The ability to dictate in public and noisy environments without vocalization or additional hardware could significantly benefit these groups.

Looking ahead: Balancing innovation and privacy concerns: While Symphonic Labs’ lip reading AI demonstrates impressive capabilities, its widespread adoption may raise questions about privacy and consent in public spaces.

  • As the technology continues to develop, it will be crucial to address potential ethical concerns surrounding the ability to decode speech from silent video footage.
  • Balancing the benefits of improved communication and accessibility against the need to protect individual privacy will be essential to the technology’s future development and deployment.