Veo 3 brings audio to AI video and tackles the Will Smith Test

Google’s Veo 3 represents a significant leap in AI video generation by introducing synchronized audio capabilities, enabling users to create realistic videos with voices, dialog, and sound effects. This advancement marks a notable evolution from the silent AI videos of 2022-2024, though it still exhibits quirks like the infamous “crunchy spaghetti” effect when generating eating sounds. As AI video technology rapidly improves, it raises important questions about the potential for creating increasingly convincing synthetic content of real people.

The big picture: Google has launched Veo 3, a groundbreaking AI video synthesis model that generates synchronized audio tracks with eight-second high-definition video clips, a capability previously unavailable in major AI video generators.

The Will Smith test: The AI community quickly benchmarked Veo 3 using the infamous “Will Smith eating spaghetti” test, which has become a standard reference point for evaluating AI video generation quality.

  • The spaghetti benchmark originated in March 2023 with horrific early AI-generated videos from the open-source ModelScope, later becoming well-known enough that Will Smith himself parodied it in February 2024.
  • AI developer Javi Lopez tested Veo 3 with the Smith spaghetti scenario and posted results on X, showing significant improvements over previous iterations.

Technical quirks: Despite impressive visual quality, Veo 3’s audio generation still exhibits notable glitches in specific scenarios.

  • The AI-generated Will Smith appears to crunch on spaghetti rather than producing realistic pasta-eating sounds, likely due to training data biases toward crunching sound effects paired with chewing mouths.
  • This audio anomaly highlights the challenges in creating perfectly synchronized and contextually appropriate sound effects for AI-generated videos.

Why this matters: Veo 3’s capability to generate coherent dialog and music represents a substantial evolution in AI video synthesis technology since 2023, with realistic examples already circulating online.

  • The technology currently includes celebrity filters as a safeguard, limiting some potential misuses.
  • As these tools become more realistic, concerns grow that they could be used to create convincing synthetic videos of real people doing things they never did.

Reading between the lines: The rapid advancement from crude, silent AI videos in 2023 to today’s more sophisticated audio-visual synthesis suggests this is only an early stage, and that synthetic video will likely become increasingly hard to distinguish from authentic footage.

Google’s Will Smith double is better at eating AI spaghetti … but it’s crunchy?
