Veo 3 brings audio to AI video and tackles the Will Smith Test

Google’s Veo 3 represents a significant leap in AI video generation by introducing synchronized audio capabilities, enabling users to create realistic videos with voices, dialog, and sound effects. This advancement marks a notable evolution from the silent AI videos of 2022-2024, though it still exhibits quirks like the infamous “crunchy spaghetti” effect when generating eating sounds. As AI video technology rapidly improves, it raises important questions about the potential for creating increasingly convincing synthetic content of real people.

The big picture: Google has launched Veo 3, a groundbreaking AI video synthesis model that generates synchronized audio tracks with eight-second high-definition video clips, a capability previously unavailable in major AI video generators.

The Will Smith test: The AI community quickly benchmarked Veo 3 using the infamous “Will Smith eating spaghetti” test, which has become a standard reference point for evaluating AI video generation quality.

The spaghetti benchmark originated in March 2023 with horrific early AI-generated videos from the open-source ModelScope, and it became well known enough that Will Smith himself parodied it in February 2024.
AI developer Javi Lopez tested Veo 3 with the Smith spaghetti scenario and posted results on X, showing significant improvements over previous iterations.

Technical quirks: Despite impressive visual quality, Veo 3’s audio generation still exhibits notable glitches in specific scenarios.

The AI-generated Will Smith appears to crunch on spaghetti rather than producing realistic pasta-eating sounds, likely due to training data biases toward crunching sound effects paired with chewing mouths.
This audio anomaly highlights the challenges in creating perfectly synchronized and contextually appropriate sound effects for AI-generated videos.

Why this matters: Veo 3’s capability to generate coherent dialog and music represents a substantial evolution in AI video synthesis technology since 2023, with realistic examples already circulating online.

The technology currently includes celebrity filters as a safeguard, limiting some potential misuses.
As these tools become increasingly realistic, concerns grow about their potential to create convincing synthetic videos of real people doing things they never did.

Reading between the lines: The rapid advance from crude, silent AI videos in 2023 to today’s more sophisticated audio-visual synthesis suggests this is only an early stage, and that synthetic video will likely become increasingly hard to distinguish from authentic footage.

Google’s Will Smith double is better at eating AI spaghetti … but it’s crunchy?

Ars Technica
