ByteDance’s new ‘OmniHuman’ AI tool turns single user photos into full-body videos

ByteDance researchers have created OmniHuman, an AI system that can generate realistic full-body videos of people speaking, gesturing, and moving naturally from a single photograph, marking a significant advancement in AI-generated media.

Key innovation: The OmniHuman system represents a breakthrough in AI video generation by producing complete body animations that synchronize with speech and natural movements, moving beyond the limitations of previous systems that could only animate faces or upper bodies.

  • The system utilizes an “omni-conditions” training approach that combines text, audio, and body movement inputs
  • Researchers trained the AI on more than 18,700 hours of human video data
  • The technology can create videos of people delivering speeches and playing musical instruments

Technical architecture: ByteDance’s approach integrates multiple conditioning signals so that more of the available training data can be used.

  • The system processes text, audio, and pose data simultaneously to generate natural movements
  • This comprehensive training strategy allows for learning from larger and more diverse datasets than previous methods
  • In benchmark testing, OmniHuman demonstrated superior performance compared to existing systems
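
To make the idea of mixing conditioning signals concrete, here is a minimal sketch of one way a training loop might choose which signals to use for each clip, so that clips lacking audio or pose annotations still contribute. This is an illustration only, not ByteDance's implementation: the condition names, the keep-probabilities, and the `select_conditions` helper are all assumptions made for this example.

```python
import random

# Hypothetical keep-probabilities for each conditioning signal.
# These values are illustrative assumptions, not figures from ByteDance.
KEEP_PROB = {"text": 0.9, "audio": 0.5, "pose": 0.25}


def select_conditions(available, rng):
    """Pick which of a clip's available conditions to use for one training step.

    Each available condition is kept independently with its own probability,
    so a clip with only text can still be trained on, and a fully annotated
    clip is sometimes trained with only a subset of its signals.
    """
    # Sort so the order of random draws does not depend on set iteration order.
    return {name for name in sorted(available) if rng.random() < KEEP_PROB[name]}


# Toy dataset: clips differ in which annotations they carry.
clips = [
    {"id": "a", "conditions": {"text", "audio", "pose"}},
    {"id": "b", "conditions": {"text", "audio"}},
    {"id": "c", "conditions": {"text"}},
]

rng = random.Random(0)
for clip in clips:
    active = select_conditions(clip["conditions"], rng)
    print(clip["id"], sorted(active))
```

In this sketch no clip is discarded for missing an annotation. The selected set is always a subset of what the clip actually provides, which is one plausible reading of how combining conditions lets a model learn from a larger and more varied dataset.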

Industry context: ByteDance’s development comes at a time of increasing competition in AI video generation technology.

  • Google, Meta, and Microsoft are actively developing similar technologies
  • The breakthrough could provide ByteDance, TikTok’s parent company, with a competitive advantage
  • The technology has potential applications in entertainment production, educational content creation, and digital communications

Potential implications: While OmniHuman represents a significant technological advancement, it also raises important considerations about synthetic media.

  • The technology could streamline content creation processes across multiple industries
  • Concerns exist regarding potential misuse for creating deceptive content
  • ByteDance researchers plan to present their findings at an upcoming computer vision conference

Future outlook: The development of OmniHuman signals a potential shift in how digital content is created and consumed, though questions remain about implementation timeframes and access to the technology. The system’s ability to generate realistic full-body videos from single images could fundamentally alter the landscape of digital media production, while simultaneously intensifying discussions about synthetic media verification and authentication methods.

OmniHuman: ByteDance’s new AI creates realistic videos from a single photo
