×
Zuckerberg was reportedly aware that Meta trained its AI model on pirated works
Written by
Published on
Join our daily newsletter for breaking news, product launches and deals, research breakdowns, and other industry-leading AI coverage
Join Now

The core revelation: Meta CEO Mark Zuckerberg approved the use of Library Genesis (LibGen), a known pirated content repository, to train the company’s Llama 3 AI model, according to newly unsealed court documents.

Key details of the disclosure: Internal communications revealed through a class-action lawsuit show Meta executives discussing the company’s deliberate use of unauthorized copyrighted material.

  • Sony Theakanath, Meta’s director of product management, confirmed in an email that Zuckerberg approved LibGen’s use for AI training
  • The company explicitly planned to keep its use of LibGen confidential
  • Meta employees discussed methods to remove copyright indicators from the pirated content
  • Internal discussions revealed concerns about downloading pirated content from corporate devices

Legal context: A class-action lawsuit filed by authors Christopher Golden, Richard Kadrey, and comedian Sarah Silverman alleges unauthorized use of their copyrighted work.

  • The documents were unsealed by Judge Vince Chhabria of the U.S. District Court for Northern California
  • Meta’s legal team had previously argued that their use of text for AI training fell under fair use provisions
  • Zuckerberg reportedly acknowledged in a deposition that such piracy would raise “lots of red flags”

Corporate strategy and risk assessment: Meta executives weighed the benefits against potential backlash while implementing this controversial decision.

  • Internal communications cited performance benchmarks as justification for using LibGen
  • Documents referenced rumors that competitors like OpenAI and Mistral AI were also using the library
  • Executives acknowledged potential legislative risks, particularly in the US and EU
  • The company developed specific “mitigations” to address potential fallout

Industry implications: This revelation comes at a critical time for AI development and copyright law.

  • Meta announced a 5% workforce reduction targeting “lowest performers” (approximately 3,600 workers)
  • The case could set important precedents for numerous other AI-related copyright lawsuits
  • The controversy highlights the tension between rapid AI development and intellectual property rights

Analyzing the deeper impact: This controversy exposes a fundamental contradiction in the AI industry’s approach to training data – while companies need vast amounts of high-quality content to develop effective AI models, their methods of obtaining this content often conflict with established intellectual property rights, potentially setting up a long-term conflict between content creators and AI developers.

Zuckerberg Appeared to Know Meta Trained AI on Pirated Library

Recent News

A deeper look into how Google and Microsoft think about AI-powered search

Tech companies are increasingly framing AI safety measures as competitive advantages rather than regulatory burdens.

AI hardware strategies for scaling infrastructure efficiently

As companies navigate a projected $100 billion AI infrastructure market, most will opt for cloud services over building in-house systems.

Man with paralysis flies virtual drone using brain implant

A paralyzed patient successfully navigated virtual obstacles by imagining finger movements that were translated into drone commands through brain-implanted electrodes.