Meta faces new lawsuit for using copyrighted books to train AI

Meta faces accusations of deliberately using pirated books to train its AI systems, with CEO Mark Zuckerberg allegedly approving the use of unauthorized content despite internal concerns.

Key allegations: The lawsuit, filed by prominent authors including Ta-Nehisi Coates and Sarah Silverman, claims Meta knowingly used pirated books from the LibGen dataset to train its Llama language model.

  • Internal Meta communications revealed that executives were aware LibGen contained pirated content
  • The dataset was allegedly distributed through peer-to-peer torrents
  • Documents produced during discovery suggest Zuckerberg approved the use of LibGen despite concerns from Meta’s AI executive team

Legal context: This case represents one of several ongoing legal battles over AI companies’ use of copyrighted material for training purposes.

  • The authors initially sued Meta in 2023 over unauthorized use of their works
  • U.S. District Judge Vince Chhabria previously dismissed the authors’ claims concerning chatbot-generated text and the removal of copyright management information
  • The authors are now seeking permission to file an updated complaint with new evidence
  • Similar lawsuits are pending against other AI companies regarding unauthorized use of copyrighted works

Meta’s position: The tech giant has not yet publicly addressed these specific allegations.

  • Meta and other AI companies have generally argued that their use of copyrighted material falls under the “fair use” doctrine
  • The company did not immediately respond to requests for comment on the new allegations

Legal developments: Recent court proceedings suggest a complex path forward for the case.

  • Authors are attempting to revive previously dismissed claims based on new evidence
  • They’re also seeking to add a new computer fraud claim
  • Judge Chhabria has indicated he will allow an amended complaint
  • However, the judge expressed skepticism about some of the proposed claims

Looking ahead: This case could set important precedents for AI training practices and intellectual property rights, potentially forcing tech companies to reconsider how they source training data for their AI models. The outcome may also influence how other courts handle similar cases involving AI training and copyright infringement.

Meta knew it used pirated books to train AI, authors say
