Anna’s Archive, a shadow library and copyright reform advocate that hosts over 140 million copyrighted texts, says Chinese AI companies are extensively using its illegal collection to train large language models (LLMs), and argues that this makes Western copyright reform urgent.
The current landscape: Anna’s Archive emerged as the world’s largest shadow library after earlier platforms such as Sci-Hub and Z-Library came under legal pressure.
- The platform hosts over 140 million copyrighted texts, including books, academic papers, magazines, and newspapers
- Created as a preservation effort when other shadow libraries faced legal challenges and shutdowns
- The team says it is driven by a belief in preserving humanity’s cultural heritage, citing concerns about library funding cuts and corporate control
AI training dynamics: Major differences have emerged between Western and Chinese companies’ approaches to using copyrighted material for AI development.
- Most US-based companies declined to use the illegal database after learning of its status
- Chinese firms, despite being signatories to international copyright treaties, readily embraced the collection
- Approximately 30 companies, predominantly Chinese, have received high-speed access to the archive
- DeepSeek, a Chinese AI company, acknowledged using the collection for earlier models
National security implications: The situation has evolved beyond copyright infringement into a matter of strategic competition.
- Countries are developing AI systems for scientific research, cybersecurity, and military applications
- Access to vast text databases has become crucial for developing competitive AI systems
- Current copyright restrictions may be putting Western AI developers at a disadvantage relative to competitors
Proposed solutions: Anna’s Archive advocates for specific copyright reforms to address these challenges.
- Reduce copyright terms from 70 years after the author’s death to 20 years, matching the patent term, which runs for 20 years from filing
- Create exceptions for mass preservation and AI training while maintaining restrictions on individual distribution
- Follow examples from China and Japan, which have already implemented AI-specific copyright exceptions
Looking ahead: The intersection of copyright law and AI development presents complex challenges for international competition and innovation.
- Western nations face increasing pressure to balance intellectual property protection with AI development needs
- Current copyright frameworks may need fundamental revision to address emerging technological realities
- The outcome of this debate could significantly influence the future landscape of AI development and international technological competition
Source: “Copyright reform is necessary for national security”