For the first time, web-browsing bots account for the majority of internet traffic, with AI company crawlers such as OpenAI's ChatGPT-User and Anthropic's ClaudeBot representing 6% and 13% of all web traffic, respectively. Content creators are fighting back with “AI poisoning” tools that corrupt training data, but these same techniques could be weaponized to spread misinformation at scale.
The big picture: The battle between AI companies scraping data and content creators protecting their work has escalated beyond legal disputes into a technological arms race that could reshape how information flows across the internet.
Key details: Major AI companies argue that data scraping falls under the fair use doctrine and is essential for model development.
- OpenAI previously stated it would be “impossible” to develop AI models without using copyrighted work.
- Disney and Universal sued Midjourney, an AI image generator, in June, alleging it plagiarizes characters from franchises like Star Wars and Despicable Me.
- When the US Copyright Office analyzed public comments about AI and copyright, 91% expressed negative sentiment about AI.
How the poisoning works: Researchers have developed tools that alter content in ways imperceptible to humans but confusing to the AI models trained on it; a conceptual sketch of the underlying technique follows this list.
- Glaze applies “style cloaks” that cause AI to misinterpret artistic styles, making watercolor paintings appear as oil paintings to AI systems.
- Nightshade goes further by poisoning data to create false associations, potentially making AI models link “cat” with images of dogs.
- Both tools, developed at the University of Chicago, have been downloaded over 10 million times.
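For intuition, here is a minimal, hypothetical sketch of the general idea these tools build on: a targeted adversarial perturbation that nudges an image's machine-readable features toward a different concept while leaving the pixels nearly unchanged. This is not the published Glaze or Nightshade algorithm; the encoder choice (an off-the-shelf ResNet), the file paths, and the hyperparameters are all illustrative.

```python
# Conceptual sketch only: a targeted feature-space perturbation in PyTorch.
# NOT the actual Glaze/Nightshade code; encoder and settings are illustrative.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Off-the-shelf ImageNet encoder standing in for "the model the scraper trains".
encoder = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()   # keep penultimate features as the embedding
encoder.eval()

to_tensor = T.Compose([T.Resize((224, 224)), T.ToTensor()])
normalize = T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

def embed(x: torch.Tensor) -> torch.Tensor:
    return encoder(normalize(x))

def poison(image_path: str, target_path: str,
           eps: float = 8 / 255, steps: int = 100, lr: float = 1 / 255):
    """Perturb the image so its embedding drifts toward the target's."""
    x = to_tensor(Image.open(image_path).convert("RGB")).unsqueeze(0)
    t = to_tensor(Image.open(target_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        target_feat = embed(t)                     # e.g. features of a dog photo

    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = torch.nn.functional.mse_loss(
            embed(torch.clamp(x + delta, 0, 1)), target_feat)
        loss.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()        # step toward the target concept
            delta.clamp_(-eps, eps)                # cap per-pixel change: invisible
            delta.grad.zero_()
    return torch.clamp(x + delta, 0, 1)            # still looks like a cat to us
```

To a human the output is indistinguishable from the original, but a model trained on many such mismatched image-text pairs can learn the wrong association, which is the effect Nightshade's authors describe.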
Corporate countermeasures: Tech companies are deploying their own defensive strategies against data scraping.
- Cloudflare’s AI Labyrinth lures misbehaving bots into mazes of nonsense AI-generated pages, wasting the resources behind the 50 billion AI crawler requests the company sees each day (a minimal sketch of the trap pattern follows this list).
- The company also released a tool that lets websites charge AI companies for access and block those that refuse to pay.
- Testing shows Glaze remains 85% effective against countermeasures, suggesting AI companies may find dealing with poisoned data more trouble than it’s worth.
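Cloudflare has not published AI Labyrinth's internals, but the trap pattern is simple to picture. The sketch below is a hypothetical, minimal Flask version: filler pages that link only to more generated pages, served solely to clients whose User-Agent matches a hand-picked crawler list. The list, routes, and word-salad filler are all assumptions; Cloudflare says its real pages are AI-generated and never shown to human visitors.

```python
# Minimal, hypothetical "labyrinth"-style crawler trap (not Cloudflare's code).
# Bots that ignore robots.txt follow a hidden link into /maze/ and then descend
# through an endless chain of procedurally generated pages.
from flask import Flask, request, abort
import hashlib
import random

app = Flask(__name__)

# Illustrative list; real deployments would match many more crawler signatures.
AI_CRAWLER_TOKENS = ("GPTBot", "ChatGPT-User", "ClaudeBot", "CCBot")

def is_ai_crawler(user_agent: str) -> bool:
    return any(token in user_agent for token in AI_CRAWLER_TOKENS)

@app.route("/maze/<token>")
def maze(token: str):
    if not is_ai_crawler(request.headers.get("User-Agent", "")):
        abort(404)  # humans and well-behaved bots never land here

    # Seed from the URL so every maze page is stable across revisits.
    rng = random.Random(hashlib.sha256(token.encode()).hexdigest())
    words = ["data", "archive", "report", "index", "summary", "record"]
    filler = " ".join(rng.choices(words, k=200))

    # Each page links five levels deeper; the maze never terminates.
    links = "".join(
        f'<a href="/maze/{hashlib.sha256((token + str(i)).encode()).hexdigest()[:12]}">more</a> '
        for i in range(5)
    )
    return f"<html><body><p>{filler}</p>{links}</body></html>"

if __name__ == "__main__":
    app.run()
```

A robots.txt Disallow rule on /maze/ would keep compliant crawlers out, so only bots that break the rules ever enter and burn their crawl budget inside it.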
The darker implications: Nation-states may be exploiting similar poisoning techniques to manipulate AI-generated information.
- The Atlantic Council, a US-based think tank, alleges Russia’s Pravda network posted millions of fake news pages designed to trick AI crawlers into promoting Kremlin narratives about Ukraine.
- Analysis by NewsGuard, a technology firm that tracks misinformation, found that 10 major AI chatbots produced text aligned with Pravda’s narratives in one-third of test cases.
- The volume of poisoned content could lead AI systems to over-emphasize certain narratives when responding to users.
What they’re saying: Experts see both the potential and peril of these defensive tools.
- “These are trillion-dollar market-cap companies, literally the biggest companies in the world, taking by force what they want,” said Ben Zhao, the University of Chicago researcher behind Glaze and Nightshade.
- Jacob Hoffman-Andrews at the Electronic Frontier Foundation, a digital rights nonprofit, noted these tools offer “a neat method of action that doesn’t rely on changing regulations, which can take a while.”
- “At the root of all this is money,” Zhao added, pointing to AI companies’ reluctance to pay licensing fees for legitimate content.
Why this matters: The proliferation of AI poisoning tools represents a fundamental shift in how content creators can protect their work, but it also opens the door for malicious actors to corrupt the information ecosystem that powers AI systems worldwide.