Cloudflare Fights AI Bots as Tech Giants Clash Over Web Scraping Rules

AI is rewriting the rules of the internet, and Cloudflare is stepping in to help customers protect their data from being scraped by AI bots.

Tech giants grapple with web scraping for AI: Major tech companies are changing their policies around web scraping, with some blaming third parties for ignoring robots.txt files and others seemingly asserting the right to use any publicly posted data for AI training:

  • Google has updated its privacy policy, allowing it to train AI products such as its Bard chatbot on data scraped from the web.
  • Microsoft AI CEO Mustafa Suleyman has suggested that anything posted on the open web is fair game for AI training, likening it to “freeware.”
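
For readers unfamiliar with robots.txt, the sketch below shows how a well-behaved crawler might consult it before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt content, the `is_allowed` helper, and the example URL are illustrative assumptions, and `GPTBot` is used only as a sample crawler token. The file is advisory: nothing in it stops a crawler that chooses to ignore it, which is the complaint described above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network call is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/post"  # placeholder URL
    print("GPTBot allowed: ", is_allowed(ROBOTS_TXT, "GPTBot", page))
    print("Browser allowed:", is_allowed(ROBOTS_TXT, "Mozilla/5.0", page))
```

A compliant crawler would skip any URL for which `is_allowed` returns False; enforcement against crawlers that do not check is what tools like Cloudflare's aim to provide.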

Cloudflare offers AI bot blocking for customers: In response to the growing concern over web scraping by AI bots, Cloudflare is providing its CDN customers with a tool to detect and block these bots:

  • The service leverages Cloudflare’s global network data to identify new scraping tools and their behavior patterns without manual fingerprinting.
  • This allows customers to stay protected against the latest waves of bot activity as they emerge.

Most popular AI bots on Cloudflare’s network: Cloudflare has shared data on the most prevalent AI bots observed on its network in terms of request volume:

  • Cloudflare’s graph tracks user agent matches for known AI bots over the past year and indicates a significant increase in activity.
  • The rise in AI bot traffic highlights the growing need for robust defenses against unauthorized web scraping.
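
As a rough illustration of what "user agent matches" means, the sketch below tallies requests whose User-Agent header contains a known AI crawler token. The token list, the sample headers, and the `count_ai_bot_requests` helper are assumptions made for this example; they are not Cloudflare's actual list or method. User-Agent strings are self-reported and easy to spoof, so matching of this kind only catches bots that identify themselves.

```python
from collections import Counter

# Illustrative crawler tokens; an assumption, not Cloudflare's actual list.
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "Bytespider", "CCBot")


def count_ai_bot_requests(user_agents):
    """Count requests per AI bot token, matching case-insensitively."""
    counts = Counter()
    for ua in user_agents:
        lowered = ua.lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
                break  # attribute each request to at most one bot
    return counts


# Made-up User-Agent headers standing in for a request log.
SAMPLE = [
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/126.0",
    "Mozilla/5.0 (compatible; Bytespider)",
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
]

if __name__ == "__main__":
    for bot, n in count_ai_bot_requests(SAMPLE).most_common():
        print(f"{bot}: {n}")
```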

Broader implications: As AI continues to reshape the internet landscape, the debate over data scraping and usage rights is likely to intensify:

  • Website owners and users may need to reconsider their content sharing practices and privacy expectations in light of AI’s increasing reliance on web-scraped data.
  • The development of AI-focused web scraping defenses, like those offered by Cloudflare, could become a critical aspect of online security and data protection strategies.
  • Legal and ethical frameworks around AI training data sourcing and usage will need to evolve to strike a balance between innovation and individual rights in the era of AI-driven services.
