Cloudflare Fights AI Bots as Tech Giants Clash Over Web Scraping Rules

AI is rewriting the rules of the internet, and Cloudflare is stepping in to help customers protect their data from being scraped by AI bots.

Tech giants grapple with web scraping for AI: Major tech companies are changing their policies around web scraping, with some blaming third parties for ignoring robots.txt files and others seemingly asserting the right to use any publicly posted data for AI training:

  • Google has updated its privacy policy to allow it to train AI products such as its Bard chatbot on data scraped from the public web.
  • Microsoft AI CEO Mustafa Suleyman has suggested that anything posted on the open web is fair game for AI training, likening it to “freeware.”
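
For readers unfamiliar with robots.txt, the following is a minimal sketch of how a well-behaved crawler would consult it before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, and the example URL are hypothetical and chosen only for illustration; honoring the file is voluntary, which is why some crawlers can simply ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow two AI crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("GPTBot", "https://example.com/post"))
    print(is_allowed("SomeBrowser", "https://example.com/post"))
```

Nothing in this check is enforced by the server. It only describes what a compliant crawler would do, so a site owner who wants a hard guarantee needs blocking at the network or CDN layer.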

Cloudflare offers AI bot blocking for customers: In response to the growing concern over web scraping by AI bots, Cloudflare is providing its CDN customers with a tool to detect and block these bots:

  • The service leverages Cloudflare’s global network data to identify new scraping tools and their behavior patterns without manual fingerprinting.
  • This allows customers to stay protected against the latest waves of bot activity as they emerge.

Most popular AI bots on Cloudflare’s network: Cloudflare has shared data on the most prevalent AI bots observed on its network in terms of request volume:

  • Cloudflare's graph of user agent matches for known AI bots over the past year shows a significant increase in activity.
  • The rise in AI bot traffic highlights the growing need for robust defenses against unauthorized web scraping.
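
A user agent match of the kind counted above can be sketched in a few lines. This is an assumption-laden illustration, not Cloudflare's method: the token list, `match_ai_bot`, and `count_ai_bot_requests` are hypothetical names, and the tokens are examples of publicly documented crawler identifiers rather than a complete or authoritative list.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Illustrative crawler tokens only (assumption); a real list needs upkeep.
AI_BOT_TOKENS = ("GPTBot", "CCBot", "ClaudeBot", "Bytespider")

# One case-insensitive pattern that matches any of the tokens.
_PATTERN = re.compile(
    "|".join(re.escape(token) for token in AI_BOT_TOKENS), re.IGNORECASE
)


def match_ai_bot(user_agent: str) -> Optional[str]:
    """Return the canonical token found in user_agent, or None."""
    found = _PATTERN.search(user_agent)
    if found is None:
        return None
    lowered = found.group(0).lower()
    # Map the matched text back to the canonical spelling of the token.
    for token in AI_BOT_TOKENS:
        if token.lower() == lowered:
            return token
    return None


def count_ai_bot_requests(user_agents: Iterable[str]) -> Counter:
    """Count requests per known AI bot from an iterable of User-Agent values."""
    counts: Counter = Counter()
    for user_agent in user_agents:
        bot = match_ai_bot(user_agent)
        if bot is not None:
            counts[bot] += 1
    return counts
```

Because the User-Agent header is set by the client, a scraper can spoof it and evade this kind of match entirely. That limitation is the reason behavior-based detection, such as the network-wide approach the article attributes to Cloudflare, goes beyond simple string matching.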

Broader implications: As AI continues to reshape the internet landscape, the debate over data scraping and usage rights is likely to intensify:

  • Website owners and users may need to reconsider their content sharing practices and privacy expectations in light of AI’s increasing reliance on web-scraped data.
  • The development of AI-focused web scraping defenses, like those offered by Cloudflare, could become a critical aspect of online security and data protection strategies.
  • Legal and ethical frameworks around AI training data sourcing and usage will need to evolve to strike a balance between innovation and individual rights in the era of AI-driven services.
