
AI is rewriting the rules of the internet, and Cloudflare is stepping in to help customers protect their data from being scraped by AI bots.

Tech giants grapple with web scraping for AI: Major tech companies are changing their policies around web scraping, with some blaming third parties for ignoring robots.txt files and others seemingly asserting the right to use any publicly posted data for AI training:

  • Google has updated its privacy policy to allow it to train AI products such as its Bard chatbot on data scraped from the web.
  • Microsoft AI chief Mustafa Suleyman has suggested that anything posted online is fair game for AI training, likening it to “freeware.”
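
For readers unfamiliar with robots.txt, the sketch below shows what honoring it looks like from a crawler's side, using Python's standard `urllib.robotparser`. The rules, the `example.com` URL, and the choice of `GPTBot` as the blocked crawler are illustrative assumptions, not taken from any site or company named above. Compliance is voluntary: a scraper that never runs a check like this simply ignores the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (assumption): block one AI crawler,
# allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

page = "https://example.com/articles/1"  # placeholder URL
print(is_allowed("GPTBot", page))        # the AI crawler is disallowed
print(is_allowed("Mozilla/5.0", page))   # an ordinary browser is allowed
```

Nothing in the file enforces the rule on the server side, which is why network-level blocking is being offered as a complement.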

Cloudflare offers AI bot blocking for customers: In response to the growing concern over web scraping by AI bots, Cloudflare is providing its CDN customers with a tool to detect and block these bots:

  • The service leverages Cloudflare’s global network data to identify new scraping tools and their behavior patterns without manual fingerprinting.
  • This allows customers to stay protected against the latest waves of bot activity as they emerge.

Most popular AI bots on Cloudflare’s network: Cloudflare has shared data on the most prevalent AI bots observed on its network in terms of request volume:

  • Cloudflare's graph of user agent matches for known AI bots over the past year shows a significant increase in activity.
  • The rise in AI bot traffic highlights the growing need for robust defenses against unauthorized web scraping.
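
As a rough illustration of what "user agent matches" means, the sketch below counts requests whose User-Agent header contains a known AI crawler token. The token list and the sample headers are illustrative assumptions, not Cloudflare's data or its detection logic, and the function names are hypothetical.

```python
from collections import Counter

# Illustrative tokens only (assumption): substrings that some AI crawlers
# include in their User-Agent header.
AI_BOT_TOKENS = ("GPTBot", "CCBot", "ClaudeBot", "Bytespider")

def match_ai_bot(user_agent):
    """Return the first AI bot token found in user_agent, or None."""
    ua = user_agent.lower()
    for token in AI_BOT_TOKENS:
        if token.lower() in ua:
            return token
    return None

def count_ai_bot_requests(user_agents):
    """Count requests per matched AI bot token, skipping unmatched ones."""
    counts = Counter()
    for ua in user_agents:
        token = match_ai_bot(ua)
        if token is not None:
            counts[token] += 1
    return counts

# Made-up request log for demonstration.
sample = [
    "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/126.0",
    "CCBot/2.0 (https://commoncrawl.org/faq/)",
    "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
]
print(count_ai_bot_requests(sample))
```

A User-Agent header is set by the client and can be spoofed, so string matching of this kind only catches bots that identify themselves honestly.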

Broader implications: As AI continues to reshape the internet landscape, the debate over data scraping and usage rights is likely to intensify:

  • Website owners and users may need to reconsider their content sharing practices and privacy expectations in light of AI’s increasing reliance on web-scraped data.
  • The development of AI-focused web scraping defenses, like those offered by Cloudflare, could become a critical aspect of online security and data protection strategies.
  • Legal and ethical frameworks around AI training data sourcing and usage will need to evolve to strike a balance between innovation and individual rights in the era of AI-driven services.
