×
AI search tools fail on 60% of news queries, Perplexity best performer
Written by
Published on
Join our daily newsletter for breaking news, product launches and deals, research breakdowns, and other industry-leading AI coverage
Join Now

Imagine a vivid dream…of a fake URL.

Generative AI search tools are proving to be alarmingly unreliable for news queries, according to comprehensive new research from Columbia Journalism Review’s Tow Center. As roughly 25% of Americans now turn to AI models instead of traditional search engines, the implications of these tools delivering incorrect information more than 60% of the time raises significant concerns about public access to accurate information and the unintended consequences for both news publishers and information consumers.

The big picture: A new Columbia Journalism Review study found generative AI search tools incorrectly answer over 60 percent of news-related queries, raising serious concerns as Americans increasingly adopt these tools as search engine alternatives.

Key details: Researchers tested eight AI-driven search tools by asking them to identify headline, publisher, publication date, and URL from direct news article excerpts.

  • The study included 1,600 queries across eight different generative search platforms.
  • Instead of declining to respond when information was unavailable, most tools confabulated plausible-sounding but incorrect answers.

Important stats: Error rates varied dramatically among the platforms tested in the research.

  • Perplexity provided incorrect information in 37 percent of queries.
  • ChatGPT Search incorrectly identified 67 percent (134 out of 200) of articles.
  • Grok 3 demonstrated the worst performance, with a 94 percent error rate.

Behind the numbers: Surprisingly, premium paid versions of these AI search tools often performed worse than their free counterparts.

  • Perplexity Pro ($20/month) and Grok 3’s premium service ($40/month) delivered incorrect responses more confidently than free versions.
  • More than half of citations from Google’s Gemini and Grok 3 led to fabricated or broken URLs.

Why this matters: The research identified serious technical violations that could impact both publishers and information consumers.

  • Evidence suggests some AI tools ignored Robot Exclusion Protocol settings, which publishers use to prevent unauthorized access.
  • URL fabrication was common, creating an illusion of credibility through citations that don’t actually exist.
AI search engines give incorrect answers at an alarming 60% rate, study says

Recent News

India reviewing copyright law as AI firms face legal challenges

Expert panel examines whether India's 1957 Copyright Act can address claims that AI systems are using content without permission to train large language models.

AI platform Korl customizes messaging with multiple LLMs

Korl's platform connects siloed business data systems to automatically generate personalized customer communications using model-specific AI assignments.

AI firms Musk’s xAI, TWG Global and Palantir target finance industry

The partnership will integrate xAI's Grok language models with Palantir's analytics to enhance data-driven decision making in finance and insurance operations.