CO/AI
Thursday · June 18, 2026 · Issue No. 899

I Tested 3 Literature Review AIs – Only One Didn’t Lie to Me

AI literature review tools face truth test

AI research tools are multiplying quickly, and it is getting harder to tell genuine innovation from potentially dangerous shortcuts. A recent video comparison of three leading AI-powered literature review assistants (Consensus, Elicit, and Scite.AI) found concerning disparities in accuracy and reliability. According to the comparison, some of these tools can mislead researchers and academic professionals with fabricated or misrepresented citations, while others hold up well enough for legitimate scholarly work.

Key findings from the comparison:

  • Consensus proved highly reliable: it provided verifiable quotes taken directly from source papers and accurate citations, with no fabricated information.
  • Elicit exhibited troubling behavior, including "hallucinating" citations that do not exist and making claims that the source material does not support.
  • Scite.AI showed mixed results: it occasionally referenced papers that were not actually in its database, but it also provided some useful citation context.

The accuracy problem isn't just academic

The most striking finding is the wide gap in factual reliability between these tools. While Consensus consistently delivered verifiable information with links to the original research papers, Elicit often fabricated evidence, a critical flaw that undermines the entire purpose of a literature review tool.

This discrepancy matters. As researchers, students, and professionals increasingly rely on AI tools to navigate overwhelming volumes of academic literature, the risk of propagating false information grows with that reliance. Imagine a medical researcher using hallucinated citations to support clinical recommendations, or policy decisions being made on the basis of non-existent studies. The potential harm extends beyond academic integrity into real-world consequences.

Beyond the video: The broader implications for research integrity

The findings connect to larger challenges facing academia and knowledge work. Even before AI entered the picture, academic publishing faced a replication crisis, with numerous studies showing that a significant share of published research could not be reproduced. AI-powered literature review tools that hallucinate or fabricate citations compound this problem.

Consider a 2022 Nature survey that found 38% of researchers had trouble reproducing even their own experimental results. Layering potentially unreliable AI tools onto this existing fragility in research methodology risks building a house of cards, in which citations point to fabricated claims that reference other fabricated claims.
